Research and Development Publications

Ganiz, M. C., Pottenger, W. M. and George, C., "Higher Order Naive Bayes: A Novel Non-IID Approach to Text Classification", (2010).

“The underlying assumption in traditional machine learning algorithms is that instances are I.I.D., Independent and Identically Distributed. These critical independence assumptions made in traditional machine learning algorithms prevent them from going beyond instance boundaries to exploit latent relations between features. In this article, we develop a general approach to supervised learning by leveraging higher-order dependencies between features…”

Ganiz, M. C., Lytkin, N. I. and Pottenger, W. M., "Leveraging Higher Order Dependencies Between Features for Text Classification", (2009).

“Traditional machine learning methods only consider relationships between feature values within individual data instances while disregarding the dependencies that link features across instances. In this work, we develop a general approach to supervised learning by leveraging higher-order dependencies between features. We introduce a novel Bayesian framework for classification named Higher Order Naive Bayes…”

Kontostathis, A. and Pottenger, W. M., "A Framework for Understanding LSI Performance", (2006).

“Many models for understanding LSI have been proposed. Ours is the first to study the values produced by LSI in the term by dimension vectors. The framework presented here is based on term co-occurrence data. We show a strong correlation between second-order term co-occurrence and the values produced by the Singular Value Decomposition (SVD) algorithm that forms the foundation for LSI…”

Kasik, D., Ebert, D., Lebanon, G., Park, H. and Pottenger, W. M., "Data Transformations and Representations for Information Generation", (2009).

“At the core of successful visual analytics systems are computational techniques that transform data into concise, human comprehensible visual representations. The general process often requires multiple transformation steps before a final visual representation is generated. This article characterizes the complex raw data to be analyzed and then describes two different sets of transformations and representations…”

Nikolov, A., Li, S. and Pottenger, W. M., "Privacy-Enhancing Distributed Higher-Order ARM", (2009).

“Traditional association rule mining algorithms assume that data instances are independent and identically distributed. In statistical relational learning, however, relationships between instances can be leveraged to improve performance of learning algorithms…”

Ribarsky, W., Fisher, B., Turner, A. E. and Pottenger, W. M., "Science of Analytical Reasoning", (2009).

“There has been progress in the science of analytical reasoning and in meeting the recommendations for future research that were laid out when the field of visual analytics was established. Researchers have also developed a group of visual analytics tools and methods that embody visual analytics principles and attack important and challenging real-world problems. However, these efforts are only the beginning and much study remains to be done. This article examines the state of the art in visual analytics methods and reasoning and gives examples of current tools and capabilities…”

"Using Clustering to Detect Chinese Censorware"

“The Chinese government restricts access to religious, political, and pornographic content through the use of an intricate system of surveillance and censorship infrastructure. This infrastructure creates patterns that seem anomalous when compared to normal Chinese Internet traffic. Previous detection methods could neither detect zero-day attacks nor lower…”