Aptima Projects | Center for Intelligent Information Retrieval

Phase II: Interactive Text Classification of Usability Reports

Phase I: Automated Diagnosis of Usability Problems Using Statistical Computational Methods

Principal Investigator:

Andrew McCallum, PI
mccallum@cs.umass.edu

Information Extraction and Synthesis Laboratory (IESL)/
Center for Intelligent Information Retrieval (CIIR)
Department of Computer Science
140 Governors Drive
University of Massachusetts
Amherst, MA 01003-9264

Project Summary

The effects of poor usability range from mere inconvenience to disaster. Human factors specialists employ usability analysis to reduce the likelihood or impact of such failures. However, good usability analysis requires usability reports that are rarely collected, rarely complete, and difficult to analyze.

Aptima and the CIIR have partnered to develop a usability analysis system that addresses these problems. The system will consist of (1) an interface to elicit useful usability reports in natural language, (2) a text analysis engine that classifies these reports (or existing usability reports) using a validated taxonomy, and (3) an analysis interface for examining individual usability reports and trends in usability problems. We will train and test the system using a very large corpus of publicly available usability reports categorized into an extension of the User Action Framework. Aptima and the CIIR will deliver a report of results, a software prototype, and a corpus of manually categorized usability reports.

In Phase I, UMass Amherst implemented and tested several state-of-the-art text classification algorithms on the corpus of labeled usability reports received from Aptima, with new algorithms to be developed if warranted by the preliminary results. The algorithms may include (1) generative Bayesian classifiers, (2) maximum entropy classifiers, (3) support vector machines (SVMs), (4) latent semantic indexing (LSI, also known as latent semantic analysis, LSA), (5) probabilistic latent semantic indexing, and (6) latent Dirichlet allocation.
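
As an illustration of the first family listed above, the following is a minimal sketch of a generative (multinomial naive Bayes) text classifier with add-one smoothing. It is not the project's implementation: the function names, the whitespace tokenizer, and the toy reports and category labels ("wording", "performance") are all hypothetical and are not drawn from the project corpus or the User Action Framework.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_reports):
    """Collect counts from (text, category) pairs. Hypothetical helper."""
    class_counts = Counter()            # number of reports per category
    word_counts = defaultdict(Counter)  # token counts per category
    vocabulary = set()
    for text, category in labeled_reports:
        tokens = text.lower().split()   # naive whitespace tokenizer (assumption)
        class_counts[category] += 1
        word_counts[category].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the category with the highest log posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_category, best_score = None, float("-inf")
    for category, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[category].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore tokens never seen in training
            # Add-one (Laplace) smoothed log likelihood of the token
            count = word_counts[category][token]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Invented toy data, for illustration only.
toy_reports = [
    ("button label is confusing and unclear", "wording"),
    ("menu text uses unclear jargon", "wording"),
    ("page takes forever to load", "performance"),
    ("search is slow and the page hangs", "performance"),
]
model = train_naive_bayes(toy_reports)
print(classify("the label text is unclear", model))
print(classify("page load is slow", model))
```

A real system would need a proper tokenizer, far more training data, and held-out evaluation; the sketch only shows how class priors and smoothed word likelihoods combine into a classification decision.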

In Phase II, UMass Amherst is developing text classification algorithms for use with usability reports provided by Aptima. In addition, the UMass team will select, refine, or develop algorithms that automatically segment the text into sections discussing different aspects of the domain and will select, refine, or develop learning algorithms that support interaction between the user and the text classification system.
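
One common way for a classifier to interact with a user is uncertainty sampling, in which the system asks the user to label the report it is least sure about. The summary above does not say which interactive method the project uses, so the sketch below is an assumption chosen only to make the idea concrete; the function names, report identifiers, and probabilities are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy (in nats) of a class probability distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def most_uncertain(predictions):
    """Return the id of the report whose predicted distribution has the
    highest entropy, i.e. the one the classifier is least sure about.

    predictions: dict mapping a report id to a list of class probabilities.
    """
    return max(predictions, key=lambda report_id: entropy(predictions[report_id]))


# Invented classifier outputs over two categories, for illustration only.
predictions = {
    "report-1": [0.9, 0.1],
    "report-2": [0.5, 0.5],
    "report-3": [0.7, 0.3],
}
print(most_uncertain(predictions))  # the report to show the user next
```

In such a loop, the user's label for the selected report would be added to the training data and the classifier retrained before the next query.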

This work is supported in part by the Center for Intelligent Information Retrieval (CIIR) and in part by AFOSR STTR through a subcontract from Aptima.