Current Projects

The Lemur Project is a collaboration between the CIIR and the School of Computer Science at Carnegie Mellon University. The Lemur Toolkit is designed to facilitate research in language modeling and information retrieval (IR), where IR is broadly interpreted to include such technologies as ad hoc and distributed retrieval, cross-language IR, summarization, filtering, and classification. As part of the Lemur Project, the CIIR has developed Indri, a language model-based search engine for complex queries. In an NSF-funded CRI collaborative research project between UMass Amherst and CMU, the team is focusing on the continued development of the open-source Lemur software toolkit for language modeling and information retrieval.

EAGER: Dynamic Contextual Explanation of Search Results (Defuddle)
This NSF-funded research project aims to investigate and develop Defuddle, an approach and a system that analyzes documents at the top of a search engine’s ranked list to find human-readable explanations for why the documents were retrieved for a given query and, unlike existing technology, for how the documents relate to each other. The resulting advances in result explanation will make it easier for people to make sense of what happens when they search the web or any other collection of text documents.

Athena: Learning-oriented Search With Personalized Learning Flows
This NSF-funded project is a collaboration between the CIIR and the University of North Carolina at Chapel Hill. The Athena project will develop technology called "search as learning," a set of search technologies that encourage and support learning rather than just simple document finding. The Athena work will extend the state of the art in text representation, neural approaches including attention techniques, query and topic modeling, contextual text summarization, and the understanding of human approaches to complex search activities.

Searching for Answers Through Iterative Feedback
In this NSF-funded research project, we will work on four research tasks: (a) develop and evaluate iterative relevance feedback models for answers; (b) develop and evaluate interactive summarization techniques for answers; (c) develop and evaluate finer-grained feedback approaches for answers; and (d) develop and evaluate a conversation-based model for answer retrieval. This project will be the first to study methods and models for interacting with ranked lists of answers.

Mirador: Explainable Computational Models for Recognizing and Understanding Controversial Topics Encountered Online
Mirador is an NSF-funded research project whose aim is to develop algorithms and tools that allow a person to recognize that a web page or other document discusses one or more topics that are controversial -- that is, about which there is strong disagreement within some sizeable group of people. The project will develop algorithms and tools that explain the controversy surrounding the topic, identifying the populations that disagree, the stances that they take, and how those stances conflict with each other. The project will assist people in critically evaluating online material and help them understand why a page is educative or why it is not.