Current Projects

Lemur/Indri
The Lemur Project is a collaboration between the CIIR and the School of Computer Science at Carnegie Mellon University. The Lemur Toolkit is designed to facilitate research in language modeling and information retrieval, where IR is broadly interpreted to include such technologies as ad hoc and distributed retrieval, cross-language IR, summarization, filtering, and classification. As part of the Lemur Project, the CIIR has developed Indri, a language-model-based search engine for complex queries. In an NSF-funded CRI collaborative research project between UMass Amherst and CMU, the team is focusing on the continued development of the open-source Lemur software toolkit for language modeling and information retrieval.

EAGER: Dynamic Contextual Explanation of Search Results
This NSF-funded research project aims to investigate and develop Defuddle, an approach and a system that analyzes documents at the top of a search engine’s ranked list to find human-readable explanations for why documents were retrieved for a given query and, unlike existing technology, for how the documents relate to each other. The resulting advances in result explanation will make it easier for people to make sense of what happens when they search the web or any other collection of text documents.

Searching for Answers Through Iterative Feedback
In this NSF-funded research project, we will work on four research tasks: (a) develop and evaluate iterative relevance feedback models for answers; (b) develop and evaluate interactive summarization techniques for answers; (c) develop and evaluate finer-grained feedback approaches for answers; (d) develop and evaluate a conversation-based model for answer retrieval. This project will be the first to study methods and models for interacting with ranked lists of answers.

Mirador: Explainable Computational Models for Recognizing and Understanding Controversial Topics Encountered Online
Mirador is an NSF-funded research project whose aim is to develop algorithms and tools that allow a person to recognize that a web page or other document discusses one or more topics that are controversial -- that is, about which there is strong disagreement within some sizeable group of people. The project will develop algorithms and tools that explain the controversy surrounding the topic, identifying the populations that disagree, the stances that they take, and how those stances conflict with each other. The project will assist people in critical evaluation of online material and help them understand why a page is educative or why it is not.

Interactive Construction of Complex Query Models
This NSF-funded research program will investigate and implement SearchIE, a search-based approach to information "extraction." SearchIE will allow rapid, personalized, situational identification of types of objects or actions in text, where those types are likely to be useful for a complex search task. The result is that the technology can radically improve online searching for lay persons as well as professionals by significantly reducing the time needed to focus queries into relevant information.

Connecting the Ephemeral and Archival Information Networks
This NSF-funded project is a collaboration among the CIIR, Carnegie Mellon University, and RMIT University. We will use the explicit and implicit links between the ephemeral and archival networks to improve the effectiveness of search that is targeted at social data, web data, or both. We will demonstrate the validity of our hypothesis using a range of existing TREC tasks focused on either social media search or web search. In addition, we will explore two new tasks, conversation search and aggregated social search, which can exploit the integrated network of ephemeral and archival information.

Understanding the Relevance of Text Passages
Developing effective passage retrieval would have a major effect on search tools by greatly extending the range of queries that could be answered directly using text passages retrieved from the web. This is particularly important for mobile search applications whose output bandwidth is limited by a small screen or by speech output. In these settings, the ability to use passages to reduce the amount of output while maintaining high relevance will be critical. In this NSF-funded project, we study research issues that have been either ignored or only partially addressed in prior research, such as showing whether passages can be better answers than documents for some queries, predicting which queries have good answers at the passage level, ranking passages to retrieve the best answers, and evaluating the effectiveness of passages as answers.

Topical Positioning System (TPS) for Informed Reading of Web Pages
This NSF-funded project addresses the challenge of increasing the critical literacy of people looking for information on the Web, including information regarding healthcare, policy, or any other broadly discussed topic. Research on the Topical Positioning System (TPS) pursues the vision of a browser tool that shows a person whether the web page in front of them discusses a provocative topic, whether the material is presented in a heavily biased way, whether it represents an outlier (fringe) idea, and how its discussion of issues relates to the broader context and to information presented in "familiar" sources.