Award ID: 1813662
Award Title: III: Small: Mirador: Explainable Computational Models for Recognizing and Understanding Controversial Topics Encountered Online
Duration: September 1, 2018 - August 31, 2022
Principal Investigator: James Allan, PI

Center for Intelligent Information Retrieval (CIIR)
Manning College of Information and Computer Sciences
140 Governors Drive
University of Massachusetts Amherst
Amherst, MA 01003-9264

Project Abstract

This project aims to develop algorithms and tools that allow a person to recognize that a web page or other document discusses one or more topics that are controversial -- that is, about which there is strong disagreement within some sizeable group of people. The project will develop algorithms and tools that explain the controversy surrounding the topic, identifying the populations that disagree, the stances that they take, and how those stances conflict with each other. The advances in these algorithms will broaden the research community's understanding of how discussions and disagreements on topics can be modeled computationally and how the resulting information can be conveyed to a general user. The project will assist people in critical evaluation of online material and help them understand why a page is educative or why it is not.

The aim of this project is to provide users with tools that illuminate the broader context of the topic or topics of a single page or document that someone finds. Previous work has shown that it is possible to recognize with reasonable accuracy that a document is part of a controversial topic, but that work is fragmented across different genres, demands more robust modeling and more thorough evaluation, and lacks explanatory power that can help a reader understand why and how a text is contentious. In this project, the researchers explore fundamental questions about how controversy can be modeled computationally so that it can be recognized "in the wild". The project also explores model variations that allow an algorithm to extract an explanation of the nature of the controversy. The project applies and extends text analysis and comparison techniques. It leverages powerful statistical language modeling methods as well as recent neural network (deep learning) approaches to represent text, its controversial nature, its stances, and their relationships, all extracted from Web pages and other documents. The models will initially be used offline to identify a collection of topics known to be controversial. That collection will then be adapted by monitoring slowly changing news sources and blog postings, as well as ephemeral microblog sources, to capture rapid changes in controversy. The researchers will make the resulting techniques available by providing an open-source example server.

Broader Impacts
A key impact of this work is workforce development: training graduate students and others in strong research methods, including how to carry out research, how to write it up, and how to present it at conferences. A second key impact is publication of the methods and algorithms developed under this project, informing the broader scientific community about the work. Those impacts will occur throughout (and beyond) the life of the project. As more and more people turn to the Web and to social networks for answers to their questions, tools such as Mirador will become critically important.


IR-1164: Kim, Y. and Allan, J., "Unsupervised Explainable Controversy Detection from Online News," in the Proceedings of the European Conference on Information Retrieval. Cologne, Germany, April 14-18, 2019, pp. 836-843.

IR-1211: Yu, P., Rahimi, N., Huang, Z. and Allan, J., "Learning to Rank Entities for Set Expansion from Unstructured Data," in the Proceedings of the International Conference on the Theory of Information Retrieval (ICTIR 2020), Stavanger, Norway, September 14-18, 2020, pp. 21-28.

IR-1227: Rahimi, N., Kim, Y., Zamani, H. and Allan, J., "Explaining Documents’ Relevance to Search Queries," CIIR Technical Report, https://arxiv.org/abs/2111.01314, 2020.

IR-1230: Huang, Z., Rahimi, N., Yu, P., Shang, J. and Allan, J., "AutoName: A Corpus-Based Set Naming Framework," in the Proceedings of The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 21), July 11-15, 2021, pp. 2101–2105.

IR-1231: Ramezani, S., Rahimi, R. and Allan, J., "Aspect Category Detection in Product Reviews using Contextual Representation," in the Proceedings of the ACM SIGIR Workshop on eCommerce (SIGIR eCom’20), Virtual Event, China, July 30, 2020.

IR-1232: Kim, Y., Jang, M. and Allan, J., "Explaining Text Matching on Neural Natural Language Inference," in ACM Transactions on Information Systems 38, 4 (October 2020), 23 pages.

IR-1244: Chowdhury, T., Rahimi, N. and Allan, J., "Equi-explanation Maps: Concise and Informative Global Summary Explanations," in the Proceedings of the ACM FAccT conference, Seoul, South Korea, June 21-24, 2022, pp. 464–472.

IR-1252: Kim, Y., Bonab, H., Rahimi, R. and Allan, J., "Query-driven Segment Selection for Ranking Long Documents," in the Proceedings of The 30th ACM International Conference on Information and Knowledge Management (CIKM '21), Virtual Event, Australia, November 1-5, 2021, pp. 3147–3151.

IR-1253: Sarwar, S., Moraes, F., Jiang, J. and Allan, J., "Utility of Missing Concepts in Query Biased Summarization," in the Proceedings of The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 21), July 11-15, 2021, pp. 2056–2060.

IR-1271: Chowdhury, T., Rahimi, N. and Allan, J., "Rank-LIME: Locally interpretable Model-Agnostic explanations for Learning to Rank," CIIR Technical Report, https://arxiv.org/abs/2212.12722, 2021.

IR-1278: Kim, Y., Rahimi, R. and Allan, J., "Alignment Rationale for Query-Document Relevance," in the Proceedings of The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 22), Madrid, Spain, July 11-15, 2022, pp. 2489–2494.

Point of Contact: allan@cs.umass.edu

This material is based upon work supported in part by the Center for Intelligent Information Retrieval (CIIR) and in part by the National Science Foundation under Grant No. IIS-1813662. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.