CIIR Talk Series: Ian Soboroff

Speaker: Ian Soboroff, NIST 

Talk Title: Evaluating Generative IR Systems

Date: Friday, March 28, 2025, 1:30–2:30 PM

Abstract: Information systems that generate answers are all the rage, and “ten blue links” seem terribly dated. But it is much harder to measure whether generative output is good. We have well-understood methods for evaluating the ten-blue-link setup, but those methods don’t translate easily to the generative paradigm. Ranked-list evaluation methods also have the advantage that they can create reusable test collections: you can often use them to evaluate new systems that didn’t exist when you ran the original evaluation. There are methods for evaluating generated text drawn from question answering and text summarization, but they aren’t reusable, and they can be labor-intensive. These methods are now an active area of research, supported by five different tracks in the Text REtrieval Conference.

Bio: Dr. Ian Soboroff is a computer scientist and manager of the Retrieval Group at the National Institute of Standards and Technology (NIST). The Retrieval Group organizes the Text REtrieval Conference (TREC), the Text Analysis Conference (TAC), and the TREC Video Retrieval Evaluation (TRECVID). These are all large, community-based research workshops that drive the state of the art in information retrieval, video search, web search, text summarization, and other areas of information access. He has authored many publications on information retrieval evaluation, test collection building, text filtering, collaborative filtering, and intelligent software agents. His current research interests include building test collections for complex information tasks and evaluating generative information systems.

Zoom Access: Use the Zoom Link, and reach out to Hamed Zamani or Dan Parker for the passcode.