CIIR Talk Series: Niranjan Balasubramanian | Center for Intelligent Information Retrieval

Speaker: Niranjan Balasubramanian, Stony Brook University

Title: What ails multihop QA and how to fix it

Date: Friday, October 28, 2022, 1:30-2:30 PM EDT (Eastern Daylight Time), via Zoom. On-campus attendees will gather in CS 151 to view the presentation.

Zoom Access: Join via the Zoom link, and reach out to Hamed Zamani for the passcode.

Abstract: Multihop QA has seen much empirical progress on many datasets recently. However, training and evaluating on typical crowdsourced datasets is problematic because of the potential for shortcut reasoning based on artifacts. What can we do about this? In this three-part talk, I will first show how we can formalize and measure disconnected reasoning, a type of bad multihop reasoning. We devise an automatic transformation of an existing dataset that allows us both to measure disconnected reasoning and to design incentives that reduce it. In the second part, I will discuss constructing new datasets using a bottom-up process in which we join singlehop questions to form multihop questions. I will show how this process allows us to better control for desired properties in the resulting dataset. In the third part, I will briefly present our new work, which shows how synthetically generated data can be used to teach a broad range of multihop skills in a reliable manner.

Bio: Niranjan is an Assistant Professor in the Computer Science department at Stony Brook University, where he heads the Language Understanding and Reasoning lab (LUNR). Prior to joining Stony Brook, he was a post-doctoral researcher at the University of Washington and one of the early members of the Allen Institute for Artificial Intelligence. Niranjan completed his PhD in Computer Science at the University of Massachusetts Amherst.

P.S. When he is not writing abstracts, he fancies painting them. He believes his abilities for one are marginally better than for the other.