CIIR Talk Series: Benjamin Piwowarski

Speaker: Benjamin Piwowarski, Institute of Intelligent Systems and Robotics, Sorbonne University

Talk Title: Closing the efficiency–effectiveness gap with cross-encoders

Date: Friday, April 24, 2026 - 1:30 - 2:30 PM EDT (Eastern Daylight Time)

Abstract: Neural cross-encoders remain the effectiveness ceiling for text re-ranking, yet they sit at a singular point on the efficiency–effectiveness curve: accurate enough that we cannot retire them, slow enough that we wish we could. This talk confronts that trade-off through two complementary studies.

First, which training recipe actually matters? A controlled reproduction across 9 encoder backbones (BERT through ModernBERT/Ettin, 17M–184M parameters) and 6 training objectives shows that choosing the right loss rivals scaling the backbone. We also briefly describe experimaestro, the framework that makes this study auditable and reproducible.

Second, which cross-encoder computations are actually necessary? A progressive attention-masking analysis identifies the superfluous ones and yields MICE, a minimal-interaction architecture that is 4× faster than a standard cross-encoder and matches late-interaction latency without sacrificing ranking quality.

Together, these studies chart a concrete path toward closing the efficiency–effectiveness gap. 

Bio: Dr. Benjamin Piwowarski is a senior researcher (Directeur de Recherche) at the French National Center for Scientific Research (CNRS), working in the MLIA team at ISIR (Sorbonne Université). His current research centers on natural language processing and information access, with a focus on neural information retrieval, dialogue-based information access, controlled text generation, multilingual representation learning, and a better understanding of transformer architectures. Previously, he held a Research Associate position at the University of Glasgow (2008-2011) on quantum-physics-inspired models for information access; worked at Yahoo! Research (2006-2008) on web mining and on models of the interaction between users and search engines; and was a postdoc at the University of Chile (2004-2006) on XQuery evaluation. His PhD (1999-2003) applied machine learning techniques (Bayesian Networks) to Structured Information Retrieval; within this field he also contributed to new evaluation metrics for search engines. He regularly serves on the program committees of the main Information Retrieval and NLP venues (SIGIR, CIKM, ECIR, ARR) and is a member of the CNRS National Committee, the evaluation body for CNRS researchers and laboratories. He was General Chair of SIGIR 2019 in Paris, and Program Co-Chair of ECIR 2018 (Grenoble) and ICTIR 2025 (Padua).
 

Zoom Access: Join via the Zoom link; reach out to Hamed Zamani or Dan Parker for the passcode.