History

The Center for Intelligent Information Retrieval (CIIR) was formed in September 1992 with W. Bruce Croft as Director. Croft had been working in the field of information retrieval since he was a graduate student in 1975, and the advent of new storage media and networks in the 1990s created more general interest in the technology of search engines. The CIIR was a National Science Foundation State/Industry University Cooperative Research Center (S/IUCRC) from 1992 to 2001, with Professors Rick Adrion, Wendy Lehnert, Victor Lesser, and Edwina Rissland as co-Principal Investigators.

The current faculty involved in the CIIR include Distinguished Professor Emeritus Croft, Professor James Allan, Associate Professor Hamed Zamani, Assistant Professor Negin Rahimi, Associate Professor Mohit Iyyer, Distinguished Professor Andrew McCallum, Associate Professor Benjamin Marlin, and Associate Professor Brendan O'Connor, along with Adjunct Associate Professor Hanna Wallach, Adjunct Associate Professor R. Manmatha, Adjunct Associate Professor David Smith, and Adjunct Professor Hong Yu.

Croft joined the then-Department of Computer Science at UMass Amherst in 1979. Dr. Allan joined the CIIR in 1994 as a Senior Post-doctoral Research Associate and later as a Research Assistant Professor. He received a tenure-track professorship in the Department in 1998. Professor Allan became the co-Director of the CIIR while Dr. Croft was Department Chair from 2001 to 2007. Prof. Allan became the CIIR Director in January 2021, replacing Dr. Croft. Dr. Hamed Zamani joined CICS/CIIR in 2020 as an Assistant Professor. He was named the CIIR Associate Director in January 2021 and was promoted to CICS Associate Professor with tenure in 2024. Dr. Negin Rahimi joined the CIIR in 2019 as a Senior Postdoctoral Researcher, was promoted to CICS Research Assistant Professor in September 2021, and was promoted to tenure-track Assistant Professor in January 2024. From 1995 to 1999, Dr. Jamie Callan was Assistant Director of the CIIR. After receiving his Ph.D. in 1993 from the Department, Dr. Callan joined the CIIR as a Senior Post-doctoral Research Associate and later as a Research Assistant Professor before leaving in 1999 for a tenure-track professor position at Carnegie Mellon University.

Dr. McCallum joined the CIIR in 2002 as a Research Associate Professor and was promoted to a tenure-track Associate Professor in 2003, full Professor in 2009, and Distinguished Professor in 2018. Benjamin Marlin joined the CIIR in 2011 as an Assistant Professor. Brendan O'Connor joined the CIIR in 2014 as an Assistant Professor. Hanna Wallach joined the CIIR in 2007 as a Senior Postdoctoral Research Associate and became a tenure-track Assistant Professor in 2010. Mohit Iyyer joined the CIIR as associated faculty in 2020 as a CICS Assistant Professor. Hong Yu joined the CIIR as adjunct faculty in 2012. After receiving his Ph.D. from UMass in 1997, Manmatha started as a Post-doctoral researcher and was promoted to Research Assistant Professor in 1998 and Research Associate Professor in 2006. David Smith joined the CIIR in 2008 as a Research Assistant Professor. Additional associated faculty involved with the CIIR during its early years included Kathryn McKinley, Eliot Moss, Ed Riseman, and David Stemple.

Since 1992, we have employed and trained nearly 400 graduate and undergraduate students, split almost evenly between the two groups. Eighty-nine of the Center's students received Ph.D.s.

The original mission of the Center for Intelligent Information Retrieval (CIIR) was to “develop technology that supports the emerging information infrastructure into the next century” (i.e., the 21st century). This mission was important in 1992 when the Center began, and became even more critical with the advent of the World Wide Web and the Internet community.

The research carried out in the Center has been described in more than 1,100 journal and refereed conference papers, with many CIIR-authored publications being selected for best paper and test of time awards. Some of the contributions we made during our first ten years include the following:

  • We made significant contributions to understanding and improving the retrieval process through probabilistic models, including the first description of a retrieval system based on statistical language models.
  • We introduced and improved a number of techniques for text and query representation, such as phrase representations, passages, "named entities", statistical stemming, and query expansion.
  • We led the development of techniques for distributed search based on automatically representing databases and combining local searches.
  • We produced the first high-capacity probabilistic filtering architecture and carried out some of the earliest evaluations of machine learning algorithms for filtering.
  • We helped to define and evaluate the first versions of event detection and tracking software.
  • We carried out some of the earliest research on ranking and representation techniques for Asian languages, and showed how bilingual dictionaries can be an effective basis for a cross-lingual system.
  • We developed some of the first approaches to information extraction that emphasized learning.
  • We have evaluated novel techniques for indexing images and video.

The CIIR was also involved in a number of ground-breaking industry collaborations during its early years. Examples of those collaborations include:

  • West Publishing's "WIN" legal document retrieval system is based on information retrieval technology from the CIIR.
  • Lotus Development created a world-wide customer support technical reference retrieval system based on CIIR technology.
  • Infoseek licensed INQUERY to provide low-cost, high-speed searches of the Internet.
  • The U.S. Library of Congress used INQUERY to provide access to a number of collections including the American Memory collection of historic photographic archives and the Global Legal Information Network. Another INQUERY system was the basis of the Thomas System, the corpus of Congressional Research Reports, Public Policy File, and all existing Federal law and pending bills.
  • The Executive Office of President Clinton and the Vice President's Office of the National Performance Review (renamed National Partnership for Reinventing Government) used INQUERY on its web site. The White House Home Page on the World Wide Web had links to many of the President's databases searchable using INQUERY. The publications server included transcripts of Presidential speeches, actual recorded speeches, press briefings, foreign and domestic policy documents, etc.
  • The General Services Administration (GSA) funded the application of advanced technology from the CIIR to develop a web searching capability across the entire Federal Government (all .gov and .mil sites). "Govbot" was the first government information portal, indexing over 1.5 million web pages to provide a one-stop shopping site for government information.

For a perspective on what the CIIR was focusing on during the 1990s, see "What Do People Want From Information Retrieval?" by W. Bruce Croft.

View the timeline of the CIIR's history.