Indri (Language modeling meets inference networks)
Indri is a new search engine from the Lemur project, a cooperative effort between the University of Massachusetts Amherst and Carnegie Mellon University to build language modeling information retrieval tools.
Effective
* Best-in-class ad hoc retrieval performance
Flexible
* Supports popular structured query operators from INQUERY
* Open source, with a flexible BSD-inspired license
* Parses PDF, HTML, XML, and TREC documents
* Word and PowerPoint parsing (Windows only)
Usable
* Supports UTF-8 encoded text
* Includes both command-line tools and a Java user interface
* API can be used from Java, PHP, or C++
* Works on Windows, Linux, Solaris, and Mac OS X
Powerful
* Can be used on a cluster of machines for faster indexing and retrieval
* Field retrieval
* Passage retrieval
* Scales to terabyte-sized collections
Information
* Indri (News, Features, Discussion)
* IndriBuildIndex
* Query language
* Code documentation
* Indri Retrieval Model
* Search the collected works of William Shakespeare with Indri
* Research
* Indri Tips
* People
Download Indri
* https://sourceforge.net/projects/lemur/