I am interested in Information Retrieval, Computer Vision, and, at their intersection, Image and Video Retrieval. I am also interested in Document Analysis and Recognition, including the recognition and retrieval of printed material, particularly in Indian languages, and of handwritten manuscripts.

I am a Research Associate Professor in the Department of Computer Science and work with the Multi-media Indexing and Retrieval (MIR) group at the Center for Intelligent Information Retrieval (CIIR). The group's aim is to index non-textual sources of information, either by converting them to ASCII text and using a search engine such as INDRI, or by directly indexing their content.

For a current list of publications, go here. This list is automatically generated from the CIIR publications website.

My current work focuses on:

  1. Automatically Annotating and Retrieving Images. Along with Victor Lavrenko and my former students Jiwoon Jeon and Shaolei Feng, I have investigated a number of models in this area. These include both discrete and continuous relevance models (CMRM, CRM, MBRM and NCRM), a maximum entropy model, and an inference network model. My current research in this area is focused on Markov Random Field models - see the CIVR'08 paper. See the publication list for papers. Here is the first paper, which started it all.
  2. Distributed Image Search. I am doing this with Tingxin Yan and Deepak Ganesan. We are investigating how to represent queries and images concisely so that resource-limited devices (e.g., Imotes) may be used as sensor devices and can be searched in a distributed manner. Here is our SenSys08 paper.
  3. Indexing and Retrieving Handwritten Manuscripts. We are particularly focused on George Washington's manuscripts. This work was primarily done with Toni Rath, my former student, and Victor Lavrenko. We use an approach based on relevance models, which allows us to use ASCII queries. We have also used an approach called word spotting (using word image matching) - the idea being to create automatic indices. Jamie Rothfeder, Nitin Srimal and I also investigated scale space techniques for segmenting handwritten manuscripts. Along with my former student Shaolei Feng and Prof. Nicholas Howe at Smith College, I also investigated some recognition models for such manuscripts. Go to the publication list below to check out papers.

    Check out this demonstration of a handwriting retrieval system, based on relevance models, that searches 1000 pages (8 GB of data) of George Washington's manuscripts using text queries. This is the first automatic retrieval system for historical manuscripts (it does not use manual annotations).

  4. Alignment Techniques for Printed and Handwritten Documents. I have investigated a number of techniques for aligning handwritten document images to transcripts to automatically generate ground truth. This includes work with Micah Kornfield and James Allan using dynamic time warping, and with Jamie Rothfeder and Toni Rath using HMMs. Along with Shaolei Feng, I investigated an automatic technique to align OCR output for printed books with their electronic versions on Gutenberg using HMMs.
  5. Searching Printed Indian Language Documents. Along with Prof. C. V. Jawahar at IIIT Hyderabad and Anand Kumar, I am investigating techniques to search printed documents in Indian languages for which OCR systems are not readily available. This work uses locality sensitive hashing for fast search within a book.
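
To give a flavor of the dynamic time warping mentioned in item 4, here is a minimal generic sketch in Python. It is not the code used in the papers. The function name `dtw_align`, the scalar features and the absolute-difference cost are all assumptions made for illustration; a real system would compare richer features of word images and transcript words. Both sequences are assumed to be non-empty.

```python
def dtw_align(a, b, cost=lambda x, y: abs(x - y)):
    """Align sequences a and b by dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs matching a[i] to b[j]. Illustrative only: the default cost
    is a hypothetical absolute difference between scalar features.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j] = minimal cost of aligning a[:i] with b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(a[i - 1], b[j - 1])
            # Extend the cheapest of: match, skip in a, skip in b
            D[i][j] = c + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])

    # Backtrack from the end to recover the warping path
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return D[n][m], path
```

For example, `dtw_align([1, 2, 3], [1, 2, 2, 3])` matches the single 2 in the first sequence to both 2s in the second, which is the kind of one-to-many matching an alignment needs when a segmenter splits or merges words.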

My previous work includes:

  1. Meta Search (or combining the outputs of multiple search engines). This work is based on modeling the score distributions of relevant documents as Gaussians and those of non-relevant documents as exponentials. A mixture model can be solved using Expectation-Maximization to recover the parameters of these distributions when relevance is unknown.
  2. I have also worked on image matching under deformations (affine, similarity), image retrieval using color and appearance, text detection in images, and the scale space segmentation of handwritten manuscripts.
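
The score-distribution idea in item 1 can be sketched roughly as follows. This is a minimal illustration in Python rather than the published implementation. The function names, the initialization from the top 10% of scores, the fixed iteration count and the variance floor are all assumptions, and scores are assumed to be non-negative so that the exponential component is well defined.

```python
import math


def fit_score_mixture(scores, iters=200):
    """Fit a Gaussian (relevant) + exponential (non-relevant) mixture by EM.

    Returns (pi, mu, var, lam): the weight of the relevant component,
    the Gaussian mean and variance, and the exponential rate.
    """
    n = len(scores)
    ranked = sorted(scores)
    # Assumed initialization: treat the top 10% of scores as relevant.
    top = ranked[-max(2, n // 10):]
    pi = len(top) / n
    mu = sum(top) / len(top)
    var = max(sum((s - mu) ** 2 for s in top) / len(top), 1e-6)
    lam = 1.0 / max(sum(ranked) / n, 1e-9)

    for _ in range(iters):
        # E-step: posterior probability that each score is from the Gaussian.
        resp = []
        for s in scores:
            g = pi * math.exp(-(s - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
            e = (1 - pi) * lam * math.exp(-lam * s)
            resp.append(g / (g + e + 1e-300))
        # M-step: re-estimate the parameters from the soft assignments.
        r_sum = max(sum(resp), 1e-12)
        pi = r_sum / n
        mu = sum(r * s for r, s in zip(resp, scores)) / r_sum
        var = max(sum(r * (s - mu) ** 2 for r, s in zip(resp, scores)) / r_sum, 1e-6)
        nr_weighted = max(sum((1 - r) * s for r, s in zip(resp, scores)), 1e-12)
        lam = (n - r_sum) / nr_weighted
    return pi, mu, var, lam


def relevance_posterior(s, pi, mu, var, lam):
    """Posterior probability that a document with score s is relevant."""
    g = pi * math.exp(-(s - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    e = (1 - pi) * lam * math.exp(-lam * s)
    return g / (g + e + 1e-300)
```

Once each engine's scores are mapped to posteriors in this way, they lie on a common probability scale and can be combined across engines, for instance by averaging.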

More on my research.

Other interests:

I am a co-founder of and technical advisor to SnapTell, a mobile image search company.

I used to write stories. Here are two samples, The Shadow and Marshall Teddy, if you have time to kill.

