Passages and XML, James leading.
We consider the topic of passage retrieval and XML element retrieval
for this class. The general idea is that there are situations in
which a portion of a document is more useful than the entire document.
In passage retrieval, a key question is how to find those portions. In XML
retrieval, the XML markup elements may provide the right portions.
- (Required) Kaszkiel and Zobel, "Passage retrieval revisited", SIGIR 1997 (9 pages, at ACM). This paper evaluates a range of passage retrieval approaches. It builds on a classic SIGIR 1994 paper by Callan.
- (At least the poster) Jiang and Zhai, "Extraction of coherent relevant passages using hidden Markov models". ACM Transactions on Information Systems, 24(3):295-319. (25 pages, at ACM). This paper talks about a way to extract just the right length of passage. A sketch of the work is available in a poster presented at SIGIR:
- Jiang and Zhai, "Accurately extracting coherent relevant passages using hidden Markov models", (2 page poster summary, at ACM)
- (Skim at least) Fuhr et al., "Overview of the INEX 2007 Ad Hoc Track". Pre-proceedings of INEX 2007. (22 pages, pdf). Provides an overview of retrieval using XML elements, much of which is summary evaluation results. You may find this description of the query format useful:
- Trotman and Sigurbjörnsson, "Narrowed Extended XPath I (NEXI)". INEX 2004 workshop proceedings. Defines the query language used for structured queries in INEX. Worth skimming to figure out the queries. (11 pages, pdf)
- Ogilvie and Callan, "Hierarchical language models for XML component retrieval". In Proceedings of the INEX 2005 workshop. (15 pages, pdf). This paper sketches how element retrieval in XML can be done in a language modeling framework.
- (Required) Kamps and Koolen, "On the relation between relevant passages and XML document structure." Proceedings of the SIGIR 2007 workshop on focused retrieval. (5 pages, pdf, pdf slides, full workshop proceedings). This paper links XML element retrieval and passage retrieval: is there a relationship at all?
- Itakura and Clarke, "From passages into elements in XML retrieval". Proceedings of the SIGIR 2007 workshop on focused retrieval. (6 pages, pdf, pdf slides, full workshop proceedings). How should those relevant XML elements be found?
Discussion also included Wade and Allan's 2005 technical report on evaluation of passage retrieval.