Advanced Topics in Information Retrieval
CMPSCI 791H, Language Models (Advanced Topics in Information Retrieval)
Fall 2002
NOTE THE DATES SO YOU GET THE CORRECT WEEK
For November 4
- "Latent Dirichlet allocation". D. M. Blei, A. Y. Ng, and M. I. Jordan. Technical Report UCB/CSD-02-1194. This paper is available in two forms; read them both. It may be particularly challenging, but do your best. PS.gz, 30 pages and PS.gz, 8 pages
- There is no additional paper this week; these will be challenging enough.
For October 28
- "Probabilistic Models of Text and Link Structure for Hypertext Classification", L. Getoor, E. Segal, B. Taskar, D. Koller. IJCAI01 Workshop on Text Learning: Beyond Supervision, Seattle, Washington, August 2001. PS. That version seems to print badly (the margins are off). Here is a PDF version with the margins stripped off; when you print it, select "center page" or the equivalent option. PDF without margins
- Taskar, B., E. Segal and D. Koller. (2002). Discriminative Probabilistic Models for Relational Data, Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI02), Edmonton, Canada. [online|http://robotics.stanford.edu/~btaskar/]
For October 21
- Djoerd Hiemstra, "Term-Specific Smoothing for the Language Modeling Approach to Information Retrieval: The Importance of a Query Term." In SIGIR 2002, pp. 35-41. [PDF|http://wwwhome.cs.utwente.nl/~hiemstra/papers/sigir02lm.pdf]
- John Canny, "Collaborative Filtering with Privacy via Factor Analysis." In SIGIR 2002, pp. 238-245. [PDF|http://www.cs.berkeley.edu/~jfc/papers/02/SIGIR02.pdf]
For October 16 (Wednesday, but Monday class schedule)
- Si and Callan, "Using sampled data and regression to merge search engine results." In SIGIR 2002, pp. 19-26. [PS|http://www-2.cs.cmu.edu/~callan/Papers/sigir02-lsi.ps]
- Si, Jin, Callan, and Ogilvie, "Language modeling framework for resource selection and results merging." To appear in CIKM 2002. [PS|http://www-2.cs.cmu.edu/~callan/Papers/cikm02-lsi.ps]
For October 14
- ''October 14th is a holiday. Class will meet on the 16th, which is a Wednesday but follows a Monday class schedule.''
For October 7
- Bennett, Dumais, and Horvitz. "Probabilistic combination of text classifiers using reliability indicators: models and results." In SIGIR 2002, pp 207-214. [PDF|http://research.microsoft.com/~sdumais/SIGIR2002-Combo.pdf]
- Federico and Bertoldi. "Statistical cross-language information retrieval using N-best query translations." In SIGIR 2002, pp. 167-174. Get the [PDF via the ACM portal|http://doi.acm.org/10.1145/564376.564407]
For September 30
- John Lafferty and Chengxiang Zhai. "Probabilistic relevance models based on document and query generation." In Proceedings of the Workshop on Language Modeling and Information Retrieval, Carnegie Mellon University, 2001. [PS|http://www.cs.cmu.edu/~lafferty/ps/dq.ps] (''this is different from the document on the language modeling workshop page'')
For September 23
- C. Zhai and J. Lafferty, "Two-Stage language models for information retrieval." Appears in SIGIR 2002, pp. 49-56. [PS|http://www-2.cs.cmu.edu/%7Elafferty/ps/two-stage.ps]
- Y. Zhang, J. Callan, and T. Minka, "Novelty and redundancy detection in adaptive filtering." Appears in SIGIR 2002, pp. 81-88. [PS|http://www-2.cs.cmu.edu/%7Ecallan/Papers/sigir02-yiz.ps]
For September 16
- R. Jin, A. G. Hauptmann, and C. Zhai, "Title Language Model for Information Retrieval." Appears in SIGIR 2002, pp. 42-47. [PDF|http://www.cs.cmu.edu/%7Eczhai/paper/sigir2002-titlemod.pdf]
- W. Kraaij, T. Westerveld, and D. Hiemstra, "The Importance of prior probabilities for entry page search." SIGIR 2002, pp. 27-34. [HTML|http://wwwhome.cs.utwente.nl/%7Ewesterve/sigirEPpriors.html]
Papers under consideration. Feel free to add suggestions, providing a link to the paper and preferably some thoughts about why you think it'd be interesting.
- "Expectation-propagation for the generative aspect model", John Lafferty and Thomas Minka. Uncertainty in Artificial Intelligence (UAI), 2002. [PS|http://www.cs.cmu.edu/~lafferty/ps/aspect.ps]
- C. Dwork, R. Kumar, M. Naor, and D. Sivakumar, "Rank Aggregation Methods for the Web." World Wide Web 2001. [HTML|http://citeseer.nj.nec.com/dwork01rank.html]
- Stephen Robertson and Djoerd Hiemstra, "Language Models and Probability of Relevance." In Proceedings of the Workshop on Language Modeling and Information Retrieval, Carnegie Mellon University, 2001. [PDF|http://la.lti.cs.cmu.edu/callan/Workshops/lmir01/WorkshopProcs/Papers/ser.pdf]
- Fernando Pereira. Formal Grammar and Information Theory: Together Again? Philosophical Transactions of the Royal Society, 358(1769):1239-1253, April 2000. [PDF|http://www.cis.upenn.edu/~pereira/papers/rsoc.pdf]
- Christopher J. C. Burges (1998). A Tutorial on Support Vector Machines for Pattern Recognition. [CiteSeerLink|http://citeseer.nj.nec.com/burges98tutorial.html]
- "Optimal Mixture Models in IR", Lavrenko, V., in the Proceedings of the 24th European Colloquium on IR Research (ECIR'02), Glasgow, Scotland, March 25-27, 2002. [PDF|http://ciir.cs.umass.edu/~lavrenko/pub/OptimalMixtureModels.pdf]
- A. Berger and J. Lafferty. Information Retrieval as Statistical Translation. In Proceedings of SIGIR-99, Berkeley, CA, August 1999. [CiteSeerLink|http://citeseer.nj.nec.com/berger99information.html]
[James's page|http://ciir.cs.umass.edu/cmpsci791h] (now hopelessly out of date)
-- AndreGauthier - 28 Oct 2002