<<O>>  Difference Topic IRseminar08-0215 (r1.1 - 11 Feb 2008 - JamesAllan)
Line: 1 to 1
Added:
>
>
META TOPICPARENT IRseminarS08

IR seminar, February 15, 2008

Learning to rank, Ben leading.

First, the LETOR benchmark dataset paper (pdf) is worth reading because it explains the nature of the learning to rank problem in IR, the data available for working on it, common features used, and two baseline algorithms (ranking SVM and RankBoost).

I've organized learning to rank papers by the approach they take. The papers are listed below.

  1. multi-class classification: classify documents by relevance label, then order them by most likely label or expected label (McRank, ordinal regression).
  2. pairwise classification/regression: decompose training data into pairwise preferences, then train a binary classifier on preferences (ranking SVM), or a regression model on magnitudes of differences (RankNet).
  3. ranker combination: train several "rankers" (generally classifiers) and combine their outputs (McRank, RankBoost), or combine outputs of several given rankers (metasearch methods).
  4. permutation or "listwise": learn a permutation of items directly (ListNet, CRanking).

Required papers:

Recommended:

Harder but of interest:

  • ListNet (Cao et al.)
  • CRanking (Lebanon & Lafferty)
  • Large Margin Rank Boundaries for Ordinal Regression (Herbrich et al.) (chapter in "Introduction to Large Margin Classifiers"; see below for link)

machine learning background:

The most important things to understand are classification and regression. Classifiers and regressors are both functions f(x_1, x_2, ..., x_n) that take real-valued "features" x_i (which encode some information about the document/query) and compute a number. Classifiers compute discrete class labels while regressors calculate real numbers. "Supervised learning" is the process of finding the function that minimizes some loss function (e.g. classification error, Bayes risk, the sum of squared residuals) calculated on a set of training data (features + labels/scores). Specific learning algorithms are distinguished by the space of functions they search over, the loss function they minimize, and how they solve the optimization problem.

Here are some papers, though all of them go into more detail than necessary:

Revision -
Revision r1.1 - 11 Feb 2008 - 02:08 - JamesAllan