ATask-BasedEvaluationOfMulti-DocumentSummarization (r1.1 - 07 Dec 2007 - ElifAktolga)
META TOPICPARENT Fall2007ReadingGroup
-- ElifAktolga - 07 Dec 2007

Do Summaries Help? A Task-Based Evaluation of Multi-Document Summarization

| Date | Place | Author | Keyword |
| 2005 | SIGIR | K. McKeown, R. J. Passonneau, and D. K. Elson | summarization, evaluation |

Summary

This paper evaluates whether summaries produced by Newsblaster, Columbia's multi-document summarization system, help users complete a report-writing task.

Design of Evaluation

Report-writing task:

  • Three questions on each of four topics
  • 30 minutes to write the report
  • Use only the provided information (4 clusters: 2 have related news, 2 have relevant news)
  • Each report is written under one of four conditions: no summaries, one-sentence summaries, Newsblaster summaries, or human summaries
  • Report content quality is measured

Method of Evaluation

Pyramid method for scoring the reports:

  • look over all available summaries
  • look for clause-length "summarization content units" (SCUs)
  • weight them by the number of summaries they occur in
  • arrange the SCUs in a pyramid: the few SCUs contained in all the summaries sit at the top, the many SCUs contained in only one summary sit at the bottom
  • a good summary contains all the SCUs of the topmost tier first, then SCUs from the tiers below
  • the pyramid score is the ratio of the scored text's observed score (the sum of the weights of the SCUs it contains) to the optimal score, i.e. the highest weight sum achievable with the same number of SCUs
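
The scoring steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: SCU identification is a manual annotation step, so the sketch assumes every text has already been reduced to a set of SCU labels. The function names `build_pyramid` and `pyramid_score` and the labels `s1` to `s4` are hypothetical.

```python
from collections import Counter


def build_pyramid(model_scus):
    """Weight each SCU by the number of model texts that contain it."""
    weights = Counter()
    for scus in model_scus:
        weights.update(set(scus))  # count an SCU at most once per text
    return dict(weights)


def pyramid_score(candidate_scus, weights):
    """Observed weight sum divided by the best sum for the same SCU count."""
    found = set(candidate_scus)
    # SCUs that never occur in a model text contribute weight 0.
    observed = sum(weights.get(scu, 0) for scu in found)
    # Optimal text of the same size: take the highest-weighted SCUs first.
    best = sorted(weights.values(), reverse=True)[:len(found)]
    optimal = sum(best)
    return observed / optimal if optimal else 0.0


# Hypothetical annotation: three model texts, SCUs labelled s1..s4.
models = [{"s1", "s2", "s3"}, {"s1", "s2"}, {"s1", "s4"}]
pyramid = build_pyramid(models)  # s1 has weight 3, s2 has 2, s3 and s4 have 1

print(pyramid_score({"s1", "s2"}, pyramid))  # picks the two top SCUs
print(pyramid_score({"s1", "s3"}, pyramid))  # skips s2 in favour of s3
```

In this toy example the first candidate reaches the optimum for two SCUs (3 + 2), while the second collects 3 + 1 of a possible 5, so it scores lower even though both contain the top-tier SCU.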

Results

  • reports are significantly better when Newsblaster summaries are provided than when no summaries are provided
  • user satisfaction is higher with multi-document summaries (human or Newsblaster) than without any summaries
  • the better the summary, the more the users made use of it

Reference

Revision r1.1 - 07 Dec 2007 - 02:54 - ElifAktolga