Aurora Pons-Porrata, Rafael Berlanga-Llavori, Jose Ruiz-Shulcloper,
Building a Hierarchy of Events and Topics for Newspapers Digital Libraries,
2003
DESCRIPTION:
The paper introduces an algorithm that builds a hierarchical cluster
structure from a given similarity measure and clustering routine. The
algorithm traverses the tree and reviews the cluster assignments at each
level of the hierarchy, on the assumption that the arrival of a new document
may cause existing clusters to merge or split.
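As a rough illustration of how a hierarchy can be layered on top of a
similarity measure and a clustering routine, here is a minimal batch sketch.
It is not the authors' algorithm: it does not handle incoming documents or
the merge/split review described above. The threshold-based routine, the
centroid representatives, and all names (cosine, threshold_clusters,
centroid, build_hierarchy) are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def threshold_clusters(items, sim, threshold):
    """Assumed clustering routine: connected components of the graph that
    links two items whenever sim(x, y) >= threshold."""
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if sim(items[i], items[j]) >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Attach the larger root to the smaller one.
                    parent[max(ri, rj)] = min(ri, rj)

    groups = {}
    for i in range(len(items)):
        groups.setdefault(find(i), []).append(i)
    return [groups[root] for root in sorted(groups)]


def centroid(vectors):
    """Mean of sparse vectors, used as a cluster representative."""
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}


def build_hierarchy(docs, sim, thresholds):
    """Cluster documents, then cluster the representatives of those
    clusters, once per threshold. Each level is a list of groups of
    indices into the items of the level below."""
    levels = []
    current = list(docs)
    for threshold in thresholds:
        groups = threshold_clusters(current, sim, threshold)
        levels.append(groups)
        current = [centroid([current[i] for i in g]) for g in groups]
    return levels
```

With a strict threshold at the first level and a looser one above it, the
lower level would play the role of events and the upper level that of topics.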
Another interesting point is the similarity measure, which takes into
account the temporal and spatial proximity of documents. The authors use
"time entities", i.e. specific dates or time intervals extracted
automatically from the documents. The measure also includes a spatial
proximity feature based on a thesaurus of place names.
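One possible way to combine content, time, and place in a single measure is
sketched below. The exact formula used in the paper is not reproduced in
these notes, so everything here is an assumption: the story layout (a dict
with "terms", "dates", "places"), the gating rule, the 7-day window, and the
thesaurus represented as a plain dict from place-name variants to a canonical
name.

```python
import math
from datetime import date


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def min_gap_days(dates_a, dates_b):
    """Smallest distance in days between any two time entities,
    or None if either story has no dates."""
    gaps = [abs((x - y).days) for x in dates_a for y in dates_b]
    return min(gaps) if gaps else None


def canonical_places(places, thesaurus):
    """Map each place name to its canonical form (hypothetical thesaurus)."""
    return {thesaurus.get(p, p) for p in places}


def story_similarity(a, b, thesaurus=None, max_gap_days=7):
    """Content similarity, gated by temporal and spatial proximity.
    Stories that are too far apart in time, or share no place, score 0."""
    thesaurus = thesaurus or {}
    gap = min_gap_days(a["dates"], b["dates"])
    if gap is None or gap > max_gap_days:
        return 0.0
    shared = (canonical_places(a["places"], thesaurus)
              & canonical_places(b["places"], thesaurus))
    if not shared:
        return 0.0
    return cosine(a["terms"], b["terms"])
```

A softer variant would multiply the cosine by decaying time and place
factors instead of using hard cut-offs; the hard gate is chosen here only
because it is the simplest to read.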
CRITIQUE:
The evaluation ignores the hierarchical structure.
COMMENTS:
It'd be interesting to explore the idea of time entities. Moreover, we could
use "entity models" to compare the places and people involved in two stories.