Main.KwikLook1 r1.1 - 26 Mar 2003 - 17:03 - GiridharKumaran

As the title suggests, this paper focuses on on-line new event detection (NED), where simply returning the first document in each time-ordered cluster will not suffice. The algorithm presented is a variant of the single-pass algorithm mentioned in [1].

Each new document is processed as soon as it arrives: features are extracted from it and used to build a query representation of its content. The query's initial threshold is determined by evaluating the new document against its own query. If the document does not trigger any earlier query, that is, its score against that query does not exceed the query's threshold, it is marked as a new event. The threshold model developed for the task incorporates time information; the intuition is that documents widely separated in time are more likely to deal with different events.
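
The single-pass loop described above can be sketched roughly as follows. This is only an illustration, not the paper's implementation: the feature selection (most frequent terms), the scoring function, the threshold formula, and the parameter names `n_features`, `scale`, and `time_penalty` are all assumptions made for the example.

```python
from collections import Counter


def build_query(tokens, n_features):
    """Build a query from the n most frequent terms (assumed feature selection)."""
    top = Counter(tokens).most_common(n_features)
    total = sum(count for _, count in top) or 1
    return {term: count / total for term, count in top}


def evaluate(query, tokens):
    """Score a document against a query: weighted sum of the document's term frequencies."""
    counts = Counter(tokens)
    length = len(tokens) or 1
    return sum(weight * counts[term] / length for term, weight in query.items())


def detect_new_events(stream, n_features=10, scale=0.5, time_penalty=0.05):
    """Return one flag per document: True if it triggers no earlier query."""
    queries = []  # (query, self_score, arrival index)
    flags = []
    for j, tokens in enumerate(stream):
        is_new = True
        for query, self_score, i in queries:
            # Assumed threshold model: a fraction of the query's score on its
            # own document, raised linearly with the distance in arrival order.
            threshold = scale * self_score + time_penalty * (j - i)
            if evaluate(query, tokens) > threshold:
                is_new = False
                break
        flags.append(is_new)
        # Every document contributes a query, whether or not it was new.
        query = build_query(tokens, n_features)
        queries.append((query, evaluate(query, tokens), j))
    return flags


if __name__ == "__main__":
    stream = [
        "quake hits city quake damage".split(),
        "quake damage city rescue quake".split(),
        "election vote results election".split(),
    ]
    print(detect_new_events(stream))
```

In this sketch the time term makes an old query harder to trigger as more documents arrive, so a story that resembles a much earlier one is more likely to be flagged as new.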

Increasing the number of features used to build the queries improved detection performance, but at the cost of an unacceptable increase in the system's running time. At low feature dimensionality, misses were attributed to the inability of the feature extraction process to weight event-level features more heavily than more general topic-level features. Misses also occurred at higher feature dimensionalities, and these were ultimately ascribed to a poor weight assignment strategy for query features. The system was compared with similar systems from CMU and Dragon. Although each system was numerically better in different regions of the DET graph, the differences were not statistically significant.
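
For reference, a DET graph plots miss rate against false alarm rate. A minimal sketch of computing one such operating point from system decisions is given below; the function name and the boolean-list input format are assumptions for illustration, not part of the evaluation described in the paper.

```python
def miss_and_false_alarm(predicted_new, actually_new):
    """Return (miss rate, false alarm rate) for paired boolean decisions."""
    pairs = list(zip(predicted_new, actually_new))
    # A miss is a true new event that the system did not flag.
    misses = sum(1 for pred, actual in pairs if actual and not pred)
    # A false alarm is a flagged document that was not a new event.
    false_alarms = sum(1 for pred, actual in pairs if pred and not actual)
    n_new = sum(1 for _, actual in pairs if actual)
    n_old = len(pairs) - n_new
    miss_rate = misses / n_new if n_new else 0.0
    false_alarm_rate = false_alarms / n_old if n_old else 0.0
    return miss_rate, false_alarm_rate
```

Sweeping the detection threshold and recomputing these two rates at each setting traces out the curve.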

-- GiridharKumaran - 26 Mar 2003