<<O>>  Difference Topic TDTNewEventDetection (r1.2 - 05 Aug 2003 - GiridharKumaran)

META TOPICPARENT TDTProject
New Event Detection Task
Line: 21 to 21

  • Allan, J., Jin, H., Rajman, M., Wayne, C., Gildea, D., Lavrenko, V., Hoberman, R., Caputo, D. Summer Workshop Final Report, Center for Language and Speech Processing, Johns Hopkins University (1999). pdf KwikLook6

  • Lavrenko, V., Allan, J., DeGuzman, E., LaFlamme, D., Pollard, V. and Thomas, S. Relevance Models for Topic Detection and Tracking. In the Proceedings of HLT 2002, San Diego, CA, March 24-27, 2002. pdf KwikLook7
Added:
>
>

  • Y. Yang, J. Zhang, J. Carbonell and C. Jin. Topic-conditioned Novelty Detection. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 688-693, 2002. pdf KwikLook8

Some Machine Learning Papers

 <<O>>  Difference Topic TDTNewEventDetection (r1.1 - 30 Jun 2003 - AlvaroBolivar)
Line: 1 to 1
Added:
>
>
META TOPICPARENT TDTProject
New Event Detection Task

Papers and Reports

Please feel free to include more papers, and edit the KwikLooks.

  • Y. Yang, J. Carbonell, R. Brown, T. Pierce, B. T. Archibald, X. Liu. Learning Approaches for Detecting and Tracking News Events. IEEE Intelligent Systems: Special Issue on Applications of Intelligent Information Retrieval, Vol. 14(4), pp. 32-43, 1999. ps.gz KwikLook

  • R. Papka and J. Allan. On-line New Event Detection using Single Pass Clustering. UMASS Computer Science Technical Report 98-21, 1998. ps KwikLook1

  • Stokes, N., and Carthy, J. First Story Detection using a Composite Document Representation. HLT 2001, Human Language Technology Conference, San Diego, California, March 18-21, 2001. pdf KwikLook2

  • James Allan, Victor Lavrenko, Hubert Jin. First Story Detection in TDT is Hard. Proceedings of the Ninth International Conference on Information and Knowledge Management, pp. 374-381, November 6-11, 2000, McLean, Virginia. pdf KwikLook3

  • NED-related work by CMU: Jaime Carbonell, Yiming Yang, Ralf Brown, Jian Zhang, Jenny Ma. LTI, CMU, TDT 2002 Report. KwikLook4

  • Thorsten Brants, Francine Chen, Ayman Farahat. PARC at TDT2002: First Story Detection. TDT 2002 Report. KwikLook5

  • Allan, J., Jin, H., Rajman, M., Wayne, C., Gildea, D., Lavrenko, V., Hoberman, R., Caputo, D. Summer Workshop Final Report, Center for Language and Speech Processing, Johns Hopkins University (1999). pdf KwikLook6

  • Lavrenko, V., Allan, J., DeGuzman, E., LaFlamme, D., Pollard, V. and Thomas, S. Relevance Models for Topic Detection and Tracking. In the Proceedings of HLT 2002, San Diego, CA, March 24-27, 2002. pdf KwikLook7

Some Machine Learning Papers

  • William Cohen and Jacob Richman. Learning to Match and Cluster Large High-Dimensional Data Sets For Data Integration. KDD-2002. pdf

  • William Cohen, Pradeep Ravikumar and Stephen Fienberg. A Comparison of String Distance Metrics for Name-Matching Tasks. Submitted for publication. pdf

  • Mikhail Bilenko and Raymond J. Mooney. Adaptive Duplicate Detection Using Learnable String Similarity Measures. Submitted for publication, March 2003. pdf

  • Andrew McCallum and Ben Wellner. Toward Conditional Models of Identity Uncertainty with Application to Proper Noun Coreference. IJCAI Workshop on Information Integration on the Web, 2003. pdf

-- GiridharKumaran - 26 May 2003


The table below is a rough guide to the features to consider when building a language model for New Event Detection (NED). As usual, feel free to comment or edit.

| Feature | Don't Include | Unsure | Include for sure | Source |
| Preprocessing: Stemming | X |  |  | TND '99 |
| Preprocessing: Stopping | X |  |  | TND '99 |
| Preprocessing: Removing numbers | X |  |  | TND '99 |
| Preprocessing: PARC Stopping Method | X |  |  | PARC |
| Named Entities |  |  | X | TND '99 |
| Named Entities: Normalize names across stories |  | X |  | TND '99 |
| Named Entities: Create stop list of named entities |  | X |  | TND '99 |
| Named Entities: Restrict history to stories with similar content |  |  | X | ? |
| Time Decay |  |  | X | TND '99 |
| Normalize story lengths |  |  | X | TND '99 |
| Composite document representations, e.g. Lexical Chains |  |  | X | Nicola Stokes et al. |
| Relevance models |  | X |  | UMass |
| Deferral window of stories |  |  | X | CMU |
| Hellinger Similarity Measure |  |  | X | PARC |
| Part of Speech Tagging |  |  | X | PARC |
| Time Information (Decay?) | X |  |  | PARC |
| Deferral Window |  | X |  | PARC |

-- GiridharKumaran - 10 Apr 2003
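
As a rough illustration of how a few of the features in the table above might fit together, here is a minimal Python sketch of single-pass new event detection. It is not taken from any of the cited systems. It assumes that story lengths are normalized by turning term counts into a probability distribution, that the Hellinger similarity is computed as the sum of sqrt(p_i * q_i) over shared terms, and that time decay is an exponential down-weighting of similarity with story age. The function names, the `threshold` value and the `half_life` parameter are all hypothetical choices made for this example. No stemming, stopping or number removal is applied.

```python
import math
import re
from collections import Counter


def term_distribution(text):
    """Length-normalized term distribution (no stemming, stopping or number removal)."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()} if total else {}


def hellinger_similarity(p, q):
    """Sum of sqrt(p_i * q_i) over shared terms: 1.0 if identical, 0.0 if disjoint."""
    return sum(math.sqrt(p[t] * q[t]) for t in p.keys() & q.keys())


def detect_new_events(stories, threshold=0.3, half_life=None):
    """Flag each story as new if it is not similar enough to any earlier story.

    stories   -- list of (timestamp, text) pairs in arrival order
    threshold -- hypothetical cut-off; a real system would tune this on training data
    half_life -- optional, assumed exponential time decay (same units as timestamp)
    Returns a list of (is_new, best_score) pairs, one per story.
    """
    history = []    # (timestamp, distribution) of every story seen so far
    decisions = []
    for timestamp, text in stories:
        dist = term_distribution(text)
        best = 0.0
        for past_time, past_dist in history:
            score = hellinger_similarity(dist, past_dist)
            if half_life is not None:
                # Older stories count for less: halve the score every half_life units.
                score *= 0.5 ** ((timestamp - past_time) / half_life)
            best = max(best, score)
        decisions.append((best < threshold, best))
        history.append((timestamp, dist))
    return decisions
```

Calling `detect_new_events` on a list of `(timestamp, text)` pairs returns one decision per story. The first story is always flagged as new, because the history is empty when it arrives. A deferral window, named entity handling and relevance models are left out to keep the sketch short.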

META TOPICMOVED AlvaroBolivar date="1057003701" from="Main.NewEventDetection" to="Main.TDTNewEventDetection"
Revision r1.1 - 30 Jun 2003 - 19:14 - AlvaroBolivar
Revision r1.2 - 05 Aug 2003 - 18:30 - GiridharKumaran