New Event Detection Task
Papers and Reports
Please feel free to include more papers, and edit the KwikLooks.
- Y. Yang, J. Carbonell, R. Brown, T. Pierce, B. T. Archibald, X. Liu. Learning Approaches for Detecting and Tracking News Events. IEEE Intelligent Systems: Special Issue on Applications of Intelligent Information Retrieval, Vol. 14(4), 32-43, 1999. ps.gz KwikLook
- R. Papka and J. Allan. On-line New Event Detection using Single Pass Clustering. UMASS Computer Science Technical Report 98-21, 1998. ps KwikLook1
- Stokes, N., and Carthy, J. First Story Detection using a Composite Document Representation. HLT 2001, Human Language Technology Conference, San Diego, California, March 18-21, 2001. pdf KwikLook2
- James Allan, Victor Lavrenko, Hubert Jin. First Story Detection in TDT is Hard. Proceedings of the Ninth International Conference on Information and Knowledge Management, p. 374-381, November 06-11, 2000, McLean, Virginia. pdf KwikLook3
- NED-related work by CMU. Jaime Carbonell, Yiming Yang, Ralf Brown, Jian Zhang, Jenny Ma. LTI, CMU, TDT 2002 Report. KwikLook4
- PARC at TDT2002: First Story Detection. Thorsten Brants, Francine Chen, Ayman Farahat. TDT 2002 Report. KwikLook5
- Allan, J., Jin, H., Rajman, M., Wayne, C., Gildea, D., Lavrenko, V., Hoberman, R., Caputo, D. Summer Workshop Final Report, Center for Language and Speech Processing, Johns Hopkins University (1999). pdf KwikLook6
- Lavrenko, V., Allan, J., DeGuzman, E., LaFlamme, D., Pollard, V., and Thomas, S. Relevance Models for Topic Detection and Tracking. In Proceedings of HLT 2002, San Diego, CA, March 24-27, 2002. pdf KwikLook7
Some Machine Learning Papers
- William Cohen and Jacob Richman. Learning to Match and Cluster Large High-Dimensional Data Sets For Data Integration. KDD-2002. pdf
- William Cohen, Pradeep Ravikumar and Stephen Fienberg. A Comparison of String Distance Metrics for Name-Matching Tasks. Submitted for publication. pdf
- Mikhail Bilenko and Raymond J. Mooney. Adaptive Duplicate Detection Using Learnable String Similarity Measures. Submitted for publication, March 2003. pdf
- Andrew McCallum and Ben Wellner. Toward Conditional Models of Identity Uncertainty with Application to Proper Noun Coreference. IJCAI Workshop on Information Integration on the Web, 2003. pdf
-- GiridharKumaran - 26 May 2003
The table below is a rough guide to what must be taken into consideration when building a language model for New Event Detection (NED). As usual, feel free to comment or edit.
| Feature | Don't Include | Unsure | Include for sure | Source |
| Preprocessing: Stemming | X | | | TND '99 |
| Preprocessing: Stopping | X | | | TND '99 |
| Preprocessing: Removing numbers | X | | | TND '99 |
| Preprocessing: PARC Stopping Method | X | | | PARC |
| Named Entities | | | X | TND '99 |
| Named Entities: Normalize names across stories | | X | | TND '99 |
| Named Entities: Create stop list of named entities | | X | | TND '99 |
| Named Entities: Restrict history to stories with similar content | | | X | ? |
| Time Decay | | | X | TND '99 |
| Normalize story lengths | | | X | TND '99 |
| Composite document representations (e.g., lexical chains) | | | X | Nicola Stokes et al. |
| Relevance models | | X | | UMass |
| Deferral window of stories | | | X | CMU |
| Hellinger Similarity Measure | | | X | PARC |
| Part of Speech Tagging | | | X | PARC |
| Time Information (Decay?) | X | | | PARC |
| Deferral Window | | X | | PARC |
-- GiridharKumaran - 10 Apr 2003
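Three of the rows in the table above (Hellinger similarity, time decay, and story-length normalization) can be illustrated together in a few lines. The following is only a minimal sketch, not the PARC, CMU, or UMass implementation. It assumes whitespace tokenization and uses relative term frequencies as the length normalization. The Hellinger similarity is taken in one common form, the sum over shared terms of sqrt(p(w) * q(w)). The exponential decay and its `half_life` parameter (counted in stories) are hypothetical choices made for illustration, as are all of the function names.

```python
import math
from collections import Counter


def term_distribution(text):
    """Map lowercased whitespace tokens to relative frequencies.

    Dividing by the story length is the (assumed) length normalization.
    """
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def hellinger_similarity(p, q):
    """Sum of sqrt(p[w] * q[w]) over terms present in both distributions.

    Gives 1.0 for identical distributions and 0.0 for disjoint vocabularies.
    """
    return sum(math.sqrt(p[w] * q[w]) for w in p.keys() & q.keys())


def novelty_score(story, history, half_life=50.0):
    """Return 1 minus the best time-decayed similarity to any earlier story.

    `history` is a list of earlier story texts, oldest first. The decay
    halves a story's similarity every `half_life` stories (an assumption).
    """
    p = term_distribution(story)
    best = 0.0
    for age, past in enumerate(reversed(history), start=1):
        decay = 0.5 ** (age / half_life)
        sim = hellinger_similarity(p, term_distribution(past))
        best = max(best, decay * sim)
    return 1.0 - best
```

A story would be flagged as a new event when `novelty_score` exceeds a threshold tuned on held-out data. The threshold is deliberately left out here because the table gives no value for it.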