<<O>>  Difference Topic TDTProject (r1.21 - 02 May 2005 - GiridharKumaran)

META TOPICPARENT WebHome

Line: 49 to 49

When looking for rules of interpretation, take a look at the mapping between the old rules and the new rules.
Added:
>
>
Links to topic descriptions:

-- BinLiu - 25 Mar 2003

-- AlvaroBolivar - 30 Jun 2003

 <<O>>  Difference Topic TDTProject (r1.20 - 18 Jul 2003 - FangfangFeng)

META TOPICPARENT WebHome

Line: 63 to 63


Changed:
<
<
>
>

META FILEATTACHMENT NEDUsingLanguageModelingTechniques.ppt attr="" comment="Presentation given on 7th April" date="1049735690" path="\\Dandenong\giridhar\NEDUsingLanguageModelingTechniques.ppt" size="262144" user="GiridharKumaran" version="1.1"
META FILEATTACHMENT SentenceForest.ppt attr="" comment="Ramesh's SenForest Presentation on 14th April" date="1050781865" path="\\Dandenong\giridhar\SentenceForest.ppt" size="821760" user="GiridharKumaran" version="/exp/rcf/share/bin/rlog -h /usr/www/wiki/pub/Main/TDTProject/SentenceForest.ppt,v"
META FILEATTACHMENT Timeline.jpg attr="" comment="" date="1053704498" path="C:\Documents and Settings\Giri\Desktop\Timeline.jpg" size="7613" user="GiridharKumaran" version="1.1"
 <<O>>  Difference Topic TDTProject (r1.19 - 18 Jul 2003 - FangfangFeng)

META TOPICPARENT WebHome

Line: 63 to 63


Changed:
<
<
>
>

META FILEATTACHMENT NEDUsingLanguageModelingTechniques.ppt attr="" comment="Presentation given on 7th April" date="1049735690" path="\\Dandenong\giridhar\NEDUsingLanguageModelingTechniques.ppt" size="262144" user="GiridharKumaran" version="1.1"
META FILEATTACHMENT SentenceForest.ppt attr="" comment="Ramesh's SenForest Presentation on 14th April" date="1050781865" path="\\Dandenong\giridhar\SentenceForest.ppt" size="821760" user="GiridharKumaran" version="/exp/rcf/share/bin/rlog -h /usr/www/wiki/pub/Main/TDTProject/SentenceForest.ppt,v"
META FILEATTACHMENT Timeline.jpg attr="" comment="" date="1053704498" path="C:\Documents and Settings\Giri\Desktop\Timeline.jpg" size="7613" user="GiridharKumaran" version="1.1"
 <<O>>  Difference Topic TDTProject (r1.18 - 10 Jul 2003 - GiridharKumaran)

META TOPICPARENT WebHome

Line: 44 to 44

    • 60 topics annotated by researchers

Annotation Guides:

Changed:
<
<
Includes "rules of interpretation".
>
>
Includes rules of interpretation.

When looking for rules of interpretation, take a look at the mapping between the old rules and the new rules.
 <<O>>  Difference Topic TDTProject (r1.17 - 01 Jul 2003 - GiridharKumaran)

META TOPICPARENT WebHome

Line: 62 to 62


Added:
>
>

META FILEATTACHMENT NEDUsingLanguageModelingTechniques.ppt attr="" comment="Presentation given on 7th April" date="1049735690" path="\\Dandenong\giridhar\NEDUsingLanguageModelingTechniques.ppt" size="262144" user="GiridharKumaran" version="1.1"
META FILEATTACHMENT SentenceForest.ppt attr="" comment="Ramesh's SenForest Presentation on 14th April" date="1050781865" path="\\Dandenong\giridhar\SentenceForest.ppt" size="821760" user="GiridharKumaran" version="/exp/rcf/share/bin/rlog -h /usr/www/wiki/pub/Main/TDTProject/SentenceForest.ppt,v"
META FILEATTACHMENT Timeline.jpg attr="" comment="" date="1053704498" path="C:\Documents and Settings\Giri\Desktop\Timeline.jpg" size="7613" user="GiridharKumaran" version="1.1"
Added:
>
>
META FILEATTACHMENT Annotator_GUI.java attr="" comment="Latest version of the NED annotator tool" date="1057026088" path="\\Dandenong\giridhar\Annotator_GUI.java" size="17635" user="GiridharKumaran" version="1.1"
 <<O>>  Difference Topic TDTProject (r1.16 - 30 Jun 2003 - AlvaroBolivar)

META TOPICPARENT WebHome

Line: 9 to 9

TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.

Added:
>
>
2003 Summer Meetings: TDT group meetings are Mondays from 2-3. Room CS303.

-- JamesAllan - 20 Mar 2003


Line: 33 to 36

    • Covers from October 1998 to December 1998
    • About 9130 news stories
    • 60 topics annotated by researchers
Changed:
<
<
  • TDT-4 Corpus
>
>

    • used for 2002 test
    • 8 English sources, 7 Mandarin sources and 5 Arabic sources
    • Covers from October 1 2001 to January 31 December 2002
    • 90375 news stories
    • 60 topics annotated by researchers
Changed:
<
<
-- BinLiu - 25 Mar 2003


Annotation Guides Includes "rules of interpretation"

>
>
Annotation Guides: Includes "rules of interpretation".

Added:
>
>
When looking for rules of interpretation, take a look at the mapping between the old rules and the new rules.

Changed:
<
<
-- AlvaroBolivar - 25 Jun 2003


New Event Detection Papers and Reports

Please feel free to include more papers, and edit the KwikLooks.

  • Y. Yang, J. Carbonell, R. Brown, T. Pierce, B. T. Archibald, X. Liu. Learning Approaches for Detecting and Tracking News Events. IEEE Intelligent Systems: Special Issue on Applications of Intelligent Information Retrieval, Vol 14(4), 32-43, 1999. ps.gz KwikLook

  • R. Papka and J. Allan. On-line New Event Detection using Single Pass Clustering, UMASS Computer Science Technical Report 98-21, 1998. ps KwikLook1

  • Stokes, N., and Carthy, J. First Story Detection using a Composite Document Representation, HLT 2001, Human Language Technology Conference, San Diego, California, March 18-21, 2001. pdf KwikLook2

  • James Allan, Victor Lavrenko, Hubert Jin. First Story Detection in TDT is Hard. Proceedings of the Ninth International Conference on Information and Knowledge Management, p. 374-381, November 06-11, 2000, McLean, Virginia. pdf KwikLook3

  • NED-related work by CMU, Jaime Carbonell, Yiming Yang, Ralf Brown, Jian Zhang, Jenny Ma. LTI, CMU, TDT 2002 Report. KwikLook4

  • PARC at TDT2002: First Story Detection. Thorsten Brants, Francine Chen, Ayman Farahat, TDT 2002 Report. KwikLook5

  • Allan, J., Jin, H., Rajman, M., Wayne, C., Gildea, D., Lavrenko, V., Hoberman, R., Caputo, D., Summer Workshop Final Report, Center for Language and Speech Processing, Johns Hopkins University (1999). pdf KwikLook6

  • Lavrenko, V., Allan, J., DeGuzman, E., LaFlamme, D., Pollard, V. and Thomas, S. Relevance Models for Topic Detection and Tracking. In the Proceedings of HLT 2002, San Diego, CA, March 24-27, 2002. pdf KwikLook7

Some Machine Learning Papers

  • William Cohen and Jacob Richman. Learning to Match and Cluster Large High-Dimensional Data Sets For Data Integration. KDD-2002. pdf

  • William Cohen, Pradeep Ravikumar and Stephen Fienberg. A Comparison of String Distance Metrics for Name-Matching Tasks. Submitted for publication. pdf

  • Mikhail Bilenko and Raymond J. Mooney. Adaptive Duplicate Detection Using Learnable String Similarity Measures. Submitted for publication, March 2003. pdf

  • Andrew McCallum and Ben Wellner. Toward Conditional Models of Identity Uncertainty with Application to Proper Noun Coreference. IJCAI Workshop on Information Integration on the Web, 2003. pdf
>
>
-- BinLiu - 25 Mar 2003

Changed:
<
<
-- GiridharKumaran - 26 May 2003
>
>
-- AlvaroBolivar - 30 Jun 2003


Changed:
<
<
A rough guide to what must be taken into consideration when building a language model for NED. As usual, feel free to comment or edit.

| Feature | Don't Include | Unsure | Include for sure | Source |
| Preprocessing: Stemming | X | | | TND '99 |
| Preprocessing: Stopping | X | | | TND '99 |
| Preprocessing: Removing numbers | X | | | TND '99 |
| Preprocessing: PARC Stopping Method | X | | | PARC |
| Named Entities | | | X | TND '99 |
| Named Entities: Normalize names across stories | | X | | TND '99 |
| Named Entities: Create stop list of named entities | | X | | TND '99 |
| Named Entities: Restrict history to stories with similar content | | | X | ? |
| Time Decay | | | X | TND '99 |
| Normalize story lengths | | | X | TND '99 |
| Composite document representations: e.g. Lexical Chains | | | X | Nicola Stokes et al. |
| Relevance models | | X | | UMass |
| Deferral window of stories | | | X | CMU |
| Hellinger Similarity Measure | | | X | PARC |
| Part of Speech Tagging | | | X | PARC |
| Time Information (Decay?) | X | | | PARC |
| Deferral Window | | X | | PARC |
>
>
TASKS

Changed:
<
<
-- GiridharKumaran - 10 Apr 2003
>
>

Added:
>
>


META FILEATTACHMENT NEDUsingLanguageModelingTechniques.ppt attr="" comment="Presentation given on 7th April" date="1049735690" path="\\Dandenong\giridhar\NEDUsingLanguageModelingTechniques.ppt" size="262144" user="GiridharKumaran" version="1.1"
META FILEATTACHMENT SentenceForest.ppt attr="" comment="Ramesh's SenForest Presentation on 14th April" date="1050781865" path="\\Dandenong\giridhar\SentenceForest.ppt" size="821760" user="GiridharKumaran" version="/exp/rcf/share/bin/rlog -h /usr/www/wiki/pub/Main/TDTProject/SentenceForest.ppt,v"
META FILEATTACHMENT Timeline.jpg attr="" comment="" date="1053704498" path="C:\Documents and Settings\Giri\Desktop\Timeline.jpg" size="7613" user="GiridharKumaran" version="1.1"
 <<O>>  Difference Topic TDTProject (r1.15 - 30 Jun 2003 - AlvaroBolivar)

META TOPICPARENT WebHome

Line: 15 to 15

TDT Corpus

Changed:
<
<
  • TDT-1 Corpus
>
>

    • used for 1997 pilot study
    • 2 sources from Reuters and CNN
    • About 16,000 news stories
    • Covers July 1, 1994 to June 30, 1995
    • 25 topics annotated by researchers
Changed:
<
<
  • TDT-2 Corpus
>
>

    • used for 1998 test
    • 6 English sources: New York Times, Associated Press Worldstream, CNN, ABC, PRI and VOA
    • Covers from January 4 to June 30, 1998
    • About 6,000 news stories
    • 100 topics annotated by researchers
Changed:
<
<
  • TDT-3 Corpus
>
>

    • used for 1999, 2000 and 2001 tests
    • 8 English sources (New York Times, Associated Press Worldstream, CNN, ABC, PRI, VOA English, NBC and MSNBC) and 3 Mandarin sources (Xinhua News, Zaobao and VOA Mandarin)
    • Covers from October 1998 to December 1998
Line: 44 to 44


Changed:
<
<
In the annotation guide for TDT3, you will find the rules of interpretation for topic labeling of TDT data.
>
>
Annotation Guides Includes "rules of interpretation"

-- AlvaroBolivar - 25 Jun 2003

Added:
>
>


New Event Detection Papers and Reports

 <<O>>  Difference Topic TDTProject (r1.14 - 25 Jun 2003 - AlvaroBolivar)

META TOPICPARENT WebHome

Line: 42 to 42

-- BinLiu - 25 Mar 2003

Added:
>
>

In the annotation guide for TDT3, you will find the rules of interpretation for topic labeling of TDT data.

-- AlvaroBolivar - 25 Jun 2003



New Event Detection Papers and Reports

 <<O>>  Difference Topic TDTProject (r1.13 - 26 May 2003 - GiridharKumaran)

META TOPICPARENT WebHome

Line: 64 to 64

  • Lavrenko, V., Allan, J., DeGuzman, E., LaFlamme, D., Pollard, V. and Thomas, S. Relevance Models for Topic Detection and Tracking. In the Proceedings of HLT 2002, San Diego, CA, March 24-27, 2002. pdf KwikLook7
Changed:
<
<
-- GiridharKumaran - 26 Mar 2003
>
>
Some Machine Learning Papers

  • William Cohen and Jacob Richman. Learning to Match and Cluster Large High-Dimensional Data Sets For Data Integration. KDD-2002. pdf

  • William Cohen, Pradeep Ravikumar and Stephen Fienberg. A Comparison of String Distance Metrics for Name-Matching Tasks. Submitted for publication. pdf

  • Mikhail Bilenko and Raymond J. Mooney. Adaptive Duplicate Detection Using Learnable String Similarity Measures. Submitted for publication, March 2003. pdf

  • Andrew McCallum and Ben Wellner. Toward Conditional Models of Identity Uncertainty with Application to Proper Noun Coreference. IJCAI Workshop on Information Integration on the Web, 2003. pdf

-- GiridharKumaran - 26 May 2003



 <<O>>  Difference Topic TDTProject (r1.12 - 23 May 2003 - GiridharKumaran)

META TOPICPARENT WebHome
Added:
>
>

2003 New Event Detection Timeline. Today is 07 Aug 2008.

Timeline.jpg



TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.

-- JamesAllan - 20 Mar 2003

Line: 89 to 96

META FILEATTACHMENT NEDUsingLanguageModelingTechniques.ppt attr="" comment="Presentation given on 7th April" date="1049735690" path="\\Dandenong\giridhar\NEDUsingLanguageModelingTechniques.ppt" size="262144" user="GiridharKumaran" version="1.1"
META FILEATTACHMENT SentenceForest.ppt attr="" comment="Ramesh's SenForest Presentation on 14th April" date="1050781865" path="\\Dandenong\giridhar\SentenceForest.ppt" size="821760" user="GiridharKumaran" version="/exp/rcf/share/bin/rlog -h /usr/www/wiki/pub/Main/TDTProject/SentenceForest.ppt,v"
Added:
>
>
META FILEATTACHMENT Timeline.jpg attr="" comment="" date="1053704498" path="C:\Documents and Settings\Giri\Desktop\Timeline.jpg" size="7613" user="GiridharKumaran" version="1.1"
 <<O>>  Difference Topic TDTProject (r1.11 - 05 May 2003 - GiridharKumaran)

META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.
Line: 64 to 64

A rough guide to what must be taken into consideration when building a language model for NED. As usual, feel free to comment or edit.

| Feature | Don't Include | Unsure | Include for sure | Source |
Changed:
<
<
| Preprocessing: Stemming | X | | | NED Workshop |
| Preprocessing: Stopping | X | | | NED Workshop |
| Preprocessing: Removing numbers | X | | | NED Workshop |
>
>
| Preprocessing: Stemming | X | | | TND '99 |
| Preprocessing: Stopping | X | | | TND '99 |
| Preprocessing: Removing numbers | X | | | TND '99 |

| Preprocessing: PARC Stopping Method | X | | | PARC |
Changed:
<
<
| Named Entities | | | X | NED Workshop |
| Named Entities: Normalize names across stories | | X | | NED Workshop |
| Named Entities: Create stop list of named entities | | X | | NED Workshop |
>
>
| Named Entities | | | X | TND '99 |
| Named Entities: Normalize names across stories | | X | | TND '99 |
| Named Entities: Create stop list of named entities | | X | | TND '99 |

| Named Entities: Restrict history to stories with similar content | | | X | ? |
Changed:
<
<
| Time Decay | | | X | NED Workshop |
| Normalize story lengths | | | X | NED Workshop |
>
>
| Time Decay | | | X | TND '99 |
| Normalize story lengths | | | X | TND '99 |

| Composite document representations: e.g. Lexical Chains | | | X | Nicola Stokes et al. |
| Relevance models | | X | | UMass |
| Deferral window of stories | | | X | CMU |
 <<O>>  Difference Topic TDTProject (r1.10 - 19 Apr 2003 - GiridharKumaran)

META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.
Line: 86 to 86

-- GiridharKumaran - 10 Apr 2003

Changed:
<
<

>
>

META FILEATTACHMENT NEDUsingLanguageModelingTechniques.ppt attr="" comment="Presentation given on 7th April" date="1049735690" path="\\Dandenong\giridhar\NEDUsingLanguageModelingTechniques.ppt" size="262144" user="GiridharKumaran" version="1.1"
Added:
>
>
META FILEATTACHMENT SentenceForest.ppt attr="" comment="Ramesh's SenForest Presentation on 14th April" date="1050781865" path="\\Dandenong\giridhar\SentenceForest.ppt" size="821760" user="GiridharKumaran" version="/exp/rcf/share/bin/rlog -h /usr/www/wiki/pub/Main/TDTProject/SentenceForest.ppt,v"
 <<O>>  Difference Topic TDTProject (r1.9 - 14 Apr 2003 - GiridharKumaran)

META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.
Line: 63 to 63

A rough guide to what must be taken into consideration when building a language model for NED. As usual, feel free to comment or edit.

Changed:
<
<
| Feature | Don't Include | Unsure | Include for sure |
| Preprocessing: Stemming | X | | |
| Preprocessing: Stopping | X | | |
| Preprocessing: Removing numbers | X | | |
| Named Entities | | | X |
| Named Entities: Normalize names across stories | | X | |
| Named Entities: Create stop list of named entities | | X | |
| Named Entities: Restrict history to stories with similar content | | | X |
| Time Decay | | | X |
| Normalize story lengths | | | X |
| Composite document representations: e.g. Lexical Chains | | | X |
| Relevance models | | X | |
| Deferral window of stories | | | X |
>
>
| Feature | Don't Include | Unsure | Include for sure | Source |
| Preprocessing: Stemming | X | | | NED Workshop |
| Preprocessing: Stopping | X | | | NED Workshop |
| Preprocessing: Removing numbers | X | | | NED Workshop |
| Preprocessing: PARC Stopping Method | X | | | PARC |
| Named Entities | | | X | NED Workshop |
| Named Entities: Normalize names across stories | | X | | NED Workshop |
| Named Entities: Create stop list of named entities | | X | | NED Workshop |
| Named Entities: Restrict history to stories with similar content | | | X | ? |
| Time Decay | | | X | NED Workshop |
| Normalize story lengths | | | X | NED Workshop |
| Composite document representations: e.g. Lexical Chains | | | X | Nicola Stokes et al. |
| Relevance models | | X | | UMass |
| Deferral window of stories | | | X | CMU |
| Hellinger Similarity Measure | | | X | PARC |
| Part of Speech Tagging | | | X | PARC |
| Time Information (Decay?) | X | | | PARC |
| Deferral Window | | X | | PARC |

-- GiridharKumaran - 10 Apr 2003

 <<O>>  Difference Topic TDTProject (r1.8 - 10 Apr 2003 - GiridharKumaran)

META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.

-- JamesAllan - 20 Mar 2003

Added:
>
>


TDT Corpus

  • TDT-1 Corpus
Line: 33 to 35

-- BinLiu - 25 Mar 2003

Added:
>
>


New Event Detection Papers and Reports

Please feel free to include more papers, and edit the KwikLooks.

Line: 49 to 53

  • PARC at TDT2002: First Story Detection. Thorsten Brants, Francine Chen, Ayman Farahat, TDT 2002 Report. KwikLook5
Changed:
<
<
  • Allan, J., Jin, H., Rajman, M., Wayne, C., Gildea, D., Lavrenko, V., Hoberman, R., Caputo, D., Summer Workshop Final Report, Center for Language and Speech Processing, Johns Hopkins University (1999). pdf KwikLook6?
>
>
  • Allan, J., Jin, H., Rajman, M., Wayne, C., Gildea, D., Lavrenko, V., Hoberman, R., Caputo, D., Summer Workshop Final Report, Center for Language and Speech Processing, Johns Hopkins University (1999). pdf KwikLook6

  • Lavrenko, V., Allan, J., DeGuzman, E., LaFlamme, D., Pollard, V. and Thomas, S. Relevance Models for Topic Detection and Tracking. In the Proceedings of HLT 2002, San Diego, CA, March 24-27, 2002. pdf KwikLook7

-- GiridharKumaran - 26 Mar 2003

Added:
>
>

A rough guide to what must be taken into consideration when building a language model for NED. As usual, feel free to comment or edit.

| Feature | Don't Include | Unsure | Include for sure |
| Preprocessing: Stemming | X | | |
| Preprocessing: Stopping | X | | |
| Preprocessing: Removing numbers | X | | |
| Named Entities | | | X |
| Named Entities: Normalize names across stories | | X | |
| Named Entities: Create stop list of named entities | | X | |
| Named Entities: Restrict history to stories with similar content | | | X |
| Time Decay | | | X |
| Normalize story lengths | | | X |
| Composite document representations: e.g. Lexical Chains | | | X |
| Relevance models | | X | |
| Deferral window of stories | | | X |

-- GiridharKumaran - 10 Apr 2003



META FILEATTACHMENT NEDUsingLanguageModelingTechniques.ppt attr="" comment="Presentation given on 7th April" date="1049735690" path="\\Dandenong\giridhar\NEDUsingLanguageModelingTechniques.ppt" size="262144" user="GiridharKumaran" version="1.1"
 <<O>>  Difference Topic TDTProject (r1.7 - 07 Apr 2003 - GiridharKumaran)

META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.
Line: 52 to 52

  • Allan, J., Jin, H., Rajman, M., Wayne, C., Gildea, D., Lavrenko, V., Hoberman, R., Caputo, D., Summer Workshop Final Report, Center for Language and Speech Processing, Johns Hopkins University (1999). pdf KwikLook6?

-- GiridharKumaran - 26 Mar 2003

Added:
>
>

META FILEATTACHMENT NEDUsingLanguageModelingTechniques.ppt attr="" comment="Presentation given on 7th April" date="1049735690" path="\\Dandenong\giridhar\NEDUsingLanguageModelingTechniques.ppt" size="262144" user="GiridharKumaran" version="1.1"
 <<O>>  Difference Topic TDTProject (r1.6 - 29 Mar 2003 - GiridharKumaran)

META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.
Line: 48 to 48

  • NED-related work by CMU, Jaime Carbonell, Yiming Yang, Ralf Brown, Jian Zhang, Jenny Ma. LTI, CMU, TDT 2002 Report. KwikLook4

  • PARC at TDT2002: First Story Detection. Thorsten Brants, Francine Chen, Ayman Farahat, TDT 2002 Report. KwikLook5
Added:
>
>

  • Allan, J., Jin, H., Rajman, M., Wayne, C., Gildea, D., Lavrenko, V., Hoberman, R., Caputo, D., Summer Workshop Final Report, Center for Language and Speech Processing, Johns Hopkins University (1999). pdf KwikLook6?

-- GiridharKumaran - 26 Mar 2003

 <<O>>  Difference Topic TDTProject (r1.5 - 27 Mar 2003 - GiridharKumaran)

META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.
Line: 39 to 39

  • Y. Yang, J. Carbonell, R. Brown, T. Pierce, B. T. Archibald, X. Liu. Learning Approaches for Detecting and Tracking News Events. IEEE Intelligent Systems: Special Issue on Applications of Intelligent Information Retrieval, Vol 14(4), 32-43, 1999. ps.gz KwikLook
Changed:
<
<
  • R. Papka and J. Allan. On-line New Event Detection using Single Pass Clustering, UMASS Computer Science Technical Report 98-21, 1998. pdf KwikLook1
>
>
  • R. Papka and J. Allan. On-line New Event Detection using Single Pass Clustering, UMASS Computer Science Technical Report 98-21, 1998. ps KwikLook1

  • Stokes, N., and Carthy, J. First Story Detection using a Composite Document Representation, HLT 2001, Human Language Technology Conference, San Diego, California, March 18-21, 2001. pdf KwikLook2
 <<O>>  Difference Topic TDTProject (r1.4 - 26 Mar 2003 - GiridharKumaran)

META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.
Line: 34 to 34

-- BinLiu - 25 Mar 2003

New Event Detection Papers and Reports

Added:
>
>

Please feel free to include more papers, and edit the KwikLooks.


  • Y. Yang, J. Carbonell, R. Brown, T. Pierce, B. T. Archibald, X. Liu. Learning Approaches for Detecting and Tracking News Events. IEEE Intelligent Systems: Special Issue on Applications of Intelligent Information Retrieval, Vol 14(4), 32-43, 1999. ps.gz KwikLook
 <<O>>  Difference Topic TDTProject (r1.3 - 26 Mar 2003 - GiridharKumaran)

META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.

-- JamesAllan - 20 Mar 2003

Changed:
<
<

TDT Corpus

>
>
TDT Corpus

  • TDT-1 Corpus
    • used for 1997 pilot study
Line: 32 to 32

    • 60 topics annotated by researchers

-- BinLiu - 25 Mar 2003

Added:
>
>

New Event Detection Papers and Reports

  • Y. Yang, J. Carbonell, R. Brown, T. Pierce, B. T. Archibald, X. Liu. Learning Approaches for Detecting and Tracking News Events. IEEE Intelligent Systems: Special Issue on Applications of Intelligent Information Retrieval, Vol 14(4), 32-43, 1999. ps.gz KwikLook

  • R. Papka and J. Allan. On-line New Event Detection using Single Pass Clustering, UMASS Computer Science Technical Report 98-21, 1998. pdf KwikLook1

  • Stokes, N., and Carthy, J. First Story Detection using a Composite Document Representation, HLT 2001, Human Language Technology Conference, San Diego, California, March 18-21, 2001. pdf KwikLook2

  • James Allan, Victor Lavrenko, Hubert Jin. First Story Detection in TDT is Hard. Proceedings of the Ninth International Conference on Information and Knowledge Management, p. 374-381, November 06-11, 2000, McLean, Virginia. pdf KwikLook3

  • NED-related work by CMU, Jaime Carbonell, Yiming Yang, Ralf Brown, Jian Zhang, Jenny Ma. LTI, CMU, TDT 2002 Report. KwikLook4

  • PARC at TDT2002: First Story Detection. Thorsten Brants, Francine Chen, Ayman Farahat, TDT 2002 Report. KwikLook5

-- GiridharKumaran - 26 Mar 2003

 <<O>>  Difference Topic TDTProject (r1.2 - 26 Mar 2003 - BinLiu)

META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.

-- JamesAllan - 20 Mar 2003

Added:
>
>

TDT Corpus

  • TDT-1 Corpus
    • used for 1997 pilot study
    • 2 sources from Reuters and CNN
    • About 16,000 news stories
    • Covers July 1, 1994 to June 30, 1995
    • 25 topics annotated by researchers
  • TDT-2 Corpus
    • used for 1998 test
    • 6 English sources: New York Times, Associated Press Worldstream, CNN, ABC, PRI and VOA
    • Covers from January 4 to June 30, 1998
    • About 6,000 news stories
    • 100 topics annotated by researchers
  • TDT-3 Corpus
    • used for 1999, 2000 and 2001 tests
    • 8 English sources (New York Times, Associated Press Worldstream, CNN, ABC, PRI, VOA English, NBC and MSNBC) and 3 Mandarin sources (Xinhua News, Zaobao and VOA Mandarin)
    • Covers from October 1998 to December 1998
    • About 9130 news stories
    • 60 topics annotated by researchers
  • TDT-4 Corpus
    • used for 2002 test
    • 8 English sources, 7 Mandarin sources and 5 Arabic sources
    • Covers from October 1 2001 to January 31 December 2002
    • 90375 news stories
    • 60 topics annotated by researchers

-- BinLiu - 25 Mar 2003

 <<O>>  Difference Topic TDTProject (r1.1 - 21 Mar 2003 - JamesAllan)
Line: 1 to 1
Added:
>
>
META TOPICPARENT WebHome
TDT group meetings are Mondays from 10-11 or from 10:30-11:30. If there is a talk scheduled at 11am, the meetings will start at 10.

-- JamesAllan - 20 Mar 2003

Revision r1.1 - 21 Mar 2003 - 02:50 - JamesAllan
Revision r1.21 - 02 May 2005 - 17:26 - GiridharKumaran