This directory contains the experiment data used in Nallapati, R., Feng, A., Peng, F., and Allan, J., "Event Threading within News Topics" in the Proceedings of the CIKM 2004 conference, pp. 446-453.
The data come from TDT-2 and TDT-3. Topics numbered 200?? are TDT-2 topics; those numbered 300?? and 310?? are TDT-3 topics. Unfortunately, the news stories themselves are proprietary, and we cannot make them available directly.
They can be obtained from the Linguistic Data Consortium at a cost that varies depending on your circumstances. Details can be found at http://www.ldc.upenn.edu/Projects/TDT2/data-release.html
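
If you need to tell which corpus a topic belongs to when processing the files, the numbering convention described above can be checked with a small helper. The following is only a sketch: the function name corpus_for_topic is hypothetical, and it assumes that each "?" in the patterns stands for a single digit, so that topic numbers are five digits long.

```python
import re

# Hypothetical lookup table built only from the prefixes named in this README.
_PREFIX_TO_CORPUS = {"200": "TDT-2", "300": "TDT-3", "310": "TDT-3"}


def corpus_for_topic(topic_id):
    """Return 'TDT-2' or 'TDT-3' for a topic number, or None if unrecognized.

    Assumes topic numbers are exactly five digits (an assumption, not
    something the data files themselves guarantee).
    """
    text = str(topic_id)
    if not re.fullmatch(r"\d{5}", text):
        return None
    # The first three digits identify the corpus.
    return _PREFIX_TO_CORPUS.get(text[:3])
```

Returning None for anything outside the three known prefixes keeps the helper from silently guessing a corpus for unexpected topic numbers.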