< < |
@
1.83
log
@none
@
text
@d1 1
a1 1
d15 4
a18 1
a20 2
@
1.82
log
@none
@
text
@d1 1
a1 1
a11 11
* February 3rd 10-11am (Room 151)
*++++No meeting February 10th++++*
@
1.81
log
@none
@
text
@d1 1
a1 1
d56 2
a57 2
@
1.80
log
@none
@
text
@d1 1
a1 1
d66 2
a67 2
@
1.79
log
@none
@
text
@d1 1
a1 1
d85 2
a86 2
@
1.78
log
@none
@
text
@d1 1
a1 1
d82 2
a83 2
@
1.77
log
@none
@
text
@d1 1
a1 1
d63 2
a64 2
@
1.76
log
@none
@
text
@d1 1
a1 1
d53 2
a54 2
@
1.75
log
@none
@
text
@d1 1
a1 1
d9 1
a9 1
All Spring 2006 meetings are held in CS151 from 10am-11am, unless otherwise specified.
d14 1
a14 1
* February 6th 10-11am (Room 151)*
d22 1
a22 1
*++++No meetings scheduled February 13th and 20th++++*
d24 1
a24 1
* February 27th 10-11am (Room 151)*
d33 1
a33 1
* March 6th 10-11am (Room 151)*
d42 1
a42 1
* March 13th 10-11am (Room 151)*
d51 1
a51 1
*++++No meeting March 20th++++*
d53 9
a61 1
* March 27th 10-11am (Room 151)*
d70 1
a70 1
* April 3rd 10-11am (Room 151)*
d79 1
a79 1
* April 10th 10-11am (Room 151)*
d88 2
a89 2
* April 17th 10-11am (Room 151)*
:
d95 1
a95 1
* April 24th 10-11am (Room 151)*
d97 1
a97 1
d100 1
a100 1
d102 1
a102 1
* May 1st 10-11am (Room 151)*
d112 1
a112 1
* May 8th 10-11am (Room 151)*
d122 1
a122 9
* May 15th 10-11am (Room 151)*
@
1.74
log
@none
@
text
@d1 1
a1 1
d5 1
a5 1
This page lists past, current, and upcoming CIIR lab meetings. Please feel free to edit the wiki to sign up for a meeting. Normally a meeting should include two 30-minute talks that are somehow connected. The connection could be very strong, in which case the two talks should be prepared with some collaboration.
d9 115
a123 1
All meetings are held in CS151, unless otherwise specified.
@
1.73
log
@none
@
text
@d1 1
a1 1
d15 2
a16 2
-
- Talk 1 Ron Bekkerman - _
- Abstract:
@
1.72
log
@none
@
text
@d1 1
a1 1
d9 1
a9 1
We meet from 1:10pm - 2:10pm on Mondays in CS151.
d11 1
@
1.71
log
@none
@
text
@d1 1
a1 1
d17 4
a20 1
a22 2
-
- Talk 3 Hema Raghavan - _
- Abstract:
d32 1
a32 4
-
- Talk 3 Hema Raghavan - Interaction in TDT Tracking
- Abstract: Interaction in News Filtering has been restricted to document level feedback. In addition, the assumption with feedback is that
a user provides feedback on every document delivered to him. Current news filtering evaluation frameworks do not consider a limit on the user's available labeling effort. In this talk we show that allowing users to provide subsets of documents for feedback in addition to marking documents as relevant is indeed beneficial for News Filtering, resulting in substantial improvements in TDT cost with as few as five documents labeled.
@
1.70
log
@none
@
text
@d1 1
a1 1
d11 11
a21 1
@
1.69
log
@none
@
text
@d1 1
a1 1
d21 4
a24 2
-
- Talk 3 Hema Raghavan
- Abstract:
@
1.68
log
@none
@
text
@d1 1
a1 1
d15 2
a16 2
-
- Talk 1 Don Metzler - Modeling Query Term Dependencies
- Abstract: Most information retrieval models make the assumption that term occurrences are independent of each other. Many attempts in the past to relax this assumption have been made, including the linked dependence model, n-gram language models, and models that make explicit use of phrases. In this work, we propose and evaluate a general probabilistic framework for modeling dependencies between query terms. Experimental results show that using different dependence assumptions across varying types and sizes of collections can yield significant improvements over models that assume strict term independence.
d18 2
a19 2
-
- Talk 2 Jamie Rothfeder - Aligning Transcriptions and Automatically Segmented Handwritten Documents
- Abstract: The MIR lab has developed a system for automatically segmenting word images from degraded, handwritten documents. This system has an error rate of around 18% when used on 100 documents from the George Washington collection. ASCII transcriptions corresponding to each of these 100 documents are available. If we knew exactly how the words in the transcriptions corresponded to the words in the handwritten documents, then we could automatically generate data in the form of word image, ASCII term pairs. These pairs are crucial for training automatic recognizers such as the one introduced in "Holistic Word Recognition for Handwritten Historical Documents" by Rath et al. Aligning the transcriptions with the handwritten documents would be a trivial task if the segmentation were error free, since the sequence of word images would correspond directly to the sequence of ASCII terms in the transcriptions. Unfortunately, segmentation errors offset the direct alignment and make our problem more complicated. In this talk, I will discuss an HMM-based method to align perfect transcriptions to imperfectly segmented documents. Our hidden variables represent the sequence of word images that have been automatically segmented from a given document; the state space for these variables is the set of all terms in the transcription for the document. The observed variables are the features extracted from the automatically segmented images. We use the Viterbi algorithm to decode these hidden variables and thus assign a transcript word to each of the segments. After this, a second post-processing step is employed to improve the alignment.
@
1.67
log
@none
@
text
@d1 1
a1 1
d15 1
a15 3
-
- Talk 1
- Abstract:
- Talk 2 Don Metzler - Modeling Query Term Dependencies
d17 2
a18 1
-
- Talk 3 Jamie Rothfeder - Aligning Transcriptions and Automatically Segmented Handwritten Documents
d20 4
@
1.66
log
@none
@
text
@d1 1
a1 1
d13 4
a16 3
- April 6, 2005
- Talk 1 Giri Kumaran - Recent Advances in Topic Detection and Tracking
- Abstract: We present a new way to represent and compare stories in news streams. In addition to the usual term-vector representation for each story, we also create additional vector representations that contain only the terms conveying a concise description of the story's topic. Comparison between stories is done by considering not only simple cosine similarity, but also named-entity overlap, non-named-entity overlap, similarity between the concise descriptions, and so on. All these similarity values are used as features to make a decision using a support vector machine. We report significant performance improvements in the tasks of New Event Detection, Link Detection, and Tracking.
@
1.65
log
@none
@
text
@d1 1
a1 1
d12 1
a12 1
d14 2
a15 2
-
- Talk 1 Giri Kumaran -
- Abstract:
@
1.64
log
@none
@
text
@d1 1
a1 1
d13 1
a13 1
*April 6, 2005
d16 3
a18 3
-
- Talk 2 Don Metzler -
- Abstract:
- Talk 3 Jamie Rothfeder - Aligning Transcriptions and Automatically Segmented Handwritten Documents
d21 1
a21 4
*March 9, 2005
d27 1
a27 1
* February 9th
@
1.63
log
@none
@
text
@d1 1
a1 1
d18 3
a20 2
-
- Talk 3 Jamie Rothfeder -
- Abstract:
@
1.62
log
@none
@
text
@d1 1
a1 1
d12 18
a29 1
* February 9th
a35 8
*March 9, 2005
-
- Talk 1 VanessaMurdock - Ad Hoc Sentence Retrieval
- Abstract: Sentence retrieval has become an integral part of question answering systems, novelty detection and summarization. Each of these tasks has different requirements of a "good" sentence. Studies of sentence retrieval have been done on a task-specific basis. We demonstrate a query-likelihood baseline for sentence retrieval independent of a specific task. We investigate ways to estimate a "translation" model, translating queries to sentences, incorporating external resources such as WordNet. We show significant performance gains by smoothing using the document that contains the sentence.
- Talk 2 Wei Li
- Talk 3 Koji Eguchi
@
1.61
log
@none
@
text
@d1 1
a1 1
d11 3
a13 2
d19 7
@
1.60
log
@none
@
text
@d1 1
a1 1
d12 2
a13 1
d205 2
d223 1
@
1.59
log
@none
@
text
@d1 1
a1 1
d10 7
@
1.58
log
@none
@
text
@d1 1
a1 1
a29 1
a30 1
@
1.57
log
@none
@
text
@d1 1
a1 1
d29 4
a32 2
@
1.56
log
@none
@
text
@d1 1
a1 1
d28 1
a28 1
@
1.55
log
@none
@
text
@d1 1
a1 1
d25 2
a28 1
@
1.54
log
@none
@
text
@d1 1
a1 1
d3 193
a195 190
CIIR weekly lab meetings
This page lists past, current, and upcoming CIIR lab meetings. Please feel free to edit the wiki to sign up for a meeting. Normally a meeting should include two 30-minute talks that are somehow connected. The connection could be very strong, in which case the two talks should be prepared with some collaboration.
The talks could be on your own work or any other interesting work that is related to the lab's general research direction. To stimulate new research ideas, please include a slide at the end that addresses future directions and open research questions related to that work.
We meet from 1:10pm - 2:10pm on Mondays in CS151.
Fall 2004
- November 15th
- Talk 1 VanessaMurdock - Sentence Retrieval from Questions
- Abstract: Passage retrieval has applications in question-answering, summarization, HARD, novelty detection, and machine translation. For tasks such as these there is more emphasis on the quality of the top of the ranked list, with less emphasis on the overall quality of the list. The richer the set of passages, in terms of relevant content, the more accurate the results. We present a simple translation model for passage retrieval at the sentence level. We choose sentences because sentences are a natural linguistic unit, whereas a passage may be an arbitrary piece of text. We demonstrate the translation model framework on TREC data, in the context of factoid question-answering, and show that it performs better than retrieval based on query likelihood, and on par with other systems.
- Nov 22nd (No lab meeting -- Virtual Thursday)
Summer 2004
- June 28
- Don Metzler - Indri
- Trevor Strohman - Indri
- July 19
- FernandoDiaz - Using Temporal Profiles of Queries for Precision Prediction (SIGIR practice talk)
- ToniRath - Handwriting Retrieval
Spring 2004
- January 12
- Charles Sutton on learning to perform multiple sequence labeling tasks simultaneously. PPT
- Shaolei Feng on using the Bernoulli model for something
- ProjectorSetup by
- January 19, no meeting (Martin Luther King Day)
- January 26, meeting was cancelled
- February 2, meeting was cancelled
- February 9, meeting was cancelled
- February 16, no meeting (Presidents' Day)
- February 23, meeting was cancelled
- March 8
- Don Metzler on multiple-Bernoulli models for language modeling
- Chirag Shah on evaluating high accuracy retrieval techniques
- ProjectorSetup by Ramesh
- March 15 (Spring break; may not meet.)
- Xiaoyong Liu on automatic recognition of reading levels from user queries
- Mark Smucker on document dependent smoothing
- ProjectorSetup by Trevor
- March 22
- Xiaoyan Li on using answer models for novelty detection
- Andres Corrada-Emmanuel
- ProjectorSetup by NadiaGhamrawi
- March 29
- Wei Li on answer retrieval from extracted tables
- Steve Cronen-Townsend on a language modeling framework for selective query expansion
- ProjectorSetup by XingWei
- April 5
- Hema Raghavan Experiments with ASR documents for IR and TDT
- Josh Lewis on search for Rexo and/or NSDL
- ProjectorSetup by Giridhar
- April 14, Jeremy's PhD defense talk is at 10:30
- April 19, no meeting (Patriots' Day)
- April 26
- Chung Heong Gooi - Cross Document Coreferencing on a Large Scale Corpus PPT
- ProjectorSetup by YunZhon
- May 3
- Giridhar Kumaran - Text Categorization and Named Entities for New Event Detection
- JiwoonJeon - Content Based Yahoo Photo News Retrieval
- ProjectorSetup by JJ
- May 17 (Classes ended the previous week)
Fall 2003
We meet from 11-12 on Tuesdays in CS151. Italics dates are in the past. Bold dates need one or more speakers.
- September 16. Predicting value of query expansion
- September 23. Cross-language issues
- Victor giving a tutorial of statistical machine translation basics PPT:
- Leah talking about the DARPA surprise language exercise.
- September 30. Smoothing
- Alvaro on smoothing at eBay
- Ramesh on a Zhai and Lafferty smoothing paper. PPT
- Chengxiang Zhai and John Lafferty, Model-based Feedback in the Language Modeling Approach to Information Retrieval, CIKM, 403-410, 2001. citeseer
- Chengxiang Zhai and John Lafferty, A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval, SIGIR 334-342, 2001. citeseer
- October 14. Music
- Jeremy/Victor on their ACM Multimedia paper on CRFs for music retrieval
- Vanessa on "Automatic transcription of piano music" by Raphael, ISMIR 2002 PDF.
- October 21. Arabic
- Nasreen on name transliteration PPT
- Giri presenting "Unsupervised learning of Arabic stemming using a parallel corpus" by Rogati et al, ACL 2003 PS.gz.
- October 28. CIKM practice talks
- Xiaoyan on time-based language models PPT:
- Ao Feng on clustering evaluation in TDT detection PPT:
- November 4. (CIKM is happening in New Orleans.)
- Hema Raghavan on Query-Free News Search (Monika Henzinger et al, WWW 2003) HTML
- James Allan on aligning transcripts and handwriting
- November 11. No meeting; today is Veterans Day.
- November 18. (TREC is happening in Gaithersburg.)
- Trevor Strohman on IR performance issues PPT
- Fuchun Peng on extracting information from technical papers
- November 25.
- Ben Carterette on BLEU and IR
- Xing Wei on table processing PPT
- December 2.
- Don Metzler on LM and Inference networks PPT
- Ramesh Nallapati on maximum entropy for IR PPT
- December 9.
- Xiaoyong Liu on experiments with clusters and language models PPT
- Margie Connell on cross-language processing for TDT PPT
- December 16.
- Toni Rath on historical manuscript retrieval and recognition
@
1.53
log
@none
@
text
@d1 1
a1 1
d7 1
a7 1
The talks could be on your own work or any other work that is related to the lab's general research direction that you find interesting. To stimulate new research ideas, please include a slide in the end that addresses future directions and open research questions related to that work.
a15 3
-
- Talk 2: GiridharKumaran - Web Search by Topic Familiarity
- Abstract: Current web search engines return a list of documents in response to a query by a user. Search engines offer a variety of advanced search options to help users further refine their queries. Some examples are options to specify domains, conjunctions and disjunctions of query terms, file types, ranges of dates, etc. We seek to provide an advanced search option for another problem frequently encountered by users, specifically fetching of documents in accordance with the user's familiarity with a topic. While the results returned by the search engine are most likely relevant to what the user is searching for, it is up to the user to go through the list of documents and select those that correspond to his or her information need. Of course, the alternative is for the user to modify the query such that only introductory or advanced documents are returned. This is a difficult task for the average web user. We have developed two systems that provide a user with an option to obtain only introductory or advanced documents on any topic. One system uses a classifier learned by training on examples, while the second performs automatic query expansion to return only low familiarity documents. Work done at Yahoo! Research Labs by Giridhar Kumaran, Rosie Jones, and Omid Madani.
@
1.52
log
@none
@
text
@d1 1
a1 1
d7 1
a7 1
The talks could be on your own work or any other work that is related to the lab's general research direction that you find interesting. To stimulate new research ideas, please include a slide in the end that addresses future directions and open research questions related to that work.
d16 2
a17 2
-
- Talk 2 (Please sign up!)
- Abstract:
@
1.51
log
@none
@
text
@d1 1
a1 1
d5 1
a5 1
This page lists past, current, and upcoming CIIR lab meetings. Please feel free to edit the wiki to sign up for a meeting. Normally a meeting should include two 30-minute talks that are somehow connected. The connection could be very strong, in which case the two talks should be prepared with some collaboration. Or the connection could be very weak in which case that seems less useful.
d7 24
a30 1
We meet from 11-12 on Mondays in CS151.
d33 2
@
1.50
log
@none
@
text
@d1 1
a1 1
d20 1
a20 5
- July 12
- Toni Rath - Handwriting Retrieval
- anyone?
d22 1
a22 1
@
1.49
log
@none
@
text
@d1 1
a1 1
d24 4
a174 1
@
1.48
log
@none
@
text
@d1 1
a1 1
d20 1
a20 1
@
1.47
log
@none
@
text
@d1 1
a1 1
d19 4
@
1.46
log
@none
@
text
@d1 1
a1 1
d16 1
a16 1
@
1.45
log
@none
@
text
@d1 1
a1 1
d11 1
a11 1
d16 1
a16 1
@
1.44
log
@none
@
text
@d1 1
a1 1
d9 11
a100 5
@
1.43
log
@none
@
text
@d1 1
a1 1
d90 5
@
1.42
log
@none
@
text
@d1 1
a1 1
a68 1
-
- Leah Larkey on language-specific models for TDT
d78 1
@
1.41
log
@none
@
text
@d1 1
a1 1
d68 1
a68 1
-
- Chung Heong Gooi - Cross Document Coreferencing on a Large Scale Corpus PPT
d172 1
@
1.40
log
@none
@
text
@d1 1
a1 1
d68 1
a68 1
-
- Chung Heong Gooi - Cross Document Coreferencing on a Large Scale Corpus
d171 1
@
1.39
log
@none
@
text
@d1 1
a1 1
d68 1
a68 1
-
- JiwoonJeon - Content Based Yahoo News Photo Retrieval
d74 1
a74 1
@
1.38
log
@none
@
text
@d1 1
a1 1
d78 1
a78 2
@
1.37
log
@none
@
text
@d1 1
a1 1
d73 1
a73 1
@
1.36
log
@none
@
text
@d1 1
a1 1
d59 2
a60 2
-
- Leah Larkey on language-specific models for TDT
- Ao Feng
d69 1
a69 1
@
1.35
log
@none
@
text
@d1 1
a1 1
d54 1
a54 1
-
- Hema Raghavan on using soundex codes for indexing ASR documents
@
1.34
log
@none
@
text
@d1 1
a1 1
d40 1
a40 1
-
- Leah Larkey on language-specific models for TDT
d59 1
a59 1
@
1.33
log
@none
@
text
@d1 1
a1 1
d34 1
a34 1
-
- Leah Larkey on language-specific models for TDT
d40 1
a40 1
-
- Don Metzler on multiple-Bernoulli models for language modeling
@
1.32
log
@none
@
text
@d1 1
a1 1
d18 3
a20 9
d22 1
a22 4
d26 1
a26 4
d34 2
a35 2
d39 2
a40 2
d44 2
a45 2
d49 2
a50 2
d54 2
a55 2
d59 2
a60 2
d63 2
a65 3
d69 1
a69 1
@
1.31
log
@none
@
text
@d1 1
a1 1
d81 1
a81 1
@
1.30
log
@none
@
text
@d1 1
a1 1
d43 1
a43 1
@
1.29
log
@none
@
text
@d1 1
a1 1
d41 1
a41 1
@
1.28
log
@none
@
text
@d1 1
a1 1
d31 1
a31 1
@
1.27
log
@none
@
text
@d1 1
a1 1
d26 1
a26 1
@
1.26
log
@none
@
text
@d1 1
a1 1
d98 1
a98 1
@
1.25
log
@none
@
text
@d1 1
a1 1
d31 1
a31 1
d53 1
a53 1
d98 1
a98 1
d103 1
a103 1
@
1.24
log
@none
@
text
@d1 1
a1 1
d93 1
a93 1
@
1.23
log
@none
@
text
@d1 1
a1 1
d12 1
a12 1
-
- Charles Sutton on learning to perform multiple sequence labeling tasks simultaneously.
d169 1
d184 1
@
1.22
log
@none
@
text
@d1 1
a1 1
d21 1
a21 1
d26 1
a26 1
d31 1
a31 1
d38 1
a38 1
d43 1
a43 1
d48 1
a48 1
d58 1
a58 1
d63 1
a63 1
d68 1
a68 1
d73 1
a73 1
d83 1
a83 1
d88 1
a88 1
@
1.21
log
@none
@
text
@d1 1
a1 1
d130 1
a130 1
-
- Nasreen on name transliteration
d147 1
a147 1
-
- Trevor Strohman on IR performance issues
d153 1
a153 1
-
- Xing Wei on table processing
d180 3
@
1.20
log
@none
@
text
@d1 1
a1 1
d12 2
a13 2
@
1.19
log
@none
@
text
@d1 1
a1 1
d7 1
a7 1
We meet from 11-12 on Tuesdays in CS151.
d11 1
a11 1
d16 3
a18 1
d23 1
a23 1
d28 1
a28 1
d33 3
a35 1
d40 1
a40 1
d45 1
a45 1
d50 1
a50 1
d55 1
a55 1
d60 1
a60 1
d65 1
a65 1
- March 16 (Spring break; may not meet.)
d70 1
a70 1
d75 1
a75 1
d80 1
a80 1
d85 1
a85 1
d90 1
a90 1
d95 1
a95 1
d100 1
a100 16
- May 18 (Classes ended the previous week)
@
1.18
log
@none
@
text
@d1 1
a1 1
d7 108
d150 1
a150 1
- November 4. (CIKM is happening in New Orleans.)
d155 1
a155 1
- November 11. No meeting; today is Veterans Day.
d157 1
a157 1
- November 18. (TREC is happening in Gaithersburg.)
d162 1
a162 1
d167 1
a167 1
d172 1
a172 1
d177 1
a177 2
- December 16.
- Victor Lavrenko on topic to be decided
@
1.17
log
@none
@
text
@d1 1
a1 1
d12 2
a13 2
d16 1
a16 1
-
- Victor giving a tutorial of statistical machine translation basics
d21 1
a21 1
-
- Ramesh on a Zhai and Lafferty smoothing paper.
d38 2
a39 2
-
- Xiaoyan on time-based language models
- Ao Feng on clustering evaluation in TDT detection
d60 2
a61 2
-
- Don Metzler on LM and Inference networks
- Ramesh Nallapati on maximum entropy for IR
d65 3
a67 3
-
- Xiaoyong Liu on experiments with clusters and language models
- Margie Connell on cross-language processing for TDT
d73 11
@
1.16
log
@none
@
text
@d1 1
a1 1
d71 1
a71 1
-
- Toni Rath on topic to be decided
@
1.15
log
@none
@
text
@d1 1
a1 1
d65 1
a65 1
-
- Xiaoyong Liu on topic to be decided
@
1.14
log
@none
@
text
@d1 1
a1 1
d45 1
a45 1
d52 1
a52 1
d57 1
a57 1
d62 1
a62 1
d67 1
a67 1
d72 1
a72 1
@
1.13
log
@none
@
text
@d1 1
a1 1
d15 1
a15 1
- September 23. Cross-language issues
d19 1
a19 1
d32 1
a32 1
d36 2
a37 1
d42 1
a42 1
- November 4. (CIKM is happening in New Orleans.)
d44 1
a44 1
d49 1
a49 1
- November 18. (TREC is happening in Gaithersburg.)
d51 1
@
1.12
log
@none
@
text
@d1 1
a1 1
d39 1
a39 1
@
1.11
log
@none
@
text
@d1 1
a1 1
d69 1
a69 1
-
- Toni Rath on topic to be decided
@
1.10
log
@none
@
text
@d1 1
a1 1
d42 2
a43 1
-
- Hema Raghavan on topic to be decided
@
1.9
log
@none
@
text
@d1 1
a1 1
d36 1
a36 1
d38 1
d48 1
@
1.8
log
@none
@
text
@d1 1
a1 1
d25 1
a25 3
- October 7.
- Ao on clustering evaluation within TDT.
d27 1
a27 1
d41 3
d45 1
d47 6
a52 1
d54 14
a67 3
- December 2.
- December 9.
- December 16.
@
1.7
log
@none
@
text
@d1 1
a1 1
d37 1
a37 1
@
1.6
log
@none
@
text
@d1 1
a1 1
d27 1
a27 1
@
1.5
log
@none
@
text
@d1 1
a1 1
d25 1
a25 1
d27 1
d32 1
d37 1
a37 1
d40 1
@
1.4
log
@none
@
text
@d1 1
a1 1
d22 2
@
1.3
log
@none
@
text
@d1 1
a1 1
d34 2
a35 1
@
1.2
log
@none
@
text
@d1 1
a1 1
a2 1
d24 1
@
1.1
log
@none
@
text
@d1 1
a1 1
a3 1
d12 11
a22 3
- September 16. Andres and Steve C-T.
- September 23. Leah and Victor.
- September 30. Alvaro and Ramesh.
d25 9
a33 2
- October 14. Jeremy/Victor and ??.
- October 21. Nasreen and ??.
@
|