<<O>>  Difference Topic LabMeeting (r1.121 - 29 Aug 2006 - JamesAllan)

@%META:TOPICINFO{author="KateMoruzzi" date="1138715902" format="1.0" version="1.84"}%

CIIR weekly lab meetings

Line: 373 to 373

    • Toni Rath on historical manuscript retrieval and recognition
      ProjectorSetup Chirag

Deleted:
<
<

@

1.83 log @none @ text @d1 1 a1 1 d15 4 a18 1

    • Talk 1:
a20 2
    • Talk 2:
    • Abstract:
@

1.82 log @none @ text @d1 1 a1 1 a11 11

* February 3rd 10-11am (Room 151)

    • Talk 1:
    • Abstract:

    • Talk 2:
    • Abstract:

*++++No meeting February 10th++++* @

1.81 log @none @ text @d1 1 a1 1 d56 2 a57 2

    • Talk 2:
    • Abstract:
@

1.80 log @none @ text @d1 1 a1 1 d66 2 a67 2

    • Talk 2:
    • Abstract:
@

1.79 log @none @ text @d1 1 a1 1 d85 2 a86 2

    • Talk 2:
    • Abstract:
@

1.78 log @none @ text @d1 1 a1 1 d82 2 a83 2

    • Talk 1:
    • Abstract:
@

1.77 log @none @ text @d1 1 a1 1 d63 2 a64 2

    • Talk 1:
    • Abstract:
@

1.76 log @none @ text @d1 1 a1 1 d53 2 a54 2

    • Talk 1:
    • Abstract:
@

1.75 log @none @ text @d1 1 a1 1 d9 1 a9 1 All Spring 2006 meetings are held in CS151 from 10am-11am, unless otherwise specified. d14 1 a14 1 * February 6th 10-11am (Room 151)* d22 1 a22 1 *++++No meetings scheduled February 13th and 20th++++* d24 1 a24 1 * February 27th 10-11am (Room 151)* d33 1 a33 1 * March 6th 10-11am (Room 151)* d42 1 a42 1 * March 13th 10-11am (Room 151)* d51 1 a51 1 *++++No meeting March 20th++++ d53 9 a61 1 * March 27th 10-11am (Room 151)* d70 1 a70 1 * April 3rd 10-11am (Room 151)* d79 1 a79 1 * April 10th 10-11am (Room 151)* d88 2 a89 2 * April 17th 10-11am (Room 151)* : d95 1 a95 1 * April 24th 10-11am (Room 151)* d97 1 a97 1

    • Talk 1:
d100 1 a100 1
    • Abstract:
d102 1 a102 1 * May 1st 10-11am (Room 151)* d112 1 a112 1 * May 8th 10-11am (Room 151)* d122 1 a122 9 * May 15th 10-11am (Room 151)*

    • Talk 1:
    • Abstract:

    • Talk 2:
    • Abstract:

@

1.74 log @none @ text @d1 1 a1 1 d5 1 a5 1 This page lists past, current, and upcoming CIIR lab meetings. Please feel free to edit the wiki to sign up for a meeting. Normally a meeting should include two 30-minute talks that are somehow connected. The connection could be very strong, in which case the two talks should be prepared with some collaboration. d9 115 a123 1 All meetings are held in CS151, unless otherwise specified. @

1.73 log @none @ text @d1 1 a1 1 d15 2 a16 2

    • Talk 1 Ron Bekkerman - _
    • Abstract:
@

1.72 log @none @ text @d1 1 a1 1 d9 1 a9 1 We meet from 1:10pm - 2:10pm on Mondays in CS151. d11 1 @

1.71 log @none @ text @d1 1 a1 1 d17 4 a20 1

    • Talk 2 Fernando Diaz - _
a22 2
    • Talk 3 Hema Raghavan - _
    • Abstract:
d32 1 a32 4
    • Talk 3 Hema Raghavan - Interaction in TDT Tracking
    • Abstract: Interaction in News Filtering has been restricted to document level feedback. In addition, the assumption with feedback is that
a user provides feedback on every document delivered to him. Current news filtering evaluation frameworks do not consider a limit on the user's available labeling effort. In this talk we show that allowing users to provide subsets of documents for feedback in addition to marking documents as relevant is indeed beneficial for News Filtering, resulting in substantial improvements in TDT cost with as few as five documents labeled.

@

1.70 log @none @ text @d1 1 a1 1 d11 11 a21 1

@

1.69 log @none @ text @d1 1 a1 1 d21 4 a24 2

    • Talk 3 Hema Raghavan
    • Abstract:
@

1.68 log @none @ text @d1 1 a1 1 d15 2 a16 2

    • Talk 1 Don Metzler - Modeling Query Term Dependencies
    • Abstract: Most information retrieval models make the assumption that term occurrences are independent of each other. Many attempts in the past to relax this assumption have been made, including the linked dependence model, n-gram language models, and models that make explicit use of phrases. In this work, we propose and evaluate a general probabilistic framework for modeling dependencies between query terms. Experimental results show that using different dependence assumptions across varying types and sizes of collections can yield significant improvements over models that assume strict term independence.
d18 2 a19 2
    • Talk 2 Jamie Rothfeder - Aligning Transcriptions and Automatically Segmented Handwritten Documents
    • Abstract: The MIR lab has developed a system for automatically segmenting word images from degraded, handwritten documents. This system has an error rate of around 18% when used on 100 documents from the George Washington collection. ASCII transcriptions corresponding to each of these 100 documents are available. If we knew exactly how the words in the transcriptions corresponded to the words in the handwritten documents, then we could automatically generate data in the form of word image, ASCII term pairs. These pairs are crucial for training automatic recognizers such as the one introduced in "Holistic Word Recognition for Handwritten Historical Documents" by Rath et al. Aligning the transcriptions with the handwritten documents would be a trivial task if the segmentation were error free, since the sequence of word images would correspond directly to the sequence of ASCII terms in the transcriptions. Unfortunately, segmentation errors offset the direct alignment and make our problem more complicated. In this talk, I will discuss an HMM-based method to align perfect transcriptions to imperfectly segmented documents. Our hidden variables represent the sequence of word images that have been automatically segmented from a given document; the state space for these variables is the set of all terms in the transcription for the document. The observed variables are the features extracted from the automatically segmented images. We use the Viterbi algorithm to decode these hidden variables and thus assign a transcript word to each of the segments. After this, a post-processing step is employed to improve the alignment.
@

1.67 log @none @ text @d1 1 a1 1 d15 1 a15 3

    • Talk 1
    • Abstract:
    • Talk 2 Don Metzler - Modeling Query Term Dependencies
d17 2 a18 1
    • Talk 3 Jamie Rothfeder - Aligning Transcriptions and Automatically Segmented Handwritten Documents
d20 4 @

1.66 log @none @ text @d1 1 a1 1 d13 4 a16 3

  • April 6, 2005
    • Talk 1 Giri Kumaran - Recent Advances in Topic Detection and Tracking
    • Abstract: We present a new way to represent and compare stories in news streams. In addition to the usual term-vector representation for each story, we also create additional vector representations that contain only the terms conveying a concise description of the story's topic. Comparison between stories is done by considering not only simple cosine similarity, but also named-entity overlap, non named-entity overlap, similarity between the concise descriptions and so on. All these similarity values are used as features to make a decision using a support vector machine. We report significant performance improvements in the tasks of New Event Detection, Link Detection, and Tracking.
@

1.65 log @none @ text @d1 1 a1 1 d12 1 a12 1

d14 2 a15 2

    • Talk 1 Giri Kumaran -
    • Abstract:
@

1.64 log @none @ text @d1 1 a1 1 d13 1 a13 1 *April 6, 2005 d16 3 a18 3

    • Talk 2 Don Metzler -
    • Abstract:
    • Talk 3 Jamie Rothfeder - Aligning Transcriptions and Automatically Segmented Handwritten Documents
d21 1 a21 4

*March 9, 2005 d27 1 a27 1 * February 9th @

1.63 log @none @ text @d1 1 a1 1 d18 3 a20 2

    • Talk 3 Jamie Rothfeder -
    • Abstract:
@

1.62 log @none @ text @d1 1 a1 1 d12 18 a29 1 * February 9th a35 8

*March 9, 2005

    • Talk 1 VanessaMurdock - Ad Hoc Sentence Retrieval
    • Abstract: Sentence retrieval has become an integral part of question answering systems, novelty detection and summarization. Each of these tasks has different requirements of a "good" sentence. Studies of sentence retrieval have been done on a task-specific basis. We demonstrate a query-likelihood baseline for sentence retrieval independent of a specific task. We investigate ways to estimate a "translation" model, translating queries to sentences, incorporating external resources such as WordNet. We show significant performance gains by smoothing using the document that contains the sentence.
    • Talk 2 Wei Li
    • Talk 3 Koji Eguchi

@

1.61 log @none @ text @d1 1 a1 1 d11 3 a13 2

d19 7 @

1.60 log @none @ text @d1 1 a1 1 d12 2 a13 1

    • Talk 1
d205 2 d223 1 @

1.59 log @none @ text @d1 1 a1 1 d10 7 @

1.58 log @none @ text @d1 1 a1 1 a29 1

    • Abstract: Coming soon...
a30 1
    • Abstract: Coming soon...
@

1.57 log @none @ text @d1 1 a1 1 d29 4 a32 2

@

1.56 log @none @ text @d1 1 a1 1 d28 1 a28 1

  • Dec 19th
@

1.55 log @none @ text @d1 1 a1 1 d25 2 a28 1

@

1.54 log @none @ text @d1 1 a1 1 d3 193 a195 190

CIIR weekly lab meetings

This page lists past, current, and upcoming CIIR lab meetings. Please feel free to edit the wiki to sign up for a meeting. Normally a meeting should include two 30-minute talks that are somehow connected. The connection could be very strong, in which case the two talks should be prepared with some collaboration.

The talks could be on your own work or any other interesting work that is related to the lab's general research direction. To stimulate new research ideas, please include a slide in the end that addresses future directions and open research questions related to that work.

We meet from 1:10pm - 2:10pm on Mondays in CS151.

Fall 2004

  • November 15th
    • Talk 1 VanessaMurdock - Sentence Retrieval from Questions
    • Abstract: Passage retrieval has applications in question-answering, summarization, HARD, novelty detection, and machine translation. For tasks such as these there is more emphasis on the quality of the top of the ranked list, with less emphasis on the overall quality of the list. The richer the set of passages, in terms of relevant content, the more accurate the results. We present a simple translation model for passage retrieval at the sentence level. We choose sentences because sentences are a natural linguistic unit, whereas a passage may be an arbitrary piece of text. We demonstrate the translation model framework on TREC data, in the context of factoid question-answering, and show that it performs better than retrieval based on query likelihood, and on par with other systems.

  • Nov 22nd (No lab meeting -- Virtual Thursday)

  • Nov 29th

  • Dec 6th

  • Dec 13th

  • Dec 19th

Summer 2004

  • June 28
    • Don Metzler - Indri
    • Trevor Strohman - Indri

  • July 19
    • FernandoDiaz - Using Temporal Profiles of Queries for Precision Prediction (SIGIR practice talk)
    • ToniRath - Handwriting Retrieval

Spring 2004

  • January 12
    • Charles Sutton on learning to perform multiple sequence labeling tasks simultaneously. PPT
    • Shaolei Feng on using the Bernoulli model for something
    • ProjectorSetup by

  • January 19, no meeting (Martin Luther King Day)

  • January 26, meeting was cancelled

  • February 2, meeting was cancelled

  • February 9, meeting was cancelled

  • February 16, no meeting (Presidents' Day)

  • February 23, meeting was cancelled

  • March 8
    • Don Metzler on multiple-Bernoulli models for language modeling
    • Chirag Shah on evaluating high accuracy retrieval techniques
    • ProjectorSetup by Ramesh

  • March 15 (Spring break; may not meet.)
    • Xiaoyong Liu on automatic recognition of reading levels from user queries
    • Mark Smucker on document dependent smoothing
    • ProjectorSetup by Trevor

  • March 22
    • Xiaoyan Li on using answer models for novelty detection
    • Andres Corrada-Emmanuel
    • ProjectorSetup by NadiaGhamrawi

  • March 29
    • Wei Li on answer retrieval from extracted tables
    • Steve Cronen-Townsend on a language modeling framework for selective query expansion
    • ProjectorSetup by XingWei

  • April 5
    • Hema Raghavan Experiments with ASR documents for IR and TDT
    • Josh Lewis on search for Rexo and/or NSDL
    • ProjectorSetup by Giridhar

  • April 14, Jeremy's PhD defense talk is at 10:30

  • April 19, no meeting (Patriots' Day)

  • April 26
    • Chung Heong Gooi - Cross Document Coreferencing on a Large Scale Corpus PPT
    • ProjectorSetup by YunZhon

  • May 3
    • Giridhar Kumaran - Text Categorization and Named Entities for New Event Detection
    • JiwoonJeon - Content Based Yahoo Photo News Retrieval
    • ProjectorSetup by JJ

Fall 2003

We meet from 11-12 on Tuesdays in CS151. Italics dates are in the past. Bold dates need one or more speakers.

  • September 16. Predicting value of query expansion

  • September 23. Cross-language issues
    • Victor giving a tutorial of statistical machine translation basics PPT:
    • Leah talking about the DARPA surprise language exercise.

  • September 30. Smoothing
    • Alvaro on smoothing at eBay
    • Ramesh on a Zhai and Lafferty smoothing paper. PPT
      • Chengxiang Zhai and John Lafferty, Model-based Feedback in the Language Modeling Approach to Information Retrieval, CIKM, 403-410, 2001. citeseer
      • Chengxiang Zhai and John Lafferty, A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval, SIGIR 334-342, 2001. citeseer

  • October 7. (cancelled)

  • October 14. Music
    • Jeremy/Victor on their ACM Multimedia paper on CRFs for music retrieval
    • Vanessa on "Automatic transcription of piano music" by Raphael, ISMIR 2002 PDF.
      ProjectorSetup Your wiki name here

  • October 21. Arabic
    • Nasreen on name transliteration PPT
    • Giri presenting "Unsupervised learning of Arabic stemming using a parallel corpus" by Rogati et al, ACL 2003 PS.gz.
      ProjectorSetup VanessaMurdock

  • October 28. CIKM practice talks

  • November 4. (CIKM is happening in New Orleans.)
    • Hema Raghavan on Query-Free News Search (Monika Henzinger et al., WWW 2003) HTML
    • James Allan on aligning transcripts and handwriting
      ProjectorSetup Giri

  • November 11. No meeting; today is Veterans Day.

  • November 18. (TREC is happening in Gaithersburg.)
    • Trevor Strohman on IR performance issues PPT
    • Fuchun Peng on extracting information from technical papers
      ProjectorSetup Nadia

  • November 25.
    • Ben Carterette on BLEU and IR
    • Xing Wei on table processing PPT
      ProjectorSetup Charles

  • December 2.
    • Don Metzler on LM and Inference networks PPT
    • Ramesh Nallapati on maximum entropy for IR PPT
      ProjectorSetup Gary

  • December 9.

  • December 16.
    • Toni Rath on historical manuscript retrieval and recognition
      ProjectorSetup Chirag

@

1.53 log @none @ text @d1 1 a1 1 d7 1 a7 1 The talks could be on your own work or any other work that is related to the lab's general research direction that you find interesting. To stimulate new research ideas, please include a slide in the end that addresses future directions and open research questions related to that work. a15 3

    • Talk 2: GiridharKumaran - Web Search by Topic Familiarity
    • Abstract: Current web search engines return a list of documents in response to a query by a user. Search engines offer a variety of advanced search options to help users further refine their queries. Some examples are options to specify domains, conjunctions and disjunctions of query terms, file types, ranges of dates, etc. We seek to provide an advanced search option for another problem frequently encountered by users, specifically fetching documents in accordance with the user's familiarity with a topic. While the results returned by the search engine are most likely relevant to what the user is searching for, it is up to the user to go through the list of documents and select those that correspond to his or her information need. Of course, the alternative is for the user to modify the query such that only introductory or advanced documents are returned. This is a difficult task for the average web user. We have developed two systems that provide a user with an option to obtain only introductory or advanced documents on any topic. One system uses a classifier learned by training on examples, while the second performs automatic query expansion to return only low familiarity documents. Work done at Yahoo! Research Labs by Giridhar Kumaran, Rosie Jones, and Omid Madani.

@

1.52 log @none @ text @d1 1 a1 1 d7 1 a7 1 The talks could be on your own work or any other work that is related to the lab's general research direction that you find interesting. To stimulate new research ideas, please include a slide in the end that addresses future directions and open research questions related to that work. d16 2 a17 2

    • Talk 2 (Please sign up!)
    • Abstract:
@

1.51 log @none @ text @d1 1 a1 1 d5 1 a5 1 This page lists past, current, and upcoming CIIR lab meetings. Please feel free to edit the wiki to sign up for a meeting. Normally a meeting should include two 30-minute talks that are somehow connected. The connection could be very strong, in which case the two talks should be prepared with some collaboration. Or the connection could be very weak in which case that seems less useful. d7 24 a30 1 We meet from 11-12 on Mondays in CS151. d33 2 @

1.50 log @none @ text @d1 1 a1 1 d20 1 a20 5

  • July 12
    • Toni Rath - Handwriting Retrieval
    • anyone?

  • July 12
d22 1 a22 1

@

1.49 log @none @ text @d1 1 a1 1 d24 4 a174 1

@

1.48 log @none @ text @d1 1 a1 1 d20 1 a20 1

  • July 5
@

1.47 log @none @ text @d1 1 a1 1 d19 4 @

1.46 log @none @ text @d1 1 a1 1 d16 1 a16 1

  • June 21
@

1.45 log @none @ text @d1 1 a1 1 d11 1 a11 1

  • June 7
d16 1 a16 1
  • June 14
@

1.44 log @none @ text @d1 1 a1 1 d9 11 a100 5

@

1.43 log @none @ text @d1 1 a1 1 d90 5 @

1.42 log @none @ text @d1 1 a1 1 a68 1

    • Leah Larkey on language-specific models for TDT
d78 1 @

1.41 log @none @ text @d1 1 a1 1 d68 1 a68 1

    • Chung Heong Gooi - Cross Document Coreferencing on a Large Scale Corpus PPT
d172 1 @

1.40 log @none @ text @d1 1 a1 1 d68 1 a68 1

    • Chung Heong Gooi - Cross Document Coreferencing on a Large Scale Corpus
d171 1 @

1.39 log @none @ text @d1 1 a1 1 d68 1 a68 1

    • JiwoonJeon - Content Based Yahoo News Photo Retrieval
d74 1 a74 1
    • TBA
@

1.38 log @none @ text @d1 1 a1 1 d78 1 a78 2

    • TBA
    • TBA
@

1.37 log @none @ text @d1 1 a1 1 d73 1 a73 1

    • TBA
@

1.36 log @none @ text @d1 1 a1 1 d59 2 a60 2

    • Leah Larkey on language-specific models for TDT
    • Ao Feng
d69 1 a69 1
    • Vanessa Murdock
@

1.35 log @none @ text @d1 1 a1 1 d54 1 a54 1

    • Hema Raghavan on using soundex codes for indexing ASR documents
@

1.34 log @none @ text @d1 1 a1 1 d40 1 a40 1

    • Leah Larkey on language-specific models for TDT
d59 1 a59 1
    • Mark Smucker
@

1.33 log @none @ text @d1 1 a1 1 d34 1 a34 1

    • Leah Larkey on language-specific models for TDT
d40 1 a40 1
    • Don Metzler on multiple-Bernoulli models for language modeling
@

1.32 log @none @ text @d1 1 a1 1 d18 3 a20 9

d22 1 a22 4 d26 1 a26 4 d34 2 a35 2
    • TBA
    • TBA
d39 2 a40 2
    • TBA
    • TBA
d44 2 a45 2
    • TBA
    • TBA
d49 2 a50 2
    • TBA
    • TBA
d54 2 a55 2
    • TBA
    • TBA
d59 2 a60 2
    • TBA
    • TBA
d63 2 a65 3 d69 1 a69 1
    • TBA
@

1.31 log @none @ text @d1 1 a1 1 d81 1 a81 1

    • TBA
@

1.30 log @none @ text @d1 1 a1 1 d43 1 a43 1

@

1.29 log @none @ text @d1 1 a1 1 d41 1 a41 1

    • TBA
@

1.28 log @none @ text @d1 1 a1 1 d31 1 a31 1

@

1.27 log @none @ text @d1 1 a1 1 d26 1 a26 1

@

1.26 log @none @ text @d1 1 a1 1 d98 1 a98 1

@

1.25 log @none @ text @d1 1 a1 1 d31 1 a31 1

d53 1 a53 1 d98 1 a98 1 d103 1 a103 1 @

1.24 log @none @ text @d1 1 a1 1 d93 1 a93 1

@

1.23 log @none @ text @d1 1 a1 1 d12 1 a12 1

    • Charles Sutton on learning to perform multiple sequence labeling tasks simultaneously.
d169 1 d184 1 @

1.22 log @none @ text @d1 1 a1 1 d21 1 a21 1

d26 1 a26 1 d31 1 a31 1 d38 1 a38 1 d43 1 a43 1 d48 1 a48 1 d58 1 a58 1 d63 1 a63 1 d68 1 a68 1 d73 1 a73 1 d83 1 a83 1 d88 1 a88 1 @

1.21 log @none @ text @d1 1 a1 1 d130 1 a130 1

    • Nasreen on name transliteration
d147 1 a147 1
    • Trevor Strohman on IR performance issues
d153 1 a153 1
    • Xing Wei on table processing
d180 3 @

1.20 log @none @ text @d1 1 a1 1 d12 2 a13 2

    • TBA
    • TBA
@

1.19 log @none @ text @d1 1 a1 1 d7 1 a7 1 We meet from 11-12 on Tuesdays in CS151. d11 1 a11 1

  • January 6
d16 3 a18 1
  • January 13
d23 1 a23 1
  • January 20
d28 1 a28 1
  • January 27
d33 3 a35 1
  • February 3
d40 1 a40 1
  • February 10
d45 1 a45 1
  • February 17
d50 1 a50 1
  • February 24
d55 1 a55 1
  • March 2
d60 1 a60 1
  • March 9
d65 1 a65 1
  • March 16 (Spring break; may not meet.)
d70 1 a70 1
  • March 23
d75 1 a75 1
  • March 30
d80 1 a80 1
  • April 6
d85 1 a85 1
  • April 13
d90 1 a90 1
  • April 20
d95 1 a95 1
  • April 27
d100 1 a100 16

  • May 25
@

1.18 log @none @ text @d1 1 a1 1 d7 108 d150 1 a150 1

  • November 4. (CIKM is happening in New Orleans.)
d155 1 a155 1
  • November 11. No meeting; today is Veteran's Day.
d157 1 a157 1
  • November 18. (TREC is happening in Gaithersburg.)
d162 1 a162 1
  • November 25.
d167 1 a167 1
  • December 2.
d172 1 a172 1
  • December 9.
d177 1 a177 2
  • December 16.
    • Victor Lavrenko on topic to be decided
@

1.17 log @none @ text @d1 1 a1 1 d12 2 a13 2

    • Andres
    • Steve C-T.
d16 1 a16 1
    • Victor giving a tutorial of statistical machine translation basics
d21 1 a21 1
    • Ramesh on a Zhai and Lafferty smoothing paper.
d38 2 a39 2
    • Xiaoyan on time-based language models
    • Ao Feng on clustering evaluation in TDT detection
d60 2 a61 2
    • Don Metzler on LM and Inference networks
    • Ramesh Nallapati on maximum entropy for IR
d65 3 a67 3
    • Xiaoyong Liu on experiments with clusters and language models
    • Margie Connell on cross-language processing for TDT
      ProjectorSetup BenWellner
d73 11 @

1.16 log @none @ text @d1 1 a1 1 d71 1 a71 1

    • Toni Rath on topic to be decided
@

1.15 log @none @ text @d1 1 a1 1 d65 1 a65 1

    • Xiaoyong Liu on topic to be decided
@

1.14 log @none @ text @d1 1 a1 1 d45 1 a45 1

ProjectorSetup Your wiki name here
d52 1 a52 1
ProjectorSetup Your wiki name here
d57 1 a57 1
ProjectorSetup Your wiki name here
d62 1 a62 1
ProjectorSetup Your wiki name here
d67 1 a67 1
ProjectorSetup Your wiki name here
d72 1 a72 1
ProjectorSetup Your wiki name here
@

1.13 log @none @ text @d1 1 a1 1 d15 1 a15 1

  • September 23. Cross-language issues
d19 1 a19 1
  • September 30. Smoothing
d32 1 a32 1
  • October 21. Arabic
d36 2 a37 1
  • October 28.
d42 1 a42 1
  • November 4. (CIKM is happening in New Orleans.)
d44 1 a44 1

d49 1 a49 1

  • November 18. (TREC is happening in Gaithersburg.)
d51 1 @

1.12 log @none @ text @d1 1 a1 1 d39 1 a39 1

ProjectorSetup Your wiki name here
@

1.11 log @none @ text @d1 1 a1 1 d69 1 a69 1

    • Toni Rather on topic to be decided
@

1.10 log @none @ text @d1 1 a1 1 d42 2 a43 1

    • Hema Raghavan on topic to be decided
@

1.9 log @none @ text @d1 1 a1 1 d36 1 a36 1

  • October 28.
d38 1 d48 1 @

1.8 log @none @ text @d1 1 a1 1 d25 1 a25 3

d27 1 a27 1
  • October 14. Music
d41 3 d45 1 d47 6 a52 1
  • November 25.
d54 14 a67 3
  • December 2.
  • December 9.
  • December 16.
@

1.7 log @none @ text @d1 1 a1 1 d37 1 a37 1

ProjectorSetup Your wiki name here
@

1.6 log @none @ text @d1 1 a1 1 d27 1 a27 1

ProjectorSetup Your wiki name here
@

1.5 log @none @ text @d1 1 a1 1 d25 1 a25 1

  • October 7.
d27 1 d32 1 d37 1 a37 1

d40 1 @

1.4 log @none @ text @d1 1 a1 1 d22 2 @

1.3 log @none @ text @d1 1 a1 1 d34 2 a35 1

  • October 28.
@

1.2 log @none @ text @d1 1 a1 1 a2 1

d24 1 @

1.1 log @none @ text @d1 1 a1 1 a3 1

d12 11 a22 3

  • September 16. Andres and Steve C-T.
  • September 23. Leah and Victor.
  • September 30. Alvaro and Ramesh.
d25 9 a33 2
  • October 14. Jeremy/Victor and ??.
  • October 21. Nasreen and ??.
@

 <<O>>  Difference Topic LabMeeting (r1.120 - 10 May 2006 - XiaoyongLiu)

Line: 146 to 146

    • Abstract: Pseudo-parallel corpora are corpora which follow parallel topical distributions but may only contain a few exact translations. While the usefulness of such corpora is questionable for training statistical machine translation systems, previous results indicate that they may be helpful for cross-lingual information retrieval. In this talk, I will describe experiments comparing and re-aligning parallel document corpora. Our technique uses only geometric properties of the corpora (i.e., does not require training a translation system) and achieves surprisingly strong alignment performance.

    • Talk 2: Xiaoyong Liu
Changed:
<
<
    • Abstract: TB
>
>
    • Abstract: The most common approach to cluster-based retrieval (CBR), which was proposed in the 1970s, is to retrieve one or more clusters in their entirety in response to a query. Research in this area has suggested that “optimal” clusters exist that, if retrieved, would yield very large improvements in effectiveness relative to document-based retrieval (DBR). However, no real retrieval strategy has achieved this result. Except for precision-oriented searches on very small data sets, DBR is found to be generally more effective. There has been a resurgence of research in CBR in the past few years, including our own efforts in this area. The general approach is to use clusters as a form of document smoothing. Studies have shown that clusters can indeed improve retrieval performance automatically on modern test collections and that the language modeling framework is an effective probabilistic retrieval framework for studying CBR. The reported results are encouraging, but there is still considerable room for improvement compared to what optimal clusters could potentially produce were they retrieved. In the proposed research, we examine the optimal and real performance of CBR with the goal of identifying the characteristics of optimal clusters. We develop a set of techniques that will address several aspects of CBR, including systematic modeling of document-cluster relationships, different ways of representing clusters for retrieval, and possibly new retrieval models that are more suitable for CBR. Preliminary results on TREC collections demonstrate the promise of the research.

Spring 2005

 <<O>>  Difference Topic LabMeeting (r1.119 - 06 May 2006 - FernandoDiaz)

Line: 142 to 142

* May 12th 10-11am (Room 151)*

Changed:
<
<
    • Talk 1: Fernando Diaz
    • Abstract: TBA
>
>
    • Talk 1: Fernando Diaz "Experiments Toward Pseudo-Parallel Corpora"
    • Abstract: Pseudo-parallel corpora are corpora which follow parallel topical distributions but may only contain a few exact translations. While the usefulness of such corpora is questionable for training statistical machine translation systems, previous results indicate that they may be helpful for cross-lingual information retrieval. In this talk, I will describe experiments comparing and re-aligning parallel document corpora. Our technique uses only geometric properties of the corpora (i.e., does not require training a translation system) and achieves surprisingly strong alignment performance.

    • Talk 2: Xiaoyong Liu
    • Abstract: TB
 <<O>>  Difference Topic LabMeeting (r1.118 - 04 May 2006 - ChiragShah)

Line: 132 to 132

* May 5th 10-11am (ROOM 151)*

    • Talk 1: Chirag Shah
Changed:
<
<
    • Abstract: TBA
>
>
    • Title: "Using Named Entities Representation for Story Link Detection (SLD)"
    • Abstract: The Topic Detection and Tracking (TDT) forum has evoked a new line of research that explicitly focuses on the event-based organization of broadcast news. The uniqueness of this research has made it worthwhile to address some of its issues with a different approach than traditional IR. We identify some of these peculiarities and argue that named entities provide a better way of representing the documents in TDT-related research. We support this argument with a series of experiments on the Story Link Detection (SLD) task on TDT corpora. Our proposed systems that make use of named entities for document representation exhibit significant performance improvement over the baseline. Executing the experiments on different TDT corpora of varying nature, we identify some of the issues with different approaches regarding their effectiveness. We also provide a deeper analysis of the results and pinpoint the unique characteristics of various systems. This knowledge is used to combine two different systems and boost the performance even further. We are currently working on understanding the limitations of named-entity-based representations and identifying ways to address them.

    • Talk 2: Xiaoyan Li "Sentence Level Information Patterns for Novelty Detection"
 <<O>>  Difference Topic LabMeeting (r1.117 - 03 May 2006 - KateMoruzzi)

Line: 144 to 144

    • Talk 1: Fernando Diaz
    • Abstract: TBA
Changed:
<
<
    • Talk 2: Xing Yi
>
>
    • Talk 2: Xiaoyong Liu

    • Abstract: TB

 <<O>>  Difference Topic LabMeeting (r1.116 - 01 May 2006 - KateMoruzzi)

Line: 129 to 129

* April 28th - MEETING CANCELED

Changed:
<
<

* May 5th 10-11am (ROOM CHANGE - ROOM 150)*

>
>
* May 5th 10-11am (ROOM 151)*

    • Talk 1: Chirag Shah
    • Abstract: TBA
Changed:
<
<
    • Talk 2: Xiaoyan Li
    • Abstract: TBA
>
>
    • Talk 2: Xiaoyan Li "Sentence Level Information Patterns for Novelty Detection"

Added:
>
>
    • Abstract: The detection of new information in a document stream is an important component of many potential applications. In this work, a new novelty detection approach based on the identification of sentence level information patterns is proposed. Given a user’s information need, some information patterns in sentences such as combinations of query words, sentence lengths, named entities and phrases, and other sentence patterns, may contain more important and relevant information than single words. A thorough analysis of sentence level information patterns is carried out on data from the TREC novelty tracks, including sentence lengths, named entities, and opinion patterns. I will present how we perform novelty detection based on information patterns, which focuses on the identification of previously unseen query-related patterns in sentences. A unified pattern-based approach to novelty detection is presented for both specific NE topics and more general topics. Experiments on novelty detection were carried out on data from the TREC 2003 and 2004 novelty tracks. Experimental results show that the proposed approach significantly improves the performance of novelty detection for both specific and general topics, and therefore the overall performance for all topics from the 2002-2004 TREC novelty tracks, in terms of precision at top ranks. Future research directions along this line are suggested in the conclusions of the work.

* May 12th 10-11am (Room 151)*

Line: 146 to 145

    • Abstract: TBA

    • Talk 2: Xing Yi
Changed:
<
<
    • Abstract: TBA
>
>
    • Abstract: TB

Spring 2005

 <<O>>  Difference Topic LabMeeting (r1.115 - 24 Apr 2006 - KateMoruzzi)

Line: 127 to 127

    • Title: Cleaning and Augmenting Databases by Learning their Alignments with Text Collections
    • Abstract: Many real-world data mining applications begin with noisy, partially-filled databases. One might correct and fill these databases using information extracted from unstructured text. However, state-of-the-art extraction systems are trained by machine learning and typically require a large quantity of training data. This paper introduces a method of automatically cleaning and augmenting databases using only the existing noisy database and unlabeled text. A conditional random field model is used to learn alignments between existing database records and the appearance of their information in text, and simultaneously learn to perform extraction of new records not yet in the database. We present preliminary results learning extractors for bibliographic citations using real-world and noisy BibTeX databases.
Changed:
<
<
* April 28th 10-11am (Room 151)*
>
>
* April 28th - MEETING CANCELED*

Deleted:
<
<
    • Talk 1: Xiaoyong Liu
    • Abstract:

    • Talk 2:
    • Abstract:

* May 5th 10-11am (ROOM CHANGE - ROOM 150)*

Deleted:
<
<


    • Talk 1: Chirag Shah
    • Abstract: TBA
 <<O>>  Difference Topic LabMeeting (r1.114 - 21 Apr 2006 - KateMoruzzi)

Line: 135 to 135

    • Talk 2:
    • Abstract:
Changed:
<
<
* May 5th 10-11am (Room 151)*
>
>
* May 5th 10-11am (ROOM CHANGE - ROOM 150)*

    • Talk 1: Chirag Shah
 <<O>>  Difference Topic LabMeeting (r1.113 - 19 Apr 2006 - KedarBellare)

Line: 124 to 124

    • Talk 2: Kedar Bellare
Changed:
<
<
    • Abstract: TBA
>
>
    • Title: Cleaning and Augmenting Databases by Learning their Alignments with Text Collections
    • Abstract: Many real-world data mining applications begin with noisy, partially-filled databases. One might correct and fill these databases using information extracted from unstructured text. However, state-of-the-art extraction systems are trained by machine learning and typically require a large quantity of training data. This paper introduces a method of automatically cleaning and augmenting databases using only the existing noisy database and unlabeled text. A conditional random field model is used to learn alignments between existing database records and the appearance of their information in text, and simultaneously learn to perform extraction of new records not yet in the database. We present preliminary results learning extractors for bibliographic citations using real-world and noisy BibTeX databases.

* April 28th 10-11am (Room 151)*

 <<O>>  Difference Topic LabMeeting (r1.112 - 18 Apr 2006 - VanessaMurdock)

Line: 131 to 131

    • Talk 1: Xiaoyong Liu
    • Abstract:
Changed:
<
<
    • Talk 2: Vanessa Murdock
    • Abstract: TBA
>
>
    • Talk 2:
    • Abstract:

* May 5th 10-11am (Room 151)*

 <<O>>  Difference Topic LabMeeting (r1.111 - 14 Apr 2006 - KateMoruzzi)

Line: 119 to 119

* April 21st 10-11am (Room 151)*

    • Talk 1: Yun Zhou
Changed:
<
<
    • Abstract: TBA
>
>
    • Title: A novel approach to predict retrieval performance: Ranking Robustness
    • Abstract: A general observation in the field of noisy data retrieval is that as retrieval effectiveness improves, the ranking function becomes more robust against data corruption. Motivated by this, we introduce a statistical measure to quantify the notion of ranking robustness in the context of regular document retrieval. We show that the new approach is at least as good as the clarity score method across a variety of collections. In particular, a combination of the two usually leads to further improvements.

    • Talk 2: Kedar Bellare
    • Abstract: TBA
 <<O>>  Difference Topic LabMeeting (r1.110 - 12 Apr 2006 - JamesAllan)

Line: 7 to 7

The talks could be on your own work or any other interesting work that is related to the lab's general research direction. To stimulate new research ideas, please include a slide at the end that addresses future directions and open research questions related to that work.

All Spring 2006 meetings are held on Friday mornings in CS151 from 10am-11am, unless otherwise specified.

Added:
>
>

LMSpring05, LMFall04, LMSummer04, LMSpring04, LMFall03, LMSpring03


Spring 2006

 <<O>>  Difference Topic LabMeeting (r1.109 - 11 Apr 2006 - XingWei)

Line: 107 to 107

    • Talk 1: Xing Wei
Changed:
<
<
    • Abstract: TBA
>
>
    • Title: LDA-Based Document Models for Ad-hoc Retrieval
    • Abstract: Previous research on cluster-based retrieval has shown that simple topic models can lead to significant improvements in retrieval performance. An approach to building topic models based on a formal generative model of documents, Latent Dirichlet Allocation (LDA), is heavily cited in the machine learning literature, but its feasibility and effectiveness in information retrieval are still unknown. In this paper, we study how to efficiently use LDA to improve ad-hoc retrieval. We propose an LDA-based document model within the language modeling framework, and evaluate it on several TREC collections. Gibbs sampling is employed to conduct approximate inference in LDA and the computational complexity is analyzed. We show that significant improvements over cluster-based retrieval can be obtained with reasonable efficiency.

    • Talk 2: Giridhar Kumaran
    • Title: Simple Questions to Improve Pseudo-Relevance Feedback Results, and Some Failure Analysis
 <<O>>  Difference Topic LabMeeting (r1.108 - 11 Apr 2006 - GiridharKumaran)

Line: 110 to 110

    • Abstract: TBA

    • Talk 2: Giridhar Kumaran
Changed:
<
<
    • Abstract: TBA
>
>
    • Title: Simple Questions to Improve Pseudo-Relevance Feedback Results, and Some Failure Analysis
    • Abstract: We explore methods to further improve the performance of pseudo-relevance feedback. Studies suggest that new methods for tackling difficult queries are required. Our approach is to gather more information about the query from the user by asking her simple questions. The equally simple responses are used to modify the original query. Our experiments using the TREC Robust Track queries show that we can obtain a significant improvement in mean average precision of around 10% over pseudo-relevance feedback. This improvement is also spread across more queries compared to ordinary pseudo-relevance feedback, as suggested by geometric mean average precision.

* April 21st 10-11am (Room 151)*

 <<O>>  Difference Topic LabMeeting (r1.107 - 06 Apr 2006 - KateMoruzzi)

Line: 132 to 132

    • Talk 1: Chirag Shah
Changed:
<
<
    • Abstract:
>
>
    • Abstract: TBA

Changed:
<
<
    • Talk 2:
    • Abstract:
>
>
    • Talk 2: Xiaoyan Li
    • Abstract: TBA

* May 12th 10-11am (Room 151)*

 <<O>>  Difference Topic LabMeeting (r1.106 - 06 Apr 2006 - KateMoruzzi)

Line: 116 to 116

    • Talk 1: Yun Zhou
    • Abstract: TBA
Changed:
<
<
    • Talk 2:
>
>

    • Talk 2: Kedar Bellare

    • Abstract: TBA

* April 28th 10-11am (Room 151)*

Line: 142 to 144

    • Talk 1: Fernando Diaz
    • Abstract: TBA
Changed:
<
<
    • Talk 2:
    • Abstract:
>
>
    • Talk 2: Xing Yi
    • Abstract: TBA

 <<O>>  Difference Topic LabMeeting (r1.105 - 06 Apr 2006 - RameshNallapati)

Line: 94 to 94

* April 7th 10-11am (Room 151)*

    • Talk 1: Ramesh Nallapati
Changed:
<
<
    • Abstract: TBA
>
>
    • Title: Smoothed Dirichlet distribution: Understanding the Cross-entropy ranking function in IR
    • Abstract: In this work we analyze the popular Cross-entropy ranking function in information retrieval. We uncover the generative distribution, namely the Smoothed Dirichlet distribution, underlying this ranking function and show that this distribution captures term occurrence distribution much better than the multinomial, thus offering a reason behind the success of the ranking function. We present theoretically motivated approximations to the distribution that lead to a closed form maximum likelihood solution, much like the multinomial, making it ideal for online IR tasks. We use the new distribution to construct a new, well-motivated ad-hoc retrieval algorithm. Our experiments show that this algorithm performs at least as well as similar algorithms that employ cross-entropy ranking. It also provides additional flexibility, e.g. in handling queries of various lengths, due to a consistent generative framework.

    • Talk 2: Shaolei Feng
    • Title: A Hierarchical, HMM based Automatic Evaluation of OCR Accuracy for a Digital Library of Books
 <<O>>  Difference Topic LabMeeting (r1.104 - 05 Apr 2006 - KateMoruzzi)

Line: 115 to 115

    • Talk 1: Yun Zhou
    • Abstract: TBA
Changed:
<
<
    • Talk 2: Ron Bekkerman
>
>
    • Talk 2:

    • Abstract: TBA

* April 28th 10-11am (Room 151)*

Line: 138 to 138

* May 12th 10-11am (Room 151)*

Changed:
<
<
    • Talk 1:
    • Abstract:
>
>
    • Talk 1: Fernando Diaz
    • Abstract: TBA

    • Talk 2:
    • Abstract:
 <<O>>  Difference Topic LabMeeting (r1.103 - 03 Apr 2006 - ShaoleiFeng)

Line: 97 to 97

    • Abstract: TBA

    • Talk 2: Shaolei Feng
Changed:
<
<
    • Abstract: TBA
>
>
    • Title: A Hierarchical, HMM based Automatic Evaluation of OCR Accuracy for a Digital Library of Books
    • Abstract: Content-based online book retrieval usually requires first converting printed text into machine-readable text using an OCR engine and then doing full-text search on the results. Many of these books are old and there are a variety of processing steps that are required to create an end-to-end system. Changing any step can affect OCR performance and hence a good automatic statistical evaluation of OCR performance on book-length material is needed. Evaluating OCR performance on the entire book is non-trivial. The only easily obtainable ground truth must be automatically aligned with the OCR output over the entire length of a book. This may be viewed as equivalent to the problem of aligning two large (easily a million long) sequences. The problem is further complicated by OCR errors as well as the possibility of large chunks of missing material in one of the sequences. I will describe a Hidden Markov Model (HMM) based hierarchical alignment algorithm to align OCR output and the ground truth for books. The alignment process works by breaking up the problem of aligning two long sequences into the problem of aligning many smaller subsequences. Joint work with R. Manmatha while visiting Google.

* April 14th 10-11am (Room 151)*

 <<O>>  Difference Topic LabMeeting (r1.102 - 03 Apr 2006 - ShaoleiFeng)

 <<O>>  Difference Topic LabMeeting (r1.101 - 31 Mar 2006 - ChiragShah)

Line: 126 to 126

* May 5th 10-11am (Room 151)*

Changed:
<
<
    • Talk 1:
>
>
    • Talk 1: Chirag Shah

    • Abstract:

    • Talk 2:
 <<O>>  Difference Topic LabMeeting (r1.100 - 29 Mar 2006 - JiwoonJeon)

Line: 87 to 87

    • Talk 2: Jiwoon Jeon
Changed:
<
<
    • Abstract: TBA
>
>
    • Title: Predicting the Quality of Answers with Non-Textual Features
    • Abstract: New types of document collections are being developed by various web services. The service providers keep track of non-textual features such as click counts. In this paper, we present a framework to use non-textual features to predict the quality of documents. We also show our quality measure can be successfully incorporated into the language modeling-based retrieval model. We test our approach on a collection of question and answer pairs gathered from a community based question answering service where people ask and answer questions. Experimental results using our quality measure show a significant improvement over our baseline.

* April 7th 10-11am (Room 151)*

 <<O>>  Difference Topic LabMeeting (r1.99 - 29 Mar 2006 - HemaRaghavan)

Line: 64 to 64

* March 31st 10-11am (Room 151)*

    • Talk 1: Hema Raghavan
Changed:
<
<
    • Abstract: TBA
>
>
    • Abstract: We empirically analyze the convergence speed of active learning on a variety of text categorization problems and relate it to measures of problem difficulty such as the feature set size required for the maximum achievable performance. The speed of convergence is a measure of how quickly an active learning algorithm converges to its best possible classification performance on a given problem. Quickness or speed is a function of the number of feedback iterations. A problem that needs many instances to converge (slow speed) is considered difficult. We explore 4 difficulty measures (2 each of instance and feature complexity respectively) which allow us to rank numerous categorization problems and experimental test-beds based on their difficulty for active learning. Our feature complexity measures capture many previous results in feature selection. We find that the speed of convergence is inversely related (r=-0.7) to feature complexity. This has useful implications for future research, especially in understanding a dual approach for active learning where the teacher is asked to provide feedback on features in addition to labeling instances. We find that the improvement in the speed of active learning brought about by such a dual feedback approach is negatively correlated with feature complexity (r=-0.65). Our experiments show that such a dual feedback approach can increase the speed of active learning by 57% on average on 358 binary text classification problems in 9 standard corpora that we consider, because most benchmark text categorization corpora contain problems of low to medium complexity.


    • Talk 2: Jiwoon Jeon
    • Abstract: TBA
 <<O>>  Difference Topic LabMeeting (r1.98 - 28 Mar 2006 - VanessaMurdock)

Line: 72 to 72

* April 7th 10-11am (Room 151)*

Changed:
<
<
    • Talk 1: Vanessa Murdock
>
>
    • Talk 1: Ramesh Nallapati

    • Abstract: TBA

    • Talk 2: Shaolei Feng
Line: 99 to 99

    • Talk 1: Xiaoyong Liu
    • Abstract:
Changed:
<
<
    • Talk 2:
    • Abstract:
>
>
    • Talk 2: Vanessa Murdock
    • Abstract: TBA

* May 5th 10-11am (Room 151)*

 <<O>>  Difference Topic LabMeeting (r1.97 - 27 Mar 2006 - XiaoyongLiu)

Line: 66 to 66

    • Talk 1: Hema Raghavan
    • Abstract: TBA
Changed:
<
<
    • Talk 2: Xiaoyong Liu
>
>
    • Talk 2: Jiwoon Jeon

    • Abstract: TBA
Line: 97 to 97

* April 28th 10-11am (Room 151)*

Changed:
<
<
    • Talk 1: Jiwoon Jeon
>
>
    • Talk 1: Xiaoyong Liu

    • Abstract:
    • Talk 2:
    • Abstract:
 <<O>>  Difference Topic LabMeeting (r1.96 - 07 Mar 2006 - MarkSmucker)

Line: 53 to 53

* March 10th 10-11am (Room 151)*

Changed:
<
<
    • Talk 1: Mark Smucker "Find-similar"
>
>
    • Talk 1: Mark Smucker

    • Abstract: TBA

    • Talk 2: Don Metzler
 <<O>>  Difference Topic LabMeeting (r1.95 - 06 Mar 2006 - JiwoonJeon)

Line: 51 to 51

trends. Joint work with Andrew McCallum.
Deleted:
<
<
    • Talk 2: Jiwoon Jeon
    • Title: "A Framework to Predict the Quality of Answers with Non-Textual Features"
    • Abstract: New types of document collections are being developed by various web services. The service providers keep track of non-textual features such as click counts. In this talk, we present a framework to use non-textual features to predict the quality of documents. We also show our quality measure can be successfully incorporated into the language modeling-based retrieval model. We test our approach on a collection of question and answer pairs gathered from a community based question answering service where people ask and answer questions. Experimental results using our quality measure show a significant improvement over our baseline.


* March 10th 10-11am (Room 151)*

    • Talk 1: Mark Smucker "Find-similar"
Line: 102 to 97

* April 28th 10-11am (Room 151)*

Changed:
<
<
    • Talk 1:
>
>
    • Talk 1: Jiwoon Jeon

    • Abstract:
    • Talk 2:
    • Abstract:
 <<O>>  Difference Topic LabMeeting (r1.94 - 28 Feb 2006 - RonBekkerman)

Line: 97 to 97

    • Talk 1: Yun Zhou
    • Abstract: TBA
Changed:
<
<
    • Talk 2:
    • Abstract:
>
>
    • Talk 2: Ron Bekkerman
    • Abstract: TBA

* April 28th 10-11am (Room 151)*

 <<O>>  Difference Topic LabMeeting (r1.93 - 27 Feb 2006 - KateMoruzzi)

Line: 34 to 34

* March 3rd 10-11am (Room 151)*

    • Talk 1: Xuerui Wang
Changed:
<
<
    • Abstract: TBA