-- ElifAktolga - 04 Oct 2007
Summary
This paper proposes a statistical, noisy-channel approach to computing the similarity between questions and candidate answers in question answering.
Background and Motivation
The paper describes a fundamentally different approach to QA, with fewer components and a clearer design than traditional QA systems. Previous research showed that neither ad-hoc IR nor the bag-of-words approach works well for QA on its own, so other methods of measuring the similarity between question and answer pairs are required.
Core Idea: Map questions and candidate sentences to a different space for computing similarity. The approach proposed in this paper uses a statistical noisy channel, similar to those in MT systems, and maps questions and answers into the space of parse trees.
Contribution
- Unlike traditional QA systems, the paper takes a statistical approach to QA built from two components:
- 1st component: an IR engine retrieves a number of sentences from documents, given a question
- 2nd component: given the question and the IR engine output, the Answer Identifier System recognises relevant substrings in the candidate sentences and ranks them by probability
- Idea: bridging the gap between questions and candidate answers in the space of parse trees
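The two components listed above can be pictured as a small pipeline. The sketch below is only an illustration and not the paper's system: word overlap stands in for the IR engine, every word of a retrieved sentence is treated as a candidate answer, and `toy_score` is a hypothetical placeholder where a trained probability model would go. All function names and the tiny corpus are made up for this example.

```python
from typing import Callable, List, Tuple


def retrieve(question: str, sentences: List[str], k: int = 2) -> List[str]:
    """Stand-in for the IR engine: keep the k sentences sharing most words with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        sentences,
        key=lambda s: len(q_words & set(s.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def identify_answers(
    question: str,
    sentences: List[str],
    score: Callable[[str, str, str], float],
) -> List[Tuple[float, str, str]]:
    """Stand-in for the answer identifier: score every word as a candidate and sort by score."""
    scored = [
        (score(question, sent, word), word, sent)
        for sent in sentences
        for word in sent.split()
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


def toy_score(question: str, sentence: str, candidate: str) -> float:
    # Hypothetical placeholder for a learned model: prefer numbers for "when" questions.
    return 1.0 if question.lower().startswith("when") and candidate.isdigit() else 0.0


corpus = [
    "Elvis Presley died in 1977",
    "Elvis was a singer",
    "The weather is nice",
]
question = "When did Elvis die"
ranked = identify_answers(question, retrieve(question, corpus), toy_score)
print(ranked[0][1])  # highest-ranked candidate answer
```

Passing the scoring function in as a parameter keeps the retrieval step and the ranking step independent, so the placeholder score could be replaced by a probability model without touching the rest.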
Methods
1. Noisy channel:
- Aim: generate the question from a candidate answer sentence
- use a tree with syntactic/semantic annotations
- reduce the length gap between the Q and A by making a cut in the answer parse tree and selecting the relevant parts; mark candidate elements
- choose the best candidate by computing P(Q|S), the probability of generating the question Q from the (cut) candidate answer sentence S
2. Training the Answer Identifier System:
- a probability model is trained by means of Q and A pairs in order to estimate P(Q|S)
- QA pairs for training are generated by processing sentences, identifying important terms, and reducing sentences by cuts
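To make the ranking by P(Q|S) more concrete, here is a minimal sketch under strong assumptions. It scores each candidate with an IBM Model 1 style word-translation model, which is a simplification chosen for brevity and not necessarily the model used in the paper. The translation table `T`, its probabilities, the identity fallback in `t`, and the answer labels `A_DATE` and `A_PERSON` are all hypothetical; in practice the table would be estimated from Q/A training pairs.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical translation table t(question_word | sentence_token); values are made up.
T: Dict[Tuple[str, str], float] = {
    ("when", "A_DATE"): 0.6,
    ("who", "A_PERSON"): 0.6,
    ("die", "died"): 0.5,
}


def t(q: str, s: str, table: Dict[Tuple[str, str], float], floor: float = 1e-6) -> float:
    """Toy lookup: table entry if present, a fixed weight for identical words, else a small floor."""
    if (q, s) in table:
        return table[(q, s)]
    return 0.5 if q == s else floor


def log_p_q_given_s(question: List[str], cut: List[str],
                    table: Dict[Tuple[str, str], float]) -> float:
    """Model 1 style log P(Q|S): each question word is generated by some token of the cut sentence."""
    source = ["NULL"] + cut
    total = 0.0
    for q in question:
        total += math.log(sum(t(q, s, table) for s in source) / len(source))
    return total


def best_candidate(question: List[str], candidates: List[List[str]],
                   table: Dict[Tuple[str, str], float]) -> List[str]:
    """Pick the cut answer sentence that most probably generates the question."""
    return max(candidates, key=lambda cut: log_p_q_given_s(question, cut, table))


question = ["when", "did", "elvis", "die"]
candidates = [
    ["elvis", "died", "A_DATE"],          # cut with the date marked as the answer
    ["A_PERSON", "died", "in", "1977"],   # cut with the person marked as the answer
]
print(best_candidate(question, candidates, T))
```

Each candidate here is an answer sentence already reduced by a cut, with one element replaced by its answer label, so the score rewards cuts whose marked element matches the question word.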
Other Approaches
- Statistical-based Reasoning: LCC's QA system has a theorem prover that proves QA pairs by means of their logical structure and WordNet (2002)
--> learning relations between QA pairs
- question reformulation (Hermjakob et al. 2002)
- use of semi-structured databases (Lin 2002)
Reference