Paper Reading: Scaling Question Answering to the Web

slides: here

Venue: WWW 2001

paper link: here

Mulder is the 1st general-purpose, fully-automated question answering(QA) system. Instead of indexing any structured corpora, Mulder uses a commercial search engine as its knowledge base. The system has 6 components:

  1. Question Parser: it constructs a tree of the question’s phrasal structure and determines the question’s syntactic structure. The results from the question parser are also used in other components(2,3). The paper uses MEI parser and PC-KIMMO parser to parse the questions.
  2. Question Classifier: it can narrow the number of candidate answers and is used in answer extraction(5) to filter answer candidates of unexpected types. A set of rules based on the wh-phrase recognition by question parser are used to classify the answer type to be either numerical, nominal or temporal. For example, if the MEI parser recognizes Wh-adjective phrases(e.g. how tall) in the question, then the answer type is classified as numerical.
  3. Query Formulation: this component translates the question into a set of keyword queries. The motivation behind is that the reformulated queries may include terms in the target sentences. For example, given the questio “When did Nixon visit China”, we may expect the term “visited” in th target sentence. This strategy is called verb conversion which replaces th auxiliary and main verb with the conjugated verb. The 2nd strategy quer expansion is to replace the adjective in the question with its attribute nou in WordNet. For example, given the question “How tall is Mt. Everest”, th term “tall” is replaced with “height”. The 3rd strategy noun phrase formation is to compose a query with noun phrases that do not contain othe noun phrases. By treating single noun phrase as a query, more pages ca be returned from a search engine which may increase the recall of the Q system. The 4th strategy transformation uses rules of Transformationa Grammar to translate the question into equivalent assertions. For example given the question “who shot JFK” and the rule of stripping the interrogative, we have the new question “shot JFK”.
  4. Search Engine: the original question and the ones generated from query formulation(3) will be sent to Google to get a list of Web pages.
  5. Answer Extraction: this component aims to find candidate answers from the Web pages obtained from Google. The 1st step is to score each page summary. The scoring function gives higher scores to the summaries with more important keywords which are close to each other. Then top ranked summaries will be parsed by MEI parser to obtain phrases which are considered as candidate answers. As mentioned earlier, those phrases in unexpected types(not matching with question classification) will be filtered.
  6. Answer Selection: this component first assigns a score to each candidate answer. It gives higher scores to candidate answers which have more important keywords nearby. Then the candidate answers will be clustered into groups by a simplified version of suffix tree clustering algorithm, where answers that share the same words will be clustered into the same group. Then clusters are ranked according to sum of scores of their member answers, and for each cluster, answers are ranked by their individual scores.

The authors experiment Mulder on TREC-8 dataset and compare it with Google and AskJeeves. The results show that Mulder has the highest recall when user effort is fixed. And when recall is fixed, Mulder has the least user effort. The ablation study shows that the answer extraction is the most important component.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s