AI Research on Question Answering - Dictionary of Arguments
Norvig I 872
Question Answering/AI Research/Norvig/Russell: Information retrieval is the task of finding documents that are relevant to a query, where the query may be a question, or just a topic area or concept. Question answering is a somewhat different task, in which the query really is a question, and the answer is not a ranked list of documents but rather a short response—a sentence, or even just a phrase. Cf. >Information retrieval.
There have been question-answering NLP (natural language processing) systems since the 1960s, but only since 2001 have such systems used Web information retrieval to radically increase their breadth of coverage. The ASKMSR system (Banko et al., 2002)(1) is a typical Web-based question-answering system. It is based on the intuition that most questions will be answered many times on the Web, so question answering should be thought of as a problem in precision, not recall. We don’t have to deal with all the different ways that an answer might be phrased—we only have to find one of them.
E.g., [Who killed Abraham Lincoln?] – Web entry: “John Wilkes Booth altered history with a bullet. He will forever be known as the man who ended Abraham Lincoln’s life.”
Problem: To use this passage to answer the question, the system would have to know that ending a life can be a killing, that “He” refers to Booth, and several other linguistic and semantic facts.
Norvig I 873
ASKMSR does not attempt this kind of sophistication—it knows nothing about pronoun reference, or about killing, or any other verb. It does know 15 different kinds of questions, and how they can be rewritten as queries to a search engine. It knows that [Who killed Abraham Lincoln] can be rewritten as the query [* killed Abraham Lincoln] and as [Abraham Lincoln was killed by *]. It issues these rewritten queries and examines the results that come back: not the full Web pages, just the short summaries of text that appear near the query terms. The results are broken into 1-, 2-, and 3-grams (>Language models/Norvig) and tallied for frequency in the result sets and for weight: an n-gram that came back from a very specific query rewrite (such as the exact phrase match query [“Abraham Lincoln was killed by *”]) would get more weight than one from a general query rewrite, such as [Abraham OR Lincoln OR killed]. ASKMSR relies upon the breadth of the content on the Web rather than on its own depth of understanding. >Information Extraction, >Information retrieval.
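A minimal sketch of this pipeline in Python may help. The rewrite patterns, the weights, and the `search` callback below are illustrative assumptions, not ASKMSR's actual rewrite rules; a real system would send the rewritten queries to a Web search engine and read the result snippets:

```python
from collections import Counter
import re

def rewrites(question):
    """Rewrite a 'Who <verb> <object>?' question into weighted search
    queries; exact-phrase rewrites get more weight than the fallback.
    (Hypothetical patterns and weights, for illustration only.)"""
    m = re.match(r"Who (\w+) (.+?)\??$", question)
    if not m:
        return []
    verb, obj = m.groups()
    return [
        (f'"* {verb} {obj}"', 5),         # specific phrase rewrite
        (f'"{obj} was {verb} by *"', 5),  # passive-voice phrase rewrite
        (f'{verb} OR {obj}', 1),          # general bag-of-words fallback
    ]

def ngrams(text, n):
    words = re.findall(r"[A-Za-z]+", text)
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def answer(question, search):
    """Tally 1-, 2-, and 3-grams from the snippets each rewritten query
    retrieves, weighting every occurrence by that query's specificity,
    and return the best-scoring n-gram that reuses no question word."""
    tally = Counter()
    for query, weight in rewrites(question):
        for snippet in search(query):
            for n in (1, 2, 3):
                for gram in ngrams(snippet, n):
                    tally[gram] += weight
    qwords = set(re.findall(r"[a-z]+", question.lower()))
    candidates = [g for g in tally
                  if not set(g.lower().split()) & qwords]
    if not candidates:
        return None
    # Prefer higher tallies, then longer n-grams, so that a full name
    # ("John Wilkes Booth") beats its equally frequent fragments ("John").
    return max(candidates, key=lambda g: (tally[g], len(g.split())))
```

With canned snippets standing in for Web search results, the question from the example above yields the expected answer, because the full name recurs across differently phrased snippets while no single wrong n-gram does.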
Norvig I 885
History: Banko et al. (2002)(1) present the ASKMSR question-answering system; a similar system is due to Kwok et al. (2001)(2). Pasca and Harabagiu (2001)(3) discuss a contest-winning question-answering system. Two early influential approaches to automated knowledge engineering were by Riloff (1993)(4), who showed that an automatically constructed dictionary performed almost as well as a carefully handcrafted domain-specific dictionary, and by Yarowsky (1995)(5), who showed that the task of word sense classification (…) could be accomplished through unsupervised training on a corpus of unlabeled text with accuracy as good as supervised methods.
The idea of simultaneously extracting templates and examples from a handful of labeled examples was developed independently and simultaneously by Blum and Mitchell (1998)(6), who called it cotraining, and by Brin (1998)(7), who called it DIPRE (Dual Iterative Pattern Relation Extraction). You can see why the term cotraining has stuck. Similar early work, under the name of bootstrapping, was done by Jones et al. (1999)(8). The method was advanced by the QXTRACT (Agichtein and Gravano, 2003)(9) and KNOWITALL (Etzioni et al., 2005)(10) systems. Machine reading was introduced by Mitchell (2005)(11) and Etzioni et al. (2006)(12) and is the focus of the TEXTRUNNER project (Banko et al., 2007(13); Banko and Etzioni, 2008(14)). (Cf. >Information extraction).
(…) it is also possible to do information extraction based on the physical structure or layout of text rather than on the linguistic structure. HTML lists and tables in both HTML and relational databases are home to data that can be extracted and consolidated (Hurst, 2000(15); Pinto et al., 2003(16); Cafarella et al., 2008(17)). The Association for Computational Linguistics (ACL) holds regular conferences and publishes the journal Computational Linguistics. There is also an International Conference on Computational Linguistics (COLING). The textbook by Manning and Schütze (1999)(18) covers statistical language processing, while Jurafsky and Martin (2008)(19) give a comprehensive introduction to speech and natural language processing.
1. Banko, M., Brill, E., Dumais, S. T., and Lin, J. (2002). AskMSR: Question answering using the worldwide web. In Proc. AAAI Spring Symposium on Mining Answers from Texts and Knowledge Bases, pp. 7–9.
2. Kwok, C., Etzioni, O., and Weld, D. S. (2001). Scaling question answering to the web. In Proc. 10th International Conference on the World Wide Web.
3. Pasca, M. and Harabagiu, S. M. (2001). High performance question/answering. In SIGIR-01, pp. 366–
4. Riloff, E. (1993). Automatically constructing a dictionary for information extraction tasks. In AAAI-93,
5. Yarowsky, D. (1995). Unsupervised word sense disambiguation rivaling supervised methods. In ACL-95, pp. 189–196.
6. Blum, A. L. and Mitchell, T. M. (1998). Combining labeled and unlabeled data with co-training. In COLT-98, pp. 92–100.
7. Brin, S. (1998). Extracting patterns and relations from the world wide web. In WebDB Workshop at EDBT-98.
8. Jones, R., McCallum, A., Nigam, K., and Riloff, E. (1999). Bootstrapping for text learning tasks. In Proc. IJCAI-99 Workshop on Text Mining: Foundations, Techniques, and Applications, pp. 52–63.
9. Agichtein, E. and Gravano, L. (2003). Querying text databases for efficient information extraction. In Proc. IEEE Conference on Data Engineering.
10. Etzioni, O., Cafarella, M. J., Downey, D., Popescu, A.-M., Shaked, T., Soderland, S., Weld, D. S., and Yates, A. (2005). Unsupervised named-entity extraction from the web: An experimental study. AIJ,
11. Mitchell, T. M. (2005). Reading the web: A breakthrough goal for AI. AIMag, 26(3), 12–16.
12. Etzioni, O., Banko, M., and Cafarella, M. J. (2006). Machine reading. In AAAI-06.
13. Banko, M., Cafarella, M. J., Soderland, S., Broadhead, M., and Etzioni, O. (2007). Open information extraction from the web. In IJCAI-07.
14. Banko, M. and Etzioni, O. (2008). The tradeoffs between open and traditional relation extraction. In ACL-08, pp. 28–36.
15. Hurst, M. (2000). The Interpretation of Text in Tables. Ph.D. thesis, Edinburgh.
16. Pinto, D., McCallum, A., Wei, X., and Croft, W. B. (2003). Table extraction using conditional random fields. In SIGIR-03.
17. Cafarella, M. J., Halevy, A., Zhang, Y., Wang, D. Z., and Wu, E. (2008). Webtables: Exploring the power of tables on the web. In VLDB-2008.
18. Manning, C. and Schütze, H. (1999). Foundations of Statistical Natural Language Processing. MIT Press.
19. Jurafsky, D. and Martin, J. H. (2008). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd edition). Prentice-Hall.
Explanation of symbols: Roman numerals indicate the source; Arabic numerals indicate the page number.
Stuart J. Russell & Peter Norvig: Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ 2010.