Question-Answering via the Web: the AskMSR System
Note: these viewgraphs were originally developed by Professor Nick Kushmerick, University College Dublin, Ireland. These copies are intended only for use for review in ICS 278.

Question-Answering
• Users want answers, not documents
[Diagram labels: Databases; Information Retrieval; Information Extraction; Question Answering; Intelligent Personal Electronic Librarian]
• Active research over the past few years, coordinated by US government “TREC” competitions
• Recent intense interest from security services (“What is Bin Laden’s bank account number?”)

Question-Answering on the Web
• Web = a potentially enormous “data set” for data mining
  – e.g., >8 billion Web pages indexed by Google
• Example: AskMSR Web question answering system
  – “answer mining”
• Users pose relatively simple questions
  – E.g., “who killed Abraham Lincoln?”
• Simple parsing used to reformulate as a “template answer”
• Search engine results used to find answers (redundancy helps)
• System is surprisingly accurate (on simple questions)
• Key contributor to system success is massive data (rather than better algorithms)
  – References:
    • Dumais et al, 2002: Web question answering: is more always better? In Proceedings of SIGIR'02

AskMSR
Lecture 5
• Web Question Answering: Is More Always Better?
  – Dumais, Banko, Brill, Lin, Ng (Microsoft, MIT, Berkeley)
• Q: “Where is the Louvre located?”
• Want “Paris” or “France” or “75058 Paris Cedex 01” or a map
• Don’t just want URLs
Adapted from: COMP-4016 ~ Computer Science Department ~ University College Dublin ~ www.cs.ucd.ie/staff/nick ~ © Nicholas Kushmerick 2002

“Traditional” approach (Straw man?)
• Traditional deep natural-language processing approach
  – Full parse of documents and question
  – Rich knowledge of vocabulary, cause/effect, and common sense enables sophisticated semantic analysis
• E.g., in principle this answers the “who killed Lincoln?” question:
  • The non-Canadian, non-Mexican president of a North American country whose initials are AL and who was killed by John Wilkes Booth died ten revolutions of the earth around the sun after 1855.

AskMSR: Shallow approach
• Just ignore those documents, and look for ones like this instead:

AskMSR: Details
[Figure: system diagram with steps numbered 1 to 5]

Step 1: Rewrite queries
• Intuition: The user’s question is often syntactically quite close to sentences that contain the answer
  – Where is the Louvre Museum located?
  – The Louvre Museum is located in Paris
  – Who created the character of Scrooge?
  – Charles Dickens created the character of Scrooge.

Query rewriting
• Classify question into seven categories
  – Who is/was/are/were…?
  – When is/did/will/are/were …?
  – Where is/are/were …?
a. Category-specific transformation rules
   eg “For Where questions, move ‘is’ to all possible locations”
   “Where is the Louvre Museum located”
   → “is the Louvre Museum located”
   → “the is Louvre Museum located”
   → “the Louvre is Museum located”
   → “the Louvre Museum is located”
   → “the Louvre Museum located is”
   Nonsense, but who cares? It’s only a few more queries to Google.
   (Paper does not give full details!)
b. Expected answer “Datatype” (eg, Date, Person, Location, …)
   When was the French Revolution? → DATE
• Hand-crafted classification/rewrite/datatype rules (Could they be automatically learned?)

Query Rewriting - weights
• One wrinkle: Some query rewrites are more reliable than others
Where is the Louvre Museum located?
  – +“the Louvre Museum is located”: weight 5 (if we get a match, it’s probably right)
  – +Louvre +Museum +located: weight 1 (lots of non-answers could come back too)

Step 2: Query search engine
• Send all rewrites to a Web-wide search engine
• Retrieve top N answers (100?)
• For speed, rely just on search engine’s “snippets”, not the full text of the actual document

Step 3: Mining N-Grams
• Unigram, bigram, trigram, … N-gram: list of N adjacent terms in a sequence
• Eg, “Web Question Answering: Is More Always Better”
  – Unigrams: Web, Question, Answering, Is, More, Always, Better
  – Bigrams: Web Question, Question Answering, Answering Is, Is More, More Always, Always Better
  – Trigrams: Web Question Answering, Question Answering Is, Answering Is More, Is More Always, More Always Better

Mining N-Grams
• Simple: Enumerate all N-grams (N=1,2,3 say) in all retrieved snippets
• Use hash table and other fancy footwork to make this efficient
• Weight of an n-gram: occurrence count, with each occurrence weighted by the “reliability” (weight) of the rewrite that fetched the document
• Example: “Who created the character of Scrooge?”
  – Dickens - 117
  – Christmas Carol - 78
  – Charles Dickens - 75
  – Disney - 72
  – Carl Banks - 54
  – A Christmas - 41
  – Christmas Carol - 45
  – Uncle - 31

Step 4: Filtering N-Grams
• Each question type is associated with one or more “data-type filters” = regular expression
  – When… Date
  – Where… Location
  – What … Person
  – Who …
• Boost score of n-grams that do match regexp
• Lower score of n-grams that don’t match regexp
• Details omitted from paper….
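
The “move ‘is’” rewrite from Step 1 and the weighted n-gram counting from Step 3 can be sketched in Python as follows. This is a minimal sketch, not the AskMSR implementation, since the slides note that the paper omits details. The function names `move_verb_rewrites`, `ngrams` and `mine_ngrams`, the regex tokenizer, the `(snippet, weight)` input format, and the sample snippets are all assumptions made for illustration.

```python
import re
from collections import Counter


def move_verb_rewrites(question):
    """Drop the wh-word and place the verb at every possible position.

    Assumes (hypothetically) that the question has the shape
    "<wh-word> <verb> <rest...>?", e.g. "Where is the Louvre Museum located?".
    """
    words = question.rstrip("?").split()
    verb, rest = words[1], words[2:]
    return [" ".join(rest[:i] + [verb] + rest[i:]) for i in range(len(rest) + 1)]


def ngrams(text, n):
    """Return every run of n adjacent terms in text (simple word tokenizer)."""
    terms = re.findall(r"\w+", text)
    return [" ".join(terms[i:i + n]) for i in range(len(terms) - n + 1)]


def mine_ngrams(snippets, max_n=3):
    """Score all 1..max_n-grams found in the snippets.

    snippets: list of (snippet_text, rewrite_weight) pairs (assumed format).
    Each occurrence of an n-gram adds the weight of the rewrite that
    fetched the snippet it occurs in. A Counter serves as the hash table.
    """
    scores = Counter()
    for text, weight in snippets:
        for n in range(1, max_n + 1):
            for gram in ngrams(text, n):
                scores[gram] += weight
    return scores


rewrites = move_verb_rewrites("Where is the Louvre Museum located?")
# Made-up snippets: one fetched by a weight-5 rewrite, one by a weight-1 rewrite.
scores = mine_ngrams([("Charles Dickens created Scrooge", 5),
                      ("Scrooge by Dickens", 1)])
```

Here `rewrites` holds the five strings listed on the “Query rewriting” slide, and `scores["Dickens"]` is 6 (5 from the first snippet plus 1 from the second), while `scores["Charles Dickens"]` is 5.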

Step 5: Tiling the Answers
• N-grams and scores: “Charles Dickens” (20), “Dickens” (15), “Mr Charles” (10)
• Tile highest-scoring n-gram
• Merged, discard old n-grams: “Mr Charles Dickens” (score 45)
• Repeat, until no more overlap

Experiments
• Used the TREC-9 standard query data set
• Standard performance metric: MRR (mean reciprocal rank)
  – Systems give “top 5 answers”
  – Score = 1/R, where R is rank of first right answer
  – 1: 1; 2: 0.5; 3: 0.33; 4: 0.25; 5: 0.2; 6+: 0

Results [summary]
• Standard TREC contest test-bed: ~1M documents; 900 questions
• E.g., “who is president of Bolivia”
• E.g., “what is the exchange rate between England and the US”
• Technique doesn’t do too well (though would have placed in top 9 of ~30 participants!)
  – MRR = 0.262 (ie, right answer ranked about #4-#5)
  – Why? Because it relies on the enormity of the Web!
• Using the Web as a whole, not just TREC’s 1M documents… MRR = 0.42 (ie, on average, right answer is ranked about #2-#3)

Example
• Question: what is the longest word in the English language?
  – Answer = pneumonoultramicroscopicsilicovolcanokoniosis (!)
  – Answers returned by AskMSR:
    • 1: “1909 letters long”
    • 2: the correct answer above
    • 3: “screeched” (longest 1-syllable word in English)

Open Issues
• In many scenarios (eg, monitoring Bin Laden’s email) we only have a small set of documents!
• Works best/only for “Trivial Pursuit”-style fact-based questions
• Limited/brittle repertoire of
  – question categories
  – answer data types/filters
  – query rewriting rules
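
The answer tiling of Step 5 and the MRR score from the Experiments slide can be sketched in Python as follows. This is a simplified greedy sketch under stated assumptions, not the paper's algorithm: the names `merge`, `tile_answers`, `reciprocal_rank` and `mean_reciprocal_rank` are hypothetical, scores of merged n-grams are simply summed (as the 20 + 15 + 10 = 45 example suggests), and tiling stops as soon as the current top-scoring n-gram overlaps nothing else.

```python
def merge(a, b):
    """Tile word list b onto word list a; return None if they do not overlap."""
    # Case 1: b already appears inside a.
    for i in range(len(a) - len(b) + 1):
        if a[i:i + len(b)] == b:
            return a
    # Case 2: the end of a matches the start of b (longest overlap first).
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return a + b[k:]
    return None


def tile_answers(scores):
    """Greedily tile the top-scoring n-gram with overlapping ones, summing scores."""
    cands = [(gram.split(), score) for gram, score in scores.items()]
    merged = True
    while merged and len(cands) > 1:
        merged = False
        cands.sort(key=lambda c: -c[1])
        best_words, best_score = cands[0]
        for j in range(1, len(cands)):
            words, score = cands[j]
            tiled = merge(best_words, words) or merge(words, best_words)
            if tiled:
                # Merged: discard both old n-grams, keep the tiled one.
                rest = [c for i, c in enumerate(cands) if i not in (0, j)]
                cands = rest + [(tiled, best_score + score)]
                merged = True
                break
    cands.sort(key=lambda c: -c[1])
    return [(" ".join(words), score) for words, score in cands]


def reciprocal_rank(rank, cutoff=5):
    """Score = 1/R for the first right answer at rank R; 0 beyond the top 5."""
    if rank is None or rank > cutoff:
        return 0.0
    return 1.0 / rank


def mean_reciprocal_rank(ranks):
    """Average the per-question scores; None means no right answer was returned."""
    return sum(reciprocal_rank(r) for r in ranks) / len(ranks)


tiled = tile_answers({"Charles Dickens": 20, "Dickens": 15, "Mr Charles": 10})
mrr = mean_reciprocal_rank([1, 2, None, 6])
```

With the n-grams from the tiling slide, `tiled` is `[("Mr Charles Dickens", 45)]`. For the four made-up ranks, `mrr` is (1 + 0.5 + 0 + 0) / 4 = 0.375.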
