Semantic Role Labeling: English PropBank
LING 5200 Computational Corpus Linguistics
Martha Palmer

Ask Jeeves – A Q/A, IR ex.
What do you call a successful movie?
Blockbuster
- Tips on Being a Successful Movie Vampire ... I shall call the police.
- Successful Casting Call & Shoot for "Clash of Empires" ... thank everyone for their participation in the making of yesterday's movie.
- Demme's casting is also highly entertaining, although I wouldn't go so far as to call it successful. This movie's resemblance to its predecessor is pretty vague...
- VHS Movies: Successful Cold Call Selling: Over 100 New Ideas, Scripts, and Examples from the Nation's Foremost Sales Trainer.
LING 5200, 2006

Ask Jeeves – filtering w/ POS tag
What do you call a successful movie?
- Tips on Being a Successful Movie Vampire ... I shall call the police.
- Successful Casting Call & Shoot for "Clash of Empires" ... thank everyone for their participation in the making of yesterday's movie.
- Demme's casting is also highly entertaining, although I wouldn't go so far as to call it successful. This movie's resemblance to its predecessor is pretty vague...
- VHS Movies: Successful Cold Call Selling: Over 100 New Ideas, Scripts, and Examples from the Nation's Foremost Sales Trainer.

Filtering out “call the police”
Different senses,
- different syntax,
- different kinds of participants,
- different types of propositions.
call(you,movie,what) ≠ call(you,police)
[Diagram: call with arguments you, movie, what; call with arguments you, police]

WordNet – Princeton (Miller 1985, Fellbaum 1998)
On-line lexical reference (dictionary)
- Nouns, verbs, adjectives, and adverbs grouped into synonym sets
- Other relations include hypernyms (ISA), antonyms, meronyms
Typical top nodes - 5 out of 25
- (act, action, activity)
- (animal, fauna)
- (artifact)
- (attribute, property)
- (body, corpus)

Cornerstone: English lexical resource
- That provides sets of possible syntactic frames for verbs.
- And provides clear, replicable sense distinctions.
AskJeeves: Who do you call for a good electronic lexical database for English?

WordNet – Princeton (Miller 1985, Fellbaum 1998)
Limitations as a computational lexicon
- Contains little syntactic information
- Comlex has syntax but no sense distinctions
- No explicit lists of participants
- Sense distinctions very fine-grained,
- Definitions often vague
Causes problems with creating training data for supervised Machine Learning – SENSEVAL2
- Verbs > 16 senses (including call)
- Inter-annotator Agreement ITA 71%,
- Automatic Word Sense Disambiguation, WSD 63%
Dang & Palmer, SIGLEX02

WordNet – call, 28 senses
1. name, call -- (assign a specified, proper name to; "They named their son David"; …) -> LABEL
2. call, telephone, call up, phone, ring -- (get or try to get into communication (with someone) by telephone; "I tried to call you all night"; …) -> TELECOMMUNICATE
3. call -- (ascribe a quality to or give a name of a common noun that reflects a quality; "He called me a bastard"; …) -> LABEL
4. call, send for -- (order, request, or command to come; "She was called into the director's office"; "Call the police!") -> ORDER

WordNet: - call, 28 senses
[Diagram: the 28 senses laid out ungrouped: WN2, WN13, WN28; WN3; WN19; WN15; WN26; WN4; WN7; WN8; WN9; WN1; WN22; WN20; WN25; WN18; WN27; WN5; WN16; WN6; WN23; WN12; WN17, WN11; WN10, WN14, WN21, WN24]

WordNet: - call, 28 senses, Senseval2 groups, ITA 82%, WSD 70%
[Diagram: senses clustered into labelled groups, in slide order: WN5, WN16, WN12 Loud cry; WN3, WN19, WN1, WN22 Label; WN15, WN26 Bird or animal cry; WN4, WN7, WN8, WN9 Request; WN20, WN18, WN27 Challenge; WN2, WN13, WN28 Phone/radio; WN6, WN23 Visit; Bid; WN17, WN11; WN25 Call a loan/bond; WN10, WN14, WN21, WN24]

Filtering out “call the police”
Different senses,
- different syntax,
- different kinds of participants,
- different types of propositions.
call(you,movie,what) ≠ call(you,police)
[Diagram: call with arguments you, movie, what; call with arguments you, police]

Proposition Bank: From Sentences to Propositions (Predicates!)
- Powell met Zhu Rongji
- Powell and Zhu Rongji met
- Powell met with Zhu Rongji
- Powell and Zhu Rongji had a meeting
Proposition: meet(Powell, Zhu Rongji)
Also: battle, wrestle, join, debate, consult
meet(Somebody1, Somebody2)
...
When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu)
discuss([Powell, Zhu], return(X, plane))

Semantic role labels:
Marie broke the LCD projector.
- break (agent(Marie), patient(LCD-projector)) (Fillmore, 68)
- cause(agent(Marie), change-of-state(LCD-projector)) (Jackendoff, 72)
  (broken(LCD-projector))
- agent(A) -> intentional(A), sentient(A), causer(A), affector(A)
  patient(P) -> affected(P), change(P),… (Dowty, 91)

Capturing semantic roles*
- Richard broke [ARG1 the laser pointer]. (SUBJ: Richard)
- [ARG1 The windows] were broken by the hurricane. (SUBJ: The windows)
- [ARG1 The vase] broke into pieces when it toppled over. (SUBJ: The vase)
*See also FrameNet, http://www.icsi.berkeley.edu/~framenet/

Frame File example: give
Roles:
  Arg0: giver
  Arg1: thing given
  Arg2: entity given to
Example: double object
The executives gave the chefs a standing ovation.
  Arg0: The executives
  REL: gave
  Arg2: the chefs
  Arg1: a standing ovation

Annotation procedure
- PTB II - Extraction of all sentences with given verb
- Create Frame File for that verb (Paul Kingsbury)
  (3100+ lemmas, 4400 framesets, 120K predicates)
  Over 300 created automatically via VerbNet
- First pass: Automatic tagging (Joseph Rosenzweig)
  http://www.cis.upenn.edu/~josephr/TIDES/index.html#lexicon
- Second pass: Double blind hand correction
  84% ITA, 91% Kappa
  Paul Kingsbury
  Betsy Klipple, Olga Babko-Malaya
  Tagging tool highlights discrepancies (Scott Cotton)
- Third pass: Solomonization (adjudication)

NomBank Frame File example: gift
(nominalizations, noun predicates, partitives, etc.)
Roles:
  Arg0: giver
  Arg1: thing given
  Arg2: entity given to
Example: double object
Nancy’s gift from her cousin was a complete surprise.
  Arg0: her cousin
  REL: gave
  Arg2: Nancy
  Arg1: gift

Trends in Argument Numbering
- Arg0 = proto-typical agent (Dowty)
- Arg1 = proto-typical patient
- Arg2 = indirect object / benefactive / instrument / attribute / end state
- Arg3 = start point / benefactive / instrument / attribute
- Arg4 = end point

Additional tags - (arguments or adjuncts?)
Variety of ArgM’s (Arg#>4):
- TMP - when?
- LOC - where at?
- DIR - where to?
- MNR - how?
- PRP - why?
- REC - himself, themselves, each other
- PRD - this argument refers to or modifies another
- ADV - others

Inflection, etc.
Verbs also marked for tense/aspect
- Passive/Active
- Perfect/Progressive
- Third singular (is has does was)
- Present/Past/Future
- Infinitives/Participles/Gerunds/Finites
Modals and negations marked as ArgMs for convenience

Word Senses in PropBank
Orders to ignore word sense not feasible for 700+ verbs
- Mary left the room
- Mary left her daughter-in-law her pearls in her will
Frameset leave.01 "move away from":
  Arg0: entity leaving
  Arg1: place left
Frameset leave.02 "give":
  Arg0: giver
  Arg1: thing given
  Arg2: beneficiary
How do these relate to traditional word senses in WordNet?
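
The two framesets for leave can be sketched as a small Python data structure. This is a minimal illustration under stated assumptions, not the actual PropBank frame file format: the names `Frameset`, `LEAVE_FRAMESETS` and `describe` are hypothetical, and the argument spans are supplied by hand rather than by a tagger.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frameset:
    """One coarse-grained sense of a verb with its numbered roles."""
    frameset_id: str   # e.g. "leave.01"
    gloss: str         # short description of the sense
    roles: dict        # maps "Arg0", "Arg1", ... to a role description


# Role inventories copied from the slide; the container is hypothetical.
LEAVE_FRAMESETS = {
    "leave.01": Frameset("leave.01", "move away from",
                         {"Arg0": "entity leaving", "Arg1": "place left"}),
    "leave.02": Frameset("leave.02", "give",
                         {"Arg0": "giver", "Arg1": "thing given",
                          "Arg2": "beneficiary"}),
}


def describe(frameset_id, labeled_args):
    """Pair each labeled argument with its role description.

    labeled_args maps argument labels to text spans, as an annotator
    would supply them. A label the frameset does not define raises KeyError.
    """
    frameset = LEAVE_FRAMESETS[frameset_id]
    return [(label, frameset.roles[label], span)
            for label, span in labeled_args.items()]


# "Mary left the room" is an instance of leave.01.
room = describe("leave.01", {"Arg0": "Mary", "Arg1": "the room"})

# "Mary left her daughter-in-law her pearls in her will" is leave.02.
will = describe("leave.02", {"Arg0": "Mary",
                             "Arg1": "her pearls",
                             "Arg2": "her daughter-in-law"})

for label, role, span in room + will:
    print(f"{label} ({role}): {span}")
```

The same Arg1 label means "place left" under leave.01 and "thing given" under leave.02, which is why the frameset has to be chosen before the numbered arguments can be interpreted.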

WordNet: - call, 28 senses, groups
[Diagram: the same sense-group diagram for call as on the earlier Senseval2 groups slide, with the labels Loud cry, Label, Bird or animal cry, Request, Challenge, Phone/radio, Visit, Bid, Call a loan/bond]

Overlap with PropBank Framesets
[Diagram: the same sense-group diagram for call, overlaid with the PropBank framesets]

Overlap between Senseval2 Groups and Framesets – 95%
[Diagram for develop: Frameset1 and Frameset2 drawn over WordNet senses WN1, WN2, WN3, WN4, WN6, WN7, WN8, WN11, WN12, WN13, WN19, WN5, WN9, WN10, WN14, WN20]

Sense Hierarchy (Palmer, et al, SNLU04 - NAACL04)
- PropBank Framesets – ITA >90%
  coarse grained distinctions
  20 Senseval2 verbs w/ > 1 Frameset
  Maxent WSD system, 73.5% baseline, 90% accuracy
- Sense Groups (Senseval-2) - ITA 82% (up to 90% ITA)
  Intermediate level – 71% -> 74%
- WordNet – ITA 71%
  fine grained distinctions, 60.2% -> 66%

Limitations to PropBank
- Args2-4 seriously overloaded, poor performance
  VerbNet and FrameNet both provide more fine-grained role labels
- WSJ too domain specific, too financial, need broader coverage genres for more general annotation
  Additional Brown corpus annotation, also GALE data
  FrameNet has selected instances from BNC

Improving generalization
- More data?
- General purpose class-based lexicons for unseen words and new usages?
- VerbNet, but limitations of VerbNet
- Semantic classes for backoff?
- Can we merge FrameNet and PropBank data?
- What about new words and new usages of old words?
- WordNet hypernyms; WSD example
- lexical sets (Patrick Hanks)
- verb dependencies - DIRT, (Dekang Lin), very noisy
We’re still a long way from events, inference, etc.

FrameNet: Telling.inform
  Time: In 2002,
  Speaker: the U.S. State Department
  Target: INFORMED
  Addressee: North Korea
  Message: that the U.S. was aware of this program, and regards it as a violation of Pyongyang's nonproliferation commitments

FrameNet/PropBank: Telling.inform
  Time – ArgM-TMP: In 2002,
  Speaker – Arg0 (Informer): the U.S. State Department
  Target – REL: INFORMED
  Addressee – Arg1 (informed): North Korea
  Message – Arg2 (information): that the U.S. was aware of this program, and regards it as a violation of Pyongyang's nonproliferation commitments

Frames File: give w/ VerbNet
PropBank instances mapped to VerbNet
Roles:
  Arg0: giver
  Arg1: thing given
  Arg2: entity given to
Example: double object
The executives gave the chefs a standing ovation.
  Arg0: Agent: The executives
  REL: gave
  Arg2: Recipient: the chefs
  Arg1: Theme: a standing ovation

OntoNotes Additions
The founder of Pakistan’s nuclear department Abdul Qadeer Khan has admitted he transferred nuclear technology to Iran, Libya, and North Korea
[Parse tree diagram (S, NP, VP, PP, SBAR nodes) annotated with predicate frames and their Arg0/Arg1/Arg2 slots for Department, Founder, Admit, Technology, and Transfer]
OntoBank adds
• Co-reference
• Word Sense Resolution into Predicates
• Entity types and predicate frames connected to nodes in ontology
[Ontology node labels: Founder, Nation, Agency, Person, Acknowledge, Transfer, Know-how, Nation, Nation, Nation]
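
The role alignments from the Telling.inform and give slides can be sketched as simple lookup tables. This is a minimal illustration that covers only those two slide examples; the table names and the `relabel` helper are hypothetical and do not correspond to any released mapping resource.

```python
# Alignments copied from the two slide examples. The names below are
# hypothetical and exist only for this illustration.
PROPBANK_TO_FRAMENET_INFORM = {
    "ArgM-TMP": "Time",
    "Arg0": "Speaker",
    "REL": "Target",
    "Arg1": "Addressee",
    "Arg2": "Message",
}

PROPBANK_TO_VERBNET_GIVE = {
    "Arg0": "Agent",
    "Arg1": "Theme",
    "Arg2": "Recipient",
}


def relabel(instance, mapping):
    """Translate (PropBank label, span) pairs into another role inventory.

    Labels that the mapping does not cover are kept unchanged.
    """
    return [(mapping.get(label, label), span) for label, span in instance]


# PropBank-style annotation of the give example, in sentence order.
give_instance = [
    ("Arg0", "The executives"),
    ("REL", "gave"),
    ("Arg2", "the chefs"),
    ("Arg1", "a standing ovation"),
]

# PropBank-style annotation of part of the inform example.
inform_instance = [
    ("ArgM-TMP", "In 2002,"),
    ("Arg0", "the U.S. State Department"),
    ("REL", "INFORMED"),
    ("Arg1", "North Korea"),
]

for role, span in relabel(give_instance, PROPBANK_TO_VERBNET_GIVE):
    print(f"{role}: {span}")
for role, span in relabel(inform_instance, PROPBANK_TO_FRAMENET_INFORM):
    print(f"{role}: {span}")
```

Each table is specific to one predicate. Arg2 is Recipient for give but Message for inform, so a single verb-independent table would not work, which is the overloading of Args2-4 noted on the limitations slide.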