Automatic Sense Prediction for Implicit Discourse Relations in Text
Emily Pitler, Annie Louis, Ani Nenkova
University of Pennsylvania
ACL 2009

Implicit discourse relations
• Explicit comparison
  – I am in Singapore, but I live in the United States.
• Implicit comparison
  – The main conference is over Wednesday. I am staying for EMNLP.
• Explicit contingency
  – I am here because I have a presentation to give at ACL.
• Implicit contingency
  – I am a little tired; there is a 13-hour time difference.

Related work
• Soricut and Marcu (2003)
  – Sentence-level relations only.
• Wellner et al. (2006)
  – Used GraphBank annotations, which do not differentiate between implicit and explicit relations.
  – Difficult to verify success for implicit relations.
• Marcu and Echihabi (2002)
  – Artificial implicit relations: delete the connective to generate a dataset.
  – [Arg1, but Arg2] => [Arg1, Arg2]

Word pairs investigation
• The most easily accessible features are the words in the two text spans of the relation.
• Some relationship holds between the words in the two arguments (a feature-extraction sketch follows the Inquirer Tags slide below).
  – The recent explosion of country funds mirrors the "closed-end fund mania" of the 1920s, Mr. Foot says, when narrowly focused funds grew wildly popular. They fell into oblivion after the 1929 crash.
  – "Popular" and "oblivion" are almost antonyms.
  – This triggers the contrast relation between the sentences.

Word pairs selection
• Marcu and Echihabi (2002)
  – Only nouns, verbs, and other cue phrases.
  – Features using all words were superior to those based on only non-function words.
• Lapata and Lascarides (2004)
  – Only verbs, nouns, and adjectives.
  – Verb pairs are one of the best features.
  – No useful information was obtained from nouns and adjectives.
• Blair-Goldensohn et al. (2007)
  – Stemming.
  – Small vocabulary.
  – Cutoff on the minimum frequency of a feature.
  – Filtering stop-words has a negative impact on the results.

Analysis of word pair features
• Find the word pairs with the highest information gain on the synthetic data.
  – The government says it has reached most isolated townships by now, but because roads are blocked, getting anything but basic food supplies to people remains difficult.
  – Remove "but" => comparison example
  – Remove "because" => contingency example

Features for sense prediction
• Polarity tags
• Inquirer tags
• Money/Percent/Number
• Verbs
• First-last/first 3 words
• Context

Polarity tag pairs
• Similar to word pairs, but words are replaced with polarity tags.
• Each word's polarity was assigned according to its entry in the Multi-perspective Question Answering Opinion Corpus (Wilson et al., 2005).
• Each sentiment word is tagged as positive, negative, both, or neutral.
• The features are the counts of negated and non-negated positive, negative, and neutral sentiment words in the two spans.
• Executives at Time Inc. Magazine Co., a subsidiary of Time Warner, have said the joint venture with Mr. Lang wasn't a good one. [Negated Positive]
• The venture, formed in 1986, was supposed to be Time's low-cost, safe entry into women's magazines. [Positive]

Inquirer Tags
• Look up which semantic categories each word falls into according to the General Inquirer lexicon (Stone et al., 1966).
• We see more observations for each semantic class than for any particular word, which reduces the data sparsity problem.
• Complementary classes
  – "Understatement" vs. "Overstatement"
  – "Rise" vs. "Fall"
  – "Pleasure" vs. "Pain"
• Used only for verbs.
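The following is a minimal sketch of how word-pair features like those described above could be extracted: every combination of a word from Arg1 and a word from Arg2 becomes one binary feature. The tokenization, feature naming, and example sentences are illustrative assumptions, not the authors' implementation.

```python
from itertools import product


def word_pair_features(arg1: str, arg2: str) -> dict:
    """Binary features: one per (word from Arg1, word from Arg2) combination.

    Whitespace tokenization and lowercasing are simplifying assumptions.
    """
    tokens1 = arg1.lower().split()
    tokens2 = arg2.lower().split()
    return {f"{w1}|{w2}": 1 for w1, w2 in product(tokens1, tokens2)}


if __name__ == "__main__":
    # Implicit comparison example from the slides.
    arg1 = "The main conference is over Wednesday."
    arg2 = "I am staying for EMNLP."
    feats = word_pair_features(arg1, arg2)
    print(len(feats), "features, e.g.", sorted(feats)[:3])
```

The same cross-product template can be reused for the polarity tag pairs by first replacing each word with its polarity tag.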
Money/Percent/Num
• If two adjacent sentences both contain numbers, dollar amounts, or percentages, it is likely that a comparison relation holds between the sentences.
• Counts of numbers, percentages, and dollar amounts in the two arguments.
• Number of times each combination of number/percent/dollar occurs in the two arguments.
• Newsweek's circulation for the first six months of 1989 was 3,288,453, flat from the same period last year.
• U.S. News' circulation in the same time was 2,303,328, down 2.6%.

Verbs
• Number of pairs of verbs in Arg1 and Arg2 from the same verb class.
  – Two verbs are from the same verb class if their highest Levin verb class levels are the same.
  – The more related the verbs, the more likely the relation is an Expansion.
• Average length of verb phrases in each argument.
  – They [are allowed to proceed] => Contingency
  – They [proceed] => Expansion, Temporal
• POS tags of the main verb.
  – Same tense => Expansion
  – Different tense => Contingency, Temporal

First-Last, First3
• Prior work found the first and last words very helpful in predicting sense (Wellner et al., 2006).
  – They are often explicit connectives.

Context
• Some implicit relations appear immediately before or immediately after certain explicit relations.
• Features indicating whether the immediately preceding/following relation was explicit:
  – the connective
  – the sense of the connective
• A feature indicating whether an argument begins a paragraph.

Dataset
• Penn Discourse Treebank
• Largest available annotated corpus of discourse relations
  – Penn Treebank WSJ articles
  – 16,224 implicit relations between adjacent sentences
• I am a little tired; [because] there is a 13-hour time difference.
  – Contingency.Cause.Reason
• Only the top level of the sense annotations is used.

Top level discourse relations
• Comparison (contrast)
  – e.g. but, however, yet, even if, nevertheless, ...
• Contingency (cause and effect)
  – e.g. because, since, therefore, thus, ...
• Expansion (coordination, elaboration)
  – e.g. also, moreover, furthermore, ...
• Temporal (temporal order)
  – e.g. before that, afterwards, ...

Discourse relations

  Relation sense   Proportion of implicits
  Expansion        53%
  Contingency      26%
  Comparison       15%
  Temporal          6%

Experiment setting
• Developed features on sections 0-1.
• Trained on sections 2-20.
• Tested on sections 21-22.
• Binary classification task for each sense (a sketch of this setup follows the sequence-model slide below).
• Trained on equal numbers of positive and negative examples.
• Tested on the natural distribution.
• Naïve Bayes classifier.

Results: comparison

  Features              F-score
  First-Last, First3    21.01
  Context               19.32
  Money/Percent/Num     19.04
  Random                 9.91

• Polarity is actually the worst feature (16.63).

Distribution of opposite polarity pairs

                    Positive-Negative or Negative-Positive pairs
  Comparison        30%
  Not Comparison    31%

Results: contingency

  Features              F-score
  First-Last, First3    36.75
  Verbs                 36.59
  Context               29.55
  Random                19.11

Results: expansion

  Features              F-score
  Polarity Tags         71.29
  Inquirer Tags         70.21
  Context               67.77
  Random                64.74

• Expansion is the majority class.
• Precision is more problematic than recall.
• These features all help the other senses.

Results: temporal

  Features              F-score
  First-Last, First3    15.93
  Verbs                 12.61
  Context               12.34
  Random                 5.38

• Temporals often end with words like "Monday" or "yesterday".

Best feature sets
• Comparison
  – Selected word pairs.
• Contingency
  – Polarity, verbs, first/last, modality, context, selected word pairs.
• Expansion
  – Polarity, Inquirer tags, context.
• Temporal
  – First/last, selected word pairs.

Best results

  Relation       F-score   Baseline
  Comparison     21.96     17.13
  Contingency    47.13     31.10
  Expansion      76.41     63.84
  Temporal       16.76     16.21

Sequence model for discourse relations
• Tried a conditional random field classifier.
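As a rough illustration of the experiment setting above, the sketch below trains one balanced binary Naïve Bayes classifier per top-level sense and reports F-score on the natural test distribution. It uses scikit-learn as a stand-in for the authors' toolkit; the helper names, the downsampling routine, and the feature dictionaries it expects are assumptions for illustration, not the original code.

```python
import numpy as np
from sklearn.feature_extraction import DictVectorizer
from sklearn.metrics import f1_score
from sklearn.naive_bayes import BernoulliNB


def downsample_balanced(X, y, seed=0):
    """Keep equal numbers of positive and negative training examples."""
    rng = np.random.default_rng(seed)
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    n = min(len(pos), len(neg))
    keep = np.concatenate([rng.choice(pos, n, replace=False),
                           rng.choice(neg, n, replace=False)])
    return X[keep], y[keep]


def sense_vs_other_f1(train_feats, train_senses, test_feats, test_senses, sense):
    """One-vs-all classifier for a single top-level sense; returns test F-score.

    train_feats / test_feats are lists of feature dicts (e.g. word pairs),
    train_senses / test_senses are the gold top-level sense labels.
    """
    vec = DictVectorizer()
    X_train = vec.fit_transform(train_feats)
    X_test = vec.transform(test_feats)
    y_train = np.array([int(s == sense) for s in train_senses])
    y_test = np.array([int(s == sense) for s in test_senses])
    X_bal, y_bal = downsample_balanced(X_train, y_train)  # balanced training set
    clf = BernoulliNB().fit(X_bal, y_bal)
    return f1_score(y_test, clf.predict(X_test))          # natural test distribution
```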
  Model                        Accuracy
  Naïve Bayes Model            43.27%
  Conditional Random Fields    44.58%

Conclusion
• First study that predicts implicit discourse relations in a realistic setting.
• Better understanding of word pairs.
  – The features in fact do not capture opposite semantic relations, but rather give information about function-word co-occurrences.
• Empirical validation of new and old features.
  – Polarity, verb classes, context, and some lexical features indicate discourse relations.