Automatic sense prediction for
implicit discourse relations in text
Emily Pitler, Annie Louis, Ani Nenkova
University of Pennsylvania
ACL 2009
Implicit discourse relations
• Explicit comparison
– I am in Singapore, but I live in the United States.
• Implicit comparison
– The main conference is over on Wednesday. I am staying
for EMNLP.
• Explicit contingency
– I am here because I have a presentation to give at ACL.
• Implicit contingency
– I am a little tired; there is a 13 hour time difference.
Related work
• Soricut and Marcu (2003)
– Sentence level
• Wellner et al. (2006)
– Used GraphBank annotations, which do not differentiate
between implicit and explicit relations.
– Difficult to verify success for implicit relations.
• Marcu and Echihabi (2001)
– Artificial implicit examples.
– Delete the connective to generate the dataset (a sketch of this follows below).
– [Arg1, but Arg2] => [Arg1, Arg2]
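A minimal sketch of this style of data generation, assuming a tiny connective-to-sense table (the real inventories are much larger); the function and variable names are illustrative, not from the paper.

```python
import re

# Illustrative connective-to-sense map (tiny subset of the real inventories).
CONNECTIVE_SENSE = {
    "but": "Comparison",
    "because": "Contingency",
}

def make_artificial_implicit(sentence):
    """Turn '<Arg1>, but <Arg2>' into an (Arg1, Arg2, sense) triple
    by deleting the explicit connective, in the spirit of Marcu and
    Echihabi (2001)."""
    for connective, sense in CONNECTIVE_SENSE.items():
        match = re.search(rf",?\s+{connective}\s+", sentence, flags=re.IGNORECASE)
        if match:
            arg1 = sentence[:match.start()].strip()
            arg2 = sentence[match.end():].strip()
            return arg1, arg2, sense
    return None

print(make_artificial_implicit("I am in Singapore, but I live in the United States."))
# ('I am in Singapore', 'I live in the United States.', 'Comparison')
```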
Word pairs investigation
• The most easily accessible features are the words
in the two text spans of the relation.
• There are relationships that hold between the
words in the two arguments.
– The recent explosion of country funds mirrors the
“closed-end fund mania” of the 1920s, Mr. Foot says,
when narrowly focused funds grew wildly popular.
They fell into oblivion after the 1929 crash.
– Popular and oblivion are almost antonyms.
– Triggers the contrast relation between the sentences.
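As a rough illustration of the word-pair representation, the sketch below takes the cross product of the tokens in the two arguments; the lowercased whitespace tokenization is a simplification.

```python
from collections import Counter

def word_pair_features(arg1, arg2):
    """Cross product of tokens in Arg1 and Arg2, counted as sparse features.
    Naive lowercased whitespace tokenization, purely for illustration."""
    tokens1 = arg1.lower().split()
    tokens2 = arg2.lower().split()
    return Counter((w1, w2) for w1 in tokens1 for w2 in tokens2)

feats = word_pair_features("funds grew wildly popular",
                           "they fell into oblivion after the crash")
print(feats[("popular", "oblivion")])  # 1
```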
Word pairs selection
• Marcu and Echihabi (2001)
– Only nouns, verbs, and other cue phrases.
– Using all words was superior to using only non-function words.
• Lapata and Lascarides (2004)
– Only verbs, nouns, and adjectives.
– Verb pairs are one of the best features.
– No useful information was obtained using nouns and adjectives.
• Blair-Goldensohn et al. (2007)
– Stemming.
– Small vocabulary.
– Cutoff on the minimum frequency of a feature.
– Filtering stop-words has a negative impact on the results.
Analysis of word pair features
• Finding the word pairs with highest information gain on
the synthetic data.
– The government says it has reached most isolated
townships by now, but because roads are blocked, getting
anything but basic food supplies to people remains difficult.
– Remove but => comparison example
– Remove because => contingency example
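A small sketch of ranking binary word-pair features by information gain; the toy examples below stand in for the synthetic data and are not from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of sense labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, feature):
    """Information gain of one binary word-pair feature over sense labels.
    `examples` is a list of (feature_set, label) pairs."""
    labels = [label for _, label in examples]
    with_f = [label for feats, label in examples if feature in feats]
    without_f = [label for feats, label in examples if feature not in feats]
    if with_f and without_f:
        n = len(examples)
        remainder = (len(with_f) / n) * entropy(with_f) \
                  + (len(without_f) / n) * entropy(without_f)
    else:
        remainder = entropy(labels)  # feature never splits the data
    return entropy(labels) - remainder

# Toy examples (not the paper's data).
examples = [
    ({("popular", "oblivion")}, "Comparison"),
    ({("tired", "difference")}, "Contingency"),
    ({("popular", "oblivion")}, "Comparison"),
]
print(information_gain(examples, ("popular", "oblivion")))  # ~0.918
```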
Features for sense prediction
• Polarity tags
• Inquirer tags
• Money/Percent/Number
• Verbs
• First-last/first 3 words
• Context
Polarity Tags pairs
• Similar to word pairs, but words replaced with polarity tags.
• Each word’s polarity was assigned according to its entry in the
Multi-perspective Question Answering Opinion Corpus (Wilson et
al., 2005)
• Each sentiment word is tagged as positive, negative, both, or
neutral.
• Use the number of negated and non-negated positive, negative,
and neutral sentiment words in the two spans as features.
• Executives at Time Inc. Magazine Co., a subsidiary of Time Warner,
have said the joint venture with Mr. Lang wasn’t a good one.
[Negated Positive]
• The venture, formed in 1986, was supposed to be Time’s low-cost,
safe entry into women’s magazines. [Positive]
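A sketch of how such counts could be computed, assuming a tiny placeholder polarity lexicon and a crude window-based negation check; the paper instead looks words up in the MPQA lexicon (Wilson et al., 2005).

```python
# Placeholder polarity lexicon; the paper uses the MPQA lexicon.
POLARITY = {"good": "positive", "safe": "positive", "low-cost": "positive",
            "bad": "negative", "crash": "negative"}
NEGATIONS = {"not", "n't", "never", "no"}

def polarity_counts(tokens, span_name):
    """Count negated and non-negated positive/negative/neutral sentiment
    words in one argument span; a word is treated as negated if a negation
    appears in the previous three tokens (a crude approximation)."""
    counts = {}
    for i, token in enumerate(tokens):
        polarity = POLARITY.get(token.lower())
        if polarity is None:
            continue
        negated = any(t.lower() in NEGATIONS for t in tokens[max(0, i - 3):i])
        key = f"{span_name}_{'negated_' if negated else ''}{polarity}"
        counts[key] = counts.get(key, 0) + 1
    return counts

print(polarity_counts("was n't a good one".split(), "arg2"))
# {'arg2_negated_positive': 1}
```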
Inquirer Tags
• Look up what semantic categories each word falls
into according to the General Inquirer lexicon
(Stone et al., 1966).
• We see more observations for each semantic class
than for any particular word, reducing the data
sparsity problem.
• Complementary classes
– “Understatement” vs. “Overstatement”
– “Rise” vs. “Fall”
– “Pleasure” vs. “Pain”
• Only verbs.
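A sketch of backing off from verbs to Inquirer categories; the small category table here is a stand-in for the full General Inquirer lexicon (Stone et al., 1966).

```python
from collections import Counter

# Tiny stand-in for the General Inquirer lexicon; the real lexicon
# assigns each word to many semantic categories.
INQUIRER = {
    "rise": {"Rise"},
    "grew": {"Rise"},
    "fell": {"Fall"},
    "enjoy": {"Pleasure"},
    "suffer": {"Pain"},
}

def inquirer_tag_pairs(verbs1, verbs2):
    """Cross product of Inquirer categories for the verbs of Arg1 and Arg2,
    backing off from individual words to semantic classes."""
    tags1 = {t for v in verbs1 for t in INQUIRER.get(v.lower(), set())}
    tags2 = {t for v in verbs2 for t in INQUIRER.get(v.lower(), set())}
    return Counter((t1, t2) for t1 in tags1 for t2 in tags2)

print(inquirer_tag_pairs(["grew"], ["fell"]))  # Counter({('Rise', 'Fall'): 1})
```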
Money/Percent/Num
• If two adjacent sentences both contain numbers, dollar
amounts, or percentages, a comparison relation is likely
to hold between them.
• Count of numbers, percentages, and dollar amounts in
the two arguments.
• Number of times each combination of
number/percent/dollar occurs in the two arguments.
• Newsweek's circulation for the first six months of 1989
was 3,288,453, flat from the same period last year
• U.S. News' circulation in the same time was 2,303,328,
down 2.6%
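A sketch of these count features using rough regular expressions for numbers, percentages, and dollar amounts; the patterns are approximations, and a real system would more likely rely on tokenization or NER.

```python
import re

# Rough patterns, purely for illustration.
PATTERNS = {
    "percent": re.compile(r"\d[\d,.]*\s*%|\bpercent\b", re.IGNORECASE),
    "money": re.compile(r"\$\s?\d[\d,.]*"),
    "number": re.compile(r"\b\d[\d,.]*\b"),
}

def money_percent_num_features(arg1, arg2):
    """Counts of numbers, percentages, and dollar amounts in each argument,
    plus indicators for each combination occurring across the two arguments."""
    feats = {}
    for name, span in (("arg1", arg1), ("arg2", arg2)):
        for kind, pattern in PATTERNS.items():
            feats[f"{name}_{kind}"] = len(pattern.findall(span))
    for k1 in PATTERNS:
        for k2 in PATTERNS:
            feats[f"both_{k1}_{k2}"] = int(feats[f"arg1_{k1}"] > 0
                                           and feats[f"arg2_{k2}"] > 0)
    return feats

feats = money_percent_num_features(
    "Newsweek's circulation for the first six months of 1989 was 3,288,453.",
    "U.S. News' circulation in the same time was 2,303,328, down 2.6%.")
print(feats["arg1_number"], feats["arg2_percent"], feats["both_number_percent"])
```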
Verbs
• Number of pairs of verbs in Arg1 and Arg2 from the
same verb class.
– Two verbs are in the same verb class if their
highest Levin verb class levels are the same.
– The more related the verbs, the more likely the relation is
an Expansion.
• Average length of verb phrases in each argument
– They [are allowed to proceed] => Contingency
– They [proceed] => Expansion, Temporal
• POS tags of the main verb
– Same tense => Expansion
– Different tense => Contingency, Temporal
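A sketch of the verb features, assuming a placeholder verb-to-Levin-class table and pre-extracted verbs, verb-phrase lengths, and main-verb POS tags; all names here are illustrative.

```python
from itertools import product

# Placeholder mapping from verbs to their highest Levin class; the paper
# uses the full Levin (1993) verb-class hierarchy.
LEVIN_CLASS = {"say": "37", "tell": "37", "proceed": "47", "go": "47"}

def verb_features(verbs1, verbs2, vp_lengths1, vp_lengths2, pos1, pos2):
    """Verb-based features: same-Levin-class pair count, average verb phrase
    length per argument, and whether the main verbs share a POS tag (tense)."""
    same_class = sum(
        1 for v1, v2 in product(verbs1, verbs2)
        if LEVIN_CLASS.get(v1) is not None and LEVIN_CLASS.get(v1) == LEVIN_CLASS.get(v2)
    )
    return {
        "same_levin_class_pairs": same_class,
        "avg_vp_len_arg1": sum(vp_lengths1) / max(len(vp_lengths1), 1),
        "avg_vp_len_arg2": sum(vp_lengths2) / max(len(vp_lengths2), 1),
        "same_main_verb_pos": int(pos1 == pos2),
    }

print(verb_features(["say"], ["tell"], [4], [1], "VBD", "VBZ"))
```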
First-Last, First3
• Prior work found first and last words very
helpful in predicting sense
– Wellner et al., 2006
– Often explicit connectives
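A sketch of these position-based lexical features: first word, last word, and first three words of each argument.

```python
def first_last_first3(tokens1, tokens2):
    """First word, last word, and first three words of each argument,
    encoded as string-valued features."""
    feats = {}
    for name, tokens in (("arg1", tokens1), ("arg2", tokens2)):
        feats[f"{name}_first"] = tokens[0].lower()
        feats[f"{name}_last"] = tokens[-1].lower()
        feats[f"{name}_first3"] = " ".join(t.lower() for t in tokens[:3])
    return feats

print(first_last_first3("I am a little tired".split(),
                        "there is a 13 hour time difference".split()))
```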
Context
• Some implicit relations appear immediately
before or immediately after certain explicit
relations.
• Indicate whether the immediately
preceding/following relation was explicit.
– Connective
– Sense of the connective
• Indicate whether an argument begins a paragraph.
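A sketch of the context features, representing adjacent explicit relations as illustrative dicts; the dict layout is an assumption for this example.

```python
def context_features(prev_explicit, next_explicit, arg1_starts_paragraph):
    """Context features: connective and sense of the immediately preceding /
    following explicit relation (or 'none'), and a paragraph-start flag.
    `prev_explicit` / `next_explicit` are illustrative dicts like
    {'connective': 'but', 'sense': 'Comparison'}, or None."""
    feats = {"arg1_starts_paragraph": int(arg1_starts_paragraph)}
    for name, rel in (("prev", prev_explicit), ("next", next_explicit)):
        feats[f"{name}_connective"] = rel["connective"] if rel else "none"
        feats[f"{name}_sense"] = rel["sense"] if rel else "none"
    return feats

print(context_features({"connective": "but", "sense": "Comparison"}, None, True))
```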
Dataset
• Penn Discourse Treebank
• Largest available annotated corpus of discourse
relations
– Penn Treebank WSJ articles
– 16,224 implicit relations between adjacent sentences
• I am a little tired; [because] there is a 13 hour
time difference.
– Contingency.cause.reason
• Use only the top level of the sense annotations.
Top level discourse relations
• Comparison (contrast)
– e.g. but, however, yet, even though, surprisingly, ...
• Contingency (cause-effect)
– e.g. because, due to, therefore, thus, ...
• Expansion (addition)
– e.g. also, and, moreover, ...
• Temporal (sequence)
– e.g. before this, afterwards, ...
Discourse relations
• Proportion of implicit relations by sense:
– Expansion: 53%
– Contingency: 26%
– Comparison: 15%
– Temporal: 6%
Experiment setting
• Developed features on sections 0-1
• Trained on sections 2-20
• Tested on sections 21-22
• Binary classification task for each sense
• Trained on equal numbers of positive and negative examples
• Tested on natural distribution
• Naïve Bayes classifier
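A sketch of this setup with scikit-learn: one binary Naïve Bayes classifier per sense, trained on a downsampled balanced set and scored on the natural test distribution. PDTB loading and feature extraction are omitted, and the choice of BernoulliNB and DictVectorizer is an assumption about the representation, not the paper's toolkit.

```python
import random
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import f1_score

def train_binary_classifier(train_feats, train_senses, target_sense, seed=0):
    """One-vs-rest Naïve Bayes for a single sense, trained on equal numbers
    of positive and negative examples (negatives are downsampled)."""
    pos = [i for i, s in enumerate(train_senses) if s == target_sense]
    neg = [i for i, s in enumerate(train_senses) if s != target_sense]
    random.Random(seed).shuffle(neg)
    idx = pos + neg[:len(pos)]

    vec = DictVectorizer()
    X = vec.fit_transform([train_feats[i] for i in idx])
    y = [int(train_senses[i] == target_sense) for i in idx]
    clf = BernoulliNB().fit(X, y)
    return vec, clf

def evaluate(vec, clf, test_feats, test_senses, target_sense):
    """F-score on the natural (unbalanced) test distribution."""
    X = vec.transform(test_feats)
    y_true = [int(s == target_sense) for s in test_senses]
    return f1_score(y_true, clf.predict(X))
```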
Results: comparison
• F-scores by feature set:
– First-Last, First3: 21.01
– Context: 19.32
– Money/Percent/Num: 19.04
– Polarity tags: 16.63 (actually the worst feature)
– Random: 9.91
• Distribution of opposite polarity pairs (Positive-Negative or Negative-Positive):
– Comparison: 30%
– Not Comparison: 31%
Results: contingency
• F-scores by feature set:
– First-Last, First3: 36.75
– Verbs: 36.59
– Context: 29.55
– Random: 19.11
Results: expansion
• F-scores by feature set:
– Polarity Tags: 71.29
– Inquirer Tags: 70.21
– Context: 67.77
– Random: 64.74
• Expansion is the majority class
• Precision is more problematic than recall
• These features all help other senses
Results: temporal
• F-scores by feature set:
– First-Last, First3: 15.93
– Verbs: 12.61
– Context: 12.34
– Random: 5.38
• Temporals often end with words like “Monday” or “yesterday”
Best feature sets
• Comparison
– Selected word pairs.
• Contingency
– Polarity, verb, first/last, modality, context, selected
word pairs.
• Expansion
– Polarity, inquirer tags, context.
• Temporal
– First/last, selected word pairs.
Best results
• F-scores vs. baseline:
– Comparison: 21.96 (baseline 17.13)
– Contingency: 47.13 (baseline 31.10)
– Expansion: 76.41 (baseline 63.84)
– Temporal: 16.76 (baseline 16.21)
Sequence model for
discourse relations
• Tried a conditional random field classifier (see the sketch below).
• Accuracy:
– Naïve Bayes model: 43.27%
– Conditional Random Fields: 44.58%
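A sketch of treating the implicit relations of a document as a label sequence with a linear-chain CRF, using the sklearn-crfsuite package; the toolkit and the toy feature dicts are assumptions, since the slides do not specify an implementation.

```python
import sklearn_crfsuite

def train_sequence_model(documents, labels):
    """Linear-chain CRF over the sequence of implicit relations in each text.
    `documents` is a list of documents, each a list of per-relation feature
    dicts; `labels` is the corresponding list of sense-label sequences."""
    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1,
                               max_iterations=100)
    crf.fit(documents, labels)
    return crf

# Toy usage with two short "documents" of adjacent implicit relations.
docs = [[{"arg1_first": "i", "arg2_first": "there"},
         {"arg1_first": "the", "arg2_first": "they"}],
        [{"arg1_first": "executives", "arg2_first": "the"}]]
senses = [["Contingency", "Comparison"], ["Expansion"]]
crf = train_sequence_model(docs, senses)
print(crf.predict(docs))
```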
Conclusion
• First study that predicts implicit discourse
relations in a realistic setting.
• Better understanding of word pairs.
– The features in fact do not capture opposite
semantic relations, but rather give information
about function word co-occurrences.
• Empirical validation of new and old features.
– Polarity, verb classes, context, and some lexical
features indicate discourse relations.