Identifying Expressions of Opinion in Context
Eric Breck, Yejin Choi, and Claire Cardie
IJCAI 2007
Introduction
• Traditional information extraction: answer questions about facts.
• Extract answers to subjective questions: how does X feel about Y?
• Subjective information extraction and question answering will require techniques to analyze text below the sentence level.
Introduction: System Requirements
• Is the opinion’s polarity positive, negative, or neutral?
• With what strength or intensity is the opinion expressed: mild, medium, strong, or extreme?
• Who or what is the source, or holder, of the opinion?
• What is its target, i.e., what is the opinion about?
Introduction: Examples
• Minister Vedrine criticized the White House reaction.
– the agent role = “Minister Vedrine”
– the object/theme role = “White House reaction”
• 17 persons were killed by sharpshooters faithful to the president.
• Tsvangirai said the election result was “illegitimate” and a clear case of “highway robbery”.
• Criminals have been preying on Korean travelers in China.
Introduction
• Direct subjective expressions (DSEs)
– criticized, faithful to
– said (a speech event, if subjective)
• Expressive subjective elements (ESEs)
– illegitimate, highway robbery
– preying on (instead of mugging)
• No prior work has directly tackled the problem of opinion expression identification.
Subjective Expressions
• The expressions can vary in length from one word to over twenty words.
• They may be verb phrases, noun phrases, or strings of words that do not correspond to any linguistic constituent.
• Subjectivity is a realm of expression where writers get quite creative, so no short fixed list can capture all expressions of interest.
• Also, an expression which is subjective in one context is not always subjective in another context.
Approach
• The task is treated as a tagging problem.
• Model: a linear-chain conditional random field (CRF), implemented with the MALLET toolkit.
• Class variable: IOB vs. IO encoding of expression spans (see the sketch below).
• Features: lexical, syntactic, and dictionary-based (next slides).
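A minimal sketch of the IOB vs. IO encodings, assuming spans given as token offsets (illustrative Python only, not the authors’ MALLET setup; the helper name and span format are mine):

```python
# Minimal sketch of the IOB vs. IO token encodings for opinion
# expressions (illustrative only; not the authors' implementation).

def encode(tokens, spans, scheme="IOB"):
    """Label tokens: B marks the start of an expression (IOB only),
    I marks tokens inside one, O marks everything else."""
    labels = ["O"] * len(tokens)
    for start, end in spans:               # [start, end) token offsets
        for i in range(start, end):
            labels[i] = "B" if scheme == "IOB" and i == start else "I"
    return labels

tokens = ["a", "clear", "case", "of", "highway", "robbery"]
ese_spans = [(4, 6)]                       # "highway robbery" is an ESE

print(encode(tokens, ese_spans, "IOB"))    # [..., 'O', 'B', 'I']
print(encode(tokens, ese_spans, "IO"))     # [..., 'O', 'I', 'I']
```

Under IO, adjacent expressions cannot be separated, but the label set is smaller; the slides compare both encodings.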
Features (1)
• Lexical features (sketched below)
– The word at position i relative to the current token, for i from -4 to 4 (Lex-4 through Lex4); roughly 18,000 binary features per position (the vocabulary size).
• Syntactic features
– POS tag of the current token (45 binary features).
– Constituent type of the previous, current, and next constituent from the CASS partial parser (100 binary features each).
• Dictionary-based features (next slide)
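A minimal sketch of how the lexical window (Lex-4 through Lex4) and POS features might be generated as binary indicator features; the feature-name strings are hypothetical, not taken from the paper:

```python
# Sketch of lexical window features (Lex-4 .. Lex4) and the POS feature,
# emitted as string-keyed binary indicators (names are hypothetical).

def token_features(tokens, pos_tags, i, window=4):
    feats = []
    # Lexical: the word at each position j relative to the current token.
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            feats.append(f"lex[{offset}]={tokens[j].lower()}")
    # Syntactic: the current token's part-of-speech tag.
    feats.append(f"pos={pos_tags[i]}")
    return feats

tokens = ["Criminals", "have", "been", "preying", "on", "Korean", "travelers"]
pos = ["NNS", "VBP", "VBN", "VBG", "IN", "JJ", "NNS"]
print(token_features(tokens, pos, 3))
# ['lex[-3]=criminals', ..., 'lex[0]=preying', ..., 'pos=VBG']
```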
Features (2)
• Dictionary-based features, from 4 sources (sketched below):
– WordNet: hypernyms of the word (29,989 binary features)
– Levin: Levin’s categorization of English verbs
– FrameNet: the word’s category in FrameNet’s categorization of nouns and verbs
– Wilson subjectivity clues: strong or weak (two binary features)
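A hedged sketch of two of these feature sources, using NLTK’s WordNet interface as a stand-in (the paper does not specify its tooling, and the clue lexicon below is a made-up fragment):

```python
# Sketch of two dictionary-based feature sources: WordNet hypernyms and
# Wilson-style subjectivity-clue strength. NLTK's WordNet interface is a
# stand-in; run nltk.download('wordnet') once before using it.
from nltk.corpus import wordnet as wn

# Hypothetical fragment of a Wilson-style clue lexicon: word -> strength.
WILSON_CLUES = {"illegitimate": "strong", "criticized": "strong"}

def dictionary_features(word):
    feats = []
    # One binary feature per hypernym of the word's first WordNet sense.
    for synset in wn.synsets(word)[:1]:
        for hyper in synset.hypernyms():
            feats.append(f"wn_hypernym={hyper.name()}")
    # Two possible binary features: the word is a strong or a weak clue.
    strength = WILSON_CLUES.get(word.lower())
    if strength is not None:
        feats.append(f"wilson_clue={strength}")
    return feats

print(dictionary_features("robbery"))       # e.g. ['wn_hypernym=larceny.n.01']
print(dictionary_features("illegitimate"))  # includes 'wilson_clue=strong'
```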
Statistics of Data
• MPQA corpus: 535 documents.
• 135 documents for development, 400 for evaluation.
• 10-fold cross-validation.
Evaluation
• Metrics: precision, recall, and F-measure
– Exact matching: predicted and gold span boundaries must match exactly.
– Overlap matching: a predicted span need only overlap a gold span. (Both sketched below.)
• Baselines: dictionary-based
– Two dictionaries of subjectivity clues: Wiebe vs. Wilson
– The Wilson clue dictionary is the one incorporated in this experiment.
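A minimal sketch of the two matching criteria as I read them; the span format and function names are mine, not the paper’s:

```python
# Sketch of the exact vs. overlap matching criteria behind the
# precision/recall/F-measure numbers (illustrative only).

def precision_recall(pred, gold, mode="exact"):
    def match(a, b):
        if mode == "exact":
            return a == b                    # identical boundaries
        return a[0] < b[1] and b[0] < a[1]   # spans merely intersect
    p = sum(any(match(s, g) for g in gold) for s in pred) / max(len(pred), 1)
    r = sum(any(match(g, s) for s in pred) for g in gold) / max(len(gold), 1)
    return p, r

def f_measure(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

gold = [(2, 3), (10, 14)]            # gold spans, [start, end) offsets
pred = [(2, 3), (11, 13), (20, 22)]  # predicted spans

print(precision_recall(pred, gold, "exact"))    # (0.33..., 0.5)
print(precision_recall(pred, gold, "overlap"))  # (0.66..., 1.0)
```

Overlap matching is more forgiving, which matters here because expression boundaries were not tightly specified for the annotators (see the Discussion slide).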
Results (DSE/ESE)
[results table not captured in the transcript]
Results (DSE and ESE)
[results table not captured in the transcript]
Results (Dictionary-based)
• WordNet is the most useful dictionary.
• The other dictionaries help only a little.
Discussion
• Rules for boundary agreement were not defined for the annotations; the order-1 model outperforms the order-0 model.
• DSEs include speech events like “said” or “a statement”, which may be objective.
• Expressions of subjectivity tend to cluster, so density-based features might help.
• Inter-annotator agreement: 0.75 for DSEs, 0.72 for ESEs.