Sentiment Detection
Rik Sarkar (03305048)
Kedar Godbole (03305805)
Outline
 Sentiment detection: the problem statement
 Difficulties in sentiment detection
 Approaches to sentiment detection
 Conclusion
 Project proposal
Problem Statement
 Detect the polarity of the sentiment expressed about a particular topic in
a document
Polarity:
- Positive
- Negative
- Mixed
- Neutral
Motivation
Reviews on the Web
 Opinions about a product
 Opinions about the individual aspects of a
product
 Movie/book reviews
 Feedback/evaluation forms
Issues
 Reference to multiple objects in the same
document
- The NR70 is trendy. T-Series is fast becoming
obsolete.
 Dependence on the context of the document
- “Unpredictable” plot (positive) vs. “unpredictable” performance (negative)
 Negations have to be captured
- Monochrome display is not what the user wants
Issues (contd.)
 Metaphors/Similes
- The metallic body is solid as a rock
 Part-of and Attribute-of relationships
- The small keypad is inconvenient
 Absence of a polar word
- How can someone sit through this seminar?
Approaches to Sentiment Detection
 Based on pre-selected sets of words
 Naive Bayes
 Support Vector Machines
 Unsupervised learning
 Enhancement by NLP
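The first approach in the list, a pre-selected set of words, can be sketched as below. This is only an illustrative baseline: the two word sets and the function name `word_list_polarity` are made up for this example and do not come from any of the cited systems. The four output labels are the polarity classes from the problem statement.

```python
# Hypothetical, hand-picked polar word lists (assumptions for illustration).
POSITIVE = {"good", "excellent", "trendy", "solid"}
NEGATIVE = {"bad", "poor", "obsolete", "inconvenient"}

def word_list_polarity(text):
    """Classify text by counting words from the two fixed lists."""
    words = [w.strip('.,!?"').lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos and neg:
        return "mixed"
    if pos:
        return "positive"
    if neg:
        return "negative"
    return "neutral"

print(word_list_polarity("The NR70 is trendy. T-Series is fast becoming obsolete."))
print(word_list_polarity("How can someone sit through this seminar?"))
```

Run on the earlier examples, such a baseline shows the issues listed before: the two-product review comes out as "mixed" because the words are not tied to their objects, and the rhetorical question comes out as "neutral" because it contains no polar word.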
An Unsupervised Learning Technique
Extract phrases from the review based on patterns of POS tags
 JJ – Adjective
 RB – Adverb
 NN – Noun
First word    Second word
JJ            NN
RB            JJ
JJ            JJ
NN            JJ
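The phrase extraction step can be sketched in Python as follows. This is a minimal illustration, not the original implementation: it assumes the review has already been POS-tagged, the sample sentence is hand-tagged with only the coarse tags named on this slide, and it checks just the four two-word patterns (JJ NN, RB JJ, JJ JJ, NN JJ) without any constraint on the word that follows.

```python
# Two-word POS patterns from the slide: (tag of first word, tag of second word).
PATTERNS = {("JJ", "NN"), ("RB", "JJ"), ("JJ", "JJ"), ("NN", "JJ")}

def extract_phrases(tagged):
    """Return the two-word phrases whose tag pair matches one of PATTERNS.

    `tagged` is a list of (word, tag) pairs produced by some POS tagger
    beforehand (the tagger itself is outside this sketch).
    """
    phrases = []
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if (t1, t2) in PATTERNS:
            phrases.append(f"{w1} {w2}")
    return phrases

# Hand-tagged sample sentence (hypothetical input).
tagged = [("The", "DT"), ("bank", "NN"), ("has", "VBZ"),
          ("low", "JJ"), ("fees", "NN"), ("and", "CC"),
          ("very", "RB"), ("helpful", "JJ"), ("staff", "NN")]
phrases = extract_phrases(tagged)
print(phrases)
```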
Unsupervised Learning
PointWise Mutual Information (PMI)
and Semantic Orientation (SO)
\[
\mathrm{PMI}(word_1, word_2) = \log \frac{p(word_1 \,\&\, word_2)}{p(word_1)\, p(word_2)}
\]
\[
\mathrm{SO}(phrase) = \mathrm{PMI}(phrase, \text{``excellent''}) - \mathrm{PMI}(phrase, \text{``poor''})
\]
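The two definitions translate directly into code. The sketch below assumes the probabilities are already known and uses a base-2 logarithm as in Turney (2002); the slide itself does not state the base. The function names and the example probabilities are invented for illustration.

```python
import math

def pmi(p_joint, p_a, p_b):
    """Pointwise mutual information: log of joint over product of marginals."""
    return math.log2(p_joint / (p_a * p_b))

def semantic_orientation(p_phrase, p_excellent, p_poor,
                         p_phrase_and_excellent, p_phrase_and_poor):
    """SO(phrase) = PMI(phrase, "excellent") - PMI(phrase, "poor")."""
    return (pmi(p_phrase_and_excellent, p_phrase, p_excellent)
            - pmi(p_phrase_and_poor, p_phrase, p_poor))

# Made-up probabilities: the phrase co-occurs with "excellent" four times
# more often than independence would predict, and with "poor" exactly as
# often as independence would predict.
so = semantic_orientation(0.01, 0.02, 0.02, 0.0008, 0.0002)
print(round(so, 3))
```

A positive result means the phrase is more strongly associated with "excellent" than with "poor"; a negative result means the opposite.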
Unsupervised Learning
 Determine the Semantic Orientation (SO) of the
phrases
 Search on AltaVista
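 SO (phrase) =
\[
\log \frac{\mathrm{hits}(phrase \ \mathrm{NEAR}\ \text{``excellent''}) \cdot \mathrm{hits}(\text{``poor''})}{\mathrm{hits}(phrase \ \mathrm{NEAR}\ \text{``poor''}) \cdot \mathrm{hits}(\text{``excellent''})}
\]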
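Given hit counts, the estimate is a one-line computation. The sketch below takes the four counts as plain numbers (how they are obtained from a search engine is outside its scope) and uses a base-2 logarithm. The small `smoothing` constant added to the NEAR counts follows Turney (2002), where 0.01 is added to avoid division by zero; it is not shown on the slide. All counts in the example are invented.

```python
import math

def so_from_hits(near_excellent, near_poor, hits_excellent, hits_poor,
                 smoothing=0.01):
    """Estimate SO(phrase) from search-engine hit counts.

    near_excellent / near_poor: hits for phrase NEAR "excellent" / "poor".
    hits_excellent / hits_poor: hits for the two reference words alone.
    """
    numerator = (near_excellent + smoothing) * hits_poor
    denominator = (near_poor + smoothing) * hits_excellent
    return math.log2(numerator / denominator)

# Invented counts: the phrase appears near "excellent" four times as
# often as near "poor", and both reference words are equally common.
so = so_from_hits(near_excellent=80, near_poor=20,
                  hits_excellent=1000, hits_poor=1000)
print(so > 0)
```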
Unsupervised Learning
Calculate average semantic orientation of document:
Extracted phrase          POS tags   Semantic Orientation
Low fees                  JJ NN       0.333
Online service            JJ NN       2.780
Inconveniently located    RB VB      -1.541
Average Semantic Orientation = 0.524
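The averaging step can be checked with a few lines of Python. The three values are copied from the table; classifying the document as positive when the average is above zero follows Turney (2002) and is an assumption here, since the slide only reports the average.

```python
# (phrase, semantic orientation) pairs copied from the table.
phrase_so = [
    ("low fees", 0.333),
    ("online service", 2.780),
    ("inconveniently located", -1.541),
]

average_so = sum(so for _, so in phrase_so) / len(phrase_so)
# Assumed decision rule: positive average means a positive document.
polarity = "positive" if average_so > 0 else "negative"
print(f"Average SO = {average_so:.3f} ({polarity})")
```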
Need for NLP
 Identifying phrases is not enough – need to
identify subject/object
- The NR70 is trendy. T-Series is fast becoming
obsolete.
 Need to identify part-of and attribute-of
relationship
- The battery is long-lasting
Focus of the sentiment
Feature/attribute terms:
 BNP - Base Noun Phrases
- battery, display, keypad
 dBNP - Definite Base Noun Phrases
- “the display”
 bBNP - Beginning Definite Base Noun
Phrases
- “The battery is long-lasting”
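A rough way to spot bBNP candidates is sketched below. It assumes POS-tagged input and uses a simplified heuristic based on the definition in Yi et al. (a definite base noun phrase at the start of a sentence, followed by a verb phrase): the sentence must open with "the", continue with adjectives or nouns ending in a noun, and then reach a verb. The function name and the tag checks are illustrative assumptions, not the paper's implementation.

```python
def beginning_definite_bnp(tagged):
    """Return the sentence-initial definite base noun phrase, or None.

    `tagged` is one sentence as a list of (word, tag) pairs.
    """
    if not tagged or tagged[0][0].lower() != "the":
        return None
    i = 1
    # Consume adjectives and nouns that follow the article.
    while i < len(tagged) and tagged[i][1].startswith(("JJ", "NN")):
        i += 1
    ends_in_noun = i > 1 and tagged[i - 1][1].startswith("NN")
    followed_by_verb = i < len(tagged) and tagged[i][1].startswith("VB")
    if ends_in_noun and followed_by_verb:
        return " ".join(word for word, _ in tagged[1:i])
    return None

# Hand-tagged example from the slide.
sentence = [("The", "DT"), ("battery", "NN"), ("is", "VBZ"), ("long-lasting", "JJ")]
feature = beginning_definite_bnp(sentence)
print(feature)
```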
Sentiment Analyzer
 Sentiment lexicon database
- <lexical_entry> <POS> <sent_category>
- “excellent” JJ +
 Sentiment pattern database
- <predicate> <sent_category> <target>
- “I am impressed with the flash capabilities”
- impress + PP(by;with) target
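The two databases can be pictured as simple record types. The sketch below mirrors the field names shown on the slide; the class names, the in-memory dictionaries, and the `lookup` helper are assumptions made for illustration, and each store holds only the single entry given as an example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LexiconEntry:
    lexical_entry: str   # the word itself
    pos: str             # part-of-speech tag
    sent_category: str   # "+" or "-"

@dataclass(frozen=True)
class SentimentPattern:
    predicate: str       # verb the pattern is keyed on
    sent_category: str   # sentiment assigned to the target
    target: str          # sentence component that receives the sentiment

# One entry each, taken from the slide's examples.
LEXICON = {("excellent", "JJ"): LexiconEntry("excellent", "JJ", "+")}
PATTERNS = {"impress": SentimentPattern("impress", "+", "PP(by;with)")}

def lookup(word, pos):
    """Return the sentiment category of a word, or None if it is not listed."""
    entry = LEXICON.get((word.lower(), pos))
    return entry.sent_category if entry else None

print(lookup("Excellent", "JJ"), PATTERNS["impress"].target)
```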
SA (contd.)
 Identify sentences containing feature terms
 Ternary expressions (T-expressions)
- Positive/negative sentiment verbs
<target, verb, “”>
- Trans verbs (verbs that transfer the sentiment of the source to the target)
<target, verb, source>
 Binary expressions (B-expressions)
- <adjective, target>
SA (contd.)
 Identify sentiment phrases within subject,
object phrases
 Associating sentiment with the target
- Based on sentiment patterns
“I was impressed by the flash capabilities”
“This camera takes excellent pictures”
- Based on B-expressions
“Poor performance in a dark room”
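The B-expression case can be sketched as follows. The code assumes POS-tagged input, treats every adjective directly followed by a noun as an <adjective, target> pair, and looks the adjective up in a tiny hand-made lexicon; all names and the two lexicon entries are illustrative assumptions rather than the Sentiment Analyzer's actual rules.

```python
# Minimal stand-in for the sentiment lexicon (assumed entries).
LEXICON = {"excellent": "+", "poor": "-"}

def b_expressions(tagged):
    """Return <adjective, target> pairs: an adjective directly before a noun."""
    return [(w1.lower(), w2.lower())
            for (w1, t1), (w2, t2) in zip(tagged, tagged[1:])
            if t1 == "JJ" and t2.startswith("NN")]

def target_sentiments(tagged):
    """Map each target to the sentiment of its adjective, if it is polar."""
    return {target: LEXICON[adj]
            for adj, target in b_expressions(tagged)
            if adj in LEXICON}

# Hand-tagged example from the slide.
tagged = [("Poor", "JJ"), ("performance", "NN"), ("in", "IN"),
          ("a", "DT"), ("dark", "JJ"), ("room", "NN")]
pairs = b_expressions(tagged)
sentiments = target_sentiments(tagged)
print(pairs, sentiments)
```

Here "dark room" also forms a B-expression, but "dark" is not in the lexicon, so only "performance" receives a sentiment.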
Other issues
 Position of the sentiment words
- Words at the beginning and end of a review
 Sentiment about the characters in the movie
versus Sentiment about the actors in the
movie – abstraction.
“He played the role of a very corrupt politician”
“He played the role brilliantly”
Conclusion
 Sentiment detection can be used in areas
ranging from marketing research to movie
reviews.
 Sentiment Detection is a “hard” problem due
to context-sensitivity, complex sentences, etc.
 Statistical methods should be augmented
with NLP techniques.
References
 Yi, Nasukawa, et al. Sentiment Analyzer: Extracting
Sentiments about a Given Topic using NLP
techniques. Proceedings of the Third IEEE
International Conference on Data Mining, p. 427, Nov
19-22, 2003
 Peter D. Turney. Thumbs Up or Thumbs Down?
Semantic Orientation Applied to Unsupervised
Classification of Reviews. Proceedings of the 40th
Annual Meeting of ACL, p. 417-424, 2002
 Matthew Hurst and Kamal Nigam. Retrieving Topical
Sentiments from Online Document Collections.
Document Recognition and Retrieval XI, p. 27-34,
2004
References (contd.)
 B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up?
Sentiment classification using Machine Learning
techniques. Proceedings of the 2002 ACL EMNLP
Conference, p. 79-86, 2002
Project
 Sentiment analyzer for a specific domain
 Given a set of features and an initial list of polar words
 Learns new polar words from the documents it
analyzes