Sentiment analysis overview in the text area (Yuanyuan Liu)

Sentiment analysis
• Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information from source materials. Sentiment analysis is widely applied to reviews and social media for a variety of applications, ranging from marketing to customer service.
• Generally speaking, sentiment analysis aims to determine the attitude of a speaker or writer with respect to some topic, or the overall contextual polarity of a document. The attitude may be a judgment or evaluation (see appraisal theory), an affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).

Introduction
• Goal
• Granularity: document level, paragraph level, sentence level, feature/aspect level
• Evaluation: accuracy (precision and recall)

Methods
• Knowledge-based techniques (a minimal sketch appears at the end of this overview):
  • classify text by affect categories based on the presence of unambiguous affect words such as happy, sad, afraid, and bored;
  • assign arbitrary words a probable "affinity" to particular emotions.
• Statistical methods
• Machine learning
• Hybrid approaches

Measures using ML
• Classifiers (compared in a sketch at the end of this overview):
  • Naïve Bayes
  • Maximum Entropy (MaxEnt)
  • Feature-based SVM
  • …
• Neural networks:
  • Recurrent neural network (RNN)
  • Convolutional neural network (CNN)
  • …
• Deep memory network and attention model

Sentiment Lexicons
• GI (The General Inquirer)
• LIWC (Linguistic Inquiry and Word Count)
• MPQA Subjectivity Cues Lexicon
• Bing Liu Opinion Lexicon
• SentiWordNet

Naïve Bayes
• Assign to a given document d the class c* = arg max_c P(c | d).
• Assumption: the features f_i are conditionally independent given d's class, so that P(c | d) ∝ P(c) · ∏_i P(f_i | c).

Naïve Bayes
• Advantages:
  • simple.
• Disadvantages:
  • its conditional independence assumption clearly does not hold in real-world situations.

MaxEnt
• MaxEnt model: P(c | d) = (1 / Z(d)) · exp(∑_i λ_{i,c} F_{i,c}(d, c)), where Z(d) is a normalization function, the F_{i,c} are class-specific feature functions, and the λ_{i,c} are feature weights.

MaxEnt
• Advantages:
  • MaxEnt makes no assumptions about the relationships between features, and so might potentially perform better when the conditional independence assumptions are not met.
• Disadvantages:
  • a lot of computation.
• Tutorial by Adam Berger:
  • http://www.cs.cmu.edu/afs/cs/user/aberger/www/html/tutorial/tutorial.html

SVM
• Find a hyperplane that separates the classes and maximizes the margin.

Accuracy comparison
• Datasets: movie reviews from the Internet Movie Database (IMDb).

Papers
• Survey:
  • Thumbs up? Sentiment Classification using Machine Learning Techniques (Pang & Lee)
  • Opinion Mining and Sentiment Analysis (Pang & Lee)
  • Comprehensive Review of Opinion Summarization (Kim et al.)
  • New Avenues in Opinion Mining and Sentiment Analysis (Cambria et al.)

RNN
• A recurrent neural network (RNN) is a class of artificial neural network where connections between units form a directed cycle. This creates an internal state of the network, which allows it to exhibit dynamic temporal behavior (a minimal RNN step is sketched at the end of this overview).
• Applications:
  • handwriting recognition
  • speech recognition

RNN (figure slides)

CNN
• A convolutional neural network (CNN, or ConvNet) is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex.
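To make the knowledge-based approach from the Methods slide concrete, here is a minimal sketch of a lexicon-based polarity scorer. The tiny word lists are illustrative stand-ins, not entries from GI, LIWC, or the other lexicons listed above.

```python
# Minimal knowledge-based sentiment scorer: classify by the presence of
# unambiguous affect words (toy word lists, not a real lexicon).
POSITIVE = {"happy", "great", "good", "excellent", "wonderful"}
NEGATIVE = {"sad", "afraid", "bored", "bad", "dreadful"}

def lexicon_polarity(text: str) -> str:
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# The positive and negative cues cancel out here, which is exactly the kind
# of sentence that motivates the aspect-level analysis later in this deck.
print(lexicon_polarity("great food but the service was dreadful"))  # neutral
```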
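The three classical classifiers from the "Measures using ML" slide can be compared side by side. This is a minimal sketch assuming scikit-learn is available; the toy corpus stands in for the IMDb reviews of the accuracy-comparison slide, and logistic regression plays the role of the MaxEnt classifier.

```python
# Compare Naive Bayes, MaxEnt (logistic regression), and a linear SVM on a
# toy review corpus; a sketch, not the IMDb experiment from the slides.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

reviews = [
    "a great and moving film with excellent acting",
    "wonderful story, I loved this movie",
    "dull plot and terrible acting",
    "a boring, disappointing waste of time",
]
labels = ["pos", "pos", "neg", "neg"]

classifiers = {
    "Naive Bayes": MultinomialNB(),
    "MaxEnt": LogisticRegression(),
    "SVM": LinearSVC(),
}

for name, clf in classifiers.items():
    model = make_pipeline(CountVectorizer(), clf)  # bag-of-words features
    model.fit(reviews, labels)
    print(name, model.predict(["what a wonderful, moving story"]))
```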
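The "directed cycle" in the RNN slide amounts to feeding the hidden state back in at every time step. Below is a minimal NumPy sketch of the standard tanh update for a vanilla RNN cell; this is one common formulation, not an architecture taken from the slides.

```python
# One vanilla RNN cell applied over a sequence: the hidden state h carries
# the network's internal state from step to step (the "directed cycle").
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden = 4, 8
W_x = rng.normal(scale=0.1, size=(d_hidden, d_in))      # input-to-hidden weights
W_h = rng.normal(scale=0.1, size=(d_hidden, d_hidden))  # recurrent weights
b = np.zeros(d_hidden)

def rnn_forward(inputs):
    h = np.zeros(d_hidden)                # initial internal state
    for x_t in inputs:                    # one step per sequence element
        h = np.tanh(W_x @ x_t + W_h @ h + b)
    return h                              # summary of the whole sequence

sequence = rng.normal(size=(5, d_in))     # toy sequence of 5 input vectors
print(rnn_forward(sequence).shape)        # (8,)
```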
CNN (figure slide)

Aspect Level Sentiment Classification with Deep Memory Network
Duyu Tang, Bing Qin, Ting Liu

Motivation
• Drawbacks of conventional neural models:
  • they capture context information in an implicit way, and are incapable of explicitly exhibiting important context clues of an aspect;
  • expensive computation.
• Intuition: only some subset of the context words are needed to infer the sentiment towards an aspect.
  • E.g. "great food but the service was dreadful!"

Background: memory network
• Question answering
• Central idea: inference with a long-term memory component
• Components:
  • memory m: an array of objects
  • I: converts input to an internal feature representation
  • G: updates old memories with new input
  • O: generates an output representation given a new input and the current memory state
  • R: outputs a response based on the output representation

Background: attention model
• One important property of human perception is that one does not tend to process a whole scene in its entirety at once.
• Instead, humans focus attention selectively on parts of the visual space to acquire information when and where it is needed, and combine information from different fixations over time to build up an internal representation of the scene, guiding future eye movements and decision making.

Deep memory network model
• Sentence s = {w1, w2, …, wi, …, wn}, containing the aspect word wi.
• Word embedding matrix L ∈ R^{d×|V|}, where d is the dimension of a word vector and |V| is the vocabulary size; each word wi is mapped to its embedding ei, a column of L.
• Task: determine the sentiment polarity of sentence s towards the aspect wi.

Overview of the approach
• Figure 1: An illustration of the deep memory network with three computational layers (hops) for aspect level sentiment classification.

Attention model
• Content attention
• Location attention

Content attention
• Intuition:
  • context words do not contribute equally to the semantic meaning of a sentence;
  • the importance of a word should be different if we focus on different aspects.

Content attention
• Input: the external memory m = {m1, …, mn} and the aspect vector v_aspect.
• Output: a weighted sum vec = ∑_i α_i m_i, where m_i is a piece of memory m and α_i ∈ [0,1] is the weight of m_i, with ∑_i α_i = 1.

Calculation of αi
• Softmax function: α_i = exp(g_i) / ∑_{j=1}^{n} exp(g_j),
• where g_i scores how well memory piece m_i matches the aspect, computed with a feed-forward layer: g_i = tanh(W_att [m_i; v_aspect] + b_att).

Location attention
• Intuition: a context word closer to the aspect should be more important than a farther one.

Location attention: model 1
• The memory vector m_i scales the word embedding e_i element-wise by a location vector v_i ∈ R^{d×1} for word w_i, computed from n (the sentence length), k (the hop number) and l_i (the location of w_i).

Location attention: model 2
• The memory vector m_i again scales e_i by a location vector v_i ∈ R^{d×1} for word w_i, as a simplified variant of model 1.

Location attention: model 3
• The memory vector m_i: v_i is regarded as a parameter and learned during training.

Location attention: model 4
• Different from model 3, the location representations are regarded as neural gates that control what fraction of each word's semantics is written into the memory.

The Need for Multiple Hops
• Computational models that are composed of multiple processing layers have the ability to learn representations of data with multiple levels of abstraction.
• In this work, the attention in a single layer is essentially a weighted-average compositional function, which is not powerful enough to handle sophisticated phenomena such as negation, intensification and contrast in language.

Aspect level sentiment classification
• Regard the output vector of the last hop as the feature, and feed it to a softmax layer for aspect level sentiment classification (see the sketch below).
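A minimal NumPy sketch of the pipeline above, under stated assumptions: the scorer g_i = tanh(W_att [m_i; v_aspect] + b_att) follows the content-attention slides; the per-word location factor 1 − |i − aspect| / n is an illustrative stand-in for the location models, whose exact formulas were on figure slides; the per-hop linear transform of the aspect-side vector follows Figure 1. Weights are random, so the output is untrained.

```python
# Deep memory network for aspect-level sentiment: location-weighted memory,
# content attention per hop, multiple hops, softmax classifier (untrained sketch).
import numpy as np

rng = np.random.default_rng(0)
d, n_classes, n_hops = 6, 3, 3

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_hop(memory, v_aspect, W_att, b_att):
    """Content attention: score each memory piece against the aspect vector."""
    scores = np.array([
        np.tanh(W_att @ np.concatenate([m_i, v_aspect]) + b_att)
        for m_i in memory
    ])
    alpha = softmax(scores)        # alpha_i in [0,1], sums to 1
    return alpha @ memory, alpha   # weighted sum: vec = sum_i alpha_i * m_i

# Toy sentence: "great food but the service was dreadful", aspect = "service".
words = ["great", "food", "but", "the", "service", "was", "dreadful"]
n, aspect_idx = len(words), 4
embeddings = rng.normal(size=(n, d))   # stand-in word embeddings e_i
v_aspect = embeddings[aspect_idx]

# Location weighting (assumed simple form): closer words keep more weight.
location = np.array([1 - abs(i - aspect_idx) / n for i in range(n)])
memory = embeddings * location[:, None]   # m_i = e_i scaled by its location factor

# Multiple hops: each hop attends over the memory and updates the aspect-side vector.
W_att, b_att = rng.normal(size=2 * d), 0.0
W_hop = rng.normal(scale=0.1, size=(d, d))  # linear transform between hops
vec = v_aspect
for _ in range(n_hops):
    out, alpha = attention_hop(memory, vec, W_att, b_att)
    vec = out + W_hop @ vec    # attention output plus transformed input

# Final classification: output of the last hop fed to a softmax layer.
W_cls = rng.normal(size=(n_classes, d))
print("attention weights:", np.round(alpha, 2))
print("class probabilities:", np.round(softmax(W_cls @ vec), 2))
```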
Loss function
• Training means minimizing the cross-entropy error of sentiment classification over the training data.

Experiments
• Datasets: from SemEval 2014.

Comparison to other methods
• accuracy
• runtime

Effects of location attention (figure slide)

Visualize Attention Models (figure slide)

Error Analysis
• 1. Non-compositional sentiment expressions.
  • E.g. "dessert was also to die for!"
• 2. Complex aspect expressions consisting of many words.
  • E.g. "ask for the round corner table next to the large window."
• 3. Sentimental relations between context words, such as negation, comparison and condition.
  • E.g. "but dinner here is never disappointing, even if the prices are a bit over the top".

Conclusion
• Developed deep memory networks that capture the importance of context words for aspect level sentiment classification.
• Leveraged both content and location information.
• Showed that using multiple computational layers in the memory network obtains improved performance.

Thanks