Index Copernicus Value: 56.65 | Volume 5, Issue 03, March 2017 | Pages 6266-6273 | ISSN(e): 2321-7545
Website: http://ijsae.in
DOI: http://dx.doi.org/10.18535/ijsre/v5i03.04

New Avenues in Opinion Mining: Considering Dual Sentiment Analysis

Authors: Pankaj R Chandre, Rohan Raj, Himanshu Raj
Dept. of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India
Email Id - [email protected], [email protected]

ABSTRACT
In the modern world, owing to the advancement and outreach of technology, ease of access to products and services is growing immensely. Subjective attitudes (i.e., sentiments) are now expressed about everything from products, news, and movies to social networking media. The market no longer values only expert opinion; the reviews of the masses have taken on equal importance, as they are the ones using the products and services. To improve products, this input must be understood and the data analyzed with proper machine learning techniques along with natural language processing, in order to draw conclusions and comprehend the overall situation. Topic-based text classification built on the Bag-of-Words model has some fundamental inadequacies, even though various algorithms and classifiers (such as naïve Bayes and support vector machines) already analyze sentiments and give categorical feedback as a generic output. The polarity shift problem restricts the performance of these existing models. To address this problem for sentiment classification, Dual Sentiment Analysis (DSA) has been expanded from a two-class to a three-class classification that also considers neutral reviews in the dataset, for better accuracy and understanding. For each training and test review, a novel data expansion technique is proposed that uses opposite class labels of positive and negative sentiments in one-to-one correspondence for dual training and dual prediction algorithms.
A corpus-method-based pseudo-antonym dictionary is also proposed, to remove the single-language (English) restriction and to maintain domain consistency, since it pairs up words on the basis of sentiment strength.

Keywords: Natural Language Processing, Bag-of-Words, Machine Learning, Dual Sentiment Analysis, Opinion Mining, Naïve Bayes, Support Vector Machines, Dataset, Polarity Shift, Corpus Method

1. INTRODUCTION
Natural language processing, text analysis, and computational linguistics are used to identify and extract subjective information from source materials. This is sentiment analysis, which is widely applied to reviews and social media for a variety of applications ranging from marketing to customer service. Analyzers are used for polarity identification. Analyzers are of two types: manual (domain-oriented) and automatic (generalized). We use the domain-oriented kind in our methodology. In a manual analyzer, a predefined dataset exists into which similar or related terms are fed, producing the result. Sentiment analysis classifies polarity, and the sentiment analyzer determines whether the opinion expressed is positive, negative, or neutral [1]. A model called Dual Sentiment Analysis (DSA) addresses this polarity problem for sentiment classification. We first propose a novel data expansion technique that creates a sentiment-reversed review for every training and test review. On this basis, we propose a dual training algorithm that uses original and reversed training reviews in pairs for learning a sentiment classifier, and a dual prediction algorithm that classifies test reviews by considering two sides of one review. Sentiment analysis extracts the opinion of the user from the text document, identifying the orientation of opinions in the text, e.g., "This movie was awesome" (positive); "This was boring" (negative).
Sentiment analysis and opinion mining, as a special text mining task for determining the subjective attitude (i.e., sentiment) expressed by a text, is becoming a hotspot in the fields of data mining and natural language processing. Opinions and related concepts such as sentiments, evaluations, attitudes, and emotions are the subjects of study of sentiment analysis and opinion mining. The inception and rapid growth of the field coincide with those of social media on the Web, e.g., reviews, forum discussions, blogs, microblogs, Twitter, and social networks, because for the first time in human history we have a huge volume of opinionated data recorded in digital form, and the future looks promising with the inception and expansion of cloud computing, business analytics, data science, and related topics. Since the early 2000s, sentiment analysis has grown, and is still growing, into one of the most active research areas in natural language processing. Data mining, Web mining, and text mining all have wide applications relating to sentiment analysis. There are three types of semantic orientation for any review: positive, negative, or neutral, though some reviews are too ambiguous to ascertain. We examined the effect of valence shifters on classifying reviews. Three types of valence shifters are scrutinized: negations, intensifiers, and diminishers. Negations reverse the semantic polarity of a particular term, while intensifiers and diminishers respectively increase and decrease the degree to which a term is positive or negative. Sentiment classification is a basic task in sentiment analysis: to classify the sentiment (e.g., positive or negative) of a given text. The bag-of-words (BOW) model is typically used for text representation. In the BOW model, a review text is represented by a vector of independent words, on which statistical machine learning algorithms (such as naïve Bayes, the maximum entropy classifier, and support vector machines) are then trained as sentiment classifiers.
The organization of this paper is as follows. Section 2 reviews the related work. Section 3 presents the proposed system, introducing the DSA framework in detail along with the two methods for constructing an antonym dictionary; the experimental scope is also discussed there. Section 4 draws conclusions and outlines directions for future work.

2. RELATED WORK
There are four granularities of sentiment analysis: document-level, sentence-level, phrase-level, and aspect-level. Phrase/subsentence- and aspect-level sentiment analysis are affected by complex polarity shift. T. Wilson et al. [13] began with a lexicon of words with established prior polarities and identified the "contextual polarity" of phrases based on some refined annotations. Choi and Cardie [15] use different kinds of negators to improve subsentential sentiment analysis. Nakagawa et al. [10] developed a supervised model for subsentential analysis that ascertains inter-node polarity in the dependency graph. Term-counting and machine learning methods are the two main approaches to document- and sentence-level sentiment classification. In term-counting, based on manually collected or external lexical resources [20], [22], content words are scored and the review's overall orientation is the total orientation score. In machine learning methods, sentiment classification is treated as a statistical problem: a text is represented by a bag of words, and supervised machine learning algorithms are applied as classifiers [21]. Machine learning methods have generally proven more effective than term-counting methods in the sentiment classification literature. In term-counting methods, the sentiment of polarity-shifted words can simply be reversed and then summed up, but this is relatively more tedious in the bag-of-words model.
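To make this limitation concrete, the following Python sketch (illustrative only; it is not from the paper, and the toy vocabulary is made up) shows that under a bag-of-words representation a review and its negated form differ in only one coordinate, so "like" contributes identically to both:

```python
from collections import Counter

def bow_vector(text, vocabulary):
    """Represent a text as word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocabulary]

# Hypothetical toy vocabulary for illustration.
vocab = ["i", "don't", "like", "this", "movie"]

positive = bow_vector("I like this movie", vocab)        # [1, 0, 1, 1, 1]
negated  = bow_vector("I don't like this movie", vocab)  # [1, 1, 1, 1, 1]
# The "like" coordinate is identical in both vectors; only the presence
# of "don't" distinguishes them, which is why a BOW or term-counting
# classifier easily misreads the negated review as positive.
```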
Das and Chen [6] proposed a method of simply attaching "NOT" to words in the scope of negation, so that in the text "I don't like book", the word "like" becomes a new word "like-NOT". But this showed very little improvement, as reported by Pang et al. [21]. Linguistic features or lexical resources were also considered to model polarity shift. Syntactic parsing was used to capture valence shifters, which improved term-counting systems significantly but showed only marginal improvement for machine learning systems. Li and Huang [12] put forward the bifurcation of each sentence into a polarity-shifted and a polarity-unshifted part, representing them as two bags of words. Classification models are then trained and combined to project the final polarity.

Table 1: An Example of Creating Reversed Training Reviews
  Review            Text                                          Class
  Original review   I don't like to eat Chinese. It tastes bad.   Negative
  Reversed review   I like to eat Chinese. It tastes good.        Positive

In this paper we extend the previous work in three major aspects: first, a selective data expansion procedure is added; second, the DSA framework is extended to consider neutral sentiments as well; finally, a corpus-based data dictionary is constructed to remove external dependencies.

3. PROPOSED SYSTEM
3.1 Data Expansion Technique
In the field of natural language processing and text mining, Agirre and Martinez [24] proposed expanding the amount of labeled data through a Web search using monosemous synonyms or unique expressions in definitions from WordNet for the task of word sense disambiguation. Fujita and Fujino [7] put forth expanding training data from an external dictionary. Rui Xia et al. [1] proposed constructing original and reversed reviews in one-to-one correspondence; the dataset was then further expanded at both the training and test stages. Rui Xia et al. [1] also proposed a two-step data expansion technique based on an antonym dictionary.
After detecting negation, text reversion is performed, in which all sentiment words outside the scope of negation are reversed to their antonyms. Label reversion is then applied to reverse the class label. Table 1 gives an example of this proposition. Given an original training review, "I don't like to eat Chinese. It tastes bad. (class: Negative)", the reversed review is obtained in three steps: 1) the sentiment word "bad" is reversed to its antonym "good"; 2) the negation word "don't" is removed; since "like" is in the scope of negation, it is not reversed; 3) the class label is reversed from Negative to Positive. Note that in data expansion for the test dataset, we conduct only text reversion. A machine-generated sentiment-reversed review may not read as naturally as a human-generated sentence, so grammatical quality should be maintained; considering this problem, a weight or numerical value could be attached to words on the basis of their sentiment strength.

3.2 Dual Training
In this stage, two training sets are created. The original training samples are referred to as the original training set. The original training samples are then reversed to their opposites, forming the reversed training set. One-to-one correspondence is always maintained between the original and reversed reviews. The classifier is designed by maximizing a combination of the likelihoods of the original and reversed training samples. This process is called dual training [1]. We use the Naïve Bayes classifier to derive the dual training algorithm. Along with Naïve Bayes, a logistic regression model and support vector machines will be examined during the experiment. Naïve Bayes uses a combined probability for training parameters, whereas logistic regression uses a log probability function and SVMs optimize a combined hinge loss function.
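The text- and label-reversion steps of Section 3.1 can be sketched as follows. This is a minimal illustration: the antonym dictionary is a tiny hypothetical one, and the negation scope is naively taken to be the single word following the negator; a real system would need proper scope detection.

```python
# Minimal sketch of text reversion and label reversion.
ANTONYMS = {"bad": "good", "good": "bad", "like": "dislike", "dislike": "like"}
NEGATORS = {"don't", "not", "never"}

def reverse_review(text, label):
    tokens = text.lower().split()
    out, i = [], 0
    while i < len(tokens):
        if tokens[i] in NEGATORS:
            # Remove the negator; the word in its scope is NOT reversed
            # (e.g. "don't like" becomes just "like").
            if i + 1 < len(tokens):
                out.append(tokens[i + 1])
            i += 2
        else:
            # Words outside negation scope are swapped for their antonym.
            out.append(ANTONYMS.get(tokens[i], tokens[i]))
            i += 1
    flipped = {"negative": "positive", "positive": "negative"}.get(label, label)
    return " ".join(out), flipped

rev, lab = reverse_review("i don't like to eat chinese . it tastes bad .",
                          "negative")
# rev == "i like to eat chinese . it tastes good ."; lab == "positive"
```

Dual training then fits one classifier on the union of the original and reversed sets, keeping each original/reversed pair in one-to-one correspondence.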
First, we determine the training set and test set in the dataset by considering the basic formula of the Naïve Bayes theorem:

  P(C|x) = P(x|C) P(C) / P(x)

where P(C) is the prior probability of class C, P(x) is the prior probability of the data x, P(C|x) is the probability of C given x, and P(x|C) is the probability of x given C. Second, we convert the data into a frequency table. As the basic formula requires the prior probability of each class, we compute it as

  P(C) = N_C / N

where N is the total number of samples in the training set and N_C is the number of training samples belonging to class C. Next, we compute the conditional probability (likelihood) of each word attribute:

  P(x_i|C) = count(x_i, C) / count(C)

i.e., the number of occurrences of word x_i in class C divided by the total word count of class C. Generally, we want the most probable hypothesis given the training data, so we compute the maximum a posteriori class:

  C_MAP = argmax_C P(x_1, x_2, ..., x_n | C) P(C)

Finally, we determine the class of each test sample and proceed to the prediction stage, so as to classify the reviews in three-class sentiment classification. Suppose the example in Table 1 is used as one training sample. If only the original sample ("I don't like to eat Chinese. It tastes bad.") is considered, the feature "like" will be improperly learned as a negative indicator (since the class label is Negative), ignoring the expression of negation. However, if the generated opposite sample ("I like to eat Chinese. It tastes good.") is also used for training, "like" will be learned correctly, owing to the removal of negation in sample reversion. Therefore, the procedure of dual training can correct some learning errors caused by polarity shift.

Fig. 1 Model Architecture

3.3 Dual Prediction
After training the classification model, the original and reversed test samples are used together for prediction.
We predict the test sample in terms of positive and negative degrees. Let x and x~ be the original and reversed samples. We use x~ to assist the prediction of x, rather than predicting the class of x~. This process is called dual prediction [1]. Let p(.|x) and p(.|x~) denote the posterior probabilities of x and x~, respectively. When we want to measure how positive a test review is, we consider not only how positive the original test review is (i.e., p(+|x)), but also how negative the reversed test review is (i.e., p(-|x~)). Conversely, when we measure how negative a test review is, we consider the probability of x being negative (i.e., p(-|x)) as well as the probability of x~ being positive (i.e., p(+|x~)). The dual predicting function is defined as:

  p(+|x, x~) = (1 - a) p(+|x) + a p(-|x~)
  p(-|x, x~) = (1 - a) p(-|x) + a p(+|x~)

where a (0 <= a <= 1) is a tradeoff parameter. Let us use the example given in Table 1 again to explain why dual prediction works in addressing the polarity shift problem. This time, assume that "I don't like to eat Chinese" is an original test review, and "I like to eat Chinese" is the reversed test review. In traditional BOW, "like" will contribute a high positive score when predicting the overall orientation of the test sample, despite the negation structure "don't like". Hence, it is very likely that the original test review will be misclassified as Positive. In dual prediction, owing to the removal of negation in the reversed review, "like" this time plays a positive role, so the probability of the reversed review being classified as Positive will be high. In dual prediction, a weighted combination of the two component predictions is used as the dual prediction output. In this manner, the prediction error on the original test sample can be compensated by the prediction on the reversed test sample. This can reduce some prediction errors caused by polarity shift.
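The dual predicting function can be sketched as below. The posterior values in the example are made-up numbers for illustration; a is the tradeoff parameter weighting the reversed review's contribution.

```python
def dual_predict(p_orig, p_rev, a=0.5):
    """Dual prediction: combine the posteriors of the original review x
    and its reversal x~ with tradeoff weight a in [0, 1]."""
    p_pos = (1 - a) * p_orig["pos"] + a * p_rev["neg"]
    p_neg = (1 - a) * p_orig["neg"] + a * p_rev["pos"]
    return ("pos" if p_pos >= p_neg else "neg"), p_pos, p_neg

# "I don't like to eat Chinese": a BOW classifier leans positive on the
# original because of "like", but the reversed review ("I like to eat
# Chinese") is confidently positive, so its negative complement pulls
# the combined prediction back to negative.
label, p_pos, p_neg = dual_predict({"pos": 0.6, "neg": 0.4},
                                   {"pos": 0.9, "neg": 0.1})
# label == "neg" (0.35 positive vs 0.65 negative)
```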
In the experimental study, we will extract some real examples from our experiments to demonstrate the effectiveness of both dual training and dual prediction.

3.4 Selective Data Expansion
The review example in Table 1 has a very distinct sentiment polarity, but not all cases are free from ambiguity. Such a problem could limit the use of all labeled reviews for data expansion and training. To tackle it, Rui Xia et al. [1] investigated and subsequently put forth a selective data expansion procedure that selects a subset of training reviews for data expansion. Let us use another pair of more complex examples to understand the proposed technique. Review (a): "The mobile's processor is fast, and the cost is low. It's easy to play games." Review (b): "The mobile's processor is somewhat smooth, but the cost is a bit high. It's not tough to play games." Review (a) has a very strong sentiment with a low polarity shift rate; the statement explicitly states its view as well as the degree of the sentiment, so both the original and reversed reviews are good labeling instances. In review (b), the sentiment polarity is not as distinct and unambiguous as in review (a), so creating a reversed review for review (b) is not as worthwhile as for review (a). For this purpose, Rui Xia et al. [1] proposed a sentiment degree metric for selecting the most sentiment-distinct training reviews for data expansion.

3.5 Positive-Negative-Neutral Framework for DSA
The most widely used sentiment analysis technique, polarity classification, classifies reviews as either positive or negative. But there are situations where neutral reviews also exist, and the existing DSA systems are not able to classify neutral reviews. Hence, Rui Xia et al. [1] proposed a system that gives us three-class sentiment classification. Table 2 gives an example of creating reversed reviews for sentiment-mixed neutral reviews.
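One way to realize the selective expansion of Section 3.4 is sketched below. The degree metric used here (the gap between the two class posteriors) and the threshold are illustrative assumptions, not necessarily the exact metric of Xia et al. [1]; the posterior values are made up.

```python
def sentiment_degree(posterior):
    """Gap between class posteriors: near 1 means distinct sentiment,
    near 0 means ambiguous (an illustrative metric)."""
    return abs(posterior["pos"] - posterior["neg"])

def select_for_expansion(reviews, threshold=0.5):
    """Keep only reviews distinct enough to be worth reversing."""
    return [r for r in reviews if sentiment_degree(r["posterior"]) >= threshold]

reviews = [  # made-up posteriors for reviews (a) and (b) above
    {"text": "review (a)", "posterior": {"pos": 0.95, "neg": 0.05}},
    {"text": "review (b)", "posterior": {"pos": 0.55, "neg": 0.45}},
]
selected = select_for_expansion(reviews)
# Only review (a) survives; review (b) is too ambiguous to reverse.
```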
A neutral review covers two main situations: it may be an objective text that is neither positive nor negative, or it may be an ambiguous statement mixing positive and negative text that projects a conflicting sentiment. Therefore, the reversed review's sentiment is also expected to remain neutral. The selective data expansion procedure is still used in this case, i.e., only the labeled data with high posterior probability will be used for data expansion.

Table 2: An Example of Data Expansion for Neutral Reviews
  Review            Text                                                         Class
  Original review   This Chinese soup is spicy but it tastes delicious.          Neutral
  Reversed review   This Chinese soup is sweet, but it doesn't taste delicious.  Neutral

3.6 Pseudo-Antonym Dictionary
The existing DSA system depends entirely on an external antonym dictionary, which maps each word to its opposite. An open question remains: how to construct a suitable dictionary for sentiment classification. Various external antonym dictionaries can be obtained directly from well-defined lexicons such as WordNet (http://wordnet.princeton.edu/) in English, and web-based repository hosting services such as GitHub also provide abundant resources. These databases categorize English words into synonym sets called synsets, each providing a short, general definition and recording the semantic relations between synonym sets. After analyzing a specific synonym set, we must find its corresponding antonym set from another external source such as a thesaurus. However, this still cannot guarantee domain consistency for our tasks or sample subsets. Hence, a corpus-based method was proposed by Rui Xia et al. [1] to construct a pseudo-antonym dictionary. It uses mutual information computed on the labeled training data to identify the most positive-relevant and most negative-relevant features.
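The MI-based feature ranking just described can be sketched as follows. The counts and word lists are hypothetical, and this is a sketch of the corpus-based idea rather than Xia et al.'s exact procedure.

```python
import math

def mutual_information(n_wc, n_w, n_c, n):
    """Pointwise MI between word w and class c from corpus counts:
    MI(w, c) = log( P(w, c) / (P(w) * P(c)) ), with n_wc the count of
    w in class c, n_w the count of w, n_c the size of class c, and n
    the corpus size."""
    return math.log((n_wc / n) / ((n_w / n) * (n_c / n)))

def pseudo_antonym_pairs(pos_ranked, neg_ranked):
    """Pair equally ranked positive- and negative-relevant words."""
    return list(zip(pos_ranked, neg_ranked))

# Independence baseline: if w occurs in class c exactly as often as
# chance predicts (10 = 20 * 50 / 100), MI is zero.
assert abs(mutual_information(10, 20, 50, 100)) < 1e-9

# Hypothetical MI-ranked feature lists (not from the paper's data):
pairs = pseudo_antonym_pairs(["marvelous", "affordable"],
                             ["pathetic", "annoying"])
# pairs == [("marvelous", "pathetic"), ("affordable", "annoying")]
```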
Categories are then made, and features with the same level of sentiment strength are paired as antonym words.

Table 3: Repository of Words in Sentiment Classification
  Positive   (-:   (^-^)   :-*   (o:   <3      Affordable   Marvelous
  Negative   )-:   :-/     :*(   )o:   >:o     Annoying     Pathetic
  Neutral    :|    :-o     8-)   :o-   >-<|:   Alright      Average    Yeah

Mutual Information (MI) measures the mutual dependence of any two random variables and is a common feature selection method in text categorization and sentiment classification [14]. The ranking and relevance of the positive and negative groups are calculated with the MI metric. Equally ranked positive-relevant and negative-relevant words are matched as a pair of antonym words, which gives us a pseudo-antonym dictionary. A pseudo dictionary does not match words on the basis of their exact meaning; rather, it matches them on the basis of the contextual sentiment strength they possess, so semantic meaning supersedes syntactic meaning. Such dictionaries are language-independent and domain-adaptive as well. Thus, the DSA model is widely applicable irrespective of the availability of lexical antonym dictionaries across different languages and domains.

4. CONCLUSION
In this paper, we focus on creating reversed reviews to assist supervised sentiment classification, which gives us an insight into sentiment analysis and opinion mining. Previous experiments demonstrate the effectiveness of the DSA model for polarity classification, where it significantly outperforms several alternatives. The use of a corpus-based pseudo-antonym dictionary offers a very practical approach when constrained by limited lexical resources and domain knowledge. We plan to conduct a wide range of practical experiments on real-world datasets from different commercial and research sources such as Amazon, Princeton University, Flipkart, and others.
This will help us explore new methods and possibilities based on the analytics drawn from the experimental results and will also provide a base for our future endeavors. In the future, we plan to integrate this model with real-time web-based applications and various social media to test its diverse applicability. Furthermore, complex polarity shift patterns with vaguer meanings and incompatible or conflicting sentiments will be considered.

REFERENCES
1. R. Xia, F. Xu, C. Zong, Q. Li, Y. Qi, and T. Li, "Dual sentiment analysis: Considering two sides of one review," IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 27, no. 8, Aug. 2015.
2. N. Tian, Y. Xu, Y. Li, A. Abdel-Hafez, and A. Josang, "Product feature taxonomy learning based on user reviews," Proceedings of the 10th International Conference on Web Information Systems and Technologies (WEBIST), 2014.
3. R. Xia, T. Wang, X. Hu, S. Li, and C. Zong, "Dual training and dual prediction for polarity classification," Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 521-525, 2013.
4. R. Xia, T. Wang, X. Hu, S. Li, and C. Zong, "Dual training and dual prediction for polarity classification," Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 521-525, 2013.
5. E. Cambria and A. Hussain, Sentic Computing: Techniques, Tools, and Applications, Springer, 2012.
6. A. Abbasi, S. France, Z. Zhang, and H. Chen, "Selecting attributes for sentiment classification using feature relation networks," IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 23, no. 3, pp. 447-462, 2011.
7. S. Fujita and A. Fujino, "Word sense disambiguation by combining labeled data expansion and semi-supervised learning method," Proceedings of the International Joint Conference on Natural Language Processing (IJCNLP), pp. 676-685, 2011.
8. I. Councill, R. McDonald, and L. Velikovich, "What's great and what's not: Learning to classify the scope of negation for improved sentiment analysis," Proceedings of the Workshop on Negation and Speculation in Natural Language Processing, pp. 51-59, 2010.
9. S. Li, S. Lee, Y. Chen, C. Huang, and G. Zhou, "Sentiment classification and polarity shifting," Proceedings of the International Conference on Computational Linguistics (COLING), 2010.
10. T. Nakagawa, K. Inui, and S. Kurohashi, "Dependency tree-based sentiment classification using CRFs with hidden variables," Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), pp. 786-794, 2010.
11. X. Ding, B. Liu, and L. Zhang, "Entity discovery and assignment for opinion mining applications," Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2009.
12. S. Li and C. Huang, "Sentiment classification considering negation and contrast transition," Proceedings of the Pacific Asia Conference on Language, Information and Computation (PACLIC), 2009.
13. T. Wilson, J. Wiebe, and P. Hoffmann, "Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis," Computational Linguistics, vol. 35, no. 3, pp. 399-433, 2009.
14. S. Li, R. Xia, C. Zong, and C. Huang, "A framework of feature selection methods for text categorization," Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 692-700, 2009.
15. Y. Choi and C. Cardie, "Learning with compositional semantics as structural inference for subsentential sentiment analysis," Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 793-801, 2008.
16. Y. Choi and C. Cardie, "Learning with compositional semantics as structural inference for subsentential sentiment analysis," Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 793-801, 2008.
17. V. Ng, S. Dasgupta, and S. Arifin, "Examining the role of linguistic knowledge sources in the automatic identification and classification of reviews," Proceedings of the International Conference on Computational Linguistics and Annual Meeting of the Association for Computational Linguistics (COLING/ACL), pp. 611-618, 2006.
18. A. Kennedy and D. Inkpen, "Sentiment classification of movie reviews using contextual valence shifters," Computational Intelligence, vol. 22, pp. 110-125, 2006.
19. M. Gamon, "Sentiment classification on customer feedback data: Noisy data, large feature vectors, and the role of linguistic analysis," Proceedings of the International Conference on Computational Linguistics (COLING), pp. 841-847, 2004.
20. P. Turney and M. L. Littman, "Measuring praise and criticism: Inference of semantic orientation from association," ACM Transactions on Information Systems (TOIS), vol. 21, no. 4, pp. 315-346, 2003.
21. B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up?: Sentiment classification using machine learning techniques," Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 79-86, 2002.
22. P. Turney, "Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews," Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2002.
23. S. Das and M. Chen, "Yahoo! for Amazon: Extracting market sentiment from stock message boards," Proceedings of the Asia Pacific Finance Association Annual Conference, 2001.
24. E. Agirre and D. Martinez, "Exploring automatic word sense disambiguation with decision lists and the Web," Proceedings of the COLING Workshop on Semantic Annotation and Intelligent Content, pp. 11-19, 2000.
25. R. Mihalcea and D. Moldovan, "An automatic method for generating sense tagged corpora," Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 1999.