Survey
A SURVEY ON SENTIMENT ANALYSIS
CHARU NANDA, BANASTHALI VIDYAPITH, JAIPUR, RAJASTHAN
MOHIT DUA, NIT KURUKSHETRA, HARYANA

AGENDA
• INTRODUCTION
• ABOUT NLP
• TASKS OF NLP
• SENTIMENT ANALYSIS
• APPROACHES
• OPINION MINING
• CLASSES OF SA
• LEVELS OF SA
• WORK DONE
• RELATED WORK
• CONCLUSION AND FUTURE SCOPE
• REFERENCES

INTRODUCTION
NLP IS A SUBFIELD OF ARTIFICIAL INTELLIGENCE AND HELPS IN THE INTERACTION BETWEEN COMPUTERS AND HUMANS USING NATURAL LANGUAGES.
SENTIMENT ANALYSIS IS AN “APPLICATION OF NLP” WHICH HELPS IN PROCESSING LARGE VOLUMES OF DATA IN THE FORM OF HUMAN OPINIONS OR REVIEWS ABOUT PARTICULAR THINGS OR OBJECTS ON THE WEB AND CATEGORIZING THEM INTO DIFFERENT CLASSES.

ABOUT NLP
NATURAL LANGUAGE PROCESSING (NLP) IS A FIELD OF COMPUTER SCIENCE AND COMPUTATIONAL LINGUISTICS CONCERNED WITH THE INTERACTION BETWEEN COMPUTER SYSTEMS AND HUMAN BEINGS.
NLP IS CONSIDERED A SUBFIELD OF ARTIFICIAL INTELLIGENCE (AI) AND IS RELATED TO HUMAN-COMPUTER INTERACTION.
AI: Theory and development of computer systems able to perform tasks requiring human intelligence.
LINGUISTICS: Study of language and its structure.
[DIAGRAM: AI, LINGUISTICS, NLP]

THE APPROACHES TO NLP ARE BASED ON MACHINE LEARNING, A TYPE OF ARTIFICIAL INTELLIGENCE THAT USES PATTERNS IN DATA TO IMPROVE UNDERSTANDING.
AN ADVANTAGE OF MACHINE LEARNING ALGORITHMS IS THAT THEY FOCUS ON THE MOST COMMON CASES.
WHEN PROVIDED WITH LARGE AMOUNTS OF DATA, SYSTEMS BASED ON AUTOMATIC LEARNING BECOME MORE ACCURATE.

TASKS OF NLP
1. MACHINE TRANSLATION
2. SENTIMENT ANALYSIS
3. OPTICAL CHARACTER RECOGNITION
4. DISCOURSE ANALYSIS
5. PART-OF-SPEECH TAGGING
6. SPEECH RECOGNITION
7. WORD SENSE DISAMBIGUATION
8. STEMMING
9. QUESTION ANSWERING
10. PARSING…

SENTIMENT ANALYSIS
“PROCESS OF IDENTIFYING AND CATEGORIZING OPINIONS EXPRESSED IN A PIECE OF TEXT, ESPECIALLY IN ORDER TO DETERMINE WHETHER THE WRITER'S ATTITUDE TOWARDS A PARTICULAR TOPIC, PRODUCT, ETC.
IS POSITIVE, NEGATIVE, OR NEUTRAL.”
SENTIMENT ANALYSIS IS THE PROCESS OF EXTRACTING INFORMATION, USUALLY FROM A SET OF DOCUMENTS AND OFTEN FROM ONLINE REVIEWS, TO DETERMINE THE POLARITY OF OPINIONS ABOUT OBJECTS.

SENTIMENTS APPEAR LIKE:
• I LIKE IT: POSITIVE
• IT WAS HORRIBLE: NEGATIVE
• IT WAS JUST ONE TIME WATCH: NEUTRAL

POSITIVE WORDS    NEGATIVE WORDS    NEUTRAL WORDS
Good              Bad               May be
Great             Boring            Not sure
Awesome           Disappointing     It could be
Beautiful         Worst             I don’t know
Superb            Horrible          So so
Marvelous         Unhappy           Maybe yes maybe no

APPROACHES
EARLIER APPROACHES
1. MEETINGS
2. INTERVIEWS
3. QUESTION ANSWERING SESSIONS
CURRENT APPROACHES
1. BLOGS
2. SOCIAL MEDIA WEBSITES
3. FORUMS

“THE CLASSIFICATION OF SENTIMENT ANALYSIS APPROACHES CAN BE CATEGORIZED AS MACHINE LEARNING WHERE IT FOCUSES ON THE DEVELOPMENT OF COMPUTER PROGRAMS THAT CAN TEACH THEMSELVES TO GROW AND CHANGE WHEN EXPOSED TO NEW DATA, OTHER IS LEXICON BASED WHICH RESTS ON THE IDEA THAT AN IMPORTANT PART OF LEARNING A LANGUAGE CONSISTS OF BEING ABLE TO UNDERSTAND AND PRODUCE LEXICAL PHRASES AS CHUNKS AND THE LAST ONE IS HYBRID APPROACH WHICH IS THE COLLECTION OF ABOVE MENTIONED TWO APPROACHES.”

OPINION MINING
OPINION MINING IS A SYNONYM OF SENTIMENT ANALYSIS.
AS THE NAME SUGGESTS, IT REFERS TO MINING THE OPINIONS OF PEOPLE ABOUT DIFFERENT THINGS.
IT IS A TYPE OF TEXT MINING WHICH CLASSIFIES TEXT INTO DIFFERENT CLASSES ACCORDING TO ITS POLARITY.
A VOLUMINOUS AMOUNT OF DATA IS PRESENT ON THE WEB AND IS CONTINUOUSLY INCREASING. PROCESSING THIS DATA HAS BECOME A NECESSITY TODAY.

OPINIONS
• Implicit & Explicit
• Regular & Comparative

EXPLICIT AND IMPLICIT
EXPLICIT DATA IS INFORMATION THAT IS PROVIDED INTENTIONALLY, SUCH AS THROUGH REGISTRATION FORMS.
IMPLICIT DATA IS INFORMATION THAT IS NOT PROVIDED INTENTIONALLY BUT IS GATHERED, EITHER DIRECTLY OR THROUGH ANALYSIS OF EXPLICIT DATA.
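
The three polarity classes and the word lists given earlier can be illustrated with a minimal sketch of a lexicon-style classifier. This is an illustration only, not the method of any surveyed paper. It assumes that "lexicon based" means looking words up in a hand-made polarity word list. The name `classify_polarity` is hypothetical, and the word "like" is an added assumption, taken from the example sentence rather than from the word table.

```python
import re

# Word lists copied from the POSITIVE WORDS / NEGATIVE WORDS columns.
# Assumption: "like" is not in the table; it is added here only so that
# the example sentence "I LIKE IT" is covered.
POSITIVE_WORDS = {"good", "great", "awesome", "beautiful", "superb",
                  "marvelous", "like"}
NEGATIVE_WORDS = {"bad", "boring", "disappointing", "worst", "horrible",
                  "unhappy"}


def classify_polarity(text: str) -> str:
    """Return 'positive', 'negative' or 'neutral' for a short text.

    Counts lexicon hits: +1 per positive word, -1 per negative word.
    A score of zero (no hits, or balanced hits) is treated as neutral,
    so the neutral phrases from the table need no entries of their own.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE_WORDS for t in tokens) \
        - sum(t in NEGATIVE_WORDS for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in ("I like it", "It was horrible", "It was just one time watch"):
        print(review, "->", classify_polarity(review))
```

A word-count lexicon like this ignores negation ("not good") and context, which is one reason machine learning and hybrid approaches are also used.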

REGULAR AND COMPARATIVE
A REGULAR OPINION IS CONSIDERED THE STANDARD OPINION, AND IT CAN BE EITHER DIRECT OR INDIRECT.
• DIRECT: “THE RESOLUTION OF THIS PHONE IS BRILLIANT.”
• INDIRECT: “AFTER I SWITCHED TO THIS PHONE, I LOST ALL MY DATA!”
IN A COMPARATIVE OPINION, ONE EXPRESSES AN OPINION BY COMPARING SIMILAR ENTITIES RATHER THAN BY EXPRESSING IT DIRECTLY AS POSITIVE OR NEGATIVE.
• “WHITE SHIRT IS BETTER THAN BLACK.”

CLASSES OF SENTIMENT ANALYSIS
BASED ON THE WORK DONE TILL NOW, IT HAS BEEN OBSERVED THAT THE TASK OF SENTIMENT ANALYSIS IS PERFORMED ON SUBJECTIVE REVIEWS COLLECTED FROM THE INTERNET.
THESE REVIEWS ARE THEN PROCESSED AND CATEGORIZED INTO THREE DIFFERENT CLASSES, NAMELY
• POSITIVE
• NEGATIVE
• NEUTRAL

LEVELS OF SENTIMENT ANALYSIS
• Document Level
• Sentence Level
• Entity and aspect level

DOCUMENT LEVEL: AT THIS LEVEL, IT IS CHECKED WHETHER THE WHOLE OPINION DOCUMENT EXPRESSES A POSITIVE OR NEGATIVE SENTIMENT.
SENTENCE LEVEL: AT THIS LEVEL, THE TASK IS RESTRICTED TO SENTENCES, AND THE POLARITY OF EACH SENTENCE IS DETERMINED.
ENTITY AND ASPECT LEVEL: THIS LEVEL GIVES THE MOST PRECISE SENTIMENT, WHICH IS NOT CLEARLY DEFINED BY THE ABOVE TWO LEVELS. IT GIVES A FINE-GRAINED ANALYSIS OF EACH PARTICULAR ENTITY.

WORK DONE
AS MENTIONED EARLIER, THE WORK DONE TILL NOW HAS BEEN ON SUBJECTIVE REVIEWS.
CONSIDERABLE WORK HAS BEEN DONE ON
• NEWS
• TWITTER
• AMAZON DATA
• BUSINESS DATA
LANGUAGES IN WHICH WORK HAS BEEN DONE
• ENGLISH
• NEPALI
• CHINESE
• BENGALI
• THAI

MANY DIFFERENT APPROACHES ARE USED FOR THE PROCESSING OF TEXT, SUCH AS MACHINE LEARNING, DOMAIN KNOWLEDGE DRIVEN ANALYSIS, AND STATISTICAL APPROACHES.
THESE APPROACHES PROVIDE AN EFFECTIVE WAY OF SCRUTINIZING INFORMATION AND DECIDING THE POLARITY OF THAT PARTICULAR DATA.

RELATED WORK
DAS AND HIS TEAM DEVELOPED A SENTIWORDNET FOR THE BENGALI LANGUAGE CONSISTING OF 35,805 WORDS.
SHARMA ET AL. USED AN UNSUPERVISED LEXICON METHOD OVER TWEETS TO FIND THE POLARITY.
HUANGFU ET
AL. PRESENTED AN IMPROVED CHINESE NEWS SENTIMENT ANALYSIS METHOD.
YAN ET AL. PROPOSED AND IMPLEMENTED A TIBETAN SENTENCE SENTIMENT TENDENCY JUDGMENT SYSTEM BASED ON THE MAXIMUM ENTROPY MODEL.
JOSHI ET AL. PROPOSED A STRATEGY FOR THE HINDI LANGUAGE. THEY DEVELOPED A HINDI SENTIWORDNET BY REPLACING WORDS OF THE ENGLISH WORDNET WITH THEIR HINDI EQUIVALENTS.

CONCLUSION
ON THE BASIS OF THE STUDIES AND SURVEYS DONE TILL DATE, IT CAN BE STATED THAT A VARIETY OF TASKS AND APPROACHES HAVE BEEN USED, RESULTING IN MANY ACHIEVEMENTS.
THE BEST CLASSIFICATION METHOD AMONG THE THREE IS CONSIDERED TO BE THE MACHINE LEARNING APPROACH, WHICH PREDICTS THE POLARITY OF SENTIMENTS BASED ON THE DATASET.
IT CAN BE USED ACROSS DIFFERENT DOMAINS AND IS EXPECTED TO GIVE GOOD RESULTS. AREAS COULD BE POLITICS, FASHION, EDUCATION, BUSINESS, ETC.

FUTURE SCOPE
IT IS OBSERVED THAT SENTIMENT ANALYSIS PLAYS A VITAL ROLE IN RESOLVING THE ISSUE OF POLARITY OF REVIEWS.
THE STUDY SHOWED THAT THE AMOUNT OF WORK DONE IN THE HINDI LANGUAGE IS VERY SMALL.
ALSO, NEW TECHNIQUES COULD BE UTILIZED TO INCREASE THE ACCURACY OF THE RESULTS.

REFERENCES
AMITAVA DAS, SIVAJI BANDOPADAYA, “SENTIWORDNET FOR BANGLA”, KNOWLEDGE SHARING EVENT-4: TASK, VOLUME 2, 2010.
AMITAVA DAS, SIVAJI BANDOPADAYA, “SENTIWORDNET FOR INDIAN LANGUAGES”, PROCEEDINGS OF THE 8TH WORKSHOP ON ASIAN LANGUAGE RESOURCES, PAGES 56-63, BEIJING, CHINA, AUGUST 2010.
YAKSHI SHARMA, VEENU MANGAT, MANDEEP KAUR, “A PRACTICAL APPROACH TO SEMANTIC ANALYSIS OF HINDI TWEETS”, 1ST INTERNATIONAL CONFERENCE ON NEXT GENERATION COMPUTING TECHNOLOGIES (NGCT-2015), DEHRADUN, INDIA, PAGES 677-680, SEPTEMBER 4-5, 2015.
YU HUANGFU, GUOSHI WU, YU SU, JING LI, PENGFEI SUN, JIE HU, “AN IMPROVED SENTIMENT ANALYSIS ALGORITHM FOR CHINESE NEWS”, 12TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), PAGES 1366-1371, 2015.
PURTATA BHOIR, SHILPA KOLTE, “SENTIMENT ANALYSIS OF MOVIE REVIEWS USING LEXICON APPROACH”, IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH, 2015.
XIADONG YAN, TAO HUANG, “TIBETAN SENTENCE SENTIMENT ANALYSIS BASED ON THE MAXIMUM ENTROPY MODEL”, 10TH INTERNATIONAL CONFERENCE ON BROADBAND AND WIRELESS COMPUTING, COMMUNICATION AND APPLICATION, PAGES 594-597, 2015.
CHANDAN PRASAD GUPTA, BAL KRISHAN, “DETECTING SENTIMENT ANALYSIS IN NEPALI TEXTS”, IEEE, PAGES 1-4, 2015.
ZHONGKAI HU, JIANQING HU, WEIFENG DING, XIAOLIN ZHENG, “REVIEW SENTIMENT ANALYSIS BASED ON DEEP LEARNING”, 12TH INTERNATIONAL CONFERENCE ON EBUSINESS ENGINEERING, IEEE, PAGES 87-94, 2015.
YAN WAN, HONGZHURUI NIE, TIANGUANG LAN, ZHAHUI WANG, “FINE GRAINED SENTIMENT ANALYSIS OF ONLINE REVIEWS”, 12TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, PAGES 1406-1411, 2015.
ANDREA SALINCA, “BUSINESS REVIEW CLASSIFICATION EMPLOYING SENTIMENT ANALYSIS”, 17TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING, PAGES 247-250, 2015.

THANK YOU…