Download Sentiment Analysis - Academic Science,International Journal of

Sentiment Analysis- A Competent Tool in Data Mining
Sanket Kulkarni
DJ Sanghvi COE
Vile Parle (West)
Mumbai, India
As more and more devices gain access to the web, the volume of data
produced has increased enormously. Of all the data produced so far, 90%
is said to have been produced in the last two years, which shows how the
growth of the internet is generating vast amounts of data that can be
very valuable if used effectively. People now communicate and
participate on many social websites, blogs and forums, which offer a
great opportunity to analyze the data: to apply theories, algorithms and
technologies that search and extract relevant data from the huge
quantities available on various websites and then mine them for
opinions. Data analysis is a rapidly growing field, and sentiment
analysis is an important part of it. Sentiment analysis is, in essence,
determining the attitude, judgment, evaluation, emotional state or
intended emotional communication of a speaker or writer with the use of
natural language processing, text analysis, computational linguistics
and various algorithms. The main aim of this paper is to survey the
sentiment analysis techniques that are widely used in data analysis, and
the applications of sentiment analysis, which can be an important tool
for many businesses, e-commerce websites and start-ups if used
effectively.
Keywords: Sentiment analysis, SVM, Naïve Bayes, lexicon.
The World Wide Web is growing at an alarming rate, not only in size but
also in the types of services and content provided. Users are
participating more actively and are generating vast amounts of new data.
In this era of automated systems and digital information, every field is
evolving rapidly and generating data, so that huge amounts of data are
produced in science, engineering, medicine, marketing, finance and other
fields. Automated systems are needed for the automated analysis and
classification of this data, which helps in taking enterprise-level
decisions. These analysis techniques include methods such as text
analysis and sentiment analysis. Sentiment analysis is used to find
opinions, identify the sentiments they express and classify their
polarity, as shown in the figure below:
Fig 1.1
There are three main classification levels in sentiment analysis:
1. Document-level classification. 2. Aspect-level classification.
3. Sentence-level classification.
Document-level classification aims to classify an opinion document as
expressing a positive or negative opinion or sentiment. It treats the
whole document as the basic information unit. Sentence-level
classification aims to classify the sentiment expressed in each
sentence. However, there is little fundamental difference between the
document level and the sentence level, because sentences are just short
documents. Classifying text at the document level or at the sentence
level does not provide the detail on opinions about all aspects of an
entity that many applications need; to obtain these details, we need to
go to the aspect level. Aspect-level classification aims to classify the
sentiment with respect to specific aspects of entities. The first step
is to identify the entities and their aspects. Opinion holders can give
different opinions on different aspects of the same entity; a review
may, for example, praise a phone's camera while criticizing its battery
life. The data sets used in sentiment analysis are an important issue in
this field. The main source of data is product reviews. These reviews
are important to business owners, who can take business decisions based
on the analysis of users' opinions about their products. The various
sentiment classification techniques and algorithms are shown below.
Fig 1.2 Classification of models
As shown in the figure above, different models are discussed in the
sections that follow.
3.1.1. Point-wise Mutual Information (PMI):
The mutual information measure provides a formal way to model the mutual
information between the features and the classes. This measure is
derived from information theory. The point-wise mutual information (PMI)
$M_i(w)$ between the word $w$ and the class $i$ is defined on the basis
of the level of co-occurrence between the class $i$ and the word $w$.
Let $P_i$ be the global fraction of documents belonging to class $i$,
$F(w)$ the global fraction of documents that contain the word $w$, and
$p_i(w)$ the conditional probability of class $i$ for documents that
contain $w$. The expected co-occurrence of class $i$ and word $w$, on
the basis of mutual independence, is given by $P_i \cdot F(w)$, and the
true co-occurrence is given by $F(w) \cdot p_i(w)$. The mutual
information is defined in terms of the ratio between these two values:
$$M_i(w) = \log\left(\frac{F(w)\, p_i(w)}{F(w)\, P_i}\right) = \log\left(\frac{p_i(w)}{P_i}\right)$$
The word $w$ is positively correlated with the class $i$ when $M_i(w)$
is greater than 0, and negatively correlated with the class $i$ when
$M_i(w)$ is less than 0.
PMI is used in many applications, such as developing a contextual
entropy model to expand a set of seed words generated from a small
corpus of stock market news articles. The contextual entropy model
measures the similarity between two words by comparing their contextual
distributions using an entropy measure, which allows the discovery of
words similar to the seed words. Once the seed words have been expanded,
they are used to classify the sentiment of new articles.
3.1.2. Chi-square (χ²)
Let $n$ be the total number of documents in the collection, $p_i(w)$ the
conditional probability of class $i$ for documents which contain $w$,
$P_i$ the global fraction of documents belonging to class $i$, and
$F(w)$ the global fraction of documents which contain the word $w$. The
χ²-statistic between word $w$ and class $i$ is then defined as
$$\chi_i^2 = \frac{n\, F(w)^2\, \left(p_i(w) - P_i\right)^2}{F(w)\, \left(1 - F(w)\right)\, P_i\, \left(1 - P_i\right)}$$
$\chi_i^2$ and PMI are two different ways of measuring the correlation
between terms and categories. $\chi_i^2$ has the advantage over PMI that
it is a normalized value, so its values are more comparable across terms
in the same category. $\chi_i^2$ is used in many applications; one
example is contextual advertising.
3.1.3 Latent Semantic Indexing (LSI)
Feature selection methods attempt to reduce the dimensionality of the
data by picking a subset of the original set of attributes. Feature
transformation methods instead create a smaller set of features as a
function of the original set of features. LSI is one of the best-known
feature transformation methods. LSI transforms the text space into a new
axis system whose axes are linear combinations of the original word
features. Principal component analysis techniques are used to achieve
this goal: they determine the axis system that retains the greatest
level of information about the variations in the underlying attribute
values. The main disadvantage of LSI is that it is an unsupervised
technique which is blind to the underlying class distribution.
Therefore, the features found by LSI are not necessarily the directions
along which the class distribution of the underlying documents can best
be separated.
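The PMI and χ² statistics of Sections 3.1.1 and 3.1.2 can both be computed from simple document counts. The following is a minimal sketch, assuming tokenized documents with one class label each; the function name `feature_scores` and the toy data are illustrative and not taken from the paper.

```python
import math
from collections import Counter

def feature_scores(docs, labels, target):
    """Return {word: (pmi, chi2)} for the class `target`.

    docs   : list of token lists, one per document
    labels : list of class labels, same length as docs
    """
    n = len(docs)
    # P_i: global fraction of documents belonging to the class
    P_i = sum(1 for y in labels if y == target) / n
    df = Counter()           # documents containing w
    df_in_class = Counter()  # documents containing w that belong to the class
    for tokens, y in zip(docs, labels):
        for w in set(tokens):
            df[w] += 1
            if y == target:
                df_in_class[w] += 1
    scores = {}
    for w, count in df.items():
        F = count / n                  # F(w)
        p = df_in_class[w] / count     # p_i(w)
        # PMI = log(p_i(w) / P_i); minus infinity if w never occurs in the class
        pmi = math.log(p / P_i) if p > 0 else float("-inf")
        denom = F * (1 - F) * P_i * (1 - P_i)
        chi2 = n * F ** 2 * (p - P_i) ** 2 / denom if denom > 0 else 0.0
        scores[w] = (pmi, chi2)
    return scores

# Toy corpus (assumed data, for illustration only)
docs = [["good", "phone"], ["good", "battery"],
        ["bad", "phone"], ["bad", "screen"]]
labels = ["pos", "pos", "neg", "neg"]
scores = feature_scores(docs, labels, "pos")
for word in sorted(scores):
    print(word, scores[word])
```

In this toy corpus "good" and "bad" receive the same χ² value but PMI of opposite sign, while "phone", which is spread evenly over both classes, scores zero on both measures.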
4. Sentiment classification techniques:
Sentiment classification techniques can be divided into the machine
learning approach, the lexicon-based approach and the hybrid approach.
The machine learning approach applies machine learning algorithms to
linguistic features. The lexicon-based approach depends on a sentiment
lexicon, a collection of known and precompiled sentiment terms. The
hybrid approach combines both approaches and is very common, with
sentiment lexicons playing an important role in the majority of methods.
The algorithms and techniques are described briefly in the following
subsections.
4.1. Lexicon-based approach
Opinion words are employed in many sentiment classification tasks.
Positive opinion words are used to express desired states, while
negative opinion words are used to express undesired states. There are
also opinion phrases and idioms, which together with opinion words are
called the opinion lexicon. There are three main approaches to compiling
or collecting the opinion word list. The manual approach is very time
consuming and is not used alone. It is usually combined with the other
two, automated, approaches as a final check to avoid the mistakes that
result from automated methods. The two automated approaches are
presented in the following subsections.
4.1.1. Dictionary-based approach
A small set of opinion words with known orientations is collected
manually. This set is then grown by searching a well-known resource such
as WordNet or a thesaurus for their synonyms and antonyms. The newly
found words are added to the seed list, and the next iteration starts.
The iterative process stops when no new words are found. After the
process is completed, manual inspection can be carried out to remove or
correct errors. The dictionary-based approach has a major disadvantage:
it cannot find opinion words with domain-specific and context-specific
orientations. Qiu and He used a dictionary-based approach to identify
sentiment sentences in contextual advertising. They proposed an
advertising strategy to improve ad relevance and user experience. They
used syntactic parsing and a sentiment dictionary, and proposed a
rule-based approach to tackle topic word extraction and consumer
attitude identification in advertising keyword extraction. They worked
on data from web forums. Their results demonstrated the effectiveness of
the proposed approach on advertising keyword extraction and ad
selection.
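The iterative seed-expansion procedure described at the start of this subsection can be sketched as follows. This is only an illustration under stated assumptions: the `SYNONYMS` and `ANTONYMS` dictionaries are a hypothetical toy thesaurus standing in for WordNet, and `expand_lexicon` is an illustrative name, not an existing library function.

```python
def expand_lexicon(seeds, synonyms, antonyms):
    """Grow a polarity lexicon from seed words.

    seeds    : {word: +1 or -1}, the manually collected seed list
    synonyms : {word: set of synonyms}  (toy thesaurus, assumed)
    antonyms : {word: set of antonyms}  (toy thesaurus, assumed)
    """
    lexicon = dict(seeds)
    frontier = list(seeds)
    # Iterate until an iteration finds no new words
    while frontier:
        next_frontier = []
        for word in frontier:
            polarity = lexicon[word]
            # Synonyms inherit the polarity, antonyms get the opposite one
            candidates = [(s, polarity) for s in sorted(synonyms.get(word, ()))]
            candidates += [(a, -polarity) for a in sorted(antonyms.get(word, ()))]
            for new_word, new_polarity in candidates:
                if new_word not in lexicon:
                    lexicon[new_word] = new_polarity
                    next_frontier.append(new_word)
        frontier = next_frontier
    return lexicon

SYNONYMS = {"good": {"fine", "great"}, "great": {"excellent"}, "bad": {"poor"}}
ANTONYMS = {"good": {"bad"}, "excellent": {"terrible"}}

lexicon = expand_lexicon({"good": 1}, SYNONYMS, ANTONYMS)
print(lexicon)
```

Because each word keeps the first polarity it is assigned, errors in the thesaurus propagate, which is one reason the text recommends a final manual inspection.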
4.1.2. Corpus-based approach
The corpus-based approach helps to solve the problem of finding opinion
words with context-specific orientations. Its methods depend on
syntactic patterns, or patterns that occur together, along with a seed
list of opinion words, to find other opinion words in a large corpus.
One of these methods was presented by Hatzivassiloglou and McKeown. They
started with a list of seed opinion adjectives and used them, along with
a set of linguistic constraints, to identify additional adjective
opinion words and their orientations. The constraints are for
connectives such as AND, OR, BUT and EITHER-OR; the conjunction AND, for
example, says that conjoined adjectives usually have the same
orientation. This idea is called sentiment consistency, although it does
not always hold in practice. There are also adversative expressions,
such as "but" and "however", which indicate opinion changes. In order to
determine whether two conjoined adjectives have the same or different
orientations, learning is applied to a large corpus. The links between
adjectives then form a graph, and clustering is performed on the graph
to produce two sets of words: positive and negative.
Statistical approach.
Finding co-occurrence patterns or seed opinion words can be done using
statistical techniques. This could be done by deriving posterior
polarities using the co-occurrence of adjectives in a corpus, as
proposed by Fahrni and Klenner. It is possible to use the entire set of
indexed documents on the web as the corpus for the dictionary
construction. This overcomes the problem of some words being unavailable
when the corpus used is not large enough. The polarity of a word can be
identified by studying the occurrence frequency of the word in a large
annotated corpus of texts. If the word occurs more frequently among
positive texts, then its polarity is positive. If it occurs more
frequently among negative texts, then its polarity is negative. If the
frequencies are equal, then it is a neutral word. Similar opinion words
frequently appear together in a corpus. This is the main observation on
which the state-of-the-art methods are based. Therefore, if two words
appear together frequently within the same context, they are likely to
have the same polarity, and the polarity of an unknown word can be
determined by calculating the relative frequency of its co-occurrence
with another word. This could be done using PMI.
Semantic approach.
The semantic approach gives sentiment values directly and relies on
different principles for computing the similarity between words. This
principle gives similar sentiment values to semantically close words.
WordNet, for example, provides different kinds of semantic relationships
between words that are used to calculate sentiment polarities. WordNet
could also be used to obtain a list of sentiment words by iteratively
expanding the initial set with synonyms and antonyms, and then
determining the sentiment polarity of an unknown word by the relative
count of positive and negative synonyms of this word.
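Both counting rules from Section 4.1.2 are easy to sketch: the statistical rule compares how often a word occurs in positive and in negative texts of an annotated corpus, and the semantic rule compares the number of positive and negative synonyms of the word. The function names, the toy corpus and the toy synonym table below are assumptions made for illustration.

```python
from collections import Counter

def polarity_from_corpus(word, annotated_texts):
    """Statistical rule: compare occurrence counts in positive and negative texts.

    annotated_texts : list of (token list, "pos" or "neg")
    """
    counts = Counter()
    for tokens, label in annotated_texts:
        counts[label] += tokens.count(word)
    if counts["pos"] > counts["neg"]:
        return "positive"
    if counts["neg"] > counts["pos"]:
        return "negative"
    return "neutral"

def polarity_from_synonyms(word, synonyms, lexicon):
    """Semantic rule: relative count of positive and negative synonyms.

    synonyms : {word: set of synonyms}   (toy table, assumed)
    lexicon  : {word: +1 or -1} for words of known polarity
    """
    score = sum(lexicon.get(s, 0) for s in synonyms.get(word, ()))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

corpus = [
    (["great", "food", "great", "service"], "pos"),
    (["awful", "food"], "neg"),
    (["great", "disappointment"], "neg"),
]
known = {"great": 1, "excellent": 1, "awful": -1}
toy_synonyms = {"superb": {"great", "excellent"}, "dire": {"awful"}}

print(polarity_from_corpus("great", corpus))
print(polarity_from_synonyms("superb", toy_synonyms, known))
```

Raw counts are used here for brevity; with classes of very different sizes, relative frequencies would be the more sensible comparison.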
4.1.3. Lexicon-based and natural language processing techniques
Natural Language Processing (NLP) techniques are sometimes used with the
lexicon-based approach to find the syntactic structure and help in
finding the semantic relations. Moreo and Romero used NLP techniques as
a preprocessing stage before applying their proposed lexicon-based
sentiment analysis (SA) algorithm. Their proposed system consists of an
automatic focus detection module and a sentiment analysis module capable
of assessing user opinions of topics in news items, using a
taxonomy-lexicon that is specifically designed for news analysis. Their
results were promising in scenarios where colloquial language
predominates.
The approach to SA presented by Caro and Grella was based on a deep NLP
analysis of the sentences, using dependency parsing as a pre-processing
step. Their SA algorithm relied on the concept of sentiment propagation,
which assumes that each linguistic element, such as a noun or a verb,
can have an intrinsic sentiment value that is propagated through the
syntactic structure of the parsed sentence. They presented a set of
syntax-based rules that aimed to cover a significant part of the
sentiment salience expressed by a text. They also proposed a data
visualization system in which some data objects need to be filtered out,
or the data contextualized, so that only the information relevant to a
user query is shown to the user. To accomplish that, they presented a
context-based method for visualizing opinions that measures the
distance, in the textual appraisals, between the query and the polarity
of the words contained in the texts themselves. They extended their
algorithm by computing context-based polarity scores. Their approach
showed high efficiency when applied to a manual corpus of 100 restaurant
reviews.
4.2. Machine learning approach:
The machine learning approach relies on well-known machine learning
algorithms to solve SA as a regular text classification problem that
makes use of syntactic and/or linguistic features.
Text classification problem definition: we have a set of training
records D = {X1, X2, ..., Xn}, where each record is labeled with a
class. The classification model relates the features in the underlying
record to one of the class labels. Then, for a given instance of unknown
class, the model is used to predict a class label. In the hard
classification problem, exactly one label is assigned to an instance. In
the soft classification problem, a probabilistic value is assigned to
each label for an instance.
4.2.1. Supervised learning:
Supervised learning methods depend on the existence of labeled training
documents. There are many kinds of supervised classifiers in the
literature. In the next subsections, we briefly present some of the
classifiers most frequently used in sentiment analysis.
Decision tree classifiers.
A decision tree classifier provides a hierarchical decomposition of the
training data space, in which a condition on the attribute value is used
to divide the data. The condition, or predicate, is the presence or
absence of one or more words. The division of the data space is done
recursively until the leaf nodes contain a certain minimum number of
records, which are used for the purpose of classification.
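A recursive decomposition on word presence can be sketched in a few lines. The sketch below assumes documents represented as sets of words and chooses each split by counting misclassified records; that split criterion and the names `build_tree` and `classify` are assumptions of this example, since the text does not prescribe a criterion.

```python
from collections import Counter

def majority(labels):
    """Most frequent label among the records at a node."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(docs, labels, min_records=2):
    """Return a leaf label or a node (word, subtree_if_present, subtree_if_absent).

    docs : list of sets of words, one set per document
    """
    # Stop when the node is small enough or already pure
    if len(docs) <= min_records or len(set(labels)) == 1:
        return majority(labels)
    best = None
    for word in sorted(set().union(*docs)):
        present = [y for d, y in zip(docs, labels) if word in d]
        absent = [y for d, y in zip(docs, labels) if word not in d]
        if not present or not absent:
            continue  # this word does not divide the data
        # Records misclassified if each side predicts its majority label
        errors = sum(len(side) - Counter(side).most_common(1)[0][1]
                     for side in (present, absent))
        if best is None or errors < best[0]:
            best = (errors, word)
    if best is None:
        return majority(labels)
    word = best[1]
    yes = [(d, y) for d, y in zip(docs, labels) if word in d]
    no = [(d, y) for d, y in zip(docs, labels) if word not in d]
    return (word,
            build_tree([d for d, _ in yes], [y for _, y in yes], min_records),
            build_tree([d for d, _ in no], [y for _, y in no], min_records))

def classify(tree, doc):
    """Follow presence/absence tests down to a leaf label."""
    while isinstance(tree, tuple):
        word, if_present, if_absent = tree
        tree = if_present if word in doc else if_absent
    return tree

train_docs = [{"good", "phone"}, {"good", "screen"},
              {"bad", "phone"}, {"bad", "screen"}]
train_labels = ["pos", "pos", "neg", "neg"]
tree = build_tree(train_docs, train_labels)
print(tree)
print(classify(tree, {"bad", "battery"}))
```

Each node here corresponds to a single-attribute split on the presence or absence of one word.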
There are other kinds of predicates, which depend on the similarity of
documents to correlated sets of terms and may be used for further
partitioning of the documents. The different kinds of splits are as
follows. A single-attribute split uses the presence or absence of
particular words or phrases at a particular node in the tree in order to
perform the split. A similarity-based multi-attribute split uses
documents or frequent word clusters, and the similarity of the documents
to these word clusters, in order to perform the split. A
discriminant-based multi-attribute split uses discriminants such as the
Fisher discriminant to perform the split.
Linear classifiers.
Let $X_i = \{x_1, \ldots, x_n\}$ be the normalized document word
frequency vector, $A_i = \{a_1, \ldots, a_n\}$ a vector of linear
coefficients with the same dimensionality as the feature space, and $b$
a scalar. The output of the linear predictor is defined as
$p = A_i \cdot X_i + b$, which is the output of the linear classifier.
The predictor $p$ defines a separating hyperplane between different
classes. There are many kinds of linear classifiers; among them are
Support Vector Machines (SVM), a form of classifier that attempts to
determine good linear separators between different classes. Two of the
best-known linear classifiers are discussed in the following
subsections.
Support Vector Machines Classifiers (SVM).
The main principle of SVMs is to determine linear separators in the
search space which can best separate the different classes. Consider an
example with two classes, x and o, and three candidate hyperplanes A, B
and C. Hyperplane A provides the best separation between the classes,
because the normal distance of any of the data points to it is the
largest, so it represents the maximum margin of separation.
Text data are ideally suited for SVM classification because of the
sparse nature of text, in which few features are irrelevant, but they
tend to be correlated with one another and are generally organized into
linearly separable categories.
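The notion of a maximum margin can be made concrete by computing, for each candidate hyperplane, the smallest normal distance from any training point to that hyperplane and keeping the candidate for which this distance is largest. The points and the three hyperplanes below are invented toy values; a real SVM searches over all hyperplanes with an optimizer rather than over a fixed list.

```python
import math

def margin(w, b, points):
    """Smallest signed normal distance of the labelled points to w.x + b = 0.

    points : list of (feature tuple, label) with label +1 or -1.
    The result is negative if any point lies on the wrong side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x, y in points)

# Two toy classes (+1 and -1) in two dimensions
points = [((1.0, 1.0), +1), ((2.0, 2.0), +1),
          ((-1.0, -1.0), -1), ((-2.0, -1.0), -1)]

# Three candidate separators, each given as (w, b); all values are assumed
candidates = {
    "A": ((1.0, 1.0), 0.0),
    "B": ((1.0, 0.0), 0.5),
    "C": ((0.0, 1.0), -0.5),
}

margins = {name: margin(w, b, points) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
print(margins)
print("maximum-margin candidate:", best)
```

All three candidates separate the toy classes, but A keeps every point farthest from the boundary, which is the property the maximum-margin principle selects for.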
SVM can construct a nonlinear decision surface in the original feature
space by mapping the data instances nonlinearly to an inner product
space where the classes can be separated linearly with a hyperplane.
This discriminative classifier is considered the best text
classification method (Rui Xia, 2011; Ziqiong, 2011). M. Rushdi Saleh
(2011) applied Support Vector Machines (SVM) to this research area,
testing different domains of data sets and using several weighting
schemes. They carried out experiments with different features on three
corpora. Two of them had already been used in several works. The SINAI
corpus was built specifically in order to demonstrate the feasibility of
SVM for different domains.
Neural Network (NN).
A neural network consists of many neurons, where the neuron is its basic
unit. The inputs to the neurons are denoted by the vector
$\overline{X_i}$, which holds the word frequencies in the $i$-th
document. A set of weights $A$ is associated with each neuron and is
used to compute a function $f(\cdot)$ of its inputs. The linear function
of the neural network is $p_i = A \cdot \overline{X_i}$. In a binary
classification problem, it is assumed that the class label of
$\overline{X_i}$ is denoted by $y_i$, and the sign of the predicted
function $p_i$ yields the class label.
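The linear predictor and the sign rule can be written out directly. In this sketch the vocabulary, the weight values and the helper names `term_frequencies` and `predict` are all assumptions chosen for illustration; in practice the weights would be learned from labeled training data.

```python
def term_frequencies(tokens, vocabulary):
    """Normalized word-frequency vector of a document over a fixed vocabulary."""
    total = len(tokens)
    return [tokens.count(w) / total for w in vocabulary]

def predict(weights, x, bias=0.0):
    """Linear predictor p = A . X + b; the sign of p gives the class label."""
    p = sum(a * xi for a, xi in zip(weights, x)) + bias
    return (1 if p >= 0 else -1), p

vocabulary = ["good", "bad", "phone"]
weights = [1.5, -2.0, 0.1]   # assumed values, not learned here

label_a, p_a = predict(weights, term_frequencies(["good", "phone"], vocabulary))
label_b, p_b = predict(weights, term_frequencies(["bad", "phone"], vocabulary))
print(label_a, p_a)
print(label_b, p_b)
```

The same function covers the linear classifiers discussed earlier by passing a non-zero `bias` for the scalar b.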
Multilayer neural networks are used for non-linear boundaries. The
multiple layers are used to induce multiple approximately enclosed
regions belonging to a particular class.
Rule-based classifiers.
In rule-based classifiers, the data space is modeled with a set of
rules. The left-hand side of a rule is a condition on the feature set,
expressed in disjunctive normal form, while the right-hand side is the
class label. The conditions are on term presence. Term absence is rarely
used because it is not informative in sparse data. There are a number of
criteria for generating rules, and the training phase constructs all the
rules depending on these criteria. The two most common criteria are
support and confidence. The support is the absolute number of instances
in the training data set which are relevant to the rule. The confidence
is the conditional probability that the right-hand side of the rule is
satisfied if the left-hand side is satisfied, that is, the number of
instances satisfying both sides divided by the number satisfying the
left-hand side.
Probabilistic classifiers.
Probabilistic classifiers use mixture models for classification. The
mixture model assumes that each class is a component of the mixture.
Each mixture component is a generative model that provides the
probability of sampling a particular term for that component. These
kinds of classifiers are also called generative classifiers. Three of
the best-known probabilistic classifiers are discussed in the next
subsections.
Naïve Bayes Classifier (NB).
The Naïve Bayes classifier is the simplest and most commonly used
classifier. The Naïve Bayes classification model computes the posterior
probability of a class based on the distribution of the words in the
document. The model works with bag-of-words (BOW) feature extraction,
which ignores the position of the word in the document. It uses Bayes'
theorem to predict the probability that a given feature set belongs to a
particular label.
1. Consider a training set T of samples, each with a class label. There
   are k classes, C1, C2, ..., Ck. Every sample consists of an
   n-dimensional vector, X = {x1, x2, ..., xn}, representing n measured
   values of the n attributes A1, A2, ..., An.
2. The classifier will classify the given sample X such that it belongs
   to the class having the highest posterior probability. That is, X is
   predicted to belong to the class Ci if and only if
   P(Ci|X) > P(Cj|X) for 1 ≤ j ≤ k, j ≠ i. Thus we find the class that
   maximizes P(Ci|X). The class Ci for which P(Ci|X) is maximized is
   called the maximum posterior hypothesis.
By Bayes' theorem:
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
so that, for the classifier, $P(C_i \mid X) = P(X \mid C_i)\, P(C_i) / P(X)$.
The simplicity of the Naïve Bayes approach is very useful when it comes
to document classification (Hanhoon Khang, 2012; Melville et al., 2009;
Rui Xia, 2011; Ziqiong, 2011). The main idea is to estimate the
probabilities of categories given a test document by using the joint
probabilities of words and categories. The simplicity of the Naïve Bayes
algorithm makes this process efficient.
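A minimal multinomial Naïve Bayes classifier over bag-of-words features can be sketched as follows. Add-one (Laplace) smoothing and the skipping of words unseen in training are assumptions of this sketch, which the text does not specify, and the toy training data are invented.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Collect the counts needed for P(Ci) and P(word | Ci)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, y in zip(docs, labels):
        word_counts[y].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class Ci that maximizes log P(Ci) + sum of log P(word | Ci)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for c in sorted(class_counts):
        total_words = sum(word_counts[c].values())
        score = math.log(class_counts[c] / total_docs)   # log prior
        for w in tokens:
            if w in vocab:   # words never seen in training are skipped
                # add-one smoothing avoids zero probabilities
                score += math.log((word_counts[c][w] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = c, score
    return best_label

train_docs = [["good", "great", "phone"], ["great", "battery"],
              ["bad", "poor", "phone"], ["poor", "screen"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = train_nb(train_docs, train_labels)

print(predict_nb(model, ["great", "phone"]))
print(predict_nb(model, ["poor", "screen", "unknown"]))
```

The denominator P(X) of Bayes' theorem is the same for every class, so it can be dropped when only the maximizing class is needed; working with log probabilities avoids numerical underflow on long documents.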
Hanhoon Khang (2012) proposed an improved version of the Naïve Bayes
algorithm. When unigrams + bigrams were used as the feature, the gap
between the positive accuracy and the negative accuracy was narrowed to
3.6% compared to when the original Naïve Bayes was used, and the 28.5%
gap was narrowed compared to when SVM was used.
Bayesian Network (BN).
The main assumption of the NB classifier is the independence of the
features. The other extreme is to assume that all the features are fully
dependent. This leads to the Bayesian Network model, which is a directed
acyclic graph whose nodes represent random variables and whose edges
represent conditional dependencies. BN is considered a complete model
for the variables and their relationships. Therefore, a complete joint
probability distribution (JPD) over all the variables is specified for a
model. In text mining, the computational complexity of BN is very high,
which is why it is not frequently used.
Maximum Entropy Classifier:
The Maxent classifier (also known as a conditional exponential
classifier) converts labeled feature sets to vectors using an encoding.
This encoded vector is then used to calculate weights for each feature,
which can then be combined to determine the most likely label for a
feature set. The classifier is parameterized by a set of weights, which
is used to combine the joint features that are generated from a feature
set by an encoding. In particular, the encoding maps each
(featureset, label) pair to a vector. The probability of each label is
then computed using the following equation:
$$P(fs \mid label) = \frac{\mathrm{dotprod}(weights,\ \mathrm{encode}(fs, label))}{\sum_{l \in labels} \mathrm{dotprod}(weights,\ \mathrm{encode}(fs, l))}$$
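The computation can be sketched as below. The formula above is written as a plain ratio of dot products; this sketch instead exponentiates each dot product before normalizing, which is the usual form of a conditional exponential model and keeps every probability positive. That choice, the binary joint-feature encoding, the weight values and all function names are assumptions of this example.

```python
import math

def encode(featureset, label, feature_index):
    """Joint-feature encoding: one binary dimension per (feature name, label) pair."""
    vec = [0.0] * len(feature_index)
    for name, present in featureset.items():
        idx = feature_index.get((name, label))
        if present and idx is not None:
            vec[idx] = 1.0
    return vec

def label_probabilities(featureset, labels, weights, feature_index):
    """Probability of each label, normalized over all labels."""
    scores = {}
    for l in labels:
        dot = sum(w * v for w, v in zip(weights, encode(featureset, l, feature_index)))
        scores[l] = math.exp(dot)   # exponentiated dot product (assumption, see lead-in)
    z = sum(scores.values())
    return {l: s / z for l, s in scores.items()}

feature_index = {("good", "pos"): 0, ("good", "neg"): 1,
                 ("bad", "pos"): 2, ("bad", "neg"): 3}
weights = [2.0, -1.0, -1.0, 2.0]   # assumed values; normally fitted to training data

probs = label_probabilities({"good": True, "bad": False},
                            ["pos", "neg"], weights, feature_index)
print(probs)
```

With these toy weights, a document containing "good" but not "bad" is assigned most of the probability mass for the label "pos".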
4.3. Weakly, semi and unsupervised learning
The main purpose of text classification is to classify documents into a
certain number of predefined categories. To accomplish that, a large
number of labeled training documents are used for supervised learning,
as illustrated before. In text classification, it is sometimes difficult
to create these labeled training documents, but it is easy to collect
unlabeled documents. Unsupervised learning methods overcome these
difficulties. Many research works have been presented in this field,
including the work of Ko and Seo. They proposed a method that divides
the documents into sentences and categorizes each sentence using keyword
lists for each category and a sentence similarity measure.
The concept of weak and semi-supervision is used in many applications.
Youlan and Zhou proposed a strategy that works by providing weak
supervision at the level of features rather than instances. They
obtained an initial classifier by incorporating prior information
extracted from an existing sentiment lexicon into sentiment classifier
model learning. They refer to this prior information as labeled features
and use it directly to constrain the model's predictions on unlabeled
instances using generalized expectation criteria.
4.4. Meta classifiers
In many cases, researchers use one or more kinds of classifiers to test
their work. One such article is the work of Lane and Clarke. They
presented an ML approach to the problem of locating documents carrying
positive or negative favorability within media analysis. The imbalance
in the distribution of positive and negative samples, changes in the
documents over time, and effective training and evaluation procedures
for the models were the challenges they faced in reaching their goal.
They worked on three data sets generated by a media-analysis company.
They classified documents in two ways: detecting the presence of
favorability, and assessing negative vs. positive favorability. They
used five different types of features to create the data sets from the
raw text. They tested many classifiers to find the best one, including
SVM, K-nearest neighbor, NB, BN, decision tree (DT), a rule learner and
others. They showed that balancing the class distribution in the
training data can be beneficial in improving performance, but that NB
can be adversely affected.
5. Applications of Sentiment Analysis:
Each algorithm has its own particular way of analyzing data. Using
various algorithms, sentiment analysis can be performed on social media
websites, blogs or other websites, and the results can be used widely.
In this way the vast amount of otherwise unused data can be turned into
something valuable, which can be used for the applications described
below:
5.1 Stock Market Prediction:
Stock market prediction is one of the important applications of
sentiment analysis. Stock market events are easily quantifiable using
returns from indices and/or individual stocks, which provide meaningful
and automated labels. Machine learning algorithms can be used to
identify significant stock movements and to collect the appropriate
preceding, subsequent and contemporaneous text from social media
sources. Each extracted sentence related to a particular share can be
labeled as positive or negative. A model can then be trained to predict
the labels of future sentences; the net sentiment of each day, computed
from these labels, can hold significant predictive power for subsequent
stock market movements. In this way sentiment analysis can be used in
stock markets to build trading strategies on the system described above
and to seek significant returns over other baseline methods.
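The "net sentiment of each day" mentioned above can be computed by simple aggregation once each sentence carries a positive or negative label. The sketch below assumes the labels have already been produced by a trained classifier; the dates, labels and the name `daily_net_sentiment` are invented for illustration, and no claim is made about how well such a signal predicts returns.

```python
from collections import defaultdict

def daily_net_sentiment(labelled_sentences):
    """Net sentiment per day: number of positive minus number of negative sentences.

    labelled_sentences : iterable of (date string, "positive" or "negative")
    """
    net = defaultdict(int)
    for day, label in labelled_sentences:
        net[day] += 1 if label == "positive" else -1
    return dict(net)

# Hypothetical classifier output for sentences about one share
labelled = [
    ("2014-03-03", "positive"),
    ("2014-03-03", "positive"),
    ("2014-03-03", "negative"),
    ("2014-03-04", "negative"),
    ("2014-03-04", "negative"),
]

net = daily_net_sentiment(labelled)
print(net)
```

The resulting daily series could then be compared with the next day's return of the share as one input to a trading strategy.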
5.2 Politics:
Sentiment analysis can be used for tracking public opinion on various
public forums and political blogs. Political organizations can use it to
track the concerns of the public and to find out which issues are close
to the voters' hearts, so that they can address them in their rallies,
which can have a positive effect on the people. It can also be used to
identify whether people are happy with a new scheme that the government
wants to finalize. It can further be used to predict poll results at
election time by finding the sentiment of the people towards a
particular organization.
5.3 Recommender system:
Sentiment analysis can be useful for deriving user ratings from text. It
can be used as a sub-component technology for recommender systems:
objects can be classified as recommended or not recommended, and those
that receive negative feedback are not recommended.
5.4 Rank finding:
Sentiment analysis can be used to track literary reputation. It can be
used to perform analytics on a group of blogs related to the same field:
by analyzing the number of users, comments and reviews, one can predict
which author is more popular or more skilled in the eyes of readers.
This makes it possible to rank blogs, identify the work of experts and
give it the highest blog rank.
5.5 Business
Sentiment analysis has been adopted by many businesses that deal with
markets. Companies can analyze product reviews using sentiment analysis
and also track their brand value as a whole. Using sentiment analysis
they can shape their marketing strategy accordingly and also follow
relevant financial news. Various other applications are:
- Automatic tracking of user feedback and opinions on brands, and on any
  products they have launched, from review sites.
- Gauging reaction to company-related events and incidents; during a new
  product launch, for example, it can give instant feedback about the
  reception of the new product.
- Monitoring crucial issues to avert harmful viral effects, such as
  dealing with customer complaints that occur on social media and
  routing them to the department that can handle them before the
  complaints spread.
- Analyzing purchaser inclinations and competitors.
Key challenges identified by researchers for this application include
identifying aspects of a product, associating opinions with those
aspects, identifying fake reviews and processing reviews with no
canonical form.
5.6 Summarization
Summarization includes the analysis of comments related to the features
of a product, the identification of review sentences that give an
opinion on each feature, and the production of a summary of all the
extracted information. Summarization of single and multiple documents is
also a feature that sentiment analysis can augment.
5.7 Government intelligence
Government intelligence is one more application of sentiment analysis.
It has been proposed that, by monitoring sources, an increase in unusual
or hostile communications can be tracked. It can also be used for
efficient rule making, where it can assist in automatically analyzing
opinions about pending policies or government-regulation proposals.
Other applications include tracking citizens' opinions about a new
scheme and predicting the likelihood of success of a new legislative
reform.
5.8 Geographical uses:
Sentiment analysis can be an effective tool at the time of a disaster,
when the sentiments of people change according not only to the location
of the users but also to their distance from the disaster. People who
live far away from where the disaster has happened usually have many
doubts. A model can therefore be built and integrated with a system that
helps response organizations maintain a real-time map displaying both
the physical disaster and the spikes of intense emotional activity
during the course of the disaster. In future iterations this can be used
to provide real-time alerts on the emotional status of the affected
population.
Conclusion and Future scope:
Sentiment mining research is of utmost importance not only for
commercial establishments but also for the common man. With the World
Wide Web offering a wide variety of ideas and opinions, it is also very
important to be aware of malicious opinions. Based on our literature
review and discussions, we argue that we are initiating new research
questions: analyzing online product reviews and other valuable online
information from a domain user's point of view, and exploring how such
online reviews can really benefit ordinary users. In the case of product
reviews there is a visible gap between the designer's perspective and
the domain user's perspective. In addition, no single classifier can be
called completely efficient, as the results depend on a number of
factors.
Overall, the data used in sentiment analysis (SA) are mostly product
reviews. Naïve Bayes and Support Vector Machines are the most frequently
used ML algorithms for solving the sentiment classification (SC)
problem. They are considered reference models against which many
proposed algorithms are compared. Other kinds of data, such as news
articles or news feeds, web blogs and especially social media, have been
used more frequently in recent years. Information from micro-blogs,
blogs and forums, as well as news sources, has been widely used in SA
recently. This media information plays a great role in expressing
people's feelings or opinions about a certain topic or product. Using
social network sites and micro-blogging sites as a source of data still
needs deeper analysis. There are some benchmark data sets, especially
for reviews, such as IMDB, which are used for evaluating algorithms.
In many applications, it is important to consider the context of the
text and the user preferences. That is why more research is needed on
context-based SA. Using transfer learning (TL) techniques, we can use
data related to the domain in question as training data. Using NLP tools
to reinforce the SA process has attracted researchers recently and still
needs further advances. The non-English languages studied include other
Latin languages (Spanish, Italian), Germanic languages (German, Dutch),
Far East languages (Chinese, Japanese, Taiwanese) and Middle East
languages (Arabic). Still, English is the most frequently used language,
due to the availability of its resources, including lexica, corpora and
dictionaries. This opens a new challenge to researchers: building
lexica, corpora and dictionary resources for other languages.