Proceedings of the 2012 International Conference on Industrial Engineering and Operations Management
Istanbul, Turkey, July 3 – 6, 2012
A Sentiment Analysis as a Tool to Identify The Status Of
Universities: The Case of ITU
Mine Işık
Department of Industrial Engineering
Istanbul Technical University
Istanbul, Turkey
Başar Öztayşi
Department of Industrial Engineering
Istanbul Technical University
Istanbul, Turkey
Kübra H. Fenerci
Department of Industrial Engineering
Boğaziçi University
Istanbul, Turkey
Abstract
Data mining is a popular statistical technique that has been used extensively in recent years by both academicians and
practitioners. Data miners aim to extract useful information from huge amounts of raw data by exploiting analytical
techniques. Although data mining relies heavily on structured data, the most beneficial part of the data is
stored as unstructured text. Text mining has emerged to meet this need and to fill this
gap in analytics. The aim of this research is to use statistical text analytics and Natural Language Processing
(NLP) techniques to reveal the pattern of sentiments in categorized texts related to state universities. To this end,
data extracted from various social media websites are cleaned of noise and clustered with the help of statistical
methods. In this manner, the contents of website users’ comments are investigated, and the positive and negative
sentiments addressed to these contents are identified. The implementation of the generated model is based on the Istanbul
Technical University case; with small changes, the model can be extended to other state universities.
Keywords
Text mining, opinion mining, sentiment analysis, clustering, Natural Language Processing (NLP)
1. Introduction
Understanding customers, vendors, business processes, and the extended supply chain has been the key to
organizational success. Companies use analytical decision-making tools to better understand their customers,
optimize their supply chain and maintain the best customer service (Davenport, 2006). Understanding the customer
commonly depends on analysis of the data that is collected and stored by the companies. The term data mining is
used to describe the process of discovering previously unknown patterns in the data. Basically, data mining uses
statistical and artificial intelligence methods to analyze the data.
Besides the structured data stored in databases, companies also own a vast amount of data stored in text format. It is
reported that 85 to 90 percent of all corporate data is captured and stored in text and other unstructured forms
(McKnight, 2005). The term text mining has emerged from the need to analyze and benefit from these text data.
Text mining is described as the semi-automated process of extracting patterns, useful information and knowledge
from large amounts of unstructured data sources (Turban et al. 2011). While data mining and text mining share
the purpose of identifying valid, novel and useful patterns, the input of text mining is a collection of
unstructured files such as Word documents, PDF files and XML files.
Text mining has been used in areas where large amounts of textual data are generated. These areas include analyzing
court orders, financial reports, patent files, customer comments and emails (Weng and Liu, 2004). The major
applications of text mining are identifying key phrases and relationships within text, placing documents into a set of
categories based on their main themes, grouping similar documents, and connecting related documents by identifying their
shared concepts.
With the emergence of Web 2.0, the way that people express their opinions changed dramatically. Internet users
today post reviews of products at online shopping sites and express their comments in Internet forums, discussion
groups, blogs and any kind of social media. This online word-of-mouth behavior represents a new source of
information for text mining applications. Analyzing these texts is of great importance because of the valuable
information contained in them. The term opinion mining (or sentiment analysis) defines the process that exploits these
sources to help businesses and individuals gain such information effectively and easily (Liu, 2007). From
the business perspective, opinion mining provides feedback to companies about their products or services. Opinions
are also important for potential customers; these opinions can be used to make better products and services. Another
application of opinion mining is online advertisement; a company may want to display its advertisement where there
is a positive opinion about the product. On the other hand, it is not desirable to show an advertisement to an Internet
user where there is a negative opinion about the product.
Both professionals and academics are developing techniques to analyze opinions on the web, especially in social
media. In this paper, an opinion-mining model is developed to analyze the comments on social media about Istanbul
Technical University (ITU). Using a sample set of 300 comments, text clustering is first performed to define
the categories. “Education quality”, “campus” and “general reputation” are found to be the clusters. After the
clusters are identified, a rule-based system is developed to classify the comments into the clusters. Using natural
language processing (NLP), the model can also label a comment as positive or negative. The resulting
model can automatically assign a new review to one of the clusters and determine its polarity.
The rest of this paper is organized as follows. In Section 2, the relevant literature is reviewed. The model and the steps
of the study are explained in Section 3. Finally, the conclusions, limitations and future suggestions are given in Sections 4 and 5.
2. Literature Review
With text mining, unstructured data generated within a company or over the Internet have become a source for
analytical surveys. The initial applications of text mining include analysis of court orders and academic publications,
predictions using quarterly financial reports, analysis of patent files and automatic prioritization of emails (Turban et
al. 2011). Some exemplary applications can be summarized as follows:
• Marketing Applications: Text mining can be used to increase customer satisfaction, cross-sell and upsell
opportunities and overall lifetime value by analyzing documents such as call center notes and user
comments on the Internet. Coussement and Van Den Poel (2008) propose a model for complaint management
automation built on an email detection system. In another study, Thorleuchter et al. (2012) propose a
customer profitability prediction model in a B-to-B marketing context. Ghani et al. (2006) develop an
attribute extraction system to enhance retailers’ ability to analyze product databases.
• Biomedical Applications: Since medical literature is more standardized and the terminology is relatively
constant, text mining is widely used in biomedical analysis. Nakov et al. (2005) use text mining to
discover gene-protein relationships. Shatkay et al. (2007) propose a model for protein location prediction.
Krallinger et al. (2005) outline the text mining approaches in molecular biology and biomedicine.
• Academic Applications: In scientific publications specific information is kept in written text, and text
mining is needed to enable improved search without destroying the publishers’ barriers to public access. In
the recent literature, text mining is used to classify scientific articles based on citation context (Aljaber et
al. 2011). Text mining is also used to build concept maps; Chen et al. (2008) construct e-Learning domain
concept maps from academic articles.
Turban et al. (2011) identify the most popular text mining applications as information extraction, topic tracking,
summarization, categorization, clustering, concept linking and question answering. Information or knowledge
extraction is the identification of key phrases and relationships within text. The second application, topic tracking, is
the process of predicting documents of interest to a user based on the user profile and other supplementary data.
Summarization is the text mining application that automatically summarizes documents without information loss.
Another important text mining application is categorization, which starts with identifying
the main themes of a document and then places the document into a predefined set of categories. Clustering, like
categorization, aims to group documents, but in clustering there are no predefined categories; the aim is
to group similar documents together. Concept linking is the text mining process that finds related documents by
identifying their shared concepts. Finally, question answering uses text mining to automatically find the most suitable
answer to a question. The relevant studies on the listed text mining applications are
given in Table 1.
Table 1: Review of text mining applications
Application | Purpose | References
Information Extraction | Identification of key phrases and relationships within text | Tilak et al. (2011), Kovačević (2012), Chan and Franklin (2011), Chen and Parvathi (2011)
Topic Tracking | Predict documents of interest | Qiu et al. (2010), Kobayashi and Yung (2008), AlSumait (2008), Qiu et al. (2008), Lee and Kim (2008)
Summarization | Automatically summarize the documents | Bhattacharya et al. (2011), Trappey et al. (2009), Chong et al. (2010), Aliguliyev (2010), Jin (2011)
Categorization | Place the documents into a predefined set of categories | Trappey et al. (2009), Lu et al. (2010), Shehata et al. (2010), Chen and Chen (2011), Wei (2011)
Clustering and Concept Linking | Group similar documents together | Liu and Liu (2011), Moriizumi et al. (2011), Yang and Dorbin (2011), Raja and Tretter (2011), Marx et al. (2011)
Question Answering | Find the best answer to a given question | Dizier and Moens (2011), Fleuren et al. (2011), Liu (2009), Pechsiri and Kawtrakul (2007), Terol et al. (2007)
A relatively new text mining application is sentiment analysis, or opinion mining. Sentiment analysis is a technique
used to detect positive and negative opinions about specific products or services using a large number of sources such
as customer feedback, forums, blogs and other social media. Among the applications described above, sentiment analysis
uses information extraction and categorization. Recent studies in the literature include the following.
Tang et al. (2009) define four problems in sentiment analysis: subjectivity classification, word sentiment
classification, document sentiment classification and opinion extraction. They discuss different approaches and issues
related to these problems. Kang et al. (2012) classify review documents as positive or negative
using a supervised learning algorithm. The authors found that the positive
classification accuracy tends to be up to approximately 10% higher than the negative classification accuracy. Eirinaki
et al. (2011) propose a feature-based opinion mining model and suggest a search engine based on sentiment analysis
that retrieves and aggregates comments about a product. Wu and Tan (2011) propose a two-stage framework for
cross-domain sentiment analysis. In the first stage, a bridge is built between the source domain and the target domain to
obtain the most confidently labeled documents in the target domain. In the second stage the authors exploit the
intrinsic structure revealed by these most confidently labeled documents to label the target-domain data. Bai
(2011) proposes a heuristic search-enhanced Markov blanket model that can be used for extracting sentiments and
applies the model to online movie and news reviews. The proposed model is capable of identifying a parsimonious set
of predictive features. Zhu et al. (2010) use an individual model based on artificial neural networks for sentiment
analysis. The authors apply the model to movie reviews, and its results show higher accuracy than
support vector machines and hidden Markov models. Leong et al. (2012) use sentiment analysis for teaching
purposes; in their study the researchers use SMS texts as the source of data for teaching evaluation. In this study, online
comments about Istanbul Technical University are analyzed with sentiment analysis using SAS Enterprise Miner
software. First the text clusters are generated, and then rules are generated to classify each comment into a group. To
the best of our knowledge, this is the first attempt to apply opinion mining in education.
3. Model
3.1. Data Preparation
Web documents are abundant, which makes the web a natural source of data on word-of-mouth and personal
judgments about a desired topic. To take advantage of the large data sets available
online, the data are crawled and used as input to text mining algorithms. Web crawling
is the process used by search engines to collect data sets from the web. Many online forums and
review sites exist for people to post their opinions about ITU. To make use of them, the comments related to ITU
are gathered from social media web sites. Seven different forum sites are investigated, and 300 comments are
crawled and examined. Each comment is stored with its date, forum type and user
name so that the evolution of the data can be tracked over time.
3.2. Text Cleaning
The crawled texts are held in a spreadsheet and need to be transformed into a form more convenient for the
commercial SAS software. The data are converted to SAS format using the “wTurkish” encoding in order
to support non-ANSI Turkish characters. Then the data, consisting of 300 lines, are exported to UTF-8 text files to be
used in the Sentiment Analysis software. Texts that are not related to the point of concern and duplicates are deleted,
and the resulting data are saved locally.
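The preparation and cleaning steps above can be sketched in a few lines of Python. This is only an illustration and not the SAS workflow used in the study: the `Comment` record, the `clean` and `to_utf8_csv` helpers and the `"itu"` keyword filter are hypothetical names and simplifications introduced here.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    date: str   # posting date as found on the forum
    forum: str  # forum the comment was crawled from
    user: str   # user name of the commenter
    text: str   # raw comment text


def normalize(text):
    """Collapse whitespace and case so trivially different copies compare equal."""
    return " ".join(text.split()).casefold()


def clean(comments, keyword="itu"):
    """Drop off-topic comments and duplicates, keeping the first occurrence."""
    seen = set()
    kept = []
    for comment in comments:
        key = normalize(comment.text)
        if keyword not in key or key in seen:
            continue
        seen.add(key)
        kept.append(comment)
    return kept


def to_utf8_csv(comments):
    """Serialize comments as UTF-8 bytes so Turkish characters survive export."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["date", "forum", "user", "text"])
    for comment in comments:
        writer.writerow([comment.date, comment.forum, comment.user, comment.text])
    return buffer.getvalue().encode("utf-8")
```

A substring keyword is a crude relevance test; in the study, relevance was judged against the point of concern rather than by a single keyword.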
3.3. Text Mining
The text mining process starts with importing the text data into SAS Enterprise Miner and parsing it into words and
phrases so that these terms are meaningful for analysis. The clustering process starts with the preparation
of the data, which includes parsing and filtering.
The first phase of the data arrangement is “Text Parsing”. Since the software supports Turkish, word stemming is available.
Because the parser allows selected parts of speech to be excluded, abbreviations, auxiliary verbs, conjunctions,
determiners, interjections, infinitive markers, possessive markers and pronouns are ignored, following a
comprehensive analysis. These parts of speech were found to be uninformative when
kept in the resulting dataset. Stemming is also enabled in the text parsing node to construct a more general model that can
be used in future studies. The software provides “inflectional stemming”, which can reach the roots of words and
recognizes all inflectional suffixes. The parsed documents are then ready for the second data preprocessing step, named
“Text Filtering.”
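To make the parsing step concrete, the sketch below tokenizes a comment, drops stop words and strips a suffix. It is a toy stand-in for the SAS text parsing node: the stop list and the two plural suffixes are hypothetical and far smaller than what a real Turkish inflectional stemmer handles.

```python
import re

# Hypothetical, deliberately tiny stop list (conjunctions, pronouns, determiners).
STOP_WORDS = {"ve", "ile", "ama", "bu", "ben", "o"}

# Toy inflectional suffixes (Turkish plural markers only).
SUFFIXES = ("ler", "lar")


def tokenize(text):
    """Lowercase the text and keep runs of letters only."""
    return re.findall(r"[^\W\d_]+", text.lower())


def stem(token):
    """Strip one known suffix if at least three letters of the root remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def parse(text):
    """Tokenize, remove stop words, and stem what is left."""
    return [stem(token) for token in tokenize(text) if token not in STOP_WORDS]
```

For instance, `parse("Dersler ve sınavlar zor")` drops the conjunction and reduces both plurals to their roots.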
The analysis in this step consists of frequency weighting and term weighting. Text Filtering is used to get rid
of noisy terms. Of the available filtering methodologies, only the Log and Binary frequency weighting methods
are tried. One strategy for developing the text data matrix is to move from simple counts to more complex
weighting formulations.
The frequency weights represent the first step in quantifying texts. Absolute counts can be distorted when
documents vary widely in size. In this study, therefore, the Log frequency weighting method is
used in order to dampen the effect of terms that occur abundantly in a document.
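The two frequency weightings mentioned above can be illustrated as follows. The paper does not state the formulas, so this sketch assumes the common forms: log2(count + 1) for the Log weight and a 0/1 indicator for the Binary weight. The function names are introduced here for illustration.

```python
import math


def log_weight(count):
    """Log frequency weight, assumed here to be log2(count + 1)."""
    return math.log2(count + 1)


def binary_weight(count):
    """Binary frequency weight: 1 if the term occurs at all, otherwise 0."""
    return 1 if count > 0 else 0


def weight_matrix(counts, weight=log_weight):
    """Apply a frequency weight to every cell of a term-by-document count matrix."""
    return [[weight(count) for count in row] for row in counts]
```

Under this assumption a term occurring 7 times in a document gets weight 3 rather than 7, which is the dampening effect described in the text.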
Afterwards, entropy term weighting is used to modify the frequency weights, adjusting for document size and word
distribution. A spell check is also conducted by means of word similarity algorithms in order to find and correct
misspellings.
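A minimal sketch of entropy term weighting is given below, assuming the standard definition: for a term with counts f_j over n documents and p_j = f_j / sum(f), the weight is 1 + sum(p_j * log2(p_j)) / log2(n). The paper does not spell out the formula, so this form and the helper names are assumptions.

```python
import math


def entropy_weight(term_counts):
    """Entropy weight of one term, given its counts across all documents."""
    n = len(term_counts)
    total = sum(term_counts)
    if total == 0:
        return 0.0  # the term never occurs, so it carries no weight
    if n < 2:
        return 1.0  # with a single document the entropy term is undefined
    entropy_sum = 0.0
    for count in term_counts:
        if count > 0:
            p = count / total
            entropy_sum += p * math.log2(p)
    return 1.0 + entropy_sum / math.log2(n)


def weighted_matrix(counts):
    """Combine the log frequency weight with the entropy term weight.

    `counts` is a list of rows, one row per term, one column per document.
    """
    result = []
    for row in counts:
        term_weight = entropy_weight(row)
        result.append([math.log2(count + 1) * term_weight for count in row])
    return result
```

A term concentrated in a single document receives weight 1, while a term spread evenly over all documents receives weight 0, so terms that do not discriminate between documents are suppressed.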
As a final step, the texts are clustered. Since clustering techniques are subjective, different numbers of clusters are
examined, and three mutually exclusive clusters are eventually obtained. The expectation-maximization algorithm
assigns each comment to exactly one cluster, so no instance is shared among clusters. The descriptive terms indicate
that these clusters are education, campus and corporate reputation.
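As an illustration of expectation-maximization clustering, the sketch below fits a one-dimensional Gaussian mixture and then assigns each point to its most probable cluster. It is not the configuration used in the study, whose feature representation and settings are not described in detail; the function names, the fixed initial means and the variance floor are assumptions. Note that EM itself yields soft memberships, and it is the final most-probable-cluster assignment that makes the clusters mutually exclusive.

```python
import math


def gaussian(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_cluster(points, initial_means, iterations=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture with EM and hard-assign each point."""
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability of each cluster for each point.
        resp = []
        for x in points:
            scores = [weights[j] * gaussian(x, means[j], variances[j]) for j in range(k)]
            total = sum(scores) or 1e-300
            resp.append([score / total for score in scores])
        # M-step: re-estimate weights, means and variances from the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0:
                continue
            weights[j] = nj / len(points)
            means[j] = sum(r[j] * x for r, x in zip(resp, points)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, points)) / nj
            variances[j] = max(var, min_var)
    # Hard assignment: each point goes to its single most probable cluster.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, labels
```

In a text setting the points would be document vectors derived from the weighted term matrix rather than single numbers, but the alternation between the two steps is the same.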
3.4. Sentiment Analysis
The aim of this step is to reveal the hidden sentiments related to a specific object and the features of that object.
Sentiment classification has been investigated in a wide range of fields, including restaurant reviews from the customer
perspective (Kang et al., 2012) and hospitality expenditures and stock returns (Singal, 2012). Moreover,
sentiment analysis of unstructured text has witnessed a boom in interest in recent years (Bai, 2011). Automatic
and accurate understanding of sentiments expressed in online text could lead to effective information
retrieval (Bai, 2011).
The input of this stage is inherited from text mining results. In this research ITU is considered as a product and the
features are defined as education, campus and corporate reputation. An empirical study on sentiment categorization
on ITU is conducted. Both statistical and rule based models are developed by utilizing the Sentiment Analysis
Studio of SAS software.
The statistical model uses a Bayes solution with smoothed relative frequency as the text normalization method. This
technique divides the term frequency by a normalization factor such as the length of the document. The overall
precision of the statistical model is 72.22%, with 80% of the data used for model training and 20% for
validation. Since the output of the statistical model consists of words and phrases, these are used in the rule-based models as
initial dictionary terms. In the rule-based models, Boolean Perl operators are used to define relationships among terms,
such as distances, ordinal distances, and being in the same sentence or paragraph. Nearly 8000 rules are written to
capture all the relationships in the documents correctly. Although a classical search engine may recognize positive and
negative words in a document, the solution obtained in this study links this positivity and negativity to the correct
feature among campus, corporate reputation and education.
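The idea of tying a sentiment word to a feature through a distance rule can be sketched as follows. This is a simplified illustration and not the SAS Sentiment Analysis Studio rule syntax: the English mini-dictionaries, the `score_features` function and the `max_distance` parameter are hypothetical, whereas the study's rules operate on Turkish text and number in the thousands.

```python
import re

# Hypothetical mini-dictionaries, in English for readability.
FEATURE_TERMS = {
    "campus": {"campus", "library"},
    "education": {"exam", "lecture"},
}
POSITIVE = {"good", "great"}
NEGATIVE = {"bad", "hard"}


def split_sentences(text):
    """Split lowercased text into sentences on ., ! and ?."""
    return [s for s in re.split(r"[.!?]+", text.lower()) if s.strip()]


def score_features(text, max_distance=3):
    """Credit a sentiment word to every feature term within max_distance
    tokens of it in the same sentence (+1 positive, -1 negative)."""
    scores = {feature: 0 for feature in FEATURE_TERMS}
    for sentence in split_sentences(text):
        tokens = re.findall(r"[^\W\d_]+", sentence)
        for i, token in enumerate(tokens):
            polarity = (token in POSITIVE) - (token in NEGATIVE)
            if polarity == 0:
                continue
            start = max(0, i - max_distance)
            stop = min(len(tokens), i + max_distance + 1)
            for j in range(start, stop):
                for feature, terms in FEATURE_TERMS.items():
                    if tokens[j] in terms:
                        scores[feature] += polarity
    return scores
```

A sentiment word that is far from any feature term, or in a different sentence, is left unassigned. This is the behavior that separates such rules from a plain count of positive and negative words.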
3.5. Model Testing and Validating
Both the text mining and the sentiment analysis models are tested on an independent set of 50 documents. The sentiment
analysis part reaches 70% accuracy, a score that leaves room for improvement. Since clustering is a descriptive method,
it is not possible to calculate accuracy or precision metrics for this technique. Because clustering is subjective, its
test results are interpreted manually and found to be reasonable.
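For reference, accuracy and precision can be computed from labeled test documents as in the sketch below. The function names are introduced here, and any label lists passed to them are the reader's own; the paper reports only the aggregate scores, not per-document results.

```python
def accuracy(actual, predicted):
    """Share of documents whose predicted label equals the true label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def precision(actual, predicted, label):
    """Share of documents predicted as `label` that truly carry that label."""
    truths = [a for a, p in zip(actual, predicted) if p == label]
    if not truths:
        return 0.0  # nothing was predicted as this label
    return sum(a == label for a in truths) / len(truths)
```

With a test set of 50 documents, 70% accuracy corresponds to 35 correctly classified documents.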
3.6. Findings
This study aims to explore the viewpoints and feelings about ITU from the perspective of not only its students but
also anyone who has an opinion about it. Since many online forums and review sites exist for people to
post their opinions, the data are gathered from the web and analyzed. After the clustering step, the descriptive
terms divide the data into three blocks: education, campus and corporate reputation.
The general opinion about the corporate reputation of ITU can be regarded as considerably positive. A good many
comments emphasize engineering quality and trust in the education of technical universities, which produces
qualified and competent engineers. On the other hand, negative comments about the corporate reputation of ITU
concern the excessive bureaucracy that students encounter.
Figure 1 Sentiment distribution of corporate reputation (pie chart; shares 17%, 32% and 51%; legend: Positive, Negative, Neutral)
The general opinion about the campus facilities of ITU is positive, particularly when the library is the point in question. In
contrast, comments about the long-lasting construction work are negative because of the inconvenience it causes.
Figure 2 Sentiment distribution of ITU campuses (pie chart; shares 21%, 6% and 73%; legend: Positive, Negative, Neutral)
Comments about education at ITU mostly touch on the education quality of the engineering faculty. Commenters mention
that they make use of the key concepts that ITU’s schooling has given them. On the other hand, ITU’s exams are found to
be unnecessarily hard, and the number of projects conducted is also criticized.
Figure 3 Sentiment distribution about education of ITU (pie chart; shares 33% and 67%; legend: Positive, Negative, Neutral)
4. Conclusions
A great deal of data is already available on the web. Although data mining relies heavily on structured data, it is
reported that the largest portion of all valuable data is captured and stored as text. This research focuses on using
statistical text analytics and NLP techniques to reveal and analyze the pattern of sentiments in the case of ITU. For
this aim, website users’ comments are investigated, and the positive, negative and neutral sentiments addressed to the
campus, corporate image and education are identified.
First, the data on hand are cleaned of noise to enable clearer judgments. Then text-clustering
techniques are applied. With regard to cluster discrimination, the opinions hidden in the gathered
texts show that the clusters are sufficiently distant from each other. This can be considered
evidence of good cluster distinction.
Each cluster is investigated independently in the sentiment analysis study. As a result, the hidden sentiments in the
comments of social media users are discovered.
5. Limitations and Future Explorations
As the amount of gathered data increases, the discovery of hidden patterns becomes more reliable and
accurate. Therefore, the constructed model will be applied to different document sets, with questionnaires
also prepared and added to the unstructured data taken from various online forums. The proposed method for
predicting sentiments could enable both professionals and researchers to extract opinions about universities from the
Internet. In addition, as future work, the generated model can be extended with small changes
to other state and private universities.
One limitation is the difficulty of determining correct relationships and word distances in Turkish documents in order to catch
the comments that are valuable for the survey. As the distance increases, so does the possibility of hitting a sentiment that is
irrelevant to ITU. The distance alternatives can therefore be investigated by conducting a sensitivity analysis.
Another limitation is that, in the Turkish comments investigated, positive adjectives
are sometimes used to express negative feelings about the main point of concern. Positive sentiments are assigned to
these comments even though they do not express praise.
References
Aliguliyev, R.M., Clustering Techniques and Discrete Particle Swarm Optimization Algorithm for Multi-Document
Summarization, Computational Intelligence, 26(4), pp. 420–448, 2010.
Aljaber B., Martinez D., Stokes N. and Bailey J., Improving MeSH classification of biomedical articles using citation
contexts, Journal of Biomedical Informatics, 44(5), pp. 881–896, 2011.
AlSumait, L.; Barbara, D.; Domeniconi, C., On-line LDA: Adaptive Topic Models for Mining Text Streams with
Applications to Topic Detection and Tracking, ICDM '08. Eighth IEEE International Conference on Data
Mining, pp. 3 - 12, 2008.
Bai X., Predicting consumer sentiments from online text, Decision Support Systems, 50(4) pp. 732-742, 2011.
Bhattacharya S., Thuc V. and Srinivasan P., MeSH: a window into full text for document
summarization, Bioinformatics, 27(13), pp. 120-128, 2011.
Chan, S.W.K., and Franklin J., A text-based decision support system for financial sequence prediction, Decision
Support Systems, 52, pp. 189–198, 2011.
Chen N.S., Kinshuk, Wei C.W., Chen H.J., Mining e-Learning domain concept map from academic articles,
Computers & Education, 50(3), pp. 1009-1021, 2008.
Chen W. and Parvathi C., Extracting hot spots of topics from time-stamped documents, Data & Knowledge
Engineering, 70(7), pp 642-660, 2011.
Chen Y.T., Chen M.C., Using chi-square statistics to measure similarities for text categorization, Expert Systems with
Applications, 38(4), pp. 3085-3090, 2011.
Chong L., Huang M.L., Zhu X.Y., Li, M., A New Approach for Multi-Document Update Summarization , Journal of
Computer Science and Technology, 25(4), pp. 739-749, 2010.
Coussement, K., and Van Den Poel, D., Improving Customer Complaint Management by automatic Email
Classification Using Linguistic Style Features as Predictors, Decision Support Systems, 44(4), pp. 870-882, 2008.
Davenport T.H. Competing on analytics, Harvard Business Review, January 2006.
Dizier P.S., and Moens M.F., Knowledge and reasoning for question answering: Research perspectives, Information
Processing and Management, 47(6), pp. 899–906, 2011.
Eirinaki M., Pisal S., Singh J., Feature-based opinion mining and ranking Original Research Article, Journal of
Computer and System Sciences, In Press, November 2011.
Fleuren W.W., Verhoeven S., Frijters R., Heupers B., Polman J., van Schaik R., de Vlieg J., Alkema W., CoPub
update: CoPub 5.0 a text mining system to answer biological questions, Nucleic Acids Res., pp. 450-454, 2011.
Ghani, R., Probst K., Liu Y., Krema M., and Fano, A., Text mining for product attribute Extraction, SIGKDD
Explorations, 8(1), pp. 41-48, 2006.
Jin F., Huang M.L. and Zhu X.Y., Guided Structure-Aware Review Summarization, Journal of Computer Science and
Technology, 26(4), pp.676-684, 2011
Kang H., Yoo S.J., and Han D., Senti-lexicon and improved Naïve Bayes algorithms for sentiment analysis of
restaurant reviews, Expert Systems with Applications, 39(5), pp. 6000-6010, 2012.
Kobayashi M. and Yung R., Tracking Topic Evolution in On-Line Postings: 2006 IBM Innovation Jam Data, Lecture
Notes in Computer Science Advances in Knowledge Discovery and Data Mining, 5012, pp. 616-625, 2008.
Kovačević A., Konjović Z., Milosavljević B., Nenadic G., Mining methodologies from NLP publications: A
case study in automatic terminology recognition, Computer Speech & Language, 26(2), pp. 105–126, 2012.
Krallinger, M., Erhardt R.A.A. and Valencia A., Text-mining approaches in molecular biology and biomedicine, Drug
Discovery Today, 10(6) pp. 439–445, 2005.
Lee S., Kim H.J., News Keyword Extraction for Topic Tracking, Fourth International Conference on Networked
Computing and Advanced Information Management, pp. 554-559, 2008.
Leong C.K., Lee Y.H., Mak W.K., Mining sentiments in SMS texts for teaching evaluation, Expert Systems with
Applications, 39(3), pp. 2584-2589, 2012.
Liu B., Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer, Chicago, USA, 2007.
Liu Y., Narasimhan N., Vasudevan V., and Agichtein E., Is this urgent?: exploring time-sensitive information needs
in collaborative question answering, In Proceedings of the 32nd international ACM SIGIR conference on
Research and development in information retrieval, pp. 712-713, 2009.
Liu Y., Liu C.W., Research of fast SOM clustering for text information, Expert Systems with Applications, 38(8), pp.
9325–9333, 2011.
Lu S.H., Chiang D.A., Keh H.C., Huang H.H., Chinese Text Classification by the Naïve Bayes Classifier and the
Associative Classifier with Multiple Confidence Threshold Values, Knowledge-Based Systems, 23(6), pp. 598-604, 2010.
Marx Z., Dagan I., Shamir E., Cross-partition clustering: revealing corresponding themes across related datasets,
Journal of Experimental & Theoretical Artificial Intelligence, 23(2), pp.153-180, 2011.
McKnight, W., Text Data Mining in Business Intelligence, Information Management Magazine,
http://www.information-management.com/issues/20050101/1016487-1.html (accessed Dec 10, 2011), 2005.
Moriizumi, S., Chu, B., Cao, H., Matsukawa, H., Supply Chain Risk Driver Extraction using Text Mining Technique,
Information-an International Interdisciplinary Journal, 14(6), pp. 1935-1945, 2011.
Nakov, P., Schwartz A., Wolf, B. and Hearst M.A., Supporting Annotation Layers for Natural Language Processing,
Proceedings of the ACL, interactive poster and demonstration sessions, Ann Arbor, MI. Association for
Computational Linguistics, pp. 65-68, 2005.
Pechsiri C., Kawtrakul A., Mining Causality from Texts for Question Answering System, Transactions on
Information and Systems, E90-D (10), pp.1523-1533, 2007.
Qiu J., Liao L. and Dong J.D., Topic Detection and Tracking for Chinese News Web Pages, International Conference
on Advanced Language Processing and Web Information Technology, pp. 114 - 120, 2008.
Qiu J., Liao L. and Li P., Enhancing Topic Tracking for Chinese News Web Pages with Temporal Information and
Key Web Contexts , International Journal of Innovative Computing, Information and Control, 6(1), pp. 399-408,
2010.
Raja U., Tretter M.J., Classification of software patches: a text mining approach, Journal of Software Maintenance
and Evolution: Research and Practice, 28(2), pp. 69–87, 2011.
Shatkay, H., Höglund A., Brady S., Blum T., Dönnes P., and Kohlbacher O., SherLoc: High- Accuracy Prediction of
Protein subcellular Localization by Integrating Text and Protein Sequence Data, Bioinformatics, 23(11), pp.
1410-1417, 2007.
Shehata S., Karray F., Kamel M.S., An Efficient Model for Enhancing Text Categorization Using Sentence Semantics,
Computational Intelligence, 26(3), pp.215–231, 2010.
Singal, M., Effect of consumer sentiment on hospitality expenditures and stock returns, International Journal of
Hospitality Management, 31, pp.511-521, 2012.
Tang H., Tan S., Cheng X., A survey on sentiment detection of reviews, Expert Systems with Applications, 36(7) pp.
10760-10773, 2009.
Terol R.M., Martínez-Barco P, Palomar M., A knowledge based method for the medical question answering problem,
Computers in Biology and Medicine, 37(10), pp. 1511-1521, 2007.
Thorleuchter, D., Van den Poel, D. and Prinzie A., Analyzing existing customers’ websites to improve the customer
acquisition process as well as the profitability prediction in B-to-B marketing, Expert Systems with Applications,
39(3), pp. 2597–2605, 2012.
Tilak O., Hoblitzell A., Mukhopadhyay S., You Q., Fang S., Xia Y., Bidwell J., Multilevel text mining for bone
biology, Concurrency and Computation: Practice and Experience, 23/17, pp. 2355–2364, 2011.
Trappey A., Trappey C., Wu C.Y., Automatic patent document summarization for collaborative knowledge systems
and services, Journal of Systems Science and Systems Engineering, 18(1), pp. 71-94, 2009.
Turban E., Sharda R. and Delen D. Decision Support and Business Intelligence Systems, Prentice Hall, New Jersey,
2011.
Wei C.P., Lin Y.T., Yang C.C., Cross-lingual text categorization: Conquering language boundaries in globalized
environments, Information Processing & Management, 47(5), pp. 786–804, 2011.
Weng, S.S., and Liu C.K., Using Text Classification and Multiple concepts to Answer Emails, Expert Systems with
Applications, 26(4), pp. 529-543, 2004.
Wu Q., Tan S., A two-stage framework for cross-domain sentiment classification, Expert Systems with Applications,
38(11), pp. 14269-14275, 2011.
Yang C.C. and Dorbin T., Analyzing and Visualizing Web Opinion Development and Social Interactions With
Density-Based Clustering, IEEE Transactions on Systems, Man, and Cybernetics—Part A: Systems and Humans,
41(6), pp.1144-1155, 2011.
Zhu J., Xu C., Wang H., Sentiment classification using the theory of ANNs, The Journal of China Universities of
Posts and Telecommunications, 17(1), pp. 58-62, 2010.