Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Proceedings of the 2012 International Conference on Industrial Engineering and Operations Management Istanbul, Turkey, July 3 – 6, 2012 A Sentiment Analysis as a Tool to Identify The Status Of Universities: The Case of ITU Mine Işık Department of Industrial Engineering Istanbul Technical University Istanbul, Turkey Başar Öztayşi Department of Industrial Engineering Istanbul Technical University Istanbul, Turkey Kübra H. Fenerci Department of Industrial Engineering Boğaziçi University Istanbul, Turkey Abstract All Data mining is a popular statistical technique that is extensively used in recent years by both academicians and practitioners. Data miners aim at extracting useful information from huge amount of row data exploiting analytical techniques. Although, data mining is heavily rely on structured data, the most beneficial part of the whole data is stored as unstructured texts. At this point, text mining has emerged to meet this requirement and has filled the hidden gap of analytics. The aim of this research is using statistical text analytics and Natural Language Processing (NLP) techniques to reveal the pattern of sentiments in categorized texts related to state universities. To this end, data extracted from various social media websites is refined from noise and clustered with the help of statistical methods. In this manner, contents of website users’ comments are being investigated beside positive and negative sentiments addressed to these contents are identified. Implementation of the generated model is based on Istanbul Technical University case, yet, with small changes in the model, it can be extended to other state universities. Keywords Text mining, opinion mining, sentiment analysis, clustering, Natural Language Processing (NLP) 1. Introduction Understanding the customers, vendors, business processes, and the extended supply chain has been the key for organizational success. Companies use analytical decision making tools to better understand their customers to optimize their supply chain and maintain the best customer service (Davenport, 2006). Understanding the customer commonly depends on analysis of the data that is collected and stored by the companies. The term data mining is used to describe the process of discovering previously unknown patterns in the data. Basically, data mining uses statistical and artificial intelligence methods to analyze the data. Besides the structured data stored in databases, companies also own a vast amount of data stored in text format. It is reported that 85 to 90 percent of all corporate data is captured and stored in text and other unstructured forms (McKnight, 2005). The term text mining has emerged from the need to analyze and benefit from these text data. Text mining is described as the semi-automated process of extracting patterns, useful information and knowledge from large amounts of unstructured data sources (Turban et al. 2011). While data mining and text mining both have 1118 the same purposes as identifying valid, novel and useful patterns, the input of text mining is a collection of unstructured data file types such as word documents, PDF files, XML files. Text mining has been used in areas that large amounts of textual data are generated. These areas include analyzing court orders, financial reports, patent files, customer comments and emails (Weng and Liu, 2004). The major applications of text mining are; identification of key phrases and relationships within text set of categories based on main themes of the document, grouping similar documents, connecting related documents by identifying their shared concepts. With the emergence of Web 2.0, the way that people express their opinions changed dramatically. Internet users today, post reviews of products at online shopping sites and express their comments in, Internet forums, discussion groups, blogs and any kind of social media. This online word-of-mouth behavior represents a new source of information for text mining applications. Analyzing these texts is of great importance because of the valuable information contained in them. Opinion mining (or sentiment analysis) term defines the process that exploits these sources to help businesses and individuals with gaining such information effectively and easily (Liu, 2007). From the business perspective, opinion mining provides feedback to companies about their products or services. Opinions are also important for potential customers; these opinions can be used to make better products and services. Another application of opinion mining is online advertisement; a company may want to display its advertisement where there is a positive opinion about the product. On the other hand, it is not desirable to show an advertisement to Internet user where there is a negative opinion about the product. Both professionals and academics are developing techniques to analyze the opinions in web and especially in social media. In this paper, an opinion-mining model is developed to analyze the comments on social media, about Istanbul Technical University (ITU). Using a sample set of 300 comments, first a text clustering is accomplished to define the categories. “Education quality”, “campus” and “general reputation” are the found to be the clusters. After the clusters are identified a rule-based system is developed to classify these comments into the clusters. Using natural language processing (NLP), the model can also specify the comment as being positive or negative. The resulting model can automatically assign a new review to one of the clusters with determining its sign. The rest of this paper is organized as follows. In Section 2, the relevant literature is reviewed. The model and steps of the study is explained in Section 3. Finally the limitations and future suggestions are given in conclusion. 2. Literature Review With text mining, unstructured data generated within a company or over Internet have became a source for analytical surveys. The initial applications of text mining include analysis of court orders and academic publications, predictions using quarterly financial reports, analysis of patent files and emails automatic prioritization (Turban et al. 2011). Some exemplary applications can be summarized as follows: • Marketing Applications: Text mining can be used to increase customer satisfaction, cross sell and upsell opportunities and overall lifetime value by analyzing documents generated by call center notes and user comments on Internet. Coussement and Van Den Poel, (2008) propose a model for complaint management automation built on email detection system. In another study, Thorleuchter et al. (2012) propose a customer profitability prediction model in B-to-B marketing context. Ghani et al. (2006) develop an attribute extraction system to enhance retailers’ ability to analyze product databases. • Biomedical Applications: since medical literature is more standardized and the terminology is relatively constant, text mining is largely used in biomedical analysis. Nakov et al. (2005) use text mining to discover gene-protein relationships. Shatkay et al. (2007) propose a model for protein location prediction. Krallinger et al. (2005) outline the text mining approaches in molecular biology and biomedicine. • Academic Applications: In scientific publications specific information is kept in written text and text mining is needed to enable improved search without destroying the publishers’ barriers to public access. In the recent literature, test mining is used to classify scientific articles based on citation context (Aljaber et al. 2011). Text mining is also used to build concept maps, Chen et al. (2008) construct e-Learning domain concept maps from academic articles. Turban et al. (2011) identify the most popular text mining applications as; information extraction, topic tracking, summarization, categorization, clustering, concept linking and question answering. Information or knowledge 1119 extraction is the identification of key phrases and relationships with in text. The second application, topic tracking is the process of predicting documents of interest to a user based on user profile and other supplementary data. Summarization is the text mining application automatically summarizes the documents without information loss. Another important text mining application is categorization. Categorization is the process that starts with identifying the main themes of a document and places the documents to predefined set of categories. Clustering, just like categorization, aims to group documents but in clustering there are no predefined categories. The aim of clustering is to group similar documents together. Concept linking is the text mining process that finds related documents by identifying their shared concepts. And finally, text mining is used to automatically find the most suitable answer to a question which is known as “question answering”. The relevant studies about the listed text mining applications are given in Table 1. Application Information Extraction Topic Tracking Summarization Categorization Clustering and Concept Linking Question answering Table 1: Review of text mining applications Purpose References Identification of key phrases and Tilak et al. (2011), Kovačevića (2012), Chan and relationships with in text Franklin (2011), Chen and Parvathi (2011) Predict documents of interest Qiu et al. (2010), Kobayashi and Yung (2008), AlSumait (2008), Qiu et al. (2008), Lee and Kim (2008) Automatically summarize the Bhattacharya et al. (2011), Trappey et al.(2009), documents Chong et al. (2010), Aliguliyev (2010), Jin (2011) Place the documents to predefined Trappey et al.(2009), Lu et al. (2010), Shehata et al. set of categories (2010), Chen and Chen (2011), Weia (2011) Group similar documents Liu and Liu (2011), Moriizumi et al. (2011), Yang together. and Dorbin (2011), Raja and Tretter (2011), Marx et al. (2011) Find the best answer to a given Dizier and Moens (2011), Fleuren et al. (2011), Liu question. (2009), Pechsiri and Kawtrakul (2007), Terol et al. (2007) A relatively new text mining application is sentiment analysis or opinion mining. Sentiment analysis is a technique used to detect positive and negative opinions about specific products or services using large amounts of sources such as customer feedbacks, forums, blogs and other social media. Sentiment analysis uses information extraction and categorization among the previously determined applications. Recent studies in the literature can be given as follows; Tang et al. (2009) define the four problems about sentiment analysis as; subjectivity classification, word sentiment classification, document sentiment classification and opinion extraction, and discuss different approaches and issues about these problems. Kang et al. (2012) classify a review document as a positive sentiment and as a negative sentiment using the supervised learning algorithm. The authors found that there is a tendency for the positive classification accuracy to appear up to approximately 10% higher than the negative classification accuracy. Eirinaki et al.(2011) propose a feature based opinion mining model and suggest a search engine based on sentiment analysis that retrieve and aggregate comments about a product . Wu and Tan (2011) propose a two stage framework for cross domain sentiment analysis. In the first stage, a bridge is built between the source domain and the target domain to get some most confidently labeled documents in the target domain. In the second stage the authors exploit the intrinsic structure, revealed by these most confidently labeled documents, to label the target-domain data. Bai (2011), propose a heuristic search-enhanced Markov blanket model that is can be used for extracting sentiments and used the model for online movie and news reviews. The proposed model is capable of identifying a parsimonious set of predictive features. Zhu et al. (2010) use individual model based on artificial neural networks for sentiment analysis. The authors apply the model to movie reviews and the results of the model present higher accuracy than support vector machines and hidden Markov model. Leong et al. (2012) use sentiment analysis for teaching purposes; in their study the researchers use SMS texts as source of data for teaching evaluation. In this study, online comments about Istanbul Technical University are analyzed with sentiment analysis using SAS Enterprise Miner software. First the text clusters are generated and then rules are generated to classify each comment in a group. To the best of our knowledge this is the initial attempt to apply opinion mining in education. 1120 3. Model 3.1. Data Preparation Since web documents are abundant, it is unavoidable to get the data that explores the word-of-mouth and personal judgments on the desired topic from this invaluable mine of information. To take advantage of huge data sets that are hidden in online stores, it is crawled and used as an input of text mining algorithms. If we would like to mention web crawling; it is the process used by search engines to collect data sets from the web. Many online forums and review sites exist for people to post their opinions about ITU. In order to utilize it, the related comments about ITU are gathered from the social media web sites. 7 different forum sites are investigated. The 300 comments are examined in other saying data is crawled. Each comment is stored with the information of date, forum type, and user name in order to track the data evolutionary. 3.2. Text Cleaning Crawled texts are in spread sheet and needed to be transformed into a more convenient form to be used in a commercial software that belongs to SAS. Data is converted to SAS format providing “wTurkish” encoding in order to support non ANSI Turkish characters. Then, data including 300 lines is exported to UTF8 format text files to be used in Sentiment Analysis Software. Texts that are not related to the point of concern and duplications are deleted and resultant data is saved in our local computers. 3.3. Text Mining Text mining process starts with importing text data into SAS Enterprise Miner and parsing it into its words and phrases aiming at preparing these terms to be meaningful for analysis. Clustering process starts with the preparation of the data. This preparation step includes parsing and filtering. First phase of the data arrangement is “Text Parsing”. Since software supports Turkish, word stemming is available. Whereat it is possible to eliminate some part of speech tags, abbreviations, auxiliary verbs, conjunctions, determiners, interjections, infinitive markers, possessive markers and pronouns are ignored as a result of a comprehensive analysis. It is recognized that these part of speech tags become inconvenient and meaningless while kept in resultant dataset. Also, stemming is permitted in text parsing node to construct a more general model that can be used in future studies. Software provides “inflectional stemming” which can reach roots of words and can understand all inflection suffixes. Parsed documents are ready to be used in second data preprocessing step named “Text Filtering.” The analysis feature of this step consists of frequency weighting and term weighting. Text Filtering is used to get rid of the noisy terms. From the variety of filtering methodologies, only Log and Binary frequency weighting methods are experienced. One strategy of developing the text data matrix is transition from simple counts to more complex weighting formulations. The frequency weights represent the first step in quantifying texts. Since absolute counts might be affected by documents that have a high level of variability regarding to size. In this study, Log frequency weighting method is used in order to dampen the effect of terms, which abundantly occur in a document. Afterwards, entropy term weighting is used to modify frequency weights to adjust for document size and word distribution. Also spell check is conducted by the way of word similarity algorithms in order to find and correct misspellings. As a final step, texts are clustered. Since clustering techniques are subjective, different number of clusters is examined and three mutually exclusive clusters are conducted eventually. Expectation maximization algorithm guarantees that there is no common instance among clusters. Descriptive terms declare that these clusters are; education, campus and corporate reputation. 3.4. Sentiment Analysis The aim of this step is to reveal the hidden sentiments related to a specific object and features of that object. Sentiment classification has been investigated in wide range of fields even on restaurant reviews from customer perspective (Kang et al., 2012) and also on hospitality expenditures and stock returns (Singal, 2012). Moreover, 1121 sentiment analysis from unstructured text has witnessed a boom in interest in recent years (Bai, 2011). Automatic and accurate understanding of sentiments expressed within the online text could lead to effective information retrieval (Bai, 2011). The input of this stage is inherited from text mining results. In this research ITU is considered as a product and the features are defined as education, campus and corporate reputation. An empirical study on sentiment categorization on ITU is conducted. Both statistical and rule based models are developed by utilizing the Sentiment Analysis Studio of SAS software. Statistical model uses a Bayes solution and smoothed relative frequency as a text normalization method. This technique divides the term frequency by a normalization factor such as the length of the document. Overall precision of the statistical model is 72.22% and 80% of data is partitioned for model training and 20% of data for validating. Since output of the statistical model consists of words and phrases, they are used in rule-based models as initial dictionary terms. In rule-based models, Boolean Perl operators are used to define relationships among terms such as distances, ordinal distances, being in the same sentence or paragraph. Nearly, 8000 rules are written to capture all the relationships in documents correctly. Although a classical search engine may understand positive and negative words in a document, the solution obtained in this study links these positivity and negativity to the correct feature including campus, corporate reputation and education. 3.5. Model Testing and Validating Both text mining and sentiment analysis models are tested in an independent set of 50 documents. 70% accuracy is obtained in Sentiment analysis part that is an open for improvement score. Since clustering is a descriptive method, it is not possible to calculate accuracy/precision metrics in this technique. Being a subjective method, test results of clustering interpreted manually and found to be reasonable. 3.6. Findings This study aims at exploring the viewpoints and feelings on ITU from the perspective of not only its students but also in the eye of any person that have any idea on it. Since many online forums and review sites exist for people to post their opinions about, the data is gathered and analyzed from web. After clustering step is initiated, descriptive terms divide the data into three different blocks called as education, campus and corporate reputation. The general idea about corporate reputation of ITU can be regarded as considerably positive. A good many comments heavily rely on the engineering quality, and the trust in the education of technical universities that results in qualified and competent engineers. On the other hand, negative comments are about corporate reputation of ITU includes excessive bureaucracy that students encounter. 17% 32% 51% Positive Negative Neutral Figure 1 Sentiment distribution of corporate reputation General opinion about the campus facilities of ITU is optimistic while the library is the point in question. On the contrary, the interpretation about the long-lasting constructions is negative due to any inconvenience caused. 1122 21% 6% 73% Positive Negative Neutral Figure 2 Sentiment distribution of ITU campuses Comments about education of ITU mostly touch on education quality of engineering faculty. Commenters mention that they are utilizing key concepts that ITU’s schooling brings to them. On the contrary, exams of ITU are found to be unnecessarily hard and number of projects conducted. 33% 67% Positive Negative Neutral Figure 3 Sentiment distribution about education of ITU 4. Conclusions A great deal of data on the web is already available. Despite data mining is heavily rely on structured data, it is reported that the biggest portion of all valuable data is captured and stored in text. This research focuses on using statistical text analytics and NLP techniques to reveal and analyze the pattern of sentiments in the case of ITU. For this aim, website users’ comments are investigated and positive, negative and neutral sentiments addressed to the contents of campus, corporate image and education comments are identified. Firstly the data on hand is cleaned from noise to enable us making more clarified judgments. Then text-clustering techniques are applied. With regards to the issue of cluster discrimination, the opinions of people hide in gathered texts reveal that the distance of clusters sufficiently differs from each other. Hence, this can be considered as an evidence of good cluster distinction. Each cluster is investigated independently in sentiment analysis study. As a result, hidden sentiments in the comments of social media users are discovered. 5. Limitations and Future Explorations As the amount of gathered data increases, the exploration power of the hidden patterns becomes more reliable and accurate. So the constructed model will be applied to different document sets such a way that the questionnaires are also prepared and added to the unstructured data taken from various online forums. The proposed method for predicting sentiments could enable not only the professionals but also researchers, to extract opinions from the Internet about universities. Besides, as a future work, implementation of the generated model with small changes can be extended to other state and private universities. 1123 Another limitation is to determine correct relationships and word distances in Turkish documents in order to catch the comments that are valuable for the survey. As distance increases, the possibility of hitting a sentiment that is irrelevant related to ITU. So the distance alternatives can be investigated by conducting the sensitivity analysis. If we talk about the limitations, since the Turkish comments are investigated, the usage of positive adjectives sometimes refers actually negative feelings about the main point of concern. Positive sentiments are assigned to these comments yet they are not laudatory feelings. References Aliguliyev, R.M., Clustering Techniques and Discrete Particle Swarm Optimization Algorithm for Multi-Document Summarization, Computational Intelligence, 26(4), pp. 420–448, 2010. Aljaber B., Martinez D., Stokes N. and Bailey J. Improving MeSH classification of biomedical articles using citation contexts, Journal of Biomedical Informatics, 44(5) pp. Pages 881–896, 2011. AlSumait, L.; Barbara, D.; Domeniconi, C., On-line LDA: Adaptive Topic Models for Mining Text Streams with Applications to Topic Detection and Tracking, ICDM '08. Eighth IEEE International Conference on Data Mining, pp. 3 - 12, 2008. Bai X., Predicting consumer sentiments from online text, Decision Support Systems, 50(4) pp. 732-742, 2011. Bhattacharya S., Thuc V. and Srinivasan P., MeSH: a window into full text for document summarization,Bioinformatics, 27 (13), pp. 120-128, 2011. Chan, S.W.K., and Franklin J.,A text-based decision support system for financial sequence prediction, Decision Support Systems,52, pp. 189–198, 2011. Chen N.S., Kinshuk, Wei C.W., Chen H.J.,Mining e-Learning domain concept map from academic articles, Computers & Education, 50(3), pp. 1009-1021, 2008. Chen W. and Parvathi C., Extracting hot spots of topics from time-stamped documents, Data & Knowledge Engineering, 70(7), pp 642-660, 2011. Chen Y.T., Chen M.C., Using chi-square statistics to measure similarities for text categorization, Expert Systems with Applications, 38(4), pp. 3085-3090, 2011. Chong L., Huang M.L., Zhu X.Y., Li, M., A New Approach for Multi-Document Update Summarization , Journal of Computer Science and Technology, 25(4), pp. 739-749, 2010. Coussement, K., and Van Den Poel, D., Improving Customer Complaint Management by automatic Email Classification Using Linguistic Style Features as Predictors, Decision Support Systems, 44(4), pp. 870-882, 2008. Davenport T.H. Competing on analytics, Harvard Business Review, January 2006. Dizier P.S., and Moens M.F., Knowledge and reasoning for question answering: Research perspectives, Information Processing and Management,47(6), pp. 899–906,2011. Eirinaki M., Pisal S., Singh J., Feature-based opinion mining and ranking Original Research Article, Journal of Computer and System Sciences, In Press, November 2011. Fleuren W.W., Verhoeven S., Frijters R., Heupers B., Polman J., van Schaik R., de Vlieg J., Alkema W., CoPub update: CoPub 5.0 a text mining system to answer biological questions,Nucleic Acids Res., pp. 450-454, 2011. Ghani, R., Probst K., Liu Y., Krema M., and Fano, A., Text mining for product attribute Extraction, SIGKDD Eplorations,8(1), pp. 41-48, 2006. Jin F., Huang M.L. and Zhu X.Y., Guided Structure-Aware Review Summarization, Journal of Computer Science and Technology, 26(4), pp.676-684, 2011 Kang H., Yoo S.J., and Han D., Senti-lexicon and improved Naïve Bayes algorithms for sentiment analysis of restaurant reviews, Expert Systems with Applications, 39(5), pp. 6000-6010, 2012. Kobayashi M. and Yung R., Tracking Topic Evolution in On-Line Postings: 2006 IBM Innovation Jam Data, Lecture Notes in Computer Science Advances in Knowledge Discovery and Data Mining, 5012, pp. 616-625, 2008. Kovačevića A.,Konjovića Z., Milosavljevića B., Nenadicb G., C.,Mining methodologies from NLP publications: A case study in automatic terminology recognition, Computer Speech & Language, 26/2, pp. 105–126, 2012. Krallinger, M., Erhardt R.A.A. and Valencia A., Text-mining approaches in molecular biology and biomedicine, Drug Discovery Today, 10(6) pp. 439–445, 2005. 1124 Lee S., Kim H.J., News Keyword Extraction for Topic Tracking, Fourth International Conference on Networked Computing and Advanced Information Management, pp. 554-559, 2008. Leong C.K., Lee Y.H., Mak W.K.,Mining sentiments in SMS texts for teaching evaluation, Expert Systems with Applications, 39(3), pp. 2584-2589, 2012. Liu B. Web DataMining Exploring Hyperlinks,Contents, and Usage Data, Springer, Chicago, 2007. Liu B., Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer, Chicago, USA, 2007. Liu Y. ,Narasimhan N.,Vasudevan V., and Agichtein E., Is this urgent?: exploring time-sensitive information needs in collaborative question answering, In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pp. 712-713, 2009. Liu Y., Liu C.W., Research of fast SOM clustering for text information, Expert Systems with Applications, 38(8), pp. 9325–9333, 2011. Lu S.H., Chiang D.A., Keh H.C., Huang H.H., Chinese Text Classification by the Naïve Bayes Classifier and the Associative Classifier with Multiple Confidence Threshold Values, Knowledge-Based Systems, 23(6), pp.598604, 2010. Marx Z., Dagan I., Shamirc E., Cross-partition clustering: revealing corresponding themes across related datasets, Journal of Experimental & Theoretical Artificial Intelligence, 23(2), pp.153-180, 2011. McKnight, W.Text Data Mining in Business Intelligence, Information Management Magazine, http://www.information-management.com/issues/20050101/1016487-1.html (accessed Dec 10, 2011), 2005. Moriizumi, S., Chu, B., Cao, H., Matsukawa, H., Supply Chain Risk Driver Extraction using Text Mining Technique, Information-an International Interdisciplinary Journal, 14(6), pp. 1935-1945, 2011. Nakov, P., Schwartz A., Wolf, B. and Hearst M.A., Supporting Annotation Layers for Natural Language Processing, Proceedings of the ACL, interactive poster and demonstration sessions, Ann Arbor, MI. Association for Computational Linguistics, pp. 65-68, 2005. Pechsiri C., Kawtrakul A., Mining Causality from Texts for Question Answering System, Transactions on Information and Systems, E90-D (10), pp.1523-1533, 2007. Qiu J., Liao L. and Dong J.D., Topic Detection and Tracking for Chinese News Web Pages, International Conference on Advanced Language Processing and Web Information Technology, pp. 114 - 120, 2008. Qiu J., Liao L. and Li P., Enhancing Topic Tracking for Chinese News Web Pages with Temporal Information and Key Web Contexts , International Journal of Innovative Computing, Information and Control, 6(1), pp. 399-408, 2010. Raja U., Tretter M.J., Classification of software patches: a text mining approach, Journal of Software Maintenance and Evolution: Research and Practice, 28(2), pp. 69–87, 2011. Shatkay, H., Höglund A., Brady S., Blum T., Dönnes P., and Kohlbacher O., SherLoc: High- Accuracy Prediction of Protein subcellular Localization by Integrating Text and Protein Sequence Data, Bioinformatics, 23(11), pp. 1410-1417, 2007. Shehata S., Karray F., Kamel M.S., An Efficient Model for Enhancing Text Categorization Using Sentence Semantics, Computational Intelligence, 26(3), pp.215–231, 2010. Singal, M., Effect of consumer sentiment on hospitality expenditures and stock returns, International Journal of Hospitality Management, 31, pp.511-521, 2012. Tang H., Tan S., Cheng X., A survey on sentiment detection of reviews, Expert Systems with Applications, 36(7) pp. 10760-10773, 2009. Terol R.M., Martínez-Barco P, Palomar M., A knowledge based method for the medical question answering problem, Computers in Biology and Medicine, 37(10), pp. 1511-1521, 2007. Thorleuchter, D., Van den Poel, Dirk and Prinzie A.,Analyzing existing customers’ websites to improve the customer acquisition process as well as the profitability prediction in B-to-B marketing, Expert Systems with Applications, 39(3), pp. 2597–2605, 2012. Tilak O., Hoblitzell A., Mukhopadhyay S., You Q., Fang S., Xia Y., Bidwell J., Multilevel text mining for bone biology, Concurrency and Computation: Practice and Experience, 23/17, pp. 2355–2364, 2011. Trappey A.,Trappey C., Wu C.Y., Automatic patent document summarization for collaborative knowledge systems and services, Journal of Systems Science and Systems Engineering, 18(1), pp. 71-94, 2009. Turban E., Sharda R. and Delen D. Decision Support and Business Intelligence Systems, Prentice Hall, New Jersey, 1125 2011. Weia C.P., Linb Y.T.,Yang C.C., Cross-lingual text categorization: Conquering language boundaries in globalized environments, Information Processing & Management, 47(5), pp. 786–804, 2011. Weng, S.S., and Liu C.K., Using Text Classification and Multiple concepts to Answer Emails, Expert Systems with Applications, 26(4), pp. 529-543, 2004. Wu Q., Tan S., A two-stage framework for cross-domain sentiment classification, Expert Systems with Applications, 38(11), pp. 14269-14275, 2011. Yang C.C. and Dorbin T., Analyzing and Visualizing Web Opinion Development and Social Interactions With Density-Based Clustering, IEEE Transactions on Systems, Man, and Cybernetics—Part A: Systems and Humans, 41(6), pp.1144-1155, 2011. Zhu J., Xu C., Wang H., Sentiment classification using the theory of ANNs, The Journal of China Universities of Posts and Telecommunications, 17(1), pp. 58-62, 2010. 1126