CSC 594 Topics in AI – Text Mining and Analytics
Fall 2015/16
9. Text Categorization

Text Categorization
• Text categorization classifies texts into predefined target categories.
• The techniques for text categorization are the same as those used in data mining, such as decision trees and KNN.
• With texts, however, the data transformation step is the key: selecting features that are relevant to the target category.

Slides 3–4 (source: Coursera, Text Mining and Analytics, ChengXiang Zhai)

Naïve Bayes Classifier
• Naïve Bayes classification (a generative classifier) has been shown to work well for text categorization.
• Naïve Bayes also scales very well to large data sets.

Slides 6–11 (source: Machine Learning, by Tom Mitchell)

Slide 12 (source: Coursera, Text Mining and Analytics, ChengXiang Zhai)

Parameter Estimation for Logistic Regression
Slide 13 (source: Coursera, Text Mining and Analytics, ChengXiang Zhai)

Slides 14–17 (source: Coursera, Text Mining and Analytics, ChengXiang Zhai)

SVM (cont.)
More information on SVM, including the kernel functions (my old CSC 578 notes):
http://condor.depaul.edu/ntomuro/courses/578/notes/SVMoverview.pdf

Slides 19–20 (source: Coursera, Text Mining and Analytics, ChengXiang Zhai)
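
To make the Naïve Bayes discussion in these notes more concrete, here is a minimal sketch of a text categorizer. It assumes a multinomial Naïve Bayes model over bag-of-words counts with add-one (Laplace) smoothing. The notes do not state which variant the course uses, so that choice, the whitespace tokenizer, the function names, and the toy training documents are all illustrative assumptions rather than course material.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple feature extractor)."""
    return text.lower().split()


def train_naive_bayes(docs):
    """Estimate model statistics from a list of (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    n_docs = sum(label_counts.values())
    log_priors = {c: math.log(k / n_docs) for c, k in label_counts.items()}
    totals = {c: sum(word_counts[c].values()) for c in label_counts}
    return log_priors, word_counts, totals, vocab


def classify(text, model):
    """Return the label with the highest log posterior score."""
    log_priors, word_counts, totals, vocab = model
    best_label, best_score = None, -math.inf
    for label, log_prior in log_priors.items():
        score = log_prior
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing keeps unseen (word, label) pairs from zeroing the score.
            score += math.log(
                (word_counts[label][word] + 1) / (totals[label] + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set, for illustration only.
train = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project meeting notes attached", "ham"),
]
model = train_naive_bayes(train)
print(classify("buy cheap pills", model))         # spam
print(classify("monday project meeting", model))  # ham
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the tokenizer is the place where a more careful feature selection step would go.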