Classification and Decision Trees
Iza Moise, Evangelos Pournaras

Overview
• Classification
• Decision Trees

Classification

Definition
Classification is a data mining function that assigns items in a collection to target categories or classes. The goal is to accurately predict the target class for each data point.
• Supervised
• Outcome → class

Types of Classification
• binary classification → the target attribute has only two values
• multi-class classification → the target attribute has more than two values
• crisp classification → given an input, the classifier returns a single label
• probabilistic classification → given an input, the classifier returns the probability of belonging to each class

Applications
Classification example: spam filtering, i.e. classify an email as “Spam” or “Not Spam”.
[Figure from Machine Learning: CS 6375 Introduction, Instructor: Vibhav Gogate, The University of Texas at Dallas]

Applications [cont.]
Classification example: weather prediction.
[Figure from Machine Learning: CS 6375 Introduction, Instructor: Vibhav Gogate, The University of Texas at Dallas]

Applications [cont.]
• Customer Target Marketing
• Medical Disease Diagnosis
• Supervised Event Detection
• Multimedia Data Analysis
• Document Categorization and Filtering
• Social Network Analysis

A Three-Phase Process
1. Training phase: a model is constructed from the training instances.
   → the classification algorithm finds relationships between predictors and targets
   → these relationships are summarised in a model
   → the model is trained on data with known labels (training data)
2. Testing phase: test the model on a test sample whose class labels are known but were not used for training the model (testing data)
3. Usage phase: use the model for classification on new data whose class labels are unknown (new data)

Training Phase - Model Construction
[Figure from Data Warehousing and Data Mining, Instructor: Prof. Hany Saleeb]

Testing Phase - Model Usage
[Figure from Data Warehousing and Data Mining, Instructor: Prof. Hany Saleeb]

Accuracy of the model
• the known label of each test sample is compared with the class predicted by the model
• accuracy rate = % of test samples correctly classified
• the test set is independent of the training set (to avoid over-fitting)
A code sketch of this train/test/use workflow follows the list of methods below.

Methods of classification
• Decision Trees
• k-Nearest Neighbours
• Neural Networks
• Logistic Regression
• Linear Discriminant Analysis
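The three-phase process and the accuracy rate above can be made concrete with a short Python sketch. scikit-learn, the Iris dataset, and the 70/30 split are illustrative assumptions, not anything prescribed by these slides.

    # Minimal sketch of the three-phase process (train / test / use) and the
    # accuracy rate. scikit-learn and the Iris data are stand-ins, not part of
    # the original slides.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)

    # Keep the test set independent of the training set (to avoid over-fitting).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    # 1. Training phase: build the model from instances with known labels.
    model = DecisionTreeClassifier()
    model.fit(X_train, y_train)

    # 2. Testing phase: compare known labels with predicted labels;
    #    accuracy rate = % of test samples correctly classified.
    y_pred = model.predict(X_test)
    print("accuracy rate:", accuracy_score(y_test, y_pred))

    # 3. Usage phase: classify new data whose class label is unknown.
    new_sample = [[5.0, 3.4, 1.6, 0.4]]
    print("predicted class:", model.predict(new_sample))
    # A probabilistic classifier returns class membership probabilities instead
    # of a single crisp label:
    print("class probabilities:", model.predict_proba(new_sample))

Any of the classifiers listed under “Methods of classification” could be substituted for the decision tree here; only the model construction line changes.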
Decision Trees

Main principles
A decision tree creates a hierarchical partitioning of the data which relates the different partitions at the leaf level to the different classes.
Data requirements:
• Attribute-value description: each object is expressible in terms of a fixed collection of properties or attributes (e.g., hot, mild, cold).
• Predefined classes (target values): the target function has discrete output values (boolean or multi-class).
• Sufficient data: enough training cases should be provided to learn the model.

Main principles [cont.]
• decision node = test on an attribute
• branch = an outcome of the test
• leaf node = classification or decision
• root = the best predictor
• path = a conjunction of tests leading to the final decision
Classification of a new instance is done by following a matching path from the root to a leaf node.
[Figure from Dr. Saed Sayad, adjunct Professor at the University of Toronto]

Split criterion
A condition (or predicate) on:
• a single attribute → univariate split
• multiple attributes → multivariate split
The training data are split recursively. The goal is to maximize the information gain (the discrimination among the classes), i.e. how well an attribute separates the examples according to their target classification. A code sketch of this computation is given at the end of the deck.

How to build a decision tree?
Top-down tree construction:
• all training data start at the root
• data are partitioned recursively based on selected attributes
• conditions for stopping the partitioning:
  • all samples for a given node belong to the same class
  • there are no remaining attributes for further partitioning
  • there are no samples left
Bottom-up tree pruning:
• remove subtrees or branches, in a bottom-up manner, to improve the estimated accuracy on new cases

Pros and Cons
Pros:
✓ simple to understand and interpret
✓ little data preparation and little computation
✓ indicates which attributes are most important for classification
Cons:
✗ learning an optimal decision tree is NP-complete
✗ perform poorly with many classes and small data
✗ computationally expensive to train
✗ over-complex trees do not generalise well from the training data (overfitting)

What’s next?
• k-Nearest Neighbours
• Clustering
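To make the split criterion concrete, here is a minimal, self-contained Python sketch of entropy and information gain for a univariate split on a categorical attribute. The toy weather-style records, attribute names, and the “play” target are illustrative assumptions, not data from the slides.

    # Minimal sketch of the split criterion: entropy and information gain for a
    # univariate split. The toy "weather" records are illustrative assumptions.
    from collections import Counter
    from math import log2

    def entropy(labels):
        """Shannon entropy of a list of class labels."""
        total = len(labels)
        return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

    def information_gain(records, attribute, target="play"):
        """Entropy reduction achieved by splitting the records on one attribute."""
        labels = [r[target] for r in records]
        before = entropy(labels)
        after = 0.0
        for value in set(r[attribute] for r in records):
            subset = [r[target] for r in records if r[attribute] == value]
            after += len(subset) / len(records) * entropy(subset)
        return before - after

    # Toy training data: predict "play" from weather attributes.
    records = [
        {"outlook": "sunny",    "temp": "hot",  "play": "no"},
        {"outlook": "sunny",    "temp": "mild", "play": "no"},
        {"outlook": "rainy",    "temp": "mild", "play": "yes"},
        {"outlook": "rainy",    "temp": "cool", "play": "yes"},
        {"outlook": "overcast", "temp": "hot",  "play": "yes"},
        {"outlook": "overcast", "temp": "cool", "play": "yes"},
    ]

    for attribute in ("outlook", "temp"):
        print(attribute, round(information_gain(records, attribute), 3))

Top-down construction would place the attribute with the highest gain (here, outlook) at the current decision node, create one branch per attribute value, and recurse on each branch until one of the stopping conditions above holds.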