Classification and Decision Trees
Iza Moise, Evangelos Pournaras

Overview
• Classification
• Decision Trees

Classification

Definition
Classification is a data mining function that assigns items in a collection to target categories or classes. The goal is to accurately predict the target class for each data point.
• Supervised learning
• Outcome → class

Types of Classification
• binary classification → the target attribute has only two values
• multi-class classification → the target attribute has more than two values
• crisp classification → given an input, the classifier returns its label
• probabilistic classification → given an input, the classifier returns the probability of belonging to each class

Applications
• Spam filtering: classify e-mail as “Spam” or “Not Spam” [1]
• Weather prediction [2]

[1, 2] Machine Learning: CS 6375 Introduction, Instructor: Vibhav Gogate, The University of Texas at Dallas

Applications [cont.]
• Customer Target Marketing
• Medical Disease Diagnosis
• Supervised Event Detection
• Multimedia Data Analysis
• Document Categorization and Filtering
• Social Network Analysis

A Three-Phase Process
1. Training phase: a model is constructed from the training instances.
   → the classification algorithm finds relationships between predictors and targets
   → these relationships are summarised in a model
   → the model is trained on data with known labels (training data)
2. Testing phase: test the model on a sample whose class labels are known but were not used for training (testing data)
3. Usage phase: use the model to classify new data whose class labels are unknown (new data)

Training Phase – Model Construction / Testing Phase – Model Usage
[Figures omitted; source: Data Warehousing and Data Mining, Instructor: Prof. Hany Saleeb]

Accuracy of the Model
• compare the known label of each test sample with the label predicted by the model
• accuracy rate = % of test samples correctly classified
• the test set is independent of the training set (to avoid over-fitting)

Methods of Classification
• Decision Trees
• k-Nearest Neighbours
• Neural Networks
• Logistic Regression
• Linear Discriminant Analysis

Decision Trees

Main principles
A decision tree creates a hierarchical partitioning of the data which relates the partitions at the leaf level to the different classes.

Data requirements:
• Attribute–value description: each object is expressible in terms of a fixed collection of properties or attributes (e.g., hot, mild, cold).
• Predefined classes (target values): the target function has discrete output values (boolean or multi-class).
• Sufficient data: enough training cases must be provided to learn the model.
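The three-phase process and accuracy rate described above can be sketched with a toy 1-nearest-neighbour classifier (purely illustrative; the data points and labels are made up, and any of the listed methods could stand in for the model):

```python
def nearest_neighbour(train, query):
    """Classify `query` with the label of its closest training point (1-NN)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(train, key=lambda pl: sq_dist(pl[0], query))
    return label

# 1. Training phase: here the "model" is just the stored labelled data plus the 1-NN rule.
train = [((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((5.0, 5.0), "B")]

# 2. Testing phase: labels are known but were held out from training.
test = [((0.9, 1.1), "A"), ((4.8, 5.2), "B")]
correct = sum(nearest_neighbour(train, x) == y for x, y in test)
accuracy = correct / len(test)  # accuracy rate = fraction correctly classified

# 3. Usage phase: classify new data whose label is unknown.
print(nearest_neighbour(train, (4.0, 4.5)))  # → B
```

Keeping the test set disjoint from the training set is what makes `accuracy` an estimate of performance on new cases rather than a measure of memorisation.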
Main principles [cont.]
• decision node = a test on an attribute
• branch = an outcome of the test
• leaf node = a classification or decision
• root = the best predictor
• path = a conjunction of tests leading to the final decision

Classification of a new instance is done by following the matching path from the root to a leaf node.
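The node/branch/leaf vocabulary above maps directly onto a nested data structure: a decision node pairs an attribute with its branches, a leaf is a class label, and classifying means following one path from the root. A minimal sketch (the weather attributes and labels are invented for illustration):

```python
# A decision node is an (attribute, {outcome: subtree}) pair; a leaf is a class label.
tree = ("outlook", {
    "sunny":    ("humidity", {"high": "no", "normal": "yes"}),  # nested decision node
    "overcast": "yes",                                          # leaf node
    "rainy":    ("wind", {"strong": "no", "weak": "yes"}),
})

def classify(node, instance):
    """Follow the matching path from the root to a leaf node."""
    while isinstance(node, tuple):            # still at a decision node
        attribute, branches = node
        node = branches[instance[attribute]]  # take the branch for this outcome
    return node                               # reached a leaf = the decision

print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # → yes
```

Here `outlook` plays the role of the root (the best predictor), and each call traverses exactly one root-to-leaf path.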
[Figure omitted; source: Dr. Saed Sayad, adjunct Professor at the University of Toronto]

Split criterion
A condition (or predicate) on:
• a single attribute → univariate split
• multiple attributes → multivariate split

→ recursively split the training data
→ goal: maximise the information gain (the discrimination among the classes), i.e., how well an attribute separates the examples according to their target classification

How to build a decision tree?
Top-down tree construction:
• all training data start at the root
• data are partitioned recursively based on selected attributes
Bottom-up tree pruning:
• remove subtrees or branches, in a bottom-up manner, to improve the estimated accuracy on new cases
Conditions for stopping the partitioning:
• all samples at a given node belong to the same class
• there are no remaining attributes for further partitioning
• there are no samples left

Pros and Cons
Pros:
✓ simple to understand and interpret
✓ little data preparation and little computation
✓ indicates which attributes are most important for classification
Cons:
✗ learning an optimal decision tree is NP-complete
✗ performs poorly with many classes and small data
✗ computationally expensive to train
✗ over-complex trees do not generalise well from the training data (overfitting)

What’s next?
• k-Nearest Neighbours
• Clustering
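The top-down construction and information-gain split criterion described above can be sketched as a minimal ID3-style builder. This is an illustrative sketch only: the play-tennis-style rows are made up, and a real implementation would also handle continuous attributes, missing values, and the bottom-up pruning step.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, target):
    """Entropy reduction achieved by a univariate split of `rows` on `attr`."""
    before = entropy([r[target] for r in rows])
    after = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after

def build_tree(rows, attrs, target):
    """Top-down construction: split recursively on the highest-gain attribute."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:                  # all samples in one class -> leaf
        return labels[0]
    if not attrs:                              # no attributes left -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    branches = {}
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        branches[value] = build_tree(subset, [a for a in attrs if a != best], target)
    return (best, branches)

data = [
    {"outlook": "sunny",    "windy": "no",  "play": "no"},
    {"outlook": "sunny",    "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no",  "play": "yes"},
    {"outlook": "rainy",    "windy": "no",  "play": "yes"},
    {"outlook": "rainy",    "windy": "yes", "play": "no"},
]
tree = build_tree(data, ["outlook", "windy"], "play")
print(tree)  # nested (attribute, branches) tuples with class labels at the leaves
```

On this toy data `outlook` has the larger information gain, so it becomes the root; the recursion then stops at each branch as soon as one of the slide's stopping conditions holds (pure node, or no attributes left).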