Classification and Decision Trees
Iza Moise, Evangelos Pournaras, Dirk Helbing

Overview
• Classification
• Decision Trees

Classification

Definition
Classification is a data mining function that assigns items in a collection to target categories or classes. The goal is to accurately predict the target class for each case in the data.
• Supervised
• Outcome → class

Types of Classification
• binary classification → the target attribute has only two values
• multi-class classification → the target attribute has more than two values
• crisp classification → given an input, the classifier returns its label
• probabilistic classification → given an input, the classifier returns its probability of belonging to each class

Applications
Classification Example: Spam Filtering → classify e-mail as “Spam” or “Not Spam” [1]
Classification Example: Weather Prediction [1]

[1] Machine Learning: CS 6375 Introduction, Instructor: Vibhav Gogate, The University of Texas at Dallas
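The crisp vs. probabilistic distinction can be sketched in a few lines of Python. The spam score, threshold, and logistic squashing below are made-up illustrations, not a real spam filter:

```python
import math

def predict_proba(score, threshold=0.5, steepness=4.0):
    # Probabilistic classifier: map a (hypothetical) spam score
    # to a probability for each class via a logistic function.
    p_spam = 1.0 / (1.0 + math.exp(-steepness * (score - threshold)))
    return {"spam": p_spam, "not spam": 1.0 - p_spam}

def predict(score):
    # Crisp classifier: return only the most probable label.
    probs = predict_proba(score)
    return max(probs, key=probs.get)

print(predict_proba(0.9))  # probabilities for both classes
print(predict(0.9))        # a single label: "spam"
```

The crisp output discards the confidence information that the probabilistic output keeps, which matters when misclassification costs differ between classes.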
Applications [cont.]
• Customer Target Marketing
• Medical Disease Diagnosis
• Supervised Event Detection
• Multimedia Data Analysis
• Document Categorization and Filtering
• Social Network Analysis

A Three-Phase Process
1. Training phase: a model is constructed from the training instances.
   → the classification algorithm finds relationships between predictors and targets
   → these relationships are summarised in a model
2. Testing phase: the model is tested on a test sample whose class labels are known but were not used for training.
3. Usage phase: the model is used to classify new data whose class labels are unknown.

Training Phase - Model Construction (figure)
Testing Phase - Model Usage (figure)
(Figures: Data Warehousing and Data Mining, Instructor: Prof. Hany Saleeb)

Methods of classification
• Decision Trees
• k-Nearest Neighbours
• Neural Networks
• Logistic Regression
• Linear Discriminant Analysis

Decision Trees

Main principles
A decision tree creates a hierarchical partitioning of the data which relates the different partitions at the leaf level to the different classes.
Data requirements:
• Attribute-value description: each object is expressible in terms of a fixed collection of properties or attributes (e.g., hot, mild, cold).
• Predefined classes (target values): the target function has discrete output values (boolean or multi-class).
• Sufficient data: enough training cases must be provided to learn the model.
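As a concrete illustration of the three training/testing/usage phases, here is a minimal sketch using a stand-in nearest-centroid classifier; the data points and labels are invented for the example:

```python
def train(samples):
    # Training phase: summarise the relationship between predictors
    # and target as a model -- here, one mean vector per class.
    by_class = {}
    for features, label in samples:
        by_class.setdefault(label, []).append(features)
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in by_class.items()}

def classify(model, features):
    # Predict the class whose centroid is closest to the instance.
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist(model[label]))

# 1. Training phase: build the model from labelled instances.
training_data = [((1.0, 1.0), "A"), ((1.2, 0.9), "A"),
                 ((5.0, 5.0), "B"), ((4.8, 5.2), "B")]
model = train(training_data)

# 2. Testing phase: labels known, but not used for training.
test_data = [((1.1, 1.1), "A"), ((5.1, 4.9), "B")]
accuracy = sum(classify(model, x) == y for x, y in test_data) / len(test_data)

# 3. Usage phase: classify a new case whose label is unknown.
print(classify(model, (0.9, 1.3)))  # -> "A"
```

The same three phases apply unchanged when the stand-in classifier is replaced by any of the methods listed above.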
Main principles [cont.]
• decision node = a test on an attribute
• branch = an outcome of the test
• leaf node = a classification or decision
• root = the topmost decision node
• path = a conjunction of tests leading to the final decision

Classification of new instances is done by following a matching path from the root to a leaf node.
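A matching-path classification can be written directly as nested attribute tests; the weather attributes and labels below are hypothetical, in the style of the classic play-tennis example:

```python
# A hand-built decision tree for a toy "play outside?" decision.
# Each decision node tests one attribute; each leaf is a class label.
# Classifying a new instance = following one root-to-leaf path.

def classify(instance):
    # root: test on "outlook"
    if instance["outlook"] == "sunny":
        # decision node: test on "humidity"
        if instance["humidity"] == "high":
            return "no"       # leaf
        return "yes"          # leaf
    elif instance["outlook"] == "rain":
        # decision node: test on "wind"
        if instance["wind"] == "strong":
            return "no"       # leaf
        return "yes"          # leaf
    return "yes"              # leaf: outlook == "overcast"

print(classify({"outlook": "sunny", "humidity": "high", "wind": "weak"}))  # -> "no"
```

Each root-to-leaf path (e.g. outlook = sunny AND humidity = high → no) is one conjunction of tests, which is why the tree as a whole is easy to read off as a rule set.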
(Figure: Dr. Saed Sayad, Adjunct Professor at the University of Toronto)

Split criterion
A condition (or predicate) on:
• a single attribute → univariate split
• multiple attributes → multivariate split
→ Recursively split the training data.
→ Goal: maximise the information gain, i.e. how well an attribute separates the examples according to their target classification.

How to build a decision tree?
Top-down tree construction:
• all training data start at the root
• data are partitioned recursively based on selected attributes
Bottom-up tree pruning:
• remove subtrees or branches, in a bottom-up manner, to improve the estimated accuracy on new cases
Conditions for stopping the partitioning:
• all samples at a given node belong to the same class
• there are no remaining attributes for further partitioning
• there are no samples left

Pros and Cons
Pros:
✓ simple to understand and interpret
✓ little data preparation and little computation
✓ indicates which attributes are most important for classification
Cons:
✗ learning an optimal decision tree is NP-complete
✗ performs poorly with many classes and small data sets
✗ computationally expensive to train
✗ over-complex trees do not generalise well from the training data (overfitting)

What’s next?
• k-Nearest Neighbours
• Clustering
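To make the information-gain split criterion concrete, here is a minimal entropy-based sketch; the weather instances are made up for the example:

```python
import math
from collections import Counter

def entropy(labels):
    # Entropy of a list of class labels, in bits.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr):
    # rows: list of (attributes, label) pairs.
    # Gain = entropy before the split minus the weighted average
    # entropy of the partitions the split produces.
    labels = [label for _, label in rows]
    partitions = {}
    for attrs, label in rows:
        partitions.setdefault(attrs[attr], []).append(label)
    remainder = sum(len(part) / len(rows) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

# Hypothetical weather data: (attributes, play?) pairs.
data = [({"outlook": "sunny",    "wind": "weak"},   "no"),
        ({"outlook": "sunny",    "wind": "strong"}, "no"),
        ({"outlook": "rain",     "wind": "weak"},   "yes"),
        ({"outlook": "rain",     "wind": "strong"}, "no"),
        ({"outlook": "overcast", "wind": "weak"},   "yes")]

# Top-down construction picks the attribute with the highest gain
# as the next univariate split and then recurses on each partition.
best = max(["outlook", "wind"], key=lambda a: information_gain(data, a))
print(best)  # -> "outlook"
```

On this toy data, splitting on outlook produces purer partitions than splitting on wind, so outlook would become the root test.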