College of Science & Technology
Dep. of Computer Science & IT
BCs of Information Technology

Data Mining
Chapter 4_2: Classification Methods (Examples)
2013

Prepared by: Mahmoud Rafeek Al-Farra
www.cst.ps/staff/mfarra

Course Outline
* Introduction
* Data Preparation and Preprocessing
* Data Representation
* Classification Methods
* Evaluation
* Clustering Methods
* Mid Exam
* Association Rules
* Knowledge Representation
* Special Case Study: Document Clustering
* Discussion of Case Studies by Students

Outline
* Comparing Classification Methods
* Machine Learning Techniques
* Decision Trees
* k-Nearest Neighbors
* Naïve Bayesian Classifiers
* Neural Networks

Comparing Classification Methods
* Predictive accuracy: ability to correctly predict the class label.
* Speed: computation costs involved in generating and using the model.
* Robustness: ability to make correct predictions given noisy and/or missing values.
* Scalability: ability to construct the model efficiently given large amounts of data.
* Interpretability: level of understanding and insight provided by the model.

Machine Learning Techniques
* Learning: things learn when they change their behavior in a way that makes them perform better in the future.
* Machine learning is the subfield of artificial intelligence concerned with the design and development of algorithms that allow computers (machines) to improve their performance over time (to learn) based on data, such as sensor data or databases.
* Examples of machine learning techniques: Decision Trees, k-Nearest Neighbors, Naïve Bayesian Classifiers.

Decision Trees
* Decision tree learning is a common method used in data mining.
* It is an efficient method for producing classifiers from data.
* A decision tree is a tree-structured plan of a set of attributes to test in order to predict the output.

A decision tree consists of:
* An internal node, which is a test on an attribute, e.g. Body temperature.
* A branch, which represents an outcome of the test, e.g. Warm.
* A leaf node, which represents a class label, e.g. Mammals.

At each node, one attribute is chosen to split the training examples into distinct classes as much as possible. A new case is classified by following a matching path to a leaf node.

Weather Data: Play or not Play?

Outlook   Temperature  Humidity  Windy  Play?
sunny     hot          high      false  No
sunny     hot          high      true   No
overcast  hot          high      false  Yes
rain      mild         high      false  Yes
rain      cool         normal    false  Yes
rain      cool         normal    true   No
overcast  cool         normal    true   Yes
sunny     mild         high      false  No
sunny     cool         normal    false  Yes
rain      mild         normal    false  Yes
sunny     mild         normal    true   Yes
overcast  mild         high      true   Yes
overcast  hot          normal    false  Yes
rain      mild         high      true   No

Case Study: How To Build a Tree?

Outlook
  sunny    -> Humidity
                high   -> No
                normal -> Yes
  overcast -> Yes
  rain     -> Windy
                true   -> No
                false  -> Yes

How To Build a Tree?
* Top-down tree construction
* Which is the best attribute?

Next
* k-Nearest Neighbors
* Naïve Bayesian Classifiers

Thanks
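Worked example (a sketch, not part of the original slides): the case-study tree for the weather data (Outlook at the root, Humidity tested under sunny, Windy tested under rain) can be written down and checked against the 14 records. The slides ask "Which is the best attribute?" without naming a selection criterion; the sketch assumes information gain, the measure used by ID3, as one common choice. The names `WEATHER`, `TREE`, `classify`, `entropy` and `info_gain` are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from math import log2

ATTRS = ("outlook", "temperature", "humidity", "windy")

# The 14 weather records from the slides (capitalisation of "High" normalised).
RAW = """
sunny hot high false No
sunny hot high true No
overcast hot high false Yes
rain mild high false Yes
rain cool normal false Yes
rain cool normal true No
overcast cool normal true Yes
sunny mild high false No
sunny cool normal false Yes
rain mild normal false Yes
sunny mild normal true Yes
overcast mild high true Yes
overcast hot normal false Yes
rain mild high true No
"""
WEATHER = [dict(zip(ATTRS + ("play",), line.split()))
           for line in RAW.strip().splitlines()]

# Internal node = (attribute to test, {outcome: subtree}); leaf = class label.
TREE = ("outlook", {
    "sunny": ("humidity", {"high": "No", "normal": "Yes"}),
    "overcast": "Yes",
    "rain": ("windy", {"true": "No", "false": "Yes"}),
})


def classify(tree, case):
    """Follow the branch matching the case at each test until a leaf is reached."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[case[attribute]]
    return tree


def entropy(rows):
    """Entropy (in bits) of the Play? label over the given rows."""
    counts = Counter(r["play"] for r in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def info_gain(rows, attribute):
    """Assumed criterion: reduction in label entropy after splitting on attribute."""
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


correct = sum(classify(TREE, r) == r["play"] for r in WEATHER)
print(f"{correct}/{len(WEATHER)} training records classified correctly")
for a in ATTRS:
    print(f"{a:12s} gain = {info_gain(WEATHER, a):.3f}")
```

Under this assumed criterion, Outlook has the largest gain on the full table (about 0.247 bits), which agrees with Outlook being the root of the case-study tree. Top-down construction would then repeat the same comparison on the sunny and rain subsets.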