CES 514 Data Mining, Spring 2010
Homework #4
Due: April 14, 2010

1) Consider the training examples shown in the table below for a binary classification problem (the last column being the class).

(a) What is the entropy of this collection of training examples?
(b) What are the information gains of the attributes Gender and Car Type?
(c) What are the gain ratios of Gender and Shirt Size?
(d) Apply the Naïve Bayes algorithm to determine the probability that the data point shown below belongs to class C1:

    ID    Gender    Car Type    Shirt Size    Class
    21    F         Family      Large         ?

(e) Apply the decision tree built using the algorithm presented in class to determine the class to which the above data point belongs.

2) You are given a data set that contains various attributes. The last column contains the classification (0 or 1).

(a) Using Weka, create a decision tree based on 80% of the data points and apply it to the remaining 20% of the data points. Report the success percentage achieved. (The data set can be found in the file hw4problem2dataset.txt in the Home Work directory.)
(b) For the same data set as in Problem 2(a), apply the Naïve Bayes algorithm. Use the same 80% of the data points for training and determine, for each of the remaining 20%, the probability of being classified as 0 or 1. Finally, calculate the success percentage achieved.
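
The quantities asked for in Problem 1(a) to (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four rows in `TOY_ROWS` are hypothetical and are not the training table from the assignment, and the function names are made up for this sketch. Entropy is measured in bits (log base 2), and gain ratio is taken as information gain divided by the split information of the attribute, which may differ in detail from the variant presented in class.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr, target="Class"):
    """Entropy of the target minus the weighted entropy after splitting on attr."""
    # Group the class labels by the value each row takes on the attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def gain_ratio(rows, attr, target="Class"):
    """Information gain divided by split information (entropy of attr's values)."""
    split_info = entropy([row[attr] for row in rows])
    if split_info == 0:
        return 0.0  # attribute takes a single value, so it cannot split the data
    return information_gain(rows, attr, target) / split_info


# Hypothetical toy data, NOT the table from the assignment.
TOY_ROWS = [
    {"Gender": "M", "Car Type": "Sports", "Class": "C0"},
    {"Gender": "M", "Car Type": "Family", "Class": "C0"},
    {"Gender": "F", "Car Type": "Sports", "Class": "C1"},
    {"Gender": "F", "Car Type": "Family", "Class": "C1"},
]

if __name__ == "__main__":
    print("entropy:", entropy([r["Class"] for r in TOY_ROWS]))
    for attribute in ("Gender", "Car Type"):
        print(attribute, information_gain(TOY_ROWS, attribute),
              gain_ratio(TOY_ROWS, attribute))
```

In this toy data Gender separates the two classes perfectly while Car Type tells nothing about the class, so the two attributes sit at opposite ends of the information-gain scale. Replacing `TOY_ROWS` with the actual training table would give the values the problem asks for.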