Data Mining and Knowledge Discovery (KSE525)
Assignment #3 (April 19, 2012)

1. [10 points] Build the decision tree for the following relational table. The last attribute is the class label. Use the information gain for attribute selection. Let's assume that multi-way split is always the best. You need to explain how you calculated the information gain in detail.

   ID code  Outlook   Temperature  Humidity  Windy  Play
   a        Sunny     Hot          High      False  No
   b        Sunny     Hot          High      True   No
   c        Overcast  Hot          High      False  Yes
   d        Rainy     Mild         High      False  Yes
   e        Rainy     Cool         Normal    False  Yes
   f        Rainy     Cool         Normal    True   No
   g        Overcast  Cool         Normal    True   Yes
   h        Sunny     Mild         High      False  No
   i        Sunny     Cool         Normal    False  Yes
   j        Rainy     Mild         Normal    False  Yes
   k        Sunny     Mild         Normal    True   Yes
   l        Overcast  Mild         High      True   Yes
   m        Overcast  Hot          Normal    False  Yes
   n        Rainy     Mild         High      True   No

2. [6 points] Classification can be used for automatic speech recognition, which is one of the main features of Apple Siri. Discuss what the class label is in this type of application. Then, briefly explain what classification techniques can be used for developing the application.

3. [6 points] Discuss the advantages and disadvantages of lazy classification (e.g., k-nearest neighbor classification) in comparison with eager classification.

4. [8 points] A notable problem of the information gain is that it prefers attributes with a large number of distinct values. Explain why the information gain suffers from the problem and why the gain ratio or Gini index does not.

5. [20 points] Download and install Weka (explained in class). Then, build the decision tree using J48 (C4.5) for the Wine data set in the UCI machine learning repository. Notice that you need to modify the format of the original data file as required by Weka. Copy and paste the text representation of the decision tree.
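
The information gain and gain ratio that questions 1 and 4 refer to can be checked with a short script. The following is only a minimal sketch, not part of the original assignment: it assumes the standard definitions (entropy in bits, gain as the parent entropy minus the weighted entropy of the partitions, gain ratio as gain divided by split information), and the names `ROWS`, `entropy`, `partition`, `information_gain`, and `gain_ratio` are hypothetical helpers introduced for illustration.

```python
from collections import Counter
from math import log2

# Column names and rows transcribed from the table in question 1.
ATTRIBUTES = ["ID code", "Outlook", "Temperature", "Humidity", "Windy", "Play"]
ROWS = [
    ("a", "Sunny", "Hot", "High", "False", "No"),
    ("b", "Sunny", "Hot", "High", "True", "No"),
    ("c", "Overcast", "Hot", "High", "False", "Yes"),
    ("d", "Rainy", "Mild", "High", "False", "Yes"),
    ("e", "Rainy", "Cool", "Normal", "False", "Yes"),
    ("f", "Rainy", "Cool", "Normal", "True", "No"),
    ("g", "Overcast", "Cool", "Normal", "True", "Yes"),
    ("h", "Sunny", "Mild", "High", "False", "No"),
    ("i", "Sunny", "Cool", "Normal", "False", "Yes"),
    ("j", "Rainy", "Mild", "Normal", "False", "Yes"),
    ("k", "Sunny", "Mild", "Normal", "True", "Yes"),
    ("l", "Overcast", "Mild", "High", "True", "Yes"),
    ("m", "Overcast", "Hot", "Normal", "False", "Yes"),
    ("n", "Rainy", "Mild", "High", "True", "No"),
]


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def partition(rows, attr_index):
    """Group the class labels (last column) by the value of one attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[-1])
    return groups


def information_gain(rows, attr_index):
    """Parent entropy minus the size-weighted entropy of a multi-way split."""
    labels = [row[-1] for row in rows]
    groups = partition(rows, attr_index)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(rows, attr_index):
    """Information gain normalized by the split information of the attribute."""
    split_info = entropy([row[attr_index] for row in rows])
    return information_gain(rows, attr_index) / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    for i, name in enumerate(ATTRIBUTES[:-1]):
        gain = information_gain(ROWS, i)
        ratio = gain_ratio(ROWS, i)
        print(f"{name:12s} gain={gain:.3f}  gain ratio={ratio:.3f}")
```

Running the script prints one line per attribute. Applying `information_gain` again to the rows of each partition would give the gains needed at the next level of the tree. The written explanation of each calculation that question 1 asks for still has to be worked out by hand.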