CE417 - Data Mining Course
Homework Assignment #2
Fall 2007
1. What is the essential difference between association rules and decision rules?
2. Consider the data set shown in the table below. The goal is to develop association rules using the
Apriori algorithm to predict when a certain (evidently indoor) game may be played.
a. Using 75% minimum confidence and 20% minimum support, generate one-antecedent
association rules for predicting play.
b. Using 75% minimum confidence and 20% minimum support, generate two-antecedent
association rules for predicting play.
c. Multiply the observed support by the confidence for each of the rules in parts a and b, and
rank them in a table.
No.  Outlook   Temperature  Humidity  Windy  Play
1    sunny     hot          high      FALSE  no
2    sunny     hot          high      TRUE   no
3    overcast  hot          high      FALSE  yes
4    rain      mild         high      FALSE  yes
5    rain      cool         normal    FALSE  yes
6    rain      cool         normal    TRUE   no
7    overcast  cool         normal    TRUE   yes
8    sunny     mild         high      FALSE  no
9    sunny     cool         normal    FALSE  yes
10   rain      mild         normal    FALSE  yes
11   sunny     mild         normal    TRUE   yes
12   overcast  mild         high      TRUE   yes
13   overcast  hot          normal    FALSE  yes
14   rain      mild         high      TRUE   no
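The support and confidence checks in question 2 can be sketched in Python as follows. This is a minimal illustration, not a full Apriori implementation: it assumes that the support of a rule A => B is the fraction of records containing both A and B, and that its confidence is that count divided by the number of records containing A. The names ROWS, ATTRIBUTES, and one_antecedent_rules are hypothetical and do not come from the assignment.

```python
from collections import Counter

# Weather table from question 2: (outlook, temperature, humidity, windy, play)
ROWS = [
    ("sunny", "hot", "high", False, "no"),
    ("sunny", "hot", "high", True, "no"),
    ("overcast", "hot", "high", False, "yes"),
    ("rain", "mild", "high", False, "yes"),
    ("rain", "cool", "normal", False, "yes"),
    ("rain", "cool", "normal", True, "no"),
    ("overcast", "cool", "normal", True, "yes"),
    ("sunny", "mild", "high", False, "no"),
    ("sunny", "cool", "normal", False, "yes"),
    ("rain", "mild", "normal", False, "yes"),
    ("sunny", "mild", "normal", True, "yes"),
    ("overcast", "mild", "high", True, "yes"),
    ("overcast", "hot", "normal", False, "yes"),
    ("rain", "mild", "high", True, "no"),
]
ATTRIBUTES = ("outlook", "temperature", "humidity", "windy")


def one_antecedent_rules(rows, min_support=0.20, min_confidence=0.75):
    """Return (attribute, value, play, support, confidence) for passing rules."""
    n = len(rows)
    antecedent_counts = Counter()  # records containing attribute=value
    rule_counts = Counter()        # records containing attribute=value and play
    for row in rows:
        play = row[-1]
        # zip stops after the four attributes, so the play column is skipped
        for name, value in zip(ATTRIBUTES, row):
            antecedent_counts[(name, value)] += 1
            rule_counts[(name, value, play)] += 1

    rules = []
    for (name, value, play), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[(name, value)]
        if support >= min_support and confidence >= min_confidence:
            rules.append((name, value, play, support, confidence))
    return rules


if __name__ == "__main__":
    for name, value, play, support, confidence in one_antecedent_rules(ROWS):
        print(f"{name}={value} => play={play}  "
              f"support={support:.3f}  confidence={confidence:.3f}")
```

Two-antecedent rules can be counted the same way by iterating over pairs of attributes instead of single attributes.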
3. In a neural network algorithm:
a. Should we prefer a large hidden layer or a small one? Describe the benefits and drawbacks of
each.
b. Describe the benefits and drawbacks of using large or small values for the learning rate and
momentum term.
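To make the roles of the two parameters in question 3b concrete, here is a minimal sketch of a gradient-descent weight update with a momentum term. It assumes the common form in which the new weight change is the learning rate times the negative gradient plus the momentum times the previous weight change; the course text may use different notation. The function names and the toy error function E(w) = w^2 are hypothetical and chosen only for illustration.

```python
def momentum_step(weight, previous_change, gradient, learning_rate, momentum):
    """One update: change = -learning_rate * gradient + momentum * previous_change."""
    change = -learning_rate * gradient + momentum * previous_change
    return weight + change, change


def minimize(start, learning_rate, momentum, steps):
    """Minimize the toy error E(w) = w**2, whose gradient is 2*w."""
    weight, change = start, 0.0
    for _ in range(steps):
        gradient = 2.0 * weight
        weight, change = momentum_step(weight, change, gradient,
                                       learning_rate, momentum)
    return weight


if __name__ == "__main__":
    # Try different settings and compare how far the weight is from 0.
    for lr, alpha in [(0.01, 0.0), (0.1, 0.0), (0.1, 0.9), (1.1, 0.0)]:
        print(f"learning_rate={lr}, momentum={alpha}: "
              f"w after 50 steps = {minimize(5.0, lr, alpha, 50):.6g}")
```

Running the loop with several settings shows the trade-off the question asks about: small values move slowly toward the minimum, while overly large values can overshoot it.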
4. Explain the fundamental differences between the design of an artificial neural network and that of a
“classical” information-processing system.
5. Consider the data in the following table. The target variable is salary. Start by discretizing salary
as follows:
• Less than $35,000: Level 1
• $35,000 to less than $45,000: Level 2
• $45,000 to less than $55,000: Level 3
• Above $55,000: Level 4
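The discretization above can be written as a small Python function. This is a sketch: the name salary_level is hypothetical, and because the list does not say where a salary of exactly $55,000 belongs, the code assumes it falls in Level 4.

```python
def salary_level(salary):
    """Map a salary in dollars to the discretized level 1-4 from question 5."""
    if salary < 35_000:
        return 1
    if salary < 45_000:
        return 2
    if salary < 55_000:
        return 3
    # Assumption: exactly $55,000 is treated as Level 4 (the list says "Above").
    return 4


if __name__ == "__main__":
    for salary in (25_000, 35_000, 48_000, 65_000):
        print(f"${salary:,} -> Level {salary_level(salary)}")
```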
a. Construct a classification and regression tree (CART) to classify salary based on the other variables.
b. Construct a C4.5 decision tree to classify salary based on the other variables.
c. Compare the two decision trees and discuss the benefits and drawbacks of each.
d. Generate the full set of decision rules for the CART decision tree.
e. Generate the full set of decision rules for the C4.5 decision tree.
f. Compare the two sets of decision rules and discuss the benefits and drawbacks of each.
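When comparing the two trees, it can help to compute node impurity by hand. The sketch below shows two measures that are commonly associated with these algorithms: the Gini index (often used with CART) and entropy (the basis of information gain in C4.5). This is an assumption about the criteria; the course may define the CART split measure differently. The function names and the LEVELS list are hypothetical; LEVELS holds the salary levels obtained by applying the thresholds of question 5 to the Salary column, in row order.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Entropy in bits: minus the sum of p * log2(p) over class proportions."""
    n = len(labels)
    return -sum((count / n) * log2(count / n)
                for count in Counter(labels).values())


# Salary levels for the 11 records, in the order the Salary column lists them.
LEVELS = [3, 1, 2, 3, 4, 3, 4, 3, 2, 2, 1]

if __name__ == "__main__":
    print(f"Gini index of the root node: {gini(LEVELS):.4f}")
    print(f"Entropy of the root node:    {entropy(LEVELS):.4f} bits")
```

A candidate split can then be scored by computing the same measure on each child node and weighting by the share of records the child receives.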
Occupation (row grouping not preserved in this transcript): Service, Management, Sales, Staff

Gender  Age  Salary
Female  45   $48,000
Male    25   $25,000
Male    33   $35,000
Male    25   $45,000
Female  35   $65,000
Male    26   $45,000
Female  45   $70,000
Female  40   $50,000
Male    30   $40,000
Female  50   $40,000
Male    25   $25,000

Notes:
• All homework must be solved and written independently. If you use someone else’s work, including books,
papers, or any other material, you must acknowledge it and directly cite those resources at every
place in your document where they are used.
• You should submit your solutions in PDF format to [email protected], before the 30th of Aban. The
subject of the email should conform to the following format:
[DMC][HW2][your student number(s)]
For example: [DMC][HW2][87777777-86666666]
• Your email should have one PDF attachment that contains your solutions. The name of the file should be
your student number, and the file should reflect your full name. You should also deliver a hard copy of your
solutions to Dr. Abolhassani in the first session of the class after the deadline.