Administrative notes
• Labs this week: project time. Remember, you need
to pass the project in order to pass the course! (See
course syllabus.)
• Clicker grades should be on-line now
Administrative notes
• March 3: Data mining reading quiz
• March 14: Midterm 2
• March 17: In the News call #3
• March 30: Project deliverables and individual report due
Data mining: finding patterns in data
Part 1: Building decision tree classifiers from data
Learning goals
• [CT Building Block] Students will be able to build a
simple decision tree
• [CT Building Block] Students will be able to describe
what considerations are important in building a
decision tree
Why data mining?
• The world is awash with digital data; trillions
of gigabytes and growing
• How many bytes in a gigabyte?
Clicker question
A. 1 000 000
B. 1 000 000 000
C. 1 000 000 000 000
Why data mining?
• The world is awash with digital data; trillions
of gigabytes and growing
• A trillion gigabytes is a zettabyte, or
1 000 000 000 000 000 000 000 bytes
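(Since a gigabyte is 1 000 000 000 bytes, a trillion gigabytes is 1 000 000 000 000 × 1 000 000 000 = 1 000 000 000 000 000 000 000 bytes.)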
Why data mining?
• More and more, businesses and institutions
are using data mining to make decisions,
classifications, diagnoses, and
recommendations that affect our lives
Data mining for classification
Recall our loan application example
Data mining for classification
• In the loan strategy example, we focused on
fairness of different classifiers, but we didn’t
focus much on how to build a classifier
• Today you’ll learn how to build decision tree
classifiers for simple data mining scenarios
A rooted tree in computer science
• Before we get to decision trees, we'll define what a tree is
A rooted tree in computer science
A collection of nodes such that
• one node is the designated root
• a node can have zero or more
children; a node with zero
children is a leaf
• all non-root nodes have
a single parent
• edges denote parent-child relationships
• nodes and/or edges may be labeled by data
A rooted tree in computer science
Often but not always drawn with root on top
Are these rooted trees?
Clicker question
(Two example graphs, labeled 1 and 2, are shown on the slide.)
A. 1 but not 2
B. 2 but not 1
C. Both 1 and 2
D. Neither 1 nor 2
http://jerome.boulinguez.free.fr/english/file/hotpotatoes/familytree.htm
Is this a rooted tree?
Clicker question
A. Yes
B. No
C. I’m not sure
Node 2 has two parents,
and there’s no unique root
http://jerome.boulinguez.free.fr/english/file/hotpotatoes/familytree.htm
Decision trees: trees whose node labels are attributes, edge labels are conditions
https://gbr.pepperdine.edu/2010/08/how-gerber-used-adecision-tree-in-strategic-decision-making/
A decision tree for max profit loan strategy:
colour = blue → credit rating: > 61 approve, < 61 deny
colour = orange → credit rating: > 50 approve, < 50 deny
(Note that some worthy applicants are denied loans, while other unworthy ones get loans)
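As a sketch (not from the slides, and assuming the threshold pairing reconstructed above: a cutoff of 61 on the blue branch and 50 on the orange branch), the same tree written as nested if statements:

```python
def loan_decision(colour, credit_rating):
    # Root node: split on colour
    if colour == "blue":
        # Assumed cutoff for the blue branch (see the reconstruction above)
        return "approve" if credit_rating > 61 else "deny"
    else:  # orange
        # Assumed cutoff for the orange branch
        return "approve" if credit_rating > 50 else "deny"
```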
Exercise: Construct the decision tree for the
“Group Unaware” loan strategy
Building decision trees from training data
• Should you get an ice cream?
• You might start out with the following data
Weather   Wallet   Ice Cream?
Great     Empty    No
Nasty     Empty    No
Great     Full     Yes
Okay      Full     Yes
Nasty     Full     No
Building decision trees from training data
• Should you get an ice cream?
• You might start out with the data above
• In this table, the column headers (Weather, Wallet) are the attributes, and the values in the cells (Great, Empty, Full, …) are the conditions
Building decision trees from training data
• Should you get an ice cream?
• From the data above, you might build a decision tree that looks like this:
Wallet = Empty → No
Wallet = Full → Weather: Nasty → No, Okay → Yes, Great → Yes
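As a minimal sketch (this code is not from the slides; the function name is just for illustration), the ice cream tree above can be written as nested if statements in Python:

```python
def should_get_ice_cream(weather, wallet):
    # Root of the tree: split on Wallet first
    if wallet == "Empty":
        return "No"
    else:                      # Full: split on Weather next
        if weather == "Nasty":
            return "No"
        else:                  # Okay or Great
            return "Yes"

# Reproduces each row of the training data, for example:
print(should_get_ice_cream("Great", "Full"))   # Yes
print(should_get_ice_cream("Nasty", "Full"))   # No
```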
Shall we play a game?
Suppose we want to help a soccer league decide whether or not to cancel games. We have some data. Our goal is a decision tree to help officials make decisions.
Assume that decisions are the same given the same information.
Outlook    Temperature  Humidity  Windy  Play?
sunny      hot          high      false  No
sunny      hot          high      true   No
overcast   hot          high      false  Yes
rain       mild         high      false  Yes
rain       cool         normal    false  Yes
rain       cool         normal    true   No
overcast   cool         normal    true   Yes
sunny      mild         high      false  No
sunny      cool         normal    false  Yes
rain       mild         normal    false  Yes
sunny      mild         normal    true   Yes
overcast   mild         high      true   Yes
overcast   hot          normal    false  Yes
rain       mild         high      true   No
Example adapted from http://www.kdnuggets.com/data_mining_course/index.html#materials
Create a decision tree
Group exercise
Create a decision tree that decides whether the game should be played or not.
The leaf nodes should be whether or not to play.
The non-leaf nodes should be attributes (e.g., Outlook, Windy).
The edges should be conditions (e.g., sunny, hot, normal).
(The Outlook / Temperature / Humidity / Windy / Play? table from the previous slide is repeated here.)
Some example potential starts to the decision tree:
• Split first on Outlook: Overcast → Temperature? …, Rainy → Humidity? …, Sunny → Windy? …
• Split first on Windy: false → Humidity? …, true → Humidity? …
How did you split up your tree and why?
Student responses
• Looked at each condition of every attribute, case by case, until we got to the end, working left to right through the columns
• There are ways that the tree can be made smaller, by finding patterns. E.g., if it's overcast, the answer is always "yes".
Here’s that example again
(The same exercise instructions and data table as above are repeated here.)
Deciding which nodes go where:
A decision tree construction algorithm
• Top-down tree construction
• At start, all examples are at the root.
• Partition the examples recursively by choosing one attribute each time.
• In deciding which attribute to split on, one common method is to try to reduce entropy – i.e., each time you split, you should make the resulting groups more homogeneous. The more you reduce entropy, the higher the information gain. (A sketch of this procedure appears below.)
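The following is a minimal Python sketch of this procedure (not from the original slides; the names mixed_count and build_tree are just for illustration), using the simple notion of entropy from the next few slides: the number of examples that land in groups containing a mix of Yes and No answers after a split.

```python
def mixed_count(rows, attribute, target):
    """Number of examples left in non-homogeneous groups if we split on attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    return sum(len(labels) for labels in groups.values() if len(set(labels)) > 1)

def build_tree(rows, attributes, target="Play?"):
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:        # group is homogeneous: make a leaf
        return labels[0]
    if not attributes:               # no attributes left: take the majority answer
        return max(set(labels), key=labels.count)
    # Split on the attribute that leaves the fewest mixed examples
    best = min(attributes, key=lambda a: mixed_count(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = build_tree(subset, remaining, target)
    return tree
```

Called on the 14 rows of the table above (as dictionaries keyed by column name) with the attributes Outlook, Temperature, Humidity, and Windy, this sketch splits on Outlook first and reproduces the tree worked out over the following slides.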
Let’s go back to our example
Intuitively, our goal is to get to having as few mixed "Yes" and "No" answers together in groups as possible.
So in the initial case, we have 14 mixed Yes's and No's.
(The Outlook / Temperature / Humidity / Windy / Play? table is repeated here.)
What happens if we split on Temperature?
hot: 2 Yes, 2 No (4 mixed)
mild: 4 Yes, 2 No (6 mixed)
cool: 3 Yes, 1 No (4 mixed)
Overall entropy = 4 + 6 + 4 = 14
What’s the entropy if you split on Outlook?
Group exercise
A. 0
B. 5
C. 10
D. 14
E. None of the above
(The Outlook / Temperature / Humidity / Windy / Play? table is repeated here.)
What’s the entropy if you split on Outlook?
Group exercise results
Outlook
sunny
overcast
rainy
Yes
Yes
Yes
Yes
Yes
Yes
No
Yes
Yes
No
Yes
No
No
0 mixed
5 mixed
Overall entropy = 5 + 0 + 5 = 10
Computational Thinking
ct.cs.ubc.ca
No
5 mixed
What if you split on Windy?
false: 6 Yes, 2 No (8 mixed)
true: 3 Yes, 3 No (6 mixed)
Overall entropy = 8 + 6 = 14
What if you split on Humidity?
high: 3 Yes, 4 No (7 mixed)
normal: 6 Yes, 1 No (7 mixed)
Overall entropy = 7 + 7 = 14
The best option to split on is “Outlook”
It does the best job of reducing entropy
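As a quick check (a sketch, not part of the original slides), the following code counts, for each attribute, how many examples would land in mixed Yes/No groups; Outlook comes out lowest:

```python
# Rows are (Outlook, Temperature, Humidity, Windy, Play?), copied from the table above.
data = [
    ("sunny", "hot", "high", "false", "No"),
    ("sunny", "hot", "high", "true", "No"),
    ("overcast", "hot", "high", "false", "Yes"),
    ("rain", "mild", "high", "false", "Yes"),
    ("rain", "cool", "normal", "false", "Yes"),
    ("rain", "cool", "normal", "true", "No"),
    ("overcast", "cool", "normal", "true", "Yes"),
    ("sunny", "mild", "high", "false", "No"),
    ("sunny", "cool", "normal", "false", "Yes"),
    ("rain", "mild", "normal", "false", "Yes"),
    ("sunny", "mild", "normal", "true", "Yes"),
    ("overcast", "mild", "high", "true", "Yes"),
    ("overcast", "hot", "normal", "false", "Yes"),
    ("rain", "mild", "high", "true", "No"),
]
attributes = ["Outlook", "Temperature", "Humidity", "Windy"]

for i, name in enumerate(attributes):
    groups = {}
    for row in data:
        groups.setdefault(row[i], []).append(row[4])   # row[4] is Play?
    mixed = sum(len(g) for g in groups.values() if len(set(g)) > 1)
    print(name, mixed)   # Outlook 10, Temperature 14, Humidity 14, Windy 14
```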
This example suggests why a more complex entropy definition might be better
Windy: false → 6 Yes, 2 No; true → 3 Yes, 3 No
Humidity: high → 3 Yes, 4 No; normal → 6 Yes, 1 No
Humidity is better, even though both have "entropy" 14
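The slides don't spell out the more complex definition, but a standard choice is to weight the Shannon entropy of each group by its size (lower is better). A minimal sketch of that computation, assuming the group counts shown above:

```python
from math import log2

def weighted_entropy(groups):
    """groups: one (num_yes, num_no) pair per branch of the split."""
    total = sum(y + n for y, n in groups)
    result = 0.0
    for y, n in groups:
        size = y + n
        h = 0.0
        for count in (y, n):
            if count:
                p = count / size
                h -= p * log2(p)          # Shannon entropy of this group
        result += (size / total) * h      # weight by the group's share of examples
    return result

# Windy:    false -> 6 Yes, 2 No;  true   -> 3 Yes, 3 No
# Humidity: high  -> 3 Yes, 4 No;  normal -> 6 Yes, 1 No
print(round(weighted_entropy([(6, 2), (3, 3)]), 3))   # 0.892 for Windy
print(round(weighted_entropy([(3, 4), (6, 1)]), 3))   # 0.788 for Humidity (lower, so better)
```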
Great! Now we do the same thing again
Here’s what we have so far:
Outlook at the root, with branches sunny, overcast, and rainy.
For each branch, we have to decide which attribute to split on next: Temperature, Windy, or Humidity.
Great! Now we do the same thing again
Clicker question
What’s the best attribute to split on for Outlook = sunny?
A. Windy: false → Yes, No, No; true → Yes, No
B. Temperature: hot → No, No; mild → Yes, No; cool → Yes
C. Humidity: high → No, No, No; normal → Yes, Yes
We don’t need to split for Outlook = overcast
• The answer was yes each time. So we’re done there.
What’s the best attribute to split on for Outlook = rain?
Clicker question
A. Windy: false → Yes, Yes, Yes; true → No, No
B. Temperature: hot → N/A; mild → Yes, Yes, No; cool → Yes, No
C. Humidity: high → Yes, No; normal → Yes, Yes, No
This was, of course, a simple example
• In this example, the algorithm found the tree with
the smallest number of nodes
• We were given the attributes and conditions
• A simplistic notion of entropy worked (a more
sophisticated notion of entropy is typically used to
determine which attribute to split on)
This was, of course, a simple example
• In more complex examples, like the loan application example:
  • We may not know which conditions or attributes are best to use
  • The final decision may not be correct in every case (e.g., given two loan applicants with the same colour and credit rating, one may be credit worthy while the other is not)
  • Even if the final decision is always correct, the tree may not be of minimum size
Coding up a decision tree classifier
The final decision tree:
Outlook = sunny → Humidity: high → No, normal → Yes
Outlook = overcast → Yes
Outlook = rainy → Windy: false → Yes, true → No
Coding up a decision tree classifier
Partial tree: Outlook = sunny → Humidity: high → No, normal → Yes; Outlook = overcast → Yes
Can you see the relationship between the hierarchical tree structure and the hierarchical nesting of “if” statements?
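The code that appeared on this slide isn't preserved in the transcript. As a rough sketch of what such nested "if" statements could look like in Python (the slide's actual code may differ, and the function name is just for illustration):

```python
def play(outlook, humidity):
    # The outer "if" corresponds to the root node (Outlook)
    if outlook == "sunny":
        # The nested "if" corresponds to the Humidity node on the sunny branch
        if humidity == "high":
            return "No"
        else:              # normal
            return "Yes"
    elif outlook == "overcast":
        return "Yes"
    # The rainy branch is not handled yet (see the next slide)
```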
Coding up a decision tree classifier
(Same partial tree as on the previous slide.)
Can you extend the code to handle the “rainy” case?
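One possible way to extend the sketch above (assuming, as in the full tree shown earlier, that the rainy branch splits on Windy):

```python
def play(outlook, humidity, windy):
    if outlook == "sunny":
        if humidity == "high":
            return "No"
        else:              # normal
            return "Yes"
    elif outlook == "overcast":
        return "Yes"
    else:                  # rainy: split on Windy
        if windy == "false":
            return "Yes"
        else:              # true
            return "No"
```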
Learning goals
• [CT Building Block] Students will be able to build a
simple decision tree
• [CT Building Block] Students will be able to describe
what considerations are important in building a
decision tree