Administrative notes
• Labs this week: project time. Remember, you need to pass the project in order to pass the course! (See the course syllabus.)
• Clicker grades should be on-line now.
• March 3: Data mining reading quiz
• March 14: Midterm 2
• March 17: In the News call #3
• March 30: Project deliverables and individual report due

Data mining: finding patterns in data
Part 1: Building decision tree classifiers from data

Learning goals
• [CT Building Block] Students will be able to build a simple decision tree.
• [CT Building Block] Students will be able to describe what considerations are important in building a decision tree.

Why data mining?
• The world is awash with digital data; trillions of gigabytes and growing.
• Clicker question: How many bytes are in a gigabyte?
  A. 1 000 000
  B. 1 000 000 000
  C. 1 000 000 000 000
• A trillion gigabytes is a zettabyte, or 1 000 000 000 000 000 000 000 bytes.
• More and more, businesses and institutions are using data mining to make decisions, classifications, diagnoses, and recommendations that affect our lives.

Data mining for classification
• Recall our loan application example. There, we focused on the fairness of different classifiers, but we didn't focus much on how to build a classifier.
• Today you'll learn how to build decision tree classifiers for simple data mining scenarios.

A rooted tree in computer science
• Before we get to decision trees, we'll define what a tree is.
• A rooted tree is a collection of nodes such that:
  - one node is the designated root
  - a node can have zero or more children; a node with zero children is a leaf
  - all non-root nodes have a single parent
  - edges denote parent-child relationships
  - nodes and/or edges may be labeled by data
• A rooted tree is often, but not always, drawn with the root on top. (A minimal code sketch of this structure follows.)
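To make the definition concrete, here is a minimal sketch of a rooted tree in Python. The code is illustrative only and not from the course materials; the Node class and the labels are our own.

class Node:
    """A node holds a label and a list of children; a node with no
    children is a leaf."""
    def __init__(self, label, children=None):
        self.label = label
        self.children = children if children is not None else []

# A small rooted tree: "A" is the designated root; "B" and "C" are its
# children and are leaves. Each non-root node appears in exactly one
# children list, so every non-root node has a single parent.
root = Node("A", [Node("B"), Node("C")])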
Are these rooted trees? Clicker question
A. 1 but not 2
B. 2 but not 1
C. Both 1 and 2
D. Neither 1 nor 2
(Tree diagrams from http://jerome.boulinguez.free.fr/english/file/hotpotatoes/familytree.htm)

Is this a rooted tree? Clicker question
A. Yes
B. No
C. I'm not sure
No: node 2 has two parents, and there's no unique root.

Decision trees
• Decision trees are trees whose node labels are attributes and whose edge labels are conditions.
• For a business example, see https://gbr.pepperdine.edu/2010/08/how-gerber-used-a-decision-tree-in-strategic-decision-making/

A decision tree for the max profit loan strategy

  colour?
    blue: credit rating?
      > 61: approve
      < 61: deny
    orange: credit rating?
      > 50: approve
      < 50: deny

(Note that some worthy applicants are denied loans, while other unworthy ones get loans.)

Exercise: Construct the decision tree for the "Group Unaware" loan strategy.

Building decision trees from training data
• Should you get an ice cream? You might start out with the following data, where Weather and Wallet are the attributes and their values (Great, Empty, and so on) are the conditions:

  Weather  Wallet  Ice Cream?
  Great    Empty   No
  Nasty    Empty   No
  Great    Full    Yes
  Okay     Full    Yes
  Nasty    Full    No

• You might build a decision tree that looks like this:

  Wallet?
    Empty: No
    Full: Weather?
      Nasty: No
      Okay: Yes
      Great: Yes
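A decision tree like this maps directly onto nested "if" statements. Here is a sketch in Python; the function name and string values are ours, chosen to match the table above, not taken from the course materials.

def ice_cream(weather, wallet):
    # Split on Wallet first, then on Weather, mirroring the tree above.
    if wallet == "Empty":
        return "No"
    else:  # wallet == "Full"
        if weather == "Nasty":
            return "No"
        else:  # weather is "Okay" or "Great"
            return "Yes"

# Spot-check against the training data:
print(ice_cream("Great", "Empty"))  # No
print(ice_cream("Okay", "Full"))    # Yes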
Shall we play a game?
• Suppose we want to help a soccer league decide whether or not to cancel games. We have some data:

  Outlook   Temperature  Humidity  Windy  Play?
  sunny     hot          high      false  No
  sunny     hot          high      true   No
  overcast  hot          high      false  Yes
  rain      mild         high      false  Yes
  rain      cool         normal    false  Yes
  rain      cool         normal    true   No
  overcast  cool         normal    true   Yes
  sunny     mild         high      false  No
  sunny     cool         normal    false  Yes
  rain      mild         normal    false  Yes
  sunny     mild         normal    true   Yes
  overcast  mild         high      true   Yes
  overcast  hot          normal    false  Yes
  rain      mild         high      true   No

• Our goal is a decision tree to help officials make decisions.
• Assume that decisions are the same given the same information.
(Example adapted from http://www.kdnuggets.com/data_mining_course/index.html#materials)

Create a decision tree: group exercise
• Create a decision tree that decides whether the game should be played or not.
• The leaf nodes should be whether or not to play.
• The non-leaf nodes should be attributes (e.g., Outlook, Windy).
• The edges should be conditions (e.g., sunny, hot, normal).

Some example potential starts to the decision tree

  Start 1: Outlook?
    Overcast: Temperature? …
    Rainy: Humidity? …
    Sunny: Windy? …

  Start 2: Windy?
    true: Humidity? …
    false: Humidity? …

How did you split up your tree and why? Student responses
• Looked at each condition of every attribute, case by case, until we got to the end, working left to right through the columns.
• There are ways that the tree can be made smaller, by finding patterns. E.g., if it's overcast, the answer is always "yes".

Deciding which nodes go where: a decision tree construction algorithm
• Top-down tree construction:
  - At the start, all examples are at the root.
  - Partition the examples recursively by choosing one attribute each time.
• In deciding which attribute to split on, one common method is to try to reduce entropy; that is, each time you split, you should make the resulting groups more homogeneous. The more you reduce entropy, the higher the information gain. (A code sketch of one greedy step follows.)
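Here is a sketch in Python of one greedy step of this procedure on the game data, using the simple "mixed count" notion of entropy worked out on the following slides. The code is ours rather than the course's: ROWS, ATTRIBUTES, and mixed_count are illustrative names.

# Each row: (Outlook, Temperature, Humidity, Windy, Play?)
ROWS = [
    ("sunny", "hot", "high", "false", "No"),
    ("sunny", "hot", "high", "true", "No"),
    ("overcast", "hot", "high", "false", "Yes"),
    ("rain", "mild", "high", "false", "Yes"),
    ("rain", "cool", "normal", "false", "Yes"),
    ("rain", "cool", "normal", "true", "No"),
    ("overcast", "cool", "normal", "true", "Yes"),
    ("sunny", "mild", "high", "false", "No"),
    ("sunny", "cool", "normal", "false", "Yes"),
    ("rain", "mild", "normal", "false", "Yes"),
    ("sunny", "mild", "normal", "true", "Yes"),
    ("overcast", "mild", "high", "true", "Yes"),
    ("overcast", "hot", "normal", "false", "Yes"),
    ("rain", "mild", "high", "true", "No"),
]

ATTRIBUTES = {"Outlook": 0, "Temperature": 1, "Humidity": 2, "Windy": 3}

def mixed_count(rows, attr_index):
    # Group the rows by their value for this attribute, then count how
    # many examples land in groups containing both "Yes" and "No".
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[-1])
    return sum(len(g) for g in groups.values() if len(set(g)) > 1)

for name, index in ATTRIBUTES.items():
    print(name, mixed_count(ROWS, index))  # Outlook 10; the others 14

# The greedy choice: split on the attribute that leaves the fewest
# examples in mixed groups.
best = min(ATTRIBUTES, key=lambda name: mixed_count(ROWS, ATTRIBUTES[name]))
print("Best split:", best)  # Outlook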
Let's go back to our example
• Intuitively, our goal is to have as few mixed "Yes" and "No" answers together in groups as possible.
• So in the initial case, we have 14 mixed Yes's and No's.

What happens if we split on Temperature?
  hot:  Yes Yes No No → 4 mixed
  mild: Yes Yes Yes Yes No No → 6 mixed
  cool: Yes Yes Yes No → 4 mixed
Overall entropy = 4 + 6 + 4 = 14

What's the entropy if you split on Outlook? Group exercise
A. 0
B. 5
C. 10
D. 14
E. None of the above

Group exercise results
  sunny:    Yes Yes No No No → 5 mixed
  overcast: Yes Yes Yes Yes → 0 mixed
  rainy:    Yes Yes Yes No No → 5 mixed
Overall entropy = 5 + 0 + 5 = 10

What if you split on Windy?
  false: Yes Yes Yes Yes Yes Yes No No → 8 mixed
  true:  Yes Yes Yes No No No → 6 mixed
Overall entropy = 8 + 6 = 14

What if you split on Humidity?
  high:   Yes Yes Yes No No No No → 7 mixed
  normal: Yes Yes Yes Yes Yes Yes No → 7 mixed
Overall entropy = 7 + 7 = 14

The best option to split on is "Outlook"
• It does the best job of reducing entropy.

This example suggests why a more complex entropy definition might be better
• Windy's groups are 6 Yes/2 No and 3 Yes/3 No; Humidity's groups are 3 Yes/4 No and 6 Yes/1 No.
• Humidity is the better split (its groups are closer to homogeneous), even though both have "entropy" 14 by our simple count.

Great! Now we do the same thing again
• Here's what we have so far:

  Outlook?
    sunny: ?
    overcast: ?
    rainy: ?

• For each option, we have to decide which attribute to split on next: Temperature, Windy, or Humidity.

Clicker question: What's the best attribute to split on for Outlook = sunny?
A. Windy (false: No No Yes; true: No Yes)
B. Temperature (hot: No No; mild: No Yes; cool: Yes)
C. Humidity (high: No No No; normal: Yes Yes)

We don't need to split for Outlook = overcast
• The answer was Yes each time, so we're done there.

Clicker question: What's the best attribute to split on for Outlook = rain?
A. Windy (false: Yes Yes Yes; true: No No)
B. Temperature (hot: N/A; mild: Yes Yes No; cool: Yes No)
C. Humidity (high: Yes No; normal: Yes Yes No)

This was, of course, a simple example
• In this example, the algorithm found the tree with the smallest number of nodes.
• We were given the attributes and conditions.
• A simplistic notion of entropy worked (a more sophisticated notion of entropy is typically used to determine which attribute to split on).
• In more complex examples, like the loan application example:
  - We may not know which conditions or attributes are best to use.
  - The final decision may not be correct in every case (e.g., given two loan applicants with the same colour and credit rating, one may be credit worthy while the other is not).
  - Even if the final decision is always correct, the tree may not be of minimum size.

Coding up a decision tree classifier
• Here is the full tree we built:

  Outlook?
    sunny: Humidity?
      high: No
      normal: Yes
    overcast: Yes
    rainy: Windy?
      false: Yes
      true: No

• Can you see the relationship between the hierarchical tree structure and the hierarchical nesting of "if" statements? (A sketch covering the sunny and overcast branches follows.)
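Here is a minimal Python sketch of the sunny and overcast branches. It is a reconstruction under our own naming (should_play and its string arguments match the table above), not necessarily the course's exact code.

def should_play(outlook, humidity):
    # The nesting of the "if"s mirrors the nesting of the tree: the
    # outer test is the root (Outlook); the inner test is the Humidity
    # node that hangs off the "sunny" branch.
    if outlook == "sunny":
        if humidity == "high":
            return "No"
        else:  # humidity == "normal"
            return "Yes"
    elif outlook == "overcast":
        return "Yes"
    # The "rainy" branch is left as the exercise below.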
Coding up a decision tree classifier, continued
• Can you extend the code to handle the "rainy" case? (One possible extension appears at the end of this section.)

Learning goals
• [CT Building Block] Students will be able to build a simple decision tree.
• [CT Building Block] Students will be able to describe what considerations are important in building a decision tree.
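One possible answer to the extension exercise above, sketched under the same illustrative naming as before (Windy is passed as the strings "false"/"true", and the edge label "rainy" follows the tree diagram):

def should_play(outlook, humidity, windy):
    if outlook == "sunny":
        if humidity == "high":
            return "No"
        else:  # humidity == "normal"
            return "Yes"
    elif outlook == "overcast":
        return "Yes"
    else:  # outlook == "rainy": split on Windy, mirroring the tree
        if windy == "false":
            return "Yes"
        else:  # windy == "true"
            return "No"

# Try it on a couple of examples:
print(should_play("rainy", "high", "true"))      # No
print(should_play("overcast", "high", "false"))  # Yes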