KDD Group Research Seminar, Fall 2001 – Presentation 8 – 11
Incremental Learning
Friday, November 16
James Plummer ([email protected])
Reference: Mitchell, Tom M. Machine Learning. McGraw-Hill. 1997.
Laboratory for Knowledge Discovery in Databases (KDD)
Kansas State University, Department of Computing and Information Sciences

Outline
• Machine Learning
  – Extracting information from data
  – Forming concepts
• The Data
  – Arrangement of data: attributes, labels, and instances
  – Categorization of data
  – Results
• MLJ (Machine Learning in Java)
  – Collection of machine learning algorithms
  – Current inducers
• Incremental Learning
  – Description of the technique
  – Nearest-neighbor algorithm
  – Distance-weighted algorithm
• Advantages and Disadvantages
  – Gains and losses

Machine Learning
• Sometimes called data mining
• The process of extracting useful information from data
  – Marketing databases, medical databases, weather databases
    • e.g., finding consumer purchase patterns
• Used to form concepts
  – Predictions
  – Classifications
  – Numeric answers

The Data
• Arrangement of data
  – A piece of data is a set of attributes ai that together make up an instance xj:
      xj = ⟨a1, a2, a3, . . ., an⟩
  – Attributes can be considered evidence
  – Each instance has a label or category f(xj) (its outcome value)
  – A set of data is a set of instances
• Categorization
  – A set of instances (the training data) serves as the basis for classifying new query instances xq
  – Calculate f^(xq) from the training data
    • f^(xq) is the predicted value of the actual f(xq)
• Results
  – Accuracy is the number of correctly predicted values over the total number of query instances:
      f^(xq) correct / f(xq) total

Data Example
• Predict the values of examples 6, 7, and 8 given data examples 1 through 5

  Example  Outlook   Air Temp  Humidity  Wind    Play Tennis?
  1        Sunny     Hot       High      Weak    No
  2        Sunny     Hot       High      Strong  No
  3        Overcast  Hot       High      Weak    Yes
  4        Rain      Mild      High      Weak    Yes
  5        Rain      Cool      Normal    Weak    Yes
  6        Rain      Cool      Normal    Strong  Yes
  7        Overcast  Cool      Normal    Strong  No
  8        Sunny     Mild      High      Weak    Yes

MLJ (Machine Learning in Java)
• MLJ is a collection of learning algorithms
  – Inducers: categorize data to learn concepts
• Currently in development
  – ID3: uses decision trees
  – Naïve Bayes: uses conditional-probability calculations
  – C4.5: uses decision trees with pruning techniques
• Incremental learning
  – Uses comparison techniques
  – Soon to be added

Incremental Learning
• Instance-based learning: k-nearest neighbor
  – All instances correspond to points in an n-dimensional space
  – The distance between two instances is

      d(xi, xj) = sqrt( Σ r=1..n ( ar(xi) − ar(xj) )² )

    where ar(x) is the r-th attribute of instance x
  – Given a query instance xq to be categorized, the k nearest neighbors are found
  – f^(xq) is assigned the most frequent value among the k nearest f(xj):

      f^(xq) ← argmax over v of Σ i=1..k δ(v, f(xi))

    where δ(a, b) = 1 if a = b, and 0 otherwise
  – For k = 1, f^(xq) is assigned f(xi), where xi is the closest instance in the space
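The k-nearest-neighbor procedure above can be sketched in Java for numeric attributes. This is an illustrative standalone sketch, not MLJ's actual code; the class and method names are invented for this example.

```java
import java.util.*;

/** Minimal k-nearest-neighbor classifier over numeric attribute vectors.
 *  Illustrative only; names and structure are not MLJ's actual API. */
public class NearestNeighbor {

    // Euclidean distance: d(xi, xj) = sqrt( sum_r (ar(xi) - ar(xj))^2 )
    static double distance(double[] xi, double[] xj) {
        double sum = 0.0;
        for (int r = 0; r < xi.length; r++) {
            double diff = xi[r] - xj[r];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // f^(xq): the most frequent label among the k nearest training instances
    static String classify(double[][] train, String[] labels, double[] xq, int k) {
        Integer[] idx = new Integer[train.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        // Sort instance indices by distance to the query instance xq
        Arrays.sort(idx, Comparator.comparingDouble(i -> distance(train[i], xq)));
        // Tally one vote per neighbor (delta(v, f(xi)) summed over i = 1..k)
        Map<String, Integer> votes = new HashMap<>();
        for (int i = 0; i < Math.min(k, idx.length); i++)
            votes.merge(labels[idx[i]], 1, Integer::sum);
        return Collections.max(votes.entrySet(), Map.Entry.comparingByValue()).getKey();
    }

    public static void main(String[] args) {
        double[][] train = {{1, 1}, {1, 2}, {5, 5}, {6, 5}, {6, 6}};
        String[] labels  = {"No", "No", "Yes", "Yes", "Yes"};
        // The three nearest neighbors of (5.5, 5.5) all carry the label "Yes"
        System.out.println(classify(train, labels, new double[]{5.5, 5.5}, 3)); // Yes
    }
}
```

Note that categorical attributes such as Outlook in the tennis table would first need an attribute-level distance (for example, 0 if equal and 1 otherwise) before this numeric sketch applies.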
Distance-Weighted Nearest Neighbor
• Same as k-nearest neighbor, except the effect of each f(xj) on f^(xq) depends on the distance d(xq, xj):

      f^(xq) ← argmax over v of Σ i=1..k wi δ(v, f(xi)),  where wi = 1 / d(xq, xi)²

• In the case xq = xi (distance zero), f^(xq) = f(xi)
• Examine three cases for a 2-dimensional example (figure not reproduced):
  – k = 1
  – k = 5
  – Weighted, k = 5

Advantages and Disadvantages
• Gains of using k-nearest neighbor
  – Individual attributes can be weighted differently
  – d(xi, xq) can be changed to give nearer xi a stronger or weaker effect on f^(xq)
  – Robust to noise in the training data
  – Very effective when provided a large set of training data
  – Flexible: f^(xq) can be calculated in many useful ways
  – Very small training time
• Losses
  – Not good when training data is insufficient
  – Not very effective if similar xi have dissimilar f(xi)
  – More computation time is needed to categorize new instances

References
• Mitchell, Tom M. Machine Learning. McGraw-Hill. 1997.
• Witten, Ian H., and Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann. 2000.
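The distance-weighted vote described above can be sketched in the same style as the plain k-NN example: each of the k neighbors contributes a vote of weight 1/d², and an exact match (d = 0) decides the answer outright. As before, this is an illustrative sketch with invented names, not MLJ's actual implementation.

```java
import java.util.*;

/** Distance-weighted k-NN: each neighbor votes with weight wi = 1 / d(xq, xi)^2.
 *  If xq coincides with a training instance, that instance's label is returned.
 *  Illustrative sketch only; not MLJ's actual implementation. */
public class WeightedNearestNeighbor {

    static double distance(double[] a, double[] b) {
        double s = 0.0;
        for (int r = 0; r < a.length; r++) s += (a[r] - b[r]) * (a[r] - b[r]);
        return Math.sqrt(s);
    }

    static String classify(double[][] train, String[] labels, double[] xq, int k) {
        Integer[] idx = new Integer[train.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        Arrays.sort(idx, Comparator.comparingDouble(i -> distance(train[i], xq)));
        Map<String, Double> votes = new HashMap<>();
        for (int i = 0; i < Math.min(k, idx.length); i++) {
            double d = distance(train[idx[i]], xq);
            if (d == 0.0) return labels[idx[i]];          // xq = xi: answer directly
            votes.merge(labels[idx[i]], 1.0 / (d * d), Double::sum);
        }
        return Collections.max(votes.entrySet(), Map.Entry.comparingByValue()).getKey();
    }

    public static void main(String[] args) {
        double[][] train = {{0, 0}, {0, 1}, {4, 4}, {4, 5}, {5, 5}};
        String[] labels  = {"No", "No", "Yes", "Yes", "Yes"};
        // k = 5: every instance votes, but the two nearby "No" points dominate by weight
        System.out.println(classify(train, labels, new double[]{0.5, 0.5}, 5)); // No
    }
}
```

With k set to the whole training set, distant points contribute almost nothing, which is why the weighted vote stays sensible even as k grows.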