Data Mining Classification: Support Vector Machine (SVM)
Lecture Notes for Chapter 5, Introduction to Data Mining
by Tan, Steinbach, Kumar
Edited by Wei Ding for CS 470/670 Artificial Intelligence
© Tan, Steinbach, Kumar, Introduction to Data Mining, 4/18/2004

Support Vector Machines
* Find a linear hyperplane (decision boundary) that will separate the data.
* Many hyperplanes can separate the data: one possible solution, another possible solution, other possible solutions (e.g., B1, B2, ...).
* Which one is better, B1 or B2? How do you define "better"?
* Find the hyperplane that maximizes the margin => B1 is better than B2.

Linear SVM: Separable Case
* A linear SVM is a classifier that searches for the hyperplane with the largest margin, which is why it is often known as a maximal margin classifier.
* There are N training examples. Each example (x_i, y_i) (i = 1, 2, ..., N), where x_i = (x_i1, x_i2, ..., x_id)^T, corresponds to the attribute set for the ith example (that is, each object is represented by d attributes), and y_i ∈ {−1, 1} denotes its class label.
* The decision boundary of a linear classifier can be written in the following form:
    w · x + b = 0
  where w and b are parameters of the model.

Support Vector Machines
* The margin is bounded by two parallel hyperplanes:
    w · x + b = +1
    w · x + b = −1
* The decision rule is:
    f(x) = +1 if w · x + b ≥ 1
    f(x) = −1 if w · x + b ≤ −1
* Margin = 2 / ||w||²

Support Vector Machines
* We want to maximize: Margin = 2 / ||w||²
  - This is equivalent to minimizing: L(w) = ||w||² / 2
  - Subject to the following constraints:
      f(x_i) = +1 if w · x_i + b ≥ 1
      f(x_i) = −1 if w · x_i + b ≤ −1
    (equivalently, y_i (w · x_i + b) ≥ 1 for all i)
  - This is a constrained optimization problem; numerical approaches (e.g., Lagrange multipliers) can be used to solve it.

Summary
* SVM has its roots in statistical learning theory.
* It has shown promising empirical results in many practical applications, from handwritten digit recognition to text categorization.
* It works very well with high-dimensional data and avoids the curse-of-dimensionality problem.
* A unique aspect of this approach is that it represents the decision boundary using a subset of the training examples, known as the support vectors.
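The decision rule and margin defined above can be sketched in a few lines of code. Note this is only an illustration: the weight vector w, bias b, and the test points are hand-chosen assumptions for a toy 2-D problem, not values obtained by actually solving the constrained optimization.

```python
# Minimal sketch of the linear SVM decision rule and margin from the slides.
# w, b, and the test points below are assumed (hand-chosen), not fitted.

w = [1.0, 1.0]   # assumed weight vector (d = 2 attributes)
b = 0.0          # assumed bias term

def dot(u, v):
    """Inner product of two attribute vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def f(x):
    """Decision rule: +1 if w.x + b >= 1, -1 if w.x + b <= -1.
    Points with -1 < w.x + b < 1 lie inside the margin (returned as 0 here)."""
    s = dot(w, x) + b
    if s >= 1:
        return 1
    if s <= -1:
        return -1
    return 0

# Margin as defined on the slides: 2 / ||w||^2
margin = 2.0 / dot(w, w)

print(f([2.0, 2.0]))    # -> 1   (on the class +1 side of the margin)
print(f([-2.0, -2.0]))  # -> -1  (on the class -1 side of the margin)
print(margin)           # -> 1.0
```

Shrinking ||w|| widens the margin, which is exactly why the optimization minimizes L(w) = ||w||²/2 subject to every training point satisfying its margin constraint.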