Data Mining
Classification: Support Vector Machine (SVM)

Lecture Notes for Chapter 5, Introduction to Data Mining
By Tan, Steinbach, Kumar
Edited by Wei Ding for CS 470/670 Artificial Intelligence

© Tan, Steinbach, Kumar, Introduction to Data Mining, 4/18/2004

Support Vector Machines

• Find a linear hyperplane (decision boundary) that will separate the data.
• One possible solution.
• Another possible solution.
• Other possible solutions.
• Which one is better, B1 or B2?
• How do you define "better"?
• Find the hyperplane that maximizes the margin: B1 is better than B2.

Linear SVM: Separable Case

• A linear SVM is a classifier that searches for the hyperplane with the largest margin, which is why it is often known as a maximal margin classifier.
• The training set consists of $N$ examples. Each example $(\mathbf{x}_i, y_i)$, $i = 1, 2, \ldots, N$, has an attribute set $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{id})^T$ (that is, each object is represented by $d$ attributes), and $y_i \in \{-1, +1\}$ denotes its class label.
• The decision boundary of a linear classifier can be written in the form $\mathbf{w} \cdot \mathbf{x} + b = 0$, where $\mathbf{w}$ and $b$ are parameters of the model.
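
To make the notation concrete, here is a minimal sketch in Python with NumPy; the toy two-attribute dataset and the hand-picked parameters are assumptions for illustration only, not anything given in the lecture.

```python
import numpy as np

# Toy training set for illustration: N = 4 examples, d = 2 attributes.
# Row i of X holds x_i = (x_i1, x_i2)^T; y[i] in {-1, +1} is the class label y_i.
X = np.array([[1.0, 1.0],
              [2.0, 2.5],
              [-1.0, -1.0],
              [-2.0, -1.5]])
y = np.array([1, 1, -1, -1])

# A candidate decision boundary w . x + b = 0 with hand-picked parameters.
w = np.array([1.0, 1.0])
b = 0.0
print(X @ w + b)  # the sign of each entry tells which side of the boundary x_i falls on
```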

Support Vector Machines

• The decision boundary is $\mathbf{w} \cdot \mathbf{x} + b = 0$; the two margin hyperplanes through the closest examples on either side are
$$\mathbf{w} \cdot \mathbf{x} + b = +1 \qquad \text{and} \qquad \mathbf{w} \cdot \mathbf{x} + b = -1.$$
• The classifier assigns labels by
$$f(\mathbf{x}) = \begin{cases} +1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b \ge 1 \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b \le -1 \end{cases}$$
• The distance between the two margin hyperplanes is
$$\text{Margin} = \frac{2}{\|\mathbf{w}\|}.$$
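
A minimal sketch of the decision function and the margin width, continuing the hand-picked $\mathbf{w}$ and $b$ from above (hypothetical values, not parameters learned from data):

```python
import numpy as np

w = np.array([1.0, 1.0])  # hypothetical weight vector
b = -0.5                  # hypothetical bias

def f(x):
    # Predict with the sign of w . x + b; on points outside the margin
    # this matches the piecewise definition above.
    return 1 if np.dot(w, x) + b >= 0 else -1

print(f(np.array([2.0, 1.0])))    # -> 1
print(f(np.array([-1.0, -1.0])))  # -> -1
print("margin =", 2 / np.linalg.norm(w))  # 2 / ||w||
```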

Support Vector Machines

• We want to maximize:
$$\text{Margin} = \frac{2}{\|\mathbf{w}\|}$$
  – which is equivalent to minimizing:
$$L(\mathbf{w}) = \frac{\|\mathbf{w}\|^2}{2}$$
  – but subject to the following constraints:
$$f(\mathbf{x}_i) = \begin{cases} +1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \ge 1 \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \le -1 \end{cases}$$
  (equivalently, $y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1$ for all $i$).
• This is a constrained optimization problem.
  – Numerical approaches can solve it (e.g., Lagrange multipliers); see the sketch below.
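
As one illustration of a numerical approach, here is a hedged sketch that solves the hard-margin primal directly with a general-purpose constrained optimizer (SciPy's SLSQP) on the toy dataset from earlier; real SVM solvers typically work with the Lagrangian dual instead, but this tiny problem is small enough to solve head-on:

```python
import numpy as np
from scipy.optimize import minimize

# Toy separable dataset (same illustrative data as above).
X = np.array([[1.0, 1.0], [2.0, 2.5], [-1.0, -1.0], [-2.0, -1.5]])
y = np.array([1, 1, -1, -1])
d = X.shape[1]

# Minimize L(w) = ||w||^2 / 2 over params = (w_1, ..., w_d, b).
def objective(params):
    w = params[:d]
    return 0.5 * np.dot(w, w)

# One constraint per training example: y_i (w . x_i + b) - 1 >= 0.
constraints = [{"type": "ineq",
                "fun": lambda p, i=i: y[i] * (np.dot(p[:d], X[i]) + p[d]) - 1}
               for i in range(len(X))]

res = minimize(objective, x0=np.zeros(d + 1), constraints=constraints)
w, b = res.x[:d], res.x[d]
print("w =", w, "b =", b)
print("margin =", 2 / np.linalg.norm(w))
```

For this data the closest opposite-class points are (1, 1) and (−1, −1), so the recovered margin should be about 2√2 ≈ 2.83, with both of those points sitting exactly on the margin hyperplanes.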

Summary

• SVM has its roots in statistical learning theory.
• It has shown promising empirical results in many practical applications, from handwritten digit recognition to text categorization.
• It works very well with high-dimensional data and avoids the curse-of-dimensionality problem.
• A unique aspect of the approach is that it represents the decision boundary using only a subset of the training examples, known as the support vectors (see the sketch below).
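
To see the "subset of the training examples" point in practice, here is a small sketch using scikit-learn, an assumption of this illustration (the lecture itself does not reference any library); a large C approximates the separable hard-margin case:

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [2.0, 2.5], [-1.0, -1.0], [-2.0, -1.5]])
y = np.array([1, 1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # large C ~ hard margin

# The fitted boundary is stored in terms of a few training examples only.
print("support vectors:\n", clf.support_vectors_)
print("w =", clf.coef_[0], "b =", clf.intercept_[0])
print("prediction for (0.5, 0.5):", clf.predict([[0.5, 0.5]])[0])
```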