Proximal Support Vector Machine Classifiers
KDD 2001, San Francisco, August 26-29, 2001
Glenn Fung & Olvi Mangasarian
Data Mining Institute, University of Wisconsin - Madison

Support Vector Machines: Maximizing the Margin between Bounding Planes

The two bounding planes

    x'w = \gamma + 1   (bounding the A+ points)
    x'w = \gamma - 1   (bounding the A- points)

are separated by the margin 2 / ||w||_2.

Proximal Support Vector Machines: Fitting the Data Using Two Parallel Bounding Planes

The same two parallel planes x'w = \gamma + 1 and x'w = \gamma - 1 are used, but each plane is now fitted proximally to the points of its own class rather than bounding them.

Standard Support Vector Machine Formulation

Solve the quadratic program, for some \nu > 0:

    min_{w, \gamma, y}  (\nu/2) ||y||_2^2 + (1/2) ||(w, \gamma)||_2^2
    s.t.  D(Aw - e\gamma) + y >= e                                   (QP)

where D_{ii} = +/-1 denotes A+ or A- membership. The margin is maximized by minimizing (1/2) ||(w, \gamma)||_2^2.

PSVM Formulation

Starting from the QP SVM formulation, with the inequality replaced by an equality:

    min_{w, \gamma}  (\nu/2) ||y||_2^2 + (1/2) ||(w, \gamma)||_2^2
    s.t.  D(Aw - e\gamma) + y = e

Solving for y in terms of w and \gamma gives the unconstrained problem:

    min_{w, \gamma}  (\nu/2) ||e - D(Aw - e\gamma)||_2^2 + (1/2) ||(w, \gamma)||_2^2

This simple but critical modification changes the nature of the optimization problem tremendously.

Advantages of the New Formulation

- The objective function remains strongly convex.
- An explicit exact solution can be written in terms of the problem data.
- The PSVM classifier is obtained by solving a single system of linear equations in the usually small-dimensional input space.
- Exact leave-one-out correctness can be obtained in terms of the problem data.

Linear PSVM

We want to solve:

    min_{w, \gamma}  (\nu/2) ||e - D(Aw - e\gamma)||_2^2 + (1/2) ||(w, \gamma)||_2^2

Setting the gradient equal to zero gives a nonsingular system of linear equations.
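The linear PSVM training step described above reduces to a single (n+1) x (n+1) linear solve. The following NumPy sketch illustrates it; function names, the helper `classify`, and default parameter values are my own and not from the slides:

```python
import numpy as np

def linear_psvm(A, d, nu=1.0):
    """Train a linear proximal SVM.

    A  : (m, n) matrix of data points, one per row.
    d  : (m,) vector of +/-1 labels (the diagonal of D, so D e = d).
    nu : the positive weighting parameter nu from the formulation.
    Returns (w, gamma) for the classifier sign(x'w - gamma).
    """
    m, n = A.shape
    H = np.hstack([A, -np.ones((m, 1))])   # H = [A  -e]
    v = H.T @ d                            # v = H' D e  (D e is just the label vector d)
    # Solve the (n+1) x (n+1) system  (I/nu + H'H) [w; gamma] = v
    z = np.linalg.solve(np.eye(n + 1) / nu + H.T @ H, v)
    return z[:n], z[n]

def classify(w, gamma, X):
    """Apply the linear PSVM classifier sign(x'w - gamma) row-wise to X."""
    return np.sign(X @ w - gamma)
```

On two well-separated point clusters (e.g. labels +1 around (2, 2) and -1 around (-2, -2)), the returned classifier reproduces the training labels exactly; the only cost beyond forming H'H is one small dense solve.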
Linear PSVM Solution

The solution of the system gives the desired PSVM classifier:

    [w; \gamma] = (I/\nu + H'H)^{-1} H'De,   where H = [A  -e]

The linear system to solve depends on H'H, which is of size (n+1) x (n+1); n is usually much smaller than m.

Linear Proximal SVM Algorithm

Input: A, D.
1. Define H = [A  -e].
2. Calculate v = H'De.
3. Solve (I/\nu + H'H) [w; \gamma] = v.
4. Classifier: sign(w'x - \gamma).

Nonlinear PSVM Formulation

Linear PSVM (linear separating surface x'w = \gamma):

    min_{w, \gamma}  (\nu/2) ||y||_2^2 + (1/2) ||(w, \gamma)||_2^2       (QP)
    s.t.  D(Aw - e\gamma) + y = e

By QP "duality", w = A'Du. Maximizing the margin in the "dual space" gives:

    min_{u, \gamma}  (\nu/2) ||e - D(AA'Du - e\gamma)||_2^2 + (1/2) ||(u, \gamma)||_2^2

Replacing AA' by a nonlinear kernel K(A, A'):

    min_{u, \gamma}  (\nu/2) ||e - D(K(A, A')Du - e\gamma)||_2^2 + (1/2) ||(u, \gamma)||_2^2

The Nonlinear Classifier

The nonlinear classifier is K(x', A')Du = \gamma, where K: R^{m x n} x R^{n x m} -> R^{m x m} is a nonlinear kernel, e.g. the Gaussian (radial basis) kernel:

    K(A, A')_{ij} = exp(-\mu ||A_i - A_j||_2^2),   i, j = 1, ..., m

The ij-entry of K(A, A') represents the "similarity" of data points A_i and A_j.

Nonlinear PSVM

Similarly to the linear case, setting the gradient equal to zero, we obtain:

    [u; \gamma] = (I/\nu + H'H)^{-1} H'De,   with H defined slightly differently: H = [K(A, A')  -e]

Here the linear system to solve is of size (m+1) x (m+1). However, reduced kernel techniques (RSVM) can be used to reduce the dimensionality.

Nonlinear Proximal SVM Algorithm

Input: A, D.
1. Define K = K(A, A') and H = [K  -e].
2. Calculate v = H'De.
3. Solve (I/\nu + H'H) [u; \gamma] = v.
4. Classifier: sign(K(x', A')u - \gamma), with the substitution u = Du absorbed into u.
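The nonlinear algorithm has the same shape as the linear one, with H = [K(A, A') -e] and an (m+1) x (m+1) solve. A minimal NumPy sketch, assuming a Gaussian kernel and the slides' substitution u = Du (function names and defaults are my own illustration):

```python
import numpy as np

def gaussian_kernel(A, B, mu=1.0):
    """Gaussian kernel: K_ij = exp(-mu * ||A_i - B_j||_2^2)."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-mu * sq)

def nonlinear_psvm(A, d, nu=1.0, mu=1.0):
    """Train a nonlinear PSVM with a Gaussian kernel.

    Solves (I/nu + H'H) [u; gamma] = H'De with H = [K(A, A')  -e];
    the returned u already absorbs D (the substitution u = Du).
    """
    m = A.shape[0]
    K = gaussian_kernel(A, A, mu)
    H = np.hstack([K, -np.ones((m, 1))])   # H = [K  -e]
    z = np.linalg.solve(np.eye(m + 1) / nu + H.T @ H, H.T @ d)
    return z[:m], z[m]                     # u, gamma

def nonlinear_classify(u, gamma, A_train, X, mu=1.0):
    """Classifier sign(K(x', A')u - gamma), evaluated row-wise over X."""
    return np.sign(gaussian_kernel(X, A_train, mu) @ u - gamma)
```

Unlike the linear version, this fits data that no hyperplane separates, e.g. the four XOR points (1,1), (-1,-1) labeled +1 and (1,-1), (-1,1) labeled -1; the price is the (m+1)-sized system, which motivates the reduced kernel (RSVM) remark above.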