Incremental Reduced Support
Vector Machines
Yuh-Jye Lee, Hung-Yi Lo and Su-Yun Huang
National Taiwan University of Science and Technology and Institute of
Statistical Science, Academia Sinica
2003 International Conference on Informatics, Cybernetics, and Systems
ISU, Kaohsiung, Dec. 14, 2003
Outline
 Support Vector Machines for classification problems
 Linear and nonlinear SVMs
 Difficulties with nonlinear SVMs for large problems
 Storage and computational complexity
 Reduced Support Vector Machines
 Incremental Reduced Support Vector Machines
 Numerical Results
 Conclusions
Support Vector Machines (SVMs)
Powerful Tools for Data Mining
 SVMs have become one of the most promising learning algorithms for classification and regression
 SVMs have a sound theoretical foundation
 Based on statistical learning theory
 SVMs have an optimally defined separating surface
 SVMs can be trained very efficiently and achieve high accuracy
 SVMs can be extended from the linear to the nonlinear case
 By using kernel functions
Support Vector Machines for Classification
Maximizing the Margin between Bounding Planes
[Figure: two parallel bounding planes separating the classes A+ and A-, with the margin between them maximized]
Support Vector Machine Formulation
 Solve the quadratic program for some $\nu > 0$:
$$
(\mathrm{QP})\qquad \min_{(w,\gamma,y)\,\in\,\mathbb{R}^{n+1+m}} \; \nu\, e^{\top}y + \tfrac{1}{2}\, w^{\top}w
\quad \text{s.t.}\quad D(Aw - e\gamma) + y \ge e,\; y \ge 0
$$
where $A \in \mathbb{R}^{m \times n}$ holds the training points and $D_{ii} = \pm 1$ denotes $A+$ or $A-$ membership.
 SSVM: the Smooth Support Vector Machine is an efficient SVM algorithm proposed by Yuh-Jye Lee
Nonlinear Support Vector Machine
 Extend to nonlinear cases by using kernel functions
 The value of the kernel function represents the inner product in the feature space
 Map data from the input space to a higher dimensional feature space where the data can be separated linearly
 Nonlinear Support Vector Machine formulation:
$$
\min_{(u,\gamma,y)} \; \nu\, e^{\top}y + \tfrac{1}{2}\, u^{\top}u
\quad \text{s.t.}\quad D\big(K(A, A^{\top})\,u - e\gamma\big) + y \ge e,\; y \ge 0
$$
where $K(A, A^{\top}) \in \mathbb{R}^{m \times m}$ is the full kernel matrix.
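The talk does not fix a particular kernel, so as an illustration here is a minimal sketch of the Gaussian (RBF) kernel, a common choice with SSVM and RSVM; the function name and the width parameter `gamma` are our own choices, not taken from the slides.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=0.1):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||A[i] - B[j]||^2).

    A: (m, n) array, B: (k, n) array -> returns an (m, k) array.
    """
    # ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2, computed for all pairs at once
    sq = (A ** 2).sum(1)[:, None] - 2.0 * (A @ B.T) + (B ** 2).sum(1)[None, :]
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negatives from rounding
```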
Difficulties with Nonlinear SVM
for Large Problems
 The nonlinear kernel $K(A, A^{\top}) \in \mathbb{R}^{m \times m}$ is fully dense
 Long CPU time to compute the $m^{2}$ kernel entries
 Runs out of memory while storing the $m \times m$ kernel matrix
 Computational complexity depends on $m$
 Complexity of nonlinear SSVM: roughly $O(m^{3})$
 The separating surface depends on almost the entire dataset
 Need to store the entire dataset after solving the problem
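For a concrete sense of scale: with $m = 100{,}000$ training points, the full kernel matrix already has $10^{10}$ entries, roughly 80 GB in double precision, which makes both computing and storing it infeasible.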
Reduced Support Vector Machines
Overcoming Computational & Storage Difficulties by
Using a Rectangular Kernel
 Choose a small random sample $\bar{A} \in \mathbb{R}^{\bar{m} \times n}$ of $A$
 The small random sample $\bar{A}$ is a representative sample of the entire dataset
 Typically $\bar{A}$ is 1% to 10% of the rows of $A$
 Replace $K(A, A^{\top})$ by $K(A, \bar{A}^{\top})$ with the corresponding $\bar{u} \in \mathbb{R}^{\bar{m}}$ in nonlinear SSVM
 Only need to compute and store $m \times \bar{m}$ numbers for the rectangular kernel
 Computational complexity reduces to roughly $O(\bar{m}^{3})$
 The nonlinear separator only depends on $\bar{A}$ (see the sketch below)
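A minimal sketch of the rectangular kernel construction, reusing the hypothetical `gaussian_kernel` helper above; the 10% default sampling ratio is one of the figures quoted in the slide, and the function name is ours.

```python
import numpy as np

def reduced_kernel(A, reduction=0.10, gamma=0.1, rng=None):
    """Randomly sample a reduced set A_bar from A and build K(A, A_bar^T).

    Storage falls from m*m kernel entries to m*m_bar entries.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = A.shape[0]
    m_bar = max(2, int(reduction * m))              # 1% to 10% of the rows of A
    idx = rng.choice(m, size=m_bar, replace=False)  # sample without replacement
    A_bar = A[idx]
    return gaussian_kernel(A, A_bar, gamma), A_bar
```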
Reduced Set
 The reduced set $\bar{A}$ plays the most important role in RSVM
 It is natural to raise two questions:
 Is there a way to choose the reduced set other than random selection so that RSVM will have better performance?
 Is there a mechanism that determines the size of the reduced set automatically or dynamically?
 The Incremental Reduced Support Vector Machine is proposed to answer these questions
Our Observations (Ⅰ)
 The nonlinear separating surface
$$
K(x^{\top}, \bar{A}^{\top})\,\bar{u} = \gamma
$$
is a linear combination of the set of kernel functions $\{K(x^{\top}, \bar{A}_i^{\top})\}_{i=1}^{\bar{m}}$
 If the kernel functions are very similar, the hypothesis space spanned by these kernel functions will be very limited
Our Observations (Ⅱ)
 Start with a very small reduced set, then add a new data point only when its kernel function is dissimilar to the current set of kernel functions
 These points contribute the most extra information
How to Measure Dissimilarity?
Solving Least Squares Problems
 The information criterion: the distance from the kernel vector $K(A, A_i^{\top})$ to the column space of the current reduced kernel matrix $K(A, \bar{A}^{\top})$ is greater than a threshold
 This distance can be determined by solving a least squares problem
Dissimilarity Measurement by
Solving Least Squares Problems
 The least squares problem is
$$
\min_{\beta \in \mathbb{R}^{\bar{m}}} \big\| K(A, \bar{A}^{\top})\,\beta - K(A, A_i^{\top}) \big\|_2^2
$$
 It has a unique solution $\beta^{*} = (\bar{K}^{\top}\bar{K})^{-1}\bar{K}^{\top}k_i$, where $\bar{K} = K(A, \bar{A}^{\top})$ and $k_i = K(A, A_i^{\top})$, and the distance is $\|\,k_i - \bar{K}\beta^{*}\|_2$ (a numerical sketch follows below)
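A one-function sketch of this distance computation using NumPy's least squares solver; the function and variable names are ours.

```python
import numpy as np

def distance_to_column_space(K_bar, k_i):
    """Distance from the kernel vector k_i to the column space of K_bar.

    Solves min_beta ||K_bar @ beta - k_i||_2 and returns the residual norm.
    """
    beta, *_ = np.linalg.lstsq(K_bar, k_i, rcond=None)
    return float(np.linalg.norm(K_bar @ beta - k_i))
```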
IRSVM Algorithm Pseudo-code
(Sequential Version)
1  Randomly choose two data points from the training data as the initial reduced set
2  Compute the reduced kernel matrix
3  For each data point not in the reduced set:
4      Compute its kernel vector
5      Compute the distance from the kernel vector
6          to the column space of the current reduced kernel matrix
7      If its distance exceeds a certain threshold:
8          Add this point to the reduced set and form the new reduced kernel matrix
9  Until several successive failures occur at line 7
10 Solve the QP problem of nonlinear SVM with the obtained reduced kernel
11 A new data point is classified by the separating surface
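Putting the pieces together, here is a runnable sketch of the sequential selection loop (lines 1-9). The threshold value, the patience count standing in for "several successive failures", and the reuse of the hypothetical `gaussian_kernel` helper above are all our assumptions; the final QP solve of line 10 is omitted.

```python
import numpy as np

def irsvm_select(A, threshold=0.5, patience=10, gamma=0.1, rng=None):
    """Sequential IRSVM reduced-set selection (sketch).

    Returns the indices of the selected reduced set.
    """
    rng = np.random.default_rng() if rng is None else rng
    order = rng.permutation(A.shape[0])
    reduced = list(order[:2])                      # line 1: two random seed points
    K_bar = gaussian_kernel(A, A[reduced], gamma)  # line 2: reduced kernel matrix
    failures = 0
    for i in order[2:]:                            # line 3
        k_i = gaussian_kernel(A, A[[i]], gamma)[:, 0]   # line 4: kernel vector
        beta, *_ = np.linalg.lstsq(K_bar, k_i, rcond=None)
        dist = np.linalg.norm(K_bar @ beta - k_i)  # lines 5-6: distance to col. space
        if dist > threshold:                       # line 7
            reduced.append(i)                      # line 8: grow the reduced set
            K_bar = np.hstack([K_bar, k_i[:, None]])
            failures = 0
        else:
            failures += 1
            if failures >= patience:               # line 9: successive failures
                break
    return reduced
```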
Speed Up IRSVM
 Note that we have to solve the least squares problem many times; each solve has time complexity $O(m\,\bar{m}^{2})$
 The main cost is forming $\bar{K}^{\top}\bar{K}$, which depends on the current reduced kernel matrix but not on the candidate point being examined
 Taking advantage of this fact, we propose a batch version of IRSVM that examines a batch of points at once
IRSVM Algorithm Pseudo-code
(Batch Version)
1  Randomly choose two data points from the training data as the initial reduced set
2  Compute the reduced kernel matrix
3  For each batch of data points not in the reduced set:
4      Compute their kernel vectors
5      Compute the corresponding distances from these kernel vectors
6          to the column space of the current reduced kernel matrix
7      For those points whose distances exceed a certain threshold:
8          Add those points to the reduced set and form the new reduced kernel matrix
9  Until no data points in the batch were added at lines 7-8
10 Solve the QP problem of nonlinear SVM with the obtained reduced kernel
11 A new data point is classified by the separating surface
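The batch saving can be seen in code: the Gram matrix $\bar{K}^{\top}\bar{K}$ is formed once per batch and reused for every candidate column. Below is a sketch under the assumption that $\bar{K}$ has full column rank, so the normal equations are solvable; names are ours.

```python
import numpy as np

def batch_distances(K_bar, K_batch):
    """Distances from each column of K_batch to the column space of K_bar.

    K_bar: (m, m_bar) reduced kernel; K_batch: (m, b) kernel vectors of b candidates.
    """
    gram = K_bar.T @ K_bar              # formed once, reused for the whole batch
    rhs = K_bar.T @ K_batch
    betas = np.linalg.solve(gram, rhs)  # normal equations; assumes full column rank
    residuals = K_batch - K_bar @ betas
    return np.linalg.norm(residuals, axis=0)
```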
IRSVM on four public data sets
Conclusions
 IRSVM: an advanced version of the RSVM algorithm
 Starts with an extremely small reduced set and sequentially expands it to include informative data points
 Determines the size of the reduced set automatically and dynamically, with no pre-specified size
 The reduced set generated by IRSVM is more representative
 All advantages of RSVM for dealing with large scale nonlinear classification problems are retained
 Experimental tests show that IRSVM uses a smaller reduced set without sacrificing classification accuracy