Discriminant Analysis

Outline
• Introduction
• Linear Discriminant Analysis
• Examples

Introduction
• What is Discriminant Analysis?
• A statistical technique for classifying objects into mutually exclusive and exhaustive groups based on a set of measurable features of the objects

Introduction
• Purpose of Discriminant Analysis
• To classify objects (people, customers, things, etc.) into one of two or more groups based on a set of features that describe the objects (e.g. gender, age, income, weight, preference score, etc.)
• Two things to check
• Which set of features best determines group membership of the object? (Feature selection)
• What classification rule or model best separates those groups? (Classification)

Outline
• Introduction
• Linear Discriminant Analysis
• Examples

Linear Discriminant Analysis (LDA)
• Linear discriminant analysis (LDA)
• Also called Fisher's linear discriminant
• A method used in statistics and machine learning to find the linear combination of features that best separates two or more classes of objects or events
• The resulting combination may be used as a linear classifier or, more commonly, for dimensionality reduction before later classification

Dimensionality Reduction
• Curse of dimensionality:
• Problems caused by high-dimensional feature vectors
• Data sparsity
• Undertrained classifiers
• Goal:
• Reduce the dimension of the feature vectors without loss of information

Linear Discriminant Analysis (LDA)
• Goal:
• Try to optimize class separability
• Also known as Fisher's discriminant analysis

Linear Discriminant Analysis (LDA)
• Problem statement
• Assign a class category (group, or class label), e.g. "good" or "bad", to each product
• The class category is also called the dependent variable
• The assignment is based on features
• Each measurement on the product is a feature that describes the object; a feature is also called an independent variable
• The dependent variable (Y) is the group
• The dependent variable is always a categorical (nominal scale) variable
• The independent variables (X) are the object features that might describe the group
• Independent variables can be of any measurement scale (i.e. nominal, ordinal, interval, or ratio)

Linear Discriminant Analysis (LDA)
• Linear Discriminant Analysis (LDA)
• Assumes that the groups are linearly separable
• Uses a linear discriminant model
• What does "linearly separable" mean?
• It means the groups can be separated by a linear combination of the features that describe the objects

Linear Discriminant Analysis (LDA)
• PCA vs. LDA
• PCA tries to find the directions of greatest variance in the dataset, ignoring class labels
• LDA tries to optimize class separability

Linear Discriminant Analysis (LDA)
• Goal of LDA: try to maximize class separability

Different Approaches to LDA
• Class-dependent transformation
• Maximizes the ratio of between-class variance to within-class variance
• Involves two optimizing criteria, transforming each class's data set independently
• Class-independent transformation
• Maximizes the ratio of overall variance to within-class variance
• Uses only one optimizing criterion, so all data points, irrespective of their class identity, are transformed by this single transform (see the sketch below)
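Before the hand-worked example, here is a minimal sketch of LDA in practice as a dimensionality reducer and linear classifier. It assumes scikit-learn is available; the toy data points and the test point are invented for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Invented two-class, 2-D toy data (four points per class).
X = np.array([[4, 2], [2, 4], [2, 3], [3, 6],
              [9, 10], [6, 8], [9, 5], [8, 7]], dtype=float)
y = np.array([1, 1, 1, 1, 2, 2, 2, 2])

# Fit Fisher's LDA; with two classes at most one discriminant axis exists.
lda = LinearDiscriminantAnalysis(n_components=1)
Z = lda.fit_transform(X, y)      # 2-D features projected onto a 1-D axis
print(lda.predict([[5.0, 5.0]])) # reuse the same fit as a linear classifier
```

Note that scikit-learn fits a single transform that is applied to all points regardless of class, i.e. the class-independent flavor described above.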
Numerical Example
• Given a two-class problem
• Input: two sets of 2-D data points (Class 1 and Class 2)

Numerical Example
• Step 1
• Compute the mean of each data set and the mean of the entire data set
• Data points in Class 1 → mean of Set 1: µ1, an n×1 column vector
• Data points in Class 2 → mean of Set 2: µ2, an n×1 column vector
• Data points in both Class 1 and Class 2 → mean of the entire data: µ3, an n×1 column vector
• where n is the number of dimensions; in our case n = 2

Numerical Example
• Step 2
• Compute the between-class scatter matrix $S_b$ and the within-class scatter matrix $S_w$
• Within-class scatter matrix: $S_w = \sum_j p_j C_j$
• where $p_j$ is the prior probability of the jth class and $C_j$ is the covariance matrix of the jth class (set j)
• Between-class scatter matrix: $S_b = \sum_j (\mu_j - \mu_3)(\mu_j - \mu_3)^T$
• where $\mu_3$ is the mean of the entire data and $\mu_j$ is the mean of the jth class (set j)

Numerical Example
• Step 3
• Eigenvector computation
• Class-dependent transformation
• Optimizing criterion: maximize the ratio of between-class variance to within-class variance
• Obtain the eigenvectors (transform_j) of $C_j^{-1} S_b$, one transform per class
• Involves two optimizing criteria, transforming each class's data set independently
• Class-independent transformation
• Optimizing criterion: maximize the ratio of overall variance to within-class variance
• Obtain the eigenvectors (transform_spec) of $S_w^{-1} S_b$
• Uses only one optimizing criterion, so all data points, irrespective of their class identity, are transformed by this single transform

Numerical Example
• Step 4
• Transformed matrix calculation
• Class-dependent: transformed_set_j = transform_j^T × set_j, where "transform_j" is composed of the eigenvectors of $C_j^{-1} S_b$
• Class-independent: transformed_set = transform_spec^T × data_set, where "transform_spec" is composed of the eigenvectors of $S_w^{-1} S_b$

Numerical Example
• Step 5
• Euclidean distance calculation: $d_n = \lVert \text{transform}_n^T x - \mu_{n\,\text{trans}} \rVert$
• where $\mu_{n\,\text{trans}}$ is the mean of the transformed data set, n is the class index, and x is the test vector
• For n classes, n Euclidean distances are obtained for each test point

Numerical Example
• Step 6
• Classification: the smallest Euclidean distance among the n distances classifies the test vector as belonging to class n

Extension to Multiple Classes
• Between-class scatter matrix: $S_b = \sum_{j=1}^{C} (\mu_j - \mu)(\mu_j - \mu)^T$, where C is the number of classes and $\mu$ is the mean of the entire data

Extension to Multiple Classes
• Within-class scatter matrix: $S_w = \sum_{j=1}^{C} p_j C_j$

Extension to Multiple Classes
• Solve the eigenvalue problem $S_w^{-1} S_b \vec{\phi}_i = \lambda \vec{\phi}_i$; the eigenvectors $\vec{\phi}_i$ give the discriminant directions
• Questions?
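To tie the steps together, here is a compact NumPy sketch of the class-independent recipe (Steps 1–6). The two data sets and the test point are invented for illustration, since the slide's original numbers are not preserved in this text.

```python
import numpy as np

# Invented 2-D data sets (the slide's original points are not preserved here).
set1 = np.array([[4.0, 2.0], [2.0, 4.0], [2.0, 3.0], [3.0, 6.0], [4.0, 4.0]])
set2 = np.array([[9.0, 10.0], [6.0, 8.0], [9.0, 5.0], [8.0, 7.0], [10.0, 8.0]])

# Step 1: class means and the mean of the entire data set.
mu1, mu2 = set1.mean(axis=0), set2.mean(axis=0)
mu3 = np.vstack([set1, set2]).mean(axis=0)

# Step 2: within-class scatter Sw = sum_j p_j * C_j and
# between-class scatter Sb = sum_j (mu_j - mu3)(mu_j - mu3)^T.
p1 = len(set1) / (len(set1) + len(set2))
p2 = 1.0 - p1
Sw = (p1 * np.cov(set1, rowvar=False, bias=True)
      + p2 * np.cov(set2, rowvar=False, bias=True))
Sb = np.outer(mu1 - mu3, mu1 - mu3) + np.outer(mu2 - mu3, mu2 - mu3)

# Step 3 (class-independent): eigenvectors of Sw^{-1} Sb;
# keep the direction belonging to the largest eigenvalue.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
w = eigvecs[:, np.argmax(eigvals.real)].real

# Step 4: project both data sets onto the discriminant direction.
proj1, proj2 = set1 @ w, set2 @ w

# Steps 5 and 6: classify a test vector by its Euclidean distance to each
# transformed class mean; the smallest distance decides the class.
x = np.array([5.0, 5.0])
d1, d2 = abs(x @ w - proj1.mean()), abs(x @ w - proj2.mean())
print("test point belongs to class", 1 if d1 < d2 else 2)
```

The same sketch extends to more than two classes by summing the per-class terms in $S_w$ and $S_b$ over all C classes and keeping up to C − 1 eigenvectors, as in the multiclass scatter matrices above.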