Discriminant Analysis
Outline
• Introduction
• Linear Discriminant Analysis
• Examples
Introduction
• What is Discriminant Analysis?
• A statistical technique for classifying objects into mutually exclusive and exhaustive groups based on a set of measurable features of the objects
Introduction
• Purpose of Discriminant Analysis
• To classify objects (people, customers, things, etc.) into one of two or more groups based on a set of features that describe the objects (e.g. gender, age, income, weight, preference score, etc.)
• Two things to determine
• Which set of features best determines group membership of the object? (Feature Selection)
• What is the classification rule or model that best separates those groups? (Classification)
Outline
• Introduction
• Linear Discriminant Analysis
• Examples
Linear Discriminant Analysis (LDA)
• Linear discriminant analysis (LDA)
• Also called Fisher's linear discriminant
• A method used in statistics and machine learning to find the linear combination of features that best separates two or more classes of objects or events
• The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification
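A minimal sketch of these two uses (linear classifier and dimensionality reduction), assuming scikit-learn is available; the synthetic data and variable names below are illustrative, not from the slides:

from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical 2-class data with 10 features
X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                           n_classes=2, random_state=0)

# Use 1: LDA as a linear classifier
clf = LinearDiscriminantAnalysis().fit(X, y)
print("training accuracy:", clf.score(X, y))

# Use 2: LDA for dimensionality reduction before later classification
# (at most n_classes - 1 = 1 discriminant direction here)
reducer = LinearDiscriminantAnalysis(n_components=1)
X_reduced = reducer.fit_transform(X, y)
print("reduced shape:", X_reduced.shape)   # (200, 1)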
Dimensionality Reduction
• Curse of dimensionality:
• Problems caused by a high dimension of the feature vectors
• Data sparsity
• Undertrained classifier
• Goal:
• Reduce the dimension of the feature vectors without loss of information
Linear Discriminant Analysis (LDA)
• Goal:
• Try to optimize class separability
• Also known as Fisher's discriminant analysis
Linear Discriminant Analysis (LDA)
• Problem statement
• Assign a class category (the group, or class label), e.g. “good” or “bad”, to each product
• The class category is also called the dependent variable
• The assignment is based on features
• Each measurement on the product is a feature that describes the object
• A feature is also called an independent variable
• The dependent variable (Y) is the group
• The dependent variable is always a categorical (nominal scale) variable
• The independent variables (X) are the object features that might describe the group
• Independent variables can be on any measurement scale (i.e. nominal, ordinal, interval or ratio)
Linear Discriminant Analysis (LDA)
• Linear Discriminant Analysis (LDA)
• Assumes that the groups are linearly separable
• Uses a linear discriminant model
• What does linearly separable mean?
• It means that the groups can be separated by a linear combination of the features that describe the objects
Linear Discriminant Analysis (LDA)
• PCA vs LDA
• PCA tries to find the directions of strongest variation (correlation) in the dataset, without using class labels
• LDA tries to optimize class separability, using the class labels
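To make the contrast concrete, a small sketch (again with made-up synthetic data): PCA is fit on the features alone, while LDA also consumes the class labels:

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           n_classes=2, random_state=0)

# PCA: unsupervised, keeps the direction of largest variance; y is never used
X_pca = PCA(n_components=1).fit_transform(X)

# LDA: supervised, keeps the direction that best separates the two classes
X_lda = LinearDiscriminantAnalysis(n_components=1).fit_transform(X, y)

print(X_pca.shape, X_lda.shape)   # both (200, 1), but chosen by different criteria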
Linear Discriminant Analysis (LDA)
• Goal of LDA: try to maximize class separability
Different Approaches to LDA
• Class-dependent transformation
• Maximizes the ratio of between-class variance to within-class variance
• Involves using two optimizing criteria (one per class) for transforming the data sets independently
• Class-independent transformation
• Maximizes the ratio of overall variance to within-class variance
• Uses only one optimizing criterion to transform the data sets, hence all data points, irrespective of their class identity, are transformed using the same transform
Numerical Example
• Given a two-class problem
• Input: two sets of 2-D data points, one for class 1 and one for class 2
Numerical Example
• Step 1
• Compute the mean of each data set and the mean of the entire data set
• Mean of set 1 (data points in class 1): µ1, an n×1 column vector
• Mean of set 2 (data points in class 2): µ2, an n×1 column vector
• Mean of the entire data (data points in both class 1 and class 2): µ3, an n×1 column vector
• where n is the number of dimensions; in our case n = 2
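A sketch of Step 1 in NumPy; the slides' actual data values are not reproduced here, so X1 and X2 below are placeholder 2-D point sets:

import numpy as np

X1 = np.array([[4, 2], [2, 4], [2, 3], [3, 6], [4, 4]], dtype=float)    # class 1 (placeholder)
X2 = np.array([[9, 10], [6, 8], [9, 5], [8, 7], [10, 8]], dtype=float)  # class 2 (placeholder)

mu1 = X1.mean(axis=0)                    # mean of set 1 (length n = 2)
mu2 = X2.mean(axis=0)                    # mean of set 2
mu3 = np.vstack([X1, X2]).mean(axis=0)   # mean of the entire data set
print(mu1, mu2, mu3)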
Numerical Example
• Step 2
• Compute the Between Class Scatter Matrix Sb and the Within Class Scatter Matrix Sw
• Within Class Scatter Matrix: Sw = Σj pj · covj
• where pj is the prior probability of the jth class and covj is the covariance matrix of the jth class (set j)
• Between Class Scatter Matrix: Sb = Σj (µj − µ3)(µj − µ3)ᵀ
• where µ3 is the mean of the entire data and µj is the mean of the jth class (set j)
• The scatter matrices are n×n, where n is the number of dimensions; in our case n = 2
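A sketch of Step 2, using the same placeholder points as the Step 1 sketch and assuming equal priors p1 = p2 = 0.5:

import numpy as np

X1 = np.array([[4, 2], [2, 4], [2, 3], [3, 6], [4, 4]], dtype=float)
X2 = np.array([[9, 10], [6, 8], [9, 5], [8, 7], [10, 8]], dtype=float)
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
mu3 = np.vstack([X1, X2]).mean(axis=0)

# Within-class scatter: prior-weighted sum of the class covariance matrices
cov1 = np.cov(X1.T, bias=True)
cov2 = np.cov(X2.T, bias=True)
Sw = 0.5 * cov1 + 0.5 * cov2

# Between-class scatter: spread of the class means around the overall mean
Sb = np.outer(mu1 - mu3, mu1 - mu3) + np.outer(mu2 - mu3, mu2 - mu3)
print(Sw, Sb, sep="\n")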
Numerical Example
• Step 3
• Eigenvector computation
• Class-dependent transformation:
• Optimizing criterion: maximize the ratio of between-class variance to within-class variance; this involves using two optimizing criteria (one per class) for transforming the data sets independently
• Obtain the eigenvectors transform_j of the criterion matrix covj⁻¹ · Sb for each class j
• Class-independent transformation:
• Optimizing criterion: maximize the ratio of overall variance to within-class variance; only one optimizing criterion is used to transform the data sets, hence all data points, irrespective of their class identity, are transformed using this transform
• Obtain the eigenvectors transform_spec of the criterion matrix Sw⁻¹ · Sb
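A sketch of Step 3 for both criteria; Sw, Sb, cov1 and cov2 are recomputed here exactly as in the Step 2 sketch so the block stands alone:

import numpy as np

X1 = np.array([[4, 2], [2, 4], [2, 3], [3, 6], [4, 4]], dtype=float)
X2 = np.array([[9, 10], [6, 8], [9, 5], [8, 7], [10, 8]], dtype=float)
mu1, mu2, mu3 = X1.mean(0), X2.mean(0), np.vstack([X1, X2]).mean(0)
cov1, cov2 = np.cov(X1.T, bias=True), np.cov(X2.T, bias=True)
Sw = 0.5 * cov1 + 0.5 * cov2
Sb = np.outer(mu1 - mu3, mu1 - mu3) + np.outer(mu2 - mu3, mu2 - mu3)

# Class-independent criterion: eigenvectors of inv(Sw) @ Sb -> transform_spec
_, transform_spec = np.linalg.eig(np.linalg.inv(Sw) @ Sb)

# Class-dependent criterion: eigenvectors of inv(cov_j) @ Sb -> one transform per class
_, transform_1 = np.linalg.eig(np.linalg.inv(cov1) @ Sb)
_, transform_2 = np.linalg.eig(np.linalg.inv(cov2) @ Sb)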
Numerical Example
• Step 4
• Transformed data set calculation
• Class-dependent: transformed_set_j = transform_jᵀ · set_j, where “transform_j” is composed of the eigenvectors from the class-dependent criterion
• Class-independent: transformed_set = transform_specᵀ · data_set, where “transform_spec” is composed of the eigenvectors from the class-independent criterion
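A sketch of Step 4, assuming the eigenvector matrices transform_spec, transform_1, transform_2 and the placeholder data X1, X2 from the Step 3 sketch are still in scope; each data set is projected onto the discriminant directions:

# Class-independent case: the same transform is applied to every data point
X1_trans = X1 @ transform_spec
X2_trans = X2 @ transform_spec

# Class-dependent case: each class is projected with its own transform
X1_trans_dep = X1 @ transform_1
X2_trans_dep = X2 @ transform_2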
Numerical Example
• Step 5
• Euclidean distance calculation
• For each class n, compute dist_n = || transform_nᵀ · x − µ_ntrans ||
• where µ_ntrans is the mean of the nth transformed data set, n is the class index, and x is the test vector
• For n classes, n Euclidean distances are obtained for each test point
Numerical Example
• Step 6
• Classification: the test vector is assigned to the class with the smallest of the n Euclidean distances
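A sketch of Steps 5 and 6 for the class-independent case, continuing from the Step 3 sketch (transform_spec, mu1, mu2 assumed in scope); the test vector is made up for illustration:

import numpy as np

x_test = np.array([5.0, 5.0])              # hypothetical test vector

x_trans = x_test @ transform_spec          # transformed test vector
mu1_trans = mu1 @ transform_spec           # transformed class means
mu2_trans = mu2 @ transform_spec

d1 = np.linalg.norm(x_trans - mu1_trans)   # Euclidean distance to class 1
d2 = np.linalg.norm(x_trans - mu2_trans)   # Euclidean distance to class 2
predicted_class = 1 if d1 < d2 else 2      # Step 6: smallest distance wins
print(d1, d2, predicted_class)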
Extension to Multiple Classes
• Between Class Scatter Matrix (standard multi-class form): Sb = Σj Nj (µj − µ)(µj − µ)ᵀ, summing over the classes j, where Nj is the number of samples in class j, µj is the mean of class j, and µ is the overall mean
Extension to Multiple Classes
• Within Class Scatter Matrix (standard multi-class form): Sw = Σj Σ(x in class j) (x − µj)(x − µj)ᵀ
Extension to Multiple Classes
• The discriminant directions are the eigenvectors φi of Sw⁻¹ Sb:
Sw⁻¹ Sb φi = λi φi
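A sketch of the multi-class case; it uses SciPy's generalized symmetric eigensolver, since Sw⁻¹ Sb φ = λ φ is equivalent to Sb φ = λ Sw φ. The three-class, 4-D data is randomly generated for illustration:

import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
classes = [rng.normal(loc=3.0 * c, scale=1.0, size=(30, 4)) for c in range(3)]

mu = np.vstack(classes).mean(axis=0)
# Within-class scatter (sum of class covariances; overall scaling does not change the directions)
Sw = sum(np.cov(Xc.T, bias=True) for Xc in classes)
# Between-class scatter: class means around the overall mean, weighted by class size
Sb = sum(len(Xc) * np.outer(Xc.mean(axis=0) - mu, Xc.mean(axis=0) - mu) for Xc in classes)

eigvals, eigvecs = eigh(Sb, Sw)                    # solves Sb phi = lambda Sw phi
W = eigvecs[:, np.argsort(eigvals)[::-1][:2]]      # keep the C - 1 = 2 strongest directions
print(W.shape)                                     # (4, 2)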
• Questions?