Principal Components Analysis
Gang Ren
2015.11.25
Background
Advances in data collection and storage capabilities during the past decades
have led to an information overload in most sciences. High-dimensional datasets
present many challenges, as well as some opportunities, for analysis. One of the
problems with high-dimensional datasets is that, in many cases, not all the
measured features are "important" for understanding the underlying phenomena of
interest. We therefore need techniques for dimension reduction.
Dimension reduction
In machine learning and statistics, dimensionality reduction or dimension
reduction is the process of reducing the number of random variables under
consideration.
Formally, it is a mapping
f: x -> y
where the dimension of y is lower than the dimension of x.
Main dimension reduction techniques
• Principal component analysis (PCA)
• Linear discriminant analysis (LDA)
• Locally linear embedding (LLE)
• Laplacian eigenmaps
PCA
PCA is probably the most widely used and well-known of the dimension reduction
methods; it was introduced by Pearson (1901) and Hotelling (1933).
In essence, PCA seeks to reduce the dimension of the data by finding a few
orthogonal linear combinations (the principal components, or PCs) of the
original variables with the largest variance.
PCA is a useful statistical technique that has found application in fields such
as face recognition, gene expression analysis and image compression.
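As a quick illustration of PCA in practice (added for this transcript, not part of the original slides), here is a minimal sketch using scikit-learn's PCA class; the random data and the choice of two components are assumptions made only for this example.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))         # 100 samples, 5 features (made-up data)

pca = PCA(n_components=2)             # keep the 2 directions of largest variance
Y = pca.fit_transform(X)              # project the data onto those directions

print(Y.shape)                        # (100, 2)
print(pca.explained_variance_ratio_)  # fraction of the variance kept by each PC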
Background mathematics
Before getting to a description of PCA, we first introduce
mathematical concepts that will be used in PCA.
Standard deviation
Variance
Covariance
The covariance Matrix
Eigenvectors and Eigenvalues
Standard deviation
The centroid of the points is defined by the mean of each
dimension
But the mean doesn’t tell us a lot about the data except for a
sort of middle point.
Here’s an example:
A=[0,8,12,20]
B=[8,9,11,12]
(these two data sets have exactly the same mean (10), but
are obviously quite different)
Standard deviation
The Standard Deviation (SD) of a data set is a measure of how spread out the
data is:

s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}

And so, as expected, the first set (A) has a much larger standard deviation,
because its data are much more spread out from the mean.
Variance
Variance is another measure of the spread of the data; it is simply the square
of the standard deviation:

var(X) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}
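To make the A/B example above concrete, here is a small Python sketch (added for this transcript, not part of the original slides) that computes the mean, variance and standard deviation of both sets with the n - 1 denominator used in the formulas:

import numpy as np

A = np.array([0, 8, 12, 20])
B = np.array([8, 9, 11, 12])

for name, data in (("A", A), ("B", B)):
    mean = data.mean()
    var = data.var(ddof=1)   # sample variance: divides by n - 1
    sd = data.std(ddof=1)    # sample standard deviation
    print(name, "mean =", mean, "variance =", round(var, 3), "SD =", round(sd, 3))

# Both sets have mean 10, but A is far more spread out:
# A: variance ~ 69.333, SD ~ 8.327;  B: variance ~ 3.333, SD ~ 1.826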
Covariance
However, many data sets have more than one dimension, and the aim of the
statistical analysis of these data sets is usually to see if there is any
relationship between the dimensions.
Covariance is always measured between 2 dimensions. If you calculate the
covariance between one dimension and itself, you get the variance:

cov(X, X) = var(X)
Covariance
If X and Y are two different dimensions, the formula for the covariance is

cov(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}

where the sum runs over all n objects, and \bar{X} and \bar{Y} are the means of
variables X and Y.
The sign of the covariance indicates how X and Y vary together:
• cov(X,Y) = 0: X and Y are uncorrelated (zero covariance does not by itself imply independence)
• cov(X,Y) > 0: X and Y tend to move in the same direction
• cov(X,Y) < 0: X and Y tend to move in opposite directions
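A small Python check of the covariance formula (added for illustration; the two arrays below are made-up data):

import numpy as np

X = np.array([2.1, 2.5, 3.6, 4.0])
Y = np.array([8.0, 10.0, 12.0, 14.0])

n = len(X)
cov_xy = np.sum((X - X.mean()) * (Y - Y.mean())) / (n - 1)   # the formula above
print(cov_xy)                  # positive: X and Y tend to move together
print(np.cov(X, Y)[0, 1])      # numpy's built-in covariance gives the same value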
The covariance Matrix
Recall that covariance is always measured between
2 dimensions. If we have a data set with more than 2
dimensions, there is more than one covariance
measurement that can be calculated.
So the covariance matrix for a set of data with n dimensions is the n x n matrix

C^{n \times n} = (c_{i,j}), \qquad c_{i,j} = cov(Dim_i, Dim_j),

where Dim_i denotes the i-th dimension.
The covariance Matrix
• An example: we'll write out the covariance matrix for an imaginary
3-dimensional data set, using the usual dimensions x, y and z. Then the
covariance matrix has 3 rows and 3 columns, and the values are:

C = \begin{pmatrix} cov(x,x) & cov(x,y) & cov(x,z) \\ cov(y,x) & cov(y,y) & cov(y,z) \\ cov(z,x) & cov(z,y) & cov(z,z) \end{pmatrix}

The diagonal entries are the variances of each dimension, and the matrix is
symmetric because cov(a, b) = cov(b, a).
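A sketch of this 3-dimensional case in Python (the x, y, z values are made up for illustration): with one observation per row, np.cov(..., rowvar=False) returns exactly the 3 x 3 matrix of pairwise covariances described above.

import numpy as np

# Made-up data: 5 observations of the dimensions x, y, z (one row per observation).
data = np.array([
    [2.5, 2.4, 1.0],
    [0.5, 0.7, 2.1],
    [2.2, 2.9, 0.3],
    [1.9, 2.2, 1.1],
    [3.1, 3.0, 0.5],
])

C = np.cov(data, rowvar=False)   # C[i, j] = cov(dimension i, dimension j)
print(C)                         # 3x3, symmetric, variances on the diagonal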
Eigenvectors and Eigenvalues
In linear algebra, an eigenvector or characteristic vector of a square matrix
is a vector that does not change its direction under the associated linear
transformation. In other words, if x is a nonzero vector, then it is an
eigenvector of a square matrix A if Ax is a scalar multiple of x. This
condition can be written as the equation

A x = \lambda x

where λ is a number (also called a scalar) known as the eigenvalue or
characteristic value associated with the eigenvector x.
Eigenvectors and Eigenvalues
To find the eigenvalues, solve the characteristic equation det(A - λI) = 0,
where I is the identity matrix.
Solution: for each eigenvalue λ, any nonzero solution x of (A - λI)x = 0 is a
corresponding eigenvector.
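As a quick numerical illustration (added here; the 2 x 2 matrix is an arbitrary example), numpy can solve Ax = λx directly:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)   # columns of `eigenvectors` are the x in Ax = lambda*x
print(eigenvalues)                             # 3 and 1 (the order is not guaranteed)
print(eigenvectors)

# Check the defining equation for the first returned pair:
x, lam = eigenvectors[:, 0], eigenvalues[0]
print(np.allclose(A @ x, lam * x))             # True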
Change of basis
•In linear algebra, a basis for a vector space of dimension n is a
sequence of n vectors (α1, …, αn) with the property that every
vector in the space can be expressed uniquely as a linear
combination of the basis vectors.
A change of basis can be written as a matrix product. Stack the new basis
vectors p_i as the rows of a matrix P and the original data records a_j as the
columns of a matrix A; then the product

Y = P A

contains, in its j-th column, the coordinates of a_j in the new basis.
• Here p_i is a row vector denoting the i-th basis vector, and a_j is a column
vector denoting the j-th original data record.
Change of basis
Example: the vector (3, 2) in the standard basis (1, 0) and (0, 1); its
coordinates are exactly 3 and 2 because (3, 2) = 3·(1, 0) + 2·(0, 1).
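To illustrate the Y = PA view of a basis change (a sketch added for this transcript; the rotated basis below is an assumed example, not taken from the slides), consider the same point (3, 2):

import numpy as np

a = np.array([[3.0],
              [2.0]])                 # the point (3, 2) as a column vector

# In the standard basis (1,0), (0,1) the coordinates are just 3 and 2:
P_std = np.array([[1.0, 0.0],
                  [0.0, 1.0]])        # rows are the basis vectors
print(P_std @ a)                      # [[3.], [2.]]

# A different (orthonormal) basis: the axes rotated by 45 degrees.
s = 1.0 / np.sqrt(2.0)
P_rot = np.array([[ s, s],
                  [-s, s]])           # rows are the new basis vectors
print(P_rot @ a)                      # coordinates of (3, 2) in the rotated basis: [5/sqrt(2), -1/sqrt(2)]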
Example for 2-D to 1-D
• Example: start from a small set of 2-D data points and subtract the mean of
each dimension from the data (mean centering), so that every dimension has zero
mean.
Example for 2-D to 1-D
Desirable outcome: project the 2-D points onto a single direction while losing
as little information as possible.
Key observation: the best direction is the one along which the variance of the
projected data is largest!
How can we find it?
Example for 2-D to 1-D
Process: compute the covariance matrix of the mean-centered data X (records as
columns, m records in total):

C = \frac{1}{m} X X^{T} = \begin{pmatrix} 6/5 & 4/5 \\ 4/5 & 6/5 \end{pmatrix}
Example for 2-D to 1-D
Then find the eigenvalues and eigenvectors of C: solve det(C - λI) = 0 for each
eigenvalue λ, and (C - λI)x = 0 for the corresponding eigenvector.
Example for 2-D to 1-D
Result: projecting the data onto the unit eigenvector with the largest
eigenvalue gives the 1-D representation Y = p_1 X, where p_1 is that
eigenvector normalized to length 1.
Example for 2-D to 1-D
For this dataset:
λ1 = 2, λ2 = 2/5;
C1 = [1, 1]', C2 = [-1, 1]'.
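The slides' intermediate matrices did not survive extraction, so here is a numpy re-run of the example (a sketch; the mean-centered points below are an assumption chosen to be consistent with the covariance matrix above and the stated eigenvalues λ1 = 2, λ2 = 2/5 and eigenvectors [1,1]', [-1,1]'):

import numpy as np

# Assumed mean-centered 2-D data, one record per column.
X = np.array([[-1.0, -1.0, 0.0, 2.0, 0.0],
              [-2.0,  0.0, 0.0, 1.0, 1.0]])
m = X.shape[1]

C = X @ X.T / m                    # covariance matrix (1/m convention): [[6/5, 4/5], [4/5, 6/5]]
print(C)

eigenvalues, eigenvectors = np.linalg.eigh(C)
print(eigenvalues)                 # [0.4, 2.0]  ->  lambda2 = 2/5, lambda1 = 2

p1 = eigenvectors[:, -1]           # unit eigenvector of the largest eigenvalue, (1/sqrt(2))*[1, 1] up to sign
Y = p1 @ X                         # the 1-D representation of the five points
print(Y)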
Steps of PCA
• Let \bar{X} be the mean vector of the data X (the mean of each dimension)
• Adjust the original data by subtracting the mean: X' = X - \bar{X}
• Compute the covariance matrix C of the adjusted data X'
• Find the eigenvectors and eigenvalues of C
• Form a matrix P whose rows are the k eigenvectors with the largest
eigenvalues, ordered by decreasing eigenvalue
• Y = P X' is the reduced-dimension result we want (see the sketch below)
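These steps translate almost line-for-line into numpy. The sketch below is a minimal implementation written for this transcript (records are stored as columns, matching the earlier slides; the function name and the example data are only illustrative):

import numpy as np

def pca(X, k):
    """Reduce the columns of X (one data record per column) to k dimensions."""
    mean = X.mean(axis=1, keepdims=True)      # mean of each dimension (each row)
    X_adj = X - mean                          # X' = X - mean
    C = np.cov(X_adj)                         # covariance matrix of the adjusted data
    eigenvalues, eigenvectors = np.linalg.eigh(C)
    order = np.argsort(eigenvalues)[::-1]     # sort eigenvalues from largest to smallest
    P = eigenvectors[:, order[:k]].T          # rows of P are the top-k eigenvectors
    return P @ X_adj                          # Y = P X'

# Example: reduce 3-dimensional records to 2 dimensions.
X = np.random.default_rng(0).normal(size=(3, 50))
Y = pca(X, 2)
print(Y.shape)                                # (2, 50)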
What are the assumptions of PCA?
PCA assumes that relationships among variables are LINEAR: the cloud of points
in p-dimensional space has linear dimensions that can be effectively summarized
by the principal axes.
If the structure in the data is NONLINEAR (the cloud of points twists and
curves its way through p-dimensional space), the principal axes will not be an
efficient and informative summary of the data.
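A quick illustration of this limitation (an added sketch, not from the slides): points on a circle have essentially the same variance in every direction, so PCA finds no dominant axis and cannot compress them to 1-D without losing most of the structure.

import numpy as np

theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
X = np.vstack([np.cos(theta), np.sin(theta)])   # 2-D points on a circle (nonlinear structure)

C = np.cov(X)
print(np.linalg.eigvalsh(C))   # both eigenvalues are (almost) equal: no single
                               # principal axis summarizes the data well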
References
1. Carreira-Perpiñán M Á. A review of dimension reduction techniques. Technical
Report CS-96-09, Department of Computer Science, University of Sheffield, 1997:
1-69.
2. Fodor I K. A survey of dimension reduction techniques. 2002.
3. Smith L I. A tutorial on principal components analysis. Cornell University,
USA, 2002.
4. Duda R O, Hart P E, Stork D G. Pattern Classification. John Wiley & Sons,
2012.
5. http://blog.codinglabs.org/articles/pca-tutorial.html
6. http://m.blog.csdn.net/blog/zhang11wu4/8584305
Thanks for your attention!