COM S 485 Spring 2007 sa349

Reducing Dimensionality:

The main question that arises when we deal with high-dimensional data is, "Where does it fit?" We make use of the following approaches for answering this question.

1. Clustering
2. Reducing dimensions. This can be done in the following ways:
   a. Feature selection: keeping only the important dimensions and discarding the less important ones.
   b. Feature extraction: combining features, as opposed to completely throwing any of them away (e.g., when screening job applications, considering the sum of total years of study and work experience instead of considering work experience alone).

Feature extraction can be done in several ways (linear and non-linear methods). We shall focus on the following linear methods for feature extraction (a small sketch of the first appears at the end of these notes):

1. Principal Component Analysis
2. Linear Discriminant Analysis
3. Latent Semantic Indexing

First, we need to review the following linear algebra concepts.

Review of Basic Linear Algebra Concepts:

Vector: an object description (the data of interest, e.g., the answers to a questionnaire).

Basis: a set of attributes (e.g., every question in a questionnaire is an attribute).

Matrix: a transformation for changing the basis (an operator that takes in objects with some attributes and spits out objects with some other attributes).

Let U = the set of all possible objects (vectors), i.e., the universal set.

Linearity assumption: for all x, y belonging to U, we have x + y belonging to U; for every scalar a, we have ax belonging to U; and A(ax + by) = aA(x) + bA(y).

Geometric interpretation of matrices: suppose we have a set of points in 2D that form a circle, with a1, a2 as the basis. Then matrix multiplication can only rotate the circle and/or scale the basis vectors.

Similarity Measure:

The inner product (or dot product) of two vectors is a measure of the similarity of the vectors. If the two vectors are x and y, their dot product is

    $x^T y = \sum_i x_i y_i$,

and the angle between the vectors is given by

    $\cos\theta = \frac{x^T y}{\|x\|\,\|y\|}$.

The length of the projection of x on y is $\frac{x^T y}{\|y\|}$, where the length of a vector x is $\|x\| = \sqrt{x^T x}$. (A small numeric example appears at the end of these notes.)

Eigenvalues and eigenvectors:

Definition: if A is a matrix, then x is an eigenvector of A with eigenvalue $\lambda$ if $Ax = \lambda x$ (a direction in which the transformation A only changes the scale).

FACT: for A symmetric, the eigenvectors can be chosen orthogonal.

FACT: for A symmetric, with eigenvalues ordered $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$,

    $\lambda_1 = \max_{\|x\|=1} x^T A x$  (magnitude of the largest stretch),

    $\lambda_n = \min_{\|x\|=1} x^T A x$  (magnitude of the smallest stretch).

(In fact, $\lambda_2 = \max_{\|x\|=1,\; x \perp v_1} x^T A x$, where $v_1$ is the eigenvector of the largest eigenvalue $\lambda_1$, and $\lambda_n$ is the smallest eigenvalue.) A numeric check of this fact appears at the end of these notes.

Proof: Notice that maximizing over unit vectors is the same as maximizing the Rayleigh quotient over all nonzero vectors:

    $\max_{\|x\|=1} x^T A x = \max_{x \ne 0} \frac{x^T A x}{x^T x}$.

Let $F(x) = \frac{x^T A x}{x^T x}$, $x \ne 0$. We want to find the maximum of F(x). F has an extremum at x iff

    $\frac{\partial F}{\partial x_i}(x) = 0$ for $i = 1, \dots, n$.

Expanding the quadratic form,

    $x^T A x = \sum_{i=1}^{n} x_i^2 a_{ii} + 2 \sum_{i<j} x_i x_j a_{ij}$,

so

    $\frac{\partial (x^T A x)}{\partial x_i} = 2 x_i a_{ii} + 2 \sum_{j \ne i} x_j a_{ij}$ (since A is symmetric) $= 2 \sum_{j=1}^{n} a_{ij} x_j = 2\, A(i,{:})\, x$,

and

    $\frac{\partial (x^T x)}{\partial x_i} = 2 x_i$.

Substituting into the quotient rule and setting the derivative to zero,

    $\frac{\partial F}{\partial x_i} = \frac{2\, A(i,{:})\, x \cdot (x^T x) - (x^T A x) \cdot 2 x_i}{(x^T x)^2} = 0$

    $\Rightarrow\; A(i,{:})\, x = \frac{x^T A x}{x^T x}\, x_i$ for every i

    $\Rightarrow\; A x = \frac{x^T A x}{x^T x}\, x = F(x)\, x$,

so at an extremum x is an eigenvector of A with eigenvalue F(x). In particular, $\max_{x \ne 0} \frac{x^T A x}{x^T x}$ is an eigenvalue of A.

The other way round: if x is an eigenvector of A with eigenvalue $\lambda$, then $Ax = \lambda x$, so $x^T A x = \lambda\, x^T x$ and

    $\max_{y \ne 0} \frac{y^T A y}{y^T y} \;\ge\; \frac{x^T A x}{x^T x} = \lambda$.

Hence the maximum of the Rayleigh quotient is exactly the largest eigenvalue $\lambda_1$.
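A minimal sketch of Principal Component Analysis in Python/NumPy, illustrating linear feature extraction as described above: center the data, then project it onto the eigenvectors of its covariance matrix with the largest eigenvalues. The data matrix X and the choice k = 2 are made-up placeholders, not part of the course material.

    import numpy as np

    def pca(X, k):
        """Project the rows of X onto the k directions of largest variance."""
        X_centered = X - X.mean(axis=0)          # remove the mean of each attribute
        cov = np.cov(X_centered, rowvar=False)   # covariance matrix (symmetric)
        eigvals, eigvecs = np.linalg.eigh(cov)   # eigh is meant for symmetric matrices
        order = np.argsort(eigvals)[::-1]        # sort eigenvalues in descending order
        components = eigvecs[:, order[:k]]       # top-k eigenvectors as columns
        return X_centered @ components           # coordinates in the reduced basis

    # Hypothetical usage: 100 objects described by 5 attributes, reduced to 2.
    X = np.random.rand(100, 5)
    print(pca(X, k=2).shape)                     # (100, 2)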
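A small numeric example of the similarity measures above, using two made-up vectors x and y:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([4.0, 5.0, 6.0])

    dot = x @ y                          # x^T y = sum_i x_i y_i
    norm_x = np.sqrt(x @ x)              # ||x|| = sqrt(x^T x)
    norm_y = np.sqrt(y @ y)

    cos_theta = dot / (norm_x * norm_y)  # cosine of the angle between x and y
    proj_len = dot / norm_y              # length of the projection of x on y

    print(dot, cos_theta, proj_len)      # 32.0  ~0.9746  ~3.6467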
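And a numeric check of the fact proved above: for a symmetric A, the quadratic form $x^T A x$ over unit vectors peaks at the top eigenvector and equals the largest eigenvalue. The random 4x4 matrix is a made-up test case.

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((4, 4))
    A = (B + B.T) / 2                      # symmetrize so the fact applies

    eigvals, eigvecs = np.linalg.eigh(A)   # eigenvalues returned in ascending order
    lam_1, v_1 = eigvals[-1], eigvecs[:, -1]

    print(v_1 @ A @ v_1, lam_1)            # Rayleigh quotient at v_1 equals lambda_1

    # Random unit vectors never exceed lambda_1.
    for _ in range(1000):
        x = rng.standard_normal(4)
        x /= np.linalg.norm(x)
        assert x @ A @ x <= lam_1 + 1e-9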