The multivariate gaussian distribution

October 3, 2013

Outline:
- Covariance matrices
- Gaussian random vectors
- Gaussian characteristic functions
- Eigenvalues of the covariance matrix
- Uncorrelation and independence
- Linear combinations
- The multivariate gaussian density

Covariance matrices

The covariance matrix $K_X$ of an $n$-dimensional random variable $X = (X_1, X_2, \dots, X_n)^t$ is the square $n \times n$ matrix defined by

$$K_X = E\big[(X - m_X)(X - m_X)^t\big] = \begin{pmatrix} k_{11} & k_{12} & \cdots & k_{1n} \\ k_{21} & k_{22} & \cdots & k_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ k_{n1} & k_{n2} & \cdots & k_{nn} \end{pmatrix}$$

- $m_X$ is the expectation vector $m_X = E(X) = (m_{X_1}, m_{X_2}, \dots, m_{X_n})^t$.
- For $i \neq j$, $k_{ij} = E\big[(X_i - m_{X_i})(X_j - m_{X_j})\big] = \mathrm{Cov}(X_i, X_j)$.
- The diagonal entries of $K_X$ are $k_{ii} = E\big[(X_i - m_{X_i})^2\big] = \mathrm{Var}(X_i)$.

The covariance matrix $K_X$ is:

- symmetric: $k_{ij} = \mathrm{Cov}(X_i, X_j) = \mathrm{Cov}(X_j, X_i) = k_{ji}$;
- positive-semidefinite: for all $z = (z_1, z_2, \dots, z_n)^t \in \mathbb{R}^n$, $z^t K_X z \geq 0$.

Let us say that the random variables $X_1 - m_{X_1}, X_2 - m_{X_2}, \dots, X_n - m_{X_n}$ are linearly independent if

$$\sum_{i=1}^{n} z_i (X_i - m_{X_i}) = 0 \ \text{with probability } 1 \implies z_1 = z_2 = \cdots = z_n = 0.$$

Theorem. The random variables $X_1 - m_{X_1}, X_2 - m_{X_2}, \dots, X_n - m_{X_n}$ are linearly independent if and only if $K_X$ is positive-definite; that is, if and only if $z^t K_X z > 0$ for all $z \neq 0$.

Proof: Let $Y = z_1 X_1 + \cdots + z_n X_n = z^t X$. Notice that

$$m_Y = \sum_{i=1}^{n} z_i m_{X_i} = z^t m_X,$$

and therefore $Y - m_Y = z^t (X - m_X)$. Moreover,

$$z^t K_X z = z^t E\big[(X - m_X)(X - m_X)^t\big] z = E\big[z^t (X - m_X)(X - m_X)^t z\big] = E\big[(Y - m_Y)^2\big] = \sigma_Y^2.$$

We have

$z^t K_X z = 0$ for some $z \neq 0$
$\iff \sigma_Y^2 = 0$ for some $z \neq 0$
$\iff Y - m_Y = 0$ with probability 1, for some $z \neq 0$
$\iff \sum_{i=1}^{n} z_i (X_i - m_{X_i}) = 0$ with probability 1, for some $z = (z_1, z_2, \dots, z_n)^t \neq 0$
$\iff X_1 - m_{X_1}, X_2 - m_{X_2}, \dots, X_n - m_{X_n}$ are linearly dependent.

Linear transformations

Theorem. Let $X = (X_1, X_2, \dots, X_n)^t$ be an $n$-dimensional random variable, let $A$ be an $m \times n$ real matrix, and let $Y = (Y_1, Y_2, \dots, Y_m)^t$ be the $m$-dimensional random variable defined by $Y = AX$. Then

$$m_Y = A\, m_X, \qquad K_Y = A K_X A^t.$$

Proof: We have $m_Y = E(Y) = E(AX) = A\,E(X) = A m_X$, and

$$K_Y = E\big[(Y - m_Y)(Y - m_Y)^t\big] = E\big[A (X - m_X)(X - m_X)^t A^t\big] = A\, E\big[(X - m_X)(X - m_X)^t\big]\, A^t = A K_X A^t.$$

Gaussian characteristic functions

Let $X_1, X_2, \dots, X_n$ be independent gaussian random variables. Their joint characteristic function is

$$M_X(\omega_1, \omega_2, \dots, \omega_n) = M_{X_1}(\omega_1) M_{X_2}(\omega_2) \cdots M_{X_n}(\omega_n) = \prod_{i=1}^{n} \exp\Big(i \omega_i m_{X_i} - \tfrac{1}{2} \sigma_{X_i}^2 \omega_i^2\Big) = \exp\bigg(\sum_{i=1}^{n} \Big(i \omega_i m_{X_i} - \tfrac{1}{2} \sigma_{X_i}^2 \omega_i^2\Big)\bigg) = \exp\Big(i \omega^t m_X - \tfrac{1}{2} \omega^t K_X \omega\Big),$$

where

- $\omega = (\omega_1, \omega_2, \dots, \omega_n)^t$;
- $m_X = (m_{X_1}, \dots, m_{X_n})^t$ is the expectation vector;
- $K_X = \mathrm{diag}(\sigma_{X_1}^2, \sigma_{X_2}^2, \dots, \sigma_{X_n}^2)$ is the covariance matrix.

$K_X$ is diagonal because the random variables are independent and, hence, pairwise uncorrelated.

Gaussian random vectors

Definition. If a random vector $X$ has characteristic function

$$M_X(\omega_1, \omega_2, \dots, \omega_n) = \exp\Big(i \omega^t m - \tfrac{1}{2} \omega^t K \omega\Big),$$

where $\omega^t = (\omega_1, \omega_2, \dots, \omega_n)$, $m$ is a column $n \times 1$ vector, and $K$ is a square positive-semidefinite $n \times n$ matrix, we say that $X$ is an $n$-dimensional gaussian random vector. We also say that $X_1, X_2, \dots, X_n$ are jointly gaussian random variables.
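As a quick numerical illustration of the statements above (a minimal sketch, not part of the original slides: it assumes numpy, and all names such as `K_X`, `A`, and `num_samples` are my own), the following estimates a covariance matrix from samples and checks symmetry, positive semidefiniteness, and the transformation rule $K_Y = A K_X A^t$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw samples of a 3-dimensional gaussian vector X with known parameters.
num_samples = 200_000
K_X = np.array([[2.0, 0.6, 0.0],
                [0.6, 1.0, 0.3],
                [0.0, 0.3, 0.5]])
m_X = np.array([1.0, -2.0, 0.5])
X = rng.multivariate_normal(m_X, K_X, size=num_samples)

# Sample estimate of K_X = E[(X - m_X)(X - m_X)^t].
K_hat = np.cov(X, rowvar=False)

# K_X is symmetric and positive-semidefinite (all eigenvalues >= 0).
assert np.allclose(K_hat, K_hat.T)
assert np.all(np.linalg.eigvalsh(K_hat) >= 0)

# A linear transformation Y = AX has covariance K_Y = A K_X A^t.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])    # 2x3 matrix, so Y is 2-dimensional
Y = X @ A.T
K_Y_hat = np.cov(Y, rowvar=False)
print(np.round(K_Y_hat, 2))
print(np.round(A @ K_X @ A.T, 2))   # should agree up to sampling error
```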
Marginal distributions

If $m = (m_j)$ and $K = (k_{ij})$, we have

$$M_{X_j}(\omega_j) = M_X(0, \dots, \omega_j, \dots, 0) = \exp\Big(i m_j \omega_j - \tfrac{1}{2} k_{jj} \omega_j^2\Big).$$

Each component $X_j$ is a 1-dimensional gaussian random variable with parameters $m_{X_j} = m_j$ and $\sigma_j^2 = k_{jj}$.

Moreover, the second-order joint moments are

$$m_{11;\,X_r X_s} = \frac{1}{i^2}\, \frac{\partial^2 M_X(\omega_1, \omega_2, \dots, \omega_n)}{\partial \omega_r\, \partial \omega_s}\bigg|_{(0,0,\dots,0)} = k_{rs} + m_r m_s.$$

Therefore $\mathrm{Cov}(X_r, X_s) = k_{rs}$; equivalently, $K$ is the covariance matrix of $X$.

Eigenvalues of the covariance matrix

The matrix $K_X$ is symmetric. Thus, it can be transformed into a diagonal matrix by means of an orthogonal transformation. Moreover:

- There exists an orthogonal matrix $C$ such that $C K_X C^t = D = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$.
- The numbers $\lambda_i \in \mathbb{R}$ are the eigenvalues of $K_X$.
- Equivalently, $K_X = C^t D C$.

Hence,

$$z^t K_X z = z^t C^t D C z = (Cz)^t D\, (Cz) = y^t D y = \sum_{i=1}^{n} \lambda_i y_i^2, \qquad y = Cz.$$

Therefore, all the eigenvalues are nonnegative: $\lambda_i \geq 0$, $i = 1, 2, \dots, n$.

Linear independence of the components

- $K_X$ is a positive-definite matrix if and only if $\lambda_i > 0$ for all $i$. In that case, the random variables $X_i - m_{X_i}$, $1 \leq i \leq n$, are linearly independent.
- This condition is equivalent to $\det(K_X) \neq 0$. Only in this case does there exist a density $f_X(x_1, x_2, \dots, x_n)$.
- But gaussian random vectors are defined even when $K_X$ is not invertible. (That is, the $X_i - m_{X_i}$, $1 \leq i \leq n$, need not all be linearly independent.)

Uncorrelation and independence

Theorem. If the random variables $X_1, X_2, \dots, X_n$ are jointly gaussian and pairwise uncorrelated, then they are jointly independent.

Proof: $\mathrm{Cov}(X_i, X_j) = 0$ for $i \neq j$ implies $K_X = \mathrm{diag}(\sigma_{X_1}^2, \sigma_{X_2}^2, \dots, \sigma_{X_n}^2)$. Therefore

$$M_X(\omega_1, \omega_2, \dots, \omega_n) = \exp\Big(i \omega^t m_X - \tfrac{1}{2} \omega^t K_X \omega\Big) = \exp\bigg(\sum_{k=1}^{n} \Big(i \omega_k m_{X_k} - \tfrac{1}{2} \sigma_{X_k}^2 \omega_k^2\Big)\bigg) = \prod_{k=1}^{n} \exp\Big(i \omega_k m_{X_k} - \tfrac{1}{2} \sigma_{X_k}^2 \omega_k^2\Big) = M_{X_1}(\omega_1) M_{X_2}(\omega_2) \cdots M_{X_n}(\omega_n).$$

Linear combinations

Theorem. Let $X$ be an $n$-dimensional gaussian random variable, let $A$ be an $m \times n$ real matrix, and let $Y = AX$. Then $Y$ is an $m$-dimensional gaussian random variable with $m_Y = A m_X$ and $K_Y = A K_X A^t$.

If $m \leq n$, $A$ has full rank $m$, and $X$ has a probability density $f_X(x_1, \dots, x_n)$, then the random vector $Y$ also has a density $f_Y(y_1, \dots, y_m)$.

Proof: We only need to prove that $Y$ is gaussian:

$$M_Y(\omega_1, \omega_2, \dots, \omega_m) = E\big(e^{i \omega^t Y}\big) = E\big(e^{i \omega^t A X}\big) = M_X(\omega^t A) = \exp\Big(i (\omega^t A) m_X - \tfrac{1}{2} (\omega^t A) K_X (\omega^t A)^t\Big) = \exp\Big(i \omega^t (A m_X) - \tfrac{1}{2} \omega^t (A K_X A^t)\, \omega\Big) = \exp\Big(i \omega^t m_Y - \tfrac{1}{2} \omega^t K_Y \omega\Big).$$

Theorem. The $n$-dimensional random variable $X = (X_1, \dots, X_n)^t$ is gaussian if and only if the 1-dimensional random variable

$$Y = a_1 X_1 + \cdots + a_n X_n = a^t X$$

is gaussian for all $a = (a_1, a_2, \dots, a_n)^t \in \mathbb{R}^n$.

Proof: If $X$ is gaussian, so is $Y$, by the previous theorem. Conversely, suppose that $Y = \omega^t X$ is gaussian for all $\omega$. Since $m_Y = \omega^t m_X$ and $\sigma_Y^2 = \omega^t K_X \omega$, we have

$$M_X(\omega) = E\big(e^{i \omega^t X}\big) = E\big(e^{i Y}\big) = M_Y(1) = \exp\Big(i m_Y - \tfrac{1}{2} \sigma_Y^2\Big) = \exp\Big(i \omega^t m_X - \tfrac{1}{2} \omega^t K_X \omega\Big).$$

The multivariate gaussian density

Let $X_1, X_2, \dots, X_n$ be independent gaussian random variables. Then

$$f_X(x_1, x_2, \dots, x_n) = f_{X_1}(x_1) f_{X_2}(x_2) \cdots f_{X_n}(x_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\, \sigma_{X_i}}\, e^{-((x_i - m_{X_i})/\sigma_{X_i})^2/2} = \frac{1}{(2\pi)^{n/2}\, \sigma_{X_1} \sigma_{X_2} \cdots \sigma_{X_n}}\, e^{-\frac{1}{2} \sum_{i=1}^{n} ((x_i - m_{X_i})/\sigma_{X_i})^2}.$$

Hence,

$$f_X(x_1, x_2, \dots, x_n) = \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_X)}}\, \exp\Big(-\tfrac{1}{2} (x - m_X)^t K_X^{-1} (x - m_X)\Big), \qquad x = (x_1, x_2, \dots, x_n)^t.$$
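The eigendecomposition $K_X = C^t D C$ above also suggests a standard way to synthesize a gaussian vector with a prescribed covariance: linearly transform independent standard normals, which is exactly the linear-combination theorem at work. A minimal sketch (assuming numpy; the sampling construction and all variable names are my own, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)

K_X = np.array([[2.0, 0.6],
                [0.6, 1.0]])
m_X = np.array([1.0, -1.0])

# Orthogonal diagonalization: eigh returns K_X = C_t @ diag(lam) @ C_t.T,
# so C_t plays the role of C^t in the slides' notation C K_X C^t = D.
lam, C_t = np.linalg.eigh(K_X)
assert np.all(lam >= 0)          # covariance eigenvalues are nonnegative

# det(K_X) != 0 here, so a density exists and we may sample via
# X = m_X + A Z with A = C^t D^{1/2} and Z iid standard normal;
# then K_X = A I A^t = C^t D C by the linear-transformation theorem.
A = C_t @ np.diag(np.sqrt(lam))
Z = rng.standard_normal((100_000, 2))
X = m_X + Z @ A.T

print(np.round(np.cov(X, rowvar=False), 2))  # ~ K_X
```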
Now, let us consider a linear transformation $Y = AX$, where $A$ is a non-singular $n \times n$ matrix.

- The linear system $y = Ax$ has a unique solution $x = A^{-1} y$.
- The Jacobian determinant is $J(x_1, x_2, \dots, x_n) = \det(A)$.

Moreover,

- $m_Y = A m_X \implies m_X = A^{-1} m_Y$;
- $K_Y = A K_X A^t \implies K_Y^{-1} = (A^t)^{-1} K_X^{-1} A^{-1} = (A^{-1})^t K_X^{-1} A^{-1}$;
- $\det(K_Y) = \det(K_X) \det(A)^2$.

In this way, we have

$$f_Y(y_1, y_2, \dots, y_n) = \frac{f_X(x_1, x_2, \dots, x_n)}{|J(x_1, x_2, \dots, x_n)|}\bigg|_{x = A^{-1} y}$$

$$= \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_X)}\, |\det(A)|}\, \exp\Big(-\tfrac{1}{2}\, (A^{-1} y - A^{-1} m_Y)^t\, K_X^{-1}\, (A^{-1} y - A^{-1} m_Y)\Big)$$

$$= \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_Y)}}\, \exp\Big(-\tfrac{1}{2}\, (y - m_Y)^t\, (A^{-1})^t K_X^{-1} A^{-1}\, (y - m_Y)\Big)$$

$$= \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_Y)}}\, \exp\Big(-\tfrac{1}{2}\, (y - m_Y)^t\, K_Y^{-1}\, (y - m_Y)\Big).$$

The above expression is analogous to the one we obtained for independent random variables, but now the covariance matrix $K_Y$ is not necessarily a diagonal matrix.

For instance, for $n = 2$ we obtain

$$f_{XY}(x, y) = \frac{1}{2\pi\, \sigma_X \sigma_Y \sqrt{1 - \rho^2}}\, \exp\Big(-\frac{1}{2} \cdot \frac{1}{1 - \rho^2} \cdot a(x, y)\Big),$$

where

$$a(x, y) = \Big(\frac{x - m_X}{\sigma_X}\Big)^2 - 2\rho\, \frac{x - m_X}{\sigma_X} \cdot \frac{y - m_Y}{\sigma_Y} + \Big(\frac{y - m_Y}{\sigma_Y}\Big)^2.$$

[Figures: surface plots of the bivariate gaussian density for $\sigma_X = \sigma_Y$ with $\rho = 0$, $\rho = 0.5$, $\rho = 0.9$, and for $\sigma_X = 2\sigma_Y$ with $\rho = 0.7$.]

Conditional densities

Let $X$, $Y$ be jointly gaussian. Then

$$f_{Y|X}(y \mid X = x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{1}{\sqrt{2\pi} \sqrt{1 - \rho^2}\, \sigma_Y}\, \exp\bigg(-\frac{1}{2} \Big(\frac{y - m_{Y|X}}{\sigma_{Y|X}}\Big)^2\bigg),$$

where

- $m_{Y|X}$ is the expected value of $Y$ given $X$: $m_{Y|X} = E(Y \mid X = x) = \rho\, \dfrac{\sigma_Y}{\sigma_X} (x - m_X) + m_Y$;
- $\sigma_{Y|X}^2 = (1 - \rho^2)\, \sigma_Y^2$.
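As a sanity check on the $n = 2$ formulas (a hypothetical numpy sketch; the function names and test values are my own), the code below evaluates both the general density and the $\rho$-parametrized form at a point, verifies they agree, and computes the conditional parameters $m_{Y|X}$ and $\sigma_{Y|X}^2$:

```python
import numpy as np

# Bivariate gaussian parameters.
m_X, m_Y = 1.0, -0.5
s_X, s_Y, rho = 2.0, 1.0, 0.7
K = np.array([[s_X**2,          rho * s_X * s_Y],
              [rho * s_X * s_Y, s_Y**2         ]])
m = np.array([m_X, m_Y])

def density_general(v):
    """General formula: (2 pi)^{-n/2} det(K)^{-1/2} exp(-(v-m)^t K^{-1} (v-m) / 2)."""
    n = len(m)
    d = v - m
    q = d @ np.linalg.solve(K, d)
    return np.exp(-0.5 * q) / ((2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(K)))

def density_rho(x, y):
    """The n = 2 formula written with the correlation coefficient rho."""
    u, w = (x - m_X) / s_X, (y - m_Y) / s_Y
    a = u**2 - 2 * rho * u * w + w**2
    return np.exp(-0.5 * a / (1 - rho**2)) / (2 * np.pi * s_X * s_Y * np.sqrt(1 - rho**2))

x, y = 0.3, 1.2
assert np.isclose(density_general(np.array([x, y])), density_rho(x, y))

# Conditional parameters of Y given X = x.
m_Y_given_X = rho * (s_Y / s_X) * (x - m_X) + m_Y
var_Y_given_X = (1 - rho**2) * s_Y**2
print(m_Y_given_X, var_Y_given_X)
```

The two functions agree because, for $n = 2$, $\det(K) = \sigma_X^2 \sigma_Y^2 (1 - \rho^2)$ and the quadratic form $(v - m)^t K^{-1} (v - m)$ expands to exactly $a(x, y)/(1 - \rho^2)$.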