The multivariate gaussian distribution

Outline:
- Covariance matrices
- The multivariate gaussian distribution
- Gaussian random vectors
- Gaussian characteristic functions
- Eigenvalues of the covariance matrix
- Uncorrelation and independence
- Linear combinations
- The multivariate gaussian density

October 3, 2013
Covariance matrices

The covariance matrix K_X of an n-dimensional random variable

    X = (X_1, X_2, \ldots, X_n)^t

is the square n \times n matrix defined by

    K_X = E\big[(X - m_X)(X - m_X)^t\big]
        = \begin{pmatrix}
            k_{11} & k_{12} & \cdots & k_{1n} \\
            k_{21} & k_{22} & \cdots & k_{2n} \\
            \cdots & \cdots & \cdots & \cdots \\
            k_{n1} & k_{n2} & \cdots & k_{nn}
          \end{pmatrix}

- m_X is the expectation vector

      m_X = E(X) = (m_{X_1}, m_{X_2}, \ldots, m_{X_n})^t

- For i \neq j,

      k_{ij} = E\big[(X_i - m_{X_i})(X_j - m_{X_j})\big] = Cov(X_i, X_j)

- The diagonal entries of K_X are

      k_{ii} = E\big[(X_i - m_{X_i})^2\big] = Var(X_i)
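As a quick numerical illustration (not part of the original slides), the following minimal numpy sketch estimates K_X from simulated samples using exactly the definition above and compares it with numpy's built-in estimator. The mean vector and covariance matrix are made-up example values.

```python
import numpy as np

# Hypothetical example parameters (assumed, not from the slides)
m = np.array([1.0, -2.0, 0.5])
K = np.array([[2.0, 0.6, 0.2],
              [0.6, 1.0, 0.3],
              [0.2, 0.3, 1.5]])

rng = np.random.default_rng(0)
X = rng.multivariate_normal(m, K, size=200_000)   # rows are samples of X

# Covariance as defined above: K_X = E[(X - m_X)(X - m_X)^t]
D = X - X.mean(axis=0)
K_hat = D.T @ D / len(X)

print(np.round(K_hat, 2))                                    # close to K
print(np.allclose(K_hat, np.cov(X, rowvar=False), atol=1e-2))
```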
Covariance matrices

The covariance matrix K_X is:

- symmetric,

      k_{ij} = Cov(X_i, X_j) = Cov(X_j, X_i) = k_{ji}

- positive-semidefinite. That is, for all z = (z_1, z_2, \ldots, z_n)^t \in R^n,

      z^t K_X z \geq 0

Let us say that the random variables

    X_1 - m_{X_1}, X_2 - m_{X_2}, \ldots, X_n - m_{X_n}

are linearly independent if

    \sum_{i=1}^{n} z_i (X_i - m_{X_i}) = 0 with probability 1

implies

    z_1 = z_2 = \cdots = z_n = 0
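A short check of these two properties (a sketch, reusing the example matrix assumed above): symmetry is tested directly and positive-semidefiniteness via the eigenvalues.

```python
import numpy as np

K = np.array([[2.0, 0.6, 0.2],
              [0.6, 1.0, 0.3],
              [0.2, 0.3, 1.5]])            # example covariance matrix (assumed)

print(np.allclose(K, K.T))                  # symmetric
print(np.all(np.linalg.eigvalsh(K) >= 0))   # positive-semidefinite

# z^t K z >= 0 for a few random vectors z
rng = np.random.default_rng(1)
for z in rng.standard_normal((5, 3)):
    assert z @ K @ z >= 0
```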
Covariance matrices

Theorem
The random variables

    X_1 - m_{X_1}, X_2 - m_{X_2}, \ldots, X_n - m_{X_n}

are linearly independent if and only if K_X is positive-definite; that is, if and only if

    z^t K_X z > 0 for all z \neq 0.

Proof: Let

    Y = z_1 X_1 + \cdots + z_n X_n = z^t X

Notice that

    m_Y = \sum_{i=1}^{n} z_i m_{X_i} = z^t m_X

Therefore

    Y - m_Y = z^t (X - m_X)
Covariance matrices

Moreover,

    z^t K_X z = z^t E\big[(X - m_X)(X - m_X)^t\big] z
              = E\big[ z^t (X - m_X)(X - m_X)^t z \big]
              = E\big[ (Y - m_Y)(Y - m_Y) \big]
              = E\big[ (Y - m_Y)^2 \big] = \sigma_Y^2 \geq 0

We have

    z^t K_X z = 0 for some z \neq 0
        \iff \sigma_Y^2 = 0 for some z \neq 0
        \iff Y - m_Y = 0 with probability 1, for some z \neq 0
        \iff \sum_{i=1}^{n} z_i (X_i - m_{X_i}) = 0 with probability 1,
             for some z = (z_1, z_2, \ldots, z_n)^t \neq 0
        \iff X_1 - m_{X_1}, X_2 - m_{X_2}, \ldots, X_n - m_{X_n} are linearly dependent
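The key identity in this proof, z^t K_X z = Var(z^t X), is easy to see numerically. A minimal sketch (example values assumed):

```python
import numpy as np

m = np.array([1.0, -2.0, 0.5])
K = np.array([[2.0, 0.6, 0.2],
              [0.6, 1.0, 0.3],
              [0.2, 0.3, 1.5]])

rng = np.random.default_rng(2)
X = rng.multivariate_normal(m, K, size=500_000)

z = np.array([0.7, -1.2, 0.4])   # arbitrary coefficient vector
Y = X @ z                        # Y = z^t X, one value per sample

print(z @ K @ z)   # quadratic form z^t K_X z
print(Y.var())     # empirical Var(Y): should be close
```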
Linear transformations

Theorem
Let X = (X_1, X_2, \ldots, X_n)^t be an n-dimensional random variable, let A be an m \times n real matrix, and let Y = (Y_1, Y_2, \ldots, Y_m)^t be the m-dimensional random variable defined by

    Y = AX.

Then

    m_Y = A m_X,    K_Y = A K_X A^t

Proof: We have

    m_Y = E(Y) = E(AX) = A E(X) = A m_X,

and

    K_Y = E\big[(Y - m_Y)(Y - m_Y)^t\big]
        = E\big[ A (X - m_X)(X - m_X)^t A^t \big]
        = A \, E\big[(X - m_X)(X - m_X)^t\big] \, A^t = A K_X A^t
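A sampling sketch of this theorem (not from the slides; the matrices are assumed example values): transform samples of X by a 2x3 matrix A and compare the empirical mean and covariance of Y = AX with A m_X and A K_X A^t.

```python
import numpy as np

m_X = np.array([1.0, -2.0, 0.5])
K_X = np.array([[2.0, 0.6, 0.2],
                [0.6, 1.0, 0.3],
                [0.2, 0.3, 1.5]])
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])     # 2x3 matrix (m = 2, n = 3)

rng = np.random.default_rng(3)
X = rng.multivariate_normal(m_X, K_X, size=500_000)
Y = X @ A.T                          # Y = AX, applied row-wise

print(np.round(Y.mean(axis=0), 3), np.round(A @ m_X, 3))   # m_Y vs A m_X
print(np.round(np.cov(Y, rowvar=False), 2))
print(np.round(A @ K_X @ A.T, 2))                          # K_Y = A K_X A^t
```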
Gaussian characteristic functions

Let X_1, X_2, \ldots, X_n be independent gaussian random variables. Their joint characteristic function is

    M_X(\omega_1, \omega_2, \ldots, \omega_n) = M_{X_1}(\omega_1) M_{X_2}(\omega_2) \cdots M_{X_n}(\omega_n)
        = \prod_{i=1}^{n} \exp\Big( i\omega_i m_{X_i} - \tfrac{1}{2}\sigma_{X_i}^2 \omega_i^2 \Big)
        = \exp\left( \sum_{i=1}^{n} \Big( i\omega_i m_{X_i} - \tfrac{1}{2}\sigma_{X_i}^2 \omega_i^2 \Big) \right)
        = \exp\Big( i\omega^t m_X - \tfrac{1}{2}\omega^t K_X \omega \Big)

where

- \omega = (\omega_1, \omega_2, \ldots, \omega_n)^t
- m_X = (m_{X_1}, \ldots, m_{X_n})^t is the expectation vector
- K_X = \mathrm{diag}(\sigma_{X_1}^2, \sigma_{X_2}^2, \ldots, \sigma_{X_n}^2) is the covariance matrix
- K_X is diagonal because the random variables are independent and, hence, pairwise uncorrelated.
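The closed form above can be compared against a direct Monte Carlo estimate of E[exp(i \omega^t X)]. A minimal sketch with assumed example parameters (independent components, so K is diagonal):

```python
import numpy as np

# Independent components: diagonal covariance (example values assumed)
m = np.array([1.0, -2.0, 0.5])
sig = np.array([1.5, 0.8, 1.2])
K = np.diag(sig**2)

rng = np.random.default_rng(4)
X = m + sig * rng.standard_normal((500_000, 3))

w = np.array([0.3, -0.7, 0.2])   # a frequency vector omega

closed_form = np.exp(1j * w @ m - 0.5 * w @ K @ w)
monte_carlo = np.mean(np.exp(1j * (X @ w)))

print(closed_form)
print(monte_carlo)   # should agree to a few decimals
```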
Gaussian random vectors

Definition
If a random vector X has characteristic function

    M_X(\omega_1, \omega_2, \ldots, \omega_n) = \exp\Big( i\omega^t m - \tfrac{1}{2}\omega^t K \omega \Big),

where \omega^t = (\omega_1, \omega_2, \ldots, \omega_n), m is a column n \times 1 vector, and K is a square positive-semidefinite n \times n matrix, we say that X is an n-dimensional gaussian random vector.

We also say that X_1, X_2, \ldots, X_n are jointly gaussian random variables.

Marginal distributions

If m = (m_j) and K = (k_{ij}) we have

    M_{X_j}(\omega_j) = M_X(0, \ldots, \omega_j, \ldots, 0) = \exp\Big( i m_j \omega_j - \tfrac{1}{2} k_{jj} \omega_j^2 \Big)

- Each component X_j is a 1-dimensional gaussian random variable with parameters

      m_{X_j} = m_j,    \sigma_j^2 = k_{jj}
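A short empirical check of the marginal parameters (a sketch with the same assumed example mean and covariance as before): the j-th component of the samples should have mean m_j and variance k_jj.

```python
import numpy as np

m = np.array([1.0, -2.0, 0.5])
K = np.array([[2.0, 0.6, 0.2],
              [0.6, 1.0, 0.3],
              [0.2, 0.3, 1.5]])

rng = np.random.default_rng(5)
X = rng.multivariate_normal(m, K, size=500_000)

j = 1                                # look at component X_j
print(X[:, j].mean(), m[j])          # m_{X_j} = m_j
print(X[:, j].var(),  K[j, j])       # sigma_j^2 = k_jj
```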
Marginal distributions

Moreover, the second-order joint moment m_{11; X_r X_s} = E(X_r X_s) satisfies

    m_{11; X_r X_s} = \frac{1}{i^2} \left. \frac{\partial^2 M_X(\omega_1, \omega_2, \ldots, \omega_n)}{\partial \omega_r \, \partial \omega_s} \right|_{(0,0,\ldots,0)} = k_{rs} + m_r m_s

Therefore

- Cov(X_r, X_s) = k_{rs}
- K is the covariance matrix of X.

Eigenvalues of the covariance matrix

The matrix K_X is symmetric. Thus, it can be transformed into a diagonal matrix by means of an orthogonal transformation.

- There exists an orthogonal matrix C such that

      C K_X C^t = D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)

- The numbers \lambda_i \in R are the eigenvalues of K_X.
- Equivalently, K_X = C^t D C
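The orthogonal diagonalization is what numpy's symmetric eigendecomposition computes. A minimal sketch (example K assumed); note that np.linalg.eigh returns eigenvectors as columns, so C here is their transpose.

```python
import numpy as np

K = np.array([[2.0, 0.6, 0.2],
              [0.6, 1.0, 0.3],
              [0.2, 0.3, 1.5]])

lam, V = np.linalg.eigh(K)   # eigenvalues and orthonormal eigenvectors (columns of V)
C = V.T                      # rows of C are eigenvectors, so C is orthogonal
D = np.diag(lam)

print(np.allclose(C @ K @ C.T, D))   # C K_X C^t = D
print(np.allclose(C.T @ D @ C, K))   # K_X = C^t D C
print(lam)                           # eigenvalues, all nonnegative here
```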
Eigenvalues of the covariance matrix

Hence,

    0 \leq z^t K_X z = z^t C^t D C z = (Cz)^t D (Cz) = y^t D y = \sum_{i=1}^{n} \lambda_i y_i^2,

where y = Cz. Therefore,

- All the eigenvalues are nonnegative: \lambda_i \geq 0, i = 1, 2, \ldots, n

Linear independence of the components

- K_X is a positive-definite matrix if and only if \lambda_i > 0 for all i. In that case, the random variables X_i - m_{X_i}, 1 \leq i \leq n, are linearly independent.
- This condition is equivalent to det(K_X) \neq 0. Only in this case does a density f_X(x_1, x_2, \ldots, x_n) exist.
- But gaussian random vectors are defined even when K_X is not invertible. (That is, the X_i - m_{X_i}, 1 \leq i \leq n, need not all be linearly independent.)
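A small example of the degenerate case (a sketch; the matrix below is an assumed rank-deficient covariance, as would arise if X_2 = 2 X_1):

```python
import numpy as np

# Rank-deficient covariance: X_2 - m_{X_2} is a multiple of X_1 - m_{X_1}
K = np.array([[1.0, 2.0],
              [2.0, 4.0]])

print(np.linalg.eigvalsh(K))   # one eigenvalue is 0
print(np.linalg.det(K))        # det(K) = 0: no density f_X exists,
                               # but the gaussian vector is still well defined
```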
Uncorrelation and independence

Theorem
If the random variables X_1, X_2, \ldots, X_n are jointly gaussian and pairwise uncorrelated, then they are jointly independent.

Proof:

    Cov(X_i, X_j) = 0 for i \neq j \implies K_X = \mathrm{diag}(\sigma_{X_1}^2, \sigma_{X_2}^2, \ldots, \sigma_{X_n}^2)

Therefore

    M_X(\omega_1, \omega_2, \ldots, \omega_n) = \exp\Big( i\omega^t m_X - \tfrac{1}{2}\omega^t K_X \omega \Big)
        = \exp\left( \sum_{k=1}^{n} \Big( i\omega_k m_{X_k} - \tfrac{1}{2}\sigma_{X_k}^2 \omega_k^2 \Big) \right)
        = \prod_{k=1}^{n} \exp\Big( i\omega_k m_{X_k} - \tfrac{1}{2}\sigma_{X_k}^2 \omega_k^2 \Big)
        = M_{X_1}(\omega_1) M_{X_2}(\omega_2) \cdots M_{X_n}(\omega_n)
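A Monte Carlo illustration consistent with this theorem (not part of the slides; parameters are assumed): for jointly gaussian, uncorrelated components, joint probabilities factor into products of marginal probabilities.

```python
import numpy as np

# Jointly gaussian with diagonal covariance: uncorrelated, hence independent
m = np.array([0.0, 1.0])
K = np.diag([1.0, 4.0])

rng = np.random.default_rng(6)
X = rng.multivariate_normal(m, K, size=1_000_000)

p_joint = np.mean((X[:, 0] > 0) & (X[:, 1] > 0))
p_prod  = np.mean(X[:, 0] > 0) * np.mean(X[:, 1] > 0)
print(p_joint, p_prod)   # nearly equal, as independence requires
```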
Linear combinations

Theorem
Let X be an n-dimensional gaussian random variable, let A be an m \times n real matrix, and let

    Y = AX.

Then Y is an m-dimensional gaussian random variable with m_Y = A m_X and K_Y = A K_X A^t.

- If m \leq n, A has full rank m, and X has a probability density f_X(x_1, \ldots, x_n), then the random vector Y also has a density f_Y(y_1, \ldots, y_m).

Proof: We only need to prove that Y is gaussian.

    M_Y(\omega_1, \omega_2, \ldots, \omega_m) = E\big( e^{i\omega^t Y} \big) = E\big( e^{i\omega^t AX} \big) = M_X(\omega^t A)
        = \exp\Big( i (\omega^t A) m_X - \tfrac{1}{2} (\omega^t A) K_X (\omega^t A)^t \Big)
        = \exp\Big( i \omega^t (A m_X) - \tfrac{1}{2} \omega^t (A K_X A^t) \omega \Big)
        = \exp\Big( i \omega^t m_Y - \tfrac{1}{2} \omega^t K_Y \omega \Big)
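A short numerical look at the full-rank condition in the bullet above (a sketch with assumed matrices): when A has full rank m \leq n and K_X is positive-definite, K_Y = A K_X A^t is invertible, so Y has a density.

```python
import numpy as np

K_X = np.array([[2.0, 0.6, 0.2],
                [0.6, 1.0, 0.3],
                [0.2, 0.3, 1.5]])
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])     # full rank, m = 2 <= n = 3

K_Y = A @ K_X @ A.T
print(np.linalg.matrix_rank(A))      # 2
print(np.linalg.det(K_Y))            # > 0, so Y = AX has a density f_Y
```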
Linear combinations

Theorem
The n-dimensional random variable X = (X_1, \ldots, X_n)^t is gaussian if and only if the 1-dimensional random variable

    Y = a_1 X_1 + \cdots + a_n X_n = a^t X

is gaussian for all a = (a_1, a_2, \ldots, a_n)^t \in R^n.

Proof: If X is gaussian, so is Y.
Conversely, suppose that Y = \omega^t X is gaussian for all \omega. Since

    m_Y = \omega^t m_X,    \sigma_Y^2 = \omega^t K_X \omega,

we have

    M_X(\omega) = E\big( e^{i\omega^t X} \big) = E\big( e^{iY} \big) = M_Y(1)
        = \exp\Big( i m_Y - \tfrac{1}{2}\sigma_Y^2 \Big)
        = \exp\Big( i\omega^t m_X - \tfrac{1}{2}\omega^t K_X \omega \Big)
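A sketch of the forward direction (assumed example values): the linear combination a^t X of jointly gaussian components should behave like a one-dimensional gaussian with mean a^t m and variance a^t K a, which we check against the normal CDF at a few points.

```python
import numpy as np
from scipy.stats import norm

m = np.array([1.0, -2.0, 0.5])
K = np.array([[2.0, 0.6, 0.2],
              [0.6, 1.0, 0.3],
              [0.2, 0.3, 1.5]])

rng = np.random.default_rng(7)
X = rng.multivariate_normal(m, K, size=500_000)

a = np.array([0.5, 1.0, -2.0])
Y = X @ a                              # Y = a^t X
mu, var = a @ m, a @ K @ a             # m_Y and sigma_Y^2

# Compare the empirical CDF of Y with N(mu, var) at a few points
for t in (mu - 1.0, mu, mu + 2.0):
    print(np.mean(Y <= t), norm.cdf(t, loc=mu, scale=np.sqrt(var)))
```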
The multivariate gaussian density

Let X_1, X_2, \ldots, X_n be independent gaussian random variables. Then,

    f_X(x_1, x_2, \ldots, x_n) = f_{X_1}(x_1) f_{X_2}(x_2) \cdots f_{X_n}(x_n)
        = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_{X_i}} \, e^{-\frac{1}{2}\left( (x_i - m_{X_i})/\sigma_{X_i} \right)^2}
        = \frac{1}{(2\pi)^{n/2}\, \sigma_{X_1} \sigma_{X_2} \cdots \sigma_{X_n}} \, e^{-\frac{1}{2} \sum_{i=1}^{n} \left( (x_i - m_{X_i})/\sigma_{X_i} \right)^2}
        = \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_X)}} \, e^{-\frac{1}{2} (x - m_X)^t K_X^{-1} (x - m_X)}

Hence,

    f_X(x_1, x_2, \ldots, x_n) = \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_X)}} \exp\Big( -\tfrac{1}{2} (x - m_X)^t K_X^{-1} (x - m_X) \Big)

where x = (x_1, x_2, \ldots, x_n)^t.
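This closed form is straightforward to evaluate directly. A minimal sketch (assumed example parameters) comparing it with scipy's multivariate normal density:

```python
import numpy as np
from scipy.stats import multivariate_normal

m = np.array([1.0, -2.0])
K = np.array([[2.0, 0.6],
              [0.6, 1.0]])

def gaussian_density(x, m, K):
    """Density as above: (2 pi)^(-n/2) det(K)^(-1/2) exp(-1/2 (x-m)^t K^-1 (x-m))."""
    n = len(m)
    d = x - m
    quad = d @ np.linalg.inv(K) @ d
    return np.exp(-0.5 * quad) / ((2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(K)))

x = np.array([0.3, -1.2])
print(gaussian_density(x, m, K))
print(multivariate_normal(mean=m, cov=K).pdf(x))   # same value
```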
The multivariate gaussian density

Now, let us consider a linear transformation

    Y = AX,

where A is a non-singular n \times n matrix.

- The linear system y = Ax has a unique solution x = A^{-1} y.
- The Jacobian determinant is J(x_1, x_2, \ldots, x_n) = \det(A).

Moreover,

- m_Y = A m_X \implies m_X = A^{-1} m_Y
- K_Y = A K_X A^t \implies K_Y^{-1} = (A^t)^{-1} K_X^{-1} A^{-1} = (A^{-1})^t K_X^{-1} A^{-1}
- \det(K_Y) = \det(K_X) \det(A)^2
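The last two identities can be checked numerically. A short sketch (matrices are assumed example values):

```python
import numpy as np

K_X = np.array([[2.0, 0.6],
                [0.6, 1.0]])
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])            # non-singular 2x2 matrix (example)

K_Y = A @ K_X @ A.T
Ainv = np.linalg.inv(A)

print(np.allclose(np.linalg.inv(K_Y), Ainv.T @ np.linalg.inv(K_X) @ Ainv))
print(np.isclose(np.linalg.det(K_Y), np.linalg.det(K_X) * np.linalg.det(A) ** 2))
```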
The multivariate gaussian density

In this way, we have

    f_Y(y_1, y_2, \ldots, y_n) = \left. \frac{f_X(x_1, x_2, \ldots, x_n)}{|J(x_1, x_2, \ldots, x_n)|} \right|_{x = A^{-1} y}

        = \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_X)}\, |\det(A)|} \exp\Big( -\tfrac{1}{2} (A^{-1}y - A^{-1}m_Y)^t K_X^{-1} (A^{-1}y - A^{-1}m_Y) \Big)

        = \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_Y)}} \exp\Big( -\tfrac{1}{2} \big(A^{-1}(y - m_Y)\big)^t K_X^{-1} A^{-1}(y - m_Y) \Big)

        = \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_Y)}} \exp\Big( -\tfrac{1}{2} (y - m_Y)^t (A^{-1})^t K_X^{-1} A^{-1} (y - m_Y) \Big)

        = \frac{1}{(2\pi)^{n/2} \sqrt{\det(K_Y)}} \exp\Big( -\tfrac{1}{2} (y - m_Y)^t K_Y^{-1} (y - m_Y) \Big)

- The above expression is analogous to the one we have in the case of independent random variables.
- But now, the covariance matrix K_Y is not necessarily a diagonal matrix.
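This transformation result is the usual basis for sampling a general gaussian vector: take independent standard normals (K_X = I) and apply a non-singular A with A A^t equal to the target covariance, e.g. a Cholesky factor. The sketch below adds a mean shift m_Y, which the slides do not include, and uses assumed example values.

```python
import numpy as np

# Target mean and covariance (example values)
m_Y = np.array([1.0, -2.0])
K_Y = np.array([[2.0, 0.6],
                [0.6, 1.0]])

A = np.linalg.cholesky(K_Y)              # A A^t = K_Y, A non-singular
rng = np.random.default_rng(8)
X = rng.standard_normal((500_000, 2))    # independent N(0,1) components, K_X = I

Y = X @ A.T + m_Y                        # Y = AX + m_Y: gaussian with covariance A K_X A^t = K_Y

print(np.round(Y.mean(axis=0), 3))
print(np.round(np.cov(Y, rowvar=False), 2))
```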
The multivariate gaussian density

For instance, for n = 2 we obtain:

    f_{XY}(x, y) = \frac{1}{2\pi \sqrt{1 - \rho^2}\, \sigma_X \sigma_Y} \exp\left( -\frac{1}{2(1 - \rho^2)} \, a(x, y) \right),

where

    a(x, y) = \left( \frac{x - m_X}{\sigma_X} \right)^2 - 2\rho \, \frac{x - m_X}{\sigma_X} \cdot \frac{y - m_Y}{\sigma_Y} + \left( \frac{y - m_Y}{\sigma_Y} \right)^2

[Surface plot of f_{XY} for \sigma_X = \sigma_Y, \rho = 0]
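A short check (assumed example parameters) that this bivariate formula matches the general matrix form with covariance entries \sigma_X^2, \sigma_Y^2 and \rho \sigma_X \sigma_Y:

```python
import numpy as np
from scipy.stats import multivariate_normal

m_X, m_Y = 0.0, 1.0
s_X, s_Y, rho = 1.5, 0.8, 0.6          # example parameters

def f_xy(x, y):
    a = ((x - m_X) / s_X) ** 2 \
        - 2 * rho * (x - m_X) / s_X * (y - m_Y) / s_Y \
        + ((y - m_Y) / s_Y) ** 2
    return np.exp(-a / (2 * (1 - rho ** 2))) / (2 * np.pi * np.sqrt(1 - rho ** 2) * s_X * s_Y)

K = np.array([[s_X ** 2,        rho * s_X * s_Y],
              [rho * s_X * s_Y, s_Y ** 2       ]])
mvn = multivariate_normal(mean=[m_X, m_Y], cov=K)

print(f_xy(0.5, 0.2))
print(mvn.pdf([0.5, 0.2]))   # same value
```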
Multidimensional gaussian density

[Surface plot of f_{XY} for \sigma_X = \sigma_Y, \rho = 0.5]

The multivariate gaussian density

[Surface plot of f_{XY} for \sigma_X = \sigma_Y, \rho = 0.9]
The multivariate gaussian density

[Surface plot of f_{XY} for \sigma_X = 2\sigma_Y, \rho = 0.7]

Conditional densities

Let X, Y be jointly gaussian. Then,

    f_{Y|X}(y \mid X = x) = \frac{f_{XY}(x, y)}{f_X(x)}
        = \frac{1}{\sqrt{2\pi}\,\sqrt{1 - \rho^2}\, \sigma_Y} \exp\left( -\frac{1}{2} \left( \frac{y - m_{Y|X}}{\sigma_{Y|X}} \right)^2 \right)

- m_{Y|X} is the expected value of Y given X:

      m_{Y|X} = E(Y \mid X = x) = \rho \frac{\sigma_Y}{\sigma_X} (x - m_X) + m_Y

- \sigma_{Y|X}^2 = (1 - \rho^2)\, \sigma_Y^2.
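A sampling sketch of these conditional formulas (not from the slides; parameters and the conditioning point are assumed): keep only samples with X near x_0 and compare the mean and variance of the retained Y values with m_{Y|X} and \sigma_{Y|X}^2.

```python
import numpy as np

m_X, m_Y = 0.0, 1.0
s_X, s_Y, rho = 1.5, 0.8, 0.6
K = np.array([[s_X ** 2,        rho * s_X * s_Y],
              [rho * s_X * s_Y, s_Y ** 2       ]])

rng = np.random.default_rng(9)
XY = rng.multivariate_normal([m_X, m_Y], K, size=1_000_000)

x0 = 1.0                                       # condition on X close to x0
sel = np.abs(XY[:, 0] - x0) < 0.05
Y_cond = XY[sel, 1]

m_cond = rho * s_Y / s_X * (x0 - m_X) + m_Y    # m_{Y|X}
v_cond = (1 - rho ** 2) * s_Y ** 2             # sigma_{Y|X}^2

print(Y_cond.mean(), m_cond)
print(Y_cond.var(),  v_cond)
```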