Lecture 5 - IDA.LiU.se

Outline of lecture 5
The multivariate normal distribution
 Characterizing properties of the univariate normal distribution
 Different definitions of normal random vectors
 Conditional distributions
 Independence
 Cochran’s theorem
Probability theory 2008
The univariate normal distribution
- defining properties

A distribution is normal if and only if it has the probability density
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
where $\mu \in \mathbb{R}$ and $\sigma > 0$.

A distribution is normal if and only if the sample mean
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
and the sample variance
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$$
are independent for all n.
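The characterization above can be probed numerically. The following is a minimal simulation sketch, not a proof: it estimates the correlation between the sample mean and the sample variance over many small samples, once for a normal distribution and once for an exponential distribution. The helper names, the sample size n = 5, the number of replicates, and the seed are all assumptions chosen for illustration. Note that zero correlation is only a necessary symptom of independence, so the sketch can reveal dependence but cannot confirm independence.

```python
import random
import statistics


def pearson(xs, ys):
    """Sample Pearson correlation of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def mean_variance_correlation(draw, n=5, replicates=5000, seed=2008):
    """Correlation between sample mean and sample variance over many samples."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(replicates):
        sample = [draw(rng) for _ in range(n)]
        means.append(statistics.fmean(sample))
        variances.append(statistics.variance(sample))  # divisor n - 1
    return pearson(means, variances)


corr_normal = mean_variance_correlation(lambda rng: rng.gauss(0.0, 1.0))
corr_expo = mean_variance_correlation(lambda rng: rng.expovariate(1.0))
print(f"normal: {corr_normal:+.3f}  exponential: {corr_expo:+.3f}")
```

For the normal samples the estimated correlation should be close to zero, whereas for the exponential samples it should be clearly positive.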
The univariate normal distribution
- defining properties
Suppose that X1 and X2 are independent of each other, and that the same is true for the pair
$$\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$$
where no coefficient vanishes. Then all four variables are normal.
Special case: rotations other than multiples of 90 degrees
[Figure: coordinate axes x1 and x2]
The univariate normal distribution
- defining properties
Let F be a class of distributions such that
$$X \in F \Rightarrow a + bX \in F$$
Can F consist of distributions other than the normal distributions?
cf. Cauchy distributions
The multivariate normal distribution
- a first definition
 A random vector is normal if and only if every linear combination of its
components is normal
Immediate consequences:
Every component is normal
The sum of all components is normal
Every marginal distribution is normal
Vectors in which the components are independent normal random
variables are normal
Linear transformations of normal random vectors give rise to new
normal vectors
Illustrations of independent and dependent normal
distributions
http://stat.sm.u-tokai.ac.jp/~yama/graphics/bnormE.html
Parameterization of the multivariate
normal distribution
 Is a multivariate normal distribution uniquely determined by the
vector of expected values and the covariance matrix?
 Is there a multivariate normal distribution for any covariance
matrix?
Fundamental results for covariance matrices
Let Λ be a covariance matrix.
Since Λ is symmetric there exists an orthogonal matrix C
(C’C = CC’ = I) such that
C’ΛC = D and Λ = CDC’
where D is a diagonal matrix.
Since Λ is also nonnegative-definite, there exists a symmetric
matrix B such that
BB = Λ
If X has independent components with variance 1, Y = BX has
covariance matrix Λ
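In general a symmetric square root can be obtained from the diagonalization as B = C D^{1/2} C’, where D^{1/2} holds the square roots of the (nonnegative) diagonal entries of D. As a small illustration that avoids an eigenvalue routine, the sketch below uses a closed form that is valid only in the 2x2 case (it follows from the Cayley-Hamilton theorem). The function names and the example matrix are assumptions, not part of the lecture.

```python
import math


def sym_sqrt_2x2(m):
    """Symmetric square root B of a nonzero 2x2 nonnegative-definite matrix m."""
    (a, b), (_, d) = m
    s = math.sqrt(max(a * d - b * b, 0.0))  # sqrt(det m); det >= 0 for a covariance matrix
    t = math.sqrt(a + d + 2.0 * s)          # sqrt(trace + 2 sqrt(det)); zero only if m = 0
    return [[(a + s) / t, b / t], [b / t, (d + s) / t]]


def matmul(p, q):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]


cov = [[2.0, 1.0], [1.0, 2.0]]  # hypothetical covariance matrix
B = sym_sqrt_2x2(cov)
BB = matmul(B, B)
print(B)   # symmetric matrix
print(BB)  # reproduces cov up to rounding
```

If X has independent components with variance 1, then Cov(BX) = B I B’ = BB, which is why BB = Λ is exactly the property needed.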
The multivariate normal distribution
- a second definition
A random vector is normal if and only if it has a characteristic function of the form
$$\varphi_X(t) = E(e^{i t'X}) = \exp\left(i t'\mu - \frac{t'\Lambda t}{2}\right)$$
where Λ is a nonnegative-definite, symmetric matrix and μ is a vector of constants
Proof of the equivalence of definition I and II:
Let X ∈ N(μ, Λ) according to definition I, and set Z = t’X. Then E(Z) = t’μ and Var(Z) = t’Λt, and φ_Z(1) gives the desired expression.
Let X ∈ N(μ, Λ) according to definition II. Then we can derive the characteristic function of any linear combination of its components and show that it is normally distributed.
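The two directions of the proof can be written out as follows. This is a sketch of the standard argument; the auxiliary scalar s and the fixed vector a are notation introduced here for illustration.

```latex
\begin{align*}
% Definition I => II: Z = t'X is univariate normal with mean t'\mu and variance t'\Lambda t
\varphi_Z(s) &= \exp\left(i s\, t'\mu - \tfrac{1}{2} s^2\, t'\Lambda t\right), \\
\varphi_X(t) &= E\left(e^{i t'X}\right) = E\left(e^{i \cdot 1 \cdot Z}\right) = \varphi_Z(1)
              = \exp\left(i t'\mu - \tfrac{1}{2} t'\Lambda t\right). \\
% Definition II => I: for a fixed vector a, the linear combination a'X satisfies
\varphi_{a'X}(s) &= E\left(e^{i s\, a'X}\right) = \varphi_X(s a)
              = \exp\left(i s\, a'\mu - \tfrac{1}{2} s^2\, a'\Lambda a\right),
\end{align*}
```

The last expression is the characteristic function of a univariate normal distribution with mean a’μ and variance a’Λa, so every linear combination is normal.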
The multivariate normal distribution
- a third definition
 Let Y be normal with independent standard normal components
and set
$$X = \Lambda^{1/2} Y + \mu, \qquad Y = \Lambda^{-1/2}(X - \mu)$$
Then
$$f_X(x) = \left(\frac{1}{2\pi}\right)^{n/2} \frac{1}{\sqrt{\det \Lambda}} \exp\left(-\frac{(x-\mu)'\Lambda^{-1}(x-\mu)}{2}\right)$$
provided that the determinant is non-zero.
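The density follows from the change-of-variables formula. The steps below are a sketch under the assumption that Λ^{1/2} denotes the symmetric square root of Λ, so that Λ^{-1/2} is symmetric as well.

```latex
\begin{align*}
f_Y(y) &= \left(\frac{1}{2\pi}\right)^{n/2} \exp\left(-\tfrac{1}{2}\, y'y\right)
  && \text{(independent standard normal components)} \\
y'y &= (x-\mu)'\Lambda^{-1/2}\Lambda^{-1/2}(x-\mu) = (x-\mu)'\Lambda^{-1}(x-\mu)
  && \text{(substituting } y = \Lambda^{-1/2}(x-\mu)\text{)} \\
\left|\det \frac{\partial y}{\partial x}\right| &= \left|\det \Lambda^{-1/2}\right| = \frac{1}{\sqrt{\det \Lambda}}
  && \text{(Jacobian of the linear map)} \\
f_X(x) &= f_Y\left(\Lambda^{-1/2}(x-\mu)\right) \frac{1}{\sqrt{\det \Lambda}}
\end{align*}
```

The Jacobian step is where the condition det Λ ≠ 0 is needed: a singular Λ has no inverse and the distribution has no density on the whole of R^n.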
The multivariate normal distribution
- a fourth definition
 Let Y be normal with independent standard normal components
and set
$$X = AY + \mu$$
Then X is said to be a normal random vector.
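This definition is also a recipe for simulation, since Cov(AY + μ) = A I A’ = AA’. The sketch below assumes a positive-definite target covariance matrix and picks A as its lower-triangular Cholesky factor. That choice is only one possibility (any A with AA’ equal to the target works), and the function names, the covariance matrix, and the mean vector are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular A with A A' = cov, for a positive-definite matrix cov."""
    n = len(cov)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(a[i][k] * a[j][k] for k in range(j))
            if i == j:
                a[i][j] = math.sqrt(cov[i][i] - partial)
            else:
                a[i][j] = (cov[i][j] - partial) / a[j][j]
    return a


def sample_normal_vector(a, mu, rng):
    """One draw of X = A Y + mu, with Y having independent N(0, 1) components."""
    y = [rng.gauss(0.0, 1.0) for _ in mu]
    return [sum(a[i][k] * y[k] for k in range(len(y))) + mu[i] for i in range(len(mu))]


cov = [[4.0, 2.0], [2.0, 5.0]]  # hypothetical covariance matrix
mu = [1.0, -1.0]                 # hypothetical mean vector
A = cholesky(cov)
x = sample_normal_vector(A, mu, random.Random(0))
print(A)  # lower-triangular factor
print(x)  # one simulated normal random vector
```

Unlike the third definition, this construction does not require an invertible A, so it also covers singular normal distributions.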
The multivariate normal distribution
- conditional distributions
 All conditional distributions in a multivariate normal vector
are normal
 The conditional distribution of each component is equal to that of
a linear combination of the other components plus a random error
The multivariate normal distribution
- conditional distributions and optimal predictors
 For any random vector X it is known that E(Xn | X1, …, Xn-1) is an
optimal predictor of Xn based on X1, …, Xn-1 and that
Xn = E(Xn | X1, …, Xn-1) + ε
where ε is uncorrelated with the conditional expectation.
 For normal random vectors X, the optimal predictor E(Xn | X1, …, Xn-1)
is a linear expression in X1, …, Xn-1
The multivariate normal distribution
- calculation of conditional distributions

Let X ∈ N(0, Λ) where
$$\Lambda = \begin{pmatrix} 1 & 2 & -1 \\ 2 & 6 & 0 \\ -1 & 0 & 4 \end{pmatrix}$$
Determine the conditional distribution of X3 given X1 and X2

Set Z = a X1 + bX2 + c
Minimize the variance of the prediction error Z - X3
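The minimization can be carried out by hand or numerically. Taking Λ to have rows (1, 2, -1), (2, 6, 0), (-1, 0, 4), the prediction-error variance is Var(aX1 + bX2 + c - X3) = a² + 6b² + 4ab + 2a + 4, which does not depend on c; since all means are zero, c = 0 makes the predictor unbiased. The sketch below solves the resulting normal equations; the helper `solve_2x2` and the variable names are assumptions made for illustration.

```python
def solve_2x2(m, rhs):
    """Solve m z = rhs for a 2x2 matrix m by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det,
            (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det]


cov = [[1.0, 2.0, -1.0],
       [2.0, 6.0, 0.0],
       [-1.0, 0.0, 4.0]]

# Setting the derivatives of Var(a X1 + b X2 - X3) with respect to a and b
# to zero gives the normal equations  Cov(X1, X2) (a, b)' = Cov((X1, X2), X3).
cov_12 = [row[:2] for row in cov[:2]]
cov_12_3 = [cov[0][2], cov[1][2]]
a, b = solve_2x2(cov_12, cov_12_3)

# Minimal prediction-error variance: Var(X3) - (a, b) Cov((X1, X2), X3).
residual_var = cov[2][2] - (a * cov_12_3[0] + b * cov_12_3[1])
print(a, b, residual_var)  # prints: -3.0 1.0 1.0
```

Under these assumptions the conditional distribution of X3 given X1 = x1 and X2 = x2 is N(-3x1 + x2, 1).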
The multivariate normal vector
- uncorrelated and independent components
The components of a normal random vector are
independent if and only if they are uncorrelated
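The non-trivial direction (uncorrelated implies independent) can be read off from the characteristic function. In the sketch below, λ_k denotes the k-th diagonal entry of the covariance matrix, a symbol introduced here for illustration.

```latex
\begin{align*}
% Uncorrelated components: the covariance matrix is diagonal
\Lambda &= \operatorname{diag}(\lambda_1, \dots, \lambda_n)
  \quad\Longrightarrow\quad t'\Lambda t = \sum_{k=1}^{n} \lambda_k t_k^2, \\
\varphi_X(t) &= \exp\left(i t'\mu - \tfrac{1}{2}\, t'\Lambda t\right)
  = \prod_{k=1}^{n} \exp\left(i t_k \mu_k - \tfrac{1}{2}\lambda_k t_k^2\right)
  = \prod_{k=1}^{n} \varphi_{X_k}(t_k).
\end{align*}
```

A joint characteristic function that factorizes into the marginal ones characterizes independence. The converse direction holds for any random variables with finite variances, normal or not.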
The multivariate normal distribution
- orthogonal transformations
 Let X be a normal random vector with independent standard
normal components, and let C be an orthogonal matrix.
 Then
Y = CX
has independent, standard normal components
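A short justification, sketched with the characteristic function of a standard normal vector and the orthogonality relation CC’ = I:

```latex
\begin{align*}
\varphi_Y(t) &= E\left(e^{i t'CX}\right) = \varphi_X(C't)
  = \exp\left(-\tfrac{1}{2}\,(C't)'(C't)\right) \\
 &= \exp\left(-\tfrac{1}{2}\, t'CC't\right)
  = \exp\left(-\tfrac{1}{2}\, t't\right) = \varphi_X(t).
\end{align*}
```

Hence Y has the same distribution as X, that is, independent standard normal components. Equivalently, Cov(Y) = C I C’ = I, and uncorrelated components of a normal vector are independent.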
Quadratic forms of the components of a multivariate
normal distribution – one-way analysis of variance
Let Xij, i = 1, …, k, j = 1, …, ni, be k samples of observations.
Then, the total variation in the X-values can be decomposed as follows:
$$\sum_{i,j} X_{ij}^2 = n\bar{X}_{..}^2 + \sum_{i=1}^{k} n_i(\bar{X}_{i.} - \bar{X}_{..})^2 + \sum_{i=1}^{k}\sum_{j=1}^{n_i}(X_{ij} - \bar{X}_{i.})^2$$
$$X'IX = X'A_1X + X'A_2X + X'A_3X$$
The matrices of the quadratic forms are (blank entries are zero)
$$A_1 = \begin{pmatrix} 1/n & \cdots & 1/n \\ \vdots & & \vdots \\ 1/n & \cdots & 1/n \end{pmatrix}$$
$$A_3 = \begin{pmatrix}
1-\frac{1}{n_1} & \cdots & -\frac{1}{n_1} & & & & \\
\vdots & \ddots & \vdots & & & & \\
-\frac{1}{n_1} & \cdots & 1-\frac{1}{n_1} & & & & \\
 & & & \ddots & & & \\
 & & & & 1-\frac{1}{n_k} & \cdots & -\frac{1}{n_k} \\
 & & & & \vdots & \ddots & \vdots \\
 & & & & -\frac{1}{n_k} & \cdots & 1-\frac{1}{n_k}
\end{pmatrix}$$
and A2 = I - A1 - A3, whose entries are 1/ni - 1/n inside the i-th diagonal block and -1/n elsewhere.
Decomposition theorem for nonnegative-definite quadratic forms
Let
$$\sum_{i=1}^{n} x_i^2 = Q_1 + \dots + Q_p$$
where $Q_i = x'A_i x$ are nonnegative-definite quadratic forms and
$$\sum_{i=1}^{p} \operatorname{Rank}(A_i) = \sum_{i=1}^{p} r_i = n$$
Then there exists an orthogonal matrix C such that with x = Cy (y = C’x)
$$Q_1 = y_1^2 + \dots + y_{r_1}^2$$
$$\vdots$$
$$Q_p = y_{n-r_p+1}^2 + \dots + y_n^2$$
Decomposition theorem for nonnegative-definite
quadratic forms (Cochran’s theorem)
Let X1, …, Xn be independent and N(0, σ²) and suppose that
$$\sum_{i=1}^{n} X_i^2 = Q_1 + \dots + Q_p$$
where $Q_i = X'A_i X$ are nonnegative-definite quadratic forms and
$$\sum_{i=1}^{p} \operatorname{Rank}(A_i) = \sum_{i=1}^{p} r_i = n$$
Then there exists an orthogonal matrix C such that with X = CY (Y = C’X)
$$Q_1 = Y_1^2 + \dots + Y_{r_1}^2$$
$$\vdots$$
$$Q_p = Y_{n-r_p+1}^2 + \dots + Y_n^2$$
Furthermore, Q1, …, Qp are independent and σ²χ²-distributed with r1, …, rp
degrees of freedom
Quadratic forms of the components of a multivariate
normal distribution – one-way analysis of variance
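Let Xij, i = 1, …, k, j = 1, …, ni, be independent and N(μ, σ²), and let n = n1 + … + nk. Then,
the total sum of squares $\sum_{i,j} X_{ij}^2$ can be decomposed into three
quadratic forms
$$n\bar{X}_{..}^2 + \sum_{i=1}^{k} n_i(\bar{X}_{i.} - \bar{X}_{..})^2 + \sum_{i=1}^{k}\sum_{j=1}^{n_i}(X_{ij} - \bar{X}_{i.})^2 = Q_1 + Q_2 + Q_3$$
which are independent and σ²χ²-distributed with 1, k-1, and n-k
degrees of freedom (for Q1 this requires μ = 0, in line with the N(0, σ²) assumption of Cochran’s theorem)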
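The algebraic identity behind the decomposition holds for any data, whatever their distribution, and is easy to check numerically. The sketch below uses made-up observations for k = 3 samples of sizes 2, 3 and 4; the data and variable names are hypothetical. It verifies only the sum-of-squares identity, not the distributional statement.

```python
import math

# Hypothetical data: k = 3 samples of sizes 2, 3 and 4 (n = 9)
samples = [[1.0, 3.0], [2.0, 4.0, 6.0], [5.0, 7.0, 8.0, 9.0]]

k = len(samples)
n = sum(len(s) for s in samples)
grand_mean = sum(sum(s) for s in samples) / n
group_means = [sum(s) / len(s) for s in samples]

total = sum(x * x for s in samples for x in s)  # total sum of squares
q1 = n * grand_mean ** 2                        # mean term
q2 = sum(len(s) * (m - grand_mean) ** 2         # between-sample variation
         for s, m in zip(samples, group_means))
q3 = sum((x - m) ** 2                           # within-sample variation
         for s, m in zip(samples, group_means) for x in s)

print(total, q1 + q2 + q3)
print("degrees of freedom:", 1, k - 1, n - k)
```

The degrees of freedom 1, k-1 and n-k add up to n, matching the rank condition in Cochran’s theorem.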
Exercises: Chapter V
5.1, 5.2, 5.6, 5.8, 5.14, 5.16, 5.17, 5.27