Introduction to Normal Distribution
Nathaniel E. Helwig
Assistant Professor of Psychology and Statistics
University of Minnesota (Twin Cities)
Updated 17-Jan-2017
Copyright
© 2017 by Nathaniel E. Helwig
Outline of Notes
1) Univariate Normal:
   Distribution form
   Standard normal
   Probability calculations
   Affine transformations
   Parameter estimation

2) Bivariate Normal:
   Distribution form
   Probability calculations
   Affine transformations
   Conditional distributions

3) Multivariate Normal:
   Distribution form
   Probability calculations
   Affine transformations
   Conditional distributions
   Parameter estimation

4) Sampling Distributions:
   Univariate case
   Multivariate case
Univariate Normal
Univariate Normal
Distribution Form
Normal Density Function (Univariate)
Given a variable x ∈ R, the normal probability density function (pdf) is
f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}    (1)
where
µ ∈ R is the mean
σ > 0 is the standard deviation (σ² is the variance)
e ≈ 2.71828 is the base of the natural logarithm
Write X ∼ N(µ, σ²) to denote that X follows a normal distribution.
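As a quick illustration (not on the original slide), Equation (1) can be coded directly and compared with R's built-in dnorm() function; the values µ = 1 and σ = 2 below are arbitrary.

# Sketch: evaluate Eq. (1) directly and compare with dnorm()
normal_pdf <- function(x, mu, sigma) {
  (1 / (sigma * sqrt(2 * pi))) * exp(-(x - mu)^2 / (2 * sigma^2))
}
x <- seq(-5, 7, by = 0.5)
max(abs(normal_pdf(x, 1, 2) - dnorm(x, mean = 1, sd = 2)))  # essentially zero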
Univariate Normal
Standard Normal
Standard Normal Distribution
If X ∼ N(0, 1), then X follows a standard normal distribution:
f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}    (2)

[Figure: the standard normal pdf f(x) plotted for x from −4 to 4.]
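A one-line R sketch (added, not from the slides) reproduces a figure like the one above:

curve(dnorm(x), from = -4, to = 4, xlab = "x", ylab = "f(x)")  # standard normal pdf on [-4, 4]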
Univariate Normal
Probability Calculations
Probabilities and Distribution Functions
Probabilities relate to the area under the pdf:
P(a ≤ X ≤ b) = \int_a^b f(x)\,dx = F(b) − F(a)    (3)

where

F(x) = \int_{-\infty}^{x} f(u)\,du    (4)

is the cumulative distribution function (cdf).
Note: F(x) = P(X ≤ x)  =⇒  0 ≤ F(x) ≤ 1
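For instance (an added illustration), P(0 ≤ X ≤ 1) for a standard normal can be computed either from the cdf in Equation (3) or by numerically integrating the pdf:

pnorm(1) - pnorm(0)                     # F(1) - F(0), about 0.3413
integrate(dnorm, lower = 0, upper = 1)  # same area by numerical integration of the pdf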
Univariate Normal
Probability Calculations
Normal Probabilities
Helpful figure of normal probabilities:

[Figure: areas under the normal pdf by standard-deviation band: 34.1% between µ and 1σ on each side of the mean, 13.6% between 1σ and 2σ, 2.1% between 2σ and 3σ, and 0.1% beyond 3σ.]

From http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg
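The band areas in the figure can be reproduced with pnorm() (added illustration, using a standard normal so that µ = 0 and σ = 1):

pnorm(1) - pnorm(0)            # mu to mu + 1 sigma: about 0.341
pnorm(2) - pnorm(1)            # 1 sigma to 2 sigma: about 0.136
pnorm(3) - pnorm(2)            # 2 sigma to 3 sigma: about 0.021
pnorm(3, lower.tail = FALSE)   # beyond 3 sigma:     about 0.001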
Univariate Normal
Probability Calculations
Normal Distribution Functions (Univariate)
Helpful figures of normal pdfs and cdfs:
[Figure: normal pdfs φµ,σ²(x) (left) and cdfs Φµ,σ²(x) (right) on x ∈ [−5, 5] for (µ = 0, σ² = 0.2), (µ = 0, σ² = 1.0), (µ = 0, σ² = 5.0), and (µ = −2, σ² = 0.5).]

From http://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg and
http://en.wikipedia.org/wiki/File:Normal_Distribution_CDF.svg
Note that the cdf has an elongated “S” shape, referred to as an ogive.
Univariate Normal
Affine Transformations
Affine Transformations of Normal (Univariate)
Suppose that X ∼ N(µ, σ²) and a, b ∈ R with a ≠ 0.
If we define Y = aX + b, then Y ∼ N(aµ + b, a²σ²).
Suppose that X ∼ N(1, 2). Determine the distributions of. . .
Y = X + 3   =⇒   Y ∼ N(1(1) + 3, 1²(2)) ≡ N(4, 2)
Y = 2X + 3  =⇒   Y ∼ N(2(1) + 3, 2²(2)) ≡ N(5, 8)
Y = 3X + 2  =⇒   Y ∼ N(3(1) + 2, 3²(2)) ≡ N(5, 18)
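A quick simulation check of the rule (added sketch; the sample size and seed are arbitrary):

set.seed(1)
x <- rnorm(1e6, mean = 1, sd = sqrt(2))  # X ~ N(1, 2)
y <- 2 * x + 3                           # Y = 2X + 3 should be approximately N(5, 8)
c(mean(y), var(y))                       # close to 5 and 8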
Univariate Normal
Parameter Estimation
Likelihood Function
Suppose that x = (x1, . . . , xn) is an iid sample of data from a normal distribution with mean µ and variance σ², i.e., xi ∼ N(µ, σ²) iid.
The likelihood function for the parameters (given the data) has the form
L(\mu, \sigma^2 | x) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x_i - \mu)^2}{2\sigma^2} \right\}
and the log-likelihood function is given by
LL(\mu, \sigma^2 | x) = \sum_{i=1}^{n} \log(f(x_i)) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n} (x_i - \mu)^2
Univariate Normal
Parameter Estimation
Maximum Likelihood Estimate of the Mean
The MLE of the mean is the value of µ that minimizes
\sum_{i=1}^{n} (x_i - \mu)^2 = \sum_{i=1}^{n} x_i^2 - 2n\bar{x}\mu + n\mu^2

where x̄ = (1/n)\sum_{i=1}^{n} x_i is the sample mean.
Taking the derivative with respect to µ we find that

\frac{\partial \sum_{i=1}^{n} (x_i - \mu)^2}{\partial \mu} = -2n\bar{x} + 2n\mu

and setting this derivative to zero gives µ̂ = x̄, i.e., the sample mean x̄ is the MLE of the population mean µ.
Univariate Normal
Parameter Estimation
Maximum Likelihood Estimate of the Variance
The MLE of the variance is the value of σ 2 that minimizes
\frac{n}{2}\log(\sigma^2) + \frac{1}{2\sigma^2}\sum_{i=1}^{n} (x_i - \hat{\mu})^2 = \frac{n}{2}\log(\sigma^2) + \frac{\sum_{i=1}^{n} x_i^2}{2\sigma^2} - \frac{n\bar{x}^2}{2\sigma^2}

where x̄ = (1/n)\sum_{i=1}^{n} x_i is the sample mean.
Taking the derivative with respect to σ² we find that

\frac{\partial}{\partial \sigma^2}\left[ \frac{n}{2}\log(\sigma^2) + \frac{1}{2\sigma^2}\sum_{i=1}^{n} (x_i - \hat{\mu})^2 \right] = \frac{n}{2\sigma^2} - \frac{1}{2\sigma^4}\sum_{i=1}^{n} (x_i - \hat{\mu})^2

which (setting the derivative to zero) implies that the sample variance σ̂² = (1/n)\sum_{i=1}^{n} (x_i - x̄)² is the MLE of the population variance σ².
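As an added check (not on the slides), the log-likelihood from Slide 11 can be maximized numerically with optim() and compared with the closed-form MLEs; the simulated data below use arbitrary true values µ = 1 and σ² = 4.

set.seed(1)
x <- rnorm(200, mean = 1, sd = 2)
negLL <- function(par, x) {          # par = c(mu, log(sigma^2)) so the search is unconstrained
  -sum(dnorm(x, mean = par[1], sd = sqrt(exp(par[2])), log = TRUE))
}
fit <- optim(c(0, 0), negLL, x = x)
c(fit$par[1], exp(fit$par[2]))       # numerical MLEs of mu and sigma^2
c(mean(x), mean((x - mean(x))^2))    # closed-form MLEs: xbar and (1/n) * sum((xi - xbar)^2)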
Bivariate Normal
Bivariate Normal
Distribution Form
Normal Density Function (Bivariate)
Given two variables x, y ∈ R, the bivariate normal pdf is
f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} \right] \right\}    (5)
where
µx ∈ R and µy ∈ R are the marginal means
σx ∈ R+ and σy ∈ R+ are the marginal standard deviations
0 ≤ |ρ| < 1 is the correlation coefficient
X and Y are marginally normal: X ∼ N(µx, σx²) and Y ∼ N(µy, σy²)
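As an added sketch, Equation (5) can be coded directly in R; the default parameter values below match the example on the next slide (µx = µy = 0, σx² = 1, σy² = 2, ρ = 0.6/√2).

dbvnorm <- function(x, y, mux = 0, muy = 0, sigx = 1, sigy = sqrt(2), rho = 0.6 / sqrt(2)) {
  q <- (x - mux)^2 / sigx^2 + (y - muy)^2 / sigy^2 -
       2 * rho * (x - mux) * (y - muy) / (sigx * sigy)
  exp(-q / (2 * (1 - rho^2))) / (2 * pi * sigx * sigy * sqrt(1 - rho^2))
}
dbvnorm(0, 0)  # density at the mean, about 0.124 for these parameter values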
Bivariate Normal
Distribution Form
Example: µx = µy = 0, σx² = 1, σy² = 2, ρ = 0.6/√2

[Figure: perspective and contour plots of the bivariate normal pdf f(x, y) for x, y ∈ [−4, 4] with these parameters.]

http://en.wikipedia.org/wiki/File:MultivariateNormal.png
Bivariate Normal
Distribution Form
Example: Different Means
[Figure: bivariate normal pdfs f(x, y) for three different mean vectors: (µx = 0, µy = 0), (µx = −1, µy = −1), and (µx = 1, µy = 2).]

Note: for all three plots σx² = 1, σy² = 2, and ρ = 0.6/√2.
Bivariate Normal
Distribution Form
Example: Different Correlations
[Figure: bivariate normal pdfs f(x, y) for three different correlations: ρ = 0, ρ = −0.6/√2, and ρ = 1.2/√2.]

Note: for all three plots µx = µy = 0, σx² = 1, and σy² = 2.
Bivariate Normal
Distribution Form
Example: Different Variances
[Figure: bivariate normal pdfs f(x, y) for three different values of the marginal standard deviation σy.]

Note: for all three plots µx = µy = 0, σx² = 1, and ρ = 0.6/(σx σy).
Bivariate Normal
Probability Calculations
Probabilities and Multiple Integration
Probabilities still relate to the area under the pdf:
P(a_x ≤ X ≤ b_x and a_y ≤ Y ≤ b_y) = \int_{a_x}^{b_x} \int_{a_y}^{b_y} f(x, y)\, dy\, dx    (6)

where \iint f(x, y)\, dy\, dx denotes the multiple integral of the pdf f(x, y).
Defining z = (x, y), we can still define the cdf:

F(z) = P(X ≤ x and Y ≤ y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\, dv\, du    (7)
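Rectangle probabilities like Equation (6) are usually computed with specialized routines; the sketch below assumes the CRAN package mvtnorm is installed and uses its pmvnorm() function with the parameters from the earlier figures.

library(mvtnorm)   # assumed to be installed: install.packages("mvtnorm")
Sigma <- matrix(c(1, 0.6, 0.6, 2), nrow = 2)  # cov(X, Y) = rho * sigma_x * sigma_y = 0.6
pmvnorm(lower = c(-1, -1), upper = c(1, 1), mean = c(0, 0), sigma = Sigma)
# P(-1 <= X <= 1 and -1 <= Y <= 1)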
Bivariate Normal
Probability Calculations
Normal Distribution Functions (Bivariate)
Helpful figures of bivariate normal pdf and cdf:

[Figure: perspective plots of the bivariate normal pdf f(x, y) (left) and cdf F(x, y) (right) for x, y ∈ [−4, 4].]

Note: µx = µy = 0, σx² = 1, σy² = 2, and ρ = 0.6/√2
Note that the cdf still has an ogive shape (now in two dimensions).
Bivariate Normal
Affine Transformations
Affine Transformations of Normal (Bivariate)
Given z = (x, y)′, suppose that z ∼ N(µ, Σ) where

µ = (µx, µy)′ is the 2 × 1 mean vector

\Sigma = \begin{pmatrix} \sigma_x^2 & \rho\sigma_x\sigma_y \\ \rho\sigma_x\sigma_y & \sigma_y^2 \end{pmatrix} is the 2 × 2 covariance matrix

Let A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} and b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} with A ≠ 0_{2×2} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.

If we define w = Az + b, then w ∼ N(Aµ + b, AΣA′).
Bivariate Normal
Conditional Distributions
Conditional Normal (Bivariate)
The conditional distribution of a variable Y given X = x is
f_{Y|X}(y|X = x) = \frac{f_{XY}(x, y)}{f_X(x)}    (8)
where
fXY (x, y ) is the joint pdf of X and Y
fX (x) is the marginal pdf of X
In the bivariate normal case, we have that
Y|X ∼ N(µ∗, σ∗²)    (9)

where µ∗ = µy + ρ(σy/σx)(x − µx) and σ∗² = σy²(1 − ρ²)
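A Monte Carlo check of Equation (9) (added sketch, using the parameters from the earlier figures and conditioning on X near x = 1):

set.seed(1)
Sigma <- matrix(c(1, 0.6, 0.6, 2), nrow = 2)       # sigma_x^2 = 1, sigma_y^2 = 2, cov = 0.6
z <- matrix(rnorm(2e6), ncol = 2) %*% chol(Sigma)  # rows are draws from N(0, Sigma)
keep <- abs(z[, 1] - 1) < 0.01                     # draws with X approximately equal to 1
c(mean(z[keep, 2]), var(z[keep, 2]))               # empirical conditional mean and variance
# theory: mu* = (0.6/sqrt(2)) * (sqrt(2)/1) * (1 - 0) = 0.6 and sigma*^2 = 2 * (1 - 0.18) = 1.64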
Bivariate Normal
Conditional Distributions
Derivation of Conditional Normal
To prove Equation (9), simply write out the definition and simplify:
f_{Y|X}(y|X = x) = \frac{f_{XY}(x, y)}{f_X(x)}

= \frac{ \exp\left\{ -\frac{1}{2(1-\rho^2)}\left[ \frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} \right] \right\} \Big/ \left( 2\pi\sigma_x\sigma_y\sqrt{1-\rho^2} \right) }{ \exp\left\{ -\frac{(x-\mu_x)^2}{2\sigma_x^2} \right\} \Big/ \left( \sigma_x\sqrt{2\pi} \right) }

= \frac{ \exp\left\{ -\frac{1}{2(1-\rho^2)}\left[ \frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} \right] + \frac{(x-\mu_x)^2}{2\sigma_x^2} \right\} }{ \sqrt{2\pi}\,\sigma_y\sqrt{1-\rho^2} }

= \frac{ \exp\left\{ -\frac{1}{2\sigma_y^2(1-\rho^2)}\left[ \rho^2\frac{\sigma_y^2}{\sigma_x^2}(x-\mu_x)^2 + (y-\mu_y)^2 - 2\rho\frac{\sigma_y}{\sigma_x}(x-\mu_x)(y-\mu_y) \right] \right\} }{ \sqrt{2\pi}\,\sigma_y\sqrt{1-\rho^2} }

= \frac{ \exp\left\{ -\frac{1}{2\sigma_y^2(1-\rho^2)}\left[ y - \mu_y - \rho\frac{\sigma_y}{\sigma_x}(x-\mu_x) \right]^2 \right\} }{ \sqrt{2\pi}\,\sigma_y\sqrt{1-\rho^2} }
which completes the proof.
Bivariate Normal
Conditional Distributions
Statistical Independence for Bivariate Normal
Two variables X and Y are statistically independent if
f_{XY}(x, y) = f_X(x) f_Y(y)    (10)

where f_{XY}(x, y) is the joint pdf, and f_X(x) and f_Y(y) are the marginal pdfs.
Note that if X and Y are independent, then
f_{Y|X}(y|X = x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{f_X(x) f_Y(y)}{f_X(x)} = f_Y(y)    (11)
so conditioning on X = x does not change the distribution of Y .
If X and Y are bivariate normal, what is the necessary and sufficient
condition for X and Y to be independent? Hint: see Equation (9)
Bivariate Normal
Conditional Distributions
Example #1
A statistics class takes two exams X (Exam 1) and Y (Exam 2) where
the scores follow a bivariate normal distribution with parameters:
µx = 70 and µy = 60 are the marginal means
σx = 10 and σy = 15 are the marginal standard deviations
ρ = 0.6 is the correlation coefficient
Suppose we select a student at random. What is the probability that. . .
(a) the student scores over 75 on Exam 2?
(b) the student scores over 75 on Exam 2, given that the student
scored X = 80 on Exam 1?
(c) the sum of his/her Exam 1 and Exam 2 scores is over 150?
(d) the student did better on Exam 1 than Exam 2?
(e) P(5X − 4Y > 150)?
Bivariate Normal
Conditional Distributions
Example #1: Part (a)
Answer for 1(a):
Note that Y ∼ N(60, 15²), so the probability that the student scores
over 75 on Exam 2 is
P(Y > 75) = P(Z > (75 − 60)/15)
          = P(Z > 1)
          = 1 − P(Z < 1)
          = 1 − Φ(1)
          = 1 − 0.8413447
          = 0.1586553

where Φ(x) = \int_{-\infty}^{x} f(z)\,dz with f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} denoting the standard normal pdf (see R code for use of pnorm to calculate this quantity).
Bivariate Normal
Conditional Distributions
Example #1: Part (b)
Answer for 1(b):
Note that (Y|X = 80) ∼ N(µ∗, σ∗²) where
µ∗ = µY + ρ(σY/σX)(x − µX) = 60 + (0.6)(15/10)(80 − 70) = 69
σ∗² = σY²(1 − ρ²) = 15²(1 − 0.6²) = 144
If a student scored X = 80 on Exam 1, the probability that the student scores over 75 on Exam 2 is

P(Y > 75|X = 80) = P(Z > (75 − 69)/12)
                 = P(Z > 0.5)
                 = 1 − Φ(0.5)
                 = 1 − 0.6914625
                 = 0.3085375
Bivariate Normal
Conditional Distributions
Example #1: Part (c)
Answer for 1(c):
Note that (X + Y) ∼ N(µ∗, σ∗²) where
µ∗ = µX + µY = 70 + 60 = 130
σ∗² = σX² + σY² + 2ρσXσY = 10² + 15² + 2(0.6)(10)(15) = 505
The probability that the sum of Exam 1 and Exam 2 is above 150 is

P(X + Y > 150) = P(Z > (150 − 130)/√505)
               = P(Z > 0.8899883)
               = 1 − Φ(0.8899883)
               = 1 − 0.8132639
               = 0.1867361
Bivariate Normal
Conditional Distributions
Example #1: Part (d)
Answer for 1(d):
Note that (X − Y) ∼ N(µ∗, σ∗²) where
µ∗ = µX − µY = 70 − 60 = 10
σ∗² = σX² + σY² − 2ρσXσY = 10² + 15² − 2(0.6)(10)(15) = 145
The probability that the student did better on Exam 1 than Exam 2 is

P(X > Y) = P(X − Y > 0)
         = P(Z > (0 − 10)/√145)
         = P(Z > −0.8304548)
         = 1 − Φ(−0.8304548)
         = 1 − 0.2031408
         = 0.7968592
Bivariate Normal
Conditional Distributions
Example #1: Part (e)
Answer for 1(e):
Note that (5X − 4Y) ∼ N(µ∗, σ∗²) where
µ∗ = 5µX − 4µY = 5(70) − 4(60) = 110
σ∗² = 5²σX² + (−4)²σY² + 2(5)(−4)ρσXσY = 25(10²) + 16(15²) − 2(20)(0.6)(10)(15) = 2500
Thus, the needed probability can be obtained using

P(5X − 4Y > 150) = P(Z > (150 − 110)/√2500)
                 = P(Z > 0.8)
                 = 1 − Φ(0.8)
                 = 1 − 0.7881446
                 = 0.2118554
Bivariate Normal
Conditional Distributions
Example #1: R Code
# Example 1a
> pnorm(1,lower=F)
[1] 0.1586553
> pnorm(75,mean=60,sd=15,lower=F)
[1] 0.1586553

# Example 1b
> pnorm(0.5,lower=F)
[1] 0.3085375
> pnorm(75,mean=69,sd=12,lower=F)
[1] 0.3085375

# Example 1c
> pnorm(20/sqrt(505),lower=F)
[1] 0.1867361
> pnorm(150,mean=130,sd=sqrt(505),lower=F)
[1] 0.1867361

# Example 1d
> pnorm(-10/sqrt(145),lower=F)
[1] 0.7968592
> pnorm(0,mean=10,sd=sqrt(145),lower=F)
[1] 0.7968592

# Example 1e
> pnorm(0.8,lower=F)
[1] 0.2118554
> pnorm(150,mean=110,sd=50,lower=F)
[1] 0.2118554
Multivariate Normal
Multivariate Normal
Distribution Form
Normal Density Function (Multivariate)
Given x = (x1, . . . , xp)′ with xj ∈ R ∀j, the multivariate normal pdf is

f(x) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)' \Sigma^{-1} (x - \mu) \right\}    (12)
where
µ = (µ1, . . . , µp)′ is the p × 1 mean vector

\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{pmatrix} is the p × p covariance matrix

Write x ∼ N(µ, Σ) or x ∼ Np(µ, Σ) to denote x is multivariate normal.
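Equation (12) can be coded with basic matrix operations (added sketch); the µ and Σ below are the candy-bar parameters used later in Example 2.

dmvn <- function(x, mu, Sigma) {
  p <- length(mu)
  q <- t(x - mu) %*% solve(Sigma) %*% (x - mu)  # quadratic form (x - mu)' Sigma^{-1} (x - mu)
  as.numeric(exp(-q / 2) / ((2 * pi)^(p / 2) * sqrt(det(Sigma))))
}
mu <- c(5, 3, 7)
Sigma <- matrix(c(4, -1, 0, -1, 4, 2, 0, 2, 9), nrow = 3)
dmvn(c(5, 3, 7), mu, Sigma)  # density at the mean vector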
Multivariate Normal
Distribution Form
Some Multivariate Normal Properties
The mean and covariance parameters have the following restrictions:
µj ∈ R for all j
σjj > 0 for all j
σij = ρij √(σii σjj) where ρij is the correlation between Xi and Xj
σij² ≤ σii σjj for any i, j ∈ {1, . . . , p} (Cauchy-Schwarz)
Σ is assumed to be positive definite so that Σ−1 exists.
Marginals are normal: Xj ∼ N(µj, σjj) for all j ∈ {1, . . . , p}.
Multivariate Normal
Probability Calculations
Multivariate Normal Probabilities
Probabilities still relate to the area under the pdf:
P(aj ≤ Xj ≤ bj ∀j) = \int_{a_1}^{b_1} \cdots \int_{a_p}^{b_p} f(x)\, dx_p \cdots dx_1    (13)

where \int \cdots \int f(x)\, dx_p \cdots dx_1 denotes the multiple integral of f(x).

We can still define the cdf of x = (x1, . . . , xp)′:

F(x) = P(Xj ≤ xj ∀j) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_p} f(u)\, du_p \cdots du_1    (14)
Multivariate Normal
Affine Transformations
Affine Transformations of Normal (Multivariate)
Suppose that x = (x1, . . . , xp)′ and that x ∼ N(µ, Σ) where
µ = {µj}p×1 is the mean vector
Σ = {σij}p×p is the covariance matrix
Let A = {aij}n×p and b = {bi}n×1 with A ≠ 0n×p.
If we define w = Ax + b, then w ∼ N(Aµ + b, AΣA′).
Note: linear combinations of normal variables are normally distributed.
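A simulation check of this rule (added sketch; µ and Σ are the Example 2 parameters from later slides, while A and b are arbitrary):

set.seed(1)
mu <- c(5, 3, 7)
Sigma <- matrix(c(4, -1, 0, -1, 4, 2, 0, 2, 9), nrow = 3)
A <- matrix(c(1, 0, 2, 1, -1, 3), nrow = 2, byrow = TRUE)             # a 2 x 3 transformation
b <- c(1, -2)
x <- sweep(matrix(rnorm(3e5), ncol = 3) %*% chol(Sigma), 2, mu, "+")  # draws from N(mu, Sigma)
w <- sweep(x %*% t(A), 2, b, "+")                                     # w = Ax + b for each draw
colMeans(w); cov(w)                                                   # compare with the theory:
A %*% mu + b; A %*% Sigma %*% t(A)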
Multivariate Normal
Conditional Distributions
Multivariate Conditional Distributions
Given variables x = (x1, . . . , xp)′ and y = (y1, . . . , yq)′, we have

f_{Y|X}(y|X = x) = \frac{f_{XY}(x, y)}{f_X(x)}    (15)
where
fY |X (y|X = x) is the conditional distribution of y given x
fXY (x, y) is the joint pdf of x and y
fX (x) is the marginal pdf of x
Multivariate Normal
Conditional Distributions
Conditional Normal (Multivariate)
Suppose that z ∼ N(µ, Σ) where
z = (x′, y′)′ = (x1, . . . , xp, y1, . . . , yq)′
µ = (µx′, µy′)′ = (µ1x, . . . , µpx, µ1y, . . . , µqy)′
Note: µx is the mean vector of x, and µy is the mean vector of y

\Sigma = \begin{pmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{xy}' & \Sigma_{yy} \end{pmatrix} where (Σxx)p×p, (Σyy)q×q, and (Σxy)p×q

Note: Σxx is the covariance matrix of x, Σyy is the covariance matrix of y, and Σxy is the covariance matrix of x and y

In the multivariate normal case, we have that

y|x ∼ N(µ∗, Σ∗)    (16)

where \mu_* = \mu_y + \Sigma_{xy}' \Sigma_{xx}^{-1} (x - \mu_x) and \Sigma_* = \Sigma_{yy} - \Sigma_{xy}' \Sigma_{xx}^{-1} \Sigma_{xy}
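Equation (16) is easy to code with matrix operations (added sketch). Here y = X1 and x = (X2, X3), using the candy-bar parameters from Example 2 below and conditioning on X2 = 1 and X3 = 10, the setting of Example 2(b).

mu    <- c(5, 3, 7)
Sigma <- matrix(c(4, -1, 0, -1, 4, 2, 0, 2, 9), nrow = 3)
yi <- 1; xi <- c(2, 3)                       # indices of the y-block and x-block
x0 <- c(1, 10)                               # observed values of (X2, X3)
Sxx <- Sigma[xi, xi]; Sxy <- Sigma[xi, yi]   # Sigma_xx and Sigma_xy
mu_star    <- mu[yi] + t(Sxy) %*% solve(Sxx) %*% (x0 - mu[xi])
Sigma_star <- Sigma[yi, yi] - t(Sxy) %*% solve(Sxx) %*% Sxy
c(mu_star, Sigma_star)                       # 5.75 and 3.71875, matching Example 2(b)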
Multivariate Normal
Conditional Distributions
Statistical Independence for Multivariate Normal
Using Equation (16), we have that
y|x ∼ N(µ∗, Σ∗) ≡ N(µy, Σyy)    (17)

if and only if Σxy = 0p×q (a matrix of zeros).

Note that Σxy = 0p×q implies that the p elements of x are uncorrelated with the q elements of y.

For multivariate normal variables: uncorrelated → independent
For non-normal variables: uncorrelated ↛ independent
Multivariate Normal
Conditional Distributions
Example #2
Each Delicious Candy Company store makes 3 size candy bars:
regular (X1 ), fun size (X2 ), and big size (X3 ).
Assume the weight (in ounces) of the candy bars (X1 , X2 , X3 ) follow a
multivariate normal distribution with parameters:
 


\mu = \begin{pmatrix} 5 \\ 3 \\ 7 \end{pmatrix}  and  \Sigma = \begin{pmatrix} 4 & -1 & 0 \\ -1 & 4 & 2 \\ 0 & 2 & 9 \end{pmatrix}
Suppose we select a store at random. What is the probability that. . .
(a) the weight of a regular candy bar is greater than 8 oz?
(b) the weight of a regular candy bar is greater than 8 oz, given that
the fun size bar weighs 1 oz and the big size bar weighs 10 oz?
(c) P(4X1 − 3X2 + 5X3 < 63)?
Multivariate Normal
Conditional Distributions
Example #2: Part (a)
Answer for 2(a):
Note that X1 ∼ N(5, 4)
So, the probability that the regular bar is more than 8 oz is

P(X1 > 8) = P(Z > (8 − 5)/2)
          = P(Z > 1.5)
          = 1 − Φ(1.5)
          = 1 − 0.9331928
          = 0.0668072
Multivariate Normal
Conditional Distributions
Example #2: Part (b)
Answer for 2(b):
(X1 |X2 = 1, X3 = 10) is normally distributed, see Equation (16).
The conditional mean of (X1 |X2 = 1, X3 = 10) is given by
\mu_* = \mu_{X_1} + \Sigma_{12}' \Sigma_{22}^{-1} (\tilde{x} - \tilde{\mu})
      = 5 + \begin{pmatrix} -1 & 0 \end{pmatrix} \begin{pmatrix} 4 & 2 \\ 2 & 9 \end{pmatrix}^{-1} \begin{pmatrix} 1 - 3 \\ 10 - 7 \end{pmatrix}
      = 5 + \begin{pmatrix} -1 & 0 \end{pmatrix} \frac{1}{32} \begin{pmatrix} 9 & -2 \\ -2 & 4 \end{pmatrix} \begin{pmatrix} -2 \\ 3 \end{pmatrix}
      = 5 + 24/32
      = 5.75
Multivariate Normal
Conditional Distributions
Example #2: Part (b) continued
Answer for 2(b) continued:
The conditional variance of (X1 |X2 = 1, X3 = 10) is given by
\sigma_*^2 = \sigma_{X_1}^2 - \Sigma_{12}' \Sigma_{22}^{-1} \Sigma_{12}
           = 4 - \begin{pmatrix} -1 & 0 \end{pmatrix} \begin{pmatrix} 4 & 2 \\ 2 & 9 \end{pmatrix}^{-1} \begin{pmatrix} -1 \\ 0 \end{pmatrix}
           = 4 - \begin{pmatrix} -1 & 0 \end{pmatrix} \frac{1}{32} \begin{pmatrix} 9 & -2 \\ -2 & 4 \end{pmatrix} \begin{pmatrix} -1 \\ 0 \end{pmatrix}
           = 4 - 9/32
           = 3.71875
Multivariate Normal
Conditional Distributions
Example #2: Part (b) continued
Answer for 2(b) continued:
So, if the fun size bar weighs 1 oz and the big size bar weighs 10 oz,
the probability that the regular bar is more than 8 oz is
P(X1 > 8|X2 = 1, X3 = 10) = P(Z > (8 − 5.75)/√3.71875)
                          = P(Z > 1.166767)
                          = 1 − Φ(1.166767)
                          = 1 − 0.8783477
                          = 0.1216523
Multivariate Normal
Conditional Distributions
Example #2: Part (c)
Answer for 2(c):
(4X1 − 3X2 + 5X3 ) is normally distributed.
The expectation of (4X1 − 3X2 + 5X3 ) is given by
µ∗ = 4µX1 − 3µX2 + 5µX3
= 4(5) − 3(3) + 5(7)
= 46
Multivariate Normal
Conditional Distributions
Example #2: Part (c) continued
Answer for 2(c) continued:
The variance of (4X1 − 3X2 + 5X3 ) is given by
 
\sigma_*^2 = \begin{pmatrix} 4 & -3 & 5 \end{pmatrix} \Sigma \begin{pmatrix} 4 \\ -3 \\ 5 \end{pmatrix}
           = \begin{pmatrix} 4 & -3 & 5 \end{pmatrix} \begin{pmatrix} 4 & -1 & 0 \\ -1 & 4 & 2 \\ 0 & 2 & 9 \end{pmatrix} \begin{pmatrix} 4 \\ -3 \\ 5 \end{pmatrix}
           = \begin{pmatrix} 4 & -3 & 5 \end{pmatrix} \begin{pmatrix} 19 \\ -6 \\ 39 \end{pmatrix}
           = 289
Multivariate Normal
Conditional Distributions
Example #2: Part (c) continued
Answer for 2(c) continued:
So, the needed probability can be obtained as
P(4X1 − 3X2 + 5X3 < 63) = P(Z < (63 − 46)/√289)
                        = P(Z < 1)
                        = Φ(1)
                        = 0.8413447
Multivariate Normal
Conditional Distributions
Example #2: R Code
# Example 2a
> pnorm(1.5,lower=F)
[1] 0.0668072
> pnorm(8,mean=5,sd=2,lower=F)
[1] 0.0668072
# Example 2b
> pnorm(2.25/sqrt(119/32),lower=F)
[1] 0.1216523
> pnorm(8,mean=5.75,sd=sqrt(119/32),lower=F)
[1] 0.1216523
# Example 2c
> pnorm(1)
[1] 0.8413447
> pnorm(63,mean=46,sd=17)
[1] 0.8413447
Multivariate Normal
Parameter Estimation
Likelihood Function
Suppose that xi = (xi1, . . . , xip) is an iid sample from a normal distribution with mean vector µ and covariance matrix Σ, i.e., xi ∼ N(µ, Σ) iid.
The likelihood function for the parameters (given the data) has the form
L(\mu, \Sigma | X) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x_i - \mu)' \Sigma^{-1} (x_i - \mu) \right\}
and the log-likelihood function is given by
LL(\mu, \Sigma | X) = -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log(|\Sigma|) - \frac{1}{2}\sum_{i=1}^{n} (x_i - \mu)' \Sigma^{-1} (x_i - \mu)
Multivariate Normal
Parameter Estimation
Maximum Likelihood Estimate of Mean Vector
The MLE of the mean vector is the value of µ that minimizes
\sum_{i=1}^{n} (x_i - \mu)' \Sigma^{-1} (x_i - \mu) = \sum_{i=1}^{n} x_i' \Sigma^{-1} x_i - 2n\bar{x}' \Sigma^{-1} \mu + n\mu' \Sigma^{-1} \mu

where x̄ = (1/n)\sum_{i=1}^{n} x_i is the sample mean vector.
Taking the derivative with respect to µ we find that

\frac{\partial}{\partial \mu} \sum_{i=1}^{n} (x_i - \mu)' \Sigma^{-1} (x_i - \mu) = -2n\Sigma^{-1}\bar{x} + 2n\Sigma^{-1}\mu \quad\Longleftrightarrow\quad \bar{x} = \hat{\mu}

The sample mean vector x̄ is the MLE of the population mean vector µ.
Multivariate Normal
Parameter Estimation
Maximum Likelihood Estimate of Covariance Matrix
The MLE of the covariance matrix is the value of Σ that minimizes
-n\log(|\Sigma^{-1}|) + \sum_{i=1}^{n} \mathrm{tr}\{\Sigma^{-1}(x_i - \hat{\mu})(x_i - \hat{\mu})'\}

where \hat{\mu} = \bar{x} = (1/n)\sum_{i=1}^{n} x_i is the sample mean.
Taking the derivative with respect to Σ−1 we find that

\frac{\partial}{\partial \Sigma^{-1}} \left[ -n\log(|\Sigma^{-1}|) + \sum_{i=1}^{n} \mathrm{tr}\{\Sigma^{-1}(x_i - \hat{\mu})(x_i - \hat{\mu})'\} \right] = -n\Sigma + \sum_{i=1}^{n} (x_i - \hat{\mu})(x_i - \hat{\mu})'

and setting this derivative to zero shows that the sample covariance matrix \hat{\Sigma} = (1/n)\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})' is the MLE of the population covariance matrix Σ.
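A quick numerical check of these MLEs (added sketch; data are simulated from N(µ, Σ) with the Example 2 parameters and an arbitrary sample size):

set.seed(1)
n <- 500
mu <- c(5, 3, 7)
Sigma <- matrix(c(4, -1, 0, -1, 4, 2, 0, 2, 9), nrow = 3)
X <- sweep(matrix(rnorm(n * 3), ncol = 3) %*% chol(Sigma), 2, mu, "+")
colMeans(X)            # MLE of mu: the sample mean vector
cov(X) * (n - 1) / n   # MLE of Sigma: divide by n rather than n - 1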
Sampling Distributions
Sampling Distributions
Univariate Case
Univariate Sampling Distributions: x̄ and s²
In the univariate normal case, we have that
x̄ = (1/n)\sum_{i=1}^{n} x_i ∼ N(µ, σ²/n)

(n − 1)s² = \sum_{i=1}^{n} (x_i − x̄)² ∼ σ²χ²_{n−1}

χ²_k denotes a chi-square variable with k degrees of freedom.

σ²χ²_k = \sum_{i=1}^{k} z_i² where zi ∼ N(0, σ²) iid
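A simulation sketch of these two results (added; µ = 0, σ² = 4, and n = 10 are arbitrary choices):

set.seed(1)
n <- 10; sigma2 <- 4
xbar <- replicate(1e5, mean(rnorm(n, 0, sqrt(sigma2))))
c(mean(xbar), var(xbar))   # approximately 0 and sigma^2 / n = 0.4
ss <- replicate(1e5, (n - 1) * var(rnorm(n, 0, sqrt(sigma2))))
mean(ss) / sigma2          # approximately n - 1 = 9, the mean of a chi-square with n - 1 df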
Sampling Distributions
Multivariate Case
Multivariate Sampling Distributions: x̄ and S
In the multivariate normal case, we have that
x̄ = (1/n)\sum_{i=1}^{n} x_i ∼ N(µ, Σ/n)

(n − 1)S = \sum_{i=1}^{n} (x_i − x̄)(x_i − x̄)′ ∼ W_{n−1}(Σ)

W_k(Σ) denotes a Wishart variable with k degrees of freedom.

W_k(Σ) = \sum_{i=1}^{k} z_i z_i′ where zi ∼ N(0p, Σ) iid
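A simulation sketch of the Wishart result (added; it compares the average of (n − 1)S across replicates with its theoretical mean (n − 1)Σ, and also draws from stats::rWishart; the parameters are arbitrary):

set.seed(1)
n <- 10
Sigma <- matrix(c(2, 1, 1, 3), nrow = 2)
sim_S <- replicate(1e4, {
  X <- matrix(rnorm(n * 2), ncol = 2) %*% chol(Sigma)  # n draws from N(0, Sigma)
  (n - 1) * cov(X)
})
apply(sim_S, c(1, 2), mean)                             # approximately (n - 1) * Sigma
apply(rWishart(1e4, df = n - 1, Sigma), c(1, 2), mean)  # same target via rWishart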