COURSE 2T1 (MATH 29531)
SEMESTER 1
NOTES ON NORMAL DISTRIBUTION
The normal distribution is one of the most important continuous probability distributions. This
is because the histogram of many random variables (e.g. weight, length, strength of material) often
closely approximates the normal distribution as the number of measurements increases and the
class intervals become narrower. The earliest published derivation of the normal distribution seems
to be that in a pamphlet of de Moivre dated November 12, 1733. This pamphlet was in Latin; in
1738 de Moivre published an English translation with some additions. In 1774 Laplace obtained
the normal distribution as an approximation to the hypergeometric distribution. The work of Gauss
in 1809 and 1816 established techniques based on the normal distribution, which became standard
methods used during the 19th century.
Definition
Probability Density Function If a random variable X has the pdf
f(x) = (1/(√(2π) σ)) exp{ −(x − µ)²/(2σ²) },   −∞ < x < ∞,   (1)
then it is said to have the Normal distribution with parameters µ (−∞ < µ < ∞) and σ (σ > 0).
A normal distribution with µ = 0 and σ = 1 is called the standard normal distribution. A random
variable having the standard normal distribution is denoted by Z.
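As a quick illustration of (1), the following Python sketch (assuming numpy and scipy are available; the values of µ, σ and x below are arbitrary illustrative choices, not part of these notes) evaluates the density directly and compares it with scipy's built-in normal pdf:

import numpy as np
from scipy.stats import norm

mu, sigma = 2.0, 1.5              # arbitrary illustrative parameters
x = np.array([-1.0, 2.0, 4.5])    # arbitrary evaluation points

# density evaluated directly from equation (1)
pdf_manual = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

# reference values from scipy (parameterised by mean and standard deviation)
pdf_scipy = norm.pdf(x, loc=mu, scale=sigma)

print(pdf_manual)
print(np.allclose(pdf_manual, pdf_scipy))   # expected: True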
Properties
The normal random variable X has the following properties:
• If X has the normal distribution with parameters µ and σ then, for α ≠ 0, Y = αX + β is normally
distributed with parameters αµ + β and |α|σ. In particular, Z = (X − µ)/σ has the standard
normal distribution;
• the normal pdf is a bell-shaped curve that is symmetric about µ and that attains its maximum
value of
1/(√(2π) σ) ≈ 0.399/σ   (2)
at x = µ;
• 68.26% of the total area bounded by the curve lies between µ − σ and µ + σ;
• 95.44% is between µ − 2σ and µ + 2σ;
• 99.73% is between µ − 3σ and µ + 3σ;
• the expected value is:
E(X) = µ;   (3)
• the variance is:
Var(X) = σ²;   (4)
• the standard deviation is:
SD(X) = σ;   (5)
• for all odd r, the rth central moment is:
E[{X − E(X)}^r] = 0;   (6)
• for all even r, the rth central moment is (see the numerical check after this list):
E[{X − E(X)}^r] = σ^r × (r − 1) × (r − 3) × · · · × 3 × 1;   (7)
• the mean deviation is:
E{|X − E(X)|} = σ √(2/π);   (8)
• the coefficient of variation is:
CV(X) = σ/µ;   (9)
• the skewness is:
γ₁(X) = 0;   (10)
• the kurtosis is:
γ₂(X) = 3;   (11)
• the entropy is:
E[−log f(X)] = log(√(2π) σ) + 1/2.   (12)
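As a rough numerical check of the moment properties above, the following Python sketch (assuming numpy is available; the parameters and sample size are arbitrary illustrative choices) simulates from the normal distribution and compares sample quantities with equations (3), (4), (7) and (8):

import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 3.0, 2.0                          # arbitrary illustrative parameters
x = rng.normal(mu, sigma, size=1_000_000)     # simulated normal sample

print(x.mean())                               # ≈ mu, equation (3)
print(x.var())                                # ≈ sigma**2, equation (4)
print(((x - mu) ** 4).mean())                 # ≈ 3 * sigma**4, equation (7) with r = 4
print(np.abs(x - mu).mean())                  # ≈ sigma * sqrt(2/pi), equation (8)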
Cumulative Distribution Function The cdf for the normal random variable X is
F(x) = (1/(√(2π) σ)) ∫_{−∞}^{x} exp{ −(y − µ)²/(2σ²) } dy.   (13)
The cdf for the standard normal random variable Z is usually denoted by
Φ(z) = (1/√(2π)) ∫_{−∞}^{z} exp{ −y²/2 } dy.   (14)
To compute the cdf F for given µ and σ, you will use the result
F(x) = Φ((x − µ)/σ)   (15)
and will read the right-hand side off the attached table for the standard normal distribution. Sometimes you may want to compute probabilities of the form Pr(X > x) or Pr(X < −x) or Pr(X > −x)
or something else. For these you can use the following properties (a worked example is given after the list):
(i) Pr(Z > c) = 1 − Pr(Z < c), c > 0;
(ii) Pr(Z < −c) = 1 − Pr(Z < c), c > 0;
(iii) Pr(Z > −c) = Pr(Z < c), c > 0.
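For instance, with purely illustrative values µ = 100 and σ = 15, suppose you want Pr(X > 130). Using (15) and property (i),
Pr(X > 130) = 1 − Pr(X < 130) = 1 − Φ((130 − 100)/15) = 1 − Φ(2) = 1 − 0.9772 = 0.0228,
where Φ(2) = 0.9772 is read off the attached table.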
Percentiles The 100(1 − α)% percentile of the normal variable X is given by the simple formula:
x_α = µ + σ z_α,   (16)
where z_α is the 100(1 − α)% percentile of the standard normal random variable Z. The values of
z_α can be read off from the same table as above for all α ≤ 0.5. If α > 0.5 then use the fact that
z_α = −z_{1−α}.   (17)
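For example, with the same illustrative values µ = 100 and σ = 15: the 95th percentile corresponds to α = 0.05, the table gives z_{0.05} ≈ 1.645, and so x_{0.05} = 100 + 15 × 1.645 ≈ 124.7. For the 5th percentile, α = 0.95 > 0.5, so by (17) z_{0.95} = −z_{0.05} ≈ −1.645 and x_{0.95} = 100 − 15 × 1.645 ≈ 75.3.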
Estimation
Suppose you have a dataset x_1, x_2, . . . , x_n from a normal distribution with parameters µ and σ.
The estimator for µ, whether or not σ is known, is:
µ̂ = (1/n) Σ_{i=1}^{n} x_i = x̄.   (18)
This estimator is unbiased for µ and
Var(µ̂) = σ²/n.   (19)
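A minimal Python sketch of (18) follows; the data values are hypothetical, invented only for illustration:

import numpy as np

x = np.array([4.2, 5.1, 3.8, 4.9, 5.4, 4.6])   # hypothetical data
mu_hat = x.mean()                               # equation (18): the sample mean
print(mu_hat)
# If sigma were known, the standard deviation of mu_hat would be
# sigma / np.sqrt(len(x)), by equation (19).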
The estimate for σ when µ is unknown is:
σ̂ = √( (1/n) Σ_{i=1}^{n} (x_i − x̄)² ) = S.   (20)
The estimate for σ when µ is known is:
σ̂ = √( S² + (x̄ − µ)² ).   (21)
Neither is an unbiased estimator for σ. In fact,
E(S) = σ √(2/n) Γ(n/2) / Γ((n − 1)/2)   (22)
and
E[ √( S² + (x̄ − µ)² ) ] = σ √(2/n) Γ((n + 1)/2) / Γ(n/2).   (23)
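Continuing with the same hypothetical data, a short Python sketch of the estimates (20) and (21) (with a made-up "known" value of µ) and of the factor appearing in (22):

import numpy as np
from math import gamma, sqrt

x = np.array([4.2, 5.1, 3.8, 4.9, 5.4, 4.6])    # hypothetical data
n = len(x)
x_bar = x.mean()

S = np.sqrt(np.mean((x - x_bar) ** 2))                     # equation (20): divisor n
mu_known = 4.5                                             # hypothetical known mu
sigma_hat = np.sqrt(S ** 2 + (x_bar - mu_known) ** 2)      # equation (21)

# Factor relating E(S) to sigma in equation (22); it is below 1,
# so S underestimates sigma on average.
bias_factor = sqrt(2.0 / n) * gamma(n / 2) / gamma((n - 1) / 2)
print(S, sigma_hat, bias_factor)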
An unbiased estimator for σ² is:
(n/(n − 1)) S² = (1/(n − 1)) Σ_{i=1}^{n} (x_i − x̄)².   (24)
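In Python this is the quantity returned by numpy's variance function with ddof=1 (hypothetical data again):

import numpy as np

x = np.array([4.2, 5.1, 3.8, 4.9, 5.4, 4.6])   # hypothetical data
print(np.var(x, ddof=1))                        # equation (24): divisor n - 1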
A 95% confidence interval for µ (when σ is unknown) is:
( µ̂ − t_{n−1,0.025} S/√(n − 1),  µ̂ + t_{n−1,0.025} S/√(n − 1) ),   (25)
where t_{n−1,0.025} is the 97.5% percentile of the Student's t distribution with n − 1 degrees of freedom.
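A minimal sketch of (25), assuming scipy is available and using the same hypothetical data as above:

import numpy as np
from scipy.stats import t

x = np.array([4.2, 5.1, 3.8, 4.9, 5.4, 4.6])     # hypothetical data
n = len(x)
mu_hat = x.mean()
S = np.sqrt(np.mean((x - mu_hat) ** 2))           # divisor-n S, as in equation (20)

t_crit = t.ppf(0.975, df=n - 1)                   # 97.5% percentile of Student's t
half_width = t_crit * S / np.sqrt(n - 1)
print(mu_hat - half_width, mu_hat + half_width)   # 95% confidence interval for mu, equation (25)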
A 95% confidence interval for σ (when µ is unknown) is:
( S √( n/χ²_{n−1,0.025} ),  S √( n/χ²_{n−1,0.975} ) ),   (26)
where χ²_{n−1,q} is the 100(1 − q)% percentile of the chi-squared distribution with n − 1 degrees of freedom.
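Similarly, a sketch of (26); note that scipy's ppf takes the lower-tail probability, so the 97.5% percentile of the chi-squared distribution is chi2.ppf(0.975, n − 1):

import numpy as np
from scipy.stats import chi2

x = np.array([4.2, 5.1, 3.8, 4.9, 5.4, 4.6])        # hypothetical data
n = len(x)
S = np.sqrt(np.mean((x - x.mean()) ** 2))            # divisor-n S, equation (20)

lower = S * np.sqrt(n / chi2.ppf(0.975, df=n - 1))   # 97.5% percentile in the denominator
upper = S * np.sqrt(n / chi2.ppf(0.025, df=n - 1))   # 2.5% percentile in the denominator
print(lower, upper)                                   # 95% confidence interval for sigma, equation (26)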
Normal PDF (figure): densities for mu = 0 with sd = 1, 2, 5, 10 (four panels).
Normal PDF (figure): densities for sd = 1 with mu = 0, −2, 2, −5 (four panels).