Download Lecture 9, Oct 4 and Oct. 6.

Math/Stat 352
Lecture 9
Section 4.5
Normal distribution
Abraham de Moivre,
1667-1754
A French mathematician, who
introduced the Normal distribution
in his book “The doctrine of
chances: or, a method for
calculating the probabilities of
events in play”, first published in
1718 and considered the first
textbook on Probability.
Pierre-Simon
Laplace (1749–1827)
A French mathematician and
astronomer. Extended de Moivre’s
result in the book “Analytical theory
of probabilities”, published in 1812
(the so-called de Moivre-Laplace
theorem).
The de Moivre-Laplace theorem gives a normal approximation
to the central part of the Binomial distribution.
Johann Carl Friedrich Gauss (1777–1855)
A German mathematician and scientist who contributed significantly
to many fields, including number theory, statistics, analysis,
differential geometry, geodesy, geophysics, astronomy and optics.
Referred to mathematics as “the queen of sciences.” Rigorously
justified the method of least squares in 1809 using the Normal
distribution for errors.
Charles Sanders Peirce, 1839-1914
an American philosopher, logician,
mathematician, and scientist
Francis Galton, 1822-1911
a prolific English scientist
Wilhelm Lexis, 1837-1914
an eminent German
statistician, economist, social
scientist, a founder of the
interdisciplinary study of
insurance.
Coined the term “Normal distribution” around 1875
Karl Pearson (March 27, 1857 – April 27, 1936)
established the discipline of mathematical statistics
Made the term “Normal distribution” popular
[Figure: Binomial(100, 0.9), np = 90. Vertical axis: dbinom(q, 100, 0.9); horizontal axis: number of successes (70 to 100). The probability is concentrated around 90.]
[Figure: Bin(100, 0.9), np = 90, and Bin(100, 0.1), np = 10, plotted against the number of successes over the full range 0 to 100.]
Only a small fraction of the possible outcomes has non-negligible probability
(i.e., only a small part of the range can be seen in an experiment).
Elsewhere the probability is very small (not 0!).
Poisson? Binomial? Normal?
[Figure: Poisson(30), Binomial(1000, .03), and N(30, 30) overlaid. Vertical axis: dpois(q, 30); horizontal axis: number of successes (0 to 50).]
[Figure: two panels, Poisson(1), plotted as dpois(q, 1), and Binomial(1000, .001), plotted as dbinom(q, 1000, 1/1000), each against the number of successes (0 to 10).]
The distributions are not symmetric; np = λ = 1 is too small for the normal approximation.
Poisson? Binomial? Normal?
Rule of thumb:
If n is large (n > 100), p is small (p < 0.05), and
both np and n(1-p) are not small (say > 10), then
B(n, p) ≈ P(np) ≈ N(np, np(1-p)),
where P(np) denotes the Poisson distribution with mean np.
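A minimal Python sketch of this rule of thumb, using only the standard library. The parameters n = 1000 and p = 0.03 are the ones from the Binomial(1000, .03) comparison in the lecture; the helper names `binom_pmf` and `poisson_pmf` are illustrative and not part of the slides.

```python
import math
from statistics import NormalDist

n, p = 1000, 0.03                      # n > 100, p < 0.05, np = 30 > 10
lam = n * p                            # Poisson mean, lambda = np
normal = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

def binom_pmf(k):
    """Exact Binomial(n, p) probability of k successes."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k):
    """Poisson(lam) probability of k events."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Compare the three values near the center of the distribution.
for k in (20, 25, 30, 35, 40):
    print(k, round(binom_pmf(k), 4), round(poisson_pmf(k), 4), round(normal.pdf(k), 4))
```

The normal column uses the density evaluated at the integer k, which is a reasonable stand-in for the pmf only when the standard deviation is well above 1, as it is here.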
The Normal Distribution
The normal distribution (also called the Gaussian distribution) is by far
the most commonly used distribution in the sciences. It provides a good
model for many, although not all, continuous populations.
The normal distribution is continuous. The mean of a normal population
may have any value, and the variance may have any positive value.
Properties of Normal pdf:
• Bell shaped curve, symmetric
• Centered at the mean
• Larger standard deviation gives “flatter” curve with longer “tails”.
[Figure: Normal densities for (Mean = 0, Std = 0.6), (Mean = 0, Std = 1), (Mean = 0, Std = 3), (Mean = 5, Std = 1), and (Mean = 7, Std = 1). Vertical axis: density; horizontal axis: value of random variable (-10 to 10).]
Normal R.V.: pdf, mean, and variance
The pdf of a normal random variable with mean µ and variance σ²,
X ~ N(µ, σ²), is given by
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty \]
If X ~ N(µ, σ²), then the mean and variance of X are given by
\[ \mu_X = \mu, \qquad \sigma_X^2 = \sigma^2 \]
Note: The normal pdf is symmetric, so the mean = median.
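The density formula can be checked numerically. The sketch below writes the pdf out by hand and compares it with `statistics.NormalDist.pdf` from the Python standard library; the function name `normal_pdf` and the test values are illustrative choices.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x, written directly from the formula."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 5.0, 1.0                       # one of the curves in the density plot
by_hand = normal_pdf(6.2, mu, sigma)
library = NormalDist(mu, sigma).pdf(6.2)
print(by_hand, library)

# Symmetry about the mean: equal density at mu - d and mu + d.
print(normal_pdf(mu - 1.5, mu, sigma), normal_pdf(mu + 1.5, mu, sigma))
```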
Normal distributions – what are the most likely
observations? 68-95-99.7% Rule
For X ~ N(µ, σ²):
• About 68% of the observations are in the interval µ ± σ.
• About 95% of the observations are in the interval µ ± 2σ.
• About 99.7% of the observations are in the interval µ ± 3σ.
The proportion of normal observations that are within a given
number of standard deviations of the mean is the same for any
normal population.
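A short Python sketch that reproduces the 68-95-99.7% figures and illustrates that they do not depend on µ or σ. The function name `prob_within` and the second population N(50, 5²) are assumptions made for illustration.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma < X < mu + k*sigma) for X ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    # Same proportion for the standard normal and for N(50, 5^2).
    print(k, round(prob_within(k), 4), round(prob_within(k, mu=50, sigma=5), 4))
```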
Standard Normal Distribution
The standard normal distribution is a normal distribution with mean 0 and
variance 1, X ~ N(0, 1).
A standard normal random variable is usually denoted by Z: Z ~ N(0, 1).
We can convert the observations from any X ~ N(µ, σ²) to “standard
units”:
\[ z = \frac{x - \mu}{\sigma} \]
This process is often called “standardization”. The value of X in standard
units is often called the “z-score”.
Standard units tell how many standard deviations an observation is from
the population mean.
Computing standard normal probabilities: Table A.2
P(Z < 0.47) = 0.6808
P(Z ≤ 1.38) = P(Z < 1.38) = 0.9162
Reminder: The total area under a pdf curve is 1, so
P(Z > 1.38) = 1 – 0.9162 = 0.0838
P(0.71 < Z < 1.28) = P(Z < 1.28) – P(Z < 0.71) = 0.8997 – 0.7611 = 0.1386
Symmetry in action: the standard normal pdf is symmetric about 0, so
P(Z < 0.67) = P(Z > -0.67) = 1 – P(Z < -0.67)
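The table lookups can be cross-checked in software. This is a sketch using `statistics.NormalDist` from the Python standard library in place of Table A.2; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()                          # standard normal, mean 0 and sd 1

p_below = Z.cdf(0.47)                     # P(Z < 0.47)
p_above = 1 - Z.cdf(1.38)                 # P(Z > 1.38), complement rule
p_between = Z.cdf(1.28) - Z.cdf(0.71)     # P(0.71 < Z < 1.28)
left, right = Z.cdf(0.67), 1 - Z.cdf(-0.67)   # the two sides of the symmetry identity

print(round(p_below, 4), round(p_above, 4), round(p_between, 4))
print(round(left, 4), round(right, 4))
```

Small differences from the table in the fourth decimal place can occur because the table entries are themselves rounded.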
Standard Normal Percentiles
Given that P(Z < a) = 0.95, find a. Here a is called the 95th percentile of Z.
Inside the table, look for 0.95.
The table contains 0.9495 (z = 1.64) and 0.9505 (z = 1.65).
Since 0.95 is the midpoint between the
two available probabilities, use the midpoint
of the corresponding z-values: a = 1.645.
If one available probability is closer
to the one we need, use the
z-value corresponding to
that probability.
[Figure: standard normal density with area 0.95 to the left of a.]
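As a check on the table interpolation, the inverse cdf can be evaluated directly. A minimal sketch with the Python standard library; the variable name `a` mirrors the slide.

```python
from statistics import NormalDist

Z = NormalDist()
a = Z.inv_cdf(0.95)        # 95th percentile of the standard normal
print(round(a, 3))

# Sanity check: the area to the left of a should be 0.95 again.
print(round(Z.cdf(a), 4))
```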
Finding Probabilities for any Normal Variable
Problem: Let X ~ N(µ, σ²), find P(a < X < b).
The probability that X lies within any interval is given by the integral of
the pdf of X over that interval:
\[ P(a < X < b) = \int_a^b f(x)\,dx, \quad \text{where} \quad f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty \]
The integral that provides the probability does not have a closed-form
solution. How to proceed?
Solution: Convert X to standard normal (“standardize”) and use the z-table.
Computing normal probabilities: any X ~ N(µ, σ²)
Let X ~ N(50, 25). Find P(42 < X < 52).
\[ P(42 < X < 52) = P\left(\frac{42-50}{5} < Z < \frac{52-50}{5}\right) = P(-1.6 < Z < 0.4) = 0.6554 - 0.0548 = 0.6006. \]
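The same calculation, sketched in Python: standardize by hand and then evaluate the standard normal cdf. The variable names are illustrative; note that N(50, 25) has variance 25, so the standard deviation passed to the code is 5.

```python
from statistics import NormalDist

mu, sigma = 50, 5                      # X ~ N(50, 25): variance 25, sd 5
z_low = (42 - mu) / sigma              # -1.6
z_high = (52 - mu) / sigma             # 0.4

Z = NormalDist()
prob = Z.cdf(z_high) - Z.cdf(z_low)    # P(-1.6 < Z < 0.4)
print(z_low, z_high, round(prob, 4))

# Same answer without standardizing, working with X directly.
X = NormalDist(mu, sigma)
prob_direct = X.cdf(52) - X.cdf(42)
```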
Computing Normal Percentiles: any Normal distribution
Let X ~ N(50, 25). Find the 40th percentile of X.
Find the 40th percentile of the standard normal: z = -0.25.
“Destandardize”:
\[ -0.25 = \frac{a - 50}{5}, \]
thus a = 48.75.
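A sketch of the “destandardize” step in Python. It uses the unrounded percentile of Z, so the result differs slightly from the table-based 48.75; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 50, 5                       # X ~ N(50, 25)
z40 = NormalDist().inv_cdf(0.40)        # 40th percentile of Z, about -0.25
a = mu + z40 * sigma                    # invert z = (a - mu) / sigma
print(round(z40, 4), round(a, 2))

# The library can also return the percentile of X in one step.
a_direct = NormalDist(mu, sigma).inv_cdf(0.40)
```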
Example.
A process manufactures ball bearings whose diameters are normally distributed
with mean 2.505 cm and standard deviation of 0.008 cm. Specifications call for
the diameter to be in the interval 2.5±0.01 cm. What proportion of the ball
bearings will meet the specifications?
Soln: Let X = diameter of a ball bearing. X ~ N(2.505, 0.008²).
Standardize: (2.49 – 2.505)/0.008 = -1.875 ≈ -1.88 and (2.51 – 2.505)/0.008 = 0.625 ≈ 0.63, so
P(2.49 < X < 2.51) = P(-1.88 < Z < 0.63) = 0.7357 – 0.0301 = 0.7056
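A Python sketch of the ball-bearing calculation. Because the code does not round the z-values to two decimals, its answer is close to, but not exactly, the table-based 0.7056.

```python
from statistics import NormalDist

X = NormalDist(mu=2.505, sigma=0.008)   # diameter in cm
low, high = 2.49, 2.51                  # specification limits, 2.5 ± 0.01 cm

z_low = (low - X.mean) / X.stdev        # -1.875 before rounding
z_high = (high - X.mean) / X.stdev      # 0.625 before rounding
proportion = X.cdf(high) - X.cdf(low)   # proportion meeting the specification
print(round(z_low, 3), round(z_high, 3), round(proportion, 4))
```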
Example: ball bearings contd.
Suppose the machine was recalibrated, so that the mean diameter is now
2.5 cm. To what value must the standard deviation be lowered, so that
95% of the diameters will meet the specifications?
Soln. X = diameter of a ball bearing, X ~ N(2.5, σ²). Want σ such that
P(2.49 < X < 2.51) = 0.95. Standardize:
\[ P\left(\frac{2.49-2.5}{\sigma} < Z < \frac{2.51-2.5}{\sigma}\right) = P\left(\frac{-0.01}{\sigma} < Z < \frac{0.01}{\sigma}\right) = 0.95. \]
Need z-values that satisfy this equation. By symmetry, P(-1.96 < Z < 1.96) = 0.95, so -1.96 = -0.01/σ and 1.96 = 0.01/σ.
Thus σ = 0.01/1.96 = 0.0051 cm.
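The required σ can be computed the same way in Python. This sketch obtains 1.96 as the 97.5th percentile of Z (2.5% in each tail) and then confirms the resulting coverage; the variable names are illustrative.

```python
from statistics import NormalDist

half_width = 0.01                          # spec is 2.5 ± 0.01 cm
z = NormalDist().inv_cdf(0.975)            # central 95% leaves 2.5% per tail
sigma = half_width / z                     # solve z = 0.01 / sigma

X = NormalDist(2.5, sigma)
coverage = X.cdf(2.51) - X.cdf(2.49)       # should come back as 0.95
print(round(z, 2), round(sigma, 4), round(coverage, 4))
```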
Linear Functions of Normal Random Variables
Let X ~ N(µ, σ²) and let a ≠ 0 and b be constants. Then
\[ aX + b \sim N(a\mu + b,\ a^2\sigma^2) \]
Let X1, X2, …, Xn be independent and normally distributed with means µ1,
µ2, …, µn and variances σ1², σ2², …, σn². Let c1, c2, …, cn be constants,
and c1X1 + c2X2 + … + cnXn be a linear combination. Then
\[ c_1X_1 + c_2X_2 + \dots + c_nX_n \sim N\left(c_1\mu_1 + c_2\mu_2 + \dots + c_n\mu_n,\ c_1^2\sigma_1^2 + c_2^2\sigma_2^2 + \dots + c_n^2\sigma_n^2\right) \]
Example
A chemist measures the temperature of a solution in °C. The
measurement is denoted C, and is normally distributed with mean 40 °C
and standard deviation 1 °C. The measurement is converted to °F by the
equation F = 1.8C + 32. What is the distribution of F?
Soln: Let X = temperature in °C, X ~ N(40, 1). Let Y = temperature in °F.
Then Y = 1.8X + 32.
Since Y is a linear function of a normal random variable, Y has a normal
distribution. Mean of Y = 1.8(40) + 32 = 104 °F. Variance of Y = (1.8)²(1) = 3.24.
So, Y ~ N(104, 3.24).
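A small Python sketch of the rule aX + b ~ N(aµ + b, a²σ²) applied to the temperature conversion. The helper `linear_transform` is a hypothetical name introduced only for this illustration.

```python
def linear_transform(mu, var, a, b):
    """Return (mean, variance) of aX + b when X ~ N(mu, var)."""
    return a * mu + b, a ** 2 * var

# Celsius reading X ~ N(40, 1), converted with F = 1.8C + 32.
mean_f, var_f = linear_transform(mu=40, var=1, a=1.8, b=32)
print(mean_f, round(var_f, 2))
```

Only the mean is shifted by b; the variance is scaled by a² and is unaffected by the additive constant.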
Distributions of Functions of Normals
Let X1, X2, …, Xn be independent and identically normally distributed with
mean µ and variance σ². Then the sample mean satisfies
\[ \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right). \]
Let X and Y be independent, with X ~ N(µX, σX²) and Y ~ N(µY, σY²).
Then
\[ X + Y \sim N(\mu_X + \mu_Y,\ \sigma_X^2 + \sigma_Y^2) \]
\[ X - Y \sim N(\mu_X - \mu_Y,\ \sigma_X^2 + \sigma_Y^2) \]
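These rules can be tried out with Python's `statistics.NormalDist`, whose `+` and `-` operators treat the two operands as independent normal variables. The parameters below are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from statistics import NormalDist

X = NormalDist(mu=10, sigma=3)   # hypothetical: variance 9
Y = NormalDist(mu=4, sigma=4)    # hypothetical: variance 16

S = X + Y                        # sum of independent normals
D = X - Y                        # difference of independent normals

# Means add or subtract, but the variances add in both cases.
print(S.mean, S.variance)
print(D.mean, D.variance)
```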
SAMPLING DISTRIBUTION OF THE SAMPLE MEAN

EXAMPLE: Students in a university have a weight distribution that is
known to be N(150, 20²), i.e., mean 150 and standard deviation 20. Let X1, X2, …, X16 represent the weights of 16
randomly selected students from this university. If X̄ is the average
weight for this sample, find P(X̄ > 160).

Solution: Since the sample came from a normal distribution, the
sample mean has a normal distribution as well:
X̄ ~ N(μ, σ²/n) = N(150, 20²/16) = N(150, 25).
Thus,
\[ P(\bar{X} > 160) = P\left(\frac{\bar{X} - 150}{5} > \frac{160 - 150}{5}\right) = P(Z > 2) = 1 - P(Z \le 2) = 1 - 0.9772 = 0.0228. \]
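A Python sketch of this sample-mean calculation, using the standard library. The variable names are illustrative; the standard deviation of the sample mean is σ/√n = 20/4 = 5.

```python
import math
from statistics import NormalDist

mu, sigma, n = 150, 20, 16
x_bar = NormalDist(mu, sigma / math.sqrt(n))   # sample mean ~ N(150, 25)

p_over_160 = 1 - x_bar.cdf(160)                # P(sample mean > 160) = P(Z > 2)
print(x_bar.stdev, round(p_over_160, 4))
```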
EXAMPLE, CONTD.

An elevator at this university has a capacity of 1,500 pounds.
What is the probability that 9 students who enter the elevator will
have a safe ride, i.e., their total weight is less than 1,500 lb?

Solution: The sample mean has a normal distribution:
X̄ ~ N(μ, σ²/n) = N(150, 20²/9) = N(150, 44.44), so its standard deviation is 20/3 = 6.67. Also,
P(Total weight < 1500) = P(X̄ < 1500/9) = P(X̄ < 166.67). So,
\[ P(\bar{X} < 166.67) = P\left(\frac{\bar{X} - 150}{6.67} < \frac{166.67 - 150}{6.67}\right) = P(Z < 2.5) = 0.9938. \]
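The elevator problem can also be approached through the total weight itself. By the linear-combination rule with every cᵢ = 1, the total of n independent N(µ, σ²) weights is N(nµ, nσ²). The Python sketch below computes the probability both ways; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, mu, sigma = 9, 150, 20

# Route 1: total weight ~ N(n*mu, n*sigma^2) = N(1350, 60^2).
total = NormalDist(n * mu, math.sqrt(n) * sigma)
p_total = total.cdf(1500)

# Route 2: sample mean ~ N(mu, sigma^2 / n), compared against 1500 / 9.
mean = NormalDist(mu, sigma / math.sqrt(n))
p_mean = mean.cdf(1500 / n)

print(round(p_total, 4), round(p_mean, 4))
```

Both routes standardize to the same z-value, 2.5, so they agree.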
Estimating the Parameters
If X1, …, Xn are a random sample from a N(µ, σ²) distribution, µ is estimated
with the sample mean X̄
and σ² is estimated with the sample variance
\[ s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2. \]
As with any sample mean, the uncertainty (standard deviation) in X̄ is σ/√n,
which we replace with s/√n if σ is unknown. The sample mean is an unbiased
estimator of µ.
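A minimal Python sketch of these estimators. The five data values are hypothetical, made up purely for illustration; `statistics.variance` uses the n − 1 divisor, matching the formula for s².

```python
import math
import statistics

weights = [148.0, 152.0, 150.0, 155.0, 145.0]   # hypothetical sample
n = len(weights)

x_bar = statistics.mean(weights)       # estimate of mu
s2 = statistics.variance(weights)      # sample variance, divides by n - 1
std_err = math.sqrt(s2 / n)            # s / sqrt(n), uncertainty in the sample mean

print(x_bar, s2, round(std_err, 4))
```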