Download LECTURE 3(Week 1)

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Central limit theorem wikipedia , lookup

Transcript
Objectives (BPS 3)
The Normal distributions

Density curves

Normal distributions

The 68-95-99.7 rule

The standard Normal distribution

Finding Normal proportions

Using the standard Normal table

Finding a value given a proportion
Density curves
A density curve is a mathematical model of a distribution.
It is always on or above the horizontal axis.
The total area under the curve, by definition, is equal to 1, or 100%.
The area under the curve for a range of values is the proportion of all
observations for that range.
Histogram of a sample with the
smoothed density curve
theoretically describing the
population
Density curves come in any
imaginable shape.
Some are well-known
mathematically and others aren’t.
Normal distributions
Normal—or Gaussian—distributions are a family of symmetrical, bellshaped density curves defined by a mean m (mu) and a standard
deviation s (sigma): N (m, s).
1
f ( x) 
e
2
1  xm 
 

2 s 
2
x
e = 2.71828… The base of the natural logarithm
π = pi = 3.14159…
x
A family of density curves
Here the means are the same (m =
15) while the standard deviations
are different (s = 2, 4, and 6).
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
Here the means are different
(m = 10, 15, and 20) while the
standard deviations are the same
(s = 3).
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
All Normal curves N (m, s) share the same
properties

About 68% of all observations
Inflection point
are within 1 standard deviation
(s) of the mean (m).

About 95% of all observations
are within 2 s of the mean m.

Almost all (99.7%) observations
are within 3 s of the mean.
mean µ = 64.5
standard deviation s = 2.5
N(µ, s) = N(64.5, 2.5)
Reminder: µ (mu) is the mean of the idealized curve, while x is the mean of a sample.
σ (sigma) is the standard deviation of the idealized curve, while s is the s.d. of a sample.
The standard Normal distribution
Because all Normal distributions share the same properties, we can
standardize our data to transform any Normal curve N (m, s) into the
standard Normal curve N (0,1).
N(64.5, 2.5)
N(0,1)
=>
x
z
Standardized height (no units)
For each x we calculate a new value, z (called a z-score).
Standardizing: calculating z-scores
A z-score measures the number of standard deviations that a data
value x is from the mean m.
z
(x  m )
s
When x is 1 standard deviation larger
than the mean, then z = 1.
for x  m  s , z 
m s  m s
 1
s
s
When x is 2 standard deviations larger
than the mean, then z = 2.
for x  m  2s , z 
m  2s  m 2s

2
s
s
When x is larger than the mean, z is positive.
When x is smaller than the mean, z is negative.
Example: Women heights
N(µ, s) =
N(64.5, 2.5)
Women’s heights follow the N(64.5″,2.5″)
distribution. What percent of women are
Area= ???
shorter than 67 inches tall (that’s 5′7″)?
mean µ = 64.5"
standard deviation s = 2.5"
x (height) = 67"
Area = ???
m = 64.5″ x = 67″
z =0
z =1
We calculate z, the standardized value of x:
z
(x  m)
s
, z
(67  64.5) 2.5

 1  1 stand. dev. from mean
2.5
2.5
Because of the 68-95-99.7 rule, we can conclude that the percent of women
shorter than 67″ should be, approximately, .68 + half of (1 − .68) = .84, or
84%.
Percent of women shorter than 67”
For z = 1.00, the area under
the standard Normal curve
to the left of z is 0.8413.
N(µ, s) =
N(64.5”, 2.5”)
Area ≈ 0.84
Conclusion:
Area ≈ 0.16
84.13% of women are shorter than 67″.
By subtraction, 1 − 0.8413, or 15.87%, of
women are taller than 67".
m = 64.5” x = 67”
z=1
The National Collegiate Athletic Association (NCAA) requires Division I athletes to
score at least 820 on the combined math and verbal SAT exam to compete in their
first college year. The SAT scores of 2003 were approximately normal with mean
1026 and standard deviation 209.
What proportion of all students would be NCAA qualifiers (SAT ≥ 820)?
x  820
m  1026
s  209
(x  m)
z
s
(820  1026)
209
 206
z
 0.99
209
Table A : area under
z
N(0,1) to the left of
z - .99 is 0.1611
or approx. 16%.
Area right of 820
=
=
Total area
1
−
−
Area left of 820
0.1611
≈ 84%
Note: The actual data may contain students who scored
exactly 820 on the SAT. However, the proportion of scores
exactly equal to 820 being 0 for a normal distribution is a
consequence of the idealized smoothing of density curves.
The NCAA defines a “partial qualifier” eligible to practice and receive an athletic
scholarship, but not to compete, as a combined SAT score of at least 720.
What proportion of all students who take the SAT would be partial qualifiers?
That is, what proportion have scores between 720 and 820?
x  720
m  1026
s  209
(x  m)
z
s
(720  1026)
209
 306
z
 1.46
209
Table A : area under
z
Area between
720 and 820
≈ 9%
=
=
Area left of 820
0.1611
−
−
Area left of 720
0.0721
N(0,1) to the left of
z - .99 is 0.0721
About 9% of all students who take the SAT have scores
or approx. 7%.
between 720 and 820.
Finding a value given a proportion
When you know the proportion, but you don’t know the x-value that
represents the cut-off, you need to use Table A backward.
1. State the problem and draw a picture.
2. Use Table A backward, from the inside out to the margins, to
find the corresponding z.
3. Unstandardize to transform z back to the original x scale by
using the formula:
x  m  zs
Example: Women’s heights
Women’s heights follow the N(64.5″,2.5″)
distribution. What is the 25th percentile for
women’s heights?
mean µ = 64.5"
standard deviation s = 2.5"
proportion = area under curve=0.25
We use Table A backward to get the z.
On the left half of Table A (with proportions 0.5), we find that a
proportion of 0.25 is between z = -0.67 and –0.68.
We’ll use z = –0.67.
Now convert back to x:
x  m  zs  64.5  (0.67)(2.5)  62.825"
The 25th percentile for women’s heights is 62.825”, or 5’ 2.82”.
Example 1
1. P(Z < 1.96) =
2. P(Z > 1.96) =
3. P(Z < -1.96) =
4. P(-1.96 < Z < 1.96)=
5. P(Z < 4) ≈
6. P(Z > 4) ≈
7. P(Z < -4) ≈
8. P(Z > -4) ≈
Example 2
Consider a normal distribution with μ = 16 and σ = 4. The
68-95-99.7 rule says that 95% of the distribution is between
which two values?
a.
4 and 16
b.
68 and 99.7
c.
12 and 20
d.
8 and 24
Example 3
Bigger animals tend to carry their young longer before birth. The length of horse
pregnancies from conception to birth varies according to a roughly normal
distribution with mean 336 days and standard deviation 15 days.
a. What percent of horse pregnancies last less than 300 days?
b. What percent of horse pregnancies last more that a regular year (365 days)?
Example 3
c. What percent of horse pregnancies last between 320 and 350 days?
d. What percent of horse pregnancies last less than 300 days or more than a regular
year?