Smoothing the histogram: The Normal Curve (Chapter 8)

Sept. 3 Statistic for the day:
Compared to the same month of the previous year,
average monthly drop in U.S. traffic fatalities from
May 2007 to Feb. 2008: 4.2%
Drop in March 2008: 22.1%
Drop in April 2008: 17.9%
Smoothing the histogram:
The Normal Curve (Chapter 8)
A histogram tends to be rough. To replace it with a
bell-shaped curve:
Center the “bell” at the mean.
The “bell” should be just wide enough so that the middle
95% of the bell is 4 standard deviations wide.
This makes systematic, accurate predictions of all sorts
possible, provided the bell shape is appropriate for the
underlying population.
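As a rough check of the "4 standard deviations wide" rule, the sketch below uses Python's standard-library statistics.NormalDist together with the handspan mean and standard deviation reported in this lecture (20.86 and 1.927). The variable names are illustrative choices, not part of the lecture.

```python
from statistics import NormalDist

# Handspan summary values quoted in the lecture.
mean, sd = 20.86, 1.927
bell = NormalDist(mu=mean, sigma=sd)

# Interval that is 4 standard deviations wide, centered at the mean.
low, high = mean - 2 * sd, mean + 2 * sd

# Share of the bell curve that falls inside that interval.
middle_share = bell.cdf(high) - bell.cdf(low)

print(f"{low:.2f} to {high:.2f} holds {middle_share:.1%} of the curve")
```

The computed share comes out slightly above 95%, so the rule is a convenient approximation rather than an exact identity.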
Recall this dataset on handspans from the
previous lecture:
Men (n = 78)
23.5 22.5 24.5 24.5 20.0 21.0 24.0 21.0 21.0 22.5 23.0 24.0 24.5 23.0
23.5 21.0 21.5 23.0 21.0 22.0 22.5 23.0 21.0 22.5 23.0 22.0 22.5 24.5
24.5 20.5 22.5 22.5 23.0 21.5 22.0 20.5 22.0 24.5 21.5 21.0 23.5 22.0
22.0 22.0 24.0 20.5 24.0 22.0 24.5 22.0 19.0 23.5 20.5 22.0 18.0 24.0
22.5 22.0 20.0 21.5 21.0 23.5 21.5 21.5 22.0 22.5 20.5 21.0 24.0 22.0
Women (n = 89)
21.5 18.0 20.5 18.5 19.0 18.5 20.0 20.0 20.0 21.0 19.5 19.0 16.0 17.5
18.5 20.0 19.0 19.0 20.0 18.0 20.5 21.5 20.5 20.5 19.5 22.0 20.0 21.0
19.5 19.5 18.5 21.5 20.0 22.5 17.0 17.0 20.5 21.0 20.0 23.0 21.5 21.5
20.0 18.5 18.5 18.5 20.0 20.0 20.5 21.0 19.0 20.5 21.0 20.0 21.0 20.0
18.5 16.5 19.0 19.0 19.0 20.5 17.0 19.5 19.0 20.0 19.5 20.5 20.0 19.5
19.0 18.5 20.0 17.0 21.0 18.5 20.5 19.5 19.0 20.5 21.0 18.5 17.5 19.5
18.5 19.5 20.0 20.0 18.5
[Figure: Histogram of HandSpan, with Normal Curve. Horizontal axis: HandSpan (15 to 25); vertical axis: Frequency (0 to 20).]
Mean = 20.86
Standard deviation = 1.927
[Figure, two panels: Histogram of HandSpan, with Normal Curve (Frequency vs. HandSpan); Histogram of Height, with Normal Curve (Frequency vs. Height).]
HandSpan: Mean = 20.86, Standard deviation = 1.927
Height: Mean = 68 inches or 5 feet 8 inches, Standard deviation = 4 inches
Research Question 1: If I built my doors
75 inches (6 feet 3 inches) high, what
percent of the people would have to
duck?
Research Question 2: How high should
I build my doorways so that 99% of the
people will not have to duck?
Z-Scores: Measurement in Standard
Deviations
Z = (75 − mean)/SD = (75 − 68)/4 = 1.75
[Figure: Histogram of Height, with Normal Curve (Frequency vs. Height), marking Question 1 (x = 75) and Question 2 (x = ??).]
Q1: The value of x is 75; find the amount of distribution above it.
Q2: Find the value of x so that 99% of the distribution is below it.
Compute your Z-score.
1. How many standard deviations are you
above or below the mean?
Use:
Mean = 68 inches
Standard deviation = 4 inches
2. Now use the table from the book (p. 157) to
determine your percentile.
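The two steps above can be sketched in Python. This is only an illustration: the table rows are computed with statistics.NormalDist rather than copied from the book, the function names are made up for this example, and the 70-inch height is a hypothetical value.

```python
from statistics import NormalDist

MEAN, SD = 68, 4          # inches, as given in the lecture
standard = NormalDist()   # standard normal: mean 0, SD 1

def z_score(height):
    """Number of standard deviations above (+) or below (-) the mean."""
    return (height - MEAN) / SD

def percentile(z):
    """Proportion of the standard normal curve below z."""
    return standard.cdf(z)

# A few rows of a z-to-percentile table (computed, not copied from the book).
for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"z = {z:+.1f}  proportion below = {percentile(z):.2f}")

# Hypothetical example height of 70 inches.
my_z = z_score(70)
print(f"70 inches: z = {my_z:.2f}, percentile = {percentile(my_z):.0%}")
```

Step 1 is the z_score function and step 2 is the percentile lookup; a printed table does the same lookup by hand.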
Compare Heights of Females and Males
[Figure: Height by Sex (Female, Male), Stat 100 students Sp01.]
Assume male heights have a normal distribution with
mean 70 inches and st dev 3 inches. Assume female
heights have a normal distribution with mean 64 inches
and st dev 3 inches.
What is your Z-Score within your sex?
What is your percentile within your sex?
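To see how the same height ranks differently within each group, here is a minimal sketch using the two distributions assumed on the slide. The 67-inch height and the helper name within_group are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

# Distributions assumed on the slide: same SD, different means.
heights = {
    "Male": NormalDist(mu=70, sigma=3),
    "Female": NormalDist(mu=64, sigma=3),
}

def within_group(height, sex):
    """Return (z-score, percentile) of a height within one group."""
    dist = heights[sex]
    z = (height - dist.mean) / dist.stdev
    return z, dist.cdf(height)

# Hypothetical height of 67 inches, evaluated within each group.
for sex in heights:
    z, pct = within_group(67, sex)
    print(f"{sex}: z = {z:+.2f}, percentile = {pct:.0%}")
```

A height of 67 inches is one standard deviation below the male mean but one standard deviation above the female mean, so the percentile depends on which group it is compared against.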
Answer to Question 1: What percent of people would
have to duck if I built my doors 75 inches high?
Recall: 75 has a Z-score of 1.75
From the standard normal table in the book: .96 or
96% of the distribution is below 1.75. Hence, .04
or 4% is above 1.75.
So 4% of the distribution is above 75 inches.
[Figure: Histogram of Height, with Normal Curve (Frequency vs. Height); the region above x = 75 (Question 1) is marked "4% in here".]
The value at x is 75; find the amount of distribution above
it. Convert 75 to Z = 1.75 and use Table 8.1 on p. 157.
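The table lookup for Question 1 can be checked numerically. The sketch below assumes Python's statistics.NormalDist as a stand-in for the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=68, sigma=4)   # model assumed in the lecture
door = 75                              # door height in inches

z = (door - heights.mean) / heights.stdev   # measurement in standard deviations
below = heights.cdf(door)                   # proportion who fit under the door
above = 1 - below                           # proportion who must duck

print(f"z = {z:.2f}, below = {below:.2f}, above = {above:.2f}")
# prints: z = 1.75, below = 0.96, above = 0.04
```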
Question 2: What is the value so that 99% of the
distribution is below it? (This value is called the 99th percentile.)
1. Look up the Z-score that corresponds to the 99th
percentile. From the table: Z = 2.33.
3. Now convert it over to inches:
2.33 = (h99 − 68)/4
h99 = 68 + 2.33(4) = 77.3
77.3 inches is the 99th percentile
Therefore, 99% of the distribution is shorter than 77.3
inches (6 feet 5.3 inches) and that’s how high the door
should be built.
[Figure: Histogram of Height, with Normal Curve (Frequency vs. Height); the region below the Question 2 value is marked "99% in here".]
To find the value so that 99% of the distribution is below it: Look up
the Z-score for the 99th percentile and convert it back to inches.
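Going from a percentile back to a height reverses the table lookup. A minimal sketch, assuming statistics.NormalDist.inv_cdf in place of reading the table backwards (the names z99 and h99 follow the slide's notation):

```python
from statistics import NormalDist

MEAN, SD = 68, 4   # inches, as given in the lecture

# Z-score with 99% of the standard normal curve below it.
z99 = NormalDist().inv_cdf(0.99)

# Convert from standard deviations back to inches.
h99 = MEAN + z99 * SD

print(f"z99 = {z99:.2f}, h99 = {h99:.1f} inches")
```

The computed z-score rounds to the table value of 2.33, and the resulting height rounds to 77.3 inches.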
Answer these questions:
(Assume that adults’ heights are normally distributed with
mean 68 inches and standard deviation 4 inches.)
To answer question 1, first convert 72
inches to a z-score:
Z = (72 − 68)/4 = 1
The proportion of the distribution below this z-score is .84 or 84%.
Answer to Question 2: What is the first quartile of heights?
Translation: “First quartile” means 25th percentile, which
means .25 are below that height.
From p. 157: Find the z-score corresponding to the 25th
percentile: Z = –.67
Now convert this z-score into a height:
Z-score = (h − 68)/4
h = 68 + 4(Z-score)
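Plugging the z-score into the conversion formula finishes the calculation. The sketch below does this twice: once with the table value quoted above, and once with an unrounded z-score from statistics.NormalDist. The function name height_from_z is a made-up helper.

```python
from statistics import NormalDist

MEAN, SD = 68, 4   # inches, as given in the lecture

def height_from_z(z):
    """Invert Z = (h - mean)/SD to recover a height in inches."""
    return MEAN + SD * z

# Using the table value quoted on the slide.
q1_table = height_from_z(-0.67)

# Using an unrounded z-score for the 25th percentile.
q1_exact = height_from_z(NormalDist().inv_cdf(0.25))

print(f"first quartile: {q1_table:.2f} (table z), {q1_exact:.2f} (computed z)")
```

Both versions put the first quartile at roughly 65.3 inches; the small gap between them comes from rounding the z-score to two decimals.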
There are roughly 305 million people in the US.
About 49% are over the age of 20 (Census Bureau).
That is about 150 million.
Shaquille O’Neal is 7 feet 1 inch or
85 inches tall. How many people in
the country are taller?
We will assume that heights are normally distributed
with mean 68 inches and standard deviation 4 inches.
O’Neal’s Z-score is Z = (85 − 68)/4 = 4.25. In other words,
O’Neal is 4.25 standard deviations above the mean (!).
There is only 0.000011 of the normal distribution above
4.25 standard deviations.
Page 157
Hence, there should be roughly
.000011 times 150 million
or 1650 people taller than Shaquille O’Neal.
Note: This is an extremely rough calculation, since the
normal distribution approximation is less accurate at the
extremes. Also, cutting off at age 20 might miss some
tall teens!
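The tail calculation can be reproduced without a table. This is a rough sketch under the same assumptions as the lecture (normal heights, mean 68, SD 4, about 150 million adults), using statistics.NormalDist for the tail area.

```python
from statistics import NormalDist

adults = 150_000_000      # rough adult population used in the lecture
z = (85 - 68) / 4         # O'Neal's z-score

# Proportion of the standard normal curve above z.
tail = 1 - NormalDist().cdf(z)

print(f"tail = {tail:.6f}, expected taller = {tail * adults:.0f}")
```

The unrounded tail area gives a count near 1,600, close to the lecture's 1650; the difference comes from rounding the tail to 0.000011 before multiplying.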
Page 158
Suppose someone claims to have
tossed a fair coin 100 times and got
70 heads. Would you believe them?
Toss a coin 100 times
Repeat 500 times and form a histogram
So the distribution of the number of
heads in 100 tosses of a fair coin is:
[Figure: histogram of the number of heads in 100 tosses, over 500 repetitions. Horizontal axis: Number of heads (35 to 65); vertical axis: Frequency (0 to 90).]
1. What is the mean?
2. What is the standard deviation?
3. Let’s suppose the smooth version is normal.
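The experiment and the three questions can be explored with a short simulation. This is a sketch, not the lecture's own computation: the random seed is arbitrary, the helper heads_in_tosses is made up for this example, and the theoretical mean and standard deviation use the standard formulas for a count of successes in n independent trials.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(0)  # arbitrary seed so the run is repeatable

def heads_in_tosses(n=100):
    """Count heads in n tosses of a fair coin."""
    return sum(random.random() < 0.5 for _ in range(n))

# Repeat the 100-toss experiment 500 times, as on the slide.
counts = [heads_in_tosses() for _ in range(500)]
print(f"simulated mean = {mean(counts):.1f}, SD = {stdev(counts):.1f}")

# Theory for a fair coin: mean = n*p, SD = sqrt(n*p*(1-p)).
n, p = 100, 0.5
theory_mean = n * p
theory_sd = (n * p * (1 - p)) ** 0.5

# Z-score of the claimed 70 heads and the normal tail beyond it.
z70 = (70 - theory_mean) / theory_sd
tail70 = 1 - NormalDist().cdf(z70)
print(f"70 heads: z = {z70:.1f}, normal tail = {tail70:.6f}")
```

With a mean of 50 and a standard deviation of 5, a claim of 70 heads sits 4 standard deviations above the mean, which the normal approximation treats as extremely unlikely.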