Download Chapter 2 Packet - Somerset Independent Schools

2016 version
1
AP Statistics Chapter 2 –
Modeling Distributions of Data
Chapter 2 Test Standards:
Here’s what’s on the test, question by question:
Multiple Choice Section:
#1. I can use percentiles to locate individual values within
distributions of data.
#2. I can correctly interpret a percentile rank in the
context of a given situation.
#3. I can use z-scores to make comparisons and form
conclusions due to those z-score values.
#4. I can interpret a Normal probability plot to assess
normality of a distribution of data.
#5. I can describe the effect of adding, subtracting,
multiplying by, or dividing by a constant on the shape,
center, and spread of a distribution of data.
#6. I can look at different dot plots displaying distributions
of data and determine which one is best approximated by
a normal distribution.
#7. I can use the standard Normal distribution (z-scores)
to calculate the proportion of values in a specified interval
(< or >).
#8. I can use the standard Normal distribution (z-scores)
to calculate the proportion of values in a specified interval
(< or >).
#9. I can interpret a cumulative relative frequency graph
and calculate the IQR for a distribution using that graph.
#10. I can describe the effect of adding, subtracting,
multiplying by, or dividing by a constant on the values
from a distribution of data and determine which of those
values change and which ones do not.
#11. I can use the standard Normal distribution to
determine a z-score from a percentile or quartile value (via
INVNORM).
#12. I can use the standard Normal distribution to
determine a z-score from a percentile or quartile value (via
INVNORM).
Free Response Section:
Free Response Question #1.
• I can use z-scores to make comparisons and form
conclusions due to those z-score values.
• I can use the standard Normal distribution to determine a
z-score from a percentile or quartile value (via INVNORM).
Free Response Question #2.
• I can use the standard Normal distribution (z-scores) to
calculate the proportion of values in a specified interval (< or
>).
• I can use the standard Normal distribution (z-scores) to
calculate the proportion of values in a specified interval (for
values ‘between’).
• I can use the standard Normal distribution (z-scores) to
calculate the probability of an event for values in a specified
interval (< or >).
Day #1: Introduction to Percentile; Describing
Location in a Distribution
• I can use percentiles to locate individual values within
distributions of data.
During this lesson, we will learn how to describe the
‘location’ in a distribution. Another way to state this is by
saying we are going to ‘measure position’.
Vocabulary:
Percentile (measuring relative location/position of data)
– the ‘pth’ percentile of a distribution is the value with ‘p’
percent of the observations less than it.
Example: Wins in Major League Baseball
The stem plot below shows the number of wins for each of
the 30 Major League Baseball teams in 2009.
 5 | 9
 6 | 2455
 7 | 00455589
 8 | 0345667778
 9 | 123557
10 | 3
Key: 5|9 represents a team with 59 wins.
Problem: Find the percentiles for the following teams:
(a) The Colorado Rockies, who won 92 games.
(b) The New York Yankees, who won 103 games.
(c) The Kansas City Royals and Cleveland Indians, who
both won 65 games.
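As a rough check on the percentile definition used here (percent of observations strictly less than the value), the sketch below recomputes the three answers. The `wins` list is transcribed from the stem plot, and `percentile_below` is a hypothetical helper name, not part of the packet.

```python
# Wins for the 30 MLB teams in 2009, read off the stem plot.
wins = [59, 62, 64, 65, 65, 70, 70, 74, 75, 75, 75, 78, 79,
        80, 83, 84, 85, 86, 86, 87, 87, 87, 88,
        91, 92, 93, 95, 95, 97, 103]

def percentile_below(data, value):
    """Percent of observations strictly less than `value`."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

for team, w in [("Rockies", 92), ("Yankees", 103), ("Royals/Indians", 65)]:
    print(team, round(percentile_below(wins, w), 1))
```

Under this definition the Rockies land at the 80th percentile, the Yankees at about the 97th, and the two 65-win teams at the 10th.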
Practice Multiple Choice Question:
#1. The dotplot below displays the total number of miles
that the 28 residents of one street in a certain community
traveled to work in one five-day work week.
Which of the following is closest to the percentile rank of a
resident from this street who traveled 85 miles to work
that week?
a. 60
b. 70
c. 75
d. 80
e. 85
EXAMPLE: Percentiles… continued
#1. Using the definition of percentile that says “the
percent of observations strictly below the given value”,
what percentile rank is a female who has a temperature of
98.0 degrees F?
#2. Using the definition of percentile that says “the
percent of observations at or below the given value”,
what percentile rank is a female who has a temperature of
98.0 degrees F?
#3. Using the definition of percentile that says “the
percent of observations strictly below the given value”,
what percentile rank is a male who has a temperature of
97.4 degrees F?
#4. Using the definition of percentile that says “the
percent of observations at or below the given value”,
what percentile rank is a male who has a temperature of
97.4 degrees F?
#5. State the temperature of a female that would rank at
the 64th percentile using the definition: “the percent of
observations strictly below the given value.”
#6. State the temperature of a male that would rank at
the 93rd percentile using the definition: “the percent of
observations at or below the given value.”
• I can interpret a cumulative relative frequency graph.
We will now look at a graph that helps us determine
percentile ranks. This graph is known as a ‘cumulative
relative frequency graph.’ Some textbooks refer to it as an
‘ogive.’
Example: State Median Household Incomes
Here is a table showing the distribution of median
household incomes for the 50 states and the District of
Columbia.
Median Income   Frequency   Relative         Cumulative   Cumulative Relative
($1000s)                    Frequency        Frequency    Frequency
35 to < 40      1           1/51 = 0.020     1            1/51 = 0.020
40 to < 45      10          10/51 = 0.196    11           11/51 = 0.216
45 to < 50      14          14/51 = 0.275    25           25/51 = 0.490
50 to < 55      12          12/51 = 0.236    37           37/51 = 0.725
55 to < 60      5           5/51 = 0.098     42           42/51 = 0.824
60 to < 65      6           6/51 = 0.118     48           48/51 = 0.941
65 to < 70      3           3/51 = 0.059     51           51/51 = 1.000
Here is the cumulative relative frequency graph for the
income data.
• The point at (50, 0.49) means 49% of the states had
median household incomes less than $50,000.
• The point at (55, 0.725) means that 72.5% of the states
had median household incomes less than $55,000.
• Thus, 72.5% - 49% = 23.5% of the states had median
household incomes between $50,000 and $55,000 since
the cumulative relative frequency increased by 0.235.
• Due to rounding error, this value is slightly different
than the relative frequency for the 50 to <55 category.
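The cumulative columns can be rebuilt from the frequency column alone. The following is a minimal sketch, assuming the class boundaries and counts from the income table; the variable names are illustrative only.

```python
from itertools import accumulate

# Upper class boundaries (in $1000s) and frequencies from the table.
upper_bounds = [40, 45, 50, 55, 60, 65, 70]
frequencies = [1, 10, 14, 12, 5, 6, 3]

total = sum(frequencies)                        # 50 states plus DC
cumulative = list(accumulate(frequencies))      # running counts
cum_rel_freq = [c / total for c in cumulative]  # heights of the ogive

for bound, crf in zip(upper_bounds, cum_rel_freq):
    print(f"below {bound}: {crf:.3f}")

# Share of states with median income between $50,000 and $55,000.
share_50_to_55 = cum_rel_freq[3] - cum_rel_freq[2]
```

Each printed value is the height of the graph at the right-hand end of a class, which is why the graph can be read directly as a percentile lookup.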
Problem: Use the cumulative relative frequency graph
for the state income data to answer each question.
(a) At what percentile is California, with a median income of
$57,445?
(b) Estimate and interpret the first quartile of this distribution.
(c) Between what two values would the MIDDLE 50% lie?
(d) What is the IQR of the data?
(e) What percentile rank would a median household income of
$45,000 be?
Day #2: Review of Percentiles and Describing
Location in a Distribution; Introduction of Z-scores
• I can use percentiles to locate individual values within
distributions of data.
A small AP class has a test and the test scores are the
following:
54 58 60 68 72 78
82 84 89 94 96 97
• What is a percentile rank a measure of?
• What percentile rank is the person who scored
an 82%? Interpret this rank.
• Which number(s) are at the 25th, 50th and 75th
percentile ranks?
• I can interpret a cumulative relative frequency graph
(ogive).
Below is a cumulative relative frequency graph for the lengths,
in minutes, of 200 songs recorded by the Rolling Stones.
• What are the median and interquartile range of song lengths?
Draw lines on the graph to show how you arrived at your
answers.
• What is the 90th percentile?
• What percentile would 5.5 minutes rank at?
Introduction of Z-Scores: How to compare an
‘apple’ to an ‘orange’: z-scores!!!!
• I can find the standardized value (z-score) of an
observation and interpret z-scores in context.
• If ‘x’ is an observation from a distribution that has mean
µ and standard deviation σ, the standardized value of x is:
z = (x − µ) / σ
• A standardized value is often called a ________________.
• A _________________ tells us how many standard
deviations the original observation falls away from the
mean, and in which direction.
• When we ‘standardize’ a variable with a normal
distribution, it produces a new variable that has what we
call the:
Example: Wins in Major League Baseball
In 2009, the mean number of wins was 81 with a standard
deviation of 11.4 wins.
Problem: Find and interpret the z-scores for the following
teams.
(a) The New York Yankees, with 103 wins.
(b) The New York Mets, with 70 wins.
• We also use z-scores to give data a common scale.
• Without looking back at the previous page, try to
recall the formula for a z-score:
EXAMPLE: Home run kings
The single-season home run record for major league
baseball has been set just three times since Babe
Ruth hit 60 home runs in 1927. Roger Maris hit 61 in
1961, Mark McGwire hit 70 in 1998 and Barry Bonds
hit 73 in 2001. In an absolute sense, Barry Bonds
had the best performance of these four players,
since he hit the most home runs in a single season.
However, in a relative sense this may not be true.
Baseball historians suggest that hitting a home run
has been easier in some eras than others. This is
due to many factors, including quality of batters,
quality of pitchers, hardness of the baseball,
dimensions of ballparks, and possible use of
performance-enhancing drugs. To make a fair
comparison, we should see how these performances
rate relative to other hitters during the same year.
Problem: Compute the standardized scores for each
performance. Which player had the most
outstanding performance relative to his peers?
Year   Player         HR   Mean   SD     z-score
1927   Babe Ruth      60   7.2    9.7
1961   Roger Maris    61   18.8   13.4
1998   Mark McGwire   70   20.7   12.7
2001   Barry Bonds    73   21.4   13.2
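One way to fill in the empty z-score column is sketched below, assuming the usual definition z = (x − mean) / SD and the values in the table. The function and variable names are illustrative, not from the packet.

```python
def z_score(x, mean, sd):
    """Standardized value: how many SDs x lies from the mean."""
    return (x - mean) / sd

# (player, home runs, league mean, league SD) from the table.
seasons = [
    ("Babe Ruth", 60, 7.2, 9.7),
    ("Roger Maris", 61, 18.8, 13.4),
    ("Mark McGwire", 70, 20.7, 12.7),
    ("Barry Bonds", 73, 21.4, 13.2),
]

z_scores = {name: z_score(hr, m, s) for name, hr, m, s in seasons}
best = max(z_scores, key=z_scores.get)  # largest z = most outstanding

for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")
print("Most outstanding relative to peers:", best)
```

With these numbers Ruth's 1927 season stands more than five standard deviations above the mean, well ahead of the other three.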
EXAMPLE #1: SAT vs. ACT
Student A takes the SAT in math and scores a 680, while
Student B takes the ACT and scores a 27 on the same
portion of the test. The scores of the SAT math test are
N(500, 100) and the scores of the ACT math test are
N(18,6). Which student (A or B) has the higher score?
EXAMPLE #2: SAT vs. ACT
Student A takes the SAT in reading and scores a 645, while
student B takes the ACT and scores a 26 on the same
portion of the test. The scores of the SAT reading test are
N(510, 85) and the scores of the ACT reading test are N(19,
5). Which student (A or B) has the higher score?
MULTIPLE CHOICE PRACTICE:
#1. One of the values in a normal distribution is 58
and its corresponding z-score is 2.08. If the mean of
the distribution is 53, what is the standard deviation
of the distribution?
a. 5
b. 0.416
c. 2.40
d. -2.40
e. -0.416
#2. The weight of adult male grizzly bears living in
the wild in the continental United States is
approximately normally distributed with a mean of
500 pounds and a standard deviation of 50 pounds.
The weight of adult female grizzly bears is
approximately normally distributed with a mean of
300 pounds and a standard deviation of 40 pounds.
Approximately, what would be the weight of a
female grizzly bear with the same standardized score
(z-score) as a male grizzly bear with a weight of 530
pounds?
a. 276 pounds
b. 324 pounds
c. 330 pounds
d. 340 pounds
e. 530 pounds
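Both practice questions come down to rearranging z = (x − mean) / SD. The sketch below works them numerically; `z_score` and `from_z` are hypothetical helper names introduced only for illustration.

```python
def z_score(x, mean, sd):
    return (x - mean) / sd

def from_z(z, mean, sd):
    """Invert the z-score formula: x = mean + z * sd."""
    return mean + z * sd

# Question 1: solve z = (x - mean) / sd for sd.
sd_q1 = (58 - 53) / 2.08

# Question 2: carry the male bear's z-score over to the female scale.
z_male = z_score(530, 500, 50)
female_weight = from_z(z_male, 300, 40)

print(round(sd_q1, 2), round(female_weight))
```

The results match choice (c) for the first question and choice (b) for the second.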
Day #3: Transforming Data
• I can describe the effect of adding, subtracting,
multiplying by, or dividing by a constant on the shape,
center, and spread of a distribution of data.
EXAMPLE: Test Scores
Here are a graph and table of summary statistics for a
sample of 30 test scores.
The maximum possible score on the test was 50 points.
        n    x̄      sx     Min   Q1   M    Q3   Max   IQR   Range
Score   30   35.8   8.17   12    32   37   41   48    9     36
• Suppose that the teacher was nice and added 5 points
to each test score.
• How would this change the shape, center, and spread
of the distribution?
Here are graphs and summary statistics for the original
scores and the +5 scores:
            n    x̄      sx     Min   Q1   M    Q3   Max   IQR   Range
Score       30   35.8   8.17   12    32   37   41   48    9     36
Score + 5   30   40.8   8.17   17    37   42   46   53    9     36
• From both the graph and summary
statistics, we can see that the measures
of center and measures of position all
increased by 5.
• However, the shape of the distribution
did not change, nor did the spread of
the distribution.
Suppose that the teacher wanted to convert the original
test scores to percents. Since the test was out of 50
points, he should multiply each score by 2 to make them
out of 100.
Here are graphs and summary statistics for the original
scores and the doubled scores.
            n    x̄      sx      Min   Q1   M    Q3   Max   IQR   Range
Score       30   35.8   8.17    12    32   37   41   48    9     36
Score x 2   30   71.6   16.34   24    64   74   82   96    18    72
• From the graphs and summary
statistics we can see that the measures
of center, location, and spread all have
doubled, just like the individual
observations.
• But even though the distribution is
more spread out, the shape hasn’t
changed. It is still skewed to the left
with the same clusters and gaps.
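The two effects can be checked on any small data set. The sketch below uses a hypothetical list of nine scores (not the 30 class scores, which the packet does not list) and Python's standard `statistics` module.

```python
from statistics import mean, median, stdev

# Hypothetical scores out of 50, NOT the class data from the packet.
scores = [12, 25, 32, 35, 37, 38, 41, 44, 48]

def summarize(data):
    """Center (mean, median) and spread (SD, range) of a data list."""
    return {"mean": mean(data), "median": median(data),
            "sd": stdev(data), "range": max(data) - min(data)}

original = summarize(scores)
shifted = summarize([x + 5 for x in scores])  # add a constant
doubled = summarize([x * 2 for x in scores])  # multiply by a constant

print(original)
print(shifted)   # center moves up by 5, spread unchanged
print(doubled)   # center and spread both double
```

Adding a constant shifts the measures of center but leaves the spread alone, while multiplying by a constant scales both.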
EXAMPLE: Taxi Cabs
In 2010, Taxi Cabs in New York City charged an initial
fee of $2.50 plus $2 per mile.
In equation form, fare = 2.50 + 2(miles). At the end
of a month a businessman collects all of his taxi cab
receipts and calculates some numerical summaries.
The mean fare he paid was $15.45 with a standard
deviation of $10.20. What are the mean and
standard deviation of the lengths of his cab rides in
miles?
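Since fare = 2.50 + 2(miles), the ride lengths are miles = (fare − 2.50) / 2. A minimal sketch of undoing a linear transformation follows; `undo_linear` is a hypothetical helper name.

```python
def undo_linear(mean_y, sd_y, a, b):
    """Given y = a + b*x, recover the mean and SD of x from those of y."""
    mean_x = (mean_y - a) / b
    sd_x = sd_y / abs(b)  # the added constant a never affects spread
    return mean_x, sd_x

mean_miles, sd_miles = undo_linear(15.45, 10.20, a=2.50, b=2)
print(f"mean = {mean_miles:.3f} miles, SD = {sd_miles:.2f} miles")
```

Subtracting the initial fee changes only the mean; dividing by the per-mile rate changes both the mean and the standard deviation.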
EXAMPLE: Song Lengths
According to these data, the mean song length was
4.23 minutes, and the standard deviation was 1.38
minutes. A music lover who wants to create a mix of
songs wants to have 5 seconds of silence between
songs, so he needs to add five seconds to the length
of each song. He also wants to express the times in
seconds, rather than minutes. Find the mean and
standard deviation of the transformed data.
• What are the mean and standard deviation of
the z-scores of song lengths? Justify your
answer.
EXAMPLE:
Height, in meters is measured for each person in a
sample. After the data are collected, all the height
measurements are converted from meters to
centimeters by multiplying each measurement by
100. What statistics will remain the same for both
units of measure? Which ones will change?
MULTIPLE CHOICE PRACTICE QUESTION:
Suppose the distribution of a set of scores has a
mean of 28 and a standard deviation of 6. If 3 is
added to each score, what will be the mean and the
standard deviation of the distribution of new scores?
     Mean   Standard Deviation
a.   31     10
b.   31     6
c.   28     10
d.   28     18
e.   28     6
EXAMPLE:
FREE RESPONSE QUESTION: (from 2007 AP
Statistics Exam (Form B) #1)
The Better Business Council of a large city has
concluded that students in the city’s schools are
not learning enough about economics to
function in the modern world. These findings
were based on test results from a random
sample of 20 twelfth-grade students who
completed a 46-question multiple-choice test
on basic economic concepts. The data set
below shows the number of questions that each
of the 20 students in the sample answered
correctly.
12 16 18 17 18 33 41 44 38 35
19 36 19 13 43 8 16 14 10 9
a. Display these data in a stemplot.
b. Use your stemplot from part (a) to describe
the main features of this score distribution.
c. Why would it be misleading to report only a
measure of center for this score
distribution?
Day #4: Intro to Density Curves
• I can approximately locate the median (equal-areas
point) and the mean (balance point) on a density
curve.
Density Curves
Density Curves – 3 things to know about them
• Describes the overall distribution.
• Always on or above the horizontal axis.
• The area is exactly 1 underneath it.
EXAMPLE OF A DENSITY CURVE:
Imagine a histogram underneath the density curve shown:
EXAMPLE: #1: “Finding Means and Medians.”
The figures below display three density curves, each with
three points indicated. At which of these points on each
curve do the mean and the median fall?
a. Graph A:
i. Describe the shape:
ii. Mode:
iii. Median:
iv. Mean:
b. Graph B:
i. Describe the shape:
ii. Mode:
iii. Median:
iv. Mean:
c. Graph C:
i. Describe the shape:
ii. Mode:
iii. Median:
iv. Mean:
EXAMPLE: Unusual distribution
• Describe the SHAPE of this density curve.
• Mark with vertical lines the mean, median, and
mode.
MULTIPLE CHOICE PRACTICE:
For the following histogram, what is the proper
ordering of the mean, median, and mode? Note that
the graph is NOT numerically precise – only the
relative positions are important.
a. I = mean, II = median, III = mode
b. I = mode, II = median, III = mean
c. I = median, II = mean, III = mode
d. I = mode, II = mean, III = median
e. I = mean, II = mode, III = median
HOW TO THINK OF THE MEAN & MEDIAN:
Median of density curves –
Mean of density curves –
EXAMPLE: “Biking Accidents”
Accidents on a level, 3 mile bike path occur
uniformly along the length of the path. The figure
below displays the density curve that describes the
uniform distribution of accidents.
a. Explain why this curve satisfies the two
requirements for a density curve.
b. The proportion of accidents that occur in the
first mile of the path is the area under the
density curve between 0 miles and 1 mile. What
is the area?
c. Sue’s property adjoins the bike path between
the 0.8 mile mark and the 1.1 mile mark. What
proportion of accidents happen in front of Sue’s
property? Explain.
d. What is the mean of the density curve pictured
above?
e. What is the median?
f. What proportion of accidents occur between
the 0.5 mile mark and 1.2 mile mark OR
between the 2.7 mile mark and 3.0 mile mark?
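For a uniform density on a 3-mile path, the curve is a rectangle of height 1/3, so every proportion is just width times height. The sketch below works parts (b) through (f) under that assumption; `proportion` is a hypothetical helper name.

```python
PATH_LENGTH = 3.0          # miles
HEIGHT = 1 / PATH_LENGTH   # constant height so the total area is 1

def proportion(lo, hi):
    """Area under the uniform density between lo and hi miles."""
    lo, hi = max(lo, 0.0), min(hi, PATH_LENGTH)
    return max(hi - lo, 0.0) * HEIGHT

total_area = proportion(0, 3)
first_mile = proportion(0, 1)
sue = proportion(0.8, 1.1)
either = proportion(0.5, 1.2) + proportion(2.7, 3.0)  # disjoint, so add
mean_location = median_location = PATH_LENGTH / 2     # symmetric curve

print(first_mile, sue, either, mean_location)
```

Because the curve is symmetric, the balance point (mean) and the equal-areas point (median) coincide at the midpoint of the path.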
Examples of how density curves would
model various distributions:
REVIEW EXAMPLE: 400 Meter Sprint:
Both male and female athletes competed in a
400 meter sprint. The male times were
N(51.02, 0.25) and the females were N(51.95,
0.35).
(All times are in seconds)
Sarah ran the sprint in 51.89 seconds. Josh ran the
sprint in 51.53 seconds.
Which athlete did better?
Day #5: Normal Curves & Distributions
Symbols for mean and standard deviation:
Normal Curves -- _________________ in shape.
A ‘normal curve’ describes a ________________
distribution.
Characteristics:
• ______________________________
• ______________________________
• ______________________________
• The mean is located at the same place on the curve
as the _________________.
• Normal curves can come in different shapes & sizes,
as shown below.
• These two normal curves show the mean and
standard deviation.
• The standard deviation σ controls the spread of a
Normal curve. Curves with larger standard deviations
are MORE spread out.
• Normal distribution: a distribution described by a Normal
curve.
• Any particular Normal distribution is specified by
TWO numbers:
• The mean of a Normal distribution is at the center of
the symmetric Normal curve.
• The standard deviation is the distance from the
center to the inflection points on either side.
• We abbreviate the Normal distribution as follows:
• Let’s take a look at the standard normal distribution:
• I can use the 68–95–99.7 rule to estimate the percent
of observations from a Normal distribution that fall in
an interval involving points one, two, or three standard
deviations on either side of the mean.
68-95-99.7% Rule
AKA: ‘Empirical Rule’
• _____ of all observations fall within _______ from ______.
• _____ of all observations fall within _______ from ______.
• _____ of all observations fall within _______ from ______.
EXAMPLE: Potato Chips
The distribution of weights of 9 ounce bags of a particular
brand of potato chips is approximately Normal with a
mean µ = 9.12 ounces and a standard deviation of σ = 0.05
ounces.
a. Shade the region that is TWO standard deviations above
the mean.
b. What percent of bags weigh less than 9.02 ounces?
c. Between what weights do the middle 68% of bags fall?
d. What percent of 9 ounce bags of this brand of potato chips
weigh between 8.97 and 9.17 ounces?
e. A bag that weighs 9.07 ounces is at what percentile in this
distribution?
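Every cutoff in this example sits a whole number of standard deviations from the mean, so the 68-95-99.7 rule applies directly. The sketch below compares those rule-of-thumb answers with exact Normal areas, assuming Python 3.8+ for `statistics.NormalDist`.

```python
from statistics import NormalDist

bags = NormalDist(mu=9.12, sigma=0.05)

below_9_02 = bags.cdf(9.02)                # 2 SDs below the mean
middle_68 = (9.12 - 0.05, 9.12 + 0.05)     # mean plus or minus 1 SD
between = bags.cdf(9.17) - bags.cdf(8.97)  # from -3 SDs to +1 SD
pct_9_07 = bags.cdf(9.07)                  # 1 SD below the mean

print(f"(b) rule: 2.5%    exact: {below_9_02:.4f}")
print(f"(c) about {middle_68[0]:.2f} to {middle_68[1]:.2f} ounces")
print(f"(d) rule: 83.85%  exact: {between:.4f}")
print(f"(e) rule: 16th percentile  exact: {pct_9_07:.4f}")
```

The exact areas differ slightly from the rule because 68, 95, and 99.7 are themselves rounded values.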
EXAMPLE: Batting Averages
The histogram below shows the distribution of
batting average (proportion of hits) for the 432
Major League Baseball players with at least 100 plate
appearances in the 2009 season. The smooth curve
shows the overall shape of the distribution.
Describe the distribution above.
Describe the shape of the histogram.
Example: Batting Averages
In the previous example about batting averages
for Major League Baseball players in 2009, the mean of the
432 batting averages was 0.261 with a standard deviation
of 0.034.
Suppose that the distribution is exactly Normal with μ =
0.261 and σ = 0.034.
Problem:
(a) Sketch a Normal density curve for this distribution
of batting averages. Label the points that are 1, 2,
and 3 standard deviations from the mean.
(b) What percent of the batting averages are above
0.329? Show your work.
(c) What percent of the batting averages are between
0.227 and 0.295?
(d) What percent of the batting averages are greater
than 0.159?
(e) What percent of the batting averages are between
0.193 and 0.295?
Day #6: Review of Density Curves/
Empirical Rule/ Standard Normal
Calculations
• I can approximately locate the median (equal-areas point)
and the mean (balance point) on a density curve.
On the density curve below, draw two vertical lines where
you think the median and the mean of the distribution are.
Label each line, and describe in words what feature of the
curve you are using to locate each measure.
MULTIPLE CHOICE Practice: “Cockroaches”
The weights of laboratory cockroaches follow a Normal
distribution with mean 80 grams and standard deviation
of 2 grams. The figure below is the Normal curve for this
distribution of weights.
#1. Point C on this Normal curve corresponds to
a. 84 g b. 82 g c. 78 g d. 76 g e. 74 g.
#2. About what percent of the cockroaches have weights
between 76 and 84 grams?
a. 99.7% b. 95% c. 68% d. 47.5% e. 34%
#3. About what percent of the cockroaches have weights
less than 78 grams?
a. 34% b. 32% c. 16% d. 2.5% e. none of these.
• I can use the standard Normal distribution to calculate
the proportion of values in a specified interval.
Finding Areas under the Standard Normal Curve
#1. How do you use the standard normal table (Table A)
to find the area under the standard normal curve to the
left of a given z-value?
#2. How do you use the standard normal table (Table A)
to find the area under the standard normal curve to the
right of a given z-value?
#3. How do you use the standard normal table (Table A)
to find the area under the standard normal curve between
two given z-values?
MULTIPLE CHOICE PRACTICE:
A normal density curve has which of the
following properties?
a. It is symmetric.
b. It has a peak centered above its mean.
c. The spread of the curve is proportional to its
standard deviation.
d. All of the properties, (a) to (c), are correct.
e. None of the properties, (a) to (c), is correct.
Examples: Use the “Standard Normal Probabilities Table”
(Table A) to find the proportion of observations from a
standard normal distribution that satisfies each of the
following statements.
#1. z < 1.24
#2. z > -0.23
#3. –1.84 < z < 0.02
#4. –0.34 < z < 0.93
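Table A lists areas to the left of z, so a right-hand area is 1 minus the table entry and a between-area is the difference of two entries. The sketch below mirrors that logic with `statistics.NormalDist` (Python 3.8+) standing in for the table; the helper names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, SD 1

def area_left(z):
    return Z.cdf(z)               # what Table A reports directly

def area_right(z):
    return 1 - Z.cdf(z)           # complement of the table entry

def area_between(lo, hi):
    return Z.cdf(hi) - Z.cdf(lo)  # difference of two table entries

answers = [area_left(1.24), area_right(-0.23),
           area_between(-1.84, 0.02), area_between(-0.34, 0.93)]
print([round(a, 4) for a in answers])
```

Small differences from table-based answers can appear in the fourth decimal place because the table entries are rounded.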
EXTRA PRACTICE:
Using Table A (table of standard normal probabilities) or
your TI-83 calculator, find the proportion of observations
from a standard normal distribution that satisfies each of
the following statements. In each case, shade the area
under the standard normal curve that is the answer to
the question.
a. Z < -2.25
b. -2.25 < Z < 1.77
c. Z > 0.83
d. 2.25 < Z < 1.77
Example: Serving Speed
In the 2008 Wimbledon tennis tournament, Rafael Nadal
averaged 115 miles per hour (mph) on his first serves.
Assume that the distribution of his first serve speeds is
Normal with a mean of 115 mph and a standard deviation
of 6 mph.
a. About what proportion of his first serves would you
expect to exceed 120 mph?
b. What percent of Rafael Nadal’s first serves are
between 100 and 110 mph?
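Both parts standardize the serve speeds and then read off Normal areas. A minimal sketch, assuming the N(115, 6) model stated in the example and Python 3.8+ for `statistics.NormalDist`:

```python
from statistics import NormalDist

serves = NormalDist(mu=115, sigma=6)

# (a) Area to the right of 120 mph, where z = (120 - 115) / 6.
p_over_120 = 1 - serves.cdf(120)

# (b) Area between 100 and 110 mph.
p_100_to_110 = serves.cdf(110) - serves.cdf(100)

print(f"(a) {p_over_120:.4f}   (b) {p_100_to_110:.4f}")
```

Answers worked from Table A may differ in the third or fourth decimal place because the z-scores are rounded to two places before lookup.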
Day #7: Review of Standard Normal
Calculations
Multiple Choice Practice
#1. Suppose the distribution of a set of scores has a mean
of 28 and a standard deviation of 6. If 3 is added to each
score, what will be the mean and the standard deviation of
the distribution of new scores?
     Mean   Standard Deviation
a.   31     10
b.   31     6
c.   28     10
d.   28     18
e.   28     6
#2. Two measures of center are marked on the density curve
shown.
a. The median is at the solid line and the mean is at the
dotted line.
b. The median is at the dotted line and the mean is at the
solid line.
c. The mode is at the solid line and the median is at the
solid line.
d. The mode is at the solid line and the median is at the
dotted line.
e. The mode is at the dotted line and the mean is at the
solid line.
#3. A market research company employs a large number
of typists to enter data into a computer. The time taken
for new typists to learn the computer system is known to
have a normal distribution with a mean of 90 minutes and
a standard deviation of 18 minutes. The proportion of
typists that take more than two hours to learn the
computer system is
a. 0.952
b. 0.548
c. 0.048
d. 0.452
#4. A company produces packets of soap powder labeled
“Giant Size 32 Ounces.” The actual weight of soap powder
in a box has a normal distribution with a mean of 33
ounces and a standard deviation of 0.7 ounces. What
proportion of packets is underweight (i.e., weighs less
than 32 ounces)?
a. 0.0764
b. 0.2420
c. 0.7580
d. 0.9236
EXAMPLE: 2002 AP STATISTICS EXAM #3:
There are 4 runners on the New High School team. The
team is planning to participate in a race in which each
runner runs a mile. The team time is the sum of the
individual times for the 4 runners. Assume that the
individual times of the 4 runners are all independent of
each other. The individual times, in minutes, of the
runners in similar races are approximately normally
distributed with the following means and standard
deviations.
           Mean   Standard Deviation
Runner 1   4.9    0.15
Runner 2   4.7    0.16
Runner 3   4.5    0.14
Runner 4          0.15
Runner 3 thinks that he can run a mile in less than 4.2
minutes in the next race. Is this likely to happen? Explain.
EXAMPLE: 2006 AP STATISTICS EXAM #3:
Golf balls must meet a set of five standards in order to be
used in professional tournaments. One of these standards
is distance traveled. When a ball is hit by a mechanical
device, Iron Byron, with a 10-degree angle of launch, a
backspin of 42 revolutions per second, and a ball velocity
of 235 feet per second, the distance the ball travels may
not exceed 290.7 yards. Manufacturers want to develop
balls that will travel as close to the 290.7 yards as possible
without exceeding that distance. A particular
manufacturer has determined that the distances traveled
for the balls it produces are normally distributed with a
standard deviation of 2.6 yards. This manufacturer has a
new process that allows it to set the mean distance the
ball will travel.
If the manufacturer sets the mean distance traveled to
288.5 yards, what is the probability that a ball that is
randomly selected for testing will travel too far?
EXAMPLE: GRE EXAMS:
The Graduate Record Examinations are widely used
to help predict the performance of applicants to
graduate schools. The range of possible scores on a
GRE is 200 to 900. The psychology department at a
university finds that the scores of its applicants on
the quantitative GRE are approximately Normal with
mean = 544 and standard deviation = 103.
#1. Make an accurate sketch of the distribution of these
applicants’ GRE scores. Be sure to provide a scale on the
horizontal axis.
#2. Use the 68–95–99.7 rule to find the proportion of
applicants whose score is between 338 and 853.
#3. What proportion of GRE scores are below 500?
#4. What proportion of GRE scores are above 800?
#5. What proportion of GRE scores are between 558 and
757?
EXAMPLE: IQ SCORES:
Scores on the Wechsler Adult Intelligence Scale
(WAIS, a standard ‘IQ test’) for the 20 to 34 age
group are approximately normally distributed with μ
= 110 and σ = 25. Use the 68-95-99.7% rule to
answer these questions.
a. About what percent of people in this age group
have scores above 110?
b. About what percent have scores above 160?
c. In what range do the middle 95% of all IQ scores
lie?
d. Consider percentile ranks:
i. If someone’s score were reported as
the 16th percentile, about what score
would that individual have?
ii. 84th percentile?
iii. 97.5th percentile?
e. What percent of people age 20 to 34 have IQ
scores above 100?
f. What percent have scores above 150?
g. What percent have a score BETWEEN 68 and
115?
EXAMPLE: 1999 AP Statistics Exam Question #4
A company is considering implementing one of two
quality control plans for monitoring the weights of
automobile batteries that it manufactures. If the
manufacturing process is working properly, the
battery weights are approximately normally
distributed with a specified mean and standard
deviation.
• Quality control plan A calls for rejecting a battery
as defective if its weight falls more than 2
standard deviations below the specified mean.
• Quality control plan B calls for rejecting a battery
as defective if its weight falls more than 1.5
interquartile ranges below the lower quartile of the
specified population.
a. What proportion of batteries will be rejected by
plan A?
b. What proportion of batteries will be rejected
by plan B?
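Neither plan depends on the actual mean or standard deviation, so both can be worked on the standard Normal scale. The sketch below is one way to do that with `statistics.NormalDist` (Python 3.8+); the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # work in z-score units

# Plan A: reject if more than 2 SDs below the mean.
plan_a = Z.cdf(-2)

# Plan B: reject if more than 1.5 IQRs below the lower quartile.
q1, q3 = Z.inv_cdf(0.25), Z.inv_cdf(0.75)
iqr = q3 - q1
cutoff_b = q1 - 1.5 * iqr   # z-score of the plan B cutoff
plan_b = Z.cdf(cutoff_b)

print(f"Plan A: {plan_a:.4f}   Plan B: {plan_b:.4f} (z = {cutoff_b:.3f})")
```

Plan B's cutoff lands farther into the tail than plan A's, so it rejects a much smaller share of properly made batteries.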
Day #8: ‘Working Backwards’ with Normal
Calculations (using INVNorm)
• I can use the standard Normal distribution to
determine a z-score from a percentile.
• I can use Table A to find the percentile of a value from
any Normal distribution and the value that
corresponds to a given percentile.
Use the “Standard Normal Probabilities Table”
(Table A) to find the value z of a standard normal
variable that satisfies each of the following
conditions. (Use the value of z from Table A that
comes closest to satisfying the condition.)
#1. The point z with 30% of the observations falling
below it.
#2. The point z with 74% of the observations falling
above it.
#3. The 40th percentile.
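Working backwards means starting from an area to the left and finding the z-value. The sketch below uses `NormalDist.inv_cdf` from Python's standard library (3.8+) as a stand-in for the calculator's inverse Normal function or a reverse lookup in Table A.

```python
from statistics import NormalDist

Z = NormalDist()

z_30_below = Z.inv_cdf(0.30)      # 30% of observations below
z_74_above = Z.inv_cdf(1 - 0.74)  # 74% above means 26% below
z_40th = Z.inv_cdf(0.40)          # 40th percentile

print(round(z_30_below, 2), round(z_74_above, 2), round(z_40th, 2))
```

The key step in the second problem is converting "above" to "below" first, since both the table and the inverse function work with left-hand areas.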
EXAMPLE: GRE EXAMS:
The Graduate Record Examinations are widely used
to help predict the performance of applicants to
graduate schools. The range of possible scores on a
GRE is 200 to 900. The psychology department at a
university finds that the scores of its applicants on
the quantitative GRE are approximately Normal with
mean = 544 and standard deviation = 103.
Calculate and interpret the 34th percentile of the
distribution of applicants’ GRE scores.
EXAMPLE: IQ SCORES:
Scores on the Wechsler Adult Intelligence Scale
(WAIS, a standard ‘IQ test’) for the 20 to 34 age
group are approximately normally distributed with μ
= 110 and σ = 25. How high an IQ score is needed to
be in the highest 25%?
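For a non-standard Normal distribution, the usual route is to find the z-score for the percentile and then unstandardize with x = mean + z(SD). The sketch below lets `NormalDist.inv_cdf` (Python 3.8+) do both steps at once, under the Normal models stated in the two examples.

```python
from statistics import NormalDist

gre = NormalDist(mu=544, sigma=103)
iq = NormalDist(mu=110, sigma=25)

# 34th percentile of GRE scores: 34% of applicants score below this.
gre_34th = gre.inv_cdf(0.34)

# Highest 25% of IQ scores begins at the 75th percentile.
iq_top_25 = iq.inv_cdf(0.75)

print(f"GRE 34th percentile: {gre_34th:.1f}")
print(f"IQ needed for top 25%: {iq_top_25:.1f}")
```

As with the z-score problems, "highest 25%" has to be restated as "75% below" before the inverse calculation.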
EXAMPLE: Heights of three-year-old females
According to http://www.cdc.gov/growthcharts/,
the heights of 3 year old females are approximately
Normally distributed with a mean of 94.5 cm and a
standard deviation of 4 cm. What is the third
quartile of this distribution?
MULTIPLE CHOICE!!!
The time to complete a standardized exam is
approximately normal with a mean of 70 minutes
and a standard deviation of 10 minutes. How much
time should be given to complete the exam so that
80% of the students will complete the exam in the
time given?
a. 84 minutes
b. 78.4 minutes
c. 92.8 minutes
d. 79.8 minutes
EXAMPLE: Basketball
During the 2009-2010 basketball season, the number
of points scored in each game by the Boston Celtics
was approximately normally distributed with a mean
of 99.2 points and a standard deviation of 10.5
points.
a. What is the 33rd percentile of points scored by
the Celtics?
b. The mean number of points scored by Los
Angeles Lakers was 101.7. In what proportion
of their games did the Celtics score more than
the Lakers’ mean score?
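Parts (a) and (b) go in opposite directions: (a) starts from an area and asks for a value, while (b) starts from a value and asks for an area. A minimal sketch of both, assuming Python's `statistics.NormalDist` in place of Table A or the calculator:

```python
from statistics import NormalDist

celtics = NormalDist(mu=99.2, sigma=10.5)   # points per game, from the problem

# (a) Area -> value: the score with 33% of games below it.
p33 = celtics.inv_cdf(0.33)

# (b) Value -> area: cdf gives the area to the LEFT of 101.7,
#     so the proportion of games ABOVE 101.7 is the complement.
prop_above = 1 - celtics.cdf(101.7)

print(round(p33, 1), round(prop_above, 3))
```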
EXAMPLE:
#1. Find the proportion of observations from a
standard Normal distribution that satisfies
-1.51 < z < 0.84.
Sketch the Normal curve and shade the area under
the curve that is the answer to the question.
#2. What z-score in a Normal distribution has 58% of
all scores below it?
#3. What z-score in a Normal distribution has 33% of
all scores above it?
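For problem #1 above, the area between two z-scores is the area to the left of the larger one minus the area to the left of the smaller one. A short sketch, assuming Python's `statistics.NormalDist`; `area_between` is a hypothetical helper name.

```python
from statistics import NormalDist

Z = NormalDist()   # standard Normal: mean 0, standard deviation 1

def area_between(lo, hi):
    """Proportion of a standard Normal distribution with lo < z < hi."""
    return Z.cdf(hi) - Z.cdf(lo)

area = area_between(-1.51, 0.84)
print(round(area, 4))
```

This matches what Table A gives when the two left-tail areas are looked up and subtracted.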
EXAMPLE: Hand Sanitizer
A company produces packets of hand sanitizer that
are deemed to be underweight if less than 31.5
ounces. The actual weight of the packets of
sanitizer has a normal distribution with a mean of
32.8 oz. and a standard deviation of 0.9 oz. What
proportion of packets are underweight (i.e., weigh
less than 31.5 oz.)?
EXAMPLE: Inflection Points
Consider the ‘inflection points’ of this normal curve
and approximate the mean and standard deviation
values.
Day #9: Review of ‘Invnorm’/
Assessing Normality
#1. In a large set of data that are approximately
normally distributed,
 d is the value in the data set that has a
z-score of 1.50,
 e is the value of the third quartile, and
 f is the value of the 80th percentile.
What is the correct order from least to greatest
for the values of e, f and d?
#2. A distribution of scores is approximately
normal with a mean of 82 and standard
deviation of 4.6. What equation can be used to
find the score ‘x’ above which 35 percent of the
scores fall?
#3. The weights of laboratory cockroaches
follow a Normal distribution with mean 80
grams and standard deviation of 2 grams. How
much would a cockroach weigh, if it ranks at the
80th percentile?
#4. The weights of Russet potatoes are
normally distributed with a mean of 1.3 pounds
and a standard deviation of 0.37 pounds. What
is the probability that a randomly selected
potato will weigh more than 1.65 pounds?
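For problems #1 and #2 above, everything can be put on the z-score scale first. The sketch below assumes Python's `statistics.NormalDist`, and the variable names are made up for the illustration. In a Normal distribution a larger z-score always means a larger data value, so ordering the z-scores orders d, e, and f.

```python
from statistics import NormalDist

Z = NormalDist()   # standard Normal

# Problem #1: compare d, e, f through their z-scores.
z_scores = {
    "d": 1.50,              # given directly
    "e": Z.inv_cdf(0.75),   # third quartile = 75th percentile
    "f": Z.inv_cdf(0.80),   # 80th percentile
}
order = sorted(z_scores, key=z_scores.get)   # least to greatest

# Problem #2: 35% above x means 65% below x,
# so the equation to solve is (x - 82) / 4.6 = z.
z_cut = Z.inv_cdf(1 - 0.35)
x_cut = 82 + z_cut * 4.6

print(order, round(z_cut, 3), round(x_cut, 2))
```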
Assessing Normality
 I can make an appropriate graph to determine if a
distribution is bell-shaped.
Normal Probability Plot – a graph that provides a good
assessment of how _____________ a distribution is.
Example: No Space in the Fridge?
The measurements listed below describe the useable
capacity (in cubic feet) of a sample of 36 side-by-side
refrigerators. <source: Consumer Reports, May 2010>
Are the data close to Normal?
12.9 13.7 14.1 14.2 14.5 14.5 14.6 14.7 15.1 15.2
15.3 15.3 15.3 15.3 15.5 15.6 15.6 15.8 16.0 16.0
16.2 16.2 16.3 16.4 16.5 16.6 16.6 16.6 16.8 17.0
17.0 17.2 17.4 17.4 17.9 18.4
Here is a histogram of these data. It seems roughly
symmetric and bell shaped.
 I can use the 68-95-99.7 rule to assess Normality of a
data set.
The mean and standard deviation of these data are x̄ =
15.825 and s_x = 1.217.
 x̄ ± 1s_x = (14.608, 17.042)    24 of 36 = 66.7%
 x̄ ± 2s_x = (13.391, 18.259)    34 of 36 = 94.4%
 x̄ ± 3s_x = (12.174, 19.476)    36 of 36 = 100%
These percents are quite close to what we
would expect based on the 68-95-99.7 rule.
Combined with the graph, this gives good
evidence that this distribution is close to
Normal.
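The counts above can be reproduced with a short script. This is a sketch only, assuming Python's standard `statistics` module and the 36 refrigerator capacities listed earlier; the variable names are invented for the illustration.

```python
from statistics import mean, stdev

# Usable capacity (cubic feet) of the 36 side-by-side refrigerators.
capacities = [
    12.9, 13.7, 14.1, 14.2, 14.5, 14.5, 14.6, 14.7, 15.1, 15.2,
    15.3, 15.3, 15.3, 15.3, 15.5, 15.6, 15.6, 15.8, 16.0, 16.0,
    16.2, 16.2, 16.3, 16.4, 16.5, 16.6, 16.6, 16.6, 16.8, 17.0,
    17.0, 17.2, 17.4, 17.4, 17.9, 18.4,
]

x_bar = mean(capacities)
s_x = stdev(capacities)   # sample standard deviation (divides by n - 1)

# Count how many values fall within 1, 2, and 3 standard deviations of the mean.
counts = {}
for k in (1, 2, 3):
    lo, hi = x_bar - k * s_x, x_bar + k * s_x
    counts[k] = sum(lo < x < hi for x in capacities)
    pct = 100 * counts[k] / len(capacities)
    print(f"within {k} SD: ({lo:.3f}, {hi:.3f})  {counts[k]} of 36 = {pct:.1f}%")
```

Comparing the three percents to 68, 95, and 99.7 is the whole check. It supports Normality but does not prove it, which is why it is combined with a graph.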
 I can interpret a Normal probability plot
EXAMPLE: Here is a Normal probability plot (also
called a Normal quantile plot) of the refrigerator data
from the previous page. Interpret this normal
probability plot.
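To see where the points on such a plot come from, here is a rough sketch that pairs each sorted data value with the z-score expected for its rank. It assumes Python's `statistics.NormalDist` and the plotting position (i - 0.5)/n, which is one common convention; calculators and software may use a slightly different formula. The function name `normal_plot_points` is hypothetical.

```python
from statistics import NormalDist

# Usable capacity (cubic feet) of the 36 side-by-side refrigerators.
capacities = [
    12.9, 13.7, 14.1, 14.2, 14.5, 14.5, 14.6, 14.7, 15.1, 15.2,
    15.3, 15.3, 15.3, 15.3, 15.5, 15.6, 15.6, 15.8, 16.0, 16.0,
    16.2, 16.2, 16.3, 16.4, 16.5, 16.6, 16.6, 16.6, 16.8, 17.0,
    17.0, 17.2, 17.4, 17.4, 17.9, 18.4,
]

def normal_plot_points(data):
    """Pair each sorted data value with the z-score expected for its rank."""
    ordered = sorted(data)
    n = len(ordered)
    z = NormalDist()
    # Assumed plotting position (i - 0.5) / n; other conventions exist.
    return [(x, z.inv_cdf((i - 0.5) / n)) for i, x in enumerate(ordered, start=1)]

points = normal_plot_points(capacities)

# Data value on the horizontal axis, expected z-score on the vertical axis.
for x, z_expected in points[:3] + points[-3:]:
    print(f"{x:5.1f}  {z_expected:6.2f}")
```

If the plotted points fall close to a straight line, the data are approximately Normal; systematic bending away from a line points to skewness or outliers.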
INTERPRET:
EXAMPLE: State land areas
Problem: The histogram and Normal probability plot
below display the land areas for the 50 states. Is this
distribution approximately Normal?
EXAMPLE: NBA free throw percentage
This is an example of a distribution that is skewed to
the left. Notice that the lowest free throw
percentages are to the left of what we would
expect and the highest free throw percentages are
not as far to the right as we would expect. Interpret
this normal probability plot.
EXAMPLE:
A normal probability plot of the survival times
of the guinea pigs in a medical experiment is
shown below. Use this plot to describe the
shape of the distribution of survival times. Then
explain carefully how this shape is seen in the
normal probability plot.
#2. Multiple Choice Question: The plot shown
is a normal probability plot for a set of data.
The data value is plotted on the x-axis, and the
standardized value is plotted on the y-axis.
Which statement is true for this data set?
a. The data are clearly normally
distributed.
b. The data are approximately normally
distributed.
c. The data are clearly skewed to the right.
d. The data are clearly skewed to the left.
e. There is insufficient information to
determine the shape of the distribution.
#3. EXAMPLE: A Normal probability plot for
the weights of 40 squirrels trapped and
released on a college campus is shown below.
Is the distribution of squirrel weights
approximately Normal? Justify your answer.
Day #10: FRAPPY PRACTICE/ Test Review
AP Statistics – Chapter 2 Free Response Question
The Dow Jones Industrial Average (“The Dow”) is an
index measuring the stock performance of
30 large American companies, and is often used as a
measure of overall economic growth in the
United States. Below is Minitab output describing
the daily percentage changes in the Dow for
the first three months of 2009 and the first three
months of 2010. (Note that the market was open for
61 days during the first three months of each year. A
negative value indicates a percentage decrease in
the index for that day).
Descriptive Statistics: Dow 2009, Dow 2010
Variable   N   Mean    StDev  Min     Q1      Median  Q3     Max
Dow 2009   61  -0.198  2.331  -4.660  -1.530  -0.310  1.150  6.820
Dow 2010   61   0.078  0.821  -2.640  -0.270   0.110  0.465  1.660
Both distributions are approximately Normally
distributed.
#1. Consider a day when the Dow increased by 1%.
In which year, 2009 or 2010, would such a
day be considered a better day for the stock market,
relative to other days in that year?
Provide appropriate statistical calculations to
support your answer.
#2. Based on these data, estimate the number of
days in 2009 that the Dow decreased by more than
1% in these 61 days.
#3. Estimate the 19th percentile of daily change for
the first three months of 2010.
AP Statistics – Chapter 2 Free Response Question
RUBRIC/Scoring Criteria
The Dow Jones Industrial Average (“The Dow”) is an index
measuring the stock performance of
30 large American companies, and is often used as a
measure of overall economic growth in the
United States. Below is Minitab output describing the daily
percentage changes in the Dow for the first three months
of 2009 and the first three months of 2010. (Note that the
market was open for 61 days during the first three months
of each year. A negative value indicates a percentage
decrease in the index for that day).
Descriptive Statistics: Dow 2009, Dow 2010
Variable   N   Mean    StDev  Min     Q1      Median  Q3     Max
Dow 2009   61  -0.198  2.331  -4.660  -1.530  -0.310  1.150  6.820
Dow 2010   61   0.078  0.821  -2.640  -0.270   0.110  0.465  1.660
Both distributions are approximately Normally
distributed.
#1. Consider a day when the Dow increased by 1%. In
which year, 2009 or 2010, would such a day be considered
a better day for the stock market, relative to other days in
that year? Provide appropriate statistical calculations to
support your answer.
To get an ‘E’, student must do BOTH of the
following correctly:
 Student must properly calculate the two z-scores.
 Student must properly decide which day is
‘considered a better day for the stock market.’
To get a ‘P’, student must do ONE of the
components correctly.
A student gets an ‘I’ if they do NEITHER of the
components correctly.
#2. Based on these data, estimate the number of days in
2009 that the Dow decreased by more than 1% in these 61
days.
To get an ‘E’, student must do ALL THREE of the
following correctly:
 Student must draw a normal curve correctly.
 Student must show a reasonable amount of
‘work’.
 Student must get the correct ANSWER.
To get a ‘P’, student must do TWO of the THREE
components correctly.
A student gets an ‘I’ if they do ONLY ONE or NONE
of the components correctly.
#3. Estimate the 19th percentile of daily change for the
first three months of 2010.
To get an ‘E’, student must do BOTH of the
following correctly:
 Student must show a reasonable amount of
‘work’.
 Student must get the correct ANSWER.
To get a ‘P’, student must do ONE of the
components correctly.
A student gets an ‘I’ if they do NEITHER of the
components correctly.
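The calculations that the three questions call for can be sketched as follows. This is not the official scoring key. It assumes Python's `statistics.NormalDist`, treats each year's daily changes as exactly Normal with the mean and standard deviation from the Minitab output, and uses variable names invented for the illustration.

```python
from statistics import NormalDist

# Mean and StDev taken from the Minitab output for each year.
dow_2009 = NormalDist(mu=-0.198, sigma=2.331)
dow_2010 = NormalDist(mu=0.078, sigma=0.821)

# #1: z-score of a +1% day in each year; the larger z is the better day
#     relative to the other days in that year.
z_2009 = (1 - dow_2009.mean) / dow_2009.stdev
z_2010 = (1 - dow_2010.mean) / dow_2010.stdev

# #2: proportion of 2009 days with a change below -1%, scaled to 61 days.
days_down_more_than_1 = 61 * dow_2009.cdf(-1)

# #3: 19th percentile of daily change in 2010.
p19_2010 = dow_2010.inv_cdf(0.19)

print(round(z_2009, 2), round(z_2010, 2))
print(round(days_down_more_than_1, 1))
print(round(p19_2010, 2))
```

A full-credit response would still need the sketch of the Normal curve and the written conclusion in context that the rubric asks for.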