Download Chapter 2

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Bootstrapping (statistics) wikipedia , lookup

Taylor's law wikipedia , lookup

History of statistics wikipedia , lookup

Regression toward the mean wikipedia , lookup

Transcript
Chapter 2
Section 2.1
Check Your Understanding, page 89:
1. c
2. Her daughter weighs more than 87% of girls her age and she is taller than 67% of girls her age.
3. About 65% of calls lasted less than 30 minutes. This means that about 35% of calls lasted 30 minutes
or longer.
4. The first quartile (25th percentile) is at about 14 minutes. The third quartile (75th percentile) is at about
32 minutes. This suggests that the IQR = 32 − 14 = 18 minutes.
Check Your Understanding, page 91:
1. z =
class.
65 − 67
= −0.466. Lynette’s height is about 0.46 standard deviations below the mean height of the
4.29
=
2. Brent’s z-score
is z
height of the class.
74 − 67
= 1.63. Brent’s height is about 1.63 standard deviations above the mean
4.29
74 − 76
3. Since Brent’s z-score is −0.85, we know that −0.85 = . Solving this for σ we find that
σ = 2.35 inches.
σ
Check Your Understanding, page 97:
1. Converting the heights from inches to centimeters will not change the shape. However, it will
multiply the center and spread by 2.54.
2. Adding 6 inches to each of the students’ heights will not change the shape of the distribution, nor will
it change the spread. It will, however, add 6 inches to the center.
3. Converting the class heights to z-scores will not change the shape of the distribution. It will change
the mean to 0 and the standard deviation to 1.
Check Your Understanding, page 103:
1. This figure depicts a legitimate density curve because it is positive everywhere and it has total area of
.
2. About 12% of the observations lie between 7 and 8.
Chapter 2: Modeling Distributions of Data
35
3. Point A in the graph below is the approximate median.
4. Point B in the graph above is the approximate mean. The mean is less than the median in this case
because the distribution is skewed to the left.
Exercises, page 105:
2.1 (a) Putting the data in order we get: 13 13 13 15 19 22 23 23 24 26 26 30 31 34 38 49 50 50 51 57.
5
The girl with 22 pairs of shoes is the 6th smallest. Therefore, her percentile is
= 0.25. In other words,
20
25% of the girls had fewer pairs of shoes than she did. (b) Putting the data in order we get: 4 5 5 5 6 7 7 7
7 8 10 10 10 10 11 12 14 22 35 38. The boy with 22 pairs has more shoes than 17 people. Therefore, his
17
percentile is
= 0.85. In other words, 85% of boys had fewer pairs of shoes than he did.(c) The boy is
20
more unusual because only 15% of the boys have as many or more than he has, while the girl has a value
that is more centered in the distribution. 25% have fewer and 75% have as many or more.
2.2 (a) Since 4 states have smaller values for the percent of residents aged 65 and older, the percentile
4
for Colorado is
= 0.08. In other words, 8% of states have a smaller percent of residents aged 65 and
50
older. (b) Since 40 states have smaller values for the percent of residents aged 65 and older, the
40
percentile for Rhode Island is
= 0.80. In other words, 80% of states have a smaller percent of
50
residents aged 65 and older. (c) Colorado is the more unusual state having only 8% of the states with
smaller percentages of residents aged 65 and older. Rhode Island has a value that is closer to the center of
the distribution with 80% lower and 20% as larger or larger.
2.3 According to the Los Angeles Times, the speed limits on California highways are such that 85% of
the vehicle speeds on those stretches of road are less than the speed limit.
2.4 Larry’s wife should gently break the news that being in the 90th percentile is not good news in this
situation. About 90% of men similar to Larry have lower blood pressures. The doctor was suggesting
that Larry take action to lower his blood pressure.
36
The Practice of Statistics for AP*, 4/e
2.5 The girl in question weighs more than 48% of girls her age, but is taller than 78% of the girls her age.
Since she is taller than 78% of girls, but only weighs more than 48% of girls, she is probably fairly
skinny.
2.6 Peter’s time was slower than 80% of his previous race times that season, but it was slower than only
50% of the racers at the league championship meet.
2.7 (a) The highlighted student sent about 212 text messages in the 2-day period which placed her at
about the 80th percentile. (b) The median number of texts is the same as the 50th percentile. Locate 50%
on the y-axis, read over to the points and then find the relevant place on the x-axis. The median is
approximately 115 text messages.
2.8 (a) Maryland has about 13% foreign-born residents placing it at about the 70th percentile. (b) Locate
30% on the y-axis, read over to the points and then find the relevant place on the x-axis. The 30th
percentile is approximately 4.5% foreign-born.
2.9 (a) First find the quartiles. The first quartile is the 25th percentile. Find 25 on the y-axis, read over to
the line and then down to the x-axis to get about $19. The 3rd quartile is the 75th percentile. Find 75 on
the y-axis, read over to the line and then down to the x-axis to get about $50. So the interquartile range is
$50 − $19 =
$31. (b) The person who spent $19.50 is just above what we have called the 25th percentile.
It appears that $19.50 is at about the 26th percentile. (c) The graph is below:
2.10 (a) To find the 60th percentile, find 60 on the y-axis, read over to the line and then read down to the
x-axis to find approximately 1000 hours. (b) To find the percentile for the lamp that lasted 900 hours,
find 900 on the x-axis, read up to the line and across to the y-axis to find that it is approximately the 35th
percentile.
2.11 Eleanor’s standardized=
score, z
=
z
27 − 18
= 1.5.
6
Chapter 2: Modeling Distributions of Data
680 − 500
= 1.8, is higher than Gerald’s standardized score,
100
37
2.12 The standardized batting averages (z-scores) for these three outstanding hitters are:
Player
z-score
Cobb
0.420 − 0.266
=
z = 4.15
0.0371
Williams
0.406 − 0.267
=
z = 4.26
0.0326
Brett
0.390 − 0.261
=
z = 4.07
0.0317
All three hitters were at least 4 standard deviations above their peers, but Williams’ z-score is the highest.
2.13 (a) Judy’s bone density score is about one and a half standard deviations below the average score
for all women her age. The fact that your standardized score is negative indicates that your bone density
is below the average for your peer group. The magnitude of the standardized score tells us how many
standard deviations you are below the average (about 1.5). (b) If we let σ denote the standard deviation
of the bone density in Judy’s reference population, then we can solve for σ in the equation
948 − 956
−1.45 =
. Thus, σ = 5.52 grams/cm2.
σ
2.14 (a) Mary’s z-score (0.5) indicates that her bone density score is about half a standard deviation
above the average score for all women her age. Even though the two bone density scores are exactly the
same, Mary is 10 years older so her z-score is higher than Judy’s (−1.45). Judy’s bones are healthier
when comparisons are made to other women in their age groups. (b) If we let σ denote the standard
deviation of the bone density in Mary’s reference population, then we can solve for σ in the equation
948 − 944
0.5 =
. Thus, σ = 8 grams/cm2. There is more variability in the bone densities for older
σ
women, which is not surprising.
2.15 (a) Since 22 salaries were less than Lidge’s salary, his salary is at the
22
= 75.86 percentile. (b)
29
6,350,000 − 3,388,617
= 0.79. Lidge’s salary was 0.79 standard deviations above the mean salary of
3,767, 484
$3,388,617.
z
2.16 Madson’s salary in 2008 was $1,400,000. Since there were 14 salaries less than Madson’s, his
14
salary is at the
= 48 percentile. Note that we could also argue that since Madson’s salary was the
29
1, 400,000 − 3,388,617
median, his percentile would be 50%. Madson’s z-score is z =
= −0.53. Madson
3,767, 484
had a typical salary compared to the rest of the team since his percentile was 50%, but there were some
players who made much more than he did since his z-score is negative, indicating a salary below the
mean team salary.
2.17 (a) In the national group, about 94.8% of the test takers scored below 65. Scott’s percentiles, 94.8th
among the national group and 68th within the school, indicate that he did better among all test takers than
38
The Practice of Statistics for AP*, 4/e
he did among the 50 boys at his school. (b) Scott’s z-scores
are z
=
group
and z
=
64 − 58.2
= 0.62 among the 50 boys at his school.
9.4
64 − 46.9
= 1.57 among the national
10.9
2.18 The boys at Scott’s school did very well on the PSAT. Scott’s score was relatively better when
compared to the national group than to his peers at school. Only 5.2% of the test takers nationally scored
65 or higher, yet about 32% scored 65 or higher at Scott’s school.
2.19 (a) The mean and the median both increase by 18 so the mean is 87.188 and the median is 87.5.
Mean New =
=
sum of student heights standing on chairs
number of students
( height of first student + 18) +  + ( height of last student + 18 )
number of students
sum of student heights standing on floor + 18 ∗ number of students
number of students
= Mean Old +=
18 69.188 +=
18 87.188.
=
The median is still the height of the middle student. Now that this student is standing on a chair 18 inches
from the ground, the median will be 18 inches larger. (b) The standard deviation and IQR do not change.
For the standard deviation, note that although the mean increased by 18, the observations each increased
by 18 as well so that the deviations did not change. For the IQR, Q1 and Q3 both increase by 18 so that
their difference remains the same as in the original data set.
2.20 (a) The mean and median salaries will each increase by $1000 (the distribution of salaries just shifts
by $1000). (b) The extremes and quartiles will also each increase by $1000. The standard deviation will
not change. Nothing has happened to affect the variability of the distribution. The center has shifted
location, but the spread has not changed.
2.21 (a) To give the heights in feet, not inches, we would divide each observation by 12 (12 inches = 1
foot). Thus
height of first student (inches)
height of last student(inches)
+ +
12
12
Mean New =
number of students
1  height of first student (inches) +  + height of last student (inches) 


12 
number of students

1
 1
= =
Mean Old  =
 69.188 5.77 feet.
12
 12 
=
The median is still the height of the middle student. To convert this height to feet, we divide by 12:
69.5
Median=
= 5.79 feet.
New
12
Chapter 2: Modeling Distributions of Data
39
(b) To find the standard deviation in feet, note that each deviation in terms of feet is found by dividing
the original deviation by 12.
standard deviation New
(first deviation (ft)) 2 + ... + (last deviation (ft)) 2
=
n-1
(first deviation (in)) 2 + ... + (last deviation (in)) 2
=
n −1
1
3.2
=
∗ standard deviation Old =
=
0.27 feet.
12
12
The first and third quartiles are still the medians of the first and second halves of the data; these values
must simply be converted to feet. To do this, divide the first and third quartiles of the original data set by
67.75
71
12:
=
Q1 = 5.65 feet and Q=
= 5.92 feet. So the interquartile range is
3
12
12
IQR = 5.92 − 5.65 = 0.27 feet.
2.22 (a) The mean and median will each increase by 5% since all of the observations will increase by
5%. (b) The 5% raise will increase the distance of the quartiles from the median. The quartiles and the
standard deviation will each increase by 5% (the original values will be multiplied by 1.05).
2.23 To find the mean temperature in degrees Fahrenheit multiply by
9
and add 32 so we get
5
9
9
= 77 degrees Fahrenheit. To find the standard deviation, we just multiply by
( 25) + 32
5
5
since adding 32 just shifts the distribution and does not affect the spread. So we get
9
=
=
standard deviation
( 2 ) 3.6 degrees Fahrenheit.
f
5
Mean=
f
2.24 To get a correct measurement in inches we need to subtract the 0.2 inches that Clarence mistakenly
added. Then to transform that same measurement to cm we need to multiply by 2.54. So the new mean is
7.62 cm. The change in location does not affect the standard deviation so we just
( 3.2 − 0.2 ) ∗ 2.54 =
multiply the old standard deviation by 2.54. 0.1( 2.54 ) = 0.254 cm.
2.25 Sketches will vary. Use them to confirm that the students understand the meaning of (a) symmetric
and bimodal
2.26 Sketches will vary. Use them to confirm that the students understand the meaning of skewed to the
left.
1
by 3 rectangle, the
3
1
area beneath the curve is 1. (b) One third of accidents occur in the first mile: this is a by 1 rectangle,
3
2.27 (a) It is on or above the horizontal axis everywhere, and because it forms a
40
The Practice of Statistics for AP*, 4/e
1
1
. (c) One-tenth of accidents occur next to Sue’s property: this is a by 0.3
3
3
rectangle, so the proportion is 0.1.
so the proportion is
2.28 (a) The area under the curve is a rectangle with height 1 and width 1. Thus, the total area under the
curve is 1(1) = 1. (b) The area under the uniform distribution between 0.8 and 1 is 0.2 (1) = 0.2, so 20% of
the observations lie above 0.8. (c) The area under the uniform distribution between 0.25 and 0.75 is
0.5 (1) = 0.5, so 50% of the observations lie between 0.25 and 0.75.
2.29 Both are 1.5. The mean is 1.5 because this is the obvious balance point of the rectangle. The
median is also 1.5 because the distribution is symmetric (so that median = mean) and because half of the
area lies to the left and half to the right of 1.5.
2.30 Both are 0.5. The mean is 0.5 because this is the obvious balance point of the rectangle. The
median is also 0.5 because the distribution is symmetric (so that median = mean) and because half of the
area lies to the left and half to the right of 0.5.
2.31 (a) Mean is C, median is B (the right skew pulls the mean to the right). (b) Mean is B, median is B
(this distribution is symmetric).
2.32 (a) Mean is A, median is A (the distribution is symmetric). (b) Mean A, median B (the left skew
pulls the mean to the left).
2.33 c
2.34 b
2.35 c
2.36 b
2.37 d
2.38 e
2.39 The distribution is skewed to the right since most of the values are 25 minutes or less, but the values
stretch out up to about 90 minutes. The data is centered roughly around 20 minutes and the range of the
distribution is close to 90 minutes. The two largest values appear to be outliers.
Chapter 2: Modeling Distributions of Data
41
2.40 (a) A bar chart is given below:
(b) 6%. Since 6% of the sample was left-handed,
population that is left-handed is 6%.
3
, our best estimate of the percentage of the
50
Section 2.2
Check Your Understanding, page 114:
1. The graph is shown below:
2. Since 67 inches is one standard deviation above the mean, approximately
women have heights greater than 67 inches.
1 − 0.68
= 16% of young
2
3. Since 62 is one standard deviation below the mean and 72 is three standard deviations above the mean,
0.68 0.997
approximately
+
=
84% of young women have heights between 62 and 72 inches.
2
2
42
The Practice of Statistics for AP*, 4/e
Check Your Understanding, page 119:
1. The proportion is 0.9177. A graph is shown below:
2. The proportion is 0.9842. A graph is shown below:
3. The proportion is 0.9649 − 0.2877 =
0.6772. A graph is shown below:
Chapter 2: Modeling Distributions of Data
43
4. The z-score for the 20th percentile is -0.84. A graph is shown below:
5. The 55th percentile is the z-value where 45% are greater than that value. z = 0.13. A graph is shown
below:
Check Your Understanding, page 124:
240 − 170
= 2.33. The proportion of z30
scores above 2.33 is 1 − 0.9901 =
0.0099. A graph is shown below:
=
1. For 14-year old boys with cholesterol 240, the z-score
is z
44
The Practice of Statistics for AP*, 4/e
200 − 170
= 1. The proportion of z-scores between 1
30
and 2.33 is 0.9901 − 0.8413 =
0.1488. A graph is shown below:
2. The z-score for a cholesterol level of=
200 is z
3. The 80th percentile of the Standard Normal distribution is 0.84 (see graph below). This means that the
distance, x, of Tiger Woods’ drive lengths that satisfies the 80th percentile is the solution to
x − 304
0.84 =
. Solving for x, we get 310.72 yards.
8
Exercises, page 131:
2.41 The Normal density curve with mean 69 and standard deviation 2.5 is shown below.
Chapter 2: Modeling Distributions of Data
45
2.42 The Normal distribution for the weights of 9-ounce bags of potato chips is shown below.
The interval containing weights within 1 standard deviation of the mean goes from 9.07 to 9.17. The
interval containing weights within 2 standard deviations of the mean goes from 9.02 to 9.22. The interval
containing weights within 3 standard deviations of the mean goes from 8.97 to 9.27.
2.43 (a) Approximately 2.5% of men are taller than 74 inches, which is 2 standard deviations above the
mean. (b) Approximately 95% of men have heights between 69 − 5 =
64 inches and 69 + 5 =
74 inches.
(c) Approximately 16% of men are shorter than 66.5 inches, because 66.5 is one standard deviation below
the mean. Approximately 2.5% are shorter than 64 inches, because 64 inches is two standard deviations
below the mean. So approximately 16% − 2.5% =
13.5% of men have heights between 64 inches and
66.5 inches. (d) The value 71.5 is one standard deviation above the mean. Thus, the area to the left of
71.5 is 0.68 + 0.16 = 0.84. In other words, 71.5 is the 84th percentile of adult male American heights.
2.44 (a) Approximately 2.5% of bags weigh less than 9.02 ounces, because 9.02 is two standard
deviations below the mean. (b) Approximately 68% of bags have weights between
ounces and 9.12 + 0.05 =
9.17 ounces. (c) Approximately 84% of bags have weights less than 9.17
ounces, because 9.17 is one standard deviation above the mean. Approximately 0.15% of bags have
weight less than 8.97 ounces, because 8.97 is three standard deviations below the mean. So
approximately 84% − 0.15% =
83.85% of bags have weight between 8.97 and 9.17 ounces. (d) The
value 9.07 is one standard deviation below the mean. This means that the area to the left of 9.07 is 0.16.
In other words, 9.07 is the 16th percentile of the weights of these potato chip bags.
2.45 The standard deviation is approximately 0.2 for the tall, more concentrated one and 0.5 for the short,
less concentrated one.
2.46 The mean is 10 and the standard deviation is about 2. The entire range of scores is about
16 − 4 =
12. Since the range of values in a normal distribution is about six standard deviations, we have
12
= 2 for σ.
6
2.47 (a) 0.9978. The graph is shown below (left). (b) 1 − 0.9978 = 0.0022. The graph is shown below
(right).
46
The Practice of Statistics for AP*, 4/e
(c) 1 – 0.0485 = 0.9515. The graph is show below (left). (d) 0.9978 – 0.0485 = 0.9493. The graph is
show below (right).
2.48 (a) 0.0069. The graph is show below (left). (b) 1 − 0.9931 = 0.0069. The graph is shown below
(right).
(c) 0.9931 − 0.8133 = 0.1798. The graph is shown below (left). (d) 0.1020 − 0.0016 = 0.1004. The
graph is shown below (right).
2.49 (a) 0.9505 − 0.0918 =
0.8587. The graph is shown below (left). (b) 0.9633 − 0.6915 =
0.2718. The
graph is shown below (right).
Chapter 2: Modeling Distributions of Data
47
2.50 (a) 0.7823 − 0.0202 =
0.7621. The graph is shown below (left). (b) 0.3745 − 0.1335 =
0.241. The
graph is shown below (right).
2.51 (a) The value that is closest to 0.1000 in Table A is 0.1003. This corresponds to a value of
66 th percentile.
−1.28 for z. (b) The point where 34% of observations are greater is also the 10 − 34 =
The value that is closest to 0.6600 in Table A is 0.6591 which corresponds to a z-score of 0.41.
2.52 (a) The value that is closest to 0.6300 in Table A is 0.6293 which corresponds to a z-value of 0.33.
(b) If 75% of values are greater than z, then 25% are lower. The value that is closest to 0.2500 in Table
A is 0.2514 which corresponds to a z-score of -0.67.
2.53 (a) State: Let x = the length of pregnancies. The variable x has a Normal distribution with
µ = 266 days and σ = 16 days. We want the proportion of pregnancies that last less than 240 days. Plan:
The proportion of pregnancies lasting less than 240 days is shown in the graph below (left).
240 − 266
= −1.63, so x < 240 corresponds to z < −1.63. Using table A we
16
see that the proportion of observations less than -1.63 is 0.0516 or about 5.2%. Conclude: About 5.2%
of pregnancies last less than 240 days which means that 240 is approximately the 5th percentile.
Do: For x = 240 we have z =
48
The Practice of Statistics for AP*, 4/e
(b) State: Let x = the length of pregnancies. The variable x has a Normal distribution with µ = 266 days
and σ = 16 days. We want the proportion of pregnancies lasting between 240 and 270 days. Plan: The
proportion of pregnancies lasting between 240 and 270 days is shown in the graph above (right). Do:
170 − 266
From part (a) we have that for x = 240, z = −1.63. For x = 270, we
have z = 0.25. Using
=
16
table A we see that the proportion of observations less than 0.25 is 0.5987. So the proportion of
observations between −1.63 and 0.25 is 0.5987 − 0.0516 =
0.5471 or about 55%. Conclude:
Approximately 55% of pregnancies last between 240 and 270 days. (c) State: Let x = the length of
pregnancies. The variable x has a Normal distribution with µ = 266 days and σ = 16 days. We want the
number of days such that 80% of people have shorter pregnancies than that number of days. Plan: The
80th percentile for the length of human pregnancy is shown in the graph below. Do: Using Table A, the
80th percentile for the standard Normal distribution is 0.84. Therefore, the 80th percentile for the length of
x − 266
human pregnancy can be found by solving the equation 0.84 =
for x. Thus,
16
=
x ( 0.84 )16 + 266
= 279.44. Conclude: The longest 20% of pregnancies last approximately 279 or
more days.
2.54 State: Let x = the IQ scores of people aged 20 to 34. The variable x has a Normal distribution with
µ = 110 and σ = 25. We want the proportion of IQ scores that are less than 150. Plan: The proportion
of IQ scores less than 150 is shown in the graph below (left).
150 − 110
= 1.6. Using Table A we see that the proportion of observations
25
less than 1.6 is 0.9452 or approximately 94.5%. Conclude: An IQ score of 150 is approximately the
95th percentile. (b) State: Let x = the IQ scores of people aged 20 to 34. The variable x has a Normal
distribution with µ = 110 and σ = 25. We want the proportion of IQ scores that are between 125 and 150.
Plan: The proportion of IQ scores between 125 and 150 is shown in the graph above (right). Do: From
Do: For x = 150 we=
have z
Chapter 2: Modeling Distributions of Data
49
125 − 110
= 0.6. So the
25
proportion of observations between 0.6 and 1.6, using Table A is 0.9452 − 0.7257 =
0.2195 or about 22%.
Conclude: About 22% of people aged 20-34 have IQ scores between 125 and 150. (c) State: Let x = the
IQ scores of people aged 20 to 34. The variable x has a Normal distribution with µ = 110 and σ = 25.
We want to find the score such that 98% of people aged 20-34 have an IQ score that is smaller. Plan:
The 98th percentile of the IQ scores is shown in the graph below.
part (a), we have that for x = 150, z = 1.6. For x = 125 we have
that z
=
Do: Using Table A, the 98th percentile for the standard Normal distribution is closest to 2.05. Therefore,
x − 110
the 98th percentile for the IQ scores can be found by solving the equation 2.05 =
for x. Thus,
25
=
x ( 2.05 ) 25 + 110
= 161.25. Conclude: In order to qualify for MENSA membership a person must score
162 or higher.
2.55 (a) We want to find the area under the N(0.37, 0.04) distribution to the right of 0.3. The graphs
below show that this area is equivalent to the area under the N(0, 1) distribution to the right of
0.3 − 0.37
z=
= −1.75.
0.04
Using Table A, the proportion of adhesions higher than 0.30 is 1 − 0.0401 =
0.9599. We would expect
trains to arrive on time about 96% of the time. (b) We want to find the area under the N(0.37, 0.04)
distribution to the right of 0.5. The graphs below show that this area is equivalent to the area under the
0.5 − 0.37
=
N(0, 1) distribution to the right
of z = 3.25. Using Table A, the proportion of adhesions
0.04
higher than 0.50 is 1 − 0.9994 =
0.0006. So we would expect trains to arrive early .06% of the time.
50
The Practice of Statistics for AP*, 4/e
(c) It makes sense to try to have the value found in part (a) larger. We want the train to arrive at its
destination on time, but not to arrive at the switch point early.
2.56 (a) If a lid is too small to fit, that means that its diameter is less than 3.95. Let x = diameter of the
lid. We wish to find the proportion of lids for which x < 3.95. The graph showing this area is below (on
3.95 − 3.98
= −1.5. Using Table A, the proportion of observations
the left). For x = 3.95, we have z =
0.02
less than -1.5 is 0.0668. So, approximately 6.7% of lids are too small. (b) If a lid is too big to fit, that
means that the diameter is more than 4.05. Let x = diameter of the lid. We wish to find the proportion of
lids for which x > 4.05. The graph showing this area is below (on the right). For x = 4.05, we have
4.05 − 3.98
=
z = 3.5. Using Table A, the proportion of observations less than 3.5 is approximately 1, so
0.02
the proportion of observations greater than 3.5 is approximately 0. In other words, it is extremely rare for
a lid to be too big.
(c) It makes more sense to have a larger proportion of lids too small rather than too large. The company
wants to make sure that the fit is snug. If more lids are too large, there will be more spills. If lids are too
small, customers will just try another lid. But if lids are too large, the customer may not notice and then
spill the drink.
2.57 (a) We want to solve for the appropriate mean of the distribution so that the adhesion is less than
0.30 on less than 2% of days. First, find the z-value for which 2% of observations are lower. Using Table
A, we find z = −2.05. Since we want this z-value to correspond to an adhesion of 0.30, we solve
0.30 − µ
z=
= −2.05 for µ. In other words, µ =
0.30 + 2.05 ( 0.04 ) =
0.382. The mean adhesion should be
0.04
0.30 − 0.37
0.382. (b) If the mean adhesion stays at 0.37, then solve z =
= −2.05 for σ. In other words
σ
σ
=
0.30 − 0.37
= 0.034. The standard deviation of the adhesion values should be 0.034. (c) To
−2.05
Chapter 2: Modeling Distributions of Data
51
compare the options, we want to find the area under the N ( µ ,σ ) to the right of 0.50. Under option (a),
0.50 − 0.382
= 2.95. Using Table A, we find that the area is 1 − 0.9984 =
0.0016. Under option (b),
0.04
0.50 − 0.37
=
z = 3.82. Using Table A, we find that the area is 1 − 0.9999 =
0.0001. Therefore, we prefer
0.034
option (b).
=
z
2.58 (a) We want to solve for the appropriate mean so that less than 1% of lids have diameter less than
3.95. First, find the z-value for which 1% of observations are lower. Using Table A, we find
z = −2.33. Since we want this z-value to correspond to a diameter of 3.95, we solve
3.95 − µ
z=
= −2.33 for µ. In other words, µ =
3.95 + 2.33 ( 0.02 ) =
4.00 inches. So the mean lid
0.02
diameter should be 4.00 inches. (b) If the mean diameter stays at 3.98, then solve
3.95 − 3.98
3.95 − 3.98
words σ = 0.013 inches. So the standard deviation
z=
= −2.33 for σ. In other=
σ
−2.33
of the lid diameters should be 0.013 inches. (c) We prefer reducing the standard deviation. This will not
increase the number of lids that are too big, but will reduce the number of lids that are too small. If we
just shift the mean to the right, we will reduce the number of lids that are too small, but we will increase
the number of lids that are too big.
2.59 (a) Using Table A, the closest values to the deciles are ±1.28. (b) The deciles for the heights of
young women are 64.5 ± 1.28(2.5) or 61.3 inches and 67.7 inches.
2.60 The quartiles for a standard Normal distribution are ±0.6745. For a N ( µ , σ ) distribution,
µ − 0.6745σ , Q3 =
µ + 0.6745σ , and IQR =1.349σ . Therefore, 1.5 ( IQR ) = 2.0235σ , and the
Q1 =
suspected outliers are below Q1 − 1.5 ( IQR ) =
µ − 2.698σ or above Q3 + 1.5 ( IQR ) =
µ + 2.698σ . The
proportion outside of this range is approximately the same as the area under the standard Normal
distribution outside of the range from −2.7 to 2.7, which is 2 ( 0.0035 ) = 0.007 or 0.70%.
2.61 Use the two graphs below and the given information to set up two equations in two unknowns.
The two equations are 1.04 =
60 − µ
σ
and 1.88 =
75 − µ
σ
. Multiplying both sides of the equations by σ
and subtracting yields 0.84σ = 15 or σ = 17.86 minutes. Substituting this value back into the first
60 − µ
or µ =
equation we obtain 1.04 =
60 − 1.04(17.86) =
41.43 minutes.
17.86
52
The Practice of Statistics for AP*, 4/e
2.62 Use the given information and the graphs below to set up two equations in two unknowns.
1− µ
2−µ
. Multiplying both sides of the equations by σ and
The two equations are −0.25 = and 2.05 =
σ
σ
subtracting yields −2.3σ =
−1 or σ = 0.4348 minutes. Substituting this value back into the first equation
1− µ
we obtain −0.25 = or µ =
1 + 0.25(0.4348) =
1.1087 minutes.
0.4348
2.63 (a) Descriptive statistics and a histogram are provided below.
Variable
shlength
N
44
Mean
15.586
StDev
2.550
Minimum
9.400
Q1
13.525
Median
15.750
Q3
17.400
Maximum
22.800
The distribution of shark lengths is roughly symmetric with a peak at 16 and varies from 9.4 feet to 22.8
feet. (b) 68.2% of the lengths fall within one standard deviation of the mean, 95.5% of the lengths fall
within two standard deviations of the mean, and 100% of the lengths fall within 3 standard deviations of
the mean. These are very close to the 68-95-99.7 rule. (c) A Normal probability plot is shown below.
Chapter 2: Modeling Distributions of Data
53
Except for one small shark and one large shark, the plot is fairly linear, indicating that the Normal
distribution is appropriate. (d) The graphical display in (a), check of the 68−95−99.7 rule in (b), and
Normal probability plot in (c) indicate that shark lengths are approximately Normal.
2.64 (a) Descriptive statistics and a histogram are provided below.
Variable
density
N
29
Mean
5.4479
StDev
0.2209
Minimum
4.8800
Q1
5.2950
Median
5.4600
Q3
5.6150
Maximum
5.8500
The measurements of the earth’s density are roughly symmetric with a mean of 5.45 and varies from 4.88
to 5.85. (b) The densities follow the 68−95−99.7 rule closely—75.86% (22 out of 29) of the densities fall
within one standard deviation of the mean, 96.55% (28 out of 29) of the densities fall within two standard
deviations of the mean, and 100% of the densities fall within 3 standard deviations of the mean. (c)
Normal probability plots from Minitab (left) and a TI calculator (right) are shown below.
54
The Practice of Statistics for AP*, 4/e
The Normal probability plot is roughly linear, indicating that the densities are approximately Normal. (d)
The graphical display in (a), the 68-95-99.7 rule in (b) and the Normal probability plot in (c) all indicate
that these measurements are approximately Normal.
2.65 The plot is nearly linear. Because heart rate is measured in whole numbers, there is a slight “step”
appearance to the graph.
2.66 The shape of the Normal probability plot suggests that the data are right-skewed. This can be seen in
the steep, nearly vertical section in the lower left—these numbers were less spread out than they should
be for Normal data—and the three apparent outliers that deviate from the line in the upper right; these
were much larger than they would be for a Normal distribution.
149,164
= 0.1147 or about 11.47%. (b) The percent of
1,300,599
149,164 + 50,310
scores greater than or equal to 27 is
= 0.1534 or about 15.34%. (c) For the normal
1,300,599
distribution with µ = 21.2 and σ = 5.0, the percent of observations greater than 27 corresponds to the
27 − 21.2
percent of observations=
above z = 1.16 on the Standard Normal curve. Using Table A, this
5
proportion is 0.1230. So about 12.30% of scores would be 27 or higher.
2.67 (a) The percent of scores above 27 is
2.68 Women’s weights are skewed to the right: This makes the mean higher than the median, and it is
also revealed in the differences M − Q=
133.2 − 118.3= 14.9 pounds and
1
Q3 − M= 157.3 − 133.2= 24.1 pounds.
2.69 d
2.70 c
2.71 b
Chapter 2: Modeling Distributions of Data
55
2.72 c
2.73 c
2.74 c
2.75 For both kinds of cars we see that the highway miles per gallon is higher than the city miles per
gallon, a conclusion that we would have expected. The two-seater cars have a wider spread of miles per
gallon values than the minicompact cars do, both on the highway and in the city. Also the miles per
gallon values are slightly lower for the two-seater cars than for the minicompact cars. This difference is
greater on the highway than it is in the city.
2.76 (a) The two-way table is shown below. (b) The percent of eggs in each group that hatched are
59.26% in a cold nest, 67.86% in a neutral nest, and 72.12% in a hot nest. The percents indicate that
hatching increases with temperature. The cold nest did not prevent hatching, but made it less likely.
Cold
Neutral
Hot
Hatched
16
38
75
Not hatched
11
18
29
Total
27
56
104
56
The Practice of Statistics for AP*, 4/e
Chapter Review Exercises (page 136)
R2.1 (a) The only way to obtain a z-score of 0 is if the x value equals the mean. Thus, the mean is 170
cm. To find the standard deviation, use the fact that a z-score of 1 corresponds to a height of 177.5. Then
177.5 − 170
so σ = 7.5. Thus the standard deviation of the height distribution for 15-year-old males is
1=
σ
7.5 cm. (b) Since 2.5
=
2.5.
x − 170
,=
x 2.5(7.5) + 170
= 188.75 cm. A height of 188.75 cm has a z-score of
7.5
179 − 170
= 1.20. Paul is somewhat taller than average for his age. His height is 1.20
7.5
standard deviations above the average male height for his age. (b) 85% of boys Paul’s age are shorter
than Paul’s height.
R2.2
(a) z
=
R2.3 (a) Reading up from 10 hours on the x-axis to the graphed line and then across to the y-axis, we see
that 10 hours corresponds to about the 70th percentile. (b) The median (50th percentile) is about 5, Q1
(25th percentile) is about 2.5, and Q3 (75th percentile) is about 11. There are outliers, according to the
1.5 ( IQR ) rule, because values exceeding Q3 + 1.5 ( IQR ) =
23.75 clearly exist.
R2.4 (a) If we converted the guesses from feet to meters the shape of the distribution would not change.
43.7
42
The new mean would be
= 13.32 meters, the median would be
= 12.80 meters, the standard
3.28
3.28
12.5
12.5
deviation would be
= 3.81 meters, and the IQR would be
= 3.81 meters. (b) The mean error
3.28
3.28
would be 43.7 − 42.6 =
1.1 feet. The standard deviation of the errors would be the same as the standard
deviation of the guesses, 12.5 feet, because we have just shifted the distribution, but not changed its width
by subtracting 42.6 from each guess.
R2.5 (a) Answers will vary but the line should be slightly to the right of the main peak. Line A in the
graph below. (b) Answers will vary but the line should be slightly to the right of the line for the median.
Line B in the graph below.
R2.6 (a) 336 ± 3(3) =
( 327,345) (b) 339 is one standard deviation above the mean, so 16% of the horse
pregnancies last longer than 339 days. This is because 68% are within one standard deviation of the
mean, so 32% are more than one standard deviation from the mean and half of those are greater than 339.
Chapter 2: Modeling Distributions of Data
57
R2.7 (a) The graph is given below (on the left). Looking up −2.25 in Table A, the proportion of
observations below −2.25 is 0.0122. (b) The graph is below (on the right). The proportion of
observations larger than −2.25 is 1 − 0.0122 =
0.9878.
(c) The graph is given below (on the left). Looking up 1.77 in Table A gives 0.9616. This is the
proportion of observations below 1.77. We would like the proportion greater than 1.77 so we subtract
this from 1. 1 − 0.9616 =
0.0384. 3.84% of observations are above 1.77. (d) The graph is given below
(on the right). Subtract the area below −2.25 from the area below 1.77 to get 0.9616 − 0.0122 =
0.9494.
94.94% of observations are between −2.25 and 1.77.
R2.8 (a) Looking up 0.80 in the body of Table A, we find that the z corresponding to the number closest
to 0.80 is 0.84. The graph is shown below (on the left). (b) If 35% of all values are greater than a
particular z-value, then 65% are lower. Looking up 0.65 in the body of Table A, we find that the z
corresponding to the number closest to 0.65 is 0.39. The graph is shown below (on the right).
2500 − 3668
= −2.29. The z-value that corresponds to a baby weight less than 2500 grams
511
at birth is −2.29. The percent of babies weighing less than this is the area to the left (see graph below).
According to Table A, this is 0.0110. So, approximately 1% of babies will be identified as having low
birth weight.
R2.9 (a) z =
58
The Practice of Statistics for AP*, 4/e
(b) The z-values corresponding to the quartiles are −0.67 and 0.67. To find the x-values corresponding
to the quartiles, we solve the following equations for x.
x − 3668
3325.63
−0.67 =
⇒ x =−0.67(511) + 3668 =
511
x − 3668
0.67
3668 4010.37
=
=
⇒ x 0.67(511) +=
511
The quartiles of the birth weight distribution are 3325.6 and 4010.4.
R2.10 If the distribution is Normal, it must be symmetric about its mean – and in particular, the 10th and
90th percentiles must be equal distances above and below the mean – so the mean is 250 points. If 225
points below (above) the mean is the 10th (90th) percentile, this is 1.28 standard deviations below (above)
225
= 175.8 points.
the mean, so the distribution’s standard deviation is
1.28
R2.11 A histogram and Normal probability plot are given below. Both indicate that the data are not
exactly Normally distributed. The histogram is roughly symmetric. The descriptive statistics given
below indicate that the mean and median are very similar which is consistent with rough symmetry. For
most purposes, these data can be considered approximately Normal.
Variable
length of thorax
N
49
Mean
0.8004
StDev
0.0782
Minimum
0.6400
Q1
0.7600
Median
0.8000
Q3
0.8600
Maximum
0.9400
R2.12 The steep, nearly vertical portion at the bottom and the bow upward indicate that the distribution
of the data is right-skewed with several outliers. In other words, this data is not Normally distributed.
Chapter 2: Modeling Distributions of Data
59
AP Statistics Practice Test (page 138)
T2.1 e. The percentile tells what percent of scores are below that figure.
T2.2 d. At one standard deviation away from the mean, the curve has its inflection point. Also, the vast
majority of the curve lies within 3 standard deviations of the mean.
T2.3 b. Both the addition and the multiplication affect the mean, but only the multiplication affects the
standard deviation.
T2.4 b. About 20% consume fewer than 4 ounces and about 60% consume fewer than 8 ounces, so about
40% consume between 4 and 8 ounces.
T2.5 a. Solve the following equation for σ. 1.04 =
percentile of the distribution as found in Table A.
60 − 55
σ
. The value 1.04 is the approximate 85th
T2.6 d. There is not enough area between A and B or between F and G to constitute 25%. Since A and
G are the min and the max, they are obviously part of the 5 number summary. The points B and F are too
close to the ends of the distribution.
T2.7 c. Two feet constitutes 24 inches. Since there is a spread of about 6 standard deviations in a length
that encompasses 99.7% of all observations, we would estimate the standard deviation to be
24
= 4 inches.
6
T2.8 e. The proportion of observations with z > −3.0 in a Standard Normal distribution is 0.9987.
T2.9
e. z
=
540 − 470
= 0.27.
110
T2.10 c. Jane’s z-score is 0.27 (see previous problem). Colleen’s z-score
is z
=
530 − 515
= 0.13.
116
T2.11 (a) Jane’s performance was better. She did more curl-ups than 85% of girls her age. This means
that she qualified for both the President’s award and for the National award. Matt did more curl-ups than
50% of boys his age. This means that less than 50% of the boys his age did better than he did, whereas
less than 15% of the girls her age did better than Jane. Matt qualified for the National award, but did not
qualify for the President’s award. (b) Since Jane’s score has a higher percentile than Matt’s score, her
standardized score will also be larger than his.
23.9 − 22.8
= 1. So
1.1
the proportion of observations lower than this is 0.8413 (using Table A). This means that this soldier’s
head circumference is in approximately the 84th percentile. A graph is shown below.
T2.12 (a) The z-value that corresponds to this soldier’s head circumference
is z
=
60
The Practice of Statistics for AP*, 4/e
20 − 22.8
= −2.55. Using Table A, the area below −2.55
1.1
26 − 22.8
is 0.0054 (see graph below). Standardizing the right endpoint we
get z = 2.91. The area
=
1.1
below 2.91 (using Table A) is 0.9982, so the area above 2.91 is 1 − 0.9982 =
0.0018 (see graph below).
This means that the area in both tails is 0.0054 + 0.0018 =
0.0072. So approximately 0.7% of soldiers
require custom helmets.
(b) Standardizing the left endpoint we get z =
(c) The quartiles of a Standard Normal distribution are −0.67 and 0.67. To find the quartiles of the head
circumference distribution we solve the following equations for x.
x − 228
−0.67 =
⇒ x =−0.67(1.1) + 22.8 =22.063
1.1
x − 228
0.67
22.8 23.537
=
=
⇒ x 0.67(1.1) + =
1.1
This means that Q1 = 22.063 and Q3 = 23.537. So IQR = Q3 − Q1 = 23.537 − 22.063 = 1.474 inches.
T2.13 No, this data does not seem to follow a Normal distribution. First, there is a large difference
between the mean and the median. The mean is 48.25 and the median is 37.80. The Normal distribution
is symmetric so the mean and median should be quite close in a data set drawn from one. This data set
appears to be highly skewed to the right. This can be seen by the fact that the mean is so much larger than
the median. It can also be seen by the fact that the distance between the minimum and the median is
37.80 − 2 =
35.80, but the distance between the median and the maximum is 204.90 − 37.80 =
167.10.
Chapter 2: Modeling Distributions of Data
61