Friendly High School
10000 Allentown Road Fort Washington, MD 20744
Office (301)449-4900
Fax (301)449-4911
Raynah Adams
Principal
AP Statistics
Summer Assignments
2014
Chapter 1
1
Summer Assignment 1A AP Statistics
Name:
Directions: Work on these sheets. Answer completely, but be concise.
Part 1: Multiple Choice.
Circle the letter corresponding to the best answer.
1. Which of the following statements is NOT true?
(a) In a symmetric distribution, the mean and the median are equal.
(b) The first quartile is equivalent to the twenty-fifth percentile.
(c) In a symmetric distribution, the median is halfway between the first and third quartiles.
(d) The median is always greater than the mean.
(e) The range is the difference between the largest and the smallest observation in the data set.
2. Consumers’ Union measured the gas mileage in miles per gallon of 38 automobiles from the same
model year on a special test track. The pie chart below provides information about the country of
manufacture of the model cars used by Consumers’ Union. Based on the pie chart, we may
conclude that
(a) Japanese cars get significantly lower gas mileage than cars of other countries. This is because
their slice of the pie is at the bottom of the chart.
(b) U.S. cars get significantly higher gas mileage than cars from other countries.
(c) Swedish cars get gas mileages that are between those of Japanese and U.S. cars.
(d) Mercedes, Audi, Porsche, and BMW represent approximately a quarter of the cars tested.
(e) More than half of the cars in the study were from the United States.
3. A researcher reports that, on average, the participants in his study lost 10.4 pounds after two
months on his new diet. A friend of yours comments that she tried the diet for two months and lost
no weight, so clearly the report was a fraud. Which of the following statements is correct?
(a) Your friend must not have followed the diet correctly, since she did not lose weight.
(b) Since your friend did not lose weight, the report must not be correct.
(c) The report gives only the average. This does not imply that all participants in the study lost
10.4 pounds or even that all lost weight. Your friend’s experience does not necessarily
contradict the study results.
(d) In order for the study to be correct, we must now add your friend’s results to those of the study
and recompute the new average.
(e) Your friend is an outlier.
4. The following is an ogive of the number of ounces of alcohol (one ounce is about 30 milliliters)
consumed per week in a sample of 150 college students.
A study wished to classify the students as “light,” “moderate,” “heavy,” and “problem” drinkers by
the amount consumed per week. About what percent of students are moderate drinkers, that is,
consume between 4 and 8 ounces per week?
(a) 60%
(b) 20%
(c) 40%
(d) 80%
(e) 50%
5. “Normal” body temperature varies by time of day. A series of readings was taken of the body
temperature of a subject. The mean reading was found to be 36.5°C with a standard deviation of
0.3°C. When converted to °F, the mean and standard deviation are (°F = °C(1.8) + 32):
(a) 97.7, 32
(b) 97.7, 0.30
(c) 97.7, 0.54
(d) 97.7, 0.97
(e) 97.7, 1.80
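Question 5 rests on how a linear change of units, new = a(old) + b, affects summary statistics. The sketch below uses hypothetical Celsius readings (they are not part of the assignment) to check numerically that the mean follows the whole transformation while the standard deviation is only multiplied by a.

```python
import statistics

# Hypothetical body-temperature readings in degrees Celsius (illustrative only).
celsius = [36.1, 36.4, 36.5, 36.7, 36.8]

# Linear change of units: F = 1.8 * C + 32.
fahrenheit = [1.8 * c + 32 for c in celsius]

mean_c = statistics.mean(celsius)
sd_c = statistics.stdev(celsius)
mean_f = statistics.mean(fahrenheit)
sd_f = statistics.stdev(fahrenheit)

# The mean follows the full transformation. The standard deviation is only
# rescaled by the multiplier, because adding a constant does not change spread.
print("mean:", mean_f, "vs", 1.8 * mean_c + 32)
print("sd:  ", sd_f, "vs", 1.8 * sd_c)
```

The same check applies to any question in this packet where every observation is shifted or rescaled by a constant.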
6. The following is a histogram showing the actual frequency of the closing prices of a particular
stock on the New York Stock Exchange. The class that contains the 80th percentile is
(a) 20–30
(b) 10–20
(c) 40–50
(d) 50–60
(e) 30–40
7. Which of the following is likely to have a mean that is smaller than the median?
(a) The salaries of all National Football League players.
(b) The scores of students (out of 100 points) on a very easy exam in which most get nearly perfect
scores but a few do very poorly.
(c) The prices of homes in a large city.
(d) The scores of students (out of 100 points) on a very difficult exam in which most get poor
scores but a few do very well.
(e) Amounts awarded by civil court juries.
8. There are three children in a room, ages three, four, and five. If a four-year-old child enters the
room the
(a) mean age will stay the same but the variance will increase.
(b) mean age will stay the same but the variance will decrease.
(c) mean age and variance will stay the same.
(d) mean age and variance will increase.
(e) mean age and variance will decrease.
9. The weights of the male and female students in a class are summarized in the following boxplots:
Which of the following is NOT correct?
(a) About 50% of the male students have weights between 150 and 185 pounds.
(b) About 25% of female students have weights more than 130 pounds.
(c) The median weight of male students is about 162 pounds.
(d) The mean weight of female students is about 120 pounds because of symmetry.
(e) The male students have less variability than the female students.
10. When testing water for chemical impurities, results are often reported as bdl, that is, below
detection limit. The following are the measurements of the amount of lead in a series of water
samples taken from inner-city households (in parts per million):
5, 7, 12, bdl, 10, 8, bdl, 20, 6
Which of the following is correct?
(a) The mean lead level in the water is about 10 ppm.
(b) The mean lead level in the water is about 8 ppm.
(c) The median lead level in the water is 7 ppm.
(d) The median lead level in the water is 8 ppm.
(e) Neither the mean nor the median can be computed because some values are unknown.
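Question 10 illustrates that a median depends only on the order of the observations, not on the exact size of the extreme ones. The sketch below assumes that every bdl reading is smaller than any measured value, and uses negative infinity as a stand-in purely for sorting. That stand-in is an illustrative device, not a laboratory convention.

```python
import math
import statistics

# Lead readings from Question 10; "bdl" means below detection limit.
readings = [5, 7, 12, "bdl", 10, 8, "bdl", 20, 6]

# Assumption: every bdl reading is smaller than any measured value, so it can
# be replaced by negative infinity for ordering purposes only.
ordered = sorted(-math.inf if r == "bdl" else r for r in readings)

# With 9 observations the median is the 5th ordered value, which is a
# measured number, so the placeholder never enters the result.
median = statistics.median(ordered)
print(ordered)
print("median:", median)
```

A mean cannot be handled this way, because it needs the actual size of every observation.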
Part 2: Free Response
Communicate your thinking clearly and completely.
11. The test grades for a certain class were entered into a Minitab worksheet, and then “Descriptive
Statistics” were requested. The results were
MTB > Describe 'Grades'.

              N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
Grades       28    74.71    76.00    75.50    12.61     2.38

            MIN      MAX       Q1       Q3
Grades    35.00    94.00    68.00    84.00

You happened to see, on a scrap of paper, that the lowest grades were 35, 57, 59, 60, . . . but you
don’t know what the other individual grades are. Nevertheless, a knowledgeable user of statistics
can tell a lot about the data set simply by studying the set of descriptive statistics above.
(a) Construct a modified boxplot for these data.
(b) Write a brief description of what the results tell you about the distribution of grades. Be sure to
address
• the general shape of the distribution
• unusual features, including possible outliers
• the middle 50% of the data
• any significance in the difference between the mean and the median
12. The University of Miami Hurricanes has been among the more successful teams in college football.
The weights in pounds and positions of the players on the 2005 team were recorded. The positions
are quarterback (QB), running back (RB), offensive line (OL), wide receiver (WR), tight end (TE),
kicker/punter (KP), defensive back (DB), linebacker (LB), and defensive line (DL).
Here are side-by-side boxplots of the weights.
Code: 1=QB, 2=RB, 3=OL, 4=WR, 5=TE, 6=KP, 7=DB, 8=LB, 9=DL
(a) Briefly compare the weight distributions. Which position has the heaviest players overall?
Which has the lightest?
(b) Are any individual players outliers within their position?
13. Give an example of a small data set for which the mean is greater than the third quartile. Indicate
the mean and the third quartile.
Summer Assignment 1B AP Statistics
Name:
Directions: Work on these sheets. Answer completely, but be concise.
Part 1: Multiple Choice.
Circle the letter corresponding to the best answer.
1. Mr. Yates picked up a dozen items in the grocery store with a mean cost of $3.25. Then he added an
apple pie for $6.50. The new mean for all 13 items is
(a) $3.00
(b) $3.50
(c) $3.75
(d) $4.88
(e) None of the above
Use the following to answer Question 2:
2. Which of the following bar graphs is equivalent to the pie chart?
(a)
(b)
(c)
(d)
(e) None of these.
3. Consider the following ogive of the scores of students in an introductory statistics course:
[Ogive not reproduced: horizontal axis Score, vertical axis Percent.]
A grade of C or C+ is assigned to a student who scores between 55 and 70. The percentage of
students who obtained a grade of C or C+ is
(a) 25%
(b) 30%
(c) 20%
(d) 50%
(e) 15%
4. For the following histogram, what is the proper ordering of the mean and median? Note that the
graph is NOT numerically precise—only the relative positions are important.
(a) I is the mean and II is the median.
(b) II is the median and III is the mean.
(c) I is the median and II is the mean.
(d) II is the mean and III is the median.
(e) I is the mean and III is the median.
5. A researcher wishes to calculate the average height of patients suffering from a particular disease.
From patient records, the mean was computed as 156 cm, and standard deviation as 5 cm. Further
investigation reveals that the scale was misaligned, and that all readings are 2 cm too large, for
example, a patient whose height is really 180 cm was measured as 182 cm. Furthermore, the
researcher would like to work with statistics based on meters. The correct mean and standard
deviation are:
(a) 1.56 m, 0.05 m
(b) 1.54 m, 0.05 m
(c) 1.56 m, 0.03 m
(d) 1.58 m, 0.05 m
(e) 1.58 m, 0.07 m
6. A medical researcher collects health data on many women in each of several countries. One of the
variables measured for each woman in the study is her weight in pounds. The following list gives
the five-number summary for the weights of women in one of the countries.
Country A:
100, 110, 120, 160, 200
About what percent of Country A women weigh between 110 and 200 pounds?
(a) 50%
(b) 65%
(c) 75%
(d) 85%
(e) 95%
7. The median age of five people in a meeting is 30 years. One of the people, whose age is 50 years,
leaves the room. The median age of the remaining four people in the room is
(a) 40 years.
(b) 30 years.
(c) 25 years.
(d) less than 30 years.
(e) Cannot be determined from the information given.
8. The time plot below gives the number of burglaries committed each month for a city in Ohio. The
plot is for the three-year period January 1987 to December 1989.
Which of the following is a true statement?
(a) The number of burglaries in each month of 1988 was lower than the number of burglaries in
each month of 1989.
(b) The median number of burglaries for a month in 1988 was a little over 25.
(c) The total number of burglaries in 1989 was higher than in 1988.
(d) More burglaries seem to be committed in June, July, and August during 1987, 1988, and 1989.
(e) None of the above.
9. Here is a summary graph of complex carbohydrates (in grams) for each of three fiber groups in a
set of data related to cereals.
Which of the following is NOT correct?
(a) The low-fiber group is more variable than the medium-fiber group because the central box is
larger.
(b) About 25% of low-fiber cereals have less than 12 g of complex carbohydrates per serving.
(c) About 50% of medium-fiber cereals have more than 15 g of complex carbohydrates per
serving.
(d) The average amount of complex carbohydrates per serving for the high-fiber group appears to
be much smaller than for the other two groups.
(e) About 25% of the medium-fiber cereals have less than 10 g of complex carbohydrates.
10. Earthquake intensities are measured using a device called a seismograph, which is designed to be
most sensitive to earthquakes with intensities between 4.0 and 9.0 on the open-ended Richter scale.
Measurements of nine earthquakes gave the following readings:
4.5 L 5.5 H 8.7 8.9 6.0 H 5.2
where L indicates that the earthquake had an intensity below 4.0 and H indicates that the
earthquake had an intensity above 9.0. The median earthquake intensity of the sample is
(a) Cannot be computed since not all of the values are known
(b) 8.70
(c) 5.75
(d) 6.00
(e) 6.47
Part 2: Free Response
Communicate your thinking clearly and completely.
11. We all “know” that the body temperature of a healthy person is 98.6°F. In reality, the actual body
temperature of individuals varies. Here is a back-to-back stemplot of the body temperatures of 130
healthy individuals (65 males and 65 females).
       Males |     | Females
           3 |  96 |
             |  96 | 4
           7 |  96 | 7
           9 |  96 | 8
        1110 |  97 |
          32 |  97 | 22
      544444 |  97 | 4
        7666 |  97 | 677
      998888 |  97 | 8888999
    11000000 |  98 | 000001
      332222 |  98 | 222222333
      554444 |  98 | 444445
    77666666 |  98 | 6666777777
        9888 |  98 | 8888889
        1000 |  99 | 0011
          32 |  99 | 223
          54 |  99 | 4
             |  99 |
             |  99 | 9
             | 100 | 0
             | 100 |
             | 100 |
             | 100 |
             | 100 | 8
(a) Here are boxplots, produced by Minitab, for these distributions. Label both boxplots with the
five-number summary values.
(b) Determine whether the 3 points graphed by the + symbol are indeed outliers by our defined
criteria.
(c) Write a few sentences comparing the body temperatures of adult males and females.
12. The following data represent scores of 50 students on a calculus test.
72   57   74   71   65   72   67   76   53   51
93   72   79   67   75   70   57   72   65   68
59   83   61  100   75   78   76   72   83   66
74   74   73   69   77   65   56   76   61   61
73   68   67   72   64   80   67   49   68   74
(a) Construct a relative frequency histogram for this data set.
(b) Describe the shape, center, and spread of the distribution of test scores.
(c) Would the mean and standard deviation be appropriate measures of center and spread for these
test scores? Explain.
Summer Assignment 1C AP Statistics
Name:
Directions: Work on these sheets. Answer completely, but be concise.
Part 1: Multiple Choice.
Circle the letter corresponding to the best answer.
1. The five-number summary for scores on a statistics exam is 11, 35, 61, 70, 79. In all, 380 students
took the test. About how many had scores between 35 and 61?
(a) 26
(b) 76
(c) 95
(d) 190
(e) None of these
2. The following bar graph gives the percent of owners of three brands of trucks who are satisfied
with their truck.
From this graph we may legitimately conclude that
(a) owners of other brands of trucks are less satisfied than the owners of these three brands.
(b) Chevrolet owners are substantially more satisfied than Ford or Toyota owners.
(c) there is very little difference in the satisfaction of owners for the three brands.
(d) Chevrolet probably sells more trucks than Ford or Toyota.
(e) a pie chart would have been a better choice for displaying these data.
3. A reporter wishes to portray baseball players as overpaid. Which measure of center should he
report as the average salary of major league players?
(a) The mean.
(b) The median.
(c) Either the mean or median. It doesn’t matter since they will be equal.
(d) Neither the mean nor median. Both will be much lower than the actual average salary.
(e) The standard deviation should be used to show the great disparity between the astronomical
salaries of the few superstars and the salaries of the rest of the players.
4. The mean salary of all female workers is $35,000. The mean salary of all male workers is $41,000.
What must be true about the mean salary of all workers?
(a) It must be $38,000.
(b) It must be larger than the median salary.
(c) It could be any number between $35,000 and $41,000.
(d) It must be larger than $38,000.
(e) It cannot be larger than $40,000.
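Question 4 depends on the fact that the mean of a pooled group is a weighted average of the group means, with weights given by the group sizes. The sketch below uses the two means from the question. The head counts are hypothetical, chosen only to show how the pooled mean moves as the mix of workers changes.

```python
def combined_mean(mean_f, n_f, mean_m, n_m):
    """Mean of the pooled group: a weighted average of the two group means."""
    return (mean_f * n_f + mean_m * n_m) / (n_f + n_m)

# Group means from Question 4; the head counts below are hypothetical.
for n_f, n_m in [(50, 50), (90, 10), (10, 90)]:
    pooled = combined_mean(35_000, n_f, 41_000, n_m)
    print(f"{n_f} women, {n_m} men -> pooled mean {pooled:.0f}")
```

Without the group sizes, the pooled mean is pinned down only to the interval between the two group means.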
5. Consider the following output analyzing pH values of some 1986 data on precipitation events.
Summary stats for 1986
NumNumeric = 55
NumNonNumeric = 0
NumCases = 55
Mean = 6.0673
Median = 6.1000
Std Deviation = 0.47339
Range = 2.4000
Minimum = 4.6000
Maximum = 7
75-th %ile = 6.4000
Which of the following is NOT correct?
(a) The 25th percentile is about 5.9.
(b) Some outliers appear to be present below a pH of 5.2.
(c) About 95% of the observations have pH values in the approximate range 6 ± 1.
(d) About 10% of the values are in the range 5.8 to 6.0.
(e) About 75% of the values are less than 6.4.
6. A sample of 99 distances has a mean of 24 feet and a median of 24.5 feet. Unfortunately, it has
just been discovered that an observation which was erroneously recorded as “30” actually had a
value of “35.” If we make this correction to the data, then
(a) the mean remains the same, but the median is increased.
(b) the mean and median remain the same.
(c) the median remains the same, but the mean is increased.
(d) the mean and median are both increased.
(e) we do not know how the mean and median are affected without further calculations, but the
variance is increased.
7. Forty students took a statistics examination having a maximum of 50 points. The score distribution
is given in the following stem-and-leaf plot:
0|28
1|2245
2|01333358889
3|001356679
4|22444466788
5|000
The third quartile of the score distribution is equal to
(a) 43
(b) 44
(c) 45
(d) 23
(e) 32
8. Rainwater was collected in water collectors at 30 different sites near an industrial complex and the
amount of acidity (pH level) was measured. The mean and standard deviation of the values are
4.60 and 1.10, respectively. When the pH meter was recalibrated back at the laboratory, it was
found to be in error. The error can be corrected by adding 0.1 pH units to all of the values and then
multiplying the result by 1.2. The mean and standard deviation of the corrected pH measurements
are
(a) 5.64, 1.44
(b) 5.64, 1.32
(c) 5.40, 1.44
(d) 5.40, 1.32
(e) 5.64, 1.20
9. An experiment was conducted to investigate the effect of a new weed killer to suppress weed
germination in onion crops. Two chemicals were used, the standard weed killer (C) and the new
chemical (W). Both chemicals were tested at high and low concentrations. Measurements were
made of the percent weed germination on each of 50 plots for each treatment combination. Here
are some boxplots of the results:
[Boxplots not reproduced: percent weed germination on an axis from 0 to 50 for four treatments,
W-low conc., C-low conc., W-high conc., and C-high conc.]
Which of the following is NOT a feature of these data?
(a) At either high or low concentrations, the new chemical (W) gives better control of weed
germination than the standard weed killer (C).
(b) Fewer weeds germinate at higher concentrations of both chemicals.
(c) The results from the standard chemical are less variable than those from the new chemical.
(d) High or low concentrations of either chemical have approximately the same effects on weed
germination.
(e) Some of the results from the low concentration of weed killer W have fewer weeds germinating
than some of the results from the high concentration of W.
10. A clothing and textiles student is trying to assess the effect of a jacket’s design on the time it takes
preschool children to put the jacket on. In a pretest, she times 7 children as they put on her
prototype jacket. The times (in seconds) are provided below.
n, n, 65, 39, n, 43, 102
The n’s represent children who had not put the jacket on after 120 seconds (in which case the
children were allowed to stop). Which of the following would be the best value to use as the
“typical” time required to put on the jacket?
(a) The mean time, which was 62.25 seconds.
(b) The mean time, which was 85.6 seconds.
(c) The median time, which was 54 seconds.
(d) The median time, which was 102 seconds.
(e) The missing times (the n’s) mean we can’t calculate any useful measures of center.
Part 2: Free Response
Communicate your thinking clearly and completely.
11. Ogives of Distributions of Arithmetic Test Scores for Seventh- and Eighth-Graders
(Connect 7s with a smooth curve for seventh-grade ogive and connect 8s with a smooth curve for eighth-grade ogive.)
[Ogive plot not reproduced: vertical axis PERCENT from 0 to 100, horizontal axis TEST SCORE from 5
to 30; plotted points are labeled 7 for seventh grade and 8 for eighth grade.]
(a) What is the estimated percent of eighth-grade pupils whose arithmetic scores fall below the
median score for grade seven? Justify your answer.
(b) What is the shape of the distribution of the eighth-grade test scores? Justify your answer.
12. During the early part of the 1994 baseball season, many sports fans and baseball players noticed
that the number of home runs being hit seemed to be unusually large. Here are the data on the
number of home runs hit by American and National League teams:
American League: 35, 40, 43, 49, 51, 54, 57, 58, 58, 64, 68, 68, 75, 77
National League: 29, 31, 42, 46, 47, 48, 48, 53, 55, 55, 55, 63, 63, 67
(a) Construct an appropriate graph for comparing the number of home runs hit in the two leagues.
(b) Calculate numerical summaries of the number of home runs hit in the two leagues. Which of
these numbers would be most appropriate for comparing the two leagues? Explain.
(c) Are there any outliers in either of the two data sets? Justify your answer numerically.
(d) Write a few sentences comparing the distributions of home runs in the two leagues.
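For parts (b) and (c), one way to check hand calculations is sketched below. It assumes the convention that each quartile is the median of the observations on its side of the overall median, and it flags values more than 1.5 × IQR beyond the quartiles. The function names are illustrative, and statistical software may use a slightly different quartile rule.

```python
import statistics

def five_number_summary(data):
    """Min, Q1, median, Q3, max, with quartiles taken as medians of the halves."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]    # observations below the median position
    upper = xs[-half:]   # observations above the median position
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])

def suspected_outliers(data):
    """Values more than 1.5 x IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    fence = 1.5 * (q3 - q1)
    return [x for x in data if x < q1 - fence or x > q3 + fence]

american = [35, 40, 43, 49, 51, 54, 57, 58, 58, 64, 68, 68, 75, 77]
national = [29, 31, 42, 46, 47, 48, 48, 53, 55, 55, 55, 63, 63, 67]

for name, league in [("American", american), ("National", national)]:
    print(name, five_number_summary(league), suspected_outliers(league))
```

The printed summaries are a check on arithmetic only; the written comparison in part (d) still has to interpret them.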
Summer Assignment 1D AP Statistics
Name:
Directions: Work on these sheets. Answer completely, but be concise.
Part 1: Multiple Choice. Circle the letter corresponding to the best answer.
1. Here are the IQ test scores of 10 randomly chosen fifth-grade students:
145   139   126   122   125
130    96   110   118   118
To make a stemplot of these scores, you would use as stems
(a) 0 and 1.
(b) 09, 10, 11, 12, 13, and 14.
(c) 96, 110, 118, 122, 125, 126, 130, 139, and 145.
(d) 0, 2, 3, 5, 6, 8, 9.
(e) None of the above is a correct answer.
2. For the IQ test scores from the previous question, what kind of plot is appropriate?
(a) a stemplot but not a boxplot
(b) a boxplot but not a histogram
(c) a bar graph but not a pie chart
(d) a histogram but not a dotplot
(e) None of the above is a correct answer.
3. Rainwater was collected in water collectors at 30 different sites near an industrial complex and the
amount of acidity (pH level) was measured. The data ranged from pH 2.6 to pH 6.3. The
following stemplot of the data was constructed.
2|679
3|237789
4|1222446899
5|0556788
6|0233
Which of the following boxplots is correct?
4. A scientist is weighing each of 30 fish. She obtains a mean of 30 g and a standard deviation of 2 g.
After completing the weighing, she finds that the scale was misaligned and always underreported
every weight by 2 g; that is, a fish that really weighed 26 g was reported to weigh 24 g. What
are the mean and standard deviation after correcting for the error in the scale?
(a) 28 g, 2 g
(b) 30 g, 4 g
(c) 32 g, 2 g
(d) 32 g, 4 g
(e) 28 g, 4 g
5. If a distribution is skewed to the right,
(a) the mean is less than the median.
(b) the mean and median are equal.
(c) the mean is greater than the median.
(d) It’s impossible to tell without seeing the data.
(e) None of the above is a correct answer.
6. The population of the United States is aging, though less rapidly than in other developed countries.
Here is a stemplot of the percents of residents aged 65 and older in the 50 states, according to the
2000 census:
 5 | 7
 6 |
 7 |
 8 | 5
 9 | 679
10 | 6
11 | 02233677
12 | 0011113445789
13 | 00012233345568
14 | 034579
15 | 36
16 |
17 | 6
There are two outliers: Alaska has the lowest percent of older residents, and Florida has the
highest. What is the percent for Florida?
(a) 13.8%
(b) 57%
(c) 176%
(d) 17.6%
(e) 5.7%
7. Ignoring the outliers, the shape of the distribution in the previous question is
(a) strongly skewed to the right.
(b) slightly skewed to the right.
(c) roughly symmetric.
(d) slightly skewed to the left.
(e) strongly skewed to the left.
8. The center of the distribution in Question 6 is close to
(a) 12.7%.
(b) 13.5%.
(c) 13.8%.
(d) 12.
(e) It’s impossible to tell.
9. To make a boxplot of a distribution, you must know
(a) all of the individual observations.
(b) the mean and the standard deviation.
(c) the quartiles.
(d) the five-number summary.
(e) the individual observations, the mean, and the IQR.
10. What are all the values that a standard deviation can possibly take?
(a) 0 ≤ s
(b) 0 ≤ s ≤ 1
(c) –1 ≤ s ≤ 1
(d) s ≤ 0
(e) any real number
Part 2: Free Response
Communicate your thinking clearly and completely.
11. Here are data on the percent of people in several age groups who attended a movie in the past 12
months:
Age group            Movie attendance
18 to 24 years       83%
25 to 34 years       73%
35 to 44 years       68%
45 to 54 years       60%
55 to 64 years       47%
65 to 74 years       32%
75 years and over    20%
(a) Display these data in a bar graph in the space above.
What is the main feature of the data?
(b) Would it be correct to make a pie chart of these data? Why?
(c) A movie studio wants to know what percent of the total audience for movies is 18 to 24 years
old. Explain why these data do not answer this question.
12. Here is a histogram of the number of unprovoked attacks by alligators on people in Florida over a
33-year period. The classes are “1 ≤ attacks < 3,” “3 ≤ attacks < 5,” and so on.
(a) What is the overall shape of the distribution?
(b) What is the approximate median of the yearly counts of alligator attacks?
(c) Here is a time plot for the alligator data. Connect the numbers from 1 to 8 four times, then 1.
[Time plot not reproduced: vertical axis Attacks from 0.0 to 22.5, horizontal axis Coded Year
(1 = 1972, 33 = 2004) from 0 to 40; points are labeled with the digits 1 to 8.]
What overall pattern does this plot show?
(d) Why is the typical number of attacks from 1972 to 2004 not very useful in, say, 2006?
13. The states differ greatly in the kinds of severe weather that afflict them. The table below shows the
average property damage caused by tornadoes per year over the period from 1950 to 1999 in each
of the 50 states and Puerto Rico. (To adjust for the changing buying power of the dollar over time,
all damages were restated in 1999 dollars.)
State        ($millions)   State        ($millions)   State        ($millions)
Alabama            51.88   Louisiana          27.75   Ohio               44.36
Alaska              0.00   Maine               0.53   Oklahoma           81.94
Arizona             3.47   Maryland            2.33   Oregon              5.52
Arkansas           40.96   Mass.               4.42   Penn.              17.11
California          3.68   Michigan           29.88   Puerto Rico         0.05
Colorado            4.62   Minnesota          84.84   Rhode Island        0.05
Connecticut         2.26   Mississippi        43.62   S Carolina         17.19
Delaware            0.27   Missouri           68.93   S Dakota           10.64
Florida            37.32   Montana             2.27   Tennessee          23.47
Georgia            51.68   Nebraska           30.26   Texas              88.60
Hawaii              0.34   Nevada              0.10   Utah                3.57
Idaho               0.26   N Hampshire         0.66   Vermont             0.24
Illinois           62.94   New Jersey          2.94   Virginia            7.42
Indiana            53.13   New Mexico          1.49   Washington          2.37
Iowa               49.51   New York           15.73   W Virginia          2.14
Kansas             49.28   N Carolina         14.90   Wisconsin          31.33
Kentucky           24.84   N Dakota           14.69   Wyoming             1.78
(a) Here is a histogram of the data, with classes “0 ≤ damage < 10,” “10 ≤ damage < 20,” and so
on.
Describe the shape, center, and spread of the distribution. Which states may be outliers? (To
understand the outliers, note that most tornadoes in largely rural states such as Kansas cause
little property damage. Damage to crops is not counted as property damage.)
(b) Here is the “default” histogram that the calculator makes for these data.
How does this calculator histogram compare with the graph in (a)?
(c) What are the top five states for tornado damage? The bottom five? (Include Puerto Rico,
though it is not a state.)
(d) Give the five-number summary. Explain why you can see from these five numbers that the
distribution is strongly skewed to the right.
(e) The histogram suggests that a few states are outliers. Show that there are no suspected outliers
according to the 1.5 × IQR rule. (You see once again that a rule is not a substitute for plotting
your data.)
(f) Find the mean property damage. Explain why the mean and median differ so greatly for this
distribution.
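Parts (d) through (f) can be checked numerically with a sketch like the one below. It assumes the damage values are transcribed correctly from the table in Question 13 and that each quartile is the median of the observations on its side of the overall median. Other software may use a different quartile rule and give slightly different quartiles.

```python
import statistics

# Average annual tornado property damage ($ millions), transcribed from the
# table in Question 13 in the table's state order (Alabama ... Wyoming).
damage = [
    51.88, 0.00, 3.47, 40.96, 3.68, 4.62, 2.26, 0.27, 37.32, 51.68, 0.34,
    0.26, 62.94, 53.13, 49.51, 49.28, 24.84, 27.75, 0.53, 2.33, 4.42, 29.88,
    84.84, 43.62, 68.93, 2.27, 30.26, 0.10, 0.66, 2.94, 1.49, 15.73, 14.90,
    14.69, 44.36, 81.94, 5.52, 17.11, 0.05, 0.05, 17.19, 10.64, 23.47, 88.60,
    3.57, 0.24, 7.42, 2.37, 2.14, 31.33, 1.78,
]

xs = sorted(damage)
half = len(xs) // 2

# Quartiles as medians of the lower and upper halves (median itself excluded).
q1 = statistics.median(xs[:half])
median = statistics.median(xs)
q3 = statistics.median(xs[-half:])
mean = statistics.mean(damage)

# Upper cutoff of the 1.5 x IQR rule; the lower cutoff is negative here,
# so no value can fall below it.
upper_fence = q3 + 1.5 * (q3 - q1)

print("five-number summary:", xs[0], q1, median, q3, xs[-1])
print("mean:", round(mean, 2))
print("upper fence:", round(upper_fence, 2), "largest value:", xs[-1])
```

Comparing the distance from the median to each quartile, and the mean to the median, gives the numerical evidence of right skew that parts (d) and (f) ask about.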