Download Total - Michigan State University`s Statistics and Probability

STT 231 - 001
PRACTICE EXERCISES SHEET 1A
Example Multiple Choice Questions
1. Tallies and cross-tabulations are used to summarize which of these variable types?
A. Quantitative
B. Mathematical
C. Continuous
D. Categorical
2. The table below summarizes observed data on the gender and party membership of 1000 individuals:
               Party Membership
Gender     Democrat   Republican   Total
Male          300        300         600
Female        200        200         400
Total         500        500        1000
Which one of the following statements about the relationship between gender and party is suggested by the data in the table?
A. There is a relationship between gender and party membership.
B. There is no relationship between gender and party membership.
C. There is a relationship between gender and being a Democrat but not between gender and being a Republican.
D. There is a relationship between gender and party membership for males but not for females.
3. Which one of these variables is a categorical variable?
A. Number of ear pierces a person has
B. Height of a person
C. Weight of a person
D. Opinion about legalization of marijuana
4. Which one of the following variables is not categorical?
A. Age of a person.
B. Gender of a person: male or female.
C. Choice on a test item: true or false.
D. Marital status of a person (single, married, divorced, other)
5. Which of the following is not a term used for a quantitative variable?
A. Measurement variable
B. Numerical variable
C. Continuous variable
D. Categorical variable
6. Listed below is the number of Congressional Medals of Honor awarded in wars fought by the United States.
War             Medals
Civil            1,520
World War I        124
World War II       440
Korean             141
Vietnam            239
Other              105
Total            2,569
What percent of all medals given were awarded during World War I and World War II?
A. 4.83%
B. 17.13%
C. 21.95%
D. 78.01%
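The arithmetic behind Question 6 can be checked with a short sketch. The counts are copied from the table above; the variable names are illustrative only.

```python
# Medal counts copied from the table in Question 6.
medals = {
    "Civil": 1520,
    "World War I": 124,
    "World War II": 440,
    "Korean": 141,
    "Vietnam": 239,
    "Other": 105,
}

# The sum of the rows should match the table's stated total of 2,569.
total = sum(medals.values())

# Combine the two World War rows, then express them as a share of the total.
world_wars = medals["World War I"] + medals["World War II"]
percent_world_wars = 100 * world_wars / total

print(f"{world_wars} of {total} medals = {percent_world_wars:.2f}%")
```

Summing the rows first is a quick consistency check on the table before any percentage is computed from it.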
Questions 7 to 9: In a survey of 1000 adults, respondents were asked about the expense of a college education and the relative
necessity of financial assistance. The respondents were classified as to whether they currently had a child in college or not (college
status), and whether they thought the loan obligation for most college students was too high, about right, or too little (loan obligation
opinion). The table below summarizes some of the survey results. Use these results to answer the following questions.
                              Loan Obligation Opinion
College Status          Too High   About Right   Too Little   Total
Child in College           350          80           10         440
No Child in College        250         200          110         560
7. Which type of variable is Loan Obligation Opinion?
A. Categorical
B. Quantitative
C. Continuous
D. Measurement
8. What role does the variable Loan Obligation Opinion play in this study?
A. Explanatory
B. Response
C. Confounding
D. It plays no role in the study.
9. Which group had the greatest percentage of adults who thought loan obligations are too high?
A. Those adults that have a child in college
B. Those adults that do not have a child in college
C. Both groups have the same percent thinking loan obligations are too high
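Comparing groups of different sizes, as in Question 9, calls for row percentages rather than raw counts. The sketch below computes them from the table for Questions 7 to 9; the function name `row_percentages` is a hypothetical helper, not part of the exercise.

```python
# Counts copied from the table for Questions 7 to 9 (rows are college status).
counts = {
    "Child in College": {"Too High": 350, "About Right": 80, "Too Little": 10},
    "No Child in College": {"Too High": 250, "About Right": 200, "Too Little": 110},
}


def row_percentages(row):
    """Convert one row of counts into percentages of that row's total."""
    row_total = sum(row.values())
    return {opinion: 100 * n / row_total for opinion, n in row.items()}


percentages = {status: row_percentages(row) for status, row in counts.items()}

for status, pct in percentages.items():
    print(f"{status}: {pct['Too High']:.1f}% answered 'Too High'")
```

Dividing by the row total (440 or 560) conditions on college status, which is what makes the two groups comparable.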
10. The EPA sends out a survey to learn about people’s water usage habits. Some of the questions included in the survey are given
below.
Q1. How many times a week do you take a shower?
Q2. Do you leave the water running when you brush your teeth?
Q3. When you water your lawn, how long do you let the water run?
For each question, determine if it leads to categorical responses or quantitative responses.
KEY: Q1 and Q3 lead to quantitative responses, while Q2 leads to categorical responses.
Questions 11 and 12: A USA TODAY/CNN/Gallup Poll conducted April 19, 2005, was based on telephone interviews with 616 U.S.
Catholics. One question asked was “When you think about your commitment to the Catholic Church, how much is your commitment
affected by who the pope is -- a great deal, a moderate amount, not much, or not at all?” The percentages are provided in the table
below.
                   2005 APR 19
Great deal             10%
Moderate Amount        32%
Not Much               37%
Not at all             20%
No Opinion              1%
11. The response variable being measured for this question could be called the Commitment Status. State what type of variable
commitment status is and suggest an appropriate graph that could be made to display the distribution of this variable.
KEY: Commitment status is a categorical variable and an appropriate graph would be a pie chart or bar chart.
12. Approximately how many respondents stated that their commitment is not affected much?
KEY: 37% of 616 is 227.92 so approximately 228 respondents answered not much.
13. The percent of data which lie between the lower and upper quartiles is
A. 10%.
B. 25%.
C. 50%.
D. 75%.
14. A five-number summary for a data set is 35, 50, 60, 70, 90. About what percent of the observations are between 35 and 90?
a. 25%
b. 50%
c. 95%
d. 100%
Questions 15 and 16: In a survey, students are asked how many hours they study in a typical week. A five-number summary of the
responses is: 2, 9, 14, 20, 60.
15. Which interval describes the number of hours spent studying in a typical week for about 50% of the students sampled?
a. 2 to 9
b. 9 to 14
c. 9 to 20
d. 14 to 20
16. Fill in the blank in the following sentence. About 75% of the students spent at least ____ hours studying in a typical week.
a. 9
b. 14
c. 20
d. 45
Questions 17 and 18: The following histogram is for the weights (lbs) of 119 female college students.
(Source: idealwtwomen dataset on the CD.)
17. What is the approximate shape of the distribution?
a. Nearly symmetric.
b. Skewed to the left.
c. Skewed to the right.
d. Bimodal (has more than one peak).
18. The best choice for the median weight for the 119 women based on the histogram is approximately
a. 100 pounds.
b. 110 pounds.
c. 135 pounds.
d. 160 pounds.
Questions 19 to 21: The following histogram shows the distribution of the difference between the actual and “ideal” weights for 119
female college students. Ideal weights are responses to the question “What is your ideal weight?” The difference = actual - ideal.
(Source: idealwtwomen dataset on the CD.)
19. What is the approximate shape of the distribution?
a. Nearly symmetric.
b. Skewed to the left.
c. Skewed to the right.
d. Bimodal (has more than one peak).
20. The median of the distribution is approximately
a. -10 pounds.
b. 10 pounds.
c. 30 pounds.
d. 50 pounds.
21. Most of the women in this sample felt that their actual weight was
a. about the same as their ideal weight.
b. less than their ideal weight.
c. greater than their ideal weight.
d. no more than 2 pounds different from their ideal weight.
Questions 22 and 23: The following histogram is for the weights (lbs) of 63 male college students.
(Source: idealwtmen dataset on the CD.)
22. What is the best description for the approximate shape of this distribution?
a. Nearly symmetric.
b. Skewed to the left.
c. Skewed to the right.
d. Bimodal (has more than one peak).
23. The median weight for the 63 men is approximately
a. 130 pounds.
b. 150 pounds.
c. 180 pounds.
d. 220 pounds.
Questions 24 to 26: The following histogram gives the distribution of the difference between the actual and ideal weights for a
sample of male college students. Ideal weights are responses to the question “What is your ideal weight?” The difference = actual - ideal.
24. What is the approximate shape of the distribution?
a. Nearly symmetric.
b. Skewed to the left.
c. Skewed to the right.
d. Bimodal (has more than one peak).
25. The median of the distribution is approximately
a. -10 pounds.
b. 0 pounds.
c. 10 pounds.
d. 20 pounds.
26. Most of the men in this sample felt that their actual weight was
a. about the same as their ideal weight.
b. less than their ideal weight.
c. greater than their ideal weight.
d. no more than 2 pounds different from their ideal weight.
27. The following boxplot is for the results of the women’s 400-meter dash final race during the 2000 Olympics in Sydney, Australia.
Cathy Freeman won in 49.11 seconds. Choose the correct statement about the boxplot.
A. The median time is more than 50 seconds.
B. The median time is less than 49.75 seconds.
C. The fastest time of 49.11 seconds is an outlier.
D. The slowest time of 51.04 seconds is an outlier.
28. Which of the following provides the most information about the shape of a data set?
A. Boxplot
B. Pie chart
C. Five number summary
D. Stem-and-leaf plot
Questions 29 to 31: According to a national sleep foundation survey, around 31 million Americans are sleep deprived. They also say
women need more sleep than men and are being short-changed. Below are the five number summaries for the number of hours of
sleep at night based on a survey of American men and women.
Men: 5.5, 6, 6.5, 7.5, 9
Women: 4.5, 5, 6, 7, 8
29. Write a sentence to compare men versus women in terms of the median amount of sleep at night.
KEY: The survey shows that the median amount of sleep at night for women is 6 hours, about half an hour less than that for men
(which was 6.5 hours).
30. Write a sentence to compare men versus women in terms of the interquartile range for the amount of sleep at night.
KEY: Based on the survey, about 50% of the men get between 6 and 7.5 hours of sleep at night, while the interquartile range for
women is from 5 to 7 hours of sleep.
31. What percent of women sleep at least 6 hours at night? What percent of men do so?
KEY: Based on the survey, about 50% of the women get at least 6 hours of sleep at night, while 75% of men do so.
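The comparisons in the keys for Questions 29 and 30 follow directly from the two five-number summaries. A minimal sketch, assuming a small hypothetical container class `FiveNumberSummary` that is not part of the exercise:

```python
from typing import NamedTuple


class FiveNumberSummary(NamedTuple):
    """Hypothetical container: minimum, Q1, median, Q3, maximum."""
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    @property
    def iqr(self) -> float:
        # Interquartile range: the width of the middle 50% of the data.
        return self.q3 - self.q1


# Values copied from the summaries given for Questions 29 to 31.
men = FiveNumberSummary(5.5, 6, 6.5, 7.5, 9)
women = FiveNumberSummary(4.5, 5, 6, 7, 8)

print(f"Median difference (men - women): {men.median - women.median} hours")
print(f"IQR for men: {men.iqr} hours; IQR for women: {women.iqr} hours")
```

About 50% of observations lie at or above the median and about 75% at or above Q1, which is how the key for Question 31 reads the value 6 off each summary.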
Questions 32 to 34: A psychologist has developed a new technique intended to improve rote memory. To test
the method against other standard methods, 45 high school students are selected at random and each is taught
the new technique. The data on the number of words memorized correctly by the students were used to create
the following histogram. Note the first class represents the interval [70, 72).
32. What proportion of students memorized correctly at least 94 words?
KEY: A total of 8 out of 45 students did so, for a proportion of 0.178 or 17.8%.
33. What is the overall shape of the distribution of the number of memorized words for these students?
KEY: The histogram is bimodal, thus showing evidence of two subgroups of students that perhaps should not be aggregated (or
combined). The data should be further examined to try to identify the factor that has created these two subgroups.
34. Can we calculate the exact range of the 45 responses? If yes, calculate it. If no, explain why not.
KEY: No, we do not know the exact values for the minimum and maximum. We do know that the smallest
observation is in the class [70, 72) and thus could be 70 or 71 words. Likewise the largest observation
falls in the class [98, 100) and could be 98 or 99 words.
35. Which one of the following statements is most correct about a skewed dataset?
A. The mean and median will usually be different.
B. The mean and median will usually be the same.
C. The mean will always be higher than the median.
D. Whether the mean and median are the same depends on whether the data set is skewed to the right or to the left.
Questions 36 to 38: Listed below is a stem-and-leaf plot of the times it took 13 students to drink a 12 ounce beverage. Values for
stems represent seconds and values for leaves represent tenths of a second.
3| 1234
3| 5
4| 0
5| 6
6| 11379
7|
8| 2
36. What was the median time to drink the beverage?
a. 3.5 seconds.
b. 4.0 seconds.
c. 5.6 seconds.
d. 6.9 seconds.
37. The lower quartile is
a. 3.1 seconds.
b. 3.35 seconds.
c. 3.4 seconds.
d. 3.5 seconds.
38. The upper quartile is
a. 6.9 seconds.
b. 6.5 seconds.
c. 6.1 seconds.
d. 5.6 seconds.
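The median and quartiles for Questions 36 to 38 can be computed from the 13 values read off the stem-and-leaf plot. The sketch below assumes the "median of each half" convention for quartiles (the overall median is excluded from both halves when n is odd); other textbooks and software use slightly different conventions, and `quartiles_by_halves` is a hypothetical helper name.

```python
from statistics import median

# The 13 drinking times (seconds) read from the stem-and-leaf plot.
times = [3.1, 3.2, 3.3, 3.4, 3.5, 4.0, 5.6, 6.1, 6.1, 6.3, 6.7, 6.9, 8.2]


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(lower), median(data), median(upper)


q1, med, q3 = quartiles_by_halves(times)
print(f"Q1 = {q1:.2f}, median = {med:.2f}, Q3 = {q3:.2f}")
```

With 13 values the median is the 7th ordered value, and each quartile is the average of the 3rd and 4th values within its half.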
39. Which of the following would indicate that a dataset is skewed to the right?
a. The interquartile range is larger than the range.
b. The range is larger than the interquartile range.
c. The mean is much larger than the median.
d. The mean is much smaller than the median.
40. If an exam was worth 100 points, and your score was at the 80th percentile, then
a. your score was 80 out of 100.
b. 80% of the class had scores at or above your score.
c. 20% of the class had scores at or above your score.
d. 20% of the class had scores at or below your score.
Questions 41 to 43: The table below provides a statistical summary of the number of CDs owned as reported by students in a class
survey done at Penn State University.
Variable     N    Mean   Minimum   Q1   Median   Q3    Maximum
CDs         250    85       0      30     50     100     500
41. Approximately what percent of students own somewhere between 30 and 50 CDs?
a. 50%
b. 25%
c. 20%
d. 4%
42. What is the interquartile range for these data?
a. 500
b. 100
c. 70
d. 30
43. Based on the summary shown, which of the following statements most likely describes the shape of the CDs owned dataset?
a. The summary is evidence that the data are symmetric and bell-shaped.
b. The summary is evidence that the data are symmetric but not bell-shaped.
c. The summary is evidence that the data are skewed to the left.
d. The summary is evidence that the data are skewed to the right.
Questions 44 to 47: The following boxplot gives the distribution of the ratings of a new brand of peanut butter for 50 randomly
selected consumers (100 points possible with higher points corresponding to a more favorable rating).
44. Based on the boxplot,
a. the distribution appears to be skewed to the left.
b. the distribution appears to be skewed to the right.
c. there appear to be outliers at about 60 and 90.
d. there do not appear to be any outliers.
45. The lower quartile is between
a. 50 to 60 points.
b. 60 to 70 points.
c. 70 to 75 points.
d. 80 to 90 points.
46. The median is
a. 60 points.
b. 70 points.
c. 75 points.
d. 80 points.
47. The upper quartile is between
a. 50 to 60 points.
b. 60 to 70 points.
c. 70 to 75 points.
d. 80 to 90 points.
48. A recent study was conducted to compare the age of vehicles in a student parking lot versus those in a faculty parking lot at a
major university. A random sample of 15 cars was taken from each lot and the age of the car was recorded by taking the
current year and subtracting the model year from it. The two boxplots are shown below to summarize these results.
Compare the two distributions based on these side-by-side boxplots.
KEY: Overall the faculty cars are newer with a median age of 3 years compared to a median age of 6 years for students’ cars. There
was (at least) one faculty car that was much older than the rest and thus shown as an outlier at 7 years. The overall range for
the ages of the students’ cars is larger (10 – 1 = 9 years) as compared to the range for the ages of the cars owned by faculty (7 –
0 = 7 years).
49. An outlier is a data value that
a. is larger than 1 million.
b. equals the minimum value in a set of data.
c. equals the maximum value in a set of data.
d. is not consistent with the bulk of the data.
50. Which statistic is not resistant to an outlier in the data?
a. Lower quartile
b. Upper quartile
c. Median
d. Mean
51. Which one of these statistics is unaffected by outliers?
a. Interquartile range
b. Mean
c. Standard deviation
d. Range
52. Which one of the following statistics would be affected by an outlier?
a. Median
b. Standard deviation
c. Lower quartile
d. Upper quartile
53. Which of the following could account for an outlier in a dataset?
a. Natural variability in the measurement of interest.
b. Recording the wrong category for an individual's value of a categorical variable.
c. A symmetric distribution for the measurement of interest.
d. Measuring more than one variable for each individual.
54. By inspection, determine which of the following sets of numbers has the smallest standard deviation.
A. 2, 3, 4, 5
B. 4, 4, 4, 5
C. 0, 0, 5, 5
D. 5, 5, 5, 5
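Question 54 asks for a judgment "by inspection", but the ranking can be confirmed numerically. The sketch below uses the sample standard deviation (`statistics.stdev`, which divides by n - 1); the ordering of the four sets would be the same with the population version.

```python
from statistics import stdev

# The four candidate sets from Question 54.
sets = {
    "A": [2, 3, 4, 5],
    "B": [4, 4, 4, 5],
    "C": [0, 0, 5, 5],
    "D": [5, 5, 5, 5],
}

# Sample standard deviation of each set.
spreads = {label: stdev(values) for label, values in sets.items()}

# The label whose values vary least around their own mean.
smallest = min(spreads, key=spreads.get)

for label, sd in spreads.items():
    print(f"{label}: SD = {sd:.3f}")
print(f"Smallest standard deviation: set {smallest}")
```

A set whose values are all identical has no deviation from its mean at all, so its standard deviation is exactly zero.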
Questions 55 – 59: Here is a stem-and-leaf plot of the exam scores of 30 students.
3 | 9
4 | 599
5 | 378
6 | 3677789
7 | 002333556689
8 | 5678
55. The median of this distribution of exam scores is
(a) 69
(b) 70
(c) 71
(d) 72
(e) 73
56. The first quartile of this distribution of exam scores is
(a) 57
(b) 63
(c) 67
(d) 69
(e) 70
57. If the score 88 is changed to a score 99, the median remains unchanged.
(a) true
(b) false
58. If the score 88 is changed to a score 99, the mean remains unchanged.
(a) true
(b) false
59. The distribution of scores is best described as
(a) symmetrical (b) skewed to the right
(c) skewed to the left
Questions 60 – 64. A sample of 1,000 students was cross-tabulated by sex and score on an SAT Exam. The
contingency table is shown below.
                               SAT Score
Sex       1000 to <1100   1100 to <1300   1300 to <1500   1500 to <1600   Total
Female          60             180             240             120          600
Male            60             150             140              50          400
Total          120             330             380             170         1000
60. For the sample of 1,000 students, males make up what percentage?
(a) 10% (b) 20%
(c) 30%
(d) 40%
(e) 50%
61. What percentage of males scored 1300 or above?
(a) 50.0% (b) 47.5% (c) 19.0% (d) 14.0%
62. What is the risk of scoring below 1100 for females?
(a) 0.1
(b) 0.2
(c) 0.6
(d) 0.01
(e) not possible to determine
(e) 0.9
63. What is the relative risk of scoring below 1100 for females relative to males?
(a) 0.15 (b) 0.2
(c) 0.85
(d) 0.67
(e) 1
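Questions 62 and 63 rest on two definitions: the risk of an outcome in a group is the proportion of that group with the outcome, and the relative risk is the ratio of two such risks. A minimal sketch using the counts from the SAT contingency table (variable names are illustrative):

```python
# Counts from the contingency table: students scoring below 1100, and group sizes.
below_1100 = {"Female": 60, "Male": 60}
group_totals = {"Female": 600, "Male": 400}

# Risk = proportion of each group scoring below 1100.
risk = {sex: below_1100[sex] / group_totals[sex] for sex in below_1100}

# Relative risk for females relative to males: ratio of the two risks.
relative_risk = risk["Female"] / risk["Male"]

print(f"Risk for females: {risk['Female']:.2f}")
print(f"Risk for males:   {risk['Male']:.2f}")
print(f"Relative risk (females relative to males): {relative_risk:.2f}")
```

The two groups have the same count below 1100 but different sizes, which is why the risks differ even though the raw frequencies match.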
64. When comparing the performance of males and females on the SAT Exam using this sample, which is more
appropriate to use?
(a) frequencies
(b) percents
65. Consider this sample of n=3 scores: 1, 12, 14. The mean and standard deviation of this sample are
(a) 9, 8.9
(b) 12, 7
(c) 12, 9
(d) 9, 4.1
(e) none of these
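The mean and sample standard deviation in Question 65 can be worked step by step from their definitions. The sketch below assumes the sample formula, which divides the sum of squared deviations by n - 1.

```python
from math import sqrt

# The sample from Question 65.
scores = [1, 12, 14]
n = len(scores)

# Step 1: the mean is the sum divided by the number of values.
mean = sum(scores) / n

# Step 2: squared deviation of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in scores]

# Step 3: sample variance divides by n - 1, not n.
sample_variance = sum(squared_deviations) / (n - 1)

# Step 4: the standard deviation is the square root of the variance.
sample_sd = sqrt(sample_variance)

print(f"mean = {mean}, squared deviations = {squared_deviations}")
print(f"sample variance = {sample_variance}, sample SD = {sample_sd}")
```

Comparing the computed pair against each listed option is then a direct lookup.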
66. Here is a dotplot of n=19 length measurements in cm. The median length is closest to
(a) 78 cm
(b) 76 cm
(c) 81 cm
(d) 83 cm
(e) 70 cm
[Dotplot: n = 19 measurements on a horizontal axis labeled "length", with ticks at 72.0, 75.0, 78.0, 81.0, 84.0, and 87.0]
Questions 67 – 73: Below is a boxplot of the examination scores of a large class of students.
[Boxplot: horizontal axis labeled "score", with ticks at 15, 25, 35, 45, 55, 65, 75, and 85]
67. Approximately, what percentage of students scored above 47?
(a) 75% (b) 50%
(c) 45%
(d) 25%
(e) 10%
68. The median score is about
(a) 34
(b) 43
(c) 30
(d) 39
(e) 41
69. The lowest score is about
(a) 17
(b) 34
(c) 39
(d) 48
(e) not possible to determine
70. The highest score is about
(a) 17
(b) 57
(c) 75
(d) 85
(e) not possible to determine
71. Approximately, what percentage of students scored between 34 and 39?
(a) 50% (b) 40%
(c) 35%
(d) 25%
(e) 75%
72. The third quartile of this distribution is about
(a) 17
(b) 34
(c) 39
(d) 47
(e) 75
73. The inter-quartile range (IQR) is about
(a) 13
(b) 5
(c) 3
(d) 25%
(e) 75%
QUESTIONS 74 – 78
Here is a stem-and-leaf display of the pulse rate (beats per minute) of 25 women, taken by a researcher at a
health clinic.
5 | 6
6 | 0 4 4
6 | 6 7 8
7 | 1 2 3 3
7 | 6 7 7 7 8 8
8 | 0 1 1 2 4 4 4
8 | 8
74. The distribution of sample pulse rates is best described as
A. Symmetric about its mean
B. Skewed toward large values
C. Skewed toward small values
D. Multimodal
75. The sample median is
A. 76
B. 74
C. 77
D. 80
E. None of these
76. The Inter-quartile range, IQR is
A. 20
B. 21
C. 12
D. 13
E. None of these
77. Which of the following statements is true?
A. The mean is greater than the median
B. The median is greater than the mean
C. The mean is equal to the median
D. The mean is equal to the mode
78. For this sample, the percentage of women with pulse rate more than 70 is closest to
A. 60.6%
B. 60.2%
C. 73.9%
D. 69.6%
E. 72.0%
QUESTIONS 79 – 80
A classroom is furnished with a collection of 32 old chairs, as described in this table:
           No Arms   Arms   Writing Desk
Wood          8        5         3
Plastic       5        7         4
79. Approximately what fraction of all chairs have a writing desk attached?
A. 0.094
B. 0.125
C. 0.22
D. 0.25
E. 0.75
80. What fraction of plastic chairs have a writing desk attached?
A. 0.094
B. 0.125
C. 0.22
D. 0.25
E. 0.75
QUESTIONS 81 - 83
Below is a boxplot of the seal strengths of n = 100 sample potato chip bags.
[Boxplot: horizontal axis labeled "Seal Strength", with ticks from 0 to 100 in steps of 10]
10. The median seal strength is closest to
(a) 6
(b) 15
(c) 25
(d) 30
(e) 20
81. Approximately, what percentage of bags had seal strengths above 15?
A. 50%
B. 5%
C. 10%
D. 25%
E. 75%
82. Assume that Q1 = 3 and Q3 = 15. What is the value of the upper fence?
A. 33
B. 18
C. 15
D. 39
E. 92
83. The distribution of scores is best described as
A. Skewed to the right
B. Skewed to the left
C. Symmetric about its mean
D. Symmetric about its median
84. The weekly salaries (in dollars) of sixteen government workers are listed below.
690   728   491   519   564   535   679   787
813   455   685   489   644   611   356   1086
Identify potential outlier(s), if there is(are) any, in the given data.
A. 356 and 986
B. 611
C. 356
D. 1086
E. None
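Potential outliers such as those asked about in Question 84 are commonly flagged with the 1.5 × IQR rule: any value below Q1 - 1.5 × IQR (the lower fence) or above Q3 + 1.5 × IQR (the upper fence) is a candidate. The sketch below assumes that rule and the median-of-halves convention for quartiles; `iqr_fences` is a hypothetical helper name, and other quartile conventions shift the fences slightly.

```python
from statistics import median

# The sixteen weekly salaries (dollars) from Question 84.
salaries = [690, 728, 491, 519, 564, 535, 679, 787,
            813, 455, 685, 489, 644, 611, 356, 1086]


def iqr_fences(values):
    """Return (lower fence, upper fence) under the 1.5 * IQR rule."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    # For odd n, the middle value is left out of both halves.
    q3 = median(data[half + 1:] if len(data) % 2 else data[half:])
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


lower_fence, upper_fence = iqr_fences(salaries)
outliers = [x for x in salaries if x < lower_fence or x > upper_fence]

print(f"Fences: {lower_fence} to {upper_fence}")
print(f"Potential outliers: {outliers}")
```

The same function applies unchanged to the teaching-experience data in Question 85.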
85. The National Education Association collects data on the number of years of teaching experience of high-school
teachers. A sample taken this year of 19 high-school teachers yielded the following data on number of
years of teaching experience.
16   7   17   25   24   18   19   14   1   9   16   39   34   20   10   18   15   14   21
Identify potential outlier(s), if there is (are) any, in the given data.
A. 34 and 39
B. 1 and 39
C. 1
D. 1, 34 and 39
E. None
86. The Centers for Disease Control lists causes of death for individual states in
2002. The mortality data for one state is given.
In this state, what percent of deaths were from causes not listed here?
A. 34.8%
B. 59.1%
C. 70.1%
D. 29.9%
87. The Centers for Disease Control lists causes of death for individual states
in 2002. The mortality data for one state is given.
Is it reasonable to conclude that, in this state, cancer or respiratory diseases were
the cause of approximately 29% of deaths in 2002?
A Yes, because the percentages can always be added in relative frequency
tables.
B No, because these categories overlap.
C No, because the percentages do not add up to 100%.
D No, because the percentages in relative frequencies tables can never be
added.
E Yes, because these categories do not overlap.
Describe the distribution (shape, center, spread, unusual features).
88. The histogram shows the sizes (in acres) of 169 farms in Lake County. In
addition to describing the distribution, approximate the percentage of farms that
are under 100 acres.
A The distribution of the size of farms in Lake County is symmetric, with
farm sizes ranging from 0 to 450 acres. The mode of the distribution is
between 100 and 150 acres. It appears that 118 of 169 farms are under 100
acres, approximately 70%.
B The distribution of the size of farms in Lake County is skewed to the right.
Most of the farms are smaller than 150 acres, with some larger ones, from
150 to 300 acres. Five farms were larger than the rest, over 400 acres. The
mode of the distribution is between 0 and 50 acres. It appears that 62 of 169
farms are under 100 acres, approximately 37%.
C The distribution of the size of farms in Lake County is skewed to the
right. Most of the farms are smaller than 150 acres, with some larger ones,
from 150 to 300 acres. Five farms were larger than the rest, over 400 acres.
The mode of the distribution is between 0 and 50 acres. It appears that 118
of 169 farms are under 100 acres, approximately 70%.
D The distribution of the size of farms in Lake County is symmetric, with farm
sizes ranging from 0 to 450 acres. The mode of the distribution is between 0
and 50 acres. It appears that 118 of 169 farms are under 100 acres,
approximately 70%.
E The distribution of the size of farms in Lake County is skewed to the right.
Most of the farms are smaller than 50 acres, with some larger ones, from 150
to 300 acres. Five farms were larger than the rest, over 400 acres. The mode
of the distribution is between 0 and 50 acres. It appears that 118 of 169 farms
are under 100 acres, approximately 70%.
Find the median of the data.
89. The stem-and-leaf display shows the results of the mathematics test of 30
students.
A) 79
B) 81
C) 80
D) 87.5
E) 88.5
90. A small company employs a supervisor at $1300 a week, an inventory manager
at $800 a week, 5 stock boys at $400 a week each, and 3 drivers at $500 a week
each.
A $560
B $400
C $500
D $450
E $800
Solve the problem.
91. Here are summary statistics for the normal monthly precipitation (in inches) for
August for 20 different U.S. cities.
Would you describe this distribution as symmetric or skewed?
A) Skewed to the left, because the mean is smaller than the median and the upper quartile is
farther from the median than the lower quartile.
B) Skewed to the right, because the mean is smaller than the median and the lower quartile is
farther from the median than the upper quartile.
C) Skewed to the left, because the mean is smaller than the median and the lower quartile
is farther from the median than the upper quartile.
D) Skewed to the right, because the mean is larger than the median and the upper quartile is
farther from the median than the lower quartile.
E) Symmetric, because the mean and the median are close
92. Here are boxplots of the points scored during the first 10 games of the basketball
season for both Caroline and Alexandra. Summarize the similarities and
differences in their performance so far.
A) The girls have a different average score per game. Caroline is much more consistent,
because her IQR is about 4 points, while Alexandra's is over 15.
B) Both girls have a median score of about 18 points per game. Caroline is much more
consistent, because her IQR is about 6 points, while Alexandra's is over 20.
C) The girls have a different average score per game, but the same median score of about 18
points per game. Their IQRs are different, but this does not give any more information on
the girls' performance.
D) Both girls have a median score of about 18 points per game. Alexandra is much more
consistent, because her IQR is about 15 points, while Caroline's is over 3.
E) Both girls have a median score of about 18 points per game. Caroline is much more
consistent, because her IQR is about 4 points, while Alexandra's is over 15.
Questions 93 - 96
The histogram shows the sizes (in acres) of 169 farms in Lake County.
93. Which summary statistics would you choose to summarize the center in these data?
A. Median
B. Mean
C. Mode
D. Midrange
E. Interquartile Range
94. Which summary statistics would you choose to summarize the spread in these data?
A. Median
B. Mean
C. Mode
D. Range
E. Interquartile Range
F. Standard Deviation
95. Justify your answers to questions 93 and 94.
A. The Median and the Standard Deviation respectively, would be used to summarize the center
and spread, of the sizes, in acres, of the 169 farms in Lake County because the distribution is
right skewed.
B. The Mean and the Interquartile Range respectively, would be used to summarize the center
and spread, of the sizes, in acres, of the 169 farms in Lake County because the distribution is
right skewed.
C. The Midrange and the Range respectively, would be used to summarize the center and spread,
of the sizes, in acres, of the 169 farms in Lake County because the distribution is right skewed.
D. The Median and the Interquartile Range respectively, would be used to summarize the
center and spread, of the sizes, in acres, of the 169 farms in Lake County because the
distribution is right skewed.
E. The Mean and the Interquartile Range respectively, would be used to summarize the center
and spread, of sizes, in acres, of the 169 farms in Lake County because the distribution is right
skewed.
96. From the histogram, would you expect the mean or the median to be larger? Explain.
A. The median is larger than the mean because the distribution of the sizes, in acres, of the 169
farms in Lake County is skewed to the right.
B. The mean is larger than the median because the distribution of the sizes, in acres, of the 169
farms in Lake County is skewed to the right.
C. The mean and the median are basically equal because of the presence of at least one outlier.
D. All of the above
97. Here are the ages for a sample of n = 5 children: 1, 9, 7, 5, 3. The Mean, M, and the Sample
Standard Deviation, SSD, and the Median, MD, are given by
A. M = 4.5; SSD = 3.16; MD = 7
B. M = 5; SSD = 10; MD = 5
C. M = 5.5; SSD = 3.16; MD = 7
D. M = 5; SSD = 3.16; MD = 5
E. M = 6; SSD = 3.4; MD = 7
F. None of the above
QUESTIONS 98 – 101
Here are summary statistics for the sizes (in acres) of Finger Lakes vineyards:
Count                         36
Mean                          46.50 acres
Standard Deviation            47.76
Median                        33.50
Interquartile Range, IQR      36.50
Minimum                       6
Lower Quartile, Q1            18.50
Upper Quartile, Q3            55
Maximum                       250
98. The distribution of the sizes (in acres) of Finger Lakes vineyards is
A. Symmetric
B. Heavily left-skewed
C. Bimodal
D. Heavily right-skewed
99. The Upper Fence
A. 109.75
B. -36.25
C. 250
D. 6
E. None of the above
100. The Lower Fence
A. 109.75
B. -36.25
C. 250
D. 6
E. None of the above
101. The data value ________________________ is an outlier.
A. 109.75
B. -36.25
C. 250
D. 6
E. None of the above
QUESTIONS 102 – 106
A meteorologist preparing a talk about global warming compiled a list of weekly low
temperatures (in degrees Fahrenheit) he observed at his southern Florida home last year. The
coldest temperature for any week was 36°F but he inadvertently recorded the Celsius value of 2°.
Assuming that he correctly listed all the other temperatures, state how this error will affect these
summary statistics.
102. The Mean temperature will
A. Increase
B. Decrease
C. Not change
103. The Median temperature will
A. Increase
B. Decrease
C. Not change
104. The range in temperature will
A. Increase
B. Decrease
C. Stay the same
105. The Interquartile Range (IQR) will
A. Increase
B. Decrease
C. Not change
106. The Standard Deviation in temperature will
A. Increase
B. Decrease
C. Stay (roughly) the same
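The effect of a single mis-recorded minimum, as in Questions 102 to 106, can be explored on made-up data. In the sketch below the list of weekly lows is hypothetical and invented purely for illustration; only the correct value of 36 and the mistaken value of 2 come from the exercise. The quartiles use the median-of-halves convention, which is also an assumption.

```python
from statistics import mean, median, stdev

# HYPOTHETICAL weekly lows (deg F), invented for illustration only.
# Only the 36 (correct) and the 2 (recorded by mistake) come from the exercise.
correct = [36, 48, 52, 55, 58, 60, 63, 65, 70]
recorded = [2 if t == 36 else t for t in correct]


def iqr(values):
    """Interquartile range using the median-of-halves convention."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(upper) - median(lower)


def summary(values):
    """Collect the five statistics the questions ask about."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "iqr": iqr(values),
        "sd": stdev(values),
    }


before, after = summary(correct), summary(recorded)
for name in before:
    print(f"{name:>6}: correct = {before[name]:.2f}, with error = {after[name]:.2f}")
```

Because the mistaken value is still the smallest observation, its rank in the ordered data does not change; statistics built from ranks near the middle respond differently from statistics built from every value or from the extremes.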
QUESTIONS 107 – 109
Some companies offer 401(K) retirement plans to employees, permitting them to shift part of their
before-tax salaries into investments such as mutual funds. Employers typically match 50% of the
employees’ contribution up to about 6% of salary. One company, concerned with what it believed
was a low employee participation rate in its 401(K) plan, sampled 30 other companies with similar
plans and asked for their 401(K) participation rates.
107. Identify the WHO
A. All similar companies offering 401(K) retirement plans
B. 401(K) employee participation rates (in percent)
C. 30 similar companies offering 401(K) retirement plans
D. Retirement plans to employees
E. None of the above
108. Identify WHAT was investigated
A. 30 similar companies offering 401(K) retirement plan
B. All similar companies offering 401(K) retirement plan
C. Retirement plans to employees
D. 401(K) employee participation rates (in percent)
E. None of the above
109. Identify the POPULATION OF INTEREST
A. All similar companies offering 401(K) retirement plan
B. Retirement plans to employees
C. 30 similar companies offering 401(K) retirement plan
D. 401(K) employee participation rates (in percent)
E. None of the above
QUESTIONS 110 – 111
Below is a boxplot of the seal strengths of n = 100 sample potato chip bags.
[Boxplot: horizontal axis labeled "Seal Strength", with ticks from 0 to 100 in steps of 10]
14. The lower quartile, Q1, the median, MD, and the upper quartile, Q3, are respectively?
110. The Lower quartile Q1, the Median MD, the Upper quartile Q3, the Lower Fence LF, the Upper Fence
UF, are respectively?
A. 0, 6, 34, - 15, 33
B. 3, 6, 15, - 15, 33
C. 1, 7, 15, 0, 102
D. 0, 15, 94, 12, 18
E. 1, 15, 94, 15, 33
111. The dataset of the seal strength of the sample of n = 100 potato chip bags has ________________ outliers.
A. 6
B. 5
C. 4
D. 3
E. 2
QUESTIONS 112 – 113
Here are amounts of money (cents) in coins carried by 10 students in a statistics class:
50   35   0   97   76   0   0   87   23   65
112. The mean, M; Median, MD; and Standard deviation, SD, of these data are
A. M = 43.3; MD = 42.5; SD = 35.3
B. M = 42.5; MD = 35; SD = 37.2
C. M = 43.3; MD = 42.5; SD = 37.2
D. M = 42.5; MD = 43.3; SD = 37.2
E. M = 43.3; MD = 38; SD = 37.2
113. The five-number summary of this data is
A. 0, 0, 42.5, 76, 97
B. 0, 0.29, 57.5, 81.5, 97
C. 0, 0.29, 42.5, 75, 97
D. 0, 0, 38, 76, 97
E. None of the above
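The summaries in questions 112 and 113 can be checked with a short script. This is a sketch under two assumptions that the sheet does not spell out: the standard deviation is the sample version (dividing by n - 1), and the quartiles are the medians of the lower and upper halves of the ordered data. Other quartile conventions can give slightly different values. The function name `five_number_summary` is hypothetical.

```python
import statistics

coins = [50, 35, 0, 97, 76, 0, 0, 87, 23, 65]  # cents, from the question

def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles as medians of the two halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]                # values below the median position
    upper = data[len(data) - half:]    # values above the median position
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

mean = statistics.mean(coins)
median = statistics.median(coins)
sd = statistics.stdev(coins)  # sample standard deviation (divides by n - 1)

print(round(mean, 1), median, round(sd, 1))
print(five_number_summary(coins))
```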
114. You look at real estate ads for houses in Naples, Florida. There are many houses ranging from
$200,000 to $500,000 in price. The few houses on the water, however, have prices up to $15 million.
The distribution of house prices will be
A. Skewed to the left
B. Roughly symmetric
C. Bimodal
D. Multi-modal
E. Skewed to the right
115. One measure of the difficulty of a golf course is its length: the total distance (in yards) from tee to hole for
all 18 holes. Here are the summary statistics for the lengths of all the golf courses in Vermont:
COUNT   MEAN         STANDARD DEVIATION   MIN    Q1        MEDIAN   Q3     MAX
45      5892.91 YD   386.59               5185   5585.75   5928     6131   6796
What summary statistics would you use to describe these data?
A. Median and interquartile range (IQR) because the distribution is multimodal.
B. Mean and standard deviation because the distribution is roughly unimodal and symmetric.
C. Mean and range because the distribution is roughly unimodal and symmetric.
D. Median and range because the distribution is roughly unimodal and symmetric.
E. Median and interquartile range (IQR) because the distribution displays hidden outliers.
QUESTIONS 116 – 117
The State Education Department requires local school districts to keep these records on all students: age, race or
ethnicity, days absent, current grade level, standardized test scores in reading and mathematics, and any
disabilities or special educational needs.
116. Identify the WHO.
A. State Education Department
B. Students
C. Records on all students
D. Local School Districts
E. None of the above
117. Identify how many variables the local school districts kept on each student.
A. 5
B. 6
C. 7
D. 4
E. 3
QUESTIONS 118 – 121
Consider the following weights (in kg) for a sample of twenty-three guinea pigs: 0.98 1.14 1.11 1.00 1.00 1.02
1.00 0.95 0.98 1.29 1.06 1.04 1.08 1.01 1.02 1.03 1.18 1.24 0.97 0.99 1.06 1.09 1.19
Here is a stem-and-leaf plot of the sample data
0.957889
1.000012234
1.06689
1.113
1.189
1.24
1.29
118. The distribution of sample weights is best described as
A. Symmetric about its mean
B. Multimodal
C. Skewed toward small values (negative skewness)
D. Skewed toward large values (positive skewness)
119. The sample median weight is
A. 1.00 kg
B. 1.02 kg
C. 1.025 kg
D. 1.03 kg
E. 1.04 kg
120. The 1st quartile (Q1) is
A. 1.11
B. 1.08
C. 1.09
D. 1.00
E. None of these
121. For this sample, the percentage of guinea pigs with weights less than 1.10 kg is
A. 73.9%
B. 65.2%
C. 60.9%
D. 88.6%
E. 78.2%
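The quantities in questions 119 to 121 can be computed directly from the 23 listed weights. This sketch assumes Q1 is taken as the median of the values below the overall median; the sheet does not name its quartile convention, and other conventions may differ slightly.

```python
import statistics

weights = [0.98, 1.14, 1.11, 1.00, 1.00, 1.02, 1.00, 0.95, 0.98, 1.29,
           1.06, 1.04, 1.08, 1.01, 1.02, 1.03, 1.18, 1.24, 0.97, 0.99,
           1.06, 1.09, 1.19]  # kg, from the question

data = sorted(weights)
n = len(data)                            # 23 observations
median = statistics.median(data)         # 12th ordered value, since n is odd
q1 = statistics.median(data[: n // 2])   # median of the 11 values below the median
pct_below = 100 * sum(w < 1.10 for w in data) / n

print(n, median, q1, round(pct_below, 1))
```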
QUESTIONS 122 – 123
A sample of 1,400 children was studied to investigate the relationship of the characteristic carrier of
Streptococcus pyogenes (2 levels) and the characteristic size of tonsils (3 class intervals). The frequency data
are given in the following table.
Tonsil Size (diameter)    Carrier   Noncarrier   Total
0.4 cm - under 0.6 cm     20        500          520
0.6 cm - under 0.8 cm     20        600          620
0.8 cm - 1.0 cm           50        210          260
Total                     90        1310         1400
122. What percentage of the persons with large tonsils (0.8 cm to 1.0 cm in diameter) are carriers?
A. 40%
D. 19.2%
B. 55.6%
E. 16%
C. 28.5%
123. The median tonsil diameter for the 90 carriers of Streptococcus pyogenes is in which interval?
A. 0.4 cm - under 0.6 cm
B. 0.6 cm - under 0.8 cm
C. 620
D. 0.8 cm - 1.0 cm
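Questions 122 and 123 both work from the frequency table: one is a conditional percentage within a row, the other locates the class containing the median of a grouped distribution. This is a minimal sketch of both calculations using the table's counts. The variable names are illustrative, and the median search assumes the halfway point does not fall exactly on a class boundary.

```python
# Frequencies from the table: size class -> (carriers, noncarriers)
table = {
    "0.4 cm - under 0.6 cm": (20, 500),
    "0.6 cm - under 0.8 cm": (20, 600),
    "0.8 cm - 1.0 cm": (50, 210),
}

# Question 122: condition on the size class, then take the carrier share.
carriers, noncarriers = table["0.8 cm - 1.0 cm"]
pct_carriers_large = 100 * carriers / (carriers + noncarriers)

# Question 123: accumulate carrier counts until half of the carriers are covered.
total_carriers = sum(c for c, _ in table.values())
cumulative = 0
median_class = None
for size_class, (c, _) in table.items():
    cumulative += c
    if cumulative >= total_carriers / 2:
        median_class = size_class
        break

print(round(pct_carriers_large, 1), median_class)
```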
QUESTIONS 124 – 126
The Cleveland Casting Plant is a large highly automated producer of gray and nodular iron automotive castings
for Ford Motor Company. The company is interested in keeping the pouring temperature of the molten iron (in
degrees Fahrenheit) close to the specified value of 2550 degrees. Cleveland Casting measured the pouring
temperature for 10 randomly selected crankshafts.
124. Identify the WHO.
A. Cleveland Casting Plant
B. Ford Motor Company
C. The pouring temperature of the molten iron
D. 10 crankshafts at Cleveland Casting Plant
E. None of the above
125. Identify WHAT was investigated.
A. The pouring temperature (in degrees Fahrenheit) of molten iron
B. Cleveland Casting Plant
C. Automotive castings for Ford Company
D. Nodular iron
E. None of the above
126. Identify the POPULATION OF INTEREST.
A. The pouring temperature (in degrees Fahrenheit) of molten iron
B. 10 crankshafts at Cleveland Casting
C. All crankshafts at Cleveland Casting
D. Cleveland Casting Plant
E. Ford Motor Company
QUESTIONS 127 – 128
Here are the annual numbers of deaths from tornadoes in the United States from 1990 through 2000.
53 39 39 33 69 30 25 67 130 94 40
127. The median annual number of deaths from tornadoes in the United States from 1990 through 2000 is
A. 30
B. 40
C. 27.5
D. 39.5
E. 49.5
128. The interquartile range (IQR) of the annual numbers of deaths from tornadoes in the United States from 1990
through 2000 is
A. 98.5
B. 62.5
C. 32
D. 36
E. 68
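The median and IQR for the eleven tornado counts can be verified as follows. The sketch assumes the quartiles are the medians of the lower and upper halves with the overall median excluded, a convention the sheet does not state explicitly.

```python
import statistics

deaths = [53, 39, 39, 33, 69, 30, 25, 67, 130, 94, 40]  # 1990 through 2000

data = sorted(deaths)
half = len(data) // 2
median = statistics.median(data)
q1 = statistics.median(data[:half])               # lower half, median excluded
q3 = statistics.median(data[len(data) - half:])   # upper half, median excluded
iqr = q3 - q1

print(median, q1, q3, iqr)
```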
QUESTIONS 129 – 130
The Men’s Combined Skiing event consists of a downhill and a slalom. Here are the total times (the sum of the
two race times) for the 2006 Olympics:
Count   Mean      Standard Deviation   Min       Q1        Median    Q3        Max
35      196.079   5.80009              189.350   192.238   193.270   200.625   211.890
[Histogram: # of Skiers versus Combined Times (sec)]
129. What is the range of these times?
A. 22.54
B. 189.350
C. 211.890
D. 8.387
E. 2.809
130. Between what values do the central 50% of these times lie?
A. Between 193.270 and 196.079
B. Between 192.238 and 200.625
C. Between 189.350 and 211.890
D. Between 200.625 and 211.890
E. None of the above
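Both questions 129 and 130 can be answered from the summary table alone: the range is the maximum minus the minimum, and the central 50% of the values lie between Q1 and Q3. This short sketch, with illustrative variable names, shows the arithmetic.

```python
# Summary statistics from the table (seconds)
summary = {"min": 189.350, "q1": 192.238, "median": 193.270,
           "q3": 200.625, "max": 211.890}

value_range = summary["max"] - summary["min"]   # spread of all times
central_half = (summary["q1"], summary["q3"])   # bounds of the middle 50%
iqr = summary["q3"] - summary["q1"]             # width of that middle 50%

print(round(value_range, 2), central_half, round(iqr, 3))
```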
QUESTION 131
An investigation compiled information about recent nonmilitary plane crashes (www.planecrashinfo.com). The
causes, to the extent that they could be determined, are summarized in the table.
CAUSE                PERCENT
Pilot Error          40
Other Human Error    5
Weather              6
Mechanical Failure   14
Sabotage             6
Is it reasonable to conclude that the weather or mechanical failures caused only 20% of the recent plane crashes
and what percent of deaths were from causes not listed here?
A. It is not reasonable to conclude that weather or mechanical failures caused only 20% of the recent plane
crashes because these categories overlap. The percent of deaths from causes not listed is 29%.
B. It is not reasonable to conclude that weather or mechanical failures caused only 20% of the recent plane
crashes because the data is left-skewed. The percent of deaths from causes not listed is 29%.
C. It is not reasonable to conclude that weather or mechanical failures caused only 20% of the recent plane
crashes because the sum of the percent column is only 71%. The percent of deaths from causes not listed
is 29%.
D. It is reasonable to conclude that weather or mechanical failures caused only 20% of the recent
plane crashes as long as each plane crash had only one cause, as it appears to be the case. The
percent of deaths from causes not listed is 29%.
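The percentages quoted in the answer options can be reproduced from the table. In this sketch, adding the weather and mechanical-failure percentages is valid only under the assumption that each crash is assigned exactly one cause, so that the categories do not overlap.

```python
# Percent of crashes by cause, from the table
causes = {
    "Pilot Error": 40,
    "Other Human Error": 5,
    "Weather": 6,
    "Mechanical Failure": 14,
    "Sabotage": 6,
}

listed = sum(causes.values())    # share covered by the listed causes
not_listed = 100 - listed        # remainder attributed to causes not listed
weather_or_mechanical = causes["Weather"] + causes["Mechanical Failure"]

print(listed, not_listed, weather_or_mechanical)
```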