Statistics Midterm Review
Name ____________________________
The next three questions concern the following scenario:
A recent Gallup poll asked "Do you consider pro wrestling to be a sport, or not?" Of the people asked, 81% said
"No." The results were based on telephone interviews with a randomly selected national sample of 1,028
adults, 18 years and older, conducted August 16-18, 1999.
1. The population for this poll appears to be:
(a) all professional wrestlers.
(b) all adults, 18 years or older.
(c) all adults, 18 years or older, who were interviewed.
(d) all adults who answered “no.”
(e) all fans of professional wrestling who are 18 years or older.
2. The sample for this poll is:
(a) all professional wrestlers.
(b) all adults, 18 years or older.
(c) all adults, 18 years or older, who were interviewed.
(d) all adults who answered “no.”
(e) all fans of professional wrestling who are 18 years or older.
3. The variable measured in this study is
(a) professional wrestlers.
(b) individuals who responded “yes.”
(c) adults.
(d) opinion about whether professional wrestling is a sport.
(e) “yes.”
4. At the beginning of the school year, a high-school teacher asks every student in her classes to fill out a
survey that asks for their age, gender, the number of years they have lived at their current address, their favorite
school subject, and whether they plan to go to college after high school. Which of the following best describes
the variables that are being measured?
(a) four quantitative variables.
(b) five quantitative variables.
(c) two categorical variables and two quantitative variables.
(d) two categorical variables and three quantitative variables.
(e) three categorical variables and two quantitative variables.
5. A forester surveys a sample of trees in a certain state forest and records the following information about each
tree: species, height, diameter of trunk 4 feet above the ground, and type of leaves (needle or broadleaf). The
quantitative variables he recorded are:
(a) height only.
(b) species only.
(c) diameter of trunk only.
(d) (a) and (c)
(e) all four variables are quantitative.
6. The difference between an observational study and an experiment is:
(a) An observational study involves direct measurement of the individuals being studied; experiments do not.
(b) An experiment begins with a question of interest; an observational study produces a question at the end of the
study.
(c) In an experiment, a specific treatment is imposed on the individuals involved.
(d) An observational study involves human subjects; an experiment involves animals, plants, or other nonhuman subjects.
(e) An experiment always takes place in a laboratory; an observational study takes place in the “real world.”
7. Suppose that in a certain population of voters, 60% favor candidate Doe. If you took 40 different samples of
100 voters from this population, which of the following best describes the results you would expect from these
samples?
(a) Sixty percent of the voters in each of the samples would favor candidate Doe.
(b) The percentage of voters in the sample who favor candidate Doe would vary, but most would be close to
60%.
(c) Exactly 60% of the voters in 95% of the samples would favor candidate Doe, but in the other 5%, you
would get a different result.
(d) In 60% of the samples, the majority of voters would favor candidate Doe.
(e) It is not possible to make any generalizations about so many different samples.
8. You have data on the summer earnings of a sample of 1,000 high school students. What kind of graph
should you use to describe the distribution of their earnings?
(a) Bar graph.
(b) Line graph.
(c) Histogram.
(d) Pie chart.
(e) None of these.
Here is a dotplot of the adult literacy rates in 177 countries in 2008, according to the United Nations. For
example, the lowest literacy rate was 23.6%, in the African country of Burkina Faso.
9. The overall shape of this distribution is
(a) clearly skewed to the right
(b) clearly skewed to the left
(c) roughly symmetric
(d) no clear shape
10. The mean of this distribution (don't try to find it) is certainly
(a) very close to the median.
(b) clearly less than the median.
(c) clearly greater than the median.
(d) can't say because the mean is random.
[Figure: dot plot of Literacy_Rate from List_of_countries_by_literacy_rate; horizontal axis marked from 0 to 120]
11. Based on the shape of this distribution, what numerical measures would best describe it?
(a) the five-number summary.
(b) the mean and standard deviation.
(c) the mean and the quartiles.
(d) the median and the standard deviation.
(e) none of these
12. The mean of a distribution of scores is given to be x̄ = 43, with a standard deviation of s = 4. If five is added to each
of the values in the distribution, the new mean and standard deviation will be, respectively:
(a) x̄ = 43 and s = 9
(b) x̄ = 43 and s = 9
(c) x̄ = 43 and s = 20
(d) x̄ = 48 and s = 2
(e) x̄ = 48 and s = 4
13. Last year, students in a Statistics class were given a survey and asked how many cans of soda they had
consumed the week before the survey. It turned out that these students consumed an average of 6.25 cans of
soda with a median of 4 cans and a standard deviation of 6.9 cans. A histogram of the data looks like:
Which histogram above would best represent the distribution of soda consumed the week before the survey
was taken?
14. If a bar graph is to be accurate, it is essential that
(a) the bars touch each other.
(b) the bars be drawn vertically.
(c) both horizontal and vertical scales be clearly marked in equal units.
(d) the bars all have the same width.
(e) the explanatory variable be plotted on the horizontal axis.
15. Which of these statements about the standard deviation s is true?
(a) s is always 0 or positive.
(b) s should be used to measure spread only when the mean x̄ is used to measure center.
(c) s is a number that has no units of measurement.
(d) Both (a) and (b), but not (c).
(e) All of (a), (b), and (c).
16. The five-number summary of the distribution of scores on the final exam in Psych 001 last semester was
18 39 62 76 100. A total of 416 students took the exam. About how many students had scores above
39?
(a) 416
(b) 312
(c) 104
(d) 400
(e) 250
17. The 5-number summary for a univariate data set is given by
{min = 5, Q1 = 18, Med = 20, Q3 = 40, max = 75}. If you wanted to construct a modified boxplot for the
dataset (that is, one that would show outliers, if any existed), what would be the maximum possible length
of the right side “whisker”?
(a) 33
(b) 35
(c) 45
(d) 53
(e) 55
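A modified boxplot is usually drawn with the 1.5 × IQR rule: a whisker may reach no farther than 1.5 × IQR beyond its quartile, and points past that fence are plotted as outliers. The question does not state which rule it uses, so the 1.5 multiplier here is an assumption. The sketch below uses hypothetical quartiles (Q1 = 10, Q3 = 20) rather than the ones in the question, and the function names are illustrative.

```python
def boxplot_fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) under the k * IQR rule (k = 1.5 assumed)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def max_whisker_lengths(q1, q3, k=1.5):
    """Longest possible left and right whiskers, measured from the box edges."""
    lower, upper = boxplot_fences(q1, q3, k)
    return q1 - lower, upper - q3


# Hypothetical quartiles, not the ones given in the question.
print(boxplot_fences(q1=10, q3=20))        # (-5.0, 35.0)
print(max_whisker_lengths(q1=10, q3=20))   # (15.0, 15.0)
```

A real whisker stops at the most extreme data value inside the fence, so these lengths are upper bounds.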
18. Which of the following is likely to have a mean that is smaller than the median?
(a) The salaries of all National Basketball Association players.
(b) Amounts awarded by juries in lawsuits involving injuries.
(c) The prices of homes in a large city.
(d) The finishing times in a long-distance race in which most runners took a long time but a few finished rather quickly.
(e) The scores of students (out of 100 points) on a very easy exam in which most get nearly perfect scores
but a few do very poorly.
19. A biologist has gathered data on a population of bears in the forests of the northeast. A frequency polygon
plot of the weights of the sample of bears and their sex is given below. Based on the plot, which statement
below is TRUE?
[Figure: frequency polygons titled "Weights of Bears", one for each Sex of Bear (Female, Male); horizontal axis Weight from 0 to 480, vertical axis Frequency from 0 to 14]
(a) Since the distributions overlap, there is not much difference between the weights of male and female
bears.
(b) The female bears have a higher mean weight than the male bears and also exhibit more variability in
those weights.
(c) The female bears have a higher mean weight than the male bears and also exhibit less variability in those
weights.
(d) The male bears have a higher mean weight than the female bears and also exhibit more variability in
those weights.
(e) The male bears have a higher mean weight than the female bears and also exhibit less variability in those
weights.
20. For a normal distribution with mean 20 and standard deviation 5, approximately what percent of the
observations will be between 5 and 35?
(a) 50%
(b) 68%
(c) 95%
(d) 99.7%
(e) 100%
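Normal-curve questions like this one come down to how many standard deviations an interval extends on each side of the mean. As a rough check, the sketch below computes those areas from the error function in Python's standard library. The function names and the example values (mean 100, standard deviation 15) are hypothetical and are not taken from any question here.

```python
import math


def prob_within(k):
    """P(|Z| < k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2))


def prob_between(low, high, mean, sd):
    """P(low < X < high) for X ~ Normal(mean, sd)."""
    def cdf(x):
        return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))
    return cdf(high) - cdf(low)


for k in (1, 2, 3):
    # Roughly 68%, 95%, and 99.7% of the area, respectively.
    print(k, round(100 * prob_within(k), 1))

# Hypothetical example: two standard deviations on each side of the mean.
print(round(prob_between(70, 130, mean=100, sd=15), 3))
```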
21. Two measures of center are marked on the density curve above.
(a) The median is at the dashed line and the mean is at the solid line.
(b) The median is at the solid line and the mean is at the dashed line.
(c) The mode is at the dashed line and the median is at the solid line.
(d) The mode is at the solid line and the median is at the dashed line.
(e) None of these is correct.
22. Items produced by a manufacturing process are supposed to weigh 90 grams. However, the
manufacturing process is such that there is variability in the items produced and they do not all weigh
exactly 90 grams. The distribution of weights can be approximated by a normal distribution with a mean of
90 grams and a standard deviation of 1 gram. Using the 68–95–99.7 rule, what percentage of the items will
either weigh less than 88 grams or more than 92 grams?
(a) 0.3%
(b) 3%
(c) 5%
(d) 95%
(e) 99.7%
23. Which of the following is least likely to have a nearly normal distribution?
(a) Heights of all female students taking Statistics at Franklin Academy.
(b) IQ scores of all students taking Statistics at Franklin Academy.
(c) The SAT Math scores of all students taking Statistics at Franklin Academy.
(d) Family incomes of all students taking Statistics at Franklin Academy.
(e) Time from conception to birth of all students taking Statistics at Franklin Academy.
24. Scores on the American College Testing (ACT) college entrance exam follow the normal distribution
with mean 18 and standard deviation 6. Wayne's standard score on the ACT was -0.7. What was Wayne’s
actual ACT score?
(a) 4.2
(b) -4.2
(c) 9.6
(d) 13.8
(e) 22.2
25. The test grades at a large school have an approximately normal distribution with a mean of 50. What is
the standard deviation of the data so that 80% of the students are within 12 points (above or below) the
mean?
(a) 5.875
(b) 9.375
(c) 10.375
(d) 14.5
(e) cannot be determined from the given information
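The two preceding questions both convert between standard scores and raw values, using z = (x − mean) / sd. The sketch below shows the conversion in each direction with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper names and all numbers are hypothetical and deliberately differ from those in the questions.

```python
from statistics import NormalDist


def raw_score(z, mean, sd):
    """Convert a standard score back to the original scale: x = mean + z * sd."""
    return mean + z * sd


def sd_for_central_share(share, half_width):
    """SD that places `share` of a normal distribution within half_width of its mean."""
    z = NormalDist().inv_cdf(0.5 + share / 2)  # z cutoff at the upper edge of the central region
    return half_width / z


# Hypothetical numbers, not those used in the questions.
print(raw_score(z=-1.2, mean=100, sd=15))          # about 82
print(round(sd_for_central_share(0.90, 10), 2))    # about 6.08
```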
The death rates from heart disease per 100,000 people in a group of developed countries were recorded. The
distribution is roughly described by this normal curve:
26. From this normal curve, we see that the mean heart disease death rate per 100,000 people is about:
(a) 60
(b) 120
(c) 190
(d) 250
(e) 400
27. From the normal curve, we see that the standard deviation of the heart disease rate per
100,000 people is closest to
(a) 25
(b) 65
(c) 100
(d) 200
(e) 400
28. Which of the following are true statements?
I. In all normal distributions, the mean and median are equal.
II. All bell-shaped curves are normal distributions no matter what the particular mean and
standard deviation are.
III. Virtually all the area under a normal curve is within three standard deviations of the
mean, no matter what the particular mean and standard deviation are.
(a) I only
(b) I and II
(c) II and III
(d) I, II, and III
(e) I and III
29. Suppose that adult women in China have heights that are normally distributed with mean 155 centimeters
and standard deviation 8 centimeters. Adult women in Japan have heights which are normally distributed
with mean 158 centimeters and standard deviation 6 centimeters. Which country has the higher percentage
of women taller than 167 centimeters?
(a) China
(b) Japan
(c) The percentages are the same.
(d) It is not possible to tell from the information given.
30. Which one of the following would be a correct interpretation if you have a z-score of +2.0 on an exam?
(a) It means that you missed two questions on the exam.
(b) It means that you got twice as many questions correct as the average student.
(c) It means that your grade was two points higher than the mean grade on this exam.
(d) It means that your grade was in the upper 2% of all grades on this exam.
(e) It means that your grade is two standard deviations above the mean for this exam.
31. Blood pressure for 47-year-old males in the United States is normally distributed with a mean of
139 mg and a standard deviation of 26 mg. A doctor tells a 47-year-old male patient that he is in the lowest
10% of all people in this population. Which one of the values below is nearest to the patient’s actual blood
pressure?
(a) 96
(b) 106
(c) 108
(d) 125
(e) 127
A study gathers data on the outside temperature during the winter, in degrees Fahrenheit, and the amount of
natural gas a household consumes, in cubic feet per day. Call the temperature x and gas consumption y. The
house is heated with gas, so x helps explain y. The least-squares regression line is y = 1344 – 19x. The next two
questions concern this line.
32. On a day when the temperature is 20°F, the regression line predicts that gas used will be about
(a) 1724 cubic feet
(b) 1383 cubic feet
(c) 1325 cubic feet
(d) 964 cubic feet
(e) None of these
33. When the temperature goes up 1 degree, what happens to the gas usage predicted by the regression line?
(a) It goes down 1 cubic foot.
(b) It goes up 1 cubic foot.
(c) It goes down 19 cubic feet.
(d) It goes up 19 cubic feet.
(e) Can't tell without seeing the data.
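Predicting from a least-squares line means substituting a value of x into the fitted equation. The sketch below uses the line stated in the scenario, y = 1344 – 19x, and evaluates it at two illustrative temperatures that do not appear in the questions. The function name is hypothetical.

```python
def predicted_gas(temp_f, intercept=1344.0, slope=-19.0):
    """Predicted daily gas use in cubic feet from the line y = 1344 - 19x."""
    return intercept + slope * temp_f


# Illustrative temperatures, not the one asked about in the question.
for temp in (30, 40):
    print(temp, predicted_gas(temp))  # 774.0 at 30 degrees, 584.0 at 40 degrees
```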
34. The correlation between temperature x and gas usage y is r = –0.7. Which of the following would not
change r?
(a) measuring temperature in degrees Celsius instead of degrees Fahrenheit.
(b) removing two outliers from the data used to calculate r.
(c) measuring gas usage in hundreds of cubic feet, so that all values of y are divided by 100.
(d) Both (a) and (c)
(e) All of (a), (b), and (c)
35. All 753 students in grades 1 through 6 in an elementary school are given a math test that was designed for
third graders. The body weights of all 753 students are also recorded. We expect to see __________
between the weight of a student and their test score.
(a) a positive association
(b) little or no association
(c) a negative association
(d) either positive or negative association, but it's hard to predict which
36. A study of the effects of television measured how many hours of television each of 125 grade school
children watched per week during a school year and their reading scores. Which variable would you put on
the horizontal axis of a scatterplot of the data?
(a) Hours of television, because it is the response variable.
(b) Hours of television, because it is the explanatory variable.
(c) Reading score, because it is the response variable.
(d) Reading score, because it is the explanatory variable.
(e) It makes no difference, because there is no explanatory-response distinction in this study.
37. The study described in the previous question found that children who watch more television tend to have
lower reading scores than children who watch fewer hours of television. The study report says that “Hours
of television watched explained 9% of the observed variation in the reading scores of the 125 subjects." The
correlation between hours of TV and reading score must be
(a) r = 0.09
(b) r = -0.09
(c) r = 0.3
(d) r = -0.3
(e) Can't tell from the information given.
38. There is a close relationship between the correlation r and the slope b of the least-squares regression line. In
particular, it is true that
(a) r and b always have the same sign, which shows whether the variables are positively or negatively
associated.
(b) r and b both always take values between –1 and 1.
(c) the slope b is always at least as large as the correlation r.
(d) the slope b is always equal to r², the square of the correlation.
(e) Both (a) and (b) are true.
39. If there were something genetic that made people simultaneously more susceptible to both smoking and
lung cancer, this would be an instance of
(a) causation.
(b) confounding.
(c) common response.
(d) the placebo effect.
(e) voluntary response.
40. The United Nations has data on the percent of adult males and females who are illiterate in each of these
142 countries. The correlation between male illiteracy rate and female illiteracy rate is r = 0.945. This tells
us that
(a) countries with high male illiteracy tend to also have high female illiteracy, and the relationship is very
strong.
(b) countries with high male illiteracy tend to also have high female illiteracy, but the two are only weakly
related.
(c) countries with high male illiteracy tend to have low female illiteracy, and the relationship is very strong.
(d) countries with high male illiteracy tend to have low female illiteracy, but the two are only weakly
related.
(e) there is very little relationship between the illiteracy rates for males and females.
41. Which of the following relationships is most likely to result in a strong negative correlation?
(a) The fuel efficiency of a car (miles per gallon) and its speed.
(b) The outdoor temperature and the number of fans running in non-air-conditioned dorm rooms.
(c) The comfort rating of a mattress and the number of hours of uninterrupted sleep obtained.
(d) The price of a home and its square footage.
(e) The number of people showering in a college dorm and the water pressure in each shower.
42.
x: 23 15 26 24 22 29 32 40 41 46
y: 19 18 22 20 27 25 32 38 35 45
The regression line for the data given above is ŷ = 2.35 + 0.86x.
What is the residual for the point whose x-value is 29?
(a) –2.29
(b) –1.71
(c) 1.71
(d) 2.29
(e) 5.15
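A residual is the observed y minus the y predicted by the regression line. The sketch below applies that definition with the fitted line from the question, but to a different data point, (40, 38). It assumes the listed values pair up as consecutive (x, y) entries, and it reads the intercept and slope as 2.35 and +0.86. The function names are illustrative.

```python
def predict(x, intercept=2.35, slope=0.86):
    """Predicted y from the fitted line y-hat = 2.35 + 0.86x."""
    return intercept + slope * x


def residual(x, y_observed):
    """Residual = observed y minus predicted y."""
    return y_observed - predict(x)


# A point other than the one asked about; assumed to be an (x, y) pair from the table.
print(round(predict(40), 2))        # 36.75
print(round(residual(40, 38), 2))   # 1.25
```

A positive residual means the point lies above the line; a negative one means it lies below.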
43. A scatterplot and a least-squares regression line are shown in the figure below. If the point (20, 25) that is labeled A is
removed from the data set, which one of the statements below is TRUE?
[Figure: scatterplot with least-squares regression line; X axis from 0 to 20, Y axis from 0 to 25; point A labeled at (20, 25)]
(a) The slope will decrease and the y-intercept will decrease.
(b) The slope will decrease and the y-intercept will increase.
(c) The slope will increase and the y-intercept will increase.
(d) The slope will increase and the y-intercept will decrease.
(e) No conclusion can be drawn since the coordinates of the other data points are unknown.
44. In a table of random digits,
(a) each pair of digits 00, 01, 02, . . . , 99 appears exactly once in any row of the table.
(b) any pair of entries is equally likely to be any of the 100 possible pairs 00, 01, 02, . . . , 99.
(c) a specific pair such as 00 cannot be repeated until all other pairs have appeared.
(d) the pair 00 can appear, but 000 is not random and can never appear in the table.
(e) no specific pair such as 00 can occur more than three times in any row.
45. A simple random sample is
(a) any sample selected by using chance.
(b) any sample that gives every individual the same chance to be selected.
(c) a sample that gives every possible sample of the same size the same chance to be selected.
(d) a sample that selects equal numbers of individuals from each stratum.
(e) a sample that contains the same percent of each subgroup in the population.
46. Bias in a sampling method is
(a) any error in the sample result, that is, any deviation of the sample result from the truth about the population.
(b) the random error due to using chance to select a sample.
(c) any error due to practical difficulties such as contacting the subjects selected.
(d) any systematic error that tends to occur in the same direction every time you use this sampling method.
(e) racism or sexism on the part of those who take the sample.
47. An instructor has five sections of a course: A, B, C, D, and E. She wants to randomly select three sections
for a special teaching evaluation. She labels the classes as follows:
A = 1, B = 2, C = 3, D = 4, and E = 5. She starts at the beginning of this list of random digits:
15689 14227 06565 14374
Which classes did she select?
(a) A, E, and A
(b) A and D
(c) A, B, and C
(d) B, C, and D
(e) A, D, and E
48. When Ann Landers asked her readers to tell her "if your sex life has gone downhill after marriage," more
than 100,000 people responded. This is an example of
(a) a voluntary response sample.
(b) a simple random sample.
(c) a stratified sample.
(d) a convenience sample.
(e) a well-designed survey.
49. 53% of the people asked agreed that we should have a third party. The number 53% is a
(a) correlation. (b) parameter. (c) margin of error. (d) confidence level. (e) statistic.
50. You want to take an SRS of 50 of the 816 students who live in a college dormitory. You label the students
001 to 816 in alphabetical order. In the table of random digits you read the entries
96746 12149 37823 71868 18442 35119 62103 39244
The first three students in your sample have labels
(a) 967, 461, 214
(b) 967, 121, 378
(c) 461, 214, 937
(d) 461, 214, 718
(e) 674, 612, 149
51. Another correct choice of labels for the 816 students in the previous question is
(a) 000 to 816 in alphabetical order.
(b) 001 to 816 in order of the student ID numbers.
(c) 000 to 815 in alphabetical order.
(d) Both (b) and (c) are correct.
(e) All of (a), (b), and (c) are correct.
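To choose an SRS from a table of random digits, read the digits in groups as wide as the labels, skip any group that is not a valid label, and skip repeats. The sketch below shows one way to code that procedure. The digit string, the population size of 500, and the function name are all hypothetical and differ from the values in the questions above.

```python
def sample_labels(digits, n, max_label, width=3, min_label=1):
    """Read fixed-width groups from a random-digit string; keep valid, unused labels."""
    stream = digits.replace(" ", "")  # spacing in the table carries no meaning
    chosen = []
    for start in range(0, len(stream) - width + 1, width):
        label = int(stream[start:start + width])
        if min_label <= label <= max_label and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen


# Hypothetical line of random digits and a hypothetical population labeled 001 to 500.
digits = "19223 95034 05756 28713"
print(sample_labels(digits, n=3, max_label=500))  # [192, 239, 405]
```

Here the group 503 is read but skipped because no individual carries that label.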