Download Chapter 1 - Brandywine School District

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Statistics Midterm Review
Name_______________________
CHAPTER 1 MULTIPLE CHOICE QUESTIONS
1. A psychologist wants to know if college students with normal vision and adults with normal vision
can be fooled by a certain optical illusion. She recruits 50 students from across her university and 50
adults from the town, shows them the optical illusion and finds that 42 of the students and 39 of the
adults are fooled by the illusion. The population in this study is
(a) the 42 students who were fooled.
(b) the 50 students who served as subjects.
(c) all students in the PSY 120 class.
(d) all adults with normal vision.
2. The psychologist in the previous question carried out:
(a) a census
(b) an experiment
(c) an observational study
(d) a sample survey
3. At the beginning of the school year, a high-school teacher asks every student in her classes to fill out a
survey that asks for their age, gender, the number of years they have lived at their current address,
their favorite school subject, and whether they plan to go to college after high school. Which of the
following best describes the variables that are being measures?
(a)
(b)
(c)
(d)
(e)
four quantitative variables.
five quantitative variables.
two categorical variables and two quantitative variables.
two categorical variables and three quantitative variables.
three categorical variables and two quantitative variables.
4. When we take a census, we attempt to collect data from
(a) a random sample
(b) every individual selected in a random sample
(c) every individual in the population
(d) a random sample of individuals in the United States.
(e) a random sample of individuals in any single country.
The next three questions concern the following scenario:
Your school has 1000 students. You ask 80 of them, “Have you had a cold in the last three months?”
Fifty-six percent of them answered “yes.”
5. This is an example of
(a) an experiment.
(b) a census. (c) an observational study.
(d) both (b) and (c).
(e) none of these.
6. In this situation, the word “no” could be described as
(a) a categorical variable.
(b) a quantitative variable.
(c) a sample.
(d) a value of the variable “had a cold in the last three months.”
(e) an individual.
7. Which of the following statements best describes what you can conclude from this result?
(a) If you sampled a different group of 80 students, you would get the same result: 56% would answer
yes.
(b) 560 of the students in the school would answer yes to the same question.
(c) If you asked a smaller group of 40 students, the percentage of students who answered yes would
be 28%.
(d) A different group of 80 students would probably produce a slightly different result—close to 56%,
but not exactly the same.
(e) Both (a) and (b).
CHAPTER 2 MULTIPLE CHOICE QUESTIONS
1. A company database contains the following information about each employee: age, date hired, sex
(male or female), ethnic group (Asian, black, Hispanic, etc.), job category (clerical, management,
technical, etc.), yearly salary. Which of the following lists of variables are all categorical?
(a) age, sex, ethnic group.
(b) sex, ethnic group, job category.
(c) ethnic group, job category, yearly salary.
(d) yearly salary, age.
(e) age, date hired.
2. According to the student newspaper, the mean salary of male full professors in the School of
Management is $117,302. The median of these salaries
(a) would be lower, because salary distributions are skewed to the left.
(b) would be lower, because salary distributions are skewed to the right.
(c) would be higher, because salary distributions are skewed to the left.
(d) would be higher, because salary distributions are skewed to the right.
3. Jorge's score on Exam 1 in his statistics class was at the 64th percentile of the scores for all students.
His score falls
(a) between the minimum and the first quartile
(b) between the first quartile and the median
(c) between the median and the third quartile
(d) between the third quartile and the maximum
4. Which of these statements about the standard deviation s is true?
(a) s is always 0 or positive.
(b) s should be used to measure spread only when the mean x is used to measure center.
(c) s is a number that has no units of measurement.
(d) Both (a) and (b), but not (c).
(e) All of (a), (b), and (c).
5. A well-drawn histogram should have
(a) bars all the same width
(b) no space between bars (unless a class has no observations)
(c) a clearly marked vertical scale
(d) all of these
6. For a distribution that is skewed to the left, usually
(a) the mean will be larger than the median
(b) the median will be larger than the mean
(c) the first quartile will be larger than the third quartile
(d) the standard deviation will be negative
(e) the minimum will be larger than the maximum
Here are boxplots of the number of calories in 20 brands of beef hot dogs, 17 brands of meat hot dogs,
and 17 brands of poultry hot dogs.
7. The main advantage of boxplots over stemplots and histograms is
(a) boxplots make it easy to compare several distributions, as in this example
(b) boxplots show more detail about the shape of the distribution
(c) boxplots use the five-number summary, whereas stemplots and histograms use the mean and standard
deviation
(d) boxplots show skewed distributions, whereas stemplots and histograms show only symmetric
distributions
8. This plot shows that
(a) all poultry hot dogs have fewer calories than the median for beef and meat hot dogs
(b) about half of poultry hot dog brands have fewer calories than the median for beef and meat hot dogs
(c) hot dog type is not helpful in predicting calories, because some hot dogs of each type are high and
some of each type are low
(d) most poultry hot dog brands have fewer calories than most beef and meat hot dogs, but a few poultry
hot dogs have more calories than the median beef and meat hot dog
9. We see from the plot that the median number of calories in a beef hot dog is about
(a) 190
(b) 179
(c) 153
(d) 139
(e) 129
10. The box in each boxplot marks
(a) the full range covered by the data
(b) the range covered by the middle half of the data
(c) the range covered by the middle three-quarters of the data
(d) the span one standard deviation on each side of the mean
(e) the span two standard deviations on each side of the mean
11. The calorie counts for the 17 poultry brands are:
129 132 102 106 94 102 87 99 170 113 135 142 86 143 152 146 144
The median of these values is
(a) 129
(b) 132
(c) 130.5
(d) 121
(e) 170
12. The first quartile of the 17 poultry hot dog calorie counts is
(a) 99
(b) 102
(c) 100.5
(d) 143.5
(e) 143
Here are the number of hours that each of a group of students studied for this exam:
2 4 22 6 1 4 1 5 7 4
The next six questions use these data.
13. What is the median number of study hours?
(a) 2.5
(b) 4
(c) 4.5
(d) 5.6
(e) 56
14. What is the mean number of study hours?
(a) 2.5
(b) 4
(c) 4.5
(d) 5.6
(e) 56
15. What is the third quartile of the number of study hours?
(a) 1.5
(b) 2
(c) 5.5
(d) 6
(e) 6.5
16. What is the standard deviation of the number of study hours?
(a) 5.8
(b) 6.1
(c) 37.2
(d) 33.4
(e) 56
17. The data contain one high outlier (22 hours). Which of your results for the previous four questions
would change if this were 7 hours instead of 22 hours? (You do not need to calculate new values.)
(a) All four would change.
(b) The mean, the third quartile, and the standard deviation would change.
(c) The mean and the standard deviation would change.
(d) Only the mean would change.
(e) None of the four would change.
18. Which of your results for the same four questions are measured in hours?
(a) All four are measured in hours.
(b) All except the standard deviation are measured in hours.
(c) Only the median and the mean are measured in hours.
(d) Only the median is measured in hours.
(e) None of the four are measured in hours.
19. A stemplot is
(a) the same as a boxplot
(b) a part of a small tree
(c) a picture of a distribution
(d) the same as a histogram
(e) useful for describing categorical variables
20. An outlier will usually have a large effect on the
(a) first quartile
(b) mean
(c) third quartile
(d) median
21. Fifty percent of the observations will be at or above the
(a) maximum
(b) mean
(c) median
(e) sample size
(d) third quartile
22. Here is a set of data: 1300, 18, 25, 19, -7, 24. Which observation is the outlier?
(a) 1300
(b) 25
(c) 19
(d) -7
(e) 24
23. The five numbers in the five-number summary are
(a) the five smallest observations
(b) the five largest observations
(c) the mean, the median, the maximum, the standard deviation and the sample size
(d) x1, x2, x3, x4, and x5.
(e) the median, the quartiles, the minimum and the maximum
The next four questions are based on the following stemplot of 12 exam scores. (The stem
is tens and the leaf is units.):
24. The median is
(a) 84
(b) 85
(c) 86
(d) 88
25. The mean is
(a) less than the median because the distribution is skewed left
(b) greater than the median because the distribution is skewed left
(c) less than the median because the distribution is skewed right
(d) equal to the median
(e) impossible to determine
(e) 90
26. The first and third quartiles are, respectively,
(a) 76 and 92
(b) 78 and 92
(c) 92 and 78
(d) 92 and 76
(e) None of the above.
27. The five-number summary of these exam scores consists of
(a) the mode, median, mean, first and third quartiles
(b) the max, min, sample size, first and third quartiles
(c) the max, min, median, first and third quartiles
(d) the max, min, mean, first and third quartiles
(e) 76, 80, 84, 88 and 92
CHAPTER 3 MULTIPLE CHOICE QUESTIONS
1. For a normal distribution with mean 20 and standard deviation 5, approximately what percent of the
observations will be between 5 and 35?
(a) 50% (b) 68%
(c) 95%
(d) 99.7%
(e) 100%
2.
Two measures of center are marked on the density curve below.
(a) The median is at the dashed line and the mean is at the solid line.
(b) The median is at the solid line and the mean is at the dashed line.
(c) The mode is at the dashed line and the median is at the solid line.
(d) The mode is at the solid line and the median is at the dashed line.
(e) None of these is correct.
3. Items produced by a manufacturing process are supposed to weigh 90 grams. However, the
manufacturing process is such that there is variability in the items produced and they do not all weigh
exactly 90 grams. The distribution of weights can be approximated by a normal distribution with a
mean of 90 grams and a standard deviation of 1 gram. Using the 68–95–99.7 rule, what percentage of
the items will either weigh less than 88 grams or more than 92 grams?
(a) 0.3%
(b) 3%
(c) 5%
(d) 95%
(e) 99.7%
4. Which of the following is least likely to have a nearly normal distribution?
(a) Heights of all female students taking Statistics at Franklin Academy.
(b) IQ scores of all students taking Statistics at Franklin Academy.
(c) The SAT Math scores of all students taking Statistics at Franklin Academy.
(d) Family incomes of all students taking Statistics at Franklin Academy.
(e) Time from conception to birth of all students taking Statistics at Franklin Academy.
5. Scores on the American College Testing (ACT) college entrance exam follow the normal distribution
with mean 18 and standard deviation 6. Wayne's standard score on the ACT was
-0.7. What was Wayne’s actual ACT score?
(a) 4.2
(b) -4.2
(c) 9.6
(d) 13.8
(e) 22.2
6. In a certain southwestern city the air pollution index averages 62.5 during the year with a standard
deviation of 18.0. Assuming that the 68-95-99.7 rule is appropriate, the index falls within what
interval about 68% of the time?
(a) (8.5, 116.5)
(d) (44.5, 80.5)
(b) (26.5, 98.5)
(e) (44.5, 98.5)
(c) (26.5, 116.5)
The death rates from heart disease per 100,000 people in a group of developed countries were recorded.
The distribution is roughly described by this normal curve:
7. From this normal curve, we see that the mean heart disease death rate per 100,000 people is about:
(a) 60
(b) 120
(c) 190
(d) 250
(e) 400
8.
Which of the following are true statements?
I.
II.
III.
In all normal distributions, the mean and median are equal.
All bell-shaped curves are normal distributions no matter what the particular mean
and standard deviation are.
Virtually all the area under a normal curve is within three standard deviations of the
mean, no matter what the particular mean and standard deviation are.
(a) I only
(d) I, II, and III
(b) I and II
(e) I and III
(c) II and III
9. Which one of the following would be a correct interpretation if you have a z-score of +2.0 on an exam?
(a) It means that you missed two questions on the exam.
(b) It means that you got twice as many questions correct as the average student.
(c) It means that your grade was two points higher than the mean grade on this exam.
(d) It means that your grade was in the upper 2% of all grades on this exam.
(e) It means that your grade is two standard deviations above the mean for this exam.
10. The mean blood pressure for 47-year-old males in the United States is normally distributed with a
mean of 139 mg and a standard deviation of 26 mg. A doctor tells a 47- year-old male patient that he
is in the lowest 10% of all people in this population. Which one of the values below is nearest to the
patient’s actual blood pressure?
(a) 96
(b) 106
(c) 108
(d) 125
(e) 127
CHAPTER 4 MULTIPLE CHOICE QUESTIONS
A study gathers data on the outside temperature during the winter, in degrees Fahrenheit, and the amount
of natural gas a household consumes, in cubic feet per day. Call the temperature x and gas consumption y.
The house is heated with gas, so x helps explain y. The least-squares regression line is y = 1344 – 19x.
The next two questions concern this line.
1. On a day when the temperature is 20 F, the regression line predicts that gas used will be about
(a) 1724 cubic feet
(e) None of these
(b) 1383 cubic feet
(c) 1325 cubic feet
(d) 964 cubic feet
2. When the temperature goes up 1 degree, what happens to the gas usage predicted by the regression
line?
(a) It goes down 1 cubic foot.
(b) It goes up 1 cubic foot.
(c) It goes down 19 cubic feet.
(d) It goes up 19 cubic feet.
(e) Can't tell without seeing the data.
3. The correlation between temperature x and gas usage y is r = –0.7. Which of the following would not
change r?
(a) measuring temperature in degrees Celsius instead of degrees Fahrenheit.
(b) removing two outliers from the data used to calculate r.
(c) measuring gas usage in hundreds of cubic feet, so that all values of y are divided by 100.
(d) Both (a) and (c)
(e) All of (a), (b), and (c)
4. All 753 students in grades 1 through 6 in an elementary school are given a math test that was designed
for third graders. The body weights of all 753 students are also recorded. We expect to see
__________ between the weight of a student and their test score.
(a) a positive association
(b) little or no association
(c) a negative association
(d) either positive or negative association, but it's hard to predict which
5. A study of the effects of television measured how many hours of television each of 125 grade school
children watched per week during a school year and their reading scores. Which variable would you
put on the horizontal axis of a scatterplot of the data?
(a) Hours of television, because it is the response variable.
(b) Hours of television, because it is the explanatory variable.
(c) Reading score, because it is the response variable.
(d) Reading score, because it is the explanatory variable.
(e) It makes no difference, because there is no explanatory-response distinction in this study.
6. The study described in the previous question found that children who watch more television tend to
have lower reading scores than children who watch fewer hours of television. The study report says
that “Hours of television watched explained 9% of the observed variation in the reading scores of the
125 subjects." The correlation between hours of TV and reading score must be
(a) r = 0.09
(b) r = -0.09 (c) r = 0.3
(e) Can't tell from the information given.
(d) r = -0.3
7. There is a close relationship between the correlation r and the slope b of the least-squares regression
line. In particular, it is true that
(a) r and b always have the same sign, which shows whether the variables are positively or negatively
associated.
(b) r and b both always take values between –1 and 1.
(c) the slope b is always at least as large as the correlation r.
(d) the slope b is always equal to r2, the square of the correlation.
(e) Both (a) and (b) are true.
8. If there were something genetic that made people simultaneously more susceptible to both smoking
and lung cancer, this would be an instance of
(a) causation.
(b) confounding.
(c) common response.
(d) the placebo effect.
(e) voluntary response.
9. The United Nations has data on the percent of adult males and females who are illiterate in each of
these 142 countries. The correlation between male illiteracy rate and female illiteracy rate is r = 0.945.
This tells us that
(a) countries with high male illiteracy tend to also have high female illiteracy, and the relationship is
very strong.
(b) countries with high male illiteracy tend to also have high female illiteracy, but the two are only
weakly related.
(c) countries with high male illiteracy tend to have low female illiteracy, and the relationship is very
strong.
(d) countries with high male illiteracy tend to have low female illiteracy, but the two are only weakly
related.
(e) there is very little relationship between the illiteracy rates for males and females.
10. Which of the following relationships is most likely to result in a strong negative correlation?
(a) The fuel efficiency of a car (miles per gallon) and its speed.
(b) The outdoor temperature and the number of fans running in non-air-conditioned dorm rooms.
(c) The comfort rating of a mattress and the number of hours uninterrupted sleep obtained.
(d) The price of a home and its square footage.
(e) The number of people showering in a college dorm and the water pressure in each shower.
11.
x
y
23
19
15
18
26
22
24
20
22
27
29
25
32
32
40
38
41
35
46
45
The regression line for the data given above is y  2.35  0.86 x . What is the residual for the point whose
x-value is 29?
(a) –2.29
(b) –1.71
(c) 1.71
(d) 2.29
(e) 5.15
12. A scatterplot and a least-squares regression line are shown in the figure below. If the point (20, 25) that is
labeled A is removed from the data set, which one of the statements below is TRUE?
25
A
20
Y
15
10
5
0
0
5
10
X
15
20
(a) The slope will decrease and the y-intercept will decrease.
(b) The slope will decrease and the y-intercept will increase.
(c ) The slope will increase and the y-intercept will increase.
(d) The slope will increase and the y-intercept will decrease.
(e) No conclusion can be drawn since the coordinates of the other data points are unknown.
Related documents