Test 7D (Cumulative)
AP Statistics
Name:
Part 1: Multiple Choice. Circle the letter corresponding to the best answer.
1. According to the U.S. Census Bureau, 23.5% of people in the United States are under the age of 18.
In a random sample of 250 residents of a small town in Ohio, 28% of the sample was under 18.
Which one of the following statements is true?
(a) 23.5% and 28% are statistics, 250 and 18 are parameters.
(b) 23.5% and 28% are parameters, 250 and 18 are statistics.
(c) 23.5% and 28% are parameters, 18 is a statistic.
(d) 28% is a parameter and 23.5% is a statistic.
(e) 23.5% is a parameter and 28% is a statistic.
2. Vermont is particularly beautiful in early October, during “foliage season.” At that time of year, a
large proportion of cars on Interstate 91 near Brattleboro have out-of-state license plates. Suppose
a Vermont State Trooper randomly selects 50 cars driving past Exit 2 on I-91 and calculates the
proportion of cars with out-of-state plates. Which of the following does the sampling distribution
of this proportion describe?
(a) The distribution of cars with out-of-state plates in his sample.
(b) The distribution of all cars passing this exit on I-91 with out-of-state plates.
(c) The distribution of sample proportions from all possible samples of size 50 from the population
of cars passing this exit on I-91.
(d) The distribution of sample proportions from all samples of size 50 that the trooper actually
collects.
(e) The proportion of cars with out-of-state plates from all the cars the trooper could possibly
sample.
3. Which of the following sources of error are taken into account by the sampling distributions of
means?
I. Error associated with voluntary response.
II. Error associated with undercoverage.
III. Sampling variability
(a) I only
(b) II only
(c) III only
(d) II and III
(e) I, II, and III
©BFW Publishers
The Practice of Statistics for AP*, 5/e
4. There are 1200 students at Highland High School. The school newspaper conducts a poll that asks
200 randomly selected students how many hours of sleep they got last night. They find that the
mean hours of sleep is 6.7 hours and the standard deviation is 2 hours. Can we estimate the
standard deviation of the sampling distribution of means by using
$s/\sqrt{n} = 2/\sqrt{200} \approx 0.141$ hours?
(a) No, because “hours of sleep” is not a continuous variable.
(b) No, because we can’t be sure that the central limit theorem applies to this situation.
(c) No, because we don’t know that the distribution of sleep hours for the population is
approximately Normal.
(d) No, because the sample is more than 10% of the population.
(e) Yes, because all the conditions for using this formula have been met.
5. Which of the following is an accurate restatement of the central limit theorem?
(a) For sufficiently large samples of size n from a population with standard deviation σ, the
standard deviation of the sampling distribution of means is $\sigma/\sqrt{n}$, regardless of the
shape of the population distribution.
(b) For sufficiently large samples of size n from a Normally distributed population, the sampling
distribution of means is approximately Normal.
(c) For sufficiently large samples of size n from any population, the sampling distribution of means
is approximately Normal, regardless of the shape of the population distribution.
(d) If a sample consists of n independent observations from any population whose standard
deviation is σ, the standard deviation of the sampling distribution of means is given by
$\sigma/\sqrt{n}$.
(e) If a sample consists of n independent observations from any population whose standard
deviation is σ, the sampling distribution of means is approximately Normal.
6. Suppose we want to compare the proportion of Texas residents (population 26 million) and the
proportion of Wyoming residents (population 526 thousand) who have purchased items through
online auction sites. We would like to select samples from each state in such a way that the
sampling distributions have roughly equal variances. If the population proportions are nearly the
same in each state, which one of the following statements is true?
(a) Since the population sizes are so different, it’s impossible to produce sample proportions whose
sampling distributions have roughly the same variance.
(b) Since the population sizes are random variables, we cannot estimate the variances of the
sampling distributions.
(c) The variances of the sampling distributions will be nearly the same if we sample the same
percentage of residents from each state (for example, 0.5% of Texas residents and 0.5% of
Wyoming residents).
(d) The variances of the sampling distributions will be nearly the same if we sample a higher
percentage of Texas residents to compensate for the larger population size.
(e) The variances of the sampling distributions will be nearly the same if we sample the same
number of residents of each state.
7. Consider a population of field mice with a mean weight of 46 grams and a standard deviation of 8
grams. You collect a simple random sample of 15 mice. Which one of the following quantities is
a random variable?
I. The mean weight of the sample of 15 field mice.
II. The mean of the sampling distribution of means for the weight of samples of 15 mice.
III. The mean weight of the entire population of mice.
(a) I only
(b) II only
(c) III only
(d) I and II
(e) I, II, and III
8. A forester who wants to evaluate the health of maple trees in a large forest randomly selects 10
locations in the forest and creates 20-meter diameter circles with each location as a center (making
sure none of the circles overlap). He then evaluates all the maple trees in each circle. Which one
of the following sampling methods is he using?
(a) Simple random sample
(b) Stratified random sample
(c) Systematic random sample
(d) Cluster sample
(e) Multistage sample
9. A restaurant maître d’ wants to be able to predict the time customers will have to wait for a table on
the basis of how many names are on her waiting list. She collects data on y = the time a group of
customers have to wait for a table, and x = the number of names already on the waiting list when
that group is added to the list. She finds that the relationship is roughly linear, and calculates the
least-squares regression line ŷ = 2.8 + 3.77x. One group waited 15 minutes when there were 4
names ahead of them on the list. Which expression below represents the residual for this
observation?
(a) 4 − [2.8 + 3.77(15)]
(b) 15 − [2.8 + 3.77(4)]
(c) [2.8 + 3.77(4)] + 15
(d) [2.8 + 3.77(4)] − 15
(e) [2.8 + 3.77(4)] − 4
10. The five-number summary for the lengths of the first 100 words in Robert Fagles’ translation of
Homer’s Odyssey is 2 3 4 5 12. Which one of the following could be the 60th percentile of this
distribution?
(a) 2
(b) 3
(c) 4
(d) 6
(e) 8
Part 2: Free Response
Show all your work. Indicate clearly the methods you use, because you will be graded on the
correctness of your methods as well as on the accuracy and completeness of your results and
explanations.
11. A company that sells bicycles online maintains a telephone help line to assist customers who are
assembling bicycles after they have been delivered. To determine how many “helpers” are needed,
they keep detailed records of the percentage of bicycle purchasers who call in for help. They have
determined that 18% of all buyers call the help line.
(a) Suppose we select a random sample of 25 buyers. What are the mean and standard deviation of
the count of buyers among the 25 who call in for help?
(b) What is the probability that exactly 8 of these 25 buyers call in for help?
(c) Suppose we select a random sample of 200 buyers and calculate the proportion of buyers in the
sample who call in for help. Describe the sampling distribution for this sample proportion.
(d) What is the probability that more than 20% of the buyers in this sample of 200 call in for help?
12. City planners in Carbury have spent many years studying traffic patterns at the intersection of
Main and State Streets. They have determined that the number of cars passing through the
intersection in any randomly-selected one-hour period has a mean of 207 cars and a standard
deviation of 60 cars. The distribution is moderately skewed to the right.
(a) Suppose the planners take a simple random sample of 40 one-hour periods. The sample mean
will be an unbiased estimator of the population mean. In the context of this problem, what is
meant by the term “unbiased”?
(b) Describe the sampling distribution of means for samples of 40 one-hour intervals.
(c) What is the probability that two consecutive simple random samples of size 40 both have
sample means below 190 cars?
(d) The city planners install new road signs directing “through traffic”—cars just passing through
town without stopping—to take alternative routes that avoid this intersection. After the signs
have been in place for two weeks, a single simple random sample of 40 one-hour periods
produces a mean of 190 cars. Do you think this means that the signs have reduced traffic
through this intersection? Support your answer with appropriate probabilities.
Test 7D
Part 1
1. e Since the census involves an entire population, 23.5% is the true proportion of people under 18
in the entire population of the U.S. The 28% came from a sample, so it is a statistic.
2. c The sampling distribution of proportions consists of the proportions from all possible samples
of a given size from the population of interest.
3. c The sampling distribution takes into account only the variability arising from random sampling.
Error arising from methods of data collection must be addressed separately.
4. d We can only apply this formula for estimating the standard deviation of the sampling
distribution of means if the 10% condition has been satisfied.
5. c From the text: “Draw an SRS of size n from any population with mean µ and finite standard
deviation σ. The central limit theorem (CLT) says that when n is large, the sampling
distribution of the sample mean x̄ is approximately Normal.”
6. e Variance (and thus standard deviation) is strongly dependent upon sample size, but as long as
the sample is less than 10% of the population, the size of the population has no meaningful
effect on variance.
7. a Only the sample mean is a random variable (or statistic). The mean of the population and the
mean of the sampling distribution are equal to each other and are fixed parameters.
8. d This is a cluster sample: several groups of individuals in close proximity to each other are
selected randomly, and all members of the group are sampled.
9. b Residual = y(observed) − ŷ = 15 − [2.8 + 3.77(4)].
10. c The 60th percentile is somewhere between the median (50th percentile) and Q3 (75th percentile),
or it could be equal to either one if there are repeated values. Thus the only possible value
among the given choices is 4.
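The arithmetic behind answers 4, 6, and 9 can be checked with a short Python sketch. The finite population correction used here is not part of the answer key; it is a standard adjustment brought in only to illustrate why the 10% condition matters. The values p = 0.30 and n = 1000 for question 6 are hypothetical, since the test gives neither.

```python
import math

def fpc(population_size, sample_size):
    """Finite population correction factor (an assumption, not in the key)."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

# Question 4: 200 of 1200 students is more than 10% of the population.
N_school, n_poll, s = 1200, 200, 2.0
ten_percent_ok = n_poll <= 0.10 * N_school
naive_se = s / math.sqrt(n_poll)                  # s / sqrt(n)
corrected_se = naive_se * fpc(N_school, n_poll)   # noticeably smaller

# Question 6: same sample size in both states (hypothetical p and n).
p, n = 0.30, 1000
base_se = math.sqrt(p * (1 - p) / n)
se_texas = base_se * fpc(26_000_000, n)
se_wyoming = base_se * fpc(526_000, n)

# Question 9: residual = observed wait - predicted wait.
residual = 15 - (2.8 + 3.77 * 4)

print(f"Q4: 10% condition met? {ten_percent_ok}")
print(f"Q4: naive SE = {naive_se:.4f}, corrected SE = {corrected_se:.4f}")
print(f"Q6: SE Texas = {se_texas:.5f}, SE Wyoming = {se_wyoming:.5f}")
print(f"Q9: residual = {residual:.2f} minutes")
```

With equal sample sizes the two state standard errors agree closely, while the school poll's corrected standard error differs visibly from the naive one because the sample is a large share of the population.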
Part 2
11. (a) The count of buyers who call the help line has a binomial distribution with n = 25 and p = 0.18.
So $\mu_X = np = 25 \cdot 0.18 = 4.5$ and $\sigma_X = \sqrt{npq} = \sqrt{25 \cdot 0.18 \cdot 0.82} \approx 1.92$.
(b) X is binomial with n = 25 and p = 0.18, so
$P(X = 8) = \binom{25}{8} (0.18)^{8} (0.82)^{17} \approx 0.0408$.
(c) $\mu_{\hat{p}} = p = 0.18$;
$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.18(0.82)}{200}} \approx 0.0272$.
Since np = 200 ⋅ 0.18 = 36 and n(1 − p) = 200 ⋅ 0.82 = 164,
both of which are greater than 10, the distribution is approximately Normal.
(d) Using the Normal distribution,
$P(\hat{p} > 0.20) = P\left(z > \frac{0.20 - 0.18}{0.0272}\right) = P(z > 0.74) = 1 - 0.7704 = 0.2296$.
Using a calculator, $P(\hat{p} > 0.20) = 0.2310$. Using the binomial distribution, $P(X > 40) = 0.2020$.
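The calculations in answer 11 can be reproduced with Python's standard library. This is a minimal sketch, not the method the answer key prescribes: `binom_pmf` is a helper name introduced here for illustration, and the Normal tail is computed without rounding z to two places, so it lines up with the calculator figure rather than the table figure.

```python
import math
from statistics import NormalDist

p = 0.18  # proportion of all buyers who call the help line

def binom_pmf(k, n, p):
    """P(X = k) for a binomial count with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# (a) Mean and standard deviation of the count among 25 buyers.
n_small = 25
mean_count = n_small * p
sd_count = math.sqrt(n_small * p * (1 - p))

# (b) Probability that exactly 8 of the 25 call.
prob_exactly_8 = binom_pmf(8, n_small, p)

# (c) Sampling distribution of the sample proportion for n = 200.
n_large = 200
sd_phat = math.sqrt(p * (1 - p) / n_large)
large_counts_ok = n_large * p >= 10 and n_large * (1 - p) >= 10

# (d) P(p-hat > 0.20): Normal approximation, then the exact binomial tail.
normal_tail = 1 - NormalDist(mu=p, sigma=sd_phat).cdf(0.20)
exact_tail = sum(binom_pmf(k, n_large, p) for k in range(41, n_large + 1))

print(f"(a) mean = {mean_count:.2f}, sd = {sd_count:.2f}")
print(f"(b) P(X = 8) = {prob_exactly_8:.4f}")
print(f"(c) sd of p-hat = {sd_phat:.4f}, large counts ok: {large_counts_ok}")
print(f"(d) Normal tail = {normal_tail:.4f}, exact tail = {exact_tail:.4f}")
```

More than 20% of 200 buyers means more than 40 callers, which is why the exact sum starts at 41.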
12. (a) The mean number of cars in a sample of 40 one-hour intervals is an unbiased estimator
because the mean of its sampling distribution is the same as the mean number of cars in the population
of all one-hour intervals.
(b) $\mu_{\bar{x}} = \mu_X = 207$, $\sigma_{\bar{x}} = \frac{\sigma_X}{\sqrt{n}} = \frac{60}{\sqrt{40}} \approx 9.49$.
Since n = 40 ≥ 30, the distribution is approximately Normal by the central limit theorem.
(c) $P(\bar{x} < 190) = P\left(z < \frac{190 - 207}{60/\sqrt{40}}\right) = P(z < -1.79) = 0.0367$.
The probability of two consecutive (and independent) samples having means below 190 is therefore
(0.0367) ⋅ (0.0367) = 0.00135 by the multiplication rule for independent events.
(d) If the mean number of cars per one-hour period were still 207, the probability of getting a sample
mean of 190 or lower would be 0.0367. This is low enough for us to suspect that the true mean (after the signs
have been in place for two weeks) is now lower than 207. We can conclude that the signs have
reduced traffic through the intersection. [Note: since this probability is not extremely low, a similar
and equally legitimate argument could be made that the traffic has not changed.]
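The probabilities in answer 12 can be checked numerically, and the Normal approximation can be compared against a simulation. The sketch below makes one loud assumption: the problem says only that hourly counts are "moderately skewed to the right," so a gamma distribution with mean 207 and standard deviation 60 stands in for the unknown population. The seed and the number of repetitions are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean

mu, sigma, n = 207, 60, 40

# (b) Standard deviation of the sampling distribution of the mean.
se_mean = sigma / math.sqrt(n)

# (c) Normal approximation for one sample, then for two independent samples.
p_below_190 = NormalDist(mu, se_mean).cdf(190)
p_two_below = p_below_190 ** 2

# Simulation under an ASSUMED right-skewed population (gamma distribution).
shape = (mu / sigma) ** 2   # chosen so the gamma mean is 207
scale = sigma ** 2 / mu     # and the gamma standard deviation is 60
rng = random.Random(7)      # fixed seed so the run is repeatable
reps = 5000
hits = sum(
    fmean(rng.gammavariate(shape, scale) for _ in range(n)) < 190
    for _ in range(reps)
)
simulated_p = hits / reps

print(f"standard error = {se_mean:.2f}")
print(f"P(x-bar < 190), Normal approximation = {p_below_190:.4f}")
print(f"P(two consecutive means < 190) = {p_two_below:.5f}")
print(f"P(x-bar < 190), simulated = {simulated_p:.4f}")
```

If the simulated proportion lands near the Normal figure, that supports relying on the central limit theorem with n = 40 for a population skewed this much; a different assumed population shape could give a somewhat different result.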