Test 7D (Cumulative)    AP Statistics    Name:

Part 1: Multiple Choice. Circle the letter corresponding to the best answer.

1. According to the U.S. Census Bureau, 23.5% of people in the United States are under the age of 18. In a random sample of 250 residents of a small town in Ohio, 28% of the sample was under 18. Which one of the following statements is true?
(a) 23.5% and 28% are statistics, 250 and 18 are parameters.
(b) 23.5% and 28% are parameters, 250 and 18 are statistics.
(c) 23.5% and 28% are parameters, 18 is a statistic.
(d) 28% is a parameter and 23.5% is a statistic.
(e) 23.5% is a parameter and 28% is a statistic.

2. Vermont is particularly beautiful in early October, during “foliage season.” At that time of year, a large proportion of cars on Interstate 91 near Brattleboro have out-of-state license plates. Suppose a Vermont State Trooper randomly selects 50 cars driving past Exit 2 on I-91 and calculates the proportion of cars with out-of-state plates. Which of the following does the sampling distribution of this proportion describe?
(a) The distribution of cars with out-of-state plates in his sample.
(b) The distribution of all cars passing this exit on I-91 with out-of-state plates.
(c) The distribution of sample proportions from all possible samples of size 50 from the population of cars passing this exit on I-91.
(d) The distribution of sample proportions from all samples of size 50 that the trooper actually collects.
(e) The proportion of cars with out-of-state plates from all the cars the trooper could possibly sample.

3. Which of the following sources of error are taken into account by the sampling distributions of means?
I. Error associated with voluntary response.
II. Error associated with undercoverage.
III. Sampling variability.
(a) I only
(b) II only
(c) III only
(d) II and III
(e) I, II, and III

©BFW Publishers The Practice of Statistics for AP*, 5/e

4. There are 1200 students at Highland High School.
The school newspaper conducts a poll that asks 200 randomly selected students how many hours of sleep they got last night. They find that the mean hours of sleep is 6.7 hours and the standard deviation is 2 hours. Can we estimate the standard deviation of the sampling distribution of means by using $s/\sqrt{n} = 2/\sqrt{200} \approx 0.141$ hours?
(a) No, because “hours of sleep” is not a continuous variable.
(b) No, because we can’t be sure that the central limit theorem applies to this situation.
(c) No, because we don’t know that the distribution of sleep hours for the population is approximately Normal.
(d) No, because the sample is more than 10% of the population.
(e) Yes, because all the conditions for using this formula have been met.

5. Which of the following is an accurate restatement of the central limit theorem?
(a) For sufficiently large samples of size n from a population with standard deviation σ, the standard deviation of the sampling distribution of means is $\sigma/\sqrt{n}$, regardless of the shape of the population distribution.
(b) For sufficiently large samples of size n from a Normally distributed population, the sampling distribution of means is approximately Normal.
(c) For sufficiently large samples of size n from any population, the sampling distribution of means is approximately Normal, regardless of the shape of the population distribution.
(d) If a sample consists of n independent observations from any population whose standard deviation is σ, the standard deviation of the sampling distribution of means is given by $\sigma/\sqrt{n}$.
(e) If a sample consists of n independent observations from any population whose standard deviation is σ, the sampling distribution of means is approximately Normal.

6. Suppose we want to compare the proportion of Texas residents (population 26 million) and the proportion of Wyoming residents (population 526 thousand) who have purchased items through online auction sites.
We would like to select samples from each state in such a way that the sampling distributions have roughly equal variances. If the population proportions are nearly the same in each state, which one of the following statements is true?
(a) Since the population sizes are so different, it’s impossible to produce sample proportions whose sampling distributions have roughly the same variance.
(b) Since the population sizes are random variables, we cannot estimate the variances of the sampling distributions.
(c) The variances of the sampling distributions will be nearly the same if we sample the same percentage of residents from each state (for example, 0.5% of Texas residents and 0.5% of Wyoming residents).
(d) The variances of the sampling distributions will be nearly the same if we sample a higher percentage of Texas residents to compensate for the larger population size.
(e) The variances of the sampling distributions will be nearly the same if we sample the same number of residents of each state.

7. Consider a population of field mice with a mean weight of 46 grams and a standard deviation of 8 grams. You collect a simple random sample of 15 mice. Which one of the following quantities is a random variable?
I. The mean weight of the sample of 15 field mice.
II. The mean of the sampling distribution of means for the weight of samples of 15 mice.
III. The mean weight of the entire population of mice.
(a) I only
(b) II only
(c) III only
(d) I and II
(e) I, II, and III

8. A forester who wants to evaluate the health of maple trees in a large forest randomly selects 10 locations in the forest and creates 20-meter diameter circles with each location as a center (making sure none of the circles overlap). He then evaluates all the maple trees in each circle. Which one of the following sampling methods is he using?
(a) Simple random sample
(b) Stratified random sample
(c) Systematic random sample
(d) Cluster sample
(e) Multistage sample

9. A restaurant maître d’ wants to be able to predict the time customers will have to wait for a table on the basis of how many names are on her waiting list. She collects data on y = the time a group of customers has to wait for a table, and x = the number of names already on the waiting list when that group is added to the list. She finds that the relationship is roughly linear, and calculates the least-squares regression line $\hat{y} = 2.8 + 3.77x$. One group waited 15 minutes when there were 4 names ahead of them on the list. Which expression below represents the residual for this observation?
(a) $4 - [2.8 + 3.77(15)]$
(b) $15 - [2.8 + 3.77(4)]$
(c) $[2.8 + 3.77(4)] + 15$
(d) $[2.8 + 3.77(4)] - 15$
(e) $[2.8 + 3.77(4)] - 4$

10. The five-number summary for the lengths of the first 100 words in Robert Fagles’ translation of Homer’s Odyssey is 2, 3, 4, 5, 12. Which one of the following could be the 60th percentile of this distribution?
(a) 2
(b) 3
(c) 4
(d) 6
(e) 8

Part 2: Free Response
Show all your work. Indicate clearly the methods you use, because you will be graded on the correctness of your methods as well as on the accuracy and completeness of your results and explanations.

11. A company that sells bicycles online maintains a telephone help line to assist customers who are assembling bicycles after they have been delivered. To determine how many “helpers” are needed, they keep detailed records of the percentage of bicycle purchasers who call in for help. They have determined that 18% of all buyers call the help line.
(a) Suppose we select a random sample of 25 buyers. What are the mean and standard deviation of the count of buyers among the 25 who call in for help?
(b) What is the probability that exactly 8 of these 25 buyers call in for help?
(c) Suppose we select a random sample of 200 buyers and calculate the proportion of buyers in the sample who call in for help. Describe the sampling distribution for this sample proportion.
(d) What is the probability that more than 20% of the buyers in this sample of 200 call in for help?

12. City planners in Carbury have spent many years studying traffic patterns at the intersection of Main and State Streets. They have determined that the number of cars passing through the intersection in any randomly selected one-hour period has a mean of 207 cars and a standard deviation of 60 cars. The distribution is moderately skewed to the right.
(a) Suppose the planners take a simple random sample of 40 one-hour periods. The sample mean will be an unbiased estimator of the population mean. In the context of this problem, what is meant by the term “unbiased”?
(b) Describe the sampling distribution of means for samples of 40 one-hour intervals.
(c) What is the probability that two consecutive simple random samples of size 40 both have sample means below 190 cars?
(d) The city planners install new road signs directing “through traffic” (cars just passing through town without stopping) to take alternative routes that avoid this intersection. After the signs have been in place for two weeks, a single simple random sample of 40 one-hour periods produces a mean of 190 cars. Do you think this means that the signs have reduced traffic through this intersection? Support your answer with appropriate probabilities.

Test 7D

Part 1

1. e  Since the census involves an entire population, 23.5% is the true proportion of people under 18 in the entire population of the U.S. The 28% came from a sample, so it is a statistic.

2. c  The sampling distribution of proportions consists of the proportions from all possible samples of a given size from the population of interest.

3. c  The sampling distribution takes into account only the variability arising from random sampling. Error arising from methods of data collection must be addressed separately.

4. d  We can only apply this formula for estimating the standard deviation of the sampling distribution of means if the 10% condition has been satisfied. Here the sample of 200 is about 16.7% of the 1200 students.

5. c  From the text: “Draw an SRS of size n from any population with mean µ and finite standard deviation σ. The central limit theorem (CLT) says that when n is large, the sampling distribution of the sample mean $\bar{x}$ is approximately Normal.”

6. e  Variance (and thus standard deviation) depends strongly on sample size, but as long as the sample is less than 10% of the population, the size of the population has no meaningful effect on variance.

7. a  Only the sample mean is a random variable (or statistic). The mean of the population and the mean of the sampling distribution are equal to each other and are fixed values, not random variables.

8. d  This is a cluster sample: several groups of individuals in close proximity to each other are selected randomly, and all members of each group are sampled.

9. b  Residual = $y_{\text{observed}} - \hat{y} = 15 - [2.8 + 3.77(4)]$.

10. c  The 60th percentile is somewhere between the median (50th percentile) and Q3 (75th percentile), or it could be equal to either one if there are repeated values. Thus the only possible value among the given choices is 4.

Part 2

11. (a) The count of buyers who call the help line has a binomial distribution with n = 25 and p = 0.18. So $\mu_X = np = 25 \cdot 0.18 = 4.5$ and $\sigma_X = \sqrt{npq} = \sqrt{25 \cdot 0.18 \cdot 0.82} \approx 1.92$.
(b) X is binomial with n = 25 and p = 0.18, so $P(X = 8) = \binom{25}{8}\, 0.18^{8} \cdot 0.82^{17} \approx 0.0408$.
(c) $\mu_{\hat{p}} = p = 0.18$; $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}} = \sqrt{\dfrac{0.18(0.82)}{200}} \approx 0.0272$. Since $np = 200 \cdot 0.18 = 36$ and $n(1-p) = 200 \cdot 0.82 = 164$, both of which are greater than 10, the distribution is approximately Normal.
(d) Using the Normal distribution, $P(\hat{p} > 0.20) = P\left(z > \dfrac{0.20 - 0.18}{0.0272}\right) = P(z > 0.74) = 1 - 0.7704 = 0.2296$. Using a calculator, $P(\hat{p} > 0.20) = 0.2310$. Using the binomial distribution, $P(X > 40) = 0.2020$.

12. (a) The mean number of cars in a sample of 40 one-hour intervals is an unbiased estimator because the mean of its sampling distribution is the same as the mean number of cars in the population of all one-hour intervals.
(b) $\mu_{\bar{x}} = \mu_X = 207$, $\sigma_{\bar{x}} = \dfrac{\sigma_X}{\sqrt{n}} = \dfrac{60}{\sqrt{40}} \approx 9.49$. Since $n = 40 \geq 30$, the distribution is approximately Normal by the central limit theorem.
(c) $P(\bar{x} < 190) = P\left(z < \dfrac{190 - 207}{60/\sqrt{40}}\right) = P(z < -1.79) = 0.0367$. The probability of two consecutive (and independent) samples having means below 190 is therefore $(0.0367) \cdot (0.0367) = 0.00135$ by the multiplication rule for independent events.
(d) If the mean number of cars per one-hour period were still 207, the probability of getting a sample mean of 190 or lower would be 0.0367. This is low enough for us to suspect that the true mean (after the signs have been in place for two weeks) is now lower than 207. We can conclude that the signs have reduced traffic through the intersection. [Note: since this probability is not extremely low, a similar and equally legitimate argument could be made that the traffic has not changed.]
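
The Part 2 calculations can be cross-checked numerically. The sketch below is not part of the original answer key; it is a minimal check that uses only Python's standard library (`math.comb` and `statistics.NormalDist`), and the helper `binom_pmf` and all variable names are illustrative choices. Small differences from the key are expected because the key rounds z to two decimal places before using a table.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p). Hypothetical helper, not from the test."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Problem 11: 18% of buyers call the help line (given in the problem).
p = 0.18

# 11(a): mean and standard deviation of the count in a sample of 25.
n_small = 25
mean_count = n_small * p
sd_count = sqrt(n_small * p * (1 - p))

# 11(b): probability that exactly 8 of the 25 buyers call.
p_exactly_8 = binom_pmf(8, n_small, p)

# 11(c): standard deviation of the sample proportion for n = 200.
n_large = 200
sd_phat = sqrt(p * (1 - p) / n_large)

# 11(d): P(p-hat > 0.20), first by the Normal approximation (unrounded z),
# then exactly as the binomial probability P(X > 40) = P(X >= 41).
p_normal = 1 - NormalDist(mu=p, sigma=sd_phat).cdf(0.20)
p_binomial = sum(binom_pmf(k, n_large, p) for k in range(41, n_large + 1))

# Problem 12: mean 207 cars, standard deviation 60 cars, samples of size 40.
mu, sigma, n = 207, 60, 40
sd_xbar = sigma / sqrt(n)

# 12(c): P(x-bar < 190), then both of two independent samples below 190.
p_below_190 = NormalDist(mu=mu, sigma=sd_xbar).cdf(190)
p_twice = p_below_190**2

print(f"11(a) mean = {mean_count:.2f}, sd = {sd_count:.4f}")
print(f"11(b) P(X = 8) = {p_exactly_8:.4f}")
print(f"11(c) sd of p-hat = {sd_phat:.4f}")
print(f"11(d) Normal = {p_normal:.4f}, binomial = {p_binomial:.4f}")
print(f"12(b) sd of x-bar = {sd_xbar:.4f}")
print(f"12(c) P(x-bar < 190) = {p_below_190:.4f}, twice = {p_twice:.5f}")
```

The gap between the Normal and binomial answers in 11(d) comes from approximating a discrete count with a continuous curve; either approach matches one of the values reported in the key.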