Ch1-26 Review during AP EXAM week
Name: ________________________________

AP Multiple Choice Review 1-100

1. A magazine has 1,620,000 subscribers, of whom 640,000 are women and 980,000 are men. Thirty percent of the women read the advertisements in the magazine and 50 percent of the men read the advertisements in the magazine. If a random sample of 100 subscribers is selected, what is the expected number of subscribers in the sample who read the advertisements?
(A) 30
(B) 40
(C) 42
(D) 50
(E) 80

2. A manufacturer makes light bulbs and claims that their reliability is 98 percent. Reliability is defined to be the proportion of non-defective items that are produced over the long term. If the company’s claim is correct, what is the expected number of non-defective light bulbs in a random sample of 1,000 bulbs?
(A) 20
(B) 200
(C) 960
(D) 980
(E) 1,000

3. When a virus is placed on a tobacco leaf, small lesions appear on the leaf. To compare the mean number of lesions produced by two different strains of virus, one strain is applied to half of each of 8 tobacco leaves, and the other strain is applied to the other half of each leaf. The strain that goes on the right half of the leaf is decided by a flip of a coin. The lesions that appear on each half are then counted. The data are given below:

LEAF       1   2   3   4   5   6   7   8
STRAIN 1   31  20  18  17  9   8   10  7
STRAIN 2   18  17  14  11  10  7   5   6

What is the number of degrees of freedom associated with the appropriate t-test for testing to see if there is a difference between the mean number of lesions per leaf produced by the two strains?
(A) 7
(B) 8
(C) 11
(D) 14
(E) 16

4. In a cluster sample:
(A) We randomly select subsets of the population and sample everyone in that subset
(B) An alphabetic list of individuals is used for random selection into the sample.
(C) Individuals are distinguished based on some characteristic such as gender before they are randomly selected from the population
(D) Only convenient members of the population are chosen
(E) None of the above

5.
Which of the following can be used to show a cause-and-effect relationship between two variables?
(A) A census
(B) A controlled experiment
(C) An observational study
(D) A sample survey
(E) A cross-sectional survey

6. The heights of adult women are approximately normally distributed about a mean of 65 inches with a standard deviation of 2 inches. If Rachel is at the 90th percentile in height for adult women, then her height, in inches, is closest to:
(A) 60
(B) 62
(C) 68
(D) 70
(E) 74

7. Sara and Ryan plan to visit a bookstore. Based on their previous visits to this bookstore, the probability distributions of the numbers of books they will buy are given below:

X = Number of books Sara will buy
Y = Number of books Ryan will buy

X      0     1     2
P(X)   0.50  0.25  0.25

Y      0     1     2
P(Y)   0.25  0.50  0.25

Assuming that Sara and Ryan make their decisions independently, what is the probability that they will purchase no books on this visit to the bookstore?
(A) 0.0625
(B) 0.1250
(C) 0.1875
(D) 0.2500
(E) 0.7500

8. Joan’s doctor told her that the standardized score (Z-score) for her systolic blood pressure, as compared to the blood pressure of other women her age, is 1.50. Which of the following is the best interpretation of this standardized score?
(A) Joan’s systolic blood pressure is 150
(B) Joan’s systolic blood pressure is 1.50 standard deviations above the average systolic blood pressure of women her age.
(C) Joan’s systolic blood pressure is 1.50 above the average systolic blood pressure of women her age.
(D) Joan’s systolic blood pressure is 1.50 times the average systolic blood pressure of women her age.
(E) Only 1.5% of women Joan’s age have a higher systolic blood pressure than she does.

9. Every Thursday, Thomas and Sean Video Venture has “roll the dice” day. A customer may choose to roll two fair dice and rent a second movie for an amount (in cents) equal to the numbers uppermost on the dice with the larger number first.
For example, if the customer rolls a 2 and a 4, a second movie may be rented for $0.42. If a 2 and a 2 are rolled, a second movie may be rented for $0.22. Let X represent the amount paid for a second movie on roll the dice day. The expected value of X is $0.47 and the standard deviation of X is $0.15. If a customer rolls the dice and rents a second movie every Thursday for 20 consecutive weeks, what is the total amount that the customer would expect to pay for these second movies?
(A) $0.45
(B) $0.47
(C) $0.67
(D) $3.00
(E) $9.40

10. A sampling distribution of the means of all possible samples of size 100 is formed. The parent population has a mean of μ = 4.2 and a standard deviation of σ = 1.7. What is the value of the standard error of the mean?
(A) 0.017
(B) 0.17
(C) 0.42
(D) 1.7
(E) 4.2

11. Anthropologists must often estimate from human remains how tall a person was when alive. To do this they study how overall height can be predicted from the length of a leg bone in a group of 36 living males. The data show that the bone lengths have a mean of 45.9 centimeters and a standard deviation of 4.20 cm. The overall heights of the same men have a standard deviation of 8.14 cm. The correlation between bone length and height is 0.914. The slope of the least squares regression line of height on bone length is about:
(A) 0.47
(B) 1.77
(C) 151.1
(D) 91.5

12. Near election time, the Gallup Poll increases the size of its samples from about 1500 people to about 4000 people. The purpose of this is:
(A) To reduce the bias of the result
(B) To increase the bias of the result
(C) To reduce the variability of the result
(D) To increase the variability of the result.

13. Mr. Rogers wants to curve students’ exam scores based on the highest score in the class. He takes the highest score (which happens to be an outlier) and treats it as the perfect score. He then computes everyone else’s score as a percentage of this perfect score.
You’re smart and complain that his method is not resistant. What would be a more resistant method of grading these exams?
(A) Grading scores relative to the mean score.
(B) Treating the top two scores as the perfect score.
(C) Computing an individual’s score relative to the mode.
(D) Grading scores relative to the median score.
(E) All of the above

14. An industrial experiment compares the degree of micro-porosity (which eventually leads to cracks and failure in use) in aluminum alloy produced under two conditions. Ultrasound measurements of 5 ingots produced by the first method give a mean of 4.4 and a standard deviation of 0.8. Similar measurements on 6 ingots produced by the second method have a mean of 3.8 and a standard deviation of 1.0. The standard error of the difference in means is
(A) 0.766
(B) 0.543
(C) 0.197
(D) 0.295

15. You read that SAT scores in high school explain only 9% of the variation in students’ later grades in college. The correlation between SAT scores and college grades is therefore:
(A) r = 0.9
(B) r = 0.81
(C) r = 0.09
(D) r = 0.3
(E) r = 0.03

16. Which of the following are equal in a normal distribution?
I. Mean
II. Median
III. Mode
(A) I and II only
(B) I and III only
(C) II and III only
(D) I, II, and III
(E) none of the above

17. Which of the following statements is true?
(A) Two normal curves can have the same mean, but different standard deviations.
(B) If a data set is normally distributed, approximately 68% of the data is within one standard deviation of the mean.
(C) The standard normal distribution has a mean of 0 and a standard deviation of 1.
(D) The area under every normal curve is 1, no matter what the mean or standard deviation.
(E) All of the above.

18. Which of the following would you expect to be true about the correlation between distance traveled and the total amount paid on the Sam Houston Tollway?
(A) Strong and positive
(B) Weak and positive
(C) Zero
(D) Strong and negative
(E) Weak and negative

19.
Gender is a _________ variable and weight is a ___________ variable.
(A) Quantitative; categorical
(B) Categorical; quantitative
(C) Explanatory; response
(D) Response; explanatory
(E) None of the above

20. The standard error of the sample mean is determined by:
(A) Size of the population
(B) Size of the sample
(C) Sample proportion
(D) Number of samples
(E) None of the above

21. Which of the following events are NOT disjoint?
(A) Drawing a red card that’s a spade.
(B) Rolling an even number that’s prime
(C) Drawing a king that’s a queen
(D) None of the above.

22. Which has a larger probability?
I. Picking a red card or a king in one draw
II. Rolling a 7 or better on two dice
(A) I
(B) II
(C) They are the same

23. Suppose a study finds that the correlation coefficient relating family income to SAT scores is r = 0.89. Which of the following conclusions are justified?
I. Poverty causes low SAT scores
II. Wealth causes high SAT scores
III. There is a strong association between family income and SAT scores.
(A) I only
(B) II only
(C) III only
(D) I and II
(E) I, II, and III

24. Nancy and Connie both took the ABC achievement test, which has an N(950, 50) distribution. If Nancy scored 2.5 standard deviations above the mean and Connie had a score of 675, how much higher was Nancy’s score?
(A) 1075
(B) 1700
(C) 400
(D) 525
(E) none of these are correct

25. A certain test has a mean of 60 and a standard deviation of 15. To convert the scores to a different scale, the test makers use the following transformation: x* = 40 + 0.8x. What are the new mean and new standard deviation?
(A) 88; 52
(B) 48; 12
(C) 48; 52
(D) 88; 12
(E) none of the above

26. Individual observations that fall well outside the overall pattern of the data are called:
(A) symmetric
(B) outliers
(C) gaps
(D) skewed
(E) normal

27. A distribution is ____________ if the portions greater and less than its center are mirror images of each other.
(A) skewed right
(B) reflected
(C) truncated
(D) skewed left
(E) symmetric

28.
Which of the following is not a criterion for a Student’s t-test?
(A) A simple random sample
(B) The sample standard deviation
(C) No outliers in the distribution
(D) A sample less than 30

29. A Student t-distribution has a standard error of 0.52, a sample size of 92, and a mean of 37. What is the standard deviation?
(A) 0.0542
(B) 0.1405
(C) 0.026
(D) 4.99

30. Which of the following does not have to be true for binomial probabilities?
(A) A fixed number of trials
(B) The “n” observations are all independent.
(C) Each observation has 2 possible outcomes.
(D) n > 30
(E) The probability for each outcome is fixed.

31. Suppose you roll a six-sided die 10 times. What is the probability of getting three 5’s in those 10 rolls?
(A) 0.60
(B) 0.30
(C) 0.000618
(D) 0.155
(E) 0.930

32. A study of department chairperson ratings and student ratings of the performance of high school statistics teachers reports a correlation of r = 1.14 between the two ratings. From this information we can conclude that
(A) Chairpersons and students tend to agree on who is a good teacher
(B) Chairpersons and students tend to disagree on who is a good teacher
(C) There is little relationship between chairperson and student ratings of teachers
(D) There is a strong association between chairperson and student ratings of teachers, but it would be incorrect to infer causation.
(E) a mistake in arithmetic has been made.

33. The heights of American men age 18 to 24 are approximately normally distributed with mean 68 inches and standard deviation 2.5 inches. Only about 5% of young men have heights outside the range of:
(A) 65.5 inches to 70.5 inches
(B) 63 inches to 73 inches
(C) 60.5 inches to 75.5 inches
(D) 58 inches to 78 inches

34. The scores on a statistics exam are strongly skewed to the left. What is the best way to describe the distribution?
(A) The five number summary
(B) The mean and standard deviation
(C) The mean, median, and mode
(D) The correlation and its square.

35.
You record the age, marital status, and earned income of a sample of 1363 women. What is the number of variables you have recorded?
(A) 1363
(B) four: age, marital status, income, and number of women
(C) three: age, marital status, and income.
(D) two: age and income (marital status is not a variable because it is not quantitative data)

36. Other things being held equal, the margin of error in a confidence interval decreases as:
(A) The confidence level increases
(B) The sample size, n, increases.
(C) The population standard deviation increases.
(D) The sample size, n, decreases.
(E) The sample mean decreases.

37. In hypothesis testing, a p-value of less than 5% indicates the following:
(A) The p-value is statistically significant.
(B) One would fail to accept H0, the status quo.
(C) A test statistic is large (greater than 1.9) to give such a small p-value.
(D) One would support the claim of change, Ha
(E) All of the above

38. A copy machine dealer has data on the number “x” of copy machines at each of 89 customer locations and the number “y” of service calls in a month at each location. Summary calculations give x̄ = 8.4, s_x = 2.1, ȳ = 14.2, s_y = 3.8, and r = 0.86. What is the slope of the least squares regression line of the number of service calls on the number of copy machines?
(A) 0.86
(B) 1.56
(C) 0.48
(D) none of these
(E) Cannot determine from the information given.

39. A company employs over 5000 workers of whom 20% are Hispanic. If the 20 members of the union executive committee were chosen from the workers, without regard to ethnic background, the number of Hispanics on the committee would have the B(20, 0.2) distribution. What is the probability that 4 or fewer members of the committee are Hispanic?
(A) 0.6296
(B) 0.3819
(C) 0.6181
(D) 0.3704
(E) 0.3999

40. A professor teaches two statistics classes. The morning class has 25 students and their average on the first test was 82. The evening class has 15 students and their average on the same test was 74.
What is the average on this test if the professor combines the scores for both classes?
(A) 76
(B) 78
(C) 79
(D) 80
(E) The average cannot be calculated since individual scores of each student are not available.

41. The histogram below displays a set of measurements. Which of the boxplots below displays the same set of measurements?
(A) A
(B) B
(C) C
(D) D
(E) E

42. A random sample of size 10 was taken from a population. The sample has a variance of zero. Which of the following statements must be true?
I. The population also has a variance of zero.
II. The sample mean is equal to the sample median.
III. The ten data points in the sample are equal in numerical value.
(A) I only
(B) II only
(C) III only
(D) I and II
(E) II and III

X          1     2     3
P(X = x)   1/6   2/3   ?

Y          1     2     3     4
P(Y = y)   ?     1/4   1/4   ?

43. The tables above show part of the probability distribution for random variables X and Y. If X and Y are independent and the joint probability P(X = 3, Y = 4) = 1/16, then P(Y = 1) equals:
(A) 1/8
(B) 1/6
(C) 1/4
(D) 3/8
(E) 1/2

44. For college-bound high school seniors in 1996, the nationwide mean SAT verbal score was 505 with a standard deviation of about 110, and the mean SAT math score was 508 with a standard deviation of about 110. Students who do well on the verbal portion of the SAT tend to do well on the mathematics portion. If the two scores from each student are added, the mean of the combined scores is 1,013. What is the standard deviation of the combined verbal and math scores?
(A) 110/√2
(B) 110
(C) √(110² + 110²)
(D) 220
(E) The standard deviation cannot be computed from the information given

45. As shown above, the least-squares regression line has been fitted to the winning percentages for a local sports team in each of the years 1983 through 1995. The percentage for the 1996 season was then plotted (as circled above).
Which of the following statements correctly describes how the value for the 1996 season will change the appearance of the least-squares regression line and the correlation coefficient if a new least-squares regression line is fitted to the 1983 through 1996 data?
(A) The 1996 point will make the LSQR line steeper and the correlation coefficient stronger.
(B) The 1996 point will make the LSQR line steeper and the correlation coefficient weaker.
(C) The 1996 point will make the LSQR line closer to horizontal and the correlation coefficient stronger.
(D) The 1996 point will make the LSQR line closer to horizontal and the correlation coefficient weaker.
(E) The 1996 point will not have any effect on the LSQR line since it follows the same downward trend.

46. A random sample of two observations is taken from a population that is normally distributed with a mean of 100 and a standard deviation of 5. Which of the following is closest to the probability that the sum of the two observations is greater than 221?
(A) 0.0015
(B) 0.0250
(C) 0.0500
(D) 0.4500
(E) 0.9985

47. A particular psychological test is used to measure academic motivation. The average test score for all female college students nationwide is 115. A large university estimates the mean test score for female students on its campus by testing a random sample of “n” female students and constructing a confidence interval based on their scores. Which of the following statements about the confidence interval are true?
I. The resulting interval will contain 115.
II. The 95% confidence interval for n = 100 will generally be shorter than the 95% confidence interval for n = 50.
III. For n = 100, the 95% confidence interval will be longer than the 90% confidence interval.
(A) I only
(B) II only
(C) III only
(D) II and III
(E) None of the above gives a complete set of true responses.

48.
The primary reason for blocking when designing an experiment is to reduce:
(A) The sensitivity of the experiment
(B) Variation
(C) the need for randomization
(D) bias
(E) confounding

49. A survey was conducted at a movie theater to determine movie-goers’ preference for different kinds of popcorn. The results of the survey showed that Brand A was preferred by 65% of the people with a margin of error plus or minus 3%. What is meant by the statement “plus or minus 3%”?
(A) Three percent of the population that was surveyed will change their minds.
(B) Three percent of the time the results of such a survey are not accurate.
(C) Three percent of the population was surveyed.
(D) The true proportion of the population who preferred Brand A popcorn could be determined if 3% more of the population was surveyed.
(E) It would be unlikely to get the observed sample proportion of 65% unless the actual percentage of people in the population of moviegoers who prefer Brand A is between 62% and 68%.

50. When performing a test of significance for a null hypothesis, H0, against the alternative hypothesis, Ha, the p-value is:
(A) the probability that H0 is true.
(B) the probability that Ha is true.
(C) the probability that H0 is false.
(D) the probability of observing a value of a test statistic at least as extreme as that observed in the sample if H0 is true.
(E) the probability of observing a value of a test statistic at least as extreme as that observed in the sample if Ha is true.

51. Twenty men and 20 women with high blood pressure were subjects in an experiment to determine the effectiveness of a new drug in lowering blood pressure. Ten of the 20 men and 10 of the 20 women were chosen at random to receive the new drug. The remaining 10 men and 10 women received a placebo. The change in blood pressure was measured for each subject. The design of this experiment is:
(A) Completely randomized with one factor, drug.
(B) Completely randomized with one factor, gender.
(C) Randomized block, blocked by drug and gender.
(D) Randomized block, blocked by drug
(E) Randomized block, blocked by gender.

52. A large elementary school has 15 classrooms, with 24 children in each classroom. A sample of 30 children is chosen by the following procedure: “Each of the 15 teachers selects 2 children from his or her classroom to be in the sample by numbering the children from 1 to 24, then using a random digit table to select two different random numbers between 01 and 24. The 2 children with those numbers are in the sample.” Did this procedure give a simple random sample of 30 children from the elementary school?
(A) No, because the teachers were not selected randomly.
(B) No, because not all possible groups of 30 children had the same chance of being chosen.
(C) No, because not all children had the same chance of being chosen.
(D) Yes, because each child had the same chance of being chosen.
(E) Yes, because the numbers were assigned randomly to the children.

53. The corn rootworm is a pest that can cause significant damage to corn, resulting in a reduction in yield and thus in farm income. A farmer will examine a random sample of plants from a field in order to decide whether or not the number of corn rootworms in the whole field is at a dangerous level. If the farmer concludes that it is, the field will be treated. The farmer is testing the null hypothesis that the number of rootworms is not at a dangerous level against the alternative hypothesis that the number is at a dangerous level. Suppose that the number of corn rootworms in the whole field actually is at a dangerous level. Which of the following is equal to the power of the test?
(A) The probability that the farmer will decide to treat the field.
(B) The probability that the farmer will decide not to treat the field.
(C) The probability that the farmer will fail to reject the null hypothesis.
(D) The probability that the farmer will reject the alternative hypothesis.
(E) The probability that the farmer will not get a statistically significant result.

54. A statistics professor is interested in exploring the relationship between students’ grades on the midterm (x) and on the final exam (y). He calculates that the regression equation is ŷ = 15.29 + 0.96x. What does the 0.96 mean?
(A) 96% of the variation in y is explained by x.
(B) For each additional point on the midterm, the predicted final score would increase by 0.96.
(C) For each additional point on the final, the predicted midterm score would increase by 0.96.
(D) 96% of the variation in x is explained by y.
(E) None of the above

55. Which of the following would give a simple random sample of AHS students?
(A) Randomly picking 20 students from a randomly chosen Spanish class.
(B) Asking the first 20 students who arrive to school.
(C) Writing down the names of students in the library at lunch and then drawing the names of 20 students from a hat.
(D) Randomly asking 5 freshmen, 5 sophomores, 5 juniors, and 5 seniors.
(E) None of the above

56. If every woman married a man who was exactly 2 inches taller than she, what would the correlation between the heights of married women and men be?
(A) somewhat negative
(B) somewhat positive
(C) 0
(D) 1
(E) –1

57. If you flip a coin three times, what is the probability that you get at least one head?
(A) 1/8
(B) 3/8
(C) 1/2
(D) 7/8

58. We hypothesize that 90% of female students remain in senior level AP math courses in Texas all year whereas only 80% of senior male students remain in senior level AP math courses in Texas all year. If we sampled 40 female students and 30 male students, what would be the standard deviation of the difference between the sample proportions?
(A) 0.0871
(B) 0.2217
(C) 0.1205
(D) 0.0837
(E) 0.7201

59. Which of the following pairs of events are NOT independent?
(A) Flipping a head, flipping a tail
(B) Rolling a 6 on a die, rolling another 6
(C) Drawing a king, drawing another king (without replacement)
(D) Drawing a king, drawing another king (with replacement)

60. In an attempt to discover if the reaction time to a new pain medicine is different in men and women, Methodist Hospital decided to conduct a test in which each of the subjects would receive a dosage proportionate to his/her body weight. The results of the men were then compared to those of the women. This is an example of a __________ experiment.
(A) Matched pair
(B) Blocked
(C) Stratified
(D) Systematic
(E) Cluster

61. Three main principles in experimental design are control, randomization, and ___________.
(A) Matching
(B) Placebo
(C) Comparison of results
(D) Replication

62. A social scientist wished to determine the difference between the percentage of Los Angeles marriages and the percentage of New York marriages that end in divorce in the first year. How large a sample (same for each group) should be taken to estimate the difference to within ±.07 at the 94% confidence level?
(A) 181
(B) 361
(C) 722
(D) 1083
(E) 1443

63. The Law of Large Numbers states that:
(A) In order to have a good experiment, one must have a large sample.
(B) When conducting an experiment, one must work with a large mean.
(C) When conducting an experiment, one must work with a large standard deviation.
(D) In the long run, the observed mean approaches and remains close to the population mean.
(E) In the long run, the sample mean will become larger than the population mean.

64. In a statistics course a least squares regression equation was computed to predict the final score from the score on the first test. The equation of the LSRL was ŷ = 10 + 0.9x where “y” is the final exam score and “x” is the score on the first exam. If James scored 78 on the first test and 88 on his final exam, what is the value of the residual at this point?
(A) -7.8
(B) -1.9
(C) -1.3
(D) 1.9
(E) 7.8

65.
Data are obtained for a group of college freshmen examining their SAT scores from their senior year of high school and their GPAs during their first year in college. The resulting regression equation is ŷ = .00161x + 1.35 where r = .632. What percentage of the variation in GPAs can be explained by looking at the SAT scores?
(A) 0.16%
(B) 16.1%
(C) 39.9%
(D) 63.2%
(E) Cannot be determined from the information given

66. What is the meaning of the p-value of a test statistic?
(A) The probability, assuming that Ha is true, that the test statistic will take a value at least as extreme as that actually observed.
(B) The probability, assuming that H0 is true, that the test statistic will take a value at least as extreme as that actually observed.
(C) The probability, assuming that Ha is not true, that the test statistic will take a value at least as extreme as that actually observed.
(D) The probability, assuming that H0 is not true, that the test statistic will take a value at least as extreme as that actually observed

67. Of the following, which p-value would be significant at the 5% level?
(A) 0.51
(B) 0.055
(C) 0.123
(D) all of these answers are significant
(E) none of these are significant

68. Given a LSRL with a high r² value, which of the following are true?
I. It is a risky procedure to predict the y-values within the range of the x-values given by the data.
II. It is a risky procedure to predict the y-values outside the range of x-values given by the data.
III. It is a safe procedure to predict the y-values within the range of the x-values given by the data.
IV. It is a safe procedure to predict the y-values outside the range of x-values given by the data.
(A) I and II only
(B) II and III only
(C) II and IV only
(D) III and IV only
(E) I and IV only

69. Which of the following is a resistant statistic?
(A) mean
(B) standard deviation
(C) chi-squared
(D) t-score
(E) median

70.
In a normal distribution, approximately what percent of the observations fall within 3 standard deviations of the mean?
(A) 95%
(B) 98%
(C) 99%
(D) 98.5%
(E) 99.7%

71. In a lottery game, the probabilities of winning the following amounts of money are as follows:

Money ($)     5      100      5000
Probability   1/50   1/1000   1/10000

If each ticket costs $1.00, how much can you expect to win?
(A) $0.70
(B) -$0.25
(C) -$0.88
(D) -$0.30
(E) $0.50

72. A two-sample t-test gives a t-value of 4.5. The .05 critical value from the t-table is 3.7. Which of the following is true?
(A) The p-value will be greater than .05
(B) The p-value will be smaller than .05
(C) Either “A” or “B” may be true depending on the sample size
(D) Either “A” or “B” may be true depending on the degrees of freedom.

73. Suppose that 60% of students who take the AP English exam score a 4 or a 5, 25% score a 3, and the rest score a 1 or 2. Suppose further that 95% of those scoring 4 or 5 receive college credit, 50% of those scoring 3 receive college credit, and 4% of those scoring 1 or 2 receive college credit. A student is chosen at random from among those who took the AP exam and who received college credit. What is the probability that he/she received a 3 on the exam?
(A) 0.125
(B) 0.178
(C) 0.701
(D) 0.813
(E) 0.822

74. In designing an experiment, blocking is used
(A) To reduce bias
(B) To control the level of the experiment
(C) As a substitute for a control group
(D) As the first step in randomization
(E) To reduce variation

75. In testing H0: μ1 − μ2 = 0 against Ha: μ1 − μ2 ≠ 0, a two-sample t-test gives a p-value of .042. Which of the following are true?
(A) The null hypothesis of no difference cannot be rejected at the 0.05 significance level.
(B) The 90% confidence interval contains the value 0 in its interior.
(C) The null hypothesis of no difference can be rejected at the 0.05 significance level.
(D) The 95% confidence interval contains the value 0 in its interior.

76.
Which of the following would be a good reason to use a z-test instead of a t-test as a test statistic?
(A) The sample size is large
(B) The variances are assumed to be equal.
(C) The number of degrees of freedom is greater than 15.
(D) The standard deviation of the population is known.
(E) None of these

77. When comparing two samples, which of the following is the most important?
(A) The samples are selected randomly
(B) The sample sizes are equal
(C) The samples have the same variances
(D) The samples come from approximately normal distributions
(E) There are no outliers in the data

78. Which of the following assumptions is the t-test LEAST robust against?
(A) Small sample size
(B) Outliers
(C) Non-symmetric distribution of the data
(D) Transformed data

The heart disease death rates per 100,000 people in the United States for certain years, as reported by the National Center for Health Statistics, were

Year         1950   1960   1970   1975   1980
Death Rate   307.6  286.2  253.6  217.8  202.0

79. Which of the following is a correct interpretation of the slope of the least squares regression line for the data above?
(A) The heart disease rate per 100,000 people has been dropping about 3.627 per year
(B) The baseline heart disease rate is 7386.87
(C) The regression line explains 96.28% of the variation in heart disease death rates over the years.
(D) The regression line explains 98.12% of the variation in heart disease death rates over the years.
(E) Heart disease will be cured in the year 2036.

80. Based on the regression line, what is the predicted death rate for the year 1983?
(A) 195.4 per 100,000 people
(B) 192.5 per 100,000 people
(C) 196.8 per 100,000 people
(D) 198.5 per 100,000 people
(E) 194.5 per 100,000 people

81. In making predictions about the data from a regression equation, it is often dangerous to make predictions about data outside the domain of the explanatory variable. This process is called _______.
(A) interpretation
(B) interpolation
(C) extrapolation
(D) estimation

82. Amos Tversky and Thomas Gilovich, in their study on the “Hot Hand” in basketball (Chance, Winter 1989, page 20), found that in a random sample of games, Larry Bird hit a second free throw in 48 of 53 attempts after the first free throw was missed. Larry hit a second free throw in 251 of 285 attempts after the first free throw was made. Is there sufficient evidence to say that the probability that Bird will make a second free throw is different depending on whether or not he made the first free throw?
(A) Since p < .001, there is sufficient evidence that the probability that Larry Bird will make a second free throw is different depending on whether he made the first free throw or not.
(B) Since .001 < p < .01, there is sufficient evidence that the probability that Larry Bird will make a second free throw is different depending on whether he made the first free throw or not.
(C) Since .01 < p < .05, there is sufficient evidence that the probability that Larry Bird will make a second free throw is different depending on whether he made the first free throw or not.
(D) Since .05 < p < .10, there is little evidence that the probability that Larry Bird will make a second free throw is different depending on whether he made the first free throw or not.
(E) Since p > .10, there is little evidence that the probability that Larry Bird will make a second free throw is different depending on whether he made the first free throw or not.

83. A study of accident records at a large engineering company in England (The Lancet, October 22, 1994) reported the following number of injuries on each shift for 1 year:

Shift                Morning  Afternoon  Night
Number of injuries   1372     1578       1686

Did the study provide enough evidence to say that the number of accidents on the three shifts is not the same?
(A) There is sufficient evidence to say that the number of accidents on each shift is not the same.
(B) There is not sufficient evidence to say that the number of accidents on each shift is not the same.

84. An inspection procedure at a manufacturing plant involves picking three items at random and then accepting the whole lot if at least two of the three items are in perfect condition. If in reality 90% of the whole lot are perfect, what is the probability that the lot will be accepted?
(A) 0.003 (B) 0.028 (C) 0.081 (D) 0.810 (E) 0.972

500 people used a home test for HIV and then all underwent more conclusive hospital testing. The accuracy of the home test was evidenced in the following table:

               HIV  Healthy
Positive Test   35       25
Negative Test    5      435

85. What is the predictive value of the test? That is, what is the probability that a person tested has HIV and tests positive?
(A) 0.070 (B) 0.130 (C) 0.538 (D) 0.583 (E) 0.875

86. What is the false-positive rate? That is, what is the probability of testing positive given that the person does not have HIV?
(A) 0.054 (B) 0.050 (C) 0.130 (D) 0.417 (E) 0.875

87. What is the sensitivity of the test? That is, what is the probability of testing positive given that the person has HIV?
(A) 0.070 (B) 0.130 (C) 0.538 (D) 0.583 (E) 0.875

88. What is the specificity of the test? That is, what is the probability of testing negative given that the person does not have HIV?
(A) 0.125 (B) 0.583 (C) 0.870 (D) 0.94 (E) 0.950

89. Which of the following are important in the design of experiments?
I. Control of confounding variables
II. Randomization in assigning subjects to different treatments.
III. Replication of the experiment using sufficient numbers of subjects.
(A) I and II only (B) I and III only (C) II and III only (D) I, II, and III (E) None of the above

90. Suppose that 35% of all business executives are willing to switch companies if offered a higher salary.
If a job placement service randomly contacts 100 executives, what is the probability that over 40% will be willing to switch companies if offered a higher salary?
(A) 0.1250 (B) 0.1977 (C) 0.4207 (D) 0.8023 (E) 0.8531

91. If we reject the null hypothesis, when in fact, the null hypothesis is true, we have:
(A) Committed a Type I error
(B) Committed a Type II error
(C) A probability of being correct which is equal to the p-value
(D) Explained the power of a test.

92. A researcher plans to conduct a test of hypotheses at the 1% significance level. She designs her study to have power of 0.90 at a particular alternative value of the parameter of interest. The probability that the researcher will commit a type I error is:
(A) 0.01
(B) 0.10
(C) 0.90
(D) equal to the p-value and cannot be determined until the data is collected

93. The power of a statistical test of hypotheses is:
(A) The smallest significance level at which the data will allow you to reject the null hypothesis.
(B) Equal to one minus the p-value.
(C) The extent to which the test will reject both a one-sided and two-sided hypothesis.
(D) The probability, that at a fixed level, a significance test will reject the null hypothesis when this particular alternative value of the parameter is true.

94. A manufacturer knows 4% of all floppy disks that come off the production line are defective. What is the probability that if you randomly select floppy disks from a week of production that you will need to sample 15 disks before you find a defective one?
(A) 0.0226 (B) 0.4579 (C) 0.4353 (D) 0.5421 (E) 0.9774

95. In general, how does tripling the sample size change the confidence interval?
(A) It triples the interval size
(B) It divides the interval size by 3.
(C) It multiplies the interval size by 1.732
(D) It divides the interval by 1.732
(E) This question cannot be answered without knowing the sample size.

96. Pamela is playing an instant lottery game.
What is the probability that she will not win until the 8th try if the probability is one out of 100 on a single try?
(A) 0.0009 (B) 0.0093 (C) 0.0993 (D) 0.9321 (E) 0.9907

97. Which of the following are true?
I. In a block design, the random assignment of units to treatments is carried out separately within each block.
II. The purpose of blocking is to reduce variation in results.
III. Matched pairs design is a special type of blocked design.
(A) I only (B) II only (C) I and II only (D) II and III only (E) I, II, and III

98. The regression line for a set of data is ŷ = 2x + b. This line passes through the point (3, 4). If x̄ and ȳ are the sample means of the x and y values respectively, then ȳ =
(A) x̄ (B) x̄ 3 (C) x̄ 4 (D) 2x̄ 2 (E) 2x̄ 10

99. If the probability that a switch works properly is 0.8, what is the probability that exactly 3 out of 10 switches are defective?
(A) 0.0008 (B) 0.1147 (C) 0.2013 (D) 0.2563 (E) 0.5000

100. Assuming σ is known, which of the following would most likely result in the widest confidence interval for estimating μ?
(A) Large sample size, α = 0.01
(B) Large sample size, α = 0.05
(C) Small sample size, α = 0.01
(D) Small sample size, α = 0.05
(E) Without the sample mean, this question cannot be answered.

AP Statistics

1) The weights of male and female students in a class are summarized in the following boxplots:

[Figure: boxplots of weight (pounds) for Males and Females; axis from 80 to 240]

Which of the following is NOT correct?
a) About 50% of the male students have weights between 150 and 185 lbs.
b) About 25% of the female students have weights more than 128 lbs.
c) The median weight of the male students is about 166 lbs.
d) The male students have less variability than the female students.
e) The mean weight of the female students is about 120 because of symmetry.

2) The following is a stem-plot of the birth weights of male babies born to the smoking group. The stems are in units of kg.
Stems | Leaves
  2   | 3,4,6,7,7,8,8,8,9
  3   | 2,2,3,4,6,7,8,9
  4   | 1,2,2,3,4
  5   | 3,5,5,6
Key: 2|3 means 2.3

The median birth weight is:
a) 13.5 b) 3.5 c) 3.2 d) 3.7 e) Average of 13 and 14

3) The heights in centimeters of 5 students are: 165, 175, 176, 159, 170. The sample median and sample mean are respectively:
a) 170, 169 b) 176, 169 c) 169, 170 d) 170, 170 e) 176, 176

4) Rainwater was collected in water collectors at thirty different sites near an industrial basin and the amount of acidity (pH level) was measured. The mean and standard deviation of the values are 4.60 and 1.10 respectively. When the pH meter was recalibrated back at the laboratory, it was found to be in error. The error can be corrected by adding 0.1 pH units to all of the values and then multiplying the result by 1.2. The mean and standard deviation of the corrected pH measurements are:
a) 5.64, 1.32 b) 5.64, 1.44 c) 5.40, 1.44 d) 5.40, 1.32 e) 5.64, 1.20

5) The output from a sewage treatment plant is constantly monitored to assess treatment effectiveness. Suppose that the mean coliform content is 20 bacteria/ml with a standard deviation of 4 bacteria/ml, and is known to be normally distributed. An automatic measuring device is being used to monitor the bacteria levels. An alarm should ring whenever the bacteria level exceeds the 97.5th percentile. The upper bound should be:
a) 20 per ml b) 24 per ml c) 16 per ml d) 32 per ml e) 28 per ml

6) A random variable X has a probability distribution as follows:

r         0   1   2    3
P(X = r)  2k  3k  13k  2k

The probability P(X < 2.0) is equal to:
a) .15 b) .25 c) .65 d) .90 e) 1.00

7) It has been estimated that as many as 70% of the fish caught in certain areas of the Great Lakes have liver cancer due to the pollutants present. Find an approximate 95% range for the number of fish with liver cancer present in a sample of 130 fish.
a) (63, 119) b) (86, 97) c) (80, 102) d) (36, 146) e) (75, 107)

8) Newsweek in 1989 reported that 60% of young children have blood lead levels that could impair their neurological development. Assuming a random sample from the population of all school children at risk, the probability that at least 5 children out of 10 in a sample taken from a school have a blood lead level that may impair development is:
a) about .25 b) about .20 c) about .84 d) about .16 e) about .64

9) The marks on a Statistics test are normally distributed with a mean of 62 and a variance of 225. If the instructor wishes to assign B's or higher to the top 30% of the students in the class, what mark is required to get a B or higher?
a) 69.9 b) 71.5 c) 73.2 d) 74.6 e) 68.7

10) Government regulations indicate that the total weight of cargo in a certain kind of airplane cannot exceed 330 kg. On a particular day a plane is loaded with 100 boxes of goods. If the weight distribution for individual boxes is normal with mean 3.2 kg and standard deviation 0.4 kg, what is the probability that the regulations will not be met?
a) 0.4938 b) 0.1239 c) 0.9938 d) 0.0062 e) 0.5062

11) The Central Limit Theorem tells us that the sampling distribution of sample means is approximately normal. Which of the following conditions are necessary for the theorem to be valid?
a) The sample size has to be sufficiently large.
b) We have to be sampling from a normal distribution.
c) The population has to be symmetric.
d) Population variance has to be sufficiently small.
e) Both (a) and (c).

12) A nutritionist wants to study the effect of storage time (6, 12, and 18 months) on the amount of vitamin C present in freeze-dried fruit when stored for these lengths of time. Vitamin C is measured in milligrams per 100 milligrams of fruit. Six fruit packs were randomly assigned to each of the three storage times.
The treatment, experimental unit, and response are respectively:
a) a specific storage time, a fruit pack, amount of vitamin C
b) a fruit pack, amount of vitamin C, a specific storage time
c) random assignment, a fruit pack, amount of vitamin C
d) a specific storage time, amount of vitamin C, a fruit pack
e) a specific storage time, the nutritionist, amount of vitamin C

13) To answer this question, use the following numbers extracted from a table of random digits:

38683 50279 78224 09844 13578 28251 12708 24684

A scientist will be measuring the total amount of woody debris in a random sample of sites selected without replacement from a population of 45 sites. The sites are labeled 01, 02, …, 45, and she starts at the beginning of the line of random digits and takes consecutive pairs of digits. Which of the following is correct?
a) Her sample is 38, 25, 02, 38, 22
b) Her sample is 38, 68, 35, 02, 22
c) Her sample is 38, 35, 02, 22, 40
d) Her sample is 38, 65, 35, 02, 79
e) Her sample is 38, 35, 27, 28, 08

14) The effect of salt upon the growth of grasses is of concern in many places where excess irrigation is causing salt to rise to the surface. In order to determine baseline yields, a sample of 24 fields was selected, and the biomass of grasses in a standard-sized plot was measured (kg). The computer output appears below:

N         24         QUANTILES (DEF=4)
MEAN      9.09       MAX   22.6
STD DEV   6.64       Q3    11.45
VARIANCE  44.0       MED   8.15
RANGE     21.9       Q1    3.775
STD MEAN  1.35       MIN   0.7
T:MEAN=0  6.7153
PROB>|T|  0.0001

A 95% confidence interval for the mean yield is:
a) 9.09 ± 1.9600(1.35) b) 9.09 ± 2.0687(1.35) c) 9.09 ± 2.0639(6.64) d) 9.09 ± 2.0639(1.35) e) 9.09 ± 2.0687(6.64)

15) Recently, a price war has developed among retailers selling Brand X denim jeans. A major chain buyer wishes to estimate the mean price of these jeans during this period to compare it to the normal selling price of $20.00.
A random sample of 7 major retailers produces a mean retail price of $13.50 with a standard deviation of $3.50. An 80% confidence interval for the true mean retail price of Brand X jeans during the price war is:
a) (10.93, 16.07) b) (11.60, 15.40) c) (11.81, 15.19) d) (10.00, 17.00) e) (8.46, 18.54)

16) You wish to estimate μ, the average lifetime of a particular type of battery. You are planning to select "n" batteries of this type and to operate them continuously until they fail. You have some feeling that the standard deviation of the lifetimes should be around 20 hours, and you wish your estimate of μ to be within 1 hour of μ with 95% confidence. How many batteries should you select?
a) 77 b) 784 c) 40 d) 1537 e) 1083

17) The 3-M Company started a new recreation program for its employees in the hope that a little recreation would improve an employee's performance at work. To determine whether the high cost of the program is justified, the president of the company wishes to estimate the proportion of the employees who participate in the recreational activities. In a random sample of 200 employees, 60 were found to regularly participate in the recreation program. A 95% confidence interval for the true proportion of 3M employees who participate in the new recreation program is:
a) (0.237, 0.364) b) (0.298, 0.302) c) (0.267, 0.333) d) (0.247, 0.353) e) (0.231, 0.369)

18) A 95% confidence interval for p, the proportion of soda drinkers who prefer Big Red, was found to be (0.236, 0.282). Which of the following is correct?
a) About 95% of soda drinkers have between a 23.6% and a 28.2% chance of drinking Big Red.
b) There is a 95% probability that the sample proportion lies between 0.236 and 0.282.
c) If a second sample were taken, there is a 95% chance that its confidence interval would contain 0.25.
d) This confidence interval indicates that we would likely reject the hypothesis H0: p = 0.25.
e) We are reasonably certain that the true proportion of soda drinkers who prefer Big Red is between 24% and 28%.

19) A researcher wants to see if birds that build larger nests lay larger eggs. He selects two random samples of nests: one of small nests and the other of large nests. He weighs one egg from each nest. The data are summarized below.

                 Small nests  Large nests
Sample size           60          159
Sample mean (g)       37.2        35.6
Sample variance       24.7        39.0

A 95% confidence interval for the difference between the average mass of eggs in small and large nests is:
a) 1.6±1.33=(0.27, 2.93) b) 1.6±1.48=(0.12, 3.08) c) 1.6±1.33=(-5.71, 8.91) d) 1.6±1.76=(-0.16, 3.36) e) 1.6±1.6=(-0.003, 3.20)

20) In a statistical test for the equality of a mean, such as H0: μ = 10, if α = 0.05,
a) 95% of the time we will make an incorrect inference
b) 95% of the time the null hypotheses will be correct
c) 5% of the time we will say that there is a real difference when there is no difference
d) 5% of the time we will say that there is no real difference when there is a difference

21) An appropriate 95% confidence interval for μ has been calculated as (-0.73, 1.92) based on n = 15 observations from a population with a normal N(μ, σ²) distribution. The hypotheses of interest are H0: μ = 0 versus HA: μ ≠ 0. Based on this confidence interval:
a) we should reject H0 at the 0.05 level of significance.
b) we should not reject H0 at the 0.05 level of significance.
c) we should reject H0 at the 0.10 level of significance.
d) we should not reject H0 at the 0.10 level of significance.
e) we cannot perform the required test since we do not know the value of the test statistic.

22) A study was carried out to investigate the effectiveness of a treatment. 1000 subjects participated in the study, with 500 being randomly assigned to the “treatment group” and the other 500 to the “control (or placebo) group”. A statistically significant difference was reported between the responses of the two groups (p-value < .005).
Thus,
a) there is a large difference between the effects of the treatment and the placebo.
b) there is strong evidence that the treatment is very effective.
c) there is little evidence that the treatment has any effect.
d) there is strong evidence that there is a difference in effect between the treatment and the placebo.
e) there is evidence of a strong treatment effect.

23) We wish to test if a new feed increases the mean weight gain compared to an old feed. At the conclusion of the experiment it was found that the new feed gave a 10 kg bigger gain than the old feed. A two-sample t-test with the proper one-sided alternative was done and the resulting p-value was .082. This means:
a) there is an 8.2% chance the null hypothesis is true
b) there is an 8.2% chance the alternate hypothesis is true
c) there was only an 8.2% chance of observing an increase greater than 10 kg (assuming the null hypothesis was true).
d) there was only an 8.2% chance of observing an increase greater than 10 kg (assuming the null hypothesis was false).
e) there is only an 8.2% chance of getting a 10 kg increase.

24) An experiment was conducted to assess the effectiveness of spraying oats with malathion (at 0.25 lbs/acre) to control the cereal leaf beetle. A sample of 10 farms was selected at random from southwest Manitoba. Each farm was assigned at random to either the control group (no spray) or the treatment group (spray). At the conclusion of the experiment, a plot on each farm was selected and the number of larvae per stem was measured. Power refers to:
a) the ability to detect an effect of malathion when in fact there is no effect.
b) the ability to not detect an effect of malathion when in fact there is no effect.
c) the ability to not detect an effect of malathion when in fact there is an effect.
d) the ability to detect an effect of malathion when in fact there is an effect.
e) the ability to make a correct decision regardless if malathion has an effect or not.
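
Study sketch for question 24: the rejection rate of a test can be explored by simulation. The Python sketch below is illustrative only and rests on assumptions that are not in the worksheet: 5 farms per group, normally distributed larvae counts with standard deviation 1, a pooled two-sample t-test at the 5% level (two-sided critical value 2.306 for 8 degrees of freedom), and a hypothetical helper name estimate_rejection_rate. It reports how often the test rejects the null hypothesis for a chosen true reduction in larvae.

```python
import random
import statistics

# Assumptions for illustration only (not from the worksheet):
N_PER_GROUP = 5      # 10 farms split evenly between control and spray
T_CRIT_DF8 = 2.306   # two-sided 5% critical value of t with 8 degrees of freedom


def pooled_t(control, treated):
    """Pooled two-sample t statistic for mean(control) - mean(treated)."""
    n1, n2 = len(control), len(treated)
    pooled_var = ((n1 - 1) * statistics.variance(control)
                  + (n2 - 1) * statistics.variance(treated)) / (n1 + n2 - 2)
    std_err = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return (statistics.mean(control) - statistics.mean(treated)) / std_err


def estimate_rejection_rate(effect, sims=4000, seed=1):
    """Fraction of simulated experiments in which H0 (no difference) is rejected.

    `effect` is the true reduction in mean larvae per stem, in standard
    deviation units. With effect = 0 the rate estimates the Type I error
    rate; with effect > 0 it estimates the power of the test.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
        treated = [rng.gauss(-effect, 1.0) for _ in range(N_PER_GROUP)]
        if abs(pooled_t(control, treated)) > T_CRIT_DF8:
            rejections += 1
    return rejections / sims


if __name__ == "__main__":
    for effect in (0.0, 1.0, 2.0, 4.0):
        print(effect, estimate_rejection_rate(effect))
```

Under these assumptions the rejection rate stays near the significance level when the true effect is zero and climbs toward 1 as the true effect grows, which is why small experiments can miss moderate effects.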
25) For children between the ages of 18 months and 29 months, there is approximately a linear relationship between “height” and “age”. The relationship can be represented by ŷ = 64.93 + 0.63x, where y represents height (in centimeters) and x represents age (in months). Joseph is 22.5 months old and is 80 centimeters tall. What is Joseph’s residual?
a) 79.1 b) 0.9 c) -0.9 d) 56.6 e) 64.93

26) If the correlation between body weight and annual income were high and positive, we could conclude that:
a) high incomes cause people to eat more food.
b) high incomes cause people to gain weight.
c) high-income people tend to be heavier than low-income people.
d) low incomes cause people to eat more food.
e) high-income people tend to spend a greater proportion of their income on food than low-income people.

27) Each person in a random sample of 50 was asked to state his/her sex and preferred color. The resulting frequencies are shown below.

              Color
Sex      Red  Blue  Green
Male       5    14      6
Female    15     6      4

A Chi-square test is used to test the null hypothesis that sex and preferred color are independent. Which of the following statements is a correct decision about the null hypothesis?
a) Reject at the 0.005 level.
b) Reject at the 0.01 level, but not at the 0.005 level.
c) Reject at the 0.05 level, but not at the 0.025 level.
d) Reject at the 0.025 level, but not at the 0.01 level.
e) No conclusion can be drawn since not all of the observed counts are greater than or equal to 5

Statistics AP Inference Review
Name: _________________________________

Multiple Choice:

1. Do CHS students prefer Coke or Pepsi? 100 students were randomly selected during the lunch periods to taste sodas and their preference was recorded. Which inference procedure could be used in this scenario?
a) 1-proportion z b) 2 proportion z c) 1 sample mean t d) 2 sample mean t e) none of these

2. How many marshmallows can one stuff into their mouth?
30 randomly selected volunteers were asked to stuff their mouths with as many large size marshmallows as they could after having their jaw line measured. If you know someone’s jaw line size, can you predict the number of marshmallows they can stuff in it? Which inference procedure could be used in this scenario?
a) 1 sample mean t test b) 2 sample mean t test c) χ² independence test d) linear regression test e) none of these

3. Does the temperature of what you drink affect your mouth temperature? Randomly selected subjects first had their mouth temperature taken. Then, they were randomly assigned to drink ¼ cup of cold or hot water and hold it in their mouths for 10 seconds. Then their mouth temperatures were measured again. After 20 minutes the procedure was repeated using the other water temperature treatment. Which inference procedure could be used in this scenario?
a) 2 sample mean t test b) linear regression test c) matched-pairs test d) 2 proportion z test e) none of these

4. How do you eat your Oreos? Part I: During the three lunch periods students were encouraged to participate in a survey for the school’s newspaper. As a reward, the participating students received an Oreo cookie. Hidden AP Stat students recorded how these volunteers ate their cookie (whole, twist and lick filling, or twist and eat half of cookie). Which inference procedure could be used in this scenario?
a) χ² GOF b) χ² homogeneity c) χ² independence d) 2 proportion z test e) none of these

5. Oreos Part II: While recording how students consumed their Oreo cookie, the gender of each student was recorded by the secret observer. Is there an association between gender and method of consumption of the Oreo cookie? Which inference procedure should be used in this scenario?
a) χ² GOF b) χ² homogeneity c) χ² independence d) linear regression test e) none of these

6. Is there a difference in strength when wet between Green Forest and HEB paper towels?
One sheet was randomly selected from each of 25 rolls of each brand. Holding each paper towel with 2 tablespoons of water poured into the middle, marbles were added to the center of the paper towel. The total number of marbles the paper towel could hold was counted for the 25 trials of each type of towel. Which inference procedure could be used in this scenario?
a) linear regression test b) 2 sample mean t test c) matched-pairs test d) χ² independence e) none of these

7. AP Stat students surveyed 100 seniors about their evening plans after the graduation ceremony {attend project graduation (school sponsored event), attend a party (not project graduation), or spend a quiet evening at home with family}. Which inference procedure could be used in this scenario?
a) 2 proportion z test b) χ² homogeneity c) χ² independence d) linear regression test e) none of these

8. How long do people take to sharpen their pencils? In randomly selected classes teachers were given new #2 pencils for their students. An AP Stat student, pretending to study, secretly timed how long students took sharpening the new #2 pencils. The data from 37 students was collected. Which inference procedure could be used in this scenario?
a) 1-proportion z b) 2 proportion z c) 1 sample mean t d) 2 sample mean t e) none of these