Practice Exam 3
STAT 1100, Spring 2009
Laurel Chiappetta
Name_______________________________

This is a closed-book exam. You are allowed to use the statistical tables in your textbook, a calculator, and one two-sided sheet of notes. There are 24 problems, with point values as shown. Parts of problems do not necessarily receive equal weight. If you want to receive partial credit for incorrect answers, show your work. Don’t spend too much time on any one problem.

(5) 1. Data concerning a particular nominal variable will be collected and the sample proportion p̂ will be computed from this data. If the sample size n is large enough, the sampling distribution of p̂ will be
a. Exactly Binomial with mean p and standard deviation np.
b. Approximately normal with mean p and standard deviation sqrt(p(1 - p)/n).
c. Exactly Poisson with mean p.
d. Approximately exponential with mean p.

(5) 2. After making a change in her teaching method, Dr. Rivera wonders whether her statistics class is performing better, on the average, than past classes. The mean of her past classes was 65 and the mean of her current class is 68. In her test of significance, she calculates z = 0.75. What should she conclude?
a. The new teaching method produced worse performance.
b. The new teaching method produced significantly better performance.
c. The level of performance under the new teaching method is exactly the same as under the old method.
d. The new teaching method did not produce significantly better performance.

(5) 3. In testing hypotheses, which of the following would be strong evidence against the null hypothesis?
a. using a small level of significance.
b. using a large level of significance.
c. obtaining data with a small P-value.
d. obtaining data with a large P-value.

(5) 4. In a two-tailed hypothesis test, you calculated the test statistic z = 2.16. What is the P-value?
a. 0.0154
b. 0.0308
c. 0.4846
d. 0.9692

Use the following information to answer questions 5 and 6: In a two-tailed test of H0: µ = 95, the sample mean was 98.6 and the P-value of the test statistic z was 0.0325.

(5) 5. What does this P-value mean?
a. The probability is 0.0325 that the population mean is 98.6.
b. The probability that the population mean is 95 is only 0.0325.
c. If the population mean is 98.6, the probability of a sample mean as extreme as (or more extreme than) 95 is 0.0325.
d. If the population mean is 95, the probability of a sample mean as extreme as (or more extreme than) 98.6 is 0.0325.

(5) 6. Is the result statistically significant at the 0.01 level?
a. Yes.
b. No.
c. Could be yes or no, depending on the situation.
d. There’s not enough information provided to determine the answer.

(5) 7. The SAT scores of entering freshmen at University X have a N(1200, 90) distribution and the SAT scores of entering freshmen at University Y have a N(1215, 110) distribution. A random sample of 100 freshmen is sampled from each University, with x̄ the sample mean of the 100 scores from University X and ȳ the sample mean of the 100 scores from University Y. The probability that x̄ is less than 1190 is
a. 0.0116
b. 0.1335
c. 0.4090
d. 0.4562

(5) 8. Suppose we are planning on taking a random sample from a population and calculating the sample mean. If we double the sample size, then σ_x̄ (the standard deviation of the sample mean) will be multiplied by
a. sqrt(2)
b. 1/sqrt(2)
c. 2
d. 1/2
Answer: b

(5) 9. Random samples of size 81 are taken from an infinite population whose mean and standard deviation are 45 and 9, respectively. The mean and standard error of the sampling distribution of the sample mean (respectively) are
a. 9 and 45.
b. 45 and 9.
c. 81 and 45.
d. 45 and 1.

(5) 10. A survey asks a random sample of 1500 adults in Ohio if they support an increase in the state sales tax from 5% to 6%, with the additional revenue going to education.
Let p̂ denote the proportion in the sample that say they support the increase. Suppose that 40% of all adults in Ohio support the increase. The mean µ_p̂ of p̂ is
a. 5%
b. 40% ± 5%
c. 0.40
d. 600

(5) 11. A researcher wishes to determine if students are able to complete a certain pencil and paper maze more quickly while listening to classical music. Suppose the time (in seconds) needed for high school students to complete the maze while listening to classical music follows a normal distribution with mean µ and standard deviation σ = 4. Suppose also that in the general population of all high school students, the time needed to complete the maze follows a normal distribution with mean 40 and standard deviation 4. The researcher, therefore, decides to test the hypotheses
H0: µ = 40
H1: µ < 40.
To do so, the researcher has 10,000 high school students complete the maze with classical music playing. The mean time for these students is x̄ = 39.8 seconds and the P-value is less than 0.0001. Suppose that two high school students decide to see if they get the same results as the researcher. They both take the maze while listening to classical music. The mean of their times is x̄ = 39.8 seconds, the same as that of the researcher. It is appropriate to conclude which of the following?
a. They have reproduced the results of the researcher and their P-value will be the same as that of the researcher.
b. They have reproduced the results of the researcher, but their P-value will be slightly smaller than that of the researcher.
c. They will reach the same statistical conclusion as the researcher, but their P-value will be a bit different from that of the researcher.
d. None of the above.

Use the following information to answer questions 12 and 13: A car manufacturer claims that a certain model averages 30 miles per gallon (mpg) of gasoline, with a standard deviation of 2 mpg. A consumer organization believes the mean mileage is lower than that claimed.
They buy a sample of cars and find that the mean gas mileage is 28.5 mpg.

(5) 12. What are the null and alternative hypotheses of the test? Use symbols, not words.
H0: µ = 30
H1: µ < 30

(5) 13. The test is conducted with α = 0.01. The P-value is found to be 0.007. Which of the following is the correct conclusion?
a. The true mean mileage of this model is 30 mpg as the manufacturer claims.
b. The true mean mileage of this model does not differ significantly from 30 mpg.
c. The true mean mileage of all cars of this model is 28.5 mpg.
d. The true mean mileage of all cars of this model is less than 30 mpg.

(5) 14. A 90% confidence interval estimate of the population mean µ can be interpreted to mean that
a. if we repeatedly draw samples of the same size from the same population, 90% of the values of the sample means x̄ will result in a confidence interval that includes the population mean µ.
b. there is a 90% probability that the population mean µ will lie between the lower confidence limit (LCL) and the upper confidence limit (UCL).
c. we are 90% confident that we have selected a sample whose range of values does not contain the population mean µ.
d. we are 90% confident that 10% of the values of the sample means x̄ will result in a confidence interval that includes the population mean µ.

(5) 15. If two populations are normally distributed, the sampling distribution of the difference in sample means, X̄1 - X̄2, will be
a. approximately normally distributed.
b. normally distributed only if both sample sizes are greater than 30.
c. normally distributed.
d. normally distributed only if both population sizes are greater than 30.

(5) 16. If two random samples of sizes n1 and n2 are selected independently from two populations with variances σ1² and σ2², then the standard error of the sampling distribution of the difference in sample means, X̄1 - X̄2, equals
a. sqrt(σ1²/n1 - σ2²/n2)
b. sqrt(σ1²/n1 + σ2²/n2)
c. sqrt((σ1² - σ2²)/(n1 n2))
d. sqrt((σ1² + σ2²)/(n1 n2))
Answer: b

(5) 17.
An agricultural researcher plants 25 plots with a new variety of corn. The average yield for these plots is x̄ = 150 bushels per acre. Assume that the yield per acre for the new variety of corn follows a normal distribution with unknown mean µ and standard deviation σ = 10 bushels per acre. A 90% confidence interval for µ is
a. 150 ± 2.00
b. 150 ± 3.29
c. 150 ± 3.92
d. 150 ± 32.90

(5) 18. An engineer designs an improved light bulb. The previous design had an average lifetime of 1200 hours. The new bulb had a lifetime of 1201 hours, using a sample of 2000 bulbs. Although the improvement observed is quite small, the effect was statistically significant. The explanation is
a. that new designs typically have more variability than standard designs.
b. that the sample size is very large.
c. that the mean of 1200 is large.
d. all of the above.

(5) 19. Suppose that the population of the scores of all high school seniors that took the SAT-M (SAT math) test this year follows a normal distribution with mean µ and standard deviation σ = 100. You read a report that says, "On the basis of a simple random sample of 100 high school seniors that took the SAT-M test this year, a confidence interval for µ is 512.00 ± 25.76." The confidence level for this interval is
a. 90%
b. 95%
c. 99%
d. over 99.9%

Use the following to answer questions 20 and 21: Bags of a certain brand of tortilla chips claim to have a net weight of 14 ounces. Net weights actually vary slightly from bag to bag and are normally distributed with mean µ. A representative of a consumer advocate group wishes to see if there is any evidence that the mean net weight is less than advertised and so intends to test the hypotheses
H0: µ = 14
H1: µ < 14.
To do this, he selects 24 bags of this brand at random and determines the net weight of each. He finds the sample mean to be x̄ = 13.82 and the sample standard deviation to be s = 0.44.

(5) 20.
Referring to the above data, suppose we were not sure if the distribution of net weights was normal. In which of the following circumstances would we not be safe using a t procedure in this problem?
a. The mean and median of the data are nearly equal.
b. A histogram of the data shows slight skewness.
c. A stemplot of the data has a large outlier.
d. The sample standard deviation is large.

(5) 21. Based on the above data, we would
a. Reject H0 at significance level 0.10 but not at 0.05.
b. Reject H0 at significance level 0.05 but not at 0.025.
c. Reject H0 at significance level 0.025 but not at 0.01.
d. Reject H0 at significance level 0.01.

(5) 22. True or False. The best time to formulate the hypotheses in a test of significance is after summary statistics from the data are computed and analyzed. That way, you know in which direction to set up the alternative hypothesis.
a. True.
b. False.
c. Could be true or false, depending on the situation.
d. There’s not enough information provided to determine the answer.

(30) 23. The Admissions officer for the graduate programs at Michigan State University believes that the average score on the GRE exam at her university is significantly higher than the national average of 1300. Assume that the population of scores is normally distributed and the population standard deviation is 125. A random sample of 25 scores had an average of 1370.

a. Determine the critical value and rejection region in terms of the sample mean X-bar if a 0.05 level of significance is used.
H0: µ = 1300, H1: µ > 1300, so this is an upper one-tailed test with rejection region Z > z_α. This is a z-test of the mean because we are dealing with averages and the population σ is known. The critical value is z = +1.645, so the rejection region is z values > +1.645. The question asks for the critical value and rejection region in terms of x-bar. Since x-bar = µ + z(σ/sqrt(n)), the critical value in terms of x-bar is 1300 + 1.645(125/sqrt(25)) = 1300 + 1.645(25) = 1341.13. So x-bar values > 1341.13 would be significant at α = 0.05 (this is the rejection region in terms of x-bar).

b. Calculate the appropriate standardized test statistic.
z = (x-bar - µ)/(σ/sqrt(n)) = (1370 - 1300)/(125/sqrt(25)) = 70/25 = 2.8

d. Give (or bound) the P-value as exactly as the tables in the text allow.
P(Z > z) = 1 - P(Z < z) = 1 - 0.9974 = 0.0026

e. Does the Admissions officer have sufficient statistical evidence to support her belief? Justify your answer.
Yes. Because p = 0.0026 < α = 0.05, there is sufficient evidence to support that the average GRE score at her university is higher than the national average.

f. If this researcher later found out that the average GRE scores at her university were mis-quoted and were actually found to be 1301, a number very close to 1300, what type of error has she made?
TYPE I error (reject a true null hypothesis)

(10) 24. Gasoline octane numbers are periodically measured by a gasoline manufacturer. Data for 10 samples from the manufacturing process produced a sample mean of 86.962. The octane levels have historically been approximately normally distributed with a population standard deviation of 0.113.

a. Construct a 99% confidence interval for the population mean octane level.
z-interval (population sd given):
x-bar ± z_(α/2) * σ/sqrt(n) = 86.962 ± 2.575(0.113/sqrt(10)) = 86.962 ± 0.0920, or (86.870, 87.054)

b. How many gasoline samples are necessary to estimate the true mean octane level with 99% confidence and a margin of error of only 0.04?
n = (z_(α/2) * σ / W)^2 = (2.575 * 0.113 / 0.04)^2 = 52.9. The sample size must be rounded up, so 53 gasoline samples would be required.
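
The calculations in problems 23 and 24 can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later (for statistics.NormalDist); the function names are illustrative and are not part of the exam. It uses exact normal quantiles rather than the rounded table values 1.645 and 2.575, so results may differ from hand calculations in the last digit.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution (mean 0, sd 1)


def upper_tail_z_test(xbar, mu0, sigma, n, alpha):
    """One-sample z-test of H0: mu = mu0 against H1: mu > mu0, sigma known."""
    se = sigma / math.sqrt(n)                    # standard error of the mean
    z = (xbar - mu0) / se                        # standardized test statistic
    p_value = 1 - Z.cdf(z)                       # upper-tail P-value
    xbar_crit = mu0 + Z.inv_cdf(1 - alpha) * se  # critical value on the x-bar scale
    return z, p_value, xbar_crit


def z_interval(xbar, sigma, n, conf):
    """Two-sided z confidence interval for a mean, sigma known."""
    z_star = Z.inv_cdf(1 - (1 - conf) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


def sample_size(sigma, margin, conf):
    """Smallest n whose z-interval margin of error is at most `margin`."""
    z_star = Z.inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z_star * sigma / margin) ** 2)  # always round up


# Problem 23: GRE scores (mu0 = 1300, sigma = 125, n = 25, x-bar = 1370)
z, p, crit = upper_tail_z_test(1370, 1300, 125, 25, 0.05)
print(f"z = {z:.2f}, P-value = {p:.4f}, reject H0 if x-bar > {crit:.2f}")

# Problem 24: octane levels (x-bar = 86.962, sigma = 0.113, n = 10)
low, high = z_interval(86.962, 0.113, 10, 0.99)
n_needed = sample_size(0.113, 0.04, 0.99)
print(f"99% CI = ({low:.3f}, {high:.3f}), samples needed = {n_needed}")
```

The sample-size function rounds up with math.ceil because rounding down would leave the margin of error slightly larger than the target.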