Math 251, Review for Final, Autumn 2002

The following questions are samples of the types of questions that may be on the final. There may be questions on the final from topics not represented here. For further review, look at your old tests and reviews, assigned homework, etc. Material covered since the 3rd test will probably make up about 30% of the points on the final. This material includes hypothesis tests for means (large samples and small samples), hypothesis tests for proportions, the hypothesis test for the difference of two population means, the chi-square test for goodness of fit, and analysis of variance. The rest of the test will consist of questions chosen from the other material covered throughout the quarter.

1. (a) Which type of random variable is the number of consumers refusing to answer a telephone survey, and what possible values can it take?
(b) How many bridge hands are there that have 4 aces? What is the probability of getting such a hand?

2. For events A and B in a sample space S, we are told P(A) = .5, P(B) = .3, and P(A and B) = .15. Which of the following is true?
(a) A and B are independent events.
(b) P(A or B) = .8
(c) A and B are mutually exclusive events.
(d) All of the above.

3. Which of the following is true about a binomial random variable for n trials with probability of success on each trial given as p?
(a) The probability of n successes is $p^n$.
(b) Its variance is equal to $np(1-p)$.
(c) The probability of no successes is $(1-p)^n$.
(d) All of the above.

4. An hypothesis test on the mean reports a P-value of .031. Which of the following is true?
(a) The null hypothesis should be accepted if the level of significance is .03.
(b) The null hypothesis should be rejected if the level of significance is .05.
(c) There is almost a 97% chance of making a Type I error.
(d) All of the above.

5.
If a 95% confidence interval for the population mean has length 12 when the sample size is 100, what would the length of a 95% confidence interval from the same population be if the sample size were 1600?
(a) 12
(b) 48
(c) 3
(d) 6

6. A two-tailed hypothesis test on the mean of an approximately normal population is conducted with a sample size of n = 10. For what t-values should the null hypothesis be rejected given that the level of significance is .05?
(a) $t \le -1.96$ or $t \ge 1.96$
(b) $t \le -2.262$ or $t \ge 2.262$
(c) $t \le -1.833$ or $t \ge 1.833$
(d) $t \le -2.228$ or $t \ge 2.228$

7. (a) Given the data 9, 12, 15, 17, 17, 19, 23, 44, 57, 61, 63, 70, find the mean, median, range, and mode.
(b) If your score is at the 81st percentile on a national exam which was taken by 200,000 people, approximately how many of those 200,000 test takers scored higher than you?

8. In a state with 459,341 voters, a poll of 2300 voters finds that 45 percent support the Republican candidate, whereas in reality, unknown to the pollster, 42 percent support the Republican candidate.
(a) What is the value of the statistic of interest?
(b) What is the value of the parameter of interest?
(c) Describe the population of interest.
(d) In general, is it true that given a certain population, the parameter of interest will not change under repeated sampling? Explain.

9. (a) According to Chebyshev's theorem, how much data from any distribution can be more than 3 standard deviations from the mean?
(b) Given a population of size 4,800 with unknown distribution, at least how many data values are within 4 standard deviations of the mean?

10. The following ranked data (read down each column) represent the number of miles driven each day by a salesman over a 30-day period.

31   71   86
37   74   86
43   75   87
44   75   89
44   78   89
55   81   92
58   81   92
65   81   93
65   82   99
66   84   101

Construct a relative frequency histogram for these data whose first class has class limits 30-44.

11.
Consider the sample of 30 numbers

31   71   86
37   74   86
43   75   87
44   75   89
44   78   89
55   81   92
58   81   92
65   81   93
65   82   121
66   84   133

for which $\sum x = 2258$ and $\sum x^2 = 184670$ (or $\sum (x - \bar{x})^2 = 14{,}717.86667$). Find:
(a) the sample mean
(b) the sample variance
(c) the sample standard deviation
(d) Given that $Q_1 = 65$, $Q_2 = 79.5$, and $Q_3 = 87$, construct a boxplot for the data.
(e) Find the interquartile range for the data.

12. (True or False)
(a) The median is a resistant measure because it is not influenced by extreme observations.
(b) The mean is a resistant measure because extreme measures on one side average out with those on the other side.
(c) The mean and median are equal in a symmetric distribution.
(d) The mean is usually to the right of the median in a distribution that is skewed to the right.

13. The following represent scores of a group of 15 students on Math and English tests.

Scores on English test:  73 75 77 77 78 79 80 81 82 83 84 85 85 86 89
Scores on Math test:     72 75 79 83 84 85 87 88 90 91 92 93 93 97 98

(a) Construct stem and leaf plots for both tests, splitting stems 7, 8, 9 into two parts with leaves 0-4 on one part and 5-9 on the other part.
(b) Which test scores seem to have a higher standard deviation? Explain. Don't compute!

14. Suppose the distribution of test scores for a certain test is normal with $\mu = 70$ and $\sigma = 12$. Suppose that 500 students wrote the test.
(a) What test score would have a z-score of -2.25?
(b) What score would put a student at the 90th percentile?
(c) Approximately what number of students would have scores between 60 and 90?

15. A study of the behavior of a large number of drug offenders after treatment for drug abuse suggests that the likelihood of conviction within a two-year period after treatment may depend on the offender's education.
The proportions of the total number of cases falling into four education/conviction categories are shown in the following table:

                                  Convicted   Not Convicted
10 or more years of education        .10           .30
9 or less years of education         .27           .33

Suppose a single offender is selected from the treatment program. Define the events:
A: The offender has 10 or more years of education.
B: The offender is convicted within 2 years of completion of treatment.
Find:
(a) P(A or B)
(b) P(A and B)
(c) P(B|A)
(d) The probability that neither A nor B occurs.
(e) Are A and B independent?
(f) Are A and B mutually exclusive?

16. A business employs 600 men and 400 women. Five percent of the men and 10% of the women have been working there for more than 20 years. If an employee is selected by chance, what is the probability the employee is male, given that the length of employment is more than 20 years?

17. (a) How many permutations are there of 30 objects taken 3 at a time?
(b) In how many ways can a gold medal, silver medal, and bronze medal be awarded to 30 competitors in a fencing competition?
(c) How many menu possibilities are there in a restaurant that offers 5 different appetizers, 6 salads, 12 main dishes, and 10 desserts if one choice is made from each category?
(d) Suppose that a large shipment of CDs contains 5% defective CDs. Suppose a customer chooses 2 of these CDs at random. What is the probability that:
i) Both CDs will be good?
ii) Both CDs will be defective?
iii) Exactly one CD is defective?
iv) At least one CD is defective?
v) At least one CD is good?

18. A jury pool consists of 13 men and 15 women. What is the probability that a randomly chosen jury from this pool will consist of 5 men and 7 women?

19. Let x be the random variable that represents the number of heads observed when 5 fair coins are tossed. Make a probability distribution for x, and find the probability that one will get more than 3 heads when tossing five fair coins.

20.
Consider the random variable whose probability distribution is given by the following table.

x       3     7     8     11
p(x)    .1    .3    .45   ?

(a) Is this a discrete or continuous random variable?
(b) Find P(x = 11).
(c) Construct a probability histogram for p(x), and compute the expected value of x and the standard deviation of x.

21. The following sample data concern the number of years a student studied German in school versus their score on a proficiency test.

Years (x)         3    4    4    2    5    3    4    5    3    2
Test Score (y)   57   78   72   58   89   63   73   84   75   48

Note: $\sum x = 35$, $\sum y = 697$, $\sum x^2 = 133$, $\sum y^2 = 50085$, $\sum xy = 2554$.

(a) Find the equation of the least squares line for this data.
(b) Use your line from (a) to predict the score on the proficiency test of a person who had 3.5 years of German.
(c) Use the regression line in (a) to predict the number of years of German required to achieve a proficiency score of 75.
(d) Compute the correlation coefficient r for this data. What does this coefficient suggest about a linear relationship between the number of years German was studied in school and test scores for this sample? That is, determine whether it is a good fit, and whether it indicates a positive or negative linear relationship.

22. Cascade Airlines (a.k.a. “Crashcade” and now defunct) records showed that on average 10% of prospective passengers will not claim their reservations on a certain flight. Suppose that they booked 21 passengers for 20 seats on that flight.
(a) Find the mean and standard deviation for the number of passengers who will claim a reservation.
(b) Find the probability that all passengers who show up for the flight will receive a seat.

23. A developer wishes to test whether the mean depth of water below the surface in a large development tract was less than 500 feet. For the sample data, n = 32 test holes, the sample mean was 486 feet, and the standard deviation was s = 53 feet. Complete the test using the P-value approach, and report the conclusion for a 1% level of significance.

24.
A vendor was concerned that a soft drink machine was not dispensing 6 ounces per cup, on average. A sample of size 40 gave a mean amount per cup of 5.95 ounces and a standard deviation of .15 ounce.
(a) Find the P-value.
(b) For which of the following levels of significance would the null hypothesis be rejected?
(c) For each case in part (b), what type of error has possibly been committed?
(d) Find a 98% confidence interval for the mean amount of soda dispensed per cup.
(e) Supposing that the population standard deviation is $\sigma = .15$, what sample size would be needed so that the margin of error in a 98% confidence interval is E = .01?

25. On June 7, 1999 a poll on the USA Today website showed that out of 2000 respondents, 71% felt that Andre Agassi deserved to be ranked among the greatest tennis players ever.
(a) Assuming that the 2000 respondents form a random sample of the population of tennis fans, construct a 95% confidence interval for the proportion of all tennis fans who feel that Andre Agassi should be ranked among the greatest tennis players ever.
(b) Based on (a), would you be comfortable in saying that the poll is accurate to within plus or minus 2 percent 19 times out of 20? Explain.
(c) In actuality, the survey was based on voluntary responses from readers of the USA Today sports website. Do you think the 2000 respondents actually formed a random sample? Explain.

26. (a) Suppose that a February Gallup poll of 1200 randomly selected voters found that 53 percent support George W. Bush's energy policy. Conduct an hypothesis test at a level of significance of $\alpha = .01$ to test whether the true voter population support for George W. Bush's energy policy in February was greater than 50 percent.
(b) Report the P-value of the test in (a) and give a practical interpretation of it.

27. A brand of paint claims that in one coat, 1 gallon will cover at least 350 square feet on average. A random sample of ten 1-gallon cans produced the following data.
Area Covered (Square Feet): 342, 378, 358, 364, 381, 392, 339, 356, 386, 347

Note: for this data $\sum x = 3643$ and $\sum x^2 = 1330395$.

(a) Conduct the hypothesis test $H_0: \mu = 350$ vs. $H_a: \mu > 350$ at a level of significance of $\alpha = .05$. Be sure to state the critical region, test statistic, and conclusion in your answer.
(b) Construct a 99% confidence interval for the mean.

28. In a 1993 survey of 50 Education graduates and 50 Social Science graduates, the following data were obtained for their average starting salaries.

Major              Mean      St. Dev
Education          22,554    2225
Social Sciences    20,348    2375

(a) Find a point estimate for the difference in average starting salaries for Education and Social Science majors.
(b) Let $\mu_1$ be the population mean salary for the Education graduates and $\mu_2$ be the population mean salary for the Social Science graduates. Report the P-value for the hypothesis test $H_0: \mu_1 - \mu_2 = 1200$ versus $H_a: \mu_1 - \mu_2 > 1200$.
(c) Based on (b), do you think there is sufficient evidence to believe that $\mu_1$ is at least \$1200 greater than $\mu_2$? Explain.

29. Suppose that the probability is .91 that a person who has reservations for a certain opera will show up, and the decision of one person is independent from that of another. Suppose the opera has sold 1243 tickets. What is the probability that at least 1140 people will show up for the opera?

30. (a) If you were to conduct an hypothesis test to determine if the means from several different populations are equal using the method of analysis of variance, what assumptions would you make on the populations? What distribution would you use to conduct your test?
(b) Do problem 3, p. 532. (See answer in text.)

31. (a) A local radio station claims that 15 percent of all people in Riverside say it is their favorite station, 65 percent of all people in Riverside listen to it occasionally, while 20 percent never listen to it.
Suppose you surveyed 200 randomly selected people in Riverside and found that of those 200 people, 20 claimed it was their favorite station, 131 said they listen to it occasionally, while 49 never listen to it. Conduct an hypothesis test at a level of significance of .05 to determine whether the station's claim is correct. Make sure to state the rejection region for your test.
(b) What are the assumptions one must make when using the chi-square test for goodness of fit?
(c) For further practice, see, e.g., problem 3, p. 500.

32. List conditions that are needed on the population and on the random sample(s) in order to make inferences in the following settings. In some cases, there may be no conditions required, so just list none.
(a) Confidence interval for a mean from a large sample.
(b) Hypothesis test on a mean using a small sample.
(c) Hypothesis test on a proportion.
(d) Hypothesis test concerning two means from large independent samples.

33. Confidence intervals for variance and standard deviation. Do problem #11 on p. 307. See text for answer.
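
For checking work on question 31(a), the goodness-of-fit arithmetic can be sketched in Python as below. This is only an illustrative sketch, not a model solution: the variable names are made up for the example, and the closed-form critical value and P-value rely on the fact that with 2 degrees of freedom the chi-square upper-tail probability is exp(-x/2). For any other number of degrees of freedom, use the chi-square table in the text instead.

```python
import math

# Claimed proportions from question 31(a) and the observed survey counts.
claimed = {"favorite": 0.15, "occasionally": 0.65, "never": 0.20}
observed = {"favorite": 20, "occasionally": 131, "never": 49}

n = sum(observed.values())  # 200 people surveyed

# Expected count in each category if the station's claim (H0) is true.
expected = {k: n * p for k, p in claimed.items()}

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in claimed)

df = len(claimed) - 1  # 3 categories, so 2 degrees of freedom
alpha = 0.05

# Assumption used here: for df = 2 only, P(X > x) = exp(-x / 2), so the
# critical value and the P-value both have closed forms.
critical_value = -2 * math.log(alpha)
p_value = math.exp(-chi_sq / 2)

# Reject H0 when the statistic falls in the upper-tail rejection region.
reject_h0 = chi_sq > critical_value

print(f"chi-square = {chi_sq:.3f}, critical value = {critical_value:.3f}")
print(f"P-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Comparing the printed statistic with the critical value mirrors the rejection-region approach asked for in the question; the P-value line gives the same decision from the other direction.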