Practice Test 3 – Summer 2010

1. What are the following values for Z?
a. Z0.0102 = 2.32
b. Z0.5 = 0

2. What are the following values for t where the degrees of freedom is 23?
a. t0.01 = 2.500
b. t0.025 = 2.069

3. Doctors at the UT Medical Research University have determined that the time to complete a specific operation is normally distributed with mean 200 minutes and standard deviation 10 minutes. Doctors who complete this operation in much less time may be taking shortcuts and putting the patient at risk. Doctors who take much longer may be losing their skills as a result of age or some other problem. It was decided to measure the time it takes Dr. Feelgood to do this operation for his next 16 patients. Plans are to estimate the average time [X-Bar] it takes the doctor to complete the operation on these 16 patients.

a. What is the sampling distribution of the sample mean? Be complete in your answer.
X-Bar ~ Normal
μX-Bar = μX = 200
σX-Bar = σX/SQRT(n) = 10/SQRT(16) = 10/4 = 2.5

b. Draw a picture of this sampling distribution and label the 3 sigma limits.
[Picture not reproduced. The 3 sigma limits are μX-Bar ± 3*σX-Bar = 200 ± 3*(2.5), i.e. 192.5 and 207.5.]

c. After collecting the data for the 16 patients, the sample mean was calculated to be 205.1 minutes and the sample standard deviation was calculated to be 11.0 minutes (not needed here, since we were given the "true" standard deviation of 10; this means that you will use z scores). Test the hypothesis that the mean time to complete this operation is 200 minutes at a 5% level of significance [H0: μX = 200]. Show all work and be sure to write your managerial statistical summary of what you found. You may use the unstandardized approach, standardized approach, or p-value approach to work this problem.
Unstandardized Approach:
UCV = μX-Bar + 1.96*σX-Bar = 200 + 1.96*(2.5) = 204.9
LCV = μX-Bar - 1.96*σX-Bar = 200 - 1.96*(2.5) = 195.1
Since X-Bar (205.1) > UCV (204.9), reject H0: μX = 200 at a 5% level of significance (α = 0.05).
Standardized Approach:
Z = (X-Bar - μX-Bar) / σX-Bar = (205.1 - 200) / 2.5 = 2.04
Since Z (2.04) > Zα/2 (1.96), reject H0: μX = 200 at a 5% level of significance (α = 0.05).
p-Value Approach: see part d.

d. Calculate the p-value for this hypothesis test.
p-value = 2*P(Z > 2.04) = 2*(0.0207) = 0.0414
Since p-value (0.0414) < α (0.05), reject H0: μX = 200 at a 5% level of significance (α = 0.05).

e. Provide a 95% confidence interval estimate for the mean time [use information given in part c above].
X-Bar ± Zα/2 * σX-Bar = X-Bar ± Zα/2 * [σX/SQRT(n)] = 205.1 ± 1.96*[10/SQRT(16)] = 205.1 ± 4.9 = [200.2, 210.0]
Note: assume the standard deviation is known (10).

f. Provide a point estimate for the mean time to complete the operation.
Point estimate for μX = X-Bar = 205.1

g. Assuming the hypothesis was rejected and the hospital board told Dr. Feelgood that he was too slow and would be forced to retire, Dr. Feelgood called his great grandson, Leroy Studyhard, who had just completed a statistics course at UTA, to look at the data and tell him if anything was wrong in the approach taken to reach this conclusion. Leroy immediately decided that the "assumed standard deviation of 10 minutes" might be incorrect and that the hospital should have used the sample standard deviation of 11 minutes to test the hypothesis. Now, test the hypothesis H0: μX = 200 at a 5% level of significance, assuming the standard deviation is unknown and you have to use the sample standard deviation of 11 in your analysis.
Standardized Approach:
t = (X-Bar - μX-Bar) / [sX/SQRT(n)] = (205.1 - 200) / (11/4) = 1.85
Since t (1.85) < tα/2 (2.131), fail to reject H0: μX = 200 at a 5% level of significance (α = 0.05).
Note: d.f. = n - 1 = 16 - 1 = 15 and tα/2 = t0.025 = 2.131

h. What argument would you suggest that Leroy tell his great grandfather to present to the hospital administration to keep his job?
Unless the hospital has empirical evidence that the assumed true standard deviation (10) is valid, it would be more reasonable to use the sample standard deviation (11), which came from "real" data. This would then result in "Failing to Reject H0: μX = 200", which would imply that the hospital has "NO" statistical proof that his great grandfather is any different from other doctors as to the mean time to do an operation.

i. Provide a 95% confidence interval estimate for the mean time [use information given in part g above].
X-Bar ± tα/2 * sX-Bar = X-Bar ± tα/2 * [sX/SQRT(n)] = 205.1 ± 2.131*[11/SQRT(16)] = 205.1 ± 5.9 = [199.2, 211.0]

j. Now provide a point estimate for the mean time to complete the operation.
Same as before: Point estimate for μX = X-Bar = 205.1

k. After all the dust settled in the lawsuit, the hospital administration decided to collect data on all the doctors and wanted to estimate the mean time to complete the operation within ± 0.7 minutes at 95% confidence. They really want a good/precise/accurate estimate for the mean time to complete the operation. Since the standard deviation calculated for Dr. Feelgood was 11, they decided to use this value for the standard deviation. Since Leroy appeared to understand this statistical stuff, they hired him to help out. What sample size should he recommend?
n = [(Zα/2 * σX) / B]^2 = [(1.96 * 11)/0.7]^2 = (30.8)^2 = 948.64, so n = 949 after rounding up.
Note: a margin of ± 0.07 minutes would instead require [(1.96 * 11)/0.07]^2 = (308)^2 = 94,864 operations, which is the scale the reaction in part l responds to.

l. What do you think the hospital administration will say when you give them your answer in part k above?
You've got to be kidding! That's more operations than we will do in the next 50 years. What can you tell me with a really small sample size? [In some cases, this might not be much.]

4. The local Koke plant is concerned about the ability of a new filling machine to fill 12 oz Koke cans.
They heard of Leroy's great skills with statistics and hired him as a consultant to check the new machine, just to make sure that the average fill is 12 oz as specified on the outside of the can.

a. Leroy took a sample of 5,000 cans during a test production run and tested the hypothesis that the mean fill was 12 oz at a 5% level of significance, resulting in a "rejection of the null hypothesis". In fact, based on the data, the average fill appears to be significantly less than 12 oz. Leroy tells the CEO of the Koke plant that they are underfilling the cans and that they could get in trouble under the FDA's truth-in-labeling rules. The CEO said, "Since you have rejected the hypothesis that the mean fill is 12 oz, what would you estimate the mean fill to actually be?" Leroy then calculated a 95% confidence interval, resulting in 11.996 ± 0.001, i.e. [11.995, 11.997]. Even though the mean has been shown to be statistically less than 12 oz, what would a "common sense/practical" comment be to justify using the new machine?
Even though we rejected the hypothesis that the mean is equal to 12 oz, the "MAGNITUDE" of the difference as reflected in the confidence interval [11.995, 11.997] is NOT large enough to be of any "PRACTICAL" importance. In other words, even if the true mean were at the lower confidence limit (11.995), the difference between this mean (11.995) and the targeted mean (12.0) is ONLY 0.005 oz. Most people would probably say that this is close enough to 12.0 oz.

b. Why do you think you can reject ANY hypothesis if the sample size is large enough? Might want to use the power curves below for three different sample sizes.
First of all, the probability that the true mean is exactly equal to 12.0000000000000... is, logically and statistically, "0". As the sample size gets larger, the variability of X-Bar gets smaller (it approaches 0 as the sample size approaches infinity), since we know that σX-Bar = σX/SQRT(n).
We also know that the mean of the X-Bars (μX-Bar) is equal to the mean of X (μX). So, if the null hypothesis is true, we would expect our X-Bar to be very close to 12 (in a limiting sense, equal to 12). Now, since we started out saying that the probability of the true mean being exactly equal to 12.00000 is "0", this implies that the true mean is something other than 12.000... If the true mean is anything other than 12.000000, then according to the logic discussed above we would expect X-Bar to be very close to this true mean (which is not 12.0000), and as the sample size gets larger, eventually this X-Bar would be in the rejection region of your hypothesis. See the following picture of the sampling distribution of X-Bar as the sample size gets larger.

5. An extensive campaign at the BigLots company encouraged employees to carpool in order to win the "Green Award" given out by the mayor. In order for the mayor to determine if BigLots should win this award, she decided to send her son, Leroy Studyhard, out to the parking lot to randomly sample 100 cars parked in the employee lot [the employee parking lot has over 10,000 autos parked in it]. Leroy leaves a survey on each windshield offering a cash award if the employee fills out the survey and drops it off at the exit gate. The survey asked the following question: "Do you have any employees riding with you who do not live in your house?" If the answer is "yes", then that car is involved in carpooling. Of the 100 cars surveyed, 20 responded yes and 80 responded no.

a. Estimate the proportion of BigLots autos in the parking lot that are involved in carpooling.
p-hat = 20/100 = 0.2

b. Provide a 95% confidence interval estimate for the proportion involved in carpooling.
p-hat ± Zα/2 * SQRT[(p-hat)*(1 - p-hat)/n] = 0.2 ± 1.96*SQRT[(0.2)*(0.8)/100] = 0.2 ± 0.078 = [0.122, 0.278]

c. After not winning the award, the CEO of BigLots calls the mayor and complains, because he believes that at least 40% of his employees are involved in carpooling, which is much larger than the winning company's 30%. How should the mayor explain why they did not receive the Green Award?
Based on the survey results for your company, we are 95% confident that the percent of your employees' cars involved in carpooling is between 12.2% and 27.8%. Since even the upper limit (27.8%) is below 30%, the winner, with 30% involved, did better than your company.

d. After the explanation, the CEO believes that the sample size taken by the mayor was simply too small to get an accurate estimate of the proportion of carpoolers in his company, and he insists that next year the mayor take a much larger sample. If the mayor does this [larger sample], how would you expect the width of the confidence interval to change [stay the same, get wider, get tighter]? Would you expect the results to indicate that the true proportion of carpoolers is approximately 20% or 40%?
As the sample size gets larger, I would expect the confidence interval to get tighter, and based on previous results I would expect it to still be centered around some value between 12.2% and 27.8% (last year's confidence interval). Hence, I have no reason to expect this new interval to be centered around 40%.
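
The arithmetic in questions 3 and 5 can be checked numerically. What follows is a minimal Python sketch using only the standard library; it is not part of the original test. The function names (z_test_two_sided, t_statistic, proportion_ci, sample_size_for_mean) are illustrative assumptions, and the critical values 1.96 and 2.131 are copied from the tables used above rather than computed.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z_CRIT = 1.96   # z(0.025), table value used throughout the test
T_CRIT = 2.131  # t(0.025) with 15 d.f., table value used in part 3g


def z_test_two_sided(xbar, mu0, sigma, n):
    """Known-sigma test of H0: mu = mu0. Returns (z, two-sided p-value)."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def t_statistic(xbar, mu0, s, n):
    """Test statistic when sigma is unknown and the sample std deviation s is used."""
    return (xbar - mu0) / (s / sqrt(n))


def proportion_ci(p_hat, n, crit=Z_CRIT):
    """Large-sample confidence interval for a proportion."""
    half_width = crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def sample_size_for_mean(sigma, margin, crit=Z_CRIT):
    """Sample size from n = (crit * sigma / margin)^2, rounded up."""
    return ceil((crit * sigma / margin) ** 2)


# Question 3, parts c, d and g: same data, known sigma vs. sample std deviation
z, p = z_test_two_sided(205.1, 200, 10, 16)
t = t_statistic(205.1, 200, 11, 16)
print(f"z = {z:.2f}, p-value = {p:.4f}, reject H0: {abs(z) > Z_CRIT}")
print(f"t = {t:.2f}, reject H0: {abs(t) > T_CRIT}")

# Question 3, part k
print("n for a margin of 0.7:", sample_size_for_mean(11, 0.7))

# Question 5, parts b and d: the interval tightens as n grows
for n in (100, 400, 1600):
    low, high = proportion_ci(0.2, n)
    print(f"n = {n}: [{low:.3f}, {high:.3f}]")
```

The loop holds p-hat at 0.2 for illustration only; a new sample would produce its own estimate. Because the half-width is proportional to 1/SQRT(n), quadrupling the sample size halves the width of the interval, which is the tightening described in part d.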