Practice Test 3 – Summer 2010
1. What are the following values for Z?
a. Z0.0102 = 2.32
b. Z0.5 = 0
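The Z values in question 1 can be checked without a table. The sketch below assumes the subscript denotes the upper-tail area, so that P(Z > Zα) = α; the helper name `z_upper` is made up for this illustration.

```python
from statistics import NormalDist

def z_upper(alpha: float) -> float:
    """Return z such that P(Z > z) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1.0 - alpha)

z_a = z_upper(0.0102)  # upper-tail area of 0.0102
z_b = z_upper(0.5)     # upper-tail area of 0.5, i.e. the median

print(round(z_a, 2), round(z_b, 2))
```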
2. What are the following values for t when there are 23 degrees of freedom?
a. t0.01 = 2.500
b. t0.025 = 2.069
3. Doctors at the UT Medical Research University have determined that the time to
complete a specific operation is normally distributed with mean 200 minutes and std
deviation 10 minutes. Doctors who complete this operation in much less time may be
taking shortcuts and putting the patient at risk. Doctors who take much longer may be
losing their skills as a result of age or some other problem. It was decided to measure the
time it takes Dr. Feelgood to do this operation for his next 16 patients. Plans are to
estimate the average time [X-bar] it takes the doctor to complete the operation on these
16 patients.
a. What is the sampling distribution of the sample mean? Be complete in your
answer.
X-Bar ~ Normal
μX-Bar = μX = 200
σX-Bar = σX/SQRT(n) = 10/4 = 2.5
b. Draw a picture of this sampling distribution and label the 3 sigma limits.
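As a rough aid for labeling the picture, the 3 sigma limits follow directly from the values in part a. This is only a sketch of the arithmetic; the variable names are chosen for illustration.

```python
import math

mu = 200.0     # population mean (minutes)
sigma = 10.0   # population standard deviation (minutes)
n = 16         # number of patients sampled

se = sigma / math.sqrt(n)   # standard deviation of X-Bar: 10/4 = 2.5
lower_3s = mu - 3 * se      # lower 3 sigma limit
upper_3s = mu + 3 * se      # upper 3 sigma limit

print(se, lower_3s, upper_3s)
```

Under these values the limits work out to 192.5 and 207.5 minutes.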
c. After collecting the data for the 16 patients, the sample mean was calculated to
be 205.1 minutes and the sample standard deviation calculated to be 11.0
minutes (not needed here since we were given the “true” standard deviation
of 10 – this means that you will use z scores). Test the hypothesis that the mean
time to complete this operation is 200 minutes at a 5% level of significance. [Ho:
μx = 200]. Show all work and be sure to write your managerial statistical
summary of what you found. You may use the unstandardized approach,
standardized approach, or p-value approach to work this problem.
Unstandardized Approach:
UCV = μX-Bar + 1.96*σX-Bar = 200 + 1.96*(2.5) = 204.9
LCV = μX-Bar - 1.96*σX-Bar = 200 - 1.96*(2.5) = 195.1
Since X-Bar (205.1) > UCV (204.9), reject H0: μX = 200 at a 5% level of significance (α = 0.05).
Standardized Approach:
Z = (X-Bar - μX-Bar) / σX-Bar = (205.1 – 200) / 2.5 = 2.04
Since Z(2.04) > Z/2 (1.96), reject H0: μX = 200, at a 5% level of significance ()
p-Value Approach: See below
d. Calculate the p-value for this hypothesis test.
Since p-Value(0.0414) < (0.05), reject H0: μX = 200, at a 5% level of significance ()
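The three approaches in parts c and d can be reproduced in a few lines. The sketch below uses only Python's standard library and the numbers given in the problem; it computes the critical value from the normal distribution rather than rounding it to 1.96, so the critical limits differ slightly in the later decimals.

```python
from statistics import NormalDist

mu0, sigma, n = 200.0, 10.0, 16   # hypothesized mean, known std deviation, sample size
x_bar, alpha = 205.1, 0.05        # observed sample mean, level of significance

std_normal = NormalDist()
se = sigma / n ** 0.5                          # standard deviation of X-Bar
z_crit = std_normal.inv_cdf(1 - alpha / 2)     # two-tailed critical value, about 1.96

# Unstandardized approach: critical values on the X-Bar scale
lcv, ucv = mu0 - z_crit * se, mu0 + z_crit * se

# Standardized approach: test statistic on the Z scale
z = (x_bar - mu0) / se

# p-value approach: two-tailed area beyond |z|
p_value = 2 * (1 - std_normal.cdf(abs(z)))

reject = p_value < alpha
print(round(lcv, 1), round(ucv, 1), round(z, 2), round(p_value, 4), reject)
```

All three approaches are the same test, so they always lead to the same decision.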
e. Provide a 95% confidence interval estimate for the mean time [use information
given in part c above]
X-Bar + Z/2 * (σX-Bar ) = X-Bar + Z/2 * [σX/SQRT(n)] = 205.1 + 1.96*[10/SQRT(16)]
= 205.1 + 4.9 = [200.2 , 210.0]
Note: assume std deviation is known (10)
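The interval in part e can be written as a small reusable function. This is a sketch only; `z_interval` is a hypothetical helper name, and 1.96 is the table value used in the key.

```python
import math

def z_interval(x_bar: float, sigma: float, n: int, z: float = 1.96):
    """Confidence interval for the mean when the std deviation is known."""
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

low, high = z_interval(205.1, 10.0, 16)
print(round(low, 1), round(high, 1))
```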
f. Provide a point estimate for the mean time to complete the operation.
Point estimate for μX = X-Bar = 205.1
g. Assuming the hypothesis was rejected and the hospital board told Dr. Feelgood
that he was too slow and would be forced to retire, Dr. Feelgood called his great
grandson, Leroy Studyhard, who just completed a statistics course at UTA to look
at the data and tell him if anything was wrong in their approach taken to reach this
conclusion. Leroy immediately decided that the “assumed standard deviation
of 10 minutes” might be incorrect and the hospital should have used the sample
standard deviation of 11 minutes to test the hypothesis. Now, test the hypothesis
Ho: μx = 200 at a 5% level of significance assuming the standard deviation is
unknown and you have to use the sample standard deviation of 11 in your
analysis.
Standardized Approach:
t = (X-Bar - μX-Bar) / [sX/SQRT(n)] = (205.1 – 200) / (11 / 4) = 1.85
Since t(1.85) < t/2 (2.131), fail to reject H0: μX = 200, at a 5% level of significance ()
Note: d.f. = n – 1 = 16 – 1 = 15 and t/2 = t 0.025 = 2.131
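The t version of the test differs from the Z version only in the standard error and the critical value. In the sketch below the critical value 2.131 is simply copied from the t table (15 d.f., 0.025 in each tail), since Python's standard library has no t distribution.

```python
import math

mu0, x_bar = 200.0, 205.1   # hypothesized mean and observed sample mean
s, n = 11.0, 16             # sample standard deviation and sample size
t_crit = 2.131              # t0.025 with 15 d.f., copied from the t table

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
reject = abs(t_stat) > t_crit   # two-tailed decision rule

print(round(t_stat, 2), reject)
```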
h. What argument would you suggest that Leroy tell his great grandfather to present
to the hospital administration to keep his job.
Unless the hospital has empirical evidence that the assumed true standard deviation (10) is
valid, it would be more reasonable to use the sample standard deviation (11), which came from
“real” data. This would then result in “Failing to Reject H0: μX = 200”, which would imply that
the hospital has “NO” statistical proof that his great-grandfather is any different from other
doctors as to the mean time to do the operation.
i. Provide a 95% confidence interval estimate for the mean time [use information
given in part g above]
X-Bar + t/2 * (σX-Bar ) = X-Bar + t/2 * [sX/SQRT(n)] = 205.1 + 2.131*[11/SQRT(16)]
= 205.1 + 5.9 = [199.2 , 211.0]
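One way to connect this interval back to the test in part g: a two-tailed test at the 5% level fails to reject H0 exactly when the hypothesized mean falls inside the 95% interval. The sketch below checks this for the t interval; `t_interval` is a hypothetical helper name, and 2.131 is again the table value.

```python
import math

def t_interval(x_bar: float, s: float, n: int, t_crit: float):
    """Confidence interval for the mean using the sample std deviation."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

low, high = t_interval(205.1, 11.0, 16, 2.131)
contains_mu0 = low <= 200.0 <= high   # is the hypothesized mean inside?

print(round(low, 1), round(high, 1), contains_mu0)
```

Here 200 lies inside the interval, which agrees with the "fail to reject" decision.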
j. Now provide a point estimate for the mean time to complete the operation.
Same as before: Point estimate for μX = X-Bar = 205.1
k. After all the dust settled in the lawsuit, the hospital administration decided to
collect data on all the doctors and wanted to estimate the mean time to complete
the operation within ± 0.7 minutes at 95% confidence. They really want a
good/precise/accurate estimate for the mean time to complete the operation.
Since the standard deviation calculated for Dr. Feelgood was 11, they decided to
use this value for the standard deviation. Since Leroy appeared to understand this
statistical stuff they hired him to help out. What sample size should he
recommend?
n = [(Z/2 * σX) / B]2 = [(1.96 * 11)/ 0.7]2 = 94,864
l. What do you think the hospital administration will say when you give them your
answer in part k above?
You’ve got to be kidding! That’s more operations than we will do in the next 50 years. What can
you tell me with a really small sample size? [In some cases, this might not be much]
4. The local Koke plant is concerned about the ability of a new filling machine to fill 12 oz
Koke cans. They heard of Leroy’s great skills with statistics and hired him as a
consultant to check the new machine out just to make sure that the average fill is 12 oz,
as specified on the outside of the can.
a. Leroy took a sample of 5,000 cans during a test production run and tested the
hypothesis that the mean fill was 12 oz. at a 5% level of significance resulting in a
“rejection of the null hypothesis”. In fact, based on the data, the average fill
appears to be significantly less than 12 oz. Leroy tells the CEO of the Koke plant
that they are underfilling the cans and they could get in trouble with the FDA
rules dealing with the truth in labeling laws. The CEO said “Since you have
rejected the hypothesis that the mean fill is 12 oz, what would you estimate the
mean fill to actually be?” Leroy then calculated a 95% confidence interval
resulting in 11.996 ± 0.001 [11.995 – 11.997]. Even though the mean has been
shown to be statistically less than 12 oz, what “common sense/practical”
comment could be made to justify using the new machine?
Even though we rejected the hypothesis that the mean is equal to 12 oz, the “MAGNITUDE” of
the difference as reflected in the confidence interval [11.995 – 11.997] is NOT large enough to
be of any “PRACTICAL” importance. In other words, even if the true mean were at the lower
confidence limit (11.995), the difference between this mean (11.995) and the targeted mean
(12.0) is ONLY 0.005 oz. Most people would probably say that this is close enough to 12.0 oz.
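The contrast between statistical and practical significance can be put in numbers. The sketch below backs the standard error out of the reported margin (0.001 = 1.96 * standard error), which is an assumption about how the interval was built, and then compares the size of the Z statistic with the size of the worst-case shortfall.

```python
target = 12.0                     # labeled fill (oz)
center, margin = 11.996, 0.001    # reported 95% confidence interval

se = margin / 1.96                # standard error implied by the margin (assumed z interval)
z = (center - target) / se        # how many standard errors below 12 oz

worst_case_shortfall = target - (center - margin)      # at the lower confidence limit
shortfall_pct = 100 * worst_case_shortfall / target    # as a percent of the label

print(round(z, 2), round(worst_case_shortfall, 3), round(shortfall_pct, 3))
```

The Z statistic is far into the rejection region, yet the worst-case shortfall is under 0.05% of the labeled fill.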
b. Why do you think you can reject ANY hypothesis if the sample size is large
enough? Might want to use the power curves below for three different sample
sizes.
First of all, the probability that the true mean is exactly equal to 12.0000000000000……
is, logically and statistically, “0”. As the sample size gets larger, the variability of X-Bar
gets smaller (it approaches 0 as the sample size approaches infinity), since we know that σX-Bar =
σX/SQRT(n). We also know that the mean of the X-Bars (μX-Bar) is equal to the mean of X (μX).
So, if the null hypothesis is true, we would expect our X-Bar to be very, very close to 12 (in a
limiting sense, equal to 12).
Now, since we started out saying the probability of the true mean being exactly equal to 12.00000 is
“0”, this implies that the true mean is something other than 12.000… If the true mean is
anything other than 12.000000, then by the logic discussed above we would expect X-Bar
to be very close to this true mean (which is not 12.0000), and as the sample size gets larger,
eventually this X-Bar would fall in the rejection region of your hypothesis test. See following picture
of the sampling distribution of X-Bar as the sample sizes get larger.
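The argument can also be illustrated numerically by computing the probability of rejecting H0 (the power) at several sample sizes. The sketch below is only an illustration: the true mean of 11.996 is borrowed from part a, while the standard deviation of 0.05 oz and the chosen sample sizes are hypothetical values that do not come from the problem.

```python
from statistics import NormalDist

def two_sided_power(true_mean, mu0, sigma, n, alpha=0.05):
    """P(reject H0: mu = mu0) for a two-tailed Z test when the mean is true_mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the true mean, the Z statistic is Normal(shift, 1)
    shift = (true_mean - mu0) / (sigma / n ** 0.5)
    upper_tail = 1 - std_normal.cdf(z_crit - shift)   # P(Z > z_crit)
    lower_tail = std_normal.cdf(-z_crit - shift)      # P(Z < -z_crit)
    return upper_tail + lower_tail

# Hypothetical sigma = 0.05 oz; true mean differs from 12 by only 0.004 oz
powers = {n: two_sided_power(11.996, 12.0, 0.05, n) for n in (50, 500, 5000, 50000)}
for n, power in powers.items():
    print(n, round(power, 4))
```

Under these assumptions the power climbs toward 1 as n grows, even though the true difference stays fixed at 0.004 oz. When the true mean equals 12 exactly, the same function returns α no matter how large n is.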
5. An extensive campaign at the BigLots company encouraged employees to carpool in order to
win the “Green Award” given out by the mayor. In order for the mayor to determine if BigLots
should win this award, she decided to send her son, Leroy Studyhard, out to the parking lot and
randomly sample 100 cars parked in the employee lot [the employee parking lot has over 10,000
autos parked in it]. Leroy leaves a survey on each windshield offering a cash award if the
employee fills out the survey and drops it off at the exit gate. The survey asked the following
question “Do you have any employees riding with you who do not live in your house?” If the
answer is “yes”, then that car is involved in carpooling. Of the 100 cars surveyed, 20 responded
yes and 80 responded no.
a. Estimate the proportion of BigLots autos in the parking lot that are involved in
carpooling.
p-hat = 20/100 = 0.2
b. Provide a 95% confidence interval estimate for the proportion involved in
carpooling.
p-hat + Z/2 SQRT [(p-hat)*(1 – p-hat) / n ] = .2 + 1.96*SQRT[(.2)*(.8)/100] = .2 + .078 =
[0.122 , .278]
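The same interval can be computed from the raw survey counts. This is a sketch of the large-sample (normal approximation) interval used above; `proportion_interval` is a hypothetical helper name.

```python
import math

def proportion_interval(successes: int, n: int, z: float = 1.96):
    """Point estimate and large-sample confidence interval for a proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin

p_hat, low, high = proportion_interval(successes=20, n=100)
print(p_hat, round(low, 3), round(high, 3))
```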
c. After not winning the award, the CEO of BigLots called the mayor and complained
because he believed that at least 40% of his employees are involved in carpooling,
which is much larger than the 30% carpooling at the winning company. How
should the mayor explain why they did not receive the Green Award?
Based on the survey results for your company, we are 95% confident that the percent of your
employees’ cars involved in carpooling is between 12.2% and 27.8%. As you can see, even the
upper limit (27.8%) is below the winning company’s 30%, and it is well below the 40% you believe.
d. After the explanation, the CEO believes that the sample size taken by the mayor
was simply too small to get an accurate estimate of the proportion of carpoolers in
his company, and he insists that next year the mayor take a much larger sample.
If the mayor does this [larger sample], how would you expect the width of
the confidence interval to change [stay the same, get wider, get tighter]? Would
you expect the results to indicate that the true proportion of carpoolers is
approximately 20% or 40%?
As the sample size gets larger, I would expect the confidence interval to get tighter, and based on
previous results I would expect it to still be centered around some value between 12.2% and
27.8% (last year’s confidence interval). Hence, I have no reason to expect this new interval to be
centered around 40%.
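How much tighter the interval gets can be sketched by holding p-hat at 0.2 and varying n. Both the fixed p-hat and the sample sizes below are hypothetical choices for illustration; next year's sample would produce its own p-hat.

```python
import math

def proportion_margin(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of the large-sample confidence interval for a proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical sample sizes, with p-hat assumed to stay at 0.2
margins = {n: proportion_margin(0.2, n) for n in (100, 400, 1600)}
for n, margin in margins.items():
    print(n, round(margin, 4))
```

Because n sits under a square root, quadrupling the sample size only halves the width of the interval.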