Math 109
Lab #5: Confidence Intervals Considered
Spring 02011
Due date and time: Friday, March 25, 02011, 8:10:00 AM EDT.
The purpose of this lab is to take a careful look at confidence intervals. We will
generate 200 random samples and examine how the length of the confidence interval
depends on the sample size.
As you are well aware, when the sample size is large, and regardless of the
distribution of the underlying population, the sample mean x̄ has a distribution that is
approximately normal. Hence, for a random sample of size n ≥ 30 from any distribution
with mean μ and standard deviation σ, we have
\[ P\left( \bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}} \right) \approx 0.95 \]
that is, the interval indicated above has a 95% probability of containing the true population
mean μ. Additionally, if n is large enough, the population standard deviation σ can be
approximated by the sample standard deviation s. It follows that the probability that the
random interval
\[ \left( \bar{x} - 1.96\,\frac{s}{\sqrt{n}},\; \bar{x} + 1.96\,\frac{s}{\sqrt{n}} \right) \]
contains μ will be approximately .95.
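As a rough illustration of the random interval described above, the following Python sketch computes the endpoints x̄ ± 1.96·s/√n from a single sample. It is not part of the MINITAB workflow; the function name `confidence_interval_95` and the sample readings are made up for illustration.

```python
import math
import statistics

Z_95 = 1.96  # standard normal critical value for 95% confidence

def confidence_interval_95(sample):
    """Return (lower, upper) for the interval mean ± 1.96 * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    margin = Z_95 * s / math.sqrt(n)
    return mean - margin, mean + margin

# Hypothetical mercury readings in ppm, invented for this example only.
readings = [9.1, 10.4, 11.2, 8.7, 10.9, 9.8, 10.1, 12.3, 9.5, 10.0]
lower, upper = confidence_interval_95(readings)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The interval is always centered on the sample mean, so only its center and its width change from one sample to the next.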
You will use MINITAB today to generate 100 random samples at each of two sample
sizes, n = 200 and n = 10, and to compute the lower and upper endpoints of the 95% CI
for every sample.
0. Log in to the network using your username and network password, and launch
MINITAB.
1. Open the Report Pad and type your name at the top.
2. Activate the Session Window and make sure that the command prompt MTB>
is visible by clicking on Enable Commands under the Editor menu.
3. Type Base=xyz, where xyz is your favorite three-digit number, and press Enter.
Consider the following problem:
According to a University of Florida researcher, the average level of mercury
uptake in wading birds is normally distributed with a mean of 10 parts per
million (ppm) and a population standard deviation of 3 ppm.
You will simulate taking 100 samples of size k2 = 200 from this normal distribution
with mean 10 and SD 3 and find each confidence interval. This process has been
automated for your convenience.
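For readers who want to see the idea outside MINITAB, here is a minimal Python sketch of the simulation just described. The contents of the automated MINITAB script are not shown in this handout, so this is only an assumption about what such a process might do: each interval is taken to be x̄ ± 1.96·s/√n, and the function name `simulate_intervals` and its `seed` parameter are hypothetical.

```python
import math
import random
import statistics

def simulate_intervals(num_samples, n, mu=10.0, sigma=3.0, seed=123):
    """Draw num_samples samples of size n from N(mu, sigma); return (lower, upper) pairs."""
    rng = random.Random(seed)  # seeded so that a run can be repeated exactly
    intervals = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.mean(sample)
        s = statistics.stdev(sample)
        margin = 1.96 * s / math.sqrt(n)  # assumed interval: mean ± 1.96 * s / sqrt(n)
        intervals.append((mean - margin, mean + margin))
    return intervals

intervals = simulate_intervals(num_samples=100, n=200)
first_lower, first_upper = intervals[0]
print(f"First sample: ({first_lower:.3f}, {first_upper:.3f})")
```

Changing `seed` plays the same role as choosing a different three-digit number in step 3: the samples differ, but the procedure is identical.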
4. Type the following lines in the Session Window, pressing Enter after each line:
Erase C1,C2,C3
Let k1=1
Let k2=200
Execute 'W:\Depts\math\mbollman\M109\ConfIntervals.txt' 100
Press Enter. Save your work now to your H: drive as Lab5.mpj. Before going on, do
the following:
5. Record the 95% confidence interval computed from the first sample:
6. Does this confidence interval contain the true population mean μ = 10?
7. Find the length of this confidence interval.
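The length asked for in step 7 is simply the upper endpoint minus the lower endpoint. The Python sketch below shows that subtraction, along with the length one would expect if the interval has the form x̄ ± 1.96·σ/√n (an assumption about the form of the interval). The function names are hypothetical.

```python
import math

def interval_length(lower, upper):
    """Length of a confidence interval given its two endpoints."""
    return upper - lower

def expected_length(sigma, n, z=1.96):
    """Length of the interval mean ± z * sigma / sqrt(n), which is 2 * z * sigma / sqrt(n)."""
    return 2 * z * sigma / math.sqrt(n)

# With sigma = 3 ppm and n = 200, as in this lab:
print(round(expected_length(3, 200), 3))  # about 0.832
```

The length computed from your own first sample will differ slightly from this value because s replaces σ.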
8. Label column C4 “Sample Number” and fill this column with numbers from 1-100
by doing the following: Under the Calc menu, select Make Patterned Data and then
Simple Set of Numbers... Fill out the dialogue box as follows:
Store patterned data in: Sample Number
From first value: 1
To last value: 100
In steps of: 1
Number of times to list each value: 1
Number of times to list the sequence: 1
Click on OK. Save your work now.
9. You will now graph all 100 confidence intervals at once and compute the
proportion of CIs that contain the true population parameter μ = 10. (MINITAB is wonderful
sometimes.) When the sample size is large, this proportion should be close to 95%;
however, when the sample size is small, strange things can happen.
From the Graph menu, select Scatterplot and choose Simple.
Select “Sample Number” for both row 1 and row 2 of the Y’s.
Select “Lower Endpoint” and “Upper Endpoint” respectively, for rows 1 and 2 of the
X’s.
Click on the Multiple Graphs... button, choose Overlaid on the same graph, and
click OK.
Click on the Scale... button, choose the Reference Lines tab, and enter 10 in the
“Show reference lines at X values” box. Click OK.
Click on the Data View... button, choose the Data Display tab, and click on
Symbols and Project lines. Click OK.
Click OK again. The graph should appear, but might not be all that easy to read.
10. Once the graph appears, we will edit it a bit. Double-click in the middle of the
graph to get the Edit Project Lines pop-up window. Make sure that this is the name of the
pop-up window; otherwise, none of what follows here will make sense.
Click on the Attributes tab.
Choose Custom.
Under Lines, select the solid line.
Under Color, choose any dark color that you like. It must be visible against
a white background.
Under Size, choose 1.
Click on the Options tab.
Under Projection Direction, choose Toward Y scale.
Under Base position, choose Custom and enter 10.
Click on OK. A much nicer graph should appear. Double-click on the name of the
graph and rename it something like “n=200 from N(10,3)”. Right-click on the graph and
choose Append graph to report to copy it to your Report Pad. Save your work now.
The graph should have 100 horizontal lines stacked one over the other and a
vertical line at 10. Each horizontal line represents a confidence interval for one of the 100
samples. The black dot on the left represents the lower endpoint and the red dot on the
right represents the upper endpoint. For some samples, the CI contains the true
mean, which is indicated graphically by an interval that crosses the vertical line. For other
samples, the population mean falls outside the confidence interval.
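The layout of this graph can be imitated in plain text. The Python sketch below is only an illustration and not a substitute for the MINITAB graph; it draws one interval per row with an `o` at each endpoint and a `|` at the reference value. The function name `render_interval`, the axis limits, and the two example intervals are all made up.

```python
def render_interval(lower, upper, axis_min=8.0, axis_max=12.0, width=41, ref=10.0):
    """Draw one interval as a text row: 'o' endpoints, '-' between them, '|' at ref."""
    def column(x):
        # Map a value on [axis_min, axis_max] to a character column, clamped to the row.
        pos = round((x - axis_min) / (axis_max - axis_min) * (width - 1))
        return max(0, min(width - 1, pos))

    row = [" "] * width
    left, right = column(lower), column(upper)
    for i in range(left, right + 1):
        row[i] = "-"
    row[column(ref)] = "|"        # reference line at the population mean
    row[left] = row[right] = "o"  # endpoints drawn last so they stay visible
    return "".join(row)

# Hypothetical intervals: the first crosses 10, the second lies entirely to its right.
for lower, upper in [(9.0, 11.0), (10.5, 11.5)]:
    print(render_interval(lower, upper))
```

A row whose dashes pass through the `|` corresponds to an interval that contains the population mean.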
11. Use this graph to determine what percentage of the intervals do not contain
the population mean μ = 10: count the number of intervals with the red upper endpoint to
the left of 10 plus the number of intervals with the black lower endpoint to the right of 10.
12. Now determine what proportion (percentage) of the CIs contain μ = 10. (This
is easy if you have answered #11 correctly.)
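The counting in steps 11 and 12 can also be expressed as a short Python sketch. It assumes the endpoints are available as (lower, upper) pairs; the function name `coverage_summary` and the four example intervals are invented for illustration.

```python
def coverage_summary(intervals, mu=10.0):
    """Count misses on each side of mu and the proportion of intervals that contain mu."""
    too_low = sum(1 for lower, upper in intervals if upper < mu)   # whole interval left of mu
    too_high = sum(1 for lower, upper in intervals if lower > mu)  # whole interval right of mu
    misses = too_low + too_high
    proportion_containing = (len(intervals) - misses) / len(intervals)
    return too_low, too_high, proportion_containing

# Hypothetical endpoints, made up for illustration only.
example = [(9.6, 10.4), (9.1, 9.9), (10.1, 10.9), (9.8, 10.6)]
print(coverage_summary(example))  # (1, 1, 0.5)
```

An interval can miss on only one side at a time, so the two counts never overlap and their sum is the total number of misses.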
13. How does this information compare to the assertion that there is approximately
a 95% chance that the random interval
\[ \left( \bar{x} - 1.96\,\frac{s}{\sqrt{n}},\; \bar{x} + 1.96\,\frac{s}{\sqrt{n}} \right) \]
contains μ = 10?
14. We will now repeat this experiment with a sample size of 10 (10, being less than
30, counts as a “small sample”). Go back to step 4 and type the following commands. The
third command has changed:
Erase C1,C2,C3
Let k1=1
Let k2=10
Execute 'W:\Depts\math\mbollman\M109\ConfIntervals.txt' 100
Note that this will overwrite your earlier data.
Repeat the commands from earlier to construct the graph and answer the following
questions for this experiment as you did for the first experiment:
15. Record the 95% confidence interval computed from the first sample:
16. Does this confidence interval contain the true population mean μ = 10?
17. Find the length of this confidence interval.
18. Use this graph to determine what percentage of the intervals do not contain
the population mean μ = 10: count the number of intervals with the red upper endpoint to
the left of 10 plus the number of intervals with the black lower endpoint to the right of 10.
19. Now determine what proportion (percentage) of the CIs contain μ = 10.
20. How does this information compare to the assertion that there is approximately
a 95% chance that the random interval
\[ \left( \bar{x} - 1.96\,\frac{s}{\sqrt{n}},\; \bar{x} + 1.96\,\frac{s}{\sqrt{n}} \right) \]
contains μ = 10?
21. Describe any difference that you see in the graphs for the two different sample
sizes.
Save your work one more time, exit MINITAB, and email your completed MINITAB
file to me no later than 12:00:00 noon, Friday, March 18, 02011. Answer the following
questions on your own outside the lab:
22. Based on your data, discuss the accuracy of the statement “A 95% confidence
interval contains the true mean of the population.”
23. What can you say about the relationship between the length of a CI and the size
of the sample?
Your report should include answers to all of the questions indicated in this document
and a copy of your graphs from the Report Pad.