Confidence Intervals Act 2
Using Technology to Create a Sampling Distribution for Sample Means
Let’s look again at the Columbian Mild coffee data. Remember that the population mean price for the
Columbian Mild coffee was 134.338 cents per pound. In Activity 1 we wanted to explore how far
random sample means would be from the population value, so we created our own sampling
distribution for sample means. This was a lot of work. Remember, everyone in our class calculated a
few sample means from samples of size 30. They then put the dots on the board for each sample mean
to create the distribution.
Could we do the same thing, but this time use technology? Definitely.
As we saw above, we want to create a sampling distribution from the Columbian Mild coffee data. Go
to www.matt-teachout.org and click on the “Statistics” tab. Now go to “data sets” and open the coffee
data. Copy the column that says “Columbian mild”.
There are several good programs available. We will look at “StatKey” by the Lock family. Go to
the website www.lock5stat.com . Click on the button that says “StatKey”. Look for the tab that says
“sampling distributions mean” and click on it. Click on the edit data tab. Delete the data set currently
there and paste in the Columbian Mild data. Click on “samples of size n” and put in 30. (Remember our
activity yesterday used samples of size 30.) Turn off the button that says “first column is identifier”,
since we have only a single column of data. Now click OK.
You are now ready to create your sampling distribution. It is always best to start slowly by having the
computer take 1 random sample for you. So click on “Generate 1 sample”. Let’s see if we understand
what we are looking at. The computer picked 30 numbers randomly from our list and calculated the
sample mean. The right side at the bottom shows the actual numbers. Then the computer put a dot on
the graph exactly like we did yesterday when we put magnets on the board. Click it a couple more times
and each time note that the computer took a random sample of 30, found the sample mean and put a
dot for the sample mean on the graph. Notice the sample means are different each time. This also
happened yesterday when our class made the distribution.
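What the computer does on each click of “Generate 1 sample” can be sketched in a few lines of Python. This is only an illustration, not StatKey's actual code: the list `population` is a made-up stand-in for the Columbian Mild column (the real prices are not reproduced here), the helper name `one_sample_mean` is hypothetical, and the sketch assumes the 30 values are drawn without replacement.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run can be repeated

# ASSUMPTION: synthetic placeholder prices (cents per pound), NOT the real
# coffee data. Paste the real "Columbian mild" column here instead.
population = [rng.gauss(134.338, 25.0) for _ in range(300)]


def one_sample_mean(values, n, rng):
    """Pick n values at random and return (sample, sample mean)."""
    sample = rng.sample(values, n)  # random sample without replacement
    return sample, statistics.mean(sample)


# Three clicks of "Generate 1 sample": three different dots on the graph.
for click in range(1, 4):
    sample, xbar = one_sample_mean(population, 30, rng)
    print(f"click {click}: {len(sample)} values, sample mean = {xbar:.3f}")
```

Each pass through the loop prints a different sample mean, which mirrors the different dots that appear on the graph.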
Now let’s speed up the process. Click on “Generate 10 samples” a few times. This calculates 10 random
sample means at a time for us. Let’s go faster. Click on “Generate 100 samples” a few times. This
calculates 100 random sample means at a time for us. Notice the computer has already collected more
random samples than our whole class was able to do yesterday. You know what is next. Let’s really
speed this up and get a ton of random samples. Click on “Generate 1000 samples” a few times. This
calculates 1000 random sample means at a time!
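Speeding up is just the same step repeated many times. The sketch below imitates one click each on the 10, 100, and 1000 buttons. As before, `population` is a made-up placeholder for the real coffee prices, `sampling_distribution` is a hypothetical helper name, and sampling without replacement is an assumption.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the run can be repeated

# ASSUMPTION: synthetic placeholder prices, NOT the real coffee data.
population = [rng.gauss(134.338, 25.0) for _ in range(300)]


def sampling_distribution(values, n, num_samples, rng):
    """Return a list of num_samples sample means, each from a sample of size n."""
    return [statistics.mean(rng.sample(values, n)) for _ in range(num_samples)]


# One click each on the 10, 100, and 1000 buttons.
sample_means = []
for batch in (10, 100, 1000):
    sample_means += sampling_distribution(population, 30, batch, rng)
    print(f"after adding {batch}: {len(sample_means)} sample means in total")
```

The running total plays the same role as the sample count that StatKey keeps on screen.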
You have created a sampling distribution of sample means. Use the sampling distribution to answer
some of the same questions we answered yesterday.
1. Notice not all the dots (sample means) are exactly the same as the population mean 134.338 cents
per pound. Again, what does this tell us about the difficulty of taking one random sample and using it to
estimate the population value? (Remember this is called a “point estimate”.)
2. Compare the sampling distribution from StatKey on the Lock website to the sampling distribution
we made by hand yesterday. StatKey kept track of how many random samples you took. How many were
there? As the number of sample means (dots) increases, is it easier or harder to determine the shape?
What is the shape of the distribution?
3. StatKey calculated the mean of all the sample means and put a pointer at the center. Is
the center close to the population mean of 134.338 cents per pound? As the number of random
samples increases, does the mean of the distribution get more or less accurate as an estimate of the
population value? Why do you think that is?
4. In the last activity, we estimated the standard deviation of the distribution. We said that this has a
special name: the “standard error”. Notice that StatKey calculated the standard deviation
for the distribution. What was the standard error for the distribution?
5. Recall that in the last activity we used the distribution on the board to find an approximate standard
error. Go to the upper left corner of StatKey and click where it says “two tails”. Notice that it calculated
the two Columbian Mild coffee prices between which the middle 95% of the sample means fell. Looking at
the total distance between these two numbers, how many standard errors apart are they? Does this agree
with the Empirical Rule?
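The quantities asked about in questions 3 to 5 can also be computed directly from a list of sample means. The sketch below is an illustration under stated assumptions: `population` is a made-up placeholder for the real coffee prices, 5000 sample means is an arbitrary choice, and the standard error is taken as the sample standard deviation of the sample means. The numbers it prints will not match the ones StatKey shows for the real data.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the run can be repeated

# ASSUMPTION: synthetic placeholder prices, NOT the real coffee data.
population = [rng.gauss(134.338, 25.0) for _ in range(300)]

# 5000 sample means, each from a random sample of size 30.
sample_means = [statistics.mean(rng.sample(population, 30)) for _ in range(5000)]

# Question 3: center of the sampling distribution vs. the population mean.
center = statistics.mean(sample_means)
population_mean = statistics.mean(population)

# Question 4: the standard error is the standard deviation of the sample means.
standard_error = statistics.stdev(sample_means)

# Question 5: the "two tails" cutoffs are the 2.5th and 97.5th percentiles.
# quantiles(n=40) returns 39 cut points spaced 2.5% apart.
cuts = statistics.quantiles(sample_means, n=40)
low, high = cuts[0], cuts[-1]
width_in_se = (high - low) / standard_error

print(f"population mean      = {population_mean:.3f}")
print(f"mean of sample means = {center:.3f}")
print(f"standard error       = {standard_error:.3f}")
print(f"middle 95%           = {low:.3f} to {high:.3f}")
print(f"width of middle 95%  = {width_in_se:.2f} standard errors")
```

For a bell-shaped distribution, the Empirical Rule says about 95% of the values lie within 2 standard deviations of the center, so a width near 4 standard errors would agree with it.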