Statistics
MINITAB - Lab 8
Central Limit Theorem
The Central Limit Theorem is one of the most important theorems in statistics. The statement of
the theorem (without proof) is:
Consider a random sample of size n selected from a population (any population)
with mean $\mu$ and standard deviation $\sigma$.
Then, when n is sufficiently large, the sampling distribution of $\bar{x}$ will be
approximately a normal distribution with mean $\mu_{\bar{x}} = \mu$ and standard deviation
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$. The larger the sample size, the better will be the normal approximation
to the sampling distribution of $\bar{x}$.
What this means is that for any population distribution, regardless of shape etc., if we repeatedly
take samples of a large enough size and record the sample means, then the distribution of those
means will approach a normal distribution with a mean the same as the population mean, and a
standard deviation equal to the population standard deviation divided by the square root of the
sample size. We will now look at the sampling distribution of the mean from a skewed distribution
(the exponential).
1. The exponential distribution is an asymmetric continuous probability distribution. The
probability density function for the exponential is:
$f(x) = \frac{1}{\theta} e^{-x/\theta}, \quad x \ge 0$
where the mean = $\theta$ and the standard deviation = $\theta$.
Using MINITAB (release 13.20), generate random variables from an exponential population
with parameter $\theta$ = 20.
Go to CALC > RANDOM DATA > EXPONENTIAL, generate 1000 rows of data and store them in
columns C1-C50, with a mean of 20:
[Figure: the Exponential Distribution dialog box, annotated with: how many rows of data you
want in each column; how many columns of data you want and which columns to store them in;
the mean of the distribution you want to simulate.]
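For readers without MINITAB, the same data set can be sketched in Python. This is only an illustrative sketch and not part of the original lab: it assumes Python's standard random module, an arbitrary seed of 8, and a list of rows called data standing in for columns C1-C50.

```python
import random
import statistics

MEAN = 20        # mean of the exponential population, as in the lab
N_ROWS = 1000    # rows of data per column
N_COLS = 50      # columns C1-C50

rng = random.Random(8)  # arbitrary seed (an assumption) so the run is repeatable

# data[i][j] plays the role of row i+1 of column C(j+1).
# random.expovariate takes the rate, which is 1 / mean.
data = [[rng.expovariate(1 / MEAN) for _ in range(N_COLS)] for _ in range(N_ROWS)]

all_values = [x for row in data for x in row]
print(len(data), "rows x", len(data[0]), "columns")
print("overall mean:", round(statistics.mean(all_values), 2))
```

The overall mean should land close to 20, although the exact value depends on the seed.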
2. Have a look at the shape of the distribution of these random variables and their descriptive
statistics:
Go to STAT > BASIC STATISTICS > DISPLAY DESCRIPTIVE STATISTICS.
Then select column C1 and click on GRAPHS > HISTOGRAM OF DATA, WITH NORMAL
CURVE. This will give the usual descriptive statistics in the session window and also
draw a histogram of the data with a normal curve superimposed on it.
What is the mean of C1? ____________
What is the standard deviation of C1? ____________
Describe the shape of the histogram (could we consider this a normal distribution? Why or why not?)
_____________________________________________________________________
_____________________________________________________________________
Have a quick look at C2, C3 and C4 as well.
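The descriptive statistics for a single column can be approximated along the following lines. This is a hedged sketch rather than MINITAB's own calculation: the list c1 is a simulated stand-in for column C1, the seed is arbitrary, and the skewness measure used here (the average cubed z-score) is one common definition chosen for illustration.

```python
import random
import statistics

rng = random.Random(8)  # arbitrary seed (an assumption)
c1 = [rng.expovariate(1 / 20) for _ in range(1000)]  # stand-in for column C1

mean = statistics.mean(c1)
sd = statistics.stdev(c1)        # sample standard deviation (n - 1 divisor)
median = statistics.median(c1)

# Moment-based skewness: the average cubed z-score (population form).
pop_sd = statistics.pstdev(c1)
skew = sum(((x - mean) / pop_sd) ** 3 for x in c1) / len(c1)

print(f"mean={mean:.2f} sd={sd:.2f} median={median:.2f} skewness={skew:.2f}")
```

A mean well above the median together with a clearly positive skewness is a numerical sign of the long right tail that the histogram shows.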
3. First we will get a sampling distribution of means for a sample size of n=2.
Go to the CALC menu and click on ROW STATISTICS. Select the mean as the statistic of interest
and use columns C1-C2 as the input variables. Store the result in C51 and name column C51 n=2.
Column C51 now contains 1000 means from samples of size n=2 from an exponential population with
parameter $\theta$ = 20. These 1000 means approximate the sampling distribution of the mean for n=2.
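[Figure: the Row Statistics dialog box, annotated with: what statistic you want (the mean);
the input variables (you want the mean of C1 and C2); where to store the result (the next
available column).]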
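The row-statistics step for n=2 can be sketched in Python as follows. The lists c1, c2 and c51 are simulated stand-ins for the MINITAB columns of the same names, and the seed is arbitrary; this is an illustration under those assumptions, not the lab's own procedure.

```python
import random
import statistics

MEAN = 20
rng = random.Random(8)  # arbitrary seed (an assumption)

# Stand-ins for columns C1 and C2: 1000 exponential values each
c1 = [rng.expovariate(1 / MEAN) for _ in range(1000)]
c2 = [rng.expovariate(1 / MEAN) for _ in range(1000)]

# Row statistic: the mean across each row, like column C51 ("n=2")
c51 = [(a + b) / 2 for a, b in zip(c1, c2)]

print("number of means:", len(c51))
print("mean of means:", round(statistics.mean(c51), 2))
print("sd of means:", round(statistics.stdev(c51), 2))
```

Each entry of c51 is the mean of one sample of size 2, so the list as a whole plays the role of the sampling distribution of the mean.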
We want to look at the sampling distribution of the mean for different sample sizes. Repeat this
step three more times using the following:
Input variables    Store result    Name for column
C1-C5              C52             n=5
C1-C15             C53             n=15
C1-C50             C54             n=50
Columns C51-C54 contain sampling distributions of the mean for samples of sizes 2, 5, 15 and 50
respectively from an exponential population with parameter $\theta$ = 20. Get descriptive statistics and
a histogram with a normal curve superimposed on it for each of the columns C51-C54.
Fill in the following table:
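Sample Size    Mean      Median    Skew?     Standard     Expected Standard
                                             Deviation    Deviation by C.L.T.*
n=2            ______    ______    ______    ______       ______
n=5            ______    ______    ______    ______       ______
n=15           ______    ______    ______    ______       ______
n=50           ______    ______    ______    ______       ______
* C.L.T. = Central Limit Theorem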
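The quantities in the table can also be explored by simulation. The sketch below is an illustration only: it assumes simulated data in place of columns C1-C50, an arbitrary seed, and a hypothetical helper named skewness that uses the average cubed z-score. The simulated figures will differ from your MINITAB output because the random numbers differ.

```python
import math
import random
import statistics

MEAN = 20               # the exponential population has mean 20 and sd 20
rng = random.Random(8)  # arbitrary seed (an assumption)
data = [[rng.expovariate(1 / MEAN) for _ in range(50)] for _ in range(1000)]

def skewness(values):
    """Moment-based skewness: the average cubed z-score."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

results = {}
for n in (2, 5, 15, 50):
    # Mean of the first n columns in every row, like C51-C54
    means = [statistics.mean(row[:n]) for row in data]
    results[n] = {
        "mean": statistics.mean(means),
        "median": statistics.median(means),
        "skew": skewness(means),
        "sd": statistics.stdev(means),
        "clt_sd": MEAN / math.sqrt(n),  # sigma / sqrt(n), with sigma = 20
    }

for n, r in results.items():
    print(n, {k: round(v, 2) for k, v in r.items()})
```

Comparing the sd and clt_sd entries for each n shows how closely the simulated standard deviation follows the value the theorem predicts.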
What is the expected mean for each case (by the C.L.T.)?
_______________
What pattern do you see with regard to shape (skew, symmetry, bell shape etc.) as the sample
size increases?
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
What pattern can you see in the standard deviation of the sampling distributions as the sample
size increases? Why might this be important? (NB: the standard deviation of a sampling
distribution of the mean is often called the standard error of the mean.)
______________________________________________________________________________
______________________________________________________________________________
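As a worked illustration of the standard error formula, take the population standard deviation of 20 used in this lab together with two sample sizes, n = 25 and n = 100, that are chosen here purely for illustration and are not part of the lab:

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
\qquad
n = 25:\ \frac{20}{\sqrt{25}} = 4
\qquad
n = 100:\ \frac{20}{\sqrt{100}} = 2
```

In this example, making the sample four times larger halves the standard error.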
REVISION SUMMARY
After this lab you should be able to:
- Understand the Central Limit Theorem
- Simulate data with a specific distribution
- Generate a histogram with the normal curve superimposed on it
- Create a sampling distribution of means
- Work out expected mean/standard deviation by the CLT
END