
CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS

CENTRAL LIMIT THEOREM (SECTION 7.2 OF UNDERSTANDABLE STATISTICS)

The Central Limit Theorem says that if x is a random variable with any distribution having mean µ and standard deviation σ, then the distribution of sample means x̄ based on random samples of size n is such that, for sufficiently large n:

(a) The mean of the x̄ distribution is approximately the same as the mean of the x distribution.
(b) The standard deviation of the x̄ distribution is approximately σ/√n.
(c) The x̄ distribution is approximately a normal distribution.

Furthermore, as the sample size n becomes larger and larger, the approximations mentioned in (a), (b), and (c) become better.

We can use SPSS to demonstrate the Central Limit Theorem. The computer does not prove the theorem. A proof of the Central Limit Theorem requires advanced mathematics and is beyond the scope of an introductory course. However, we can use the computer to gain a better understanding of the theorem.

To demonstrate the Central Limit Theorem, we need a specific x distribution. One of the simplest is the uniform probability distribution.

Copyright © Houghton Mifflin Company. All rights reserved. Part IV: SPSS Guide

The normal distribution is the usual bell-shaped curve, whereas the uniform distribution has a rectangular, box-shaped graph. The two distributions are very different.

The uniform distribution has the property that all subintervals of the same length inside the interval 0 to 9 have the same probability of occurrence, no matter where they are located. This means that the uniform distribution on the interval from 0 to 9 can be represented on the computer by selecting random numbers from 0 to 9. Since all numbers from 0 to 9 are equally likely to be chosen, we say we are dealing with a uniform (equally likely) probability distribution.
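As a rough illustration outside SPSS, the equal-likelihood property described above can be checked with a short Python sketch. It uses Python's standard random module. The number of draws, the seed, the two subintervals, and the helper name fraction_in are assumptions chosen only for this example.

```python
import random

rng = random.Random(7)  # fixed seed (an arbitrary choice) so the run is repeatable

# Draw many real numbers from the uniform distribution on the interval 0 to 9.
draws = [rng.uniform(0, 9) for _ in range(90_000)]

def fraction_in(values, low, high):
    """Return the fraction of values that fall in the interval [low, high)."""
    return sum(low <= v < high for v in values) / len(values)

# Two subintervals of length 1 at different locations.
# Each should hold roughly 1/9 of the draws.
print(fraction_in(draws, 1, 2))
print(fraction_in(draws, 6, 7))
print(1 / 9)
```

The two printed fractions should both be close to 1/9, although the exact values depend on the seed.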
Note that when we say we are selecting random numbers from 0 to 9, we do not just mean whole numbers or integers; we mean real numbers in decimal form, such as 2.413912.

Because the interval from 0 to 9 is 9 units long and because the total area under the probability graph must be 1, the height of the uniform probability graph must be 1/9. The mean of the uniform distribution on the interval from 0 to 9 is the balance point. Looking at the figure, it is fairly clear that the mean is 4.5. Using advanced methods of statistics, it can be shown that for the uniform probability distribution x between 0 and 9,

µ = 4.5 and σ = 3√3/2 ≈ 2.598

The figure shows us that the uniform x distribution and the normal distribution are quite different. However, using the computer we will construct one hundred sample means x̄ from the x distribution using a sample size of n = 40. We will use 100 rows (for the 100 samples) and 40 columns (sample size is 40). We can vary the number of samples as well as the sample size n according to how many rows and columns we use.

We will see that even though the uniform distribution is very different from the normal distribution, the histogram of the sample means is somewhat bell-shaped. We will also see that the mean of the x̄ distribution is close to the predicted mean of 4.5 and that the standard deviation is close to σ/√n = 2.598/√40 ≈ 0.411.

Example

To get familiar with the procedure, let us first work with 100 samples using a sample size of n = 5. Follow these steps, and note that your results will vary.

First, name the first column (variable) x1. Enter a number (any number) in the 100th cell of the first column to define the variable size (that is, the number of samples). Then use Transform > Compute five times (since our sample size is n = 5). Note that Transform > Compute works with one target variable at a time.
Since our sample size is 5, we need to generate random numbers from the uniform distribution in 5 columns (that is, 5 variables). That is why we need to use Transform > Compute five times. Each time we use the formula xi = RV.UNIFORM(0, 9), where i = 1, 2, 3, 4, 5. Note that the Transform > Compute dialog box preserves the numeric expression used most recently. Therefore the expression RV.UNIFORM(0, 9) only needs to be entered once. After that, all you have to do in the Transform > Compute dialog box is change the target variable name, that is, change the value of i.

Displayed below is our fifth use of Transform > Compute with this formula. Here i = 5, so the formula reads x5 = RV.UNIFORM(0, 9).

Technology Guide, Understandable Statistics, 8th Edition

Click on OK. Another hundred random numbers will be generated in the fifth column under the variable name x5. We have now generated 100 random samples of size 5 from the uniform distribution on (0, 9).

Next, let us take the mean of each of the 100 rows (5 columns across) and store the values under the variable name xbar. Use Transform > Compute with the formula xbar = MEAN(x1, x2, x3, x4, x5) as shown below.

Click on OK. The results follow.

Let us now look at the mean and standard deviation of xbar (the sample means) as well as its histogram, using the menu options Analyze > Descriptive Statistics > Frequencies. Uncheck “Display frequency table”, click on “Charts” and select “Histogram”, then click on “Statistics” and select “Mean” and “Std deviation”. Click on OK. The results follow.

Note that the histogram is already quite close to bell-shaped, even though the sample size is only 5.
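The same experiment can be sketched outside SPSS. The following Python example is only an illustration of the procedure under stated assumptions, not the SPSS implementation: random.uniform stands in for RV.UNIFORM, statistics.mean stands in for MEAN, and the function name sample_means and the seed are hypothetical choices made for this sketch.

```python
import math
import random
import statistics

def sample_means(num_samples, n, seed=None):
    """Draw num_samples samples of size n from uniform(0, 9); return each sample's mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        row = [rng.uniform(0, 9) for _ in range(n)]  # one row: x1, ..., xn
        means.append(statistics.mean(row))           # plays the role of xbar
    return means

xbar = sample_means(num_samples=100, n=5, seed=1)  # seed is arbitrary; results vary with it

sigma = 9 / math.sqrt(12)  # standard deviation of uniform(0, 9), about 2.598

print("mean of xbar:         ", statistics.mean(xbar))
print("predicted mean:       ", 4.5)
print("std deviation of xbar:", statistics.stdev(xbar))
print("predicted sigma/sqrt(n):", sigma / math.sqrt(5))
```

With only 100 samples, the observed mean and standard deviation should land near the predicted values without matching them exactly, just as in the SPSS output.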
When the sample size is sufficiently large, the histogram will look more like a normal distribution.

Now let us draw 100 random samples of size 40 from the uniform distribution on the interval from 0 to 9. The steps are the same as above, except that now we need to repeat Transform > Compute 40 times with the formula xi = RV.UNIFORM(0, 9), where i = 1, 2, . . . , 40. After that we compute the sample mean by

xbar = MEAN(x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25, x26, x27, x28, x29, x30, x31, x32, x33, x34, x35, x36, x37, x38, x39, x40).

Do these steps, and the results follow. (Your results will vary.)

Now look at the mean and standard deviation of xbar (the sample means) as well as its histogram, using the menu options Analyze > Descriptive Statistics > Frequencies. Uncheck “Display frequency table”, click on “Charts” and select “Histogram”, then click on “Statistics” and select “Mean” and “Std deviation”. Click on OK. The results follow.

Note that the Mean and Std Dev are very close to the values predicted by the Central Limit Theorem. The histogram for this sample does not appear very similar to a normal distribution. Let’s try another sample. The following are the results.

This histogram looks more like a normal distribution. You will get slightly different results each time you draw 100 samples.

LAB ACTIVITIES FOR CENTRAL LIMIT THEOREM

1. Repeat the experiment of Example 1. That is, draw 100 random samples of size 40 each from the uniform probability distribution between 0 and 9. Then take the mean of each of these samples and put the results under the variable name xbar. Next use Analyze > Descriptive Statistics > Frequencies on xbar.
How do the mean and standard deviation of the distribution of sample means compare to those predicted by the Central Limit Theorem? How does the histogram of the distribution of sample means compare to a normal curve?

2. Next take 100 random samples of size 20 from the uniform probability distribution between 0 and 9. Again put the means under the variable name xbar and then use Analyze > Descriptive Statistics > Frequencies on xbar. How do these results compare to those in problem 1? How do the standard deviations compare?
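For readers who want to check their lab results against the theory, here is a minimal Python sketch that compares sample sizes 20 and 40 side by side. It is an assumption-laden stand-in for the SPSS steps, not a replacement for them: the function name xbar_summary, its parameters, and the seeds are hypothetical, and random.uniform is used in place of RV.UNIFORM.

```python
import math
import random
import statistics

SIGMA = 9 / math.sqrt(12)  # standard deviation of uniform(0, 9), about 2.598

def xbar_summary(n, num_samples=100, seed=None):
    """Return (mean, standard deviation) of num_samples sample means of size n."""
    rng = random.Random(seed)
    means = [
        statistics.mean([rng.uniform(0, 9) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return statistics.mean(means), statistics.stdev(means)

for n in (20, 40):
    observed_mean, observed_sd = xbar_summary(n, seed=n)  # arbitrary seeds
    predicted_sd = SIGMA / math.sqrt(n)                   # Central Limit Theorem prediction
    print(f"n = {n}: mean = {observed_mean:.3f} (predicted 4.500), "
          f"sd = {observed_sd:.3f} (predicted {predicted_sd:.3f})")
```

The predicted standard deviation shrinks as n grows (about 0.581 for n = 20 versus about 0.411 for n = 40), which is the pattern problem 2 asks you to look for in your own SPSS output.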
