Ismor Fischer, 5/29/2012

5.2 Formal Statement and Examples

Sampling Distribution of a Normal Variable

Given a random variable \(X\), suppose that the population distribution of \(X\) is known to be normal, with mean \(\mu\) and variance \(\sigma^2\), that is, \(X \sim N(\mu, \sigma)\). (In this notation, the second argument is the standard deviation.) Then, for any sample size \(n\), it follows that the sampling distribution of the sample mean \(\bar{X}\) is normal, with mean \(\mu\) and variance \(\sigma^2/n\), that is,

\[ \bar{X} \sim N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right). \]

Comments:

The quantity \(\sigma/\sqrt{n}\) is called the "standard error of the mean," denoted SEM, or more simply, s.e.

The corresponding Z-score transformation formula is

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1). \]

Example: Suppose that the ages \(X\) of a certain population are normally distributed, with mean \(\mu = 27.0\) years and standard deviation \(\sigma = 12.0\) years, i.e., \(X \sim N(27, 12)\).

The probability that the age of a single randomly selected individual is less than 30 years is

\[ P(X < 30) = P\!\left(Z < \frac{30 - 27}{12}\right) = P(Z < 0.25) = 0.5987. \]

Now consider all random samples of size \(n = 36\) taken from this population. By the above, their mean ages \(\bar{X}\) are also normally distributed, with mean \(\mu = 27\) yrs as before, but with standard error \(\sigma/\sqrt{n} = 12 \text{ yrs}/\sqrt{36} = 2\) yrs. That is, \(\bar{X} \sim N(27, 2)\).

The probability that the mean age of a single sample of \(n = 36\) randomly selected individuals is less than 30 years is

\[ P(\bar{X} < 30) = P\!\left(Z < \frac{30 - 27}{2}\right) = P(Z < 1.5) = 0.9332. \]

In this population, the probability that the average age of 36 random people is under 30 years old is much greater than the probability that the age of one random person is under 30 years old.

Exercise: Compare the two probabilities of being under 24 years old.

Exercise: Compare the two probabilities of being between 24 and 30 years old.

If \(X \sim N(\mu, \sigma)\) approximately, then \(\bar{X} \sim N(\mu, \sigma/\sqrt{n})\) approximately. (The larger the value of \(n\), the better the approximation.) In fact, more is true...

IMPORTANT GENERALIZATION: The Central Limit Theorem

Given any random variable \(X\), discrete or continuous, with finite mean \(\mu\) and finite variance \(\sigma^2\).
Then, regardless of the shape of the population distribution of \(X\), as the sample size \(n\) gets larger, the sampling distribution of \(\bar{X}\) becomes increasingly close to normal, with mean \(\mu\) and variance \(\sigma^2/n\), that is, \(\bar{X} \sim N(\mu, \sigma/\sqrt{n})\) approximately. More formally,

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \to N(0, 1) \quad \text{as } n \to \infty. \]

Intuitively perhaps, there is less variation between different sample mean values than there is between different population values. This formal result states that, under very general conditions, the sampling variability is usually much smaller than the population variability, and it also gives the precise form of the "limiting distribution" of the statistic.

What if the population standard deviation \(\sigma\) is unknown? Then it can be replaced by the sample standard deviation \(s\), provided \(n\) is large. That is, \(\bar{X} \sim N(\mu, s/\sqrt{n})\) approximately, if \(n \ge 30\) or so, for "most" distributions (... but see example below). Since the value \(s/\sqrt{n}\) is a sample-based estimate of the true standard error s.e., it is commonly denoted \(\widehat{\text{s.e.}}\)

Because the mean \(\mu_{\bar{X}}\) of the sampling distribution is equal to the mean \(\mu_X\) of the population distribution, i.e., \(E[\bar{X}] = \mu_X\), we say that \(\bar{X}\) is an unbiased estimator of \(\mu_X\). In other words, the sample mean is an unbiased estimator of the population mean. A biased sample estimator is a statistic \(\hat{\theta}\) whose "expected value" either consistently overestimates or consistently underestimates its intended population parameter \(\theta\).

Many other versions of the CLT exist, related to so-called Laws of Large Numbers.

Example: Consider a(n infinite) population of paper notes, 50% of which are blank, 30% are ten-dollar bills, and the remaining 20% are twenty-dollar bills.

Experiment 1: Randomly select a single note from the population.
Random variable: \(X\) = dollar amount obtained

    x      f(x) = P(X = x)
    0      .5
    10     .3
    20     .2

Mean \(\mu_X = E[X] = (.5)(0) + (.3)(10) + (.2)(20) = \$7.00\)

Variance \(\sigma_X^2 = E[(X - \mu_X)^2] = (.5)(-7)^2 + (.3)(3)^2 + (.2)(13)^2 = 61\)

Standard deviation \(\sigma_X = \sqrt{61} = \$7.81\)

Experiment 2: Each of n = 2 people randomly selects a note, and they split the winnings.

Random variable: \(\bar{X}\) = dollar sample mean amount obtained per person

    x̄     (x1, x2)     Probability
    0      (0, 0)       .5 × .5 = 0.25
    5      (0, 10)      .5 × .3 = 0.15
    10     (0, 20)      .5 × .2 = 0.10
    5      (10, 0)      .3 × .5 = 0.15
    10     (10, 10)     .3 × .3 = 0.09
    15     (10, 20)     .3 × .2 = 0.06
    10     (20, 0)      .2 × .5 = 0.10
    15     (20, 10)     .2 × .3 = 0.06
    20     (20, 20)     .2 × .2 = 0.04

    x̄     f(x̄) = P(X̄ = x̄)
    0      .25
    5      .30 = .15 + .15
    10     .29 = .10 + .09 + .10
    15     .12 = .06 + .06
    20     .04

Mean \(\mu_{\bar{X}} = (.25)(0) + (.30)(5) + (.29)(10) + (.12)(15) + (.04)(20) = \$7.00 = \mu_X\) !!

Variance \(\sigma_{\bar{X}}^2 = (.25)(-7)^2 + (.30)(-2)^2 + (.29)(3)^2 + (.12)(8)^2 + (.04)(13)^2 = 30.5 = \dfrac{61}{2} = \dfrac{\sigma_X^2}{n}\) !!

Standard deviation \(\sigma_{\bar{X}} = \$5.52 = \dfrac{\sigma_X}{\sqrt{n}}\) !!

Experiment 3: Each of n = 3 people randomly selects a note, and they split the winnings.
Random variable: \(\bar{X}\) = dollar sample mean amount obtained per person

    x̄       (x1, x2, x3)     Probability
    0        (0, 0, 0)        .5 × .5 × .5 = 0.125
    3.33     (0, 0, 10)       .5 × .5 × .3 = 0.075
    6.67     (0, 0, 20)       .5 × .5 × .2 = 0.050
    3.33     (0, 10, 0)       .5 × .3 × .5 = 0.075
    6.67     (0, 10, 10)      .5 × .3 × .3 = 0.045
    10       (0, 10, 20)      .5 × .3 × .2 = 0.030
    6.67     (0, 20, 0)       .5 × .2 × .5 = 0.050
    10       (0, 20, 10)      .5 × .2 × .3 = 0.030
    13.33    (0, 20, 20)      .5 × .2 × .2 = 0.020
    3.33     (10, 0, 0)       .3 × .5 × .5 = 0.075
    6.67     (10, 0, 10)      .3 × .5 × .3 = 0.045
    10       (10, 0, 20)      .3 × .5 × .2 = 0.030
    6.67     (10, 10, 0)      .3 × .3 × .5 = 0.045
    10       (10, 10, 10)     .3 × .3 × .3 = 0.027
    13.33    (10, 10, 20)     .3 × .3 × .2 = 0.018
    10       (10, 20, 0)      .3 × .2 × .5 = 0.030
    13.33    (10, 20, 10)     .3 × .2 × .3 = 0.018
    16.67    (10, 20, 20)     .3 × .2 × .2 = 0.012
    6.67     (20, 0, 0)       .2 × .5 × .5 = 0.050
    10       (20, 0, 10)      .2 × .5 × .3 = 0.030
    13.33    (20, 0, 20)      .2 × .5 × .2 = 0.020
    10       (20, 10, 0)      .2 × .3 × .5 = 0.030
    13.33    (20, 10, 10)     .2 × .3 × .3 = 0.018
    16.67    (20, 10, 20)     .2 × .3 × .2 = 0.012
    13.33    (20, 20, 0)      .2 × .2 × .5 = 0.020
    16.67    (20, 20, 10)     .2 × .2 × .3 = 0.012
    20       (20, 20, 20)     .2 × .2 × .2 = 0.008

    x̄       f(x̄) = P(X̄ = x̄)
    0.00     .125
    3.33     .225 = .075 + .075 + .075
    6.67     .285 = .050 + .045 + .050 + .045 + .045 + .050
    10.00    .207 = .030 + .030 + .030 + .027 + .030 + .030 + .030
    13.33    .114 = .020 + .018 + .018 + .020 + .018 + .020
    16.67    .036 = .012 + .012 + .012
    20.00    .008

Mean \(\mu_{\bar{X}}\) = (Exercise) = \(\$7.00 = \mu_X\) !!!

Variance \(\sigma_{\bar{X}}^2\) = (Exercise) = \(20.333 = \dfrac{61}{3} = \dfrac{\sigma_X^2}{n}\) !!!

Standard deviation \(\sigma_{\bar{X}} = \$4.51 = \dfrac{\sigma_X}{\sqrt{n}}\) !!!

The tendency toward a normal distribution becomes stronger as the sample size \(n\) gets larger, despite the mild skew in the original population values. This is an empirical consequence of the Central Limit Theorem. For most such distributions, \(n \ge 30\) or so is sufficient for a reasonable normal approximation to the sampling distribution.
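The three experiments above can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original notes: it lists every possible draw of n notes, accumulates the exact probability of each sample mean using rational arithmetic, and reports the mean, variance, and standard deviation of the resulting sampling distribution. The names POPULATION, sampling_distribution, and mean_and_variance are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Population from the example: note value (dollars) -> probability.
POPULATION = {0: Fraction(1, 2), 10: Fraction(3, 10), 20: Fraction(1, 5)}


def sampling_distribution(n):
    """Exact distribution of the sample mean of n independent draws."""
    dist = {}
    for outcome in product(POPULATION, repeat=n):
        # Independent draws: multiply the individual probabilities.
        prob = Fraction(1)
        for value in outcome:
            prob *= POPULATION[value]
        mean = Fraction(sum(outcome), n)
        dist[mean] = dist.get(mean, Fraction(0)) + prob
    return dist


def mean_and_variance(dist):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var


for n in (1, 2, 3):
    dist = sampling_distribution(n)
    mu, var = mean_and_variance(dist)
    print(f"n = {n}: mean = {float(mu):.2f}, "
          f"variance = {float(var):.3f}, sd = {float(var) ** 0.5:.2f}")
    for x in sorted(dist):
        print(f"    {float(x):6.2f}   {float(dist[x]):.3f}")
```

Fractions are used instead of floats so that the identities mean = 7 and variance = 61/n hold exactly rather than up to rounding. The enumeration grows as 3^n, so this approach is only practical for small n.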
In fact, if the distribution is symmetric, then convergence to a bell curve can often be seen for much lower \(n\), say only \(n = 5\) or 6. Recall also, from the first result in this section, that if the population is normally distributed (with known \(\sigma\)), then so will be the sampling distribution, for any \(n\).

BUT BEWARE....

However, if the population distribution of \(X\) is highly skewed, then the sampling distribution of \(\bar{X}\) can be highly skewed as well (especially if \(n\) is not very large), i.e., relying on the CLT can be risky! (Although sometimes using a transformation, such as \(\ln(X)\) or \(\sqrt{X}\), can restore a bell shape to the values. Later…)

Example: The two graphs on the bottom of this page are simulated sampling distributions for the highly skewed population shown below. Both are density histograms based on the means of 1000 random samples; the first corresponds to samples of size n = 30, the second to n = 100. Note that skew is still present!

[Figure: Population Distribution]
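A small simulation can reproduce the flavor of this example. The sketch below is an illustration only: the notes do not say which skewed population was used, so a lognormal population (with parameters 0 and 1.5) is assumed here as a stand-in, and the helper names sample_means, skewness, and draw_skewed are hypothetical. It draws 1000 samples of size n = 30 and of size n = 100, and reports the skewness of the resulting sample means.

```python
import random
import statistics


def sample_means(draw, n, reps, rng):
    """Means of `reps` independent samples of size n; each value comes from draw(rng)."""
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]


def skewness(values):
    """Moment coefficient of skewness: 0 for symmetric data, > 0 for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def draw_skewed(rng):
    # ASSUMPTION: lognormal stand-in for the (unspecified) highly skewed population.
    return rng.lognormvariate(0.0, 1.5)


rng = random.Random(2012)  # fixed seed so that reruns give the same numbers
means_30 = sample_means(draw_skewed, 30, 1000, rng)
means_100 = sample_means(draw_skewed, 100, 1000, rng)

print(f"skewness of sample means, n = 30:  {skewness(means_30):.2f}")
print(f"skewness of sample means, n = 100: {skewness(means_100):.2f}")
```

A skewness near 0 would indicate a roughly symmetric, bell-like sampling distribution; clearly positive values indicate that a right skew persists. Replacing draw_skewed with a symmetric population makes a useful comparison.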
