MATH 2560 C F03
Elementary Statistics I
LECTURE 22: The Sampling Distribution of a Sample Mean.

1 Outline
⇒ the mean and standard deviation of x̄
⇒ the sampling distribution of x̄
⇒ the Central Limit Theorem

2 The Mean and Standard Deviation of x̄
⇒ The sample mean x̄ from a sample or an experiment is an estimate of the population mean µ;
⇒ The sampling distribution of x̄ is determined by the design, the sample size n, and the population distribution.
⇒ Probability model for measurements on each individual in an SRS:
1) take an SRS of size n;
2) measure a variable X on each individual in the sample;
3) data: observations on n random variables X_1, X_2, ..., X_n (here X_i is a measurement on one individual selected at random from the population);
4) the X_i are independent random variables with the same distribution;
⇒ The sample mean of an SRS of size n is:
\[ \bar{x} = \frac{1}{n}(X_1 + X_2 + \dots + X_n). \]
⇒ µ is the mean of each observation X_i, since the population has mean µ;
⇒ Calculation of the mean and standard deviation of x̄ (use the addition rules for means and variances):
\[ \mu_{\bar{x}} = \frac{1}{n}(\mu_{X_1} + \mu_{X_2} + \dots + \mu_{X_n}) = \frac{1}{n}(\mu + \mu + \dots + \mu) = \mu; \]
⇒ The mean of x̄ is the same as the mean of the population.
⇒ Because the observations are independent, their variances add:
\[ \sigma_{\bar{x}}^2 = \left(\frac{1}{n}\right)^2 (\sigma_{X_1}^2 + \sigma_{X_2}^2 + \dots + \sigma_{X_n}^2) = \left(\frac{1}{n}\right)^2 (n\sigma^2) = \frac{\sigma^2}{n}; \]
Standard deviation:
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}. \]

Mean and Standard Deviation of a Sample Mean
Let x̄ be the mean of an SRS of size n from a population having mean µ and standard deviation σ. The mean and standard deviation of x̄ are
\[ \mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}. \]

Example 5.14. Heights of women. The height X (in inches) of a single randomly chosen young woman has the N(64.5, 2.5) distribution. Take an SRS of n = 100 young women. Then
\[ \mu_{\bar{x}} = \mu = 64.5, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{100}} = 0.25 \text{ inch}. \]

3 The Sampling Distribution of x̄
⇒ Let us describe the shape of the probability distribution of a sample mean x̄. It depends on the shape of the population distribution;
⇒ Important remark: if the population distribution is normal, then so is the distribution of the sample mean.
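To make the remark above concrete, the values in Example 5.14 can be checked numerically. The following is a minimal Python sketch that assumes only the standard library; the function names, the seed, and the number of repetitions are illustrative choices, not anything prescribed by the lecture. It evaluates µx̄ = µ and σx̄ = σ/√n directly and then compares them with the mean and standard deviation of many simulated sample means.

```python
import math
import random
import statistics


def sample_mean_params(mu, sigma, n):
    """Return (mean, standard deviation) of the sample mean of an SRS of size n."""
    return mu, sigma / math.sqrt(n)


def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma) and return their means."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]


# Example 5.14: heights of young women, N(64.5, 2.5), SRS of n = 100.
mu_xbar, sigma_xbar = sample_mean_params(64.5, 2.5, 100)
means = simulate_sample_means(64.5, 2.5, n=100, reps=2000)

print("theory:    ", mu_xbar, sigma_xbar)
print("simulation:", statistics.fmean(means), statistics.stdev(means))
```

The simulated values should land close to 64.5 and 0.25, although they vary slightly with the seed and the number of repetitions.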
Sampling Distribution of a Sample Mean
If a population has the N(µ, σ) distribution, then the sample mean x̄ of n independent observations has the N(µ, σ/√n) distribution.

⇒ General fact: any linear combination of independent normal random variables is also normally distributed: if X and Y are independent normal random variables and a and b are any fixed numbers, then aX + bY is also normally distributed, and the same holds for any number of normal variables.

Example 5.17. Golf Tournament. Tom's score X has the N(110, 10) distribution; George's score Y has the N(100, 8) distribution. They play independently. What is the probability that Tom will score lower than George? We want P(X < Y). Let us first calculate the mean and standard deviation of X − Y:
\[ \mu_{X-Y} = \mu_X - \mu_Y = 110 - 100 = 10; \]
\[ \sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 = 10^2 + 8^2 = 164; \]
\[ \sigma_{X-Y} = \sqrt{164} = 12.8. \]
Thus, X − Y has the N(10, 12.8) distribution. Then (see also Figure 5.9),
\[ P(X < Y) = P(X - Y < 0) = P\left(\frac{(X - Y) - 10}{12.8} < \frac{0 - 10}{12.8}\right) = P(Z < -0.78) = 0.2177. \]
Tom will have the lower score in about one of every five matches.

4 The Central Limit Theorem
⇒ What happens when the population distribution is not normal?
⇒ As the sample size increases, the distribution of x̄ gets closer to a normal distribution;
⇒ This is true no matter what shape the population distribution has, as long as the population has a finite standard deviation σ.
⇒ This famous fact of probability theory is called the Central Limit Theorem.

Central Limit Theorem
Draw an SRS of size n from any population with mean µ and finite standard deviation σ. When n is large, the sampling distribution of the sample mean x̄ is approximately normal:
\[ \bar{x} \text{ is approximately } N\left(\mu, \frac{\sigma}{\sqrt{n}}\right). \]
⇒ Any variable that is a sum of many small influences will have approximately a normal distribution.
⇒ More observations are required if the shape of the population distribution is far from normal.
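The behaviour described by the Central Limit Theorem can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it uses Python's standard library, takes an exponential population with µ = 1 and σ = 1 as the nonnormal population, and uses sample skewness (0 for a symmetric distribution) as a rough measure of how far the distribution of x̄ is from normal. The helper names, the seed, the sample sizes 1, 5, and 30, and the number of repetitions are hypothetical choices.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def exponential_sample_means(n, reps, rng):
    """Means of `reps` samples of size n from an exponential population with mean 1."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


rng = random.Random(2560)  # fixed seed (arbitrary) so the run is repeatable
results = {}
for n in (1, 5, 30):
    means = exponential_sample_means(n, reps=4000, rng=rng)
    # Store (mean, standard deviation, skewness) of the simulated sample means.
    results[n] = (
        statistics.fmean(means),
        statistics.pstdev(means),
        skewness(means),
    )
    print(n, results[n], "sigma/sqrt(n) =", 1 / math.sqrt(n))
```

In a run of this sketch, the mean of x̄ should stay near 1, its standard deviation should track σ/√n, and the skewness should shrink toward 0 as n grows, which is the approach to normality that the theorem describes.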
Figure 5.10 shows the central limit theorem in action for a very nonnormal population (see also the Statistical Applets on our web site).
⇒ The central limit theorem allows us to use normal distribution calculations for sample means from many observations even when the population distribution is not normal.

Example 5.19. Preventive maintenance on an air-conditioning unit. X is the time needed to perform the maintenance; it has the exponential distribution (see Figure 5.10 (a)) with µ = 1 hour and σ = 1 hour. A company operates 70 of these units. What is the probability that their average maintenance time exceeds 50 minutes? We want P(x̄ > 0.83), since 50 minutes is 0.83 hour. We note that
\[ \frac{\sigma}{\sqrt{70}} = \frac{1}{\sqrt{70}} = 0.12 \text{ hour}. \]
Thus, by the central limit theorem, x̄ has approximately the N(1, 0.12) distribution. Figure 5.11 shows this normal curve (solid) and also the actual density curve of x̄ (dashed). A normal distribution calculation gives the desired probability as 0.9222. The exact probability is the area under the dashed density curve in the figure. It is 0.9294. The error of the normal approximation is about 0.007.

Another important example of the central limit theorem is the normal approximation for sample proportions and counts.
Figure 5.12 summarizes the facts about the sampling distribution of x̄ in a way that reminds us of the big idea of a sampling distribution.

5 Summary
1. The sample mean x̄ of an SRS of size n drawn from a large population with mean µ and standard deviation σ has a sampling distribution with mean and standard deviation
\[ \mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}. \]
The sample mean x̄ is therefore an unbiased estimator of the population mean µ and is less variable than a single observation.
2. Linear combinations of independent normal random variables have normal distributions. In particular, if the population has a normal distribution, so does x̄.
3. The central limit theorem states that for large n the sampling distribution of x̄ is approximately N(µ, σ/√n) for any population with mean µ and finite standard deviation σ.
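As a numerical companion to Example 5.19, the sketch below recomputes both the normal approximation and the exact probability. It is a minimal sketch under stated assumptions: it uses only Python's standard library, it keeps the lecture's rounded values (0.83 hour, 0.12 hour, and a z-score rounded to two decimals as for a table lookup), and it obtains the exact value from the standard fact that a sum of n independent exponential times with mean 1 has a gamma distribution with integer shape n. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_exponential_mean_exceeds(threshold, n):
    """Exact P(x-bar > threshold) for n independent exponential times with mean 1.

    The total of the n times has a gamma distribution with integer shape n,
    and P(total > t) equals the sum of exp(-t) * t**i / i! for i = 0..n-1.
    """
    t = threshold * n
    term = math.exp(-t)  # the i = 0 term
    total = term
    for i in range(1, n):
        term *= t / i  # turns the (i-1)th term into the ith term
        total += term
    return total


n = 70
threshold = 0.83                            # 50 minutes in hours, rounded as in the lecture
sigma_xbar = round(1 / math.sqrt(n), 2)     # rounded to two decimals as in the lecture
z = round((threshold - 1) / sigma_xbar, 2)  # rounded as for a normal table lookup

normal_approx = 1 - normal_cdf(z)
exact = prob_exponential_mean_exceeds(threshold, n)

print("normal approximation:", round(normal_approx, 4))
print("exact (gamma):       ", round(exact, 4))
```

The two printed values can be compared with the 0.9222 and 0.9294 quoted in the example. Using unrounded inputs (50/60 hour and 1/√70) would shift both numbers somewhat, so the agreement with the lecture depends on following its rounding.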