Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Chapter 7 Sampling Distributions McGraw-Hill/Irwin Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter Outline 7.1 The Sampling Distribution of the Sample Mean 7.2 The Sampling Distribution of the Sample Proportion 7-2 7.1 Sampling Distribution of the Sample Mean The sampling distribution of the sample mean is the probability distribution of the population of the sample means obtainable from all possible samples of size n from a population of size N 7-3 Example 7.1: The Risk Analysis Case Population of the percent returns from six grand prizes In order, the values of prizes (in $,000) are 10, 20, 30, 40, 50 and 60 Label each prize A, B, C, …, F in order of increasing value The mean prize is $35,000 with a standard deviation of $17,078 7-4 Example 7.1: The Risk Analysis Case Continued Any one prize of these prizes is as likely to be picked as any other of the six Uniform distribution with N = 6 Each stock has a probability of being picked of 1/6 = 0.1667 7-5 Example 7.1: The Risk Analysis Case #3 Figure 7.2 (a) 7-6 Example 7.1: The Risk Analysis Case #4 Now, select all possible samples of size n = 2 from this population of size N = 6 Select all possible pairs of prizes How to select? Sample randomly Sample without replacement Sample without regard to order 7-7 Example 7.1: The Risk Analysis Case #5 Result: There are 15 possible samples of size n = 2 Calculate the sample mean of each and every sample 7-8 Example 7.1: The Risk Analysis Case #6 Sample Mean 15 20 25 30 35 40 45 50 55 Frequency Probability 1 1/15 1 1/15 2 2/15 2 2/15 3 3/15 2 2/15 2 2/15 1 1/15 1 1/15 Table 7.2 (b) Figure 7.2 (b) 7-9 Observations Although the population of N = 6 grand prizes has a uniform distribution, … … the histogram of n = 15 sample mean prizes: Seems to be centered over the sample mean return of 35,000, and Appears to be bell-shaped and less spread out than the histogram of individual returns 7-10 General Conclusions 1. If the population of individual items is normal, then the population of all sample means is also normal 2. Even if the population of individual items is not normal, there are circumstances when the population of all sample means is normal (Central Limit Theorem) 7-11 General Conclusions Continued 3. The mean of all possible sample means equals the population mean 4. The standard deviation sx of all sample means is less than the standard deviation of the population Each sample mean averages out the high and the low measurements, and so are closer to the population mean than many of the individual population measurements 7-12 The Sampling Distribution of x 1. If the population being sampled is normal, then so is the sampling distribution of the sample mean, x 2. The mean sx of the sampling distribution of x is mx = m That is, the mean of all possible sample means is the same as the population mean 7-13 The Sampling Distribution of x #2 3. The variance s2x of the sampling distribution of x is 2 s 2 x s n That is, the variance of the sampling distribution of x is Directly proportional to the variance of the population Inversely proportional to the sample size 7-14 The Sampling Distribution of x #3 The standard deviation sx of the sampling distribution of x is s sx n That is, the standard deviation of the sampling distribution of x is Directly proportional to the standard deviation of the population Inversely proportional to the square root of the sample size 7-15 Notes The formulas for s2x and sx hold if the sampled population is infinite The formulas hold approximately if the sampled population is finite x is the point estimate of m, and the larger the sample size n, the more accurate the estimate Because as n increases, sx decreases as 1/√n 7-16 Effect of Sample Size Figure 7.3 7-17 Example 7.2: Car Mileage Statistical Inference x = 31.56 mpg for a sample of size n = 50 and σ = 0.8 If the population mean µ is exactly 31, what is the probability of observing a sample mean that is greater than or equal to 31.56? 7-18 Example 7.2: Car Mileage Statistical Inference #2 Calculate the probability of observing a sample mean that is greater than or equal to 31.6 mpg if µ = 31 mpg Want P(x > 31.5531 if µ = 31) Use s as the point estimate for s so that sx s 0.8 0.113 n 50 7-19 Example 7.2: Car Mileage Statistical Inference #3 31.56 m x Px 31.56 if m 31 P z sx 31.56 31 P z 0.113 P z 4.96 7-20 Example 7.2: Car Mileage Statistical Inference #4 z = 4.96 > 3.99, so P(z ≥ 4.84) < 0.00003 If m = 31 mpg, fewer than 3 in 100,000 samples have a mean at least as large as observed Have either of the following explanations: If m = 31 mpg, very unlucky in picking this sample Not unlucky, m is not 31 mpg, but is really larger Difficult to believe such a small chance would occur, so conclude that there is strong evidence that m does not equal 31 mpg Also, m is, in fact, actually larger than 31 mpg 7-21 Central Limit Theorem Now consider a non-normal population Still have: mx = m and sx = s/n Exactly correct if infinite population Approximately correct if population size N finite but much larger than sample size n But if population is non-normal, what is the shape of the sampling distribution of x? The sampling distribution is approximately normal if the sample is large enough, even if the population is non-normal (Central Limit Theorem) 7-22 The Central Limit Theorem #2 No matter what is the probability distribution that describes the population, if the sample size n is large enough, then the population of all possible sample means is approximately normal with mean mx = m and standard deviation sx = s/n Further, the larger the sample size n, the closer the sampling distribution of the sample mean is to being normal In other words, the larger n, the better the approximation 7-23 Effect of Sample Size Figure 7.5 7-24 How Large? How large is “large enough?” If the sample size is at least 30, then for most sampled populations, the sampling distribution of sample means is approximately normal Here, if n is at least 30, it will be assumed that the sampling distribution of x is approximately normal If the population is normal, then the sampling distribution of x is normal regardless of the sample size 7-25 Unbiased Estimates A sample statistic is an unbiased point estimate of a population parameter if the mean of all possible values of the sample statistic equals the population parameter x is an unbiased estimate of m because mx = m In general, the sample mean is always an unbiased estimate of m The sample median is often an unbiased estimate of m but not always 7-26 Unbiased Estimates Continued The sample variance s2 is an unbiased estimate of s2 That is why s2 has a divisor of n – 1 and not n However, s is not an unbiased estimate of s Even so, the usual practice is to use s as an estimate of s 7-27 Minimum Variance Estimates Want the sample statistic to have a small standard deviation All values of the sample statistic should be clustered around the population parameter Then, the statistic from any sample should be close to the population parameter 7-28 Minimum Variance Estimates Continued Given a choice between unbiased estimates, choose one with smallest standard deviation The sample mean and the sample median are both unbiased estimates of m The sampling distribution of sample means generally has a smaller standard deviation than that of sample medians 7-29 7.2 The Sampling Distribution of the Sample Proportion The probability distribution of all possible sample proportions is the sampling distribution of the sample proportion If a random sample of size n is taken from a population then the sampling distribution of the sample proportion is Approximately normal, if n is large Has a mean that equals ρ Has standard deviation p1 p s p̂ n 7-30