Sampling Distribution of a Sample Proportion
Lecture 26, Sections 8.1 – 8.2
Mon, Nov 1, 2004

Parameters and Statistics
The purpose of a statistic is to estimate a population parameter.
A sample mean is used to estimate the population mean.
A sample proportion is used to estimate the population proportion.

Example
Example 8.1, p. 464.
The Census Bureau surveys 3000 employees and asks them, “Have the job skills demanded by your job increased over the past few years?”
57% replied, “Yes.”
That is a sample proportion. What is the population proportion?

Some Questions
What if the survey were repeated?
Would the survey results again be 57%?
Would the sample proportion be close to 57%?
Might it be 99%? Might it be 1%?
We hope that the sample proportion is close to the population proportion.
How close can we expect it to be?
Would it be worth it to collect a larger sample?
If the sample were larger, would we expect the sample proportion (probably) to be closer to the population proportion? How much closer?

The Sampling Distribution of a Statistic
Sampling Distribution of a Statistic – The distribution of values of the statistic over all possible samples of size n from a given population.

The Sample Proportion
Let p be the population proportion. Then p is a fixed value (for a given population).
Let p^ (“p-hat”) be the sample proportion. Then p^ is a random variable; it takes on a new value every time a sample is collected.
The sampling distribution of p^ is the probability distribution of all the possible values of p^.

Example
Suppose that this class is 1/3 freshmen.
Suppose that we take a sample of 2 students, selected with replacement.
Find the sampling distribution of p^.

Example
Tree diagram (F = freshman, N = not a freshman). On each draw, F has probability 1/3 and N has probability 2/3.

First draw    Second draw    Outcome
F (1/3)       F (1/3)        P(FF) = 1/9
F (1/3)       N (2/3)        P(FN) = 2/9
N (2/3)       F (1/3)        P(NF) = 2/9
N (2/3)       N (2/3)        P(NN) = 4/9

Example
Let X be the number of freshmen in the sample.
The probability distribution of X is

x    P(X = x)
0    4/9
1    4/9
2    1/9

Example
Let p^ be the proportion of freshmen in the sample.
The sampling distribution of p^ is

x      P(p^ = x)
0      4/9
1/2    4/9
1      1/9

Simulating Sampling with the TI-83
Use the TI-83 to simulate sampling 2 people (with replacement) from a population in which 1/3 are freshmen.
Use the function randBin(n, p).
n = sample size (n = 2).
p = proportion of freshmen (p = 1/3).
The function will report the number of freshmen in the sample.

Example
Now do it for a sample of size n = 30. Use a seed of 63.
We find that randBin(30, 1/3) = 9. This represents a sample proportion of 9 out of 30, or 9/30 = 0.30.
If we press ENTER several more times, we get 11, 9, 14, 6, and 16.
These represent sample proportions of 11/30, 9/30, 14/30, 6/30, and 16/30.

Example
The expression randBin(n, p, k) will compute randBin(n, p) k times and put the results in a list.
With a seed of 94, randBin(30, 1/3, 100) produces the list {11, 14, 8, 10, 10, 5, 13, 9, 9, …}.
If we divide each value by 30, we get the sample proportions {11/30, 14/30, 8/30, 10/30, 10/30, …}.

The Histogram
[Figure: histogram of the 100 sample proportions p^ for n = 30; horizontal axis p^ from 0.1 to 0.6, vertical axis ticks at 5, 10, 15.]

Larger Sample Size
Now we will select samples of size 120 instead of size 30.
Set the seed to 216.
randBin(120, 1/3, 100) produces {44, 33, 43, 41, 38, 44, 46, 43, …}.
The sample proportions are {44/120, 33/120, 43/120, 41/120, 38/120, …}.

The Histogram
[Figure: histogram of the 100 sample proportions p^ for n = 120; horizontal axis p^ from 0.1 to 0.6, vertical axis ticks at 5, 10, 15, 20, 25.]

Observations and Conclusions
Observation #1: The values of p^ are clustered around p.
Conclusion #1: p^ is probably close to p.
Observation #2: As the sample size increases, the clustering is tighter.
Conclusion #2a: Larger samples give more reliable estimates.
Conclusion #2b: For large sample sizes, we can make very good estimates of the value of p.

More Observations and Conclusions
Observation #3: The distribution of p^ appears to be approximately normal.
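
The two-student example and the randBin simulations above can also be sketched in Python. This is only an illustration, not part of the lecture: the helper names exact_phat_distribution and rand_bin are hypothetical, and Python's random number generator differs from the TI-83's, so the seed used here is arbitrary and will not reproduce the lists shown on the slides.

```python
import random
import statistics
from fractions import Fraction
from math import comb


def exact_phat_distribution(n, p):
    """Exact sampling distribution of p-hat for n draws with replacement.

    Returns a dict mapping each possible value of p-hat to its probability.
    """
    return {
        Fraction(x, n): comb(n, x) * p**x * (1 - p) ** (n - x)
        for x in range(n + 1)
    }


def rand_bin(n, p, k, rng):
    """Hypothetical stand-in for randBin(n, p, k).

    Returns k simulated counts of "successes" (freshmen) in n draws.
    """
    return [sum(rng.random() < p for _ in range(n)) for _ in range(k)]


# Exact distribution for the sample of 2 students with p = 1/3.
exact = exact_phat_distribution(2, Fraction(1, 3))
for value, prob in sorted(exact.items()):
    print(f"P(p-hat = {value}) = {prob}")

# Simulated sample proportions for n = 30 and n = 120, 100 samples each.
rng = random.Random(63)  # arbitrary seed, unrelated to the TI-83 seeds
props_30 = [x / 30 for x in rand_bin(30, 1 / 3, 100, rng)]
props_120 = [x / 120 for x in rand_bin(120, 1 / 3, 100, rng)]

for label, props in (("n = 30", props_30), ("n = 120", props_120)):
    center = statistics.mean(props)
    spread = statistics.pstdev(props)
    print(f"{label}: mean of p-hat = {center:.3f}, sd of p-hat = {spread:.3f}")
```

Both simulated means should fall near p = 1/3, and the spread for n = 120 should be noticeably smaller than for n = 30, which is what Observations #1 and #2 describe.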
The Histogram
[Figure: histogram of the sample proportions p^; horizontal axis p^ from 0.1 to 0.6, vertical axis ticks at 5, 10, 15.]

One More Conclusion
Conclusion #3: We can use the normal distribution to calculate just how close to p we can expect p^ to be.
However, we must know μ and σ for the distribution of p^.

The Sampling Distribution of p^
It turns out that the sampling distribution of p^ is approximately normal with the following parameters.
Mean of p^: μ = p
Variance of p^: σ² = p(1 – p)/n
Standard deviation of p^: σ = √(p(1 – p)/n)
The approximation to the normal distribution is excellent if np ≥ 5 and n(1 – p) ≥ 5.

Example
Suppose 51% of the population plan to vote for candidate X, i.e., p = 0.51.
What is the probability that an exit survey of 1000 people would show candidate X with less than 45% support, i.e., p^ < 0.45?

Example
First, describe the sampling distribution of p^ if the sample size is n = 1000.
p^ is approximately normal.
Check: np = 510 ≥ 5 and n(1 – p) = 490 ≥ 5.
Mean of p^: μ = 0.51.
Standard deviation of p^: σ = √((0.51)(0.49)/1000) = 0.01581.

Example
The z-score of 0.45 is z = (0.45 – 0.51)/0.01581 = -3.795.
P(p^ < 0.45) = P(Z < -3.795) = 0.00007385 (not likely!)
That is why surveys work (within the margin of error), and that is why people are saying that the exit polls failed yesterday.
We have computed the p-value of 0.45 under the null hypothesis that p = 0.51!

Let’s Do It!
Let’s do it! 8.5, p. 484 – Probabilities about the Proportion of People with Type B Blood.
Let’s do it! 8.6, p. 485 – Estimating the Proportion of Patients with Side Effects.
Let’s do it! 8.7, p. 487 – Testing Hypotheses about Smoking Habits.
See Example 8.5, p. 486 – Testing Hypotheses about the Proportion of Cracked Bottles.
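
For readers who want to check the exit-survey calculation without a TI-83, here is a minimal Python sketch of the same normal approximation. The function names normal_cdf and phat_normal_params are hypothetical, chosen only for this illustration; the standard normal probability is computed from the complementary error function in Python's math module.

```python
import math


def normal_cdf(z):
    """Standard normal probability P(Z < z), via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def phat_normal_params(p, n):
    """Mean and standard deviation of the approximate normal distribution of p-hat."""
    return p, math.sqrt(p * (1 - p) / n)


p, n = 0.51, 1000

# Rule of thumb from the lecture: np >= 5 and n(1 - p) >= 5.
assert n * p >= 5 and n * (1 - p) >= 5

mean, sd = phat_normal_params(p, n)
z = (0.45 - mean) / sd   # z-score of a sample proportion of 0.45
prob = normal_cdf(z)     # P(p-hat < 0.45) under the normal approximation

print(f"mean = {mean}, sd = {sd:.5f}")
print(f"z = {z:.3f}")
print(f"P(p-hat < 0.45) is approximately {prob:.3e}")
```

Because the code carries the standard deviation at full precision rather than rounding it to 0.01581, the final probability may differ from the slide's 0.00007385 in the last digits.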