Download 7.2 Estimating a Population Proportion

7.2 – Estimating a Population Proportion: Objectives: 1. 2. 3. 4. 5. Find Critical Values Interpret Confidence Intervals Find the Margin of Error Construct confidence intervals for population proportions Determine the sample size. Chapter Overview: This chapter presents the beginning of inferential statistics. • The two major activities of inferential statistics are: 1. To use sample data to estimate values of population parameters. 2. To test hypotheses or claims made about population parameters. • We introduce methods for estimating values of these important population parameters: proportions, means, and variances. • We also present methods for determining sample sizes necessary to estimate those parameters. Section Overview: In this section we present methods for using a sample proportion to estimate the value of a population proportion. • The sample proportion is the best point estimate of the population proportion. • We can use a sample proportion to construct a confidence interval to estimate the true value of a population proportion, and we should know how to interpret such confidence intervals. • We should know how to find the sample size necessary to estimate a population proportion. Point Estimate: A point estimate is a single value (or point) used to approximate a population parameter. The sample proportion p̂ is the best point estimate of the population proportion p. Example: In a Pew Research Center poll, 70% of 1501 randomly selected adults in the United States believe in global warming, so the sample proportion is p̂ = 0.70. Find the best point estimate of the proportion of all adults in the United States who believe in global warming. Solution: Because the sample proportion is the best point estimate of the population proportion, we conclude that the best point estimate of p is 0.70. When using the sample results to estimate the percentage of all adults in the United States who believe in global warming, the best estimate is 70%. A point estimate fails to indicate just how good that estimate really is. Because of this flaw, another type of estimate has been developed called a confidence interval. Confidence Interval: A confidence interval (or interval estimate) is a range (or an interval) of values used to estimate the true value of a population parameter. A confidence interval is sometimes abbreviated as CI. A confidence interval is associated with a confidence level, such as 0.95 (or 95%). The confidence level gives us the success rate of the procedure used to construct the confidence interval. Confidence Level: The confidence level is the probability 1 – α (often expressed as the equivalent percentage value) that the confidence interval actually does contain the population parameter, assuming that the estimation process is repeated a large number of times. (The confidence level is also called degree of confidence, or the confidence coefficient.) For instance, a 0.95 (or 95%) confidence level, α = 0.05. For a 0.99 (or 99%) confidence level, α = 0.01. Most common choices for confidence levels are 90% ( α = 0.10), 95 % ( α = 0.05), or 99% ( α = 0.01). Interpreting a Confidence Interval: We must be careful to interpret confidence intervals correctly. There is a correct interpretation and many different and creative incorrect interpretations of the confidence interval 0.677 < p < 0.723. Correct Interpretation: “We are 95% confident that the interval from 0.677 to 0.723 actually does contain the true value of the population proportion p.” This means that if we were to select many different samples of size 1501 and construct the corresponding confidence intervals, 95% of them would actually contain the value of the population proportion p. Note: In this correct interpretation, the level of 95% refers to the success rate of the process being used to estimate the proportion. Incorrect Interpretation: There is a 95% chance that the true value of p will fall between 0.677 and 0.723. It would also be incorrect to say that “95% of sample proportions fall between 0.677 and 0.723. Caution: Confidence intervals can be used informally to compare different data sets, but the overlapping of confidence intervals should not be used for making formal and final conclusions about equality of proportions. Critical Values: A standard z score can be used to distinguish between sample statistics that are likely to occur and those that are unlikely to occur. Such a z score is called a critical value. The number zα/2 is a critical value that is a z score with the property that it separates an area of α/2 in the right tail of the standard normal distribution. Critical values are based on the following observations: 1. Under certain conditions, the sampling distribution of sample proportions can be approximated by a normal distribution. 2. A z score associated with a sample proportion has a probability of α/2 of falling in the right tail. 3. The z score separating the right-tail region is commonly denoted by zα/2 and is referred to as a critical value because it is on the borderline separating z scores from sample proportions that are likely to occur from those that are unlikely to occur. Notation for Critical Value: 1. The critical value zα/2 is the positive z value that is at the vertical boundary separating an area of α/2 in the right tail of the standard normal distribution. 2. The value of –zα/2 is at the vertical boundary for the area of α/2 in the left tail. 3. The subscript α/2 is simply a reminder that the z score separates an area of α/2 in the right tail of the standard normal distribution. Example: Find the critical value zα/2 corresponding to a 95% confidence level. Solution: A 95% confidence level corresponds to α = 0.05 and α / 2 = 0.025 . Therefore, by noting that the cumulative area to the left must be 1 − α = 1 − 0.025 = 0.975 and using table A-2, we find the z-score to be 1.96. For a 95% confidence level, the critical value is therefore zα Example: Find the critical value zα 2 2 = 1.96 . that corresponds to a 98% confidence level. Solution: A 98% confidence level corresponds to α = 0.02 and α / 2 = 0.01 . Therefore, by noting that the cumulative area to the left must be 1 − α 2 = 1 − 0.01 = 0.99 and using table A-2, we find the z-score to be 2.33. Example: Find the critical value zα 2 that corresponds to a 91% confidence level. Solution: A 91% confidence level corresponds to α = 0.09 and α / 2 = 0.045 . Therefore, by noting that the cumulative area to the left must be 1 − α 2 = 1 − 0.045 = 0.955 and using table A-2, we find the z-score to be 1.70. Margin of Error: When data from a simple random sample are used to estimate a population proportion p, the margin of error, denoted by E, is the maximum likely difference (with probability 1 – α, such as 0.95) between the observed proportion and the true value of the population proportion p. The margin of error E is also called the maximum error of the estimate and can be found by multiplying the critical value and the standard deviation of the sample proportions: E = zα 2 pˆ qˆ n For a 95% confidence level, α = 0.05 , so there is a probability of 0.05 that the sample proportion will be in error by more than E. Example: Assume that a sample is used to estimate a population proportion p. Find the margin of error E that corresponds to the given statistics and confidence level. 95% confidence level; n = 320, p̂ = 0.60. Solution: The z-score corresponding to a 95% confidence level is 1.96. E = zα 2 pˆ qˆ n (0.6)(0.4) 320 E = 1.96(0.027 ) E = 0.054 E = 1.96 Example: Assume that a sample is used to estimate a population proportion p. Find the margin of error E that corresponds to the given statistics and confidence level. 90% confidence level; n = 480, p̂ = 0.7. Solution: The z-score corresponding to a 90% confidence level is 1.645. E = zα 2 pˆ qˆ n (0.7)(0.3) 480 E = 1.645(0.021) E = 1.645 E = 0.034 Constructing a Confidence Interval for a Population Proportion p: We may construct a confidence interval given the following: p = population proportion p̂ = sample proportion qˆ = 1 − pˆ n = number of sample values E = margin of error zα/2 = z score separating an area of α/2 in the right tail of the standard normal distribution Then the confidence interval may be written as: pˆ − E < p < pˆ + E where E = zα 2 pˆ qˆ n The confidence interval is often expressed in the following equivalent forms: pˆ ± E or ( pˆ − E , pˆ + E ) Requirements: 1. The sample is a simple random sample. 2. The conditions for the binomial distribution are satisfied: there is a fixed number of trials, the trials are independent, there are two categories of outcomes, and the probabilities remain constant for each trial. 3. There are at least 5 successes and 5 failures. Procedure for Constructing Confidence Intervals: 1. 2. 3. 4. Verify that requirements are met. Find the critical value for the confidence level. Find the margin of error. Determine confidence interval. Example: Use the given degree of confidence and sample data to construct a confidence interval for the population proportion p. n = 51, p̂ = .27; 95% confidence level. Solution: Assume requirements are met. The z-score corresponding to a 95% confidence level is 1.96. The margin of error is: E = zα 2 pˆ qˆ n (0.27)(0.73) 51 E = 1.96(0.062) E = 0.122 E = 1.96 The confidence interval pˆ − E < p < pˆ + E is: 0.27 − 0.122 < p < 0.27 + 0.122 0.148 < p < 0.392 14.8% < p < 39.2% Example: Use the given degree of confidence and sample data to construct a confidence interval for the population proportion p. n = 110, p̂ = .55; 88% confidence level. Solution: Assume requirements are met. The z-score corresponding to an 88% confidence level is 1.56. The margin of error is: E = zα 2 pˆ qˆ n (0.55)(0.45) 110 E = 1.56(0.047) E = 0.074 E = 1.56 The confidence interval pˆ − E < p < pˆ + E is: 0.55 − 0.074 < p < 0.55 + 0.074 0.476 < p < 0.624 47.6% < p < 62.4% Example: Use the given degree of confidence and sample data to construct a confidence interval for the population proportion p. n = 125, p̂ = .72; 90% confidence level. Solution: Assume requirements are met. The z-score corresponding to a 90% confidence level is 1.645. The margin of error is: E = zα 2 pˆ qˆ n (0.72)(0.28) 125 E = 1.645(0.002) E = 1.645 E = 0.003 The confidence interval pˆ − E < p < pˆ + E is: 0.72 − 0.003 < p < 0.72 + 0.003 0.717 < p < 0.723 71.7% < p < 72.3% Example: A survey of 865 voters in one state reveals that 408 favor approval of an issue before the legislature. Construct the 95% confidence interval for the true proportion of all voters in the state who favor approval. Solution: Assume requirements are met. The z-score corresponding to a 95% confidence level is 1.96. The margin of error is: E = zα 2 pˆ qˆ n (0.472)(0.528) 865 E = 1.96(0.017) E = 0.033 E = 1.96 The confidence interval pˆ − E < p < pˆ + E is: 0.472 − 0.033 < p < 0.472 + 0.033 0.439 < p < 0.505 43.9% < p < 50.5% A survey found that 47.2% of all voters approve of an issue. We can be 95% confident that the actual percentage will fall between 43.9% and 50.5%. Example: Of 367 randomly selected medical students, 30 said that they planned to work in a rural community. Find a 95% confidence interval for the true proportion of all medical students who plan to work in a rural community. Solution: Assume requirements are met. The z-score corresponding to a 95% confidence level is 1.96. The margin of error is: E = zα 2 pˆ qˆ n (0.082)(0.918) 367 E = 1.96(0.014) E = 1.96 E = 0.028 The confidence interval pˆ − E < p < pˆ + E is: 0.082 − 0.028 < p < 0.082 + 0.028 0.054 < p < 0.110 5.4% < p < 11.0% 8.2% of medical students surveyed planned to work in a rural community. We can be 95% confident that the actual number is between 5.4% and 11%. Analyzing Polls: When analyzing polls consider the following: 1. The sample should be a simple random sample, not an inappropriate sample (such as a voluntary response sample). 2. The confidence level should be provided. (It is often 95%, but media reports often neglect to identify it.) 3. The sample size should be provided. (It is usually provided by the media, but not always.) 4. Except for relatively rare cases, the quality of the poll results depends on the sampling method and the size of the sample, but the size of the population is usually not a factor. Caution: Never follow the common misconception that poll results are unreliable if the sample size is a small percentage of the population size. The population size is usually not a factor in determining the reliability of a poll. Determining Sample Size: Suppose we want to collect sample data in order to estimate some population proportion. The question is how many sample items must be obtained? If we solve the formula for margin of error, we obtain the following formula: [ z ] n= 2 α 2 pˆ qˆ E2 This formula requires p̂ as an estimate of the population proportion. I f no such estimate exists, we replace p̂ with 0.5 and q̂ with 0.5 to obtain: [ z ] 0.25 n= 2 α 2 E2 These formulas show that the sample size does not depend on the size of the population, but rather, on the desired confidence level and margin of error and sometimes p̂ . Round Off Rule for Sample Size: If the computed sample size is not a whole number, then round the value up to the next whole number. Example: Use the given data to find the minimum sample size required to estimate the population proportion. Margin of error: 0.05; confidence level: 99%; from a prior study, p̂ is estimated at 0.15. Solution: [z ] pˆ qˆ n= 2 α 2 E2 2 [ 2.575] (0.15)(0.85) n= (0.05) 2 0.845 n= 0.0025 n = 338 Example: Use the given data to find the minimum sample size required to estimate the population proportion. Margin of error: 0.024; confidence level: 95%; p̂ is unknown Solution: [z ] (0.25) n= 2 α 2 E2 2 [ 1.96] (0.25) n= (0.024) 2 0.9604 n= 0.000576 n = 1667 Example: A newspaper article about the results of a poll states: "In theory, the results of such a poll, in 99 cases out of 100 should differ by no more than 5 percentage points in either direction from what would have been obtained by interviewing all voters in the United States." Find the sample size suggested by this statement. Solution: Margin of error: 0.05; confidence level: 99% p̂ is unknown. [z ] (0.25) n= 2 α 2 E2 2 [ 2.575] (0.25) n= (0.05) 2 1.6576 n= 0.0025 n = 663 Finding the Point Estimate and E from a Confidence Interval: Sometimes we want to better understand a confidence interval that might have been obtained from a journal article or generated by computer software. If we already know the confidence interval limits, then the sample point estimate and E can be found as follows: Point estimate of p: pˆ = (upper confidence limit) + (lower confidence limit) 2 Margin of Error: E= (upper confidence limit) - (lower confidence limit) 2 Example: Find the point estimate of the proportion of people who wear hearing aids if, in a random sample of 304 people, 20 people had hearing aids. The confidence interval is 5.9% to 7.5% Solution: Using the formula (upper confidence limit) + (lower confidence limit) 2 0.075 + 0.059 pˆ = 2 pˆ = 0.067 pˆ = The point estimate would be 6.7%. Also, remember that a point estimate is also obtained by simple finding the proportion 20/304 = 0.0657.

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download 7.2 Estimating a Population Proportion