Chapter Seven: Confidence Intervals and Sample Size

A point estimate is:

The best point estimate of the population mean $\mu$ is the sample mean $\bar{X}$.

Three Properties of a Good Estimator
1. Unbiased
2. Consistent
3. Relatively efficient

Why do statisticians prefer interval estimates to point estimates?

The confidence level of an interval estimate is:

A confidence interval is:

Example: Consider the pennies.mtw data set. Generate a random sample of 50 pennies. What is your point estimate of the mean $\mu$? What is the sampling error?

Let’s construct a confidence interval for the mean. We have a large sample size, so the distribution of the sample means is:

By the Empirical Rule, approximately ____ of the sample means will lie within ____ standard deviations of the true population mean.

What is the 95% confidence interval for your sample? (In Minitab, go to Stat > Basic Statistics > 1-Sample Z. The Options button allows you to specify the level of confidence desired.)

Derivation of Confidence Interval Formula:

Formula for Confidence Interval of the Mean for a Specific n:

$$\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

where $z_{\alpha/2}$ is the z value with an area of $\alpha/2$ to its right. For a 90% confidence interval, $z_{\alpha/2} = 1.65$; for a 95% confidence interval, $z_{\alpha/2} = 1.96$; for a 99% confidence interval, $z_{\alpha/2} = 2.58$.

The term $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is called the ____.

Rounding Rule: When using raw data, round off to one more decimal place than that found in the data. When using a sample mean and standard deviation, use the same number of decimal places as given in the mean.

Example: Find $z_{\alpha/2}$ for the 98% confidence interval.

Warning!! Here we are assuming:

Example: A large airline wants to estimate its average number of unoccupied seats per flight over the past year. The records of 225 flights are randomly selected and the number of unoccupied seats is noted for each of the sampled flights. The sample mean is 11.6 seats and the sample standard deviation is 4.1 seats.
(a) What is the best point estimate for $\mu$, the average number of unoccupied seats per flight during the past year?
(b) Estimate $\mu$ using a 90% confidence interval.
(c) Interpret your result in (b). We express our confidence as “We can be 90% confident that ____.”

Example: A random sample of 100 observations from a normally distributed population possesses a mean equal to 83.2 and a standard deviation equal to 6.4. Find a 99% confidence interval for $\mu$.

This procedure is used when:

Sample Size

Sometimes we may want to determine the sample size necessary to make an accurate estimate. To do so, we use the following formula:

Example: A university president would like to estimate the average age of students at the university. How large a sample is necessary if she wishes to be 95% confident that the estimate is accurate to within 1 year? A previous study determined that the standard deviation of the ages is 3.5 years.

Note: When finding sample size, the size of the population is irrelevant when the population is large or infinite or when sampling is done with replacement. If $\sigma$ is unknown, one can estimate it using $s$ from a previous study.

We have been working under the assumption that $\sigma$ is known and the variable is normally distributed, or that $\sigma$ is unknown and $n \geq 30$. What happens if $n < 30$ and $\sigma$ is not known?

When $\sigma$ is unknown and the sample size is less than 30, we use the t-distribution.

Characteristics of the t-Distribution

The t-distribution is similar to the standard normal distribution in that:
1. It is bell-shaped.
2. It is symmetric about the mean.
3. The mean, median, and mode are all equal to 0 and are located at the center of the distribution.
4. The curve never touches the x-axis.

The t-distribution differs from the standard normal distribution in the following ways:
1. The variance is greater than 1.
2. The t-distribution is a family of curves based on the degrees of freedom, which is related to the sample size.
3. As the sample size increases, the t-distribution approaches the standard normal distribution.

What are the degrees of freedom?

Formula for Confidence Interval of the Mean when $\sigma$ is unknown and $n < 30$:

$$\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$

where $t_{\alpha/2}$ is the t value with an area of $\alpha/2$ to its right. The degrees of freedom are $n - 1$.

Example: Find the $t_{\alpha/2}$ value for a 95% confidence interval when the sample size is 20.

Example: A manufacturer of printers wishes to estimate the mean number of characters printed before a printhead fails. The printer manufacturer tests 15 printheads and finds the mean number of characters is 1.24 and standard deviation 0.19.
(a) Form a 99% confidence interval for the mean number of characters printed before the printhead fails. Interpret the result.
(b) What assumption is required for the interval to be valid? What happens if the population distribution departs greatly from normality?

Section 7.3: Confidence Intervals and Sample Size for Proportions

Notation: Let $p$ be the population proportion and $\hat{p}$ be the sample proportion. Then

$$\hat{p} = \frac{X}{n} \quad \text{and} \quad \hat{q} = \frac{n - X}{n} = 1 - \hat{p}$$

where $X$ is the number of sample units possessing the characteristic of interest and $n$ is the sample size.

Example: In a recent survey of 150 households, 54 had central air. Find $\hat{p}$ and $\hat{q}$, where $\hat{p}$ is the proportion that has central air.

For proportions, the confidence interval is given by:

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

Example: A survey conducted by Gallup of 1404 respondents found 323 students paid for their education using student loans. Find the 90% confidence interval of the true proportion of students who paid for their education using student loans.

To determine the sample size for proportions, use the formula

$$n = \hat{p}\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2$$

where $E$ is the level of accuracy desired.

Example: A researcher wants to estimate, with 95% confidence, the proportion of people who own a home computer.
A previous study indicated that 45% of those surveyed owned a computer at home. The researcher wants to be accurate within 2% of the true proportion. Find the minimum sample size required for the study.
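
The interval and sample-size formulas stated in these notes can be checked numerically. The following is a minimal Python sketch, not part of the original handout. The function names (`z_critical`, `mean_interval`, `proportion_interval`, `sample_size_proportion`) are hypothetical, and rounding the sample size up to the next whole number is an assumption based on the usual convention. The sketch computes exact normal quantiles, so results may differ slightly from those obtained with the rounded table values 1.65, 1.96, and 2.58. For a t interval, pass the table value of the t critical value with n − 1 degrees of freedom as `critical`.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2}, the z value with area alpha/2 to its right."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def mean_interval(xbar, sd, n, critical):
    """Return (lower, upper) for xbar -/+ critical * sd / sqrt(n).

    `critical` is z_{alpha/2} for the z interval, or a table value of
    t_{alpha/2} with n - 1 degrees of freedom for the t interval.
    """
    margin = critical * sd / math.sqrt(n)
    return xbar - margin, xbar + margin


def proportion_interval(x, n, confidence):
    """Return (lower, upper) for p-hat -/+ z * sqrt(p-hat * q-hat / n)."""
    p_hat = x / n
    q_hat = 1.0 - p_hat
    margin = z_critical(confidence) * math.sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


def sample_size_proportion(p_hat, error, confidence):
    """Return n = p-hat * q-hat * (z / E)^2.

    Assumption: the result is rounded up to the next whole number.
    """
    z = z_critical(confidence)
    return math.ceil(p_hat * (1.0 - p_hat) * (z / error) ** 2)


if __name__ == "__main__":
    # Airline example: n = 225, sample mean 11.6, s = 4.1, 90% confidence.
    print(mean_interval(11.6, 4.1, 225, z_critical(0.90)))
    # Student loan survey: 323 of 1404 respondents, 90% confidence.
    print(proportion_interval(323, 1404, 0.90))
    # Home computer study: p-hat = 0.45, E = 0.02, 95% confidence.
    print(sample_size_proportion(0.45, 0.02, 0.95))
```

Passing the critical value as an argument to `mean_interval` lets the same function serve both the z and t cases; only the source of the critical value changes.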