Survey

Transcript

Unit 6: Confidence Intervals
Elementary Statistics Larson Farber
Ch. 6 Larson/Farber

Definition Review

Big Picture: Confidence Intervals
A group of college students collected data on the speed of vehicles traveling through a construction zone on a state highway, where the posted speed was 25 mph. Assume that the standard deviation for the recorded speed of the vehicles is 3.5 mph. The recorded speeds of 14 randomly selected vehicles are as follows:
20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40
• Assuming speeds are approximately normally distributed, what do you think the true mean speed of drivers in this construction zone is?
\(\bar{x} = 32.14\), \(s_x = 6.18\), \(n = 14\), \(\sigma = 3.5\)

Construction of a Confidence Interval
• The construction of a confidence interval for the population mean depends upon three factors:
– The point estimate of the population mean
– The level of confidence
– The standard deviation of the sample mean

Section 6.1
Confidence Intervals for the Mean (large samples)

Point Estimate
DEFINITION: A point estimate is a single value estimate for a population parameter. The best point estimate of the population mean \(\mu\) is the sample mean \(\bar{x}\).

Part I: Point Estimate
A random sample of 35 airfare prices (in dollars) for a one-way ticket from Atlanta to Chicago is shown below. Find a point estimate for the population mean, \(\mu\).
99 101 107 102 109 98 105
103 101 105 98 107 104 96
105 95 98 94 100 104 111
114 87 104 108 101 87 103
106 117 94 103 101 105 90
The sample mean is \(\bar{x} = \frac{\sum x}{n} = \frac{3562}{35} \approx 101.77\).
The point estimate for the price of all one-way tickets from Atlanta to Chicago is $101.77.

Interval Estimates
Point estimate: 101.77
An interval estimate is an interval or range of values used to estimate a population parameter. On the number line it is drawn as an interval around the point estimate: ( • 101.77 )
The level of confidence, c, is the probability that the interval estimate contains the population parameter.
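As a quick check of the Part I point estimate, the sample mean of the 35 fares can be recomputed with Python's standard `statistics` module. This is only a minimal sketch; the variable names are illustrative and do not come from the slides.

```python
import statistics

# The 35 one-way Atlanta-to-Chicago fares (in dollars) from Part I.
fares = [
    99, 101, 107, 102, 109, 98, 105,
    103, 101, 105, 98, 107, 104, 96,
    105, 95, 98, 94, 100, 104, 111,
    114, 87, 104, 108, 101, 87, 103,
    106, 117, 94, 103, 101, 105, 90,
]

n = len(fares)
point_estimate = statistics.mean(fares)  # sample mean: the point estimate of mu
sample_sd = statistics.stdev(fares)      # sample standard deviation s (n - 1 divisor)

print(f"n = {n}")
print(f"sample mean = {point_estimate:.2f}")
print(f"sample standard deviation = {sample_sd:.2f}")
```

The printed mean rounds to the 101.77 quoted on the slide, and the printed standard deviation rounds to the s = 6.69 that the deck uses for this sample.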
Distribution of Sample Means
When the sample size is at least 30, the sampling distribution for \(\bar{x}\) is approximately normal.
[Figure: sampling distribution of \(\bar{x}\) for c = 0.95. The central area is 0.95, with 0.025 in each tail beyond z = -1.96 and z = 1.96.]
95% of all sample means will have standard scores between z = -1.96 and z = 1.96.

Solution: Finding the Margin of Error
[Figure: standard normal curve with central area 0.95 and 0.025 in each tail; \(-z_c = -1.96\), z = 0, \(z_c = 1.96\).]
95% of the area under the standard normal curve falls within 1.96 standard deviations of the mean.

Maximum Error of Estimate
The maximum error of estimate E is the greatest possible distance between the point estimate and the value of the parameter it is estimating for a given level of confidence, c.
\(E = z_c \sigma_{\bar{x}} = z_c \frac{\sigma}{\sqrt{n}}\)
When n is at least 30, the sample standard deviation, s, can be used for \(\sigma\).

Part 2: Maximum Error of Estimate
A random sample of 35 airfare prices (in dollars) for a one-way ticket from Atlanta to Chicago:
99 101 107 102 109 98 105
103 101 105 98 107 104 96
105 95 98 94 100 104 111
114 87 104 108 101 87 103
106 117 94 103 101 105 90
Find E, the maximum error of estimate for the one-way plane fare from Atlanta to Chicago for a 95% level of confidence given s = 6.69.
Using \(z_c = 1.96\), s = 6.69, and n = 35:
\(E = z_c \frac{s}{\sqrt{n}} = 1.96 \cdot \frac{6.69}{\sqrt{35}} \approx 2.22\)
You are 95% confident that the maximum error of estimate is $2.22.

Confidence Intervals for the Population Mean
A c-confidence interval for the population mean \(\mu\) is
\(\bar{x} - E < \mu < \bar{x} + E\), where \(E = z_c \frac{\sigma}{\sqrt{n}}\).
• The probability that the confidence interval contains \(\mu\) is c.

Part III: Confidence Intervals for \(\mu\)
Find the 95% confidence interval for the one-way plane fare from Atlanta to Chicago.
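Before the interval is assembled, the Part 2 margin of error can be double-checked numerically. The sketch below derives \(z_c\) from the level of confidence with `statistics.NormalDist` instead of reading it from a table; the function names `critical_z` and `margin_of_error` are hypothetical helpers introduced here, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def critical_z(confidence):
    """Two-sided critical value z_c for a level of confidence c (e.g. 0.95)."""
    tail = (1 - confidence) / 2            # area in each tail
    return NormalDist().inv_cdf(1 - tail)  # z with area (1 - tail) to its left


def margin_of_error(confidence, sd, n):
    """Maximum error of estimate E = z_c * sd / sqrt(n) (large-sample case)."""
    return critical_z(confidence) * sd / sqrt(n)


# Airfare example: c = 0.95, s = 6.69 used in place of sigma, n = 35.
z_c = critical_z(0.95)
E = margin_of_error(0.95, sd=6.69, n=35)

print(f"z_c = {z_c:.2f}")
print(f"E   = {E:.2f}")
```

Computing \(z_c\) from the inverse normal CDF gives a slightly more precise value than the table entry 1.96, but both agree to two decimal places, and so does the resulting margin of error.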
You found \(\bar{x} = 101.77\) and E = 2.22.
Left endpoint: \(\bar{x} - E = 101.77 - 2.22 = 99.55\)
Right endpoint: \(\bar{x} + E = 101.77 + 2.22 = 103.99\)
\(99.55 < \mu < 103.99\)
With 95% confidence, you can say the mean one-way fare from Atlanta to Chicago is between $99.55 and $103.99.

How could we get closer?
( 99.55 • 101.77 103.99 )
Recall that the construction of a confidence interval for the population mean depends upon the point estimate, the level of confidence, and the standard deviation of the sample mean.
Two ways to get a smaller confidence interval:
• Lower confidence level (e.g. 75%)
• Bigger sample

Sample Size
Given a c-confidence level and a maximum error of estimate, E, the minimum sample size n needed to estimate \(\mu\), the population mean, is
\(n = \left( \frac{z_c \sigma}{E} \right)^2\)

Part IV: Sample Size
You want to estimate the mean one-way fare from Atlanta to Chicago. How many fares must be included in your sample if you want to be 95% confident that the sample mean is within $2 of the population mean?
Using \(z_c = 1.96\), \(\sigma \approx s = 6.69\), and E = 2:
\(n = \left( \frac{1.96 \cdot 6.69}{2} \right)^2 \approx 42.98\)
Rounding up, you should include at least 43 fares in your sample. Since you already have 35, you need 8 more.

Section 6.2
What happens if we don't have 30 observations?
Confidence Intervals for the Mean (small samples)

Normal or t-Distribution?
Is \(n \ge 30\)?
• Yes: Use the normal distribution with \(E = z_c \frac{\sigma}{\sqrt{n}}\). If \(\sigma\) is unknown, use s instead.
• No: Is the population normally, or approximately normally, distributed?
– No: Cannot use the normal distribution or the t-distribution.
– Yes: Is \(\sigma\) known?
  If yes, use the normal distribution with \(E = z_c \frac{\sigma}{\sqrt{n}}\).
  If no, use the t-distribution with \(E = t_c \frac{s}{\sqrt{n}}\) and n - 1 degrees of freedom.

• Comparing three curves
– The standard normal curve
– The t curve with 14 degrees of freedom
– The t curve with 4 degrees of freedom
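The three curves named above can be compared numerically. The sketch below evaluates the standard normal density and the Student t density (using the standard closed-form formula with the gamma function) at a few points; `normal_pdf` and `t_pdf` are hypothetical helper names, and the evaluation points are arbitrary choices for illustration.

```python
from math import exp, gamma, pi, sqrt


def normal_pdf(z):
    """Density of the standard normal curve at z."""
    return exp(-z * z / 2) / sqrt(2 * pi)


def t_pdf(t, df):
    """Density of the t curve with df degrees of freedom at t."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


# Tabulate the three curves at the center and out in the tail.
print(" x     normal    t(14)     t(4)")
for x in (0.0, 1.0, 2.0, 3.0):
    print(f"{x:3.1f}  {normal_pdf(x):8.4f} {t_pdf(x, 14):8.4f} {t_pdf(x, 4):8.4f}")
```

At the center the t curves are lower than the normal curve, and in the tails they are higher, with the 4-degree-of-freedom curve departing the most. This is why \(t_c\) exceeds \(z_c\) for the same level of confidence, and why the gap shrinks as the degrees of freedom grow.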
The t-Distribution
If the distribution of a random variable x is normal and n < 30, then the sampling distribution of \(\bar{x}\), standardized as \(t = \frac{\bar{x} - \mu}{s / \sqrt{n}}\), is a t-distribution with n - 1 degrees of freedom.
[Figure: sampling distribution for n = 13, d.f. = 12, c = 90%. The central area is 0.90, with 0.05 in each tail beyond t = -1.782 and t = 1.782.]
The critical value for t is 1.782. 90% of the sample means (n = 13) will lie between t = -1.782 and t = 1.782.

Confidence Interval: Small Sample
In a random sample of 13 American adults, the mean waste recycled per person per day was 4.3 pounds and the standard deviation was 0.3 pound. Assume the variable is normally distributed and construct a 90% confidence interval for \(\mu\).

Finding \(t_c\)
If c = 0.90 and n = 13, then d.f. = n - 1 = 12, and the t-table gives \(t_c = 1.782\).
t-table: http://surfstat.anu.edu.au/surfstat-home/tables/t.php

1. The point estimate is \(\bar{x} = 4.3\) pounds.
2. The maximum error of estimate is \(E = t_c \frac{s}{\sqrt{n}} = 1.782 \cdot \frac{0.3}{\sqrt{13}} \approx 0.148\).
Left endpoint: 4.3 - 0.148 = 4.152
Right endpoint: 4.3 + 0.148 = 4.448
\(4.15 < \mu < 4.45\)
With 90% confidence, you can say the mean waste recycled per person per day is between 4.15 and 4.45 pounds.
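The small-sample recycling interval can be reproduced with a few lines of Python. This sketch takes \(t_c = 1.782\) as given from the t-table rather than computing it, since the standard library has no inverse t-distribution; `t_interval` is a hypothetical helper name.

```python
from math import sqrt


def t_interval(xbar, s, n, t_c):
    """Confidence interval xbar +/- t_c * s / sqrt(n); t_c comes from a t-table."""
    E = t_c * s / sqrt(n)  # maximum error of estimate
    return xbar - E, xbar + E, E


# Recycling example: n = 13 (12 d.f.), c = 0.90, table value t_c = 1.782.
low, high, E = t_interval(xbar=4.3, s=0.3, n=13, t_c=1.782)

print(f"E = {E:.3f}")
print(f"{low:.3f} < mu < {high:.3f}")
```

Passing the critical value in as an argument keeps the function usable for any level of confidence and sample size, as long as the matching table entry for n - 1 degrees of freedom is looked up first.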
See Pg 329 of textbook
For each scenario, decide whether to use the normal distribution (z) or the t-distribution.

1. The Graduate Management Admission Test (GMAT) is a test required for admission into many master of business administration (MBA) programs. Total scores on the GMAT are normally distributed and historically have a standard deviation of 113. Suppose a random sample of 8 students took the test, and their scores are recorded.
(The population is normally distributed and \(\sigma = 113\) is known, so we can use z even though n < 30.)

2. Sean is estimating the average number of Christmas trees he will find in the windows of each store in the mall. He observes each of the 10 stores in the mall and records a sample mean of 15 trees with a standard deviation of 6.
(The sample is small and \(\sigma\) is unknown, so we use t with 9 degrees of freedom. The problem does not say the population is normally distributed, so this also requires assuming it is at least approximately normal.)

3. Patrick wonders about the average number of servings of eggnog at the Holiday Party. He knows that typically this variable has a standard deviation of 2.2 servings. He records a sample mean of 4 servings for a sample of 50 people.
(We do not know that the population is normally distributed, but n = 50 is at least 30 and we know \(\sigma = 2.2\), so we can use z.)
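The "Normal or t-Distribution?" flowchart from Section 6.2 can be written as a small decision function and applied to the three scenarios above. This is a sketch only: `choose_distribution` and its parameter names are made up for illustration, and for Sean's scenario the population is assumed to be approximately normal, which the problem does not state.

```python
def choose_distribution(n, population_normal, sigma_known):
    """Follow the flowchart: return 'z', 't', or 'neither'."""
    if n >= 30:
        return "z"        # large sample; use s in place of sigma if needed
    if not population_normal:
        return "neither"  # small sample from a non-normal population
    return "z" if sigma_known else "t"


scenarios = {
    # GMAT: normal population, sigma = 113 known, n = 8
    "GMAT": choose_distribution(n=8, population_normal=True, sigma_known=True),
    # Sean: n = 10, sigma unknown; normality is ASSUMED here, not given
    "Sean": choose_distribution(n=10, population_normal=True, sigma_known=False),
    # Patrick: n = 50, sigma = 2.2 known, normality unknown
    "Patrick": choose_distribution(n=50, population_normal=False, sigma_known=True),
}

for name, choice in scenarios.items():
    print(f"{name}: {choice}")
```

If `population_normal=False` is passed for Sean's scenario instead, the function returns "neither", which is what the flowchart prescribes for a small sample from a population that is not approximately normal.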
Section 6.3
Confidence Intervals for Population Proportions

What if we are interested in a population proportion or percentage?
For example: What percentage of the population likes spinach?

Confidence Intervals for Population Proportions
The point estimate for p, the population proportion of successes, is given by the proportion of successes in a sample:
\(\hat{p} = \frac{x}{n}\) (read as p-hat)
\(\hat{q} = 1 - \hat{p}\) is the point estimate for the proportion of failures.
Required condition: If \(np \ge 5\) and \(nq \ge 5\), the sampling distribution for \(\hat{p}\) is approximately normal.

The maximum error of estimate, E, for a c-confidence interval is:
\(E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}}\)
A c-confidence interval for the population proportion, p, is
\(\hat{p} - E < p < \hat{p} + E\)

Confidence Interval for p
In a study of 1907 fatal traffic accidents, 449 were alcohol related. Construct a 99% confidence interval for the proportion of fatal traffic accidents that are alcohol related.
1. The point estimate for p is \(\hat{p} = \frac{449}{1907} \approx 0.235\), so \(\hat{q} = 1 - 0.235 = 0.765\).
2. 1907(0.235) > 5 and 1907(0.765) > 5, so the sampling distribution is approximately normal.
3. \(E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}} = 2.575 \sqrt{\frac{(0.235)(0.765)}{1907}} \approx 0.025\)
Left endpoint: 0.235 - 0.025 = 0.21
Right endpoint: 0.235 + 0.025 = 0.26
0.21 < p < 0.26
With 99% confidence, you can say the proportion of fatal accidents that are alcohol related is between 21% and 26%.

Minimum Sample Size
If you have a preliminary estimate for p and q, the minimum sample size needed to estimate p, given a c-confidence level and a maximum error of estimate E, is:
\(n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2\)
If you do not have a preliminary estimate, use 0.5 for both \(\hat{p}\) and \(\hat{q}\).
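The Section 6.3 formulas can be sketched in Python as follows. Two assumptions are worth flagging: the critical value 2.575 is the rounded table value for c = 0.99 (the exact value is closer to 2.576), and the helper names `proportion_interval` and `min_sample_size` are invented for this illustration. The second function is exercised with E = 0.02 purely as an example input.

```python
from math import ceil, sqrt

Z_99 = 2.575  # assumed table value of z_c for c = 0.99


def proportion_interval(successes, n, z_c):
    """c-confidence interval for a population proportion p."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("normal approximation condition not met")
    E = z_c * sqrt(p_hat * q_hat / n)  # maximum error of estimate
    return p_hat - E, p_hat + E


def min_sample_size(z_c, E, p_hat=0.5):
    """Smallest n with p_hat * q_hat * (z_c / E)^2 <= n; 0.5 is the no-estimate default."""
    return ceil(p_hat * (1 - p_hat) * (z_c / E) ** 2)


# Fatal-accident study: 449 alcohol-related out of 1907, 99% confidence.
low, high = proportion_interval(449, 1907, Z_99)
print(f"{low:.2f} < p < {high:.2f}")

# Sample sizes for a 2% maximum error at 99% confidence.
print(min_sample_size(Z_99, 0.02))               # no preliminary estimate
print(min_sample_size(Z_99, 0.02, p_hat=0.235))  # preliminary estimate 0.235
```

Because \(\hat{p}\hat{q}\) is largest when both are 0.5, the no-estimate default always yields the most conservative (largest) sample size; any preliminary estimate away from 0.5 reduces it.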
Example: Minimum Sample Size
You wish to estimate the proportion of fatal accidents that are alcohol related at a 99% level of confidence. Find the minimum sample size needed to be accurate to within 2% of the population proportion.
With no preliminary estimate, use 0.5 for \(\hat{p}\) and \(\hat{q}\):
\(n = (0.5)(0.5) \left( \frac{2.575}{0.02} \right)^2 \approx 4144.14\)
Rounding up, you will need at least 4145 for your sample.

Example: Minimum Sample Size
You wish to estimate the proportion of fatal accidents that are alcohol related at a 99% level of confidence. Find the minimum sample size needed to be accurate to within 2% of the population proportion. Use a preliminary estimate of p = 0.235.
\(n = (0.235)(0.765) \left( \frac{2.575}{0.02} \right)^2 \approx 2980.05\)
With a preliminary sample you need at least n = 2981 for your sample.

Example #1 (pg 310)
Market researchers use the number of sentences per advertisement as a measure of readability for magazine advertisements. Suppose that for a sample of 50 advertisements we determine that the average number of sentences (\(\bar{x}\)) is 12.4 and the standard deviation is 5.0. Compute the 95% confidence interval for the mean \(\mu\).
– Question #1: Do we know the population standard deviation?
– Question #2: What is the interval?

Solution
The population standard deviation is not known, but n = 50 is at least 30, so s can be used for \(\sigma\) with the normal distribution.
\(\bar{x} = 12.4\), s = 5.0, n = 50, c = 0.95
\(z_c = 1.96\) (for c = 0.95)
\(E = 1.96 \cdot \frac{5}{\sqrt{50}} \approx 1.4\)
\(12.4 - 1.4 < \mu < 12.4 + 1.4\)
\(11.0 < \mu < 13.8\)
Answer: We are 95% confident that the mean number of sentences in the POPULATION is between 11.0 and 13.8.

Example #2 (Pg. 327)
You randomly select 16 coffee shops and measure the temperature of the coffee sold at each. The sample mean temperature is 162 degrees with a standard deviation of 10 degrees. You know that the temperatures are normally distributed.
A. Find the 95% confidence interval for the mean temperature.
B. Find the 99% confidence interval for the mean temperature.
95% Confidence Interval
\(\bar{x} = 162.0\), s = 10.0, n = 16, c = 0.95
\(t_c = 2.131\) (d.f. = 15, c = 0.95)
\(E = 2.131 \cdot \frac{10}{\sqrt{16}} \approx 5.3\)
\(162 - 5.3 < \mu < 162 + 5.3\)
\(156.7 < \mu < 167.3\)
Answer: We are 95% confident that the average temperature of all the coffee in the POPULATION is between 156.7 and 167.3 degrees.

What about 99%?
\(\bar{x} = 162.0\), s = 10.0, n = 16, c = 0.99
\(t_c = 2.947\) (d.f. = 15, c = 0.99)
\(E = 2.947 \cdot \frac{10}{\sqrt{16}} \approx 7.4\)
\(162 - 7.4 < \mu < 162 + 7.4\)
\(154.6 < \mu < 169.4\)
Answer: We are 99% confident that the average temperature of all the coffee in the POPULATION is between 154.6 and 169.4 degrees.
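The two coffee-temperature intervals can be computed side by side to show how the level of confidence affects the width. This is a minimal sketch assuming the standard t-table entries for 15 degrees of freedom (2.131 at 95% and 2.947 at 99%); the dictionary and variable names are illustrative only.

```python
from math import sqrt

# Coffee example: sample mean, sample standard deviation, sample size.
xbar, s, n = 162.0, 10.0, 16

# Assumed t-table critical values for n - 1 = 15 degrees of freedom.
t_table = {0.95: 2.131, 0.99: 2.947}

intervals = {}
for c, t_c in t_table.items():
    E = t_c * s / sqrt(n)  # maximum error of estimate
    intervals[c] = (xbar - E, xbar + E)
    low, high = intervals[c]
    print(f"c = {c:.2f}: E = {E:.1f}, {low:.1f} < mu < {high:.1f}")
```

The 99% interval contains the 95% interval: asking for more confidence from the same 16 observations costs precision, which is the same trade-off raised earlier under "How could we get closer?".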