6.1 Confidence Interval for the Mean (n ≥ 30, or σ known with a normal population)

Often we do not know the mean of an entire population, so we estimate it.

Point Estimate: A single-value estimate. The most unbiased estimate of the population mean μ is the sample mean $\bar{x}$.

Interval Estimate: A range of values used to estimate a population parameter. It is formed around the point estimate using a margin of error E: $(\bar{x} - E,\ \bar{x} + E)$.

Margin of Error: A sampling error is the difference between the point estimate of the mean and the actual mean, $\bar{x} - \mu$. Since we do not usually know the actual mean, and the sample mean varies from sample to sample, we construct a maximum value of error (called the margin of error) for a given level of confidence.

Larson/Farber

Estimating the Population Mean μ

Market researchers use the number of sentences per advertisement as a measure of readability for magazine advertisements. The following represents a random sample of the number of sentences found in 50 advertisements. Find a point estimate of the population mean μ. (Journal of Advertising Research)

9 20 18 16 9 9 11 13 22 16
5 18 6 6 5 12 25 17 23 7
10 9 10 10 5 11 18 18 9 9
17 13 11 7 14 6 11 12 11 6
12 14 11 9 18 12 12 17 11 20

$\bar{x} = \frac{\sum x}{n} = \frac{620}{50} = 12.4$

Your point estimate for the mean length of all magazine advertisements is 12.4 sentences.

Interval estimate: An interval, or range of values, used to estimate a population parameter. For example, a point estimate of 12.4 with a margin of error of 2.1 gives the interval estimate (10.3, 14.5).

How confident do we want to be that the interval estimate contains the population mean μ (80%, 90%, 95%)? This choice determines what our margin of error will be.

Confidence Intervals

Level of confidence c: The probability that the interval estimate contains the population parameter. c is the area under the standard normal curve between the critical values $-z_c$ and $z_c$; the remaining area is split equally, with $\frac{1}{2}(1 - c)$ in each tail. Use the Standard Normal Table to find the corresponding z-scores.
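As a complement to the table lookup, the critical value can also be computed from the inverse of the standard normal cumulative distribution. The following is a minimal sketch, assuming Python 3.8 or later (for `statistics.NormalDist`); the helper name `critical_z` is hypothetical and does not come from the slides.

```python
from statistics import NormalDist


def critical_z(c: float) -> float:
    """Return z_c such that the area between -z_c and z_c is c."""
    if not 0 < c < 1:
        raise ValueError("confidence level c must be strictly between 0 and 1")
    tail = (1 - c) / 2  # area left in each tail
    # inv_cdf takes the cumulative area to the left of the critical value.
    return NormalDist().inv_cdf(1 - tail)


for c in (0.80, 0.90, 0.95, 0.99):
    print(f"c = {c:.2f}  ->  z_c = {critical_z(c):.3f}")
```

The function takes the area to the left of $z_c$, which is $c$ plus one tail, so it plays the same role as the calculator's invNorm.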
• If the level of confidence is 90% (c = 0.90), we are 90% confident that the interval contains the population mean μ.
• The remaining 10% is left for the tails of the distribution (5% = 0.05 in each tail).
• $-z_c$ = invNorm(0.05) = -1.645 and $z_c$ = invNorm(0.95) = 1.645.

Larson/Farber, 4th ed.

Margin of Error (E)

• Greatest possible distance between the point estimate and the value of the parameter it is estimating, for a given level of confidence c.
• Sometimes called the maximum error of estimate or error tolerance.

$E = z_c \sigma_{\bar{x}} = z_c \frac{\sigma}{\sqrt{n}}$

When n ≥ 30, the sample standard deviation s can be used for σ.

Example: Use the magazine advertisement data and a 95% confidence level to find the margin of error for the mean number of sentences in all magazine advertisements. Assume the sample standard deviation (s) is about 5.0.

95% of the area under the curve falls within 1.96 standard deviations of the mean (0.95 in the center, 0.025 in each tail), so $z_c = 1.96$.

$E = 1.96 \cdot \frac{5.0}{\sqrt{50}} \approx 1.4$ sentences.

A 95% confidence interval for the population mean (the probability that the confidence interval contains μ is 95%):

$(\bar{x} - E,\ \bar{x} + E) = (12.4 - 1.4,\ 12.4 + 1.4) = (11.0,\ 13.8)$, that is, 11.0 < μ < 13.8.

TI-83/84: STAT, TESTS, 7:ZInterval

Constructing Confidence Intervals for μ

Finding a Confidence Interval for a Population Mean (n ≥ 30, or σ known with a normally distributed population)

1. Find the sample statistics n and $\bar{x}$, where $\bar{x} = \frac{\sum x}{n}$.
2. Specify σ, if known. Otherwise, if n ≥ 30, find the sample standard deviation $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ and use it as an estimate for σ.
3. Find the critical value $z_c$ that corresponds to the given level of confidence (Standard Normal Table or invNorm()).
4. Find the margin of error E.
5. Find the left and right endpoints and form the confidence interval.
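The five steps above can be sketched in Python for the magazine advertisement example. This is an illustrative sketch rather than the textbook's procedure: the function name `z_interval` is hypothetical, s = 5.0 is the value the example assumes (it is not recomputed from the data here), and $z_c$ is obtained from `statistics.NormalDist` (Python 3.8+) instead of a table.

```python
from math import sqrt
from statistics import NormalDist, mean

# Number of sentences in the 50 sampled advertisements.
sentences = [
    9, 20, 18, 16, 9, 9, 11, 13, 22, 16,
    5, 18, 6, 6, 5, 12, 25, 17, 23, 7,
    10, 9, 10, 10, 5, 11, 18, 18, 9, 9,
    17, 13, 11, 7, 14, 6, 11, 12, 11, 6,
    12, 14, 11, 9, 18, 12, 12, 17, 11, 20,
]


def z_interval(x_bar, sigma, n, c):
    """Return (left endpoint, right endpoint, margin of error) for the mean."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)  # step 3: critical value
    margin = z_c * sigma / sqrt(n)           # step 4: E = z_c * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin, margin  # step 5: endpoints


x_bar = mean(sentences)  # step 1: point estimate
s = 5.0                  # step 2: assumed value, used in place of sigma (n >= 30)
low, high, margin = z_interval(x_bar, s, len(sentences), 0.95)
print(f"x_bar = {x_bar:.1f}, E = {margin:.1f}, interval = ({low:.1f}, {high:.1f})")
```

Rounded to one decimal place, the result agrees with the worked example: E is about 1.4 and the interval is about (11.0, 13.8).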
For steps 4 and 5: $E = z_c \frac{\sigma}{\sqrt{n}}$, and the interval is $\bar{x} - E < \mu < \bar{x} + E$.

Interpretation: "If a large number of samples is collected and a confidence interval is created for each sample, approximately C% of these intervals will contain μ."

Practice

A publisher wants to estimate the mean length of time (in minutes) all adults spend reading newspapers. To determine this estimate, the publisher takes a random sample of 15 people and obtains the following results.

11, 9, 8, 10, 10, 9, 7, 11, 11, 7, 6, 9, 10, 8, 10

From past studies, the publisher assumes a population standard deviation of 1.5 minutes and that the population is normally distributed. Construct a 90% and a 99% confidence interval for the population mean.

Sample Size

Given a c-confidence level and a margin of error E, the minimum sample size n needed to estimate the population mean μ is

$n = \left( \frac{z_c \sigma}{E} \right)^2$

(If σ is unknown, you can estimate it with s, provided the sample size is at least 30.)

Example: You want to estimate the mean number of sentences in a magazine advertisement. How many magazine advertisements must be included in the sample if you want to be 95% confident that the sample mean is within one sentence of the population mean? Assume the sample standard deviation is about 5.0.

For c = 0.95 (0.025 in each tail), $z_c = 1.96$, so

$n = \left( \frac{z_c \sigma}{E} \right)^2 \approx \left( \frac{1.96 \cdot 5.0}{1} \right)^2 = 96.04$

Rounding up, you should include at least 97 magazine advertisements in your sample.

6.2 Confidence Interval for the Mean (σ unknown)

• When the population standard deviation is unknown, the sample size is less than 30, and the random variable x is approximately normally distributed, the statistic

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

follows a t-distribution (critical values are denoted $t_c$) with the properties below:

1. The t-distribution is bell shaped and symmetric about the mean.
2. The t-distribution is a family of curves, each determined by a parameter called the degrees of freedom. When estimating a population mean, the degrees of freedom are d.f. = n - 1.
3. The total area under a t-curve is 1, or 100%.
4. The mean, median, and mode of the t-distribution are equal to zero.
5. As the degrees of freedom increase, the t-distribution approaches the normal distribution. After 30 d.f., the t-distribution is very close to the standard normal z-distribution.

[Figure: t-curves for d.f. = 2 and d.f. = 5 plotted against the standard normal curve. The tails in the t-distribution are "thicker" than those in the standard normal distribution.]

Example: Critical Values of t

Find the critical value $t_c$ for a 95% confidence level when the sample size is 15.

Solution: d.f. = n - 1 = 15 - 1 = 14. With c = 0.95, 95% of the area under the t-distribution curve with 14 degrees of freedom lies between t = ±2.145, so $t_c = 2.145$.

A c-confidence interval for the population mean μ (the probability that the confidence interval contains μ is c):

$\bar{x} - E < \mu < \bar{x} + E$, where $E = t_c \frac{s}{\sqrt{n}}$

Confidence Intervals and t-Distributions

1. Identify the sample statistics n, $\bar{x}$, and s: $\bar{x} = \frac{\sum x}{n}$, $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$.
2. Identify the degrees of freedom, the level of confidence c, and the critical value $t_c$: d.f. = n - 1.
3. Find the margin of error E: $E = t_c \frac{s}{\sqrt{n}}$.
4. Find the left and right endpoints and form the confidence interval: $\bar{x} - E < \mu < \bar{x} + E$.

Example: Constructing a Confidence Interval

You randomly select 16 coffee shops and measure the temperature of the coffee sold at each. The sample mean temperature is 162.0ºF with a sample standard deviation of 10.0ºF. Find the 95% confidence interval for the mean temperature. Assume the temperatures are approximately normally distributed.

Use the t-distribution (n < 30, σ unknown, approximately normal distribution).

• n = 16, $\bar{x}$ = 162.0, s = 10.0, c = 0.95
• d.f. = n - 1 = 16 - 1 = 15
• $t_c$ = 2.131

Margin of error: $E = t_c \frac{s}{\sqrt{n}} = 2.131 \cdot \frac{10}{\sqrt{16}} \approx 5.3$

Confidence interval: (162 - 5.3, 162 + 5.3) = (156.7, 167.3), that is, 156.7 < μ < 167.3.

With 95% confidence, you can say that the mean temperature of coffee sold is between 156.7ºF and 167.3ºF.

TI-83/84: STAT, TESTS, 8:TInterval

Normal (z) Distribution or t-Distribution?
Note: You must have reason to believe you are working with an approximately normal distribution to use the t-distribution. If n ≥ 30, then according to the Central Limit Theorem the sampling distribution of the sample means approximates a normal distribution, so this criterion is met. Additionally, if n ≥ 30, the t-distribution is very close to the standard normal z-distribution, so you could use either the z- or the t-distribution, though your book guides you to use t in this situation since the standard deviation of the population is unknown. If n < 30 and the population is not normally distributed (or you do not know whether it is), then you cannot use either the standard normal (z) distribution or the t-distribution.

Example: Normal (z) or t-Distribution?

You randomly select 25 newly constructed houses. The sample mean construction cost is $181,000 and the population standard deviation is $28,000. Assuming construction costs are normally distributed, should you use the normal distribution, the t-distribution, or neither to construct a 95% confidence interval for the population mean construction cost?

Solution: Use the normal (z) distribution (the population is normally distributed and the population standard deviation is known).

Practice

Assume the variable is normally distributed and use (as appropriate) a normal distribution or a t-distribution to construct a 90% confidence interval for the population mean.

A) In a random sample of 10 adults from the U.S., the mean waste generated per person per day was 4.54 pounds and the standard deviation was 1.21 pounds.

B) Repeat part (A), assuming the same statistics came from a sample of size 500. Note that the sample size is quite large (much greater than 30). Construct a 90% confidence interval for the population mean first using the standard normal (z) distribution, then again using a t-distribution. Compare the two answers and consider the reason for your results.
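The choice between z, t, and neither, together with the coffee-shop interval from section 6.2, can be sketched in Python as follows. This is a sketch under stated assumptions: the function names `choose_distribution` and `t_interval` are hypothetical, the rule for n ≥ 30 with σ unknown follows the note in this section (use t, although z would give nearly the same result), and the critical value $t_c$ = 2.131 is passed in from the t-table because the Python standard library has no t-distribution.

```python
from math import sqrt


def choose_distribution(n, sigma_known, population_normal):
    """Return 'z', 't', or 'neither' following the note in this section."""
    # For n >= 30 the Central Limit Theorem covers the normality requirement.
    approximately_normal = population_normal or n >= 30
    if not approximately_normal:
        return "neither"
    # Known sigma -> standard normal; unknown sigma -> t (per the note).
    return "z" if sigma_known else "t"


def t_interval(x_bar, s, n, t_c):
    """Return (left endpoint, right endpoint, margin of error) using t_c."""
    margin = t_c * s / sqrt(n)  # E = t_c * s / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# House example: n = 25, sigma known, normal population.
print(choose_distribution(25, sigma_known=True, population_normal=True))
# Coffee example: n = 16, sigma unknown, approximately normal population.
print(choose_distribution(16, sigma_known=False, population_normal=True))

# Coffee example: d.f. = 15 and c = 0.95 give t_c = 2.131 (table value).
low, high, margin = t_interval(162.0, 10.0, 16, 2.131)
print(f"E = {margin:.1f}, interval = ({low:.1f}, {high:.1f})")
```

Passing the critical value in as an argument keeps the sketch free of third-party dependencies; a statistics library with a t-distribution could supply it instead.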
6.3 Confidence Interval for Population Proportions

Recall that p is the probability of success in a binomial experiment. Just as we estimated the mean, we can estimate population proportions (p).

Point estimate for p ("p-hat"): $\hat{p} = \frac{x}{n}$, the number of successes in the sample divided by the number in the sample.

Point estimate for q ("q-hat"): $\hat{q} = 1 - \hat{p}$

A binomial distribution can be approximated by the normal distribution. If np ≥ 5 and nq ≥ 5, the sampling distribution of $\hat{p}$ is approximately normal.

Example: In a survey of 1219 U.S. adults, 354 said that their favorite sport to watch is football. Find a point estimate for the population proportion of U.S. adults who say their favorite sport to watch is football. (The Harris Poll)

$\hat{p} = \frac{x}{n} = \frac{354}{1219} \approx 0.290402 \approx 29.0\%$

A c-confidence interval for the population proportion p (the probability that the confidence interval contains p is c):

$\hat{p} - E < p < \hat{p} + E$, where $E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}}$

Constructing Confidence Intervals for p

1. Identify the sample statistics n and x.
2. Find the point estimate $\hat{p} = \frac{x}{n}$.
3. Verify that the sampling distribution of $\hat{p}$ can be approximated by the normal distribution: $n\hat{p} \ge 5$ and $n\hat{q} \ge 5$.
4. Find the critical value $z_c$ that corresponds to the given level of confidence c (use the Standard Normal Table).
5. Find the margin of error $E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}}$.
6. Find the left and right endpoints and form the confidence interval: $\hat{p} - E < p < \hat{p} + E$.

Example: Confidence Interval for p

In a survey of 1219 U.S. adults, 354 said that their favorite sport to watch is football. Construct a 95% confidence interval for the proportion of adults in the United States who say that their favorite sport to watch is football.
Solution: Recall $\hat{p} \approx 0.290402$, so $\hat{q} = 1 - \hat{p} = 1 - 0.290402 = 0.709598$.

• Verify that the sampling distribution of $\hat{p}$ can be approximated by the normal distribution: $n\hat{p} = 1219 \cdot 0.290402 \approx 354 > 5$ and $n\hat{q} = 1219 \cdot 0.709598 \approx 865 > 5$.
• Margin of error: $z_c$ = invNorm(0.975) = 1.96, so $E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.96 \sqrt{\frac{(0.290402)(0.709598)}{1219}} \approx 0.025$.
• Confidence interval: from $\hat{p} - E$ to $\hat{p} + E$. With the rounded values $\hat{p} \approx 0.290$ and $E \approx 0.025$, this gives 0.265 < p < 0.315.

With 95% confidence, you can say that the proportion of adults who say football is their favorite sport is between 26.5% and 31.5%.

TI-83/84: STAT, TESTS, A:1-PropZInt

Sample Size

• Given a c-confidence level and a margin of error E, the minimum sample size n needed to estimate p is

$n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2$

• This formula assumes you have an estimate for $\hat{p}$ and $\hat{q}$.
• If not, use $\hat{p} = 0.5$ and $\hat{q} = 0.5$.

Example: You are running a political campaign and wish to estimate, with 95% confidence, the proportion of registered voters who will vote for your candidate. Your estimate must be accurate within 3% of the true population proportion. Find the minimum sample size needed if 1) no preliminary estimate is available and 2) a preliminary estimate gives $\hat{p} = 0.31$.

#1: Since you do not have a preliminary estimate, use $\hat{p} = 0.5$ and $\hat{q} = 0.5$:

$n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2 = (0.5)(0.5) \left( \frac{1.96}{0.03} \right)^2 \approx 1067.11$

Answer: 1068 voters

#2: Use the preliminary estimate $\hat{p} = 0.31$, $\hat{q} = 1 - \hat{p} = 1 - 0.31 = 0.69$:

$n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2 = (0.31)(0.69) \left( \frac{1.96}{0.03} \right)^2 \approx 913.02$

Answer: 914 voters

Practice

You are a travel agent and wish to estimate, with 95% confidence, the proportion of vacationers who plan to travel outside the U.S. in the next 12 months. Your estimate must be accurate within 3% of the true proportion.

(A) No preliminary estimate is available. Find the minimum sample size.

(B) Find the minimum sample size needed, using a prior study that found that 26% of the respondents said they planned to travel outside the U.S. in the next 12 months.
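The proportion interval and the sample-size formula from this section can be checked with a short Python sketch. The function names `proportion_interval` and `min_sample_size` are hypothetical, and the critical value 1.96 is the table value used in the examples, which is an assumption of this sketch rather than a computed quantity. Because the code carries unrounded values, its interval endpoints can differ in the third decimal place from figures obtained with the rounded $\hat{p}$ and E.

```python
from math import ceil, sqrt

Z_95 = 1.96  # table value of z_c for c = 0.95, as used in the examples


def proportion_interval(x, n, z_c):
    """Return (left endpoint, right endpoint, margin of error) for p."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # The normal approximation requires n*p_hat >= 5 and n*q_hat >= 5.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("normal approximation is not appropriate")
    margin = z_c * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin, margin


def min_sample_size(z_c, margin, p_hat=0.5):
    """Minimum n for estimating p; p_hat defaults to 0.5 with no prior estimate."""
    n = p_hat * (1 - p_hat) * (z_c / margin) ** 2
    return ceil(n)  # always round up to the next whole observation


low, high, margin = proportion_interval(354, 1219, Z_95)
print(f"E = {margin:.3f}")
print(f"unrounded interval: ({low:.4f}, {high:.4f})")

print(min_sample_size(Z_95, 0.03))              # no preliminary estimate
print(min_sample_size(Z_95, 0.03, p_hat=0.31))  # preliminary estimate 0.31
```

Rounding up with `ceil` rather than rounding to the nearest integer matters here: 913.02 becomes 914, because 913 observations would leave the margin of error slightly above the target.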