Chapter Eight: Estimation
Understanding Basic Statistics, Fourth Edition
By Brase and Brase
Prepared by: Lynn Smith, Gloucester County College
Copyright © Houghton Mifflin Company. All rights reserved.

Estimating $\mu$ When $\sigma$ is Known
Assumptions:
• We have a simple random sample of size n.
• If the x distribution is normal, the methods work for any sample size.
• If the x distribution is unknown, a sample size of at least 30 (sometimes even more) is required.

Point Estimate
• an estimate of a population parameter given by a single number

Examples of Point Estimates
• $\bar{x}$ is used as a point estimate for $\mu$.
• $s$ is used as a point estimate for $\sigma$.

Margin of Error
• the magnitude of the difference between the point estimate and the true parameter value
• The margin of error using $\bar{x}$ as a point estimate for $\mu$ is $|\bar{x} - \mu|$.

Confidence Level
• A confidence level, c, is a measure of the degree of assurance we have in our results.
• The value of c may be any number between zero and one.
• Typical values for c include 0.90, 0.95, and 0.99.

Critical Value for a Confidence Level, c
• the value $z_c$ such that the area under the standard normal curve falling between $-z_c$ and $z_c$ is equal to c

[Figure: Confidence Level]

Find $z_{0.90}$ such that 90% of the area under the normal curve lies between $-z_{0.90}$ and $z_{0.90}$
• $P(-z_{0.90} < z < z_{0.90}) = 0.90$
• By symmetry, the area in each tail is $P(z < -z_{0.90}) = (1 - 0.90)/2 = 0.05$
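The critical value defined above can also be computed directly instead of being read from a table. The following is a minimal sketch, assuming Python's standard `statistics.NormalDist`; the function name `z_critical` is a hypothetical helper and is not part of the slides.

```python
from statistics import NormalDist


def z_critical(c: float) -> float:
    """Return z_c such that the area between -z_c and z_c under the
    standard normal curve equals the confidence level c."""
    if not 0.0 < c < 1.0:
        raise ValueError("confidence level c must be strictly between 0 and 1")
    # The area to the left of z_c is c plus the lower tail (1 - c) / 2,
    # which simplifies to (1 + c) / 2.
    return NormalDist().inv_cdf((1.0 + c) / 2.0)


for c in (0.90, 0.95, 0.99):
    print(f"c = {c:.2f}: z_c = {z_critical(c):.3f}")
```

For c = 0.99 this gives a value near 2.576, which the slides round to 2.58.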
Find $z_{0.90}$ such that 90% of the area under the normal curve lies between $-z_{0.90}$ and $z_{0.90}$ (continued)
• According to Appendix Table 3, 0.0500 lies exactly halfway between two areas in the table (0.0505 and 0.0495).
• Averaging the z values associated with these areas ($-1.64$ and $-1.65$) gives $-z_{0.90} = -1.645$.
• $z_{0.90} = 1.645$

Common Levels of Confidence and Their Corresponding Critical Values
[Table not reproduced. The values used in this chapter are $z_{0.90} = 1.645$, $z_{0.95} = 1.96$, and $z_{0.99} = 2.58$.]

Confidence Interval
• A c confidence interval for $\mu$ is an interval computed from sample data.
• In a c confidence interval for $\mu$, c is the probability of generating an interval containing the actual value of $\mu$.

To Find a Confidence Interval for $\mu$ When $\sigma$ is Known:
• Let x represent the appropriate random variable.
• Obtain a simple random sample (of size n) of x values.
• Compute the sample mean, $\bar{x}$.
• If you cannot assume x has a normal distribution, use a sample size of 30 or more.

Confidence Interval for $\mu$ When $\sigma$ is Known:
$\bar{x} - E < \mu < \bar{x} + E$
where
• $\bar{x}$ = sample mean
• $E = z_c \frac{\sigma}{\sqrt{n}}$
• c = confidence level
• $z_c$ = critical value for confidence level c, based on the standard normal distribution

Example: Create a 95% confidence interval for the mean driving time between Philadelphia and Boston. Assume that the mean driving time of 64 trips was 6.4 hours and that the standard deviation is 0.9 hours.

Creating a 95% confidence interval
• $\bar{x}$ = 6.4 hours
• $\sigma$ = 0.9 hours
• c = 95%, so $z_c$ = 1.96
• n = 64
• The 95% confidence interval will be from $\bar{x} - E$ to $\bar{x} + E$.
• $E = z_c \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{0.9}{\sqrt{64}} = 0.2205$

95% Confidence Interval:
$6.4 - 0.2205 < \mu < 6.4 + 0.2205$
$6.1795 < \mu < 6.6205$
We are 95% sure that the true mean driving time is between 6.18 and 6.62 hours.

We may get different confidence intervals for different samples.
• For each sample, the c confidence interval goes from $\bar{x} - E$ to $\bar{x} + E$.
• If we select many samples of the same size and find the corresponding confidence intervals, then the proportion of these intervals that actually contain $\mu$ is c.

When estimating the mean, how large a sample must be used in order to assure a given level of confidence?
Use the formula:
$n = \left( \frac{z_c \sigma}{E} \right)^2$

How do we determine the value of the population standard deviation, $\sigma$?
• Use the standard deviation, s, of a preliminary sample of size 30 or larger to estimate $\sigma$.

Example: Determine the sample size necessary to estimate (with 99% confidence) the mean time it takes to drive from Philadelphia to Boston. We wish to be within 15 minutes of the true time. Assume that a preliminary sample of 45 trips had a standard deviation of 0.8 hours.

"... determine with 99% confidence ..."
• $z_{0.99}$ = 2.58

"... We wish to be within 15 minutes of the true time. ..."
• E = 15 minutes = 0.25 hours

"... a preliminary sample of 45 trips had a standard deviation of 0.8 hours."
• Since the preliminary sample is large enough (45 is at least 30), we can assume that the population standard deviation is approximately equal to 0.8 hours.

Minimum Sample Size
$n = \left( \frac{z_c \sigma}{E} \right)^2 = \left( \frac{2.58(0.8)}{0.25} \right)^2 \approx 68.16$

Rounding Sample Size
• Any fractional value of n is always rounded to the next higher whole number.

Minimum Sample Size
• $n \approx 68.16$
• Round to the next higher whole number.
• To be 99% confident that the estimate is within 0.25 hours of the true mean, the minimum sample size is 69.

Estimating $\mu$ When $\sigma$ is Unknown
• Apply the Student's t distribution.

Student's t Variable
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

The shape of the t distribution depends only on the sample size, n, if the basic variable x has a normal distribution. When using the t distribution, we will assume that the x distribution is normal.

Appendix Table 4 (Page A8) gives values of the variable t corresponding to the number of degrees of freedom (d.f.).

Degrees of Freedom
• d.f. = n - 1
• where n = sample size

The t Distribution has a Shape Similar to that of the Normal Distribution

Properties of a Student's t Distribution
• Symmetric about the mean 0.
• Depends on the degrees of freedom.
• Bell-shaped with thicker tails than the normal distribution.
• As the degrees of freedom increase, the t distribution approaches the standard normal distribution.
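The properties listed above can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the standard density formula for Student's t (not given in the slides), and `t_pdf` is a hypothetical helper name. It compares the t density far out in the tail with the standard normal density for several degrees of freedom.

```python
import math
from statistics import NormalDist


def t_pdf(t: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    if df < 1:
        raise ValueError("df must be at least 1")
    # Work with logarithms (lgamma) so large df does not overflow gamma().
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))


normal_tail = NormalDist().pdf(3.0)
for df in (2, 7, 30, 1000):
    # The tail density shrinks toward the normal value as df grows.
    print(f"d.f. = {df:4d}: t density at 3 = {t_pdf(3.0, df):.5f}")
print(f"standard normal density at 3 = {normal_tail:.5f}")
```

The printed densities at t = 3 decrease as d.f. increases and settle close to the normal value, which is the "thicker tails" and "approaches the standard normal" behavior described in the bullets.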
Appendix Table 4
• Gives various t values for different degrees of freedom

Using Table 4 to Find Critical Values $t_c$ for a c Confidence Level

If the required d.f. are not in the table:
• Use the closest d.f. that is smaller than the needed d.f.
• This results in a larger critical value $t_c$.
• The resulting confidence interval will be longer and have a probability slightly higher than c.

Using Table 4 to Find Critical Values of $t_c$
• Find the column in the table with the given c heading.
• Compute the number of degrees of freedom: d.f. = n - 1.
• Read down the column under the appropriate c value until we reach the row headed by the appropriate d.f.

Example: Find the critical value $t_c$ for a 95% confidence interval if n = 8.
• Find the column in the table with c heading 0.950.
• Compute the number of degrees of freedom: d.f. = n - 1 = 8 - 1 = 7.
• Read down the column under c = 0.950 until we reach the row headed by d.f. = 7.
• $t_c$ = 2.365

Finding Confidence Intervals for $\mu$ When $\sigma$ is Unknown
$\bar{x} - E < \mu < \bar{x} + E$
where
• $\bar{x}$ = sample mean
• $E = t_c \frac{s}{\sqrt{n}}$
• c = confidence level (0 < c < 1)
• $t_c$ = critical value for confidence level c and degrees of freedom d.f. = n - 1
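The interval formula above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own procedure: the function name `t_confidence_interval` is hypothetical, and the critical value $t_c$ is passed in as read from Appendix Table 4 because the Python standard library has no t-distribution quantile function. The numbers are the fish-weight figures used in this chapter (n = 8, $\bar{x}$ = 15.7, s = 2.3, $t_{0.90}$ = 1.895 for d.f. = 7).

```python
import math


def t_confidence_interval(x_bar: float, s: float, n: int, t_c: float):
    """Return (E, lower, upper) for x_bar - E < mu < x_bar + E,
    where E = t_c * s / sqrt(n) and t_c comes from a t table (d.f. = n - 1)."""
    if n < 2:
        raise ValueError("need at least two observations so that d.f. = n - 1 >= 1")
    margin = t_c * s / math.sqrt(n)
    return margin, x_bar - margin, x_bar + margin


# Fish-weight example: t_c = 1.895 is the table value for c = 0.90, d.f. = 7.
E, lower, upper = t_confidence_interval(x_bar=15.7, s=2.3, n=8, t_c=1.895)
print(f"E = {E:.2f}")
print(f"{lower:.2f} < mu < {upper:.2f}")
```

Keeping $t_c$ as an explicit argument mirrors the table-lookup workflow in the slides; a statistics package with a t quantile function could replace the lookup.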
Example: The mean weight of eight fish caught in a local lake is 15.7 ounces with a standard deviation of 2.3 ounces. Construct a 90% confidence interval for the mean weight of the population of fish in the lake.

Mean = 15.7 ounces, standard deviation = 2.3 ounces
• n = 8, so d.f. = n - 1 = 7
• For c = 0.90, Appendix Table 4 gives $t_{0.90}$ = 1.895.
• $E = t_c \frac{s}{\sqrt{n}} = 1.895 \cdot \frac{2.3}{\sqrt{8}} \approx 1.54$

The 90% confidence interval is:
$\bar{x} - E < \mu < \bar{x} + E$
$15.7 - 1.54 < \mu < 15.7 + 1.54$
$14.16 < \mu < 17.24$
• We are 90% sure that the true mean weight of the fish in the lake is between 14.16 and 17.24 ounces.

Review of the Binomial Distribution
• Completely determined by the number of trials (n) and the probability of success (p) in a single trial.
• q = 1 - p
• If np and nq are both > 5, the binomial distribution can be approximated by the normal distribution.

A Point Estimate for p, the Population Proportion of Successes
$\hat{p} = \frac{r}{n}$ (read as "p hat"), where r is the number of successes in n trials

Point Estimate for q (Population Proportion of Failures)
$\hat{q} = 1 - \hat{p}$ (read as "q hat")

Example: For a sample of 500 airplane departures, 370 departed on time. Use this information to estimate the probability that an airplane from the entire population departs on time.
$\hat{p} = \frac{r}{n} = \frac{370}{500} = 0.74$
We estimate that there is a 74% chance that any given flight will depart on time.

Margin of Error for $\hat{p}$ as a Point Estimate for p
$|\hat{p} - p|$
Maximal Margin of Error
• the maximal error of estimate E for a confidence interval:
$E = z_c \sqrt{\frac{pq}{n}}$

Confidence Interval for p for Large Samples (np and nq > 5)
$\hat{p} - E < p < \hat{p} + E$
where
• $\hat{p} = \frac{r}{n}$
• $E = z_c \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
• c = confidence level
• $z_c$ = critical value for confidence level c taken from a normal distribution

Example: For a sample of 500 airplane departures, 370 departed on time. Find a 99% confidence interval for the proportion of airplanes that depart on time. Is the use of the normal distribution justified?
• n = 500 and $\hat{p}$ = 0.74
• $n\hat{p}$ = 370 and $n\hat{q}$ = 130
• $n\hat{p}$ and $n\hat{q}$ are both > 5, so the use of the normal distribution is justified.
• $E = 2.58 \sqrt{\frac{0.74(0.26)}{500}} \approx 0.0506$

99% confidence interval for the proportion of airplanes that depart on time:
$\hat{p} - E < p < \hat{p} + E$
$0.74 - 0.0506 < p < 0.74 + 0.0506$
$0.6894 < p < 0.7906$
We are 99% confident that between 69% and 79% of the planes depart on time.
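The airplane calculation above can be reproduced with a short sketch. The function name `proportion_confidence_interval` is a hypothetical helper, the critical value is supplied by the caller (2.58 is the rounded table value used in the slides), and the check on $n\hat{p}$ and $n\hat{q}$ follows the "both > 5" rule stated in this chapter.

```python
import math


def proportion_confidence_interval(r: int, n: int, z_c: float):
    """Return (p_hat, E, lower, upper) for p_hat - E < p < p_hat + E,
    with p_hat = r / n and E = z_c * sqrt(p_hat * (1 - p_hat) / n)."""
    if n <= 0 or not 0 <= r <= n:
        raise ValueError("need n > 0 and 0 <= r <= n")
    p_hat = r / n
    q_hat = 1 - p_hat
    # Large-sample condition from the slides: n*p_hat and n*q_hat both > 5.
    if n * p_hat <= 5 or n * q_hat <= 5:
        raise ValueError("normal approximation not justified for this sample")
    margin = z_c * math.sqrt(p_hat * q_hat / n)
    return p_hat, margin, p_hat - margin, p_hat + margin


# 370 on-time departures out of 500, 99% confidence (z_c = 2.58 from the table).
p_hat, E, lower, upper = proportion_confidence_interval(r=370, n=500, z_c=2.58)
print(f"p_hat = {p_hat:.2f}, E = {E:.4f}")
print(f"{lower:.4f} < p < {upper:.4f}")
```

Raising an error when the large-sample condition fails is one possible design choice; returning a flag instead would also work.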
The point estimate and the confidence interval do not depend on the size of the population. The sample size, however, does affect the accuracy of the statistical estimate.

Interpretation of Poll Results
The proportion responding in a certain way is $\hat{p}$, the sample estimate of the population proportion.

A 95% confidence interval for the population proportion p is:
$\hat{p} - \text{margin of error} < p < \hat{p} + \text{margin of error}$
where $\hat{p}$ is the proportion reported by the poll.

Interpret the following poll results:
"A recent survey of 400 households indicated that 84% of the households surveyed preferred a new breakfast cereal to their previous brand. Chances are 19 out of 20 that if all households had been surveyed, the results would differ by no more than 3.5 percentage points in either direction."

"Chances are 19 out of 20 ..."
• 19/20 = 0.95
• A 95% confidence interval is being used.

"... 84% of the households surveyed preferred ..."
• 84% represents the percentage of surveyed households who preferred the new cereal.
• 84% represents $\hat{p}$.

"... the results would differ by no more than 3.5 percentage points in either direction."
• 3.5% represents the margin of error, E.
• The confidence interval is:
84% - 3.5% < p < 84% + 3.5%
80.5% < p < 87.5%

The poll indicates (with 95% confidence):
• between 80.5% and 87.5% of the population prefer the new cereal.

Sample Size for Estimating p for the Binomial Distribution
Formula for Minimum Sample Size for Estimating p for the Binomial Distribution
If p is a preliminary estimate of the true population proportion,
$n = p(1 - p) \left( \frac{z_c}{E} \right)^2$

If we have no preliminary estimate for p, the probability is at least c that the point estimate r/n for p will be in error by less than the quantity E if n is at least:
$n = \frac{1}{4} \left( \frac{z_c}{E} \right)^2$

Example: The manager of a furniture store wishes to estimate the proportion of orders delivered by the manufacturer in less than three weeks. She wishes to be 95% sure that her point estimate is in error either way by less than 0.05. Assume no preliminary study is done to estimate p.

"She wishes to be 95% sure ..."
• $z_{0.95}$ = 1.96

"... that her point estimate is in error either way by less than 0.05."
• E = 0.05

"... no preliminary study is done to estimate p."
$n = \frac{1}{4} \left( \frac{z_c}{E} \right)^2 = \frac{1}{4} \left( \frac{1.96}{0.05} \right)^2 = 384.16$
The minimum required sample size (if no preliminary study is done to estimate p) is 385.

If a preliminary estimate of p indicated that p was approximately equal to 0.75:
$n = pq \left( \frac{z_c}{E} \right)^2 = 0.75(0.25) \left( \frac{1.96}{0.05} \right)^2 = 288.12$
The minimum required sample size (if this preliminary study is done to estimate p) is 289.
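Both sample-size formulas can be combined in one small function. This is a minimal sketch, assuming a hypothetical helper named `sample_size_for_proportion`; when no preliminary estimate is supplied it uses the factor 1/4 from the formula above, and it always rounds up to the next whole number.

```python
import math
from typing import Optional


def sample_size_for_proportion(z_c: float, E: float,
                               p: Optional[float] = None) -> int:
    """Minimum n so that the point estimate r/n is within E of p
    at the confidence level that corresponds to z_c."""
    if E <= 0:
        raise ValueError("margin of error E must be positive")
    if p is None:
        # No preliminary estimate: use the 1/4 factor from the slides.
        pq = 0.25
    else:
        if not 0.0 < p < 1.0:
            raise ValueError("preliminary estimate p must be between 0 and 1")
        pq = p * (1.0 - p)
    # Fractional sample sizes are rounded up to the next whole number.
    return math.ceil(pq * (z_c / E) ** 2)


# Furniture store example: 95% confidence (z_c = 1.96), E = 0.05.
print(sample_size_for_proportion(1.96, 0.05))          # no preliminary estimate
print(sample_size_for_proportion(1.96, 0.05, p=0.75))  # preliminary estimate 0.75
```

The two calls correspond to the two cases worked above and give the same minimum sample sizes, 385 and 289.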