Elementary Statistics
Triola, Elementary Statistics 11/e
Unit 14 The Confidence Interval for Means, σ Unknown

Estimation
Unit 14 The Confidence Interval for Means, σ Unknown (Section 7-4)

We are now ready to begin our exploration of how we make estimates of the population mean. Before we get started, I want to emphasize the importance of having collected a representative sample, i.e. one that is a simple random sample. Without that, our estimates are useless.

The best estimate of the mean that is available to us is $\bar{x}$, the mean of our sample. However, we do not expect $\bar{x}$ to equal $\mu$ exactly. This single estimate, while a good start, is therefore of limited use on its own because we do not know how far from $\mu$ it might be. What we need is a Lower Bound and an Upper Bound such that we can have some confidence that $\mu$ falls between these two limits.

The first thing we need to quantify is the word “confidence”. As an example, let’s say that we want to be 95% confident, and we now explore what that means, other than “pretty darn sure”. First note that we are working with averages, $\bar{x}$ and $\mu$. That means that the probability distribution we will be working with is the sampling distribution of the mean, whose mean is $\mu_{\bar{x}}$ and whose standard deviation is $\sigma_{\bar{x}}$. According to the Central Limit Theorem,

$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$,

and so we will be working with those latter values.

Now picture the sampling distribution with $\mu$ at its center. All possible $\bar{x}$ are in the sampling distribution somewhere, and so if we find a value $E$ such that the interval $\mu \pm E$, which is centered on $\mu$, captures 95% of the area under the curve, it will also capture 95% of all possible $\bar{x}$. Unfortunately, I don’t have any good pictures to show this, so I’ll lecture on this in class.

One last piece: if 95% of the $\bar{x}$ lie within $E$ of $\mu$, then $\mu$ must lie within $E$ of 95% of the $\bar{x}$. To put this another way, 95% of the time we take a sample, we’re going to get an $\bar{x}$ such that $\mu$ lies in $\bar{x} \pm E$.
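The coverage argument can be checked with a small simulation. The following is a minimal sketch, assuming a normal population whose $\mu$ and $\sigma$ are known (which is exactly what we do not have in practice). The function name `coverage_rate` and the values 50, 10, and 25 are hypothetical choices for illustration and are not part of this unit.

```python
import random
from statistics import NormalDist, mean


def coverage_rate(mu, sigma, n, trials=2000, level=0.95, seed=1):
    """Fraction of simulated samples whose interval x_bar +/- E contains mu."""
    rng = random.Random(seed)
    # Standard deviation of the sampling distribution of the mean.
    sigma_xbar = sigma / n ** 0.5
    # E is the half-width such that mu +/- E captures `level` of that distribution.
    upper_tail = 1 - (1 - level) / 2  # 0.975 for a 95% level
    margin = NormalDist(mu, sigma_xbar).inv_cdf(upper_tail) - mu
    hits = 0
    for _ in range(trials):
        # Draw one simple random sample of size n (assumed normal population).
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Hypothetical population: mu = 50, sigma = 10, samples of size 25.
    print(coverage_rate(mu=50.0, sigma=10.0, n=25))
```

Each pass through the loop builds one interval of the form $\bar{x} \pm E$, and the returned fraction should land close to 0.95.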
The interval $\bar{x} \pm E$ is therefore our 95% confidence interval, and $E$ is called the margin of error.

Now all we have to do is find $E$. Unfortunately, this is easier said than done. Let’s go back to our sampling distribution. If $\mu \pm E$ encompasses 95% of the area under the normal curve, then $E$ must be 1.96 standard deviation units from $\mu$. Under any normal curve, the centered 95% of the area is bounded by 1.96 standard deviations on either side of the mean. Hence,

$E = 1.96\,\sigma_{\bar{x}} = 1.96\,\frac{\sigma}{\sqrt{n}}$.

We’re done, right? Well, not exactly. Remember, we set out to estimate $\mu$ because we didn’t know it. So why would we know $\sigma$ if we don’t know $\mu$? Fortunately, for very large samples, say size 100 or larger, we can use $s$, the sample standard deviation, in place of $\sigma$. Therefore, we can have

$E = 1.96\,\frac{s}{\sqrt{n}}$,

but what do we do about smaller samples, say size 20 or 30?

This problem wasn’t solved until around the turn of the 20th century, when William Gosset, working for the Guinness brewery, worked out a probability distribution that could be used to perform quality control tests using small samples. He published it under the pseudonym “Student”, and it is known as the Student t distribution. This distribution is very similar in shape to the Normal distribution, except that it is wider, i.e. it has a larger standard deviation. Furthermore, the size of the standard deviation depends on the sample size: the smaller the sample, the larger the standard deviation. Take a look at the following figure, which shows different shapes for the t-distribution as a function of sample size and compares them to the Normal distribution.

[Figure: t-distribution curves for several sample sizes, compared with the Normal distribution]

Here are a few more rules for working with the t-distribution. If you know for a fact, or you strongly suspect (because you carefully examined the histogram of your sample), that the underlying population is normally distributed, then the sample size is not that important other than its effect on the shape of the t-curve.
However, if you suspect that the underlying population is not all that normally shaped, then your sample size should be at least 30.

Nomenclature

When we were working with the Standard Normal distribution, we called the horizontal axis the z-axis. The z value that bounds the centered 95% area on the right was called a critical value and denoted $z_{\alpha/2}$. $\alpha$ is called the significance and is the sum of the areas of the two tails, i.e. the area under the curve to the left and right of the centered 95% area. Hence, in this case $\alpha = 0.05$. See the figure below.

[Figure: standard normal curve showing the centered 95% area and the two tails]

The critical value corresponding to a centered 95% area is $z_{\alpha/2} = 1.96$. (1.96 = NORM.S.INV(.975))

We have a completely analogous situation when it comes to using the t-distribution. The axis is called the t-axis, the critical value is $t_{\alpha/2}$, and

$E = t_{\alpha/2}\,\frac{s}{\sqrt{n}}$.

In order to find $t_{\alpha/2}$ we need to know two things: the confidence level and the size of the sample. There is one more twist: the degrees of freedom. First, we work with $\alpha$, the significance, which is one minus the confidence level expressed as a decimal. For a confidence level of 95% we have

$\alpha = 1.0 - 0.95 = 0.05$

We also need the degrees of freedom, which is simply the sample size minus one,

Deg_freedom = $n - 1$

Worked Example

We receive a batch of 50,000 washers, and we wish to estimate the average inside diameter of the washers. We carefully select a simple random sample of size 20 and find that the average inside diameter is 24.78 mm with a standard deviation of 1.62 mm. We want to calculate a 95% confidence interval for our estimate of the batch $\mu$.

First we calculate $t_{\alpha/2}$ using T.INV.2T, with a Probability of 0.05 and 19 degrees of freedom. Note that the tool uses the word Probability instead of $\alpha$ or Significance.
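Outside Excel, the same critical value can be approximated numerically. The sketch below is not how T.INV.2T is implemented. It is an illustration, using only the Python standard library, that integrates the t density with Simpson's rule and then searches for the critical value by bisection. The function names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical, and the search bracket of 0 to 100 is an assumption that suits common confidence levels.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 (left half, by symmetry) plus Simpson's rule on [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, n):
    """Approximate two-tailed critical value t_{alpha/2} for a sample of size n."""
    alpha = 1.0 - confidence   # significance
    df = n - 1                 # degrees of freedom
    target = 1.0 - alpha / 2   # area to the left of the critical value
    lo, hi = 0.0, 100.0        # assumed bracket; widen for extreme levels with tiny df
    for _ in range(60):        # bisection on the cumulative area
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # Washer example: 95% confidence, sample of size 20.
    print(t_critical(0.95, 20))
```

Calling `t_critical(0.95, 20)` uses $\alpha = 0.05$ and 19 degrees of freedom, the same inputs given to T.INV.2T in the worked example.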
We see that $t_{\alpha/2} = 2.0930$, and we proceed to calculate $E$,

$E = t_{\alpha/2}\,\frac{s}{\sqrt{n}} = \frac{(2.0930)(1.62)}{\sqrt{20}} = 0.7582$

Finally, the confidence interval is

$\bar{x} \pm E = 24.78 \pm 0.76 = (24.02, 25.54)$

Below is the Excel spreadsheet that I used to calculate these values. If you double click on the table, you will bring up a copy of Excel. Then if you select any of the cells, such as the value for t, you will see the Excel formula in the formula bar, fx, toward the top.

x       s      n    t          E        x-E        x+E
24.78   1.62   20   2.093024   0.7582   24.02182   25.53818

One last note. Suppose that the manufacturer of the washers had claimed that the average inside diameter was 25.00 mm. On the basis of this sample, could you refute the claim? You could not, because 25.00 lies within (24.02, 25.54). Hence, there’s no reason to doubt the manufacturer’s claim.

Now it’s your turn to have some fun. You’ll want to open Excel 2010 and label some cells as I did. Here’s the situation. Assume the population is normally distributed. For a sample of size 61, the average weight loss was 4.0 kg with a standard deviation of 6.4 kg. Find a 99% confidence interval for the mean of the population. Use Excel in exactly the same way I did. See the answer at the end of this unit.

This is the end of Unit 14. In class, you will get more practice with these concepts by working exercises in MyMathLab.

Answers

x    s     n    t          E        x-E      x+E
4    6.4   61   2.660283   2.1799   1.8201   6.1799
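The spreadsheet arithmetic can also be written as a short Python sketch. This assumes the critical values are taken as given from the tables in this unit (2.093024 and 2.660283) rather than computed. The function name `t_interval` is hypothetical.

```python
import math


def t_interval(x_bar, s, n, t_crit):
    """Return (E, lower, upper) for the interval x_bar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)  # margin of error E
    return margin, x_bar - margin, x_bar + margin


# Washer example: n = 20, 95% confidence, t taken from the spreadsheet.
E, lower, upper = t_interval(24.78, 1.62, 20, 2.093024)
print(round(E, 4), round(lower, 2), round(upper, 2))

# The manufacturer's claim of 25.00 mm cannot be refuted if it lies inside the interval.
print(lower <= 25.00 <= upper)

# Weight-loss exercise: n = 61, 99% confidence, t taken from the Answers table.
E2, lower2, upper2 = t_interval(4.0, 6.4, 61, 2.660283)
print(round(E2, 4), round(lower2, 4), round(upper2, 4))
```

The printed values should agree with the E, x-E, and x+E columns of the two tables in this unit, up to rounding.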
