SP17 Lecture Notes 6 - Confidence Interval for a Population Mean
Lecture notes 6 – confidence interval for a population mean

Highlights:
• Confidence interval outline
• The point estimate
• The margin of error
• Interpretation of a confidence interval
• Worked example

Confidence Intervals

We have discussed the idea of using a point estimate (a statistic calculated from observed data) to estimate the value of a parameter (which is unknown). For instance, we may use a sample mean (a point estimate) to estimate the value of a population mean (a parameter).

One issue we face is that our estimates are nearly always wrong. They are wrong not because we make mistakes (though this is possible), but because they are based on limited, random samples. A sample mean, for instance, is random insofar as the sample itself is random. If a new sample is taken, a different mean will be calculated.

So how do we deal with the fact that we are nearly always wrong? By trying to determine how wrong we may be. Perhaps we believe that our estimate is probably only off by a little. Perhaps we believe that our estimate is likely to be off by quite a bit. We quantify this by calculating a margin of error, which we add to and subtract from our point estimate in order to form a confidence interval (CI):

CI = point estimate ± margin of error

The point estimate

The point estimate is our “best guess” for the value of an unknown parameter. It is a sample statistic that we calculate from our data. For instance, the sample mean $\bar{x}$ is a point estimate for the population mean $\mu$.

It is always important to keep in mind that, while a point estimate is our best guess, there will always be some “error” in this guess. It is for this reason that we find margins of error to go along with point estimates.

The margin of error

The margin of error is the maximum amount by which we believe our point estimate may be in error (i.e. by which it may differ from the true unknown parameter value), at a desired “level of confidence” or “confidence level”.
The margin of error is found by multiplying a critical value by a standard error. We then add this to and subtract it from the point estimate. Visually, this process is akin to drawing a t-distribution centered on the point estimate, and then chopping off the tails.

Interpretation of a CI

When we build a confidence interval, we expect that it will capture the unknown parameter value. You can think of the CI as a range of plausible values for the unknown parameter, at some level of confidence.

The level of confidence is the percent chance that a CI constructed from a random sample will capture the true unknown parameter value. So, if we make a 95% CI for a parameter, then we know that, 95% of the time, these kinds of CIs will capture the parameter. However, we never get to know whether our interval really did capture it this time.

Example

In many animal species, individuals communicate through UV signals visible to one another but invisible to humans. A scientist interested in the role of UV colors in the Lissotriton vulgaris newt conducted a study that measured the difference in the length of time a female of the species spent near males with and without UV present. A positive measurement indicates that the female spent more time near the male with UV present, and a negative measurement indicates that she spent less time.

The average measurement from 23 trials is 50.7, with a standard deviation of 87.3. Construct a 99% CI for the true average time difference given by all newts in this scenario.
Formula for a confidence interval

All of the confidence intervals we will see follow this general formula:

CI for unknown parameter = point estimate ± margin of error

The margin of error (ME) follows the general formula:

ME = (critical value) * (standard error of point estimate)

So, we have the general formula:

CI = point estimate ± (critical value) * (standard error)

To apply this general formula to the case of constructing a confidence interval for a population mean, note that the point estimate is the sample mean, the critical value comes from the t-distribution, and the standard error of the sample mean is the sample standard deviation divided by the square root of the sample size. And so we have:

$$\text{CI for } \mu = \bar{x} \pm t \cdot \frac{s}{\sqrt{n}}$$

where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, and $n$ is the sample size. Note that “t” is a critical value, which is found on your t-table under the column for a 99% CI, in the row for $n - 1 = 22$ degrees of freedom.

Constructing the CI

Using this formula, we can construct the CI in the space below:

Interpreting the CI

Once the CI is constructed, we need to give an English interpretation. The general form of some acceptable interpretations is:

“The range of plausible values for ___________, at ____% confidence, is _______ to _______.”

“We are _____% confident that the true ____________ lies between ________ and ________.”

“Under repeated sampling, _____% of similarly constructed intervals will contain the true __________.”

Note that we do not say “There is a ____% chance that the true ________ lies between _______ and _______.” This is because the value of the parameter is fixed (but unknown). Once we have constructed a CI, the parameter either is or is not contained in this interval. We just don’t get to know which.

The key here is that the CI is the thing that is random, not the parameter. We can talk about the probability that a random CI will contain the parameter, but not the probability that a specific CI contains the parameter; it either does or does not.
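The claim that the interval, not the parameter, is random can be illustrated with a small simulation. The sketch below is not part of the original notes: it assumes a hypothetical normal population with a mean of 50 and a standard deviation of 87.3, draws many samples of size 23, and counts how often a 95% t-interval captures the true mean. The critical value 2.074 is the standard t-table entry for a 95% CI with 22 degrees of freedom; the function names `t_interval` and `coverage` are made up for illustration.

```python
import random
import statistics


def t_interval(sample, t_crit):
    """Return (lower, upper) for sample mean ± t_crit * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    standard_error = statistics.stdev(sample) / n ** 0.5
    margin = t_crit * standard_error
    return mean - margin, mean + margin


def coverage(true_mean, true_sd, n, t_crit, trials, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Hypothetical population: normal with the given mean and sd.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        lower, upper = t_interval(sample, t_crit)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / trials


# Assumed values: 2.074 is the t-table entry for 95% confidence, 22 df.
rate = coverage(true_mean=50.0, true_sd=87.3, n=23, t_crit=2.074, trials=5000)
print(f"Fraction of intervals capturing the true mean: {rate:.3f}")
```

Each simulated interval either contains the true mean or does not. The confidence level describes the long-run fraction that do, which should come out close to 0.95 here.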
Give an interpretation of the CI we constructed in the space below:
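As a check on the worked example, here is a minimal sketch of the computation using the numbers given in the problem (23 trials, mean 50.7, standard deviation 87.3). The critical value 2.819 is an assumption read from a standard t-table (99% CI column, 22 degrees of freedom), and the helper name `mean_ci` is hypothetical.

```python
import math


def mean_ci(sample_mean, sample_sd, n, t_crit):
    """CI for a population mean: sample_mean ± t_crit * sample_sd / sqrt(n)."""
    standard_error = sample_sd / math.sqrt(n)
    margin = t_crit * standard_error
    return sample_mean - margin, sample_mean + margin, standard_error, margin


# Values from the newt example in these notes.
n = 23
sample_mean = 50.7
sample_sd = 87.3
# Assumption: t-table value for a 99% CI with n - 1 = 22 degrees of freedom.
t_crit = 2.819

lower, upper, standard_error, margin = mean_ci(sample_mean, sample_sd, n, t_crit)
print(f"standard error:  {standard_error:.2f}")
print(f"margin of error: {margin:.2f}")
print(f"99% CI: ({lower:.1f}, {upper:.1f})")
```

With these inputs the standard error comes out near 18.2 and the margin of error near 51.3, so the interval runs from roughly -0.6 to 102.0. One acceptable interpretation would be: “We are 99% confident that the true average time difference for all newts in this scenario lies between about -0.6 and 102.0.”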