Elementary Statistics
Triola, Elementary Statistics 11/e
Unit 19 Introduction to Confidence Intervals
We are now ready to begin our exploration of how we make estimates of the population mean. Before
we get started, I want to emphasize the importance of having collected a representative sample, i.e. one
that is a simple random sample. Without that, our estimates are useless.
The best estimate of the population mean that is available to us is $\bar{x}$, the mean of our sample. However, we do not
expect $\bar{x}$ to equal $\mu$ exactly, so this single estimate, while a good start, is of limited use on its own, because we
do not know how far off we might be from $\mu$. What we need is a Lower Bound and an Upper Bound, together with
some level of confidence that $\mu$ falls between these two limits. That is,
we would like to find some value $E$ such that we are, say, 95% confident that, given the average $\bar{x}$ of any
sample, $\mu$ lies somewhere between $\bar{x} - E$ and $\bar{x} + E$. In other words, $\bar{x} - E$ is our Lower Bound and
$\bar{x} + E$ is our Upper Bound for estimating $\mu$. We would like to be able to say that we are 95% confident
(or 99% or whatever percent we want) that the actual value $\mu$ lies between this lower and upper bound.
However, even this definition is a bit vague, because what do we mean by "confident"?
To sharpen things up a bit, let's consider the sampling distribution of the means. This distribution
consists of many values $\bar{x}_i$, one for each sample we could possibly take from the population. We do not
expect the $\bar{x}_i$'s to equal each other, but since the distribution is normally shaped, most of them
will cluster around $\mu$, the mean of the population. To say that we want to be 95% confident means
that we want to find a value for $E$ such that 95% of the $\bar{x}_i$'s fall within $\pm E$ of $\mu$.
[Figure: axis marked at $\mu - E$, $\mu$, and $\mu + E$]
If 95% of the $\bar{x}_i$'s fall within $\pm E$ of $\mu$, then $\mu$ must fall within $\pm E$ of each of those $\bar{x}_i$'s. Imagine all
the $\bar{x}_i$'s, each with its own interval $\bar{x}_i \pm E$, and $\mu$ falling within 95% of those intervals. We would get a
picture that looks like,
Copyright © RHarrow 2013
Each of the green bars is an interval, $\bar{x}_i \pm E$, that includes the mean of the population using a 95%
confidence level. The red bar, one out of twenty, is an interval, $\bar{x}_i \pm E$, that does not include the
population mean.
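The green-bar and red-bar picture can be imitated with a small simulation. The sketch below is only an illustration: the population values `MU`, `SIGMA`, the sample size `N`, and the helper name `coverage` are all hypothetical, the population is assumed to be normal, and the margin of error uses the formula this unit derives further on.

```python
import math
import random
from statistics import NormalDist, fmean

# Hypothetical population and sample size, chosen only for illustration.
MU, SIGMA, N = 50.0, 10.0, 30

def coverage(trials=4000, confidence=0.95, seed=1):
    """Return the fraction of simulated intervals x_bar +/- E that contain MU."""
    rng = random.Random(seed)                             # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # about 1.96 for 95%
    e = z * SIGMA / math.sqrt(N)                          # margin of error
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
        x_bar = fmean(sample)
        if x_bar - e <= MU <= x_bar + e:                  # a "green bar"
            hits += 1
    return hits / trials

# Prints a fraction close to 0.95; the remaining intervals are "red bars".
print(f"{coverage():.3f}")
```

Roughly one interval in twenty should miss the population mean, which matches the single red bar described above.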
Finally, since 95% of all the possible $\bar{x}_i$'s with their intervals, $\bar{x}_i \pm E$, include $\mu$, we can be 95% confident
that the interval from any one sample has this property too, i.e.
$$\bar{x}_i - E \le \mu \le \bar{x}_i + E$$
Now all we have to do is to find a value for E, which we call the margin of error.
First, note that we are working with averages, $\bar{x}$ and $\mu$. That means that the probability distribution we
will be working with is the sampling distribution of the mean. The mean of this distribution is $\mu_{\bar{x}}$ and the
standard deviation is $\sigma_{\bar{x}}$. According to the Central Limit Theorem, $\mu_{\bar{x}} = \mu$ and
$\sigma_{\bar{x}} = \sigma/\sqrt{n}$, and so we will be working with these values.
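These two facts about the sampling distribution can be checked numerically. The following is a rough sketch, not part of the original unit: it draws many samples from a made-up normal population (`MU`, `SIGMA`, `N`, and `sample_means` are hypothetical names and values) and compares the mean and standard deviation of the sample means with $\mu$ and $\sigma/\sqrt{n}$.

```python
import math
import random
from statistics import fmean, stdev

# Hypothetical population parameters and sample size.
MU, SIGMA, N = 50.0, 10.0, 25

def sample_means(count=5000, seed=2):
    """Draw `count` samples of size N and return the list of their means."""
    rng = random.Random(seed)
    return [fmean([rng.gauss(MU, SIGMA) for _ in range(N)]) for _ in range(count)]

means = sample_means()
print(round(fmean(means), 2))    # should land near MU = 50
print(round(stdev(means), 2))    # should land near SIGMA / sqrt(N)
print(SIGMA / math.sqrt(N))      # 2.0
```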
Now picture the sampling distribution with $\mu$ at its center. All possible $\bar{x}$'s are in the sampling
distribution somewhere, and so if we find a value $E$ such that the interval $\mu \pm E$, which is centered on
$\mu$, captures 95% of the area under the curve, it will also capture 95% of all the possible $\bar{x}$'s. Take a look
at the chart below. It is a chart of the Standard Normal Curve, and hence its center is 0.
There's a lot going on here, so let's take things one step at a time. The area of 0.95 is centered under
the curve. The critical value, $z_{\alpha/2}$, is the boundary between the centered 0.95 area and the "red zone"
to the right of it. Since we are looking at the graph of a Standard Normal Distribution, that value of
$z_{\alpha/2}$ equals 1.96. $\alpha$ is called the significance level, and in this case it simply equals $1.0 - 0.95 = 0.05$
because we are using a 95% confidence level. Hence, in this case, $\alpha/2 = 0.025$. In other words, the
area of each red zone is 0.025 and together they sum to 0.05.
How did we find $z_{\alpha/2} = 1.96$? Look at the chart above. We want the value of z such that the area to
its right, the red zone, is 0.025. Hence the total area to the left of $z_{\alpha/2}$ is 0.975, and NORM.S.INV(0.975) =
1.96.
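For readers working outside Excel, the same lookup can be sketched in Python. This is an assumption about tooling rather than something the unit itself uses: `statistics.NormalDist().inv_cdf` from the standard library plays the role of NORM.S.INV here.

```python
from statistics import NormalDist

std_normal = NormalDist()          # Standard Normal: mean 0, standard deviation 1
z = std_normal.inv_cdf(0.975)      # z value with area 0.975 to its left
print(round(z, 2))                 # 1.96
```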
I know that this is a lot to take in, so you may want to re-read the above few paragraphs. Then, let's find
the value of $z_{\alpha/2}$ for an 80% confidence level. Find $\alpha = 1 - 0.80$ and then divide that by two. Finally,
find NORM.S.INV($1.0 - \alpha/2$).
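The three steps just described (find $\alpha$, halve it, look up the inverse) can be wrapped in one small function. This is a sketch under the assumption that Python's `statistics.NormalDist` stands in for NORM.S.INV; the name `critical_z` is hypothetical, and the examples deliberately use confidence levels other than 80% so the question below is left for you.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Return z_(alpha/2) for a confidence level given as a fraction, e.g. 0.95."""
    alpha = 1.0 - confidence                       # total area in the two tails
    return NormalDist().inv_cdf(1.0 - alpha / 2)   # area to the left of z_(alpha/2)

print(round(critical_z(0.90), 3))   # 1.645
print(round(critical_z(0.99), 3))   # 2.576
```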
Question #1
For an 80% confidence interval, what is the value of $z_{\alpha/2}$?
Another way to think about $z_{\alpha/2}$ is as a number of standard deviation units for the Standard Normal
Distribution. For example, a value of $z_{\alpha/2} = 1.96$ means that the range from 1.96 standard deviation units below
to 1.96 above the mean of 0 covers 95% of the area under the Standard Normal Curve. See Figure 7.3 above.
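The claim can be checked in the other direction, starting from 1.96 and recovering the area. A minimal sketch, again assuming Python's standard-library `NormalDist` as the calculator:

```python
from statistics import NormalDist

std_normal = NormalDist()   # Standard Normal: mean 0, standard deviation 1
# Area between -1.96 and +1.96 = (area left of 1.96) - (area left of -1.96)
area = std_normal.cdf(1.96) - std_normal.cdf(-1.96)
print(round(area, 4))       # 0.95
```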
Recall the formula for translating from the real world to the "z-world", i.e. the axis of the Standard
Normal Distribution,
$$z = \frac{x - \mu}{\sigma}$$
If we let $z = z_{\alpha/2}$, replace $x$ with $\bar{x}$, and replace $\sigma$ with $\sigma/\sqrt{n}$ since we are working with the
sampling distribution (the Central Limit Theorem says that the standard deviation of the sampling
distribution is $\sigma/\sqrt{n}$), then for a sample mean $\bar{x}$ sitting exactly at the edge of the central area we get the
following result,
$$z_{\alpha/2} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
and then from this we get,
$$z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} = \bar{x} - \mu = E$$
Therefore,
$$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
To find $E$, the margin of error, all we have to do is multiply $z_{\alpha/2}$ by $\sigma/\sqrt{n}$. There's just one problem with
this terrific plan. Remember, we are trying to get an estimate for $\mu$, the mean of the population.
However, if we don't know $\mu$, why would we know $\sigma$, the standard deviation of the population? We will
resolve this dilemma in the next unit.
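Setting that dilemma aside for a moment and pretending $\sigma$ is known, the margin-of-error formula is easy to try out. The numbers below (`x_bar`, `sigma`, `n`) are invented purely for illustration, and `margin_of_error` is a hypothetical helper name, not something from the textbook.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """E = z_(alpha/2) * sigma / sqrt(n), assuming sigma is known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2)
    return z * sigma / math.sqrt(n)

# Hypothetical sample: mean 100 from n = 36 values, population sigma assumed to be 15.
x_bar, sigma, n = 100.0, 15.0, 36
e = margin_of_error(sigma, n)
print(round(e, 2))                                # 4.9
print(round(x_bar - e, 2), round(x_bar + e, 2))   # 95.1 104.9
```

With these made-up values, the Lower Bound and Upper Bound work out to roughly 95.1 and 104.9.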
This is the end of Unit 19.
Now turn to your homework in MyMathLab to get
more practice with these concepts.