Chapter Ten: Introduction to inference

Section 10.1
Confidence Intervals
The Basics
Confidence Intervals!
A level C confidence interval for a parameter is an
interval computed from sample data by a method that
has probability C of producing an interval containing
the true value of the parameter.
A confidence interval is found by the following
formula:
estimate ± margin of error
The margin of error shows how accurate we believe
our guess is, based on the variability of the estimate.
Confidence Intervals!
Any confidence interval has two
parts:
– An interval computed from the data
– A confidence level giving the
probability that the method produces
an interval that covers the parameter.
Critical Values!
The number z* with
probability p lying to
its right under the
standard normal
curve is called the
upper p critical
value of the
standard normal
distribution.
Make sure you know how to read the table of critical values.
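If no table is handy, the critical value can also be computed. The sketch below is only an illustration: it assumes Python's standard-library NormalDist, and the helper name z_star is made up for this example. For a level C interval, the area left over is 1 - C, split evenly between the two tails, so z* is the upper (1 - C)/2 critical value.

```python
from statistics import NormalDist

def z_star(confidence_level: float) -> float:
    """Critical value z* for a level C interval (hypothetical helper).

    The area outside the interval is 1 - C, so each tail holds (1 - C)/2.
    """
    upper_tail = (1 - confidence_level) / 2
    # inv_cdf takes the area to the LEFT, so pass 1 - upper_tail.
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.90, 0.95, 0.99):
    print(f"C = {level:.2f}  z* = {z_star(level):.3f}")
# C = 0.90  z* = 1.645
# C = 0.95  z* = 1.960
# C = 0.99  z* = 2.576
```

These match the values usually read from the bottom row of a critical-value table.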
Conditions for Inference about a
Population Mean
1. The data is an SRS from the population.
2. Observations from the population have a
normal distribution with an unknown mean
(μ) and standard deviation (σ).
3. Independence is assumed for the individual
observations when calculating a confidence
interval. When we are sampling without
replacement from a finite population, it is
sufficient to verify that the population is at
least 10 times the sample size.
CAUTION
Be sure to check that the
conditions for constructing a
confidence interval for the
population mean are satisfied
before you perform any
calculations.
Here’s how to find the level C
confidence interval!
Any normal curve has probability C between the point z*
standard deviations below its mean and the point z*
standard deviations above its mean.
The standard deviation of the sampling distribution of
x̄ is σ/√n and its mean is the population mean μ. So
there is probability C that the observed sample mean
x̄ is between
μ - z*(σ/√n) and μ + z*(σ/√n)
Whenever this happens, the population mean μ is
contained between
x̄ - z*(σ/√n) and x̄ + z*(σ/√n)
This is the confidence interval. The estimate of the
unknown μ is x̄, and the margin of error is z*(σ/√n).
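The interval x̄ ± z*(σ/√n) can be sketched in a few lines of code. This is a minimal illustration, not the calculator's procedure: the function name z_interval and the numbers in the usage line are hypothetical, and σ is assumed to be known.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence_level):
    """Level C interval for mu when sigma is known (hypothetical helper)."""
    # z* leaves (1 - C)/2 in each tail of the standard normal curve.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)  # margin of error z*(sigma/sqrt(n))
    return sample_mean - margin, sample_mean + margin

# Made-up numbers, for illustration only.
low, high = z_interval(sample_mean=50.0, sigma=10.0, n=25, confidence_level=0.95)
print(f"({low:.2f}, {high:.2f})")  # (46.08, 53.92)
```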
Possible Statements About a
Confidence Interval:
“I am __% confident that the true (parameter) is between __
and __ because when I computed the endpoints of the interval
I followed a procedure that creates intervals containing the
true (parameter) about __% of the time it is followed.”
“This __% confidence interval is one result of a method that
had a __% chance of producing an interval capturing the true
(parameter).”
“This __% confidence interval is one observed result of a
process that, were it to be repeated over many random
samples, would produce intervals containing the population
parameter __% of the time.”
“I am 100% confident that the sampling procedure I used has
a __% chance of obtaining a sample with a (statistic) such that
the interval constructed around the (statistic) contains the
(parameter).”
Coach Adams’ Standard Comment
KEEP IN MIND, I change how I say things, but I never
change the fact that I must comment on what I am
confident about, why I have that level of confidence,
and INCLUDE CONTEXT!
Since we don’t want to forget context, imagine that
when I constructed a 99% confidence interval in the
Capture the Mean Activity, I got (76.7, 91.7).
Here is what I might say:
“I am 99% confident that the actual course average for
students in the Fall 2009 AP Statistics class was
between 76.7 and 91.7 because the method I
used produces intervals such that 99% of all intervals
created will capture the true average grade.”
Helping Us Understand
Confidence Intervals
The confidence interval mentioned on the
previous slide (76.7, 91.7) was calculated with
99% confidence. Essentially, this particular
interval is based on one sample so it may be
one of the 99% of all intervals that capture μ
or it may be one of the 1% of all intervals that
fail to capture μ.
The confidence level is the probability that the
METHOD gives an interval that captures the
true parameter.
REMEMBER
Our “confidence” is in the
PROCEDURE used to
generate the interval.
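One way to see that the confidence belongs to the procedure is to repeat the procedure many times on a population whose mean we already know. The simulation below is a rough sketch under assumed conditions: the population values (mu = 80, sigma = 6, n = 10) are invented for illustration, and the function name coverage_rate is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage_rate(mu, sigma, n, confidence_level, repetitions, seed=0):
    """Fraction of simulated z intervals that capture the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(repetitions):
        # Draw one sample from a normal population and build its interval.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / repetitions

# Invented population values, for illustration only.
rate = coverage_rate(mu=80.0, sigma=6.0, n=10, confidence_level=0.99,
                     repetitions=2000)
print(f"Fraction of intervals capturing mu: {rate:.3f}")
```

The printed fraction should land close to 0.99. Any single interval either captures μ or misses it; the 99% describes the long-run behavior of the method.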
INFERENCE TOOLBOX (p 631)
Steps for constructing a CONFIDENCE INTERVAL:
1—PARAMETER—Identify the population of interest
and the parameter you want to draw a conclusion
about.
2—CONDITIONS—Choose the appropriate inference
procedure. VERIFY conditions (SRS, Normality,
Independence) before using it.
3—CALCULATIONS—If the conditions are met, carry
out the inference procedure.
4—INTERPRETATION—Interpret your results in the
context of the problem. CONCLUSION,
CONNECTION, CONTEXT (meaning that our
conclusion about the parameter connects to our work
in part 3 and includes appropriate context)
What does a confidence interval
look like?
There are two commonly used formats for
expressing a confidence interval
– Estimate ± margin of error
Example: 57.39 ± 3.97
– Interval notation
Example: (53.42, 61.36)
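The two formats carry the same information, and converting between them is simple arithmetic. The short sketch below uses the example values above; the function names are hypothetical.

```python
def to_interval(estimate, margin):
    """Estimate ± margin of error -> (lower, upper)."""
    return estimate - margin, estimate + margin

def to_estimate_and_margin(lower, upper):
    """(lower, upper) -> estimate (the midpoint) and margin (half the width)."""
    return (lower + upper) / 2, (upper - lower) / 2

lower, upper = to_interval(57.39, 3.97)
print(round(lower, 2), round(upper, 2))      # 53.42 61.36

estimate, margin = to_estimate_and_margin(53.42, 61.36)
print(round(estimate, 2), round(margin, 2))  # 57.39 3.97
```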
How confidence intervals behave!
Margin of error gets smaller when
– z* gets smaller. Smaller z* is the same as smaller
confidence level C. To obtain a smaller margin of
error, you must be willing to accept lower
confidence.
– σ gets smaller. The standard deviation σ measures
the variation in the population. It is easier to pin
down μ when σ is smaller.
– n gets larger. Increasing the sample size n reduces
the margin of error for any fixed confidence level.
Because n appears under the square root sign, we
must take four times as many observations in order
to cut the margin of error in half.
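A quick numerical check of these three behaviors, as a sketch only: the function name margin_of_error is hypothetical, and the starting values (z* = 2.576, σ = 6.4551, n = 12) are just convenient numbers to plug in.

```python
from math import sqrt

def margin_of_error(z_star, sigma, n):
    """Margin of error z*(sigma/sqrt(n)) for a z interval."""
    return z_star * sigma / sqrt(n)

base = margin_of_error(2.576, 6.4551, 12)

# Lower confidence (smaller z*) gives a smaller margin.
print(margin_of_error(1.960, 6.4551, 12) < base)  # True
# Smaller sigma gives a smaller margin.
print(margin_of_error(2.576, 3.0, 12) < base)     # True
# Four times the observations cuts the margin in half.
print(margin_of_error(2.576, 6.4551, 48) / base)
```

The last ratio comes out at 0.5 (up to floating-point rounding), because √(4n) = 2√n.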
Sample size for desired margin of error!
Ultimately, the population standard deviation is being given to us,
so this doesn’t change.
And we are typically given a set confidence level, C, which directly
determines z*.
So, ultimately, we can only change our sample size to achieve the
desired SMALL INTERVAL WITH A HIGH LEVEL OF
CONFIDENCE!
To determine the sample size n that will yield a confidence
interval for a population mean with a specific margin of error
m, set the expression for the margin of error to be less than or
equal to m and solve for n:
z*(σ/√n) ≤ m
ALWAYS ROUND UP to the next whole number value of n
to ensure that your sample is large enough to give the desired
margin of error (or smaller).
Sample Size Example
Again, consider our Capture the Mean activity. If we
want to estimate the true mean within 5 points, how
big of a sample do we need?
σ = 6.4551; 99% confidence → z* = 2.576
2.576(6.4551/√n) ≤ 5
(2.576 × 6.4551)/5 ≤ √n
((2.576 × 6.4551)/5)² ≤ n
n ≥ 11.0601
So we need at least 12
grades in our sample to
bring the margin of
error to within 5 points.
(Notice that we rounded
11.06 up to 12.)
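The same calculation can be checked in code. This is a minimal sketch using the numbers from this example; the function name required_sample_size is hypothetical. Solving z*(σ/√n) ≤ m for n gives n ≥ (z*σ/m)², and math.ceil does the rounding up.

```python
from math import ceil, sqrt

def required_sample_size(z_star, sigma, margin):
    """Smallest whole n with z*(sigma/sqrt(n)) <= margin."""
    return ceil((z_star * sigma / margin) ** 2)  # always round UP

n = required_sample_size(z_star=2.576, sigma=6.4551, margin=5)
print(n)  # 12

# Check: 12 grades meet the target, 11 do not.
print(2.576 * 6.4551 / sqrt(12) <= 5)  # True
print(2.576 * 6.4551 / sqrt(11) <= 5)  # False
```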
Some cautions!
The data must be an SRS from the population.
The formula is not correct for probability sampling designs
more complex than an SRS.
There is no correct method for inference from data
haphazardly collected with bias of unknown size.
Because x̄ is strongly influenced by a few extreme
observations, outliers can have a large effect on the
confidence interval. Search for outliers and try to correct
them or justify their removal before computing the interval.
If the sample size is small and the population is not normal,
the true confidence level will be different from the value C
used in computing the interval. If n ≥ 15, the confidence
level will not be greatly disturbed by nonnormal
populations.
You must know the standard deviation of the population.
Technology Disclaimer
As always, you will be allowed unrestricted use of your
calculator on quizzes and tests (as well as the actual
AP Exam). For this reason, ALWAYS be certain to
write down the values of key numbers that are being
used (means, standard deviations, degrees of
freedom, significance levels, etc.) along with results of
the calculator procedures in order to receive full credit.
To use your calculator: STAT: TESTS: Z Interval
Make sure you are comfortable with the difference
between entering raw data (Data) and summary statistics (Stats).
Plug in exactly what you are asked for.