Chapter 19
Confidence intervals for proportions
math2200
Sample proportion
• A good estimate of the population proportion
• Natural sampling variability
• How does the sample proportion p̂ vary from sample to sample?
– Approximately normal: $N\left(p, \sqrt{pq/n}\right)$, where $q = 1 - p$
– About 95% of all samples have p̂ within 2 sd of p
– Population proportion, p, is unknown
– Standard deviation is unknown
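The sampling variability described above can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the population proportion 0.4, the sample size 200, and the function name `simulate_sample_proportions` are all assumptions chosen for illustration.

```python
import math
import random

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return each sample proportion."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        # Each respondent is a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

p, n = 0.4, 200                      # hypothetical population proportion and sample size
sd = math.sqrt(p * (1 - p) / n)      # sd of p-hat under the normal model
p_hats = simulate_sample_proportions(p, n, num_samples=5000)

mean_p_hat = sum(p_hats) / len(p_hats)
within_2_sd = sum(abs(ph - p) <= 2 * sd for ph in p_hats) / len(p_hats)
print(f"mean of p-hat = {mean_p_hat:.3f}, sd = {sd:.4f}, "
      f"share within 2 sd = {within_2_sd:.3f}")
```

The share of simulated sample proportions within 2 sd of p should come out close to 95%. In a real study p is unknown, so this check is only possible in a simulation.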
We can estimate!
• The standard error, $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$ with $\hat{q} = 1 - \hat{p}$, estimates the standard deviation
• If p̂ estimates p accurately, then by the 68-95-99.7% Rule we know
– about 68% of all samples will have p̂’s within 1 SE of p
– about 95% of all samples will have p̂’s within 2 SEs of p
– about 99.7% of all samples will have p̂’s within 3 SEs of p
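As a rough illustration of the rule above, the sketch below computes a standard error and the ranges 1, 2, and 3 SEs on either side of the sample proportion. It assumes the usual formula SE(p̂) = sqrt(p̂q̂/n). The values p̂ = 0.6 and n = 400 are hypothetical, as is the helper name `standard_error`.

```python
import math

def standard_error(p_hat, n):
    """SE(p-hat) = sqrt(p-hat * q-hat / n), where q-hat = 1 - p-hat."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

p_hat, n = 0.6, 400                  # hypothetical sample proportion and sample size
se = standard_error(p_hat, n)

# Ranges for the 68-95-99.7% Rule: p-hat plus or minus 1, 2, and 3 SEs.
for k, share in ((1, "68%"), (2, "95%"), (3, "99.7%")):
    print(f"about {share}: {p_hat - k * se:.3f} to {p_hat + k * se:.3f}")
```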
What can we say about p?
• The true value of p is p̂. (Wrong!)
• The true value of p is close to p̂. (OK!)
• We are 95% confident that the true value of p is
between p̂ − 2 SE and p̂ + 2 SE
• This is a 95% confidence interval for p
• In this specific context, we can also call it a one-proportion z-interval
Margin of error
• The extent of the confidence interval on either
side of p̂ is called the margin of error
– The margin of error for our 95% confidence interval is
2 SE.
– For the 99.7% confidence interval, the margin of error
is 3 SE.
– The more confident we want to be, the larger the
margin of error must be.
– Certainty versus precision
• Confidence interval: estimate ± margin of error
What Does “95% Confidence”
Really Mean?
• Each confidence interval uses a sample
statistic to estimate a population
parameter.
• But, since samples vary, the statistics we
use, and thus the confidence intervals we
construct, vary as well.
What Does “95% Confidence”
Really Mean? (cont.)
• The figure to the right
shows that some of
our confidence
intervals capture the
true proportion (the
green horizontal line),
while others do not.
What Does “95% Confidence”
Really Mean? (cont.)
• Our confidence is in the process of
constructing the interval, not in any one
interval itself.
• Thus, we expect 95% of all “95% confidence intervals” to contain the true
parameter that they are estimating.
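The claim that confidence lies in the process can be illustrated by simulation. This is a sketch under stated assumptions: the true proportion 0.5, the sample size 500, and the function names are hypothetical, and each interval is built as p̂ ± 2 SE.

```python
import math
import random

def two_se_interval(successes, n):
    """Return the interval p-hat +/- 2 SE for one sample."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - 2 * se, p_hat + 2 * se

def coverage_rate(p, n, num_intervals, seed=1):
    """Share of simulated intervals that contain the true proportion p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_intervals):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        low, high = two_se_interval(successes, n)
        if low <= p <= high:
            hits += 1
    return hits / num_intervals

rate = coverage_rate(p=0.5, n=500, num_intervals=2000)
print(f"share of intervals capturing p: {rate:.3f}")
```

Any single interval either contains p or it does not. The rate of roughly 95% describes the whole collection of intervals.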
Margin of Error: Certainty vs.
Precision
• We can claim, with 95% confidence, that
the interval p̂ ± 2 SE(p̂) contains the true
population proportion.
– The extent of the interval on either side of p̂ is
called the margin of error (ME).
• In general, confidence intervals have the
form estimate ± ME.
• The more confident we want to be, the
larger our ME needs to be.
Margin of Error: Certainty vs.
Precision (cont.)
• To be more confident, we wind up being less
precise.
– We need more values in our confidence interval to be
more certain.
• Because of this, every confidence interval is a
balance between certainty and precision.
• The tension between certainty and precision is
always there.
– Fortunately, in most cases we can be both sufficiently
certain and sufficiently precise to make useful
statements.
Margin of Error: Certainty vs. Precision
• The choice of confidence level is
somewhat arbitrary, but keep in mind this
tension between certainty and precision
when selecting your confidence level.
• The most commonly chosen confidence
levels are 90%, 95%, and 99% (but any
percentage can be used).
Critical value
• The number of SEs we extend on either side of p̂
determines the margin of error
• This number is called the critical value,
denoted by z*
• For a 95% confidence interval, the precise
critical value is 1.96
• For a 90% confidence interval, the precise
critical value is 1.645
Critical Values (cont.)
• Example: For a 90% confidence interval,
the critical value is 1.645, because 90% of the standard
Normal model lies within 1.645 standard deviations of the mean.
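Critical values can be looked up in a table or computed. The sketch below uses `statistics.NormalDist` from the Python standard library. The function name `critical_value` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return z* so that the central `confidence` share of N(0, 1) lies within +/- z*."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
```

The 90% and 95% values agree with the 1.645 and 1.96 quoted above. A higher confidence level gives a larger z*, and therefore a wider interval.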
Margin of error
• A confidence interval too wide is not very useful
• How large a margin of error can we tolerate?
– Reducing the level of confidence makes the margin of error
smaller (not a good idea, though)
– You should think about this ahead of time, when you
design your study. Choose a larger sample!
– Generally, a margin of error of 5% or less is
acceptable
Sample size
• Suppose a candidate is planning a poll
and wants to estimate voter support within
3% with 95% confidence. How large a
sample does she need?
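• The margin of error is $ME = z^* \sqrt{\hat{p}\hat{q}/n}$, so we need $0.03 = 1.96 \sqrt{\hat{p}\hat{q}/n}$
• But we do not know p̂ before we conduct
the poll!
• However, p̂q̂ is never larger than 0.25, the value it takes at p̂ = 0.5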
Sample size
• To be conservative, we set p̂ = 0.5, which gives the largest possible standard error
• Solving $0.03 = 1.96 \sqrt{(0.5)(0.5)/n}$ for n, we have n = 1067.1
• So, we will need at least 1068 respondents to keep the
margin of error as small as 3% with a confidence level of
95%
– In practice, you may need more since many people do not respond
– If the response rate is too low, then the study might be a voluntary
response study, which can be biased
• To cut the SE (and hence the margin of error) in half, we must quadruple the sample size n
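The sample size calculation can be sketched as follows, assuming the margin of error formula ME = z* sqrt(p̂q̂/n) and the conservative guess p̂ = 0.5. The function name `required_sample_size` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist

def required_sample_size(margin_of_error, confidence=0.95, p_guess=0.5):
    """Smallest n for which z* * sqrt(p*q/n) is at most margin_of_error."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n)                  # always round up to a whole respondent

print(required_sample_size(0.03))        # poll within 3% at 95% confidence
print(required_sample_size(0.015))       # halving the margin of error
```

Halving the margin of error from 3% to 1.5% roughly quadruples the required sample, which matches the last bullet above.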
Assumptions
• Independence
– Plausible independence condition
– Randomization condition
• Sampled at random?
• From a properly randomized experiment?
– 10% condition
• If the sample is less than 10% of the population, sampling
without replacement can be viewed as roughly the same as
sampling with replacement
Assumptions
• Sample size assumption
– Normal approximation comes from the CLT
– Sample size must be large enough
– Check the success/failure condition
• $n\hat{p} \ge 10$ and $n\hat{q} \ge 10$ (at least 10 successes and at least 10 failures)
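The two conditions that involve counts can be checked mechanically. The sketch below assumes the usual threshold of at least 10 successes and 10 failures. The function name, the dictionary keys, and the example numbers are hypothetical. Independence and randomization depend on how the data were collected and cannot be verified from counts alone.

```python
def check_conditions(successes, n, population_size=None):
    """Check the success/failure condition and, if possible, the 10% condition."""
    failures = n - successes
    results = {"success_failure": successes >= 10 and failures >= 10}
    if population_size is not None:
        # The sample should be less than 10% of the population.
        results["ten_percent"] = n < 0.10 * population_size
    return results

# Hypothetical poll: 285 "yes" answers out of 537, from a very large population.
print(check_conditions(285, 537, population_size=200_000_000))
# Hypothetical small study where the normal approximation is doubtful.
print(check_conditions(4, 50))
```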
Example
• In May 2002, the Gallup Poll asked 537
randomly sampled adults “generally
speaking, do you believe the death penalty
is applied fairly or unfairly in this country
today?”
• 53% answered “fairly”
• 7% said “don’t know”
• What can we conclude from this survey?
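Example
• Plausible independence? (Yes.)
• Randomization condition? (Yes.)
• 10% condition? (Yes.)
• Success/failure condition? (Yes.)
• Let’s find a 95% confidence interval
– Standard error: $SE(\hat{p}) = \sqrt{(0.53)(0.47)/537} \approx 0.0215$
– Margin of error: $ME = 1.96 \times 0.0215 \approx 0.042$
– 95% confidence interval: 0.53 ± 0.042, or (0.488, 0.572)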
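The arithmetic for this example can be reproduced in a few lines. This sketch uses the survey figures given above (n = 537, 53% answering "fairly") and assumes the critical value 1.96 for 95% confidence. The variable names are illustrative.

```python
import math

n = 537            # adults surveyed
p_hat = 0.53       # share answering "fairly"
z_star = 1.96      # critical value for 95% confidence

se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
me = z_star * se                          # margin of error
low, high = p_hat - me, p_hat + me

print(f"SE = {se:.4f}, ME = {me:.4f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Under these assumptions we can be 95% confident that between about 48.8% and 57.2% of all adults would have answered "fairly".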
A deeper understanding of
confidence interval
• How should we interpret the 95% confidence?
– Randomness is from sample to sample
– This does not say p is random
• We are the ones who are uncertain, not the parameter.
– The interval itself is random (varies from sample to
sample)
– This interval is about p, not about p̂
• The sample proportion varies
– Report the confidence interval or the margin of error
• Always pay attention to the required
assumptions
– Independence
– Sample size
• In practice, watch out for biased samples
What have we learned?
• Finally we have learned to use a sample to say
something about the world at large.
• This process (statistical inference) is based on
our understanding of sampling models, and will
be our focus for the rest of the book.
• In this chapter we learned how to construct a
confidence interval for a population proportion.
• And, we learned that interpretation of our
confidence interval is key—we can’t be certain,
but we can be confident.