Chapter 6
Inferences Based on a Single Sample: Estimation with Confidence Intervals
Large-Sample Confidence Interval for a Population Mean
How do we estimate the population mean and assess the estimate’s reliability?
The sample mean $\bar{x}$ is an estimate of $\mu$, and we use the Central Limit Theorem (CLT) to assess how accurate that estimate is.
According to the CLT, about 95% of all $\bar{x}$ from samples of size n lie within $\pm 1.96\,\sigma_{\bar{x}}$ of the mean $\mu$.
We can use this to assess the accuracy of $\bar{x}$ as an estimate of $\mu$.
$$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm \frac{1.96\,\sigma}{\sqrt{n}}$$
We are 95% confident, for any $\bar{x}$ from a sample of size n, that $\mu$ will lie in the interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$.
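The 95% statement can be checked by simulation. The following is a minimal sketch, not part of the original slides: the population values (mu = 50, sigma = 10), the sample size, the number of trials, and the function name `coverage_rate` are all illustrative assumptions.

```python
import random
from math import sqrt


def coverage_rate(mu=50.0, sigma=10.0, n=40, trials=2000, seed=1):
    """Share of simulated intervals x_bar +/- 1.96*sigma/sqrt(n) that contain mu.

    All default values are illustrative assumptions, not data from the slides.
    """
    rng = random.Random(seed)
    half_width = 1.96 * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n from a normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Count the interval as a hit when it encloses the true mean.
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"Share of intervals containing mu: {rate:.3f}")
```

With enough trials the printed share should settle near 0.95, which is what "95% confident" refers to: the long-run behavior of the interval procedure, not any single interval.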
We usually don’t know $\sigma$, but with a large sample, s is a good estimator of $\sigma$.
We can calculate confidence intervals for
different confidence coefficients
Confidence coefficient – probability that a
randomly selected confidence interval
encloses the population parameter
Confidence level – Confidence coefficient
expressed as a percentage
The confidence coefficient is equal to $1-\alpha$; the remaining area $\alpha$ is split equally between the two tails of the distribution, $\alpha/2$ in each.
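The confidence interval is expressed more generally as
$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
For samples of size n > 30, the confidence interval is expressed as
$$\bar{x} \pm z_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
This requires that the sample used be random.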
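The large-sample formula can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than anything from the slides: the function name `large_sample_ci` and the example inputs (sample mean 20, s = 5, n = 100) are hypothetical. The critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(x_bar, s, n, confidence=0.95):
    """Return (low, high) for x_bar +/- z_{alpha/2} * s / sqrt(n)."""
    alpha = 1.0 - confidence
    # z_{alpha/2} cuts off an upper-tail area of alpha/2 under N(0, 1).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical summary statistics, chosen only for illustration.
low, high = large_sample_ci(x_bar=20.0, s=5.0, n=100)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Raising the `confidence` argument widens the interval, because a larger share of the sampling distribution has to be covered.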
Commonly used values of $z_{\alpha/2}$

Confidence level 100(1-$\alpha$)    $\alpha$    $\alpha/2$    $z_{\alpha/2}$
90%                                 .10         .05           1.645
95%                                 .05         .025          1.960
99%                                 .01         .005          2.575
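The tabulated critical values can be reproduced with the standard normal inverse CDF. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical. The computed 99% value is about 2.576, while the table lists the common textbook rounding 2.575.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2} for a two-sided interval at the given confidence coefficient."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```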
Small-Sample Confidence Interval for a Population Mean
Two problems are presented by sample sizes of less than 30:
– The CLT can no longer be relied on to make the sampling distribution of $\bar{x}$ approximately normal
– The population standard deviation is almost always unknown, and s may be a poor estimate of it when n is small
If we can assume that the sampled population is approximately normal, then the sampling distribution of $\bar{x}$ can be assumed to be approximately normal.
Instead of using
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
we use
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
This t is referred to as the t-statistic.
The t-statistic has a sampling distribution very similar to that of z, but more variable.
Its variability depends on the sample size n and is expressed through (n-1) degrees of freedom (df).
As df gets smaller, variability increases.
The table for the t-distribution contains t-values for various combinations of degrees of freedom and $t_\alpha$.
[Partial t-table not reproduced here; it shows the components of the table.]
See Table 7.3
Comparing the t and z distributions for the same $\alpha = 0.05$, with n = 5 (df = 4) for the t-distribution, the t-score is larger, and therefore the confidence interval will be wider.
The closer df gets to 30, the more closely the t-distribution approximates the standard normal distribution, N(0,1).
When creating a confidence interval around $\mu$ for a small sample we use
$$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
basing $t_{\alpha/2}$ on n-1 degrees of freedom.
We assume a random sample drawn from a population that is approximately normally distributed.
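A small-sample interval can be sketched as follows. The data values and the function name `small_sample_ci` are hypothetical, chosen only for illustration. Because the Python standard library has no t-distribution, the critical value is passed in from a t-table: 2.776 is the tabulated $t_{.025}$ for df = 4.

```python
from math import sqrt
from statistics import mean, stdev


def small_sample_ci(sample, t_crit):
    """Return (low, high) for x_bar +/- t_{alpha/2} * s / sqrt(n).

    t_crit must be looked up for n - 1 degrees of freedom.
    """
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    half_width = t_crit * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical measurements, n = 5, so df = 4.
data = [4.8, 5.1, 5.0, 4.7, 5.4]
T_025_DF4 = 2.776  # t-table value for alpha/2 = .025, df = 4

low, high = small_sample_ci(data, T_025_DF4)
print(f"95% t-interval: ({low:.3f}, {high:.3f})")

# Same data with z = 1.96 in place of t, to show the t-interval is wider.
z_low, z_high = small_sample_ci(data, 1.96)
print(f"Width with t: {high - low:.3f}, width with z: {z_high - z_low:.3f}")
```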
Large-Sample Confidence Interval for a Population Proportion
Confidence intervals around a proportion are confidence intervals around the probability of success in a binomial experiment.
The sample statistic of interest is $\hat{p}$, for example $\hat{p} = x/n = 257/484 = 0.531$.
The mean of the sampling distribution of $\hat{p}$ is p, so $\hat{p}$ is an unbiased estimator of p.
The standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{pq/n}$, where q = 1-p.
For large samples, the sampling distribution of $\hat{p}$ is approximately normal.
Sample size n is large if $\hat{p} \pm 3\sigma_{\hat{p}}$ falls between 0 and 1.
The confidence interval is calculated as
$$\hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{pq}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
where $\hat{p} = x/n$ and $\hat{q} = 1 - \hat{p}$.
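As an illustration, the interval can be computed for the counts x = 257 and n = 484 used earlier in the slides. The 95% confidence level and the function names `proportion_ci` and `is_large_sample` are assumptions made for this sketch; the slides do not state which level was used with these counts.

```python
from math import sqrt
from statistics import NormalDist


def is_large_sample(x, n):
    """Check that p_hat +/- 3 * sigma_p_hat lies strictly between 0 and 1."""
    p_hat = x / n
    sigma = sqrt(p_hat * (1.0 - p_hat) / n)
    return 0.0 < p_hat - 3.0 * sigma and p_hat + 3.0 * sigma < 1.0


def proportion_ci(x, n, confidence=0.95):
    """Return (low, high) for p_hat +/- z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    q_hat = 1.0 - p_hat
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


large_enough = is_large_sample(257, 484)
low, high = proportion_ci(257, 484)  # 95% level is an assumption
print(f"Large-sample condition met: {large_enough}")
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```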
When p is near 0 or 1, the confidence intervals calculated using the formulas presented are misleading.
An adjustment can be used that works for any p, even with very small sample sizes:
$$\tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}}, \qquad \tilde{p} = \frac{x+2}{n+4}$$
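The adjusted interval differs from the standard one only in adding 2 successes and 4 trials before computing. A minimal sketch, assuming hypothetical counts (2 successes in 20 trials) and a hypothetical function name `adjusted_proportion_ci`:

```python
from math import sqrt
from statistics import NormalDist


def adjusted_proportion_ci(x, n, confidence=0.95):
    """Return (low, high) using p_tilde = (x + 2) / (n + 4)."""
    p_tilde = (x + 2) / (n + 4)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sqrt(p_tilde * (1.0 - p_tilde) / (n + 4))
    return p_tilde - half_width, p_tilde + half_width


# Hypothetical small sample with few successes.
low, high = adjusted_proportion_ci(x=2, n=20)
print(f"Adjusted 95% CI: ({low:.3f}, {high:.3f})")
```

The formula as stated does not clamp the bounds, so for very small x the lower limit can fall below 0; truncating it at 0 is a common practical choice but is not part of the slide.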
Determining the Sample Size
When we want to estimate $\mu$ to within x units with a $(1-\alpha)$ level of confidence, we can calculate the sample size needed.
We use the Sampling Error (SE), which is half the width of the confidence interval.
To estimate $\mu$ with sampling error SE and 100(1-$\alpha$)% confidence,
$$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2}$$
where $\sigma$ is estimated by s or R/4.
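Assume $\alpha = 0.01$ (so $z_{\alpha/2} = 2.575$) and a range R of 0.4, so that $\sigma$ is estimated by R/4 = 0.1. What size sample do we need to achieve a desired SE of 0.025?
$$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(2.575)^2(0.1)^2}{(0.025)^2} = 106.09$$
Since n must be a whole number, we round up to n = 107.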
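The same arithmetic can be sketched in Python. The function name `raw_sample_size` is hypothetical; the inputs are the ones in the worked example, with z taken from the table as 2.575 and sigma estimated as R/4.

```python
from math import ceil


def raw_sample_size(z, sigma, se):
    """Return (z * sigma / se)**2, the unrounded n for estimating a mean."""
    return (z * sigma / se) ** 2


sigma_estimate = 0.4 / 4  # range rule: sigma is roughly R / 4
raw_n = raw_sample_size(z=2.575, sigma=sigma_estimate, se=0.025)
n_required = ceil(raw_n)  # round up so the target SE is not exceeded

print(f"Unrounded n: {raw_n:.2f}")
print(f"Sample size to use: {n_required}")
```

Rounding up rather than to the nearest integer keeps the sampling error at or below the target.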
Sample size can also be estimated for a population proportion p:
$$n = \frac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{(SE)^2}$$
Estimates that use a value of p equal or close to 0.5 are the most conservative, because pq is largest at p = 0.5 and so gives the largest n.
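A short sketch shows why p = 0.5 is the conservative choice. The target SE of 0.03, the 95% level, the alternative guess p = 0.2, and the function name `sample_size_for_proportion` are all illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(se, p=0.5, confidence=0.95):
    """Return the rounded-up n = z^2 * p * q / SE^2."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return ceil(z ** 2 * p * (1.0 - p) / se ** 2)


n_conservative = sample_size_for_proportion(se=0.03, p=0.5)
n_with_guess = sample_size_for_proportion(se=0.03, p=0.2)
print(f"n with p = 0.5: {n_conservative}")
print(f"n with p = 0.2: {n_with_guess}")
```

Any guess for p other than 0.5 yields a smaller required n, so planning with 0.5 guards against underestimating the sample size when p is unknown.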
Finite Population Correction for Simple Random Sampling
Used when the sample size n is large relative to the size of the population N, that is, when n/N > 0.05.
Standard error calculation for $\mu$ with correction:
$$\hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N}}$$
Standard error calculation for p with correction:
$$\hat{\sigma}_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\sqrt{\frac{N-n}{N}}$$
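The two corrected standard errors can be sketched as below. The function names and the example numbers (s = 10, n = 100, N = 1000, sample proportion 0.5) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt


def needs_fpc(n, N):
    """Rule of thumb from the slide: apply the correction when n/N > 0.05."""
    return n / N > 0.05


def fpc_se_mean(s, n, N):
    """Estimated standard error of x_bar with finite population correction."""
    return (s / sqrt(n)) * sqrt((N - n) / N)


def fpc_se_proportion(p_hat, n, N):
    """Estimated standard error of p_hat with finite population correction."""
    return sqrt(p_hat * (1.0 - p_hat) / n) * sqrt((N - n) / N)


# Hypothetical survey: 100 units sampled from a population of 1000.
apply_correction = needs_fpc(100, 1000)
se_mean = fpc_se_mean(s=10.0, n=100, N=1000)
se_prop = fpc_se_proportion(p_hat=0.5, n=100, N=1000)
print(f"Correction needed: {apply_correction}")
print(f"Corrected SE of the mean: {se_mean:.4f}")
print(f"Corrected SE of the proportion: {se_prop:.4f}")
```

The correction factor is always below 1, so the corrected standard error is smaller than the uncorrected one; the effect is negligible when n is a small fraction of N.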