How sure we really are
Confidence intervals for means
and proportions
FETP India
Competency to be gained
from this lecture
Calculate 95% confidence intervals
for means and proportions
Key issues
• Concept of confidence interval
• Confidence interval for means
• Confidence interval for proportions
What we learnt so far (1/3)
• Population parameters are fixed
• We can take samples from the population
• Several samples of size ‘n’ are possible
• Each sample gives estimates (e.g., means)
called “statistics”
• Statistics vary from sample to sample
 This is called “sampling fluctuation”
Concept of confidence interval
What we learnt so far (2/3)
• The distribution of a statistic for all possible
samples of given size ‘n’ is called “sampling
distribution”
• For large ‘n’, the sampling distribution is
approximately ‘normal’ even if the original
distribution is not
• If the original distribution is normal, the
result is true even for small ‘n’
What we learnt so far (3/3)
• The mean of the sampling distribution is the
‘population mean’
• The standard deviation of the sampling
distribution is known as standard error
 SE= Population SD /√n
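The relationship SE = population SD / √n can be checked with a small simulation. The sketch below is only an illustration: the population (normal, mean 50, SD 10), the sample size and the function name `simulate_sample_means` are made-up assumptions, not part of the lecture.

```python
import random
import statistics


def simulate_sample_means(population_mean, population_sd, n, n_samples, seed=0):
    """Draw n_samples samples of size n from a normal population
    and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(population_mean, population_sd) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical population: mean 50, SD 10; samples of size 25.
means = simulate_sample_means(50.0, 10.0, n=25, n_samples=4000)

observed_se = statistics.stdev(means)  # SD of the simulated sample means
theoretical_se = 10.0 / 25 ** 0.5      # population SD / sqrt(n) = 2.0

print("mean of sample means:", round(statistics.fmean(means), 2))
print("observed SE:", round(observed_se, 2), "theoretical SE:", theoretical_se)
```

The mean of the simulated sample means should sit close to the population mean, and their standard deviation close to the theoretical standard error.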
Easy to estimate the standard deviation,
difficult to estimate the mean
• Samples generate sample means and
standard errors
• The usefulness of these estimates varies:
 For large ‘n’, the standard deviation from a single
sample is a fair estimate of the population SD
 The mean from a single sample may not be as good
an estimate of the population mean
How can the population
mean be estimated?
• It is desirable to give a range of values with
a specific level of confidence that the true
population mean is one of the values in the
range
• We can obtain this range from the sampling
distribution, which is ‘normal’, using the
properties of the ‘normal’ distribution:
 Mean
 Standard deviation
From the standard error (SE) to the
confidence interval
• The point estimate x̄ (mean in the sample) is
a point in the sampling distribution and
there is a 95% chance that it lies in the
µ ± 1.96 SE interval
• But µ is not known
• Interchanging µ and x̄, we can infer that
there is a 95% chance that µ lies in the
interval x̄ ± 1.96 SE
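The "interchanging" step can be written out as a short rearrangement of the same inequality. This is a sketch in LaTeX, writing the sample mean as \bar{x} and the standard error as SE; the intermediate line is added here to show the algebra and is not on the original slide.

```latex
\begin{align*}
P\left(\mu - 1.96\,\mathrm{SE} \le \bar{x} \le \mu + 1.96\,\mathrm{SE}\right) &= 0.95 \\
\Longleftrightarrow\quad P\left(-1.96\,\mathrm{SE} \le \bar{x} - \mu \le 1.96\,\mathrm{SE}\right) &= 0.95 \\
\Longleftrightarrow\quad P\left(\bar{x} - 1.96\,\mathrm{SE} \le \mu \le \bar{x} + 1.96\,\mathrm{SE}\right) &= 0.95
\end{align*}
```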
Inference using various levels of
confidence
• Using the properties of the normal
distribution, we can infer what proportion of
the values lie between two given values
• Considering the distribution of the means:
 68% of sample means will lie within 1 standard
error above or below the population mean
 95% of sample means will lie within 1.96 standard
errors above or below the population mean
• “1.96” comes from the standard z table for alpha=0.05
(two-sided)
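The 68% and 95% figures and the 1.96 multiplier can be reproduced without a printed z table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper names `z_critical` and `central_probability` are illustrative, not from the lecture.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z such that the central `confidence`
    share of a standard normal distribution lies within +/- z."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def central_probability(z):
    """Probability that a standard normal value lies within +/- z."""
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)


print(round(z_critical(0.95), 2))          # 1.96
print(round(central_probability(1.0), 2))  # 0.68
print(round(central_probability(1.96), 2)) # 0.95
```

Other confidence levels can be obtained the same way, for example `z_critical(0.99)` for a 99% interval.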
Confidence interval for a mean
• The confidence interval of the mean gives
the range of plausible values for the true
population mean
95%CI (x - 1.96

n
, x  1.96

n
)

Confidence interval for a mean
Example of a calculation of a
confidence interval for a mean
• Sample of 100 observations,
 Mean height is 68”
 SD: 10”
• Standard error of the mean = 10 / √100 = 1
• 95% confidence limits for population mean
are 68 ± 1.96 × 1
→ Approximately 66” to 70”
$$95\%\ \mathrm{CI} = \left(68 - 1.96\,\frac{10}{\sqrt{100}},\ 68 + 1.96\,\frac{10}{\sqrt{100}}\right) \approx (66, 70)$$
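The same calculation can be sketched in a few lines of Python. The function name `mean_ci` and its parameters are illustrative assumptions; the numbers are those of the height example (n = 100, mean 68”, SD 10”).

```python
import math


def mean_ci(sample_mean, sample_sd, n, z=1.96):
    """Normal-approximation confidence interval for a mean:
    sample_mean +/- z * sample_sd / sqrt(n)."""
    se = sample_sd / math.sqrt(n)  # standard error of the mean
    margin = z * se
    return sample_mean - margin, sample_mean + margin


low, high = mean_ci(68.0, 10.0, 100)
print(f"95% CI: {low:.2f} to {high:.2f}")  # 95% CI: 66.04 to 69.96
```

Rounded to whole inches, this gives the 66” to 70” interval of the example.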
Interpretation of the calculation of the
confidence interval for a mean
• The 95% confidence interval for the mean of
68 is (66, 70)
• This means that with repeated random
sampling, 95% of the intervals will contain
the true mean (µ)
• Since we have one of these intervals, we can
be 95% confident that this interval contains
the true mean
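The "repeated random sampling" interpretation can be illustrated by simulation. The sketch below is hypothetical: it assumes a normal population whose true mean (68) and SD (10) are known, draws many samples of size 100, and counts how often the 95% interval built from each sample contains the true mean. The function name `coverage` and the number of repetitions are assumptions made for illustration.

```python
import math
import random
import statistics


def coverage(true_mean, true_sd, n, n_repeats, z=1.96, seed=1):
    """Fraction of simulated samples whose interval
    x_bar +/- z * s / sqrt(n) contains the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(n_repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        if x_bar - z * se <= true_mean <= x_bar + z * se:
            hits += 1
    return hits / n_repeats


fraction = coverage(68.0, 10.0, n=100, n_repeats=2000)
print("share of intervals containing the true mean:", fraction)
```

The printed share should be close to 0.95; any single interval either contains the true mean or does not.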
Calculating a 95% confidence interval
for a mean in practice
• Epi-Info, “Epitable” module
• Open-Epi calculator (Open source)
 www.openepi.com
• Excel
Calculating a 95% confidence interval for
a mean in OpenEpi: 1/2 (Methods)
1. Choose “Mean, CI”
2. Click “Enter”
3. Enter data
4. Click “calculate”
Calculating a 95% confidence interval for
a mean in OpenEpi: 2/2 (Results)
Exercise to calculate the 95% confidence
interval for a mean
• Study of gestational age at birth in the past
month in a sample of health care facilities
• Results of the study
 n=350 births
 Sample mean= 37.5 weeks
 s=12.2
• What is the 95% confidence interval?
95%CI (37.5 - 1.96
12.2
350
, 37.5  1.96
12.1
350
)  (36,39)
Applying the same methods to generate
confidence intervals for proportions
• The central limit theorem also applies to
distribution of sample proportions when the
sample size is large enough
 The population proportion replaces the
population mean
 The binomial distribution replaces the normal
distribution
Confidence interval for a proportion
Using the binomial distribution
• The binomial distribution is a sampling
distribution for p
• Formula of the standard error:
$$SE_{\mathrm{proportion}} = \sqrt{\frac{p(1-p)}{n}}$$
 Where n = Sample size, p = proportion

Using the central limit theorem
• As the sample size n increases, the binomial
distribution becomes very close to a normal
distribution (Central limit theorem)
• Thus, we can use the normal distribution to
calculate confidence intervals and test
hypotheses
• If np and n(1-p) are both equal to 10 or more,
then the normal approximation may be used
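The rule of thumb can be expressed as a one-line check. The sketch below is illustrative: the function name `normal_approximation_ok` is an assumption, the first call uses the figures of the goiter survey (n = 363, p = 0.17), and the second is a made-up small sample.

```python
def normal_approximation_ok(n, p, threshold=10):
    """Rule of thumb: the normal approximation may be used when
    both n*p and n*(1-p) are at least `threshold`."""
    return n * p >= threshold and n * (1 - p) >= threshold


print(normal_approximation_ok(363, 0.17))  # True: 61.7 and 301.3 are both >= 10
print(normal_approximation_ok(20, 0.10))   # False: n*p = 2
```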
Applying the concept of the confidence
interval of the mean to proportions
• For means, the 95% confidence interval
was:
95%CI (x - 1.96

n
, x  1.96

n
)
• For proportions, we just replace the formula of the
standard error of the mean by the standard error of
 the proportion that comes from the binomial
distribution
95%CI (p - 1.96
p(1 p)
p(1 p)
, p + 1.96
)
n
n
Calculation of a confidence interval for a
proportion: Prevalence of goiter in
Solan,
Himachal Pradesh, India, 2005
• Sample of 363 children:
 63 (17%) present with goiter
• Standard error of the proportion
$$SE = \sqrt{\frac{0.17\,(1 - 0.17)}{363}} = \sqrt{\frac{0.17 \times 0.83}{363}} \approx 0.020$$
• 95% confidence limits for the proportion are
0.17 ± 1.96 × 0.020
→ Approximately 13% to 21%
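The goiter calculation can be sketched in Python as follows. The function name `proportion_ci` and its return layout are illustrative assumptions; the inputs (p = 0.17, n = 363) are those of the example.

```python
import math


def proportion_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.
    Returns (lower limit, upper limit, standard error)."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se, se


low, high, se = proportion_ci(0.17, 363)
print(f"SE = {se:.4f}")
print(f"95% CI: {low:.0%} to {high:.0%}")  # 95% CI: 13% to 21%
```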
Interpretation of the calculation of the
confidence interval for the proportion
• The 95% confidence interval for the
proportion of 17% is (13%, 21%)
• This means that with repeated random
sampling, 95% of the intervals will contain
the true proportion
• Since we have one of these intervals, we can
be 95% confident that this interval contains
the true proportion
Calculating a 95% confidence interval
for a proportion in practice
• Epi-Info, “Epitable” module
• Open-Epi calculator (Open source)
 www.openepi.com
Calculating a 95% confidence interval for
a proportion in OpenEpi: 1/2 (Methods)
1. Choose “Proportion”
2. Click “Enter”
3. Enter data
4. Click “calculate”
Calculating a 95% confidence interval for
a proportion in OpenEpi: 2/2 (Results)
Exercise to calculate the 95% confidence
interval for a proportion
• In a sample of 250 HIV infected persons with
AIDS, 116 are positive for tuberculosis
• What is the 95% confidence interval?
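With $p = 116/250 = 0.464$:
$$95\%\ \mathrm{CI} = \left(0.464 - 1.96\,\sqrt{\frac{0.464 \times 0.536}{250}},\ 0.464 + 1.96\,\sqrt{\frac{0.464 \times 0.536}{250}}\right) \approx (0.40, 0.53)$$
→ Approximately 40% to 53%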
From estimation to testing
• Confidence interval is about estimating
• The sampling distribution can also be used to
test hypotheses
 Statistical testing
Dealing with non-normal parent
population
• If sample size exceeds 30, we are safe because the
sampling distribution will approach the normal
distribution
• If the sample size is smaller than 30, the
distribution is different
• The 1.96 value will be replaced by another value
coming from the t-distribution
 Slightly different from the normal distribution
 Depends upon the sample size
 The degrees of freedom will be n-1
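For a small sample, the interval described here can be written as below. This is a sketch in LaTeX: the symbol t_{n-1, 0.975} (the 97.5th percentile of the t-distribution with n-1 degrees of freedom) and the illustrative sample size n = 10 are assumptions added for this example and do not come from the lecture.

```latex
\[
95\%\ \mathrm{CI} = \left(\bar{x} - t_{n-1,\,0.975}\,\frac{s}{\sqrt{n}},\ \bar{x} + t_{n-1,\,0.975}\,\frac{s}{\sqrt{n}}\right)
\]
% Illustration: for n = 10 there are 9 degrees of freedom and
% t_{9,\,0.975} \approx 2.262, so the interval is wider than with 1.96.
```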
Take home messages
• Confidence intervals use the central limit
theorem to estimate a range of possible
values for the population parameter on the
basis of the sample estimate, the standard
deviation and the sample size
• The 95% confidence interval lies at +/- 1.96
times the standard error, which is calculated using
different methods for means (s/√n) and
proportions (√[p(1-p)/n])