Confidence Interval for Population Proportion
In practice, the population proportion (p) is hardly ever known. We normally estimate it with a sample proportion ( p̂ ).
Let's think about the population of adults. If we want to estimate the population proportion of adults who smoke,
we could randomly select 100 adults and ask them if they smoke -- let 1 be 'Yes'; and 0 be 'No'. The sample data might
look like the following:
1 0 1 0 0 1 0 1 0 0
0 0 0 1 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0
0 1 0 1 0 0 1 0 1 0
0 0 1 0 0 0 0 1 0 0
1 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0
Note: Sample proportion of ‘Yes’ responses is 16/100 = 0.16
Hence a point estimate of the population proportion of adult smokers is 0.16 = 16%.
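As a quick check of the arithmetic above, the point estimate can be recomputed directly from the 0/1 responses. The following is a minimal sketch; the names `responses` and `sample_proportion` are chosen here for illustration and are not part of the original handout.

```python
# The 100 survey responses from the text: 1 = 'Yes' (smokes), 0 = 'No'.
responses = [
    1, 0, 1, 0, 0, 1, 0, 1, 0, 0,
    0, 0, 0, 1, 0, 0, 0, 0, 1, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 1, 0, 1, 0, 0, 1, 0, 1, 0,
    0, 0, 1, 0, 0, 0, 0, 1, 0, 0,
    1, 0, 0, 0, 0, 1, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 1, 0, 0, 0, 0, 1, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
]


def sample_proportion(data):
    """Return the fraction of 1s ('Yes' responses) in a list of 0/1 values."""
    return sum(data) / len(data)


p_hat = sample_proportion(responses)
print(f"n = {len(responses)}, yes = {sum(responses)}, p_hat = {p_hat:.2f}")
# prints: n = 100, yes = 16, p_hat = 0.16
```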
An interval estimate of the population proportion is (16% - 1%, 16% + 1%) = (15%, 17%).
Other interval estimates can be:
(16% - 1.5%, 16% + 1.5%) = (14.5%, 17.5%)
(16% - 2%, 16% + 2%) = (14%, 18%)
(16% - 3%, 16% + 3%) = (13%, 19%)
From the population of adults, many, many samples of size 100 can be drawn. If we form all samples of size 100 and calculate
the sample proportion for each sample, then from the section on Sampling Distribution of the Sample Proportion, we
know that the population of the sample proportions has a mean that is the same as the population proportion. That is:
mean of population of sample proportions = population proportion
$\mu_{\hat{p}} = p$
Also, the standard deviation of the population of the sample proportions is $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$.
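The two facts above can be illustrated with a small simulation. Forming all possible samples is not feasible, so this sketch draws 5000 random samples of size 100 instead. It assumes, purely for illustration, that the true population proportion is p = 0.16 and that the population is large enough for each response to be treated as an independent draw; the function name `simulate_sample_proportions` is hypothetical.

```python
import math
import random
import statistics


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return each sample proportion.

    Each response is modelled as an independent 'Yes' with probability p
    (an assumption that fits a very large population).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        yes_count = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(yes_count / n)
    return proportions


p, n = 0.16, 100  # p is an assumed value, used only for this simulation
props = simulate_sample_proportions(p, n, num_samples=5000)

sim_mean = statistics.mean(props)       # should be close to p
sim_sd = statistics.pstdev(props)       # should be close to sqrt(p(1-p)/n)
theory_sd = math.sqrt(p * (1 - p) / n)

print(f"simulated mean = {sim_mean:.4f}, p = {p}")
print(f"simulated sd   = {sim_sd:.4f}, sqrt(p(1-p)/n) = {theory_sd:.4f}")
```

With more simulated samples, the simulated mean and standard deviation should settle closer to the theoretical values.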
Process of forming the population of sample proportions:
Underlying population, with population proportion = $p$
Samples: Sample 1, Sample 2, Sample 3, Sample 4, Sample 5, ..., Sample k
Calculate the sample proportion for each sample: $\hat{p}_1, \hat{p}_2, \hat{p}_3, \hat{p}_4, \hat{p}_5, \ldots, \hat{p}_k$
Population of sample proportions: $\hat{p}_1, \hat{p}_2, \hat{p}_3, \hat{p}_4, \hat{p}_5, \ldots, \hat{p}_k, \ldots$
with population mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
The population of sample proportions is approximately normal with mean $\mu_{\hat{p}} = p$ and standard
deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$ if the following conditions are satisfied:
Condition 1: Sample size (n) must be less than or equal to 5% of the population size (N);
in other words, $n \le 0.05N$.
(Condition 1 is needed so that the sampled values can be treated as independent of each other.)
Condition 2: $n \cdot p \cdot (1 - p) \ge 10$
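The two conditions are simple enough to check mechanically. The sketch below is one way to do so; the function name `normal_approximation_ok` and the population size of 1,000,000 in the usage lines are hypothetical. In practice p is unknown, so the sample proportion would be passed in its place.

```python
def normal_approximation_ok(n, p, population_size):
    """Check the two conditions for treating sample proportions as normal.

    Condition 1: n <= 5% of the population size N.
    Condition 2: n * p * (1 - p) >= 10.
    """
    independent = n <= 0.05 * population_size
    large_enough = n * p * (1 - p) >= 10
    return independent and large_enough


# Hypothetical population of 1,000,000 adults.
print(normal_approximation_ok(100, 0.16, 1_000_000))  # True: 100(0.16)(0.84) = 13.44
print(normal_approximation_ok(30, 0.16, 1_000_000))   # False: 30(0.16)(0.84) = 4.032
```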
A confidence interval of the population proportion has the form:
$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$
where $z_{\alpha/2}$ depends on the level of confidence.
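The interval formula translates almost directly into code. The sketch below takes the critical value as an argument, mirroring the form above. The function name `proportion_confidence_interval` and the inputs in the usage line (sample proportion 0.30, sample size 200) are hypothetical values chosen for illustration; 1.96 is the usual critical value for 95% confidence.

```python
import math


def proportion_confidence_interval(p_hat, n, z):
    """Return (lower, upper) = p_hat -/+ z * sqrt(p_hat * (1 - p_hat) / n)."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical sample: 30% 'Yes' among 200 respondents, 95% confidence.
lower, upper = proportion_confidence_interval(0.30, 200, 1.96)
print(f"({lower:.3f}, {upper:.3f})")
```

The interval is symmetric about the sample proportion, and it narrows as n grows because n sits in the denominator under the square root.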
Example:
Find a 95% confidence interval for the population proportion of adult smokers. The sample data are as follows:
1 0 1 0 0 1 0 1 0 0
0 0 0 1 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0
0 1 0 1 0 0 1 0 1 0
0 0 1 0 0 0 0 1 0 0
1 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0
Note: Sample proportion of ‘Yes’ responses is 16/100 = 0.16 = 16%
Solution:
In order to assume that the distribution of the sample proportions is approximately normal,
the following conditions have to be met:
Condition 1: Sample size (n) must be less than or equal to 5% of the population size (N), i.e.,
$n \le 0.05N$.
Condition 2: $n \cdot \hat{p} \cdot (1 - \hat{p}) \ge 10$ (because p is unknown, the sample proportion $\hat{p}$ is used in its place)
Since the number of adults is in the millions, a sample size of 100 is less than 5% of the population size (N), so Condition 1 is met.
$n \cdot \hat{p} \cdot (1 - \hat{p}) = 100(0.16)(0.84) = 13.44 \ge 10$, so Condition 2 is met.
Since both Condition 1 and Condition 2 are met, we can assume that the population consisting of all sample
proportions from samples of size 100 is approximately normal.
Since the level of confidence is 95%, $\alpha = 100\% - 95\% = 5\% = 0.05$, and $\alpha/2 = 5\%/2 = 2.5\% = 0.025$.
Hence, $z_{\alpha/2} = 1.96$.
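The critical value does not have to come from a table. As a minimal sketch, it can be computed from the standard normal distribution with Python's standard library, using `statistics.NormalDist().inv_cdf`; the helper name `critical_value` is chosen here for illustration.

```python
from statistics import NormalDist


def critical_value(confidence):
    """Return z_{alpha/2}: the z-score with area alpha/2 in the upper tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {critical_value(level):.3f}")
# prints:
# 90% confidence: z = 1.645
# 95% confidence: z = 1.960
# 99% confidence: z = 2.576
```

Higher confidence levels give larger critical values and therefore wider intervals.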

Confidence interval $= \left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$
$= \left(0.16 - 1.96\sqrt{\frac{0.16(0.84)}{100}},\ 0.16 + 1.96\sqrt{\frac{0.16(0.84)}{100}}\right)$
$= (0.088, 0.232)$
Thus, we are 95% confident that the population proportion is between 0.088 and 0.232.
Remember that the population proportion may or may not lie inside (0.088, 0.232): of all the confidence intervals
constructed this way, only 95% contain the population proportion and 5% do not. We do not know whether
(0.088, 0.232) is one of the 95% or one of the 5%.
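The statement that about 95% of such intervals contain the population proportion can be explored by simulation. The sketch below assumes, purely for illustration, a true population proportion of 0.16, then repeatedly draws samples of 100, builds an interval from each with z = 1.96, and counts how often the interval contains 0.16. The function name `coverage_rate` is hypothetical. Because the normal distribution is only an approximation here, the observed rate is expected to land near 95% rather than exactly on it, and it may come out somewhat lower.

```python
import math
import random


def coverage_rate(p, n, num_intervals, z=1.96, seed=1):
    """Fraction of simulated confidence intervals that contain the true p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(num_intervals):
        # One simulated survey: count 'Yes' answers among n respondents.
        yes_count = sum(1 for _ in range(n) if rng.random() < p)
        p_hat = yes_count / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / num_intervals


# Assumed true proportion of 0.16; in a real survey p is unknown.
rate = coverage_rate(p=0.16, n=100, num_intervals=10_000)
print(f"Share of intervals containing p: {rate:.3f}")
```

Any single interval either contains the assumed value or it does not; the 95% describes the long-run behaviour of the procedure, not one particular interval.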