Confidence Intervals
Dr. Amjad El-Shanti
MD, PMH,Dr PH
University of Palestine
2016
Confidence Interval
• We know that when we want to study a
population, we select a random sample from this
population, and the importance of the sample lies
in how well it reflects information about the
population.
• We studied the sample mean and the standard
error (SE) for each sample, and we want to study
how much these two values tell us about the
likely values of the population mean, which is
usually unknown.
Statistical estimation
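[Diagram: a random sample is drawn from the population, with every member of the
population having the same chance of being selected in the sample. Statistics
calculated from the sample are used for estimation of the population parameters.]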
Large sample Case:
• If we select different samples, each one would give a different
estimate, the difference being due to sampling variation.
• Imagine collecting many independent samples of the same
size and calculating the sample mean of each one.
• A frequency distribution of these means could then be
formed. It is approximately normally distributed, with mean
equal to the population mean and standard deviation
equal to σ/√n. Because σ is unknown, we use the estimated
value sd/√n, where sd is the standard deviation of the sample.
• This value, as we said before, is the standard error of the
sample mean, and it measures how precisely the population
mean is estimated by the sample mean.
• The larger the sample, the smaller the standard error.
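The estimated standard error sd/√n can be sketched in a few lines of Python. This is only an illustration: the function name `standard_error` and the data values are hypothetical and do not come from the lecture.

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the sample mean: sd / sqrt(n)."""
    n = len(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return sd / math.sqrt(n)

# Hypothetical data, chosen only for illustration.
small = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0]
large = small * 4  # the same values repeated: similar spread, four times the size

se_small = standard_error(small)
se_large = standard_error(large)
print(round(se_small, 3), round(se_large, 3))
```

With a similar spread of values, the larger sample gives the smaller standard error.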
Large sample Case:
• The interpretation of the standard error of a
sample mean is similar to that of the standard
deviation.
• Approximately 95% of the sample means
obtained by repeated sampling would lie within
1.96 standard errors above or below the
population mean.
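The 95% statement can be checked with a small simulation. The sketch below assumes a hypothetical normal population (mean 50, standard deviation 10) and samples of size 100; none of these values come from the lecture.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: normal with mean 50 and standard deviation 10.
MU, SIGMA, N, REPEATS = 50.0, 10.0, 100, 2000

se = SIGMA / math.sqrt(N)  # standard error of the mean for samples of size N
inside = 0
for _ in range(REPEATS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    # Count the sample means that fall within 1.96 SE of the population mean.
    if abs(statistics.fmean(sample) - MU) <= 1.96 * se:
        inside += 1

fraction_inside = inside / REPEATS
print(f"{fraction_inside:.3f} of sample means lie within 1.96 SE of the population mean")
```

The printed fraction should be close to 0.95, and it varies slightly with the seed and the number of repeats.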
Large sample Case:
• As there is a 95% probability that the sample mean lies within
1.96 standard errors above or below the population mean, there
is a 95% probability that the interval between
X̄ − 1.96(sd/√n) and X̄ + 1.96(sd/√n) contains the population mean.
• The interval from X̄ − 1.96(sd/√n) to X̄ + 1.96(sd/√n) represents
likely values of the population mean, and it is called the 95%
confidence interval for the population mean.
X̄ − 1.96(sd/√n) and X̄ + 1.96(sd/√n) are the lower and upper
95% confidence limits for the population mean.
For a large sample size, the 95% confidence interval is:
C.I. = X̄ ± 1.96 (sd/√n)
• Confidence intervals for percentages other than 95% are
calculated in the same way, using the appropriate percentage
point Z of the standard normal distribution in place of 1.96. For
example, the 99% confidence interval is C.I. = X̄ ± 2.58 (sd/√n)
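As a minimal sketch, the large-sample formula can be written for any confidence level by looking up the percentage point Z with Python's standard library. The function name `z_confidence_interval` and the summary values are hypothetical, used for illustration only.

```python
import math
from statistics import NormalDist

def z_confidence_interval(mean, sd, n, level=0.95):
    """Large-sample confidence interval: mean ± z * sd / sqrt(n)."""
    # Two-sided percentage point of the standard normal distribution,
    # about 1.96 for 95% and about 2.58 for 99%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Hypothetical summary values, for illustration only.
low95, high95 = z_confidence_interval(mean=50.0, sd=10.0, n=100, level=0.95)
low99, high99 = z_confidence_interval(mean=50.0, sd=10.0, n=100, level=0.99)
print(low95, high95)
print(low99, high99)
```

The 99% interval is wider than the 95% interval because a larger percentage point is used.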
Interval estimation
Confidence interval (CI)
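[Figure: standard normal curve with z on the horizontal axis from -3.0 to 3.0. The
areas between successive whole-number values of z are approximately 2%, 14%, 34%,
34%, 14% and 2%. The points z = ±1.96 and z = ±2.58 are marked.]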
Interval estimation
Confidence interval (CI), interpretation and example
[Figure: histogram of age in years (horizontal axis, 22.5 to 60.0) against
frequency (vertical axis, 0 to 50).]
x̄ = 41.0, SD = 8.7, SEM = 0.46, 95% CI (40.0, 42.0), 99% CI (39.7, 42.1)
Statistical estimation
Estimate
• Point estimate: sample mean, sample proportion
• Interval estimate: confidence interval for mean, confidence interval for proportion
The point estimate is always within the interval estimate.
Small Sample Size:
• For the previous calculations we always assumed a large sample
size (n > 25). If the sample size is small, we have two problems:
1. The sample standard deviation (sd) is itself subject to sampling
variation and may not be a reliable estimate of σ.
2. When the distribution in the population is not normal, the
distribution of the sample mean may also not be normal.
– The second problem is addressed by the central limit theorem,
which says that whether or not the variable is normally
distributed, the sample mean will tend to be normally
distributed. But because of the first problem we cannot use the
normal distribution; instead we use a distribution
called the t-distribution, and this is valid only when the
population is normally distributed. If the population is not
normally distributed, we use either a transformation or a
nonparametric confidence interval.
Confidence Interval Using t-distribution:
• (X̄ − μ)/(sd/√n) follows a t-distribution with (n − 1)
degrees of freedom.
• The t-distribution is a symmetrical, bell-shaped
distribution with a mean of zero, but it is more spread
out than the normal distribution, with longer tails.
For a small sample size the confidence interval is:
C.I. = X̄ ± t × (sd/√n)
where t can be found from the table if we know
the percentage and the degrees of freedom. As
the sample size increases, the t-distribution
approaches the normal distribution.
Confidence Interval
• Example 1:
As part of a malaria control program, it was planned to spray all 10000 houses in a
rural area with insecticide, and it was necessary to estimate the amount that
would be required. Since it was not feasible to measure all 10000 houses, a
random sample of 100 houses was chosen and the sprayable surface of each
of these was measured.
The mean sprayable surface area of these 100 houses was 23.3 m² and the
standard deviation was 5.9 m². It is unlikely that the sample mean is exactly
equal to the mean surface area of all 10000 houses (μ). Its precision is
measured by the standard error (σ/√n), estimated by:
s/√n = 5.9/√100 = 0.6 m²
- There is a 95% probability that the sample mean of 23.3 m² differs from the
population mean by less than 1.96 s.e. = 1.96 × 0.6 = 1.2 m²
The 95% confidence interval is:
X̄ ± 1.96 × s/√n = 23.3 ± 1.2 = 22.1 to 24.5 m²
• We say that, with 95% confidence, the mean sprayable surface area of the
population of houses lies between 22.1 and 24.5 m².
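The arithmetic of Example 1 can be checked with a short Python sketch. It uses only the figures given in the example; the variable names are illustrative.

```python
import math

n = 100      # houses in the sample
mean = 23.3  # sample mean sprayable surface area, m^2
sd = 5.9     # sample standard deviation, m^2

se = sd / math.sqrt(n)  # 0.59, which the example rounds to 0.6
margin = 1.96 * se      # half-width of the 95% confidence interval
lower, upper = mean - margin, mean + margin

print(f"SE = {se:.2f}, 95% CI = {lower:.1f} to {upper:.1f} m^2")
# prints: SE = 0.59, 95% CI = 22.1 to 24.5 m^2
```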
Confidence Interval
• Example 2:
The following are the numbers of hours of relief obtained by six arthritic patients
after receiving a new drug:
2.2, 2.4, 4.9, 2.5, 3.7, 4.3 hours
X̄ = (2.2 + 2.4 + 4.9 + 2.5 + 3.7 + 4.3) / 6 = 3.3 hours
s = 1.13 hours
n = 6, d.f. = n − 1 = 6 − 1 = 5
s.e. = s/√n = 0.46 hours
The 5% point of the t-distribution with 5 degrees of freedom is 2.57 (from the
table), and so the 95% confidence interval for the average number of hours of
relief for arthritic patients in general is
3.3 ± 2.57 × 0.46 = 3.3 ± 1.2 = 2.1 to 4.5 hours.
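The calculation in Example 2 can be reproduced with Python's standard library, as a sketch. The t value 2.57 is taken from the table as quoted in the example, not computed; the name `T_5DF_95` is illustrative.

```python
import math
import statistics

hours = [2.2, 2.4, 4.9, 2.5, 3.7, 4.3]  # hours of relief for the six patients

n = len(hours)
mean = statistics.mean(hours)  # sample mean
s = statistics.stdev(hours)    # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)          # standard error of the mean

T_5DF_95 = 2.57  # table value for 5 degrees of freedom, as quoted in the example
margin = T_5DF_95 * se
lower, upper = mean - margin, mean + margin

print(f"mean = {mean:.1f}, s = {s:.2f}, s.e. = {se:.2f}")
print(f"95% CI = {lower:.1f} to {upper:.1f} hours")
```

Rounded to the precision used in the example, this gives a mean of 3.3, s of 1.13, s.e. of 0.46, and an interval of 2.1 to 4.5 hours.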