CONFIDENCE INTERVAL-I
(Concept, interpretation and derivation)
Statistical Methods in Economics-II
Lesson: Confidence Interval-I
(concept, interpretation and derivation)
Lesson Developer: Anjani K. Kochak
College/Department: Department Of Economics, Lady Shri
Ram College, University Of Delhi
Institute of Lifelong Learning, University of Delhi
Table of contents
1. Introduction
2. Basic concept of confidence interval
3. Interpretation of confidence interval
4. Other levels of confidence
5. Confidence level, width and precision
6. One sided confidence interval
7. General derivation of confidence interval
8. Practice questions
Learning Objectives
In this lesson you will learn another method of estimation, namely
interval estimation. Unlike point estimation where we calculate a single value
to estimate the population parameter, in this method we construct an
interval of values. The lesson will explain how a confidence interval for the
population parameter is derived, given the confidence level and the sampling
distribution of the corresponding sample statistic. This is illustrated for the
population mean. You will also learn how to interpret this confidence
interval. Confidence intervals for different confidence levels as well as one
sided confidence intervals are also explained with examples. The relationship
between width of the confidence interval and confidence level is explained to
highlight the trade-off between precision and reliability. Finally you will learn
the general methodology of constructing a confidence interval.
Introduction
We have just learnt about point estimators of a parameter and their desirable
properties. The sample mean $\bar{X}$ is an unbiased and minimum variance point estimator of the
population mean µ. However, because of sampling fluctuations, the sample means from different
samples will differ from the population mean. The important question is how close $\bar{X}$ is to µ
for a given sample. The point estimate provides no information about the magnitude of the
sampling error that may occur. An alternative to point estimation is interval estimation,
where we calculate an interval of values rather than just one value to estimate the true
population parameter. This interval is called a confidence interval. Besides providing a range
to estimate the population parameter, the confidence interval can also be used for testing
hypotheses about the population parameter.
Basic concept of confidence interval
To derive a confidence interval for any population parameter, we first need to know
the sampling distribution of the corresponding sample statistic. For example, to
calculate a confidence interval for the population mean we must know the sampling
distribution of the sample mean. Suppose we have a normal population with mean µ and
standard deviation σ, and σ is known. If we take repeated samples of size n from this
population and derive the sampling distribution of the sample mean $\bar{X}$, we have
learnt that $\bar{X}$ will also be normally distributed with mean µ and standard deviation
σ/√n. Therefore if we subtract µ from $\bar{X}$ and divide by σ/√n we will get a standard
normal variable Z, which is distributed as N(0,1), i.e.
if $X \sim N(\mu, \sigma^2)$ then $\bar{X} \sim N(\mu, \sigma^2/n)$ and
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1).$$
Next we need to specify the confidence level or confidence coefficient of the confidence
interval. Confidence levels are expressed as percentages, the common ones being
90%, 95% and 99%. Confidence coefficients are expressed as probabilities, the
corresponding ones being 0.90, 0.95 and 0.99.
If we want to construct a 95% confidence interval for the population mean for the
population described above, we need to do some algebraic operations.
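We know from the normal tables that for a standard normal variate z, 95% of all
observations would lie between -1.96 and +1.96, i.e.
$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95 \qquad (A)$$
Now we work on the expression inside the brackets.
1. First multiply each term by σ/√n
$$P\left(-1.96\,\sigma/\sqrt{n} < \bar{X} - \mu < 1.96\,\sigma/\sqrt{n}\right) = 0.95$$
2. Then subtract $\bar{X}$ from each term
$$P\left(-\bar{X} - 1.96\,\sigma/\sqrt{n} < -\mu < -\bar{X} + 1.96\,\sigma/\sqrt{n}\right) = 0.95$$
3. To do away with the negative sign of µ, we multiply throughout by -1 and reverse
the direction of both inequalities
$$P\left(\bar{X} + 1.96\,\sigma/\sqrt{n} > \mu > \bar{X} - 1.96\,\sigma/\sqrt{n}\right) = 0.95$$
Rearranging terms we get
$$P\left(\bar{X} - 1.96\,\sigma/\sqrt{n} < \mu < \bar{X} + 1.96\,\sigma/\sqrt{n}\right) = 0.95 \qquad (B)$$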
The elements inside the brackets indicate a random interval for the population mean. This
is called a random interval because the two limits of the interval have a random element,
$\bar{X}$, which varies from sample to sample. The lower limit is $\bar{X} - 1.96\,\sigma/\sqrt{n}$
and the upper limit is $\bar{X} + 1.96\,\sigma/\sqrt{n}$. The interval is centered on the sample
mean and extends 1.96 σ/√n to each side of $\bar{X}$.
The width of the interval is fixed. It is 2 times 1.96 σ/√n. We can interpret the expression
(B) as stating that the probability that the population mean will lie within the random
interval is 0.95.
Now for a given sample, if we calculate the sample mean $\bar{x}$ and substitute it for $\bar{X}$
in the random interval (B), the fixed interval that we get is known as the 95% confidence
interval for the population mean.
$\left(\bar{x} - 1.96\,\sigma/\sqrt{n},\; \bar{x} + 1.96\,\sigma/\sqrt{n}\right)$ is a 95% confidence interval for µ
Example: A random sample of 25 observations is taken from a normal population
whose standard deviation is given as 4. The sample mean works out to be
115. Construct a 95% confidence interval for the population mean.
Solution: [115 - (1.96)4/5, 115 + (1.96)4/5] = [113.432, 116.568]
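The same calculation can be sketched in Python. This is only an illustration under the lesson's assumptions (normal population, known σ). The function name `z_confidence_interval` is made up for this sketch, and the critical value is taken from the standard library's `statistics.NormalDist` rather than from printed normal tables.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, level=0.95):
    """Two-sided interval for the mean of a normal population with known sigma."""
    alpha = 1.0 - level
    # Critical value with area 1 - alpha/2 to its left (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Data from the example: sample mean 115, sigma 4, n 25.
lower, upper = z_confidence_interval(115, 4, 25)
print(f"[{lower:.3f}, {upper:.3f}]")
```

To three decimal places the limits agree with the table-based answer; the unrounded critical value differs from 1.96 only in the fifth decimal place.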
Interpretation of confidence interval
It is true that the probability that the random interval will contain the population
mean is 0.95. However, once we calculate the confidence interval by substituting data from
a given sample, it would be incorrect to say that the probability that the confidence interval
will contain the population mean is 0.95. This is because the interval is fixed and so is the
value of µ; therefore either the population mean lies in the given interval or it does not. There
is no randomness or uncertainty about it. In the above example the 95% confidence interval
was [113.432, 116.568]. The population mean is a fixed number, so it will either lie in this
interval or not. If the population mean was, say, 116, then it would lie in the interval. How
then does one interpret the confidence interval?
We can interpret the 95% confidence interval on similar lines as the interpretation of
the long run relative frequency as an approximation of probability. If we say that the
occurrence of an event E has a probability of 0.95, it implies that if the experiment is
performed a large number of times, then the event E will occur about 95% of the time. For
example, if a fair coin is tossed once, the probability of obtaining a head is ½. According to
the long run relative frequency interpretation of probability this means that if the coin is
tossed an infinite number of times, head would occur in 50% of the cases, i.e. if the coin is
tossed 1000 times we would get head in approximately 500 cases. The relative frequency
of head will approximately be equal to the probability of head, the approximation getting
better as n, the number of times the experiment is repeated, gets larger. Symbolically
$$P(E) = \lim_{n \to \infty} \frac{x}{n},$$
where n is the number of times the experiment is repeated and x is the
number of times the event E occurs.
Thus the correct interpretation of the confidence interval is that if an infinite number
of random samples are collected and a 95% confidence interval for the population mean is
calculated for each sample, then 95% of these intervals will contain the population mean.
To illustrate, suppose we take 20 random samples of size 16 from a normal
population which has a given standard deviation σ = 2 and find the sample mean and the
95% confidence interval for the population mean for all 20 samples. This is tabulated below.
The population mean is 100, and you will notice that it lies within 19 confidence intervals and
is outside only one confidence interval, that of the 13th sample. This implies that the
population mean lies within 95% of the confidence intervals. This is also illustrated
diagrammatically in Figure 1. Only in the confidence interval of sample 13 does the population
mean not lie within the interval.
Sample number    Sample mean    95% confidence interval for µ
1                99.2           (98.22, 100.18)
2                99.4           (98.42, 100.38)
3                99.6           (98.62, 100.58)
4                99.8           (98.82, 100.78)
5                99.9           (98.92, 100.88)
6                100            (99.02, 100.98)
7                100.2          (99.22, 101.18)
8                100.3          (99.32, 101.28)
9                100.5          (99.52, 101.48)
10               100.7          (99.72, 101.68)
11               100.8          (99.82, 101.78)
12               100.9          (99.92, 101.88)
13               101.5          (100.52, 102.48)
14               99.1           (98.12, 100.08)
15               99.3           (98.32, 100.28)
16               99.25          (98.27, 100.23)
17               99.85          (98.87, 100.83)
18               99.95          (98.97, 100.93)
19               100.65         (99.67, 101.63)
20               100.75         (99.77, 101.73)
It is important to note that the above example was designed to illustrate the
meaning of a 95% confidence interval. In practice, with only 20 samples, it may happen that
none of the confidence intervals, or two or more of them, fail to contain the true population
mean. If we had taken 1000 samples of size 16 instead of 20, and calculated 95% confidence
intervals for the population mean from each sample, the population mean would lie in the
constructed confidence interval in close to 950 samples. It must be emphasized that only
when we take a large number of samples and construct a large number of confidence
intervals will approximately 95% of them contain the true population mean.
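The repeated-sampling interpretation can be tried out with a small simulation. The sketch below is only an illustration. It assumes the same set-up as the table (population mean 100, σ = 2, samples of size 16, critical value 1.96). The function name `coverage_rate` and the fixed random seed are choices made for this sketch, not part of the lesson.

```python
import random
from math import sqrt


def coverage_rate(num_samples, n=16, mu=100.0, sigma=2.0, z=1.96, seed=0):
    """Fraction of simulated 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so that a run can be reproduced
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        # The interval covers mu when the sample mean is within half_width of mu.
        if abs(sample_mean - mu) < half_width:
            hits += 1
    return hits / num_samples


print(coverage_rate(20))      # with so few samples this can differ from 0.95
print(coverage_rate(10_000))  # with many samples this should be close to 0.95
```

With only 20 samples the observed proportion moves in steps of 0.05, so it need not equal 0.95. With many samples it settles near 0.95.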
Figure 1
Other levels of confidence
In the previous section we derived a 95% confidence interval for the population
mean. 95% was the confidence level of the confidence interval. If we express it as a
probability and not as a percentage, it is referred to as the confidence coefficient, i.e. a 95%
confidence level implies that the confidence coefficient is 0.95. We can change the confidence
level to 99% or 90% or any other level. The 95% confidence interval was derived from the
probability 0.95 for the initial inequality (A). If we want to construct a 90% confidence
interval, the initial probability of 0.95 must be replaced by 0.90. This implies that the z
critical value changes from 1.96 to 1.645. A 90% confidence interval is then obtained by
using 1.645 in place of 1.96 in equation (A). This means that the random interval that we
get will now look like
$$P\left(\bar{X} - 1.645\,\sigma/\sqrt{n} < \mu < \bar{X} + 1.645\,\sigma/\sqrt{n}\right) = 0.90$$
We can interpret this expression as stating that the probability that the population
mean will lie within the random interval is 0.90, i.e. if we take a large number of samples and
construct a large number of confidence intervals, approximately 90% of them would contain
the true population mean.
Likewise a 99% confidence interval can be obtained by using 2.58 instead of 1.96 in
equation (A). Thus we can change the level of confidence by replacing 1.96 with the
appropriate standard normal critical value. How do we find the appropriate standard normal
critical value? The normal tables give critical values of $z_{\alpha/2}$ such that the area to its
left is $1 - \alpha/2$, as in figure 1. You can check from the normal tables that the area to the
left of 1.96 is 0.975, to the left of 2.58 is 0.995 and so on. Since the curve is symmetrical, the
area to the left of $-z_{\alpha/2}$ would be $\alpha/2$ and thus the area between $-z_{\alpha/2}$ and
$z_{\alpha/2}$ would be $1 - \alpha$ (figure 2). Thus if we have to find a 94% confidence interval,
the value of α would be 0.06. We would therefore have to find $z_{0.03}$, i.e. the critical value
of z such that the area to its left is 0.97, which is 1.88. Now the area between -1.88 and 1.88
would be 0.94.
Thus in general a 100(1 - α)% confidence interval for the population mean µ of a
normal population when the value of the population standard deviation is known is
$$\left(\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n},\; \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right)$$
Thus it is clear that the choice of the confidence level determines the value of $z_{\alpha/2}$
used in calculating the confidence interval.
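Instead of reading the critical value from printed tables, it can be computed. The sketch below is one possible way, using Python's standard-library `statistics.NormalDist`. The function name `z_critical` is an assumption of this illustration.

```python
from statistics import NormalDist


def z_critical(level):
    """Return z such that the area between -z and z under N(0,1) equals level."""
    alpha = 1.0 - level
    # The area to the left of the critical value must be 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.94, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

The computed values match the table values used in the lesson up to rounding; for 99% the unrounded value is about 2.576, which the lesson rounds to 2.58.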
Example: The following confidence interval was selected by a researcher.
[$\bar{x} - 1.75(\sigma/\sqrt{n})$, $\bar{x} + 1.75(\sigma/\sqrt{n})$]
Determine the confidence level.
Solution: Since the value of $z_{\alpha/2}$ is 1.75, the area to the left of this is about 0.96
and therefore the area between -1.75 and 1.75 would be 0.92. Therefore the
confidence level would be 92%.
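Going in the reverse direction, from a critical value to the confidence level, needs only the standard normal distribution function. A minimal sketch, with the hypothetical helper name `confidence_level`:

```python
from statistics import NormalDist


def confidence_level(z):
    """Area between -z and z under the standard normal curve."""
    # By symmetry the two tails are equal, so the central area is 2*Phi(z) - 1.
    return 2.0 * NormalDist().cdf(z) - 1.0


print(f"{confidence_level(1.75):.4f}")
```

For z = 1.75 this gives a value of about 0.92, i.e. a confidence level of roughly 92%.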
Example: Construct (i) a 99%, (ii) a 94% and (iii) a 90% confidence interval for the
population mean, if a random sample of 25 observations from a normal population
with standard deviation σ = 2 gave a mean of 18.5.
Solution: (i) 99% confidence interval
[18.5- 2.58(2/5), 18.5 +2.58(2/5)]
( 18.5 – 1.032, 18.5+ 1.032)
( 17.468 ,19.532)
(ii) 94% confidence interval
[18.5- 1.88(2/5), 18.5 +1.88(2/5)]
( 18.5 –0.752 , 18.5 +0.752)
( 17.748 ,19.252)
(iii) 90% confidence interval
[18.5- 1.645(2/5), 18.5 +1.645(2/5)]
( 18.5 – 0.658, 18.5 +0.658)
( 17.842 ,19.158)
Confidence level, width and precision
The above example shows that the higher the confidence level, the wider the confidence
interval. For the 99% confidence interval the width was 2.064 (2*1.032), while for the 90%
confidence interval it was 1.316 (2*0.658). The width or the length of the observed
confidence interval is an important measure of the quality of the information obtained from
the sample. The wider the confidence interval, the more confident we are that the interval
will actually contain the true population value. On the other hand, the wider the interval, the
less information we have about the true population parameter. Thus we cannot say that a
99% interval is to be preferred to a 90% interval; the gain in reliability is at the cost of
lower precision. The ideal situation would be to have a relatively short interval with a high
confidence level. This would be possible if we reduce the standard error of the sample
statistic.
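The trade-off between confidence level and width can be seen numerically. The sketch below assumes σ = 2 and n = 25, as in the preceding example. The function name `interval_width` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(level, sigma, n):
    """Width of the two-sided known-sigma interval: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return 2.0 * z * sigma / sqrt(n)


# sigma = 2 and n = 25 are held fixed; only the confidence level changes.
for level in (0.90, 0.94, 0.99):
    print(f"{level:.0%} interval width: {interval_width(level, 2, 25):.3f}")
```

The widths grow with the confidence level. Because the code uses unrounded critical values, the figure for 99% differs slightly in the third decimal place from the 2.064 obtained with the table value 2.58.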
When we construct a confidence interval for the population mean by drawing a
random sample from a normal population with a known standard deviation σ, the standard
error of the sample mean is σ/√n. The standard error can be reduced by increasing the
sample size, n. If we increase n, then for a given level of confidence the width of the
confidence interval decreases, which increases the precision of the estimate.
Example: A random sample from a normal population gave a mean of 450.
The population standard deviation was known to be 8. Construct a 95% confidence
interval for the population mean and find its width if
(i) Sample size is 16
(ii) Sample size is 81
Solution:
(i)
[450 - 1.96*(8/4), 450 + 1.96*(8/4)]
[446.08, 453.92]
Width = 7.84
(ii)
[450 - 1.96*(8/9), 450 + 1.96*(8/9)]
[448.26, 451.74]
Width = 3.48
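The effect of the sample size on the width can be checked with a few lines of Python. This sketch uses the table value 1.96 directly. The names `Z_95` and `width_95` are made up for the illustration.

```python
from math import sqrt

Z_95 = 1.96  # table value for a 95% two-sided interval, as used in the lesson


def width_95(sigma, n):
    """Width of the 95% interval for the mean when sigma is known."""
    return 2 * Z_95 * sigma / sqrt(n)


# sigma = 8 as in the example; only the sample size changes.
for n in (16, 81):
    print(n, round(width_95(8, n), 2))
```

Since the width is proportional to 1/√n, the sample size must be quadrupled to halve the width.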
Thus increasing the confidence level increases the width of the confidence interval,
while increasing the sample size decreases the width of the confidence interval, other things
remaining constant. Therefore if we want to keep both the width and the level of
confidence constant, this can be done by suitably choosing n.
Example: We wish to measure the mean expenditure on housing of a population
which is normally distributed with standard deviation σ = 1.5. What should be the sample size
so that the width of the 95% confidence interval is at most 2?
Solution:
w = 2
2 = 2*1.96*1.5/√n
√n = 1.96*1.5 = 2.94
n = (2.94)^2 = 8.6436, which is rounded up to n = 9
In general the sample size necessary for a 100(1 - α)% confidence interval to have
a width of w is
$$n = \left(\frac{2\,z_{\alpha/2}\,\sigma}{w}\right)^2$$
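The sample size formula can be wrapped in a small function. The sketch below is an illustration only. The name `required_sample_size` is hypothetical, and the result is rounded up because a sample size must be a whole number and rounding down would make the interval wider than w.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, width, level=0.95):
    """Smallest n for which the known-sigma interval is no wider than width."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    n_exact = (2.0 * z * sigma / width) ** 2
    return ceil(n_exact)  # round up to the next whole observation


# Housing expenditure example: sigma = 1.5, width at most 2, 95% confidence.
print(required_sample_size(1.5, 2))
```

For the housing expenditure example this gives 9, the same sample size as the hand calculation.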
The half width is called the bound on error of estimation or the maximum error of the
estimate. When we construct a 100(1 - α)% confidence interval for the population mean,
the bound on error of estimation is $z_{\alpha/2}\,\sigma/\sqrt{n}$.
Example: A random sample of 16 observations was drawn from a normal population
whose standard deviation was known to be 1.6. If the sample mean was 234,
construct a 99% confidence interval for the population mean. What would be the
maximum error of the estimate?
Solution: 99% confidence interval for the population mean
[234 - 2.58 (1.6/4), 234 + 2.58(1.6/4)]
[234 – 1.032, 234 + 1.032]
[232.968 , 235.032]
Maximum error of the estimate:
2.58(1.6/4)= 1.032
One Sided confidence interval
Till now we discussed a two-sided confidence interval. This gives both a lower
confidence limit or bound and an upper confidence limit or bound for the parameter to be
estimated. Sometimes the researcher may want only one of these bounds, in which case we
can construct a one-sided confidence interval. For example the researcher may want to
know a 95% upper limit for the average income of a city. The interpretation of the one-sided
confidence interval is similar to that of a two-sided interval. A 95% upper limit is a fixed
number for a given sample. However it will vary from sample to sample and if we take a
large number of samples and calculate the 95% upper limit for all of them, then in
approximately 95% of those samples the true mean will lie below the upper bound.
A 100(1 - α)% one-sided upper confidence bound for the population mean based on
a sample of n observations from a normal population with a known standard deviation is
$$\bar{x} + z_{\alpha}\,\sigma/\sqrt{n}$$
Similarly a 100(1 - α)% one-sided lower confidence bound is
$$\bar{x} - z_{\alpha}\,\sigma/\sqrt{n}$$
Example: The sample mean of 16 observations from a normal population with a
known standard deviation σ = 2 was 450. Calculate a 99% one-sided upper bound for µ.
Solution: 99% one-sided upper bound for µ
= 450 + (2.33)2/4
= 450 + 1.165 = 451.165
Example: A sample of 36 observations from a normal population gave a mean of 125. The
population standard deviation was known to be 1.5. Calculate a 95% one-sided lower bound
for µ and interpret it.
Solution:
95% one-sided lower bound for µ
= 125 - 1.645(1.5/6)
= 125 - 0.411
= 124.589
A 95% lower limit is a fixed number for a given sample, e.g. for the above sample it is
124.589. However it will vary from sample to sample and if we take a large number of
samples and calculate the 95% lower limit for all of them, then in approximately 95% of
those samples the true mean will lie above the lower bound.
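One-sided bounds can be computed in the same way as two-sided limits, except that the whole of α is placed in one tail. For a 95% bound the critical value is therefore $z_{0.05} \approx 1.645$, not the two-sided 1.96. The sketch below is illustrative. The function names `upper_bound` and `lower_bound` are assumptions of this example.

```python
from math import sqrt
from statistics import NormalDist


def upper_bound(sample_mean, sigma, n, level):
    """One-sided upper confidence bound for the mean with known sigma."""
    z_alpha = NormalDist().inv_cdf(level)  # area 1 - alpha to the left
    return sample_mean + z_alpha * sigma / sqrt(n)


def lower_bound(sample_mean, sigma, n, level):
    """One-sided lower confidence bound for the mean with known sigma."""
    z_alpha = NormalDist().inv_cdf(level)
    return sample_mean - z_alpha * sigma / sqrt(n)


print(f"{upper_bound(450, 2, 16, 0.99):.3f}")    # 99% upper bound
print(f"{lower_bound(125, 1.5, 36, 0.95):.3f}")  # 95% lower bound
```

Small differences in the third decimal place from hand calculations come from using unrounded critical values instead of the table values 2.33 and 1.645.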
General derivation of confidence interval
Let us construct a confidence interval for the population parameter Φ on the basis of a
sample of n observations X1, ..., Xn. Let h(X1, ..., Xn) denote a random variable which
satisfies the following properties:
(i) The variable h(X1, ..., Xn) is a function of both X1, ..., Xn and Φ.
(ii) The probability distribution of the variable h(X1, ..., Xn) does not depend on Φ
or any other unknown parameter.
In the construction of the confidence interval for the population mean from a normal
population with a known standard deviation, Φ = µ and h(X1, ..., Xn) is
$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}},$$
which satisfies both the properties. It functionally depends on µ but
its probability distribution is N(0,1), independent of µ. In general the form of the
distribution depends on the sampling distribution of the corresponding sample statistic.
Now for any α lying between 0 and 1, we can find constants a and b such that
P(a < h(X1, ..., Xn) < b) = 1 - α
In the construction of the confidence interval for the population mean from a normal
population with a known standard deviation, a = $-z_{\alpha/2}$ and b = $z_{\alpha/2}$.
Then the inequalities can be manipulated to yield the following random interval
P[l(X1, ..., Xn) < Φ < u(X1, ..., Xn)] = 1 - α
Once we substitute sample values in this we get a 100(1 - α)% confidence interval for Φ.
[l(x1, ..., xn), u(x1, ..., xn)]
In the case of the population mean
l(x1, ..., xn) = $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n}$
u(x1, ..., xn) = $\bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$
Practice Questions
Q.1. The life of a 100 W light bulb, measured in hours, is normally distributed with
standard deviation σ = 25 hours. If a sample of 16 bulbs gave a mean life of 1500 hours,
construct a 99% confidence interval for the population mean.
Q.2. A random sample of 100 observations selected from a normal population gave a
mean $\bar{x}$ = 143.72. The standard deviation for the population is given as σ = 14.8.
(i) Construct
(a) a 99% confidence interval for µ
(b) a 95% confidence interval for µ
(c) a 90% confidence interval for µ
(ii) Does the width of the confidence intervals decrease as the confidence
level decreases? Explain.
Q.3. A public health official wanted to know how often university students visit their
health centre due to illness. The official took a random sample of 100 students and found
an average of 2.3 visits per student per year. The population is known to be normally
distributed with a standard deviation of 0.4. Make a 97% confidence interval for the true
population mean visits per student per year.
Q.4. A random sample is selected from a normal population which has a standard
deviation of 7.14. The sample mean was found to be 48.52.
(i) Make a 95% confidence interval for µ assuming
(a) n = 225
(b) n = 100
(c) n = 64
(ii) Does the width of the confidence intervals increase as the sample size
decreases? Explain.
Q.5. To study the internet penetration in India it is desired to estimate the average
number of hours that teenagers spend on the internet per week. Assuming the sample is
drawn from a normal population with standard deviation σ = 3.0 hours, how large must the
sample be so that it is possible to assert with 95% confidence that the maximum error of
estimation is 20 minutes?
Q.6. Construct a 94% one-sided upper bound given the following information. The sample
mean of 25 observations is 56.4. The population from which the sample is drawn is normal
with σ = 2.5.
Q.7. What is the confidence level for the following one-sided confidence limits?
(i) upper limit =
(ii)lower limit =
Q.8. The following confidence interval for the population mean was reported by a
researcher on the basis of a random sample (250.5, 260.5).Find the (i) sample mean (ii)the
confidence level, if the population was normally distributed with
and the sample size
was 36.
Q.9. A random sample of size n was drawn from a normal population distribution with a
given standard deviation σ, and the following two-sided confidence intervals for µ were
constructed. Find the confidence level for each of the two confidence intervals for µ.
(i)
(ii)
(
(
±
±
Q.10. Explain the relationship between the width of a confidence interval and the sample size.
Construct a 95% confidence interval for the population mean µ if a random sample of 25
observations is taken from a normal population with σ = 5. What is the width of the
confidence interval? If the width has to be halved, by how much should we increase the
sample size, other things remaining the same?