CONFIDENCE INTERVAL-I (Concept, interpretation and derivation)

Statistical Methods in Economics-II
Lesson: Confidence Interval-I (concept, interpretation and derivation)
Lesson Developer: Anjani K. Kochak
College/Department: Department of Economics, Lady Shri Ram College, University of Delhi
Institute of Lifelong Learning, University of Delhi

Table of contents
1. Introduction
2. Basic concept of confidence interval
3. Interpretation of confidence interval
4. Other levels of confidence
5. Confidence level, width and precision
6. One sided confidence interval
7. General derivation of confidence interval
8. Practice questions

Learning Objectives

In this lesson you will learn another method of estimation, namely interval estimation. Unlike point estimation, where we calculate a single value to estimate the population parameter, in this method we construct an interval of values. The lesson explains how a confidence interval for the population parameter is derived, given the confidence level and the sampling distribution of the corresponding sample statistic. This is illustrated for the population mean. You will also learn how to interpret this confidence interval. Confidence intervals for different confidence levels, as well as one sided confidence intervals, are explained with examples. The relationship between the width of the confidence interval and the confidence level is explained to highlight the trade-off between precision and reliability. Finally, you will learn the general methodology of constructing a confidence interval.

Introduction

We have just learnt about a point estimator of a parameter and its desirable properties. The sample mean $\bar{X}$ is an unbiased and minimum variance point estimator of the population mean $\mu$.
However, because of sampling fluctuations, the sample mean $\bar{X}$ from different samples will differ from the population mean. The important question is how close $\bar{X}$ is to $\mu$ for a given sample. The point estimate provides no information about the magnitude of the sampling error that may occur. An alternative to point estimation is interval estimation, where we calculate an interval of values rather than just one value to estimate the true population parameter. This interval is called a confidence interval. Besides providing a range to estimate the population parameter, the confidence interval can also be used for testing hypotheses about the population parameter.

Basic concept of confidence interval

To derive the confidence interval for any population parameter, we first need to know the sampling distribution of the corresponding sample statistic; e.g. to calculate a confidence interval for the population mean we must know the sampling distribution of the sample mean. Suppose we have a normal population with mean $\mu$ and standard deviation $\sigma$, and that $\sigma$ is known. If we take repeated samples of size $n$ from this population and derive the sampling distribution of the sample mean, we have learnt that $\bar{X}$ will also be normally distributed, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. Therefore, if we subtract $\mu$ from $\bar{X}$ and divide by $\sigma/\sqrt{n}$, we get a standard normal variable $Z$, which is $N(0,1)$, i.e.

if $X \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N(\mu, \sigma^2/n)$ and $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$.

Next we need to specify the confidence level or confidence coefficient of the confidence interval. Confidence levels are expressed as percentages, the common ones being 90%, 95% and 99%. Confidence coefficients are expressed as probabilities, the corresponding ones being 0.90, 0.95 and 0.99.
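The sampling distribution of the sample mean described above can be checked numerically. The following is a minimal sketch, assuming illustrative values $\mu = 100$, $\sigma = 2$ and $n = 16$ (these numbers, the function name `simulate_sample_means` and the seed are choices made for this illustration). It draws many samples from a normal population and compares the spread of the sample means with $\sigma/\sqrt{n}$.

```python
import random
import statistics
from math import sqrt


def simulate_sample_means(mu, sigma, n, repetitions, seed=0):
    """Draw `repetitions` samples of size n from N(mu, sigma^2) and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(repetitions)
    ]


# Assumed illustrative values: mu = 100, sigma = 2, n = 16.
means = simulate_sample_means(mu=100, sigma=2, n=16, repetitions=20000)

print("mean of the sample means:", round(statistics.fmean(means), 3))
print("sd of the sample means:  ", round(statistics.stdev(means), 3))
print("theoretical sigma/sqrt(n):", 2 / sqrt(16))
```

The mean of the simulated sample means should be close to $\mu$, and their standard deviation close to $\sigma/\sqrt{n} = 0.5$; the agreement is approximate because the number of repetitions is finite.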
If we want to construct a 95% confidence interval for the mean of the population described above, we need to do some algebraic operations. We know from the normal tables that for a standard normal variate $z$, 95% of all observations lie between -1.96 and +1.96, i.e.

$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95 \qquad \text{(A)}$$

Now we work on the expression inside the brackets.

1. First multiply throughout by $\sigma/\sqrt{n}$:
$$P\left(-1.96\,\sigma/\sqrt{n} < \bar{X} - \mu < 1.96\,\sigma/\sqrt{n}\right) = 0.95$$
2. Then subtract $\bar{X}$ from each term:
$$P\left(-\bar{X} - 1.96\,\sigma/\sqrt{n} < -\mu < -\bar{X} + 1.96\,\sigma/\sqrt{n}\right) = 0.95$$
3. To do away with the negative sign of $\mu$, multiply throughout by -1 and reverse the direction of both inequalities:
$$P\left(\bar{X} + 1.96\,\sigma/\sqrt{n} > \mu > \bar{X} - 1.96\,\sigma/\sqrt{n}\right) = 0.95$$

Rearranging terms we get

$$P\left(\bar{X} - 1.96\,\sigma/\sqrt{n} < \mu < \bar{X} + 1.96\,\sigma/\sqrt{n}\right) = 0.95 \qquad \text{(B)}$$

The elements inside the brackets indicate a random interval for the population mean. It is called a random interval because its two limits contain a random element, $\bar{X}$, which varies from sample to sample. The lower limit is $\bar{X} - 1.96\,\sigma/\sqrt{n}$ and the upper limit is $\bar{X} + 1.96\,\sigma/\sqrt{n}$. The interval is centred on the sample mean $\bar{X}$ and extends $1.96\,\sigma/\sqrt{n}$ to each side of $\bar{X}$. The width of the interval is fixed: it is $2 \times 1.96\,\sigma/\sqrt{n}$. We can interpret expression (B) as stating that the probability that the population mean will lie within the random interval is 0.95.

Now, for a given sample, if we calculate the sample mean $\bar{x}$ and substitute it for $\bar{X}$ in the random interval (B), the fixed interval that we get is known as the 95% confidence interval for the population mean:

$\left(\bar{x} - 1.96\,\sigma/\sqrt{n},\ \bar{x} + 1.96\,\sigma/\sqrt{n}\right)$ is a 95% confidence interval for $\mu$.

Example: A random sample of 25 observations is taken from a normal population whose standard deviation is given as 4. The sample mean works out to be 115. Construct a 95% confidence interval for the population mean.
Solution: $[115 - 1.96(4/5),\ 115 + 1.96(4/5)] = [113.432,\ 116.568]$

Interpretation of confidence interval

It is true that the probability that the random interval will contain the population mean is 0.95. However, once we calculate the confidence interval by substituting data from a given sample, it would be incorrect to say that the probability that the confidence interval contains the population mean is 0.95. This is because the interval is fixed and so is the value of $\mu$; either the population mean lies in the given interval or it does not, and there is no randomness or uncertainty about it. In the above example the 95% confidence interval was [113.432, 116.568]. The population mean is a fixed number, so it will either lie in this interval or not. If the population mean were, say, 116, then it would lie in the interval.

How then does one interpret the confidence interval? We can interpret the 95% confidence interval along the same lines as the long run relative frequency interpretation of probability. If we say that the occurrence of an event E has a probability of 0.95, it implies that if the experiment is performed a large number of times, then the event E will occur about 95% of the time. For example, if a fair coin is tossed once, the probability of obtaining a head is 1/2. According to the long run relative frequency interpretation of probability, this means that if the coin is tossed an infinite number of times, heads would occur in 50% of the cases, i.e. if the coin is tossed 1000 times we would get heads in approximately 500 cases. The relative frequency of heads will be approximately equal to the probability of heads, the approximation getting better as $n$, the number of times the experiment is repeated, gets larger. Symbolically, $P(E) = \lim_{n \to \infty} x/n$, where $n$ is the number of times the experiment is repeated and $x$ is the number of times the event E occurs.
Thus the correct interpretation of the confidence interval is that if an infinite number of random samples are collected and a 95% confidence interval for the population mean is calculated for each sample, then 95% of these intervals will contain the population mean.

To illustrate, suppose we take 20 random samples of size 16 from a normal population which has a given standard deviation $\sigma = 2$, and find the sample mean and the 95% confidence interval for the population mean for all 20 samples. These are tabulated below. The population mean is 100, and you will notice that it lies within 19 of the confidence intervals and outside only one, that of the 13th sample. That is, in 95% of these confidence intervals the population mean lies within the interval. This is also illustrated diagrammatically in Figure 1.

Sample number   Sample mean   95% confidence interval for µ
 1               99.2         (98.22, 100.18)
 2               99.4         (98.42, 100.38)
 3               99.6         (98.62, 100.58)
 4               99.8         (98.82, 100.78)
 5               99.9         (98.92, 100.88)
 6              100           (99.02, 100.98)
 7              100.2         (99.22, 101.18)
 8              100.3         (99.32, 101.28)
 9              100.5         (99.52, 101.48)
10              100.7         (99.72, 101.68)
11              100.8         (99.82, 101.78)
12              100.9         (99.92, 101.88)
13              101.5         (100.52, 102.48)
14               99.1         (98.12, 100.08)
15               99.3         (98.32, 100.28)
16               99.25        (98.27, 100.23)
17               99.85        (98.87, 100.83)
18               99.95        (98.97, 100.93)
19              100.65        (99.67, 101.63)
20              100.75        (99.77, 101.73)

It is important to note that the above example was designed to illustrate the meaning of a 95% confidence interval. In practice, with only 20 samples, it may happen that none of the confidence intervals, or two or more of them, fail to contain the true population mean.
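The repeated-sampling interpretation can also be checked by simulation. The sketch below reuses the setting of the illustration above ($\mu = 100$, $\sigma = 2$, $n = 16$, critical value 1.96), draws a large number of samples, and records the fraction of 95% confidence intervals that contain $\mu$. The function name `coverage`, the seed and the number of samples are assumptions made for this illustration.

```python
import random
import statistics
from math import sqrt


def coverage(mu, sigma, n, num_samples, z=1.96, seed=1):
    """Fraction of intervals (xbar - z*sigma/sqrt(n), xbar + z*sigma/sqrt(n)) containing mu."""
    rng = random.Random(seed)             # fixed seed for a reproducible run
    half_width = z * sigma / sqrt(n)      # same for every sample because sigma is known
    hits = 0
    for _ in range(num_samples):
        xbar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / num_samples


# Setting of the 20-sample illustration, but with many more samples.
cov = coverage(mu=100, sigma=2, n=16, num_samples=20000)
print("fraction of intervals containing mu:", cov)
```

With a large number of samples the printed fraction should be close to 0.95, while with only 20 samples (`num_samples=20`) it can easily be 18/20, 19/20 or 20/20.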
If we had taken 1000 samples of size 16 instead of 20, and calculated a 95% confidence interval for the population mean from each sample, the population mean would lie in the constructed confidence interval for close to 950 of the samples. It must be emphasized that only when we take a large number of samples and construct a large number of confidence intervals will approximately 95% of them contain the true population mean.

Figure 1

Other levels of confidence

In the previous section we derived a 95% confidence interval for the population mean. 95% was the confidence level of the confidence interval. If we express it as a probability rather than a percentage, it is referred to as the confidence coefficient, i.e. a 95% confidence level implies that the confidence coefficient is 0.95. We can change the confidence level to 99% or 90% or any other level. The 95% confidence interval was derived from the probability 0.95 in the initial inequality (A). If we want to construct a 90% confidence interval, the initial probability of 0.95 must be replaced by 0.90. This implies that the critical value of $z$ changes from 1.96 to 1.645. A 90% confidence interval is then obtained by using 1.645 in place of 1.96 in equation (A). The random interval that we get will now look like

$$P\left(\bar{X} - 1.645\,\sigma/\sqrt{n} < \mu < \bar{X} + 1.645\,\sigma/\sqrt{n}\right) = 0.90$$

We can interpret this expression as stating that the probability that the population mean will lie within the random interval is 0.90, i.e. if we take a large number of samples and construct a large number of confidence intervals, approximately 90% of them would contain the true population mean.
Likewise, a 99% confidence interval can be obtained by using 2.58 instead of 1.96 in equation (A). Thus we can change the level of confidence by replacing 1.96 with the appropriate standard normal critical value.

How do we find the appropriate standard normal critical value? The normal tables give critical values $z_{\alpha/2}$ such that the area to the left of $z_{\alpha/2}$ is $1 - \alpha/2$, as in figure 1. You can check from the normal tables that the area to the left of 1.96 is 0.975, to the left of 2.58 is 0.995, and so on. Since the curve is symmetrical, the area to the left of $-z_{\alpha/2}$ would be $\alpha/2$, and thus the area between $-z_{\alpha/2}$ and $z_{\alpha/2}$ would be $1 - \alpha$ (figure 2). Thus if we have to find a 94% confidence interval, the value of $\alpha$ would be 0.06. We would therefore have to find $z_{0.03}$, i.e. the critical value of $z$ such that the area to its left is 0.97, which is 1.88. The area between -1.88 and 1.88 would then be 0.94.

Thus, in general, a $100(1-\alpha)\%$ confidence interval for the mean of a normal population when the value of the population standard deviation is known is

$$\left(\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right)$$

It is clear that the choice of the confidence level determines the value of $z_{\alpha/2}$ used in calculating the confidence interval.

Example: The following confidence interval was selected by a researcher: $\left[\bar{x} - 1.75(\sigma/\sqrt{n}),\ \bar{x} + 1.75(\sigma/\sqrt{n})\right]$. Determine the confidence level.

Solution: Since the value of $z_{\alpha/2}$ is 1.75, the area to the left of this is about 0.96, and therefore the area between -1.75 and 1.75 would be 0.92. Therefore the confidence level would be 92%.

Example: Construct (i) a 99%, (ii) a 94% and (iii) a 90% confidence interval for the population mean, if a random sample of 25 observations from a normal population with standard deviation $\sigma = 2$ gave a mean of 18.5.
Solution:
(i) 99% confidence interval: $[18.5 - 2.58(2/5),\ 18.5 + 2.58(2/5)] = (18.5 - 1.032,\ 18.5 + 1.032) = (17.468,\ 19.532)$
(ii) 94% confidence interval: $[18.5 - 1.88(2/5),\ 18.5 + 1.88(2/5)] = (18.5 - 0.752,\ 18.5 + 0.752) = (17.748,\ 19.252)$
(iii) 90% confidence interval: $[18.5 - 1.645(2/5),\ 18.5 + 1.645(2/5)] = (18.5 - 0.658,\ 18.5 + 0.658) = (17.842,\ 19.158)$

Confidence level, width and precision

The above example shows that the higher the confidence level, the wider the confidence interval. For the 99% confidence interval the width was 2.064 (2 × 1.032), while for the 90% confidence interval it was 1.316 (2 × 0.658). The width, or length, of the observed confidence interval is an important measure of the quality of the information obtained from the sample. The wider the confidence interval, the more confident we are that the interval will actually contain the true population value. On the other hand, the wider the interval, the less information we have about the true population parameter. Thus we cannot say that a 99% interval is to be preferred to a 90% interval; the gain in reliability comes at the cost of lower precision.

The ideal situation would be to have a relatively short interval with a high confidence level. This is possible if we reduce the standard error of the sample statistic. When we construct a confidence interval for the population mean by drawing a random sample from a normal population with a known standard deviation $\sigma$, the standard error of the sample mean is $\sigma/\sqrt{n}$. The standard error can be reduced by increasing the sample size $n$. If we increase $n$, then for a given level of confidence the width of the confidence interval decreases, which increases the precision of the estimate.

Example: A random sample from a normal population gave a mean of 450.
The population standard deviation was known to be 8. Construct a 95% confidence interval for the population mean and find its width if (i) the sample size is 16, (ii) the sample size is 81.

Solution:
(i) $[450 - 1.96(8/4),\ 450 + 1.96(8/4)] = [446.08,\ 453.92]$, width = 7.84
(ii) $[450 - 1.96(8/9),\ 450 + 1.96(8/9)] = [448.26,\ 451.74]$, width = 3.48

Thus increasing the confidence level increases the width of the confidence interval, while increasing the sample size decreases the width of the confidence interval, other things remaining constant. Therefore, if we want to hold both the width and the level of confidence at chosen values, this can be done by suitably choosing $n$.

Example: We wish to measure the mean expenditure on housing of a population which is normally distributed with standard deviation $\sigma = 1.5$. What should be the sample size so that the width of the 95% confidence interval is at most 2?

Solution: The width is $w = 2\,z_{\alpha/2}\,\sigma/\sqrt{n}$. Setting $w = 2$ gives $2 = 2(1.96)(1.5)/\sqrt{n}$, so $\sqrt{n} = 1.96 \times 1.5 = 2.94$ and $n = (2.94)^2 = 8.6436$. Since $n$ must be a whole number, we round up and take $n = 9$.

In general, the sample size necessary for a $100(1-\alpha)\%$ confidence interval to have a width of $w$ is

$$n = \left(\frac{2\,z_{\alpha/2}\,\sigma}{w}\right)^2$$

The half width is called the bound on error of estimation, or the maximum error of the estimate. When we construct a $100(1-\alpha)\%$ confidence interval for the population mean, the bound on error of estimation is $z_{\alpha/2}\,\sigma/\sqrt{n}$.

Example: A random sample of 16 observations was drawn from a normal population whose standard deviation was known to be 1.6. If the sample mean was 234, construct a 99% confidence interval for the population mean. What would be the maximum error of the estimate?

Solution: 99% confidence interval for the population mean: $[234 - 2.58(1.6/4),\ 234 + 2.58(1.6/4)] = [234 - 1.032,\ 234 + 1.032] = [232.968,\ 235.032]$
Maximum error of the estimate: $2.58(1.6/4) = 1.032$

One sided confidence interval

Till now we have discussed the two-sided confidence interval.
This gives both a lower confidence limit, or bound, and an upper confidence limit, or bound, for the parameter to be estimated. Sometimes the researcher may want only one of these bounds, in which case we can construct a one-sided confidence interval. For example, the researcher may want to know a 95% upper limit for the average income of a city. The interpretation of the one-sided confidence interval is similar to that of a two-sided interval. A 95% upper limit is a fixed number for a given sample. However, it will vary from sample to sample, and if we take a large number of samples and calculate the 95% upper limit for all of them, then in approximately 95% of those samples the true mean will lie below the upper bound.

A $100(1-\alpha)\%$ one-sided upper confidence bound for the population mean, based on a sample of $n$ observations from a normal population with a known standard deviation, is

$$\bar{x} + z_{\alpha}\,\sigma/\sqrt{n}$$

Similarly, a $100(1-\alpha)\%$ one-sided lower confidence bound is

$$\bar{x} - z_{\alpha}\,\sigma/\sqrt{n}$$

Note that the whole of $\alpha$ is placed in one tail, so the critical value is $z_{\alpha}$ rather than $z_{\alpha/2}$.

Example: The sample mean of 16 observations from a normal population with a known standard deviation $\sigma = 2$ was 450. Calculate a 99% one-sided upper bound for $\mu$.

Solution: 99% one-sided upper bound for $\mu$ $= 450 + 2.33(2/4) = 450 + 1.165 = 451.165$

Example: A sample of 36 observations from a normal population gave a mean of 125. The population standard deviation was known to be 1.5. Calculate a 95% one-sided lower bound for $\mu$ and interpret it.

Solution: For a 95% one-sided bound the critical value is $z_{0.05} = 1.645$. The 95% one-sided lower bound for $\mu$ is $125 - 1.645(1.5/6) = 125 - 0.411 = 124.589$.

A 95% lower limit is a fixed number for a given sample, e.g. for the above sample it is 124.589. However, it will vary from sample to sample, and if we take a large number of samples and calculate the 95% lower limit for all of them, then in approximately 95% of those samples the true mean will lie above the lower bound.
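The one-sided bounds above can be computed with a few lines of code. This is a minimal sketch, assuming Python's standard `statistics.NormalDist` is used for the critical value in place of the normal tables; the function name `one_sided_bound` and its parameters are hypothetical names chosen for this illustration. Because the critical value is not rounded to two or three decimals, results may differ slightly from hand calculations.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_bound(xbar, sigma, n, confidence, upper=True):
    """One-sided confidence bound for the mean of a normal population with known sigma.

    `confidence` is the confidence coefficient 1 - alpha (e.g. 0.95).
    """
    # z_alpha: the point with area 1 - alpha to its left under N(0, 1).
    z_alpha = NormalDist().inv_cdf(confidence)
    margin = z_alpha * sigma / sqrt(n)
    return xbar + margin if upper else xbar - margin


# 99% upper bound: xbar = 450, sigma = 2, n = 16.
upper_99 = one_sided_bound(450, 2, 16, 0.99, upper=True)
# 95% lower bound: xbar = 125, sigma = 1.5, n = 36.
lower_95 = one_sided_bound(125, 1.5, 36, 0.95, upper=False)

print("99% upper bound:", round(upper_99, 3))
print("95% lower bound:", round(lower_95, 3))
```

For the 99% upper bound the function returns a value of about 451.16, which agrees with the hand calculation up to the rounding of the critical value to 2.33.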
General derivation of confidence interval

Let us construct a confidence interval for the population parameter $\Phi$ on the basis of a sample of $n$ observations $X_1, \ldots, X_n$. Let $h(X_1, \ldots, X_n)$ denote a random variable which satisfies the following properties:

(i) The variable $h(X_1, \ldots, X_n)$ is a function of both $X_1, \ldots, X_n$ and $\Phi$.
(ii) The probability distribution of the variable $h(X_1, \ldots, X_n)$ does not depend on $\Phi$ or any other unknown parameter.

In the construction of the confidence interval for the population mean from a normal population with a known standard deviation $\sigma$, $\Phi = \mu$ and $h(X_1, \ldots, X_n)$ is $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, which satisfies both properties. It functionally depends on $\mu$, but its probability distribution is $N(0,1)$, independent of $\mu$. In general, the form of the distribution depends on the sampling distribution of the corresponding sample statistic.

Now, for any $\alpha$ lying between 0 and 1, we can find constants $a$ and $b$ such that

$$P\left(a < h(X_1, \ldots, X_n) < b\right) = 1 - \alpha$$

In the construction of the confidence interval for the population mean from a normal population with a known standard deviation $\sigma$, $a = -z_{\alpha/2}$ and $b = z_{\alpha/2}$.

The inequalities can then be manipulated to yield the following random interval:

$$P\left[l(X_1, \ldots, X_n) < \Phi < u(X_1, \ldots, X_n)\right] = 1 - \alpha$$

Once we substitute sample values in this, we get the $100(1-\alpha)\%$ confidence interval for $\Phi$:

$$\left[l(x_1, \ldots, x_n),\ u(x_1, \ldots, x_n)\right]$$

In the case of the population mean,

$l(x_1, \ldots, x_n) = \bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n}$ and $u(x_1, \ldots, x_n) = \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$.

References
1. J.E. Freund: Mathematical Statistics
2. J.L. Devore: Probability and Statistics for Engineering and the Sciences

Practice Questions

Q.1. The life of a 100 W light bulb, measured in hours, is normally distributed with standard deviation $\sigma = 25$ hours. If a sample of 16 bulbs gave a mean life of 1500 hours, construct a 99% confidence interval for the population mean.

Q.2.
A random sample of 100 observations selected from a normal population gave a mean $\bar{x} = 143.72$. The standard deviation for the population is given as $\sigma = 14.8$.
(i) Construct (a) a 99% confidence interval for $\mu$, (b) a 95% confidence interval for $\mu$, (c) a 90% confidence interval for $\mu$.
(ii) Does the width of the confidence intervals decrease as the confidence level decreases? Explain.

Q.3. A public health official wanted to know how often university students visit their health centre due to illness. The official took a random sample of 100 students and found an average of 2.3 visits per student per year. The population is known to be normally distributed with a standard deviation of 0.4. Make a 97% confidence interval for the true population mean number of visits per student per year.

Q.4. A random sample is selected from a normal population which has a standard deviation of 7.14. The sample mean was found to be 48.52.
(i) Make a 95% confidence interval for $\mu$ assuming (a) n = 225, (b) n = 100, (c) n = 64.
(ii) Does the width of the confidence intervals increase as the sample size decreases? Explain.

Q.5. To study internet penetration in India, it is desired to estimate the average number of hours that teenagers spend on the internet per week. Assuming the sample is drawn from a normal population with standard deviation $\sigma = 3.0$ hours, how large must the sample be so that it is possible to assert with 95% confidence that the maximum error of estimation is 20 minutes?

Q.6. Construct a 94% one-sided upper bound given the following information. The sample mean of 25 observations is 56.4. The population from which the sample is drawn is normal with $\sigma = 2.5$.

Q.7. What is the confidence level for the following one-sided confidence limits? (i) upper limit = (ii) lower limit =

Q.8.
The following confidence interval for the population mean was reported by a researcher on the basis of a random sample: (250.5, 260.5). Find (i) the sample mean and (ii) the confidence level, if the population was normally distributed with and the sample size was 36.

Q.9. A random sample of size n was drawn from a normal population distribution with a given standard deviation $\sigma$, and the following two-sided confidence intervals for $\mu$ were constructed. Find the confidence level for each of the two confidence intervals for $\mu$: (i) ( ± (ii) ( ±

Q.10. Explain the relationship between the width of a confidence interval and the sample size. Construct a 95% confidence interval for the population mean if a random sample of 25 observations is taken from a normal population with $\sigma = 5$. What is the width of the confidence interval? If the width has to be halved, by how much should we increase the sample size, other things remaining the same?
