Journal of Reliability and Statistical Studies [ISSN: 0974-8024 (Print), 2229-5666 (Online)]
Vol. 3, Issue 2 (2010): 93-101

ON CLASSES OF UNBIASED SAMPLING STRATEGIES

Shashi Bhushan (1) and Shubhra Katara (2)
1. Department of Statistics, PUC Campus, Mizoram University, Aizawl, Mizoram, India. E-mail: [email protected]
2. Department of Statistics, Bareilly College, Bareilly, U.P., India. E-mail: [email protected]

Abstract
In this paper, we propose two classes of sampling strategies based on the modified ratio estimator of Singh (2003), which uses the standard deviation and the coefficient of skewness of the auxiliary variable, for estimating the population mean (total) of the study variable in a finite population. The properties of the proposed sampling strategies are studied and some concluding remarks are given. An empirical study is included as an illustration.

Keywords: Ratio estimator, Variance, Unbiasedness, Mean square error.

1. Introduction
The use of information on an auxiliary variable for increasing the efficiency of a sampling strategy is well established in sampling from finite populations. Many transformed estimation procedures are available in the literature for increasing efficiency but, unfortunately, this is almost always achieved at the cost of unbiasedness. There are numerous instances where the bias of these estimation procedures is very large relative to their mean square error, so they may not be advisable. The aim of this paper is to introduce sampling strategies for a finite population that are both efficient and unbiased. Further, it may be of interest to note that very few works in the sampling literature focus on devising sampling strategies that are better in terms of both efficiency and unbiasedness. This may be done through innovations in both aspects of a sampling strategy, viz. the sampling scheme and the estimation procedure.
In this paper, we use prior information about the coefficient of skewness and the standard deviation of the auxiliary variable at both stages, estimation and sampling scheme, so that the accuracy of the sampling strategy is improved. Recently, several authors, including Singh (2003), Senapati et al. (2006) and Singh et al. (2005) among others, have attempted to improve existing sampling strategies by using information on an auxiliary variable. The use of prior values of the coefficient of skewness and the standard deviation in estimating the population mean of the study characteristic $y$ was made by Singh (2003), who noted that such prior information leads to more efficient estimation of the population mean (total) of the study variable.

The ratio estimator for estimating the population mean of the study variable $y$ is given by
$$\bar{y}_R = \bar{y}\,\frac{\bar{X}}{\bar{x}} = \hat{R}\,\bar{X} \qquad (1.1)$$
where $\hat{R} = \bar{y}/\bar{x}$, $\bar{y}$ is the sample mean of the study variable $y$, $\bar{x}$ is the sample mean of the auxiliary variable $x$, and $\bar{X}$ is the population mean of $x$, which is assumed to be known. If the population standard deviation of $x$, denoted by $\sigma_x$, and the population coefficient of skewness, denoted by $\beta_{1x}$, are known, then Singh (2003) proposed a modified product estimator for estimating the population mean $\bar{Y}$ of the study variable, given by
$$\bar{y}_{SP} = \bar{y}\,\frac{\bar{x}\beta_{1x} + \sigma_x}{\bar{X}\beta_{1x} + \sigma_x} \qquad (1.2)$$
The bias and mean square error of $\bar{y}_{SP}$ under simple random sampling are given by
$$\mathrm{Bias}(\bar{y}_{SP}) = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{Y}\rho\,\nu\,C_x C_y \qquad (1.3)$$
and
$$\mathrm{MSE}(\bar{y}_{SP}) = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{Y}^2\{C_y^2 + \nu C_x^2(\nu + 2K)\} \qquad (1.4)$$
where $C_y$ and $C_x$ are the population coefficients of variation of $y$ and $x$, $\nu = \bar{X}\beta_{1x}/(\bar{X}\beta_{1x} + \sigma_x)$, $K = \rho\,C_y/C_x$, and $\rho$ is the population correlation coefficient between $x$ and $y$.
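The constants $\nu$ and $K$ recur throughout the paper, so it is worth making them concrete. A minimal helper in Python (function and argument names are ours, not from the paper):

```python
def nu_K(X_bar, beta1x, sigma_x, Cy, Cx, rho):
    """Constants appearing in (1.3)-(1.4):
    nu = X_bar*beta1x / (X_bar*beta1x + sigma_x) and K = rho*Cy/Cx."""
    nu = X_bar * beta1x / (X_bar * beta1x + sigma_x)
    K = rho * Cy / Cx
    return nu, K
```

Note that $0 < \nu < 1$ whenever $\sigma_x > 0$, which is what makes the modified estimators interpolate between the plain mean and the classical ratio/product forms.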
Similar to (1.2), a modified ratio estimator can be given by
$$\bar{y}_{SR} = \bar{y}\,\frac{\bar{X}\beta_{1x} + \sigma_x}{\bar{x}\beta_{1x} + \sigma_x}$$
having bias and mean square error
$$\mathrm{Bias}(\bar{y}_{SR}) = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{Y}(\nu^2 C_x^2 - \rho\,\nu\,C_x C_y)$$
and
$$\mathrm{MSE}(\bar{y}_{SR}) = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{Y}^2\{C_y^2 + \nu C_x^2(\nu - 2K)\},$$
respectively. In this paper, the Singh (2003) estimator is modified along the lines of Walsh (1970) and Reddy (1973). We propose two classes of unbiased sampling strategies whose estimator of the population mean $\bar{Y}$ is
$$\bar{y}_{SA} = \bar{y}\,\frac{\bar{X}\beta_{1x} + \sigma_x}{A(\bar{x}\beta_{1x} + \sigma_x) + (1 - A)(\bar{X}\beta_{1x} + \sigma_x)} \qquad (1.5)$$
Note that for $A = 1$ the proposed estimator reduces to $\bar{y}_{SR}$. We now consider this estimator under the following sampling schemes:
(i) simple random sampling without replacement along with the jack-knife technique, denoting the resulting estimator by $\hat{\bar{Y}}^*_{SS}$;
(ii) a Midzuno (1952) - Lahiri (1951) - Sen (1952) type sampling scheme, denoting the resulting estimator by $\bar{y}_{SM}$.
Both sampling strategies aim at classes of sampling strategies that improve on the existing ones in the sense of unbiasedness and smaller mean square error.

2. Bias and MSE under SRSWOR
Consider the estimator $\bar{y}_{SA}$ under simple random sampling without replacement and denote it by $\bar{y}_{SS}$.
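Before deriving its properties, note that the class estimator (1.5) is direct to compute once $\bar{X}$, $\beta_{1x}$ and $\sigma_x$ are known. A minimal Python sketch (function and argument names are ours, not from the paper):

```python
def y_sa(y_bar, x_bar, X_bar, beta1x, sigma_x, A):
    """Class estimator (1.5): reduces to the modified ratio estimator
    y_SR when A = 1 and to the plain sample mean when A = 0."""
    t = X_bar * beta1x + sigma_x
    return y_bar * t / (A * (x_bar * beta1x + sigma_x) + (1 - A) * t)
```

The characterizing scalar `A` interpolates between the unadjusted mean and the modified ratio estimator; its optimizing value is derived below.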
Let $\bar{y} = \bar{Y} + e_0$ and $\bar{x} = \bar{X} + e_1$ such that $E(e_0) = E(e_1) = 0$. Substituting these values in (1.5) and using a Taylor series expansion, we have
$$\bar{y}_{SS} - \bar{Y} = e_0 - \frac{A\beta_{1x}\bar{Y}e_1}{\bar{X}\beta_{1x} + \sigma_x} + \frac{A^2\beta_{1x}^2\bar{Y}e_1^2}{(\bar{X}\beta_{1x} + \sigma_x)^2} - \frac{A\beta_{1x}e_0 e_1}{\bar{X}\beta_{1x} + \sigma_x} + \cdots \qquad (2.1)$$
Taking expectations on both sides of (2.1), we have, up to the first order of approximation,
$$\mathrm{Bias}(\bar{y}_{SS}) = E(\bar{y}_{SS}) - \bar{Y} = \bar{Y}\,\frac{A^2\beta_{1x}^2}{(\bar{X}\beta_{1x} + \sigma_x)^2}E(e_1^2) - \frac{A\beta_{1x}}{\bar{X}\beta_{1x} + \sigma_x}E(e_0 e_1) \qquad (2.2)$$
Since
$$E(e_0^2) = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{Y}^2 C_y^2,\qquad E(e_1^2) = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{X}^2 C_x^2,\qquad E(e_0 e_1) = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{X}\bar{Y}\rho C_x C_y \qquad (2.3)$$
we obtain
$$\mathrm{Bias}(\bar{y}_{SS}) = \bar{Y}\Big(\frac{1}{n} - \frac{1}{N}\Big)\{A^2\nu^2 C_x^2 - A\nu\rho C_x C_y\} \qquad (2.4)$$
Now, for the mean square error, retaining (2.1) to the first order of approximation, we get
$$\mathrm{MSE}(\bar{y}_{SS}) = E(\bar{y}_{SS} - \bar{Y})^2 = E\Big(e_0 - \frac{A\beta_{1x}\bar{Y}e_1}{\bar{X}\beta_{1x} + \sigma_x}\Big)^2 = E(e_0^2) + \frac{A^2\beta_{1x}^2\bar{Y}^2}{(\bar{X}\beta_{1x} + \sigma_x)^2}E(e_1^2) - \frac{2A\beta_{1x}\bar{Y}}{\bar{X}\beta_{1x} + \sigma_x}E(e_0 e_1)$$
$$= \bar{Y}^2\Big(\frac{1}{n} - \frac{1}{N}\Big)\{C_y^2 + \nu C_x^2(A^2\nu - 2AK)\} \qquad (2.5)$$
Note that putting $A = 1$ in (2.4) and (2.5) recovers the expressions for the bias and mean square error of $\bar{y}_{SR}$. The optimizing value of the characterizing scalar $A$ is given by
$$A = \frac{K}{\nu} = A_{\mathrm{opt}} \text{ (say)} \qquad (2.6)$$
The minimum mean square error at the optimizing value $A = A_{\mathrm{opt}}$ is
$$\mathrm{MSE}(\bar{y}_{SS})_{\min} = \bar{Y}^2\Big(\frac{1}{n} - \frac{1}{N}\Big)(1 - \rho^2)C_y^2 \qquad (2.7)$$
which is the same as the mean square error of the linear regression estimator. Also note that $\mathrm{Bias}(\bar{y}_{SS}) = 0$ at the optimizing value of $A$.

3. Unbiasedness and MSE of the Jack-Knife Sampling Strategy
Let us now apply Quenouille's (1956) method of the jack-knife, in which the sample of size $n = 2m$ from a population of size $N$ is split at random into two sub-samples of size $m$ each. For further details one may refer to Gray and Schucany (1972).
Let us define
$$\bar{y}_{SS}^{(i)} = \bar{y}_i\,\frac{\bar{X}\beta_{1x} + \sigma_x}{A(\bar{x}_i\beta_{1x} + \sigma_x) + (1 - A)(\bar{X}\beta_{1x} + \sigma_x)},\quad i = 1, 2$$
$$\bar{y}_{SS}^{(3)} = \bar{y}_s\,\frac{\bar{X}\beta_{1x} + \sigma_x}{A(\bar{x}_s\beta_{1x} + \sigma_x) + (1 - A)(\bar{X}\beta_{1x} + \sigma_x)} \qquad (3.1)$$
where $A$ is the characterizing scalar to be chosen suitably, $s = s_1 + s_2$ with $s_1$ and $s_2$ the two sub-samples of size $m$ each and $+$ denoting disjoint union. Here $\bar{y}_1$, $\bar{y}_2$ and $\bar{y}_s$ denote the sample means of the characteristic $y$ based on the two sub-samples of size $m$ and on the entire sample of size $n = 2m$, and $\bar{x}_1$, $\bar{x}_2$ and $\bar{x}_s$ denote the corresponding sample means of the characteristic $x$. It can easily be seen that
$$\mathrm{Bias}(\bar{y}_{SS}^{(i)}) = \bar{Y}\Big(\frac{1}{m} - \frac{1}{N}\Big)\{A^2\nu^2 C_x^2 - A\nu\rho C_x C_y\},\quad i = 1, 2$$
$$\mathrm{Bias}(\bar{y}_{SS}^{(3)}) = \bar{Y}\Big(\frac{1}{2m} - \frac{1}{N}\Big)\{A^2\nu^2 C_x^2 - A\nu\rho C_x C_y\} = B_1 \text{ (say)} \qquad (3.2)$$
Let us define $\hat{\bar{Y}}'_{SS} = \dfrac{\bar{y}_{SS}^{(1)} + \bar{y}_{SS}^{(2)}}{2}$ as an alternative estimator of the population mean $\bar{Y}$. The bias of $\hat{\bar{Y}}'_{SS}$ is
$$\mathrm{Bias}(\hat{\bar{Y}}'_{SS}) = \bar{Y}\Big(\frac{1}{m} - \frac{1}{N}\Big)\{A^2\nu^2 C_x^2 - A\nu\rho C_x C_y\} = B_2 \text{ (say)} \qquad (3.3)$$
We propose the jack-knife estimator $\hat{\bar{Y}}^*_{SS}$ for estimating the population mean $\bar{Y}$ given by
$$\hat{\bar{Y}}^*_{SS} = \frac{\bar{y}_{SS}^{(3)} - R\,\hat{\bar{Y}}'_{SS}}{1 - R},\qquad\text{where}\quad R = \frac{B_1}{B_2} = \frac{\frac{1}{2m} - \frac{1}{N}}{\frac{1}{m} - \frac{1}{N}} = \frac{N - 2m}{2(N - m)} \qquad (3.4)$$
Taking expectations in (3.4) and using (3.2) and (3.3), we obtain $E(\hat{\bar{Y}}^*_{SS}) = \bar{Y}$, showing that $\hat{\bar{Y}}^*_{SS}$ is an unbiased estimator of the population mean $\bar{Y}$ to the first order of approximation.
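The construction in (3.1)-(3.4) is a fixed linear combination of the full-sample estimator and the average of the two half-sample estimators, with the weight $R$ chosen so that the first-order biases cancel. A hedged Python sketch (helper names are ours):

```python
def jackknife_estimate(y_ss_1, y_ss_2, y_ss_3, m, N):
    """Bias-corrected jack-knife estimator (3.4).
    The weight R = B1/B2 = (N - 2m) / (2*(N - m)), since B1 carries the
    factor (1/(2m) - 1/N) and B2 the factor (1/m - 1/N)."""
    R = (N - 2 * m) / (2.0 * (N - m))
    y_prime = 0.5 * (y_ss_1 + y_ss_2)      # the alternative estimator Y'_SS
    return (y_ss_3 - R * y_prime) / (1.0 - R)
```

If the half-sample estimators each carry bias $B_2$ and the full-sample estimator carries bias $B_1 = R B_2$, the combination returns the target value exactly, which is the point of (3.4).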
Consider
$$\mathrm{MSE}(\hat{\bar{Y}}^*_{SS}) = E\Big(\frac{\bar{y}_{SS}^{(3)} - R\,\hat{\bar{Y}}'_{SS}}{1 - R} - \bar{Y}\Big)^2 = \frac{1}{(1 - R)^2}\,E\big\{(\bar{y}_{SS}^{(3)} - \bar{Y}) - R(\hat{\bar{Y}}'_{SS} - \bar{Y})\big\}^2$$
$$= \frac{1}{(1 - R)^2}\Big[E(\bar{y}_{SS}^{(3)} - \bar{Y})^2 + R^2 E(\hat{\bar{Y}}'_{SS} - \bar{Y})^2 - 2R\,E\big\{(\bar{y}_{SS}^{(3)} - \bar{Y})(\hat{\bar{Y}}'_{SS} - \bar{Y})\big\}\Big] \qquad (3.5)$$
Also,
$$E(\bar{y}_{SS}^{(3)} - \bar{Y})^2 = \mathrm{MSE}(\bar{y}_{SS}^{(3)}) = \bar{Y}^2\Big(\frac{1}{2m} - \frac{1}{N}\Big)\{C_y^2 + \nu C_x^2(A^2\nu - 2AK)\} \qquad (3.6)$$
Further,
$$E(\hat{\bar{Y}}'_{SS} - \bar{Y})^2 = E\Big(\frac{\bar{y}_{SS}^{(1)} + \bar{y}_{SS}^{(2)}}{2} - \bar{Y}\Big)^2 = \frac{1}{4}\Big[E(\bar{y}_{SS}^{(1)} - \bar{Y})^2 + E(\bar{y}_{SS}^{(2)} - \bar{Y})^2 + 2E\{(\bar{y}_{SS}^{(1)} - \bar{Y})(\bar{y}_{SS}^{(2)} - \bar{Y})\}\Big] \qquad (3.7)$$
Since
$$E(\bar{y}_{SS}^{(i)} - \bar{Y})^2 = \mathrm{MSE}(\bar{y}_{SS}^{(i)}) = \bar{Y}^2\Big(\frac{1}{m} - \frac{1}{N}\Big)\{C_y^2 + \nu C_x^2(A^2\nu - 2AK)\},\quad i = 1, 2 \qquad (3.8)$$
let $\bar{y}_i = \bar{Y} + e_0^{(i)}$ and $\bar{x}_i = \bar{X} + e_1^{(i)}$ such that $E(e_0^{(i)}) = E(e_1^{(i)}) = 0$, $i = 1, 2$, and consider
$$E\{(\bar{y}_{SS}^{(1)} - \bar{Y})(\bar{y}_{SS}^{(2)} - \bar{Y})\} = E\Big\{\Big(e_0^{(1)} - \frac{A\beta_{1x}\bar{Y}e_1^{(1)}}{\bar{X}\beta_{1x} + \sigma_x}\Big)\Big(e_0^{(2)} - \frac{A\beta_{1x}\bar{Y}e_1^{(2)}}{\bar{X}\beta_{1x} + \sigma_x}\Big)\Big\}$$
$$= E(e_0^{(1)}e_0^{(2)}) - \frac{A\beta_{1x}\bar{Y}}{\bar{X}\beta_{1x} + \sigma_x}\big\{E(e_0^{(2)}e_1^{(1)}) + E(e_0^{(1)}e_1^{(2)})\big\} + \frac{A^2\beta_{1x}^2\bar{Y}^2}{(\bar{X}\beta_{1x} + \sigma_x)^2}E(e_1^{(1)}e_1^{(2)})$$
Substituting the results given in Sukhatme and Sukhatme (1997),
$$E(e_0^{(1)}e_0^{(2)}) = -\frac{1}{N}\bar{Y}^2 C_y^2,\qquad E(e_1^{(1)}e_1^{(2)}) = -\frac{1}{N}\bar{X}^2 C_x^2,\qquad E(e_0^{(1)}e_1^{(2)}) = E(e_0^{(2)}e_1^{(1)}) = -\frac{1}{N}\bar{X}\bar{Y}\rho C_x C_y,$$
we have
$$E\{(\bar{y}_{SS}^{(1)} - \bar{Y})(\bar{y}_{SS}^{(2)} - \bar{Y})\} = -\frac{1}{N}\bar{Y}^2\{C_y^2 + A^2\nu^2 C_x^2 - 2A\nu\rho C_x C_y\} = -\frac{1}{N}\bar{Y}^2\{C_y^2 + \nu C_x^2(A^2\nu - 2AK)\} \qquad (3.9)$$
Putting the values from (3.8) and (3.9) into (3.7), we have
$$E(\hat{\bar{Y}}'_{SS} - \bar{Y})^2 = \bar{Y}^2\,\frac{1}{4}\Big[\frac{2}{m} - \frac{2}{N} - \frac{2}{N}\Big]\{C_y^2 + \nu C_x^2(A^2\nu - 2AK)\} = \bar{Y}^2\Big(\frac{1}{2m} - \frac{1}{N}\Big)\{C_y^2 + \nu C_x^2(A^2\nu - 2AK)\} \qquad (3.10)$$
Now consider
$$E\{(\bar{y}_{SS}^{(3)} - \bar{Y})(\hat{\bar{Y}}'_{SS} - \bar{Y})\} = E\Big\{(\bar{y}_{SS}^{(3)} - \bar{Y})\Big(\frac{\bar{y}_{SS}^{(1)} + \bar{y}_{SS}^{(2)}}{2} - \bar{Y}\Big)\Big\} = \frac{1}{2}\Big[E\{(\bar{y}_{SS}^{(3)} - \bar{Y})(\bar{y}_{SS}^{(1)} - \bar{Y})\} + E\{(\bar{y}_{SS}^{(3)} - \bar{Y})(\bar{y}_{SS}^{(2)} - \bar{Y})\}\Big]$$
Since, for $i = 1, 2$,
$$E\{(\bar{y}_{SS}^{(3)} - \bar{Y})(\bar{y}_{SS}^{(i)} - \bar{Y})\} = E\Big\{\Big(e_0 - \frac{A\beta_{1x}\bar{Y}e_1}{\bar{X}\beta_{1x} + \sigma_x}\Big)\Big(e_0^{(i)} - \frac{A\beta_{1x}\bar{Y}e_1^{(i)}}{\bar{X}\beta_{1x} + \sigma_x}\Big)\Big\}$$
$$= E(e_0 e_0^{(i)}) - \frac{A\beta_{1x}\bar{Y}}{\bar{X}\beta_{1x} + \sigma_x}\big\{E(e_0^{(i)}e_1) + E(e_0 e_1^{(i)})\big\} + \frac{A^2\beta_{1x}^2\bar{Y}^2}{(\bar{X}\beta_{1x} + \sigma_x)^2}E(e_1 e_1^{(i)}),$$
using the following results given in Sukhatme and Sukhatme (1997),
$$E(e_0 e_0^{(i)}) = \Big(\frac{1}{2m} - \frac{1}{N}\Big)\bar{Y}^2 C_y^2,\qquad E(e_1 e_1^{(i)}) = \Big(\frac{1}{2m} - \frac{1}{N}\Big)\bar{X}^2 C_x^2,\qquad E(e_0 e_1^{(i)}) = E(e_0^{(i)}e_1) = \Big(\frac{1}{2m} - \frac{1}{N}\Big)\bar{X}\bar{Y}\rho C_x C_y,$$
we have
$$E\{(\bar{y}_{SS}^{(3)} - \bar{Y})(\bar{y}_{SS}^{(i)} - \bar{Y})\} = \bar{Y}^2\Big(\frac{1}{2m} - \frac{1}{N}\Big)\{C_y^2 + \nu C_x^2(A^2\nu - 2AK)\} \qquad (3.11)$$
Putting the values from (3.6), (3.10) and (3.11) into (3.5), we have
$$\mathrm{MSE}(\hat{\bar{Y}}^*_{SS}) = \frac{1 + R^2 - 2R}{(1 - R)^2}\,\bar{Y}^2\Big(\frac{1}{2m} - \frac{1}{N}\Big)\{C_y^2 + \nu C_x^2(A^2\nu - 2AK)\} = \bar{Y}^2\Big(\frac{1}{n} - \frac{1}{N}\Big)\{C_y^2 + \nu C_x^2(A^2\nu - 2AK)\} \qquad (3.12)$$
which is equal to the mean square error of $\bar{y}_{SS}$. Therefore the optimizing value of the characterizing scalar $A$ is given by (2.6) and the minimum mean square error at the optimizing value $A = A_{\mathrm{opt}}$ is given by (2.7).

4. Unbiasedness and MSE of the Midzuno-Lahiri-Sen Type Sampling Strategy
Let us consider $\bar{y}_{SA}$ under a Midzuno (1952)-Lahiri (1951)-Sen (1952) type sampling scheme and denote it by $\bar{y}_{SM}$. The proposed Midzuno-Lahiri-Sen type scheme for selecting a sample $s$ of size $n$ selects the first unit with probability proportional to $\bar{X}\beta_{1x} + \sigma_x + A\beta_{1x}(x_i - \bar{X})$, where $x_i$ is the size of the first selected unit, such that
$$P(i) = P(\text{selecting first unit } i \text{ with size } x_i) = \frac{\bar{X}\beta_{1x} + \sigma_x + A\beta_{1x}(x_i - \bar{X})}{N(\bar{X}\beta_{1x} + \sigma_x)} \qquad (4.1)$$
and selects the remaining $n - 1$ units in the sample from the remaining $N - 1$ units of the population by simple random sampling without replacement. Thus the probability of selecting the sample $s$ of size $n$ is
$$P(s) = \frac{\bar{X}\beta_{1x} + \sigma_x + A\beta_{1x}(\bar{x}_s - \bar{X})}{(\bar{X}\beta_{1x} + \sigma_x)\binom{N}{n}} \qquad (4.2)$$
where $\bar{x}_s$ and $\bar{y}_s$ are the sample means of $x$ and $y$, respectively, based on the sample $s$. Consider
$$E(\bar{y}_{SM}) = E\Big\{\frac{\bar{y}_s(\bar{X}\beta_{1x} + \sigma_x)}{\bar{X}\beta_{1x} + \sigma_x + A\beta_{1x}(\bar{x}_s - \bar{X})}\Big\} = \sum_{s=1}^{\binom{N}{n}}\frac{\bar{y}_s(\bar{X}\beta_{1x} + \sigma_x)}{\bar{X}\beta_{1x} + \sigma_x + A\beta_{1x}(\bar{x}_s - \bar{X})}\,P(s) = \sum_{s=1}^{\binom{N}{n}}\frac{\bar{y}_s}{\binom{N}{n}} = E(\bar{y}_s) = \bar{Y} \qquad (4.3)$$
(the last expectation being under simple random sampling without replacement), showing that $\bar{y}_{SM}$ is an unbiased estimator of the population mean $\bar{Y}$ for all values of $A$ under the proposed Midzuno-Lahiri-Sen type sampling scheme. Now, since $\mathrm{MSE}(\bar{y}_{SM}) = V(\bar{y}_{SM}) = E(\bar{y}_{SM}^2) - \bar{Y}^2$, where
$$E(\bar{y}_{SM}^2) = \sum_{s=1}^{\binom{N}{n}}\bar{y}_{SM}^2\,P(s) = \sum_{s=1}^{\binom{N}{n}}\Big\{\frac{\bar{y}_s(\bar{X}\beta_{1x} + \sigma_x)}{\bar{X}\beta_{1x} + \sigma_x + A\beta_{1x}(\bar{x}_s - \bar{X})}\Big\}^2 P(s) = \sum_{s=1}^{\binom{N}{n}}\frac{\bar{y}_s^2(\bar{X}\beta_{1x} + \sigma_x)}{\bar{X}\beta_{1x} + \sigma_x + A\beta_{1x}(\bar{x}_s - \bar{X})}\cdot\frac{1}{\binom{N}{n}}$$
$$= E\Big\{(\bar{Y} + e_0)^2\Big(1 + \frac{A\beta_{1x}e_1}{\bar{X}\beta_{1x} + \sigma_x}\Big)^{-1}\Big\} = E\Big\{(\bar{Y}^2 + e_0^2 + 2\bar{Y}e_0)\Big(1 - \frac{A\beta_{1x}e_1}{\bar{X}\beta_{1x} + \sigma_x} + \frac{A^2\beta_{1x}^2 e_1^2}{(\bar{X}\beta_{1x} + \sigma_x)^2} - \cdots\Big)\Big\}$$
$$= \bar{Y}^2 + E(e_0^2) + \frac{A^2\beta_{1x}^2\bar{Y}^2}{(\bar{X}\beta_{1x} + \sigma_x)^2}E(e_1^2) - \frac{2A\beta_{1x}\bar{Y}}{\bar{X}\beta_{1x} + \sigma_x}E(e_0 e_1) \quad\text{(up to the first order of approximation)},$$
therefore, by using (2.3), we have
$$\mathrm{MSE}(\bar{y}_{SM}) = \bar{Y}^2\Big(\frac{1}{n} - \frac{1}{N}\Big)\{C_y^2 + \nu C_x^2(A^2\nu - 2AK)\} = \mathrm{MSE}(\bar{y}_{SS}) = \mathrm{MSE}(\hat{\bar{Y}}^*_{SS}) = \mathrm{MSE}(\bar{y}_{SA}) \text{ (say)} \qquad (4.4)$$
Thus the optimizing value of the characterizing scalar $A$ is given by (2.6) and the minimum mean square error at the optimizing value $A = A_{\mathrm{opt}}$ is given by (2.7). Further, let this minimum mean square error be denoted by $\mathrm{MSE}(\bar{y}_{SA})_{\min}$.
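Because (4.2) gives a closed form for $P(s)$, the unbiasedness claim (4.3) can be verified by exact enumeration on a toy population. A sketch under the assumption that $A\beta_{1x}(\bar{x}_s - \bar{X})$ never makes a size measure negative (all names are ours, not from the paper):

```python
from itertools import combinations
from math import comb

def mls_check(x, y, beta1x, sigma_x, A, n):
    """Enumerate every sample s of size n, compute P(s) from (4.2) and the
    estimator y_SM, and return (total probability, E[y_SM], Y_bar)."""
    N = len(x)
    X_bar = sum(x) / N
    t = X_bar * beta1x + sigma_x          # X_bar*beta1x + sigma_x
    total_p, e_ysm = 0.0, 0.0
    for s in combinations(range(N), n):
        x_s = sum(x[i] for i in s) / n
        y_s = sum(y[i] for i in s) / n
        p = (t + A * beta1x * (x_s - X_bar)) / (t * comb(N, n))
        y_sm = y_s * t / (A * (x_s * beta1x + sigma_x) + (1 - A) * t)
        total_p += p
        e_ysm += y_sm * p
    return total_p, e_ysm, sum(y) / N
```

The denominator of `y_sm` simplifies to $t + A\beta_{1x}(\bar{x}_s - \bar{X})$, exactly the numerator of $P(s)$, which is why the product $\bar{y}_{SM}\,P(s)$ collapses to $\bar{y}_s/\binom{N}{n}$ and the expectation returns $\bar{Y}$ for any $A$.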
5. Concluding Remarks
If the optimizing value $A = K/\nu = A_{\mathrm{opt}}$ is known, then we have
$$\mathrm{MSE}(\bar{y}) - \mathrm{MSE}(\bar{y}_{SA})_{\min} = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{Y}^2\rho^2 C_y^2 \ge 0 \qquad (5.1)$$
$$\mathrm{MSE}(\bar{y}_R) - \mathrm{MSE}(\bar{y}_{SA})_{\min} = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{Y}^2(C_x - \rho C_y)^2 \ge 0 \qquad (5.2)$$
$$\mathrm{MSE}(\bar{y}_P) - \mathrm{MSE}(\bar{y}_{SA})_{\min} = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{Y}^2(C_x + \rho C_y)^2 \ge 0 \qquad (5.3)$$
$$\mathrm{MSE}(\bar{y}_{SR}) - \mathrm{MSE}(\bar{y}_{SA})_{\min} = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{Y}^2(\nu C_x - \rho C_y)^2 \ge 0 \qquad (5.4)$$
$$\mathrm{MSE}(\bar{y}_{SP}) - \mathrm{MSE}(\bar{y}_{SA})_{\min} = \Big(\frac{1}{n} - \frac{1}{N}\Big)\bar{Y}^2(\nu C_x + \rho C_y)^2 \ge 0 \qquad (5.5)$$
Hence, at the optimizing value of the characterizing scalar $A = K/\nu = A_{\mathrm{opt}}$, the proposed sampling strategies are always better than $\bar{y}$, $\bar{y}_{SR}$, $\bar{y}_{SP}$, $\bar{y}_P$, $\bar{y}_R$ and $\bar{y}_{lr}$ in the sense of unbiasedness together with gain in efficiency.

6. Empirical Study
Let us consider the example given by Singh and Chaudhary (1986), for which the following values were obtained:
$\bar{Y} = 1467.545$, $\bar{X} = 22.62$, $C_x = 1.460904$, $C_y = 1.745871$, $\beta_{1x} = 3.307547$, $\sigma_x = 32.28717$, $\rho = 0.902147$.
The bias, mean square error and percent relative efficiency (PRE) with respect to $\bar{y}$ of the sample mean $\bar{y}$, the ratio estimator $\bar{y}_R$, the product estimator $\bar{y}_P$, and of $\bar{y}_{lr}$, $\bar{y}_{SR}$ and $\bar{y}_{SP}$ are given below. Note that all results are scaled by the factor $(1/n - 1/N)$.

Est. (t)           Bias (t)     MSE (t)      PRE (t : y-bar)
y-bar              0            6564590      100
y_R                -244.68      1249925      525.20
y_P                3376.77      21072230     31.15
y_lr               3867.58      1221872      537.26
y_SR               -830.47      1884104      348.42
y_SP               2358.86      15731012     41.73
y_SM / Y*_SS       0            1221872      537.26

It can easily be observed from the table that only the sample mean $\bar{y}$ and the proposed sampling strategies are unbiased. Further, $\bar{y}_{lr}$ and the proposed sampling strategies attain the minimum mean square error, but $\bar{y}_{lr}$ is biased. It is evident from this empirical study that the proposed sampling strategies are better than the remaining sampling strategies in terms of both unbiasedness and mean square error.

References
1. Gray, H. L. and Schucany, W. R. (1972). The Generalized Jack-knife Statistic, Marcel Dekker, New York.
2. Lahiri, D. B. (1951).
A method of sample selection providing unbiased ratio estimates, Bull. Int. Stat. Inst., 33, p. 133-140.
3. Midzuno, H. (1952). On the sampling system with probability proportional to the sum of the sizes, Ann. Inst. Stat. Math., 3, p. 99-107.
4. Murthy, M. N. (1967). Sampling Theory and Methods, Statistical Publishing Society, Calcutta, India.
5. Quenouille, M. H. (1956). Notes on bias in estimation, Biometrika, 43, p. 353-360.
6. Reddy, V. N. (1973). On ratio and product methods of estimation, Sankhya B, 35(3), p. 307-316.
7. Sen, A. R. (1952). Present status of probability sampling and its use in the estimation of a characteristic, Econometrica, 20, p. 103.
8. Senapati, S. C., Sahoo, L. N. and Misra, G. (2006). On a scheme of sampling of two units with inclusion probability proportional to size, Austrian Journal of Statistics, 35(4), p. 445-454.
9. Singh, G. N. (2003). On the improvement of product method of estimation in sample surveys, J. I. S. A. S., 56(3), p. 267-275.
10. Singh, H. P., Singh, R., Espejo, M. R., Pineda, M. D. and Nadarajah, S. (2005). On the efficiency of a dual to ratio-cum-product estimator in sample surveys, Mathematical Proceedings of the Royal Irish Academy, 105A(2), p. 51-56.
11. Singh, D. and Chaudhary, F. S. (1986). Theory and Analysis of Sample Survey Designs, Wiley Eastern Limited, New Delhi.
12. Sukhatme, P. V. and Sukhatme, B. V. (1997). Sampling Theory of Surveys with Applications, Piyush Publications, Delhi.
13. Walsh, J. E. (1970). Generalization of ratio estimator for population total, Sankhya A, 32, p. 99-106.
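As a closing check, the main entries of the table in Section 6 follow directly from the listed population constants. A short reproduction script (variable names are ours; the common factor $(1/n - 1/N)$ is dropped, as in the paper's table, so the MSE values match up to rounding):

```python
# Population constants from the Singh and Chaudhary (1986) example
Ybar, Xbar = 1467.545, 22.62
Cx, Cy = 1.460904, 1.745871
beta1x, sigma_x, rho = 3.307547, 32.28717, 0.902147

nu = Xbar * beta1x / (Xbar * beta1x + sigma_x)
K = rho * Cy / Cx
A_opt = K / nu                                    # optimizing scalar (2.6)

mse_mean = Ybar**2 * Cy**2                        # sample mean: ~6564590
mse_min = Ybar**2 * (1 - rho**2) * Cy**2          # bound (2.7): ~1221872
mse_sr = Ybar**2 * (Cy**2 + nu * Cx**2 * (nu - 2 * K))   # y_SR: ~1884104
pre_min = 100 * mse_mean / mse_min                # PRE of proposed strategies
```

The proposed strategies (and $\bar{y}_{lr}$) share the minimum MSE of about 1221872, giving the tabulated PRE of roughly 537.26.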