VOL. 2, NO.7, JULY 2012 ISSN 2222-9833
ARPN Journal of Systems and Software
©2009-2012 AJSS Journal. All rights reserved
http://www.scientific-journals.org

A PPS Sampling Scheme Using Harmonic Mean of An Auxiliary Variable

1 L.N. Sahoo, 2 M. Dalabehera, 3 A.K. Mangaraj
1 Department of Statistics, Utkal University, Bhubaneswar 751004, India
2 Department of Statistics, Orissa University of Agriculture & Technology, Bhubaneswar 751003, India
3 Department of Statistics, Rajendra Junior College, Bolangir 767002, India
E-mail: {1 [email protected]}

ABSTRACT
We consider a probability proportional to size sampling scheme that uses the harmonic mean of an auxiliary variable when the correlation between the study variable and the auxiliary variable is highly negative. This is achieved by applying an inverse transformation to the auxiliary variable and then using the transformed values at the design stage of the survey operation to select a sample.

Keywords: Auxiliary variable, correlation coefficient, efficiency, finite population, inclusion probability, ppz sampling, unbiased estimator.

1. INTRODUCTION

Consider a finite population $U$ of $N$ units denoted by $U_1, U_2, \ldots, U_N$. Let $y_i$ and $x_i$ respectively be the values of the study variable $y$ and an auxiliary variable $x$ (taken as size measure) for the $i$th unit $U_i$ $(i = 1, 2, \ldots, N)$. Suppose that the $x_i$'s are known and an estimate is needed for the population mean $\bar{Y}$ of $y$ on the basis of a random sample of $n$ units drawn from $U$. In many surveys, the known $x$ values are utilized to select a sample such that the probability of selecting $U_i$, i.e., $p_i$, depends on $x_i$, and we must have $p_i > 0$ and $\sum_{i=1}^{N} p_i = 1$. Such an unequal probability sampling scheme is called a ppx (probability proportional to $x$) design. The ppx selection method will be very precise when the regression of $y$ on $x$ is linear and passes through the origin, under the assumption that $\rho$, the correlation coefficient between $y$ and $x$, is positive. But when $\rho < 0$, the conventional ppx method of estimation may lead to erroneous results.
[1] and [2] have suggested some alternative methods of estimation that use origin-shifted auxiliary variables. These alternative methods, although more efficient than the usual ppx methods, have limited application because they rest on certain restrictions on […]. In this paper, we provide a probability proportional to size (PPS) sampling method of estimation with the transformed size measure $z = 1/x$ for the case in which $\rho$ has a high negative value. The new PPS sampling method thus automatically utilizes the harmonic mean of the $x$ values.

2. SUGGESTED PPS SAMPLING METHOD OF ESTIMATION

Here the values of $z$ are used at the design stage for selecting units in the sample such that $p_i = z_i / Z$, $i = 1, 2, \ldots, N$, where $z_i = 1/x_i$ is the value of $z$ on $U_i$ and $Z = \sum_{i=1}^{N} z_i$. This type of unequal probability sampling may be called probability proportional to the values of $z$ (ppz) sampling. It may be noted here that the population mean of $z$ is $\bar{Z} = Z/N = 1/H$, where $H$ is the simple harmonic mean of the $x$ values in $U$, so that $p_i = H/(N x_i)$. Thus, in the suggested ppz sampling with replacement scheme, an unbiased estimator of $\bar{Y}$ is given by

$$\bar{y}_{ppz} = \frac{1}{nN} \sum_{i=1}^{n} \frac{y_i}{p_i} = \frac{1}{nH} \sum_{i=1}^{n} x_i y_i,$$

with variance

$$V(\bar{y}_{ppz}) = \frac{1}{n} \sum_{i=1}^{N} p_i \left( \frac{y_i}{N p_i} - \bar{Y} \right)^2 = \frac{1}{n} \left[ \frac{1}{NH} \sum_{i=1}^{N} x_i y_i^2 - \bar{Y}^2 \right]. \qquad (2.1)$$

Hence, it is clear that $V(\bar{y}_{ppz})$ is reduced to zero when $y_i \propto z_i$, i.e., when $y_i \propto 1/x_i$.

3. EFFICIENCY OF PPZ SAMPLING WITH REPLACEMENT

If the units in $U$ are selected by simple random sampling with replacement, the variance of the mean per unit estimator $\bar{y}$ is given by

$$V(\bar{y}) = \frac{\sigma_y^2}{n}, \qquad (3.1)$$

where $\sigma_y^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - \bar{Y})^2$. From (2.1) and (3.1), we have $V(\bar{y}_{ppz}) < V(\bar{y})$ provided $\frac{1}{H} \sum_{i=1}^{N} x_i y_i^2 < \sum_{i=1}^{N} y_i^2$, i.e.,

$$\sum_{i=1}^{N} y_i^2 \, \frac{H - x_i}{H} > 0. \qquad (3.2)$$

Application of this criterion is difficult in practice, because the efficiency gain of ppz sampling over simple random sampling depends greatly on properties of the population such as the composition of the units and the quality of the size measures. To get a simpler idea of the situation, we use the following super-population model studied earlier by [3]:

Super Population Model

Let $y_i = \alpha + \beta z_i + e_i$, $i = 1, 2, \ldots, N$, where $\alpha$ and $\beta$ are constants and the $e_i$'s are uncorrelated random variables such that $E(e_i \mid x_i) = 0$ and $E(e_i^2 \mid x_i) = $ […]. Under this model, after a few algebraic steps, we obtain

$V(\bar{y}_{ppz}) = $ […] (3.3)

and

$V(\bar{y}) = $ […] (3.4)

where […]. Hence, we have $V(\bar{y}) < V(\bar{y}_{ppz})$ provided

[…] (3.5)

Taking $x$ as the size measure, this condition was derived by [4], who pointed out that it may easily be satisfied if $\alpha$ is sufficiently large. Since […], the condition (3.5) is also likely to be satisfied for smaller values of $\alpha$. Thus, linearity of regression is not a sufficient condition for $\bar{y}_{ppz}$ to be better than $\bar{y}$. However, from (3.3) and (3.4) we find that if the regression line passes through the origin, i.e., if $\alpha = 0$, $\bar{y}_{ppz}$ has a better performance than $\bar{y}$. An inspection of the scatter diagram for at least some pairs of sample values may help in this respect.

4. EMPIRICAL EFFICIENCY OF DIFFERENT ESTIMATORS UNDER PPZ SAMPLING WITHOUT REPLACEMENT

Suppose that the units in $U$ are selected one by one with probability proportional to the $z$ values at each draw, without replacing the units selected at the previous draws. Under this sampling scheme we now consider the following three important estimators of $\bar{Y}$ that have been studied at length in the literature and are well known for their optimal properties.

Horvitz-Thompson Estimator [5]
It is defined as

$$\bar{y}_{HT} = \frac{1}{N} \sum_{i=1}^{n} \frac{y_i}{\pi_i},$$

where $\pi_i$ is the inclusion probability of $U_i$ in the sample.

Symmetrized Des Raj Estimator [6]
For the special case of a sample $(U_i, U_j)$ of 2 units, taken with ppz sampling without replacement, this estimator is defined by

$$\bar{y}_{SD} = \frac{1}{N (2 - p_i - p_j)} \left[ \frac{y_i}{p_i} (1 - p_j) + \frac{y_j}{p_j} (1 - p_i) \right],$$

where $p_i$ and $p_j$ denote respectively the initial probabilities of selecting $U_i$ and $U_j$.

Rao-Hartley-Cochran Estimator [7]
Split the population at random into $n$ groups of sizes $N_1, N_2, \ldots, N_n$ such that $\sum_{i=1}^{n} N_i = N$.
Then from each group select exactly one unit with probability proportional to the $z$ values of that group. The estimator is then defined by

$$\bar{y}_{RHC} = \frac{1}{N} \sum_{i=1}^{n} \frac{y_i Z_i}{z_i},$$

where $Z_i$ is the total of the $z$ values in the $i$th group and $y_i$, $z_i$ are the values for the unit selected from that group.

To avoid the difficulty of an analytical comparison of the efficiencies of the above estimators, and to get an overall idea of their relative performance, the variances of the estimators $\bar{y}_{ppz}$, $\bar{y}_{HT}$, $\bar{y}_{SD}$ and $\bar{y}_{RHC}$ are computed for $n = 2$ when the samples are drawn from the following six populations having $\rho < 0$:

Population 1 [8, p.59]: $N = $ […], $y$: quit rate per hundred in U.S. manufacturing (1960-72), $x$: unemployment rate (%), $\rho = $ […]
Population 2 [9, p.96]: $N = $ […], $y$: per capita consumption of lamb (lb), $x$: price per pound of lamb (cents), $\rho = $ […]
Population 3 [9, p.96]: $N = $ […], $y$: per capita consumption of veal (lb), $x$: price per pound of veal (cents), $\rho = $ […]
Population 4 [9, p.96]: $N = $ […], $y$: per capita consumption of chicken (lb), $x$: price per pound of chicken (cents), $\rho = $ […]
Population 5 [9, p.96]: $N = $ […], $y$: per capita consumption of beef (lb), $x$: price per pound of beef (cents), $\rho = $ […]
Population 6 [10]: $N = $ […], $y$: artificial, $x$: artificial, $\rho = $ […]

Relative efficiencies of the different estimators with respect to $\bar{y}$, based on simple random sampling without replacement, are displayed in Table 1. It is interesting to note that the Rao-Hartley-Cochran estimator is uniformly superior to the other comparable estimators for the six populations. On the other hand, the performances of $\bar{y}_{HT}$ and $\bar{y}_{SD}$ are highly unpredictable: they are uniformly less efficient than $\bar{y}_{ppz}$ and, for Populations 2, 3 and 6, even less efficient than $\bar{y}$.

Table 1: Relative Efficiencies of Different Estimators w.r.t. $\bar{y}$ (in %)

Estimator          Pop. 1   Pop. 2   Pop. 3   Pop. 4   Pop. 5   Pop. 6
$\bar{y}$             100      100      100      100      100      100
$\bar{y}_{ppz}$       209      187      159      367      285      135
$\bar{y}_{HT}$        143       80       65      188      119       77
$\bar{y}_{SD}$        135       78       66      154      120       72
$\bar{y}_{RHC}$       227      201      170      394      316      142
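As a rough illustration of ppz sampling with replacement, the sketch below assumes selection probabilities $p_i = z_i/Z$ with $z_i = 1/x_i$, the estimator $\frac{1}{nH}\sum x_i y_i$, and the variance expressions (2.1) and (3.1). The function names and the five-unit toy population are hypothetical, introduced only for this example; the toy data are not one of the six populations studied above.

```python
import numpy as np


def harmonic_mean(x):
    """Simple harmonic mean H = N / sum(1/x_i) of positive size measures."""
    x = np.asarray(x, dtype=float)
    return x.size / np.sum(1.0 / x)


def ppz_probabilities(x):
    """Selection probabilities p_i = z_i / Z with z_i = 1/x_i."""
    z = 1.0 / np.asarray(x, dtype=float)
    return z / z.sum()


def ppz_estimate(y_sample, x_sample, H):
    """Estimate of the population mean: sum of x_i*y_i over the draws, divided by n*H."""
    y_sample = np.asarray(y_sample, dtype=float)
    x_sample = np.asarray(x_sample, dtype=float)
    return np.sum(x_sample * y_sample) / (y_sample.size * H)


def ppz_variance(y, x, n):
    """Variance (2.1): (1/n) * [sum(x_i*y_i^2)/(N*H) - Ybar^2]."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    H = harmonic_mean(x)
    return (np.sum(x * y**2) / (y.size * H) - y.mean() ** 2) / n


def srswr_variance(y, n):
    """Variance (3.1) of the sample mean under SRS with replacement."""
    y = np.asarray(y, dtype=float)
    return np.var(y) / n  # np.var divides by N by default


# Hypothetical toy population (not from the paper): y falls as x rises.
x = np.array([2.0, 4.0, 5.0, 8.0, 10.0])
y = np.array([9.0, 5.5, 4.0, 3.0, 2.0])
n = 2

p = ppz_probabilities(x)
H = harmonic_mean(x)

# Draw n units with replacement; unit i has probability p_i at each draw.
rng = np.random.default_rng(0)
draws = rng.choice(x.size, size=n, replace=True, p=p)
estimate = ppz_estimate(y[draws], x[draws], H)

# Relative efficiency (in %) of ppz sampling with respect to SRSWR.
rel_eff = 100.0 * srswr_variance(y, n) / ppz_variance(y, x, n)
print(f"estimate = {estimate:.3f}, true mean = {y.mean():.3f}")
print(f"relative efficiency = {rel_eff:.0f}%")
```

Replacing `y` by `10.0 / x` makes `ppz_variance` vanish up to rounding, which is the zero-variance case $y_i \propto 1/x_i$. The without-replacement estimators of this section would need their own variance computations and are not covered by this sketch.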
5. CONCLUSION

The PPS sampling scheme introduced in this paper not only helps survey statisticians select a sample when the auxiliary variable under consideration has a high negative correlation with the study variable, but is also capable of increasing the accuracy of estimation of the population mean, up to a certain level, under some mild restrictions. The novel feature of the suggested scheme is that it utilizes the population harmonic mean of the auxiliary variable, which can be easily computed from its known values on the different units of the population. However, we stress the need for further studies, analytical and empirical, for a better understanding of the properties of the proposed sampling design.

REFERENCES

[1] Reddy, V. N. and Rao, T. J. (1977). Modified PPS method of estimation. Sankhyā, C39, 185-197.
[2] Sahoo, J., Sahoo, L. N. and Mohanty, S. (1994). Unequal probability sampling using a transformed auxiliary variable. Metron, 52, 71-83.
[3] Agrawal, M. C. and Jain, N. (1989). A new predictive product estimator. Biometrika, 76, 822-823.
[4] Raj, D. (1958). On the relative accuracy of some sampling techniques. Journal of the American Statistical Association, 53, 98-101.
[5] Horvitz, D. G. and Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47, 663-685.
[6] Murthy, M. N. (1957). Ordered and unordered estimators in sampling without replacement. Sankhyā, 18, 379-390.
[7] Rao, J. N. K., Hartley, H. O. and Cochran, W. G. (1962). A simple procedure of unequal probability sampling without replacement. Journal of the Royal Statistical Society, B24, 482-491.
[8] Gujarati, D. (1978). Basic Econometrics. McGraw-Hill Book Company.
[9] Maddala, G. S. (1977). Econometrics. McGraw-Hill Kogakusha Limited.
[10] Srivenkataramana, T. and Tracy, D. S. (1980). An alternative to ratio method in sample survey. Annals of the Institute of Statistical Mathematics, 32, 111-120.