VOL. 2, NO.7, JULY 2012
ISSN 2222-9833
ARPN Journal of Systems and Software
©2009-2012 AJSS Journal. All rights reserved
http://www.scientific-journals.org
A PPS Sampling Scheme Using Harmonic Mean of
An Auxiliary Variable
1L.N. Sahoo, 2M. Dalabehera, 3A.K. Mangaraj
1 Department of Statistics, Utkal University, Bhubaneswar 751004, India
2 Department of Statistics, Orissa University of Agriculture & Technology, Bhubaneswar 751003, India
3 Department of Statistics, Rajendra Junior College, Bolangir 767002, India
E-mail: {1 [email protected]}
ABSTRACT
We consider a probability proportional to size sampling scheme that uses the harmonic mean of an auxiliary variable when the
correlation between the study variable and the auxiliary variable is highly negative. This is achieved by applying the inverse
transformation to the auxiliary variable and then using the transformed values at the design stage of the survey operation
to select a sample.
Keywords: Auxiliary variable, correlation coefficient, efficiency, finite population, inclusion probability, ppz sampling, unbiased
estimator.
1. INTRODUCTION
Consider a finite population $U = \{U_1, U_2, \ldots, U_N\}$ of $N$ units. Let $y_i$ and $x_i$ respectively be the values of the study variable $y$ and an auxiliary variable $x$ (taken as size measure) for the $i$th unit $U_i$ $(i = 1, 2, \ldots, N)$. Suppose that the $x_i$'s are known and an estimate is needed for the population mean $\bar{Y}$ of $y$ on the basis of a random sample $s$ of $n$ units drawn from $U$.
In many surveys, the known $x$ values are used to select a sample such that the probability of selecting $U_i$, i.e., $p_i$, depends on $x_i$, and we must have $0 < p_i < 1$ and $\sum_{i=1}^{N} p_i = 1$. Such an unequal probability sampling scheme is called a ppx (probability proportional to $x$) design. The ppx selection method will be very precise when the regression of $y$ on $x$ is linear and passes through the origin, under the assumption that $\rho$, the correlation coefficient between $y$ and $x$, is positive. But when $\rho < 0$, the conventional ppx method of estimation may lead to erroneous results. [1] and [2] have suggested some alternative methods of estimation that use origin-shifted auxiliary variables. These alternative methods, although more efficient than the usual ppx methods, have limited applications because they are based on certain restrictions.
In this paper, we provide a probability proportional to size (PPS) sampling method of estimation with a transformed size measure $z = 1/x$ when $\rho$ has a high negative value. Thus, our new PPS sampling method automatically utilizes the harmonic mean of the $x$ values.
2. SUGGESTED PPS SAMPLING
METHOD OF ESTIMATION
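Here values of $z$ are used at the design stage for selecting units in the sample such that

$$p_i = \frac{z_i}{Z}, \quad i = 1, 2, \ldots, N,$$

where $z_i = 1/x_i$ is the value of $z$ on $U_i$ and $Z = \sum_{i=1}^{N} z_i$. This type of unequal probability sampling may be called probability proportional to the values of $z$ (ppz) sampling. It may be noted here that the population mean of $z$ is $\bar{Z} = Z/N = 1/\bar{X}_h$, where $\bar{X}_h$ is the simple harmonic mean of the $x$ values in $U$. Thus, in the suggested ppz sampling with replacement scheme, an unbiased estimator of $\bar{Y}$ is given by

$$\bar{y}_{ppz} = \frac{1}{nN} \sum_{i=1}^{n} \frac{y_i}{p_i} = \frac{1}{n \bar{X}_h} \sum_{i=1}^{n} x_i y_i,$$

with variance

$$V(\bar{y}_{ppz}) = \frac{1}{n} \sum_{i=1}^{N} p_i \left( \frac{y_i}{N p_i} - \bar{Y} \right)^2 = \frac{1}{n} \left[ \frac{1}{N \bar{X}_h} \sum_{i=1}^{N} x_i y_i^2 - \bar{Y}^2 \right]. \qquad (2.1)$$

Hence, it is clear that $V(\bar{y}_{ppz})$ is reduced to zero when $y \propto z$, i.e., when $y \propto 1/x$.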
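As an illustration of the design above, the following is a minimal Python sketch of the ppz draw probabilities, the with-replacement estimator and its variance, using exact rational arithmetic. The function names and the four-unit toy population are hypothetical and are not taken from the paper; the variance function assumes the form $\frac{1}{n}\sum p_i (y_i/(Np_i) - \bar{Y})^2$.

```python
from fractions import Fraction


def ppz_probabilities(x):
    """Draw probabilities p_i = z_i / Z, where z_i = 1 / x_i and Z = sum of z_i."""
    z = [1 / Fraction(v) for v in x]
    total = sum(z)
    return [zi / total for zi in z]


def harmonic_mean(x):
    """Simple harmonic mean of the x values."""
    return len(x) / sum(1 / Fraction(v) for v in x)


def ppz_estimate(sample, y, p):
    """With-replacement ppz estimate of the population mean.

    `sample` lists the drawn unit indices (repeats are allowed).
    """
    n, N = len(sample), len(y)
    return sum(Fraction(y[i]) / p[i] for i in sample) / (n * N)


def ppz_variance(y, p, n):
    """Variance of the estimator: (1/n) * sum_i p_i * (y_i / (N p_i) - Ybar)^2."""
    N = len(y)
    ybar = Fraction(sum(y)) / N
    return sum(pi * (Fraction(yi) / (N * pi) - ybar) ** 2
               for yi, pi in zip(y, p)) / n


# Hypothetical toy population (not from the paper): y is exactly 20 / x.
x = [2, 4, 5, 10]
y = [10, 5, 4, 2]
p = ppz_probabilities(x)
print(p)                        # the smallest x receives the largest probability
print(ppz_variance(y, p, n=2))  # 0, because y is exactly proportional to 1/x
```

In this sketch each $p_i$ equals $\bar{X}_h / (N x_i)$, so the only population quantity needed at the design stage is the harmonic mean of the known $x$ values.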
3. EFFICIENCY OF PPZ SAMPLING
WITH REPLACEMENT
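If the units in $U$ are selected by simple random sampling with replacement, the variance of the mean per unit estimator $\bar{y}$ is given by

$$V(\bar{y}) = \frac{\sigma_y^2}{n}, \qquad (3.1)$$

where $\sigma_y^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - \bar{Y})^2$. From (2.1) and (3.1), we have $V(\bar{y}_{ppz}) < V(\bar{y})$ provided

$$\sum_{i=1}^{N} y_i^2 \left( \frac{\bar{Z}}{z_i} - 1 \right) < 0,$$

i.e.,

$$\sum_{i=1}^{N} y_i^2 \, \frac{x_i - \bar{X}_h}{\bar{X}_h} < 0. \qquad (3.2)$$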
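The comparison leading to (3.2) can be checked numerically. The sketch below is a hypothetical illustration, not part of the paper: it evaluates both variances exactly and confirms that the sign of $\sum y_i^2 (x_i - \bar{X}_h)$ decides which design is more precise. The toy populations are invented for the purpose.

```python
from fractions import Fraction


def harmonic_mean(x):
    """Simple harmonic mean of the x values."""
    return len(x) / sum(1 / Fraction(v) for v in x)


def var_ppz_wr(x, y, n):
    """Variance of the ppz estimator, written in terms of x and its harmonic mean."""
    N, h = len(x), harmonic_mean(x)
    ybar = Fraction(sum(y)) / N
    second_moment = sum(Fraction(xi) * Fraction(yi) ** 2 for xi, yi in zip(x, y))
    return (second_moment / (N * h) - ybar ** 2) / n


def var_srswr(y, n):
    """Variance of the sample mean under simple random sampling with replacement."""
    N = len(y)
    ybar = Fraction(sum(y)) / N
    return sum((Fraction(yi) - ybar) ** 2 for yi in y) / (N * n)


def ppz_beats_srswr(x, y):
    """Criterion: sum_i y_i^2 * (x_i - harmonic mean of x) < 0."""
    h = harmonic_mean(x)
    return sum(Fraction(yi) ** 2 * (xi - h) for xi, yi in zip(x, y)) < 0


# Hypothetical toy data: y falls with x in the first case, rises in the second.
x = [2, 4, 5, 10]
for y in ([9, 6, 4, 1], [1, 4, 6, 9]):
    print(ppz_beats_srswr(x, y), var_ppz_wr(x, y, n=2) < var_srswr(y, n=2))
```

The criterion needs all $N$ values of $y$, which are unknown before the survey.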
Application of this criterion is difficult in practice because the efficiency gain of ppz sampling over simple random sampling depends heavily on properties of the population, such as the composition of the units and the quality of the size measures. To get a simpler idea of the situation, we use the following super-population model studied earlier by [3]:
Super Population Model
Let

$$y_i = \alpha + \frac{\beta}{x_i} + e_i, \quad i = 1, 2, \ldots, N,$$
where $\alpha$ and $\beta$ are constants and the $e_i$'s are uncorrelated random variables such that $E(e_i \mid x_i) = 0$ and $E(e_i^2 \mid x_i) = \delta / x_i$ $(\delta > 0)$.
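Under this model, after a few algebraic steps, we obtain

$$E\,V(\bar{y}_{ppz}) = \frac{1}{n} \left[ \frac{\alpha^2 (\bar{X} - \bar{X}_h)}{\bar{X}_h} + \frac{(N-1)\,\delta}{N \bar{X}_h} \right] \qquad (3.3)$$

and

$$E\,V(\bar{y}) = \frac{1}{n} \left[ \beta^2 \sigma_z^2 + \frac{(N-1)\,\delta}{N \bar{X}_h} \right], \qquad (3.4)$$

where $\bar{X} = \frac{1}{N} \sum_{i=1}^{N} x_i$ and $\sigma_z^2 = \frac{1}{N} \sum_{i=1}^{N} (z_i - \bar{Z})^2$.
Hence, we have $E\,V(\bar{y}) < E\,V(\bar{y}_{ppz})$ provided

$$\frac{\bar{X} - \bar{X}_h}{\bar{X}_h} > \frac{\beta^2 \sigma_z^2}{\alpha^2}. \qquad (3.5)$$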
Taking $x$ as the size measure, this condition was derived by [4], who pointed out that it may easily be satisfied if $\alpha$ is sufficiently large. Depending on the spread of the $z$ values, the condition (3.5) is also likely to be satisfied for smaller values of $\alpha$. Thus, linearity of regression is not a sufficient condition for $\bar{y}_{ppz}$ to be better than $\bar{y}$.
However, from (3.3) and (3.4) we find that if the regression line passes through the origin, i.e., if $\alpha = 0$, $\bar{y}_{ppz}$ performs better than $\bar{y}$. An inspection of the scatter diagram for at least some pairs of sample values may help in this respect.
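The "few algebraic steps" can be retraced numerically. The sketch below is only an illustration under an assumed form of the super-population model, namely $y_i = \alpha + \beta/x_i + e_i$ with $E(e_i \mid x_i) = 0$, $E(e_i^2 \mid x_i) = \delta/x_i$ and uncorrelated errors. It computes the model expectations of the two variances directly from second moments and compares them with the closed forms that follow under that assumption. All function names and numerical values are hypothetical.

```python
from fractions import Fraction


def expected_variances(x, alpha, beta, delta, n):
    """Model expectations of V(ppz) and V(srswr), built from second moments.

    ASSUMED model: y_i = alpha + beta / x_i + e_i, E(e_i) = 0,
    E(e_i^2) = delta / x_i, errors uncorrelated.
    """
    N = len(x)
    z = [1 / Fraction(v) for v in x]
    zbar = sum(z) / N
    h = 1 / zbar                                               # harmonic mean of x
    ey2 = [(alpha + beta * zi) ** 2 + delta * zi for zi in z]  # E(y_i^2)
    eybar2 = (alpha + beta * zbar) ** 2 + delta * zbar / N     # E(Ybar^2)
    ev_ppz = (sum(xi * m for xi, m in zip(x, ey2)) / (N * h) - eybar2) / n
    ev_srs = (sum(ey2) / N - eybar2) / n
    return ev_ppz, ev_srs


def closed_forms(x, alpha, beta, delta, n):
    """Closed-form expectations under the same assumed model."""
    N = len(x)
    z = [1 / Fraction(v) for v in x]
    zbar = sum(z) / N
    h = 1 / zbar
    xbar = Fraction(sum(x)) / N
    sigma_z2 = sum((zi - zbar) ** 2 for zi in z) / N
    common = delta * (N - 1) / (N * h)          # error term shared by both designs
    return ((alpha ** 2 * (xbar - h) / h + common) / n,
            (beta ** 2 * sigma_z2 + common) / n)


# Hypothetical values: with alpha = 0 the ppz design has the smaller expectation.
x = [2, 4, 5, 10]
print(expected_variances(x, alpha=0, beta=3, delta=2, n=2))
print(closed_forms(x, alpha=0, beta=3, delta=2, n=2))
```

Under this assumed error structure the error contribution is identical for the two designs, so the comparison reduces to the intercept term $\alpha^2(\bar{X} - \bar{X}_h)/\bar{X}_h$ against the slope term $\beta^2 \sigma_z^2$.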
4. EMPIRICAL EFFICIENCY OF
DIFFERENT ESTIMATORS UNDER PPZ
SAMPLING WITHOUT REPLACEMENT
Suppose that the units in $U$ are selected one by one with probability proportional to the $z$ values at each draw, without replacing the units selected at the previous draws. Under this sampling scheme we now consider the following three important estimators of $\bar{Y}$ that have been studied at length in the literature and that are well known for their optimal properties.
Horvitz-Thompson Estimator [5]
It is defined as

$$\bar{y}_{HT} = \frac{1}{N} \sum_{i \in s} \frac{y_i}{\pi_i},$$

where $\pi_i$ is the inclusion probability of $U_i$ in $s$.
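For $n = 2$ the inclusion probabilities under successive draws without replacement have the standard closed form $\pi_i = p_i \left[ 1 + \sum_{j \ne i} p_j / (1 - p_j) \right]$. The following sketch, with hypothetical function names and toy data, computes them and evaluates the Horvitz-Thompson estimate; it is an illustration only, not the computation used in the paper.

```python
from fractions import Fraction
from itertools import permutations


def inclusion_probs_n2(p):
    """pi_i for two successive draws without replacement with initial probabilities p."""
    s = sum(pj / (1 - pj) for pj in p)
    return [pi * (1 + s - pi / (1 - pi)) for pi in p]


def ordered_sample_probs(p):
    """P(first draw = i, second draw = j) = p_i * p_j / (1 - p_i)."""
    return {(i, j): p[i] * p[j] / (1 - p[i])
            for i, j in permutations(range(len(p)), 2)}


def ht_mean(sample, y, pi):
    """Horvitz-Thompson estimate of the population mean."""
    N = len(y)
    return sum(Fraction(y[i]) / pi[i] for i in sample) / N


# Hypothetical toy population with ppz initial probabilities p_i = z_i / Z.
x = [2, 4, 5, 10]
y = [9, 6, 4, 1]
z = [Fraction(1, v) for v in x]
p = [zi / sum(z) for zi in z]
pi = inclusion_probs_n2(p)
print(sum(pi))                 # 2, the sample size
print(ht_mean((0, 3), y, pi))  # estimate from the sample {U_1, U_4}
```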
Symmetrized Des Raj Estimator [6]
For the special case of a sample $(U_i, U_j)$ of 2 units, taken with ppz sampling without replacement, this estimator is defined by

$$\bar{y}_{SD} = \frac{1}{N (2 - p_i - p_j)} \left[ \frac{y_i}{p_i} (1 - p_j) + \frac{y_j}{p_j} (1 - p_i) \right],$$

where $p_i$ and $p_j$ denote respectively the initial probabilities of selecting $U_i$ and $U_j$.
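A short sketch of this estimator for a sample of two units follows, assuming the form $\frac{1}{N(2 - p_i - p_j)}\left[\frac{y_i}{p_i}(1 - p_j) + \frac{y_j}{p_j}(1 - p_i)\right]$. The function names and data are hypothetical. The helper `pair_prob` gives the probability of drawing an unordered pair in either order, which is what makes an exact check of unbiasedness by enumeration possible.

```python
from fractions import Fraction
from itertools import combinations


def pair_prob(i, j, p):
    """Probability that the unordered pair {i, j} is drawn in either order."""
    return p[i] * p[j] * (1 / (1 - p[i]) + 1 / (1 - p[j]))


def sym_des_raj_mean(i, j, y, p):
    """Symmetrized Des Raj estimate of the population mean for a sample of 2 units."""
    N = len(y)
    ti, tj = Fraction(y[i]) / p[i], Fraction(y[j]) / p[j]
    return (ti * (1 - p[j]) + tj * (1 - p[i])) / (N * (2 - p[i] - p[j]))


# Hypothetical toy population with ppz initial probabilities.
x = [2, 4, 5, 10]
y = [9, 6, 4, 1]
z = [Fraction(1, v) for v in x]
p = [zi / sum(z) for zi in z]

# Expectation over all possible samples of size 2.
expectation = sum(pair_prob(i, j, p) * sym_des_raj_mean(i, j, y, p)
                  for i, j in combinations(range(len(y)), 2))
print(expectation, Fraction(sum(y), len(y)))  # the two values coincide
```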
Rao-Hartley-Cochran Estimator [7]
Split the population at random into $n$ groups of sizes $N_1, N_2, \ldots, N_n$ such that $\sum_{i=1}^{n} N_i = N$. Then from each group select exactly one unit with probability proportional to the $z$ values of that group. Then this estimator is defined by

$$\bar{y}_{RHC} = \frac{1}{N} \sum_{i=1}^{n} \frac{Z_i}{z_i} \, y_i,$$

where $Z_i$ is the total of the $z$ values in the $i$th group.
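The grouping and selection steps described above can be sketched as follows. This is a minimal illustration with hypothetical names and data; in particular, splitting a shuffled index list into groups of nearly equal size is one convenient choice assumed here, not a rule stated in the paper.

```python
import random
from fractions import Fraction


def rhc_groups(N, n, rng):
    """Randomly split unit indices 0..N-1 into n groups of nearly equal size."""
    idx = list(range(N))
    rng.shuffle(idx)
    return [idx[g::n] for g in range(n)]


def rhc_select(groups, z, rng):
    """Pick one unit per group with probability proportional to z within the group."""
    return [rng.choices(g, weights=[float(z[i]) for i in g], k=1)[0]
            for g in groups]


def rhc_mean(selected, groups, y, z):
    """Rao-Hartley-Cochran estimate: (1/N) * sum over groups of (Z_g / z_i) * y_i."""
    N = len(y)
    return sum(Fraction(y[i]) * sum(z[k] for k in g) / z[i]
               for i, g in zip(selected, groups)) / N


# Hypothetical toy population of six units, sample size n = 2.
rng = random.Random(0)
x = [2, 4, 5, 10, 8, 3]
y = [9, 6, 4, 1, 2, 7]
z = [Fraction(1, v) for v in x]
groups = rhc_groups(len(x), 2, rng)
selected = rhc_select(groups, z, rng)
print(groups, selected, rhc_mean(selected, groups, y, z))
```

For any fixed grouping, the expected contribution of a group is its own $y$ total divided by $N$, so the estimator is unbiased whatever the random split turns out to be.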
In order to avoid the difficulty that arises in an analytical comparison of the efficiencies of the above mentioned estimators and to have an overall idea about their relative performance, variances of the estimators $\bar{y}_{ppz}$, $\bar{y}_{HT}$, $\bar{y}_{SD}$ and $\bar{y}_{RHC}$ are computed for $n = 2$ when the samples are drawn from the following six populations having $\rho < 0$:
Population 1 [8, p.59]: $y$ = quit rate per hundred in U.S. manufacturing (1960-72), $x$ = unemployment rate (%).
Population 2 [9, p.96]: $y$ = per capita consumption of lamb (lb), $x$ = price per pound of lamb (cents).
Population 3 [9, p.96]: $y$ = per capita consumption of veal (lb), $x$ = price per pound of veal (cents).
Population 4 [9, p.96]: $y$ = per capita consumption of chicken (lb), $x$ = price per pound of chicken (cents).
Population 5 [9, p.96]: $y$ = per capita consumption of beef (lb), $x$ = price per pound of beef (cents).
Population 6 [10]: $y$ = artificial, $x$ = artificial.
Relative efficiencies of the different estimators with respect to $\bar{y}$, based on simple random sampling without replacement, are displayed in Table 1. It is interesting to note that the Rao-Hartley-Cochran estimator is uniformly superior to the other comparable estimators for the six populations. On the other hand, the performances of $\bar{y}_{HT}$ and $\bar{y}_{SD}$ are erratic: they are uniformly less efficient than $\bar{y}_{ppz}$ and, for populations 2, 3 and 6, less efficient even than $\bar{y}$.
Table 1: Relative Efficiencies of Different Estimators w.r.t. $\bar{y}$ (in %)

Pop. No.   $\bar{y}$   $\bar{y}_{ppz}$   $\bar{y}_{HT}$   $\bar{y}_{SD}$   $\bar{y}_{RHC}$
1          100         209               143              135              227
2          100         187               80               78               201
3          100         159               65               66               170
4          100         367               188              154              394
5          100         285               119              120              316
6          100         135               77               72               142
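The comparisons drawn from Table 1 can be verified mechanically. The sketch below transcribes the table into a Python dictionary and checks which estimator is best in each population. The helper `relative_efficiency` uses the usual definition, $100 \times V(\bar{y}) / V(\text{estimator})$, which is assumed here because the paper does not spell it out; the dictionary keys are hypothetical labels.

```python
# Relative efficiencies (%) transcribed from Table 1; keys are hypothetical labels.
TABLE_1 = {
    1: {"srs": 100, "ppz": 209, "HT": 143, "SD": 135, "RHC": 227},
    2: {"srs": 100, "ppz": 187, "HT": 80, "SD": 78, "RHC": 201},
    3: {"srs": 100, "ppz": 159, "HT": 65, "SD": 66, "RHC": 170},
    4: {"srs": 100, "ppz": 367, "HT": 188, "SD": 154, "RHC": 394},
    5: {"srs": 100, "ppz": 285, "HT": 119, "SD": 120, "RHC": 316},
    6: {"srs": 100, "ppz": 135, "HT": 77, "SD": 72, "RHC": 142},
}


def relative_efficiency(var_reference, var_estimator):
    """Assumed definition: 100 * V(reference) / V(estimator)."""
    return 100 * var_reference / var_estimator


def best_estimator(row):
    """Label of the estimator with the largest relative efficiency."""
    return max(row, key=row.get)


def worse_than_srs(table, name):
    """Populations in which the named estimator falls below 100%."""
    return [pop for pop, row in table.items() if row[name] < row["srs"]]


for pop, row in TABLE_1.items():
    print(pop, best_estimator(row))
print(worse_than_srs(TABLE_1, "HT"), worse_than_srs(TABLE_1, "SD"))
```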
5. CONCLUSION
The PPS sampling scheme introduced in this paper not only helps survey statisticians select a sample when the auxiliary variable under consideration has a high negative correlation with the study variable, but can also increase the accuracy of estimation of the population mean up to a certain level under some mild restrictions. The novel feature of the suggested scheme is that it utilizes the population harmonic mean of the auxiliary variable, which can be easily computed from its known values on the different units of the population. However, we
stress the need for further studies, analytical and empirical, for
a better understanding of the properties of the proposed
sampling design.
REFERENCES
[1] Reddy, V. N. and Rao, T. J. (1977). Modified PPS
method of estimation. Sankhyā, C39, 185-197.
[2] Sahoo, J., Sahoo, L. N. and Mohanty, S. (1994).
Unequal probability sampling using a transformed
auxiliary variable. Metron, 52, 71-83.
[3] Agrawal, M. C. and Jain, N. (1989). A new
predictive product estimator. Biometrika, 76, 822-823.
[4] Raj, D. (1958). On the relative accuracy of some
sampling techniques. Journal of the American
Statistical Association, 53, 98-101.
[5] Horvitz, D. G. and Thompson, D. J. (1952). A
generalization of sampling without replacement
from a finite universe. Journal of the American
Statistical Association, 47, 663-685.
[6] Murthy, M. N. (1957). Ordered and unordered
estimators in sampling without replacement.
Sankhyā, 18, 379-390.
[7] Rao, J. N. K., Hartley, H. O. and Cochran, W. G.
(1962). A simple procedure of unequal probability
sampling without replacement. Journal of the Royal
Statistical Society, B24, 482-491.
[8] Gujarati, D. (1978). Basic Econometrics. McGraw-Hill
Book Company.
[9] Maddala, G.S. (1977). Econometrics. McGraw-Hill
Kogakusha Limited.
[10] Srivenkataramana, T. and Tracy, D. S. (1980). An
alternative to ratio method in sample survey. Annals
of the Institute of Statistical Mathematics, 32, 111-120.