Download Sampling with Unequal Probabilities and without Replacement

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Statistics wikipedia , lookup

History of statistics wikipedia , lookup

Probability wikipedia , lookup

Transcript
Sampling with Unequal Probabilities and without Replacement
Author(s): H. O. Hartley and J. N. K. Rao
Source: The Annals of Mathematical Statistics, Vol. 33, No. 2 (Jun., 1962), pp. 350-374
Published by: Institute of Mathematical Statistics
Stable URL: http://www.jstor.org/stable/2237517
Accessed: 10-04-2015 20:50 UTC
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
http://www.jstor.org/page/info/about/policies/terms.jsp
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content
in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship.
For more information about JSTOR, please contact [email protected].
Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and extend access to The Annals of
Mathematical Statistics.
http://www.jstor.org
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
SAMPLING WITH UNEQUAL PROBABILITIES AND WITHOUT
REPLACEMENT'
BY H. 0.
HARTLEY AND
J. N. K.
RAO
Iowa State University
0. Summary. Given a population of N units, it is required to draw a random
sample of n distinct units in such a way that the probability for the ith unit
to be in the sample is proportional to its "size" xi (sampling with p.p.s. without
replacement). From a number of alternatives of achieving this, one well known
procedure is here selected: The N units in the population are listed in a random
order and their xi are cumulated and a systematic selection of n elements from
a "random start" is then made on the cumulation. The mathematical theory
associated with this procedure, not available in the literature to date, is here
provided: With the help of an asymptotic theory, compact expressions for the
variance of the estimate of the population total are derived together with variance estimates. These formulas are applicable for moderate values of N. The
reduction in variance, as compared to sampling with p.p.s. with replacement,
is clearly demonstrated.
1. Introduction. Most survey designs incorporate as a basic sampling procedure the selection of n units at random, with equal probabilities and without
replacement drawn from a population of N units. It is, however, sometimes
advantageous to select units with unequal probabilities. For example, such a
procedure may be found appropriate when a "measure of size" xi is known for
all the units in the population (i = 1, 2, ***, N) and it is suspected that these
known sizes xi are correlated with the characteristics yi for which the population total Y is to be estimated. One method (though by no means the only
method) of utilizing the xi is to draw units with probabilities proportional to
sizes xi (p.p.s.), a technique frequently used in sample surveys, particularly for
primary sampling units in multi-stage designs. Now the theory of sampling
with unequal probabilities is equivalent to multinomial sampling provided
units are drawn with replacement. On the other hand, it is well known from the
theory of equal-probability selection that sampling with replacement results in
estimators which are less precise than those computed from samples selected
without replacement; the proportional reduction in variance is given by n/N
("finite population correction"). It has therefore been felt for some time that a
similar increase in precision should be reaped by switching to selection without
replacement in unequal probability sampling. However, the theory of unequal
probability sampling without replacement involves mathematical and computational difficulties and has therefore not been fully developed.
Received January 30, 1961; revised September 18, 1961.
1 Research sponsored by the Office of Ordnance Research, U.S. Army, under Grant
No. DA-ARO (D)-31-124-G93.
350
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
WITH
SAMPLING
PROBABILITIES
UNEQUAL
351
A general theory of unequal probability sampling without replacement was
first given by Horvitz and Thompson [4]. Since then, several papers have been
published on the topic, but we shall review here only the papers relevant to the
particular problem considered here. The estimator of the total Y as proposed
by Horvitz and Thompson is
n
(1)
g
Y
Y+2
EP,
7ri
with variance
N
(1.2)
V(P)
=r
2
1 7i
N1V
i'>i=i
i
7ri7ri'
yiyi'
y
-
where 7riis the probability for the ith unit to be in a sample of size n and Pii,
is the probability for the units i and i' both to be in the sample. Now when the
7ri are exactly proportional to the y , V( g) is zero which suggests that if we
make 7ri proportional to xi, i.e., if we set
N
(1.3)
(n-1)ri
yxi=
Pii,
=
a considerable reduction in the variance will result since "xi' and "yi" are correlated. Horvitz and Thompson, for the case n = 2, propose two methods to
satisfy (1.3) approximately, but their methods have some limitations. Yates
and Grundy [9] also deal with n = 2, and they suggest an iterative procedure
to obtain revised "size measures" which satisfy (1.3) approximately, but their
method becomes cumbersome when N becomes large. Moreover, this method
is unmanageable when n > 2. Des Raj [3] employs (1.3) as a set of N equations
for the N(N - 1)/2 probabilities Pii, and determines the latter by minimizing
(1.2) subject to (1.3). This leads to a "linear programming problem" for the
N(N - 1)/2 positive Psi,. The "objective function" (the variance) involves
the unknown population values y1 and these are replaced by the known sizes xi
with the assumption that
(1.4)
yi = ca + fxi
exactly. Even if this assumption is accepted, the method is clearly unmanageable for large N. Moreover, if it is assumed that the yi of the population satisfy
(1.4) exactly with unknown a and f, then, clearly the regression estimator has
zero variance and, even if an error term is introduced into (1.4), the regression
estimator would still be the "best estimator" so that it is of little interest to
consider other estimators under such assumptions.
In this paper, we adopt a particular procedure of drawing a sample in such a
way that (1.3) is satisfied exactly with the original sizes xi. Although this procedure, which is described in Section 2, is well known to survey practitioners
(e.g., Horvitz and Thompson [4], p. 678), no formulas for the probabilities Pii,
in terms of the 7riare available in the literature, due to mathematical difficulties.
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
352
H. 0.
HARTLEY
AND J. N.
K. RAO
These difficulties are resolved in this paper and compact expressions for the variance and estimated variance of I are obtained for moderate values of N. In
Section 2, the sampling procedure is described and illustrated with the help of a
numerical example. Section 3 deals with the cases n = 2, N = 3 and 4. We
develop an asymptotic theory for the case n = 2 and N large in Section 4. In
Sub-section 4.1, Pi, is explicitly derived in terms of the gr. Sub-section 4.2
deals with the evaluation of the variance formulas, and a numerical example is
given in Sub-section 4.3. Estimation of the variance is considered in Sub-section
4.4. Section 5 deals with the general case n > 2, and compact expressions for
the variance and estimated variance of Y are obtained, which are applicable
when N is relatively large compared to n. In Section 6, a comparison with ratio
estimation is made.
The extension of the theory developed here for simple sampling to more complex sample designs such as stratification, multi-stage sampling, etc., is comparatively straightforward and therefore is not presented here.
2. A particular sampling procedure.
2.1. Description of the procedure.It is easy to show that there is no sampling
procedure whatsoever satisfying (1.3) unless -y = n(n - 1)/X and
np,i < 1,
(2.1.1)
where pi = xilX and X is the population total of the xi. Henceforth we shall
only consider such sizes xi and associated probabilities pi which satisfy the necessary condition (2.1.1). The following sampling procedure is now considered:
a. Arrange the units in a random order and denote (without loss of generality)
by j = 1, 2, * , N this random order and by Ij =
1 (npi), Ilo = 0,
the progressive totals of the (npi) in that order.
b. Select a "random start", i.e., select a "uniform variate" d with 0 ? d < 1.
Then the n selected units are those whose index, j, satisfies
.
(2.1.2)
I1j-1l
d + k < llj
for some integer k between 0 and n - 1. Since np, ? 1, every one of the n
integers k = 0, 1, * **, n - 1 will select a different sampling unit j. It is easy
to show that ri = npi for this sampling procedure.
2.2. Numerical example. Consider a population of N = 8 units arranged in a
random order and with sizes xj shown in the second column of Table 1. A sample of n = 3 is to be drawn using our sampling procedure. Instead of computing
the quantities npi we scale all computations up by a factor of X/n = 300/3 =
100. Thus we compute the progressive sums of the xi and these are shown in
column 3 of Table 1 and correspond to the quantities X H1^/n.Then we select a
random integer between 1 and 100 and this corresponds to the quantity Xd/n.
In our example this integer turned out to be 36 and the selection of the three
units in accordance with (2.1.2) is shown in column 4. We must find the lines
(j) where the column 100 HIjpasses through the levels 100d = 36 (for k = 0),
100d + 100 = 136 (for k = 1) and 100d + 200 = 236 (for k = 2). The units
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
SAMPLING
Size
j
UNEQUAL
353
PROBABILITIES
TABLE 1
3 units from population of N = 8 unite (p.p.s.)
Selection of n
Unit Number
WITH
Progressive Sum
Start
-
36
X/n=
Step=
100iiooH,
100
1
15
15
2
3
4
5
6
81
26
42
20
16
96
122
164
184
200
k = 1,100d + 100 = 136
7
45
245
k =2, lOOd + 200 = 236
8
55
300
k = 0, 100d = 36
j = 2, 4 and 7 are thereby selected. This procedure(either with or without the
initial randomization)has been frequentlyused but, in the absenceof a better
theory, is usually treated approximatelyas a p.p.s. sample drawnwith replacement.
2.3. A cyclicalanalogueto the sampling procedure.From the point of view of
the mathematicaltreatmentit is convenientto use an alternativebut stochastically equivalent procedureto the steps a and b as follows:
a'. Arrange the units in a random order, denote by j = 1, 2,
*.,
N this
randomorderand form (as before) the progressivetotals II, given by (2.1.2).
Mark off on the perimeterof a circle with radiusn/(27r) arcs of lengths (np,)
in clockwise direction starting at the top (jth unit correspondsto jth arc of
length np, on the circle).
b'. Select a uniformarc s with 0 < s < n. Then the n selectedunits are those
whose indicesj satisfy
,
llj-i
(2.3.1)
s+k
< Hj
for some integerk between -(n - 1) and (n - 1) for whichalso0 ?S s + k <
n. Precisely n units are selected by this process. It is clear that all results in
Sub-section2.1 still hold. For we know with certainty that of the above values
of s + k the algebraicallysmallest will lie between 0 and 1 and this may be
identifiedwith the variate d in step a.
3. The cases n = 2; N = 3 and 4.
3.1. The case n = 2, N = 3. Clearly we have
(3.1.1)
Pi,i
= 1 -
ri.
= 7ri + 7ri,-
1,
wherei" is the third unit in the populationand 7ri+ 7ri, + 7ri = 2. Substituting for Pii, from (3.1.1) in (1.2), and after some algebra,we get
,312
.,
V( P)
3
Y,<
x,,s
.*)
2
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
354
H. 0.
J. N. K. RAO
AND
HARTLEY
(1 - 7rj)yj/7rj. In the special case of equal rw = 3, (3.1.2)
where Y* =
reduces to the well known variance formula in equal probability sampling without replacement.
3.2. The case n = 2, N = 4. We evaluate P12 and assume2 without loss of
generality that
(3.2.1)
7rl _
and
7r2
T4
>
7r3.
We distinguish two subclasses of the randomization results:
Class 1. The units i
Class 2. The units i
=
=
1 and i = 2 are adjacent on the circle.
1 and i = 2 are separated by one unit.
It is easy to see that in class 1 the probability that i = 1 and i = 2 are the
sampled units is given by
(7r, +
(3.2.2)
P12=i
r2 -1
if 7rI + 7r 2>_
~~~~~~~if
Trl + 7r2 K 1
O
while in class 2, using (3.2.1), it is given by
(3.2.3)
(3.2.3)
p,,}~0
P12
( 7rl +
72
+
73-
7)2
i f 71
+
73
_
if ri + 7rs >1.
Since the relative frequency of sequences in the above classes 1 and 2 are in the
ratio I to J, the overall probability P,2 is given by
(3.2.4)
P12 = 1P'2 + IPX12.
The substitution of (3.2.4) in (1.2) yields a formula for V( P), although not in a
compact form. Similar results have been obtained for the case n = 2, N = 5.
As N increases, the exact evaluation of Pii, becomes more and more laborious
and, in any case, the resulting formula will be too complicated to yield a practical formula for V( ). Therefore, we develop an asymptotic theory in the next
section and obtain compact expressions for V( 2).
3.3. Numerical example. To compare the efficiency of our sampling procedure
for the case n = 2, N = 4 with both the procedures of Yates and Grundy [9]
and Des Raj [3] we use the three populations examined by these authors. The
three populations have the same set of pi values and are given in Table 2.
Variances of 1 for the three procedures and the three populations are given
in Table 3. Variances for procedures 1 and 2 are taken from Des Raj [3]. For
procedure 3 the variance is obtained from (3.2.4) and (1.2). It may be noted
that substitution of (3.2.4) in (1.2) gives an exact formula for the variance. The
variance of P for p.p.s. sampling with replacement is also shown in Table 3 for
comparison.
From Table 3 it is seen that procedures 1, 2 and 3 are more efficient than
' The numbering of the units i = 1, 2, 3, 4 is here beforerandomization.
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
SAMPLING
WITH
UNEQUAL
355
PROBABILITIES
TABLE 2
Three populations of size N
= 4
Unit
Number
Pi
Population A
y.
Population B
y.
Population C
y.
1
2
3
4
0.1
0.2
0.3
0.4
0.5
1.2
2.1
3.2
0.8
1.4
1.8
2.0
0.2
0.6
0.9
0.8
TABLE 3
Comparativeefficiencyof four sampling procedures
Procedure
1.
2.
3.
4.
Des Raj
Yates and Grundy
Hartley and Rao
With replacement
PopulationA
PopulationB
PopulationC
Var.
Eff. %
Var.
Eff. %
Var.
Eff. %
0.200
0.323
0.367
0.500
100.0
61.9
54.5
40.0
0.200
0.269
0.367
0.500
100.0
74.3
54.5
40.0
0.100
0.057
0.033
0.125
100.0
175.4
333.3
80.0
samplingwith replacementfor all the three populations.For populationsA and
B the linear model (1.4) is fairly well satisfied so that Des Raj's "optimum
procedure"is more efficient.For population C the model is not appropriateso
that considerableloss in efficiencyresultsfrom Des Raj's procedureas compared
to procedures2 and 3. No generalstatement can be made between procedures2
and 3 regardingefficiency.However, it can be shown that procedures2 and 3
have exactly the same V(g2) to order O(N') for large N. For the present example with N = 4, this result for "large N" does not, of course, apply.
4. The case n = 2 and N large. Since the case n = 2 is useful particularlyin
stratifieddesignsand since this case has been extensivelydealt with in the literature, we considerthis case in detail in the followingsubsections.
4.1 Evaluationof theprobabilitiesPii, . For the evaluationof Pii, we make the
following two assumptions: (a) 7ri ? cN' for all i and N, and (b) cjN2 <
il < c2N2 for all pairs (i, i') and N, where c, c1 and C2 are universal constants and where S4, is the mean squareof the 7rj (j $ i, i'). It shouldbe noted
that the upperlimit in (b) is, of course,a consequenceof (a). While assumption
(a) is vital, the lowerlimit in assumption(b) could be circumventedby a special
argument not given here. Under the above two assumptionsit can be shown
that statements on the relativeorder of magnitudeof our leading term V'( 1)
in the varianceformulaV( f) (see eq. 4.2.4) to the terms of lowerorderof magnitude can be made. In fact it can be shown that the tth lower order term in
V( I) divided by V'( 1) is less than or equal to const. Nt (t = 1, 2, 3), where
the terms for t = 1, 2 have been retainedin equation (4.2.2). In orderto sim-
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
H. 0. HARTLEY AND J. N. K. RAO
356
plify the argument, however, we fix an arbitrary scale of absoluteorder of magnitude by making, in what follows, the additional assumptions that 0 _ yi _ c*
for all i and N and that the variance formula in sampling with replacement,
V'( P), is of order 0(N2) (in practice this is usually the case). In sampling without replacement V'( Y) will be the leading term in the variance formula and it is
necessary, in order to supply formulas for moderately large N, to evaluate the
term of next lower order of magnitude in powers of N-'. This term will represent
the gain in precision due to sampling without replacement. The variance of Y
to order O(N') is obtained by evaluating the probabilities Pii, to order o(N-3)
and substituting it in the variance formula (1.2). For the benefit of smaller
size populations we also find V(Y) to order O(N?) by evaluating Pii, to order
O(Nm4) and substituting it in (1.2).
We use the circular analogue to the sampling procedure described in Subsection 2.3. The total number of arrangements of the N units on the circle,
namely N!, can be divided into (N - 1) groups according to whether there are
v = 0,1, ... , (N - 2) units "between" i and i', where "between" means
that we have v units when proceeding from unit i to i' in clockwise direction.
There are N X (N - 2) ! arrangements in each of these (N - 1) groups so
that they are all represented with equal probability 1/N - 1. Consider now
the contribution to Pii, from a particular group with v units between units i
and i'. For the ith unit to be in the sample, the inequalities Hi-i < s + k < Hi
must be satisfied where k can take values -1, 0 and 1 and s is a uniform arc
with 0 ? s < 2. This means that either Hi-l < s < Hi or Hi-, - 1 < s <
Hi - 1 if Hi-, _ 1 and HIi-L+ 1 < s < Hi + 1 if Hi < 1 must be satisfied. Therefore, to evaluate Pii, we have to add the contributions to Pii, from
the first case, say P'i,, and from the second case, say P"-,. Since the length
of the range of s is equal to 7riin both cases, Psi, and P"', are identical. Consider
now the evaluation of P,ti, . Since
IIi-1 <_ s < IL2,
(4.1.1)
a positive contribution to P$'i, can be made only if
Hli + T _ s + 1 < Hi + T, + 7ri,
(4.1.2)
is satisfied,where T, =
Erv1r. (4.1.2) can be written as
1+ t
-
7ri < Tv < 1 + t
r-
-
2
where t = s -Hi-l . The uniform variate t (like s) has an ordinate density of
2, and from (4.1.1) we have 0 ? t < irsi. Therefore, the integrated contribution
to P$i' is given by
7ri
f
- Pr(1 +
t -7r
(4.1.3)
< Tv
1 + t - ir) dt
[FV(1+t-7i
)-Fv(1+t-ini-
-7rij
7r
=2
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
i)]dt,
WITH
SAMPLING
357
PROBABILITIES
UNEQUAL
where Ft(T) denotes the cumulative distribution function of the total T,.
Since the units are arranged in a random order prior to drawing the sample,
Tv represents the total of v values of the lrj drawn with equal probability and
without replacement from the finite population of (N - 2) values of the 7rj.
Therefore, noting that EN rj = 2, we find that
E(Tv)
N-2N -2
=
= vir (say)
and
V(T,) = v[l -v/(N
-2)]M,
where
,S~i"
Si2i,
(N-3X-'
_ )-1
(N
(N
-
i
N
r
(7rj _ f,)2
E
i#(i.i')
2
3)-1{Z
_-2
iri)
[(2-
-,
-
/(N
-2)]
Finally, adding the two (identical) contributions to P'i, and P"i, given by
(4.1.3), multiplying by the factor (N - 1)-i which represents the (constant)
probability of a random arrangement of N units in which v units lie "between"
units i and i', and summing over v, we obtain
N-2
Pii, = (N - 1)-1 E
v=O
(4.1.4)
ri
+ t - ri) - FJ1
*I[FV(l
+ t - ri - ri,)]
dt.
In order to obtain usable results, we now evaluate an approximation to (4.1.4)
by expanding Ft(T) in an Edgeworth series of which the cumulative normal
integral is the leading term. Edgeworth series representation of a cumulative
distribution function F(x) is (see, e.g., Kendall and Stuart [5], p. 158) given by
(4.1.5)
exp {
F(X)
(-
} P(x),
exp {-_y2}
dy;
Dj
where
P(x) -(27r)-f
00
Dj is the jth order derivative operating on P(x) and kj denotes standardized
cumulants. In what follows we assume, without loss of generality, that i = 1
and i' = 2. In our case, (4.1.5) is applied to the standardized variate
TV
-
[v(2
-
S12{V[1 -
71
-
r2)/(N
-
2)]
v/(N -2)11
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
358
H. 0.
AND
HARTLEY
K. RAO
J. N.
in place of x so that F(x) is the finite proportion F,(z) (say) of values z, with
z, < z. This function is therefore a step function with a finite number of discontinuities and the complete series representation (4.1.5) will yield this step
function for almost all values of z while at the points of discontinuity the right
hand side of (4.1.5) will yield Pr(z, < z) + ' Pr(z. = z). We, therefore, have
Fv(z)
(4.1.6)
P(z) - (k3/6)D3P(z) + R(v),
=
where the kj are the cumulants of z, and
(4.1.7)
R(v) = exp
kj(j)
{
Dj} P(z)
k D3} P(z)
-
{1
-
and therefore is a double infinite series each term involving a power product
of the cumulants kj and an associated high-order derivative DrP(z), the term
with the least order differential is seen to be (k4/4!) D4P(z). In order to express the cumulant k3 in terms of the standardized cumulant K3 (say) of the
finite population of the irj, we make use of the results given by Wishart [8].
In terms of the variable z, we find, after some substitution, that
(4.1.8)
k3
[
=
1- (1-N-2)
(N
(1-N--2)7
-{v
-
2)]
K3
Substituting (4.1.6) in (4.1.4) we obtain
N-2
Z
E
P12= (N- i(4.1.9)
*f
where D3P(z) =P
Z=
V=o
71
{P(zl)
P(z2)
-
-
k3
[p(3)(Zk3
[P3(3)
dt +
(Z2
(z)
{t + I-7r-
/8S12 vi
v(2-7r1-7r2)(N-2)}
V 2)'
-N
(4.1.10)
Z2 =
{t + 1
-7
-72
and the remainder term
p
p = (N
2)'
- N -
is given by
N-2
(4.1.11)
/S12 V'(1
-v('2-7r-7r2)(N-2)-}
-
1)-
rl
Zf
[R(z1)
-
dt
R(Z2)]
while k3 is given by (4.1.8). We now apply the Euler-Maclaurin formula
fb
g(t)
dt-.
g(b)
-
g(a)
(b
(4.1.12)
+
(b
=
-
24
-
a)3
a)gql' (a + b)
(3)
9
a +
2
b
(b
+
-
a) 5
1920
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
g
WITH
SAMPLING
359
PROBABILITIES
UNEQUAL
here formulated for a general function g(x) satisfying the required continuity
conditions, where gr)(x) denotes the rth order derivative with regard to x and
t is such that a _ t < b. This formula is first applied to the differences
- P(Z2)
P(zl)
and P(3)(zi) - P(3) (Z2) in (4.1.9) thereby giving rise to four terms involving
respectively p(l), p(3), p(4) and p(6). The integration over t of these four terms is
then performed by again using (4.1.12). Retaining only the relevant terms in
(4.1.9) we obtain
N-2
(4.1.13)
= (N-J
Pl2P~~~(N~~1V
k+ 7r3 72
F3
L
21 7T2
12
1)
)
(3)(v)
dv+ P + w+ Ply
6S12j
where v1 = v[1 - v/(N
-
(4.1.14)
(71 + 72)
-
P24
k3 i7r2vTP(4)(V2)
24S12
V2 =
V1
2rl2
vJP41 (v2)+
P(3)(V2)_
Vl
-
-1
7
2)],
v(2 N
S1212V
N2)]
/2(l
)
and p is given by (4.1.11) while corepresents the aggregated remainder terms in
the application of formula (4.1.12) and p' denotes the remainder when approximating Jv by f dv. The discussion of the remainder terms p, coand p' is given
in Appendix I where it is shown that these remainder terms do not contribute
to P12 to the order of approximation desired, namely O(N-4).
We now evaluate the remaining terms in (4.1.13). The first term which involves
exp {-v2/2}
P(1)(V2) - (2r)
will be called A. We now make the transformation
u = v-
(4.1.15)
(N-2)
so that v = (N - 2){1 - [4u2/(N - 2)]b,. Expanding the exponential in
P(l) (v2) as well as the term v-2 in powers of u, we obtain
A
(4.1.16)
-
-
(N
(N
lh4p6
-
2)
1) (2
7r1r2
(2Xr
-
-
7T2)
e-P2/2 exp
f2
{-h-2p4
hJ_
+ higher terms) (1 + lh-22 +
3h-44
+ higher terms) dp,
where
(4.1.17)
h = (2
- 7r-
7r2)
(N -2)
Sg2
and the variable of integration has been changed to
p = 2 uh(T-
(4.1.18)
Expanding the term exp
2)-.
{ } in (4.1.16) and multiplying by the series inside
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
360
H. 0.
HARTLEY
K. RAO
AND J. N.
(4.1.16) we reach
(N -2)
A
(N
(4.1.19)
1) (2 -
-
[1 + 2h-2(p2
7r17r2
rl
h
7r2)
+ 8h-4(3p4
p4)
-
(7)'
-
-
eP/
6p6 + p8) + higher terms] dp.
Since h is O(MT), we can (for large N) replace the integration limits in (4.1.19)
by - X and + X apart from errors which are O(e N'a). Using the standardized
normal moments we therefore obtain
(N-2)
A.
(4.1.20)
(2-
(1 -
7r-2
h-2
+ 3h-4)
to order 0(W4). By a similar argument it can be shown that the first of the
two terms in (4.1.13) which involves P(3)(v2) and which we call B, can be reduced to
ri 7r2g2(2r
B
1
6(N
1)(2-
-
r
r2)
-
h
lh-2(p6
1)-
e-P212[(p2
f
6p4 + 3p2) + higherterms]dp
-
while the other term involving P(3)(v2), which we call C, reduces to
r
24(N
(4.1.22)
-2) (27r)-'
7r2 1'(N
12
mri(
_
1)(2
-
-
ri
-
r2)
h
e _P2'2[(p2
-
1)-
h-2(p6
-
4p4 + p2) + higherterms]dp.
h
The integral of the terms retained in (4.1.21) is zero while C can be seen to
so that B and C do not contribute to P12 to order 0(N4). Finally,
be 0(N5)
consider the term D (say) involving k3 in (4.1.13). Substituting for k3 from (4.1.
18), we obtain
D
=
2(N
D = 3(N
-
2)r7r2K3
rl1)(2-
7r2)
h
(2irX-
Lf
h
P2/22ih (p4
-
3p2) +
-h3(8p6 _ 9p4 -
p8)
+ higherterms]dp,
and the evaluation of the terms retained yields
D
(4.1.23)
K3 812
) (2
-7r17r2)4
since
which is of order0(N4)
(4.1.24)
(N
=
(N - 2)-
(
-rj
2 -
2
1.2)
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
361
SAMPLING WITH UNEQUAL PROBABILITIES
Adding the expressions A and D we obtain for the probability P12 an approximation to order o(N-4) given by
(4.1.25)
(N - 2)7r2 1r2
P12
where h is given by (4.1.17) and
h2 + 3h
-
2K3 h-3(N
-
2)']
-
by (4.1.24). Since the last two terms in
K3
we obtain to order 0(N F3) the simplifiedformula
(4.1.25) are O(N4),
(4.1.26)
[1
P12-(
(N
-
(N - 2)7r17r2
1)(2- 'l-
(1
-
h2).
7r2)
We now apply two checks to verify, independently of the arguments given in
Appendix I, that the remainder terms p, w and p' do not contribute to P12 to
are retained in (4.1.25)
and that all terms to order 0(N4)
order 0(N4),
above. The first check is the special case when all 'r1 are equal to 2/N so that
S12 = 0 and h'1 = 0, and (4.1.25) reduces to 2/N(N - 1) which is the correct
probability for units 1 and 2 to be both in a sample of size 2. However, this
check tests only the leading term in (4.1.25) since h 1 = 0 so that the coefficients
of the remaining terms in (4.1.25) are not affected by the check. A more searching check which takes account of all the terms in (4.1.25) is provided by testing
the order to which the equation
N
= (n -1)7ri
ZPii,
(4.1.27)
is satisfied. It is shown in Appendix II that (4.1.27) is in fact satisfied to order
if (4.1.25) is substituted in (4.1.27) which confirms that (4.1.25) is
O(Nm3)
correct to order 0(N4).
4.2. Varianceformulasto orders0(N1) and O(N0). In AppendixII we have
simplified (4.1.25) to the following expression:
Pii, -27i
+
ri'[1
+
I2(3i
+
7i')
+
4(7r
+
7ri,
2ri ri')]
N
(4.2.1)
-
Eir41
x7rtlrt
+
3(3r,
+ 7ri,)] +
+
1?r?,ri'
7r(7rr')
(i
r(-
+
)
7ri t
-
(f
r)
to o(N-4), where the subscripts 1 and 2 are replaced by i and i' respectively.
Substituting (4.2.1) in the variance formula (1.2) we obtain after considerable
rearrangement of terms
V((
(4.2.2)
N
(
)r
)r
3
7i
E
-
Z.d/k
(
(8
_
_\
N
4 (
74
?
2
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
Z
7ri,
362
H. 0.
HARTLEY
AND J. N.
K. RAO
correct to O(N?). If only terms to O(N1) are retained in (4.2.2), we obtain
V( )
(4.2.3)
()(i
7ri
correct to O(N'). For the special case of equal probabilities iri = 2/N, (4.2.2)
reduces to the well knowii variance formula in equal probability sampling withTABLE 4
Data for population of size N
i =
=
20
1
2
3
4
5
6
7
8
9
10
Yi
Xi
19
18
9
9
17
14
14
12
21
24
22
25
27
23
35
24
20
17
15
14
i=
11
12
13
14
15
16
17
18
19
20
Yi
18
37
12
47
27
25
25
13
19
12
Xi
18
40
12
30
27
26
21
9
19
12
out replacement to O(N0). In sampling with replacement the variance of f' is
(4.2.4)
V(
Z
U)
ri
2
which is of order O(N2). (4.2.3) when compared with (4.2.4) clearly shows the
characteristic reduction in the variance through the "finite population corrections" (1- -7ri).
4.3. Numerical example. Horvitz and Thompson [4] give an example of N = 20
blocks in Ames, Iowa. The data are reproduced in Table 4 below where yi denotes the number of households in ith block and xi denotes the eye-estimated
number of households in ith block. The probability iri for the ith unit to be in
the sample is taken proportional to the eye-estimated number of households
Xi, i.e., iri = 2xi/X. In Table 5 below we give the evaluation of the variance
of f from formulas (4.2.2), (4.2.3) and (4.2.4). Also shown in Table 5 are the
evaluations of the variances of alternative estimators which are described by
Horvitz and Thompson.
A comparison of the variances in Table 5 shows that sampling with p.p.s.
is vastly superior to sampling with equal probabilities. It must not be forgotten,
however, that there are other devices of decreasing the variance in the latter
case with the help of the known xi values, e.g., ratio and regression methods of
estimation. Among the procedures of p.p.s. sampling there is little to choose in
this example except that about 7% (235/3241) is gained in precision through
our procedure of sampling without replacement as compared to sampling with
replacement. It is of interest to exhibit the nature of convergence of the various
approximations to V( ') by regarding the variance formula for sampling with
replacement as an approximation to O(N2) as set out in Table 6. The con-
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
WITH
SAMPLING
363
PROBABILITIES
UNEQUAL
TABLE 5
Variances of various estimators of the total of the yi-population shown in Table 4
Form of
Numerical Value of
Estimator Variance of Estimator
Sampling Procedure
Equal probability sampling without replacement
p.p.s. with replacement
*
3,241-Eq. (4.2.4)
yi/7ri
E
Horvitz and Thompson procedure 1
16,219
Ng
n
3,095
yi/Pi
E
n
Horvitz and Thompson procedure 2
3,075
yi/Pi
E
n
Present procedure; sampling without replacement
and p.p.s.
3,025-Eq. (4.2.3)
3,007-Eq. (4.2.2)
yi/7rt
E
* For a description of these procedures and definition of the Pi see Horvitz and Thompson [41,Tables 2 and 3.
TABLE 6
Approximations
for population of Table 4
to V(t)
Order of
Approximation
Formula
Used
V(g)
Difference
O(N2)
Eq. (4.2.4)
Eq. (4.2.3)
Eq. (4.2.2)
3,241
216
3,025
18
O(N1)
O(NO)
vergence in this example appears to be satisfactory although the population
size (N = 20) is much smaller than those encountered in survey work.
4.4. Estimate of the variance. Horvitz and Thompson [4] give an unbiased
estimate of V( Y) that seems to be unsatisfactory since it may take negative
values. Yates and Grundy [9] propose an alternative unbiased estimator of
variance which is believed to take negative values less often, and is given by
(4.4.1)
vn)
7r i
=
Pii, Yi _yi
-
where v( g) denotes the estimator of V( ). For the case n
to
v(g) =
(4.4.2)
w
-
-
Pii,
-
2, (4.4.1) reduces
=
,
7ri
7ri/
where the units numbered i and i' are included in the sample of size 2. Substituting for Pii, from (4.2.1) in (4.4.2), we find after some simplification
N
V( )
_
-
r(i
I
+
+ 7i')
(N)2
2-
E
(4.4.3)
2 2+
')
+
2Z
-1
2
N
N
3
+
4Iri
+
7ri')
Z
2r}
1
1
Yi
(
/i'
J\w~~~~~i
7ri'/
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
364
H. 0.
HARTLEY
AND
J. N.
K. RAO
to order O(N?). If only terms to order O(N1) are retained in (4.4.3), we get the
simplified formula
(4.4.4)
v(g)
+irT') +
2-(ir
-
7r,
ri t
For the special case of equal probabilities irj = 2/N, both (4.4.3) and (4.4.4)
reduce to the well known formula for estimator of variance in equal probability
sampling without replacement.
In this connection, it may be worth while to point out an important aspect
of Yates and Grundy estimator of variance for the case n = 2. Narain [7] has
shown that a necessary condition for a sampling procedure without replacement,
for which 7rj = npj, to be more efficient than sampling with replacement is
< 2(n
(4.4.5)
for all i and i'. For the case n
(4.4.6)
=
-
n
1)
2, (4.4.5) reduces to
Pii, _r.-ri.
Therefore, using (4.4.6), it immediately follows that Yates and Grundy's estimator of variance (4.4.2) is always positive. That is, if there is a sampling
procedure without replacement for which the variance is smaller than the variance when sampling with replacement independent of the yj's, which is the case
we are interested in, then Yates and Grundy's estimator of variance is always
positive. It may be noted that this result is true only for n = 2, since conditions
(4.4.5) are not sufficient to show that (4.4.1) is always positive when n > 2.
(4.4.1) is always positive if conditions (4.4.6) for all i and i' (i # i') are satisfied. However, conditions (4.4.5) do not imply (4.4.6) except when n = 2.
(4.2.3) and (4.2.4) imply that, to O(N2), our particular sampling procedure
without replacement is more precise than sampling with replacement independent of the yj's, so that (4.4.6) follows as a necessary condition to 0(N3).
Conditions (4.4.6) can, of course, be directly verified for our sampling procedure
using (4.2.1).
5. The general case n > 2 and N large. A striking feature of our sampling
procedure is that it permits an evaluation of Pii , and hence of V( ?) and v( ),
for the case n > 2. It is interesting to note that most of the published literature
on this topic does not have anything to offer for n > 2, and deals only with
the case n = 2, due to difficulties in evaluating Pii, .
The methods of attack for n > 2 are similar to those used for n = 2. However, the former case presents certain new features. Therefore, in this section
we give a brief derivation of Pi, with details of the new features.
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
WITH
SAMPLlNG
365
PROBABILITIES
UNEQUAL
Extending the arguments given in Sub-section 4.1, it can be shown that
s;
[N-n
Pii, = (N
(.)
+ t -7r,-xr,) )dt +**X
[F,(l + t -7r,)-F,(1
EI
,-1
N-n+m
Z
+
+
|
[F,(m+1+t-7ri)-F.(m+1+t-7ri-7rsi)]dt+
s
N-2
[F,(n -1+
j
E
t - xi,)
F.(n -1+
-
t -,
-
7i,)] dt},
o- n-20
where, as before, F.(T) denotes the cumulative distribution function of the
total T. of the v values 7r ,
= v(n
E(T.)
7r,)/(N
r-
-
-
2),
and
V(T.) = vI1 - [v/(N
-
2)lSi,
It may be noted that (5.1) reduces to (4.1.4) when n = 2. Now it will be shown
that each of the (n - 1) integrals summed over v in (5.1) contributes identically
to Pi,, to order O(N4), making the assumptions mentioned earlier in Subsection
, n - 2) in (5.1),
4.1. Consider the mth integral summed over v (m = 0, 1,
say ,(Si given by
s;
N-n+m
= (N-
E
1)-
+ 1 + t-,-
[F(m + 1 + t-,)-F(m
|
)dt,
and let i = 1 and i' = 2 without loss of generality. Proceeding now exactly as
in the case of n = 2, by expanding F,(T) in an Edgeworth series, applying the
by f dv, we find
Euler-Maclaurin formula (4.1.12), and approximating
N-nm
P(m") =
(N-
1)_
_
|
TIT2 v4P(1) (V2m)
+
2
p(3)(V2m)
vLP
L2
(5.2)
+
T1T2
Vp1P (P
V2m)
-6kS1n"2 VIP(4
(V2m)]
dv + Pm +
(On
+
Pm
where
v
V2 =
1vl
m
-
[v/(N -2)]J
+
1-rl
+ "'
-
v(n
-
Ir
-
TO
(Vis12),
PM, Corand p' are the remainder terms similar to p, coand p' for the case n=
2, P(')(x) denotes the rth order derivative of the cumulative normal distribution
P(x), and k8 is the standardized cumulant of the total T. given by (4.1.8). It
can be shown that Pm, c"mand Pmdo not contribute to P(2) to O(N4), using
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
366
H. 0. HARTLEY AND J. N. K. RAO
arguments similar to those employed for p, w and p' in Appendix I. We now
evaluate the remaining terms in (5.2). The first term which involves P(C)(V2n)
will be called Am . We make the transformation
V-
(5.3)
cm=U ,
where
Cm
2(m +
2(n-
N-2
-
1)
- 7r2
rl
r2)
ir1-
Then, in terms of u, we have
V1(CmN2)[
2
+(Nu
(N
Cm)
(Cm
2) (cm -
n
)J
Expanding now the exponential in P(1)(V2m) as well as the term vL- in powers
of u (justification of the expansion can be easily shown), and changing the
variable of integration u to p, where
p
(5.4)
=
uh(N
(cm -N
-2)
2)
where
h = (n
(5.5)
-
r-i
r2)(N
2S1-21
-2)
we find, after considerable simplification, that
(N -2)717(2
+ hmn
(3p2
)-t
_6p4
+
8
(5.6)
h2
PI)
h-
[/
+
h8
(3P4-
4
(2
6p6 +
8
p
-2
1h
+
/
(15p4
-
45p6 +
+ 384 (105p4 - 420p6 + 210p8
-
15p8
pO)
-
28p10+
p12)
+ higher terms dp,
where
(N - 2)s12(1
him
(n
-
7 r2)
(
m2-
2c)
N--2)
The limits of integration in (5.6) are
h(m
-
cm) (N -
2)
{cm -
[cm1/(N -
2)]}
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
SAMPLING
WITH
UNEQUAL
367
PROBABILITIES
and h(N -n + m - cm) (N - 2)`{cm - [cl/(N - 2)]}, which are -O(N)
and 0(N2) respectively. Therefore, the integration limits can be replaced by
- oo and + oo apart from errors which are 0(e-N Na). The main feature here is
the appearance of a "non-centrality type" term him which is zero for the case
n = 2. Using now the first twelve standardized normal moments, we find that
the coefficients of all terms involving himare zero, and that
Am *
(5.7)
(N
-
2) r2
+ 3h4)
(1-h-2
Since h does not depend on m, Am also does not depend on m.
o(N-4).
By a similar analysis, it can be shown that the first of the two terms in (5.2)
which involves P(3)(V2m) and which we call Bi, can be reduced to
to
(N2)7rirASj22
24(N1)(n - i -
(5.8)
*Fe-P212[(p2-1)
+ h
L
(p5
2
+ hlm (p8
cm- N-
-2)
(2r)
-
6p3 + 3p)
~~~~~~~~2
+ 45p4 _
-15p6
6p4 + 3p2)
-
(p6
-_
1
Cm \
2
higherterms
p2)+
dp
while the other term involving P(3)(V2m), which we call Cm, reduces to
=
C'm
(5.9)*
(N
24(N
I e2
-
-
-1-7r
72)
+ him (p5
-1)
[(p2
L
J
2)7r72(27)-1
1)(n
4p3 + p)
-
~~~~~2
+ h2n(p8 _
(p
h-
-
2
lip6 + 21p4
3p2)
-
2
4p4
+ higher terms dp.
Using now the standardized normal moments, we find that the integral of the
so that Bm and Cmdo
terms retained in Bm is zero and Cmis of order 0(W5),
not contribute to P17) to order O(N 4).
Finally, the term Dm (say), involving k3, in (5.2) is reduced to
Dm
=
-
(N
-2)irT12K3(2ir)Y'2f
iri-
6(N-1l)(n-
r2)
J
QVCm
+hl (p6
6p
N 2m 2)
1
e'2/2
p2
L (p7
-
2
-6p5
-
Cm20
33p)
(
J__
+ 3p3)
2
+ hlm p9-13p7
13
_ m (p9 +8
himh-2
4
(10
+
33P5
+
3P
-1p+3
-
9P3)
33p6
3
6_94
-p)
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
-p
368
+ hi.
AND
HARTLEY
H. 0.
(N-
2
-
+ h2m(p8
-
13p8 +
-
2) (cn
33p6_
-(N
3p2)
(8_6?3p
2)
-
pCm
(N-2)3
6p6 + 3p4)]-h-3(N
-
-
9p4)
-
. 3
)21[(p
Nc
2)2 (1
h
2
?hlm(76?3p4)
+ h-2(N
2)
N-
-
(
Cin
8m(plo
+ 45P4)
45p4)]C'
-
+
K. RAO
- 24pl + 150p8 -240p6
(p12
+h-'j(p1
(5.10)
J. N.
(1
-
-
2)?
Cm
(m
)l0P
-
3
N-
2)
3p4)
)j(p
3(NP-2)4-(1
2)ir1r2K3 h3(N2
+ higher terms5 dp.
Using now the standardized normal moments, the integral of the terms retained
in (5.10) reduces to
Dm-6(NP 6)(n-
-3i hr(N-)2)
12(1
2;
(5.11)
_-6(1 N2)ro
+-2(N 2)N-
N- 2)
(N(N m2N-2) -2)(
+N
)(
-3 2)+
2
)
(N -2)4(
)l
N-
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
369
SAMPLING WITH UNEQUAL PROBABILITIES
Further simplification of (5.11) results in
-
Dm-in
(5.12)
(N
-
2(N
-
2)(ri
1) (n
2
-
K3
'l-
h-3(N
2)
-
7r2)
which is of order 0(W4) and does not depend on m. Adding now the expressions
(5.7) and (5.12) for Am and Dm respectively, we obtain for P(2 ) an approxima-
tion to order0(N4)
(5.13)
given by
(N
P(rn)
-
(N
-
-[1
2)7r,lr2
- 7rl-
1)(n
h 2 + 3h4-4
-
2K3h 3 (N
2
-
7r2)
Since (5.13) does not depend on m, it follows that
n-2
P12=
(5.14)
E
P
m=0
(
)7,-r
(N-2)r
1)(n
(n-1)
(N1-
2
7r2h
- r-
+ 3h -
7r2)
2K3 h3(N
-
-2)
For the special case n
to order 0(N4).
2, (5.14) reduces to (4.1.25). The
two checks mentioned in Section 4 for the case n = 2 are also satisfied by (5.14).
Expressing h and K3 in terms of the irj we obtain from (5.14):
pi i
(n-i)
n
___n
(n-1)
i
fl+
+
i
ri
3i+ri' (1ri
3
+
+
N
-
3(n
(ir ri, + ri ri,)
+ 2(n-1)
N
v
fl
(5.15)
-1)
n
+
__(n
__
1)
21s'ilr
nn4
(rs
2
N
2
2
1n5
3(n
+rs r,)-
)
r
1T)
+7r
r,.gr
.r%
2
2(n -1)
f4
r
1
to O(N-4), where the subscripts 1 and 2 are replaced by i and i' respectively.
Substituting (5.15) in the variance formula (1.2) we obtain
V(
(5.16)
- E
[1
fl
(n-
1)
7rf2)
_ (n-1l)?E(27r3 (ni _ 7ri
2n
correct to 0(N?). If only terms to 0(N')
(5.17)
Y
(yi
V(2)=-
7ri[i
(n_
+
_
(n
Y)
) (
Y1
are retained in (5.16), we obtain
i)
)j](Ys
_
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
370
H. 0.
HARTLEY
AND J. N.
K. RAO
correct to O(N'). The variance of Y in sampling with replacement is
(5.18)
V'(1?)
=
Y
(Yi
E-T
n
7ri
which is of order O(N2). (5.17) when compared with (5.18) clearly shows the
characteristic reduction in the variance through the "finite population corrections" {1 - [(n - 1)/n]rjw}
Substituting (5.15) in Yates and Grundy's estimator of variance (4.4.1), we
find
1)
v(f) - (n-
[1
2
12
_ n ( 2i +7i_i
(5.19)
-
(7ri+wrs) +(f
r
2(
n3\
7)ri
rjn)
2A
r
Jn
+
+ irit)
2~
+2
Zir
3
E
(Yi
_ypt
correctto O(N?). If only terms to 0(N1) are retainedin (5.19), we obtain the
simplifiedformula
(n - 1)
(5.20) v(Y)
[1 - (7ri + 7ri,) +
,
(f
wjrn)]
(Y
-
correct to O(N'). For the special case of equal probabilities7r = n/N, both
(5.19) and (5.20) reduce to the familiarformulafor estimator of variance in
equal probabilitysamplingwithout replacement.
6. A comparisonwith ratio method of estimation. Cochran[2] makes a comparisonof the varianceof Y in p.p.s. samplingwithreplacementwith the variance
of the ratio estimate, YR = (Y/X)X,in equal probabilitysamplingwithout the
finite populationcorrectionfactor. Since a compact expressionfor the variance
of fl in p.p.s. samplingwithout replacementis obtainedin this paper, it will be
of interest to comparethis variancewith the varianceof the ratio estimate not
ignoringthe finite populationcorrectionfactor. (5.17) can be written as
(6.1)
V( i') I--
-
nlliJi
(yi
_
ypi)2-
_
n
fl
)
1
(Vyi-_
ypi)2
to order O(N'). The varianceof the ratio estimate fR for large samples is
(6.2)
V( YR)
=
n(N
)(1 I -N)
E (yi_YP)2
Since N - 1 = N[1 - (1/N)], expanding (Nfrom (6.2),
(6.3)
V(YR)
N E (-i - Ypi)
n i
1)-' biniomiallywe obtain
n
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
SAMPLING
WITH
UNEQUAL
371
PROBABILITIES
to O(N'). Since the correction factors in (6.1) and (6.3) are exactly the same,
the comparison of V( ) with V( 1R) reduces to the comparison of the variance
of f in p.p.s. sampling with replacement with the variance of the ratio estimate
without the correction factor. Cochran, assuming the model y2 = Yp2 + ei where
E(ei Ipi) = 0 and E(e' Ipi) = ap , g > 0, a > 0, has shown that the variance
of f in p.p.s. sampling with replacement is smaller or greater than the variance
of the ratio estimate without the correction factor according as g > 1 or g < 1
respectively. If g = 1, the variances are identical. It is also stated that in practice
g usually lies between 1 and 2, so that the p.p.s. estimate is generally more
precise.
APPENDIX I
The order of magnitude of the remainder terms p', w and p
p' denotes the remainder when approximating E, by f dv. It involves the
terminal differentials of the integrands at the end points of integration v = 0
and v = N - 2 which become zero since V2is infinite at these points and since
p' also involves the remainder
the integrands involve the term exp (-v2/2).
of
the form
is
which
term of the Euler-Maclaurin formula
(N
-
[(N - 2)ON]/(2m)
2) B2mf(2m)
!,
where B2m is the Bernoulli number, f(2m) is the (2m)th derivative with regard
to v of any of the integrand functions involved in (4.1.13) and 0 < 0 < 1 while
the order of the remainder term, 2m, is at our disposal.
2)ON and the corresponding
The relation between the v argument of (Nis
from
by
given
(4.1.14),
v2argument,
V2 =
[1
-
a(7r1 +
72) ] (1
-
[ON( 1 -
20N)
ON)Y] S72l(N
- 2y)
We now separate the values of ONbetween 0 and 1 into two groups. In the first
group, ONis equal to I or the leading term of the difference between ON and 2 is
proportional to, N rN with rN > 0. The remaining values of ONfall in the second
group. It is easily seen that for the values of ONin the second group V2is 0(N3)
with s > 0 since S12 is of order O(N'1), and the argument to be used for the
remainder term in case (b) below also applies to the values of ONin this group.
We
For values of ONin the first group, either v2is zero or is of order 0(N+rN).
'
>
first
the
4.
Consider
and
<
two
cases
rN
the
(b)
rN
(a)
now distinguish
find
we
in
u
the
variable
(4.1.14),
(4.1.15)
by
given
case (a). Introducing
v2
= const. (N
-
2)-'Sj2u[1
= const. (N
-
2S1
(2u/N
-
E (-2)
(
-2)2]-
)i(2u/N
-
2)2i+1
Therefore, by repeated differentiation, we have that for the largest value of Jul,
dtv2 = dtV2 dut
dvt
0(N't).
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
372
H.
0.
HARTLEY
AND
J. N.
K. RAO
The repeated differentiation of f(v2) with regard to v or u will now be seen to
have a leading term of the form (d'f/dv2) (dv2/du)t which is of order 0(Nt12).
Therefore, from the Leibnitz formula of differentiation of a product which is of
the form vpbf(v2), b > 0, as evident from (4.1.13), it is seen that the remainder
term is 0(Nk) with k > 4 provided 2m is chosen sufficiently large. In case (b),
the remainderterm goes down as 0(e-bN Na) wheres = 1 - 2rN > 0, and hence
is smaller than 0(N4).
Therefore, the remainder term p' does not contribute to
P12 to 0(NF4).
We consider next the magnitude of the remainder w arising from the application
of the Euler-Maclaurin formula (4.1.12) to the differences P(zi) - P(z2) and
P(3)(z1) - P'3) (z2). The first of these is of the form c(z1 - Z2)5P(O)(z) where
Z2 < z < z1 . Now from (4.1.10), (4.1.15) and (4.1.18) we have
z
-z2
=
2w2(N
-
2)-Si21[1
+ 0(p2N')],
from which it appears that the leading term of z1 - Z2 iS of order 0(N-) and
that z = v2 + O(NT). Therefore, a contribution co, (say) to cois given by
WI= (N-1)'
=
(N
E f
-
(Z1
1)Y'r1[2r2(N
dt
-z2)5P"5)()
-2)5S21]5
E
[1 +
0(p2NF1)][P5)(V)
+
0(N4])
v
Now because of the properties of the normal differentials, E, P(O)(V2) iS
and, since dv/dV2= O(N) and dp/dv2 = O(N0), all terms in W,are
seen to be of smaller order than 0(N4).
Similar arguments apply to the difference P(l) (z1) - p(3) (z2) as well as to the remainder terms arising from applying
Euler-Maclaurin formula to the integration f l dt of the retained terms in p(l),
P'3', p(4) and p(6).
Finally, we turn to the discussion of the remainder p given by the sum (4.1.11)
with R(v) given by the double expansion (4.1.7) involving the cumulants kr and
higher differentials of the standardized normal P(z). It is necessary here to show
that kr (r ? 4) is of order 0(Nc) with c > 2 provided that
O(e-NNa)
0 <q
?
v/N < 1
-
q <l
as N -*> oo. We have shown that k, is of order O(N-r+l) when the K,. are bounded
and v/N satisfies the above condition.3 However, we shall not give this discussion
here. Instead, we use a formula recently employed by Barton and David [1]
who show that, if the set of finite populations of size N are assumed to be random
samples from the same infinite "super" population with cumulants Kr, the rth
cumulant
kI
= [(N
- v/N)
rv-r?l
+ (-1)'(v/N)'y(N
-V)
2r+]Kr
provided we make the above assumpwhich shows that k, is of'order 0(N-2rl)
tion which, although more restrictive than that by Madow, is adequate for our
3This is also the assumption made by Madow [6].
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
WITH
SAMPLING
373
PROBABILITIES
UNEQUAL
purposes. We should point out that Barton and David do not make the above
assumption since they use the above formula only as a step to proving another
formula which does not involve the K7 (see their equation (1)). We now split
the sum in (4.1.11) into two parts, viz., the central part where v/N is satisfying
the condition mentioned above and the remainder (tail) sum. It is easy to see
that the latter sum is of order O(e-NNa) because of the asymptotic properties
of the standardized normal differentials. For the former sum, since kris O(Nhr+l)X
it can be shown by an analysis similar to the term D involving k3 that for each
term of (4.1.7) when substituted in (4.1.11) there results a term of order smaller
than O(NT4). The inversion of the double summation in (4.1.7) and the limiting
process N -?oo is not discussed here.
II
APPENDIX
Verification of the order of magnitude of the probabilities Pii,
We have from (4.1.25) that Pii is given by
.
pi
(N-2)1r
(N
1)(2 -
[1 -h-2
7r'
-
-
7
+ 3h
-
2K3 h-3(N
2)]
-
7ri')
to O(Ar4), where the subscripts 1 and 2 are replaced by i and i' respectively.
We now show that (4.1.25) is in fact correct to 0(A745) by verifying that
= O(AV-). Using (4.1.17) for h
D174i Pii, =ri to an order (N - 1) 0(4)
can be put in the form
for
the
above
for
Pii,
expression
and (4.1.24)
K3,
.
p
Pii,
(N
-_2)ri
Tit
(2 -
(N-1)
N
,
r-1-1i
7#
(
(2 _ri
_
23
31
-
_
s2
_
- 7r-i-i
-
- 3
(N1~~~~~~1
N-
~~~2
Exj8
6
(2
2
(N -2)
N
7ri) 4
_
4
3
1
N
-(2 - 7ri-
2 2
_)
+N-3
N
2
_
16
1
(
(L
N2-3
ri-
i, )3
-7rj-
3)r
2
E7r,
(N-2)(2-x,-ri7r,2)j
reducesto
which, to O(N-4),
N
2
_
Pii = 7rX7ri
_
_
_
_(2 -7ri-
r
,/ Tj
_~~1
_
_
_
ri,)
1
(2 -
2
-
2
7i
2
-
'7r
7r - w-i)3
+
N
3 (()
(2 -
Xi-
\2
tj
N
12.
2 Ertj
_
7ru,)5
(2 -
Expanding all denominators binomially and retaining all terms to
ri 0(NV4),
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions
-
,) 4
we
374
H. 0.
HARTLEY
AND J. N. K. RAO
obtain
_t[
Pis,
+ 7ri') + '(ri
+
2(i
-
8
'
T
+
(rt
+ 7r' + 27ri7ri')]
+
+
7ri')]
+ 3
+
jQrtr'
trj')
E(
)2
32 (1
)
-
8 ((E 1
Summing the above expression for Pii, over i' from 1 to N except i' = i and
noting that EN 7r; = 2, we obtain
N
N
EPii
- j ri(2
-
7wi) + 4ji7r(2 -
i) +
i
(j
\
2
2
-
N
N
+
7i
-t
-,r
ri(2
-
r)
1
which, to O(N3),
ri
1
reduces to 7rithereby providing the announced check.
REFERENCES
[1] BARTON, D. E. and DAVID, F. N. (1961). The central sampling moments of the mean in
samples from a finite population (Aty's formulae and Madow's central limit).
Biometrika 48 199-201.
[2] COCHRAN, W. G. (1953). Sampling Techniques. Wiley, New York.
[3] DES RAJ (1956). A note on the determination of optimum probabilities in sampling
without replacement. Sankhya 17 197-200.
[4] HORVITZ, D. G. and THOMPSON,D. J. (1952). A generalization of sampling without
replacement from a finite universe. J. Amer. Stat. Assn. 47 663-685.
[5] KENDALL, M. G. and STUART, A. (1958). The Advanced Theory of Statistics, Rev. ed.
Hafner, New York.
[6] MADOW, W. G. (1948). On the limiting distributions of estimates based on samples from
finite universes. Ann. Math. Statist. 19 535-545.
[7] NARAIN, R. D. (1951). On sampling without replacement with varying probabilities.
J. Ind. Soc. Agric. Statist. 3 169-174.
j8] WISHART, JOHN (1952). Moment coefficients of the k-statistics in samples from a finite
population. Biometrika 39 1-13.
[9] YATES,F. and GRUNDY, P. M. (1953). Selection without replacement from within strata
with probability proportional to size. J. Roy. Statist. Soc., Ser. B 15 235-261.
This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC
All use subject to JSTOR Terms and Conditions