Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Sampling with Unequal Probabilities and without Replacement Author(s): H. O. Hartley and J. N. K. Rao Source: The Annals of Mathematical Statistics, Vol. 33, No. 2 (Jun., 1962), pp. 350-374 Published by: Institute of Mathematical Statistics Stable URL: http://www.jstor.org/stable/2237517 Accessed: 10-04-2015 20:50 UTC Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and extend access to The Annals of Mathematical Statistics. http://www.jstor.org This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions SAMPLING WITH UNEQUAL PROBABILITIES AND WITHOUT REPLACEMENT' BY H. 0. HARTLEY AND J. N. K. RAO Iowa State University 0. Summary. Given a population of N units, it is required to draw a random sample of n distinct units in such a way that the probability for the ith unit to be in the sample is proportional to its "size" xi (sampling with p.p.s. without replacement). From a number of alternatives of achieving this, one well known procedure is here selected: The N units in the population are listed in a random order and their xi are cumulated and a systematic selection of n elements from a "random start" is then made on the cumulation. The mathematical theory associated with this procedure, not available in the literature to date, is here provided: With the help of an asymptotic theory, compact expressions for the variance of the estimate of the population total are derived together with variance estimates. These formulas are applicable for moderate values of N. The reduction in variance, as compared to sampling with p.p.s. with replacement, is clearly demonstrated. 1. Introduction. Most survey designs incorporate as a basic sampling procedure the selection of n units at random, with equal probabilities and without replacement drawn from a population of N units. It is, however, sometimes advantageous to select units with unequal probabilities. For example, such a procedure may be found appropriate when a "measure of size" xi is known for all the units in the population (i = 1, 2, ***, N) and it is suspected that these known sizes xi are correlated with the characteristics yi for which the population total Y is to be estimated. One method (though by no means the only method) of utilizing the xi is to draw units with probabilities proportional to sizes xi (p.p.s.), a technique frequently used in sample surveys, particularly for primary sampling units in multi-stage designs. Now the theory of sampling with unequal probabilities is equivalent to multinomial sampling provided units are drawn with replacement. On the other hand, it is well known from the theory of equal-probability selection that sampling with replacement results in estimators which are less precise than those computed from samples selected without replacement; the proportional reduction in variance is given by n/N ("finite population correction"). It has therefore been felt for some time that a similar increase in precision should be reaped by switching to selection without replacement in unequal probability sampling. However, the theory of unequal probability sampling without replacement involves mathematical and computational difficulties and has therefore not been fully developed. Received January 30, 1961; revised September 18, 1961. 1 Research sponsored by the Office of Ordnance Research, U.S. Army, under Grant No. DA-ARO (D)-31-124-G93. 350 This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions WITH SAMPLING PROBABILITIES UNEQUAL 351 A general theory of unequal probability sampling without replacement was first given by Horvitz and Thompson [4]. Since then, several papers have been published on the topic, but we shall review here only the papers relevant to the particular problem considered here. The estimator of the total Y as proposed by Horvitz and Thompson is n (1) g Y Y+2 EP, 7ri with variance N (1.2) V(P) =r 2 1 7i N1V i'>i=i i 7ri7ri' yiyi' y - where 7riis the probability for the ith unit to be in a sample of size n and Pii, is the probability for the units i and i' both to be in the sample. Now when the 7ri are exactly proportional to the y , V( g) is zero which suggests that if we make 7ri proportional to xi, i.e., if we set N (1.3) (n-1)ri yxi= Pii, = a considerable reduction in the variance will result since "xi' and "yi" are correlated. Horvitz and Thompson, for the case n = 2, propose two methods to satisfy (1.3) approximately, but their methods have some limitations. Yates and Grundy [9] also deal with n = 2, and they suggest an iterative procedure to obtain revised "size measures" which satisfy (1.3) approximately, but their method becomes cumbersome when N becomes large. Moreover, this method is unmanageable when n > 2. Des Raj [3] employs (1.3) as a set of N equations for the N(N - 1)/2 probabilities Pii, and determines the latter by minimizing (1.2) subject to (1.3). This leads to a "linear programming problem" for the N(N - 1)/2 positive Psi,. The "objective function" (the variance) involves the unknown population values y1 and these are replaced by the known sizes xi with the assumption that (1.4) yi = ca + fxi exactly. Even if this assumption is accepted, the method is clearly unmanageable for large N. Moreover, if it is assumed that the yi of the population satisfy (1.4) exactly with unknown a and f, then, clearly the regression estimator has zero variance and, even if an error term is introduced into (1.4), the regression estimator would still be the "best estimator" so that it is of little interest to consider other estimators under such assumptions. In this paper, we adopt a particular procedure of drawing a sample in such a way that (1.3) is satisfied exactly with the original sizes xi. Although this procedure, which is described in Section 2, is well known to survey practitioners (e.g., Horvitz and Thompson [4], p. 678), no formulas for the probabilities Pii, in terms of the 7riare available in the literature, due to mathematical difficulties. This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions 352 H. 0. HARTLEY AND J. N. K. RAO These difficulties are resolved in this paper and compact expressions for the variance and estimated variance of I are obtained for moderate values of N. In Section 2, the sampling procedure is described and illustrated with the help of a numerical example. Section 3 deals with the cases n = 2, N = 3 and 4. We develop an asymptotic theory for the case n = 2 and N large in Section 4. In Sub-section 4.1, Pi, is explicitly derived in terms of the gr. Sub-section 4.2 deals with the evaluation of the variance formulas, and a numerical example is given in Sub-section 4.3. Estimation of the variance is considered in Sub-section 4.4. Section 5 deals with the general case n > 2, and compact expressions for the variance and estimated variance of Y are obtained, which are applicable when N is relatively large compared to n. In Section 6, a comparison with ratio estimation is made. The extension of the theory developed here for simple sampling to more complex sample designs such as stratification, multi-stage sampling, etc., is comparatively straightforward and therefore is not presented here. 2. A particular sampling procedure. 2.1. Description of the procedure.It is easy to show that there is no sampling procedure whatsoever satisfying (1.3) unless -y = n(n - 1)/X and np,i < 1, (2.1.1) where pi = xilX and X is the population total of the xi. Henceforth we shall only consider such sizes xi and associated probabilities pi which satisfy the necessary condition (2.1.1). The following sampling procedure is now considered: a. Arrange the units in a random order and denote (without loss of generality) by j = 1, 2, * , N this random order and by Ij = 1 (npi), Ilo = 0, the progressive totals of the (npi) in that order. b. Select a "random start", i.e., select a "uniform variate" d with 0 ? d < 1. Then the n selected units are those whose index, j, satisfies . (2.1.2) I1j-1l d + k < llj for some integer k between 0 and n - 1. Since np, ? 1, every one of the n integers k = 0, 1, * **, n - 1 will select a different sampling unit j. It is easy to show that ri = npi for this sampling procedure. 2.2. Numerical example. Consider a population of N = 8 units arranged in a random order and with sizes xj shown in the second column of Table 1. A sample of n = 3 is to be drawn using our sampling procedure. Instead of computing the quantities npi we scale all computations up by a factor of X/n = 300/3 = 100. Thus we compute the progressive sums of the xi and these are shown in column 3 of Table 1 and correspond to the quantities X H1^/n.Then we select a random integer between 1 and 100 and this corresponds to the quantity Xd/n. In our example this integer turned out to be 36 and the selection of the three units in accordance with (2.1.2) is shown in column 4. We must find the lines (j) where the column 100 HIjpasses through the levels 100d = 36 (for k = 0), 100d + 100 = 136 (for k = 1) and 100d + 200 = 236 (for k = 2). The units This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions SAMPLING Size j UNEQUAL 353 PROBABILITIES TABLE 1 3 units from population of N = 8 unite (p.p.s.) Selection of n Unit Number WITH Progressive Sum Start - 36 X/n= Step= 100iiooH, 100 1 15 15 2 3 4 5 6 81 26 42 20 16 96 122 164 184 200 k = 1,100d + 100 = 136 7 45 245 k =2, lOOd + 200 = 236 8 55 300 k = 0, 100d = 36 j = 2, 4 and 7 are thereby selected. This procedure(either with or without the initial randomization)has been frequentlyused but, in the absenceof a better theory, is usually treated approximatelyas a p.p.s. sample drawnwith replacement. 2.3. A cyclicalanalogueto the sampling procedure.From the point of view of the mathematicaltreatmentit is convenientto use an alternativebut stochastically equivalent procedureto the steps a and b as follows: a'. Arrange the units in a random order, denote by j = 1, 2, *., N this randomorderand form (as before) the progressivetotals II, given by (2.1.2). Mark off on the perimeterof a circle with radiusn/(27r) arcs of lengths (np,) in clockwise direction starting at the top (jth unit correspondsto jth arc of length np, on the circle). b'. Select a uniformarc s with 0 < s < n. Then the n selectedunits are those whose indicesj satisfy , llj-i (2.3.1) s+k < Hj for some integerk between -(n - 1) and (n - 1) for whichalso0 ?S s + k < n. Precisely n units are selected by this process. It is clear that all results in Sub-section2.1 still hold. For we know with certainty that of the above values of s + k the algebraicallysmallest will lie between 0 and 1 and this may be identifiedwith the variate d in step a. 3. The cases n = 2; N = 3 and 4. 3.1. The case n = 2, N = 3. Clearly we have (3.1.1) Pi,i = 1 - ri. = 7ri + 7ri,- 1, wherei" is the third unit in the populationand 7ri+ 7ri, + 7ri = 2. Substituting for Pii, from (3.1.1) in (1.2), and after some algebra,we get ,312 ., V( P) 3 Y,< x,,s .*) 2 This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions 354 H. 0. J. N. K. RAO AND HARTLEY (1 - 7rj)yj/7rj. In the special case of equal rw = 3, (3.1.2) where Y* = reduces to the well known variance formula in equal probability sampling without replacement. 3.2. The case n = 2, N = 4. We evaluate P12 and assume2 without loss of generality that (3.2.1) 7rl _ and 7r2 T4 > 7r3. We distinguish two subclasses of the randomization results: Class 1. The units i Class 2. The units i = = 1 and i = 2 are adjacent on the circle. 1 and i = 2 are separated by one unit. It is easy to see that in class 1 the probability that i = 1 and i = 2 are the sampled units is given by (7r, + (3.2.2) P12=i r2 -1 if 7rI + 7r 2>_ ~~~~~~~if Trl + 7r2 K 1 O while in class 2, using (3.2.1), it is given by (3.2.3) (3.2.3) p,,}~0 P12 ( 7rl + 72 + 73- 7)2 i f 71 + 73 _ if ri + 7rs >1. Since the relative frequency of sequences in the above classes 1 and 2 are in the ratio I to J, the overall probability P,2 is given by (3.2.4) P12 = 1P'2 + IPX12. The substitution of (3.2.4) in (1.2) yields a formula for V( P), although not in a compact form. Similar results have been obtained for the case n = 2, N = 5. As N increases, the exact evaluation of Pii, becomes more and more laborious and, in any case, the resulting formula will be too complicated to yield a practical formula for V( ). Therefore, we develop an asymptotic theory in the next section and obtain compact expressions for V( 2). 3.3. Numerical example. To compare the efficiency of our sampling procedure for the case n = 2, N = 4 with both the procedures of Yates and Grundy [9] and Des Raj [3] we use the three populations examined by these authors. The three populations have the same set of pi values and are given in Table 2. Variances of 1 for the three procedures and the three populations are given in Table 3. Variances for procedures 1 and 2 are taken from Des Raj [3]. For procedure 3 the variance is obtained from (3.2.4) and (1.2). It may be noted that substitution of (3.2.4) in (1.2) gives an exact formula for the variance. The variance of P for p.p.s. sampling with replacement is also shown in Table 3 for comparison. From Table 3 it is seen that procedures 1, 2 and 3 are more efficient than ' The numbering of the units i = 1, 2, 3, 4 is here beforerandomization. This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions SAMPLING WITH UNEQUAL 355 PROBABILITIES TABLE 2 Three populations of size N = 4 Unit Number Pi Population A y. Population B y. Population C y. 1 2 3 4 0.1 0.2 0.3 0.4 0.5 1.2 2.1 3.2 0.8 1.4 1.8 2.0 0.2 0.6 0.9 0.8 TABLE 3 Comparativeefficiencyof four sampling procedures Procedure 1. 2. 3. 4. Des Raj Yates and Grundy Hartley and Rao With replacement PopulationA PopulationB PopulationC Var. Eff. % Var. Eff. % Var. Eff. % 0.200 0.323 0.367 0.500 100.0 61.9 54.5 40.0 0.200 0.269 0.367 0.500 100.0 74.3 54.5 40.0 0.100 0.057 0.033 0.125 100.0 175.4 333.3 80.0 samplingwith replacementfor all the three populations.For populationsA and B the linear model (1.4) is fairly well satisfied so that Des Raj's "optimum procedure"is more efficient.For population C the model is not appropriateso that considerableloss in efficiencyresultsfrom Des Raj's procedureas compared to procedures2 and 3. No generalstatement can be made between procedures2 and 3 regardingefficiency.However, it can be shown that procedures2 and 3 have exactly the same V(g2) to order O(N') for large N. For the present example with N = 4, this result for "large N" does not, of course, apply. 4. The case n = 2 and N large. Since the case n = 2 is useful particularlyin stratifieddesignsand since this case has been extensivelydealt with in the literature, we considerthis case in detail in the followingsubsections. 4.1 Evaluationof theprobabilitiesPii, . For the evaluationof Pii, we make the following two assumptions: (a) 7ri ? cN' for all i and N, and (b) cjN2 < il < c2N2 for all pairs (i, i') and N, where c, c1 and C2 are universal constants and where S4, is the mean squareof the 7rj (j $ i, i'). It shouldbe noted that the upperlimit in (b) is, of course,a consequenceof (a). While assumption (a) is vital, the lowerlimit in assumption(b) could be circumventedby a special argument not given here. Under the above two assumptionsit can be shown that statements on the relativeorder of magnitudeof our leading term V'( 1) in the varianceformulaV( f) (see eq. 4.2.4) to the terms of lowerorderof magnitude can be made. In fact it can be shown that the tth lower order term in V( I) divided by V'( 1) is less than or equal to const. Nt (t = 1, 2, 3), where the terms for t = 1, 2 have been retainedin equation (4.2.2). In orderto sim- This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions H. 0. HARTLEY AND J. N. K. RAO 356 plify the argument, however, we fix an arbitrary scale of absoluteorder of magnitude by making, in what follows, the additional assumptions that 0 _ yi _ c* for all i and N and that the variance formula in sampling with replacement, V'( P), is of order 0(N2) (in practice this is usually the case). In sampling without replacement V'( Y) will be the leading term in the variance formula and it is necessary, in order to supply formulas for moderately large N, to evaluate the term of next lower order of magnitude in powers of N-'. This term will represent the gain in precision due to sampling without replacement. The variance of Y to order O(N') is obtained by evaluating the probabilities Pii, to order o(N-3) and substituting it in the variance formula (1.2). For the benefit of smaller size populations we also find V(Y) to order O(N?) by evaluating Pii, to order O(Nm4) and substituting it in (1.2). We use the circular analogue to the sampling procedure described in Subsection 2.3. The total number of arrangements of the N units on the circle, namely N!, can be divided into (N - 1) groups according to whether there are v = 0,1, ... , (N - 2) units "between" i and i', where "between" means that we have v units when proceeding from unit i to i' in clockwise direction. There are N X (N - 2) ! arrangements in each of these (N - 1) groups so that they are all represented with equal probability 1/N - 1. Consider now the contribution to Pii, from a particular group with v units between units i and i'. For the ith unit to be in the sample, the inequalities Hi-i < s + k < Hi must be satisfied where k can take values -1, 0 and 1 and s is a uniform arc with 0 ? s < 2. This means that either Hi-l < s < Hi or Hi-, - 1 < s < Hi - 1 if Hi-, _ 1 and HIi-L+ 1 < s < Hi + 1 if Hi < 1 must be satisfied. Therefore, to evaluate Pii, we have to add the contributions to Pii, from the first case, say P'i,, and from the second case, say P"-,. Since the length of the range of s is equal to 7riin both cases, Psi, and P"', are identical. Consider now the evaluation of P,ti, . Since IIi-1 <_ s < IL2, (4.1.1) a positive contribution to P$'i, can be made only if Hli + T _ s + 1 < Hi + T, + 7ri, (4.1.2) is satisfied,where T, = Erv1r. (4.1.2) can be written as 1+ t - 7ri < Tv < 1 + t r- - 2 where t = s -Hi-l . The uniform variate t (like s) has an ordinate density of 2, and from (4.1.1) we have 0 ? t < irsi. Therefore, the integrated contribution to P$i' is given by 7ri f - Pr(1 + t -7r (4.1.3) < Tv 1 + t - ir) dt [FV(1+t-7i )-Fv(1+t-ini- -7rij 7r =2 This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions i)]dt, WITH SAMPLING 357 PROBABILITIES UNEQUAL where Ft(T) denotes the cumulative distribution function of the total T,. Since the units are arranged in a random order prior to drawing the sample, Tv represents the total of v values of the lrj drawn with equal probability and without replacement from the finite population of (N - 2) values of the 7rj. Therefore, noting that EN rj = 2, we find that E(Tv) N-2N -2 = = vir (say) and V(T,) = v[l -v/(N -2)]M, where ,S~i" Si2i, (N-3X-' _ )-1 (N (N - i N r (7rj _ f,)2 E i#(i.i') 2 3)-1{Z _-2 iri) [(2- -, - /(N -2)] Finally, adding the two (identical) contributions to P'i, and P"i, given by (4.1.3), multiplying by the factor (N - 1)-i which represents the (constant) probability of a random arrangement of N units in which v units lie "between" units i and i', and summing over v, we obtain N-2 Pii, = (N - 1)-1 E v=O (4.1.4) ri + t - ri) - FJ1 *I[FV(l + t - ri - ri,)] dt. In order to obtain usable results, we now evaluate an approximation to (4.1.4) by expanding Ft(T) in an Edgeworth series of which the cumulative normal integral is the leading term. Edgeworth series representation of a cumulative distribution function F(x) is (see, e.g., Kendall and Stuart [5], p. 158) given by (4.1.5) exp { F(X) (- } P(x), exp {-_y2} dy; Dj where P(x) -(27r)-f 00 Dj is the jth order derivative operating on P(x) and kj denotes standardized cumulants. In what follows we assume, without loss of generality, that i = 1 and i' = 2. In our case, (4.1.5) is applied to the standardized variate TV - [v(2 - S12{V[1 - 71 - r2)/(N - 2)] v/(N -2)11 This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions 358 H. 0. AND HARTLEY K. RAO J. N. in place of x so that F(x) is the finite proportion F,(z) (say) of values z, with z, < z. This function is therefore a step function with a finite number of discontinuities and the complete series representation (4.1.5) will yield this step function for almost all values of z while at the points of discontinuity the right hand side of (4.1.5) will yield Pr(z, < z) + ' Pr(z. = z). We, therefore, have Fv(z) (4.1.6) P(z) - (k3/6)D3P(z) + R(v), = where the kj are the cumulants of z, and (4.1.7) R(v) = exp kj(j) { Dj} P(z) k D3} P(z) - {1 - and therefore is a double infinite series each term involving a power product of the cumulants kj and an associated high-order derivative DrP(z), the term with the least order differential is seen to be (k4/4!) D4P(z). In order to express the cumulant k3 in terms of the standardized cumulant K3 (say) of the finite population of the irj, we make use of the results given by Wishart [8]. In terms of the variable z, we find, after some substitution, that (4.1.8) k3 [ = 1- (1-N-2) (N (1-N--2)7 -{v - 2)] K3 Substituting (4.1.6) in (4.1.4) we obtain N-2 Z E P12= (N- i(4.1.9) *f where D3P(z) =P Z= V=o 71 {P(zl) P(z2) - - k3 [p(3)(Zk3 [P3(3) dt + (Z2 (z) {t + I-7r- /8S12 vi v(2-7r1-7r2)(N-2)} V 2)' -N (4.1.10) Z2 = {t + 1 -7 -72 and the remainder term p p = (N 2)' - N - is given by N-2 (4.1.11) /S12 V'(1 -v('2-7r-7r2)(N-2)-} - 1)- rl Zf [R(z1) - dt R(Z2)] while k3 is given by (4.1.8). We now apply the Euler-Maclaurin formula fb g(t) dt-. g(b) - g(a) (b (4.1.12) + (b = - 24 - a)3 a)gql' (a + b) (3) 9 a + 2 b (b + - a) 5 1920 This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions g WITH SAMPLING 359 PROBABILITIES UNEQUAL here formulated for a general function g(x) satisfying the required continuity conditions, where gr)(x) denotes the rth order derivative with regard to x and t is such that a _ t < b. This formula is first applied to the differences - P(Z2) P(zl) and P(3)(zi) - P(3) (Z2) in (4.1.9) thereby giving rise to four terms involving respectively p(l), p(3), p(4) and p(6). The integration over t of these four terms is then performed by again using (4.1.12). Retaining only the relevant terms in (4.1.9) we obtain N-2 (4.1.13) = (N-J Pl2P~~~(N~~1V k+ 7r3 72 F3 L 21 7T2 12 1) ) (3)(v) dv+ P + w+ Ply 6S12j where v1 = v[1 - v/(N - (4.1.14) (71 + 72) - P24 k3 i7r2vTP(4)(V2) 24S12 V2 = V1 2rl2 vJP41 (v2)+ P(3)(V2)_ Vl - -1 7 2)], v(2 N S1212V N2)] /2(l ) and p is given by (4.1.11) while corepresents the aggregated remainder terms in the application of formula (4.1.12) and p' denotes the remainder when approximating Jv by f dv. The discussion of the remainder terms p, coand p' is given in Appendix I where it is shown that these remainder terms do not contribute to P12 to the order of approximation desired, namely O(N-4). We now evaluate the remaining terms in (4.1.13). The first term which involves exp {-v2/2} P(1)(V2) - (2r) will be called A. We now make the transformation u = v- (4.1.15) (N-2) so that v = (N - 2){1 - [4u2/(N - 2)]b,. Expanding the exponential in P(l) (v2) as well as the term v-2 in powers of u, we obtain A (4.1.16) - - (N (N lh4p6 - 2) 1) (2 7r1r2 (2Xr - - 7T2) e-P2/2 exp f2 {-h-2p4 hJ_ + higher terms) (1 + lh-22 + 3h-44 + higher terms) dp, where (4.1.17) h = (2 - 7r- 7r2) (N -2) Sg2 and the variable of integration has been changed to p = 2 uh(T- (4.1.18) Expanding the term exp 2)-. { } in (4.1.16) and multiplying by the series inside This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions 360 H. 0. HARTLEY K. RAO AND J. N. (4.1.16) we reach (N -2) A (N (4.1.19) 1) (2 - - [1 + 2h-2(p2 7r17r2 rl h 7r2) + 8h-4(3p4 p4) - (7)' - - eP/ 6p6 + p8) + higher terms] dp. Since h is O(MT), we can (for large N) replace the integration limits in (4.1.19) by - X and + X apart from errors which are O(e N'a). Using the standardized normal moments we therefore obtain (N-2) A. (4.1.20) (2- (1 - 7r-2 h-2 + 3h-4) to order 0(W4). By a similar argument it can be shown that the first of the two terms in (4.1.13) which involves P(3)(v2) and which we call B, can be reduced to ri 7r2g2(2r B 1 6(N 1)(2- - r r2) - h lh-2(p6 1)- e-P212[(p2 f 6p4 + 3p2) + higherterms]dp - while the other term involving P(3)(v2), which we call C, reduces to r 24(N (4.1.22) -2) (27r)-' 7r2 1'(N 12 mri( _ 1)(2 - - ri - r2) h e _P2'2[(p2 - 1)- h-2(p6 - 4p4 + p2) + higherterms]dp. h The integral of the terms retained in (4.1.21) is zero while C can be seen to so that B and C do not contribute to P12 to order 0(N4). Finally, be 0(N5) consider the term D (say) involving k3 in (4.1.13). Substituting for k3 from (4.1. 18), we obtain D = 2(N D = 3(N - 2)r7r2K3 rl1)(2- 7r2) h (2irX- Lf h P2/22ih (p4 - 3p2) + -h3(8p6 _ 9p4 - p8) + higherterms]dp, and the evaluation of the terms retained yields D (4.1.23) K3 812 ) (2 -7r17r2)4 since which is of order0(N4) (4.1.24) (N = (N - 2)- ( -rj 2 - 2 1.2) This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions 361 SAMPLING WITH UNEQUAL PROBABILITIES Adding the expressions A and D we obtain for the probability P12 an approximation to order o(N-4) given by (4.1.25) (N - 2)7r2 1r2 P12 where h is given by (4.1.17) and h2 + 3h - 2K3 h-3(N - 2)'] - by (4.1.24). Since the last two terms in K3 we obtain to order 0(N F3) the simplifiedformula (4.1.25) are O(N4), (4.1.26) [1 P12-( (N - (N - 2)7r17r2 1)(2- 'l- (1 - h2). 7r2) We now apply two checks to verify, independently of the arguments given in Appendix I, that the remainder terms p, w and p' do not contribute to P12 to are retained in (4.1.25) and that all terms to order 0(N4) order 0(N4), above. The first check is the special case when all 'r1 are equal to 2/N so that S12 = 0 and h'1 = 0, and (4.1.25) reduces to 2/N(N - 1) which is the correct probability for units 1 and 2 to be both in a sample of size 2. However, this check tests only the leading term in (4.1.25) since h 1 = 0 so that the coefficients of the remaining terms in (4.1.25) are not affected by the check. A more searching check which takes account of all the terms in (4.1.25) is provided by testing the order to which the equation N = (n -1)7ri ZPii, (4.1.27) is satisfied. It is shown in Appendix II that (4.1.27) is in fact satisfied to order if (4.1.25) is substituted in (4.1.27) which confirms that (4.1.25) is O(Nm3) correct to order 0(N4). 4.2. Varianceformulasto orders0(N1) and O(N0). In AppendixII we have simplified (4.1.25) to the following expression: Pii, -27i + ri'[1 + I2(3i + 7i') + 4(7r + 7ri, 2ri ri')] N (4.2.1) - Eir41 x7rtlrt + 3(3r, + 7ri,)] + + 1?r?,ri' 7r(7rr') (i r(- + ) 7ri t - (f r) to o(N-4), where the subscripts 1 and 2 are replaced by i and i' respectively. Substituting (4.2.1) in the variance formula (1.2) we obtain after considerable rearrangement of terms V(( (4.2.2) N ( )r )r 3 7i E - Z.d/k ( (8 _ _\ N 4 ( 74 ? 2 This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions Z 7ri, 362 H. 0. HARTLEY AND J. N. K. RAO correct to O(N?). If only terms to O(N1) are retained in (4.2.2), we obtain V( ) (4.2.3) ()(i 7ri correct to O(N'). For the special case of equal probabilities iri = 2/N, (4.2.2) reduces to the well knowii variance formula in equal probability sampling withTABLE 4 Data for population of size N i = = 20 1 2 3 4 5 6 7 8 9 10 Yi Xi 19 18 9 9 17 14 14 12 21 24 22 25 27 23 35 24 20 17 15 14 i= 11 12 13 14 15 16 17 18 19 20 Yi 18 37 12 47 27 25 25 13 19 12 Xi 18 40 12 30 27 26 21 9 19 12 out replacement to O(N0). In sampling with replacement the variance of f' is (4.2.4) V( Z U) ri 2 which is of order O(N2). (4.2.3) when compared with (4.2.4) clearly shows the characteristic reduction in the variance through the "finite population corrections" (1- -7ri). 4.3. Numerical example. Horvitz and Thompson [4] give an example of N = 20 blocks in Ames, Iowa. The data are reproduced in Table 4 below where yi denotes the number of households in ith block and xi denotes the eye-estimated number of households in ith block. The probability iri for the ith unit to be in the sample is taken proportional to the eye-estimated number of households Xi, i.e., iri = 2xi/X. In Table 5 below we give the evaluation of the variance of f from formulas (4.2.2), (4.2.3) and (4.2.4). Also shown in Table 5 are the evaluations of the variances of alternative estimators which are described by Horvitz and Thompson. A comparison of the variances in Table 5 shows that sampling with p.p.s. is vastly superior to sampling with equal probabilities. It must not be forgotten, however, that there are other devices of decreasing the variance in the latter case with the help of the known xi values, e.g., ratio and regression methods of estimation. Among the procedures of p.p.s. sampling there is little to choose in this example except that about 7% (235/3241) is gained in precision through our procedure of sampling without replacement as compared to sampling with replacement. It is of interest to exhibit the nature of convergence of the various approximations to V( ') by regarding the variance formula for sampling with replacement as an approximation to O(N2) as set out in Table 6. The con- This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions WITH SAMPLING 363 PROBABILITIES UNEQUAL TABLE 5 Variances of various estimators of the total of the yi-population shown in Table 4 Form of Numerical Value of Estimator Variance of Estimator Sampling Procedure Equal probability sampling without replacement p.p.s. with replacement * 3,241-Eq. (4.2.4) yi/7ri E Horvitz and Thompson procedure 1 16,219 Ng n 3,095 yi/Pi E n Horvitz and Thompson procedure 2 3,075 yi/Pi E n Present procedure; sampling without replacement and p.p.s. 3,025-Eq. (4.2.3) 3,007-Eq. (4.2.2) yi/7rt E * For a description of these procedures and definition of the Pi see Horvitz and Thompson [41,Tables 2 and 3. TABLE 6 Approximations for population of Table 4 to V(t) Order of Approximation Formula Used V(g) Difference O(N2) Eq. (4.2.4) Eq. (4.2.3) Eq. (4.2.2) 3,241 216 3,025 18 O(N1) O(NO) vergence in this example appears to be satisfactory although the population size (N = 20) is much smaller than those encountered in survey work. 4.4. Estimate of the variance. Horvitz and Thompson [4] give an unbiased estimate of V( Y) that seems to be unsatisfactory since it may take negative values. Yates and Grundy [9] propose an alternative unbiased estimator of variance which is believed to take negative values less often, and is given by (4.4.1) vn) 7r i = Pii, Yi _yi - where v( g) denotes the estimator of V( ). For the case n to v(g) = (4.4.2) w - - Pii, - 2, (4.4.1) reduces = , 7ri 7ri/ where the units numbered i and i' are included in the sample of size 2. Substituting for Pii, from (4.2.1) in (4.4.2), we find after some simplification N V( ) _ - r(i I + + 7i') (N)2 2- E (4.4.3) 2 2+ ') + 2Z -1 2 N N 3 + 4Iri + 7ri') Z 2r} 1 1 Yi ( /i' J\w~~~~~i 7ri'/ This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions 364 H. 0. HARTLEY AND J. N. K. RAO to order O(N?). If only terms to order O(N1) are retained in (4.4.3), we get the simplified formula (4.4.4) v(g) +irT') + 2-(ir - 7r, ri t For the special case of equal probabilities irj = 2/N, both (4.4.3) and (4.4.4) reduce to the well known formula for estimator of variance in equal probability sampling without replacement. In this connection, it may be worth while to point out an important aspect of Yates and Grundy estimator of variance for the case n = 2. Narain [7] has shown that a necessary condition for a sampling procedure without replacement, for which 7rj = npj, to be more efficient than sampling with replacement is < 2(n (4.4.5) for all i and i'. For the case n (4.4.6) = - n 1) 2, (4.4.5) reduces to Pii, _r.-ri. Therefore, using (4.4.6), it immediately follows that Yates and Grundy's estimator of variance (4.4.2) is always positive. That is, if there is a sampling procedure without replacement for which the variance is smaller than the variance when sampling with replacement independent of the yj's, which is the case we are interested in, then Yates and Grundy's estimator of variance is always positive. It may be noted that this result is true only for n = 2, since conditions (4.4.5) are not sufficient to show that (4.4.1) is always positive when n > 2. (4.4.1) is always positive if conditions (4.4.6) for all i and i' (i # i') are satisfied. However, conditions (4.4.5) do not imply (4.4.6) except when n = 2. (4.2.3) and (4.2.4) imply that, to O(N2), our particular sampling procedure without replacement is more precise than sampling with replacement independent of the yj's, so that (4.4.6) follows as a necessary condition to 0(N3). Conditions (4.4.6) can, of course, be directly verified for our sampling procedure using (4.2.1). 5. The general case n > 2 and N large. A striking feature of our sampling procedure is that it permits an evaluation of Pii , and hence of V( ?) and v( ), for the case n > 2. It is interesting to note that most of the published literature on this topic does not have anything to offer for n > 2, and deals only with the case n = 2, due to difficulties in evaluating Pii, . The methods of attack for n > 2 are similar to those used for n = 2. However, the former case presents certain new features. Therefore, in this section we give a brief derivation of Pi, with details of the new features. This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions WITH SAMPLlNG 365 PROBABILITIES UNEQUAL Extending the arguments given in Sub-section 4.1, it can be shown that s; [N-n Pii, = (N (.) + t -7r,-xr,) )dt +**X [F,(l + t -7r,)-F,(1 EI ,-1 N-n+m Z + + | [F,(m+1+t-7ri)-F.(m+1+t-7ri-7rsi)]dt+ s N-2 [F,(n -1+ j E t - xi,) F.(n -1+ - t -, - 7i,)] dt}, o- n-20 where, as before, F.(T) denotes the cumulative distribution function of the total T. of the v values 7r , = v(n E(T.) 7r,)/(N r- - - 2), and V(T.) = vI1 - [v/(N - 2)lSi, It may be noted that (5.1) reduces to (4.1.4) when n = 2. Now it will be shown that each of the (n - 1) integrals summed over v in (5.1) contributes identically to Pi,, to order O(N4), making the assumptions mentioned earlier in Subsection , n - 2) in (5.1), 4.1. Consider the mth integral summed over v (m = 0, 1, say ,(Si given by s; N-n+m = (N- E 1)- + 1 + t-,- [F(m + 1 + t-,)-F(m | )dt, and let i = 1 and i' = 2 without loss of generality. Proceeding now exactly as in the case of n = 2, by expanding F,(T) in an Edgeworth series, applying the by f dv, we find Euler-Maclaurin formula (4.1.12), and approximating N-nm P(m") = (N- 1)_ _ | TIT2 v4P(1) (V2m) + 2 p(3)(V2m) vLP L2 (5.2) + T1T2 Vp1P (P V2m) -6kS1n"2 VIP(4 (V2m)] dv + Pm + (On + Pm where v V2 = 1vl m - [v/(N -2)]J + 1-rl + "' - v(n - Ir - TO (Vis12), PM, Corand p' are the remainder terms similar to p, coand p' for the case n= 2, P(')(x) denotes the rth order derivative of the cumulative normal distribution P(x), and k8 is the standardized cumulant of the total T. given by (4.1.8). It can be shown that Pm, c"mand Pmdo not contribute to P(2) to O(N4), using This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions 366 H. 0. HARTLEY AND J. N. K. RAO arguments similar to those employed for p, w and p' in Appendix I. We now evaluate the remaining terms in (5.2). The first term which involves P(C)(V2n) will be called Am . We make the transformation V- (5.3) cm=U , where Cm 2(m + 2(n- N-2 - 1) - 7r2 rl r2) ir1- Then, in terms of u, we have V1(CmN2)[ 2 +(Nu (N Cm) (Cm 2) (cm - n )J Expanding now the exponential in P(1)(V2m) as well as the term vL- in powers of u (justification of the expansion can be easily shown), and changing the variable of integration u to p, where p (5.4) = uh(N (cm -N -2) 2) where h = (n (5.5) - r-i r2)(N 2S1-21 -2) we find, after considerable simplification, that (N -2)717(2 + hmn (3p2 )-t _6p4 + 8 (5.6) h2 PI) h- [/ + h8 (3P4- 4 (2 6p6 + 8 p -2 1h + / (15p4 - 45p6 + + 384 (105p4 - 420p6 + 210p8 - 15p8 pO) - 28p10+ p12) + higher terms dp, where (N - 2)s12(1 him (n - 7 r2) ( m2- 2c) N--2) The limits of integration in (5.6) are h(m - cm) (N - 2) {cm - [cm1/(N - 2)]} This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions SAMPLING WITH UNEQUAL 367 PROBABILITIES and h(N -n + m - cm) (N - 2)`{cm - [cl/(N - 2)]}, which are -O(N) and 0(N2) respectively. Therefore, the integration limits can be replaced by - oo and + oo apart from errors which are 0(e-N Na). The main feature here is the appearance of a "non-centrality type" term him which is zero for the case n = 2. Using now the first twelve standardized normal moments, we find that the coefficients of all terms involving himare zero, and that Am * (5.7) (N - 2) r2 + 3h4) (1-h-2 Since h does not depend on m, Am also does not depend on m. o(N-4). By a similar analysis, it can be shown that the first of the two terms in (5.2) which involves P(3)(V2m) and which we call Bi, can be reduced to to (N2)7rirASj22 24(N1)(n - i - (5.8) *Fe-P212[(p2-1) + h L (p5 2 + hlm (p8 cm- N- -2) (2r) - 6p3 + 3p) ~~~~~~~~2 + 45p4 _ -15p6 6p4 + 3p2) - (p6 -_ 1 Cm \ 2 higherterms p2)+ dp while the other term involving P(3)(V2m), which we call Cm, reduces to = C'm (5.9)* (N 24(N I e2 - - -1-7r 72) + him (p5 -1) [(p2 L J 2)7r72(27)-1 1)(n 4p3 + p) - ~~~~~2 + h2n(p8 _ (p h- - 2 lip6 + 21p4 3p2) - 2 4p4 + higher terms dp. Using now the standardized normal moments, we find that the integral of the so that Bm and Cmdo terms retained in Bm is zero and Cmis of order 0(W5), not contribute to P17) to order O(N 4). Finally, the term Dm (say), involving k3, in (5.2) is reduced to Dm = - (N -2)irT12K3(2ir)Y'2f iri- 6(N-1l)(n- r2) J QVCm +hl (p6 6p N 2m 2) 1 e'2/2 p2 L (p7 - 2 -6p5 - Cm20 33p) ( J__ + 3p3) 2 + hlm p9-13p7 13 _ m (p9 +8 himh-2 4 (10 + 33P5 + 3P -1p+3 - 9P3) 33p6 3 6_94 -p) This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions -p 368 + hi. AND HARTLEY H. 0. (N- 2 - + h2m(p8 - 13p8 + - 2) (cn 33p6_ -(N 3p2) (8_6?3p 2) - pCm (N-2)3 6p6 + 3p4)]-h-3(N - - 9p4) - . 3 )21[(p Nc 2)2 (1 h 2 ?hlm(76?3p4) + h-2(N 2) N- - ( Cin 8m(plo + 45P4) 45p4)]C' - + K. RAO - 24pl + 150p8 -240p6 (p12 +h-'j(p1 (5.10) J. N. (1 - - 2)? Cm (m )l0P - 3 N- 2) 3p4) )j(p 3(NP-2)4-(1 2)ir1r2K3 h3(N2 + higher terms5 dp. Using now the standardized normal moments, the integral of the terms retained in (5.10) reduces to Dm-6(NP 6)(n- -3i hr(N-)2) 12(1 2; (5.11) _-6(1 N2)ro +-2(N 2)N- N- 2) (N(N m2N-2) -2)( +N )( -3 2)+ 2 ) (N -2)4( )l N- This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions 369 SAMPLING WITH UNEQUAL PROBABILITIES Further simplification of (5.11) results in - Dm-in (5.12) (N - 2(N - 2)(ri 1) (n 2 - K3 'l- h-3(N 2) - 7r2) which is of order 0(W4) and does not depend on m. Adding now the expressions (5.7) and (5.12) for Am and Dm respectively, we obtain for P(2 ) an approxima- tion to order0(N4) (5.13) given by (N P(rn) - (N - -[1 2)7r,lr2 - 7rl- 1)(n h 2 + 3h4-4 - 2K3h 3 (N 2 - 7r2) Since (5.13) does not depend on m, it follows that n-2 P12= (5.14) E P m=0 ( )7,-r (N-2)r 1)(n (n-1) (N1- 2 7r2h - r- + 3h - 7r2) 2K3 h3(N - -2) For the special case n to order 0(N4). 2, (5.14) reduces to (4.1.25). The two checks mentioned in Section 4 for the case n = 2 are also satisfied by (5.14). Expressing h and K3 in terms of the irj we obtain from (5.14): pi i (n-i) n ___n (n-1) i fl+ + i ri 3i+ri' (1ri 3 + + N - 3(n (ir ri, + ri ri,) + 2(n-1) N v fl (5.15) -1) n + __(n __ 1) 21s'ilr nn4 (rs 2 N 2 2 1n5 3(n +rs r,)- ) r 1T) +7r r,.gr .r% 2 2(n -1) f4 r 1 to O(N-4), where the subscripts 1 and 2 are replaced by i and i' respectively. Substituting (5.15) in the variance formula (1.2) we obtain V( (5.16) - E [1 fl (n- 1) 7rf2) _ (n-1l)?E(27r3 (ni _ 7ri 2n correct to 0(N?). If only terms to 0(N') (5.17) Y (yi V(2)=- 7ri[i (n_ + _ (n Y) ) ( Y1 are retained in (5.16), we obtain i) )j](Ys _ This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions 370 H. 0. HARTLEY AND J. N. K. RAO correct to O(N'). The variance of Y in sampling with replacement is (5.18) V'(1?) = Y (Yi E-T n 7ri which is of order O(N2). (5.17) when compared with (5.18) clearly shows the characteristic reduction in the variance through the "finite population corrections" {1 - [(n - 1)/n]rjw} Substituting (5.15) in Yates and Grundy's estimator of variance (4.4.1), we find 1) v(f) - (n- [1 2 12 _ n ( 2i +7i_i (5.19) - (7ri+wrs) +(f r 2( n3\ 7)ri rjn) 2A r Jn + + irit) 2~ +2 Zir 3 E (Yi _ypt correctto O(N?). If only terms to 0(N1) are retainedin (5.19), we obtain the simplifiedformula (n - 1) (5.20) v(Y) [1 - (7ri + 7ri,) + , (f wjrn)] (Y - correct to O(N'). For the special case of equal probabilities7r = n/N, both (5.19) and (5.20) reduce to the familiarformulafor estimator of variance in equal probabilitysamplingwithout replacement. 6. A comparisonwith ratio method of estimation. Cochran[2] makes a comparisonof the varianceof Y in p.p.s. samplingwithreplacementwith the variance of the ratio estimate, YR = (Y/X)X,in equal probabilitysamplingwithout the finite populationcorrectionfactor. Since a compact expressionfor the variance of fl in p.p.s. samplingwithout replacementis obtainedin this paper, it will be of interest to comparethis variancewith the varianceof the ratio estimate not ignoringthe finite populationcorrectionfactor. (5.17) can be written as (6.1) V( i') I-- - nlliJi (yi _ ypi)2- _ n fl ) 1 (Vyi-_ ypi)2 to order O(N'). The varianceof the ratio estimate fR for large samples is (6.2) V( YR) = n(N )(1 I -N) E (yi_YP)2 Since N - 1 = N[1 - (1/N)], expanding (Nfrom (6.2), (6.3) V(YR) N E (-i - Ypi) n i 1)-' biniomiallywe obtain n This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions SAMPLING WITH UNEQUAL 371 PROBABILITIES to O(N'). Since the correction factors in (6.1) and (6.3) are exactly the same, the comparison of V( ) with V( 1R) reduces to the comparison of the variance of f in p.p.s. sampling with replacement with the variance of the ratio estimate without the correction factor. Cochran, assuming the model y2 = Yp2 + ei where E(ei Ipi) = 0 and E(e' Ipi) = ap , g > 0, a > 0, has shown that the variance of f in p.p.s. sampling with replacement is smaller or greater than the variance of the ratio estimate without the correction factor according as g > 1 or g < 1 respectively. If g = 1, the variances are identical. It is also stated that in practice g usually lies between 1 and 2, so that the p.p.s. estimate is generally more precise. APPENDIX I The order of magnitude of the remainder terms p', w and p p' denotes the remainder when approximating E, by f dv. It involves the terminal differentials of the integrands at the end points of integration v = 0 and v = N - 2 which become zero since V2is infinite at these points and since p' also involves the remainder the integrands involve the term exp (-v2/2). of the form is which term of the Euler-Maclaurin formula (N - [(N - 2)ON]/(2m) 2) B2mf(2m) !, where B2m is the Bernoulli number, f(2m) is the (2m)th derivative with regard to v of any of the integrand functions involved in (4.1.13) and 0 < 0 < 1 while the order of the remainder term, 2m, is at our disposal. 2)ON and the corresponding The relation between the v argument of (Nis from by given (4.1.14), v2argument, V2 = [1 - a(7r1 + 72) ] (1 - [ON( 1 - 20N) ON)Y] S72l(N - 2y) We now separate the values of ONbetween 0 and 1 into two groups. In the first group, ONis equal to I or the leading term of the difference between ON and 2 is proportional to, N rN with rN > 0. The remaining values of ONfall in the second group. It is easily seen that for the values of ONin the second group V2is 0(N3) with s > 0 since S12 is of order O(N'1), and the argument to be used for the remainder term in case (b) below also applies to the values of ONin this group. We For values of ONin the first group, either v2is zero or is of order 0(N+rN). ' > first the 4. Consider and < two cases rN the (b) rN (a) now distinguish find we in u the variable (4.1.14), (4.1.15) by given case (a). Introducing v2 = const. (N - 2)-'Sj2u[1 = const. (N - 2S1 (2u/N - E (-2) ( -2)2]- )i(2u/N - 2)2i+1 Therefore, by repeated differentiation, we have that for the largest value of Jul, dtv2 = dtV2 dut dvt 0(N't). This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions 372 H. 0. HARTLEY AND J. N. K. RAO The repeated differentiation of f(v2) with regard to v or u will now be seen to have a leading term of the form (d'f/dv2) (dv2/du)t which is of order 0(Nt12). Therefore, from the Leibnitz formula of differentiation of a product which is of the form vpbf(v2), b > 0, as evident from (4.1.13), it is seen that the remainder term is 0(Nk) with k > 4 provided 2m is chosen sufficiently large. In case (b), the remainderterm goes down as 0(e-bN Na) wheres = 1 - 2rN > 0, and hence is smaller than 0(N4). Therefore, the remainder term p' does not contribute to P12 to 0(NF4). We consider next the magnitude of the remainder w arising from the application of the Euler-Maclaurin formula (4.1.12) to the differences P(zi) - P(z2) and P(3)(z1) - P'3) (z2). The first of these is of the form c(z1 - Z2)5P(O)(z) where Z2 < z < z1 . Now from (4.1.10), (4.1.15) and (4.1.18) we have z -z2 = 2w2(N - 2)-Si21[1 + 0(p2N')], from which it appears that the leading term of z1 - Z2 iS of order 0(N-) and that z = v2 + O(NT). Therefore, a contribution co, (say) to cois given by WI= (N-1)' = (N E f - (Z1 1)Y'r1[2r2(N dt -z2)5P"5)() -2)5S21]5 E [1 + 0(p2NF1)][P5)(V) + 0(N4]) v Now because of the properties of the normal differentials, E, P(O)(V2) iS and, since dv/dV2= O(N) and dp/dv2 = O(N0), all terms in W,are seen to be of smaller order than 0(N4). Similar arguments apply to the difference P(l) (z1) - p(3) (z2) as well as to the remainder terms arising from applying Euler-Maclaurin formula to the integration f l dt of the retained terms in p(l), P'3', p(4) and p(6). Finally, we turn to the discussion of the remainder p given by the sum (4.1.11) with R(v) given by the double expansion (4.1.7) involving the cumulants kr and higher differentials of the standardized normal P(z). It is necessary here to show that kr (r ? 4) is of order 0(Nc) with c > 2 provided that O(e-NNa) 0 <q ? v/N < 1 - q <l as N -*> oo. We have shown that k, is of order O(N-r+l) when the K,. are bounded and v/N satisfies the above condition.3 However, we shall not give this discussion here. Instead, we use a formula recently employed by Barton and David [1] who show that, if the set of finite populations of size N are assumed to be random samples from the same infinite "super" population with cumulants Kr, the rth cumulant kI = [(N - v/N) rv-r?l + (-1)'(v/N)'y(N -V) 2r+]Kr provided we make the above assumpwhich shows that k, is of'order 0(N-2rl) tion which, although more restrictive than that by Madow, is adequate for our 3This is also the assumption made by Madow [6]. This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions WITH SAMPLING 373 PROBABILITIES UNEQUAL purposes. We should point out that Barton and David do not make the above assumption since they use the above formula only as a step to proving another formula which does not involve the K7 (see their equation (1)). We now split the sum in (4.1.11) into two parts, viz., the central part where v/N is satisfying the condition mentioned above and the remainder (tail) sum. It is easy to see that the latter sum is of order O(e-NNa) because of the asymptotic properties of the standardized normal differentials. For the former sum, since kris O(Nhr+l)X it can be shown by an analysis similar to the term D involving k3 that for each term of (4.1.7) when substituted in (4.1.11) there results a term of order smaller than O(NT4). The inversion of the double summation in (4.1.7) and the limiting process N -?oo is not discussed here. II APPENDIX Verification of the order of magnitude of the probabilities Pii, We have from (4.1.25) that Pii is given by . pi (N-2)1r (N 1)(2 - [1 -h-2 7r' - - 7 + 3h - 2K3 h-3(N 2)] - 7ri') to O(Ar4), where the subscripts 1 and 2 are replaced by i and i' respectively. We now show that (4.1.25) is in fact correct to 0(A745) by verifying that = O(AV-). Using (4.1.17) for h D174i Pii, =ri to an order (N - 1) 0(4) can be put in the form for the above for Pii, expression and (4.1.24) K3, . p Pii, (N -_2)ri Tit (2 - (N-1) N , r-1-1i 7# ( (2 _ri _ 23 31 - _ s2 _ - 7r-i-i - - 3 (N1~~~~~~1 N- ~~~2 Exj8 6 (2 2 (N -2) N 7ri) 4 _ 4 3 1 N -(2 - 7ri- 2 2 _) +N-3 N 2 _ 16 1 ( (L N2-3 ri- i, )3 -7rj- 3)r 2 E7r, (N-2)(2-x,-ri7r,2)j reducesto which, to O(N-4), N 2 _ Pii = 7rX7ri _ _ _ _(2 -7ri- r ,/ Tj _~~1 _ _ _ ri,) 1 (2 - 2 - 2 7i 2 - '7r 7r - w-i)3 + N 3 (() (2 - Xi- \2 tj N 12. 2 Ertj _ 7ru,)5 (2 - Expanding all denominators binomially and retaining all terms to ri 0(NV4), This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions - ,) 4 we 374 H. 0. HARTLEY AND J. N. K. RAO obtain _t[ Pis, + 7ri') + '(ri + 2(i - 8 ' T + (rt + 7r' + 27ri7ri')] + + 7ri')] + 3 + jQrtr' trj') E( )2 32 (1 ) - 8 ((E 1 Summing the above expression for Pii, over i' from 1 to N except i' = i and noting that EN 7r; = 2, we obtain N N EPii - j ri(2 - 7wi) + 4ji7r(2 - i) + i (j \ 2 2 - N N + 7i -t -,r ri(2 - r) 1 which, to O(N3), ri 1 reduces to 7rithereby providing the announced check. REFERENCES [1] BARTON, D. E. and DAVID, F. N. (1961). The central sampling moments of the mean in samples from a finite population (Aty's formulae and Madow's central limit). Biometrika 48 199-201. [2] COCHRAN, W. G. (1953). Sampling Techniques. Wiley, New York. [3] DES RAJ (1956). A note on the determination of optimum probabilities in sampling without replacement. Sankhya 17 197-200. [4] HORVITZ, D. G. and THOMPSON,D. J. (1952). A generalization of sampling without replacement from a finite universe. J. Amer. Stat. Assn. 47 663-685. [5] KENDALL, M. G. and STUART, A. (1958). The Advanced Theory of Statistics, Rev. ed. Hafner, New York. [6] MADOW, W. G. (1948). On the limiting distributions of estimates based on samples from finite universes. Ann. Math. Statist. 19 535-545. [7] NARAIN, R. D. (1951). On sampling without replacement with varying probabilities. J. Ind. Soc. Agric. Statist. 3 169-174. j8] WISHART, JOHN (1952). Moment coefficients of the k-statistics in samples from a finite population. Biometrika 39 1-13. [9] YATES,F. and GRUNDY, P. M. (1953). Selection without replacement from within strata with probability proportional to size. J. Roy. Statist. Soc., Ser. B 15 235-261. This content downloaded from 193.0.118.39 on Fri, 10 Apr 2015 20:50:40 UTC All use subject to JSTOR Terms and Conditions