STATISTICS & PROBABILITY LETTERS
Statistics & Probability Letters 40 (1998) 307-319

Another look at the jackknife: further examples of generalized bootstrap

Snigdhansu Chatterjee *
Theoretical Statistics and Mathematics Unit, Indian Statistical Institute, 203 B. T. Road, Calcutta 700 035, India

Received January 1998; received in revised form May 1998

Abstract

In this paper we have three main results. (a) We show that all jackknife schemes are special cases of generalised bootstrap. (b) We introduce a new generalised bootstrap technique called DBS to estimate the mean-squared error of the least-squares estimate in linear models where the number of parameters tends to infinity with the number of data points, and the error terms are uncorrelated with possibly different variances. Properties of this new resampling scheme are comparable to those of the UBS introduced by Chatterjee (1997, Tech. Report No. 2/97, Calcutta). (c) We show that delete-d jackknife schemes belong to DBS or UBS depending on the limit of d/n. We also study the second-order properties of jackknife variance estimates of the least-squares parameter estimate in regression. © 1998 Elsevier Science B.V. All rights reserved

Keywords: Bootstrap; Jackknife; Least squares; Many parameter regression; Second-order efficiency

1. Introduction

The bootstrap method, introduced in Efron (1979) to understand the jackknife better, is a general technique to estimate the distribution of statistical functionals. The naive bootstrap scheme is to sample with replacement from the data, calculate the statistic of interest for each resample, and repeat this process over all possible resamples. Suppose $(w_1, w_2, \ldots, w_n)$ is a random sample from Multinomial$(n; 1/n, 1/n, \ldots, 1/n)$.
The naive bootstrap can then be viewed as attaching the random weight $w_i$ to the $i$th data point, calculating the statistic of interest for the new data, and finally integrating out the extraneous randomisation induced by the $w_i$'s. Bootstrap with random weights is often referred to as generalised bootstrap. The bootstrap is not a generalisation of the jackknife, in the sense that the jackknife cannot be seen as a special case of naive bootstrap. However, generalised bootstrap is a direct generalisation of naive bootstrap, so it is of interest to know how generalised bootstrap relates to the jackknife. We show in this paper that all known jackknife procedures are special cases of generalised bootstrap.

We look at the problem of estimating the variance of the least-squares estimate in linear regression models where the number of parameters can tend to infinity with the number of data points. In linear regression, the random weights can be attached either to the pairs $\{(y_i, x_i),\ i = 1, \ldots, n\}$ or to the residuals $r_i = y_i - x_i^T\hat\beta$. We have the residual bootstrap when Multinomial$(n; 1/n, \ldots, 1/n)$ weights are attached to the centred residuals $r_i - \bar r$, and the paired bootstrap when square roots of such weights are attached to $\{(y_i, x_i),\ i = 1, \ldots, n\}$. Accordingly, there can be two varieties of generalised bootstrap in linear regression. The external bootstrap of Wu (1986), the wild bootstrap of Mammen (1989) and the weighted bootstrap of Liu (1988) are examples of generalised bootstrap where the generalisation is by attaching random weights to residuals. Generalised paired bootstrap is carried out by selecting random weights $\{w_{i:n};\ i = 1, \ldots, n\}$ and then transforming the $i$th data point $(x_i, y_i)$ to $(\sqrt{w_i}\, x_i, \sqrt{w_i}\, y_i)$.

* E-mail: [email protected].
0167-7152/98/$ - see front matter © 1998 Elsevier Science B.V. All rights reserved. PII: S0167-7152(98)00116-3
Let the least-squares estimate of the parameter $\beta$ be $\hat\beta$ using the original data and $\hat\beta_B$ with the transformed data. Assume that the weights satisfy $E_B(w_{i:n} - 1)^2 = \sigma^2_{Bn}$, $i = 1, \ldots, n$, where $\sigma^2_{Bn}$ may depend on $n$. The generalised bootstrap variance estimate is defined to be

$$V_B = \frac{1}{\sigma^2_{Bn}}\, E_B (\hat\beta_B - \hat\beta)(\hat\beta_B - \hat\beta)^T. \qquad (1.1)$$

The uncorrelated weights bootstrap (UBS), of which the paired bootstrap is a special case, was introduced in Chatterjee (1997), where the consistency of its variance estimate of the least-squares estimate was also established. In this paper we introduce a variant of the UBS, which we call the 'degenerate weights bootstrap' (hereafter DBS). We show that the DBS has the same properties as the UBS; only the conditions on the weights are different. We show that all the known jackknife schemes belong either to UBS or to DBS. In the delete-d jackknife scheme, if $d/n \to c \in (0, 1)$ as $n \to \infty$, then the scheme belongs to UBS; otherwise it belongs to DBS. In Bose and Chatterjee (1997) higher-order properties of resampling variance estimates of the least-squares estimate in linear regression were studied. We extend the results to the delete-d jackknife, exploiting its identification as a UBS or DBS scheme. Based on a consideration of second-order efficiency and likely model conditions, we recommend the delete-$[n/5]$ jackknife scheme. We know that the delete-1 jackknife fails to give a consistent variance estimator for non-smooth estimators like sample quantiles, unlike the delete-d jackknife with $d(n-d)^{-1} \to c \in (0, 1)$. Based on the fact that the delete-1 jackknife scheme belongs to DBS while with $d = O(n)$ the delete-d jackknife belongs to UBS, we conjecture that results for UBS can be carried over to non-smooth estimators, but results for DBS will not carry over to such situations. The idea of bootstrapping with random weights probably first appeared in Rubin (1981).
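The weighting mechanism and the variance estimate (1.1) can be sketched numerically. The following is a minimal Monte Carlo sketch of the generalised paired bootstrap, assuming a user-supplied weight sampler; the function names and the multinomial default are our illustrative choices, not the paper's.

```python
import numpy as np

def generalised_bootstrap_var(X, y, weight_sampler, n_boot=1000, seed=0):
    """Monte Carlo version of the generalised bootstrap variance estimate
    (1.1): V_B = sigma_Bn^{-2} E_B (beta_B - beta_hat)(beta_B - beta_hat)^T.
    `weight_sampler(rng, n)` must return non-negative weights with mean 1;
    sigma_Bn^2 = E(w_i - 1)^2 is estimated from the drawn weights."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    W = np.stack([weight_sampler(rng, n) for _ in range(n_boot)])
    sigma2_Bn = ((W - 1.0) ** 2).mean()
    acc = np.zeros((p, p))
    for w in W:
        sw = np.sqrt(w)                      # attach sqrt(w_i) to (x_i, y_i)
        beta_B = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
        d = beta_B - beta_hat
        acc += np.outer(d, d)
    return acc / (n_boot * sigma2_Bn)

def multinomial_weights(rng, n):
    # paired bootstrap: w ~ Multinomial(n; 1/n, ..., 1/n)
    return rng.multinomial(n, np.ones(n) / n).astype(float)

rng0 = np.random.default_rng(42)
X = rng0.normal(size=(60, 2))
y = X @ np.array([1.0, -2.0]) + rng0.normal(size=60)
VB = generalised_bootstrap_var(X, y, multinomial_weights)
```

With multinomial weights this reduces to the usual paired bootstrap; other weight laws satisfying the moment conditions of Section 2 can be plugged in through `weight_sampler`.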
Bootstrapping with exchangeable weights has been treated in Efron (1982), Lo (1987), Weng (1989), Zheng and Tu (1988) and Praestgaard and Wellner (1993). Other generalised bootstrap methods may be found in Boos and Monahan (1986), Lo (1991), Härdle and Marron (1991) and Mammen (1992, 1993). A review can be found in Barbe and Bertail (1995). Bootstrap schemes for linear models have been discussed in Efron (1979), Freedman (1981), and Bickel and Freedman (1983). Hinkley (1977), Wu (1986) and Shao and Wu (1987) have studied consistency of the bootstrap and the jackknife in heteroskedastic linear models. Bootstrap in regression models with many parameters has been considered by Bickel and Freedman (1983) and Mammen (1992), who showed the consistency of bootstrapping residuals and of the wild bootstrap, respectively, for the least-squares estimate of the regression parameters. Liu and Singh (1992) compared the well-known bootstrap and jackknife schemes in linear models. They showed that for estimating the variance of the least-squares estimate of $\beta$, some resampling schemes such as the paired bootstrap and the wild bootstrap produce consistent results under heteroskedasticity, while some others such as the usual residual bootstrap do not yield consistent estimates under heteroskedasticity but are more efficient under homoskedasticity. These resampling techniques are thus either robust or efficient, and accordingly they were classified as belonging to the R-class or the E-class. We consider a large class of models that includes the heteroskedastic linear model and the many-parameter regression model as special cases. In our set-up, we find a representation of the DBS variance estimator, from which it follows that this generalised bootstrap estimator belongs to the R-class. Analogous results for the UBS scheme have already been established in Chatterjee (1997). This shows that all the usual delete-d jackknife schemes belong to the R-class.
There are also some weighted jackknife schemes, namely those suggested by Hinkley (1977), Wu (1986) and Liu and Singh (1992). The first two are modifications of the usual jackknives, adapted to suit regression data of an unbalanced nature. Liu and Singh's weighted jackknife came up as an example of an E-class resampling technique. The asymptotic properties of these weighted jackknives are comparable to those of the usual delete-d jackknives, and their usefulness lies mainly in their small-sample performance. We show how to interpret these weighted jackknives as generalised bootstrap schemes.

The models that we consider consist of a linear signal term, where the number of parameters can tend to infinity with the number of data points, and the errors are assumed to be uncorrelated but not necessarily independent; they can have different variances. We describe the details of the model below. Let $\{\eta_n\}$ be a fixed sequence of real numbers, not identically zero. Also, let $\{p_n\}$ be a sequence of non-decreasing positive integers satisfying $p_n \le n$. For any $n \ge 1$, by $\beta_n$ we mean a vector of length $p_n$ which consists of $(\eta_1, \eta_2, \ldots, \eta_{p_n})$. Also, for each $i \ge 1$, $x_i^* = (x_i^*(j),\ j = 1, \ldots, p_i)$ is an observed random vector of length $p_i$. For each $n \ge 1$, the vectors $x_{i:n} = (x_{i:n}(j),\ j = 1, \ldots, p_n)$, $1 \le i \le n$, are vectors of dimension $p_n$ obtained by augmenting zeroes to $x_i^*$, i.e.

$$x_{i:n}(j) = \begin{cases} x_i^*(j) & \text{if } 1 \le j \le p_i, \\ 0 & \text{if } p_i < j \le p_n. \end{cases}$$

Since $\{p_n\}$ is non-decreasing, the above expression is well defined. We have a sequence of models $\{\mathcal{M}_n;\ n \ge 1\}$, where $\mathcal{M}_n$ is the model

$$y_i = x_{i:n}^T \beta_n + e_i, \quad 1 \le i \le n.$$

Note that the above model is nested, in the sense that $\beta_n$ is the first $p_n$ components of $\beta_m$ for any $m > n$. We drop the suffix $n$ from both $p_n$ and $\beta_n$, and denote $x_{i:n}$ henceforth by $x_i$. Let $X$ denote the $(n \times p)$-dimensional matrix whose $i$th row is $x_i^T$, and let $X^T$ denote the transpose of the matrix $X$.
Also, let $y$ and $e$ be the $n$-dimensional vectors whose $i$th entries are $y_i$ and $e_i$, respectively. Then $\mathcal{M}_n$ may be written as

$$y = X\beta + e. \qquad (1.2)$$

We make the following assumption on the nature of the design matrix. For any $m > p$, if $J_m = \{i_1, i_2, \ldots, i_m\}$ is a subset of $\{1, 2, \ldots, n\}$, and if $X^*$ is the $(m \times p)$-dimensional matrix whose $j$th row is $x_{i_j}^T$, then

$$X^{*T} X^* > m\, k(p)\, I \qquad (1.3)$$

for some $k(p) > 0$, where $I$ is the identity matrix. This is only slightly stronger than the condition that no $p$ of the $x_i$'s lie on a $(p-1)$- or lower-dimensional hyperplane. The other model assumptions are

$$x_i^T x_i < K(p) < \infty, \quad i = 1, 2, \ldots, n, \qquad (1.4)$$

$$E\Big(\sum_{1 \le i \le n} e_i\Big)^2 = O(n), \qquad (1.5)$$

$$E\, e_i e_j = 0, \quad i \ne j, \qquad (1.6)$$

$$\sum_{1 \le i \le n} e_i^8 = O_P(n). \qquad (1.7)$$

Note that (1.3) and (1.4) are very general conditions where the bounds are allowed to depend on the dimension of the design matrix. One consequence of these conditions is

$$0 < k(p) < d_i/n < K(p) < \infty, \qquad (1.8)$$

where $d_i$, $1 \le i \le p$, are the eigenvalues of $X^T X$. It may be noted that in the usual linear regression model with heteroskedastic errors, the error variables are assumed to be independent and to satisfy $E[e_i \mid x_i] = 0$, $\mathrm{Var}[e_i \mid x_i] = \tau_i^2$. Liu and Singh (1992) had further assumed that $\tau_i^2 \le c_0$ for all $i$, for a positive constant $c_0$. This implies conditions (1.5) and (1.6). Bickel and Freedman (1983) have shown the consistency of the residual bootstrap distribution of the parameter estimate if $p^2/n \to 0$. Such models have also been discussed in Portnoy (1984, 1985, 1988). In all these papers, the errors are assumed to be i.i.d. random variables with zero expectation and finite second moment. Since we allow the number of parameters to increase with the size of the data, this model also incorporates some dynamic linear models with long memory. One important special case is the ARCH model widely used in econometrics, and its variants.
A detailed review of the ARCH model and its variants can be found in Bera and Higgins (1993). The least-squares estimate of the parameter $\beta$ is given by

$$\hat\beta = (X^T X)^{-1} X^T y = \beta + (X^T X)^{-1} X^T e.$$

From our model assumptions, $(X^T X)^{-1}$ always exists. The quantity we wish to estimate is $V_n = E(\hat\beta - \beta)(\hat\beta - \beta)^T$. We denote the expectation of $ee^T$ by $\Sigma_n$, so that

$$(\Sigma_n)_{ij} = \begin{cases} \tau_i^2 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases}$$

2. The DBS resampling scheme

Let $\{w_{i:n};\ 1 \le i \le n,\ n \ge 1\}$ be a triangular array of non-negative random variables. We will drop the suffix $n$ from the notation of the weights. The generalised bootstrap resampling scheme is carried out by weighting each data point $(y_i, x_i)$ with the random weight $\sqrt{w_i}$, then computing the statistic of interest and taking expectation over the random weight vector. The above set-up can be taken as a direct generalisation of the paired bootstrap, where the $\{w_i;\ 1 \le i \le n\}$ are given by a random sample from Multinomial$(n; 1/n, \ldots, 1/n)$. Let $V(w_i) = \sigma^2_{Bn}$ and $\mathrm{Cov}(w_i, w_j) = c_{11}\sigma^2_{Bn}$, $i \ne j$, $1 \le i, j \le n$. We present the conditions for the UBS scheme first, and then discuss the conditions for the DBS scheme. The UBS scheme is discussed in detail in Chatterjee (1997). The letters $k$ and $K$, with or without suffix, are used as generic constants. Different indices $a, b, c, \ldots$ indicate that the relation holds for all possible choices of $a \ne b \ne c \ne \cdots$. We use the notation

$$c_{i j k \ldots} = E\left[\left(\frac{w_a - 1}{\sigma_{Bn}}\right)^{i} \left(\frac{w_b - 1}{\sigma_{Bn}}\right)^{j} \left(\frac{w_c - 1}{\sigma_{Bn}}\right)^{k} \cdots\right].$$

In the following relations, $i_1, i_2, \ldots$ denote positive integers. The conditions for the UBS scheme are:

$$E(w_i) = 1, \qquad (2.1)$$

$$\sigma^2_{Bn} \ge k > 0, \qquad (2.2)$$

$$Kn \ge \sum_{1 \le i \le n} w_i \ge kn, \quad K > k > 0, \qquad (2.3)$$

$$P_B[w_i > k > 0 \text{ for at least } [cn] \text{ of the } i\text{'s},\ c \in (1/3, 1)] = 1 - O(n^{-2}), \qquad (2.4)$$

$$c_{11} = O(n^{-1}), \qquad (2.5)$$

$$\forall\, i_1, \ldots, i_k \text{ satisfying } \sum_{j=1}^{k} i_j = 3:\quad c_{i_1 i_2 \ldots i_k} = O(n^{-k+1}), \qquad (2.6)$$

$$\forall\, i_1, \ldots, i_k \text{ satisfying } \sum_{j=1}^{k} i_j = 4:\quad c_{i_1 i_2 \ldots i_k} = O(n^{-k+2}), \qquad (2.7)$$

$$\forall\, i_1, \ldots, i_k \text{ satisfying } \sum_{j=1}^{k} i_j = 6:\quad c_{i_1 i_2 \ldots i_k} = O(n^{-k+3}), \qquad (2.8)$$

$$\forall\, i_1, \ldots, i_k \text{ satisfying } \sum_{j=1}^{k} i_j = 8:\quad c_{i_1 i_2 \ldots i_k} = O(n^{-k+4}). \qquad (2.9)$$
The above conditions are satisfied if $\{w_i\}$ follows Multinomial$(n; 1/n, \ldots, 1/n)$. We define the bootstrap estimate of $\beta$ to be

$$\hat\beta_B = \begin{cases} (X^T W X)^{-1} X^T W y & \text{if at least } p \text{ of the } w_i\text{'s are not } 0, \\ \hat\beta & \text{otherwise}, \end{cases}$$

where $W$ is the diagonal matrix of the weights, and the bootstrap variance estimate to be

$$V_B = \frac{1}{\sigma^2_{Bn}}\, E_B (\hat\beta_B - \hat\beta)(\hat\beta_B - \hat\beta)^T.$$

If the $y_i$'s are i.i.d., this variance estimate coincides with the generalised bootstrap variance estimate used for estimating the variance of the sample mean. See Barbe and Bertail (1995) for the use of this statistic for other statistical functionals. The notable feature of this resampling scheme is that the weights $\{w_{i:n}\}$ are asymptotically uncorrelated, so we call it the uncorrelated weights bootstrap (hereafter UBS).

The DBS, or 'degenerate weights bootstrap', is a variant of the UBS in which condition (2.2) is dropped and some of the cross-moment conditions are relaxed. The precise conditions are stated below:

$$E(w_i) = 1, \qquad (2.10)$$

$$\sigma^2_{Bn} \to 0 \text{ as } n \to \infty, \qquad (2.11)$$

$$Kn \ge \sum_{1 \le i \le n} w_i \ge kn, \quad K > k > 0, \qquad (2.12)$$

$$P_B[w_i > k > 0 \text{ for at least } [cn] \text{ of the } i\text{'s},\ c \in (1/3, 1)] = 1 - O(n^{-2}), \qquad (2.13)$$

$$c_{11} = O(n^{-1}), \qquad (2.14)$$

$$\forall\, i_1, \ldots, i_k \text{ satisfying } \sum_{j=1}^{k} i_j = 3:\quad c_{i_1 i_2 \ldots i_k} = O(n^{-k+1}\sigma_{Bn}^{-1}), \qquad (2.15)$$

$$\forall\, i_1, \ldots, i_k \text{ satisfying } \sum_{j=1}^{k} i_j = 4:\quad c_{i_1 i_2 \ldots i_k} = O(n^{-k+2}), \qquad (2.16)$$

$$\forall\, i_1, \ldots, i_k \text{ satisfying } \sum_{j=1}^{k} i_j = 6:\quad c_{i_1 i_2 \ldots i_k} = O(n^{-k+3}\sigma_{Bn}^{-2}), \qquad (2.17)$$

$$\forall\, i_1, \ldots, i_k \text{ satisfying } \sum_{j=1}^{k} i_j = 8:\quad c_{i_1 i_2 \ldots i_k} = O(n^{-k+4}\sigma_{Bn}^{-4}). \qquad (2.18)$$

As can be seen from condition (2.11), the variance of the weights goes to 0 as $n \to \infty$, and hence the weights are asymptotically degenerate. However, the crucial condition is still (2.14), which makes the degenerate weights bootstrap (DBS) a variant of the UBS. The expression for the bootstrap estimate of the variance of the least-squares estimate is as before. We denote this bootstrap variance estimate by $V_{DBS}$.
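For the multinomial weights mentioned above, the first two moment conditions can be checked in closed form, since each $w_i$ is marginally Binomial$(n, 1/n)$. The sketch below records those closed forms and verifies them by simulation; the helper function is our illustration, not part of the paper.

```python
import numpy as np

# Under w ~ Multinomial(n; 1/n, ..., 1/n), each w_i is Binomial(n, 1/n), so
# E w_i = 1 (condition (2.1)), sigma_Bn^2 = Var w_i = (n - 1)/n, bounded away
# from 0 as required by (2.2), and Cov(w_i, w_j) = -1/n for i != j, giving
# c_11 = Cov(w_i, w_j) / sigma_Bn^2 = -1/(n - 1) = O(n^{-1}) as in (2.5).
def multinomial_weight_moments(n):
    sigma2_Bn = (n - 1) / n
    c11 = (-1.0 / n) / sigma2_Bn
    return sigma2_Bn, c11

# quick Monte Carlo check of the closed forms (illustrative only)
rng = np.random.default_rng(1)
n, B = 50, 100_000
W = rng.multinomial(n, np.ones(n) / n, size=B)
emp_var = W[:, 0].var()
emp_cov = np.cov(W[:, 0], W[:, 1])[0, 1]
```

Note that $\sigma^2_{Bn} = (n-1)/n \to 1$, so multinomial weights satisfy the UBS condition (2.2) rather than the degeneracy condition (2.11) of the DBS.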
We have a parallel to Theorem 3.1 of Chatterjee (1997), which we state below. Let

$$(H)_{ij} = \begin{cases} e_i^2 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases}$$

Theorem 2.1. In resampling with weights satisfying the conditions for DBS, the following expansion holds for the mean-squared error term:

$$V_{DBS} - V_n = (X^T X)^{-1} X^T [H - \Sigma_n] X (X^T X)^{-1} + O_P(n^{-2} p^6 g(p)^2),$$

where $g(p) = K(p)^2 [K(p)/k(p)]^{2p} [p/(p-1)]^{p-1}$. In particular, DBS is a consistent resampling technique for the heteroskedastic linear model and also for the many-parameter regression model whenever $n^{-1} p^3 g(p) \to 0$.

Note that the expressions in the above expansion involve $(p \times p)$ matrices, where $p \to \infty$. The interpretation is as follows: a matrix $A_n = ((a_{n,ij}))$ is called $O_P(b_n)$ if

$$\sup_{1 \le i \le p,\ 1 \le j \le p} a_{n,ij} = O_P(b_n).$$

The proof of this theorem follows arguments identical to those presented in Chatterjee (1997). For the sake of completeness, we give a proof for the case $p_n \equiv 1$ in the appendix. Bose and Chatterjee (1997) derived higher-order terms in the variance expansion in the case $p = 1$ for different resampling schemes. We now present a similar result for the DBS scheme. Let $L_n^2 = \sum_{1 \le i \le n} x_i^2$ and assume the existence of the following limits:

$$\alpha = \lim_{n\to\infty} \left(\frac{n}{L_n^2}\right)^{1/2}, \quad \xi_0 = \lim_{n\to\infty} \frac{1}{n}\sum_{1 \le i \le n} x_i^2 \tau_i^2, \quad \xi_1 = \lim_{n\to\infty} \frac{1}{n}\sum_{1 \le i \le n} x_i^4 \tau_i^2,$$

$$\gamma_2 = \lim_{n\to\infty} \frac{1}{n}\sum_{1 \le i \le n} x_i^4, \quad \gamma_3 = \lim_{n\to\infty} \frac{1}{n}\sum_{1 \le i \le n} x_i^6.$$

Define

$$\eta_1 = \frac{\sum_{1 \le i \le n} x_i e_i}{\big(\sum_{1 \le i \le n} x_i^2\big)^{1/2}}, \qquad \eta_2 = \frac{\sum_{1 \le i \le n} x_i^3 e_i}{\big(\sum_{1 \le i \le n} x_i^6\big)^{1/2}}, \qquad \eta = (\eta_1, \eta_2)^T,$$

$$H = \begin{pmatrix} \alpha^4 \gamma_2 & -\alpha^5 \gamma_3^{1/2} \\ -\alpha^5 \gamma_3^{1/2} & \alpha^6 \gamma_3 \end{pmatrix} \qquad \text{and} \qquad Q = \eta^T H \eta.$$

Theorem 2.2. If $p = 1$, then the variance estimate from the DBS satisfies

$$n^{1/2} L_n^2 (V_{DBS} - V_n) = Z_n + n^{-1/2} T_n + O_P(n^{-1}),$$

where the second-order term is given by

$$T_n = Q + 3\sigma^2_{Bn} c_{22}\, \alpha^8 \xi_0 \gamma_2 - 2\sigma_{Bn} c_3\, \alpha^6 \xi_1 + n\big(3\sigma^2_{Bn} c_{112} - 2\sigma_{Bn} c_{12} - c_{11}\big)\, \alpha^4 \xi_0 = Q + R_n, \text{ say}.$$

The proof of this theorem is the same as that of Theorem 2.2(v) of Bose and Chatterjee (1997) and we omit it.

3. Another look at the jackknife

Let $d$ be an integer in $[1, n)$. From the set $\{1, 2, \ldots, n\}$ choose $d$ integers.
The delete-d parameter estimate $\hat\beta_{(-d)}$ is the usual least-squares estimate from the data with the chosen $d$ observations dropped. Such estimates are found for all possible choices of sets of $d$ observations to delete. The jackknife variance estimate is an appropriately scaled version of $\sum (\hat\beta_{(-d)} - \hat\beta)^2$, where the sum runs over all possible choices of $d$ integers from 1 to $n$. A detailed study of the delete-d jackknife can be found in Shao (1988), Wu (1986, 1990), Shao and Wu (1987, 1989) and Shao and Tu (1995). For example, in the usual delete-1 jackknife, introduced by Quenouille (1956), the variance estimator is

$$V_J = \frac{1}{n(n-1)} \sum_{1 \le i \le n} (J_i - \hat\beta)^2.$$

Here the pseudovalue $J_i$ is defined as $n\hat\beta - (n-1)\hat\beta_{(-i)}$, and $\hat\beta_{(-i)}$ stands for the least-squares estimate from the data set with the $i$th observation deleted. The above sum can also be replaced by a weighted sum, with weights possibly depending on the subset of ignored observations.

Let us introduce some notation which we use to describe the important weighted jackknife schemes. Let $G = X^T X$ and, for any $i$, $\delta_i = x_i^T G^{-1} x_i$. For an $(n \times p)$ matrix $A$ and $S \in \mathcal{S}_{n,d}$ as defined in the proof of Theorem 3.1 below, $A_S$ is defined to be the submatrix of $A$ formed by deleting the $i_1$th, ..., $i_d$th rows of $A$. Denote $X_S^T X_S$ by $G_S$. Our model conditions imply that $G_S$ is positive definite for all choices of $S \in \mathcal{S}_{n,d}$, as long as $d + p \le n$. Let $\hat\beta_S = G_S^{-1} X_S^T y_S$. The variance estimator of Hinkley (1977) is

$$V_{JH} = \frac{n}{n-p} \sum_{1 \le i \le n} (1-\delta_i)^2 (\hat\beta_{(-i)} - \hat\beta)(\hat\beta_{(-i)} - \hat\beta)^T$$

and that of Wu (1986) is

$$V_{JW} = \left[\binom{n-p}{d-1} |G|\right]^{-1} \sum_{S \in \mathcal{S}_{n,d}} |G_S|\, (\hat\beta_S - \hat\beta)(\hat\beta_S - \hat\beta)^T.$$

A detailed study of these weighted jackknives is made in Shao and Wu (1987). The third weighted jackknife scheme, due to Liu and Singh (1992), came up as an example of a jackknife procedure that is in the E-class. It is defined for the $p \equiv 1$ case, and does not have any simple extension to higher-dimensional models. The variance estimate is defined as

$$V_{JLS} = \frac{L_n^2}{n^2(n-1)} \sum_{1 \le i \le n} \frac{(J_i - \hat\beta)^2}{x_i^2}, \qquad \text{where } L_n^2 = \sum_{1 \le i \le n} x_i^2.$$
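The delete-1 pseudovalue recipe above can be sketched directly for the $p = 1$ model $y_i = x_i\beta + e_i$. A minimal, unoptimised implementation (the function name is our illustrative choice):

```python
import numpy as np

def delete1_jackknife_var(x, y):
    """Delete-1 jackknife variance estimate of the least-squares slope in the
    p = 1 model y_i = x_i * beta + e_i, via the pseudovalues
        J_i = n*beta_hat - (n-1)*beta_hat_(-i),
        V_J = (n(n-1))^{-1} * sum_i (J_i - beta_hat)^2.
    Re-fits once per deleted point; a sketch, not an optimised routine."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    n = len(x)
    beta_hat = (x @ y) / (x @ x)
    V = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta_i = (x[keep] @ y[keep]) / (x[keep] @ x[keep])
        J_i = n * beta_hat - (n - 1) * beta_i
        V += (J_i - beta_hat) ** 2
    return V / (n * (n - 1))

# tiny worked case: x = (1, 1), y = (0, 2) gives beta_hat = 1,
# pseudovalues J = (0, 2), so V_J = ((0-1)^2 + (2-1)^2) / 2 = 1
v = delete1_jackknife_var([1.0, 1.0], [0.0, 2.0])
```

The weighted variants $V_{JH}$ and $V_{JLS}$ differ from this only in the factor multiplying each squared deviation.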
We present a different way of looking at the jackknife through the following theorem.

Theorem 3.1. In terms of resampling variance estimates of the least-squares estimate of the parameter in linear models, the jackknife resampling schemes are special cases of generalised bootstrap resampling schemes.

Proof. To establish this proposition, we have to show that the jackknives are driven by a scheme of random weights. Let $\mathcal{N}_n = \{1, 2, \ldots, n\}$ and $\mathcal{S}_{n,d} = \{$all subsets of size $d$ from $\mathcal{N}_n\}$. We will identify a typical element of $\mathcal{S}_{n,d}$ by $S = \{i_1, i_2, \ldots, i_d\}$. There are $\binom{n}{d}$ such elements in $\mathcal{S}_{n,d}$. $S^c$ denotes the set $\mathcal{N}_n \setminus S$. Let $\{\xi_S,\ S \in \mathcal{S}_{n,d}\}$ be a collection of vectors in $\mathbb{R}^n$ defined by $\xi_S = (\xi_S(1), \xi_S(2), \ldots, \xi_S(n))$, where

$$\xi_S(i) = \begin{cases} c_S & \text{if } i \in S^c, \\ 0 & \text{if } i \in S, \end{cases}$$

and $\{c_S,\ S \in \mathcal{S}_{n,d}\}$ is a set of constants. Suppose $\{p_S,\ S \in \mathcal{S}_{n,d}\}$ is a set of probabilities, that is, $p_S \ge 0$ for all $S \in \mathcal{S}_{n,d}$ and $\sum_{S \in \mathcal{S}_{n,d}} p_S = 1$. Consider the random vector $W_n = (w_{1:n}, w_{2:n}, \ldots, w_{n:n})$ which has the following probability law:

$$P[W_n = \xi_S] = p_S, \quad S \in \mathcal{S}_{n,d}. \qquad (3.1)$$

Observe that for $i \in \mathcal{N}_n$,

$$w_{i:n} = \begin{cases} c_S & \text{if } i \notin S, \\ 0 & \text{if } i \in S. \end{cases}$$

It can now easily be seen that if the $i$th data point $(x_i, y_i)$ is scaled by $\sqrt{w_{i:n}}$ and the least-squares estimate is calculated for the transformed data, the resulting estimate is $\hat\beta_S$. For the different jackknife variance estimates, we only have to choose the constants $c_S$ and $p_S$ appropriately, so that the mean-squared deviation of the weights from one is the same for every $i$, though possibly depending on $n$. This condition can be summarised as

$$\sum_{S:\, i \notin S} (c_S - 1)^2 p_S = \frac{1}{n} \sum_{i=1}^{n} \sum_{S:\, i \notin S} (c_S - 1)^2 p_S, \quad i = 1, 2, \ldots, n. \qquad (3.2)$$

One way of ensuring that (3.2) holds is to specify $p_S$, $S \in \mathcal{S}_{n,d}$, and then take $c_S = 1 + k\, p_S^{-1/2}$ for some constant $k > 0$. For the delete-d jackknife, we take $p_S = \binom{n}{d}^{-1}$, $S \in \mathcal{S}_{n,d}$, so that the expression in (3.2) does not depend on $i$.
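The balance condition (3.2) for the delete-d choice can be verified by brute-force enumeration on a small example. The sketch below uses the weights of Theorem 3.2 (constant $c_S = n/(n-d)$, uniform $p_S$); it is our illustrative check, not part of the proof.

```python
from itertools import combinations
from math import comb

# Check (3.2) for the delete-d choice p_S = C(n,d)^{-1} with constant off-S
# weight c_S = n/(n-d): the quantity sum_{S: i not in S} (c_S - 1)^2 p_S
# must be the same for every index i.
n, d = 6, 2
c = n / (n - d)
p_S = 1.0 / comb(n, d)

lhs = [sum((c - 1.0) ** 2 * p_S
           for S in combinations(range(n), d) if i not in S)
       for i in range(n)]

# by symmetry each entry equals (c-1)^2 * C(n-1,d)/C(n,d) = (c-1)^2 (n-d)/n
target = (c - 1.0) ** 2 * (n - d) / n
```

Since the summand $(c_S - 1)^2 p_S$ is the same for every $S$, the sum depends only on the number of sets avoiding $i$, which is $\binom{n-1}{d}$ for each $i$; this is exactly why the uniform choice satisfies (3.2).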
For the weighted jackknives, we have the following relations:

$$p_i \propto (1 - \delta_i)^2 \quad \text{for Hinkley (1977)}, \qquad (3.3)$$

$$p_S \propto |G_S| \quad \text{for Wu (1986)}, \qquad (3.4)$$

$$p_i \propto x_i^{-2} \quad \text{for Liu and Singh (1992)}. \qquad (3.5)$$

Note that for Hinkley's jackknife and Liu and Singh's jackknife the set $S$ can only be a singleton, so when $S = \{i\}$ we use the notation $c_i$ and $p_i$ in place of $c_S$ and $p_S$, respectively. Thus, we can see that with an appropriate choice of weights, the jackknife variance estimate is of the form (1.1). □

Incidentally, for (3.4) it can be seen using the Binet-Cauchy expansion that the sum

$$\sum_{S \in \mathcal{S}_{n,d}} \frac{d}{n-d} \left[\binom{n-p}{d-1}|G|\right]^{-1} |G_S|$$

equals $(n-d-p+1)/(n-d)$. Wu (1986) had pointed out that the extra factor of $(n-d-p+1)/(n-d)$ should be absorbed into the $\hat\beta_S - \hat\beta$ for exact standardisation. The entire discussion leading to the proposed jackknife estimator in Wu (1986) can now be motivated from a generalised bootstrap viewpoint.

Having established that the jackknife is a special case of generalised bootstrap, our next goal is to exploit the results known for different generalised bootstrap schemes for the jackknife. To this end we now relate the usual delete-d jackknives to UBS or DBS.

Theorem 3.2. (i) Suppose $\{d_n\}$ is a sequence of integers such that $d_n/n \to c \in (0, 1)$ and $d_n + p_n \le n$. Then the delete-$d_n$ jackknife scheme satisfies the conditions of UBS. (ii) Suppose $\{d_n\}$ is a sequence of integers such that $d_n/n \to 0$; in particular, $d_n$ can be constant at any positive integer. Then the delete-$d_n$ jackknife scheme satisfies the conditions of DBS.

Proof. We retain the notation and definitions of the proof of the previous theorem. Note that for delete-d jackknives we have

$$\xi_S(i) = \begin{cases} \dfrac{n}{n-d} & \text{if } i \in S^c, \\ 0 & \text{if } i \in S, \end{cases} \qquad \text{and} \qquad p_S = \binom{n}{d}^{-1}, \quad S \in \mathcal{S}_{n,d}.$$

Now, it can be verified by direct calculation that the conditions of UBS hold if $d/n \to c^* \in (0, 1)$, where $c^* < c$ of (2.4).
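The key identity behind Theorem 3.2, that scaling the data by the two-point weights reproduces the delete-d estimate exactly, can be checked numerically for $p = 1$ (the data and seed below are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 12, 3
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

# pick a deletion set S and form the two-point weights of Theorem 3.2:
# w_i = n/(n-d) off S, w_i = 0 on S
S = rng.choice(n, size=d, replace=False)
w = np.full(n, n / (n - d))
w[S] = 0.0

# least squares after scaling each data point by sqrt(w_i) ...
beta_w = ((w * x) @ y) / ((w * x) @ x)

# ... coincides with the delete-d estimate obtained by dropping S outright,
# since the constant factor n/(n-d) cancels in the ratio
mask = np.ones(n, bool)
mask[S] = False
beta_S = (x[mask] @ y[mask]) / (x[mask] @ x[mask])
```

The constant $n/(n-d)$ is irrelevant to the point estimate but fixes the mean of the weights at one, which is what conditions (2.1)/(2.10) require.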
Note that $c$ can be any fraction less than one, so the extra condition on $c^*$ can easily be satisfied by taking $c$ large enough. Similarly, direct calculations also show that the conditions of DBS are satisfied if $d/n \to 0$. Also, observe that using the weights $\{w_{1:n}, w_{2:n}, \ldots, w_{n:n}\}$ on the data points we get the required jackknife estimates. This completes the proof. □

Remark 1. For the weighted jackknives, a detailed calculation of the higher-order cross moments, as is necessary for identification of a procedure as UBS or DBS, seems intractable. However, the other conditions are easily verified, and this leads us to conjecture that the weighted jackknives are also UBS or DBS depending on the number of elements of $S$, possibly under some weak conditions on the model.

Remark 2. More importantly, observe that in the case $p \equiv 1$, if $p_i$ is proportional to $x_i^{-2}$ we get an E-class technique, whereas generally we get an R-class technique. Liu and Singh (1992) remarked that an adaptive resampling scheme can probably be designed to fit in the E-class in the homoskedastic regression case and in the R-class in the heteroskedastic case. The weighted jackknife generalised bootstrap is thus seen to be such an adaptive estimate.

The second-order properties of the jackknife variance estimator can also be studied using known results about UBS and DBS schemes. For the present discussion, we restrict our attention to the $p = 1$ case. From Bose and Chatterjee (1997), we know that the different resampling schemes can be separated based on the second term in the variance expansion. Accordingly, the most efficient of these resampling schemes can also be found, and we call such a resampling scheme a second-order efficient scheme.

Remark 3. Second-order results for jackknife variance estimators. For the delete-d jackknife, we have from Theorem 2.2

$$R_n = 3\alpha^8 \xi_0 \gamma_2\, \frac{d}{n-d} + 2\alpha^6 \xi_1 \Big(1 - \frac{d}{n-d}\Big) - \alpha^4 \xi_0 \Big(1 - \frac{2d}{n-d}\Big). \qquad (3.6)$$

If $d/n \to 0$, the above expression reduces to $R_n = 2\alpha^6 \xi_1 - \alpha^4 \xi_0$, which is the same as that of the delete-1 jackknife; see Bose and Chatterjee (1997), Theorem 2(i). This is as expected, since we have already shown that the delete-1 jackknife is an example of a DBS scheme.

For the jackknife schemes that satisfy the UBS conditions, if $\lim_{n\to\infty} d(n-d)^{-1} = c$ with $c \in (0, 1)$, then

$$c = \frac{\xi_0 (2 - \alpha^4 \gamma_2)}{3\alpha^4 \xi_0 \gamma_2 - 2\alpha^2 \xi_1 + 4\xi_0} \qquad (3.7)$$

gives a second-order efficient jackknife. This is entirely in keeping with what was observed and remarked in Wu (1990): it seems desirable to let both $d \to \infty$ and $n - d \to \infty$, but the optimal choice for the limit of $d/(n-d)$ depends on the problem at hand. If we make the natural assumptions that $\xi_0 = \xi_1$ and $\alpha = \gamma_2 = 1$, then (3.7) gives $c = 1/5$. This leads us to suggest that delete-$[n/5]$ is a generally good jackknife scheme.

Thus, in the context of linear models, we have been able to identify a unified framework, the generalised bootstrap set-up, of which all resampling schemes are special cases. The random weights may be attached either to residuals or to the observations themselves, resulting in two separate classes of resampling procedures. Among the latter class of resampling techniques, we have two separate procedures, the UBS and the DBS. We have established that the jackknives are generalised bootstrap schemes, and that they are DBS or UBS generalised bootstrap schemes depending on whether $d/n$ goes to zero or to a positive fraction in the limit. It is known that the delete-1 jackknife, which is a DBS scheme, yields inconsistent answers for some problems, like estimating the variance of the sample median, whereas if $d \to \infty$, the delete-d jackknife is consistent. It may be conjectured that the DBS would not be able to provide correct answers in problems like minimisation of the L1 norm, but the UBS would give consistent results.
One remarkable advantage of the UBS or DBS resampling schemes is that i.i.d. random variables satisfy the conditions on the weights, subject to some lower-order moment restrictions. This makes the computation feasible and efficient, since resampling involves a good deal of Monte Carlo computation. For example, for any $d$, we can approximate the delete-d jackknife variance estimate by using i.i.d. weights supported on a compact set inside the positive half of the real line and having unit mean and variance $d/(n-d)$. The accuracy of the approximation can be improved to any degree by imposing restrictions only on the lower-order moments of the weights. These results compare favourably with those of Shao (1987) and Shao (1989).

Appendix

Proof of Theorem 2.1, case $p \equiv 1$. We use the notation $W_i = (w_i - 1)/\sigma_{Bn}$ throughout this proof. Expanding the expression for $V_{DBS}$, we get

$$V_{DBS} = \frac{1}{\sigma^2_{Bn}}\, E_B (\hat\beta_B - \hat\beta)^2 = \frac{1}{\sigma^2_{Bn}}\, E_B \left( \frac{\sum_{1 \le i \le n} w_i x_i e_i}{\sum_{1 \le i \le n} w_i x_i^2} - \frac{\sum_{1 \le i \le n} x_i e_i}{\sum_{1 \le i \le n} x_i^2} \right)^2 = \frac{1}{\sigma^2_{Bn}}\, E_B [C_n + D_n]^2,$$

where, with $L_n^2 = \sum_{1 \le i \le n} x_i^2$,

$$C_n = \frac{\sum_{1 \le i \le n} w_i x_i e_i}{\sum_{1 \le i \le n} x_i^2} - \frac{\sum_{1 \le i \le n} x_i e_i}{\sum_{1 \le i \le n} x_i^2} = \frac{\sum_{1 \le i \le n} (w_i - 1) x_i e_i}{L_n^2}$$

and

$$D_n = \sum_{1 \le i \le n} w_i x_i e_i \left( \frac{1}{\sum_{1 \le i \le n} w_i x_i^2} - \frac{1}{\sum_{1 \le i \le n} x_i^2} \right).$$

Since $E_B (w_i - 1)(w_j - 1)$ equals $\sigma^2_{Bn}$ if $i = j$ and $c_{11}\sigma^2_{Bn}$ if $i \ne j$, we have

$$\frac{1}{\sigma^2_{Bn}}\, E_B C_n^2 = \frac{1}{L_n^4} \Big( \sum_{1 \le i \le n} x_i^2 e_i^2 + c_{11} \sum_{i \ne j} x_i x_j e_i e_j \Big) = \frac{1}{L_n^4} \sum_{1 \le i \le n} x_i^2 e_i^2 + O_P(n^{-2}). \qquad (4.1)$$

Note that $D_n = T_{1n} + T_{2n}$, where

$$T_{1n} = \sum_{1 \le i \le n} (w_i - 1) x_i e_i \left( \frac{1}{\sum_{1 \le i \le n} w_i x_i^2} - \frac{1}{\sum_{1 \le i \le n} x_i^2} \right)$$

and

$$T_{2n} = \sum_{1 \le i \le n} x_i e_i \left( \frac{1}{\sum_{1 \le i \le n} w_i x_i^2} - \frac{1}{\sum_{1 \le i \le n} x_i^2} \right).$$

It remains to show that $E_B T_{1n}^2$, $E_B T_{2n}^2$ and $E_B C_n D_n$, divided by $\sigma^2_{Bn}$, are $O_P(n^{-2})$. Observe that

$$\frac{1}{\sum_{1 \le i \le n} w_i x_i^2} - \frac{1}{\sum_{1 \le i \le n} x_i^2} = -\frac{\sum_{1 \le i \le n} (w_i - 1) x_i^2}{L_n^2 \sum_{1 \le i \le n} w_i x_i^2}.$$
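The i.i.d.-weight approximation described above can be sketched concretely. Below, uniform weights are used purely as an illustrative choice of a compactly supported positive law with the required mean and variance; the paper only constrains the lower-order moments, and the function name is ours.

```python
import numpy as np

def iid_weight_var(x, y, d, n_boot=3000, seed=3):
    """Approximate the delete-d jackknife variance estimate (p = 1) with
    i.i.d. weights: uniform on [1 - a, 1 + a] with a = sqrt(3*d/(n-d)),
    so the weights sit in a compact subset of the positive half-line with
    mean 1 and variance sigma_Bn^2 = d/(n-d).  The uniform law is our
    illustrative choice; it requires d/(n-d) < 1/3 to keep weights positive."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    n = len(x)
    s2 = d / (n - d)
    a = np.sqrt(3.0 * s2)
    assert a < 1.0, "need d/(n-d) < 1/3 so the weights stay positive"
    beta_hat = (x @ y) / (x @ x)
    acc = 0.0
    for _ in range(n_boot):
        w = rng.uniform(1.0 - a, 1.0 + a, size=n)
        beta_B = ((w * x) @ y) / ((w * x) @ x)
        acc += (beta_B - beta_hat) ** 2
    return acc / (n_boot * s2)     # the generalised bootstrap estimate (1.1)

rng0 = np.random.default_rng(4)
n = 100
x0 = rng0.normal(size=n)
y0 = x0 + rng0.normal(size=n)
v = iid_weight_var(x0, y0, d=n // 5)   # the recommended delete-[n/5] choice
```

For $d = [n/5]$ we have $d/(n-d) = 1/4 < 1/3$, so the positivity constraint of this particular uniform construction is comfortably met.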
Combining this identity with the weight conditions (2.10)-(2.18) and the model conditions (1.3)-(1.7), direct moment calculations give

$$\frac{1}{\sigma^2_{Bn}}\, E_B T_{1n}^2 = O_P(n^{-2}\sigma^2_{Bn}), \qquad (4.2)$$

$$\frac{1}{\sigma^2_{Bn}}\, E_B T_{2n}^2 = O_P(n^{-2}), \qquad (4.3)$$

$$\frac{1}{\sigma^2_{Bn}}\, E_B T_{2n} C_n = O_P(n^{-2}), \qquad (4.4)$$

$$\frac{1}{\sigma^2_{Bn}}\, E_B T_{1n} C_n = O_P(n^{-2}), \qquad (4.5)$$

and the remaining cross term $E_B T_{1n} T_{2n}$ is controlled by the Cauchy-Schwarz inequality together with (4.2) and (4.3). Thus we have

$$V_{DBS} = \frac{1}{L_n^4} \sum_{1 \le i \le n} x_i^2 e_i^2 + O_P(n^{-2}),$$

as required. The proof for the general $p$ case follows similar lines; see Chatterjee (1997) for the details of the arguments involved. □

Acknowledgements

The author would like to thank Professor Arup Bose for his invaluable help and guidance in this research.

References

Barbe, P., Bertail, P., 1995. The Weighted Bootstrap. Lecture Notes in Statistics, vol. 98. Springer, New York.
Bera, A.K., Higgins, M.L., 1993. A survey of ARCH models: properties, estimation and testing. J. Economic Surveys 7, 305-366.
Bickel, P.J., Freedman, D.A., 1983. Bootstrapping regression models with many parameters. In: Bickel, P., Doksum, K., Hodges Jr., J. (Eds.), A Festschrift for Erich L. Lehmann. Wadsworth, Belmont, CA, pp. 28-48.
Bose, A., Chatterjee, S., 1997. Second order comparison of different resampling schemes in linear regression. Tech. Report No. 15/97, Stat-Math Unit, Indian Statistical Institute, Calcutta.
Boos, D., Monahan, J.F., 1986.
Bootstrap methods using prior information. Biometrika 73, 77-83.
Chatterjee, S., 1997. Generalized bootstrap in regression models with many parameters. Tech. Report No. 2/97, Stat-Math Unit, Indian Statistical Institute, Calcutta.
Efron, B., 1979. Bootstrap methods: another look at the jackknife. Ann. Statist. 7, 1-26.
Efron, B., 1982. The Jackknife, the Bootstrap and Other Resampling Plans. SIAM, Philadelphia.
Freedman, D.A., 1981. Bootstrapping regression models. Ann. Statist. 9, 1218-1228.
Härdle, W., Marron, J.S., 1991. Bootstrap simultaneous error bars for nonparametric regression. Ann. Statist. 19, 778-796.
Hinkley, D.V., 1977. Jackknifing in unbalanced situations. Technometrics 19, 285-292.
Liu, R.Y., 1988. Bootstrap procedures under some non-i.i.d. models. Ann. Statist. 16, 1696-1708.
Liu, R.Y., Singh, K., 1992. Efficiency and robustness in resampling. Ann. Statist. 20, 370-384.
Lo, A.Y., 1987. A large sample study of the Bayesian bootstrap. Ann. Statist. 15, 360-375.
Lo, A.Y., 1991. Bayesian bootstrap clones and a biometry function. Sankhya Ser. A 53, 320-333.
Mammen, E., 1989. Asymptotics with increasing dimension for robust regression with applications to the bootstrap. Ann. Statist. 17, 382-400.
Mammen, E., 1992. When Does Bootstrap Work? Asymptotic Results and Simulations. Lecture Notes in Statistics, vol. 77. Springer, New York.
Mammen, E., 1993. Bootstrap and wild bootstrap for high dimensional linear models. Ann. Statist. 21, 255-285.
Portnoy, S., 1984. Asymptotic behaviour of M-estimators of p regression parameters when p²/n is large, I: consistency. Ann. Statist. 12, 1298-1309.
Portnoy, S., 1985. Asymptotic behaviour of M-estimators of p regression parameters when p²/n is large, II: normal approximation. Ann. Statist. 13, 1403-1417.
Portnoy, S., 1988. Asymptotic behaviour of the empirical distribution of M-estimator residuals from a regression model with many parameters. Ann. Statist. 14, 1152-1170.
Praestgaard, J., Wellner, J.A., 1993.
Exchangeably weighted bootstrap of the general empirical process. Ann. Probab. 21, 2053-2086.
Quenouille, M., 1956. Notes on bias in estimation. Biometrika 43, 353-360.
Rubin, D., 1981. The Bayesian bootstrap. Ann. Statist. 9, 130-134.
Shao, J., 1987. Sampling and resampling: an efficient approximation to jackknife variance estimators in linear models. Chinese J. Appl. Prob. Statist. 3, 368-379.
Shao, J., 1988. Consistency of the jackknife estimators of the variance of sample quantiles. Comm. Statist. A 17, 3017-3028.
Shao, J., 1989. The efficiency and consistency of approximations to the jackknife variance estimators. J. Amer. Statist. Assoc. 84, 114-119.
Shao, J., Tu, D., 1995. The Jackknife and the Bootstrap. Springer, New York.
Shao, J., Wu, C.F.J., 1987. Heteroscedasticity-robustness of jackknife variance estimators in linear models. Ann. Statist. 15, 1563-1579.
Shao, J., Wu, C.F.J., 1989. A general theory for jackknife variance estimation. Ann. Statist. 17, 1176-1197.
Weng, C.S., 1989. On a second-order asymptotic property of the Bayesian bootstrap mean. Ann. Statist. 17, 705-710.
Wu, C.F.J., 1986. Jackknife, bootstrap and other resampling methods in regression analysis (with discussion). Ann. Statist. 14, 1261-1295.
Wu, C.F.J., 1990. On the asymptotic properties of the jackknife histogram. Ann. Statist. 18, 1438-1452.
Zheng, Tu, 1988. Random weighting method in regression models. Sci. Sinica Ser. A 31, 1442-1459.