Statistics & Probability Letters 40 (1998) 307-319
Another look at the jackknife: further examples of
generalized bootstrap
Snigdhansu Chatterjee *
Theoretical Statistics and Mathematics Unit, Indian Statistical Institute, 203 B. T. Road, Calcutta 700 035, India
Received January 1998; received in revised form May 1998
Abstract
In this paper we have three main results. (a) We show that all jackknife schemes are special cases of generalised bootstrap. (b) We introduce a new generalised bootstrap technique, called DBS, to estimate the mean-squared error of the least-squares estimate in linear models where the number of parameters tends to infinity with the number of data points and the error terms are uncorrelated with possibly different variances. Properties of this new resampling scheme are comparable to those of the UBS introduced by Chatterjee (1997, Tech. Report No. 2/97, Calcutta). (c) We show that delete-d jackknife schemes belong to DBS or UBS depending on the limit of d/n. We also study the second-order properties of jackknife variance estimates of the least-squares parameter estimate in regression. © 1998 Elsevier Science B.V. All rights reserved
Keywords: Bootstrap; Jackknife; Least squares; Many parameter regression; Second-order efficiency
1. Introduction
The bootstrap method was introduced in Efron (1979) to better understand the jackknife, and is a general technique to estimate the distribution of statistical functionals. The naive bootstrap scheme is to sample with replacement from the data, calculate the statistic of interest for the resample, and repeat this process over all possible resamples. Suppose (w_1, w_2, ..., w_n) is a random sample from Multinomial(n; 1/n, 1/n, ..., 1/n). The naive bootstrap can then be viewed as attaching the random weight w_i to the ith data point, calculating the statistic of interest for the new data, and finally integrating out the extraneous randomisation induced by the w_i's. Bootstrap with random weights is often referred to as generalised bootstrap. The bootstrap is not a generalisation of the jackknife, in the sense that the jackknife cannot be seen as a special case of the naive bootstrap. However, generalised bootstrap is a direct generalisation of the naive bootstrap, so it is of interest to know how generalised bootstrap relates to the jackknife. We show in this paper that all known jackknife procedures are special cases of generalised bootstrap.
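As a concrete illustration of this weighted view (a minimal sketch with arbitrary data and an arbitrary replicate count, not taken from the paper), the naive bootstrap of the sample mean can be carried out either by resampling with replacement or, equivalently, by drawing Multinomial(n; 1/n, ..., 1/n) weights:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)          # arbitrary data
n = len(x)
B = 2000                          # number of bootstrap replicates

means_resample, means_weighted = [], []
for _ in range(B):
    # naive bootstrap: sample indices with replacement
    idx = rng.integers(0, n, size=n)
    means_resample.append(x[idx].mean())
    # the same scheme as random weights: w ~ Multinomial(n; 1/n, ..., 1/n)
    w = rng.multinomial(n, np.full(n, 1.0 / n))
    means_weighted.append(np.sum(w * x) / n)

# the two Monte Carlo variance estimates agree up to simulation noise
print(np.var(means_resample), np.var(means_weighted))
```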
We look at the problem of estimating the variance of the least-squares estimate in linear regression models where the number of parameters can tend to infinity with the number of data points. In linear regression,
* E-mail: [email protected].
the random weights can be attached either to the pairs {(y_i, x_i), i = 1, ..., n} or to the residuals r_i = y_i − x_i^T β̂. We have the residual bootstrap when Multinomial(n; 1/n, ..., 1/n) weights are attached to the centred residuals r_i − r̄, and the paired bootstrap when square roots of such weights are attached to {(y_i, x_i), i = 1, ..., n}. Accordingly, there can be two varieties of generalised bootstrap in linear regression. The external bootstrap of Wu (1986), the wild bootstrap of Mammen (1989) and the weighted bootstrap of Liu (1988) are examples of generalised bootstrap where the generalisation is obtained by attaching random weights to residuals. The generalised paired bootstrap is carried out by selecting random weights {w_{i:n}; i = 1, ..., n} and then transforming the ith data point (x_i, y_i) to (√w_{i:n} x_i, √w_{i:n} y_i).
Let β̂ be the least-squares estimate of the parameter β using the original data, and β̂_B the one using the transformed data. Assume that the weights satisfy E_B(w_{i:n} − 1)² = σ²_{Bn}, i = 1, ..., n, where σ²_{Bn} may depend on n. The generalised bootstrap variance estimate is defined to be

V_B = σ_{Bn}^{-2} E_B (β̂_B − β̂)(β̂_B − β̂)^T.    (1.1)
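A minimal Monte Carlo sketch of (1.1) for the generalised paired bootstrap; Multinomial weights are used here, so the scheme reduces to the ordinary paired bootstrap, and the simulated design, error pattern and replicate count are arbitrary illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, B = 100, 3, 2000
X = rng.normal(size=(n, p))
beta = np.arange(1.0, p + 1.0)
y = X @ beta + rng.normal(scale=0.5 + np.abs(X[:, 0]), size=n)   # heteroskedastic errors

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

sigma2_Bn = 1.0 - 1.0 / n            # E(w_i - 1)^2 for Multinomial(n; 1/n, ..., 1/n) weights
acc = np.zeros((p, p))
for _ in range(B):
    w = rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)
    sw = np.sqrt(w)
    Xb, yb = sw[:, None] * X, sw * y                  # scale (x_i, y_i) by sqrt(w_i)
    beta_B = np.linalg.solve(Xb.T @ Xb, Xb.T @ yb)
    diff = beta_B - beta_hat
    acc += np.outer(diff, diff)

V_B = acc / (B * sigma2_Bn)          # Monte Carlo version of (1.1)
print(np.diag(V_B))
```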
The uncorrelated weights bootstrap (UBS), of which the paired bootstrap is a special case, was introduced in Chatterjee (1997), where the consistency of its variance estimate of the least-squares estimate was also established. In this paper we introduce a variant of the UBS, which we call the 'degenerate weights bootstrap' (hereafter DBS). We show that the DBS has the same properties as the UBS; only the conditions on the weights are different. We show that all the known jackknife schemes belong either to UBS or to DBS. In the delete-d jackknife scheme, if d/n → c ∈ (0, 1) as n → ∞, then the scheme belongs to UBS; otherwise it belongs to DBS.
In Bose and Chatterjee (1997) higher-order properties of resampling variance estimates of the least-squares estimate in linear regression were studied. We extend those results to the delete-d jackknife, exploiting its identification as a UBS or DBS scheme. Based on a consideration of second-order efficiency and likely model conditions, we recommend the delete-[n/5] jackknife scheme.
We know that the delete-1 jackknife fails to give a consistent variance estimator for non-smooth estimators like sample quantiles, unlike the delete-d jackknife with d(n − d)^{-1} → c ∈ (0, 1). Based on the fact that the delete-1 jackknife scheme belongs to DBS while with d = O(n) the delete-d jackknife belongs to UBS, we conjecture that results for UBS can be carried over to non-smooth estimators, but results for DBS will not carry over to such situations.
The idea of bootstrapping with random weights probably first appeared in Rubin (1981). Bootstrapping with exchangeable weights has been treated in Efron (1982), Lo (1987), Weng (1989), Zheng and Tu (1988) and Praestgaard and Wellner (1993). Other generalised bootstrap methods may be found in Boos and Monahan (1986), Lo (1991), Härdle and Marron (1991) and Mammen (1992, 1993). A review can be found in Barbe and Bertail (1995). Bootstrap schemes for linear models have been discussed in Efron (1979), Freedman (1981), and Bickel and Freedman (1983). Hinkley (1977), Wu (1986) and Shao and Wu (1987) have studied consistency of the bootstrap and the jackknife in heteroskedastic linear models. Bootstrap in regression models with many parameters has been considered by Bickel and Freedman (1983) and Mammen (1992), who showed the consistency of bootstrapping residuals and of the wild bootstrap, respectively, for the least-squares estimate of the regression parameters. Liu and Singh (1992) compared the well-known bootstrap and jackknife schemes in linear models. They showed that, for estimating the variance of the least-squares estimate of β, some resampling schemes such as the paired bootstrap and the wild bootstrap produce consistent results under heteroskedasticity, while some others such as the usual residual bootstrap do not yield consistent estimates under heteroskedasticity but are more efficient under homoskedasticity. These resampling techniques are thus either robust or efficient, and accordingly they were classified as belonging to the R-class or the E-class. We consider a large class of models that includes the heteroskedastic linear model and the many-parameter regression model as special cases. In our set-up, we find a representation of the DBS variance estimator, from which it follows that this generalised bootstrap estimator belongs to the R-class. Analogous results for the UBS scheme have already been established in Chatterjee (1997). This shows that all the usual delete-d jackknife schemes belong to the R-class.
There are also some weighted jackknife schemes, namely those suggested by Hinkley (1977), Wu (1986) and Liu and Singh (1992). The first two are modifications of the usual jackknives, adapted to suit regression data of an unbalanced nature. Liu and Singh's weighted jackknife came up as an example of an E-class resampling technique. The asymptotic properties of these weighted jackknives are comparable to those of the usual delete-d jackknives, and their usefulness lies mainly in their small-sample performance. We show how to interpret these weighted jackknives as generalised bootstrap schemes.
The models that we consider consist of a linear signal term, where the number of parameters can tend to infinity with the number of data points, and the errors are assumed to be uncorrelated but not necessarily independent, and they can have different variances. We describe the details of the model below.
Let {η_n} be a fixed sequence of real numbers, not identically zero. Also, let {p_n} be a sequence of non-decreasing positive integers satisfying p_n ≤ n.
For any n ≥ 1, by β_n we mean the vector of length p_n which consists of (η_1, η_2, ..., η_{p_n}). Also, for each i ≥ 1, x_i^* = (x_i^*(j), j = 1, ..., p_i) is an observed random vector of length p_i. For each n ≥ 1, the vectors x_{i:n} = (x_{i:n}(j), j = 1, ..., p_n), 1 ≤ i ≤ n, are vectors of dimension p_n obtained by augmenting zeroes to x_i^*, i.e.

x_{i:n}(j) = x_i^*(j)  if 1 ≤ j ≤ p_i,   and   x_{i:n}(j) = 0  if p_i < j ≤ p_n.

Since {p_n} is non-decreasing, the above expression is well defined. We have a sequence of models {M_n; n ≥ 1}, where M_n is the model

y_i = x_{i:n}^T β_n + e_i,   1 ≤ i ≤ n.

Note that the above model is nested, in the sense that β_n consists of the first p_n components of β_m for any m > n. We drop the suffix n from both p_n and β_n, and henceforth denote x_{i:n} by x_i.
Let X denote the (n × p)-dimensional matrix whose ith row is x_i^T. Let X^T denote the transpose of the matrix X. Also, let y and e be the n-dimensional vectors whose ith entries are y_i and e_i, respectively. Then M_n may be written as

y = Xβ + e.    (1.2)
We make the following assumption on the nature of the design matrix. For any m > p, if J_m = {i_1, i_2, ..., i_m} is a subset of {1, 2, ..., n}, and if X^* is the (m × p)-dimensional matrix whose jth row is x_{i_j}^T, then

X^{*T} X^* > m k(p) I    (1.3)

for some k(p) > 0, where I is the identity matrix. This is only slightly stronger than the condition that no p of the x_i's lie on a (p − 1)- or lower-dimensional hyperplane.
The other model assumptions are

x_i^T x_i < K(p) < ∞,   ∀ i = 1, 2, ..., n,    (1.4)

E ( Σ_{1≤i≤n} e_i )² = O(n),    (1.5)

E e_i e_j = 0,   i ≠ j,    (1.6)

Σ_{1≤i≤n} e_i^8 = O_p(n).    (1.7)
Note that (1.3) and (1.4) are very general conditions where the bounds are allowed to depend on the dimension of the design matrix. One consequence of these conditions is

0 < k(p) < d_i/n < K(p) < ∞,    (1.8)

where d_i, 1 ≤ i ≤ p, are the eigenvalues of X^T X.
It may be noted that in the usual linear regression model with heteroskedastic errors, the error variables are assumed to be independent and to satisfy

E[e_i | x_i] = 0,   Var[e_i | x_i] = τ_i².

Liu and Singh (1992) had further assumed that τ_i² ≤ c_0 for all i, for a positive constant c_0. This implies conditions (1.5) and (1.6).
Bickel and Freedman (1983) have shown the consistency of the residual bootstrap distribution of the parameter estimate if p²/n → 0. Such models have also been discussed in Portnoy (1984, 1985, 1988). In all these papers, the errors are assumed to be i.i.d. random variables with zero expectation and finite second moment. Since we allow the number of parameters to increase with the size of the data, this model also incorporates some dynamic linear models with long memory. One important special case is the ARCH model widely used in econometrics, and its variants. A detailed review of the ARCH model and its variants can be found in Bera and Higgins (1993).
The least-squares estimate of the parameter β is given by

β̂ = (X^T X)^{-1} X^T y = β + (X^T X)^{-1} X^T e.

From our model assumptions, (X^T X)^{-1} always exists. The quantity we wish to estimate is V_n = E(β̂ − β)(β̂ − β)^T. We denote the expectation of e e^T by Σ_n, so that

(Σ_n)_{ij} = τ_i²  if i = j,   and   (Σ_n)_{ij} = 0  if i ≠ j.
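To fix ideas, here is a minimal sketch (with an arbitrary simulated design and error-variance pattern, not taken from the paper) of the target quantity: since β̂ − β = (X^T X)^{-1} X^T e and Σ_n = E(e e^T), the quantity V_n = E(β̂ − β)(β̂ − β)^T equals the sandwich form (X^T X)^{-1} X^T Σ_n X (X^T X)^{-1}:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 200, 4
X = rng.normal(size=(n, p))
tau2 = 0.5 + X[:, 0] ** 2            # heteroskedastic error variances tau_i^2 (arbitrary choice)
Sigma_n = np.diag(tau2)

XtX_inv = np.linalg.inv(X.T @ X)
V_n = XtX_inv @ X.T @ Sigma_n @ X @ XtX_inv   # target of the resampling variance estimates

# one draw from the model, and its least-squares estimate
beta = np.ones(p)
e = rng.normal(scale=np.sqrt(tau2))
y = X @ beta + e
beta_hat = XtX_inv @ (X.T @ y)
print(np.diag(V_n), beta_hat)
```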
2. The DBS resampling scheme
Let {w_{i:n}; 1 ≤ i ≤ n, n ≥ 1} be a triangular array of non-negative random variables. We will drop the suffix n from the notation of the weights. The generalised bootstrap resampling scheme is carried out by weighting each data point (y_i, x_i) with the random weight √w_i, then computing the statistic of interest, and finally taking expectation over the random weight vector. The above set-up can be taken as a direct generalisation of the paired bootstrap, where the {w_i; 1 ≤ i ≤ n} are given by a random sample from Multinomial(n; 1/n, ..., 1/n). Let

V(w_i) = σ²_{Bn},   Cov(w_i, w_j) = c_n σ²_{Bn},   i ≠ j,  1 ≤ i, j ≤ n.
We present the conditions for the UBS scheme first, and then discuss the conditions for the DBS scheme. The UBS scheme is discussed in detail in Chatterjee (1997).
The letters k and K, with or without suffix, are used as generic constants. Different indices a, b, c, ... indicate that the relation holds for all possible choices of a ≠ b ≠ c ≠ ... . We use the notation

c_{ijk...} = E_B [ ((w_a − 1)/σ_{Bn})^i ((w_b − 1)/σ_{Bn})^j ((w_c − 1)/σ_{Bn})^k ⋯ ].
In the following relations, i_1, i_2, ... denote positive integers.

E(w_i) = 1,    (2.1)

σ²_{Bn} ≥ k > 0,    (2.2)

kn ≤ Σ_{1≤i≤n} w_i ≤ Kn,   K > k > 0,    (2.3)

P_B[ w_i > k > 0 for at least [cn] of the i's, c ∈ (1/3, 1) ] = 1 − O(n^{-2}),    (2.4)

c_{11} = O(n^{-1}),    (2.5)

c_{i_1 i_2 ... i_k} = O(n^{-k+1})   for all i_1, i_2, ..., i_k satisfying Σ_{j=1}^{k} i_j = 3,    (2.6)

c_{i_1 i_2 ... i_k} = O(n^{-k+2})   for all i_1, i_2, ..., i_k satisfying Σ_{j=1}^{k} i_j = 4,    (2.7)

c_{i_1 i_2 ... i_k} = O(n^{-k+3})   for all i_1, i_2, ..., i_k satisfying Σ_{j=1}^{k} i_j = 6,    (2.8)

c_{i_1 i_2 ... i_k} = O(n^{-k+4})   for all i_1, i_2, ..., i_k satisfying Σ_{j=1}^{k} i_j = 8.    (2.9)

The above conditions are satisfied if {w_i} follows Multinomial(n; 1/n, ..., 1/n).
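A quick numerical sanity check of the multinomial case (an illustrative sketch, not from the paper): these weights have E(w_i) = 1 and σ²_{Bn} = 1 − 1/n, and since Cov(w_a, w_b) = −1/n, the standardised cross moment is c_{11} = −1/(n − 1) = O(n^{-1}):

```python
import numpy as np

rng = np.random.default_rng(3)
n, B = 50, 200000
W = rng.multinomial(n, np.full(n, 1.0 / n), size=B).astype(float)

sigma2_Bn = 1.0 - 1.0 / n
c11_mc = np.mean((W[:, 0] - 1.0) * (W[:, 1] - 1.0)) / sigma2_Bn

print(W.mean(axis=0)[:3])        # each E(w_i) is 1, condition (2.1)
print(c11_mc, -1.0 / (n - 1))    # Monte Carlo c_11 versus the exact value -1/(n-1) = O(1/n)
```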
We define the bootstrap estimate of β to be

β̂_B = (X^T W X)^{-1} X^T W y   if at least p of the w_i's are non-zero,
β̂_B = β̂   otherwise,

where W denotes the diagonal matrix of the weights, and the bootstrap variance estimate to be

V_B = σ_{Bn}^{-2} E_B (β̂_B − β̂)(β̂_B − β̂)^T.

If the y_i's are i.i.d., this variance estimate coincides with the generalised bootstrap variance estimate used for estimating the variance of the sample mean. See Barbe and Bertail (1995) for the use of this statistic for other statistical functionals.
The notable feature of this resampling scheme is that the weights {w_{i:n}} are asymptotically uncorrelated. So we will call this the uncorrelated weights bootstrap (hereafter UBS).
The DBS, or 'degenerate weights bootstrap', is a variant of the UBS in which condition (2.2) is dropped and some of the cross-moment conditions are relaxed. The precise conditions are stated below.
E(w_i) = 1,    (2.10)

σ²_{Bn} → 0 as n → ∞,    (2.11)

kn ≤ Σ_{1≤i≤n} w_i ≤ Kn,   K > k > 0,    (2.12)

P_B[ w_i > k > 0 for at least [cn] of the i's, c ∈ (1/3, 1) ] = 1 − O(n^{-2}),    (2.13)

c_{11} = O(n^{-1}),    (2.14)

c_{i_1 i_2 ... i_k} = O(n^{-k+1} σ_{Bn}^{-1})   for all i_1, i_2, ..., i_k satisfying Σ_{j=1}^{k} i_j = 3,    (2.15)

c_{i_1 i_2 ... i_k} = O(n^{-k+2})   for all i_1, i_2, ..., i_k satisfying Σ_{j=1}^{k} i_j = 4,    (2.16)

c_{i_1 i_2 ... i_k} = O(n^{-k+3} σ_{Bn}^{-2})   for all i_1, i_2, ..., i_k satisfying Σ_{j=1}^{k} i_j = 6,    (2.17)

c_{i_1 i_2 ... i_k} = O(n^{-k+4} σ_{Bn}^{-4})   for all i_1, i_2, ..., i_k satisfying Σ_{j=1}^{k} i_j = 8.    (2.18)
As can be seen from condition (2.11), the variance of the weights goes to 0 as n → ∞, and hence the weights are asymptotically degenerate. However, the crucial condition is still (2.14), which makes the degenerate weights bootstrap (DBS) a variant of the UBS. The expression for the bootstrap estimate of the variance of the least-squares estimate is as before. We denote this bootstrap variance estimate by V_DBS. We have a parallel to Theorem 3.1 of Chatterjee (1997), which we state below.
Let

(H)_{ij} = e_i²  if i = j,   and   (H)_{ij} = 0  if i ≠ j.
Theorem 2.1. In resampling with weights satisfying the conditions for DBS, the following expansion holds for the mean-squared error term:

V_DBS − V_n = (X^T X)^{-1} X^T [H − Σ_n] X (X^T X)^{-1} + O_p(n^{-2} p^6 g(p)²),

where g(p) = K(p)^{-2} [K(p)/k(p)]^{2p} [p/(p − 1)]^{p−1}.
In particular, DBS is a consistent resampling technique for the heteroskedastic linear model and also for the many-parameter regression model whenever n^{-1} p^3 g(p) → 0.
Note that the expressions in the above expansion involve (p × p) matrices, where p → ∞. The interpretation is as follows: a matrix A_n = ((a_{n,ij})) is called O_p(b_n) if

sup_{1≤i≤p, 1≤j≤p} |a_{n,ij}| = O_p(b_n).

The proof of this theorem follows arguments identical to those presented in Chatterjee (1997). For the sake of completeness, we give a proof for the case p_n ≡ 1 in the appendix. Bose and Chatterjee (1997) derived higher-order terms in the variance expansion in the case p = 1 for different resampling schemes. We now present a similar result for the DBS scheme.
Write L_n = Σ_{1≤i≤n} x_i². Let us assume the existence of the limits

γ_2 = lim_{n→∞} n^{-1} Σ_{1≤i≤n} x_i^4,

together with the analogous limits α, ξ_0, ξ_1, ξ_2, ... of the corresponding normalised sums of powers of the x_i's and of the error variances τ_i², as in Bose and Chatterjee (1997). Let

η_1 = Σ_{1≤i≤n} x_i e_i / (Σ_{1≤i≤n} x_i²)^{1/2},   η_2 = Σ_{1≤i≤n} x_i³ e_i / (Σ_{1≤i≤n} x_i^6)^{1/2},   η = (η_1, η_2)^T,

let H denote the (2 × 2) matrix formed from these limits, and put Q = η^T H η.
Theorem 2.2. If p = 1, then the variance estimate from the DBS satisfies

n^{3/2} (V_DBS − V_n) = Z_n + n^{-1/2} T_n + O_P(n^{-1}),

where Z_n is the first-order term, as in Bose and Chatterjee (1997), and the second-order term is given by

T_n = Q + 3 σ²_{Bn} c_{22} ξ_8 ξ_0 γ_2 − 2 σ_{Bn} c_3 ξ_6 ξ_1 + n (3 σ²_{Bn} c_{112} − 2 σ_{Bn} c_{12} − c_{11}) ξ_4 ξ_0 = Q + R_n, say.

The proof of this theorem is the same as that of Theorem 2.2(v) of Bose and Chatterjee (1997), and we omit it.
3. Another look at the jackknife
Let d be an integer in [1, n). From the set {1, 2, ..., n} choose d integers. The delete-d parameter estimate β̂_{(−d)} is the usual least-squares estimate from the data with the chosen d observations dropped. Such estimates are found for all possible choices of sets of d observations to delete. The jackknife variance estimate is an appropriately scaled version of Σ (β̂_{(−d)} − β̂)², where the sum runs over all possible choices of d integers from 1 to n. A detailed study of the delete-d jackknife can be found in Shao (1988), Wu (1986, 1990), Shao and Wu (1987, 1989) and Shao and Tu (1995). For example, in the usual delete-1 jackknife, introduced by Quenouille (1956), the variance estimator is

V_J = [n(n − 1)]^{-1} Σ_{1≤i≤n} (J_i − β̂)².

Here the pseudovalue J_i is defined as nβ̂ − (n − 1)β̂_{(−i)}, and β̂_{(−i)} stands for the least-squares estimate from the data set with the ith observation deleted.
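As an illustration (a minimal sketch with arbitrary simulated data, not from the paper), the delete-1 jackknife variance estimator V_J written out via pseudovalues for a single-parameter, no-intercept regression:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
x = rng.uniform(0.5, 2.0, size=n)
y = 2.0 * x + rng.normal(scale=0.3 * x, size=n)    # heteroskedastic errors

def ls(xs, ys):
    # least-squares slope through the origin
    return np.sum(xs * ys) / np.sum(xs * xs)

beta_hat = ls(x, y)
J = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    beta_del_i = ls(x[mask], y[mask])
    J[i] = n * beta_hat - (n - 1) * beta_del_i     # pseudovalue J_i

V_J = np.sum((J - beta_hat) ** 2) / (n * (n - 1))  # delete-1 jackknife variance estimate
print(beta_hat, V_J)
```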
The above sum can also be replaced by a weighted sum, with weights possibly depending on the subset of ignored observations. Let us introduce some notation which we use to describe the important weighted jackknife schemes. Let G = X^T X and, for any i, δ_i = x_i^T G^{-1} x_i. For an (n × p) matrix A and S ∈ S_{n,d} (the collection of all size-d subsets of {1, ..., n}, as in the proof of Theorem 3.1 below), A_S is defined to be the submatrix of A formed by deleting the i_1th, ..., i_dth rows of A. Denote X_S^T X_S by G_S. Our model conditions imply that G_S is positive definite for all choices of S ∈ S_{n,d}, as long as d + p ≤ n. Let β̂_S = G_S^{-1} X_S^T y_S. The variance estimator of Hinkley (1977) is

V_JH = n(n − p)^{-1} Σ_{1≤i≤n} (1 − δ_i)² (β̂_{(−i)} − β̂)(β̂_{(−i)} − β̂)^T
and that of Wu (1986) is

V_JW = [ C(n − p, d − 1) |G| ]^{-1} Σ_{S ∈ S_{n,d}} |G_S| (β̂_S − β̂)(β̂_S − β̂)^T,

where C(a, b) denotes the binomial coefficient. A detailed study of these weighted jackknives is made in Shao and Wu (1987). The third weighted jackknife scheme, due to Liu and Singh (1992), came up as an example of a jackknife procedure that is in the E-class. It is defined for the p = 1 case, and does not have any simple extension to higher-dimensional models. The variance estimate is defined as

V_JLS = L_n [n²(n − 1)]^{-1} Σ_{1≤i≤n} (J_i − β̂)² / x_i²,   where L_n = Σ_{1≤i≤n} x_i².
We present a different way of looking at the jackknife through the following theorem.
Theorem 3.1. In terms of resampling variance estimates of the least-squares estimate of the parameter in linear models, the jackknife resampling schemes are special cases of generalised bootstrap resampling schemes.
Proof. To establish this proposition, we have to show that the jackknives are driven by a scheme of random weights.
Let N_n = {1, 2, ..., n} and S_{n,d} = {all subsets of size d from N_n}. We will identify a typical element of S_{n,d} by S = {i_1, i_2, ..., i_d}. There are C(n, d) such elements in S_{n,d}. S^c denotes the set N_n \ S. Let {ξ_S, S ∈ S_{n,d}} be a collection of vectors in R^n defined by ξ_S = (ξ_S(1), ξ_S(2), ..., ξ_S(n)), where

ξ_S(i) = c_S  if i ∈ S^c,   and   ξ_S(i) = 0  if i ∈ S,

where {c_S, S ∈ S_{n,d}} is a set of constants. Suppose {p_S, S ∈ S_{n,d}} is a set of probabilities, that is, p_S ≥ 0 for all S ∈ S_{n,d} and Σ_{S ∈ S_{n,d}} p_S = 1. Consider the random vector W_n = (w_{1:n}, w_{2:n}, ..., w_{n:n}) which has the probability law

P[W_n = ξ_S] = p_S,   S ∈ S_{n,d}.    (3.1)

Observe that for i ∈ N_n,

w_{i:n} = c_S  if i ∈ S^c,   and   w_{i:n} = 0  if i ∈ S.
It can now easily be seen that if the ith data point (x_i, y_i) is scaled by √w_{i:n} and the least-squares estimate is calculated for the transformed data, the resulting estimate is β̂_S. For the different jackknife variance estimates, we only have to choose the constants c_S and p_S appropriately, so that the mean-squared deviation of the weights from one is the same for every i, possibly depending on n. These conditions can be summarised as
Σ_{S: i∉S} (c_S − 1)² p_S + Σ_{S: i∈S} p_S = n^{-1} Σ_{i=1}^{n} [ Σ_{S: i∉S} (c_S − 1)² p_S + Σ_{S: i∈S} p_S ],   i = 1, 2, ..., n.    (3.2)
One way of ensuring that (3.2) holds is to specify p_S, S ∈ S_{n,d}, and then take c_S = 1 + k p_S^{-1/2}, S ∈ S_{n,d}, for some constant k > 0.
For the delete-d jackknife, we take p_S = C(n, d)^{-1}, S ∈ S_{n,d}, so that the expression in (3.2) does not depend on S. For the weighted jackknives, we have the following relations:

p_S ∝ |G_S|   for Wu (1986),    (3.3)

p_i ∝ (1 − δ_i)²   for Hinkley (1977),    (3.4)

p_i ∝ 1/x_i²   for Liu and Singh (1992).    (3.5)
Note that for Hinkley's jackknife and Liu and Singh's jackknife the sets S can only be singletons, so when S = {i} we use the notation c_i and p_i in place of c_S and p_S, respectively.
Thus, we can see that with an appropriate choice of weights, the jackknife variance estimate is of the form (1.1). □
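To make the construction in the proof concrete, here is a small sketch (the toy data and sizes are arbitrary, and the case p = 1 is used) for the delete-d jackknife: p_S is uniform over S_{n,d}, c_S = n/(n − d) as in the proof of Theorem 3.2 below, and the induced weight law has E_B(w_i − 1)² = d/(n − d) (checked numerically here). The resulting generalised bootstrap variance estimate σ_{Bn}^{-2} Σ_S p_S (β̂_S − β̂)² is just a rescaled sum of the delete-d deviations:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)
n, d = 12, 3                      # small, so all C(n, d) subsets can be enumerated
x = rng.uniform(0.5, 2.0, size=n)
y = 1.5 * x + rng.normal(scale=0.2, size=n)

def ls(xs, ys):
    # least-squares slope through the origin
    return np.sum(xs * ys) / np.sum(xs * xs)

beta_hat = ls(x, y)
subsets = list(combinations(range(n), d))
p_S = 1.0 / len(subsets)                        # uniform p_S over S_{n,d}
c_S = n / (n - d)                               # common value of the non-zero weights

# sigma^2_Bn = E_B (w_1 - 1)^2 under the weight law (3.1); it equals d/(n - d) here
sigma2_Bn = sum(p_S * ((c_S - 1.0) ** 2 if 0 not in S else 1.0) for S in subsets)
print(sigma2_Bn, d / (n - d))

# generalised bootstrap variance estimate: sigma_Bn^{-2} * sum_S p_S (beta_S - beta_hat)^2
acc = 0.0
for S in subsets:
    keep = np.array([i for i in range(n) if i not in S])
    beta_S = ls(x[keep], y[keep])               # weighted LS under xi_S reduces to beta_S
    acc += p_S * (beta_S - beta_hat) ** 2
V_B = acc / sigma2_Bn
print(V_B)
```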
Incidentally, for Wu's scheme (3.3) it can be seen using the Binet-Cauchy expansion that the sum Σ_{S} d(n − d)^{-1} [C(n − p, d − 1) |G|]^{-1} |G_S| equals (n − d − p + 1)/(n − d). Wu (1986) had pointed out that the extra factor of (n − d − p + 1)/(n − d) should be absorbed in the β̂_S − β̂ for exact standardisation. The entire discussion leading to the proposed jackknife estimator in Wu (1986) can now be motivated as resampling from a generalised bootstrap viewpoint.
Having established that the jackknife is a special case of generalised bootstrap, our next goal is to exploit, for the jackknife, the results known for different generalised bootstrap schemes. To this end we now relate the usual delete-d jackknives to UBS or DBS.
Theorem 3.2. (i) Suppose {d_n} is a sequence of integers such that d_n/n → c ∈ (0, 1) and d_n + p_n ≤ n. Then the delete-d_n jackknife scheme satisfies the conditions of UBS.
(ii) Suppose {d_n} is a sequence of integers such that d_n/n → 0; in particular, d_n can be constant at any positive integer. Then the delete-d_n jackknife scheme satisfies the conditions of DBS.
Proof. We retain the notation and definitions of the proof of the previous theorem. Note that for delete-d jackknives, we have

ξ_S(i) = n/(n − d)  if i ∈ S^c,   and   ξ_S(i) = 0  if i ∈ S,

and p_S = C(n, d)^{-1}, S ∈ S_{n,d}. Now, it can be verified by direct calculation that the conditions of UBS hold if d/n → c* ∈ (0, 1), where c* < c of (2.4). Note that c can be any fraction less than one, so the extra condition on c* can easily be satisfied by taking c large enough. Similarly, direct calculations also show that the conditions of DBS are satisfied if d/n → 0. Also, observe that using the weights {w_{1:n}, w_{2:n}, ..., w_{n:n}} on the data points we get the required jackknife estimates. This completes the proof. □
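A small arithmetic sketch (illustrative only) of the dichotomy in the theorem: under the delete-d weight law above, E(w_i) = 1 and E(w_i − 1)² = d/(n − d), which tends to the positive limit c/(1 − c) when d/n → c ∈ (0, 1) (UBS-type weights) and tends to zero when d/n → 0 (DBS-type, degenerate weights):

```python
def delete_d_weight_moments(n, d):
    """Mean and squared deviation from one of w_i under the delete-d weight law."""
    c = n / (n - d)                       # value of the non-zero weights
    p_keep = (n - d) / n                  # P(i is not deleted)
    mean = p_keep * c                     # equals 1
    msd = p_keep * (c - 1.0) ** 2 + (1 - p_keep) * 1.0   # E(w_i - 1)^2 = d/(n - d)
    return mean, msd

for n in (20, 200, 2000):
    print(n, delete_d_weight_moments(n, 1), delete_d_weight_moments(n, n // 5))
# delete-1: E(w_i - 1)^2 = 1/(n-1) -> 0 (DBS); delete-[n/5]: E(w_i - 1)^2 -> 1/4 (UBS)
```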
Remark 1. For the weighted jackknives, a detailed calculation of the higher-order cross moments, as is necessary for identification of a procedure as UBS or DBS, seems intractable. However, the other conditions are easily verified, and this leads us to conjecture that the weighted jackknives are also UBS or DBS depending on the number of elements of S, possibly under some weak conditions on the model.
Remark 2. More importantly, observe that in the case p ≡ 1, if p_i is proportional to x_i^{-2} we get an E-class technique, whereas in general we get an R-class technique. Liu and Singh (1992) remarked that an adaptive resampling scheme can probably be designed to fall in the E-class in the homoskedastic regression case and in the R-class in the heteroskedastic case. The weighted jackknife generalised bootstrap is thus seen to be such an adaptive estimate.
The second-order properties of the jackknife variance estimator can also be studied using known results about UBS and DBS schemes. For the present discussion, we restrict our attention to the p = 1 case. From Bose and Chatterjee (1997), we know that the different resampling schemes can be separated based on the second term in the variance expansion. Accordingly, the most efficient of these resampling schemes can also be found, and we call such a resampling scheme a second-order efficient scheme.
Remark 3. Second-order results for jackknife variance estimators. For the delete-d jackknife, we have from Theorem 2.2

R_n = 3 α ξ_8 ξ_0 γ_2 d(n − d)^{-1} + 2 ξ_6 ξ_1 (1 − d(n − d)^{-1}) − ξ_4 ξ_0 (1 − 2 d(n − d)^{-1}).    (3.6)

If d/n → 0, the above expression reduces to R_n = 2 ξ_6 ξ_1 − ξ_4 ξ_0, which is the same as that of the delete-1 jackknife; see Bose and Chatterjee (1997), Theorem 2(i). This is as expected, since we have already shown that the delete-1 jackknife is an example of a DBS scheme.
For the jackknife schemes that satisfy the UBS conditions, if lim_{n→∞} d(n − d)^{-1} = c and c ∈ (0, 1), then

c = ξ_0 (2 − α ξ_4 ξ_0^{-2}) / (3 ξ_4 ξ_0 γ_2 − 2 ξ_2 ξ_1 + ξ_4 ξ_0)    (3.7)

gives a second-order efficient jackknife. This is entirely in keeping with what has been observed and remarked in Wu (1990), that it seems desirable to let both d → ∞ and n − d → ∞, but that the optimal choice for the limit of d/(n − d) depends on the problem at hand. If we make the natural assumptions that ξ_0 = ξ_1 and α = γ_2 = 1, then (3.7) gives c = 1/4. This leads us to suggest that delete-[n/5] is a generally good jackknife scheme.
Thus, in the context of linear models, we have been able to identify a unified framework, the generalised bootstrap set-up, of which all resampling schemes are special cases. The random weights may either be attached to residuals or to the observations themselves, resulting in two separate classes of resampling procedures. Among the latter class of resampling techniques, we have two separate procedures, the UBS and the DBS. We have established that the jackknives are generalised bootstrap schemes, and that they are DBS or UBS generalised bootstrap schemes depending on whether d/n goes to zero or to a positive fraction in the limit.
It is known that the delete-1 jackknife, which is a DBS scheme, yields inconsistent answers for some problems, like estimating the variance of the sample median, whereas if d → ∞ the delete-d jackknife is consistent. It can already be conjectured that the DBS would not be able to provide correct answers in problems like minimisation of the L1 norm, but that the UBS would give consistent results.
One remarkable advantage of the UBS and DBS resampling schemes is that i.i.d. random variables satisfy the conditions on the weights, subject to some lower-order moment restrictions. This makes the computation feasible and efficient, since resampling involves a good deal of Monte Carlo computation. For example, for any d, we can approximate the delete-d jackknife variance estimate by using i.i.d. weights supported on a compact set inside the positive half of the real line and having unit mean and variance d/(n − d). The accuracy of the approximation can be improved to any degree by imposing restrictions only on the lower-order moments of the weights. These results can be favourably compared to those of Shao (1987) or Shao (1989).
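A sketch of this approximation for p = 1 (the particular compactly supported weight distribution, a uniform on [1 − h, 1 + h], and all sizes are our illustrative choices, not prescribed by the paper): draw i.i.d. weights with unit mean and variance d/(n − d) and plug them into the generalised bootstrap variance formula; a Monte Carlo version of the exact delete-d jackknife is shown alongside for comparison:

```python
import numpy as np

rng = np.random.default_rng(6)
n, d, B = 60, 12, 4000
x = rng.uniform(0.5, 2.0, size=n)
y = 1.5 * x + rng.normal(scale=0.3 * x, size=n)

def wls(w, xs, ys):
    # weighted least-squares slope through the origin
    return np.sum(w * xs * ys) / np.sum(w * xs * xs)

beta_hat = wls(np.ones(n), x, y)
sigma2_Bn = d / (n - d)

# i.i.d. weights: uniform on [1 - h, 1 + h] has mean 1 and variance h^2 / 3;
# choosing h = sqrt(3 d / (n - d)) gives variance d/(n - d), and the support stays
# inside the positive half line as long as d/(n - d) < 1/3
h = np.sqrt(3.0 * sigma2_Bn)
acc_iid = 0.0
for _ in range(B):
    w = rng.uniform(1.0 - h, 1.0 + h, size=n)
    acc_iid += (wls(w, x, y) - beta_hat) ** 2
V_iid = acc_iid / (B * sigma2_Bn)

# Monte Carlo delete-d jackknife over random subsets (exhaustive enumeration is infeasible here)
acc_jack = 0.0
for _ in range(B):
    keep = rng.choice(n, size=n - d, replace=False)
    acc_jack += (wls(np.ones(n - d), x[keep], y[keep]) - beta_hat) ** 2
V_jack = acc_jack / (B * sigma2_Bn)

print(V_iid, V_jack)   # the two estimates should be close
```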
Appendix

Proof of Theorem 2.1 for p ≡ 1. We use the notation

W_i = (w_i − 1)/σ_{Bn}

throughout this proof. Expanding the expression for V_DBS, we get

V_DBS = σ_{Bn}^{-2} E_B (β̂_B − β̂)²
      = σ_{Bn}^{-2} E_B [ Σ_{i=1,n} w_i x_i e_i / Σ_{i=1,n} w_i x_i²  −  Σ_{i=1,n} x_i e_i / Σ_{i=1,n} x_i² ]²
      = σ_{Bn}^{-2} E_B [C_n + D_n]²,

where, writing L_n = Σ_{i=1,n} x_i²,

C_n = Σ_{i=1,n} w_i x_i e_i / Σ_{i=1,n} x_i² − Σ_{i=1,n} x_i e_i / Σ_{i=1,n} x_i² = Σ_{i=1,n} (w_i − 1) x_i e_i / L_n

and

D_n = Σ_{i=1,n} w_i x_i e_i ( 1/Σ_{i=1,n} w_i x_i² − 1/Σ_{i=1,n} x_i² ),

and hence

σ_{Bn}^{-2} E_B C_n² = (1/L_n²) ( Σ_{1≤i≤n} x_i² e_i² + c_{11} Σ_{i≠j} x_i x_j e_i e_j ) = (1/L_n²) Σ_{1≤i≤n} x_i² e_i² + O_p(n^{-2}).    (4.1)

Note that D_n = T_{1n} + T_{2n}, where

T_{1n} = Σ_{1≤i≤n} (w_i − 1) x_i e_i ( 1/Σ_{1≤i≤n} w_i x_i² − 1/Σ_{1≤i≤n} x_i² )

and

T_{2n} = Σ_{1≤i≤n} x_i e_i ( 1/Σ_{1≤i≤n} w_i x_i² − 1/Σ_{1≤i≤n} x_i² ).
It remains to show that E_B T_{1n}², E_B T_{2n}² and E_B C_n D_n, each divided by σ²_{Bn}, are O_p(n^{-2}). Expanding each term and bounding the resulting sums of cross products of the W_i's using the weight conditions (2.12)-(2.18) together with the model conditions (1.4)-(1.7), we obtain

E_B T_{1n}² = O_p(n^{-2} σ²_{Bn}),    (4.2)

σ_{Bn}^{-2} E_B T_{2n}² = O_p(n^{-2}),    (4.3)

σ_{Bn}^{-2} E_B T_{2n} C_n = O_p(n^{-2}),    (4.4)

σ_{Bn}^{-2} E_B T_{1n} C_n = O_p(n^{-2}).    (4.5)

Thus we have V_DBS = (1/L_n²) Σ_{1≤i≤n} x_i² e_i² + O_p(n^{-2}), as required. The proof for the general p case follows similar lines; see Chatterjee (1997) for details of the arguments involved. □
Acknowledgements
The author would like to thank Professor Arup Bose for his invaluable help and guidance in this research.
References
Barbe, P., Bertail, P., 1995. The Weighted Bootstrap, Lecture Notes in Statistics, vol. 98. Springer, New York.
Bera, A.K., Higgins, M.L., 1993. A survey of ARCH models: properties, estimation and testing. J. Economic Surveys 7, 305-366.
Bickel, P.J., Freedman, D.A., 1983. Bootstrapping regression models with many parameters. In: Bickel, P., Doksum, K., Hodges, Jr. J.,
(Eds.), Festschrift for Erich L. Lehmann, Wadsworth, Belmont, Calif., pp. 28-48.
Bose, A., Chatterjee, S., 1997. Second order comparison of different resampling schemes in linear regression. Tech. Report No. 15/97, Stat-Math Unit, Indian Statistical Institute, Calcutta.
Boos, D., Monahan, J.F., 1986. Bootstrap methods using prior information. Biometrika 73, 77-83.
Chatterjee, S., 1997. Generalized bootstrap in regression models with many parameters. Tech. Report No. 2/97, Stat-Math Unit, Indian
Statistical Institute, Calcutta.
Efron, B., 1979. Bootstrap methods: another look at the jackknife. Ann. Statist. 7, 101-118.
Efron, B., 1982. The Jackknife, the Bootstrap and Other Resampling Plans. SIAM, Philadelphia.
Freedman, D.A., 1981. Bootstrapping regression models. Ann. Statist. 9, 1218-1228.
Härdle, W., Marron, S., 1991. Bootstrap simultaneous error bars for non-parametric regression. Ann. Statist. 19, 778-796.
Hinkley, D.V., 1977. Jackknifing in unbalanced situations. Technometrics 19, 285-292.
Liu, R.Y., 1988. Bootstrap procedures under some non-i.i.d, models. Ann. Statist. 16, 1696-1708.
Liu, R.Y., Singh, K., 1992. Efficiency and robustness in resampling. Ann. Statist. 20, 370-384.
Lo, A.Y., 1987. A large sample study of the Bayesian bootstrap. Ann. Statist. 15, 360-375.
Lo, A.Y., 1991. Bayesian bootstrap clones and a biometry function. Sankhya Ser. A 53, 320-333.
Mammen, E., 1989. Asymptotics with increasing dimension for robust regression with applications to the bootstrap, Ann. Statist. 17,
382-400.
Mammen, E., 1992. When does Bootstrap Work? Asymptotic Results and Simulations, Lecture Notes in Statistics, vol. 77. Springer,
New York.
Mammen, E., 1993. Bootstrap and wild bootstrap for high dimensional linear models. Ann. Statist. 21, 255-285.
Portnoy, S., 1984. Asymptotic behaviour of M-estimators of p regression parameters when p²/n is large, I: consistency. Ann. Statist. 12, 1298-1309.
Portnoy, S., 1985. Asymptotic behaviour of M-estimators of p regression parameters when p²/n is large, II: asymptotic normality. Ann. Statist. 13, 1403-1417.
Portnoy, S., 1988. Asymptotic behaviour of the empirical distribution of M-estimator residuals from a regression model with many parameters. Ann. Statist. 14, 1152-1170.
Praestgaard, J., Wellner, J.A., 1993. Exchangeably weighted bootstrap of the general empirical process. Ann. Probab. 21, 2053-2086.
Quenouille, M., 1956. Notes on bias in estimation. Biometrika 43, 353-360.
Rubin, D., 1981. The Bayesian bootstrap. Ann. Statist. 9, 130-134.
Shao, J., 1987. Sampling and resampling: an efficient approximation to jackknife variance estimators in linear models. Chinese J. Appl.
Prob. Statist. 3, 368-379.
Shao, J., 1988. Consistency of the jackknife estimators of the variance of sample quantiles. Comm. Statist. A 17, 3017-3028.
Shao, J., 1989. The efficiency and consistency of approximations to the jackknife variance estimators. J. Amer. Statist. Assoc. 84, 114-119.
Shao, J., Tu, D., 1995. The Jackknife and the Bootstrap. Springer, New York.
Shao, J., Wu, C.F.J., 1987. Heteroskedasticity-robustness of jackknife variance estimators in linear models. Ann. Statist. 15, 1563-1579.
Shao, J., Wu, C.F.J., 1989. A general theory for jackknife variance estimation. Ann. Statist. 17, 1176-1197.
Weng, C.S., 1989. On a second order asymptotic property of the Bayesian bootstrap mean. Ann. Statist. 17, 705-710.
Wu, C.F.J., 1986. Jackknife, bootstrap and other resampling methods in regression analysis (with discussion). Ann. Statist. 14, 1261-1295.
Wu, C.F.J., 1990. On the asymptotic properties of the jackknife histogram. Ann. Statist. 18, 1438-1452.
Zheng, Tu, 1988. Random weighting method in regression models. Sci. Sinica Ser. A 31, 1442-1459.