AN UPPER BOUND FOR THE VARIANCE OF CERTAIN STATISTICS
by
Wassily Hoeffding
University of North Carolina
This research was supported by the United
States Air Force through the Air Force
Office of Scientific Research of the Air
Research and Development Command, under
Contract No. AF 18(600)-458. Reproduction
in whole or in part is permitted for any
purpose of the United States Government.
Institute of Statistics
Mimeograph Series No. 193
March, 1958
AN UPPER BOUND FOR THE VARIANCE OF CERTAIN STATISTICS¹
by
Wassily Hoeffding
University of North Carolina
1. Results. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed random variables (real- or vector-valued). Let $f(X_1,X_2)$ be a bounded function such that $f(X_1,X_2) = f(X_2,X_1)$. With no loss of generality we shall assume that the bounds are $0 \le f(X_1,X_2) \le 1$. Let

$$U = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} f(X_i, X_j).$$
Examples of statistics of this form are given below. The mean of $U$ is

$$E(U) = p = E\,f(X_1,X_2),$$

and the variance is

$$\operatorname{var}(U) = \frac{2}{n(n-1)} \bigl( 2(n-2)(r - p^2) + s - p^2 \bigr),$$

where

$$r = E\,f(X_1,X_2)\,f(X_1,X_3), \qquad s = E\,f(X_1,X_2)^2.$$
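As an illustrative check, not part of the original paper, $U$ and the variance formula can be computed directly. The kernel $f(x_1,x_2) = 1$ if $x_1 + x_2 > 0$ (else 0) with $X$ uniform on $(-1,1)$ is a hypothetical choice for which $p = 1/2$, $r = 1/3$, $s = 1/2$:

```python
import itertools
import random

def u_stat(xs, f):
    # U = 2 / (n(n-1)) * sum over pairs i < j of f(x_i, x_j)
    n = len(xs)
    total = sum(f(a, b) for a, b in itertools.combinations(xs, 2))
    return 2.0 * total / (n * (n - 1))

def f(x1, x2):
    # a symmetric kernel with values in {0, 1} (illustrative choice)
    return 1.0 if x1 + x2 > 0 else 0.0

rng = random.Random(0)
n, reps = 20, 4000
us = [u_stat([rng.uniform(-1.0, 1.0) for _ in range(n)], f) for _ in range(reps)]
mean_u = sum(us) / reps
var_u = sum((u - mean_u) ** 2 for u in us) / reps

# exact moments for this particular kernel and distribution
p, r, s = 0.5, 1.0 / 3.0, 0.5
var_formula = 2.0 / (n * (n - 1)) * (2 * (n - 2) * (r - p * p) + s - p * p)
```

The simulated mean and variance of $U$ agree with $p$ and with the formula to within Monte Carlo error.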
As $n$ tends to infinity, $n^{1/2}(U - p)$ has a normal limiting distribution [2]. Hence if we have an upper bound for the variance of $U$ which depends only on $p$ and $n$, we can obtain an approximate confidence region for $p$ and a lower bound for the power of a test based on $U$ when $n$ is large (see [3]).

¹ This research was supported by the United States Air Force through the Air Force Office of Scientific Research of the Air Research and Development Command, under Contract No. AF 18(600)-458. Reproduction in whole or in part is permitted for any purpose of the United States Government.
It is known [2] that $2(r - p^2) \le s - p^2$, and since obviously $s \le p$, we have

$$\operatorname{var}(U) \le \frac{2}{n}\, p(1-p).$$
In this note we shall show that, under the stated assumptions,

(1) $$r - p^2 \le R(p), \qquad R(p) = \begin{cases} p^{3/2} - p^2, & p \ge \tfrac12, \\ (1-p)^{3/2} - (1-p)^2, & p \le \tfrac12. \end{cases}$$

It is easily seen that the sign of equality holds in (1) if, with probability one, $f(X_1,X_2) = g(X_1)g(X_2)$ (for $p \ge \tfrac12$) or $f(X_1,X_2) = 1 - g(X_1)g(X_2)$ (for $p \le \tfrac12$), where $g(X)$ takes the values 0 and 1 only. Inequality (1) implies that

(2) $$\operatorname{var}(U) \le \frac{2}{n(n-1)} \bigl( 2(n-2)\, R(p) + p(1-p) \bigr).$$
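A quick numerical sanity check of (1), added here for illustration (the helper names are mine): for $f = g(X_1)g(X_2)$ with $q = P(g = 1)$ one gets $p = q^2$ and $r = q^3$, while for $f = 1 - g(X_1)g(X_2)$ one gets $p = 1 - q^2$ and $r = 1 - 2q^2 + q^3$; both attain the bound.

```python
def R(p):
    # the bound R(p) of inequality (1)
    if p <= 0.5:
        q = 1.0 - p
        return q ** 1.5 - q ** 2
    return p ** 1.5 - p ** 2

def moments_gg(q):
    # p and r for f(X1, X2) = g(X1) g(X2), with q = P(g = 1)
    return q * q, q ** 3

def moments_one_minus_gg(q):
    # p and r for f(X1, X2) = 1 - g(X1) g(X2)
    return 1.0 - q * q, 1.0 - 2.0 * q * q + q ** 3

for q in [0.75, 0.9, 0.95]:
    p, r = moments_gg(q)            # here p = q*q >= 1/2
    assert abs((r - p * p) - R(p)) < 1e-12
    p, r = moments_one_minus_gg(q)  # here p = 1 - q*q <= 1/2
    assert abs((r - p * p) - R(p)) < 1e-12
```

The two kernel families trace out exactly the two branches of $R(p)$.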
An inequality analogous to (1) was conjectured by Daniels and Kendall [1] for the variance of the finite population analogue of the statistic $t$ defined in Example 1 below. A proof of this conjecture suggested by Sundrum [4] does not seem to be complete.
We now give three examples of statistics to which the present bound
is applicable; in the first two examples the bound can be attained.
Example 1. Let $X_i = (Y_i, Z_i)$ be a random vector with two continuously distributed components, and let $f(X_1,X_2) = 1$ or 0 according as $(Y_1 - Y_2)(Z_1 - Z_2)$ is positive or negative. In this case $t = 2U - 1$ is the well-known rank correlation coefficient. The condition for equality in (1) is satisfied if $Z$ is a function of $Y$ of a certain form, for instance positive and decreasing for $Y < 0$ and negative and increasing for $Y > 0$ (if $p \ge \tfrac12$); or negative and increasing for $Y < 0$ and positive and decreasing for $Y > 0$ (if $p \le \tfrac12$). For if we let $g(X) = 0$ or 1 according as $Y < 0$ or $Y > 0$, then, with probability one, $f(X_1,X_2) = g(X_1)g(X_2)$ in the first case and $f(X_1,X_2) = 1 - g(X_1)g(X_2)$ in the second case.
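The relation $t = 2U - 1$ with the concordance kernel can be made concrete by a small computation (an added illustration, not from the paper; ties are assumed absent, as the components are continuously distributed):

```python
import itertools

def kendall_t(pairs):
    # t = 2U - 1, with f = 1 for a concordant pair, 0 for a discordant pair
    n = len(pairs)
    conc = sum(
        1 for (y1, z1), (y2, z2) in itertools.combinations(pairs, 2)
        if (y1 - y2) * (z1 - z2) > 0
    )
    u = 2.0 * conc / (n * (n - 1))
    return 2.0 * u - 1.0
```

Perfectly concordant data give $t = 1$, perfectly discordant data give $t = -1$.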
These two cases correspond to the inverse canonical rank correlations.

Example 2. Let $X_i$ be real-valued, and let $f(X_1,X_2) = 1$ or 0 according as $X_1 + X_2$ is positive or negative. The condition for equality in (1) is attained, for instance, if $X_i$ can take only two values, $a$ and $b$, such that either $a + b < 0 < b$ (if $p \ge \tfrac12$) or $a < 0 < a + b$ (if $p \le \tfrac12$). Here we may take $g(X) = 0$ or 1 according as $X < 0$ or $X > 0$ in the first case, and $g(X) = 0$ or 1 according as $X > 0$ or $X < 0$ in the second case.
Example 3. Let $X_i$ again be real-valued, and let

$$f(X_1,X_2) = 1 - 2\max\bigl(F_0(X_1), F_0(X_2)\bigr) + F_0(X_1)^2 + F_0(X_2)^2,$$

where $F_0(x)$ is a continuous (cumulative) distribution function. Then, if $X_i$ has the distribution function $F(x)$,

$$p = \tfrac13 + 2 \int \bigl( F(x) - F_0(x) \bigr)^2 \, dF_0(x),$$

and $(U - \tfrac13)/2$ differs in large samples negligibly little from the Cramér-von Mises goodness of fit criterion $\int \bigl[ F_n(x) - F_0(x) \bigr]^2 \, dF_0(x)$, where $n F_n(x)$ denotes the number of observations $< x$. In this case the condition for equality in (1) cannot be satisfied, and presumably the upper bound for the variance can be further improved.
2. Proof of Inequality (1). We first assume that $f(X_1,X_2)$ takes on the values 0 and 1 only. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed random variables, write $f_{ij} = f(X_i, X_j)$ for $i \ne j$ and $f_{ii} = 0$, and let

(3) $$\hat p = n^{-2} \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij}, \qquad \hat r = n^{-3} \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} f_{ij} f_{ik}.$$

As $n \to \infty$, $\hat p$ and $\hat r$ converge in probability to $p$ and $r$, respectively. We shall show that

(4) $$\hat r - \hat p^2 \le R(\hat p) + \epsilon_n,$$

where $R(p)$ is the function defined in (1) and the $\epsilon_n$ are numbers which converge to 0 as $n \to \infty$. Since the function $R(p)$ is continuous, inequality (4) easily implies inequality (1).
Both $\hat p$ and $\hat r$ are functions of the $n \times n$ matrix $\|f_{ij}\|$ whose elements satisfy the conditions

(5) $$f_{ij} = 0 \text{ or } 1, \qquad f_{ij} = f_{ji}, \qquad f_{ii} = 0.$$

Let

$$F_i = \sum_{j=1}^{n} f_{ij}.$$

Then

$$n^2 \hat p = \sum_{i=1}^{n} F_i, \qquad n^3 \hat r = \sum_{i=1}^{n} F_i^2.$$
We first show that, in order to find an upper bound for $\hat r$ when $\hat p$ is fixed, we may assume that

(6) $$F_i \ge F_j \text{ implies } f_{ik} \ge f_{jk} \text{ for all } k \ne i, j.$$

For suppose that there are integers $i, j, k$ such that $F_i \ge F_j$ and $f_{ik} < f_{jk}$; thus $f_{ik} = 0$, $f_{jk} = 1$, $k \ne i$. Let $\|f'_{uv}\|$ be the $n \times n$ matrix defined by $f'_{ik} = f'_{ki} = 1$, $f'_{jk} = f'_{kj} = 0$, $f'_{uv} = f_{uv}$ otherwise. The transformed matrix satisfies (5), and the value of $\hat p$ is not affected by the transformation. Also, writing $F'_u$ for the sum of the $u$-th row in the new matrix, $F'_i = F_i + 1$, $F'_j = F_j - 1$, $F'_u = F_u$ otherwise. Therefore

$$\sum_u (F'_u)^2 - \sum_u F_u^2 = (F_i + 1)^2 - F_i^2 + (F_j - 1)^2 - F_j^2 = 2(F_i - F_j) + 2 > 0.$$

Thus the value of $\hat r$ is increased by the transformation. By repeated application of this transformation we can obtain a matrix which satisfies (6), for instance as follows. We first take one of the rows with the largest sum to be the $i$-th row, and for every $j \ne i$ we apply the transformation for each $k \ne i$ with $f_{ik} < f_{jk}$. Having exhausted all $k$ and all $j$, we are left with a matrix such that every column different from the $i$-th that has a 0 in the $i$-th place consists of zeros only. This implies that any further transformations will not affect the $i$-th row. We next take one of the rows with the largest sum among the remaining $n-1$ rows, repeat the procedure described above, and so on. Eventually we obtain a matrix which satisfies the conditions (5) and (6), without changing the value of $\hat p$ and without decreasing the value of $\hat r$.
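The effect of a single transfer step can be checked mechanically. The sketch below (an added illustration with hypothetical helper names) applies one step to a symmetric 0-1 matrix violating (6) and confirms that $\sum_u F_u$ (hence $\hat p$) is unchanged while $\sum_u F_u^2$ (hence $\hat r$) increases:

```python
def row_sums(M):
    # F_u = sum of the u-th row
    return [sum(row) for row in M]

def transfer(M, i, j, k):
    # one transformation step: move the 1 in position (j, k) to (i, k), symmetrically;
    # assumes f_ik = 0, f_jk = 1, with i, j, k distinct and F_i >= F_j
    assert M[i][k] == 0 and M[j][k] == 1
    M[i][k] = M[k][i] = 1
    M[j][k] = M[k][j] = 0

# a symmetric 0-1 matrix violating (6): F_0 = 2 >= F_3 = 1, yet f_04 = 0 < f_34 = 1
M = [[0, 1, 1, 0, 0],
     [1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 1, 0]]
F_before = row_sums(M)
transfer(M, 0, 3, 4)
F_after = row_sums(M)
assert sum(F_after) == sum(F_before)                               # n^2 p-hat unchanged
assert sum(x * x for x in F_after) > sum(x * x for x in F_before)  # n^3 r-hat increased
```

The increase is exactly $2(F_i - F_j) + 2$, as in the displayed computation.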
With no loss of generality we may assume that, in addition to (5) and (6), the matrix $\|f_{ij}\|$ satisfies the condition

(7) $$F_1 \ge F_2 \ge \cdots \ge F_n.$$

For this can always be achieved by rearranging suitable rows and columns, which will not affect the values of $\hat p$ and $\hat r$. Thus we now restrict ourselves to symmetric matrices whose every row and every column, apart from a 0 in the main diagonal, consists of a sequence of 1's followed by a sequence of 0's. A typical matrix of this form is shown below.
0 1 1 1 1 1 0
1 0 1 1 0 0 0
1 1 0 1 0 0 0
1 1 1 0 0 0 0
1 0 0 0 0 0 0
1 0 0 0 0 0 0
0 0 0 0 0 0 0
For a fixed matrix $\|f_{ij}\|$ of this type let $m$ denote the greatest integer such that $f_{m,m+1} = 1$. Then $f_{ij} = 1$ if $i \le m+1$, $j \le m+1$ ($i \ne j$), and $f_{ij} = 0$ if $i \ge m+1$, $j \ge m+1$. Hence

$$n^2 \hat p = \sum_{i=1}^{m} \sum_{j=1}^{m} f_{ij} + \sum_{i=1}^{m} \sum_{j=m+1}^{n} f_{ij} + \sum_{i=m+1}^{n} \sum_{j=1}^{m} f_{ij} = m(m-1) + 2 \sum_{i=1}^{m} \sum_{j=m+1}^{n} f_{ij}$$
and

$$n^3 \hat r = \sum_{i=1}^{m} \Bigl( m - 1 + \sum_{j=m+1}^{n} f_{ij} \Bigr)^2 + \sum_{i=m+1}^{n} \Bigl( \sum_{j=1}^{m} f_{ij} \Bigr)^2$$
$$= m(m-1)^2 + 2(m-1) \sum_{i=1}^{m} \sum_{j=m+1}^{n} f_{ij} + \sum_{i=1}^{m} \Bigl( \sum_{j=m+1}^{n} f_{ij} \Bigr)^2 + \sum_{j=m+1}^{n} \Bigl( \sum_{i=1}^{m} f_{ij} \Bigr)^2.$$
If we put

$$d = \frac{1}{m(n-m)} \sum_{i=1}^{m} \sum_{j=m+1}^{n} f_{ij}, \qquad a_i = \frac{1}{n-m} \sum_{j=m+1}^{n} f_{ij}, \qquad b_j = \frac{1}{m} \sum_{i=1}^{m} f_{ij},$$

so that

$$d = \frac{1}{m} \sum_{i=1}^{m} a_i = \frac{1}{n-m} \sum_{j=m+1}^{n} b_j,$$

we obtain

(8) $$n^2 \hat p = m(m-1) + 2\, m(n-m)\, d$$
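Identities (8) and (9) can be verified directly on the 7x7 matrix displayed above (a check I am adding; indices are 1-based in the paper, 0-based in the code):

```python
M = [[0, 1, 1, 1, 1, 1, 0],
     [1, 0, 1, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0, 0],
     [1, 1, 1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0]]
n = 7
F = [sum(row) for row in M]
n2p = sum(F)                 # n^2 * p-hat
n3r = sum(x * x for x in F)  # n^3 * r-hat

# m = greatest integer with f_{m,m+1} = 1 (here m = 3)
m = max(i for i in range(1, n) if M[i - 1][i] == 1)
d = sum(M[i][j] for i in range(m) for j in range(m, n)) / (m * (n - m))
assert abs(n2p - (m * (m - 1) + 2 * m * (n - m) * d)) < 1e-9   # identity (8)

a = [sum(M[i][j] for j in range(m, n)) / (n - m) for i in range(m)]
b = [sum(M[i][j] for i in range(m)) / m for j in range(m, n)]
rhs = (m - 1) * n2p + m * (n - m) * n * d * d \
    + m * (n - m) * ((n - m) / m * sum((x - d) ** 2 for x in a)
                     + m / (n - m) * sum((x - d) ** 2 for x in b))
assert abs(n3r - rhs) < 1e-9                                   # identity (9)
```

For this matrix $m = 3$, $d = 5/12$, $n^2\hat p = 16$ and $n^3\hat r = 54$, and both identities hold exactly.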
and

(9) $$n^3 \hat r = (m-1)\, n^2 \hat p + (n-m)^2 \sum_{i=1}^{m} a_i^2 + m^2 \sum_{j=m+1}^{n} b_j^2$$
$$= (m-1)\, n^2 \hat p + m(n-m)\, n\, d^2 + m(n-m) \Bigl[ \frac{n-m}{m} \sum_{i=1}^{m} (a_i - d)^2 + \frac{m}{n-m} \sum_{j=m+1}^{n} (b_j - d)^2 \Bigr].$$
Now

$$\frac{1}{m(n-m)} \sum_{i=1}^{m} \sum_{j=m+1}^{n} \bigl[ (f_{ij} - d) - (a_i - d) - (b_j - d) \bigr]^2$$
$$= \frac{1}{m(n-m)} \sum_{i=1}^{m} \sum_{j=m+1}^{n} (f_{ij} - d)^2 - \frac{1}{m} \sum_{i=1}^{m} (a_i - d)^2 - \frac{1}{n-m} \sum_{j=m+1}^{n} (b_j - d)^2$$
$$= d - d^2 - \frac{1}{m} \sum_{i=1}^{m} (a_i - d)^2 - \frac{1}{n-m} \sum_{j=m+1}^{n} (b_j - d)^2,$$

since $f_{ij}^2 = f_{ij}$, so that the average of the $(f_{ij} - d)^2$ over the $m(n-m)$ cells equals $d - d^2$.
Since the left hand side is nonnegative, we obtain

(10) $$\frac{1}{m} \sum_{i=1}^{m} (a_i - d)^2 + \frac{1}{n-m} \sum_{j=m+1}^{n} (b_j - d)^2 \le d - d^2.$$
It now follows from (9) and (10) that

(11) $$n^3 \hat r \le (m-1)\, n^2 \hat p + m(n-m)\, n\, d^2 + m(n-m) \max(m, n-m)\, (d - d^2).$$

We thus have obtained an upper bound for $\hat r$ in terms of $\hat p$ for all matrices $\|f_{ij}\|$ with $m$ fixed.
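Continuing the illustration with the 7x7 matrix displayed earlier (my check, not the paper's), the right side of (11) can be evaluated and compared with $n^3\hat r$:

```python
M = [[0, 1, 1, 1, 1, 1, 0],
     [1, 0, 1, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0, 0],
     [1, 1, 1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0]]
n = 7
F = [sum(row) for row in M]
n2p, n3r = sum(F), sum(x * x for x in F)
m = max(i for i in range(1, n) if M[i - 1][i] == 1)
d = sum(M[i][j] for i in range(m) for j in range(m, n)) / (m * (n - m))

# right side of (11)
bound = (m - 1) * n2p + m * (n - m) * n * d * d \
    + m * (n - m) * max(m, n - m) * (d - d * d)
assert n3r <= bound + 1e-9
```

Here $n^3\hat r = 54$ while the bound equals $58.25$, so (11) holds with room to spare for this matrix.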
We now put

$$c = \frac{m}{n}\,.$$

By equation (8), since $0 \le d \le 1$, the range of $c$ is given by

(12) $$c^2 \le \hat p + \frac{c}{n}\,, \qquad (1-c)^2 \le 1 - \hat p - \frac{c}{n}\,.$$

We therefore may assume that $c$ is bounded away from 0 and 1 as $n \to \infty$. Indeed, inequality (1) is trivially true if $p = 0$ or 1; and if $0 < p < 1$, since $\hat p$ tends to $p$ in probability, inequalities (12) imply that $c$ is bounded away from 0 and 1 with a probability which approaches one as $n$ tends to infinity.
From (11) we obtain

$$\hat r \le c \hat p + c(1-c)\, d^2 + c(1-c) \max(c, 1-c)\, (d - d^2) + \epsilon'_n,$$

where $\epsilon'_n \to 0$ as $n \to \infty$.² If we express $d$ in terms of $\hat p$ and $c$ by means of (8), we have

$$\hat r - \hat p^2 \le R(\hat p, c) + \epsilon_n,$$

where

$$R(p, c) = -p^2 + cp + \frac{p - c^2}{2} \Bigl[ \frac{p - c^2}{2c(1-c)} + \max(c, 1-c) \Bigl( 1 - \frac{p - c^2}{2c(1-c)} \Bigr) \Bigr]$$

and $\epsilon_n \to 0$ as $n \to \infty$.

² Inequalities (10) and (11) are closely related to the sharp upper bound for the variance of the Wilcoxon-Mann-Whitney two-sample statistic in terms of its mean; see D. van Dantzig, "On the consistency and the power of Wilcoxon's two sample test," Nederl. Akad. Wetensch. Proc. Ser. A, Vol. 54, No. 1, pp. 1-8 (1951), footnote 4.
We now maximize $R(p, c)$ with respect to $c$. By (12) we may assume that

$$c \le p^{1/2}, \qquad 1 - c \le (1-p)^{1/2}.$$

We can write

$$R(p, c) = G(1-p,\, 1-c), \quad c \le \tfrac12; \qquad R(p, c) = G(p, c), \quad c \ge \tfrac12,$$

where

$$G(p, c) = -p^2 + pc + \tfrac14\, p^2 c^{-1} - \tfrac14\, c^3.$$
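The function $G$ and the factorization used in the next step are easy to confirm numerically (an added check, with $G$ as defined above):

```python
def G(p, c):
    # G(p, c) = -p^2 + p c + (1/4) p^2 / c - (1/4) c^3
    return -p * p + p * c + 0.25 * p * p / c - 0.25 * c ** 3

for p in [0.3, 0.5, 0.7, 0.9]:
    s = p ** 0.5
    # value at c = p^(1/2)
    assert abs(G(p, s) - (p ** 1.5 - p * p)) < 1e-9
    for c in [0.5, 0.6, 0.8, 0.95]:
        # the factorization of G(p, p^(1/2)) - G(p, c)
        lhs = G(p, s) - G(p, c)
        rhs = (s - c) ** 2 * (c * c + 2.0 * s * c - p) / (4.0 * c)
        assert abs(lhs - rhs) < 1e-9
        # hence G(p, c) <= G(p, p^(1/2)) whenever c >= 1/2
        assert G(p, c) <= G(p, s) + 1e-9
```

For $c \ge \tfrac12$ the factor $c^2 + 2p^{1/2}c - p$ is nonnegative for all $p$ in $[0, 1]$, which is what makes the maximization work.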
We calculate

$$G(p, p^{1/2}) - G(p, c) = \frac{1}{4c}\, (p^{1/2} - c)^2\, (c^2 + 2 p^{1/2} c - p).$$

If $c \ge \tfrac12$, the right side is nonnegative, and hence $G(p, c) \le G(p, p^{1/2}) = p^{3/2} - p^2$. It follows that

$$\hat r - \hat p^2 \le \max\bigl( \hat p^{3/2} - \hat p^2,\; (1-\hat p)^{3/2} - (1-\hat p)^2 \bigr) + \epsilon_n,$$

and the function on the right is just $R(\hat p)$ as defined in (1). This completes the proof for the case where $f(X_1,X_2)$ takes on the values 0 and 1 only.
Now let $f(X_1,X_2)$ be any function which satisfies the assumptions of the theorem. Then $f(X_1,X_2)$ can be approximated by a function $f'(X_1,X_2)$ which also satisfies the assumptions of the theorem and takes only finitely many values, in such a way that $p' = E\,f'(X_1,X_2)$ and $r' = E\,f'(X_1,X_2)\,f'(X_1,X_3)$ are as close as we please to $p$ and $r$, respectively. It will therefore be sufficient to assume that $f(X_1,X_2)$ takes on the $k$ values $f_1, f_2, \ldots, f_k$. Then it can be written in the form

$$f(X_1,X_2) = \sum_{i=1}^{k} f_i\, c_i(X_1,X_2),$$

where $c_i(X_1,X_2) = 1$ if $f(X_1,X_2) = f_i$ and $c_i(X_1,X_2) = 0$ otherwise.

Now let $Y_1, Y_2, Y_3$ be mutually independent random variables, independent of $X_1, X_2, X_3$, where each $Y_i$ is uniformly distributed on the interval $0 \le Y_i \le 1$. The random vectors $Z_j = (X_j, Y_j)$, $j = 1, 2, 3$, are mutually independent and identically distributed. Now define

$$f^*(Z_1, Z_2) = \sum_{i=1}^{k} d_i(Y_1)\, d_i(Y_2)\, c_i(X_1,X_2),$$

where

$$d_i(y) = \begin{cases} 1, & 0 \le y \le f_i^{1/2}, \\ 0 & \text{otherwise.} \end{cases}$$

Then $f^*(Z_1,Z_2) = f^*(Z_2,Z_1)$ takes on the values 0 and 1 only. The conditional expected value of $f^*(Z_1,Z_2)$ for $X_1, X_2$ fixed is

$$E\bigl[ f^*(Z_1,Z_2) \mid X_1, X_2 \bigr] = \sum_{i=1}^{k} f_i\, c_i(X_1,X_2) = f(X_1,X_2).$$
Hence

(13) $$E\, f^*(Z_1,Z_2) = E\, f(X_1,X_2) = p.$$

Also

$$f^*(Z_1,Z_2)\, f^*(Z_1,Z_3) = \sum_{i=1}^{k} \sum_{j=1}^{k} d_i(Y_1)\, d_i(Y_2)\, c_i(X_1,X_2)\; d_j(Y_1)\, d_j(Y_3)\, c_j(X_1,X_3),$$

so that

$$E\, f^*(Z_1,Z_2)\, f^*(Z_1,Z_3) = \sum_{i=1}^{k} \sum_{j=1}^{k} \min(f_i, f_j)^{1/2}\, f_i^{1/2} f_j^{1/2}\, E\, c_i(X_1,X_2)\, c_j(X_1,X_3)$$
$$\ge \sum_{i=1}^{k} \sum_{j=1}^{k} f_i f_j\, E\, c_i(X_1,X_2)\, c_j(X_1,X_3) = E\, f(X_1,X_2)\, f(X_1,X_3).$$

Thus

(14) $$E\, f^*(Z_1,Z_2)\, f^*(Z_1,Z_3) \ge r.$$

Since, by the first part of the proof, inequality (1) is true for $f^*$, it follows from (13) and (14) that it is also true for $f$. The proof is complete.
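The inequality behind (14), namely $\min(f_i, f_j)^{1/2} f_i^{1/2} f_j^{1/2} \ge f_i f_j$ for values in $[0, 1]$, can be spot-checked over a grid (an added illustration; the helper name is mine):

```python
import itertools

def star_moment(fi, fj):
    # E[d_i(Y1) d_j(Y1)] E[d_i(Y2)] E[d_j(Y3)] = min(fi, fj)^(1/2) fi^(1/2) fj^(1/2)
    return min(fi, fj) ** 0.5 * fi ** 0.5 * fj ** 0.5

vals = [0.0, 0.1, 0.25, 0.5, 0.9, 1.0]
for fi, fj in itertools.product(vals, repeat=2):
    assert star_moment(fi, fj) >= fi * fj - 1e-12
```

The inequality holds because $\min(f_i, f_j) \ge f_i f_j$ when both values lie in $[0, 1]$, with equality when one of them equals 1.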
References

[1] H. E. Daniels and M. G. Kendall, "The significance of rank correlations where parental correlation exists," Biometrika, Vol. 34 (1947), pp. 197-208.

[2] W. Hoeffding, "A class of statistics with asymptotically normal distribution," Ann. Math. Stat., Vol. 19 (1948), pp. 293-325.

[3] M. G. Kendall, Rank Correlation Methods, 2nd ed., London (Griffin) and New York (Hafner), 1955.

[4] R. M. Sundrum, "Moments of the rank correlation coefficient in the general case," Biometrika, Vol. 40 (1953), pp. 409-420.