Download Hoeffding, Wassily; (1953)The extreme of the expected value of a function of independent random variables." (Air Research and Dev. Command)

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Birkhoff's representation theorem wikipedia , lookup

Congruence lattice problem wikipedia , lookup

Fundamental theorem of algebra wikipedia , lookup

Transcript
THE EXTREMA OF THE EXPECTED VALUE OF
1>.
FUNCTION OF INDEPENDENT
RANDOM VARIABLES1
by
~
.•
Wassi1y Hoeffding
Instit .1te of Statistics, Universit-,. c,!' North Carolina
1
Institute of Statistics
Mimeograph Series No. 83
Limited Distribution
October, 1953
1. This research was supported in p9rt by THE UNITED STATES
A.IR FORCE under Contract AF 18( 600 )-458 ,i\onitored bJ the Office
of Scientific Research.
THE EXTRETlA OF THE EXPECTED VALUE OF A FUNCTION OF INDEPENDENT
RANDOM VARIABLES -:~
by IIJassily Hoeffding
University of North Carolina
The problem is considered of determining the least upper
Summary.
(or greatest lowGr) bound for the expected value EK(X , ••• , X ) of a given
l
n
function K of n random variables Xl' ••• , X under the assumption that
n
Xl' ••• , X are independent and each X. has given range and satisfies k
n
conditions of the form
J
Eg~j)(X.)
~
J
== c .. , i == 1, ••• , k.
~J
It is Sfuown that
under general conditions we need consider only discrete random variables
X. which take on at most k .... 1 values.
J
1.
Introduction.
Let.£ be the class of n-dimensional cdfs (cumu-
lative distribution functions) F(x) == F(x , ••• , x ) which satisfy the
l
n
conditions
(1.1)
(1. 2)
(1.3)
j g~j\x) dF.(x)
J
~
==
c .. , i == 1, ••• , k; j == 1, ••• , n,
~J
F . (x) = 0 if x < A., F. (x) == 1 if x > B., j == 1, ... , n,
J
where the functions
J
g~j)(x)
J
and the constants c
J
ij
' A , B are given.
j
j
We
-)~This research was supported in whole or in part by the United Stat es
Air Force under Contract No. AF 18(600)-- 458 monitored by Headquarters,
Air Research and Development Command, p. O. ~ox 1395, Baltimore 3, Marylmd.
- 2 -
allow that A.
J
=-
00
and/or B.
J
= 00.
Here and in what follows, when the
domain of integration is not indicated, the integral extends over the
entire rmge of the variables involved.
It will be understood that all
odfs are continuous on the right.
Let
K(~)
be a function such that
exists for all F in C.
=
The problem is to determine the least upper and
the greatest lower bound of ¢(F) when F is in :;;
C.
It will be sufficient
to cmnsider only the least upper bound.
Special cases of statistical interest include
K(~)
= 0 or
1 accord-
ing as a function f(~) does or does not exceed a given constant; K(~)
=
max (xl' ••• , x ), etc.,• g.(j)()
x _- x i(.
g~ven moments up to order k ) ;
n
g~j)(x)
= 1 or
~
For n
~
~
-
0 according as x < b. or x > b. (given quantiles), etc.
~
1, gil)(x)
~
= xi, K(x) = 1 if x ~ t, K(x) = 0 otherwise,
the problem was stated and its solution found by Chebyshev.
This result
was extended to more general function K(x) by Markov, Poss~ and others.
For references and proofs see Shohat and Tamarkin L-6_7. An extension to
mi
the case g~l)(x) = x
, Al = 0 was considered by Wald /-7 7. A recent
--
~
contribution is due to Royden
For n arbitrary, k
L-5 7.
= 1, glj) = x,
A.
J
= 0,
B. =
J
00, K(~) =
1 or 0
- 3 -
qccording as Z~xi .::: t or < t
- -7
Birnbaum, Raymond and Zuckerman /- 2
I
showed that when looking for the least upper bound of ¢(F) we need consider only cdfs F. which are step functions with at most two steps.
J
gave an explicit solution for n
= 2.
¢(F) the distribution function of the
L- 1_7 and
n --->
00
~sseen ~
3
_7
For the case k
S1xm
= 3,
gij)(x)
=
They
xi,
z~xi' the inequality of Berry
gives bounds which are asymptotically best as
but can presumably be improved for finite n.
In the present paper it is shown that if in the general problem
~~
as stated above we restrict ourselves to the subclass C of C where the
= =
F. are step functions with a finite number of steps, then we need conJ
sider only step functions with at most k + I steps.
(Theorem 2.1).
The same result holds in the unrestricted problem i f (A)
each Fine can be approximated (in a certain sense) by a step function
=
in c*, and (B) ¢(F) is, in a sense, a continuous function of F
=
(Theorem 2.2).
Sufficient conditions for the fulfillment of assumptions
(A) and (B) are given in sections 3 and
2.
The main theorems.
Let
g be
(1.1) to (1.3), and suppose that ¢(F) =
4.
the class of cdfs F(~) defined by
/'K(~) dF(~)
exists for all F
that Fl ' ... , Fn are step funct ions with a finite number of steps.
C be the subclass of C~~ in which each FJo (j = 1, ••• , n) is a step
:=Ill
=
function with at most m steps.
Let
- 4-
Theorem 2.1.
¢(F)
sup
::
sup
¢(F)
•
~~
Fee"
=
F e £k+l
Let F(x) "" Fl(x l ) .'. F (x ) be an arbitrary cdf in Ci~
n n
=
Proof.
such that for some j, F. has more than k+l steps.
J
It is sufficient to show
that there exists a cdf G in £k+1 such that ¢(G) ~ ¢(F).
This, in turn,
will easily follow when we show that if Fn (x) has m > k + 1 steps, there
exists a cdf H (x) such that
n
a)
H (x) has less than m steps;
b)
H(_x) :: Fl(x l ) ••• F lex 1) H (x ) is in C ;
nnn n
n
=
c)
By assumption Fn(x) is of the form
F (x) :: 0
n
vlhel'e
:: p 1 + ••• + pr
if ar -< x < ar+ l' r :: I ' ••• , m - 1
"" 1
i f a m -< x ,
- 5-
p
> 0,
r
= 1,
r
, •• , m ,
••• + g.J.mpm "" c.III ,
gOr
i
= 1, r = 1, .0., m;
= g~n)(a
J.
)
r'
i
= 1,
c
••
0'
On
kj
= 0,
=1
r
1, ... , k
'
= 1,
••• , m •
Let
H (x-)
n
=0
if
••• +(pr +tdr )
=1
if
a r -< x < a r+ l' r=1, ••• ,m-1
if
a < x
m-
In order to satisfy condition b) it is sufficient to choose t, d , ••• ,
1
d
r
(2.1)
in such a way that
p
r
... td
> 0,
r-
r
= 1,
••• , m,
- 6-
(2.2)
••• + g.1..'11dm ==
a,
i ==
a,
1, ••• , k
'tie can write
¢(H) - ¢(F)
m
=t
Z K d
r=l r r
where
K ;:
r
Let A
= a or
n-l
) ' K(X 1 , ••• , x -1' a)
n
r
d {
"IT
)
F. (x. )
J
j=l J
1 according as the rank of the matrix
gal'
... ,
gam
••••••
gk1'
... , gkm
Kl '
... ,
is less than or equal to k
+
2.
m
Z K d
r= 1 r r
K
m
Then the equations (2.2) and
::0
A
•
•
- 7have a solution (d , ... , d ) 'I (0, ... , 0). Having thus fixed d , ... ,
l
m
l
d , choose t as the largest number which satisfies the inequalities
m
(2.1). This number exists and is positive. Conditions a), b) and c)
are now satisfied, and the proof is complete.
Theorem 2.2.
Suppose that the follolTing two conditions are
sati~-
fied.
A)
For_every F(!)::> Fl(xl ) ••• Fn(xn ) in
there exists a cdf F-:~(X)
-
::>
every 6 > 0
Fl~~(xl) ... F-\x ) in C~~ such that
n n
=
sup
x
exists
£. ~d
j
6 > 0 such that for any G(~)
= 1, ••• , n.
= G1 (x l ) ••• Gn(xn )
~
£ ~ich
fies the inequalities
sup
x
- G.(x) I < 6 ,
I F.(x)
J
J
we have
,I ¢(F)
- ¢(G)
I
<
e
j
= 1,
••. , n ,
satis-
- 8-
e
Then
sup
=
¢(F)
F e gk+l
F e C
:::
Proof•
¢(F).
sup
Conditions
A)
sup ¢(F)
F e C
and
B) imply that
=
sup
F e
=
The theorem now follows from Theorem 2.1
¢(F)
"
C~'"
=
•
It is easy to extend Theorems 2.1 and 2 0 2 to tho case where the
number k of restrictions
r (")
'; gi J (x) dF /x) = c ij ' i
= 1,
••• , k, depends
on j.
Conditions
A) and
B) make use of the distance between two cdfs
F and G defined by
max
sup
j
x
- G.(x)
I F.(x)
J
J
j
Obviously an analogous theorem can be stated with an arbitrary distance
.d(F, G).
- 9 -
4 it
In sections 3 and
will be shown that assumptions A)
and B)
of Theorem 2.2 are satisfied for certain classes C and functions K which
=
are of interest in statistics.
3. Approximation of a cdf by a step function.
It will now be shown
that condition A) of Theorem 2.2 is satisfied if C is the class of distri~
butions of the product type (1.1) with prescribed moments and ranges.
Theorem 3.1.
class defined
b~
~!!:2-~
A) of Th~~ 2.2 is2..~~fied if Q is the
1nij
(1.1) to (1.3) ~~th g~j)(x) = x , where the m' are
-
-
j 1
1
arbitrary positiv8 integers.
The thEO rem is an immediate consequence of the following lemma.
Let F(x) be a cdf on the real line such that
Lemma 3.1.
::2
c.,
1.
i = 1,
F(x) = 0 if x < A,
wh3r8 we may have A
';';';;;;~:"-";;"'~,:J..'..;.,.~
~ F~\x) whj_ch is
= -00
0
or B
_
= 00
F(x)
•
=1
G & .,
S
if x > B
Then for every 6 > 0
,
~re eXis~~
step flillction with a finite number of steps, satis-
fies conditions (3.1) .~ (3.2), and fo~_which
s~p II F* (x) -
F(x)
<
5
o
- 10 To prove Lemma 3.1 we shall need
Lemma 3.2.
If F(x) is any cdf which satisfies conditions (3.1) and
(3.?), there exists a cdf which is a step function with a finite number of
steps and satisfies the same conditions.
The statement of Lemma 3.2 is well knmm.
from Shohat and Tamarkin
als 0 Royden
-r
:5, Lemma
Proof of
Le~~
L- 6,
2 7
-
3.1.
For example, it follows
Theorems 1.2 and 1.3 and Lemma 3.1
7.
Cf.
D
Given 5 > 0, we can choose a finite set of
points
such that
Pr
If p
= F(ar+1 - 0) r
F (x)
r
f
F(ar ) < 5,
r
0, 1, ••• , m
=
0
0, let
=0
= 1
if
x<a
if
a r -<x<ar+ 1
if
a r+ 1 -< x.
~~
r
By LeMaa 3.2 there exists a cdf F;(x) which is a step function with
- 11 -
a finite number of steps and such that
(
=
-"
~f-
F (x)
r
= 0 if
/
I
.
xl. elF (x),
= 1, ••• ,
s ,
if x > ar+ 1
•
i
r
x < a ,
1
r
Let
if
.><.
F"(x)
= 0 if
where p F'*(x)
r r
x < A
= 0 if pr
~}
,
= O.
ar -< x < ar+ l'
F (x)
= 1 if x
It can be verified that
r = Q, 1, ••. , m ,
>
B •
"
F'~~(x)
has the
properties stated in Lemma 3.1.
4.
Continuity of ¢(F).
In this section we consider sufficient
conditions for the continuity of ¢(F) in the sense of assumption
Theorem 2.20
B) of
The next theorem shows that assumption B) is satisfied if
¢(F) is the probability that the random vector! with cdf F is contained
in a set S of a fairly eeneral type.
- 12 Theorem
4.1.
Condition
B) of Theorem 2.2 is satisfied if
= 1 .£E
0 ancording as x does or does not be}ong to a Borel-measurable
set S such that every set S. = "ll x. : xeS ~ , j = 1, ••• , n, is the
K(x)
~
J
. J
-
J
union of a finite and bounded number of intervals.
(Here
{x: A } denotes the set of all points x which satisfy
condition A.)
~roof.
Let 0 be an arbitrary positive number.
Fl(x l ) ••• F (x ) and
n n
G(x)
-
Let
F(~)
=
= G1(x1 ) ••• Gn (xn ) be any two cdfs in =C such
that
sup ( Fh(x) - Gh(x)
x
I<
6,
h = 1, •.• , n
•
Let
so that
¢(F) - ~(G)
=
We may assume that every set S. is the union of at most N non-over-
J-
lapping intervals f where N is a fixed number.
For a fixed integer j and
- 3;3 -
fixed values ~ (h
M < N.
j
'I
j) denote these intcrv81s by II' 1 2 , ••• , II'T'
WJ;.,
~lje have
(L(Xl'~"'X,.J- l,x'+l""'x
••. G (x ) I",
J
n )d/l Fl(Y.l) ••• F~J- lex,J-1)G'+1(x.+1)
J . J
n n
,~
r
J
..,
dG .(x.)
J
J
'-.,
f •
SinC8
<'
dG. (x . )
J J
)
I<
26
,
1
m
Tt!G
get
, ¢(F) - ¢(G)
so that condition
Condition
I
<
2nA6 < 2nN6 ,
B) is satisfied.
B) of Theorem 2.2 is evidently satisfied tmder more
general assumptions than those of Theorem
b.• l. Thus if K(!) ::
- 14 -
r
r = 1, ••• , R, then ¢(F) :;:) K(~)elF(~) is also continuous.
The same is
true if K(_x) can be approximated by a function of the form L: d K (x),
rruniformly in the range of the distributions F in
£.
Using Theorems 3.1 and 4.1 we can stato the following corollary of
Theorem 2.2.
Theorem
4.2.
such that F.(A.- 0)
J J
Let C be the class of cdfs F(~)
-=
= 0,
F.(B.+ 0) :;: 1,
J J
r
x
mij
:;:
Fl(xl) .. ·Fn(xn )
elF/x) = c ij ' i=l, ••• ,k;
j=l, ••• ,n, where the numbers A ., B ., c .. and tho integers
m.. are given.
__
J
J
~J
~J
Let S be a "Rorol-nee.surable set such that every set {x j :
union of a finit3 and bounded number of intervals.
sup
FeC
==
Then
J
S
S
dF.
~eS}
_~..l.-
is the
-155.
Concluding remark.
The problem considered in this paper can
..
be modified by admitting only those cdfs in C for which the rnarginal
distributions Fl, ••• ,Fnare identical.
It is of interest to note that
with this restriction the conditions A), B) of Theorem 2.2 are no
longer sufficient in order to reduce the class of competing cdfs to
step functions with a bounded number of steps.
For example, consider
the problem of the least upper bound for the expected value of the
largest of n independent, identically distributed rffildom variables with
given mean and variance,
Hartley and David ~4_7 shm!ed that under the
additional assumption that the cd! is continuous the least upper bound
is attained with a continuous cdf when n > 2.
At least for n
= 2 it
can be shown that the Hartley-David bound can not be arbitrarily
closelJr apnroached with a discrete cdf having a bounded number of steps.
On the other hand if the assumption that the random variables are
identically distributed is dropped, the conditions of Theorem 2.2 arc
satisfied, dnd hence the least upper bound is attained or approached
with step functions having at most three steps.
-16References
"The accuracy of the Gaussian approximation to the stun of independent variates",
Transactions-of the American Mathematical
~iety, Vol. 49(1941), pp. 122-136.
/-2 7 z. 1.v.
-
-
Birn1)atun, J. Raymond
and H. S. Zuckerman,
"A generalization of Tshebyshev1s
inequality to two dimensions", Annals of
Mathematical Statistics, Vol. IS-CIJ47J,
PP.
70-79.
-
''Fouri.er analysis of distribution functions.
A mathematical st,udy of the Laplace.;.Gaussi.an
law", Acta rhtho ,~i)lo "n s PPII 1-125
(1944)-.---_._.
/-4
-
-
7 H.
O. Hartley and H. A.
David,
"Univorsal bounds for mean range and
extreme observation", •••••
I~ounds on a distribution function when
its first n moments are given", Annals
of Mathematical Statistics, Vol.~-rr953),
361-376.
;-6 7 J.
-
-
A. Shohat and J. D.
Tamarkin,
IlTr~roblem ~f)21ent!'..}. American
Mathematical Society, 1943.
"Limits of a distribution function determined by absolute moments and inequalities
satisfied by absolute moments", Transactions of the American MathematICal
SocIety., Vol. 46 (1939), pp. 280-306: