Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
THE EXTREMA OF THE EXPECTED VALUE OF 1>. FUNCTION OF INDEPENDENT RANDOM VARIABLES1 by ~ .• Wassi1y Hoeffding Instit .1te of Statistics, Universit-,. c,!' North Carolina 1 Institute of Statistics Mimeograph Series No. 83 Limited Distribution October, 1953 1. This research was supported in p9rt by THE UNITED STATES A.IR FORCE under Contract AF 18( 600 )-458 ,i\onitored bJ the Office of Scientific Research. THE EXTRETlA OF THE EXPECTED VALUE OF A FUNCTION OF INDEPENDENT RANDOM VARIABLES -:~ by IIJassily Hoeffding University of North Carolina The problem is considered of determining the least upper Summary. (or greatest lowGr) bound for the expected value EK(X , ••• , X ) of a given l n function K of n random variables Xl' ••• , X under the assumption that n Xl' ••• , X are independent and each X. has given range and satisfies k n conditions of the form J Eg~j)(X.) ~ J == c .. , i == 1, ••• , k. ~J It is Sfuown that under general conditions we need consider only discrete random variables X. which take on at most k .... 1 values. J 1. Introduction. Let.£ be the class of n-dimensional cdfs (cumu- lative distribution functions) F(x) == F(x , ••• , x ) which satisfy the l n conditions (1.1) (1. 2) (1.3) j g~j\x) dF.(x) J ~ == c .. , i == 1, ••• , k; j == 1, ••• , n, ~J F . (x) = 0 if x < A., F. (x) == 1 if x > B., j == 1, ... , n, J where the functions J g~j)(x) J and the constants c J ij ' A , B are given. j j We -)~This research was supported in whole or in part by the United Stat es Air Force under Contract No. AF 18(600)-- 458 monitored by Headquarters, Air Research and Development Command, p. O. ~ox 1395, Baltimore 3, Marylmd. - 2 - allow that A. J =- 00 and/or B. J = 00. Here and in what follows, when the domain of integration is not indicated, the integral extends over the entire rmge of the variables involved. It will be understood that all odfs are continuous on the right. Let K(~) be a function such that exists for all F in C. = The problem is to determine the least upper and the greatest lower bound of ¢(F) when F is in :;; C. It will be sufficient to cmnsider only the least upper bound. Special cases of statistical interest include K(~) = 0 or 1 accord- ing as a function f(~) does or does not exceed a given constant; K(~) = max (xl' ••• , x ), etc.,• g.(j)() x _- x i(. g~ven moments up to order k ) ; n g~j)(x) = 1 or ~ For n ~ ~ - 0 according as x < b. or x > b. (given quantiles), etc. ~ 1, gil)(x) ~ = xi, K(x) = 1 if x ~ t, K(x) = 0 otherwise, the problem was stated and its solution found by Chebyshev. This result was extended to more general function K(x) by Markov, Poss~ and others. For references and proofs see Shohat and Tamarkin L-6_7. An extension to mi the case g~l)(x) = x , Al = 0 was considered by Wald /-7 7. A recent -- ~ contribution is due to Royden For n arbitrary, k L-5 7. = 1, glj) = x, A. J = 0, B. = J 00, K(~) = 1 or 0 - 3 - qccording as Z~xi .::: t or < t - -7 Birnbaum, Raymond and Zuckerman /- 2 I showed that when looking for the least upper bound of ¢(F) we need consider only cdfs F. which are step functions with at most two steps. J gave an explicit solution for n = 2. ¢(F) the distribution function of the L- 1_7 and n ---> 00 ~sseen ~ 3 _7 For the case k S1xm = 3, gij)(x) = They xi, z~xi' the inequality of Berry gives bounds which are asymptotically best as but can presumably be improved for finite n. In the present paper it is shown that if in the general problem ~~ as stated above we restrict ourselves to the subclass C of C where the = = F. are step functions with a finite number of steps, then we need conJ sider only step functions with at most k + I steps. (Theorem 2.1). The same result holds in the unrestricted problem i f (A) each Fine can be approximated (in a certain sense) by a step function = in c*, and (B) ¢(F) is, in a sense, a continuous function of F = (Theorem 2.2). Sufficient conditions for the fulfillment of assumptions (A) and (B) are given in sections 3 and 2. The main theorems. Let g be (1.1) to (1.3), and suppose that ¢(F) = 4. the class of cdfs F(~) defined by /'K(~) dF(~) exists for all F that Fl ' ... , Fn are step funct ions with a finite number of steps. C be the subclass of C~~ in which each FJo (j = 1, ••• , n) is a step :=Ill = function with at most m steps. Let - 4- Theorem 2.1. ¢(F) sup :: sup ¢(F) • ~~ Fee" = F e £k+l Let F(x) "" Fl(x l ) .'. F (x ) be an arbitrary cdf in Ci~ n n = Proof. such that for some j, F. has more than k+l steps. J It is sufficient to show that there exists a cdf G in £k+1 such that ¢(G) ~ ¢(F). This, in turn, will easily follow when we show that if Fn (x) has m > k + 1 steps, there exists a cdf H (x) such that n a) H (x) has less than m steps; b) H(_x) :: Fl(x l ) ••• F lex 1) H (x ) is in C ; nnn n n = c) By assumption Fn(x) is of the form F (x) :: 0 n vlhel'e :: p 1 + ••• + pr if ar -< x < ar+ l' r :: I ' ••• , m - 1 "" 1 i f a m -< x , - 5- p > 0, r = 1, r , •• , m , ••• + g.J.mpm "" c.III , gOr i = 1, r = 1, .0., m; = g~n)(a J. ) r' i = 1, c •• 0' On kj = 0, =1 r 1, ... , k ' = 1, ••• , m • Let H (x-) n =0 if ••• +(pr +tdr ) =1 if a r -< x < a r+ l' r=1, ••• ,m-1 if a < x m- In order to satisfy condition b) it is sufficient to choose t, d , ••• , 1 d r (2.1) in such a way that p r ... td > 0, r- r = 1, ••• , m, - 6- (2.2) ••• + g.1..'11dm == a, i == a, 1, ••• , k 'tie can write ¢(H) - ¢(F) m =t Z K d r=l r r where K ;: r Let A = a or n-l ) ' K(X 1 , ••• , x -1' a) n r d { "IT ) F. (x. ) J j=l J 1 according as the rank of the matrix gal' ... , gam •••••• gk1' ... , gkm Kl ' ... , is less than or equal to k + 2. m Z K d r= 1 r r K m Then the equations (2.2) and ::0 A • • - 7have a solution (d , ... , d ) 'I (0, ... , 0). Having thus fixed d , ... , l m l d , choose t as the largest number which satisfies the inequalities m (2.1). This number exists and is positive. Conditions a), b) and c) are now satisfied, and the proof is complete. Theorem 2.2. Suppose that the follolTing two conditions are sati~- fied. A) For_every F(!)::> Fl(xl ) ••• Fn(xn ) in there exists a cdf F-:~(X) - ::> every 6 > 0 Fl~~(xl) ... F-\x ) in C~~ such that n n = sup x exists £. ~d j 6 > 0 such that for any G(~) = 1, ••• , n. = G1 (x l ) ••• Gn(xn ) ~ £ ~ich fies the inequalities sup x - G.(x) I < 6 , I F.(x) J J we have ,I ¢(F) - ¢(G) I < e j = 1, ••. , n , satis- - 8- e Then sup = ¢(F) F e gk+l F e C ::: Proof• ¢(F). sup Conditions A) sup ¢(F) F e C and B) imply that = sup F e = The theorem now follows from Theorem 2.1 ¢(F) " C~'" = • It is easy to extend Theorems 2.1 and 2 0 2 to tho case where the number k of restrictions r (") '; gi J (x) dF /x) = c ij ' i = 1, ••• , k, depends on j. Conditions A) and B) make use of the distance between two cdfs F and G defined by max sup j x - G.(x) I F.(x) J J j Obviously an analogous theorem can be stated with an arbitrary distance .d(F, G). - 9 - 4 it In sections 3 and will be shown that assumptions A) and B) of Theorem 2.2 are satisfied for certain classes C and functions K which = are of interest in statistics. 3. Approximation of a cdf by a step function. It will now be shown that condition A) of Theorem 2.2 is satisfied if C is the class of distri~ butions of the product type (1.1) with prescribed moments and ranges. Theorem 3.1. class defined b~ ~!!:2-~ A) of Th~~ 2.2 is2..~~fied if Q is the 1nij (1.1) to (1.3) ~~th g~j)(x) = x , where the m' are - - j 1 1 arbitrary positiv8 integers. The thEO rem is an immediate consequence of the following lemma. Let F(x) be a cdf on the real line such that Lemma 3.1. ::2 c., 1. i = 1, F(x) = 0 if x < A, wh3r8 we may have A ';';';;;;~:"-";;"'~,:J..'..;.,.~ ~ F~\x) whj_ch is = -00 0 or B _ = 00 F(x) • =1 G & ., S if x > B Then for every 6 > 0 , ~re eXis~~ step flillction with a finite number of steps, satis- fies conditions (3.1) .~ (3.2), and fo~_which s~p II F* (x) - F(x) < 5 o - 10 To prove Lemma 3.1 we shall need Lemma 3.2. If F(x) is any cdf which satisfies conditions (3.1) and (3.?), there exists a cdf which is a step function with a finite number of steps and satisfies the same conditions. The statement of Lemma 3.2 is well knmm. from Shohat and Tamarkin als 0 Royden -r :5, Lemma Proof of Le~~ L- 6, 2 7 - 3.1. For example, it follows Theorems 1.2 and 1.3 and Lemma 3.1 7. Cf. D Given 5 > 0, we can choose a finite set of points such that Pr If p = F(ar+1 - 0) r F (x) r f F(ar ) < 5, r 0, 1, ••• , m = 0 0, let =0 = 1 if x<a if a r -<x<ar+ 1 if a r+ 1 -< x. ~~ r By LeMaa 3.2 there exists a cdf F;(x) which is a step function with - 11 - a finite number of steps and such that ( = -" ~f- F (x) r = 0 if / I . xl. elF (x), = 1, ••• , s , if x > ar+ 1 • i r x < a , 1 r Let if .><. F"(x) = 0 if where p F'*(x) r r x < A = 0 if pr ~} , = O. ar -< x < ar+ l' F (x) = 1 if x It can be verified that r = Q, 1, ••. , m , > B • " F'~~(x) has the properties stated in Lemma 3.1. 4. Continuity of ¢(F). In this section we consider sufficient conditions for the continuity of ¢(F) in the sense of assumption Theorem 2.20 B) of The next theorem shows that assumption B) is satisfied if ¢(F) is the probability that the random vector! with cdf F is contained in a set S of a fairly eeneral type. - 12 Theorem 4.1. Condition B) of Theorem 2.2 is satisfied if = 1 .£E 0 ancording as x does or does not be}ong to a Borel-measurable set S such that every set S. = "ll x. : xeS ~ , j = 1, ••• , n, is the K(x) ~ J . J - J union of a finite and bounded number of intervals. (Here {x: A } denotes the set of all points x which satisfy condition A.) ~roof. Let 0 be an arbitrary positive number. Fl(x l ) ••• F (x ) and n n G(x) - Let F(~) = = G1(x1 ) ••• Gn (xn ) be any two cdfs in =C such that sup ( Fh(x) - Gh(x) x I< 6, h = 1, •.• , n • Let so that ¢(F) - ~(G) = We may assume that every set S. is the union of at most N non-over- J- lapping intervals f where N is a fixed number. For a fixed integer j and - 3;3 - fixed values ~ (h M < N. j 'I j) denote these intcrv81s by II' 1 2 , ••• , II'T' WJ;., ~lje have (L(Xl'~"'X,.J- l,x'+l""'x ••. G (x ) I", J n )d/l Fl(Y.l) ••• F~J- lex,J-1)G'+1(x.+1) J . J n n ,~ r J .., dG .(x.) J J '-., f • SinC8 <' dG. (x . ) J J ) I< 26 , 1 m Tt!G get , ¢(F) - ¢(G) so that condition Condition I < 2nA6 < 2nN6 , B) is satisfied. B) of Theorem 2.2 is evidently satisfied tmder more general assumptions than those of Theorem b.• l. Thus if K(!) :: - 14 - r r = 1, ••• , R, then ¢(F) :;:) K(~)elF(~) is also continuous. The same is true if K(_x) can be approximated by a function of the form L: d K (x), rruniformly in the range of the distributions F in £. Using Theorems 3.1 and 4.1 we can stato the following corollary of Theorem 2.2. Theorem 4.2. such that F.(A.- 0) J J Let C be the class of cdfs F(~) -= = 0, F.(B.+ 0) :;: 1, J J r x mij :;: Fl(xl) .. ·Fn(xn ) elF/x) = c ij ' i=l, ••• ,k; j=l, ••• ,n, where the numbers A ., B ., c .. and tho integers m.. are given. __ J J ~J ~J Let S be a "Rorol-nee.surable set such that every set {x j : union of a finit3 and bounded number of intervals. sup FeC == Then J S S dF. ~eS} _~..l.- is the -155. Concluding remark. The problem considered in this paper can .. be modified by admitting only those cdfs in C for which the rnarginal distributions Fl, ••• ,Fnare identical. It is of interest to note that with this restriction the conditions A), B) of Theorem 2.2 are no longer sufficient in order to reduce the class of competing cdfs to step functions with a bounded number of steps. For example, consider the problem of the least upper bound for the expected value of the largest of n independent, identically distributed rffildom variables with given mean and variance, Hartley and David ~4_7 shm!ed that under the additional assumption that the cd! is continuous the least upper bound is attained with a continuous cdf when n > 2. At least for n = 2 it can be shown that the Hartley-David bound can not be arbitrarily closelJr apnroached with a discrete cdf having a bounded number of steps. On the other hand if the assumption that the random variables are identically distributed is dropped, the conditions of Theorem 2.2 arc satisfied, dnd hence the least upper bound is attained or approached with step functions having at most three steps. -16References "The accuracy of the Gaussian approximation to the stun of independent variates", Transactions-of the American Mathematical ~iety, Vol. 49(1941), pp. 122-136. /-2 7 z. 1.v. - - Birn1)atun, J. Raymond and H. S. Zuckerman, "A generalization of Tshebyshev1s inequality to two dimensions", Annals of Mathematical Statistics, Vol. IS-CIJ47J, PP. 70-79. - ''Fouri.er analysis of distribution functions. A mathematical st,udy of the Laplace.;.Gaussi.an law", Acta rhtho ,~i)lo "n s PPII 1-125 (1944)-.---_._. /-4 - - 7 H. O. Hartley and H. A. David, "Univorsal bounds for mean range and extreme observation", ••••• I~ounds on a distribution function when its first n moments are given", Annals of Mathematical Statistics, Vol.~-rr953), 361-376. ;-6 7 J. - - A. Shohat and J. D. Tamarkin, IlTr~roblem ~f)21ent!'..}. American Mathematical Society, 1943. "Limits of a distribution function determined by absolute moments and inequalities satisfied by absolute moments", Transactions of the American MathematICal SocIety., Vol. 46 (1939), pp. 280-306: