CONVERGENCE IN DISTRIBUTION
gs2011

Convergence in distribution is a property of cumulative distribution functions (CDFs). It's frequently described, however, in terms of the random variables that have those distributions.

Let's say that we have a sequence of random variables $X_1, X_2, X_3, \ldots$ with corresponding cumulative distribution functions $F_1, F_2, F_3, \ldots$ Let's say also that we have a target limit random variable $X$ with cumulative distribution function $F$. The concept should be written as $\{F_n\} \xrightarrow{d} F$, but we often write $\{X_n\} \xrightarrow{d} X$.

We say that $\{F_n\} \xrightarrow{d} F$ if and only if, for every $x$ which is a point of continuity of $F$, the sequence of numbers $\{F_n(x)\}$ converges to the number $F(x)$. An example that follows will indicate why we need the proviso that $x$ be a continuity point of $F$.

Suppose that

$$F_n(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 - e^{-\frac{n}{n+1}x} & \text{if } x > 0 \end{cases}$$

Here of course $F_n$ is the cumulative distribution function of an exponential random variable with mean $\frac{n+1}{n}$. Intuitively this should converge to the cumulative distribution function of the exponential random variable with mean 1. Indeed, using

$$F(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 - e^{-x} & \text{if } x > 0 \end{cases}$$

it can be shown easily that for any $x$, $\{F_n(x)\} \to F(x)$: for $x > 0$ the exponent $-\frac{n}{n+1}x$ converges to $-x$, and for $x \le 0$ both sides are 0. Once $x$ is specified, this is all about convergence of a sequence of numbers.

The convergence in distribution concept is not as useful when $F$ is the cumulative distribution function of a degenerate random variable, meaning a random variable that takes only one value. An example of such an $F$ is $F(x) = I(x \ge M)$, which corresponds to a random variable for which $P[X = M] = 1$. When the limiting cumulative distribution function is degenerate at value $M$, the statement $\{F_n\} \xrightarrow{d} F$ is equivalent to $\{X_n\} \xrightarrow{P} M$.

Here's why. First assume that $\{F_n\} \xrightarrow{d} F$, where $F$ is degenerate at $M$. Pick any $x < M$; this is a continuity point of $F$. Then $\{F_n(x)\} = \{P[X_n \le x]\} \to F(x) = 0$. One choice is $x = M - \varepsilon$ with $\varepsilon > 0$, so we conclude $\{P[X_n \le M - \varepsilon]\} \to 0$. Pick any $x > M$, again a continuity point of $F$. Then $\{F_n(x)\} = \{P[X_n \le x]\} \to F(x) = 1$.
One choice is $x = M + \varepsilon$, so we conclude $\{P[X_n \le M + \varepsilon]\} \to 1$. These two facts together give us $\{P[\,|X_n - M| > \varepsilon\,]\} \to 0$. This is the statement $\{X_n\} \xrightarrow{P} M$.

Now assume that $\{X_n\} \xrightarrow{P} M$. For any $x < M$ we have $\{P[X_n \le x]\} \to 0$, and for any $x > M$ (say $x = M + \varepsilon$) we have $\{P[X_n \le M + \varepsilon]\} \to 1$. If these statements are converted to their equivalents in terms of cumulative distribution functions, then we conclude $\{F_n\} \xrightarrow{d} F$.

Let's consider the reason that we need to exclude discontinuity points of the limit $F$. Suppose that $X_n$ is uniformly distributed on the interval $(-\frac{1}{n}, \frac{1}{n})$. This certainly converges in distribution (and in probability) to the random variable $X$ with $P[X = 0] = 1$. The limiting cumulative distribution function $F(x) = I(x \ge 0)$ has a discontinuity at $x = 0$. We note that $F_n(0) = \frac{1}{2}$ for every $n$ but $F(0) = 1$. Thus $\{F_n(0)\}$ does not converge to $F(0)$. However, $x = 0$ is the only discontinuity point, and convergence happens at every other $x$-value.

There's a very interesting property of cumulative distribution functions that is somewhat hard to believe. If $\{F_n\}$ is any sequence of cumulative distribution functions, then there is a subsequence that converges to a cumulative (sub)distribution function $F$. The (sub) part of this comment is that the limit will satisfy all the requirements for a cumulative distribution function except possibly that $F(\infty) < 1$ (or, similarly, that $F(-\infty) > 0$).

The proof of this outrageous assertion hangs on the fact that $0 \le F_n(x) \le 1$ for every $n$ and also on this mathematical statement: if $\alpha_1, \alpha_2, \alpha_3, \ldots$ is a sequence of numbers between 0 and 1 (or more generally between any two finite values), then there is a subsequence $\alpha_{1'}, \alpha_{2'}, \alpha_{3'}, \ldots$ that converges. The prime marks on the subscripts simply indicate that we've extracted some set of values from the original sequence. The proof of this statement is itself interesting, but we'll simply take it as true and use it in the development that follows.

Now… let's find the subsequence of $\{F_n\}$ that converges.
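As a brief numerical aside before the construction, the two worked examples above (the exponential sequence and the uniform sequence) can be checked directly. This is only a sketch: the function names `exp_cdf_n`, `exp_cdf_limit`, `uniform_cdf_n`, and `point_mass_cdf` are made up for this illustration, and the exponential CDFs are taken to be 0 at and below $x = 0$.

```python
import math


def exp_cdf_n(n, x):
    """F_n from the exponential example: exponential CDF with mean (n + 1) / n."""
    return 0.0 if x <= 0 else 1.0 - math.exp(-n * x / (n + 1))


def exp_cdf_limit(x):
    """Limit F from the exponential example: exponential CDF with mean 1."""
    return 0.0 if x <= 0 else 1.0 - math.exp(-x)


def uniform_cdf_n(n, x):
    """CDF of X_n uniformly distributed on (-1/n, 1/n)."""
    if n * x <= -1.0:
        return 0.0
    if n * x >= 1.0:
        return 1.0
    return (n * x + 1.0) / 2.0


def point_mass_cdf(x, m=0.0):
    """Degenerate limit F(x) = I(x >= m)."""
    return 1.0 if x >= m else 0.0


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # Gap at x = 2 for the exponential example, then the uniform CDF
        # at the discontinuity point 0 and at the continuity points +/- 0.05.
        gap = abs(exp_cdf_n(n, 2.0) - exp_cdf_limit(2.0))
        print(n, gap, uniform_cdf_n(n, 0.0),
              uniform_cdf_n(n, 0.05), uniform_cdf_n(n, -0.05))
```

In the uniform case the value at 0 stays at one half for every n while `point_mass_cdf(0.0)` is 1, yet at any fixed nonzero x the values eventually match the limit. That is exactly the behavior the continuity-point proviso is meant to allow.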
Let $x_1, x_2, x_3, \ldots$ be a listing of all the rational numbers. As the set of rationals is countably infinite, it is possible to set up this sequence.

Work with $x_1$ and consider the sequence of numbers $F_1(x_1), F_2(x_1), F_3(x_1), F_4(x_1), \ldots$ This is a set of values between 0 and 1, so there is a convergent subsequence. Let the index points of the convergent subsequence of $F_n(x_1)$ be 1:1, 1:2, 1:3, 1:4, and so on. That is, we're saying that $F_{1:1}(x_1), F_{1:2}(x_1), F_{1:3}(x_1), F_{1:4}(x_1), \ldots$ is a convergent sequence.

Now move to $x_2$ and consider the sequence $F_{1:1}(x_2), F_{1:2}(x_2), F_{1:3}(x_2), F_{1:4}(x_2), \ldots$ Again, this is a set of values between 0 and 1 and thus has a convergent subsequence. Let the index points of the convergent subsequence be 2:1, 2:2, 2:3, 2:4, and so on. Here we are saying that $F_{2:1}(x_2), F_{2:2}(x_2), F_{2:3}(x_2), F_{2:4}(x_2), \ldots$ is a convergent sequence. Because the indices 2:1, 2:2, … were extracted from 1:1, 1:2, …, we've now got a subsequence that converges for both $x_1$ and $x_2$. Moreover, we can observe that the values to which these converge are correctly ordered. That is, if $x_1 < x_2$, then

$$\lim_{n} F_{2:n}(x_1) \le \lim_{n} F_{2:n}(x_2).$$

This happens because $F_n(x_1) \le F_n(x_2)$ for every cumulative distribution function in the original sequence. (The inequalities would be reversed if $x_1 > x_2$.)

Consider next $x_3$ and the sequence $F_{2:1}(x_3), F_{2:2}(x_3), F_{2:3}(x_3), F_{2:4}(x_3), \ldots$ As before, this is a set of values between 0 and 1 and thus has a convergent subsequence. Let the index points of the convergent subsequence be 3:1, 3:2, 3:3, 3:4, and so on. We are saying that $F_{3:1}(x_3), F_{3:2}(x_3), F_{3:3}(x_3), F_{3:4}(x_3), \ldots$ is a convergent sequence. At this point, we have a subsequence that converges for $x_1$, $x_2$, and $x_3$. As above, the values to which these converge are ordered in the same way as are $x_1$, $x_2$, and $x_3$.

Let this process continue indefinitely. Now the diagonal sequence $\{F_{k:k}\}$ will converge at $x_1, x_2, x_3, \ldots$, because from its $j$-th term onward it is a subsequence of the $j$-th extracted sequence, which converges at $x_j$. We now have a subsequence that converges at every rational number! From this, we claim that the subsequence converges at every $x$ at which the limiting function is continuous, whether $x$ is rational or not.
There are a number of technical mathematical difficulties to check, but in this case, everything will work out. This technique is called Helly's extraction principle.
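The nested extraction and the diagonal pick can be imitated on a computer, though only as a finite stand-in for the infinite argument. The sketch below makes several assumptions that are not in the notes: the example sequence `F` (CDFs alternating between two normal distributions), the particular enumeration of rationals in `first_rationals`, the finite `horizon` of indices, and the use of repeated interval halving in `refine` as a substitute for "take a convergent subsequence".

```python
from fractions import Fraction
from math import erf, sqrt


def F(n, x):
    """Hypothetical example: F_n is the N(0,1) CDF for even n, N(3,1) for odd n."""
    mean = 0.0 if n % 2 == 0 else 3.0
    return 0.5 * (1.0 + erf((x - mean) / sqrt(2.0)))


def first_rationals(count):
    """First `count` rationals in one possible listing (ordered by |p| + q)."""
    seen, out, total = set(), [], 1
    while len(out) < count:
        for q in range(1, total + 1):
            for p in (total - q, -(total - q)):
                r = Fraction(p, q)
                if r not in seen:
                    seen.add(r)
                    out.append(r)
        total += 1
    return out[:count]


def refine(indices, x, depth=20):
    """Finite stand-in for extracting a convergent subsequence of F_n(x).

    Halve [0, 1] repeatedly, each time keeping the half that holds more of
    the values F_n(x). The surviving values agree to within 2**-depth.
    """
    values = {n: F(n, x) for n in indices}
    lo, hi = 0.0, 1.0
    for _ in range(depth):
        mid = (lo + hi) / 2.0
        lower = [n for n in indices if values[n] <= mid]
        upper = [n for n in indices if values[n] > mid]
        if len(lower) >= len(upper):
            indices, hi = lower, mid
        else:
            indices, lo = upper, mid
    return indices


def diagonal_indices(num_points=6, horizon=1000):
    """Refine at x_1, ..., x_k in turn and record the diagonal index k:k."""
    xs = first_rationals(num_points)
    indices = list(range(1, horizon + 1))
    diagonal = []
    for k, x in enumerate(xs, start=1):
        indices = refine(indices, float(x))  # k-th nested index list
        diagonal.append(indices[k - 1])      # its k-th entry, i.e. k:k
    return xs, diagonal


if __name__ == "__main__":
    xs, diagonal = diagonal_indices()
    print([str(x) for x in xs], diagonal)
```

For this alternating example the refinement ends up keeping indices of a single parity, so the diagonal indices all pick the same normal CDF. With a finite horizon nothing is actually proved; the point is only to show how each index list is nested inside the previous one and why the diagonal entries are strictly increasing.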