CONVERGENCE IN DISTRIBUTION

Convergence in distribution is a property of cumulative distribution functions (CDFs).
It’s frequently described, however, in terms of the random variables that have those
distributions. Let’s say that we have a sequence of random variables X1, X2, X3, … with
corresponding cumulative distribution functions F1, F2, F3, … Let’s say also that we
have a target limit random variable X with cumulative distribution function F.
The concept should be described as $\{F_n\} \xrightarrow{d} F$, but we often use $\{X_n\} \xrightarrow{d} X$.
We say that $\{F_n\} \xrightarrow{d} F$ if and only if, for every x which is a point of continuity of F, the sequence of numbers { Fn(x) } converges to the number F(x).
An example that follows will indicate why we need the proviso that x be a continuity
point of F.
Suppose that
$$F_n(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 - e^{-\frac{n}{n+1} x} & \text{if } x > 0 \end{cases}$$
Here of course Fn(x) is the cumulative distribution function of an exponential random variable with mean $\frac{n+1}{n}$. Intuitively this should converge to the cumulative distribution function of the exponential random variable with mean 1. Indeed, using
$$F(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 - e^{-x} & \text{if } x > 0 \end{cases}$$
it can be shown easily that for any x, $\{F_n(x)\} \to F(x)$. Once x is specified, this is all about convergence of a sequence of numbers.
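The pointwise convergence in this example can be checked numerically. The following is a minimal sketch, assuming the reading of the example in which Fn is the exponential CDF with mean (n + 1)/n and F is the exponential CDF with mean 1. The function names `F_n` and `F` and the test point x = 2 are choices made here for illustration.

```python
import math

def F_n(n, x):
    """CDF of an exponential random variable with mean (n + 1) / n."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-n * x / (n + 1))

def F(x):
    """CDF of an exponential random variable with mean 1."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-x)

# Fix one x; the question is then about a sequence of plain numbers.
x = 2.0
ns = (1, 10, 100, 1000, 10000)
gaps = [abs(F_n(n, x) - F(x)) for n in ns]
for n, gap in zip(ns, gaps):
    print(n, gap)
```

The printed gaps shrink toward 0 as n grows, which is what convergence of { Fn(x) } to F(x) means at this particular x.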
The convergence in distribution concept is not as useful when F is the cumulative
distribution function of a degenerate random variable, meaning a random variable that
takes only one value. An example of such an F is $F(x) = I(x \ge M)$, which corresponds to
a random variable for which P[ X = M ] = 1.
When the limiting cumulative distribution function is degenerate at value M, then the statement $\{F_n\} \xrightarrow{d} F$ is equivalent to $\{X_n\} \xrightarrow{P} M$.

Page
1
gs2011
Here’s why. First assume that $\{F_n\} \xrightarrow{d} F$, where F is degenerate at M, and let $\varepsilon > 0$.

Pick any x < M. Then $\{F_n(x)\} = \{P[X_n \le x]\} \to F(x) = 0$. One choice is $x = M - \varepsilon$, so we conclude $\{P[X_n \le M - \varepsilon]\} \to 0$.

Pick any x > M. Then $\{F_n(x)\} = \{P[X_n \le x]\} \to F(x) = 1$. One choice is $x = M + \varepsilon$, so we conclude $\{P[X_n \le M + \varepsilon]\} \to 1$.

These two facts together give us $\{P[\,|X_n - M| > \varepsilon\,]\} \to 0$. This is the statement $\{X_n\} \xrightarrow{P} M$.

Now assume that $\{X_n\} \xrightarrow{P} M$. For any x < M we have $\{P[X_n \le x]\} \to 0$, and for any x > M (say $x = M + \varepsilon$) we have $\{P[X_n \le M + \varepsilon]\} \to 1$. If these statements are converted to their equivalents in terms of cumulative distribution functions, then we conclude $\{F_n\} \xrightarrow{d} F$.
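The step in the first direction that combines the two limits can be spelled out. This is a sketch of one way to write it, with $\varepsilon > 0$ fixed; it uses only the two limits obtained from the continuity points $M - \varepsilon$ and $M + \varepsilon$.

```latex
\begin{align*}
P[\,|X_n - M| > \varepsilon\,]
  &= P[X_n < M - \varepsilon] + P[X_n > M + \varepsilon] \\
  &\le P[X_n \le M - \varepsilon] + \bigl(1 - P[X_n \le M + \varepsilon]\bigr) \\
  &\to 0 + (1 - 1) = 0 .
\end{align*}
```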
Let’s consider the reason that we need to exclude discontinuity points of the limit F.
Suppose that Xn is uniformly distributed on the interval $(-\frac{1}{n}, \frac{1}{n})$. This certainly converges in distribution (and in probability) to the random variable X with P[ X = 0 ] = 1. The limiting cumulative distribution function $F(x) = I(x \ge 0)$ has a discontinuity at x = 0. We note that $F_n(0) = \frac{1}{2}$ for every n but F(0) = 1. Thus { Fn(0) } does not converge to F(0). However, x = 0 is the only discontinuity point, and convergence holds at every other x-value.
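The behavior at and away from the discontinuity can be seen numerically. The following is a minimal sketch for the uniform example; the function names `F_n` and `F` and the sample points are chosen here for illustration.

```python
def F_n(n, x):
    """CDF of the uniform distribution on (-1/n, 1/n)."""
    if x <= -1.0 / n:
        return 0.0
    if x >= 1.0 / n:
        return 1.0
    return (n * x + 1.0) / 2.0

def F(x):
    """CDF of the random variable that equals 0 with probability 1."""
    return 1.0 if x >= 0 else 0.0

ns = (1, 10, 100, 1000)
at_zero = [F_n(n, 0.0) for n in ns]     # stuck at 1/2, while F(0) = 1
right = [F_n(n, 0.01) for n in ns]      # eventually equals F(0.01) = 1
left = [F_n(n, -0.01) for n in ns]      # eventually equals F(-0.01) = 0
print(at_zero, F(0.0))
print(right, F(0.01))
print(left, F(-0.01))
```

At x = 0 the values never move from 1/2, while at the two nearby points the values reach the limit as soon as 1/n drops below 0.01.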
There’s a very interesting property of cumulative distribution functions that is somewhat
hard to believe. If { Fn } is any sequence of cumulative distribution functions, then there
is a subsequence that converges to a cumulative (sub)distribution function F. The (sub)
part of this comment is that the limit will satisfy all the requirements for a cumulative
distribution function except possibly that $F(\infty) < 1$ (and, likewise, possibly $F(-\infty) > 0$), since probability mass can escape to infinity.
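A small example shows how a limit can fall short of being a full distribution function. The sequence below is hypothetical and not from the text above: Xn takes the value 0 with probability 1/2 and the value n with probability 1/2, so half of the probability drifts off to infinity. This is a sketch under that assumption, with `F_n` and `F` named here for illustration.

```python
def F_n(n, x):
    """CDF of a hypothetical X_n with P[X_n = 0] = P[X_n = n] = 1/2."""
    if x < 0:
        return 0.0
    if x < n:
        return 0.5
    return 1.0

def F(x):
    """Pointwise limit of F_n: a sub-distribution function with F(infinity) = 1/2."""
    return 0.5 if x >= 0 else 0.0

points = (-1.0, 0.0, 10.0, 1000.0)
for x in points:
    # For n larger than x, F_n(n, x) no longer changes.
    print(x, [F_n(n, x) for n in (10**4, 10**5, 10**6)], F(x))
```

Every Fn is a genuine CDF, yet the limit never rises above 1/2, no matter how large x is.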
The proof of this outrageous assertion hangs on the fact that $0 \le F_n(x) \le 1$ for every n and
also this mathematical statement:
If $\theta_1, \theta_2, \theta_3, \ldots$ is a sequence of numbers between 0 and 1 (or more generally
between any two finite values), then there is a subsequence $\theta_{1'}, \theta_{2'}, \theta_{3'}, \ldots$ that
converges. The prime marks on the subscripts simply indicate that we’ve
extracted some set of values from the original sequence.
The proof of this statement is itself interesting, but we’ll simply take it as true and use it
in the development that follows.
Now… let’s find the subsequence of { Fn } that converges. Let x1, x2, x3, … be a listing
of all the rational numbers. As the set of rationals is countably infinite, it is possible to
set up this sequence.
Work with x1 and consider the sequence of numbers F1(x1), F2(x1), F3(x1), F4(x1), …
This is a set of values between 0 and 1 so there is a convergent subsequence. Let the
index points of the convergent subsequence Fn(x1) be 1:1, 1:2, 1:3, 1:4, and so on. That
is, we’re saying that F1:1(x1), F1:2(x1), F1:3(x1), F1:4(x1), … is a convergent sequence.
Now move to x2 and consider the sequence F1:1(x2), F1:2(x2), F1:3(x2), F1:4(x2), … Again,
this is a set of values between 0 and 1 and thus has a convergent subsequence. Let the
index points of the convergent subsequence be 2:1, 2:2, 2:3, 2:4, and so on. Here we are
saying that F2:1(x2), F2:2(x2), F2:3(x2), F2:4(x2), … is a convergent sequence. Now we’ve
got a subsequence that converges for both x1 and x2 . Moreover, we can observe that the
values to which these converge are correctly ordered. That is, if x1 < x2, then
$\lim_{n \to \infty} F_{2:n}(x_1) \le \lim_{n \to \infty} F_{2:n}(x_2)$. This happens because $F_n(x_1) \le F_n(x_2)$ for every cumulative
distribution function in the original sequence. (The inequalities would be reversed if
x1 > x2.)
Consider next x3 and consider the sequence F2:1(x3), F2:2(x3), F2:3(x3), F2:4(x3), … As
before, this is a set of values between 0 and 1 and thus has a convergent subsequence.
Let the index points of the convergent subsequence be 3:1, 3:2, 3:3, 3:4, and so on. We
are saying that F3:1(x3), F3:2(x3), F3:3(x3), F3:4(x3), … is a convergent sequence. At this
point, we have a subsequence that converges for x1, x2, and x3 . As above, the values to
which these converge are ordered in the same way as are x1, x2, and x3 .
Let this process continue indefinitely. Now the diagonal sequence { Fk:k } will converge
at x1, x2, x3, … We now have a subsequence that converges at every rational number!
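From this, one can show that the subsequence converges at every continuity point x of the limit F, whether x is rational or not, where F is built from the limits at the rationals (taking limits from the right).
There are a number of technical mathematical difficulties to check, but in this case,
everything will work out.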
This technique is called Helly’s extraction principle (also known as Helly’s selection theorem).
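The nested refinement and the diagonal step can be illustrated on a finite prefix of a sequence. This is only a sketch, not a proof: a computer cannot extract a truly infinite subsequence, so the code works with indices 1 through 2000 and three points. The sequence itself is hypothetical (exponential CDFs whose mean alternates between 1 and 2, so the full sequence does not converge), and the helper names `cdf` and `refine` are made up here. The refinement uses repeated bisection of [0, 1], keeping the half that holds more values, which mirrors the usual argument for bounded sequences.

```python
import math

def cdf(n, x):
    """Hypothetical F_n: exponential CDF with mean 1 for even n, mean 2 for odd n."""
    if x <= 0:
        return 0.0
    mean = 1.0 if n % 2 == 0 else 2.0
    return 1.0 - math.exp(-x / mean)

def refine(indices, x, rounds=30):
    """Keep indices whose values F_n(x) cluster together, by bisecting [0, 1].

    Each round keeps the half-interval holding more values (ties go to the lower half).
    """
    lo, hi = 0.0, 1.0
    for _ in range(rounds):
        mid = (lo + hi) / 2.0
        lower = [n for n in indices if cdf(n, x) < mid]
        upper = [n for n in indices if cdf(n, x) >= mid]
        if len(lower) >= len(upper):
            indices, hi = lower, mid
        else:
            indices, lo = upper, mid
    return indices

points = [0.5, 1.0, 2.0]           # stand-ins for the first few rationals x1, x2, x3
indices = list(range(1, 2001))     # finite prefix of the index set
nested = []                        # nested[k] plays the role of the indices (k+1):1, (k+1):2, ...
for x in points:
    indices = refine(indices, x)   # each list is a sub-list of the previous one
    nested.append(indices)

# Diagonal step: take the k-th index from the k-th refined list.
diagonal = [nested[k][k] for k in range(len(points))]
# Spread of F_n(x) over the final list at each point; small spread means the values agree.
spreads = [max(cdf(n, x) for n in nested[-1]) - min(cdf(n, x) for n in nested[-1])
           for x in points]
print(diagonal, spreads)
```

In this toy case the first refinement already isolates the odd indices, and the later refinements leave that list unchanged, so the diagonal simply picks odd indices in increasing order.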