Survey							
                            
		                
		                * Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
MS&E 321
Stochastic Systems
Prof. Peter W. Glynn
Appendix A:
A.1
Spring 12-13
June 1, 2013
Page 1 of 3
Convergence Concepts for Random Variables
Convergence Definitions
Recall that a real-valued rv X is a function X : Ω → R. Given that a rv is a function, there
are many different mechanisms for determining whether a sequence of rvs/functions (Xn : n ≥ 0)
converges to a limit rv/function X∞ .
Almost Sure Convergence:
We say that (Xn : n ≥ 1) converges almost surely to X∞ if P (A) = 1, where
A = {ω : Xn (ω) → X∞ (ω)
as n → ∞}
and write Xn → X∞ a.s. as n → ∞ when this convergence holds. This type of convergence is
equivalently called: convergence with probability one (written Xn → X∞ w.p. 1 as n → ∞);
convergence almost everywhere (written Xn → X∞ a.e. as n → ∞); convergence almost certainly
(written Xn → X∞ a.c. as n → ∞).
Remark A.1.1 Note that the event A depends on the infinite-dimensional joint distribution of
(Xn : 1 ≤ n ≤ ∞). As a result, rigorous discussion of probability assignments to events like A
needed to await the development of measure-theoretic probability in the early 20th century.
Convergence in p’th Mean:
We say that (Xn : n ≥ 1) converges in p’th mean (for p > 0) to X∞ if E|Xn |p < ∞ for n ≥ 1
and kXn − X∞ kp → 0 as n → ∞, where
kY kp = E 1/p |Y |p
Lp
for E|Y |p < ∞. When this convergence holds, we write Xn −→ X∞ as n → ∞.
Convergence in Probability
We say that (Xn : n ≥ 1) converges in probability to X∞ if, for each > 0,
P (|Xn − X∞ | > ) → 0
p
as n → ∞, in which case we write Xn →
− X∞ as n → ∞.
Remark A.1.2 Convergence in probability and convergence in p’th mean is a statement about
the joint distribution of the two rv’s Xn and X∞ for n large.
Remark A.1.3 Convergence in p’th mean implies convergence in probability, since Markov’s inequality implies that
E|Xn − X∞ |p
P (|Xn − X∞ | > ) ≤
.
p
1
§ APPENDIX A:
CONVERGENCE CONCEPTS FOR RANDOM VARIABLES
Exercise A.1.1 Prove that almost sure convergence implies convergence in probability.
Weak Convergence
We say that (Xn : n ≥ 1) converges weakly to X∞ if X∞ is a finite-valued rv for which
Ef (Xn ) → Ef (X∞ )
as n → ∞ for each bounded and continuous function f : R → R, in which case we write
Xn ⇒ X∞
as n → ∞. Weak convergence is equivalently called “convergence in distribution”.
Remark A.1.4 Weak convergence is a statement about the distribution of Xn when n is large.
Remark A.1.5 Weak convergence can be equivalently formulated as: Xn ⇒ X∞ as n → ∞ if and
only if X∞ is a finite-valued rv for which
P (Xn ≤ x) → P (X∞ ≤ x)
as n → ∞ whenever x is a continuity point of P (X∞ ≤ ·).
p
Exercise A.1.2 Prove that Xn →
− X∞ as n → ∞ implies that Xn ⇒ X∞ as n → ∞.
Remark A.1.6 Convergence in probability does not imply almost sure convergence. Note that
if Y = (Yn : n ≥ 0) is a nearest neighbor symmetric random walk on Z, then the recurrence
of Y implies that Xn , I(Yn = 0) = 1 infinitely often a.s. so Xn 9 0 a.s. as n → ∞. But
p
P (Xn = 0) = P (Yn 6= 0) → 1 as n → ∞, so Xn →
− 0 as n → ∞.
The following diagram makes clear the relationship between these convergence concepts:
Almost Sure
Convergence
Convergence in p’th mean
@
@
@
R
@
Convergence
in Probability
?
Weak Convergence
A.2
Basic Facts on Almost Sure Convergence
Fact 1: Suppose that Xn → X∞ a.s. as n → ∞ and Yn → Y∞ a.s. as n → ∞, where X∞ and
Y∞ are finite-valued. If f : R2 → R is continuous, then h(Xn , Yn ) → h(X∞ , Y∞ ) a.s. as n → ∞.
2
§ APPENDIX A:
CONVERGENCE CONCEPTS FOR RANDOM VARIABLES
Fact 2: Suppose that Xn → X∞ a.s. as n → ∞ and (Tn : n ≥ 1) is a sequence of Z+ -valued rv’s
for which Tn → ∞ a.s. as n → ∞. Then,
XTn → X∞
a.s.
as n → ∞.
A.3
Basic Facts on Convergence in Probability
p
Fact 1: Suppose that Xn →
− X∞ , where X∞ is finite-valued. If h : R → R is continuous, then
p
h(Xn ) →
− h(X∞ ) as n → ∞.
p
p
Remark A.3.1 If Xn →
− X∞ and Yn →
− Y∞ as n → ∞ and ((Xn , Yn ) : 1 ≤ n ≤ ∞) are all defined
p
on a common sample space (i.e. are jointly distributed), then (Xn , Yn ) →
− (X∞ , Y∞ ) as n → ∞.
Fact 2:
n → ∞.
p
p
Suppose that Xn → X∞ a.s. as n → ∞ and Tn →
− ∞ as n → ∞. Then, XTn →
− X∞ as
p
p
p
Remark A.3.2 If Xn →
− X∞ and Tn →
− ∞ as n → ∞, it is not always the case that XTn →
− X∞
as n → ∞.
A.4
Basic Facts on Weak Convergence
Fact 1: Suppose that Xn ⇒ X∞ as n → ∞. If h : R → R is continuous, then h(Xn ) ⇒ h(X∞ )
as n → ∞. In fact, if P (X∞ ∈ Dh ) = 0 (where Dh = {x ∈ R : h(·) is discontinuous at x}), then
h(Xn ) ⇒ h(X∞ ) as n → ∞.
Remark A.4.1 The above result is called the continuous mapping principle.
p
Fact 2: Suppose that Xn ⇒ X∞ as n → ∞ and Yn →
− c as n → ∞, where c is deterministic. If
2
h : R → R is such that P ((X∞ , c) ∈ Dh ) = 0, then h(Xn , Yn ) ⇒ h(X∞ , c) as n → ∞.
p
Remark A.4.2 The above fact implies that if Xn ⇒ X∞ and Yn →
− c, then Xn + Yn ⇒ X∞ + c
and Xn Yn ⇒ cX∞ as n → ∞.
p
Remark A.4.3 If Xn ⇒ X∞ and Yn →
− Y∞ as n → ∞, it is not always the case that h(Xn , Yn ) ⇒
2
h(X∞ , Y∞ ) (even if h : R → R is continuous everywhere).
p
Remark A.4.4 If Xn ⇒ X∞ as n → ∞ and Tn →
− ∞ as n → ∞, it is not always the case that
XTn ⇒ X∞ as n → ∞.
3