Probability Theory
Chapter 6: Convergence
Thommy Perlinger

Four different convergence concepts

Let $X_1, X_2, \dots$ be a sequence of (usually dependent) random variables.

Definition 1.1. $X_n$ converges almost surely (a.s.), or with probability 1 (w.p.1), to the random variable $X$ as $n \to \infty$ iff
$$P(\{\omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = 1.$$

Definition 1.2. $X_n$ converges in probability to the random variable $X$ as $n \to \infty$ iff for every $\varepsilon > 0$
$$P(|X_n - X| > \varepsilon) \to 0 \quad \text{as } n \to \infty.$$
Four different convergence concepts

Definition 1.3. $X_n$ converges in $r$-mean to the random variable $X$ as $n \to \infty$ iff
$$E|X_n - X|^r \to 0 \quad \text{as } n \to \infty.$$

Definition 1.4. $X_n$ converges in distribution to the random variable $X$ as $n \to \infty$ iff
$$F_{X_n}(x) \to F_X(x) \quad \text{for all } x \in C(F_X),$$
where $C(F_X)$ is the continuity set of $F_X$.

Convergence in probability

Definition 1.2. $X_n$ converges in probability to the random variable $X$ as $n \to \infty$ iff for every $\varepsilon > 0$
$$P(|X_n - X| > \varepsilon) \to 0 \quad \text{as } n \to \infty.$$
Notation. $X_n \xrightarrow{p} X$ as $n \to \infty$.

In situations where the limiting distribution is degenerate, that is, where the limiting random variable $X$ is a constant, convergence in probability is (in statistics) also known as consistency.

Chebyshev's inequality. Let $X$ be a random variable with mean $\mu$ and finite variance $\sigma^2$. Then, for every $\varepsilon > 0$,
$$P(|X - \mu| > \varepsilon) \le \frac{\sigma^2}{\varepsilon^2}.$$
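A quick numerical illustration (not from the original slides): this Python sketch, using NumPy, compares Chebyshev's bound $\sigma^2/\varepsilon^2$ with the actual tail probability $P(|X - \mu| > \varepsilon)$; the normal distribution and the sample size are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma = 0.0, 1.0
    x = rng.normal(mu, sigma, size=1_000_000)

    for eps in (1.0, 2.0, 3.0):
        empirical = np.mean(np.abs(x - mu) > eps)   # P(|X - mu| > eps), estimated
        bound = sigma**2 / eps**2                   # Chebyshev's upper bound
        print(f"eps={eps}: empirical={empirical:.4f} <= bound={bound:.4f}")

The bound is crude, but that is exactly why it proves convergence so easily: it only needs the variance.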
The weak law of large numbers

The weak law of large numbers. Let $X_1, X_2, \dots$ be a sequence of i.i.d. random variables with mean $\mu$ and finite variance $\sigma^2$, and set $S_n = X_1 + X_2 + \dots + X_n$. Then
$$\frac{S_n}{n} \xrightarrow{p} \mu \quad \text{as } n \to \infty.$$

Proof. The statement is a simple consequence of Chebyshev's inequality. Since $E(S_n/n) = \mu$ and $\operatorname{Var}(S_n/n) = \sigma^2/n$, it follows that (for any $\varepsilon > 0$)
$$P\left(\left|\frac{S_n}{n} - \mu\right| > \varepsilon\right) \le \frac{\sigma^2}{n\varepsilon^2} \to 0 \quad \text{as } n \to \infty. \qquad \blacksquare$$

Problem 6.8.6 a

Let $X_1, X_2, \dots$ be i.i.d. Pa(1,2)-distributed random variables, and set $Y_n = \min\{X_1, X_2, \dots, X_n\}$. Show that $Y_n \xrightarrow{p} 1$ as $n \to \infty$.

Since
$$F_X(x) = 1 - \frac{1}{x^2}, \quad x > 1,$$
it follows that the distribution function of $Y_n$ is given by
$$F_{Y_n}(y) = 1 - (1 - F_X(y))^n = 1 - y^{-2n}, \quad y > 1,$$
and so (for any $\varepsilon > 0$)
$$P(|Y_n - 1| > \varepsilon) = P(Y_n > 1 + \varepsilon) = (1 + \varepsilon)^{-2n} \to 0 \quad \text{as } n \to \infty.$$
It thus follows that $Y_n \xrightarrow{p} 1$ as $n \to \infty$.
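A minimal simulation sketch of Problem 6.8.6 a (not from the original slides), assuming the Pa(1,2) parametrization above, $F(x) = 1 - x^{-2}$ for $x > 1$, so samples can be drawn by inversion as $(1 - U)^{-1/2}$:

    import numpy as np

    rng = np.random.default_rng(1)

    def rpareto12(size):
        # Inversion: F(x) = 1 - x^(-2) for x > 1  =>  X = (1 - U)^(-1/2)
        return (1.0 - rng.uniform(size=size)) ** -0.5

    eps = 0.05
    for n in (10, 100, 1000):
        y = rpareto12((10_000, n)).min(axis=1)      # 10,000 replications of Y_n
        print(f"n={n}: P(|Y_n - 1| > {eps}) ~ {np.mean(np.abs(y - 1) > eps):.4f}")

The estimated tail probability should track $(1 + \varepsilon)^{-2n}$ closely.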
Example: Consistency of S²

Consistency of S². Let $X_1, X_2, \dots$ be a sequence of i.i.d. random variables with mean $\mu$ and finite variance $\sigma^2$. Define the sample variance by
$$S^2 = \frac{1}{n-1}\sum_{k=1}^{n}(X_k - \bar{X})^2.$$
Since (prove this!)
$$E(S^2) = \sigma^2,$$
it follows from Chebyshev's inequality that
$$P(|S^2 - \sigma^2| > \varepsilon) \le \frac{\operatorname{Var}(S^2)}{\varepsilon^2},$$
and thus a sufficient condition for $S^2 \xrightarrow{p} \sigma^2$ is that $\operatorname{Var}(S^2) \to 0$ as $n \to \infty$.

Convergence in probability: Extension
Theorem 6.7

Theorem 6.7. Suppose that $X_1, X_2, \dots$ converges in probability to a constant $a$ and that $h$ is a continuous function. Then
$$h(X_n) \xrightarrow{p} h(a) \quad \text{as } n \to \infty.$$

Proof. $h$ is continuous, so given $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$|x - a| \le \delta \implies |h(x) - h(a)| \le \varepsilon.$$
The continuity of $h$ thus makes sure that
$$\{|h(X_n) - h(a)| > \varepsilon\} \subseteq \{|X_n - a| > \delta\},$$
and since $X_n$ converges in probability to $a$,
$$P(|h(X_n) - h(a)| > \varepsilon) \le P(|X_n - a| > \delta) \to 0 \quad \text{as } n \to \infty. \qquad \blacksquare$$
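A small sketch (not from the original slides) checking the consistency of $S^2$ empirically; the normal data and the sizes are arbitrary illustrative choices:

    import numpy as np

    rng = np.random.default_rng(2)
    sigma2 = 4.0                                    # true variance (N(0, 4) data)
    eps = 0.5

    for n in (10, 100, 1000):
        x = rng.normal(0.0, np.sqrt(sigma2), size=(10_000, n))
        s2 = x.var(axis=1, ddof=1)                  # sample variance S^2 (ddof=1)
        print(f"n={n}: P(|S^2 - sigma^2| > {eps}) ~ {np.mean(np.abs(s2 - sigma2) > eps):.4f}")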
Convergence in probability: Extension
Exercise 6.6.2

Exercise 6.6.2. Suppose that $X_1, X_2, \dots$ converges in probability to a random variable $X$ and that $h$ is a continuous function. Then
$$h(X_n) \xrightarrow{p} h(X) \quad \text{as } n \to \infty.$$

Proof. To prove this we use the fact that on a closed interval (a compact set) any continuous function $h$ is actually uniformly continuous. We therefore divide the sample space into the two disjoint subsets
$$\{|X| \le A\} \quad \text{and} \quad \{|X| > A\}.$$
To this we use the fact that for any given $\eta > 0$ there exists an $A$ such that
$$P(|X| > A) < \eta.$$
Since $h$ for any $A$ is uniformly continuous on $[-A, A]$ we can (on this interval) for any given $\eta > 0$ and $\varepsilon > 0$ find a $\delta > 0$ and an $m$ such that for any $n > m$
$$P(|h(X_n) - h(X)| > \varepsilon,\ |X| \le A) \le P(|X_n - X| > \delta) < \eta,$$
and it now follows that
$$P(|h(X_n) - h(X)| > \varepsilon) < 2\eta \quad \text{for all } n > m.$$
Since $\eta > 0$ was arbitrary, $h(X_n) \xrightarrow{p} h(X)$ as $n \to \infty$. $\blacksquare$
Almost sure convergence

Definition 1.1. $X_n$ converges almost surely, or with probability one, to the random variable $X$ as $n \to \infty$ iff
$$P(\{\omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = 1.$$
Notation. $X_n \xrightarrow{a.s.} X$ as $n \to \infty$.

When proving that $X_n$ converges (or fails to converge) almost surely we can use the following result:
$$X_n \xrightarrow{a.s.} X \text{ as } n \to \infty \quad \text{iff, } \forall \varepsilon > 0 \text{ and } 0 < \delta < 1,\ \exists n_0 \text{ such that}$$
$$P(|X_n - X| \le \varepsilon \text{ for all } n > n_0) > 1 - \delta.$$

Example: Almost sure convergence

Let the sample space $S$ be $[0,1]$ with the uniform distribution. Define the random variables $X_n(s) = s + s^n$ and $X(s) = s$. As $n \to \infty$ we have that
$$s^n \to 0 \quad \text{for every } s \in [0,1),$$
which means that
$$X_n(s) \to s = X(s) \quad \text{for every } s \in [0,1).$$
(At $s = 1$ we have $X_n(1) = 2$ for all $n$, so convergence fails there.) But since $P([0,1)) = 1$ it follows by Definition 1.1 that
$$X_n \xrightarrow{a.s.} X \quad \text{as } n \to \infty.$$
Relationship: Almost sure convergence and convergence in probability

Comparison of Definitions 1.1 and 1.2. We have that almost sure convergence is the stronger concept. Since, for $m > n$,
$$P(|X_m - X| > \varepsilon) \le P\left(\bigcup_{k > n}\{|X_k - X| > \varepsilon\}\right),$$
it is "proven" that
$$X_n \xrightarrow{a.s.} X \implies X_n \xrightarrow{p} X \quad \text{as } n \to \infty.$$

Example: Convergence in probability but not almost sure convergence

Let the sample space $S$ be $[0,1]$ with the uniform distribution. Define $X(s) = s$ and the sequence $X_1, X_2, \dots$ by
$$X_1(s) = s + I_{[0,1]}(s), \quad X_2(s) = s + I_{[0,1/2]}(s), \quad X_3(s) = s + I_{[1/2,1]}(s),$$
$$X_4(s) = s + I_{[0,1/3]}(s), \quad X_5(s) = s + I_{[1/3,2/3]}(s), \quad X_6(s) = s + I_{[2/3,1]}(s),$$
etc. It is clear that for any $0 < \varepsilon < 1$
$$P(|X_n - X| > \varepsilon) = P(I_n) \to 0 \quad \text{as } n \to \infty,$$
where $I_n$ is the interval related to $X_n$. It is thus clear that $X_n \xrightarrow{p} X$ as $n \to \infty$. However, since $X_n(s)$ alternates between $s$ and $s + 1$ infinitely often, that is, for every fixed $s$ we have $X_n(s) = s + 1$ for infinitely many $n$, so that
$$P(\{s : X_n(s) \to X(s) \text{ as } n \to \infty\}) = 0,$$
it is clear that $X_n$ does not converge to $X$ almost surely as $n \to \infty$.
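The "sliding intervals" can be made concrete with a short sketch (not from the original slides; the enumeration of the intervals $I_n$ below matches the pattern above): $P(|X_n - X| > \varepsilon) = |I_n| \to 0$, yet every fixed $s$ lies in infinitely many $I_n$, so no sample path converges.

    def interval(n):
        # Enumerate I_1=[0,1], I_2=[0,1/2], I_3=[1/2,1], I_4=[0,1/3], ...
        k, start = 1, 1
        while start + k <= n:
            start += k                  # first index of the next block
            k += 1                      # block k holds k intervals of length 1/k
        j = n - start                   # position within the current block
        return j / k, (j + 1) / k

    s = 0.3                             # a fixed sample point
    hits = [n for n in range(1, 500) if interval(n)[0] <= s <= interval(n)[1]]
    print("length of I_500 =", interval(500)[1] - interval(500)[0])   # -> 0
    print("first indices n with X_n(0.3) = 1.3:", hits[:10], "... (infinitely many)")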
Relationship: Convergence in r-mean and convergence in probability

Definition 1.3. $X_n$ converges in $r$-mean to the random variable $X$ as $n \to \infty$ iff
$$E|X_n - X|^r \to 0 \quad \text{as } n \to \infty.$$
Notation. $X_n \xrightarrow{r} X$ as $n \to \infty$.

Convergence in $r$-mean is a stronger convergence concept than convergence in probability. By Markov's inequality (for any $\varepsilon > 0$)
$$P(|X_n - X| > \varepsilon) \le \frac{E|X_n - X|^r}{\varepsilon^r},$$
which implies that
$$X_n \xrightarrow{r} X \implies X_n \xrightarrow{p} X \quad \text{as } n \to \infty.$$

Convergence in distribution (and relationships between concepts)

Definition 1.4. $X_n$ converges in distribution to the random variable $X$ as $n \to \infty$ iff
$$F_{X_n}(x) \to F_X(x) \quad \text{for all } x \in C(F_X),$$
where $C(F_X)$ is the continuity set of $F_X$. Notation. $X_n \xrightarrow{d} X$ as $n \to \infty$.

Convergence in distribution is the weakest concept of the four but also the most useful. The (complete) relationships can be described as
$$X_n \xrightarrow{a.s.} X \implies X_n \xrightarrow{p} X \implies X_n \xrightarrow{d} X \qquad \text{and} \qquad X_n \xrightarrow{r} X \implies X_n \xrightarrow{p} X,$$
where all implications are strict.
Problem 6.8.6 b

Let $X_1, X_2, \dots$ be i.i.d. Pa(1,2)-distributed random variables, and set $Y_n = \min\{X_1, X_2, \dots, X_n\}$. Show that $U_n = n(Y_n - 1)$ converges in distribution as $n \to \infty$ and determine the limit distribution.

Since
$$P(U_n > u) = P\left(Y_n > 1 + \frac{u}{n}\right) = \left(1 + \frac{u}{n}\right)^{-2n}, \quad u > 0,$$
it follows that
$$P(U_n > u) \to e^{-2u} \quad \text{as } n \to \infty.$$
It is thus clear that $U_n \xrightarrow{d} X$ as $n \to \infty$ where $X \in \mathrm{Exp}(1/2)$.

Important results concerning limits

Consider two functions $f$ and $g$, such that
$$f(x) \to a \quad \text{and} \quad g(x) \to b \quad \text{as } x \to x_0.$$
Then
$$f(x) + g(x) \to a + b, \qquad f(x)g(x) \to ab, \qquad \frac{f(x)}{g(x)} \to \frac{a}{b} \ (b \ne 0).$$
If $h$ is a continuous function then
$$h(f(x)) \to h(a) \quad \text{as } x \to x_0.$$
The single most important limit (in its most general form) is the following: Let $a_n \to a$ as $n \to \infty$. Then
$$\left(1 + \frac{a_n}{n}\right)^n \to e^a \quad \text{as } n \to \infty.$$
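A sketch (not from the original slides) comparing the empirical distribution of $U_n = n(Y_n - 1)$ with the claimed $\mathrm{Exp}(1/2)$ limit, whose survival function is $e^{-2u}$; the Pa(1,2) sampler is the same inversion as in part a:

    import numpy as np

    rng = np.random.default_rng(3)
    n, reps = 500, 20_000
    x = (1.0 - rng.uniform(size=(reps, n))) ** -0.5   # Pa(1,2) by inversion
    u = n * (x.min(axis=1) - 1.0)                     # U_n = n(Y_n - 1)

    for t in (0.25, 0.5, 1.0):
        print(f"P(U_n > {t}) ~ {np.mean(u > t):.4f}  vs  e^(-2t) = {np.exp(-2*t):.4f}")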
Important results concerning limits (when using Taylor series expansion)

The Taylor series expansion of a function $f$ that is infinitely differentiable in a neighborhood of $x = a$ is the power series
$$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k,$$
which implies that
$$f(x) = f(a) + f'(a)(x - a) + o(x - a) \quad \text{as } x \to a.$$
Some terms in the expansion might be insignificant in the limit. For such terms we can use the "o"-concept. The function $f(x)$ is said to be "little-o" of $g(x)$ if $f(x)/g(x) \to 0$ as $x \to 0$, and we write
$$f(x) = o(g(x)) \quad \text{as } x \to 0.$$

Convergence via transforms

Theorem 4.1. Let $X, X_1, X_2, \dots$ be nonnegative, integer-valued random variables. Then
$$X_n \xrightarrow{d} X \quad \text{iff} \quad g_{X_n}(t) \to g_X(t) \text{ for every } t \in [0,1],$$
where $g$ denotes the probability generating function.

Theorem 4.2. Let $X_1, X_2, \dots$ be random variables for which the mgf's exist for $-h < t < h$ for some $h > 0$, and suppose that $X$ is a random variable whose mgf $\psi_X(t)$ exists for $-h_1 \le t \le h_1$ where $0 < h_1 < h$. If
$$\psi_{X_n}(t) \to \psi_X(t) \quad \text{for } |t| \le h_1,$$
then
$$X_n \xrightarrow{d} X \quad \text{as } n \to \infty.$$
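A numeric illustration (not from the original slides) of the expansion $\psi_X(t) = 1 + \mu t + o(t)$, which is the key step on the next slide; taking $X \in \mathrm{Exp}(1)$, so that $\psi_X(t) = 1/(1-t)$ and $\mu = 1$, is an arbitrary concrete choice:

    mu = 1.0
    for t in (0.1, 0.01, 0.001):
        psi = 1.0 / (1.0 - t)              # mgf of Exp(1), valid for t < 1
        remainder = psi - 1.0 - mu * t     # should be o(t)
        print(f"t={t}: remainder/t = {remainder / t:.6f}")   # -> 0 as t -> 0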
Problem 6.8.19

Let $X_1, X_2, \dots$ be i.i.d. random variables with mean $\mu < \infty$, and let $N_n \in \mathrm{Ge}(p_n)$, $0 < p_n < 1$, be independent of $X_1, X_2, \dots$. Determine the limit distribution of
$$Y_n = p_n \sum_{k=1}^{N_n} X_k$$
as $n \to \infty$ if $p_n \to 0$ as $n \to \infty$.

It follows from Theorem 3.6.3 that
$$\psi_{S_{N_n}}(t) = g_{N_n}(\psi_X(t)),$$
that is, the mgf of the random sum $S_{N_n} = X_1 + \dots + X_{N_n}$ is the pgf of $N_n$ evaluated at the mgf of $X$, and from Theorem 3.3.4 we have that
$$\psi_{Y_n}(t) = \psi_{S_{N_n}}(p_n t) = \frac{p_n}{1 - (1 - p_n)\psi_X(p_n t)}.$$

We now note that for a general probability distribution with mean $\mu$ (where the mgf exists) it holds that
$$\psi_X(t) = 1 + \mu t + o(t) \quad \text{as } t \to 0.$$
Since $\psi_X(p_n t) \to \psi_X(0) = 1$ as $n \to \infty$, it therefore follows that
$$1 - (1 - p_n)\psi_X(p_n t) = 1 - (1 - p_n)\bigl(1 + \mu p_n t + o(p_n t)\bigr) = p_n\bigl(1 - \mu t + o(1)\bigr),$$
and so
$$\psi_{Y_n}(t) = \frac{1}{1 - \mu t + o(1)} \to \frac{1}{1 - \mu t} \quad \text{as } n \to \infty,$$
which is the mgf of the $\mathrm{Exp}(\mu)$-distribution. It is thus clear that $Y_n \xrightarrow{d} \mathrm{Exp}(\mu)$ as $n \to \infty$.
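A simulation sketch of Problem 6.8.19 (not from the original slides), assuming the Ge(p) convention with support {0, 1, 2, ...} used above; NumPy's geometric sampler has support {1, 2, ...}, so one is subtracted. The U(0, 2μ) summands (mean μ) are an arbitrary concrete choice:

    import numpy as np

    rng = np.random.default_rng(4)
    mu, p, reps = 2.0, 0.001, 10_000

    n = rng.geometric(p, size=reps) - 1          # N in Ge(p) on {0, 1, 2, ...}
    y = np.array([p * rng.uniform(0, 2 * mu, size=k).sum() for k in n])

    print("mean of Y ~", y.mean(), " (Exp(mu) mean:", mu, ")")
    for t in (1.0, 2.0, 4.0):
        print(f"P(Y > {t}) ~ {np.mean(y > t):.4f}  vs  e^(-t/mu) = {np.exp(-t/mu):.4f}")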
The weak law of large numbers
Revisited

The weak law of large numbers (LLN). Let $X_1, X_2, \dots$ be a sequence of i.i.d. random variables with mean $\mu$ and mgf $\psi_X(t)$. Then
$$\frac{S_n}{n} \xrightarrow{d} \mu \quad \text{as } n \to \infty, \quad \text{or equivalently,} \quad \frac{S_n}{n} \xrightarrow{p} \mu \quad \text{as } n \to \infty.$$

Proof. It is clear that
$$\psi_{S_n/n}(t) = \left(\psi_X\left(\frac{t}{n}\right)\right)^n = \left(1 + \frac{\mu t}{n} + o\left(\frac{t}{n}\right)\right)^n \to e^{\mu t} \quad \text{as } n \to \infty,$$
which is the mgf of the distribution degenerate at $\mu$. Since the limit is a constant, convergence in distribution here is equivalent to convergence in probability. $\blacksquare$

The Central Limit Theorem

The Central Limit Theorem (CLT). Let $X_1, X_2, \dots$ be i.i.d. random variables with mean $\mu$, variance $\sigma^2$, and mgf $\psi_X(t)$, and set $S_n = X_1 + X_2 + \dots + X_n$. Then
$$\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0,1) \quad \text{as } n \to \infty.$$

Proof. Because of linear properties of moment generating functions it is no restriction to let $\mu = 0$ and $\sigma^2 = 1$. This means that
$$\psi_X(t) = 1 + \frac{t^2}{2} + o(t^2) \quad \text{as } t \to 0,$$
and therefore we get that
$$\psi_{S_n/\sqrt{n}}(t) = \left(\psi_X\left(\frac{t}{\sqrt{n}}\right)\right)^n = \left(1 + \frac{t^2}{2n} + o\left(\frac{t^2}{n}\right)\right)^n \to e^{t^2/2}$$
as $n \to \infty$, and we are done since this is the moment generating function of N(0,1). $\blacksquare$

Convergence of sums of sequences of random variables

Theorems 6.1-6.3. Let $X_1, X_2, \dots$ and $Y_1, Y_2, \dots$ be sequences of random variables. Then
$$X_n \xrightarrow{a.s.} X \ \text{and} \ Y_n \xrightarrow{a.s.} Y \implies X_n + Y_n \xrightarrow{a.s.} X + Y,$$
and the corresponding statements hold for convergence in probability and for convergence in $r$-mean.

Theorem 6.6. Let $X_1, X_2, \dots$ and $Y_1, Y_2, \dots$ be sequences of random variables. Suppose further that $X_n$ and $Y_n$ are independent for all $n$ and that $X$ and $Y$ are independent. Then
$$X_n \xrightarrow{d} X \ \text{and} \ Y_n \xrightarrow{d} Y \implies X_n + Y_n \xrightarrow{d} X + Y.$$

Slutsky's theorem (or Cramér's theorem)

Theorem 6.5. Let $X_1, X_2, \dots$ and $Y_1, Y_2, \dots$ be sequences of random variables. Suppose that
$$X_n \xrightarrow{d} X \quad \text{and} \quad Y_n \xrightarrow{p} a \quad \text{as } n \to \infty,$$
where $a$ is a constant. Then
$$X_n + Y_n \xrightarrow{d} X + a, \qquad X_n - Y_n \xrightarrow{d} X - a, \qquad X_n Y_n \xrightarrow{d} aX, \qquad \frac{X_n}{Y_n} \xrightarrow{d} \frac{X}{a} \ (a \ne 0).$$
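A sketch (not from the original slides) illustrating the fourth result of Theorem 6.5 with a studentized mean: here $X_n \to N(0,1)$ in distribution by the CLT and $Y_n \to 1$ in probability by the LLN, so the ratio still tends to N(0,1); the Exp(1) data and the sizes are arbitrary choices.

    import math
    import numpy as np

    rng = np.random.default_rng(5)
    n, reps = 500, 20_000
    x = rng.exponential(1.0, size=(reps, n))     # i.i.d. Exp(1): mu = sigma = 1

    cn = math.sqrt(n) * (x.mean(axis=1) - 1.0)   # -> N(0,1) by the CLT
    yn = x.std(axis=1, ddof=1)                   # -> 1 in probability (LLN)
    ratio = cn / yn                              # Slutsky: still -> N(0,1)

    Phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    for z in (1.0, 1.96):
        print(f"P(ratio <= {z}) ~ {np.mean(ratio <= z):.4f}  vs  Phi({z}) = {Phi(z):.4f}")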
Exercise 6.6.3

Let $X_1, X_2, \dots$ be i.i.d. Be(p)-distributed random variables where $0 < p < 1$. We would like to construct a confidence interval for the population proportion $p$. What about the random behavior of the sample proportion?

Set $S_n = X_1 + X_2 + \dots + X_n$ and consider $Y_1, Y_2, \dots$ where $Y_n = S_n/n$. Since
$$E(X_k) = p \quad \text{and} \quad \operatorname{Var}(X_k) = p(1 - p),$$
it follows by the Central Limit Theorem (CLT) that
$$\frac{\sqrt{n}(Y_n - p)}{\sqrt{p(1 - p)}} = \frac{S_n - np}{\sqrt{np(1 - p)}} \xrightarrow{d} N(0,1) \quad \text{as } n \to \infty,$$
which, for instance, implies that
$$P\left(-1.96 \le \frac{\sqrt{n}(Y_n - p)}{\sqrt{p(1 - p)}} \le 1.96\right) \to 0.95 \quad \text{as } n \to \infty.$$
Exercise 6.6.3

Since $p$ is unknown, the denominator $\sqrt{p(1-p)}$ cannot be used in practice. Hence, in the denominator we have to replace $p(1 - p)$ with $Y_n(1 - Y_n)$, that is,
$$\frac{\sqrt{n}(Y_n - p)}{\sqrt{p(1 - p)}} \quad \text{is to be replaced by} \quad \frac{\sqrt{n}(Y_n - p)}{\sqrt{Y_n(1 - Y_n)}}.$$
With the aid of Slutsky's theorem we can prove that the CLT still "works".

First it follows by the law of large numbers that
$$Y_n \xrightarrow{p} p \quad \text{as } n \to \infty,$$
and therefore by the second result of Theorem 6.5 (Slutsky) we have that
$$1 - Y_n \xrightarrow{p} 1 - p \quad \text{as } n \to \infty.$$
Now it follows by the third and the fourth result of Theorem 6.5 (Slutsky) that
$$\frac{Y_n(1 - Y_n)}{p(1 - p)} \xrightarrow{p} 1 \quad \text{as } n \to \infty.$$
Since the square root is a continuous function, it follows by Theorem 6.7 that
$$\sqrt{\frac{Y_n(1 - Y_n)}{p(1 - p)}} \xrightarrow{p} 1 \quad \text{as } n \to \infty.$$
Finally, it follows by the fourth result of Theorem 6.5 (Slutsky) that
$$\frac{\sqrt{n}(Y_n - p)}{\sqrt{Y_n(1 - Y_n)}} = \frac{\sqrt{n}(Y_n - p)/\sqrt{p(1 - p)}}{\sqrt{Y_n(1 - Y_n)/(p(1 - p))}} \xrightarrow{d} N(0,1)$$
as $n \to \infty$. It is thus clear that the approximation is still valid for sufficiently large sample sizes.
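A coverage sketch (not from the original slides) for the approximate 95% interval $Y_n \pm 1.96\sqrt{Y_n(1 - Y_n)/n}$ implied by the result above; the true $p$ and the sizes are arbitrary choices:

    import numpy as np

    rng = np.random.default_rng(6)
    p, reps = 0.3, 100_000

    for n in (30, 100, 1000):
        s = rng.binomial(n, p, size=reps)            # S_n = sum of n Be(p)
        y = s / n                                    # sample proportion Y_n
        half = 1.96 * np.sqrt(y * (1 - y) / n)       # studentized half-width
        cover = np.mean((y - half <= p) & (p <= y + half))
        print(f"n={n}: coverage ~ {cover:.4f}  (nominal 0.95)")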
Problem 6.8.26

Let $X_1, X_2, \dots$ be positive i.i.d. random variables with mean $\mu$ and variance $\sigma^2 < \infty$, and set $S_n = X_1 + X_2 + \dots + X_n$. Determine the limit distribution of
$$\frac{S_n - n\mu}{\sqrt{S_n}}.$$

In order to use the central limit theorem we rewrite the expression as
$$\frac{S_n - n\mu}{\sqrt{S_n}} = \sigma \cdot \frac{(S_n - n\mu)/(\sigma\sqrt{n})}{\sqrt{S_n/n}}.$$
The expression in the numerator meets the requirements of the central limit theorem, and the expression in the denominator meets the requirements of the law of large numbers. Therefore we have that
$$\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0,1) \quad \text{and} \quad \sqrt{\frac{S_n}{n}} \xrightarrow{p} \sqrt{\mu} \quad \text{as } n \to \infty,$$
where the second limit also uses Theorem 6.7, since the square root is a continuous function. Hence, it follows from Theorem 6.5 (Slutsky) that
$$\frac{S_n - n\mu}{\sqrt{S_n}} \xrightarrow{d} \frac{\sigma}{\sqrt{\mu}} \cdot N(0,1) = N\left(0, \frac{\sigma^2}{\mu}\right) \quad \text{as } n \to \infty,$$
since a linear function of a normal random variable also is normal.
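Finally, a simulation sketch of Problem 6.8.26 (not from the original slides) with Exp(1) summands, so that $\mu = \sigma^2 = 1$ and the limit should be N(0, 1); the choice of distribution is arbitrary:

    import numpy as np

    rng = np.random.default_rng(7)
    n, reps = 500, 20_000
    mu = 1.0                                         # Exp(1): mu = sigma^2 = 1

    s = rng.exponential(1.0, size=(reps, n)).sum(axis=1)
    t = (s - n * mu) / np.sqrt(s)                    # (S_n - n*mu)/sqrt(S_n)

    print("mean ~", round(t.mean(), 3), " variance ~", round(t.var(), 3),
          " (limit: N(0, sigma^2/mu) = N(0, 1))")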