Probability Generating Functions
H. Krieger, Mathematics 157, Harvey Mudd College
Spring, 2005
Non-negative integer valued random variables: Let $N$ be a random variable with values from the set $\{0, 1, 2, \ldots\} \cup \{+\infty\}$. Then $N$ is referred to as a non-negative integer valued random variable. Letting $p_k = P(N = k)$ for $k = 0, 1, 2, \ldots$, we see that
$$\sum_{k=0}^{\infty} p_k = P(N < +\infty)$$
and
$$P(N = +\infty) = 1 - \sum_{k=0}^{\infty} p_k.$$
If $P(N = +\infty) = 0$, then $N$ is said to be finite with probability one or simply finite valued.
Convolutions: If $N$ and $M$ are independent non-negative integer valued random variables, with $p_k = P(N = k)$ for $k = 0, 1, 2, \ldots$ and $q_j = P(M = j)$ for $j = 0, 1, 2, \ldots$, then $N + M$ is another non-negative integer valued random variable with distribution given by
$$r_i = P(N + M = i) = \sum_{k+j=i} p_k q_j = \sum_{k=0}^{i} p_k q_{i-k}$$
for $i = 0, 1, 2, \ldots$. In this case we write $\{r_k\} = \{p_k\} * \{q_k\}$ and say that $\{r_k\}$ is the convolution of $\{p_k\}$ and $\{q_k\}$.
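For finite supports the convolution is just a polynomial-coefficient product, so it is easy to compute directly. Here is a minimal numerical sketch; the two pmfs are made-up examples, and the use of NumPy's `np.convolve` is simply one convenient way to form the sum above.

```python
import numpy as np

# Made-up pmfs on {0, 1, 2} and {0, 1}; any finite pmfs would do.
p = np.array([0.2, 0.5, 0.3])   # p_k = P(N = k)
q = np.array([0.6, 0.4])        # q_j = P(M = j)

# r_i = sum_{k=0}^{i} p_k q_{i-k}, which for finite supports is
# exactly what np.convolve computes.
r = np.convolve(p, q)

print(r)        # [0.12 0.38 0.38 0.12]
print(r.sum())  # 1.0, so {r_k} is again a pmf
```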
Expectations: If $P(N = +\infty) > 0$, then $E(N) = +\infty$. Otherwise,
$$E(N) = \sum_{k=0}^{\infty} k p_k = \sum_{k=1}^{\infty} k P(N = k).$$
Note that $0 \le E(N) \le +\infty$, with $E(N) = 0$ if and only if $P(N > 0) = 0$.
Moreover, we have the alternate formula
$$E(N) = \sum_{k=1}^{\infty} P(N \ge k) = \sum_{k=0}^{\infty} P(N > k),$$
which also holds even if $P(N = +\infty) > 0$.
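The tail-sum formula is easy to check numerically. The sketch below uses an assumed example, a geometric distribution on $\{0, 1, 2, \ldots\}$ with $E(N) = 1$, truncated far out in the tail, and compares the two sides.

```python
# P(N = k) = (1/2)^{k+1}, a geometric pmf with E(N) = 1,
# truncated at k = 200 (the neglected tail is ~ 2^{-200}).
p = [0.5 ** (k + 1) for k in range(200)]

mean_direct = sum(k * pk for k, pk in enumerate(p))   # sum of k P(N = k)
mean_tails = sum(sum(p[k + 1:]) for k in range(200))  # sum of P(N > k)

print(mean_direct, mean_tails)  # both ~ 1.0
```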
pgf's: For $0 \le z \le 1$ (in fact we can let $z$ be complex with $|z| \le 1$), define
$$g(z) = p_0 + z p_1 + z^2 p_2 + \cdots = \sum_{k=0}^{\infty} z^k p_k.$$
Note that, for $|z| < 1$, $g(z) = E(z^N)$ since we can define $z^{+\infty} = 0$ in this case. Then $g$ is called the probability generating function or pgf of the sequence $\{p_k\}$ or, less precisely, of the random variable $N$. Since each $p_k \ge 0$ and $0 \le \sum_{k=0}^{\infty} p_k \le 1$, this power series converges absolutely for $|z| \le 1$, uniformly on each disk $|z| \le r$ with $r < 1$, and $g$ is infinitely differentiable for $|z| < 1$. Probability generating functions have the following properties:
1. The values of $g(z)$ uniquely determine $\{p_k\}$, since $g^{(k)}(0)/k! = p_k$ for $k = 0, 1, 2, \ldots$. In fact, we see that values of $g$ for real $s$ with $0 < s < 1$ are sufficient for this purpose, since these values determine $g(1)$ as well as $g$ and all of its derivatives at $0$. Note that, in particular, $g(0) = p_0$ and $g(1) = \sum_{k=0}^{\infty} p_k = P(N < +\infty)$.
2. For $|z| < 1$, $g'(z) = \sum_{k=1}^{\infty} k z^{k-1} p_k$. Hence if we consider real values $0 < s < 1$ for $z$, we see that as $s \uparrow 1$,
$$g'(s) \uparrow \sum_{k=1}^{\infty} k p_k,$$
which is $E(N)$ in the case that $N$ is finite valued. But in that case, we also see that
$$\frac{1 - g(s)}{1 - s} = \frac{g(s) - g(1)}{s - 1} = \sum_{k=1}^{\infty} (1 + s + \cdots + s^{k-1}) p_k = \sum_{j=0}^{\infty} s^j \sum_{k=j+1}^{\infty} p_k = \sum_{j=0}^{\infty} s^j P(N > j).$$
Thus $\frac{1 - g(s)}{1 - s}$ is the generating function for the sequence $\{P(N > k)\}$ and we have
$$\frac{g(s) - g(1)}{s - 1} \uparrow E(N) \text{ as } s \uparrow 1.$$
In other words, $g$ is differentiable from below at $1$ with
$$g'(1) = \lim_{s \uparrow 1} g'(s) = E(N).$$
3. Similarly, for $n > 1$ and $N$ finite valued, we get
$$g^{(n)}(1) = \lim_{s \uparrow 1} g^{(n)}(s) = E(N(N-1) \cdots (N-n+1)),$$
so, for example:
$$g''(1) = E(N(N-1)) = E(N^2) - E(N),$$
which gives $\operatorname{Var}(N) = g''(1) + g'(1) - g'(1)^2$.
4. If $N$ and $M$ are independent non-negative integer valued random variables, with generating functions $g_N$ and $g_M$ respectively, then the generating function $g_{N+M}$ of $N + M$ is given for $0 < s < 1$ by $g_{N+M}(s) = g_N(s) g_M(s)$. To see this, note that
$$g_{N+M}(s) = E(s^{N+M}) = E(s^N s^M) = E(s^N) E(s^M) = g_N(s) g_M(s),$$
using the independence of the random variables $s^N$ and $s^M$. In other words, the generating function of the convolution of two sequences is the product of the generating functions of the individual sequences, a familiar result for many transforms. (A small symbolic check of properties 1 through 4 appears after this list.)
5. Suppose $\{X_n\}$ are independent, identically distributed, finite non-negative integer valued random variables which have a common generating function $g_X(s) = E(s^{X_1})$. Let $N$ be non-negative integer valued, independent of the $\{X_n\}$, with generating function $g_N(s) = E(s^N)$. Define the random sum $S = \sum_{k=1}^{N} X_k$ to be $0$ when $N = 0$ and to be $\sum_{k=1}^{n} X_k$ when $N = n \ge 1$. Then $S$ has generating function $g_S$ given by
$$g_S(s) = g_N(g_X(s)).$$
This result follows by conditioning on the value of $N$ and using its independence from the $\{X_n\}$ as follows:
$$g_S(s) = E\left(s^{\sum_{k=1}^{N} X_k}\right) = P(N = 0) + \sum_{n=1}^{\infty} E\left(s^{\sum_{k=1}^{N} X_k} \,\middle|\, N = n\right) P(N = n)$$
$$= P(N = 0) + \sum_{n=1}^{\infty} E\left(s^{\sum_{k=1}^{n} X_k} \,\middle|\, N = n\right) P(N = n)$$
$$= P(N = 0) + \sum_{n=1}^{\infty} E\left(s^{\sum_{k=1}^{n} X_k}\right) P(N = n)$$
$$= P(N = 0) + \sum_{n=1}^{\infty} [g_X(s)]^n P(N = n) = g_N(g_X(s)).$$
One important use of this result is to give a proof of Wald's Lemma for this situation. Using the chain rule we see that
$$E(S) = g_S'(1) = g_N'(g_X(1))\, g_X'(1) = g_N'(1)\, g_X'(1) = E(N) E(X_1).$$
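A quick simulation makes the composition formula and Wald's Lemma for random sums concrete. The model below is an assumed toy example: $N \sim$ Poisson($2$) and $X_k$ geometric on $\{0, 1, 2, \ldots\}$ with success probability $0.4$, so $E(X_1) = 0.6/0.4 = 1.5$ and $E(S)$ should be near $E(N)E(X_1) = 3$.

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 100_000

# N ~ Poisson(2); each X_k counts failures before the first success,
# a geometric on {0,1,2,...} (NumPy's geometric lives on {1,2,...},
# hence the "- 1").
N = rng.poisson(2.0, size=trials)
S = np.array([(rng.geometric(0.4, size=n) - 1).sum() for n in N])

print(S.mean())                 # ~ 3.0 = E(N) E(X_1), as Wald's Lemma predicts

# The identity g_S(s) = g_N(g_X(s)) can also be checked at a point:
s = 0.7
gX = 0.4 / (1 - 0.6 * s)        # pgf of the geometric at s
print((s ** S).mean())          # Monte Carlo estimate of g_S(0.7)
print(np.exp(2.0 * (gX - 1)))   # g_N(g_X(0.7)), with g_N the Poisson(2) pgf
```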
Continuity Theorem: Suppose that $\{X_n : n = 1, 2, 3, \ldots\}$ are finite non-negative integer valued random variables so that $P(X_n = k) = p_k^{(n)}$, for $n = 1, 2, 3, \ldots$, $k = 0, 1, 2, \ldots$, with $\sum_{k=0}^{\infty} p_k^{(n)} = 1$ for $n = 1, 2, 3, \ldots$. Let $g_n$ be the pgf for the random variable $X_n$. Then there exists a sequence $\{p_k\}$ such that
$$\lim_{n \to \infty} p_k^{(n)} = p_k \text{ for } k = 0, 1, 2, \ldots$$
if and only if there is a function $g(s)$ defined for $0 < s < 1$ such that
$$\lim_{n \to \infty} g_n(s) = \lim_{n \to \infty} \sum_{k=0}^{\infty} s^k p_k^{(n)} = g(s) \text{ for } 0 < s < 1.$$
In this case, $g(s) = \sum_{k=0}^{\infty} s^k p_k$. Moreover,
$$\sum_{k=0}^{\infty} p_k = 1 \text{ iff } \lim_{s \uparrow 1} g(s) = 1.$$
Proof: First suppose that there exists a sequence $\{p_k\}$ such that
$$\lim_{n \to \infty} p_k^{(n)} = p_k \text{ for } k = 0, 1, 2, \ldots.$$
Note that in this case we have $0 \le p_k \le 1$ for $k = 0, 1, 2, \ldots$ and, in fact, $\sum_{k=0}^{\infty} p_k \le 1$. Hence, $g(s) = \sum_{k=0}^{\infty} s^k p_k$ is actually well defined for $0 \le s \le 1$. So, given $s \in (0, 1)$ and $\varepsilon > 0$, choose $K$ so that $\sum_{k=K+1}^{\infty} s^k < \varepsilon/2$. Then observe that
$$|g_n(s) - g(s)| \le \sum_{k=0}^{K} |p_k^{(n)} - p_k| + \sum_{k=K+1}^{\infty} s^k.$$
Therefore, if we choose $M$ so that whenever $n \ge M$ we have
$$\sum_{k=0}^{K} |p_k^{(n)} - p_k| < \varepsilon/2,$$
we see that if $n \ge M$ then $|g_n(s) - g(s)| < \varepsilon$.
For the converse, assume that there is a function $g(s)$ defined for $0 < s < 1$ such that
$$\lim_{n \to \infty} g_n(s) = \lim_{n \to \infty} \sum_{k=0}^{\infty} s^k p_k^{(n)} = g(s) \text{ for } 0 < s < 1.$$
Suppose that $\{p_k^{(n')}\}$ is a convergent subsequence of $\{p_k^{(n)}\}$, i.e.
$$\lim_{n' \to \infty} p_k^{(n')} = p_k \text{ exists for all } k.$$
Then
$$\lim_{n' \to \infty} g_{n'}(s) = g(s),$$
so that $g$ is the generating function of $\{p_k\}$. Consequently, every convergent subsequence of $\{p_k^{(n)}\}$ must have the same limit, namely $\{p_k\}$. By using a diagonal argument one can show that, in fact, every subsequence of $\{p_k^{(n)}\}$ has a further subsequence that does converge. Therefore, the original sequence $\{p_k^{(n)}\}$ must be convergent, with $\lim_{n \to \infty} p_k^{(n)} = p_k$ for $k = 0, 1, 2, \ldots$.
The result
$$\sum_{k=0}^{\infty} p_k = 1 \text{ iff } \lim_{s \uparrow 1} g(s) = 1$$
is a direct consequence of the fact that
$$\lim_{s \uparrow 1} g(s) = \sum_{k=0}^{\infty} p_k.$$
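As an illustration (a standard example, assumed here rather than taken from the statement): if $X_n \sim \text{Binomial}(n, \lambda/n)$, then $g_n(s) = (1 - \lambda/n + (\lambda/n)s)^n \to e^{\lambda(s-1)}$ for $0 < s < 1$, so the theorem says the pmfs converge to the Poisson($\lambda$) pmf, and since $\lim_{s \uparrow 1} e^{\lambda(s-1)} = 1$ no mass escapes to infinity. A few lines of Python show the pointwise convergence of the pgfs:

```python
import math

lam, s = 1.5, 0.7

for n in (10, 100, 1000, 10000):
    gn = (1 - lam / n + lam / n * s) ** n   # pgf of Binomial(n, lam/n) at s
    print(n, gn)

print('limit', math.exp(lam * (s - 1)))     # pgf of Poisson(1.5) at s
```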
Theorem: Let $N$ be a finite non-negative integer valued random variable so that $p_k = P(N = k)$ for $k = 0, 1, 2, \ldots$, and $\sum_{k=0}^{\infty} p_k = P(N < +\infty) = 1$. Let $g$ be the generating function of $N$, so that $g(s) = E(s^N) = \sum_{k=0}^{\infty} s^k p_k$ with $g(1) = 1$. If $0 \le p_1 < 1$ and $E(N) = g'(1) \le 1$, then there is no solution of the equation $g(s) = s$ in the interval $[0, 1)$. If $E(N) = g'(1) > 1$ (which implies that $0 \le p_1 < 1$), then there is a unique solution of the equation $g(s) = s$ in the interval $[0, 1)$.
Proof: Let $h(s) = g(s) - s$. Then
$$h''(s) = g''(s) = \sum_{k=2}^{\infty} k(k-1) s^{k-2} p_k \ge 0,$$
so that $h$ is convex with $h(1) = 0$. Moreover,
$$h'(s) = g'(s) - 1 = \sum_{k=1}^{\infty} k s^{k-1} p_k - 1 \le E(N) - 1$$
for $s \in [0, 1)$. Hence, if $0 \le p_1 < 1$ and $E(N) \le 1$, then $h'(s) < 0$ for $s \in [0, 1)$ (strictly: if $p_k > 0$ for some $k \ge 2$ the inequality above is strict for $s < 1$, and otherwise $h'(s) = p_1 - 1 < 0$), and the equation $h(s) = 0$ has no solution in this interval. On the other hand, if $E(N) > 1$, then $h'(1) > 0$ and $h'(0) = p_1 - 1 < 0$ with $h(0) = p_0 \ge 0$. Thus, there is a unique solution of the equation $h(s) = 0$ in the interval $[0, 1)$.
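In the supercritical case $E(N) > 1$, the root in $[0, 1)$ is the extinction probability of the branching process with offspring pmf $\{p_k\}$. The sketch below uses an assumed offspring pmf with $E(N) = 1.25 > 1$ and locates the root of $h(s) = g(s) - s$ by bisection; for this pmf the equation can be solved by hand and the root is $1/2$.

```python
# Assumed offspring pmf: p_0 = 1/4, p_1 = 1/4, p_2 = 1/2, so E(N) = 1.25.
p = [0.25, 0.25, 0.5]

def h(s):
    # h(s) = g(s) - s; h(0) = p_0 > 0 and h(s) < 0 just below 1.
    return sum(pk * s**k for k, pk in enumerate(p)) - s

lo, hi = 0.0, 1.0 - 1e-9        # keep the right endpoint below 1
for _ in range(60):
    mid = (lo + hi) / 2
    if h(mid) > 0:              # invariant: h(lo) > 0 >= h(hi)
        lo = mid
    else:
        hi = mid

print(lo)   # ~ 0.5, the unique solution of g(s) = s in [0, 1)
```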
Question: What happens if $p_1 = 1$?