LECTURE 4
Pairs of Random Variables
4.1 TWO RANDOM VARIABLES
Let (Ω, F, P) be a probability space and consider two measurable mappings
X : Ω → ℝ,
Y : Ω → ℝ.
In other words, X and Y are two random variables defined on the common probability
space (Ω, F, P). In order to specify the random variables, we need to determine
P{(X, Y) ∈ A} = P({ω ∈ Ω : (X(ω), Y(ω)) ∈ A})
for every Borel set A ⊆ ℝ². By the properties of the Borel σ-field, it can be shown that it
suffices to determine the probabilities of the form
P{a < X < b, c < Y < d},
a < b, c < d,
or equivalently, the probabilities of the form
P{X ≤ x, Y ≤ y},
x, y ∈ ℝ.
The latter defines their joint cdf
F_{X,Y}(x, y) = P{X ≤ x, Y ≤ y},   x, y ∈ ℝ,
which is the probability of the shaded region shown in Figure 4.1.
The joint cdf satisfies the following properties:
1. F_{X,Y}(x, y) ≥ 0.
2. If x1 ≤ x2 and y1 ≤ y2, then
   F_{X,Y}(x1, y1) ≤ F_{X,Y}(x2, y2).
3. Limits:
   lim_{x,y→∞} F_{X,Y}(x, y) = 1,
   lim_{y→−∞} F_{X,Y}(x, y) = lim_{x→−∞} F_{X,Y}(x, y) = 0.
Figure 4.1. An illustration of the joint cdf of X and Y: the shaded region is {(u, v) : u ≤ x, v ≤ y}.
4. Marginal cdfs:
   lim_{y→∞} F_{X,Y}(x, y) = F_X(x),
   lim_{x→∞} F_{X,Y}(x, y) = F_Y(y).
The probability of any (Borel) set can be determined from the joint cdf. For example,
P{a < X ≤ b, c < Y ≤ d} = F_{X,Y}(b, d) − F_{X,Y}(a, d) − F_{X,Y}(b, c) + F_{X,Y}(a, c),
as illustrated in Figure 4.2.
Figure 4.2. An illustration of P{a < X ≤ b, c < Y ≤ d}.

We say that X and Y are statistically independent, or independent for short, if for every
(Borel) A and B,
P{X ∈ A, Y ∈ B} = P{X ∈ A} P{Y ∈ B}.
Equivalently, X and Y are independent if
F_{X,Y}(x, y) = F_X(x) F_Y(y),   x, y ∈ ℝ.
In the following, we focus on three special cases and discuss the joint, marginal, and
conditional distributions of X and Y for each case.
• X and Y are discrete.
• X and Y are continuous.
• X is discrete and Y is continuous (mixed).
4.2 PAIRS OF DISCRETE RANDOM VARIABLES
Let X and Y be discrete random variables on the same probability space. They are completely specified by their joint pmf
p_{X,Y}(x, y) = P{X = x, Y = y},   x ∈ X, y ∈ Y.
By the axioms of probability,
∑_{x∈X} ∑_{y∈Y} p_{X,Y}(x, y) = 1.
We use the law of total probability to find the marginal pmf of X:
p_X(x) = ∑_{y∈Y} p_{X,Y}(x, y),   x ∈ X.
The conditional pmf of X given Y = y is defined as
p_{X|Y}(x | y) = p_{X,Y}(x, y) / p_Y(y),   p_Y(y) ≠ 0, x ∈ X.
Check that if p_Y(y) ≠ 0, then p_{X|Y}(x|y) is a pmf for X.
Example 4.1. Consider the pmf p(x, y) described by the following table:

  p(x, y)            x
               0      1      2.5
        2     1/8    1/8     0
  y    −1     0      1/4     1/8
       −3     1/8    0       1/4

Then,
p_X(x) = 1/4 for x = 0,   3/8 for x = 1,   3/8 for x = 2.5,
and
p_{X|Y}(x | 2) = 1/2 for x = 0,   1/2 for x = 1.
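For readers who like to check such computations numerically, here is a minimal Python sketch (NumPy and the variable names are our own choices, not part of the lecture) that stores the joint pmf table of Example 4.1 as an array and recovers the marginal p_X and the conditional p_{X|Y}(· | 2) by summing and normalizing.

```python
import numpy as np

# Joint pmf of Example 4.1: rows indexed by y in (2, -1, -3), columns by x in (0, 1, 2.5).
xs = [0, 1, 2.5]
ys = [2, -1, -3]
p_xy = np.array([
    [1/8, 1/8, 0.0],   # y =  2
    [0.0, 1/4, 1/8],   # y = -1
    [1/8, 0.0, 1/4],   # y = -3
])

assert np.isclose(p_xy.sum(), 1.0)               # axioms of probability

p_x = p_xy.sum(axis=0)                            # marginal of X: sum over y
p_y = p_xy.sum(axis=1)                            # marginal of Y: sum over x
print({x: float(v) for x, v in zip(xs, p_x)})     # {0: 0.25, 1: 0.375, 2.5: 0.375}

# Conditional pmf of X given Y = 2: normalize the y = 2 row by p_Y(2).
row = ys.index(2)
p_x_given_y2 = p_xy[row] / p_y[row]
print({x: float(v) for x, v in zip(xs, p_x_given_y2)})   # {0: 0.5, 1: 0.5, 2.5: 0.0}
```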
Chain rule. p_{X,Y}(x, y) = p_X(x) p_{Y|X}(y|x) = p_Y(y) p_{X|Y}(x|y).
Independence. X and Y are independent if
p_{X,Y}(x, y) = p_X(x) p_Y(y),   x ∈ X, y ∈ Y,
which is equivalent to
p_{X|Y}(x | y) = p_X(x),   x ∈ X, p_Y(y) ≠ 0.
Law of total probability. For any event A,
P(A) = ∑_x p_X(x) P(A | X = x).
Bayes rule. Given p_X(x) and p_{Y|X}(y|x) for every (x, y) ∈ X × Y, we can find p_{X|Y}(x|y) as
p_{X|Y}(x | y) = p_{X,Y}(x, y) / p_Y(y)
             = p_X(x) p_{Y|X}(y|x) / p_Y(y)
             = p_X(x) p_{Y|X}(y|x) / ∑_{u∈X} p_{X,Y}(u, y)
             = p_X(x) p_{Y|X}(y|x) / ∑_{u∈X} p_X(u) p_{Y|X}(y|u).
The final formula is entirely in terms of the known quantities p X (x) and pY|X (y|x).
Example 4.2 (Binary symmetric channel). Consider the binary communication channel in Figure 4.3. The bit sent is X ∼ Bern(p), p ∈ [0, 1], and the bit received is
Y = (X + Z) mod 2 = X ⊕ Z,
where the noise Z ∼ Bern(є), є ∈ [0, 1/2], is independent of X. We find p_{X|Y}(x|y), p_Y(y), and P{X ≠ Y}, namely, the probability of error.

Figure 4.3. Binary symmetric channel: input X ∈ {0, 1}, noise Z ∈ {0, 1}, output Y ∈ {0, 1}.

First, we use the Bayes rule
p_{X|Y}(x | y) = p_{Y|X}(y|x) p_X(x) / ∑_{u∈X} p_{Y|X}(y|u) p_X(u).
We know p_X(x), but we need to find p_{Y|X}(y|x):
p_{Y|X}(y|x) = P{Y = y | X = x} = P{X ⊕ Z = y | X = x}
            = P{x ⊕ Z = y | X = x} = P{Z = y ⊕ x | X = x}
            = P{Z = y ⊕ x}   (since Z and X are independent)
            = p_Z(y ⊕ x).
Therefore
p_{Y|X}(0 | 0) = p_Z(0 ⊕ 0) = p_Z(0) = 1 − є,
p_{Y|X}(0 | 1) = p_Z(0 ⊕ 1) = p_Z(1) = є,
p_{Y|X}(1 | 0) = p_Z(1 ⊕ 0) = p_Z(1) = є,
p_{Y|X}(1 | 1) = p_Z(1 ⊕ 1) = p_Z(0) = 1 − є.
Plugging into the Bayes rule equation, we obtain
p_{X|Y}(0|0) = p_{Y|X}(0|0) p_X(0) / [p_{Y|X}(0|0) p_X(0) + p_{Y|X}(0|1) p_X(1)] = (1 − є)(1 − p) / [(1 − є)(1 − p) + єp],
p_{X|Y}(1|0) = 1 − p_{X|Y}(0|0) = єp / [(1 − є)(1 − p) + єp],
p_{X|Y}(0|1) = p_{Y|X}(1|0) p_X(0) / [p_{Y|X}(1|0) p_X(0) + p_{Y|X}(1|1) p_X(1)] = є(1 − p) / [є(1 − p) + (1 − є)p],
p_{X|Y}(1|1) = 1 − p_{X|Y}(0|1) = (1 − є)p / [є(1 − p) + (1 − є)p].
We already found p_Y(y) as
p_Y(y) = p_{Y|X}(y|0) p_X(0) + p_{Y|X}(y|1) p_X(1)
       = (1 − є)(1 − p) + єp   for y = 0,
         є(1 − p) + (1 − є)p   for y = 1.
Now to find the probability of error P{X ≠ Y}, consider
P{X ≠ Y} = p_{X,Y}(0, 1) + p_{X,Y}(1, 0)
         = p_{Y|X}(1|0) p_X(0) + p_{Y|X}(0|1) p_X(1)
= є(1 − p) + єp
= є.
Alternatively, P{X ≠ Y} = P{Z = 1} = є. An interesting special case is є = 1/2, whence
P{X ≠ Y} = 1/2, which is the worst possible (no information gets through), and
p_Y(0) = (1/2)p + (1/2)(1 − p) = 1/2 = p_Y(1).
Therefore Y ∼ Bern(1/2), regardless of the value of p. In this case, the bit sent X and the
bit received Y are independent (check!).
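The closed-form answers for the binary symmetric channel are easy to sanity-check by simulation. The sketch below (the particular values p = 0.3 and є = 0.1 are arbitrary illustration choices, not part of the lecture) estimates p_Y(0), P{X ≠ Y}, and p_{X|Y}(1|1) from random draws and prints them next to the formulas derived above.

```python
import numpy as np

rng = np.random.default_rng(0)
p, eps = 0.3, 0.1          # P{X = 1} and the crossover probability (illustration values)
n = 1_000_000

x = (rng.random(n) < p).astype(int)
z = (rng.random(n) < eps).astype(int)
y = (x + z) % 2            # Y = X xor Z

print(np.mean(y == 0), (1 - eps) * (1 - p) + eps * p)                        # p_Y(0)
print(np.mean(x != y), eps)                                                  # P{X != Y}
print(np.mean(x[y == 1]), (1 - eps) * p / ((1 - eps) * p + eps * (1 - p)))   # p_{X|Y}(1|1)
```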
4.3 PAIRS OF CONTINUOUS RANDOM VARIABLES
4.3.1 Joint and Marginal Densities
We say that X and Y are jointly continuous random variables if their joint cdf is continuous
in both x and y. In this case, we can define their joint pdf, provided that it exists, as the
function f_{X,Y}(x, y) such that
F_{X,Y}(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f_{X,Y}(u, v) du dv,   x, y ∈ ℝ.
If F_{X,Y}(x, y) is differentiable in x and y, then
f_{X,Y}(x, y) = ∂²F(x, y) / ∂x∂y
             = lim_{Δx,Δy→0} [F_{X,Y}(x + Δx, y + Δy) − F_{X,Y}(x + Δx, y) − F_{X,Y}(x, y + Δy) + F_{X,Y}(x, y)] / (Δx Δy)
             = lim_{Δx,Δy→0} P{x < X ≤ x + Δx, y < Y ≤ y + Δy} / (Δx Δy).
The joint pdf f_{X,Y}(x, y) satisfies the following properties:
1. f_{X,Y}(x, y) ≥ 0.
2. ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_{X,Y}(x, y) dx dy = 1.
3. The probability of any set A ⊂ ℝ² can be calculated by integrating the joint pdf over A:
   P{(X, Y) ∈ A} = ∫∫_{(x,y)∈A} f_{X,Y}(x, y) dx dy.
The marginal pdf of X can be obtained from the joint pdf via the law of total probability:
f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy.
To see this, recall that
g(x) = (d/dx) ∫_{−∞}^{x} g(u) du
and consider
F_X(x) = lim_{y→∞} F_{X,Y}(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{∞} f_{X,Y}(u, y) dy du
with g(u) = ∫_{−∞}^{∞} f_{X,Y}(u, y) dy. We then differentiate F_X(x) with respect to x.
Alternatively, we can consider the thin vertical strip {x < X ≤ x + Δx} in the (x, y) plane and note that
f_X(x) = lim_{Δx→0} P{x < X ≤ x + Δx} / Δx
       = lim_{Δx→0} (1/Δx) lim_{Δy→0} ∑_{n=−∞}^{∞} P{x < X ≤ x + Δx, nΔy < Y ≤ (n + 1)Δy}
       = lim_{Δx→0} (1/Δx) ∫_{−∞}^{∞} f_{X,Y}(x, y) dy Δx
       = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy.
Example 4.3. Let (X, Y) ∼ f(x, y), where
f(x, y) = c   if x ≥ 0, y ≥ 0, x + y ≤ 1,
          0   otherwise.
We find c, f_Y(y), and P{Y ≤ 2X}. Note first that
1 = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = ∫_0^1 ∫_0^{1−y} c dx dy = c ∫_0^1 (1 − y) dy = c/2.
Hence, c = 2. To find f_Y(y), we use the law of total probability:
f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx = ∫_0^{1−y} 2 dx = 2(1 − y)   for 0 ≤ y ≤ 1,
and f_Y(y) = 0 otherwise.
To find the probability of the set {Y ≤ 2X} = {X ≥ ½Y}, we first sketch the set: it is the part of the triangle lying to the right of the line x = ½y. From the sketch we find that
P{X ≥ ½Y} = ∫∫_{(x,y): x ≥ ½y} f_{X,Y}(x, y) dx dy = ∫_0^{2/3} ∫_{y/2}^{1−y} 2 dx dy = 2/3.
Recall that X and Y are independent if f_{X,Y}(x, y) = f_X(x) f_Y(y) for every x, y. But f_{X,Y}(0, 1) = 2 and f_X(0) f_Y(1) = 0, so X and Y are not independent. Alternatively, consider
f_{X,Y}(1/2, 1/2) = 2 ≠ 1 ⋅ 1 = f_X(1/2) f_Y(1/2)
to see that X and Y are dependent.
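As a numerical cross-check of Example 4.3 (an illustrative sketch, not part of the original notes), one can draw (X, Y) uniformly on the triangle by rejection sampling and estimate P{Y ≤ 2X} and a value of f_Y:

```python
import numpy as np

rng = np.random.default_rng(1)

# Rejection sampling: draw uniformly from the unit square, keep points with x + y <= 1.
pts = rng.random((2_000_000, 2))
pts = pts[pts.sum(axis=1) <= 1]        # uniform on the triangle, density c = 2

x, y = pts[:, 0], pts[:, 1]
print(np.mean(y <= 2 * x))             # ~ 2/3, matching P{Y <= 2X}

# Estimate f_Y(0.25) with a narrow histogram bin; the formula gives 2(1 - 0.25) = 1.5.
dy = 0.01
print(np.mean(np.abs(y - 0.25) < dy / 2) / dy)
```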
4.3.2 Conditional Densities
Let X and Y be continuous random variables with joint pdf f_{X,Y}(x, y). We wish to define
F_{Y|X}(y | X = x) = P{Y ≤ y | X = x}.
We cannot define the above conditional probability as
P{Y ≤ y, X = x} / P{X = x},
since both the numerator and the denominator are equal to zero. Instead, we define the
conditional probability for continuous random variables as a limit:
F_{Y|X}(y|x) = lim_{Δx→0} P{Y ≤ y | x < X ≤ x + Δx}
            = lim_{Δx→0} P{x < X ≤ x + Δx, Y ≤ y} / P{x < X ≤ x + Δx}
            = lim_{Δx→0} [∫_{−∞}^{y} ∫_{x}^{x+Δx} f_{X,Y}(u, v) du dv] / [f_X(x) Δx]
            = lim_{Δx→0} [∫_{−∞}^{y} f_{X,Y}(x, v) Δx dv] / [f_X(x) Δx]
            = ∫_{−∞}^{y} [f_{X,Y}(x, v) / f_X(x)] dv.
By differentiating w.r.t. y, we thus define the conditional pdf as
f_{Y|X}(y|x) = f_{X,Y}(x, y) / f_X(x)   if f_X(x) ≠ 0.
It can be readily checked that f_{Y|X}(y|x) is a valid pdf and
F_{Y|X}(y|x) = ∫_{−∞}^{y} f_{Y|X}(u|x) du
is a valid cdf for every x such that f_X(x) > 0. Sometimes we use the notation
Y | {X = x} ∼ f_{Y|X}(y|x)
to denote that Y has the conditional pdf f_{Y|X}(y|x) given X = x.
Chain rule. f_{X,Y}(x, y) = f_X(x) f_{Y|X}(y|x) = f_Y(y) f_{X|Y}(x|y).
Independence. X and Y are independent iff
f_{X,Y}(x, y) = f_X(x) f_Y(y),   x, y ∈ ℝ,
or equivalently,
f_{X|Y}(x | y) = f_X(x),   x ∈ ℝ, f_Y(y) ≠ 0.
Law of total probability. For any event A,
P(A) = ∫_{−∞}^{∞} f_X(x) P(A | X = x) dx.
Bayes rule. Given f_X(x) and f_{Y|X}(y|x), we can find
f_{X|Y}(x | y) = f_{Y|X}(y|x) f_X(x) / f_Y(y)
             = f_X(x) f_{Y|X}(y|x) / ∫_{−∞}^{∞} f_{X,Y}(u, y) du
             = f_X(x) f_{Y|X}(y|x) / ∫_{−∞}^{∞} f_X(u) f_{Y|X}(y|u) du.
Example 4.4. As in Example 4.3, let f(x, y) be defined as
f(x, y) = 2   if x ≥ 0, y ≥ 0, x + y ≤ 1,
          0   otherwise.
We already know that
f_Y(y) = 2(1 − y)   for 0 ≤ y ≤ 1,
         0          otherwise.
Therefore
f_{X|Y}(x | y) = f_{X,Y}(x, y) / f_Y(y) = 1/(1 − y)   for 0 ≤ y < 1, 0 ≤ x ≤ 1 − y,
               0   otherwise.
In other words, X | {Y = y} ∼ Unif[0, 1 − y].
Example 4.5. Let Λ ∼ Unif[0, 1] and the conditional pdf of X given {Λ = λ} be
f_{X|Λ}(x | λ) = λe^{−λx},   0 < λ ≤ 1, x ≥ 0,
i.e., X | {Λ = λ} ∼ Exp(λ). Then, by the Bayes rule,
f_{Λ|X}(λ | 3) = f_{X|Λ}(3|λ) f_Λ(λ) / ∫_0^1 f_Λ(u) f_{X|Λ}(3|u) du
             = λe^{−3λ} / [(1/9)(1 − 4e^{−3})]   for 0 < λ ≤ 1,
               0   otherwise.
While the prior f_Λ(λ) is flat on [0, 1], the posterior f_{Λ|X}(λ|3) peaks at λ = 1/3 with value ≈ 1.378 and drops to ≈ 0.56 at λ = 1.
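The normalizing constant ∫_0^1 u e^{−3u} du = (1 − 4e^{−3})/9 and the two posterior values quoted above can be verified numerically; a minimal sketch (the grid size and function names are our own choices):

```python
import numpy as np

# Midpoint Riemann sum for the normalizing constant \int_0^1 u e^{-3u} du.
u = (np.arange(100_000) + 0.5) / 100_000
Z = np.mean(u * np.exp(-3 * u))               # bin width 1/N times the sum
print(Z, (1 - 4 * np.exp(-3)) / 9)            # both ~ 0.08898

def post(lam):
    """Posterior pdf f_{Lambda|X}(lam | 3) on (0, 1]."""
    return lam * np.exp(-3 * lam) / Z

print(post(1 / 3), post(1.0))                 # ~ 1.378 and ~ 0.560, matching the values above
```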
4.4 PAIRS OF MIXED RANDOM VARIABLES
Let X be a discrete random variable with pmf p_X(x). For each x with p_X(x) ≠ 0, conditioned on the event {X = x}, let Y be a continuous random variable with conditional cdf F_{Y|X}(y|x) and conditional pdf
f_{Y|X}(y|x) = (∂/∂y) F_{Y|X}(y|x),
provided that the derivative is well-defined.
provided that the derivative is well-defined. Then, by the law of total probability,
F_Y(y) = P{Y ≤ y}
       = ∑_x P{Y ≤ y | X = x} p_X(x)
       = ∑_x F_{Y|X}(y|x) p_X(x)
       = ∑_x ∫_{−∞}^{y} f_{Y|X}(v|x) dv p_X(x),
and hence
f_Y(y) = (d/dy) F_Y(y) = ∑_x f_{Y|X}(y|x) p_X(x).
The conditional pmf of X given Y = y can be defined as a limit:
p_{X|Y}(x | y) = lim_{Δy→0} P{X = x, y < Y ≤ y + Δy} / P{y < Y ≤ y + Δy}
             = lim_{Δy→0} p_X(x) P{y < Y ≤ y + Δy | X = x} / P{y < Y ≤ y + Δy}
             = lim_{Δy→0} p_X(x) f_{Y|X}(y|x) Δy / [f_Y(y) Δy]
             = [f_{Y|X}(y|x) / f_Y(y)] p_X(x).
Chain rule. p_X(x) f_{Y|X}(y|x) = f_Y(y) p_{X|Y}(x|y).
Bayes rule. Given p_X(x) and f_{Y|X}(y|x),
p_{X|Y}(x | y) = p_X(x) f_{Y|X}(y|x) / ∑_u p_X(u) f_{Y|X}(y|u).
Example 4.6 (Additive Gaussian noise channel). Consider the communication channel in Figure 4.4. The signal transmitted is a binary random variable
X = +√P   w.p. p,
    −√P   w.p. 1 − p.
The received signal, also called the observation, is
Y = X + Z,
where Z ∼ N(0, N) is additive noise and independent of X.

Figure 4.4. Additive Gaussian noise channel: Y = X + Z with Z ∼ N(0, N).

First note that Y | {X = x} ∼ N(x, N). To see this, consider
P{Y ≤ y | X = x} = P{X + Z ≤ y | X = x}
= P{Z ≤ y − x | X = x}
= P{Z ≤ y − x},
which, by taking the derivative w.r.t. y, implies that
f_{Y|X}(y|x) = f_Z(y − x) = (1/√(2πN)) e^{−(y−x)²/(2N)}.
In other words,
Y | {X = +√P} ∼ N(+√P, N)   and   Y | {X = −√P} ∼ N(−√P, N).
Next, by the law of total probability,
f_Y(y) = p_X(+√P) f_{Y|X}(y | +√P) + p_X(−√P) f_{Y|X}(y | −√P)
       = p (1/√(2πN)) e^{−(y−√P)²/(2N)} + (1 − p)(1/√(2πN)) e^{−(y+√P)²/(2N)}.
Finally, by the Bayes rule,
p_{X|Y}(+√P | y) = [p (1/√(2πN)) e^{−(y−√P)²/(2N)}] / [p (1/√(2πN)) e^{−(y−√P)²/(2N)} + (1 − p)(1/√(2πN)) e^{−(y+√P)²/(2N)}]
                = p e^{y√P/N} / [p e^{y√P/N} + (1 − p) e^{−y√P/N}]
and
p_{X|Y}(−√P | y) = (1 − p) e^{−y√P/N} / [p e^{y√P/N} + (1 − p) e^{−y√P/N}].
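The simplified posterior is easy to code; the sketch below (function and parameter names are ours, and the values of p, P, N are arbitrary illustration choices) also cross-checks it against the unsimplified Bayes-rule expression with explicit Gaussian densities.

```python
import numpy as np

def posterior_plus(y, p=0.5, P=1.0, N=0.5):
    """p_{X|Y}(+sqrt(P) | y) for the additive Gaussian noise channel."""
    a = np.exp(y * np.sqrt(P) / N)            # e^{ y sqrt(P) / N }
    return p * a / (p * a + (1 - p) / a)      # simplified Bayes-rule expression

def gauss(y, mean, var):
    """Gaussian pdf with the given mean and variance."""
    return np.exp(-(y - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Cross-check against the unsimplified form at an arbitrary observation y.
y, p, P, N = 0.37, 0.5, 1.0, 0.5
num = p * gauss(y, +np.sqrt(P), N)
den = num + (1 - p) * gauss(y, -np.sqrt(P), N)
print(posterior_plus(y, p, P, N), num / den)  # the two values agree
```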
4.5 SIGNAL DETECTION
Consider the general digital communication system depicted in Figure 4.5, where the signal sent is X ∼ p_X(x), x ∈ X = {x1, x2, . . . , xn}, and the observation (received signal) is
Y | {X = x} ∼ f_{Y|X}(y|x).
A detector is a mapping d : ℝ → X that generates an estimate X̂ = d(Y) of the input signal X. We wish to find an optimal detector d*(y) that minimizes the probability of error
Pe := P{X̂ ≠ X} = P{d(Y) ≠ X}.
Figure 4.5. The general digital communication system: the signal X ∈ {x1, x2, . . . , xn} passes through a noisy channel f_{Y|X}(y|x), and the detector d(y) maps the observation Y to the estimate X̂ ∈ {x1, x2, . . . , xn}.
Suppose that there is no observation and we would like to find the best guess d* of X, that is,
d* = arg min_d P{d ≠ X}.
Clearly, the best guess is the one with the largest probability, that is,
d* = arg max_x p_X(x).   (4.1)
This simple observation continues to hold with the observation Y = y. Define the maximum a posteriori probability (MAP) detector as
d*(y) = arg max_x p_{X|Y}(x | y).   (4.2)
Then, by (4.1) applied to the conditional pmf p_{X|Y}(⋅ | y),
P{d*(y) ≠ X | Y = y} ≤ P{d(y) ≠ X | Y = y}
for any detector d(y). Consequently,
P{d*(Y) ≠ X} = ∫ P{d*(y) ≠ X | Y = y} f_Y(y) dy
            ≤ ∫ P{d(y) ≠ X | Y = y} f_Y(y) dy
            = P{d(Y) ≠ X},
and the MAP detector minimizes the probability of error Pe.
When X is uniform on {x1, x2, . . . , xn}, that is, p_X(x1) = p_X(x2) = ⋅⋅⋅ = p_X(xn) = 1/n,
p_{X|Y}(x | y) = p_X(x) f_{Y|X}(y|x) / f_Y(y) = f_{Y|X}(y|x) / (n f_Y(y)),
which depends on x only through the likelihood f_{Y|X}(y|x) for a given y. Then, the MAP detection rule in (4.2) simplifies to the maximum likelihood (ML) detection rule
d*(y) = arg max_x f_{Y|X}(y|x).
Example 4.7. We revisit the additive Gaussian noise channel in Example 4.6 with signal
X = +√P   w.p. 1/2,
    −√P   w.p. 1/2,
independent noise Z ∼ N(0, N), and output Y = X + Z. The MAP detector is
d*(y) = +√P   if p_{X|Y}(+√P | y) > p_{X|Y}(−√P | y),
        −√P   otherwise.
Since the two signals are equally likely, the MAP detection rule reduces to the ML detection rule
d*(y) = +√P   if f_{Y|X}(y | +√P) > f_{Y|X}(y | −√P),
        −√P   otherwise.
The two likelihoods f_{Y|X}(y | +√P) and f_{Y|X}(y | −√P) are Gaussian densities centered at +√P and −√P, respectively. From this observation and using the Gaussian pdf, the MAP/ML detector reduces to the minimum distance detector
d*(y) = +√P   if (y − √P)² < (y − (−√P))²,
        −√P   otherwise,
which further simplifies to
d*(y) = +√P   if y > 0,
        −√P   if y ≤ 0.
Now the minimum probability of error is
Pe* = P{d*(Y) ≠ X}
    = P{X = √P} P{d*(Y) = −√P | X = √P} + P{X = −√P} P{d*(Y) = √P | X = −√P}
    = (1/2) P{Y ≤ 0 | X = √P} + (1/2) P{Y > 0 | X = −√P}
    = (1/2) P{Z ≤ −√P} + (1/2) P{Z > √P}
    = Q(√(P/N)).
Note that the probability of error is a decreasing function of P/N, which is often called
the signal-to-noise ratio (SNR).
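A short Monte Carlo run confirms that the detector d*(y) (decide +√P iff y > 0) achieves Pe = Q(√(P/N)); here Q is evaluated through the standard identity Q(x) = ½ erfc(x/√2), and the values of P and N are arbitrary illustration choices.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(2)
P, N = 1.0, 0.5                               # signal power and noise variance (illustration)
n = 1_000_000

x = np.where(rng.random(n) < 0.5, +sqrt(P), -sqrt(P))
y = x + rng.normal(0.0, sqrt(N), size=n)      # Y = X + Z with Z ~ N(0, N)
x_hat = np.where(y > 0, +sqrt(P), -sqrt(P))   # minimum distance (sign) detector

q = 0.5 * erfc(sqrt(P / N) / sqrt(2))         # Q(sqrt(P/N))
print(np.mean(x_hat != x), q)                 # empirical error rate ~ Q(sqrt(P/N))
```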
4.6 FUNCTIONS OF TWO RANDOM VARIABLES
Let (X, Y) ∼ f(x, y) and let g(x, y) be a differentiable function. To find the pdf of Z = g(X, Y), we first find the inverse image of {Z ≤ z} to compute its probability expressed as a function of z, i.e., F_Z(z), and then take the derivative.
Example 4.8. Suppose that X ∼ f_X(x) and Y ∼ f_Y(y) are independent. Let Z = X + Y. Then
F_Z(z) = P{Z ≤ z}
       = P{X + Y ≤ z}
       = ∫_{−∞}^{∞} P{X + Y ≤ z | X = x} f_X(x) dx
       = ∫_{−∞}^{∞} P{x + Y ≤ z | X = x} f_X(x) dx
       = ∫_{−∞}^{∞} P{Y ≤ z − x} f_X(x) dx
       = ∫_{−∞}^{∞} F_Y(z − x) f_X(x) dx.
Taking the derivative w.r.t. z, we have
f_Z(z) = ∫_{−∞}^{∞} f_Y(z − x) f_X(x) dx,
which is the convolution of f_X(x) and f_Y(y). For example, if X ∼ N(μ_X, σ_X²) and Y ∼ N(μ_Y, σ_Y²) are independent, then it can be readily checked that
Z = X + Y ∼ N(μ_X + μ_Y, σ_X² + σ_Y²).
A similar result also holds for the sum of two independent discrete random variables (replacing pdfs with pmfs and integrals with sums). For example, if X ∼ Poisson(λ1) and Y ∼ Poisson(λ2) are independent, then Z = X + Y ∼ Poisson(λ1) ∗ Poisson(λ2) = Poisson(λ1 + λ2). The property that a random variable can be written as the sum of an arbitrary number of independent random variables from the same family, which holds for Gaussian and Poisson random variables, is referred to as infinite divisibility. For example, a Poisson(λ) r.v. can be written as the sum of any number of independent Poisson(λi) r.v.s, as long as ∑i λi = λ.
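The Poisson statement can be checked directly by convolving two (truncated) Poisson pmfs, as in the following sketch (λ1, λ2, and the truncation point are arbitrary choices):

```python
import numpy as np
from math import exp, factorial

def poisson_pmf(lam, kmax):
    """Poisson(lam) pmf on {0, ..., kmax}."""
    return np.array([exp(-lam) * lam ** k / factorial(k) for k in range(kmax + 1)])

lam1, lam2, kmax = 1.3, 2.1, 40
p1, p2 = poisson_pmf(lam1, kmax), poisson_pmf(lam2, kmax)

conv = np.convolve(p1, p2)[: kmax + 1]        # pmf of Z = X + Y on {0, ..., kmax}
direct = poisson_pmf(lam1 + lam2, kmax)
print(np.max(np.abs(conv - direct)))          # ~ 0: the convolution is Poisson(lam1 + lam2)
```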
It is sometimes easier to work with cdfs first to find the pdf of a function of X and Y (especially when g(x, y) is not differentiable).
Example 4.9 (Minimum and maximum). Let X ∼ f_X(x) and Y ∼ f_Y(y) be independent. Define
U = max{X, Y}   and   V = min{X, Y}.
To find the pdf of U, we first find its cdf
F_U(u) = P{U ≤ u} = P{X ≤ u, Y ≤ u} = F_X(u) F_Y(u).
Using the product rule for derivatives,
f_U(u) = f_X(u) F_Y(u) + f_Y(u) F_X(u).
Now to find the pdf of V, consider
P{V > v} = P{X > v, Y > v}  ⟹  1 − F_V(v) = (1 − F_X(v))(1 − F_Y(v)).
Thus
f_V(v) = f_X(v) + f_Y(v) − f_X(v) F_Y(v) − f_Y(v) F_X(v).
More generally, the joint pdf of (U , V ) can be found by similar arguments; see Problem ??.
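As a quick numerical check of these formulas (a sketch with X and Y taken i.i.d. Exp(1), an arbitrary choice not made in the example), compare histogram estimates of the densities of U and V with the expressions above:

```python
import numpy as np

rng = np.random.default_rng(3)
x, y = rng.exponential(1.0, 1_000_000), rng.exponential(1.0, 1_000_000)
u, v = np.maximum(x, y), np.minimum(x, y)

f = lambda t: np.exp(-t)                      # Exp(1) pdf
F = lambda t: 1 - np.exp(-t)                  # Exp(1) cdf

t, dt = 0.8, 0.01
fU = f(t) * F(t) + f(t) * F(t)                # f_X(t) F_Y(t) + f_Y(t) F_X(t)
fV = 2 * f(t) - fU                            # f_X + f_Y - f_X F_Y - f_Y F_X
print(np.mean(np.abs(u - t) < dt / 2) / dt, fU)   # histogram estimate vs. formula for f_U
print(np.mean(np.abs(v - t) < dt / 2) / dt, fV)   # histogram estimate vs. formula for f_V
```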
PROBLEMS
4.1. Geometric with conditions. Let X be a geometric random variable with pmf
p_X(k) = p(1 − p)^{k−1},   k = 1, 2, . . . .
Find and plot the conditional pmf p_X(k | A) = P{X = k | X ∈ A} if:
(a) A = {X > m} where m is a positive integer.
(b) A = {X < m}.
(c) A = {X is an even number}.
Comment on the shape of the conditional pmf of part (a).
4.2. Conditional cdf. Let A be a nonzero probability event. Show that
(a) P(A) = P(A | X ≤ x) F_X(x) + P(A | X > x)(1 − F_X(x)).
(b) F_X(x | A) = [P(A | X ≤ x) / P(A)] F_X(x).
4.3. Joint cdf or not. Consider the function
G(x, y) = 1   if x + y ≥ 0,
          0   otherwise.
Can G be a joint cdf for a pair of random variables? Justify your answer.
4.4. Time until the n-th arrival. Let the random variable N(t) be the number of packets arriving during time (0, t]. Suppose N(t) is Poisson with pmf
p_N(n) = ((λt)ⁿ / n!) e^{−λt}   for n = 0, 1, 2, . . . .
Let the random variable Y be the time to get the n-th packet. Find the pdf of Y .
4.5. Diamond distribution. Consider the random variables X and Y with the joint pdf
f_{X,Y}(x, y) = c   if |x| + |y| ≤ 1/√2,
                0   otherwise,
where c is a constant.
(a) Find c.
(b) Find f_X(x) and f_{X|Y}(x|y).
(c) Are X and Y independent random variables? Justify your answer.
(d) Define the random variable Z = |X| + |Y|. Find the pdf f_Z(z).
4.6. Coin with random bias. You are given a coin but are not told what its bias (probability of heads) is. You are told instead that the bias is the outcome of a random variable P ∼ U[0, 1]. To get more information about the coin bias, you flip it independently 10 times. Let X be the number of heads you get. Thus X ∼ Binom(10, P). Assuming that X = 9, find and sketch the a posteriori probability of P, i.e., f_{P|X}(p|9).
4.7. First available teller. Consider a bank with two tellers. The service times for the
tellers are independent exponentially distributed random variables X1 ∼ Exp(λ1 )
and X2 ∼ Exp(λ2 ), respectively. You arrive at the bank and find that both tellers
are busy but that nobody else is waiting to be served. You are served by the first
available teller once he/she is free.
(a) What is the probability that you are served by the first teller?
(b) Let the random variable Y denote your waiting time. Find the pdf of Y .
4.8. Optical communication channel. Let the signal input to an optical channel be given by
X = 1    with probability 1/2,
    10   with probability 1/2.
The conditional pmf of the output of the channel is Y | {X = 1} ∼ Poisson(1), i.e., Poisson with intensity λ = 1, and Y | {X = 10} ∼ Poisson(10).
(a) Show that the MAP rule reduces to
D(y) = 1    if y < y*,
       10   otherwise.
(b) Find y ∗ and the corresponding probability of error.
4.9. Iocane or Sennari. An absent-minded chemistry professor forgets to label two identical-looking bottles. One bottle contains a chemical named “Iocane” and
identically looking bottles. One bottle contains a chemical named “Iocane” and
the other bottle contains a chemical named “Sennari”. It is well known that the
radioactivity level of “Iocane” has the U[0, 1] distribution, while the radioactivity
level of “Sennari” has the Exp(1) distribution.
(a) Let X be the radioactivity level measured from one of the bottles. What is
the optimal decision rule (based on the measurement X) that maximizes the
chance of correctly identifying the content of the bottle?
(b) What is the associated probability of error?
4.10. Independence. Let X ∈ X and Y ∈ Y be two independent discrete random variables.
(a) Show that any two events A ⊆ X and B ⊆ Y are independent.
(b) Show that any two functions of X and Y separately are independent; that is, if
U = g(X) and V = h(Y), then U and V are independent.
4.11. Family planning. Alice and Bob choose a number X at random from the set {2, 3, 4} (so the outcomes are equally probable). If the outcome is X = x, they decide to have children until they have a girl or x children, whichever comes first. Assume that each child is a girl with probability 1/2 (independent of the number of children and gender of other children). Let Y be the number of children they will have.
(a) Find the conditional pmf p_{Y|X}(y|x) for all possible values of x and y.
(b) Find the pmf of Y .
4.12. Radar signal detection. The signal for a radar channel is S = 0 if there is no target and a random variable S ∼ N(0, P) if there is a target. Both occur with equal probability. Thus
S = 0,              with probability 1/2,
    X ∼ N(0, P),    with probability 1/2.
The radar receiver observes Y = S + Z, where the noise Z ∼ N(0, N) is independent of S. Find the optimal decoder for deciding whether S = 0 or S = X and its probability of error. Provide your answer in terms of intervals of y and provide
the boundary points of the intervals in terms of P and N.
4.13. Ternary signaling. Let the signal S be a random variable defined as follows:
S = −1   with probability 1/3,
     0   with probability 1/3,
    +1   with probability 1/3.
The signal is sent over a channel with additive Laplacian noise Z, i.e., Z is a Laplacian random variable with pdf
f_Z(z) = (λ/2) e^{−λ|z|},   −∞ < z < ∞.
The signal S and the noise Z are assumed to be independent and the channel output
is their sum Y = S + Z.
(a) Find f_{Y|S}(y|s) for s = −1, 0, +1. Sketch the conditional pdfs on the same
graph.
(b) Find the optimal decoding rule D(Y ) for deciding whether S is −1, 0 or +1.
Give your answer in terms of ranges of values of Y .
(c) Find the probability of decoding error for D(y) in terms of λ.
4.14. Signal or no signal. Consider a communication system that is operated only from time to time. When the communication system is in the “normal” mode (denoted by M = 1), it transmits a random signal S = X with
X = +1   with probability 1/2,
    −1   with probability 1/2.
When the system is in the “idle” mode (denoted by M = 0), it does not transmit any signal (S = 0). Both normal and idle modes occur with equal probability. Thus
S = X   with probability 1/2,
    0   with probability 1/2.
The receiver observes Y = S + Z, where the ambient noise Z ∼ U[−1, 1] is independent of S.
(a) Find and sketch the conditional pdf f_{Y|M}(y|1) of the receiver observation Y
given that the system is in the normal mode.
(b) Find and sketch the conditional pdf f_{Y|M}(y|0) of the receiver observation Y
given that the system is in the idle mode.
(c) Find the optimal decoder d ∗ (y) for deciding whether the system is normal or
idle. Provide the answer in terms of intervals of y.
(d) Find the associated probability of error.
4.15. Two independent uniform random variables. Let X and Y be independently and
uniformly drawn from the interval [0, 1].
(a) Find the pdf of U = max(X, Y ).
(b) Find the pdf of V = min(X, Y ).
(c) Find the pdf of W = U − V .
(d) Find the pdf of Z = (X + Y ) mod 1.
(e) Find the probability P{|X − Y | ≥ 1/2}.
4.16. Two independent Gaussian random variables. Let X and Y be independent Gaussian random variables, both with zero mean and unit variance. Find the pdf of
|X − Y |.
4.17. Functions of exponential random variables. Let X and Y be independent exponentially distributed random variables with the same parameter λ. Define the following three functions of X and Y:
U = max(X, Y ) ,
V = min(X, Y ) ,
W =U −V .
(a) Find the joint pdf of U and V .
(b) Find the joint pdf of V and W. Are they independent?
Hint: You can solve part (b) either directly by finding the joint cdf or by expressing
the joint pdf in terms of f_{U,V}(u, v) and using the result of part (a).
4.18. Maximal correlation. Let (X, Y) ∼ F_{X,Y}(x, y) be a pair of random variables.
(a) Show that
F_{X,Y}(x, y) ≤ min{F_X(x), F_Y(y)}.
Now let F(x) and G(y) be continuous and invertible cdfs and let X ∼ F(x).
(b) Find the cdf of Y = G⁻¹(F(X)).
(c) Show that F_{X,Y}(x, y) = min{F(x), G(y)}.