Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Continuous Random Variables: Joint PDFs,
Conditioning, Expectation and Independence
Berlin Chen
Department of Computer Science & Information Engineering
National Taiwan Normal University
Reference:
- D. P. Bertsekas, J. N. Tsitsiklis, Introduction to Probability , Sections 3.4-3.6
Multiple Continuous Random Variables (1/2)
• Two continuous random variables X and Y associated
with a common experiment are jointly continuous and can
be described in terms of a joint PDF f X ,Y satisfying
P X , Y B
–
f X ,Y
f X ,Y x , y dxdy
x , y B
is a nonnegative function
– Normalization Probability
f X ,Y x , y dxdy 1
• Similarly, f X ,Y a , c can be viewed as the “probability per
unit area” in the vicinity of a, c
P a X a , c Y c
aa cc f X ,Y x , y dxdy f X ,Y a , c 2
– Where is a small positive number
Probability-Berlin Chen 2
Multiple Continuous Random Variables (2/2)
•
Marginal Probability
P X A P X A and Y ,
X A f X ,Y x , y dydx
– We have already defined that
P X A
•
X A
f X x dx
We thus have the marginal PDF
f X x f X ,Y x , y dy
Similarly
f Y y f X ,Y x , y dx
Probability-Berlin Chen 3
An Illustrative Example
•
Example 3.10. Two-Dimensional Uniform PDF. We are told that
the joint PDF of the random variables X and Y is a constant c
on an area S and is zero outside. Find the value of c and the
marginal PDFs of X and Y .
The correspond ing uniform joint PDF on
an area S is defined to be (cf. Example 3.9)
1
,
f X ,Y x,y Size of area S
0,
f X ,Y x,y
if x,y S
otherwise
1
for x,y S
4
for 1 x 2
for 1 y 2
f X x 14 f X,Y x,y dy
1
3
14 dy
4
4
for 2 x 3
f Y y 12 f X,Y x,y dx
1
1
12 dx
4
4
for 2 y 3
f X x 23 f X,Y x,y dy
1
1
23 dy
4
4
f Y y 13 f X,Y x,y dx
1
1
13 dx
4
2
for 3 y 4
f Y y 12 f X,Y x,y dx
1
1
12 dx
4
4
Probability-Berlin Chen 4
Joint CDFs
• If X and Y are two (either continuous or discrete)
random variables associated with the same experiment ,
their joint cumulative distribution function (Joint CDF) is
defined by
F X ,Y x , y P X x , Y y
– If X and Y further have a joint PDF f X ,Y ( X and Y are
continuous random variables) , then
F X ,Y x , y x y f X ,Y s , t dsdt
And
f X ,Y x , y
If
F X ,Y x , y
2
xy
F X ,Y can be differentiated at the point x, y
Probability-Berlin Chen 5
An Illustrative Example
• Example 3.12. Verify that if X and Y are described by a
uniform PDF on the unit square, then the joint CDF is
given by
F X ,Y x , y P X x , Y y xy , for 0 x,y 1
Y
2 FX ,Y x, y
xy
0,1
1,1
0,0
1,0
X
1 f X ,Y x, y , for all x, y in the unit square
Probability-Berlin Chen 6
Expectation of a Function of Random Variables
• If X and Y are jointly continuous random variables,
and g is some function, then Z g X , Y is also a
random variable (can be continuous or discrete)
– The expectation of Z can be calculated by
E Z E g X , Y g x , y f X ,Y x , y dxdy
– If Z is a linear function of X and Y , e.g., Z aX bY , then
E Z E aX bY a E X b E Y
• Where
a and b are scalars
We will see in Section 4.1 methods for computing the PDF of (if it has one).
Z
Probability-Berlin Chen 7
More than Two Random Variables
• The joint PDF of three random variables X , Y and Z
is defined in analogy with the case of two random
variables
P X , Y , Z B
f X ,Y , Z x , y , z dxdydz
X ,Y , Z B
– The corresponding marginal probabilities
f X ,Y x , y f X ,Y , Z x , y , z dz
f X x f X ,Y , Z x , y , z dydz
• The expected value rule takes the form
E g X , Y , Z g x , y , z f X ,Y , Z x , y , z dxdy dz
– If g is linear (of the form aX bY cZ ), then
E aX bY cZ a E X b E Y c E Z
Probability-Berlin Chen 8
Conditioning PDF Given an Event (1/3)
• The conditional PDF of a continuous random variable X ,
given an event A
– If A cannot be described in terms of X , the conditional PDF
is defined as a nonnegative function f X A x satisfying
P X B A B f X
A
x dx
• Normalization property
fX
A
x dx
1
Probability-Berlin Chen 9
Conditioning PDF Given an Event (2/3)
– If A can be described in terms of X ( A is a subset of the real
line with P X A 0 ), the conditional PDF is defined as a
nonnegative function f X A x satisfying
fX
f X x
,
A x P X A
0,
if x A
otherwise
• The conditional PDF is zero outside the
conditioning event
and for any subset
P X B X
fX
B
P X
A
remains the same shape as
f X except that it is scaled along
the vertical axis
B, X A
P X A
A B f X x dx
P X A
A B f X A x dx
– Normalization Property f X
A
x dx
A f X
A
A
x dx
1
Probability-Berlin Chen 10
Conditioning PDF Given an Event (3/3)
• If A1 , A2 , , An are disjoint events with P Ai 0 for
each i , that form a partition of the sample space, then
n
f X x P A i f X
i 1
Ai
x
– Verification of the above total probability theorem
think of as an event , X x
B
P X x P Ai P X x Ai
n
i 1
x
n
f X t dt P Ai x f X
i 1
Ai
and use the total probability theorem from Chapter 1
t dt
Taking the derivative of both sides with respect to x
n
f X x P Ai f X
i 1
Ai
x
Probability-Berlin Chen 11
Illustrative Examples (1/2)
• Example 3.13. The exponential random variable is
memoryless.
– The time T until a new light bulb burns out is exponential
distribution. John turns the light on, leave the room, and when he
returns, t time units later, find that the light bulb is still on, which
corresponds to the event A={T>t}
– Let X be the additional time until the light bulb burns out. What is
the conditional PDF of X given A ?
X T t , A T t
T is exponential
e t , t 0
f T t
otherwise
0,
PT t e t
The conditional CDF of X given A is defined by The conditiona l PDF of X given
P X x A PT t x T t (where x 0)
PT t x T t
PT t x
PT t
e t x
e t
e x
PT t x and T t
PT t
the event A is also exponentia l
with parameter .
Probability-Berlin Chen 12
Illustrative Examples (2/2)
•
Example 3.14. The metro train arrives at the station near your home
every quarter hour starting at 6:00 AM. You walk into the station
every morning between 7:10 and 7:30 AM, with the time in this
interval being a uniform random variable. What is the PDF of the
time you have to wait for the first train to arrive?
- The arrival time, denoted by X , is a uniform random
variable over the interval 7 : 10 to 7 : 30
- Let random varible Y model the waiting time
- Let A be a event
A 7 : 10 X 7 : 15 (You board the 7 : 15 train)
1/20
- Let B be a event
B 7 : 15 X 7 : 30 (You board the 7 : 30 train)
- Let Y be uniform conditioned on A
- Let Y be uniform conditioned on B
For 0 y 5, PY y
Total Probability theorem:
PY y P A PY
A
y P B PY B y
1 1 3 1
1
4 5 4 15 10
1
3 1
1
For 5 y 15, PY y 0
4
4 15 20
Probability-Berlin Chen 13
Conditioning one Random Variable on Another
• Two continuous random variables X and Y have a joint
PDF. For any y with fY y 0 , the conditional PDF of X
given that Y y is defined by
f X Y x y
f X ,Y x , y
fY y
– Normalization Property
f X Y x y dx 1
• The marginal, joint and conditional PDFs are related to
each other by the following formulas
f X ,Y x , y f Y y f X Y x y ,
f X x f X ,Y x , y dy .
marginalization
Probability-Berlin Chen 14
Illustrative Examples (1/2)
• Notice that the conditional PDF f X Y x y has the same
shape as the joint PDF f X ,Y x , y , because the
normalizing factor f Y y does not depend on x
f X Y x 3 . 5
f X Y x 3 . 5
f X ,Y x ,3 .5
f Y 3 .5
f X ,Y x , 2 .5
f Y 2 .5
f X Y x 3 .5
f X ,Y x ,1 .5
f Y 1 .5
1/ 4
1
1/ 4
1/ 4
1/ 2
1/ 2
1/ 4
1
1/ 4
cf. example 3.13
Figure 3.16: Visualization of the conditional PDF f X Y x y .
Let X , Y have a joint PDF which is uniform on the set S . For
each fixed y , we consider the joint PDF along the slice Y y
and normalize it so that it integrates to 1
Probability-Berlin Chen 15
Illustrative Examples (2/2)
•
Example 3.15. Circular Uniform PDF. Ben throws a dart at a
circular target of radius r . We assume that he always hits the target,
and that all points of impact x , y are equally likely, so that the
joint PDF f X ,Y x, y of the random variables x and y is uniform
– What is the marginal PDF f Y y
1
, if x , y is in the circle
f X ,Y x , y area of the circle
0,
otherwise
1
2
2
2
2, x y r
πr
0,
otherwise
fY y
f X,Y x,y dx x 2 y 2 r 2
1
πr 2
2
2
x 2 y 2 r 2 1dx
1
πr 2
r y , if y r
2
fX
1
πr
2
Y x y
f X,Y x,y
fY y
dx
r2 y2
r2 y
1dx
2
2
πr
(Notice here that PDF f Y y is not uniform)
1
πr 2
2
πr 2
r2 y2
1
2
r
2
y
For each value y , f X
2
Y
,
x y
if x
2
y2 r2
is uniform
Probability-Berlin Chen 16
Conditional Expectation Given an Event
• The conditional expectation of a continuous random
variable X , given an event A ( P A 0 ), is defined by
E X A xf
X A
x dx
– The conditional expectation of a function
form
Eg X A g x f X
– Total Expectation Theorem
n
EX P Ai E X Ai
and
i 1
A
g X also has the
x dx
Eg X P Ai Eg X Ai
n
i 1
A1 , A2 , , An
• Where
are disjoint events with P Ai 0 for
each i , that form a partition of the sample space
Probability-Berlin Chen 17
An Illustrative Example
•
Example 3.17. Mean and Variance of a Piecewise Constant PDF.
Suppose that the random variable X has the piecewise constant
PDF
if 0 x 1,
1 / 3,
f X x 2 / 3,
0,
if 1 x 2 ,
otherwise.
Define event A1 X lies in the first interval [0,1]
event A2 X lies in the second interval [1,2]
P A1 01 1 / 3 dx 1 / 3, P A2 12 2 / 3 dx 2 / 3
fX
f X x
1, 0 x 1
P
X
A
x
1
A1
0,
otherwise
Recall that the mean and second moment of
a uniform random variable over an interval
[ a, b ] is a b / 2 and a 2 ab b 2 / 3
E X A 3 / 2 , E X
E X A1 1 / 2 , E X
2
A1 1 / 3
2
2
A2 7 / 3
fX
f X x
1, 1 x 2
P
X
A
x
2
A2
0,
otherwise
E X P A1 E X A1 P A2 E X A2
1 / 3 1 / 2 2 / 3 3 / 2 7 / 6
P A E X
E X
2
1
2
A1 P A2 E X
2
A2
1 / 3 1 / 3 2 / 3 7 / 3 15 / 9
var X 15 / 9 7 / 6 2 11 / 36
Probability-Berlin Chen 18
Conditional Expectation Given a Random Variable
• The properties of unconditional expectation carry though,
with the obvious modifications, to conditional expectation
E X Y y xf X Y x y dx
E g X Y y g x f X Y x y dx
E g X , Y Y y g x , y f X Y x y dx
Probability-Berlin Chen 19
Total Probability/Expectation Theorems
• Total Probability Theorem
– For any event A and a continuous random variable Y
P A
P A Y y f Y y dy
• Total Expectation Theorem
– For any continuous random variables X and Y
E X E X Y y f Y y dy
E g X E g X Y y f Y y dy
E g X , Y E g X , Y Y y f Y y dy
Probability-Berlin Chen 20
Independence
• Two continuous random variables X and Y are
independent if
f X ,Y x , y f X x f Y y , for all x,y
– Since that
f X ,Y x , y f Y y f X Y x y f X x f Y
X
y x
• We therefore have
f X Y x y f X x , for all x and all y with fY y 0
• Or
fY X y x fY y , for all y and all x with f X x 0
Probability-Berlin Chen 21
More Factors about Independence (1/2)
• If two continuous random variables X and Y are
independent, then
– Any two events of the forms X Aand Y B are
independent
P X A , Y B x A y B f X ,Y x , y dydx
x A y B f X x f Y y dydx
x A f X x dx y B f Y y dy
P X A P Y B
– It also implies that
F X ,Y x , y P X x , Y y P X x P Y y F X x FY x
– The converse statement is also true (See the end-of-chapter
problem 28)
Probability-Berlin Chen 22
More Factors about Independence (2/2)
• If two continuous random variables X and Y are
independent, then
–
–
E XY E X E Y
var X Y var X var Y
– The random variables g X and hY are independent for any
functions g and h
• Therefore,
E g X h Y E g X E h Y
Probability-Berlin Chen 23
Recall: the Discrete Bayes’ Rule
• Let A1, A2 ,, An be disjoint events that form a partition of
the sample space, and assume that P Ai 0, for all i .
Then, for any event B such that P B 0 we have
P Ai B
P Ai P B Ai
P B
P Ai P B Ai
Multiplication rule
Total probability theorem
nk 1 P Ak P B Ak
P Ai P B Ai
P A1 P B A1 P An P B An
Probability-Berlin Chen 24
Inference and the Continuous Bayes’ Rule
• As we have a model of an underlying but unobserved
phenomenon, represented by a random variable X with
PDF f X , and we make a noisy measurement Y , which
is modeled in terms of a conditional PDF f Y X . Once the
experimental value of Y is measured, what information
does this provide on the unknown value of X ?
X
f X x
Y
Measurement
fY
f X Y x y
X
y x
Inference
f X Y x y
y x
fY y
f X t f Y X y t dt
f X ,Y x , y
f X x fY
X
Note that
f X fY
X
f X ,Y f Y f X
Y
Probability-Berlin Chen 25
Inference and the Continuous Bayes’ Rule (2/2)
Inference about a Discrete Random Variable
• If the unobserved phenomenon is inherently discrete
– Let N is a discrete random variable of the form N n that
represents the different discrete probabilities for the unobserved
phenomenon of interest, and p N be the PMF of N
P N n Y y P N n y Y y
P N n P y Y y N n
P y Y y
p N n f Y
N
y n
f Y y
y n
i f y i
p N n f Y
p
i
N
N
Total probability theorem
Y N
Probability-Berlin Chen 26
Illustrative Examples (1/2)
•
Example 3.19. A lightbulb produced by the General Illumination
Company is known to have an exponentially distributed lifetime Y .
However, the company has been experiencing quality control
problems. On any given day, the parameter of the PDF of Y
is actually a random variable, uniformly distributed in the interval
1, 3 / 2 .
– If we test a lightbulb and record its lifetime ( Y y ), what can
we say about the underlying parameter ?
f Y y e y , y 0, 0
Conditioned on , Y has a exponential distribution
with parameter
2, for 1 3 / 2
f
0, otherwise
f Y y
f f Y y
3/ 2
1
f t f Y
y t dt
2 e y
3/ 2
ty
1 2te dt
,
for 1 λ 3 / 2
Probability-Berlin Chen 27
Illustrative Examples (2/2)
•
Example 3.20. Signal Detection. A binary signal S is transmitted,
and we are given that P S 1 p and P S 1 1 p .
– The received signal is Y S N , where N is a normal noise
with zero mean and unit variance , independent of S .
S 1 , as a function of the observed value
– What is the probability that
y of Y ?
fY
S y s
Conditioned on S
2
1
e y s / 2 , for s 1 and -1, and - y
2
s,Y
P S 1 Y y
has a normal distribution with mean
p S 1 f Y
S
y 1
fY y
p S 1 f Y
p S 1 f Y
S
S
s
and unit variance
y 1
y 1 p S 1 fY S y 1
2
1
e y 1 / 2
2
2
2
1
1
e y 1 / 2 1 p
e y 1 / 2
2
2
p
p
e
y 2 1 / 2
e y
y
2
1 / 2
pe e
pe y
1 p e y
y 2 1 / 2
pe y
pe y 1 p e y
Probability-Berlin Chen 28
Inference Based on a Discrete Random Variable
• The earlier formula expressing P A Y y in terms of
f Y A y can be turned around to yield
fY
A
y
P A fY
A
y
P A f Y
f Y y P A Y y
P A
?
f Y y P A Y y
f Y t P A Y t dt
f Y y P A Y y
y
dy
A
f Y y P A Y y dy
P A f Y y P A Y y dy ( normalizat ion property : f Y
A
y dy 1)
Probability-Berlin Chen 29
Recitation
• SECTION 3.4 Joint PDFs of Multiple Random Variables
– Problems 15, 16
• SECTION 3.5 Conditioning
– Problems 18, 20, 23, 24
• SECTION 3.6 The Continuous Bayes’ Rule
– Problems 34, 35
Probability-Berlin Chen 30