Math 425
Intro to Probability
Lecture 29

Kenneth Harris
[email protected]

Department of Mathematics
University of Michigan

March 30, 2009


Discrete Conditional Probability

Definition: Discrete Conditional Probability

Definition
Let X and Y be discrete random variables on the sample space.
The conditional probability mass function of X given Y = y is defined by

    p_{X|Y}(x|y) = p_{X,Y}(x,y) / p_Y(y),

provided p_Y(y) > 0.

Note. The definition of p_{X|Y}(x|y) is natural:

    p_{X|Y}(x|y) = p_{X,Y}(x,y) / p_Y(y)
                 = P{X = x, Y = y} / P{Y = y}
                 = P(X = x | Y = y).
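To make the definition concrete, here is a minimal Python sketch; the joint pmf table is hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical joint pmf p_{X,Y}: rows index values of X, columns values of Y.
p_xy = np.array([[0.10, 0.05, 0.05],
                 [0.20, 0.10, 0.10],
                 [0.10, 0.20, 0.10]])
assert np.isclose(p_xy.sum(), 1.0)

p_y = p_xy.sum(axis=0)              # marginal pmf of Y
j = 1                               # column index of the value y we condition on
p_x_given_y = p_xy[:, j] / p_y[j]   # p_{X|Y}(x|y) = p_{X,Y}(x,y) / p_Y(y)

print(p_x_given_y)                  # nonnegative and sums to 1 (a genuine pmf)
assert np.isclose(p_x_given_y.sum(), 1.0)
```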
Key Properties

Theorem
Let X and Y be discrete random variables on the sample space.
(a) (Probability) When p_Y(y) ≠ 0, p_{X|Y}(·|y) is a probability:
    (i)  p_{X|Y}(x|y) ≥ 0,
    (ii) 1 = Σ_{x : p_X(x) > 0} p_{X|Y}(x|y).
(b) (Event Rule) When p_Y(y) ≠ 0, for any event A,
    P(X ∈ A | Y = y) = Σ_{x ∈ A} p_{X|Y}(x|y).
(c) (Conditioning Rule)
    p_X(x) = Σ_{y : p_Y(y) > 0} p_{X|Y}(x|y) · p_Y(y).

Proof

(a). Part (i), p_{X|Y}(x|y) ≥ 0, is clear from the definition. For (ii),

    Σ_{x : p_X(x) > 0} p_{X|Y}(x|y) = Σ_{x : p_X(x) > 0} p_{X,Y}(x,y) / p_Y(y)
                                    = (1 / p_Y(y)) · Σ_{x : p_X(x) > 0} p_{X,Y}(x,y)
                                    = p_Y(y) / p_Y(y)
                                    = 1.

(b). Follows from (a), since p_{X|Y}(·|y) is a probability when p_Y(y) > 0.
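As a quick numerical sanity check of the Conditioning Rule (c), here is a short sketch reusing a hypothetical joint table:

```python
import numpy as np

# Hypothetical joint pmf table: rows index x, columns index y.
p_xy = np.array([[0.10, 0.05, 0.05],
                 [0.20, 0.10, 0.10],
                 [0.10, 0.20, 0.10]])
p_x = p_xy.sum(axis=1)      # marginal pmf of X
p_y = p_xy.sum(axis=0)      # marginal pmf of Y

# Conditioning Rule: p_X(x) = Σ_y p_{X|Y}(x|y) · p_Y(y).
p_x_given_y = p_xy / p_y                        # column j holds p_{X|Y}(·|y_j)
reconstructed = (p_x_given_y * p_y).sum(axis=1)
assert np.allclose(reconstructed, p_x)
```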
Proof – continued

(c). Let A = {y : p_Y(y) = 0}. Then for each x,

    P{X = x, Y ∈ A} ≤ P{Y ∈ A} = 0.

So,

    P{X = x} = P{X = x, Y ∈ A} + P{X = x, Y ∈ Aᶜ}
             = P{X = x, Y ∈ Aᶜ}
             = Σ_{y ∈ Aᶜ} p_{X,Y}(x,y)
             = Σ_{y : p_Y(y) > 0} p_{X|Y}(x|y) · p_Y(y).

Example

Example. Let X and Y be independent geometric random variables with the same parameter p (and q = 1 − p). Let Z = X + Y. Find the distribution of X given Z.

By convolution (provided z ≥ 2),

    p_Z(z) = Σ_{k=1}^{z−1} p_X(k) · p_Y(z − k)
           = Σ_{k=1}^{z−1} q^{k−1} p · q^{z−k−1} p
           = (z − 1) q^{z−2} p^2.

Z has the negative binomial distribution with parameters 2 and p.
Example – continued

By the definition of the conditional probability mass function,

    p_{X|Z}(x|z) = p_{X,Z}(x,z) / p_Z(z)
                 = p_{X,Y}(x, z − x) / p_Z(z)
                 = (p q^{x−1} · p q^{z−x−1}) / ((z − 1) q^{z−2} p^2)
                 = 1 / (z − 1),      1 ≤ x ≤ z − 1.

Thus, given Z = z, X is uniformly distributed over 1 ≤ x ≤ z − 1.
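A quick simulation sketch of this result (the parameter values are arbitrary): condition on a fixed value of Z and inspect the empirical distribution of X.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, z = 0.3, 1_000_000, 6

X = rng.geometric(p, size=n)    # support {1, 2, ...}, pmf q^{k-1} p
Y = rng.geometric(p, size=n)
Z = X + Y

x_given_z = X[Z == z]           # sample of X restricted to the event {Z = z}
values, counts = np.unique(x_given_z, return_counts=True)
print(values)                   # 1, 2, ..., z-1
print(counts / counts.sum())    # each frequency close to 1/(z-1) = 0.2
```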
Example

Example. Suppose we perform n Bernoulli trials, yielding U successes and V failures. We know U and V are dependent (see Ross, Example 6.2a).

Suppose, instead, we perform N Bernoulli trials, where N is a Poisson random variable with parameter λ, and this yields X successes and Y failures. (So, the number of trials is NOT fixed.)

Show that the number of successes X and the number of failures Y are independent.
Example – continued

Let q = 1 − p. Then

    P{X = x, Y = y} = P{X = x, Y = y, N = x + y}
                    = P(X = x, Y = y | N = x + y) · P{N = x + y}
                    = [(x + y)! / (x! y!)] p^x q^y · e^{−λ} λ^{x+y} / (x + y)!
                    = e^{−λp} (λp)^x / x! · e^{−λq} (λq)^y / y!.

So, X and Y are independent and Poisson distributed with parameters λp and λq.

See Ross, Proposition 6.2.1 for independence; compare this example to Ross, Example 4b.
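A simulation sketch of this fact (λ and p are arbitrary choices): the success and failure counts should each look Poisson and show no correlation.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, p, n = 4.0, 0.3, 1_000_000

N = rng.poisson(lam, size=n)    # random number of Bernoulli trials
X = rng.binomial(N, p)          # successes among the N trials
Y = N - X                       # failures

print(np.corrcoef(X, Y)[0, 1])  # near 0, consistent with independence
print(X.mean(), X.var())        # both near λp = 1.2, as for Poisson(λp)
print(Y.mean(), Y.var())        # both near λq = 2.8, as for Poisson(λq)
```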
Continuous Conditional Density

Definition: Continuous Conditional Density

Definition
Let X and Y have joint probability density function f_{X,Y}(x,y).
The conditional density of X given Y = y is

    f_{X|Y}(x|y) = f_{X,Y}(x,y) / f_Y(y),

provided f_Y(y) > 0. When f_Y(y) = 0, the conditional density f_{X|Y}(x|y) is left undefined.
Key Properties

Theorem
Let X and Y be continuous random variables on the sample space.
(a) (Density Rule) When f_Y(y) > 0, f_{X|Y}(·|y) is a density:
    (i)  f_{X|Y}(x|y) ≥ 0 for all x,
    (ii) 1 = ∫_{−∞}^{∞} f_{X|Y}(x|y) dx.
(b) (Multiplication Rule) When f_Y(y) > 0,
    f_{X,Y}(x,y) = f_{X|Y}(x|y) · f_Y(y).
(c) (Bayes Rule) When both f_Y(y) > 0 and f_X(x) > 0,
    f_{Y|X}(y|x) = f_{X|Y}(x|y) · f_Y(y) / f_X(x).

Proof

(a). Part (i), f_{X|Y}(x|y) ≥ 0, is clear from the definition. For (ii),

    ∫_{−∞}^{∞} f_{X|Y}(x|y) dx = (1 / f_Y(y)) · ∫_{−∞}^{∞} f_{X,Y}(x,y) dx
                               = f_Y(y) / f_Y(y)
                               = 1.

(b). Follows directly from the definition of conditional density.

(c). By the Multiplication Rule,

    f_{X|Y}(x|y) · f_Y(y) = f_{X,Y}(x,y) = f_{Y|X}(y|x) · f_X(x).

The result follows by dividing by f_X(x).
Example

Example. Suppose that X has the gamma distribution with parameters (α = 2, λ), and that Y is uniformly distributed on (0, x) given X = x. Find the joint density of X and Y.

By definition of the gamma distribution,

    f_X(x) = λ^2 x e^{−λx},      x > 0,

and when X = x > 0,

    f_{Y|X}(y|x) = 1/x,      0 < y < x.

Apply the Multiplication Rule for densities:

    f_{X,Y}(x,y) = f_X(x) · f_{Y|X}(y|x) = λ^2 e^{−λx},      0 < y < x.

Example

Example. Compute the marginal density of Y from the previous example.

We get the marginal density by integrating out x in the joint density. For y > 0,

    f_Y(y) = ∫_y^∞ λ^2 e^{−λx} dx = λ e^{−λy}.

So Y is exponential with parameter λ.
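A numerical check of this marginal (λ is an arbitrary choice), integrating the joint density and comparing against λ e^{−λy}:

```python
import numpy as np
from scipy.integrate import quad

lam = 2.0

def f_xy(x, y):
    # Joint density f_{X,Y}(x, y) = λ^2 e^{−λx} on 0 < y < x.
    return lam**2 * np.exp(-lam * x) if 0 < y < x else 0.0

for y in [0.3, 1.0, 2.5]:
    fy, _ = quad(f_xy, y, np.inf, args=(y,))  # integrate out x
    print(fy, lam * np.exp(-lam * y))         # the two values agree
```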
Example

Example. Suppose a point (X, Y) is chosen uniformly on the triangle 0 ≤ x ≤ y ≤ 1 (which has area 1/2). So,

    f_{X,Y}(x,y) = 2      if 0 ≤ x ≤ y ≤ 1,
    f_{X,Y}(x,y) = 0      otherwise.

[Figure: the triangle 0 ≤ x ≤ y ≤ 1 inside the unit square.]

Example – continued

Compute the marginals:

    f_X(x) = ∫_x^1 2 dy = 2(1 − x),
    f_Y(y) = ∫_0^y 2 dx = 2y.

Compute the conditional densities:

    f_{X|Y}(x|y) = f_{X,Y}(x,y) / f_Y(y) = 1/y,          0 ≤ x ≤ y ≤ 1,
    f_{Y|X}(y|x) = f_{X,Y}(x,y) / f_X(x) = 1/(1 − x),    0 ≤ x ≤ y ≤ 1.

Given Y = y, X is uniform on [0, y].
Given X = x, Y is uniform on [x, 1].
Continuous Conditional Distribution

A Warning. When Y is a continuous random variable,

    P(X ≤ x | Y = y) = ???

is meaningless as a conditional probability, since P{Y = y} = 0.

However, from our last problem, the conditional density

    f_{X|Y}(x|y) = 1/y,      0 ≤ x ≤ y ≤ 1,

allows us to give some sense to saying

    P(X ≤ 0.35 | Y = 0.7) = ∫_0^{0.35} (1/0.7) dx = 1/2.

Given Y = 0.7, X is uniformly distributed in [0, 0.7].

Definition: Conditional Distributions

Recall, f_{X|Y}(·|y) is a density function. We use it to make sense of the probability of events {X ∈ A} given Y = y.

Definition
Let X and Y be continuous random variables. For f_Y(y) > 0 define

    P(X ∈ A | Y = y) = ∫_A f_{X|Y}(x|y) dx.

We define the conditional cumulative distribution by

    F_{X|Y}(a|y) = P(X ≤ a | Y = y) = ∫_{−∞}^a f_{X|Y}(x|y) dx.
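Since P{Y = 0.7} = 0, a simulation must condition on a thin band around Y = 0.7 rather than on the exact value; a sketch for the triangle example above:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2_000_000

# Uniform on the triangle 0 <= x <= y <= 1: take min and max of two uniforms.
u = rng.random((n, 2))
x, y = u.min(axis=1), u.max(axis=1)

band = np.abs(y - 0.7) < 0.005      # the approximate event {Y ≈ 0.7}
est = np.mean(x[band] <= 0.35)      # estimates P(X <= 0.35 | Y = 0.7)
print(est)                          # close to 1/2
```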
Example

Example. Suppose (X, Y) is chosen uniformly from the triangle {(x, y) : x, y ≥ 0, x + y ≤ 2}. So,

    f_{X,Y}(x,y) = 1/2,      x, y ≥ 0, x + y ≤ 2.

Find P(X > 1 | Y = y). In particular, compute P(X > 1 | Y = 0.5).

[Figure: the triangle bounded by X + Y = 2, with the line Y = 0.5 and the region X > 1 marked.]

Example – continued

Plot of the joint density: the function over the section Y = 0.5 is f(x, 0.5), and the area under this section is

    f_Y(0.5) = ∫_{−∞}^∞ f_{X,Y}(x, 0.5) dx.

[Figure: the joint density over the triangle, with its cross-section at Y = 0.5 highlighted.]
Example – continued

    f_{X,Y}(x,y) = 1/2,      x, y ≥ 0, x + y ≤ 2.

The marginal density f_Y(y) is

    f_Y(y) = ∫_{−∞}^∞ f_{X,Y}(x,y) dx = ∫_0^{2−y} (1/2) dx = 1 − y/2.

The conditional density f_{X|Y}(x|y) is

    f_{X|Y}(x|y) = f_{X,Y}(x,y) / f_Y(y) = 1/(2 − y),      0 ≤ x ≤ 2 − y,

so X is uniformly distributed on [0, 2 − y].

The probability of X > 1 given Y = y is

    P(X > 1 | Y = y) = ∫_1^{2−y} f_{X|Y}(x|y) dx = ∫_1^{2−y} 1/(2 − y) dx = (1 − y)/(2 − y).

Example – continued

Given Y = 0.5, X is distributed with density proportional to f_{X,Y}(x, 0.5). The conditional density f_{X|Y}(x|0.5) is normalized, dividing by f_Y(0.5) (the area under the curve f_{X,Y}(x, 0.5)), to ensure it is a density.

Note: P(X > 1 | Y = 0.5) = 1/3.

[Figure: the cross-section f_{X,Y}(x, 0.5) and the normalized conditional density f_{X|Y}(x|0.5), with the region X > 1 shaded.]
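A short check of P(X > 1 | Y = 0.5) = 1/3 by numerical integration (a sketch; the density is hard-coded):

```python
from scipy.integrate import quad

def f_xy(x, y):
    # f_{X,Y} = 1/2 on the triangle x, y >= 0, x + y <= 2.
    return 0.5 if (x >= 0 and y >= 0 and x + y <= 2) else 0.0

y = 0.5
f_y, _ = quad(f_xy, 0, 2, args=(y,))   # marginal f_Y(0.5) = 0.75
num, _ = quad(f_xy, 1, 2, args=(y,))   # integral of f_{X,Y}(x, 0.5) over x > 1
print(num / f_y)                       # 0.333... = 1/3
```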
Example

Example. Let the joint density of X and Y be

    f(x,y) = (15/32) x^2 y,      0 < x < y < 2.

Compute the marginal densities and conditional densities.

Let 0 < y < 2.

    f_Y(y) = ∫_{−∞}^∞ f(x,y) dx = ∫_0^y (15/32) x^2 y dx = (5/32) y^4.

Let 0 < x < 2.

    f_X(x) = ∫_{−∞}^∞ f(x,y) dy = ∫_x^2 (15/32) x^2 y dy = (15/16) x^2 − (15/64) x^4.

Example – continued

Let 0 < x < y < 2.

    f_{X|Y}(x|y) = f(x,y) / f_Y(y) = ((15/32) x^2 y) / ((5/32) y^4) = 3 x^2 y^{−3}.

Let 0 < x < y < 2.

    f_{Y|X}(y|x) = f(x,y) / f_X(x) = ((15/32) x^2 y) / ((15/16) x^2 − (15/64) x^4) = 2y / (4 − x^2).

Compute P(X < 1 | Y = 1.5):

    P(X < 1 | Y = 1.5) = ∫_0^1 (8/9) x^2 dx = 8/27 ≈ 0.296.

Compute P(Y < 1.5 | X = 1):

    P(Y < 1.5 | X = 1) = ∫_1^{1.5} (2/3) y dy = 5/12 ≈ 0.417.
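A numerical check of both conditional probabilities (a sketch with the density hard-coded):

```python
from scipy.integrate import quad

def f(x, y):
    # Joint density (15/32) x^2 y on 0 < x < y < 2.
    return (15 / 32) * x**2 * y if 0 < x < y < 2 else 0.0

# P(X < 1 | Y = 1.5) = ∫_0^1 f(x, 1.5) dx / f_Y(1.5)
f_y15, _ = quad(f, 0, 1.5, args=(1.5,))
num, _ = quad(f, 0, 1, args=(1.5,))
print(num / f_y15)                      # 8/27 ≈ 0.296

# P(Y < 1.5 | X = 1) = ∫_1^{1.5} f(1, y) dy / f_X(1)
g = lambda y: f(1.0, y)
f_x1, _ = quad(g, 1, 2)
num2, _ = quad(g, 1, 1.5)
print(num2 / f_x1)                      # 5/12 ≈ 0.417
```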
Example – continued

Plot of f(x,y) = (15/32) x^2 y for 0 < x < y < 2,
cross-section at Y = 1.5: f(x, 1.5) = (45/64) x^2, and
conditional density at Y = 1.5: f_{X|Y}(x|1.5) = (8/9) x^2, shown in red.
P(X < 1 | Y = 1.5) = 8/27.

[Figure: surface plot of f with the cross-section at Y = 1.5 highlighted.]

Example – continued

Plot of f(x,y) = (15/32) x^2 y for 0 < x < y < 2,
cross-section at X = 1: f(1, y) = (15/32) y, and
conditional density at X = 1: f_{Y|X}(y|1) = (2/3) y, shown in red.
P(Y < 1.5 | X = 1) = 5/12.

[Figure: surface plot of f with the cross-section at X = 1 highlighted.]
Independence

Independence for conditional density (and mass) behaves as expected.

Theorem
Let X and Y be continuous (discrete) random variables. The following are equivalent.
(a) X and Y are independent.
(b) f_{X|Y}(x|y) = f_X(x).
(c) f_{Y|X}(y|x) = f_Y(y).
(d) For every event A and y with f_Y(y) > 0,
    P(X ∈ A | Y = y) = P{X ∈ A}.
(e) For every event B and x with f_X(x) > 0,
    P(Y ∈ B | X = x) = P{Y ∈ B}.

Example

Example. The times of occurrence of rare random events (such as earthquakes, meteorite strikes, etc.) are well described by a Poisson process. This has the property that the intervals between events are independent exponential random variables.

Let U be the time of the first event and V be the time of the second event in such a process, where events occur at average rate λ. The joint density is

    f_{U,V}(u,v) = λ^2 e^{−λv},      0 < u < v < ∞.
Example – continued

The marginal density of U is

    f_U(u) = ∫_{−∞}^∞ f_{U,V}(u,v) dv = ∫_u^∞ λ^2 e^{−λv} dv = λ e^{−λu}.

This verifies that U is exponential with parameter λ.

The conditional density of V given U = u, for v > u, is

    f_{V|U}(v|u) = f_{U,V}(u,v) / f_U(u) = λ^2 e^{−λv} / (λ e^{−λu}) = λ e^{−λ(v−u)},

which is exponential with parameter λ on (u, ∞).

Example – continued

The marginal density of V is

    f_V(v) = ∫_{−∞}^∞ f_{U,V}(u,v) du = ∫_0^v λ^2 e^{−λv} du = λ^2 v e^{−λv},

which is a gamma density with parameters (α = 2, λ).

The conditional density of U given V = v, for 0 < u < v, is

    f_{U|V}(u|v) = f_{U,V}(u,v) / f_V(v) = λ^2 e^{−λv} / (λ^2 v e^{−λv}) = 1/v.

The conditional density of U given V = v is uniform on (0, v).
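A simulation sketch of these facts (λ is arbitrary): build the first two arrival times from independent exponential gaps, then condition on V in a thin band.

```python
import numpy as np

rng = np.random.default_rng(3)
lam, n = 1.5, 2_000_000

U = rng.exponential(1 / lam, size=n)   # time of the first event
W = rng.exponential(1 / lam, size=n)   # gap until the second event
V = U + W                              # time of the second event

print(U.mean(), 1 / lam)               # U is exponential(λ)
print(V.mean(), 2 / lam)               # gamma(α = 2, λ) has mean 2/λ

v0 = 2.0
u_cond = U[np.abs(V - v0) < 0.01]      # condition on the band {V ≈ v0}
print(np.histogram(u_cond, bins=4, range=(0, v0))[0] / u_cond.size)
# each bin near 1/4: given V = v0, U is uniform on (0, v0)
```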
Example – continued

Let W = V − U. For w > 0,

    P(W ≤ w | U = u) = P(V − U ≤ w | U = u)
                     = P(V ≤ w + u | U = u)
                     = ∫_{−∞}^{w+u} f_{V|U}(v|u) dv
                     = ∫_u^{w+u} λ e^{−λ(v−u)} dv
                     = e^{λu} (e^{−λu} − e^{−λ(w+u)})
                     = 1 − e^{−λw}.

The conditional cumulative distribution of W given U = u is

    F_{W|U}(w|u) = 1 − e^{−λw},      w > 0,

which is an exponential distribution with parameter λ.

Example – continued

The conditional density f_{W|U}(w|u) can be obtained by differentiating:

    f_{W|U}(w|u) = d/dw P(W ≤ w | U = u) = d/dw (1 − e^{−λw}) = λ e^{−λw}.

The joint density for u, w > 0 is

    f_{U,W}(u,w) = f_{W|U}(w|u) · f_U(u) = λ e^{−λw} · λ e^{−λu} = λ^2 e^{−λ(u+w)},

and otherwise it is 0 (since W > 0 and U > 0 must hold).

So, W and U are independent. This verifies what we knew about U and V, and explains the density (since V = U + W)

    f_{U,V}(u,v) = λ^2 e^{−λv},      0 < u < v < ∞.
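A symbolic check of this last computation (a sketch using sympy; the expressions are copied from the slides above):

```python
import sympy as sp

u, v, w, lam = sp.symbols('u v w lam', positive=True)

F = 1 - sp.exp(-lam * w)               # F_{W|U}(w|u) derived above
f_w_given_u = sp.diff(F, w)            # λ e^{−λw}
f_u = lam * sp.exp(-lam * u)           # U is exponential(λ)

f_uw = sp.simplify(f_w_given_u * f_u)  # joint density of (U, W)
print(f_uw)                            # λ^2 e^{−λ(u+w)}: a function of u times a function of w

# Substituting w = v − u recovers the joint density of (U, V):
f_uv = sp.simplify(f_uw.subs(w, v - u))
print(f_uv)                            # λ^2 e^{−λv}, on 0 < u < v
```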