3.2 Joint Probability Distributions.
Recall that in sections 1.2.4 and 1.2.5 we discussed two ways to describe two or more discrete random
variables. We could either use the joint probability mass function or conditional probability mass functions.
We have two similar methods to describe two or more continuous random variables, namely the joint
probability density function and conditional probability density functions. First we consider these concepts
for two random variables and later we look at the case of more than two random variables.
Suppose S and T are two random variables. The joint probability density function (pdf) of S and T is the
function f(s,t) = fS,T(s,t) with the property that
(1)    $\Pr\{(S,T) \in A\} = \iint_A f(s,t)\,ds\,dt$
for any set A in the plane.
Example 1. Let
S = time (in min) until the next male customer arrives at a bank
T = time until the next female customer arrives at a bank.
Suppose the joint pdf of S and T is given by
(2)    $f(s,t) = \begin{cases} \frac{1}{6}\, e^{-(s/3 + t/2)} & \text{if } s \ge 0 \text{ and } t \ge 0 \\ 0 & \text{if } s < 0 \text{ or } t < 0 \end{cases}$
What is the probability that the next customer is a male? We want to know Pr{(S,T) ∈ A} where
A = {(s,t): s < t}. Using (1) we have
$\Pr\{(S,T) \in A\} = \iint_A f(s,t)\,ds\,dt = \int_0^{\infty} \int_s^{\infty} \frac{e^{-s/3}}{3}\,\frac{e^{-t/2}}{2}\,dt\,ds = \int_0^{\infty} \frac{e^{-s/3}}{3}\, e^{-s/2}\,ds$
$= \int_0^{\infty} \frac{1}{3}\, e^{-5s/6}\,ds = \left[ -\frac{2}{5}\, e^{-5s/6} \right]_{s=0}^{\infty} = \frac{2}{5}$
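As a quick numerical check (a simulation sketch, not part of the derivation above), one can estimate Pr{S < T} by sampling S and T from the two exponential densities; the estimate should come out near 2/5.

import random

# Monte Carlo estimate of Pr{S < T} for Example 1.
# S has density e^{-s/3}/3 (exponential, mean 3); T has density e^{-t/2}/2 (mean 2).
N = 100_000
count = 0
for _ in range(N):
    s = random.expovariate(1/3)   # rate 1/3, i.e. mean 3
    t = random.expovariate(1/2)   # rate 1/2, i.e. mean 2
    if s < t:
        count += 1
print(count / N)                  # should be close to 2/5 = 0.4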
Independent Random Variables. Recall that two random variables S and T are independent if knowledge
of the values of one of them doesn't influence the probability that the other assumes various values, i.e.
$\Pr\{S \in A,\ T \in B\} = \Pr\{S \in A\}\,\Pr\{T \in B\}$
for any two sets A and B. In particular, one has
(3)    $\Pr\{a < S < b,\ c < T < d\} = \Pr\{a < S < b\}\,\Pr\{c < T < d\}$
or
(3′)    $\Pr\{a < S < b \mid c < T < d\} = \Pr\{a < S < b\}$
for any numbers a, b, c and d. Thus the conditional probability that the value of S is in one interval given
that the value of T is in another is no different from the probability that S is in the first interval with no
knowledge of the values of T.
If S and T are independent and have density functions fS(s) and fT(t), then (3) can be written as
$\Pr\{a < S < b,\ c < T < d\} = \int_a^b f_S(s)\,ds \int_c^d f_T(t)\,dt$
or
(4)    $\Pr\{(S,T) \in A\} = \iint_A f_S(s)\,f_T(t)\,ds\,dt$
where A is the rectangle consisting of all points (s,t) such that a < s < b, c < t < d. Using (1) one obtains
$\iint_A f_{S,T}(s,t)\,ds\,dt = \iint_A f_S(s)\,f_T(t)\,ds\,dt$
It follows that S and T are independent if and only if
(5)    $f_{S,T}(s,t) = f_S(s)\,f_T(t)$
For example, the random variables S and T in Example 1 are independent.
If we let a → −∞, c → −∞, b → s⁺ and d → t⁺ in (3), then it follows that
(6)    $\Pr\{S \le s,\ T \le t\} = F_S(s)\,F_T(t)$
where FS(s) and FT(t) are the cumulative distribution functions of S and T.
The following proposition is a generalization of Example 1.
Proposition 1. If S and T are independent exponential random variables with means 1/λ and 1/μ
respectively, then
(7)    $\Pr\{S < T\} = \dfrac{\lambda}{\lambda + \mu}$
More generally, if T1, ..., Tn are independent exponential random variables with means 1/λ1, …, 1/λn
respectively, then
(8)    $\Pr\{T_1 \text{ is less than all of } T_2, \ldots, T_n\} = \dfrac{\lambda_1}{\lambda_1 + \cdots + \lambda_n}$
Proof. S and T have density functions $f_S(s) = \lambda e^{-\lambda s}$ for s ≥ 0 and $f_T(t) = \mu e^{-\mu t}$ for t ≥ 0, and $f_S(s) = f_T(t) = 0$ for
s < 0 or t < 0. Since S and T are independent, it follows from (5) that
$f_{S,T}(s,t) = \begin{cases} \lambda\mu\, e^{-\lambda s - \mu t} & \text{if } s \ge 0 \text{ and } t \ge 0 \\ 0 & \text{if } s < 0 \text{ or } t < 0 \end{cases}$
If A = {(s,t): s < t}, then
$\Pr\{S < T\} = \iint_A f(s,t)\,ds\,dt = \int_0^{\infty} \int_s^{\infty} \lambda\mu\, e^{-\lambda s}\, e^{-\mu t}\,dt\,ds = \int_0^{\infty} \lambda\, e^{-\lambda s}\, e^{-\mu s}\,ds$
$= \int_0^{\infty} \lambda\, e^{-(\lambda+\mu)s}\,ds = \left[ -\frac{\lambda}{\lambda+\mu}\, e^{-(\lambda+\mu)s} \right]_{s=0}^{\infty} = \frac{\lambda}{\lambda+\mu}$
This proves (7). We prove (8) by induction. The case n = 2 is (7). Suppose it is true for n – 1. Note that
Pr{T1 is less than all of T2, ..., Tn} = Pr{T1 < R} where R = min{T2, ..., Tn}. By Proposition 2 in the next
section R is an exponential random variable with mean $1/(\lambda_2 + \cdots + \lambda_n)$. So (8) follows from (7). //
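The same kind of simulation can illustrate (8). The sketch below is a Monte Carlo check with rates chosen arbitrarily for illustration; the observed fraction of trials in which T1 is smallest should be close to λ1/(λ1 + ⋯ + λn).

import random

# Simulation check of (8): Pr{T1 < min(T2, ..., Tn)} = lam[0] / sum(lam).
lam = [1.0, 2.0, 0.5]                      # example rates; the means are 1/lam[i]
N = 100_000
count = 0
for _ in range(N):
    ts = [random.expovariate(rate) for rate in lam]
    if ts[0] == min(ts):                   # T1 is the smallest
        count += 1
print(count / N, lam[0] / sum(lam))        # the two numbers should agree closely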
Here is an example similar to Example 1.
Example 2. The time it takes Bob to do his Math homework is uniformly distributed between 0 and 1 hour.
The time it takes him to do his English homework is uniformly distributed between 0 and 3 hours and is
independent of the time it takes to do his Math homework. On a given evening what is the probability that
Bob will finish both his Math and English homework in 2 hours? Let
S = time for Bob to do his Math homework
T = time for Bob to do his English homework
R = S + T = time to do both Math and English homework.
We want to find $\Pr\{R < 2\} = \Pr\{S+T < 2\} = \Pr\{(S,T) \in A\} = \iint_A f_{S,T}(s,t)\,ds\,dt$ where
A = {(s,t): s + t ≤ 2}. One has
fS(s) = probability density function of S = $\begin{cases} 1 & \text{if } 0 \le s \le 1 \\ 0 & \text{otherwise} \end{cases}$
fT(t) = probability density function of T = $\begin{cases} 1/3 & \text{if } 0 \le t \le 3 \\ 0 & \text{otherwise} \end{cases}$
Since S and T are independent
$f_{S,T}(s,t) = f_S(s)\,f_T(t) = \begin{cases} 1/3 & \text{if } 0 \le s \le 1 \text{ and } 0 \le t \le 3 \\ 0 & \text{otherwise} \end{cases}$
Let A′ = A ∩ {(s,t): 0 ≤ s ≤ 1 and 0 ≤ t ≤ 3} = {(s,t): 0 ≤ s ≤ 1 and 0 ≤ t ≤ 2 − s}. Then
$\Pr\{R < 2\} = \iint_{A'} \tfrac{1}{3}\,ds\,dt = \tfrac{1}{3} \times \mathrm{Area}(A') = \tfrac{1}{3} \times 1 \times \tfrac{2+1}{2} = \tfrac{1}{2}$
since A′ is a trapezoid of width 1 whose parallel sides have lengths 2 and 1, so its area is 3/2.
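A short simulation (a sanity check, not part of the example) confirms the value 1/2: sample S uniformly on [0, 1] and T uniformly on [0, 3] and count how often S + T < 2.

import random

# Monte Carlo check of Example 2: Pr{S + T < 2} should be about 1/2.
N = 100_000
count = 0
for _ in range(N):
    s = random.uniform(0, 1)   # Math homework time
    t = random.uniform(0, 3)   # English homework time
    if s + t < 2:
        count += 1
print(count / N)               # should be close to 0.5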
Marginal probability density functions: The pdf's of S and T separately are sometimes called the
marginal pdf's.
Proposition 2. Let S and T be continuous random variables with joint density function fS,T(s,t) and
individual density functions fS(s) and fT(t). Then
(9)    $f_S(s) = \int_{-\infty}^{\infty} f_{S,T}(s,t)\,dt$
(10)    $f_T(t) = \int_{-\infty}^{\infty} f_{S,T}(s,t)\,ds$
Proof. The density function fS(s) of S is the only function with the property that
$\Pr\{a < S < b\} = \int_a^b f_S(s)\,ds$
However, letting A = {(s, t): a < s < b} in (1) we get
$\Pr\{a < S < b\} = \int_a^b \int_{-\infty}^{\infty} f_{S,T}(s,t)\,dt\,ds = \int_a^b g(s)\,ds$
where
$g(s) = \int_{-\infty}^{\infty} f_{S,T}(s,t)\,dt$
(9) follows from this. The proof of (10) is the same. //
Example 1 (continued). In Example 1 one has
$f_S(s) = \int_{-\infty}^{\infty} f_{S,T}(s,t)\,dt = \int_0^{\infty} \frac{e^{-s/3}}{3}\,\frac{e^{-t/2}}{2}\,dt = \left[ -\frac{e^{-s/3}}{3}\, e^{-t/2} \right]_{t=0}^{\infty} = \frac{e^{-s/3}}{3}$
Similarly one can show fT(t) = e^{-t/2}/2. Thus S is exponential with mean 3 and T is exponential with mean 2.
Since fS,T(s,t) = fS(s) fT(t) it follows that S and T are independent.
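The marginal formula (9) can also be checked numerically. The sketch below (not from the notes) approximates the integral over t by a simple Riemann sum, with an arbitrary truncation point and step size, and compares the result with e^{-s/3}/3 at one test value of s.

import math

# Numerical check of (9) for Example 1: fS(s) = integral of fS,T(s,t) over t.
def f_joint(s, t):
    return math.exp(-(s/3 + t/2)) / 6 if s >= 0 and t >= 0 else 0.0

s = 1.7                       # arbitrary test point
dt, t_max = 0.001, 60.0       # step size and truncation of the infinite integral
approx = sum(f_joint(s, k * dt) * dt for k in range(int(t_max / dt)))
print(approx, math.exp(-s/3) / 3)   # the two values should agree to several decimals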
Conditional probability density functions: As with discrete random variables, it is often more natural
to describe two continuous random variables by means of conditional probability density functions.
For two random variables S and T there are two of these, fT|S(t|s) and fS|T(s|t). The first is given by
$f_{T|S}(t|s) = \dfrac{f_{S,T}(s,t)}{f_S(s)}$
This is interpreted as the conditional probability density that T has the value t given that S has the value s.
Note that one can recover the joint pdf from the conditional of T given S and the marginal of S since
fS,T(s,t) = fT|S(t|s)fS(s).
If the random variables S and T are independent then we have fT|S(t|s) = fT(t). This is the case in Example 1
where fT|S(t|s) = fT(t) = e^{-t/2}/2.
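For the same example, one can verify the formula fT|S(t|s) = fS,T(s,t)/fS(s) directly at a test point; since S and T are independent, the ratio should equal fT(t) = e^{-t/2}/2. This is only an illustrative check of the definition.

import math

# Check that fS,T(s,t)/fS(s) equals fT(t) for Example 1 at one (s, t) point.
def f_joint(s, t):
    return math.exp(-(s/3 + t/2)) / 6 if s >= 0 and t >= 0 else 0.0

def f_S(s):
    return math.exp(-s/3) / 3 if s >= 0 else 0.0

s, t = 2.0, 0.8
conditional = f_joint(s, t) / f_S(s)        # fT|S(t|s)
print(conditional, math.exp(-t/2) / 2)      # should be equal up to rounding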
More Than Two Random Variables. Suppose we have n random variables T1, ..., Tn. The joint
probability density function (pdf) of T1, ..., Tn is the function f(t1,...,tn) = fT1,...,Tn(t1,...,tn) with the property
that
$\Pr\{(T_1,\ldots,T_n) \in A\} = \int \cdots \int_A f(t_1,\ldots,t_n)\,dt_1 \cdots dt_n$
for any set A of n-tuples (t1, ..., tn). This is an n-fold multiple integral over the region A on the right.
T1, ..., Tn are independent if knowledge of the values of some of the variables doesn't change the probability
that the others assume various values, i.e.
$\Pr\{a_1 < T_1 < b_1,\ \ldots,\ a_n < T_n < b_n\} = \Pr\{a_1 < T_1 < b_1\} \cdots \Pr\{a_n < T_n < b_n\}$
If the random variables T1, ..., Tn are independent, then
$f_{T_1,\ldots,T_n}(t_1,\ldots,t_n) = f_{T_1}(t_1)\,f_{T_2}(t_2) \cdots f_{T_n}(t_n)$
Properties of expected values. The following proposition includes some of the properties of expected
values of discrete random variables that also hold for continuous random variables.
Proposition 3. Let X and Y be continuous random variables with joint density function fX,Y(x, y) and
individual density functions fX(x) and fY(y). Let c be a real number and let z = g(x) and z = h(x, y) be real-valued
functions. Then

(11)    $E(g(X)) = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx$
(12)    $E(h(X,Y)) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x,y)\,f_{X,Y}(x,y)\,dx\,dy$
(13)    $E(X) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x\,f_{X,Y}(x,y)\,dx\,dy$
(14)    $E(X + Y) = E(X) + E(Y)$
(15)    $E(cX) = cE(X)$
(16)    $E(XY) = E(X)\,E(Y)$    if X and Y are independent
Proof. The proofs of (11) and (12) are somewhat involved and will be omitted. (13) and (14) follow from
(12) and (15) follows from (11). To prove (16), one has
$E(X)\,E(Y) = \left( \int_{-\infty}^{\infty} x\,f_X(x)\,dx \right) \left( \int_{-\infty}^{\infty} y\,f_Y(y)\,dy \right) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\,f_X(x)\,f_Y(y)\,dx\,dy$
$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\,f_{X,Y}(x,y)\,dx\,dy = E(XY)$  //
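Properties (14) and (16) can also be seen in a simulation. The sketch below (an illustration, not part of the proof) uses the independent exponential S and T of Example 1, with means 3 and 2, so E(S + T) should be near 5 and E(ST) near 6; the exact averages will vary slightly from run to run.

import random

# Simulation check of (14) and (16) for independent S (mean 3) and T (mean 2).
N = 200_000
ss = [random.expovariate(1/3) for _ in range(N)]
ts = [random.expovariate(1/2) for _ in range(N)]
E_S = sum(ss) / N
E_T = sum(ts) / N
E_sum = sum(s + t for s, t in zip(ss, ts)) / N
E_prod = sum(s * t for s, t in zip(ss, ts)) / N
print(E_sum, E_S + E_T)    # both close to 5
print(E_prod, E_S * E_T)   # both close to 6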