Let us now return to studying the behavior of two random variables. Consider the following
problem:
Example:
Let x1 and x2 denote the proportions of two different chemicals found in a sample
mixture of chemicals used as an insecticide. Suppose x1 and x2 have a joint
probability density function given by:

f(x1, x2) = a   for 0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1, 0 ≤ x1 + x2 ≤ 1
          = 0   elsewhere

(Note: x1 and x2 can sum to at most 1, since they represent proportions.)
a) Find P{x1 ≤ 3/4, x2 ≤ 3/4}
b) Find P{x1 ≤ 1/2, x2 ≤ 1/2}
c) Find P{x1 ≤ 1/2 | x2 ≤ 1/2}
d) Find the marginal densities for x1 and x2
e) Are x1 and x2 independent?
f) Find P{x1 > 1/2 | x2 = 1/4}
Solution:
First, we must find what a is.
The support of f(x1, x2) is the triangle bounded by x1 = 0, x2 = 0, and the line x1 + x2 = 1 (x2 = 1 − x1).

∴ 1 = ∫_0^1 ∫_0^(1−x1) a dx2 dx1 = a ∫_0^1 (1 − x1) dx1 = a[x1 − x1²/2]_0^1 = a/2

∴ a = 2
a)

P{x1 ≤ 3/4, x2 ≤ 3/4} = ∫_0^(1/4) ∫_0^(3/4) 2 dx2 dx1 + ∫_(1/4)^(3/4) ∫_0^(1−x1) 2 dx2 dx1
                      = 2(1/4)(3/4) + 2 ∫_(1/4)^(3/4) (1 − x1) dx1
                      = 3/8 + 1/2 = 7/8

(The region of integration is the square [0, 3/4] × [0, 3/4] cut off by the line x1 + x2 = 1.)
b)

P[x1 ≤ 1/2, x2 ≤ 1/2] = ∫_0^(1/2) ∫_0^(1/2) 2 dx2 dx1 = 2 × (1/2) × (1/2) = 1/2

c)

P[x1 ≤ 1/2 | x2 ≤ 1/2] = P[x1 ≤ 1/2, x2 ≤ 1/2] / P[x2 ≤ 1/2] = (1/2) / P[x2 ≤ 1/2]

but

P[x2 ≤ 1/2] = [(1/2)(1/2)(2)] + [(1/2)(1/2)(1/2)(2)] = 1/2 + 1/4 = 3/4

(the square [0, 1/2] × [0, 1/2] plus the triangle between x1 = 1/2 and the line x1 + x2 = 1)

Then

P[x1 ≤ 1/2 | x2 ≤ 1/2] = (1/2)/(3/4) = 4/6 = 2/3
d) Marginal Densities

f_X1(x1) = ∫_(x2=0)^(1−x1) f_X1X2 dx2 = ∫_0^(1−x1) 2 dx2 = 2(1 − x1),   0 ≤ x1 ≤ 1

f_X2(x2) = ∫_(x1=0)^(1−x2) f_X1X2 dx1 = ∫_0^(1−x2) 2 dx1 = 2(1 − x2),   0 ≤ x2 ≤ 1

e) Note

f_X1(x1) · f_X2(x2) = 4(1 − x1)(1 − x2) ≠ f_X1X2(x1, x2) = 2   for 0 ≤ x1 + x2 ≤ 1

∴ x1 and x2 are not independent.
f) P[x1 > 1/2 | x2 = 1/4]

Slicing through f_X1X2 at x2 = 1/4:

f(x1, 1/4) = 2   for 0 ≤ x1 ≤ 1 − x2 = 3/4

Recall f_X1(x1) = 2(1 − x1) and f_X2(x2) = 2(1 − x2); hence f_X2(x2 = 1/4) = 3/2.

Then, from our identities,

f(x1 | x2 = 1/4) = f(x1, x2 = 1/4) / f(x2 = 1/4) = 2/(3/2) = 4/3   for 0 ≤ x1 ≤ 1 − x2 = 3/4

P[x1 > 1/2 | x2 = 1/4] = ∫_(1/2)^(3/4) f(x1 | x2 = 1/4) dx1 = ∫_(1/2)^(3/4) (4/3) dx1 = (4/3)(3/4 − 1/2) = 1/3

Also

P[x1 < 1/2 | x2 = 1/4] = (4/3) ∫_0^(1/2) dx1 = (4/3)(1/2) = 2/3

hence

P[x1 > 1/2 | x2 = 1/4] + P[x1 ≤ 1/2 | x2 = 1/4] = 1
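As a quick numerical sanity check on parts (a)-(c) and (f), here is a short Monte Carlo sketch (illustrative only, not part of the original notes) that samples uniformly from the triangular support and estimates the same probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Sample (x1, x2) uniformly over the triangle x1, x2 >= 0, x1 + x2 <= 1 (density f = 2 there).
u = rng.random((n, 2))
x1, x2 = u[:, 0].copy(), u[:, 1].copy()
flip = x1 + x2 > 1                     # reflect points that fall outside the triangle
x1[flip], x2[flip] = 1 - x1[flip], 1 - x2[flip]

print(np.mean((x1 <= 3/4) & (x2 <= 3/4)))        # part (a): ~7/8
both = np.mean((x1 <= 1/2) & (x2 <= 1/2))
print(both)                                       # part (b): ~1/2
print(both / np.mean(x2 <= 1/2))                  # part (c): ~2/3

# Part (f): condition on x2 lying in a thin band around 1/4 to approximate {x2 = 1/4}.
band = np.abs(x2 - 1/4) < 0.01
print(np.mean(x1[band] > 1/2))                    # ~1/3
```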
Example:
Two quality control inspectors each interrupt a production line at a randomly, but independently,
selected time within a given day (of 8 hours). Find the probability that the two interruptions will
be more than 4 hours apart.
Let

x_i = the hour of the interruption time of inspector i,   i = 1, 2

f(x_i) = 1/8   on (0, 8),        f(x1, x2) = (1/8)(1/8) = 1/64

Then we want P{|x1 − x2| > 4}.

The region of interest consists of the two corner triangles of the 8 × 8 square where x1 − x2 > 4
or x2 − x1 > 4 (i.e. outside the band between the lines x1 − x2 = −4 and x1 − x2 = 4).

P{|x1 − x2| > 4} = ∫_4^8 ∫_0^(x1−4) (1/64) dx2 dx1 + ∫_0^4 ∫_(x1+4)^8 (1/64) dx2 dx1
                 = 1/8 + 1/8 = 1/4
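The 1/4 answer is easy to confirm by simulation; the following sketch (illustrative only) draws both interruption times uniformly on (0, 8):

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.uniform(0, 8, size=1_000_000)   # inspector 1 interruption time
x2 = rng.uniform(0, 8, size=1_000_000)   # inspector 2 interruption time

# Fraction of days on which the interruptions are more than 4 hours apart.
print(np.mean(np.abs(x1 - x2) > 4))      # ~0.25
```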
When speaking about two random variables it is common to talk about covariance.

Covariance is a measure of the statistical relationship between two r.v. For example, if X tends to be
large (small) at the same time that Y is large (small), the covariance will be positive. If X tends to be
large (small) at the same time that Y is small (large), the covariance will be negative.

Covariance is defined as

cov(X1, X2) = E[(X1 − M1)(X2 − M2)] = E[X1 X2] − M1 M2

where M1 = E[X1] and M2 = E[X2].

If X1 and X2 are independent:

cov(X1, X2) = E[X1 X2] − M1 M2 = E[X1]E[X2] − M1 M2 ≡ 0

∴ Independence implies cov = 0,
but cov = 0 does not imply independence.
If Y1, Y2, ..., Yn and X1, X2, ..., Xm are r.v. with Mi = E[Yi] and ξi = E[Xi], and

U1 = Σ_(i=1)^n ai Yi        U2 = Σ_(i=1)^m bi Xi   (linear combinations)

then

a) E[U1] = Σ_(i=1)^n ai Mi

b) V(U1) = E[U1 − E(U1)]²

         = E[ Σ_(i=1)^n ai Yi − Σ_(i=1)^n ai Mi ]²

         = E[ Σ_(i=1)^n ai (Yi − Mi) ]²

         = E[ Σ_(i=1)^n ai²(Yi − Mi)² + ΣΣ_(i≠j) ai aj (Yi − Mi)(Yj − Mj) ]

         = Σ_(i=1)^n ai² E[(Yi − Mi)²] + ΣΣ_(i≠j) ai aj E[(Yi − Mi)(Yj − Mj)]

         = Σ_(i=1)^n ai² V(Yi) + ΣΣ_(i≠j) ai aj cov(Yi, Yj)

Since cov(Yi, Yj) = cov(Yj, Yi),

V(U1) = Σ_(i=1)^n ai² V(Yi) + 2 ΣΣ_(i<j) ai aj cov(Yi, Yj)

Similarly

cov(U1, U2) = E[{U1 − E(U1)}{U2 − E(U2)}] = Σ_(i=1)^n Σ_(j=1)^m ai bj cov(Yi, Xj)
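A small numeric sketch (illustrative, with an assumed covariance structure) that checks the V(U1) identity against a direct sample variance:

```python
import numpy as np

rng = np.random.default_rng(2)
a = np.array([1.0, -2.0, 0.5])                  # coefficients a_i (assumed for illustration)
cov = np.array([[2.0, 0.3, -0.4],
                [0.3, 1.0,  0.2],
                [-0.4, 0.2, 1.5]])              # cov(Y_i, Y_j)

# Formula: V(U1) = sum a_i^2 V(Y_i) + 2 * sum_{i<j} a_i a_j cov(Y_i, Y_j)  (i.e. a' C a)
v_formula = a @ cov @ a

# Direct check from simulated Y's with that covariance.
Y = rng.multivariate_normal(mean=np.zeros(3), cov=cov, size=500_000)
U1 = Y @ a
print(v_formula, U1.var())                      # the two numbers should agree closely
```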
Example 5-17
Table below shows the joint probability distribution of the number of defectives observed
on the first draw (X1) and the second draw (X2) from a box containing four good bulbs
and one defective bulb.
a) Find E(X1), V(X1), E(X2), and V(X2).
b) Find cov(X1, X2).
c) The variable Y = X1 + X2 denotes the total number of defectives observed on the two
   draws. Find E(Y) and V(Y).

                             X2 = 0    X2 = 1    Marginal prob. for X1
    X1 = 0                     .6        .2              .8
    X1 = 1                     .2         0              .2
    Marginal prob. for X2      .8        .2             1.0

Drawing bulbs from the box ⇒ 4 good and 1 defective.

a)  E[X1] = (0)(.8) + (1)(.2) = 0.2
    E[X1²] = (0)²(.8) + (1)²(.2) = 0.2
    ∴ V(X1) = E[X1²] − E[X1]² = .2 − (.04) = 0.16
    E[X2] = same,   V(X2) = same

b)  cov(X1, X2) = E[X1 X2] − E[X1]E[X2] = (0) − (.2)² = −.04

c)  Y = X1 + X2
    E[Y] = E[X1] + E[X2] = .2 + .2 = .4
    V(Y) = V(X1 + X2) = V(X1) + V(X2) + 2 cov(X1, X2)
         = 0.16 + 0.16 + (2)(−.04)
         = 0.24
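These numbers follow directly from the joint table; a short sketch (for checking only) that recomputes them:

```python
import numpy as np

# Joint probability table: rows are X1 = 0, 1; columns are X2 = 0, 1.
p = np.array([[0.6, 0.2],
              [0.2, 0.0]])
x = np.array([0, 1])

px1 = p.sum(axis=1)                    # marginal of X1: [.8, .2]
px2 = p.sum(axis=0)                    # marginal of X2: [.8, .2]

E1, E2 = x @ px1, x @ px2
V1, V2 = (x**2) @ px1 - E1**2, (x**2) @ px2 - E2**2
E12 = sum(p[i, j] * x[i] * x[j] for i in range(2) for j in range(2))
cov = E12 - E1 * E2

print(E1, V1, E2, V2)                  # 0.2 0.16 0.2 0.16
print(cov)                             # -0.04
print(E1 + E2, V1 + V2 + 2 * cov)      # E(Y) = 0.4, V(Y) = 0.24
```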
Conditional Expectations
E[X1 | X2 = x2] = ∫ x1 f(x1 | x2) dx1    if continuous
                = Σ_(x1) x1 p(x1 | x2)   if discrete
A soft-drink machine has a random supply Y2 at beginning of a given day and dispenses a random
amount Y1 during the day. It is not re-supplied during the day, so Y1 ≤ Y2 .
It has been observed that
1 / 2
f ( y1 y 2 ) = 
0
0 ≤ y1 ≤ y 2
0 ≤ y2 ≤ 2
elsewhere
That is, (y1,y2) are uniformly distributed over the triangle with given boundaries.
Find conditional distribution of Y1 given Y2=y2.
Solution:
The density is constant over the triangle 0 ≤ y1 ≤ y2 ≤ 2 (bounded above by the line y1 = y2).

Marginal density:

f2(y2) = ∫_(−∞)^∞ f(y1, y2) dy1 = ∫_0^(y2) (1/2) dy1 = (1/2) y2   for 0 ≤ y2 ≤ 2, and 0 elsewhere

Then,

f(y1 | y2) = f(y1, y2) / f2(y2) = (1/2)/((1/2) y2) = 1/y2   for 0 ≤ y1 ≤ y2 ≤ 2

Prob{less than 1/2 gallon is sold, given the machine contains 1 gallon at start}

= P[Y1 < 1/2 | Y2 = 1] = ∫_(−∞)^(1/2) f(y1 | y2 = 1) dy1 = ∫_0^(1/2) (1) dy1 = 1/2

If the machine had 2 gallons at the start, then P[Y1 ≤ 1/2 | Y2 = 2] ≡ 1/4. ∴ The amount sold is
highly dependent on the supply.

Find the conditional expectation of sales, Y1, given Y2 = 1:

E[Y1 | Y2 = 1] = ∫_(−∞)^∞ y1 f(y1 | y2 = 1) dy1 = ∫_0^1 y1 (1) dy1 = 1/2

If the machine contains 1 gallon at the start, the expected sale is 1/2 gallon.
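A brief Monte Carlo sketch (assumed set-up, not part of the original notes) confirms both the conditional probability and the conditional expectation by conditioning on supplies near 1 gallon:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2_000_000

# Sample (y1, y2) uniformly over the triangle 0 <= y1 <= y2 <= 2 (joint density 1/2).
y2 = 2 * np.sqrt(rng.random(n))      # marginal f2(y2) = y2/2 on (0, 2)
y1 = y2 * rng.random(n)              # given y2, y1 is uniform on (0, y2)

near1 = np.abs(y2 - 1.0) < 0.01      # approximate the event {Y2 = 1}
print(np.mean(y1[near1] < 0.5))      # ~1/2
print(y1[near1].mean())              # ~1/2 gallon expected sale
```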
Note

E[X1] = E_X2[ E_X1{X1 | X2} ]

since

E[X1] = ∫_(−∞)^∞ x1 f(x1) dx1
      = ∫_(−∞)^∞ x1 ∫ f(x1, x2) dx2 dx1
      = ∫ ∫ x1 f(x1 | x2) f(x2) dx2 dx1
      = ∫ [ ∫ x1 f(x1 | x2) dx1 ] f(x2) dx2
      = ∫_(−∞)^∞ E[X1 | X2 = x2] f(x2) dx2
Compounding
Very often, the parameters of a distribution are unknown, and sometimes are treated as
random quantities. Assigning distributions to these quantities is called compounding.
For example, let Y be the number of bacteria per cubic centimeter in a liquid, where Y has a Poisson
distribution with mean λ. Also, λ varies from location to location, and for a location
chosen at random, λ has a gamma distribution with parameters α and β.
Find the probability distribution for the bacteria count, Y, at a random location.
Solution:
Since λ is random, the Poisson assumption applies to the conditional
distribution of Y for fixed λ:

p(y | λ) = λ^y e^(−λ) / y!     y = 0, 1, 2, ...

Also

f(λ) = [1/(Γ(α) β^α)] λ^(α−1) e^(−λ/β)   for λ > 0
     = 0                                  elsewhere

The joint distribution of λ and Y is given, then, by

g(y, λ) = p(y | λ) f(λ) = [1/(y! Γ(α) β^α)] λ^(y+α−1) e^(−λ[1 + (1/β)])

The marginal distribution of Y is obtained by integrating over λ:

p(y) = [1/(y! Γ(α) β^α)] ∫_0^∞ λ^(y+α−1) e^(−λ[1 + (1/β)]) dλ
     = [1/(y! Γ(α) β^α)] Γ(y + α) [1 + 1/β]^(−(y+α))

Since α is an integer,

p(y) = [(y + α − 1)! / ((α − 1)! y!)] (1/β)^α (β/(1 + β))^(y+α)
     = C(y + α − 1, α − 1) (1/(1 + β))^α (β/(1 + β))^y

If we let n = y + α and p = 1/(1 + β),
this then has the form of the negative binomial, so we see that the negative binomial is a
reasonably good model for counts where the mean of the count may be random.
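The Poisson-gamma mixture can be checked numerically against the negative binomial; a sketch (using scipy, with arbitrarily chosen α and β) follows:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
alpha, beta = 3, 2.0                      # assumed gamma parameters, for illustration only

# Compound sampling: draw lambda from the gamma, then Y from Poisson(lambda).
lam = rng.gamma(shape=alpha, scale=beta, size=1_000_000)
y = rng.poisson(lam)

# Negative binomial with n = alpha and success probability p = 1/(1 + beta).
p = 1.0 / (1.0 + beta)
for k in range(5):
    print(k, np.mean(y == k), stats.nbinom.pmf(k, alpha, p))   # empirical vs. theoretical
```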
Birth and Death Processes
We have been concerned with counting the number of occurrences of a single type of
event (accidents, defects, etc.). Let us now consider the counts of two types of events, called
births and deaths. For illustrative purposes we shall think of birth as a cell division producing
two cells, and death as the removal of a cell from the system.
Birth and death processes are of use in modeling dynamics of population growth, which can
influence decisions on housing programs, food management, natural resource management, and a
host of economic matters.
Let
λ → birth rate
θ → death rate
Assume the probability of birth or death for an individual cell is independent of the size and age of
the population. Let Y(t) = size of the population at time t and

P[Y(t) = n] = Pn(t).

For an individual cell (the case n = 1), the probability of a division in a small time interval h is
λh + o(h) and the probability of a death is θh + o(h); the probability of more than one birth or
death in the population is of order o(h).
A differential equation for Pn(t) is developed as follows:
If the population is of size n at time (t + h), it must have been of size n − 1, n, or n + 1 at
time t. Thus

Pn(t + h) = λ(n − 1)h Pn−1(t) + [1 − nλh − nθh] Pn(t) + θ(n + 1)h Pn+1(t) + o(h)

or

(1/h)[Pn(t + h) − Pn(t)] = λ(n − 1)Pn−1(t) − n(λ + θ)Pn(t) + θ(n + 1)Pn+1(t) + o(h)/h

Let h → 0:

dPn(t)/dt = λ(n − 1)Pn−1(t) − n(λ + θ)Pn(t) + θ(n + 1)Pn+1(t)
While a general solution can be obtained, we are interested here in some special cases.

If θ = 0 (pure birth process):

dPn(t)/dt = λ(n − 1)Pn−1(t) − nλ Pn(t)

∴ Pn(t) = C(n − 1, i − 1) e^(−λit) (1 − e^(−λt))^(n−i),    n ≥ i

where i is the size of the population at time t = 0; that is, P[Y(0) = i] = 1.
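A Gillespie-style simulation of the pure birth (Yule) process — a sketch with an assumed λ, i, and time horizon, not from the notes — can be compared with this distribution:

```python
import numpy as np
from math import comb, exp

rng = np.random.default_rng(5)
lam, i0, t_end = 0.7, 2, 1.5          # assumed birth rate, initial size, horizon

def simulate_pure_birth():
    n, t = i0, 0.0
    while True:
        t += rng.exponential(1.0 / (lam * n))   # time until the next division
        if t > t_end:
            return n
        n += 1

samples = np.array([simulate_pure_birth() for _ in range(100_000)])

def pn(n):
    # Pn(t) = C(n-1, i-1) e^{-lam*i*t} (1 - e^{-lam*t})^{n-i}
    return comb(n - 1, i0 - 1) * exp(-lam * i0 * t_end) * (1 - exp(-lam * t_end)) ** (n - i0)

for n in range(i0, i0 + 5):
    print(n, np.mean(samples == n), pn(n))       # empirical vs. formula
```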
If θ > 0 we can show that the probability the population is extinct by time t, starting from i cells, is

P0(t) = [ (θ e^((λ−θ)t) − θ) / (λ e^((λ−θ)t) − θ) ]^i

Taking the limit of P0(t) as t → ∞ we get the probability of ultimate extinction of the population:

lim_(t→∞) P0(t) = (θ/λ)^i   if λ > θ
                = 1          if λ < θ
                = 1          if λ = θ

Also, E[Y(t)] = i e^((λ−θ)t).
Queues
Queue theory is concerned with probabilistic models governing behavior of "customers" arriving
at a certain station and demanding some kind of service. It may deal with people, automobiles,
telephone calls, breakdown of machines, etc. Consider a system that involves one station
dispensing service to customers on a first come, first served basis. Customers arrive according to a
Poisson process with rate λ per hour and depart according to an independent Poisson process with rate
θ per hour (this implies service time is an exponential r.v. with mean 1/θ). The probability of a customer
arrival in a small interval h is λh + o(h) and the probability of a departure is θh + o(h), with the
probability of more than one arrival (departure) being o(h). If Y(t) = number of customers in the system
(being serviced and waiting to be serviced) at time t, and if Pn(t) = P[Y(t) = n], then
dPn(t)/dt = λPn−1(t) − (λ + θ)Pn(t) + θPn+1(t)

It can be shown that for large t, this has a solution, Pn, which does not depend on t (called the
equilibrium distribution). If the solution is free of t it must satisfy

0 = λPn−1 − (λ + θ)Pn + θPn+1

A solution of this recursive equation is

Pn = (1 − λ/θ)(λ/θ)^n    n = 0, 1, 2, 3, ...,   λ < θ

This is the probability of there being n customers in the system at a time t which is far removed from
the start of the system.
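It is easy to verify numerically that this geometric form satisfies the balance equation and sums to one; a minimal sketch with an assumed λ and θ:

```python
lam, theta = 2.0, 5.0                      # assumed arrival and service rates (lam < theta)
rho = lam / theta

P = [(1 - rho) * rho**n for n in range(50)]

print(sum(P))                              # ~1 (the truncated tail is negligible)
# Balance equation 0 = lam*P[n-1] - (lam+theta)*P[n] + theta*P[n+1] for interior n:
for n in range(1, 5):
    print(lam * P[n - 1] - (lam + theta) * P[n] + theta * P[n + 1])   # ~0
```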
Functions of Two Random Variables
Given two random variables we can form another random variable which is a function of
the two given random variables.
Z = g (X , Y )
And we are interested in finding its distribution and its density
F_Z(z) = P[Z ≤ z]   and   f_Z(z),

both in terms of g(X, Y) and f_XY of X and Y, where f_XY is called the joint density
function.

Let D_z be the region of the x,y plane such that g(x, y) ≤ z (the region bounded by the curve
g(x, y) = z).

We can see that

{Z ≤ z} = {g(x, y) ≤ z} = {(x, y) ∈ D_z}

∴ F_Z(z) = P[Z ≤ z] = P{(x, y) ∈ D_z} = ∫∫_(D_z) f_XY(x, y) dx dy

Also, for the event that z ≤ Z ≤ z + dz (the thin region ΔD_z between the curves g(x, y) = z and
g(x, y) = z + dz),

P{z ≤ g(x, y) ≤ z + dz} = f_Z(z) dz = ∫∫_(ΔD_z) f_XY(x, y) dx dy

which can also be obtained by differentiating F_Z(z).

If x and y are discrete,

P{Z = z_r} = Σ P{X = x_k, Y = y_n} = Σ p_kn

where the sum runs over all points (x_k, y_n) on the curve z_r = g(x, y).
Example:

Z = g(X, Y) = X + Y

We note that D_z (such that g(x, y) ≤ z, or x + y ≤ z) is the half plane below and to the left of the
line x + y = z.

Integrating,

F_Z(z) = P{X + Y ≤ z} = ∫_(−∞)^∞ ∫_(−∞)^(z−y) f_XY(x, y) dx dy

Differentiating with respect to z (Leibniz' theorem),

f_Z(z) = ∫_(−∞)^∞ f_XY(z − y, y) dy

We can see that

f_Z(z) dz = P{z ≤ Z ≤ z + dz} = P{z ≤ X + Y ≤ z + dz}

is the probability mass of the strip of width dz between the lines x + y = z and x + y = z + dz.

We can generalize: if x and y are independent (convolution integral),

f_Z(z) = ∫_(−∞)^∞ f_X(z − y) f_Y(y) dy = ∫_(−∞)^∞ f_X(x) f_Y(z − x) dx

i.e. the density of their sum = the convolution of their densities.

Hence, if we have

W = aX + bY   with   X1 = aX,  Y1 = bY,

∴ f_X1(x1) = (1/|a|) f_X(x1/a),    f_Y1(y1) = (1/|b|) f_Y(y1/b)   → from functions of one random variable

∴ if X and Y are independent,

f_W(w) = (1/|ab|) ∫_(−∞)^∞ f_X((w − y1)/a) f_Y(y1/b) dy1
Example:

Two trains arrive at a station some time in the interval (0, T). The arrivals are random
and independent of each other. Let Z = the time interval between the arrivals; we want
to find the density of Z.

Let x and y be the arrival times of the trains.

∴ Z = |X − Y|

First we will find the density of t = x − y, where

f(x) = f(y) = 1/T   on (0, T)

The convolution of the two densities,

f_t(t) = ∫_(−∞)^∞ f_x(t − y) f_y(−y) dy,

is a triangle on (−T, T) with peak 1/T at t = 0, and since Z = |t|,

f_Z(z) = 2 f_t(z) U(z),

which plots as a line falling from 2/T at z = 0 to 0 at z = T.

This was obtained by considering the sliding of f_x(t − y) onto f_y(−y), as follows, with t = x − y.
The convolution of the two functions is

f_t(t) = ∫_(−∞)^∞ f_y(−y) f_x(t − y) dy

where

f_x(t − y) = 1/T   for t − T ≤ y ≤ t,   0 elsewhere.

Note: the limits were developed by observing that

f_x(t − y) = 1/T   for 0 ≤ x ≤ T, i.e. 0 ≤ t − y ≤ T, i.e. t − T ≤ y ≤ t,   0 elsewhere,

and

f_Y(−y) = 1/T   for −T ≤ y ≤ 0,   0 elsewhere.

This has the effect of sliding f_x(t − y) over f_Y(−y). Hence, while the two rectangles overlap from
the left,

∫ f_x(t − y) f_y(−y) dy = (T + t)/T²    for −T ≤ t ≤ 0,

and as f_x(t − y) slides further, into quadrant 1,

∫ f_x(t − y) f_y(−y) dy = (T − t)/T²    for 0 ≤ t ≤ T.

This plots as the triangular density on (−T, T); doubling it for z = |t| ≥ 0 gives

f_Z(z) = 2(T − z)/T²,    0 ≤ z ≤ T.
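A quick simulation sketch (with an assumed T = 8, purely illustrative) comparing the interval between two independent uniform arrivals with the density 2(T − z)/T²:

```python
import numpy as np

rng = np.random.default_rng(6)
T = 8.0
x = rng.uniform(0, T, size=1_000_000)
y = rng.uniform(0, T, size=1_000_000)
z = np.abs(x - y)                               # interval between the two arrivals

# Compare an empirical histogram with f_Z(z) = 2(T - z)/T^2 on (0, T).
hist, edges = np.histogram(z, bins=8, range=(0, T), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for c, h in zip(centers, hist):
    print(f"z={c:.1f}  empirical={h:.4f}  formula={2 * (T - c) / T**2:.4f}")
```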
Now, if x and y are independent, with

f_x(x) = α e^(−αx) U(x),    f_y(y) = β e^(−βy) U(y),    z = x + y,

∴ f_z(z) = αβ ∫_0^z e^(−α(z−y)) e^(−βy) dy
         = [αβ/(β − α)] (e^(−αz) − e^(−βz))   for β ≠ α, z > 0
         = α² z e^(−αz)                        for β = α, z > 0
         = 0                                   for z < 0
Example:

Suppose X and Y are exponential with parameter λ. Then

f_Z(t) = ∫_(−∞)^∞ f_Y(y) f_X(t − y) dy

with

f_Y(y) = λ e^(−λy)           for y ≥ 0,   0 otherwise
f_X(t − y) = λ e^(−λ(t−y))   for y ≤ t,   0 otherwise

f_Z(t) = ∫_0^t λ e^(−λy) λ e^(−λ(t−y)) dy = λ² e^(−λt) ∫_0^t dy = λ² t e^(−λt)

f_Z(t) = λ² t e^(−λt)   for t > 0,   0 otherwise

Standby redundancy: Z is the lifetime of a system with one spare component.
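The λ²te^(−λt) result (a gamma density with shape 2) is easy to confirm by simulating a component plus its standby spare; a sketch with an assumed λ:

```python
import numpy as np

rng = np.random.default_rng(7)
lam = 0.5                                   # assumed failure rate
n = 1_000_000

# Lifetime of the original component plus the spare that replaces it.
z = rng.exponential(1 / lam, n) + rng.exponential(1 / lam, n)

hist, edges = np.histogram(z, bins=200, range=(0, 20), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for ti in (1.0, 2.0, 4.0, 8.0):
    k = np.argmin(np.abs(centers - ti))
    print(f"t={ti}  empirical={hist[k]:.4f}  formula={lam**2 * ti * np.exp(-lam * ti):.4f}")
```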
Example

Consider the following function:

Z = X² + Y²

∴ z = x² + y² ⇒ D_z is the disk x² + y² ≤ z, and

F_Z(z) = ∫∫_(x²+y²≤z) f_XY(x, y) dx dy

With

f_XY(x, y) = [1/(2πσ²)] e^(−(x²+y²)/2σ²)

and changing to polar coordinates, x = r cos θ, y = r sin θ,

F_Z(z) = ∫∫_(D_z) [1/(2πσ²)] e^(−(x²+y²)/2σ²) dx dy
       = [1/(2πσ²)] ∫_0^(√z) 2πr e^(−r²/2σ²) dr
       = 1 − e^(−z/2σ²),    z > 0

∴ f_Z(z) = [1/(2σ²)] e^(−z/2σ²) U(z)

Alternatively,

F_Z(z) = [1/(2πσ²)] ∫_0^(√z) ∫_0^(2π) e^(−r²/2σ²) (r dθ dr)
       = (1/σ²) ∫_0^(√z) r e^(−r²/2σ²) dr
       = [−e^(−r²/2σ²)]_0^(√z)
       = 1 − e^(−z/2σ²),    z > 0

and the density function follows.
Example

Consider the function described as

Z = +√(X² + Y²)

with

f_XY(x, y) = [1/(2πσ²)] e^(−(x²+y²)/2σ²)

Now D_z is the disk of radius z, so

F_Z(z) = [1/(2πσ²)] ∫_0^z 2πr e^(−r²/2σ²) dr = 1 − e^(−z²/2σ²),    z > 0

∴ f_Z(z) = (z/σ²) e^(−z²/2σ²) U(z)   → Rayleigh density

We note

E[z] = σ √(π/2),    E[z²] = 2σ²,

∴ σ_z² = (2 − π/2) σ²
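A simulation sketch (assumed σ = 1) that compares √(X² + Y²) for independent zero-mean normals with the Rayleigh density and its moments:

```python
import numpy as np

rng = np.random.default_rng(8)
sigma = 1.0
x = rng.normal(0, sigma, 1_000_000)
y = rng.normal(0, sigma, 1_000_000)
z = np.hypot(x, y)                                    # sqrt(x^2 + y^2)

print(z.mean(), sigma * np.sqrt(np.pi / 2))           # E[z] vs sigma*sqrt(pi/2)
print((z**2).mean(), 2 * sigma**2)                    # E[z^2] vs 2*sigma^2

# Spot-check the density f_Z(z) = (z / sigma^2) exp(-z^2 / (2 sigma^2)) at z0 = 1.
z0 = 1.0
hist, edges = np.histogram(z, bins=100, range=(0, 4), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
k = np.argmin(np.abs(centers - z0))
print(hist[k], (z0 / sigma**2) * np.exp(-z0**2 / (2 * sigma**2)))
```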
Functions of Two Random Variables
We consider, now, two functions of two random variables. Given
z = g(x, y)
w = h(x, y)

The random variables have a joint distribution and density,

F_zw(z, w),   f_zw(z, w).

Letting z and w be two real numbers, D_zw corresponds to the region of the xy plane such that

g(x, y) ≤ z   and   h(x, y) ≤ w

∴ {Z ≤ z, W ≤ w} = {(x, y) ∈ D_zw}

∴ F_zw(z, w) = P{Z ≤ z, W ≤ w} = P{(x, y) ∈ D_zw} = ∫∫_(D_zw) f_xy(x, y) dx dy
Example

z = +√(x² + y²),    w = arctan(x/y),    −π/2 < w ≤ π/2

For z > 0 and |w| ≤ π/2, D_zw is such that

√(x² + y²) ≤ z   and   arctan(x/y) ≤ w,

i.e. the sector of the disk of radius z swept out up to polar angle w.

If

f_XY(x, y) = [1/(2πσ²)] e^(−(x²+y²)/2σ²),   with x = r cos θ, y = r sin θ,

then

F_zw(z, w) = (1 − e^(−z²/2σ²)) (π + 2w)/(2π)   for |w| < π/2, z > 0
           = (1 − e^(−z²/2σ²))                  for w > π/2, z > 0

(this holds for all points inside the circle of radius z).

We observe that F_zw factors into a product F_z(z) F_w(w), so z and w are independent.
Note that for z < 0 or w < −π/2, F_zw = 0.

We conclude

F_z(z) = (1 − e^(−z²/2σ²)) U(z)   → as before, the Rayleigh distribution

and

F_w(w) = 1                 for w > π/2
       = (π + 2w)/(2π)     for |w| ≤ π/2
       = 0                 for w ≤ −π/2

→ which is a uniform distribution on (−π/2, π/2].
Jacobian Transformation

We can determine the joint density of z and w in terms of f_xy and J(x, y) utilizing a Jacobian
transformation, where

J(x, y) = Jacobian = | ∂g(x,y)/∂x   ∂g(x,y)/∂y |
                     | ∂h(x,y)/∂x   ∂h(x,y)/∂y |

Theorem

To determine f_zw(z, w):

1) Solve g(x, y) = z, h(x, y) = w for x and y in terms of z and w.
2) If (x1, y1), ..., (xn, yn) are all the real solutions, then

   f_zw(z, w) = f_xy(x1, y1)/|J(x1, y1)| + ... + f_xy(xn, yn)/|J(xn, yn)|

3) If for certain values (z, w) no real solutions exist, then f_zw(z, w) ≡ 0.
Example: Consider the simple linear functions (underlined variables are random variables)

z = aX + bY
w = cX + dY

Then the system

z = ax + by   and   w = cx + dy

has a unique solution

x1 = a1 z + b1 w,    y1 = c1 z + d1 w

for any z and w, where a1, b1, c1, d1 are functions of a, b, c, d. The Jacobian of the inverse
transformation is

| a1  b1 |
| c1  d1 |  = a1 d1 − b1 c1 = 1/(ad − bc)

∴ f_zw(z, w) = [1/|ad − bc|] f_xy(a1 z + b1 w, c1 z + d1 w)
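A numerical sketch of the linear case (assumed coefficients, independent standard normal X and Y) that compares the Jacobian-formula density with a histogram estimate:

```python
import numpy as np

rng = np.random.default_rng(9)
a, b, c, d = 2.0, 1.0, 1.0, -1.0                  # assumed coefficients
det = a * d - b * c

# Inverse transform coefficients: x = a1*z + b1*w, y = c1*z + d1*w.
a1, b1 = d / det, -b / det
c1, d1 = -c / det, a / det

def f_xy(x, y):                                    # independent standard normals
    return np.exp(-(x**2 + y**2) / 2) / (2 * np.pi)

def f_zw(z, w):                                    # Jacobian-transform formula
    return f_xy(a1 * z + b1 * w, c1 * z + d1 * w) / abs(det)

x = rng.normal(size=2_000_000)
y = rng.normal(size=2_000_000)
z, w = a * x + b * y, c * x + d * y

# Empirical joint density near (z, w) = (1, 0) versus the formula.
dz = 0.1
mask = (np.abs(z - 1.0) < dz) & (np.abs(w - 0.0) < dz)
print(mask.mean() / (2 * dz) ** 2, f_zw(1.0, 0.0))
```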
Example:

z = +√(x² + y²),    w = x/y

∴ The system z = +√(x² + y²), w = x/y has two solutions (for any w):

z = √(w²y² + y²) = |y| √(w² + 1)

∴ y1 = z/√(1 + w²),    x1 = zw/√(1 + w²)
   x2 = −x1,            y2 = −y1

Now

J(x, y) = | ∂g(x,y)/∂x   ∂g(x,y)/∂y |   | x/√(x²+y²)   y/√(x²+y²) |
          | ∂h(x,y)/∂x   ∂h(x,y)/∂y | = |    1/y          −x/y²     |

        = −x²/(y² √(x²+y²)) − 1/√(x²+y²) = −w²/z − 1/z = −(w² + 1)/z

Hence

f_zw(z, w) = [z/(1 + w²)] [ f_xy( zw/√(1+w²), z/√(1+w²) ) + f_xy( −zw/√(1+w²), −z/√(1+w²) ) ]

and if z < 0, f_zw(z, w) ≡ 0.
Two Functions of Two Random Variables
Before we develop the theorem for transforming density functions of two functions of two r.v., let
us review some mathematics. Evaluating multiple integrals can frequently be simplified by
making changes in the independent variables - a coordinate transformation. We can consider the
handling of two functions of two r.v. as a coordinate transformation:

z = g(x, y)
w = h(x, y)        (1)
where g( ) and h( ) are continuous in their first partial derivatives in a region R, and the Jacobian,
J, does not vanish in the region R:

J(x, y) = | ∂z/∂x   ∂z/∂y |
          | ∂w/∂x   ∂w/∂y |        (2)

Now, equations (1) can be solved for x and y in terms of z and w to yield

x = θ1(z, w)
y = θ2(z, w)        (3)
If we now let z and w take on specific, fixed values, then

z0 = g(x, y)
w0 = h(x, y)        (4)

These equations determine two curves that intersect at a point (x0, y0) such that

x0 = θ1(z0, w0),    y0 = θ2(z0, w0)

[Sketch: the curves z = z0 and w = w0 crossing at (x0, y0).]

And if z and w are assigned a sequence of constant values ((z1, w1), (z2, w2), ..., (zn, wn)), then a
network of curves will be determined that intersect at the points (x1, y1), (x2, y2), ..., (xn, yn).

[Sketch: the network of curves z = z0, z1, z2 and w = w0, w1, w2 and their intersection points.]
Corresponding to any point whose rectangular coordinates are (x, y) there will be a pair of curves,
z = constant and w = constant, which pass through this point.
The totality of numbers (z, w) determines a curvilinear coordinate system, and the curves themselves
are called coordinate curves.

For example, if z = √(x² + y²) and w = tan⁻¹(y/x), then

z = constant defines a family of circles,
w = constant defines a family of radial lines.

The curvilinear system in this case is simply the ordinary polar coordinate system.
[Sketch: circles z = z1, z2, z3 and radial lines w = w1, w2, w3 forming the polar coordinate net.]
Now, in the cartesian (xy) coordinate system, the element of area dA = dx dy is the area of a
rectangle formed by the intersection of the coordinate lines

x = x0,   x = x0 + dx,   y = y0,   y = y0 + dy.

[Sketch: the rectangle with corners p1, p2, p3, p4 between y = y0, y = y0 + dy and x = x0, x = x0 + dx.]

In the curvilinear (zw) coordinates, the element of area dA can be visualized, or transformed, as
the area of the quadrilateral p1 p2 p3 p4 (call it Bi) formed by the intersection of the coordinate lines

z = z0,   z = z0 + dz,   w = w0,   w = w0 + dw.

[Sketch: the curvilinear quadrilateral Bi bounded by z = z0, z = z0 + dz, w = w0, w = w0 + dw.]
Consider the original problem, where we want the joint density function of our two new
functions of two r.v. Given

z = g(x, y)
w = h(x, y)        (5)

this system has some number of unique, real roots (solutions) (xi, yi), and these roots (points) (xi, yi)
transform the differential area A (= dz dw) into one or more differential quadrilaterals Bi.
The area of the ith quadrilateral equals (Sokolnikoff/Redheffer, Hildebrand)

dz dw / |J(xi, yi)|

where

J(x, y) = | ∂g/∂x   ∂g/∂y |
          | ∂h/∂x   ∂h/∂y |

[Sketch: the rectangle dz × dw in the zw plane and its preimage quadrilaterals around (x1, y1) and (x2, y2) in the xy plane.]

ΔD_zw = region of the xy plane such that   z ≤ g(x, y) ≤ z + dz,   w ≤ h(x, y) ≤ w + dw

Now

{z ≤ Z ≤ z + dz, w ≤ W ≤ w + dw} = {(x, y) ∈ ΔD_zw}

∴ P{z ≤ Z ≤ z + dz, w ≤ W ≤ w + dw} = f_zw(z, w) dz dw

∴ f_zw dz dw = ∫∫_(ΔD_zw) f_xy(x, y) dx dy        (6)

So, given z and w, ΔD_zw consists of several quadrilaterals, one for each solution (xn, yn). Hence the
transformed area of the nth quadrilateral is

dz dw / |J(xn, yn)|

and contains the probability mass

[f_xy(xn, yn) / |J(xn, yn)|] dz dw

Summing the masses of all the quadrilaterals, Equation (6) becomes

f_zw dz dw = [ f_xy(x1, y1)/|J(x1, y1)| + f_xy(x2, y2)/|J(x2, y2)| + ... ] dz dw

summed over all distinct roots.
To restate the argument formally: given

z = g(x, y)
w = h(x, y)        (5)

P{z ≤ Z ≤ z + dz, w ≤ W ≤ w + dw} = f_zw dz dw.

The solutions (xi, yi) of the system (5),

z = g(xi, yi),   w = h(xi, yi),

transform the differential rectangle A into one or more differential quadrilaterals Bi. The areas are
related through the Jacobian:

Area(Bi) |J(xi, yi)| = A,   so   Bi = A / |J(xi, yi)|

These are the same outcomes, and each Bi carries the probability mass

P{(x, y) ∈ Bi} = f_xy(xi, yi) Area(Bi) = [f_xy(xi, yi) / |J(xi, yi)|] dz dw     ← transformed area

Since

{z ≤ Z ≤ z + dz, w ≤ W ≤ w + dw} = ∪_i {(x, y) ∈ Bi},

f_zw(z, w) dz dw = Σ_(i=1)^n [f_xy(xi, yi) / |J(xi, yi)|] dz dw.        Q.E.D.
Functions of Independent Random Variables
We had, for two independent events A and B, P(AB) = P(A)P(B). We can define:

"Two random variables X and Y are called independent if the events {X ≤ x} and {Y ≤ y} are
independent for any x and y." That is,

P{X ≤ x, Y ≤ y} = P{X ≤ x} P{Y ≤ y}

∴ F_xy(x, y) = F_x(x) F_y(y)
   f_xy(x, y) = f_x(x) f_y(y)

Also

f_y(y | x) = f_y(y),    f_x(x | y) = f_x(x)

and

∫_(y1)^(y2) ∫_(x1)^(x2) f_xy(x, y) dx dy = ∫_(x1)^(x2) f_x(x) dx · ∫_(y1)^(y2) f_y(y) dy

Similarly, if we have discrete random variables,

P{X ≤ xk, Y ≤ yn} = P{X ≤ xk} P{Y ≤ yn}
Given z = g(x), w = h(y), and that x and y are independent, we have two events {Z ≤ z} and {W ≤ w}.

Letting

I_z = the set of numbers x such that g(x) ≤ z
I_w = the set of numbers y such that h(y) ≤ w

Then

{Z ≤ z} = {x ∈ I_z}
{W ≤ w} = {y ∈ I_w}        (*)

Since x and y are independent, the events in (*) are independent.

∴ {Z ≤ z} is independent of {W ≤ w}.

This could also be deduced from the joint probability density. Assuming

g(x) = z, h(y) = w has a single solution (x1, y1),

J = | g'(x)    0    | = g'(x1) h'(y1)
    |   0    h'(y)  |

and f_xy(x, y) = f_x(x) f_y(y),

∴ f_zw(z, w) = f_xy(x1, y1)/|J| = [f_x(x1)/|g'(x1)|] · [f_y(y1)/|h'(y1)|] = F1(x1) · F2(y1)

Since x1 is a function of z only and y1 is a function of w only, the joint density factors,

∴ z and w are independent.
Conditional Distributions
We can also talk about Conditional Distributions. We had
F_Y(y | M) = P[Y ≤ y | M] = P[Y ≤ y, M] / P[M]    when P[M] ≠ 0

Suppose

M = {X ≤ x},   ∴ P(M) = F_X(x)

∴ F_Y(y | X ≤ x) = P[X ≤ x, Y ≤ y] / P[X ≤ x] = F_xy(x, y) / F_x(x)

i.e. F_Y(y | X ≤ x) is the ratio of the probability mass in the region {X ≤ x, Y ≤ y} to the mass in the
region {X ≤ x}.

∴ f_Y(y | X ≤ x) = [∂F_xy(x, y)/∂y] / F_x(x)
                 = ∫_(−∞)^x f_xy(ξ, y) dξ / ∫_(−∞)^∞ ∫_(−∞)^x f_xy(ξ, y) dξ dy
Now if we suppose

M = {x1 < X ≤ x2},   P(M) = F(x2) − F(x1) ≠ 0,

we can write

F_Y(y | x1 < X ≤ x2) = P[x1 < X ≤ x2, Y ≤ y] / P[x1 < X ≤ x2] = [F_xy(x2, y) − F_xy(x1, y)] / [F_x(x2) − F_x(x1)]

f_Y(y | x1 < X ≤ x2) = ∫_(x1)^(x2) f_xy(x, y) dx / [F_x(x2) − F_x(x1)]

Note that f_Y(y | x1 < X ≤ x2) dy is the ratio of the mass in the small strip {x1 < x ≤ x2, y < Y ≤ y + dy}
to the mass in the big strip {x1 < x ≤ x2}.
We could also show, as we did previously, that if M = {X = x},

F_Y(y | X = x) = ∫_(−∞)^y f_xy(x, ρ) dρ / f_x(x)

f_Y(y | X = x) = f_xy(x, y) / f_x(x) = f_xy(x, y) / ∫_(−∞)^∞ f_xy(x, y) dy

Similarly

F_X(x | Y = y) = ∫_(−∞)^x f_xy(x, y) dx / f_y(y)

f_X(x | Y = y) = f_xy(x, y) / f_y(y)

We have another form of Bayes' theorem:

f_Y(y | X = x) = f_X(x | Y = y) f_y(y) / f_x(x)

We will write these as

F(y | x), F(x | y),   f(y | x), f(x | y).
Total Probability
From

f_Y(y) = ∫_(−∞)^∞ f_xy(x, y) dx   and   f_xy(x, y) = f_Y(y | X = x) f_x(x),

f_Y(y) = ∫_(−∞)^∞ f_Y(y | X = x) f_x(x) dx
Joint Density
Assume that F(x, y) has partial derivatives up to order 2. We define

f(x, y) = ∂²F(x, y) / ∂x ∂y

Hence, as with the marginal density,

f(x, y) dx dy = P{x < X ≤ x + dx, y < Y ≤ y + dy}

We note

P{x < X ≤ x + dx, Y ≤ y} = F(x + dx, y) − F(x, y) = [∂F(x, y)/∂x] dx

∴ P{X ≤ x, y < Y ≤ y + dy} = [∂F(x, y)/∂y] dy

Consider the event that (x, y) lies in a region D of the xy plane:

{(x, y) ∈ D} = all outcomes ξ such that a point with the coordinates x(ξ), y(ξ) is in D.

Breaking D into elemental areas dx dy and summing on x and on y in D,

{(x, y) ∈ D} = {x < X ≤ x + dx1, y < Y ≤ y + dy1} + { ... } + ...

∴ P{(x, y) ∈ D} = ∫∫_D f(x, y) dx dy

We can then write

F(x, y) = ∫_(−∞)^y ∫_(−∞)^x f(x, y) dx dy

And

∫_(−∞)^∞ ∫_(−∞)^∞ f(x, y) dx dy = 1
We could also show this by differentiating the above equation for F(x, y) with respect to x.
Using Leibniz' rule,

d/dα ∫_(u0(α))^(u1(α)) f(x, α) dx = f(u1, α) du1/dα − f(u0, α) du0/dα + ∫_(u0(α))^(u1(α)) ∂f(x, α)/∂α dx

we get

∂F(x, y)/∂x = ∫_(−∞)^y [ f(x, y′)·1 − 0 + ∫_(−∞)^x 0 dξ ] dy′ = ∫_(−∞)^y f(x, ρ) dρ

Similarly,

∂F(x, y)/∂y = ∫_(−∞)^x f(ξ, y) dξ
A physical interpretation of the density is that f(x,y) is analogous to mass density over the plane.
Relationship Between Marginal and Joint Density
We have shown

F_x(x) = F_xy(x, ∞) = ∫_(−∞)^∞ ∫_(−∞)^x f_xy(x, y) dx dy

Differentiating with respect to x, using Leibniz' rule,

f_x(x) = ∫_(−∞)^∞ f_xy(x, y) dy

∴ f_y(y) = ∫_(−∞)^∞ f_xy(x, y) dx

(Geometrically, f_x(x1) is the area under the curve f_xy(x1, y), viewed as a function of y, over the slice x = x1.)
Example:
Two quality control inspectors each interrupt a production line at random, but
independently, selected times within a given 8hr day. Find the probability that the
two interruptions will be more than 4 hours apart.
Solution:
Let

x_i = the hour of the interruption time of inspector i,   i = 1, 2

p(x_i) = 1/8,        p(x1, x2) = (1/8)(1/8) = 1/64
Then, what we want to find is

P{|x1 − x2| > 4} = ?    and    P(A) = ∫∫_(R_A) f_xy dx dy

Let's draw our sample diagram: the 8 × 8 square with the lines x2 = x1 + 4 (x1 − x2 = −4) and
x2 = x1 − 4 (x1 − x2 = 4). The event A consists of the two shaded corner triangles, above the first
line and below the second:

A:   ∫_0^4 ∫_(x1+4)^8 dx2 dx1 - region    together with    ∫_4^8 ∫_0^(x1−4) dx2 dx1 - region

So we need to integrate over the shaded areas:

P{|x1 − x2| > 4} = ∫_0^4 ∫_(x1+4)^8 (1/64) dx2 dx1 + ∫_4^8 ∫_0^(x1−4) (1/64) dx2 dx1
                 = 1/8 + 1/8 = 1/4

Note:
We could also have integrated over horizontal strips (using the lines x1 = x2 − 4 and x1 = x2 + 4):

A = ∫_4^8 ∫_0^(x2−4) (1/64) dx1 dx2   and   A = ∫_0^4 ∫_(x2+4)^8 (1/64) dx1 dx2

∴ P[|x1 − x2| > 4] = 1/8 + 1/8 = 1/4
Discrete Random Variables
What we have completed holds for continuous random variables. To incorporate discrete random
variables, we would have to use the double impulse function. Rather than do this, we introduce the
notion of point or line masses. Suppose x and y are both of discrete type, taking values xk and yn,
with

P{X = xk, Y = yn} = p_kn

∴ we have only point masses at the points (xk, yn). Since the entire mass in the plane ≡ 1,

Σ_(k,n) p_kn = 1

Now

{X = xk} = {X = xk, Y = y1} + {X = xk, Y = y2} + ...

∴ P{X = xk} = p_k = Σ_n P{X = xk, Y = yn}   → Marginal probability function:
  the mass of all points on the line X = xk.

Similarly

P{Y = yn} = q_n = Σ_k p_kn   → the mass of all points on the line Y = yn.
Example:
Rolling of a fair die.
a) We define our random variables as

X = i   and   Y = 2i,     i = 1, 2, ..., 6

There are only 6 point masses, at the points

(1,2) (2,4) (3,6) (4,8) (5,10) (6,12)

and each mass = 1/6. (Plotted, the masses lie on the line y = 2x.)
(b) Suppose we roll the die twice:

x = first appearing number
y = second appearing number

There are 36 outcomes, ∴ each mass = 1/36. (Plotted, the masses form the 6 × 6 lattice of points
(x, y), x, y = 1, ..., 6.)
c) Suppose we define

x = |k − n|
y = k + n

where k and n are the numbers appearing on the two rolls. The number of outcomes corresponding to
each (x, y) pair can be summarized in the following table (blank entries are zero):

            Y = 2   3   4   5   6   7   8   9  10  11  12
    X = 0       1       1       1       1       1       1
    X = 1           2       2       2       2       2
    X = 2               2       2       2       2
    X = 3                   2       2       2
    X = 4                       2       2
    X = 5                           2

The probabilities are obtained by dividing by 36.
(Plotted in the (x, y) plane, the entries above appear as point masses of 1/36 and 2/36.)

Adding the masses in each column or row we get P{X = x} and P{Y = y} respectively:

    x =       0      1      2      3      4      5
    P{x=}   6/36  10/36   8/36   6/36   4/36   2/36

    y =       2     3     4     5     6     7     8     9    10    11    12
    P{y=}   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
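The table and both marginals can be generated mechanically; a short sketch enumerating the 36 equally likely rolls:

```python
from collections import Counter
from fractions import Fraction

# Enumerate the 36 outcomes of two fair-die rolls and tabulate (x, y) = (|k - n|, k + n).
joint = Counter((abs(k - n), k + n) for k in range(1, 7) for n in range(1, 7))

px = Counter()
py = Counter()
for (x, y), count in joint.items():
    px[x] += count
    py[y] += count

print({x: Fraction(c, 36) for x, c in sorted(px.items())})   # 0: 1/6, 1: 5/18, 2: 2/9, ...
print({y: Fraction(c, 36) for y, c in sorted(py.items())})   # 2: 1/36, 3: 1/18, ..., 12: 1/36
```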