Jointly Distributed Random Variables
Multivariate distributions
Quite often there will be two or more random variables (X, Y, etc.) defined for the same random experiment.
Example:
A bridge hand (13 cards) is selected from a deck
of 52 cards.
X = the number of spades in the hand.
Y = the number of hearts in the hand.
In this example we will define:
p(x,y) = P[X = x, Y = y]
The function
p(x,y) = P[X = x, Y = y]
is called the joint probability function of
X and Y.
Note:
The possible values of X are 0, 1, 2, …, 13
The possible values of Y are also 0, 1, 2, …, 13
and X + Y ≤ 13.
p(x, y) = P[X = x, Y = y] = C(13, x) C(13, y) C(26, 13 − x − y) / C(52, 13)
Here C(13, x) is the number of ways of choosing the x spades for the hand, C(13, y) is the number of ways of choosing the y hearts for the hand, C(26, 13 − x − y) is the number of ways of completing the hand with diamonds and clubs, and C(52, 13) is the total number of ways of choosing the 13 cards for the hand.
Table: p(x, y) = C(13, x) C(13, y) C(26, 13 − x − y) / C(52, 13)
(rows: x = 0, …, 13; columns: y = 0, …, 13; a dash marks the impossible combinations with x + y > 13)

x\y     0      1      2      3      4      5      6      7      8      9      10     11     12     13
 0  0.0000 0.0002 0.0009 0.0024 0.0035 0.0032 0.0018 0.0006 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000
 1  0.0002 0.0021 0.0085 0.0183 0.0229 0.0173 0.0081 0.0023 0.0004 0.0000 0.0000 0.0000 0.0000   -
 2  0.0009 0.0085 0.0299 0.0549 0.0578 0.0364 0.0139 0.0032 0.0004 0.0000 0.0000 0.0000   -      -
 3  0.0024 0.0183 0.0549 0.0847 0.0741 0.0381 0.0116 0.0020 0.0002 0.0000 0.0000   -      -      -
 4  0.0035 0.0229 0.0578 0.0741 0.0530 0.0217 0.0050 0.0006 0.0000 0.0000   -      -      -      -
 5  0.0032 0.0173 0.0364 0.0381 0.0217 0.0068 0.0011 0.0001 0.0000   -      -      -      -      -
 6  0.0018 0.0081 0.0139 0.0116 0.0050 0.0011 0.0001 0.0000   -      -      -      -      -      -
 7  0.0006 0.0023 0.0032 0.0020 0.0006 0.0001 0.0000   -      -      -      -      -      -      -
 8  0.0001 0.0004 0.0004 0.0002 0.0000 0.0000   -      -      -      -      -      -      -      -
 9  0.0000 0.0000 0.0000 0.0000 0.0000   -      -      -      -      -      -      -      -      -
10  0.0000 0.0000 0.0000 0.0000   -      -      -      -      -      -      -      -      -      -
11  0.0000 0.0000 0.0000   -      -      -      -      -      -      -      -      -      -      -
12  0.0000 0.0000   -      -      -      -      -      -      -      -      -      -      -      -
13  0.0000   -      -      -      -      -      -      -      -      -      -      -      -      -
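As a quick numerical check, the joint probability function above can be evaluated directly (a sketch assuming Python 3.8+ for `math.comb`; the function name `p` is just illustrative):

```python
from math import comb

def p(x, y):
    """Joint pmf: x spades and y hearts in a 13-card bridge hand."""
    if x < 0 or y < 0 or x + y > 13:
        return 0.0
    return comb(13, x) * comb(13, y) * comb(26, 13 - x - y) / comb(52, 13)

print(round(p(3, 3), 4))   # 0.0847, matching the table entry
# the probabilities over all possible (x, y) sum to 1
total = sum(p(x, y) for x in range(14) for y in range(14))
print(round(total, 10))    # 1.0
```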
Bar graph: p(x, y) (3-D bar chart of the joint probabilities over x = 0, …, 13 and y = 0, …, 13; figure omitted)
Example:
A die is rolled n = 5 times
X = the number of times a “six” appears.
Y = the number of times a “five” appears.
Now
p(x,y) = P[X = x, Y = y]
The possible values of X are 0, 1, 2, 3, 4, 5.
The possible values of Y are 0, 1, 2, 3, 4, 5.
and X + Y ≤ 5
A typical outcome of rolling a die n = 5 times will be a sequence such as F5FF6, where F denotes an outcome in {1, 2, 3, 4}. The probability of any such sequence is
(1/6)^x (1/6)^y (4/6)^(5 − x − y)
where
x = the number of sixes in the sequence and
y = the number of fives in the sequence.
Now
p(x, y) = P[X = x, Y = y] = K (1/6)^x (1/6)^y (4/6)^(5 − x − y)
where K = the number of sequences of length 5 containing x sixes and y fives:
K = C(5, x) C(5 − x, y) = [5!/(x!(5 − x)!)] [(5 − x)!/(y!(5 − x − y)!)] = 5!/(x! y! (5 − x − y)!)
Thus
p(x, y) = P[X = x, Y = y] = [5!/(x! y! (5 − x − y)!)] (1/6)^x (1/6)^y (4/6)^(5 − x − y) if x + y ≤ 5.
Table: p(x, y) = [5!/(x! y! (5 − x − y)!)] (1/6)^x (1/6)^y (4/6)^(5 − x − y)
(rows: x = 0, …, 5; columns: y = 0, …, 5)

x\y     0      1      2      3      4      5
 0  0.1317 0.1646 0.0823 0.0206 0.0026 0.0001
 1  0.1646 0.1646 0.0617 0.0103 0.0006 0
 2  0.0823 0.0617 0.0154 0.0013 0      0
 3  0.0206 0.0103 0.0013 0      0      0
 4  0.0026 0.0006 0      0      0      0
 5  0.0001 0      0      0      0      0
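The formula can be checked numerically against the table (a sketch using only the Python standard library):

```python
from math import factorial

def p(x, y, n=5):
    """Joint pmf of the number of sixes (x) and fives (y) in n die rolls."""
    if x < 0 or y < 0 or x + y > n:
        return 0.0
    coef = factorial(n) // (factorial(x) * factorial(y) * factorial(n - x - y))
    return coef * (1/6)**x * (1/6)**y * (4/6)**(n - x - y)

print(round(p(0, 0), 4))  # 0.1317  (= (4/6)^5)
print(round(p(1, 1), 4))  # 0.1646
```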
Bar graph: p(x, y) (3-D bar chart of the joint probabilities over x = 0, …, 5 and y = 0, …, 5; figure omitted)
General properties of the joint probability function
p(x, y) = P[X = x, Y = y]
1. 0 ≤ p(x, y) ≤ 1
2. Σ_x Σ_y p(x, y) = 1
3. P[(X, Y) ∈ A] = Σ_{(x, y) ∈ A} p(x, y)
Example:
A die is rolled n = 5 times
X = the number of times a “six” appears.
Y = the number of times a “five” appears.
What is the probability that we roll more sixes than fives, i.e. what is P[X > Y]?
Table: p(x, y) = [5!/(x! y! (5 − x − y)!)] (1/6)^x (1/6)^y (4/6)^(5 − x − y)
(rows: x = 0, …, 5; columns: y = 0, …, 5)

x\y     0      1      2      3      4      5
 0  0.1317 0.1646 0.0823 0.0206 0.0026 0.0001
 1  0.1646 0.1646 0.0617 0.0103 0.0006 0
 2  0.0823 0.0617 0.0154 0.0013 0      0
 3  0.0206 0.0103 0.0013 0      0      0
 4  0.0026 0.0006 0      0      0      0
 5  0.0001 0      0      0      0      0

P[X > Y] = Σ_{x > y} p(x, y) = 0.3441
(the sum of the table entries with x > y)
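The same sum can be recomputed directly from the joint probability function (a sketch using only the standard library):

```python
from math import factorial

def p(x, y):
    """Joint pmf of sixes (x) and fives (y) in 5 die rolls."""
    if x + y > 5:
        return 0.0
    c = factorial(5) // (factorial(x) * factorial(y) * factorial(5 - x - y))
    return c * (1/6)**x * (1/6)**y * (4/6)**(5 - x - y)

prob = sum(p(x, y) for x in range(6) for y in range(6) if x > y)
print(round(prob, 4))  # 0.3441
```

By symmetry P[X > Y] = P[Y > X] = (1 − P[X = Y])/2, which gives the same value.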
Marginal and conditional
distributions
Definition:
Let X and Y denote two discrete random variables
with joint probability function
p(x,y) = P[X = x, Y = y]
Then
pX(x) = P[X = x] is called the marginal
probability function of X.
and
pY(y) = P[Y = y] is called the marginal
probability function of Y.
Note: Let y1, y2, y3, … denote the possible values of Y. Then
pX(x) = P[X = x]
      = P[{X = x, Y = y1} ∪ {X = x, Y = y2} ∪ ⋯]
      = P[X = x, Y = y1] + P[X = x, Y = y2] + ⋯
      = p(x, y1) + p(x, y2) + ⋯ = Σ_j p(x, y_j) = Σ_y p(x, y)
Thus the marginal probability function of X, pX(x), is obtained from the joint probability function of X and Y by summing p(x, y) over the possible values of Y.
Also, with x1, x2, … the possible values of X,
pY(y) = P[Y = y]
      = P[X = x1, Y = y] + P[X = x2, Y = y] + ⋯
      = p(x1, y) + p(x2, y) + ⋯ = Σ_i p(x_i, y) = Σ_x p(x, y)
Example:
A die is rolled n = 5 times
X = the number of times a “six” appears.
Y = the number of times a “five” appears.
The joint probabilities with marginal totals (rows: y = 0, …, 5; columns: x = 0, …, 5):

y\x      0      1      2      3      4      5   pY(y)
 0   0.1317 0.1646 0.0823 0.0206 0.0026 0.0001  0.4019
 1   0.1646 0.1646 0.0617 0.0103 0.0006 0       0.4019
 2   0.0823 0.0617 0.0154 0.0013 0      0       0.1608
 3   0.0206 0.0103 0.0013 0      0      0       0.0322
 4   0.0026 0.0006 0      0      0      0       0.0032
 5   0.0001 0      0      0      0      0       0.0001
pX(x) 0.4019 0.4019 0.1608 0.0322 0.0032 0.0001
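In fact each marginal here is Binomial(5, 1/6), since a six (or a five) occurs on each of the 5 independent rolls with probability 1/6. A sketch that checks this against the table (standard library only):

```python
from math import comb, factorial

def p(x, y):
    """Joint pmf of sixes (x) and fives (y) in 5 die rolls."""
    if x + y > 5:
        return 0.0
    c = factorial(5) // (factorial(x) * factorial(y) * factorial(5 - x - y))
    return c * (1/6)**x * (1/6)**y * (4/6)**(5 - x - y)

for x in range(6):
    marginal = sum(p(x, y) for y in range(6))       # pX(x) by summing over y
    binom = comb(5, x) * (1/6)**x * (5/6)**(5 - x)  # Binomial(5, 1/6) pmf
    print(x, round(marginal, 4), round(binom, 4))   # the two columns agree
```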
Conditional Distributions
Definition:
Let X and Y denote two discrete random variables
with joint probability function
p(x,y) = P[X = x, Y = y]
Then
pX|Y(x|y) = P[X = x | Y = y] is called the conditional probability function of X given Y = y
and
pY|X(y|x) = P[Y = y | X = x] is called the conditional probability function of Y given X = x.
Note
pX|Y(x|y) = P[X = x | Y = y] = P[X = x, Y = y] / P[Y = y] = p(x, y) / pY(y)
and
pY|X(y|x) = P[Y = y | X = x] = P[X = x, Y = y] / P[X = x] = p(x, y) / pX(x)
• Marginal distributions describe how one
variable behaves ignoring the other variable.
• Conditional distributions describe how one variable behaves when the other variable is held fixed.
Example:
A die is rolled n = 5 times
X = the number of times a “six” appears.
Y = the number of times a “five” appears.
The joint probabilities with marginal totals (rows: y = 0, …, 5; columns: x = 0, …, 5):

y\x      0      1      2      3      4      5   pY(y)
 0   0.1317 0.1646 0.0823 0.0206 0.0026 0.0001  0.4019
 1   0.1646 0.1646 0.0617 0.0103 0.0006 0       0.4019
 2   0.0823 0.0617 0.0154 0.0013 0      0       0.1608
 3   0.0206 0.0103 0.0013 0      0      0       0.0322
 4   0.0026 0.0006 0      0      0      0       0.0032
 5   0.0001 0      0      0      0      0       0.0001
pX(x) 0.4019 0.4019 0.1608 0.0322 0.0032 0.0001
The conditional distribution of X given Y = y.
pX |Y(x|y) = P[X = x|Y = y]
pX|Y(x|y) = P[X = x | Y = y] = p(x, y)/pY(y)
(rows: y = 0, …, 5; columns: x = 0, …, 5; each row sums to 1)

y\x     0      1      2      3      4      5
 0  0.3277 0.4096 0.2048 0.0512 0.0064 0.0003
 1  0.4096 0.4096 0.1536 0.0256 0.0016 0.0000
 2  0.5120 0.3840 0.0960 0.0080 0.0000 0.0000
 3  0.6400 0.3200 0.0400 0.0000 0.0000 0.0000
 4  0.8000 0.2000 0.0000 0.0000 0.0000 0.0000
 5  1.0000 0.0000 0.0000 0.0000 0.0000 0.0000
The conditional distribution of Y given X = x.
pY |X(y|x) = P[Y = y|X = x]
pY|X(y|x) = P[Y = y | X = x] = p(x, y)/pX(x)
(rows: y = 0, …, 5; columns: x = 0, …, 5; each column sums to 1)

y\x     0      1      2      3      4      5
 0  0.3277 0.4096 0.5120 0.6400 0.8000 1.0000
 1  0.4096 0.4096 0.3840 0.3200 0.2000 0.0000
 2  0.2048 0.1536 0.0960 0.0400 0.0000 0.0000
 3  0.0512 0.0256 0.0080 0.0000 0.0000 0.0000
 4  0.0064 0.0016 0.0000 0.0000 0.0000 0.0000
 5  0.0003 0.0000 0.0000 0.0000 0.0000 0.0000
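The conditional table above has a nice interpretation: given X = x sixes, each of the remaining 5 − x rolls is a five with probability (1/6)/(5/6) = 1/5, so pY|X(·|x) should be Binomial(5 − x, 1/5). A sketch that checks this (standard library only):

```python
from math import comb, factorial

def p(x, y):
    """Joint pmf of sixes (x) and fives (y) in 5 die rolls."""
    if x + y > 5:
        return 0.0
    c = factorial(5) // (factorial(x) * factorial(y) * factorial(5 - x - y))
    return c * (1/6)**x * (1/6)**y * (4/6)**(5 - x - y)

def pX(x):
    return sum(p(x, y) for y in range(6))

def cond(y, x):               # pY|X(y | x)
    return p(x, y) / pX(x)

print(round(cond(0, 0), 4))   # 0.3277  (= (4/5)^5)
# compare pY|X(2 | 1) with the Binomial(4, 1/5) pmf at 2:
print(round(cond(2, 1), 4), round(comb(4, 2) * (1/5)**2 * (4/5)**2, 4))
```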
Example
A Bernoulli trial (success S with probability p, failure F with probability q = 1 − p) is repeated until two successes have occurred.
X = the trial on which the first success occurs, and
Y = the trial on which the 2nd success occurs.
Find the joint probability function of X, Y.
Find the marginal probability functions of X and Y.
Find the conditional probability functions of Y given X = x and X given Y = y.
Solution
A typical outcome would be:
FF…FS FF…FS
(x − 1 failures, then S on trial x; then y − x − 1 failures, then S on trial y)
p(x, y) = P[X = x, Y = y] = q^(x−1) p · q^(y−x−1) p = p² q^(y−2) if y > x
so
p(x, y) = p² q^(y−2) if y > x ≥ 1, and 0 otherwise.
p(x, y) – Table (rows: y; columns: x)

y\x    1      2      3      4      5      6      7      8
1      0      0      0      0      0      0      0      0
2      p²     0      0      0      0      0      0      0
3      p²q    p²q    0      0      0      0      0      0
4      p²q²   p²q²   p²q²   0      0      0      0      0
5      p²q³   p²q³   p²q³   p²q³   0      0      0      0
6      p²q⁴   p²q⁴   p²q⁴   p²q⁴   p²q⁴   0      0      0
7      p²q⁵   p²q⁵   p²q⁵   p²q⁵   p²q⁵   p²q⁵   0      0
8      p²q⁶   p²q⁶   p²q⁶   p²q⁶   p²q⁶   p²q⁶   p²q⁶   0
The marginal distribution of X
pX(x) = P[X = x] = Σ_y p(x, y) = Σ_{y = x+1}^∞ p² q^(y−2)
      = p² q^(x−1) + p² q^x + p² q^(x+1) + ⋯
      = p² q^(x−1) (1 + q + q² + ⋯)
      = p² q^(x−1) · 1/(1 − q)
      = p q^(x−1)   for x = 1, 2, 3, …
This is the geometric distribution.
The marginal distribution of Y
pY(y) = P[Y = y] = Σ_x p(x, y) = (y − 1) p² q^(y−2) for y = 2, 3, 4, …, and 0 otherwise.
This is the negative binomial distribution with k = 2.
The conditional distribution of Y given X = x
pY|X(y|x) = P[Y = y | X = x] = P[X = x, Y = y] / P[X = x] = p(x, y)/pX(x)
          = p² q^(y−2) / (p q^(x−1)) = p q^(y−x−1) for y = x + 1, x + 2, x + 3, …
This is the geometric distribution shifted to start at trial x + 1.
The conditional distribution of X given Y = y
pX|Y(x|y) = P[X = x | Y = y] = P[X = x, Y = y] / P[Y = y] = p(x, y)/pY(y)
          = p² q^(y−2) / [(y − 1) p² q^(y−2)] = 1/(y − 1) for x = 1, 2, …, y − 1
This is the uniform distribution on the values 1, 2, …, (y − 1).
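The algebra above can be spot-checked numerically for a particular p (a sketch; p = 0.3 is an arbitrary illustrative choice, and the infinite sum over y is truncated):

```python
p_, q = 0.3, 0.7

def joint(x, y):
    """p(x, y) = p^2 q^(y-2) when y > x >= 1, else 0."""
    return p_**2 * q**(y - 2) if y > x >= 1 else 0.0

# marginal of X: sum over y should give the geometric pmf p q^(x-1)
x = 4
mX = sum(joint(x, y) for y in range(x + 1, 2000))
print(round(mX, 6), round(p_ * q**(x - 1), 6))

# marginal of Y: sum over x should give (y-1) p^2 q^(y-2)
y = 6
mY = sum(joint(x, y) for x in range(1, y))
print(round(mY, 6), round((y - 1) * p_**2 * q**(y - 2), 6))
```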
Summary
Discrete Random Variables
The joint probability function
p(x, y) = P[X = x, Y = y]
1. 0 ≤ p(x, y) ≤ 1
2. Σ_x Σ_y p(x, y) = 1
3. P[(X, Y) ∈ A] = Σ_{(x, y) ∈ A} p(x, y)

Continuous Random Variables
Definition: Two random variables are said to have joint probability density function f(x, y) if
1. f(x, y) ≥ 0
2. ∫∫ f(x, y) dx dy = 1 (integrated over the whole plane)
3. P[(X, Y) ∈ A] = ∫∫_A f(x, y) dx dy
If f(x, y) ≥ 0, then z = f(x, y) defines a surface over the x–y plane, and ∫∫_A f(x, y) dx dy is the volume under that surface above the region A.
Multiple Integration
∫∫_A f(x, y) dx dy
If the region A = {(x, y) | a ≤ x ≤ b, c ≤ y ≤ d} is a rectangular region with sides parallel to the coordinate axes, then
∫∫_A f(x, y) dx dy = ∫_c^d [ ∫_a^b f(x, y) dx ] dy = ∫_a^b [ ∫_c^d f(x, y) dy ] dx
To evaluate
∫_c^d ∫_a^b f(x, y) dx dy
first evaluate the inner integral
G(y) = ∫_a^b f(x, y) dx
then evaluate the outer integral
∫_c^d ∫_a^b f(x, y) dx dy = ∫_c^d G(y) dy
Geometrically, G(y) is the area under the surface above the line where y is held constant, and G(y) dy is the infinitesimal volume under the surface above the strip of width dy at that value of y.
The same quantity can be calculated by integrating first with respect to y, then x:
∫_a^b ∫_c^d f(x, y) dy dx
First evaluate the inner integral
H(x) = ∫_c^d f(x, y) dy
then evaluate the outer integral
∫_a^b ∫_c^d f(x, y) dy dx = ∫_a^b H(x) dx
Geometrically, H(x) is the area under the surface above the line where x is held constant, and H(x) dx is the infinitesimal volume under the surface above the strip of width dx at that value of x.
Example: Compute
∫_0^1 ∫_0^1 (x²y + xy³) dx dy
Now
∫_0^1 ∫_0^1 (x²y + xy³) dx dy = ∫_0^1 [ ∫_0^1 (x²y + xy³) dx ] dy
= ∫_0^1 [ (x³/3) y + (x²/2) y³ ]_{x=0}^{x=1} dy
= ∫_0^1 ( y/3 + y³/2 ) dy
= [ y²/6 + y⁴/8 ]_{y=0}^{y=1}
= 1/6 + 1/8 = 7/24
The same quantity can be computed by reversing the order of integration:
∫_0^1 ∫_0^1 (x²y + xy³) dy dx = ∫_0^1 [ ∫_0^1 (x²y + xy³) dy ] dx
= ∫_0^1 [ x² (y²/2) + x (y⁴/4) ]_{y=0}^{y=1} dx
= ∫_0^1 ( x²/2 + x/4 ) dx
= [ x³/6 + x²/8 ]_{x=0}^{x=1}
= 1/6 + 1/8 = 7/24
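SciPy's `dblquad` performs exactly this iterated integration (a sketch assuming SciPy is installed; note that `dblquad` takes the inner variable as the first argument of the integrand):

```python
from scipy.integrate import dblquad

# inner variable y from 0 to 1, outer variable x from 0 to 1
val, err = dblquad(lambda y, x: x**2 * y + x * y**3, 0, 1, 0, 1)
print(round(val, 6), round(7/24, 6))  # both 0.291667
```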
Integration over non-rectangular regions
Suppose the region A is defined as follows:
A = {(x, y) | a(y) ≤ x ≤ b(y), c ≤ y ≤ d}
Then
∫∫_A f(x, y) dx dy = ∫_c^d [ ∫_{a(y)}^{b(y)} f(x, y) dx ] dy
If the region A is defined as follows:
A = {(x, y) | a ≤ x ≤ b, c(x) ≤ y ≤ d(x)}
Then
∫∫_A f(x, y) dx dy = ∫_a^b [ ∫_{c(x)}^{d(x)} f(x, y) dy ] dx
In general the region A can be partitioned into regions of either type.
Example:
Compute the volume under f(x, y) = x²y + xy³ over the region A = {(x, y) | x + y ≤ 1, 0 ≤ x, 0 ≤ y}, the triangle with vertices (0, 0), (1, 0) and (0, 1) bounded above by the line x + y = 1.
Integrating first with respect to x, then y: for fixed y, x runs from 0 to 1 − y, so
∫∫_A (x²y + xy³) dx dy = ∫_0^1 [ ∫_0^{1−y} (x²y + xy³) dx ] dy
= ∫_0^1 [ (x³/3) y + (x²/2) y³ ]_{x=0}^{x=1−y} dy
= ∫_0^1 [ (1 − y)³ y / 3 + (1 − y)² y³ / 2 ] dy
= ∫_0^1 [ (y − 3y² + 3y³ − y⁴)/3 + (y³ − 2y⁴ + y⁵)/2 ] dy
= (1/3)(1/2 − 1 + 3/4 − 1/5) + (1/2)(1/4 − 2/5 + 1/6)
= (1/3)(1/20) + (1/2)(1/60)
= 1/60 + 1/120 = 3/120 = 1/40
Now integrating first with respect to y, then x: for fixed x, y runs from 0 to 1 − x, so
∫∫_A (x²y + xy³) dy dx = ∫_0^1 [ ∫_0^{1−x} (x²y + xy³) dy ] dx
= ∫_0^1 [ x² (y²/2) + x (y⁴/4) ]_{y=0}^{y=1−x} dx
= ∫_0^1 [ x² (1 − x)² / 2 + x (1 − x)⁴ / 4 ] dx
= ∫_0^1 [ (x² − 2x³ + x⁴)/2 + (x − 4x² + 6x³ − 4x⁴ + x⁵)/4 ] dx
= (1/2)(1/3 − 1/2 + 1/5) + (1/4)(1/2 − 4/3 + 3/2 − 4/5 + 1/6)
= (1/2)(1/30) + (1/4)(1/30)
= 1/60 + 1/120 = 1/40
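The triangle integral can also be checked with `dblquad`, which accepts callable limits for the inner variable (a sketch assuming SciPy):

```python
from scipy.integrate import dblquad

# inner variable y runs from 0 to 1 - x
v1, _ = dblquad(lambda y, x: x**2 * y + x * y**3, 0, 1, 0, lambda x: 1 - x)
# reversing the order: inner variable x runs from 0 to 1 - y
v2, _ = dblquad(lambda x, y: x**2 * y + x * y**3, 0, 1, 0, lambda y: 1 - y)
print(round(v1, 6), round(v2, 6))  # both 0.025 (= 1/40)
```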
Continuous Random Variables
Definition: Two random variables are said to have joint probability density function f(x, y) if
1. f(x, y) ≥ 0
2. ∫∫ f(x, y) dx dy = 1
3. P[(X, Y) ∈ A] = ∫∫_A f(x, y) dx dy

Definition: Let X and Y denote two random variables with joint probability density function f(x, y). Then
the marginal density of X is  fX(x) = ∫ f(x, y) dy
the marginal density of Y is  fY(y) = ∫ f(x, y) dx

Definition: Let X and Y denote two random variables with joint probability density function f(x, y) and marginal densities fX(x), fY(y). Then
the conditional density of Y given X = x is  fY|X(y|x) = f(x, y) / fX(x)
the conditional density of X given Y = y is  fX|Y(x|y) = f(x, y) / fY(y)
The bivariate Normal distribution
Let
f(x₁, x₂) = [1 / (2π σ₁ σ₂ √(1 − ρ²))] e^(−Q(x₁, x₂)/2)
where
Q(x₁, x₂) = [1/(1 − ρ²)] { [(x₁ − μ₁)/σ₁]² − 2ρ [(x₁ − μ₁)/σ₁][(x₂ − μ₂)/σ₂] + [(x₂ − μ₂)/σ₂]² }
This distribution is called the bivariate Normal distribution.
The parameters are μ₁, μ₂, σ₁, σ₂ and ρ.
Surface plots of the bivariate Normal distribution (figures omitted)
Note:
f(x₁, x₂) = [1 / (2π σ₁ σ₂ √(1 − ρ²))] e^(−Q(x₁, x₂)/2)
is constant when
Q(x₁, x₂) = [1/(1 − ρ²)] { [(x₁ − μ₁)/σ₁]² − 2ρ [(x₁ − μ₁)/σ₁][(x₂ − μ₂)/σ₂] + [(x₂ − μ₂)/σ₂]² }
is constant. This is true when (x₁, x₂) lies on an ellipse centered at (μ₁, μ₂).
Marginal and conditional distributions
Marginal distributions for the bivariate Normal distribution
Recall the definition of marginal distributions for continuous random variables:
f₁(x₁) = ∫ f(x₁, x₂) dx₂  and  f₂(x₂) = ∫ f(x₁, x₂) dx₁
It can be shown that in the case of the bivariate normal distribution the marginal distribution of xᵢ is Normal with mean μᵢ and standard deviation σᵢ.
Proof:
The marginal distribution of x₂ is
f₂(x₂) = ∫ f(x₁, x₂) dx₁ = ∫ [1/(2π σ₁ σ₂ √(1 − ρ²))] e^(−Q(x₁, x₂)/2) dx₁
where
Q(x₁, x₂) = [1/(1 − ρ²)] { [(x₁ − μ₁)/σ₁]² − 2ρ [(x₁ − μ₁)/σ₁][(x₂ − μ₂)/σ₂] + [(x₂ − μ₂)/σ₂]² }
Write u = (x₁ − μ₁)/σ₁ and v = (x₂ − μ₂)/σ₂, so that
Q(x₁, x₂) = (u² − 2ρuv + v²)/(1 − ρ²)
Completing the square in u:
u² − 2ρuv + v² = (u − ρv)² + (1 − ρ²)v²
and hence
Q(x₁, x₂) = (u − ρv)²/(1 − ρ²) + v² = [(x₁ − a)/b]² + c
where
b = σ₁ √(1 − ρ²),  a = μ₁ + ρ (σ₁/σ₂)(x₂ − μ₂),  c = [(x₂ − μ₂)/σ₂]²
(the middle step uses u − ρv = [x₁ − μ₁ − ρ(σ₁/σ₂)(x₂ − μ₂)]/σ₁).
Hence
f₂(x₂) = [1/(2π σ₁ σ₂ √(1 − ρ²))] e^(−c/2) ∫ e^(−½ [(x₁ − a)/b]²) dx₁
The remaining integrand is, up to the factor 1/(√(2π) b), the density of a Normal distribution with mean a and standard deviation b, so the integral equals √(2π) b = √(2π) σ₁ √(1 − ρ²). Therefore
f₂(x₂) = [1/(√(2π) σ₂)] e^(−½ [(x₂ − μ₂)/σ₂]²)
Thus the marginal distribution of x₂ is Normal with mean μ₂ and standard deviation σ₂.
Similarly the marginal distribution of x₁ is Normal with mean μ₁ and standard deviation σ₁.
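The completing-the-square argument can be sanity-checked by numerical integration (a sketch assuming SciPy; the parameter values are arbitrary illustrative choices):

```python
import math
from scipy.integrate import quad

m1, m2, s1, s2, rho = 1.0, -0.5, 2.0, 1.5, 0.6

def f(x1, x2):
    """Bivariate normal density in the (mu, sigma, rho) parameterization."""
    u = (x1 - m1) / s1
    v = (x2 - m2) / s2
    Q = (u * u - 2 * rho * u * v + v * v) / (1 - rho**2)
    return math.exp(-Q / 2) / (2 * math.pi * s1 * s2 * math.sqrt(1 - rho**2))

x2 = 0.7
marg, _ = quad(lambda x1: f(x1, x2), -40, 40)      # integrate out x1
target = math.exp(-((x2 - m2) / s2)**2 / 2) / (math.sqrt(2 * math.pi) * s2)
print(round(marg, 8), round(target, 8))            # agree: N(m2, s2) density
```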
Conditional distributions for the bivariate Normal distribution
Recall the definition of conditional distributions for continuous random variables:
f₁|₂(x₁|x₂) = f(x₁, x₂) / f₂(x₂)  and  f₂|₁(x₂|x₁) = f(x₁, x₂) / f₁(x₁)
It can be shown that in the case of the bivariate normal distribution the conditional distribution of xᵢ given xⱼ is Normal with:
mean μᵢ|ⱼ = μᵢ + ρ (σᵢ/σⱼ)(xⱼ − μⱼ) and
standard deviation σᵢ|ⱼ = σᵢ √(1 − ρ²)
Proof
f₁|₂(x₁|x₂) = f(x₁, x₂) / f₂(x₂)
= { [1/(2π σ₁ σ₂ √(1 − ρ²))] e^(−Q(x₁, x₂)/2) } / { [1/(√(2π) σ₂)] e^(−½ [(x₂ − μ₂)/σ₂]²) }
Using Q(x₁, x₂) = [(x₁ − a)/b]² + [(x₂ − μ₂)/σ₂]² with
b = σ₁ √(1 − ρ²) and a = μ₁ + ρ (σ₁/σ₂)(x₂ − μ₂),
the factors involving (x₂ − μ₂) cancel, leaving
f₁|₂(x₁|x₂) = [1/(√(2π) b)] e^(−½ [(x₁ − a)/b]²)
Thus the conditional distribution of x₁ given x₂ is Normal with:
mean a = μ₁ + ρ (σ₁/σ₂)(x₂ − μ₂) and
standard deviation b = σ₁ √(1 − ρ²)
Bivariate Normal distribution with marginal distributions (figure omitted)
Bivariate Normal distribution with conditional distribution (figure omitted; the ellipses are the contours of f, centered at (μ₁, μ₂) with a major axis)
The conditional mean μ₂|₁ = μ₂ + ρ (σ₂/σ₁)(x₁ − μ₁), plotted as a function of x₁, is the regression line; it is less steep than the major axis of the ellipses, illustrating regression to the mean.
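A direct numerical check of the conditional-distribution result (plain Python; the parameter values and evaluation points are arbitrary illustrative choices):

```python
import math

m1, m2, s1, s2, rho = 0.0, 10.0, 1.0, 3.0, -0.4

def f(x1, x2):
    """Bivariate normal density."""
    u, v = (x1 - m1) / s1, (x2 - m2) / s2
    Q = (u * u - 2 * rho * u * v + v * v) / (1 - rho**2)
    return math.exp(-Q / 2) / (2 * math.pi * s1 * s2 * math.sqrt(1 - rho**2))

def f2(x2):
    """Marginal of x2: N(m2, s2)."""
    return math.exp(-((x2 - m2) / s2)**2 / 2) / (math.sqrt(2 * math.pi) * s2)

x1, x2 = 0.8, 9.0
cond = f(x1, x2) / f2(x2)                      # f(x1 | x2)
a = m1 + rho * (s1 / s2) * (x2 - m2)           # conditional mean
b = s1 * math.sqrt(1 - rho**2)                 # conditional sd
target = math.exp(-((x1 - a) / b)**2 / 2) / (math.sqrt(2 * math.pi) * b)
print(round(cond, 8), round(target, 8))        # the two densities agree
```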
Example:
Suppose that a rectangle is constructed by first choosing its length X and then choosing its width Y. Its length X is selected from an exponential distribution with mean 1/λ = 5. Once the length has been chosen, its width Y is selected from a uniform distribution from 0 to half its length.
Find the probability that its area A = XY is less than 4.
Solution:
fX(x) = (1/5) e^(−x/5) for x ≥ 0
fY|X(y|x) = 1/(x/2) = 2/x if 0 ≤ y ≤ x/2
f(x, y) = fX(x) fY|X(y|x) = [2/(5x)] e^(−x/5) if 0 ≤ y ≤ x/2, x ≥ 0
The curve xy = 4 meets the line y = x/2 where x(x/2) = 4, i.e. x² = 8, so x = √8 = 2√2 and y = x/2 = √2; the curves intersect at (2√2, √2).
P[XY < 4] = ∫_0^{2√2} ∫_0^{x/2} f(x, y) dy dx + ∫_{2√2}^∞ ∫_0^{4/x} f(x, y) dy dx
= ∫_0^{2√2} [2/(5x)] e^(−x/5) (x/2) dx + ∫_{2√2}^∞ [2/(5x)] e^(−x/5) (4/x) dx
= ∫_0^{2√2} (1/5) e^(−x/5) dx + (8/5) ∫_{2√2}^∞ x^(−2) e^(−x/5) dx
The first part can be evaluated in closed form:
∫_0^{2√2} (1/5) e^(−x/5) dx = 1 − e^(−2√2/5)
The second part may require numerical evaluation.
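Both pieces can be evaluated numerically (a sketch assuming SciPy; `quad` handles the infinite upper limit):

```python
import math
from scipy.integrate import quad

c = 2 * math.sqrt(2)                     # intersection of y = x/2 and xy = 4
part1 = 1 - math.exp(-c / 5)             # closed form for the first integral
part2, _ = quad(lambda x: (8 / (5 * x**2)) * math.exp(-x / 5), c, math.inf)
prob = part1 + part2
print(round(part1, 4), round(part2, 4), round(prob, 4))
```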
Multivariate distributions (k ≥ 2)
Definition
Let X₁, X₂, …, Xk denote k discrete random variables. Then p(x₁, x₂, …, xk) is the joint probability function of X₁, X₂, …, Xk if
1. 0 ≤ p(x₁, …, xk) ≤ 1
2. Σ_{x₁} ⋯ Σ_{xk} p(x₁, …, xk) = 1
3. P[(X₁, …, Xk) ∈ A] = Σ_{(x₁, …, xk) ∈ A} p(x₁, …, xk)
Definition
Let X₁, X₂, …, Xk denote k continuous random variables. Then f(x₁, x₂, …, xk) is the joint density function of X₁, X₂, …, Xk if
1. f(x₁, …, xk) ≥ 0
2. ∫ ⋯ ∫ f(x₁, …, xk) dx₁ ⋯ dxk = 1
3. P[(X₁, …, Xk) ∈ A] = ∫ ⋯ ∫_A f(x₁, …, xk) dx₁ ⋯ dxk
Example: The Multinomial distribution
Suppose that we observe an experiment that has k
possible outcomes {O1, O2, …, Ok } independently n
times.
Let p1, p2, …, pk denote probabilities of O1, O2, …,
Ok respectively.
Let Xi denote the number of times that outcome Oi
occurs in the n repetitions of the experiment.
Then the joint probability function of the random variables X₁, X₂, …, Xk is
p(x₁, …, xk) = [n! / (x₁! x₂! ⋯ xk!)] p₁^{x₁} p₂^{x₂} ⋯ pk^{xk}
Note:
p₁^{x₁} p₂^{x₂} ⋯ pk^{xk} is the probability of a particular sequence of length n containing
x₁ outcomes O₁
x₂ outcomes O₂
…
xk outcomes Ok
and
n! / (x₁! x₂! ⋯ xk!)
is the number of ways of choosing the positions for the x₁ outcomes O₁, x₂ outcomes O₂, …, xk outcomes Ok:
C(n, x₁) C(n − x₁, x₂) C(n − x₁ − x₂, x₃) ⋯ C(xk, xk)
= [n!/(x₁!(n − x₁)!)] [(n − x₁)!/(x₂!(n − x₁ − x₂)!)] [(n − x₁ − x₂)!/(x₃!(n − x₁ − x₂ − x₃)!)] ⋯
= n! / (x₁! x₂! ⋯ xk!)
This distribution is called the Multinomial distribution.
Example:
Suppose that a treatment for back pain has three possible
outcomes:
O1 - Complete cure (no pain) – (30% chance)
O2 - Reduced pain – (50% chance)
O3 - No change – (20% chance)
Hence p1 = 0.30, p2 = 0.50, p3 = 0.20.
Suppose the treatment is applied to n = 4 patients suffering
back pain and let X = the number that result in a complete cure,
Y = the number that result in just reduced pain, and Z = the
number that result in no change.
Find the distribution of X, Y and Z. Compute P[X + Y ≥ Z]
p(x, y, z) = [4! / (x! y! z!)] (0.30)^x (0.50)^y (0.20)^z for x + y + z = 4
Table: p(x, y, z)
Since x + y + z = 4 is required, p(x, y, z) = 0 except for the following combinations:

 x  y  z   p(x, y, z)
 0  0  4    0.0016
 0  1  3    0.0160
 0  2  2    0.0600
 0  3  1    0.1000
 0  4  0    0.0625
 1  0  3    0.0096
 1  1  2    0.0720
 1  2  1    0.1800
 1  3  0    0.1500
 2  0  2    0.0216
 2  1  1    0.1080
 2  2  0    0.1350
 3  0  1    0.0216
 3  1  0    0.0540
 4  0  0    0.0081

P[X + Y ≥ Z] = 1 − P[X + Y < Z] = 1 − [p(0, 1, 3) + p(1, 0, 3) + p(0, 0, 4)]
= 1 − (0.0160 + 0.0096 + 0.0016) = 1 − 0.0272
= 0.9728
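A sketch that tabulates the multinomial probabilities and recomputes P[X + Y ≥ Z] (standard library only):

```python
from math import factorial

p1, p2, p3, n = 0.30, 0.50, 0.20, 4

def p(x, y, z):
    """Multinomial pmf for the back-pain outcomes."""
    if x + y + z != n:
        return 0.0
    c = factorial(n) // (factorial(x) * factorial(y) * factorial(z))
    return c * p1**x * p2**y * p3**z

print(round(p(1, 3, 0), 4))  # 0.15
prob = sum(p(x, y, z)
           for x in range(n + 1) for y in range(n + 1) for z in range(n + 1)
           if x + y >= z)
print(round(prob, 4))        # 0.9728
```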
Example: The Multivariate Normal distribution
Recall the univariate normal distribution
f(x) = [1/(√(2π) σ)] e^(−½ [(x − μ)/σ]²)
and the bivariate normal distribution
f(x, y) = [1/(2π σx σy √(1 − ρ²))] e^(−[1/(2(1 − ρ²))] { [(x − μx)/σx]² − 2ρ [(x − μx)/σx][(y − μy)/σy] + [(y − μy)/σy]² })
The k-variate Normal distribution is
f(x₁, …, xk) = f(x) = [1 / ((2π)^{k/2} |Σ|^{1/2})] e^(−½ (x − μ)′ Σ⁻¹ (x − μ))
where
x = (x₁, x₂, …, xk)′,  μ = (μ₁, μ₂, …, μk)′
and Σ is the k × k symmetric covariance matrix

Σ = [ σ₁₁ σ₁₂ … σ₁k
      σ₁₂ σ₂₂ … σ₂k
       ⋮    ⋮       ⋮
      σ₁k σ₂k … σkk ]
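The k-variate density can be compared against `scipy.stats.multivariate_normal` (a sketch assuming NumPy and SciPy; the mean vector and covariance matrix below are arbitrary illustrative values):

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, -0.2],
                  [0.1, -0.2, 1.5]])   # symmetric, positive definite

x = np.array([0.5, -1.0, 1.0])
d = x - mu
k = len(mu)
# density written out exactly as in the formula above
manual = np.exp(-0.5 * d @ np.linalg.inv(Sigma) @ d) / \
         np.sqrt((2 * np.pi)**k * np.linalg.det(Sigma))
library = multivariate_normal(mean=mu, cov=Sigma).pdf(x)
print(np.isclose(manual, library))  # True
```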
Marginal distributions
Definition
Let X₁, X₂, …, Xq, Xq+1, …, Xk denote k discrete random variables with joint probability function p(x₁, x₂, …, xq, xq+1, …, xk). Then the marginal joint probability function of X₁, X₂, …, Xq is
p₁₂…q(x₁, …, xq) = Σ_{xq+1} ⋯ Σ_{xk} p(x₁, …, xk)
Definition
Let X₁, X₂, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x₁, x₂, …, xq, xq+1, …, xk). Then the marginal joint density function of X₁, X₂, …, Xq is
f₁₂…q(x₁, …, xq) = ∫ ⋯ ∫ f(x₁, …, xk) dxq+1 ⋯ dxk
Conditional distributions
Definition
Let X₁, X₂, …, Xq, Xq+1, …, Xk denote k discrete random variables with joint probability function p(x₁, x₂, …, xq, xq+1, …, xk). Then the conditional joint probability function of X₁, X₂, …, Xq given Xq+1 = xq+1, …, Xk = xk is
p₁…q|q+1…k(x₁, …, xq | xq+1, …, xk) = p(x₁, …, xk) / pq+1…k(xq+1, …, xk)
Definition
Let X₁, X₂, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x₁, x₂, …, xq, xq+1, …, xk). Then the conditional joint density function of X₁, X₂, …, Xq given Xq+1 = xq+1, …, Xk = xk is
f₁…q|q+1…k(x₁, …, xq | xq+1, …, xk) = f(x₁, …, xk) / fq+1…k(xq+1, …, xk)
Definition – Independence of sets of vectors
Let X₁, X₂, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x₁, x₂, …, xq, xq+1, …, xk). Then the variables X₁, X₂, …, Xq are independent of Xq+1, …, Xk if
f(x₁, …, xk) = f₁…q(x₁, …, xq) fq+1…k(xq+1, …, xk)
A similar definition applies to discrete random variables.
Definition – Mutual Independence
Let X₁, X₂, …, Xk denote k continuous random variables with joint probability density function f(x₁, x₂, …, xk). Then the variables X₁, X₂, …, Xk are called mutually independent if
f(x₁, …, xk) = f₁(x₁) f₂(x₂) ⋯ fk(xk)
A similar definition applies to discrete random variables.
Example
Let X, Y, Z denote 3 jointly distributed random variables with joint density function
f(x, y, z) = K(x² + yz) if 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1, and 0 otherwise.
Find the value of K.
Determine the marginal distributions of X, Y and Z.
Determine the joint marginal distributions of X, Y; X, Z; and Y, Z.
Solution
Determining the value of K:
1 = ∫∫∫ f(x, y, z) dx dy dz = K ∫_0^1 ∫_0^1 ∫_0^1 (x² + yz) dx dy dz
= K ∫_0^1 ∫_0^1 [ x³/3 + xyz ]_{x=0}^{x=1} dy dz = K ∫_0^1 ∫_0^1 (1/3 + yz) dy dz
= K ∫_0^1 [ y/3 + (y²/2) z ]_{y=0}^{y=1} dz = K ∫_0^1 (1/3 + z/2) dz
= K [ z/3 + z²/4 ]_0^1 = K (1/3 + 1/4) = 7K/12
so K = 12/7.
The marginal distribution of X:
f₁(x) = ∫_0^1 ∫_0^1 (12/7)(x² + yz) dy dz
= (12/7) ∫_0^1 [ x² y + (y²/2) z ]_{y=0}^{y=1} dz = (12/7) ∫_0^1 (x² + z/2) dz
= (12/7) [ x² z + z²/4 ]_0^1 = (12/7)(x² + 1/4) for 0 ≤ x ≤ 1
The marginal distribution of X, Y:
f₁₂(x, y) = ∫_0^1 (12/7)(x² + yz) dz = (12/7) [ x² z + y z²/2 ]_{z=0}^{z=1}
= (12/7)(x² + y/2) for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1
Find the conditional distribution of:
1. Z given X = x, Y = y
2. Y given X = x, Z = z
3. X given Y = y, Z = z
4. Y, Z given X = x
5. X, Z given Y = y
6. X, Y given Z = z
7. Y given X = x
8. X given Y = y
9. X given Z = z
10. Z given X = x
11. Z given Y = y
12. Y given Z = z
The marginal distribution of X, Y is
f₁₂(x, y) = (12/7)(x² + y/2) for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1
Thus the conditional distribution of Z given X = x, Y = y is
f(x, y, z) / f₁₂(x, y) = (12/7)(x² + yz) / [(12/7)(x² + y/2)] = (x² + yz)/(x² + y/2) for 0 ≤ z ≤ 1
The marginal distribution of X is
f₁(x) = (12/7)(x² + 1/4) for 0 ≤ x ≤ 1
Thus the conditional distribution of Y, Z given X = x is
f(x, y, z) / f₁(x) = (12/7)(x² + yz) / [(12/7)(x² + 1/4)] = (x² + yz)/(x² + 1/4) for 0 ≤ y ≤ 1, 0 ≤ z ≤ 1
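The value of K and the marginal f₁(x) can be confirmed numerically (a sketch assuming SciPy; note that `tplquad`'s integrand takes its arguments innermost first, i.e. (z, y, x)):

```python
from scipy.integrate import dblquad, tplquad

# total mass of x^2 + yz over the unit cube
total, _ = tplquad(lambda z, y, x: x**2 + y * z, 0, 1,
                   lambda x: 0, lambda x: 1,
                   lambda x, y: 0, lambda x, y: 1)
print(round(total, 6))        # 0.583333 (= 7/12, so K = 12/7)

K = 1 / total
x = 0.5
# marginal f1(x): integrate K (x^2 + yz) over y and z
f1, _ = dblquad(lambda z, y: K * (x**2 + y * z), 0, 1, 0, 1)
print(round(f1, 6), round(12/7 * (x**2 + 0.25), 6))  # f1(x) = (12/7)(x^2 + 1/4)
```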