Review of Probability,
Random Process, Random Field
for Image Processing
© 2002-2003 by Yu Hen Hu
ECE533 Digital Image Processing
Probability Models
Experiment
» Throw a die, toss a coin, …
» Each experiment has an outcome.
» Experiments can be repeated.
Sample space ($\Omega$)
» The set of all possible outcomes of an experiment.
Event
» A subset of outcomes in $\Omega$ that has a particular meaning.
Probability of an event A
» $P(A) = |A| / |\Omega|$ (for equally likely outcomes)
» $|A|$: cardinality of A, the number of elements in set A.
Example: Card draw
» A single card is drawn from a well-shuffled deck of playing cards. Find P(drawing an Ace).
» $\Omega = \{1, 2, \ldots, 52\}$, $|\Omega| = 52$.
» Event A = drawing an Ace. Assume the four Aces are labeled 1, 2, 3, 4; then $A = \{1, 2, 3, 4\}$, $|A| = 4$.
» Thus, $P(A) = 4/52 = 1/13$.

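The counting rule P(A) = |A|/|Ω| can be checked by direct enumeration. The following is a minimal Python sketch, assuming the labeling used in the card example (cards 1 to 52, Aces labeled 1 to 4); the function name `event_probability` is illustrative and not part of the slides.

```python
from fractions import Fraction

def event_probability(event, sample_space):
    """P(A) = |A| / |Omega|, valid only for equally likely outcomes."""
    omega = set(sample_space)
    event = set(event) & omega          # ignore outcomes outside Omega
    return Fraction(len(event), len(omega))

omega = range(1, 53)     # 52 cards
aces = {1, 2, 3, 4}      # assumed labels of the four Aces
p_ace = event_probability(aces, omega)
print(p_ace)             # 1/13
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand calculation.
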
Axioms of a Probability Model
Each outcome $\omega_i$ of an experiment can be assigned a probability measure $P(\omega_i)$ such that $0 \le P(\omega_i) \le 1$.
For fair experiments where each outcome is equally likely to occur, $P(\omega_i) = 1/|\Omega|$.
In general, the probability of an event, which is a set of outcomes, is evaluated as:
$$P(A) = \sum_{\omega_i \in A} P(\omega_i)$$
Given a set A, its corresponding probability measure P(A) has the following properties:
1. $P(\emptyset) = 0$. The empty set is an impossible event.
2. $P(A) \ge 0$ for every event A.
3. If $A_m \cap A_n = \emptyset$ for $m \ne n$, then
$$P\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} P(A_n)$$
4. $P(\Omega) = 1$. The probability of the entire sample space is unity.

Independence
Two events A and B are statistically independent if
$$P(A \cap B) = P(A)\,P(B)$$
Independence of N events
Given N events $\{A_n;\ 1 \le n \le N\}$, we say these N events are mutually independent iff
$$P\left(\bigcap_{j \in J} A_j\right) = \prod_{j \in J} P(A_j)$$
where $J \subseteq \{1, 2, \ldots, N\}$ is any subset of the indices.
Example: Given a fair coin and a well-shuffled deck of cards, what is the probability of tossing the coin and observing a Head AND drawing the Jack of hearts?
P(Head) = 1/2
P(Jack of hearts) = 1/52
The events of tossing a coin and drawing a card are independent. Hence
P(Head AND Jack of hearts) = P(Head) P(Jack of hearts) = (1/2)(1/52) = 1/104

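The coin-and-card example can be checked on the product sample space. The sketch below assumes that all 2 × 52 joint outcomes are equally likely (which is exactly the independence assumption) and that card number 1 stands for the Jack of hearts; both the labeling and the helper `prob` are illustrative.

```python
from fractions import Fraction
from itertools import product

coin = ["H", "T"]
cards = range(1, 53)                 # assume card 1 is the Jack of hearts
omega = list(product(coin, cards))   # 104 equally likely joint outcomes

def prob(event):
    """Probability of an event given as a predicate on joint outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

p_head = prob(lambda w: w[0] == "H")
p_jack = prob(lambda w: w[1] == 1)
p_both = prob(lambda w: w[0] == "H" and w[1] == 1)
print(p_head, p_jack, p_both)        # 1/2 1/52 1/104
```

The joint probability equals the product of the two marginal probabilities, as the definition of independence requires.
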
Conditional Probability
Let A and B be two events in the same sample space $\Omega$. Given that B has occurred, the conditional probability that A will also occur is defined as:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
assuming $P(B) > 0$.
If A and B are independent events, then
$$P(A \mid B) = \frac{P(A)\,P(B)}{P(B)} = P(A)$$
Example: A perfect die is thrown twice. Given that the sum of the two outcomes is 9, what is the probability that the outcome of the first throw is 4?
Solution: Let the outcome of the first throw be m and that of the second throw be n. Then
$B = \{(m, n);\ m + n = 9,\ 1 \le m, n \le 6\}$, so $|B| = 4$
$A \cap B = \{(m, n);\ m = 4,\ n = 5\}$
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{1/36}{4/36} = \frac{1}{4}$$
Note that P(A) = 1/6, so A and B are not independent.

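The two-dice example can be reproduced by listing all 36 outcomes. This is a small Python sketch under the assumption that every pair (m, n) is equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))     # 36 equally likely (m, n)
B = [(m, n) for (m, n) in omega if m + n == 9]   # sum of the throws is 9
A_and_B = [(m, n) for (m, n) in B if m == 4]     # ... and first throw is 4

p_B = Fraction(len(B), len(omega))
p_A_and_B = Fraction(len(A_and_B), len(omega))
p_A_given_B = p_A_and_B / p_B
print(B)              # [(3, 6), (4, 5), (5, 4), (6, 3)]
print(p_A_given_B)    # 1/4
```
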
Law of Total Probability & Bayes’ Rule
Law of total probability
Let $\{B_n\}$ be a set of events that partitions the sample space $\Omega$:
$$\bigcup_n B_n = \Omega \quad \text{and} \quad B_m \cap B_n = \emptyset \ \text{ for } m \ne n$$
Then for any event $A \subset \Omega$,
$$A = A \cap \Omega = A \cap \Big(\bigcup_n B_n\Big) = \bigcup_n (A \cap B_n)$$
$$P(A) = P\Big(\bigcup_n (A \cap B_n)\Big) = \sum_n P(A \cap B_n) = \sum_n P(A \mid B_n)\,P(B_n)$$
Bayes' Rule
$$P(B_n \mid A) = \frac{P(B_n \cap A)}{P(A)} = \frac{P(A \cap B_n)}{P(A)}$$
$$P(B_n \mid A) = \frac{P(A \mid B_n)\,P(B_n)}{\sum_k P(A \mid B_k)\,P(B_k)}$$

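The two formulas on this slide translate directly into code. The sketch below is only an illustration: the function names and the three-machine numbers are hypothetical and do not come from the slides.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(A) = sum_n P(A | B_n) P(B_n) over a partition {B_n}."""
    return sum(l * p for l, p in zip(likelihoods, priors))

def bayes_posterior(priors, likelihoods):
    """Return the list [P(B_n | A)] for every n."""
    p_a = total_probability(priors, likelihoods)
    return [l * p / p_a for l, p in zip(likelihoods, priors)]

# Hypothetical numbers: machines B_1..B_3 make 50%, 30%, 20% of all parts,
# with defect rates P(A | B_n) of 1%, 2%, 5% (A = "part is defective").
priors = [Fraction(5, 10), Fraction(3, 10), Fraction(2, 10)]
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(5, 100)]

p_a = total_probability(priors, likelihoods)
posterior = bayes_posterior(priors, likelihoods)
print(p_a)          # 21/1000
print(posterior)    # posteriors 5/21, 2/7, 10/21
```

The posteriors sum to one because the denominator of Bayes' rule is exactly the law of total probability.
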
Random Variable
A random variable $X(\omega)$ is a real-valued function defined for points $\omega$ in a sample space $\Omega$.
Example: If $\Omega$ is the whole class of students, and $\omega$ is an individual student, we may define $X(\omega)$ as the height of an individual student (in feet).
Question: What is the probability that a student's height is between 5 feet and 6 feet?
Define B = [5, 6]. Our goal is to find the probability
$$P(\{\omega : 5 \le X(\omega) \le 6\}) = P(\{\omega : X(\omega) \in B\}) = P(\{X \in B\})$$
In general, we are interested in the probability $P(\{X \in B\})$ or, for convenience, $P(X \in B)$.
» If $B = \{x_0\}$ is a singleton set, we may simply write $P(X = x_0)$.
Example 2.1
$P(a < X \le b) = P(X \le b) - P(X \le a)$
Example 2.2
$P(X = 0 \text{ or } X = 1) = P(X = 0) + P(X = 1)$

Probability Mass Functions (PMF) and Expectations
The probability mass function (PMF) of a discrete random variable X is defined by $p_X(x_i) = P(X = x_i)$. For any set B,
$$P(X \in B) = \sum_i I_B(x_i)\,P(X = x_i) = \sum_i I_B(x_i)\,p_X(x_i)$$
where the indicator $I_B(x)$ is 1 if $x \in B$ and 0 otherwise.
Joint PMF of X and Y:
$$p_{XY}(x_i, y_j) = P(X = x_i, Y = y_j) = P(\{X = x_i\} \cap \{Y = y_j\})$$
Marginal PMF:
$$p_X(x_i) = \sum_j p_{XY}(x_i, y_j), \qquad p_Y(y_j) = \sum_i p_{XY}(x_i, y_j)$$
Expectation (mean, average):
$$E[X] = \sum_i x_i\,P(X = x_i) = \sum_i x_i\,p_X(x_i)$$

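Marginal PMFs and expectations are plain sums over the joint PMF. A minimal sketch, assuming the joint PMF is stored as a dictionary keyed by (x, y); the table values and helper names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint PMF p_XY(x, y), stored as {(x, y): probability}.
p_xy = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

def marginal(p_joint, axis):
    """Sum the joint PMF over the other variable (axis 0 -> p_X, 1 -> p_Y)."""
    out = defaultdict(Fraction)      # missing keys start at Fraction(0)
    for pair, p in p_joint.items():
        out[pair[axis]] += p
    return dict(out)

def expectation(pmf):
    """E[X] = sum_i x_i p_X(x_i)."""
    return sum(x * p for x, p in pmf.items())

p_x = marginal(p_xy, 0)
p_y = marginal(p_xy, 1)
print(expectation(p_x), expectation(p_y))   # 1/2 5/8
```
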
Moments and Standard Deviation
n-th moment: $E[X^n]$
» Defined over a real-valued random variable X.
Variance: var(X). The standard deviation is $\sqrt{\mathrm{var}(X)}$.
Let m = E[X], then
$$\begin{aligned}
\mathrm{var}(X) &= E[(X - m)^2] \\
&= E[X^2 - 2Xm + m^2] \\
&= E[X^2] - 2mE[X] + m^2 \\
&= E[X^2] - m^2 \\
&= E[X^2] - (E[X])^2
\end{aligned}$$
Example: Find $E[X^2]$ and var(X) of a Bernoulli r.v. X with P(X = 1) = p.
$E[X^2] = 0^2 (1 - p) + 1^2 p = p$
Since E[X] = p,
$\mathrm{var}(X) = E[X^2] - (E[X])^2 = p - p^2 = p(1 - p)$
Example: Let $X \sim \mathrm{Poisson}(\lambda)$. Since $E[X(X - 1)] = \lambda^2$ and $E[X] = \lambda$, we have $E[X^2] = \lambda^2 + \lambda$. Thus,
$\mathrm{var}(X) = (\lambda^2 + \lambda) - \lambda^2 = \lambda$

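The Poisson result can be checked numerically by summing the PMF. The sketch below truncates the infinite sum at `n_max` terms, which is an assumption that is adequate only for moderate values of the rate; the rate 2.5 is an arbitrary test value.

```python
import math

def poisson_pmf(n, lam):
    """P(X = n) for X ~ Poisson(lam)."""
    return lam ** n * math.exp(-lam) / math.factorial(n)

def moment(k, lam, n_max=100):
    """Approximate E[X^k] by truncating the sum at n_max (an assumption)."""
    return sum(n ** k * poisson_pmf(n, lam) for n in range(n_max + 1))

lam = 2.5                          # arbitrary test value
mean = moment(1, lam)              # should be close to lam
second = moment(2, lam)            # should be close to lam**2 + lam
variance = second - mean ** 2      # var(X) = E[X^2] - (E[X])^2
print(mean, second, variance)
```

Up to floating-point rounding, the mean and the variance both come out equal to the rate, in agreement with the derivation.
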
Conditional Probability
The conditional probability is defined as follows:
$$P(X \in B \mid Y \in C) = P(\{X \in B\} \mid \{Y \in C\}) = \frac{P(\{X \in B\} \cap \{Y \in C\})}{P(\{Y \in C\})} = \frac{P(X \in B, Y \in C)}{P(Y \in C)}$$
In terms of pmf, we have
$$p_{X|Y}(x_i \mid y_j) = \frac{p_{XY}(x_i, y_j)}{p_Y(y_j)}$$
Example: Let X = message to be sent (an integer). For X = i, light of intensity $\lambda_i$ is directed at a photodetector. $Y \sim \mathrm{Poisson}(\lambda_i)$ = # of photoelectrons generated at the detector. Find $P(Y < 2 \mid X = i)$.
Solution: for n = 0, 1, 2, …
$$P(Y = n \mid X = i) = \frac{\lambda_i^n e^{-\lambda_i}}{n!}$$
Thus,
$$P(Y < 2 \mid X = i) = P(Y = 0 \mid X = i) + P(Y = 1 \mid X = i) = (1 + \lambda_i)\exp(-\lambda_i)$$

Definitions of Continuous R.V.s
Definition: Continuous R.V.
Let $X(\omega)$ be a random variable defined on a sample space $\Omega$. X is a continuous random variable if
$$P(X \in B) = \int_B f(x)\,dx = \int_{-\infty}^{\infty} I_B(x)\,f(x)\,dx$$
Definition: probability density function (pdf)
f(x) is a probability density function if
1. $f(x) \ge 0$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$

Cumulative Distribution Function
Definition: The cumulative distribution function (cdf) of a random variable X is defined by
$$F_X(x) = P(X \le x)$$
If X is a continuous random variable,
$$F_X(x) = \int_{-\infty}^{x} f(v)\,dv, \qquad f(x) = \frac{dF_X(x)}{dx}$$
Properties of CDFs
(a) $0 \le F(x) \le 1$.
(b) $F(b) - F(a) = P(a < X \le b)$
(c) $a < b$ implies $F(a) \le F(b)$
(d) $\lim_{x \to \infty} F(x) = 1$
(e) $\lim_{x \to -\infty} F(x) = 0$
(f) F(x) is right continuous, i.e.
$$F(x_0^+) = \lim_{x \downarrow x_0} F(x) = P(X \le x_0) = F(x_0)$$
(g) $F(x_0^-) = \lim_{x \uparrow x_0} F(x) = P(X < x_0)$
(h) $P(X = x_0) = F(x_0) - F(x_0^-)$
Note that if F(x) is continuous at $x_0$, then $F(x_0^+) = F(x_0^-) = F(x_0)$. From (h), $P(X = x_0) = 0$!

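For a discrete random variable the CDF is a staircase, and properties (b) and (h) can be verified by direct summation. The sketch below uses a fair die as a hypothetical example; `cdf` and `cdf_left` are illustrative names for F(x) and its left limit F(x−).

```python
from fractions import Fraction

# Hypothetical discrete r.v.: a fair die, P(X = k) = 1/6 for k = 1..6.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x)."""
    return sum(p for k, p in pmf.items() if k <= x)

def cdf_left(x):
    """F(x-) = P(X < x), the left limit of F at x."""
    return sum(p for k, p in pmf.items() if k < x)

jump_at_3 = cdf(3) - cdf_left(3)   # property (h): equals P(X = 3)
interval = cdf(5) - cdf(2)         # property (b): equals P(2 < X <= 5)
print(jump_at_3, interval)         # 1/6 1/2
```

At a point where the CDF has no jump, such as x = 2.5, the same difference is zero, matching the closing note on the slide.
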
Functions of Random Variables
Let X be a random variable, and g(x) a real-valued function. Y = g(X) is a new random variable. We want to find $P(Y \in C)$ in terms of $F_X(x)$. For this, we must find the set
$$B = \{x \in \mathbb{R};\ g(x) \in C\}$$
such that
$$P(Y \in C) = P(g(X) \in C) = P(X \in B)$$
To find $F_Y(y)$, take $C = (-\infty, y]$, or
$$B = \{x \in \mathbb{R};\ g(x) \le y\}$$
Example: X is an input voltage, a random variable. Y = g(X) = aX + b, where $a \ne 0$ is the gain and b is the offset voltage.
Solution: $g(x) \le y$ iff $x \le (y - b)/a$ for a > 0, and $x \ge (y - b)/a$ for a < 0. For a continuous random variable X,
a > 0: $F_Y(y) = F_X((y - b)/a)$, $f_Y(y) = dF_Y/dy = (1/a)\, f_X((y - b)/a)$
a < 0: $F_Y(y) = 1 - F_X((y - b)/a)$, $f_Y(y) = dF_Y/dy = (-1/a)\, f_X((y - b)/a)$
In summary,
$$f_Y(y) = \frac{1}{|a|}\, f_X\!\left(\frac{y - b}{a}\right)$$

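The summary formula for Y = aX + b can be sanity-checked against a case where the answer is known. The sketch below assumes X is Gaussian, in which case Y is Gaussian with mean aμ + b and standard deviation |a|σ; the Gaussian choice, the parameter values, and the function names are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def affine_pdf(f_x, a, b):
    """Return f_Y for Y = a X + b, a != 0: f_Y(y) = f_X((y - b)/a) / |a|."""
    if a == 0:
        raise ValueError("gain a must be nonzero")
    return lambda y: f_x((y - b) / a) / abs(a)

# Assumed example: X ~ N(1, 2^2), negative gain a = -3, offset b = 0.5.
mu, sigma, a, b = 1.0, 2.0, -3.0, 0.5
f_y = affine_pdf(lambda x: normal_pdf(x, mu, sigma), a, b)

y = 0.25
via_formula = f_y(y)
via_known_law = normal_pdf(y, a * mu + b, abs(a) * sigma)
print(via_formula, via_known_law)   # the two values agree
```

A negative gain is used on purpose, since that is the case where the absolute value in 1/|a| matters.
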
Random Processes and Random Fields
Random Process: A family of random variables $X_t(\omega)$.
» For each fixed outcome $\omega$, $X_t(\omega)$ is a function of t (time).
» For fixed index t, $X_t(\omega)$ is a random variable defined on $\Omega$.
Example a. A jukebox has 6 songs. You roll a die and pick a song based on its outcome.
Example b. Let $t \in \{0, 1, 2, \ldots\}$. At each t, toss a coin. $X_t(\omega) = 0$ if the outcome is tail, and 1 if the outcome is head.
Random Field: A random field is a random process that is defined on 2D space rather than on 1D (time).
For a monochrome image, the intensity of each pixel $f(x, y) = X_{x,y}(\omega)$ is modeled as a random variable. For a particular outcome $\omega_i$, f(x, y) is a deterministic function of x and y. All results applicable to random processes can be applied to random fields.

Mean, Correlation and Covariance
If $X_t$ is a random process, its mean function is
$$m_X(t) = E[X_t]$$
where the expectation is taken w.r.t. the pmf or pdf of $X_t$ at time t.
Correlation function:
$$R_X(t_1, t_2) = E[X_{t_1} X_{t_2}]$$
Covariance function:
$$C_X(t_1, t_2) = E[(X_{t_1} - m_X(t_1))(X_{t_2} - m_X(t_2))]$$
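Example a. Denote by $s_i(t)$ the time function of the i-th song. A single roll selects the same song i for all t, each with probability 1/6, so
$$m_X(t) = \sum_{i=1}^{6} \frac{1}{6}\, s_i(t)$$
$$R_X(u, v) = E[X_u X_v] = \frac{1}{6} \sum_{i=1}^{6} s_i(u)\, s_i(v)$$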
Example d. Given that $X_0 = 5$. Hence $P(X_1 = 4) = 1$, $m_X(1) = 4$.
$P(X_2 = 3) = (4/5)(4/5) = 16/25$, $P(X_2 = 4) = (4/5)(1/5) + (1/5)(4/5) = 8/25$, $P(X_2 = 5) = 1/25$. Thus,
$m_X(2) = 3(16/25) + 4(8/25) + 5(1/25) = 17/5$

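For the coin-toss process of Example b, the mean, correlation, and covariance functions can be computed exactly by enumerating outcomes. The sketch assumes a fair coin and independent tosses (neither is stated on the slide) and enumerates only the first `T` time indices.

```python
from fractions import Fraction
from itertools import product

T = 3                                        # time indices enumerated (assumption)
outcomes = list(product([0, 1], repeat=T))   # every 0/1 sequence of length T
p = Fraction(1, len(outcomes))               # fair, independent tosses assumed

def mean(t):
    """m_X(t) = E[X_t]."""
    return sum(w[t] * p for w in outcomes)

def corr(t1, t2):
    """R_X(t1, t2) = E[X_t1 X_t2]."""
    return sum(w[t1] * w[t2] * p for w in outcomes)

def cov(t1, t2):
    """C_X(t1, t2) = E[(X_t1 - m_X(t1)) (X_t2 - m_X(t2))]."""
    return sum((w[t1] - mean(t1)) * (w[t2] - mean(t2)) * p for w in outcomes)

print(mean(0), corr(0, 1), corr(1, 1), cov(0, 1), cov(2, 2))
# 1/2 1/4 1/2 0 1/4
```

The covariance vanishes for distinct times because the tosses are assumed independent, while the correlation does not, since the mean is nonzero.
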
Stationary Process and WSS
Any property that depends on the values of $\{X_t\}$ at k index points $t_1, t_2, \ldots, t_k$ is completely characterized by the joint pdf (or pmf) of $X_{t_1}, X_{t_2}, \ldots, X_{t_k}$, denoted (in the pdf case) by $f(X(t_1), \ldots, X(t_k))$.
Definition: Stationary Process
$\{X_t\}$ is (strictly) stationary if for any finite set of time points $\{t_1, t_2, \ldots, t_k\}$, their joint pdf is time invariant, i.e. for every shift $\tau$,
$$f(X(t_1), \ldots, X(t_k)) = f(X(t_1 + \tau), \ldots, X(t_k + \tau))$$
Definition: Wide-sense stationary (WSS)
$\{X_t\}$ is wide-sense stationary if its first two moments are independent of absolute time. That is, the mean is constant and the correlation depends only on the time difference:
$m_X(t) = E[X_t] = m_X$
$R_X(u, v) = R_X(u - v)$
Let $u = t + \tau$, $v = t$; we may write
$R_X(u, v) = R_X((t + \tau) - t) = R_X(\tau)$

Power Spectral Density and Power
Definition: Power Spectral Density (PSD)
$$S_X(f) = \int_{-\infty}^{\infty} R_X(\tau)\, e^{-j2\pi f\tau}\, d\tau \ \ge 0$$
$$R_X(\tau) = \int_{-\infty}^{\infty} S_X(f)\, e^{j2\pi f\tau}\, df$$
$S_X(f)$ gives the power density of the process $X_t$ distributed over each frequency f. Hence it must be non-negative.
Definition: Power $P_X$
$$P_X = E[|X_t|^2] = R_X(0) = \int_{f=-\infty}^{\infty} S_X(f)\, df$$
Properties of PSD and Correlation
a) $R(\tau) = R(-\tau)$. Hence $S_X(f)$ is a real-valued, even function.
b) $|R(\tau)| \le R(0)$. To prove, use the Cauchy-Schwarz inequality: $E[UV]^2 \le E[U^2]\,E[V^2]$
c) $S_X(f)$ is real, even and non-negative.

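The Fourier relation between correlation and PSD can be illustrated numerically. The sketch below assumes the autocorrelation R_X(τ) = exp(−|τ|), chosen only because its transform has the known closed form 2/(1 + (2πf)²); the midpoint-rule integrator, its truncation range `t_max`, and the step count `n` are likewise assumptions of the sketch.

```python
import math

def r_x(tau):
    """Assumed autocorrelation R_X(tau) = exp(-|tau|)."""
    return math.exp(-abs(tau))

def psd(f, t_max=40.0, n=40000):
    """Approximate S_X(f) with a midpoint rule on [-t_max, t_max].

    R_X is even, so the sine part of exp(-j 2 pi f tau) integrates to zero
    and only the cosine part is kept.
    """
    h = 2 * t_max / n
    total = 0.0
    for k in range(n):
        tau = -t_max + (k + 0.5) * h
        total += r_x(tau) * math.cos(2 * math.pi * f * tau)
    return total * h

def psd_exact(f):
    """Closed-form transform of exp(-|tau|)."""
    return 2.0 / (1.0 + (2 * math.pi * f) ** 2)

for f in (0.0, 0.5):
    print(f, psd(f), psd_exact(f))
```

For this example the power is P_X = R_X(0) = 1, and the computed PSD is real, even in f, and non-negative, consistent with the properties listed on the slide.
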
LTI System: A brief review
A system y(t) = L[x(t)] is a mapping of a function x(t) to a function y(t).
$L[\cdot]$ is a linear system iff
$L[a x_1 + b x_2] = a\,L[x_1] + b\,L[x_2]$
$L[\cdot]$ is time invariant iff
$L[x(t + u)] = y(t + u)$
An LTI (linear, time-invariant) system can be uniquely characterized by its impulse response $h(t) = L[\delta(t)]$.
Given an LTI system y(t) = L[x(t)], y(t) can be obtained via the convolution between x(t) and the impulse response h(t):
$$y(t) = \int_{u=-\infty}^{\infty} h(u)\, x(t - u)\, du = h(t) * x(t)$$
The Fourier transform of h(t) is called the transfer function:
$$H(f) = \int_{t=-\infty}^{\infty} h(t)\, e^{-j2\pi f t}\, dt$$
$$Y(f) = H(f)\, X(f)$$

LTI System with WSS input
Let a WSS process $X_t$ be the input to an LTI system with impulse response h(t). The output is denoted by $Y_t$:
$$Y_t = \int_{-\infty}^{\infty} h(u)\, X_{t-u}\, du$$
If $E[X_t] = m$, then
$$E[Y_t] = E\left[\int_{-\infty}^{\infty} h(u)\, X_{t-u}\, du\right] = m \int_{-\infty}^{\infty} h(u)\, du = m\,H(0)$$
Cross correlation between $X_t$ and $Y_s$:
$$\begin{aligned}
E[X_t Y_s] &= E\left[X_t \int_{u=-\infty}^{\infty} h(u)\, X_{s-u}\, du\right] \\
&= \int_{u=-\infty}^{\infty} h(u)\, E[X_t X_{s-u}]\, du \\
&= \int_{u=-\infty}^{\infty} h(u)\, R_X(t - s + u)\, du = R_{XY}(t - s)
\end{aligned}$$
With $\tau = t - s$,
$$R_{XY}(\tau) = \int h(u)\, R_X(\tau + u)\, du = \int h(-v)\, R_X(\tau - v)\, dv = h(-\tau) * R_X(\tau)$$

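WSS Input to an LTI System (Cont'd)
Define the cross power spectral density (cross PSD) as the Fourier transform of $R_{XY}(\tau)$:
$$S_{XY}(f) = \int_{-\infty}^{\infty} R_{XY}(\tau)\, e^{-j2\pi f\tau}\, d\tau$$
Since $R_{XY}(\tau) = h(-\tau) * R_X(\tau)$, for real-valued h(t),
$$S_{XY}(f) = H^{*}(f)\, S_X(f)$$
Autocorrelation of the output:
$$\begin{aligned}
E[Y_t Y_s] &= E\left[\int_{u=-\infty}^{\infty} h(u)\, X_{t-u}\, du \; Y_s\right] \\
&= \int_{u=-\infty}^{\infty} h(u)\, E[X_{t-u} Y_s]\, du \\
&= \int_{u=-\infty}^{\infty} h(u)\, R_{XY}(t - s - u)\, du = R_Y(t - s)
\end{aligned}$$
Substituting $\tau$ for $t - s$, we have
$$R_Y(\tau) = \int h(u)\, R_{XY}(\tau - u)\, du$$
Taking the Fourier transform,
$$S_Y(f) = H(f)\, S_{XY}(f) = H(f)\,\{H^{*}(f)\, S_X(f)\} = |H(f)|^2\, S_X(f)$$
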
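The step from the cross correlation, which is the convolution of h(−τ) with R_X(τ), to the cross PSD H*(f) S_X(f) can be spelled out as follows. The derivation assumes that h(t) is real-valued, which the slides do not state explicitly, and writes 𝓕 for the Fourier transform.

```latex
\begin{aligned}
\int_{-\infty}^{\infty} h(-\tau)\, e^{-j2\pi f\tau}\, d\tau
  &= \int_{-\infty}^{\infty} h(v)\, e^{j2\pi f v}\, dv
     \qquad (v = -\tau) \\
  &= \left( \int_{-\infty}^{\infty} h(v)\, e^{-j2\pi f v}\, dv \right)^{*}
   = H^{*}(f)
     \qquad (h \text{ real}) \\
S_{XY}(f) &= \mathcal{F}\{ h(-\tau) * R_X(\tau) \}
   = H^{*}(f)\, S_X(f)
\end{aligned}
```

The last line uses the convolution theorem: convolution in τ becomes multiplication in f.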