Appendix II – Probability Theory Refresher
Based on: Leonard Kleinrock, Queueing Systems, Vol. I: Theory
Nelson Fonseca, State University of Campinas, Brazil
• Random event: statistical regularity
• Example: If one were to toss a fair coin four
times, one expects on the average two heads
and two tails. There is one chance in sixteen
that no heads will occur. If we tossed the
coin a million times, the odds are better than
10 to 1 that at least 490,000 heads will
occur.
II.1 Rules of the game
• Real-world experiments:
– A set of possible experimental outcomes
– A grouping of these outcomes into classes
called results
– The relative frequency of these classes in many
independent trials of the experiment
Relative frequency = number of times the experimental
outcome falls into that class, divided by the number
of times the experiment is performed
• Mathematical model: three quantities of
interest that are in one-to-one correspondence with
the three quantities of the experimental world
1. A sample space: a collection of objects
that corresponds to the set of mutually
exclusive, exhaustive outcomes of the
model of an experiment. Each object $\omega$
in the set S is referred to as a sample point.
2. A family of events, denoted {A, B,
C, …}, in which each event is a set of
sample points $\{\omega\}$
3. A probability measure P: an
assignment (mapping) of the events defined
on S into the set of real numbers. The
notation is P[A], and the mapping has these
properties:
a) For any event A, $0 \le P[A] \le 1$   (II.1)
b) $P[S] = 1$   (II.2)
c) If A and B are “mutually exclusive” events,
then $P[A \cup B] = P[A] + P[B]$   (II.3)
• Notation
$A^c$: the complement of A (all sample points not in A)
$S^c = \varnothing$: the null event (contains no sample point, since S contains all the points)
If $AB = \varnothing$, then A and B are said to be mutually exclusive (or disjoint)
• Exhaustive set of events: a set of events
whose union forms the sample space S
• A set of mutually exclusive, exhaustive events
$A_1, A_2, \ldots, A_n$ has the properties
$A_i A_j = \varnothing$ for all $i \neq j$
$A_1 \cup A_2 \cup \cdots \cup A_n = S$
• The triplet (S, ℰ, P) — sample space, family of events, probability measure — along with Axioms (II.1)–(II.3) forms a probability system
• Conditional probability
$P[A \mid B] \triangleq \dfrac{P[AB]}{P[B]}, \qquad P[B] \neq 0$
• The event B forces us to restrict attention from the
original sample space S to a new sample space
defined by the event B, since B must now have a
total probability of unity. We magnify the
probabilities associated with conditional events by
dividing by the term P[B]

• Two events A, B are said to be statistically
independent if and only if
$P[AB] = P[A]\,P[B]$
• If A and B are independent,
$P[A \mid B] = P[A]$
• Theorem of total probability
$P[B] = \sum_{i=1}^{n} P[A_i B]$
If the event B is to occur, it must occur in
conjunction with exactly one of the
mutually exclusive, exhaustive events $A_i$
• The second important form of the theorem
of total probability
$P[B] = \sum_{i=1}^{n} P[B \mid A_i]\,P[A_i]$
• Instead of calculating the probability of
some complex event B directly, we calculate the
probability of its occurrence jointly with each of the
mutually exclusive events $A_i$:
$P[B] = \sum_{i=1}^{n} P[B \mid A_i]\,P[A_i] = \sum_{i=1}^{n} P[B A_i] = P[B]$
• Bayes’ theorem
$P[A_i \mid B] = \dfrac{P[B \mid A_i]\,P[A_i]}{\sum_{j=1}^{n} P[B \mid A_j]\,P[A_j]}$
where $\{A_j\}$ is a set of mutually exclusive
and exhaustive events
• Example: You have just entered a casino and
gamble with one of two twin brothers; one is honest and the
other cheats. You know that you lose with
probability 1/2 if you play with the honest brother,
and lose with probability p if you play with the
cheating brother. (The posterior probability of facing
the cheater, given that you lost, is worked out below
via Bayes’ theorem.)
II.2 Random variables
• Random variable is a variable whose value
depends upon the outcome of a random
experiment
• To each outcome, we associate a real number,
which is in fact the value the random variable
takes on that outcome
• Random variable is a mapping from the points of
the sample space into the (real) line
• Example: If we win the game we win $5, if
we lose we win –$5, and if we draw we win $0.
The sample space is S = {W, D, L}, with
P[W] = 3/8, P[D] = 1/4, P[L] = 3/8, and
$$X(\omega) = \begin{cases} +5 & \omega = W \\ 0 & \omega = D \\ -5 & \omega = L \end{cases}$$
Notation: $[X = x] \triangleq \{\omega : X(\omega) = x\}$
$P[X = x]$ = probability that $X(\omega)$ is equal to $x$
$P[X = -5] = 3/8$
$P[X = 0] = 1/4$
$P[X = +5] = 3/8$
• Probability distribution function (PDF), also
known as the cumulative distribution
function
$\{X \le x\} \triangleq \{\omega : X(\omega) \le x\}$
PDF: $F_X(x) \triangleq P[X \le x]$
Properties: $F_X(x) \ge 0$
$F_X(\infty) = 1$
$F_X(-\infty) = 0$
$F_X(b) - F_X(a) = P[a < X \le b]$ for $a < b$
$F_X(b) \ge F_X(a)$ for $b \ge a$
[Figure: the PDF $F_X(x)$ for this example is a staircase rising from 0 to 3/8 at $x = -5$, to 5/8 at $x = 0$, and to 1 at $x = +5$.]
P[2  x  6]  5 8
P[1  x  4]  0
 At points of discontinu ity the PDF takes on the upper valu e
• Probability density function (pdf)
$f_X(x) \triangleq \dfrac{dF_X(x)}{dx}$
$F_X(x) = \int_{-\infty}^{x} f_X(y)\,dy$
We have $f_X(x) \ge 0$ and $F_X(\infty) = 1$; then
$\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
• The pdf integrated over an interval gives the
probability that the random variable X lies
in that interval
$P[a < X \le b] = \int_{a}^{b} f_X(x)\,dx$
• Exponentially distributed random variable ($\lambda > 0$)
PDF: $F_X(x) = \begin{cases} 1 - e^{-\lambda x} & 0 \le x \\ 0 & x < 0 \end{cases}$
pdf: $f_X(x) = \begin{cases} \lambda e^{-\lambda x} & 0 \le x \\ 0 & x < 0 \end{cases}$
$P[a < X \le b] = F_X(b) - F_X(a) = e^{-\lambda a} - e^{-\lambda b}$
$P[a < X \le b] = \int_{a}^{b} f_X(x)\,dx = e^{-\lambda a} - e^{-\lambda b}$
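A small numerical sketch (not from the slides; $\lambda = 2$, $a = 0.5$, $b = 1.5$ are arbitrary choices) comparing the closed form with a direct integration of the pdf:

```python
# Sketch (not from the slides): P[a < X <= b] = e^{-lam*a} - e^{-lam*b} for
# the exponential, via the closed form and a midpoint Riemann sum of the pdf.
import math

lam, a, b = 2.0, 0.5, 1.5
closed = math.exp(-lam * a) - math.exp(-lam * b)
n = 100_000
dx = (b - a) / n
riemann = sum(lam * math.exp(-lam * (a + (i + 0.5) * dx)) * dx for i in range(n))
print(closed, riemann)   # the two agree to several decimal places
```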
[Figure: the pdf $f_X(x)$ for the gambling example consists of impulses of area 3/8 at $x = -5$, 1/4 at $x = 0$, and 3/8 at $x = +5$.]
• Impulse function: used in the pdf where the PDF is discontinuous
– Functions of more than one variable
$F_{XY}(x, y) \triangleq P[X \le x, Y \le y]$
$f_{XY}(x, y) \triangleq \dfrac{\partial^2 F_{XY}(x, y)}{\partial x\,\partial y}$
– “Marginal” density function
$f_X(x) = \int_{y=-\infty}^{\infty} f_{XY}(x, y)\,dy$
– Two random variables X and Y are said to be
independent if and only if
$f_{XY}(x, y) = f_X(x)\,f_Y(y)$
$f_{X_1 X_2 \cdots X_n}(x_1, x_2, \ldots, x_n) = f_{X_1}(x_1)\,f_{X_2}(x_2) \cdots f_{X_n}(x_n)$
• We can define conditional distributions and
densities
$f_{X \mid Y}(x \mid y) \triangleq \dfrac{d}{dx}\,P[X \le x \mid Y = y] = \dfrac{f_{XY}(x, y)}{f_Y(y)}$
• Function of one random variable
$Y = g(X)$
$Y = Y(\omega) = g(X(\omega))$
• Given the random variable X and its PDF, one
should be able to calculate the PDF for the
variable Y
$F_Y(y) = P[Y \le y] = P[\{\omega : g(X(\omega)) \le y\}]$
• In general
$Y = \sum_{i=1}^{n} X_i$
• For the case $n = 2$, $y = x_1 + x_2$:
$F_Y(y) = P[Y \le y] = P[X_1 + X_2 \le y]$
[Figure: the region of the $(x_1, x_2)$-plane where $x_1 + x_2 \le y$.]
$F_Y(y) = \iint_{x_1 + x_2 \le y} f_{X_1 X_2}(x_1, x_2)\,dx_1\,dx_2$
Due to the independence of $X_1$ and $X_2$ we then obtain
the PDF for Y as
$F_Y(y) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{y - x_2} f_{X_1}(x_1)\,dx_1 \right] f_{X_2}(x_2)\,dx_2$
$F_Y(y) = \int_{-\infty}^{\infty} F_{X_1}(y - x_2)\,f_{X_2}(x_2)\,dx_2$
$f_Y(y) = \int_{-\infty}^{\infty} f_{X_1}(y - x_2)\,f_{X_2}(x_2)\,dx_2$
$f_Y(y) = f_{X_1}(y) \circledast f_{X_2}(y)$   (convolution)
$f_Y(y) = f_{X_1}(y) \circledast f_{X_2}(y) \circledast \cdots \circledast f_{X_n}(y)$
II.3 Expectation
• Stieltjes integrals deal with discontinuities
and impulses
Let $F(x)$ be a nondecreasing function,
$\phi(x)$ a continuous function, and
$\{t_k\}$ and $\{\xi_k\}$ two sets of points such that $t_{k-1} \le \xi_k \le t_k$;
pass to the limit $\max_k |t_k - t_{k-1}| \to 0$:
$\sum_k \phi(\xi_k)\,[F(t_k) - F(t_{k-1})] \to \int \phi(x)\,dF(x)$
For a PDF $F(x)$ with pdf $\dfrac{dF(x)}{dx} = f(x)$,
$dF(x) = f(x)\,dx$
• The Stieltjes integral always exists, and therefore it
avoids the issue of impulses
• If impulses are not permitted, the pdf may not exist (where F jumps)
• When impulses are permitted we have
$\int \phi(x)\,dF(x) = \int \phi(x)\,f(x)\,dx$
The expectation (mean, or average value) of a real random variable X is
$E[X] = \overline{X} \triangleq \int_{-\infty}^{\infty} x\,dF_X(x)$
$E[X] = \int_{-\infty}^{\infty} x\,f_X(x)\,dx$
It may also be computed as
$E[X] = \int_{0}^{\infty} [1 - F_X(x)]\,dx - \int_{-\infty}^{0} F_X(x)\,dx$
• Expected value of a function of a random variable
$Y = g(X)$
$E_Y[Y] = \int_{-\infty}^{\infty} y\,f_Y(y)\,dy$
$E_Y[Y] = E_X[g(X)] = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx$
• Expectation of the sum of two random variables
$E[X + Y] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x + y)\,f_{XY}(x, y)\,dx\,dy$
$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x\,f_{XY}(x, y)\,dx\,dy + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y\,f_{XY}(x, y)\,dx\,dy$
$= \int_{-\infty}^{\infty} x\,f_X(x)\,dx + \int_{-\infty}^{\infty} y\,f_Y(y)\,dy$
$= E[X] + E[Y]$
$E[X + Y] = E[X] + E[Y]$
$\overline{X + Y} = \overline{X} + \overline{Y}$
• The expectation of the sum of two random
variables is always equal to the sum of the
expectations of each variable
• This is true even if the variables are dependent
• The expectation operator is a linear operator
E[ X 1  X 2  ...  X n ]  E[ X 1 ]  E[ X 2 ]  ...  E[ X n ]
• Returning to the casino example of II.1, the question is: what is the
probability that you were playing with the cheating brother, given that you lost?
$P[D_C \mid L] = \dfrac{P[L \mid D_C]\,P[D_C]}{P[L \mid D_C]\,P[D_C] + P[L \mid D_H]\,P[D_H]}$
$= \dfrac{p\,(1/2)}{p\,(1/2) + (1/2)(1/2)} = \dfrac{2p}{2p + 1}$
• The number of permutations of N things taken K at a time is
$\dfrac{N!}{(N - K)!} = N(N - 1)\cdots(N - K + 1)$
• The number of combinations of N things taken K at a time is denoted
$\dbinom{N}{K} = \dfrac{N!}{K!\,(N - K)!}$
• Expectation of a product
$E[XY] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,f_{XY}(x, y)\,dx\,dy$
If X and Y are independent,
$E[XY] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,f_X(x)\,f_Y(y)\,dx\,dy = E[X]\,E[Y]$
$\overline{XY} = \overline{X}\,\overline{Y}$
– The expected value of the product of variables is equal
to the product of the expected values if the variables
are independent
– Expected value of the product of functions of independent random variables:
$E[g(X)\,h(Y)] = E[g(X)]\,E[h(Y)]$
– nth moment
$E[X^n] = \overline{X^n} \triangleq \int_{-\infty}^{\infty} x^n\,f_X(x)\,dx$
– nth central moment
$\overline{(X - \overline{X})^n} = \int_{-\infty}^{\infty} (x - \overline{X})^n\,f_X(x)\,dx$
– The nth central moment can be expressed as a function
of the first n moments: expanding by the binomial theorem,
$(X - \overline{X})^n = \sum_{k=0}^{n} \binom{n}{k}\,X^k\,(-\overline{X})^{n-k}$
and taking expectations,
$\overline{(X - \overline{X})^n} = \sum_{k=0}^{n} \binom{n}{k}\,\overline{X^k}\,(-\overline{X})^{n-k}$
– First central moment = 0
$\overline{(X - \overline{X})} = \overline{X} - \overline{X} = 0$
– Second central moment => variance
$\sigma_X^2 \triangleq \overline{(X - \overline{X})^2} = \overline{X^2} - (\overline{X})^2$
– Standard deviation (the square root of the second central moment)
$\sigma_X = \sqrt{\sigma_X^2}$
– Coefficient of variation
$C_X \triangleq \dfrac{\sigma_X}{\overline{X}}$
• Covariance of two random variables X1 and X2
Cov(X1, X2) = E[(X1 – E[X1]) (X2 – E[X2])]
var(X1 + X2) = var(X1) + var(X2) + 2 Cov(X1, X2)
Corr(X1, X2) = Cov(X1, X2) / ($\sigma_{X_1}\,\sigma_{X_2}$)
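A simulation sketch (not from the text) of the variance identity on deliberately dependent samples; all distributions and parameters below are arbitrary choices:

```python
# Sketch: verify var(X1 + X2) = var(X1) + var(X2) + 2 Cov(X1, X2)
# on dependent samples (X2 is built from X1 plus noise).
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(0.0, 1.0, 1_000_000)
x2 = x1 + rng.normal(0.0, 0.5, 1_000_000)   # x2 depends on x1
lhs = np.var(x1 + x2)
rhs = np.var(x1) + np.var(x2) + 2.0 * np.cov(x1, x2)[0, 1]
print(lhs, rhs)                             # both ~ 4.25
```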
Normal
Notation: $X \sim \mathrm{Nor}(\mu, \sigma^2)$
Range: $-\infty < X < \infty$
Parameters – Scale: $\sigma$, $0 < \sigma$
Parameters – Location: $\mu$, $-\infty < \mu < \infty$
Normal
Probability Density Function
$f(X) = \dfrac{1}{\sigma\sqrt{2\pi}}\,\exp\left[-\dfrac{1}{2}\left(\dfrac{X - \mu}{\sigma}\right)^{2}\right]$
Normal
[Figures: normal pdf for (μ = 10, σ = 2) vs (μ = 10, σ = 1), and for (μ = 0, σ = 2) vs (μ = 0, σ = 1).]
Normal
Expected Value: $E(X) = \mu$
Normal
Variance: $V(X) = \sigma^2$
Chebyshev Inequality
$P\left[\,|X - \overline{X}| \ge x\,\right] \le \dfrac{\sigma_X^2}{x^2}$
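A Monte Carlo sketch (not from the text) of the bound; exponential(1) samples and $x = 2$ are arbitrary choices, and the true tail here is $e^{-3} \approx 0.05$, well under the bound 0.25:

```python
# Sketch: Chebyshev's bound for exponential(1) samples, whose mean and
# variance are both 1.
import numpy as np

rng = np.random.default_rng(1)
s = rng.exponential(1.0, 1_000_000)
x = 2.0
print(np.mean(np.abs(s - 1.0) >= x))   # ~ e^{-3} = 0.0498...
print(1.0 / x**2)                      # bound: 0.25
```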
Strong Law of Large Numbers
$W_n \triangleq \dfrac{1}{n}\sum_{i=1}^{n} X_i$
$\overline{W_n} = \overline{X}$
$\sigma_{W_n}^2 = \dfrac{\sigma_X^2}{n}$
Strong Law of Large Numbers
$\lim_{n \to \infty} W_n = \overline{X}$ (with probability 1)
Central Limit Theorem
$Z_n \triangleq \dfrac{\sum_{i=1}^{n} X_i - n\overline{X}}{\sigma_X \sqrt{n}}$
$\lim_{n \to \infty} P[Z_n \le x] = \Phi(x)$
where $\Phi$ is the standard normal distribution function
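A sketch (not from the text) of the theorem for sums of uniform(0,1) variables; $n = 30$ and the replication count are arbitrary, and the standardized sum should give $P[Z_n \le 1] \approx \Phi(1) \approx 0.8413$:

```python
# Sketch: CLT for sums of uniform(0,1) variables.
import numpy as np

rng = np.random.default_rng(2)
n, reps = 30, 200_000
mean, std = 0.5, (1.0 / 12.0) ** 0.5           # uniform(0,1) mean and std
s = rng.uniform(0.0, 1.0, (reps, n)).sum(axis=1)
z = (s - n * mean) / (std * n**0.5)            # standardized sums Z_n
print(np.mean(z <= 1.0))                       # ~ Phi(1) = 0.8413
```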
Exponential
Probability Density Function: $f(X) = \lambda e^{-\lambda X}$
Distribution Function: $F(X) = 1 - e^{-\lambda X}$
Exponential
• Inter arrival time of phone calls
• Inter arrival time of web session
• Duration of on and off periods for
voice models
Heavy-tailed distributions
$P[Z > x] \sim c\,x^{-\alpha}$ as $x \to \infty$, with $0 < \alpha < 2$
Heavy-tailed distributions
• Hyperbolic decay
• Infinite variance: $0 < \alpha < 2$
• Unbounded mean: $0 < \alpha \le 1$
• Network context: $1 < \alpha < 2$
Pareto
Notation: $X \sim \mathrm{Par}(\alpha, \beta)$
Range: $\beta \le X < \infty$
Parameters – Scale: $\beta$, $0 < \beta$
Parameters – Shape: $\alpha$, $0 < \alpha$
Pareto
Distribution Function
$F(x) = 1 - \left(\dfrac{\beta}{x}\right)^{\alpha}$
Pareto
Probability Density Function
$f(X) = \dfrac{\alpha\,\beta^{\alpha}}{X^{\alpha + 1}}$
Pareto
[Figures: Pareto pdf for several (scale, shape) settings: 1 & 1, 1 & 2, and 10 & 5, 5 & 10.]
Pareto
Expected Value
$E(X) = \dfrac{\alpha\beta}{\alpha - 1}, \qquad \alpha > 1$
Pareto
Moments Uncentered
$\mu'_j = \dfrac{\alpha\,\beta^{j}}{\alpha - j}, \qquad \alpha > j$
Pareto
• Distribution of file sizes in Unix
systems
• Duration of on and off periods in data
models (Ethernet, individual users)
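A sampling sketch (not from the text) by inverse transform, under the parameterization above; $\alpha = 2.5$, $\beta = 1$ are arbitrary, and for $\alpha \le 1$ the sample mean would not converge:

```python
# Sketch: inverse-transform sampling from the Pareto distribution and a
# check of the mean formula alpha*beta/(alpha - 1).
import numpy as np

rng = np.random.default_rng(3)
alpha, beta = 2.5, 1.0
u = rng.uniform(0.0, 1.0, 1_000_000)
x = beta / (1.0 - u) ** (1.0 / alpha)    # solves F(x) = u for F = 1 - (beta/x)^alpha
print(x.mean())                          # ~ 1.667
print(alpha * beta / (alpha - 1.0))      # = 1.666...
```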
Weibull
Notation: $X \sim \mathrm{Wei}(b, c)$
Range: $0 \le X < \infty$
Parameters – Scale: $b$, $0 < b$
Parameters – Shape: $c$, $0 < c$
Weibull
Probability Density Function
$f(X) = c\,b^{-c}\,X^{c-1}\,e^{-(X/b)^{c}}$
$f(X) = \dfrac{c\,X^{c-1}}{b^{c}}\,\exp\left[-\left(\dfrac{X}{b}\right)^{c}\right]$
Weibull
Distribution Function
$F(x) = 1 - e^{-(X/b)^{c}}$
$F(x) = 1 - \exp\left[-\left(\dfrac{X}{b}\right)^{c}\right]$
Weibull
[Figures: Weibull pdf for (b = 1, c = 1) vs (b = 2, c = 1); (b = 1, c = 2) vs (b = 2, c = 2); and (b = 10, c = 5), (b = 5, c = 10), (b = 25, c = 10).]
Weibull
Moments Uncentered
$\mu'_j = b^{j}\,\Gamma\!\left(\dfrac{c + j}{c}\right)$
$\mu'_j = \dfrac{j\,b^{j}}{c}\,\Gamma\!\left(\dfrac{j}{c}\right)$
Weibull
Expected Value
$E(X) = b\,\Gamma\!\left(\dfrac{c + 1}{c}\right) = \dfrac{b}{c}\,\Gamma\!\left(\dfrac{1}{c}\right)$
Weibull
Variance
$V(X) = b^{2}\left\{\Gamma\!\left(\dfrac{c + 2}{c}\right) - \left[\Gamma\!\left(\dfrac{c + 1}{c}\right)\right]^{2}\right\}$
$V(X) = \dfrac{b^{2}}{c}\left\{2\,\Gamma\!\left(\dfrac{2}{c}\right) - \dfrac{1}{c}\left[\Gamma\!\left(\dfrac{1}{c}\right)\right]^{2}\right\}$
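A sketch (not from the text) checking the mean formula against simulation; $b = 2$, $c = 1.5$ are arbitrary, and numpy's Weibull sampler draws with scale 1, so the samples are multiplied by $b$:

```python
# Sketch: Weibull mean b*Gamma((c+1)/c) vs a sample mean.
import math
import numpy as np

rng = np.random.default_rng(4)
b, c = 2.0, 1.5
sample = b * rng.weibull(c, 1_000_000)   # rng.weibull draws shape-c, scale-1
print(sample.mean())                     # ~ 1.805
print(b * math.gamma((c + 1.0) / c))     # = 1.8054...
```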
Lognormal
Notation: $X \sim \mathrm{Logn}(\mu, \sigma^2)$
Range: $0 < X < \infty$
Parameters – Scale: $\mu$, or $m = e^{\mu}$ with $m > 0$
Parameters – Shape: $\sigma > 0$, or $w = e^{\sigma^2}$ with $w > 0$
Lognormal
Probability Density Function
$f(X) = \dfrac{1}{\sigma X \sqrt{2\pi}}\,\exp\left[-\dfrac{(\ln X - \mu)^{2}}{2\sigma^{2}}\right]$
Lognormal
Expected Value
$E(X) = e^{\mu + \frac{1}{2}\sigma^{2}} = \exp\left[\mu + \dfrac{\sigma^{2}}{2}\right]$
or
$E(X) = m\,\sqrt{w}$
Lognormal
Variance
$V(X) = e^{2\mu + 2\sigma^{2}} - e^{2\mu + \sigma^{2}}$
$V(X) = e^{2\mu}\,e^{\sigma^{2}}\left(e^{\sigma^{2}} - 1\right) = \exp[2\mu]\,\exp[\sigma^{2}]\left(\exp[\sigma^{2}] - 1\right)$
or
$V(X) = m^{2}\,w\,(w - 1)$
Lognormal
[Figures: lognormal pdf for (μ = 0, σ = 0.5) vs (μ = 0, σ = 0.7); (μ = 1, σ = 0.5) vs (μ = 1, σ = 0.7); (μ = 0, σ = 0.1) vs (μ = 1, σ = 0.1); and (μ = 0, σ = 1) vs (μ = 1, σ = 1).]
Lognormal
• Multiplicative effect: arises as the product of many independent positive factors (the central limit theorem applied to the logarithms)
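A sketch (not from the text) checking the lognormal mean and variance formulas by sampling; $\mu = 0.5$, $\sigma = 0.8$ are arbitrary choices:

```python
# Sketch: lognormal mean exp(mu + sigma^2/2) and variance
# exp(2*mu + sigma^2)*(exp(sigma^2) - 1) vs sample statistics.
import math
import numpy as np

rng = np.random.default_rng(5)
mu, sigma = 0.5, 0.8
x = rng.lognormal(mu, sigma, 2_000_000)
print(x.mean(), math.exp(mu + sigma**2 / 2.0))
print(x.var(), math.exp(2.0 * mu + sigma**2) * (math.exp(sigma**2) - 1.0))
```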
II.4 Transforms, generating functions and
characteristic function
• The characteristic function $\phi_X(u)$ of a random
variable X is given by:
$\phi_X(u) \triangleq E[e^{juX}] = \int_{-\infty}^{\infty} e^{jux} f_X(x)\,dx$
$j = \sqrt{-1}$
– u is a real variable
$|\phi_X(u)| \le \int_{-\infty}^{\infty} |e^{jux}|\,f_X(x)\,dx$
$|e^{juX}| = 1$
$|\phi_X(u)| \le \int_{-\infty}^{\infty} f_X(x)\,dx$
$|\phi_X(u)| \le 1$
– Expanding $e^{jux}$ and integrating,
$\phi_X(u) = \int_{-\infty}^{\infty} f_X(x)\left[1 + jux + \dfrac{(jux)^2}{2!} + \cdots\right]dx$
$= 1 + ju\overline{X} + \dfrac{(ju)^2}{2!}\,\overline{x^2} + \cdots$
$\phi_X(0) = 1$
$\left.\dfrac{d^n \phi_X(u)}{du^n}\right|_{u=0} = j^n\,\overline{X^n}$
– Notation: $g^{(n)}(x_0) \triangleq \left.\dfrac{d^n}{dx^n}\,g(x)\right|_{x = x_0}$
$\phi_X^{(n)}(0) = j^n\,\overline{X^n}$
– Moment generating function
$M_X(v) \triangleq E[e^{vX}] = \int_{-\infty}^{\infty} e^{vx} f_X(x)\,dx$
$M_X^{(n)}(0) = \overline{X^n}$
• Laplace transform of the pdf
– Notation:
$A(x) \triangleq P[X \le x]$ — PDF
$a(x)$ — pdf
$A^*(s)$ — transform
$A^*(s) \triangleq E[e^{-sX}] = \int_{-\infty}^{\infty} e^{-sx}\,a(x)\,dx$
$A^{*(n)}(0) = (-1)^n\,\overline{X^n}$
$\phi_X(js) = M_X(-s) = A^*(s)$
$\overline{X^n} = j^{-n}\,\phi_X^{(n)}(0)$
$\overline{X^n} = M_X^{(n)}(0)$
$\overline{X^n} = (-1)^n\,A^{*(n)}(0)$
– Example
$f_X(x) = a(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$
$\phi_X(u) = \dfrac{\lambda}{\lambda - ju}$
$M_X(v) = \dfrac{\lambda}{\lambda - v}$
$A^*(s) = \dfrac{\lambda}{s + \lambda}$
$\phi_X(0) = M_X(0) = A^*(0) = 1$
$\overline{X} = \dfrac{1}{\lambda}$
$\overline{X^2} = \dfrac{2}{\lambda^2}$
– Probability generating function (z-transform) – discrete random variable
$G(z) \triangleq E[z^X] = \sum_k z^k g_k, \qquad g_k \triangleq P[X = k]$
$G^{(1)}(1) = \overline{X}$
$G^{(2)}(1) = \overline{X^2} - \overline{X}$
$G(1) = 1$
– Sum of n independent variables
$X_i, \qquad Y = \sum_{i=1}^{n} X_i$
$\phi_Y(u) = E[e^{juY}] = E\left[e^{ju\sum_{i=1}^{n} X_i}\right] = E[e^{juX_1}\,e^{juX_2} \cdots e^{juX_n}]$
$\phi_Y(u) = E[e^{juX_1}]\,E[e^{juX_2}] \cdots E[e^{juX_n}]$
$\phi_Y(u) = \phi_{X_1}(u)\,\phi_{X_2}(u) \cdots \phi_{X_n}(u)$
– $X_i$ identically distributed:
$\phi_Y(u) = [\phi_X(u)]^n$
– Sum of independent variables
$Y = X_1 + X_2 + \cdots + X_n$
For $n = 2$:
$\overline{Y^2} = \overline{(X_1 + X_2)^2} = \overline{X_1^2} + 2\,\overline{X_1 X_2} + \overline{X_2^2}$
$(\overline{Y})^2 = (\overline{X_1} + \overline{X_2})^2 = (\overline{X_1})^2 + 2\,\overline{X_1}\,\overline{X_2} + (\overline{X_2})^2$
$\sigma_Y^2 = \overline{Y^2} - (\overline{Y})^2 = \overline{X_1^2} - (\overline{X_1})^2 + \overline{X_2^2} - (\overline{X_2})^2 + 2\left(\overline{X_1 X_2} - \overline{X_1}\,\overline{X_2}\right)$
$= \sigma_{X_1}^2 + \sigma_{X_2}^2 + 2\left(\overline{X_1 X_2} - \overline{X_1}\,\overline{X_2}\right)$
– $X_1$ and $X_2$ independent:
$\overline{X_1 X_2} = \overline{X_1}\,\overline{X_2}$
$\sigma_Y^2 = \sigma_{X_1}^2 + \sigma_{X_2}^2$
– The variance of the sum of independent random
variables is equal to the sum of the variances
$\sigma_Y^2 = \sigma_{X_1}^2 + \sigma_{X_2}^2 + \cdots + \sigma_{X_n}^2$
– A variable defined as the sum of independent variables, where the
number of summands is itself a random variable:
$Y = \sum_{i=1}^{N} X_i$
– where N is a random variable with mean $\overline{N}$ and
variance $\sigma_N^2$
– $\{X_i\}$ are independent and identically distributed
– N and $\{X_i\}$ are independent
– $F_Y(y)$ is a compound distribution
$Y^*(s) = E\left[e^{-s\sum_{i=1}^{N} X_i}\right] = \sum_{n=0}^{\infty} E\left[e^{-s\sum_{i=1}^{n} X_i}\right] P[N = n]$
$= \sum_{n=0}^{\infty} E[e^{-sX_1}] \cdots E[e^{-sX_n}]\,P[N = n]$
– $\{X_i\}$ identically distributed:
$Y^*(s) = \sum_{n=0}^{\infty} [X^*(s)]^n\,P[N = n]$
With $N(z)$ the z-transform (probability generating function) of N:
$Y^*(s) = N(X^*(s))$
$\overline{Y} = \overline{N}\,\overline{X}$
$\sigma_Y^2 = \overline{N}\,\sigma_X^2 + (\overline{X})^2\,\sigma_N^2$
II.6 Stochastic processes
– To each point $\omega$ of the sample space S a time
function $X(t, \omega)$ is associated => a stochastic process is this family of time functions
– PDF:
$F_X(x; t) = P[X(t) \le x]$
$F_X(\mathbf{x}; \mathbf{t}) = F_{X_1 X_2 \cdots X_n}(x_1, x_2, \ldots, x_n;\, t_1, t_2, \ldots, t_n)$
$\triangleq P[X(t_1) \le x_1, X(t_2) \le x_2, \ldots, X(t_n) \le x_n]$
– Stationary process: $F_X(\mathbf{x};\, \mathbf{t} + \tau) = F_X(\mathbf{x};\, \mathbf{t})$ for all $\tau$
– pdf:
$f_X(x; t) = \dfrac{\partial F_X(x; t)}{\partial x}$
– Mean:
$\overline{X}(t) = E[X(t)] = \int_{-\infty}^{\infty} x\,f_X(x; t)\,dx$
– Autocorrelation:
$R_{XX}(t_1, t_2) \triangleq E[X(t_1)\,X(t_2)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\,f_{X_1 X_2}(x_1, x_2;\, t_1, t_2)\,dx_1\,dx_2$
– Wide-sense stationary process:
$\overline{X}(t) = \overline{X}$
$R_{XX}(t_1, t_2) = R_{XX}(t_2 - t_1)$
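A sketch (not from the text) of wide-sense stationarity for the classical random-phase cosine $X(t) = \cos(t + \theta)$, $\theta$ uniform on $(0, 2\pi)$, for which $R_{XX}(\tau) = \cos(\tau)/2$; the instants are arbitrary:

```python
# Sketch: the random-phase cosine is wide-sense stationary; its estimated
# autocorrelation depends only on the lag t2 - t1.
import numpy as np

rng = np.random.default_rng(7)
theta = rng.uniform(0.0, 2.0 * np.pi, 1_000_000)
t1, t2 = 0.3, 1.1                                   # lag tau = 0.8
r = np.mean(np.cos(t1 + theta) * np.cos(t2 + theta))
print(r, np.cos(t2 - t1) / 2.0)                     # both ~ 0.3484
```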