Chapter 4
Continuous Random Variables and Probability Distributions
• 4.1 - Probability Density Functions
• 4.2 - Cumulative Distribution Functions and Expected Values
• 4.3 - The Normal Distribution
• 4.4 - The Exponential and Gamma Distributions
• 4.5 - Other Continuous Distributions
• 4.6 - Probability Plots
Poisson Distribution (discrete)
For x = 0, 1, 2, …, this calculates P(x Events) in a random sample of n trials coming from a population in which the Event is rare (small P(Event)).
But it may also be used to calculate P(x Events) within a random time interval (0, T), for a “Poisson process” having a known “Poisson rate” α.
[Figure: timeline from 0 to T. X = # “clicks” on a Geiger counter in normal background radiation.]
[Figure: timeline from 0 to T. X = time between “clicks” on a Geiger counter in normal background radiation.]
Other events: failures, deaths, births, etc.
• “Time-to-Event Analysis”
• “Time-to-Failure Analysis”
• “Reliability Analysis”
• “Survival Analysis”
Time between events is often modeled by the
Exponential Distribution (continuous).
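X ~ Exp(β), parameter β > 0
X = Time between events
pdf

$$ f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta}, & x \ge 0 \\ 0, & x < 0 \end{cases} $$

[Figure: graph of the pdf, starting at height 1/β at x = 0 and decaying toward 0; total area under the curve = 1.]
Check pdf? $f(x) \ge 0$ is clear. For the total area, let $y = x/\beta$; then $dy = dx/\beta$, and

$$ \int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\beta}\int_{x=0}^{\infty} e^{-x/\beta}\,dx = \int_{y=0}^{\infty} e^{-y}\,dy = \lim_{c\to\infty}\left[-e^{-y}\right]_{0}^{c} = -\lim_{c\to\infty} e^{-c} + 1 = 0 + 1 = 1. $$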

Calculate the expected time between events

$$ \mu = E[X] = \int_{-\infty}^{\infty} x\, f(x)\,dx = \frac{1}{\beta}\int_{x=0}^{\infty} x\, e^{-x/\beta}\,dx $$

Integration by Parts, $\int u\,dv = uv - \int v\,du$, with $u = x$ and $dv = e^{-x/\beta}\,dx$, so that $du = dx$ and $v = -\beta\, e^{-x/\beta}$:

$$ \mu = \left[-x\, e^{-x/\beta}\right]_{x=0}^{\infty} + \int_{x=0}^{\infty} e^{-x/\beta}\,dx = \lim_{c\to\infty}\left(-c\, e^{-c/\beta}\right) + 0 + \beta\cdot\frac{1}{\beta}\int_{x=0}^{\infty} e^{-x/\beta}\,dx = 0 + \beta = \beta $$
Mean μ = β
Similarly for the variance…

$$ \sigma^2 = E\left[(X-\mu)^2\right] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx = E\left[X^2\right] - \mu^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = \frac{1}{\beta}\int_{x=0}^{\infty} x^2 e^{-x/\beta}\,dx - \beta^2 $$

Integration by Parts ($\int u\,dv = uv - \int v\,du$, applied twice), etc... $= 2\beta^2 - \beta^2 = \beta^2$
Variance σ² = β²
Determine the cdf

$$ F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt $$

For $x \ge 0$,

$$ F(x) = \int_{0}^{x} \frac{1}{\beta}\, e^{-t/\beta}\,dt = \left[-e^{-t/\beta}\right]_{0}^{x} = 1 - e^{-x/\beta}, \quad x \ge 0 $$

Note: $F(0) = 0$, $\lim_{x\to\infty} F(x) = 1$
X ~ Exp(β), parameter β > 0
pdf

$$ f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta}, & x \ge 0 \\ 0, & x < 0 \end{cases} $$

cdf

$$ F(x) = \begin{cases} 1 - e^{-x/\beta}, & x \ge 0 \\ 0, & x < 0 \end{cases} $$

Note: $P(X > x) = 1 - F(x) = e^{-x/\beta}$ for $x \ge 0$. This is the “Reliability Function” R(t), or “Survival Function” S(t).
Example: Suppose mean time between events is known to be Mean μ = β = 2 years. Then for x ≥ 0,

$$ F(x) = P(X \le x) = 1 - e^{-x/2}. $$

Calculate P(X ≤ 3 years).

$$ F(3) = P(X \le 3) = 1 - e^{-3/2} \approx 0.77687 $$

Calculate the “Poisson rate” α.
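The arithmetic in this example is easy to reproduce. The sketch below is an illustration only; the helper names `exp_cdf` and `exp_survival` are hypothetical and simply encode the cdf and survival function of Exp(β) with β taken as the mean.

```python
import math

def exp_cdf(x, beta):
    """F(x) = P(X <= x) for X ~ Exp(beta), with beta = mean time between events."""
    return 1.0 - math.exp(-x / beta) if x >= 0 else 0.0

def exp_survival(x, beta):
    """S(x) = P(X > x) = 1 - F(x)."""
    return 1.0 - exp_cdf(x, beta)

print(round(exp_cdf(3, beta=2), 5))       # 0.77687
print(round(exp_survival(3, beta=2), 5))  # 0.22313
```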
Connection?
Recall that the Poisson Distribution (discrete) may be used to calculate P(x Events) within a random time interval (0, T), for a “Poisson process” having a known “Poisson rate” α.
The mean number of events during this time interval (0, T) is αT.
Therefore, the mean number of events in one unit of time is αT / T = α.
However, X = Time between events is often modeled by the Exponential Distribution (continuous), and the mean time between events was just shown to be β. Hence β = 1/α.
Ex: Suppose the mean number of instantaneous clicks/sec is α = 10, then the mean time between any two successive clicks is β = 1/10 sec.
[Figure: timeline of clicks over 1 second.]
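The relationship β = 1/α can also be seen in a simulated Poisson process. This is a rough sketch under stated assumptions: the function name `simulate_clicks`, the seed, and the observation length of 10,000 seconds are all made up for illustration, and the gaps are generated as exponential with mean 1/α, which is the modeling claim of the slide rather than something the code proves.

```python
import random

def simulate_clicks(alpha, total_time, seed=1):
    """Return click times on (0, total_time) for a Poisson process with rate alpha.

    Gaps between successive clicks are drawn as exponential with mean beta = 1/alpha.
    """
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(alpha)      # expovariate takes the rate alpha
        if t >= total_time:
            return times
        times.append(t)

alpha, T = 10.0, 10_000.0                # 10 clicks/sec, observed for 10,000 sec
clicks = simulate_clicks(alpha, T)

mean_count_per_sec = len(clicks) / T
gaps = [b - a for a, b in zip(clicks, clicks[1:])]
mean_gap = sum(gaps) / len(gaps)

print(f"mean clicks per second   = {mean_count_per_sec:.3f}  (alpha = {alpha})")
print(f"mean time between clicks = {mean_gap:.4f}  (beta = 1/alpha = {1 / alpha})")
```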
With mean time between events β = 2 years, the “Poisson rate” is

$$ \alpha = \frac{1}{\beta} = \frac{1\ \text{event}}{2\ \text{years}} = 0.5\ \text{events/yr} $$
Another property … (Event = “Failure,” etc.)
Recall $F(x) = P(X \le x) = 1 - e^{-x/\beta}$.
[Figure: timeline from 0 to T, with “No Failure” marked on the interval from t to t + Δt.]
What is the probability of “No Failure” up to t + Δt, given “No Failure” up to t?

$$ P(X > t + \Delta t \mid X > t) = \frac{P(X > t + \Delta t \,\cap\, X > t)}{P(X > t)} = \frac{1 - F(t + \Delta t)}{1 - F(t)} = \frac{e^{-(t + \Delta t)/\beta}}{e^{-t/\beta}} = e^{-\Delta t/\beta} $$

independent of time t; only depends on Δt
“Memory-less” property of the Exponential distribution
The conditional probability of “no failure” from ANY time t to a future time t + Δt of fixed duration Δt remains constant.
Models many systems in the “prime of their lives,” e.g.,
a random 30-yr old individual in the USA.
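A quick numerical illustration of the memory-less property: the conditional probability below is evaluated at several starting times t and always returns the same value. This is only a sketch; the function names and the choices β = 2 and Δt = 1 are assumptions for the demonstration.

```python
import math

def survival(x, beta):
    """P(X > x) = e^(-x/beta) for X ~ Exp(beta), x >= 0."""
    return math.exp(-x / beta)

def conditional_no_failure(t, dt, beta):
    """P(X > t + dt | X > t) = P(X > t + dt) / P(X > t)."""
    return survival(t + dt, beta) / survival(t, beta)

beta, dt = 2.0, 1.0                      # illustrative values
for t in (0.0, 1.0, 5.0, 20.0):
    print(t, round(conditional_no_failure(t, dt, beta), 6))

# Every row shows the same value, e^(-dt/beta) = e^(-0.5), regardless of t.
```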
More general models exist…, e.g.,
In order to understand this, it is first necessary to understand the “Gamma Function”
Def: For any α > 0,

$$ \Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha - 1} e^{-x}\,dx $$

• Discovered by Swiss mathematician Leonhard Euler (1707-1783) in a different form.
• “Special Functions of Mathematical Physics” includes Gamma, Beta, Bessel, classical orthogonal polynomials (Jacobi, Chebyshev, Legendre, Hermite,…), etc.
• Generalization of “factorials” to all complex values of α (except 0, -1, -2, -3, …).
• The Exponential distribution is a special case of the Gamma distribution!
Basic Properties:
• $\Gamma(1) = 1$
Proof: $\Gamma(1) = \int_{0}^{\infty} e^{-x}\,dx = \lim_{c\to\infty}\left[-e^{-x}\right]_{0}^{c} = -\lim_{c\to\infty} e^{-c} + 1 = 1$
• $\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha)$
Proof: Integration by Parts, $\int u\,dv = uv - \int v\,du$, with $u = x^{\alpha}$, $dv = e^{-x}\,dx$, $du = \alpha x^{\alpha - 1}\,dx$, $v = -e^{-x}$:

$$ \Gamma(\alpha + 1) = \int_{0}^{\infty} x^{\alpha} e^{-x}\,dx = \left[-x^{\alpha} e^{-x}\right]_{0}^{\infty} + \alpha\int_{0}^{\infty} x^{\alpha - 1} e^{-x}\,dx = 0 + \alpha\,\Gamma(\alpha) $$

• Let α = n = 1, 2, 3, … Then $\Gamma(n + 1) = n\,\Gamma(n) = n!$
• $\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$
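These basic properties can be spot-checked with the Gamma function in Python's standard library (`math.gamma`). The snippet is only a numerical illustration at a few points, not a proof; the value α = 2.7 is an arbitrary choice.

```python
import math

# Gamma(1) = 1
print(math.gamma(1))                                                    # 1.0

# Gamma(alpha + 1) = alpha * Gamma(alpha), checked at an arbitrary alpha > 0
alpha = 2.7
print(math.isclose(math.gamma(alpha + 1), alpha * math.gamma(alpha)))   # True

# Gamma(n + 1) = n! for n = 1, 2, 3, ...
for n in range(1, 6):
    print(n, math.gamma(n + 1), math.factorial(n))

# Gamma(1/2) = sqrt(pi)
print(math.isclose(math.gamma(0.5), math.sqrt(math.pi)))                # True
```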

The Gamma Function

$$ \Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha - 1} e^{-x}\,dx $$

Γ(1) = 0! = 1
Γ(2) = 1! = 1
Γ(3) = 2! = 2
Γ(4) = 3! = 6
Γ(5) = 4! = 24

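X ~ Gamma(α, β), parameters α, β > 0
α = “shape parameter”
β = “scale parameter”
pdf

$$ f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta}, & x \ge 0 \\ 0, & x < 0 \end{cases} $$

where $\Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha - 1} e^{-x}\,dx$ is the Gamma Function.
Check $\int_{0}^{\infty} f(x)\,dx = 1$? With $y = x/\beta$, the integral becomes $\frac{1}{\Gamma(\alpha)}\int_{0}^{\infty} y^{\alpha - 1} e^{-y}\,dy = 1$.
Mean μ = αβ
Variance σ² = αβ²
Note that if α = 1, then pdf $f(x) = \frac{1}{\beta}\, e^{-x/\beta}$, $x \ge 0$, i.e., Gamma(1, β) = Exp(β).
Note that if β = 1, then pdf $f(x) = \frac{1}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x}$ for $x \ge 0$, i.e., Gamma(α, 1).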
WLOG… X ~ Gamma(α, 1)
α = “shape parameter”
pdf

$$ f(x) = \frac{1}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x} \quad \text{for } x \ge 0, \qquad \text{where } \Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha - 1} e^{-x}\,dx \text{ is the Gamma Function} $$

[Figure: graphs of the Gamma(α, 1) pdf for α = 0.5, α = 1 (X ~ Exp(1)), α = 2, and α = 3.]
cdf

$$ F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy = \int_{0}^{x} \frac{1}{\Gamma(\alpha)}\, y^{\alpha - 1} e^{-y}\,dy = \frac{1}{\Gamma(\alpha)}\int_{0}^{x} y^{\alpha - 1} e^{-y}\,dy $$

The integral $\int_{0}^{x} y^{\alpha - 1} e^{-y}\,dy$ is the “Incomplete Gamma Function”.
(No general closed form expression, but still continuous and monotonic from 0 to 1.)
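Although there is no general closed form, the cdf of Gamma(α, 1) is straightforward to approximate numerically. The sketch below uses a standard power-series expansion of the incomplete gamma function, $F(x) = x^{\alpha} e^{-x}\sum_{k \ge 0} x^{k} / \Gamma(\alpha + k + 1)$; this technique is swapped in here for illustration and does not come from the slides. The function name `gamma_cdf_standard` and the fixed number of terms are assumptions, and the series is only intended for moderate x.

```python
import math

def gamma_cdf_standard(x, alpha, terms=200):
    """Approximate F(x) = P(X <= x) for X ~ Gamma(alpha, 1) by a power series.

    F(x) = x^alpha * e^(-x) * sum_{k >= 0} x^k / Gamma(alpha + k + 1)
    """
    if x <= 0:
        return 0.0
    total = 0.0
    term = 1.0 / math.gamma(alpha + 1)       # k = 0 term
    for k in range(terms):
        total += term
        term *= x / (alpha + k + 1)          # ratio of term k+1 to term k
    return (x ** alpha) * math.exp(-x) * total

# alpha = 1 should reproduce the Exp(1) cdf, 1 - e^(-x)
print(gamma_cdf_standard(2.0, 1.0), 1 - math.exp(-2.0))

# F climbs monotonically from 0 toward 1
for x in (0.5, 1.0, 2.0, 5.0, 10.0):
    print(x, round(gamma_cdf_standard(x, alpha=2.0), 4))
```

For α = 1 the series collapses to the Exponential cdf, which gives a convenient correctness check.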
Return to… X ~ Gamma(α, β), parameters α, β > 0
α = “shape parameter”
β = “scale parameter”
pdf

$$ f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta}, & x \ge 0 \\ 0, & x < 0 \end{cases} $$

Mean μ = αβ
Variance σ² = αβ²
Note that if α = 1, then Gamma(1, β) = Exp(β):

$$ f(x) = \frac{1}{\beta}\, e^{-x/\beta}, \quad x \ge 0, \qquad \mu = \beta, \quad \sigma^2 = \beta^2 $$

Here 1/β is the “Poisson rate”.
“independent, identically distributed” (i.i.d.)
Theorem: Suppose r.v.’s $X_1, X_2, X_3, \ldots, X_n$ are independent, ~ Exp(β).
Then their sum $X_1 + X_2 + X_3 + \cdots + X_n$ ~ Gamma(n, β). e.g., failure time in machine components
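A simulation can make the theorem plausible: summing n independent Exp(β) times should give a sample whose mean and variance match those of Gamma(n, β), namely nβ and nβ². The following is a minimal sketch with assumed values n = 4 and β = 2 and a hypothetical helper name; it checks the first two moments only, not the full distribution.

```python
import random

def simulate_total_failure_time(n, beta, trials, seed=2):
    """Each trial sums n independent Exp(beta) times (beta = mean of each)."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0 / beta) for _ in range(n))
            for _ in range(trials)]

n, beta = 4, 2.0                          # illustrative values
totals = simulate_total_failure_time(n, beta, trials=50_000)

total_mean = sum(totals) / len(totals)
total_var = sum((s - total_mean) ** 2 for s in totals) / len(totals)

print(f"mean     = {total_mean:.3f}  (Gamma(n, beta) mean     n*beta   = {n * beta})")
print(f"variance = {total_var:.3f}  (Gamma(n, beta) variance n*beta^2 = {n * beta ** 2})")
```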
Example: Suppose X = time between failures is known to be modeled by a Gamma distribution, with mean = 8 years, and standard deviation = 4 years. Calculate the probability of failure before 5 years.
From μ = αβ and σ² = αβ²:
8 = αβ and 4² = αβ², so β = 2 and α = 4.

$$ f(x) = \frac{1}{2^{4}\,\Gamma(4)}\, x^{4 - 1} e^{-x/2} = \frac{1}{(16)\,3!}\, x^{3} e^{-x/2} = \frac{1}{96}\, x^{3} e^{-x/2}, \quad x \ge 0 $$

$$ F(5) = P(X \le 5) = \int_{0}^{5} \frac{1}{96}\, t^{3} e^{-t/2}\,dt $$
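The integral for F(5) in the first example (mean 8 years, standard deviation 4 years) can be evaluated numerically. The sketch below recovers α and β from the mean and standard deviation and then applies composite Simpson's rule; the helper names `gamma_pdf` and `simpson` and the number of subintervals are illustrative assumptions, not the slides' method.

```python
import math

def gamma_pdf(x, alpha, beta):
    """pdf of Gamma(alpha, beta) with shape alpha and scale beta."""
    if x < 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

mean, sd = 8.0, 4.0
beta = sd ** 2 / mean        # sigma^2 / mu = (alpha beta^2) / (alpha beta) = beta
alpha = mean / beta          # mu = alpha beta

prob = simpson(lambda t: gamma_pdf(t, alpha, beta), 0.0, 5.0)
print(alpha, beta)           # 4.0 2.0
print(round(prob, 4))        # 0.2424
```

So under this model the probability of failure before 5 years is roughly 0.24.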
Example: Suppose X = time between failures is known to be modeled by a Gamma distribution, with mean = 5.6 years, and standard deviation = 3 years. Calculate the probability of failure before 5 years.
From μ = αβ and σ² = αβ²:
5.6 = αβ and 3² = αβ², so β ≈ 1.6 and α ≈ 3.5.

$$ f(x) = \frac{1}{(1.6)^{3.5}\,\Gamma(3.5)}\, x^{3.5 - 1} e^{-x/1.6}, \quad x \ge 0 $$

$$ F(5) = P(X \le 5) = \int_{0}^{5} f(t)\,dt $$

Recall... $\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha)$ for any α > 0.

$$ \Gamma\left(\tfrac{7}{2}\right) = \tfrac{5}{2}\,\Gamma\left(\tfrac{5}{2}\right) = \tfrac{5}{2}\cdot\tfrac{3}{2}\,\Gamma\left(\tfrac{3}{2}\right) = \tfrac{5}{2}\cdot\tfrac{3}{2}\cdot\tfrac{1}{2}\,\Gamma\left(\tfrac{1}{2}\right) = \tfrac{15}{8}\sqrt{\pi} $$

Chi-Squared Distribution
with ν = n − 1 degrees of freedom, df = 1, 2, 3,…
Special case of the Gamma distribution: α = ν/2, β = 2

$$ f(x) = \begin{cases} \dfrac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{\nu/2 - 1} e^{-x/2}, & x \ge 0 \\ 0, & x < 0 \end{cases} $$

[Figure: graphs of the Chi-squared pdf for ν = 1, 2, 3, 4, 5, 6, 7.]
“Chi-squared Test” used in statistical analysis of categorical data.
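Because the Chi-squared pdf is a Gamma pdf with shape ν/2 and scale 2, it can be written as a one-line wrapper. The sketch below is illustrative only (the function names are hypothetical) and checks two cases that have simple closed forms: ν = 2, which is Exp(2), and ν = 4.

```python
import math

def gamma_pdf(x, alpha, beta):
    """pdf of Gamma(alpha, beta); returns 0 for x <= 0
    (the single point x = 0 does not affect any probability)."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

def chi_squared_pdf(x, nu):
    """Chi-squared pdf with nu degrees of freedom = Gamma(nu / 2, 2) pdf."""
    return gamma_pdf(x, alpha=nu / 2, beta=2.0)

x = 3.0
# nu = 2: Gamma(1, 2) = Exp(2), so the pdf is (1/2) e^(-x/2)
print(chi_squared_pdf(x, 2), 0.5 * math.exp(-x / 2))
# nu = 4: the pdf is (1/4) x e^(-x/2)
print(chi_squared_pdf(x, 4), 0.25 * x * math.exp(-x / 2))
```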
F-distribution
with degrees of freedom $\nu_1$ and $\nu_2$.
“F-Test” used when comparing means of two or more groups (ANOVA).
T-distribution
with (n – 1) degrees of freedom, df = 1, 2, 3, …
[Figure: graphs of the T-distribution pdf for df = 1, df = 2, df = 5, and df = 10.]
“T-Test” used when analyzing means of one or two groups.
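T-distribution
with 1 degree of freedom
pdf:

$$ f(x) = \frac{1}{\pi}\cdot\frac{1}{1 + x^2}, \quad -\infty < x < \infty $$

[Figure: graph of the pdf for df = 1, with points a < 0 < b marked on the x-axis and area 1/2 on each side of 0.]
Is the total area 1? This is an improper integral at both endpoints. With a < 0, b > 0:

$$ \int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx = \frac{1}{\pi}\left[\int_{-\infty}^{0} \frac{1}{1 + x^2}\,dx + \int_{0}^{\infty} \frac{1}{1 + x^2}\,dx\right] $$

$$ = \frac{1}{\pi}\left[\lim_{a\to-\infty}\int_{a}^{0} \frac{1}{1 + x^2}\,dx + \lim_{b\to\infty}\int_{0}^{b} \frac{1}{1 + x^2}\,dx\right] = \frac{1}{\pi}\left[\lim_{a\to-\infty}\left(\tan^{-1} x\right)\Big|_{a}^{0} + \lim_{b\to\infty}\left(\tan^{-1} x\right)\Big|_{0}^{b}\right] $$

$$ = \frac{1}{\pi}\left[\lim_{a\to-\infty}\left(-\tan^{-1} a\right) + \lim_{b\to\infty}\left(\tan^{-1} b\right)\right] = \frac{1}{\pi}\left[\frac{\pi}{2} + \frac{\pi}{2}\right] = \frac{1}{2} + \frac{1}{2} = 1 $$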
T-distribution
with 1 degree of freedom

$$ f(x) = \frac{1}{\pi}\cdot\frac{1}{1 + x^2}, \quad -\infty < x < \infty $$

[Figure: graph of $y = \dfrac{x}{1 + x^2}$, with points a < 0 < b marked on the x-axis; the curve reaches heights −1/2 and 1/2.]
What about the mean? Again an improper integral at both endpoints. With a < 0, b > 0:

$$ \mu = \int_{-\infty}^{\infty} x\, f(x)\,dx = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{x}{1 + x^2}\,dx = \frac{1}{\pi}\left[\int_{-\infty}^{0} \frac{x}{1 + x^2}\,dx + \int_{0}^{\infty} \frac{x}{1 + x^2}\,dx\right] $$

$$ = \frac{1}{\pi}\left[\lim_{a\to-\infty}\int_{a}^{0} \frac{x}{1 + x^2}\,dx + \lim_{b\to\infty}\int_{0}^{b} \frac{x}{1 + x^2}\,dx\right] = \frac{1}{\pi}\left[\lim_{a\to-\infty} \tfrac{1}{2}\ln(1 + x^2)\Big|_{a}^{0} + \lim_{b\to\infty} \tfrac{1}{2}\ln(1 + x^2)\Big|_{0}^{b}\right] $$

$$ = \frac{1}{\pi}\left[\lim_{a\to-\infty}\left(-\tfrac{1}{2}\ln(1 + a^2)\right) + \lim_{b\to\infty} \tfrac{1}{2}\ln(1 + b^2)\right] = \frac{1}{\pi}\left[-\infty + \infty\right] $$

“indeterminate form”: neither limit is finite, so the improper integral does not converge and the mean does not exist.
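The “indeterminate form” can be made concrete by evaluating the truncated integrals directly from the antiderivative ½ ln(1 + x²). In the sketch below (function names are hypothetical), each half grows without bound, and the value of the truncated mean depends on how a and b are sent to infinity, which is why no single value can be assigned to the mean.

```python
import math

def half_mean_integral(b):
    """(1/pi) * integral from 0 to b of x / (1 + x^2) dx = ln(1 + b^2) / (2 pi)."""
    return math.log(1.0 + b * b) / (2.0 * math.pi)

def truncated_mean(a, b):
    """(1/pi) * integral from a to b of x / (1 + x^2) dx, for a < 0 < b."""
    return half_mean_integral(b) - half_mean_integral(-a)

# Each half keeps growing (like a logarithm) as the endpoint moves out
for b in (1e1, 1e3, 1e6, 1e12):
    print(f"b = {b:.0e}: right half = {half_mean_integral(b):.3f}")

# The "limit" depends on how a and b go to infinity
print(truncated_mean(-1e6, 1e6))    # symmetric endpoints: 0.0
print(truncated_mean(-1e6, 2e6))    # b = 2|a|: about ln(2)/pi
```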
● Normal distribution
● Log-Normal ~ X is not normally distributed (e.g., skewed), but
Y = “logarithm of X” is normally distributed
● Student’s t-distribution ~ Similar to the normal distribution, but more flexible
● F-distribution ~ Used when comparing multiple group means
● Chi-squared distribution ~ Used extensively in categorical
data analysis
● Others for specialized applications ~ Gamma, Beta, Weibull…