Download Chapter3

Document related concepts

Statistics wikipedia , lookup

History of statistics wikipedia , lookup

Transcript
Probability and Statistics with
Reliability, Queuing and Computer
Science Applications: Chapter 3
Continuous Random Variables
Definitions
• Distribution function:
• If FX(x) is a continuous function of x, then X is a
continuous random variable.
– FX(x): discrete in x  Discrete rv’s
– FX(x): piecewise continuous  Mixed rv’s
•
Definitions
(Continued)
Equivalence:
• CDF (cumulative distribution function)
• PDF (probability distribution function)
• Distribution function
• FX(x) or FX(t) or F(t)
Probability Density Function (pdf)
•
•
X : continuous rv, then,
pdf properties:
1.
2.
Definitions
(Continued)
• Equivalence: pdf
– probability density function
– density function
– density
dF
– f(t) =
dt
F (t )  
t

f ( x)dx
t
  f ( x)dx ,
0
For a non-negative
random variable
Exponential Distribution
•
•
•
•
Arises commonly in reliability & queuing theory.
A non-negative random variable
It exhibits memoryless (Markov) property.
Related to (the discrete) Poisson distribution
– Interarrival time between two IP packets (or voice calls)
– Time to failure, time to repair etc.
• Mathematically (CDF and pdf, respectively):
CDF of exponentially distributed
random variable with  = 0.0001
F(t)
12500
25000
t
37500
50000
Exponential Density Function
(pdf)
f(t)
t
Memoryless property
• Assume X > t. We have observed that the
component has not failed until time t.
• Let Y = X - t , the remaining (residual) lifetime
• The distribution of the remaining life, Y, does not
depend on how long the component has been
operating. Distribution of Y is identical to that of X.
Memoryless property
• Assume X > t. We have observed that the
component has not failed until time t.
• Let Y = X - t , the remaining (residual)
lifetime
Gt ( y )  P(Y  y | X  t )
 P( X  y  t | X  t )
P(t  X  y  t )
 y

 1 e
P( X  t )
Memoryless property (Continued)
• Thus Gt(y) is independent of t and is identical
to the original exponential distribution of X.
• The distribution of the remaining life does
not depend on how long the component has
been operating.
• Its eventual breakdown is the result of some
suddenly appearing failure, not of gradual
deterioration.
Reliability as a Function of Time
• Reliability R(t): failure occurs after time ‘t’. Let X
be the lifetime of a component subject to failures.
• Let N0: total no. of components (fixed); Ns(t):
surviving ones; Nf(t): failed one by time t.
Definitions
(Continued)
Equivalence:
• Reliability
• Complementary distribution function
• Survivor function
• R(t) = 1 -F(t)
Failure Rate or Hazard Rate
• Instantaneous failure rate: h(t) (#failures/10k hrs)
• Let the rv X be EXP( λ). Then,
• Using simple calculus the following apples to any rv,
Hazard Rate and the pdf
f (t )
f (t )
h (t ) 

R(t ) 1  F (t )
h(t) t = Conditional Prob. system will fail in
(t, t + t) given that it has survived until time t
f(t) t = Unconditional Prob. System will fail in
(t, t + t)
• Difference between:
– probability that someone will die between 90
and 91, given that he lives to 90
– probability that someone will die between 90
and 91
Weibull Distribution
• Frequently used to model fatigue failure, ball bearing failure
etc. (very long tails)
Rt   e
t 0
• Reliability:
• Weibull distribution is capable of modeling DFR (α < 1), CFR
(α = 1) and IFR (α >1) behavior.
 t
• α is called the shape parameter and  is the scale parameter
Failure rate of the weibull
distribution with various values
of  and  = 1
5.0
1.0
2.0
3.0
4.0
Infant Mortality Effects in System
Modeling
• Bathtub curves
– Early-life period
– Steady-state period
– Wear out period
• Failure rate models
Bathtub Curve
Failure Rate (t)
•Until now we assumed that failure rate of equipment is time (age)
independent. In real-life, variation as per the bathtub shape has been
observed
Infant Mortality
(Early Life Failures)
Steady State
Operating Time
Wear out
Early-life Period
• Also called infant mortality phase or reliability
growth phase
• Caused by undetected hardware/software defects
that are being fixed resulting in reliability growth
• Can cause significant prediction errors if steadystate failure rates are used
• Availability models can be constructed and solved
to include this effect
• Weibull Model can be used
Steady-state Period
• Failure rate much lower than in early-life
period
• Either constant (age independent) or slowly
varying failure rate
• Failures caused by environmental shocks
• Arrival process of environmental shocks can
be assumed to be a Poisson process
• Hence time between two shocks has the
exponential distribution
Wear out Period
• Failure rate increases rapidly with age
• Properly qualified electronic hardware do
not exhibit wear out failure during its
intended service life (Motorola)
• Applicable for mechanical and other
systems
• Weibull Failure Model can be used
Bathtub curve
DFR phase: Initial design, constant bug fixes
CFR phase: Normal operational phase
IFR phase: Aging behavior
h(t)
(burn-in-period)
(wear-out-phase)
CFR
(useful life)
DFR
IFR
t
Decreasing failure rate
Increasing fail. rate
Failure Rate Models
•We use a truncated Weibull Model
Failure-Rate Multiplier
7
6
5
4
3
2
1
0
0
2,190
4,380
6,570
8,760
10,950 13,140 15,330 17,520
Operating Times (hrs)
•Infant mortality phase modeled by DFR Weibull and the
steady-state phase by the exponential
Failure Rate Models (cont.)
• This model has the form:
1  t  8,760
W ( t )  C 1 t 
  SS
t  8,760
• where:
• C 1  W 1,  SS  steady-state failure rate
•  is the Weibull shape parameter
• Failure rate multiplier = W ( t)  SS
Failure Rate Models (cont.)
• There are several ways to incorporate time dependent failure rates in
availability models
• The easiest way is to approximate a continuous function by a
decreasing step function
Failure-Rate Multiplier
7
6
1
5
4
2
3
2
1
0
0
2,190
4,380
 SS
6,570 8,760 10,950 13,140 15,330 17,520
Operating Times (hrs)
Failure Rate Models (cont.)
•Here the discrete failure-rate model is
defined by:
0  t  4,380
W ( t )   1
 2
4,380  t  8,760
  ss
t  8,760
Uniform Random Variable
• All (pseudo) random generators generate random
deviates of U(0,1) distribution; that is, if you
generate a large number of random variables and
plot their empirical distribution function, it will
approach this distribution in the limit.
• U(a,b)  pdf constant over the (a,b) interval and
CDF is the ramp function
Uniform density
U(0,1) pdf
1.2
1
cdf
0.8
0.6
0.4
0.2
0
0
0.1 0.2 0.3 0.6 0.7 0.8 0.9
1
1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9
tim e
2
2.1 2.2
Uniform distribution
• The distribution function is given by:
0 ,
{
F(x)=
xa ,
ba
1 ,
x < a,
a <x<b
x > b.
time
1.48
1.4
1.32
1.24
1.16
1.08
1.04
1
0.96
0.88
0.8
0.72
0.64
0.56
0.48
0.4
0.32
0.24
0.16
0.08
0
cdf
Uniform distribution (Continued)
U(0,1) cdf
1.2
1
0.8
0.6
U(
0.4
0.2
0
HypoExponential
• HypoExp: multiple Exp stages in series.
• 2-stage HypoExp denoted as HYPO(λ1, λ2). The
density, distribution and hazard rate function are:
• HypoExp results in IFR: 0  min(λ1, λ2)
• Disk service time may be modeled as a 3-stage
Hypoexponential as the overall time is the sum of
the seek, the latency and the transfer time
HypoExponential used in software
rejuvenation models
• Preventive maintenance is useful only if failure rate is increasing
• A simple and useful model of increasing failure rate:
Robust state
Failure
probable
state
Failed state
Time to failure: Hypo-exponential distribution
Increasing failure rate
aging
Erlang Distribution
• Special case of HypoExp: All stages have same rate.
• [X > t] = [Nt < r] (Nt : no. of stresses applied in (0,t])
and Nt is Possion (param λt). This interpretation gives,
Erlang Distribution
• Is used to approximate the deterministic one
since if you keep the same mean but increase
the number of stages, the pdf approaches the
delta function in the limit
• Can also be used to approximate the uniform
distribution
probability density functions (pdf)
If we vary r keeping r/ constant, pdf of r-stage Erlang approaches
an impulse function at r/ .
cumulative distribution functions (cdf)
And the cdf approaches a step function at r/. In other words
r-stage Erlang can approximate a deterministic variable.
1.8
Comparison of
probability density functions (pdf)
1.6
1.4
1.2
1
pdf
3-stage Erlang pdf
U(0,1) pdf
0.8
0.6
0.4
0.2
time
0.
9
0.
96
1.
02
0.
6
0.
66
0.
72
0.
78
0.
84
0.
3
0.
36
0.
42
0.
48
0.
54
0.
06
0.
12
0.
18
0.
24
0
0
T=1
1.2
Comparison of cumulative distribution
functions (cdf)
1
3-stage Erlang cdf
0.6
U(0,1) cdf
0.4
0.2
0.
06
0.
12
0.
18
0.
24
0.
3
0.
36
0.
42
0.
48
0.
54
0.
6
0.
66
0.
72
0.
78
0.
84
0.
9
0.
96
1.
02
0
0
cdf
0.8
time
T=1
Gamma Random Variable
• Gamma density function is,
• Gamma distribution can capture all three failure modes,
viz. DFR, CFR and IFR.
– α = 1: CFR
– α <1 : DFR
– α >1 : IFR
• Gamma with α = ½ and  = n/2 is known as the chisquare random variable with n degrees of freedom
HyperExponential Distribution
• Hypo or Erlang  Sequential Exp( ) stages.
• Alternate Exp( ) stages  HyperExponential.
• CPU service time may be modeled as HyperExp
• In workload based software rejuvenation model
we found the sojourn times in many workload
states have this distribution
Log-logistic Distribution
• Log-logistic can model DFR, CFR and IFR failure models
simultaneously, unlike previous ones.
• For, κ > 1, the failure rate first increases with t (IFR); after
momentarily leveling off (CFR), it decreases (DFR) with
time. This is known as the inverse bath tub shape curve
• Use in modeling software reliability growth
Hazard rate comparison
Defective Distribution
• If
• Example:
• This defect (also known as the mass at infinity) could
represent the probability that the program will not terminate
(1-c). Continuous part can model completion time of
program.
• There can also be a mass at origin.
Pareto Random Variable
• Also known as the power law or long-tailed
distribution
• Found to be useful in modeling
–
–
–
–
CPU time consumed by a request
Webfile sizes
Number of data bytes in FTP bursts
Thinking time of a Web browser
Gaussian (Normal) Distribution
• Bell shaped pdf – intuitively pleasing!
• Central Limit Theorem: mean of a large number of
mutually independent rv’s (having arbitrary
distributions) starts following Normal distribution
as n 
• μ: mean, σ: std. deviation, σ2: variance (N(μ, σ2))
• μ and σ completely describe the statistics. This is
significant in statistical estimation/signal
processing/communication theory etc.
Normal Distribution (contd.)
• N(0,1) is called normalized Guassian.
• N(0,1) is symmetric i.e.
– f(x)=f(-x)
– F(z) = 1-F(z).
• Failure rate h(t) follows IFR behavior.
– Hence, N( ) is suitable for modeling long-term wear or
aging related failure phenomena.
Functions of Random Variables
• Often, rv’s need to be transformed/operated upon.
– Y = Φ (X) : so, what is the density of Y ?
– Example: Y = X2
– If X is N(0,1), then,
– Above Y is also known as the χ2 distribution (with 1degree of freedom).
Functions of RV’s (contd.)
• If X is uniformly distributed, then, Y= -λ-1ln(1-X)
follows Exp(λ) distribution
– transformations may be used to generate random
variates (or deviates) with desired distributions.
Functions of RV’s (contd.)
• Given,
• A monotone differentiable function,
•
• Above method suggests a way to get the random variates with desired
distribution.
– Choose Φ to be F.
– Since, Y=F(X), FY(y) = y and Y is U(0,1).
– To generate a random variate with X having desired distribution, generate
U(0,1) random variable Y, then transform y to x= F-1(y) .
– This inversion can be done in closed-form, graphically or using a table.
Jointly Distributed RVs
•
• Joint Distribution Function:
• Independent rv’s: iff the following holds:
Joint Distribution Properties
Joint Distribution Properties (contd)
Order statistics: kofn, TMR
Order Statistics: KofN
X1 ,X2 ,..., Xn iid (independent and identically distributed)
random variables with a common distribution function
F().
Let Y1 ,Y2 ,...,Yn be random variables obtained by
permuting the set X1 ,X2 ,..., Xn so as to be in
increasing order.
To be specific:
Y1 = min{X1 ,X2 ,..., Xn} and
Yn = max{X1 ,X2 ,..., Xn}
Order Statistics: KofN
(Continued)
• The random variable Yk is called the k-th ORDER
STATISTIC.
• If Xi is the lifetime of the i-th component in a system of n
components. Then:
– Y1 will be the overall series system lifetime.
– Yn will denote the lifetime of a parallel system.
– Yn-k+1 will be the lifetime of an k-out-of-n system.
Order Statistics: KofN (Continued)
To derive the distribution function of Yk, we note that the
probability that exactly j of the Xi's lie in (- ,y] and (n-j)
lie in (y, ) is:
n j
n j
F
(
y
)
[
1

F
(
y
)]
 
 j
hence
n j
n j
FYk ( y )     F ( y ) [1  F ( y )]
j k  j 
n
Applications of order statistics
• Reliability of a k out of n system
n
Rkofn (t )   ( nj )[ R(t )] j [1  R(t )]n  j
j k
• Series system:
Rseries (t )  [ R(t )]
n
n
or  Ri (t )
i 1
• Parallel system: R parallel(t )  1  [1  R(t )]n
• Minimum of n EXP random variables is special case
of Y1 = min{X1,…,Xn} where Xi~EXP(i)
Y1~EXP( i)
• This is not true (that is EXP dist.) for the parallel case
Triple Modular Redundancy (TMR)
R(t)
R(t)
Voter
R(t)
• An interesting case of order statistics occurs when
we consider the Triple Modular Redundant (TMR)
system (n = 3 and k = 2). Y2 then denotes the time
until the second component fails. We get:
RTMR (t )  3R (t )  2R (t )
2
3
TMR (Continued)
• Assuming that the reliability of a single
component is given by,
e
 t
we get:
RTMR (t )  3e
2 t
 2e
3t
TMR
(Continued)
• In the following figure, we have plotted
RTMR(t) vs t as well as R(t) vs t.
TMR
(Continued)
Cold standby (dynamic redundancy)
X
Y
Lifetime of Lifetime of
Spare
Active
EXP()
EXP()
Total lifetime 2-Stage Erlang
R(t )  (1  t )e
 t
EXP()
Assumptions: Detection & Switching perfect; spare
does not fail
EXP()
Sum of RVs: Standby Redundancy
• Two independent components, X and Y
– Series system (Z=min(X,Y))
– Parallel System (Z=max(X,Y))
– Cold standby: the life time Z=X+Y
Sum of Random Variables
• Z = Φ(X, Y)  ((X, Y) may not be independent)
• For the special case, Z = X + Y
• The resulting pdf is (assuming independence),
• Convolution integral (modify for the non-negative
case)
Convolution (non-negative case)
Z = X + Y, X & Y are independent random
variables (in this case, non-negative)
t
f Z (t )   f X ( x) fY (t  x) dx
0
• The above integral is often called the
convolution of fX and fY. Thus the density of
the sum of two non-negative independent,
continuous random variables is the
convolution of the individual densities.
Cold standby derivation
• X and Y are both EXP() and independent.
• Then
t
f t (t )   e e
 x
 ( t  x )
dx
0
2  t
 e
t
 dx
0
 t
  te , t  0
2
Cold standby derivation
(Continued)
• Z is two-stage Erlang Distributed
t
FZ (t )   f Z ( z )dz  1  (1  t )e
0
R(t )  1  F (t )
 t
 (1  t )e , t  0
 t
Convolution: Erlang
Distribution
• The general case of r-stage Erlang Distribution
• When r sequential phases have independent
identical exponential distributions, then the
resulting density is known as r-stage (or r-phase)
Erlang and is given by:
Convolution: Erlang
EXP()
EXP()
(Continued)
EXP()
r r 1  t
t e
f (t ) 
(r  1)!
(  t ) k  t
F (t )  1  
e
k!
k 0
r 1
Warm standby
•With Warm spare, we have:
•Active unit time-to-failure: EXP()
•Spare unit time-to-failure: EXP()
EXP(+ )
EXP()
2-stage hypoexponential distribution
Warm standby derivation
• First event to occur is that either the active or the spare wil
fail. Time to this event is min{EXP(),EXP()} which is
EXP( + ).
• Then due to the memoryless property of the exponential,
remaining time is still EXP().
• Hence system lifetime has a two-stage hypoexponential
distribution with parameters
1 =  +  and 2 =  .
Warm standby derivation
(Continued)
• X is EXP(1) and Y is EXP(2) and are
independent
1 = 2
• Then fZ(t) is
t
f Z (t )   1e
 1 x
2e
 2 ( t  x )
dx
0
12  t 12  t

e 
e
1  2
2  1
2
1
Hot standby
•With hot spare, we have:
•Active unit time-to-failure: EXP()
•Spare unit time-to-failure: EXP()
EXP(2)
EXP()
2-stage hypoexponential
TMR and TMR/simplex
as hypoexponentials
TMR/Simplex
EXP(3)
EXP()
TMR
EXP(3)
EXP(2)
Hypoexponential: general case
r
• Z=
 X , where X
i 1
i
1
,X2 , … , Xr
are mutually independent and Xi is exponentially
distributed with parameter i
(i = j for i = j).
Then Z is a r-stage hypoexponentially
distributed random variable.
EXP(1)
EXP(2)
EXP(r)
Hypoexponential: general case
KofN system lifetime as a
hypoexponential
At least, k out of n units should be operational for the
system to be Up.
EXP(n)
Y1
EXP((n-1))
Y2
...
EXP(k)
Yn-k+1
EXP((k-1))
Yn-k+2
...
EXP()
Yn
KofN with warm spares
At least, k out of n + s units should be operational
for the system to be Up. Initially n units are active
and s units are warm spares.
EXP(n
s)
EXP(n
+(s-1) )
...
EXP(n
+ )
EXP(n)
...
EXP(k)
Sum of Normal Random Variables
• X1, X2, .., Xk are normal ‘iid’ rv’s, then, the rv
Z = (X1+ X2+ ..+Xk) is also normal with,
• X1, X2, .., Xk are normal. Then,
follows Gamma
or the χ2 (with n-degrees
of freedom) distribution