Probability Theory Refresher
K.S. Trivedi
A Tool for Modeling Random Phenomena

Random Phenomena in a Computer Environment
• Arrival of jobs.
• Execution time of jobs.
• Memory requirement of jobs.
• Status of computer resources (failed, non-failed).

How to Quantify Randomness?
• Use probabilistic models.
How to Estimate these Quantifiers?
• Use statistical techniques.

Modeling Random Phenomena
[Figure: Measurement Data → Statistical Analysis → Input Parameters → Probability Model (PM) → Outputs, with validation of both inputs and outputs.]

Components of a Probability Model
1. Sample Space (S): the set of all possible observable "states" of a random phenomenon.
2. Set of Events (F): the set of all events of interest.
3. Probability of Events (P): a consistent description of the likelihood of observing an event.
PM = (S, F, P).

Example
Random phenomenon: the outcome of the test of the condition in an if statement:
  if B then T; else E;
S = {T, E}
F = {∅, {E}, {T}, {T, E}}
P = {0, ½, ½, 1}

Terminology and Definitions
• Union of two events E and F (E or F): E ∪ F.
• Intersection of E and F (E and F): E ∩ F, also written EF.
• Complement (not E): Ē = S \ E.
• ⋃_{i=1}^{n} Ei = E1 ∪ E2 ∪ … ∪ En.
• ⋂_{i=1}^{n} Ei = E1 ∩ E2 ∩ … ∩ En = E1E2…En.
• E and F are mutually exclusive if EF = ∅.
• E1, E2, …, En are exhaustive if ⋃_{i=1}^{n} Ei = S.

Structure of the Set of Events (F)
• E ∈ F ⇒ Ē ∈ F.
• E, F ∈ F ⇒ E ∪ F ∈ F and E ∩ F ∈ F.

Structure of P
1. P(E) ≥ 0 for all E ∈ F.
2. P(S) = 1.
3. If E and F are mutually exclusive, P(E ∪ F) = P(E) + P(F).

Consequences
1. P(∅) = 0.
2. P(Ē) = 1 − P(E).
3. If E1, E2, …, En are mutually exclusive, P(⋃_{i=1}^{n} Ei) = ∑_{i=1}^{n} P(Ei).
4. P(E ∪ F) = P(E) + P(F) − P(EF).
5. P(⋃_{i=1}^{n} Ei) = ∑_i P(Ei) − ∑_{i<j} P(Ei Ej) + ∑_{i<j<k} P(Ei Ej Ek) − … + (−1)^(n+1) P(E1E2…En).
The last identity is the principle of inclusion and exclusion.

Conditional Probability
For A, B ∈ F with P(A) > 0, P(B|A) is the conditional probability that event B takes place given that event A has taken place:
  P(B|A) = P(B ∩ A) / P(A).
Note 1: If P(A) = 0, P(B|A) can be defined to be any number.
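The axioms and the conditional-probability definition above can be checked by brute force on a small finite sample space. Here is a minimal sketch (the two-dice example is my own illustration, not from the slides) verifying inclusion–exclusion (consequence 4) and P(B|A) = P(B ∩ A)/P(A) with exact rational arithmetic:

```python
from fractions import Fraction

# Sample space: ordered outcomes of rolling two fair dice (equally likely points).
S = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def prob(event):
    """P(event) = (favorable outcomes) / |S| under equally likely outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

E = lambda s: s[0] == 6          # first die shows 6
F = lambda s: s[0] + s[1] >= 10  # sum is at least 10

P_E, P_F = prob(E), prob(F)
P_EF = prob(lambda s: E(s) and F(s))
P_union = prob(lambda s: E(s) or F(s))

# Consequence 4: P(E u F) = P(E) + P(F) - P(EF)
assert P_union == P_E + P_F - P_EF

# Conditional probability: P(F|E) = P(E n F) / P(E)
P_F_given_E = P_EF / P_E
# Given a 6 on the first die, the sum reaches 10 iff the second die shows 4, 5, or 6,
# so P(F|E) = 3/6 = 1/2.
assert P_F_given_E == Fraction(1, 2)
```

Using Fraction keeps every probability exact, so the identities hold with `==` rather than a floating-point tolerance.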
Note 2: There is no implicit concept of chronological order between A and B.

Independence
A and B are independent (assuming P(A) ≠ 0, P(B) ≠ 0) iff any of the following equivalent conditions holds:
• P(B|A) = P(B)
• P(A|B) = P(A)
• P(A ∩ B) = P(A) P(B)

Pairwise Independence (P.I.)
A1, A2, …, An are pairwise independent if Ai and Aj are independent for all i ≠ j.

Mutual Independence (M.I.)
A1, A2, …, An are mutually independent if, for any k indices i1, i2, …, ik,
  P(Ai1 ∩ Ai2 ∩ … ∩ Aik) = P(Ai1) P(Ai2) … P(Aik).
Note: P.I. does not imply M.I.

Series System
[Figure: components 1, 2, …, n connected in series.]
The system is operative if every component is operative.
• Ai = component i is operative.
• A = the system is operative.
• Assume A1, …, An are mutually independent.
P(A) = P(A1 ∩ A2 ∩ … ∩ An) = P(A1) P(A2) … P(An)
Note: P(A) ≤ min {P(A1), …, P(An)}, i.e., the system is weaker than its weakest link.

Parallel System
[Figure: components 1, 2, …, n connected in parallel.]
• The system is operative if at least one component is operative.
• Ai = component i is operative; A = the system is operative.
• A1, A2, …, An are mutually independent.
P(Ā) = P(Ā1 ∩ Ā2 ∩ … ∩ Ān) = P(Ā1) P(Ā2) … P(Ān) = (1 − P(A1)) (1 − P(A2)) … (1 − P(An))
∴ P(A) = 1 − ∏_{i=1}^{n} (1 − P(Ai))

Series–Parallel System
[Figure: two paths from a to b; components 1 and 4 in series on one path, components 3 and 5 in series on the other.]
• The system operates if there is at least one path of operating components from a to b.
Ri = P{component i operates}
R = P{system operates} = 1 − (1 − R1R4)(1 − R3R5)

Law of Total Probability
• B1, B2, …, Bn are mutually exclusive and exhaustive events.
• P(A) = ∑_{i=1}^{n} P(A|Bi) P(Bi)

Bayes' Rule
• Given: P(Bi), P(A|Bi), i = 1, 2, …, n.
• Want: P(Bi|A).
P(Bi|A) = P(A|Bi) P(Bi) / ∑_{j=1}^{n} P(A|Bj) P(Bj)

Example of Law of Total Probability
[Figure: bridge network from a to b; components 1 and 4 on the top path, components 3 and 5 on the bottom path, component 2 bridging the two paths.]
P(A) = P(A|A2) P(A2) + P(A|Ā2) P(Ā2)
P(A|A2) = 1 − (1 − R4)(1 − R5)
P(A|Ā2) = 1 − (1 − R1R4)(1 − R3R5)
P(A2) = R2, P(Ā2) = 1 − R2
∴ P(A) = [1 − (1 − R4)(1 − R5)] R2 + [1 − (1 − R1R4)(1 − R3R5)] (1 − R2)
       = 1 − R2(1 − R4)(1 − R5) − (1 − R2)(1 − R1R4)(1 − R3R5)

Example of Bayes' Rule
[Channel diagram of an unreliable binary transmission channel: T0 → R0 with probability P(R0|T0), T0 → R1 with P(R1|T0), T1 → R0 with P(R0|T1), T1 → R1 with P(R1|T1).]
T0 = 0 is transmitted. T1 = 1 is transmitted.
R0 = 0 is received. R1 = 1 is received.
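The conditioning trick for the bridge network can be cross-checked by exhaustively enumerating all 2^5 component states. A minimal sketch follows; the reliability values are illustrative (not from the slides), and the path structure is inferred from the conditional formulas above (with component 2 up, it suffices that 4 or 5 works; with 2 down, the paths are 1-4 and 3-5):

```python
from itertools import product

# Illustrative component reliabilities (assumed values, not from the text).
R1, R2, R3, R4, R5 = 0.9, 0.8, 0.9, 0.95, 0.95

def works(u1, u2, u3, u4, u5):
    """True if some path of operating components joins a to b."""
    return (u2 and (u4 or u5)) or (u1 and u4) or (u3 and u5)

# Brute force: sum P(state) over all 2^5 component states where the system works,
# using mutual independence of the components.
R_enum = 0.0
for state in product([True, False], repeat=5):
    p = 1.0
    for up, r in zip(state, (R1, R2, R3, R4, R5)):
        p *= r if up else 1 - r
    if works(*state):
        R_enum += p

# Conditioning on component 2 (law of total probability), as in the text:
R_cond = (1 - (1 - R4) * (1 - R5)) * R2 \
       + (1 - (1 - R1 * R4) * (1 - R3 * R5)) * (1 - R2)

assert abs(R_enum - R_cond) < 1e-12
```

Enumeration is exponential in the number of components, so the conditioning (factoring) approach is what scales; the enumeration here only serves as an independent check.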
Example of Bayes' Rule (cont.)
Data:
  P(To) = 0.45, P(Ro|To) = 0.94, P(R1|T1) = 0.91
Compute:
  P(Ro), P(R1), P(T1|R1), P(To|Ro), P(error)
Solution:
  P(Ro) = P(Ro|To) P(To) + P(Ro|T1) P(T1) = 0.4725
  P(R1) = 1 − P(Ro) = 0.5275
  P(T1|R1) = P(R1|T1) P(T1) / [P(R1|To) P(To) + P(R1|T1) P(T1)] = 0.9488
  P(To|Ro) = 0.8952
  P(error) = P(T1 ∩ Ro) + P(To ∩ R1) = P(Ro|T1) P(T1) + P(R1|To) P(To) = 0.0765

Discrete Random Variables
A random variable is a function that assigns a numerical value to each sample point.
Example: execute an if statement twice.
  S = {TT, TE, ET, EE}
  X = number of then clauses in the two executions.
  X(TT) = 2, X(TE) = X(ET) = 1, X(EE) = 0.
A random variable is called discrete if it can take only countably many values. Generally, X takes values in A ⊂ {…, −3, −2, −1, 0, 1, 2, 3, …}.

How to Describe a Random Variable?
By giving its probability mass function (pmf):
  pX(x) = P(X = x), x ∈ A.
Obviously:
  pX(x) ≥ 0 and ∑_{x∈A} pX(x) = 1.
[Figure: pmf of a fair die; pX(x) = 1/6 for x = 1, 2, …, 6.]

Cumulative Distribution Function
  F(t) = P(X ≤ t) = ∑_{x∈A: x≤t} pX(x)
[Figure: CDF of a fair die; a staircase rising by 1/6 at each of t = 1, 2, …, 6.]

Examples of Discrete R.V.
1. Bernoulli
  A = {0, 1}; parameter p, 0 < p < 1.
  pX(0) = 1 − p = q, pX(1) = p.

2. Binomial
  A = {0, 1, …, n}; parameters n, p with n ≥ 0, 0 < p < 1.
  pX(k) = C(n, k) p^k (1 − p)^(n−k),
where C(n, k) = n! / (k!(n − k)!) is the number of ways of picking k items out of n distinct items without regard to order.

3. Geometric
  A = {1, 2, 3, …}; parameter p, 0 < p < 1.
  pX(k) = (1 − p)^(k−1) p.
EX: Repeat S until B, with P(B = true) = p; X is the number of executions of the repeat loop.

Modified Geometric
  A = {0, 1, 2, …}
  pX(k) = (1 − p)^k p.
EX: While B do S; X is the number of executions of the while loop.

4. Negative Binomial or Pascal
  A = {r, r + 1, r + 2, …}; parameters r ≥ 1 and p, 0 < p < 1.
  pX(k) = C(k − 1, r − 1) p^r (1 − p)^(k−r).

5. Poisson
  A = {0, 1, 2, …}; parameter λ > 0.
  pX(k) = e^(−λ) λ^k / k!,
where e = base of natural logarithms = 2.718…
EX:
• Number of jobs arriving per unit time.
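The channel computation above is a direct application of the law of total probability and Bayes' rule, and is easy to reproduce numerically. A short sketch using only the data given on the slide:

```python
# Data from the slide.
P_T0 = 0.45              # P(To): a 0 is transmitted
P_T1 = 1 - P_T0          # P(T1)
P_R0_T0 = 0.94           # P(Ro|To)
P_R1_T1 = 0.91           # P(R1|T1)
P_R1_T0 = 1 - P_R0_T0    # P(R1|To): 0 sent, 1 received
P_R0_T1 = 1 - P_R1_T1    # P(Ro|T1): 1 sent, 0 received

# Law of total probability.
P_R0 = P_R0_T0 * P_T0 + P_R0_T1 * P_T1
P_R1 = 1 - P_R0

# Bayes' rule.
P_T1_R1 = P_R1_T1 * P_T1 / P_R1
P_T0_R0 = P_R0_T0 * P_T0 / P_R0

# Error: transmitted and received bits disagree.
P_err = P_R0_T1 * P_T1 + P_R1_T0 * P_T0

# These reproduce the slide's results.
assert abs(P_R0 - 0.4725) < 1e-9
assert abs(P_T1_R1 - 0.9488) < 5e-5
assert abs(P_T0_R0 - 0.8952) < 5e-5
assert abs(P_err - 0.0765) < 1e-9
```

Note that P(T1|R1) and P(To|Ro) both exceed the corresponding channel accuracies because the posterior also weighs in the prior P(To), P(T1).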
• Number of component failures per unit time.

6. Uniform
  A = {1, 2, …, n}; parameter n.
  pX(k) = 1/n.

Cumulative Distribution Function (Continuous Case)
  FX(x) = P(X ≤ x), −∞ < x < ∞.
Probability density function:
  fX(x) = d/dx FX(x), so FX(x) = ∫_{−∞}^{x} fX(u) du.
Properties:
1. fX(x) ≥ 0.
2. ∫_{−∞}^{∞} fX(x) dx = 1.
3. lim_{x→∞} FX(x) = 1 (non-defective distributions).¹
4. lim_{x→−∞} FX(x) = 0.
5. FX(x) is a nondecreasing function of x.
6. P(a < X ≤ b) = F(b) − F(a).
¹ For a defective distribution, lim_{x→∞} FX(x) < 1. SHARPE allows defective distributions.

Continuous Random Variables
A random variable is called continuous if it can take uncountably many values. Generally, continuous random variables take values in
  R = (−∞, ∞), R+ = [0, ∞), or an interval I = [a, b], [a, b), (a, b], or (a, b).
EX: Execution time of a program ∈ [0, ∞); time to failure of a component ∈ [0, ∞).

Examples of Continuous Random Variables
1. Uniform (X is U(a, b))
  A = [a, b]; parameters a, b with −∞ < a < b < ∞.
  fX(x) = 1/(b − a) if a ≤ x ≤ b, and 0 otherwise.

2. Exponential (X is EXP(λ))
  A = [0, ∞); parameter λ > 0.
  fX(x) = λ e^(−λx) for x ≥ 0, and 0 for x < 0.
  FX(x) = 1 − e^(−λx) for x ≥ 0, and 0 for x < 0.
EX: Time between two successive job arrivals; service time of a job; time between two component failures.

3. Erlang
  A = [0, ∞); parameters λ > 0 and k = 1, 2, 3, …
  fX(x) = λ^k x^(k−1) e^(−λx) / (k − 1)! for x ≥ 0, and 0 for x < 0.
  FX(x) = 1 − e^(−λx) ∑_{r=0}^{k−1} (λx)^r / r! for x ≥ 0, and 0 for x < 0.
The Erlang is the sum of k independent, identically distributed EXP(λ) random variables.

4. Hyperexponential — mixture of n EXP random variables
  A = [0, ∞); parameters n = 1, 2, …; α1, α2, …, αn with αi ≥ 0, ∑αi = 1; λ1, λ2, …, λn with λi ≥ 0.
  fX(x) = ∑_{i=1}^{n} αi λi e^(−λi x) for x ≥ 0, and 0 for x < 0.
  FX(x) = 1 − ∑_{i=1}^{n} αi e^(−λi x) for x ≥ 0, and 0 for x < 0.

5. Weibull — time-to-failure distribution.
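The characterization of the Erlang as a sum of k i.i.d. EXP(λ) variables can be checked by simulation against the closed-form CDF above. A small sketch (the parameter values λ = 2, k = 3 and the evaluation point are my own illustrative choices):

```python
import math
import random

random.seed(1)  # fixed seed for reproducibility

lam, k = 2.0, 3    # Erlang with k = 3 stages, rate lambda = 2
t = 1.5            # point at which to compare the two CDFs

# Closed-form Erlang CDF from the text:
# F(t) = 1 - e^(-lam t) * sum_{r=0}^{k-1} (lam t)^r / r!
F_exact = 1 - math.exp(-lam * t) * sum(
    (lam * t) ** r / math.factorial(r) for r in range(k))

# Monte Carlo: empirical CDF of a sum of k independent EXP(lam) draws.
N = 200_000
hits = sum(1 for _ in range(N)
           if sum(random.expovariate(lam) for _ in range(k)) <= t)
F_mc = hits / N

# With N = 200,000 samples the standard error is about 0.001,
# so the two values should agree to within about 0.01.
assert abs(F_mc - F_exact) < 0.01
```

The same pattern (closed-form CDF vs. empirical CDF of a simulated construction) works for the hyperexponential, where each sample first picks branch i with probability αi and then draws from EXP(λi).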
  A = [0, ∞); parameters λ > 0 and α > 0.
  fX(x) = λα x^(α−1) e^(−λx^α) for x ≥ 0, and 0 for x < 0.
  FX(x) = 1 − e^(−λx^α) for x ≥ 0, and 0 for x < 0.

6. Normal or Gaussian
  A = (−∞, ∞); parameters μ and σ, σ > 0.
  fX(x) = (1 / (σ√(2π))) exp(−(x − μ)² / (2σ²)).

Functions of a Random Variable
If X is a random variable, so is Y = g(X). Some useful cases:
1. Y = X²: fY(y) = [fX(√y) + fX(−√y)] / (2√y) for y > 0.
2. X ~ N(0, 1) ⇒ Y = aX + b ~ N(b, a²), −∞ < a, b < ∞.
3. X ~ U(0, 1) ⇒ Y = aX + b ~ U(b, b + a).
4. X ~ EXP(λ) ⇒ Y = rX ~ EXP(λ/r).

Expectation of a Random Variable
  E(X) = ∑_i xi pX(xi) if X is discrete;
  E(X) = ∫_{−∞}^{∞} x fX(x) dx if X is continuous.
Alternate definition for a nonnegative r.v.:
  E(X) = ∫_0^∞ (1 − FX(x)) dx.

Moments of a R.V.
Let Y = g(X). Then
  E(Y) = ∑_i yi pY(yi) if Y is discrete, or ∫_{−∞}^{∞} y fY(y) dy if Y is continuous;
equivalently, in terms of the distribution of X,
  E(Y) = ∑_i g(xi) pX(xi) if X is discrete, or ∫_{−∞}^{∞} g(x) fX(x) dx if X is continuous.
Alternate method: if X is a nonnegative r.v. and g is nonnegative and increasing with g(0) = 0,
  E(Y) = ∫_0^∞ g′(x)(1 − FX(x)) dx.
• n-th moment of X = E(X^n).
• n-th central moment of X = E((X − E(X))^n).
• Variance of X = E((X − E(X))²).

Properties
  E(aX + b) = a E(X) + b
  Var(aX + b) = a² Var(X)
  Coefficient of variation: c(X) = √Var(X) / E(X).
√Var(X) is called the standard deviation of X, denoted σX.

Table of Moments: Discrete Distributions
  Bernoulli       parameter p        mean p           variance p(1 − p)
  Binomial        parameters n, p    mean np          variance np(1 − p)
  Geometric       parameter p        mean 1/p         variance q/p²
  Mod. Geometric  parameter p        mean q/p         variance q/p²
  Neg. Binomial   parameters r, p    mean r/p         variance rq/p²
  Poisson         parameter λ        mean λ           variance λ
  Uniform         parameter n        mean (n + 1)/2   variance (n² − 1)/12

Table of Moments: Continuous Distributions
  Uniform      parameters a, b       mean (a + b)/2               variance (b − a)²/12
  Exponential  parameter λ           mean 1/λ                     variance 1/λ²
  Erlang       parameters λ, k       mean k/λ                     variance k/λ²
  Hyper-Exp.   parameters n, αi, λi  mean ∑ αi/λi                 variance ∑ 2αi/λi² − (∑ αi/λi)²
  Weibull      parameters λ, α       mean Γ(1 + 1/α) / λ^(1/α)    variance [Γ(1 + 2/α) − Γ²(1 + 1/α)] / λ^(2/α)
  Normal       parameters μ, σ       mean μ                       variance σ²

Reliability Concepts
The lifetime of a component can be modeled as a random variable T.
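Both expectation formulas, and the table's entries for the exponential, can be confirmed by numerical integration with nothing but the standard library. A minimal sketch (λ = 0.5 is an illustrative choice; the midpoint-rule integrator is a stand-in for a proper quadrature routine):

```python
import math

lam = 0.5  # EXP(lambda) with lambda = 0.5; the table gives mean 1/lam = 2, variance 1/lam^2 = 4

F = lambda x: 1 - math.exp(-lam * x)   # CDF
f = lambda x: lam * math.exp(-lam * x) # density

def integrate(g, a, b, n=100_000):
    """Composite midpoint rule; accurate enough here for smooth integrands."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

upper = 60.0  # e^(-lam*60) ~ 1e-13, so truncating the infinite integral is harmless

mean_direct    = integrate(lambda x: x * f(x), 0.0, upper)   # E(X) = int x f(x) dx
mean_alternate = integrate(lambda x: 1 - F(x), 0.0, upper)   # E(X) = int (1 - F(x)) dx
second_moment  = integrate(lambda x: x * x * f(x), 0.0, upper)
variance = second_moment - mean_direct ** 2                  # Var = E(X^2) - E(X)^2

assert abs(mean_direct - 2.0) < 1e-3
assert abs(mean_alternate - 2.0) < 1e-3
assert abs(variance - 4.0) < 1e-2
```

The agreement of `mean_direct` and `mean_alternate` illustrates that the tail-integral form ∫(1 − F) dx is just the usual expectation specialized to a nonnegative random variable.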
• R(t) = reliability at time t = P(lifetime > t) = 1 − FT(t).
• Mean Time To Failure (MTTF)
  = E(T) = ∫_0^∞ t fT(t) dt = ∫_0^∞ (1 − FT(t)) dt = ∫_0^∞ R(t) dt.

Reliability Concepts (cont.)
• Remaining lifetime at t:
  P(T ≤ t + x | T > t) = [F(t + x) − F(t)] / R(t).
• Expected remaining lifetime at t:
  E(T | T > t) − t = ∫_0^∞ [1 − (F(t + x) − F(t)) / (1 − F(t))] dx.

Hazard Rate or Failure Rate
  Gt(x) = P(a component fails in (t, t + x] | it survived up to t)
        = P(t < T ≤ t + x | T > t)
        = [F(t + x) − F(t)] / (1 − F(t))
        = [R(t) − R(t + x)] / R(t).

Instantaneous Failure Rate
  h(t) = f(t) / R(t) = −(1 / R(t)) dR(t)/dt.
  h(t)Δt ≈ probability that a component surviving up to t fails during (t, t + Δt].

Cumulative Hazard
  H(t) = ∫_0^t h(x) dx.

Some Important Relations
  R(t) = 1 − F(t)
  R(t) = exp(−H(t))

Exponential Distribution
T ~ EXP(λ):
  R(t) = e^(−λt), MTTF = 1/λ.
Remaining life: P(T > t + x | T > t) = e^(−λx) (memoryless property), so E(T | T > t) − t = 1/λ.
  h(t) = λ (constant failure rate), H(t) = λt.

Weibull Distribution
T ~ WEI(λ, α):
  R(t) = e^(−λt^α), MTTF = Γ(1 + 1/α) / λ^(1/α).
  P(T > t + x | T > t) = e^(−λ(t+x)^α) / e^(−λt^α).
  h(t) = λα t^(α−1), H(t) = λt^α.
Often used in practice to allow for age-dependent failure rates.
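The relations R(t) = exp(−H(t)) and MTTF = ∫R(t) dt can be verified numerically for the Weibull. A minimal sketch (λ = 1.5, α = 2 are illustrative values; α > 1 gives an increasing failure rate):

```python
import math

lam, alpha = 1.5, 2.0   # illustrative Weibull parameters (not from the text)

R = lambda t: math.exp(-lam * t ** alpha)      # reliability R(t) = e^(-lam t^alpha)
h = lambda t: lam * alpha * t ** (alpha - 1)   # hazard rate h(t) = lam alpha t^(alpha-1)
H = lambda t: lam * t ** alpha                 # cumulative hazard H(t) = lam t^alpha

# Check H(t) = integral_0^t h(x) dx and R(t) = exp(-H(t)) at t = 2 (midpoint rule).
t, n = 2.0, 100_000
dt = t / n
H_num = sum(h((i + 0.5) * dt) for i in range(n)) * dt
assert abs(H_num - H(t)) < 1e-6
assert abs(R(t) - math.exp(-H_num)) < 1e-6

# alpha > 1 means an age-dependent, increasing failure rate (wear-out):
assert h(2.0) > h(1.0)

# MTTF = integral_0^inf R(t) dt vs. the closed form Gamma(1 + 1/alpha) / lam^(1/alpha).
upper, m = 10.0, 200_000   # R(10) is astronomically small, so truncation is harmless
du = upper / m
mttf_num = sum(R((i + 0.5) * du) for i in range(m)) * du
mttf_exact = math.gamma(1 + 1 / alpha) / lam ** (1 / alpha)
assert abs(mttf_num - mttf_exact) < 1e-5
```

Setting α = 1 collapses every formula here to the exponential case with its constant hazard h(t) = λ, which is one way to see why the Weibull is the standard generalization for age-dependent failure rates.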