Chapter 4 Continuous Random Variables and Probability Distributions

4.1 - Probability Density Functions
4.2 - Cumulative Distribution Functions and Expected Values
4.3 - The Normal Distribution
4.4 - The Exponential and Gamma Distributions
4.5 - Other Continuous Distributions
4.6 - Probability Plots

Poisson Distribution (discrete)

For x = 0, 1, 2, …, this calculates P(x Events) in a random sample of n trials coming from a population in which the Event is rare. But it may also be used to calculate P(x Events) within a random interval of T time units, for a “Poisson process” having a known “Poisson rate” α.

Example: X = # “clicks” on a Geiger counter in normal background radiation, counted over the interval from 0 to T.

Now consider instead X = time between “clicks” on a Geiger counter in normal background radiation, or more generally the time between failures, deaths, births, etc.

• “Time-to-Event Analysis”
• “Time-to-Failure Analysis”
• “Reliability Analysis”
• “Survival Analysis”

Time between events is often modeled by the Exponential Distribution (continuous).

X ~ Exp(β), 1 parameter β > 0, where X = Time between events.

pdf:

$$f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

Check pdf? That $f(x) \ge 0$ is clear. Does $\int_{-\infty}^{\infty} f(x)\,dx = 1$? Let $y = x/\beta$, then $dy = dx/\beta$, so

$$\int_0^\infty \frac{1}{\beta}\, e^{-x/\beta}\,dx = \int_0^\infty e^{-y}\,dy = \lim_{c \to \infty} \left[-e^{-y}\right]_0^c = \lim_{c \to \infty} \left(1 - e^{-c}\right) = 1.$$
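The normalization check for the Exponential pdf can also be confirmed numerically. The following is a minimal Python sketch, assuming the scale parameterization f(x) = (1/β)e^(-x/β); the helper names `exp_pdf` and `midpoint_integral` and the value β = 2 are illustrative choices, not part of the slides.

```python
import math

def exp_pdf(x, beta):
    """Exponential pdf with mean (scale) beta: (1/beta) * exp(-x/beta) for x >= 0."""
    return math.exp(-x / beta) / beta if x >= 0 else 0.0

def midpoint_integral(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

beta = 2.0  # illustrative value (assumption)
# The improper integral is truncated at c = 50*beta; the neglected tail is e^{-50}.
total = midpoint_integral(lambda x: exp_pdf(x, beta), 0.0, 50 * beta)
print(round(total, 6))
```

Truncating at a large finite c mirrors the limit c → ∞ taken in the analytic check.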
Calculate the expected time between events

$$E[X] = \int_{-\infty}^{\infty} x\, f(x)\,dx = \int_0^\infty x\, \frac{1}{\beta}\, e^{-x/\beta}\,dx$$

Integration by Parts: $\int u\,dv = uv - \int v\,du$, with $u = x$, $dv = \frac{1}{\beta} e^{-x/\beta}\,dx$, so $du = dx$, $v = -e^{-x/\beta}$.

$$E[X] = \left[-x\, e^{-x/\beta}\right]_0^\infty + \int_0^\infty e^{-x/\beta}\,dx = \lim_{c \to \infty}\left(-c\, e^{-c/\beta}\right) - 0 + \beta = \beta$$

Mean: $\mu = \beta$

Similarly for the variance…

$$\sigma^2 = E\left[(X - \mu)^2\right] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = E[X^2] - \mu^2$$

$$E[X^2] = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^\infty x^2\, \frac{1}{\beta}\, e^{-x/\beta}\,dx = 2\beta^2 \quad \text{(Integration by Parts, etc.)}$$

Variance: $\sigma^2 = 2\beta^2 - \beta^2 = \beta^2$

Determine the cdf $F(x) = P(X \le x)$

$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^x \frac{1}{\beta}\, e^{-t/\beta}\,dt = \left[-e^{-t/\beta}\right]_0^x = 1 - e^{-x/\beta}, \quad x \ge 0$$

cdf:

$$F(x) = \begin{cases} 1 - e^{-x/\beta}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

Note: $F(0) = 0$, $\lim_{x \to \infty} F(x) = 1$.

Note: $P(X > x) = 1 - F(x) = e^{-x/\beta}$ is the “Reliability Function” R(t), or “Survival Function” S(t).

Example: Suppose mean time between events is known to be β = 2 years. Then for $x \ge 0$,

$$F(x) = P(X \le x) = 1 - e^{-x/2}.$$

Calculate $P(X \le 3 \text{ years})$.

$$F(3) = P(X \le 3) = 1 - e^{-3/2} = 0.77687$$
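The mean, variance, and cdf results for the Exponential distribution can be spot-checked numerically. This is a rough sketch under the same scale parameterization, using β = 2 years from the example; the function names are hypothetical helpers, and the integrals are truncated at 50β as an approximation to the improper integrals.

```python
import math

def exp_pdf(x, beta):
    """Exponential pdf with mean beta."""
    return math.exp(-x / beta) / beta if x >= 0 else 0.0

def exp_cdf(x, beta):
    """F(x) = 1 - exp(-x/beta) for x >= 0, else 0."""
    return 1.0 - math.exp(-x / beta) if x >= 0 else 0.0

def midpoint_integral(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

beta = 2.0          # mean time between events, in years (from the example)
upper = 50 * beta   # truncation point for the improper integrals (assumption)

mean = midpoint_integral(lambda x: x * exp_pdf(x, beta), 0.0, upper)
second_moment = midpoint_integral(lambda x: x * x * exp_pdf(x, beta), 0.0, upper)
variance = second_moment - mean ** 2   # E[X^2] - mu^2
p_within_3 = exp_cdf(3.0, beta)        # P(X <= 3)

print(round(mean, 4), round(variance, 4), f"{p_within_3:.5f}")
# approximately: 2.0 4.0 0.77687
```

The numerical mean and variance agree with β and β², and the cdf value reproduces the 0.77687 stated in the example.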
Connection with the Poisson Distribution (discrete)

Recall that the Poisson distribution may be used to calculate P(x Events) within a random interval of T time units, for a “Poisson process” having a known “Poisson rate” α. The mean number of events during this time interval (0, T) is αT. Therefore, the mean number of events in one unit of time is α.

However, the mean time between events was just shown to be β. Connection? $\beta = 1/\alpha$.

Ex: Suppose the mean number of instantaneous clicks/sec is α = 10, then the mean time between any two successive clicks is β = 1/10 sec.

Example (mean time between events β = 2 years): Calculate the “Poisson rate” α.

$$\alpha = \frac{1}{\beta} = \frac{1 \text{ event}}{2 \text{ years}} = 0.5 \text{ events/yr}$$

Another property … (Event = “Failure,” etc.)

$$F(x) = P(X \le x) = 1 - e^{-x/\beta}$$

What is the probability of “No Failure” up to $t + \Delta t$, given “No Failure” up to $t$?

$$P(X > t + \Delta t \mid X > t) = \frac{P(X > t + \Delta t \,\cap\, X > t)}{P(X > t)} = \frac{1 - F(t + \Delta t)}{1 - F(t)} = \frac{e^{-(t + \Delta t)/\beta}}{e^{-t/\beta}} = e^{-\Delta t/\beta}$$

This is independent of time $t$; it only depends on $\Delta t$.

“Memory-less” property of the Exponential distribution: The conditional probability of “no failure” from ANY time $t$ to a future time $t + \Delta t$ of fixed duration $\Delta t$ remains constant. Models many systems in the “prime of their lives,” e.g., a random 30-yr old individual in the USA.

More general models exist…, e.g., the Gamma distribution. In order to understand this, it is first necessary to understand the “Gamma Function.”

Def: For any α > 0,

$$\Gamma(\alpha) = \int_0^\infty x^{\alpha - 1} e^{-x}\,dx$$

• Discovered by Swiss mathematician Leonhard Euler (1707-1783) in a different form.
• “Special Functions of Mathematical Physics” includes Gamma, Beta, Bessel, classical orthogonal polynomials (Jacobi, Chebyshev, Legendre, Hermite,…), etc.
• Generalization of “factorials” to all complex values of α (except 0, -1, -2, -3, …).
• The Exponential distribution is a special case of the Gamma distribution!

Basic Properties:

$\Gamma(1) = 1$

Proof: $\Gamma(1) = \int_0^\infty e^{-x}\,dx = \lim_{c \to \infty} \left[-e^{-x}\right]_0^c = \lim_{c \to \infty}\left(1 - e^{-c}\right) = 1$

$\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha)$

Proof: Integration by Parts, $\int u\,dv = uv - \int v\,du$, with $u = x^{\alpha}$, $dv = e^{-x}\,dx$, so $du = \alpha x^{\alpha - 1}\,dx$, $v = -e^{-x}$:

$$\Gamma(\alpha + 1) = \int_0^\infty x^{\alpha} e^{-x}\,dx = \left[-x^{\alpha} e^{-x}\right]_0^\infty + \alpha \int_0^\infty x^{\alpha - 1} e^{-x}\,dx = 0 + \alpha\,\Gamma(\alpha)$$

Let α = n = 1, 2, 3, … Then $\Gamma(n + 1) = n\,\Gamma(n)$, and so $\Gamma(n + 1) = n!$

$\Gamma(1) = 0! = 1$, $\Gamma(2) = 1! = 1$, $\Gamma(3) = 2! = 2$, $\Gamma(4) = 3! = 6$, $\Gamma(5) = 4! = 24$

Gamma Distribution

X ~ Gamma(α, β), parameters α, β > 0, where α = “shape parameter” and β = “scale parameter.”

pdf:

$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

Note that if α = 1, then $f(x) = \frac{1}{\beta}\, e^{-x/\beta}$, $x \ge 0$, i.e., Gamma(1, β) = Exp(β).

WLOG, take β = 1: X ~ Gamma(α, 1), α = “shape parameter”

pdf: $f(x) = \dfrac{1}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x}$ for $x \ge 0$

[Figure: pdfs of Gamma(α, 1) for α = 0.5, 1 (X ~ Exp(1)), 2, 3]

cdf:

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy = \int_0^x \frac{1}{\Gamma(\alpha)}\, y^{\alpha - 1} e^{-y}\,dy = \frac{1}{\Gamma(\alpha)} \int_0^x y^{\alpha - 1} e^{-y}\,dy$$

The last integral is the “Incomplete Gamma Function.” (No general closed form expression, but still continuous and monotonic from 0 to 1.)

Return to… X ~ Gamma(α, β), with mean $\mu = \alpha\beta$ and variance $\sigma^2 = \alpha\beta^2$.

Note that if α = 1, then $\mu = \beta$, $\sigma^2 = \beta^2$, the “Poisson rate” is $1/\beta$, and Gamma(1, β) = Exp(β).
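The basic Gamma function properties and the special case Gamma(1, β) = Exp(β) can be spot-checked with Python's standard library. This is only an illustrative sketch: `gamma_pdf` and `exp_pdf` are hypothetical helper names, β = 2 and the test points are arbitrary, and the pdf is coded in the shape/scale form used in these slides.

```python
import math

# Gamma(n + 1) = n! for small integers, e.g. Gamma(5) = 4! = 24.
for n in range(6):
    assert math.isclose(math.gamma(n + 1), math.factorial(n))

# Recursion Gamma(alpha + 1) = alpha * Gamma(alpha) at a non-integer alpha.
alpha = 2.5
assert math.isclose(math.gamma(alpha + 1), alpha * math.gamma(alpha))

def gamma_pdf(x, alpha, beta):
    """Gamma(alpha, beta) pdf, shape alpha and scale beta.

    Returns 0 for x <= 0; the single point x = 0 does not affect probabilities.
    """
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

def exp_pdf(x, beta):
    """Exponential pdf with mean beta, for x > 0."""
    return math.exp(-x / beta) / beta if x > 0 else 0.0

beta = 2.0  # arbitrary scale for the comparison (assumption)
same = all(
    math.isclose(gamma_pdf(x, 1.0, beta), exp_pdf(x, beta))
    for x in (0.1, 1.0, 5.0)
)
print(math.gamma(5), same)  # 24.0 True
```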
Theorem: Suppose r.v.’s $X_1, X_2, X_3, \ldots, X_n$ are independent, each ~ Exp(β), i.e., “independent, identically distributed” (i.i.d.). Then their sum $X_1 + X_2 + X_3 + \cdots + X_n$ ~ Gamma(n, β), e.g., failure time in machine components.

Example: Suppose X = time between failures is known to be modeled by a Gamma distribution, with mean = 8 years, and standard deviation = 4 years. Calculate the probability of failure before 5 years.

$$\mu = \alpha\beta = 8, \quad \sigma^2 = \alpha\beta^2 = 4^2 \quad \Rightarrow \quad \beta = 2, \ \alpha = 4$$

$$f(x) = \frac{1}{2^4\,\Gamma(4)}\, x^{4 - 1} e^{-x/2} = \frac{1}{(16)(3!)}\, x^3 e^{-x/2} = \frac{1}{96}\, x^3 e^{-x/2}, \quad x \ge 0$$

$$F(5) = P(X \le 5) = \int_0^5 \frac{1}{96}\, t^3 e^{-t/2}\,dt$$

Example: Suppose instead that the mean = 5.6 years, and standard deviation = 3 years. Calculate the probability of failure before 5 years.

$$\mu = \alpha\beta = 5.6, \quad \sigma^2 = \alpha\beta^2 = 3^2 \quad \Rightarrow \quad \beta \approx 1.6, \ \alpha \approx 3.5$$

$$f(x) = \frac{1}{1.6^{3.5}\,\Gamma(3.5)}\, x^{3.5 - 1} e^{-x/1.6}, \quad x \ge 0, \qquad F(5) = P(X \le 5) = \int_0^5 f(t)\,dt$$

Recall... $\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha)$ for any α > 0. Hence, using $\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$,

$$\Gamma\left(\tfrac{7}{2}\right) = \tfrac{5}{2}\,\Gamma\left(\tfrac{5}{2}\right) = \tfrac{5}{2} \cdot \tfrac{3}{2}\,\Gamma\left(\tfrac{3}{2}\right) = \tfrac{5}{2} \cdot \tfrac{3}{2} \cdot \tfrac{1}{2}\,\Gamma\left(\tfrac{1}{2}\right) = \tfrac{15}{8}\sqrt{\pi}$$

Chi-Squared Distribution with ν = n − 1 degrees of freedom, df = 1, 2, 3, …

Special case of the Gamma distribution: $\alpha = \nu/2$, $\beta = 2$

$$f(x) = \begin{cases} \dfrac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{\nu/2 - 1} e^{-x/2}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

[Figure: Chi-squared pdfs for ν = 1, 2, 3, 4, 5, 6, 7]

“Chi-squared Test” used in statistical analysis of categorical data.

F-distribution with degrees of freedom $\nu_1$ and $\nu_2$.

“F-Test” used when comparing means of two or more groups (ANOVA).

T-distribution with (n – 1) degrees of freedom, df = 1, 2, 3, …

[Figure: T-distribution pdfs for df = 1, 2, 5, 10]

“T-Test” used when analyzing means of one or two groups.
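For the Gamma failure-time examples above, the slides leave F(5) as an integral. One way to evaluate it is sketched below in Python, assuming the shape/scale pdf from the slides and a simple midpoint rule; the helper names (`gamma_params`, `gamma_pdf`, `gamma_cdf`) are hypothetical, and the second call uses the rounded parameters α = 3.5, β = 1.6.

```python
import math

def gamma_params(mean, sd):
    """Solve mean = alpha*beta and sd^2 = alpha*beta^2 for (alpha, beta)."""
    beta = sd ** 2 / mean
    alpha = mean / beta
    return alpha, beta

def gamma_pdf(x, alpha, beta):
    """Gamma(alpha, beta) pdf with shape alpha and scale beta; 0 for x <= 0."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

def gamma_cdf(x, alpha, beta, n=100_000):
    """Midpoint-rule approximation of the integral of the pdf over [0, x]."""
    h = x / n
    return h * sum(gamma_pdf((i + 0.5) * h, alpha, beta) for i in range(n))

alpha, beta = gamma_params(8.0, 4.0)    # mean 8 years, sd 4 years -> (4.0, 2.0)
p_first = gamma_cdf(5.0, alpha, beta)   # P(X <= 5), about 0.2424
p_second = gamma_cdf(5.0, 3.5, 1.6)     # rounded parameters of the second example

print(alpha, beta, round(p_first, 4), round(p_second, 4))
```

Under these assumptions, the first example gives a probability of failure before 5 years of roughly 0.24. A library routine for the regularized incomplete gamma function would be the more usual choice in practice; the midpoint rule is used here only to keep the sketch dependency-free.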
T-distribution with 1 degree of freedom (df = 1)

pdf:

$$f(x) = \frac{1}{\pi} \cdot \frac{1}{1 + x^2}, \quad -\infty < x < \infty$$

Check that the pdf integrates to 1 (improper integral at both endpoints):

$$\int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx = \frac{1}{\pi} \left[ \int_{-\infty}^{0} \frac{1}{1 + x^2}\,dx + \int_{0}^{\infty} \frac{1}{1 + x^2}\,dx \right]$$

$$= \frac{1}{\pi} \left[ \lim_{a \to -\infty} \int_a^0 \frac{1}{1 + x^2}\,dx + \lim_{b \to \infty} \int_0^b \frac{1}{1 + x^2}\,dx \right] = \frac{1}{\pi} \left[ \lim_{a \to -\infty} \left[\tan^{-1} x\right]_a^0 + \lim_{b \to \infty} \left[\tan^{-1} x\right]_0^b \right]$$

$$= \frac{1}{\pi} \left[ \lim_{a \to -\infty} \left(-\tan^{-1} a\right) + \lim_{b \to \infty} \tan^{-1} b \right] = \frac{1}{\pi} \left[ \frac{\pi}{2} + \frac{\pi}{2} \right] = 1$$

Now attempt the mean, with integrand $y = \dfrac{x}{1 + x^2}$ (again an improper integral at both endpoints):

$$\int_{-\infty}^{\infty} x\, f(x)\,dx = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{x}{1 + x^2}\,dx = \frac{1}{\pi} \left[ \lim_{a \to -\infty} \int_a^0 \frac{x}{1 + x^2}\,dx + \lim_{b \to \infty} \int_0^b \frac{x}{1 + x^2}\,dx \right]$$

$$= \frac{1}{\pi} \left[ \lim_{a \to -\infty} \left[\tfrac{1}{2} \ln(1 + x^2)\right]_a^0 + \lim_{b \to \infty} \left[\tfrac{1}{2} \ln(1 + x^2)\right]_0^b \right] = \frac{1}{\pi} \left[ -\lim_{a \to -\infty} \tfrac{1}{2} \ln(1 + a^2) + \lim_{b \to \infty} \tfrac{1}{2} \ln(1 + b^2) \right]$$

This is $-\infty + \infty$, an “indeterminate form,” so the mean of this distribution does not exist.

Summary of continuous distributions

● Normal distribution
● Log-Normal ~ X is not normally distributed (e.g., skewed), but Y = “logarithm of X” is normally distributed
● Student’s t-distribution ~ Similar to normal distr, more flexible
● F-distribution ~ Used when comparing multiple group means
● Chi-squared distribution ~ Used extensively in categorical data analysis
● Others for specialized applications ~ Gamma, Beta, Weibull…
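The “indeterminate form” in the mean of the T-distribution with 1 degree of freedom can be illustrated numerically: the value of the truncated integral depends on how the two limits a → −∞ and b → ∞ are taken. The Python sketch below uses the antiderivatives from the derivation; the function names and the choice b = 2|a| are illustrative assumptions.

```python
import math

def total_probability(a, b):
    """Integral of f(x) = 1 / (pi * (1 + x^2)) over [a, b], via arctan."""
    return (math.atan(b) - math.atan(a)) / math.pi

def truncated_mean_integral(a, b):
    """Integral of x * f(x) over [a, b], via the antiderivative ln(1 + x^2) / (2*pi)."""
    return (math.log1p(b * b) - math.log1p(a * a)) / (2 * math.pi)

for c in (1e2, 1e4, 1e6):
    prob = total_probability(-c, c)            # tends to 1 as c grows
    sym = truncated_mean_integral(-c, c)       # symmetric limits: exactly 0
    skew = truncated_mean_integral(-c, 2 * c)  # b = 2|a|: tends to ln(2)/pi
    print(c, prob, sym, skew)
```

Symmetric limits give 0 while b = 2|a| gives about ln(2)/π, so no single value can be assigned to the improper integral, whereas the total probability converges to 1 regardless of how the limits are taken.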