Chapter 4  Analysis of Continuous Random Variables

4-1 Continuous Random Variables
● The range of a random variable X includes all values in an interval of real numbers; that is, the range of X can be thought of as a continuum.
● The number of possible values of the random variable X is uncountably infinite.

4-2 Probability Distributions and Probability Density Functions
Definition: For a continuous random variable X, a probability density function is a function f(x) such that
(1) f(x) ≥ 0
(2) ∫_{-∞}^{∞} f(x) dx = 1
(3) P(a ≤ X ≤ b) = ∫_a^b f(x) dx = area under f(x) from a to b, for any a and b
● A probability density function provides a simple description of the probabilities associated with a continuous random variable.
● A histogram is an approximation to a probability density function (see Fig. 4-3).
Figure 4-3  A histogram approximates a probability density function.
● Note: f(x) is used to calculate an area that represents the probability that X assumes a value in [a, b]. Consequently, for a continuous random variable X and any value x,
P(X = x) = 0
● If X is a continuous random variable, then for any x₁ and x₂,
P(x₁ ≤ X ≤ x₂) = P(x₁ < X ≤ x₂) = P(x₁ ≤ X < x₂) = P(x₁ < X < x₂)
Proof: P(x₁ ≤ X ≤ x₂) = P(X = x₁) + P(x₁ < X ≤ x₂) = 0 + P(x₁ < X ≤ x₂) = P(x₁ < X ≤ x₂), and similarly for the other endpoint.

Example 4-1 (Electric Current)
● The continuous random variable X is the current measured in a thin copper wire, in milliamperes (mA).
● The range of X is [0, 20 mA].
● The probability density function of X is f(x) = 0.05 for 0 ≤ x ≤ 20.
● What is the probability that a current measurement is less than 10 milliamperes?
● The probability density function is shown in Fig. 4-4.
P(X < 10) = ∫_0^{10} f(x) dx = ∫_0^{10} 0.05 dx = 0.05x |_0^{10} = 0.5
Figure 4-4  Probability density function for Example 4-1.

Example 4-2 (Hole Diameter)
● The continuous random variable X is the diameter of a hole drilled in a sheet-metal component.
● The distribution of X can be modeled by the probability density function f(x) = 20e^{-20(x-12.5)}, x ≥ 12.5.
● If a part with a diameter larger than 12.60 millimeters is scrapped, what proportion of parts is scrapped?
● The density function and the requested probability are shown in Fig. 4-5.
P(X > 12.60) = ∫_{12.6}^{∞} f(x) dx = ∫_{12.6}^{∞} 20e^{-20(x-12.5)} dx = -e^{-20(x-12.5)} |_{12.6}^{∞} = e^{-2} = 0.135
Figure 4-5  Probability density function for Example 4-2.
● What proportion of parts is between 12.5 and 12.6 millimeters?
P(12.5 < X < 12.6) = ∫_{12.5}^{12.6} f(x) dx = -e^{-20(x-12.5)} |_{12.5}^{12.6} = 1 - e^{-2} = 1 - P(X > 12.6) = 1 - 0.135 = 0.865

Exercises 4-2: 4-1, 4-3, 4-5, 4-9

4-3 Cumulative Distribution Function
Definition: The cumulative distribution function (cdf) of a continuous random variable X is
F(x) = P(X ≤ x) = ∫_{-∞}^{x} f(u) du        (4-3)
for -∞ < x < ∞.

Example 4-3 (Electric Current)
● See Example 4-1.
● The cumulative distribution function of the random variable X consists of three expressions.
● Case 1: F(x) = 0 for x < 0
● Case 2: F(x) = ∫_{-∞}^{x} f(u) du = ∫_{-∞}^{0} f(u) du + ∫_0^{x} f(u) du = ∫_0^{x} 0.05 du = 0.05x, for 0 ≤ x < 20
● Case 3: F(x) = ∫_{-∞}^{x} f(u) du = 1, for 20 ≤ x
● Therefore,
F(x) = { 0, x < 0;  0.05x, 0 ≤ x < 20;  1, 20 ≤ x }
● The plot of F(x) is shown in Fig. 4-6.
Figure 4-6  Cumulative distribution function for Example 4-3.
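A minimal numerical sketch (not part of the original notes) that recaps Examples 4-1 and 4-2 by integrating the two density functions with scipy's quad routine; the variable names are illustrative and the assumption is only that Python with scipy is available.

```python
# Sketch: numerically checking the probabilities of Examples 4-1 and 4-2.
import math
from scipy.integrate import quad

# Example 4-1: uniform density f(x) = 0.05 on [0, 20]
f_current = lambda x: 0.05
p_less_10, _ = quad(f_current, 0, 10)           # P(X < 10)
print(p_less_10)                                # ~0.5

# Example 4-2: f(x) = 20 exp(-20(x - 12.5)) for x >= 12.5
f_diameter = lambda x: 20 * math.exp(-20 * (x - 12.5))
p_scrap, _ = quad(f_diameter, 12.6, math.inf)   # P(X > 12.60)
p_keep, _ = quad(f_diameter, 12.5, 12.6)        # P(12.5 < X < 12.6)
print(p_scrap, p_keep)                          # ~0.135 and ~0.865

# Property (2) of a density: it must integrate to 1 over the whole range.
total, _ = quad(f_diameter, 12.5, math.inf)
print(total)                                    # ~1.0
```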
Example 4-4 (Hole Diameter)
● See Example 4-2.
● Case 1: F(x) = 0 for x < 12.5
● Case 2: F(x) = ∫_{12.5}^{x} 20e^{-20(u-12.5)} du = -e^{-20(u-12.5)} |_{12.5}^{x} = -e^{-20(x-12.5)} - [-e^{-20(12.5-12.5)}] = 1 - e^{-20(x-12.5)}
● Therefore,
F(x) = { 0, x < 12.5;  1 - e^{-20(x-12.5)}, 12.5 ≤ x }
● Figure 4-7 displays a graph of F(x).
Figure 4-7  Cumulative distribution function for Example 4-4.

The probability density function of a continuous random variable can be determined from the cumulative distribution function by differentiating:
d/dx F(x) = d/dx ∫_{-∞}^{x} f(u) du = f(x)
That is, f(x) = dF(x)/dx as long as the derivative exists.

Example 4-5 (Reaction Time)
● The time until a chemical reaction is complete (in milliseconds) is approximated by the cumulative distribution function
F(x) = { 0, x < 0;  1 - e^{-0.01x}, 0 ≤ x }
● Determine the probability density function of X:
f(x) = dF(x)/dx = { 0, x < 0;  0.01e^{-0.01x}, 0 ≤ x }
● What proportion of reactions is complete within 200 milliseconds?
P(X < 200) = F(200) = 1 - e^{-2} = 0.8647

Exercises 4-3: 4-11(a)(b), 4-17

4-4 Mean and Variance of a Continuous Random Variable
Definition: Suppose X is a continuous random variable with probability density function f(x). The mean or expected value of X, denoted as μ or E(X), is
μ = E(X) = ∫_{-∞}^{∞} x f(x) dx        (4-4)
The variance of X, denoted as V(X) or σ², is
σ² = V(X) = ∫_{-∞}^{∞} (x - μ)² f(x) dx = ∫_{-∞}^{∞} x² f(x) dx - μ²
The standard deviation of X is σ = √(σ²).
Proof:
σ² = ∫_{-∞}^{∞} (x - μ)² f(x) dx = ∫_{-∞}^{∞} (x² f(x) - 2xμ f(x) + μ² f(x)) dx
   = ∫_{-∞}^{∞} x² f(x) dx - 2μ ∫_{-∞}^{∞} x f(x) dx + μ² ∫_{-∞}^{∞} f(x) dx
   = ∫_{-∞}^{∞} x² f(x) dx - 2μ² + μ² = ∫_{-∞}^{∞} x² f(x) dx - μ²
(using ∫_{-∞}^{∞} x f(x) dx = μ and ∫_{-∞}^{∞} f(x) dx = 1)

Example 4-6 (Electric Current)
● See Example 4-1: f(x) = 0.05 for 0 ≤ x ≤ 20.
● The mean of X is
E(X) = ∫_{-∞}^{∞} x f(x) dx = ∫_0^{20} 0.05x dx = 0.05 x²/2 |_0^{20} = 10
● The variance of X is
V(X) = ∫_0^{20} (x - 10)² f(x) dx = 0.05 (x - 10)³/3 |_0^{20} = 33.33

If X is a continuous random variable with probability density function f(x),
E[h(X)] = ∫_{-∞}^{∞} h(x) f(x) dx        (4-5)

Example 4-7
● In Example 4-1, X is the current measured in milliamperes, with f(x) = 0.05, 0 ≤ x ≤ 20.
● What is the expected value of the squared current? Here h(X) = X², so
E[h(X)] = ∫_{-∞}^{∞} x² f(x) dx = ∫_0^{20} 0.05x² dx = 0.05 x³/3 |_0^{20} = 133.33

Example 4-8 (Hole Diameter)
● For the drilling operation in Example 4-2.
● The mean of X is
E(X) = ∫_{12.5}^{∞} x f(x) dx = ∫_{12.5}^{∞} 20x e^{-20(x-12.5)} dx = 12.5 + 1/20 = 12.55
● The variance of X is
V(X) = ∫_{12.5}^{∞} (x - 12.55)² f(x) dx = 1/20² = 0.0025

Exercises 4-4: 4-25, 4-29, 4-31

4-5 Continuous Uniform Distribution
Definition: A continuous random variable X with probability density function
f(x) = 1/(b - a),  a ≤ x ≤ b        (4-6)
is a continuous uniform random variable.
● The probability density function of a continuous uniform random variable is shown in Fig. 4-8.
Figure 4-8  Continuous uniform probability density function.
● The mean of the continuous uniform random variable X is
E(X) = ∫_a^b x f(x) dx = ∫_a^b x/(b - a) dx = 0.5x²/(b - a) |_a^b = (a + b)/2
● The variance of X is
V(X) = ∫_a^b (x - μ)² f(x) dx = ∫_a^b (x - (a + b)/2)²/(b - a) dx = (x - (a + b)/2)³/(3(b - a)) |_a^b = (b - a)²/12

If X is a continuous uniform random variable over a ≤ x ≤ b,
μ = E(X) = (a + b)/2 and σ² = V(X) = (b - a)²/12        (4-7)
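A small numerical sketch (not from the text) that evaluates the defining integrals (4-4) and (4-5) for Examples 4-6 through 4-8 and compares the uniform case with the closed form (4-7); the helper function mean_var is an illustrative name, and the only assumption is that scipy is available.

```python
# Sketch: mean and variance by numerical integration of the density.
import math
from scipy.integrate import quad

def mean_var(f, a, b):
    """Return (E[X], V(X)) for a density f supported on [a, b]."""
    mu, _ = quad(lambda x: x * f(x), a, b)
    ex2, _ = quad(lambda x: x**2 * f(x), a, b)
    return mu, ex2 - mu**2

# Examples 4-6 and 4-7: uniform current density on [0, 20]
mu, var = mean_var(lambda x: 0.05, 0, 20)
print(mu, var, var + mu**2)      # ~10, ~33.33, and E[X^2] ~ 133.33

# Example 4-8: shifted-exponential hole-diameter density
mu, var = mean_var(lambda x: 20 * math.exp(-20 * (x - 12.5)), 12.5, math.inf)
print(mu, var)                   # ~12.55 and ~0.0025

# Eq. (4-7): the closed-form uniform results agree with the integrals
a, b = 0, 20
print((a + b) / 2, (b - a)**2 / 12)
```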
Example 4-9 (Uniform Current)
● Let the continuous random variable X denote the current measured in a thin copper wire, in milliamperes.
● The range of X is [0, 20 mA], and f(x) = 0.05, 0 ≤ x ≤ 20.
● What is the probability that a measurement of current is between 5 and 10 milliamperes?
● See Fig. 4-9.
Figure 4-9  Probability for Example 4-9.
P(5 < X < 10) = ∫_5^{10} f(x) dx = ∫_5^{10} 0.05 dx = 0.05x |_5^{10} = 0.5 - 0.25 = 0.25
● The cumulative distribution function of a continuous uniform random variable is obtained by integration.
● If a < x < b,
F(x) = ∫_a^x 1/(b - a) du = x/(b - a) - a/(b - a) = (x - a)/(b - a)
● Therefore,
F(x) = { 0, x < a;  (x - a)/(b - a), a ≤ x < b;  1, b ≤ x }

Exercises 4-5: 4-33, 4-35, 4-37, 4-39

4-6 Normal Distribution (Gaussian Distribution)
● Undoubtedly, the most widely used model for the distribution of a random variable is the normal distribution.
● A normal distribution is also referred to as a Gaussian distribution.
● The value of E(X) = μ determines the center of the probability density function, and the value of V(X) = σ² determines the width.
● Figure 4-10 illustrates several normal probability density functions with selected values of μ and σ².
● Each normal distribution has the characteristic symmetric bell-shaped curve, but the centers and dispersions differ.
Figure 4-10  Normal probability density functions for selected values of the parameters μ and σ².

Definition: A random variable X with probability density function
f(x) = (1/(√(2π) σ)) e^{-(x-μ)²/(2σ²)},  -∞ < x < ∞        (4-8)
is a normal random variable with parameters μ and σ, where -∞ < μ < ∞ and σ > 0. Also,
E(X) = μ and V(X) = σ²        (4-9)
and the notation N(μ, σ²) is used to denote the distribution. The mean and variance of X are shown to equal μ and σ², respectively, at the end of this section.

Example 4-10
● The current measurements in a strip of wire follow a normal distribution with a mean of 10 milliamperes and a variance of 4 (milliamperes)². What is the probability that a measurement exceeds 13 milliamperes?
● P(X > 13) = ?  See Fig. 4-11.
● There is no closed-form expression for the integral of a normal probability density function; probabilities based on the normal distribution are found from a table.
● Some useful results concerning a normal distribution are summarized below and in Fig. 4-12:
P(μ - σ < X < μ + σ) = 0.6827
P(μ - 2σ < X < μ + 2σ) = 0.9545
P(μ - 3σ < X < μ + 3σ) = 0.9973
Figure 4-11  Probability that X > 13 for a normal random variable with μ = 10 and σ² = 4.
Figure 4-12  Probabilities associated with a normal distribution.
● From the symmetry of f(x), P(X > μ) = P(X < μ) = 0.5.
● Because more than 0.9973 of the probability of a normal distribution is within the interval (μ - 3σ, μ + 3σ), 6σ is often referred to as the width of a normal distribution.

Definition: A normal random variable with μ = 0 and σ² = 1 is called a standard normal random variable and is denoted as Z. The cumulative distribution function of a standard normal random variable is denoted as
Φ(z) = P(Z ≤ z)
● Appendix Table III provides cumulative probability values Φ(z) for a standard normal random variable.

Example 4-11 (Standard Normal Distribution)
● Z is a standard normal random variable.
● The use of Table III to find P(Z ≤ 1.5) is illustrated in Fig. 4-13.
Figure 4-13  Standard normal probability density function.
● P(Z ≤ 1.5) = 0.93319
● P(Z ≤ 1.53) = 0.93699
● P(Z ≤ 1.525) is not listed in the table; interpolating between Φ(1.52) and Φ(1.53) gives approximately 0.9364.
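A brief sketch (not part of the original notes) that reproduces the Table III look-ups of Example 4-11 and the "within k standard deviations" facts quoted for Fig. 4-12, assuming scipy is available; norm here is scipy's standard normal object.

```python
# Sketch: standard normal cumulative probabilities without the printed table.
from scipy.stats import norm

print(norm.cdf(1.5))                      # P(Z <= 1.5)   ~ 0.93319
print(norm.cdf(1.53))                     # P(Z <= 1.53)  ~ 0.93699
print(norm.cdf(1.525))                    # P(Z <= 1.525) ~ 0.9364 (table needs interpolation)

# Probabilities within 1, 2, and 3 standard deviations of the mean
for k in (1, 2, 3):
    print(k, norm.cdf(k) - norm.cdf(-k))  # ~0.6827, 0.9545, 0.9973
```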
Example 4-12
● The following calculations are shown pictorially in Fig. 4-14.
(1) P(Z > 1.26) = 1 - P(Z ≤ 1.26) = 1 - 0.8962 = 0.1038
(2) P(Z < -0.86) = 0.1949
(3) P(Z > -1.37) = P(Z < 1.37) = 0.9147
(4) P(-1.25 < Z < 0.37) = P(Z < 0.37) - P(Z < -1.25) = 0.6443 - 0.1056 = 0.5387
(5) P(Z ≤ -4.6) cannot be found exactly from Appendix Table III. Because P(Z ≤ -3.99) = 0.00003 and P(Z ≤ -4.6) < P(Z ≤ -3.99), P(Z ≤ -4.6) is nearly zero.
(6) Find the value z such that P(Z > z) = 0.05, i.e., P(Z ≤ z) = 0.95. The nearest table value is 0.95053, so z = 1.65.
(7) Find the value z such that P(-z < Z < z) = 0.99. The value for z corresponds to a cumulative probability of 0.995 in Table III. The nearest probability in Table III is 0.99506, which gives z = 2.58.

If X is a normal random variable with E(X) = μ and V(X) = σ², the random variable
Z = (X - μ)/σ        (4-10)
is a normal random variable with E(Z) = 0 and V(Z) = 1. That is, Z is a standard normal random variable.
● Creating a new random variable by this transformation is referred to as standardizing.

Example 4-13 (Normally Distributed Current)
● Suppose the current measurements in a strip of wire follow a normal distribution with a mean of 10 milliamperes and a variance of 4 (milliamperes)².
● What is the probability that a measurement will exceed 13 milliamperes? (P(X > 13) = ?)
Z = (X - 10)/2, so z = (13 - 10)/2 = 1.5
P(X > 13) = P(Z > 1.5) = 1 - P(Z ≤ 1.5) = 1 - 0.93319 = 0.06681
● Rather than reading the probability from a figure such as Fig. 4-15, standardize directly:
P(X > 13) = P((X - 10)/2 > (13 - 10)/2) = P(Z > 1.5) = 0.06681
Figure 4-15  Standardizing a normal random variable.
Note: 1.5 is referred to as the z-value associated with the probability.

Suppose X is a normal random variable with mean μ and variance σ². Then
P(X ≤ x) = P((X - μ)/σ ≤ (x - μ)/σ) = P(Z ≤ z)        (4-11)
where Z is a standard normal random variable and z = (x - μ)/σ is the z-value obtained by standardizing X. The probability is obtained by entering Appendix Table III with z = (x - μ)/σ.

Example 4-14 (Normally Distributed Current)
● Continuing the previous example, what is the probability that a current measurement is between 9 and 11 milliamperes?
P(9 < X < 11) = P((9 - 10)/2 < (X - 10)/2 < (11 - 10)/2) = P(-0.5 < Z < 0.5)
             = P(Z < 0.5) - P(Z < -0.5) = 0.69146 - 0.30854 = 0.38292
● Determine the value for which the probability that a current measurement is below this value is 0.98.
● By standardizing,
P(X < x) = P((X - 10)/2 < (x - 10)/2) = P(Z < (x - 10)/2) = 0.98
● The nearest probability from Table III is P(Z < 2.05) = 0.97982, so
(x - 10)/2 = 2.05 ⇒ x = 2(2.05) + 10 = 14.1

Example 4-15 (Signal Detection)
● Assume that in the detection of a digital signal the background noise follows a normal distribution with a mean of 0 volt and a standard deviation of 0.45 volt.
● The system assumes a digital 1 has been transmitted when the voltage exceeds 0.9 volt.
● What is the probability of detecting a digital 1 when none was sent?
● Let the random variable N denote the voltage of the noise.
P(N > 0.9) = P((N - 0)/0.45 > (0.9 - 0)/0.45) = P(Z > 2) = 1 - 0.97725 = 0.02275
● This probability can be described as the probability of a false detection.
● Determine symmetric bounds about 0 that include 99% of all noise readings: find x such that P(-x < N < x) = 0.99.
P(-x < N < x) = P(-x/0.45 < N/0.45 < x/0.45) = P(-x/0.45 < Z < x/0.45) = 0.99
● From Appendix Table III, P(-2.58 < Z < 2.58) = 0.99, so x/0.45 = 2.58 ⇒ x = 2.58 × 0.45 = 1.16
● Suppose a digital 1 is represented as a shift in the mean of the noise distribution to 1.8 volts. What is the probability that a digital 1 is not detected?
● Let the random variable S denote the voltage when a digital 1 is transmitted.
P(S < 0.9) = P((S - 1.8)/0.45 < (0.9 - 1.8)/0.45) = P(Z < -2) = 0.02275
● This probability can be interpreted as the probability of a missed signal.
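An illustrative sketch (not from the text) that redoes Examples 4-12 through 4-15 with scipy, both directly from N(μ, σ²) objects and via the standardization of Eqs. (4-10) and (4-11); the variable names current, noise, and signal are illustrative.

```python
# Sketch: standard normal look-ups and standardization with scipy.
from scipy.stats import norm

# Example 4-12: assorted standard normal probabilities and quantiles
print(norm.sf(1.26), norm.cdf(-0.86), norm.sf(-1.37))
print(norm.cdf(0.37) - norm.cdf(-1.25), norm.cdf(-4.6))
print(norm.ppf(0.95), norm.ppf(0.995))       # z for parts (6) and (7)

# Examples 4-13 and 4-14: current ~ N(10, 4), i.e., loc=10, scale=2
current = norm(loc=10, scale=2)
print(current.sf(13))                        # P(X > 13)       ~ 0.0668
print(norm.sf((13 - 10) / 2))                # same value after standardizing
print(current.cdf(11) - current.cdf(9))      # P(9 < X < 11)   ~ 0.3829
print(current.ppf(0.98))                     # x with P(X < x) = 0.98, ~14.1

# Example 4-15: noise ~ N(0, 0.45^2); mean shifted to 1.8 V for a digital 1
noise, signal = norm(scale=0.45), norm(loc=1.8, scale=0.45)
print(noise.sf(0.9))                         # false detection ~ 0.0228
print(noise.ppf(0.995))                      # 99% symmetric bound ~ 1.16
print(signal.cdf(0.9))                       # missed signal   ~ 0.0228
```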
Example 4-16 (Shaft Diameter)
● The diameter of a shaft in an optical storage drive is normally distributed with mean 0.2508 inch and standard deviation 0.0005 inch.
● The specifications on the shaft are 0.2500 ± 0.0015 inch.
● What proportion of shafts conforms to the specifications?
● Let X denote the shaft diameter in inches.
P(0.2485 < X < 0.2515) = P((0.2485 - 0.2508)/0.0005 < Z < (0.2515 - 0.2508)/0.0005)
                       = P(-4.6 < Z < 1.4) = 0.91924
● If the process is centered so that the process mean equals the target value of 0.2500,
P(0.2485 < X < 0.2515) = P((0.2485 - 0.2500)/0.0005 < Z < (0.2515 - 0.2500)/0.0005)
                       = P(-3 < Z < 3) = 0.9973

Exercises 4-6: 4-41, 4-43, 4-47, 4-53, 4-55, 4-61

4-8 Exponential Distribution
● The Poisson distribution defines a random variable to be the number of flaws along a length of copper wire.
● The distance between flaws is another random variable that is often of interest.
● Let the random variable X be the length from any starting point on the wire until a flaw is detected.
● Let the random variable N be the number of flaws in x millimeters of wire.
● The distribution of X can be obtained from the distribution of N.
● Key concept: the distance to the first flaw exceeds 3 mm if and only if there are no flaws within a length of 3 mm.
● If the mean number of flaws is λ per millimeter, then N has a Poisson distribution with mean λx, so
P(X > x) = P(N = 0) = e^{-λx}(λx)⁰/0! = e^{-λx}
● The cumulative distribution function of X is
F(x) = P(X ≤ x) = 1 - e^{-λx},  x ≥ 0
● By differentiating F(x), the probability density function of X is
f(x) = λe^{-λx},  x ≥ 0
Note: The derivation of the distribution of X depends only on the assumption that the flaws in the wire follow a Poisson process.

Definition: The random variable X that equals the distance between successive counts of a Poisson process with mean λ > 0 is an exponential random variable with parameter λ. The probability density function of X is
f(x) = λe^{-λx} for 0 ≤ x < ∞        (4-14)

If the random variable X has an exponential distribution with parameter λ,
μ = E(X) = 1/λ and σ² = V(X) = 1/λ²        (4-15)
Proof:
μ = E(X) = ∫_0^∞ x λe^{-λx} dx = [-x e^{-λx}]_0^∞ + ∫_0^∞ e^{-λx} dx
(integration by parts; the first term is 0 because lim_{x→∞} x/e^{λx} = lim_{x→∞} 1/(λe^{λx}) = 0 by L'Hôpital's rule)
  = 0 + [-e^{-λx}/λ]_0^∞ = 1/λ
E(X²) = ∫_0^∞ x² λe^{-λx} dx = [-x² e^{-λx}]_0^∞ + 2∫_0^∞ x e^{-λx} dx = 0 + (2/λ)(1/λ) = 2/λ²
σ² = E(X²) - μ² = 2/λ² - 1/λ² = 1/λ²
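A small numerical sketch (not part of the original notes) that checks the Poisson link P(X > x) = P(N = 0) and the moments in Eq. (4-15); it assumes scipy is available and uses the fact that scipy parameterizes the exponential by scale = 1/λ.

```python
# Sketch: exponential distribution derived from Poisson counts, and Eq. (4-15).
import math
from scipy.stats import expon, poisson

lam = 25.0                              # rate: counts per unit length/time
X = expon(scale=1 / lam)                # exponential with parameter lambda

x = 0.1
print(X.sf(x))                          # P(X > x) from the exponential cdf
print(poisson(mu=lam * x).pmf(0))       # P(N = 0) for the Poisson count, same value
print(math.exp(-lam * x))               # closed form e^{-lambda x}

print(X.mean(), X.var())                # 1/lambda and 1/lambda^2
```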
Example 4-21 (Computer Usage)
● In a large corporate computer network, user log-ons to the system can be modeled as a Poisson process with a mean of 25 log-ons per hour.
● What is the probability that there are no log-ons in an interval of 6 minutes?
● Let X denote the time in hours from the start of the interval until the first log-on.
● X has an exponential distribution with λ = 25 log-ons per hour.
● We are interested in the probability that X exceeds 6 minutes (6 minutes = 0.1 hour):
P(X > 0.1) = ∫_{0.1}^{∞} 25e^{-25x} dx = -e^{-25x} |_{0.1}^{∞} = 0 - (-e^{-25(0.1)}) = e^{-2.5} = 0.082
● Another point of view, using the cumulative distribution function:
P(X > 0.1) = 1 - P(X ≤ 0.1) = 1 - F(0.1) = 1 - (1 - e^{-25(0.1)}) = e^{-2.5} = 0.082
● What is the probability that the time until the next log-on is between 2 and 3 minutes (0.033 and 0.05 hour)?
P(0.033 < X < 0.05) = ∫_{0.033}^{0.05} 25e^{-25x} dx
or
P(0.033 < X < 0.05) = F(0.05) - F(0.033) = (1 - e^{-25(0.05)}) - (1 - e^{-25(0.033)}) = 0.152
● Determine the interval of time such that the probability that no log-on occurs in the interval is 0.90:
P(X > x) = e^{-25x} = 0.90 ⇒ -25x = ln 0.9 = -0.1054 ⇒ x = 0.00421 hour = 0.25 minute
● The mean time until the next log-on is μ = 1/λ = 1/25 = 0.04 hour = 2.4 minutes.
● The standard deviation of the time until the next log-on is σ = 1/λ = 1/25 = 0.04 hour = 2.4 minutes.
Note: The probability that there are no log-ons in a 6-minute interval is 0.082 regardless of the starting time of the interval.
● An even more interesting property of an exponential random variable concerns conditional probabilities.

Example 4-22
● Let X denote the time between detections of a particle with a Geiger counter.
● X has an exponential distribution with a mean of 1.4 minutes (rate λ = 1/1.4 counts per minute).
● The probability that we detect a particle within 30 seconds (0.5 minute) of starting the counter is
P(X < 0.5 minute) = F(0.5) = 1 - e^{-0.5/1.4} = 0.30
● Now, suppose we turn on the Geiger counter and wait 3 minutes without detecting a particle. What is the probability that a particle is detected in the next 30 seconds?
● The requested probability can be expressed as the conditional probability
P(X < 3.5 | X > 3) = P(3 < X < 3.5) / P(X > 3)
where
P(3 < X < 3.5) = F(3.5) - F(3) = [1 - e^{-3.5/1.4}] - [1 - e^{-3/1.4}] = 0.035
P(X > 3) = 1 - F(3) = e^{-3/1.4} = 0.117
P(X < 3.5 | X > 3) = 0.035/0.117 = 0.30
● After waiting for 3 minutes without a detection, the probability of a detection in the next 30 seconds is the same as the probability of a detection in the 30 seconds immediately after starting the counter.

Lack of Memory Property
For an exponential random variable X,
P(X < t₁ + t₂ | X > t₁) = P(X < t₂)        (4-16)

Exercises 4-8: 4-77, 4-79, 4-81, 4-83, 4-85, 4-91

4-9 Erlang and Gamma Distributions
4-9.1 Erlang Distribution
Recall: The Poisson random variable is defined to be the number of counts in a length of wire. The exponential random variable describes the length until the first count is obtained.
● The random variable that equals the interval length until r counts occur in a Poisson process is an Erlang random variable.

Example 4-23 (Processor Failure)
● The failures of the CPU of a large computer system are often modeled as a Poisson process.
● The mean number of failures per hour is 0.0001.
● Let X be the time until four failures occur in a system.
● Determine the probability that X exceeds 40,000 hours.
● Let N be the number of failures in 40,000 hours of operation.
● The time until four failures occur exceeds 40,000 hours if and only if the number of failures in 40,000 hours is three or fewer:
P(X > 40,000) = P(N ≤ 3)
● The assumption that the failures follow a Poisson process implies that N has a Poisson distribution with
E(N) = 40,000(0.0001) = 4 failures per 40,000 hours
P(X > 40,000) = P(N ≤ 3) = Σ_{k=0}^{3} e^{-4} 4^k / k! = 0.433
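An illustrative sketch (not from the text) that recomputes Examples 4-21 through 4-23 with scipy; the names logon and geiger are illustrative, and the lack-of-memory property is checked by comparing the unconditional and conditional probabilities of Example 4-22.

```python
# Sketch: exponential examples and the Poisson count for Example 4-23.
from scipy.stats import expon, poisson

logon = expon(scale=1 / 25)                  # Example 4-21, time in hours
print(logon.sf(0.1))                         # no log-on in 6 min      ~ 0.082
print(logon.cdf(0.05) - logon.cdf(0.033))    # next log-on in 2-3 min  ~ 0.152
print(logon.ppf(0.10))                       # x with P(X > x) = 0.90  ~ 0.0042 h
print(logon.mean(), logon.std())             # both 0.04 h = 2.4 min

geiger = expon(scale=1.4)                    # Example 4-22, mean 1.4 minutes
p_uncond = geiger.cdf(0.5)
p_cond = (geiger.cdf(3.5) - geiger.cdf(3)) / geiger.sf(3)
print(p_uncond, p_cond)                      # both ~0.30: lack of memory

# Example 4-23: time to 4th failure exceeds 40,000 h  <=>  at most 3 failures
print(poisson(mu=40000 * 0.0001).cdf(3))     # ~0.433
```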
● In general, let X denote the time until the r-th count occurs in a Poisson process with mean λ per unit. Then X > x if and only if the number of counts N in (0, x] is at most r - 1, so
P(X > x) = P(N ≤ r - 1) = Σ_{k=0}^{r-1} e^{-λx}(λx)^k / k!
and the cumulative distribution function is
F(x) = P(X ≤ x) = 1 - P(X > x) = 1 - Σ_{k=0}^{r-1} e^{-λx} λ^k x^k / k!
● Differentiating F(x) gives the probability density function:
f(x) = dF(x)/dx = -Σ_{k=0}^{r-1} [(-λ)e^{-λx} λ^k x^k + e^{-λx} λ^k k x^{k-1}] / k!
     = e^{-λx} Σ_{k=0}^{r-1} λ^{k+1} x^k / k! - e^{-λx} Σ_{k=1}^{r-1} λ^k x^{k-1} / (k - 1)!
     = e^{-λx} Σ_{k=0}^{r-1} λ^{k+1} x^k / k! - e^{-λx} Σ_{m=0}^{r-2} λ^{m+1} x^m / m!    (substituting m = k - 1)
     = e^{-λx} λ^{(r-1)+1} x^{r-1} / (r - 1)!    (all terms cancel except the k = r - 1 term)
     = λ^r x^{r-1} e^{-λx} / (r - 1)!

Definition: The random variable X that equals the interval length until r counts occur in a Poisson process with mean λ > 0 is an Erlang random variable with parameters λ and r. The probability density function of X is
f(x) = λ^r x^{r-1} e^{-λx} / (r - 1)!,  for x > 0 and r = 1, 2, ...
● An Erlang random variable can be thought of as the continuous analog of a negative binomial random variable.
● A negative binomial random variable can be expressed as the sum of r geometric random variables; likewise, an Erlang random variable can be represented as the sum of r exponential random variables.

If X is an Erlang random variable with parameters λ and r,
μ = E(X) = r/λ and σ² = V(X) = r/λ²
Proof: Repeated integration by parts gives ∫_0^∞ x^n e^{-λx} dx = n!/λ^{n+1} for any non-negative integer n. Therefore
E(X) = ∫_0^∞ x λ^r x^{r-1} e^{-λx}/(r - 1)! dx = λ^r/(r - 1)! ∫_0^∞ x^r e^{-λx} dx = λ^r/(r - 1)! · r!/λ^{r+1} = r/λ
E(X²) = λ^r/(r - 1)! ∫_0^∞ x^{r+1} e^{-λx} dx = λ^r/(r - 1)! · (r + 1)!/λ^{r+2} = r(r + 1)/λ²
σ² = E(X²) - μ² = r(r + 1)/λ² - (r/λ)² = r/λ²

4-9.2 Gamma Distribution
● The Erlang distribution is a special case of the gamma distribution.
● If the parameter r of an Erlang random variable is not an integer, but r > 0, the random variable has a gamma distribution.
● How is (r - 1)! defined for non-integer r? Through the gamma function.
Definition: The gamma function is
Γ(r) = ∫_0^∞ x^{r-1} e^{-x} dx, for r > 0        (4-17)
Note: It can be shown that the integral in the definition of Γ(r) is finite and that
Γ(r) = (r - 1)Γ(r - 1)
If r is a positive integer, then Γ(r) = (r - 1)!, and Γ(1) = 0! = 1. It can also be shown that Γ(1/2) = π^{1/2}.
Γ(r) can be interpreted as a generalization to non-integer values of r of the term (r - 1)! that is used in the Erlang probability density function.

Definition: The random variable X with probability density function
f(x) = λ^r x^{r-1} e^{-λx} / Γ(r), for x > 0        (4-18)
is a gamma random variable with parameters λ > 0 and r > 0. If r is an integer, X has an Erlang distribution.
Figure 4-26  Gamma probability density functions for selected values of λ and r.

If X is a gamma random variable with parameters λ and r,
μ = E(X) = r/λ and σ² = V(X) = r/λ²        (4-19)
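A short sketch (not part of the original notes) that verifies the gamma-function facts quoted above and checks Eq. (4-19) against scipy's gamma distribution, which uses shape a = r and scale = 1/λ; the numeric parameter values are illustrative.

```python
# Sketch: gamma function identities and the gamma-distribution moments.
import math
from scipy.stats import gamma

print(math.gamma(5), math.factorial(4))             # Gamma(r) = (r-1)! for integer r
print(math.gamma(0.5), math.sqrt(math.pi))          # Gamma(1/2) = sqrt(pi)
r = 3.7
print(math.gamma(r), (r - 1) * math.gamma(r - 1))   # recursion Gamma(r) = (r-1)Gamma(r-1)

lam, r = 0.5, 10
X = gamma(a=r, scale=1 / lam)                       # Erlang/gamma with parameters lam, r
print(X.mean(), X.var())                            # r/lambda = 20 and r/lambda^2 = 40
```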
Example 4-24
● The time to prepare a microarray slide for high-throughput genomics is a Poisson process with a mean of two hours per slide.
● What is the probability that 10 slides require more than 25 hours to prepare?
● Let X denote the time to prepare 10 slides.
● The assumption of a Poisson process implies that X has a gamma distribution with λ = 1/2 and r = 10.
● The requested probability is P(X > 25). The time exceeds 25 hours if and only if 9 or fewer slides are completed in 25 hours, so
P(X > 25) = Σ_{k=0}^{9} e^{-λx}(λx)^k / k! = Σ_{k=0}^{9} e^{-12.5}(12.5)^k / k! = 0.2014
● What are the mean and standard deviation of the time to prepare 10 slides?
E(X) = r/λ = 10/(1/2) = 20 hours
V(X) = r/λ² = 10/(1/2)² = 40, so the standard deviation is 40^{1/2} = 6.32 hours
● The slides will be completed by what length of time with probability equal to 0.95? That is, find x such that P(X ≤ x) = 0.95, or equivalently
P(X > x) = Σ_{k=0}^{9} e^{-λx}(λx)^k / k! = Σ_{k=0}^{9} e^{-0.5x}(0.5x)^k / k! = 0.05
● The chi-squared distribution is a special case of the gamma distribution in which λ = 1/2 and r equals one of the values 1/2, 1, 3/2, 2, ....
● The distribution is used extensively in interval estimation and tests of hypotheses.

Exercises 4-9: 4-97, 4-99, 4-101, 4-103
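To close the chapter, an illustrative sketch (not from the text) of Example 4-24 with scipy's gamma distribution; it also uses the Erlang-Poisson relation P(X > x) = P(N ≤ r - 1) quoted above, and the variable names are illustrative.

```python
# Sketch: Example 4-24 (time to prepare 10 slides) with scipy.
from scipy.stats import gamma, poisson

lam, r = 0.5, 10                        # rate 1/2 slide per hour, r = 10 slides
X = gamma(a=r, scale=1 / lam)           # gamma/Erlang time to prepare 10 slides

print(X.sf(25))                         # P(X > 25) ~ 0.2014
print(poisson(mu=lam * 25).cdf(r - 1))  # same value via P(N <= 9) with mean 12.5

print(X.mean(), X.var()**0.5)           # 20 hours and sqrt(40) ~ 6.32 hours
print(X.ppf(0.95))                      # completion time reached with probability 0.95
```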