Expectations of Random Variables,
Functions of Random Variables
ECE 313
Probability with Engineering Applications
Lecture 15
Ravi K. Iyer
Dept. of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign
Today’s Topics
• Midterm exam statistics
• Review of Erlang, gamma, and hyperexponential distributions
• Example of a failure detector with detection latency
• Start on expectations
– Expectations of important random variables
– Moments: mean and variance
• Announcements
– Homework 7: based on your midterm exam, individual problems are assigned to you to solve. Check Compass.
– Midterms are available today; the grades are posted on Compass.
– The in-class activity is replaced by a brief concept quiz on Wednesday.
– Mini Project 2 grades will be posted this week.
– Final project dates will be announced soon.
Midterm Histogram
MIN: 33   MAX: 105   MEDIAN: 85   MEAN: 84.6
Exam Statistics
Midterm Exam    Q1     Q2     Q3     Q4     Bonus   Total
Median          34     15     20     17     5       85
Average         33.3   14.2   16.8   15.5   4.8     84.6
Min             15     0      4      5      1       33
Max             40     20     20     20     7       105
Erlang and Gamma Distribution
• When r sequential phases have independent identical
exponential distributions, the resulting density (pdf) is known as
r-stage (or r-phase) Erlang:
$$f(t) = \frac{\lambda^r t^{r-1} e^{-\lambda t}}{(r-1)!}, \qquad t > 0,\ \lambda > 0,\ r = 1, 2, \ldots$$

• The CDF (cumulative distribution function) is:

$$F(t) = 1 - \sum_{k=0}^{r-1} \frac{(\lambda t)^k}{k!}\, e^{-\lambda t}, \qquad t > 0,\ \lambda > 0,\ r = 1, 2, \ldots$$

• Also, the failure (hazard) rate is:

$$h(t) = \frac{\lambda^r t^{r-1}}{(r-1)!\, \sum_{k=0}^{r-1} \dfrac{(\lambda t)^k}{k!}}, \qquad t > 0,\ \lambda > 0,\ r = 1, 2, \ldots$$
Erlang and Gamma Distribution (cont.)
• The exponential distribution is a special case of the Erlang
distribution with r = 1.
• Consider a component subjected to an environment such that Nt, the number of peak stresses in the interval (0, t], is Poisson distributed with parameter λt.
• The component can withstand (r - 1) peak stresses, and the rth occurrence of a peak stress causes a failure.
• The component lifetime X is related to Nt so that these two events are equivalent:

$$[X > t] \Longleftrightarrow [N_t < r]$$
Erlang and Gamma Distribution (cont.)
• Thus:

$$R(t) = P(X > t) = P(N_t < r) = \sum_{k=0}^{r-1} P(N_t = k) = \sum_{k=0}^{r-1} \frac{(\lambda t)^k}{k!}\, e^{-\lambda t}$$

• F(t) = 1 - R(t) yields the previous formula:

$$F(t) = 1 - \sum_{k=0}^{r-1} \frac{(\lambda t)^k}{k!}\, e^{-\lambda t}, \qquad t > 0,\ \lambda > 0,\ r = 1, 2, \ldots$$

• Conclusion: the component lifetime has an r-stage Erlang distribution.
Gamma Function and Density
• If r (call it α) takes nonintegral values, then we get the gamma density:

$$f(t) = \frac{\lambda^{\alpha} t^{\alpha-1} e^{-\lambda t}}{\Gamma(\alpha)}, \qquad \alpha > 0,\ \lambda > 0,\ t > 0$$

where the gamma function is defined by the integral:

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx, \qquad \alpha > 0$$
Gamma Function and Density (cont.)
• The following properties of the gamma function are useful. Integration by parts shows that for α > 1:

$$\Gamma(\alpha) = (\alpha - 1)\,\Gamma(\alpha - 1)$$

• In particular, if α is a positive integer, denoted by n, then:

$$\Gamma(n) = (n - 1)!$$

• Other useful formulas related to the gamma function are:

$$\Gamma(1/2) = \sqrt{\pi}$$

and

$$\int_0^{\infty} x^{\alpha-1} e^{-\lambda x}\,dx = \frac{\Gamma(\alpha)}{\lambda^{\alpha}}$$
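For example, combining the recursion with Γ(1/2) = √π gives the half-integer values that appear in gamma densities with nonintegral shape:

$$\Gamma(5/2) = \tfrac{3}{2}\,\Gamma(3/2) = \tfrac{3}{2}\cdot\tfrac{1}{2}\,\Gamma(1/2) = \tfrac{3}{4}\sqrt{\pi}$$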
Hyperexponential Distribution
• A process with sequential phases gives rise to a
hypoexponential or an Erlang distribution, depending upon
whether or not the phases have identical distributions.
• If a process consists of alternate phases, i.e., during any single experiment the process experiences one and only one of the many alternate phases, and
• if these phases have independent exponential distributions, then
• the overall distribution is hyperexponential.
Hyperexponential Distribution (cont.)
• The density function of a k-phase hyperexponential random
variable is:
$$f(t) = \sum_{i=1}^{k} \alpha_i \lambda_i e^{-\lambda_i t}, \qquad t > 0,\ \lambda_i > 0,\ \alpha_i > 0,\ \sum_{i=1}^{k} \alpha_i = 1$$

• The distribution function is:

$$F(t) = \sum_i \alpha_i \left(1 - e^{-\lambda_i t}\right), \qquad t \ge 0$$

• The failure rate is:

$$h(t) = \frac{\sum_i \alpha_i \lambda_i e^{-\lambda_i t}}{\sum_i \alpha_i e^{-\lambda_i t}}, \qquad t > 0,$$

which is a decreasing failure rate, from $\sum_i \alpha_i \lambda_i$ at t = 0 down to min{λ1, λ2, ...} as t → ∞.
Hyperexponential Distribution (cont.)
• The hyperexponential is a special case of mixture distributions
that often arise in practice:
$$F(x) = \sum_i \alpha_i F_i(x), \qquad \sum_i \alpha_i = 1,\ \alpha_i \ge 0$$

• The hyperexponential distribution exhibits more variability than the exponential; e.g., the CPU service-time distribution in a computer system often exhibits this behavior.
• If a product is manufactured in several parallel assembly lines
and the outputs are merged, then the failure density of the
overall product is likely to be hyperexponential.
Example 3 (On-line Fault Detector)
• Consider a model consisting of a functional unit (e.g., an adder) together with an on-line fault detector.
• Let T and C denote the times to failure of the unit and the detector, respectively.
• After the unit fails, a finite time D (called the detection latency) is
required to detect the failure.
• Failure of the detector, however, is detected instantaneously.
Example 3 (On-line Fault Detector) cont.
• Let X denote the time to failure indication and Y denote the time
to failure occurrence (of either the detector or the unit).
• Then
X = min{T + D, C}
and
Y = min{T, C}.
• If the detector fails before the unit, then a false alarm is said to
have occurred.
• If the unit fails before the detector, then the unit keeps producing
erroneous output during the detection phase and thus
propagates the effect of the failure.
• The purpose of the detector is to reduce the detection time D.
Example 3 (On-line Fault Detector) cont.
• We define:
Real reliability: $R_r(t) = P(Y > t)$, and
Apparent reliability: $R_a(t) = P(X > t)$.
• A powerful detector will tend to narrow the gap between $R_r(t)$ and $R_a(t)$.
• Assume that T, D, and C are mutually independent and exponentially distributed with parameters λ, μ, and γ, respectively.
Example 3 (On-line Fault Detector) cont.
• Then Y is exponentially distributed with parameter λ + γ, and:

$$R_r(t) = e^{-(\lambda + \gamma)t}$$

• T + D is hypoexponentially distributed, so that:

$$F_{T+D}(t) = 1 - \frac{\mu}{\mu - \lambda}\, e^{-\lambda t} + \frac{\lambda}{\mu - \lambda}\, e^{-\mu t}$$
Example 3 (On-line Fault Detector) cont.
• And the apparent reliability is:

$$\begin{aligned}
R_a(t) &= P(X > t) \\
&= P(\min\{T + D,\, C\} > t) \\
&= P(T + D > t \text{ and } C > t) \\
&= P(T + D > t)\, P(C > t) \qquad \text{(by independence)} \\
&= [1 - F_{T+D}(t)]\, e^{-\gamma t} \\
&= \frac{\mu}{\mu - \lambda}\, e^{-(\lambda + \gamma)t} - \frac{\lambda}{\mu - \lambda}\, e^{-(\mu + \gamma)t}
\end{aligned}$$
Expectation of a Random Variable
• The Discrete Case: If X is a discrete random variable having a
probability mass function p(x), then the expected value of X is
defined by
$$E[X] = \sum_{x:\,p(x) > 0} x\, p(x)$$

The expected value of X is a weighted average of the possible values that X can take on, each value being weighted by the probability that X assumes that value. For example, if the probability mass function of X is given by

$$p(1) = p(2) = \tfrac{1}{2}$$

then

$$E[X] = 1\left(\tfrac{1}{2}\right) + 2\left(\tfrac{1}{2}\right) = \tfrac{3}{2}$$

is just an ordinary average of the two possible values 1 and 2 that X can assume.
Expectation of a Random Variable (Cont.)
Assume

$$p(1) = \tfrac{1}{3}, \qquad p(2) = \tfrac{2}{3}$$

Then

$$E[X] = 1\left(\tfrac{1}{3}\right) + 2\left(\tfrac{2}{3}\right) = \tfrac{5}{3}$$

is a weighted average of the two possible values 1 and 2, where the value 2 is given twice as much weight as the value 1 since p(2) = 2p(1).
– Find E[X] where X is the outcome when we roll a fair die.
– Solution: Since

$$p(1) = p(2) = p(3) = p(4) = p(5) = p(6) = \tfrac{1}{6},$$

we obtain

$$E[X] = 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + \cdots + 6\left(\tfrac{1}{6}\right) = \tfrac{7}{2}$$
Expectation of a Random Variable (Cont.)
• Expectation of a Bernoulli Random Variable: Calculate E[X] when X is a Bernoulli random variable with parameter p.
• Since p(0) = 1 - p and p(1) = p, we have:

$$E[X] = 0(1 - p) + 1(p) = p$$

• The expected number of successes in a single trial is just the probability that the trial will be a success.
Expectation of a Random Variable (Cont.)
• Expectation of a Binomial Random Variable: Calculate E[X] when X is binomially distributed with parameters n and p.
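A standard computation uses the identity $k\binom{n}{k} = n\binom{n-1}{k-1}$:

$$E[X] = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k} = np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} (1-p)^{n-k} = np,$$

since the last sum equals $[p + (1-p)]^{n-1} = 1$ by the binomial theorem. The expected number of successes in n independent trials is n times the per-trial success probability.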
Expectation of a Random Variable (Cont.)
• Expectation of a Geometric Random Variable: Calculate the expectation of a geometric random variable having parameter p.
• We have:

$$E[X] = \sum_{n=1}^{\infty} n p (1-p)^{n-1} = p \sum_{n=1}^{\infty} n q^{n-1}, \qquad \text{where } q = 1 - p$$

$$E[X] = p \sum_{n=1}^{\infty} \frac{d}{dq}(q^n)
       = p\, \frac{d}{dq}\left(\sum_{n=1}^{\infty} q^n\right)
       = p\, \frac{d}{dq}\left(\frac{q}{1-q}\right)
       = \frac{p}{(1-q)^2}
       = \frac{1}{p}$$

• The expected number of independent trials we need to perform until we get our first success equals the reciprocal of the probability that any one trial results in a success.
Expectation of a Random Variable (Cont.)
• Expectation of a Poisson Random Variable: Calculate E[X] if X is a Poisson random variable with parameter λ.
• Use the identity:

$$\sum_{k=0}^{\infty} \frac{\lambda^k}{k!} = e^{\lambda}$$
The Continuous Case
• The expected value of a continuous random variable: If X is a continuous random variable having a density function f(x), then the expected value of X is defined by:

$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$$
• Example: Expectation of a Uniform Random Variable. Calculate the expectation of a random variable uniformly distributed over (α, β). The expected value of a random variable uniformly distributed over the interval (α, β) is just the midpoint of the interval.
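Explicitly, with f(x) = 1/(β - α) on (α, β):

$$E[X] = \int_{\alpha}^{\beta} \frac{x}{\beta - \alpha}\,dx = \frac{\beta^2 - \alpha^2}{2(\beta - \alpha)} = \frac{\alpha + \beta}{2}$$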
The Continuous Case (Cont.)
• Expectation of an Exponential Random Variable: Let X be exponentially distributed with parameter λ. Calculate E[X].

$$E[X] = \int_0^{\infty} x \lambda e^{-\lambda x}\,dx$$

• Integrating by parts (u = x, dv = λe^{-λx} dx) yields:

$$E[X] = \left[-x e^{-\lambda x}\right]_0^{\infty} + \int_0^{\infty} e^{-\lambda x}\,dx = 0 + \left[-\frac{e^{-\lambda x}}{\lambda}\right]_0^{\infty} = \frac{1}{\lambda}$$
The Continuous Case Cont’d
• Expectation of a Normal Random Variable: X is normally distributed with parameters μ and σ²:

$$E[X] = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} x\, e^{-(x-\mu)^2/2\sigma^2}\,dx$$

• Writing x as (x - μ) + μ yields:

$$E[X] = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} (x-\mu)\, e^{-(x-\mu)^2/2\sigma^2}\,dx + \mu\, \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} e^{-(x-\mu)^2/2\sigma^2}\,dx$$

• Letting y = x - μ leads to:

$$E[X] = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} y\, e^{-y^2/2\sigma^2}\,dy + \mu \int_{-\infty}^{\infty} f(x)\,dx,$$

where f(x) is the normal density. By symmetry, the first integral must be 0, and so:

$$E[X] = \mu \int_{-\infty}^{\infty} f(x)\,dx = \mu$$
Example 1
• Consider the problem of searching for a specific name in a table of names. A simple
method is to scan the table sequentially, starting from one end, until we either find
the name or reach the other end, indicating that the required name is missing from
the table. The following is a C program fragment for sequential search:
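A minimal sketch of such a fragment (assuming the usual sentinel arrangement, with a copy of myName stored in Table[n] so the scan always terminates; the exact listing is not from the slides):

    #include <string.h>

    /* Sequential search with a sentinel: Table[n] holds a copy of myName,
     * so the loop needs no separate bounds check.  Returns the index of
     * the match; a return value of n signals an unsuccessful search.
     */
    int sequential_search(char *Table[], int n, const char *myName)
    {
        int I = 0;
        /* each iteration performs one "myName != Table[I]" comparison */
        while (strcmp(myName, Table[I]) != 0)
            I++;
        return I;
    }

With this arrangement an unsuccessful search performs exactly n + 1 comparisons, matching the range of X discussed on the next slide.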
Example 1 Cont’d
• In order to analyze the time required for sequential search, let X be the
discrete random variable denoting the number of comparisons
“myName≠Table[I]” made. The set of all possible values of X is {1,2,…,n+1},
and X=n+1 for unsuccessful searches.
• It is more interesting to consider a random variable Y that denotes the number of comparisons for a successful search. The set of all possible values of Y is {1,2,…,n}. To compute the average search time for a successful search, we must specify the pmf of Y. In the absence of any specific information, let us assume that Y is uniform over its range:

$$p_Y(i) = \frac{1}{n}, \qquad 1 \le i \le n.$$

• Then:

$$E[Y] = \sum_{i=1}^{n} i\, p_Y(i) = \frac{1}{n} \cdot \frac{n(n+1)}{2} = \frac{n+1}{2}$$

• Thus, on the average, approximately half the table needs to be searched.
Example 2
• If $\alpha_i$ denotes the access probability for name Table[i], then the average successful search time is $E[Y] = \sum_i i\,\alpha_i$. E[Y] is minimized when the names in the table are in order of nonincreasing access probabilities; that is, $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$.
• Assume, for example, that the access probabilities are inversely proportional to rank:

$$\alpha_i = \frac{c}{i}, \qquad 1 \le i \le n,$$

where the constant c is determined from the normalization requirement $\sum_i \alpha_i = 1$.
• Thus:

$$c = \frac{1}{\sum_{i=1}^{n} 1/i} = \frac{1}{H_n} \simeq \frac{1}{\ln(n) + C},$$

where $H_n = \sum_{i=1}^{n} (1/i)$ is the partial sum of the harmonic series and C (≈ 0.577) is the Euler constant.
• Now, if the names in the table are ordered as above, then the average search time is:

$$E[Y] = \sum_{i=1}^{n} i\,\alpha_i = \sum_{i=1}^{n} c = \frac{n}{H_n} \simeq \frac{n}{\ln(n) + C},$$

which is considerably less than the previous value (n+1)/2, for large n.
Example 3
• Zipf's law has been used to model the distribution of Web page requests [BRES 1999]. It has been found that $p_Y(i)$, the probability of a request for the ith most popular page, is inversely proportional to i [ALME 1996, WILL 1996]:

$$p_Y(i) = \frac{c}{i}, \qquad 1 \le i \le n,$$

where n is the total number of Web pages in the universe.
• We assume the Web page requests are independent and the cache can hold only m Web pages, regardless of the size of each Web page. If we adopt a removal policy called "least frequently used", which always keeps the m most popular pages, then the hit ratio h(m), the probability that a request can find its page in the cache, is given by:

$$h(m) = \sum_{i=1}^{m} p_Y(i) = c\,H_m = \frac{H_m}{H_n} \simeq \frac{\ln(m) + C}{\ln(n) + C},$$

where c = 1/H_n by the same normalization as before.
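The following short C sketch (illustrative, not from the slides) evaluates this hit ratio by computing the harmonic sums directly:

    #include <stdio.h>

    /* k-th harmonic number H_k = 1 + 1/2 + ... + 1/k */
    static double harmonic(int k)
    {
        double h = 0.0;
        for (int i = 1; i <= k; i++)
            h += 1.0 / i;
        return h;
    }

    int main(void)
    {
        int n = 1000000;                    /* total pages in the universe */
        for (int m = 10; m < n; m *= 100)   /* cache sizes 10, 1000, 100000 */
            printf("m = %6d   h(m) = %.3f\n", m, harmonic(m) / harmonic(n));
        return 0;
    }

Because H_m grows only logarithmically, even a cache holding a tiny fraction of the n pages achieves a substantial hit ratio.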
Moments
• Let X be a random variable, and define another random variable Y as a function of X, so that Y = φ(X). Suppose that we wish to compute E[Y]:

$$E[Y] = E[\phi(X)] =
\begin{cases}
\displaystyle\sum_i \phi(x_i)\, p_X(x_i), & \text{if } X \text{ is discrete,} \\[6pt]
\displaystyle\int_{-\infty}^{\infty} \phi(x)\, f_X(x)\,dx, & \text{if } X \text{ is continuous}
\end{cases}$$

(provided the sum or the integral on the right-hand side is absolutely convergent). A special case of interest is the power function φ(X) = X^k. For k = 1, 2, 3, …, E[X^k] is known as the kth moment of the random variable X. Note that the first moment E[X] is the ordinary expectation, or mean, of X. The second central moment, defined on the next slide, is the variance.
Variance: 2nd Central Moment
• We define the kth central moment, $\mu_k$, of the random variable X by:

$$\mu_k = E[(X - E[X])^k]$$

• $\mu_2$ is known as the variance of X, Var[X], often denoted by $\sigma^2 = E[(X - E[X])^2]$.
• Definition (Variance). The variance of a random variable X is:

$$\mathrm{Var}[X] = \sigma^2 =
\begin{cases}
\displaystyle\sum_i (x_i - E[X])^2\, p(x_i), & \text{if } X \text{ is discrete,} \\[6pt]
\displaystyle\int_{-\infty}^{\infty} (x - E[X])^2 f(x)\,dx, & \text{if } X \text{ is continuous}
\end{cases}$$

• It is clear that Var[X] is always a nonnegative number.
Functions of a Random Variable
• Let Y = φ(X) = X². As an example, X could denote the measurement error in a certain physical experiment, and Y would then be the square of the error (e.g., method of least squares).
• Note that $F_Y(y) = 0$ for y < 0. For y ≥ 0:

$$\begin{aligned}
F_Y(y) &= P(Y \le y) \\
&= P(X^2 \le y) \\
&= P(-\sqrt{y} \le X \le \sqrt{y}) \\
&= F_X(\sqrt{y}) - F_X(-\sqrt{y}),
\end{aligned}$$

and by differentiation the density of Y is:

$$f_Y(y) =
\begin{cases}
\dfrac{1}{2\sqrt{y}}\left[f_X(\sqrt{y}) + f_X(-\sqrt{y})\right], & y > 0, \\[6pt]
0, & \text{otherwise.}
\end{cases}$$
Functions of a Random Variable (cont.)
• Let X have the standard normal distribution N(0,1), so that:

$$f_X(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad -\infty < x < \infty.$$

• Then:

$$f_Y(y) =
\begin{cases}
\dfrac{1}{2\sqrt{y}}\left[\dfrac{1}{\sqrt{2\pi}}\, e^{-y/2} + \dfrac{1}{\sqrt{2\pi}}\, e^{-y/2}\right], & y > 0, \\[6pt]
0, & y \le 0,
\end{cases}$$

or:

$$f_Y(y) =
\begin{cases}
\dfrac{1}{\sqrt{2\pi y}}\, e^{-y/2}, & y > 0, \\[6pt]
0, & y \le 0.
\end{cases}$$

• This is a chi-squared distribution with one degree of freedom.
Functions of a Random Variable (cont.)
• Let X be uniformly distributed on (0,1). We show that $Y = -\frac{1}{\lambda}\ln(1 - X)$ has an exponential distribution with parameter λ > 0. Observe that Y is a nonnegative random variable, implying $F_Y(y) = 0$ for y ≤ 0.
• For y > 0, we have:

$$\begin{aligned}
F_Y(y) = P(Y \le y) &= P[-\lambda^{-1} \ln(1 - X) \le y] \\
&= P[\ln(1 - X) \ge -\lambda y] \\
&= P[(1 - X) \ge e^{-\lambda y}] \qquad (\text{since } e^x \text{ is an increasing function of } x) \\
&= P(X \le 1 - e^{-\lambda y}) \\
&= F_X(1 - e^{-\lambda y}).
\end{aligned}$$

But since X is uniform over (0,1), $F_X(x) = x$ for 0 ≤ x ≤ 1. Thus $F_Y(y) = 1 - e^{-\lambda y}$. Therefore Y is exponentially distributed with parameter λ.
• This fact can be used in a distribution-driven simulation. In simulation programs it is important to be able to generate values of variables with known distribution functions. Such values are known as random deviates or random variates. Most computer systems provide built-in functions to generate random deviates from the uniform distribution over (0,1), say u. Such random deviates are called random numbers.
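A minimal C sketch of this inverse-transform method (the standard library's rand() is used purely for illustration; a serious simulation would substitute a better uniform generator):

    #include <stdlib.h>
    #include <math.h>

    /* Exponential random deviate with rate lambda, via
     * Y = -(1/lambda) * ln(1 - U), where U is (approximately)
     * uniform on (0,1).
     */
    double exp_deviate(double lambda)
    {
        /* shift/scale keeps u strictly inside (0,1), avoiding log(0) */
        double u = (rand() + 1.0) / ((double)RAND_MAX + 2.0);
        return -log(1.0 - u) / lambda;
    }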
Example 1
• Let X be uniformly distributed on (0,1). We obtain the cumulative distribution function (CDF) of the random variable Y, defined by Y = Xⁿ, as follows: for 0 ≤ y ≤ 1,

$$\begin{aligned}
F_Y(y) &= P\{Y \le y\} \\
&= P\{X^n \le y\} \\
&= P\{X \le y^{1/n}\} \\
&= F_X(y^{1/n}) \\
&= y^{1/n}
\end{aligned}$$

• Now, the probability density function (PDF) of Y is given by:

$$f_Y(y) =
\begin{cases}
\dfrac{1}{n}\, y^{\frac{1}{n} - 1}, & 0 \le y \le 1, \\[6pt]
0, & \text{otherwise.}
\end{cases}$$
Expectation of a Function of a Random Variable
• Given a random variable X and its probability distribution or its pmf/pdf, we are interested in calculating not the expected value of X, but the expected value of some function of X, say, g(X).
• One way: since g(X) is itself a random variable, it must have a probability distribution, which should be computable from a knowledge of the distribution of X. Once we have obtained the distribution of g(X), we can then compute E[g(X)] by the definition of the expectation.
• Example 1: Suppose X has the following probability mass function:

$$p(0) = 0.2, \qquad p(1) = 0.5, \qquad p(2) = 0.3$$

Calculate E[X²].
• Letting Y = X², we have that Y is a random variable that can take on one of the values 0², 1², 2² with respective probabilities:

$$p_Y(0) = P\{Y = 0^2\} = 0.2, \qquad p_Y(1) = P\{Y = 1^2\} = 0.5, \qquad p_Y(4) = P\{Y = 2^2\} = 0.3$$

• Hence:

$$E[X^2] = E[Y] = 0(0.2) + 1(0.5) + 4(0.3) = 1.7$$

• Note that:

$$1.7 = E[X^2] \ne (E[X])^2 = 1.21$$
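Combining the two computed moments also gives the variance through the shortcut formula:

$$\mathrm{Var}[X] = E[X^2] - (E[X])^2 = 1.7 - (1.1)^2 = 0.49$$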
Expectation of a Function of a Random Variable (cont.)
• Proposition 2: (a) If X is a discrete random variable with probability mass function p(x), then for any real-valued function g:

$$E[g(X)] = \sum_{x:\,p(x) > 0} g(x)\, p(x)$$

• (b) If X is a continuous random variable with probability density function f(x), then for any real-valued function g:

$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx$$

• Example 3: Applying the proposition to Example 1 yields:

$$E[X^2] = 0^2(0.2) + 1^2(0.5) + 2^2(0.3) = 1.7$$

• Example 4: Applying the proposition to Example 2 yields:

$$E[X^3] = \int_0^1 x^3\,dx = \frac{1}{4} \qquad (\text{since } f(x) = 1,\ 0 \le x \le 1)$$
Corollary
• If a and b are constants, then E[aX + b] = aE[X] + b.
• The discrete case:

$$E[aX + b] = \sum_{x:\,p(x)>0} (ax + b)\, p(x) = a \sum_{x:\,p(x)>0} x\, p(x) + b \sum_{x:\,p(x)>0} p(x) = aE[X] + b$$

• The continuous case:

$$E[aX + b] = \int_{-\infty}^{\infty} (ax + b) f(x)\,dx = a \int_{-\infty}^{\infty} x f(x)\,dx + b \int_{-\infty}^{\infty} f(x)\,dx = aE[X] + b$$
Moments
• The expected value of a random variable X, E[X], is also
referred to as the mean or the first moment of X.
• The quantity E[Xⁿ], n ≥ 1, is called the nth moment of X. We have:

$$E[X^n] =
\begin{cases}
\displaystyle\sum_{x:\,p(x)>0} x^n\, p(x), & \text{if } X \text{ is discrete,} \\[6pt]
\displaystyle\int_{-\infty}^{\infty} x^n f(x)\,dx, & \text{if } X \text{ is continuous}
\end{cases}$$

• Another quantity of interest is the variance of a random variable X, denoted by Var(X), which is defined by:

$$\mathrm{Var}(X) = E[(X - E[X])^2]$$
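Expanding the square and applying the corollary E[aX + b] = aE[X] + b yields the equivalent form that is usually easier to compute (as in the E[X²] example earlier):

$$\mathrm{Var}(X) = E[X^2] - (E[X])^2$$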