Let us now return to studying the behavior of two random variables. Consider the following
problem:
Example:
Let x1 and x2 denote the proportions of two different chemicals found in a sample
mixture of chemicals used as an insecticide. Suppose x1 and x2 have a joint
probability density function given by:

f(x1, x2) = a   for 0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1, 0 ≤ x1 + x2 ≤ 1
          = 0   elsewhere

(Note: x1 and x2 can sum to at most 1, since they represent proportions.)
a) Find P{x1 ≤ 3/4, x2 ≤ 3/4}
b) Find P{x1 ≤ 1/2, x2 ≤ 1/2}
c) Find P{x1 ≤ 1/2 | x2 ≤ 1/2}
d) Find the marginal densities for x1 and x2
e) Are x1 and x2 independent?
f) Find P{x1 > 1/2 | x2 = 1/4}
Solution:
First, we must find what a is.
The support of f(x1, x2) is the triangle bounded by x1 = 0, x2 = 0, and the line x1 + x2 = 1 (x2 = 1 − x1).

∴ 1 = ∫_0^1 ∫_0^(1−x1) a dx2 dx1 = a ∫_0^1 (1 − x1) dx1 = a[x1 − x1²/2]_0^1 = a/2

∴ a = 2
a)

P{x1 ≤ 3/4, x2 ≤ 3/4} = ∫_0^(1/4) ∫_0^(3/4) 2 dx2 dx1 + ∫_(1/4)^(3/4) ∫_0^(1−x1) 2 dx2 dx1
                      = 2(1/4)(3/4) + 2 ∫_(1/4)^(3/4) (1 − x1) dx1
                      = 3/8 + 1/2 = 7/8

(The region of integration is the square [0, 3/4] × [0, 3/4] cut off by the line x1 + x2 = 1.)
b)

P[x1 ≤ 1/2, x2 ≤ 1/2] = ∫_0^(1/2) ∫_0^(1/2) 2 dx2 dx1 = 2 × (1/2) × (1/2) = 1/2

c)

P[x1 ≤ 1/2 | x2 ≤ 1/2] = P[x1 ≤ 1/2, x2 ≤ 1/2] / P[x2 ≤ 1/2] = (1/2) / P[x2 ≤ 1/2]

but

P[x2 ≤ 1/2] = [(1/2)(1/2)(2)] + [(1/2)(1/2)(1/2)(2)] = 1/2 + 1/4 = 3/4

(the square [0, 1/2] × [0, 1/2] plus the triangle between x1 = 1/2 and the line x1 + x2 = 1)

Then

P[x1 ≤ 1/2 | x2 ≤ 1/2] = (1/2)/(3/4) = 4/6 = 2/3
d) Marginal Densities

f_X1(x1) = ∫_(x2=0)^(1−x1) f_X1X2 dx2 = ∫_0^(1−x1) 2 dx2 = 2(1 − x1),   0 ≤ x1 ≤ 1

f_X2(x2) = ∫_(x1=0)^(1−x2) f_X1X2 dx1 = ∫_0^(1−x2) 2 dx1 = 2(1 − x2),   0 ≤ x2 ≤ 1

e) Note

f_X1(x1) · f_X2(x2) = 4(1 − x1)(1 − x2) ≠ f_X1X2(x1, x2) = 2   for 0 ≤ x1 + x2 ≤ 1

∴ x1 and x2 are not independent.
f) P[x1 > 1/2 | x2 = 1/4]

Slicing through f_X1X2 at x2 = 1/4:

f(x1, 1/4) = 2   for 0 ≤ x1 ≤ 1 − x2 = 3/4

Recall f_X1(x1) = 2(1 − x1) and f_X2(x2) = 2(1 − x2); hence f_X2(x2 = 1/4) = 3/2.

Then, from our identities,

f(x1 | x2 = 1/4) = f(x1, x2 = 1/4) / f(x2 = 1/4) = 2/(3/2) = 4/3   for 0 ≤ x1 ≤ 1 − x2 = 3/4

P[x1 > 1/2 | x2 = 1/4] = ∫_(1/2)^(3/4) f(x1 | x2 = 1/4) dx1 = ∫_(1/2)^(3/4) (4/3) dx1 = (4/3)(3/4 − 1/2) = 1/3

Also

P[x1 < 1/2 | x2 = 1/4] = (4/3) ∫_0^(1/2) dx1 = (4/3)(1/2) = 2/3

hence

P[x1 > 1/2 | x2 = 1/4] + P[x1 ≤ 1/2 | x2 = 1/4] = 1
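As a quick numerical sanity check on parts (a)-(c) and (f), here is a short Monte Carlo sketch (illustrative only, not part of the original notes) that samples uniformly from the triangular support and estimates the same probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Sample (x1, x2) uniformly over the triangle x1, x2 >= 0, x1 + x2 <= 1 (density f = 2 there).
u = rng.random((n, 2))
x1, x2 = u[:, 0].copy(), u[:, 1].copy()
flip = x1 + x2 > 1                     # reflect points that fall outside the triangle
x1[flip], x2[flip] = 1 - x1[flip], 1 - x2[flip]

print(np.mean((x1 <= 3/4) & (x2 <= 3/4)))        # part (a): ~7/8
both = np.mean((x1 <= 1/2) & (x2 <= 1/2))
print(both)                                       # part (b): ~1/2
print(both / np.mean(x2 <= 1/2))                  # part (c): ~2/3

# Part (f): condition on x2 lying in a thin band around 1/4 to approximate {x2 = 1/4}.
band = np.abs(x2 - 1/4) < 0.01
print(np.mean(x1[band] > 1/2))                    # ~1/3
```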
Example:
Two quality control inspectors each interrupt a production line at a randomly, but independently,
selected time within a given day (of 8 hours). Find the probability that the two interruptions will
be more than 4 hours apart.
Let

x_i = the hour of the interruption time of inspector i,   i = 1, 2

f(x_i) = 1/8   on (0, 8),        f(x1, x2) = (1/8)(1/8) = 1/64

Then we want P{|x1 − x2| > 4}.

The region of interest consists of the two corner triangles of the 8 × 8 square where x1 − x2 > 4
or x2 − x1 > 4 (i.e. outside the band between the lines x1 − x2 = −4 and x1 − x2 = 4).

P{|x1 − x2| > 4} = ∫_4^8 ∫_0^(x1−4) (1/64) dx2 dx1 + ∫_0^4 ∫_(x1+4)^8 (1/64) dx2 dx1
                 = 1/8 + 1/8 = 1/4
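The 1/4 answer is easy to confirm by simulation; the following sketch (illustrative only) draws both interruption times uniformly on (0, 8):

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.uniform(0, 8, size=1_000_000)   # inspector 1 interruption time
x2 = rng.uniform(0, 8, size=1_000_000)   # inspector 2 interruption time

# Fraction of days on which the interruptions are more than 4 hours apart.
print(np.mean(np.abs(x1 - x2) > 4))      # ~0.25
```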
When speaking about two random variables it is common to talk about covariance.

Covariance is a measure of the statistical relationship between two r.v. For example, if X tends to be
large (small) at the same time that Y is large (small), the covariance will be positive. If X tends to be
large (small) at the same time that Y is small (large), the covariance will be negative.

Covariance is defined as

cov(X1, X2) = E[(X1 − M1)(X2 − M2)] = E[X1 X2] − M1 M2

where M1 = E[X1] and M2 = E[X2].

If X1 and X2 are independent:

cov(X1, X2) = E[X1 X2] − M1 M2 = E[X1]E[X2] − M1 M2 ≡ 0

∴ Independence implies cov = 0,
but cov = 0 does not imply independence.
If Y1, Y2, ..., Yn and X1, X2, ..., Xm are r.v. with Mi = E[Yi] and ξi = E[Xi], and

U1 = Σ_(i=1)^n ai Yi        U2 = Σ_(i=1)^m bi Xi   (linear combinations)

then

a) E[U1] = Σ_(i=1)^n ai Mi

b) V(U1) = E[U1 − E(U1)]²

         = E[ Σ_(i=1)^n ai Yi − Σ_(i=1)^n ai Mi ]²

         = E[ Σ_(i=1)^n ai (Yi − Mi) ]²

         = E[ Σ_(i=1)^n ai²(Yi − Mi)² + ΣΣ_(i≠j) ai aj (Yi − Mi)(Yj − Mj) ]

         = Σ_(i=1)^n ai² E[(Yi − Mi)²] + ΣΣ_(i≠j) ai aj E[(Yi − Mi)(Yj − Mj)]

         = Σ_(i=1)^n ai² V(Yi) + ΣΣ_(i≠j) ai aj cov(Yi, Yj)

Since cov(Yi, Yj) = cov(Yj, Yi),

V(U1) = Σ_(i=1)^n ai² V(Yi) + 2 ΣΣ_(i<j) ai aj cov(Yi, Yj)

Similarly

cov(U1, U2) = E[{U1 − E(U1)}{U2 − E(U2)}] = Σ_(i=1)^n Σ_(j=1)^m ai bj cov(Yi, Xj)
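A small numeric sketch (illustrative, with an assumed covariance structure) that checks the V(U1) identity against a direct sample variance:

```python
import numpy as np

rng = np.random.default_rng(2)
a = np.array([1.0, -2.0, 0.5])                  # coefficients a_i (assumed for illustration)
cov = np.array([[2.0, 0.3, -0.4],
                [0.3, 1.0,  0.2],
                [-0.4, 0.2, 1.5]])              # cov(Y_i, Y_j)

# Formula: V(U1) = sum a_i^2 V(Y_i) + 2 * sum_{i<j} a_i a_j cov(Y_i, Y_j)  (i.e. a' C a)
v_formula = a @ cov @ a

# Direct check from simulated Y's with that covariance.
Y = rng.multivariate_normal(mean=np.zeros(3), cov=cov, size=500_000)
U1 = Y @ a
print(v_formula, U1.var())                      # the two numbers should agree closely
```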
Example 5-17
Table below shows the joint probability distribution of the number of defectives observed
on the first draw (X1) and the second draw (X2) from a box containing four good bulbs
and one defective bulb.
a) Find E(X1), V(X1), E(X2), and V(X2).
b) Find cov(X1, X2).
c) The variable Y = X1 + X2 denotes the total number of defectives observed on the two
   draws. Find E(Y) and V(Y).

                             X2 = 0    X2 = 1    Marginal prob. for X1
    X1 = 0                     .6        .2              .8
    X1 = 1                     .2         0              .2
    Marginal prob. for X2      .8        .2             1.0

Drawing bulbs from the box ⇒ 4 good and 1 defective.

a)  E[X1] = (0)(.8) + (1)(.2) = 0.2
    E[X1²] = (0)²(.8) + (1)²(.2) = 0.2
    ∴ V(X1) = E[X1²] − E[X1]² = .2 − (.04) = 0.16
    E[X2] = same,   V(X2) = same

b)  cov(X1, X2) = E[X1 X2] − E[X1]E[X2] = (0) − (.2)² = −.04

c)  Y = X1 + X2
    E[Y] = E[X1] + E[X2] = .2 + .2 = .4
    V(Y) = V(X1 + X2) = V(X1) + V(X2) + 2 cov(X1, X2)
         = 0.16 + 0.16 + (2)(−.04)
         = 0.24
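These numbers follow directly from the joint table; a short sketch (for checking only) that recomputes them:

```python
import numpy as np

# Joint probability table: rows are X1 = 0, 1; columns are X2 = 0, 1.
p = np.array([[0.6, 0.2],
              [0.2, 0.0]])
x = np.array([0, 1])

px1 = p.sum(axis=1)                    # marginal of X1: [.8, .2]
px2 = p.sum(axis=0)                    # marginal of X2: [.8, .2]

E1, E2 = x @ px1, x @ px2
V1, V2 = (x**2) @ px1 - E1**2, (x**2) @ px2 - E2**2
E12 = sum(p[i, j] * x[i] * x[j] for i in range(2) for j in range(2))
cov = E12 - E1 * E2

print(E1, V1, E2, V2)                  # 0.2 0.16 0.2 0.16
print(cov)                             # -0.04
print(E1 + E2, V1 + V2 + 2 * cov)      # E(Y) = 0.4, V(Y) = 0.24
```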
Conditional Expectations
E[X1 | X2 = x2] = ∫ x1 f(x1 | x2) dx1    if continuous
                = Σ_(x1) x1 p(x1 | x2)   if discrete
A soft-drink machine has a random supply Y2 at beginning of a given day and dispenses a random
amount Y1 during the day. It is not re-supplied during the day, so Y1 ≤ Y2 .
It has been observed that
1 / 2
f ( y1 y 2 ) = 
0
0 ≤ y1 ≤ y 2
0 ≤ y2 ≤ 2
elsewhere
That is, (y1,y2) are uniformly distributed over the triangle with given boundaries.
Find conditional distribution of Y1 given Y2=y2.
Solution:
The density is constant over the triangle 0 ≤ y1 ≤ y2 ≤ 2 (bounded above by the line y1 = y2).

Marginal density:

f2(y2) = ∫_(−∞)^∞ f(y1, y2) dy1 = ∫_0^(y2) (1/2) dy1 = (1/2) y2   for 0 ≤ y2 ≤ 2, and 0 elsewhere

Then,

f(y1 | y2) = f(y1, y2) / f2(y2) = (1/2)/((1/2) y2) = 1/y2   for 0 ≤ y1 ≤ y2 ≤ 2

Prob{less than 1/2 gallon is sold, given the machine contains 1 gallon at start}

= P[Y1 < 1/2 | Y2 = 1] = ∫_(−∞)^(1/2) f(y1 | y2 = 1) dy1 = ∫_0^(1/2) (1) dy1 = 1/2

If the machine had 2 gallons at the start, then P[Y1 ≤ 1/2 | Y2 = 2] ≡ 1/4. ∴ The amount sold is
highly dependent on the supply.

Find the conditional expectation of sales, Y1, given Y2 = 1:

E[Y1 | Y2 = 1] = ∫_(−∞)^∞ y1 f(y1 | y2 = 1) dy1 = ∫_0^1 y1 (1) dy1 = 1/2

If the machine contains 1 gallon at the start, the expected sale is 1/2 gallon.
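A brief Monte Carlo sketch (assumed set-up, not part of the original notes) confirms both the conditional probability and the conditional expectation by conditioning on supplies near 1 gallon:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2_000_000

# Sample (y1, y2) uniformly over the triangle 0 <= y1 <= y2 <= 2 (joint density 1/2).
y2 = 2 * np.sqrt(rng.random(n))      # marginal f2(y2) = y2/2 on (0, 2)
y1 = y2 * rng.random(n)              # given y2, y1 is uniform on (0, y2)

near1 = np.abs(y2 - 1.0) < 0.01      # approximate the event {Y2 = 1}
print(np.mean(y1[near1] < 0.5))      # ~1/2
print(y1[near1].mean())              # ~1/2 gallon expected sale
```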
Note

E[X1] = E_X2[ E_X1{X1 | X2} ]

since

E[X1] = ∫_(−∞)^∞ x1 f(x1) dx1
      = ∫_(−∞)^∞ x1 ∫ f(x1, x2) dx2 dx1
      = ∫ ∫ x1 f(x1 | x2) f(x2) dx2 dx1
      = ∫ [ ∫ x1 f(x1 | x2) dx1 ] f(x2) dx2
      = ∫_(−∞)^∞ E[X1 | X2 = x2] f(x2) dx2
Compounding
Very often, the parameters of a distribution are unknown, and sometimes are treated as
random quantities. Assigning distributions to these quantities is called compounding.
For example, let Y be the number of bacteria per cubic centimeter in a liquid, where Y has a Poisson
distribution with mean λ. Also, λ varies from location to location, and for a location
chosen at random, λ has a gamma distribution with parameters α and β.
Find the probability distribution for the bacteria count, Y, at a random location.
Solution:
Since λ is random, the Poisson assumption applies to the conditional
distribution of Y for fixed λ:

p(y | λ) = λ^y e^(−λ) / y!     y = 0, 1, 2, ...

Also

f(λ) = [1/(Γ(α) β^α)] λ^(α−1) e^(−λ/β)   for λ > 0
     = 0                                  elsewhere

The joint distribution of λ and Y is given, then, by

g(y, λ) = p(y | λ) f(λ) = [1/(y! Γ(α) β^α)] λ^(y+α−1) e^(−λ[1 + (1/β)])

The marginal distribution of Y is obtained by integrating over λ:

p(y) = [1/(y! Γ(α) β^α)] ∫_0^∞ λ^(y+α−1) e^(−λ[1 + (1/β)]) dλ
     = [1/(y! Γ(α) β^α)] Γ(y + α) [1 + 1/β]^(−(y+α))

Since α is an integer,

p(y) = [(y + α − 1)! / ((α − 1)! y!)] (1/β)^α (β/(1 + β))^(y+α)
     = C(y + α − 1, α − 1) (1/(1 + β))^α (β/(1 + β))^y

If we let n = y + α and p = 1/(1 + β),
this then has the form of the negative binomial, so we see that the negative binomial is a
reasonably good model for counts where the mean of the count may be random.
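The Poisson-gamma mixture can be checked numerically against the negative binomial; a sketch (using scipy, with arbitrarily chosen α and β) follows:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
alpha, beta = 3, 2.0                      # assumed gamma parameters, for illustration only

# Compound sampling: draw lambda from the gamma, then Y from Poisson(lambda).
lam = rng.gamma(shape=alpha, scale=beta, size=1_000_000)
y = rng.poisson(lam)

# Negative binomial with n = alpha and success probability p = 1/(1 + beta).
p = 1.0 / (1.0 + beta)
for k in range(5):
    print(k, np.mean(y == k), stats.nbinom.pmf(k, alpha, p))   # empirical vs. theoretical
```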
Birth and Death Processes
We have been concerned with counting the number of occurrences of a single type of
event (accidents, defects, etc.). Let us now consider the counts of two types of events, called
births and deaths. For illustrative purposes we shall think of birth as a cell division producing
two cells, and death as the removal of a cell from the system.
Birth and death processes are of use in modeling dynamics of population growth, which can
influence decisions on housing programs, food management, natural resource management, and a
host of economic matters.
Let
λ → birth rate
θ → death rate
Assume the probability of birth or death for an individual cell is independent of the size and age of
the population. Let Y(t) = size of the population at time t and

P[Y(t) = n] = Pn(t).

For an individual cell (the case n = 1), the probability of a division in a small time interval h is
λh + o(h) and the probability of a death is θh + o(h); the probability of more than one birth or
death in the population is of order o(h).
A differential equation for Pn(t) is developed as follows:
If the population is of size n at time (t + h), it must have been of size n − 1, n, or n + 1 at
time t. Thus

Pn(t + h) = λ(n − 1)h Pn−1(t) + [1 − nλh − nθh] Pn(t) + θ(n + 1)h Pn+1(t) + o(h)

or

(1/h)[Pn(t + h) − Pn(t)] = λ(n − 1)Pn−1(t) − n(λ + θ)Pn(t) + θ(n + 1)Pn+1(t) + o(h)/h

Let h → 0:

dPn(t)/dt = λ(n − 1)Pn−1(t) − n(λ + θ)Pn(t) + θ(n + 1)Pn+1(t)
While a general solution can be obtained, we are interested here in some special cases.

If θ = 0 (pure birth process):

dPn(t)/dt = λ(n − 1)Pn−1(t) − nλ Pn(t)

∴ Pn(t) = C(n − 1, i − 1) e^(−λit) (1 − e^(−λt))^(n−i),    n ≥ i

where i is the size of the population at time t = 0; that is, P[Y(0) = i] = 1.
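A Gillespie-style simulation of the pure birth (Yule) process — a sketch with an assumed λ, i, and time horizon, not from the notes — can be compared with this distribution:

```python
import numpy as np
from math import comb, exp

rng = np.random.default_rng(5)
lam, i0, t_end = 0.7, 2, 1.5          # assumed birth rate, initial size, horizon

def simulate_pure_birth():
    n, t = i0, 0.0
    while True:
        t += rng.exponential(1.0 / (lam * n))   # time until the next division
        if t > t_end:
            return n
        n += 1

samples = np.array([simulate_pure_birth() for _ in range(100_000)])

def pn(n):
    # Pn(t) = C(n-1, i-1) e^{-lam*i*t} (1 - e^{-lam*t})^{n-i}
    return comb(n - 1, i0 - 1) * exp(-lam * i0 * t_end) * (1 - exp(-lam * t_end)) ** (n - i0)

for n in range(i0, i0 + 5):
    print(n, np.mean(samples == n), pn(n))       # empirical vs. formula
```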
If θ > 0 we can show that the probability the population is extinct by time t, starting from i cells, is

P0(t) = [ (θ e^((λ−θ)t) − θ) / (λ e^((λ−θ)t) − θ) ]^i

Taking the limit of P0(t) as t → ∞ we get the probability of ultimate extinction of the population:

lim_(t→∞) P0(t) = (θ/λ)^i   if λ > θ
                = 1          if λ < θ
                = 1          if λ = θ

Also, E[Y(t)] = i e^((λ−θ)t).
Queues
Queue theory is concerned with probabilistic models governing behavior of "customers" arriving
at a certain station and demanding some kind of service. It may deal with people, automobiles,
telephone calls, breakdown of machines, etc. Consider a system that involves one station
dispensing service to customers on a first come, first served basis. Customers arrive according to a
Poisson process with rate λ per hour and depart according to an independent Poisson process with rate
θ per hour (this implies service time is an exponential r.v. with mean 1/θ). The probability of a customer
arrival in a small interval h is λh + o(h) and the probability of a departure is θh + o(h), with the
probability of more than one arrival (departure) being o(h). If Y(t) = number of customers in the system
(being serviced and waiting to be serviced) at time t, and if Pn(t) = P[Y(t) = n], then
dPn(t)/dt = λPn−1(t) − (λ + θ)Pn(t) + θPn+1(t)

It can be shown that for large t, this has a solution, Pn, which does not depend on t (called the
equilibrium distribution). If the solution is free of t it must satisfy

0 = λPn−1 − (λ + θ)Pn + θPn+1

A solution of this recursive equation is

Pn = (1 − λ/θ)(λ/θ)^n    n = 0, 1, 2, 3, ...,   λ < θ

This is the probability of there being n customers in the system at a time t which is far removed from
the start of the system.
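It is easy to verify numerically that this geometric form satisfies the balance equation and sums to one; a minimal sketch with an assumed λ and θ:

```python
lam, theta = 2.0, 5.0                      # assumed arrival and service rates (lam < theta)
rho = lam / theta

P = [(1 - rho) * rho**n for n in range(50)]

print(sum(P))                              # ~1 (the truncated tail is negligible)
# Balance equation 0 = lam*P[n-1] - (lam+theta)*P[n] + theta*P[n+1] for interior n:
for n in range(1, 5):
    print(lam * P[n - 1] - (lam + theta) * P[n] + theta * P[n + 1])   # ~0
```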
Functions of Two Random Variables
Given two random variables we can form another random variable which is a function of
the two given random variables.
Z = g (X , Y )
And we are interested in finding its distribution and its density
F_Z(z) = P[Z ≤ z]   and   f_Z(z),

both in terms of g(X, Y) and f_XY of X and Y, where f_XY is called the joint density
function.

Let D_z be the region of the x,y plane such that g(x, y) ≤ z (the region bounded by the curve
g(x, y) = z).

We can see that

{Z ≤ z} = {g(x, y) ≤ z} = {(x, y) ∈ D_z}

∴ F_Z(z) = P[Z ≤ z] = P{(x, y) ∈ D_z} = ∫∫_(D_z) f_XY(x, y) dx dy

Also, for the event that z ≤ Z ≤ z + dz (the thin region ΔD_z between the curves g(x, y) = z and
g(x, y) = z + dz),

P{z ≤ g(x, y) ≤ z + dz} = f_Z(z) dz = ∫∫_(ΔD_z) f_XY(x, y) dx dy

which can also be obtained by differentiating F_Z(z).

If x and y are discrete,

P{Z = z_r} = Σ P{X = x_k, Y = y_n} = Σ p_kn

where the sum runs over all points (x_k, y_n) on the curve z_r = g(x, y).
Example:

Z = g(X, Y) = X + Y

We note that D_z (such that g(x, y) ≤ z, or x + y ≤ z) is the half plane below and to the left of the
line x + y = z.

Integrating,

F_Z(z) = P{X + Y ≤ z} = ∫_(−∞)^∞ ∫_(−∞)^(z−y) f_XY(x, y) dx dy

Differentiating with respect to z (Leibniz' theorem),

f_Z(z) = ∫_(−∞)^∞ f_XY(z − y, y) dy

We can see that

f_Z(z) dz = P{z ≤ Z ≤ z + dz} = P{z ≤ X + Y ≤ z + dz}

is the probability mass of the strip of width dz between the lines x + y = z and x + y = z + dz.

We can generalize: if x and y are independent (convolution integral),

f_Z(z) = ∫_(−∞)^∞ f_X(z − y) f_Y(y) dy = ∫_(−∞)^∞ f_X(x) f_Y(z − x) dx

i.e. the density of their sum = the convolution of their densities.

Hence, if we have

W = aX + bY   with   X1 = aX,  Y1 = bY,

∴ f_X1(x1) = (1/|a|) f_X(x1/a),    f_Y1(y1) = (1/|b|) f_Y(y1/b)   → from functions of one random variable

∴ if X and Y are independent,

f_W(w) = (1/|ab|) ∫_(−∞)^∞ f_X((w − y1)/a) f_Y(y1/b) dy1
Example:

Two trains arrive at a station some time in the interval (0, T). The arrivals are random
and independent of each other. Let Z = the time interval between the arrivals; we want
to find the density of Z.

Let x and y be the arrival times of the trains.

∴ Z = |X − Y|

First we will find the density of t = x − y, where

f(x) = f(y) = 1/T   on (0, T)

The convolution of the two densities,

f_t(t) = ∫_(−∞)^∞ f_x(t − y) f_y(−y) dy,

is a triangle on (−T, T) with peak 1/T at t = 0, and since Z = |t|,

f_Z(z) = 2 f_t(z) U(z),

which plots as a line falling from 2/T at z = 0 to 0 at z = T.

This was obtained by considering the sliding of f_x(t − y) onto f_y(−y), as follows, with t = x − y.
The convolution of the two functions is

f_t(t) = ∫_(−∞)^∞ f_y(−y) f_x(t − y) dy

where

f_x(t − y) = 1/T   for t − T ≤ y ≤ t,   0 elsewhere.

Note: the limits were developed by observing that

f_x(t − y) = 1/T   for 0 ≤ x ≤ T, i.e. 0 ≤ t − y ≤ T, i.e. t − T ≤ y ≤ t,   0 elsewhere,

and

f_Y(−y) = 1/T   for −T ≤ y ≤ 0,   0 elsewhere.

This has the effect of sliding f_x(t − y) over f_Y(−y). Hence, while the two rectangles overlap from
the left,

∫ f_x(t − y) f_y(−y) dy = (T + t)/T²    for −T ≤ t ≤ 0,

and as f_x(t − y) slides further, into quadrant 1,

∫ f_x(t − y) f_y(−y) dy = (T − t)/T²    for 0 ≤ t ≤ T.

This plots as the triangular density on (−T, T); doubling it for z = |t| ≥ 0 gives

f_Z(z) = 2(T − z)/T²,    0 ≤ z ≤ T.
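A quick simulation sketch (with an assumed T = 8, purely illustrative) comparing the interval between two independent uniform arrivals with the density 2(T − z)/T²:

```python
import numpy as np

rng = np.random.default_rng(6)
T = 8.0
x = rng.uniform(0, T, size=1_000_000)
y = rng.uniform(0, T, size=1_000_000)
z = np.abs(x - y)                               # interval between the two arrivals

# Compare an empirical histogram with f_Z(z) = 2(T - z)/T^2 on (0, T).
hist, edges = np.histogram(z, bins=8, range=(0, T), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for c, h in zip(centers, hist):
    print(f"z={c:.1f}  empirical={h:.4f}  formula={2 * (T - c) / T**2:.4f}")
```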
Now, if x and y are independent, with

f_x(x) = α e^(−αx) U(x),    f_y(y) = β e^(−βy) U(y),    z = x + y,

∴ f_z(z) = αβ ∫_0^z e^(−α(z−y)) e^(−βy) dy
         = [αβ/(β − α)] (e^(−αz) − e^(−βz))   for β ≠ α, z > 0
         = α² z e^(−αz)                        for β = α, z > 0
         = 0                                   for z < 0
Example:

Suppose X and Y are exponential with parameter λ. Then

f_Z(t) = ∫_(−∞)^∞ f_Y(y) f_X(t − y) dy

with

f_Y(y) = λ e^(−λy)           for y ≥ 0,   0 otherwise
f_X(t − y) = λ e^(−λ(t−y))   for y ≤ t,   0 otherwise

f_Z(t) = ∫_0^t λ e^(−λy) λ e^(−λ(t−y)) dy = λ² e^(−λt) ∫_0^t dy = λ² t e^(−λt)

f_Z(t) = λ² t e^(−λt)   for t > 0,   0 otherwise

Standby redundancy: Z is the lifetime of a system with one spare component.
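The λ²te^(−λt) result (a gamma density with shape 2) is easy to confirm by simulating a component plus its standby spare; a sketch with an assumed λ:

```python
import numpy as np

rng = np.random.default_rng(7)
lam = 0.5                                   # assumed failure rate
n = 1_000_000

# Lifetime of the original component plus the spare that replaces it.
z = rng.exponential(1 / lam, n) + rng.exponential(1 / lam, n)

hist, edges = np.histogram(z, bins=200, range=(0, 20), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for ti in (1.0, 2.0, 4.0, 8.0):
    k = np.argmin(np.abs(centers - ti))
    print(f"t={ti}  empirical={hist[k]:.4f}  formula={lam**2 * ti * np.exp(-lam * ti):.4f}")
```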
Example

Consider the following function:

Z = X² + Y²

∴ z = x² + y² ⇒ D_z is the disk x² + y² ≤ z, and

F_Z(z) = ∫∫_(x²+y²≤z) f_XY(x, y) dx dy

With

f_XY(x, y) = [1/(2πσ²)] e^(−(x²+y²)/2σ²)

and changing to polar coordinates, x = r cos θ, y = r sin θ,

F_Z(z) = ∫∫_(D_z) [1/(2πσ²)] e^(−(x²+y²)/2σ²) dx dy
       = [1/(2πσ²)] ∫_0^(√z) 2πr e^(−r²/2σ²) dr
       = 1 − e^(−z/2σ²),    z > 0

∴ f_Z(z) = [1/(2σ²)] e^(−z/2σ²) U(z)

Alternatively,

F_Z(z) = [1/(2πσ²)] ∫_0^(√z) ∫_0^(2π) e^(−r²/2σ²) (r dθ dr)
       = (1/σ²) ∫_0^(√z) r e^(−r²/2σ²) dr
       = [−e^(−r²/2σ²)]_0^(√z)
       = 1 − e^(−z/2σ²),    z > 0

and the density function follows.
Example

Consider the function described as

Z = +√(X² + Y²)

with

f_XY(x, y) = [1/(2πσ²)] e^(−(x²+y²)/2σ²)

Now D_z is the disk of radius z, so

F_Z(z) = [1/(2πσ²)] ∫_0^z 2πr e^(−r²/2σ²) dr = 1 − e^(−z²/2σ²),    z > 0

∴ f_Z(z) = (z/σ²) e^(−z²/2σ²) U(z)   → Rayleigh density

We note

E[z] = σ √(π/2),    E[z²] = 2σ²,

∴ σ_z² = (2 − π/2) σ²
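A simulation sketch (assumed σ = 1) that compares √(X² + Y²) for independent zero-mean normals with the Rayleigh density and its moments:

```python
import numpy as np

rng = np.random.default_rng(8)
sigma = 1.0
x = rng.normal(0, sigma, 1_000_000)
y = rng.normal(0, sigma, 1_000_000)
z = np.hypot(x, y)                                    # sqrt(x^2 + y^2)

print(z.mean(), sigma * np.sqrt(np.pi / 2))           # E[z] vs sigma*sqrt(pi/2)
print((z**2).mean(), 2 * sigma**2)                    # E[z^2] vs 2*sigma^2

# Spot-check the density f_Z(z) = (z / sigma^2) exp(-z^2 / (2 sigma^2)) at z0 = 1.
z0 = 1.0
hist, edges = np.histogram(z, bins=100, range=(0, 4), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
k = np.argmin(np.abs(centers - z0))
print(hist[k], (z0 / sigma**2) * np.exp(-z0**2 / (2 * sigma**2)))
```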
Functions of Two Random Variables
We consider, now, two functions of two random variables. Given
z = g(x, y)
w = h(x, y)

The random variables have a joint distribution and density,

F_zw(z, w),   f_zw(z, w).

Letting z and w be two real numbers, D_zw corresponds to the region of the xy plane such that

g(x, y) ≤ z   and   h(x, y) ≤ w

∴ {Z ≤ z, W ≤ w} = {(x, y) ∈ D_zw}

∴ F_zw(z, w) = P{Z ≤ z, W ≤ w} = P{(x, y) ∈ D_zw} = ∫∫_(D_zw) f_xy(x, y) dx dy
Example

z = +√(x² + y²),    w = arctan(x/y),    −π/2 < w ≤ π/2

For z > 0 and |w| ≤ π/2, D_zw is such that

√(x² + y²) ≤ z   and   arctan(x/y) ≤ w,

i.e. the sector of the disk of radius z swept out up to polar angle w.

If

f_XY(x, y) = [1/(2πσ²)] e^(−(x²+y²)/2σ²),   with x = r cos θ, y = r sin θ,

then

F_zw(z, w) = (1 − e^(−z²/2σ²)) (π + 2w)/(2π)   for |w| < π/2, z > 0
           = (1 − e^(−z²/2σ²))                  for w > π/2, z > 0

(this holds for all points inside the circle of radius z).

We observe that F_zw factors into a product F_z(z) F_w(w), so z and w are independent.
Note that for z < 0 or w < −π/2, F_zw = 0.

We conclude

F_z(z) = (1 − e^(−z²/2σ²)) U(z)   → as before, the Rayleigh distribution

and

F_w(w) = 1                 for w > π/2
       = (π + 2w)/(2π)     for |w| ≤ π/2
       = 0                 for w ≤ −π/2

→ which is a uniform distribution on (−π/2, π/2].
Jacobian Transformation

We can determine the joint density of z and w in terms of f_xy and J(x, y) utilizing a Jacobian
transformation, where

J(x, y) = Jacobian = | ∂g(x,y)/∂x   ∂g(x,y)/∂y |
                     | ∂h(x,y)/∂x   ∂h(x,y)/∂y |

Theorem

To determine f_zw(z, w):

1) Solve g(x, y) = z, h(x, y) = w for x and y in terms of z and w.
2) If (x1, y1), ..., (xn, yn) are all the real solutions, then

   f_zw(z, w) = f_xy(x1, y1)/|J(x1, y1)| + ... + f_xy(xn, yn)/|J(xn, yn)|

3) If for certain values (z, w) no real solutions exist, then f_zw(z, w) ≡ 0.
Example: Consider the simple linear functions (underlined variables are random variables)

z = aX + bY
w = cX + dY

Then the system

z = ax + by   and   w = cx + dy

has a unique solution

x1 = a1 z + b1 w,    y1 = c1 z + d1 w

for any z and w, where a1, b1, c1, d1 are functions of a, b, c, d. The Jacobian of the inverse
transformation is

| a1  b1 |
| c1  d1 |  = a1 d1 − b1 c1 = 1/(ad − bc)

∴ f_zw(z, w) = [1/|ad − bc|] f_xy(a1 z + b1 w, c1 z + d1 w)
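A numerical sketch of the linear case (assumed coefficients, independent standard normal X and Y) that compares the Jacobian-formula density with a histogram estimate:

```python
import numpy as np

rng = np.random.default_rng(9)
a, b, c, d = 2.0, 1.0, 1.0, -1.0                  # assumed coefficients
det = a * d - b * c

# Inverse transform coefficients: x = a1*z + b1*w, y = c1*z + d1*w.
a1, b1 = d / det, -b / det
c1, d1 = -c / det, a / det

def f_xy(x, y):                                    # independent standard normals
    return np.exp(-(x**2 + y**2) / 2) / (2 * np.pi)

def f_zw(z, w):                                    # Jacobian-transform formula
    return f_xy(a1 * z + b1 * w, c1 * z + d1 * w) / abs(det)

x = rng.normal(size=2_000_000)
y = rng.normal(size=2_000_000)
z, w = a * x + b * y, c * x + d * y

# Empirical joint density near (z, w) = (1, 0) versus the formula.
dz = 0.1
mask = (np.abs(z - 1.0) < dz) & (np.abs(w - 0.0) < dz)
print(mask.mean() / (2 * dz) ** 2, f_zw(1.0, 0.0))
```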
Example:

z = +√(x² + y²),    w = x/y

∴ The system z = +√(x² + y²), w = x/y has two solutions (for any w):

z = √(w²y² + y²) = |y| √(w² + 1)

∴ y1 = z/√(1 + w²),    x1 = zw/√(1 + w²)
   x2 = −x1,            y2 = −y1

Now

J(x, y) = | ∂g(x,y)/∂x   ∂g(x,y)/∂y |   | x/√(x²+y²)   y/√(x²+y²) |
          | ∂h(x,y)/∂x   ∂h(x,y)/∂y | = |    1/y          −x/y²     |

        = −x²/(y² √(x²+y²)) − 1/√(x²+y²) = −w²/z − 1/z = −(w² + 1)/z

Hence

f_zw(z, w) = [z/(1 + w²)] [ f_xy( zw/√(1+w²), z/√(1+w²) ) + f_xy( −zw/√(1+w²), −z/√(1+w²) ) ]

and if z < 0, f_zw(z, w) ≡ 0.
Two Functions of Two Random Variables
Before we develop the theorem for transforming density functions of two functions of two r.v., let
us review some mathematics. Evaluating multiple integrals can frequently be simplified by
making changes in the independent variables - a coordinate transformation. We can consider the
handling of two functions of two r.v. as a coordinate transformation:

z = g(x, y)
w = h(x, y)        (1)
where g( ) and h( ) are continuous in their first partial derivatives in a region R, and the Jacobian,
J, does not vanish in the region R:

J(x, y) = | ∂z/∂x   ∂z/∂y |
          | ∂w/∂x   ∂w/∂y |        (2)

Now, equations (1) can be solved for x and y in terms of z and w to yield

x = θ1(z, w)
y = θ2(z, w)        (3)
If we now let z and w take on specific, fixed values, then

z0 = g(x, y)
w0 = h(x, y)        (4)

These equations determine two curves that intersect at a point (x0, y0) such that

x0 = θ1(z0, w0),    y0 = θ2(z0, w0)

[Sketch: the curves z = z0 and w = w0 crossing at (x0, y0).]

And if z and w are assigned a sequence of constant values ((z1, w1), (z2, w2), ..., (zn, wn)), then a
network of curves will be determined that intersect at the points (x1, y1), (x2, y2), ..., (xn, yn).

[Sketch: the network of curves z = z0, z1, z2 and w = w0, w1, w2 and their intersection points.]
Corresponding to any point whose rectangular coordinates are (x, y) there will be a pair of curves,
z = constant and w = constant, which pass through this point.
The totality of numbers (z, w) determines a curvilinear coordinate system, and the curves themselves
are called coordinate curves.

For example, if z = √(x² + y²) and w = tan⁻¹(y/x), then

z = constant defines a family of circles,
w = constant defines a family of radial lines.

The curvilinear system in this case is simply the ordinary polar coordinate system.
[Sketch: circles z = z1, z2, z3 and radial lines w = w1, w2, w3 forming the polar coordinate net.]
Now, in the cartesian (xy) coordinate system, the element of area dA = dx dy is the area of a
rectangle formed by the intersection of the coordinate lines

x = x0,   x = x0 + dx,   y = y0,   y = y0 + dy.

[Sketch: the rectangle with corners p1, p2, p3, p4 between y = y0, y = y0 + dy and x = x0, x = x0 + dx.]

In the curvilinear (zw) coordinates, the element of area dA can be visualized, or transformed, as
the area of the quadrilateral p1 p2 p3 p4 (call it Bi) formed by the intersection of the coordinate lines

z = z0,   z = z0 + dz,   w = w0,   w = w0 + dw.

[Sketch: the curvilinear quadrilateral Bi bounded by z = z0, z = z0 + dz, w = w0, w = w0 + dw.]
Consider the original problem, where we want the joint density function of our two new
functions of two r.v. Given

z = g(x, y)
w = h(x, y)        (5)

this system has some number of unique, real roots (solutions) (xi, yi), and these roots (points) (xi, yi)
transform the differential area A (= dz dw) into one or more differential quadrilaterals Bi.
The area of the ith quadrilateral equals (Sokolnikoff/Redheffer, Hildebrand)

dz dw / |J(xi, yi)|

where

J(x, y) = | ∂g/∂x   ∂g/∂y |
          | ∂h/∂x   ∂h/∂y |

[Sketch: the rectangle dz × dw in the zw plane and its preimage quadrilaterals around (x1, y1) and (x2, y2) in the xy plane.]

ΔD_zw = region of the xy plane such that   z ≤ g(x, y) ≤ z + dz,   w ≤ h(x, y) ≤ w + dw

Now

{z ≤ Z ≤ z + dz, w ≤ W ≤ w + dw} = {(x, y) ∈ ΔD_zw}

∴ P{z ≤ Z ≤ z + dz, w ≤ W ≤ w + dw} = f_zw(z, w) dz dw

∴ f_zw dz dw = ∫∫_(ΔD_zw) f_xy(x, y) dx dy        (6)

So, given z and w, ΔD_zw consists of several quadrilaterals, one for each solution (xn, yn). Hence the
transformed area of the nth quadrilateral is

dz dw / |J(xn, yn)|

and contains the probability mass

[f_xy(xn, yn) / |J(xn, yn)|] dz dw

Summing the masses of all the quadrilaterals, Equation (6) becomes

f_zw dz dw = [ f_xy(x1, y1)/|J(x1, y1)| + f_xy(x2, y2)/|J(x2, y2)| + ... ] dz dw

summed over all distinct roots.
To restate the argument formally: given

z = g(x, y)
w = h(x, y)        (5)

P{z ≤ Z ≤ z + dz, w ≤ W ≤ w + dw} = f_zw dz dw.

The solutions (xi, yi) of the system (5),

z = g(xi, yi),   w = h(xi, yi),

transform the differential rectangle A into one or more differential quadrilaterals Bi. The areas are
related through the Jacobian:

Area(Bi) |J(xi, yi)| = A,   so   Bi = A / |J(xi, yi)|

These are the same outcomes, and each Bi carries the probability mass

P{(x, y) ∈ Bi} = f_xy(xi, yi) Area(Bi) = [f_xy(xi, yi) / |J(xi, yi)|] dz dw     ← transformed area

Since

{z ≤ Z ≤ z + dz, w ≤ W ≤ w + dw} = ∪_i {(x, y) ∈ Bi},

f_zw(z, w) dz dw = Σ_(i=1)^n [f_xy(xi, yi) / |J(xi, yi)|] dz dw.        Q.E.D.
Functions of Independent Random Variables
We had, for two independent events A and B, P(AB) = P(A)P(B). We can define:

"Two random variables X and Y are called independent if the events {X ≤ x} and {Y ≤ y} are
independent for any x and y." That is,

P{X ≤ x, Y ≤ y} = P{X ≤ x} P{Y ≤ y}

∴ F_xy(x, y) = F_x(x) F_y(y)
   f_xy(x, y) = f_x(x) f_y(y)

Also

f_y(y | x) = f_y(y),    f_x(x | y) = f_x(x)

and

∫_(y1)^(y2) ∫_(x1)^(x2) f_xy(x, y) dx dy = ∫_(x1)^(x2) f_x(x) dx · ∫_(y1)^(y2) f_y(y) dy

Similarly, if we have discrete random variables,

P{X ≤ xk, Y ≤ yn} = P{X ≤ xk} P{Y ≤ yn}
Given z = g(x), w = h(y), and that x and y are independent, we have two events {Z ≤ z} and {W ≤ w}.

Letting

I_z = the set of numbers x such that g(x) ≤ z
I_w = the set of numbers y such that h(y) ≤ w

Then

{Z ≤ z} = {x ∈ I_z}
{W ≤ w} = {y ∈ I_w}        (*)

Since x and y are independent, the events in (*) are independent.

∴ {Z ≤ z} is independent of {W ≤ w}.

This could also be deduced from the joint probability density. Assuming

g(x) = z, h(y) = w has a single solution (x1, y1),

J = | g'(x)    0    | = g'(x1) h'(y1)
    |   0    h'(y)  |

and f_xy(x, y) = f_x(x) f_y(y),

∴ f_zw(z, w) = f_xy(x1, y1)/|J| = [f_x(x1)/|g'(x1)|] · [f_y(y1)/|h'(y1)|] = F1(x1) · F2(y1)

Since x1 is a function of z only and y1 is a function of w only, the joint density factors,

∴ z and w are independent.
Conditional Distributions
We can also talk about Conditional Distributions. We had
F_Y(y | M) = P[Y ≤ y | M] = P[Y ≤ y, M] / P[M]    when P[M] ≠ 0

Suppose

M = {X ≤ x},   ∴ P(M) = F_X(x)

∴ F_Y(y | X ≤ x) = P[X ≤ x, Y ≤ y] / P[X ≤ x] = F_xy(x, y) / F_x(x)

i.e. F_Y(y | X ≤ x) is the ratio of the probability mass in the region {X ≤ x, Y ≤ y} to the mass in the
region {X ≤ x}.

∴ f_Y(y | X ≤ x) = [∂F_xy(x, y)/∂y] / F_x(x)
                 = ∫_(−∞)^x f_xy(ξ, y) dξ / ∫_(−∞)^∞ ∫_(−∞)^x f_xy(ξ, y) dξ dy
Now if we suppose

M = {x1 < X ≤ x2},   P(M) = F(x2) − F(x1) ≠ 0,

we can write

F_Y(y | x1 < X ≤ x2) = P[x1 < X ≤ x2, Y ≤ y] / P[x1 < X ≤ x2] = [F_xy(x2, y) − F_xy(x1, y)] / [F_x(x2) − F_x(x1)]

f_Y(y | x1 < X ≤ x2) = ∫_(x1)^(x2) f_xy(x, y) dx / [F_x(x2) − F_x(x1)]

Note that f_Y(y | x1 < X ≤ x2) dy is the ratio of the mass in the small strip {x1 < x ≤ x2, y < Y ≤ y + dy}
to the mass in the big strip {x1 < x ≤ x2}.
We could also show, as we did previously, that if M = {X = x},

F_Y(y | X = x) = ∫_(−∞)^y f_xy(x, ρ) dρ / f_x(x)

f_Y(y | X = x) = f_xy(x, y) / f_x(x) = f_xy(x, y) / ∫_(−∞)^∞ f_xy(x, y) dy

Similarly

F_X(x | Y = y) = ∫_(−∞)^x f_xy(x, y) dx / f_y(y)

f_X(x | Y = y) = f_xy(x, y) / f_y(y)

We have another form of Bayes' theorem:

f_Y(y | X = x) = f_X(x | Y = y) f_y(y) / f_x(x)

We will write these as

F(y | x), F(x | y),   f(y | x), f(x | y).
Total Probability
From

f_Y(y) = ∫_(−∞)^∞ f_xy(x, y) dx   and   f_xy(x, y) = f_Y(y | X = x) f_x(x),

f_Y(y) = ∫_(−∞)^∞ f_Y(y | X = x) f_x(x) dx
Joint Density
Assume that F(x, y) has partial derivatives up to order 2. We define

f(x, y) = ∂²F(x, y) / ∂x ∂y

Hence, as with the marginal density,

f(x, y) dx dy = P{x < X ≤ x + dx, y < Y ≤ y + dy}

We note

P{x < X ≤ x + dx, Y ≤ y} = F(x + dx, y) − F(x, y) = [∂F(x, y)/∂x] dx

∴ P{X ≤ x, y < Y ≤ y + dy} = [∂F(x, y)/∂y] dy

Consider the event that (x, y) lies in a region D of the xy plane:

{(x, y) ∈ D} = all outcomes ξ such that a point with the coordinates x(ξ), y(ξ) is in D.

Breaking D into elemental areas dx dy and summing on x and on y in D,

{(x, y) ∈ D} = {x < X ≤ x + dx1, y < Y ≤ y + dy1} + { ... } + ...

∴ P{(x, y) ∈ D} = ∫∫_D f(x, y) dx dy

We can then write

F(x, y) = ∫_(−∞)^y ∫_(−∞)^x f(x, y) dx dy

And

∫_(−∞)^∞ ∫_(−∞)^∞ f(x, y) dx dy = 1
We could also show this by differentiating the above equation for F(x, y) with respect to x.
Using Leibniz' rule,

d/dα ∫_(u0(α))^(u1(α)) f(x, α) dx = f(u1, α) du1/dα − f(u0, α) du0/dα + ∫_(u0(α))^(u1(α)) ∂f(x, α)/∂α dx

we get

∂F(x, y)/∂x = ∫_(−∞)^y [ f(x, y′)·1 − 0 + ∫_(−∞)^x 0 dξ ] dy′ = ∫_(−∞)^y f(x, ρ) dρ

Similarly,

∂F(x, y)/∂y = ∫_(−∞)^x f(ξ, y) dξ
A physical interpretation of the density is that f(x,y) is analogous to mass density over the plane.
Relationship Between Marginal and Joint Density
We have shown

F_x(x) = F_xy(x, ∞) = ∫_(−∞)^∞ ∫_(−∞)^x f_xy(x, y) dx dy

Differentiating with respect to x, using Leibniz' rule,

f_x(x) = ∫_(−∞)^∞ f_xy(x, y) dy

∴ f_y(y) = ∫_(−∞)^∞ f_xy(x, y) dx

(Geometrically, f_x(x1) is the area under the curve f_xy(x1, y), viewed as a function of y, over the slice x = x1.)
Example:
Two quality control inspectors each interrupt a production line at random, but
independently, selected times within a given 8hr day. Find the probability that the
two interruptions will be more than 4 hours apart.
Solution:
Let

x_i = the hour of the interruption time of inspector i,   i = 1, 2

p(x_i) = 1/8,        p(x1, x2) = (1/8)(1/8) = 1/64
Then, what we want to find is

P{|x1 − x2| > 4} = ?    and    P(A) = ∫∫_(R_A) f_xy dx dy

Let's draw our sample diagram: the 8 × 8 square with the lines x2 = x1 + 4 (x1 − x2 = −4) and
x2 = x1 − 4 (x1 − x2 = 4). The event A consists of the two shaded corner triangles, above the first
line and below the second:

A:   ∫_0^4 ∫_(x1+4)^8 dx2 dx1 - region    together with    ∫_4^8 ∫_0^(x1−4) dx2 dx1 - region

So we need to integrate over the shaded areas:

P{|x1 − x2| > 4} = ∫_0^4 ∫_(x1+4)^8 (1/64) dx2 dx1 + ∫_4^8 ∫_0^(x1−4) (1/64) dx2 dx1
                 = 1/8 + 1/8 = 1/4

Note:
We could also have integrated over horizontal strips (using the lines x1 = x2 − 4 and x1 = x2 + 4):

A = ∫_4^8 ∫_0^(x2−4) (1/64) dx1 dx2   and   A = ∫_0^4 ∫_(x2+4)^8 (1/64) dx1 dx2

∴ P[|x1 − x2| > 4] = 1/8 + 1/8 = 1/4
Discrete Random Variables
What we have completed holds for continuous random variables. To incorporate discrete random
variables, we would have to use the double impulse function. Rather than do this, we introduce the
notion of point or line masses. Suppose x and y are both of discrete type, taking values xk and yn,
with

P{X = xk, Y = yn} = p_kn

∴ we have only point masses at the points (xk, yn). Since the entire mass in the plane ≡ 1,

Σ_(k,n) p_kn = 1

Now

{X = xk} = {X = xk, Y = y1} + {X = xk, Y = y2} + ...

∴ P{X = xk} = p_k = Σ_n P{X = xk, Y = yn}   → Marginal probability function:
  the mass of all points on the line X = xk.

Similarly

P{Y = yn} = q_n = Σ_k p_kn   → the mass of all points on the line Y = yn.
Example:
Rolling of a fair die.
a) We define our random variables as

X = i   and   Y = 2i,     i = 1, 2, ..., 6

There are only 6 point masses, at the points

(1,2) (2,4) (3,6) (4,8) (5,10) (6,12)

and each mass = 1/6. (Plotted, the masses lie on the line y = 2x.)
(b) Suppose we roll the die twice:

x = first appearing number
y = second appearing number

There are 36 outcomes, ∴ each mass = 1/36. (Plotted, the masses form the 6 × 6 lattice of points
(x, y), x, y = 1, ..., 6.)
c) Suppose we define

x = |k − n|
y = k + n

where k and n are the numbers appearing on the two rolls. The number of outcomes corresponding to
each (x, y) pair can be summarized in the following table (blank entries are zero):

            Y = 2   3   4   5   6   7   8   9  10  11  12
    X = 0       1       1       1       1       1       1
    X = 1           2       2       2       2       2
    X = 2               2       2       2       2
    X = 3                   2       2       2
    X = 4                       2       2
    X = 5                           2

The probabilities are obtained by dividing by 36.
(Plotted in the (x, y) plane, the entries above appear as point masses of 1/36 and 2/36.)

Adding the masses in each column or row we get P{X = x} and P{Y = y} respectively:

    x =       0      1      2      3      4      5
    P{x=}   6/36  10/36   8/36   6/36   4/36   2/36

    y =       2     3     4     5     6     7     8     9    10    11    12
    P{y=}   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
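The table and both marginals can be generated mechanically; a short sketch enumerating the 36 equally likely rolls:

```python
from collections import Counter
from fractions import Fraction

# Enumerate the 36 outcomes of two fair-die rolls and tabulate (x, y) = (|k - n|, k + n).
joint = Counter((abs(k - n), k + n) for k in range(1, 7) for n in range(1, 7))

px = Counter()
py = Counter()
for (x, y), count in joint.items():
    px[x] += count
    py[y] += count

print({x: Fraction(c, 36) for x, c in sorted(px.items())})   # 0: 1/6, 1: 5/18, 2: 2/9, ...
print({y: Fraction(c, 36) for y, c in sorted(py.items())})   # 2: 1/36, 3: 1/18, ..., 12: 1/36
```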