STATISTICAL LABORATORY, April 30th, 2010

BIVARIATE PROBABILITY DISTRIBUTIONS

Mario Romanazzi

1 MULTINOMIAL DISTRIBUTION

Ex1 Three players play 10 independent rounds of a game, and each player has probability 1/3 of winning each round.
1) Find the joint distribution of the numbers of games won by each of the three players.
2) What are the probabilities of the following events: $X_1 = X_2 = 5$, $X_1 = X_2 = 3$? (Rice, 3.3)

Solution.
1. Denote by $X_i$, $i = 1, 2, 3$, the number of games won by the $i$-th player. The joint distribution of $(X_1, X_2, X_3)$ is multinomial with parameters $n = 10$ (number of independent trials) and success probabilities $p_1 = p_2 = p_3 = 1/3$. The probability function is

$$P(X_1 = x_1, X_2 = x_2, X_3 = x_3) = \frac{10!}{x_1!\, x_2!\, (10 - x_1 - x_2)!}\, (1/3)^{x_1} (1/3)^{x_2} (1/3)^{10 - x_1 - x_2},$$

where $x_1$ and $x_2$ are integers satisfying $0 \le x_i \le 10$, $i = 1, 2$, and $x_1 + x_2 \le 10$, so that $x_3 = 10 - x_1 - x_2$.
2. We use R to answer the questions.

    > dmultinom(x = c(5, 5, 0), size = 10, prob = rep(1/3, 3))
    [1] 0.004267642
    > dmultinom(x = c(3, 3, 4), size = 10, prob = rep(1/3, 3))
    [1] 0.07112737

Ex2 Three cards are drawn at random and with replacement from a box containing 20 cards, each card bearing the name of a different Italian region. Recall that there are 8 northern regions (N), 4 central regions (C) and 8 southern regions (S). Let $(X_N, X_C, X_S)$ denote the numbers of regions drawn from the three areas.
1) What is the probability of no southern regions? Of one region from each area?
2) Describe the probability distribution of $X_C \mid X_N = 1$.

Solution.
1. With $n = 3$ draws and $p_N = 0.4$, $p_C = 0.2$, $p_S = 0.4$, we have $P(X_S = 0) = 0.6^3 = 0.216$ and $P(X_N = X_C = X_S = 1) = 3! \cdot 0.4 \cdot 0.2 \cdot 0.4 = 0.192$.
2. This is a binomial distribution: $X_C \mid X_N = 1 \sim Bi(2, 1/3)$, because each of the two remaining draws is central with conditional probability $0.2/0.6 = 1/3$. Its possible values are 0, 1, 2, with probabilities 4/9, 4/9, 1/9.

2 GENERAL BIVARIATE CONTINUOUS DISTRIBUTIONS

Ex1 A bivariate density function is defined as follows:

$$f_{X,Y}(x, y) = \begin{cases} 4x(1 - y), & 0 \le x \le 1,\ 0 \le y \le 1, \\ 0, & \text{elsewhere.} \end{cases}$$

1) What are the marginal distributions? Are they uniform?
2) Are $X$ and $Y$ stochastically independent?
3) Compute the joint cdf values at the points $(2, 1/2)$, $(-1/2, 1/2)$, $(1/2, 1/2)$.

Solution.
1. Marginal densities are obtained by integrating out the other variable.

$$f_X(x) = \int_0^1 f_{X,Y}(x, y)\,dy = 4x \int_0^1 (1 - y)\,dy = 2x, \quad 0 \le x \le 1, \text{ and 0 elsewhere,}$$

$$f_Y(y) = \int_0^1 f_{X,Y}(x, y)\,dx = 4(1 - y) \int_0^1 x\,dx = 2(1 - y), \quad 0 \le y \le 1, \text{ and 0 elsewhere.}$$

The marginal distributions are not uniform, because neither density is constant.
2. $X$ and $Y$ are stochastically independent because the joint density is identically equal to the product of the marginal densities: $f_{X,Y}(x, y) = f_X(x) f_Y(y)$ for all pairs of real numbers $(x, y)$.
3. Note that stochastic independence implies $F_{X,Y}(x, y) = F_X(x) F_Y(y)$, where $F_X$ and $F_Y$ are the marginal cdf's. Therefore

$$F_{X,Y}(2, 1/2) = F_X(2) F_Y(1/2) = F_Y(1/2) = \int_0^{1/2} 2(1 - y)\,dy = 3/4,$$

$$F_{X,Y}(-1/2, 1/2) = F_X(-1/2) F_Y(1/2) = 0,$$

$$F_{X,Y}(1/2, 1/2) = F_X(1/2) F_Y(1/2) = (3/4) F_X(1/2) = (3/2) \int_0^{1/2} x\,dx = 3/16.$$

Ex2 A bivariate density function is defined as follows:

$$f_{X,Y}(x, y) = \begin{cases} x + y, & 0 \le x \le 1,\ 0 \le y \le 1, \\ 0, & \text{elsewhere.} \end{cases}$$

1) Describe the contours of the bivariate density.
2) Derive the marginal distributions. Are $X$ and $Y$ stochastically independent?
3) Compute the joint cdf value at the point $(1/2, 1/2)$.
4) Obtain the conditional densities of $Y \mid X = x$ and $X \mid Y = y$.

Solution.
1. Observe that the joint pdf varies between 0 (at the point $(0, 0)$) and 2 (at the point $(1, 1)$). The contours are the subsets of the unit square $Q$ on which the density takes a constant value $0 \le c \le 2$, that is, $\{(x, y) \in Q : x + y = c\}$. Therefore, the contours are parallel segments; more precisely, they are the intersections of the parallel lines $x + y = c$ with $Q$. The figure below shows the plots of the contours and of the bivariate density.
The corresponding R code is

    > f <- function(x, y) x + y
    > # evaluate the density on a 50 x 50 grid over the unit square
    > x <- seq(0, 1, length.out = 50)
    > y <- seq(0, 1, length.out = 50)
    > z <- outer(x, y, f)
    > contour(x, y, z, col = "black", lty = "solid", asp = 1, lwd = 2,
    +     xlab = "X", ylab = "Y", main = "Contours of f(x,y) = x+y")
    > persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightblue",
    +     ltheta = 120, shade = 0.75, ticktype = "detailed", xlab = "X", ylab = "Y",
    +     zlab = "Density", main = "Plot of f(x,y) = x+y")

[Figure: two panels, "Contours of f(x,y) = x+y" (contour plot, axes X and Y) and "Plot of f(x,y) = x+y" (perspective plot, axes X, Y and Density).]

2. Marginal pdf's are

$$f_X(x) = \int_0^1 f_{X,Y}(x, y)\,dy = \int_0^1 (x + y)\,dy = x + 1/2, \quad 0 \le x \le 1, \text{ and 0 elsewhere,}$$

$$f_Y(y) = \int_0^1 f_{X,Y}(x, y)\,dx = \int_0^1 (x + y)\,dx = y + 1/2, \quad 0 \le y \le 1, \text{ and 0 elsewhere.}$$

Here $X$ and $Y$ are not independent, because the joint density is clearly not the product of the marginal densities:

$$f_{X,Y}(x, y) = x + y \ne (x + 1/2)(y + 1/2) = f_X(x) f_Y(y).$$

3. The joint cdf is

$$F_{X,Y}(x, y) = \begin{cases} 0, & x \le 0 \text{ or } y \le 0, \\ xy(x + y)/2, & 0 \le x \le 1,\ 0 \le y \le 1, \\ x(x + 1)/2, & 0 \le x \le 1,\ y \ge 1, \\ y(y + 1)/2, & 0 \le y \le 1,\ x \ge 1, \\ 1, & x \ge 1,\ y \ge 1. \end{cases}$$

Hence, $F_{X,Y}(1/2, 1/2) = 1/8$.
4. We use the definition of conditional density. For any fixed $0 \le x_0 \le 1$,

$$f_{Y \mid X = x_0}(y) = \frac{f_{X,Y}(x_0, y)}{f_X(x_0)} = \frac{x_0 + y}{x_0 + 1/2}, \quad 0 \le y \le 1, \text{ and 0 elsewhere.}$$

Similarly, for any fixed $0 \le y_0 \le 1$,

$$f_{X \mid Y = y_0}(x) = \frac{f_{X,Y}(x, y_0)}{f_Y(y_0)} = \frac{x + y_0}{y_0 + 1/2}, \quad 0 \le x \le 1, \text{ and 0 elsewhere.}$$

3 BIVARIATE NORMAL DISTRIBUTION

Ex1 $X$ and $Y$ have a bivariate normal distribution with parameters $\mu_X = 5$, $\mu_Y = 10$, $\sigma_X = 1$, $\sigma_Y = 5$ and $\rho > 0$. It is also known that $P(4 < Y < 16 \mid X = 5) = 0.954$. What is the value of $\rho$?

Solution. What is required is the conditional distribution $Y \mid X = 5$.
From the general properties of the bivariate normal distribution, $Y \mid X = x$ is a univariate normal distribution with mean function (regression function)

$$\mu_{Y \mid X}(x) = \mu_Y + \rho\, \frac{\sigma_Y}{\sigma_X}\, (x - \mu_X)$$

and standard deviation (not dependent on $x$)

$$\sigma_{Y \mid X} = \sigma_Y (1 - \rho^2)^{1/2}.$$

In the present case, replacing the known parameters and $x = 5$ gives

$$\mu_{Y \mid X}(5) = \mu_Y = 10, \quad \sigma_{Y \mid X} = 5 (1 - \rho^2)^{1/2}.$$

Now, standardizing the endpoints (note that $(16 - 10)/5 = 1.2$ and $(4 - 10)/5 = -1.2$),

$$P(4 < Y < 16 \mid X = 5) = F_{Y \mid X = 5}(16) - F_{Y \mid X = 5}(4) = F_{X_{ST}}\!\left(\frac{1.2}{(1 - \rho^2)^{1/2}}\right) - F_{X_{ST}}\!\left(\frac{-1.2}{(1 - \rho^2)^{1/2}}\right) = 2 F_{X_{ST}}\!\left(\frac{1.2}{(1 - \rho^2)^{1/2}}\right) - 1,$$

where $F_{X_{ST}}$ is the cdf of the standard normal distribution. Since this probability equals 0.954, we need

$$F_{X_{ST}}\!\left(\frac{1.2}{(1 - \rho^2)^{1/2}}\right) = 0.977,$$

and the solution is the quantile of the standard normal distribution of order $p = 0.977$, that is

$$\frac{1.2}{(1 - \rho^2)^{1/2}} = x^{(ST)}_{0.977}.$$

We use R to obtain an accurate value of $x^{(ST)}_{0.977}$.

    > qnorm(0.977, mean = 0, sd = 1)
    [1] 1.995393

The value of $\rho$ is the positive solution of the equation

$$\frac{1.2}{(1 - \rho^2)^{1/2}} = 1.995393,$$

that is, $\rho \simeq 0.799$.

Ex2 $X$ and $Y$ have a bivariate normal distribution with parameters $\mu_X = 20$, $\mu_Y = 40$, $\sigma_X = 3$, $\sigma_Y = 2$ and $\rho = 0.6$. What is the shortest interval for $Y \mid X = 22$ containing 90% of the probability?

Solution. $Y \mid X = 22$ has a normal distribution with parameters

$$\mu_{Y \mid X}(22) = 40 + 0.6 \cdot \frac{2}{3} \cdot (22 - 20) = 40.8, \quad \sigma_{Y \mid X} = 2 (1 - 0.6^2)^{1/2} = 1.6.$$

Because the normal density is symmetric and unimodal, the endpoints of the shortest interval with 90% probability are the quantiles of orders 0.05 and 0.95, respectively.

    > qnorm(c(0.05, 0.95), mean = 40.8, sd = 1.6)
    [1] 38.16823 43.43177
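
The two exercises of this section can also be checked numerically. The sketch below is not part of the original handout: the helper name `cond_normal` and the variable names are assumptions introduced for illustration. It only applies the conditional mean and standard deviation formulas stated above, together with the base R function `qnorm`.

```r
# Hypothetical helper (not from the handout): parameters of Y | X = x0
# for a bivariate normal, using the mean and sd formulas of this section.
cond_normal <- function(x0, mu_x, mu_y, sd_x, sd_y, rho) {
  list(
    mean = mu_y + rho * (sd_y / sd_x) * (x0 - mu_x),
    sd   = sd_y * sqrt(1 - rho^2)
  )
}

# Ex1: solve 1.2 / sqrt(1 - rho^2) = z for rho > 0,
# where z is the standard normal quantile of order 0.977.
z <- qnorm(0.977)
rho_ex1 <- sqrt(1 - (1.2 / z)^2)

# Ex2: conditional parameters at x0 = 22, then the central 90% interval.
par_ex2 <- cond_normal(22, mu_x = 20, mu_y = 40, sd_x = 3, sd_y = 2, rho = 0.6)
int_ex2 <- qnorm(c(0.05, 0.95), mean = par_ex2$mean, sd = par_ex2$sd)

print(round(rho_ex1, 3))
print(par_ex2)
print(int_ex2)
```

As a further check on Ex1, substituting `rho_ex1` back into `cond_normal(5, 5, 10, 1, 5, rho_ex1)` and evaluating `pnorm(16, ...) - pnorm(4, ...)` with the resulting mean and standard deviation should return a value close to 0.954.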