3 Conditional Expectation

3.1 The Discrete Case

Recall that for any two events E and F, the conditional probability of E given F is defined, whenever P(F) > 0, by

    P(E|F) = P(EF) / P(F).

Example. Bowl B1 contains two white chips, bowl B2 contains two red chips, bowl B3 contains two white and two red chips, and bowl B4 contains three white chips and one red chip. The probabilities of selecting bowl B1, B2, B3, or B4 are 1/2, 1/4, 1/8, and 1/8, respectively. A bowl is selected using these probabilities, and a chip is then drawn at random. Find
(a) P(W), the probability of drawing a white chip;
(b) P(B1|W), the conditional probability that bowl B1 was selected, given that a white chip was drawn.

If X and Y are discrete random variables, then it is natural to define the conditional probability mass function of X given that Y = y by

    pX|Y(x|y) = P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y) = p(x, y) / pY(y)

for all values of y such that P(Y = y) > 0. The conditional expectation of X given that Y = y is defined by

    E(X | Y = y) = Σ_x x P(X = x | Y = y) = Σ_x x pX|Y(x|y).

Lemma. If X and Y are independent, then pX|Y(x|y) = pX(x) and E(X | Y = y) = E(X).

Proof. By independence,

    pX|Y(x|y) = P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y) = P(X = x)P(Y = y) / P(Y = y) = pX(x).

Hence

    E(X | Y = y) = Σ_x x P(X = x | Y = y) = Σ_x x pX(x) = E(X).

Example. Suppose that the joint probability mass function p(x, y) of X and Y is given by

    p(1, 1) = 0.5,  p(1, 2) = 0.1,  p(2, 1) = 0.1,  p(2, 2) = 0.3.

Calculate the conditional probability mass function of X given that Y = 1.

Example. If X1 and X2 are independent binomial random variables with respective parameters (n1, p) and (n2, p), calculate the conditional probability mass function of X1 given that X1 + X2 = m. That is, compute P(X1 = k | X1 + X2 = m).

Example. If X and Y are independent Poisson random variables with respective means λ1 and λ2, calculate the conditional expected value of X given that X + Y = n.
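The two computational examples above (the bowl example and the conditional pmf example) can be checked with exact rational arithmetic. The following is an illustrative sketch, not part of the original notes; the numbers come from the examples above, while the variable names and the use of Python's Fraction type are mine.

```python
from fractions import Fraction as F

# --- Bowl example: law of total probability and Bayes' rule ---
prior = [F(1, 2), F(1, 4), F(1, 8), F(1, 8)]      # P(B1), ..., P(B4)
p_white = [F(1), F(0), F(1, 2), F(3, 4)]          # P(W | Bi) for each bowl

# (a) P(W) = sum_i P(W | Bi) P(Bi)
p_W = sum(b * w for b, w in zip(prior, p_white))

# (b) P(B1 | W) = P(W | B1) P(B1) / P(W)
p_B1_given_W = prior[0] * p_white[0] / p_W

print(p_W, p_B1_given_W)                          # 21/32 16/21

# --- Conditional pmf example: pX|Y(x | 1) = p(x, 1) / pY(1) ---
p = {(1, 1): F(1, 2), (1, 2): F(1, 10),
     (2, 1): F(1, 10), (2, 2): F(3, 10)}
pY1 = p[(1, 1)] + p[(2, 1)]                       # pY(1) = 3/5
print(p[(1, 1)] / pY1, p[(2, 1)] / pY1)           # 5/6 1/6
```

So a white chip makes bowl B1 more likely than its prior 1/2, and given Y = 1 the conditional pmf of X puts mass 5/6 on x = 1 and 1/6 on x = 2.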
3.2 The Continuous Case

If X and Y have a joint density function f(x, y), then the conditional probability density function of X, given Y = y, is defined for all values of y such that fY(y) > 0, by

    fX|Y(x|y) = f(x, y) / fY(y).

The conditional expectation of X, given that Y = y, is defined for all such y by

    E(X | Y = y) = ∫_{−∞}^{∞} x fX|Y(x|y) dx.

Example. Suppose the joint density of X and Y is given by

    f(x, y) = 6xy(2 − x − y) for 0 < x < 1, 0 < y < 1, and f(x, y) = 0 otherwise.

Compute the conditional expectation of X given that Y = y, where 0 < y < 1.

Example. Suppose the joint density of X and Y is given by

    f(x, y) = (1/2) y e^{−xy} for 0 < x < ∞, 0 < y < 2, and f(x, y) = 0 otherwise.

Compute E(e^{X/2} | Y = 1).

3.3 Computing Expectations by Conditioning

Let us denote by E(X | Y) the function of the random variable Y whose value at Y = y is E(X | Y = y). Note that E(X | Y) is itself a random variable: E(X | Y) = f(Y), where f(y) = E(X | Y = y).

An extremely important property of conditional expectation is that for all random variables X and Y,

    E(X) = E(E(X | Y)).

If Y is a discrete random variable, then

    E(X) = Σ_y E(X | Y = y)P(Y = y),

and, more generally, E(h(X)) = Σ_y E(h(X) | Y = y)P(Y = y). If Y is a continuous random variable, then

    E(X) = ∫_{−∞}^{∞} E(X | Y = y)fY(y) dy.

Example. (The Mean of the Geometric Distribution) A coin having probability p of coming up heads is flipped successively until the first head appears. What is the expected number of flips required? Hint: Condition on the result of the first flip.

Example. A miner is trapped in a mine containing three doors. The first door leads to a tunnel that takes him to safety after two hours of travel. The second door leads to a tunnel that returns him to the mine after three hours of travel. The third door leads to a tunnel that returns him to the mine after five hours.
Assuming that the miner is at all times equally likely to choose any one of the doors, what is the expected length of time until he reaches safety? Hint: Condition on the door he chooses first.

3.4 Computing Variances by Conditioning

Conditional expectations can also be used to compute the variance of a random variable. As we know, the variance can be computed by

    Var(X) = E(X²) − (E(X))².

The conditional variance of X given that Y = y is defined by

    Var(X | Y = y) = E[(X − E(X | Y = y))² | Y = y].

Hence,

    Var(X | Y) = E[(X − E(X | Y))² | Y] = E(X² | Y) − (E(X | Y))².

Example. Consider a geometric random variable X with parameter p. It is known that E(X) = 1/p. Compute Var(X).

Solution. Let N = 1 if the first trial is a success, and N = 0 otherwise. Conditioning on N,

    E(X²) = E(X² | N = 1)P(N = 1) + E(X² | N = 0)P(N = 0) = p + (1 − p)E((X + 1)²),

since after a first failure, the remaining number of trials has the same distribution as X. Solving gives E(X²) = (2 − p)/p². Thus

    Var(X) = E(X²) − (E(X))² = (1 − p)/p².

Proposition (Conditional Variance Formula). Var(X) = E[Var(X | Y)] + Var(E[X | Y]).

Example. (The Variance of a Compound Random Variable) Let X1, X2, ... be i.i.d. random variables with mean µ and variance σ², and assume that they are independent of a nonnegative integer-valued random variable N. Then S = Σ_{k=1}^{N} Xk is called a compound random variable. Find the variance of S.

Solution.

    Var(S | N = n) = Var(Σ_{k=1}^{N} Xk | N = n) = Var(Σ_{k=1}^{n} Xk) = nσ².

Similarly, E(S | N = n) = nµ. Hence Var(S | N) = Nσ² and E(S | N) = Nµ. By the conditional variance formula,

    Var(S) = E(Nσ²) + Var(Nµ) = σ²E(N) + µ²Var(N).

If N is a Poisson random variable with parameter λ, then S is called a compound Poisson random variable. In this case E(N) = Var(N) = λ, so

    Var(S) = λσ² + λµ² = λ(σ² + µ²) = λE(X²).

Example. Suppose that the number of cars that arrive at a gas station each day is a Poisson random variable with mean λ. Suppose further that each car that arrives is, independently, two-door with probability p and four-door with probability 1 − p.
Find the joint probability that exactly n two-door cars and m four-door cars visit the gas station.

Solution. Let N denote the total number of cars, and let N2 and N4 denote the numbers of two-door and four-door cars, respectively. Conditioning on N, and noting that N2 = n and N4 = m is possible only when N = n + m,

    P(N2 = n, N4 = m) = Σ_{j=0}^{∞} P(N2 = n, N4 = m | N = j)P(N = j)
                      = P(N2 = n, N4 = m | N = n + m)P(N = n + m)
                      = C(n + m, n) p^n (1 − p)^m e^{−λ} λ^{n+m} / (n + m)!.

3.5 Computing Probabilities by Conditioning

We may also use conditioning to compute probabilities. For any event E,

    P(E) = Σ_y P(E | Y = y)P(Y = y),            if Y is discrete,
    P(E) = ∫_{−∞}^{∞} P(E | Y = y)fY(y) dy,     if Y is continuous.

Example. Suppose that X and Y are independent continuous random variables having densities fX(x) and fY(y), respectively. Compute P(X < Y).

Example. An insurance company supposes that the number of accidents that each of its policyholders will have in a year is Poisson distributed, with a mean that is itself a random variable having density

    g(λ) = λe^{−λ},  λ ≥ 0

(a gamma density). What is the probability that a randomly chosen policyholder has exactly n accidents next year?

Solution.

    P(X = n) = ∫_0^∞ P(X = n | Y = λ)g(λ) dλ.

EXERCISES

1. Sam takes only two history classes: East Asian and American. Suppose that the number of misprints in a chapter of his East Asian history book is Poisson distributed with mean 5, and the number of misprints in a chapter of his American history book is Poisson distributed with mean 2. Because of the heavy assignments in his East Asian history class, he studies East Asian history more than American history: when he studies, he chooses one of his books and reads one chapter, choosing the American history book 40 percent of the time and the East Asian history book 60 percent of the time. What is the expected number of misprints that Sam will come across in one reading session?

2. Suppose that the expected number of accidents per week at an industrial plant is four. Suppose also that the numbers of workers injured in each accident are independent random variables with a common mean of 2.
Assume also that the number of workers injured in each accident is independent of the number of accidents that occur. What is the expected number of injuries during a week?

3. Following an aircraft accident, a detailed investigation is conducted. The probability that an accident due to structural failure is correctly identified as such is 0.9, and the probability that an accident not due to structural failure is incorrectly identified as due to structural failure is 0.2. If 25% of all aircraft accidents are due to structural failure, find the probability that an aircraft accident was due to structural failure given that it has been diagnosed as due to structural failure.
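After attempting Exercise 3 by hand, the answer can be verified numerically; it is the same conditioning computation as in Section 3.5 (and the bowl example of Section 3.1). The sketch below is mine, not part of the notes; the event names S and D are labels I chose.

```python
# Exercise 3 self-check via conditioning (Bayes' rule).
# S = "accident due to structural failure", D = "diagnosed as structural".
p_S = 0.25            # P(S): 25% of accidents are structural
p_D_given_S = 0.9     # P(D | S): correct identification
p_D_given_notS = 0.2  # P(D | not S): incorrect identification

# P(D) by conditioning on whether the accident was structural
p_D = p_D_given_S * p_S + p_D_given_notS * (1 - p_S)

# P(S | D) = P(D | S) P(S) / P(D)
p_S_given_D = p_D_given_S * p_S / p_D
print(round(p_S_given_D, 3))   # 0.6
```

Note that although the diagnosis is correct 90% of the time for structural failures, the posterior probability is only 0.6, because structural failures are a minority (25%) of all accidents.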