Section 2: Conditional Probability and Conditional Expectation

2.1 Introduction

One of the most useful concepts in probability theory is that of conditional probability and conditional expectation. The reason is twofold. First, in practice we are often interested in calculating probabilities and expectations when some partial information is available; hence the desired probabilities and expectations are conditional ones. Second, in calculating a desired probability or expectation it is often extremely useful to first condition on some appropriate random variable.

2.2 The Discrete Case

For any two events A and B, the conditional probability of A given B is defined, as long as P(B) > 0, by

    P(A | B) = P(A ∩ B) / P(B)

Hence, if X and Y are discrete random variables, it is natural to define the conditional probability mass function of X given that Y = y by

    p_{X|Y}(x | y) = P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y) = p(x, y) / p_Y(y)

for all values of y such that P(Y = y) > 0. Similarly, the conditional probability distribution function of X given that Y = y is defined, for all y such that P(Y = y) > 0, by

    F_{X|Y}(x | y) = P(X ≤ x | Y = y) = Σ_{a ≤ x} p_{X|Y}(a | y)

Finally, the conditional expectation of X given that Y = y is defined by

    E[X | Y = y] = Σ_x x P(X = x | Y = y) = Σ_x x p_{X|Y}(x | y)

If X is independent of Y, then the conditional mass function, distribution and expectation are the same as the unconditional ones. This follows since, if X is independent of Y, then

    p_{X|Y}(x | y) = P(X = x | Y = y) = P(X = x)

Example (2.1): Consider an experiment which results in one of three possible outcomes, with outcome i occurring with probability p_i, i = 1, 2, 3, and p_1 + p_2 + p_3 = 1. Suppose that n independent replications of this experiment are performed, and let X_i, i = 1, 2, 3, denote the number of times outcome i appears. Determine the conditional expectation of X_1 given that X_2 = m.
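The conditional pmf and conditional expectation just defined can be computed mechanically from any joint pmf. A minimal sketch follows, assuming an arbitrary illustrative joint distribution (the values are not from the text):

```python
# Sketch: conditional pmf and conditional expectation from a joint pmf.
# The joint distribution below is an arbitrary illustration, not from the text.
joint = {
    (1, 1): 0.2, (1, 2): 0.1,
    (2, 1): 0.3, (2, 2): 0.4,
}  # p(x, y) = P(X = x, Y = y)

def p_Y(y):
    """Marginal pmf of Y: p_Y(y) = sum over x of p(x, y)."""
    return sum(pr for (x, yy), pr in joint.items() if yy == y)

def cond_pmf(x, y):
    """p_{X|Y}(x | y) = p(x, y) / p_Y(y), defined when p_Y(y) > 0."""
    return joint.get((x, y), 0.0) / p_Y(y)

def cond_expectation(y):
    """E[X | Y = y] = sum over x of x * p_{X|Y}(x | y)."""
    xs = {x for (x, yy) in joint}
    return sum(x * cond_pmf(x, y) for x in xs)

# Here p_Y(1) = 0.5, so p_{X|Y}(1 | 1) = 0.2/0.5 = 0.4
# and E[X | Y = 1] = 1(0.4) + 2(0.6) = 1.6
```
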
Solution: For k ≤ n − m,

    P(X_1 = k | X_2 = m) = P(X_1 = k, X_2 = m) / P(X_2 = m)

Now, if X_1 = k and X_2 = m, then it follows that X_3 = n − k − m. Hence,

    P(X_1 = k, X_2 = m, X_3 = n − k − m) = [n! / (k! m! (n − k − m)!)] p_1^k p_2^m p_3^{n−k−m}     (2.1)

This follows since any particular sequence of the n experiments having outcome 1 appear k times, outcome 2 appear m times, and outcome 3 appear n − k − m times has probability p_1^k p_2^m p_3^{n−k−m} of occurring. Since there are n! / (k! m! (n − k − m)!) such sequences, equation (2.1) follows. Therefore, we have

    P(X_1 = k | X_2 = m) = [n! / (k! m! (n − k − m)!)] p_1^k p_2^m p_3^{n−k−m} / {[n! / (m! (n − m)!)] p_2^m (1 − p_2)^{n−m}}

where we have used the fact that X_2 has a binomial distribution with parameters n and p_2. Hence,

    P(X_1 = k | X_2 = m) = [(n − m)! / (k! (n − m − k)!)] (p_1 / (1 − p_2))^k (p_3 / (1 − p_2))^{n−m−k}

or equivalently, writing p_3 = 1 − p_1 − p_2,

    P(X_1 = k | X_2 = m) = C(n − m, k) (p_1 / (1 − p_2))^k (1 − p_1 / (1 − p_2))^{n−m−k}

In other words, the conditional distribution of X_1, given that X_2 = m, is binomial with parameters n − m and p_1 / (1 − p_2). Consequently,

    E[X_1 | X_2 = m] = (n − m) p_1 / (1 − p_2)
#

Example (2.2): There are n components. On a rainy day, component i will function with probability p_i; on a nonrainy day, component i will function with probability q_i, for i = 1, 2, ..., n. It will rain tomorrow with probability α. Calculate the conditional expected number of components that function tomorrow, given that it rains.

Solution: Let

    X_i = 1 if component i functions tomorrow, 0 otherwise

Then, with Y defined to equal 1 if it rains tomorrow and 0 otherwise, the desired conditional expectation is obtained as follows.
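The same conditional expectation can also be checked by brute-force enumeration of the joint distribution of the rain indicator and the component states. The sketch below uses hypothetical values for α, the p_i and the q_i (none from the text) and verifies that the answer equals p_1 + ... + p_n:

```python
from itertools import product

# Hypothetical parameters (not from the text): 3 components.
alpha = 0.3                 # P(rain tomorrow)
p = [0.9, 0.8, 0.7]         # P(component i functions | rain)
q = [0.5, 0.6, 0.4]         # P(component i functions | no rain)

def joint(y, s):
    """Joint probability of rain indicator y and component-state vector s,
    using conditional independence of the components given Y."""
    probs = p if y == 1 else q
    pr = alpha if y == 1 else 1 - alpha
    for works, pi in zip(s, probs):
        pr *= pi if works else 1 - pi
    return pr

# E[number functioning | Y = 1]: average the count over states,
# weighted by the conditional probabilities given rain.
num = sum(sum(s) * joint(1, s) for s in product([0, 1], repeat=3))
den = sum(joint(1, s) for s in product([0, 1], repeat=3))   # = alpha
cond_exp = num / den

assert abs(cond_exp - sum(p)) < 1e-12   # equals p_1 + p_2 + p_3 = 2.4
```
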
    E[Σ_{i=1}^n X_i | Y = 1] = Σ_{i=1}^n E[X_i | Y = 1] = Σ_{i=1}^n p_i
#

2.3 The Continuous Case

If X and Y have a joint probability density function f(x, y), then the conditional probability density function of X given that Y = y is defined, for all values of y such that f_Y(y) > 0, by

    f_{X|Y}(x | y) = f(x, y) / f_Y(y)

To motivate this definition, multiply the left side by dx and the right side by (dx dy)/dy to get

    f_{X|Y}(x | y) dx = f(x, y) dx dy / (f_Y(y) dy)
                      ≈ P(x ≤ X ≤ x + dx, y ≤ Y ≤ y + dy) / P(y ≤ Y ≤ y + dy)
                      = P(x ≤ X ≤ x + dx | y ≤ Y ≤ y + dy)

In other words, for small values of dx and dy, f_{X|Y}(x | y) dx is approximately the conditional probability that X is between x and x + dx, given that Y is between y and y + dy. The conditional expectation of X, given that Y = y, is defined for all values of y such that f_Y(y) > 0, by

    E[X | Y = y] = ∫_{−∞}^{∞} x f_{X|Y}(x | y) dx

Example (2.3): The joint density of X and Y is given by

    f(x, y) = (1/2) y e^{−xy},   0 < x < ∞, 0 < y < 2
    f(x, y) = 0,                 otherwise

What is E[e^{X/2} | Y = 1]?

Solution: The conditional density of X, given that Y = 1, is given by

    f_{X|Y}(x | 1) = f(x, 1) / f_Y(1)
                   = (1/2) e^{−x} / ∫_0^∞ (1/2) e^{−x} dx
                   = e^{−x}

Hence,

    E[e^{X/2} | Y = 1] = ∫_0^∞ e^{x/2} f_{X|Y}(x | 1) dx
                       = ∫_0^∞ e^{x/2} e^{−x} dx
                       = 2
#

2.4 Computing Expectations by Conditioning

Denote by E[X | Y] that function of the random variable Y whose value at Y = y is E[X | Y = y]. Note that E[X | Y] is itself a random variable. An extremely important property of conditional expectation is that, for all random variables X and Y,

    E[X] = E[E[X | Y]]     (2.2)

provided that E[X] exists.

Theorem (2.1): If Y is a discrete random variable, then (2.2) states that

    E[X] = Σ_y E[X | Y = y] P(Y = y)     (2.3)

while if Y is continuous with density f_Y(y), then (2.2) says that

    E[X] = ∫_{−∞}^{∞} E[X | Y = y] f_Y(y) dy

Proof (discrete case):

    Σ_y E[X | Y = y] P(Y = y) = Σ_y Σ_x x P(X = x | Y = y) P(Y = y)
                              = Σ_y Σ_x x [P(X = x, Y = y) / P(Y = y)] P(Y = y)
                              = Σ_y Σ_x x P(X = x, Y = y)
                              = Σ_x x Σ_y P(X = x, Y = y)
                              = Σ_x x P(X = x)
                              = E[X]
#

One way to understand (2.3) is to interpret it as follows.
It states that to calculate E[X] we may take a weighted average of the conditional expected values of X given that Y = y, each of the terms E[X | Y = y] being weighted by the probability of the event on which it is conditioned.

Example (2.4): Expectation of the Sum of a Random Number of Random Variables. Suppose that the expected number of accidents per week at an industrial plant is four. Suppose also that the numbers of workers injured in the individual accidents are independent random variables with a common mean of 2. Assume also that the number of workers injured in each accident is independent of the number of accidents that occur. What is the expected number of injuries during a week?

Solution: Let N denote the number of accidents, and let X_i denote the number of injuries in the i-th accident, i = 1, 2, .... Then the total number of injuries can be expressed as Σ_{i=1}^N X_i. Now

    E[Σ_{i=1}^N X_i] = E[E[Σ_{i=1}^N X_i | N]]

But

    E[Σ_{i=1}^N X_i | N = n] = E[Σ_{i=1}^n X_i | N = n]
                             = E[Σ_{i=1}^n X_i]     (by the independence of the X_i and N)
                             = n E[X]

which yields

    E[Σ_{i=1}^N X_i | N] = N E[X]

and thus

    E[Σ_{i=1}^N X_i] = E[N E[X]] = E[N] E[X]

Therefore, in the example, the expected number of injuries during a week equals 4 × 2 = 8.
#

Example (2.5): Independent trials, each of which is a success with probability p, are performed until there are k consecutive successes. What is the mean number of necessary trials?

Solution: Let

    N_k = the number of trials necessary to obtain k consecutive successes
    M_k = E[N_k]

We will obtain a recursive equation for M_k by conditioning on N_{k−1}, the number of trials needed for k − 1 consecutive successes. This yields

    M_k = E[N_k] = E[E[N_k | N_{k−1}]]

Now,

    E[N_k | N_{k−1}] = N_{k−1} + 1 + (1 − p) E[N_k]

where the preceding follows since, if it takes N_{k−1} trials to obtain k − 1 consecutive successes, then either the next trial is a success, and we have our k in a row, or it is a failure, and we must begin anew.
Taking expectations of both sides of the preceding yields

    M_k = M_{k−1} + 1 + (1 − p) M_k

or

    M_k = 1/p + M_{k−1}/p

Since N_1, the time of the first success, is geometric with parameter p, we see that M_1 = 1/p and, recursively,

    M_2 = 1/p + 1/p²
    M_3 = 1/p + 1/p² + 1/p³
    ...
    M_k = 1/p + 1/p² + ... + 1/p^k
#

2.5 Computing Variances by Conditioning

Conditional expectations can also be used to compute the variance of a random variable. The conditional variance of X given that Y = y is defined by

    V(X | Y = y) = E[(X − E[X | Y = y])² | Y = y] = E[X² | Y = y] − (E[X | Y = y])²

Letting V(X | Y) denote that function of Y whose value when Y = y is V(X | Y = y), we have the following result.

Theorem (2.2): The conditional variance formula:

    V(X) = E[V(X | Y)] + V(E[X | Y])     (2.4)

Proof:

    E[V(X | Y)] = E[E[X² | Y] − (E[X | Y])²]
                = E[E[X² | Y]] − E[(E[X | Y])²]
                = E[X²] − E[(E[X | Y])²]

and

    V(E[X | Y]) = E[(E[X | Y])²] − (E[E[X | Y]])²
                = E[(E[X | Y])²] − (E[X])²

Therefore,

    E[V(X | Y)] + V(E[X | Y]) = E[X²] − (E[X])² = V(X)
#

Example (2.6): Variance of a Compound Random Variable. Let X_1, X_2, ... be independent and identically distributed random variables with distribution F having mean μ and variance σ², and assume that they are independent of the nonnegative integer-valued random variable N. As noted in Example (2.4), where its expected value was determined, the random variable S = Σ_{i=1}^N X_i is called a compound random variable. Find its variance.

Solution: Whereas we could obtain E[S²] by conditioning on N, let us instead use the conditional variance formula. Now,

    V(S | N = n) = V(Σ_{i=1}^N X_i | N = n)
                 = V(Σ_{i=1}^n X_i | N = n)
                 = V(Σ_{i=1}^n X_i)
                 = n σ²

By the same reasoning,

    E[S | N = n] = n μ

Therefore, V(S | N) = N σ² and E[S | N] = N μ, and the conditional variance formula gives

    V(S) = E[N σ²] + V(N μ) = σ² E[N] + μ² V(N)

If N is a Poisson random variable, then S = Σ_{i=1}^N X_i is called a compound Poisson random variable.
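The conditional variance formula and this compound-variable result can be checked by exact enumeration. A sketch, assuming Bernoulli X_i and a small three-point distribution for N (both arbitrary choices, not from the text):

```python
from math import comb

# Arbitrary illustration (not from the text): X_i ~ Bernoulli(q), N three-valued.
q = 0.3                              # P(X_i = 1)
mu, sigma2 = q, q * (1 - q)          # mean and variance of each X_i
pN = {0: 0.2, 1: 0.5, 3: 0.3}        # pmf of N

# Given N = n, S = X_1 + ... + X_n is Binomial(n, q);
# enumerate the exact distribution of S by conditioning on N.
pS = {}
for n, pn in pN.items():
    for s in range(n + 1):
        pS[s] = pS.get(s, 0.0) + pn * comb(n, s) * q**s * (1 - q)**(n - s)

ES = sum(s * pr for s, pr in pS.items())
VS = sum(s * s * pr for s, pr in pS.items()) - ES**2

# Compare with the formulas derived above.
EN = sum(n * pr for n, pr in pN.items())
VN = sum(n * n * pr for n, pr in pN.items()) - EN**2
assert abs(ES - mu * EN) < 1e-12                      # E[S] = E[N] E[X]  (Example 2.4)
assert abs(VS - (sigma2 * EN + mu**2 * VN)) < 1e-12   # V(S) = sigma^2 E[N] + mu^2 V(N)
```
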
Because the variance of a Poisson random variable is equal to its mean, it follows that for a compound Poisson random variable having E[N] = λ,

    V(S) = λσ² + λμ² = λE[X²]

where X has the distribution F.

2.6 Computing Probabilities by Conditioning

Let E denote an arbitrary event, and define the indicator random variable X by

    X = 1 if E occurs, 0 if E does not occur

It follows from the definition of X that

    E[X] = P(E)
    E[X | Y = y] = P(E | Y = y)   for any random variable Y

Therefore,

    P(E) = Σ_y P(E | Y = y) P(Y = y)            if Y is discrete
    P(E) = ∫_{−∞}^{∞} P(E | Y = y) f_Y(y) dy    if Y is continuous

Example (2.7): The Ballot Problem. In an election, candidate A receives n votes and candidate B receives m votes, where n > m. Assuming that all orderings of the votes are equally likely, show that the probability that A is always ahead in the count of votes is (n − m)/(n + m).

Solution: Let P_{n,m} denote the desired probability. By conditioning on which candidate receives the last vote counted, we have

    P_{n,m} = P(A always ahead | A receives last vote) · n/(n + m)
            + P(A always ahead | B receives last vote) · m/(n + m)

Now, given that A receives the last vote, the probability that A is always ahead is the same as if A had received a total of n − 1 votes and B a total of m votes. Because a similar result is true when we are given that B receives the last vote, we see from the preceding that

    P_{n,m} = [n/(n + m)] P_{n−1,m} + [m/(n + m)] P_{n,m−1}     (2.5)

We can now prove that P_{n,m} = (n − m)/(n + m) by induction on n + m. As it is true when n + m = 1, that is, P_{1,0} = 1, assume it whenever n + m = k. Then, when n + m = k + 1, we have by equation (2.5) and the induction hypothesis that

    P_{n,m} = [n/(n + m)] · (n − 1 − m)/(n − 1 + m) + [m/(n + m)] · (n − m + 1)/(n + m − 1)
            = (n − m)/(n + m)

and the result is proven.
#

Exercise (2.1):

1. Let X and Y be two discrete random variables taking values in {1, 2, 3, ...}. Suppose

       P(X = m, Y = n) = 0.64 (0.2)^{n+m−2},   n, m = 1, 2, ...

   and Z = X + Y. Compute
   a. P(X = m, Z = k), for m = 5 and k = 7 first, and then in general.
   b. P(X = 5 | Z = 7)
   c. P(Z = 7 | X = 5)
   d.
   P(Y = 3 | Z = 14)
   e. P(X = 5, Y = 3 | Z = 8)
   f. P(Z = 8 | X = 6, Y = 2)

2. Let X and Y denote, respectively, the number of babies born on a certain day in a hospital and the number of them which are boys. Suppose their joint distribution is

       P(X = n, Y = m) = e^{−14} (7.14)^m (6.68)^{n−m} / (m! (n − m)!),   m = 0, 1, ..., n; n = 0, 1, ...

   and 0 otherwise. Find P(X = n), P(Y = m) and P(X − Y = k | Y = m) for all m, n ∈ {0, 1, ...}.

3. Suppose that p(x, y), the joint probability mass function of X and Y, is given by

       p(1, 1) = 0.5,  p(1, 2) = 0.1,  p(2, 1) = 0.1,  p(2, 2) = 0.3

   Calculate the conditional probability mass function of X given that Y = 1.

4. Let A, B and C be independent random variables with the distributions indicated below:

       P(A = 1) = 0.4,  P(A = 2) = 0.6
       P(B = −3) = 0.25,  P(B = −2) = 0.25,  P(B = −1) = 0.25,  P(B = 1) = 0.25
       P(C = 1) = 0.5,  P(C = 2) = 0.4,  P(C = 3) = 0.1

   What is the probability that Ax² + Bx + C has real roots?

5. Reliability is the probability of a device performing its purpose adequately for the period of time intended, under the operating conditions encountered. A piece of equipment consists of three components in series: for the equipment to function, all three components must be functioning. Let X_1, X_2 and X_3 be the respective lifetimes of components 1, 2 and 3, measured in hours. Suppose

       P(X_1 ≤ t) = 1 − e^{−t/10000},     t ≥ 0
       P(X_2 ≤ t) = 1 − e^{−2t/10000},    t ≥ 0
       P(X_3 ≤ t) = 1 − e^{−3t/100000},   t ≥ 0

   If the lifetimes of the components are independent, what is the reliability of the equipment in a mission requiring 4000 hours?

6. If X and Y are independent Poisson random variables with respective means λ_1 and λ_2, calculate the conditional expected value of X given that X + Y = n.

7. Suppose the joint density of X and Y is given by

       f(x, y) = 4y(x − y) e^{−(x+y)},   0 < x < ∞, 0 ≤ y ≤ x
       f(x, y) = 0,                      otherwise

   Compute E[X | Y = y].

8. The Mean of a Geometric Distribution. A coin having probability p of coming up heads is to be successively flipped until the first head appears. What is the expected number of flips required?

9.
   A miner is trapped in a mine containing three doors. The first door leads to a tunnel that takes him to safety after two hours of travel. The second door leads to a tunnel that returns him to the mine after three hours of travel. The third door leads to a tunnel that returns him to the mine after five hours. Assuming that the miner is at all times equally likely to choose any one of the doors, what is the expected length of time until he reaches safety?

10. Variance of the Geometric Random Variable. Independent trials, each resulting in a success with probability p, are performed in sequence. Let N be the trial number of the first success. Find V(N).

11. A total of n people have been invited to a party honoring an important official. The party begins at time 0. The arrival times of the n guests are independent exponential random variables with mean 1, and the arrival time of the official is uniformly distributed between 0 and 1.
   a. Find the probability that exactly k of the guests arrive before the official.
   b. Find the expected number of guests who arrive before the official.

12. A vehicle insurance company classifies each of its policyholders as being of one of the types i = 1, 2, ..., k. It supposes that the numbers of accidents that a type i policyholder has in successive years are independent Poisson random variables with mean λ_i, i = 1, 2, ..., k. The probability that a newly insured policyholder is of type i is p_i, with Σ_{i=1}^k p_i = 1. Given that a policyholder had n accidents in her first year, what is the expected number that she has in her second year? What is the conditional probability that she has m accidents in her second year?

13. Let X_1 and X_2 be independent geometric random variables having the same parameter p. Determine

       P(X_1 = i | X_1 + X_2 = n)

14. An urn contains three white, six red and five black balls. Six of these balls are randomly selected from the urn. Let X and Y denote, respectively, the number of white and black balls selected.
   Compute the conditional probability mass function of X given that Y = 3. Also compute E[X | Y = 1]. Assume that when a ball is selected, its colour is noted and it is then replaced in the urn before the next selection is made.

15. Prove that if X and Y are jointly continuous, then

       E[X] = ∫_{−∞}^{∞} E[X | Y = y] f_Y(y) dy

16. A coin having probability p of coming up heads is successively flipped until two of the most recent three flips are heads. Let N denote the number of flips. (Note that if the first two flips are heads, then N = 2.) Find E[N].

17. Suppose X is a Poisson random variable with mean λ. The parameter λ is itself a random variable whose distribution is exponential with mean 1. Show that

       P(X = n) = (1/2)^{n+1}

18. An insurance company supposes that the number of accidents that each of its policyholders will have in a year is Poisson distributed, with the mean of the Poisson depending on the policyholder. If the Poisson mean of a randomly chosen policyholder has a gamma distribution with density function

       g(λ) = λ e^{−λ},   λ ≥ 0

   what is the probability that a randomly chosen policyholder has exactly n accidents next year?