Expected Value and Markov Chains

Karen Ge

September 16, 2016

Abstract. A Markov chain is a random process that moves from one state to another such that the next state of the process depends only on the present state. An absorbing state is a state that is impossible to leave once reached. We survey common methods used to find the expected number of steps needed for a random walker to reach an absorbing state in a Markov chain. These methods are: solving a system of linear equations, using a transition matrix, and using a characteristic equation.

Keywords: probability, expected value, absorbing Markov chains, transition matrix, state diagram

1 Expected Value

In this section, we give a brief review of some basic definitions, properties, and examples of expected value. Let the random variable $X$ take on values $x_1, x_2, x_3, \ldots$ with probabilities $p_1 = P(X = x_1)$, $p_2 = P(X = x_2)$, $p_3 = P(X = x_3)$, ..., respectively. We call $P$ the probability distribution function of $X$ and define the expected value of $X$ to be
$$E(X) = \sum_i x_i \cdot p_i = \sum_i x_i P(X = x_i).$$

Example 1. What is the expected number of rolls of a fair die until a 6 turns up?

Solution. We get a 6 on the first roll with probability $\frac{1}{6}$. The probability that a 6 does not turn up on the first roll but does turn up on the second roll is $\frac{5}{6} \cdot \frac{1}{6}$. The probability that a 6 does not turn up on the first two rolls but does turn up on the third roll is $\left(\frac{5}{6}\right)^2 \cdot \frac{1}{6}$, and so on. Thus the expected number of rolls is
$$E = 1 \cdot \frac{1}{6} + 2 \cdot \frac{5}{6} \cdot \frac{1}{6} + 3 \cdot \left(\frac{5}{6}\right)^2 \cdot \frac{1}{6} + 4 \cdot \left(\frac{5}{6}\right)^3 \cdot \frac{1}{6} + \cdots \quad (*)$$
Multiplying both sides of the equation by $\frac{5}{6}$, we get
$$\frac{5}{6}E = 1 \cdot \frac{5}{6} \cdot \frac{1}{6} + 2 \cdot \left(\frac{5}{6}\right)^2 \cdot \frac{1}{6} + 3 \cdot \left(\frac{5}{6}\right)^3 \cdot \frac{1}{6} + \cdots \quad (**)$$
Subtracting $(**)$ from $(*)$ gives
$$\frac{1}{6}E = \frac{1}{6}\left(1 + \frac{5}{6} + \left(\frac{5}{6}\right)^2 + \left(\frac{5}{6}\right)^3 + \cdots\right) = \frac{1}{6} \cdot \frac{1}{1 - \frac{5}{6}} = 1.$$
Therefore, $E = 6$. That is, the expected number of rolls of a fair die until a 6 turns up is 6.

Example 1 can be generalized to the following theorem. Its proof is left to the reader.
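The series manipulation in Example 1 can also be checked empirically. Below is a minimal Monte Carlo sketch in Python; the function name and trial count are illustrative choices, not part of the original solution.

```python
import random

def rolls_until_six(rng=random):
    """Roll a fair die until a 6 appears; return the number of rolls."""
    rolls = 0
    while True:
        rolls += 1
        if rng.randint(1, 6) == 6:
            return rolls

random.seed(0)
trials = 100_000
mean = sum(rolls_until_six() for _ in range(trials)) / trials
print(mean)  # should be close to the exact answer E = 6
```

With 100,000 trials the sample mean typically lands within a few hundredths of 6.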
Theorem 1. If an event has probability $p$ of occurring in a trial, then the expected number of trials to obtain the first occurrence of this event in a sequence of trials is $\frac{1}{p}$.

Example 2. What is the expected number of rolls of a fair die until all 6 numbers turn up?

Solution. The first number turns up on the first roll with probability 1, so the expected number of rolls until we get the first number is 1. After that, a different number turns up with probability $\frac{5}{6}$. By Theorem 1, the expected number of rolls until we get a result that occurs with probability $p$ is $\frac{1}{p}$. Thus the expected number of rolls until we get two different numbers is $1 + \frac{6}{5}$. Proceeding with similar reasoning, the expected number of rolls of a fair die until all 6 numbers turn up is
$$1 + \frac{6}{5} + \frac{6}{4} + \frac{6}{3} + \frac{6}{2} + \frac{6}{1} = 14.7.$$

Theorem 2 (Linearity of Expectation). Given two random variables $X$ and $Y$, we have $E(X + Y) = E(X) + E(Y)$.

Proof.
$$E(X+Y) = \sum_{i,j}(x_i + y_j)P(X = x_i, Y = y_j) = \sum_i \sum_j x_i P(X = x_i, Y = y_j) + \sum_i \sum_j y_j P(X = x_i, Y = y_j).$$
Note that
$$\sum_j P(X = x_i, Y = y_j) = P(X = x_i) \quad \text{and} \quad \sum_i P(X = x_i, Y = y_j) = P(Y = y_j)$$
because
$$\sum_i P(X = x_i) = \sum_j P(Y = y_j) = 1.$$
Therefore,
$$E(X+Y) = \sum_i x_i P(X = x_i) + \sum_j y_j P(Y = y_j) = E(X) + E(Y).$$
Note that in the theorem above, $X$ and $Y$ do not have to be independent.

Theorem 3. If $X$ is a random variable taking only non-negative integer values, then
$$E(X) = \sum_{i=1}^{\infty} P(X \ge i).$$

Proof.
$$\sum_{i=1}^{\infty} P(X \ge i) = \sum_{i=1}^{\infty} \sum_{j=i}^{\infty} P(X = j).$$
We can think of the right-hand side as summing the entries of an upper triangular matrix (all entries below the main diagonal are zero) by rows. This summation can also be carried out by adding the entries by columns. Therefore,
$$\sum_{i=1}^{\infty} \sum_{j=i}^{\infty} P(X = j) = \sum_{j=1}^{\infty} \sum_{i=1}^{j} P(X = j) = \sum_{j=1}^{\infty} j P(X = j) = E(X).$$
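The stage-by-stage argument in Example 2 can be checked by simulation. A short Python sketch (function name and trial count are illustrative):

```python
import random

def rolls_until_all_faces(rng=random):
    """Roll a fair die until every face 1..6 has appeared; return the roll count."""
    seen = set()
    rolls = 0
    while len(seen) < 6:
        rolls += 1
        seen.add(rng.randint(1, 6))
    return rolls

random.seed(0)
trials = 100_000
mean = sum(rolls_until_all_faces() for _ in range(trials)) / trials
print(mean)  # should be close to 1 + 6/5 + 6/4 + 6/3 + 6/2 + 6/1 = 14.7
```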
Example 3. What is the expected number of real numbers, chosen uniformly at random from the interval $[0, 1]$, one must select until their sum exceeds 1?

Solution. Let $N$ be the first index such that the sum of the real numbers $x_1, x_2, \ldots, x_N$ exceeds 1. We compute $P(N > n)$. Note that
$$P(N > n) = P\left(\sum_{i=1}^{n} x_i < 1\right).$$
This is exactly the volume of an $n$-dimensional simplex. A 1-dimensional simplex is a line segment of unit length, so it has volume (length) 1. A 2-dimensional simplex is a right isosceles triangle with two legs of unit length, so its volume (area) is $\frac{1}{2}$. A 3-dimensional simplex is a right triangular pyramid with three legs of unit length, so its volume is $\frac{1}{6}$. In general, an $n$-dimensional simplex has volume $\frac{1}{n!}$. By Theorem 3, we have
$$E(N) = \sum_{i=1}^{\infty} P(N \ge i) = \sum_{i=0}^{\infty} P(N > i) = P(N > 0) + P(N > 1) + \cdots = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots = e.$$

2 Markov Chains

A Markov chain is a random process that moves from one state to another such that the next state depends only on the present state. Each transition is called a step. In a Markov chain, the next step of the process depends only on the present state; it does not matter how the process reached the current state. In other words, the process is "memoryless." An absorbing state is a state that is impossible to leave once reached. A state that is not absorbing is called a transient state. If every state of a Markov chain can reach an absorbing state, the chain is called an absorbing Markov chain.

2.1 Solving a System of Linear Equations

Example 4. A process moves on the integers 1, 2, 3, 4, and 5. It starts at 1 and, on each successive step, moves to an integer greater than its present position, moving with equal probability to each of the remaining larger integers, until it reaches 5. Find the expected number of steps it takes to reach the integer 5.

Solution. Let $E(i)$ be the expected number of steps the process takes to reach integer 5 from integer $i$.
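Before solving for the expected step count by hand, the answer to Example 4 can be estimated by simulating the process directly. A minimal sketch (the function name and trial count are illustrative assumptions):

```python
import random

def steps_to_five(rng=random):
    """From position k, jump uniformly to one of the larger integers up to 5."""
    position, steps = 1, 0
    while position < 5:
        position = rng.choice(range(position + 1, 6))
        steps += 1
    return steps

random.seed(0)
trials = 100_000
mean = sum(steps_to_five() for _ in range(trials)) / trials
print(mean)  # should be close to the exact answer 25/12
```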
We need to find $E(1)$. Clearly $E(5) = 0$ and $E(4) = 1$. When the process is at 3, there is probability $\frac{1}{2}$ each of reaching 4 or 5. Thus,
$$E(3) = \frac{1}{2}\big(E(4) + 1\big) + \frac{1}{2}\big(E(5) + 1\big) = \frac{3}{2}.$$
Similarly, when the process is at 2, there is probability $\frac{1}{3}$ each of reaching 3, 4, or 5. Thus,
$$E(2) = \frac{1}{3}\big(E(3) + 1\big) + \frac{1}{3}\big(E(4) + 1\big) + \frac{1}{3}\big(E(5) + 1\big) = \frac{11}{6}.$$
Finally, when the process is at 1, there is probability $\frac{1}{4}$ each of reaching 2, 3, 4, or 5. Therefore,
$$E(1) = \frac{1}{4}\big(E(2) + 1\big) + \frac{1}{4}\big(E(3) + 1\big) + \frac{1}{4}\big(E(4) + 1\big) + \frac{1}{4}\big(E(5) + 1\big) = \frac{25}{12}.$$

2.2 Using a Transition Matrix

Example 5. The Ice Castle in Arendelle has the shape of an icosahedron. Anna stands at the bottom vertex (A) and needs to reach the top vertex (E) to meet her sister Elsa. At each step, Anna randomly picks one of the five adjacent edges with equal probability and moves along that edge to the next vertex. Find the expected number of steps Anna takes to reach the top for the first time.

[Figure: the icosahedron, with bottom vertex A, the lower ring of vertices labeled B, the upper ring of vertices labeled C, and top vertex E.]

Solution. We will use a transition matrix to solve this problem. Note that the icosahedron can be divided into 4 layers. Layer 0: Anna's starting point (A); Layer 1: the vertices (B) connected with vertex A; Layer 2: the vertices (C) connected with vertex E; and Layer 3: Anna's ending point (E). Thus there are four states in this Markov chain. Define $p_{ij}$ to be the probability that Anna moves from state $i$ to state $j$. The matrix $P = (p_{ij})$ is called the transition matrix of the Markov chain. We have
$$P = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1/5 & 2/5 & 2/5 & 0 \\ 0 & 2/5 & 2/5 & 1/5 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
We see that state (E) is an absorbing state. Let $Q$ be the sub-matrix of $P$ obtained by deleting the rows and columns of the absorbing states. We get
$$Q = \begin{pmatrix} 0 & 1 & 0 \\ 1/5 & 2/5 & 2/5 \\ 0 & 2/5 & 2/5 \end{pmatrix}.$$
The matrix $N = (I - Q)^{-1}$ is called the fundamental matrix for $P$. The entry $n_{ij}$ of $N$ gives the expected number of times that the process is in transient state $j$ if it started in transient state $i$. (See [1] for a proof.)
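The fundamental-matrix computation can be checked numerically. A short sketch, assuming NumPy is available (the variable names are illustrative):

```python
import numpy as np

# Transient states in order: layer 0 (A), layer 1 (B), layer 2 (C)
Q = np.array([
    [0.0, 1.0, 0.0],
    [0.2, 0.4, 0.4],
    [0.0, 0.4, 0.4],
])
N = np.linalg.inv(np.eye(3) - Q)  # fundamental matrix N = (I - Q)^(-1)
expected_steps = N[0].sum()       # sum the row for the starting state A
print(expected_steps)             # approximately 15
```

Summing the row of $N$ corresponding to the starting state gives the total expected number of steps spent in transient states before absorption.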
Since
$$I - Q = \begin{pmatrix} 1 & -1 & 0 \\ -1/5 & 3/5 & -2/5 \\ 0 & -2/5 & 3/5 \end{pmatrix},$$
we can use Gauss-Jordan elimination to compute its inverse and get
$$N = (I - Q)^{-1} = \begin{pmatrix} 5/2 & 15/2 & 5 \\ 3/2 & 15/2 & 5 \\ 1 & 5 & 5 \end{pmatrix}.$$
Adding the entries in the top row, we get $\frac{5}{2} + \frac{15}{2} + 5 = 15$, the expected number of steps Anna takes to reach the top vertex for the first time.

2.3 Using a Characteristic Equation

Example 6. Anna needs to go to the top floor of the Ice Tower to see her sister Elsa. The Ice Tower has 99 floors. The Ice Elevator goes up or down one floor at a time. When Anna is on the first floor, she always goes to the 2nd floor. But when there is a choice, Anna randomly decides whether she will go up or down next, with equal probability. Anna starts on the first floor. How many floors, on average, must she travel to reach the 99th floor for the first time?

Solution. Let $E(n)$ be the expected number of floors Anna needs to travel to reach the 99th floor for the first time when she is on floor $n$. We have $E(99) = 0$, $E(1) = 1 + E(2)$, and for any $n$ in between,
$$E(n) = \frac{1}{2}\big(E(n-1) + 1\big) + \frac{1}{2}\big(E(n+1) + 1\big), \quad (1)$$
$$E(n+1) = \frac{1}{2}\big(E(n) + 1\big) + \frac{1}{2}\big(E(n+2) + 1\big). \quad (2)$$
Subtracting (1) from (2), we get $E(n+2) - 3E(n+1) + 3E(n) - E(n-1) = 0$. The characteristic equation of this cubic recurrence relation is
$$\lambda^3 - 3\lambda^2 + 3\lambda - 1 = 0, \quad \text{i.e.,} \quad (\lambda - 1)^3 = 0.$$
Since it has a triple root equal to 1, we have $E(n) = an^2 + bn + c$ for some constants $a$, $b$, and $c$. Plugging $E(n) = an^2 + bn + c$ into Equation (1), we get $a = -1$; plugging it into $E(1) = 1 + E(2)$, we get $b = 2$. Finally, plugging it into $E(99) = 0$, we get $c = 99^2 - 2 \cdot 99$. Thus,
$$E(n) = -n^2 + 2n + 99^2 - 2 \cdot 99 = -n^2 + 2n - 1 + 99^2 - 2 \cdot 99 + 1 = -(n-1)^2 + (99-1)^2.$$
Therefore, the expected number of floors Anna needs to travel is $E(1) = 98^2 = 9604$.

3 Summary

We reviewed common methods used to find the expected number of steps for a random process to reach an absorbing state in a Markov chain.
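As a quick sanity check on Example 6, the closed form $E(n) = -(n-1)^2 + 98^2$ can be verified against its boundary conditions and recurrence in a few lines (a sketch; the helper name is an illustrative choice):

```python
def expected_floors(n):
    """Closed form from Example 6: expected floors traveled from floor n to floor 99."""
    return -(n - 1) ** 2 + 98 ** 2

# Boundary conditions from the solution
assert expected_floors(99) == 0
assert expected_floors(1) == 1 + expected_floors(2)

# Interior recurrence (1): E(n) = (E(n-1) + 1)/2 + (E(n+1) + 1)/2
for n in range(2, 99):
    lhs = expected_floors(n)
    rhs = (expected_floors(n - 1) + 1) / 2 + (expected_floors(n + 1) + 1) / 2
    assert lhs == rhs

print(expected_floors(1))  # 9604
```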
A good grasp of these methods helps us assess the situation and apply the most effective method to the problem at hand. However, we are not limited to these methods alone, as our last example shows.

Example 7. Prove that if we flip a biased coin with probability $p$ of landing heads, then the expected number of coin tosses needed to get $n$ consecutive heads is
$$\frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^n}.$$

Proof. We will use a state diagram and find the expected number of steps by eliminating loops. If, from state $A$, there is probability $p$ of moving to state $B$ in $k$ steps and probability $1 - p$ of returning to $A$ in $k$ steps, then by Theorem 1 the expected number of steps to move from state $A$ to state $B$ is $\frac{k}{p}$.

[Diagram: an edge from $A$ to $B$ labeled "$k$ steps, $(p)$" and a loop at $A$ labeled "$k$ steps, $(1-p)$".]

Therefore, we can eliminate the loop at state $A$ and simply represent the diagram as a single edge from $A$ to $B$ labeled "$\frac{k}{p}$ steps".

We use $\emptyset$ to represent no consecutive heads. There is probability $p$ of moving from $\emptyset$ to $H$ in 1 step and probability $1 - p$ of moving from $\emptyset$ back to $\emptyset$ in 1 step. Eliminating the loop, we get a single edge from $\emptyset$ to $H$ labeled "$\frac{1}{p}$ steps".

Now let us find the expected number of steps needed to get two consecutive heads. From $\emptyset$, it takes $\frac{1}{p}$ steps to reach $H$; from $H$, there is probability $p$ of moving to $HH$ in 1 step and probability $1 - p$ of falling back to $\emptyset$. This is equivalent to a probability of $p$ of moving from $\emptyset$ to $HH$ in $1 + \frac{1}{p}$ steps and a probability of $1 - p$ of moving from $\emptyset$ to $\emptyset$ in $1 + \frac{1}{p}$ steps. Thus, we can eliminate the loop and get that the expected number of steps needed to move from $\emptyset$ to $HH$ is
$$\frac{1 + \frac{1}{p}}{p} = \frac{1}{p} + \frac{1}{p^2}.$$
By induction, the expected number of steps needed to move from $\emptyset$ to $HH\cdots H$ ($n$ heads) is
$$\frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^n}.$$

4 Exercises

1. Jane has a bag containing $r$ red balls and $b$ blue balls. She draws $k$ balls, without replacement, from the bag. Find the expected number of red balls drawn.
2. Give an alternate solution to Example 6 by solving a system of linear equations.

3. You and 10 of your friends sit at a round table. Starting from you with number 0, your friends are numbered 1, 2, ..., 10 clockwise. Suppose there is a tray on the table with an everlasting supply of candy bars. You take a candy bar from the tray and pass the tray randomly either to your left or to your right. The person who gets the tray does the same, and so on. One of your friends does not like candy bars and would like to be the last person ever to take a candy bar from the tray. Where should your candy-bar-averse friend sit?

4. Anna needs to go to the ninth floor of the Ice Palace to see her sister Elsa. The Ice Elevator goes up or down one floor at a time. If it is on the first floor, it can only go up. Anna flips a fair coin to decide whether she will go up or down next: heads means going up one floor; tails means going down one floor. Anna starts on the first floor, where tails means that she flips the coin again. How many times, on average, does she need to flip the coin to reach the ninth floor for the first time?

5. Suppose $N$ cars start in a random order along an infinitely long one-lane tunnel. They are all going at different but constant speeds and cannot pass each other. If a faster car ends up behind a slower car, it must slow down to the speed of the slower car. Eventually the cars will clump up in traffic jams. Find the expected number of clumps of cars. (A clump is a group of one or more cars.)

6. Suppose we have a random permutation of $\{1, \ldots, n\}$ with $n \ge 2$. What is the expected number of local maxima, that is, numbers greater than both of their neighbors, or greater than their only neighbor if they are on the boundary?

5 References

1. C. Grinstead and J. Snell, Introduction to Probability, 2nd revised edition, American Mathematical Society, 1997. ISBN 978-0821807491.