Part 3 Linear Programming
3.6 Game Theory

Example
Player X holds up either one hand or two, and independently, so does player Y. If they make the same decision, Y wins $10. If they make opposite decisions, then X is the winner: $10 if he puts up one hand, and $20 if he puts up two. The net payoff to X is easy to record in a matrix A, with one column for each choice by X and one row for each choice by Y:
\[
A =
\begin{array}{c|cc}
 & \text{1 hand by X} & \text{2 hands by X} \\ \hline
\text{1 hand by Y} & -10 & 20 \\
\text{2 hands by Y} & 10 & -10
\end{array}
\]

Mixed Strategy
It is obvious that X will not do the same thing every time, or Y would copy him and win everything. Similarly, Y cannot stick to a single strategy, or X will do the opposite. Both players must use a mixed strategy, and furthermore the choice at every turn must be absolutely independent of the previous turns.
Assume that X decides that he will put up 1 hand with frequency \(x_1\) and 2 hands with frequency \(x_2 = 1 - x_1\). At every turn this decision is random. Similarly, Y can pick his probabilities \(y_1\) and \(y_2 = 1 - y_1\). It is not appropriate to choose \(x_1 = x_2 = y_1 = y_2 = 0.5\), since Y would lose $20 too often. But the more Y moves to a pure 2-hand strategy, the more X will move toward 1 hand.

Equilibrium
Does there exist a mixed strategy \(y_1\) and \(y_2\) that, if used consistently by Y, offers no special advantage to X? Can X choose probabilities \(x_1\) and \(x_2\) that present Y with no reason to change his own strategy? At such an equilibrium, if it exists, the average payoff to X will have reached a saddle point. It is a maximum as far as X is concerned, and a minimum as far as Y is concerned. To find such a saddle point is to "solve" the game.

Best Strategy for X
The payoff of X can be determined by
\[
Ax = \begin{bmatrix} -10 & 20 \\ 10 & -10 \end{bmatrix}
\begin{bmatrix} x_1 \\ 1 - x_1 \end{bmatrix}
= \begin{bmatrix} 20 - 30x_1 \\ -10 + 20x_1 \end{bmatrix}.
\]
Let
\[
20 - 30x_1 = -10 + 20x_1 \;\Rightarrow\; x_1 = \tfrac{3}{5}.
\]
Then
\[
\begin{bmatrix} -10 & 20 \\ 10 & -10 \end{bmatrix}
\begin{bmatrix} 3/5 \\ 2/5 \end{bmatrix}
= \begin{bmatrix} 2 \\ 2 \end{bmatrix}.
\]
Whatever Y does against this strategy, he will lose $2.

Best Strategy for Y
\[
y^T A = \begin{bmatrix} y_1 & 1 - y_1 \end{bmatrix}
\begin{bmatrix} -10 & 20 \\ 10 & -10 \end{bmatrix}
= \begin{bmatrix} 10 - 20y_1 & -10 + 30y_1 \end{bmatrix}.
\]
Let
\[
10 - 20y_1 = -10 + 30y_1 \;\Rightarrow\; y_1 = \tfrac{2}{5}.
\]
Then
\[
\begin{bmatrix} 2/5 & 3/5 \end{bmatrix}
\begin{bmatrix} -10 & 20 \\ 10 & -10 \end{bmatrix}
= \begin{bmatrix} 2 & 2 \end{bmatrix}.
\]
With this strategy, Y cannot lose more than $2.
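The equalizing argument used for X and Y in the two-hand example can be sketched in a few lines of Python. This is only an illustration: the function name solve_2x2 is made up here, and the code assumes a 2 by 2 game with no pure-strategy saddle point, so that the equalizing probabilities exist and lie between 0 and 1.

```python
from fractions import Fraction

def solve_2x2(A):
    """Equalizing mixed strategies for a 2x2 payoff matrix A (payoff to X).

    Rows index Y's choice and columns index X's choice, as in the example.
    Assumes there is no pure-strategy saddle point, so the denominator
    below is nonzero and both probabilities lie strictly between 0 and 1.
    """
    (a, b), (c, d) = A
    denom = a - b - c + d
    # X chooses x1 so that both components of A x are equal:
    #   a*x1 + b*(1 - x1) = c*x1 + d*(1 - x1)
    x1 = Fraction(d - b, denom)
    # Y chooses y1 so that both components of y^T A are equal:
    #   a*y1 + c*(1 - y1) = b*y1 + d*(1 - y1)
    y1 = Fraction(d - c, denom)
    # Common value of the two components of A x at the equalizing x1
    value = a * x1 + b * (1 - x1)
    return (x1, 1 - x1), (y1, 1 - y1), value

x, y, v = solve_2x2([[-10, 20], [10, -10]])
print(x, y, v)
```

For the matrix of the example this reproduces x1 = 3/5, y1 = 2/5 and an average payoff of 2 to X. Exact fractions are used so that the comparison with the hand calculation involves no rounding.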

Significance of Solution
Such a saddle point is remarkable, because it means that X plays his 2-hand strategy only 2/5 of the time, even though it is this strategy that gives him a chance at $20. At the same time, Y is forced to adopt a losing strategy: he would like to match X, but instead he uses the opposite probabilities 2/5 (1 hand) and 3/5 (2 hands).

Matrix Game
X has n possible moves to choose from, and Y has m. Thus the payoff matrix A is m by n. The entry \(a_{ij}\) in A represents the payment received by X when he chooses his jth strategy and Y chooses his ith. A negative entry means a win for Y.

Matrix Game
Player X is free to choose the probability vector
\[
x = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}^T, \qquad x_j \ge 0, \qquad \sum_{j=1}^{n} x_j = 1.
\]
Player Y is free to choose the probability vector
\[
y = \begin{bmatrix} y_1 & \cdots & y_m \end{bmatrix}^T, \qquad y_i \ge 0, \qquad \sum_{i=1}^{m} y_i = 1.
\]
The total expected payoff from each play of the game is
\[
y^T A x =
\begin{bmatrix} y_1 & \cdots & y_m \end{bmatrix}
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}\, y_i\, x_j .
\]
It is \(y^T A x\) that player X wants to maximize and player Y wants to minimize.

Example
\[
A_{n \times n} = I_{n \times n}
\]
The optimal solution is
\[
x^* = y^* = \begin{bmatrix} \tfrac{1}{n} & \cdots & \tfrac{1}{n} \end{bmatrix}^T,
\qquad
y^{*T} A x^* = \left(\tfrac{1}{n}\right)^2 + \cdots + \left(\tfrac{1}{n}\right)^2 = n \left(\tfrac{1}{n}\right)^2 = \tfrac{1}{n}.
\]
As n increases, Y has a better chance to escape.

Fair Game
A symmetric matrix A (e.g. A = I) does not guarantee that the game is fair. In fact, it is a skew-symmetric matrix, i.e. \(A^T = -A\), that gives a completely fair game. Such a matrix confronts the two players with identical decisions, since a choice of strategy j by X and i by Y wins \(a_{ij}\) for X, and a choice of j by Y and i by X wins the same amount \(a_{ij}\) for Y (because \(a_{ji} = -a_{ij}\)). The optimal strategies for both players must be the same, and the expected payoff must be \(y^{*T} A x^* = 0\).

Example
\[
A = \begin{bmatrix} 0 & -1 & -1 \\ 1 & 0 & -1 \\ 1 & 1 & 0 \end{bmatrix}
\]
In words, X and Y both choose a number between 1 and 3, and the one with the smaller number wins $1.
\[
x^* = y^* = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \qquad y^{*T} A x^* = a_{11} = 0 .
\]

Equivalent Payoff Matrix
Let E be the matrix that has every entry equal to 1. Adding a multiple of E to the payoff matrix, i.e. \(A \to A + cE\), simply means that X wins an additional amount c at every turn.
The value of the game is increased by c, but there is no reason to change the original strategies.

Minimax Theorem
For any m by n matrix A, the minimax over all mixed strategies equals the maximin:
\[
\max_{x} \min_{y} \; y^T A x = \min_{y} \max_{x} \; y^T A x .
\]
This quantity is the value of the game. If the maximum on the left is attained at \(x^*\), and the minimum on the right is attained at \(y^*\), then those strategies are optimal and they yield a saddle point from which nobody wants to move:
\[
y^{*T} A x \;\le\; y^{*T} A x^* \;\le\; y^T A x^* \qquad \text{for all } x, y .
\]

Interpretation of Minimax Theorem from Player X's Viewpoint
If X selects a particular strategy \(x = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}^T\), then Y will eventually recognize it and choose his own strategy so as to minimize the payment \(y^T A x\). In this case X will receive \(\min_y y^T A x\). An intelligent player X will select a vector \(x^*\) that maximizes this minimum. By this choice, X guarantees that he will win at least the amount
\[
\min_{y} \; y^T A x^* = \max_{x} \min_{y} \; y^T A x .
\]
He cannot expect to win more.

Interpretation of Minimax Theorem from Player Y's Viewpoint
Player Y does the opposite. For any of his own mixed strategies y, he must expect X to discover the vector x that will maximize \(y^T A x\). Therefore Y will choose the mixture \(y^*\) that minimizes this maximum and guarantees that he will lose no more than
\[
\max_{x} \; y^{*T} A x = \min_{y} \max_{x} \; y^T A x .
\]
He cannot expect to do better.

Connections Between Game Theory and Duality in LP – (1)
Let
\[
b = \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix}^T_{m \times 1}
\qquad \text{and} \qquad
c = \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix}^T_{n \times 1} .
\]
Consider the dual linear programs
\[
\begin{array}{ll}
\text{(Primal)} & \min \; c^T x \quad \text{s.t. } Ax \ge b, \; x \ge 0 \\
\text{(Dual)} & \max \; b^T y \quad \text{s.t. } A^T y \le c, \; y \ge 0
\end{array}
\]
To ensure that both problems are feasible, it may be necessary to add the same large number \(\alpha\) to all entries of the payoff matrix A, i.e. \(A \to A + \alpha E\).

Connections Between Game Theory and Duality in LP – (2)
The duality theorem of linear programming guarantees that there exist vectors \(x^*\) and \(y^*\) such that \(c^T x^* = b^T y^*\). Thus,
\[
\sum_{i=1}^{n} x_i^* = \sum_{j=1}^{m} y_j^* = \theta ,
\]
where \(\theta\) denotes this common optimal value. Then the resulting mixed strategies
\[
\frac{x^*}{\theta} = \frac{1}{\theta} \begin{bmatrix} x_1^* & x_2^* & \cdots & x_n^* \end{bmatrix}^T \quad \text{for player X},
\qquad
\frac{y^*}{\theta} = \frac{1}{\theta} \begin{bmatrix} y_1^* & y_2^* & \cdots & y_m^* \end{bmatrix}^T \quad \text{for player Y}
\]
are optimal in game theory.
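As a numerical illustration of the minimax theorem stated in this part, the sketch below evaluates both sides of the equality for the two-hand game with payoff matrix rows (-10, 20) and (10, -10). The grid of 101 mixed strategies is an arbitrary choice made here, not something taken from the slides; it happens to contain the optimal probabilities exactly, so the two values agree without rounding.

```python
from fractions import Fraction

# Payoff to X in the two-hand game; rows = Y's choice, columns = X's choice.
A = [[-10, 20], [10, -10]]

def payoff(y, x):
    """Expected payoff y^T A x to player X."""
    return sum(A[i][j] * y[i] * x[j] for i in range(2) for j in range(2))

# Hypothetical grid of mixed strategies (p, 1 - p), p = 0, 1/100, ..., 1.
grid = [(Fraction(k, 100), 1 - Fraction(k, 100)) for k in range(101)]
pure = [(1, 0), (0, 1)]

# For a fixed x, y^T A x is linear in y, so Y's best reply can be taken
# among his pure strategies; the same holds for X's reply to a fixed y.
maximin, x_star = max((min(payoff(y, x) for y in pure), x) for x in grid)
minimax, y_star = min((max(payoff(y, x) for x in pure), y) for y in grid)

print("max-min:", maximin, "attained at x =", x_star)
print("min-max:", minimax, "attained at y =", y_star)
```

On this grid both quantities equal 2, attained at x = (3/5, 2/5) and y = (2/5, 3/5). A grid search is used only because it mirrors the definition directly; it is not an efficient way to solve larger games.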
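
Connections Between Game Theory and Duality in LP – (3)
The normalized strategies \(x^*/\theta\) and \(y^*/\theta\), where \(\theta = c^T x^* = b^T y^*\) is the common optimal value of the two linear programs, are optimal because, for any other mixed strategies x and y,
\[
Ax^* \ge b \;\Rightarrow\; y^T A x^* \ge y^T b = 1,
\]
\[
y^{*T} A \le c^T \;\Rightarrow\; y^{*T} A x \le c^T x = 1 .
\]
Thus,
\[
y^T A x^* \;\ge\; 1 \;\ge\; y^{*T} A x
\qquad \text{and, dividing by } \theta, \qquad
y^T A \, \frac{x^*}{\theta} \;\ge\; \frac{1}{\theta} \;\ge\; \frac{y^{*T}}{\theta} \, A x .
\]
This says that player X cannot win more than \(1/\theta\) against the strategy \(y^*/\theta\), and player Y cannot lose less than \(1/\theta\) against the strategy \(x^*/\theta\).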
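The link between the game and the pair of linear programs can be tried out on the two-hand game with payoff matrix rows (-10, 20) and (10, -10). The sketch below is an assumption-laden illustration rather than a general solver. It handles 2 by 2 games only, it solves each linear program by enumerating the corner points of the feasible region (a technique chosen here for brevity, not one given in the slides), and the shift alpha = 10 and all function names are made up for the example. It also assumes alpha is large enough that the shifted game has positive value, so both programs are feasible and bounded.

```python
from fractions import Fraction
from itertools import combinations

def vertices(lines):
    """Intersection points of every pair of lines g[0]*u + g[1]*v = h."""
    pts = []
    for (g1, h1), (g2, h2) in combinations(lines, 2):
        det = g1[0] * g2[1] - g1[1] * g2[0]
        if det != 0:  # parallel lines have no single intersection point
            u = Fraction(h1 * g2[1] - g1[1] * h2, det)  # Cramer's rule
            v = Fraction(g1[0] * h2 - h1 * g2[0], det)
            pts.append((u, v))
    return pts

def solve_game_by_lp(A, alpha):
    """Solve a 2x2 game via min c^T x, Sx >= b and max b^T y, S^T y <= c,
    where S = A + alpha*E and b, c are vectors of ones."""
    S = [[a + alpha for a in row] for row in A]   # shifted payoff matrix
    ST = [list(col) for col in zip(*S)]           # its transpose
    axes = [((1, 0), 0), ((0, 1), 0)]             # boundaries of u, v >= 0

    # Primal: among feasible corner points, minimize x1 + x2.
    feas_x = [p for p in vertices([(tuple(r), 1) for r in S] + axes)
              if min(p) >= 0 and all(r[0] * p[0] + r[1] * p[1] >= 1 for r in S)]
    x_opt = min(feas_x, key=sum)

    # Dual: among feasible corner points, maximize y1 + y2.
    feas_y = [p for p in vertices([(tuple(r), 1) for r in ST] + axes)
              if min(p) >= 0 and all(r[0] * p[0] + r[1] * p[1] <= 1 for r in ST)]
    y_opt = max(feas_y, key=sum)

    theta = sum(x_opt)
    assert theta == sum(y_opt)  # duality: the two optimal values agree
    x_mix = tuple(v / theta for v in x_opt)  # normalized strategy for X
    y_mix = tuple(v / theta for v in y_opt)  # normalized strategy for Y
    # 1/theta is the value of the shifted game; subtract alpha to undo the shift.
    return x_mix, y_mix, 1 / theta - alpha

x_mix, y_mix, value = solve_game_by_lp([[-10, 20], [10, -10]], 10)
print(x_mix, y_mix, value)
```

With alpha = 10 the programs give x* = (1/20, 1/30) and y* = (1/30, 1/20), so the common optimal value is 1/12. Normalizing yields the strategies (3/5, 2/5) for X and (2/5, 3/5) for Y, and the value of the original game is 12 - 10 = 2.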