Lecture 8 (More on mixed strategies)
Games of pure conflict: two-person constant sum

Two-person constant sum game
• Sometimes called a zero-sum game.
• The sum of the players’ payoffs is the same, no matter what pair of actions they take.
• In a two-person constant sum game, one player’s gain is the other’s loss.

Maximin strategy
• One way to play a game is to take a very cautious view.
• Your payoff from any action depends on the other player’s actions.
• In a two-player game, you might assume the other player always does what is worst for you.
• Given that assumption, you would choose the strategy that gives you the best payoff available if the other player always does what is worst for you given your strategy.

Simple hide and seek

                                     Player 2 (Seeker)
                                     Look upstairs   Look downstairs
Player 1 (Hider)   Hide upstairs     0, 1            1, 0
                   Hide downstairs   1, 0            0, 1

Is this a constant sum game?
A) Yes
B) No

Penalty Kick

                        Goalkeeper
                        Jump left   Jump right
Shooter   Kick left     .3, .7      .8, .2
          Kick right    .9, .1      .5, .5

Is this a constant sum game?
A) Yes
B) No

Going to the Movies

                    Bob
                    Movie A   Movie B
Alice   Movie A     3, 2      1, 1
        Movie B     0, 0      2, 3

Is this a constant sum game?
A) Yes
B) No
C) Maybe

Mixed strategies and maximin
• Suppose you are Hider, choosing a mixed strategy, and you believe that Seeker will do what is worst for you, given your mixed strategy.
• This is not a silly assumption in a two-player zero sum game, because what is worst for you is best for your opponent.
• The maximin player will choose her best mixed strategy given that she believes her opponent will respond with the strategy that is worst for her.

Clicker question
• Suppose that you are Hider and you choose to hide upstairs with probability .6. What strategy by SEEKER is worst for you?
A) Look upstairs with probability .6
B) Look upstairs and downstairs with equal probability
C) Look upstairs for sure
D) Look upstairs with probability .4

Clicker question
• If you are Hider and hide upstairs with probability .6 and Seeker uses the strategy that is worst for you, what is your expected payoff?
A) .6
B) .4
C) .5
D) .35

More generally
• If you are Hider and you hide upstairs with probability p>1/2, what is the strategy for Seeker that is worst for you?
• Look upstairs.
• What is your expected payoff if he does that?
• You win only if you hide downstairs. The probability of this is 1-p. Expected payoff is (1-p)×1 + p×0 = 1-p.

What if you hide upstairs with p<1/2?
• What is the worst thing that Seeker can do to you?
• (He’ll look downstairs for sure.)
• What is your expected payoff?

Maximin for hide and seek: The pessimist’s view
[figure]

Penalty Kick

                        Goalkeeper
                        Jump left   Jump right
Shooter   Kick left     .3, .7      .8, .2
          Kick right    .9, .1      .5, .5

Let’s look from the pessimistic shooter’s view.

Shooter’s View
[figure]

Clicker question
If Shooter randomly chooses left with probability p>4/9, what Goalie strategy is worst for Shooter?
A) Jump left
B) Jump right
C) Jump left with the same probability that Shooter shoots left
D) Jump left with probability ½, right with probability ½

Clicker question
If Shooter shoots left with probability p, what is the best response for Goalie?
A) Jump left with probability p
B) Jump left with probability ½
C) Jump left for sure if p>4/9, right if p<4/9
D) Jump left with probability 1-p

Constant sum games and Maximin
• Note that when Shooter uses his maximin strategy, his own payoff is the same for either response by Goalkeeper.
• If Shooter’s payoff is the same from both strategies, so is Goalkeeper’s. (Why?)
• If Goalkeeper’s payoff is the same from both strategies, Goalkeeper is willing to randomize.

Clicker question
If Goalie jumps left with probability ½, what strategy by Shooter is worst for Goalie?
A) Shoot left
B) Shoot right
C) Shoot left or right with equal probability

Clicker question
What strategy by Goalie makes Shooter equally well off from shooting left or right?
A) Jump left with probability ½
B) Jump left with probability 2/3
C) Jump left with probability 1/3

Summing up
In Maximin equilibrium:
– Shooter shoots to left with probability 4/9
– Goalkeeper jumps left with probability 1/3
– Shooter scores with probability .633
– Goalkeeper makes save with probability .367
Maximin is also a Nash equilibrium in zero sum games.

Maximin and the movies

                    Bob
                    Movie A   Movie B
Alice   Movie A     3, 2      1, 1
        Movie B     0, 0      2, 3

This is not a constant sum game. Maximin equilibrium is not a Nash equilibrium.

Alice’s View
[figure]

Maximin equilibrium
Symmetric story for Bob. In maximin equilibrium each is equally likely to go to either movie.

If Alice is equally likely to go to Movie A or Movie B, what is Bob’s best response?
A) Randomize with probability ½
B) Go to Movie B
C) Go to Movie A

Is the maximin equilibrium for Alice and Bob a Nash equilibrium?
A) Yes
B) No

Some more Problems

Advanced Rock-Paper-Scissors

             Rock     Paper    Scissors
Rock         0, 0     -1, 1    2, -2
Paper        1, -1    0, 0     -1, 1
Scissors     -2, 2    1, -1    0, 0

Are there pure strategy Nash equilibria?
Is there a symmetric mixed strategy Nash equilibrium? What is it?

Finding Mixed Strategy Nash Equilibrium
Let the probabilities that column chooser chooses rock, paper, and scissors be r, p, and s = 1-p-r.
Row chooser must be indifferent between rock and paper. This tells us that
-p + 2(1-p-r) = r - (1-p-r)
Row chooser must also be indifferent between rock and scissors. This tells us that
-p + 2(1-p-r) = -2r + p
We have 2 linear equations in 2 unknowns. Let’s solve. They simplify to 4r + 4p = 3 and 4p = 2.
So we have p = 1/2 and r = 1/4. Then s = 1-p-r = 1/4.

Problem 7.7
Find mixed strategy Nash equilibria.
For Player 1, Bottom strictly dominates Top. Throw out Top.
Then for Player 2, Middle weakly dominates Right.
Therefore if Player 1 plays bottom with positive probability, Player 2 gives zero probability to Right.
There is no N.E. in which Player 1 plays Bottom with zero probability. (Why?) (If he did, what would Player 2 play? Then what would 1 play?)

More mechanically
Suppose Player 1 goes middle with probability m and bottom with probability 1-m. Then expected payoffs for Player 2 are:
1m + 3(1-m) for playing left
3m + 2(1-m) for playing middle
1m + 2(1-m) for playing right
We see that playing right is worse than playing middle if m>0.
So let’s see if there is a mixed strategy Nash equilibrium where Player 2 plays only left and middle and Player 1 is willing to play a mixed strategy.

Does this game have a Nash equilibrium in which Kicker mixes left and right but does not kick to center?
• If there is a Nash equilibrium where Kicker never kicks middle but mixes between left and right, Goalie will never play middle but will mix left and right. (Why?)
• If Goalie never plays middle but mixes left and right, Kicker will kick middle. (Why?)
• So there can’t be a Nash equilibrium where Kicker never kicks middle. (See why?)

Problem 4: For what values of x is there a mixed strategy Nash equilibrium in which the victim might resist or not resist and the mugger assigns zero probability to showing a gun?

Mugger’s Game
If there is a Nash equilibrium in which the mugger does not show a gun and both mugger and victim have mixed strategies, it must be that the mugger’s payoff in this equilibrium is at least as high as that of showing a gun.

Mixed strategy equilibrium with no visible gun

                         Victim
                         Resist   Don’t resist
Mugger   No gun          2, 6     6, 3
         Hidden gun      3, 2     5, 5

Note that there is no pure strategy N.E.
If Victim resists with probability p, then Mugger’s expected payoff from having no gun is 2p + 6(1-p) = 6-4p.
Mugger’s expected payoff from having a hidden gun is 3p + 5(1-p) = 5-2p.
Mugger will use a mixed strategy only if 6-4p = 5-2p, which implies p = 1/2.
If p = 1/2, the expected payoff from not showing a gun is 4.
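The indifference calculation for the reduced mugger's game can be checked with a short sketch. The payoff numbers are copied from the table; the names `MUGGER_PAYOFF`, `mugger_expected`, `indifference_p`, and `no_show_is_best` are illustrative and not part of the lecture, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

# Mugger's payoffs in the reduced game with no visible gun, copied from the table.
# Keys are (mugger_action, victim_action); the string labels are illustrative.
MUGGER_PAYOFF = {
    ("no_gun", "resist"): 2,
    ("no_gun", "dont_resist"): 6,
    ("hidden_gun", "resist"): 3,
    ("hidden_gun", "dont_resist"): 5,
}


def mugger_expected(action, p):
    """Mugger's expected payoff when the victim resists with probability p."""
    return (p * MUGGER_PAYOFF[(action, "resist")]
            + (1 - p) * MUGGER_PAYOFF[(action, "dont_resist")])


def indifference_p():
    """Resist probability that equalizes the mugger's two expected payoffs."""
    # Each expected payoff is linear in p: intercept + slope * p.
    zero, one = Fraction(0), Fraction(1)
    a_no = mugger_expected("no_gun", zero)
    b_no = mugger_expected("no_gun", one) - a_no
    a_hid = mugger_expected("hidden_gun", zero)
    b_hid = mugger_expected("hidden_gun", one) - a_hid
    # Solve a_no + b_no * p = a_hid + b_hid * p for p.
    return (a_hid - a_no) / (b_no - b_hid)


def no_show_is_best(x):
    """True if not showing a gun pays at least x, the payoff from showing one."""
    p = indifference_p()
    return mugger_expected("no_gun", p) >= x
```

Calling `indifference_p()` reproduces p = 1/2, and `mugger_expected` at that p gives 4 for both hidden actions, matching the slide. `no_show_is_best(x)` simply compares that payoff with a candidate value of x.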
Mugger’s Game
• If the mugger shows a gun, he is sure to get a payoff of x.
• If the victim’s strategy is to resist with probability 1/2 if he doesn’t see a gun, then the expected payoff to the mugger from not showing a gun is 3×1/2 + 5×1/2 = 4.
• So there is a mixed strategy N.E. where the mugger doesn’t show a gun if x ≤ 4.

Entry
• N players consider entering a market. If a firm is the only entrant, its net profit is 170.
• If more than one enter, each has net profit 10.
• If a firm stays out, it has net profit 60.
Find a symmetric Nash equilibrium.
In a symmetric N.E. each enters with the same probability p.

Equilibrium
Let q = 1-p. If a firm enters, the probability that nobody else enters is q^(N-1).
If nobody else enters, your profit is 170. If at least one other firm enters, your profit is 10.
So if you enter, your expected profit is 170q^(N-1) + 10(1-q^(N-1)).
If you don’t enter, your expected profit is 60.
So there is a mixed strategy equilibrium if 170q^(N-1) + 10(1-q^(N-1)) = 60, which implies that 160q^(N-1) = 50 and q = (5/16)^(1/(N-1)).
Then p = 1-q = 1-(5/16)^(1/(N-1)).

Saddam and UN (Let’s pretend Saddam had WMDs)
Part a) Saddam is hiding WMDs in location X, Y, or Z. UN can look either in X AND Y or in Z.
All Saddam cares about is hiding. All UN cares about is finding.
This reduces to a simple hide and seek game.
Only trick: Saddam has more than one N.E. mixed strategy.

Saddam and UN
Part b) Saddam is hiding WMDs in location X, Y, or Z. UN can look in any two of these places.
Think of UN’s strategy as “where not to look”. In N.E. the probability of each strategy will be equal. (Why?)
Also in N.E. Saddam’s probability of hiding missiles in each place is the same. (Why?)

See you on Thursday…

Hints on some more problems from Chapter 7

Problem 9.
Each of 3 players is deciding between the pure strategies go and stop. The payoff to go is 120/m, where m is the number of players that choose go, and the payoff to stop is 55 (which is received regardless of what the other players do). Find all Nash equilibria in mixed strategies.
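The payoff rule in Problem 9 can be written down directly, which makes it easy to test any pure strategy profile for profitable deviations. This is a minimal sketch; the function names and the "go"/"stop" string labels are assumptions made for illustration.

```python
from itertools import product

STOP_PAYOFF = 55  # payoff to stop, regardless of what the others do


def payoff(action, others):
    """Payoff to one player given own action and the other two players' actions."""
    if action == "stop":
        return STOP_PAYOFF
    m = 1 + sum(1 for a in others if a == "go")  # number of players choosing go
    return 120 / m


def is_pure_nash(profile):
    """True if no player gains by switching action while the others stay put."""
    for i, action in enumerate(profile):
        others = profile[:i] + profile[i + 1:]
        alternative = "stop" if action == "go" else "go"
        if payoff(alternative, others) > payoff(action, others):
            return False
    return True


def pure_nash_profiles():
    """All pure strategy profiles of the three players that pass the check."""
    return [prof for prof in product(("go", "stop"), repeat=3)
            if is_pure_nash(prof)]
```

`pure_nash_profiles()` enumerates the eight pure profiles and keeps those with no profitable one-player deviation, which is one way to check an answer for the pure strategy cases before turning to mixed strategies.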
Let’s find the “easy ones”.
Are there any symmetric pure strategy equilibria?
How about asymmetric pure strategy equilibria?
How about a symmetric mixed strategy equilibrium?
Solve 40p^2 + 60×2p(1-p) + 120(1-p)^2 = 55, which simplifies to 40p^2 - 120p + 65 = 0.

What about equilibria where one guy is in for sure and the other two enter with identical mixed strategies?
For the mixed strategy guys who both enter with probability p, the expected payoff from entering is (120/3)p + (120/2)(1-p).
They are indifferent about entering or not if 40p + 60(1-p) = 55. This happens when p = 1/4.
This will be an equilibrium if, when the other two guys enter with probability ¼, the remaining guy is better off entering than not.
Payoff to the guy who enters for sure is: 40×(1/16) + 60×(3/8) + 120×(9/16) = 92.5 > 55.

Problem 7.7, Find mixed strategy Nash equilibria
A mixed strategy N.E. strategy does not give positive probability to any strictly dominated strategy.
c dominates a and y dominates z.
Look at the reduced game without these strategies.

Problem 8, Chapter 7
A Nash equilibrium is any strategy pair in which the defense defends against the outside run with probability .5 and the offense runs up the middle with probability .75.
No matter what the defense does, the offense gets the same payoff from wide left or wide right, so any probabilities p_wl and p_wr such that p_wl + p_wr = .25 will be N.E. probabilities for the offense.
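For Problem 9 above, the two mixed strategy calculations can be checked numerically. The sketch below assumes nothing beyond the formulas on the slide; the function names are illustrative. It solves the quadratic 40p^2 - 120p + 65 = 0 and keeps only roots that are valid probabilities, then evaluates the payoff of the player who enters for sure when the other two enter with probability 1/4.

```python
import math


def go_payoff(p):
    """Expected payoff to go when each of the other two goes with probability p."""
    # Both others go (share 120 three ways), exactly one goes, or neither goes.
    return 40 * p**2 + 60 * 2 * p * (1 - p) + 120 * (1 - p) ** 2


def symmetric_mixed_candidates():
    """Roots of 40p^2 - 120p + 65 = 0 that lie in [0, 1]."""
    a, b, c = 40.0, -120.0, 65.0
    disc = b * b - 4 * a * c
    roots = [(-b - math.sqrt(disc)) / (2 * a), (-b + math.sqrt(disc)) / (2 * a)]
    return [r for r in roots if 0.0 <= r <= 1.0]


def mixer_go_payoff(p):
    """Payoff to go for a mixer when one player goes for sure and the other mixes."""
    return 40 * p + 60 * (1 - p)
```

Only one root of the quadratic is a probability, roughly 0.71, and `go_payoff` at that value returns 55 up to rounding. For the asymmetric case, `mixer_go_payoff(0.25)` equals 55 and `go_payoff(0.25)` equals 92.5, matching the slide.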