Lecture 8 (More on mixed strategies)

Games of pure conflict
two person constant sum
Two-person constant sum game
• Sometimes called zero-sum game.
• The sum of the players’ payoffs is the same,
no matter what pair of actions they take.
• In a two-person constant sum game, one
player’s gain is the other’s loss.
Maximin strategy
• One way to play a game is to take a very cautious
view.
• Your payoff from any action depends on other’s
actions.
• In a two-player game, you might assume other
player always does what is worst for you.
• Given that assumption, you would choose the
strategy that gives you the best payoff
available if the other player always does what is
worst for you given your strategy.
Simple hide and seek
                                     Player 2 (Seeker)
                                     Look upstairs    Look downstairs
Player 1 (Hider)   Hide upstairs     0, 1             1, 0
                   Hide downstairs   1, 0             0, 1
Is this a constant sum game?
A) Yes B) No
Penalty Kick
                        Goalkeeper
                        Jump Left    Jump Right
Shooter   Kick Left     .3, .7       .8, .2
          Kick Right    .9, .1       .5, .5
Is this a constant sum game?
A) Yes B) No
Going to the Movies
                    Bob
                    Movie A    Movie B
Alice   Movie A     3, 2       1, 1
        Movie B     0, 0       2, 3
Is this a constant sum game?
A) Yes
B) No
C) Maybe
Mixed strategies and maximin
• Suppose you are Hider, choosing a mixed strategy,
and you believe that Seeker will do what is worst
for you, given your mixed strategy.
• This is not a silly assumption in a two-player zero
sum game, because what is worst for you is best
for your opponent.
• The maximin player will choose her best mixed
strategy given that she believes opponent will
respond with the strategy that is worst for her.
Clicker question
• Suppose that you are Hider and you choose to
hide upstairs with probability .6. What
strategy by SEEKER is worst for you?
A) Look upstairs with probability .6
B) Look upstairs and downstairs with equal
probability
C) Look upstairs for sure
D) Look upstairs with probability .4
Clicker question
• If you are Hider and hide upstairs with
probability .6 and Seeker uses the strategy
that is worst for you, what is your expected
payoff?
A) .6
B) .4
C) .5
D) .35
More generally
• If you are Hider and you hide upstairs with
probability p>1/2, what is the strategy for
Seeker that is worst for you?
• Look upstairs
• What is your expected payoff if he does that?
• You win only if you hide downstairs.
Probability of this is 1-p. Expected payoff is
(1-p)(1) + p(0) = 1-p
What if you hide upstairs with p<1/2?
• What is worst thing that Seeker can do to you?
• (He’ll look downstairs for sure.)
• What is your expected payoff?
Maximin for hide and seek
The pessimist’s view
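The pessimist's calculation for hide and seek can be checked numerically. The following is a minimal sketch, assuming the payoffs from the hide-and-seek table (Hider gets 1 if not found and 0 if found). The function names `hider_worst_case` and `hider_maximin` are illustrative and do not come from the lecture.

```python
from fractions import Fraction

def hider_worst_case(p):
    """Hider's expected payoff when Seeker responds in the way that is worst for Hider.

    p is the probability that Hider hides upstairs.
    """
    payoff_if_seeker_looks_up = 1 - p    # Hider wins only when hiding downstairs
    payoff_if_seeker_looks_down = p      # Hider wins only when hiding upstairs
    return min(payoff_if_seeker_looks_up, payoff_if_seeker_looks_down)

def hider_maximin(steps=100):
    """Search a grid of probabilities for the one with the best worst case."""
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    return max(grid, key=hider_worst_case)

print(hider_worst_case(Fraction(6, 10)))  # worst case when hiding upstairs with probability .6
print(hider_maximin())                    # grid point with the largest worst-case payoff
```

With these payoffs the worst case is min(p, 1-p), so hiding upstairs with probability .6 guarantees only .4, and the best worst case is reached at p = 1/2.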
Penalty Kick
                        Goalkeeper
                        Jump Left    Jump Right
Shooter   Kick Left     .3, .7       .8, .2
          Kick Right    .9, .1       .5, .5
Let’s look from pessimistic shooter’s view
Shooter’s View
Clicker question
If shooter randomly chooses left with probability
p>4/9, what Goalie strategy is worst for shooter?
A) Jump left
B) Jump right
C) Jump left with same probability that shooter
shoots left
D) Jump left with probability ½, right with
probability ½.
Clicker question
If Shooter shoots left with probability p, what is
the best response for Goalie?
A) Jump left with probability p
B) Jump left with probability ½
C) Jump left for sure if p>4/9, right if p<4/9
D) Jump left with probability 1-p
Constant sum games and Maximin
• Note that when shooter uses maximin
strategy, his own payoff is the same for either
response by Goalkeeper.
• If shooter’s payoff is the same from both
strategies, so is goalkeeper’s. (Why?)
• If goalkeeper’s payoff is the same from both
strategies, goalkeeper is willing to randomize.
Clicker Question
If Goalie jumps left with probability ½, what
strategy by Shooter is worst for Goalie?
A) Shoot left
B) Shoot right
C) Shoot left or right with equal probability
Clicker question
What strategy by Goalie makes Shooter equally
well off from shooting left or right?
A) Jump left with probability ½
B) Jump left with probability 2/3
C) Jump left with probability 1/3
Summing up
In Maximin equilibrium:
– Shooter shoots to left with probability 4/9
– Goalkeeper jumps left with probability 1/3
– Shooter scores with probability .633
– Goalkeeper makes save with probability .367
Maximin is also a Nash equilibrium in zero sum
games
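The maximin probabilities for the penalty kick come from two indifference conditions, which can be solved exactly. This is a sketch under the assumption that the game is 2x2 with no pure-strategy saddle point; the helper names `equalizing_row_mix` and `equalizing_col_mix` are made up for illustration.

```python
from fractions import Fraction as F

# Shooter's scoring probabilities from the penalty-kick table:
# rows = Shooter (kick left, kick right), columns = Goalkeeper (jump left, jump right).
SHOOTER = [[F(3, 10), F(8, 10)],
           [F(9, 10), F(5, 10)]]

def equalizing_row_mix(a):
    """Probability on row 0 that gives the row player the same payoff
    against either column (assumes no pure-strategy saddle point)."""
    # p*a[0][0] + (1-p)*a[1][0] == p*a[0][1] + (1-p)*a[1][1]
    return (a[1][1] - a[1][0]) / (a[0][0] - a[1][0] - a[0][1] + a[1][1])

def equalizing_col_mix(a):
    """Probability on column 0 that makes the row player indifferent between rows."""
    # q*a[0][0] + (1-q)*a[0][1] == q*a[1][0] + (1-q)*a[1][1]
    return (a[1][1] - a[0][1]) / (a[0][0] - a[0][1] - a[1][0] + a[1][1])

p = equalizing_row_mix(SHOOTER)   # Shooter kicks left with this probability
q = equalizing_col_mix(SHOOTER)   # Goalkeeper jumps left with this probability
value = p * SHOOTER[0][0] + (1 - p) * SHOOTER[1][0]  # Shooter's scoring probability

print(p, q, value, 1 - value)
```

The exact values are p = 4/9, q = 1/3, a scoring probability of 19/30 (about .633) and a save probability of 11/30 (about .367).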
Maximin and the movies
                    Bob
                    Movie A    Movie B
Alice   Movie A     3, 2       1, 1
        Movie B     0, 0       2, 3
This is not a constant sum game. Maximin
equilibrium is not a Nash equilibrium.
Alice’s View
Maximin equilibrium
Symmetric story for Bob.
In maximin equilibrium each is equally likely to
go to either movie.
If Alice is equally likely to go to Movie
A or Movie B, what is Bob’s best
response?
A) Randomize with probability ½
B) Go to Movie B
C) Go to Movie A
Is the maximin equilibrium for Alice
and Bob a Nash equilibrium?
A) Yes
B) No
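The two questions above can be checked by computing Bob's expected payoff from each movie when Alice mixes 50-50. This is a small sketch that assumes the payoff table is read with Alice choosing the row and Bob the column; the dictionary layout and the function name `bob_expected` are illustrative choices.

```python
from fractions import Fraction as F

# (Alice's payoff, Bob's payoff); first key = Alice's movie, second key = Bob's movie.
PAYOFFS = {("A", "A"): (3, 2), ("A", "B"): (1, 1),
           ("B", "A"): (0, 0), ("B", "B"): (2, 3)}

def bob_expected(bob_movie, alice_prob_a):
    """Bob's expected payoff from a pure choice when Alice picks Movie A
    with probability alice_prob_a."""
    return (alice_prob_a * PAYOFFS[("A", bob_movie)][1]
            + (1 - alice_prob_a) * PAYOFFS[("B", bob_movie)][1])

half = F(1, 2)
payoff_a = bob_expected("A", half)
payoff_b = bob_expected("B", half)
print(payoff_a, payoff_b)  # prints: 1 2
```

Bob gets 1 from Movie A and 2 from Movie B, so he strictly prefers Movie B. His 50-50 maximin mix is therefore not a best response, which is why the maximin pair is not a Nash equilibrium here.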
Some more Problems
Advanced Rock-Paper-Scissors
            Rock      Paper     Scissors
Rock        0, 0      -1, 1     2, -2
Paper       1, -1     0, 0      -1, 1
Scissors    -2, 2     1, -1     0, 0
Are there pure strategy Nash equilibria?
Is there a symmetric mixed strategy Nash equilibrium?
What is it?
Finding Mixed Strategy Nash Equilibrium
            Rock      Paper     Scissors
Rock        0, 0      -1, 1     2, -2
Paper       1, -1     0, 0      -1, 1
Scissors    -2, 2     1, -1     0, 0
Let probabilities that column chooser chooses
rock, paper, and scissors be r, p, and s=1-p-r
Row chooser must be indifferent between rock and paper
This tells us that -p + 2(1-p-r) = r - (1-p-r)
Row chooser must also be indifferent between rock and scissors.
This tells us that -p + 2(1-p-r) = -2r + p
We have 2 linear equations in 2 unknowns. Let’s solve.
They simplify to 4r+4p=3 and 4p=2.
So we have p=1/2 and r=1/4. Then s=1-p-r=1/4.
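The two simplified equations can be solved exactly, and the result checked against the payoff table. The sketch below assumes the first number in each cell is the row chooser's payoff; `solve_2x2` is a hypothetical helper written for this example.

```python
from fractions import Fraction as F

# Row chooser's payoffs; rows and columns are ordered rock, paper, scissors.
ROW_PAYOFF = [[0, -1, 2],
              [1, 0, -1],
              [-2, 1, 0]]

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = F(a11 * a22 - a12 * a21)
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det

# The two indifference conditions simplify to 4r + 4p = 3 and 0r + 4p = 2.
r, p = solve_2x2(4, 4, 3, 0, 4, 2)
s = 1 - r - p
mix = [r, p, s]

# Row chooser's expected payoff from each pure strategy against this mix.
row_values = [sum(a * prob for a, prob in zip(row, mix)) for row in ROW_PAYOFF]
print(mix, row_values)
```

Against the mix (1/4, 1/2, 1/4), rock, paper, and scissors all give the row chooser an expected payoff of 0, so the row chooser is willing to use the same mix.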
Problem 7.7 Find mixed strategy Nash equilibria
For player 1, Bottom strictly dominates Top. Throw out Top.
Then for Player 2, Middle weakly dominates Right. Therefore if
Player 1 plays Bottom with positive probability, Player 2 gives zero
probability to Right.
There is no N.E. in which Player 1 plays Bottom with zero probability. (Why?)
(If he did, what would Player 2 play? Then what would 1 play?)
More mechanically
Suppose player 1 goes middle with probability m and
bottom with probability 1-m.
Then expected payoffs for player 2 are:
1m+3(1-m) for playing left
3m+2(1-m) for playing middle
1m+2(1-m) for playing right
We see that playing right is worse than playing middle if
m>0.
So let’s see if there is a mixed strategy Nash equilibrium
where Player 2 plays only left and middle and Player 1 is
willing to play a mixed strategy.
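Player 2's three expected payoffs can be tabulated directly from the expressions above. This is a sketch that uses only Player 2's payoffs as stated on this slide (Player 1's payoffs are not reproduced in the transcript, so they are not used). The name `player2_payoffs` is illustrative.

```python
from fractions import Fraction as F

def player2_payoffs(m):
    """Player 2's expected payoffs when Player 1 plays middle with
    probability m and bottom with probability 1 - m."""
    return {"left": 1 * m + 3 * (1 - m),
            "middle": 3 * m + 2 * (1 - m),
            "right": 1 * m + 2 * (1 - m)}

# Right is never better than middle; the gap is 2m, so it is strictly worse when m > 0.
for m in (F(0), F(1, 4), F(1, 2), F(1)):
    u = player2_payoffs(m)
    assert u["middle"] - u["right"] == 2 * m

# Player 2 is willing to mix left and middle only if they pay the same:
# 3 - 2m = 2 + m, which gives m = 1/3.
m_star = F(1, 3)
u_star = player2_payoffs(m_star)
print(u_star["left"], u_star["middle"], u_star["right"])  # prints: 7/3 7/3 5/3
```

At m = 1/3, left and middle both pay 7/3 while right pays 5/3, so a Player 2 who mixes only left and middle is best responding.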
Does this game have a Nash equilibrium in which Kicker mixes left
and right but does not kick to center?
• If there is a Nash equilibrium where kicker
never kicks middle but mixes between left and
right, Goalie will never play middle but will
mix left and right (Why?)
• If Goalie never plays middle but mixes left
and right, Kicker will kick middle. (Why?)
• So there can’t be a Nash equilibrium where
Kicker never kicks Middle. (See why?)
Problem 4: For what values of x is there a mixed strategy Nash
equilibrium in which the victim might resist or not resist and the
Mugger assigns zero probability to showing a gun?
Mugger’s Game
If there is a Nash equilibrium in which mugger
does not show gun and both mugger and victim
have mixed strategies, it must be that the
mugger’s payoff in this equilibrium is at least as
high as that of showing a gun.
Mixed strategy equilibrium
with no visible gun
               Resist    Don’t resist
No Gun         2, 6      6, 3
Hidden Gun     3, 2      5, 5
Note that there is no pure strategy N.E.
If Victim resists with probability p then
Mugger’s expected payoff from having no gun is 2p+6(1-p)=6-4p
Mugger’s expected payoff from having a hidden gun is 3p+5(1-p)=5-2p
Mugger will use a mixed strategy only if 6-4p=5-2p, which implies p=1/2.
If p=1/2, the expected payoff from not showing a gun is 4.
Mugger’s Game
• If mugger shows gun, he is sure to get a payoff
of x.
• If victim’s strategy is to resist with probability
1/2 if he doesn’t see a gun, then expected
payoff to mugger from not showing a gun is
3(1/2) + 5(1/2) = 4.
• So there is a mixed strategy N.E. where
mugger doesn’t show gun if x≤ 4.
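The threshold on x can be reproduced from the Mugger's payoffs in the no-visible-gun table. The sketch below is illustrative: the dictionary keys and the function names `mugger_expected` and `no_show_equilibrium_exists` are assumptions made for this example.

```python
from fractions import Fraction as F

# Mugger's payoffs when the gun is not shown; keys are (mugger's choice, victim's choice).
MUGGER = {("no gun", "resist"): 2, ("no gun", "dont"): 6,
          ("hidden gun", "resist"): 3, ("hidden gun", "dont"): 5}

def mugger_expected(choice, p_resist):
    """Mugger's expected payoff when the victim resists with probability p_resist."""
    return (p_resist * MUGGER[(choice, "resist")]
            + (1 - p_resist) * MUGGER[(choice, "dont")])

# Indifference 6 - 4p = 5 - 2p gives p = 1/2.
p = F(1, 2)
no_show_payoff = mugger_expected("no gun", p)
assert no_show_payoff == mugger_expected("hidden gun", p)

def no_show_equilibrium_exists(x):
    """True when showing the gun (sure payoff x) is no better than not showing it."""
    return x <= no_show_payoff

print(no_show_payoff, no_show_equilibrium_exists(3), no_show_equilibrium_exists(5))
# prints: 4 True False
```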
Entry
• N players consider entering a market. If a firm
is the only entrant its net profit is 170.
• If more than one enter each has net profit 10.
• If a firm stays out it has net profit 60.
Find a symmetric Nash equilibrium.
In symmetric N.E. each enters with same
probability p.
Equilibrium
Let q=1-p. If a firm enters, the probability that nobody
else enters is q^(N-1).
If nobody else enters, your profit is 170. If at least one
other firm enters, your profit is 10. So if you enter, your
expected profit is
170q^(N-1) + 10(1 - q^(N-1))
If you don’t enter your expected profit is 60.
So there is a mixed strategy equilibrium if
170q^(N-1) + 10(1 - q^(N-1)) = 60, which implies that
160q^(N-1) = 50 and q = (5/16)^(1/(N-1)).
Then p = 1 - q = 1 - (5/16)^(1/(N-1)).
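To see how the entry probability changes with the number of firms, the closed form can be evaluated for a few values of N. This is a sketch: the function name `entry_probability` and its keyword parameters are illustrative, and the crowded-market profit defaults to the value 10 used in the derivation.

```python
def entry_probability(n, solo=170.0, crowded=10.0, outside=60.0):
    """Symmetric mixed-strategy entry probability with n >= 2 firms.

    solo: profit when you are the only entrant
    crowded: profit when at least one other firm also enters
    outside: profit from staying out
    """
    # Indifference: solo*Q + crowded*(1 - Q) = outside, where Q = q**(n-1)
    prob_alone = (outside - crowded) / (solo - crowded)   # 50/160 = 5/16
    q = prob_alone ** (1.0 / (n - 1))                     # each rival stays out with probability q
    return 1.0 - q

for n in (2, 3, 5, 10):
    print(n, round(entry_probability(n), 4))
```

With two firms each enters with probability 1 - 5/16 = .6875, and the probability falls as N grows, since the chance of being the only entrant must stay at 5/16.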
Saddam and UN
(Let’s Pretend Saddam had WMD’s)
Part a) Saddam is hiding WMDs in location X, Y,
or Z. UN can look either in X AND Y or in Z.
All Saddam cares about is hiding. All UN cares
about is finding.
This reduces to a simple hide and seek game.
Only trick: Saddam has more than 1 N.E. mixed
strategy
Saddam and UN
Part b) Saddam is hiding WMDs in location X, Y,
or Z. UN can look in any two of these places.
Think of UN’s strategy as “where not to look”.
In N.E. probability of each strategy will be
equal. (Why?)
Also in N.E. Saddam’s probability of hiding WMDs
in each place is the same. (Why?)
See you on Thursday…
Hints on some more problems from
Chapter 7
Problem 9.
Each of 3 players is deciding between the pure
strategies go and stop. The payoff to
go is 120/m , where m is the number of players that
choose go, and the payoff to stop is 55 (which is
received regardless of what the other players do). Find
all Nash equilibria in mixed strategies.
Let’s find the “easy ones”.
Are there any symmetric pure strategy equilibria?
How about asymmetric pure strategy equilibria?
How about symmetric mixed strategy equilibrium?
Solve 40p^2 + 60*2p(1-p) + 120(1-p)^2 = 55
40p^2 - 120p + 65 = 0
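The quadratic has only one root that is a valid probability. A minimal sketch using the quadratic formula follows; the function names `symmetric_go_probability` and `payoff_from_go` are illustrative.

```python
import math

def symmetric_go_probability():
    """Root in [0, 1] of 40p^2 - 120p + 65 = 0."""
    a, b, c = 40.0, -120.0, 65.0
    disc = math.sqrt(b * b - 4 * a * c)
    roots = [(-b - disc) / (2 * a), (-b + disc) / (2 * a)]
    return next(r for r in roots if 0.0 <= r <= 1.0)

def payoff_from_go(p):
    """Expected payoff from go when each of the other two players goes with probability p."""
    return 40 * p**2 + 60 * 2 * p * (1 - p) + 120 * (1 - p)**2

p = symmetric_go_probability()
print(round(p, 4), round(payoff_from_go(p), 6))
```

The valid root is p = (3 - sqrt(5/2))/2, about .709; the other root is greater than 1. At that p the payoff from go equals the payoff of 55 from stop.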
What about equilibria where one guy is in for sure and other
two enter with identical mixed strategies?
For the two mixed strategy guys, who both enter with probability p, the
expected payoff from entering is
(120/3)p+(120/2)(1-p). They are indifferent about entering or not if
40p+60(1-p)=55. This happens when p=1/4.
This will be an equilibrium if, when the other two guys enter with
probability 1/4, the remaining guy is better off entering than not.
Payoff to guy who enters for sure is:
40*(1/16)+60*(3/8)+120*(9/16)=92.5>55.
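Both conditions for this asymmetric equilibrium can be verified with exact arithmetic. This is a sketch; the helper `go_payoff` and the variable names are assumptions made for the example.

```python
from fractions import Fraction as F

STOP = 55  # payoff to stop, regardless of what the others do

def go_payoff(num_going):
    """Payoff to go when num_going players (including you) choose go."""
    return F(120, num_going)

p = F(1, 4)  # entry probability of each of the two mixing players

# A mixing player faces one sure entrant plus one rival who enters with probability p.
mixer_payoff = p * go_payoff(3) + (1 - p) * go_payoff(2)

# The sure entrant faces two rivals who each enter with probability p.
sure_payoff = (p * p * go_payoff(3)
               + 2 * p * (1 - p) * go_payoff(2)
               + (1 - p) ** 2 * go_payoff(1))

print(mixer_payoff, sure_payoff)  # prints: 55 185/2
```

The mixing players get exactly 55 from entering, so they are indifferent, while the sure entrant gets 185/2 = 92.5 > 55 and strictly prefers to enter.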
Problem 7.7, Find mixed strategy Nash equilibria
A mixed strategy N.E. does not give positive probability
to any strictly dominated strategy.
c dominates a and y dominates z.
Look at reduced game without these strategies.
Problem 8, Chapter 7
A Nash equilibrium is any strategy pair in which the defense defends
against the outside run with probability .5 and the offense runs up the
middle with probability .75. No matter what the defense does,
the offense gets the same payoff from wide left or wide right,
so any probabilities p_wl and p_wr such that p_wl + p_wr = .25
will be N.E. probabilities for the offense.