8: The Prisoners’ Dilemma and repeated games
In this section we shall learn:
How repeated play of a game opens up many new strategic possibilities.
How to improve the outcome you achieve from a game by making your current play contingent on how your opponent played in the past.
How to make the other player be cooperative.
Warning: We shall also learn when repeated play gains us nothing.
Games People Play.
The Prisoners’ Dilemma
Finite repetition.
What happens when the game is repeated with the same opponent a finite number of times?

                        Criminal #2
                        Confess    Deny
Criminal #1   Confess   10,10      1,25
              Deny      25,1       3,3
The Prisoners’ Dilemma
Finite repetition.
The last play of the game is a one-shot game, so the equilibrium is {C,C}.
In the last-but-one play the players know that {C,C} will be played in the last stage, so they play {C,C} in the last-but-one stage, and so on.
This is just backward induction.
But was it the outcome of actual play?

                        Criminal #2
                        Confess    Deny
Criminal #1   Confess   10,10      1,25
              Deny      25,1       3,3
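
The backward-induction argument can be checked mechanically. The sketch below is only an illustration: it reads the payoffs as years in prison (an assumption; the slides say only that smaller numbers are better), and the names YEARS, nash_equilibria and backward_induction are made up for this example. It also assumes the stage game has a single equilibrium, which holds for this matrix.

```python
from itertools import product

# Payoffs for (Criminal #1, Criminal #2); smaller is better.
# Reading them as years in prison is an assumption of this sketch.
YEARS = {
    ("C", "C"): (10, 10),
    ("C", "D"): (1, 25),
    ("D", "C"): (25, 1),
    ("D", "D"): (3, 3),
}
ACTIONS = ("C", "D")  # C = confess, D = deny


def nash_equilibria(years):
    """Return the action pairs from which neither player gains by deviating alone."""
    found = []
    for a1, a2 in product(ACTIONS, repeat=2):
        best1 = all(years[(a1, a2)][0] <= years[(alt, a2)][0] for alt in ACTIONS)
        best2 = all(years[(a1, a2)][1] <= years[(a1, alt)][1] for alt in ACTIONS)
        if best1 and best2:
            found.append((a1, a2))
    return found


def backward_induction(years, rounds):
    """Solve the last round first, then work back towards the first round."""
    plan = []
    future = (0, 0)  # payoff still to come after the round being solved
    for _ in range(rounds):
        # Later play is already fixed, so it adds the same constant to every cell
        # and cannot change which cell is the equilibrium of this round.
        shifted = {k: (v[0] + future[0], v[1] + future[1]) for k, v in years.items()}
        equilibrium = nash_equilibria(shifted)[0]  # assumes a unique equilibrium
        plan.append(equilibrium)
        future = shifted[equilibrium]
    return plan[::-1]  # first round first


print(nash_equilibria(YEARS))        # one-shot game
print(backward_induction(YEARS, 3))  # three repetitions
```

With this matrix both calls report only {C,C}: a known final round leaves no room for a threat about the future.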
The Prisoners’ Dilemma
Indeterminate number of repetitions.
What happens when the number of repetitions, and thus the end of the game, is unknown?

                        Criminal #2
                        Confess    Deny
Criminal #1   Confess   10,10      1,25
              Deny      25,1       3,3
The Prisoners’ Dilemma
Indeterminate number of repetitions.
The key element here is that the players can adopt contingent strategies: "If you do this in this round of the game, then I will respond in the next round."
This option remains available as long as there is some chance there will be a next round.

                        Criminal #2
                        Confess    Deny
Criminal #1   Confess   10,10      1,25
              Deny      25,1       3,3
The Prisoners’ Dilemma
Trigger Strategy Equilibria.
If someone “misbehaves” in one round of the game, they can be punished in the next.
This option remains available if there is some chance there will be a next round.
If this threat is sufficient to change the players' strategies, then we have found a new equilibrium, termed a trigger strategy equilibrium.
The simplest form of trigger strategy is called tit-for-tat: whatever you do in this round of the game, I will do in the next round.
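
As a rough illustration of the tit-for-tat rule, the sketch below plays it against a hypothetical opponent who confesses exactly once. The function names, the opening move of deny and the round in which the opponent cheats are all assumptions made for this example; none of them come from the slides.

```python
def tit_for_tat(opponent_history, opening="D"):
    """Open with `opening` (assumed to be deny), then copy the opponent's last move."""
    return opponent_history[-1] if opponent_history else opening


def confess_once(round_index, cheat_round=2):
    """Hypothetical opponent: denies in every round except `cheat_round`."""
    return "C" if round_index == cheat_round else "D"


def play(rounds=6):
    """Return the move sequences of the tit-for-tat player and the opponent."""
    mine, theirs = [], []
    for t in range(rounds):
        my_move = tit_for_tat(theirs)  # sees only the opponent's past moves
        their_move = confess_once(t)
        mine.append(my_move)
        theirs.append(their_move)
    return "".join(mine), "".join(theirs)


mine, theirs = play()
print("opponent   :", theirs)
print("tit-for-tat:", mine)
```

The single confession is answered by a single confession one round later, after which tit-for-tat returns to deny.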
Yogi on Trigger Strategy Equilibria
"You should always go to other people's funerals;
otherwise, they won't come to yours." – Yogi Berra.
The Prisoners’ Dilemma
Trigger Strategy Equilibria – Tit-for-tat.
For the sake of the example, let the payoff matrix be

                   Nixon #2
                   C        D
Nixon #1   C       10,10    1,15
           D       15,1     3,3

Let both Nixons initially be playing deny.
#1 considers playing confess.
This gets him 1 today, but 10 in the next period, when #2 plays tit-for-tat and confesses in return.
Alternatively he can continue with deny.
This gets him 3 today, then 3 in the next period as #2 plays tit-for-tat and also plays deny.
Since 3 + 3 = 6 < 1 + 10 = 11 (smaller numbers are better), he plays deny today and deny tomorrow, as does #2, who faces identical incentives.
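
The two-period comparison can be written out as a small calculation. This is a sketch under the slide's own assumptions: #2 denies today and then copies #1's move, and #1 repeats his own move tomorrow. The names NIXON and two_period_total are invented for the illustration.

```python
# Payoffs for (Nixon #1, Nixon #2); smaller is better.
NIXON = {
    ("C", "C"): (10, 10),
    ("C", "D"): (1, 15),
    ("D", "C"): (15, 1),
    ("D", "D"): (3, 3),
}


def two_period_total(move):
    """Nixon #1's total over today and tomorrow if he plays `move` in both periods.

    Today #2 is still denying; tomorrow #2 plays tit-for-tat and copies `move`.
    """
    today = NIXON[(move, "D")][0]
    tomorrow = NIXON[(move, move)][0]
    return today + tomorrow


print("confess:", two_period_total("C"))  # 1 today + 10 tomorrow
print("deny   :", two_period_total("D"))  # 3 today + 3 tomorrow
```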
The Prisoners’ Dilemma
Trigger Strategy Equilibria.
Tit-for-tat is not the only possible punishment.
In some circumstances more of a threat is needed to
ensure cooperation (check out our example with the
original numbers).
One possibility is the “Grim Punishment Strategy”, which states: if you cheat on our deal, I will punish you forever. Since {C,C} is a Nash equilibrium in our game, this is not ridiculous.
More extreme still, but somewhat strange, is the “Severe Punishment Strategy”: I do something worse to you than Nash. I do this because, if I do not, you punish me for not punishing you, and vice versa!
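
To see how the grim punishment differs from tit-for-tat, the sketch below runs it against a hypothetical opponent who confesses once and then goes back to denying. The function names and the round of the deviation are assumptions made for the illustration.

```python
def grim(opponent_history):
    """Deny until the opponent has confessed even once, then confess forever."""
    return "C" if "C" in opponent_history else "D"


def confess_once(round_index, cheat_round=2):
    """Hypothetical opponent: denies in every round except `cheat_round`."""
    return "C" if round_index == cheat_round else "D"


def play_grim(rounds=6):
    """Return the move sequences of the grim player and the opponent."""
    mine, theirs = [], []
    for t in range(rounds):
        my_move = grim(theirs)  # sees only the opponent's past moves
        their_move = confess_once(t)
        mine.append(my_move)
        theirs.append(their_move)
    return "".join(mine), "".join(theirs)


mine, theirs = play_grim()
print("opponent:", theirs)
print("grim    :", mine)
```

Once triggered, the grim player never returns to deny, which is what makes the threat larger than a one-round tit-for-tat response.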
The Prisoners’ Dilemma
Infinitely Repeated Games and Discounting.
It might seem that almost any outcome can be supported
by a grim strategy since the punishment appears to be
infinite.
This isn’t the case because we are impatient; that is, we
discount the future. A dollar today is better than a dollar in
a year’s time.
Thus since punishments occur in the future and the
rewards from non-cooperation occur today it may be
difficult to enforce a good outcome.
As an example consider discounting with a tit-for-tat
trigger strategy.
The Prisoners’ Dilemma

                   Nixon #2
                   C        D
Nixon #1   C       10,10    1,15
           D       15,1     3,3

Nixon #1 considers playing confess.
This gets him 1 today, but d(10) in the next period, when #2 plays tit-for-tat and confesses in return. Here d is the discount factor applied to next period's payoff.
Alternatively he can continue with deny.
This gets him 3 today, then d(3) in the next period as #2 plays tit-for-tat and also plays deny.
The strategy fails if (recall that small numbers are better in this example)
3 + d(3) > 1 + d(10), that is, d < 2/7.
And the same incentives apply to #2.
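
The threshold can be reproduced with exact fractions. The sketch below simply re-does the slide's two-period arithmetic; the constant and function names are invented for the illustration, and d is taken to be a discount factor between 0 and 1.

```python
from fractions import Fraction

# Nixon #1's payoffs from the slide's example; smaller is better.
CHEAT_TODAY, PUNISHED_TOMORROW = 1, 10  # confess now, then {C,C} next period
DENY_TODAY, DENY_TOMORROW = 3, 3        # keep denying in both periods


def total(today, tomorrow, d):
    """Today's payoff plus next period's payoff discounted by d."""
    return today + d * tomorrow


def cooperation_holds(d):
    """True if continuing to deny is strictly better than confessing today."""
    return total(DENY_TODAY, DENY_TOMORROW, d) < total(CHEAT_TODAY, PUNISHED_TOMORROW, d)


# Solve 3 + 3d = 1 + 10d for the point of indifference.
threshold = Fraction(DENY_TODAY - CHEAT_TODAY, PUNISHED_TOMORROW - DENY_TOMORROW)

print("threshold:", threshold)
for d in (Fraction(1, 5), Fraction(1, 2)):
    print(d, "->", "deny" if cooperation_holds(d) else "confess")
```

A player who weights the future below the threshold finds the one-period gain from confessing worth the later punishment, so tit-for-tat no longer sustains deny.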
Yogi on Discounting
"A nickel isn't worth a dime today." – Yogi Berra.