Download updated version for the 2015 Superbowl

Prisoners’ Dilemma Game
Two criminals, Bill Belichick and Tom Brady, have been caught at the scene of a crime. Prosecutors
separate them for questioning. Each can either confess that the act was planned, or deny that it was
planned (i.e., claim that it was simply a coincidence). The prosecutor tells them that if they both confess,
they will each receive a six-month jail sentence (i.e., a six-month suspension). If only one of the two
confesses, the one who confessed will go free, but the one denying the crime will receive a two-year jail
sentence (i.e., a two-year suspension). If both deny, they will each receive a 30-day sentence for the crime
of inadvertently violating NFL rules. The situation is represented in the following payoff matrix, with jail
terms represented as the negative of the number of months of prison time (i.e., months of suspension).
                        Tom
                  Confess      Deny
Bill   Confess    (-6,-6)     (0,-24)
       Deny       (-24,0)     (-1,-1)
Payoffs: (Bill, Tom)
First consider the situation of Bill. If Tom chooses “confess,” Bill should choose “confess.” This results in
less time in jail (i.e., six months versus 24 months). Alternatively, if Tom chooses “deny,” Bill should
choose “confess.” Again, choosing “confess” results in less jail time (i.e., zero months versus one
month). Since Bill’s optimal strategy of “confess” is independent of the strategy chosen by Tom, we say
Bill has a dominant strategy of “confess.”
Definition: We say a player has a dominant strategy in a game if the strategy is best, independent of the
strategies chosen by other players. Alternatively, we say strategy A dominates strategy B for a player if the
payoff from choosing strategy A is always higher than the payoff from choosing strategy B (i.e., no
matter what strategies are chosen by the other players). If one strategy dominates all other strategies
for a player, then we call it a dominant strategy.
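The dominance check in this definition can be sketched in a few lines of Python. This is only an illustration: the names `payoffs`, `payoff`, and `dominant_strategy` are assumptions introduced here, not part of the handout, and the numbers are the sentences described in the story above (in months, as negatives). The sketch assumes strict dominance, i.e., a strictly higher payoff against every choice of the other player.

```python
# Payoffs in months of suspension (as negatives), keyed by (Bill's move, Tom's move).
# Each value is (Bill's payoff, Tom's payoff). Names here are illustrative only.
payoffs = {
    ("confess", "confess"): (-6, -6),
    ("confess", "deny"): (0, -24),
    ("deny", "confess"): (-24, 0),
    ("deny", "deny"): (-1, -1),
}
STRATEGIES = ("confess", "deny")


def payoff(player, own, other):
    """Payoff to player (0 = Bill, 1 = Tom) who plays `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return payoffs[profile][player]


def dominant_strategy(player):
    """Return the strategy that strictly beats every alternative, or None."""
    for s in STRATEGIES:
        if all(
            payoff(player, s, other) > payoff(player, alt, other)
            for alt in STRATEGIES
            if alt != s
            for other in STRATEGIES
        ):
            return s
    return None


print("Bill:", dominant_strategy(0))
print("Tom:", dominant_strategy(1))
```

Under these assumptions, both calls should report “confess,” matching the argument in the text.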
We can indicate Bill’s best responses in the table by underlining his corresponding payoff. Since both of
his payoffs in the “confess” row are underlined, his optimal response of “confess” is independent of the
strategy chosen by Tom.
                        Tom
                  Confess        Deny
Bill   Confess    (_-6_,-6)     (_0_,-24)
       Deny       (-24,0)       (-1,-1)
Payoffs: (Bill, Tom). Underlined payoffs are shown between underscores.
Now consider the situation of Tom. If Bill chooses “confess,” Tom should choose “confess.” This results
in less time in jail (i.e., six months versus 24 months). Alternatively, if Bill chooses “deny,” Tom should
choose “confess.” Again, choosing “confess” results in less jail time (i.e., zero months versus one
month). Since Tom’s optimal strategy of “confess” is independent of the strategy chosen by Bill, we say
Tom has a dominant strategy of “confess.”
We can indicate Tom’s best responses in the table by underlining his corresponding payoff. Since both
of his payoffs in the “confess” column are underlined, his optimal response of “confess” is independent
of the strategy chosen by Bill.
                        Tom
                  Confess        Deny
Bill   Confess    (-6,_-6_)     (0,-24)
       Deny       (-24,_0_)     (-1,-1)
Payoffs: (Bill, Tom). Underlined payoffs are shown between underscores.
We can combine the last two tables showing best responses for Bill and Tom to determine a “mutual
best response,” or Nash equilibrium of “confess-confess.”
                        Tom
                  Confess          Deny
Bill   Confess    (_-6_,_-6_)     (_0_,-24)
       Deny       (-24,_0_)       (-1,-1)
Payoffs: (Bill, Tom). Underlined payoffs are shown between underscores.
Definition: A combination of strategies is a Nash (non-cooperative) equilibrium if each player’s strategy
is best, given the strategies chosen by the other players. The Nash equilibrium is a “mutual best
response” in the sense that each player is correctly assessing the strategies of all other players and
choosing his or her best possible response.
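The “mutual best response” test can be sketched as a brute-force search over all four strategy combinations. The names `payoffs`, `is_nash`, and `nash_equilibria` are assumptions made for this illustration; the payoffs are the sentences from the story (in months, as negatives), and only pure strategies are considered.

```python
from itertools import product

# Payoffs keyed by (Bill's move, Tom's move); values are (Bill, Tom).
# Illustrative encoding of the sentences described in the story.
payoffs = {
    ("confess", "confess"): (-6, -6),
    ("confess", "deny"): (0, -24),
    ("deny", "confess"): (-24, 0),
    ("deny", "deny"): (-1, -1),
}
STRATEGIES = ("confess", "deny")


def is_nash(bill, tom):
    """True if neither player can gain by changing only his own strategy."""
    bill_ok = all(
        payoffs[(bill, tom)][0] >= payoffs[(b, tom)][0] for b in STRATEGIES
    )
    tom_ok = all(
        payoffs[(bill, tom)][1] >= payoffs[(bill, t)][1] for t in STRATEGIES
    )
    return bill_ok and tom_ok


# Check every combination of pure strategies.
nash_equilibria = [
    profile for profile in product(STRATEGIES, repeat=2) if is_nash(*profile)
]
print(nash_equilibria)
```

With these payoffs the search should return only the “confess-confess” combination.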
Definition: An allocation is Pareto Optimal if it is impossible to make one person better off without
making someone else worse off.
The Nash equilibrium in this game is not Pareto Optimal. If Bill and Tom could agree to deny the crime
they would both be better off. The strategy combination “deny-deny” is a Pareto Improvement over the
Nash equilibrium. Note that if Bill and Tom did reach a collusive agreement to deny the crime, each
would have an incentive to cheat on the agreement by choosing to confess.
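The Pareto comparison and the incentive to cheat can also be checked numerically. The sketch below is illustrative only: `payoffs`, `pareto_improves`, `pareto_optimal`, and `cheat_gain` are names assumed here, and the payoffs are the sentences from the story (in months, as negatives).

```python
# Payoffs keyed by (Bill's move, Tom's move); values are (Bill, Tom).
# Illustrative encoding of the sentences described in the story.
payoffs = {
    ("confess", "confess"): (-6, -6),
    ("confess", "deny"): (0, -24),
    ("deny", "confess"): (-24, 0),
    ("deny", "deny"): (-1, -1),
}


def pareto_improves(new, old):
    """True if `new` hurts no one and makes at least one player better off."""
    gains = [n - o for n, o in zip(payoffs[new], payoffs[old])]
    return all(g >= 0 for g in gains) and any(g > 0 for g in gains)


nash = ("confess", "confess")

# An outcome is Pareto optimal if no other outcome Pareto-improves on it.
pareto_optimal = [
    p for p in payoffs if not any(pareto_improves(q, p) for q in payoffs)
]

# Bill's gain (in months) from confessing while Tom keeps the agreement to deny.
cheat_gain = payoffs[("confess", "deny")][0] - payoffs[("deny", "deny")][0]

print(pareto_improves(("deny", "deny"), nash))
print(pareto_optimal)
print(cheat_gain)
```

Under these assumptions, “deny-deny” improves on the Nash equilibrium for both players, the Nash equilibrium is the only outcome that is not Pareto optimal, and breaking the agreement saves the cheater one month.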