SCIT1003
Chapter 3: Prisoner’s Dilemma
Non-Zero Sum Game

Prof. Tsang
1
Zero-Sum Games
• The sum of the payoffs remains constant
during the course of the game.
• Two sides in conflict, e.g. chess, sports
• Being well informed always helps a player
2
Example of zero-sum game
Matching Pennies
Mis-matcher
matcher
3
Rock-Paper-Scissors
4
Military game: attack the easy or hard pass?
                            Attacker
                            Easy pass    Hard pass
Defense side   Easy pass    0.5, 0.5     0.4, 0.6
               Hard pass    0, 1         1, 0
Payoff is the winning probability.
5
Games of Conflict
• Two sides competing against each other
• Characteristics of zero-sum games: your loss
is my gain
• Simultaneous moves: lack of information
about the opponent’s move
• Logical circle of reasoning: I think that he
thinks that I think that …
6
Zero-sum game matrices are sometimes
expressed with only one number in each box,
in which case each entry is interpreted as a
gain for the row player and a loss for the column player.
Player A
Player B
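As a small illustration of the single-number convention, the sketch below expands such a matrix into explicit (row, column) payoff pairs. The Matching Pennies payoffs of +1 and -1, the choice of the matcher as row player, and the helper name `expand_zero_sum` are assumptions made for this example, not values taken from the slides.

```python
# Single-number zero-sum matrix: each entry is the row player's gain
# and therefore the column player's loss.
# Assumed example: Matching Pennies with the matcher as the row player,
# winning 1 on a match and losing 1 on a mismatch.
matching_pennies = [
    [1, -1],   # row plays Heads; column plays Heads, Tails
    [-1, 1],   # row plays Tails; column plays Heads, Tails
]


def expand_zero_sum(matrix):
    """Return the same game written as (row payoff, column payoff) pairs."""
    return [[(v, -v) for v in row] for row in matrix]


pairs = expand_zero_sum(matching_pennies)
for row in pairs:
    print(row)
```

Every pair sums to zero, which is what lets the second number be left out.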
7
Non-Zero Sum Game
Prisoner’s Dilemma
• A zero-sum game is one in which the players' interests
are in direct conflict, e.g. in football, one team wins and
the other loses; payoffs sum to zero.
• A game is non-zero-sum if the players’ interests are not
always in direct conflict, so that there are opportunities
for both to gain, e.g. games in economics.
• For example, both prisoners do better when both choose
Don’t Confess in the Prisoners’ Dilemma than when both confess.
• Most games in reality have aspects of common interest
as well as conflict.
8
Prisoners’ Dilemma: payoff matrix
                            Player 2
                            Confess     Don’t Confess
Player 1   Confess          -5, -5      0, -10
           Don’t Confess    -10, 0      -3, -3
9
10
Imperfect Information
• The player has only partial or no information about
the opponent’s move before making a decision,
e.g. in the Prisoner’s Dilemma.
• Imperfect information may be diminished
over time if the same game with the same
opponent is played repeatedly.
11
Games of Co-operation
Players may improve their payoffs through
• communicating
• forming binding coalitions & agreements
These options do not apply to zero-sum games.
Prisoner’s Dilemma
with Cooperation
12
Strategies
• A strategy is a “complete plan of action” that fully
determines the player's behavior, a decision rule or set
of instructions about which actions a player should
take following all possible histories up to that stage.
• The strategy concept is sometimes (wrongly) confused
with that of a move. A move is an action taken by a
player at some point during the play of a game (e.g., in
chess, moving white's Bishop a2 to b3).
• A strategy on the other hand is a complete algorithm
for playing the game, telling a player what to do for
every possible situation throughout the game.
13
Dominant or dominated strategy
• A strategy S for a player A is dominant if it
is always the best strategy for player A, no
matter what strategies the other players
choose.
• A strategy S for a player A is dominated if
there is at least one other strategy that is
better than it, no matter what strategies the
other players choose.
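The two definitions can be checked mechanically. The sketch below is a minimal illustration using the Prisoners' Dilemma payoffs from the earlier slide, read as (row player, column player). It tests strict dominance only, which is an assumption, since the slide does not say how ties should be handled, and the function names are made up for the example.

```python
# Payoffs from the Prisoners' Dilemma slide, written as
# (row player, column player); index 0 = Confess, 1 = Don't Confess.
ACTIONS = ["Confess", "Don't Confess"]
PAYOFFS = [
    [(-5, -5), (0, -10)],
    [(-10, 0), (-3, -3)],
]


def row_payoff(r, c):
    """Row player's payoff when row plays r and column plays c."""
    return PAYOFFS[r][c][0]


def strictly_dominates(a, b):
    """True if row action a beats row action b against every column action."""
    return all(row_payoff(a, c) > row_payoff(b, c) for c in range(len(ACTIONS)))


def dominant_row_action():
    """Return the row action that strictly dominates all others, or None."""
    for a in range(len(ACTIONS)):
        if all(strictly_dominates(a, b) for b in range(len(ACTIONS)) if b != a):
            return ACTIONS[a]
    return None


print(dominant_row_action())  # Confess
```

Confess is dominant for the row player, and Don't Confess is the dominated strategy; by symmetry the same holds for the column player.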
14
Rule: If you have a dominant strategy, use
it!
Use
strategy 1
15
Dominance Solvable
COMMANDMENT
If you have a dominant strategy, use it.
Expect your opponent to use his/her dominant strategy
if he/she has one.
• If each player has a dominant strategy, the game is
dominance solvable
16
Only one player has a Dominant
Strategy
                The Economist
                G             S
Time    S    100 , 100     0 , 90
        G     95 , 100    95 , 90
• For The Economist:
– G dominant, S dominated
• Dominated Strategy:
• There exists another strategy which always does better regardless
of opponents’ actions
17
How to recognize a Dominant Strategy
To determine if the row player has any dominant strategy:
1. Underline the row player’s maximum payoff in each column.
2. If the underlined numbers all appear in one row, then that row is
the dominant strategy for the row player.
No dominant strategy for the row player in this example.
18
How to recognize a Dominant Strategy
To determine if the column player has any dominant strategy:
1. Underline the column player’s maximum payoff in each row.
2. If the underlined numbers all appear in one column, then that column is
the dominant strategy for the column player.
There is a dominant strategy for the column player in this example.
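The underlining procedure can be sketched in a few lines of code. Because the example matrices on these two slides did not survive in the transcript, the sketch reuses the Time vs. The Economist payoffs from the earlier slide, read as (Time, The Economist). The function names are illustrative, and a tie for the maximum is resolved by taking the first maximum, which is a simplifying assumption.

```python
# Time (rows) vs. The Economist (columns), payoffs as (Time, Economist).
ROWS = ["S", "G"]
COLS = ["G", "S"]
PAYOFFS = [
    [(100, 100), (0, 90)],   # Time plays S
    [(95, 100), (95, 90)],   # Time plays G
]


def dominant_row():
    """Underline the row player's best payoff in each column; a row that
    collects every underline is dominant."""
    best_rows = []
    for c in range(len(COLS)):
        column = [PAYOFFS[r][c][0] for r in range(len(ROWS))]
        best_rows.append(column.index(max(column)))
    return ROWS[best_rows[0]] if len(set(best_rows)) == 1 else None


def dominant_col():
    """Underline the column player's best payoff in each row; a column that
    collects every underline is dominant."""
    best_cols = []
    for r in range(len(ROWS)):
        row = [PAYOFFS[r][c][1] for c in range(len(COLS))]
        best_cols.append(row.index(max(row)))
    return COLS[best_cols[0]] if len(set(best_cols)) == 1 else None


print("Time:", dominant_row())           # None
print("The Economist:", dominant_col())  # G
```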
19
If there is no dominant strategy
• Does any player have a dominant strategy?
• If there is none, ask “Does any player have a
dominated strategy?”
• If yes, then
• Eliminate the dominated strategies
• Reduce the normal-form game
• Iterate the above procedure
20
Eliminate any dominated strategy
Eliminate
strategy 2 as
it’s dominated
by strategy 1
21
Successive Elimination of Dominated
Strategies
• If a strategy is dominated, eliminate it
• The size and complexity of the game is
reduced
• Eliminate any dominated strategies from the
reduced game
• Continue doing so successively
22
Example: Two competing Bars
• Two bars (Bar 1, Bar 2) compete with each other
• Each can charge a price of $2, $4, or $5 for a drink
• 6000 tourists pick a bar randomly
• 4000 natives select the lowest-price bar
Neither player has a dominant strategy.
Payoffs are revenues in thousands of dollars.
                Bar 2
                $2          $4          $5
Bar 1    $2   10 , 10     14 , 12     14 , 15
         $4   12 , 14     20 , 20     28 , 15
         $5   15 , 14     15 , 28     25 , 25
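The entries of this table can be reproduced from the rules on the slide. The sketch below assumes that tourists split evenly between the two bars, that natives split evenly when prices are tied, and that every customer buys one drink; none of these assumptions is stated on the slide, but together they reproduce the table exactly.

```python
PRICES = [2, 4, 5]
TOURISTS = 6000   # pick a bar at random; assumed to split evenly
NATIVES = 4000    # go to the cheaper bar; assumed to split evenly on a tie


def revenue(own_price, rival_price):
    """Revenue in thousands of dollars, assuming one drink per customer."""
    customers = TOURISTS / 2
    if own_price < rival_price:
        customers += NATIVES
    elif own_price == rival_price:
        customers += NATIVES / 2
    return own_price * customers / 1000


# table[i][j] = (Bar 1 revenue, Bar 2 revenue) when Bar 1 charges
# PRICES[i] and Bar 2 charges PRICES[j].
table = [[(revenue(p1, p2), revenue(p2, p1)) for p2 in PRICES] for p1 in PRICES]

for p1, row in zip(PRICES, table):
    print(f"${p1}:", row)
```

For example, when Bar 1 charges $2 and Bar 2 charges $4, Bar 1 serves 3000 tourists plus all 4000 natives for 14 thousand dollars, while Bar 2 serves 3000 tourists for 12 thousand dollars.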
23
Successive Elimination of Dominated
Strategies
                Bar 2
                $2          $4          $5
Bar 1    $2   10 , 10     14 , 12     14 , 15
         $4   12 , 14     20 , 20     28 , 15
         $5   15 , 14     15 , 28     25 , 25
After eliminating the dominated $2 strategy for both bars:
                Bar 2
                $4          $5
Bar 1    $4   20 , 20     28 , 15
         $5   15 , 28     25 , 25
24
An example for Successive Elimination of strictly dominated
strategies, or the process of iterated dominance
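The process of iterated dominance can be sketched as a loop that keeps removing strictly dominated strategies until none remain. The code below applies it to the two-bars payoff table, read as (Bar 1, Bar 2). The function names and the order in which players are checked are choices made for this illustration.

```python
PRICES = ["$2", "$4", "$5"]
# PAYOFFS[i][j] = (Bar 1, Bar 2) when Bar 1 plays PRICES[i], Bar 2 PRICES[j].
PAYOFFS = [
    [(10, 10), (14, 12), (14, 15)],
    [(12, 14), (20, 20), (28, 15)],
    [(15, 14), (15, 28), (25, 25)],
]


def payoff(player, own, other):
    """Payoff to `player` (0 = Bar 1, 1 = Bar 2) for its own action index."""
    return PAYOFFS[own][other][0] if player == 0 else PAYOFFS[other][own][1]


def find_dominated(player, own_alive, other_alive):
    """Return an action strictly dominated by another surviving action."""
    for s in own_alive:
        for t in own_alive:
            if t != s and all(payoff(player, t, o) > payoff(player, s, o)
                              for o in other_alive):
                return s
    return None


def iterated_elimination():
    alive = [list(range(3)), list(range(3))]   # surviving actions per player
    changed = True
    while changed:
        changed = False
        for player in (0, 1):
            s = find_dominated(player, alive[player], alive[1 - player])
            if s is not None:
                alive[player].remove(s)
                changed = True
    return [[PRICES[i] for i in actions] for actions in alive]


survivors = iterated_elimination()
print(survivors)  # [['$4'], ['$4']]
```

The first pass removes $2 for both bars, since $4 always earns more. In the reduced game $5 is dominated by $4, leaving ($4, $4) as the only surviving outcome.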
25
Equilibrium
• The interaction of all players' strategies results in an
outcome that we call "equilibrium."
• Traditional applications of game theory attempt to find
equilibria in games.
• In an equilibrium, each player is playing the strategy
that is a "best response" to the strategies of the other
players. No one is likely to change his strategy given the
strategic choices of the others.
• Equilibrium is not:
• The best possible outcome. Equilibrium in the one-shot prisoners'
dilemma is for both players to confess.
• A situation where players always choose the same action.
Sometimes equilibrium will involve changing action choices (known
as a mixed strategy equilibrium).
26
Definition: Nash Equilibrium
“If there is a set of strategies with the property
that no player can benefit by changing
his/her strategy while the other players keep
their strategies unchanged, then that set of
strategies and the corresponding payoffs
constitute the Nash Equilibrium.”
Source: http://www.lebow.drexel.edu/economics/mccain/game/game.html
27
Nash equilibrium
• If each player has chosen a strategy and no
player can benefit by changing his/her
strategy while the other players keep theirs
unchanged, then the current set of strategy
choices and the corresponding payoffs
constitute a Nash equilibrium.
28
Conditions for Nash equilibrium
• Each player is choosing a best response to
what he believes the other players will do.
• Each player’s beliefs are correct. The other
players are all doing what everyone else thinks
they are doing.
Assumptions:
Rational players
“Putting yourself in the other person’s shoes”
29
Example: B B Lean vs Rainbow’s End
p.105
30
p.124
Games with infinitely many strategies
Rainbow’s End’s price
31
(U, L) is not a Nash equilibrium because Player 2 can gain by
deviating alone to R; …
You can cross out those that are not Nash equilibria.
Finding Nash equilibria: (a) with strike-outs; (b) with
underlinings
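The underlining method can be sketched as follows: mark each player's best payoff against every choice of the opponent, then keep the cells in which both payoffs are marked. Since the matrix on this slide is not in the transcript, the illustration uses the Prisoners' Dilemma payoffs from the earlier slide, read as (row player, column player). The function name is hypothetical.

```python
ACTIONS = ["Confess", "Don't Confess"]
# (row player, column player) payoffs from the Prisoners' Dilemma slide.
PAYOFFS = [
    [(-5, -5), (0, -10)],
    [(-10, 0), (-3, -3)],
]


def pure_nash_equilibria(payoffs):
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    underlined = set()
    # Underline the row player's best payoff(s) in each column.
    for c in range(n_cols):
        best = max(payoffs[r][c][0] for r in range(n_rows))
        underlined |= {("row", r, c) for r in range(n_rows)
                       if payoffs[r][c][0] == best}
    # Underline the column player's best payoff(s) in each row.
    for r in range(n_rows):
        best = max(payoffs[r][c][1] for c in range(n_cols))
        underlined |= {("col", r, c) for c in range(n_cols)
                       if payoffs[r][c][1] == best}
    # A cell with both payoffs underlined is a Nash equilibrium.
    return [(r, c) for r in range(n_rows) for c in range(n_cols)
            if ("row", r, c) in underlined and ("col", r, c) in underlined]


equilibria = pure_nash_equilibria(PAYOFFS)
print([(ACTIONS[r], ACTIONS[c]) for r, c in equilibria])
# [('Confess', 'Confess')]
```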
32
Sometimes, there is no Nash
Equilibrium in pure strategies
         L       C       R
U      0, 2    2, 3    4, 1
M      1, 1    3, 1    0, 2
D      0, 3    1, 0    5, 1
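The strike-out approach can be applied to this matrix cell by cell: a cell is crossed out as soon as one player can gain by deviating alone. The sketch below reads the payoffs as (row player, column player) and reports one profitable deviation per cell. It only searches pure strategies, and the function name is illustrative.

```python
ROWS = ["U", "M", "D"]
COLS = ["L", "C", "R"]
# (row player, column player) payoffs from the 3x3 matrix on this slide.
PAYOFFS = [
    [(0, 2), (2, 3), (4, 1)],
    [(1, 1), (3, 1), (0, 2)],
    [(0, 3), (1, 0), (5, 1)],
]


def profitable_deviation(r, c):
    """Return a description of one profitable unilateral deviation, or None."""
    for r2 in range(len(ROWS)):
        if PAYOFFS[r2][c][0] > PAYOFFS[r][c][0]:
            return f"row player gains by switching to {ROWS[r2]}"
    for c2 in range(len(COLS)):
        if PAYOFFS[r][c2][1] > PAYOFFS[r][c][1]:
            return f"column player gains by switching to {COLS[c2]}"
    return None


surviving = []
for r in range(len(ROWS)):
    for c in range(len(COLS)):
        reason = profitable_deviation(r, c)
        if reason is None:
            surviving.append((ROWS[r], COLS[c]))
        else:
            print(f"({ROWS[r]}, {COLS[c]}) struck out: {reason}")

print("Pure-strategy equilibria:", surviving)  # []
```

Every one of the nine cells is struck out, so this game has no Nash equilibrium in pure strategies.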
33
Sometimes, there is more than one Nash
Equilibrium.
No strictly dominant strategies and no strictly dominated
strategies.
34
Hunting game: multi-Nash Equilibria
35
Hunting game with 2 Nash Equilibria
Fred’s choice vs. Barney’s choice: Stag or Bison
• Both choose Stag: payoffs 3 and 4
• Both choose Bison: payoffs 4 and 3
• Choices differ: payoffs 0 and 0
Battle of the sexes
Alice’s choice vs. Barney’s choice: Ballet or Football
• Both choose Ballet: payoffs 3 and 4
• Both choose Football: payoffs 4 and 3
• Choices differ: payoffs 0 and 0
37
Prisoner’s Dilemma: finding the Nash equilibrium
Which is a Nash Equilibrium?
38
Prisoner’s Dilemma :
Applications
• Relevant to:
– Nuclear arms races.
– Dispute Resolution and the decision to hire a
lawyer.
– Corruption/political contributions between
contractors and politicians.
• How do players escape this dilemma?
– Play repeatedly
– Find a way to ‘guarantee’ cooperation
– Change payoff structure
39
Nuclear arms races
prisoner's dilemma in disguise
Two countries try to decide whether to build nuclear bombs.
Is there a Nash Equilibrium?
40
Cigarette Advertising
prisoner's dilemma in disguise
Two companies try to decide whether to run cigarette advertisements.
                        Philip Morris
                        No Ad        Ad
Reynolds    No Ad     50 , 50     20 , 60
            Ad        60 , 20     30 , 30
41
Price it higher? Lower?
prisoner's dilemma in disguise
p.69
                          B B Lean
                          80              70
Rainbow’s End    80    72k , 72k      24k , 110k
                 70    110k , 24k     70k , 70k
42
OPEC: example of a cartel
• Collaborating in price-fixing guarantees higher
profit to all industrial producers at the
expense of consumers.
• OPEC is the most notable cartel of petroleum
producing countries.
• Cheating is a problem in price-collusion.
• Governments in many nations have enacted
anti-trust laws (Sherman Act in the US) to
protect consumers. (p.90-94)
Fight together or run-away?
prisoner's dilemma in disguise
                    Your friend
                    Fight          Run-away
You    Fight        0.8 , 0.8      0 , 1
       Run-away     1 , 0          0.1 , 0.1
44
Collaborative behavior in biology
• Hunting packs of wolves
• Sharing within colonies of vampire bats in
Costa Rica (p.98)
• Ants
• Bees
• Humans
Sustainability of resource sharing
• Most people view community resource sharing as a
cooperative game similar to the Prisoner’s Dilemma.
• However, over-exploitation & cheating in
conservation efforts lead to depletion of
resources. The consequences run much deeper
than the simple (& superficial) payoff matrix
would suggest. (p.94-97)
46
Tragedy of the Commons
• Individuals acting independently & rationally will
deplete a shared common resource, even when doing so
is not in their long-term best interest.
• An example to explain the overuse of shared resources.
• Extends the Prisoner’s Dilemma to more than two
players.
• Each member of a group of neighboring farmers
prefers to let his cows or sheep graze on the commons
rather than keep them on his own inadequate land, but
the commons will be rendered unsuitable for grazing if
it is overgrazed.
47
In the beginning, there is a nice piece of grassland
owned by all villagers. “What a waste!” said a farmer.
48
Happy farmers with their well-fed cows.
49
“Why not have more cows? Why waste the resource?”
said the farmers.
50
In the end, sad farmers with their hungry cows.
51
Tragedy of the Commons
an apparent payoff matrix at the start
                     Your neighbor
                     Add a cow    Don’t add
You    Add a cow     2 , 2        2 , 1
       Don’t add     1 , 2        1 , 1
As long as the common pasture is not overgrazed, adding one
more cow is the dominant strategy for everybody.
52
Tragedy of the Commons
an apparent payoff matrix in between
                     Your neighbor
                     Add a cow    Don’t add
You    Add a cow     1.5 , 1.5    1.8 , 0.9
       Don’t add     0.9 , 1.8    1 , 1
When the common pasture is starting to be overgrazed, adding a
cow is still the dominant strategy for everybody, but the return is
getting worse.
53
Tragedy of the Commons
the final form of payoff matrix
                     Your neighbor
                     Add a cow    Don’t add
You    Add a cow     0.5 , 0.5    1.1 , 0.7
       Don’t add     0.7 , 1.1    1 , 1
Finally, when the common pasture is overgrazed, adding a cow is
no longer the dominant strategy for everybody.
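The claim made across the three stages can be checked directly. The sketch below reads each matrix as (you, neighbor) pairs, which is an assumption about the slide layout, and tests whether adding a cow pays strictly more than not adding, whatever the neighbor does. The stage names and the function name are made up for the illustration.

```python
# Payoffs as (you, neighbor); index 0 = add a cow, 1 = don't add.
STAGES = {
    "start": [[(2, 2), (2, 1)], [(1, 2), (1, 1)]],
    "in between": [[(1.5, 1.5), (1.8, 0.9)], [(0.9, 1.8), (1, 1)]],
    "final": [[(0.5, 0.5), (1.1, 0.7)], [(0.7, 1.1), (1, 1)]],
}


def adding_is_dominant(payoffs):
    """True if adding a cow pays you strictly more whatever the neighbor does."""
    return all(payoffs[0][c][0] > payoffs[1][c][0] for c in range(2))


results = {name: adding_is_dominant(p) for name, p in STAGES.items()}
for name, dominant in results.items():
    print(f"{name}: adding a cow dominant? {dominant}")
```

In the final matrix, adding a cow still pays when the neighbor holds back (1.1 against 1) but not when the neighbor also adds (0.5 against 0.7), so the dominance disappears.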
54
Tragedy of the Commons
• Problem: cost of maintenance is
ignored/externalized
– Farmers don’t adequately pay for their impact.
– Resources are overused due to inaccurate estimates of
cost.
• Relevant to:
– Health or other social benefits
– Environmental laws, overfishing, whaling, pollution, etc.
– Global warming
55
Environmental policy
Tragedy of the Commons in disguise
                             Factory B
                             Pollution    No pollution
Factory A    Pollution       50 , 50      60 , 20
             No pollution    20 , 60      20 , 20
Two factories producing the same chemical can choose to
pollute (lower production cost) or not to pollute (higher
production cost).
56
Global warming
Tragedy of the Commons in disguise
Emissions reduced?
                    Country B
                    No           Yes
Country A    No     50 , 50      60 , 20
             Yes    20 , 60      20 , 20
Two countries producing CO2 can choose to reduce
(higher production cost) or not to reduce (lower
production cost).
57
Another Example:
Big & Little Pigs
Cost to press
button = 2 units
When button is pressed,
food given = 10 units
58
Decisions, decisions...
What’s the best strategy for the little pig? Does he have a dominant
strategy?
Does the big pig have a dominant strategy?
59
Research in industries
Big & Little Pigs in disguise
                              Small Company
                              Research    No research
Big Company    Research       5 , 1       4 , 4
               No research    9 , -1      0 , 0
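The questions asked about the pigs can be answered for this matrix with a best-response check. The sketch below reads the payoffs as (Big Company, Small Company), looks for a dominant strategy for each firm, and then lists the cells where both firms are best-responding. The function names are illustrative.

```python
ACTIONS = ["research", "no research"]
# (Big Company, Small Company) payoffs; rows are the Big Company's action.
PAYOFFS = [
    [(5, 1), (4, 4)],
    [(9, -1), (0, 0)],
]


def best_responses_big(small_action):
    column = [PAYOFFS[b][small_action][0] for b in range(2)]
    return [b for b in range(2) if column[b] == max(column)]


def best_responses_small(big_action):
    row = [PAYOFFS[big_action][s][1] for s in range(2)]
    return [s for s in range(2) if row[s] == max(row)]


def dominant(best_response_fn):
    """An action that is the unique best response to everything is dominant."""
    replies = [best_response_fn(other) for other in range(2)]
    for a in range(2):
        if all(reply == [a] for reply in replies):
            return ACTIONS[a]
    return None


small_dominant = dominant(best_responses_small)
big_dominant = dominant(best_responses_big)
equilibria = [(ACTIONS[b], ACTIONS[s]) for b in range(2) for s in range(2)
              if b in best_responses_big(s) and s in best_responses_small(b)]

print("Small Company dominant strategy:", small_dominant)  # no research
print("Big Company dominant strategy:", big_dominant)      # None
print("Equilibria (Big, Small):", equilibria)
# [('research', 'no research')]
```

The Small Company never does research, and the Big Company, expecting this, does the research itself.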
60
Summary: Ch. 3
• A strategy is a “complete plan of action” that
fully determines the player's behavior.
• A dominant strategy is the best strategy no
matter what strategies the other players choose.
• A dominated strategy is one for which at least one
other strategy is better no matter what strategies
the other players choose.
• If you have a dominant strategy, use it!
• Eliminate any dominated strategy
61
Assignment 3.1
N-person Investment game
• In a group of N>>2 persons
• If you invest $0, you get $0 in return
• If you invest $10, you get
– $5 return, if more than 60% of the group invest
– -$10 return, if less than 60% of the group invest
• Is there any Nash equilibrium?
• Answer: 2,
• What are they? Why?
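One way to explore the question is to describe each outcome by the number k of investors and test whether any single player would gain by switching. The sketch below does this for a hypothetical group of N = 10. The slide does not say what happens at exactly 60%, so the code assumes that case counts as a failed investment; the function names are also assumptions.

```python
def payoff(invests, k, n):
    """Payoff to one player when k of n players invest in total.

    Assumption: a share of exactly 60% counts as a failed investment.
    """
    if not invests:
        return 0
    return 5 if 10 * k > 6 * n else -10


def is_equilibrium(k, n):
    """True if, with k investors, nobody gains by switching alone."""
    if k > 0 and payoff(False, k - 1, n) > payoff(True, k, n):
        return False   # an investor would rather pull out
    if k < n and payoff(True, k + 1, n) > payoff(False, k, n):
        return False   # a non-investor would rather join
    return True


N = 10  # hypothetical group size
equilibria = [k for k in range(N + 1) if is_equilibrium(k, N)]
print(equilibria)  # [0, 10]
```

Under these assumptions only two outcomes survive: nobody invests, or everybody invests.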
64
Example in real life
• Which technology to choose?
– Windows or Apple
– Betamax or VHS (video format in 1980s)
• What else??
65