Download Chap02 - Nash Equilibrium theory

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

John Forbes Nash Jr. wikipedia , lookup

Evolutionary game theory wikipedia , lookup

Minimax wikipedia , lookup

The Evolution of Cooperation wikipedia , lookup

Artificial intelligence in video games wikipedia , lookup

Nash equilibrium wikipedia , lookup

Prisoner's dilemma wikipedia , lookup

Chicken (game) wikipedia , lookup

Transcript
Chapter 2
Nash Equilibrium: Theory
Strategic or Simultaneous-move
Games
Definition: A simultaneous-move game
consists of:
• A set of players
• For each player, a set of actions
• For each player, a preference relation over
the set of action profiles.
Or
• For each player, a payoff function over the
set of action profiles.
Example I: The Prisoner’s Dilemma
• Two suspects (Bonnie and Clyde) in a crime, held in
separate cells.
• Enough evidence to convict each on minor, but not
major offense unless one confesses.
• Each can Remain Silent or can Confess.
• Both remain silent: each convicted of minor
offense—1 year in prison.
• One and only one confesses: one who confesses is
used as a witness against the other, and goes free;
other gets 4 years.
• Both confess: each gets 3 years in prison.
Example I: The Prisoner’s Dilemma
Strategic game: two players; each player’s
actions {RS, C}; preference orderings
• (C,RS) >1 (RS,RS) >1 (C, C) >1 (RS, C)
• (RS, C) >2 (RS,RS) >2 (C, C) >2 (C,RS).
Example I: The Prisoner’s Dilemma
Bonnie
Clyde
- This
RS
C
RS
1, 1
4, 0
C
0, 4
3, 3
game models a situation in which there are gains from cooperation,
but each player prefers to be a “free rider”
- Note that payoffs are “years in prison” so less is clearly better. This is the
only game we will see where lower payoffs are better; an alternative
representation of the game will write the game with negative payoffs (so
less negative payoffs are higher, and thus more preferred).
Example II: Working on a joint project
• Each of two people can either work hard or goof
off.
• Each person prefers to goof off if the other works
hard—project would be better if worked hard,
but not worth the extra effort.
Example II: Working on a joint project
Person 2
Work hard
Goof off
Work hard
2, 2
0, 3
Goof off
3, 0
1, 1
Person 1
Example III: Time vs. Newsweek
• Two major news stories:
– There is an impasse between the House and the
Senate on the budget
– A new drug is claimed to be effective against AIDS
• 100 newsstand buyers
– 70 interested on the AIDS story
– 30 interested on the budget story
• If both magazines go for the same story the buyers
split equally between them
Example III: Time vs. Newsweek
Newsweek
AIDS
Budget
AIDS
35, 35
70, 30
Budget
30, 70
15, 15
Time
Example IV: Duopoly
• Two firms producing same good.
• Each firm can charge high price or low price.
• Both firms charge high price: each gets profit of
$1,000
• One firm charges high price, other low: firm
charging high price gets no customers and loses
$200, while one charging low price makes profit
of $1,200
• Both firms charge low price: each earns profit of
$600
Example IV: Duopoly
Firm 2
High
Low
High
1,000, 1,000
-200, 1,200
Low
1,200, -200
600, 600
Firm 1
Example V: Nuclear arms race
• Two countries
• What matters is relative strength, and bombs are
costly, so that for each country:
have bombs, other doesn’t > neither country has
bombs > both countries have bombs > don’t
have bombs, other does
• The game is Prisoner’s Dilemma, with RS
means don’t build bombs and C means build
bombs.
Example VI: Battle of the sexes
• In the Prisoner’s dilemma the players agree that
(RS,RS) is a desirable outcome, though each has an
incentive to deviate from this outcome.
• In BoS the players disagree about the outcome that is
desirable.
• Peter and Jane wish to go out together to a concert.
The options are U2 or Coldplay.
• Their main concern is to go out together, but one
person prefers U2 and the other person prefers
Coldplay.
• If they go to different concerts then each of them is
equally unhappy listening to the music of either band.
Example VI: Battle of the sexes
Jane
U2
Coldplay
U2
2, 1
0, 0
Coldplay
0, 0
1, 2
Peter
Example VII: Matching Pennies
• In the Prisoner’s dilemma and BoS there are aspects
of both conflict and cooperation. Matching Pennies is
purely conflictual.
• Each of two people chooses either the Head or Tail of
a coin.
• If the choices differ, person 1 pays person 2 a dollar.
• If they are the same, person 2 pays person 1 a dollar.
• Each person cares only about the amount of money
that she receives.
Example VII: Matching Pennies
Player 2
Head
Tail
Head
1, -1
-1, 1
Tail
-1, 1
1, -1
Player 1
- This is an example of a zero-sum game.
Example VII: Matching Pennies
• In this game the players’ interests are
diametrically opposed (such a game is
called “strictly competitive”): player 1
prefers to take the same action as the
other player, while player 2 prefers to take
the opposite action.
Strategic or Simultaneous-move
Games
Common knowledge: means that every player
knows:
• The list of players
• The actions available to each player
• The payoffs of each player for all possible
action profiles
• That each player is a rational maximizer
• That each player knows that he is rational, and
that he knows that everybody else know that he
knows they are rational…
Strategic or Simultaneous-move
Games
• Dominant Strategy: a strategy that is
best for a player in a game, regardless of
the strategies chosen by the other players.
• Rational players do not play strictly
dominated strategies and so, once you
determine a strategy is dominated by
another, simply remove it from the game.
Strictly Dominated Strategies
A player’s action is “strictly dominated” if it is inferior,
no matter what the other players do, to some other
action.
Definition: In a strategic game player i’s action bi
strictly dominates her action b’i if
ui(bi, a−i) > ui(b’i, a−i) for every list a−i of the other
players’ actions.
Iterated Elimination of Strictly
Dominated Strategies
Bonnie
Clyde
RS
C
RS
1, 1
4, 0
C
0, 4
3, 3
For Clyde, “remain silent” is a dominated strategy.
So, “remain silent” should be removed from his
strategy space.
Iterated Elimination of Strictly
Dominated Strategies
Bonnie
Clyde
RS
C
RS
1, 1
4, 0
C
0, 4
3, 3
Given the symmetry of the game, it is easy to see
that “remain silent” is a dominated strategy for
Bonnie also. So, “remain silent” should be
removed from her strategy space, too.
Iterated Elimination of Strictly
Dominated Strategies
Bonnie
Clyde
RS
C
RS
1, 1
4, 0
C
0, 4
3, 3
Given the symmetry of the game, it is easy to see
that “remain silent” is a dominated strategy for
Bonnie also. So, “remain silent” should be
removed from her strategy space, too.
Iterated Elimination of Strictly
Dominated Strategies
Player 2
Left
Middle
Right
Up
1,0
1,2
0,1
Down
0,3
0,1
2,0
Player 1
Iterated Elimination of Strictly
Dominated Strategies
Player 2
Left
Middle
Right
Up
1,0
1,2
0,1
Down
0,3
0,1
2,0
Player 1
Iterated Elimination of Strictly
Dominated Strategies
Player 2
Left
Middle
Right
Up
1,0
1,2
0,1
Down
0,3
0,1
2,0
Player 1
Iterated Elimination of Strictly
Dominated Strategies
Player 2
Player 1
Left
Middle
Right
Up
1,0
1,2
0,1
Down
0,3
0,1
2,0
IESDS yields a unique result!
Iterated Elimination of Strictly
Dominated Strategies
Player 1
Top
Middle
Bottom
Left
0,4
4,0
3,5
Player 2
Middle
4,0
0,4
3,5
Right
5,3
5,3
6,6
In many cases, iterated elimination of strictly
dominated strategies may not lead to a unique
result. Here, we cannot eliminate any strategies.
Weakly Dominated Strategies
A player’s action “weakly dominates” another action if the first
action is at least as good as the second action, no matter what
the other players do, and is better than the second action for
some actions of the other players. (Also known as pareto
dominates.)
Definition: In a strategic game player i’s action bi weakly
dominates her action b’i if
ui(bi, a−i) ≥ ui(b’i, a−i) for every list a−i of the other players’ actions.
And
ui(bi, a−i) > ui(b’i, a−i) for some list a−i of the other players’ actions.
Weakly dominated strategies
• We cannot always do iterated elimination if weakly
dominated strategies in the same way that we could
for strictly dominated strategies
• Sometimes the set of actions that remains depends
on the order in which we eliminated strategies.
Consider eliminating L, T vs R, B.
2
L
1
C
R
T
1,1
1,1
0,0
B
0,0
1,2
1,2
Strategic or Simultaneous-move
Games
Definition: an strategy is a complete plan of
action (what to do in every contingency).
In simultaneous-move games a Pure-Strategy
is simply an action.
Nash Equilibrium
Definition: The strategy profile a* in a
strategic game is a Nash equilibrium if, for
each player i and every strategy bi of
player i, a* is at least as good for player i
as the strategy profile (bi, a*−i):
ui(ai*, a*−i) ≥ ui(bi, a*−i) for every strategy bi of
player i.
Nash Equilibrium: Prisoner's Dilemma
Bonnie
RS
C
RS
1, 1
4, 0
C
0, 4
3, 3
Clyde
Nash Equilibrium: Prisoner's Dilemma
Bonnie
Clyde
RS
C
RS
1, 1
4, 0
C
0, 4
3, 3
(C, C) is the unique Nash Equilibrium in purestrategies
Nash Equilibrium: BoS
Jane
U2
Coldplay
U2
2, 1
0, 0
Coldplay
0, 0
1, 2
Peter
Nash Equilibrium: BoS
Jane
Peter
U2
Coldplay
U2
2, 1
0, 0
Coldplay
0, 0
1, 2
(U2, U2) and (Coldplay, Coldplay) are both Nash
equilibria in pure-strategies.
Nash Equilibrium: Matching Pennies
Player 2
Head
Tail
Head
1, -1
-1, 1
Tail
-1, 1
1, -1
Player 1
Nash Equilibrium: Matching Pennies
Player 2
Head
Tail
Head
1, -1
-1, 1
Tail
-1, 1
1, -1
Player 1
In Matching pennies there are no Nash equilibria in
pure-strategies.
Strict Nash Equilibrium
If each player’s Nash equilibrium strategy is
better than all her other strategies then we
say that the equilibrium is strict: an strategy
profile a* is a strict Nash equilibrium if for
every player i we have
ui(ai*, a*−i) > ui(bi, a*−i) for every strategy bi of
player i.
Strict Nash Equilibrium
Player 2
Player 1
Left
Right
Top
1, 1
0, 0
Bottom
0, 0
0, 0
Strict Nash Equilibrium
Player 2
Player 1
Left
Right
Top
1, 1
0, 0
Bottom
0, 0
0, 0
(Top, Left) strict Nash equilibrium
(Bottom, Right) Nash equilibrium, but Not strict
Nash equilibrium and elimination
• All Nash equilibria will survive iterated elimination of
strictly dominated strategies.
Thus, if multiple Nash equilibria are present, then
iterated elimination of strictly dominated strategies will
not leave us with a unique strategy for all players.
• This is not true of iterated elimination of weakly
dominated strategies: we can sometimes eliminate Nash
equilibria by eliminating weakly dominated strategies
(consider previous slide).
• However, we cannot eliminate strict NE through iterated
elimination of weakly dominated strategies.
Nash Equilibrium
Exercise 34.1(Guessing two-thirds of the average)
Each of three people announces an integer from 1 to K. If
the three integers are different, the person whose integer is
closest to 2/3 of the average of the three integers wins $1. If
two or more integers are the same, $1 is split equally
between the people whose integer is closest to 2/3 of the
average.
• Is there any integer k such that the strategy profile (k, k, k),
in which every person announces the same integer k, is a
Nash equilibrium? (if k ≥ 2, what happens if a person
announces a smaller number?)
• Is any other strategy profile a Nash Equilibrium? (what is the
payoff of a person whose number is the highest of the
three? Can she increase this payoff by announcing a
different number?)
Best Response Functions
For any list of actions a−i of all the players
other than i, let Bi(a−i) be the set of player i’s
best actions, given that every other player j
chooses aj :
Bi(a−i) = {ai in Ai : ui(ai, a−i) ≥ ui(bi, a−i) for all bi in Ai}.
Bi is the best response function of player i.
Best Response Functions
• Bi is a set-valued function (technically, a
correspondence, not a function): its values are
sets, not points. Every member of the set Bi(a−i) is
a best response of player i to a−i: if each of the
other players adheres to a−i then player i can do
no better than choose a member of Bi(a−i).
Bi(a−i)
a−i
Alternative definition of Nash
equilibrium
Proposition: The strategy profile a* is a
Nash equilibrium of a strategic game if and
only if every player’s action is a best
response to the other players’ actions:
a*i is in Bi(a*−i) for every player i (1)
Alternative definition of Nash
equilibrium
If each player i has a unique best response to
each a−i.
i.e. if Bi(a−i) = {bi(a−i)}
Then (1) above is equivalent to:
ai = bi(a−i) for every player i (1’)
Which is a collection of n equations in the n
unknowns a*i, where n is the number of
players in the game.
Example: Synergistic Relationship
• Two individuals
• Each decides how much effort to devote to
relationship
• Amount of effort is nonnegative real number
• If both individuals devote more effort to the
relationship, then they are both better off; for any
given effort of individual j, the return to individual
i’s effort first increases, then decreases.
• Specifically, payoff of i: ai(c + aj − ai), where c >
0 is a constant.
Example: Synergistic Relationship
Payoff to player i: ai(c+aj-ai).
First Order Condition: c + aj – 2ai = 0.
Second Order Condition: -2 < 0, Max
Solve FOC: ai = (c+aj)/2 = bi(aj)
Symmetry implies: aj = (c+ai)/2 = bj(ai)
Example: Synergistic Relationship
Symmetry of quadratic payoff functions implies
that the best response of each individual i to aj is
bi(aj ) = 1/2 (c + aj )
Nash equilibria: pairs (a*1, a*2) that solve the two
equations
a1 = 1/2 (c + a2)
a2 = 1/2 (c + a1)
Unique solution, (c, c)
Hence game has a unique Nash equilibrium
(a*1, a*2) = (c, c)