Prerequisites
Almost essential
Game Theory: Dynamic
REPEATED GAMES
MICROECONOMICS
Principles and Analysis
Frank Cowell
Note: the detail in slides marked “ * ” can
only be seen if you run the slideshow
July 2015
Frank Cowell: Repeated Games
Overview
Repeated Games
Basic structure
Embedding the game in context
Equilibrium issues
Applications
Introduction
Another examination of the role of time
Dynamic analysis can be difficult
• more than a few stages
• can lead to complicated analysis of equilibrium
We need an alternative approach
• one that preserves basic insights of dynamic games
• for example, subgame-perfect equilibrium
Build on the idea of dynamic games
• introduce a jump
• move from the case of comparatively few stages
• to the case of arbitrarily many
Repeated games
The alternative approach
• take a series of the same game
• embed it within a time-line structure
Basic idea is simple
• connect multiple instances of an atemporal game
• model a repeated encounter between the players in the same situation of
economic conflict
Raises important questions
• how does this structure differ from an atemporal model?
• how does the repetition of a game differ from a single play?
• how does it differ from a collection of unrelated games of identical
structure with identical players?
History
Why is the time-line different from a collection of unrelated
games?
The key is history
• consider history at any point on the timeline
• contains information about actual play
• information accumulated up to that point
History can affect the nature of the game
• at any stage all players can know all the accumulated information
• strategies can be conditioned on this information
History can play a role in the equilibrium
• some interesting outcomes aren’t equilibria in a single encounter
• these may be equilibrium outcomes in the repeated game
• the game’s history is used to support such outcomes
Repeated games: Structure
The stage game
• take an instant in time
• specify a simultaneous-move game
• payoffs completely specified by actions within the game
Repeat the stage game indefinitely
• there’s an instance of the stage game at time 0,1,2,…,t,…
• the possible payoffs are also repeated for each t
• payoffs at t depend on actions in stage game at t
A modified strategic environment
• all previous actions assumed as common knowledge
• so agents’ strategies can be conditioned on this information
Does this modify equilibrium behaviour and outcome?
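The structure above can be sketched in code: a strategy is a function from the accumulated history to an action, and the same stage game is replayed each period. This is a minimal illustration, assuming the Prisoner's Dilemma payoffs used later in the slides; the names `play`, `always_right` and `copy_last` are hypothetical and do not come from the source.

```python
from typing import Callable, List, Tuple

Action = str                                  # "L" (left) or "R" (right)
History = List[Tuple[Action, Action]]         # one (Alf, Bill) action pair per period
Strategy = Callable[[History], Action]

# Stage-game payoffs (Alf, Bill), indexed by (Alf's action, Bill's action).
PAYOFFS = {("L", "L"): (2, 2), ("L", "R"): (0, 3),
           ("R", "L"): (3, 0), ("R", "R"): (1, 1)}


def always_right(history: History) -> Action:
    """Ignores history: the myopic one-shot choice."""
    return "R"


def copy_last(history: History) -> Action:
    """Hypothetical history-dependent rule for Bill:
    start with L, then copy Alf's previous action."""
    return "L" if not history else history[-1][0]


def play(alf: Strategy, bill: Strategy, periods: int):
    """Replay the stage game, letting each strategy see all past play."""
    history: History = []
    payoffs = []
    for _ in range(periods):
        # Moves are simultaneous: both players see only earlier periods.
        a, b = alf(history), bill(history)
        history.append((a, b))
        payoffs.append(PAYOFFS[(a, b)])
    return history, payoffs


history, payoffs = play(always_right, copy_last, 3)
print(history)   # the actual play, period by period
print(payoffs)   # the stage payoffs generated by that play
```

The only difference from a collection of unrelated one-shot games is the `history` argument: a strategy that ignores it reproduces single-play behaviour.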
Equilibrium
Simplified structure has potential advantages
• whether these are significant depends on the nature of the stage game
• they concern the nature of equilibrium
Possibilities for equilibrium
• new strategy combinations supportable as equilibria?
• long-term cooperative outcomes
• absent from a myopic analysis of a simple game
Refinements of subgame perfection simplify the analysis:
• can rule out empty threats
• and incredible promises
• disregard irrelevant “might-have-beens”
Overview
Repeated Games
Basic structure
Developing the basic concepts
Equilibrium issues
Applications
Equilibrium: an approach
Focus on key question in repeated games:
• how can rational players use the information from history?
• need to address this to characterise equilibrium
Illustrate a method through an argument by example
• outline it for the Prisoner's Dilemma game
• the same players face the same outcomes from the actions that they may choose in periods 1, 2, …, t, …
Prisoner's Dilemma particularly instructive given:
• its importance in microeconomics
• pessimistic outcome of an isolated round of the game
*Prisoner’s dilemma: Reminder
Payoffs in stage game (Alf's payoff, Bill's payoff):

                      Bill
                 [left]    [right]
  Alf  [LEFT]     2,2       0,3
       [RIGHT]    3,0       1,1

• If Alf plays [RIGHT] then Bill’s best response is [right]
• If Bill plays [right] then Alf’s best response is [RIGHT]
• Nash Equilibrium: [RIGHT], [right] with payoffs 1,1
• Outcome that Pareto dominates NE: [LEFT], [left] with payoffs 2,2
The highlighted NE is inefficient
Could the Pareto-efficient outcome be an equilibrium in the repeated game?
Look at the structure
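The best-response reasoning on this slide can be checked mechanically. The sketch below enumerates the pure-strategy Nash equilibria of the stage game and the outcomes that Pareto dominate a given profile; the function names are hypothetical, and only the payoff numbers come from the slide.

```python
from itertools import product

ACTIONS = ("L", "R")   # "L" = [LEFT]/[left], "R" = [RIGHT]/[right]

# (Alf's payoff, Bill's payoff) indexed by (Alf's action, Bill's action).
PAYOFFS = {("L", "L"): (2, 2), ("L", "R"): (0, 3),
           ("R", "L"): (3, 0), ("R", "R"): (1, 1)}


def nash_equilibria(payoffs):
    """Profiles where neither player gains from a unilateral switch."""
    found = []
    for a, b in product(ACTIONS, ACTIONS):
        alf_ok = all(payoffs[(a, b)][0] >= payoffs[(a2, b)][0] for a2 in ACTIONS)
        bill_ok = all(payoffs[(a, b)][1] >= payoffs[(a, b2)][1] for b2 in ACTIONS)
        if alf_ok and bill_ok:
            found.append((a, b))
    return found


def pareto_dominators(payoffs, profile):
    """Profiles that make nobody worse off and somebody better off."""
    base = payoffs[profile]
    return [p for p, u in payoffs.items()
            if u[0] >= base[0] and u[1] >= base[1] and u != base]


print(nash_equilibria(PAYOFFS))
print(pareto_dominators(PAYOFFS, ("R", "R")))
```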
*Repeated Prisoner's dilemma
[Figure: extensive-form tree. In the stage game at t=1 Alf chooses [LEFT] or [RIGHT] and Bill chooses [left] or [right], with payoffs (2,2), (0,3), (3,0), (1,1). The stage game at t=2 follows after each of the four outcomes of the t=1 stage game.]
Repeat this structure indefinitely…?
Repeated Prisoner's dilemma
[Figure: the stage game repeated through time. In each period 1, …, t, … Alf chooses [LEFT] or [RIGHT] and Bill chooses [left] or [right], with payoffs (2,2), (0,3), (3,0), (1,1).]
Let's look at the detail
Repeated PD: payoffs
To represent possibilities in long run:
• first consider payoffs available in the stage game
• then those available through mixtures
In the one-shot game payoffs simply represented
• it was enough to denote them as 0,…,3
• purely ordinal
• arbitrary monotonic changes of the payoffs have no effect
Now we need a generalised notation
• cardinal values of utility matter
• we need to sum utilities, compare utility differences
Evaluation of a payoff stream:
• suppose payoff to agent h in period t is $u^h(t)$
• value of $(u^h(1), u^h(2), \dots, u^h(t), \dots)$ is given by
  $$[1-d] \sum_{t=1}^{\infty} d^{t-1} u^h(t)$$
• where d is a discount factor 0 < d < 1
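The evaluation formula can be tried out numerically. The sketch below applies the normalised discounted sum to a finite prefix of a payoff stream, which is an approximation to the infinite sum in the slide; the function name `stream_value` is hypothetical.

```python
from fractions import Fraction


def stream_value(payoffs, d):
    """Normalised discounted value [1-d] * sum_{t>=1} d**(t-1) * u(t),
    evaluated over the finite list `payoffs` (periods 1, 2, ...)."""
    if not 0 < d < 1:
        raise ValueError("discount factor must satisfy 0 < d < 1")
    return (1 - d) * sum(d ** (t - 1) * u
                         for t, u in enumerate(payoffs, start=1))


# Exact arithmetic on a short stream: one period of 3, then two periods of 1.
print(stream_value([3, 1, 1], Fraction(1, 2)))

# A long constant stream of 2 is worth approximately 2 per period:
# the factor [1-d] rescales the sum back into per-period units.
print(stream_value([2] * 2000, 0.9))
```

The normalisation by [1−d] is what allows a stream's value to be compared directly with stage-game payoffs.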
PD: stage game
A generalised notation for the stage game
• consider actions and payoffs
• in each of four fundamental cases
Both socially irresponsible:
• they play [RIGHT], [right]
• get $(u^a, u^b)$ where $u^a > 0$, $u^b > 0$
Both socially responsible:
• they play [LEFT], [left]
• get $(u^{*a}, u^{*b})$ where $u^{*a} > u^a$, $u^{*b} > u^b$
Only Alf socially responsible:
• they play [LEFT], [right]
• get $(0, \bar{u}^b)$ where $\bar{u}^b > u^{*b}$
Only Bill socially responsible:
• they play [RIGHT], [left]
• get $(\bar{u}^a, 0)$ where $\bar{u}^a > u^{*a}$
A diagrammatic view
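The four cases imply a payoff ordering for each agent: temptation payoff above the cooperative payoff, above the punishment payoff, above zero. A minimal sketch of that ordering and of the resulting dominance of the antisocial action follows; the class `PDPayoffs` and its field names are hypothetical labels for one agent's three payoff parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PDPayoffs:
    """One agent's payoffs in the generalised Prisoner's Dilemma."""
    u: float        # both play right (socially irresponsible)
    u_star: float   # both play left (socially responsible)
    u_bar: float    # agent plays right while the other plays left

    def is_valid(self) -> bool:
        """The ordering stated on the slide: u_bar > u_star > u > 0."""
        return self.u_bar > self.u_star > self.u > 0

    def stage_payoff(self, own: str, other: str) -> float:
        """Payoff to this agent from (own action, other's action)."""
        table = {("L", "L"): self.u_star, ("L", "R"): 0,
                 ("R", "L"): self.u_bar, ("R", "R"): self.u}
        return table[(own, other)]

    def right_is_dominant(self) -> bool:
        """True if "R" is strictly better whatever the other player does."""
        return all(self.stage_payoff("R", o) > self.stage_payoff("L", o)
                   for o in ("L", "R"))


alf = PDPayoffs(u=1, u_star=2, u_bar=3)   # the numbers from the reminder slide
print(alf.is_valid(), alf.right_is_dominant())
```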
Repeated Prisoner’s dilemma payoffs
[Figure: space of utility payoffs, with $u^a$ on the horizontal axis and $u^b$ on the vertical axis, and $\bar{u}^a$, $\bar{u}^b$ marked on the axes. Shown: the payoffs for the Prisoner's Dilemma; the Nash-equilibrium payoffs $(u^a, u^b)$; the payoffs Pareto-superior to NE, $(u^{*a}, u^{*b})$; the payoffs available through mixing; the feasible, superior points; the "efficient" outcomes; a region labelled 𝕌*.]
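The idea of payoffs available through mixing can be illustrated with the numerical stage game from the reminder slide. The sketch below averages the four stage outcomes using given weights and tests whether the result is Pareto-superior to the Nash-equilibrium payoffs. Reading the weights as probabilities or as shares of time is an assumption, and the function names are hypothetical.

```python
from fractions import Fraction

# (Alf, Bill) payoffs from the numerical stage game.
STAGE = {("L", "L"): (2, 2), ("L", "R"): (0, 3),
         ("R", "L"): (3, 0), ("R", "R"): (1, 1)}
NE_PAYOFF = STAGE[("R", "R")]


def mixed_payoff(weights):
    """Average payoff pair when outcome o receives weight weights[o]."""
    if any(w < 0 for w in weights.values()) or sum(weights.values()) != 1:
        raise ValueError("weights must be non-negative and sum to 1")
    alf = sum(w * STAGE[o][0] for o, w in weights.items())
    bill = sum(w * STAGE[o][1] for o, w in weights.items())
    return alf, bill


def pareto_superior_to_ne(point):
    """Nobody worse off than at the NE, and the point differs from it."""
    return (point[0] >= NE_PAYOFF[0] and point[1] >= NE_PAYOFF[1]
            and point != NE_PAYOFF)


half = Fraction(1, 2)
point = mixed_payoff({("L", "R"): half, ("R", "L"): half})
print(point, pareto_superior_to_ne(point))
```

In this example an equal mix of the two asymmetric outcomes already beats the Nash payoffs for both players, although it falls short of (2,2).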
Choosing a strategy: setting
Long-run advantage in the Pareto-efficient outcome
• payoffs $(u^{*a}, u^{*b})$ in each period
• clearly better than $(u^a, u^b)$ in each period
Suppose the agents recognise the advantage
• what actions would guarantee them this?
• clearly they need to play [LEFT], [left] every period
The problem is lack of trust:
• they cannot trust each other
• nor indeed themselves:
• Alf tempted to be antisocial and get payoff $\bar{u}^a$ by playing [RIGHT]
• Bill has a similar temptation
Choosing a strategy: formulation
Will a dominated outcome still be inevitable?
Suppose each player adopts a strategy that
1. rewards the other party's responsible behaviour by responding with the action [left]
2. punishes antisocial behaviour with the action [right], thus generating the minimax payoffs $(u^a, u^b)$
Known as a trigger strategy
Why the strategy is powerful
• punishment applies to every period after the one where the antisocial action occurred
• if punishment is invoked, the offender is “minimaxed for ever”
Look at it in detail
Repeated PD: trigger strategies
Trigger strategies $[s_T^a, s_T^b]$: take the situation at t

$s_T^a$:
  Bill’s action in 0,…,t       Alf’s action at t+1
  [left][left],…,[left]        [LEFT]
  Anything else                [RIGHT]

$s_T^b$:
  Alf’s action in 0,…,t        Bill’s action at t+1
  [LEFT][LEFT],…,[LEFT]        [left]
  Anything else                [right]

• First type of history: response of other player is to continue this history
• Second type of history: punishment response
Will it work?
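The table translates directly into a function of the history. The sketch below follows the table literally, with each player conditioning only on the other player's past actions. The helper `defect_from` is a hypothetical deviant used to show the punishment being triggered; it is not part of the source.

```python
def trigger(own_index):
    """Trigger strategy for player `own_index` (0 = Alf, 1 = Bill):
    play "L" while the other player has always played "L", else "R"."""
    other = 1 - own_index

    def strategy(history):
        return "L" if all(pair[other] == "L" for pair in history) else "R"

    return strategy


def defect_from(k):
    """Hypothetical deviant: plays "L" before period index k, "R" from k on."""
    return lambda history: "L" if len(history) < k else "R"


def play(alf, bill, periods):
    """Simultaneous moves each period; both strategies see past play only."""
    history = []
    for _ in range(periods):
        history.append((alf(history), bill(history)))
    return history


# Both follow the trigger strategy: cooperation in every period.
print(play(trigger(0), trigger(1), 4))

# Bill deviates in the third period: Alf punishes from the fourth period on.
print(play(trigger(0), defect_from(2), 5))
```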
Will the trigger strategy “work”?
Utility gain from “misbehaving” at t: $\bar{u}^a - u^{*a}$
What is value at t of punishment from t + 1 onwards?
• Difference in utility per period: $u^{*a} - u^a$
• Discounted value of this in period t + 1: $V := [u^{*a} - u^a]/[1 - d]$
• Value of this in period t: $dV = d[u^{*a} - u^a]/[1 - d]$
So agent chooses not to misbehave if
• $\bar{u}^a - u^{*a} \le d[u^{*a} - u^a]/[1 - d]$
But this is only going to work for specific parameters
• value of d
• relative to $\bar{u}^a$, $u^a$ and $u^{*a}$
What values of discount factor will allow an equilibrium?
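The comparison on this slide, a one-period gain against a discounted stream of per-period losses, can be evaluated for particular numbers. The sketch below uses the payoffs of the numerical stage game (1, 2 and 3) as an illustration; the function names are hypothetical.

```python
from fractions import Fraction


def deviation_gain(u_star, u_bar):
    """One-period utility gain from misbehaving at t."""
    return u_bar - u_star


def punishment_cost(u, u_star, d):
    """Value at t of losing (u_star - u) in every period from t+1 onwards:
    d * V, where V = (u_star - u) / (1 - d)."""
    return d * (u_star - u) / (1 - d)


def will_cooperate(u, u_star, u_bar, d):
    """True if the gain from misbehaving does not exceed the punishment."""
    return deviation_gain(u_star, u_bar) <= punishment_cost(u, u_star, d)


for d in (Fraction(2, 5), Fraction(1, 2), Fraction(3, 4)):
    print(d, will_cooperate(u=1, u_star=2, u_bar=3, d=d))
```

Exact fractions are used so that the borderline case, where gain and punishment are equal, is not affected by rounding.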
Discounting and equilibrium
For an equilibrium the condition must be satisfied for both a and b
Consider the situation of a
Rearranging the condition from the previous slide:
• $d[u^{*a} - u^a] \ge [1 - d][\bar{u}^a - u^{*a}]$
• $d[\bar{u}^a - u^a] \ge [\bar{u}^a - u^{*a}]$
Simplifying this the condition must be
• $d \ge d^a$
• where $d^a := [\bar{u}^a - u^{*a}] / [\bar{u}^a - u^a]$
A similar result must also apply to agent b
Therefore we must have the condition:
• $d \ge \underline{d}$
• where $\underline{d} := \max\{d^a, d^b\}$
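The critical value is a simple ratio for each agent, and the binding constraint is the larger of the two. A small sketch with illustrative numbers follows: Alf's payoffs are those of the numerical stage game, Bill's higher temptation payoff of 5 is invented purely to show an asymmetric case, and the function name is hypothetical.

```python
from fractions import Fraction


def critical_discount(u, u_star, u_bar):
    """Smallest discount factor at which the agent will not misbehave:
    (u_bar - u_star) / (u_bar - u)."""
    return Fraction(u_bar - u_star) / Fraction(u_bar - u)


d_a = critical_discount(u=1, u_star=2, u_bar=3)   # Alf
d_b = critical_discount(u=1, u_star=2, u_bar=5)   # Bill (assumed numbers)
d_min = max(d_a, d_b)                             # must hold for both agents

print(d_a, d_b, d_min)
```

The agent with the stronger temptation relative to the punishment determines how patient both players must be.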
Repeated PD: SPNE
Assuming $d \ge \underline{d}$ (the critical value from the previous slide), take the strategies $[s_T^a, s_T^b]$ prescribed by the Table
If there were antisocial behaviour at t consider the subgame that would then start at t + 1
• Alf could not increase his payoff by switching from [RIGHT] to [LEFT], given that Bill is playing [right]
• a similar remark applies to Bill
• so strategies imply a NE for this subgame
• likewise for any subgame starting after t + 1
But if [LEFT],[left] has been played in every period up till t:
• Alf would not wish to switch to [RIGHT]
• a similar remark applies to Bill
• again we have a NE
So, if d is large enough, $[s_T^a, s_T^b]$ is a Subgame-Perfect Equilibrium
• yields the payoffs $(u^{*a}, u^{*b})$ in every period
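The two cases on this slide can be checked by comparing, in each phase, the value of following the trigger strategy with the value of a single deviation followed by a return to it. This sketch relies on the one-shot deviation principle, which the slides do not state explicitly, and it models the punishment phase as both players choosing the antisocial action for ever. The function name is hypothetical.

```python
from fractions import Fraction


def trigger_is_subgame_perfect_for(u, u_star, u_bar, d):
    """Check one agent's incentives in both kinds of subgame."""
    # Cooperative phase: (left, left) has always been played so far.
    coop_follow = u_star / (1 - d)              # u_star in every period
    coop_deviate = u_bar + d * u / (1 - d)      # u_bar once, then u for ever

    # Punishment phase: the other player plays right for ever.
    punish_follow = u / (1 - d)                 # u in every period
    punish_deviate = 0 + d * u / (1 - d)        # 0 once, then u for ever

    return coop_follow >= coop_deviate and punish_follow >= punish_deviate


for d in (Fraction(2, 5), Fraction(1, 2), Fraction(9, 10)):
    print(d, trigger_is_subgame_perfect_for(u=1, u_star=2, u_bar=3, d=d))
```

The punishment-phase check never fails when u > 0, so the cooperative-phase inequality alone determines the critical discount factor.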
Folk Theorem
The outcome of the repeated PD is instructive
• illustrates an important result
• the Folk Theorem
Strictly speaking a class of results
• finite/infinite games
• different types of equilibrium concepts
A standard version of the Theorem:
• for a two-person infinitely repeated game:
• suppose the discount factor is sufficiently close to 1
• then any combination of actions observed in any finite number of stages
• is the outcome of a subgame-perfect equilibrium
Assessment
The Folk Theorem is central to repeated games
• perhaps better described as Folk Theorems
• a class of results
Clearly has considerable attraction
Put its significance in context
• makes relatively modest claims
• gives a possibility result
We have only seen one example of the Folk Theorem
• let’s apply it
• to well known oligopoly examples
Overview
Repeated Games
Basic structure
Some well-known examples
Equilibrium issues
Applications
Cournot competition: repeated
Start by reinterpreting PD as Cournot duopoly
• two identical firms
• firms can each choose one of two levels of output – [high] or [low]
• can firms sustain a low-output (i.e. high-profit) equilibrium?
Possible actions and outcomes in the stage game:
• [HIGH], [high]: both firms get Cournot-Nash payoff $P_C > 0$
• [LOW], [low]: both firms get joint-profit maximising payoff $P_J > P_C$
• [HIGH], [low]: payoffs are $(\bar{P}, 0)$ where $\bar{P} > P_J$
Folk theorem: get SPNE with payoffs $(P_J, P_J)$ if d is large enough
• Critical value for the discount factor d is
  $$\underline{d} = \frac{\bar{P} - P_J}{\bar{P} - P_C}$$
But we should say more
Let’s review the standard Cournot diagram
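A worked number for the critical value can be obtained under extra assumptions that the slide does not make: linear inverse demand p = a − b(q1 + q2), constant marginal cost c, and a deviation payoff equal to the best response against the other firm's joint-profit-maximising output. Note that the low-output firm then earns a positive profit rather than the 0 of the two-action version. The function name and parameter values are hypothetical.

```python
from fractions import Fraction


def cournot_profits(a, b, c):
    """Return (P_C, P_J, P_bar) for a symmetric linear Cournot duopoly.

    Assumed model: price = a - b*(q1 + q2), constant marginal cost c.
    """
    m = Fraction(a - c)        # demand intercept net of marginal cost
    b = Fraction(b)

    def profit(q_own, q_other):
        return (m - b * (q_own + q_other)) * q_own

    q_c = m / (3 * b)                    # Cournot-Nash output per firm
    q_j = m / (4 * b)                    # half of the monopoly output
    q_dev = (m - b * q_j) / (2 * b)      # best response to q_j
    return profit(q_c, q_c), profit(q_j, q_j), profit(q_dev, q_j)


P_C, P_J, P_bar = cournot_profits(a=10, b=1, c=1)
d_crit = (P_bar - P_J) / (P_bar - P_C)   # formula from the slide

print(P_C, P_J, P_bar, d_crit)
```

Under these assumptions the critical value does not depend on a, b or c, because all three profit levels scale with the same factor.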
Cournot stage game
[Figure: output space with $q^1$ on the horizontal axis and $q^2$ on the vertical axis. Shown: firm 1’s iso-profit curves; firm 2’s iso-profit curves; firm 1’s reaction function $c^1(\cdot)$; firm 2’s reaction function $c^2(\cdot)$; the Cournot-Nash equilibrium $(q_C^1, q_C^2)$; the outputs with higher profits for both firms; the joint profit-maximising solution $(q_J^1, q_J^2)$; the outputs $\bar{q}^1$ and $\bar{q}^2$ that force the other firm’s profit to 0.]
Repeated Cournot game: Punishment
Standard Cournot model is richer than simple PD:
• action space for PD stage game just has the two output levels
• continuum of output levels introduces further possibilities
Minimax profit level for firm 1 in a Cournot duopoly
• is zero, not the NE outcome $P_C$
• arises where firm 2 sets output to $\bar{q}^2$ such that 1 makes no profit
Imagine a deviation by firm 1 at time t
• raises $q^1$ above joint profit-max level
Would minimax be used as punishment from t + 1 to ∞?
• clearly $(0, \bar{q}^2)$ is not on firm 2's reaction function
• so cannot be best response by firm 2 to an action by firm 1
• so it cannot belong to the NE of the subgame
• everlasting minimax punishment is not credible in this case
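The credibility argument can be illustrated numerically under an assumed linear model: inverse demand p = a − b(q1 + q2) and constant marginal cost c, neither of which is specified in the slides. The sketch finds the output of firm 2 that leaves firm 1 with no profitable output, then shows that this output is not firm 2's best response. Function names and parameter values are hypothetical.

```python
from fractions import Fraction


def best_response(q_other, a, b, c):
    """Profit-maximising output against q_other (never negative)."""
    return max(Fraction(0), (Fraction(a - c) - b * q_other) / (2 * b))


def profit(q_own, q_other, a, b, c):
    """Profit with price a - b*(q_own + q_other) and marginal cost c."""
    return (a - b * (q_own + q_other) - c) * q_own


a, b, c = 10, 1, 1

# Output of firm 2 that drives price down to marginal cost even if
# firm 1 produces nothing, so firm 1 cannot make a profit.
q_bar_2 = Fraction(a - c, b)
q1 = best_response(q_bar_2, a, b, c)
print(q1, profit(q1, q_bar_2, a, b, c))          # firm 1 is minimaxed

# Firm 2's best response to that q1 is a different, smaller output.
q2_best = best_response(q1, a, b, c)
print(q2_best, q2_best == q_bar_2)
print(profit(q_bar_2, q1, a, b, c), profit(q2_best, q1, a, b, c))
```

Firm 2 earns nothing while it minimaxes firm 1 and could do strictly better by cutting output, which is why the everlasting punishment is not credible.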
Repeated Cournot game: Payoffs
[Figure: space of profits for the two firms, with $P^1$ on the horizontal axis and $P^2$ on the vertical axis and $\bar{P}$ marked on each axis. Shown: the Cournot-Nash outcome $(P_C, P_C)$; joint-profit maximisation $(P_J, P_J)$; the minimax outcomes; the payoffs available in the repeated game.]
Now review Bertrand competition
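Bertrand stage game
[Figure: price space with $p^1$ on the horizontal axis and $p^2$ on the vertical axis, with marginal cost c and monopoly price $p_M$ marked on each axis. Shown: marginal cost pricing; monopoly pricing; firm 1’s reaction function; firm 2’s reaction function; the Nash equilibrium.]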
Bertrand competition: repeated
NE of the stage game:
• set price equal to marginal cost c
• results in zero profits
NE outcome is the minimax outcome
• minimax outcome is implementable as a Nash equilibrium
• in all the subgames following a defection from cooperation
In repeated Bertrand competition
• firms set $p_M$ if acting “cooperatively”
• split profits between them
• if one firm deviates from this
• others then set price to c
Repeated Bertrand: result
• can enforce joint profit maximisation through trigger strategy
• provided discount factor is large enough
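How large the discount factor must be can be sketched by reusing the earlier ratio of deviation gain to total loss, under assumptions the slide does not state. These are that the cooperating firms split the monopoly profit equally, that a deviating firm undercuts by a negligible amount and captures essentially all of the monopoly profit for one period, and that punishment gives zero profit thereafter. The function name and the `n_firms` parameter are hypothetical.

```python
from fractions import Fraction


def bertrand_critical_discount(monopoly_profit, n_firms=2):
    """Critical discount factor for a trigger strategy in repeated Bertrand."""
    share = Fraction(monopoly_profit) / n_firms   # cooperative profit per period
    deviation = Fraction(monopoly_profit)         # assumed one-period deviation profit
    punishment = Fraction(0)                      # price equals marginal cost
    return (deviation - share) / (deviation - punishment)


print(bertrand_critical_discount(100))        # duopoly
print(bertrand_critical_discount(100, 3))     # three firms
```

Because the Nash-equilibrium outcome coincides with the minimax outcome here, the punishment used in this calculation is credible in every subgame.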
Repeated Bertrand game: Payoffs
[Figure: space of profits for the two firms, with $P^1$ on the horizontal axis and $P^2$ on the vertical axis and the monopoly profit $P_M$ marked on each axis. Shown: the Bertrand-Nash outcome (zero profits); firm 1 as a monopoly; firm 2 as a monopoly; the payoffs available in the repeated game.]
Repeated games: summary
New concepts:
• Stage game
• History
• The Folk Theorem
• Trigger strategy
What next?
• Games under uncertainty