Prerequisites
Almost essential: Game Theory: Dynamic

REPEATED GAMES
MICROECONOMICS: Principles and Analysis
Frank Cowell
Note: the detail in slides marked “ * ” can only be seen if you run the slideshow
July 2015 Frank Cowell: Repeated Games

Overview
Repeated Games
• Basic structure
• Equilibrium issues
• Applications
Embedding the game in context

Introduction
Another examination of the role of time
Dynamic analysis can be difficult
• more than a few stages
• can lead to complicated analysis of equilibrium
We need an alternative approach
• one that preserves basic insights of dynamic games
• for example, subgame-perfect equilibrium
Build on the idea of dynamic games
• introduce a jump
• move from the case of comparatively few stages
• to the case of arbitrarily many

Repeated games
The alternative approach
• take a series of the same game
• embed it within a time-line structure
Basic idea is simple
• connect multiple instances of an atemporal game
• model a repeated encounter between the players in the same situation of economic conflict
Raises important questions
• how does this structure differ from an atemporal model?
• how does the repetition of a game differ from a single play?
• how does it differ from a collection of unrelated games of identical structure with identical players?

History
Why is the time-line different from a collection of unrelated games?
The key is history
• consider history at any point on the timeline
• contains information about actual play
• information accumulated up to that point
History can affect the nature of the game
• at any stage all players can know all the accumulated information
• strategies can be conditioned on this information
History can play a role in the equilibrium
• some interesting outcomes aren’t equilibria in a single encounter
• these may be equilibrium outcomes in the repeated game
• the game’s history is used to support such outcomes

Repeated games: Structure
The stage game
• take an instant in time
• specify a simultaneous-move game
• payoffs completely specified by actions within the game
Repeat the stage game indefinitely
• there’s an instance of the stage game at time 0,1,2,…,t,…
• the possible payoffs are also repeated for each t
• payoffs at t depend on actions in stage game at t
A modified strategic environment
• all previous actions assumed as common knowledge
• so agents’ strategies can be conditioned on this information
Modifies equilibrium behaviour and outcome?

Equilibrium
Simplified structure has potential advantages
• whether significant depends on nature of stage game
• concern nature of equilibrium
Possibilities for equilibrium
• new strategy combinations supportable as equilibria?
• long-term cooperative outcomes
• absent from a myopic analysis of a simple game
Refinements of subgame perfection simplify the analysis:
• can rule out empty threats
• and incredible promises
• disregard irrelevant “might-have-beens”

Overview
Repeated Games
• Basic structure
• Equilibrium issues
• Applications
Developing the basic concepts

Equilibrium: an approach
Focus on key question in repeated games:
• how can rational players use the information from history?
• need to address this to characterise equilibrium
Illustrate a method in an argument by example
• Outline for the Prisoner's Dilemma game
• same players face same outcomes from their actions that they may choose in periods 1, 2, …, t, …
Prisoner's Dilemma particularly instructive given:
• its importance in microeconomics
• pessimistic outcome of an isolated round of the game

*Prisoner’s dilemma: Reminder
Payoffs in stage game (Alf's payoff, Bill's payoff):

                     Bill
                 [left]   [right]
Alf   [LEFT]      2,2      0,3
      [RIGHT]     3,0      1,1

• If Alf plays [RIGHT] then Bill’s best response is [right]
• If Bill plays [right] then Alf’s best response is [RIGHT]
• Nash Equilibrium: [RIGHT], [right] with payoffs 1,1
• Outcome that Pareto dominates NE: [LEFT], [left] with payoffs 2,2
The highlighted NE is inefficient
Could the Pareto-efficient outcome be an equilibrium in the repeated game?
Look at the structure

*Repeated Prisoner's dilemma
[Figure: game tree. The stage game between Alf and Bill at t=1 ends in one of the outcomes (2,2), (0,3), (3,0), (1,1); the stage game at t=2 follows from each of these four outcomes.]
Repeat this structure indefinitely…?
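
The best-response reasoning on the reminder slide can be checked mechanically. The sketch below is only an illustration: it uses the stage-game payoffs from that slide, and the names `PAYOFFS`, `is_nash` and `pareto_dominates` are hypothetical helpers introduced here, not part of the original slides.

```python
# Stage-game payoffs from the reminder slide: (Alf's payoff, Bill's payoff).
PAYOFFS = {
    ("LEFT", "left"): (2, 2),
    ("LEFT", "right"): (0, 3),
    ("RIGHT", "left"): (3, 0),
    ("RIGHT", "right"): (1, 1),
}
ALF_ACTIONS = ("LEFT", "RIGHT")
BILL_ACTIONS = ("left", "right")


def is_nash(alf, bill):
    """True if neither player gains from a unilateral deviation."""
    u_alf, u_bill = PAYOFFS[(alf, bill)]
    alf_ok = all(PAYOFFS[(a, bill)][0] <= u_alf for a in ALF_ACTIONS)
    bill_ok = all(PAYOFFS[(alf, b)][1] <= u_bill for b in BILL_ACTIONS)
    return alf_ok and bill_ok


def pareto_dominates(x, y):
    """True if x is at least as good as y for both players and better for one."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))


# Action profiles that are Nash equilibria of the one-shot game.
nash = [profile for profile in PAYOFFS if is_nash(*profile)]

# Action profiles whose payoffs Pareto dominate some Nash-equilibrium payoff.
dominating = [
    profile
    for profile in PAYOFFS
    if any(pareto_dominates(PAYOFFS[profile], PAYOFFS[ne]) for ne in nash)
]

print(nash)        # [('RIGHT', 'right')]
print(dominating)  # [('LEFT', 'left')]
```

The only equilibrium of the isolated round is the mutually antisocial one, while the mutually responsible profile is better for both players, which is the tension the repeated game is meant to address.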
Repeated Prisoner's dilemma
The stage game repeated through time
[Figure: game tree. At period 1 and again at period t, Alf chooses [LEFT] or [RIGHT] and Bill chooses [left] or [right], with payoffs (2,2), (0,3), (3,0), (1,1); the tree continues from every outcome.]
Let's look at the detail

Repeated PD: payoffs
To represent possibilities in long run:
• first consider payoffs available in the stage game
• then those available through mixtures
In the one-shot game payoffs simply represented
• it was enough to denote them as 0,…,3
• purely ordinal
• arbitrary monotonic changes of the payoffs have no effect
Now we need a generalised notation
• cardinal values of utility matter
• we need to sum utilities, compare utility differences
Evaluation of a payoff stream:
• suppose payoff to agent h in period t is $u_h(t)$
• value of $(u_h(1), u_h(2), \ldots, u_h(t), \ldots)$ is given by
  $[1-\delta] \sum_{t=1}^{\infty} \delta^{t-1} u_h(t)$
• where $\delta$ is a discount factor, $0 < \delta < 1$

PD: stage game
A generalised notation for the stage game
• consider actions and payoffs
• in each of four fundamental cases
Both socially irresponsible:
• they play [RIGHT], [right]
• get $(u_a, u_b)$ where $u_a > 0$, $u_b > 0$
Both socially responsible:
• they play [LEFT], [left]
• get $(u^*_a, u^*_b)$ where $u^*_a > u_a$, $u^*_b > u_b$
Only Alf socially responsible:
• they play [LEFT], [right]
• get $(0, \bar{u}_b)$ where $\bar{u}_b > u^*_b$
Only Bill socially responsible:
• they play [RIGHT], [left]
• get $(\bar{u}_a, 0)$ where $\bar{u}_a > u^*_a$
A diagrammatic view

Repeated Prisoner’s dilemma payoffs
[Figure: space of utility payoffs, with axes $u_a$ and $u_b$. Marked: payoffs for the Prisoner's Dilemma; Nash-equilibrium payoffs $(u_a, u_b)$; payoffs Pareto-superior to NE; payoffs available through mixing; feasible, superior points; "efficient" outcomes; the set $\mathbb{U}^*$; the point $(u^*_a, u^*_b)$.]

Choosing a strategy: setting
Long-run advantage in the Pareto-efficient outcome
• payoffs $(u^*_a, u^*_b)$ in each period
• clearly better than $(u_a, u_b)$ in each period
Suppose the agents recognise the advantage
• what actions would guarantee them this?
• clearly they need to play [LEFT], [left] every period
The problem is lack of trust:
• they cannot trust each other
• nor indeed themselves:
• Alf tempted to be antisocial and get payoff $\bar{u}_a$ by playing [RIGHT]
• Bill has a similar temptation

Choosing a strategy: formulation
Will a dominated outcome still be inevitable?
Suppose each player adopts a strategy that
1. rewards the other party's responsible behaviour by responding with the action [left]
2. punishes antisocial behaviour with the action [right], thus generating the minimax payoffs $(u_a, u_b)$
Known as a trigger strategy
Why the strategy is powerful
• punishment applies to every period after the one where the antisocial action occurred
• if punishment invoked offender is “minimaxed for ever”
Look at it in detail

Repeated PD: trigger strategies
Take situation at t. Trigger strategies $[s_T^a, s_T^b]$:

$s_T^a$   Bill’s action in 0,…,t          Alf’s action at t+1
          [left][left],…,[left]           [LEFT]
          Anything else                   [RIGHT]

$s_T^b$   Alf’s action in 0,…,t           Bill’s action at t+1
          [LEFT][LEFT],…,[LEFT]           [left]
          Anything else                   [right]

• First type of history: response of other player to continue this history
• Second type of history: punishment response
Will it work?

Will the trigger strategy “work”?
Utility gain from “misbehaving” at t: $\bar{u}_a - u^*_a$
What is value at t of punishment from t + 1 onwards?
• Difference in utility per period: $u^*_a - u_a$
• Discounted value of this in period t + 1: $V := [u^*_a - u_a]/[1-\delta]$
• Value of this in period t: $\delta V = \delta[u^*_a - u_a]/[1-\delta]$
So agent chooses not to misbehave if
• $\bar{u}_a - u^*_a \le \delta[u^*_a - u_a]/[1-\delta]$
But this is only going to work for specific parameters
• value of $\delta$
• relative to $\bar{u}_a$, $u_a$ and $u^*_a$
What values of discount factor will allow an equilibrium?
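
The comparison above, a one-period gain from misbehaving set against the discounted loss from permanent punishment, can be tried out numerically. The sketch below is illustrative only. The function names are hypothetical, and the payoffs 3, 2 and 1 are taken from the reminder slide as the temptation payoff, the cooperative payoff and the punishment payoff respectively.

```python
def critical_discount(u_tempt, u_coop, u_punish):
    """Discount factor at which the gain from misbehaving equals the punishment loss.

    Obtained by rearranging
        u_tempt - u_coop <= delta * (u_coop - u_punish) / (1 - delta).
    """
    if not u_tempt > u_coop > u_punish:
        raise ValueError("expected u_tempt > u_coop > u_punish")
    return (u_tempt - u_coop) / (u_tempt - u_punish)


def will_not_misbehave(delta, u_tempt, u_coop, u_punish):
    """Check the no-deviation condition directly for a given discount factor."""
    gain = u_tempt - u_coop                          # one-off gain in period t
    loss = delta * (u_coop - u_punish) / (1 - delta)  # value at t of punishment from t + 1 on
    return gain <= loss


# Illustrative payoffs from the reminder slide (an assumption of this sketch).
threshold = critical_discount(3, 2, 1)
print(threshold)                          # 0.5
print(will_not_misbehave(0.6, 3, 2, 1))   # True: patient enough, cooperation holds
print(will_not_misbehave(0.4, 3, 2, 1))   # False: the temptation wins
```

With these numbers the condition holds exactly when the discount factor is at least one half; other payoffs shift the threshold but not the logic.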
Discounting and equilibrium
For an equilibrium condition must be satisfied for both a and b
Consider the situation of a
Rearranging the condition from the previous slide:
• $\delta[u^*_a - u_a] \ge [1-\delta][\bar{u}_a - u^*_a]$
• $\delta[\bar{u}_a - u_a] \ge [\bar{u}_a - u^*_a]$
Simplifying this the condition must be
• $\delta \ge \delta_a$
• where $\delta_a := [\bar{u}_a - u^*_a] / [\bar{u}_a - u_a]$
A similar result must also apply to agent b
Therefore we must have the condition:
• $\delta \ge \underline{\delta}$
• where $\underline{\delta} := \max\{\delta_a, \delta_b\}$

Repeated PD: SPNE
Assuming $\delta \ge \underline{\delta}$, take the strategies $[s_T^a, s_T^b]$ prescribed by the Table
If there were antisocial behaviour at t consider the subgame that would then start at t + 1
• Alf could not increase his payoff by switching from [RIGHT] to [LEFT], given that Bill is playing [right]
• a similar remark applies to Bill
• so strategies imply a NE for this subgame
• likewise for any subgame starting after t + 1
But if [LEFT],[left] has been played in every period up till t:
• Alf would not wish to switch to [RIGHT]
• a similar remark applies to Bill
• again we have a NE
So, if $\delta$ is large enough, $[s_T^a, s_T^b]$ is a Subgame-Perfect Equilibrium
• yields the payoffs $(u^*_a, u^*_b)$ in every period

Folk Theorem
The outcome of the repeated PD is instructive
• illustrates an important result
• the Folk Theorem
Strictly speaking a class of results
• finite/infinite games
• different types of equilibrium concepts
A standard version of the Theorem:
• for a two-person infinitely repeated game:
• suppose discount factor is sufficiently close to 1
• any combination of actions observed in any finite number of stages
• this is the outcome of a subgame-perfect equilibrium

Assessment
The Folk Theorem central to repeated games
• perhaps better described as Folk Theorems
• a class of results
Clearly has considerable attraction
Put its significance in context
• makes relatively modest claims
• gives a possibility result
Only seen one example of the Folk Theorem
• let’s apply it
• to well known oligopoly examples

Overview
Repeated Games
• Basic structure
• Equilibrium issues
• Applications
Some well-known examples

Cournot competition: repeated
Start by reinterpreting PD as Cournot duopoly
• two identical firms
• firms can each choose one of two levels of output: [high] or [low]
• can firms sustain a low-output (i.e. high-profit) equilibrium?
Possible actions and outcomes in the stage game:
• [HIGH], [high]: both firms get Cournot-Nash payoff $\Pi_C > 0$
• [LOW], [low]: both firms get joint-profit maximising payoff $\Pi_J > \Pi_C$
• [HIGH], [low]: payoffs are $(\bar{\Pi}, 0)$ where $\bar{\Pi} > \Pi_J$
Folk theorem: get SPNE with payoffs $(\Pi_J, \Pi_J)$ if $\delta$ is large enough
• Critical value for the discount factor $\delta$ is
  $\delta = \dfrac{\bar{\Pi} - \Pi_J}{\bar{\Pi} - \Pi_C}$
But we should say more
• Let’s review the standard Cournot diagram

Cournot stage game
[Figure: Cournot stage game, with axes $q^1$ and $q^2$. Marked: firm 1’s iso-profit curves; firm 2’s iso-profit curves; firm 1’s reaction function $\chi^1(\cdot)$; firm 2’s reaction function $\chi^2(\cdot)$; Cournot-Nash equilibrium $(q_C^1, q_C^2)$; outputs with higher profits for both firms; joint profit-maximising solution $(q_J^1, q_J^2)$; outputs $\bar{q}^1$, $\bar{q}^2$ that force the other firm’s profit to 0.]

Repeated Cournot game: Punishment
Standard Cournot model is richer than simple PD:
• action space for PD stage game just has the two output levels
• continuum of output levels introduces further possibilities
Minimax profit level for firm 1 in a Cournot duopoly
• is zero, not the NE outcome $\Pi_C$
• arises where firm 2 sets output to $\bar{q}^2$ such that 1 makes no profit
Imagine a deviation by firm 1 at time t
• raises $q^1$ above joint profit-max level
Would minimax be used as punishment from t + 1 to ∞?
• clearly $(0, \bar{q}^2)$ is not on firm 2's reaction function
• so cannot be best response by firm 2 to an action by firm 1
• so it cannot belong to the NE of the subgame
• everlasting minimax punishment is not credible in this case

Repeated Cournot game: Payoffs
[Figure: space of profits for the two firms, with axes $\Pi^1$ and $\Pi^2$. Marked: Cournot-Nash outcome $(\Pi_C, \Pi_C)$; joint-profit maximisation $(\Pi_J, \Pi_J)$; minimax outcomes; payoffs available in repeated game; $\bar{\Pi}$ on each axis.]
Now review Bertrand competition

Bertrand stage game
[Figure: Bertrand stage game, with axes $p^1$ and $p^2$. Marked: marginal cost pricing $c$; monopoly pricing $p_M$; firm 1’s reaction function; firm 2’s reaction function; Nash equilibrium.]

Bertrand competition: repeated
NE of the stage game:
• set price equal to marginal cost $c$
• results in zero profits
NE outcome is the minimax outcome
• minimax outcome is implementable as a Nash equilibrium
• in all the subgames following a defection from cooperation
In repeated Bertrand competition
• firms set $p_M$ if acting “cooperatively”
• split profits between them
• if one firm deviates from this
• others then set price to $c$
Repeated Bertrand: result
• can enforce joint profit maximisation through trigger strategy
• provided discount factor is large enough

Repeated Bertrand game: Payoffs
[Figure: space of profits for the two firms, with axes $\Pi^1$ and $\Pi^2$. Marked: Bertrand-Nash outcome; firm 1 as a monopoly; firm 2 as a monopoly; payoffs available in repeated game; $\Pi_M$ on each axis.]

Repeated games: summary
New concepts:
• Stage game
• History
• The Folk Theorem
• Trigger strategy
What next?
• Games under uncertainty
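
The concepts listed in the summary (stage game, history, trigger strategy) can be tied together in a small simulation of the repeated Prisoner's Dilemma. This is a rough sketch under stated assumptions, not the author's construction: the payoffs are those of the reminder slide, the infinite horizon is truncated at a hypothetical `HORIZON` of 500 periods, and all function names are invented for illustration.

```python
# Stage-game payoffs (Alf, Bill) from the reminder slide.
PAYOFFS = {
    ("LEFT", "left"): (2, 2),
    ("LEFT", "right"): (0, 3),
    ("RIGHT", "left"): (3, 0),
    ("RIGHT", "right"): (1, 1),
}
HORIZON = 500  # finite truncation of the infinite game (an approximation)


def alf_trigger(bill_history):
    """Alf plays [LEFT] only if Bill has played [left] in every earlier period."""
    return "LEFT" if all(b == "left" for b in bill_history) else "RIGHT"


def bill_trigger(alf_history):
    """Bill plays [left] only if Alf has played [LEFT] in every earlier period."""
    return "left" if all(a == "LEFT" for a in alf_history) else "right"


def alf_defects_at(t_dev):
    """Hypothetical deviation: follow the trigger strategy, then [RIGHT] from t_dev on."""
    def strategy(bill_history):
        if len(bill_history) >= t_dev:
            return "RIGHT"
        return alf_trigger(bill_history)
    return strategy


def play(alf_strategy, bill_strategy, periods):
    """Run the repeated game; each strategy is conditioned on the other's history."""
    alf_hist, bill_hist, stream = [], [], []
    for _ in range(periods):
        a = alf_strategy(bill_hist)
        b = bill_strategy(alf_hist)
        alf_hist.append(a)
        bill_hist.append(b)
        stream.append(PAYOFFS[(a, b)])
    return stream


def alf_value(stream, delta):
    """Normalised discounted value: [1 - delta] * sum of delta**t * u(t), t = 0, 1, ..."""
    return (1 - delta) * sum(delta ** t * u_alf for t, (u_alf, _) in enumerate(stream))


for delta in (0.4, 0.6):
    cooperate = alf_value(play(alf_trigger, bill_trigger, HORIZON), delta)
    deviate = alf_value(play(alf_defects_at(0), bill_trigger, HORIZON), delta)
    print(delta, round(cooperate, 3), round(deviate, 3), deviate > cooperate)
```

With these payoffs an immediate deviation is worth roughly 3 − 2δ in normalised terms against 2 from sustained cooperation, so it should pay at δ = 0.4 but not at δ = 0.6. This is in line with the point that the trigger strategy supports cooperation only when the discount factor is large enough.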