Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Copyright (C) 2011 David K. Levine This document is an open textbook; you can redistribute it and/or modify it under the terms of version 1 of the open text license amendment to version 2 of the GNU General Public License. The open text license amendment is published by Michele Boldrin et al at http://levine.sscnet.ucla.edu/general/gpl.htm; the GPL is published by the Free Software Foundation at http://www.gnu.org/copyleft/gpl.html. If you prefer you may use the Creative Commons attribution license http://creativecommons.org/licenses/by/2.0/ Reputation Extensive Form Examples (-1,-1) (1,1) Fight Give In 1 (2,0) Out In 2 Chain Store Game 1 (2,-1) (1,1) Low High 1 (0,0) Out Buy 2 Quality Game 2 Simultaneous Move Examples Modified Chain Store out in fight 2- , 0 -1,-1 give in 2,0 1,1** 3 Inflation Game Low High Low 0,0 -2,-1 High 1,-1 -1,0 Inflation Game: LR=government, SR=consumers consumer preferences are whether or not they guess right Low High Low 0,0 0,-1 High -1,-1 -1,0 with a hard-nosed government 4 The Model multiple types of long-run player w Î W W is a countable set of types type is fixed forever (does not change from period to period) u 1(a, w) utility depends on type strategy s 1(h, w) depends on type types are privately known to long-run player, not known to short run player strategy s 2 (h ) does not depend on type m probability distribution over W commonly known short-run player prior over types 5 Truly Committed Types type w(a1 ) has a dominant strategy to play a 1 in the repeated game: ìï 1 a%1 = a 1 u 1(a%1, a 2, w(a 1 )) = ïí ïï 0 a%1 ¹ a 1 î for example A truly committed type “can't be bargained with…can't be reasoned with…doesn't feel pity, or remorse, or fear.” 6 Let n (w) be the least utility received by a type w in any Nash equilibrium Fix a type w0 : let a 1 * be a pure strategy Stackelberg strategy for that type with corresponding committed type w* º w(a1 *) and utility u 1* = maxa 1 min a 2 Î BR (a 1 ) u 1(a 1, a 2, w0 ) Note the use of min rather than max in the second spot let u 1 be the least possible utility for type w0 Theorem: Suppose m(w0 ) > 0 and m* º m(w*) > 0 . Then there is a constant k (m*) otherwise independent of m, W and d such that n ( w) ³ dk ( m*)u 1 * + (1 - dk ( m*) )u 1 7 Proof define pt * to be the probability at the beginning of period t by the short-run player that the long-run player will play a 1 * Let N ( pt * £ p ) be the number of times pt * £ p Lemma 1: There is p < 1 such that if pt * > p the SR player plays a best response to a 1 * Why? Lemma 2: Suppose that LR plays a 1 * always. Then for any history h that has positive probability pr (N (pt * £ p ) > log m * / log p | h ) = 0 Why do the Lemmas imply the Theorem? 8 Proof of Lemma 1 Bayes Law p (w* | ht ) = p (w* | ht - 1)p (ht | w*, ht - 1) p (ht | ht - 1 ) given ht - 1 player 1 and 2 play independently p ( w* | ht ) = p ( w* | ht - 1 )p (ht | w*, ht - 1 ) p (ht1 | ht - 1 )p (ht2 | ht - 1 ) 9 repeating from the previous page p ( w* | ht - 1 )p (ht | w*, ht - 1 ) p ( w* | ht ) = p (ht1 | ht - 1 )p (ht2 | ht - 1 ) since player 1’s type isn’t known to player 2 rewrite denominator p ( w* | ht ) = p ( w* | ht - 1 )p (ht | w*, ht - 1 ) p (ht1 | ht - 1 )p (ht2 | w*, ht - 1 ) player 1’s strategy is to always play a 1 * so p (ht1 | w*, ht - 1) = 1 p ( w* | ht - 1 )p (ht2 | w*, ht - 1 ) p ( w* | ht ) = p (ht1 | ht - 1 )p (ht2 | w*, ht - 1 ) p ( w* | ht - 1 ) p ( w* | ht - 1 ) = = pt * p (ht1 | ht - 1 ) 10 the conclusion reiterated: p ( w* | ht ) = p ( w* | ht - 1 ) pt * what does this say? the Lemma now derives from the fact that p (w* | ht ) £ 1 11 Observational Equivalence r (y | a ) outcome function a 2 Î W (a 1 ) if there exists a%1 such that r (×| a%1, a 2 ) = r (×| a 1, a 2 ) and a 2 Î BR (a%1 ) u 1* = maxa 1 min a 2 Î W (a 1 ) u 1(a 1, a 2, w0 ) here the min can have real bite 12 (-1,-1) (1,1) Fight Give In 1 (2,0) Out In 2 Chain Store Game 13 strategies that are observationally equivalent out in mixed fight all fight fight give all give give mixed all mixed mixed weak best responses fight: out give: in, out mixed: in, out? Best case fight:out so u 1 * = 2 14 (2,-1) (1,1) Low High 1 (0,0) Out Buy 2 Quality Game strategies that are observationally equivalent 15 out buy mixed hi all hi hi lo all lo lo mixed all mixed mixed weak best responses hi: in, out lo: out mixed: in?, out in every case out is a weak best response so u 1 * = 0 16 Moral Hazard and Mixed Commitments r (y | a ) outcome function expand space of types to include types committed to mixed strategies: leads to technical complications because it requires a continuum of types p(ht - 1 ) probability distribution over outcomes conditional on the history (a vector) p + (ht - 1) probability distribution over outcomes conditional on the history and the type being in W+ 17 Theorem: for every e > 0, D 0 > 0 and set of types W+ with m(W+ ) > 0 there is a K such that if W+ is true there is probability less than e that there are more than K periods with p + (ht - 1 ) - p(ht - 1 ) > D 0 18 look for tight bounds let n , n be best and worst Nash payoffs to LR try to get lim infd® 1 n (w) = lim sup d® 1 n ( w) = max a 2 Î BR (a 1 ) u 1(a ) game is non-degenerate if there is no undominated pure action a 2 such that for some a 2 ¹ a 2 u i (×, a 2 ) = u i (×, a 2 ) counterexample: player 2 gets zero always, player 1 gets either zero or one depending only on player 2’s action 19 game is identified if for all a 2 that are not weakly dominated r (×| a 1, a 2 ) = r (×| a%1, a 2 ) implies a 1 = a%1 r (×| a 1, a 2 ) = a 1R (a 2 ) condition for identification R (a 2 ) has full row rank for all a 2 20 Patient Short Run Players: Schmidt é10 0 ù ú short run preferences ê 0 1 êë úû 21 long run preferences 0 m = 0.1 m* = 0.01 i m = 0.89 é10 0 ù ê ú 0 1 êë úû pure coordination é10 10 ù ê ú êë 0 1 úû commitment type é1 1 ù ê ú êë1 1 úû indifferent type 22 strategies: normal: play U except if you previously did D, then switch to D commitment: always play U indifferent type: U until deviation then D SR: play L then alternate between R and L (on path) if 1 deviated to D switch to R forever if 2 deviated play L; if 1 reacts with U continue with L reacts with D continue with R 23 d1 ³ .15, d2 ³ .75 then this is a subgame perfect equilibrium interesting deviation for SR when supposed to do R deviate to L; but then indifferent type switches to D forever for the normal type to prove he’s not type “i” he must play D revealing he is not the commitment type 24 Suppose that LR can minmax SR in a pure strategy a 1 Theorem: LR gets at least min a 2 Î BR 2 (a1 ) u 1(a1, a 2 ) let u 2 be SR minmax let u%2 be second best against a 1 ln(1 - d2 ) + ln(u 2 - u%2 ) - ln(u 2 - u%2 ) N = ln d2 (1 - d2 )2 (u 2 - u%2 ) e= - d2N (1 - d2 ) 2 2 (u - u% ) 25 commit to a 1 Lemma: suppose a2t + 1 Ï BR (a 1 ) with positive probability, then SR must believe that in t + 1, K , t + N there is a probability of at least e of not having a 1 why is this sufficient? 26 Proof of Lemma: 2 can get at least u 2 so (1 - d2 )u (a 1, a2t + 1 ) + d2V ³ u 2 if pr (a1 ) > 1 - e in t + 1, K , t + N suppose the optimum is not a br; consider deviating to a br gain at least u 2 - u%2 at t + 1; starting at t + 2 earn no less than u 2 if you hadn’t deviated, starting at t + 2 would have earned at most (1 - d2 )å N t 2 (1 - e) + eu 2 ) + dN + 1u 2 d ( u 2 2 t=1 but we chose N and e so that the loss exceeds the gain 27