EC941 - Game Theory
Lecture 6
Prof. Francesco Squintani
Email: [email protected]

Structure of the Lecture

- Extensive-Form Games with Imperfect Information
- Spence Signalling Game
- Crawford and Sobel Cheap Talk

In games of perfect information, the subgame perfect equilibrium coincides with the backward-induction solution. But subgame perfect equilibrium is a more general concept, defined also for games without perfect information.

Extensive Form Games with Imperfect Information

In some strategic interactions, players make decisions without knowing the prior decisions of other players. This formulation allows for simultaneous moves and for uncertainty about the opponents' types. Extensive form games with imperfect information therefore also include strategic form games and Bayesian games.

Example: Entry Game

Consider an entry game in which the entrant may or may not be ready for a fight, and this is private information.

[Figure: game tree of the entry game. Player 1 chooses R, U, or O. After O the game ends with payoffs (2, 5). After R or U, player 2 chooses F or S; the payoffs are (1, 1) after RF, (3, 3) after RS, (0, 2) after UF, and (4, 3) after US.]

The two decision nodes of player 2 are linked (they form a single information set), to account for the fact that she does not know player 1's prior action.

Extensive Form Game

Definition. An extensive game with imperfect information and chance moves consists of:
- a set of players;
- a set of sequences (terminal histories) with the property that no sequence is a proper subhistory of any other sequence;
- a function (the player function) that assigns either a player or "chance" to every sequence that is a proper subhistory of some terminal history;
- a probability distribution (the chance move) over the actions available after any history to which the player function assigns chance, with these probability distributions independent of each other;
- for each player, a partition (the player's information partition) of the set of histories assigned to that player by the player function;
- for each player, preferences over the set of terminal histories.

Definition. A pure strategy of player i in an extensive game is a function that assigns to each of player i's information sets Ii an action in A(Ii) (the set of actions available to player i at the information set Ii).

With this definition we can transform any extensive form game into a strategic form game, and find all its Nash equilibria.

For example, in the entry game, we have:
- The set of players is {1, 2}.
- The terminal histories are {RF, RS, UF, US, O}.
- The player function is P(∅) = 1, P(R) = 2, P(U) = 2.
- There is no chance move.
- Player 1's information partition is {{∅}}, and player 2's information partition is {{R, U}}.
- The preferences over the set of terminal histories are
  u1(RF) = 1, u1(RS) = 3, u1(UF) = 0, u1(US) = 4, u1(O) = 2,
  u2(RF) = 1, u2(RS) = 3, u2(UF) = 2, u2(US) = 3, u2(O) = 5.

The strategic form representation of the entry game is (rows: player 1, columns: player 2):

        F       S
R     1, 1    3, 3
U     0, 2    4, 3
O     2, 5    2, 5

There are 2 pure strategy Nash equilibria: (U, S) and (O, F). Only (U, S) is reasonable; (O, F) relies on a noncredible threat. Both equilibria are subgame perfect, because there are no proper subgames in the extensive form representation.

Perfect Bayesian Equilibrium

To account for the fact that only some subgame perfect equilibria are reasonable, we introduce the solution concept of Perfect Bayesian Equilibrium.
1. We separate strategies from beliefs. Strategies refer to the choice of a player at any information set. A belief is a probability distribution over the histories that belong to the same information set.
2. We require beliefs to be "consistent" with strategies.

Definition. A behavioral strategy of player i in an extensive game is a function that assigns to each of i's information sets Ii a probability distribution over the actions in A(Ii), with the distributions at different information sets independent of each other.

Definition. A belief system in an extensive game is a collection of probability distributions, one for each information set, over the histories in that information set.
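
As a check on the strategic form of the entry game given above, the following minimal Python sketch enumerates its pure-strategy Nash equilibria by testing every unilateral deviation. The payoffs are those stated in the lecture; the names `PAYOFFS`, `ROWS`, `COLS`, and `pure_nash_equilibria` are illustrative assumptions, not part of the lecture.

```python
from itertools import product

# Payoffs (u1, u2) of the entry game in strategic form, as stated in the lecture.
PAYOFFS = {
    ("R", "F"): (1, 1), ("R", "S"): (3, 3),
    ("U", "F"): (0, 2), ("U", "S"): (4, 3),
    ("O", "F"): (2, 5), ("O", "S"): (2, 5),
}
ROWS = ["R", "U", "O"]  # pure strategies of player 1
COLS = ["F", "S"]       # pure strategies of player 2


def pure_nash_equilibria(payoffs, rows, cols):
    """Return all pure-strategy profiles from which no player gains by deviating alone."""
    equilibria = []
    for r, c in product(rows, cols):
        u1, u2 = payoffs[(r, c)]
        # Player 1 cannot do better with another row, holding the column fixed.
        row_is_best = all(payoffs[(r_alt, c)][0] <= u1 for r_alt in rows)
        # Player 2 cannot do better with another column, holding the row fixed.
        col_is_best = all(payoffs[(r, c_alt)][1] <= u2 for c_alt in cols)
        if row_is_best and col_is_best:
            equilibria.append((r, c))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria(PAYOFFS, ROWS, COLS))
```

The check uses weak inequalities, so (O, F) qualifies even though player 2 is indifferent between F and S after O. This is exactly why the Nash concept alone cannot rule out the noncredible threat.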

Definition. An assessment is a pair consisting of a profile of behavioral strategies and a belief system.

Definition. An assessment is a Perfect Bayesian Equilibrium if:
1. Each player's strategy is optimal whenever she has to move, given her belief and the other players' strategies.
2. Each player's belief is consistent with the strategy profile, according to Bayes' rule.

Consistency requires that, at every information set reached with positive probability, the belief assigned to each history equals the probability that the history is reached under the players' strategies, conditional on the information set being reached.

In the entry game, the Nash equilibrium (U, S) is a Perfect Bayesian Equilibrium, with the belief that assigns probability 1 to the history U at the information set {U, R}.

The Nash equilibrium (O, F) is not a Perfect Bayesian Equilibrium, because the strategy F is not optimal, regardless of player 2's beliefs at the information set {U, R}: S yields her 3 after either history, while F yields 1 after R and 2 after U.

Spence Signalling Model

There is one student and two firms. The student's skill is H or L < H, and it is her private information.

First, the student chooses her amount e of education. Then, the firms observe e, and offer wages w1 and w2. Finally, the student chooses one of the wage offers; call the accepted wage w*.

The firms believe that the student's skill is H with probability p.

The cost of studying is e/K for a student of skill K. (Studying is less costly for a high-skill student.)

The student's payoff is w* - e/K. Each firm i's payoff is K - wi if it hires a student of skill K, and zero otherwise.

Perfect Bayesian Equilibrium

There are many Perfect Bayesian Equilibria. In one PBE, the high-skill type chooses a positive amount of education, e*. Consider the following assessment.

Student's strategy. Type H chooses e = e* and type L chooses e = 0. After observing the firms' wage offers, both types choose a highest offer.

Firms' belief. Each firm believes that a student is type H if she chooses e* and type L otherwise.

Firms' strategies. Each firm offers the wage H to a student who chooses e* and the wage L to a student who chooses any other value of e.

The firms' beliefs are consistent with the student's strategy. (No student chooses an education level e other than e* and 0, so consistency imposes no restriction on the firms' beliefs after observing any such e.)

Now, consider sequential rationality.

Student. The strategy of accepting a highest offer at the end of the game is clearly optimal. Now consider the student's education choice.

A type H student obtains the payoff H - e*/H if she follows her strategy, choosing the education level e*, and the payoff L - e/H if she deviates to any other level e. The best such deviation is e = 0, which yields L. So, for the assessment to be a PBE, one needs H - e*/H > L, or e* < H(H - L).

A type L student obtains the payoff L if she follows her strategy and chooses the education level 0. If she deviates to any e other than e*, she obtains the payoff L - e/L. If she deviates to e = e*, then the firms believe her ability to be H, and she obtains the payoff H - e*/L. So, for the assessment to be a PBE, one needs L > H - e*/L, i.e., e* > L(H - L).

Firms. Each firm's payoff is 0, given its belief and its strategy. If it raises the wage it offers in response to any value of e, its expected profit is negative, given its belief. If it lowers the wage, its expected profit remains zero (its offer is not accepted).

In summary, the assessment is a Perfect Bayesian Equilibrium if and only if L(H - L) < e* < H(H - L).

There are other Perfect Bayesian Equilibria. For example, consider the following assessment.

Student's strategy. Both types H and L choose e = e'. After observing the firms' wage offers, both types choose a highest offer.

Firms' belief. Each firm believes that a student is type H with probability p if she chooses e = e'. Otherwise, it believes that the student is of type L.

Firms' strategies.
Each firm offers the wage w = pH + (1-p)L to a student who chooses e = e', and the wage L to a student who chooses any e ≠ e'.

The firms' beliefs are consistent with the student's strategies. The student's wage acceptance strategy is optimal.

The type H student's strategy of choosing e = e' is also optimal, because it gives her the highest wage, as long as pH + (1-p)L - e'/H > L.

The type L student's strategy of choosing e = e' is also optimal, because it gives her the highest wage, as long as pH + (1-p)L - e'/L > L.

The two conditions read e' < pH(H - L) and e' < pL(H - L). Since L < H, the second is the tighter one. In sum, it is required that e' < pL(H - L).

It can be argued that these "pooling" PBE are less plausible than the "separating" PBE seen before. The pooling PBE is fragile because there are deviations e* ≠ e' that only high-skill students would choose, if the firms believed that only high-skill students chose e*.

Crawford and Sobel Cheap Talk

An expert privately observes a state $t$ in $[0, 1]$. The expert sends a report $r$ (a real number) to a decision maker (DM). After receiving the report, the decision maker chooses an action $y$ (also a real number).

The DM knows that $t$ is uniformly distributed on $[0, 1]$, i.e., the p.d.f. $f$ is such that $f(t) = 1$ for $t$ in $[0, 1]$.

The players' payoffs are independent of $r$. (I.e., the report is cheap talk.) Both players care that the action matches the state, but the expert is biased relative to the DM.

The expert's payoff is $u_E(y, t) = -(y - (t + b))^2$. (Her optimal action given the state $t$ is $y = t + b$.)

The DM's payoff is $u_{DM}(y, t) = -(y - t)^2$. (Her optimal action given the state $t$ is $y = t$.)

Perfect Communication

We first show that there cannot be any PBE in which the expert accurately reports the state. (I.e., an invertible strategy $r$ cannot be part of a PBE.)

By contradiction, say that $r$ is invertible and part of a PBE. Then, consistency implies that the DM believes (correctly) that the state is $t$ when the report is $r = r(t)$. Hence, upon receiving report $r = r(t)$, the DM chooses the action $y = t$.
Anticipating this, when the state is $t$, the expert deviates from the strategy $r$. Instead of sending the report $r = r(t)$, she sends the report $r' = r(t + b)$, which induces the action $y = t + b$. Hence, $r$ is not sequentially rational, and cannot be part of a PBE.

Perfect Bayesian Equilibrium

By the above result, every Perfect Bayesian Equilibrium is described by a partition of the state space $[0, 1]$. Each partition is composed of $K$ intervals, where the admissible set of $K$ depends on the bias level $b$. The $K$ intervals are $[t_0, t_1), [t_1, t_2), \ldots, [t_{K-1}, t_K]$, where $t_0 = 0$ and $t_K = 1$.

For any $k = 1, \ldots, K$, all expert types $t$ in $[t_{k-1}, t_k)$ send the same report $r_k$. Hence, upon receiving report $r_k$, the DM knows that $t$ is in $[t_{k-1}, t_k)$, and chooses $y$ so as to maximize

$$-\int_{t_{k-1}}^{t_k} (y - t)^2 \, dt.$$

Taking the first-order condition,

$$0 = -2 \int_{t_{k-1}}^{t_k} (y - t) \, dt = -2 \, (t_k - t_{k-1}) \left[ y - \frac{t_{k-1} + t_k}{2} \right].$$

So, upon receiving report $r_k$, the DM chooses the action $y = (t_{k-1} + t_k)/2$.

In a PBE, the expert anticipates that by reporting $r = r_k$, she induces the action $y = (t_{k-1} + t_k)/2$. For any $k = 1, \ldots, K-1$, an expert of type $t_k$ must be indifferent between sending report $r_k$ and report $r_{k+1}$. I.e.,

$$-\left( \frac{t_{k-1} + t_k}{2} - (t_k + b) \right)^2 = -\left( \frac{t_k + t_{k+1}}{2} - (t_k + b) \right)^2.$$

Or, $t_{k+1} - t_k = t_k - t_{k-1} + 4b$.

If these conditions hold, then, for any $k = 1, \ldots, K$, every expert type $t$ in $[t_{k-1}, t_k)$ prefers sending report $r = r_k$ to sending any other report.

The conditions give a second-order difference equation, solved with the boundary conditions $t_0 = 0$ and $t_K = 1$. The interval $[t_{k-1}, t_k)$ has length $t_k - t_{k-1} = t_1 + (k-1)4b$. The lengths of the intervals $[t_{k-1}, t_k)$, for $k = 1, \ldots, K$, must sum to 1:

$$t_1 + (t_1 + 4b) + \cdots + [t_1 + (K-1)4b] = 1.$$

Or, $K t_1 + 4b \, [1 + \cdots + (K-1)] = 1$. I.e., $K t_1 + 2bK(K-1) = 1$.

So, if $b$ is small enough that $2bK(K-1) < 1$, then there is a positive value of $t_1$ that satisfies the condition, and hence a PBE with $K$ intervals. The larger $b$ is, the smaller is the maximum number of intervals $K^*$ that can be sustained in a PBE.

For $b > 1/4$, $K^* = 1$.
For $1/12 < b < 1/4$, $K^* = 2$, etc.

We conclude by ranking the PBE in the ex-ante sense (i.e., before the state $t$ is realized).

The sender's ex-ante equilibrium payoff is:

$$-\sum_{k=1}^{K} \Pr\{t \in [t_{k-1}, t_k)\} \int_{t_{k-1}}^{t_k} \left[ \frac{t_{k-1} + t_k}{2} - (t + b) \right]^2 dt \Big/ \Pr\{t \in [t_{k-1}, t_k)\}.$$

The decision maker's ex-ante equilibrium payoff is:

$$-\sum_{k=1}^{K} \Pr\{t \in [t_{k-1}, t_k)\} \int_{t_{k-1}}^{t_k} \left[ \frac{t_{k-1} + t_k}{2} - t \right]^2 dt \Big/ \Pr\{t \in [t_{k-1}, t_k)\}.$$

Expanding the sender's ex-ante equilibrium payoff,

$$-\sum_{k=1}^{K} \int_{t_{k-1}}^{t_k} \left\{ \left[ \frac{t_{k-1} + t_k}{2} - t \right]^2 + b^2 - 2b \left[ \frac{t_{k-1} + t_k}{2} - t \right] \right\} dt = -\sum_{k=1}^{K} \int_{t_{k-1}}^{t_k} \left[ \frac{t_{k-1} + t_k}{2} - t \right]^2 dt - b^2,$$

because the intervals have total length 1 and, for any $k$,

$$\int_{t_{k-1}}^{t_k} \left[ \frac{t_{k-1} + t_k}{2} - t \right] dt = (t_k - t_{k-1}) \left[ \frac{t_{k-1} + t_k}{2} - \frac{t_{k-1} + t_k}{2} \right] = 0.$$

In sum, the sender's and the decision maker's ex-ante equilibrium payoffs differ only by the constant $b^2$. The PBE that maximizes one also maximizes the other.

It can be shown that, for any $b$, the PBE maximizing the ex-ante payoff is the one with $K^*$ intervals. This is because the sum of the loss terms in the ex-ante payoff expression is larger, the coarser the equilibrium partition.

Summary of the Lecture

- Extensive-Form Games with Imperfect Information
- Spence Signalling Game
- Crawford and Sobel Cheap Talk

Preview of the Next Lecture

- Infinitely Repeated Games
- Nash and Subgame-Perfect Equilibrium
- Finitely Repeated Games
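
Returning to the Crawford and Sobel model, the partition equilibria derived in the lecture can be computed numerically. The sketch below is a minimal illustration assuming a bias b > 0 and the uniform prior on [0, 1]; the function names `max_intervals`, `partition`, and `dm_ex_ante_payoff` are hypothetical helpers, not part of the lecture. It uses the lecture's condition K t_1 + 2bK(K-1) = 1 together with the fact that the integral of ((t_{k-1} + t_k)/2 - t)^2 over an interval of length d equals d^3/12.

```python
def max_intervals(b):
    """Largest K such that 2*b*K*(K-1) < 1. Assumes b > 0."""
    k = 1
    while 2 * b * (k + 1) * k < 1:
        k += 1
    return k


def partition(b, k_intervals):
    """Boundaries t_0, ..., t_K of the K-interval equilibrium, or None if none exists."""
    # Solve K*t1 + 2*b*K*(K-1) = 1 for the length t1 of the first interval.
    t1 = (1 - 2 * b * k_intervals * (k_intervals - 1)) / k_intervals
    if t1 <= 0:
        return None
    # Interval j has length t1 + (j-1)*4b, so t_k = k*t1 + 2*b*k*(k-1).
    return [k * t1 + 2 * b * k * (k - 1) for k in range(k_intervals + 1)]


def dm_ex_ante_payoff(bounds):
    """Decision maker's ex-ante payoff: minus the sum of (interval length)^3 / 12."""
    return -sum((hi - lo) ** 3 for lo, hi in zip(bounds, bounds[1:])) / 12


if __name__ == "__main__":
    b = 0.1  # illustrative bias level
    for k in range(1, max_intervals(b) + 1):
        bounds = partition(b, k)
        payoff_dm = dm_ex_ante_payoff(bounds)
        # The sender's ex-ante payoff is the DM's payoff minus b**2.
        print(k, bounds, payoff_dm, payoff_dm - b ** 2)
```

Running the loop for a given bias lists every equilibrium partition together with both ex-ante payoffs, which makes it easy to see that the finest partition gives both players the highest payoff.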