* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Chap02 - Nash Equilibrium theory
Survey
Document related concepts
Transcript
Chapter 2 Nash Equilibrium: Theory Strategic or Simultaneous-move Games Definition: A simultaneous-move game consists of: • A set of players • For each player, a set of actions • For each player, a preference relation over the set of action profiles. Or • For each player, a payoff function over the set of action profiles. Example I: The Prisoner’s Dilemma • Two suspects (Bonnie and Clyde) in a crime, held in separate cells. • Enough evidence to convict each on minor, but not major offense unless one confesses. • Each can Remain Silent or can Confess. • Both remain silent: each convicted of minor offense—1 year in prison. • One and only one confesses: one who confesses is used as a witness against the other, and goes free; other gets 4 years. • Both confess: each gets 3 years in prison. Example I: The Prisoner’s Dilemma Strategic game: two players; each player’s actions {RS, C}; preference orderings • (C,RS) >1 (RS,RS) >1 (C, C) >1 (RS, C) • (RS, C) >2 (RS,RS) >2 (C, C) >2 (C,RS). Example I: The Prisoner’s Dilemma Bonnie Clyde - This RS C RS 1, 1 4, 0 C 0, 4 3, 3 game models a situation in which there are gains from cooperation, but each player prefers to be a “free rider” - Note that payoffs are “years in prison” so less is clearly better. This is the only game we will see where lower payoffs are better; an alternative representation of the game will write the game with negative payoffs (so less negative payoffs are higher, and thus more preferred). Example II: Working on a joint project • Each of two people can either work hard or goof off. • Each person prefers to goof off if the other works hard—project would be better if worked hard, but not worth the extra effort. Example II: Working on a joint project Person 2 Work hard Goof off Work hard 2, 2 0, 3 Goof off 3, 0 1, 1 Person 1 Example III: Time vs. Newsweek • Two major news stories: – There is an impasse between the House and the Senate on the budget – A new drug is claimed to be effective against AIDS • 100 newsstand buyers – 70 interested on the AIDS story – 30 interested on the budget story • If both magazines go for the same story the buyers split equally between them Example III: Time vs. Newsweek Newsweek AIDS Budget AIDS 35, 35 70, 30 Budget 30, 70 15, 15 Time Example IV: Duopoly • Two firms producing same good. • Each firm can charge high price or low price. • Both firms charge high price: each gets profit of $1,000 • One firm charges high price, other low: firm charging high price gets no customers and loses $200, while one charging low price makes profit of $1,200 • Both firms charge low price: each earns profit of $600 Example IV: Duopoly Firm 2 High Low High 1,000, 1,000 -200, 1,200 Low 1,200, -200 600, 600 Firm 1 Example V: Nuclear arms race • Two countries • What matters is relative strength, and bombs are costly, so that for each country: have bombs, other doesn’t > neither country has bombs > both countries have bombs > don’t have bombs, other does • The game is Prisoner’s Dilemma, with RS means don’t build bombs and C means build bombs. Example VI: Battle of the sexes • In the Prisoner’s dilemma the players agree that (RS,RS) is a desirable outcome, though each has an incentive to deviate from this outcome. • In BoS the players disagree about the outcome that is desirable. • Peter and Jane wish to go out together to a concert. The options are U2 or Coldplay. • Their main concern is to go out together, but one person prefers U2 and the other person prefers Coldplay. • If they go to different concerts then each of them is equally unhappy listening to the music of either band. Example VI: Battle of the sexes Jane U2 Coldplay U2 2, 1 0, 0 Coldplay 0, 0 1, 2 Peter Example VII: Matching Pennies • In the Prisoner’s dilemma and BoS there are aspects of both conflict and cooperation. Matching Pennies is purely conflictual. • Each of two people chooses either the Head or Tail of a coin. • If the choices differ, person 1 pays person 2 a dollar. • If they are the same, person 2 pays person 1 a dollar. • Each person cares only about the amount of money that she receives. Example VII: Matching Pennies Player 2 Head Tail Head 1, -1 -1, 1 Tail -1, 1 1, -1 Player 1 - This is an example of a zero-sum game. Example VII: Matching Pennies • In this game the players’ interests are diametrically opposed (such a game is called “strictly competitive”): player 1 prefers to take the same action as the other player, while player 2 prefers to take the opposite action. Strategic or Simultaneous-move Games Common knowledge: means that every player knows: • The list of players • The actions available to each player • The payoffs of each player for all possible action profiles • That each player is a rational maximizer • That each player knows that he is rational, and that he knows that everybody else know that he knows they are rational… Strategic or Simultaneous-move Games • Dominant Strategy: a strategy that is best for a player in a game, regardless of the strategies chosen by the other players. • Rational players do not play strictly dominated strategies and so, once you determine a strategy is dominated by another, simply remove it from the game. Strictly Dominated Strategies A player’s action is “strictly dominated” if it is inferior, no matter what the other players do, to some other action. Definition: In a strategic game player i’s action bi strictly dominates her action b’i if ui(bi, a−i) > ui(b’i, a−i) for every list a−i of the other players’ actions. Iterated Elimination of Strictly Dominated Strategies Bonnie Clyde RS C RS 1, 1 4, 0 C 0, 4 3, 3 For Clyde, “remain silent” is a dominated strategy. So, “remain silent” should be removed from his strategy space. Iterated Elimination of Strictly Dominated Strategies Bonnie Clyde RS C RS 1, 1 4, 0 C 0, 4 3, 3 Given the symmetry of the game, it is easy to see that “remain silent” is a dominated strategy for Bonnie also. So, “remain silent” should be removed from her strategy space, too. Iterated Elimination of Strictly Dominated Strategies Bonnie Clyde RS C RS 1, 1 4, 0 C 0, 4 3, 3 Given the symmetry of the game, it is easy to see that “remain silent” is a dominated strategy for Bonnie also. So, “remain silent” should be removed from her strategy space, too. Iterated Elimination of Strictly Dominated Strategies Player 2 Left Middle Right Up 1,0 1,2 0,1 Down 0,3 0,1 2,0 Player 1 Iterated Elimination of Strictly Dominated Strategies Player 2 Left Middle Right Up 1,0 1,2 0,1 Down 0,3 0,1 2,0 Player 1 Iterated Elimination of Strictly Dominated Strategies Player 2 Left Middle Right Up 1,0 1,2 0,1 Down 0,3 0,1 2,0 Player 1 Iterated Elimination of Strictly Dominated Strategies Player 2 Player 1 Left Middle Right Up 1,0 1,2 0,1 Down 0,3 0,1 2,0 IESDS yields a unique result! Iterated Elimination of Strictly Dominated Strategies Player 1 Top Middle Bottom Left 0,4 4,0 3,5 Player 2 Middle 4,0 0,4 3,5 Right 5,3 5,3 6,6 In many cases, iterated elimination of strictly dominated strategies may not lead to a unique result. Here, we cannot eliminate any strategies. Weakly Dominated Strategies A player’s action “weakly dominates” another action if the first action is at least as good as the second action, no matter what the other players do, and is better than the second action for some actions of the other players. (Also known as pareto dominates.) Definition: In a strategic game player i’s action bi weakly dominates her action b’i if ui(bi, a−i) ≥ ui(b’i, a−i) for every list a−i of the other players’ actions. And ui(bi, a−i) > ui(b’i, a−i) for some list a−i of the other players’ actions. Weakly dominated strategies • We cannot always do iterated elimination if weakly dominated strategies in the same way that we could for strictly dominated strategies • Sometimes the set of actions that remains depends on the order in which we eliminated strategies. Consider eliminating L, T vs R, B. 2 L 1 C R T 1,1 1,1 0,0 B 0,0 1,2 1,2 Strategic or Simultaneous-move Games Definition: an strategy is a complete plan of action (what to do in every contingency). In simultaneous-move games a Pure-Strategy is simply an action. Nash Equilibrium Definition: The strategy profile a* in a strategic game is a Nash equilibrium if, for each player i and every strategy bi of player i, a* is at least as good for player i as the strategy profile (bi, a*−i): ui(ai*, a*−i) ≥ ui(bi, a*−i) for every strategy bi of player i. Nash Equilibrium: Prisoner's Dilemma Bonnie RS C RS 1, 1 4, 0 C 0, 4 3, 3 Clyde Nash Equilibrium: Prisoner's Dilemma Bonnie Clyde RS C RS 1, 1 4, 0 C 0, 4 3, 3 (C, C) is the unique Nash Equilibrium in purestrategies Nash Equilibrium: BoS Jane U2 Coldplay U2 2, 1 0, 0 Coldplay 0, 0 1, 2 Peter Nash Equilibrium: BoS Jane Peter U2 Coldplay U2 2, 1 0, 0 Coldplay 0, 0 1, 2 (U2, U2) and (Coldplay, Coldplay) are both Nash equilibria in pure-strategies. Nash Equilibrium: Matching Pennies Player 2 Head Tail Head 1, -1 -1, 1 Tail -1, 1 1, -1 Player 1 Nash Equilibrium: Matching Pennies Player 2 Head Tail Head 1, -1 -1, 1 Tail -1, 1 1, -1 Player 1 In Matching pennies there are no Nash equilibria in pure-strategies. Strict Nash Equilibrium If each player’s Nash equilibrium strategy is better than all her other strategies then we say that the equilibrium is strict: an strategy profile a* is a strict Nash equilibrium if for every player i we have ui(ai*, a*−i) > ui(bi, a*−i) for every strategy bi of player i. Strict Nash Equilibrium Player 2 Player 1 Left Right Top 1, 1 0, 0 Bottom 0, 0 0, 0 Strict Nash Equilibrium Player 2 Player 1 Left Right Top 1, 1 0, 0 Bottom 0, 0 0, 0 (Top, Left) strict Nash equilibrium (Bottom, Right) Nash equilibrium, but Not strict Nash equilibrium and elimination • All Nash equilibria will survive iterated elimination of strictly dominated strategies. Thus, if multiple Nash equilibria are present, then iterated elimination of strictly dominated strategies will not leave us with a unique strategy for all players. • This is not true of iterated elimination of weakly dominated strategies: we can sometimes eliminate Nash equilibria by eliminating weakly dominated strategies (consider previous slide). • However, we cannot eliminate strict NE through iterated elimination of weakly dominated strategies. Nash Equilibrium Exercise 34.1(Guessing two-thirds of the average) Each of three people announces an integer from 1 to K. If the three integers are different, the person whose integer is closest to 2/3 of the average of the three integers wins $1. If two or more integers are the same, $1 is split equally between the people whose integer is closest to 2/3 of the average. • Is there any integer k such that the strategy profile (k, k, k), in which every person announces the same integer k, is a Nash equilibrium? (if k ≥ 2, what happens if a person announces a smaller number?) • Is any other strategy profile a Nash Equilibrium? (what is the payoff of a person whose number is the highest of the three? Can she increase this payoff by announcing a different number?) Best Response Functions For any list of actions a−i of all the players other than i, let Bi(a−i) be the set of player i’s best actions, given that every other player j chooses aj : Bi(a−i) = {ai in Ai : ui(ai, a−i) ≥ ui(bi, a−i) for all bi in Ai}. Bi is the best response function of player i. Best Response Functions • Bi is a set-valued function (technically, a correspondence, not a function): its values are sets, not points. Every member of the set Bi(a−i) is a best response of player i to a−i: if each of the other players adheres to a−i then player i can do no better than choose a member of Bi(a−i). Bi(a−i) a−i Alternative definition of Nash equilibrium Proposition: The strategy profile a* is a Nash equilibrium of a strategic game if and only if every player’s action is a best response to the other players’ actions: a*i is in Bi(a*−i) for every player i (1) Alternative definition of Nash equilibrium If each player i has a unique best response to each a−i. i.e. if Bi(a−i) = {bi(a−i)} Then (1) above is equivalent to: ai = bi(a−i) for every player i (1’) Which is a collection of n equations in the n unknowns a*i, where n is the number of players in the game. Example: Synergistic Relationship • Two individuals • Each decides how much effort to devote to relationship • Amount of effort is nonnegative real number • If both individuals devote more effort to the relationship, then they are both better off; for any given effort of individual j, the return to individual i’s effort first increases, then decreases. • Specifically, payoff of i: ai(c + aj − ai), where c > 0 is a constant. Example: Synergistic Relationship Payoff to player i: ai(c+aj-ai). First Order Condition: c + aj – 2ai = 0. Second Order Condition: -2 < 0, Max Solve FOC: ai = (c+aj)/2 = bi(aj) Symmetry implies: aj = (c+ai)/2 = bj(ai) Example: Synergistic Relationship Symmetry of quadratic payoff functions implies that the best response of each individual i to aj is bi(aj ) = 1/2 (c + aj ) Nash equilibria: pairs (a*1, a*2) that solve the two equations a1 = 1/2 (c + a2) a2 = 1/2 (c + a1) Unique solution, (c, c) Hence game has a unique Nash equilibrium (a*1, a*2) = (c, c)