SCIT1003 Chapter 3: Prisoner’s Dilemma
Non-Zero Sum Game
Prof. Tsang

Zero-Sum Games
• The sum of the payoffs remains constant during the course of the game.
• Two sides in conflict, e.g. chess, sports
• Being well informed always helps a player

Example of zero-sum game: Matching Pennies
[Figure: Matching Pennies payoff matrix; the players are the matcher and the mis-matcher]

Rock-Paper-Scissors

Military game: attack the easy or hard pass?
                                Attacker
                          Easy pass     Hard pass
Defense side  Easy pass   0.5 , 0.5     0.4 , 0.6
              Hard pass   0 , 1         1 , 0
Payoff is the winning probability.

Games of Conflict
• Two sides competing against each other
• Characteristics of zero-sum games: your loss is my gain
• Simultaneous moves: lack of information about the opponent’s move
• Logical circle of reasoning: I think that he thinks that I think that …

Zero-sum game matrices are sometimes expressed with only one number in each box, in which case each entry is interpreted as a gain for the row player and a loss for the column player.
[Figure: single-number payoff matrix for Player A and Player B]

Non-Zero Sum Game: Prisoner’s Dilemma
• A zero-sum game is one in which the players' interests are in direct conflict, e.g. in football, one team wins and the other loses; payoffs sum to zero.
• A game is non-zero-sum if the players' interests are not always in direct conflict, so that there are opportunities for both to gain, e.g. games in economics.
• For example, both players gain when both choose Don't Confess in the Prisoners' Dilemma.
• Most games in reality have aspects of common interest as well as conflict.

Prisoners’ Dilemma: payoff matrix
                                  Player 2
                            Confess      Don’t Confess
Player 1  Confess           -5 , -5      0 , -10
          Don’t Confess     -10 , 0      -3 , -3

Imperfect Information
• Partial or no information about the opponent is available to the player before making a decision, e.g. Prisoner’s Dilemma.
• Imperfect information may diminish over time if the same game is played repeatedly with the same opponent.
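The difference between the zero-sum examples and the Prisoners’ Dilemma above can be checked mechanically. The sketch below is only an illustration: the function name `is_constant_sum` and the nested-list layout are assumptions made here, and the numbers are copied from the military game and the Prisoners’ Dilemma matrix in these slides. It tests the slide's first criterion, that the sum of the payoffs stays constant in every cell.

```python
# Minimal sketch: a two-player game is stored as a grid of (payoff_1, payoff_2) cells.
# The function name and data layout are illustrative assumptions, not part of the slides.

def is_constant_sum(game, tol=1e-9):
    """Return True if both players' payoffs add up to the same total in every cell."""
    sums = [a + b for row in game for (a, b) in row]
    return max(sums) - min(sums) <= tol

# Military game (payoffs are winning probabilities, so each cell sums to 1).
military = [
    [(0.5, 0.5), (0.4, 0.6)],
    [(0, 1), (1, 0)],
]

# Prisoners' Dilemma: rows and columns are Confess, Don't Confess.
prisoners = [
    [(-5, -5), (0, -10)],
    [(-10, 0), (-3, -3)],
]

print(is_constant_sum(military))   # True
print(is_constant_sum(prisoners))  # False
```

In the Prisoners’ Dilemma the cell totals differ (-10 in three cells, -6 when both choose Don't Confess), which is exactly the room for mutual gain that a zero-sum game lacks.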

Games of Co-operation
Players may improve payoff through
• communicating
• forming binding coalitions & agreements
These options do not apply to zero-sum games.
Prisoner’s Dilemma with Cooperation

Strategies
• A strategy is a “complete plan of action” that fully determines the player's behavior: a decision rule or set of instructions about which actions a player should take following all possible histories up to that stage.
• The strategy concept is sometimes (wrongly) confused with that of a move. A move is an action taken by a player at some point during the play of a game (e.g., in chess, moving white's Bishop a2 to b3).
• A strategy, on the other hand, is a complete algorithm for playing the game, telling a player what to do in every possible situation throughout the game.

Dominant or dominated strategy
• A strategy S for a player A is dominant if it is always the best strategy for player A, no matter what strategies the other players take.
• A strategy S for a player A is dominated if at least one other strategy is better than it, no matter what strategies the other players take.

Rule: If you have a dominant strategy, use it!
Use strategy 1

Dominance Solvable
COMMANDMENT: If you have a dominant strategy, use it. Expect your opponent to use his/her dominant strategy if he/she has one.
• If each player has a dominant strategy, the game is dominance solvable.

Only one player has a Dominant Strategy
                 The Economist
                 G             S
Time   S         100 , 100     0 , 90
       G         95 , 100      95 , 90
• For The Economist:
  – G dominant, S dominated
• Dominated Strategy:
  – There exists another strategy which always does better regardless of opponents’ actions

How to recognize a Dominant Strategy
To determine if the row player has any dominant strategy:
1. Underline the maximum payoff in each column.
2. If the underlined numbers all appear in one row, then that row is the dominant strategy for the row player.
No dominant strategy for the row player in this example.
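The underlining procedure above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the helper names `dominant_rows` and `transpose` are invented here, the payoffs are taken from the Time / The Economist table (rows S and G for Time, columns G and S for The Economist), and ties are underlined too, so the check reports weakly dominant strategies as well as strict ones.

```python
# Minimal sketch of the "underline the maximum in each column" test.
# Helper names are illustrative assumptions, not from the slides.

def dominant_rows(payoffs):
    """payoffs[r][c] is the deciding player's payoff for own strategy r, opponent strategy c.
    Returns the indices of rows that hold a column maximum in every column."""
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    underlined = [set() for _ in range(n_rows)]
    for c in range(n_cols):
        best = max(payoffs[r][c] for r in range(n_rows))
        for r in range(n_rows):
            if payoffs[r][c] == best:      # ties are underlined as well
                underlined[r].add(c)
    return [r for r in range(n_rows) if len(underlined[r]) == n_cols]

def transpose(matrix):
    """Swap rows and columns so the column player can be tested with dominant_rows."""
    return [list(col) for col in zip(*matrix)]

# Rows: Time plays S, G. Columns: The Economist plays G, S.
time_payoffs = [[100, 0], [95, 95]]
economist_payoffs = [[100, 90], [100, 90]]

print(dominant_rows(time_payoffs))                  # [] : no dominant strategy for Time
print(dominant_rows(transpose(economist_payoffs)))  # [0] : column G is dominant for The Economist
```

Transposing the column player's payoffs lets the same function serve both players, which mirrors how the two slide procedures differ only in swapping "row" and "column".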

How to recognize a Dominant Strategy
To determine if the column player has any dominant strategy:
1. Underline the maximum payoff in each row.
2. If the underlined numbers all appear in one column, then that column is the dominant strategy for the column player.
There is a dominant strategy for the column player in this example.

If there is no dominant strategy
• Does any player have a dominant strategy?
• If there is none, ask “Does any player have a dominated strategy?”
• If yes, then
  – Eliminate the dominated strategies
  – Reduce the normal-form game
  – Iterate the above procedure

Eliminate any dominated strategy
Eliminate strategy 2 as it’s dominated by strategy 1

Successive Elimination of Dominated Strategies
• If a strategy is dominated, eliminate it
• The size and complexity of the game is reduced
• Eliminate any dominated strategies from the reduced game
• Continue doing so successively

Example: Two competing Bars
• Two bars (bar 1, bar 2) compete with each other
• Each can charge a price of $2, $4, or $5 for a drink
• 6000 tourists pick a bar randomly
• 4000 natives select the lowest-price bar
No dominant strategy for either player.
                    Bar 2
               $2          $4          $5
Bar 1   $2     10 , 10     14 , 12     14 , 15
        $4     12 , 14     20 , 20     28 , 15
        $5     15 , 14     15 , 28     25 , 25

Successive Elimination of Dominated Strategies
                    Bar 2
               $2          $4          $5
Bar 1   $2     10 , 10     14 , 12     14 , 15
        $4     12 , 14     20 , 20     28 , 15
        $5     15 , 14     15 , 28     25 , 25

Reduced game ($2 eliminated for both bars):
                    Bar 2
               $4          $5
Bar 1   $4     20 , 20     28 , 15
        $5     15 , 28     25 , 25

An example of Successive Elimination of strictly dominated strategies, or the process of iterated dominance

Equilibrium
• The interaction of all players' strategies results in an outcome that we call "equilibrium."
• Traditional applications of game theory attempt to find equilibria in games.
• In an equilibrium, each player is playing the strategy that is a "best response" to the strategies of the other players. No one is likely to change his strategy given the strategic choices of the others.
• Equilibrium is not:
  – The best possible outcome. Equilibrium in the one-shot prisoners' dilemma is for both players to confess.
  – A situation where players always choose the same action. Sometimes equilibrium will involve changing action choices (known as a mixed strategy equilibrium).

Definition: Nash Equilibrium
“If there is a set of strategies with the property that no player can benefit by changing his/her strategy while the other players keep their strategies unchanged, then that set of strategies and the corresponding payoffs constitute the Nash Equilibrium.”
Source: http://www.lebow.drexel.edu/economics/mccain/game/game.html

Nash equilibrium
• If each player has chosen a strategy and no player can benefit by changing his/her strategy while the other players keep theirs unchanged, then the current set of strategy choices and the corresponding payoffs constitute a Nash equilibrium.

Conditions for Nash equilibrium
• Each player is choosing a best response to what he believes the other players will do.
• Each player’s beliefs are correct. The other players are all doing what everyone else thinks they are doing.
Assumptions: Rational players
“Putting yourself in the other person’s shoes”

Example: B B Lean vs Rainbow’s End (p.105)

Games with infinitely many strategies (p.124)
[Figure: axis label "Rainbow’s End’s price"]

(U, L) is not a Nash equilibrium because Player 2 can gain by deviating alone to R; … You can cross out those that are not Nash equilibria.
Finding Nash equilibria: (a) with strike-outs; (b) with underlinings

Sometimes there is no Nash equilibrium in pure strategies
         L          C          R
U        0 , 2      2 , 3      4 , 1
M        1 , 1      3 , 1      0 , 2
D        0 , 3      1 , 0      5 , 1

Sometimes there is more than one Nash equilibrium.
No strictly dominant strategies and no strictly dominated strategies.
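The Nash equilibrium definition above translates directly into a cell-by-cell check: a cell is a pure-strategy equilibrium when neither player can gain by deviating alone. The sketch below is an illustration only; the function name `pure_nash_equilibria` and the data layout are assumptions made here. It is applied to the U/M/D by L/C/R matrix above and to the Prisoners’ Dilemma payoffs from earlier in the chapter (rows and columns ordered Confess, Don't Confess).

```python
# Minimal sketch: find pure-strategy Nash equilibria of a two-player game.
# game[r][c] = (row player's payoff, column player's payoff). Names are illustrative.

def pure_nash_equilibria(game):
    """Return all (row, column) cells from which no player gains by deviating alone."""
    n_rows, n_cols = len(game), len(game[0])
    equilibria = []
    for r in range(n_rows):
        for c in range(n_cols):
            row_pay, col_pay = game[r][c]
            # Row player compares against every other row in the same column.
            row_is_best = all(game[r2][c][0] <= row_pay for r2 in range(n_rows))
            # Column player compares against every other column in the same row.
            col_is_best = all(game[r][c2][1] <= col_pay for c2 in range(n_cols))
            if row_is_best and col_is_best:
                equilibria.append((r, c))
    return equilibria

# The 3x3 example: rows U, M, D; columns L, C, R.
three_by_three = [
    [(0, 2), (2, 3), (4, 1)],
    [(1, 1), (3, 1), (0, 2)],
    [(0, 3), (1, 0), (5, 1)],
]

# Prisoners' Dilemma: index 0 = Confess, index 1 = Don't Confess.
prisoners = [
    [(-5, -5), (0, -10)],
    [(-10, 0), (-3, -3)],
]

print(pure_nash_equilibria(three_by_three))  # []
print(pure_nash_equilibria(prisoners))       # [(0, 0)] : both confess
```

The check only covers pure strategies; it says nothing about mixed-strategy equilibria, which the 3x3 example would need.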

Hunting game: multiple Nash equilibria

Hunting game with 2 Nash equilibria
Players: Fred and Barney; each chooses Stag or Bison.
[Payoff matrix: if both make the same choice, one player gets 4 and the other 3; if their choices differ, both get 0]

Battle of sexes
Players: Alice and Barney; each chooses Ballet or Football.
[Payoff matrix: if both make the same choice, one player gets 4 and the other 3; if their choices differ, both get 0]

Prisoner’s Dilemma: finding the Nash equilibrium
Which is a Nash Equilibrium?

Prisoner’s Dilemma: Applications
• Relevant to:
  – Nuclear arms races.
  – Dispute resolution and the decision to hire a lawyer.
  – Corruption/political contributions between contractors and politicians.
• How do players escape this dilemma?
  – Play repeatedly
  – Find a way to ‘guarantee’ cooperation
  – Change payoff structure

Nuclear arms races: prisoner's dilemma in disguise
Two countries try to decide whether to build nuclear bombs.
Is there a Nash Equilibrium?

Cigarette Advertising: prisoner's dilemma in disguise
Two companies try to decide whether to run cigarette advertisements.
                      Philip Morris
                      No Ad        Ad
Reynolds   No Ad      50 , 50      20 , 60
           Ad         60 , 20      30 , 30

Price it higher? Lower? prisoner's dilemma in disguise (p.69)
                      Rainbow’s
                      80              70
B B Lean   80         72k , 72k       24k , 110k
           70         110k , 24k      70k , 70k

OPEC: example of a cartel
• Collaborating in price-fixing guarantees higher profit to all industrial producers at the expense of consumers.
• OPEC is the most notable cartel of petroleum-producing countries.
• Cheating is a problem in price collusion.
• Governments in many nations have enacted anti-trust laws (Sherman Act in the US) to protect consumers.
(p.90-94)

Fight together or run away? prisoner's dilemma in disguise
                      Your friend
                      Fight          Run away
You   Fight           0.8 , 0.8      0 , 1
      Run away        1 , 0          0.1 , 0.1

Collaborative behavior in biology
• Hunting packs of wolves
• Sharing within colonies of vampire bats in Costa Rica (p.98)
• Ants
• Bees
• Humans

Sustainability of resources sharing
• Most people view community resource sharing as a form of cooperative game similar to the Prisoner’s Dilemma.
• However, over-exploitation & cheating in conservation efforts lead to depletion of resources. The consequences run much deeper than the simple (& superficial) payoff matrix would suggest.
(p.94-97)

Tragedy of the Commons
• A situation in which individuals, acting independently & rationally, deplete a shared common resource even when doing so is not in their best interest.
• An example to explain overuse of shared resources.
• Extends the Prisoner’s Dilemma to more than two players.
• Each member of a group of neighboring farmers prefers to let his cow/sheep graze on the commons rather than keep it on his own inadequate land, but the commons will be rendered unsuitable for grazing if it is overgrazed.

In the beginning, there is a nice piece of grassland owned by all villagers. “What a waste!” said a farmer.

Happy farmers with their well-fed cows.

“Why not have more cows? Why waste the resource?” said the farmers.

In the end, sad farmers with their hungry cows.

Tragedy of the Commons: an apparent payoff matrix at the start
                      Your neighbor
                      Add a cow     Don’t add
You   Add a cow       2 , 2         2 , 1
      Don’t add       1 , 2         1 , 1
As long as the common pasture is not overgrazed, adding one more cow is the dominant strategy for everybody.

Tragedy of the Commons: an apparent payoff matrix in between
                      Your neighbor
                      Add a cow     Don’t add
You   Add a cow       1.5 , 1.5     1.8 , 0.9
      Don’t add       0.9 , 1.8     1 , 1
When the common pasture is starting to be overgrazed, adding a cow is still the dominant strategy for everybody, but the return is getting worse.

Tragedy of the Commons: the final form of payoff matrix
                      Your neighbor
                      Add a cow     Don’t add
You   Add a cow       0.5 , 0.5     1.1 , 0.7
      Don’t add       0.7 , 1.1     1 , 1
Finally, when the common pasture is overgrazed, adding a cow is no longer the dominant strategy for everybody.

Tragedy of the Commons
• Problem: cost of maintenance is ignored/externalized
  – Farmers don’t adequately pay for their impact.
  – Resources are overused due to inaccurate estimates of cost.
• Relevant to:
  – Health or other social benefits
  – Environmental laws, overfishing, whaling, pollution, etc.
  – Global warming

Environmental policy: Tragedy of the Commons in disguise
                           Factory B
                           Pollution     No pollution
Factory A   Pollution      50 , 50       60 , 20
            No pollution   20 , 60       20 , 20
Two factories producing the same chemical can choose to pollute (lower production cost) or not to pollute (higher production cost).

Global warming: Tragedy of the Commons in disguise
Emissions reduced?
                      Country B
                      No           Yes
Country A   No        50 , 50      60 , 20
            Yes       20 , 60      20 , 20
Two countries producing CO2 can choose to reduce (higher production cost) or not to reduce (lower production cost).

Another Example: Big & Little Pigs
Cost to press button = 2 units
When button is pressed, food given = 10 units

Decisions, decisions...
What’s the best strategy for the little pig? Does he have a dominant strategy?
Does the big pig have a dominant strategy?

Research in industries: Big & Little Pigs in disguise
                             Small Company
                             Research     No research
Big Company   Research       5 , 1        4 , 4
              No research    9 , -1       0 , 0

Summary: Ch. 3
• A strategy is a “complete plan of action” that fully determines the player's behavior.
• A dominant strategy is the best strategy no matter what strategies the other players take.
• A dominated strategy is one for which at least one other strategy is better, no matter what strategies the other players take.
• If you have a dominant strategy, use it!
• Eliminate any dominated strategy

Assignment 3.1
N-person Investment game
• In a group of N>>2 persons
• If you invest $0, you get $0 in return
• If you invest $10, you get
  – $5 return, if more than 60% of the group invest
  – -$10 return, if less than 60% of the group invest
• Is there any Nash equilibrium?
• Answer: 2
• What are they? Why?

Example in real life
• Which technology to choose?
  – Windows or Apple
  – Betamax or VHS (video format in 1980s)
• What else??
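
As a recap of the summary rule "Eliminate any dominated strategy", the successive-elimination procedure from the two competing bars example can be sketched in code. This is a hedged illustration rather than the course's own method: the function name `eliminate_dominated` is invented here, only strict dominance by another pure strategy is tested, and the payoffs are copied as printed in the bars table (Bar 1 chooses the row, Bar 2 the column).

```python
# Minimal sketch of successive elimination of strictly dominated strategies.
# game[r][c] = (row player's payoff, column player's payoff). Names are illustrative.

def eliminate_dominated(game, row_labels, col_labels):
    """Repeatedly drop strictly dominated rows and columns; return the surviving labels."""
    rows = list(range(len(game)))
    cols = list(range(len(game[0])))
    changed = True
    while changed:
        changed = False
        # Row r is dominated if some other surviving row pays more in every surviving column.
        for r in rows:
            if any(all(game[o][c][0] > game[r][c][0] for c in cols)
                   for o in rows if o != r):
                rows.remove(r)
                changed = True
                break
        # Column c is dominated if some other surviving column pays more in every surviving row.
        for c in cols:
            if any(all(game[r][o][1] > game[r][c][1] for r in rows)
                   for o in cols if o != c):
                cols.remove(c)
                changed = True
                break
    return [row_labels[r] for r in rows], [col_labels[c] for c in cols]

prices = ["$2", "$4", "$5"]
bars = [
    [(10, 10), (14, 12), (14, 15)],
    [(12, 14), (20, 20), (28, 15)],
    [(15, 14), (15, 28), (25, 25)],
]

print(eliminate_dominated(bars, prices, prices))  # (['$4'], ['$4'])
```

The first pass removes $2 for both bars, which gives the reduced $4/$5 game shown in the slides; in that reduced game $5 is dominated by $4, leaving both bars charging $4.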