Chapter 13 Evolutionary Games
What happens when players no longer ‘choose’ their strategies but play pre-programmed actions?
Adopted with permission from a course by Simon David Halliday

Introduction
• The basic idea behind evolutionary game theory (EGT) is that we can alter how we understand games so that we can do away with the assumption of rationality.
• We look at the types of games we have consistently investigated (PD, Chicken) in order to understand evolutionary dynamics and the ‘hardwiring’ of strategies.
• We will also talk about several of the dynamics of EGT and the interesting conclusions that can be drawn from them.

Framework
• A complex of one or more genes constitutes a genotype.
• The genotype governs specific physical outcomes called the phenotype.
• The success of one phenotype relative to another in a specific environment indicates its fitness.
• The process of selection therefore is the dynamic

• Chance mutations can change the proportions of the phenotypes in a given environment.
• If the mutants are fitter than the phenotypes already present in the population, these mutants will invade the population.
• In general, populations that express a specific phenotype will become more prevalent if they are fitter than other populations.

• A population is evolutionarily stable if it cannot be successfully invaded by any mutants.
• Remember that some populations may be more successful than others in a given context, but others may be successful in another context.
  – Thus we need to be aware of how and why specific phenotypes are successful.
• Many visible outcomes in biology can be seen to be the result of the evolutionary process.

• Using evolutionary theory we can relax the assumptions of rationality!
• Agents are no longer rationally bound by their preferences; instead they are genetically predisposed to act in a specific manner (the phenotypic expression of their genotype).
• Individuals are ‘born’ to play one strategy.
• For humans this embeddedness may be a result not only of genetic programming, but also of cultural and social factors.

• From a population with a heterogeneous distribution of strategies, individuals are randomly selected and paired to interact.
• Those that come out with better payoffs on aggregate are said to have better fitness.
• Those with higher fitness are more able to find a compatible sexual mate and more likely to be able to reproduce.
• Thus fitter ‘strategies’ will become more prevalent in a population over time.

• In socioeconomic games (quite evidently different from biological interactions) the idea of ‘reproduction’ is somewhat different.
• Others observe the fitter strategies and adopt these strategies, or at least do their best to imitate them.
• In this way their reproduction is somewhat like Richard Dawkins’s conceptualisation of a ‘meme’ (rather than a ‘gene’).
• (Remember also that conscious experimentation with new strategies could substitute for mutation.)

• Dawkins comments, “Just as genes propagate themselves in the gene pool by leaping from body to body via sperm or eggs, so memes propagate themselves in the meme pool by leaping from brain to brain via a process which, in the broad sense, can be called imitation.” Richard Dawkins, The Selfish Gene, 1986.

• There are two kinds of configurations in biological games:
  1) Monomorphism: one phenotype has taken over the population because it was fitter than the others, and it dominates. The unique prevailing strategy is the evolutionarily stable strategy.
  2) Polymorphism: there are multiple phenotypes in a population, all reinforcing the presence of the others.

• What then is our idea of equilibrium?
  – A configuration of strategies that is stable in the population is our first idea of equilibrium.
  – This mirrors the idea of an evolutionarily stable strategy configuration (ESS).
  – A strategy is evolutionarily stable if it has higher average payoffs than other strategies.
  – And if it is resistant to invasion by mutants.
  – These are testable conditions and hence allow us to come up with a stable idea of equilibrium.

Prisoner’s Dilemma
• We assume two types of players: Cooperators and Defectors.
• Cooperators always cooperate.
• Defectors always defect.
• The payoff table represents the payoffs that a given type receives when it meets other players.

                                  Column
                        20 (Defect)     26 (Cooperate)
Row   20 (Defect)       (288, 288)      (360, 216)
      26 (Cooperate)    (216, 360)      (324, 324)
Figure 13.1. Prisoner’s Dilemma

• The population consists only of cooperators and defectors.
• Let x be the proportion of cooperators in the population.
• Hence the probability that any cooperator will meet another cooperator is approximately x.
• The probability that any cooperator will meet a defector is (1-x).
• Thus the average payoff for a cooperator is:
  π(C) = 324(x) + 216(1-x)

• The probability that any defector will meet a cooperator is x.
• The probability that any defector will meet another defector is (1-x).
• Thus the average payoff for a defector is:
  π(D) = 360(x) + 288(1-x)
• But π(D) > π(C) always.
• Why? 360(x) + 288(1-x) > 324(x) + 216(1-x) for all x, because the difference π(D) - π(C) = 36x + 72(1-x) = 72 - 36x is positive for every x between 0 and 1.
• Thus defectors will come to dominate the population (in terms of fitness)!

• Is it possible for a mutant cooperator to invade a population of all defectors?
• Sadly not. Even for a very small value of x, the cooperators will always be less fit than the defectors, so they cannot invade.
• A 100% Defector population is an ESS.
• Thus we have a general result: if a game has a strictly dominant strategy in the rational behaviour analysis of the game, then this strategy should constitute an ESS.

The Repeated PD
• Now assume that we repeat the PD like we did in Chapter 11.
• We look at 2, 3 and n repetitions and see that if there is a phenotype for the Tit-for-Tat strategy then this phenotype could dominate the population, depending on the initial population distribution.
• We assume two strategy phenotypes:
  – Defect Always (A)
  – Tit-for-Tat (T) (will cooperate until the opponent defects, then will defect until the opponent cooperates)

                 Column
             A             T
Row   A      (576, 576)    (648, 504)
      T      (504, 648)    (648, 648)
Figure 13.2. Twice Repeated Prisoner’s Dilemma

                 Column
             A             T
Row   A      (864, 864)    (936, 792)
      T      (792, 936)    (972, 972)
Figure 13.3. Thrice Repeated Prisoner’s Dilemma

• So, let’s assume that the players have to interact 3 times whenever they get matched with another player (approximating our thrice-repeated PD).
• If x is the proportion of T types in the population, the chance that any T will meet another T is x.
• The chance it will meet an A is (1-x).
• Thus the payoff for a T is:
  π(T) = 972x + 792(1-x) = 792 + 180x

• Now for the A types, the chance they will meet a T is x.
• The chance they will meet an A is (1-x).
• Thus their payoff is:
  π(A) = 936(x) + 864(1-x) = 864 + 72x
• To see whether one strategy is better than another we must determine the population proportions required to sustain a strategy.

• Thus π(T) > π(A) iff
  792 + 180x > 864 + 72x
  108x > 72
  x > 2/3
• Thus if more than 2/3 of the initial population are Ts then Ts will come to dominate the population.
• If less than 2/3 of the population is made up of Ts then As will become the dominant species.
• The point of 2/3 (about 66.7%) Ts is thus the tipping point for this population, and any point above it lies in the ‘basin of attraction’ of the 100% T equilibrium.
• Similarly any point below 2/3 lies in the ‘basin of attraction’ of the all-A equilibrium.

[Figure 13.4. Fitness Graphs and Equilibria for the Thrice Repeated PD. Vertical axis: Payoff (ticks at 792, 864, 936, 972); horizontal axis: Population Proportion of Ts (x), from 0 to 100%, with a tick at 66.6%; lines labelled T and A; marked point E.]

• What if there are n rounds in the game?
• The players simply get the payoff 288 or 324 in each of the n rounds, adjusted by 72 when a T meets an A (as per the table).
• We go through the same process as previously.

                 Column
             A                         T
Row   A      (288n, 288n)              (288n + 72, 288n - 72)
      T      (288n - 72, 288n + 72)    (324n, 324n)
Figure 13.5. n-Times Repeated Prisoner’s Dilemma

• If the proportion of T types is x then a T will meet a T with probability x and an A with probability (1-x), resulting in:
  π(T) = x(324n) + (1-x)(288n - 72)
• An A meets a T with probability x and an A with probability (1-x), resulting in:
  π(A) = x(288n + 72) + (1-x)(288n)

• Thus the T type is fitter if π(T) > π(A):
  x(324n) + (1-x)(288n - 72) > x(288n + 72) + (1-x)(288n)
  36xn > 72
  x > 2/n
• Thus, for n > 2, there are two monomorphic equilibria: 100% A and 100% T.
• There is an unstable point at x = 2/n.
• An important point is that as the number of expected rounds increases, the initial population proportion of Ts needed to ensure a 100% T equilibrium diminishes.

Evolutionary Chicken
• We assume that players are distributed into the types Wimp and Macho.
• We let x be the proportion of Machos in the population.
• The interesting point about this game is that each strategy is relatively fitter when it is the minority in the population.
• The population will tend towards a polymorphic equilibrium.

                 Column
               Wimp      Macho
Row   Wimp     0, 0      -1, 1
      Macho    1, -1     -2, -2
Figure 13.6. Evolutionary Chicken

• The probability that a Macho will meet a Macho is x, and thus the probability that a Macho will meet a Wimp is (1-x).
  π(M) = -2(x) + 1(1-x) = 1 - 3x
• Similarly, a Wimp meets a Macho with probability x and a Wimp with probability (1-x).
  π(W) = -1(x) + 0(1-x) = -x
• π(M) > π(W) iff
  1 - 3x > -x
  x < 1/2

[Figure 13.7. Evolutionary Chicken. Vertical axis: Payoff; horizontal axis: Population Proportion of Macho (x), from 0 to 100%, with a tick at 0.5; lines labelled Wimp and Macho.]

• This all means that if fewer than half of the population are Machos, the population will gain more Machos until there are 50% Machos and 50% Wimps (and conversely, if more than half are Machos, Wimps are fitter and the Macho share falls).
• This 50/50 split is the stable polymorphic equilibrium (although the population could overshoot and swing around this point, as in our previous example, until it reaches it).
• This is very interesting when we look at the relationship between our mixed strategy NE from the rational behaviour version and this polymorphic ESS.

Evolutionary Assurance
• We can take a basic assurance game and once more state that individuals will adopt specific strategies depending on their genetic (social) predisposition.
• For example, we have two groups: Women and Men.
• Both groups prefer to meet up, but additionally both strictly prefer meeting at one location, V (Vida), rather than the other, S (Seattle).

                 Women
             S         V
Men   S      1, 1      0, 0
      V      0, 0      2, 2
Figure 13.8. Evolutionary Assurance Coffee Matching

• If x is the proportion of S lovers (men or women) then (1-x) is the proportion of V types.
• Thus the expected payoff to any S is:
  π(S) = x(1) + 0(1-x) = x
• And the expected payoff to any V is:
  π(V) = x(0) + 2(1-x) = 2 - 2x
• Thus π(S) > π(V) iff x > 2 - 2x, that is, x > 2/3.
• More than 2/3 (about 66.7%) of the population must be S lovers for S lovers to move to dominance.

[Figure 13.9. Evolutionary Assurance Game. Vertical axis: Payoff; horizontal axis: Population Proportion x of Seattle Lovers, from 0 to 100%, with a tick at 66.6%; lines labelled V and S.]

The Hawk-Dove Game
• Take animals of the same species: some of them are innately aggressive (Hawks), some of them are innately passive (Doves).
• These animals engage with one another when it comes to control and consumption of resources.
• We can also generalize these interactions to predispositions of individuals, populations or countries/governments.

• Hawks always fight for the whole value of the resource (V).
• When two Hawks meet, each is as likely to win and get V as to lose and get -C; each outcome occurs with 50% probability.
• The expected payoff for a Hawk that meets another Hawk is thus (V-C)/2.
• When two Doves meet they share the resource without a fight, getting V/2 each.
• When a Hawk meets a Dove, the Dove retreats and the Hawk gets the whole resource V.

                 Column
               Hawk                  Dove
Row   Hawk     (V-C)/2, (V-C)/2      V, 0
      Dove     0, V                  V/2, V/2
Figure 13.12. The Hawk-Dove Game

• There are 5 different cases for this game:
  1. The Rational Strategic Choice Outcomes
  2. Evolutionary Outcome for V>C
  3. Evolutionary Outcome for V<C
  4. Stable Polymorphism for V<C
  5. Mixing between Hawk-Dove
• We consider each case individually.

H-D: Rational Theory Outcomes
Two possibilities:
  1. V>C: the game is a Prisoner’s Dilemma and players will always choose to play Hawk.
  2. V<C: the game is a game of Chicken and Hawk is no longer dominant. Instead there are two pure strategy NEs, (Hawk; Dove) and (Dove; Hawk), and a mixed strategy NE in which the opponent plays Hawk with the probability p that makes a player indifferent between Hawk and Dove:
     p(V-C)/2 + (1-p)V = p(0) + (1-p)V/2 => p = V/C

H-D: Evolutionary V>C
• We check whether a mutant Dove could invade a population of Hawks.
• Let d be the population proportion of Doves and (1-d) the proportion of Hawks.
• Hawks and Doves alike meet Doves with probability d and Hawks with probability (1-d).
  π(H) = d(V) + (1-d)(V-C)/2
  π(D) = d(V/2) + (1-d)0
• π(H) > π(D) for all d, because V > V/2 and (V-C)/2 > 0 when V>C.
• Thus the population cannot be invaded.

• The same holds true for any initial population distribution with any proportion d of Doves. Hawks will dominate and Doves will go to extinction.
• Moreover a 100% Dove population is vulnerable to invasion by mutant Hawks.
  – The Hawks will get their higher payoffs.
• As in the rational version of the game, the NE is the ESS.

H-D: Evolutionary V<C
• In this case the game is evolutionary chicken.
• Thus we will have a polymorphic equilibrium.
• Once more we take d as the population proportion of Doves and (1-d) as that of the Hawks.
• At the margin, then, a Dove could invade a population made up entirely of Hawks.
• Why? When d is small,
  d(V/2) + (1-d)0 > d(V) + (1-d)(V-C)/2
  because the left side is close to 0 while the right side is close to (V-C)/2, which is negative when V<C.
• The Dove mutants have a higher payoff than the Hawks.

• Conversely, for a population of all Doves, an individual Hawk could also invade.
• Let h be the proportion of Hawks in the population and (1-h) the proportion of Doves.
• Hawks are fitter than Doves when h is small because:
  h(V-C)/2 + (1-h)(V) > (h)0 + (1-h)(V/2)
  for any small h.
• We use this to find the polymorphic equilibrium.

H-D: Evolutionary V<C Polymorphic Equilibrium
• Thus we must find the proportions for this polymorphic equilibrium:
  π(H) = π(D)
  h(V-C)/2 + (1-h)(V) = (h)0 + (1-h)(V/2)
  h(V-C)/2 + (1-h)(V/2) = 0
  V - hC = 0
  h = V/C
• This is the same as the probability with which individuals play Hawk in the mixed strategy of the static/rational version of the game.

H-D: Evolutionary Mixing
• We now assume a 3rd type of strategy (Maynard’s Cuckoo), M, that mixes between Hawk and Dove.
• It plays Hawk with probability p and Dove with probability (1-p).
• We assume that p = V/C.
• When an H meets an M:
  π(H) = p(V-C)/2 + (1-p)V
       = -(1/2)(V/C)(C-V) + (V/C)(C-V)
       = V(C-V)/(2C)

• When a D meets an M:
  π(D) = p(0) + (1-p)(V/2)
       = ((C-V)/C)(V/2)
       = V(C-V)/(2C)
• These two payoffs are equal.
• Let V(C-V)/(2C) = K.
• Now we check whether a mutant Hawk could enter a population entirely of M.

• A mutant Hawk could not invade. Against an M it earns K, exactly what an M earns against another M, so the comparison turns on meetings with the mutant, and there the M payoff will, on average, be better than the mutant Hawk’s.
• Similarly for a mutant Dove. Its payoff is not high enough because when the M meets it as a ‘Dove’ type the M does as well, and when the M meets it as a ‘Hawk’ type the M does strictly better.
• Thus a 100% M population is an ESS!
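
The Hawk-Dove results for V<C can be checked numerically. The following is a minimal Python sketch and is not part of the original slides: the values V = 2 and C = 4, the names payoff and simulate, and the replicator-style update rule inside simulate are all assumptions made here for illustration (the slides do not specify a particular dynamic). It verifies that H, D and M all earn K against M, that M does strictly better against each mutant than the mutant does against itself, and that the Hawk share drifts towards h = V/C.

```python
from fractions import Fraction

# Illustrative values only (not from the slides); any 0 < V < C would do.
V, C = 2, 4

def payoff(me, other, V, C):
    """Expected payoff to a player who plays Hawk with probability `me`
    against an opponent who plays Hawk with probability `other`.
    Pure Hawk is 1, pure Dove is 0, and the M type is p = V/C."""
    hawk_vs_hawk = Fraction(V - C, 2)
    hawk_vs_dove = Fraction(V)
    dove_vs_hawk = Fraction(0)
    dove_vs_dove = Fraction(V, 2)
    return (me * other * hawk_vs_hawk
            + me * (1 - other) * hawk_vs_dove
            + (1 - me) * other * dove_vs_hawk
            + (1 - me) * (1 - other) * dove_vs_dove)

HAWK, DOVE = Fraction(1), Fraction(0)
p = Fraction(V, C)                 # M's probability of playing Hawk
K = Fraction(V * (C - V), 2 * C)   # K = V(C-V)/(2C)

# Against an M, the H, D and M types all earn exactly K.
ties = [payoff(t, p, V, C) == K for t in (HAWK, DOVE, p)]

# Against a mutant, M earns strictly more than the mutant earns against itself.
m_beats_hawk = payoff(p, HAWK, V, C) > payoff(HAWK, HAWK, V, C)
m_beats_dove = payoff(p, DOVE, V, C) > payoff(DOVE, DOVE, V, C)

def simulate(h, step=0.1, rounds=2000):
    """Assumed replicator-style adjustment of the Hawk share h in a
    population of pure Hawks and pure Doves: h grows when Hawks are fitter."""
    for _ in range(rounds):
        gap = payoff(1, h, V, C) - payoff(0, h, V, C)  # pi(H) - pi(D)
        h += step * h * (1 - h) * gap
    return h

h_final = simulate(0.1)
print(ties, m_beats_hawk, m_beats_dove, h_final)
```

Because the payoff is linear in the opponent's Hawk probability, the same function serves both for meeting an M type and for meeting a random member of a population with Hawk share h. Starting simulate from a Hawk share above V/C (for example 0.9) can be used to check the approach to the polymorphic equilibrium from the other side.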