Cooperation and Tags

Achieving Cooperative Social Behavior Through the Use of Tags
Critical MAS 2004
Aviv Zohar

The Prisoner’s Dilemma
A symmetric two-player matrix game with two strategies: Cooperate (C) and Defect (D).

          C       D
  C     R, R    S, T
  D     T, S    P, P

(Each cell lists the row player’s payoff, then the column player’s.)
The four possible payoffs are nicknamed Temptation (T), Reward (R), Penalty (P) and Sucker (S), with T > R > P > S and 2R > S + T.

The Shadow of the Future: Cooperation Through Repeated Play
• Axelrod showed that if players play multiple rounds of the dilemma, cooperation can be sensible.
• Cooperating with someone can encourage cooperation in the future.
• Reciprocating agents (the Tit for Tat strategy) fared very well against a wide range of strategies.
• Axelrod placed various strategies in an evolutionary setting and checked which strategy would survive and dominate.

Evolutionarily Stable Strategies (ESS)
• Very similar to Nash equilibria.
• These are strategies that are stable in an evolutionary setting: a strategy is evolutionarily stable if, in a population where almost all players follow it, it gets the highest payoff compared to other strategies.
Tit for Tat is almost evolutionarily stable, but so is the strategy of permanent defection (ALL-D). If the population is composed entirely of ALL-D players, Tit for Tat cannot invade!

Niches and Invasion
• Axelrod pointed out that if the invading group only interacts with itself, it may grow and replace the previous ESS.
• Niches are another idea adopted from biology: if we keep populations separate, they will each evolve differently and find different solutions for survival.
• In our case, they may reach different equilibrium points.
• Biological niches are often imposed by geographical separation (for example, species evolving differently on different continents).

Tags
• Our agents will be identified only by a number or a string of bits called a tag.
• Tags are visible to everyone.
• Agents will be directed to interact with others similar to them.
• This is another way to impose a topology on the system: agents with similar tags are considered “closer” to each other. This gives us another way to create niches.
• Another way of looking at tags is from the social perspective: these are markers that humans and animals observe in each other and that may influence their opinions.

Tags in the Human World
They are all around us.
Some are social markers:
  - Gang colors
  - Secret handshakes
  - Native tongue and accent
  - Chinese associations
Some are determined genetically:
  - Hair color
  - Skin color

Rick Riolo’s Experiments with Tags
• Each agent uses tags to select acceptable partners for a game of iterated prisoner’s dilemma (IPD).
• Tags are from the range [0,1].
• Each agent also has a tag bias (tolerance) $b \in (0,100]$.
• Agent i will agree to play with agent j with probability $1 - |T_i - T_j|^{b_i}$.
• The game starts only if both agree to play.
• Players are given several chances to reject partners, with a cost for each rejection.

Rick Riolo’s Experiments with Tags (Cont.)
• The strategies allowed are very simple: players only remember the last game step.
• Only 3 genes from the range [0,1] determine them: (i, p, q)
  i - the probability of cooperating on the first round.
  p - the probability of cooperating after the opponent cooperated.
  q - the probability of cooperating after the opponent defected.
• Riolo chose the parameters R=T=5, S=P=1. This is not exactly the PD matrix we are used to: each player’s strategy independently determines the other player’s reward. This is only a dilemma in an evolutionary context.

Riolo’s Results
From “The Effects and Evolution of Tag-Mediated Selection of Partners in Populations Playing the Iterated Prisoner’s Dilemma” - Rick Riolo

Riolo’s Results (Cont.)
• Populations with tags achieve more cooperation faster.
• A fixed tag bias gave the population a higher average fitness than an evolved bias. This became more distinct as the search cost was increased.
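As a rough illustration of the partner-selection rule and the (i, p, q) strategies from Riolo’s experiments, here is a minimal Python sketch. It assumes the acceptance probability is $1 - |T_i - T_j|^{b_i}$; the function names (`accept_probability`, `game_starts`, `cooperates`) are hypothetical and are not taken from Riolo’s paper.

```python
import random


def accept_probability(tag_i: float, tag_j: float, bias_i: float) -> float:
    """Probability that agent i agrees to play agent j (assumed form of the rule)."""
    return 1.0 - abs(tag_i - tag_j) ** bias_i


def game_starts(tag_i, bias_i, tag_j, bias_j, rng=random) -> bool:
    """The game starts only if both agents independently agree to play."""
    i_agrees = rng.random() < accept_probability(tag_i, tag_j, bias_i)
    j_agrees = rng.random() < accept_probability(tag_j, tag_i, bias_j)
    return i_agrees and j_agrees


def cooperates(strategy, opponent_last_move, rng=random) -> bool:
    """Decide a move from an (i, p, q) strategy.

    opponent_last_move is None on the first round, otherwise "C" or "D".
    """
    i, p, q = strategy
    if opponent_last_move is None:
        prob = i          # first round
    elif opponent_last_move == "C":
        prob = p          # opponent cooperated last step
    else:
        prob = q          # opponent defected last step
    return rng.random() < prob


# Tit for Tat corresponds to (i, p, q) = (1, 1, 0).
TIT_FOR_TAT = (1.0, 1.0, 0.0)
```

Under this rule a large bias makes an agent accept almost anyone: for tags 0.0 and 0.5, a bias of 1 gives an acceptance probability of 0.5, while a bias of 100 gives a probability very close to 1.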
When the search cost is high, it is better to be less picky.
• Populations did not always evolve tag use (when started with a high bias), especially when the search cost was high.

Evolving Social Rationality for MAS using Tags - Hales & Edmonds
• Demonstrates the application of tags to various models, each with its own lessons.
• Cooperation can be achieved even without iterated play, and without any memory.
• In some systems a negative scaling cost is found: the larger the group is, the faster it converges to cooperation!

Hales & Edmonds - Model M1
• A simple population of players without tags.
• Each player has a simple deterministic strategy, “C” or “D”, which is encoded in its genes.
• Random encounters between players that play a single round of the Prisoner’s Dilemma.
As expected, the population quickly converges to defecting players.

Hales & Edmonds - Model M2
• Tags are added to the M1 model. These are strings of 32 bits.
• Tags are inherited from the parents along with the strategy. There is also a small chance of mutation.
• Agents look for similar partners to play with. If none are found, they play a random opponent.
• Agents are not forced to cooperate with identical partners.
The results: over 90% of the players at any given time cooperate!

Model M2 - a Dynamic Equilibrium
• Each unique tag defines a group of players that only play internally.
• If a group contains even a single defecting player, it will quickly reproduce and eliminate all cooperating players in the group.
• The group is thus quickly reduced to a group of defecting players.
• However, if other groups exist that are without defectors, they do much better.
• Group selection!
• So cooperation is very unstable, but we still find lots of it!

Hales & Edmonds - Model M3
• An attempt to see if cooperation between non-identical agents could emerge.
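Returning briefly to model M2, its matching and inheritance rules can be sketched in Python as follows. This is only a sketch under stated assumptions: the agent layout (a dict with `tag` and `strategy`), the payoff values T=5, R=3, P=1, S=0, the fitness-proportional reproduction scheme and the mutation rate are all hypothetical choices, since the slides do not specify them.

```python
import random

# Hypothetical payoffs satisfying T > R > P > S and 2R > S + T.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def pick_partner(agents, i, rng):
    """Prefer a random agent with an identical tag; otherwise any other agent."""
    same_tag = [j for j, a in enumerate(agents)
                if j != i and a["tag"] == agents[i]["tag"]]
    if same_tag:
        return rng.choice(same_tag)
    return rng.choice([j for j in range(len(agents)) if j != i])


def play_generation(agents, rng):
    """Every agent initiates one single-round game; both players are paid."""
    scores = [0.0] * len(agents)
    for i in range(len(agents)):
        j = pick_partner(agents, i, rng)
        mine, theirs = agents[i]["strategy"], agents[j]["strategy"]
        scores[i] += PAYOFF[(mine, theirs)]
        scores[j] += PAYOFF[(theirs, mine)]
    return scores


def reproduce(agents, scores, rng, mutation_rate=0.001, tag_bits=32):
    """Assumed scheme: fitness-proportional selection with rare mutation."""
    weights = [s + 1e-9 for s in scores]  # avoid an all-zero weight vector
    parents = rng.choices(agents, weights=weights, k=len(agents))
    children = []
    for parent in parents:
        tag, strategy = parent["tag"], parent["strategy"]
        for bit in range(tag_bits):
            if rng.random() < mutation_rate:
                tag ^= 1 << bit  # flip one bit of the inherited tag
        if rng.random() < mutation_rate:
            strategy = "D" if strategy == "C" else "C"
        children.append({"tag": tag, "strategy": strategy})
    return children
```

Because children inherit the tag together with the strategy, a defector that appears in a group is matched mostly with its own descendants, which is what lets defector-free groups outperform it.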
Each agent is given:
• A tag from the range [0,1]
• A tolerance level from the range [0,1]
• A “skill” from {1,2,3,4,5}, which represents the ability to consume resources of types {1,2,3,4,5}.
• Agent i will agree to interact with agent j only if $|\mathrm{tag}_i - \mathrm{tag}_j| \le \mathrm{tolerance}_i$.

Model M3 (Cont.)
At every round agents are awarded several resources.
• If they possess the appropriate skill, they can consume them and gain 1 utility.
• Otherwise, they can choose to either
  - donate them to other agents with a similar tag, at a cost to themselves, or
  - discard the resource.
• Now every “group” will need to be composed of agents with complementary skills.

Model M3 - Results
From “Evolving Social Rationality for MAS using Tags” - David Hales & Bruce Edmonds.

Hales & Edmonds - Model M4
• An attempt to apply tags to a more realistic task domain:
• Robots are tasked with unloading incoming trucks.
• There are 10 unloading bays and 5 robots per bay. Each can unload 5 units per cycle.
• Each robot is rewarded only for the amount unloaded in its own bay.
• If a bay is empty, a truck carrying ‘s’ units will arrive with probability ‘p’ and will stay at the bay until it is empty.
• Robots can request help from others.

Model M4 (Cont.)
The tag model:
• Each robot is given an integer tag from 1..500.
• Robots that have a truck in their bay request help from robots with identical tags (or random ones if there are none).
• Robots agree to help according to their genes and whether or not there is a truck in their own bay.

Model M4 - Results
From “Evolving Social Rationality for MAS using Tags” - David Hales & Bruce Edmonds.

The Benefits of Tags that We’ve Seen
• A high degree of cooperation.
• Extremely simple solutions.
• Cheap to implement.
• Robust. Noise is actually built into the system.
• No complex reasoning was used.
• Local interactions. No global view needed.
• Sometimes we get a negative scaling cost.

Limitations of Tags
• Can they be applied outside the evolutionary setting?
(Some other kind of reasoning?)
• Behavior and tag need to be somehow correlated.
• We cannot allow totally free change of tags.
• Interactions are only with similar agents. What if we want more? (Interactions between all members of the population.)

A Qualitative Analysis of a Tag-Based System
• The evolving population can be seen as a dynamic system.
• This is a view that is often used in population biology.
• Differential equations can approximate the interaction between the various groups.
• Fixed points, attractors and bifurcations can point to interesting properties of the system.

A Minimal Model for Tag-Based Cooperation - Traulsen & Schuster
• A simplified analysis of a donations model by Riolo. The idea is to show why the tolerance level oscillates.
• Only 2 tags are allowed: “red” and “blue”.
• Players can donate some goods at a cost to themselves, but at greater benefit to the recipient.
• Only 2 behaviors are allowed: players can either donate to players with the same tag only, or donate to everyone.
• There are thus only 4 types of players possible:

            Tolerant   Selfish
  Red          p1         p3
  Blue         p2         p4

The Replicator Equation
• A simple dynamic equation that embodies the idea that individuals with above-average fitness reproduce and replace individuals with low fitness.

  $\dot{p}_i = \beta\, p_i\, (f_i - \bar{f})$

  Here $\bar{f}$ is the mean fitness of the population and $\beta$ scales the rate of selection.

The System Dynamics - No Drift
• The mean payoff a player can expect is set as the fitness.
• The system’s behavior can now be determined.
From “A Minimal Model for Tag-Based Cooperation” - Arne Traulsen & Heinz Georg Schuster.

Adding a Slow Drift Towards Tolerance
• A small biased drift towards tolerance is added to the equations. This drift was also found in the experimental models.
• The system behavior changes dramatically.
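To make the four-type dynamics concrete, the following Python sketch iterates a discrete replicator update with an optional drift from the selfish types to the tolerant types of the same color. The fitness expressions are an assumption derived from the two behaviors described above (tolerant players donate to everyone, selfish players donate only to their own tag), with a hypothetical benefit `b` and cost `c`; the parameter names `beta` and `eps` are likewise illustrative and not taken from the paper.

```python
def fitness(p, b=1.0, c=0.1):
    """Mean payoff of each type, given frequencies p = [p1, p2, p3, p4].

    p1: tolerant red, p2: tolerant blue, p3: selfish red, p4: selfish blue.
    Assumed payoffs: a donation costs the donor c and gives the recipient b.
    """
    p1, p2, p3, p4 = p
    f1 = b * (p1 + p2 + p3) - c              # receives from tolerant + selfish red, donates to all
    f2 = b * (p1 + p2 + p4) - c              # receives from tolerant + selfish blue, donates to all
    f3 = b * (p1 + p2 + p3) - c * (p1 + p3)  # donates to red players only
    f4 = b * (p1 + p2 + p4) - c * (p2 + p4)  # donates to blue players only
    return [f1, f2, f3, f4]


def step(p, beta=0.1, eps=0.0, b=1.0, c=0.1):
    """One update: selection term plus a drift of selfish types to tolerance."""
    f = fitness(p, b, c)
    mean_f = sum(pi * fi for pi, fi in zip(p, f))
    new = [pi + beta * pi * (fi - mean_f) for pi, fi in zip(p, f)]
    # Drift: selfish red -> tolerant red, selfish blue -> tolerant blue.
    new[0] += eps * p[2]
    new[2] -= eps * p[2]
    new[1] += eps * p[3]
    new[3] -= eps * p[3]
    return new


def simulate(p0, steps, **params):
    """Return the trajectory of frequencies, starting from p0."""
    trajectory = [list(p0)]
    for _ in range(steps):
        trajectory.append(step(trajectory[-1], **params))
    return trajectory
```

Both the selection term and the drift term sum to zero over the four types, so the frequencies keep adding up to 1. With `eps=0`, a selfish type never does worse than the tolerant type of the same color under these assumed payoffs, which is why a drift term is needed for tolerance to recover.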
t 1  p1t  p1t    ( f1  f )    p3t t 1  p2t  p2t    ( f 2  f )    p4t t 1  p3t  p3t    ( f 3  f )    p3t t 1  p4t  p4t    ( f 4  f )    p4t p1 p2 p3 p4 From “A Minimal Model for Tag-Based Cooperation” – Arne Traulsen & Heinz Georg Schuster. References • • • • • • • • Axelrod. The Evolution of Cooperation (Basic Books, New York, 1984). Hales & Edmonds. Evolving Social Rationality for MAS using “Tags”. In Rosenschein, J. S., et al. (eds.) Proceedings of the 2nd International Conference on Autonomous Agents and Multiagent Systems, 497-503 (ACM Press, 2003) Hofbauer & Sigmund. Evolutionary Games and Population Dynamics (Cambridge Univ. Press, Cambridge, 1998). Maynard-Smith. Did Darwin Get it Right? Essays on Games Sex and Evolution (Penguin Books Ltd. New York 1993). Riolo, Cohen & Axelrod. Cooperation without Reciprocity. Nature 414, 441-443 (2001). Riolo. The Effects and Evolution of Tag-Mediated Selection of Partners in Populations Playing the Iterated Prisoner’s Dilemma. In Proceedings of the Seventh International Conference on Genetic Algorithms, 378-385. (Kaufmann Publishers Inc. 1997). Sigmund & Nowak. Tides of tolerance. Nature 414, 403-405 (2001). Traulsen & Schuster, A Minimal Model for Tag-Based Cooperation. Phys. Rev. E. 68, 046129 (2003)
