Achieving Cooperative Social Behavior
Through The Use Of Tags
Critical MAS 2004
Aviv Zohar
The Prisoner’s Dilemma
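A symmetric two-player matrix game with two strategies: Cooperate (C) and Defect (D).

Payoff matrix (row player's payoff, column player's payoff):

          C       D
    C   R, R    S, T
    D   T, S    P, P

The four possible payoffs are nicknamed Temptation (T), Reward (R), Penalty (P), and Sucker (S), with:
T > R > P > S
2R > S + T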
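The two conditions above can be checked mechanically. The following is a minimal sketch; the numeric values T=5, R=3, P=1, S=0 are an illustrative assumption (the slide gives only the ordering), and the names `PAYOFFS`, `is_prisoners_dilemma`, and `best_reply` are hypothetical.

```python
# Illustrative payoff values (an assumption; the slide states only the ordering).
T, R, P, S = 5, 3, 1, 0

# Payoffs as (row player, column player) for each pair of moves.
PAYOFFS = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def is_prisoners_dilemma(t, r, p, s):
    """Check both conditions from the slide: T > R > P > S and 2R > S + T."""
    return t > r > p > s and 2 * r > s + t


def best_reply(opponent_move):
    """Row player's payoff-maximizing move against a fixed opponent move."""
    return max("CD", key=lambda move: PAYOFFS[(move, opponent_move)][0])
```

With these values, defecting is the best reply to either move, even though mutual cooperation pays each player more than mutual defection. That tension is the dilemma.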
The Shadow of the Future –
Cooperation Through Repeated Play
• Axelrod showed that if players play multiple rounds of the
dilemma, cooperation can be sensible.
• Cooperating with someone can encourage cooperation in
the future.
• Reciprocating agents (Tit for Tat strategy) fared very well
against a wide range of strategies.
• Axelrod placed various strategies in an evolutionary setting
and checked which strategy would survive and dominate.
Evolutionarily Stable Strategies (ESS)
• Very similar to Nash equilibria.
• These are strategies that are stable in an evolutionary
setting:
A strategy is evolutionarily stable if in a population
where almost all players follow it, it gets the highest payoff
compared to other strategies.
Tit for Tat is almost evolutionarily stable.
But so is the strategy of permanent defection (ALL-D).
If the population is composed entirely of ALL-D players, Tit
for Tat cannot invade!
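The invasion argument can be made concrete with a small iterated game. This is a minimal sketch, assuming the illustrative payoffs T=5, R=3, P=1, S=0 and a fixed game length of 10 rounds (neither is given in the slides); `play`, `tit_for_tat`, and `all_d` are hypothetical names.

```python
# Illustrative payoff values (an assumption, not taken from the slides).
T, R, P, S = 5, 3, 1, 0
# Payoff to the first player for (own move, opponent move).
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def all_d(opponent_history):
    """Always defect."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play an iterated game and return the total scores (a, b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)
        move_b = strategy_b(hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

In a population of ALL-D players, a resident scores `play(all_d, all_d, 10)[0]` against its neighbors, while a lone Tit for Tat invader scores `play(tit_for_tat, all_d, 10)[0]`. The invader loses the first round and never recovers, so it earns less than the residents.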
Niches and Invasion
• Axelrod pointed out that if the invading group only interacts
with itself, it may grow and replace the previous ESS.
• Niches are another idea adopted from biology: If we keep
populations separate they will each evolve differently – And
find different solutions for survival.
• In our case – they may reach different equilibrium points.
• Biological niches are often imposed by geographical
separation. (for example: species evolving differently on
different continents)
Tags
• Our agents will be identified only according to a number or
a string of bits called a tag.
• Tags are visible to everyone.
• Agents will be directed to interact with others similar to
them.
• This is another way to impose a topology on the system –
Agents with similar tags are considered “closer” to each
other. We have another way to create niches.
• Another way of looking at tags is from the Social
perspective: These are markers that humans/animals
observe in each other that may influence their opinions.
Tags in the Human World
They are all around us:
Some are social markers
– Gang Colors
– Secret Handshakes
– Native tongue and accent
– Chinese associations
Some are determined genetically
– Hair color
– Skin Color
Rick Riolo’s experiments with tags
• Each agent uses tags to select acceptable partners for a game
of iterated prisoner’s dilemma (IPD).
• Tags are from the range [0,1]
• Each agent also has a tag bias (tolerance) b ∈ (0,100]
• Agent i will agree to play with agent j with probability:
$1 - |T_i - T_j|^{b_i}$
• The game starts only if both agree to play.
• Players are given several chances to reject partners with a cost
for each rejection.
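The partner-acceptance rule can be sketched as follows. This assumes the acceptance probability is read as 1 - |T_i - T_j|^(b_i), which is an interpretation of the formula on the slide. The function names are hypothetical, and rejection costs and repeated attempts are not modeled.

```python
import random


def accept_probability(tag_i, tag_j, bias_i):
    """Probability that agent i agrees to play agent j.

    Assumed form: 1 - |T_i - T_j| ** b_i, with tags in [0, 1]
    and bias b_i in (0, 100].
    """
    return 1.0 - abs(tag_i - tag_j) ** bias_i


def game_starts(tag_i, bias_i, tag_j, bias_j, rng=random):
    """The game starts only if both agents independently agree."""
    i_agrees = rng.random() < accept_probability(tag_i, tag_j, bias_i)
    j_agrees = rng.random() < accept_probability(tag_j, tag_i, bias_j)
    return i_agrees and j_agrees
```

Under this reading, identical tags are always accepted, and a larger bias makes an agent less picky: for a fixed tag distance below 1, raising it to a higher power shrinks it toward zero.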
Rick Riolo’s experiments with tags (Cont.)
• The strategies allowed are very simple. Players only remember the
last game step.
• Only 3 genes from the range [0,1] determine them: (i,p,q)
i - the probability of cooperating on the first round.
p - the probability of cooperating after the opponent cooperated.
q - the probability of cooperating after the opponent defected.
• Riolo chose the parameters R=T=5, S=P=1.
This is not exactly the PD matrix we are used to: each player’s
strategy independently determines the other player’s reward.
This is only a dilemma in an evolutionary context.
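The three-gene strategy encoding can be sketched as a small class. This is an illustrative sketch only: the class and function names are hypothetical, and the payoff function simply restates the slide's parameters R=T=5, S=P=1.

```python
import random
from dataclasses import dataclass


@dataclass
class Strategy:
    i: float  # probability of cooperating on the first round
    p: float  # probability of cooperating after the opponent cooperated
    q: float  # probability of cooperating after the opponent defected

    def cooperate_probability(self, opponent_last_move):
        """Look up the gene that applies to the current situation."""
        if opponent_last_move is None:  # first round, nothing to remember
            return self.i
        return self.p if opponent_last_move == "C" else self.q

    def move(self, opponent_last_move, rng=random):
        """Sample a move, 'C' or 'D', from the applicable gene."""
        prob = self.cooperate_probability(opponent_last_move)
        return "C" if rng.random() < prob else "D"


def payoff(opponent_move):
    """With R=T=5 and S=P=1, a player's payoff depends only on the opponent."""
    return 5 if opponent_move == "C" else 1


# Familiar deterministic strategies are corner cases of the encoding.
TIT_FOR_TAT = Strategy(i=1.0, p=1.0, q=0.0)
ALL_D = Strategy(i=0.0, p=0.0, q=0.0)
```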
Riolo’s Results
From “The Effects and Evolution of Tag-Mediated Selection of Partners in
Populations Playing the Iterated Prisoner’s Dilemma” - Rick Riolo
Riolo’s Results (Cont.)
• Populations with tags achieve more cooperation faster.
• A fixed tag bias got the population a higher average fitness than
an evolved bias. This became more distinct as the search cost was
increased.
When there’s a high cost on searching – it is better to be less
picky.
• Populations did not always evolve tag use (when started with
a high bias), especially when the search cost was high.
Evolving Social Rationality for MAS using
Tags - Hales & Edmonds
• Demonstrates the application of tags to various models,
each with its own lessons.
• Cooperation can be achieved even without iterated play, and
without any memory.
• In some systems a negative scaling cost is found:
The larger the group is, the faster it converges to cooperation!
Hales & Edmonds – Model M1
• A simple population of players without tags.
• Each player has a simple deterministic strategy, “C” or
“D”, which is encoded in its genes.
• Random encounters between players that play a single
round of the Prisoner’s dilemma.
As expected:
The population quickly converges to defecting players.
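The reason defection takes over in M1 can be seen from expected payoffs under random matching. This is a minimal sketch, assuming the illustrative payoffs T=5, R=3, P=1, S=0 (the slides do not state M1's values); `expected_payoffs` is a hypothetical name.

```python
# Illustrative payoff values (an assumption, not taken from the slides).
T, R, P, S = 5, 3, 1, 0


def expected_payoffs(cooperator_fraction):
    """Expected single-round payoff of a C player and a D player
    when the opponent is drawn at random from the population."""
    x = cooperator_fraction
    coop = x * R + (1 - x) * S
    defect = x * T + (1 - x) * P
    return coop, defect
```

For any mix of the population, the defector's expected payoff exceeds the cooperator's by x(T - R) + (1 - x)(P - S), which is positive whenever T > R and P > S. Selection therefore always favors "D".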
Hales & Edmonds – Model M2
• Tags are added to the M1 model. These are strings of 32 bits.
• Tags are inherited from the parents along with the Strategy.
There is also a small chance of mutation.
• Agents look for similar partners to play with. If none are
found they play a random opponent.
• Agents are not forced to cooperate with identical partners.
The results: Over 90% of the players at any given time
cooperate!
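The partner search and tag mutation in M2 might look like the sketch below. It assumes that "similar" means an identical tag and that mutation flips each of the 32 bits independently. Both are assumptions made for illustration, and the function names are hypothetical.

```python
import random


def choose_partner(agent_index, tags, rng=random):
    """Return the index of a partner for the given agent.

    Prefers another agent with an identical tag (an assumption about
    what "similar" means); falls back to a random other agent.
    """
    others = [k for k in range(len(tags)) if k != agent_index]
    same_tag = [k for k in others if tags[k] == tags[agent_index]]
    return rng.choice(same_tag if same_tag else others)


def mutate_tag(tag, rate, rng=random, bits=32):
    """Flip each bit of an integer-encoded tag with probability `rate`."""
    for bit in range(bits):
        if rng.random() < rate:
            tag ^= 1 << bit
    return tag
```

Because offspring inherit both tag and strategy, agents sharing a tag tend to share a strategy, which is what lets a group of cooperators keep playing among themselves.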
Model M2 – a Dynamic Equilibrium
• Each unique tag defines a group of players that only play
internally.
• If a group contains even a single defecting player, that
player will quickly reproduce and eliminate all cooperating
players in the group.
• The group is thus quickly reduced to a group of defecting
players.
• However, if other groups exist that are without defectors,
they do much better.
• Group selection!
• So cooperation is very unstable – but we still find lots of it!
Hales & Edmonds – Model M3
• An attempt to see if cooperation between non identical
agents could emerge.
Each agent is given:
• A tag from the range [0,1]
• A tolerance level from the range [0,1]
• And a “skill” from {1,2,3,4,5} which represents the ability
to consume resources of types {1,2,3,4,5}.
• Agent i will agree to interact with agent j only if:
$|tag_i - tag_j| \le tolerance_i$
Model M3 (Cont.)
At every round agents are awarded several resources.
• If they possess the appropriate skill, they can consume them
and gain 1 utility.
• Otherwise, they can choose to either
– donate them to other agents with a similar tag at a cost
to themselves, or
– discard the resource.
• Now every “group” will need to be composed of agents
with complementing skills.
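One way to picture the M3 resource rule is sketched below. Several details are assumptions made for illustration: the interaction test is read as |tag_i - tag_j| <= tolerance_i, the donor gives to the first acceptable agent that has the matching skill, the recipient gains 1 utility, and `donation_cost=0.1` is a hypothetical value. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    tag: float        # in [0, 1]
    tolerance: float  # in [0, 1]
    skill: int        # in {1, 2, 3, 4, 5}
    utility: float = 0.0


def will_interact(a, b):
    """Agent a accepts b if their tags differ by at most a's tolerance."""
    return abs(a.tag - b.tag) <= a.tolerance


def handle_resource(agent, resource_type, others, donation_cost=0.1):
    """Consume the resource, donate it to a similar agent, or discard it."""
    if agent.skill == resource_type:
        agent.utility += 1
        return "consumed"
    for other in others:
        if will_interact(agent, other) and other.skill == resource_type:
            agent.utility -= donation_cost  # donating costs the donor
            other.utility += 1              # recipient consumes the resource
            return "donated"
    return "discarded"
```

Note that the test is asymmetric: it uses only the donor's tolerance, so an agent with zero tolerance can still receive from a more tolerant neighbor.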
Model M3 - Results
From “Evolving Social Rationality for MAS using Tags” – David Hales & Bruce
Edmonds.
Hales & Edmonds – Model M4
• Attempting to apply tags to a more realistic task domain:
• Robots are tasked with unloading incoming trucks.
• There are 10 unloading bays and 5 robots per bay. Each
can unload 5 units per cycle.
• Each robot gets rewarded only for the amount unloaded in
its own bay.
• If a bay is empty, a truck carrying ‘s’ units will arrive with
probability ‘p’, and will stay at the bay until it is empty.
• Robots can request help from others.
Model M4 (Cont.)
The tag model:
• Each robot is given an integer tag from 1..500
• Robots that have a truck in their bay request help from
robots with identical tags (or random ones if there are
none)
• Robots agree to help according to their genes and whether
or not there is a truck in their own bay.
Model M4 - Results
From “Evolving Social Rationality for MAS using Tags” – David Hales & Bruce
Edmonds.
The Benefits of Tags that We’ve Seen:
• A high degree of cooperation.
• Extremely simple solutions.
• Cheap to implement.
• Robust. Noise is actually built in to the system.
• No complex reasoning was used.
• Local interactions. No global view needed.
• Sometimes we get a negative scaling cost.
Limitations of Tags
• Can they be applied outside the evolutionary setting?
(some other kind of reasoning?)
• Behavior and tag need to be somehow correlated.
• We cannot allow totally free change of tags.
• Interactions are only with similar agents. What if we want
more? (Interactions between all members of the
population)
A qualitative analysis of a tag based system
• The evolving population can be seen as a dynamic system.
• This is a view that is often used in population biology.
• Differential equations can approximate the interaction
between the various groups.
• Fixed points, attractors, and bifurcations can point to
interesting properties of the system.
A minimal model for tag based
cooperation – Traulsen & Schuster
• A simplified analysis of a donations model by Riolo. The
idea is to show why the tolerance level oscillates.
• Only 2 tags are allowed: “red” and “blue”.
• Players can donate some goods at a cost to themselves, but
at greater benefit to the recipient.
• Only 2 behaviors are allowed: Players can either donate to
players with the same tag only, or donate to everyone.
• There are thus only 4 types of players possible:
          Tolerant   Selfish
Red          p1         p3
Blue         p2         p4
The Replicator Equation
• A simple dynamic equation that embodies the idea that
individuals with above average fitness reproduce and
replace individuals with low fitness.
p i  pi  ( f i  f )
The System Dynamics - No Drift
• The mean payoff a player can expect is set as the fitness.
• The system’s behavior can now be determined.
From “A Minimal Model for Tag-Based Cooperation” –
Arne Traulsen & Heinz Georg Schuster.
Adding a Slow Drift Towards Tolerance
• A small biased drift towards tolerance is added to the
equations. This drift was also found in the experimental
models.
• The system behavior changes dramatically.
t 1
 p1t  p1t    ( f1  f )    p3t
t 1
 p2t  p2t    ( f 2  f )    p4t
t 1
 p3t  p3t    ( f 3  f )    p3t
t 1
 p4t  p4t    ( f 4  f )    p4t
p1
p2
p3
p4
From “A Minimal Model for Tag-Based Cooperation” –
Arne Traulsen & Heinz Georg Schuster.
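The four-type model with drift can be iterated numerically. The sketch below rests on several assumptions that are not taken from the paper: each player meets one random partner per step, a donation gives benefit `b` to the recipient at cost `c` to the donor, the fitness expressions are derived here from the slide's verbal description, and the parameter names `beta` (selection step size) and `eps` (drift rate) with their default values are hypothetical.

```python
def fitness(p, b=1.0, c=0.1):
    """Expected payoff of each type for shares p = [p1, p2, p3, p4].

    p1/p2: tolerant red/blue (donate to everyone).
    p3/p4: selfish red/blue (donate to their own tag only).
    """
    p1, p2, p3, p4 = p
    tolerant = p1 + p2
    f1 = b * (tolerant + p3) - c              # pays the cost to anyone met
    f2 = b * (tolerant + p4) - c
    f3 = b * (tolerant + p3) - c * (p1 + p3)  # pays only when meeting red
    f4 = b * (tolerant + p4) - c * (p2 + p4)  # pays only when meeting blue
    return [f1, f2, f3, f4]


def step(p, beta=0.1, eps=0.01, b=1.0, c=0.1):
    """One discrete update: replicator growth plus drift towards tolerance."""
    f = fitness(p, b, c)
    mean_f = sum(pi * fi for pi, fi in zip(p, f))
    grown = [pi + beta * pi * (fi - mean_f) for pi, fi in zip(p, f)]
    # A fraction eps of each selfish type becomes tolerant with the same tag.
    drift = [eps * p[2], eps * p[3], -eps * p[2], -eps * p[3]]
    return [g + d for g, d in zip(grown, drift)]
```

Under these assumptions a selfish player never earns less than a tolerant player of the same color, so selection pushes towards selfishness while the drift term pushes back towards tolerance.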
References
• Axelrod. The Evolution of Cooperation (Basic Books, New York, 1984).
• Hales & Edmonds. Evolving Social Rationality for MAS using “Tags”. In
Rosenschein, J. S., et al. (eds.) Proceedings of the 2nd International Conference on
Autonomous Agents and Multiagent Systems, 497-503 (ACM Press, 2003).
• Hofbauer & Sigmund. Evolutionary Games and Population Dynamics
(Cambridge Univ. Press, Cambridge, 1998).
• Maynard-Smith. Did Darwin Get it Right? Essays on Games, Sex and Evolution
(Penguin Books Ltd., New York, 1993).
• Riolo, Cohen & Axelrod. Cooperation without Reciprocity. Nature 414, 441-443
(2001).
• Riolo. The Effects and Evolution of Tag-Mediated Selection of Partners in
Populations Playing the Iterated Prisoner’s Dilemma. In Proceedings of the
Seventh International Conference on Genetic Algorithms, 378-385 (Kaufmann
Publishers Inc., 1997).
• Sigmund & Nowak. Tides of tolerance. Nature 414, 403-405 (2001).
• Traulsen & Schuster. A Minimal Model for Tag-Based Cooperation. Phys. Rev. E
68, 046129 (2003).