Cooperation in multi-player minimal social situations: An experimental investigation

Andrew Colman, Briony Pulford, David Omtzigt, Ali al-Nowaihi

An article describing four experiments and a Monte Carlo simulation study carried out for this project has been published: Colman, A. M., Pulford, B. D., Omtzigt, D., & al-Nowaihi, A. (2010). Learning to cooperate without awareness in multiplayer minimal social situations. Cognitive Psychology, 61, 201-227.

___________________

A simple, evolutionary game-theoretic model yields the surprising prediction that cooperation can evolve without deliberate intention in a minimal social situation (MSS). This phenomenon was discovered in dyads by Sidowski (1957) and generalized to larger groups by Coleman, Colman, and Thomas (1990). Predictions about the multi-player minimal social situation (MMSS) remain untested. The primary objective of the proposed research is to test them.

Two-player MSS

In this game of incomplete information, neither player knows the co-player's strategy set, nor their own or the co-player's payoff function. Players may even be ignorant of their strategic interdependence. Below is the Mutual Fate Control payoff matrix (Thibaut & Kelley, 1959), normally used to study it. In each cell, the first sign is Player I's payoff and the second is Player II's:

                 II
              C       D
    I    C   +, +    –, +
         D   +, –    –, –

All four outcomes are (pure-strategy, weak) Nash equilibria. Players' payoffs are determined by their co-player's choice and not by their own. Colman (1982, pp. 289-91; 1995, pp. 40-50) suggested some everyday examples.

Experimental evidence

In the earliest experiments (Sidowski, 1957; Sidowski, Wyckoff, & Tabory, 1956), pairs of players sat in separate rooms with electrodes attached to their left hands. Each player was provided with a pair of buttons for indicating their choices and a digital display showing points scored. On every iteration, each player pressed a button, attempting to earn a point and avoid a shock. Pressing one button awarded the co-player a point, and pressing the other shocked the co-player.
In later experiments, players simply won or lost points; financial incentives have seldom been used. After many iterations, pairs tended to coordinate on the efficient (C, C) equilibrium. After a hundred iterations, C-choosing reached 75-80%. Players behaved as if they were learning to cooperate, although they did not guess that co-players were involved. Sidowski (1957) informed some players that a person in another room controlled their points and shocks, and vice versa, but this additional information made no material difference (p. 324). Subsequent investigations of the MSS, using human and occasionally animal players, have broadly replicated these findings.

MSS theory

To explain the phenomenon, Kelley, Thibaut, Radloff, and Mundy (1962) proposed that players tend to adopt a myopic “win-stay, lose-change” strategy (Pavlov), repeating a choice immediately following a reward and switching to the alternative after a punishment. Assume that the game is repeated indefinitely and that, after arbitrary initial choices, both players use Pavlov. Using 0 and 1 to represent C and D, there are three cases.

If both initially choose 0, then both are rewarded and repeat 0 indefinitely:
(0, 0) → (0, 0) → (0, 0) → ....

If both choose 1, both are punished and switch to 0, repeating it indefinitely:
(1, 1) → (0, 0) → (0, 0) → ....

If one player chooses 0 and the other 1, the 0-chooser is punished and switches to 1, and the 1-chooser is rewarded and repeats 1; then both switch to 0 and repeat it indefinitely:
(0, 1) → (1, 1) → (0, 0) → (0, 0) → ... or (1, 0) → (1, 1) → (0, 0) → (0, 0) → ....

Pavlov players therefore lock into mutually rewarding strategies by the third iteration. Pavlov is essentially a formalization of the law of effect, well documented in human and animal psychology, but players implement it imperfectly, at best. Strict Pavlov would yield 100% cooperation after three iterations.
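The three cases above can be checked mechanically. The following is a minimal sketch of strict Pavlov in the two-player MSS; the function names `pavlov_step` and `trajectory` are hypothetical and introduced only for this illustration.

```python
def pavlov_step(state):
    """One iteration of strict Pavlov in the two-player MSS.

    state is (x1, x2), with 0 = C (rewards the co-player) and
    1 = D (punishes the co-player). A player is punished exactly when
    the co-player chose 1, and a punished player switches.
    """
    x1, x2 = state
    new_x1 = x1 if x2 == 0 else 1 - x1  # win-stay, lose-change
    new_x2 = x2 if x1 == 0 else 1 - x2
    return (new_x1, new_x2)


def trajectory(start, steps=3):
    """Return the states visited, beginning with the initial choices."""
    states = [start]
    for _ in range(steps):
        states.append(pavlov_step(states[-1]))
    return states


# Print the path from each of the four possible pairs of initial choices.
for start in [(0, 0), (1, 1), (0, 1), (1, 0)]:
    print(" -> ".join(str(s) for s in trajectory(start)))
```

Each printed path should match the corresponding sequence given in the text, with (0, 0) reached no later than the third state.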
Stochastic learning models are more descriptively accurate in the MSS (Arickx & Van Avermaet, 1981; Delepoulle, Preux, & Darcheville, 2000).

Multi-player MMSS

The MMSS is a generalization of the MSS to n ≥ 2 players, each with a uniquely designated predecessor and successor. Player 1's predecessor is Player n and Player n's successor is Player 1, as if the players were seated round a table or the number of players were unbounded. Each player chooses 0 or 1, yielding an n-vector (configuration) of zeros and ones for each iteration. A choice of 0 rewards the successor, and a choice of 1 punishes the successor. A Pavlov player repeats a rewarded choice on the following iteration and switches after a punished choice. A jointly cooperative configuration consisting entirely of zeros is repeated indefinitely. A cooperative configuration is one that leads ultimately to a zero configuration.

The MSS is a special case in which all configurations are cooperative. This is not true in general. The configuration (1, 1, 0, 0, 1, 1) evolves as follows:

(1, 1, 0, 0, 1, 1) → (0, 0, 1, 0, 1, 0) → (0, 0, 1, 1, 1, 1) → (1, 0, 1, 0, 0, 0) → (1, 1, 1, 1, 0, 0) → (1, 0, 0, 0, 1, 0) → (1, 1, 0, 0, 1, 1),

returning to the beginning. This sequence cycles forever through these non-cooperative configurations.

Coleman, Colman, and Thomas (1990) proved that, in the MMSS, joint cooperation evolves only in special cases. The only configurations immediately followed by joint cooperation are those in which all players choose 1 or all choose 0. If n is odd, then joint cooperation occurs only if all players make the same initial choice. If k is the highest power of 2 that divides n evenly, then the number of cooperative configurations is 2^k. Once the choices of k players are specified, the rest are strictly determined for a cooperative configuration.

Proposed experiments

The standard theory yields testable but non-intuitive predictions. For example, cooperation should not evolve at all in odd-sized groups.
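The strict-Pavlov dynamics of the MMSS can be explored by brute force. Because each player switches exactly when the predecessor chose 1, the update can be written as x_i' = x_i + x_{i-1} (mod 2), with subscripts taken mod n. The sketch below, under that reading of the rule, replays the six-player cycle and counts cooperative configurations for small groups, so the count can be compared with 2^k, where k is the highest power of 2 dividing n. All function names are hypothetical and used only for illustration.

```python
from itertools import product


def mmss_step(config):
    """Strict Pavlov update: x_i' = x_i + x_{i-1} (mod 2), indices mod n."""
    n = len(config)
    # For i = 0, config[i - 1] is config[-1], i.e. the last player,
    # which gives the circular predecessor relation.
    return tuple((config[i] + config[i - 1]) % 2 for i in range(n))


def is_cooperative(config):
    """True if the configuration eventually reaches all zeros."""
    seen = set()
    while config not in seen:
        if not any(config):
            return True
        seen.add(config)
        config = mmss_step(config)
    return False  # entered a cycle that never visits all zeros


def count_cooperative(n):
    """Count cooperative configurations among all 2**n configurations."""
    return sum(is_cooperative(c) for c in product((0, 1), repeat=n))


def highest_power_of_two_dividing(n):
    """Largest power of 2 (as a number, not an exponent) that divides n."""
    k = 1
    while n % (2 * k) == 0:
        k *= 2
    return k


# Replay the six-player example from the text.
config = (1, 1, 0, 0, 1, 1)
for _ in range(7):
    print(config)
    config = mmss_step(config)

# Compare the brute-force count with 2**k for small group sizes.
for n in range(2, 9):
    k = highest_power_of_two_dividing(n)
    print(n, count_cooperative(n), 2 ** k)
```

Exhaustive enumeration is feasible only for small n, since the number of configurations doubles with each added player.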
However, consider a stochastic modification that we call Optimistic Pavlov, in which a Pavlov player who should choose 1 with certainty chooses it with probability p (0 < p < 1). Then, immediately following an iteration in which Player i chooses x_i and Player i - 1 chooses x_{i-1}, the probability P(x_i) that Player i will defect is

P(x_i) = p(x_{i-1} + x_i), i = 1, ..., n,

that is, p when strict Pavlov prescribes 1 and 0 otherwise. (Here, addition is mod 2, so 1 + 1 = 0, and subscripts are mod n, so for i = 1 the subscript i - 1 = 0 is read as n.) Because 0 < p < 1 and 0 ≤ x_{i-1} + x_i ≤ 1, it follows that P(x_i) decreases over iterations. Optimistic Pavlov play therefore converges towards joint cooperation even in an odd-sized MMSS. We are carrying out experiments to test these theories, and others, in three-player, four-player, and six-player groups.

References

Arickx, M., & Van Avermaet, E. (1981). Interdependent learning in a minimal social situation. Behavioral Science, 26, 229-242.
Coleman, A. A., Colman, A. M., & Thomas, R. M. (1990). Cooperation without awareness: A multiperson generalization of the minimal social situation. Behavioral Science, 35, 115-121.
Colman, A. M. (Ed.). (1982). Cooperation and competition in humans and animals. Wokingham: Van Nostrand Reinhold.
Colman, A. M. (1995). Game theory and its applications in the social and biological sciences (2nd ed.). London: Routledge.
Delepoulle, S., Preux, P. P., & Darcheville, J.-C. (2000). Evolution of cooperation within a behavior-based perspective: Confronting nature and animats. Artificial Evolution, Lecture Notes in Computer Science, 1829, 204-216.
Kelley, H. H., Thibaut, J. W., Radloff, R., & Mundy, D. (1962). The development of cooperation in the “minimal social situation”. Psychological Monographs, 76, Whole No. 19.
Sidowski, J. B. (1957). Reward and punishment in the minimal social situation. Journal of Experimental Psychology, 54, 318-326.
Sidowski, J. B., Wyckoff, L. B., & Tabory, L. (1956). The influence of reinforcement and punishment in a minimal social situation. Journal of Abnormal and Social Psychology, 52, 115-119.
Thibaut, J. W., & Kelley, H. H. (1959). The social psychology of groups. New York: Wiley.

Start Date: 1 October 2004. End Date: 30 June 2005
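The Optimistic Pavlov rule described under Proposed experiments lends itself to a small simulation. The following is a minimal sketch, not the simulation used in the project: it assumes that a player for whom strict Pavlov prescribes 1 chooses 1 with probability p and 0 otherwise, and that a player for whom strict Pavlov prescribes 0 always chooses 0. The function names, the value p = 0.5, the starting configuration, the seed, and the iteration cap are all illustrative assumptions.

```python
import random


def optimistic_pavlov_step(config, p, rng):
    """One iteration of Optimistic Pavlov (illustrative reading of the rule).

    Strict Pavlov prescribes x_i + x_{i-1} (mod 2). When that value is 1,
    the player chooses 1 only with probability p; otherwise chooses 0.
    """
    n = len(config)
    new_config = []
    for i in range(n):
        strict_choice = (config[i] + config[i - 1]) % 2  # index -1 wraps round
        if strict_choice == 1 and rng.random() < p:
            new_config.append(1)
        else:
            new_config.append(0)
    return tuple(new_config)


def iterations_to_cooperation(config, p, rng, max_iters=10_000):
    """Iterations until all players choose 0, or None if the cap is hit."""
    for t in range(max_iters + 1):
        if not any(config):
            return t
        config = optimistic_pavlov_step(config, p, rng)
    return None


# A three-player group that strict Pavlov would never bring to cooperation.
rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
result = iterations_to_cooperation((1, 0, 0), p=0.5, rng=rng)
print(result)
```

Passing an explicit `random.Random` instance keeps runs repeatable and makes it easy to average the stopping time over many seeds when comparing group sizes or values of p.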