Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Nash equilibrium Non-standard preferences Experimental design Results Nash Equilibrium in Tullock Contests Aidas Masiliunas1 1 Aix-Marseille School of Economics Controversies in Game Theory III, ETH Zurich 2 June, 2016 Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects Rent-seeking (Tullock) contest Two players compete for a prize (16 ECU) by making costly investments (x1 , x2 ≤ 16) Higher investments increase the probability to win the prize Probability that player i receives the prize: xi xi +xj Nash equilibrium Non-standard preferences Experimental design Results Other projects Rent-seeking (Tullock) contest Two players compete for a prize (16 ECU) by making costly investments (x1 , x2 ≤ 16) Higher investments increase the probability to win the prize Probability that player i receives the prize: Applications: Competition for monopoly rents Investments in R&D Competition for a promotion/bonus Political contests xi xi +xj Nash equilibrium Non-standard preferences Experimental design Results Theory xi xi +xj · 16 + 16 − xi p BRi (xj ) : xi∗ = 16xj − xj E (π) = RNNE : xi∗ = 4, dominance solvable in three steps. 10 9 8 7 6 5 4 3 2 1 Best Response 12 14 15 16 Standard preferences 1 2 3 4 5 6 7 8 9 Other plays 10 11 12 13 14 15 16 Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects Explanatory power of Nash equilibrium in experiments 7.04% of choices are exactly Nash 60.19% of choices are strictly dominated Investments are spread across the whole strategy space Experience does not help Less stability compared to auctions Nash equilibrium Non-standard preferences Experimental design Results Comparative statics of Nash equilibrium An alternative to point predictions is comparative statics Is behaviour sensitive to changes in the Nash prediction? Other projects Nash equilibrium Non-standard preferences Experimental design Results Comparative statics of Nash equilibrium An alternative to point predictions is comparative statics Is behaviour sensitive to changes in the Nash prediction? Players 2 3 4 5 9 Nash 250 222 188 160 99 Mean investment 325 283 302 322 326 Source: Lim, Matros & Turocy, 2014 Other projects Nash equilibrium Non-standard preferences Experimental design Results Why should players choose Nash equilibrium? Interpretation #1: Nash equilibrium is the unique action profile that can be justified by common knowledge of rationality. Rationality = maximization of expected payoff given some belief. Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects Rationalizable strategies xi BR(xi ) 1 3 2 4 3 4 4 4 5 4 6 4 7 4 8 3 9 3 10 3 11 2 12 2 13 1 14 1 15 1 16 1 Nash equilibrium Non-standard preferences Experimental design Results Other projects Rationalizable strategies xi BR(xi ) 1 3 2 4 3 4 4 4 5 4 6 4 Rationality Rationalizable: 3, 4, 2, 1 7 4 8 3 9 3 10 3 11 2 12 2 13 1 14 1 15 1 16 1 Nash equilibrium Non-standard preferences Experimental design Results Other projects Rationalizable strategies xi BR(xi ) BR(BR(xi )) 1 3 4 2 4 4 3 4 4 4 4 4 5 4 4 6 4 4 7 4 4 8 3 4 9 3 4 10 3 4 11 2 4 12 2 4 13 1 3 Rationality Rationalizable: 3, 4, 2, 1 Rationality + belief that the opponent is rational Rationalizable: 3, 4 14 1 3 15 1 3 16 1 3 Nash equilibrium Non-standard preferences Experimental design Results Other projects Rationalizable strategies xi BR(xi ) BR(BR(xi )) BR(BR(BR(xi ))) 1 3 4 4 2 4 4 4 3 4 4 4 4 4 4 4 5 4 4 4 6 4 4 4 7 4 4 4 8 3 4 4 9 3 4 4 10 3 4 4 11 2 4 4 12 2 4 4 13 1 3 4 14 1 3 4 15 1 3 4 Rationality Rationalizable: 3, 4, 2, 1 Rationality + belief that the opponent is rational Rationalizable: 3, 4 Rationality + belief that the opponent is rational + belief that the opponent believes in my rationality Rationalizable: 4 16 1 3 4 Nash equilibrium Non-standard preferences Experimental design Results Other projects Rationalizable strategies xi BR(xi ) BR(BR(xi )) BR(BR(BR(xi ))) 1 3 4 4 2 4 4 4 3 4 4 4 4 4 4 4 5 4 4 4 6 4 4 4 7 4 4 4 8 3 4 4 9 3 4 4 10 3 4 4 11 2 4 4 12 2 4 4 13 1 3 4 14 1 3 4 15 1 3 4 Rationality Rationalizable: 3, 4, 2, 1 Rationality + belief that the opponent is rational Rationalizable: 3, 4 Rationality + belief that the opponent is rational + belief that the opponent believes in my rationality Rationalizable: 4 Epistemic definition of Nash equilibrium: common belief in rationality + simple belief hierarchy 16 1 3 4 Nash equilibrium Non-standard preferences Experimental design Results Other projects Why should players choose Nash equilibrium? Nash equilibrium is the unique action profile that cannot be ruled out by common knowledge of rationality. 1 2 3 Players care about expected payoffs Players have the ability to calculate expected payoffs and identify dominated strategies Players believe that other players satisfy 1-2, and believe that they believe that they satisfy 1-2... Nash equilibrium Non-standard preferences Experimental design Results Other projects Why should players choose Nash equilibrium? Nash equilibrium is the unique action profile that cannot be ruled out by common knowledge of rationality. 1 2 3 Players care about expected payoffs Players have the ability to calculate expected payoffs and identify dominated strategies Players believe that other players satisfy 1-2, and believe that they believe that they satisfy 1-2... Nash equilibrium is the rest point of various learning dynamics Belief-based learning, e.g. Cournot best-response, fictitious play Assumption 3 is not necessary Nash equilibrium Non-standard preferences Experimental design Results Other projects Why should players choose Nash equilibrium? Nash equilibrium is the unique action profile that cannot be ruled out by common knowledge of rationality. 1 2 3 Players care about expected payoffs Players have the ability to calculate expected payoffs and identify dominated strategies Players believe that other players satisfy 1-2, and believe that they believe that they satisfy 1-2... Nash equilibrium is the rest point of various learning dynamics Belief-based learning, e.g. Cournot best-response, fictitious play Assumption 3 is not necessary Payoff-based learning, e.g. reinforcement learning Players must be willing to explore, remember past payoffs, receive accurate feedback. Nash equilibrium Non-standard preferences Experimental design Which assumptions are violated? Results Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects Preference-based explanations: joy of winning Participants receive non-monetary utility from winning (Parco et al, 2005, Sheremeta, 2011) or lose utility after losing (Delgado et al., 2008). Sheremeta (2011) elicits joy of winning by implementing a contest where prize has no value. 12 10 9 8 1 2 3 4 5 6 7 Best Response 10 9 8 7 6 5 4 3 2 1 Best Response 12 14 15 16 Joy of winning with w = 8 14 15 16 Joy of winning with w=3 1 2 3 4 5 6 7 8 9 Other plays 10 11 12 13 14 15 16 1 2 3 4 5 6 7 8 9 Other plays 10 11 12 13 14 15 16 Nash equilibrium Non-standard preferences Experimental design Results Other projects Preference-based explanations: risk preferences CRRA untility function: u(πi ) = πi1−ρ 1−ρ Risk aversion if ρ = 0.5, risk seeking if ρ = −0.5 12 10 9 8 1 2 3 4 5 6 7 Best Response 10 9 8 7 6 5 4 3 2 1 Best Response 12 14 15 16 Risk seeking 14 15 16 Risk aversion 1 2 3 4 5 6 7 8 9 Other plays 10 11 12 13 14 15 16 1 2 3 4 5 6 7 8 9 Other plays 10 11 12 13 14 15 16 Nash equilibrium Non-standard preferences Experimental design Results Preference-based explanations: social preferences Fehr & Schmidt (1999) inequality aversion: πi − α(πj − πi ) if πi ≤ πj u(πi , πj ) = πi − β(πi − πj ) if πi > πj Fehr and Schmidt (1999) inequality aversion Best Response 1 2 3 4 5 6 7 8 9 10 11 13 15 a=0, b=0 a=0.5, b=0 a=1, b=0 1 2 3 4 5 6 7 8 9 10 Other plays 12 14 15 16 Other projects Nash equilibrium Non-standard preferences Experimental design All preferences from Sheremeta (2015) Results Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects ”Behavioral Variation in Tullock Contests”, joint with F. Mengel and Ph. Reiss Deviations from NE could be a result of bounded rationality Players optimize given the feedback in previous rounds. Noisy feedback prevents players from discovering optimal actions Nash equilibrium Non-standard preferences Experimental design Results Other projects ”Behavioral Variation in Tullock Contests”, joint with F. Mengel and Ph. Reiss Deviations from NE could be a result of bounded rationality Players optimize given the feedback in previous rounds. Noisy feedback prevents players from discovering optimal actions Research questions: Can we identify whether deviations from NE are a result of bounded rationality or of preferences? Is behavioral variability lower and choices closer to theoretical predictions when feedback is more informative? Nash equilibrium Non-standard preferences Experimental design Results Other projects How informative is the feedback that players observe? Reinforcement learning converges to NE as t → ∞ In experiments players rely on small samples of experience Nash equilibrium Non-standard preferences Experimental design Results Other projects How informative is the feedback that players observe? Reinforcement learning converges to NE as t → ∞ In experiments players rely on small samples of experience Suppose that players always choose the action that yielded highest average payoff in the past. Nash equilibrium Non-standard preferences Experimental design Results Other projects How informative is the feedback that players observe? Reinforcement learning converges to NE as t → ∞ In experiments players rely on small samples of experience Suppose that players always choose the action that yielded highest average payoff in the past. Nash equilibrium Non-standard preferences Experimental design Results Other projects Feedback depends on other’s choices and lottery outcomes Nash equilibrium Non-standard preferences Experimental design Treatment 1: eliminate lottery allocation Results Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects Treatment 2: eliminate variability of opponent’s choices Nash equilibrium Non-standard preferences Treatment 3: eliminate both Experimental design Results Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects How easy is it to learn in different treatments? Estimate the likelihood that action 4 will yield a higher average payoff than action 6. Π(4) > Π(6) 100 % of iterations 75 50 25 ● 0 0 10 20 30 Memory length Shared prize, fixed actions Shared prize, changing actions Lottery, fixed actions Lottery, changing actions 40 50 Nash equilibrium Non-standard preferences Experimental design Results Other projects How easy is it to learn in different treatments? Estimate the likelihood that action 4 will yield a higher average payoff than action 6. Π(4) > Π(6) 100 % of iterations 75 50 25 ● 0 0 10 20 30 Memory length Shared prize, fixed actions Shared prize, changing actions Lottery, fixed actions Lottery, changing actions 40 50 Nash equilibrium Non-standard preferences Experimental design Results Other projects How easy is it to learn in different treatments? Estimate the likelihood that action 4 will yield a higher average payoff than action 6. Π(4) > Π(6) 100 % of iterations 75 50 25 ● 0 0 10 20 30 Memory length Shared prize, fixed actions Shared prize, changing actions Lottery, fixed actions Lottery, changing actions 40 50 Nash equilibrium Non-standard preferences Experimental design Results Other projects How easy is it to learn in different treatments? Estimate the likelihood that action 4 will yield a higher average payoff than action 6. Π(4) > Π(6) 100 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● % of iterations 75 50 25 ● 0 0 10 20 30 Memory length Shared prize, fixed actions Shared prize, changing actions Lottery, fixed actions Lottery, changing actions 40 50 Nash equilibrium Non-standard preferences Experimental design Results Other projects Procedure 40 rounds, divided into 4 blocks of 10 rounds Each block divided into experimentation phase (rounds 1-5) and incentivized phase (rounds 6-10) Non-incentivized 1 Incentivized 5 6 Block 1 Non-incentivized 10 11 Incentivized 15 16 Block 2 Non-incentivized 20 21 Incentivized 25 26 Block 3 Non-incentivized 30 31 Incentivized 35 36 40 Block 4 One round from each block randomly chosen for payment Incentivized numeracy test at the end of the experiment Average earnings 15.15 euro, duration 60 minutes Nash equilibrium Non-standard preferences Experimental design Results Other projects Explanatory power of Nash equilibrium P(x = NE ) P(x = BR) P(|x − NE | ≤ 1) P(|x − BR| ≤ 1) P(x > 4) Changing actions Lottery EV 7.04% 13.33% 25.74% 32.78% 60.19% 62.78% Fixed actions Lottery EV 22.50% 65.23% 47.95% 83.64% 51.36% 16.14% Absolute value of deviation from equilibrium significantly different between EV/Fixed treatment and the other three treatments, but not in other comparisons. Nash equilibrium Non-standard preferences Experimental design Results Other projects Behavioral variation Is the distribution of choices more concentrated? (not necessarily around NE) Entropy measures the stochastic variation of a random variable (0 = one strategy always chosen, 4 = all strategies chosen with equal frequency): X H=− pi log(pi ) i=1...16 Nash equilibrium Non-standard preferences Experimental design Results Other projects Behavioral variation Is the distribution of choices more concentrated? (not necessarily around NE) Entropy measures the stochastic variation of a random variable (0 = one strategy always chosen, 4 = all strategies chosen with equal frequency): X H=− pi log(pi ) i=1...16 Entropy Std. Dev. Changing actions Lottery EV 3.22 2.79 3.28 2.56 Fixed actions Lottery EV 2.45 1.50 3.15 1.16 Nash equilibrium Non-standard preferences Experimental design Best-response curves in Fixed treatments Results Other projects Nash equilibrium Non-standard preferences Experimental design Stability of choices and convergence Results Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects Stability of choices and convergence Changing strategies between rounds in experimentation and incentivized rounds. Nash equilibrium Non-standard preferences Experimental design Results Other projects Replacing humans by computers Playing against a computer player is different than playing against a human player: no social preferences, lower joy of winning (?) Nash equilibrium Non-standard preferences Experimental design Results Other projects Replacing humans by computers Playing against a computer player is different than playing against a human player: no social preferences, lower joy of winning (?) Additional treatment replacing computers by human players. All effects replicate if Fixed/EV treatment is replaced by this treatment. P(x = NE ) P(x = BR) P(|x − NE | ≤ 1) P(|x − BR| ≤ 1) P(x > 4) Entropy Std. Dev. Changing actions Lottery EV 7.04% 13.33% 25.74% 32.78% 60.19% 62.78% 3.22 2.79 3.28 2.56 Lottery 22.50% 47.95% 51.36% 2.45 3.15 Fixed actions EV EV-Human 65.23% 50.42% 83.64% 74.58% 16.14% 23.33% 1.50 1.13 1.16 0.91 Nash equilibrium Non-standard preferences Experimental design Results Other projects Strategic uncertainty vs stability Matching players to computers has two effects: The action of the other party is stable over time, hence it is easier to learn. Players face no strategic uncertainty, hence it is easier to optimize Nash equilibrium Non-standard preferences Experimental design Results Other projects Strategic uncertainty vs stability Matching players to computers has two effects: The action of the other party is stable over time, hence it is easier to learn. Players face no strategic uncertainty, hence it is easier to optimize Is stability of choices necessary in addition to the removal of strategic uncertainty? Design: computer plays actions from the baseline contest, players know these actions. Nash equilibrium Non-standard preferences Experimental design Results Other projects Strategic uncertainty vs stability Matching players to computers has two effects: The action of the other party is stable over time, hence it is easier to learn. Players face no strategic uncertainty, hence it is easier to optimize Is stability of choices necessary in addition to the removal of strategic uncertainty? Design: computer plays actions from the baseline contest, players know these actions. P(a = NE ) P(a = BR) P(|a − NE | ≤ 1) P(|a − BR| ≤ 1) P(a > 4) Changing actions Lottery EV 7.04% 13.33% 25.74% 32.78% 60.19% 62.78% Changing Lottery 7.59% 25.00% 62.96% but known EV 25.37% 51.85% 47.04% Fixed actions Lottery EV 22.50% 65.23% 47.95% 83.64% 51.36% 16.14% Nash equilibrium Non-standard preferences Experimental design Strategic uncertainty vs stability Results Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects Contests with forgone payoff information Conclusion from the first paper: when feedback is more informative about the quality of actions, players make better choices. Can we improve the quality of feedback without changing the nature of the game? Nash equilibrium Non-standard preferences Experimental design Results Other projects Contests with forgone payoff information Conclusion from the first paper: when feedback is more informative about the quality of actions, players make better choices. Can we improve the quality of feedback without changing the nature of the game? Hypothesis: more information and higher quality of information increases the rate of learning Design: 10 rounds of standard contest, 20 rounds of contest with foregone payoff information, 10 rounds of standard contest Nash equilibrium Non-standard preferences Experimental design Results ”Contests with foregone payoff information” Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects Hypotheses: reinforcement learning simulation Π(2) > Π(4) 100 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 75 ● ● ● % of iterations ● ● 50 25 ● 0 0 10 20 30 Memory length Same actions, same random numbers Different actions, same random numbers Same actions, different random numbers Different actions, different random numbers 40 50 Nash equilibrium Non-standard preferences Results: average investments Experimental design Results Other projects Nash equilibrium Non-standard preferences Experimental design Results: dominated strategies Results Other projects Nash equilibrium Non-standard preferences Experimental design Results Other projects Payoff based learning, joint with H. Nax 0 200 invest 400 600 800 Calculating expected values is very complicated Convergence is much higher when players can use a payoff table/calculator and with neutral framing 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Nash equilibrium Non-standard preferences Experimental design Results Other projects Summary Nash equilibrium has a very low explanatory power in Tullock contests Explanatory power is much higher when actions have direct payoff consequences Providing additional feedback about foregone payoff information does not improve the explanatory power Paying the expected payoffs does not improve learning, unless players know these payoffs.