Nash Equilibrium in Tullock Contests
Aidas Masiliunas
Aix-Marseille School of Economics
Controversies in Game Theory III, ETH Zurich
2 June, 2016
Outline: Nash equilibrium, Non-standard preferences, Experimental design, Results, Other projects
Rent-seeking (Tullock) contest
Two players compete for a prize (16 ECU) by making costly
investments (x1 , x2 ≤ 16)
Higher investments increase the probability of winning the prize
Probability that player i receives the prize: $\frac{x_i}{x_i + x_j}$
Applications:
Competition for monopoly rents
Investments in R&D
Competition for a promotion/bonus
Political contests
Theory
$E(\pi_i) = \frac{x_i}{x_i + x_j} \cdot 16 + 16 - x_i$
$BR_i(x_j): x_i^* = \sqrt{16 x_j} - x_j$
RNNE: $x_i^* = 4$, dominance solvable in three steps.
[Figure: best response under standard preferences. Horizontal axis: other plays (1 to 16); vertical axis: best response.]
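The best-response formula on this slide follows from the first-order condition of the expected payoff. A short derivation, assuming an interior solution and treating investments as continuous (the tables later in the talk suggest the experiment itself used integer investments):

```latex
\begin{align*}
E(\pi_i) &= 16\,\frac{x_i}{x_i + x_j} + 16 - x_i \\
\frac{\partial E(\pi_i)}{\partial x_i} &= \frac{16\,x_j}{(x_i + x_j)^2} - 1 = 0
  \;\Longrightarrow\; (x_i + x_j)^2 = 16\,x_j
  \;\Longrightarrow\; x_i^* = \sqrt{16\,x_j} - x_j \\
\frac{\partial^2 E(\pi_i)}{\partial x_i^2} &= -\frac{32\,x_j}{(x_i + x_j)^3} < 0 \quad (x_j > 0) \\
x^* = \sqrt{16\,x^*} - x^* &\;\Longrightarrow\; 2x^* = 4\sqrt{x^*} \;\Longrightarrow\; x^* = 4
\end{align*}
```

The negative second derivative confirms that the first-order condition picks out a maximum, and the last line imposes symmetry to obtain the risk-neutral Nash equilibrium.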
Explanatory power of Nash equilibrium in experiments
7.04% of choices are exactly Nash
60.19% of choices are strictly dominated
Investments are spread across the whole strategy space
Experience does not help
Less stability compared to auctions
Comparative statics of Nash equilibrium
An alternative to point predictions is comparative statics
Is behaviour sensitive to changes in the Nash prediction?
Players            2     3     4     5     9
Nash             250   222   188   160    99
Mean investment  325   283   302   322   326
Source: Lim, Matros & Turocy, 2014
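The Nash row is consistent with the standard symmetric equilibrium of an n-player Tullock lottery contest, x* = V(n − 1)/n². The sketch below recomputes it and sets it against the observed means. The prize value V = 1000 is an assumption inferred from the numbers in the table, not something stated on the slide, and the function name is illustrative.

```python
# Symmetric risk-neutral Nash investment in an n-player Tullock lottery
# contest with prize V: x* = V * (n - 1) / n**2.
# ASSUMPTION: V = 1000 is inferred from the table, not stated in the slides.
PRIZE = 1000

# Values copied from the table (Lim, Matros & Turocy, 2014).
players = [2, 3, 4, 5, 9]
nash_on_slide = [250, 222, 188, 160, 99]
mean_investment = [325, 283, 302, 322, 326]


def symmetric_nash(n: int, prize: float = PRIZE) -> float:
    """Equilibrium investment per player with n symmetric contestants."""
    return prize * (n - 1) / n**2


for n, slide, mean in zip(players, nash_on_slide, mean_investment):
    prediction = symmetric_nash(n)
    print(f"n={n}: Nash {prediction:6.1f} (slide {slide}), "
          f"observed mean {mean}, ratio {mean / prediction:.2f}")
```

The prediction falls from 250 to 99 as the group grows from 2 to 9 players, while the observed means stay between 283 and 326.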
Why should players choose Nash equilibrium?
Interpretation #1: Nash equilibrium is the unique action
profile that can be justified by common knowledge of
rationality.
Rationality = maximization of expected payoff given some
belief.
Rationalizable strategies
x_i   BR(x_i)   BR(BR(x_i))   BR(BR(BR(x_i)))
 1       3           4               4
 2       4           4               4
 3       4           4               4
 4       4           4               4
 5       4           4               4
 6       4           4               4
 7       4           4               4
 8       3           4               4
 9       3           4               4
10       3           4               4
11       2           4               4
12       2           4               4
13       1           3               4
14       1           3               4
15       1           3               4
16       1           3               4
Rationality
Rationalizable: 3, 4, 2, 1
Rationality + belief that the opponent is rational
Rationalizable: 3, 4
Rationality + belief that the opponent is rational + belief
that the opponent believes in my rationality
Rationalizable: 4
Epistemic definition of Nash equilibrium: common belief in
rationality + simple belief hierarchy
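The table can be reproduced by iterating the best-response map on the set of surviving actions. A minimal sketch follows. It assumes integer investments from 1 to 16 (as the table suggests) and, like the table, only considers best responses to point beliefs; a full treatment of rationalizability would also allow beliefs that mix over opponent actions. All names are illustrative.

```python
from fractions import Fraction

PRIZE = 16
ENDOWMENT = 16
# ASSUMPTION: investments are integers from 1 to 16, as the table suggests.
ACTIONS = range(1, 17)


def expected_payoff(own: int, other: int) -> Fraction:
    """Risk-neutral expected payoff: win probability times prize, plus endowment, minus cost."""
    return Fraction(own, own + other) * PRIZE + ENDOWMENT - own


def best_response(other: int) -> int:
    """Payoff-maximizing integer investment against a given opponent investment."""
    return max(ACTIONS, key=lambda own: expected_payoff(own, other))


def surviving_actions(rounds: int) -> list[int]:
    """Actions that are a best response to some action surviving the previous round."""
    survivors = set(ACTIONS)
    for _ in range(rounds):
        survivors = {best_response(other) for other in survivors}
    return sorted(survivors)


br_column = [best_response(x) for x in ACTIONS]
print(br_column)
for k in (1, 2, 3):
    print(k, surviving_actions(k))
```

Exact fractions are used so that the comparison of neighbouring investments is not affected by floating-point rounding.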
Why should players choose Nash equilibrium?
Nash equilibrium is the unique action profile that cannot be
ruled out by common knowledge of rationality.
1. Players care about expected payoffs
2. Players have the ability to calculate expected payoffs and
identify dominated strategies
3. Players believe that other players satisfy 1-2, and believe that
they believe that they satisfy 1-2...
Nash equilibrium is the rest point of various learning dynamics
Belief-based learning, e.g. Cournot best-response, fictitious
play
Assumption 3 is not necessary
Payoff-based learning, e.g. reinforcement learning
Players must be willing to explore, remember past payoffs,
receive accurate feedback.
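Cournot best-response dynamics, one of the belief-based rules named above, can be sketched in a few lines. The example assumes integer investments from 1 to 16 and simultaneous updating, in which each player best responds to the opponent's previous action; the slides do not specify an updating protocol, so this is an illustration rather than the model used in the talk.

```python
PRIZE = 16
ENDOWMENT = 16
# ASSUMPTION: integer investments from 1 to 16, matching the tables in the slides.
ACTIONS = range(1, 17)


def expected_payoff(own: int, other: int) -> float:
    """Risk-neutral expected payoff of investing `own` against `other`."""
    return PRIZE * own / (own + other) + ENDOWMENT - own


def best_response(other: int) -> int:
    """Payoff-maximizing integer investment against a given opponent investment."""
    return max(ACTIONS, key=lambda own: expected_payoff(own, other))


def cournot_path(start: tuple[int, int], rounds: int) -> list[tuple[int, int]]:
    """Both players simultaneously best respond to the opponent's previous action."""
    path = [start]
    for _ in range(rounds):
        mine, yours = path[-1]
        path.append((best_response(yours), best_response(mine)))
    return path


print(cournot_path((16, 1), rounds=4))
```

Each player here only needs to compute a best response to an observed action, which is why assumption 3 is not required.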
Which assumptions are violated?
Preference-based explanations: joy of winning
Participants receive non-monetary utility from winning (Parco
et al., 2005; Sheremeta, 2011) or lose utility after losing
(Delgado et al., 2008).
Sheremeta (2011) elicits joy of winning by implementing a
contest where the prize has no value.
[Figure: best response under joy of winning, two panels (w = 8 and w = 3). Horizontal axis: other plays (1 to 16); vertical axis: best response.]
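One common way to model joy of winning, assumed here because the slides do not spell out the functional form, is to add a non-monetary value w to the prize. Under that assumption the risk-neutral derivation goes through with 16 + w in place of 16:

```latex
\begin{align*}
E(u_i) &= (16 + w)\,\frac{x_i}{x_i + x_j} + 16 - x_i \\
BR_i(x_j):\; x_i^* &= \sqrt{(16 + w)\,x_j} - x_j \\
\text{symmetric equilibrium: } x^* &= \frac{16 + w}{4}
  \qquad (w = 3 \Rightarrow x^* = 4.75,\quad w = 8 \Rightarrow x^* = 6)
\end{align*}
```

These are continuous-investment values; on an integer grid the best responses would be rounded to a neighbouring investment level.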
Preference-based explanations: risk preferences
CRRA utility function: $u(\pi_i) = \frac{\pi_i^{1-\rho}}{1-\rho}$
Risk aversion if ρ = 0.5, risk seeking if ρ = −0.5
[Figure: best response under CRRA risk preferences, two panels (risk seeking and risk aversion). Horizontal axis: other plays (1 to 16); vertical axis: best response.]
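Best responses under CRRA utility have no simple closed form, but they can be found by searching over the investment grid. The sketch below makes two assumptions that the slide does not state: utility is evaluated on the final payoff (endowment minus investment, plus the prize if won), and investments are integers from 1 to 16. Function names are illustrative.

```python
PRIZE = 16
ENDOWMENT = 16
# ASSUMPTION: integer investments from 1 to 16, as elsewhere in the slides.
ACTIONS = range(1, 17)


def crra(payoff: float, rho: float) -> float:
    """CRRA utility u(pi) = pi**(1 - rho) / (1 - rho); only valid here for rho < 1."""
    return payoff ** (1 - rho) / (1 - rho)


def expected_utility(own: int, other: int, rho: float) -> float:
    """Expected CRRA utility over the two lottery outcomes (win or lose the prize)."""
    win_probability = own / (own + other)
    payoff_win = ENDOWMENT - own + PRIZE
    payoff_lose = ENDOWMENT - own
    return (win_probability * crra(payoff_win, rho)
            + (1 - win_probability) * crra(payoff_lose, rho))


def best_response(other: int, rho: float) -> int:
    """Utility-maximizing integer investment against a given opponent investment."""
    return max(ACTIONS, key=lambda own: expected_utility(own, other, rho))


for label, rho in [("risk neutral", 0.0), ("risk averse", 0.5), ("risk seeking", -0.5)]:
    print(label, [best_response(other, rho) for other in ACTIONS])
```

With ρ = 0 the utility is linear, so the first printed row coincides with the risk-neutral best responses.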
Preference-based explanations: social preferences
Fehr & Schmidt (1999) inequality aversion:
$u(\pi_i, \pi_j) = \begin{cases} \pi_i - \alpha(\pi_j - \pi_i) & \text{if } \pi_i \le \pi_j \\ \pi_i - \beta(\pi_i - \pi_j) & \text{if } \pi_i > \pi_j \end{cases}$
[Figure: best response under Fehr and Schmidt (1999) inequality aversion. Legend: a=0, b=0; a=0.5, b=0; a=1, b=0. Horizontal axis: other plays (1 to 16); vertical axis: best response.]
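The inequality-aversion utility translates directly into code. The sketch below evaluates it on realized payoffs in each lottery outcome and searches the investment grid for the best response. Applying the utility to ex post payoffs, taking each payoff to include the 16 ECU endowment, and restricting investments to integers from 1 to 16 are assumptions made for illustration; the slide gives only the utility function.

```python
PRIZE = 16
ENDOWMENT = 16
ACTIONS = range(1, 17)  # ASSUMPTION: integer investments from 1 to 16


def fehr_schmidt(own_payoff: float, other_payoff: float, alpha: float, beta: float) -> float:
    """Fehr-Schmidt utility: alpha penalizes being behind, beta penalizes being ahead."""
    if own_payoff <= other_payoff:
        return own_payoff - alpha * (other_payoff - own_payoff)
    return own_payoff - beta * (own_payoff - other_payoff)


def expected_utility(own: int, other: int, alpha: float, beta: float) -> float:
    """Expected utility over the two lottery outcomes, using realized payoffs."""
    win_probability = own / (own + other)
    # Final payoffs of (me, opponent) in each lottery outcome.
    if_i_win = (ENDOWMENT - own + PRIZE, ENDOWMENT - other)
    if_i_lose = (ENDOWMENT - own, ENDOWMENT - other + PRIZE)
    return (win_probability * fehr_schmidt(*if_i_win, alpha, beta)
            + (1 - win_probability) * fehr_schmidt(*if_i_lose, alpha, beta))


def best_response(other: int, alpha: float, beta: float) -> int:
    """Utility-maximizing integer investment against a given opponent investment."""
    return max(ACTIONS, key=lambda own: expected_utility(own, other, alpha, beta))


for alpha in (0.0, 0.5, 1.0):
    print(f"a={alpha}, b=0:", [best_response(other, alpha, 0.0) for other in ACTIONS])
```

Setting both parameters to zero recovers the standard risk-neutral best responses, which gives a quick consistency check.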
All preferences from Sheremeta (2015)
"Behavioral Variation in Tullock Contests", joint with F.
Mengel and Ph. Reiss
Deviations from NE could be a result of bounded rationality
Players optimize given the feedback in previous rounds.
Noisy feedback prevents players from discovering optimal
actions
Research questions:
Can we identify whether deviations from NE are a result of
bounded rationality or of preferences?
Is behavioral variability lower and choices closer to theoretical
predictions when feedback is more informative?
How informative is the feedback that players observe?
Reinforcement learning converges to NE as t → ∞
In experiments players rely on small samples of experience
Suppose that players always choose the action that yielded
the highest average payoff in the past.
Feedback depends on the other player's choices and on lottery outcomes
Treatment 1: eliminate lottery allocation
Treatment 2: eliminate variability of opponent’s choices
Treatment 3: eliminate both
How easy is it to learn in different treatments?
Estimate the likelihood that action 4 will yield a higher
average payoff than action 6.
[Figure: % of iterations in which Π(4) > Π(6) (0 to 100) against memory length (0 to 50). Series: shared prize, fixed actions; shared prize, changing actions; lottery, fixed actions; lottery, changing actions.]
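A simulation of this kind can be sketched as follows. It is not the authors' code: the opponent's behaviour (always 4 when fixed, uniform on 1 to 16 when changing), the number of iterations, and all function names are assumptions made for illustration. For each memory length, the sketch draws that many remembered rounds for action 4 and for action 6 and records how often action 4 has the higher average payoff.

```python
import random

PRIZE = 16
ENDOWMENT = 16


def payoff(own: int, other: int, lottery: bool, rng: random.Random) -> float:
    """One-round payoff: the prize is either raffled (lottery) or shared in proportion."""
    share = own / (own + other)
    if lottery:
        won = rng.random() < share
        return ENDOWMENT - own + (PRIZE if won else 0)
    return ENDOWMENT - own + PRIZE * share


def average_payoff(action: int, memory: int, lottery: bool, fixed: bool,
                   rng: random.Random) -> float:
    """Mean payoff of `action` over `memory` remembered rounds."""
    total = 0.0
    for _ in range(memory):
        # ASSUMPTION: a fixed opponent always plays 4; a changing one is uniform on 1..16.
        other = 4 if fixed else rng.randint(1, 16)
        total += payoff(action, other, lottery, rng)
    return total / memory


def share_ranked_correctly(memory: int, lottery: bool, fixed: bool,
                           iterations: int = 2000, seed: int = 0) -> float:
    """Fraction of simulated histories in which action 4 beats action 6 on average."""
    rng = random.Random(seed)
    hits = sum(
        average_payoff(4, memory, lottery, fixed, rng)
        > average_payoff(6, memory, lottery, fixed, rng)
        for _ in range(iterations)
    )
    return hits / iterations


for lottery in (False, True):
    for fixed in (True, False):
        label = f"{'lottery' if lottery else 'shared prize'}, {'fixed' if fixed else 'changing'} actions"
        print(label, [share_ranked_correctly(m, lottery, fixed) for m in (5, 20, 50)])
```

With a shared prize and a fixed opponent the payoffs are deterministic, so action 4 is ranked above action 6 in every iteration; the other three conditions add noise from the lottery, from the opponent, or from both.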
Procedure
40 rounds, divided into 4 blocks of 10 rounds
Each block divided into experimentation phase (rounds 1-5)
and incentivized phase (rounds 6-10)
Block 1: rounds 1-5 non-incentivized, rounds 6-10 incentivized
Block 2: rounds 11-15 non-incentivized, rounds 16-20 incentivized
Block 3: rounds 21-25 non-incentivized, rounds 26-30 incentivized
Block 4: rounds 31-35 non-incentivized, rounds 36-40 incentivized
One round from each block randomly chosen for payment
Incentivized numeracy test at the end of the experiment
Average earnings 15.15 euro, duration 60 minutes
Explanatory power of Nash equilibrium
                                      Changing actions     Fixed actions
                                      Lottery    EV        Lottery    EV
P(x = NE) or P(x = BR)                 7.04%   13.33%      22.50%   65.23%
P(|x − NE| ≤ 1) or P(|x − BR| ≤ 1)    25.74%   32.78%      47.95%   83.64%
P(x > 4)                              60.19%   62.78%      51.36%   16.14%
Absolute value of deviation from equilibrium significantly different
between EV/Fixed treatment and the other three treatments, but not in
other comparisons.
Behavioral variation
Is the distribution of choices more concentrated? (not
necessarily around NE)
Entropy measures the stochastic variation of a random
variable (0 = one strategy always chosen, 4 = all strategies
chosen with equal frequency):
$H = -\sum_{i=1}^{16} p_i \log_2(p_i)$
             Changing actions     Fixed actions
             Lottery    EV        Lottery    EV
Entropy       3.22     2.79        2.45     1.50
Std. Dev.     3.28     2.56        3.15     1.16
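The entropy measure is straightforward to compute from a list of observed choices. A minimal sketch, using base-2 logarithms so that 16 equally likely strategies give the stated maximum of 4; the function name is illustrative.

```python
import math
from collections import Counter


def choice_entropy(choices: list[int]) -> float:
    """Shannon entropy (base 2) of the empirical distribution of choices."""
    counts = Counter(choices)
    total = len(choices)
    # H = sum p * log2(1 / p); strategies never chosen are absent and contribute 0.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


print(choice_entropy([4] * 20))            # one strategy always chosen
print(choice_entropy(list(range(1, 17))))  # all 16 strategies chosen equally often
```

Unlike the standard deviation, entropy ignores how far apart the chosen investments are, so it measures concentration on a few strategies rather than closeness to any particular prediction.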
Best-response curves in Fixed treatments
Stability of choices and convergence
Changing strategies between rounds in experimentation and
incentivized rounds.
Replacing humans by computers
Playing against a computer player is different from playing
against a human player: no social preferences, lower joy of
winning (?)
Additional treatment replacing computers by human players.
All effects replicate if Fixed/EV treatment is replaced by this
treatment.
                                      Changing actions     Fixed actions
                                      Lottery    EV        Lottery    EV       EV-Human
P(x = NE) or P(x = BR)                 7.04%   13.33%      22.50%   65.23%    50.42%
P(|x − NE| ≤ 1) or P(|x − BR| ≤ 1)    25.74%   32.78%      47.95%   83.64%    74.58%
P(x > 4)                              60.19%   62.78%      51.36%   16.14%    23.33%
Entropy                                3.22     2.79        2.45     1.50      1.13
Std. Dev.                              3.28     2.56        3.15     1.16      0.91
Strategic uncertainty vs stability
Matching players to computers has two effects:
The action of the other party is stable over time, hence it is
easier to learn.
Players face no strategic uncertainty, hence it is easier to
optimize
Is stability of choices necessary in addition to the removal of
strategic uncertainty?
Design: computer plays actions from the baseline contest,
players know these actions.
                                      Changing actions     Changing but known   Fixed actions
                                      Lottery    EV        Lottery    EV        Lottery    EV
P(a = NE) or P(a = BR)                 7.04%   13.33%       7.59%   25.37%      22.50%   65.23%
P(|a − NE| ≤ 1) or P(|a − BR| ≤ 1)    25.74%   32.78%      25.00%   51.85%      47.95%   83.64%
P(a > 4)                              60.19%   62.78%      62.96%   47.04%      51.36%   16.14%
Contests with foregone payoff information
Conclusion from the first paper: when feedback is more
informative about the quality of actions, players make better
choices.
Can we improve the quality of feedback without changing the
nature of the game?
Hypothesis: more information and higher quality of
information increase the rate of learning
Design: 10 rounds of standard contest, 20 rounds of contest
with foregone payoff information, 10 rounds of standard
contest
"Contests with foregone payoff information"
Hypotheses: reinforcement learning simulation
[Figure: % of iterations in which Π(2) > Π(4) (0 to 100) against memory length (0 to 50). Series: same actions, same random numbers; different actions, same random numbers; same actions, different random numbers; different actions, different random numbers.]
Results: average investments
Results: dominated strategies
Payoff based learning, joint with H. Nax
Calculating expected values is very complicated
Convergence is much higher when players can use a payoff
table/calculator and with neutral framing
[Figure: investment (axis label "invest", 0 to 800) plotted over a horizontal axis running from 1 to 20.]
Summary
Nash equilibrium has very low explanatory power in Tullock
contests
Explanatory power is much higher when actions have direct
payoff consequences
Providing additional feedback about foregone payoff
information does not improve the explanatory power
Paying the expected payoffs does not improve learning, unless
players know these payoffs.