Download cs2005gametheory - University of Exeter

Game Theory
• Game theory models strategic behavior by agents
who understand that their actions affect the actions
of other agents.
• Useful to study
– Company behavior in imperfectly competitive markets
(such as Coke vs. Pepsi).
– Military strategies.
– Bargaining/Negotiations.
– Biology
• A game consists of players, strategies, and
payoffs.
Battle of Bismarck Sea
Payoffs are (Kenny, Imamura), in days of bombing.
                    Imamura: North    Imamura: South
Kenny: North        2, -2             2, -2
Kenny: South        1, -1             3, -3
• Imamura wants to run a troop convoy from Rabaul to Lae.
• Kenny wants to bomb the Japanese troops.
• The northern route takes two days, the southern route three days.
• It takes one day for Kenny to switch routes.
The Prisoners’ Dilemma
Payoffs are (Bonnie, Clyde); S = stay silent, C = confess.
                  Clyde: S      Clyde: C
Bonnie: S         -5, -5        -15, -1
Bonnie: C         -1, -15       -10, -10
Bonnie and Clyde are caught.
They can confess or be silent.
Nash equilibrium
• A Nash equilibrium is a set of strategies
– Where each player has no incentive to deviate given
the other players’ strategies.
– (or) Given other equilibrium strategies a player would
choose his equilibrium strategy.
– (or) A best response to a best response.
• A pure-strategy equilibrium is one where each player
chooses a particular strategy with certainty.
• What are the pure-strategy equilibria in the Prisoners’
Dilemma?
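A minimal Python sketch of the "no incentive to deviate" check, applied to the Prisoners' Dilemma. The function name pure_nash and the dictionary layout are illustrative assumptions, not part of the slides; payoffs are read as (Bonnie, Clyde) with S = silent and C = confess.

```python
from itertools import product

def pure_nash(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs maps (row_strategy, col_strategy) -> (row_payoff, col_payoff).
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        row_pay, col_pay = payoffs[(r, c)]
        # A cell is an equilibrium if neither player gains by deviating alone.
        row_ok = all(payoffs[(r2, c)][0] <= row_pay for r2 in rows)
        col_ok = all(payoffs[(r, c2)][1] <= col_pay for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

# Assumed reading of the slide's table: (Bonnie's move, Clyde's move).
prisoners_dilemma = {
    ("S", "S"): (-5, -5),
    ("S", "C"): (-15, -1),
    ("C", "S"): (-1, -15),
    ("C", "C"): (-10, -10),
}

print(pure_nash(prisoners_dilemma))  # [('C', 'C')]
```

Under these payoffs the only pure-strategy equilibrium is both confessing, even though both staying silent would leave each better off.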
Trench Warfare: a prisoners’ dilemma
Payoffs are (Venus, Mars).
                     Mars: Not Shoot    Mars: Shoot
Venus: Not Shoot     -5, -5             -15, -1
Venus: Shoot         -1, -15            -10, -10
Are we doomed to the bad outcome?
Not in trench warfare of WWI.
This happens since the game is repeated –
that is, played several times.
Repeated games: forever punish
• Forever punish strategy: if someone cheats, punish
forever.
– Cheating gives short term gain but long term loss.
– Gain of 4 for the time of cheating.
– Loss of 5 from the next period on.
– Whether this stops cheating depends upon how you
value today compared to tomorrow (discount rate).
• Forever punish is not so great if there is uncertainty.
Why?
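The trade-off between a gain of 4 today and a loss of 5 in every later period can be sketched in Python. The discount factor delta (how much one unit tomorrow is worth today, between 0 and 1) and the function names are assumptions introduced for illustration; the payoffs -5, -1 and -10 are taken from the prisoners' dilemma table.

```python
from fractions import Fraction

def present_value(first, later, delta):
    """Value of `first` today plus `later` in every period after, discounted by delta."""
    return first + delta * later / (1 - delta)

def cooperation_holds(delta, cooperate=-5, cheat=-1, punish=-10):
    """True if sticking to cooperation is at least as good as cheating once."""
    stay = present_value(cooperate, cooperate, delta)
    deviate = present_value(cheat, punish, delta)
    return stay >= deviate

def critical_delta(cooperate=-5, cheat=-1, punish=-10):
    """Smallest discount factor at which forever punish deters cheating."""
    gain = cheat - cooperate   # one-period gain from cheating (4)
    loss = cooperate - punish  # per-period loss once punished (5)
    return Fraction(gain, gain + loss)

print(critical_delta())                    # 4/9
print(cooperation_holds(Fraction(1, 2)))   # True
print(cooperation_holds(Fraction(2, 5)))   # False
```

With these numbers, cheating is deterred only if the player values tomorrow at 4/9 or more of today.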
Repeated game: Tit for Tat
• A tit-for-tat strategy is to do whatever the other
player did last time.
• This helps with mistakes.
SSSSSCSCS
SSSSSSCSC
• One player can fix this by not choosing C once.
• The problem is that because choosing C now costs less, it is
more likely to occur (the benefit is still 4, but the cost is
now 3 or less for all later periods).
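The two move sequences can be reproduced with a short Python simulation. It is a sketch under assumptions: both players open with S, player A plays C by mistake in one round, and the function and parameter names (slip_round, forgive_round) are hypothetical.

```python
def tit_for_tat_with_slip(rounds, slip_round, forgive_round=None):
    """Simulate two tit-for-tat players; rounds are numbered from 1.

    Player A plays C by mistake in `slip_round`. If `forgive_round` is given,
    player A plays S in that round regardless of what B did before.
    """
    a_moves, b_moves = [], []
    for t in range(1, rounds + 1):
        # Tit for tat: copy the opponent's previous move (start with S).
        a = "S" if t == 1 else b_moves[-1]
        b = "S" if t == 1 else a_moves[-1]
        if t == slip_round:
            a = "C"  # A's one-off mistake
        if t == forgive_round:
            a = "S"  # A declines to retaliate once
        a_moves.append(a)
        b_moves.append(b)
    return "".join(a_moves), "".join(b_moves)

print(tit_for_tat_with_slip(rounds=9, slip_round=6))
# ('SSSSSCSCS', 'SSSSSSCSC')
print(tit_for_tat_with_slip(rounds=9, slip_round=6, forgive_round=8))
# ('SSSSSCSSS', 'SSSSSSCSS')
```

A single mistake echoes back and forth indefinitely, and one round of not retaliating ends the echo.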
Coordination Problem
Payoffs are (Sean, Jim).
                  Jim: VHS      Jim: Beta
Sean: VHS         1, 1          0, 0.5
Sean: Beta        0.5, 0        2, 2
• Jim and Sean want to have the same VCR.
• Beta is a better technology than VHS.
Information Technology
• Phones, faxes, e-mail, etc. all have the following
property:
– Network externalities: the more people use it, the
greater the benefit to each user.
• Computers, VCRs, and PS2s also have this property:
software can be traded among users, and the larger
the user market, the more software titles are made.
• How do markets operate with such externalities?
Discussion points
• Competitors: VHS vs. Beta, Qwerty vs. Dvorak,
Windows vs. Mac, Playstation vs. Xbox.
• Does the best always win?
• Standardization helps with network externalities.
– Drive on left side vs. right side. Out of 206 countries,
144 (70%) drive on the right-hand side.
– Left is more natural for an army: swords in right hand,
mounting horses. (Napoleon liked the other way.)
– Sweden switched from left to right in 1967.
• Lots of networks: Religions and Languages.
iChoice
• Apple makes the iPod mp3 player and sells
specially encoded music on iTunes.
• For the most part, they are exclusive.
• Should Apple open one or both of them up?
– Allow iTunes to work with other players.
– Allow the iPod to work with other sites.
iChoice
• By being exclusive, they may increase network
externalities.
– This may be a curse (no one buys it) or a blessing
(more buy it and they can charge more).
• History of Apple.
– The Apple II was somewhat open. Anyone could make
software and hardware for its slots. Result: lost some sales
to clones and suppliers, but very popular.
– The Mac was much more closed. No clones. Had control
over many suppliers. Right idea, but lost because of network
externalities. (Note IBM was open and won the battle
but lost out.)
Penalty Kick
Payoffs are (Kicker, Goalie).
                    Goalie: Dive L    Goalie: Dive R
Kicker: Kick L      -1, 1             1, -1
Kicker: Kick R      1, -1             -1, 1
• A Kicker can kick a ball left or right.
• A Goalie can dive left or right.
Mixed Strategy equilibrium
• Happens in the Penalty kick game.
• Notice that if the Kicker kicks 50-50 (.5L+.5R),
the Goalie is indifferent to diving left or right.
• If the Goalie dives 50-50 (.5L+.5R), the Kicker is
indifferent to kicking left or right.
• Thus, (.5L+.5R,.5L+.5R) is a mixed-strategy N.E.
• Nash showed that every finite game has at least one Nash
equilibrium, possibly in mixed strategies.
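The indifference argument can be checked numerically. This Python sketch assumes the kicker gets -1 when the goalie dives to the side of the kick and +1 otherwise, with the goalie getting the negative; the function name expected_payoffs is illustrative.

```python
from fractions import Fraction

def expected_payoffs(kicker_left, goalie_left):
    """Expected (kicker, goalie) payoffs given each side's probability of choosing L."""
    # Kicker's payoff by (kick, dive); the game is zero-sum.
    kicker = {("L", "L"): -1, ("L", "R"): 1, ("R", "L"): 1, ("R", "R"): -1}
    kick_prob = {"L": kicker_left, "R": 1 - kicker_left}
    dive_prob = {"L": goalie_left, "R": 1 - goalie_left}
    value = sum(kick_prob[k] * dive_prob[g] * kicker[(k, g)]
                for k in "LR" for g in "LR")
    return value, -value

half = Fraction(1, 2)
# Against a 50-50 kicker, the goalie earns the same diving left or right.
print(expected_payoffs(half, 1), expected_payoffs(half, 0))
# Against a kicker who goes left 70% of the time, diving left is strictly better.
print(expected_payoffs(Fraction(7, 10), 1), expected_payoffs(Fraction(7, 10), 0))
```

Any kicker mix other than 50-50 gives the goalie a strictly better side to dive to, which is why only the 50-50 mix can be part of an equilibrium.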
Do you believe it?
• Can we empirically test this theory?
• Yes!
– Study was done with the Italian football league.
– Step 1: See if the strategies are really left or right.
– Step 2: Calculate payoffs. How? If there is no goal 100% of
the time when the goalie guesses correctly, the payoffs are
0 for the kicker and 100 for the goalie. If there is no goal
80% of the time, then the payoffs are 20 for the kicker and
80 for the goalie, etc.
– Step 3: Calculate the Nash equilibrium.
– Step 4: Compare.
Do you believe it?
• Do they really choose only L or R? Yes. Kickers 93.8% and
Goalie 98.9%.
• Kickers are either left or right footed. Assume R means kick
in “easier” direction. Below is percentage of scoring.
              Dive L     Dive R
Kick L        58.3       94.97
Kick R        92.91      69.92
• Nash prediction for
(Goalie, Kicker) = (41.99L+58.01R, 38.54L+61.46R)
• Actual data
= (42.31L+57.69R, 39.98L+60.02R)
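The Nash prediction can be recomputed from the scoring percentages by making each side indifferent. This is a Python sketch, assuming the four numbers are read with kicks as rows and dives as columns (58.3 for Kick L and Dive L, 94.97 for Kick L and Dive R, 92.91 for Kick R and Dive L, 69.92 for Kick R and Dive R); the function name is hypothetical.

```python
def mixed_equilibrium(score):
    """Return (P(kicker kicks L), P(goalie dives L)) for a 2x2 scoring table.

    score[(kick, dive)] is the percentage of kicks that score; the kicker
    wants it high and the goalie wants it low.
    """
    ll, lr = score[("L", "L")], score[("L", "R")]
    rl, rr = score[("R", "L")], score[("R", "R")]
    denom = (lr - ll) + (rl - rr)
    kicker_left = (rl - rr) / denom  # makes the goalie indifferent between dives
    goalie_left = (lr - rr) / denom  # makes the kicker indifferent between kicks
    return kicker_left, goalie_left

scoring = {
    ("L", "L"): 58.3, ("L", "R"): 94.97,
    ("R", "L"): 92.91, ("R", "R"): 69.92,
}
kicker_left, goalie_left = mixed_equilibrium(scoring)
print(f"kicker L: {100 * kicker_left:.1f}%, goalie L: {100 * goalie_left:.1f}%")
```

Under that reading of the table, the kicker goes left about 38.5% of the time and the goalie about 42.0%, which are the two predicted frequencies quoted above.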
Parking Enforcement Game
Payoffs are (University, Student Driver).
                      Student: Park OK    Student: Park in Staff
University: Check     -5, -5              5, -95
University: Don’t     5, -5               0, 5
• Student can decide to park in staff parking.
• University can check cars in staff parking lot.
What happens?
• If the University checks, what do the students do?
• If the students park ok, what does the Uni do?
• If the uni doesn’t check, what do the students do?
• If the students park in the staff parking, what does
the uni do?
• What is the equilibrium of the game?
• What happens if the university makes the punishment
less harsh, only -10? Who benefits from this?
Who is hurt by this?
Answer
• Student parks legally 1/3 of the time and the
uni checks 1/10 of the time.
• With lower penalty, student parks legally
1/3 of the time and the uni checks 2/3 of the
time.
• Whose expected payoff changes? No one’s.
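These answers can be reproduced with a short Python sketch. The payoff assignment below is an assumed reconstruction of the slide's table (university: -5 for checking legal parkers, 5 for catching a violator, 5 for not checking legal parkers, 0 for an unchecked violator; student: -5 for parking legally, 5 for an unpunished violation, -95 or -10 when caught), and the function name is hypothetical.

```python
from fractions import Fraction

def parking_equilibrium(caught_payoff):
    """Return (P(student parks legally), P(uni checks), uni payoff, student payoff)."""
    # Payoffs keyed by (university move, student move); assumed reconstruction.
    uni = {("check", "ok"): -5, ("check", "staff"): 5,
           ("dont", "ok"): 5, ("dont", "staff"): 0}
    stu = {("check", "ok"): -5, ("check", "staff"): caught_payoff,
           ("dont", "ok"): -5, ("dont", "staff"): 5}

    # The uni checks just often enough to make the student indifferent.
    gain_unchecked = stu[("dont", "staff")] - stu[("dont", "ok")]
    loss_caught = stu[("check", "ok")] - stu[("check", "staff")]
    check_prob = Fraction(gain_unchecked, gain_unchecked + loss_caught)

    # The student parks legally just often enough to make the uni indifferent.
    catch_gain = uni[("check", "staff")] - uni[("dont", "staff")]
    wasted_check = uni[("dont", "ok")] - uni[("check", "ok")]
    legal_prob = Fraction(catch_gain, catch_gain + wasted_check)

    uni_value = legal_prob * uni[("check", "ok")] + (1 - legal_prob) * uni[("check", "staff")]
    stu_value = check_prob * stu[("check", "ok")] + (1 - check_prob) * stu[("dont", "ok")]
    return legal_prob, check_prob, uni_value, stu_value

print(parking_equilibrium(-95))  # legal 1/3, check 1/10, payoffs 5/3 and -5
print(parking_equilibrium(-10))  # legal 1/3, check 2/3, payoffs 5/3 and -5
```

Lowering the penalty only changes how often the university must check; each side's expected payoff stays the same under these assumed payoffs.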