Part 3 Linear Programming
3.6 Game Theory
Example
Player X holds up either one hand or two, and independently,
so does player Y. If they make the same decision, Y wins $10.
If they make the opposite decisions, then X is the winner - $10
if he puts up one hand, and $20 if he puts up two.
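The net payoff to X is easy to record in a matrix:

\[
A =
\begin{array}{c|cc}
 & \text{1 hand by X} & \text{2 hands by X} \\ \hline
\text{1 hand by Y} & -10 & 20 \\
\text{2 hands by Y} & 10 & -10
\end{array}
\]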

Mixed Strategy
It is obvious that X will not do the same thing every time, or
Y would copy him and win everything. Similarly, Y cannot
stick to a single strategy, or X will do the opposite.
Both players must use a mixed strategy, and furthermore
the choice at every turn must be absolutely independent
of the previous turns.
Assume that X decides that he will put up 1 hand with
frequency x1 and 2 hands with frequency x2=1-x1. At
every turn this decision is random. Similarly, Y can pick
his probabilities y1 and y2=1-y1.
It is not appropriate to choose x1=x2=y1=y2=0.5 since Y
would lose $20 too often. But the more Y moves to a
pure 2-hand strategy, the more X will move toward 1
hand.
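To make this pressure on each player concrete, the average payoff to X can be computed for any pair of frequencies. The following is a minimal sketch in Python. The helper name `expected_payoff` and the use of exact fractions are illustrative choices, not part of the original slides.

```python
from fractions import Fraction as F

# Payoff to X; rows are Y's moves (1 or 2 hands), columns are X's moves.
A = [[F(-10), F(20)],
     [F(10), F(-10)]]

def expected_payoff(y, A, x):
    """Average payoff to X: sum over i, j of a_ij * y_i * x_j."""
    return sum(A[i][j] * y[i] * x[j]
               for i in range(len(y)) for j in range(len(x)))

half = [F(1, 2), F(1, 2)]
only_one = [F(1), F(0)]   # always 1 hand
only_two = [F(0), F(1)]   # always 2 hands

print(expected_payoff(half, A, half))          # 5/2: X wins on average at 50/50
print(expected_payoff(half, A, only_two))      # 5: X gains by always showing 2 hands
print(expected_payoff(only_two, A, only_one))  # 10: pure 2 hands by Y invites 1 hand by X
```

At 50/50 play X already averages $2.50 per turn, and each pure reply by one player gives the other a reason to shift, which is why neither side can settle on these frequencies.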
Equilibrium
Does there exist a mixed strategy y1 and y2 that, if
used consistently by Y, offers no special
advantage to X?
Can X choose the probabilities x1 and x2 that
present Y with no reason to change his own
strategy?
At such equilibrium, if it exists, the average payoff
to X will have reached a saddle point. It is a
maximum as far as X is concerned, and a
minimum as far as Y is concerned.
To find such a saddle point is to “solve” the game.
Best Strategy for X
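The payoff to X can be determined by
\[
Ax = \begin{bmatrix} -10 & 20 \\ 10 & -10 \end{bmatrix}
\begin{bmatrix} x_1 \\ 1 - x_1 \end{bmatrix}
= x_1 \begin{bmatrix} -10 \\ 10 \end{bmatrix} + (1 - x_1) \begin{bmatrix} 20 \\ -10 \end{bmatrix}
= \begin{bmatrix} 20 - 30x_1 \\ -10 + 20x_1 \end{bmatrix}
\]
Let
\[
20 - 30x_1 = -10 + 20x_1 \;\Rightarrow\; x_1 = \frac{3}{5}
\]
\[
\frac{3}{5} \begin{bmatrix} -10 \\ 10 \end{bmatrix} + \frac{2}{5} \begin{bmatrix} 20 \\ -10 \end{bmatrix}
= \begin{bmatrix} 2 \\ 2 \end{bmatrix}
\]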
Whatever Y does against the strategy, he will lose $2.
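The equalizing step above can be checked numerically. This is a sketch for the 2 by 2 case only. The function name `equalizing_x1` is hypothetical, and the closed form assumes the game has no saddle point in pure strategies.

```python
from fractions import Fraction as F

# Payoff to X; rows are Y's moves, columns are X's moves.
A = [[F(-10), F(20)],
     [F(10), F(-10)]]

def equalizing_x1(A):
    """Frequency x1 that makes both components of A x equal (2x2 case).

    Assumes no saddle point in pure strategies, so the denominator is
    nonzero and the result lies between 0 and 1.
    """
    (a11, a12), (a21, a22) = A
    # Row i of A x is a_i2 + (a_i1 - a_i2) * x1; equate row 1 and row 2.
    return (a22 - a12) / ((a11 - a12) - (a21 - a22))

x1 = equalizing_x1(A)
x = [x1, 1 - x1]
# Payoff to X against each pure strategy of Y.
payoffs = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

print(x1)       # 3/5
print(payoffs)  # [Fraction(2, 1), Fraction(2, 1)]
```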
Best Strategy for Y
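\[
y_1 \begin{bmatrix} -10 & 20 \end{bmatrix} + (1 - y_1) \begin{bmatrix} 10 & -10 \end{bmatrix}
= \begin{bmatrix} 10 - 20y_1 & -10 + 30y_1 \end{bmatrix}
\]
\[
10 - 20y_1 = -10 + 30y_1 \;\Rightarrow\; y_1 = \frac{2}{5}
\]
\[
\frac{2}{5} \begin{bmatrix} -10 & 20 \end{bmatrix} + \frac{3}{5} \begin{bmatrix} 10 & -10 \end{bmatrix}
= \begin{bmatrix} 2 & 2 \end{bmatrix}
\]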
With this strategy, Y cannot lose more than $2.
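The same check works from Y's side. The sketch below solves for y1 and then samples several mixed strategies for X to confirm that Y's average loss stays at 2. The name `equalizing_y1` and the sampling grid are illustrative assumptions.

```python
from fractions import Fraction as F

# Payoff to X; rows are Y's moves, columns are X's moves.
A = [[F(-10), F(20)],
     [F(10), F(-10)]]

def equalizing_y1(A):
    """Frequency y1 that makes both components of y^T A equal (2x2 case)."""
    (a11, a12), (a21, a22) = A
    # Column j of y^T A is a_2j + (a_1j - a_2j) * y1; equate the two columns.
    return (a22 - a21) / ((a11 - a21) - (a12 - a22))

y1 = equalizing_y1(A)
y = [y1, 1 - y1]
# Payoff to X for each of X's pure strategies when Y plays y.
col = [sum(y[i] * A[i][j] for i in range(2)) for j in range(2)]
# Y's average loss against x = (k/10, 1 - k/10) for k = 0..10.
losses = {col[0] * F(k, 10) + col[1] * (1 - F(k, 10)) for k in range(11)}

print(y1)      # 2/5
print(losses)  # {Fraction(2, 1)}
```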
Significance of Solution
Such a saddle point is remarkable, because
it means that X plays his 2-hand strategy
only 2/5 of the time, even though it is this
strategy that gives him a chance at $20.
At the same time, Y is forced to adopt a
losing strategy – he would like to match X,
but instead he uses the opposite
probabilities 2/5 (1 hand) and 3/5 (2 hand).
Matrix Game
X has n possible moves to choose from, and
Y has m. Thus, the payoff matrix A has
dimension m by n. The entry aij in A
represents the payment received by X
when he chooses his jth strategy and Y
chooses his ith. A negative entry means a
win for Y.
Matrix Game
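Player X is free to choose the probability vector
\[
x = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}^T \quad \text{and} \quad \sum_{j=1}^{n} x_j = 1
\]
Player Y is free to choose the probability vector
\[
y = \begin{bmatrix} y_1 & \cdots & y_m \end{bmatrix}^T \quad \text{and} \quad \sum_{i=1}^{m} y_i = 1
\]
The total expected payoff from each play of the game is
\[
\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} y_i x_j = y^T A x =
\begin{bmatrix} y_1 & \cdots & y_m \end{bmatrix}
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]
It is \( y^T A x \) that player X wants to maximize and player Y wants to
minimize.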
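As an illustration of the double sum, here it is written out for the two-hand game from the first example (m = n = 2, with entries -10, 20 in the first row and 10, -10 in the second). Substituting X's strategy x = (3/5, 2/5) shows why Y's choice no longer matters.

```latex
\[
y^T A x = -10\, y_1 x_1 + 20\, y_1 x_2 + 10\, y_2 x_1 - 10\, y_2 x_2
\]
\[
x = \left(\tfrac{3}{5}, \tfrac{2}{5}\right):\quad
y^T A x = y_1(-6 + 8) + y_2(6 - 4) = 2(y_1 + y_2) = 2
\]
```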
Example
A nn  I nn
The optimal solution is
1
x y 
n
*
1
n
*
2
1
n 
T
2
1
1
1
y Ax        
n
n
n
As n increases, Y has a better chance to escape.
*T
*
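A short numerical check of the identity-matrix game, assuming both players mix uniformly over their n moves. The function names are hypothetical. The second function confirms that a pure move by X against the uniform y earns the same 1/n, so X has nothing to gain by deviating.

```python
from fractions import Fraction as F

def identity_game_value(n):
    """Expected payoff y^T I x when both players mix uniformly over n moves."""
    u = [F(1, n)] * n
    return sum(u[i] * u[i] for i in range(n))

def pure_x_vs_uniform_y(n, j):
    """Payoff to X for pure move j against uniform y: only row i == j pays."""
    return sum(F(1, n) * (1 if i == j else 0) for i in range(n))

for n in (2, 3, 10):
    print(n, identity_game_value(n), pure_x_vs_uniform_y(n, 0))
# prints: 2 1/2 1/2, then 3 1/3 1/3, then 10 1/10 1/10
```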
Fair Game
A symmetric matrix A (e.g. A = I) does not guarantee
that the game is fair. In fact, it is a skew-symmetric
matrix, i.e. \( A^T = -A \), that gives a completely fair game.
Such a matrix confronts the 2 players with identical decisions, since
a choice of strategy j by X and i by Y wins \( a_{ij} \) for X, and a choice
of j by Y and i by X wins the same amount \( a_{ij} \) for Y (because
\( a_{ji} = -a_{ij} \)).
The optimal strategies for both players must be the same, and the
expected payoff must be \( y^{*T} A x^* = 0 \).
Example
0 1 1
A  1 0 1
1 1 0 
In words, X and Y both choose a number between 1 and 3,
and the one with the smaller number wins $1.
1 
x*  y *  0 
0 
y *T Ax* =a11 =0
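The two claims in this example, skew-symmetry and a saddle point at the pure strategy "choose 1", can be verified directly. The sketch below uses the convention stated earlier: rows are Y's choices, columns are X's, and entries are payoffs to X. The variable names are illustrative.

```python
# Payoff to X in the smaller-number game; rows are Y's choices, columns are X's.
A = [[0, -1, -1],
     [1,  0, -1],
     [1,  1,  0]]
n = len(A)

# Skew-symmetry: a_ji == -a_ij for every pair of indices.
is_skew = all(A[j][i] == -A[i][j] for i in range(n) for j in range(n))

# If Y plays 1 (row 1), the best X can collect is the row maximum.
best_for_x_vs_y1 = max(A[0])
# If X plays 1 (column 1), the least Y can hold X to is the column minimum.
least_for_x_with_x1 = min(A[i][0] for i in range(n))

print(is_skew, best_for_x_vs_y1, least_for_x_with_x1)  # True 0 0
```

Both bounds equal a11 = 0, so neither player can improve by moving away from the choice 1.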
Equivalent Payoff Matrix
Let E be the matrix that has every entry equal to 1.
Adding a multiple of E to the payoff matrix,
i.e. A→A+cE, simply means that X wins an
additional amount c at every turn.
The value of the game is increased by c, but
there is no reason to change the original
strategies.
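A small check of this shift on the two-hand game from the first example. The shift c = 7 is an arbitrary illustrative value and `payoff` is a hypothetical helper. The strategies x = (3/5, 2/5) and y = (2/5, 3/5) are the ones found for that game.

```python
from fractions import Fraction as F

A = [[F(-10), F(20)],
     [F(10), F(-10)]]
c = F(7)                                   # arbitrary illustrative shift
B = [[a + c for a in row] for row in A]    # B = A + cE

def payoff(y, M, x):
    """Expected payoff y^T M x."""
    return sum(M[i][j] * y[i] * x[j]
               for i in range(len(y)) for j in range(len(x)))

x = [F(3, 5), F(2, 5)]
y = [F(2, 5), F(3, 5)]
# Components of B x: still equal to each other, so x still equalizes.
Bx = [sum(B[i][j] * x[j] for j in range(2)) for i in range(2)]

print(payoff(y, A, x), payoff(y, B, x))  # 2 9
print(Bx)                                # [Fraction(9, 1), Fraction(9, 1)]
```

The payoff rises by exactly c because the entries of x and of y each sum to 1, so the term c times y^T E x equals c.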
Minimax Theorem
For any m by n matrix A, the minimax over all mixed
strategies equals the maximin:
\[
\max_x \min_y \; y^T A x = \min_y \max_x \; y^T A x
\]
This quantity is the value of the game. If the maximum on
the left is attained at \( x^* \), and the minimum on the right is attained
at \( y^* \), then those strategies are optimal and they yield a saddle
point from which nobody wants to move:
\[
y^{*T} A x \le y^{*T} A x^* \le y^T A x^* \quad \text{for all } x, y
\]
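The equality can be illustrated on the two-hand game by a brute-force search. This is only a sketch: it scans a grid of frequencies in steps of 1/100 (an assumption chosen so the grid contains the exact optimum) and is not a general solver. It relies on the fact that the opponent's best reply to a fixed mixed strategy can always be taken to be a pure strategy.

```python
from fractions import Fraction as F

# Payoff to X; rows are Y's moves, columns are X's moves.
A = [[F(-10), F(20)],
     [F(10), F(-10)]]
grid = [F(k, 100) for k in range(101)]

def guaranteed_to_x(x1):
    """min over y of y^T A x: the smaller component of A x."""
    x = [x1, 1 - x1]
    return min(sum(A[i][j] * x[j] for j in range(2)) for i in range(2))

def conceded_by_y(y1):
    """max over x of y^T A x: the larger component of y^T A."""
    y = [y1, 1 - y1]
    return max(sum(y[i] * A[i][j] for i in range(2)) for j in range(2))

maximin = max(guaranteed_to_x(t) for t in grid)
minimax = min(conceded_by_y(t) for t in grid)

print(maximin, minimax)  # 2 2
```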
Interpretation of Minimax Theorem
from Player X’s Viewpoint
If X selects a particular strategy \( x = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}^T \),
then Y will eventually recognize it and choose his
own strategy so as to minimize the payment \( y^T A x \).
X will receive in this case \( \min_y y^T A x \).
An intelligent player X will select a vector \( x^* \) that
maximizes this minimum. By this choice, X guarantees
that he will win at least the amount
\[
\min_y \; y^T A x^* = \max_x \min_y \; y^T A x
\]
He cannot expect to win more.
Interpretation of Minimax Theorem
from Player Y’s Viewpoint
Player Y does the opposite. For any of his own mixed
strategies y, he must expect X to discover the vector that
will maximize \( y^T A x \). Therefore Y will choose the mixture
\( y^* \) that minimizes this maximum and guarantees that he will
lose no more than
\[
\max_x \; y^{*T} A x = \min_y \max_x \; y^T A x
\]
He cannot expect to do better.
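For the two-hand game from the first example, both viewpoints can be written out explicitly. The inner minimum (or maximum) is taken over the opponent's two pure strategies, and the outer optimization gives the same number from either side.

```latex
\[
\max_{x_1} \, \min\{\, 20 - 30x_1,\; -10 + 20x_1 \,\} = 2
\quad \text{at } x_1 = \tfrac{3}{5}
\]
\[
\min_{y_1} \, \max\{\, 10 - 20y_1,\; -10 + 30y_1 \,\} = 2
\quad \text{at } y_1 = \tfrac{2}{5}
\]
```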
Connections Between Game
Theory and Duality in LP – (1)

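Let \( b = \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}^T \) (m entries)
and \( c = \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}^T \) (n entries).
Consider the dual linear programs
\[
\begin{array}{ll}
\text{(Primal)} & \min \; c^T x \quad \text{s.t. } Ax \ge b, \; x \ge 0 \\
\text{(Dual)} & \max \; b^T y \quad \text{s.t. } A^T y \le c, \; y \ge 0
\end{array}
\]
To ensure both problems are feasible, it may be necessary
to add the same large number \( \alpha \) to all entries of the payoff
matrix A, i.e. \( A \to A + \alpha E \).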
Connections Between Game
Theory and Duality in LP – (2)
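The duality theorem of linear programming guarantees that
there exist vectors \( x^* \) and \( y^* \) such that \( c^T x^* = b^T y^* \). Thus,
\[
\sum_{j=1}^{n} x_j^* = \sum_{i=1}^{m} y_i^* = \theta
\]
Then the resulting mixed strategies
\[
\frac{x^*}{\theta} = \begin{bmatrix} \frac{x_1^*}{\theta} & \frac{x_2^*}{\theta} & \cdots & \frac{x_n^*}{\theta} \end{bmatrix}^T \quad \text{for player X}
\]
\[
\frac{y^*}{\theta} = \begin{bmatrix} \frac{y_1^*}{\theta} & \frac{y_2^*}{\theta} & \cdots & \frac{y_m^*}{\theta} \end{bmatrix}^T \quad \text{for player Y}
\]
are optimal in game theory.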
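The construction can be tried on the two-hand game from the first example. The sketch below does not call an LP solver. It takes hand-supplied candidate vectors `x_star` and `y_star`, checks that they are feasible for the primal and dual programs with equal objective values (which certifies that both are optimal), and then rescales them. The shift `alpha = 10` and all variable names are illustrative assumptions.

```python
from fractions import Fraction as F

A = [[F(-10), F(20)],
     [F(10), F(-10)]]
alpha = F(10)                                 # illustrative shift
B = [[a + alpha for a in row] for row in A]   # A + alpha*E = [[0, 30], [20, 0]]

x_star = [F(1, 20), F(1, 30)]   # candidate for: min c^T x, B x >= b, x >= 0
y_star = [F(1, 30), F(1, 20)]   # candidate for: max b^T y, B^T y <= c, y >= 0

Bx = [sum(B[i][j] * x_star[j] for j in range(2)) for i in range(2)]
BTy = [sum(B[i][j] * y_star[i] for i in range(2)) for j in range(2)]
primal_ok = all(v >= 1 for v in Bx) and all(v >= 0 for v in x_star)
dual_ok = all(v <= 1 for v in BTy) and all(v >= 0 for v in y_star)

theta = sum(x_star)             # c^T x*, with c the vector of ones
same_objective = theta == sum(y_star)

x_mixed = [v / theta for v in x_star]
y_mixed = [v / theta for v in y_star]
value = 1 / theta - alpha       # undo the shift to get the original game's value

print(primal_ok, dual_ok, same_objective)  # True True True
print(x_mixed == [F(3, 5), F(2, 5)], y_mixed == [F(2, 5), F(3, 5)], value)
# True True 2
```

The rescaled vectors reproduce the frequencies 3/5, 2/5 for X and 2/5, 3/5 for Y, and subtracting the shift from 1/theta recovers the $2 average payoff.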
Connections Between Game
Theory and Duality in LP – (3)
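This is due to the fact that, for any other mixed strategies
x and y,
\[
Ax^* \ge b \;\Rightarrow\; y^T A x^* \ge y^T b = 1
\]
\[
y^{*T} A \le c^T \;\Rightarrow\; y^{*T} A x \le c^T x = 1
\]
Thus,
\[
y^T A x^* \ge 1 \ge y^{*T} A x
\]
Dividing by \( \theta = c^T x^* = b^T y^* \),
\[
y^T A \left( \frac{x^*}{\theta} \right) \ge \frac{1}{\theta} \ge \left( \frac{y^*}{\theta} \right)^T A x
\]
This says that player X cannot win more than \( 1/\theta \) against the
strategy \( y^*/\theta \), and player Y cannot lose less than \( 1/\theta \) against the
strategy \( x^*/\theta \).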