M369, Game Theory Part 3, diagrams etc.
Part 3: 2×m games.
So far, I have discussed domination and saddle points. Unfortunately most matrix games cannot be
solved by either method. If a game is of any size, then typically it has a few dom. rows and cols. After
you have removed all of these, there is no saddle point.
Consider the game shown here. Dither chooses a card, either red or black, and plays it face down.
Swither must then call red or black. He wins £1 if he called right, loses £1 if he called wrong.
Look at the matrix (inset).

                   Swither
                   R     B     row min
    Dither   R    -1     1       -1
             B     1    -1       -1
       col max     1     1

This game has no saddle and no domination, so it cannot be solved in pure strategies. This impasse
is similar to one that occurs in the theory of equations. After solving a few simple quadratic
equations, we find that x^2 + 1 = 0 has no real solution. This discovery led to the invention of
complex numbers.
Going back to Dither/Swither. Clearly if either D or S plays predictably, then the other can win.
The only way that either player can defend against an intelligent opponent is to play unpredictably. So
in a long sequence of plays, D must choose red and black at random, and by symmetry, he should
choose them equally often. Likewise S should call R and B at random, equally often. Then
each player will win half and lose half. Suppose that D deviates, say by playing R 2/3 of the time
and B 1/3 . If S continues to play correctly, D does not gain anything. Also, as soon as S notices
that D is deviating, S can penalise D by calling R more often. So D has no incentive to deviate,
and S ditto by the same argument.
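The argument above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the inset matrix gives the payoff to Dither, with rows for Dither's card and columns for Swither's call, and the helper name expected_payoff is made up for the illustration.

```python
from fractions import Fraction as F

# Assumed orientation: payoff to Dither; rows = Dither plays R, B;
# columns = Swither calls R, B.
M = [[-1, 1],
     [1, -1]]

def expected_payoff(matrix, row_mix, col_mix):
    """Mean payoff to the row player when both players randomise independently."""
    return sum(p * q * matrix[i][j]
               for i, p in enumerate(row_mix)
               for j, q in enumerate(col_mix))

half = [F(1, 2), F(1, 2)]
deviant = [F(2, 3), F(1, 3)]     # Dither plays R 2/3 of the time

print(expected_payoff(M, half, half))             # both play 50/50
print(expected_payoff(M, deviant, half))          # D deviates, S does not react
print(expected_payoff(M, deviant, [F(1), F(0)]))  # S now always calls R
```

The first two payoffs are 0, so D gains nothing by deviating; the third is -1/3, which is the penalty S can impose once he notices.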
Definition. A mixed strategy is a decision by one player to make a choice between some of his pure
strategies, regulated by chance.
Definition. Let V be a vector space over the reals and u_1 , u_2 , . . . , u_r a set of vectors in V . A
convex combination of these vectors is a linear combination:
$v = \sum_{i=1}^{r} \lambda_i u_i$
where the numbers $\lambda_i$ are all $\ge 0$ and their sum is 1 .
Then we can regard any mixed strategy as a convex combination of that player's pure strategies.
Now change the numbers: Pamela vs Diana, inset.

                  Diana
                  D1    D2    D3
    Pamela  P1     3     0     2
            P2     2     5     1
            P3     3     3    -1

After we delete the dominated col D1 and row P3, there is no saddle point and no dom. row or col,
so we look for mixed strategies. (2nd inset).

                  Diana
                  D2    D3
    Pamela  P1     0     2
            P2     5     1

Algorithm 3.1. To find correct mixed strategies for a 2×2 matrix game with no saddle point and no
dominated rows or columns.
WARNING: This will always give the wrong answer if the matrix has a saddle point or dominated
strategies.
Let M be the original matrix: take Pamela v Diana as example, find Diana's strategy first:
      0     2
      5     1
Add an extra row containing row 1 - row 2 :
     -5     1
One of these numbers will be < 0 ; change its sign. Swop the 2 numbers:
      1     5
Call these numbers p and q . To convert these to probabilities, divide by p+q . This gives:
     1/6   5/6
So Diana should play D2 with probability 1/6 and D3 with probability 5/6 (and not D1).
Now do the same for the row player Pamela .
Start:
      0     2
      5     1
then col 1 - col 2 is the column
     -2
      4
Change -2 to 2 and swop:
      4
      2
Probabilities (divide by 4 + 2):
     2/3
     1/3
So Pamela should play P1 with prob. 2/3 and P2 with prob. 1/3 and never P3 .
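The steps of Algorithm 3.1 can be written out as a short function. This is a minimal sketch rather than a definitive implementation: the name solve_2x2 and the ValueError guard are additions for the illustration, and the guard simply refuses matrices where the two differences do not have opposite signs (the situation the WARNING describes).

```python
from fractions import Fraction as F

def solve_2x2(m):
    """Mixed strategies for a 2x2 game with no saddle point, as in Algorithm 3.1.

    m is [[a, b], [c, d]] with integer payoffs to the row player.
    Returns (row_probs, col_probs, value).
    """
    (a, b), (c, d) = m

    # Column player: row 1 - row 2, change the negative sign, swop.
    diff = (a - c, b - d)
    if diff[0] * diff[1] >= 0:
        raise ValueError("saddle point or dominated strategy: algorithm does not apply")
    p, q = abs(diff[1]), abs(diff[0])
    col_probs = (F(p, p + q), F(q, p + q))

    # Row player: column 1 - column 2, change the negative sign, swop.
    diff = (a - b, c - d)
    if diff[0] * diff[1] >= 0:
        raise ValueError("saddle point or dominated strategy: algorithm does not apply")
    p, q = abs(diff[1]), abs(diff[0])
    row_probs = (F(p, p + q), F(q, p + q))

    # Mean payoff to the row player against column 1 (the same against column 2).
    value = row_probs[0] * a + row_probs[1] * c
    return row_probs, col_probs, value

pamela, diana, value = solve_2x2([[0, 2], [5, 1]])
print(pamela, diana, value)
```

For the reduced Pamela v Diana matrix this gives 2/3 and 1/3 for Pamela, 1/6 and 5/6 for Diana, and a mean payoff of 5/3.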
Go back to the original game. This was as in the inset, and we rejected D1 and P3 which were
dominated. Find the payoffs.

                  Diana
                  D1    D2    D3
    Pamela  P1     3     0     2
            P2     2     5     1
            P3     3     3    -1

From the previous work, we know that Pamela should play P1 with prob. 2/3 and P2 with prob. 1/3
and never P3 . We can write this as
    P* = 2/3 P1 + 1/3 P2 + 0 P3
If she does this then depending on what Diana does, Pamela's payoffs will be given by
2/3 × row 1 + 1/3 × row 2 which is

    Diana     D1     D2     D3
             8/3    5/3    5/3
Note that Diana is penalised if she plays the bad strategy D1 .
Likewise we found that Diana should play D2 with probability 1/6 and D3 with probability 5/6
(and not D1).
    D* = 0 D1 + 1/6 D2 + 5/6 D3
If Diana does this then depending on what Pamela does, Pamela's payoffs will be given by
1/6 × column 2 + 5/6 × column 3 which is

    Pamela    P1     5/3
              P2     5/3
              P3    -1/3
Definition. Let J be a matrix game with no saddle point. An active strategy is a pure strategy (for
either player) that appears with probability > 0 in that player's optimum mixed strategy.
Diana's active strategies are D2 and D3 , Pamela's are P1 and P2 . If P plays her optimum
strategy, the payoff to P is 5/3 whichever of her active strategies D plays, but D does worse
(P gets 8/3) if she plays the inactive D1. Conversely, if D plays her opt strategy, P gets 5/3
from either of her active strategies but less from the inactive P3. This is actually a general
result; it is the way you test a mixed strategy to see if it is optimum.
Definition. The value of a matrix game is the mean payoff to the first player if both players play their
optimum strategies.
Theorem 3.1. (not proved here) Let M be a matrix game, v the value, and X , Y the 2 players. If
Y (the 2nd player) plays her optimum strategy, the mean payoff to X will be
    equal to v if X plays any of her active strategies
    ≤ v if X plays any of her non-active strategies.
Conversely, if X (the 1st player) plays her optimum strategy, the mean payoff to X will be
    equal to v if Y plays any of her active strategies
    ≥ v if Y plays any of her non-active strategies.
Now consider 2×m matrix games with m > 2 . Given any 2×m matrix game with no saddle point,
there is always at least one 2×2 submatrix whose solution gives optimum strategies for the original
game. Example.

                 Elizabeth
                 E1    E2    E3    E4    E5    E6    E7
    Henry  H1    -6    -1     1     4     7     4     3
           H2     7    -2     6     3    -2    -5     7

To solve this, draw a ladder diagram as shown below. This has 2 vertical axes marked with the same
scale. The LH axis corresponds to H1 , the RH axis to H2 . For each column of the matrix, say
col 1 with entries -6 and 7, put in a line from -6 on the L.H. axis to 7 on the RH axis. The L.H.
diagram shows the first 2 columns only. Repeat for all the columns, getting the RH diagram.
[Diagrams: two ladder diagrams, both axes marked from -8 to 8; the L.H. one shows columns 1 and 2 only, the RH one shows all seven columns.]
Mark in double weight the lines that bound this figure from below . Mark the highest point of this
underneath border. The lines that cross at this point correspond to the required columns of the
matrix: here 1 and 2 . So the matrix is reduced to
     -6    -1
      7    -2     = M , say .
We solve as in Algorithm 3.1.
Find E's strategy:
Row 1 - row 2 of M is ( -13 1 ). Swop, change the -13 to +13: ( 1 13 ). So E should play
E1 with prob 1/14 and E2 with prob 13/14 . Depending on what Henry does, his payoff will be
1/14 × column 1 + 13/14 × column 2 which is

    Henry    H1    -19/14
             H2    -19/14
Find H's strategy.
Column 1 - column 2 of M is the column with entries -5 and 9 . Swop, change the -5 to +5 ,
divide by 14: this gives 9/14 and 5/14 .
So Henry must play strategy H1 with prob 9/14 and H2 with prob 5/14 . Depending on what
Elizabeth does, his payoff will be 9/14 × row 1 + 5/14 × row 2 which is

    Elizabeth     E1       E2       E3      E4      E5      E6      E7
                -19/14   -19/14   39/14   51/14   53/14   11/14   62/14

and again we see that if E plays either of her active strategies, the payoff (to H) is -19/14 , but if E
plays any of her non-active strategies, H gets a bigger payoff, which is worse for E .
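The ladder diagram can also be imitated numerically: the highest point of the lower boundary must lie on one of the axes or where two lines cross, so it is enough to test those points. The following is a sketch under that observation; the function name solve_2xm and its return format are inventions for the illustration, and the entries are assumed to be integers.

```python
from fractions import Fraction as F
from itertools import combinations

def solve_2xm(top, bottom):
    """Highest point of the lower boundary of the ladder diagram of a 2 x m game.

    top, bottom: the two rows of the matrix (payoffs to the row player).
    Returns (t, value, active): t is the probability of the bottom row,
    value is the height of that point, and active lists the (1-based)
    columns whose lines pass through it.
    """
    m = len(top)

    def height(j, t):
        # Height of line j at position t (0 = LH axis, 1 = RH axis).
        return top[j] + (bottom[j] - top[j]) * t

    def lower(t):
        return min(height(j, t) for j in range(m))

    # Candidate points: the two axes and every crossing between them.
    candidates = {F(0), F(1)}
    for i, j in combinations(range(m), 2):
        slope_diff = (bottom[i] - top[i]) - (bottom[j] - top[j])
        if slope_diff != 0:
            t = F(top[j] - top[i], slope_diff)
            if 0 <= t <= 1:
                candidates.add(t)

    t = max(candidates, key=lower)
    value = lower(t)
    active = [j + 1 for j in range(m) if height(j, t) == value]
    return t, value, active

H1 = [-6, -1, 1, 4, 7, 4, 3]
H2 = [7, -2, 6, 3, -2, -5, 7]
t, value, active = solve_2xm(H1, H2)
print(t, value, active)
```

For Henry v Elizabeth this finds the crossing of columns 1 and 2 at t = 5/14 (so H1 has probability 9/14) with value -19/14, in agreement with the hand calculation. If the boundary has a flat top, the sketch returns just one of the tied points.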
Suppose now that in the diagram the boundary from below has a flat segment. This will happen if
(for example) E's 2nd column had been ( -2 , -2 )T instead of the actual ( -1 , -2 )T . ( T means
transpose ). Then the ladder diagram will look like this inset, where I have omitted several
dominated columns.
[Diagram "Henry vs Elizabeth II": ladder diagram, both axes marked from -8 to 8, showing lines 1 , 2 and 6 and the points A , B and C .]
The lower boundary then has 2 highest corners: A = columns 1 and 2 and B = columns 2 and 6 . If you
analyse either of these, you will find that E's best strategy will be pure column 2 . Against pure
E2 Henry can play H1 or H2 or any mixture and he will score -2 . But if he plays H1 , E can
penalise him by playing E1 . H must play somewhere along the line between A and B because that
will let him penalise E if she deviates. The best point for H is in line with point C. So H's best
mixed strategy is got by solving columns 1 and 6 . This is the small table below.

           E1    E6
    H1     -6     4
    H2      7    -5

Solving for H only: H should play
    H* = 6/11 × H1 + 5/11 × H2
This gives payoff -2 against E's optimum strategy E2 , but if E deviates and plays either E1 or
E6 , his mean payoff rises to -1/11 .
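The flat-segment case can be checked by evaluating Henry's payoffs for a few mixtures. This sketch assumes the modified matrix is the original Henry v Elizabeth matrix with column 2 replaced by ( -2 , -2 )T; the helper names payoffs and guaranteed are made up for the illustration.

```python
from fractions import Fraction as F

# Henry v Elizabeth with E's 2nd column changed to (-2, -2).
H1 = [-6, -2, 1, 4, 7, 4, 3]
H2 = [7, -2, 6, 3, -2, -5, 7]

def payoffs(t):
    """Henry's mean payoff against each pure column when he plays H2 with prob t."""
    return [(1 - t) * a + t * b for a, b in zip(H1, H2)]

def guaranteed(t):
    """The worst case for Henry, i.e. the height of the lower boundary at t."""
    return min(payoffs(t))

h_star = payoffs(F(5, 11))        # H* = 6/11 H1 + 5/11 H2
print(h_star[0], h_star[1], h_star[5])   # against E1, E2, E6
print(guaranteed(F(4, 13)), guaranteed(F(5, 11)), guaranteed(F(2, 3)))
print(guaranteed(F(1, 4)))        # a mixture to the left of corner A
```

Every mixture between A (t = 4/13) and B (t = 2/3) guarantees -2, and a mixture outside that range guarantees less. At H* the payoffs against E1 and E6 are both -1/11, which is why that point punishes a deviation by E in either direction equally.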
Now consider an m × 2 matrix, like the first inset. This time the ladder has one line per row . You
must find the lowest point on the upper boundary: here this is point A on rows 1 and 6 . So the
matrix reduces to the 3rd inset. Solving as above: the row player must play
    R* = 5/8 × R1 + 3/8 × R6
and the column player must play
    C* = 5/8 × C1 + 3/8 × C2
and the average payoff is 5/4 .

    First inset:
            1     2
      1    -1     5
      2    -3     1
      3     0    -3
      4    -3     0
      5     1    -3
      6     5    -5
      7     2    -4

    [Second inset: ladder diagram, both axes marked from -5 to 5, with point A marked.]

    Third inset:
            1     2
      1    -1     5
      6     5    -5

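For an m × 2 game the same search works with the roles reversed: look for the lowest point of the upper boundary. The sketch below assumes the 7 × 2 matrix of the first inset as reconstructed from the transcript, with integer entries; solve_mx2 is a hypothetical name.

```python
from fractions import Fraction as F
from itertools import combinations

# Rows of the first inset, each as (column 1 entry, column 2 entry).
ROWS = [(-1, 5), (-3, 1), (0, -3), (-3, 0), (1, -3), (5, -5), (2, -4)]

def solve_mx2(rows):
    """Lowest point of the upper boundary of the ladder diagram of an m x 2 game.

    Returns (t, value, active): t is the probability of column 2,
    value is the height of that point, and active lists the (1-based)
    rows whose lines pass through it.
    """
    def height(i, t):
        left, right = rows[i]
        return left + (right - left) * t

    def upper(t):
        return max(height(i, t) for i in range(len(rows)))

    # Candidate points: the two axes and every crossing between them.
    candidates = {F(0), F(1)}
    for i, j in combinations(range(len(rows)), 2):
        slope_diff = (rows[i][1] - rows[i][0]) - (rows[j][1] - rows[j][0])
        if slope_diff != 0:
            t = F(rows[j][0] - rows[i][0], slope_diff)
            if 0 <= t <= 1:
                candidates.add(t)

    t = min(candidates, key=upper)
    value = upper(t)
    active = [i + 1 for i in range(len(rows)) if height(i, t) == value]
    return t, value, active

t, value, active = solve_mx2(ROWS)
print(t, value, active)
```

This returns t = 3/8 (so C* = 5/8 × C1 + 3/8 × C2), value 5/4 and active rows 1 and 6, matching point A.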
Given 3 lines in a ladder diagram, does line 3 go through the intersection of lines 1 and 2 or
above or below?
Answer: suppose that line 1 is the steepest of the 3 lines and goes from a to b ; line 2 goes
from c to d ; and line 3 from e to f .

    line    1    2    3
            a    c    e
            b    d    f

So they correspond to the columns in the first inset. Form the matrix
$K = \begin{pmatrix} 1 & 1 & 1 \\ a & c & e \\ b & d & f \end{pmatrix}$
and let $\Delta$ be the determinant of K .
Lemma . Line 3 passes above / through / below the intersection of lines 1 and 2 according as
$\Delta$ is >0 or =0 or <0 .
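Before the proof, the lemma can be spot-checked on numbers. This is a sketch only: it assumes the lost symbol for the determinant is Δ (called delta in the code), and it compares the sign of the determinant with the height of line 3 at the point where lines 1 and 2 cross.

```python
from fractions import Fraction as F

def delta(a, b, c, d, e, f):
    """Determinant of K = [[1, 1, 1], [a, c, e], [b, d, f]], expanded along row 1."""
    return (c * f - e * d) - (a * f - e * b) + (a * d - c * b)

def line3_minus_crossing(a, b, c, d, e, f):
    """Height of line 3 minus the height of X, measured at the position of X.

    Assumes lines 1 and 2 are not parallel.
    """
    t = F(c - a, (b - a) - (d - c))     # where a + (b-a)t = c + (d-c)t
    x_height = a + (b - a) * t
    return e + (f - e) * t - x_height

def sign(x):
    return (x > 0) - (x < 0)

# Henry v Elizabeth: line 1 = column 1, line 2 = column 2, line 3 = column 6.
example = (-6, 7, -1, -2, 4, -5)
print(delta(*example), line3_minus_crossing(*example))
```

For these three columns the determinant is 30 and line 3 passes 15/7 above the crossing, so both are positive as the lemma says.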
Proof. Let X be the point where lines 1 and 2 meet. Write A , C , E for the points at heights
a , c , e on the LH axis and B , D , F for the points at heights b , d , f on the RH axis.
First, suppose that line 3 does go through X. (first diagram) Then triangles AEX , BFX are
similar
    ⇒ AE/BF = AX/BX
Triangles ACX , BDX are similar
    ⇒ AC/BD = AX/BX = AE/BF
    ⇒ ( c - a )( b - f ) = AC·BF = AE·BD = ( e - a )( b - d )
Multiply out: collect terms on the RHS:
    ⇒ 0 = eb - ab - ed + ad - cb + cf + ab - af = $\Delta$ .
Now suppose line 3 passes above X (second diagram). Suppose the line fX meets the LH
axis at e' . Then $\Delta'$ (with e' in place of e ) is 0 by the above.
Now $\Delta$ is got by replacing e' by e in K' ;
$$\Delta = \Delta - \Delta' = \begin{vmatrix} 1 & 1 & 1 \\ a & c & e \\ b & d & f \end{vmatrix} - \begin{vmatrix} 1 & 1 & 1 \\ a & c & e' \\ b & d & f \end{vmatrix} = \begin{vmatrix} 1 & 1 & 0 \\ 0 & 0 & e - e' \\ b & d & 0 \end{vmatrix} = (e - e')(b - d)$$
where b > d because line 1 = a to b was the steepest . Since line 3 is above X , e > e' and
hence $\Delta$ > 0 . If line 3 passes below X then e < e' and so $\Delta$ < 0 .
Part 4 concluded.
Given a pair of supposedly-optimal mixed strategies for Roy and Clara, say r and c , how
to check if they are genuinely optimal? Answer: Let v be the value. If R plays r , he presents C
with the row vector r^T M . The entries of this vector must all be ≥ v , with equality for all of C's
active strategies. Then C cannot gain by deviating. If C plays c she presents R with the column
Mc . All entries of this column must be ≤ v with equality for all of R's active strategies. Whenever
you get an alleged solution, you need to check both r and c .
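The check can be turned into a short routine. This is a sketch, not a definitive implementation: the name is_optimal is hypothetical, and the value v is taken to be the mean payoff r^T M c of the two alleged strategies, which is an assumption of the sketch rather than something stated above.

```python
from fractions import Fraction as F

def is_optimal(M, r, c):
    """Test an alleged solution (r, c) of the matrix game M.

    Returns (ok, v), where v = r^T M c is the mean payoff of the pair.
    """
    n_rows, n_cols = len(M), len(M[0])
    rTM = [sum(r[i] * M[i][j] for i in range(n_rows)) for j in range(n_cols)]
    Mc = [sum(M[i][j] * c[j] for j in range(n_cols)) for i in range(n_rows)]
    v = sum(rTM[j] * c[j] for j in range(n_cols))

    # Entries of r^T M must be >= v, with equality on C's active strategies.
    cols_ok = (all(x >= v for x in rTM)
               and all(rTM[j] == v for j in range(n_cols) if c[j] > 0))
    # Entries of Mc must be <= v, with equality on R's active strategies.
    rows_ok = (all(x <= v for x in Mc)
               and all(Mc[i] == v for i in range(n_rows) if r[i] > 0))
    return cols_ok and rows_ok, v

# Pamela v Diana, with the strategies found earlier.
M = [[3, 0, 2],
     [2, 5, 1],
     [3, 3, -1]]
good = is_optimal(M, [F(2, 3), F(1, 3), F(0)], [F(0), F(1, 6), F(5, 6)])
bad = is_optimal(M, [F(1, 2), F(1, 2), F(0)], [F(0), F(1, 6), F(5, 6)])
print(good, bad)
```

The pair found earlier passes with v = 5/3. If Pamela instead mixes P1 and P2 equally, the entry of r^T M under D3 drops to 3/2, below 5/3, so the test fails even though the mean payoff against D* is unchanged.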