M369, Game Theory Part 2, diagrams etc.
Part 2: Solving easy games.
(several examples are not from AJJ's book, so numbers do not correspond.)
Definition. Solving a game is the operation of finding an optimum strategy for each player, and the payoff each player can expect against intelligent opponents.
Very often this is impossible. As explained in Part 1, we are only considering matrix games, so we only give the matrices.
Definition. Let M be a matrix. We say that row i dominates row j if every entry of row i is ≥ the corresponding entry in row j . We say that column k dominates column l if every entry of column k is ≤ the corresponding entry in column l .
Example 2.1. Inset: Jill vs Mary, payoffs to Jill.

                Mary
            1    2    3
       1    3    0    1
Jill   2    2    5    2
       3    3    3   -1

Every entry of col 1 is ≥ the corresponding entry of col 3 . So whatever Jill does, Mary will be better off if she plays M3 than M1 . So we can delete the first column. This leaves a 3 × 2 matrix in which every entry of row 2 is ≥ the corresponding entries of either row 1 or row 3 . So Jill should play J2 . Then M is intelligent, so she can follow J's thinking. So M knows that J will play J2; then M3 is better for M than M2, so this game is soluble.
Solution: J2 , M3 , value 2

Beware of two booby traps when rejecting dominated strategies. (1) Row i dominates row j if every entry of row i is at least as large as the corresponding entry of row j , but col k dominates col l if every entry of col k is no larger than the corresponding entry of col l . (2) When you cross out a row or col, don’t relabel the ones remaining.
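This check is easy to mechanise. Below is a minimal sketch of mine in Python (not from AJJ), assuming the matrix holds the row player's payoffs as in Example 2.1; it repeatedly deletes dominated rows and columns and, as warned above, never relabels the survivors.

def reduce_by_domination(matrix):
    # matrix holds the row player's payoffs; labels start at 1 and are never renumbered
    rows = {i: list(r) for i, r in enumerate(matrix, start=1)}
    cols = list(range(1, len(matrix[0]) + 1))

    def row_dominates(a, b):          # every surviving entry of row a >= row b
        return all(rows[a][j - 1] >= rows[b][j - 1] for j in cols)

    def col_dominates(a, b):          # every surviving entry of col a <= col b
        return all(rows[i][a - 1] <= rows[i][b - 1] for i in rows)

    changed = True
    while changed:
        changed = False
        for b in list(rows):
            if any(a != b and row_dominates(a, b) for a in rows):
                del rows[b]           # cross out the dominated row, keep the labels
                changed = True
                break
        for b in list(cols):
            if any(a != b and col_dominates(a, b) for a in cols):
                cols.remove(b)        # cross out the dominated column
                changed = True
                break
    return sorted(rows), cols

# Example 2.1 (payoffs to Jill): reduces to row 2 and column 3, i.e. J2 and M3, value 2.
print(reduce_by_domination([[3, 0, 1], [2, 5, 2], [3, 3, -1]]))   # ([2], [3])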
More generally:
Definition. Let X be a player in any sort of game and σ , τ any 2 of X's strategies. If the game has no chance element, we say that σ dominates τ if the payoff to X from σ is always at least as good as the payoff from τ , for every possible choice of strategies by all of the other players. If the game has chance elements, we say that σ dominates τ if the mean payoff to X from σ (averaged over all choices by Nature) is always at least as great as that from τ .
Two of X's strategies are equivalent if they give the same expected payoff to X for all possible choices of strategies by the other players. Strategy σ strictly dominates τ if σ dominates τ and is not equivalent, i.e. if σ sometimes gives a strictly greater payoff to X .
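Restated for a computer (a sketch of mine, not part of the notes): tabulate X's expected payoff from σ and from τ against every possible combination of the other players' choices, and the three relations read as follows. The dictionaries sigma_payoffs and tau_payoffs are assumed to be keyed by those combinations.

# Sketch: sigma_payoffs[c] and tau_payoffs[c] give X's (expected) payoff from
# sigma and from tau against the same combination c of the other players' strategies.

def dominates(sigma_payoffs, tau_payoffs):
    # sigma is always at least as good as tau
    return all(sigma_payoffs[c] >= tau_payoffs[c] for c in sigma_payoffs)

def equivalent(sigma_payoffs, tau_payoffs):
    # same expected payoff against every combination
    return all(sigma_payoffs[c] == tau_payoffs[c] for c in sigma_payoffs)

def strictly_dominates(sigma_payoffs, tau_payoffs):
    # dominates and is not equivalent, i.e. sometimes strictly better
    return dominates(sigma_payoffs, tau_payoffs) and not equivalent(sigma_payoffs, tau_payoffs)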
Theorem 2.1. Any finite non-cooperative game of complete information can be solved in finite time
by rejecting dominated strategies and pruning the game tree.
Explanation: "in finite time" does not mean quickly; the amount of calculation might be far too large to carry out in practice. For example, in Chess there are 20 possible first moves for either player. If we make a crude estimate by accepting this 20 as typical, then after 30 moves by each player there are ~10^80 possible states: more than the estimated number of electrons in the universe.
Proof. Let N be the length of the game; then there is at least one play of that length. Let S be the state just before the last move in that play, and X the player who is due to move at S . Then any state that follows S must be a final state (otherwise there would be a play of length greater than N ).
Case 1. X has only one legal move, say from S to T . Then we can delete the state T : whatever payoff any player would have got at T , (s)he gets at S .
Case 2. X has several legal moves at S , say from S to T1 , T2 , . . . Tk with respective payoffs to X of p1 , p2 , . . . pk . By renumbering, we may assume that p1 is the largest. Let τ be any strategy of X that says "move from S to Ti " for some i ≠ 1 . Then X has a strategy σ that is identical to τ at every other state, but says "move from S to T1 " . Since p1 ≥ pi , τ is dominated by σ , so we can delete it.
Hence we can delete any of X's strategies that move to any Ti different from T1 . Then all of the
states T2 , . . . Tk become inaccessible, so we can prune them.
Case 3. X is Nature. We know (on average) what she will do. On arrival at S , Nature will move to one of the states T1 , T2 , . . . Tk with respective probabilities π1 , π2 , . . . πk . For each player (Y, say) the mean payoff, if the game reaches S , will be Σi πi pi , where pi is the payoff to Y at Ti .
So again we can prune the tree, removing all the states Ti and attaching these payoffs at state S .
Continuing this way, eventually the game is reduced to a null game (no moves allowed, everybody
collects their mean payoff at the initial state). For every state in the original game, the argument of
Case 2 finds an optimum move for whoever has the move at that state.

This Theorem is equivalent to Th 2.1 in Jones, but I think the proof is easier.
Example: 2—2 Nim: strategy B3 always wins for B .
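The proof is just backward induction, and for small games it can be carried out mechanically. Here is a minimal sketch of mine in Python (not Jones's notation), assuming the game tree is given as nested tuples of payoff, chance and move nodes; it returns the expected payoff vector and records an optimum move at every decision state, exactly as in Cases 1-3.

# Sketch: backward induction ("pruning the game tree") for a finite game of
# complete information, following Cases 1-3 of the proof.  A node is one of:
#   ("payoff", {player: value, ...})                       final state
#   ("chance", [(prob, subtree), ...])                     Nature moves (Case 3)
#   ("move", player, state_name, {move: subtree, ...})     a player moves (Cases 1 and 2)

def solve(node, plan):
    kind = node[0]
    if kind == "payoff":
        return node[1]
    if kind == "chance":                       # Case 3: replace Nature by the mean payoffs
        totals = {}
        for prob, sub in node[1]:
            for player, v in solve(sub, plan).items():
                totals[player] = totals.get(player, 0.0) + prob * v
        return totals
    _, mover, state, moves = node              # Cases 1 and 2: keep only the mover's best move
    payoffs = {m: solve(sub, plan) for m, sub in moves.items()}
    best = max(payoffs, key=lambda m: payoffs[m][mover])
    plan[state] = best                         # an optimum move for whoever moves at this state
    return payoffs[best]

# Usage: plan = {}; value = solve(tree, plan)  -- plan then maps each state to an optimum move.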
Partial Algorithm, for finding dominated rows & columns in a matrix game. "Partial" because it only gets you part way.
When searching for dominated rows, the following is a good way to start. For each column, let cj be the greatest value in column j . Mark the cells in column j where cj appears. Suppose cj appears in cell (p, j) and nowhere else in that column. Then row p cannot be dominated by any other row, since every other row has a lesser value in column j . The inset is assumed to be the left-hand 3 columns of a larger matrix:

   -1    0    2
    1    3    2
    4   -4   -3
    5    0    1
   -2    3   -1
    0    3    0
    3    2   -4
    2    1   -2

So in column 1 the max value 5 appears in row 4 and nowhere else, so row 4 cannot be dominated. In column 2 the max 3 appears in rows 2 , 5 , 6 and nowhere else, so these 3 rows cannot be dominated by any row outside that set. But in column 3 the max value 2 appears in rows 1 and 2 and nowhere else. Combining these 2 columns we see that row 2 cannot be dominated.
Similarly for columns. For each row, let ri be the least value in row i . Then the columns that
contain value ri ( in that row ) cannot be dominated by any columns that have larger values in that
row. In this way, we try to build a list of "indomitable" rows and columns. In a matrix of any size,
checking for domination is tedious, and this process is helpful for reducing the tedium.
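The marking process can also be mechanised. A sketch of mine for the row half: for each column record which rows carry its maximum; a row that carries a column maximum can only be dominated by another carrier of that maximum, so intersecting the carrier sets over the columns a row wins isolates the rows that certainly cannot be dominated. Columns are treated the same way using row minima and ≤.

# Sketch of the marking trick for rows ("partial": it only ever rules domination out).

def indomitable_rows(matrix):
    n_rows, n_cols = len(matrix), len(matrix[0])
    sure = set()
    for r in range(1, n_rows + 1):
        # any row that dominated row r would have to match every column maximum that row r attains
        candidates = set(range(1, n_rows + 1))
        for j in range(n_cols):
            column = [matrix[i][j] for i in range(n_rows)]
            carriers = {i + 1 for i, v in enumerate(column) if v == max(column)}
            if r in carriers:
                candidates &= carriers
        if candidates == {r}:                 # nothing but row r itself is left
            sure.add(r)
    return sure

# The inset (8 rows, the left-hand 3 columns of a larger matrix):
inset = [[-1, 0, 2], [1, 3, 2], [4, -4, -3], [5, 0, 1],
         [-2, 3, -1], [0, 3, 0], [3, 2, -4], [2, 1, -2]]
print(indomitable_rows(inset))                # {2, 4}, as found above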
Definition. Let G be an r-person game, and σ a set of strategies, one for each of the r players. We represent this set as a sort of vector σ = (σ1 , σ2 , . . . σr) . A deviation from σ is a set of strategies in which precisely one player (the deviant) chooses a different strategy. We represent a deviation as σ||k where k is the number of the player who deviates.
Definition. A set of strategies σ is an equilibrium set if for each player ( k , say) the payoff to k from any deviation σ||k is never better than the payoff from σ .
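In computational terms (my sketch, assuming finite strategy sets and a payoff function are available): σ is an equilibrium set exactly when no single player can improve their own payoff by deviating unilaterally.

# Sketch: is the profile sigma an equilibrium set?
#   strategy_sets[k] is the list of player k's strategies;
#   payoff(profile, k) is the (expected) payoff to player k from that profile.

def is_equilibrium(sigma, strategy_sets, payoff):
    for k, choices in enumerate(strategy_sets):
        for alternative in choices:
            deviation = list(sigma)
            deviation[k] = alternative                  # the deviation sigma||k
            if payoff(tuple(deviation), k) > payoff(tuple(sigma), k):
                return False                            # player k does better by deviating
    return True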
Ex 2.2. This game has no dominated row or column, but it has a so-called saddle point at B1—J3 .

                     Joe
              1     2     3    row min
        1     2     0    -1      -1
Bill    2    -5     2    -2      -5
        3     4    -3    -2      -3
 col max      4     2    -1

Definition. Let M be a matrix. For each row, say row i , let ri be the least value in row i , and let cj be the greatest value in column j . Then the maximin of M is the largest ri and the minimax of M is the least cj . Alternatively:
maximin = max_i ( min_j Mij )
and
minimax = min_j ( max_i Mij )
Definition. We say that a matrix game based on matrix M has a saddle point if
maximin(M) = minimax(M) ,
i.e. if max(row min) = min(col max) .
Another definition: M has a saddle point at cell ij if Mij is both the smallest number in its row and the largest number in its column.
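These definitions translate directly into code. A brief sketch of mine, again assuming the matrix holds the row player's payoffs; on the Ex 2.2 matrix it gives maximin = minimax = -1 and the saddle point B1—J3 .

# Sketch: maximin, minimax and saddle points of a matrix of row-player payoffs.

def maximin(M):
    return max(min(row) for row in M)                  # largest row minimum

def minimax(M):
    return min(max(col) for col in zip(*M))            # least column maximum

def saddle_points(M):
    # cells that are the smallest in their row and the largest in their column
    cols = list(zip(*M))
    return [(i + 1, j + 1)
            for i, row in enumerate(M) for j, v in enumerate(row)
            if v == min(row) and v == max(cols[j])]

ex22 = [[2, 0, -1], [-5, 2, -2], [4, -3, -2]]          # the Bill vs Joe matrix of Ex 2.2
print(maximin(ex22), minimax(ex22), saddle_points(ex22))   # -1 -1 [(1, 3)] : saddle at B1-J3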
Theorem 2.2. If a matrix game with matrix M has a saddle point, say at cell ij , then that row and
column are optimum strategies for their respective players.
Theorem 2.3. In any matrix M , maximin(M) ≤ minimax(M) .
Proof. Define ri and cj as above. Suppose that the greatest row min occurs in row p . Then maximin(M) = rp = min( row p ) . Likewise, suppose that the least column max occurs in column q .
Then for any row, say row i :
Miq = an entry in col q ≤ max( col q ) = cq = minimax(M)
but also
Miq = an entry in row i ≥ min( row i ) = ri .
Putting these together:
minimax(M) ≥ ri for every i . In particular, when i = p , we get
minimax(M) ≥ rp = maximin(M) .

Example 2.3.
Two politicians called Blear and Smut are about to fight an election. Each can either spend the last 2 days of the campaign in the city of Rundown, or in Burningup, or 1 day in each city. Plans must be made in advance & neither knows what the other intends. Estimated gains in seats (by Blear) are:

                     Smut
              RR    BB    RB    row min
        RR    -3     2     6      -3
Blear   BB     2     0     2       0
        RB     5    -2    -4      -4
 col max       5     2     6

Although I introduced domination before saddle points, I think that SP’s (if they exist) are both easier to find & more useful. So: given a matrix game, always start by looking for a saddle point. If that fails, look for domination.
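Following that advice mechanically, and reusing the earlier sketches on the Blear–Smut matrix as reconstructed here:

ex23 = [[-3, 2, 6], [2, 0, 2], [5, -2, -4]]                # gains to Blear; rows and columns RR, BB, RB
print(maximin(ex23), minimax(ex23), saddle_points(ex23))   # 0 2 [] : no saddle point
print(reduce_by_domination(ex23))                          # ([1, 2, 3], [1, 2, 3]) : no domination either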
Unfortunately, most matrices have neither SP’s nor domination: so what next?