Download MiniMax with Alpha

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Choice modelling wikipedia , lookup

Time value of money wikipedia , lookup

Transcript
MiniMax with Alpha-Beta Cutoff
(Summary taken from the Internet)
We are considering two-player, perfect information games. The two players take turns
and try respectively to maximize and minimize a scoring function (also called utility
function). The two players are called respectively MAX and MIN. We assume that the
MAX player makes the first move. We represent the game as a tree where the nodes
represent the current position and the arcs represent moves. Since players take turns,
successive nodes represent positions where different players must move. We call the
nodes MAX or MIN nodes depending of who is the player that must move at that node. A
game tree could be infinite. The leaves represent terminal positions, i.e. positions where
MAX wins (score: +infinity) or MIN wins (score: -infinity). The ply of a node is the
number of moves needed to reach that node (i.e. arcs from the root of the tree). The ply of
a tree is the maximum of the plies of its nodes.
An imperfect information game is one where we do not expand fully the game tree and
some of the leaves represent positions that could be analyzed further. For these positions,
in place of a utility function value, we will have a scoring function value. Otherwise the
game trees for perfect and imperfect information games will be treated alike. [There are
interesting observations by Pearl on the interaction of ply and errors in the case of
imperfect information game trees.]
The MINIMAX GAME STRATEGY for the MAX (MIN) player is to select the move
that leads to the successor node with the highest (lowest) score. The scores are computed
starting from the leaves of the tree and backing up their scores to their predecessor in
accordance with the Minimax strategy. The problem with this strategy is that it explores
each node in the tree.
function MINIMAX(N) is
begin
if N is a leaf then
return the estimated score of this leaf
else
Let N1, N2, .., Nm be the successors of N;
if N is a Min node then
return min{MINIMAX(N1), .., MINIMAX(Nm)}
else
return max{MINIMAX(N1), .., MINIMAX(Nm)}
end MINIMAX;
ALPHA-BETA cutoff is a method for reducing the number of nodes explored in the
Minimax strategy. For the nodes it explores it computes, in addition to the score, an alpha
value and a beta value.
ALPHA value of a node
It is a value never greater than the true score of this node. Initially it is the score
of that node, if the node is a leaf, otherwise it is -infinity. Then at a MAX node it
is set to the largest of the scores of its successors explored up to now, and at a
MIN node to the alpha value of its predecessor.
BETA value of a node
It is a value never smaller than the true score of this node. Initially it is the score
of that node, if the node is a leaf, otherwise it is +infinity. Then at a MIN node it
is set to the smallest of the scores of its successors explored up to now, and at a
MAX node to the beta value of its predecessor.
It is guaranteed that:



The score of a node will always be no less than the alpha value and no greater
than the beta value of that node.
As the algorithm evolves, the alpha and beta values of a node may change, but the
alpha value will never decrease, and the beta value will never increase.
When a node is visited last, its score is set to the alpha value of that node, if it is a
MAX node, otherwise it is set to the beta value.
function MINIMAX-AB(N, A, B) is
begin
if N is a leaf then
return the estimated score of this leaf
else
Set Alpha value of N to -infinity and
Beta value of N to +infinity;
if N is a Min node then
For each successor Ni of N loop
Let Val be MINIMAX-AB(Ni, A, Min{B,Beta of N});
Set Beta value of N to Min{Beta value of N, Val};
When A >= Beta value of N then
Return Beta value of N endloop;
Return Beta value of N;
else
For each successor Ni of N loop
Let Val be MINIMAX-AB(Ni, Max{A,Alpha value of N},B);
Set Alpha value of N to Max{Alpha value of N, Val};
When Alpha value of N >= B then
Return Alpha value of N endloop;
Return Alpha value of N;
end MINIMAX-AB;
The MINIMAX-AB function is called with as parameters the root of the game tree, infinity (-I) as alpha value, and +infinity (+I) as beta value. Here is an example of Alpha
Beta Cutoff. And here is a C (yes C) program that tests the alpha-beta function on a
couple of examples.
The alpha-beta search can be easily modified to guaranty that the maximum depth of the
game tree never exceeds d.
The efficiency of the Alpha-Beta procedure depends on the order in which successors of
a node are examined. If we were lucky, at a MIN node we would always consider the
nodes in order from low to high score and at a MAX node the nodes in order from high to
low score.
For example, a "lucky" game tree (binary, of depth 4, complete) has leaf scores, from left
to right, 4,3,8,7,2,1,6,5. Of these leaves only 5 will be examined.
Another "lucky" game tree (binary, of depth 6, complete) has leaf scores, from left to
right 12,11,32,31,10,9,30,29,16,15,28,27,14,13,26,25,4,3,24,23,2,1
22,21,8,7,20,19,6,5,18,17. Of these leaves only 11 will be examined. In general it can be
shown that in the most favorable circumstances the alpha-beta search opens as many
leaves as minimax on a game tree with double its depth.
Minimax and alpha-beta must be modified when we deal with games that involve chance.
For example, in backgammon the moves of each player take place after a throw of the
dices. One modifies the game tree to add after each normal node chance nodes to
represent the outcomes of the throw; then the player choses moves from these chance
nodes. The score of the chance nodes is computed as usual. The score of the original node
is the weighted sum of the scores of its successor chance nodes weighted by their
probabilities.
References
Nillson,N.J.: Principles of Artificial Intelligence
Tioga Publishing Co. 1980
Pages 112-125
Rich,E.,Knight,K.: Artificial Intelligence
McGraw-Hill, 1991
Pages 307-319
Luger,G.F.,Stubblefield,W.A.: Artificial Intelligence: Structures
and strategies for complex problem solving, Second Edition
The Benjamin/Cummings Publishing Co. 1993 Pages 137-145
Russell,S.,Norvig,P.: Artificial Intelligence, a Modern Approach
Prentice-Hall, 1995
Pages 122-148
Norvig,P.: Paradigms of Artificial Intelligence Programming
Morgan-Kaufmann, 1992
Pages 596-626
Pearl,J.: Heuristics
Addison-Wesley, 1984
Pages 332-360