Introduction to Artificial Intelligence
Lecture 10: Two-Player Games II
February 25, 2016

The Alpha-Beta Procedure
Can we estimate the efficiency benefit of the alpha-beta method?
Of course, the efficiency gain by the alpha-beta method always depends on the rules and the current configuration of the game.
Suppose that there is a game that always allows a player to choose among b different moves, and we want to look d moves ahead. Then our search tree has b^d leaves.
Therefore, if we do not use alpha-beta pruning, we would have to apply the static evaluation function N_d = b^d times.

The Alpha-Beta Procedure
However, if we assume that somehow new children of a node are explored in a particular order - those nodes p are explored first that will yield maximum values e(p) at depth d for MAX and minimum values for MIN - the number of nodes to be evaluated is:
N_d = b^(⌈d/2⌉) + b^(⌊d/2⌋) - 1 ≈ 2b^(d/2)
Therefore, the actual number N_d can range from about 2b^(d/2) (best case) to b^d (worst case).
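
For a concrete sense of scale (example numbers, not from the slides): with b = 10 and d = 6, the worst case requires N_6 = 10^6 = 1,000,000 evaluations of e(p), while the best case needs only about 2·10^3 = 2,000.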

The Alpha-Beta Procedure
This means that in the best case the alpha-beta technique enables us to look ahead almost twice as far as without it in the same amount of time.
In order to get close to the best case, we can compute e(p) immediately for every new node that we expand and use this value as an estimate for the Minimax value that the node will receive after expanding its successors until depth d.
We can then use these estimates to expand the most likely candidates first (greatest e(p) for MAX, smallest for MIN).

The Alpha-Beta Procedure
Of course, this pre-sorting of nodes requires us to compute the static evaluation function e(p) not only for the leaves of our search tree, but also for all of its inner nodes that we create.
However, in most cases, pre-sorting will substantially increase the algorithm's efficiency.
The better our function e(p) captures the actual standing of the game in configuration p, the greater will be the efficiency gain achieved by the pre-sorting method.
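
To make the pre-sorting idea concrete, here is a minimal Java sketch (Java is also the language of the tournament interface mentioned later). The GameState interface with successors(), eval(), and isTerminal() is a hypothetical placeholder, not the actual course interface:

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical minimal interface for illustration only.
interface GameState {
    List<GameState> successors();  // configurations reachable in one move
    int eval();                    // static evaluation function e(p)
    boolean isTerminal();          // someone has won or no moves are possible
}

class AlphaBeta {
    // Alpha-beta search to the given depth, with pre-sorting: children are
    // explored in order of e(p), most promising first (greatest e(p) for
    // MAX, smallest for MIN), to get closer to the best case of ~2b^(d/2).
    static int search(GameState p, int depth, int alpha, int beta, boolean maxToMove) {
        if (depth == 0 || p.isTerminal()) {
            return p.eval();
        }
        List<GameState> children = p.successors();
        Comparator<GameState> byEval = Comparator.comparingInt(GameState::eval);
        children.sort(maxToMove ? byEval.reversed() : byEval);
        if (maxToMove) {
            for (GameState child : children) {
                alpha = Math.max(alpha, search(child, depth - 1, alpha, beta, false));
                if (alpha >= beta) break;  // cutoff: MIN would never allow this line
            }
            return alpha;
        } else {
            for (GameState child : children) {
                beta = Math.min(beta, search(child, depth - 1, alpha, beta, true));
                if (beta <= alpha) break;  // cutoff: MAX would never allow this line
            }
            return beta;
        }
    }
}
```

A real implementation would cache each child's e(p) value instead of recomputing it inside the comparator; the sort is exactly where the extra evaluations of inner nodes described above are spent.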

The Alpha-Beta Procedure
Even if you do not want to apply e(p) to inner nodes, you should at least do a simple check whether in configuration p one of the players has already won or no more moves are possible.
If one of the players has won, this simplified version of e(p) returns the value ∞ or -∞ if in configuration p the player MAX or MIN, respectively, has won. It returns 0 (draw) if no more moves are possible.
This way, no unnecessary - and likely misleading - analysis of impossible future configurations can occur.
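
As a sketch of such a check (reusing the hypothetical GameState interface from above and an assumed game-specific helper hasWon), with Integer.MAX_VALUE and Integer.MIN_VALUE standing in for ∞ and -∞:

```java
enum Player { MAX, MIN }

// Returns the exact value of configuration p if the game is already decided
// there, or null if the search should continue expanding p.
// hasWon(p, player) is an assumed game-specific helper.
static Integer terminalValue(GameState p) {
    if (hasWon(p, Player.MAX)) return Integer.MAX_VALUE;  // MAX has won: "+infinity"
    if (hasWon(p, Player.MIN)) return Integer.MIN_VALUE;  // MIN has won: "-infinity"
    if (p.successors().isEmpty()) return 0;               // no more moves: draw
    return null;                                          // not decided yet
}
```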

Timing Issues
It is very difficult to predict for a given game situation how many operations a depth d look-ahead will require.
Since we want the computer to respond within a certain amount of time, it is a good idea to apply the idea of iterative deepening.
First, the computer finds the best move according to a one-move look-ahead search.
Then, the computer determines the best move for a two-move look-ahead, and remembers it as the new best move.
This is continued until the time runs out. Then the currently remembered best move is executed.
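
A minimal sketch of this timing loop, assuming a hypothetical pickBestMove(state, depth) wrapper that runs the alpha-beta search at a fixed depth and returns the best move:

```java
// Iterative deepening: search with look-ahead 1, 2, 3, ... and always
// remember the best move from the last completed iteration.
static GameState chooseMove(GameState state, long budgetMillis) {
    long deadline = System.currentTimeMillis() + budgetMillis;
    GameState bestMove = null;
    for (int depth = 1; System.currentTimeMillis() < deadline; depth++) {
        bestMove = pickBestMove(state, depth);  // assumed helper: full search at this depth
    }
    return bestMove;  // the currently remembered best move is executed
}
```

A production version would also abort the search that is still in progress when the deadline passes; this sketch only checks the clock between iterations.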

How to Find Static Evaluation Functions
Often, a static evaluation function e(p) first computes an appropriate feature vector f(p) that contains information about features of the current game configuration that are important for its evaluation.
There is also a weight vector w(p) that indicates the weight (= importance) of each feature for the assessment of the current situation.
Then e(p) is simply computed as the dot product of f(p) and w(p).
Both the identification of the most relevant features and the correct estimation of their relative importance are crucial for the strength of a game-playing program.

How to Find Static Evaluation Functions
Once we have found suitable features, the weights can be adapted algorithmically. This can be achieved, for example, with an artificial neural network.
So the biggest problem consists in extracting the most informative features from a game configuration.
Let us look at an example: Chinese Checkers.
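
In code, the dot product is a short loop. This sketch assumes a game-specific helper extractFeatures(p) that computes f(p); the weights are passed in as a plain array:

```java
// e(p) as the dot product of the feature vector f(p) and a weight vector w.
static double evaluate(GameState p, double[] w) {
    double[] f = extractFeatures(p);  // assumed game-specific helper for f(p)
    double value = 0.0;
    for (int i = 0; i < f.length; i++) {
        value += w[i] * f[i];         // e(p) = w · f(p)
    }
    return value;
}
```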

Chinese Checkers
• Move all your pieces into your opponent's home area.
• In each move, a piece can either move to a neighboring position or jump over any number of pieces.
[Board diagram]

Chinese Checkers
Sample moves for RED (bottom) player:
[Board diagram with sample moves]

Chinese Checkers
Idea for important feature:
• assign positional values
• sum values for all pieces of each player
• feature "progress" is difference of sum between players
[Board diagram: positional values 0 to 8 assigned to the board positions]

Chinese Checkers
Another important feature:
• For successful play, no piece should be "left behind".
• Therefore add another feature "coherence": difference between the players in terms of the smallest positional value for any of their pieces.
Weights used in sample program:
• 1 for progress
• 2 for coherence
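
Assuming the positional values of each player's pieces have already been looked up in a table like the one above, the sample program's two features and weights could be sketched as follows (the board lookup itself is omitted):

```java
// maxValues / minValues: the positional value of each MAX / MIN piece.
static int evaluate(int[] maxValues, int[] minValues) {
    int progress = sum(maxValues) - sum(minValues);   // difference of the sums
    int coherence = min(maxValues) - min(minValues);  // difference of the minima
    return 1 * progress + 2 * coherence;              // weights from the sample program
}

static int sum(int[] a) { int s = 0; for (int v : a) s += v; return s; }
static int min(int[] a) { int m = a[0]; for (int v : a) m = Math.min(m, v); return m; }
```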

Isola
Your biggest programming assignment in this course
will be the development of a program playing the
game Isola.
In order to win the tournament and receive an
incredibly valuable prize, you will have to write a
static evaluation function that
• assesses a game configuration accurately and
• can be computed efficiently.

Isola
Rules of Isola:
• Each of the two players has one piece.
• The board has 7×7 positions which initially contain
squares, except for the initial positions of the
pieces.
• A move consists of two subsequent actions:
– moving one’s piece to a neighboring
(horizontally, vertically, or diagonally) field
that contains a square but not the
opponent’s piece,
– removing any square with no piece on it.
• If a player cannot move any more, he/she loses the
game.
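
For illustration, here is a sketch of enumerating the first half of a move, the piece movement, under an assumed board representation where square[r][c] is true while the square at row r, column c is still present:

```java
// Counts the legal piece movements for the player whose piece is at
// (myRow, myCol); the opponent's piece is at (oppRow, oppCol).
static int countPieceMoves(boolean[][] square, int myRow, int myCol,
                           int oppRow, int oppCol) {
    int moves = 0;
    for (int dr = -1; dr <= 1; dr++) {
        for (int dc = -1; dc <= 1; dc++) {
            if (dr == 0 && dc == 0) continue;                  // must actually move
            int r = myRow + dr, c = myCol + dc;
            if (r < 0 || r >= 7 || c < 0 || c >= 7) continue;  // off the 7x7 board
            if (!square[r][c]) continue;                       // square already removed
            if (r == oppRow && c == oppCol) continue;          // blocked by opponent
            moves++;
        }
    }
    return moves;
}
```

Each such movement can then be combined with removing any one of the remaining empty squares, which is what makes the number of complete moves so large.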

Isola
Initial Configuration:
[Board diagram: the initial 7×7 board with the two pieces on it]

Isola
If in this situation O is to move, then X is the winner:
[Board diagram of the position]
If X is to move, he/she can just move left and remove the square between X and O, and also wins the game.

Isola
You can start thinking about an appropriate
evaluation function for this game.
You may even consider revising the Minimax and alpha-beta search algorithms to reduce the enormous branching factor in the search tree for Isola.
We will further discuss the game and the Java
interface for the tournament next week.
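
As one illustrative starting point (an assumption, not the lecture's solution): since a player who can no longer move loses, the difference in mobility between the two players is a natural feature. Reusing the hypothetical countPieceMoves sketch from the rules section:

```java
// Mobility-difference evaluation for Isola. A simple illustration that
// ignores whose turn it is: a trapped player has lost, so prefer positions
// where MAX keeps many options open and MIN has few.
static int evaluate(boolean[][] square, int maxRow, int maxCol,
                    int minRow, int minCol) {
    int maxMobility = countPieceMoves(square, maxRow, maxCol, minRow, minCol);
    int minMobility = countPieceMoves(square, minRow, minCol, maxRow, maxCol);
    if (maxMobility == 0) return Integer.MIN_VALUE;  // MAX is trapped: MAX loses
    if (minMobility == 0) return Integer.MAX_VALUE;  // MIN is trapped: MAX wins
    return maxMobility - minMobility;
}
```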