Lecture 11



Reminder: Programming Project 1 due
Thursday
Engineers Week Banquet signups due
tomorrow (2/15) in Dean's Office. 20
homework points extra credit for attending.
(Yes, for CS 470, too.)
Questions?
Tuesday, February 14
CS 430 Artificial Intelligence - Lecture 11
Outline

Chapter 5 - Adversarial Search

Alpha-Beta Pruning

Imperfect Real-Time Decisions

Stochastic Games
Alpha-Beta Pruning



As noted at the end of last class, the number of
game states that the minimax algorithm examines
is exponential in the depth of the tree.
Can't eliminate the exponent, but can
effectively cut the exponent in half, since it is
possible to compute the correct minimax
decision without looking at every node by
pruning the search tree.
Consider the two-ply game tree from last time.
Minimax Value

Recall the definition of the minimax value:
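$$
\text{Minimax}(s) =
\begin{cases}
\text{Utility}(s) & \text{if Terminal-Test}(s) \\
\max_{a \in \text{Actions}(s)} \text{Minimax}(\text{Result}(s, a)) & \text{if Player}(s) = \text{MAX} \\
\min_{a \in \text{Actions}(s)} \text{Minimax}(\text{Result}(s, a)) & \text{if Player}(s) = \text{MIN}
\end{cases}
$$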
Pruning Example
Alpha-Beta Pruning

Can also view computation as a simplification
of the Minimax formula. Let x and y be the
values of the unevaluated successors of node
C. Then the value of the root node is:
$$
\begin{aligned}
\text{Minimax}(root) &= \max(\min(3, 12, 8),\ \min(2, x, y),\ \min(14, 5, 2)) \\
&= \max(3,\ \min(2, x, y),\ 2) \\
&= \max(3,\ z,\ 2) \quad \text{where } z = \min(2, x, y) \le 2 \\
&= 3
\end{aligned}
$$
Shows that value at root is independent of the
values of pruned leaves x and y.
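
The simplification above can be checked numerically. This is a minimal sketch: the function name `root_value` and the sample values for x and y are illustrative assumptions, not part of the lecture.

```python
def root_value(x, y):
    """Minimax value of the root of the two-ply example tree."""
    b = min(3, 12, 8)   # first MIN node
    c = min(2, x, y)    # MIN node C, with unevaluated leaves x and y
    d = min(14, 5, 2)   # third MIN node
    return max(b, c, d)

# C can never be worth more than 2, so the root stays at 3 for any x and y.
samples = [(-100, -100), (0, 7), (4, 6), (1000, 1000)]
values = [root_value(x, y) for x, y in samples]
print(values)
```

Whatever values are substituted for x and y, the second MIN node contributes at most 2, which is already worse for MAX than the 3 from the first node.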
Alpha-Beta Pruning


When applied to taller trees, this prunes entire
subtrees.
General principle is: consider a node n
somewhere in the tree, such that Player has a
choice of moving to that node. If Player has a
better choice m either at the parent node of n
or at any choice point further up, then n will
never be reached in actual play.
Alpha-Beta Pruning

Alpha-beta pruning gets its name from the
parameters that describe the bounds on the
backed-up values:

α = value of the best (i.e., highest-value) choice
found so far along the path for MAX
β = value of the best (i.e., lowest-value) choice
found so far along the path for MIN
Alpha-Beta-Search Algorithm


As before, at the root, want the action that has
the maximum value
Function: Alpha-Beta-Search
Receives: state; Returns action
1. v = Max-Value(state, -∞, +∞) // initial range [-∞, +∞]
2. Return the action in Actions (state) with value v
Alpha-Beta-Search: Max-Value

Function: Max-Value
Receives: state, α, β; Returns utility value
1. If Terminal-Test (state) then Return Utility(state)
2. v = -∞
3. For each a in Actions (state) do
3.1 v = Max (v, Min-Value(Result(state,a), α, β))
3.2 If v ≥ β then return v // node is worse than β (for MIN)
3.3 α = Max (α, v)
4. Return v
Alpha-Beta-Search: Min-Value

Function: Min-Value
Receives: state, α, β; Returns utility value
1. If Terminal-Test (state) then Return Utility(state)
2. v = +∞
3. For each a in Actions (state) do
3.1 v = Min (v, Max-Value(Result(state,a), α, β))
3.2 If v ≤ α then return v // node is worse than α (for MAX)
3.3 β = Min (β, v)
4. Return v
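
The Max-Value and Min-Value pseudocode can be sketched in Python as follows. This is one possible rendering under stated assumptions, not the course's reference implementation: the game tree is a nested list whose leaves are utilities, the unevaluated leaves x and y of node C are given the placeholder values 4 and 6, and the `visited` list exists only to show which leaves get evaluated.

```python
import math

visited = []  # leaves actually evaluated (illustration only)

def is_terminal(node):
    return not isinstance(node, list)

def utility(node):
    visited.append(node)
    return node

def max_value(node, alpha, beta):
    if is_terminal(node):
        return utility(node)
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:          # MIN above will never let play reach here
            return v
        alpha = max(alpha, v)
    return v

def min_value(node, alpha, beta):
    if is_terminal(node):
        return utility(node)
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:         # MAX above already has a better choice
            return v
        beta = min(beta, v)
    return v

def alpha_beta_search(root):
    """Return (index of the best root move, its backed-up value)."""
    best_index, best_value = None, -math.inf
    alpha = -math.inf
    for i, child in enumerate(root):
        v = min_value(child, alpha, math.inf)
        if v > best_value:
            best_index, best_value = i, v
        alpha = max(alpha, best_value)
    return best_index, best_value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # 4 and 6 stand in for x and y
move, value = alpha_beta_search(tree)
print(move, value, visited)
```

On this tree the search evaluates seven of the nine leaves: the placeholders for x and y are never visited, while all three successors of the last MIN node are.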
Alpha-Beta-Search Algorithm


Note that Max-Value and Min-Value are the
same as those for the Minimax algorithm
except for the bookkeeping code to maintain α
and β.
Search updates values for α and β, and prunes
(by terminating the recursive call) when the
value of the current call is worse than α or β for
MAX or MIN, respectively.
Pruning Example Again
Alpha-Beta Pruning



Effectiveness of alpha-beta pruning is highly
dependent on the order in which states are examined.
E.g., could not prune any successors of D at all
because the worst successors (for MIN) were generated first.
Various ways to order moves. Can use iterative
deepening search to find the better (killer) moves.
Also can keep track of previously seen states in a
transposition table.
With best-first move ordering, alpha-beta examines
O(b^(m/2)) nodes, so it can search about twice as
deep as Minimax in the same time.
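
The effect of move ordering can be seen by counting leaf evaluations on the example tree. In this sketch the tree encoding, the placeholder leaves 4 and 6, and the helper names are assumptions made for illustration; the only difference between the two runs is the order of D's successors.

```python
import math

def alphabeta(node, alpha, beta, maximizing, counter):
    if not isinstance(node, list):      # leaf: count it and return its utility
        counter[0] += 1
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, counter))
            if v >= beta:
                break
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, counter))
        if v <= alpha:
            break
        beta = min(beta, v)
    return v

def leaves_examined(tree):
    counter = [0]
    value = alphabeta(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]

worst_first = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # D as generated in the example
best_first = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]   # D with MIN's best reply first
print(leaves_examined(worst_first))
print(leaves_examined(best_first))
```

Both orderings return the same root value, but generating D's best successor first lets the search stop after one leaf under D instead of three.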
Imperfect Real-Time Decisions



Alpha-beta search algorithm still searches all
the way to terminal states, which are usually at a
depth that is not practical.
Shannon (1950) proposed that chess-playing
programs should cut off the search earlier and
apply a heuristic evaluation function. This
effectively turns non-terminal nodes into
terminal leaves.
Modify Minimax or Alpha-Beta to use a cutoff
test and an Eval function.
H-Minimax Value

Minimax value using cutoff and Eval:
$$
\text{H-Minimax}(s, d) =
\begin{cases}
\text{Eval}(s) & \text{if Cutoff-Test}(s, d) \\
\max_{a \in \text{Actions}(s)} \text{H-Minimax}(\text{Result}(s, a), d + 1) & \text{if Player}(s) = \text{MAX} \\
\min_{a \in \text{Actions}(s)} \text{H-Minimax}(\text{Result}(s, a), d + 1) & \text{if Player}(s) = \text{MIN}
\end{cases}
$$
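
A minimal Python sketch of the H-Minimax definition follows. The nested-list tree, the fixed depth `limit`, and especially the `evaluate` heuristic (the mean of the estimates below a node) are hypothetical choices made only so the example runs; a real program would use a domain-specific evaluation function.

```python
def cutoff_test(state, depth, limit):
    # Must be true for every terminal state (a leaf here), and at the depth limit.
    return not isinstance(state, list) or depth >= limit

def evaluate(state):
    # Hypothetical heuristic: exact utility at a leaf, otherwise a crude
    # guess formed by averaging the estimates of the successors.
    if not isinstance(state, list):
        return state
    estimates = [evaluate(child) for child in state]
    return sum(estimates) / len(estimates)

def h_minimax(state, depth, maximizing, limit):
    if cutoff_test(state, depth, limit):
        return evaluate(state)
    values = [h_minimax(child, depth + 1, not maximizing, limit)
              for child in state]
    return max(values) if maximizing else min(values)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
full = h_minimax(tree, 0, True, limit=2)     # deep enough to reach every leaf
shallow = h_minimax(tree, 0, True, limit=1)  # cut off at the MIN nodes
print(full, shallow)
```

With the limit at 2 the result is the true minimax value; with the limit at 1 the MIN nodes are treated as leaves and the returned value is only the heuristic's estimate.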
Evaluation Functions


Evaluation function returns an estimate of the
expected utility of the game from a given
position.
Desired properties:



Should order the terminal states in the same way as
utility function (otherwise suboptimal)
Must be efficient to compute; in particular, faster
than computing Minimax value
Should be strongly correlated with actual chance of
winning. (Uncertainty due to early cutoff.)
Evaluation Functions

Simple evaluation functions are often weighted
linear functions of the value of various features
of the state.
$$\text{Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s)$$

E.g. in chess, pieces have a material value: 1
for pawn, 3 for knight or bishop, 5 for rook, 9 for
queen. f_i(s) = number of pieces of category i.
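
As a sketch of the weighted linear form, the material values above can serve as the weights. One assumption goes beyond the slide: each feature is taken to be the difference between the two sides' piece counts, so that a positive score favors the player being evaluated. The piece counts are made up for illustration.

```python
# Material weights from the slide.
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_eval(own_counts, opponent_counts):
    """Weighted linear evaluation: sum over i of w_i * f_i(s).

    Assumption: f_i is own count minus opponent count for piece type i.
    """
    return sum(
        weight * (own_counts.get(piece, 0) - opponent_counts.get(piece, 0))
        for piece, weight in WEIGHTS.items()
    )

# Hypothetical position: White is up one pawn and one bishop.
white = {"pawn": 8, "knight": 2, "bishop": 2, "rook": 2, "queen": 1}
black = {"pawn": 7, "knight": 2, "bishop": 1, "rook": 2, "queen": 1}
score = material_eval(white, black)
print(score)
```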
Evaluation Functions


Linear combinations assume that the
contribution of each feature is independent of
the values of other features. More
sophisticated functions also use nonlinear
combinations. E.g., bishop is more valuable
during the endgame.
Note that the notion of features and weights is not
part of the rules of chess. They come from human
chess-playing experience. Where such experience is
not available, use machine learning techniques to
estimate the weights.
Cutting Off Search



Cutoff-Test replaces Terminal-Test in
algorithm. It must return true for all terminal
states.
Easiest implementation is to set a fixed depth
limit. The limit d is chosen so that a move is
selected within the allotted time.
More robust approach is to use iterative
deepening. When time runs out, program
returns the move selected by the deepest
completed search.
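
The iterative-deepening approach might look like the sketch below. Several things here are assumptions: the nested-list tree, the averaging `evaluate` heuristic, the `max_depth` cap, and the simplification that the clock is checked only between iterations (a real program would also abandon a search that runs past the deadline).

```python
import time

def evaluate(state):
    # Hypothetical heuristic: utility at a leaf, mean of successors otherwise.
    if not isinstance(state, list):
        return state
    estimates = [evaluate(child) for child in state]
    return sum(estimates) / len(estimates)

def depth_limited_value(state, depth, limit, maximizing):
    if not isinstance(state, list) or depth >= limit:
        return evaluate(state)
    values = [depth_limited_value(child, depth + 1, limit, not maximizing)
              for child in state]
    return max(values) if maximizing else min(values)

def iterative_deepening_move(root, time_budget, max_depth):
    """Return (best root move index, deepest completed depth limit)."""
    deadline = time.monotonic() + time_budget
    best_move, completed = 0, 0      # fallback if no search completes
    for limit in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break                    # out of time: keep the last completed result
        values = [depth_limited_value(child, 1, limit, False) for child in root]
        best_move = max(range(len(values)), key=values.__getitem__)
        completed = limit
    return best_move, completed

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
move, depth = iterative_deepening_move(tree, time_budget=5.0, max_depth=2)
print(move, depth)
```

Because each iteration overwrites `best_move` only after it finishes, the function always returns the choice of the deepest search that completed.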
Search vs. Lookup



Beginnings and endings of games are usually
stored in lookup tables rather than generated.
For chess openings, the expertise of humans is
copied from books.
For chess endgames, computers have been
used to solve all endgames involving small
numbers of pieces (currently 6 pieces).
Stochastic Games


Many games add a
random element, such
as throwing dice.
Called stochastic
games.
E.g. backgammon uses
dice to determine legal
moves. (White moves
towards 25; Black
moves towards 0.)
Stochastic Games



White knows what its legal moves are, but not
what Black's legal moves will be, so cannot do
standard minimax tree. Add chance nodes.
Branches from chance nodes labeled with
possible dice rolls and probability of each roll.
E.g. rolling 1,1 has probability of 1/36.
Since 5-6 is same as 6-5, 21 distinct rolls with
doubles having probability of 1/36, the rest
having probability 1/18.
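
The roll counts and probabilities can be verified by enumeration. This is a small sketch; the variable name `rolls` is arbitrary.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Unordered pairs of dice faces: (5, 6) and (6, 5) count as the same roll.
rolls = {}
for a, b in combinations_with_replacement(range(1, 7), 2):
    rolls[(a, b)] = Fraction(1, 36) if a == b else Fraction(1, 18)

print(len(rolls))           # number of distinct rolls
print(sum(rolls.values()))  # the probabilities should total 1
```

The six doubles contribute 6/36 and the fifteen non-doubles contribute 15/18, which together sum to 1.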
Stochastic Games
Expectiminimax Value

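Use probabilities to compute an expectiminimax value for games with chance nodes.
$$
\text{Expectiminimax}(s) =
\begin{cases}
\text{Utility}(s) & \text{if Terminal-Test}(s) \\
\max_{a \in \text{Actions}(s)} \text{Expectiminimax}(\text{Result}(s, a)) & \text{if Player}(s) = \text{MAX} \\
\min_{a \in \text{Actions}(s)} \text{Expectiminimax}(\text{Result}(s, a)) & \text{if Player}(s) = \text{MIN} \\
\sum_{r} P(r)\, \text{Expectiminimax}(\text{Result}(s, r)) & \text{if Player}(s) = \text{CHANCE}
\end{cases}
$$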
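
The definition translates directly into a recursive function. In this sketch the node encoding (a number for a terminal state, otherwise a `(kind, children)` pair, with chance children given as `(probability, child)` pairs) and the small example tree are assumptions made for illustration.

```python
def expectiminimax(node):
    if isinstance(node, (int, float)):      # terminal state: return its utility
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":                    # probability-weighted average
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind}")

# Hypothetical tree: MAX moves, then a fair coin flip, then MIN replies.
tree = ("max", [
    ("chance", [(0.5, ("min", [3, 5])), (0.5, ("min", [1, 9]))]),
    ("chance", [(0.5, ("min", [4, 6])), (0.5, ("min", [2, 10]))]),
])
value = expectiminimax(tree)
print(value)
```

The first move is worth 0.5 * 3 + 0.5 * 1 = 2.0 and the second 0.5 * 4 + 0.5 * 2 = 3.0, so MAX prefers the second.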
Evaluation Functions with Chance

As before, need to use cutoff and evaluation
function. However, not as straightforward.
E.g., the best move is a1 in the left tree but a2 in
the right tree, even though the leaf values are
ordered the same in both.
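
The effect can be reproduced with two small trees. The figure is not reproduced in this transcript, so the probabilities and leaf values below are illustrative assumptions: both trees rank their four leaves in the same order, and only the scale of the values differs.

```python
def expected(outcomes):
    """Expected value of a chance node given (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)

def best_move(moves):
    return max(moves, key=lambda name: expected(moves[name]))

# Leaves ordered 1 < 2 < 3 < 4 on the left and 1 < 20 < 30 < 400 on the right.
left = {"a1": [(0.9, 2), (0.1, 3)], "a2": [(0.9, 1), (0.1, 4)]}
right = {"a1": [(0.9, 20), (0.1, 30)], "a2": [(0.9, 1), (0.1, 400)]}

print(best_move(left), best_move(right))
```

On the left a1 is worth about 2.1 against 1.3 for a2; on the right a1 is worth about 21 against 40.9 for a2. With chance nodes, the magnitudes of the evaluation values matter, not just their order.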
Evaluation Functions with Chance



To avoid this problem, evaluation function must
be a positive linear transformation of the
probability of winning from a position.
Addition of chance makes complexity O(b^m n^m)
where n is number of distinct rolls. For
backgammon, b is around 20 and n is 21, so 3
plies is about as deep as can get.
Might use Monte Carlo simulation to
determine the value of a position.
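
To see why about 3 plies is the practical limit, the complexity figure can be worked out for the backgammon numbers quoted above (b around 20, n = 21). The 4-ply line is added here only for comparison.

```latex
\begin{aligned}
b^m n^m &= (b\,n)^m \\
m = 3:\quad (20 \cdot 21)^3 &= 420^3 = 74{,}088{,}000 \approx 7.4 \times 10^{7} \\
m = 4:\quad (20 \cdot 21)^4 &= 420^4 = 31{,}116{,}960{,}000 \approx 3.1 \times 10^{10}
\end{aligned}
```

Each additional ply multiplies the work by roughly 420.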