Lecture 11

Reminder: Programming Project 1 is due Thursday.
Engineers Week Banquet signups are due tomorrow (2/15) in the Dean's Office. 20 homework points of extra credit for attending. (Yes, for CS 470, too.)
Questions?

Tuesday, February 14 CS 430 Artificial Intelligence - Lecture 11

Outline
Chapter 5 - Adversarial Search
  Alpha-Beta Pruning
  Imperfect Real-Time Decisions
  Stochastic Games

Alpha-Beta Pruning
As noted at the end of last class, the number of game states that the minimax algorithm examines is exponential in the depth of the tree.
We can't eliminate the exponent, but we can effectively cut it in half, since it is possible to compute the correct minimax decision without looking at every node, by pruning the search tree.
Consider the two-ply game tree from last time.

Minimax Value
Recall the definition of the minimax value:

\[
\text{Minimax}(s) =
\begin{cases}
\text{Utility}(s) & \text{if Terminal-Test}(s) \\
\max_{a \in \text{Actions}(s)} \text{Minimax}(\text{Result}(s,a)) & \text{if Player}(s) = \text{MAX} \\
\min_{a \in \text{Actions}(s)} \text{Minimax}(\text{Result}(s,a)) & \text{if Player}(s) = \text{MIN}
\end{cases}
\]

Pruning Example
[Figure: pruning example on the two-ply game tree]

Alpha-Beta Pruning
We can also view the computation as a simplification of the Minimax formula. Let x and y be the values of the unevaluated successors of node C. Then the value of the root node is:

\begin{align*}
\text{Minimax}(\text{root}) &= \max(\min(3,12,8),\ \min(2,x,y),\ \min(14,5,2)) \\
&= \max(3,\ \min(2,x,y),\ 2) \\
&= \max(3,\ z,\ 2) \quad \text{where } z = \min(2,x,y) \le 2 \\
&= 3
\end{align*}

This shows that the value at the root is independent of the values of the pruned leaves x and y.

Alpha-Beta Pruning
When applied to taller trees, the same reasoning prunes entire subtrees, not just leaves.
The general principle is: consider a node n somewhere in the tree, such that Player has a choice of moving to that node.
If Player has a better choice m either at the parent node of n or at any choice point further up, then n will never be reached in actual play.

Alpha-Beta Pruning
Alpha-beta pruning gets its name from the parameters that describe the bounds on the backed-up values:
  α = value of the best (i.e., highest-value) choice found so far along the path for MAX
  β = value of the best (i.e., lowest-value) choice found so far along the path for MIN

Alpha-Beta-Search Algorithm
As before, at the root we want the action that has the maximum value.

Function: Alpha-Beta-Search
Receives: state; Returns: action
1. v = Max-Value(state, -∞, +∞)   // initial range [-∞, +∞]
2. Return the action in Actions(state) with value v

Alpha-Beta-Search: Max-Value
Function: Max-Value
Receives: state, α, β; Returns: utility value
1. If Terminal-Test(state) then return Utility(state)
2. v = -∞
3. For each a in Actions(state) do
   3.1 v = Max(v, Min-Value(Result(state, a), α, β))
   3.2 If v ≥ β then return v   // MIN already has a choice at least as good (for MIN) as β, so prune
   3.3 α = Max(α, v)
4. Return v

Alpha-Beta-Search: Min-Value
Function: Min-Value
Receives: state, α, β; Returns: utility value
1. If Terminal-Test(state) then return Utility(state)
2. v = +∞
3. For each a in Actions(state) do
   3.1 v = Min(v, Max-Value(Result(state, a), α, β))
   3.2 If v ≤ α then return v   // MAX already has a choice at least as good (for MAX) as α, so prune
   3.3 β = Min(β, v)
4. Return v

Alpha-Beta-Search Algorithm
Note that Max-Value and Min-Value are the same as those for the Minimax algorithm except for the bookkeeping code to maintain α and β.
The search updates the values of α and β as it goes, and prunes (by terminating the recursive call) as soon as the value of the current node is known to be worse than the current α or β for MAX or MIN, respectively.
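The Max-Value and Min-Value functions above can be sketched in Python on the two-ply example tree. This is a minimal illustration, not the course's reference implementation: the nested-list tree encoding, the `leaves_seen` counter, and the values 4 and 6 standing in for the unknown leaves x and y are all assumptions made so the code runs.

```python
import math

# Two-ply example tree: MAX at the root, MIN at nodes B, C, D.
# The lecture leaves C's last two leaves as unknowns x and y; 4 and 6 are
# placeholder values (an assumption) and never influence the result.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

leaves_seen = []  # every leaf whose utility is actually evaluated


def max_value(node, alpha, beta):
    if not isinstance(node, list):       # terminal test: a bare number is a leaf
        leaves_seen.append(node)
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:                    # MIN has a better option elsewhere: prune
            return v
        alpha = max(alpha, v)
    return v


def min_value(node, alpha, beta):
    if not isinstance(node, list):
        leaves_seen.append(node)
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:                   # MAX has a better option elsewhere: prune
            return v
        beta = min(beta, v)
    return v


def alpha_beta_search(tree):
    """Return (index of the best root action, its backed-up value)."""
    best_action, best_value = None, -math.inf
    alpha, beta = -math.inf, math.inf
    for i, child in enumerate(tree):
        v = min_value(child, alpha, beta)
        if v > best_value:
            best_action, best_value = i, v
        alpha = max(alpha, best_value)
    return best_action, best_value


if __name__ == "__main__":
    print(alpha_beta_search(TREE))   # best root action and its value
    print(leaves_seen)               # the two placeholder leaves are never visited
```

The root loop tracks the best action directly rather than recomputing it from v, which is one common way to realize step 2 of Alpha-Beta-Search.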
Pruning Example Again
[Figure: the pruning example revisited]

Alpha-Beta Pruning
The effectiveness of alpha-beta pruning is highly dependent on the order in which states are examined.
E.g., we could not prune any successors of D at all because the worst successors (from MIN's point of view) were generated first.
There are various ways to order moves.
  Can use iterative deepening search to find the better (killer) moves, and try those first.
  Can also keep track of previously seen states in a transposition table.
With good move ordering, alpha-beta examines O(b^(m/2)) nodes instead of O(b^m), so it can search about twice as deep as Minimax in the same amount of time.

Imperfect Real-Time Decisions
The alpha-beta search algorithm still searches all the way to terminal states, which are usually at a depth that is not practical.
Shannon (1950) proposed that chess-playing programs should cut off the search earlier and apply a heuristic evaluation function. This effectively turns non-terminal nodes into terminal leaves.
Modify Minimax or Alpha-Beta to use a cutoff test and an Eval function.

H-Minimax Value
Minimax value using cutoff and Eval:

\[
\text{H-Minimax}(s, d) =
\begin{cases}
\text{Eval}(s) & \text{if Cutoff-Test}(s, d) \\
\max_{a \in \text{Actions}(s)} \text{H-Minimax}(\text{Result}(s,a),\, d+1) & \text{if Player}(s) = \text{MAX} \\
\min_{a \in \text{Actions}(s)} \text{H-Minimax}(\text{Result}(s,a),\, d+1) & \text{if Player}(s) = \text{MIN}
\end{cases}
\]

Evaluation Functions
An evaluation function returns an estimate of the expected utility of the game from a given position.
Desired properties:
  Should order the terminal states in the same way as the utility function (otherwise play is suboptimal).
  Must be efficient to compute; in particular, faster than computing the Minimax value.
  Should be strongly correlated with the actual chance of winning. (The uncertainty is due to the early cutoff.)
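The H-Minimax definition above can be sketched on a small game. Everything specific here is an assumption for illustration and not from the lecture: the toy game (players alternately remove 1 to 3 stones, and whoever takes the last stone wins), the fixed `DEPTH_LIMIT`, and the heuristic inside `evaluate`.

```python
# A state is (stones_left, max_to_move). Hypothetical toy game, see lead-in.
DEPTH_LIMIT = 4  # assumed fixed depth limit d


def actions(state):
    stones, _ = state
    return [k for k in (1, 2, 3) if k <= stones]


def result(state, k):
    stones, max_to_move = state
    return (stones - k, not max_to_move)


def is_terminal(state):
    return state[0] == 0


def utility(state):
    # The player who just moved took the last stone and wins.
    # If MAX is "to move" on an empty pile, MIN made the last move.
    return -1.0 if state[1] else 1.0


def cutoff_test(state, depth):
    # Must return true for every terminal state, and also at the depth limit.
    return is_terminal(state) or depth >= DEPTH_LIMIT


def evaluate(state):
    if is_terminal(state):
        return utility(state)          # agree with Utility on terminal states
    stones, max_to_move = state
    # Assumed heuristic: the player to move is badly placed when stones % 4 == 0.
    mover_score = -0.5 if stones % 4 == 0 else 0.5
    return mover_score if max_to_move else -mover_score


def h_minimax(state, depth=0):
    if cutoff_test(state, depth):
        return evaluate(state)
    values = [h_minimax(result(state, k), depth + 1) for k in actions(state)]
    return max(values) if state[1] else min(values)


def best_move(state):
    """Best action for MAX at the root (the root's children are at depth 1)."""
    return max(actions(state), key=lambda k: h_minimax(result(state, k), 1))
```

Note that `evaluate` falls back to `utility` on terminal states, so the ordering of terminal states matches the utility function, and non-terminal estimates stay strictly between the win and loss values.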
Evaluation Functions
Simple evaluation functions are often weighted linear functions of the values of various features of the state:

\[
\text{Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s)
\]

E.g., in chess, pieces have a material value: 1 for a pawn, 3 for a knight or bishop, 5 for a rook, 9 for the queen. Here w_i is the material value of piece type i and f_i(s) is the number of pieces of type i.

Evaluation Functions
Linear combinations assume that the contribution of each feature is independent of the values of the other features.
More sophisticated functions also use nonlinear combinations. E.g., a bishop is more valuable during the endgame.
Note that the notion of features and weights is not part of the rules of chess. They come from human chess-playing experience. Where such experience is not available, machine learning techniques can be used to find the weights.

Cutting Off Search
Cutoff-Test replaces Terminal-Test in the algorithm. It must return true for all terminal states.
The easiest implementation is to set a fixed depth limit. The limit d is chosen so that a move is selected within the allotted time.
A more robust approach is to use iterative deepening. When time runs out, the program returns the move selected by the deepest completed search.

Search vs. Lookup
Beginnings and endings of games are usually stored in lookup tables rather than generated by search.
For chess openings, the expertise of humans is copied from books.
For chess endgames, computers have been used to solve all endgames involving small numbers of pieces (currently 6 pieces).

Stochastic Games
Many games add a random element, such as throwing dice. These are called stochastic games.
E.g., backgammon uses dice to determine the legal moves. (White moves towards 25; Black moves towards 0.)
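Going back to the weighted linear evaluation function Eval(s) = w1 f1(s) + ... + wn fn(s) from the Evaluation Functions slides above, a minimal Python sketch using the lecture's material values might look as follows. The piece counts are made up for illustration, and scoring a position as the difference between the two sides' material is a common convention assumed here, not something the slides state.

```python
# Material values from the lecture: pawn 1, knight or bishop 3, rook 5, queen 9.
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}


def material_eval(piece_counts):
    """Eval(s) = sum of w_i * f_i(s), where f_i(s) is the number of type-i pieces."""
    return sum(WEIGHTS[piece] * count for piece, count in piece_counts.items())


def material_balance(own_counts, opponent_counts):
    """Assumed convention: positive values favor the side given first."""
    return material_eval(own_counts) - material_eval(opponent_counts)


# Hypothetical position (counts invented for this example).
white = {"pawn": 6, "knight": 1, "bishop": 2, "rook": 2, "queen": 1}
black = {"pawn": 7, "knight": 2, "bishop": 1, "rook": 1, "queen": 1}

if __name__ == "__main__":
    print(material_eval(white), material_eval(black), material_balance(white, black))
```

Because each term depends on one feature only, this function cannot express effects such as a piece gaining value in the endgame; that would need a nonlinear combination.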
Stochastic Games
White knows what its own legal moves are, but not what Black's legal moves will be, so it cannot build a standard minimax tree.
Add chance nodes. Branches from chance nodes are labeled with the possible dice rolls and the probability of each roll.
E.g., rolling 1-1 has a probability of 1/36. Since 5-6 is the same as 6-5, there are 21 distinct rolls: the 6 doubles each have probability 1/36, and the other 15 each have probability 1/18.

Stochastic Games
[Figure: game tree with chance nodes]

Expectiminimax Value
Use the probabilities to compute an expectiminimax value for games with chance nodes:

\[
\text{Expectiminimax}(s) =
\begin{cases}
\text{Utility}(s) & \text{if Terminal-Test}(s) \\
\max_{a \in \text{Actions}(s)} \text{Expectiminimax}(\text{Result}(s,a)) & \text{if Player}(s) = \text{MAX} \\
\min_{a \in \text{Actions}(s)} \text{Expectiminimax}(\text{Result}(s,a)) & \text{if Player}(s) = \text{MIN} \\
\sum_{r} P(r)\, \text{Expectiminimax}(\text{Result}(s,r)) & \text{if Player}(s) = \text{CHANCE}
\end{cases}
\]

Evaluation Functions with Chance
As before, we need to use a cutoff and an evaluation function. However, this is not as straightforward.
E.g., the best move is a1 for the left tree but a2 for the right tree, even though the leaf values are ordered the same way in both trees.

Evaluation Functions with Chance
To avoid this problem, the evaluation function must be a positive linear transformation of the probability of winning from a position.
The addition of chance makes the complexity O(b^m n^m), where n is the number of distinct rolls.
For backgammon, b is around 20 and n is 21, so 3 plies is about as deep as the search can get.
Might use Monte Carlo simulation to determine the value of a position.
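The dice-roll probabilities and the Expectiminimax definition above can be sketched in Python. The tuple-based node encoding, the helper `two_move_tree`, and the leaf values and 0.9/0.1 chance probabilities are illustrative assumptions, chosen to show how rescaling leaf values without changing their order can change the best move.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def dice_rolls():
    """Return {(low, high): probability} for the distinct rolls of two dice."""
    return {
        (a, b): Fraction(1, 36) if a == b else Fraction(1, 18)
        for a, b in combinations_with_replacement(range(1, 7), 2)
    }


# Assumed node encoding (hypothetical):
#   a number                       -> terminal utility
#   ("max", [children])            -> MAX picks the largest child value
#   ("min", [children])            -> MIN picks the smallest child value
#   ("chance", [(p, child), ...])  -> probability-weighted sum over outcomes
def expectiminimax(node):
    if not isinstance(node, tuple):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind}")


def two_move_tree(a1_leaves, a2_leaves):
    """MAX chooses a1 or a2; each leads to a chance node with outcomes 0.9 / 0.1."""
    probs = (Fraction(9, 10), Fraction(1, 10))
    a1 = ("chance", list(zip(probs, a1_leaves)))
    a2 = ("chance", list(zip(probs, a2_leaves)))
    return a1, a2


if __name__ == "__main__":
    rolls = dice_rolls()
    print(len(rolls), sum(rolls.values()))            # distinct rolls, total probability
    for leaves in (((2, 3), (1, 4)), ((20, 30), (1, 400))):
        a1, a2 = two_move_tree(*leaves)
        print(expectiminimax(a1), expectiminimax(a2))  # compare the two moves
```

In both trees the leaves are ordered the same way (the a2 leaves are the lowest and highest), yet the move with the larger expected value differs, which is why the evaluation function's scale matters once chance nodes are present.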