Games
CPSC 386 Artificial Intelligence
Ellen Walker, Hiram College

Why Games?
• Small number of rules
• Well-defined knowledge set
• Easy to evaluate performance
• Large search spaces (too large for exhaustive search)
• Fame & fortune, e.g. chess

Example Games & Best Computer Players (sec. 6.6 w/updates)
• Chess - Deep Blue (beat Kasparov); Deep Junior (tied Kasparov); Hydra (scheduled to play the British champion for 80,000 pounds)
• Checkers - Chinook (world champion)
• Go - Goemate, Go4++ (rated "weak amateur")
• Othello - Iago (world championship level), Logistello (defeated the world champion, now retired)
• Backgammon - TD-Gammon (a neural network that learns to play using "reinforcement learning")

Properties of Games
• Two-player
• Zero-sum
  – If it's good for one player, it's bad for the opponent, and vice versa
• Perfect information
  – All relevant information is apparent to both players (no hidden cards)

Game as Search Problem
• State space search
  – Each potential board or game position is a state
  – Each possible move is an operation
• The space can be BIG:
  – Large branching factor (chess avg. 35)
  – Deep search per game (chess avg. 50 ply)
• Components of any search technique
  – Move generator (successor function)
  – Terminal test (end of game?)
  – Utility function (win, lose or draw?)

Game Tree
• Root is the initial state
• Next level is all of the first player's moves
• Next level is all of the second player's moves
• Example: tic-tac-toe
  – Root: 9 blank squares
  – Level 1: 3 different boards (corner, center and edge X)
  – Level 2 below center: 2 different boards (corner, edge)
  – Etc.
• Utility function: win for X is 1, win for O is -1
  – X is the maximizer, O is the minimizer

Minimax Strategy
• Max's goal: get to 1
• Min's goal: get to -1
• Max's strategy
  – Choose moves that will lead to a win, even though Min is trying to block
• Minimax value of a node (backed-up value):
  – If N is terminal, use the utility value
  – If N is a Max move, take the max of the successors' values
  – If N is a Min move, take the min of the successors' values

Minimax Values: 2-Ply Example
[Figure: a 2-ply game tree with leaf utilities and the backed-up minimax value at each internal node; diagram not reproduced]

Minimax Algorithm
• Depth-first search to the bottom of the tree
• As the search "unwinds", compute backed-up values
• The backed-up value of the root determines which move to take
• Assumes:
  – Both players are playing this strategy (optimally)
  – The tree is small enough to search completely

Alpha-Beta Pruning
• We don't really have to look at all subtrees!
• Recognize when a position can never be chosen in minimax, no matter what its children are
  – Max(3, Min(2, x, y), …) is always ≥ 3
  – Min(2, Max(3, x, y), …) is always ≤ 2
  – We know this without knowing x and y!

Alpha-Beta Pruning (cont.)
• Alpha = the value of the best choice we've found so far for MAX (highest)
• Beta = the value of the best choice we've found so far for MIN (lowest)
• When maximizing, cut off values lower than Alpha
• When minimizing, cut off values greater than Beta
• (A code sketch of minimax with alpha-beta pruning follows these slides.)

Alpha-Beta Example
[Figure: a 2-ply tree showing alpha-beta cutoffs, with pruned subtrees marked; diagram not reproduced]

Notes on Alpha-Beta Pruning
• Effectiveness depends on the order of successors (middle vs. last node of the 2-ply example)
• If we can evaluate the best successor first, search is O(b^(d/2)) instead of O(b^d)
• This means that in the same amount of time, alpha-beta search can search twice as deep!

Optimizing Minimax Search
• Use alpha-beta cutoffs
  – Evaluate the most promising moves first
• Remember prior positions and reuse their backed-up values
  – Transposition table (like the closed list in A*)
• Avoid generating equivalent states (e.g. the 4 different first corner moves in tic-tac-toe)
• But we still can't search a game like chess to the end!
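A minimal sketch of depth-first minimax with alpha-beta cutoffs (not from the slides). It assumes a toy game tree encoded as nested Python lists, where a leaf is its utility for MAX and an internal node is the list of its children; the function name and the example leaf values are illustrative assumptions, not the tree from the example slides.

```python
import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf):
    """Return the backed-up minimax value of `node`, pruning with alpha/beta.

    alpha = best value found so far for MAX on the path to the root
    beta  = best value found so far for MIN on the path to the root
    """
    if isinstance(node, (int, float)):       # terminal test: a leaf holds its utility
        return node
    if is_max:
        value = -math.inf
        for child in node:                   # move generator: iterate successors
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:                # MIN above would never allow this node
                break                        # beta cutoff: prune remaining children
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if beta <= alpha:                    # MAX above would never choose this node
            break                            # alpha cutoff
    return value

if __name__ == "__main__":
    # Toy 2-ply example: MAX to move at the root, MIN at the next level.
    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
    print(alphabeta(tree, is_max=True))      # backed-up value of the root: 3
```

With good move ordering the cutoffs fire as early as possible, which is where the O(b^(d/2)) figure on the slide comes from; with poor ordering the same code degenerates toward plain minimax.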
When You Can't Search to the End
• Replace the terminal test (end of game) by a cutoff test (don't search deeper)
• Replace the utility function (win/lose/draw) by a heuristic evaluation function that estimates the result on the best path below this board
  – As in A* search, good evaluation functions mean good results (and vice versa)
• Replace the move generator by a plausible-move generator (don't consider "dumb" moves)
• (These replacements are illustrated in the sketch at the end of this section.)

Good Evaluation Functions…
• Order terminal states in the same order as the utility function
• Don't take too long (we want to search as deep as possible in limited time)
• Should be as accurate as possible (estimate the chances of winning from that position), based on:
  – Human knowledge (e.g. material value)
  – Known solutions (e.g. endgames)
  – Pre-searched examples (extract features, then average the endgame values of all games with each feature)

How Deep to Search?
• Until time runs out (the original application of iterative deepening!)
• Until values don't seem to change (quiescence)
• Deep enough to avoid the horizon effect (a delaying tactic that pushes the inevitable beyond the depth of the search)
• Singular extensions - search the best (apparent) paths deeper than others
  – Tends to limit the horizon effect, since these are the moves that will exhibit it
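Below is a minimal sketch, under stated assumptions, of how these replacements combine with iterative deepening: a depth-limited alpha-beta search that switches from the real terminal test to a cutoff test, and from the true utility to a heuristic evaluation. The nested-list tree encoding, the `evaluate` heuristic (average of the leaves below a node), and the time budget are illustrative assumptions, not material from the slides.

```python
import math
import time

def leaves(node):
    """All leaf utilities below a node (used only by the toy heuristic)."""
    if isinstance(node, (int, float)):
        return [node]
    return [v for child in node for v in leaves(child)]

def evaluate(node):
    """Heuristic evaluation: an estimate of the backed-up value at a cutoff node."""
    values = leaves(node)
    return sum(values) / len(values)

def search(node, is_max, depth, alpha=-math.inf, beta=math.inf):
    if isinstance(node, (int, float)):   # terminal test: the true utility is known
        return node
    if depth == 0:                       # cutoff test replaces the terminal test
        return evaluate(node)            # evaluation replaces the utility function
    best = -math.inf if is_max else math.inf
    for child in node:                   # a plausible-move generator would filter here
        value = search(child, not is_max, depth - 1, alpha, beta)
        if is_max:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:                # alpha-beta cutoff
            break
    return best

def iterative_deepening(node, time_budget=0.05, max_depth=10):
    """Search one ply deeper on each pass until time (or max_depth) runs out."""
    deadline = time.monotonic() + time_budget
    value = evaluate(node)
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break
        value = search(node, is_max=True, depth=depth)
    return value

if __name__ == "__main__":
    # Illustrative 3-ply tree; deeper passes refine the shallow estimates.
    tree = [[[1, -3], [4, -3]], [[5, 1], [0, 3]]]
    print(iterative_deepening(tree))     # value from the deepest completed pass
```

Because each pass re-searches from the root, the value from the last completed depth is the one used when the clock expires, which is exactly the "until time runs out" use of iterative deepening noted above.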