Artificial Intelligence
Adversarial Search (Game Playing 2)
Chapter 6 (6.5 – 6.8)

Games of Chance / Stochastic Game Environments
Some games involve chance, for example:
– Roll of a die
– Spin of a game wheel
– Deal of cards from a shuffled deck
Extend the game tree representation with:
– Computer moves
– Opponent moves
– Chance nodes

Backgammon
Backgammon is a two-player game with uncertainty
Players roll dice to determine what moves they can make
White has just rolled 5 and 6 and has four legal moves:
• (5-10, 5-11)
• (5-11, 19-24)
• (5-10, 10-16)
• (5-11, 11-16)
Such games are good for exploring decision making in adversarial problems involving both skill and luck

Games with Chance
Chance nodes appear in the tree in addition to the Max and Min nodes
– Branches indicate possible variations
– Each branch indicates the outcome and its likelihood

Rolling Dice 1)
There are 36 ways to roll two dice
– All of them are equally likely
– Due to symmetry, there are only 21 distinct rolls
– Each of the six doubles has a 1/36 chance
– Each of the other fifteen has a 1/18 chance

Game Trees with Chance Nodes
Chance nodes (shown as circles) represent the dice rolls; Max rolls and Min rolls alternate in the tree
Each chance node has 21 distinct children, with a probability associated with each
We can use minimax to compute the values for the MAX and MIN nodes, and expected values for the chance nodes
For a chance node C over a max node, we compute:
– expectimax(C) = Σi P(di) · maxvalue(i)
For a chance node C over a min node, we compute:
– expectimin(C) = Σi P(di) · minvalue(i)

Decisions with Chance
The next step is to understand how to make correct decisions:
– We still want to pick the move that leads to the best position
– However, the resulting positions do not have definite minimax values

Stochastic Game Environments
Weight the score by the probability that the move occurs
Use the expected value for a move: the sum over its possible random outcomes
Instead, we can only calculate the expected value, where the expectation is
taken over all the possible dice rolls that could occur
This leads us to generalize the minimax value for deterministic games to an expectiminimax value for games with chance nodes
Choose the move with the highest expected value
Terminal nodes, and MAX and MIN nodes (for which the dice roll is known), work exactly the same way as before
Chance nodes are evaluated by:
– Taking the weighted average of the values resulting from all possible dice rolls
– For a chance node n, EXPECTIMINIMAX(n) augments the state of n with each possible dice roll to produce each successor s, where P(s) is the probability that that dice roll occurs

Properties of Expectiminimax
Time complexity of O(b^m · n^m)
– Where b is the branching factor, m is the depth, and n is the number of distinct chance events
– Example, backgammon: n = 21, b ≈ 20

Stochastic Game Environments
Stochastic elements increase the branching factor
– 21 possible number rolls with 2 dice
The value of look-ahead diminishes: as depth increases, the probability of reaching a particular node decreases
Alpha-beta pruning is less effective
A practical approximation with expectiminimax is to cut the search off at some point and apply an evaluation function to each leaf

Evaluation Function
The evaluation function is used to evaluate the "goodness" of a game position
The zero-sum assumption allows us to use a single evaluation function to describe the goodness of a board with respect to both players:
– f(n) > 0: position n is good for me and bad for you
– f(n) < 0: position n is bad for me and good for you
– f(n) near 0: position n is a neutral position
– f(n) >> 0: win for me
– f(n) << 0: win for you

Position Evaluation (with chance nodes)
Evaluation functions for games such as backgammon differ from evaluation functions for chess
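The dice-roll distribution and the depth-cutoff expectiminimax described above can be sketched as follows. This is a minimal illustration, not a complete game engine: the state interface (`is_terminal`, `utility`, `to_move`, `successors`) and the feature functions are assumptions for the sketch, and `linear_eval` stands in for the leaf evaluation function.

```python
from fractions import Fraction

# The 21 distinct two-dice rolls: each double occurs one way in 36,
# each mixed roll occurs two ways (probability 1/18).
ROLLS = {(a, b): Fraction(1, 36) if a == b else Fraction(1, 18)
         for a in range(1, 7) for b in range(a, 7)}

def linear_eval(state, weights, features):
    """Cut-off evaluation: a weighted sum of board features."""
    return sum(w * f(state) for w, f in zip(weights, features))

def expectiminimax(state, depth, evaluate):
    """Expectiminimax with a depth cutoff.

    `state` is assumed to expose is_terminal(), utility(), to_move
    ("max"/"min"/"chance"), and successors(), which yields children
    for move nodes or (probability, child) pairs for chance nodes.
    """
    if state.is_terminal():
        return state.utility()
    if depth == 0:
        return evaluate(state)  # apply the evaluation function at the leaf
    if state.to_move == "chance":
        # Weighted average over all possible dice rolls.
        return sum(p * expectiminimax(s, depth - 1, evaluate)
                   for p, s in state.successors())
    values = (expectiminimax(s, depth - 1, evaluate)
              for s in state.successors())
    return max(values) if state.to_move == "max" else min(values)

# Sanity checks on the roll distribution.
assert len(ROLLS) == 21
assert sum(ROLLS.values()) == 1
```

Using exact `Fraction` probabilities makes the sanity check that the 21 distinct rolls cover all 36 outcomes come out exactly, rather than approximately with floats.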
Evaluation Function
The evaluation function is a heuristic function, and it is where the domain experts' knowledge resides
Example of an evaluation function for Tic-Tac-Toe:
– f(n) = [# of 3-lengths open for me] – [# of 3-lengths open for you], where a 3-length is a complete row, column, or diagonal
For chess: f(n) = w(n)/b(n), where w(n) is the sum of the point values of White's pieces and b(n) is the sum for Black
The board evaluation function estimates how good the current board state is for the computer
The presence of chance nodes means that one has to be more careful about what the evaluation values mean
It is a heuristic function of the features of the board, i.e. function(feat1, feat2, feat3, …, featn)
A program behaves totally differently with a different evaluation function, so the function must be specified carefully
Example features for chess are piece count, piece placement, squares controlled, etc.

Linear Evaluation Functions
Most evaluation functions are specified as a weighted sum of position features:
– f(n) = w1·feat1(n) + w2·feat2(n) + ... + wk·featk(n)
– feat1, feat2, …, featk are features, and w1, w2, …, wk are their weights
More important features get more weight
Deep Blue has about 8,000 features in its evaluation function

Learning Checkers
A. L. Samuel, "Some Studies in Machine Learning Using the Game of Checkers," IBM Journal of Research and Development, 3(3):210-229, 1959
Learned linear weights by playing copies of itself thousands of times
Used only an IBM 704 with 10,000 words of RAM, magnetic tape, and a clock speed of 1 kHz
Successful enough to be competitive in human tournaments

Learning Backgammon
The quality of play depends directly on the quality of the evaluation function
To build an evaluation function we have to:
– Construct good features using expert knowledge of the game
– Choose good weights… or learn them
G. Tesauro and T. J. Sejnowski, "A Parallel Network that Learns to Play Backgammon," Artificial Intelligence, 39(3):357-390, 1989
Also learned by playing copies of itself
Used a non-linear evaluation function: a neural network
Rates in the top three players in the world

Learning Weights
How can we learn the weights for a linear evaluation function?
Play lots of games against an opponent!
For every move (or game): error = true outcome - evaluation
– If the error is positive (underestimating), adjust the weights to increase the evaluation function
– If the error is zero, do nothing
– If the error is negative (overestimating), adjust the weights to decrease the evaluation function

AI in Modern Computer Games
Genetic algorithms and genetic programming have been used and shown some success in "evolving" realistically-acting agents for games
– Certainly appropriate for "Sim"-type games

Games and Computers
State of the art for some game programs:
– Chess
– Checkers
– Othello
– Backgammon
– Go

Checkers 1)
Chinook became world champion in 1994
Chinook runs on regular PCs, using alpha-beta search and an endgame database for six-piece positions

Othello 1)
Many programs play far better than humans
– Smaller search space than chess
– Little evaluation expertise available

Chess: IBM's Deep Blue
Kasparov vs. Deep Blue, May 1997
– A parallel computer with 30 IBM RS/6000 processors running the "software search"
– 480 custom VLSI chess processors that performed move generation (including move ordering)
– Deep Blue searched 126 million nodes per second on average, with a peak speed of 330 million nodes per second
– It generated up to 30 billion positions per move, reaching depth 14 routinely
– The heart of the machine is a standard iterative-deepening alpha-beta search with a transposition table and very sophisticated heuristics
– In some cases the search reached a depth of 40 plies
– The evaluation function had over 8,000 features, many of them describing highly specific patterns of pieces
– Massive database of games from the literature: 700,000 grandmaster games from which consensus recommendations could be extracted
– The system also used a large endgame database of solved positions
Note: Deep Blue still searches "brute force," and still plays with little in common with the intuition and strategy humans use

Backgammon 1)
A neural-network based program ranks among the best players in the world
– Improves its own evaluation function through learning techniques
– Systematic search-based methods are practically hopeless
  • Chance elements, large branching factor

Go
Go is the most popular board game in Asia, requiring at least as much discipline from its professionals as chess
Humans play far better
– Large branching factor (around 360)
  • Daunting for systematic search methods
Up to 1997 there were no competent programs at all, but now programs often play respectable moves
Approaches combine pattern recognition techniques (successor(pattern of pieces, move)) with limited search (deciding whether pieces can be captured, staying within the local area)
Go is an area that is likely to benefit from intensive investigation using more sophisticated reasoning methods

References
1) Franz J.
Kurfess, Associate Professor, Computer Science Department, California Polytechnic State University, USA

Summary
In many games, there is a degree of unpredictability through random elements
– Throwing dice, card distribution, roulette wheel, …
This requires chance nodes in addition to the Max and Min nodes
– Branches indicate possible variations
– Each branch indicates the outcome and its likelihood
The evaluation value of a position depends on the random element
– The definite minimax value must be replaced by an expected value
Calculation of expected values:
– Utility function for terminal nodes
– For all other nodes:
  • Calculate the value for each chance event
  • Weigh it by the chance that the event occurs
  • Add up the individual calculated values
Evaluation functions estimate the quality of a given board configuration for each player
– > 0: good for the computer
– < 0: good for the opponent
– near 0: neutral
An evaluation function gives an estimate of the utility of a state when a complete search is impractical
For many well-known games, computer algorithms using heuristic search can match or out-perform human world experts
State of the art for some game programs:
– Chess
– Checkers
– Othello
– Backgammon
– Go

Next Time!
Logical Agents
Chapter 7