Artificial Intelligence
CS 165A, Thursday, October 18, 2007
Informed (heuristic) search methods (Ch. 4)
Adversarial search – games (Ch. 6)

Notes
• Reading list typo – skip Chapter 5, read Chapter 6
• HW#1, problem 4: Tic-Tac-Toe
  – "It's not whether you win or lose, it's how you play the game!"
  – Implement breadth-first and depth-first approaches looking for your goal (three in a row for you)
  – If you also want to avoid losing (three in a row for your opponent), you may, but this is optional
  – In depth-first search, stop searching when you find a goal state (i.e., don't keep looking for the "best" goal)

Review: Informed Search Methods
• Best-first search
  – Greedy best-first search
  – A* search
• Memory-bounded search (not covering in detail)
  – IDA*
  – SMA*
• Iterated improvement algorithms
  – Hill-climbing
  – Simulated annealing
• As with blind search, the strategy is defined by choosing the order of node expansion

Review: A* Search
• "A* Search" is A Search with an admissible h
  – h is optimistic – it never overestimates the cost to the goal:
      h(n) ≤ true cost to reach the goal from n
    (h is a "Pollyanna")
  – So f(n) never overestimates the actual cost of the best solution passing through node n

  function A*-SEARCH(problem, h) returns a solution or failure
    return BEST-FIRST-SEARCH(problem, f)

Thursday Quiz
1. How does an online search agent operate?
2. How does it differ from an offline search agent?
(See Sec. 4.5)

Simulated Annealing
• Similar to hill-climbing
  – But includes a random element
  – Sometimes takes "bad" steps to escape local maxima
• Motivated by the roughly analogous physical process of annealing
  – Heating and then slowly cooling a substance to obtain a strong crystalline structure
  – T is temperature, E is energy
  – A schedule determines the rate at which T is lowered
• Want to maximize the energy E
  – E(state) measures the state's "goodness"
  – ΔE measures the increase in "goodness" resulting from a new state

Simulated Annealing (cont.)
• For the current state, evaluate the operators/actions
• Instead of taking the best action, choose an action at random
  – If that action results in a higher "goodness" value (ΔE > 0), take it
  – Otherwise, take it with probability between 0 and 1
[Figure: candidate actions labeled with their ΔE values, e.g. ΔE = 1, ΔE = -1, ΔE = 3, ΔE = -2, etc.]

Probabilistic action
• ΔE = Energy(action) – Energy(current)
  – If ΔE > 0, take the action
  – If ΔE < 0, maybe take it, maybe not (probabilistic action)
• P(action) = e^(ΔE/T) (for negative ΔE)
  – T is determined by the "schedule": it starts large and decreases toward zero as the algorithm iterates
  – Large T means ??? Small T means ???
  – What happens when T = 0?

Probabilistic action (cont.)
• What does it mean to take an action with probability P(A) = e^(ΔE/T)?
• For action A1
  – Let ΔE = -3 and T = 20
  – P(A1) = e^(-3/20) ≈ 0.86
  – Choose a random number r ∈ [0, 1]: if r < 0.86, take action A1; else do nothing
• For action A2
  – Let ΔE = -9 and T = 20
  – P(A2) = e^(-9/20) ≈ 0.64
  – Choose a random number r ∈ [0, 1]: if r < 0.64, take action A2; else do nothing

[Figure: P(action) = e^(ΔE/T) plotted against the "badness" of the action (how negative ΔE is); the curve falls from 1 toward 0, and decreasing T pulls it down, so bad actions become less likely to be accepted.]
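To make the acceptance rule concrete, here is a minimal Python sketch (an illustration, not code from the course): the `energy`, `neighbors`, and `schedule` parameters are placeholder assumptions, and the toy demo problem at the bottom is made up.

```python
import math
import random

def simulated_annealing(state, energy, neighbors, schedule, max_steps=10_000):
    """Sketch of the SA loop described above: maximize energy(state).
    schedule(step) returns the temperature T, decreasing toward 0."""
    for step in range(max_steps):
        T = schedule(step)
        if T <= 0:                                   # T = 0: stop (pure hill-climbing limit)
            break
        candidate = random.choice(neighbors(state))  # choose an action at random
        delta_e = energy(candidate) - energy(state)  # ΔE = Energy(action) - Energy(current)
        if delta_e > 0 or random.random() < math.exp(delta_e / T):
            state = candidate   # good moves always; bad moves with probability e^(ΔE/T)
    return state

# Toy demo (a placeholder problem): maximize E(x) = -(x - 3)^2 over the integers.
best = simulated_annealing(
    state=50,
    energy=lambda x: -(x - 3) ** 2,
    neighbors=lambda x: [x - 1, x + 1],
    schedule=lambda step: 20 * 0.99 ** step,   # T starts at 20 and decays geometrically
)
print(best)   # settles at (or very near) 3
```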
Simulated Annealing (cont.)
[Figure]

Intuition of SA
• Consider a ball bouncing around a vibrating surface, which you are holding with your eyes closed
  – Goal: get the ball to the deepest point of the surface
  – What's your strategy?
• Use the vibration to jiggle the ball over the surface
  – Use lots of vibration at the beginning – get big jumps, as if the surface were smoother
  – Use less vibration later – get smaller jumps as you feel the smaller bumps in the surface
• Analogy to annealing
  – T measures the amount of vibration
  – E measures the depth of the ball (the goodness of the solution)

Example: Get the ball to the minimum
[Figures: the ball settling toward the minimum as the vibration decreases – large T, smaller T, even smaller T, T = 0.]

Simulated Annealing (cont.)
• The trick is coming up with a proper schedule
  – Where to start (what T)?
  – How quickly to lower T? Too quickly? Too slowly?
• SA is an example of a Monte Carlo method
  – A technique that uses probabilistic calculations to find an approximate solution to a complex (perhaps deterministic) problem

Effective Branching Factor
• Though informed search methods may have poor worst-case performance, they often do quite well if the heuristic is good
  – Even if there is a huge branching factor
• One way to quantify the effectiveness of the heuristic: the effective branching factor, b*
  – N: total number of nodes expanded
  – d: solution depth
  – N = 1 + b* + (b*)^2 + … + (b*)^d
• For a good heuristic, b* is close to 1

Example: 8-puzzle problem
[Figure: average number of nodes expanded vs. solution length, averaged over 100 trials at each solution length.]
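For concreteness, a small Python sketch (my illustration, not from the lecture) that recovers b* from N and d by bisection on the slide's equation; the numbers in the demo are chosen for illustration, not taken from the 8-puzzle data above.

```python
def effective_branching_factor(n_nodes, depth, tol=1e-6):
    """Solve 1 + b + b^2 + ... + b^d = N for b by bisection."""
    def total(b):
        return sum(b ** i for i in range(depth + 1))
    lo, hi = 1.0, float(n_nodes)        # b* lies between 1 and N
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total(mid) < n_nodes:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Example values chosen for illustration: 52 nodes expanded, solution depth 5.
print(round(effective_branching_factor(52, 5), 2))   # -> 1.91, i.e. close to 1
```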
Local beam search
• In simulated annealing, how many nodes do you keep in memory? One.
• Local beam search keeps track of k states rather than just one
  – Start with a population of k (initially random?) candidates
  – Choose the k best descendants
    – Not the same as choosing and following k independent solutions – why?
  – Analogy: the beam of a flashlight (vs. a single ray of light)
• Stochastic forms of beam search
  – Like genetic algorithms, where successor states are generated by combining two parent states
[Figures: "I said that iterative improvement algorithms are like this", "Hill-climbing and SA are really more like this", "Local beam search is really more like this".]

Review: Problem solving and search
• AI is about solving hard problems, often requiring searching through a problem space
• Problem definition
  – The choice of problem space is very important (abstraction)
  – The choice of data structure is very important – it can make tasks much easier or harder
• Keys: the search graph and the list of nodes to be expanded
• Toy problems (M&C, etc.) are useful for showing and comparing algorithms
  – Real problems are not so simple!

Review: Problem solving and search (cont.)
• Choose a search algorithm depending on the problem requirements
• There are various ways to evaluate search algorithms
  – Completeness
  – Optimality
  – Goodness of solution
  – Cost of finding the solution (computation and memory requirements)
• Uninformed search is exponential
  – We can do better, on average, if we know something about the problem (hence, informed search using heuristics)
  – The real difference: the effective branching factor is usually smaller for informed search methods

But alas, things are not so simple…
• All this is fairly straightforward for M&C, N-queens, planning travel to Bucharest, etc.
  – What about dynamic, uncertain environments, stochastic inputs, etc.?
• What if we also know that
  – Missionaries try to convert cannibals, and sometimes succeed
  – Converted cannibals don't eat missionaries
  – Unconverted cannibals and converted cannibals argue a lot
  – Even unconverted cannibals would rather eat fish than eat missionaries
  – Missionaries can catch fish with their fancy traps
  – Missionaries won't fish on Sundays and have no spare fish on Fridays
  – Some cannibals (converted or unconverted) steal the missionaries' traps
• How would we solve M&C now?
• What if someone else chooses every other action?

Up next…
• This leads us to representing and reasoning about more complex information
• But first…
  – A detour into Game Playing – or, how to do search when someone is trying to make life miserable for you!

Game Playing – Chapter 6
• Why game playing?
  – Interesting and engaging; people like competition
  – Games seem to require intelligence, so perhaps a good AI testbed
    – Games are the drosophilae of AI???
  – A nice research domain: simplified problems, easy to represent states and actions (usually)
    – Yet games maintain significant elements of real-world problems:
      – Competition
      – Limited resources (e.g., time)
      – Solution strategies
      – Uncertainty, due to the opponent's unknown moves, limited time to calculate states, and possible random events (e.g., dice)
  – Unlike the real world, it's possible to be fully informed – but even that doesn't make it easy

Game Playing (cont.)
• Game playing is different from straightforward search
  – Uncertainty due to the opponent
  – Much too hard to solve completely (the branching factor is large)
  – The time limit is very significant
    – With limited time, a small change in efficiency makes a BIG difference
    – Because it is unlikely or impossible to calculate the exact consequences of a move (i.e., find the goal state), it is necessary to approximate

Game Playing (cont.)
• Games can be deterministic or non-deterministic
  – Non-deterministic games contain an element of chance: "probabilistic", "stochastic"
• …and have perfect information or not
  – "Accessible" – the world state is fully accessible to the program or agent (nothing hidden)

                          Deterministic                               Non-deterministic
  Perfect information     Chess, Checkers, Go, Othello, Tic-tac-toe   Backgammon, Monopoly
  Imperfect information   Navigating a maze                           Bridge, Poker, Scrabble

Formulating a game strategy
• Formal definition of a game:
  – Initial state
  – Operators
  – Terminal test
  – Utility function
• One move consists of two half-moves, or two ply – one per player
• Our approach to finding a game strategy
  – Consider how to approach the exact solution (perfect play)
  – Techniques for choosing a good move when time is limited:
    – Prune the search to reduce costs
    – Approximate by evaluating moves/states (heuristics)

The Minimax algorithm
• Determines the optimal strategy, and thus the best move, for player MAX (whose opponent is MIN)
  – Generate the whole game tree down to terminal states
  – Evaluate each terminal state using the utility function
  – Back up from the leaf nodes toward the root, one layer at a time, applying the MIN and MAX rules alternately:
    – MAX chooses the maximum value; MIN chooses the minimum value
  – When reaching the root node, MAX's best move is the child with the highest value
• The minimax decision maximizes the utility under the assumption that the opponent seeks to minimize it (and uses the same evaluation function)

2-ply Minimax example
[Figure: a 2-ply game tree. Which move to choose? The minimax decision is move A1.]
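A compact Python sketch of the backup rule (an illustration, not the course's code): a tree is a nested list whose leaves are utilities, and levels alternate between MAX and MIN. The leaf values below follow the textbook's 2-ply figure (an assumption on my part), where the minimax decision is A1.

```python
def minimax(tree, maximizing=True):
    """Return the minimax value of a game tree given as nested lists.
    A leaf is a number (utility); an internal node is a list of subtrees."""
    if not isinstance(tree, list):      # terminal state: return its utility
        return tree
    values = [minimax(child, not maximizing) for child in tree]
    return max(values) if maximizing else min(values)

# 2-ply example: MAX has three moves (A1, A2, A3), each answered by MIN.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
values = [minimax(child, maximizing=False) for child in tree]   # [3, 2, 2]
best_move = values.index(max(values))   # move 0, i.e. "A1", with minimax value 3
print(values, best_move)
```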
Another example
• In this game, it's your move. Which move will the minimax algorithm choose – A, B, C, or D? What is the minimax value of the root node and of nodes A, B, C, and D?
[Figure: a game tree with a MAX root, MIN nodes A, B, C, and D, and leaf utility values.]

Multiplayer games
• (A, B, C) – utility values for players A, B, and C
• Ignores strategies such as alliances…
[Figure: a three-player game tree; at one node, player A could take either move.]

Minimax search
• Generate the tree of minimax values
  – Then choose the best (maximum) move
  – You don't need to keep all the values around (a good memory property)
• How deep to generate the tree?
  – Generating the full tree is best, but not often feasible
  – Generate N moves down, and use an evaluation function
    – The larger N is, the better the game play will be
• Depth-first search is used to implement minimax
  – Recursive implementation (Fig. 6.3)
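A sketch of the depth-limited, recursive implementation just described (in the spirit of Fig. 6.3, not its actual code): the `PileGame` toy game and the hook names `is_terminal`, `utility`, `evaluate`, and `successors` are all assumptions for demonstration.

```python
class PileGame:
    """Toy game for illustration (an assumption, not from the lecture):
    players alternately remove 1 or 2 stones; taking the last stone wins."""
    def is_terminal(self, state):
        return state == 0
    def utility(self, state, maximizing):
        # Empty pile on your turn means the opponent took the last stone.
        return -1 if maximizing else +1
    def evaluate(self, state, maximizing):
        return 0        # trivial heuristic: non-terminal cutoffs look neutral
    def successors(self, state):
        return [state - take for take in (1, 2) if take <= state]

def depth_limited_minimax(game, state, depth, maximizing=True):
    """Depth-first recursive minimax, cut off after `depth` ply."""
    if game.is_terminal(state):
        return game.utility(state, maximizing)
    if depth == 0:      # cutoff reached: fall back on the evaluation function
        return game.evaluate(state, maximizing)
    values = [depth_limited_minimax(game, s, depth - 1, not maximizing)
              for s in game.successors(state)]
    return max(values) if maximizing else min(values)

# With a pile of 7 and MAX to move, value each of MAX's two possible moves.
game = PileGame()
print([depth_limited_minimax(game, s, depth=8, maximizing=False)
       for s in game.successors(7)])   # -> [1, -1]: taking 1 stone (leaving 6) wins
```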