Artificial Intelligence and Decision Systems
Course notes

Rodrigo Ventura
Instituto Superior Técnico

DRAFT — version 1.2 — October 2010

Preface

These notes were written as a supporting bibliography for the Artificial Intelligence and Decision Systems (IASD) course at Instituto Superior Técnico (IST, Lisbon, Portugal). This course aims at providing the students with knowledge of basic methods in Artificial Intelligence. Since this course is integrated in a major on systems, decision and control of the Electrical and Computer Engineering course at IST, its contents are focused towards these areas. This course has been taught at IST since the mid 1990s, having been continuously updated with new material from this fast-changing scientific area, while omitting parts that have been losing interest with regard to the course aims and audience. The main bibliographic reference of this course is the excellent textbook "Artificial Intelligence: A Modern Approach" (2nd edition) by Stuart Russell and Peter Norvig [10]. The following chapters correspond essentially to a selection of chapters of this book.

Contents

1 Introduction to AI
  1.1 Approaches
  1.2 Foundations
  1.3 State of the art
  1.4 Intelligent agents
      1.4.1 Definition
      1.4.2 Example
      1.4.3 Properties
      1.4.4 Nature of environments
      1.4.5 Structure of agents

2 Problem solving
  2.1 Introduction
  2.2 Well-defined problems and solutions
  2.3 Solving problems
      2.3.1 Uninformed search strategies
      2.3.2 Informed search strategies

3 Knowledge and reasoning
4 Planning
5 Uncertain knowledge and reasoning
6 Learning

Chapter 1

Introduction to AI

What is Artificial Intelligence? To understand the scope of this relatively recent scientific area, we could start by examining the name itself. First, there is Artificial: something (1) man-made, synthesized, engineered, and (2) aiming at imitating natural things [11]. And second, there is Intelligence: a property primarily ascribed to human beings (with exceptions), but whose attribution to other animals is debatable, though not as contested as the idea of machine intelligence.

Intelligence, in the general sense, is virtually impossible to define in such a clear-cut fashion that would allow us to classify entities as either intelligent or not without a shadow of doubt. Rather, the best we can do is to characterize intelligence as a set of capabilities or skills that intelligent beings possess. These capabilities include problem solving, reasoning, decision making, learning, memory, language, and emotions. This is not an exhaustive list, but it contains the aspects that are most consensual. Now, other than humans, what entities possess at least some of these skills? Certainly rocks do not, and so they can hardly be considered intelligent. Vertebrate animals, however, do seem to possess some (if not all) of them, and so one can accept them as intelligent. The problems arise in the middle ground: can insects be considered intelligent? How about amoebas? And viruses? There is a continuum of complexity ranging from humans down to rocks, and placing a border line somewhere in the middle does not seem an easy task at all [1].
1.1 Approaches

The idea that machines can be intelligent was proposed at the dawn of computers, by Alan Turing in 1948 [13], on the basis that if the behavior of a machine is indistinguishable from that of a human while performing a certain task, then it ought to be considered intelligent. However, even today, there are those who still turn up their noses at the idea of calling a machine intelligent (although in some cases of daily computer usage it might appear as the sheer opposite of it). But it should be stressed that this issue should not be taken as a matter of faith (in either direction).

The name Artificial Intelligence (AI) appeared several years later, in August of 1955, in the proposal of a summer research project held at Dartmouth College in Hanover (US-NH). In this proposal [9], written by Claude Shannon, Marvin Minsky, Nathaniel Rochester, and John McCarthy, it is said that

    The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.

In the following we will analyse four different approaches to AI, illustrated in table 1.1, which can roughly be mapped to the history of the field. Understanding them provides a global view of what the field is all about.

    (1) Act like humans: Turing test, NLP, ELIZA, etc.; canned answers.
    (2) Think like humans: cognitive science (psychology), serial vs. parallel processing.
    (3) Think rationally: laws of thought, logic, irrefutable conclusions (common sense, practical).
    (4) Act rationally: act so as to achieve the best outcome (or the best expected outcome).

Table 1.1: Four approaches to AI, roughly corresponding to the combinations of act vs. think across human-like vs. rational.

The first approach (1) was in fact hinted at by Turing's early works arguing for machine intelligence [13].
The so-called Turing test, named in his honor, aims at evaluating whether a machine should be considered intelligent or not. The test consists of two text terminals by which a human evaluator can chat with another human and with the computer under test. The evaluator does not know which is which. If the evaluator fails to correctly identify which one is the computer, then the latter is said to pass the test. This test is clearly oriented toward evaluating whether the machine can act like humans, as the evaluator only has access to the machine's behavior (and limited to a text chat interface).

Several years later, between 1964 and 1966, Joseph Weizenbaum developed a program called ELIZA, whose purpose was to chat with people, in a way vaguely similar to a psychotherapist. In fact, it was intended as a parody, but the outcome was unexpectedly engaging. It has been told that Weizenbaum's secretary used to engage in very intimate conversations with ELIZA, not wanting anyone else to see the transcripts. One of the most extraordinary things about ELIZA is the simplicity of its programming. ELIZA is basically a set of IF-THEN rules triggered by text pattern matching, plus some randomness in the choice of responses. For instance, to "My head hurts", ELIZA would probably answer the canned sentence "Why do you say your head hurts?". Although ELIZA is still a bit far from passing the Turing test, it does a pretty good job of fooling non-expert humans. Still, a contest is held yearly, granting prizes to the programs that come closest to passing a restricted form of the Turing test: the Loebner Prize.^1 Joseph Weintraub won the prize four times, in 1991–1993 and in 1995. Can we call these programs, tailored to answer in a plausible way to text messages, really intelligent? In fact, research here has focused more on text pattern matching and on the design of plausible responses than on the capabilities we identified above as indicators of intelligence.
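The flavor of ELIZA's mechanism can be sketched in a few lines. The rules below are illustrative inventions for this sketch, not the original ELIZA script: each rule pairs a regular-expression pattern with canned response templates, and captured text is substituted back into the reply.

```python
import random
import re

# Illustrative ELIZA-style rules (hypothetical, not the original script):
# a regex pattern plus canned response templates; \1 stands for the text
# captured by the pattern's first group.
RULES = [
    (r"my (.*) hurts", ["Why do you say your \\1 hurts?",
                        "How long has your \\1 been hurting?"]),
    (r"i am (.*)", ["Why do you think you are \\1?"]),
    (r".*", ["Please tell me more."]),  # catch-all fallback rule
]

def respond(sentence):
    """Answer with the first matching rule, picking a template at random."""
    text = sentence.lower().strip(".!?")
    for pattern, templates in RULES:
        match = re.fullmatch(pattern, text)
        if match:
            reply = random.choice(templates)
            # substitute each captured group into the chosen template
            for i, group in enumerate(match.groups(), start=1):
                reply = reply.replace(f"\\{i}", group)
            return reply
```

For "My head hurts", the first rule fires and produces one of its two canned variants; anything unrecognized falls through to the catch-all rule, which is essentially all the "understanding" such a program has.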
If it is not enough to (1) act like humans, an alternative approach would be to (2) think like humans. In this respect, researchers looked into the field of psychology, which is dedicated to understanding how the human mind works. Herbert Simon and Allen Newell contributed enormously to the field, precisely along this approach of making programs based on our understanding of the human mind. There is however one hidden assumption: that the mechanisms found in our brain can be implemented in a computer with a similar level of success and efficiency. The brain is, however, structurally different from a computer [14]. While the brain is massively parallel, encompassing about 15 to 33 billion neurons, all functioning simultaneously, the computer is based on a serial execution of instructions.^2 This radically different processing architecture ought to have an impact on the approaches to the same problems. So we should not expect, in principle, the best computational solutions to be similar to the way our brain solves the same problems.

What if, instead of taking humans as models of intelligence, we approach machine intelligence from a formal, mathematical standpoint? Mathematical reasoning allows one to draw irrefutable conclusions from a set of premises using mathematical logic. In AI, mathematical logic has been used to model many aspects, for instance common sense. We can call this approach (3) think rationally, in the sense of employing rational thought processes, provided by mathematical logic, to attain machine intelligence. Problems however arise, since reality may not always be permeable to rigid mathematical representations.

^1 http://www.loebner.net/Prizef/loebner-prize.html (retrieved 27-Aug-2009)
^2 Even though computations involve several bits simultaneously (64 in modern machines), and the current trend is for multiple multi-core CPUs, these amounts are several orders of magnitude lower than the number of neurons in the brain.
Take for instance the statement "All birds fly": what happens if we glue a bird's leg to a heavy concrete block? As the bird is no longer able to fly, we have to account for exceptions, and that turned out to be extremely hard to deal with using logic-based representations.

Instead of subsuming thought processes to rigid logic rules, maybe it suffices for a system to (4) act rationally. Let us define a performance criterion rigorously, such that the higher the performance of the outcome, the closer it approaches the design goals of the system. Figuring out what to do would then boil down to finding the solution that maximizes this performance. In this approach, the focus is directed towards finding a solution that maximizes a defined performance criterion, rather than on constraining the thought processes to a logically sound framework. This approach has also produced very efficient methods to deal with uncertainty in a quantitative way. Logic methods are also able to cope with uncertainty, but are unable to deal with quantitative levels of uncertainty (e.g., probabilities). The current trend in the AI field is closer to this latter approach (4) than to any other. Probabilistic methods have been gaining ground, as they combine, on the one hand, the solid theoretical framework of probability theory and statistics, and, on the other, the ability to quantify levels of uncertainty and propagate them through their computations.

1.2 Foundations

Although the field is relatively new, many earlier fields are considered to have contributed to the foundations of AI. Here is a brief description of the major contributions of most of them. Readers are invited to look up further information concerning the topics mentioned below (in, for instance, Wikipedia^3).

1. Philosophy (from 428 B.C.) — formal rules to obtain valid conclusions (logic), the issue of mind vs. body (Plato, Descartes), the origin of knowledge and the role of perception in its acquisition, and how knowledge maps to actions;

2. Mathematics (from 800 A.D.) — how formal rules can be used to derive irrefutable conclusions, computability (Turing: what class of functions is realizable in a computer), modeling uncertain information (statistics);

3. Economics (from 1776) — decisions maximizing payoff (Herbert Simon), theories of utility, decision, and games (von Neumann), optimizing vs. satisficing;

4. Neuroscience (from 1861) — how the brain works, levels of organization (from molecules, to synapses, to neurons, to maps, and to systems), localization of function (Broca);

5. Psychology (from 1879) — how humans think and act, behaviorism (Skinner) vs. cognition, psychoanalysis (Freud);

6. Computer engineering (from 1940) — building computers (ENIAC), programming languages;

7. Cybernetics (from 1948) — the concept of autonomy, control theory (Wiener), information theory (Shannon);

8. Linguistics (from 1957) — language and thought, syntax and grammar (Chomsky).

The first publication commonly accepted as the precursor of AI was written by Warren McCulloch and Walter Pitts in 1943, proposing a model of artificial neurons. Each of these neurons can assume one of two states ("on" and "off"), receiving signals from other neurons. They showed that this model could implement logical connectives (AND, OR, NOT, etc.), and even perform any computable function. In the late 1940s Alan Turing wrote an influential paper [13], claiming the possibility of machines exhibiting intelligent behavior, and presenting several arguments sustaining his claim. The name of the field was however only coined after the Dartmouth summer research project [9] in 1956.

^3 http://www.wikipedia.org (retrieved 27-Aug-2009)
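The McCulloch-Pitts neuron can be sketched as a simple threshold unit. The weights and thresholds below are one possible choice (chosen here for illustration) realizing the logical connectives mentioned above:

```python
def mcp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: fires (outputs 1) iff the weighted sum of
    its binary inputs reaches the threshold; otherwise outputs 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Logical connectives as particular weight/threshold choices:
def AND(a, b): return mcp_neuron([a, b], [1, 1], threshold=2)
def OR(a, b):  return mcp_neuron([a, b], [1, 1], threshold=1)
def NOT(a):    return mcp_neuron([a], [-1], threshold=0)
```

Since any Boolean function can be built from these connectives, networks of such units can, as McCulloch and Pitts showed, compute any computable function (given suitable memory/feedback).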
Since then, the field has evolved through several stages, from the early enthusiasm of naive systems and toy problems, to a maturity state where strong theoretical results can be found, as well as real-world systems actively used in industry. AI now counts on a large community of researchers, with many top-level conferences being held periodically (AAAI, IJCAI, ECAI, AAMAS, to name a few). It has remained a quite inter-disciplinary subject, with strong connections to many fields: mathematics (logic, statistics), robotics, linguistics, neuroscience, systems and control theories, among others.

1.3 State of the art

In this section, several state-of-the-art AI application domains are briefly described. Once again, readers are invited to look for further information on each of the examples given here.

• The NASA remote agent is an autonomous program for the remote management of a spacecraft. It includes tasks such as planning and fault diagnosis. In particular, it was tested on NASA's Deep Space 1 spacecraft [3].

• In 1997 the computer Deep Blue defeated the chess world champion Kasparov, with broad media coverage. Insofar as chess has been considered a benchmark of intelligence, this is a considerable feat, as this was the first time the best known human at the specific task of chess was beaten by a machine. However, it should be noted that not only was Deep Blue's hardware optimized specifically for chess (despite IBM's marketing claims to the contrary), but also the algorithms employed are quite specific to this kind of game [5].

• Stanford's Stanley is an SUV that drives autonomously. It won the 2005 DARPA Grand Challenge, which consisted of driving a 212 km off-road track autonomously and unmanned. Stanley took first place, from a total of 23 cars, taking 6h54m in total. Only five cars made the finish line. The competition track was only known a few hours before the race, in the form of a sequence of GPS waypoints.
This was a milestone in terms of having an autonomous vehicle operating for such a long period of time, with so many sources of uncertainty, and also a victory for probabilistic methods in robotics [12].

• CMU's Tartan Racing won the 2007 DARPA Urban Challenge, this time consisting of an urban scenario, where the autonomous cars have to comply with the usual traffic rules, interacting with other cars driven by humans [2].

• Medical diagnosis systems, which not only provide a diagnosis given a set of symptoms, but also an explanation of the line of reasoning followed to reach the conclusion.

• Logistics planning for complex military campaigns, with the challenge of representing the various constraints involved, and with the goal of finding a solution satisfying the goal and the constraints while minimizing some performance criteria, weighting factors such as time and costs. The DART system (Dynamic Analysis and Replanning Tool) was used in the American Desert Shield/Storm operations in Kuwait [6].

• HipNav (Hip Navigation System) is a surgery assistance system that provides the medical staff with the optimal, patient-specific positioning for hip implants [7].

• RoboCup is an annual international robotic competition event, which initially included only robotic soccer events, but now includes competitions for search and rescue robots, service robots at home, among others. Robots playing soccer pose a huge scientific challenge, as they provide a common benchmark in a common environment, where the robots have to solve a wide range of problems, from ball tracking up to team strategy, all integrated in a team of autonomous robots [8].

1.4 Intelligent agents

1.4.1 Definition

The concept of an agent is a central one to this course.
An agent is an entity endowed with sensors, by means of which it receives information from the environment (percepts), and actuators, via which it produces actions (figure 1.1). This notion applies both to living beings and to artifacts, although we will be concerned only with machines. An agent can be, for instance, a physical machine, such as a robot, with sensors that measure distances to obstacles, and actuators that make it move around. It can also be a software agent that crawls over the Internet gathering data for a search engine (in this case the sensors are the network connections that download web pages). The environment consists of whatever is external to the agent and bears some relationship with it, including other agents. The environment of a mobile robot is a physical one; the environment of a web crawler is the Internet; for a robotic soccer game, the environment is the field and the other robots, including teammates and opponents.

[Figure 1.1: Diagram of an intelligent agent, bearing a relation to its surrounding environment: percepts flow from the environment through the sensors into a "?" block (the agent's core), which drives the actuators to produce actions.]

The concept of agent provides encapsulation, with respect to the surrounding environment, of an entity that autonomously processes percepts and acts on the environment. One can compare this concept with the idea of object in software engineering. It is important to stress that an agent is not a classification criterion, which could be used to classify entities, but rather a design framework. The concept of agent is used to help us design AI programs. Moreover, this concept not only applies to a single entity, but also to systems comprising several agents — multi-agent systems (MAS) — where multiple agents communicate among themselves to fulfil some task. A robotic soccer team is a paradigmatic example of a MAS. Much research work, and many design architectures and software tools, have been developed for building these systems.
At this point one should distinguish between the agent function, a mathematical description of how an agent maps its percepts into actions, i.e., a function from the set of all possible percept sequences to the set of all possible actions, and the agent program, a computer implementation of that map. The agent function completely specifies the agent's behavior, as it defines the action to be performed in any possible situation. It is a formal description of the agent's behavior. From an engineering point of view, we will be focused on making agent programs. The question we will address during this course is how to design agent programs such that the resulting implementation fulfills the design goals.

1.4.2 Example

Let us consider here a very simple example of an autonomous agent: a vacuum cleaner robot. This robot operates in a simple two-room environment, being able to move from one room to the other (figure 1.2). Consider that the agent's sensors comprise a location sensor, determining in which room the robot is, and a dirt sensor, detecting whether there is dirt in that room. Then, the percepts have the form of a pair ⟨r, s⟩, where r ∈ {A, B} is the robot location, and s ∈ {Clean, Dirty} is the room status (as perceived by the dirt sensor). The actions a considered here are the movement actions, suck, and shutdown: a ∈ {Left, Right, Suck, Shutdown}.

[Figure 1.2: Vacuum cleaner robot example: two rooms, A and B, both containing dirt, with the agent in room A.]
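The distinction between agent function and agent program can be made concrete with this vacuum world. A hypothetical sketch, where the agent function (for a purely reactive agent) is written out as an explicit mapping and the agent program is the code that implements it:

```python
# Percepts are pairs (room, status); actions are plain strings.
# The agent function, written out as an explicit percept-to-action map:
AGENT_FUNCTION = {
    ("A", "Clean"): "Right",
    ("A", "Dirty"): "Suck",
    ("B", "Clean"): "Left",
    ("B", "Dirty"): "Suck",
}

def agent_program(percept):
    """An agent program: concrete code implementing the agent function."""
    room, status = percept
    return AGENT_FUNCTION[(room, status)]
```

The mapping here depends only on the current percept; in general the agent function is defined over whole percept sequences, which a table like this cannot capture.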
The agent program, implementing the agent function, can be something as simple as a lookup table between percepts and actions, e.g.:

    ⟨A, Clean⟩ → Right
    ⟨A, Dirty⟩ → Suck
    ⟨B, Clean⟩ → Left
    ⟨B, Dirty⟩ → Suck                                    (1.1)

Assuming that the environment evolves as expected — the dirt in one room disappears as soon as the robot Sucks it — it is not hard to see that such a program will lead to an infinite loop of moving from room A to B and back. This program is able to clean both rooms automatically, but it would probably be preferable to Shutdown after getting both rooms clean. However, this requires a bit more complexity than the lookup table given above (more on this issue below).

1.4.3 Properties

The last observation in the previous section brings about the concept of a rational agent, which, in a first attempt, we will define as an agent that "does the right thing." In particular, we would like to be able to measure how well it does "the right thing"; we will call this measure performance. The higher the performance, the better it performs. Note that there is no universal and unique way of defining a performance measure. Different agents will have different performance measures, not only because each task we want the agent to perform calls for a specific performance measure, but also because, even for the same task, different designers might have different assessments of the agent's performance. For instance, when there is a trade-off between two (or more) variables, one could weight each one differently. In the commercial aeronautical industry, for instance, different airlines might trade off flight duration and fuel spent differently. The automatic flight management systems of aircraft are configured with that trade-off variable (cost index), which has implications on the rate of climb and descent of the airplane.
In the case of the vacuum cleaner, a possible performance measure might be D − λN, where D is the amount of dirt sucked, and N the total number of operations, measured at shutdown time or after a certain time span. The λ parameter allows the designer to tune the trade-off between the amount of dirt sucked and the number of operations.

The performance of an agent is conditioned by: (1) the percept information, meaning that we cannot always assume that the agent has access to all information concerning the environment, and (2) the built-in knowledge of the environment. Our vacuum cleaner robot's dirt sensor only senses dirt in the room the robot is in, meaning that it has to visit all rooms before being sure everything is cleaned up, and its navigation is conditioned by the knowledge it has of the environment. In figure 1.2 only two rooms were depicted, but we can imagine larger environments with more complex topologies.

Now we are ready to formally define a rational agent, with respect to a given performance measure, as one that maximizes that performance measure, given the percept information and the built-in knowledge of the environment. When an agent has full information concerning the environment state, it is called omniscient. As this is usually not the case, the concept of rationality has to be framed with respect to the information the agent has access to. In the latter case, it is often crucial for the agent to actively gather information, in the sense of performing actions with the goal of obtaining more information. The issue of trading off actions that gather information against actions that directly improve the performance measure is called the exploration vs. exploitation problem. Exploring the environment to gather information concerns the area of Learning, which will be discussed in a later chapter. Another trade-off we encounter in this context is between relying on one's percepts or on a priori knowledge.
For instance, a robot can navigate using a map, but if it is insensitive to the presence of obstacles (unaccounted for in the map), it might collide with them. Furthermore, a map may become outdated, if for instance the environment changes. An agent that relies on its own percepts, rather than only on prior knowledge, is called autonomous.

1.4.4 Nature of environments

A world (also known as a domain) is composed of:

1. a performance measure,
2. an environment,
3. a set of actuators, and
4. a set of sensors.

This defines both the environment an agent interacts with and its interface with it. Given a world, the design goal consists in constructing a rational agent, in the sense of acting in such a way as to maximize the performance measure, as defined in the previous section. Some examples of worlds (or domains) are:

Taxi driver: the sensors and actuators are the ones a human uses for driving, the environment includes the taxi and the city, and the performance measure comprises a combination of carrying clients to the desired place, time/cost of travel, comfort, global satisfaction of the client, etc.

Search and rescue (SAR) autonomous helicopter: the sensors include cameras and GPS, the actuators include the propeller and any deployable first-aid kits, the environment is the operational area, including the victims and the SAR personnel, and the performance measure has to take into account not only the number of potential victims found, but also the time for the aid to reach them (a critical variable in SAR operations).

Soccer humanoid robot: the sensors include a camera to locate the ball and the teammates, the actuators are the mechanical limbs, the environment is the field and the other robots (both teammates and opponents), and the performance measure is simply the number of goals scored.
Web crawler: the sensors allow it to download web pages, the actuators allow it to move from one page to another by establishing new network connections, the environment is the Internet, and the performance measure includes the number of web pages processed and/or the amount of relevant information gathered.

Worlds can be classified according to the set of properties below:

• fully observable vs. partially observable: whether the agent has full access to the state of the environment (e.g., chess), or only a partial view of it (e.g., the vacuum cleaner robot);

• deterministic vs. stochastic: whether the consequences of the agent's actions are predictable, or there is an element of chance involved (e.g., sensor noise, actions that may fail);

• episodic vs. sequential: whether the interaction with the environment follows a sequence of independent episodes, or future interactions depend on what happened in past ones;

• discrete vs. continuous: whether the domain of all percepts, all actions, and time is discrete or continuous (any of the 8 combinations is possible);

• single- vs. multi-agent: whether the agent is alone in the environment, or there is a team of interacting agents (MAS, Multi-Agent System). In MAS one can also distinguish between cooperative agents and competitive agents (e.g., in robotic soccer, teammates are cooperative and opponents are competitive, that is, if everything works as desired).

1.4.5 Structure of agents

Having discussed agents and environments, the big question now is how to design the question mark block depicted in figure 1.1, the core of the agent itself. To do that, we will start by analysing an initial, very simple architecture, find out its limitations, and then design a new, more complex one addressing those limitations. We will repeat this process over and over again.

Simple reflex agent

Let us first consider a simple lookup table, like the one referred to in section 1.4.2.
Given a percept, the agent searches the lookup table for an exact match, and immediately outputs the corresponding action (like a reflex). Instead of having a table entry for each possible percept, we can introduce patterns; for instance, whenever the robot finds dirt, perform a Suck action. This forms a so-called IF-THEN rule (also known as a production rule), usually in the form:

    IF ⟨condition⟩ THEN ⟨action⟩                          (1.2)

A production system consists of a set of such rules; given a percept, it looks for a rule whose ⟨condition⟩ matches, and then produces the corresponding action ⟨action⟩ (if such a rule is found). For instance, the rules for the vacuum cleaner domain could be

    IF ⟨A, Clean⟩ THEN Right
    IF ⟨B, Clean⟩ THEN Left
    IF ⟨∗, Dirty⟩ THEN Suck

where the ∗ symbol matches anything (wildcard). The resulting architecture is depicted in figure 1.3: each percept is subject to an input interpretation box (for instance, consider a room dirty only if the measurement of the dirt sensor is higher than a threshold), followed by a rule matching engine, fed by a database of IF-THEN rules.

[Figure 1.3: Architecture of a simple reflex agent: sensors feed an "interpret input" block, whose output goes through a "rule match" block backed by an IF-THEN rules database, driving the actuators.]

From a design point of view, each one of these blocks should correspond to a different software module. This way we can map the architecture design (figure 1.3) to a computer implementation of it. Moreover, the rule matching engine should be domain independent, meaning that the same module could be applied to any domain, not only the one the agent was designed for. This way, changing the domain would not imply changing the engine, only the rules database, and the engine can be re-used in a different domain.

The major advantage of this architecture is its simplicity. However, it poses several limitations. One of them is that it does not scale well to complex domains: it easily becomes cumbersome to write rules for complex problems.
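Before looking at examples of these limitations, the rule-matching engine itself can be sketched; note how little machinery it needs (which is also why this approach stops scaling once the rules themselves must carry all the complexity). A minimal sketch, with the wildcard ∗ modeled as a special token:

```python
WILDCARD = "*"

# Production rules for the vacuum domain: (condition, action) pairs.
# The wildcard token in a condition field matches any percept value.
RULES = [
    (("A", "Clean"), "Right"),
    (("B", "Clean"), "Left"),
    ((WILDCARD, "Dirty"), "Suck"),
]

def matches(condition, percept):
    """A condition matches when every field equals the percept's or is *."""
    return all(c == WILDCARD or c == p for c, p in zip(condition, percept))

def rule_match(percept):
    """Domain-independent engine: return the action of the first
    rule whose condition matches the percept (None if no rule fires)."""
    for condition, action in RULES:
        if matches(condition, percept):
            return action
    return None
```

Note that `rule_match` and `matches` know nothing about vacuum cleaners; only the `RULES` database is domain-specific, which is exactly the separation of concerns argued for above.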
Imagine for instance how to solve a Sudoku board using this approach. Another problem is that it underperforms in partially observable environments. For instance, with the rules above, the agent will end up in an infinite loop of moving from one room to the other. If the agent could somehow memorize which rooms it has already cleaned, it could then safely shut down. This leads us to the next agent architecture.

Model-based agent

If the environment is partially observable, appropriate decisions may require the agent to have memory. So, let us add a memory to the previous agent architecture, in the form of a world model. This structure models the current state of the environment, and is dynamically updated with information from previous and current percepts. In other words, each time the agent receives a percept, it updates its world model. This world model is then used in the rule matching engine. Thus, the rule conditions refer to the world state, rather than to the percepts directly. The agent's action thus depends on the history of previous interactions, since previous and current percepts may depend on past actions. Figure 1.4 shows the resulting architecture.

[Figure 1.4: Architecture of a model-based agent: sensors feed an "update state" block maintaining a world model, which the rule matching engine (backed by the IF-THEN rules database) consults to drive the actuators.]

The major difference from the previous architecture is that this agent has an internal state. Its actions depend not only on the immediate percept, but also on previous interactions with the environment. This way the agent is better able to deal not only with partially observable environments, but also with sequential domains, where the appropriate "thing to do" depends on the past.
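A minimal sketch of such a model-based agent for the vacuum world (assuming, as before, that a Suck action always succeeds, so a sucked room can be recorded as clean):

```python
class ModelBasedVacuumAgent:
    """Reflex agent extended with an internal world model: it remembers
    the last known status of each room, and shuts down once every room
    is known to be clean."""

    def __init__(self, rooms=("A", "B")):
        # internal state: last known status of each room
        self.model = {room: "Unknown" for room in rooms}

    def step(self, percept):
        room, status = percept
        if status == "Dirty":
            # assume Suck succeeds (as in the text), so the room
            # will be clean after this action
            self.model[room] = "Clean"
            return "Suck"
        self.model[room] = "Clean"       # update the model from the percept
        if all(s == "Clean" for s in self.model.values()):
            return "Shutdown"            # all rooms known clean
        return "Right" if room == "A" else "Left"
```

Unlike the pure lookup table, the same percept ⟨B, Clean⟩ can now yield different actions (Left or Shutdown) depending on what the agent has seen before: the behavior depends on the interaction history, via the internal state.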
For the vacuum cleaner domain, the world model could include the cleanliness state of each room, so that a production rule in the schematic form

IF all rooms clean THEN Shutdown

could be added, where "all rooms clean" matches only a world state having all rooms in the clean state.

One limitation of this architecture is that the agent depends on a model of the environment. This may be a partial model (except for trivial cases, one cannot usually model everything), and/or there may be noise in the sensors and/or uncertainty in the environment which affect the accuracy of the model. The world model update should be robust to this in order to perform appropriately. Another limitation is, once again, scaling up to more complex domains. In particular, the designer has to write rules taking into account all possible situations the agent might encounter. One can then argue that who is really solving the problem is the person writing the rules, rather than the computer: the intelligence is in the designer, not in the machine. One could stretch the argument to the limit, and ask whether this is not always the case with machine intelligence. But if we consider an agent that, given a goal, finds a solution by itself, then we would certainly be closer to machine intelligence. This leads us to the next architecture.

Goal-based agent

In this architecture the agent is given a goal, and then it searches (by itself) for a solution satisfying it. Figure 1.5 shows its structure: the agent has to be able to predict the consequences of its actions. For this, we need a world evolution model, which represents how the world state changes by virtue of the agent's actions. With this, the agent becomes able to predict how the world state evolves after performing certain actions: "what happens if I do this action?" Then, the agent compares the resulting final world state with the agent's goals.
If they match, a solution was found, in the form of a sequence of actions achieving that goal. For that, the agent is required to consider all of its possible actions, and possibly all of its possible action sequences (under computational space/time limitations, of course).

Figure 1.5: Architecture of a goal-based agent.

Once again, both the world model and the world evolution model may deviate from reality. This may be more serious than in the previous architecture, insofar as actions may not have the expected results (e.g., a failed action). For instance, our vacuum cleaner robot may build a long plan of actions to go from one room to a distant one, but if it fails to perform one turn in the middle of the plan execution, it may end up in a completely different room. Another issue is the representation of the goal: is it a specific world state, or is it a condition over the world state (e.g., all rooms clean, regardless of the robot's final position)? If finding a solution requires considering a set of possible action sequences, this can raise computational complexity problems, because of the combinatorial nature of the problem. The computational efficiency of planning methods is an important issue if the agent is required to deal with a complex domain.

For the vacuum cleaner example, the goal could be all rooms being clean. Having a world evolution model predicting that a room becomes clean after being Suck'ed, such an agent would be able to figure out a sequence of actions to reach the given goal. This would work without the need for the designer to explicitly tell the agent what to do in every possible situation. However, there may be several valid action sequences (plans) that attain the same goal. How to choose among them? Is there a way of finding the best solution? See the next architecture for an answer.
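The "what happens if I do this action?" loop can be sketched for the two-room vacuum domain: a tiny world evolution model predicts outcomes, and action sequences of growing length are tried until the goal "all rooms clean" is satisfied. All names, and the brute-force enumeration itself, are illustrative assumptions.

```python
# A goal-based agent sketch: enumerate action sequences, simulate each
# one with a world evolution model, and return the first that reaches
# the goal.

from itertools import product

ACTIONS = ["Left", "Right", "Suck"]

def evolve(state, action):
    """World evolution model: predict the next state (location, rooms)."""
    location, rooms = state
    rooms = dict(rooms)
    if action == "Suck":
        rooms[location] = "Clean"
    elif action == "Left":
        location = "A"
    elif action == "Right":
        location = "B"
    return (location, rooms)

def goal(state):
    return all(s == "Clean" for s in state[1].values())

def plan(initial, max_len=4):
    """Try all action sequences up to max_len; return the first that works."""
    for n in range(1, max_len + 1):
        for seq in product(ACTIONS, repeat=n):
            state = initial
            for a in seq:
                state = evolve(state, a)
            if goal(state):
                return list(seq)
    return None

start = ("A", {"A": "Dirty", "B": "Dirty"})
print(plan(start))  # first sequence found: ['Suck', 'Right', 'Suck']
```

The combinatorial blow-up mentioned in the text is visible here: the inner loop grows as 3^n with the sequence length n.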
Utility-based agent

When compared with the previous architecture, this one replaces the simple matching of the resulting world state against the goals with an evaluation function. The resulting state is evaluated with respect to a given performance measure, and the agent then selects the action sequence that maximizes this measure. This evaluation function is often called a utility, e.g., the utility U(a, s) of performing action a in the world state s. This way the agent is optimizing for the performance measure, thus reaching the level of a rational agent, as previously defined in section 1.4.3.

Figure 1.6: Architecture of a utility-based agent.

Figure 1.6 depicts this architecture. Applying it to the vacuum cleaner domain would result in an agent that could, for instance, minimize energy while cleaning all rooms. For an environment with only two rooms, this may not sound very exciting. But consider a large, multi-story building, where the robot has to empty the stored dirt at certain places. It is not trivial to manually devise the best plan to clean the whole building while minimizing energy (and/or time).

As we have evolved the architecture, we have also increased the prior knowledge required by the agent to operate appropriately: it needs a world model, along with the algorithms to update it after each percept; it needs a world evolution model, sufficiently accurate and robust to any inherent uncertainty; and it also needs a utility function. All of this knowledge has to be put into the agent.

Figure 1.7: Architecture of a learning agent.
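Utility-based selection can be sketched as scoring each candidate plan's predicted outcome and taking the maximum. The plans, outcomes, and weights below are assumed numbers for illustration only.

```python
# A utility-based selection sketch: candidate plans map to predicted
# outcomes (rooms cleaned, energy spent); the agent picks the plan
# maximizing a utility function over outcomes.

plans = {
    "clean A then B": (2, 5.0),   # (rooms cleaned, energy) -- assumed values
    "clean B then A": (2, 6.5),
    "clean A only":   (1, 2.0),
}

def utility(outcome):
    """Reward cleaned rooms, penalize energy (weights are assumptions)."""
    cleaned, energy = outcome
    return 10.0 * cleaned - energy

best = max(plans, key=lambda p: utility(plans[p]))
print(best)  # "clean A then B": utility 15.0 beats 13.5 and 8.0
```

Unlike the goal-based agent, which accepts any plan reaching the goal, this agent ranks the plans that clean both rooms and prefers the cheaper one.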
In the next section we will consider an architecture where part (if not all) of this knowledge is gathered by the agent itself.

Learning agent

This architecture builds upon any of the previous ones. In particular, consider any of the previous architectures as the performance element box in figure 1.7. Let us analyse this architecture piece by piece. The percepts are the inputs to the performance element, which outputs the agent's action. In order for the agent to learn, something is needed that informs it whether it is performing well; this is called the critic. Given a performance standard, the critic provides a feedback signal (e.g., positive, negative, or neutral). This signal is often designated reinforcement. The goal of the learning process is to change the performance element, e.g., to change the parameters of a utility function, or to change the rules of a production system. This change is performed by the learning element, which, given knowledge as gathered by the performance element (e.g., the world state), performs changes on it.

Learning requires exploration, as discussed in section 1.4.3. In order for the agent to explore (and eventually balance between exploration and exploitation), a module called the problem generator was added. This module receives the learning goals from the learning element, and poses new problems for the performance element to solve. While doing so, more knowledge is gathered, thus contributing to a more varied learning process. Not doing so would limit the agent's performance to the situations it has already encountered.

A learning vacuum cleaner robot could, for instance, be able to build a map of the building it operates in. It could also learn about various aspects of the environment (e.g., time to clean each room, typical dirt amount), thus allowing for a better performance level as time goes by.
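The critic/learning-element loop can be sketched in miniature. Everything here is an illustrative assumption, not from the notes: the performance element is reduced to a single numeric parameter, and the learning rule is a plain additive nudge driven by the critic's reinforcement signal.

```python
# A toy critic / learning-element loop: the critic compares each
# episode's outcome against a performance standard and emits a
# reinforcement signal; the learning element adjusts the parameter.

def critic(outcome, standard):
    """Feedback signal: +1 above the standard, -1 below, 0 otherwise."""
    if outcome > standard:
        return +1
    if outcome < standard:
        return -1
    return 0

def learning_element(parameter, reinforcement, rate=0.1):
    """Change the performance element's parameter from the feedback."""
    return parameter + rate * reinforcement

parameter = 0.5
for outcome in [0.2, 0.4, 0.9]:        # outcomes of successive episodes
    r = critic(outcome, standard=0.6)
    parameter = learning_element(parameter, r)
print(round(parameter, 2))  # 0.5 - 0.1 - 0.1 + 0.1 = 0.4
```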
Chapter 2

Problem solving

2.1 Introduction

This chapter focuses on AI methods for solving simple problems. One such problem is the following: consider two jugs of water, A and B, with exactly 4 liters and 3 liters of capacity, respectively; given that there is no other way of measuring volume, how can one obtain 2 liters in jug A? The only available actions are filling a jug with water, throwing the water away, and transferring water from one jug to the other. Solving the problem means finding the sequence of actions that reaches the goal from a given initial state (say, both jugs empty).

In general, the formulation of a problem involves the definition of several items: (1) goal formulation, (2) actions to consider, and (3) states to consider. The notion of state is central to this formulation. A state is here understood as the complete representation of a particular condition of the domain at a specific time. It has to be complete in the sense that, given a state, we are able to predict exactly the resulting state after a given action, without ambiguity. For instance, in the water jug problem, we will take a state to be the amount of water in each jug. Formally, we could represent it as a pair (a, b), where a and b are the amounts of water in jugs A and B (0 ≤ a ≤ 4 and 0 ≤ b ≤ 3). Since the goal is to get 2 liters in jug A, it is represented by the condition a = 2 (we do not care about the final amount of water in jug B). The initial state is both jugs empty, i.e., state (0, 0). For the actions, we will consider the following ones:

fill A: (a, b) ↦ (4, b)
fill B: (a, b) ↦ (a, 3)
A to B (transfer from A to B until B is full): (a, b) ↦ (a − q, b′), where b′ = min(3, a + b) and q = b′ − b
B to A (transfer from B to A until A is full): (a, b) ↦ (a′, b − q), where a′ = min(4, a + b) and q = a′ − a
empty A: (a, b) ↦ (0, b)
empty B: (a, b) ↦ (a, 0)

A solution is a sequence of states leading from the initial state (0, 0) to a state satisfying the goal condition a = 2, for instance:

(0, 0) --fill B--> (0, 3) --B to A--> (3, 0) --fill B--> (3, 3) --B to A--> (4, 2) --empty A--> (0, 2) --B to A--> (2, 0)   (2.1)

Formulating a problem in this way amounts to casting it as a state space problem, in the sense of defining a space of all possible states, and finding an action sequence leading from an initial state to a goal state. Note that although the initial state is unique, in the sense of being initially given, more than one state may satisfy the goal condition. Thus we usually refer to a goal state, not necessarily a unique one.

How can a machine solve these kinds of problems? In simple terms, it amounts to searching the state space for a path from the initial state to a goal state. This path contains the desired action sequence to solve the problem. In some problems one may be concerned only with the goal state found (e.g., in the Sudoku domain), while in others only with the action sequence (e.g., the water jug problem). It is important to stress two distinguishing aspects of this approach: first, the three-step process of (1) formulating the problem, (2) searching for a solution, and possibly (3) executing it; and second, the assumption that the environment is static, observable, discrete, and deterministic.

2.2 Well-defined problems and solutions

We shall now formally define problems and solutions in this context. Given a state space, a problem is defined by:

1. an initial state
2. a successor function, mapping a state s to the set of action-state pairs (a_i, s_i) applicable to s, i.e., of the form s ↦ {(a_1, s_1), (a_2, s_2), . . .}; the states s_1, s_2, . . . are called successor states, and the procedure of obtaining these states from s is called the expansion of s;

3. a goal test, which, given a state s, is true if and only if s is a goal state (also known as a solution state); and

4. a path cost, mapping a sequence of actions from the initial state to a given state to a number, for which lower values are preferable to higher values (in the performance measure sense).

The first two items define the reachable state space, meaning all the states that can be reached from the initial state by recursive application of the successor function (i.e., applying the successor function to all successor states, over and over again). This forms a graph, which can be cyclic or acyclic (e.g., a tree). To any reachable state s corresponds a path, a sequence of action-state pairs ⟨(a_1, s_1), (a_2, s_2), . . . , (a_k, s_k)⟩, where s_i is the successor state of s_{i−1} after applying action a_i, s_0 is the initial state, and s_k = s is the given state. This path represents one way of reaching state s from the initial state. Given such a path, the path cost assesses the total cost of reaching that state. If the final state s_k of a path is a goal state, its cost corresponds to the inverse of the performance measure, reflecting the quality of the solution. In this sense, the optimal solution is the path, from the initial state to a goal state, that minimizes the path cost.

One important aspect is the choice of the state representation, meaning how we formally represent the states of a given problem. When choosing it, one has to address the problem of abstraction. A state is an abstraction.
In the water jugs problem we were not concerned with the shape of the jugs, nor their height from the floor, nor whether they are on a table, etc., since none of those aspects affects the goal of finding a solution to the problem. An abstraction means leaving out of the representation all aspects that are irrelevant to the problem at hand. The choice of what to leave out and what to represent is crucial, in two ways: leaving out too much may prevent us from finding a feasible solution, while keeping too much in the representation will impact the computational complexity of the problem.

As an illustration, consider the mutilated chessboard case, depicted in figure 2.1. This is a normal chessboard with two opposing corner squares removed. The problem is to find out whether one can cover the whole board with domino pieces, each one occupying exactly 2 by 1 squares. This is not a trivial problem, since we have to consider many possible ways of covering the board, and it turns out to be very difficult. But consider now this alternative representation of the same problem: each piece necessarily covers one black and one white square; a chessboard contains 8 × 8 = 64 squares, 32 black and 32 white; however, mutilating the poor board left it with 32 black and 30 white squares; therefore, one can never cover such a board with 2 by 1 domino pieces (since at least two black squares will be left out of the cover).

Figure 2.1: Mutilated chessboard.

In conclusion, a careful choice of the state representation, in terms of the level of abstraction, has a crucial impact on the complexity of the solving process. There are many cases, the mutilated chessboard problem being one of them, where by changing the state representation one can transform an intractable problem into a very easy one.
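Returning to the water jugs formulation of section 2.1, the six actions can be written directly as state transformations on pairs (a, b); the example replays the solution sequence from equation (2.1). The function names are illustrative; the definitions follow the text.

```python
# The six water-jug actions as state transformations (capacities 4 and 3).

def fill_A(a, b):  return (4, b)
def fill_B(a, b):  return (a, 3)
def empty_A(a, b): return (0, b)
def empty_B(a, b): return (a, 0)

def A_to_B(a, b):
    """Transfer from A to B until B is full (or A is empty)."""
    b2 = min(3, a + b)
    return (a - (b2 - b), b2)

def B_to_A(a, b):
    """Transfer from B to A until A is full (or B is empty)."""
    a2 = min(4, a + b)
    return (a2, b - (a2 - a))

state = (0, 0)
for action in [fill_B, B_to_A, fill_B, B_to_A, empty_A, B_to_A]:
    state = action(*state)
    print(state)
# (0, 3) (3, 0) (3, 3) (4, 2) (0, 2) (2, 0): the goal a = 2 is reached
```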
2.3 Solving problems

The approach we will use to solve these problems is search. Search methods consist of looking for a state in the state space that satisfies the goal test, starting from the initial state. The idea is the following: starting with the initial state (or node¹), generate all successor nodes; then choose one of these nodes and expand it, proceeding over and over again, until a node satisfying the goal condition is encountered. Note that only the first node satisfying this condition is returned. The application of the successor function to a node is called expansion, and the resulting nodes are called successor nodes. Figure 2.2 illustrates this process.

Figure 2.2: Search tree: from the initial node (1), the application of the successor function yields successor nodes (2-5); the search proceeds by expanding one of these nodes.

The above description, which will be formalized below as the general tree-search algorithm, does not specify the choice of the next node to expand. Given a partially expanded tree, several possibilities can be considered: expand the oldest open node, the newest, the one with the lowest path cost, etc. These possibilities are designated search strategies. In what follows we will always use the same general tree-search algorithm, but we will consider different search strategies, and discuss the pros and cons of each one.

First, it is important to define the search space, which is formed by all nodes reachable from the initial state by the repetitive application of the successor function. Not all states of the state space might be reachable this way, and so the search space is a subset of the state space. Another useful definition is that of the search tree: the repetitive application of the successor function yields a tree (as the one illustrated in figure 2.2).

¹ We will use the terms state and node interchangeably.
This tree not only represents the succession relationships between the nodes; repeated states may also appear in it. A good search algorithm should take repeated states into account, as otherwise portions of the search space might be explored more than once, thus wasting time (and memory).

The general tree-search algorithm is defined by the following steps:

1. initialize the tree with the initial state
2. if there are no candidates to expand, then return "failure"
3. choose a node to expand according to the search strategy
4. if the chosen node is a goal state, then return it; otherwise, expand it and add its successor nodes to the tree
5. go to step (2)

All nodes not yet expanded are called open nodes. Once a node is expanded, it is no longer open. The search strategy then boils down to a choice among open nodes. The choice of the search strategy can have a big impact on the results, so we will compare the search strategies presented next according to the following criteria:

completeness: whether the algorithm finds a solution whenever one exists in the search space
optimality: whether the solution found is the one that minimizes the path cost
time complexity: how long it takes to find a solution, in the worst case, usually expressed in number of expanded nodes²
space complexity: how much memory is needed, in the worst case, also usually expressed in number of expanded nodes

Search strategies divide into two major classes: uninformed and informed. The difference lies in the fact that the latter employ information from the nodes themselves (e.g., estimated path cost to the goal, as in the case of A*), while the former are completely blind to the node representation.

2.3.1 Uninformed search strategies

Breadth-first search

The idea of this strategy is very simple: expand the search tree in breadth, or in other words, level by level.
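The general tree-search algorithm above can be sketched with the strategy left as a parameter; here it is run with the breadth-first (FIFO) node order on the water jugs problem. The code is illustrative: names are assumptions, and it includes the repeated-state check recommended earlier, which the bare five-step algorithm omits.

```python
# General tree search, parameterized by the search strategy: `strategy`
# picks the index of the open node to expand next. Nodes carry the path
# of actions taken to reach them.

def tree_search(initial, successors, goal_test, strategy):
    open_nodes = [(initial, [])]            # (state, actions so far)
    seen = {initial}                        # repeated-state check
    while open_nodes:                       # step 2: no candidates -> failure
        i = strategy(open_nodes)            # step 3: pick node by strategy
        state, path = open_nodes.pop(i)
        if goal_test(state):                # step 4: goal test on chosen node
            return path
        for action, succ in successors(state):
            if succ not in seen:
                seen.add(succ)
                open_nodes.append((succ, path + [action]))
    return None                             # "failure"

def successors(state):
    """Water-jug successor function: applicable (action, state) pairs."""
    a, b = state
    pairs = [("fill A", (4, b)), ("fill B", (a, 3)),
             ("empty A", (0, b)), ("empty B", (a, 0)),
             ("A to B", (a - (min(3, a + b) - b), min(3, a + b))),
             ("B to A", (min(4, a + b), b - (min(4, a + b) - a)))]
    return [(act, s) for act, s in pairs if s != state]

plan = tree_search((0, 0), successors, lambda s: s[0] == 2,
                   strategy=lambda nodes: 0)   # FIFO order -> breadth-first
print(plan)  # a shortest plan: 6 actions, ending with 2 liters in jug A
```

Choosing a different `strategy` (e.g., always picking the last open node, a LIFO order) turns the same routine into the depth-first search discussed below.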
First all nodes at depth 1 are expanded, then the ones at depth 2 (successors of the ones at the previous depth), and so on, until a solution is found. The ordering of node expansion is illustrated in figure 2.3. In implementation terms, it is messy to track the node levels, and therefore a very elegant approach can be used: when expanding a node, all successor nodes are inserted into a FIFO structure³, and the next node to be expanded is the one that is dequeued first. In this way, the next node to be expanded is the oldest one not yet expanded.

Figure 2.3: Node expansion order in breadth-first search.

Breadth-first is complete (as long as the branching factor is finite), and it will find the shallowest goal node in the tree. The shallowest solution node is not necessarily the optimal one; however, if the path cost is a function of the depth alone, then the shallowest solution is also the optimal one.

Concerning the time complexity, let us compute the number of generated nodes in the worst case: considering a branching factor b and that the shallowest solution is at depth d, this strategy generates b nodes at the first level, b · b = b² at the second one, and so on, until b^d at the level of the shallowest solution. However, according to our general search algorithm, the goal test is only applied to the node chosen to be expanded next. Therefore, in the worst case, breadth-first will generate one more level, up to the solution node, i.e., b^(d+1) − b additional nodes.

² This is an implementation-independent measure, as the choice of, say, seconds is highly hardware/software dependent.

³ FIFO means First-In-First-Out, corresponding to a queue where the first items to enter are the first to come out.
The time complexity corresponds to the sum of all these terms⁴:

b + b² + · · · + b^d + (b^(d+1) − b) = O(b^(d+1))   (2.2)

Since all open nodes have to be stored in memory (the next to be expanded is the oldest one), the space complexity equals the time complexity: O(b^(d+1)). The major problem with this strategy is the exponential nature of both time and space complexities. In practice, this algorithm breaks due to lack of memory before running out of time. As an example, consider the estimates in figure 2.4 for both time and memory consumption, for a problem with branching factor b = 10, running on a machine capable of generating 10000 nodes/sec, and using 1000 bytes/node. Note in this example that with d = 8, 31 hours of computation require 1 TB of memory, which is currently out of reach of common computers.

d     nodes   time       memory
2     1100    0.11 secs  1 MB
6     10^7    19 min     10 GB
8     10^9    31 hours   1 TB
12    10^13   35 years   10 PB

Figure 2.4: Time and memory consumption estimates of a problem with b = 10, on an implementation generating 10000 nodes/sec and consuming 1000 bytes/node.

Uniform cost search

This strategy is a variation of breadth-first, where the node choice is determined by the node cost: the cost g(n) of a node n in the search tree is the sum of the step costs along the tree from the initial node to n. The next node to expand is then the open node that minimizes g(n). If all step costs are equal, this strategy reduces to breadth-first. Unless all step costs are strictly positive, this strategy may lead to infinite loops (e.g., a step cost of zero leading back to an identical state). However, if all step costs are greater than or equal to some ε > 0, then this strategy is complete and optimal.

⁴ For a description of the O-notation (asymptotic analysis) used to express complexity, check [4], for instance.
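Uniform cost search can be sketched with a priority queue ordered by g(n). The code is illustrative: the successor function is assumed to return (step cost, state) pairs, and the `best_g` bookkeeping is an added repeated-state optimization beyond the plain tree search described above.

```python
# Uniform cost search sketch: always expand the open node with the
# lowest path cost g(n), kept in a priority queue.

import heapq

def uniform_cost(initial, successors, goal_test):
    """successors(state) -> iterable of (step_cost, next_state) pairs."""
    open_nodes = [(0, initial)]                # entries are (g(n), state)
    best_g = {initial: 0}
    while open_nodes:
        g, state = heapq.heappop(open_nodes)   # open node minimizing g(n)
        if goal_test(state):
            return g
        for step, succ in successors(state):   # step costs assumed >= eps > 0
            if g + step < best_g.get(succ, float("inf")):
                best_g[succ] = g + step
                heapq.heappush(open_nodes, (g + step, succ))
    return None

# Toy weighted graph (assumed): cheapest cost from "a" to "d" is 1+1+1,
# beating the direct 5+1 route through the expensive edge.
graph = {"a": [(1, "b"), (5, "c")], "b": [(1, "c")], "c": [(1, "d")], "d": []}
print(uniform_cost("a", lambda s: graph[s], lambda s: s == "d"))  # 3
```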
It can be proven that the time and space complexity of this strategy is

O(b^(1+⌊C∗/ε⌋))   (2.3)

where ε > 0 is a lower bound on the step costs and C∗ is the cost of the optimal solution. This value is usually much larger than b^d. This strategy tends to explore large portions of the search tree with low step costs, making it a relatively inefficient way of finding optimal solutions.

Depth-first search

While the breadth-first strategy searches in breadth, the depth-first one searches in depth. This strategy proceeds in depth until finding a node without successors, and then backtracks to alternative paths to a solution. This can be very elegantly implemented by replacing the FIFO queue with a LIFO⁶ (also known as a stack). The resulting node expansion ordering is illustrated in figure 2.5.

Figure 2.5: Node expansion order in depth-first search.

One important aspect of this strategy is its memory usage. Note in figure 2.5 that when node 3 is expanded, the whole subtree starting at node 2 was already found to be useless, and therefore there is no need to keep all those nodes in memory. In general, depth-first allows deallocating all memory except for the nodes along the path to the most recently expanded node, and their immediate successors. Therefore, the space complexity is linear, rather than exponential as in the previous strategies:

b · m + 1 = O(bm)   (2.4)

where m is the maximum depth of the search tree. This corresponds to storing b nodes at each of the m depth levels, plus the initial node. The time complexity remains, however, exponential: O(b^m).

⁶ LIFO means Last-In-First-Out, corresponding to a stack, where the next item coming out is the most recently added one.
As an illustration of the tremendous memory saving, consider the previous example of a problem with b = 10, 10000 nodes/sec, and 1000 bytes/node: if the solution and maximum depths were d = m = 12, while breadth-first requires 10 PB of storage, depth-first only requires 118 KB. The price to pay is that depth-first is neither complete, if m is not finite, nor optimal.

Backtrack search

Backtrack search results from two optimizations of the depth-first strategy which, in the cases where they can be applied, yield further memory savings.

• If, for a given node, we can generate one of its successors at a time, we only need to store one successor for each expanded node. Thus, space complexity reduces to O(m).

• If, moreover, we can implement node expansion by modifying the node in place (rather than allocating a new memory structure for the successor node), and backtrack by undoing that modification, we only need memory for a single node.

This is the strategy of choice for, for instance, solving constraint satisfaction problems.

Depth-limited search

One way of addressing the lack of completeness of depth-first search is to limit, a priori, the depth of the search tree. In this way, the search tree will not be deeper than that limit. The major problem with this approach is that, whenever the shallowest solution is deeper than the specified limit, no solution will be found. If we designate the limit by l, a solution is found only if d ≤ l. Because of this limitation, this strategy is also not complete, and also not optimal. If, however, the depth of the solution is known, we can safely set l to that value. The time complexity of this strategy is O(b^l), while its space complexity is O(bl).

Iterative deepening depth-first search

In the cases where the depth of the solution is unknown, we can retry depth-limited search with different values of l, until a solution is found.
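The idea just stated, retrying depth-limited search with growing limits, can be sketched as follows. The code and its toy counting domain are illustrative assumptions.

```python
# Depth-limited search (recursive) plus the iterative deepening wrapper.
# The successor function returns (action, state) pairs; depth_limited
# returns the action path, or None if the limit is hit without a goal.

def depth_limited(state, successors, goal_test, limit, path=()):
    if goal_test(state):
        return list(path)
    if limit == 0:
        return None
    for action, succ in successors(state):
        found = depth_limited(succ, successors, goal_test,
                              limit - 1, path + (action,))
        if found is not None:
            return found
    return None

def iterative_deepening(initial, successors, goal_test, max_limit=50):
    for l in range(1, max_limit + 1):   # l = 1, 2, ... until a solution
        found = depth_limited(initial, successors, goal_test, l)
        if found is not None:
            return found
    return None

# Toy domain: counting from 0 to a goal number by +1 or +3 steps.
succ = lambda n: [("+1", n + 1), ("+3", n + 3)]
print(iterative_deepening(0, succ, lambda n: n == 7))  # ['+1', '+3', '+3']
```

Note that the returned plan has length 3, the shallowest depth at which 7 is reachable: the l = 1 and l = 2 iterations fail and are simply repeated from scratch.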
In particular, we can start with increasingly higher values of l, starting at l = 1, until a solution is found. This is called iterative deepening depth-first search. Assuming that the depth of the solution is d, this strategy runs depth-first search iteratively for l = 1, . . . , d. What might seem a waste of time, since in each iteration the search tree has to be generated from scratch, is actually not that serious, since in general the bulk of the search time is spent at the deepest level of the tree (because of the exponential nature of search trees). The space complexity is O(bd), and the time complexity is

db + (d − 1)b² + · · · + b^d = O(b^d)   (2.5)

since the nodes at depth 1 are generated d times (once per iteration), those at depth 2 are generated d − 1 times, and so on, down to the nodes at depth d, which are generated only once. It is complete, since l will be incremented iteratively until l = d, and it is optimal if the path cost is a non-decreasing function of the depth (which is the same as saying that the shallowest solution is returned).

Iterative deepening depth-first search is often the preferred strategy among all uninformed search strategies, since it automatically combines features of breadth-first (for l < d the tree is searched in breadth, but without the burden of holding an exponential number of nodes in memory) and of depth-first (for each l the search effectively proceeds in depth-first fashion, thus being very memory-lightweight).

One variation of this strategy is called iterative lengthening depth-first search, and consists of performing a depth-first search limited in path cost, iteratively increasing that limit until a solution is found.

Bidirectional search

In bidirectional search, not only do nodes expand to successors, but nodes also expand backwards to predecessors. First, the initial node expands to its successor nodes; then, the goal node expands backwards to its predecessor nodes. This backward expansion means that, from a given node, a set of predecessor nodes is generated such that the given node is a successor of each one of them.
The process iterates, in both directions, until a match is found between a node generated forward and a node generated backwards. When this happens, we can trace the solution path from the initial node to the goal. Since node expansions happen in both directions, from the initial node as well as from the goal node, it amounts to twice the search complexity with half of the depth, because the match between the two search directions occurs in the middle. So, the time complexity is

b^(d/2) + b^(d/2) = O(b^(d/2))   (2.6)

and since b^(d/2) ≪ b^d, it is dramatically faster. Since all expanded nodes have to be stored in memory, the space complexity is also O(b^(d/2)). This strategy is complete and optimal (as long as all step costs are equal), provided that breadth-first search is employed in each direction. It may seem at first sight clearly superior, at least concerning time complexity, but "there are no free lunches":

1. the goal state has to be specified beforehand, meaning that it cannot (in principle) be used when the goal state is unknown;

2. not only a successor function, but also a predecessor function has to be defined;

3. the predecessor function has to be complete, in the sense that all possible predecessor nodes have to be generated, since otherwise completeness may be compromised.

2.3.2 Informed search strategies

The following search strategies differentiate themselves from the uninformed ones by the usage of domain-specific information concerning the quality of the open nodes. In particular, this information boils down to an evaluation function, designated f(n), which for a given node n yields a number: the lower f(n) is, the closer the total path cost of a solution passing through n (and its parents) is to the optimal solution cost. In other words, the nodes along the path from the initial node to a goal are the ones with the lowest values of f(n).
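Ordering the open nodes by f(n) can be sketched with a priority queue; the node names and f values below are assumed numbers for illustration.

```python
# Best-first node ordering sketch: open nodes are kept in a heap keyed
# by the evaluation function f(n), so the node with the lowest f(n) is
# always the next one out.

import heapq

def best_first_order(nodes, f):
    """Yield nodes in the order a best-first strategy would expand them."""
    heap = [(f(n), i, n) for i, n in enumerate(nodes)]  # i breaks ties
    heapq.heapify(heap)
    while heap:
        _, _, n = heapq.heappop(heap)
        yield n

f_values = {"x": 7, "y": 2, "z": 5}
print(list(best_first_order(["x", "y", "z"], f_values.get)))  # ['y', 'z', 'x']
```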
Thus, it is natural that the best strategy consists of choosing, as the node to be expanded next, the open node that minimizes f(n). The choice of the node with the lowest f(n) is called the best-first strategy. The variation among the following methods consists exclusively in the way f(n) is computed.

A very important concept in the context of informed search strategies is the heuristic function, denoted h(n). Given any node n, this function returns an estimate of the path cost from n to a goal node. This value concerns only the cost from n to the goal, and thus the usage of this function assumes that path costs are summable, i.e., the total path cost of a solution can be written as the sum of the cost g(n), from the initial state to n, with the cost from n to a goal node. The heuristic function aims at estimating this latter term of the sum, for a given n.

Greedy best-first

A* (A-star)

Chapter 3

Knowledge and reasoning

Chapter 4

Planning

Chapter 5

Uncertain knowledge and reasoning

Chapter 6

Learning

Bibliography

[1] James F. Allen. AI growing up. AI Magazine, 19(4):13–23, Winter 1998.

[2] C. R. Baker and J. M. Dolan. Traffic interaction in the urban challenge: Putting Boss on its best behavior. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2008), pages 1752–1758, 2008.

[3] D. E. Bernard, G. A. Dorais, C. Fry, E. B. Gamble Jr., B. Kanefsky, J. Kurien, W. Millar, N. Muscettola, P. P. Nayak, B. Pell, K. Rajan, N. Rouquette, B. Smith, and B. C. Williams. Design of the remote agent experiment for spacecraft autonomy. In Proceedings of the IEEE Aerospace Conference, volume 2, pages 259–281, March 1998.

[4] Gilles Brassard and Paul Bratley. Algorithmics: Theory and Practice. Prentice-Hall, 1988.

[5] Murray Campbell, A. Joseph Hoane Jr., and Feng-hsiung Hsu. Deep Blue.
Artificial Intelligence, 134:57–83, 2002.

[6] Sara Reese Hedberg. DART: Revolutionizing logistics planning. IEEE Intelligent Systems, 17(3):81–83, May/June 2002.

[7] Anthony M. Digioia III, Branislav Jaramaz, Constantinos Nikou, Richard S. Labarca, James E. Moody, and Bruce D. Colgan. Surgical navigation for total hip replacement with the use of HipNav. Operative Techniques in Orthopaedics, 10(1):3–8, 2000.

[8] Hiroaki Kitano, Minoru Asada, Yasuo Kuniyoshi, Itsuki Noda, and Eiichi Osawa. RoboCup: The robot world cup initiative. In Proceedings of the First International Conference on Autonomous Agents (Agents'97), pages 340–347, New York, 1997. ACM Press.

[9] John McCarthy, Marvin L. Minsky, Nathaniel Rochester, and Claude E. Shannon. A proposal for the Dartmouth summer research project on artificial intelligence. AI Magazine, 27(4):12–14, Winter 2006. (reprint).

[10] Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, second edition, 2003.

[11] Herbert A. Simon. The Sciences of the Artificial. MIT Press, third edition, 1996.

[12] Sebastian Thrun, Mike Montemerlo, Hendrik Dahlkamp, David Stavens, Andrei Aron, James Diebel, Philip Fong, John Gale, Morgan Halpenny, Gabriel Hoffmann, Kenny Lau, Celia Oakley, Mark Palatucci, Vaughan Pratt, Pascal Stang, Sven Strohband, Cedric Dupont, Lars-Erik Jendrossek, Christian Koelen, Charles Markey, Carlo Rummel, Joe van Niekerk, Eric Jensen, Philippe Alessandrini, Gary Bradski, Bob Davies, Scott Ettinger, Adrian Kaehler, Ara Nefian, and Pamela Mahoney. Stanley: The robot that won the DARPA Grand Challenge. In The 2005 DARPA Grand Challenge, Springer Tracts in Advanced Robotics, pages 1–43. Springer, 2007.

[13] Alan M. Turing. Intelligent machinery. In Machine Intelligence, volume 5, pages 3–23. American Elsevier Publishing, 1970. (reprint).

[14] John von Neumann. The Computer and the Brain. Yale Nota Bene, second edition, 2000.