Artificial Intelligence and Decision Systems
Course notes
Rodrigo Ventura
Instituto Superior Técnico
DRAFT — version 1.2 — October 2010
Preface
These notes were written as supporting bibliography for the Artificial Intelligence and Decision Systems (IASD) course at Instituto Superior Técnico (IST, Lisbon, Portugal). This course aims to provide students with knowledge of basic methods in Artificial Intelligence. Since this course is integrated in a major on systems, decision, and control of the Electrical and Computer Engineering degree at IST, its contents are focused on these areas.
This course has been taught at IST since the mid-1990s, and has been continuously updated with new material from this fast-changing scientific area, while omitting parts that have lost relevance with respect to the course aims and audience.
The main bibliographic reference of this course is the excellent textbook
“Artificial Intelligence: A Modern Approach” (2nd edition) by Stuart Russell
and Peter Norvig [10]. The following chapters correspond essentially to a
selection of chapters of this book.
Contents
1 Introduction to AI
1.1 Approaches
1.2 Foundations
1.3 State of the art
1.4 Intelligent agents
1.4.1 Definition
1.4.2 Example
1.4.3 Properties
1.4.4 Nature of environments
1.4.5 Structure of agents
2 Problem solving
2.1 Introduction
2.2 Well-defined problems and solutions
2.3 Solving problems
2.3.1 Uninformed search strategies
2.3.2 Informed search strategies
3 Knowledge and reasoning
4 Planning
5 Uncertain knowledge and reasoning
6 Learning
Chapter 1
Introduction to AI
What is Artificial Intelligence? To understand the scope of this relatively
recent scientific area, we could start by examining the name itself. First,
there is Artificial: something (1) man-made, synthesized, engineered, and
(2) aiming at imitating natural things [11]. And second, there is Intelligence:
a property primarily ascribed to human beings (with exceptions), whose attribution
to other animals is debatable, though not as contested as the idea of
machine intelligence.
Intelligence, in the general sense, is virtually impossible to define in a clear-cut
fashion that would allow us to classify entities as either intelligent
or not beyond a shadow of doubt. Rather, the best we can do is to characterize intelligence as a set of capabilities or skills that intelligent beings possess. These capabilities include problem-solving, reasoning, decision-making,
learning, memory, language, and emotions. This is not an exhaustive list,
but it contains the aspects that are most consensual. Now, other than humans,
what entities possess at least some of these skills? Certainly rocks do not,
and so they can hardly be considered intelligent. Vertebrate animals, however, do seem to possess some (if not all) of them, and so one can accept
them as intelligent. The problems arise in the middle ground: can insects
be considered intelligent? How about amoebas? And viruses? There is a
continuum of complexity ranging from humans down to rocks, and placing a
border line somewhere in the middle does not seem an easy task at all [1].
1.1 Approaches
The idea that machines can be intelligent was proposed at the dawn of computing by Alan Turing in 1948 [13], on the basis that if the behavior of a
machine is indistinguishable from that of a human while performing a certain task,
then it ought to be considered intelligent. However, even today, there are
those who still turn up their noses at the idea of calling a machine intelligent
(although in some cases of daily computer usage it might appear as the sheer
opposite of it). But it should be stressed that this issue should not be taken
as a matter of faith (in either direction).
The name Artificial Intelligence (AI) appeared several years later, in
August of 1955, in the proposal of a summer research project at Dartmouth
College in Hanover, New Hampshire (USA). In this proposal [9], written by
Claude Shannon, Marvin Minsky, Nathaniel Rochester, and John McCarthy,
it is stated that
The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in
principle be so precisely described that a machine can be made
to simulate it.
In the following we will analyse four different approaches to AI, summarized
in table 1.1, which can roughly be mapped to the history of the field.
Understanding them provides a global view of what the field is all about.
(1) Act like humans: Turing test, NLP, ELIZA, etc.; canned answers.
(2) Think like humans: cognitive science (psychology), serial vs. parallel processing.
(3) Think rationally: laws of thought, logic, irrefutable conclusions (common sense, practical).
(4) Act rationally: act so as to achieve the best outcome (or the best expected outcome).

Table 1.1: Four approaches to AI, roughly corresponding to the combinations of act vs. think with like-human vs. rational.
The first approach (1) was in fact hinted at by Turing's early works arguing
for machine intelligence [13]. The so-called Turing test, named in his honor,
aims at evaluating whether a machine should be considered intelligent or not.
The test consists of two text terminals by which a human evaluator can chat
with another human, and with the computer under test. The evaluator does
not know which is which. If the evaluator fails to correctly identify which one
is the computer, then the latter is said to pass the test. This test is clearly
oriented toward evaluating whether the machine can act like humans, as
the evaluator only has access to the machine's behavior (and limited to a text
chat interface). Several years later, Joseph Weizenbaum developed, between
1964 and 1966, a program called ELIZA, whose purpose was to chat with
people in a way vaguely similar to a psychotherapist. In fact, it was intended
as a parody, but the outcome was unexpectedly engaging. It has been told
that Weizenbaum's secretary used to engage in very intimate conversations with
ELIZA, not wanting anyone else to see the transcripts. One of the most
extraordinary things about ELIZA is the simplicity of its programming.
ELIZA is basically a set of IF-THEN rules triggered by text pattern matching,
plus some randomness in the choice of responses. For instance, to "My head
hurts", ELIZA would probably answer with the canned sentence "Why do you
say your head hurts?". Although ELIZA is still a bit far from passing the
Turing test, it does a pretty good job of fooling non-expert humans. Still, a
contest, the Loebner Prize (http://www.loebner.net/Prizef/loebner-prize.html,
retrieved 27-Aug-2009), is held yearly, granting prizes to the programs that come
closest to passing a restricted form of the Turing test. Joseph Weintraub won
the prize four times, in 1991–1993 and in 1995. Can we call these programs,
tailored to answer in a plausible way to text messages, really intelligent? In
fact, research here has been focused more on text pattern matching and on
the design of plausible responses than on targeting the capabilities we
identified above as indicators of intelligence.
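To make the simplicity of this kind of program concrete, here is a minimal ELIZA-style sketch in Python; the rules, patterns, and canned responses are invented for illustration and are far cruder than the original program:

```python
import random
import re

# Each rule pairs a regular expression with canned response templates;
# \1 in a template is filled with the text captured by the pattern.
RULES = [
    (r"my (.*) hurts", ["Why do you say your \\1 hurts?",
                        "How long has your \\1 been hurting?"]),
    (r"i feel (.*)",   ["Do you often feel \\1?",
                        "Why do you feel \\1?"]),
    (r".*",            ["Please tell me more.", "I see. Go on."]),
]

def reply(sentence):
    """Return a canned response for the first rule whose pattern matches."""
    for pattern, responses in RULES:
        match = re.match(pattern, sentence.lower().strip("."))
        if match:
            # Pick one of the templates at random and fill in the captured text.
            return match.expand(random.choice(responses))
    return "Please go on."

print(reply("My head hurts"))   # e.g. "Why do you say your head hurts?"
```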
If it is not enough to (1) act like humans, an alternative approach would
be to (2) think like humans. In this respect, researchers looked into the
field of psychology, which is dedicated to understanding how the human mind
works. Herbert Simon and Allen Newell have contributed enormously to the
field, precisely along this approach of making programs based on our understanding of the human mind. There is, however, one hidden assumption: that
the mechanisms found in our brain can be implemented in a computer with
a similar level of success and efficiency. The brain is, however, structurally
different from a computer [14]. While the brain is massively parallel, encompassing about 15 to 33 billion neurons, all functioning simultaneously,
the computer is based on a serial execution of instructions (even though
computations involve several bits simultaneously, 64 in modern machines, and
the current trend is for multiple multi-core CPUs, these amounts are several
orders of magnitude lower than the number of neurons in the brain). This radically
different processing architecture ought to have an impact on the approaches
to the same problems. So we should not expect, in principle, the best
computational solutions to be similar to the way our brain solves them.
What if, instead of taking humans as models of intelligence, we approach
machine intelligence from a formal, mathematical standpoint? Mathematical
reasoning allows one to draw irrefutable conclusions from a set of premises
using mathematical logic. In AI, mathematical logic has been used to model
many aspects, for instance common sense. We can call this approach
(3) think rationally, in the sense of employing rational thought processes
provided by mathematical logic to attain machine intelligence. Problems
arise, however, since reality may not always be amenable to rigid mathematical
representations. Take for instance the statement "All birds fly"; what
happens if we glue a bird's leg to a heavy concrete block? As the bird is no
longer able to fly, we have to account for exceptions, and that turns out to
be extremely hard to deal with using logic-based representations.
Instead of subjecting thought processes to rigid logic rules, maybe it
suffices for a system to (4) act rationally. Let us define a performance criterion
rigorously, such that the higher the performance of the outcome, the
closer it is to the design goals of the system. Figuring out what to
do then boils down to finding the solution that maximizes this performance.
In this approach, the focus is directed towards finding a solution
that maximizes a defined performance criterion, rather than on constraining
the thought processes to a logically sound framework. This approach has also
produced very efficient methods to deal with uncertainty in a quantitative
way. Logic methods are also able to cope with uncertainty, but are unable
to deal with quantitative levels of uncertainty (e.g., probabilities).
The current trend in the AI field is closer to this latter approach (4) than
to any other. Probabilistic methods have been gaining ground, as they combine,
on the one hand, the solid theoretical framework of probability theory and
statistics, and, on the other, the ability to quantify levels of uncertainty
and propagate them through the computations.
1.2 Foundations
Although the field is relatively new, many earlier fields are considered to
have contributed to the foundations of AI. Here is a brief description of
the major contributions of most of them. Readers are invited to look up
further information concerning the topics mentioned below (in, for instance,
Wikipedia: http://www.wikipedia.org, retrieved 27-Aug-2009).
1. Philosophy (from 428 B.C.) — formal rules to obtain valid conclusions
(logic), the issue of mind vs. body (Plato, Descartes), the origin
of knowledge and the role of perception in its acquisition, and how
knowledge maps to actions;
2. Mathematics (from 800 A.D.) — how formal rules can be used to derive
irrefutable conclusions, computability (Turing, what class of functions
are realizable in a computer), modeling uncertain information (statistics);
3. Economics (from 1776) — decisions maximizing payoff (Herbert Simon), theories of utility, decision, and games (von Neumann), optimizing vs. satisficing;
4. Neuroscience (from 1861) — how the brain works, levels of organization
(from molecules, to synapses, to neurons, to maps, and to systems),
localization of function (Broca);
5. Psychology (from 1879) — how humans think and act, behaviorism
(Skinner) vs. cognition, psychoanalysis (Freud);
6. Computer engineering (from 1940) — building computers (ENIAC), programming languages;
7. Cybernetics (from 1948) — the concept of autonomy, control theory
(Wiener), information theory (Shannon);
8. Linguistics (from 1957) — language and thought, syntax and grammar
(Chomsky).
The first publication commonly accepted as a precursor of AI
was written by Warren McCulloch and Walter Pitts in 1943, proposing a
model of artificial neurons. Each one of these neurons can assume one of two
states ("on" and "off"), receiving signals from other neurons. They showed
that this model could implement logical connectives (AND, OR, NOT, etc.),
and even perform any computable function. In the late 1940's, Alan Turing
wrote an influential paper [13], claiming the possibility of machines exhibiting
intelligent behavior and presenting several arguments sustaining his claim.
The name of the field was, however, only coined after the Dartmouth
summer research project [9] in 1956. Since then, the field has evolved over
several stages, from the early enthusiasm of naive systems and toy problems,
to a mature state where strong theoretical results can be found, as well as
real-world systems actively used in industry. AI now counts on a
large community of researchers, with many top-level conferences being held
periodically (AAAI, IJCAI, ECAI, AAMAS, to name a few). It has remained
a quite interdisciplinary subject, with strong connections to many fields:
mathematics (logic, statistics), robotics, linguistics, neuroscience, systems
and control theory, among others.
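As a concrete illustration of the McCulloch-Pitts model mentioned above, here is a minimal sketch in Python of binary threshold units implementing logical connectives; the specific weights and thresholds are choices made for this example, not part of the original 1943 formulation:

```python
def mp_unit(inputs, weights, threshold):
    """McCulloch-Pitts style unit: outputs 1 ("on") when the weighted sum of
    its binary inputs reaches the threshold, and 0 ("off") otherwise."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

# Logical connectives as threshold units over binary inputs.
AND = lambda x, y: mp_unit([x, y], [1, 1], threshold=2)
OR  = lambda x, y: mp_unit([x, y], [1, 1], threshold=1)
NOT = lambda x:    mp_unit([x], [-1], threshold=0)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, "AND:", AND(x, y), "OR:", OR(x, y))
print("NOT 0:", NOT(0), "NOT 1:", NOT(1))
```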
1.3 State of the art
In this section, several state-of-the-art AI application domains are briefly
described. Once again, readers are invited to look for further information about
each one of the examples given here.
• The NASA remote agent is an autonomous program for the remote
management of a spacecraft. It includes tasks such as planning and
fault diagnosis. In particular, it was tested on NASA's Deep Space 1
spacecraft [3].
• In 1997 the computer Deep Blue defeated the chess world champion
Garry Kasparov, with broad media coverage. Insofar as chess has been considered a benchmark of intelligence, this was a considerable feat, as it
was the first time the best human at the specific task of chess
was beaten by a machine. However, it should be noted that not only is
Deep Blue's hardware optimized specifically for chess (despite IBM's
marketing claims to the contrary), but also the algorithms employed
are quite specific to this kind of game [5].
• Stanford's Stanley is an SUV that drives autonomously. It won the
2005 DARPA Grand Challenge, which consisted of driving a 212 km off-road
track autonomously and unmanned. Stanley took first place, out of a
total of 23 cars, taking 6h54m in total. Only five cars made the finish
line. The competition track was only known a few hours before the race,
in the form of a sequence of GPS waypoints. This was a milestone in
terms of having an autonomous vehicle operating for such a long period
of time, with so many sources of uncertainty, and also a victory of the
probabilistic methods in robotics [12].
• CMU's Tartan Racing won the 2007 DARPA Urban Challenge, this time consisting of an urban scenario, where the autonomous cars had to
comply with the usual traffic rules, interacting with other cars driven
by humans [2].
• Medical diagnosis systems, which not only provide a diagnosis given a
set of symptoms, but also an explanation of the line of reasoning that led
to the conclusion.
• Logistics planning for complex military campaigns, where the challenge lies in representing the various constraints involved and finding a
solution that satisfies the goal and the constraints while minimizing
some performance criterion weighting factors such as time and cost.
The DART system (Dynamic Analysis and Replanning Tool) was used
in the American Desert Shield/Storm operations in Kuwait [6].
• The HipNav (Hip Navigation System) is a surgery assistance system to
provide the medical staff with the optimal, patient-specific positioning
for hip implants [7].
• RoboCup is an annual international robotic competition event, which
initially included only robotic soccer, but now includes competitions for search and rescue robots, service robots at home, among
others. Robot soccer is a huge scientific challenge, as it poses
a common benchmark in a shared environment, where the robots
have to solve a wide range of problems, from ball tracking up to team
strategy, all integrated in a team of autonomous robots [8].
1.4 Intelligent agents

1.4.1 Definition
The concept of an agent is a central one in this course. An agent is an entity
endowed with sensors, by means of which it receives information from the
environment (percepts), and which produces actions via its actuators (figure 1.1).
This concept applies both to living beings and to artifacts, although
we will be concerned only with machines. An agent can be, for instance, a physical
machine, such as a robot, with sensors that measure distances to obstacles
and actuators that make it move around. It can also be a software agent
that crawls over the Internet gathering data for a search engine (in this case
the sensors are the network connections that download web pages). The
environment consists of whatever is external to the agent and bears some
relationship with it, including other agents. The environment of a mobile
robot is a physical one; the environment of a web crawler is the Internet;
for a robotic soccer game the environment is the field and the other robots,
including teammates and opponents.
The concept of agent provides encapsulation, with respect to the surrounding environment, of an entity that autonomously processes percepts
and acts on the environment. One can compare this concept with the idea of
object in software engineering. It is important to stress that an agent is not
a classification criterion, which could be used to classify entities, but rather
a design framework. The concept of agent is used to help us design AI programs. Moreover, this concept not only applies to a single entity, but also to
systems comprising several agents — multi-agent systems (MAS) — where
multiple agents communicate among themselves to fulfil some task. A robotic soccer
team is a paradigmatic example of a MAS. Much research work, many design
architectures, and many software tools have been developed to support the development of
these systems.

Figure 1.1: Diagram of an intelligent agent, bearing a relation to its surrounding environment: sensors receive percepts from the environment, actuators produce actions on it, and a "?" block stands for the agent program to be designed.
At this point one should distinguish between the agent function, which is a
mathematical description of how an agent maps its percepts into actions,
i.e., a mathematical function from the set of all possible percept sequences
to the set of all possible actions, and the agent program, which is a computer
implementation of that map. The agent function completely specifies the
agent behavior, as it defines the action to be performed in any possible situation. It is a formal description of the agent behavior. From an engineering
point of view, we will be focused on making agent programs. The question
we will address during this course is how we can design such agent programs
so that the resulting implementation fulfills the design goals.
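The distinction can be made concrete in code. The sketch below anticipates the vacuum cleaner example of the next section; the percept and action names are assumptions made for illustration. The agent function is written as a map over whole percept sequences, while the agent program implements the same map incrementally, one percept at a time:

```python
# Agent function: a mathematical map from percept sequences to actions,
# written here explicitly over the whole history (only feasible for illustration).
def agent_function(percept_sequence):
    room, status = percept_sequence[-1]
    if status == "Dirty":
        return "Suck"
    return "Right" if room == "A" else "Left"

# Agent program: an incremental implementation of that map, receiving one
# percept per call and keeping whatever internal state it needs between calls.
class AgentProgram:
    def __init__(self):
        self.history = []              # internal state (the percept sequence so far)

    def __call__(self, percept):
        self.history.append(percept)
        return agent_function(self.history)

program = AgentProgram()
print(program(("A", "Dirty")))         # Suck
print(program(("A", "Clean")))         # Right
```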
1.4.2 Example
Let us consider here a very simple example of an autonomous agent: a vacuum cleaner robot. This robot operates in a simple two room environment,
being able to move from one room to the other (figure 1.2). Consider that
the agent sensors comprise one location sensor, determining in which room
the robot is, and a dirt sensor, detecting whether there is dirt in that room.
Then, the percepts can have the form of a pair ⟨r, s⟩, where r ∈ {A, B} is the
robot location and s ∈ {Clean, Dirty} is the room status (as perceived by
the dirt sensor). The actions a considered here are the movement actions,
suck, and shutdown: a ∈ {Left, Right, Suck, Shutdown}.

Figure 1.2: Vacuum cleaner robot example: two rooms, A and B, each possibly containing dirt, with the agent located in one of them.
The agent program, implementing the agent function, can be something
as simple as a lookup table between percepts and actions, e.g.:
⟨A, Clean⟩ −→ Right
⟨A, Dirty⟩ −→ Suck
⟨B, Clean⟩ −→ Left
⟨B, Dirty⟩ −→ Suck          (1.1)
Assuming that the environment evolves as expected (the dirt in one room
disappears as soon as the robot Sucks it), it is not hard to see that
such a program will lead to an infinite loop of moving from room A to B
and back. This program is able to clean both rooms automatically, but it
would probably be preferable to Shutdown after getting both rooms cleaned.
However, this requires a bit more complexity than the lookup table given
above (more on this issue below).
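A minimal sketch of this lookup-table agent program in Python, together with a crude simulation of the "evolves as expected" assumption; it makes the infinite loop visible:

```python
# Lookup table (1.1): percept -> action.
TABLE = {
    ("A", "Clean"): "Right",
    ("A", "Dirty"): "Suck",
    ("B", "Clean"): "Left",
    ("B", "Dirty"): "Suck",
}

def table_driven_agent(percept):
    return TABLE[percept]

# Tiny environment simulation: Suck removes the dirt, Left/Right move the robot.
status = {"A": "Dirty", "B": "Dirty"}
room = "A"
for step in range(8):
    action = table_driven_agent((room, status[room]))
    print(step, room, status[room], "->", action)
    if action == "Suck":
        status[room] = "Clean"
    elif action == "Right":
        room = "B"
    elif action == "Left":
        room = "A"
# Once both rooms are clean, the agent keeps shuttling between A and B forever.
```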
1.4.3 Properties
This last observation in the previous section brings about the concept of
a rational agent, which, in a first attempt, we will define as an agent that
"does the right thing." In particular we would like to be able to measure
how well it does "the right thing"; we will call this measure performance.
The higher this measure, the better the agent performs. Note that there is no
universal and unique way of defining a performance measure. Different agents
will have different performance measures, not only because each task we want
the agent to perform will call for a specific performance measure, but also
because, even for the same task, different designers might assess the agent's
performance differently. For instance, when there is a trade-off between two
(or more) variables, one could weight each one differently. In the commercial
aeronautical industry, for instance, different airlines might trade off flight
duration and fuel consumption differently. The automatic flight management systems
of aircraft are configured with that trade-off variable (the cost index), which has
implications for the rate of climb and descent of the airplane.
In the case of the vacuum cleaner, a possible performance measure might
be D − λN , where D is the amount of dirt sucked, and N the total number
of operations, measured at shutdown time, or after a certain time span. The
λ parameter allows the designer to tune the trade-off between amount of dirt
sucked and number of operations.
The performance of an agent is conditioned by: (1) the percept information, meaning that we cannot always assume that the agent has access to all
information concerning the environment, and (2) the built-in knowledge of
the environment. Our vacuum cleaner robot's dirt sensor only senses dirt in
the room the robot is in, meaning that it has to visit all rooms before being sure
everything is cleaned up, and its navigation is conditioned by the knowledge
it has of the environment. In figure 1.2 only two rooms were depicted, but
we can imagine larger environments with more complex topologies.
Now we are ready to formally define a rational agent, with respect to
a given performance measure, as one that maximizes that performance measure,
given the percept information and the built-in knowledge of the environment.
When an agent has full information concerning the environment state, it
is called omniscient. As this is usually not the case, the concept of rationality
has to be framed with respect to the information the agent has access to. In
this latter case, it is often crucial for the agent to actively gather information,
in the sense of performing actions with the goal of obtaining more information. The issue of trading off actions that gather information against actions
that directly improve the performance measure is called the exploration vs.
exploitation problem. Exploring the environment to gather information
concerns the area of Learning, which will be discussed in a later chapter.
Another trade-off we encounter in this context is between relying on one’s
percepts or on a priori knowledge. For instance, a robot can navigate using
a map, but if it is insensitive to the presence of obstacles (unaccounted for
in the map), it might collide with them. Furthermore, a map may become
outdated, if for instance the environment changes. An agent that relies on
its own percepts rather than only on prior knowledge is called autonomous.
1.4.4 Nature of environments
A world (also known as domain) is composed of:
1. a performance measure,
2. an environment,
3. a set of actuators, and
4. a set of sensors.
This defines both the environment an agent interacts with and its interface
with it. Given a world, the design goal consists in constructing a rational
agent, in the sense of one that acts so as to maximize the performance measure,
as defined in the previous section.
Some examples of worlds (or domains) are:
Taxi driver: the sensors and actuators are the ones a human uses for driving, the environment includes the taxi and the city, and the performance measure comprises a combination of carrying clients to the desired place, time/cost of travel, comfort, global satisfaction of the client,
etc.

Search and rescue (SAR) autonomous helicopter: the sensors include
cameras and GPS, the actuators include the propeller and any deployable first-aid
kits, the environment is the operational area, including the victims and
the SAR personnel, and the performance measure has to take into
account not only the number of potential victims found, but also the
time for the aid to reach them (a critical variable in SAR operations).

Soccer humanoid robot: the sensors include a camera to locate the ball
and the teammates, the actuators are the mechanical limbs, the environment is the field and the other robots (both teammates and opponents), and the performance measure is simply the number of goals
scored.

Web crawler: the sensors allow it to download web pages, the actuators allow it to move from one page to another by establishing new network
connections, the environment is the Internet, and the performance measure includes the number of web pages processed and/or the amount
of relevant information gathered.
Worlds can be classified according to the set of properties below:
• fully observable vs. partially observable, meaning whether the agent
has full access to the state of the environment (e.g., chess), or a partial
view of it (e.g., vacuum cleaner robot);
• deterministic vs. stochastic, if the consequences of the agent actions
are predictable, or if there is an element of chance involved (e.g., sensor
noise, actions that may fail);
• episodic vs. sequential, whether the interaction with the environment
follows along a sequence of independent episodes, or if future interactions depend on what happened in past ones;
• discrete vs. continuous, whether the domain of all percepts, all actions, and time, is discrete or continuous (any of the 8 combinations is
possible);
• single- vs. multi-agent, whether the agent is alone in the environment,
or there is a team of interacting agents (MAS, Multi-Agent System).
In MAS one can also distinguish between cooperative agents and competitive agents (e.g., in robotic soccer, teammates are cooperative, and
opponents are competitive, that is, if everything works as desired).
1.4.5 Structure of agents
Having discussed agents and environments, the big question now is how to
design the question mark block depicted in figure 1.1, the core of the agent
itself. To do that, we will start by analysing an initial, very simple architecture, find its limitations, and then design a new, more complex one
addressing those limitations. We will repeat this process over and over again.
Simple reflex agent
Let us first consider a simple lookup table, like the one referred to in section 1.4.2. Given a percept, the agent searches the lookup table for an exact
match, and immediately outputs the corresponding action (like a reflex). Instead of having a table entry for each possible percept, we can introduce
patterns; for instance, whenever the robot finds dirt, perform a Suck action.
This forms a so-called IF-THEN rule (also known as a production rule),
usually of the form:

IF ⟨condition⟩ THEN ⟨action⟩          (1.2)

A production system consists of a set of such rules, which, given a percept,
looks for a rule matching the ⟨condition⟩, and then produces the corresponding ⟨action⟩ (if such a rule is found). For instance, the rules for
the vacuum cleaner domain could be
IF ⟨A, Clean⟩ THEN Right
IF ⟨B, Clean⟩ THEN Left
IF ⟨∗, Dirty⟩ THEN Suck

where the ∗ symbol matches anything (wildcard).

Figure 1.3: Architecture of a simple reflex agent: percepts from the sensors pass through an input interpretation stage and a rule matching engine fed by a database of IF-THEN rules, and the matched rule's action is sent to the actuators.
The resulting architecture is depicted in figure 1.3. Each percept is subject to an input interpretation box (for instance, consider a room dirty only
if the measurement of the dirt sensor is higher than a threshold), followed
by a rule matching engine, fed by a database of IF-THEN rules. From a design point of view, each one of these blocks should correspond to a different
software module. This way we can map the architecture design (figure 1.3) to
a computer implementation of it. Moreover, the rule matching engine should
be domain independent, meaning that the same module could be applied to
any domain, not only the one the agent was designed for. This way, changing the domain would not imply changing the engine, only the rules
database, and the engine can be reused in a different domain.
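A minimal sketch of such a production system in Python: the matching engine is domain independent, while the rule database below happens to encode the vacuum cleaner rules (the "∗" wildcard convention follows the text):

```python
WILDCARD = "*"

# Rule database for the vacuum cleaner domain: (condition, action) pairs,
# where a condition is a tuple whose elements may be wildcards.
RULES = [
    ((WILDCARD, "Dirty"), "Suck"),
    (("A", "Clean"), "Right"),
    (("B", "Clean"), "Left"),
]

def matches(condition, percept):
    """Domain-independent test: every element is equal or a wildcard."""
    return all(c == WILDCARD or c == p for c, p in zip(condition, percept))

def rule_match(percept, rules):
    """Return the action of the first rule whose condition matches the percept."""
    for condition, action in rules:
        if matches(condition, percept):
            return action
    return None

print(rule_match(("B", "Dirty"), RULES))   # Suck
print(rule_match(("A", "Clean"), RULES))   # Right
```

Changing the domain only requires swapping the RULES database; matches and rule_match stay the same, which is exactly the modularity argued for above.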
The major advantage of this architecture is its simplicity. However, it
poses several limitations. One of them is that it does not scale well to complex domains: it easily becomes cumbersome to write rules for complex
problems. Imagine, for instance, how to solve a Sudoku board using this approach. Another problem is that it underperforms in partially observable
environments. For instance, in the case of the above rules, the agent will end
up in an infinite loop of moving from one room to the other. If the agent
could somehow memorize which rooms it has already cleaned, it could then
safely shutdown. This leads us to the next agent architecture.
Figure 1.4: Architecture of a model-based agent: percepts update an internal world model, and the conditions of the IF-THEN rules are matched against the world model rather than directly against the percepts.
Model-based agent
If the environment is partially observable, appropriate decisions may require
the agent to have memory. So, let us add a memory to the previous agent
architecture, in the form of a world model. This structure models the current
state of the environment, being dynamically updated with information from
previous and current percepts. In other words, each time the agent receives a
percept, it updates its world model. This world model is then used in the rule
matching engine. Thus, the rules' conditions refer to the world state, rather
than to the percepts directly. The agent's action thus depends on the history
of previous interactions, since the world model retains information gathered from
previous percepts, which in turn may depend on past actions. Figure 1.4 shows
the resulting architecture.
The major difference with respect to the previous architecture is that this agent has
an internal state. Its actions depend not only on the immediate percept,
but also on previous interactions with the environment. This way the agent
is better able to deal not only with partially observable environments, but
also with sequential domains, where the appropriate "thing to do" depends
on the past.
For the vacuum cleaner domain, the world model could include the cleanliness state of each room, such that a production rule in the schematic form

IF ⟨all rooms clean⟩ THEN Shutdown

could be added, where ⟨all rooms clean⟩ matches only a world state having
all rooms in the clean state.
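A minimal sketch of this model-based variant in Python; the world model is just the last known status of each room, and the "Unknown" initial value is an assumption made for the example:

```python
class ModelBasedVacuumAgent:
    """Keeps a world model (last known status of each room), updated from each
    percept, and matches its rules against the model rather than the raw percept."""

    def __init__(self, rooms=("A", "B")):
        self.model = {room: "Unknown" for room in rooms}

    def __call__(self, percept):
        room, status = percept
        self.model[room] = status                  # update state from the current percept
        if status == "Dirty":
            return "Suck"                          # the next percept will report the room clean
        if all(s == "Clean" for s in self.model.values()):
            return "Shutdown"                      # IF <all rooms clean> THEN Shutdown
        return "Right" if room == "A" else "Left"

agent = ModelBasedVacuumAgent()
for percept in [("A", "Dirty"), ("A", "Clean"), ("B", "Dirty"), ("B", "Clean")]:
    print(percept, "->", agent(percept))           # ends with Shutdown
```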
One limitation of this architecture is that the agent depends on a model
of the environment. This may be a partial model of the environment (except
for trivial cases, one cannot usually model everything), and/or there may
be noise in the sensors and/or uncertainty in the environment affecting the
accuracy of the model. The world model update should be robust to this in
order to perform appropriately.
Another limitation is, once again, scaling up to more complex domains.
In particular, the designer has to write rules taking into account all possible
situations the agent might encounter. One can then argue that who is really solving
the problem is the person writing the rules, rather than the computer: the
intelligence is in the designer, not in the machine. One could stretch the
argument to the limit, and ask whether this is not always the case with
machine intelligence. But if we consider an agent that, given a goal, finds
a solution by itself, then we would certainly be closer to machine intelligence.
This leads us to the next architecture.
Goal-based agent
In this architecture the agent is given a goal, and it then searches (by itself)
for a solution satisfying it. Figure 1.5 shows its structure: the agent has
to be able to predict the consequences of its actions. In addition, we need
a world evolution model that represents how the world state changes by
virtue of the agent's actions. With this, the agent becomes able to predict
how the world state evolves after performing certain actions: "what happens
if I do this action?" Then, the agent compares the resulting final world state
with the agent's goals. If they match, a solution has been found, in the form
of a sequence of actions achieving that goal. For that, the agent is required
to consider all of its possible actions, and possibly all of its possible action
sequences (under computational space/time limitations, of course).
Once again, both the world model and the world evolution model may deviate
from reality. This may be more serious than in the previous architecture,
insofar as actions may not have the expected results (e.g., a failed action).
For instance, our vacuum cleaner robot may build a long plan of actions to
go from one room to a distant one, but if it fails to perform one turn in
the middle of the plan execution, it may end up in a completely different room.
Another issue is the representation of the goal: is it a specific world state,
or is it a condition over the world state (e.g., all rooms clean, regardless of
the robot's final position)?
If finding a solution requires considering a set of possible action sequences, this can raise computational complexity problems, because of the
combinatorial nature of the problem. The computational efficiency of planning methods is an important issue if the agent is required to deal with a
complex domain.

Figure 1.5: Architecture of a goal-based agent: a world evolution model allows the agent to predict the outcome of candidate actions ("what happens if I do this action?"), and action selection compares the predicted states with the goals.
For the vacuum cleaner example, the goal could be all rooms being clean.
With a world evolution model predicting that a room becomes clean after being
Suck'ed, such an agent would be able to figure out a sequence of actions to
reach the given goal. This works without the need for the designer to
explicitly tell the agent what to do in every possible situation.
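A minimal sketch of the "what happens if I do this action?" idea for the two-room domain, in Python: a world evolution model predicts the outcome of each action, and a brute-force enumeration of short action sequences stands in for a real planner. The state encoding and the bound on the plan length are choices made for the example:

```python
from itertools import product

ACTIONS = ["Left", "Right", "Suck"]

def evolve(state, action):
    """World evolution model: state is (robot_room, status_A, status_B)."""
    room, a, b = state
    if action == "Suck":
        return (room, "Clean", b) if room == "A" else (room, a, "Clean")
    return ("B", a, b) if action == "Right" else ("A", a, b)

def goal(state):
    return state[1] == "Clean" and state[2] == "Clean"

def find_plan(initial, max_length=4):
    """Return the first action sequence (up to max_length actions) whose
    predicted final state satisfies the goal."""
    for length in range(1, max_length + 1):
        for plan in product(ACTIONS, repeat=length):
            state = initial
            for action in plan:
                state = evolve(state, action)
            if goal(state):
                return plan
    return None

print(find_plan(("A", "Dirty", "Dirty")))   # ('Suck', 'Right', 'Suck')
```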
However, there may be several valid action sequences (plans) that attain
the same goal. How to choose among them? Is there a way of finding the
best solution? See next architecture for an answer.
Utility-based agent
When compared with the previous architecture, this one replaces the simple
matching of the resulting world state against the goals by an evaluation function. The resulting state is evaluated with respect to a given performance
measure, and the agent then selects the action sequence that maximizes this
measure. This evaluation function is often called a utility, e.g., the utility
U(a, s) of performing action a in the world state s.
Figure 1.6: Architecture of a utility-based agent: as in the goal-based agent, a world evolution model predicts the outcome of candidate actions, but action selection is now driven by a utility-based evaluation of the predicted states.
This way the agent optimizes the performance measure, thus reaching the level of a rational agent, as previously defined in section 1.4.3. Figure 1.6 depicts this
architecture.
Applying this architecture to the vacuum cleaner domain would result in
an agent that could, for instance, minimize energy while cleaning all rooms.
For an environment with only two rooms, this may not sound very exciting.
But consider having a large, multi-story building, where the robot had to
empty the stored dirt at certain places. It is not trivial to devise manually
the best plan to clean the whole building, while minimizing energy (and/or
time).
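Continuing the previous sketch (it reuses ACTIONS, evolve, and product from the goal-based example), a utility-based variant replaces the boolean goal test by a numeric evaluation; the utility below, rewarding clean rooms and charging one unit per action as a crude proxy for energy, is an invented example:

```python
def utility(state, plan):
    """Illustrative utility: +10 per clean room, -1 per action performed."""
    clean_rooms = sum(1 for s in state[1:] if s == "Clean")
    return 10 * clean_rooms - len(plan)

def best_plan(initial, max_length=4):
    """Return the action sequence whose predicted final state maximizes utility."""
    best, best_value = (), utility(initial, ())
    for length in range(1, max_length + 1):
        for plan in product(ACTIONS, repeat=length):
            state = initial
            for action in plan:
                state = evolve(state, action)
            value = utility(state, plan)
            if value > best_value:
                best, best_value = plan, value
    return best, best_value

print(best_plan(("A", "Dirty", "Dirty")))   # the shortest plan cleaning both rooms
```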
As we have evolved the architecture, we have also increased the prior knowledge
required by the agent to operate appropriately: it needs a world model, along
with the algorithms to update it after each percept; it needs a world evolution
model, sufficiently accurate and robust to any inherent uncertainty; and it
also needs a utility function. All of this knowledge has to be put into the
agent by the designer.
Figure 1.7: Architecture of a learning agent: a critic compares the agent's behavior against a performance standard and provides feedback to the learning element, which modifies the performance element (any of the previous architectures) and sets learning goals for a problem generator that proposes new problems to explore.
Next, we will consider an architecture
where part (if not all) of this knowledge is gathered by the agent itself.
Learning agent
This architecture builds upon any of the previous architectures. In particular,
consider any of the previous architectures as the performance element box
in figure 1.7. Let us analyse this architecture piece by piece.
The percepts are the inputs to the performance element, which outputs
the agent's action. In order for the agent to learn, something is needed that informs it
whether it is performing well; this is called the critic. Given
a performance standard, the critic provides a feedback signal (e.g., positive,
negative, or neutral). This signal is often designated reinforcement.
The goal of the learning process is to change the performance element,
e.g., to change the parameters of a utility function, or to change the rules of a
production system. This change is performed by the learning element, which,
given the knowledge gathered by the performance element (e.g., the world
state), performs changes on it.
Learning requires exploration, as discussed in section 1.4.3. In order
for the agent to explore (and eventually balance between exploration and
exploitation), a module called the problem generator is added. This module
receives the learning goals from the learning element, and poses new problems
for the performance element to solve. While doing so, more knowledge is
gathered, thus contributing to a more varied learning process.
Not doing so would limit the agent's performance to the situations it has
already encountered.
A learning vacuum cleaner robot could, for instance, be able to build a
map of the building it operates in. It could also learn about various aspects of the environment (e.g., the time to clean each room, the typical dirt amount),
thus allowing for a better performance level as time goes by.
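As a very small, concrete slice of this idea in the vacuum domain, the following sketch only covers the learning element: it maintains the per-room dirt statistics mentioned above, which a performance element could then use to plan its route. The critic, the problem generator, and the performance element itself are left out:

```python
class DirtStatistics:
    """Learning element sketch: a running estimate of how often each room is
    found dirty, built from the percepts the agent receives over time."""

    def __init__(self, rooms=("A", "B")):
        self.observations = {room: 0 for room in rooms}   # times each room was observed
        self.dirty = {room: 0 for room in rooms}          # times it was observed dirty

    def update(self, room, status):
        self.observations[room] += 1
        if status == "Dirty":
            self.dirty[room] += 1

    def dirt_rate(self, room):
        n = self.observations[room]
        return self.dirty[room] / n if n else 0.0

stats = DirtStatistics()
for room, status in [("A", "Dirty"), ("B", "Clean"), ("A", "Dirty"), ("B", "Dirty")]:
    stats.update(room, status)
print(stats.dirt_rate("A"), stats.dirt_rate("B"))   # 1.0 0.5
```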
Chapter 2
Problem solving
2.1 Introduction
This chapter is focused on AI methods to solve simple problems. One such
problem is the following: consider 2 jugs of water, A and B, with exactly
4 liters and 3 liters of capacity respectively; considering that there is no
other way of measuring volume, how can one obtain 2 liters in jug A? The
only available actions are filling a jug with water, throwing the water away, and
transferring water from one jug to the other. Solving the problem means finding the
sequence of actions that reaches the goal, from a given initial state (say, both
jugs empty).
In general, the formulation of a problem involves the definition of several
items: (1) the goal formulation, (2) the actions to consider, and (3) the states to consider.
The notion of state is central to this formulation. A state is here understood
as the complete representation of a particular condition of the domain at a
specific time. It has to be complete in the sense that, given a state, we are able
to predict exactly the resulting state after a given action, without ambiguity.
For instance, in the water jug problem, we will consider a state as the
amount of water in each jug. Formally, we could represent it as a pair (a, b)
where a and b are the amount of water in jugs A and B (0 ≤ a ≤ 4 and
0 ≤ b ≤ 3). Since the goal is to get 2 liters in jug A, it is represented by the
condition a = 2 (we do not care about the final amount of water in jug B).
The initial state is both jugs empty, i.e., state (0, 0). For the actions, we will
consider the following ones:
fill A: (a, b) ↦ (4, b)

fill B: (a, b) ↦ (a, 3)

A to B (transfer from A to B until B is full or A is empty): (a, b) ↦ (a − q, b′), where b′ = min(3, a + b) and q = b′ − b

B to A (transfer from B to A until A is full or B is empty): (a, b) ↦ (a′, b − q), where a′ = min(4, a + b) and q = a′ − a

empty A: (a, b) ↦ (0, b)

empty B: (a, b) ↦ (a, 0)
A solution is a sequence of actions (and resulting states) that leads from the initial state (0, 0) to a
state satisfying the goal condition a = 2, for instance:
(0, 0) --[fill B]--> (0, 3) --[B to A]--> (3, 0) --[fill B]--> (3, 3) --[B to A]--> (4, 2) --[empty A]--> (0, 2) --[B to A]--> (2, 0)          (2.1)
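This formulation translates almost directly into code; a minimal sketch of the state representation, the successor function, and the goal test in Python (the action names follow the list above):

```python
CAPACITY_A, CAPACITY_B = 4, 3

def successors(state):
    """All (action, resulting state) pairs applicable to a state (a, b)."""
    a, b = state
    to_b = min(a, CAPACITY_B - b)          # amount moved by "A to B"
    to_a = min(b, CAPACITY_A - a)          # amount moved by "B to A"
    return [
        ("fill A",  (CAPACITY_A, b)),
        ("fill B",  (a, CAPACITY_B)),
        ("A to B",  (a - to_b, b + to_b)),
        ("B to A",  (a + to_a, b - to_a)),
        ("empty A", (0, b)),
        ("empty B", (a, 0)),
    ]

def is_goal(state):
    return state[0] == 2                   # goal condition: a = 2

print(successors((0, 3)))                  # includes ('B to A', (3, 0))
```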
Formulating a problem in this way amounts to casting it as a state space
problem, in the sense of defining a space of all possible states, and finding
an action sequence leading from an initial state to a goal state. Note that
although the initial state is unique, in the sense of being initially given, more
than one state may satisfy the goal condition. Thus we usually refer to a
goal state, not necessarily a unique one.
How can a machine solve these kinds of problems? In simple words, it
amounts to performing a search in the state space for a path from the initial
state to a goal state. This path contains the desired action sequence to solve the
problem. In some problems one may be concerned only with the goal state
found (e.g., in the case of the Sudoku domain), while in others only with the
action sequence (e.g., the water jug problem).
It is important to stress two distinguishing aspects of this approach: first,
the three-step process of (1) formulating the problem, (2) searching for
a solution, and possibly (3) executing it; and second, the assumption that
the environment is static, observable, discrete, and deterministic.
2.2 Well-defined problems and solutions
We shall now formally define problems and solutions in this context. Given
a state space, a problem is defined by:
1. an initial state
2. a successor function, mapping a state s to a set of action-state pairs
(ai, si) that are applicable to s, i.e., of the form s ↦ {(a1, s1), (a2, s2), . . .}.
States s1 , s2 , . . . are called successor states, and the procedure of obtaining these states from s is called expansion of s.
3. a goal test, which given a state s, is true if and only if s is a goal state
(also known as solution state), and
4. a path cost, mapping a sequence of actions from the initial state to a
given state, to a number, for which lower values are preferable to higher
values (in the performance measure sense)
The first two items define the reachable state space, meaning all the states
that can be reached from the initial state by the recursive application of
the successor function (i.e., applying the successor function to all successor
states, over and over again). This forms a graph, which can be cyclic or
acyclic (e.g., a tree). To any reachable state s corresponds a path, consisting
of a sequence of action-state pairs ⟨(a1, s1), (a2, s2), . . . , (ak, sk)⟩, where each si
is the state resulting from applying action ai in si−1, s0 is the initial state, and
sk = s is the given state. This path represents one way of reaching state s
from the initial state. Given such a path, the path cost assesses the total
cost of reaching that state. If the final state sk of a path is a goal state, its
cost corresponds to the inverse of the performance measure, reflecting the
quality of the solution. And in this sense, the optimal solution is the path,
from the initial state to a goal state, that minimizes this cost.
One important aspect is the choice of the state representation, meaning
how we represent, in a formal way, the states of a given problem. When
choosing it, one has to address the problem of abstraction. A state is an
abstraction. In the water jug problem we were not concerned with the
shape of the jugs, nor their height from the floor, nor whether they are on a
table, etc., since all those aspects do not affect the goal of finding a solution
to the problem. An abstraction means leaving out of the representation all
irrelevant aspects, with respect to the problem at hand. The choice of what
to leave out and what to represent is crucial, in two ways: leaving out
too much may prevent us from finding a feasible solution, while keeping too
much in the representation will impact the computational complexity of
the problem.
As an illustration, consider the mutilated chessboard case, depicted in figure 2.1. This is a normal chessboard with two opposite corner squares
removed. The problem is to find out whether one can cover the whole board
with domino pieces, each one occupying exactly 2 by 1 squares. Approached by
considering the many possible ways of covering the board, this turns out to be
a very difficult problem. But consider now an alternative representation of the same problem:
Figure 2.1: Mutilated chessboard: a standard 8 × 8 board with two opposite corner squares removed.
each piece will necessarily cover one black and one white square; a chessboard contains 8 × 8 = 64 squares,
32 black and 32 white; however, mutilating the poor board leaves
32 black and only 30 white squares; therefore, one can never cover such a
board with 2 by 1 domino pieces (since at least two black squares will be left
out of the cover). In conclusion, a careful choice of the state representation,
in terms of the level of abstraction, has a crucial impact on the complexity
of the solving process. Many cases exist, and the mutilated chessboard
problem is one of them, where by changing the state representation one can
transform an intractable problem into a very easy one.
2.3 Solving problems
The approach that we will use to solve these problems is search. Search
methods consist in looking for a state in the state space that satisfies the
goal test, starting from the initial state. The idea is the following: starting
with the initial state (or node; we will use the terms state and node interchangeably),
generate all successor nodes; then choose one of these nodes and expand it,
proceeding over and over again until a node satisfying the goal condition is
encountered. Note that only the first node satisfying this condition is returned.
The application of the successor function to a node is called expansion, and
the nodes so generated are called successor nodes. Figure 2.2 illustrates this
process.
The above description, which will be formalized below as the general tree-search
algorithm, does not specify the choice of the next node to expand.
Given a partially expanded tree, several possibilities can be considered: expand
the oldest open node, the newest, the one with the lowest path cost, etc.
Figure 2.2: Search tree: from the initial node (1), the application of the successor function yields successor nodes (2-5); the search proceeds by expanding one of these nodes.
These possibilities are designated search strategies. In what follows we will always use the same general tree-search algorithm, but we will
consider different search strategies, and discuss the pros and cons of each
one.
First, it is important to define the search space, which is formed by all
nodes reachable from the initial state by the repeated application of the
successor function. Not all states of the state space might be reachable this
way, and so the search space is a subset of the state space. Another useful
definition is that of the search tree: the repeated application of the
successor function yields a tree (as the one illustrated in figure 2.2). This tree
not only represents the succession relationships between the nodes;
repeated states might also appear in it. A good search algorithm should take
repeated states into account, as otherwise portions of the search space might
be explored more than once, thus wasting time (and memory).
The general tree-search algorithm is defined by the following steps:
1. initialize tree with the initial state
2. if there are no candidates to expand, then return “failure”
3. choose node to expand according to the search strategy
4. if the chosen node is a goal state, then return it;
otherwise, expand it and add its successor nodes to the tree
5. go to step (2)
All nodes not yet expanded are called open nodes. Once a node is expanded,
it is no longer open. The search strategy then boils down to a choice
among the open nodes.
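A minimal sketch of this general algorithm in Python, parameterized by the search strategy: `choose` removes and returns one of the open nodes, and each open node carries the state together with the action path that reached it. This is only a skeleton of the steps above; the strategies discussed next plug into it:

```python
def tree_search(initial, successors, is_goal, choose):
    """General tree-search: `choose(open_nodes)` implements the search strategy
    by removing and returning the next open node to expand."""
    open_nodes = [(initial, [])]              # step 1: initialize with the initial state
    while open_nodes:                         # step 2: fail when nothing is left to expand
        state, path = choose(open_nodes)      # step 3: the strategy picks the next node
        if is_goal(state):                    # step 4: goal test on the chosen node
            return path
        for action, next_state in successors(state):
            open_nodes.append((next_state, path + [action]))
    return None                               # "failure"
```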
The choice of the search strategy can have a big impact on the results,
so we will compare the search strategies presented next according
to the following criteria:

completeness — whether the algorithm finds a solution whenever one exists in the
search space

optimality — whether the solution found is the one that minimizes the
path cost

time complexity — how long it takes to find a solution, in the worst
case, usually expressed in the number of expanded nodes (an implementation-independent measure, as the choice of, say, seconds would be highly hardware/software dependent)

space complexity — how much memory is needed, in the worst case, usually also expressed in the number of expanded nodes
Search strategies divide into two major classes: uninformed and informed. The difference lies in the fact that the latter employ information from the nodes themselves (e.g., an estimate of the path cost to the goal, as in the
case of A*), while the former are completely blind to the node representation.
2.3.1 Uninformed search strategies
Breadth-first search
The idea of this strategy is very simple: to expand the search tree in breadth,
or, in other words, level by level. First all nodes at depth 1 are expanded,
then the ones at depth 2 (successors of the ones at the previous depth), and
so on, until a solution is found. The ordering of node expansion is illustrated
in figure 2.3.
In implementation terms, it is messy to track the nodes' levels, and therefore a very elegant approach can be used: when expanding a node, all successor nodes are inserted into a FIFO (First-In-First-Out) structure, i.e., a queue
where the first items to enter are the first to come out, and the next node to be
expanded is the one that is dequeued first. In this way, the next node to be
expanded is always the oldest one not yet expanded.
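In this implementation style, a minimal breadth-first sketch uses Python's deque as the FIFO queue; it reuses the successors and is_goal functions from the water jug sketch of section 2.1, and it also adds the repeated-state check discussed earlier so that already-seen states are not re-inserted:

```python
from collections import deque

def breadth_first_search(initial, successors, is_goal):
    """Breadth-first: always dequeue (and expand) the oldest open node."""
    frontier = deque([(initial, [])])
    seen = {initial}                           # repeated-state check
    while frontier:
        state, path = frontier.popleft()       # FIFO: oldest open node first
        if is_goal(state):
            return path
        for action, next_state in successors(state):
            if next_state not in seen:
                seen.add(next_state)
                frontier.append((next_state, path + [action]))
    return None

print(breadth_first_search((0, 0), successors, is_goal))   # a shortest solution (six actions)
```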
Figure 2.3: Node expansion order in breadth-first search: (a) search tree, expanded level by level; (b) node FIFO queue.
Breadth-first search is complete (as long as the branching factor is finite), and
it will find the shallowest goal node in the tree. The shallowest solution node
is not necessarily the optimal one. If the path cost is a function of the depth
alone, then the shallowest solution is also the optimal one.
Concerning the time complexity, let us compute the number of generated nodes in the worst case: considering a branching factor b and that the
shallowest solution is at depth d, this strategy will generate b nodes at the
first level, b · b = b^2 at the second one, and so on until b^d at the level of the
shallowest solution. However, according to our general search algorithm, the
goal test is only applied to the node chosen to be expanded next. Therefore,
in the worst case, breadth-first will expand one more level beyond the solution
depth, generating b^(d+1) − b additional nodes. The time complexity corresponds
to the sum of all these terms (for a description of the O-notation, i.e. asymptotic
analysis, used to express complexity, check [4], for instance):

b + b^2 + · · · + b^d + (b^(d+1) − b) = O(b^(d+1))          (2.2)
Since all open nodes have to be stored in memory (the next to be expanded
is the oldest one), the space complexity equals the time complexity: O(b^(d+1)).
The major problem with this strategy is the exponential nature of both
time and space complexities. In practice, this algorithm breaks due to lack
of memory, before running out of time. As an example, consider the estimates in figure 2.4 for both time and memory consumption for a problem
with branching factor b = 10, running on a machine capable of generating
10000 nodes/sec, and using 1000 bytes/node. Note in this example that with
d = 8, 31 hours of computation require 1 TB of memory, which is currently
out of reach of common computers.
d     nodes     time        memory
2     1100      0.11 secs   1 MB
6     10^7      19 min      10 GB
8     10^9      31 hours    1 TB
12    10^13     35 years    10 PB

Figure 2.4: Time and memory consumption estimates of a problem with
b = 10, on an implementation generating 10000 nodes/sec and consuming
1000 bytes/node.
Uniform cost search
This strategy is a variation of breadth-first where the node choice is
determined by the node cost: the cost g(n) of a node n in the search tree is the
sum of the step costs along the tree from the initial node to n. The next node to
expand is then the open node that minimizes this cost g(n).
If all the step costs are equal, this strategy reduces to breadth-first. Unless
all step costs are strictly positive, this strategy may lead to infinite loops (e.g.,
a step cost of zero leading back to an identical state). However, if all step costs are
greater than or equal to some ε > 0, then this strategy is complete and optimal.
It can be proven that the time and space complexities of this strategy
are

O(b^(1 + ⌊C*/ε⌋))          (2.3)

where ε > 0 is a lower bound on the step costs and C* is the cost of the
optimal solution. This value is usually much larger than b^d. This strategy
tends to explore large portions of the search tree with low step costs, making
it a relatively inefficient way of finding optimal solutions.
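In the same style, a uniform cost sketch only changes the frontier: a priority queue (heapq) ordered by the accumulated cost g(n). The step_cost argument, defaulting here to 1 per action, is where problem-specific costs would go; with unit costs it behaves like breadth-first on the water jug problem:

```python
import heapq

def uniform_cost_search(initial, successors, is_goal, step_cost=lambda s, a: 1):
    """Always expand the open node with the lowest accumulated cost g(n)."""
    counter = 0                                 # tie-breaker so the heap never compares states/paths
    frontier = [(0, counter, initial, [])]
    best_g = {initial: 0}
    while frontier:
        g, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return g, path
        for action, next_state in successors(state):
            new_g = g + step_cost(state, action)
            if next_state not in best_g or new_g < best_g[next_state]:
                best_g[next_state] = new_g
                counter += 1
                heapq.heappush(frontier, (new_g, counter, next_state, path + [action]))
    return None

print(uniform_cost_search((0, 0), successors, is_goal))   # (6, [...]) on the water jug problem
```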
Depth-first search
While the breadth-first strategy searches in breadth, the depth-first one searches
in depth. This strategy proceeds in depth until finding a node without successors, and then backtracks to alternative paths to a solution. This can
be implemented very elegantly by replacing the FIFO queue with a LIFO
(Last-In-First-Out) structure, also known as a stack, where the next item to
come out is the most recently added one. The resulting node expansion ordering is illustrated in
figure 2.5.
^6 LIFO means Last-In-First-Out: the next item to come out is the most recently added one, hence its correspondence with a stack.
[Figure: (a) search tree, showing the expansion order, dead ends, and the solution node; (b) node LIFO queue]
Figure 2.5: Node expansion order in depth-first search.
One important aspect of this strategy is its memory usage. Note in figure 2.5
that when node 3 is expanded, the whole subtree starting at node 2
has already been found to be useless, and therefore there is no need to keep
those nodes in memory. In general, depth-first allows deallocating all
memory except for the nodes along the path to the most recently expanded node,
together with their immediate successors. Therefore, the space complexity is
linear, rather than exponential as in the previous strategies:

b · m + 1 = O(bm)          (2.4)

where m is the maximum depth of the search tree. This corresponds to
storing the b successors of each of the m nodes along the deepest path, plus the
initial node. The time complexity remains, however, exponential: O(b^m).
As an illustration of the tremendous memory savings, consider the previous
example of a problem with b = 10, 10000 nodes/sec, and 1000 bytes/node: if
the solution and maximum depths were d = m = 12, whereas breadth-first
requires 10 PB of storage, depth-first only requires 118 KB. The price to
pay is that depth-first is neither complete (if m is not finite) nor optimal.
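A minimal Python sketch of the idea, under the same hypothetical successor-function interface as before; the only change with respect to breadth-first is that the frontier is now a LIFO stack (here, a plain Python list):

def depth_first_search(initial, goal_test, successors):
    """Iterative depth-first search; the frontier is a LIFO stack."""
    frontier = [[initial]]                 # stack of paths
    while frontier:
        path = frontier.pop()              # LIFO: deepest open node is expanded next
        state = path[-1]
        if goal_test(state):
            return path
        for s in successors(state):
            if s not in path:              # avoid cycles along the current path
                frontier.append(path + [s])
    return None

# Hypothetical graph; note that the returned path need not be the shallowest one.
graph = {'A': ['B', 'C'], 'B': ['D', 'G'], 'C': ['G'], 'D': [], 'G': []}
print(depth_first_search('A', lambda s: s == 'G', lambda s: graph[s]))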
Backtrack search
Backtrack search results from two optimizations of the depth-first strategy
that, in the cases where they can be applied, provide further memory savings.
• If, for a given node, we can generate one of its successors at a time,
we only need to store one successor for each expanded node. Thus, the
space complexity reduces to O(m).
• If, moreover, we can implement node expansion by modifying the node
(rather than allocating a new memory structure for the successor node),
and backtrack by undoing that modification, we only need memory for
a single node.
This is, for instance, the strategy of choice for solving constraint satisfaction
problems.
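As an illustration only, the sketch below applies backtrack search to the classical N-queens constraint satisfaction problem: a single board (one Python list) is modified when descending and the modification is undone when backtracking, so memory usage does not depend on the branching factor. The problem, the function names, and the representation are assumptions made for this example.

def solve_n_queens(n):
    """Return one placement of n queens (the column of the queen in each row), or None."""
    placement = []                                   # the single, shared node

    def safe(col):
        row = len(placement)
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(placement))

    def backtrack():
        if len(placement) == n:                      # all rows filled: solution found
            return True
        for col in range(n):
            if safe(col):
                placement.append(col)                # expand by modifying the node...
                if backtrack():
                    return True
                placement.pop()                      # ...and undo it when backtracking
        return False

    return placement if backtrack() else None

print(solve_n_queens(8))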
Depth-limited search
One way of addressing the lack of completeness of depth-first search is to
limit, a priori, the depth of the search tree. In this way, the search tree will
not be deeper than that limit. The major problem with this approach is
that, whenever the shallowest solution is deeper than the specified limit, no
solution will be found. If we designate the limit by l, a solution is found only
if d ≤ l. Because of this limitation, this strategy is also neither complete
nor optimal. If, however, the depth of the solution is known, we can
safely set l to that value.
The time complexity of this strategy is O(b^l), while its space complexity is O(bl).
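A recursive sketch of depth-limited search, again under the hypothetical successor-function interface used in the previous examples:

def depth_limited_search(state, goal_test, successors, limit, path=None):
    """Depth-first search that never goes deeper than `limit` levels below the root."""
    path = [state] if path is None else path + [state]
    if goal_test(state):
        return path
    if limit == 0:                         # depth limit reached: give up on this branch
        return None
    for s in successors(state):
        result = depth_limited_search(s, goal_test, successors, limit - 1, path)
        if result is not None:
            return result
    return None

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['G'], 'D': [], 'G': []}
print(depth_limited_search('A', lambda s: s == 'G', lambda s: graph[s], limit=2))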
Iterative deepening depth-first search
In the cases where the depth of the solution is unknown, we can experiment
with depth-limited search for different values of l until a solution is found.
In particular, we can try increasingly higher values of l, starting at
l = 1, until a solution is found. This is called iterative deepening depth-first search.
Assuming that the depth of the solution is d, this strategy will run depth-limited
search iteratively for l = 1, . . . , d. What might seem a waste of time,
since in each iteration the search tree has to be generated from scratch, is
actually not that serious, since in general the bulk of the search time is spent
at the deepest level of the tree (because of the exponential growth of search
trees).
The space complexity is O(bd), and the time complexity is

(d) b + (d − 1) b^2 + · · · + (1) b^d = O(b^d)          (2.5)

since the nodes at depth 1 are generated in every one of the d iterations
(l = 1, . . . , d), those at depth 2 in d − 1 iterations, and so on, down to the
b^d nodes at depth d, which are generated only in the last iteration (l = d).
It is complete, since l will be incremented iteratively until l = d, and it is
optimal if the path cost is a non-decreasing function of the depth (in which
case the shallowest solution, which is the one returned, is also an optimal one).
The iterative deepening depth-first is often the preferred strategy among
all uninformed search strategies, since it automatically combines features
from breadth-first (for l < d the tree is searched in breadth, but without
the burden of holding an exponential number of nodes in memory) and from
depth-first (for each l the search effectively proceeds in depth-first fashion,
thus requiring very little memory).
One variation of this strategy is called iterative lengthening depth-first
search, which consists of performing a depth-first search limited by path cost
rather than depth, and then iteratively increasing that limit until a solution
is found.
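Iterative deepening then amounts to an outer loop around depth-limited search; the sketch below repeats a compact version of that procedure so that the example stands alone (the graph and the names are, once more, illustrative, and the max_limit cap is only a safeguard for this sketch):

from itertools import count

def iterative_deepening_search(initial, goal_test, successors, max_limit=50):
    """Run depth-limited search with l = 1, 2, ... until a solution is found."""
    def dls(state, limit, path):
        if goal_test(state):
            return path
        if limit == 0:
            return None
        for s in successors(state):
            result = dls(s, limit - 1, path + [s])
            if result is not None:
                return result
        return None

    for l in count(1):                       # increasing depth limits
        if l > max_limit:                    # cap to avoid looping forever in this sketch
            return None
        result = dls(initial, l, [initial])
        if result is not None:
            return result

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['G'], 'D': [], 'G': []}
print(iterative_deepening_search('A', lambda s: s == 'G', lambda s: graph[s]))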
Bidirectional search
In bidirectional search, nodes not only expand to successors but also expand
backwards to predecessors. First, the initial node expands to its
successor nodes; then, the goal node expands backwards to its predecessor
nodes. This backward expansion means that, from a given node, a set of
predecessor nodes is generated such that the given node is a successor of each
one of them. The process iterates, in both directions, until a match is found
between a node generated forward and a node generated backwards. When
this happens, we can trace the solution path from the initial node to the
goal.
Since node expansions happen in both directions, from the initial node as
well as from the goal node, this amounts to performing two searches, each to
half of the depth, because the match between the two search directions occurs
roughly at the middle. So, the time complexity is

b^{d/2} + b^{d/2} = O(b^{d/2})          (2.6)
and since b^{d/2} ≪ b^d, it is dramatically faster. Since all expanded nodes have
to be stored in memory, the space complexity is also O(b^{d/2}). This strategy
is complete and optimal (as long as all step costs are equal), provided that
breadth-first is employed in each direction.
This strategy may seem at first sight clearly superior, at least as far as time
complexity is concerned, but “there are no free lunches”:
1. the goal state has to be specified beforehand, meaning that the strategy
cannot (in principle) be used when the goal state is unknown;
2. not only a successor function, but also a predecessor function has to
be defined;
3. the predecessor function has to be complete, in the sense that all
possible predecessor nodes have to be generated, since otherwise
completeness may be compromised.
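With these caveats in mind, the following is a rough sketch of the strategy, assuming an explicit graph so that both a successor and a predecessor function are available; the particular bookkeeping (dictionaries mapping states to partial paths) and all names are choices made for this example only.

from collections import deque

def bidirectional_search(initial, goal, successors, predecessors):
    """Grow two breadth-first frontiers (forward from `initial`, backward from `goal`)
    until a state appears in both; return the concatenated solution path."""
    if initial == goal:
        return [initial]
    fwd = {initial: [initial]}               # state -> path from initial to state
    bwd = {goal: [goal]}                     # state -> path from state to goal
    fwd_frontier, bwd_frontier = deque([initial]), deque([goal])
    while fwd_frontier and bwd_frontier:
        # expand the smaller frontier first (a common refinement, not essential)
        if len(fwd_frontier) <= len(bwd_frontier):
            frontier, seen, other, step, forward = fwd_frontier, fwd, bwd, successors, True
        else:
            frontier, seen, other, step, forward = bwd_frontier, bwd, fwd, predecessors, False
        state = frontier.popleft()
        for s in step(state):
            if s in seen:
                continue
            seen[s] = seen[state] + [s] if forward else [s] + seen[state]
            if s in other:                   # the two searches meet at s
                return (seen[s] + other[s][1:]) if forward else (other[s] + seen[s][1:])
            frontier.append(s)
    return None

# Hypothetical undirected graph, so the same adjacency serves both directions.
graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C', 'G'], 'G': ['D']}
step = lambda s: graph[s]
print(bidirectional_search('A', 'G', step, step))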
2.3.2 Informed search strategies
The following search strategies differentiate themselves from the uninformed
ones by using domain-specific information concerning the quality of
the open nodes. In particular, this information boils down to an evaluation
function, denoted f (n), which for a given node n yields a number that is
lower the closer the total path cost of a solution passing through n (and its
ancestors) is to the cost of the optimal solution. In other words, the nodes
along the path from the initial node to a goal are the ones with the lowest
values of f (n). Thus, it is natural that the best strategy consists of choosing,
from the open nodes, the one minimizing f (n) as the node to be expanded
next. The choice of the node with the lowest f (n) is called the best-first
strategy. The following methods vary exclusively in the way f (n) is computed.
A very important concept in the context of informed search strategies is
the heuristic function, denoted h(n). Given any node n, this function returns
an estimate of the path cost from n to a goal node. This value concerns only
the cost from n to the goal, and thus the usage of this function assumes that
the path costs are additive, i.e., the total path cost of a solution can be
written as the sum of the cost g(n) from the initial node to n and the cost from
n to a goal node. The heuristic function aims at estimating this latter term
of the sum, for a given n.
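Anticipating the following subsections, a generic best-first search can be sketched as below, assuming that the successor function returns (state, step cost) pairs and that f (n) is supplied as a function of the state and of g(n); with f (n) = g(n) + h(n) the search behaves like A*, while f (n) = h(n) gives greedy best-first search. The graph and the heuristic values are made up for the example.

import heapq

def best_first_search(initial, goal_test, successors, f):
    """Generic best-first search: always expand the open node with the lowest f(n).
    A node is represented by a (state, path, g) triple; f receives the state and g."""
    frontier = [(f(initial, 0), 0, initial, [initial])]   # priority queue ordered by f(n)
    best_g = {initial: 0}
    while frontier:
        _, g, state, path = heapq.heappop(frontier)
        if goal_test(state):
            return g, path
        for nxt, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float('inf')):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (f(nxt, new_g), new_g, nxt, path + [nxt]))
    return None

# Hypothetical example: f(n) = g(n) + h(n) (A*-like) with made-up heuristic estimates.
graph = {'A': [('B', 1), ('C', 3)], 'B': [('G', 5)], 'C': [('G', 1)], 'G': []}
h = {'A': 3, 'B': 4, 'C': 1, 'G': 0}
print(best_first_search('A', lambda s: s == 'G',
                        lambda s: graph[s], lambda s, g: g + h[s]))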
Greedy best-first
A* (A-star)
Chapter 3
Knowledge and reasoning
Chapter 4
Planning
Chapter 5
Uncertain knowledge and reasoning
Chapter 6
Learning
Bibliography
[1] James F. Allen. AI growing up. AI magazine, 19(4):13–23, Winter 1998.
[2] C.R. Baker and J.M. Dolan. Traffic interaction in the urban challenge:
Putting Boss on its best behavior. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2008), pages 1752–1758,
2008.
[3] D.E. Bernard, G.A. Dorais, C. Fry, Jr. Gamble, E.B., B. Kanefsky,
J. Kurien, W. Millar, N. Muscettola, P.P. Nayak, B. Pell, K. Rajan,
N. Rouquette, B. Smith, and B.C. Williams. Design of the remote
agent experiment for spacecraft autonomy. In Proceedings of the IEEE
Aerospace Conference, volume 2, pages 259–281, March 1998.
[4] Gilles Brassard and Paul Bratley. Algorithmics: theory and practice.
Prentice-Hall, 1988.
[5] Murray Campbell, A. Joseph Hoane Jr., and Feng Hsiung Hsu. Deep
Blue. Artificial Intelligence, 134:57–83, 2002.
[6] Sara Reese Hedberg. DART: Revolutionizing logistics planning. IEEE
Intelligent Systems, 17(3):81–83, May/June 2002.
[7] Anthony M. Digioia III, Branislav Jaramaz, Constantinos Nikou,
Richard S. Labarca, James E. Moody, and Bruce D. Colgan. Surgical
navigation for total hip replacement with the use of hipnav. Operative
Techniques in Orthopaedics, 10(1):3–8, 2000.
[8] Hiroaki Kitano, Minoru Asada, Yasuo Kuniyoshi, Itsuki Noda, and Eiichi Osawa. RoboCup: The robot world cup initiative. In Proceedings of
the First International Conference on Autonomous Agents (Agents’97),
pages 340–347, New York, 1997. ACM Press.
[9] John McCarthy, Marvin L. Minsky, Nathaniel Rochester, and Claude E.
Shannon. A proposal for the Dartmouth summer research project on
artificial intelligence. AI Magazine, 27(4):12–14, Winter 2006. (reprint).
[10] Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, second edition, 2003.
[11] Herbert A. Simon. The Sciences of the Artificial. MIT Press, third
edition, 1996.
[12] Sebastian Thrun, Mike Montemerlo, Hendrik Dahlkamp, David Stavens,
Andrei Aron, James Diebel, Philip Fong, John Gale, Morgan Halpenny,
Gabriel Hoffmann, Kenny Lau, Celia Oakley, Mark Palatucci, Vaughan
Pratt, Pascal Stang, Sven Strohband, Cedric Dupont, Lars-Erik Jendrossek, Christian Koelen, Charles Markey, Carlo Rummel, Joe van
Niekerk, Eric Jensen, Philippe Alessandrini, Gary Bradski, Bob Davies,
Scott Ettinger, Adrian Kaehler, Ara Nefian, and Pamela Mahoney. The
2005 DARPA Grand Challenge, chapter Stanley: The Robot That Won
the DARPA Grand Challenge, pages 1–43. Springer Tracts in Advanced
Robotics. Springer, 2007.
[13] Alan M. Turing. Machine Intelligence, volume 5, chapter Intelligent
Machinery, pages 3–23. American Elsevier Publishing, 1970. (reprint).
[14] John von Neumann. The Computer and the Brain. Yale Nota Bene,
second edition, 2000.