Artificial Intelligence
Academic year 2016/2017
Giorgio Fumera
http://pralab.diee.unica.it
[email protected]
Pattern Recognition and Applications Lab
Department of Electrical and Electronic Engineering
University of Cagliari
Outline
Part I: Historical notes
Part II: Solving Problems by Searching
Uninformed Search
Informed search
Part III: Knowledge-based Systems
Logical Languages
Expert systems
Part IV: The Lisp Language
Part V: Machine Learning
Decision Trees
Neural Networks
Part III
Knowledge-based Systems
Some motivating problems
Consider the following problems, and assume that your goal is to
design rational agents, as computer programs, capable of
autonomously solving them.
Some motivating problems
Automatic theorem proving
An example: write a computer program capable of proving or
refuting the following statement:
Goldbach's conjecture (1742)
For any even number p ≥ 4, there exists at least one pair of prime
numbers q and r (identical or not) such that q + r = p.
Some motivating problems
Game playing
An example: write a computer program capable of playing the wumpus
game, a text-based computer game (by G. Yob, c. 1972), used in a
modified version as an AI toy problem.
• the wumpus world: a cave made up of connected rooms, bottomless
  pits, a heap of gold, and the wumpus, a beast that eats anyone who
  enters its room
• goal: starting from room (1,1), find the gold and go back to (1,1),
  without falling into a pit or hitting the wumpus
• rooms' content is known only after entering them
• in rooms neighboring the wumpus and pits, a stench and a breeze is
  perceived, respectively
[Figure: a typical wumpus world on a 4x4 grid, with pits, gold, the
wumpus, and the stenches and breezes they produce in adjacent rooms;
the agent starts in the bottom left corner, facing right.]
Knowledge-based systems
Humans usually solve problems like the ones above by combining
knowledge and reasoning.
Knowledge-based systems aim at mechanizing these two high-level
human capabilities:
• representing knowledge about the world
• reasoning to derive new knowledge (and to guide action)
An example
Sketch of a possible reasoning process for deciding the next move
in the wumpus game, starting from the configuration shown above
(not all moves are shown).
[Figure 7.3: the first step taken by the agent in the wumpus world.
(a) The initial situation, after percept [None, None, None, None, None].
(b) After one move, with percept [None, Breeze, None, None, None].
Legend: A = Agent, B = Breeze, G = Glitter/Gold, OK = Safe square,
P = Pit, S = Stench, V = Visited, W = Wumpus.]
[Figure 7.4: two later stages in the progress of the agent. (a) After
the third move, with percept [Stench, None, None, None, None]. (b) A
subsequent stage, in which the agent has located the gold.]
Main approaches to AI system design
Declarative: explicit representation, in a knowledge base, of
• background knowledge (e.g., the rules of the wumpus game)
• knowledge about one specific problem instance (e.g., what the
  agent knows about a specific wumpus cave it is exploring)
• the agent's goal
Actions are derived by reasoning.
Procedural: the desired behavior (the actions) is encoded directly as
program code (no explicit knowledge representation and reasoning).
Architecture of knowledge-based systems
[Diagram: sensors update the knowledge base; the reasoning module
(inference engine) updates the knowledge base and derives the actions
performed by the actuators on the environment.]
Main feature: separation between knowledge representation and
reasoning:
• knowledge base: contains all the agent's knowledge about its
  environment, in declarative form
• inference engine: implements a reasoning process to derive
  new knowledge and to make decisions
Knowledge representation and reasoning
Logic is one of the main tools used in AI for
• knowledge representation: logical languages
  – propositional logic
  – predicate (first-order) logic
• reasoning: inference rules and algorithms
Some of the main contributions:
• Aristotle (4th cent. BC): the "laws of thought"
• G. Boole (1815–64): Boolean algebra (propositional logic)
• G. Frege (1848–1925): predicate logic
• K. Gödel (1906–78): incompleteness theorem
Main applications
• Automatic theorem provers
• Logic programming languages (Prolog, etc.)
• Expert systems
A short introduction to logic
• What is logic?
• Propositions, argumentations
• Logical (formal) languages
• Logical reasoning
Logic
Definition (a possible one)
Logic is the study of conditions under which an argumentation
(reasoning) is correct.
The above definition involves the following concepts:
• argumentation: a set of statements consisting of some
  premises and one conclusion. A famous example:
  All men are mortal; Socrates is a man;
  then, Socrates is mortal
• correctness: the conclusion cannot be false when all
  the premises are true
• proof: a procedure to assess correctness
Propositions
Natural language: very complex, vague, difficult to formalize.
Logic considers argumentations made up of only a subset of
statements: propositions (or declarative statements).
Definition
A proposition is a statement expressing a concept that can be
either true or false.
Example
• Socrates is a man
• Two and two makes four
• If the Earth had been flat, then Columbus would not have
  reached America
A counterexample: Read that book!
Simple and complex propositions
Definition
A proposition is:
• simple, if it does not contain simpler propositions
• complex, if it is made up of simpler propositions connected by
  logical connectives
Example
Simple propositions:
• Socrates is a man
• Two and two makes four
Complex propositions:
• A tennis match can be won or lost
• If the Earth had been flat, then Columbus would not have
  reached America
Argumentations
When can a proposition be considered true or false?
This is a philosophical question.
Logic does not address this question: it only analyzes the structure
of an argumentation.
Example
All men are mortal; Socrates is a man;
then, Socrates is mortal.
Is the structure of this argumentation correct, whatever its actual
propositions are (i.e., regardless of whether they are true or false)?
Informally, the structure of this argumentation is:
all P are Q; x is P; then x is Q.
Formal languages
Logic provides formal languages for representing (the structure
of) propositions, in the form of sentences.
A formal language is defined by a syntax and a semantics.
Definition
• syntax (grammar): rules that define which sentences are
  "well-formed"
• semantics: rules that define the "meaning" of sentences
Examples of formal languages:
• arithmetic: propositions about numbers
• programming languages: instructions to be executed by a
  computer
Natural vs logical (formal) languages
In natural languages:
• syntax is not rigorously defined
• semantics defines the "content" of a statement, i.e., "what it
  refers to in the real world"
Example (syntax)
• The book is on the table: syntactically correct statement,
  with a clear semantics
• Book the on is table the: syntactically incorrect statement; no
  meaning can be attributed to it
• Colorless green ideas sleep furiously:¹ syntactically correct,
  but what does it mean?
¹ N. Chomsky, Syntactic Structures, 1957
Natural vs logical (formal) languages
Logical languages:
• syntax: formally defined
• semantics: rules that define the truth value of each
  well-formed sentence with respect to each possible model (a
  possible "world" represented by that sentence)
Example (arithmetic)
• Syntax: x + y = 4 is a well-formed sentence, x 4y + = is not
• Model: the symbol '4' represents the natural number four, 'x'
  and 'y' any natural number, '+' the sum operator, etc.
• Semantics: x + y = 4 is true for x = 1 and y = 3, for x = 2 and
  y = 2, etc.
Logical entailment
Logical reasoning is based on the relation of logical entailment
between sentences, which defines when a sentence follows logically
from another one.
Definition
The sentence α entails the sentence β if and only if, in every
model in which α is true, β is also true. In symbols:
α ⊨ β
Example (from arithmetic)
x + y = 4 ⊨ x = 4 − y,
because in every model (i.e., for any assignment of numbers to x
and y) in which x + y = 4 is true, x = 4 − y is also true.
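As a quick illustration (added here, not part of the original slides),
this entailment can be checked by brute force in Python; restricting
the check to a finite sample of integer models is an assumption of the
sketch, since the models of arithmetic are infinitely many:

# Check that x = 4 - y holds in every sampled model in which x + y = 4 holds.
entailed = all(
    x == 4 - y                        # the conclusion...
    for x in range(-10, 11)
    for y in range(-10, 11)
    if x + y == 4                     # ...in every model where the premise is true
)
print(entailed)  # True: no counterexample among the sampled models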
Logical inference
Definition
• logical inference: the process of deriving conclusions from
  premises
• inference algorithm: a procedure that derives sentences
  (conclusions) from other sentences (premises), in a given
  formal language
Formally, the fact that an inference algorithm A derives a sentence
α from a set of sentences ("knowledge base") KB is written as:
KB ⊢_A α
Properties of inference algorithms
Definition
• soundness (truth-preservation): an inference algorithm is sound if
  it derives only sentences entailed by the premises, i.e.:
  if KB ⊢_A α, then KB ⊨ α
• completeness: an inference algorithm is complete if it derives all
  the sentences entailed by the premises, i.e.:
  if KB ⊨ α, then KB ⊢_A α
A sound algorithm derives conclusions that are guaranteed to be
true in any world in which the premises are true.
Properties of inference algorithms
Inference algorithms operate only at the syntactic level:
• sentences are physical configurations of an agent (e.g., bits in
  registers)
• inference algorithms construct new physical configurations
  from old ones
• logical reasoning should ensure that new configurations
  represent aspects of the world that actually follow from the
  ones represented by old configurations
[Figure 7.6: sentences stand to one another in the "entails" relation,
while the aspects of the real world they represent stand in the
"follows" relation; semantics connects the two levels.]
Applications of inference algorithms
In AI, inference is used to answer two main kinds of questions:
• does a given conclusion α logically follow from the agent's
  knowledge KB? (i.e., KB ⊨ α?)
• what are all the conclusions that logically follow from the
  agent's knowledge? (i.e., find all α such that KB ⊨ α)
Example (the wumpus world)
[Figure: the situation after one move, with the agent in (2,1)
perceiving a breeze.]
• does a breeze in room (2,1) entail the presence of a pit in
  room (2,2)?
• what conclusions can be derived about the presence of pits and
  of the wumpus in each room, from the current knowledge?
Inference algorithms: model checking
The definition of entailment can be directly applied to construct a
simple inference algorithm:
Definition
Model checking: given a set of premises KB and a sentence α,
enumerate all possible models and check whether α is true in
every model in which KB is true.
Example (arithmetic)
• KB: {x + y = 4}
• α: y = 4 − x
Is the inference {x + y = 4} ⊢ y = 4 − x correct?
Model checking: enumerate all possible pairs of numbers x, y, and
check whether y = 4 − x is true whenever x + y = 4 is.
The issue of grounding
A knowledge base KB (a set of sentences considered true) is just
"syntax" (a physical configuration of the agent):
• what is the connection between a KB and the real world?
• how does one know that KB is true in the real world?
This is the same philosophical question met before. For humans:
• a set of beliefs (statements considered true) is a
  physical configuration of our brain
• how do we know that our beliefs are true in the real world?
A simple answer can be given for agents (e.g., computer programs
or robots): the connection is created by
• sensors, e.g., perceiving a breeze in the wumpus world
• learning, e.g., learning that when a breeze is perceived, there is
  a pit in some adjacent room
Of course, perception and learning are fallible.
Architecture of knowledge-based systems revisited
[Diagram: as before, sensors update the knowledge base; the reasoning
module (inference engine) updates the knowledge base and derives the
actions performed by the actuators on the environment.]
If logical languages are used:
• knowledge base: a set of sentences in a given logical language
• inference engine: an inference algorithm for the same logical
  language
Logical languages
Propositional logic
• the simplest logical language
• an extension of Boolean algebra (G. Boole, 1815–64)
Predicate (or first-order) logic
• more expressive and concise than propositional logic
• seminal work: G. Frege (1848–1925)
Propositional logic: syntax
• Atomic sentences:
  – either a propositional symbol that denotes a given proposition
    (usually written in capitals), e.g.: P, Q, ...
  – or a propositional symbol with a fixed meaning: True and False
• Complex sentences consist of atomic or (recursively) complex
  sentences connected by logical connectives (corresponding to
  natural language connectives like and, or, not, etc.)
• Logical connectives (only the commonly used ones are shown;
  different notations exist):
  ∧ (and)
  ∨ (or)
  ¬ (not)
  ⇒ (implies)
  ⇔ (if and only if / logical equivalence)
Propositional logic: syntax
A formal grammar in Backus-Naur Form (BNF):
Sentence        → AtomicSentence | ComplexSentence
AtomicSentence  → True | False | Symbol
Symbol          → P | Q | R | ...
ComplexSentence → ¬Sentence
                | ( Sentence ∧ Sentence )
                | ( Sentence ∨ Sentence )
                | ( Sentence ⇒ Sentence )
                | ( Sentence ⇔ Sentence )
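For illustration (not in the original slides), this grammar maps almost
directly onto a recursive-descent parser; the ASCII spellings ~ & | =>
<=> for the connectives and the nested-tuple output are choices made
here:

import re

# One token per match: a connective, a parenthesis, ~, or a symbol.
TOKEN = re.compile(r"\s*(<=>|=>|[()~&|]|[A-Za-z]\w*)")

def tokenize(text):
    pos, tokens = 0, []
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError("unexpected input: " + text[pos:])
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(tokens):
    # Sentence -> AtomicSentence | ComplexSentence
    token = tokens.pop(0)
    if token == "~":                  # ~Sentence
        return ("not", parse(tokens))
    if token == "(":                  # ( Sentence Connective Sentence )
        left = parse(tokens)
        op = {"&": "and", "|": "or", "=>": "implies", "<=>": "iff"}[tokens.pop(0)]
        right = parse(tokens)
        assert tokens.pop(0) == ")", "missing closing parenthesis"
        return (op, left, right)
    return token                      # a propositional symbol, True or False

print(parse(tokenize("(P & ~(Q | R))")))
# ('and', 'P', ('not', ('or', 'Q', 'R')))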
Propositional logic: semantics
Semantics of logical languages:
• "meaning" of a sentence: its truth value with respect to a
  particular model
• model: a possible assignment of truth values to all
  propositional symbols that appear in the sentence
Example
The sentence P ∧ Q ⇒ R has 2³ = 8 possible models.
One model is {P = True, Q = False, R = True}.
Note: models are abstract mathematical objects with no unique
connection to the real world (e.g., P may stand for any proposition
in natural language).
Propositional logic: semantics
• Atomic sentences:
  – True is true in every model
  – False is false in every model
  – the truth value of every propositional symbol (atomic
    sentence) must be specified in the model
• Complex sentences: their truth value is recursively defined as
  a function of the simpler sentences and of the truth tables of
  the logical connectives they contain
Truth tables of the commonly used connectives:
P     | Q     | ¬P    | P ∧ Q | P ∨ Q | P ⇒ Q | P ⇔ Q
false | false | true  | false | false | true  | true
false | true  | true  | false | true  | true  | false
true  | false | false | false | true  | false | false
true  | true  | false | true  | true  | true  | true
Example
Determining the truth value of ¬P ∧ (Q ∨ R) in all possible models:
P     | Q     | R     | Q ∨ R | ¬P ∧ (Q ∨ R)
false | false | false | false | false
false | false | true  | true  | true
false | true  | false | true  | true
false | true  | true  | true  | true
true  | false | false | false | false
true  | false | true  | true  | false
true  | true  | false | true  | false
true  | true  | true  | true  | false
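This table can be reproduced mechanically; a minimal Python sketch
(added here for illustration) enumerates the 2³ models and evaluates
the sentence in each:

from itertools import product

# Each iteration fixes a model over {P, Q, R}; the expression below
# is ¬P ∧ (Q ∨ R).
for P, Q, R in product([False, True], repeat=3):
    print(P, Q, R, (not P) and (Q or R))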
Propositional logic and natural language
The truth tables of and, or and not are intuitive, but capture only
a subset of their meaning in natural language.
Example
• He fell down and broke his leg.
  Here and includes a temporal and a causal relation (He broke
  his leg and fell down does not have the same meaning)
• A tennis match can be won or lost.
  Exclusive (disjunctive) or, usually denoted in logic by ⊕
Propositional logic and natural language
The truth table of P ⇒ Q may not fit one's intuitive
understanding of "P implies Q" or "if P then Q":
• 5 is odd implies Tokyo is the capital of Japan: meaningless
  in natural language, true in propositional logic (P ⇒ Q does
  not assume causation or relevance between P and Q)
• 5 is even implies 10 is even: false in natural language, true in
  propositional logic (P ⇒ Q is true whenever P is false)
Correct interpretation of P ⇒ Q:
If P is true, then I am claiming that Q is true; otherwise,
I am making no claim (so, I cannot make a false claim).
In other words: the only way for P ⇒ Q to be false is when P is
true and Q is false. This models the fact that P is a sufficient but
not necessary condition for Q to be true.
Exercise
1. Define a set of propositional symbols to represent the wumpus
   world: the position of the agent, wumpus, pits, etc.
2. Define the model corresponding to the configuration in the
   figure below.
3. Define the part of the agent's initial KB corresponding to its
   knowledge about the cave configuration in the figure below.
4. Write a sentence for the proposition: If the wumpus is in room
   (3,1) then there is a stench in rooms (2,1), (4,1) and (3,2).
[Figure: the typical wumpus world shown earlier, with the agent
starting in (1,1), the wumpus, the gold, and three pits.]
Solution (1/4)
A possible choice of propositional symbols:
• A1,1 ("the agent is in room (1,1)"), A1,2, ..., A4,4
• W1,1 ("the wumpus is in room (1,1)"), W1,2, ..., W4,4
• P1,1 ("there is a pit in room (1,1)"), P1,2, ..., P4,4
• G1,1 ("the gold is in room (1,1)"), G1,2, ..., G4,4
• B1,1 ("there is a breeze in room (1,1)"), B1,2, ..., B4,4
• S1,1 ("there is a stench in room (1,1)"), S1,2, ..., S4,4
Solution (2/4)
Model corresponding to the considered configuration:
• A1,1 is true; A1,2, A1,3, ... are false
• W3,1 is true; W1,1, W1,2, ... are false
• P1,3, P3,3, P4,4 are true; P1,1, P1,2, ... are false
• G3,2 is true; G1,1, G1,2, ... are false
• B1,2, B1,4, ... are true; B1,1, B1,3, ... are false
• S2,1, S4,1, S3,2 are true; S1,1, S1,2, ... are false
Solution (3/4)
What the agent knows in the starting configuration:
• I am in room (1,1) (starting position of the game)
• I am alive: there is no pit and no wumpus in this room
• there is no gold in this room
• I perceive neither a breeze nor a stench
The corresponding agent's KB in propositional logic (the set of
sentences the agent believes to be true):
• A1,1, ¬A1,2, ¬A1,3, ..., ¬A4,4 (16 sentences)
• ¬P1,1, ¬W1,1
• ¬G1,1
• ¬B1,1, ¬S1,1
Solution (4/4)
One may think of translating the considered proposition using the
implication connective (⇒):
W3,1 ⇒ (S2,1 ∧ S4,1 ∧ S3,2)
However, since there is only one wumpus, the converse is also true:
(S2,1 ∧ S4,1 ∧ S3,2) ⇒ W3,1
An equivalent, more concise way to express both sentences:
(S2,1 ∧ S4,1 ∧ S3,2) ⇔ W3,1
Inference: model checking
Goal of inference: given a KB and a sentence α, decide whether
KB ⊨ α.
A simple inference algorithm: model checking (see above).
Application to propositional logic:
• enumerate all possible models of the sentences in KB ∪ {α}
• check whether α is true in every model in which KB is true
Implementation: truth tables.
Model checking: an example
Determine whether {P ∨ Q, P ⇒ R, Q ⇒ R} ⊨ P ∨ R, using
model checking.
P     | Q     | R     || P ∨ Q | P ⇒ R | Q ⇒ R || P ∨ R
false | false | false || false | true  | true  || false
false | false | true  || false | true  | true  || true
false | true  | false || true  | true  | false || false
false | true  | true  || true  | true  | true  || true   *
true  | false | false || true  | false | true  || true
true  | false | true  || true  | true  | true  || true   *
true  | true  | false || true  | false | false || true
true  | true  | true  || true  | true  | true  || true   *
(the rows marked * are those in which all the premises are true)
Answer: yes, because the conclusion is true in every model in
which the premises are true (the rows marked *).
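A sketch (illustrative, not from the slides) of model checking as code;
representing each sentence as a Python function over a model, a dict
from symbol names to truth values, is an encoding chosen here:

from itertools import product

def entails(symbols, premises, conclusion):
    # Return True iff the conclusion is true in every model of the premises.
    for values in product([False, True], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if all(p(model) for p in premises) and not conclusion(model):
            return False              # counterexample found
    return True

# {P ∨ Q, P ⇒ R, Q ⇒ R} ⊨ P ∨ R, with p ⇒ q encoded as (not p) or q
premises = [
    lambda m: m["P"] or m["Q"],
    lambda m: (not m["P"]) or m["R"],
    lambda m: (not m["Q"]) or m["R"],
]
print(entails(["P", "Q", "R"], premises, lambda m: m["P"] or m["R"]))  # True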
Properties of model checking
• Soundness: yes, since it directly implements the definition of
  entailment
• Completeness: yes, since it works for any (finite) KB and any α,
  and the corresponding set of models is finite
• Computational complexity: O(2^n), where n is the number of
  propositional symbols appearing in KB and α
Its exponential computational complexity makes model checking
infeasible when the number of propositional symbols is large.
Example
In the exercise about the wumpus world, 96 propositional symbols
were used: the corresponding truth table is made up of
2^96 ≈ 10^28 rows.
Inference: general concepts
• Two sentences α and β are logically equivalent (α ⇔ β) if
  they are true in the same models, i.e., if and only if
  α ⊨ β and β ⊨ α.
  An example: (P ∧ Q) ⇔ (Q ∧ P) (see the truth tables)
• A sentence is valid if it is true in all models. Such sentences
  are also called tautologies (an example: P ∨ ¬P)
• A sentence is satisfiable if it is true in at least one model.
  An example: P ∧ Q
Inference: general concepts
Two useful properties related to the above concepts:
• for any α and β, α ⊨ β if and only if α ⇒ β is valid;
  for instance, given a set KB of premises and a possible
  conclusion α, the model checking inference algorithm works
  by checking whether (KB ⇒ α) is valid
• satisfiability is related to the standard mathematical proof
  technique of reductio ad absurdum (proof by refutation or by
  contradiction):
  α ⊨ β if and only if (α ∧ ¬β) is unsatisfiable
Inference rules
Practical inference algorithms are based on inference rules.
An inference rule represents a standard pattern of inference: it
implements a simple reasoning step, whose soundness can be
easily proven, that can be applied to a set of premises having a
specific structure to derive a conclusion.
Inference rules are written with the premises above a horizontal
line and the conclusion below it:
  premises
  ----------
  conclusion
Examples of inference rules
In the following, α and β denote any propositional sentences.
And Elimination:        from α1 ∧ α2, infer αi (i = 1, 2)
And Introduction:       from α1, α2, infer α1 ∧ α2
Or Introduction:        from α1, infer α1 ∨ α2 (α2 can be any sentence)
First De Morgan's law:  from ¬(α1 ∧ α2), infer ¬α1 ∨ ¬α2
Second De Morgan's law: from ¬(α1 ∨ α2), infer ¬α1 ∧ ¬α2
Double Negation:        from ¬(¬α), infer α
Modus Ponens:           from α ⇒ β and α, infer β
The first five rules above easily generalize to any set of sentences
α1, ..., αn.
Soundness of inference rules
Since inference rules usually involve a few sentences, their
soundness can be easily proven using model checking.
An example: Modus Ponens
α     | β     | α ⇒ β
false | false | true
false | true  | true
true  | false | false
true  | true  | true
In the only model in which both premises (α and α ⇒ β) are true
(the last row), the conclusion β is also true.
Inference algorithms
Given a set of premises KB and a hypothetical conclusion α, the
goal of an inference algorithm A is to find a proof KB ⊢_A α (if
any), i.e., a sequence of applications of inference rules that leads
from KB to α.
Inference algorithms: an example
In the initial configuration of the wumpus game shown in the
figure below, the agent's KB includes:
(a) ¬B1,1 (current percept)
(b) ¬B1,1 ⇒ ¬P1,2 ∧ ¬P2,1 (one of the rules of the game)
[Figure: the typical wumpus world; the agent is in the bottom left
corner, facing right.]
The agent may be interested in knowing whether room (1,2)
contains a pit, i.e., whether KB ⊨ P1,2:
• applying Modus Ponens to (a) and (b), it derives:
  (c) ¬P1,2 ∧ ¬P2,1
• applying And Elimination to (c), it derives ¬P1,2
So, it can conclude that room (1,2) does not contain a pit.
Properties of inference algorithms
Three main issues:
• is a given inference algorithm sound (correct)?
• is it complete?
• what is its computational complexity?
It is not difficult to see that, if the considered inference rules are
sound, so is an inference algorithm based on them.
Completeness is more difficult to prove: it depends on the set of
available inference rules, and on the ways in which they are applied.
Properties of inference algorithms
What about computational complexity?
Note that finding a proof KB ⊢_A α, given a set of inference rules
R, can also be formulated as a search problem:
• initial state: the set of sentences KB
• state space: any set of sentences made up of the union of KB
  and of the sentences that can be derived by applying to KB
  any sequence of rules in R
• operators: the inference rules in R
• goal state: any set of sentences including α
Properties of inference algorithms
This suggests that the computational complexity can be very high:
• the solution depth may be high (some proofs require a large
  number of steps)
• the branching factor can be high:
  – several inference rules can be applicable to a given KB
  – each of them can be applicable to several sets of sentences;
    for example, the 16 sentences of the agent's KB at the
    beginning of the wumpus game, A1,1, ¬A1,2, ¬A1,3, ..., ¬A4,4,
    allow And Introduction to be applied in Σ_{k=2..16} C(16,k)
    different ways
Efficiency can be improved by ignoring propositions that are
irrelevant to the conclusion α. For instance, to prove ¬P ∧ ¬Q,
propositions like R, S and T can be ignored.
Horn clauses
In many domains of practical interest, the whole KB can be
expressed in the form of "if... then..." propositions that can be
encoded as Horn clauses, i.e., implications where:
• the antecedent is a conjunction (∧) of atomic sentences
  (non-negated propositional symbols)
• the consequent is a single atomic sentence:
  P1 ∧ ... ∧ Pn ⇒ Q
For instance, S2,1 ∧ S4,1 ∧ S3,2 ⇒ W3,1 is a Horn clause.
As particular cases, atomic sentences (i.e., propositional
symbols) and their negations can also be rewritten as Horn clauses.
Indeed, since (P ⇒ Q) ⇔ (¬P ∨ Q):
P  ⇔ ¬True ∨ P  ⇔ True ⇒ P
¬P ⇔ ¬P ∨ False ⇔ P ⇒ False
Forward and backward chaining
Two practical inference algorithms exist for the particular case in
which:
• the KB can be expressed as a set of Horn clauses
• the conclusion is an atomic, non-negated sentence
These algorithms, named forward and backward chaining, exhibit
the following characteristics:
• they are complete
• they use a single inference rule (Modus Ponens)
• their computational complexity is linear in the size of the KB
Forward chaining
Given a KB made up of Horn clauses, forward chaining (FC)
derives all the entailed atomic (non-negated) sentences:
function Forward-Chaining(KB)
  repeat
    apply MP in all possible ways to the sentences in KB
    add to KB the derived sentences not already present (if any)
  until no new sentence has been derived
  return KB
Forward chaining
FC is an example of data-driven reasoning: it starts from the
known data, and derives their consequences.
For instance, in the Wumpus game FC could be used to update the
agent’s knowledge about the environment (the presence of pits in
each room, etc.), based on the new percepts after each move.
The inference engine of expert systems (described later) is
inspired by the FC inference algorithm.
Forward chaining: an example (1/2)
Consider the KB shown below, made up of Horn clauses:
1. P ⇒ Q
2. L ∧ M ⇒ P
3. B ∧ L ⇒ M
4. A ∧ P ⇒ L
5. A ∧ B ⇒ L
6. A
7. B
Forward chaining: an example (2/2)
By applying FC one obtains:
8. the only implication whose premises (individual propositional
   symbols) are in the KB is 5: MP derives L and adds it to the
   current KB
9. now the premises of 3 are all true: MP derives M and adds it to
   the KB
10. the premises of 2 have become all true: MP derives P and adds it
    to the KB
11. the premises of 1 and 4 are now all true: MP derives Q from 1 and
    adds it to the KB, but disregards 4, since its consequent (L) is
    already present in the KB
12. no new sentences can be derived from 1–11: FC ends and returns
    the updated KB, containing the original sentences 1–7 and the ones
    derived in the above steps: {L, M, P, Q}
A runnable sketch of FC on this KB is given below.
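The sketch (an illustration added here, not part of the slides) encodes
each Horn clause as a (set of premise symbols, conclusion symbol) pair,
a representation chosen for this sketch:

def forward_chaining(clauses, facts):
    known = set(facts)
    while True:
        new = {
            conclusion
            for premises, conclusion in clauses
            if premises <= known and conclusion not in known
        }
        if not new:                   # fixed point: nothing new can be derived
            return known
        known |= new

clauses = [
    ({"P"}, "Q"),                     # 1. P ⇒ Q
    ({"L", "M"}, "P"),                # 2. L ∧ M ⇒ P
    ({"B", "L"}, "M"),                # 3. B ∧ L ⇒ M
    ({"A", "P"}, "L"),                # 4. A ∧ P ⇒ L
    ({"A", "B"}, "L"),                # 5. A ∧ B ⇒ L
]
print(forward_chaining(clauses, facts={"A", "B"}))
# {'A', 'B', 'L', 'M', 'P', 'Q'} (set printing order may vary)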
Backward chaining
For a given KB made up of Horn clauses and a given atomic,
non-negated sentence α, FC can be used to prove whether or not
KB ⊨ α: one has to check whether α is present among the derived
sentences.
However, backward chaining (BC) is more effective for this goal.
BC recursively applies MP "backwards". It exploits the fact that
KB ⊨ α if and only if:
• either α ∈ KB (this terminates the recursion)
• or KB contains some implication β1 ∧ ... ∧ βn ⇒ α,
  and (recursively) KB ⊨ β1, ..., KB ⊨ βn
The sentence α to be proven is also called the query.
Backward chaining
function Backward-Chaining(KB, α)
  if α ∈ KB then return True
  let B be the set of sentences of KB having α as the consequent
  for each β ∈ B
    let β1, β2, ... be the propositional symbols in the antecedent of β
    if Backward-Chaining(KB, βi) = True for all βi's
      then return True
  return False
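A sketch of BC in Python over the same (premises, conclusion) encoding
used for the FC sketch above; the pending argument, not present in the
pseudocode, is added here to guard against circular goals (e.g., clauses
2 and 4 of the FC example mention each other's consequents):

def backward_chaining(clauses, facts, query, pending=frozenset()):
    if query in facts:                # α ∈ KB: terminates the recursion
        return True
    if query in pending:              # this goal is already being attempted
        return False
    for premises, conclusion in clauses:
        if conclusion == query and all(
            backward_chaining(clauses, facts, p, pending | {query})
            for p in premises
        ):
            return True
    return False

print(backward_chaining(clauses, {"A", "B"}, "Q"))  # True (the FC example KB)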
Backward chaining
BC is a form of goal-directed reasoning.
For instance, in the wumpus game it could be used to answer
queries like: given the agent's current knowledge, is moving
upward the best action?
The computational complexity of BC is even lower than that of
FC, since BC focuses only on relevant sentences.
The Prolog logic programming language is based on the
predicate logic version of the BC inference algorithm (described
later).
Backward chaining: an example (1/8)
Consider a KB representing the rules followed by a financial institution
for deciding whether to grant a loan to an individual. The following
propositional symbols are used:
• OK: the loan should be approved
• COLLAT: the collateral for the loan is satisfactory
• PYMT: the applicant is able to repay the loan
• REP: the applicant has a good financial reputation
• APP: the appraisal on the collateral is sufficiently greater than the
  loan amount
• RATING: the applicant has a good credit rating
• INC: the applicant has a good, steady income
Backward chaining: an example (2/8)
The KB is made up of the five rules (implications) on the left, and of the
data about a specific applicant encoded by the four sentences on the
right (all of them are Horn clauses):
1. COLLAT ∧ PYMT ∧ REP ⇒ OK        6. APP
2. APP ⇒ COLLAT                     7. RATING
3. RATING ⇒ REP                     8. INC
4. INC ⇒ PYMT                       9. ¬BAL
5. BAL ∧ REP ⇒ OK
Should the loan be approved for this specific applicant?
This amounts to proving whether OK is entailed by the KB, i.e., whether
KB ⊨ OK.
Backward chaining: an example (3/8)
The BC recursive proof KB ⊢_BC OK can be conveniently
represented as an AND-OR graph, a tree-like graph in which:
• multiple links joined by an arc indicate a conjunction (every
  link must be proven)
• multiple links without an arc indicate a disjunction (proving any
  one link suffices)
Backward chaining: an example (4/8)
The first call Backward-Chaining(KB, OK) is represented by the
tree root, corresponding to the sentence to be proven.
Since OK ∉ KB, implications having OK as the consequent are searched
for. There are two such sentences: 1 and 5. The BC procedure tries to
prove all the antecedents of at least one of them. Considering first 5, a
recursive call to Backward-Chaining is made for each of its two
antecedents, represented by an AND-link:
[AND-OR graph: OK with an AND-link to BAL and REP]
Backward chaining: an example (5/8)
Consider the call Backward-Chaining(KB, REP): since REP ∉ KB,
and the only implication having REP as the consequent is 3, another
recursive call is made for the antecedent of 3, RATING.
The call Backward-Chaining(KB, RATING) returns True, since
RATING ∈ KB, and thus the call Backward-Chaining(KB, REP)
also returns True:
[AND-OR graph: OK with an AND-link to BAL and REP; REP is expanded
to RATING, which is proven]
Backward chaining: an example (6/8)
However, the call Backward-Chaining(KB, BAL) returns False, since
BAL ∉ KB and there are no implications having BAL as the consequent.
Therefore, the first call Backward-Chaining(KB, OK) is not able to
prove OK through this AND-link.
The other sentence in the KB having OK as the consequent, 1, is now
considered, and another AND-link is generated, with three recursive
calls, one for each of the antecedents of 1:
[AND-OR graph: OK with a second AND-link to COLLAT, PYMT and REP]
Backward chaining: an example (7/8)
The call Backward-Chaining(KB, COLLAT) generates in turn
another recursive call, to prove the antecedent of the only implication
having COLLAT as the consequent, 2:
[AND-OR graph: COLLAT is expanded to APP]
The call Backward-Chaining(KB, APP) returns True, since
APP ∈ KB, and thus Backward-Chaining(KB, COLLAT)
also returns True.
Backward chaining: an example (8/8)
Similarly, the calls Backward-Chaining(KB, PYMT) and
Backward-Chaining(KB, REP) return True.
The corresponding AND-link is then proven, which finally allows the
first call Backward-Chaining(KB, OK) to return True:
[AND-OR graph: the complete proof tree, with COLLAT proven via APP,
PYMT via INC, and REP via RATING]
The proof KB ⊢_BC OK is then successfully completed.
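For illustration, the backward_chaining sketch given after the BC
pseudocode above confirms this result on the loan KB; the (premises,
conclusion) encoding is the one assumed there, and ¬BAL (sentence 9)
is reflected simply by BAL being neither a fact nor derivable:

loan_clauses = [
    ({"COLLAT", "PYMT", "REP"}, "OK"),   # 1.
    ({"APP"}, "COLLAT"),                 # 2.
    ({"RATING"}, "REP"),                 # 3.
    ({"INC"}, "PYMT"),                   # 4.
    ({"BAL", "REP"}, "OK"),              # 5.
]
loan_facts = {"APP", "RATING", "INC"}    # 6., 7., 8.

print(backward_chaining(loan_clauses, loan_facts, "OK"))  # True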
Resolution algorithm
FC and BC exhibit a low computational complexity. They are also
complete, but limited to:
• KBs made up of Horn clauses
• conclusions consisting of a non-negated propositional symbol
It turns out that a complete inference algorithm for full
propositional logic also exists: the resolution algorithm, which uses
a single inference rule, itself named resolution.
Given any KB and any sentence α, the resolution algorithm proves
whether or not KB ⊨ α. Its computational complexity is, however,
much higher than that of FC and BC.
The predicate logic version of the resolution algorithm is used in
automatic theorem provers, to assist mathematicians in developing
complex proofs.
Exercise 1
Construct the agent's initial KB for the wumpus game.
The KB should contain:
• the rules of the game: the agent starts in room (1,1); there is
  a breeze in rooms adjacent to pits, etc.
• rules to decide the agent's move at each step of the game
Note that the KB must be updated at each step of the game:
1. adding the percepts in the current room (from sensors)
2. reasoning to derive new knowledge about the position of pits
   and of the wumpus
3. reasoning to decide the next move
4. updating the agent's position
Exercise 1
Rules of the wumpus game:
• the agent starts in room (1,1):
  A1,1 ∧ ¬A1,2 ∧ ... ∧ ¬A4,4
• there is a breeze in rooms adjacent to pits:
  P1,1 ⇒ (B2,1 ∧ B1,2),
  P1,2 ⇒ (B1,1 ∧ B2,2 ∧ B1,3), ...
  (one proposition in natural language, sixteen sentences in
  propositional logic – one for each room)
• there is only one wumpus:
  (W1,1 ∧ ¬W1,2 ∧ ¬W1,3 ∧ ... ∧ ¬W4,4) ∨
  (¬W1,1 ∧ W1,2 ∧ ¬W1,3 ∧ ... ∧ ¬W4,4) ∨ ...
  (one proposition in natural language, one large disjunction with
  sixteen disjuncts – one for each room)
• ...
Often, one concise proposition in natural language needs to be
represented by many complex sentences in propositional logic; such
sentences can also be generated programmatically, as sketched below.
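A sketch (illustrative; the string encoding of sentences is a choice
made here) of generating the sixteen breeze sentences for a 4x4 cave:

def adjacent(x, y, size=4):
    # Yield the coordinates of the rooms adjacent to (x, y).
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 1 <= x + dx <= size and 1 <= y + dy <= size:
            yield x + dx, y + dy

sentences = [
    "P{},{} => (".format(x, y)
    + " AND ".join("B{},{}".format(p, q) for p, q in adjacent(x, y)) + ")"
    for x in range(1, 5) for y in range(1, 5)
]
print(sentences[0])    # P1,1 => (B2,1 AND B1,2)
print(len(sentences))  # 16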
Exercise 1
How can the KB be updated to account for the change of the agent's
position after each move? E.g., A1,1 is true in the starting
position, and becomes false after the first move:
• adding ¬A1,1 makes the KB contradictory, since A1,1 is still
  present...
• ...but inference rules do not allow removing sentences
Solution: use a different propositional symbol for each time
step, e.g., A^t_i,j, t = 1, 2, ...
• initial KB: A^1_1,1, ¬A^1_1,2, ..., ¬A^1_4,4
• if the agent moves to (1,2), the following sentences must be
  added to the KB: ¬A^2_1,1, A^2_1,2, ¬A^2_1,3, ..., ¬A^2_4,4;
  and so on
Things get complicated...
Exercise 2
The following argumentation (an example of syllogism) is intuitively
correct; prove its correctness using propositional logic:
All men are mortal; Socrates is a man; then, Socrates is mortal.
Three distinct propositional symbols must be used:
P (All men are mortal), Q (Socrates is a man), R (Socrates is mortal)
Therefore:
• premises: {P, Q}
• conclusion: R
Do the premises entail the conclusion, i.e., {P, Q} ⊨ R?
Model checking easily allows one to prove that the answer is no: in the
model {P = True, Q = True, R = False}, the premises are true but the
conclusion is false.
What's wrong?
Limitations of propositional logic
Main problems: limited expressive power, lack of conciseness.
Example (wumpus world)
Even small knowledge bases (in natural language) require a large
number of propositional symbols and sentences.
Example (syllogisms)
Inferences involving the structure of atomic sentences (All men
are mortal, . . . ) cannot be made.
From propositional to predicate logic
The description of many domains of interest for real-world
applications (e.g., mathematics, philosophy, AI) involves the
following elements in natural language:
• nouns denoting objects (or persons), e.g.: wumpus and pits;
  Socrates and Plato; the numbers one, two, etc.
• verbs denoting properties of individual objects and relations
  between them, e.g.: Socrates is a man, five is prime, four is
  lower than five, the sum of two and two equals four
• some relations between objects can be represented as
  functions, e.g.: "father of", "two plus two"
• facts involving some or all of the objects, e.g.: all squares
  neighboring the wumpus are smelly; some numbers are prime
These elements cannot be represented in propositional logic, and
require the more expressive predicate logic.
Predicate logic: models
A model in predicate logic consists of:
• a domain of discourse: a set of objects, e.g.:
  – the set of natural numbers
  – a set of individuals: Socrates, Plato, ...
• relations between objects; each relation is represented as the
  set of tuples of objects that are related, e.g.:
  – being greater than (binary relation): {(2,1), (3,1), ...}
  – being a prime number (unary relation): {2, 3, 5, 7, 11, ...}
  – being the sum of (ternary relation): {(1,1,2), (1,2,3), ...}
  – being the father of (binary relation): {(John, Mary), ...}
  (unary relations are also called properties)
• functions that map tuples of objects to a single object, e.g.:
  – plus: (1,1) → 2, (1,2) → 3, ...
  – father of: Mary → John, ...
Note that relations and functions are defined extensionally, i.e.,
by explicitly enumerating the corresponding tuples.
Predicate logic: syntax
The basic elements are symbols representing objects, relations and
functions:
• constant symbols denote objects, e.g.:
  One, Two, Three, John, Mary
• predicate symbols denote relations, e.g.:
  GreaterThan, Prime, Sum, Father
• function symbols denote functions, e.g.:
  Plus, FatherOf
Predicate logic: syntax
A formal grammar in Backus-Naur Form (BNF):
Sentence       → AtomicSentence
               | (Sentence Connective Sentence)
               | Quantifier Variable, ... Sentence
               | ¬ Sentence
AtomicSentence → Predicate(Term, ...)
Term           → Function(Term, ...) | Constant | Variable
Connective     → ⇒ | ∧ | ∨ | ⇔
Quantifier     → ∀ | ∃
Constant       → John | Mary | One | Two | ...
Variable       → a | x | s | ...
Predicate      → GreaterThan | Father | ...
Function       → Plus | FatherOf | ...
Semantics of predicate logic: interpretations
Remember that semantics defines the truth of well-formed
sentences with respect to a particular model.
In predicate logic this requires an interpretation: a definition of
which objects, relations and functions are referred to by the symbols.
Examples:
• One, Two and Three denote the natural numbers 1, 2, 3;
  John and Mary denote the individuals John and Mary
• GreaterThan denotes the binary relation > between numbers;
  Father denotes the fatherhood relation between individuals
• Plus denotes the function mapping a pair of numbers to their
  sum
Semantics: terms
Terms are logical expressions denoting objects.
A term can be:
• simple: a constant symbol, e.g.: One, Two, John
• complex: a function symbol applied (possibly recursively) to
  other terms, e.g.:
  FatherOf(Mary)
  Plus(One, Two)
  Plus(One, Plus(One, One))
Note:
• assigning a constant symbol to every object in the domain is
  not required (domains can even be infinite)
• an object can be denoted by more than one constant symbol
Semantics: atomic sentences
The simplest kind of proposition: a predicate symbol applied to a
list of terms.
Examples:
• GreaterThan(Two, One), Prime(Two),
  Prime(Plus(Two, Two)), Sum(One, One, Two)
• Father(John, Mary),
  Father(FatherOf(John), FatherOf(Mary))
Semantics: atomic sentences
Definition
An atomic sentence is true, in a given model and under a given
interpretation, if the relation referred to by its predicate symbol
holds between the objects referred to by its arguments (terms).
Example
According to the above model and interpretation:
• GreaterThan(One, Two) is false
• Prime(Two) is true
• Prime(Plus(One, One)) is true
• Sum(One, One, Two) is true
• Father(John, Mary) is true
Semantics: complex sentences
Complex sentences are obtained as in propositional logic, using
logical connectives.
Examples:
• Prime(Two) ∧ Prime(Three)
• ¬Sum(One, One, Two)
• GreaterThan(One, Two) ⇒ (¬GreaterThan(Two, One))
• Father(John, Mary) ∨ Father(Mary, John)
Semantics (truth value) is determined as in propositional logic.
Examples: the second sentence above is false, the others are true.
Semantics: quantifiers
Quantifiers allow one to express propositions involving collections
of objects, without enumerating them explicitly.
Two main quantifiers are used in predicate logic:
• the universal quantifier, e.g.:
  All men are mortal
  All rooms neighboring the wumpus are smelly
  All even numbers are not prime
• the existential quantifier, e.g.:
  Some numbers are prime
  Some rooms contain pits
  Some men are philosophers
Quantifiers require a new kind of term: variable symbols, usually
denoted by lowercase letters.
Semantics: universal quantifier
Example
Assume that the domain is the set of natural numbers.
• All natural numbers are greater than or equal to one:
  ∀x GreaterOrEqual(x, One)
• All natural numbers are either even or odd:
  ∀x Even(x) ∨ Odd(x)
Semantics: universal quantifier
The semantics of a sentence ∀x α(x), where α(x) is a sentence
containing the variable x, is:
α(x) is true for each domain element in place of x.
Example
If the domain is the set of natural numbers,
∀x GreaterOrEqual(x, One)
means that the following (infinitely many) sentences are all true:
GreaterOrEqual(One, One)
GreaterOrEqual(Two, One)
...
Semantics: universal quantifier
Consider the proposition:
all even numbers greater than two are not prime.
A common mistake is to represent it as follows:
∀x Even(x) ∧ GreaterThan(x, Two) ∧ (¬Prime(x))
The above sentence actually means:
all numbers are even, greater than two, and not prime,
which is different from the original one (and is also false).
The correct sentence can be obtained by noting that the original
proposition can be restated as:
for all x, if x is even and greater than two, then it is not prime,
which is represented by an implication:
∀x (Even(x) ∧ GreaterThan(x, Two)) ⇒ (¬Prime(x))
In general, propositions where "all" refers to all the elements of the
domain that satisfy some condition must be represented using an
implication.
Semantics: universal quantifier
Consider again this sentence:
∀x (Even(x) ∧ GreaterThan(x, Two)) ⇒ (¬Prime(x))
Claiming that it is true means that sentences like the following
are also true:
(Even(One) ∧ GreaterThan(One, Two)) ⇒ (¬Prime(One))
Note that the antecedent of the implication is false (the number
one is not even, nor is it greater than the number two). This is
not contradictory, since implications with false antecedents are true
by definition (see again the truth table of ⇒).
Semantics: existential quantifier
Example
Assume that the domain is the set of natural numbers.
• Some numbers are prime:
  ∃x Prime(x)
  This is read as: there exists some x such that x is prime
• Some numbers are not greater than three, and are even:
  ∃x ¬GreaterThan(x, Three) ∧ Even(x)
Semantics: existential quantifier
Consider a proposition like the following: some odd numbers are prime.
A common mistake is to represent it using an implication:
∃x Odd(x) ⇒ Prime(x)
The above sentence actually means:
there exists some number such that, if it is odd, then it is prime,
which is weaker than the original proposition, since it would be true
(by definition of ⇒) even if there were no odd numbers (i.e., if the
antecedent Odd(x) were false for all domain elements).
The correct sentence can be obtained by noting that the original
proposition can be restated as:
there exists some x such that x is odd and x is prime:
∃x Odd(x) ∧ Prime(x)
In general, propositions introduced by "some" must be represented
using a conjunction.
Semantics: nested quantifiers
A sentence can contain more than one quantified variable.
If the quantifier is the same for all variables, e.g.:
∀x (∀y (∀z ... α[x, y, z, ...] ...))
then the sentence can be rewritten more concisely as:
∀x, y, z ... α[x, y, z, ...]
For instance, the sentence "if a number is greater than another
number, then the latter is lower than the former" can be written in
predicate logic as:
∀x, y GreaterThan(x, y) ⇒ LowerThan(y, x)
Semantics: nested quantifiers
If a sentence contains both universally and existentially quantified
variables, its meaning depends on the order of quantification.
In particular, ∀x (∃y α[x, y]) and ∃y (∀x α[x, y]) are not
equivalent, i.e., they are not true in the same models.
For instance,
∀x ∃y Loves(x, y)
means (i.e., is true in a model in which) "everybody loves
somebody".
Instead,
∃y ∀x Loves(x, y)
means "there is someone who is loved by everyone".
Semantics: connections between ∀ and ∃
∀ and ∃ are connected with each other through negation.
For instance, asserting that "every natural number is greater than or
equal to one" is the same as asserting that "there does not exist a
natural number which is not greater than or equal to one".
In general, since ∀ is a conjunction over all domain objects and ∃ is
a disjunction, they obey De Morgan's laws (shown below on the left,
in the usual form involving two propositional variables):
¬P ∧ ¬Q ⇔ ¬(P ∨ Q)         ∀x (¬α[x]) ⇔ ¬(∃x α[x])
¬(P ∧ Q) ⇔ (¬P) ∨ (¬Q)     ¬(∀x α[x]) ⇔ ∃x (¬α[x])
P ∧ Q ⇔ ¬(¬P ∨ ¬Q)         ∀x α[x] ⇔ ¬(∃x (¬α[x]))
P ∨ Q ⇔ ¬(¬P ∧ ¬Q)         ∃x α[x] ⇔ ¬(∀x (¬α[x]))
Exercises (1/2)
Represent the following propositions using sentences in predicate
logic (including the definition of the domain):
1. All men are mortal; Socrates is a man; Socrates is mortal
2. All rooms neighboring a pit are breezy (wumpus game)
3. The Peano–Russell axioms of arithmetic, which define the natural
   numbers (nonnegative integers):
   P1 zero is a natural number
   P2 the successor of any natural number is a natural number
   P3 zero is not the successor of any natural number
   P4 no two natural numbers have the same successor
   P5 any property which belongs to zero, and to the successor of
      every natural number which has the property, belongs to all
      natural numbers
Exercises (2/2)
4. Represent the following propositions using sentences in
   predicate logic, assuming that the goal is to prove that West
   is a criminal (using suitable inference algorithms, see below):
   The law says that it is a crime for an American to sell
   weapons to hostile countries. The country Nono, an enemy of
   America, has some missiles, and all of its missiles were sold to
   it by Colonel West, who is American.
   Note that in a knowledge-based system the first proposition
   above encodes the general knowledge about the problem at
   hand ("rule memory", analogous to the rules of chess and of
   the wumpus game), whereas the second proposition encodes a
   specific problem instance ("working memory", analogous to a
   specific chess or wumpus game).
Solution of exercise 1
Model and symbols:
• domain: any set including all men
• constant symbols: Socrates
• predicate symbols: Man and Mortal, unary predicates; e.g.,
  Man(Socrates) means that Socrates is a man
The sentences are:
∀x Man(x) ⇒ Mortal(x)
Man(Socrates)
Mortal(Socrates)
Solution of exercise 2 (1/2)
A possible choice of model and symbols:
• domain: row and column coordinates
• constant symbols: 1, 2, 3, 4
• predicate symbols:
  – Pit, binary predicate; e.g., Pit(1, 2) means that there is a pit in
    room (1,2)
  – Adjacent, predicate with four terms; e.g., Adjacent(1, 1, 1, 2)
    means that room (1,1) is adjacent to room (1,2)
  – Breezy, binary predicate; e.g., Breezy(2, 2) means that there is
    a breeze in room (2,2)
Solution of exercise 2 (2/2)
One possible sentence is the following:
∀x, y (Breezy(x, y) ⇔ (∃p, q Adjacent(x, y, p, q) ∧ Pit(p, q)))
Note that the sentence above also expresses the fact that rooms
with no adjacent pits are not breezy.
Another possible sentence:
∀x, y (Pit(x, y) ⇒ (∀p, q Adjacent(x, y, p, q) ⇒ Breezy(p, q)))
In this case there is no logical equivalence: if all the rooms
adjacent to a given one are breezy, the latter does not necessarily
contain a pit.
Solution of exercise 3 (1/2)
A possible choice of model and symbols:
• domain: any set including all natural numbers (e.g., the set of
  real numbers)
• constant symbols: Z, denoting the number zero
• predicate symbols:
  – N, unary predicate denoting the property of being a natural
    number; e.g., N(Z) means that zero is a natural number
  – Eq, binary predicate denoting equality; e.g., Eq(Z, Z) means
    that zero equals zero
  – P, denoting any given property
• function symbols: S, mapping a natural number to its
  successor; e.g., S(Z) denotes one, S(S(Z)) denotes two
Solution of exercise 3 (2/2)
P1 N(Z)
P2 ∀x N(x) ⇒ N(S(x))
P3 ¬(∃x Eq(Z, S(x)))
P4 ∀x, y Eq(S(x), S(y)) ⇒ Eq(x, y)
P5 (P(Z) ∧ ∀x ((N(x) ∧ P(x)) ⇒ P(S(x)))) ⇒ (∀x (N(x) ⇒ P(x)))
Solution of exercise 4 (1/3)
A possible choice of model and symbols:
• domain: a set including different individuals (among which
  Colonel West), nations (among which America and Nono),
  and missiles
• constant symbols: West, America and Nono
• predicate symbols:
  – Country(·), American(·), Missile(·), Weapon(·), Hostile(·)
    (respectively: being a country, an American citizen, a missile,
    a weapon, hostile)
  – Enemy(<who>, <to whom>) (being enemies)
  – Owns(<who>, <what>) (owning something)
  – Sells(<who>, <what>, <to whom>) (selling something
    to someone)
• no function symbols are necessary
Solution of exercise 4 (2/3)
The law says that it is a crime for an American to sell weapons to
hostile nations:
∀x, y, z (American(x) ∧ Country(y) ∧ Hostile(y) ∧ Weapon(z) ∧
Sells(x, z, y)) ⇒ Criminal(x)
The second proposition can be conveniently split into simpler ones.
Nono is a country...:
Country(Nono)
...Nono is an enemy of America (which is also a country)...:
Enemy(Nono, America)
Country(America)
...Nono has some missiles...:
∃x Missile(x) ∧ Owns(Nono, x)
...all Nono's missiles were sold to it by Colonel West, who is American:
∀x (Missile(x) ∧ Owns(Nono, x)) ⇒ Sells(West, x, Nono)
American(West)
Solution of exercise 4 (3/3)
A human would intuitively say that the above propositions in
natural language imply that West is a criminal.
However, it is not difficult to see that the above sentences in
predicate logic are not sufficient to prove this.
The reason is that humans exploit background (or common
sense) knowledge that is not explicitly stated in the above
propositions. In particular, there are two "missing links":
• an enemy nation is hostile
• a missile is a weapon
To use such additional knowledge, it must be explicitly
represented by sentences in predicate logic:
• ∀x (Country(x) ∧ Enemy(x, America)) ⇒ Hostile(x)
• ∀x Missile(x) ⇒ Weapon(x)
Knowledge engineering (1/3)
Knowledge engineering is the process of constructing the KB.
It consists of investigating a specific domain, identifying the
relevant concepts (knowledge acquisition), and formally
representing them.
This requires interaction between:
• a domain expert (DE)
• a knowledge engineer (KE), who is an expert in knowledge
  representation and inference, but usually not in the domain of
  interest
A possible approach, suitable for special-purpose KBs (in
predicate logic), is the following.
Knowledge engineering (2/3)
1. Identify the task:
   – what range of queries will the KB support?
   – what kind of facts will be available for each problem instance?
2. Knowledge acquisition: eliciting from the domain expert the
   general knowledge about the domain (e.g., the rules of chess)
3. Choice of a vocabulary: which concepts have to be represented as
   objects, predicates, functions?
   The result is the domain's ontology, which affects the complexity
   of the representation and the inferences that can be made.
   E.g., in the wumpus game pits can be represented either as objects
   or as unary predicates on squares.
Knowledge engineering (3/3)
4. Encoding the domain's general knowledge acquired in step 2
   (this may require revising the vocabulary of step 3)
5. Encoding a specific problem instance (e.g., a specific chess
   game)
6. Posing queries to the inference procedure and getting answers
7. Debugging the KB, based on the results of step 6
Inference in predicate logic
Inference algorithms are more complex than in propositional logic,
due to quantifiers and functions.
Basic tools: two inference rules for sentences with quantifiers
(Universal and Existential Instantiation), which derive sentences
without quantifiers.
This reduces first-order inference to propositional inference, with
complete but semidecidable inference procedures:
• algorithms exist that find a proof KB ⊢ α in a finite number of
  steps for every entailed sentence (KB ⊨ α)
• no algorithm is capable of establishing KB ⊬ α in a finite number
  of steps for every non-entailed sentence (KB ⊭ α)
Therefore, since one does not know whether a sentence is entailed
until the proof is done, while a proof procedure is running one does
not know whether it is about to find a proof or whether it will never
find one.
Inference in predicate logic
Modus Ponens can be generalized to predicate logic, leading to the
first-order versions of the FC and BC algorithms, which are
complete and decidable when limited to Horn clauses.
The resolution rule can also be generalized to predicate logic,
leading to the first-order version of the complete but
semidecidable resolution algorithm.
Inference rules for quantifiers
Let θ denote a substitution list {v1/t1, ..., vn/tn}, where:
• v1, ..., vn are variable names
• t1, ..., tn are terms (either constant symbols, variables, or
  functions recursively applied to terms)
and let α be any sentence in which one or more variables appear.
Let Subst(θ, α) denote the sentence obtained by applying the
substitution θ to the sentence α.
An example:
Subst({y/One}, ∀x, y Eq(S(x), S(y)) ⇒ Eq(x, y))
produces
∀x Eq(S(x), S(One)) ⇒ Eq(x, One)
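An illustrative sketch (not from the slides) of Subst over sentences
represented as nested tuples; the convention that variables are
lowercase strings is an assumption of this sketch, and quantifier
bookkeeping is omitted:

def subst(theta, sentence):
    # theta maps variable names to terms; recurse through applications.
    if isinstance(sentence, tuple):
        return tuple(subst(theta, part) for part in sentence)
    return theta.get(sentence, sentence)  # replace variables, keep the rest

s = ('implies', ('Eq', ('S', 'x'), ('S', 'y')), ('Eq', 'x', 'y'))
print(subst({'y': 'One'}, s))
# ('implies', ('Eq', ('S', 'x'), ('S', 'One')), ('Eq', 'x', 'One'))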
Inference rules for quantifiers
Universal Instantiation:
  from ∀v α, infer Subst({v/t}, α)
where t can be any term without variables.
In other words, since a sentence ∀x α[x] states that α is true for
every domain element in place of x, one can derive that α is
true for any given element t.
An example: from ∀x N(x) ⇒ N(S(x)) one can derive
• N(Z) ⇒ N(S(Z)), for θ = {x/Z}
• N(S(S(Z))) ⇒ N(S(S(S(Z)))), for θ = {x/S(S(Z))}
and so on.
Inference rules for quantifiers
Existential Instantiation:
  from ∃v α, infer Subst({v/t}, α)
where t must be a constant symbol that does not appear
elsewhere in the KB.
A sentence ∃v α[v] states that there is some object satisfying a
condition. The above rule just gives a name to one such object,
but that name must not belong to another object, because we do
not know which objects satisfy the condition.
For instance, from ∃x Missile(x) ∧ Owns(Nono, x) one can derive
Missile(M) ∧ Owns(Nono, M), provided that M has not
already been used in other sentences; one cannot derive, instead,
Missile(West) ∧ Owns(Nono, West).
Inference rules for quantifiers
A more general form of Existential Instantiation must be applied when an existential quantifier appears in the scope of a universal quantifier:
    ∀x, ... ∃y, ... α[x, ..., y, ...]
For instance, from ∀x ∃y Loves(x, y) (everybody loves somebody) it is not correct to derive ∀x Loves(x, A) (everybody loves A), since the latter sentence means that everybody loves the same person.
Inference rules for quantifiers
Instead of a constant symbol, a new function symbol must be introduced, known as a Skolem function, with as many arguments as there are universally quantified variables. Therefore, from:
    ∀x, ... ∃y, ... α[x, ..., y, ...]
the correct application of Existential Instantiation derives:
    ∀x, ... α[x, ..., F1(x), ...]
For instance, from
    ∀x ∃y Loves(x, y)
one can correctly derive
    ∀x Loves(x, F(x))
where F maps any individual x to someone loved by x.
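A possible sketch of the corresponding substitution-building step, reusing the subst function above (the fresh-symbol naming scheme is an assumption made for illustration):

import itertools
_skolem_ids = itertools.count(1)

def skolem_substitution(existential_var, universal_vars):
    """Map an existentially quantified variable to a fresh Skolem term whose
    arguments are the universally quantified variables in whose scope it lies."""
    f = 'F' + str(next(_skolem_ids))        # fresh capitalized symbol, e.g. "F1"
    if not universal_vars:                  # no enclosing forall: Skolem constant
        return {existential_var: f}
    return {existential_var: (f,) + tuple(universal_vars)}

# forall x exists y Loves(x, y)  becomes  forall x Loves(x, F1(x)):
print(subst(skolem_substitution('y', ['x']), ('Loves', 'x', 'y')))
# ('Loves', 'x', ('F1', 'x'))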
Inference algorithms and quantifiers
First-order inference algorithms usually apply Existential Instantiation as a pre-processing step (Skolemization): every existentially quantified sentence is replaced by a single sentence without the existential quantifier.
It can be proven that the resulting KB is inferentially equivalent to the original one, i.e., it is satisfiable exactly when the original one is.
Accordingly, the resulting KB contains only sentences without variables, and sentences where all the variables are universally quantified.
Another useful pre-processing step is renaming all the variables in the KB (standardizing apart) to avoid name clashes between variables used in different sentences. For instance, the variables in ∀x P(x) and ∀x Q(x) are not related to each other, and renaming either of them (say, ∀y Q(y)) produces an equivalent sentence.
Unification
Another widely used tool in first-order inference algorithms is unification: the process of finding a substitution (if any) that makes two sentences (at least one of which contains variables) identical.
For instance, ∀x, y Knows(x, y) and ∀z Knows(John, z) can be unified by different substitutions. Assuming that Bill is one of the constant symbols, two possible unifiers are:
• {x/John, y/Bill, z/Bill}
• {x/John, y/z}
Among all possible unifiers, the one of interest for first-order inference algorithms is the most general unifier (MGU), i.e., the one that places the fewest restrictions on the values of the variables. The only constraint is that every occurrence of a given variable must be replaced by the same term.
In the above example, the most general unifier is {x/John, y/z}, as it does not restrict the values of y and z.
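A standard unification algorithm with occurs-check can be sketched in Python on the same term representation (again an illustration, not the course's own code):

def unify(a, b, theta=None):
    """Return an MGU of a and b as a dict {variable: term}, extending the
    bindings already in theta; return None if no unifier exists."""
    if theta is None:
        theta = {}
    if a == b:
        return theta
    if is_variable(a):
        return unify_var(a, b, theta)
    if is_variable(b):
        return unify_var(b, a, theta)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):      # same symbol: unify args pairwise
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None                             # e.g. two distinct constant symbols

def unify_var(var, term, theta):
    if var in theta:                        # follow an existing binding
        return unify(theta[var], term, theta)
    if is_variable(term) and term in theta:
        return unify(var, theta[term], theta)
    if occurs(var, term, theta):            # refuse circular bindings
        return None
    extended = dict(theta)
    extended[var] = term
    return extended

def occurs(var, term, theta):
    """True if var occurs in term under the bindings in theta."""
    if term == var:
        return True
    if is_variable(term) and term in theta:
        return occurs(var, theta[term], theta)
    if isinstance(term, tuple):
        return any(occurs(var, a, theta) for a in term[1:])
    return False

# Knows(x, y) and Knows(John, z): the algorithm returns the MGU named above
print(unify(('Knows', 'x', 'y'), ('Knows', 'John', 'z')))
# {'x': 'John', 'y': 'z'}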
Unification: an example
Consider the sentence ∀x Knows(John, x) (John knows everyone). Assume that the KB also contains the following sentences (note that different variable names are used in different sentences):
1. Knows(John, Jane)
2. ∀y Knows(y, Bill)
3. ∀z Knows(z, Mother(z))
4. Knows(Elizabeth, Bill)
The most general unifier of Knows(John, x) with each of them is:
1. {x/Jane} (note that ∀x Knows(John, x) also implies that Knows(John, John) is true, i.e., John knows himself)
2. {y/John, x/Bill}
3. {z/John, x/Mother(John)}
4. no unifier exists, as the constant symbols John and Elizabeth in the first argument are different
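Running the unify sketch above on the four cases reproduces these results; in case 3 the returned bindings are {z/John, x/Mother(z)}, which is equivalent to {z/John, x/Mother(John)} once the binding of z is applied:

q = ('Knows', 'John', 'x')
print(unify(q, ('Knows', 'John', 'Jane')))        # {'x': 'Jane'}
print(unify(q, ('Knows', 'y', 'Bill')))           # {'y': 'John', 'x': 'Bill'}
print(unify(q, ('Knows', 'z', ('Mother', 'z'))))  # {'z': 'John', 'x': ('Mother', 'z')}
print(unify(q, ('Knows', 'Elizabeth', 'Bill')))   # None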
First-order inference: an example (1/2)
Consider a domain made up of two individuals denoted by the constant symbols John and Richard, and the following KB:
(1) ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
(2) ∀y Greedy(y)
(3) King(John)
(4) Brother(Richard, John)
Intuitively, this entails Evil(John), i.e., KB ⊨ Evil(John).
The corresponding inference KB ⊢ Evil(John) can be obtained by using the above inference rules, as shown in the following.
First-order inference: an example (2/2)
• Applying Universal Instantiation to (1) produces:
  (5) King(John) ∧ Greedy(John) ⇒ Evil(John), with {x/John}
  (6) King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard), with {x/Richard}
• Applying Universal Instantiation to (2) produces:
  (7) Greedy(John), with {y/John}
  (8) Greedy(Richard), with {y/Richard}
• Applying And Introduction to (3) and (7) produces:
  (9) King(John) ∧ Greedy(John)
• Applying Modus Ponens to (5) and (9) produces:
  (10) Evil(John)
Generalized Modus Ponens
All but the last inference step in the above example can be seen as pre-processing steps whose aim is to “prepare” the application of Modus Ponens. Moreover, some of these steps (Universal Instantiation using the symbol Richard) are clearly useless for deriving the consequent of implication (1), i.e., Evil(John).
Indeed, the above steps can be combined into a single first-order inference rule, Generalized Modus Ponens (GMP): given atomic sentences (non-negated predicates) pi, pi′, i = 1, ..., n, and q, and a substitution θ such that Subst(θ, pi) = Subst(θ, pi′) for all i:
    (p1 ∧ p2 ∧ ... ∧ pn ⇒ q), p1′, p2′, ..., pn′
    ――――――――――――――――――――――――――
    Subst(θ, q)
Generalized Modus Ponens
In the previous example, GMP allows Evil(John) to be derived in a single step, and avoids unnecessary applications of inference rules like Universal Instantiation to sentences (1) and (2) with {x/Richard} or {y/Richard}.
In particular, GMP can be applied to sentences (1), (2) and (3), with θ = {x/John, y/John}: this immediately derives Evil(John).
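A sketch of GMP built on the unify and subst functions above (an illustration that assumes each antecedent is matched against the corresponding given sentence, in order):

def gmp(premises, conclusion, sentences):
    """If some theta unifies each premise p_i with the corresponding
    sentence p_i', return Subst(theta, conclusion); otherwise None."""
    theta = {}
    for p, p1 in zip(premises, sentences):
        theta = unify(p, p1, theta)
        if theta is None:
            return None
    return subst(theta, conclusion)

# sentences (1), (2), (3): theta = {x/John, y/John} derives Evil(John)
print(gmp((('King', 'x'), ('Greedy', 'x')), ('Evil', 'x'),
          [('King', 'John'), ('Greedy', 'y')]))   # ('Evil', 'John')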
Horn clauses in predicate logic
GMP allows the forward chaining (FC) and backward chaining
(BC) inference algorithms to be generalized to predicate logic.
This in turn requires generalizing the concept of Horn clause.
A Horn clause in predicate logic is an implication α ⇒ β in which:
• α is a conjunction of non-negated predicates
• β is a single non-negated predicate
• all variables (if any) are universally quantified, and the quantifiers appear at the beginning of the sentence
An example: ∀x (P(x) ∧ Q(x)) ⇒ R(x).
Single (possibly negated) predicates are also Horn clauses:
P(t1, ..., tn) ⇔ (True ⇒ P(t1, ..., tn))
¬P(t1, ..., tn) ⇔ (P(t1, ..., tn) ⇒ False)
Forward chaining in predicate logic
Similarly to propositional logic, FC consists of repeatedly applying
GMP in all possible ways, adding to the initial KB all newly derived
atomic sentences until no new sentence can be derived.
FC is normally triggered by the addition of new sentences into the
KB, to derive all their consequences. For instance, it can be used
in the Wumpus game when new percepts are added to the KB,
after each agent’s move.
Forward chaining in predicate logic
A simple (but inefficient) implementation of FC:

function Forward-Chaining(KB)
  local variable: new
  repeat
    new ← ∅ (the empty set)
    for each sentence s = (p1 ∧ ... ∧ pn ⇒ q) in KB do
      for each θ such that Subst(θ, p1 ∧ ... ∧ pn) = Subst(θ, p1′ ∧ ... ∧ pn′)
          for some p1′, ..., pn′ ∈ KB do
        q′ ← Subst(θ, q)
        if q′ ∉ KB and q′ ∉ new then add q′ to new
    add new to KB
  until new is empty
  return KB
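The same loop can be made runnable in Python on the representation used so far; a naive sketch, assuming ground facts (atomic sentences without variables) and rules whose conclusion variables all appear in the premises:

def forward_chaining(rules, facts):
    """rules: (premises, conclusion) pairs; facts: ground atomic sentences.
    Repeatedly applies GMP in all possible ways until nothing new is derived."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for theta in match_all(premises, facts, {}):
                q = subst(theta, conclusion)
                if q not in facts and q not in new:
                    new.add(q)
        if not new:
            return facts
        facts |= new

def match_all(premises, facts, theta):
    """Generate every theta that unifies each premise with some known fact."""
    if not premises:
        yield theta
        return
    for fact in facts:
        t = unify(premises[0], fact, theta)
        if t is not None:
            yield from match_all(premises[1:], facts, t)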
Forward chaining: an example (1/2)
The sentences in the exercise about Colonel West can be written as Horn clauses, after applying Existential Instantiation and then And Elimination to ∃x Missile(x) ∧ Owns(Nono, x) (the predicate Country is omitted for the sake of simplicity; the universal quantifiers are not shown to keep the notation uncluttered):
(1) (American(x) ∧ Hostile(y) ∧ Weapon(z) ∧ Sells(x, y, z)) ⇒ Criminal(x)
(2) (Missile(x) ∧ Owns(Nono, x)) ⇒ Sells(West, Nono, x)
(3) Enemy(x, America) ⇒ Hostile(x)
(4) Missile(x) ⇒ Weapon(x)
(5) American(West)
(6) Enemy(Nono, America)
(7) Owns(Nono, M)
(8) Missile(M)
Forward chaining: an example (2/2)
The FC algorithm carries out two repeat-until loops on the above KB. No new sentences can be derived after the second loop.
First iteration:
– GMP to (2), (7) and (8), with {x/M}:
  (9) Sells(West, Nono, M)
– GMP to (3) and (6), with {x/Nono}:
  (10) Hostile(Nono)
– GMP to (4) and (8), with {x/M}:
  (11) Weapon(M)
Second iteration:
– GMP to (1), (5), (10), (11) and (9), with {x/West, y/Nono, z/M}:
  (12) Criminal(West)
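With the forward-chaining sketch given earlier, a hypothetical encoding of sentences (1)–(8) reproduces this derivation:

rules = [
    ((('American', 'x'), ('Hostile', 'y'), ('Weapon', 'z'),
      ('Sells', 'x', 'y', 'z')), ('Criminal', 'x')),              # (1)
    ((('Missile', 'x'), ('Owns', 'Nono', 'x')),
     ('Sells', 'West', 'Nono', 'x')),                             # (2)
    ((('Enemy', 'x', 'America'),), ('Hostile', 'x')),             # (3)
    ((('Missile', 'x'),), ('Weapon', 'x')),                       # (4)
]
facts = [('American', 'West'), ('Enemy', 'Nono', 'America'),      # (5), (6)
         ('Owns', 'Nono', 'M'), ('Missile', 'M')]                 # (7), (8)
print(('Criminal', 'West') in forward_chaining(rules, facts))     # True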
Backward chaining in predicate logic
The first-order BC algorithm works similarly to its propositional version: it starts from a sentence (the query) to be proven and recursively applies GMP backward.
Note that every substitution made to unify an atomic sentence with the consequent of an implication must be propagated back to all of its antecedents.
If a subgoal unifies with more than one sentence in the KB (a fact, or the consequent of an implication), it is sufficient that at least one of these unifications allows the subgoal to be proven; the alternatives can be explored by backtracking.
For a possible implementation of BC, see the course textbook.
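As an illustration only (not the textbook's algorithm), a naive generator-based sketch reusing the unify and subst functions above; each rule's variables are renamed before use (standardizing apart), so that the bindings propagate correctly:

import itertools
_rename_ids = itertools.count(1)

def variables_in(t):
    """All variables occurring in a term."""
    if is_variable(t):
        return {t}
    vs = set()
    if isinstance(t, tuple):
        for a in t[1:]:
            vs |= variables_in(a)
    return vs

def standardize(premises, conclusion):
    """Rename a rule's variables so they cannot clash with the query's
    (a naive renaming scheme, adequate for this example)."""
    vs = variables_in(conclusion)
    for p in premises:
        vs |= variables_in(p)
    n = str(next(_rename_ids))
    r = {v: v + n for v in vs}
    return tuple(subst(r, p) for p in premises), subst(r, conclusion)

def backward_chaining(goal, rules, facts, theta=None):
    """Generate the substitutions proving goal by applying GMP backward."""
    if theta is None:
        theta = {}
    g = subst(theta, goal)
    for fact in facts:                      # the goal may be a known fact
        t = unify(g, fact, theta)
        if t is not None:
            yield t
    for premises, conclusion in rules:      # or match a rule's consequent...
        premises, conclusion = standardize(premises, conclusion)
        t = unify(g, conclusion, theta)
        if t is not None:                   # ...and prove every antecedent
            yield from prove_all(premises, rules, facts, t)

def prove_all(goals, rules, facts, theta):
    if not goals:
        yield theta
        return
    for t in backward_chaining(goals[0], rules, facts, theta):
        yield from prove_all(goals[1:], rules, facts, t)

# reusing the rules/facts encoding from the forward chaining example:
proof = next(backward_chaining(('Criminal', 'West'), rules, facts), None)
print(proof is not None)                    # True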
Backward chaining: an example (1/2)
A proof by BC can be represented as an And-Or graph, as in
propositional logic. The following graph (which should be read
depth first, left to right) shows the proof of the query
Criminal(West) using the previous sentences (1)–(8) as the KB.
[Proof tree constructed by backward chaining to prove that West is a criminal, read depth first, left to right (Figure 9.7 of the course textbook, where the selling rule is written with consequent Sells(West, x, Nono)). Criminal(West) is proven through its four conjuncts: American(West), a fact of the KB ({}); Weapon(y), proven via Missile(y) with {y/M1}; Sells(West, M1, z), proven via Missile(M1) and Owns(Nono, M1) with {z/Nono}; and Hostile(Nono), proven via Enemy(Nono, America) ({}). The bindings of each successful unification are shown next to the corresponding subgoal.]
Backward chaining: an example (2/2)
If the predicate Country is used, sentence (1) becomes:
∀x, y, z (American(x) ∧ Country(y) ∧ Hostile(y) ∧ Weapon(z) ∧ Sells(x, y, z)) ⇒ Criminal(x)
The sentences Country(America) and Country(Nono) must also be added to the KB.
In this case the additional conjunct Country(y) appears in the And link below Criminal(West). Two sentences in the KB unify with Country(y): Country(America) and Country(Nono).
If the unification with Country(America) is attempted first, the conjunct Hostile(America) cannot be proven, and the proof fails. In such a case a backtracking step can be applied, i.e., one of the other possible unifications can be attempted. Here, the unification with Country(Nono) allows the proof to be completed.
The resolution algorithm
Completeness theorem for predicate logic (Kurt Gödel, 1930):
    For every first-order sentence α entailed by a given KB (KB ⊨ α) there exists some inference algorithm that derives α (KB ⊢ α) in a finite number of steps.
The opposite does not hold: predicate logic is semidecidable.
A complete inference algorithm for predicate logic is resolution (J. A. Robinson, 1965), based on:
• converting sentences into Conjunctive Normal Form (CNF)
• the resolution inference rule
• proof by contradiction: to prove KB ⊨ α, prove that KB ∧ ¬α is unsatisfiable (contradictory)
• refutation-completeness: if KB ∧ ¬α is unsatisfiable, then resolution derives a contradiction in a finite number of steps
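As a minimal illustration of the resolution rule on ground clauses (propositional in spirit; first-order resolution additionally relies on unification), assuming clauses represented as frozensets of literals:

def negate(lit):
    """The complement of a literal; ('not', P) encodes a negated atom P."""
    return lit[1] if isinstance(lit, tuple) and lit[0] == 'not' else ('not', lit)

def resolve(c1, c2):
    """All resolvents of two ground clauses (frozensets of literals)."""
    return [(c1 - {l}) | (c2 - {negate(l)})
            for l in c1 if negate(l) in c2]

# proof by contradiction for KB = {P, P => Q} and query Q:
kb = [frozenset([('not', 'P'), 'Q']),       # P => Q in clause form
      frozenset(['P'])]
negated_query = frozenset([('not', 'Q')])   # not Q
step = resolve(kb[0], kb[1])[0]             # the resolvent {Q}
print(resolve(step, negated_query)[0])      # frozenset(): the empty clause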
Applications of forward chaining
Encoding condition-action rules to recommend actions, based on a data-driven approach:
• production systems (production = condition-action rule)
• expert systems
Applications of backward chaining
Logic programming languages: Prolog
• rapid prototyping
• symbol processing: compilers, natural language parsers
• developing expert systems
Example of a Prolog clause:
criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).
Running a program = proving a sentence (query) by BC, e.g.:
• ?- criminal(west)
  produces Yes
• ?- criminal(A)
  produces A = west, Yes
Applications of the resolution algorithm
Main application: theorem provers, used for
• assisting (not replacing) mathematicians
• proof checking
• verification and synthesis of hardware and software:
  – hardware design (e.g., entire CPUs)
  – programming languages (syntax)
  – software engineering (verifying program specifications, e.g., the RSA public-key encryption algorithm)
Beyond classical logic
Classical logic is based on two principles:
• bivalence: there exist only two truth values, true and false
• determinateness: each proposition has exactly one truth value
But how should one deal with propositions like the following?
• Tomorrow will be a sunny day: is this true or false, today?
• John is tall: is this “completely” true (or false)? This kind of problem is addressed by fuzzy logic.
• Goldbach’s conjecture: Every even number is the sum of a pair of prime numbers. Can we say this is either true or false, even if no proof has been found yet?
Expert systems
One of the main applications of knowledge-based systems:
• encoding human experts’ problem-solving knowledge in specific application domains for which no algorithmic solution exists (e.g., medical diagnosis)
• commonly used as decision support systems
• problem-independent architecture for knowledge representation and reasoning
• knowledge representation: IF...THEN... rules
Expert systems: historical notes
• Main motivation: limitations of the “general” problem-solving approaches pursued in AI until the 1960s
• First expert systems: 1970s
• Widespread use in the 1980s: many commercial applications
• Used in niche/focused domains since the 1990s
Main current applications of expert systems
• Medical diagnosis
  An example: the UK NHS Direct symptom checker (now closed)
  http://www.nhsdirect.nhs.uk/CheckSymptoms
• Geology, botany (e.g., rock and plant classification)
• Help desk
• Finance
• Military strategies
• Software engineering (e.g., design patterns)
Expert system architecture
[Diagram: the User interacts, through the User Interface, with the Inference Engine and the Explanation Module; the Inference Engine reasons on the Knowledge Base, which consists of facts (the working memory) and rules (the rule memory) and is built by the Knowledge Engineer together with the Domain Expert.]
Designing the Knowledge Base of expert systems
Two main, distinct roles:
• knowledge engineer
• domain expert
Main issues:
• defining suitable data structures for representing facts (problem instances) in the working memory
• suitably eliciting the experts’ knowledge (general knowledge) and encoding it as IF...THEN... rules (the rule memory)
How expert systems work
The inference engine implements a forward chaining-like algorithm, triggered by the addition of new facts to the working memory:

while there is some active rule do
  select one active rule (using conflict resolution strategies)
  execute the actions of the selected rule

Three kinds of actions exist:
– modifying one fact in the working memory
– adding one fact to the working memory
– removing one fact from the working memory
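A toy Python sketch of this recognize-act cycle (the rule, the fact format, and all names are made up for illustration; real shells such as CLIPS use far more efficient matching, e.g. the Rete algorithm):

def run(rules, memory):
    """A naive production-system loop over a working memory of fact dicts."""
    while True:
        # conflict set: all rules whose condition matches the working memory
        active = [r for r in rules if r['condition'](memory)]
        if not active:
            break
        # conflict resolution strategy: here, simply the highest priority
        rule = max(active, key=lambda r: r['priority'])
        rule['action'](memory)              # modify/add/remove facts

# a made-up rule: large new orders must be flagged for approval
memory = [{'kind': 'order', 'amount': 1200, 'status': 'new'}]
rules = [{
    'priority': 1,
    'condition': lambda m: any(f['status'] == 'new' and f['amount'] > 1000
                               for f in m),
    'action': lambda m: [f.update(status='needs-approval')
                         for f in m if f['status'] == 'new'],
}]
run(rules, memory)
print(memory)  # [{'kind': 'order', 'amount': 1200, 'status': 'needs-approval'}]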
Expert system shells: CLIPS
• C Language Integrated Production System
  http://clipsrules.sourceforge.net/
• Developed since 1984 at NASA’s Johnson Space Center
• Now maintained independently from NASA as public domain, open source, free software
• Currently used by government, industry, and academia
Main features:
– interpreted, functional language (with an object-oriented extension)
– specific data structures and instructions/functions for expert system implementation
– interfaces to other languages (C, Python, etc.)
Other expert system shells
Some free shells:
• Jess (Java platform)
  http://www.jessrules.com
• Drools (a Business Rules Management System)
  http://www.drools.org
Several commercial shells are also available.