Artificial Intelligence
Academic year 2016/2017
Giorgio Fumera
http://pralab.diee.unica.it
[email protected]
Pattern Recognition and Applications Lab, Department of Electrical and Electronic Engineering, University of Cagliari

Outline
Part I: Historical notes
Part II: Solving Problems by Searching (Uninformed search, Informed search)
Part III: Knowledge-based Systems (Logical languages, Expert systems)
Part IV: The Lisp Language
Part V: Machine Learning (Decision Trees, Neural Networks)

Part III: Knowledge-based Systems

Some motivating problems
Consider the following problems, and assume that your goal is to design rational agents, as computer programs, capable of autonomously solving them.

Automatic theorem proving
An example: write a computer program capable of proving or refuting the following statement.
Goldbach's conjecture (1742): for any even number p ≥ 4, there exists at least one pair of prime numbers q and r (identical or not) such that q + r = p.

Game playing
An example: write a computer program capable of playing the wumpus game, a text-based computer game (by G. Yob, c. 1972), used in a modified version as an AI toy problem.
- the wumpus world: a cave made up of connected rooms, bottomless pits, a heap of gold, and the wumpus, a beast that eats anyone who enters its room
- goal: starting from room (1,1), find the gold and go back to (1,1), without falling into a pit or hitting the wumpus
- the rooms' content is known only after entering them
- in rooms neighboring the wumpus and pits, a stench and a breeze are perceived, respectively
[Figure: a typical wumpus world; the agent is in the bottom left corner, facing right.]

Knowledge-based systems
Humans usually solve problems like the ones above by combining knowledge and reasoning. Knowledge-based systems aim at mechanizing these two high-level human capabilities:
- representing knowledge about the world
- reasoning to derive new knowledge (and to guide action)

An example
Sketch of a possible reasoning process for deciding the next move in the wumpus game, starting from the configuration shown above (not all moves are shown).
[Figures (from the textbook, legend: A = Agent, B = Breeze, G = Glitter/Gold, OK = safe square, P = Pit, S = Stench, V = Visited, W = Wumpus): the first step taken by the agent in the wumpus world — (a) the initial situation, after percept [None, None, None, None, None]; (b) after one move, with percept [None, Breeze, None, None, None] — and two later stages in the progress of the agent, including the situation after the third move, with percept [Stench, None, None, None, None].]
Main approaches to AI system design
Declarative: explicit representation, in a knowledge base, of
- background knowledge (e.g., the rules of the wumpus game)
- knowledge about one specific problem instance (e.g., what the agent knows about a specific wumpus cave it is exploring)
- the agent's goal
Actions are derived by reasoning.
Procedural: the desired behavior (the actions) is encoded directly as program code (no explicit knowledge representation and reasoning).

Architecture of knowledge-based systems
[Diagram: sensors update the knowledge base; the reasoning module (inference engine) updates the knowledge base and derives the actions carried out by the actuators in the environment.]
Main feature: separation between knowledge representation and reasoning
- knowledge base: contains all the agent's knowledge about its environment, in declarative form
- inference engine: implements a reasoning process to derive new knowledge and to make decisions

Knowledge representation and reasoning
Logic is one of the main tools used in AI for
- knowledge representation: logical languages (propositional logic, predicate (first-order) logic)
- reasoning: inference rules and algorithms
Some of the main contributions:
- Aristotle (4th cent. BC): the "laws of thought"
- G. Boole (1815–64): Boolean algebra (propositional logic)
- G. Frege (1848–1925): predicate logic
- K. Gödel (1906–78): incompleteness theorem

Main applications
- automatic theorem provers
- logic programming languages (Prolog, etc.)
- expert systems

A short introduction to logic
- What is logic?
- Propositions, argumentations
- Logical (formal) languages
- Logical reasoning

Logic
Definition (a possible one): logic is the study of the conditions under which an argumentation (reasoning) is correct.
The above definition involves the following concepts:
- argumentation: a set of statements consisting of some premises and one conclusion. A famous example: All men are mortal; Socrates is a man; then, Socrates is mortal
- correctness: the conclusion cannot be false when all the premises are true
- proof: a procedure to assess correctness

Propositions
Natural language is very complex, vague, and difficult to formalize. Logic considers argumentations made up of only a subset of statements: propositions (or declarative statements).
Definition: a proposition is a statement expressing a concept that can be either true or false.
Example
- Socrates is a man
- Two and two makes four
- If the Earth had been flat, then Columbus would not have reached America
A counterexample: Read that book!

Simple and complex propositions
Definition: a proposition is
- simple, if it does not contain simpler propositions
- complex, if it is made up of simpler propositions connected by logical connectives
Example
Simple propositions: Socrates is a man; Two and two makes four.
Complex propositions: A tennis match can be won or lost; If the Earth had been flat, then Columbus would not have reached America.

Argumentations
When can a proposition be considered true or false? This is a philosophical question. Logic does not address it: it only analyzes the structure of an argumentation.
Example: All men are mortal; Socrates is a man; then, Socrates is mortal. Is the structure of this argumentation correct, whatever its actual propositions are (i.e., regardless of whether they are true or false)? Informally, the structure of this argumentation is: all P are Q; x is P; then x is Q.

Formal languages
Logic provides formal languages for representing (the structure of) propositions, in the form of sentences. A formal language is defined by a syntax and a semantics.
Definition
- syntax (grammar): rules that define the "well-formed" sentences
- semantics: rules that define the "meaning" of sentences
Examples of formal languages: arithmetic (propositions about numbers); programming languages (instructions to be executed by a computer).

Natural vs logical (formal) languages
In natural languages:
- syntax is not rigorously defined
- semantics defines the "content" of a statement, i.e., what it refers to in the real world
Example (syntax)
- The book is on the table: a syntactically correct statement, with a clear semantics
- Book the on is table the: a syntactically incorrect statement; no meaning can be attributed to it
- Colorless green ideas sleep furiously (N. Chomsky, Syntactic Structures, 1957): syntactically correct, but what does it mean?

Logical languages:
- syntax: formally defined
- semantics: rules that define the truth value of each well-formed sentence with respect to each possible model (a possible "world" represented by that sentence)
Example (arithmetic)
- syntax: x + y = 4 is a well-formed sentence, x 4y + = is not
- model: the symbol '4' represents the natural number four, 'x' and 'y' any natural number, '+' the sum operator, etc.
- semantics: x + y = 4 is true for x = 1 and y = 3, for x = 2 and y = 2, etc.

Logical entailment
Logical reasoning is based on the relation of logical entailment between sentences, which defines when a sentence follows logically from another one.
Definition: the sentence α entails the sentence β if and only if, in every model in which α is true, β is also true. In symbols: α ⊨ β
Example (from arithmetic): x + y = 4 ⊨ x = 4 − y, because in every model (i.e., for any assignment of numbers to x and y) in which x + y = 4 is true, x = 4 − y is also true.
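This definition can be checked mechanically, at least over a finite set of models. A minimal Python sketch for the arithmetic example above (illustrative only: it sweeps a finite range of assignments, so it can refute an entailment but cannot strictly prove it over all natural numbers):

# Sanity check of the entailment x + y = 4 |= x = 4 - y,
# enumerating a finite set of models (assignments to x and y).
premise = lambda x, y: x + y == 4
conclusion = lambda x, y: x == 4 - y

models = [(x, y) for x in range(10) for y in range(10)]
print(all(conclusion(x, y) for (x, y) in models if premise(x, y)))  # True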
Logical inference
Definition
- logical inference: the process of deriving conclusions from premises
- inference algorithm: a procedure that derives sentences (conclusions) from other sentences (premises), in a given formal language
Formally, the fact that an inference algorithm A derives a sentence α from a set of sentences ("knowledge base") KB is written as: KB ⊢_A α

Properties of inference algorithms
Definition
- soundness (truth-preservation): an inference algorithm is sound if it derives only sentences entailed by the premises, i.e., if KB ⊢_A α, then KB ⊨ α
- completeness: an inference algorithm is complete if it derives all the sentences entailed by the premises, i.e., if KB ⊨ α, then KB ⊢_A α
A sound algorithm derives conclusions that are guaranteed to be true in any world in which the premises are true.

Inference algorithms operate only at the syntactic level:
- sentences are physical configurations of an agent (e.g., bits in registers)
- inference algorithms construct new physical configurations from old ones
- logical reasoning should ensure that the new configurations represent aspects of the world that actually follow from the ones represented by the old configurations
[Diagram: sentences are related to aspects of the real world through semantics; the entailment relation between sentences mirrors the "follows" relation between the corresponding aspects of the real world.]

Applications of inference algorithms
In AI, inference is used to answer two main kinds of questions:
- does a given conclusion α logically follow from the agent's knowledge KB? (i.e., KB ⊨ α?)
- what are all the conclusions that logically follow from the agent's knowledge? (i.e., find all α such that KB ⊨ α)
Example (the wumpus world)
- does a breeze in room (2,1) entail the presence of a pit in room (2,2)?
- what conclusions can be derived about the presence of pits and of the wumpus in each room, from the current knowledge?

Inference algorithms: model checking
The definition of entailment can be directly applied to construct a simple inference algorithm.
Definition. Model checking: given a set of premises KB and a sentence α, enumerate all possible models and check whether α is true in every model in which KB is true.
Example (arithmetic)
- KB: {x + y = 4}
- α: y = 4 − x
Is the inference {x + y = 4} ⊢ y = 4 − x correct? Model checking: enumerate all possible pairs of numbers x, y, and check whether y = 4 − x is true whenever x + y = 4 is.

The issue of grounding
A knowledge base KB (a set of sentences considered true) is just "syntax" (a physical configuration of the agent):
- what is the connection between a KB and the real world?
- how does one know that KB is true in the real world?
This is the same philosophical question met before. For humans:
- a set of beliefs (the set of statements considered true) is a physical configuration of our brain
- how do we know that our beliefs are true in the real world?
A simple answer can be given for agents (e.g., computer programs or robots): the connection is created by
- sensors, e.g., perceiving a breeze in the wumpus world
- learning, e.g., learning that when a breeze is perceived, there is a pit in some adjacent room
Of course, perception and learning are fallible.

Architecture of knowledge-based systems revisited
[Diagram: the same architecture as above — sensors and the inference engine update the knowledge base; the inference engine derives the actions carried out by the actuators in the environment.]
If logical languages are used:
- knowledge base: a set of sentences in a given logical language
- inference engine: an inference algorithm for the same logical language

Logical languages
Propositional logic
- the simplest logical language
- an extension of Boolean algebra (G. Boole, 1815–64)
Predicate (or first-order) logic
- more expressive and concise than propositional logic
- seminal work: G. Frege (1848–1925)

Propositional logic: syntax
- Atomic sentences: either a propositional symbol that denotes a given proposition (usually written in capitals), e.g., P, Q, ..., or a propositional symbol with a fixed meaning: True and False
- Complex sentences: atomic or (recursively) complex sentences connected by logical connectives (corresponding to natural language connectives like and, or, not, etc.)
- Logical connectives (only the commonly used ones are shown; different notations exist): ∧ (and), ∨ (or), ¬ (not), ⇒ (implies), ⇔ (if and only if / logical equivalence)

A formal grammar in Backus-Naur Form (BNF):
Sentence → AtomicSentence | ComplexSentence
AtomicSentence → True | False | Symbol
Symbol → P | Q | R | ...
ComplexSentence → ¬Sentence | (Sentence ∧ Sentence) | (Sentence ∨ Sentence) | (Sentence ⇒ Sentence) | (Sentence ⇔ Sentence)

Propositional logic: semantics
Semantics of logical languages:
- "meaning" of a sentence: its truth value with respect to a particular model
- model: a possible assignment of truth values to all the propositional symbols that appear in the sentence
Example: the sentence P ∧ Q ⇒ R has 2^3 = 8 possible models. One model is {P = True, Q = False, R = True}.
Note: models are abstract mathematical objects with no unique connection to the real world (e.g., P may stand for any proposition in natural language).

- Atomic sentences: True is true in every model; False is false in every model; the truth value of every propositional symbol (atomic sentence) must be specified in the model
- Complex sentences: their truth value is recursively defined as a function of the simpler sentences and of the truth tables of the logical connectives they contain

Truth tables of the commonly used connectives:

P     Q     | ¬P    P∧Q   P∨Q   P⇒Q   P⇔Q
false false | true  false false true  true
false true  | true  false true  true  false
true  false | false false true  false false
true  true  | false true  true  true  true

Example: determining the truth value of ¬P ∧ (Q ∨ R) in all possible models:

P     Q     R     | Q∨R   | ¬P ∧ (Q∨R)
false false false | false | false
false false true  | true  | true
false true  false | true  | true
false true  true  | true  | true
true  false false | false | false
true  false true  | true  | false
true  true  false | true  | false
true  true  true  | true  | false
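The recursive semantics above is easy to reproduce in code. A minimal Python sketch that enumerates the 2^3 models of P, Q, R and evaluates ¬P ∧ (Q ∨ R) in each of them (representing the sentence as a function is an illustrative choice):

from itertools import product

# Evaluate the sentence ¬P ∧ (Q ∨ R) in all 8 models.
sentence = lambda p, q, r: (not p) and (q or r)

for p, q, r in product([False, True], repeat=3):
    print(p, q, r, '->', sentence(p, q, r))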
Propositional logic and natural language
The truth tables of and, or and not are intuitive, but capture only a subset of their meaning in natural language.
Example
- He fell down and broke his leg: here and includes a temporal and a causal relation (He broke his leg and fell down does not have the same meaning)
- A tennis match can be won or lost: a disjunctive (exclusive) or, usually denoted in logic by ⊕

The truth table of P ⇒ Q may not fit one's intuitive understanding of "P implies Q" or "if P then Q":
- 5 is odd implies Tokyo is the capital of Japan: meaningless in natural language, true in propositional logic (P ⇒ Q does not assume causation or relevance between P and Q)
- 5 is even implies 10 is even: false in natural language, true in propositional logic (P ⇒ Q is true whenever P is false)
Correct interpretation of P ⇒ Q: if P is true, then I am claiming that Q is true; otherwise, I am making no claim (so, I cannot make a false claim). In other words, the only way for P ⇒ Q to be false is when P is true and Q is false. This models the fact that P is a sufficient but not necessary condition for Q to be true.

Exercise
1. Define a set of propositional symbols to represent the wumpus world: the position of the agent, wumpus, pits, etc.
2. Define the model corresponding to the configuration in the figure below
3. Define the part of the initial agent's KB corresponding to its knowledge about the cave configuration in the figure below
4. Write a sentence for the proposition: If the wumpus is in room (3,1) then there is a stench in rooms (2,1), (4,1) and (3,2)
[Figure: the typical wumpus world shown above.]

Solution (1/4)
A possible choice of propositional symbols:
- A1,1 ("the agent is in room (1,1)"), A1,2, ..., A4,4
- W1,1 ("the wumpus is in room (1,1)"), W1,2, ..., W4,4
- P1,1 ("there is a pit in room (1,1)"), P1,2, ..., P4,4
- G1,1 ("the gold is in room (1,1)"), G1,2, ..., G4,4
- B1,1 ("there is a breeze in room (1,1)"), B1,2, ..., B4,4
- S1,1 ("there is a stench in room (1,1)"), S1,2, ..., S4,4

Solution (2/4)
Model corresponding to the considered configuration:
- A1,1 is true; A1,2, A1,3, ... are false
- W3,1 is true; W1,1, W1,2, ... are false
- P1,3, P3,3, P4,4 are true; P1,1, P1,2, ... are false
- G3,2 is true; G1,1, G1,2, ... are false
- B1,2, B1,4, ... are true; B1,1, B1,3, ... are false
- S2,1, S4,1, S3,2 are true; S1,1, S1,2, ... are false

Solution (3/4)
What the agent knows in the starting configuration:
- I am in room (1,1) (the starting position of the game)
- I am alive: there is no pit nor the wumpus in this room
- there is no gold in this room
- I perceive neither a breeze nor a stench
The corresponding agent's KB in propositional logic (the set of sentences the agent believes to be true):
- A1,1, ¬A1,2, ¬A1,3, ..., ¬A4,4 (16 sentences)
- ¬W1,1, ¬P1,1
- ¬G1,1
- ¬B1,1, ¬S1,1

Solution (4/4)
One may think of translating the considered proposition using the implication connective (⇒):
W3,1 ⇒ (S2,1 ∧ S4,1 ∧ S3,2)
However, since there is only one wumpus, the converse is also true:
(S2,1 ∧ S4,1 ∧ S3,2) ⇒ W3,1
An equivalent, more concise way to express both sentences:
(S2,1 ∧ S4,1 ∧ S3,2) ⇔ W3,1

Inference: model checking
Goal of inference: given a KB and a sentence α, decide whether KB ⊨ α.
A simple inference algorithm: model checking (see above).
Application to propositional logic:
- enumerate all possible models of the sentences in KB ∪ {α}
- check whether α is true in every model in which KB is true
Implementation: truth tables.

Model checking: an example
Determine whether {P ∨ Q, P ⇒ R, Q ⇒ R} ⊨ P ∨ R, using model checking.

P     Q     R     | P∨Q   P⇒R   Q⇒R   | P∨R
false false false | false true  true  | false
false false true  | false true  true  | true
false true  false | true  true  false | false
false true  true  | true  true  true  | true   (*)
true  false false | true  false true  | true
true  false true  | true  true  true  | true   (*)
true  true  false | true  false false | true
true  true  true  | true  true  true  | true   (*)

Answer: yes, because the conclusion is true in every model in which all the premises are true (the rows marked with (*)).

Properties of model checking
- Soundness: yes, it directly implements the definition of entailment
- Completeness: yes; it works for any finite KB and α, since the corresponding set of models is finite
- Computational complexity: O(2^n), where n is the number of propositional symbols appearing in KB and α
Its exponential computational complexity makes model checking infeasible when the number of propositional symbols is high.
Example: in the exercise about the wumpus world, 96 propositional symbols have been used; the corresponding truth table is made up of 2^96 ≈ 10^28 rows.

Inference: general concepts
- Two sentences α and β are logically equivalent (α ⇔ β) if they are true under the same models, i.e., if and only if α ⊨ β and β ⊨ α. An example: (P ∧ Q) ⇔ (Q ∧ P) (see the truth tables)
- A sentence is valid if it is true in all models. Such sentences are also called tautologies (an example: P ∨ ¬P)
- A sentence is satisfiable if it is true in some model. An example: P ∧ Q

Two useful properties related to the above concepts:
- for any α and β, α ⊨ β if and only if α ⇒ β is valid; for instance, given a set KB of premises and a possible conclusion α, the model checking inference algorithm works by checking whether (KB ⇒ α) is valid
- satisfiability is related to the standard mathematical proof technique of reductio ad absurdum (proof by refutation or by contradiction): α ⊨ β if and only if (α ∧ ¬β) is unsatisfiable

Inference rules
Practical inference algorithms are based on inference rules. An inference rule represents a standard pattern of inference: it implements a simple reasoning step, whose soundness can be easily proven, that can be applied to a set of premises having a specific structure to derive a conclusion. Inference rules are written by listing the premises above a horizontal line and the conclusion below it; here they are shown inline as premises / conclusion.

Examples of inference rules (α and β denote any propositional sentences):
- And-Elimination: (α1 ∧ α2) / αi, i = 1, 2
- And-Introduction: α1, α2 / (α1 ∧ α2)
- Or-Introduction: α1 / (α1 ∨ α2) (α2 can be any sentence)
- First De Morgan's law: ¬(α1 ∧ α2) / (¬α1 ∨ ¬α2)
- Second De Morgan's law: ¬(α1 ∨ α2) / (¬α1 ∧ ¬α2)
- Double Negation: ¬(¬α) / α
- Modus Ponens: (α ⇒ β), α / β
The first five rules above easily generalize to any set of sentences α1, ..., αn.

Soundness of inference rules
Since inference rules usually involve a few sentences, their soundness can be easily proven using model checking. An example: Modus Ponens.

α     β     | α⇒β
false false | true
false true  | true
true  false | false
true  true  | true

In the only row where both premises (α and α ⇒ β) are true, the conclusion β is true as well.

Inference algorithms
Given a set of premises KB and a hypothetical conclusion α, the goal of an inference algorithm A is to find a proof KB ⊢_A α (if any), i.e., a sequence of applications of inference rules that leads from KB to α.
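As a concrete illustration, a minimal Python sketch of model checking for the example above (hard-coded for these premises and this conclusion; a generic checker would take KB and α as inputs):

from itertools import product

implies = lambda a, b: (not a) or b
premises = lambda p, q, r: (p or q) and implies(p, r) and implies(q, r)
conclusion = lambda p, q, r: p or r

# KB |= α iff α is true in every model in which KB is true.
entailed = all(conclusion(*m)
               for m in product([False, True], repeat=3)
               if premises(*m))
print(entailed)  # True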
Inference algorithms: an example
In the initial configuration of the wumpus game, the agent's KB includes:
(a) ¬B1,1 (the current percept)
(b) ¬B1,1 ⇒ (¬P1,2 ∧ ¬P2,1) (one of the rules of the game)
The agent may want to know whether room (1,2) contains a pit, i.e., whether KB ⊨ P1,2:
- applying Modus Ponens to (a) and (b), it derives: (c) ¬P1,2 ∧ ¬P2,1
- applying And-Elimination to (c), it derives ¬P1,2
So, it can conclude that room (1,2) does not contain a pit.

Properties of inference algorithms
Three main issues:
- is a given inference algorithm sound (correct)?
- is it complete?
- what is its computational complexity?
It is not difficult to see that, if the considered inference rules are sound, so is an inference algorithm based on them. Completeness is more difficult to prove: it depends on the set of available inference rules, and on the way they are applied.

What about computational complexity? Note that finding a proof KB ⊢_A α, given a set of inference rules R, can also be formulated as a search problem:
- initial state: the set of sentences KB
- state space: any set of sentences made up of the union of KB and of the sentences that can be derived by applying to KB any sequence of rules in R
- operators: the inference rules in R
- goal state(s): sets of sentences including α

This suggests that the computational complexity can be very high:
- the solution depth may be high (some proofs require a large number of steps)
- the branching factor can be high: several inference rules can be applicable to a given KB, and each of them can be applicable to several sets of sentences. An example: the 16 sentences of the agent's KB at the beginning of the wumpus game, A1,1, ¬A1,2, ¬A1,3, ..., ¬A4,4, allow And-Introduction to be applied in Σ_{k=2}^{16} C(16, k) different ways
Efficiency can be improved by ignoring the propositions that are irrelevant to the conclusion α. For instance, to prove ¬P ∧ ¬Q, propositions like R, S and T can be ignored.

Horn clauses
In many domains of practical interest, the whole KB can be expressed in the form of "if ... then ..." propositions that can be encoded as Horn clauses, i.e., implications where:
- the antecedent is a conjunction (∧) of atomic sentences (non-negated propositional symbols)
- the consequent is a single atomic sentence
P1 ∧ ... ∧ Pn ⇒ Q
For instance, S2,1 ∧ S4,1 ∧ S3,2 ⇒ W3,1 is a Horn clause. As particular cases, atomic sentences (i.e., propositional symbols) and their negations can also be rewritten as Horn clauses.
Indeed, since (P ⇒ Q) ⇔ (¬P ∨ Q):
P ⇔ (¬True ∨ P) ⇔ (True ⇒ P)
¬P ⇔ (¬P ∨ False) ⇔ (P ⇒ False)

Forward and backward chaining
Two practical inference algorithms exist in the particular case when:
- the KB can be expressed as a set of Horn clauses
- the conclusion is an atomic, non-negated sentence
These algorithms, named forward and backward chaining, exhibit the following characteristics:
- they are complete
- they use a single inference rule (Modus Ponens)
- their computational complexity is linear in the size of the KB

Forward chaining
Given a KB made up of Horn clauses, forward chaining (FC) derives all the entailed atomic (non-negated) sentences:

function Forward-Chaining (KB)
  repeat
    apply MP in all possible ways to the sentences in KB
    add to KB the derived sentences not already present (if any)
  until no new sentence has been derived
  return KB

FC is an example of data-driven reasoning: it starts from the known data, and derives their consequences. For instance, in the wumpus game FC could be used to update the agent's knowledge about the environment (the presence of pits in each room, etc.), based on the new percepts after each move. The inference engine of expert systems (described later) is inspired by the FC inference algorithm.

Forward chaining: an example (1/2)
Consider the KB shown below, made up of Horn clauses:
1. P ⇒ Q
2. L ∧ M ⇒ P
3. B ∧ L ⇒ M
4. A ∧ P ⇒ L
5. A ∧ B ⇒ L
6. A
7. B

Forward chaining: an example (2/2)
By applying FC one obtains:
8. the only implication whose premises (individual propositional symbols) are all in the KB is 5: MP derives L and adds it to the current KB
9. now the premises of 3 are all true: MP derives M and adds it to the KB
10. the premises of 2 have become all true: MP derives P and adds it to the KB
11. the premises of 1 and 4 are now all true: MP derives Q from 1 and adds it to the KB, but disregards 4, since its consequent (L) is already present in the KB
12. no new sentences can be derived: FC ends and returns the updated KB, containing the original sentences 1–7 and the ones derived in the above steps: {L, M, P, Q}

Backward chaining
For a given KB made up of Horn clauses and a given atomic, non-negated sentence α, FC can be used to prove whether or not KB ⊨ α: one just has to check whether α is present among the derived sentences. However, backward chaining (BC) is more efficient for this purpose. BC recursively applies MP "backwards". It exploits the fact that KB ⊨ α if and only if:
- either α ∈ KB (this terminates the recursion)
- or KB contains some implication β1 ∧ ... ∧ βn ⇒ α, and (recursively) KB ⊨ β1, ..., KB ⊨ βn
The sentence α to be proven is also called the query.

function Backward-Chaining (KB, α)
  if α ∈ KB then return True
  let B be the set of sentences of KB having α as the consequent
  for each β ∈ B
    let β1, β2, ... be the propositional symbols in the antecedent of β
    if Backward-Chaining (KB, βi) = True for all βi's then return True
  return False

BC is a form of goal-directed reasoning. For instance, in the wumpus game it could be used to answer queries like: given the current agent's knowledge, is moving upward the best action? The computational complexity of BC is even lower than that of FC, since BC focuses only on the relevant sentences. The Prolog logic programming language is based on the predicate logic version of the BC inference algorithm (described later).
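Before walking through a larger backward chaining example, here is a minimal Python sketch of both algorithms over propositional Horn clauses, using the KB of the forward chaining example above (the clause representation and the loop-cutting visited set are illustrative choices, not part of the lecture's pseudocode):

# A Horn clause is a pair (antecedents, consequent); known atomic
# facts are clauses with an empty list of antecedents.
KB = [
    (['P'], 'Q'),       # 1. P => Q
    (['L', 'M'], 'P'),  # 2. L and M => P
    (['B', 'L'], 'M'),  # 3. B and L => M
    (['A', 'P'], 'L'),  # 4. A and P => L
    (['A', 'B'], 'L'),  # 5. A and B => L
    ([], 'A'),          # 6. A
    ([], 'B'),          # 7. B
]

def forward_chaining(kb):
    # Derive all entailed atomic sentences by repeated Modus Ponens.
    facts = {q for (ps, q) in kb if not ps}
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in kb:
            if antecedents and consequent not in facts \
               and all(p in facts for p in antecedents):
                facts.add(consequent)
                changed = True
    return facts

def backward_chaining(kb, query, visited=frozenset()):
    # Prove the query by recursively proving the antecedents of some
    # clause whose consequent matches it; 'visited' cuts off loops.
    if query in visited:
        return False
    for antecedents, consequent in kb:
        if consequent == query:
            if all(backward_chaining(kb, p, visited | {query})
                   for p in antecedents):
                return True
    return False

print(forward_chaining(KB))        # {'A', 'B', 'L', 'M', 'P', 'Q'}
print(backward_chaining(KB, 'Q'))  # True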
Backward chaining: an example (1/8)
Consider a KB representing the rules followed by a financial institution for deciding whether to grant a loan to an individual. The following propositional symbols are used:
- OK: the loan should be approved
- COLLAT: the collateral for the loan is satisfactory
- PYMT: the applicant is able to repay the loan
- REP: the applicant has a good financial reputation
- APP: the appraisal on the collateral is sufficiently greater than the loan amount
- RATING: the applicant has a good credit rating
- INC: the applicant has a good, steady income

Backward chaining: an example (2/8)
The KB is made up of the five rules (implications) below, and of the data about a specific applicant, encoded by the four sentences 6–9 (all of them are Horn clauses):
1. COLLAT ∧ PYMT ∧ REP ⇒ OK
2. APP ⇒ COLLAT
3. RATING ⇒ REP
4. INC ⇒ PYMT
5. BAL ∧ REP ⇒ OK
6. APP
7. RATING
8. INC
9. ¬BAL
Should the loan be approved for this specific applicant? This amounts to proving whether OK is entailed by the KB, i.e., whether KB ⊨ OK.

Backward chaining: an example (3/8)
The recursive BC proof KB ⊢_BC OK can be conveniently represented as an AND-OR graph, a tree-like graph in which:
- multiple links joined by an arc indicate a conjunction (every link must be proven)
- multiple links without an arc indicate a disjunction (any link can be proven)

Backward chaining: an example (4/8)
The first call Backward-Chaining(KB, OK) is represented by the tree root, corresponding to the sentence to be proven. Since OK ∉ KB, the implications having OK as the consequent are searched for. There are two such sentences: 1 and 5. The BC procedure tries to prove all the antecedents of at least one of them. Considering 5 first, a recursive call to Backward-Chaining is made for each of its two antecedents, BAL and REP, represented by an AND-link.

Backward chaining: an example (5/8)
Consider the call Backward-Chaining(KB, REP): since REP ∉ KB, and the only implication having REP as the consequent is 3, another recursive call is made for the antecedent of 3, RATING. The call Backward-Chaining(KB, RATING) returns True, since RATING ∈ KB, and thus the call Backward-Chaining(KB, REP) also returns True.

Backward chaining: an example (6/8)
However, the call Backward-Chaining(KB, BAL) returns False, since BAL ∉ KB and there are no implications having BAL as the consequent.
Therefore, the first call Backward-Chaining(KB, OK) is not able to prove OK through this AND-link. The other sentence in the KB having OK as the consequent, 1, is now considered, and another AND-link is generated, with three recursive calls for the antecedents of 1: COLLAT, PYMT and REP.

Backward chaining: an example (7/8)
The call Backward-Chaining(KB, COLLAT) generates in turn another recursive call to prove APP, the antecedent of the only implication having COLLAT as the consequent, 2. The call Backward-Chaining(KB, APP) returns True, since APP ∈ KB, and thus Backward-Chaining(KB, COLLAT) also returns True.

Backward chaining: an example (8/8)
Similarly, the calls Backward-Chaining(KB, PYMT) and Backward-Chaining(KB, REP) return True. The corresponding AND-link is then proven, which finally allows the first call Backward-Chaining(KB, OK) to return True. The proof KB ⊢_BC OK is then successfully completed.

Resolution algorithm
FC and BC exhibit a low computational complexity. They are also complete, but limited to:
- KBs made up of Horn clauses
- conclusions consisting of a non-negated propositional symbol
It turns out that a complete inference algorithm for full propositional logic also exists: the resolution algorithm, which uses a single inference rule, itself named resolution. Given any KB and any sentence α, the resolution algorithm proves whether or not KB ⊨ α. Its computational complexity is, however, much higher than that of FC and BC. The predicate logic version of the resolution algorithm is used in automatic theorem provers, to assist mathematicians in developing complex proofs.

Exercise 1
Construct the agent's initial KB for the wumpus game. The KB should contain:
- the rules of the game: the agent starts in room (1,1); there is a breeze in the rooms adjacent to pits, etc.
- rules to decide the agent's move at each step of the game
Note that the KB must be updated at each step of the game:
1. adding the percepts in the current room (from sensors)
2. reasoning to derive new knowledge about the position of pits and wumpus
3. reasoning to decide the next move
4. updating the agent's position

Rules of the wumpus game:
- the agent starts in room (1,1): A1,1 ∧ ¬A1,2 ∧ ... ∧ ¬A4,4
- there is a breeze in the rooms adjacent to pits: P1,1 ⇒ (B2,1 ∧ B1,2), P1,2 ⇒ (B1,1 ∧ B2,2 ∧ B1,3), ... (one proposition in natural language, sixteen sentences in propositional logic — one for each room)
- there is only one wumpus: (W1,1 ∧ ¬W1,2 ∧ ¬W1,3 ∧ ... ∧ ¬W4,4) ∨ (¬W1,1 ∧ W1,2 ∧ ¬W1,3 ∧ ... ∧ ¬W4,4) ∨ ... (one proposition in natural language, a disjunction of sixteen conjunctions in propositional logic — one for each room)
- ...
Often, one concise proposition in natural language needs to be represented by many complex sentences in propositional logic.

How can the KB be updated to account for the change of the agent's position after each move?
For example, A1,1 is true in the starting position, and becomes false after the first move:
- adding ¬A1,1 makes the KB contradictory, since A1,1 is still present ...
- ... but inference rules do not allow removing sentences
Solution: use a different propositional symbol for each time step, e.g., A^t_{i,j}, t = 1, 2, ...:
- initial KB: A^1_{1,1}, ¬A^1_{1,2}, ..., ¬A^1_{4,4}
- if the agent moves to (1,2), the following sentences must be added to the KB: ¬A^2_{1,1}, A^2_{1,2}, ¬A^2_{1,3}, ..., ¬A^2_{4,4}; and so on
Things get complicated ...

Exercise 2
The following argumentation (an example of syllogism) is intuitively correct; prove its correctness using propositional logic: All men are mortal; Socrates is a man; then, Socrates is mortal.
Three distinct propositional symbols must be used: P (All men are mortal), Q (Socrates is a man), R (Socrates is mortal). Therefore:
- premises: {P, Q}
- conclusion: R
Do the premises entail the conclusion, i.e., {P, Q} ⊨ R? Model checking easily allows one to prove that the answer is no: in the model {P = True, Q = True, R = False}, the premises are true but the conclusion is false. What's wrong?

Limitations of propositional logic
Main problems: limited expressive power, lack of conciseness.
Example (wumpus world): even small knowledge bases (in natural language) require a large number of propositional symbols and sentences.
Example (syllogisms): inferences involving the structure of atomic sentences (All men are mortal, ...) cannot be made.

From propositional to predicate logic
The description of many domains of interest for real-world applications (e.g., mathematics, philosophy, AI) involves the following elements of natural language:
- nouns denoting objects (or persons), e.g., the wumpus and pits; Socrates and Plato; the numbers one, two, etc.
- verbs denoting properties of individual objects and relations between them, e.g., Socrates is a man; five is prime; four is lower than five; the sum of two and two equals four
- some relations between objects can be represented as functions, e.g., "father of", "two plus two"
- facts involving some or all objects, e.g., all squares neighboring the wumpus are smelly; some numbers are prime
These elements cannot be represented in propositional logic, and require the more expressive predicate logic.

Predicate logic: models
A model in predicate logic consists of:
- a domain of discourse: a set of objects, e.g., the set of natural numbers, or a set of individuals (Socrates, Plato, ...)
- relations between objects; each relation is represented as the set of tuples of objects that are related, e.g.:
  - being greater than (binary relation): {(2,1), (3,1), ...}
  - being a prime number (unary relation): {2, 3, 5, 7, 11, ...}
  - being the sum of (ternary relation): {(1,1,2), (1,2,3), ...}
  - being the father of (binary relation): {(John, Mary), ...}
  (unary relations are also called properties)
- functions that map tuples of objects to a single object, e.g.:
  - plus: (1,1) ↦ 2, (1,2) ↦ 3, ...
  - father of: Mary ↦ John, ...
Note that relations and functions are defined extensionally, i.e., by explicitly enumerating the corresponding tuples.
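Since a finite model is just data, it can be written down directly. A minimal Python sketch of a first-order model represented extensionally (the small domain and the names are illustrative):

domain = {1, 2, 3, 4}

# Relations as sets of tuples, as in the extensional definition above.
relations = {
    'GreaterThan': {(x, y) for x in domain for y in domain if x > y},
    'Prime': {(2,), (3,)},
    'Sum': {(x, y, x + y) for x in domain for y in domain
            if x + y in domain},
}

# Functions as mappings from tuples of objects to a single object.
functions = {
    'Plus': {(x, y): x + y for x in domain for y in domain
             if x + y in domain},
}

# An atomic sentence is true iff the tuple of its arguments belongs
# to the relation denoted by its predicate symbol.
def holds(predicate, *args):
    return tuple(args) in relations[predicate]

print(holds('GreaterThan', 2, 1))  # True
print(holds('Prime', 4))           # False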
Predicate logic: syntax
The basic elements are symbols representing objects, relations and functions:
- constant symbols denote objects, e.g., One, Two, Three, John, Mary
- predicate symbols denote relations, e.g., GreaterThan, Prime, Sum, Father
- function symbols denote functions, e.g., Plus, FatherOf

A formal grammar in Backus-Naur Form (BNF):
Sentence → AtomicSentence | (Sentence Connective Sentence) | Quantifier Variable, ... Sentence | ¬Sentence
AtomicSentence → Predicate(Term, ...)
Term → Function(Term, ...) | Constant | Variable
Connective → ⇒ | ∧ | ∨ | ⇔
Quantifier → ∀ | ∃
Constant → John | Mary | One | Two | ...
Variable → a | x | s | ...
Predicate → GreaterThan | Father | ...
Function → Plus | FatherOf | ...

Semantics of predicate logic: interpretations
Remember that semantics defines the truth of well-formed sentences with respect to a particular model. In predicate logic this requires an interpretation: a definition of which objects, relations and functions are referred to by the symbols. Examples:
- One, Two and Three denote the natural numbers 1, 2, 3; John and Mary denote the individuals John and Mary
- GreaterThan denotes the binary relation > between numbers; Father denotes the fatherhood relation between individuals
- Plus denotes the function mapping a pair of numbers to their sum

Semantics: terms
Terms are logical expressions denoting objects. A term can be:
- simple: a constant symbol, e.g., One, Two, John
- complex: a function symbol applied (possibly recursively) to other terms, e.g., FatherOf(Mary), Plus(One, Two), Plus(One, Plus(One, One))
Note:
- assigning a constant symbol to every object of the domain is not required (domains can even be infinite)
- an object can be denoted by more than one constant symbol

Semantics: atomic sentences
The simplest kind of proposition: a predicate symbol applied to a list of terms. Examples:
- GreaterThan(Two, One), Prime(Two), Prime(Plus(Two, Two)), Sum(One, One, Two)
- Father(John, Mary), Father(FatherOf(John), FatherOf(Mary))

Definition: an atomic sentence is true, in a given model and under a given interpretation, if the relation referred to by its predicate symbol holds between the objects referred to by its arguments (terms).
Example. According to the above model and interpretation:
- GreaterThan(One, Two) is false
- Prime(Two) is true
- Prime(Plus(One, One)) is true
- Sum(One, One, Two) is true
- Father(John, Mary) is true

Semantics: complex sentences
Complex sentences are obtained as in propositional logic, using logical connectives. Examples:
- Prime(Two) ∧ Prime(Three)
- ¬Sum(One, One, Two)
- GreaterThan(One, Two) ⇒ (¬GreaterThan(Two, One))
- Father(John, Mary) ∨ Father(Mary, John)
Semantics (the truth value) is determined as in propositional logic: the second sentence above is false, the others are true.

Semantics: quantifiers
Quantifiers allow one to express propositions involving collections of objects, without enumerating them explicitly. Two main quantifiers are used in predicate logic:
- the universal quantifier, e.g.: All men are mortal; All rooms neighboring the wumpus are smelly; All even numbers are not prime
- the existential quantifier, e.g.: Some numbers are prime; Some rooms contain pits; Some men are philosophers
Quantifiers require a new kind of term: variable symbols, usually denoted by lowercase letters.

Semantics: universal quantifier
Example. Assume that the domain is the set of natural numbers.
- All natural numbers are greater than or equal to one: ∀x GreaterOrEqual(x, One)
- All natural numbers are either even or odd: ∀x Even(x) ∨ Odd(x)

The semantics of a sentence ∀x α(x), where α(x) is a sentence containing the variable x, is: α(x) is true for each domain element in place of x.
Example: if the domain is the set of natural numbers, ∀x GreaterOrEqual(x, One) means that the following (infinitely many) sentences are all true:
GreaterOrEqual(One, One)
GreaterOrEqual(Two, One)
...

Consider the proposition: all even numbers greater than two are not prime. A common mistake is to represent it as follows:
∀x Even(x) ∧ GreaterThan(x, Two) ∧ (¬Prime(x))
The above sentence actually means: all numbers are even, greater than two, and not prime, which is different from the original proposition (and is also false). The correct sentence can be obtained by noting that the original proposition can be restated as: for all x, if x is even and greater than two, then it is not prime, which is represented by an implication:
∀x (Even(x) ∧ GreaterThan(x, Two)) ⇒ (¬Prime(x))
In general, propositions where "all" refers to all the elements of the domain that satisfy some condition must be represented using an implication.

Consider again this sentence:
∀x (Even(x) ∧ GreaterThan(x, Two)) ⇒ (¬Prime(x))
Claiming that it is true means that sentences like the following are also true:
(Even(One) ∧ GreaterThan(One, Two)) ⇒ (¬Prime(One))
Note that the antecedent of the implication is false (the number one is not even, nor is it greater than the number two). This is not contradictory, since implications with false antecedents are true by definition (see again the truth table of ⇒).

Semantics: existential quantifier
Example. Assume that the domain is the set of natural numbers.
- Some numbers are prime: ∃x Prime(x). This is read as: there exists some x such that x is prime
- Some numbers are not greater than three, and are even: ∃x ¬GreaterThan(x, Three) ∧ Even(x)

Consider a proposition like the following: some odd numbers are prime. A common mistake is to represent it using an implication:
∃x Odd(x) ⇒ Prime(x)
The above sentence actually means: there exists some number such that, if it is odd, then it is prime, which is weaker than the original proposition, since it would be true (by the definition of ⇒) even if there were no odd numbers (i.e., if the antecedent Odd(x) were false for all domain elements). The correct sentence can be obtained by noting that the original proposition can be restated as: there exists some x such that x is odd and x is prime:
∃x Odd(x) ∧ Prime(x)
In general, propositions introduced by "some" must be represented using a conjunction.
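Over a finite domain, quantified sentences can be checked directly, which also makes the two common mistakes above visible. A minimal Python sketch (the domain 1..20 and the predicate definitions are illustrative):

domain = range(1, 21)

even = lambda x: x % 2 == 0
prime = lambda x: x > 1 and all(x % d for d in range(2, x))
implies = lambda a, b: (not a) or b

# Correct: for all x, (Even(x) and x > 2) => not Prime(x)  -- True
print(all(implies(even(x) and x > 2, not prime(x)) for x in domain))

# Mistaken reading: all numbers are even, > 2 and not prime -- False
print(all(even(x) and x > 2 and not prime(x) for x in domain))

# Some odd numbers are prime, as a conjunction -- True (e.g., 3)
print(any(not even(x) and prime(x) for x in domain))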
Semantics: nested quantifiers
A sentence can contain more than one quantified variable. If the quantifier is the same for all variables, e.g.:
∀x (∀y (∀z ... α[x, y, z, ...] ...))
then the sentence can be rewritten more concisely as:
∀x, y, z ... α[x, y, z, ...]
For instance, the sentence "if a number is greater than another number, then the latter is lower than the former" can be written in predicate logic as:
∀x, y GreaterThan(x, y) ⇒ LowerThan(y, x)

If a sentence contains both universally and existentially quantified variables, its meaning depends on the order of quantification. In particular, ∀x (∃y α[x, y]) and ∃y (∀x α[x, y]) are not equivalent, i.e., they are not true under the same models. For instance, ∀x ∃y Loves(x, y) means (i.e., is true under a model in which) "everybody loves somebody". Instead, ∃y ∀x Loves(x, y) means "there is someone who is loved by everyone".

Semantics: connections between ∀ and ∃
∀ and ∃ are connected with each other through negation. For instance, asserting that "every natural number is greater than or equal to one" is the same as asserting that "there does not exist a natural number which is not greater than or equal to one". In general, since ∀ is a conjunction over all the domain objects and ∃ is a disjunction, they obey De Morgan's rules (shown below on the left in the usual form involving two propositional variables, and on the right in the quantified form):

¬P ∧ ¬Q ⇔ ¬(P ∨ Q)         ∀x (¬α[x]) ⇔ ¬(∃x α[x])
¬(P ∧ Q) ⇔ (¬P) ∨ (¬Q)     ¬(∀x α[x]) ⇔ ∃x (¬α[x])
P ∧ Q ⇔ ¬(¬P ∨ ¬Q)         ∀x α[x] ⇔ ¬(∃x (¬α[x]))
P ∨ Q ⇔ ¬(¬P ∧ ¬Q)         ∃x α[x] ⇔ ¬(∀x (¬α[x]))

Exercises (1/2)
Represent the following propositions using sentences in predicate logic (including the definition of the domain):
1. All men are mortal; Socrates is a man; Socrates is mortal
2. All rooms neighboring a pit are breezy (wumpus game)
3. The Peano-Russell axioms of arithmetic, which define the natural numbers (nonnegative integers):
P1: zero is a natural number
P2: the successor of any natural number is a natural number
P3: zero is not the successor of any natural number
P4: no two natural numbers have the same successor
P5: any property which belongs to zero, and to the successor of every natural number which has the property, belongs to all natural numbers

Exercises (2/2)
4. Represent the following propositions using sentences in predicate logic, assuming that the goal is to prove that West is a criminal (using suitable inference algorithms, see below): The law says that it is a crime for an American to sell weapons to hostile countries. The country Nono, an enemy of America, has some missiles, and all of its missiles were sold to it by Colonel West, who is American.
Note that in a knowledge-based system the first proposition above encodes the general knowledge about the problem at hand ("rule memory", analogous to the rules of chess or of the wumpus game), whereas the second proposition encodes a specific problem instance ("working memory", analogous to a specific chess or wumpus game).

Solution of exercise 1
Model and symbols:
- domain: any set including all men
- constant symbols: Socrates
- predicate symbols: Man and Mortal, unary predicates; e.g., Man(Socrates) means that Socrates is a man
The sentences are:
∀x Man(x) ⇒ Mortal(x)
Man(Socrates)
Mortal(Socrates)

Solution of exercise 2 (1/2)
A possible choice of model and symbols:
- domain: row and column coordinates
- constant symbols: 1, 2, 3, 4
- predicate symbols:
  - Pit, binary predicate; e.g., Pit(1, 2) means that there is a pit in room (1,2)
  - Adjacent, predicate with four arguments; e.g., Adjacent(1, 1, 1, 2) means that room (1,1) is adjacent to room (1,2)
  - Breezy, binary predicate; e.g., Breezy(2, 2) means that there is a breeze in room (2,2)

Solution of exercise 2 (2/2)
One possible sentence is the following:
∀x, y (Breezy(x, y) ⇔ (∃p, q Adjacent(x, y, p, q) ∧ Pit(p, q)))
Note that the sentence above also expresses the fact that rooms with no adjacent pits are not breezy.
Another possible sentence:
∀x, y (Pit(x, y) ⇒ (∀p, q Adjacent(x, y, p, q) ⇒ Breezy(p, q)))
In this case there is no logical equivalence: if all the rooms adjacent to a given one are breezy, the latter does not necessarily contain a pit.

Solution of exercise 3 (1/2)
A possible choice of model and symbols:
- domain: any set including all natural numbers (e.g., the set of real numbers)
- constant symbols: Z, denoting the number zero
- predicate symbols:
  - N, unary predicate denoting the property of being a natural number; e.g., N(Z) means that zero is a natural number
  - Eq, binary predicate denoting equality; e.g., Eq(Z, Z) means that zero equals zero
  - P, denoting any given property
- function symbols: S, mapping a natural number to its successor; e.g., S(Z) denotes one, S(S(Z)) denotes two

Solution of exercise 3 (2/2)
P1: N(Z)
P2: ∀x N(x) ⇒ N(S(x))
P3: ¬(∃x Eq(Z, S(x)))
P4: ∀x, y Eq(S(x), S(y)) ⇒ Eq(x, y)
P5: (P(Z) ∧ ∀x ((N(x) ∧ P(x)) ⇒ P(S(x)))) ⇒ (∀x (N(x) ⇒ P(x)))

Solution of exercise 4 (1/3)
A possible choice of model and symbols:
- domain: a set including several individuals (among which Colonel West), nations (among which America and Nono), and missiles
- constant symbols: West, America and Nono
- predicate symbols:
  - Country(·), American(·), Missile(·), Weapon(·), Hostile(·) (respectively: being a country, an American citizen, a missile, a weapon, hostile)
  - Enemy(<who>, <to whom>) (being enemies)
  - Owns(<who>, <what>) (owning something)
  - Sells(<who>, <what>, <to whom>) (selling something to someone)
- no function symbols are necessary

Solution of exercise 4 (2/3)
The law says that it is a crime for an American to sell weapons to hostile nations:
∀x, y, z (American(x) ∧ Country(y) ∧ Hostile(y) ∧ Weapon(z) ∧ Sells(x, z, y)) ⇒ Criminal(x)
The second proposition can be conveniently split into simpler ones.
Nono is a country ...: Country(Nono)
... Nono is an enemy of America (which is also a country) ...: Enemy(Nono, America), Country(America)
... Nono has some missiles ...: ∃x Missile(x) ∧ Owns(Nono, x)
... all of Nono's missiles were sold to it by Colonel West ...: ∀x (Missile(x) ∧ Owns(Nono, x)) ⇒ Sells(West, x, Nono)
... who is American: American(West)

Solution of exercise 4 (3/3)
A human would intuitively say that the above propositions in natural language imply that West is a criminal. However, it is not difficult to see that the above sentences in predicate logic are not sufficient to prove this. The reason is that humans exploit background (or common sense) knowledge that is not explicitly stated in the above propositions. In particular, there are two "missing links":
- an enemy nation is hostile
- a missile is a weapon
To use this additional knowledge, it must be explicitly represented by sentences in predicate logic:
∀x (Country(x) ∧ Enemy(x, America)) ⇒ Hostile(x)
∀x Missile(x) ⇒ Weapon(x)

Knowledge engineering (1/3)
Knowledge engineering is the process of constructing the KB. It consists of investigating a specific domain, identifying the relevant concepts (knowledge acquisition), and formally representing them. This requires the interaction between:
- a domain expert (DE)
- a knowledge engineer (KE), who is expert in knowledge representation and inference, but usually not in the domain of interest
A possible approach, suitable for special-purpose KBs (in predicate logic), is the following.

Knowledge engineering (2/3)
1. Identify the task:
- what range of queries will the KB support?
- what kind of facts will be available for each problem instance?
2. Knowledge acquisition: eliciting from the domain expert the general knowledge about the domain (e.g., the rules of chess)
3. Choice of a vocabulary: what concepts have to be represented as objects, predicates, functions? The result is the domain's ontology, which affects the complexity of the representation and the inferences that can be made. E.g., in the wumpus game pits can be represented either as objects, or as unary predicates on squares

Knowledge engineering (3/3)
4. Encoding the domain's general knowledge acquired in step 2 (this may require revising the vocabulary of step 3)
5. Encoding a specific problem instance (e.g., a specific chess game)
6. Posing queries to the inference procedure and getting answers
7. Debugging the KB, based on the results of step 6

Inference in predicate logic
Inference algorithms are more complex than in propositional logic, due to quantifiers and functions. The basic tools are two inference rules for sentences with quantifiers (Universal and Existential Instantiation), which derive sentences without quantifiers. This reduces first-order inference to propositional inference, with complete but semidecidable inference procedures:
- algorithms exist that find a proof KB ⊢ α in a finite number of steps for every entailed sentence (KB ⊨ α)
- no algorithm is capable of establishing KB ⊬ α in a finite number of steps for every non-entailed sentence (KB ⊭ α)
Therefore, since one does not know whether a sentence is entailed until the proof is done, while a proof procedure is running one does not know whether it is about to find a proof, or whether it will never find one.

Modus Ponens can be generalized to predicate logic, leading to the first-order versions of the FC and BC algorithms, which are complete and decidable when limited to Horn clauses. The resolution rule can also be generalized to predicate logic, leading to the first-order version of the complete but semidecidable resolution algorithm.

Inference rules for quantifiers
Let θ denote a substitution list {v1/t1, ..., vn/tn}, where:
- v1, ..., vn are variable names
- t1, ..., tn are terms (either constant symbols, variables, or functions recursively applied to terms)
and let α be any sentence in which one or more variables appear. Let Subst(θ, α) denote the sentence obtained by applying the substitution θ to the sentence α. An example:
Subst({y/One}, ∀x, y Eq(S(x), S(y)) ⇒ Eq(x, y))
produces
∀x Eq(S(x), S(One)) ⇒ Eq(x, One)

Universal Instantiation:
∀v α / Subst({v/t}, α)
where t can be any term without variables. In other words, since a sentence ∀x α[x] states that α is true for every domain element in place of x, one can derive that α is true for any given element t. An example: from ∀x N(x) ⇒ N(S(x)) one can derive:
- N(Z) ⇒ N(S(Z)), for θ = {x/Z}
- N(S(S(Z))) ⇒ N(S(S(S(Z)))), for θ = {x/S(S(Z))}
and so on.

Existential Instantiation:
∃v α / Subst({v/t}, α)
where t must be a constant symbol that does not appear elsewhere in the KB. A sentence ∃v α[v] states that there is some object satisfying a condition. The above rule just gives a name to one such object, but that name must not belong to another object, because we do not know which objects satisfy the condition. For instance, from ∃x Missile(x) ∧ Owns(Nono, x) one can derive Missile(M) ∧ Owns(Nono, M), provided that M has not already been used in other sentences; one cannot derive, instead, Missile(West) ∧ Owns(Nono, West).
Inference rules for quantifiers
A more general form of Existential Instantiation must be applied when an existential quantifier appears in the scope of a universal quantifier:
∀x, ... ∃y, ... α[x, ..., y, ...]
For instance, from ∀x ∃y Loves(x, y) (everybody loves somebody) it is not correct to derive ∀x Loves(x, A) (everybody loves A), since the latter sentence means that everybody loves the same person.

Inference rules for quantifiers
Instead of a constant symbol, a new function symbol must be introduced, known as a Skolem function, with as many arguments as the universally quantified variables. Therefore, from:
∀x, ... ∃y, ... α[x, ..., y, ...]
the correct application of Existential Instantiation derives:
∀x, ... α[x, ..., F1(x), ...]
For instance, from ∀x ∃y Loves(x, y) one can correctly derive ∀x Loves(x, F(x)), where F maps any individual x to someone loved by x.

Inference algorithms and quantifiers
First-order inference algorithms usually apply Existential Instantiation as a pre-processing step: every existentially quantified sentence is replaced by a single sentence. It can be proven that the resulting KB is inferentially equivalent to the original one, i.e., it is satisfiable if and only if the original one is. Accordingly, the resulting KB contains only sentences without variables and sentences where all the variables are universally quantified. Another useful pre-processing step is renaming all the variables in the KB, to avoid name clashes between variables used in different sentences. For instance, the variables in ∀x P(x) and ∀x Q(x) are not related to each other, and renaming either of them (say, ∀y Q(y)) produces an equivalent sentence.

Unification
Another widely used tool in first-order inference algorithms is unification: the process of finding a substitution (if any) that makes two sentences (of which at least one contains variables) identical. For instance, ∀x, y Knows(x, y) and ∀z Knows(John, z) can be unified by different substitutions. Assuming that Bill is one of the constant symbols, two possible unifiers are:
• {x/John, y/Bill, z/Bill}
• {x/John, y/z}
Among all possible unifiers, the one of interest for first-order inference algorithms is the most general unifier, i.e., the one that places the fewest restrictions on the values of the variables. The only constraint is that every occurrence of a given variable must be replaced by the same term. In the above example, the most general unifier is {x/John, y/z}, as it does not restrict the values of y and z.

Unification: an example
Consider the sentence ∀x Knows(John, x) (John knows everyone). Assume that the KB also contains the following sentences (note that different variable names are used in different sentences):
1. Knows(John, Jane)
2. ∀y Knows(y, Bill)
3. ∀z Knows(z, Mother(z))
4. Knows(Elizabeth, Bill)
The most general unifier with Knows(John, x) is, respectively:
1. {x/Jane} (as an aside, note that ∀x Knows(John, x) also implies Knows(John, John), i.e., John knows himself)
2. {y/John, x/Bill}
3. {z/John, x/Mother(John)}
4. no unifier exists, as the constant symbols John and Elizabeth in the first argument are different
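Unification can be implemented in a few lines on top of the tuple encoding used so far. The following Python sketch computes the most general unifier; for brevity it omits the occur check that a production implementation would include, and all names are again illustrative.

    def is_var(t):
        """By the convention used here, variables are lowercase strings."""
        return isinstance(t, str) and t[0].islower()

    def unify(x, y, theta):
        """Return the most general unifier extending theta, or None."""
        if theta is None:
            return None
        if x == y:
            return theta
        if is_var(x):
            return unify_var(x, y, theta)
        if is_var(y):
            return unify_var(y, x, theta)
        if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
            for xi, yi in zip(x, y):
                theta = unify(xi, yi, theta)
            return theta
        return None                    # distinct constants, or arity mismatch

    def unify_var(v, t, theta):
        if v in theta:                 # v already bound: unify its value instead
            return unify(theta[v], t, theta)
        if is_var(t) and t in theta:
            return unify(v, theta[t], theta)
        theta = dict(theta)            # non-destructive: copy, then bind v
        theta[v] = t
        return theta

    print(unify(("Knows", "John", "x"), ("Knows", "John", "Jane"), {}))      # {'x': 'Jane'}
    print(unify(("Knows", "John", "x"), ("Knows", "y", "Bill"), {}))         # {'y': 'John', 'x': 'Bill'}
    print(unify(("Knows", "John", "x"), ("Knows", "Elizabeth", "Bill"), {})) # None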
First-order inference: an example (1/2)
Consider a domain made up of two individuals, denoted by the constant symbols John and Richard, and the following KB:
(1) ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
(2) ∀y Greedy(y)
(3) King(John)
(4) Brother(Richard, John)
Intuitively, this entails Evil(John), i.e., KB ⊨ Evil(John). The corresponding derivation KB ⊢ Evil(John) can be obtained by using the above inference rules, as shown in the following.

First-order inference: an example (2/2)
• Applying Universal Instantiation to (1) produces:
  (5) King(John) ∧ Greedy(John) ⇒ Evil(John), with {x/John}
  (6) King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard), with {x/Richard}
• Applying Universal Instantiation to (2) produces:
  (7) Greedy(John), with {y/John}
  (8) Greedy(Richard), with {y/Richard}
• Applying And Introduction to (3) and (7) produces:
  (9) King(John) ∧ Greedy(John)
• Applying Modus Ponens to (5) and (9) produces:
  (10) Evil(John)

Generalized Modus Ponens
All but the last of the inference steps in the above example can be seen as pre-processing steps whose aim is to "prepare" the application of Modus Ponens. Moreover, some of these steps (Universal Instantiation using the symbol Richard) are clearly useless for deriving the consequent of implication (1), i.e., Evil(John). Indeed, the above steps can be combined into a single first-order inference rule, Generalized Modus Ponens (GMP): given atomic sentences (non-negated predicates) pi, pi′, i = 1, ..., n, and q, and a substitution θ such that Subst(θ, pi) = Subst(θ, pi′) for all i:

  (p1 ∧ p2 ∧ ... ∧ pn ⇒ q), p1′, p2′, ..., pn′
  ────────────────────────────────────────────
  Subst(θ, q)

Generalized Modus Ponens
In the previous example, GMP allows Evil(John) to be derived in a single step, and avoids unnecessary applications of inference rules such as Universal Instantiation to sentences (1) and (2) with {x/Richard} or {y/Richard}. In particular, GMP can be applied to sentences (1), (2) and (3), with θ = {x/John, y/John}: this immediately derives Evil(John).

Horn clauses in predicate logic
GMP allows the forward chaining (FC) and backward chaining (BC) inference algorithms to be generalized to predicate logic. This in turn requires generalizing the concept of Horn clause. A Horn clause in predicate logic is an implication α ⇒ β in which:
• α is a conjunction of non-negated predicates
• β is a single non-negated predicate
• all variables (if any) are universally quantified, and the quantifiers appear at the beginning of the sentence
An example: ∀x (P(x) ∧ Q(x)) ⇒ R(x). Single (possibly negated) predicates are also Horn clauses:
P(t1, ..., tn) ≡ (True ⇒ P(t1, ..., tn))
¬P(t1, ..., tn) ≡ (P(t1, ..., tn) ⇒ False)

Forward chaining in predicate logic
Similarly to propositional logic, FC consists of repeatedly applying GMP in all possible ways, adding to the initial KB all newly derived atomic sentences, until no new sentence can be derived. FC is normally triggered by the addition of new sentences into the KB, to derive all their consequences. For instance, it can be used in the Wumpus game when new percepts are added to the KB, after each agent's move.

Forward chaining in predicate logic
A simple (but inefficient) implementation of FC:

function Forward-Chaining(KB)
  local variable: new
  repeat
    new ← ∅ (the empty set)
    for each sentence s = (p1 ∧ ... ∧ pn ⇒ q) in KB do
      for each θ such that Subst(θ, p1 ∧ ... ∧ pn) = Subst(θ, p1′ ∧ ... ∧ pn′)
                 for some p1′, ..., pn′ ∈ KB do
        q′ ← Subst(θ, q)
        if q′ ∉ KB and q′ ∉ new then add q′ to new
    add new to KB
  until new is empty
  return KB
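Combining the subst and unify sketches above gives a runnable (if deliberately naive) Python version of this pseudocode, under the same tuple-based encoding; it enumerates every way of matching the premises against known facts, which is exactly the inefficiency the pseudocode alludes to.

    from itertools import product

    def forward_chaining(facts, rules):
        """Naive first-order FC over Horn rules: facts is an iterable of
        ground atoms, rules a list of (premises, conclusion) pairs."""
        facts = set(facts)
        while True:
            new = set()
            for premises, conclusion in rules:
                # try every assignment of known facts to the premises
                for candidates in product(facts, repeat=len(premises)):
                    theta = {}
                    for p, f in zip(premises, candidates):
                        theta = unify(p, f, theta)
                        if theta is None:
                            break
                    if theta is not None:
                        q = subst(theta, conclusion)
                        if q not in facts and q not in new:
                            new.add(q)
            if not new:               # fixed point: nothing new can be derived
                return facts
            facts |= new

    # With the Colonel West encoding given earlier,
    # forward_chaining(facts, rules) contains ("Criminal", "West"),
    # derived in two passes, as in the hand trace that follows.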
Forward chaining: an example (1/2)
The sentences in the exercise about Colonel West can be written as Horn clauses, after applying Existential Instantiation and then And Elimination to ∃x Missile(x) ∧ Owns(Nono, x) (the predicate Country is omitted for the sake of simplicity; the universal quantifiers are not shown, to keep the notation uncluttered):
(1) (American(x) ∧ Hostile(y) ∧ Weapon(z) ∧ Sells(x, z, y)) ⇒ Criminal(x)
(2) (Missile(x) ∧ Owns(Nono, x)) ⇒ Sells(West, x, Nono)
(3) Enemy(x, America) ⇒ Hostile(x)
(4) Missile(x) ⇒ Weapon(x)
(5) American(West)
(6) Enemy(Nono, America)
(7) Owns(Nono, M)
(8) Missile(M)

Forward chaining: an example (2/2)
The FC algorithm carries out two iterations of the repeat-until loop on the above KB; no new sentence can be derived after the second one.
First iteration:
  – GMP to (2), (7) and (8), with {x/M}: (9) Sells(West, M, Nono)
  – GMP to (3) and (6), with {x/Nono}: (10) Hostile(Nono)
  – GMP to (4) and (8), with {x/M}: (11) Weapon(M)
Second iteration:
  – GMP to (1), (5), (10), (11) and (9), with {x/West, y/Nono, z/M}: (12) Criminal(West)

Backward chaining in predicate logic
The first-order BC algorithm works similarly to its propositional version: it starts from a sentence (query) to be proven and recursively applies GMP backward. Note that every substitution made to unify an atomic sentence with the consequent of an implication must be propagated back to every antecedent. If a query unifies with the consequents of more than one implication, at least one of the alternative unifications must lead to a successful proof. For a possible implementation of BC, see the course textbook.

Backward chaining: an example (1/2)
A proof by BC can be represented as an And-Or graph, as in propositional logic.
[Proof tree constructed by backward chaining to prove the query Criminal(West) from the previous sentences (1)–(8). The tree is read depth first, left to right: to prove Criminal(West), the four conjuncts American(West), Weapon(z), Sells(West, z, y) and Hostile(y) below it must be proven. Some of these are already in the knowledge base; the others require further backward chaining, with the successful unifications binding {z/M} (via Missile(M) and Owns(Nono, M)) and {y/Nono} (via Enemy(Nono, America)).]

Backward chaining: an example (2/2)
If the predicate Country is used, sentence (1) becomes:
∀x, y, z (American(x) ∧ Country(y) ∧ Hostile(y) ∧ Weapon(z) ∧ Sells(x, z, y)) ⇒ Criminal(x)
The sentences Country(America) and Country(Nono) must also be added to the KB. In this case the additional conjunct Country(y) appears in the And link below Criminal(West). Two sentences in the KB unify with Country(y): Country(America) and Country(Nono). If the unification with Country(America) is attempted first, the conjunct Hostile(America) cannot be proven, and the proof fails. In such a case a backtracking step is applied, i.e., one of the other possible unifications is attempted. Here, the unification with Country(Nono) allows the proof to be completed.
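Backward chaining with backtracking can also be sketched on top of the earlier subst and unify helpers. The version below is illustrative only: it uses Python generators, so exhausting one binding and moving to the next (e.g., from Country(America) to Country(Nono)) is exactly the backtracking step just described. It assumes flat atoms (no function symbols), as in the Colonel West encoding.

    import itertools
    _fresh = itertools.count()

    def rename(rule):
        """Standardize apart: give a rule's variables fresh names before use."""
        premises, conclusion = rule
        vs = {a for atom in premises + [conclusion] for a in atom[1:] if is_var(a)}
        r = {v: v + str(next(_fresh)) for v in vs}
        ren = lambda atom: (atom[0],) + tuple(r.get(a, a) for a in atom[1:])
        return [ren(p) for p in premises], ren(conclusion)

    def backward_chaining(goal, facts, rules, theta=None):
        """Yield every substitution that proves `goal` (naive BC sketch)."""
        theta = {} if theta is None else theta
        g = subst(theta, goal)
        for f in facts:                      # Or node: try matching facts...
            t = unify(g, f, theta)
            if t is not None:
                yield t
        for rule in rules:                   # ...then try rules, backward
            premises, conclusion = rename(rule)
            t = unify(g, conclusion, theta)
            if t is not None:
                yield from prove_all(premises, facts, rules, t)

    def prove_all(goals, facts, rules, theta):
        """And node: prove all goals under one common substitution."""
        if not goals:
            yield theta
        else:
            for t in backward_chaining(goals[0], facts, rules, theta):
                yield from prove_all(goals[1:], facts, rules, t)

    # Query: is West a criminal? (facts/rules as encoded earlier)
    print(any(backward_chaining(("Criminal", "West"), facts, rules)))   # True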
The resolution algorithm
Completeness theorem for predicate logic (Kurt Gödel, 1930): for every first-order sentence α entailed by a given KB (KB ⊨ α), there exists some inference algorithm that derives α (KB ⊢ α) in a finite number of steps. The opposite does not hold: predicate logic is semidecidable. A complete inference algorithm for predicate logic is resolution (1965), based on:
• converting sentences into Conjunctive Normal Form
• the resolution inference rule
• proof by contradiction: to prove KB ⊨ α, prove that KB ∧ ¬α is unsatisfiable (contradictory)
• refutation-completeness: if KB ∧ ¬α is unsatisfiable, then resolution derives a contradiction in a finite number of steps

Applications of forward chaining
Encoding condition-action rules to recommend actions, based on a data-driven approach:
• production systems (production: condition-action rule)
• expert systems

Applications of backward chaining
Logic programming languages, notably Prolog:
• rapid prototyping
• symbol processing: compilers, natural language parsers
• developing expert systems
Example of a Prolog clause:
criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).
Running a program = proving a sentence (query) by BC, e.g.:
• ?- criminal(west) produces Yes
• ?- criminal(A) produces A=west, Yes

Applications of the resolution algorithm
Main application: theorem provers, used for
• assisting (not replacing) mathematicians
• proof checking
• verification and synthesis of hardware and software:
  – hardware design (e.g., entire CPUs)
  – programming languages (syntax)
  – software engineering (verifying program specifications, e.g., the RSA public key encryption algorithm)

Beyond classical logic
Classical logic is based on two principles:
• bivalence: there exist only two truth values, true and false
• determinateness: each proposition has exactly one truth value
But how should one deal with propositions like the following?
• Tomorrow will be a sunny day: is this true or false, today?
• John is tall: is this "completely" true (or false)? This kind of problem is addressed by fuzzy logic.
• Goldbach's conjecture: every even number is the sum of a pair of prime numbers. Can we say this is either true or false, even if no proof has been found yet?

Expert systems
One of the main applications of knowledge-based systems:
• encoding human experts' problem-solving knowledge in specific application domains for which no algorithmic solution exists (e.g., medical diagnosis)
• commonly used as decision support systems
• problem-independent architecture for knowledge representation and reasoning
• knowledge representation: IF...THEN... rules
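To make the IF...THEN... representation concrete, here is a minimal Python sketch of the recognize-act cycle described later in this part (the rule format, all names and the example rule are hypothetical; real shells add pattern matching over structured facts, conflict resolution strategies, and efficient matching algorithms such as Rete).

    # Toy production system: working memory is a set of facts; each rule has
    # an IF-part (a test on working memory) and a THEN-part (an action on it).
    def run(wm, rules):
        while True:
            active = [r for r in rules if r["if"](wm)]   # rules whose IF-part matches
            if not active:
                return wm
            active[0]["then"](wm)    # trivial conflict resolution: fire the first rule

    # Hypothetical diagnosis-style rule; the IF-part excludes its own effect,
    # otherwise the rule would stay active and the loop would never terminate.
    rules = [{
        "if":   lambda wm: "fever" in wm and "rash" in wm and "suspect measles" not in wm,
        "then": lambda wm: wm.add("suspect measles"),
    }]
    print(run({"fever", "rash"}, rules))   # {'fever', 'rash', 'suspect measles'}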
Expert systems: historical notes
• Main motivation: limitations of the "general" problem-solving approaches pursued in AI until the 1960s
• First expert systems: 1970s
• Widespread use in the 1980s: many commercial applications
• Used in niche/focused domains since the 1990s

Main current applications of expert systems
• Medical diagnosis; an example: the UK NHS Direct symptom checker (now closed) http://www.nhsdirect.nhs.uk/CheckSymptoms
• Geology, botany (e.g., rock and plant classification)
• Help desk
• Finance
• Military strategies
• Software engineering (e.g., design patterns)

Expert system architecture
[Architecture diagram: the user interacts through the user interface; the knowledge base comprises facts (working memory) and rules (rule memory), and is built by the knowledge engineer together with the domain expert; the inference engine and the explanation module operate on the knowledge base.]

Designing the Knowledge Base of expert systems
Two main, distinct roles:
• knowledge engineer
• domain expert
Main issues:
• defining suitable data structures for representing facts (problem instances) in the working memory
• suitably eliciting experts' knowledge (general knowledge) and encoding it as IF...THEN... rules (rule memory)

How expert systems work
The inference engine implements a forward chaining-like algorithm, triggered by the addition of new facts to the working memory:

while there is some active rule do
  select one active rule (using conflict resolution strategies)
  execute the actions of the selected rule

Three kinds of actions exist:
  – modifying one fact in the working memory
  – adding one fact to the working memory
  – removing one fact from the working memory

Expert system shells: CLIPS
• C Language Integrated Production System http://clipsrules.sourceforge.net/
• Developed since 1984 by the Johnson Space Center, NASA
• Now maintained independently from NASA as public domain, open source, free software
• Currently used by government, industry, and academia
Main features:
  – interpreted, functional language (with an object-oriented extension)
  – specific data structures and instructions/functions for expert system implementation
  – interfaces to other languages (C, Python, etc.)

Other expert system shells
Some free shells:
• Jess (Java platform) http://www.jessrules.com
• Drools (Business Rules Management System) http://www.drools.org
Several commercial shells are also available.