Automated Theorem Proving Peter Baumgartner [email protected] http://users.rsise.anu.edu.au/˜baumgart/ Slides partially based on material by Alexander Fuchs, Harald Ganzinger, John Slaney, Viorica Sofronie-Stockermans and Uwe Waldmann Automated Theorem Proving – Peter Baumgartner – p.1 Purpose of This Lecture Overview of Automated Theorem Proving (ATP) Emphasis on automated proof methods for first-order logic More “breadth” than “depth” Standard techniques covered Normal forms of formulas Herbrand interpretations Resolution calculus, unification Instance-based methods Model computation Theory reasoning: Satisfiability Modulo Theories Automated Theorem Proving – Peter Baumgartner – p.2 Part 1: What is Automated Theorem Proving? Automated Theorem Proving – Peter Baumgartner – p.3 First-Order Theorem Proving in Relation to ... . . . Calculation: Compute function value at given point: Problem: 22 = ? 32 = ? 42 = ? “Easy” (often polynomial) . . . Constraint Solving: Given: Problem: x 2 = a where x ∈ [1 . . . b] (x variable, a, b parameters) Instance: a = 16, b = 10 Find values for variables such that problem instance is satisfied “Difficult” (often exponential, but restriction to finite domains) First-Order Theorem Proving: Given: Problem: ∃x (x 2 = a ∧ x ∈ [1 . . . b]) Is it satisfiable? unsatisfiable? valid? “Very difficult” (often undecidable) Automated Theorem Proving – Peter Baumgartner – p.4 Logical Analysis Example: Three Coloring Problem Problem: Given a map. Can it be colored using only three colors, where neighbouring countries are colored differently? Automated Theorem Proving – Peter Baumgartner – p.5 Three Coloring Problem - Graph Theory Abstraction Problem Instance Problem Specification The Rôle of Theorem Proving? Automated Theorem Proving – Peter Baumgartner – p.6 Three Coloring Problem - Formalization Every node has at least one color ∀N (red(N) ∨ green(N) ∨ blue(N)) Every node has at most one color ∀N ((red(N) → ¬green(N)) ∧ (red(N) → ¬blue(N)) ∧ (blue(N) → ¬green(N))) Adjacent nodes have different color ∀M, N (edge(M, N) → (¬(red(M) ∧ red(N)) ∧ ¬(green(M) ∧ green(N)) ∧ ¬(blue(M) ∧ blue(N)))) Automated Theorem Proving – Peter Baumgartner – p.7 Three Coloring Problem - Solving Problem Instances ... ... with a constraint solver: Let constraint solver find value(s) for variable(s) such that problem instance is satisfied Here: Variables: Colors of nodes in graph Values: Red, green or blue Problem instance: Specific graph to be colored ... with a theorem prover Let the theorem prover prove that the three coloring formula (see previous slide) + specific graph (as a formula) is satisfiable To solve problem instances a constraint solver is usually much more efficient than a theorem prover (e.g. use a SAT solver) Theorem provers are not even guaranteed to terminate, in general Other tasks where theorem proving is more appropriate? Automated Theorem Proving – Peter Baumgartner – p.8 Three Coloring Problem: The Rôle of Theorem Proving Functional dependency Blue coloring depends functionally on the red and green coloring Blue coloring does not functionally depend on the red coloring Theorem proving: Prove a formula is valid. Here: Is “the blue coloring is functionally dependent on the red/red and green coloring” (as a formula) valid, i.e. holds for all possible graphs? I.e. analysis wrt. 
all instances ⇒ theorem proving is adequate Theorem Prover Demo Automated Theorem Proving – Peter Baumgartner – p.9 Part 2: Methods in Automated Theorem Proving Automated Theorem Proving – Peter Baumgartner – p.10 How to Build a (First-Order) Theorem Prover 1. Fix an input language for formulas 2. Fix a semantics to define what the formulas mean Will be always “classical” here 3. Determine the desired services from the theorem prover (The questions we would like the prover be able to answer) 4. Design a calculus for the logic and the services Calculus: high-level description of the “logical analysis” algorithm This includes redundancy criteria for formulas and inferences 5. Prove the calculus is correct (sound and complete) wrt. the logic and the services, if possible 6. Design a proof procedure for the calculus 7. Implement the proof procedure (research topic of its own) Go through the red issues in the rest of this talk Automated Theorem Proving – Peter Baumgartner – p.11 How to Build a (First-Order) Theorem Prover 1. Fix an input language for formulas 2. Fix a semantics to define what the formulas mean Will be always “classical” here 3. Determine the desired services from the theorem prover (The questions we would like the prover be able to answer) 4. Design a calculus for the logic and the services Calculus: high-level description of the “logical analysis” algorithm This includes redundancy criteria for formulas and inferences 5. Prove the calculus is correct (sound and complete) wrt. the logic and the services, if possible 6. Design a proof procedure for the calculus 7. Implement the proof procedure (research topic of its own) Automated Theorem Proving – Peter Baumgartner – p.12 Languages and Services — Propositional SAT Yes Formula(s) Theorem Prover Question No Formula: Propositional logic formula φ Question: Is φ satisfiable? (Minimal model? Maximal consistent subsets? ) Theorem Prover: Based on BDD, DPLL, or stochastic local search Issue: the formula φ can be BIG Automated Theorem Proving – Peter Baumgartner – p.13 DPLL as a Semantic Tree Method (1) A∨B (2) C ∨ ¬A (3) D ∨ ¬C ∨ ¬A hempty treei (4) ¬D ∨ ¬B {} 6|= A ∨ B {} |= C ∨ ¬A {} |= D ∨ ¬C ∨ ¬A {} |= ¬D ∨ ¬B A Branch stands for an interpretation Purpose of splitting: satisfy a clause that is currently falsified Close branch if some clause is plainly falsified by it (⋆) Automated Theorem Proving – Peter Baumgartner – p.14 DPLL as a Semantic Tree Method (1) A∨B (2) C ∨ ¬A (3) D ∨ ¬C ∨ ¬A (4) ¬D ∨ ¬B {A} |= A ∨ B A ¬A {A} 6|= C ∨ ¬A {A} |= D ∨ ¬C ∨ ¬A {A} |= ¬D ∨ ¬B A Branch stands for an interpretation Purpose of splitting: satisfy a clause that is currently falsified Close branch if some clause is plainly falsified by it (⋆) Automated Theorem Proving – Peter Baumgartner – p.15 DPLL as a Semantic Tree Method (1) A∨B (2) C ∨ ¬A (3) D ∨ ¬C ∨ ¬A (4) ¬D ∨ ¬B {A, C } |= A ∨ B ¬A A C ¬C ⋆ {A, C } |= C ∨ ¬A {A, C } 6|= D ∨ ¬C ∨ ¬A {A, C } |= ¬D ∨ ¬B A Branch stands for an interpretation Purpose of splitting: satisfy a clause that is currently falsified Close branch if some clause is plainly falsified by it (⋆) Automated Theorem Proving – Peter Baumgartner – p.16 DPLL as a Semantic Tree Method (1) A∨B (2) C ∨ ¬A (3) D ∨ ¬C ∨ ¬A (4) ¬D ∨ ¬B {A, C , D} |= A ∨ B ¬A A C D {A, C , D} |= C ∨ ¬A {A, C , D} |= D ∨ ¬C ∨ ¬A ¬C ⋆ {A, C , D} |= ¬D ∨ ¬B ¬D ⋆ Model {A, C , D} found. 
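The splitting procedure traced in these slides fits in a few lines of code. The following Python sketch is purely illustrative (it is not code from the slides): a branch is a set of literals, splitting picks an undecided literal from a clause the branch does not yet satisfy, and a branch is closed as soon as some clause is plainly falsified, exactly as in the (⋆) steps above.

```python
# Minimal DPLL-as-semantic-tree sketch (illustrative only; not code from the slides).
# A clause is a frozenset of literals; a positive integer is an atom, a negative
# integer its negation. A branch is a set of literals, i.e. a partial interpretation.

def dpll(clauses, branch=frozenset()):
    """Return a branch (model) satisfying all clauses, or None if none exists."""
    # Close the branch if some clause is plainly falsified by it (the (⋆) steps).
    if any(all(-lit in branch for lit in clause) for clause in clauses):
        return None
    # Pick a clause that the current branch does not yet satisfy.
    open_clauses = [c for c in clauses if not any(lit in branch for lit in c)]
    if not open_clauses:
        return branch                      # every clause satisfied: model found
    clause = open_clauses[0]
    # Split on an undecided literal of that clause.
    lit = next(l for l in clause if l not in branch and -l not in branch)
    return dpll(clauses, branch | {lit}) or dpll(clauses, branch | {-lit})

# The example clauses (1)-(4) with A=1, B=2, C=3, D=4:
clauses = [frozenset(c) for c in ([1, 2], [3, -1], [4, -3, -1], [-4, -2])]
# Prints one satisfying branch; depending on literal order it corresponds to the
# model {A, C, D} or the model {B} found in the slides.
print(dpll(clauses))
```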
A Branch stands for an interpretation Purpose of splitting: satisfy a clause that is currently falsified Close branch if some clause is plainly falsified by it (⋆) Automated Theorem Proving – Peter Baumgartner – p.17 DPLL as a Semantic Tree Method (1) A∨B (2) C ∨ ¬A (3) D ∨ ¬C ∨ ¬A (4) ¬D ∨ ¬B {B} |= A ∨ B ¬A A C D ¬C ⋆ B {B} |= C ∨ ¬A {B} |= D ∨ ¬C ∨ ¬A ¬B ⋆ {B} |= ¬D ∨ ¬B ¬D ⋆ Model {B} found. A Branch stands for an interpretation Purpose of splitting: satisfy a clause that is currently falsified Close branch if some clause is plainly falsified by it (⋆) DPLL is the basis of most efficient SAT solversAutomated todayTheorem Proving – Peter Baumgartner – p.18 Languages and Services — Description Logics Yes Formula(s) Theorem Prover Question No Formula: Description Logic TBox + ABox (restricted FOL) TBox: Terminology ABox: Assertions Professor ⊓ ∃ supervises . Student ⊑ BusyPerson p : Professor (p, s) : supervises Question: Is TBox + ABox satisfiable? (Does C subsume D?, Concept hierarchy?) Theorem Prover: Tableaux algorithms (predominantly) Issue: Push expressivity of DLs while preserving decidability See overview lecture by Maurice Pagnucco on “Knowledge Representation and Reasoning” Automated Theorem Proving – Peter Baumgartner – p.19 Languages and Services — Satisfiability Modulo Theories (SM Yes Formula(s) Theorem Prover Question No Formula: Usually variable-free first-order logic formula φ . Equality =, combination of theories, free symbols Question: Is φ valid? (satisfiable? entailed by another formula?) . |=N∪L ∀l (c = 5 → car(cons(3 + c, l)) = 8) Theorem Prover: DPLL(T), translation into SAT, first-order provers Issue: essentially undecidable for non-variable free fragment P(0) ∧ (∀x P(x) → P(x + 1)) |=N ∀x P(x) Design a “good” prover anyways (ongoing research) Automated Theorem Proving – Peter Baumgartner – p.20 Languages and Services — “Full” First-Order Logic Yes Formula(s) Theorem Prover Question No (sometimes) Formula: First-order logic formula φ (e.g. the three-coloring spec above) . Usually with equality = Question: Is φ formula valid? (satisfiable?, entailed by another formula?) Theorem Prover: Superposition (Resolution), Instance-based methods Issues Efficient treatment of equality Decision procedure for sub-languages or useful reductions? Can do e.g. DL reasoning? Model checking? Logic programming? Built-in inference rules for arrays, lists, arithmetics (still open research) Automated Theorem Proving – Peter Baumgartner – p.21 How to Build a (First-Order) Theorem Prover 1. Fix an input language for formulas 2. Fix a semantics to define what the formulas mean Will be always “classical” here 3. Determine the desired services from the theorem prover (The questions we would like the prover be able to answer) 4. Design a calculus for the logic and the services Calculus: high-level description of the “logical analysis” algorithm This includes redundancy criteria for formulas and inferences 5. Prove the calculus is correct (sound and complete) wrt. the logic and the services, if possible 6. Design a proof procedure for the calculus 7. 
Implement the proof procedure (research topic of its own) Automated Theorem Proving – Peter Baumgartner – p.22 Semantics “The function f is continuous”, expressed in (first-order) predicate logic: ∀ε(0 < ε → ∀a∃δ(0 < δ ∧ ∀x(|x − a| < δ → |f (x) − f (a)| < ε))) Underlying Language Variables ε, a, δ, x Function symbols 0, | |, − , f( ) Terms are well-formed expressions over variables and function symbols Predicate symbols < , = Atoms are applications of predicate symbols to terms Boolean connectives ∧, ∨, →, ¬ Quantifiers ∀, ∃ The function symbols and predicate symbols comprise a signature Σ Automated Theorem Proving – Peter Baumgartner – p.23 Semantics “The function f is continuous”, expressed in (first-order) predicate logic: ∀ε(0 < ε → ∀a∃δ(0 < δ ∧ ∀x(|x − a| < δ → |f (x) − f (a)| < ε))) “Meaning” of Language Elements – Σ-Algebras Universe (aka Domain): Set U Variables 7→ values in U (mapping is called “assignment”) Function symbols 7→ (total) functions over U Predicate symbols 7→ relations over U Boolean connectives 7→ the usual boolean functions Quantifiers 7→ “for all ... holds”, “there is a ..., such that” Terms 7→ values in U Formulas 7→ Boolean (Truth-) values Automated Theorem Proving – Peter Baumgartner – p.24 Semantics - Σ-Algebra Example Let ΣPA be the standard signature of Peano Arithmetic The standard interpretation N for Peano Arithmetic then is: UN = {0, 1, 2, . . .} 0N = 0 sN : n 7→ n + 1 +N : (n, m) 7→ n + m ∗N : (n, m) 7→ n ∗ m ≤N = {(n, m) | n less than or equal to m} <N = {(n, m) | n less than m} Note that N is just one out of many possible ΣPA -interpretations Automated Theorem Proving – Peter Baumgartner – p.25 Semantics - Σ-Algebra Example Evaluation of terms and formulas Under the interpretation N and the assignment β : x 7→ 1, y 7→ 3 we obtain (N, β)(s(x) + s(0)) . (N, β)(x + y = s(y )) = 3 = True (N, β)(∀z z ≤ y ) = False (N, β)(∀x∃y x < y ) = True N(∀x∃y x < y ) = True (Short notation when β irrelevant) Important Basic Notion: Model If φ is a closed formula, then, instead of I (φ) = True one writes I |= φ (“I is a model of φ”) E.g. N |= ∀x∃y x < y Standard reasoning services can now be expressed semantically Automated Theorem Proving – Peter Baumgartner – p.26 Services Semantically E.g. “entailment”: Axioms over R ∧ continuous(f ) ∧ continuous(g ) |= continuous(f + g ) ? Services Model(I ,φ): I |= φ ? (Is I a model for φ?) Validity(φ): |= φ ? (I |= φ for every interpretation?) Satisfiability(φ): φ satisfiable? (I |= φ for some interpretation?) Entailment(φ,ψ): φ |= ψ ? (does φ entail ψ?, i.e. for every interpretation I : if I |= φ then I |= ψ?) Solve(I ,φ): find an assignment β such that I , β |= φ Solve(φ): find an interpretation and assignment β such that I , β |= φ Additional complication: fix interpretation of some symbols (as in N above) What if theorem prover’s native service is only “Is φ unsatisfiable?” ? Automated Theorem Proving – Peter Baumgartner – p.27 Semantics - Reduction to Unsatisfiability Suppose we want to prove an entailment φ |= ψ Equivalently, prove |= φ → ψ, i.e. that φ → ψ is valid Equivalently, prove that ¬(φ → ψ) is not satisfiable (unsatisfiable) Equivalently, prove that φ ∧ ¬ψ is unsatisfiable Basis for (predominant) refutational theorem proving Dual problem, much harder: to disprove an entailment φ |= ψ find a model of φ ∧ ¬ψ One motivation for (finite) model generation procedures Automated Theorem Proving – Peter Baumgartner – p.28 How to Build a (First-Order) Theorem Prover 1. Fix an input language for formulas 2. 
Fix a semantics to define what the formulas mean Will be always “classical” here 3. Determine the desired services from the theorem prover (The questions we would like the prover be able to answer) 4. Design a calculus for the logic and the services Calculus: high-level description of the “logical analysis” algorithm This includes redundancy criteria for formulas and inferences 5. Prove the calculus is correct (sound and complete) wrt. the logic and the services, if possible 6. Design a proof procedure for the calculus 7. Implement the proof procedure (research topic of its own) Automated Theorem Proving – Peter Baumgartner – p.29 Calculus - Normal Forms Most first-order theorem provers take formulas in clause normal form Why Normal Forms? Reduction of logical concepts (operators, quantifiers) Reduction of syntactical structure (nesting of subformulas) Can be exploited for efficient data structures and control Translation into Clause Normal Form Formula Prenex normal form Skolem normal form Clause normal form Clausal Theorem Prover Prop: the given formula and its clause normal form are equi-satisfiable Automated Theorem Proving – Peter Baumgartner – p.30 Prenex Normal Form Prenex formulas have the form Q1 x1 . . . Qn xn F , where F is quantifier-free and Qi ∈ {∀, ∃} Computing prenex normal form by the rewrite relation ⇒P : (F ↔ G ) ⇒P ¬QxF ⇒P (QxF ρ G ) ⇒P (QxF → G ) ⇒P (F ρ QxG ) ⇒P (F → G ) ∧ (G → F ) (¬Q) Qx¬F Qy (F [y /x] ρ G ), y fresh, ρ ∈ {∧, ∨} Qy (F [y /x] → G ), y fresh Qy (F ρ G [y /x]), y fresh, ρ ∈ {∧, ∨, →} Here Q denotes the quantifier dual to Q, i.e., ∀ = ∃ and ∃ = ∀. Automated Theorem Proving – Peter Baumgartner – p.31 In the Example ∀ε(0 < ε → ∀a∃δ(0 < δ ∧ ∀x(|x − a| < δ → |f (x) − f (a)| < ε))) ⇒P ∀ε∀a(0 < ε → ∃δ(0 < δ ∧ ∀x(|x − a| < δ → |f (x) − f (a)| < ε))) ⇒P ∀ε∀a∃δ(0 < ε → 0 < δ ∧ ∀x(|x − a| < δ → |f (x) − f (a)| < ε)) ⇒P ∀ε∀a∃δ(0 < ε → ∀x(0 < δ ∧ |x − a| < δ → |f (x) − f (a)| < ε)) ⇒P ∀ε∀a∃δ∀x(0 < ε → (0 < δ ∧ (|x − a| < δ → |f (x) − f (a)| < ε))) Automated Theorem Proving – Peter Baumgartner – p.32 Skolem Normal Form Prenex normal form Formula Skolem normal form Clause normal form Clausal Theorem Prover Intuition: replacement of ∃y by a concrete choice function computing y from all the arguments y depends on. Transformation ⇒S ∀x1 , . . . , xn ∃y F ⇒S ∀x1 , . . . , xn F [f (x1 , . . . , xn )/y ] where f /n is a new function symbol (Skolem function). In the Example ∀ε∀a∃δ∀x(0 < ε → 0 < δ ∧ (|x − a| < δ → |f (x) − f (a)| < ε)) ⇒S ∀ε∀a∀x(0 < ε → 0 < d(ε, a) ∧ (|x − a| < d(ε, a) → |f (x) − f (a)| < ε)) Automated Theorem Proving – Peter Baumgartner – p.33 Clausal Normal Form (Conjunctive Normal Form) Rules to convert the matrix of the formula in Skolem normal form into a conjunction of disjunctions: (F ↔ G ) (F → G ) ¬(F ∨ G ) ¬(F ∧ G ) ¬¬F (F ∧ G ) ∨ H (F ∧ ⊤) (F ∧ ⊥) (F ∨ ⊤) (F ∨ ⊥) ⇒K ⇒K ⇒K ⇒K ⇒K ⇒K ⇒K ⇒K ⇒K ⇒K (F → G ) ∧ (G → F ) (¬F ∨ G ) (¬F ∧ ¬G ) (¬F ∨ ¬G ) F (F ∨ H) ∧ (G ∨ H) F ⊥ ⊤ F They are to be applied modulo associativity and commutativity of ∧ and ∨ Automated Theorem Proving – Peter Baumgartner – p.34 In the Example ∀ε∀a∀x(0 < ε → 0 < d(ε, a) ∧ (|x − a| < d(ε, a) → |f (x) − f (a)| < ε)) ⇒K 0 < d(ε, a) ∨ ¬ (0 < ε) ¬ (|x − a| < d(ε, a)) ∨ |f (x) − f (a)| < ε ∨ ¬ (0 < ε) Note: The universal quantifiers for the variables ε, a and x, as well as the conjunction symbol ∧ between the clauses are not written, for convenience Automated Theorem Proving – Peter Baumgartner – p.35 The Complete Picture F ∗ Q1 y1 . . . Qn yn G ∗ ∀x1 , . . . , xm H ⇒P ⇒S ∗ ⇒K ∀x , . . . 
, x | 1 {z m} leave out | (G quantifier-free) (m ≤ n, H quantifier-free) k ^ ni _ i =1 j=1 {z F′ Lij | {z } clauses Ci } N = {C1 , . . . , Ck } is called the clausal (normal) form (CNF) of F Note: the variables in the clauses are implicitly universally quantified Instead of showing that F is unsatisfiable, the proof problem from now is to show that N is unsatisfiable Can do better than “searching through all interpretations” Theorem: N is satisfiable iff it has a Herbrand model Automated Theorem Proving – Peter Baumgartner – p.36 Herbrand Interpretations A Herbrand interpretation (over a given signature Σ) is a Σ-algebra A such that The universe is the set TΣ of ground terms over Σ (a ground term is a term without any variables ): UA = TΣ Every function symbol from Σ is “mapped to itself”: fA : (s1 , . . . , sn ) 7→ f (s1 , . . . , sn ), where f is n-ary function symbol in Σ Example ΣPres = ({0/0, s/1, +/2}, {</2, ≤/2}) UA = {0, s(0), s(s(0)), . . . , 0 + 0, s(0) + 0, . . . , s(0 + 0), s(s(0) + 0), . . .} 0 7→ 0, s(0) 7→ s(0), s(s(0)) 7→ s(s(0)), . . . , 0 + 0 7→ 0 + 0, . . . Automated Theorem Proving – Peter Baumgartner – p.37 Herbrand Interpretations Only interpretations pA of predicate symbols p ∈ Σ is undetermined in a Herbrand interpretation pA represented as the set of ground atoms {p(s1 , . . . , sn ) | (s1 , . . . , sn ) ∈ pA where p ∈ Σ is n-ary predicate symbol} S Whole interpretation represented as p∈Σ pA Example ΣPres = ({0/0, s/1, +/2}, {</2, ≤/2}) (from above) N as Herbrand interpretation over ΣPres I ={ 0 ≤ 0, 0 ≤ s(0), 0 ≤ s(s(0)), . . . , 0 + 0 ≤ 0, 0 + 0 ≤ s(0), . . . , . . . , (s(0) + 0) + s(0) ≤ s(0) + (s(0) + s(0)), . . . } Automated Theorem Proving – Peter Baumgartner – p.38 Herbrand’s Theorem Proposition A Skolem normal form ∀φ is unsatisfiable iff it has no Herbrand model Theorem (Skolem-Herbrand-Theorem) ∀φ has no Herbrand model iff some finite set of ground instances {φγ1 , . . . , φγn } is unsatisfiable Applied to clause logic: Theorem (Skolem-Herbrand-Theorem) A set N of Σ-clauses is unsatisfiable iff some finite set of ground instances of clauses from N is unsatisfiable Leads immediately to theorem prover “Gilmore’s Method” Automated Theorem Proving – Peter Baumgartner – p.39 Gilmore’s Method - Based on Herbrand’s Theorem Given Formula Preprocessing: ∀x ∃y P(y , x) ∧ ∀z ¬P(z, a) Clause Form P(f (x), x) ¬P(z, a) Outer loop: Grounding Inner loop: Propositional Method Automated Theorem Proving – Peter Baumgartner – p.40 Gilmore’s Method - Based on Herbrand’s Theorem Given Formula Preprocessing: ∀x ∃y P(y , x) ∧ ∀z ¬P(z, a) Outer loop: Grounding P(f (a), a) ¬P(a, a) Clause Form P(f (x), x) ¬P(z, a) Inner loop: Propositional Method Automated Theorem Proving – Peter Baumgartner – p.41 Gilmore’s Method - Based on Herbrand’s Theorem Clause Form Given Formula Preprocessing: ∀x ∃y P(y , x) ∧ ∀z ¬P(z, a) Outer loop: Grounding P(f (a), a) ¬P(a, a) Inner loop: Propositional Method P(f (x), x) ¬P(z, a) Sat? 
No STOP: Proof found Yes Continue Outer Loop Automated Theorem Proving – Peter Baumgartner – p.42 Gilmore’s Method - Based on Herbrand’s Theorem Given Formula Clause Form Preprocessing: ∀x ∃y P(y , x) ∧ ∀z ¬P(z, a) P(f (x), x) ¬P(z, a) Outer loop: Grounding P(f (a), a) ¬P(a, a) P(f (a), a) ¬P(a, a) ¬P(f (a), a) Inner loop: Propositional Method Automated Theorem Proving – Peter Baumgartner – p.43 Gilmore’s Method - Based on Herbrand’s Theorem Given Formula Clause Form Preprocessing: ∀x ∃y P(y , x) ∧ ∀z ¬P(z, a) P(f (x), x) ¬P(z, a) Outer loop: Grounding P(f (a), a) ¬P(a, a) P(f (a), a) ¬P(a, a) ¬P(f (a), a) Inner loop: Propositional Method Sat? No STOP: Proof found Yes Continue Outer Loop Automated Theorem Proving – Peter Baumgartner – p.44 Calculi for First-Order Logic Theorem Proving Gilmore’s method reduces proof search in first-order logic to propositional logic unsatisfiability problems Main problem is the unguided generation of (very many) ground clauses All modern calculi address this problem in one way or another, e.g. Guidance: Instance-Based Methods are similar to Gilmore’s method but generate ground instances in a guided way Avoidance: Resolution calculi need not generate the ground instances at all Resolution inferences operate directly on clauses, not on their ground instances Next: propositional Resolution, lifting, first-order Resolution Automated Theorem Proving – Peter Baumgartner – p.45 The Propositional Resolution Calculus Res Modern versions of the first-order version of the resolution calculus [Robinson 1965] are (still) the most important calculi for FOTP today. Propositional resolution inference rule: C ∨A ¬A ∨ D C ∨D Terminology: C ∨ D: resolvent; A: resolved atom Propositional (positive) factorisation inference rule: C ∨A∨A C ∨A These are schematic inference rules: C and D – propositional clauses A – propositional atom “∨” is considered associative and commutative Automated Theorem Proving – Peter Baumgartner – p.46 Sample Proof 1. ¬A ∨ ¬A ∨ B (given) 2. A∨B (given) 3. ¬C ∨ ¬B (given) 4. C (given) 5. ¬A ∨ B ∨ B 6. ¬A ∨ B 7. B ∨B 8. B 9. ¬C (Res. 8. into 3.) 10. ⊥ (Res. 4. into 9.) (Res. 2. into 1.) (Fact. 5.) (Res. 2. into 6.) (Fact. 7.) Automated Theorem Proving – Peter Baumgartner – p.47 Soundness of Propositional Resolution Proposition Propositional resolution is sound Proof: Let I ∈ Σ-Alg. To be shown: 1. for resolution: I |= C ∨ A, I |= D ∨ ¬A ⇒ I |= C ∨ D 2. for factorization: I |= C ∨ A ∨ A ⇒ I |= C ∨ A Ad (i): Assume premises are valid in I . Two cases need to be considered: (a) A is valid in I , or (b) ¬A is valid in I . 
a) I |= A ⇒ I |= D ⇒ I |= C ∨ D b) I |= ¬A ⇒ I |= C ⇒ I |= C ∨ D Ad (ii): even simpler Automated Theorem Proving – Peter Baumgartner – p.48 Completeness of Propositional Resolution Theorem: Propositional Resolution is refutationally complete That is, if a propositional clause set is unsatisfiable, then Resolution will derive the empty clause ⊥ eventually More precisely: If a clause set is unsatisfiable and closed under the application of the Resolution and Factorization inference rules, then it contains the empty clause ⊥ Perhaps easiest proof: semantic tree proof technique (see blackboard) This result can be considerably strengthened, some strengthenings come for free from the proof Propositional resolution is not suitable for first-order clause sets Automated Theorem Proving – Peter Baumgartner – p.49 Lifting Propositional Resolution to First-Order Resolution Propositional resolution Clauses P(f (x), y ) ¬P(z, z) Ground instances {P(f (a), a), . . . , P(f (f (a)), f (f (a))), . . .} {¬P(a), . . . , ¬P(f (f (a)), f (f (a))), . . .} Only common instances of P(f (x), y ) and P(z, z) give rise to inference: P(f (f (a)), f (f (a))) ¬P(f (f (a)), f (f (a))) ⊥ Unification All common instances of P(f (x), y ) and P(z, z) are instances of P(f (x), f (x)) P(f (x), f (x)) is computed deterministically by unification First-order resolution P(f (x), y ) ¬P(z, z) ⊥ Justified by existence of P(f (x), f (x)) Can represent infinitely many propositional resolution inferences Automated Theorem Proving – Peter Baumgartner – p.50 Substitutions and Unifiers A substitution σ is a mapping from variables to terms which is the identity almost everywhere Example: σ = [y 7→ f (x), z 7→ f (x)] A substitution can be applied to a term or atom t, written as tσ Example, where σ is from above: P(f (x), y )σ = P(f (x), f (x)) A substitution γ is a unifier of s and t iff sγ = tγ Example: γ = [x 7→ a, y 7→ f (a), z 7→ f (a)] is a unifier of P(f (x), y ) and P(z, z) A unifier σ of s is most general iff for every unifier γ of s and t there is a substitution δ such that γ = σ ◦ δ; notation: σ = mgu(s, t) Example: σ = [y 7→ f (x), z 7→ f (x)] = mgu(P(f (x), y ), P(z, z)) There are (linear) algorithms to compute mgu’s or return “fail” Automated Theorem Proving – Peter Baumgartner – p.51 Resolution for First-Order Clauses C ∨A D ∨ ¬B (C ∨ D)σ C ∨A∨B (C ∨ A)σ if σ = mgu(A, B) [resolution] if σ = mgu(A, B) [factorization] In both cases, A and B have to be renamed apart (made variable disjoint). 
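Both rules hinge on computing a most general unifier. Before the worked example that follows, here is a small illustrative Python sketch of Robinson-style unification with occurs check; the term encoding is an assumption made here, not the slides' notation, and this naive version is exponential in the worst case (the slides note that linear algorithms exist).

```python
# Illustrative sketch of Robinson-style unification (not code from the slides).
# Assumed encoding: a variable is a string such as "x"; an application is a tuple
# ("f", arg1, ..., argn); a constant is a 0-argument application such as ("a",).

def substitute(term, sigma):
    """Apply the substitution sigma (a dict mapping variables to terms) to term."""
    if isinstance(term, str):                              # a variable
        return substitute(sigma[term], sigma) if term in sigma else term
    return (term[0],) + tuple(substitute(t, sigma) for t in term[1:])

def occurs(var, term, sigma):
    """Occurs check: does var occur in term under sigma?"""
    term = substitute(term, sigma)
    if isinstance(term, str):
        return term == var
    return any(occurs(var, t, sigma) for t in term[1:])

def unify(s, t, sigma=None):
    """Return a most general unifier of s and t extending sigma, or None ("fail")."""
    sigma = {} if sigma is None else sigma
    s, t = substitute(s, sigma), substitute(t, sigma)
    if s == t:
        return sigma
    if isinstance(s, str):                                 # bind the variable s to t
        return None if occurs(s, t, sigma) else {**sigma, s: t}
    if isinstance(t, str):
        return unify(t, s, sigma)
    if s[0] != t[0] or len(s) != len(t):                   # symbol or arity clash
        return None
    for si, ti in zip(s[1:], t[1:]):
        sigma = unify(si, ti, sigma)
        if sigma is None:
            return None
    return sigma

# mgu(P(f(x), y), P(z, z)) as in the slides: [y -> f(x), z -> f(x)]
print(unify(("P", ("f", "x"), "y"), ("P", "z", "z")))
```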
Example Q(z) ∨ P(z, z) ¬P(x, y ) Q(x) where σ = [z 7→ x, y 7→ x] [resolution] Q(z) ∨ P(z, a) ∨ P(a, y ) Q(a) ∨ P(a, a) where σ = [z 7→ a, y 7→ a] [factorization] Automated Theorem Proving – Peter Baumgartner – p.52 Completeness of First-Order Resolution Theorem: Resolution is refutationally complete That is, if a clause set is unsatisfiable, then Resolution will derive the empty clause ⊥ eventually More precisely: If a clause set is unsatisfiable and closed under the application of the Resolution and Factorization inference rules, then it contains the empty clause ⊥ Perhaps easiest proof: Herbrand Theorem + completeness of propositional resolution + Lifting Theorem (see blackboard) Lifting Theorem: the conclusion of any propositional inference on ground instances of first-order clauses can be obtained by instantiating the conclusion of a first-order inference on the first-order clauses Closure can be achieved by the “Given Clause Loop” Automated Theorem Proving – Peter Baumgartner – p.53 The “Given Clause Loop” As used in the Otter theorem prover: Lists of clauses maintained by the algorithm: usable and sos. Initialize sos with the input clauses, usable empty. Algorithm (straight from the Otter manual): While (sos is not empty and no refutation has been found) 1. Let given_clause be the ‘lightest’ clause in sos; 2. Move given_clause from sos to usable; 3. Infer and process new clauses using the inference rules in effect; each new clause must have the given_clause as one of its parents and members of usable as its other parents; new clauses that pass the retention tests are appended to sos; End of while loop. Fairness: define clause weight e.g. as “depth + length” of clause. Automated Theorem Proving – Peter Baumgartner – p.54 The “Given Clause Loop” - Graphically consequences given clause - usable list XXX $ ? ? ? $ filters set of support Automated Theorem Proving – Peter Baumgartner – p.55 Calculi for First-Order Logic Theorem Proving Recall: Gilmore’s method reduces proof search in first-order logic to propositional logic unsatisfiability problems Main problem is the unguided generation of (very many) ground clauses All modern calculi address this problem in one way or another, e.g. Guidance: Instance-Based Methods are similar to Gilmore’s method but generate ground instances in a guided way Avoidance: Resolution calculi need not generate the ground instances at all Resolution inferences operate directly on clauses, not on their ground instances Next: Instance-Based Method “Inst-Gen” Automated Theorem Proving – Peter Baumgartner – p.56 Inst-Gen [Ganzinger&Korovin 2003] Idea: “semantic” guidance: add only instances that are falsified by a “candidate model” Eventually, all repairs will be made or there is no more candidate model Important notation: ⊥ denotes both a unique constant and a substitution that maps every variable to ⊥ Example (S is “current clause set”): S : P(x, y ) ∨ P(y , x) ¬P(x, x) S⊥ : P(⊥, ⊥) ∨ P(⊥, ⊥) ¬P(⊥, ⊥) Analyze S⊥: Case 1: SAT detects unsatisfiability of S⊥ Then Conclude S is unsatisfiable But what if S⊥ is satisfied by some model, denoted by I⊥ ? Automated Theorem Proving – Peter Baumgartner – p.57 Inst-Gen Main idea: associate to model I⊥ of S⊥ a candidate model IS of S. 
Calculus goal: add instances to S so that IS becomes a model of S Example: S : P(x) ∨ Q(x) S⊥ : P(⊥) ∨ Q(⊥) ¬P(a) ¬P(a) Analyze S⊥: Case 2: SAT detects model I⊥ = {P(⊥), ¬P(a)} of S⊥ Case 2.1: candidate model IS = {¬P(a)} derived from literals selected in S by I⊥ is not a model of S Add “problematic” instance P(a) ∨ Q(a) to S to refine IS Automated Theorem Proving – Peter Baumgartner – p.58 Inst-Gen Clause set after adding P(a) ∨ Q(a) S : P(x) ∨ Q(x) S⊥ : P(⊥) ∨ Q(⊥) P(a) ∨ Q(a) P(a) ∨ Q(a) ¬P(a) ¬P(a) Analyze S⊥: Case 2: SAT detects model I⊥ = {P(⊥), Q(a), ¬P(a)} of S⊥ Case 2.2: candidate model IS = {Q(a), ¬P(a)} derived from literals selected in S by I⊥ is a model of S Then conclude S is satisfiable How to derive candidate model IS ? Automated Theorem Proving – Peter Baumgartner – p.59 Inst-Gen - Model Construction It provides (partial) interpretation for Sground for given clause set S S : P(x) ∨ Q(x) P(a) ∨ Q(a) ¬P(a) Σ = {a, b}, Sground : P(b) ∨ Q(b) P(a) ∨ Q(a) ¬P(a) For each Cground ∈ Sground find most specific C ∈ S that can be instantiated to Cground Select literal in Cground corresponding to selected literal in that C Add selected literal of that Cground to IS if not in conflict with IS Thus, IS = {P(b), Q(a), ¬P(a)} Automated Theorem Proving – Peter Baumgartner – p.60 Model Generation Scenario: no “theorem” to prove, or disprove a “theorem” A model provides further information then Why compute models? Planning: Can be formalised as propositional satisfiability problem. [Kautz& Selman, AAAI96; Dimopolous et al, ECP97] Diagnosis: Minimal models of abnormal literals (circumscription). [Reiter, AI87] Databases: View materialisation, View Updates, Integrity Constraints. Nonmonotonic reasoning: Various semantics (GCWA, Well-founded, Perfect, Stable,. . . ), all based on minimal models. [Inoue et al, CADE 92] Software Verification: Counterexamples to conjectured theorems. Theorem proving: Counterexamples to conjectured theorems. Finite models of quasigroups, (MGTP/G). [Fujita et al, IJCAI 93] Automated Theorem Proving – Peter Baumgartner – p.61 Model Generation Why compute models (cont’d)? Natural Language Processing: Maintain models I1 , . . . , In as different readings of discourses: Ii |= BG -Knowledge ∪ Discourse so far Consistency checks (“Mia’s husband loves Sally. She is not married.”) BG -Knowledge ∪ Discourse so far 6|= ¬New utterance iff BG -Knowledge ∪ Discourse so far ∪ New utterance is satisfiable Informativity checks (“Mia’s husband loves Sally. She is married.”) BG -Knowledge ∪ Discourse so far 6|= New utterance iff BG -Knowledge ∪ Discourse so far ∪ ¬New utterance is satisfiable Automated Theorem Proving – Peter Baumgartner – p.62 Example - Group Theory The following axioms specify a group ∀x, y , z : (x ∗ y ) ∗ z = x ∗ (y ∗ z) (associativity) ∀x : e ∗x = x (left − identity) ∀x : i(x) ∗ x = e (left − inverse) Does ∀x, y : x ∗y = y ∗x (commutat.) follow? No, it does not Automated Theorem Proving – Peter Baumgartner – p.63 Example - Group Theory Counterexample: a group with finite domain of size 6, where the elements 2 and 3 are not commutative: Domain: {1, 2, 3, 4, 5, 6} e:1 1 2 3 4 5 6 1 2 3 5 4 6 1 2 3 4 5 6 1 1 2 3 4 5 6 2 2 1 4 3 6 5 ∗: 3 3 5 1 6 2 4 4 4 6 2 5 1 3 5 5 3 6 1 4 2 6 6 4 5 2 3 1 i: Automated Theorem Proving – Peter Baumgartner – p.64 Finite Model Finders - Idea Assume a fixed domain size n. Use a tool to decide if there exists a model with domain size n for a given problem. Do this starting with n = 1 with increasing n until a model is found. 
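To make this increase-the-domain-size loop concrete, here is a rough, brute-force Python sketch for a tiny function-free clause set. The clause set, names and data representation are chosen here purely for illustration; actual finite model finders such as SEM, Mace or Paradox encode the per-size search for a constraint or SAT solver instead of enumerating interpretations, as the following slides describe.

```python
from itertools import product

# Brute-force version of the increase-the-domain-size loop (illustration only;
# function-free clauses). A literal is (sign, predicate, args); args are
# variable or constant names. Example clause set: { P(x) ∨ Q(x),  ¬P(a) }.
CONSTANTS = ["a"]
PREDICATES = {"P": 1, "Q": 1}
CLAUSES = [
    [(True, "P", ("x",)), (True, "Q", ("x",))],
    [(False, "P", ("a",))],
]

def holds(clause, const_val, pred_val, var_val):
    """Is the clause true in this interpretation under the variable assignment?"""
    def value(arg):
        return var_val[arg] if arg in var_val else const_val[arg]
    return any(sign == ((pred, tuple(map(value, args))) in pred_val)
               for sign, pred, args in clause)

def model_of_size(n):
    """Search all interpretations over the domain {1, ..., n}; return a model or None."""
    domain = range(1, n + 1)
    variables = sorted({a for c in CLAUSES for _, _, args in c
                        for a in args if a not in CONSTANTS})
    atoms = [(p, t) for p, k in PREDICATES.items() for t in product(domain, repeat=k)]
    for cvals in product(domain, repeat=len(CONSTANTS)):          # interpret constants
        const_val = dict(zip(CONSTANTS, cvals))
        for bits in product([False, True], repeat=len(atoms)):    # interpret predicates
            pred_val = {atom for atom, b in zip(atoms, bits) if b}
            if all(holds(c, const_val, pred_val, dict(zip(variables, vvals)))
                   for c in CLAUSES
                   for vvals in product(domain, repeat=len(variables))):
                return const_val, pred_val
    return None

# Outer loop: try n = 1, 2, 3, ... (does not terminate if no finite model exists).
n = 1
while (model := model_of_size(n)) is None:
    n += 1
print(n, model)   # prints: 1 ({'a': 1}, {('Q', (1,))})
```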
Note: domain of size n will consist of {1, . . . , n}. Automated Theorem Proving – Peter Baumgartner – p.65 1. Approach: SEM-style Tools: SEM, Finder, Mace4 Specialized constraint solvers. For a given domain generate all ground instances of the clause. Example: For domain size 2 and clause p(a, g (x)) the instances are p(a, g (1)) and p(a, g (2)). Automated Theorem Proving – Peter Baumgartner – p.66 1. Approach: SEM-style Set up multiplication tables for all symbols with the whole domain as cell values. Example: For domain size 2 and function symbol g with arity 1 the cells are g (1) = {1, 2} and g (2) = {1, 2}. Try to restrict each cell to exactly 1 value. The clauses are the constraints guiding the search and propagation. Example: if the cell of a contains {1}, the clause a = b forces the cell of b to be {1} as well. Automated Theorem Proving – Peter Baumgartner – p.67 2. Approach: Mace-style Tools: Mace2, Paradox For given domain size n transform first-order clause set into equisatisfiable propositional clause set. Original problem has a model of domain size n iff the transformed problem is satisfiable. Run SAT solver on transformed problem and translate model back. Automated Theorem Proving – Peter Baumgartner – p.68 Paradox - Example Domain: {1, 2} Clauses: {p(a) ∨ f (x) = a} Flattened: p(y ) ∨ f (x) = y ∨ a 6= y Instances: p(1) ∨ f (1) = 1 ∨ a 6= 1 p(2) ∨ f (1) = 1 ∨ a 6= 2 p(1) ∨ f (2) = 1 ∨ a 6= 1 p(2) ∨ f (2) = 1 ∨ a 6= 2 Totality: a=1∨a =2 f (1) = 1 ∨ f (1) = 2 f (2) = 1 ∨ f (2) = 2 Functionality: a 6= 1 ∨ a 6= 2 f (1) 6= 1 ∨ f (1) 6= 2 f (2) 6= 1 ∨ f (2) 6= 2 A model is obtained by setting the blue literals true Automated Theorem Proving – Peter Baumgartner – p.69 Theory Reasoning Let T be a first-order theory of signature Σ Let L be a class of Σ-formulas The T -validity Problem Given φ in L, is it the case that T |= φ ? More accurately: Given φ in L, is it the case that T |= ∀ φ ? Examples “0/0, s/1, +/2, = /2, ≤ /2′′ |= ∃y .y > x The theory of equality E |= φ (φ arbitrary formula) “An equational theory” |= ∃ s1 = t1 ∧ · · · ∧ sn = tn (E-Unification problem) “Some group theory” |= s = t (Word problem) The T -validity problem is decidably only for restricted L and T Automated Theorem Proving – Peter Baumgartner – p.70 Approaches to Theory Reasoning Theory-Reasoning in Automated First-Order Theorem Proving Semi-decide the T -validity problem, T |= φ ? φ arbitrary first-order formula, T universal theory Generality is strength and weakness at the same time Really successful only for specific instance: T = equality, inference rules like paramodulation Satisfiability Modulo Theories (SMT) Decide the T -validity problem, T |= φ ? Usual restriction: φ is quantifier-free, i.e. 
all variables implicitly universally quantified Applications in particular to formal verification Automated Theorem Proving – Peter Baumgartner – p.71
Checking Satisfiability Modulo Theories Given: A quantifier-free formula φ (implicitly existentially quantified) Task: Decide whether φ is T-satisfiable (T-validity via “T |= ∀ φ” iff “∃ ¬φ is not T-satisfiable”) Approach: eager translation into SAT Encode problem into a T-equisatisfiable propositional formula Feed formula to a SAT solver Example: T = equality (Ackermann encoding) Approach: lazy translation into SAT Couple a SAT solver with a given decision procedure for T-satisfiability of ground literals For instance if T is “equality” then the Nelson-Oppen congruence closure method can be used Automated Theorem Proving – Peter Baumgartner – p.72
Lazy Translation into SAT Automated Theorem Proving – Peter Baumgartner – pp.73–79
Lazy Translation into SAT: Summary Abstract T-atoms as propositional variables SAT solver computes a model, i.e. satisfying boolean assignment for propositional abstraction (or fails) Solution from SAT solver may not be a T-model. If so, Refine (strengthen) propositional formula by incorporating reason for false solution Start again with computing a model Automated Theorem Proving – Peter Baumgartner – p.80
Optimizations Theory Consequences The theory solver may return consequences (typically literals) to guide the SAT solver Online SAT solving The SAT solver continues its search after accepting additional clauses (rather than restarting from scratch) Preprocessing atoms Atoms are rewritten into normal form, using theory-specific atoms (e.g. associativity, commutativity) Several layers of decision procedures “Cheaper” ones are applied first Automated Theorem Proving – Peter Baumgartner – p.81
Combining Theories Automated Theorem Proving – Peter Baumgartner – p.82
Nelson-Oppen Combination Method Automated Theorem Proving – Peter Baumgartner – pp.83–92
Conclusions Talked about the role of first-order theorem proving Talked about some standard techniques (Normal forms of formulas, Resolution calculus, unification, Instance-based method, Model computation) Talked about DPLL and Satisfiability Modulo Theories (SMT) Further Topics Redundancy elimination, efficient equality reasoning, adding arithmetics to first-order theorem provers FOTP methods as decision procedures in special cases E.g. reducing planning problems and temporal logic model checking problems to function-free clause logic and using an instance-based method as a decision procedure Implementation techniques Competition CASC and TPTP problem library Instance-based methods (a lot to do here, cf. my home page) Attractive because of complementary features to more established methods Automated Theorem Proving – Peter Baumgartner – p.93