AN EXTENSION OF SPASS DECIDING FIRST-ORDER CLAUSAL CLASSES

A dissertation submitted to the University of Manchester for the degree of Master of Science in the Faculty of Engineering and Physical Sciences

2012

By Yasmine Harbit
School of Computer Science

Contents

Abstract
Declaration
Copyright
Acknowledgements
1 Introduction
2 Resolution for first-order logic
  2.1 Definitions
  2.2 First-order logic
  2.3 Resolution
    2.3.1 Inference rules
    2.3.2 Completeness and soundness
3 SPASS: An automated theorem prover
  3.1 Architecture
  3.2 Input File
  3.3 Analysis module
  3.4 Resolution part
    3.4.1 Implemented rules and orderings
    3.4.2 Default Mode
4 Resolution as decision procedure and decidable classes
  4.1 Decision procedures
  4.2 Decidable classes
    4.2.1 Classes decidable by ordered resolution
    4.2.2 Classes decidable by hyperresolution
5 Guarded clauses as case study
  5.1 The guarded clauses
  5.2 Decision procedure
  5.3 Recognition of memberships of the GC class
  5.4 Implementation of the decision procedure
    5.4.1 Ordering / Selection function
    5.4.2 Inference rules
    5.4.3 Reduction rules
  5.5 Issues
6 Other decidable classes
  6.1 Recognition part
  6.2 Implementation of the decision procedures
    6.2.1 Classes decidable by hyperresolution
    6.2.2 Classes decidable by ordered resolution
7 Tests and results
  7.1 The TPTP library
  7.2 Evaluation of the analysis module
  7.3 Classification of the TPTP library
  7.4 Evaluation of the resolution methods
8 Conclusion
Bibliography
Appendix A : SPASS flags

Word Count: 14477

List of Tables

6.1 Hyperresolution calculus
6.2 Ordered resolution calculus
7.1 Classification of the TPTP library
7.2 Improvements and deteriorations

List of Figures

3.1 SPASS architecture
3.2 SPASS input
3.3 SPASS analysis
3.4 SPASS output
4.1 Clause depth
7.1 Structure of the TPTP library
7.2 Analysis of the first test
7.3 Analysis of the second test
7.4 Analysis of the third test
7.5 Analysis of the fourth test

Abstract

SPASS is an automated theorem prover for first-order logic which tries to prove the satisfiability or unsatisfiability of a problem. The project presented in this dissertation extends SPASS so that it solves more problems than it does already. The way to achieve this aim is to make SPASS recognise problems known to be decidable by specific resolution methods and then, knowing the classes to which a problem belongs, select the appropriate resolution method. The ultimate aim is therefore to turn SPASS into a decision procedure for classes of first-order logic known to be decidable.

For this project, a program corresponding to an extension of SPASS has been implemented. The analysis performed by this program contains all the data already provided by the current version of the prover and indicates in addition the decidable classes to which the problem belongs. Thanks to these data, SPASS is able to solve more problems than it already does by setting the right rules depending on the classes the problem belongs to. The results of the tests performed are provided. They have all been found correct. The new version of SPASS can recognise eight different classes of problems and can select the right resolution method for six of them.

Declaration

No portion of the work referred to in this dissertation has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.

Copyright

i.
The author of this dissertation (including any appendices and/or schedules to this dissertation) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this dissertation, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has entered into. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the dissertation, for example graphs and tables (“Reproductions”), which may be described in this dissertation, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

iv. Further information on the conditions under which disclosure, publication and commercialisation of this dissertation, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/display.aspx?DocID=487), in any relevant Dissertation restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s Guidance for the Presentation of Dissertations.

Acknowledgements

First, I want to thank the University of Manchester for having given me the opportunity to realise this dissertation. I am particularly grateful to my supervisor Dr.
Renate A. Schmidt for the time she spent helping me, the trust she placed in me and her constructive remarks which have allowed me to go so far in this project. I have a particular thought for my parents and my brother. They have contributed to this achievement by their support, their financial aid, their patience and their limitless love. Finally, I want to thank my friends for their advice and their encouragement.

Chapter 1 Introduction

Automated provers have been a great advance in the area of automated reasoning. This area of computer science concerns programs created in order to give computers the ability to reason. These programs are asked to determine the satisfiability or unsatisfiability of different problems. Deciding the satisfiability of a problem can be easy in some cases but almost impossible in others. If there exists a method to determine whether any given formula of a logical system is satisfiable, then the system is called decidable. The project focuses on first-order logic, which is undecidable. However, a large number of fragments of first-order logic can be decided by specific methods. These fragments are detailed in this report.

Today, several automated provers for first-order logic exist. Among these provers are: Prover9, which is a successor of Otter, the first high performance theorem prover based on first-order resolution [McCune et al., 2009, McCune, 2005]; E, another high performance theorem prover [Schulz, 2010, Schulz, 2002]; and Vampire [Voronkov, 2009b, Riazanov and Voronkov, 2001]. E and Vampire are both based on the superposition calculus, which is a mix of first-order resolution and ordering-based equality reasoning.

The project consists of improving an existing prover called SPASS. Like the E and Vampire provers, SPASS is an automated theorem prover for first-order logic based on the superposition calculus [Weidenbach, 2005, Weidenbach et al., 2009].
Given a problem, SPASS tries to find a proof indicating its satisfiability or its unsatisfiability. Depending on the difficulty of the problem, SPASS can determine the satisfiability of a problem on its own or it can be guided, i.e. the users can specify the rules to use when they launch it. In fact, as first-order logic is not decidable, the task is often difficult. For many problems, SPASS cannot find how to go about solving them. However, an automated prover should ideally be completely independent. It should determine the satisfiability of as many problems as possible without any help.

The aim of the project is to increase the number of problems for which SPASS can provide a result without any external assistance. Therefore, the project aims to boost the intelligence of SPASS. The way to achieve this aim is to turn SPASS into a decision procedure for fragments of first-order logic known to be decidable by resolution.

The first step in the realisation of the project was to make a list of known decidable classes, i.e. classes for which there exists a calculus, called a decision procedure, which is sound, complete and guaranteed to terminate. The goal was to make SPASS use the information contained in this list by extending its analysis module such that SPASS can recognise whether a problem belongs to one of the classes listed. Thanks to this module, SPASS knows which resolution method to use in order to be certain of providing a result. One main issue of the project was to prove that the resolution methods implemented are sound, complete and terminating.

This report aims to describe precisely the work carried out. First, the basic concepts on which this project is built are described. This covers the main notions related to first-order logic, resolution and SPASS.
Then it is explained how resolution can be used as a decision procedure, that is to say how it can be used to determine effectively whether a problem is satisfiable or not, and a list of decidable classes is provided. Finally, the different steps concerning the implementation of the new SPASS modules are detailed.

Chapter 2 Resolution for first-order logic

2.1 Definitions

To understand all the concepts that are involved in this project, some definitions have to be given. These definitions, taken from [Schmidt, 2012, Fermüller et al., 2001a, Ganzinger and De Nivelle, 1999], are provided below.

Term: A term is either a first-order variable, a constant or a functional term, that is to say, a term of the form f(t1, ..., tn) where f is a function symbol and t1, ..., tn are terms.

Functional term: A term is called functional if it contains a constant or a function symbol.

Atom: There are two kinds of atoms: non-equational atoms and equational atoms. Non-equational atoms have the form P(s1, ..., sn) where P is a predicate symbol and s1, ..., sn are terms. Equational atoms have the form s ≈ t where s and t are terms. We speak about first-order logic with equality when equational atoms are admitted.

Literal: A literal is either an atom or an atom preceded by a negation sign.

Ground term / literal: A ground term/literal is a term/literal with no occurrences of variables.

Expression: An expression represents either a term or a literal.

Clause: A clause is a finite multiset of literals.

Simple literals and clauses: A literal L is called simple if each term in L is either a variable or a functional term containing only variables and constants as arguments. A simple clause is a clause where all literals are simple.

Horn clause: A Horn clause is a clause with at most one positive literal.
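The definitions above can be made concrete with a small data model. The following Python sketch is only an illustration and is not part of SPASS: the encoding (variables as strings, constants and functional terms as tuples, literals as triples) and the helper names `is_ground_term`, `is_ground_literal` and `is_horn` are assumptions chosen for this example.

```python
# Assumed encoding (hypothetical, for illustration only):
#   variable        -> a Python str, e.g. "x"
#   constant        -> a 1-tuple, e.g. ("a",)
#   functional term -> a tuple (symbol, arg1, ..., argn), e.g. ("f", "x", ("a",))
#   literal         -> (is_positive, predicate_symbol, tuple_of_argument_terms)
#   clause          -> a list of literals (a list keeps duplicates, like a multiset)

def is_ground_term(term):
    """A term is ground if no variable occurs in it."""
    if isinstance(term, str):  # a variable
        return False
    return all(is_ground_term(arg) for arg in term[1:])

def is_ground_literal(literal):
    """A literal is ground if all of its argument terms are ground."""
    _is_positive, _predicate, args = literal
    return all(is_ground_term(arg) for arg in args)

def is_horn(clause):
    """A Horn clause has at most one positive literal."""
    return sum(1 for is_positive, _, _ in clause if is_positive) <= 1

# P(a, b): a ground, positive unit clause.
unit = [(True, "P", (("a",), ("b",)))]
# P(x, y) ∨ ¬P(y, f(x, y)): a non-ground Horn clause.
horn = [(True, "P", ("x", "y")), (False, "P", ("y", ("f", "x", "y")))]
# P(x, y) ∨ Q(x): two positive literals, so not a Horn clause.
non_horn = [(True, "P", ("x", "y")), (True, "Q", ("x",))]
```

With this encoding, `is_horn(horn)` holds while `is_horn(non_horn)` does not, and only the literal of `unit` is ground.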
Substitution: A substitution σ is defined as follows:

\[
\sigma[x \mapsto t](y) \stackrel{\text{def}}{=}
\begin{cases}
t, & \text{if } y = x \\
\sigma(y), & \text{otherwise}
\end{cases}
\]

where [x ↦ t] means that the value t is assigned to x. This substitution can also be written yσ[x ↦ t].

Maximal: Let ≻ be a total ordering on ground atoms, i.e. an ordering such that if A and B are two ground atoms, either A ⪰ B or A ⪯ B. A ground literal L is called (strictly) maximal with respect to a ground clause C if and only if ∀L′ ∈ C: L ⪰ L′ (L ≻ L′). A non-ground literal L is (strictly) maximal with respect to a clause C if and only if there is some ground substitution σ such that Lσ is (strictly) maximal with regard to Cσ, i.e., ∀L′ ∈ C: Lσ ⪰ L′σ (Lσ ≻ L′σ).

2.2 First-order logic

In propositional logic, formulae are built from propositional symbols whose truth values are either true or false. However, sometimes it can be useful to decompose these propositions. For example, instead of having a proposition KymIsaWoman, it can be convenient to have an entity called Kym and a relation called Woman whose arity is equal to one. First-order logic has been created to meet this demand. This logic is described well in [Voronkov, 2009a], [Schmidt, 2012] and [Rautenberg, 2009]. The main notions extracted from these readings are summarised in this section.

Syntax

First-order logic is an extension of propositional logic which allows the designation of individuals and their quantification. The alphabet of a first-order language is therefore richer than that of a propositional language. It consists of logical and non-logical symbols [Schmidt, 2012]. Logical symbols can be divided into four groups: logical connectives such as →, ↔, ∧, ∨ or ¬, quantifiers (∀, ∃), variables and auxiliary symbols (‘(’, ‘[’, ‘.’, ...). Non-logical symbols correspond to function symbols, constants and predicate symbols. Individuals are denoted by terms built on these symbols.
Properties of these individuals can be specified by creating atoms. Based on the notion of atoms and on the existence of quantifiers, the definition of a formula can be introduced. A formula is an element A which satisfies one of the following conditions [Ganzinger and De Nivelle, 1999]:
- A = ⊤ or A = ⊥,
- A is an atom,
- A = ¬F where F is a formula,
- A = (F ∗ G) where ∗ ∈ {→, ↔, ∧, ∨} and F, G are formulae,
- A = ∀xF or A = ∃xF where F is a formula.

Conversion to Clausal Normal Form

A difficulty in first-order logic is how to deal with quantifiers. It is easiest to get rid of them. Many transformations have been introduced in order to obtain a clausal form without any quantifiers [Schmidt, 2012, Nonnengart and Weidenbach, 2001]. The first step is to transform the formula such that all quantifiers come at the beginning. After this step, the formula is said to be in prenex normal form. Then the aim is to eliminate all ∃ quantifiers. The idea is to replace each existentially quantified variable x by a term built from an appropriate fresh function symbol taking as arguments the variables on which x depends. This step is called Skolemisation [Skolem, 1955]. After this step, the formula can be transformed into conjunctive normal form, that is to say into the following form:

\[
\forall x_1 \dots \forall x_n \bigwedge_{i=1}^{k} \bigvee_{j=1}^{n_i} L_{ij}
\]

Finally, by dropping all ∀ and all ∧, several clauses which do not possess any quantifier are obtained. The problem is then said to be in clausal form.

Semantics

It has been seen what a formula is and how it can be transformed into a convenient form. It is now time to define the notion of truth for a formula. The truth value of a formula depends on the interpretation of its language. An interpretation assigns domain elements to the free variables (variables which are not inside the scope of a quantifier) [Schmidt, 2012]. An interpretation I satisfies a formula F (equivalently, I is a model of F) if F is true in I. This is denoted by I ⊧ F.
F is said to be satisfiable if there exists an interpretation I such that I ⊧ F. F is valid if F is true in every interpretation.

2.3 Resolution

Resolution is a system performing proofs by refutation: for each problem, it aims to prove the unsatisfiability of a set of clauses (see [Voronkov, 2009a, Schmidt, 2012, Leitsch, 1997]). It works on a set of clauses presented in clausal form. Resolution is based on the principle of making inferences. Making inferences amounts to using inference rules whose general form is:

\[
\frac{F_1 \quad \dots \quad F_n}{G}
\]

where n ≥ 0 and F1, ..., Fn, G are formulae. F1, ..., Fn are said to be the premises and G the conclusion. A set of inference rules is called an inference system (or calculus) [Schmidt, 2012]. There exist many different inference rules. The most relevant ones for the project are described in the next section.

2.3.1 Inference rules

The project deals with resolution for first-order logic problems. First-order resolution is a little more difficult than propositional resolution. The difficulty comes from the use of variables. In this context, two terms that look different can in fact be made identical. There is a process called unification which aims to demonstrate the equality of two terms by the application of some substitutions [Fermüller et al., 2001b]. The definition, taken from [Fermüller et al., 2001a], is the following:

“A set of expressions M is unifiable by a substitution σ (called unifier of M) if Ei σ = Ej σ for all Ei, Ej ∈ M. σ is called the most general unifier (mgu) of M if for every other unifier ρ of M, there is a substitution θ such that σθ = ρ.”

When first-order logic is concerned, unification is required in order to compare terms. The inference rules of the first-order resolution calculus are based on the unification of terms. The definitions of the main inference rules involved in this project are provided below.
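To illustrate the unification process just described, the following Python sketch computes a most general unifier of two expressions. It is a textbook-style algorithm with an occurs check, not the unification code of SPASS. The encoding (variables as strings, constants and compound expressions as tuples) and the names `walk`, `occurs`, `unify` and `apply_subst` are assumptions made for this example, and the unifier is returned in "triangular" form, so bindings may refer to other bound variables until `apply_subst` resolves them.

```python
def is_var(term):
    # Assumed encoding: variables are strings; constants are 1-tuples such as
    # ("a",); compound expressions are tuples (symbol, arg1, ..., argn).
    return isinstance(term, str)

def walk(term, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def occurs(var, term, subst):
    """Occurs check: does var occur in term under the current bindings?"""
    term = walk(term, subst)
    if is_var(term):
        return var == term
    return any(occurs(var, arg, subst) for arg in term[1:])

def unify(s, t):
    """Return a most general unifier of s and t as a dict, or None."""
    subst = {}
    stack = [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            if occurs(a, b, subst):
                return None
            subst[a] = b
        elif is_var(b):
            if occurs(b, a, subst):
                return None
            subst[b] = a
        elif a[0] == b[0] and len(a) == len(b):
            stack.extend(zip(a[1:], b[1:]))  # same symbol: unify arguments
        else:
            return None  # symbol or arity clash
    return subst

def apply_subst(term, subst):
    """Apply a (triangular) substitution to a term."""
    term = walk(term, subst)
    if is_var(term):
        return term
    return (term[0],) + tuple(apply_subst(arg, subst) for arg in term[1:])

# Unify P(x, f(y)) with P(a, f(x)).
left = ("P", "x", ("f", "y"))
right = ("P", ("a",), ("f", "x"))
mgu = unify(left, right)
```

In this example both sides become P(a, f(a)) once `apply_subst` is applied with the computed unifier, whereas `unify("x", ("f", "x"))` fails because of the occurs check.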
There are two main inference rules in the first-order resolution calculus: the binary resolution rule and the binary factoring rule. These rules are defined as follows [Bachmair and Ganzinger, 2001]:

Binary resolution rule

\[
\frac{C \lor A \qquad \neg B \lor D}{(C \lor D)\sigma}
\]

where A and B denote atoms, C and D denote clauses and σ is an mgu of A and B. (C ∨ D)σ is called a resolvent of the premises.

Binary factoring rule

\[
\frac{C \lor L \lor L}{C \lor L}
\]

where L is a literal and C denotes a clause. C ∨ L is called a factor of the initial clause.

Some other inference rules are required for the project. This is the case for the splitting rule and superposition, which are described below.

Splitting rule

Let N be a set of clauses and C and D be two variable-disjoint clauses. The splitting rule can be defined as follows [Bachmair and Ganzinger, 2001]:

\[
\frac{N \cup \{C \lor D\}}{N \cup \{C\} \;\mid\; N \cup \{D\}}
\]

Superposition

Superposition is used to deal with equality. It takes into account two parameters: a specific ordering and a selection function. Superposition is described in detail in Section 5.4.2.

This part would not be complete without the rules related to the ordered resolution and hyperresolution calculi. These two inference systems have been very useful for reaching the aim of the project.

Ordered Resolution

An ordering is a transitive and irreflexive binary relation which specifies an order between literals. Inference rules are then applied only to the maximal literals. A selection function is a mapping from a clause C to a set of negative literals present in C [Bachmair and Ganzinger, 2001, Schmidt, 2012]. This set specifies the literals to use first when inferring new clauses. Suppose that an atom ordering ≻ and a selection function S have been chosen. The ordered resolution and ordered factoring rules are defined as follows [Bachmair and Ganzinger, 2001]. Let C and D be two clauses.
Ordered factoring rule:

\[
\frac{C \lor L_1 \lor L_2}{(C \lor L_1)\sigma}
\]

provided that:
- L1, L2 are two literals,
- σ is a most general unifier of L1 and L2,
- L1σ is strictly maximal with respect to Cσ and nothing is selected in C by S.

Ordered resolution with selection:

\[
\frac{C \lor A \qquad \neg B \lor D}{(C \lor D)\sigma}
\]

provided that:
- A, B are two atoms,
- σ is a most general unifier of A and B,
- Aσ is strictly maximal with respect to Cσ,
- nothing is selected in C by S,
- ¬B is selected, or else nothing is selected in ¬B ∨ D and ¬Bσ is maximal with respect to Dσ.

Hyperresolution

The hyperresolution calculus is a refinement of resolution. As for ordered resolution, an ordering and a selection function S are used. But in this case, S selects all negative literals in a clause. Hyperresolution can be defined as follows [Georgieva et al., 2003, Schmidt, 2008]:

\[
\frac{C_1 \lor A_1 \quad \dots \quad C_n \lor A_n \qquad \neg B_1 \lor \dots \lor \neg B_n \lor D}{(C_1 \lor \dots \lor C_n \lor D)\sigma}
\]

provided that:
- σ is the most general unifier such that A1σ = B1σ, ..., Anσ = Bnσ,
- Aiσ is strictly maximal in Ciσ, 1 ≤ i ≤ n,
- nothing is selected in Ci,
- the indicated ¬Bi are exactly the ones selected by S,
- D is a clause containing only positive literals.

The factoring rule previously defined is also included in the hyperresolution calculus. The hyperresolution calculus produces fewer inferences and only positive clauses. The following example illustrates this statement. Let N be the set of clauses {C1 = D1 ∨ A1, C2 = D2 ∨ A2, C3 = ¬A1 ∨ ¬A2 ∨ D3}. Suppose that ordered resolution is used with a selection function S which first selects the literal ¬A1. From C1 and C3, a first clause C4 = D1 ∨ ¬A2 ∨ D3 can be inferred. After this step, the selection function selects the remaining negative literal ¬A2. From C2 and C4, a second clause C5 = D1 ∨ D2 ∨ D3 is then inferred. Now, suppose that hyperresolution is used instead of ordered resolution. All the negative literals are selected at the same time. Then D1 ∨ D2 ∨ D3 is directly inferred from N.

2.3.2 Completeness and soundness

Some calculi have interesting properties. In particular, they can be complete and sound. A resolution refinement is called complete if it satisfies the following condition: if a set of clauses is unsatisfiable, the empty clause is derivable from the set by the resolution calculus [Fermüller et al., 1993a]. It is called sound when its inference rules provide a conclusion which is a semantic/logical consequence of the premises [Schmidt, 2012]. These definitions imply that:
- If a calculus R1 is complete, R1 can prove all the unsatisfiable problems, i.e., R1 derives the empty clause for all the unsatisfiable problems. But a satisfiable problem can also be provable by R1, as nothing prevents R1 from deriving the empty clause from a satisfiable problem. The result of R1 can therefore be declared correct only when it reports that the problem is satisfiable.
- If a calculus R2 is sound, R2 can prove only unsatisfiable problems. But the weakness of this kind of calculus is that some unsatisfiable problems are not provable: R2 does not necessarily derive the empty clause from an unsatisfiable problem. The result of R2 can therefore be declared correct only when it reports that the problem is unsatisfiable.

A sound and complete calculus can consequently prove all the unsatisfiable problems and only the unsatisfiable problems. If no proof is found, the problem is necessarily satisfiable. Suppose that we have a reasoner based on a calculus which is sound and complete. After applying all inference rules to the set of clauses N, if the empty clause is derived then N is unsatisfiable; otherwise, if no new clauses can be inferred, N is called saturated and that implies N is satisfiable.

One of the most important issues of this project is to prove that all the implemented calculi are complete and sound. Without this verification, we cannot be sure that the returned results are correct. A reasoner of this kind then gives an efficient proof search.
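The saturation argument above can be sketched for the propositional case, where binary resolution is sound, complete and always terminates because only finitely many clauses exist over a finite set of symbols. The Python code below is a naive illustration and not the SPASS main loop: the clause encoding (sets of non-zero integers, with -n for the negation of n) and the names `resolvents` and `saturate` are assumptions made for this example.

```python
from itertools import combinations

def resolvents(c1, c2):
    """All binary resolvents of two propositional clauses.

    Clauses are frozensets of non-zero integers; -n denotes the negation of n.
    Using sets means that duplicate literals are merged automatically, which
    plays the role of factoring in the propositional case.
    """
    result = set()
    for literal in c1:
        if -literal in c2:
            result.add(frozenset((c1 - {literal}) | (c2 - {-literal})))
    return result

def saturate(clauses):
    """Apply resolution until the empty clause appears or nothing new is inferred."""
    known = {frozenset(clause) for clause in clauses}
    while True:
        if frozenset() in known:
            return "unsatisfiable"   # the empty clause has been derived
        new = set()
        for c1, c2 in combinations(known, 2):
            new |= resolvents(c1, c2)
        if new <= known:
            return "satisfiable"     # the clause set is saturated
        known |= new

# {A ∨ B, ¬A, ¬B} is unsatisfiable; {A ∨ B, ¬A} is satisfiable (take B true).
first_result = saturate([{1, 2}, {-1}, {-2}])
second_result = saturate([{1, 2}, {-1}])
```

The sketch deliberately recomputes all pairs in every round, which keeps it short but makes it unsuitable for anything beyond small examples.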
However, applying all inference rules without any restriction leads to redundant clauses. A clause C is called redundant with respect to a clause set S if there exist C1, ..., Cn ∈ S, n ≥ 0, such that all Ci ≺ C and C1, ..., Cn ⊧ C [Bachmair and Ganzinger, 2001, Schmidt, 2012]. During the saturation process, redundant clauses can be removed. This is the role of the condensation and deletion rules. Among these rules are the tautology elimination rule and the subsumption elimination rule. As their names indicate, the tautology elimination rule consists of deleting all tautologies, that is to say all clauses C which are always true (⊧ C). The subsumption elimination rule deletes clauses which are subsumed by other clauses of the set, that is to say all clauses D such that there exists a clause C with Cσ ⊂ D for some substitution σ. This rule is especially interesting when condensation is allowed. A condensation of a clause C (Cond(C)) is by definition “a minimal subclause of C which is also an instance of C” [Horrocks et al., 2007]. Cond(C) subsumes C and therefore, if the subsumption elimination rule is activated, C can be removed in the presence of Cond(C).

Chapter 3 SPASS: An automated theorem prover

The aim of the project is to improve the prover SPASS. SPASS is an automated theorem prover for first-order logic with equality based on the superposition calculus [Weidenbach, 2005, Weidenbach et al., 2007b, Weidenbach et al., 2007a]. It was a requirement of the project specification to use SPASS. SPASS has been chosen because it is an open source prover and it is widely used. Moreover, it contains a large number of rules, such as the splitting rule, that are required for the implementation of the different decision procedures. This is not the case for all provers. In this chapter, the functioning of SPASS is explained.

3.1 Architecture

SPASS works on an input file written in dfg format. This file is treated by the module top which contains the main function.
The first step performed when a problem is given is the transformation of this problem into clausal form. This step is performed by the module FLOTTER. A further module is dedicated to handling the different actions related to clauses. In particular, it is this module which selects the literals in a clause. After the transformation of the problem into clausal normal form, an analysis is performed. For this project, a new module has been created which tells whether a problem belongs to one of the decidable classes described in this report. It has been integrated into the SPASS analysis module. The analysis performed enables SPASS to set the adequate rules to use.

The rules are set by activating specific flags. SPASS has an associated flag for each rule implemented. To activate a rule, the user just has to specify the option -flag=value on the command line [Weidenbach, 2005]. For example, to activate the ordered factoring rule, the option -IOFc=1 should be set. The list of all flags is provided in Appendix A. The values of the flags can be set automatically by activating the automatic mode -Auto=1. It is possible to activate specific rules even in the automatic mode. Once the rules are set, the resolution is performed. The output is then printed on the terminal. The architecture of SPASS is summarised in Figure 3.1.

Figure 3.1: SPASS architecture

More details concerning the functioning of SPASS are provided in the following sections.

3.2 Input File

SPASS takes as input a file written in a specific syntax. This input file, which is in dfg format, should contain different pieces of information. It consists of two parts: a description part and a logical part [Weidenbach, 2005]. The description part needs to contain the name of the problem and of its author, the status of the problem (satisfiable or unsatisfiable) and a brief description of the problem.
The logical part can be divided into two parts: a part where symbols are declared and a part where formulas or clauses are declared [Weidenbach, 2005]. Signature symbols have to be declared first. This amounts to declaring the necessary function and predicate symbols. The function symbols are declared with functions[], which takes as arguments pairs specifying the symbols and the arities of the functions. The predicate symbols are declared similarly with predicates[]. Formulas and clauses are then declared. Depending on the clause type chosen, the clause should be written in a different form. A cnf clause has the form forall(term list, or(...)) and a dnf clause has the form exists(term list, and(...)). Figure 3.2 is an example of a SPASS input file.

    begin_problem(example1).

    list_of_descriptions.
    name({*example1*}).
    author({*Yasmine*}).
    status(satisfiable).
    description({*This problem is a test concerning the recognition part.*}).
    end_of_list.

    list_of_symbols.
    functions[(a,0),(b,0),(c,0),(f,2),(g,3)].
    predicates[(P,2),(Q,3)].
    end_of_list.

    list_of_clauses(axioms, cnf).
    clause(or(P(a,b))).
    clause(forall([x,y],or(P(x,y),not(P(y,f(x,y)))))).
    clause(forall([x,y,z],or(Q(x,y,z),not(Q(g(z,x,y),x,g(g(y,z,x),x,z)))))).
    clause(or(not(Q(a,b,c)))).
    end_of_list.

    end_problem.

Figure 3.2: SPASS input

3.3 Analysis module

SPASS starts by reading the input file, then transforms the problem into clausal normal form, and then analyses the problem. The SPASS analysis module can give a lot of information about the problem.
This analysis module is able to tell:
- if the problem is a Horn problem, i.e., a problem which contains only Horn clauses,
- if the problem is a monadic problem,
- if the problem contains equational atoms, that is to say if it is a problem with equality,
- if the problem is a first-order problem or a pure propositional problem,
- if the problem contains function symbols,
- if the conjecture is ground.

Figure 3.3 illustrates the analysis produced for the problem provided in the previous section.

    Input Problem:
    4[0:Inp] || Q(a,b,c) -> .
    3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U)) -> Q(V,W,U).
    2[0:Inp] || P(U,f(V,U)) -> P(V,U).
    1[0:Inp] || -> P(a,b).
    This is a first-order Horn problem without equality.
    Axiom clauses: 4 Conjecture clauses: 0

Figure 3.3: SPASS analysis

3.4 Resolution part

3.4.1 Implemented rules and orderings

SPASS has thirty-one rules implemented. These rules are listed in Appendix A. They can be split into two different groups: inference rules and reduction rules. All the rules needed for the project are already implemented. Among them, there are:
- the usual inference rules such as ordered resolution, ordered hyperresolution, ordered factoring and the splitting rule,
- the inference rules concerning problems with equality such as equality resolution, reflexivity resolution, superposition left and superposition right,
- the usual reduction rules such as tautology deletion, subsumption deletion and the condensation rule.

The ordering used in the resolution of a problem can be chosen. There are two orderings implemented in SPASS [Weidenbach et al., 2007b]: the Knuth-Bendix ordering (KBO) and the recursive path ordering with status (RPOS).

The Knuth-Bendix ordering is based on a weight function which is a mapping from the set of signature symbols (functions, predicates) into the non-negative integers. It also takes as a parameter a strict order on the set of signature symbols.
This order on the signature symbols is called a precedence. The KBO implemented in SPASS is defined as follows [Weidenbach et al., 2007b]: if $t$ and $s$ are terms, then $t \succ_{kbo} s$ if $occ(x, t) \geq occ(x, s)$ for every variable $x \in vars(t) \cup vars(s)$ and
(1) $weight(t) > weight(s)$, or
(2) $weight(t) = weight(s)$, $t = f(t_1, \ldots, t_k)$, $s = g(s_1, \ldots, s_l)$ and
  (2a) $f > g$ in the precedence, or
  (2b) $f = g$ and
    (2b1) $status(f) = left$ and $(t_1, \ldots, t_k) \succ^{lex}_{kbo} (s_1, \ldots, s_l)$, or
    (2b2) $status(f) = right$ and $(t_k, t_{k-1}, \ldots, t_1) \succ^{lex}_{kbo} (s_l, s_{l-1}, \ldots, s_1)$.

The recursive path ordering with status does not use a weight function, but it does use a precedence. It also requires a status for each symbol, which can be left, right or mul and specifies whether the arguments are compared lexicographically (from the left or from the right) or as multisets. The RPOS implemented in SPASS is defined as follows [Weidenbach et al., 2007b]: if $t$ and $s$ are terms, then $t \succ_{rpos} s$ if
(1) $s \in vars(t)$ and $t \neq s$, or
(2) $t = f(t_1, \ldots, t_k)$, $s = g(s_1, \ldots, s_l)$ and
  (2a) $t_i \succeq_{rpos} s$ for some $1 \leq i \leq k$, or
  (2b) $f > g$ and $t \succ_{rpos} s_j$ for all $1 \leq j \leq l$, or
  (2c) $f = g$ and
    (2c1) $status(f) = left$, $(t_1, \ldots, t_k) \succ^{lex}_{rpos} (s_1, \ldots, s_l)$ and $t \succ_{rpos} s_j$ for all $1 \leq j \leq l$, or
    (2c2) $status(f) = right$, $(t_k, t_{k-1}, \ldots, t_1) \succ^{lex}_{rpos} (s_l, s_{l-1}, \ldots, s_1)$ and $t \succ_{rpos} s_j$ for all $1 \leq j \leq l$, or
    (2c3) $status(f) = mul$ and $(t_1, \ldots, t_k) \succ^{mul}_{rpos} (s_1, \ldots, s_l)$.

These are the two main orderings used in today's provers, and almost any other ordering can be obtained by modifying them slightly [Weidenbach et al., 2007b].

3.4.2 Default Mode

By default, SPASS runs in automatic mode. In this mode, regardless of the problem, the following rules are enabled: trivial literal elimination, tautology deletion, forward/backward matching replacement resolution and subsumption deletion.
The other rules are set depending on the analysis of the problem. The settings are detailed below:
- If the problem contains real predicates, the ordered resolution and condensation rules are activated. Moreover, if the problem contains non-Horn clauses, the ordered factoring rule is also activated.
- If the problem contains positive equations, the superposition right/left, forward/backward rewriting and condensation rules are activated. Moreover, if the problem contains non-Horn clauses, equality factoring is also activated.
- If the problem contains negative equations, equality resolution is activated.
- If the problem does not contain any function symbols, all the negative literals are selected; otherwise only the maximal negative literals are selected.

With these settings, SPASS is able to provide a result for a certain number of problems. For many other problems, however, SPASS fails to reach a result. One remedy is to make SPASS use a resolution method known to be a decision procedure for the input problem. SPASS is then guaranteed to terminate with a result. The result produced is either Proof found or Completion found, depending on the satisfiability of the problem. SPASS can also provide a proof if it finds one. Figure 3.4 illustrates the output produced by SPASS for the example studied in Section 3.2.

    Inferences: IORe=1
    Reductions: RFMRR=1 RBMRR=1 RObv=1 RUnC=1 RTaut=1 RFSub=1 RBSub=1 RCon=1
    Extras    : Input Saturation, Dynamic Selection, No Splitting, Full Reduction, Ratio: 5, FuncWeight: 1, VarWeight: 1
    Precedence: f > g > P > Q > a > b > c
    Ordering  : KBO
    Processed Problem:
    Usable Clauses:
    1[0:Inp] || -> P(a,b)*.
    4[0:Inp] || Q(a,b,c)* -> .
    2[0:Inp] || P(U,f(V,U))* -> P(V,U).
    3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U))* -> Q(V,W,U).
    Given clause: 1[0:Inp] || -> P(a,b)*.
    Given clause: 4[0:Inp] || Q(a,b,c)* -> .
    Given clause: 2[0:Inp] || P(U,f(V,U))* -> P(V,U).
    Given clause: 3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U))* -> Q(V,W,U).
    SPASS V 3.8d
    SPASS beiseite: Completion found.
    Problem: examples/example.dfg
    SPASS derived 0 clauses, backtracked 0 clauses, performed 0 splits and kept 4 clauses.
    SPASS allocated 31830 KBytes.
    SPASS spent 0:00:00.05 on the problem.
                0:00:00.02 for the input.
                0:00:00.00 for the FLOTTER CNF translation.
                0:00:00.00 for inferences.
                0:00:00.00 for the backtracking.
                0:00:00.00 for the reduction.

Figure 3.4: SPASS output

Chapter 4 Resolution as decision procedure and decidable classes

4.1 Decision procedures

The main inference rules for first-order logic are known, but there is no general mechanical procedure that decides whether a problem is satisfiable or not. If a prover does not terminate, there is no way to know whether the problem is unsatisfiable and the derivation has not yet been carried far enough, or whether the problem is satisfiable and the saturation will never finish. First-order logic is said to be undecidable. Nevertheless, some classes of first-order problems can be decided by refinements of the resolution calculus [Fermüller et al., 2001a]. Such reasoning methods are called decision procedures. A decision procedure is sound, complete and terminating. Important refinements of the resolution calculus are those based on orderings, selection functions and hyperresolution. It has been shown that these refinements, used with specific rules such as the splitting rule, make some classes of problems decidable.

Take the monadic class as an example. The monadic class is the class of clause sets obtained from formulas that contain only monadic predicate symbols, i.e., predicate symbols of arity 1, and no function symbols. The atoms allowed in this class necessarily have the form P(s) or P(f(t)), where P is a predicate symbol, f is a Skolem function and t is either a constant or a variable.
Consider the following set S of clauses:
1. ¬P(x) ∨ P(f(x))
2. P(a)

This set belongs to the monadic class. Applying unrestricted resolution to S gives the following derivation:
3. P(f(a))          Res(1,2) with σ = {x/a}
4. P(f(f(a)))       Res(1,3) with σ = {x/f(a)}
5. P(f(f(f(a))))    Res(1,4) with σ = {x/f(f(a))}
...

In this case, clauses that do not belong to the monadic class (clauses 4, 5, ...) are inferred. This can be prevented by the use of an appropriate ordering. Let ≻ be an ordering such that for all atoms A, B, A ≻ B if A has more function symbols than B. Under this ordering restriction, no inference can be performed on S: P(f(x)) is the only maximal literal of clause 1, so the literal ¬P(x) may not be resolved upon. This example shows that the ordering restriction prevents the inference of ever deeper clauses, which explains why ordered resolution is used for some classes of problems. For classes more complex than the monadic class, ordering restrictions are not enough, which is why different refinements of resolution are required depending on the class concerned. The decidable classes that have been chosen for implementation and their decision procedures are described below.

4.2 Decidable classes

4.2.1 Classes decidable by ordered resolution

The first important step of the project was to make a list of known decidable classes that could be implemented. The chapter on resolution decision procedures from the Handbook of Automated Reasoning [Fermüller et al., 2001a] was of great help for this task. For each class, a concrete definition and a resolution decision procedure are provided. In this section, only classes decidable by orderings are described. Most of these classes use resolution based on an A-ordering as decision procedure. An A-ordering ≺a is an irreflexive, transitive binary relation on atoms such that for all atoms A, B and all substitutions θ: A ≺a B implies Aθ ≺a Bθ.
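The effect of the ordering restriction on the set S above can be illustrated with a small sketch. It is not SPASS code: the `Var`, `Fn` and `Literal` types and the helper names are hypothetical, and atoms are compared simply by counting function symbol occurrences (constants included), which is one possible reading of the ordering ≻ described in the example.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Fn:
    symbol: str
    args: Tuple["Term", ...] = ()  # a constant is an Fn without arguments


Term = Union[Var, Fn]


@dataclass(frozen=True)
class Literal:
    positive: bool
    predicate: str
    args: Tuple[Term, ...]


def count_function_symbols(t: Term) -> int:
    """Number of function symbol occurrences in t (constants included)."""
    if isinstance(t, Var):
        return 0
    return 1 + sum(count_function_symbols(a) for a in t.args)


def atom_weight(lit: Literal) -> int:
    """Number of function symbol occurrences in the atom of a literal."""
    return sum(count_function_symbols(a) for a in lit.args)


def maximal_literals(clause: List[Literal]) -> List[Literal]:
    """Literals whose atom is not smaller than any other atom of the clause."""
    top = max(atom_weight(lit) for lit in clause)
    return [lit for lit in clause if atom_weight(lit) == top]


# Clause 1 of the example: not P(x) or P(f(x)).
x = Var("x")
clause1 = [Literal(False, "P", (x,)), Literal(True, "P", (Fn("f", (x,)),))]

# Only P(f(x)) is maximal, so the negative literal is not eligible
# for resolution with clause 2, P(a).
eligible = maximal_literals(clause1)
```

Under these assumptions, ordered resolution may only resolve upon the literals returned by `maximal_literals`, which is why the infinite derivation of the example is cut off.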
The notion of A-ordering was first introduced in [Kowalski and Hayes, 1968]. This kind of ordering is particularly interesting because a resolution refinement based on an A-ordering combined with the splitting rule is known to be complete. A proof is provided in [Fermüller et al., 1993c].

Before giving the definitions of the different classes, new notions and notations need to be introduced. First, the definition of a (weakly) covering term, as taken from [Fermüller et al., 2001a], is provided. A functional term t is called covering if every functional subterm s occurring in t has the same set of variables as t. It is called weakly covering if this holds for every non-ground functional subterm s occurring in t. An atom or literal A is called covering if each argument of A is either a variable, a constant, or a covering term t such that the set of variables of t is equal to the set of variables of A. It is called weakly covering if each argument of A is either a variable, a ground term, or a weakly covering term t such that the set of variables of t is equal to the set of variables of A. For example,
- P(f(x,y),g(x,y,y),a) is covering.
- P(f(x,y),g(x,y,y),h(a)) is not covering but it is weakly covering.
- P(f(x,y),g(x,y,y),h(x)) is neither covering nor weakly covering.

The notations used in the rest of this thesis are the following:
- S represents a clause set.
- C+ (respectively C−) denotes the set of positive literals occurring in C (respectively the set of negative literals occurring in C).
- τ(E) denotes the depth of the expression E, and τmin(t, E) (τmax(t, E)) the minimal (maximal) depth of a term t within an expression E.
- τv(E) represents the maximal variable depth of an expression E and is defined as: τv(E) = max{τmax(y, E) | y belongs to the set of variables of E}.

The depth of a term t is defined as follows [Fermüller et al., 2001a]. If t is a variable or a constant, then τ(t) = 0. If t = f(t1, ..., tn), then τ(t) = 1 + max{τ(ti) | 1 ≤ i ≤ n}. For all literals L and clauses C, τ(L) = max{τ(t) | t ∈ args(L)} and τ(C) = max{τ(L) | L ∈ C}.
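The depth definitions above translate directly into a short recursive sketch. The term representation (`Var`, `Fn`) and the function names are assumptions made for illustration and do not correspond to SPASS data structures; a literal is represented only by its tuple of argument terms.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Fn:
    symbol: str
    args: Tuple["Term", ...] = ()  # a constant is an Fn without arguments


Term = Union[Var, Fn]


def depth(t: Term) -> int:
    """tau(t): 0 for variables and constants, 1 + max argument depth otherwise."""
    if isinstance(t, Var) or not t.args:
        return 0
    return 1 + max(depth(a) for a in t.args)


def literal_depth(args: Sequence[Term]) -> int:
    """tau(L): maximal depth among the arguments of a literal."""
    return max((depth(a) for a in args), default=0)


def occurrence_depths(x: Var, t: Term, level: int = 0) -> List[int]:
    """Depths at which the variable x occurs inside the term t."""
    if isinstance(t, Var):
        return [level] if t == x else []
    return [d for a in t.args for d in occurrence_depths(x, a, level + 1)]


def tau_min(x: Var, args: Sequence[Term]) -> int:
    """Minimal depth of x in a literal; raises ValueError if x does not occur."""
    return min(d for a in args for d in occurrence_depths(x, a))


def tau_max(x: Var, args: Sequence[Term]) -> int:
    """Maximal depth of x in a literal; raises ValueError if x does not occur."""
    return max(d for a in args for d in occurrence_depths(x, a))


# Arguments of the literal P(x, y, f(y), f(f(x))).
x, y = Var("x"), Var("y")
p_args = (x, y, Fn("f", (y,)), Fn("f", (Fn("f", (x,)),)))
clause_depth = literal_depth(p_args)  # depth of a one-literal clause
```

A variable that is a direct argument of the literal is counted at depth 0, so each enclosing function symbol adds one level.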
The next example illustrates the notion of depth. Let C = {P(x, y, f(y), f(f(x)))}. Figure 4.1 represents the clause C as a tree.

Figure 4.1: Clause depth

It is easy to see from this tree that τmin(x, C) = τmin(y, C) = 0, τmax(y, C) = 1 and τmax(x, C) = 2. The depth of C is the depth of the entire tree, that is to say τ(C) = 2.

Classes E1 and E+

Some decidable classes are based on these notions of covering and weakly covering. This is the case for the classes E1 and E+. The E1 class can be seen as an extension of the Ackermann class, which is characterised by the prefix type ∃*∀∃* [Fermüller et al., 1993c]. It is defined as follows. Suppose that for each clause Ci of S, the following two conditions are true:
1. all literals appearing in Ci are covering, and
2. for all literals L, M belonging to Ci, either V(L) = V(M) or V(L) ∩ V(M) = ∅, where V(L) denotes the set of variables of L.
Then S is said to belong to E1 [Fermüller et al., 1993c].

This class has been studied because of its nice properties. First, it is known that if a clause set S belongs to E1, then a resolvent or a factor of a clause C ∈ S cannot have more variables than C [Fermüller et al., 2001a]. Moreover, the depth of the resolvents of clauses in E1 can be limited by the use of the A-ordering R1, defined such that for all atoms A and B, A is greater than B if τ(A) > τ(B) and τmax(x, A) > τmax(x, B) for all x ∈ V(B) [Fermüller et al., 1993c]. The E1 class has been proved to be decidable [Fermüller et al., 1993c]: R1 combined with the splitting rule gives a decision procedure for E1 [Fermüller et al., 1993c, Fermüller et al., 2001a].

The class E+ is substantially similar to the class E1. The only difference concerns the first condition: literals do not need to be covering but all literals need to be weakly covering.
This class has the same properties as the class E1: it is known to be decidable, and R1 combined with the splitting rule provides a decision procedure [Fermüller et al., 1993c, de Nivelle, 2000].

Class S+

A third class, called the S+ class, has been shown to be related to the E1 class [Fermüller et al., 2001a]. This class can be seen as an extension of the Skolem class, characterised by a prefix of the form (∃z1)...(∃zl)(∀y1)...(∀ym)(∃x1)...(∃xn). Its definition is given below [Fermüller et al., 2001a]. Suppose that for each clause Ci of S, the following two conditions are true:
1. for all literals L in Ci, the set of variables of L is equal to the set of variables of Ci or contains at most one element, and
2. for all functional terms t in Ci, the set of variables of t is equal to the set of variables of Ci.
Then S belongs to S+. Like E1, this class is decidable [Fermüller et al., 1993c]. A decision procedure is provided by resolution based on R1 combined with a process called monadization and the splitting rule [Fermüller et al., 1993c, Fermüller et al., 2001a].

Class WGC

The class of weakly guarded clauses has then been studied. This class uses the notion of weakly covering introduced at the beginning of this chapter. The WGC class is decidable [de Nivelle, 1998, Fermüller et al., 2001a]. A decision procedure is provided by the ordering R2, defined such that for all literals L1 and L2, L1 is greater than L2 if τv(L1) > τv(L2) or the set of variables of L1 includes the set of variables of L2 [de Nivelle, 1998]. The definition of the WGC class is provided below. S belongs to WGC if for each clause Ci of S and for all literals L in Ci:
1. L is weakly covering,
2. if τv(L) > 0, then the set of variables of L is equal to the set of variables of Ci, and
3. if Ci is non-ground (i.e.
if Ci contains variables), then there is a negative literal L1 ∈ Ci such that τv(L1) = 0 and which has the same set of variables as Ci.

This class is a superclass of the class GC, which is detailed in Chapter 5.

4.2.2 Classes decidable by hyperresolution

Refinements based on A-orderings are not the only ones used to provide decision procedures. Some other interesting decidable classes cannot be decided by a refinement based only on an A-ordering. Most of these classes use a refinement based on hyperresolution as decision procedure. It has been shown that hyperresolution based on positive, restricted factoring is complete [Fermüller and Leitsch, 1993]. We say that a resolvent of two clauses C and D, where C is positive, is under positive, restricted factoring if factoring is only applied to C [Fermüller and Leitsch, 1993]. This section describes some classes known to be decidable by hyperresolution based on positive, restricted factoring.

Class BSH*

The first class decidable by hyperresolution which has been encountered is the BSH* class [Fermüller et al., 2001a, Caferra et al., 2004]. This class is a subclass of a class called BS*, which represents the clausal form of the Bernays-Schönfinkel class. The Bernays-Schönfinkel class is the class of all formulas of the form (∃z1)...(∃zl)(∀y1)...(∀ym)P, where P is quantifier-free and has only variables as terms [Fermüller et al., 2001a]. BS* can then be defined as the class of the clause sets S which include only clauses C such that τ(C) = 0, and BSH* is the subclass of BS* containing sets of Horn clauses only [Fermüller et al., 2001a].

Class PVD

After the BSH* class, the PVD class has been studied. PVD means positive variable dominated [Fermüller et al., 1993b]. This name comes from the fact that in the PVD class, the positive part of a clause C is, in a sense, dominated by the negative part of C.
The definition of this class is the following [Fermüller et al., 2001a]: S belongs to PVD if for each clause Ci ∈ S, V(Ci+) ⊆ V(Ci−) and τmax(x, Ci+) ≤ τmax(x, Ci−) for all x ∈ V(Ci+). It has been proved that the PVD class is decidable by hyperresolution [Fermüller et al., 1993b, Fermüller et al., 2001a].

Class OCCI

The last class defined is the OCCI class. This class is more restrictive than the PVD class concerning the relation between the positive and negative parts of a clause C. It ensures that every variable occurs only once in C+. Like PVD, the OCCI class has been shown to be decidable, and hyperresolution provides a decision procedure [Fermüller et al., 1993b, Fermüller et al., 2001a]. The OCCI class is defined as follows. If for all clauses Ci ∈ S and for all variables v ∈ V(Ci+), the number of occurrences of v in Ci+ is equal to 1 and τmax(x, Ci+) ≤ τmin(x, Ci−) for all x ∈ V(Ci+) ∩ V(Ci−), then S belongs to OCCI [Fermüller et al., 1993b, Fermüller et al., 2001a].

The classes presented in these two sections have been studied in many papers and are interesting for the project because they can be handled and implemented in SPASS. However, care is needed with equality, which makes almost all of these classes undecidable. As can be noticed, only clausal classes have been studied. Since the analysis of a problem is performed after its transformation into clausal normal form, it is more appropriate to work on clausal classes than on formula classes.

A last decidable class has been carefully studied for this project: the class of guarded clauses. The entire next chapter is dedicated to this class. It details the mechanism used to recognise membership of this class and to decide its problems.
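As a closing illustration of how the clausal class definitions of this chapter turn into membership tests, the per-clause conditions of PVD and OCCI given above can be sketched as follows. This is only a sketch under stated assumptions: the `Var`, `Fn` and `Literal` types and the function names are hypothetical, not part of SPASS, and a clause set belongs to a class when every one of its clauses passes the corresponding test.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Fn:
    symbol: str
    args: Tuple["Term", ...] = ()  # a constant is an Fn without arguments


Term = Union[Var, Fn]


@dataclass(frozen=True)
class Literal:
    positive: bool
    predicate: str
    args: Tuple[Term, ...]


def occurrences(t: Term, level: int = 0) -> Iterator[Tuple[Var, int]]:
    """Yield (variable, depth) for every variable occurrence in t."""
    if isinstance(t, Var):
        yield t, level
    else:
        for a in t.args:
            yield from occurrences(a, level + 1)


def depth_table(literals: Iterable[Literal]) -> Dict[Var, List[int]]:
    """Map each variable to the list of depths at which it occurs."""
    table: Dict[Var, List[int]] = {}
    for lit in literals:
        for arg in lit.args:
            for x, d in occurrences(arg):
                table.setdefault(x, []).append(d)
    return table


def is_pvd_clause(clause: List[Literal]) -> bool:
    """V(C+) is included in V(C-) and tau_max(x, C+) <= tau_max(x, C-)."""
    pos = depth_table(l for l in clause if l.positive)
    neg = depth_table(l for l in clause if not l.positive)
    return all(x in neg and max(ds) <= max(neg[x]) for x, ds in pos.items())


def is_occi_clause(clause: List[Literal]) -> bool:
    """Each variable occurs once in C+ and tau_max(x, C+) <= tau_min(x, C-)."""
    pos = depth_table(l for l in clause if l.positive)
    neg = depth_table(l for l in clause if not l.positive)
    return all(
        len(ds) == 1 and (x not in neg or ds[0] <= min(neg[x]))
        for x, ds in pos.items()
    )
```

For instance, under this representation the clause ¬P(f(x)) ∨ Q(x) passes both tests, while ¬P(x) ∨ P(f(x)) fails the PVD test because x occurs deeper in the positive part than in the negative part.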
Chapter 5 Guarded clauses as case study

The guarded fragment was inspired by two observations [Andréka et al., 1998]:
- Many propositional modal logics (extensions of classical propositional logic which include operators expressing modality) have very nice properties.
- These modal logics can be translated into first-order logic.

Propositional modal logics can then be seen as fragments of first-order logic with interesting properties. In particular, they are decidable [Ganzinger and De Nivelle, 1999]. The guarded fragment is one fragment of first-order logic which possesses the nice properties of propositional modal logics.

5.1 The guarded clauses

The guarded fragment of first-order logic (GF) is built up as follows [Ganzinger and De Nivelle, 1999]:
- ⊤ and ⊥ are in GF.
- If A is an atom, then A is in GF.
- If A ∈ GF, then ¬A ∈ GF.
- If A, B ∈ GF, then A ∨ B ∈ GF, A ∧ B ∈ GF, A → B ∈ GF, A ↔ B ∈ GF.
- If F ∈ GF and G is an atom for which every free variable of F is among the arguments of G, then ∀x(G → F) ∈ GF (or, equivalently, ∀x(¬G ∨ F) ∈ GF) and ∃x(G ∧ F) ∈ GF for every sequence x of variables.

The atom G appearing in the last point is called a guard. Guarded clauses can also be defined. Their definition is the more interesting one for the project, since the recognition module focuses only on clausal classes. Moreover, it has been shown in [Ganzinger and De Nivelle, 1999] that formulas of the guarded fragment can be translated into guarded clauses by transformation into clausal normal form. A clause C is called guarded if it satisfies the following conditions [Ganzinger and De Nivelle, 1999]:
1. C is simple.
2. Every functional subterm in C contains all the variables of C.
3. If C is non-ground, C has a non-functional negative literal, called a guard, which contains all the variables of C.

This definition is more restrictive than the definition of weakly guarded clauses described in the previous chapter.
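The three conditions above can be sketched as a membership test on a single clause. The sketch below is an illustration only and is not the SPASS implementation: the `Var`, `Fn` and `Literal` types and all function names are hypothetical. It follows the reading adopted in this thesis, in which a constant counts as a functional term, so a guard is taken to be a negative literal whose arguments are all variables.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterator, List, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Fn:
    symbol: str
    args: Tuple["Term", ...] = ()  # a constant is an Fn without arguments


Term = Union[Var, Fn]


@dataclass(frozen=True)
class Literal:
    positive: bool
    predicate: str
    args: Tuple[Term, ...]


def term_vars(t: Term) -> FrozenSet[Var]:
    if isinstance(t, Var):
        return frozenset([t])
    return frozenset().union(*(term_vars(a) for a in t.args))


def clause_vars(clause: List[Literal]) -> FrozenSet[Var]:
    return frozenset().union(*(term_vars(a) for l in clause for a in l.args))


def depth(t: Term) -> int:
    if isinstance(t, Var) or not t.args:
        return 0
    return 1 + max(depth(a) for a in t.args)


def subterms(t: Term) -> Iterator[Term]:
    yield t
    if isinstance(t, Fn):
        for a in t.args:
            yield from subterms(a)


def is_simple(clause: List[Literal]) -> bool:
    """Condition 1: no nested function symbols (term depth at most 1)."""
    return all(depth(a) <= 1 for l in clause for a in l.args)


def functional_terms_cover(clause: List[Literal]) -> bool:
    """Condition 2: every non-variable subterm contains all variables of C."""
    cv = clause_vars(clause)
    return all(
        term_vars(s) == cv
        for l in clause for a in l.args for s in subterms(a)
        if isinstance(s, Fn)
    )


def has_guard(clause: List[Literal]) -> bool:
    """Condition 3: a non-ground clause has a negative, variable-only literal
    containing all variables of the clause."""
    cv = clause_vars(clause)
    if not cv:
        return True
    return any(
        not l.positive
        and all(isinstance(a, Var) for a in l.args)
        and frozenset(l.args) == cv
        for l in clause
    )


def is_guarded(clause: List[Literal]) -> bool:
    return is_simple(clause) and functional_terms_cover(clause) and has_guard(clause)
```

Keeping one function per condition mirrors the way the definition is stated, so each condition can be reused or relaxed separately when a related class is considered.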
Let C1 = ¬q(x, y, f(a)) ∨ p(x, y) and C2 = ¬p(x) ∨ p(x, a). C1 and C2 do not belong to GC but they belong to WGC. Indeed, there is a negative literal in C1 which contains all the variables of the clause, but this literal is functional. Moreover, both C1 and C2 contain constants but are not ground.

5.2 Decision procedure

The class composed of guarded clauses with equality (the GC class) has been proved to be decidable [Ganzinger and De Nivelle, 1999, de Nivelle, 1998]. The superposition calculus used with an appropriate choice of ordering and selection function provides a resolution decision procedure. It has been shown that this decision procedure can use, as ordering, a lexicographic path ordering based on a precedence ≻ such that f ≻ c ≻ p, where f denotes a non-constant function symbol, c denotes a constant, and p denotes a predicate symbol [Ganzinger and De Nivelle, 1999]. In order to be sure to have an ordering compatible with the decision procedure, the choice was to implement this ordering. An appropriate selection function Σ for the GC class resolution is such that [Dierkes, 2000, Ganzinger and De Nivelle, 1999]:
- If a clause is non-functional and contains a guard, then one of its guards is selected by Σ.
- If a clause contains a functional negative literal, one of these is selected.
- If a clause contains a positive functional literal, but no negative functional literal, no literal is selected.

The inference rules related to the superposition calculus are detailed in Section 5.4.2.

The GC class is decidable and a decision procedure is known. The aim was to find a way to make SPASS use this information. The idea to achieve this aim was simple: if SPASS can tell that a problem belongs to the GC class, it can then know when to use the decision procedure described above. From this observation, two main steps concerning the implementation of the GC class resolution take shape.
They can be defined as follows:
- Make SPASS able to recognise that a problem belongs to GC.
- Implement an appropriate decision procedure.

5.3 Recognition of memberships of the GC class

In order for SPASS to handle the GC class, an analysis module able to tell whether a given problem belongs to the GC class had to be implemented. As noted in Chapter 3, SPASS already has an analysis module. However, this module only gives some characteristics of the problem: it tells whether the problem is a Horn problem, whether it contains function symbols, and some other information detailed in Chapter 3. An extension of this module was therefore necessary to reach the desired goal.

In the project, only clausal classes have been studied. A problem can be said to belong to a specific class only after an analysis performed when the problem is in its clausal normal form. The recognition step therefore had to be implemented after the transformation of the problem into clausal normal form. The analysis already made by SPASS is also performed just after this transformation. Therefore, an appropriate choice was to extend the analysis function already implemented in SPASS instead of creating a whole new analysis module.

SPASS analyses a problem with the function ana_AnalyzeProblem, which is defined in the file analyze.c. This function is based on elements called features. These features, defined in analyze.h, play the role of internal flags. They are declared as BOOL (true or false) and indicate whether the given problem satisfies some specific characteristics. To make the function recognise whether a problem belongs to the GC class, a new feature called ana_GC was declared. After having been created, the feature ana_GC, which records whether the problem belongs to the GC class, had to be initialised. The initialisation step is carried out in the ana_AnalyzeProblem function.
This is the first action performed when the function is called. The feature could be initialised to true or false. The first choice was to initialise it to false and, if all the clauses of the problem satisfied the three conditions described in the definition of a guarded clause, to set the feature to true. This solution seemed appropriate, but it required an additional BOOL variable to remember that all clauses scanned before the currently analysed clause satisfied the conditions. The choice was therefore changed: the feature is initialised to true and set to false at the first encountered clause which does not satisfy the desired conditions. Pseudo-code describing what happens in ana_AnalyzeProblem is provided below (see Algorithm 1). The key question was then how to implement the different conditions GC_condition1, GC_condition2 and GC_condition3 that a clause has to satisfy to be a guarded clause.

Algorithm 1 Problem analysis
    Initialization
    ana_SetFeature(ana_GC, TRUE)
    for all clauses C in the problem do
        if !GC_condition1(C) or !GC_condition2(C) or !GC_condition3(C) then
            ana_SetFeature(ana_GC, FALSE)
        end if
    end for

For each condition, a function was written. Each function returns a BOOL whose value indicates whether the condition is satisfied or not. In order to keep the code clear and easy to navigate, a new file called class_conditions.c was created. This file groups all the functions associated with the membership conditions of the different classes. It contains, for example, the functions associated with the three membership conditions of the GC class.

Now that the basic functioning of the recognition module has been briefly explained, the implementation of the conditions that clauses have to satisfy in order to belong to GC is described in detail.

First condition: the clause has to be simple.
In the first chapter, it has been said that a simple clause is a clause whose literals have only terms which are either variables, constants, or functional terms whose arguments are variables or constants. This definition means that every term occurring in a simple clause can be written in one of the following forms:
- a, where a is a constant,
- x, where x is a variable,
- f(t1, ..., tn), where f is a function symbol and each ti is either a constant or a variable.

This definition prevents terms from having the form f(g(...)), where f and g are function symbols. Therefore, an alternative definition of a simple clause can be given: a simple clause is a clause whose depth is less than or equal to 1. The first condition thus amounts to checking whether the depth of the clause is less than or equal to 1, and GC_condition1 can be defined as follows:

Algorithm 2 GC_condition1
    BOOL b = FALSE
    if clause_depth(C) ≤ 1 then
        b = TRUE
    end if
    return b

Second condition: for every functional term t in C, V(t) = V(C), where V(x) denotes the set of variables of x.

A relevant remark can be made just by reading this condition: if a guarded clause C contains a constant, then it is ground, i.e. V(C) = ∅. This remark is easily explained. A functional term is here taken to be a term containing either a constant or a function symbol, so a constant a is a functional term. However, V(a) = ∅. If a clause C contains a constant a, then in order to satisfy condition 2, V(C) should be equal to V(a), that is, V(C) should be equal to ∅. So the remaining issue was how to deal with clauses which do not contain any constants. The idea was to implement functions which could be reused for the other decidable classes. The first step was to rewrite this condition. In fact, this condition can be transformed into conditions similar to the ones given in the definition of the weakly guarded clauses.
The condition in its initial state is: “Every functional subterm in the clause C has to contain all the variables of C”. This condition can be replaced by the following: in a guarded clause, all literals L containing functional subterms have to contain all the variables of the clause, and all functional subterms occurring in L have to contain the same variables as L (that is to say, as C, since the set of variables of L is equal to the set of variables of C). The latter amounts to saying that L is covering. When constants are not taken into account, the condition can then be rewritten as follows: in a guarded clause C, all literals have to be covering and, if the variable depth of a literal L is greater than 0, then L contains all the variables present in C. Therefore, a clause C satisfies the second condition, “Every functional subterm in C has to contain all the variables of C”, if it satisfies the following conditions:
1. If C contains a constant, then it is ground.
2. All literals in C are covering.
3. For all literals L in C, if τv(L) > 0, then the set of variables of L is equal to the set of variables of C.

The second and third conditions are now substantially similar to the conditions of the weakly guarded clauses, so the implementation of the recognition of the weakly guarded clauses should be easier. The conditions have then been implemented following this model:

Algorithm 3 GC_condition2.1
    BOOL b = TRUE
    if clause_containsConstant(C) and !clause_isGround(C) then
        b = FALSE
    end if
    return b
Algorithm 4 GC_condition2.2
    BOOL b = TRUE
    for all literals L in C do
        if !literal_isCovering(L) then
            b = FALSE
        end if
    end for
    return b

Algorithm 5 GC_condition2.3
    BOOL b = TRUE
    for all literals L in C do
        if variable_Depth(L) > 0 and Variables of L ≠ Variables of C then
            b = FALSE
        end if
    end for
    return b

As can be seen, these functions require some intermediate functions such as literal_isCovering or variable_Depth. These intermediate functions had to be implemented before the ones described above. All the intermediate functions have been placed in a separate file called complementary_func.c.

Third condition: if C is a non-ground clause, C has to contain a non-functional negative literal L, called a guard, such that V(L) = V(C).

The only thing to verify is the existence of a guard in every non-ground clause. The idea was simply to create a BOOL b initialised to false for each non-ground clause C and to scan all the literals of C one by one. At the first encountered literal which is a guard, b is set to true.

All the conditions described above were implemented. Based on these three conditions, the feature ana_GC can be correctly set. If, on exit of the function ana_AnalyzeProblem, the feature ana_GC is still equal to true, then the problem belongs to the GC class; otherwise it does not. This section has shown how the recognition module was implemented for the GC class. The key question is now how an adequate decision procedure can be set.

5.4 Implementation of the decision procedure

The decision procedure for the GC class is based on superposition and uses a specific ordering and selection function [Ganzinger and De Nivelle, 1999]. The first step was to find a way to implement this ordering and this selection function. Once these functions were implemented, the work involved more research than coding.
The aim was to find all the rules that can or need to be used in order to preserve the soundness and the completeness of the resolution method. As all the rules implemented in SPASS are sound and a combination of sound rules is sound, only the completeness of the resolution method had to be checked.

5.4.1 Ordering / Selection function

The ordering used in the decision procedure for the GC class is a lexicographic path ordering based on a precedence ≻ satisfying f ≻ c ≻ p for all non-constant function symbols f, constants c and predicate symbols p. A lexicographic path ordering $<_{lpo}$ is defined by $s <_{lpo} t$ if [Bachmair and Ganzinger, 2001]:
- $s$ is a variable occurring in $t$ and $t \neq s$, or
- $s = f(s_1, \ldots, s_m)$, $t = g(t_1, \ldots, t_n)$ and
  (a) $s \leq_{lpo} t_j$ for some $j$, or
  (b) $f \prec g$ and $s_i <_{lpo} t$ for all $i$, or
  (c) $f = g$, $(s_1, \ldots, s_m) <^{lex}_{lpo} (t_1, \ldots, t_n)$ and $s_i <_{lpo} t$ for all $i$.

This ordering is not implemented in SPASS. However, looking more carefully at this definition taken from [Bachmair and Ganzinger, 2001], it can be noticed that it simply corresponds to the definition of the recursive path ordering implemented in SPASS with the status set to left. SPASS manages the status in its own way. The easiest way to implement the lexicographic path ordering was to create a new file where the functions related to the RPO were copied and the actions corresponding to statuses other than left were deleted. A new SPASS flag flag_ORDLPO has been created for this ordering. The flag concerning the ordering is still initialised to the default ordering KBO, but it is automatically set to LPO if the problem has been recognised as belonging to the GC class.

The precedence also had to be changed from the one selected by default by SPASS. By default, SPASS automatically defines a precedence ≻ such that f ≻ p ≻ c for all non-constant function symbols f, constants c and predicate symbols p.
The precedence is set at the beginning of the function ana_AutoConfiguration, located in the file analyze.c. This function is the one which sets the right configuration in the automatic mode. So the only thing to do was to change this precedence so that it satisfies the desired condition. The pseudo-code explaining what has been done is provided below:

Algorithm 6 Set the precedence
    Predicates = ana_CalculatePredicatePrecedence(Predicates, Clauses)
    Functions = ana_CalculateFunctionPrecedence(Functions, Clauses)
    Constants = list of constants
    Add Functions to the precedence list
    if feature ana_GC = TRUE then
        Add Constants to the precedence list
        Add Predicates to the precedence list
    else
        Add Predicates to the precedence list
        Add Constants to the precedence list
    end if

ana_CalculatePredicatePrecedence (respectively ana_CalculateFunctionPrecedence) is a function which sorts the predicates (respectively the functions). As no indication is given for the precedence among the predicates (functions), the choice was to let SPASS order them as usual. However, the precedence among the constants is not calculated; they are simply assumed to be ordered in the order in which they appear in the clause set.

In addition to a specific ordering, the resolution method used to make the GC class decidable requires a special selection function. Defining this selection function was more complicated than setting the right precedence. The first question was where to specify the selection function in the source code of SPASS. There were two ideas. The first one was to create a new function in which the selection function would be specified, together with a new flag Flag_GCselect indicating when to use this function. The second idea was to specify what should be selected at the beginning of the function clause_SelectLiteral, which selects the wanted literals. The latter was chosen.
It then only remained to translate into code the definition of the selection function Σ described in Section 5.2.

Algorithm 7 Selection function
    if the problem belongs to GC then
        Literal L; context1 = FALSE; context2 = FALSE;
        // context1 : the clause contains a guard
        // context2 : the clause contains a functional negative literal
        for all literals L1 in C do
            if L1 is a guard then
                context1 = TRUE
                L = L1
            end if
            if L1 is a functional negative literal then
                context2 = TRUE
                L = L1
            end if
        end for
        if (the clause is not functional and context1) or (context2) then
            Select L
        else
            No literal is selected
        end if
    end if

5.4.2 Inference rules

The last section has shown how the specific selection function and ordering were implemented. After this step, the aim was to find a way to make SPASS use the correct decision procedure. It is necessary to ensure that an appropriate set of rules is used in order to make the GC class decidable.

It has been shown in [Ganzinger and De Nivelle, 1999] that a resolution decision procedure for the guarded clauses with equality is based on superposition. However, superposition cannot be used alone. In order for the resolution method to be complete, some other rules have to be used. The inference rules of the superposition calculus have been detailed in [Bachmair and Ganzinger, 1998]. Their definitions, as taken from [Bachmair and Ganzinger, 1998], are provided below.

The first inference rules that obviously have to be used are the superposition rules. They can be defined as follows:

Positive superposition

    C ∨ s ≈ t      D ∨ u[s′] ≈ v
    -----------------------------
        (C ∨ D ∨ u[t] ≈ v)σ

where σ is the mgu of s and s′ such that:
    (1) tσ ⪰̸ sσ
    (2) vσ ⪰̸ uσ
    (3) (s ≈ t)σ is strictly maximal with respect to Cσ, and C contains no selected literal
    (4) (u ≈ v)σ is strictly maximal with respect to Dσ, and D contains no selected literal
    (5) s′ is not a variable
    (6) (s ≈ t)σ ⪰̸ (u ≈ v)σ

Negative superposition

    C ∨ s ≈ t      D ∨ u[s′] ≉ v
    -----------------------------
        (C ∨ D ∨ u[t] ≉ v)σ

where σ is the mgu of s and s′ such that (1), (2), (3), (5) and:
    (4') u ≉ v is selected, or else nothing is selected in this premise and (u ≉ v)σ is maximal with respect to Dσ

For equality, the following inference rules are needed too.

Reflexivity resolution

    C ∨ u ≉ v
    ---------
       Cσ

where σ is the mgu of u and v such that (4') holds, with C in place of D.

Equality factoring

       C ∨ u ≈ v ∨ s ≈ t
    ------------------------
    (C ∨ v ≉ t ∨ u ≈ t)σ

where σ is the mgu of s and u such that (6) and: (s ≈ t)σ is maximal with respect to Cσ, and C contains no selected literal.

Additionally, ordered factoring has to be used. These rules take into account a specific ordering and selection function. This is why these parameters were specified in Section 5.4.1.

The set of rules described above is used only when the problem contains equality. For a problem without equality, ordered resolution and ordered factoring are sufficient. It was decided to leave ordered resolution always on and not to deactivate it when the problem contains equality. This choice was made because it was noticed during the testing phase that ordered resolution improves the performance of SPASS. As the basic version of SPASS already uses ordered resolution combined with superposition for some problems, the completeness of the resolution method is not affected.

All these rules are already implemented in SPASS. The aim was therefore just to find the flags to use in order to obtain the wanted resolution decision procedure. Positive and negative superposition are activated by the flags flag_ISPR and flag_ISPL, respectively. Reflexivity resolution is activated by the flag flag_IERR and equality factoring by the flag flag_IEQF. Once this information was known, the function ana_AutoConfiguration had to be changed such that these rules are activated when the problem belongs to the GC class.
Algorithm 8 Resolution decision procedure
    if the problem belongs to GC then
        flag_SetFlagIntValue(Flags, flag_IORE, flag_ORDEREDRESOLUTIONNOEQUATIONS)
        flag_SetFlagIntValue(Flags, flag_IOFC, flag_FACTORINGONLYRIGHT)
        flag_SetFlagIntValue(Flags, flag_ORD, flag_ORDLPO)
        if the problem is a problem with equality then
            flag_SetFlagIntValue(Flags, flag_ISPR, flag_SUPERPOSITIONRIGHTON)
            flag_SetFlagIntValue(Flags, flag_ISPL, flag_SUPERPOSITIONLEFTON)
            flag_SetFlagIntValue(Flags, flag_IERR, flag_REFLEXIVITYRESOLUTIONON)
            flag_SetFlagIntValue(Flags, flag_IEQF, flag_EQUALITYFACTORINGON)
        end if
    end if

This inference system based on superposition, reflexivity resolution and equality factoring is complete. A proof is provided in [Hsiang and Rusinowitch, 1991]. Moreover, according to [Ganzinger and De Nivelle, 1999], this calculus does not produce any non-guarded clause from guarded clauses. This fact ensures its termination. The resolution method implemented is therefore a decision procedure for the GC class.

5.4.3 Reduction rules

After the required inference rules had been set, the reduction rules were considered. Due to time constraints, the choice was to leave the configuration of the reduction rules as it was already implemented in SPASS. Nevertheless, the superposition calculus described above is known to be compatible with the usual simplification and redundancy elimination rules such as subsumption and tautology deletion [Ganzinger and De Nivelle, 1999]. Moreover, it is easy to show that trivial literal elimination, which deletes duplicate literals, does not affect the completeness of the calculus. So the only reduction rules which may affect the completeness of the calculus are forward/backward matching replacement resolution and forward/backward rewriting. According to the tests performed, this does not seem to be the case. The results obtained with this calculus are described in the chapter on tests and results.
5.5 Issues

During the implementation, some issues showed up. The first issue appears when the user wants to activate specific rules in addition to those activated automatically. If the user does not specify anything and lets SPASS run in its automatic mode, the calculus is sound and complete. But if the user decides to specify their own rules, nothing ensures that the calculus is still complete. For example, if the user sets another ordering, the resulting instance of SPASS may not be complete. As the user should not be constrained, it was not advisable to prevent the user from adding options. It was therefore decided to print a warning message when the user specifies their own rules, as a reminder to be careful about the completeness of the chosen calculus.

The second issue concerns problems which belong to several classes. The question is which resolution calculus should be used. Suppose that the problem belongs to GC and to PVD. Should the superposition calculus or the hyperresolution calculus be used? To answer this question, it was decided to test each resolution decision procedure separately on a certain number of problems. The decision procedure providing the best results is activated with priority.

Chapter 6 Other decidable classes

In this chapter, the implementation of the other decidable classes is detailed. It is divided into two main parts. The first part deals with the recognition of membership in a decidable class. The second part concerns the implementation of the different decision procedures.

6.1 Recognition part

The process of recognising membership in the eight classes described in Chapter 4 (E1, E+, S+, WGC, BS*, BSH*, PVD, OCCI) is approximately the same as the process used for the GC class discussed in the previous chapter. A new internal flag was created for each class.
They were declared in analyze.h:

  - ana_BS for the BS* class
  - ana_BSH for the BSH* class
  - ana_EPS1 for the E1 class
  - ana_EPS2 for the E+ class
  - ana_S for the S+ class
  - ana_PVD for the PVD class
  - ana_OCCI for the OCCI class
  - ana_WGC for the WGC class

These internal flags are set in the function ana_AnalyzeProblem. Their settings are based on the definitions given in Chapter 4. All these internal flags except ana_BSH are initialised to TRUE and are set to FALSE at the first clause which does not satisfy the required conditions. The case of the BSH* class is slightly different because it is first necessary to check whether the problem belongs to BS*. The flag ana_BSH is initialised to FALSE and is set to TRUE only once the problem has been shown to belong to BS* and to contain only Horn clauses.

For each class definition, a function was implemented. For the same reasons as for the GC class, these functions were declared in the dedicated file class_conditions.c and the required intermediate functions were put in the file complementary_func.c. All these functions make it possible to set the different flags correctly.

Once the internal flags are set, it is possible to know whether a problem is a member of one of the studied decidable classes. The names of the classes to which the problem belongs are printed when SPASS is launched. For each of these eight classes, a decision procedure is known. The next section discusses how the adequate decision procedures were set for each of these classes.

6.2 Implementation of the decision procedures

6.2.1 Classes decidable by hyperresolution

Three of these classes have been shown to be decidable by a hyperresolution calculus based on positive, restricted factoring. A resolvent of two clauses C and D, where C is positive, is under positive, restricted factoring if factoring is only applied to C [Noll, 1980, Fermüller and Leitsch, 1993].
This calculus does not require many inference rules. In fact, in addition to the hyperresolution rule, only rules able to play the role of positive, restricted factoring are needed.

Positive, restricted factoring can easily be obtained with the rules present in SPASS. It only needs the activation of the condensation rule and of the ordered factoring rule in the mode where only factoring inferences with positive literals are generated. If the ordered factoring rule is run in its other mode, negative literals are considered for inferences as well and the factoring is no longer restricted.

Therefore, this resolution method is based only on the ordered factoring, hyperresolution and condensation rules. The ordered factoring rule is activated by the flag IOFC, the condensation rule by the flag RCon and the hyperresolution rule by the flag IOHy.

This calculus has been proved to be complete in [Noll, 1980]. Moreover, it is based only on rules already implemented in SPASS. Therefore, as SPASS only contains sound rules, this calculus is sound.

To reach results in a reasonable time, it was decided to keep the default settings for the reduction rules. These rules are sound and, as a consequence, their use preserves the soundness of the calculus. However, these rules may affect the completeness of the calculus. It has been shown in [Noll, 1980, Fermüller and Leitsch, 1993] that subsumption, tautology deletion and trivial literal elimination are compatible with this calculus. The only reduction rules which may cause problems are forward/backward matching replacement resolution and forward/backward rewriting. According to the tests performed, this does not seem to be the case. The rules forming this calculus are summarised in Table 6.1.
    Inference rules            Reduction rules
    -------------------------  -------------------------------------------------
    Ordered hyperresolution    Trivial literal elimination
    Factoring only right       Subsumption deletion
    Condensation rule          Tautology deletion
                               Forward/backward matching replacement resolution
                               Forward/backward rewriting

Table 6.1: Hyperresolution calculus

During the testing phase, it was noticed that the splitting rule improves the performance of the resolution method for the classes PVD, BSH* and OCCI. For some problems such as GEO169+1, GRA015+1, MSC007-1.008 and SYN431-1, the results are reached faster when the splitting rule is activated. The question is whether it is possible to activate the splitting rule without breaking the completeness of the decision procedure. To answer this question, each class should be studied separately.

As seen in Chapter 2, the splitting rule is defined as follows:

          N ∪ {C ∨ D}
    -----------------------
    N ∪ {C}  |  N ∪ {D}

where N is a set of clauses and C and D are two variable-disjoint clauses. The aim is to show that if the initial clause set belongs to a class X, the clause sets produced by the splitting rule still belong to the class X.

The BSH* class

Consider a clause set X = B ∪ {C ∨ D} belonging to the BSH* class. X has a depth equal to 0 and does not contain any non-Horn clause. As B ∪ {C ∨ D} has a depth equal to 0, all the literals in B, in C and in D have a depth equal to 0. This implies that the depths of B ∪ {C} and of B ∪ {D} are equal to 0. And as B ∪ {C ∨ D} does not contain any non-Horn clause, B does not contain any non-Horn clause, and C and D are Horn clauses. Therefore, B ∪ {C} and B ∪ {D} do not contain any non-Horn clause. It can be deduced from these observations that the clause sets produced by the splitting rule belong to the BSH* class.

The PVD class

Now, consider a clause set X = B ∪ {C ∨ D} belonging to the PVD class. Referring to the definition given in Section 4.2.1, a clause set S belongs to PVD if for each clause Ci ∈ S, the set of variables of Ci+ is included in the set of variables of Ci− and τmax(x, Ci+) ≤ τmax(x, Ci−) for all x ∈ V(Ci+) [Fermüller et al., 1993b].

B ∪ {C ∨ D} satisfies this condition. It can then be deduced that the clause set B satisfies this condition. A problem could only come from the fact that C ∨ D is split into two clauses, C and D. But as the clause cannot be split if C and D have common variables, there is no real problem. The clause is split only if C and D are variable-disjoint, and in this case, if the condition is satisfied for C ∨ D, then it is satisfied for C and for D. Therefore, the clause sets produced by the splitting rule belong to the PVD class.

The OCCI class

Finally, consider a clause set X = B ∪ {C ∨ D} belonging to the OCCI class. Referring to Section 4.2.1, a clause set S belongs to OCCI if for all Ci ∈ S, every variable occurring in Ci+ occurs exactly once in Ci+, and τmax(x, Ci+) ≤ τmin(x, Ci−) for all x ∈ V(Ci+) ∩ V(Ci−) [Fermüller et al., 2001a]. By a reasoning similar to the previous one, it can be deduced that the clause sets produced by the splitting rule belong to the OCCI class.

6.2.2 Classes decidable by ordered resolution

Concerning the classes decidable by orderings, the choice was to focus on the classes E1 and E+. These classes, considered as extensions of the Ackermann class, are relevant for the project due to their nice properties [Fermüller et al., 1993c].

Let R1 be an ordering defined such that for all atoms A and B, B >R1 A if:
  - τ(B) > τ(A) and
  - τmax(x, B) > τmax(x, A) for all x ∈ V(A).

The irreflexivity and the transitivity of < imply the irreflexivity and the transitivity of R1.
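To make the definition of R1 concrete, the following Python sketch compares two atoms according to the two conditions above. It is an illustration only and not the code added to SPASS. The atom encoding, the depth conventions (variables and constants have depth 0, and a variable that is a direct argument of an atom occurs at depth 1), the treatment of a variable of A that does not occur in B as a failed condition, and the result strings are all assumptions of this sketch.

```python
# Encoding (an assumption of this sketch):
#   variable   -> str, e.g. "x"
#   constant   -> 1-tuple, e.g. ("a",)
#   f(t1..tn)  -> tuple ("f", t1, ..., tn); atoms use the same shape

def term_vars(term):
    """Return the set of variables occurring in a term or atom."""
    if isinstance(term, str):
        return {term}
    return set().union(*(term_vars(arg) for arg in term[1:]))


def depth(term):
    """tau: nesting depth, 0 for variables and constants."""
    if isinstance(term, str) or len(term) == 1:
        return 0
    return 1 + max(depth(arg) for arg in term[1:])


def max_var_depth(var, term, level=0):
    """tau_max: deepest occurrence of var in term, or None if absent."""
    if isinstance(term, str):
        return level if term == var else None
    depths = [max_var_depth(var, arg, level + 1) for arg in term[1:]]
    depths = [d for d in depths if d is not None]
    return max(depths) if depths else None


def r1_greater(b, a):
    """Return True if B >R1 A."""
    if depth(b) <= depth(a):
        return False
    for var in term_vars(a):
        depth_in_b = max_var_depth(var, b)
        if depth_in_b is None or depth_in_b <= max_var_depth(var, a):
            return False
    return True


def r1_compare(a, b):
    """Three-way comparison; R1 is partial, so atoms may be incomparable."""
    if r1_greater(a, b):
        return "GREATER"
    if r1_greater(b, a):
        return "SMALLER"
    return "UNCOMPARABLE"
```

Under these conventions, P(f(x)) is greater than P(x), whereas P(f(x)) and Q(y) are incomparable, which illustrates why the ordering is not total.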
It can be shown that for all atoms A, B and all substitutions θ, A <R1 B implies Aθ <R1 Bθ [Fermüller et al., 1993c]. Hence R1 is an A-ordering.

Ordered resolution based on this A-ordering and combined with the splitting rule provides a decision procedure for the classes E1 and E+. This calculus is sound given that it is based only on sound rules, and it has been proved to be complete in [Fermüller et al., 1993c].

As for the classes discussed earlier, the choice was to keep the default mode for the reduction rules. The use of these rules considerably reduces the time needed to reach a result, and the risk that they affect the completeness of the calculus is quite low. In fact, the use of the usual deletion rules such as subsumption, tautology deletion and trivial literal elimination preserves the soundness and completeness of the calculus [Fermüller et al., 1993c]. Once again, the only rules for which the preservation of completeness is not guaranteed are forward/backward matching replacement resolution and forward/backward rewriting. Table 6.2 summarises the calculus used for the classes E1 and E+.

    Inference rules            Reduction rules
    -------------------------  -------------------------------------------------
    Ordered resolution         Trivial literal elimination
    Factoring only right       Subsumption deletion
    Splitting rule             Tautology deletion
                               Forward/backward matching replacement resolution
                               Forward/backward rewriting

Table 6.2: Ordered resolution calculus

The issue with this decision procedure was how to implement the ordering R1 in SPASS. After much thought, it was decided to implement the ordering at the beginning of the function ord_LiteralCompare, whose purpose is to compare literals. The idea was to compare the literals with the ordering R1 when the problem is declared to belong to the E classes and to return a result before the function uses the Knuth-Bendix ordering or the recursive path ordering with status.
Algorithm 9 Comparison of literals
    if the problem belongs to E1 or E+ then
        Compare the literals according to R1
    else
        Use the ordering specified by the user or the default one if nothing is specified (kbo, rpos or lpo)
    end if

When the problem belongs to the E classes and the ordering R1 does not order the two literals given as parameters, the function returns ord_UNCOMPARABLE. As a consequence, this ordering is not total. The proof of completeness of ordered resolution based on an A-ordering does not depend on whether the ordering is total. However, the fact that this ordering is not total could affect the proper functioning of SPASS. Indeed, it is hard to know what SPASS does when the result returned is ord_UNCOMPARABLE. The test phase has not shown any failure: all the results obtained are correct. The fact that the ordering is not total therefore does not appear to cause any real problem.

After the implementation was finished, many tests were required to evaluate the work done. The next chapter deals with the testing part.

Chapter 7 Tests and results

A large number of tests has been run to evaluate the program. Two kinds of tests have been used:
  - tests written by hand to try different situations, and
  - tests coming from an external source to make the evaluation as accurate as possible.

7.1 The TPTP library

The TPTP library is a well-suited source of first-order logic problems. It is "a library of test problems for automated theorem proving systems" [Sutcliffe, 2009]. This library was created to gather, in an electronic format, the relevant first-order problems that have been studied over the years. It is regularly updated and contains the most important problems, so there is no need to look elsewhere. There is a large variety of problems, ranging from very simple to very difficult.
The number of problems available in this library is so large that the library is sufficient for significant testing [Sutcliffe and Suttner, 1998].

The problems are classified in a specific way. Figure 7.1, taken from [Sutcliffe and Suttner, 1998], explains how the problems are classified. In this project, it was decided to classify the problems by the classes to which they belong instead of following the structure of Figure 7.1.

Figure 7.1: Structure of the TPTP library

In order to be useful for the project, the problems had to be translated into the dfg format. This is the only format that SPASS takes into account. This translation was possible thanks to the tptp2X tool, which converts a problem from the TPTP format to a format of choice. In particular, it converts problems from the TPTP format to the dfg format.

7.2 Evaluation of the analysis module

The analysis module was tried on many examples. In this section, some examples of the tests performed are described.

The example chosen to study the analysis module is the one presented in the description of SPASS, Section 3.2. The problem corresponds to the clause set S = {C1, C2, C3, C4}, where:

    C1 = {P(a, b)}
    C2 = {P(x, y), ¬P(y, f(x, y))}
    C3 = {Q(x, y, z), ¬Q(g(z, x, y), x, g(g(y, z, x), x, z))}
    C4 = {¬Q(a, b, c)}

This problem was carefully chosen to belong to a large number of classes. It satisfies the conditions to belong to the E1, E+ and S+ classes. All the literals are covering. In fact, in C1 and C4 the literals contain only constants, and in C2 and C3 the literals contain either variables or functional terms which contain all the variables occurring in the literal in question. Moreover, all the literals in a given clause contain exactly the variables occurring in that clause ({x, y} for the literals in C2 and {x, y, z} for the literals in C3).
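The hand analysis above can be replayed with a short Python sketch. It assumes a simplified reading of the covering condition (every non-constant functional subterm of a literal contains all the variables of that literal), which may differ in detail from the definition of Chapter 4. The encoding and the function names are chosen for this illustration only and are not those of complementary_func.c.

```python
# Encoding (an assumption of this sketch):
#   variable   -> str, e.g. "x"
#   constant   -> 1-tuple, e.g. ("a",)
#   f(t1..tn)  -> tuple ("f", t1, ..., tn)
# Literal: (is_positive, predicate_name, (arg1, ..., argn))

def term_vars(term):
    """Return the set of variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(term_vars(arg) for arg in term[1:]))


def literal_vars(literal):
    """Return the set of variables occurring in a literal."""
    return set().union(*(term_vars(arg) for arg in literal[2]))


def functional_subterms(term):
    """Yield every non-constant function application inside a term."""
    if isinstance(term, str) or len(term) == 1:
        return
    yield term
    for arg in term[1:]:
        yield from functional_subterms(arg)


def is_covering(literal):
    """Simplified covering test: each functional subterm has all variables."""
    lit_vars = literal_vars(literal)
    return all(term_vars(sub) == lit_vars
               for arg in literal[2]
               for sub in functional_subterms(arg))


def literals_share_clause_vars(clause):
    """True if every literal contains exactly the variables of the clause."""
    clause_vars = set().union(*(literal_vars(lit) for lit in clause))
    return all(literal_vars(lit) == clause_vars for lit in clause)


# The clause set S = {C1, C2, C3, C4} of Section 7.2
a, b, c = ("a",), ("b",), ("c",)
C1 = [(True, "P", (a, b))]
C2 = [(True, "P", ("x", "y")), (False, "P", ("y", ("f", "x", "y")))]
C3 = [(True, "Q", ("x", "y", "z")),
      (False, "Q", (("g", "z", "x", "y"), "x",
                    ("g", ("g", "y", "z", "x"), "x", "z")))]
C4 = [(False, "Q", (a, b, c))]
S = [C1, C2, C3, C4]
```

Under these assumptions, all literals of S pass is_covering and every clause of S passes literals_share_clause_vars, in line with the analysis done by hand.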
The problem satisfies the conditions to belong to the PVD and OCCI classes, too. C1 and C4 do not contain any variables. The set of variables of C2+ is equal to the set of variables of C2− (and the same holds for C3+ and C3−). So all these clauses satisfy the condition that the set of variables of Ci+ is included in the set of variables of Ci−. Furthermore, τmax(i, C2+) ≤ τmax(i, C2−) for i ∈ {x, y} and τmax(i, C3+) ≤ τmin(i, C3−) for i ∈ {x, y, z}.

However, the problem does not belong to the BS* or BSH* class: the depth of C2 is equal to 1 and the depth of C3 is equal to 2. It does not belong to WGC either: C2 and C3 have no negative literal L with τv(L) = 0.

Now that the expected result is known, it can be compared to the output provided by the program. The output is presented in Figure 7.2. It can be seen from this figure that the results correspond to the analysis done by hand.

    -----------------------SPASS-START--------------------------
    Input Problem:
    4[0:Inp] || Q(a,b,c) -> .
    3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U)) -> Q(V,W,U).
    2[0:Inp] || P(U,f(V,U)) -> P(V,U).
    1[0:Inp] || -> P(a,b).
    This is a first-order Horn problem without equality.
    This problem belongs to the EPS1 Class.
    This problem belongs to the EPS+ Class.
    This problem belongs to the S+ Class.
    This problem belongs to the PVD Class.
    This problem belongs to the OCCI Class.
    Axiom clauses: 4 Conjecture clauses: 0
    ...

Figure 7.2: Analysis of the first test

After this test, the problem was changed to belong to fewer classes. For that purpose, the second clause was transformed to:

    C2 = {P(f(f(x, y), y), y), ¬P(y, f(x, y))}

In this case, the inequality τmax(x, C2−) < τmax(x, C2+) holds, so the depth condition required by the PVD and OCCI classes is violated. S therefore no longer belongs to the PVD and OCCI classes. The results of this test are provided in Figure 7.3 and they are as expected.
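The PVD condition used in this analysis can be checked mechanically. The Python sketch below applies the definition recalled in Section 6.2.1 (the variables of Ci+ are included in those of Ci−, and τmax(x, Ci+) ≤ τmax(x, Ci−) for every variable x of Ci+) to the original and to the modified clause C2. The encoding, the depth convention (a variable that is a direct argument of an atom has depth 1) and the function names are assumptions of this sketch and do not reproduce the code of class_conditions.c.

```python
# Encoding (an assumption of this sketch):
#   variable   -> str, e.g. "x"
#   constant   -> 1-tuple, e.g. ("a",)
#   f(t1..tn)  -> tuple ("f", t1, ..., tn)
# Literal: (is_positive, predicate_name, (arg1, ..., argn))

def term_vars(term):
    """Return the set of variables occurring in a term or atom."""
    if isinstance(term, str):
        return {term}
    return set().union(*(term_vars(arg) for arg in term[1:]))


def max_var_depth(var, term, level=0):
    """Deepest occurrence of var in term, or None if var is absent."""
    if isinstance(term, str):
        return level if term == var else None
    depths = [max_var_depth(var, arg, level + 1) for arg in term[1:]]
    depths = [d for d in depths if d is not None]
    return max(depths) if depths else None


def max_depth_in(var, atoms):
    """tau_max of var over a list of atoms, or None if var is absent."""
    depths = [max_var_depth(var, atom) for atom in atoms]
    depths = [d for d in depths if d is not None]
    return max(depths) if depths else None


def is_pvd_clause(clause):
    """Check the PVD condition for a single clause."""
    positive = [(pred,) + tuple(args) for pos, pred, args in clause if pos]
    negative = [(pred,) + tuple(args) for pos, pred, args in clause if not pos]
    for var in set().union(*(term_vars(atom) for atom in positive)):
        neg_depth = max_depth_in(var, negative)
        if neg_depth is None:
            return False  # variable of C+ missing from C-
        if max_depth_in(var, positive) > neg_depth:
            return False  # occurs deeper in C+ than in C-
    return True


# Original and modified second clause of Section 7.2
C2 = [(True, "P", ("x", "y")), (False, "P", ("y", ("f", "x", "y")))]
C2_modified = [(True, "P", (("f", ("f", "x", "y"), "y"), "y")),
               (False, "P", ("y", ("f", "x", "y")))]
```

With these definitions, is_pvd_clause accepts C2 and rejects C2_modified, because x occurs deeper in the positive literal of the modified clause than in its negative literal.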
To test the last three classes E1, E+ and S+, the third clause was changed to:

    C3 = {Q(x, x, z), ¬Q(g(z, x, y), x, g(g(y, z, x), x, z))}

In this way, the first literal no longer has the same variables as the entire clause, or even as the other literal in the clause. This implies that the problem no longer belongs to S+ or to the E classes. The output can be seen in Figure 7.4.

    -----------------------SPASS-START--------------------------
    Input Problem:
    4[0:Inp] || Q(a,b,c) -> .
    3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U)) -> Q(V,W,U).
    2[0:Inp] || P(U,f(V,U)) -> P(f(f(V,U),U),U).
    1[0:Inp] || -> P(a,b).
    This is a first-order Horn problem without equality.
    This problem belongs to the EPS1 Class.
    This problem belongs to the EPS+ Class.
    This problem belongs to the S+ Class.
    Axiom clauses: 4 Conjecture clauses: 0
    ..

Figure 7.3: Analysis of the second test

    ----------------------SPASS-START--------------------------
    Input Problem:
    4[0:Inp] || Q(a,b,c) -> .
    3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U)) -> Q(V,V,U).
    2[0:Inp] || P(U,f(V,U)) -> P(f(f(V,U),U),U).
    1[0:Inp] || -> P(a,b).
    This is a first-order Horn problem without equality.
    Axiom clauses: 4 Conjecture clauses: 0
    ...

Figure 7.4: Analysis of the third test

All these outputs are as expected. Some other tests were then performed. A last simple test is provided below. This test was run to check whether the program can tell when a problem belongs to the BS*, BSH* or WGC classes. Let S = {C1, C2} with:

    C1 = {P(a, b)}
    C2 = {P(x, y), ¬P(y, x)}

This problem is very simple and satisfies the conditions of all the implemented classes. The program produced the following output:

    -----------------------SPASS-START--------------------------
    Input Problem:
    2[0:Inp] || P(U,V) -> P(V,U).
    1[0:Inp] || -> P(a,b).
    This is a first-order Horn problem without equality.
    This is a problem that has, if any, a finite domain model.
    There are no function symbols.
    This problem belongs to the BS* Class.
    This problem belongs to the BSH* Class.
    This problem belongs to the EPS1 Class.
    This problem belongs to the EPS+ Class.
    This problem belongs to the S+ Class.
    This problem belongs to the PVD Class.
    This problem belongs to the OCCI Class.
    This problem belongs to the WGC Class.
    Axiom clauses: 2 Conjecture clauses: 0
    ...

Figure 7.5: Analysis of the fourth test

The program declared that the problem belongs to the BS*, BSH* and WGC classes and to all the other classes, which is the expected result.

In conclusion, all the results given in this section correspond to the expected ones. Several additional tests were done and were all found correct. However, no conclusion could be drawn from this battery of tests alone. More tests were necessary to evaluate the work done correctly.

7.3 Classification of the TPTP library

The TPTP library was used in order to provide an accurate evaluation of the work done. The program was run on almost all the problems included in the TPTP library. This made it possible to classify the problems of the TPTP library according to the classes the program declared they belong to. 18% of the problems could not be translated into the dfg format because they use functions not handled by the tptp2X tool, and 3% of the problems could not be classified because they are too long and take too much time to give a result. All the other problems have been classified. The results are gathered in Table 7.1. This table presents how many problems of each domain belong to the different classes.
In total, 1301 problems belong to the BS class, 461 problems belong to the BSH class, 995 problems belong to the E+ class, 939 problems belong to the E1 class, 380 problems belong to the WGC class, 159 problems belong to the GC class, 611 problems belong to the PVD class, 580 problems belong to the OCCI class, 2675 problems belong to the S+ class and 10652 problems do not belong to any class. As some classes are included in others (for example GC is included in WGC and BSH is included in BS), only 3267 problems belong to at least one decidable class. This is not a large number, especially compared to the number of problems which do not belong to any class, but it is enough to evaluate the program. A larger set of problems would have taken too much time to test.

As can be seen in Table 7.1, the results differ completely from one domain to another. Some domains such as SYN have very nice properties and are particularly interesting for the project: 763 problems of the SYN domain belong to BS and 457 of them belong to E+. But many other domains are less interesting and contain only problems which do not belong to any class. The two classifications appear to be related. Problems of the domains COL and GRP tend to belong to the S+ class, problems of the SYN domain tend to belong to the BS class and problems of the PUZ domain are evenly distributed among the different decidable classes.

The classification made by the program was checked for some problems. It would be impossible to check all the problems one by one. Indeed, the library does not state to which classes a problem belongs, so the verification has to be done by hand. For all the problems checked, the results were as expected.

This new classification simplified the evaluation of the different resolution methods.

Domain    BS   BSH    E+    E1   WGC    GC   PVD  OCCI    S+  No class
AGT        0     0     0     0     0     0     0     0     0        52
ALG        0     0   179   179   177     9   179   177   221       211
ANA        0     0     3     3     0     0     0     1    10        81
ARI        1     0     1     1     0     0     0     0     1         0
BOO        0     0     4     4     0     0     0     0    60        21
CAT        0     0     0     0     0     0     0     0     0       130
COL        0     0     8     0     0     0     3     3   155        73
COM        0     0     3     3     0     0     2     2     3        39
CSR       25    25     0     0     0     0    25     0     0       163
DAT        0     0     0     0     0     0     0     0     0         0
FLD        0     0     0     0     0     0     0     0     0       279
GEG        0     0     0     0     0     0     0     0     0         1
GEO        2     1     0     0     0     0     2     0     0       587
GRA       15     0     1     1     1     1    15     1     1        18
GRP      101     1   155   125     0     0    44     0   509       443
HAL        0     0     0     0     0     0     0     0     0         9
HEN        0     0     8     8     0     0     0     0    29        38
HWC        0     0     2     2     1     1     1     1     2         3
HWV       53     0     1     1     9     2     3     1    36        68
KLE        0     0     0     0     0     0     0     0    72       157
KRS       49     1    42    42     1     1     1     1    65       203
LAT        3     3    16    15     6     5     6     6   169       384
LCL       19    14    16    16    13     9    13    13   577       337
LDA        0     0     0     0     0     0     0     0    29        20
MED        0     0     0     0     0     0     0     0     0        10
MGT       12     0     0     0     2     0     6     0     2       144
MSC       16    12    10    10    10     1    11    12    12        12
NLP       58    18     0     0     0     0    58    58     0       458
NUM        1     0    11    11    13     1    11    11    19       885
PLA       13    13     1     1     0     0    13     0     2        36
PRO        0     0     0     0     0     0     0     0     0        72
PUZ       56    21    42    42    35    30    46    45    52        64
REL        0     0     0     0     0     0     0     0     0       218
RNG        0     0     0     0     0     0     0     0    64       197
ROB        0     0     0     0     0     0     0     0    31        14
SCT        0     0     2     2     0     0     0     2     2       184
SET        2     0     6     5     1     0    11    12    18      1238
SEU        3     2     3     3     2     1     1     1    13       807
SEV        0     0     0     0     0     0     0     0     0         0
SWB       32    32     6     6     6     1     7     8     7       170
SWC        0     2     2     2     0     0     0     0    12       846
SWV       77     0    16     6     0     0    28    29    19      1192
SWW        0     0     0     0     0     0     1     1     1       362
SYN      763   316   457   451   103    97   124   195   482       324
SYO        0     0     0     0     0     0     0     0     0         0
TOP        0     0     0     0     0     0     0     0     0       102
Total   1301   461   995   939   380   159   611   580  2675     10652

Table 7.1: Classification of the TPTP library

7.4 Evaluation of the resolution methods

Once the problems of the TPTP library were listed by class, the decision procedures of the different decidable classes were tried one by one. Each implemented decision procedure was evaluated. These evaluations helped to choose the decision procedure which has to be used first when the problem belongs to several classes.

The program was run for 150 seconds per problem. It would have taken too much time to run all the tests if the time limit had been higher.
The decision procedure of the GC class was tried first. No deterioration was observed, but there was no improvement either. However, all the clauses inferred with the new resolution method belong to the GC class, which was not the case with the initial method. 154 out of 159 problems belonging to the GC class provide a result in less than 150 seconds. The results (satisfiable or unsatisfiable) are the same as those provided by the basic version.

After that, the decision procedures of the PVD, OCCI and BSH classes, which are based on the hyperresolution calculus, were tried. Several improvements were observed: 26 problems now give a result while they did not give any with the basic version. No deterioration was observed. 328 out of 351 problems belonging to the PVD class now provide a result in less than 150 seconds, whereas the basic version can only provide a result for 303 of them. 304 out of 319 problems belonging to the OCCI class and 410 out of 436 problems belonging to the BSH class also provide a result with this resolution method. The results were checked and are as expected.

Finally, the decision procedure of the E+ and E1 classes was tried. 431 out of 579 problems belonging to these classes provide a result, instead of 426 with the basic version. It is not a great improvement, but it should not be forgotten that these results are obtained within 150 seconds; with a longer time limit the results could be better. However, two problems which were solved by the basic version do not provide a result with the new version. As for the GC class, it was verified that these two procedures produce only clauses which belong to the class to which the problem initially belongs. This test is a good way to verify whether the resolution methods used really are decision procedures for these classes.
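The improvements and deteriorations reported in this section amount to set differences between the problems solved by each version. The following sketch illustrates the comparison under the assumption that each run is stored as a mapping from problem name to result string, with `None` when no result was obtained within the time limit. The function name and the sample data are invented for illustration.

```python
def compare_versions(basic, new):
    """Compare two result maps (problem -> result string, or None)."""
    solved_basic = {p for p, r in basic.items() if r is not None}
    solved_new = {p for p, r in new.items() if r is not None}
    # Solved only by the new version.
    improvements = sorted(solved_new - solved_basic)
    # Solved only by the basic version.
    deteriorations = sorted(solved_basic - solved_new)
    # Problems solved by both versions should agree on the verdict.
    disagreements = sorted(
        p for p in solved_basic & solved_new if basic[p] != new[p]
    )
    return improvements, deteriorations, disagreements


# Invented outcomes for four placeholder problems.
basic = {"A": "Proof found", "B": None, "C": "Completion found", "D": "Proof found"}
new = {"A": "Proof found", "B": "Completion found", "C": None, "D": "Proof found"}

improvements, deteriorations, disagreements = compare_versions(basic, new)
```

An empty `disagreements` list corresponds to the check that both versions return the same verdict (satisfiable or unsatisfiable) whenever both return one.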
The different improvements and deteriorations are summarised in Table 7.2 (which shows the results when the program is run for 150 seconds per problem). The problems presented in this table are those which give a result for only one version. Only SYN446+1.dfg is a real deterioration: SYN302-1.003.dfg also belongs to the PVD class, and if the decision procedure related to the PVD class is used, a result is found. Therefore, the issue was to know which decision procedure to use when a problem belongs to several decidable classes. Based on the results of the tests performed, it was decided to give priority to the decision procedure for the GC class, then to the hyperresolution calculus and finally to the resolution method for the E+ and E1 classes. This choice was made to optimise the performance of the prover. With this choice, the new version of SPASS can provide a result for 31 more problems than the basic version on this battery of tests. If the resolution method for the E+ and E1 classes had priority over the method for the PVD class, one of the 31 problems now solved by the new version would not give any result.

This chapter has shown that several improvements and almost no deterioration have been observed. However, the results are not as good as we could have hoped, since many of the problems studied do not belong to any decidable class that has been implemented.

After the tests had been performed, it was noticed that there was no way to choose the desired resolution method. Consequently, a new flag was created for each decision procedure implemented (GC, PVD, OCCI, BSH, EPS). These methods can then be activated by a simple option. For example, to activate the decision procedure related to the GC class, the user just has to specify the option -GC=1 on the command line. With these new flags, the resolution methods studied in this dissertation can also be used for problems which do not belong to any class.
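The priority rule described above can be pictured as a lookup over an ordered list. This is a sketch under stated assumptions, not the code added to SPASS: the function `choose_flag` is hypothetical, the relative order of PVD, OCCI and BSH inside the hyperresolution group is an assumption, and so is the mapping of both E+ and E1 to the EPS flag. Only the `-GC=1` spelling is given in the text; the other option strings are formed by analogy.

```python
# Priority stated in the text: GC first, then the hyperresolution-based
# procedures, then the procedure for the E+ and E1 classes.
# Assumptions: the order within the hyperresolution group, and that both
# E+ and E1 are served by the EPS flag.
PRIORITY = [
    ("GC", "GC"),
    ("PVD", "PVD"),
    ("OCCI", "OCCI"),
    ("BSH", "BSH"),
    ("E+", "EPS"),
    ("E1", "EPS"),
]


def choose_flag(classes):
    """Return the option for the highest-priority class, or None."""
    for class_name, flag in PRIORITY:
        if class_name in classes:
            return f"-{flag}=1"
    # WGC and S+ are recognised but have no decision procedure implemented.
    return None


both = choose_flag({"E1", "PVD"})
guarded = choose_flag({"GC", "WGC", "PVD"})
unsupported = choose_flag({"S+", "WGC"})
```

With this ordering, a problem in both E1 and PVD is handled by the PVD procedure, which matches the treatment of SYN302-1.003.dfg described above.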
Problems belonging to E1
    New version (results obtained: 406/554)
        PUZ037-3.dfg          Proof found
        SYN434+1.dfg          Completion found
        SYN449-1.dfg          Completion found
        SYN449+1.dfg          Completion found
        SYN456-1.dfg          Completion found
        SYN463+1.dfg          Completion found
        SYN464+1.dfg          Completion found
    Basic version (results obtained: 401/554)
        SYN302-1.003.dfg      Completion found
        SYN446+1.dfg          Completion found

Problems belonging to E+
    New version (results obtained: 431/579)
        Same improvements as for E1
    Basic version (results obtained: 426/579)
        Same deteriorations as for E1

Problems belonging to GC
    New version (results obtained: 154/159)
        No improvement
    Basic version (results obtained: 154/159)
        No deterioration

Problems belonging to BSH*
    New version (results obtained: 410/436)
        PUZ056-2.010.dfg      Proof found
    Basic version (results obtained: 409/436)
        No deterioration

Problems belonging to OCCI
    New version (results obtained: 304/319)
        PUZ056-2.010.dfg      Proof found
    Basic version (results obtained: 303/319)
        No deterioration

Problems belonging to PVD
    New version (results obtained: 328/351)
        GRA017+1.dfg          Completion found
        GRP128-2.004.dfg      Completion found
        GRP128-2.006.dfg      Proof found
        GRP128-3.004.dfg      Completion found
        GRP128-4.004.dfg      Completion found
        GRP129-1.005.dfg      Completion found
        GRP129-2.004.dfg      Proof found
        GRP129-2.005.dfg      Completion found
        GRP129-3.005.dfg      Completion found
        GRP129-4.005.dfg      Completion found
        GRP130-1.005.dfg      Completion found
        GRP130-2.005.dfg      Completion found
        GRP130-3.004.dfg      Completion found
        GRP130-4.004.dfg      Completion found
        GRP131-1.005.dfg      Completion found
        GRP131-2.005.dfg      Completion found
        GRP132-1.005.dfg      Completion found
        GRP132-2.005.dfg      Completion found
        GRP133-1.004.dfg      Completion found
        GRP133-2.004.dfg      Completion found
        GRP134-1.005.dfg      Completion found
        GRP134-2.005.dfg      Completion found
        GRP135-1.005.dfg      Completion found
        GRP135-2.005.dfg      Completion found
    Basic version (results obtained: 303/351)

Table 7.2: Improvements and deteriorations

Chapter 8
Conclusion

The aim of this dissertation was to improve the theorem prover SPASS.
The idea for achieving this aim was to make SPASS recognise some classes that are known to be decidable and select the appropriate method according to this new kind of analysis. This project required a lot of research. All the main concepts related to resolution for first-order logic are summarised in this dissertation. In particular, it is explained how resolution can be used as a decision procedure for some classes of problems. From the collected information, relevant decidable classes and their decision procedures have been listed. The goal was then to allow SPASS to use the information contained in this list.

The basic functioning of SPASS is described in detail in Chapter 3. An extension of the SPASS analysis module was necessary to make SPASS recognise when a problem belongs to a decidable class. The rules forming the studied decision procedures were all already implemented in the basic version of SPASS. Only some specific orderings and selection functions had to be implemented for SPASS to be able to use one of the decision procedures listed. The extended version of SPASS is now able to recognise eight classes and to activate a decision procedure for six of them.

The results provided by the program are described in Chapter 7. The tests show that the new version of SPASS solves more of the tested problems than the old one. All the results obtained are correct. Thirty-three problems now provide a result while they could not with the old version. Only one deterioration has been observed.

This version of SPASS can however still be improved. Because of time constraints, the decision procedures for the two classes WGC and S+ have not been implemented. It would therefore be interesting to implement them and observe whether better results are obtained. Moreover, many problems do not belong to any of the decidable classes implemented. The work done has no impact on these problems.
Thus, one way to improve this version would be to add decidable classes which include a large number of problems, such as Maslov's class K. The class K is a kind of superclass that includes a number of other solvable classes [Hustadt and Schmidt, 1999]. It would also be interesting to use the main idea of this project to improve theorem provers other than SPASS.

Bibliography

[Andréka et al., 1998] Andréka, H., Németi, I., and van Benthem, J. (1998). Modal languages and bounded fragments of predicate logic. Journal of Philosophical Logic, 27(3):217–274.

[Bachmair and Ganzinger, 1998] Bachmair, L. and Ganzinger, H. (1998). Equational reasoning in saturation-based theorem proving. Automated deduction - a basis for applications, 1:353–397.

[Bachmair and Ganzinger, 2001] Bachmair, L. and Ganzinger, H. (2001). Resolution theorem proving. Handbook of automated reasoning, 1:19–99.

[Caferra et al., 2004] Caferra, R., Leitsch, A., and Peltier, N. (2004). Automated model building, volume 31. Springer.

[de Nivelle, 1998] de Nivelle, H. (1998). A resolution decision procedure for the guarded fragment. In Kirchner, C. and Kirchner, H., editors, Automated Deduction - CADE-15, volume 1421 of Lecture Notes in Computer Science, pages 191–204. Springer Berlin / Heidelberg.

[de Nivelle, 2000] de Nivelle, H. (2000). Deciding the E+-class by an a posteriori, liftable order. Annals of Pure and Applied Logic, 104(1):219–232.

[Dierkes, 2000] Dierkes, M. (2000). An application of model building in a resolution decision procedure for guarded formulas. Computational Logic - CL 2000, pages 583–597.

[Fermüller et al., 2001a] Fermüller, C., Leitsch, A., Hustadt, U., and Tammet, T. (2001a). Resolution decision procedures. In Handbook of Automated Reasoning, pages 1791–1849. Elsevier Science Publishers BV, Elsevier and MIT Press.

[Fermüller et al., 2001b] Fermüller, C., Leitsch, A., Hustadt, U., and Tammet, T. (2001b). Unification theory. In Handbook of Automated Reasoning, pages 445–532. Elsevier Science Publishers BV, Elsevier and MIT Press.

[Fermüller and Leitsch, 1993] Fermüller, C. and Leitsch, A. (1993). Model building by resolution. In Börger, E., Jäger, G., Kleine Büning, H., Martini, S., and Richter, M., editors, Computer Science Logic, volume 702 of Lecture Notes in Computer Science, pages 134–148. Springer Berlin / Heidelberg.

[Fermüller et al., 1993a] Fermüller, C., Leitsch, A., Tammet, T., and Zamov, N. (1993a). Completeness of ordering refinements. In Resolution Methods for the Decision Problem, volume 679 of Lecture Notes in Computer Science, pages 60–92. Springer Berlin / Heidelberg.

[Fermüller et al., 1993b] Fermüller, C., Leitsch, A., Tammet, T., and Zamov, N. (1993b). Semantic clash resolution as decision procedure. In Resolution Methods for the Decision Problem, volume 679 of Lecture Notes in Computer Science, pages 17–59. Springer Berlin / Heidelberg.

[Fermüller et al., 1993c] Fermüller, C., Leitsch, A., Tammet, T., and Zamov, N. (1993c). Semantic tree based resolution variants. In Resolution Methods for the Decision Problem, volume 679 of Lecture Notes in Computer Science, pages 93–129. Springer Berlin / Heidelberg.

[Ganzinger and De Nivelle, 1999] Ganzinger, H. and De Nivelle, H. (1999). A superposition decision procedure for the guarded fragment with equality. In Logic in Computer Science, 1999. Proceedings. 14th Symposium on, pages 295–303. IEEE.

[Georgieva et al., 2003] Georgieva, L., Hustadt, U., and Schmidt, R. (2003). Hyperresolution for guarded formulae. Journal of Symbolic Computation, 36(12):163–192.

[Horrocks et al., 2007] Horrocks, I., Hustadt, U., Sattler, U., and Schmidt, R. (2007). Computational modal logic. Studies in Logic and Practical Reasoning, 3:181–245.

[Hsiang and Rusinowitch, 1991] Hsiang, J. and Rusinowitch, M. (1991). Proving refutational completeness of theorem-proving strategies: the transfinite semantic tree method. Journal of the ACM, 38(3):558–586.

[Hustadt and Schmidt, 1999] Hustadt, U. and Schmidt, R. (1999). Maslov's class K revisited. Automated Deduction - CADE-16, pages 678–678.

[Kowalski and Hayes, 1968] Kowalski, R. and Hayes, P. (1968). Semantic trees in automatic theorem proving. University of Edinburgh.

[Leitsch, 1997] Leitsch, A. (1997). The resolution calculus. Springer-Verlag New York.

[McCune, 2005] McCune, W. (2005). Release of prover9. In Mile High Conference on Quasigroups, Loops and Nonassociative Systems, Denver, Colorado.

[McCune et al., 2009] McCune, W. et al. (2009). Prover9 manual. http://www.cs.unm.edu/~mccune/mace4/manual/2009-11A.

[Noll, 1980] Noll, H. (1980). A note on resolution: How to get rid of factoring without loosing completeness. In 5th Conference on Automated Deduction Les Arcs, France, July 8–11, 1980, pages 250–263. Springer.

[Nonnengart and Weidenbach, 2001] Nonnengart, A. and Weidenbach, C. (2001). Computing small clause normal forms. Handbook of automated reasoning, 1:335–367.

[Rautenberg, 2009] Rautenberg, W. (2009). First-order logic. In A concise introduction to mathematical logic, pages 41–90. Springer Verlag.

[Riazanov and Voronkov, 2001] Riazanov, A. and Voronkov, A. (2001). Vampire 1.1. Automated Reasoning, pages 376–380.

[Schmidt, 2008] Schmidt, R. (2007-2008). Lecture Notes, COMP60121: Automated Reasoning. The University of Manchester.

[Schmidt, 2012] Schmidt, R. (2011-2012). Lecture Notes, COMP61111: Logical Reasoning and Applications. The University of Manchester.

[Schulz, 2002] Schulz, S. (2002). E - A brainiac theorem prover. AI Communications, 15(2-3):111–126.

[Schulz, 2010] Schulz, S. (2010). The E theorem prover. http://www4.informatik.tu-muenchen.de/~schulz/E/E.html.

[Skolem, 1955] Skolem, T. (1955). Peano's axioms and models of arithmetic. Studies in Logic and the Foundation of Mathematics, pages 1–14.

[Sutcliffe, 2009] Sutcliffe, G. (2009). The TPTP problem library and associated infrastructure. Journal of Automated Reasoning, 43(4):337–362.

[Sutcliffe and Suttner, 1998] Sutcliffe, G. and Suttner, C. (1998). The TPTP problem library. Journal of Automated Reasoning, 21:177–203. 10.1023/A:1005806324129.

[Voronkov, 2009a] Voronkov, A. (2009a). Lecture notes: Automated reasoning, The University of Manchester. http://www.voronkov.com/ar.cgi.

[Voronkov, 2009b] Voronkov, A. (2009b). Vampire's home page. http://www.vprover.org.

[Weidenbach, 2005] Weidenbach, C. (2005). Spass: An automated theorem prover for first-order logic with equality. http://www.spass-prover.org.

[Weidenbach et al., 2009] Weidenbach, C., Dimova, D., Fietzke, A., Kumar, R., Suda, M., and Wischnewski, P. (2009). Spass version 3.5. Automated Deduction - CADE-22, pages 140–145.

[Weidenbach et al., 2007a] Weidenbach, C., Schmidt, R., Hillenbrand, T., Rusev, R., and Topic, D. (2007a). System description: Spass version 3.0. Automated Deduction - CADE-21, pages 514–520.

[Weidenbach et al., 2007b] Weidenbach, C., Schmidt, R., and Keen, E. (2007b). Spass handbook version 3.0. Contained in the distribution of SPASS Version, 3.
Appendix A: SPASS flags

Inference Rules

ISor      Sort Constraint Resolution
IEmS      Empty Sort
IEqR      Equality Resolution
IERR      Reflexivity Resolution
ISpL      Superposition Left
IOPm      Ordered Paramodulation
ISPm      Standard Paramodulation
ISpR      Superposition Right
IOFc      Ordered Factoring
ISFc      Standard Factoring
IEqF      Equality Factoring
IMPm      Merging Paramodulation
IORe      Ordered Resolution
ISRe      Standard Resolution
IOHy      Ordered Hyper Resolution
ISHy      Standard Hyper Resolution
Splits    Splitting

Reduction Rules

RSSi      Sort Simplification
RSST      Static Soft Typing
RObv      Trivial Literal Elimination
RFSub     Forward Subsumption Deletion
RBSub     Backward Subsumption Deletion
RCon      Condensation
RTaut     Tautology Deletion
RUnc      Unit Conflict
RTer      Terminator
RFMMR     Forward Matching Replacement Resolution
RBMMR     Backward Matching Replacement Resolution
RFRew     Forward Rewriting
RBRew     Backward Rewriting
RAED      Assignment Equation Deletion