AN EXTENSION OF SPASS
DECIDING FIRST-ORDER CLAUSAL
CLASSES
A dissertation submitted to the University of Manchester
for the degree of Master of Science
in the Faculty of Engineering and Physical Sciences
2012
By
Yasmine Harbit
School of Computer Science
Contents
Abstract
Declaration
Copyright
Acknowledgements
1 Introduction
2 Resolution for first-order logic
2.1 Definitions
2.2 First-order logic
2.3 Resolution
2.3.1 Inference rules
2.3.2 Completeness and soundness
3 SPASS: An automated theorem prover
3.1 Architecture
3.2 Input File
3.3 Analysis module
3.4 Resolution part
3.4.1 Implemented rules and orderings
3.4.2 Default Mode
4 Resolution as decision procedure and decidable classes
4.1 Decision procedures
4.2 Decidable classes
4.2.1 Classes decidable by ordered resolution
4.2.2 Classes decidable by hyperresolution
5 Guarded clauses as case study
5.1 The guarded clauses
5.2 Decision procedure
5.3 Recognition of memberships of the GC class
5.4 Implementation of the decision procedure
5.4.1 Ordering / Selection function
5.4.2 Inference rules
5.4.3 Reduction rules
5.5 Issues
6 Other decidable classes
6.1 Recognition part
6.2 Implementation of the decision procedures
6.2.1 Classes decidable by hyperresolution
6.2.2 Classes decidable by ordered resolution
7 Tests and results
7.1 The TPTP library
7.2 Evaluation of the analysis module
7.3 Classification of the TPTP library
7.4 Evaluation of the resolution methods
8 Conclusion
Bibliography
Appendix A : SPASS flags
Word Count: 14477
List of Tables
6.1 Hyperresolution calculus
6.2 Ordered resolution calculus
7.1 Classification of the TPTP library
7.2 Improvements and deteriorations
List of Figures
3.1 SPASS architecture
3.2 SPASS input
3.3 SPASS analysis
3.4 SPASS output
4.1 Clause depth
7.1 Structure of the TPTP library
7.2 Analysis of the first test
7.3 Analysis of the second test
7.4 Analysis of the third test
7.5 Analysis of the fourth test
Abstract
SPASS is an automated theorem prover for first-order logic which tries to prove
the satisfiability or unsatisfiability of a problem. The project presented in this
dissertation is to extend SPASS and to make it solve more problems than it does
already. The way to achieve this aim is to make SPASS recognise problems known
to be decidable by specific resolution methods, and then based on the capacity to
know the classes to which a problem belongs, to be able to select the appropriate
resolution method. Therefore, the ultimate aim is to turn SPASS into a decision
procedure for classes of first-order logic known to be decidable. For this project,
a program corresponding to an extension of SPASS has been implemented. The
analysis performed by this program contains all the data already provided by
the current version of the prover but indicates in addition the decidable classes
to which the problem belongs. Thanks to these data, SPASS is able to solve
more problems than it already does by setting the right rules depending on the
classes the problem belongs to. The results of the tests performed are provided.
They have all been found correct. The new version of SPASS can recognise eight
different classes of problems and can select the right resolution method for six of
them.
Declaration
No portion of the work referred to in this dissertation has
been submitted in support of an application for another
degree or qualification of this or any other university or
other institute of learning.
Copyright
i. The author of this dissertation (including any appendices and/or schedules
to this dissertation) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to
use such Copyright, including for administrative purposes.
ii. Copies of this dissertation, either in full or in extracts and whether in hard
or electronic copy, may be made only in accordance with the Copyright,
Designs and Patents Act 1988 (as amended) and regulations issued under
it or, where appropriate, in accordance with licensing agreements which the
University has entered into. This page must form part of any such copies
made.
iii. The ownership of certain Copyright, patents, designs, trade marks and other
intellectual property (the “Intellectual Property”) and any reproductions of
copyright works in the dissertation, for example graphs and tables (“Reproductions”), which may be described in this dissertation, may not be owned
by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use
without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.
iv. Further information on the conditions under which disclosure, publication
and commercialisation of this dissertation, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/display.aspx?DocID=487), in any relevant Dissertation restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and
in The University’s Guidance for the Presentation of Dissertations.
Acknowledgements
First, I want to thank the University of Manchester for having given me the
opportunity to realise this dissertation.
I am particularly grateful to my supervisor Dr. Renate A. Schmidt for the time
she spent helping me, the trust she placed in me and her constructive remarks
which have allowed me to go so far in this project.
I have a particular thought for my parents and my brother. They have contributed to this achievement by their support, their financial aid, their patience
and their limitless love.
Finally, I want to thank my friends for their advice and their encouragements.
Chapter 1
Introduction
Automated provers have been a great advance in the automated reasoning
area. This area of computer science concerns programs created in order to give
computers the ability to reason. These programs are asked to determine the
satisfiability or unsatisfiability of different problems. Deciding the satisfiability
of a problem can be easy for some cases but almost impossible for others. If there
exists a method to determine if a given formula is satisfiable for a logical system
then the system is called decidable. The project focuses on first-order logic which
is undecidable. However, a large number of fragments of first-order logic can be
decided by some specific methods. These fragments are detailed in this report.
Today, several automated provers for first-order logic exist. Among these
provers are: Prover9, which is a successor of Otter, the first high performance
theorem prover based on first-order resolution [McCune et al., 2009, McCune,
2005], E another high performance theorem prover [Schulz, 2010, Schulz, 2002]
and Vampire [Voronkov, 2009b, Riazanov and Voronkov, 2001], both based on the
superposition calculus which is a mix of first-order resolution and ordering-based
equality reasoning. The project consists of improving an existing prover called
SPASS. Like the E and Vampire provers, SPASS is an automated theorem prover
for first-order logic based on the superposition calculus [Weidenbach, 2005, Weidenbach et al., 2009]. Given a problem, SPASS tries to find a proof indicating
its satisfiability or its unsatisfiability. Depending on the difficulty of the problem, SPASS can determine the satisfiability of a problem on its own or it can be
guided, i.e. the users can specify the rules to use when they launch it. In fact, as
first-order logic is not decidable, the task is often difficult.
For many problems, SPASS cannot find a way to solve them. However, an automated prover should ideally be completely independent. It should
determine the satisfiability of as many problems as possible without any help.
The aim of the project is to increase the number of problems for which SPASS can
provide a result without external assistance. Therefore, the
project aims to boost the intelligence of SPASS. The way to achieve this aim is
to turn SPASS into a decision procedure for fragments of first-order logic known
to be decidable by resolution.
The first step in the realisation of the project was to make a list of known
decidable classes, i.e. classes for which there exists a calculus, called a decision procedure, which is sound, complete and guaranteed to terminate. The goal was to make
SPASS use the information contained in this list by extending its analysis module such that SPASS can recognise if a problem belongs to one of the classes
listed. Thanks to this module, SPASS can know which resolution method to use
in order to be certain of providing a result. One main issue of the project was to prove that the
resolution methods implemented are sound, complete and terminating.
This report aims to describe precisely the work carried out. First, the basic
concepts on which this project is built are described. These include the main
notions related to first-order logic, resolution and SPASS. Then it is explained
how resolution can be used as decision procedure, that is to say how it can be used
to determine effectively if a problem is satisfiable or not, and a list of decidable
classes is provided. Finally, the different steps concerning the implementation of
the new SPASS modules are detailed.
Chapter 2
Resolution for first-order logic
2.1 Definitions
To understand all the concepts that are involved in this project, some definitions have to be given. These definitions, taken from [Schmidt, 2012, Fermüller
et al., 2001a, Ganzinger and De Nivelle, 1999], are provided below.
Term:
A term is either a first-order variable, a constant or a functional term, that is to
say, a term which has the form: f (t1 , ..., tn ) where f is a function symbol and
t1 , ..., tn are terms.
Functional term:
A term is called functional if it contains a constant or a function symbol.
Atom:
We find two kinds of atoms: non-equational atoms and equational atoms. Non-equational atoms have the form P(s1, ..., sn) where P is a predicate symbol and
s1 , ..., sn are terms. Equational atoms have the form s ≈ t where s and t are
terms. We speak about first-order logic with equality when equational atoms are
admitted.
Literal:
A literal is either an atom or an atom preceded by a negation sign.
Ground term / literal:
A ground term/literal is a term/literal with no occurrences of variables.
Expression:
An expression represents either a term or a literal.
Clause:
A clause is a finite multiset of literals.
Simple literals and clauses:
A literal L is called simple if each term in L is either a variable or a functional
term containing only variables and constants as arguments.
A simple clause is a clause where all literals are simple.
Horn clause:
A Horn clause is a clause with at most one positive literal.
Substitution:
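A substitution σ is a mapping from variables to terms. The modification σ[x ↦ t] of σ is defined as follows:
\[
\sigma[x \mapsto t](y) \stackrel{\text{def}}{=}
\begin{cases}
t, & \text{if } y = x \\
\sigma(y), & \text{otherwise}
\end{cases}
\]
where [x ↦ t] means that the value t is assigned to x.
The application of this substitution to y can also be written yσ[x ↦ t].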
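The definition above can be illustrated with a minimal sketch. The term representation (variables as strings, constants and functional terms as tuples) and the helper names `apply` and `update` are assumptions made for this illustration; they are not taken from SPASS.

```python
# Assumed representation: a variable is a str; a constant or functional term
# is a tuple (symbol, arg1, ..., argn), so the constant a is ("a",).

def apply(term, sigma):
    """Apply the substitution sigma (dict: variable -> term) to a term."""
    if isinstance(term, str):                 # a variable
        return sigma.get(term, term)          # unmapped variables stay unchanged
    symbol, *args = term
    return (symbol, *(apply(arg, sigma) for arg in args))

def update(sigma, x, t):
    """Return sigma[x -> t]: identical to sigma except that x is mapped to t."""
    new_sigma = dict(sigma)                   # leave the original untouched
    new_sigma[x] = t
    return new_sigma

sigma = {"x": ("a",)}
sigma2 = update(sigma, "y", ("f", "x", ("b",)))
result = apply(("g", "x", "y", "z"), sigma2)
print(result)
```

Here `sigma2` agrees with `sigma` on every variable except `y`, which mirrors the two cases of the definition.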
Maximal:
Let ≻ be a total ordering on ground atoms, i.e. an ordering such that if A and B
are two ground atoms, either A ⪰ B or A ⪯ B.
A ground literal L is called (strictly) maximal with respect to a ground clause C
if and only if ∀L′ ∈ C: L ⪰ L′ (L ≻ L′).
A non-ground literal L is (strictly) maximal with respect to a clause C if and
only if there is some ground substitution σ such that Lσ is (strictly) maximal with
regard to Cσ, i.e., ∀L′ ∈ C: Lσ ⪰ L′σ (Lσ ≻ L′σ).
2.2 First-order logic
In propositional logic, formulae are built from propositional symbols whose
truth values are either true or false. However, sometimes it can be useful to decompose these propositions. For example, instead of having a proposition KymIsaWoman, it can be nice to have an entity called Kym and a relation called
Woman whose arity is equal to one. First-order logic has been created to meet
this demand.
This logic is really well described in [Voronkov, 2009a], [Schmidt, 2012] and
[Rautenberg, 2009]. The main notions extracted from these readings are summarised in this section.
Syntax
First-order logic is an extension of propositional logic which allows the designation of individuals and their quantification. The alphabet of a first-order
language is then richer than that of a propositional language. It consists of
logical and non-logical symbols [Schmidt, 2012]. Logical symbols can be divided
into four groups: logical connectives such as →, ↔, ∧, ∨ or ¬, quantifiers (∀, ∃),
variables and auxiliary symbols ( ‘(’, ‘[’, ‘.’, ...). Non-logical symbols correspond
to function symbols, constants and predicate symbols.
Individuals are denoted by terms built on these symbols. It is possible to state
properties of these individuals by the creation of atoms. Based on the notion
of atoms and on the existence of quantifiers, the definition of a formula can be
introduced. A formula is an element A which satisfies one of the following
conditions [Ganzinger and De Nivelle, 1999]:
• A = ⊤ or ⊥,
• A is an atom,
• A = ¬F where F is a formula,
• A = (F ∗ G) where ∗ ∈ {→, ↔, ∧, ∨} and F, G are formulae,
• A = ∀xF or ∃xF where F is a formula.
Conversion to Clausal Normal Form
A difficulty in first-order logic is how to deal with quantifiers. It is easiest
to get rid of them. Many transformations have been introduced in order to
lead to a clausal form without any quantifiers [Schmidt, 2012, Nonnengart and
Weidenbach, 2001].
The first step is to transform the formula such that all quantifiers come at
the beginning. After this step, the formula is said to be in prenex normal form.
Then the aim is to eliminate all ∃ quantifiers. The idea is to replace each existentially quantified variable x by
a term built from a fresh function symbol taking as arguments the variables on which x
depends. This step is called Skolemisation [Skolem, 1955]. After this step,
the formula can be transformed into conjunctive normal form, that is to say it
can be transformed into the following form: $\forall x_1 \ldots \forall x_n \bigwedge_{i=1}^{k} \bigvee_{j=1}^{n_i} L_{ij}$. Finally,
by dropping all ∀ and all ∧, several clauses which do not possess any quantifier
are obtained. The problem is then said to be in clausal form.
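As a small worked illustration of these steps, consider the following derivation. The formula and the Skolem function symbol f are chosen here purely as an example and do not come from the cited sources.

```latex
\begin{align*}
&\forall x\, \exists y\, (P(x) \rightarrow Q(x, y)) && \text{prenex normal form} \\
&\forall x\, (P(x) \rightarrow Q(x, f(x))) && \text{Skolemisation: } y \text{ depends on } x \\
&\forall x\, (\neg P(x) \lor Q(x, f(x))) && \text{conjunctive normal form} \\
&\neg P(x) \lor Q(x, f(x)) && \text{clausal form}
\end{align*}
```

In this case the conjunctive normal form contains a single disjunction, so only one clause is obtained.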
Semantics
It has been seen what a formula is and how it can be transformed into a convenient
form. It is now time to define the notion of truth for a formula.
The truth value of a formula depends on the interpretation of its language. An
interpretation assigns domain elements to the free variables (variables which are
not inside the scope of a quantifier) [Schmidt, 2012]. An interpretation I satisfies
a formula F (equivalently, I is a model of F) if F is true in I. This is
denoted by I ⊧ F. F is said to be satisfiable if there exists an interpretation I
such that I ⊧ F. F is valid if F is true in every interpretation.
2.3 Resolution
Resolution is a system performing proofs by refutation: for each problem,
it aims to prove the unsatisfiability of a set of clauses (see [Voronkov, 2009a,
Schmidt, 2012, Leitsch, 1997]). It works on a set of clauses presented in the
clausal form. Resolution is based on the principle of making inferences.
Making inferences amounts to applying inference rules whose general form is:
\[
\frac{F_1 \quad \ldots \quad F_n}{G}
\]
where n ≥ 0 and F1, ..., Fn, G are formulae. F1, ..., Fn are said to be the premises and
G the conclusion.
A set of inference rules is called an inference system (or calculus) [Schmidt,
2012]. There exist many different inference rules. The most relevant ones for the
project are described in the next section.
2.3.1 Inference rules
The project deals with resolution for first-order logic problems. First-order
resolution is a little more difficult than propositional resolution. The difficulty
comes from the use of variables. In this context, two syntactically different terms can
in fact be made identical. There is a process called unification which aims to make
two terms equal by the application of a substitution [Fermüller et al.,
2001b].
The definition as taken from [Fermüller et al., 2001a] is the following:
“A set of expressions M is unifiable by a substitution σ (called unifier of M) if
Eiσ = Ejσ for all Ei, Ej ∈ M. σ is called the most general unifier (mgu) of M if for
every other unifier ρ of M, there is a substitution θ such that σθ = ρ.”
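The notion of most general unifier can be made concrete with a short sketch of a textbook unification procedure for two terms. This is not the algorithm implemented in SPASS; the term representation (variables as strings, other terms as tuples) and the function names are assumptions made for the illustration. The unifier is returned in triangular form, so it has to be applied with `substitute`.

```python
# Assumed representation: a variable is a str; a constant or functional term
# (or an atom) is a tuple (symbol, arg1, ..., argn).

def is_var(term):
    return isinstance(term, str)

def substitute(term, sigma):
    """Apply sigma to term, following chains of variable bindings."""
    if is_var(term):
        return substitute(sigma[term], sigma) if term in sigma else term
    return (term[0], *(substitute(arg, sigma) for arg in term[1:]))

def occurs(x, term, sigma):
    """Occurs check: does the variable x occur in term under sigma?"""
    term = substitute(term, sigma)
    if is_var(term):
        return term == x
    return any(occurs(x, arg, sigma) for arg in term[1:])

def unify(s, t, sigma=None):
    """Return a most general unifier of s and t as a dict, or None."""
    sigma = {} if sigma is None else sigma
    s, t = substitute(s, sigma), substitute(t, sigma)
    if is_var(s):
        if s == t:
            return sigma
        if occurs(s, t, sigma):
            return None                       # x cannot be unified with f(..x..)
        return {**sigma, s: t}
    if is_var(t):
        return unify(t, s, sigma)
    if s[0] != t[0] or len(s) != len(t):
        return None                           # symbol or arity clash
    for a, b in zip(s[1:], t[1:]):
        sigma = unify(a, b, sigma)
        if sigma is None:
            return None
    return sigma

left = ("P", "x", ("f", "y"))
right = ("P", ("a",), ("f", "x"))
mgu = unify(left, right)
print(mgu, substitute(left, mgu))
```

Applying the computed unifier to both atoms yields the same ground atom P(a, f(a)), which is what the binary resolution rule requires of its premises.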
When first-order logic is concerned, unification is required in order to compare
terms. The inference rules of the first-order resolution calculus are based on the
unification of the terms. The definitions of the main inference rules involved in
this project are provided below.
There are two main inference rules of the first-order resolution calculus: the
binary resolution rule and the binary factoring rule. These rules are defined as
follows [Bachmair and Ganzinger, 2001]:
Binary resolution rule
\[
\frac{C \lor A \qquad \neg B \lor D}{(C \lor D)\sigma}
\]
where A and B denote atoms, C, D denote clauses and σ is an mgu of A and B.
(C ∨ D)σ is called a resolvent of the two premises.
Binary factoring rule
\[
\frac{C \lor L \lor L}{C \lor L}
\]
where L is a literal and C denotes a clause.
C ∨ L is called a factor of the initial clause.
Some other inference rules are required for the project. This is the
case for the splitting rule and superposition, which are described below.
Splitting rule
Let N be a set of clauses and C and D be two variable-disjoint clauses. The
splitting rule can be defined as follows [Bachmair and Ganzinger, 2001]:
\[
\frac{N \cup \{C \lor D\}}{N \cup \{C\} \;\mid\; N \cup \{D\}}
\]
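The side condition of the splitting rule, namely that C and D share no variables, can be checked mechanically. The sketch below partitions the literals of a clause into variable-disjoint parts; it is only an illustration under an assumed representation (literals as `(positive, atom)` pairs, variables as strings) and says nothing about how SPASS implements splitting.

```python
def variables(term):
    """Variables of a term or atom; variables are str, other terms are tuples."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))

def split_components(clause):
    """Partition a clause (list of (positive, atom) literals) into
    variable-disjoint groups of literals."""
    components = []                            # list of (variable set, literals)
    for literal in clause:
        lit_vars = variables(literal[1])
        merged = [literal]
        untouched = []
        for comp_vars, comp_lits in components:
            if comp_vars & lit_vars:           # shares a variable: merge
                lit_vars = lit_vars | comp_vars
                merged = comp_lits + merged
            else:
                untouched.append((comp_vars, comp_lits))
        components = untouched + [(lit_vars, merged)]
    return [lits for _, lits in components]

# P(x) v Q(x, y) v not R(z): the first two literals share x, R(z) stands alone.
clause = [(True, ("P", "x")), (True, ("Q", "x", "y")), (False, ("R", "z"))]
parts = split_components(clause)
print(parts)
```

When more than one part is returned, the clause can be written as C ∨ D with C and D variable-disjoint, so the rule is applicable.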
Superposition
Superposition is used to deal with equality. It takes into account two parameters:
a specific ordering and a selection function. Superposition is described in detail
in Section 5.4.2.
This part would not be complete without talking about the rules related to
the ordered resolution and hyperresolution calculi. These two inference systems have
been really useful to reach the aim of the project.
Ordered Resolution
An ordering is a transitive and irreflexive binary relation which specifies an order between literals. Inference rules are then applied only on the maximal literals.
A selection function is a mapping from a clause C to a set of negative literals
present in C [Bachmair and Ganzinger, 2001, Schmidt, 2012]. This set specifies
the literals to use in priority to infer new clauses.
Consider that an atom ordering > and a selection function S have been chosen. The ordered resolution and ordered factoring rules are defined as follows
[Bachmair and Ganzinger, 2001].
Let C, D be two clauses.
Ordered factoring rule:
\[
\frac{C \lor L_1 \lor L_2}{(C \lor L_1)\sigma}
\]
provided that:
- L1, L2 are two literals,
- σ is a most general unifier of L1 and L2,
- L1σ is strictly maximal with respect to Cσ and
- nothing is selected in C by S.
Ordered resolution with selection:
\[
\frac{C \lor A \qquad \neg B \lor D}{(C \lor D)\sigma}
\]
provided that:
- A, B are two atoms,
- σ is a most general unifier of A and B,
- Aσ is strictly maximal with respect to Cσ,
- nothing is selected in C by S,
- ¬B is selected, or else nothing is selected in ¬B ∨ D and ¬Bσ is maximal with respect to Dσ.
Hyperresolution
Hyperresolution calculus is a refinement of resolution. As for ordered resolution,
an ordering and a selection function S are used. But in this case, S selects all
negative literals in a clause. Hyperresolution can be defined as follows [Georgieva
et al., 2003, Schmidt, 2008]:
\[
\frac{C_1 \lor A_1 \quad \ldots \quad C_n \lor A_n \qquad \neg B_1 \lor \ldots \lor \neg B_n \lor D}{(C_1 \lor \ldots \lor C_n \lor D)\sigma}
\]
provided that:
- σ is the most general unifier such that A1σ = B1σ, ..., Anσ = Bnσ,
- Aiσ is strictly maximal in Ciσ, 1 ≤ i ≤ n,
- nothing is selected in Ci and
- the indicated ¬Bi are exactly the ones selected by S and D is a clause
containing only positive literals.
The factoring rule previously defined is also included in the hyperresolution calculus.
Hyperresolution calculus produces fewer inferences and only positive clauses.
The following example illustrates this statement. Let N be the set of clauses
{C1 = D1 ∨ A1, C2 = D2 ∨ A2, C3 = ¬A1 ∨ ¬A2 ∨ D3}. Suppose that ordered
resolution is used with a selection function S which first selects the literal
¬A1. From C1 and C3, a first clause C4 = D1 ∨ ¬A2 ∨ D3 can be inferred. After
this step, the selection function selects the remaining negative literal ¬A2. From C2
and C4, a second clause C5 = D1 ∨ D2 ∨ D3 is then inferred. Now, suppose that
hyperresolution is used instead of ordered resolution. All the negative literals are
selected at the same time. Then D1 ∨ D2 ∨ D3 is directly inferred from N.
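The example can be replayed in a few lines of code. The sketch below is propositional only: it treats D1, D2 and D3 as single positive atoms (a simplifying assumption), ignores unification and orderings, and uses hypothetical helper names. It is meant to show the difference in the number of steps, not to model the SPASS implementation.

```python
# Clauses are frozensets of literals; a negative literal is written "-A".

def resolve_on(pos_clause, neg_clause, atom):
    """One resolution step on atom between pos_clause and neg_clause."""
    assert atom in pos_clause and "-" + atom in neg_clause
    return (pos_clause - {atom}) | (neg_clause - {"-" + atom})

def hyperresolve(nucleus, electrons):
    """Resolve away all negative literals of nucleus in a single step.
    electrons maps each negated atom to a positive clause containing it."""
    result = set(nucleus)
    for literal in nucleus:
        if literal.startswith("-"):
            atom = literal[1:]
            result.discard(literal)
            result |= electrons[atom] - {atom}
    return frozenset(result)

C1 = frozenset({"D1", "A1"})
C2 = frozenset({"D2", "A2"})
C3 = frozenset({"-A1", "-A2", "D3"})

# Resolution with selection: two steps, with the intermediate clause C4.
C4 = resolve_on(C1, C3, "A1")
C5 = resolve_on(C2, C4, "A2")

# Hyperresolution: one step, no intermediate clause is kept.
H = hyperresolve(C3, {"A1": C1, "A2": C2})
print(sorted(C4), sorted(C5), sorted(H))
```

Both routes end with the same positive clause, but only the first one produces the mixed clause C4 along the way.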
2.3.2 Completeness and soundness
Some calculi have interesting properties. In particular, they can be complete
and sound. A resolution refinement is called complete if it satisfies the following
condition: if a set of clauses is unsatisfiable, an empty clause is derivable from the
set by the resolution calculus [Fermüller et al., 1993a]. It is called sound when
its inference rules provide a conclusion which is a semantic/logical consequence
of the premises [Schmidt, 2012]. These definitions imply that:
- If a calculus R1 is complete, R1 can prove all the unsatisfiable problems,
i.e., R1 derives an empty clause for all the unsatisfiable problems. But a
satisfiable problem can also be provable by R1, as nothing prevents R1 from
deriving an empty clause from a satisfiable problem. The result of R1
can therefore be declared correct only if it is satisfiable.
- If a calculus R2 is sound, R2 can prove only unsatisfiable problems. But
the problem for this kind of calculus is that some unsatisfiable problems
are not provable: R2 does not necessarily derive an empty clause from an
unsatisfiable problem. The result of R2 can therefore be declared correct only
if it is unsatisfiable.
A sound and complete calculus can consequently prove all the unsatisfiable
problems and only the unsatisfiable problems. If no proof is found, the problem
is necessarily satisfiable. Suppose that we have a reasoner based on a calculus which
is sound and complete. After applying all inference rules to the set of clauses N,
if ⊥ is derived then N is unsatisfiable; otherwise, if no new clauses can be inferred, N is
called saturated, which implies that N is satisfiable.
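The saturation loop described above can be sketched for the propositional case, where termination is guaranteed because only finitely many clauses exist over a finite set of atoms. The code is an illustrative assumption, not the SPASS main loop: clauses are sets of string literals, so factoring is implicit, and no ordering or redundancy elimination is applied.

```python
from itertools import combinations

def resolvents(c1, c2):
    """All propositional resolvents of two clauses (frozensets of literals)."""
    found = set()
    for literal in c1:
        complement = literal[1:] if literal.startswith("-") else "-" + literal
        if complement in c2:
            found.add(frozenset((c1 - {literal}) | (c2 - {complement})))
    return found

def saturate(clauses):
    """Return 'unsatisfiable' if the empty clause is derived, and
    'satisfiable' once the clause set is saturated."""
    known = {frozenset(c) for c in clauses}
    if frozenset() in known:
        return "unsatisfiable"
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for resolvent in resolvents(c1, c2):
                if not resolvent:              # empty clause derived
                    return "unsatisfiable"
                new.add(resolvent)
        if new <= known:                       # nothing new: saturated
            return "satisfiable"
        known |= new

print(saturate([{"P"}, {"-P", "Q"}, {"-Q"}]))
print(saturate([{"P", "Q"}, {"-P"}]))
```

For first-order clause sets the same loop need not terminate, which is why the decidable classes and the refinements discussed in this report matter.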
One of the most important issues of this project is to prove that all the implemented calculi are complete and sound. Without this verification, we cannot
be sure that the returned results are correct.
This kind of reasoner then gives an efficient proof search. However, applying
all inference rules without any restriction leads to redundant clauses. A clause C is
called redundant with respect to a set of clauses S if there exist C1, ..., Cn ∈ S, n ≥ 0, such that
all Ci ≺ C and C1, ..., Cn ⊧ C [Bachmair and Ganzinger, 2001, Schmidt, 2012].
During the saturation process, redundant clauses can be removed. This is the
role of the condensation and deletion rules. Among these rules are the tautology
elimination rule and the subsumption elimination rule. As their names indicate,
the tautology elimination rule consists of deleting all tautologies, that is to say all
clauses C which are always true (⊧ C). The subsumption elimination rule deletes
clauses which are subsumed by other clauses of the set, that is to say all clauses
D such that there exists a clause C such that Cσ ⊂ D for some substitution σ.
This rule is especially interesting when condensation is allowed. A condensation
of a clause C (Cond(C)) is by definition “a minimal subclause of C which is also
an instance of C” [Horrocks et al., 2007]. Cond(C) subsumes C and therefore, if
the subsumption elimination rule is activated, C can be removed in the presence
of Cond(C).
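For ground (variable-free) clauses the two deletion rules reduce to simple set operations, since the substitution σ is then the identity. The following sketch shows this special case only; the representation (clauses as sets of string literals) and the function names are assumptions for the illustration.

```python
def is_tautology(clause):
    """A ground clause is a tautology if it contains both A and -A."""
    return any("-" + lit in clause for lit in clause if not lit.startswith("-"))

def simplify(clauses):
    """Apply tautology elimination, then subsumption elimination."""
    kept = {frozenset(c) for c in clauses if not is_tautology(c)}
    # D is deleted when some other kept clause C is a proper subset of D.
    return {d for d in kept if not any(c < d for c in kept)}

clause_set = [{"P", "-P", "Q"}, {"P"}, {"P", "Q"}, {"-Q", "R"}]
reduced = simplify(clause_set)
print(sorted(sorted(c) for c in reduced))
```

In the first-order case the subset test has to be replaced by a search for a matching substitution σ, which is considerably more expensive.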
Chapter 3
SPASS: An automated theorem prover
The aim of the project is to improve the prover SPASS. SPASS is an automated theorem prover for first-order logic with equality based on the superposition
calculus [Weidenbach, 2005, Weidenbach et al., 2007b, Weidenbach et al., 2007a].
It was a requirement of the project specification to use SPASS. SPASS has been
chosen because it is an open source prover and it is widely used. Moreover, it
contains a large number of rules, such as the splitting rule, that are required for the
implementation of the different decision procedures. This is not the case for all
provers. In this chapter, the way SPASS works is explained.
3.1 Architecture
SPASS works on an input file written in dfg format. This file is processed by the
module top which contains the main function. The first step performed, when a
problem is given, is the transformation of this problem into clausal form. This
step is performed by the module FLOTTER. A further module is dedicated
to handling the different actions related to clauses. In particular, it is this module
which selects the literals in a clause. After the transformation of the problem into clausal normal
form, an analysis is performed. For this project, a new module
which tells whether a problem belongs to one of the decidable classes described
in this report has been created. It has been integrated into the SPASS analysis
module. The analysis performed enables SPASS to set the adequate rules to use.
The rules are set by activating specific flags. SPASS has an associated flag for
each rule implemented. To activate a rule, the user just has to specify the option
-flag=value on the command line [Weidenbach, 2005]. For example, to activate
the ordered factoring rule, the option -IOFc=1 should be set. The list of all flags
is provided in Appendix A. The value of the flags can be set automatically by
activating the automatic mode -Auto=1. It is possible to activate specific rules
even in the automatic mode.
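As an illustration of the -flag=value convention, the following sketch assembles a command line from a dictionary of flags. Only the two flags named in the text are used. The executable name SPASS, the placement of the options before the input file and the file name problem.dfg are assumptions made for this example.

```python
def spass_command(input_file, flags, executable="SPASS"):
    """Build an argument list of the assumed form:
    SPASS -flag=value ... input_file"""
    options = [f"-{name}={value}" for name, value in flags.items()]
    return [executable, *options, input_file]

# Automatic mode with the ordered factoring rule explicitly activated.
command = spass_command("problem.dfg", {"Auto": 1, "IOFc": 1})
print(" ".join(command))
```

Such a list could be passed to `subprocess.run` to launch the prover, provided that the executable is installed and available on the search path.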
Once the rules are set, the resolution is performed. The output is then printed
on the terminal.
The architecture of SPASS is summarised in Figure 3.1.
Figure 3.1: SPASS architecture
More details concerning the functioning of SPASS are provided in the following
sections.
3.2 Input File
SPASS takes as input a file written in a specific syntax. This input file, which
is in dfg format, should contain different information. It consists of two parts: a
description part and a logical part [Weidenbach, 2005].
The description part needs to contain the name of the problem and of its author,
the status of the problem (satisfiable or unsatisfiable) and a brief description of
the problem.
The logical part can be divided into two parts: a part where symbols are declared and a part where formulas or clauses are declared [Weidenbach, 2005].
Signature symbols have to be declared first. This amounts to declaring the necessary
function and predicate symbols. The function symbols are defined using
functions[], which takes as arguments pairs specifying the symbols
and the arities of the functions. The predicate symbols are defined similarly
using predicates[]. Formulas and clauses are then declared. Depending
on the clause type chosen, the clause should be written in a different form. A
cnf clause has the form forall(term list, or(...)) and a dnf clause has the
form exists(term list, and(...)).
Figure 3.2 is an example of a SPASS input file.
begin_problem(example1).
list_of_descriptions.
name({*example1*}).
author({*Yasmine*}).
status(satisfiable).
description({*This problem is a test concerning the recognition part.*}).
end_of_list.
list_of_symbols.
functions[(a,0),(b,0),(c,0),(f,2),(g,3)].
predicates[(P,2),(Q,3)].
end_of_list.
list_of_clauses(axioms, cnf).
clause(or(P(a,b))).
clause(forall([x,y],or(P(x,y),not(P(y,f(x,y)))))).
clause(forall([x,y,z],or(Q(x,y,z),not(Q(g(z,x,y),x,g(g(y,z,x),x,z)))))).
clause(or(not(Q(a,b,c)))).
end_of_list.
end_problem.
Figure 3.2: SPASS input
3.3 Analysis module
SPASS starts by reading the input file. It then transforms the problem
into clausal normal form, after which the problem is analysed. The SPASS analysis
module can give a lot of information about the problem. This analysis module is
able to tell:
• if the problem is a Horn problem, i.e., a problem which contains only Horn
clauses
• if the problem is a monadic problem
• if the problem contains equational atoms, that is to say if it is a problem with
equality
• if the problem is a first-order problem or a pure propositional problem
• if the problem contains function symbols
• if the conjecture is ground
Figure 3.3 illustrates the analysis produced for the problem provided in the
previous section.
Input Problem:
4[0:Inp] || Q(a,b,c) -> .
3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U)) -> Q(V,W,U).
2[0:Inp] || P(U,f(V,U)) -> P(V,U).
1[0:Inp] || -> P(a,b).
This is a first-order Horn problem without equality.
Axiom clauses: 4 Conjecture clauses: 0
Figure 3.3: SPASS analysis
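One of the checks listed above, the Horn test, is simple enough to sketch directly from the definition of a Horn clause. The code below is an illustration only and does not reproduce the SPASS analysis module. Literals are assumed to be `(positive, atom)` pairs with the atom kept as an uninterpreted string, and the four clauses are those of the input file in Figure 3.2.

```python
def is_horn_clause(clause):
    """A Horn clause has at most one positive literal."""
    return sum(1 for positive, _ in clause if positive) <= 1

def is_horn_problem(clauses):
    """A Horn problem contains only Horn clauses."""
    return all(is_horn_clause(clause) for clause in clauses)

example = [
    [(True, "P(a,b)")],
    [(True, "P(x,y)"), (False, "P(y,f(x,y))")],
    [(True, "Q(x,y,z)"), (False, "Q(g(z,x,y),x,g(g(y,z,x),x,z))")],
    [(False, "Q(a,b,c)")],
]
print(is_horn_problem(example))
```

Every clause of the example has at most one positive literal, which agrees with the line "This is a first-order Horn problem without equality" in Figure 3.3.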
3.4 Resolution part
3.4.1 Implemented rules and orderings
SPASS has thirty-one rules implemented. These rules are listed in Appendix
A. They can be split into two different groups: inference rules and reduction rules.
All the rules needed for the project are already implemented. Among them, there
are:
• the usual inference rules such as the ordered resolution, the ordered hyperresolution, the ordered factoring and the splitting rule,
• the inference rules concerning the problems with equality such as the equality resolution, the reflexivity resolution, the superposition left and the superposition right,
• the usual reduction rules such as the tautology deletion, the subsumption
deletion and the condensation rule.
The ordering used to solve a problem can be chosen. Two orderings are
implemented in SPASS [Weidenbach et al., 2007b]: the Knuth-Bendix
ordering (KBO) and the recursive path ordering with status (RPOS).
The Knuth-Bendix ordering is based on a weight function, which maps the
signature symbols (functions, predicates) to the non-negative integers. It
also takes as a parameter a strict order on the set of signature symbols,
called a precedence.
The KBO implemented in SPASS is defined as follows [Weidenbach et al.,
2007b]:
If t and s are terms then t ≻kbo s if occ(x, t) ≥ occ(x, s) for every variable
x ∈ (vars(t) ∪ vars(s)) and
(1) weight(t) > weight(s) or
(2) weight(t) = weight(s) and t = f(t1, ..., tk) and s = g(s1, ..., sl) and
  (2a) f > g in the precedence or
  (2b) f = g and
    (2b1) status(f) = left and (t1, ..., tk) ≻kbo^lex (s1, ..., sl) or
    (2b2) status(f) = right and (tk, tk−1, ..., t1) ≻kbo^lex (sl, sl−1, ..., s1)
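The definition above can be read almost directly as a recursive comparison. The following is a minimal sketch of that reading, not the SPASS code: the term representation, the function names, the `precedence` list (greatest symbol first) and the `status` dictionary are all assumptions introduced for illustration.

```python
from collections import Counter

# Terms: variables are strings; f(t1,...,tn) is the tuple ("f", t1, ..., tn);
# constants are 1-tuples such as ("a",).

def var_occurrences(term):
    """Count the occurrences of each variable in a term (occ in the text)."""
    if isinstance(term, str):
        return Counter([term])
    counts = Counter()
    for arg in term[1:]:
        counts += var_occurrences(arg)
    return counts

def weight(term, sym_weight, var_weight=1):
    """Sum of the weights of all symbol and variable occurrences."""
    if isinstance(term, str):
        return var_weight
    return sym_weight[term[0]] + sum(
        weight(arg, sym_weight, var_weight) for arg in term[1:])

def kbo_greater(t, s, sym_weight, precedence, status):
    """Return True if t is greater than s according to the definition above."""
    occ_t, occ_s = var_occurrences(t), var_occurrences(s)
    if any(occ_t[x] < occ_s[x] for x in occ_s):
        return False                         # variable condition violated
    wt, ws = weight(t, sym_weight), weight(s, sym_weight)
    if wt > ws:
        return True                          # case (1)
    if wt < ws or isinstance(t, str) or isinstance(s, str):
        return False                         # case (2) needs two non-variable terms
    f, g = t[0], s[0]
    if f != g:
        return precedence.index(f) < precedence.index(g)    # case (2a)
    t_args, s_args = t[1:], s[1:]
    if status.get(f, "left") == "right":     # case (2b2): compare right to left
        t_args, s_args = t_args[::-1], s_args[::-1]
    for ti, si in zip(t_args, s_args):       # lexicographic extension
        if ti != si:
            return kbo_greater(ti, si, sym_weight, precedence, status)
    return False

weights = {"f": 1, "g": 1, "a": 1}
prec = ["f", "g", "a"]                       # f > g > a
a = ("a",)
print(kbo_greater(("f", "x"), ("g", "x"), weights, prec, {}))
```

In this sketch f(x) is greater than g(x) because both terms have the same weight and f precedes g, while g(x) and f(y) are incomparable because the variable condition fails.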
The recursive path ordering with status does not use a weight function, only
a precedence. Each function symbol is given a status, which can be left, right
or mul, and which specifies whether its arguments are compared
lexicographically (from the left or from the right) or as multisets.
The RPOS implemented in SPASS is defined as follows [Weidenbach
et al., 2007b]:
If t and s are terms then t ≻rpos s if
(1) s ∈ vars(t) and t ≠ s or
(2) t = f(t1, ..., tk) and s = g(s1, ..., sl) and
  (2a) ti ⪰rpos s for some 1 ≤ i ≤ k or
  (2b) f > g and t ≻rpos sj for all 1 ≤ j ≤ l or
  (2c) f = g and
    (2c1) status(f) = left and (t1, ..., tk) ≻rpos^lex (s1, ..., sl) and
          t ≻rpos sj for all 1 ≤ j ≤ l or
    (2c2) status(f) = right and (tk, tk−1, ..., t1) ≻rpos^lex (sl, sl−1, ..., s1) and
          t ≻rpos sj for all 1 ≤ j ≤ l or
    (2c3) status(f) = mul and (t1, ..., tk) ≻rpos^mul (s1, ..., sl)
These are the two main orderings used in today's provers, and almost any
other ordering can be obtained by modifying them slightly [Weidenbach et al., 2007b].
3.4.2 Default Mode
By default, SPASS runs in automatic mode. In this mode, regardless of
the problem, the following rules are enabled: trivial literal elimination,
tautology deletion, forward/backward matching replacement resolution and
subsumption deletion. The other rules are set depending on the analysis of
the problem. The settings are detailed below:
• If the problem contains real predicates, the ordered resolution and condensation rules are activated. Moreover, if the problem contains non-Horn clauses, the ordered factoring rule is also activated.
• If the problem contains positive equations, the superposition right/left, forward/backward rewriting and condensation rules are activated. Moreover, if the problem contains non-Horn clauses, equality factoring is also activated.
• If the problem contains negative equations, equality resolution is activated.
• If the problem does not contain any function symbols, all the negative literals are selected; otherwise only the maximal negative literals are selected.
With these settings, SPASS is able to provide a result for a number of
problems. For many other problems, however, SPASS fails to reach a result.
One way to remedy this is to make SPASS use a resolution method known to be
a decision procedure for the input problem. In this way, SPASS is guaranteed
to produce a result.
The result finally produced is either "Proof found" or "Completion found",
depending on the satisfiability of the problem. SPASS can also output the
proof when it finds one.
Figure 3.4 shows the output produced by SPASS for the example studied
in Section 3.2.
Inferences: IORe=1
Reductions: RFMRR=1 RBMRR=1 RObv=1 RUnC=1 RTaut=1 RFSub=1
RBSub=1 RCon=1
Extras    : Input Saturation, Dynamic Selection, No Splitting, Full Reduction, Ratio: 5, FuncWeight: 1, VarWeight: 1
Precedence: f > g > P > Q > a > b > c
Ordering : KBO
Processed Problem:
Usable Clauses:
1[0:Inp] || -> P(a,b)*.
4[0:Inp] || Q(a,b,c)* -> .
2[0:Inp] || P(U,f(V,U))* -> P(V,U).
3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U))* -> Q(V,W,U).
Given clause: 1[0:Inp] || -> P(a,b)*.
Given clause: 4[0:Inp] || Q(a,b,c)* -> .
Given clause: 2[0:Inp] || P(U,f(V,U))* -> P(V,U).
Given clause: 3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U))* ->
Q(V,W,U).
SPASS V 3.8d
SPASS beiseite: Completion found.
Problem: examples/example.dfg
SPASS derived 0 clauses, backtracked 0 clauses, performed 0
splits and kept 4 clauses.
SPASS allocated 31830 KBytes.
SPASS spent 0:00:00.05 on the problem.
0:00:00.02 for the input.
0:00:00.00 for the FLOTTER CNF translation.
0:00:00.00 for inferences.
0:00:00.00 for the backtracking.
0:00:00.00 for the reduction.
Figure 3.4: SPASS output
Chapter 4
Resolution as decision procedure and decidable classes
4.1 Decision procedures
The main rules for solving a first-order problem are known, but there is no
mechanical way to decide whether a problem is satisfiable or not. As long as
the prover has not terminated, there is no way to know whether the problem is
unsatisfiable and the computation has simply not been carried far enough, or
whether the problem is satisfiable and the search may go on forever.
First-order logic is said to be undecidable.
Nevertheless, some classes of first-order problems can be decided by using
refinements of the resolution calculus [Fermüller et al., 2001a]. These reasoning
methods are called decision procedures. A decision procedure is sound, complete
and terminating.
Important refinements of the resolution calculus are those based on
orderings, selection functions and hyperresolution. It has been shown that
these refinements, used together with specific rules such as the splitting rule,
decide some classes of problems.
Take the monadic class as an example. The monadic class is the class of clauses
that contain only monadic predicate symbols, i.e., predicate symbols of arity 1, and no function symbols other than those introduced by Skolemization. The atoms
allowed in this class necessarily have the form P(s) or P(f(t)), where P is a predicate symbol, f is a Skolem function and s and t are each either a constant or a variable.
Consider the following set S of clauses:
1. ¬P(x) ∨ P(f(x))
2. P(a)
This set belongs to the monadic class. Applying unrestricted resolution to S gives
the following derivation:
3. P(f(a))          Res(1,2) with σ = {x/a}
4. P(f(f(a)))       Res(1,3) with σ = {x/f(a)}
5. P(f(f(f(a))))    Res(1,4) with σ = {x/f(f(a))}
...
In this case, clauses that do not belong to the monadic class (clauses 4, 5, ...)
are inferred. This can be prevented by using an appropriate
ordering. Let ≻ be an ordering such that for all atoms A and B, A ≻ B if A contains more
function symbols than B. Under this ordering restriction, no inference can be
performed on S.
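The effect of the ordering on this example can be replayed with a few lines of code. The sketch below is hard-wired to the set S above and is only meant as an illustration; the term representation and the helper names (`count_functions`, `resolve_with_unit`, `saturate`) are assumptions, and the ordering restriction is modelled simply as "the resolved literal must not be smaller than another literal of its clause".

```python
# Terms: a variable is a string, f(t) is ("f", t), a constant is ("a",).

def count_functions(term):
    """Number of function symbols of arity >= 1 occurring in a term."""
    if isinstance(term, str) or len(term) == 1:
        return 0
    return 1 + sum(count_functions(arg) for arg in term[1:])

def substitute(term, var, value):
    """Replace every occurrence of the variable var in term by value."""
    if isinstance(term, str):
        return value if term == var else term
    return (term[0],) + tuple(substitute(arg, var, value) for arg in term[1:])

# Clause 1 is ¬P(x) ∨ P(f(x)); each literal is (sign, argument of P).
CLAUSE_1 = [("-", "x"), ("+", ("f", "x"))]

def resolve_with_unit(unit_arg, ordered):
    """Resolve the unit clause P(unit_arg) against ¬P(x) in clause 1.

    With ordered=True the inference is blocked when the other literal of
    clause 1 has more function symbols, i.e. when ¬P(x) is not maximal.
    """
    neg_arg = CLAUSE_1[0][1]
    pos_arg = CLAUSE_1[1][1]
    if ordered and count_functions(pos_arg) > count_functions(neg_arg):
        return None
    return substitute(pos_arg, neg_arg, unit_arg)    # σ = {x / unit_arg}

def saturate(steps, ordered):
    """Apply at most `steps` resolution steps, starting from P(a)."""
    derived, current = [], ("a",)
    for _ in range(steps):
        current = resolve_with_unit(current, ordered)
        if current is None:
            break
        derived.append(current)
    return derived

unrestricted = saturate(3, ordered=False)
restricted = saturate(3, ordered=True)
print(unrestricted)
print(restricted)
```

Without the restriction the arguments f(a), f(f(a)), f(f(f(a))) of clauses 3 to 5 are produced; with it the list of derived clauses stays empty.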
This example shows that the ordering restriction prevents the inference of
ever larger clauses, which explains why ordered resolution is used for some
classes of problems. For classes more complex than the monadic class,
ordering restrictions alone are not enough, and different refinements of
resolution are required depending on the class. The decidable classes chosen
for implementation and their decision procedures are described below.
4.2 Decidable classes
4.2.1 Classes decidable by ordered resolution
The first important step of the project was to make a list of known decidable
classes that could be implemented. The chapter on resolution decision procedures
in the Handbook of Automated Reasoning [Fermüller et al., 2001a] was of great
help for this task. For each class, a concrete definition and a resolution
decision procedure are given.
In this section, only classes decidable by orderings are described. Most of these classes
use resolution based on an A-ordering as their decision procedure. An A-ordering ≺a is an
irreflexive, transitive binary relation on atoms such that for all atoms A, B and all
substitutions θ, A ≺a B implies Aθ ≺a Bθ. This notion was first introduced
in [Kowalski and Hayes, 1968]. This kind of ordering is particularly interesting
because a resolution refinement based on an A-ordering combined
with the splitting rule is known to be complete. A proof is given in [Fermüller et al.,
1993c].
Before giving the definitions of the different classes, new notions and notations
need to be introduced. First, the definition of a (weakly) covering term as taken
from [Fermüller et al., 2001a] is provided.
A functional term t is called (weakly) covering if for all (non ground) functional subterms s occurring in t we have the set of variables of s equal to the set
of variables of t. An atom or literal A is called (weakly) covering if each argument
of A is either a variable or a constant (just a term with no variables in case of
weakly covering) or a (weakly) covering term t such that the set of variables of t
is equal to the set of variables of A.
For example,
- P(f(x,y),g(x,y,y),a) is covering.
- P(f(x,y),g(x,y,y),h(a)) is not covering but it is weakly covering.
- P(f(x,y),g(x,y,y),h(x)) is neither covering nor weakly covering.
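The three examples can be checked mechanically. The sketch below is one possible reading of the definition, under the assumption that a functional term is a function application with at least one argument (so that constants are handled separately, as in the definition of a covering atom); the representation and the function names are illustrative only.

```python
# Terms: variables are strings; constants are 1-tuples ("a",);
# function applications are tuples ("f", t1, ..., tn) with n >= 1.

def variables(term):
    """Set of variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))

def functional_subterms(term):
    """All subterms (term included) headed by a function of arity >= 1."""
    if isinstance(term, str) or len(term) == 1:
        return []
    found = [term]
    for arg in term[1:]:
        found.extend(functional_subterms(arg))
    return found

def is_covering_term(term, weakly=False):
    """Check the (weakly) covering condition on a functional term."""
    for sub in functional_subterms(term):
        if weakly and not variables(sub):
            continue                      # ground subterms are ignored
        if variables(sub) != variables(term):
            return False
    return True

def is_covering_atom(args, weakly=False):
    """Check the (weakly) covering condition on an atom, given its arguments."""
    atom_vars = set().union(*(variables(arg) for arg in args))
    for arg in args:
        if isinstance(arg, str) or len(arg) == 1:
            continue                      # variable or constant
        if weakly and not variables(arg):
            continue                      # ground term, allowed when weakly covering
        if not is_covering_term(arg, weakly) or variables(arg) != atom_vars:
            return False
    return True

f_xy, g_xyy, a = ("f", "x", "y"), ("g", "x", "y", "y"), ("a",)
atom1 = (f_xy, g_xyy, a)
atom2 = (f_xy, g_xyy, ("h", a))
atom3 = (f_xy, g_xyy, ("h", "x"))
for atom in (atom1, atom2, atom3):
    print(is_covering_atom(atom), is_covering_atom(atom, weakly=True))
```

The three atoms are classified as in the examples above: covering, weakly covering only, and neither.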
The notations used in the rest of this thesis are the following:
• S represents a clause set.
• C+ (respectively C−) denotes the set of positive literals occurring in C (respectively the set of negative literals occurring in C).
• τ(E) denotes the depth of the expression E and τmin(t, E) (τmax(t, E)) the minimal (maximal) depth of a term t within an expression E.
• τv(E) represents the maximal variable depth of an expression E and is defined as τv(E) = max{τmax(y, E) ∣ y belongs to the set of variables of E}.
The depth of a term t is defined as follows [Fermüller et al., 2001a].
• If t is a variable or a constant then τ(t) = 0.
• If t = f(t1, ..., tn) then τ(t) = 1 + max{τ(ti) ∣ 1 ≤ i ≤ n}.
For all literals L and clauses C, τ(L) = max{τ(t) ∣ t ∈ args(L)} and τ(C) =
max{τ(L) ∣ L ∈ C}.
The following example illustrates the notion of depth.
Let C = {P(x, y, f(y), f(f(x)))}.
Figure 4.1 represents the clause C.
Figure 4.1: Clause depth
The graph shows that τmin(x, C) = τmin(y, C) = 0, τmax(y, C) = 1
and τmax(x, C) = 2. The depth of C is the depth of the entire tree, that is to say
τ(C) = 2.
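The values of this example can be recomputed from the definitions. The following sketch assumes a tuple representation of terms and illustrative function names; for a ground literal, where the maximum defining τv ranges over an empty set, it returns None, which is a choice made here and not something stated in the text.

```python
# Terms: variables are strings; constants are 1-tuples ("a",);
# function applications are tuples ("f", t1, ..., tn).

def depth(term):
    """τ(t): 0 for variables and constants, 1 + max over the arguments otherwise."""
    if isinstance(term, str) or len(term) == 1:
        return 0
    return 1 + max(depth(arg) for arg in term[1:])

def literal_depth(args):
    """τ(L): maximal depth of the arguments of the literal."""
    return max(depth(arg) for arg in args)

def occurrence_depths(var, term, level=0):
    """Depths of all occurrences of var inside term, starting at `level`."""
    if isinstance(term, str):
        return [level] if term == var else []
    return [d for arg in term[1:]
            for d in occurrence_depths(var, arg, level + 1)]

def literal_occurrences(var, args):
    """Depths of all occurrences of var in a literal given by its arguments."""
    return [d for arg in args for d in occurrence_depths(var, arg)]

def tau_min(var, args):
    return min(literal_occurrences(var, args))

def tau_max(var, args):
    return max(literal_occurrences(var, args))

def literal_variables(args):
    found = set()
    def walk(term):
        if isinstance(term, str):
            found.add(term)
        else:
            for arg in term[1:]:
                walk(arg)
    for arg in args:
        walk(arg)
    return found

def tau_v(args):
    """Maximal variable depth; None for a ground literal (assumption)."""
    depths = [tau_max(x, args) for x in literal_variables(args)]
    return max(depths) if depths else None

# C = {P(x, y, f(y), f(f(x)))}
C = ("x", "y", ("f", "y"), ("f", ("f", "x")))
print(tau_min("x", C), tau_min("y", C), tau_max("y", C), tau_max("x", C))
print(literal_depth(C), tau_v(C))
```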
Classes E1 and E+
Some decidable classes are based on these notions of covering and weakly covering. This is the case for the classes E1 and E+.
The E1 class can be seen as an extension of the Ackermann class, which is characterised by the prefix type ∃*∀∃* [Fermüller et al., 1993c]. It is defined as follows:
Suppose that for each clause Ci of S, the following two conditions are true:
1. all literals appearing in Ci are covering and
2. for all literals L, M belonging to Ci, the set of variables of L, V(L), is equal
to the set of variables of M, V(M), or V(L) and V(M) have no intersection.
Then S is said to belong to E1 [Fermüller et al., 1993c].
This class has been studied because of its nice properties. First, it is known
that if a clause set S belongs to E1 then a resolvent or a factor of a clause C ∈ S
cannot have more variables than C [Fermüller et al., 2001a]. Moreover, the depth
of the resolvents of clauses in E1 can be limited by the use of the A-ordering R1,
defined such that for all atoms A and B, A is greater than B if τ(A) > τ(B) and
τmax(x, A) > τmax(x, B) for all x ∈ V(B) [Fermüller et al., 1993c].
The E1 class has been proved to be decidable [Fermüller et al., 1993c]. R1
combined with the splitting rule gives a decision procedure for E1 [Fermüller
et al., 1993c, Fermüller et al., 2001a].
The class E+ is very similar to the class E1. The only difference concerns
the first condition: literals do not need to be covering, but all literals need to be
weakly covering.
This class has the same properties as the class E1: it is known to be decidable
and R1 combined with the splitting rule provides a decision procedure [Fermüller
et al., 1993c, de Nivelle, 2000].
Class S+
A third class, called the S+ class, has been shown to be related to the E1 class
[Fermüller et al., 2001a]. This class can be seen as an extension of the Skolem class,
characterised by a prefix of the form (∃z1)...(∃zl)(∀y1)...(∀ym)(∃x1)...(∃xn). Its
definition is given below [Fermüller et al., 2001a].
Suppose that for each clause Ci of S, the following two conditions are true:
1. for all literals L in Ci, the set of variables of L is equal to the set of variables
of Ci or contains at most one element and
2. for all functional terms t in Ci, the set of variables of t is equal to the set
of variables of Ci.
Then S belongs to S+.
Like E1, this class is decidable [Fermüller et al., 1993c]. A decision procedure is
provided by resolution based on R1 combined with a process called monadization
and the splitting rule [Fermüller et al., 1993c, Fermüller et al., 2001a].
Class WGC
The class of weakly guarded clauses was studied next. This class uses
the notion of weakly covering introduced earlier in this
chapter. The WGC class is decidable [de Nivelle, 1998, Fermüller et al., 2001a].
A decision procedure is provided by the ordering R2, defined such that for all
literals L1 and L2, L1 is greater than L2 if τv(L1) > τv(L2) or the set of variables
of L1 includes the set of variables of L2 [de Nivelle, 1998]. The definition of the
WGC class is given below.
S belongs to WGC if for each clause Ci of S and for all literals L in Ci:
1. L is weakly covering,
2. if τv(L) > 0, then the set of variables of L is equal to the set of variables of
Ci and
3. if Ci is non-ground (i.e. if Ci contains variables), then there is a negative
literal L1 ∈ Ci such that τv(L1) = 0 and which has the same set of variables
as Ci.
This class is a superclass of the class GC, which is detailed in Chapter 5.
4.2.2 Classes decidable by hyperresolution
Refinements based on A-orderings are not the only ones used to provide decision procedures. Some other interesting decidable classes cannot be decided by a
refinement based only on an A-ordering. Most of these classes use a refinement
based on hyperresolution as their decision procedure. It has been shown that hyperresolution based on positive, restricted factoring is complete [Fermüller and
Leitsch, 1993]. A resolvent of two clauses C and D, where C is
positive, is said to be under positive, restricted factoring if factoring is only applied to C
[Fermüller and Leitsch, 1993]. This section describes some classes known
to be decidable by hyperresolution based on positive, restricted factoring.
Class BSH∗
The first class decidable by hyperresolution that was considered is the
BSH∗ class [Fermüller et al., 2001a, Caferra et al., 2004]. This class is a subclass
of a class called BS∗, which represents the clausal form of the Bernays-Schönfinkel
class. The Bernays-Schönfinkel class is the class of all formulas of the form
(∃z1)...(∃zl)(∀y1)...(∀ym)P, where P is quantifier-free and has only variables as
terms [Fermüller et al., 2001a].
BS∗ can then be defined as the class of clause sets S which include only
clauses C such that τ(C) = 0, and BSH∗ is the subclass of BS∗ containing sets
of Horn clauses only [Fermüller et al., 2001a].
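Both memberships are purely syntactic and can be tested clause by clause. The following sketch illustrates the two definitions above; the literal representation (sign, predicate, arguments) and the function names `in_bs_star` and `in_bsh_star` are assumptions made for this example.

```python
# A literal is (positive, predicate, args); a clause is a list of literals.
# Terms: variables are strings, constants are 1-tuples, f(t1,...,tn) is
# the tuple ("f", t1, ..., tn).

def depth(term):
    """τ(t) as defined in Section 4.2.1."""
    if isinstance(term, str) or len(term) == 1:
        return 0
    return 1 + max(depth(arg) for arg in term[1:])

def in_bs_star(clauses):
    """Every clause has depth 0, i.e. only variables and constants occur."""
    return all(depth(arg) == 0
               for clause in clauses
               for _, _, args in clause
               for arg in args)

def is_horn(clause):
    """At most one positive literal."""
    return sum(1 for positive, _, _ in clause if positive) <= 1

def in_bsh_star(clauses):
    return in_bs_star(clauses) and all(is_horn(clause) for clause in clauses)

horn_set = [[(False, "P", ("x", "y")), (True, "P", ("y", "x"))],
            [(True, "P", (("a",), ("b",)))]]
non_horn_set = [[(True, "P", ("x",)), (True, "Q", ("x",))]]
deep_set = [[(True, "P", (("f", "x"),))]]

print(in_bsh_star(horn_set), in_bs_star(non_horn_set), in_bs_star(deep_set))
```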
Class PVD
After the BSH∗ class, the PVD class was studied. PVD stands for positive
variable dominated [Fermüller et al., 1993b]. The name comes from the fact that,
in the PVD class, the positive part of a clause C is in a sense dominated by the
negative part of C. The definition of this class is the following [Fermüller et al.,
2001a]:
S belongs to PVD if for each clause Ci ∈ S, the set of variables of Ci+ is
included in the set of variables of Ci− and τmax(x, Ci+) ≤ τmax(x, Ci−) for all
x ∈ V(Ci+).
It has been proved that the PVD class is decidable by hyperresolution [Fermüller
et al., 1993b, Fermüller et al., 2001a].
Class OCCI
The last class defined is the OCCI class. This class is more restrictive than the
PVD class concerning the relation between the positive and negative parts
of a clause C. It ensures that every variable occurs only once in C+. Like PVD,
the OCCI class has been shown to be decidable and hyperresolution provides a
decision procedure [Fermüller et al., 1993b, Fermüller et al., 2001a]. The OCCI
class is defined as follows.
If for all clauses Ci ∈ S and for all variables v of Ci+, the number of occurrences of
v in Ci+ is equal to 1 and τmax(x, Ci+) ≤ τmin(x, Ci−) for all x ∈ V(Ci+) ∩ V(Ci−),
then S belongs to OCCI [Fermüller et al., 1993b, Fermüller et al., 2001a].
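The PVD and OCCI conditions only compare the depths at which variables occur in the positive and negative parts of a clause. The sketch below checks both conditions for a single clause; it follows the two definitions as stated above, and the representation and the names `var_depths`, `is_pvd_clause` and `is_occi_clause` are assumptions introduced for illustration.

```python
# A literal is (positive, predicate, args); a clause is a list of literals.
# Terms: variables are strings, constants are 1-tuples, f(t1,...,tn) is
# the tuple ("f", t1, ..., tn).

def var_depths(literals):
    """Map each variable to the depths of all its occurrences in the literals."""
    depths = {}
    def walk(term, level):
        if isinstance(term, str):
            depths.setdefault(term, []).append(level)
        else:
            for arg in term[1:]:
                walk(arg, level + 1)
    for _, _, args in literals:
        for arg in args:
            walk(arg, 0)
    return depths

def split_depths(clause):
    pos = var_depths([lit for lit in clause if lit[0]])
    neg = var_depths([lit for lit in clause if not lit[0]])
    return pos, neg

def is_pvd_clause(clause):
    """V(C+) ⊆ V(C−) and τmax(x, C+) ≤ τmax(x, C−) for all x in V(C+)."""
    pos, neg = split_depths(clause)
    return all(x in neg and max(pos[x]) <= max(neg[x]) for x in pos)

def is_occi_clause(clause):
    """Each variable occurs once in C+ and τmax(x, C+) ≤ τmin(x, C−) on shared x."""
    pos, neg = split_depths(clause)
    return all(len(pos[x]) == 1 and (x not in neg or max(pos[x]) <= min(neg[x]))
               for x in pos)

c1 = [(False, "P", (("f", "x"),)), (True, "P", ("x",))]   # ¬P(f(x)) ∨ P(x)
c2 = [(False, "P", ("x",)), (True, "P", (("f", "x"),))]   # ¬P(x) ∨ P(f(x))
c3 = [(False, "P", ("x",)), (True, "Q", ("x", "x"))]      # ¬P(x) ∨ Q(x, x)

for clause in (c1, c2, c3):
    print(is_pvd_clause(clause), is_occi_clause(clause))
```

The clause ¬P(x) ∨ P(f(x)) of Section 4.1 fails both tests, since x occurs deeper in the positive part than in the negative part.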
The classes presented in these two sections have been studied in many papers
and are interesting for the project because they can be handled and
implemented in SPASS. However, care must be taken with equality,
which makes almost all these classes undecidable. As can be noticed, only
clausal classes have been studied. Since the analysis of a problem is performed after
its transformation into clausal normal form, it is more appropriate
to work on clausal classes than on formula classes.
A last decidable class has been carefully studied for this project: the class
of guarded clauses. The entire next chapter is dedicated to this class. It details
the mechanism used to recognise and solve problems belonging to this class.
Chapter 5
Guarded clauses as case study
The guarded fragment was inspired by two observations [Andréka et al., 1998]:
• Many propositional modal logics (extensions of classical propositional logic which include operators expressing modality) have very nice properties.
• These modal logics can be translated into first-order logic.
Propositional modal logics can then be seen as fragments of first-order logic with
interesting properties. In particular, they are decidable [Ganzinger and De Nivelle, 1999]. The guarded fragment is a fragment of first-order logic which possesses the nice properties of propositional modal logics.
5.1 The guarded clauses
The guarded fragment of first-order logic (GF) is built up as follows [Ganzinger
and De Nivelle, 1999]:
• ⊤ and ⊥ are in GF.
• If A is an atom, then A is in GF.
• If A ∈ GF, then ¬A ∈ GF.
• If A, B ∈ GF, then A ∨ B ∈ GF, A ∧ B ∈ GF, A → B ∈ GF, A ↔ B ∈ GF.
• If F ∈ GF and G is an atom for which every free variable of F is among the arguments of G, then ∀x(G → F) ∈ GF (or, equivalently, ∀x(¬G ∨ F) ∈ GF) and ∃x(G ∧ F) ∈ GF for every sequence x of variables.
The atom G appearing in the last point is called a guard.
Guarded clauses can also be defined. Their definition is the more relevant
one for the project, since the recognition module focuses only on
clausal classes. Moreover, it has been shown in [Ganzinger and De Nivelle, 1999]
that formulas of the guarded fragment can be translated into guarded clauses by
transformation into clausal normal form.
A clause C is called guarded if it satisfies the following conditions [Ganzinger
and De Nivelle, 1999]:
1. C is simple.
2. Every functional subterm in C contains all the variables of C.
3. If C is non-ground, C has a non-functional negative literal, called a guard,
which contains all the variables of C.
This definition is more restrictive than the definition of weakly guarded clauses
given in the previous chapter.
Let C1 = ¬q(x, y, f(a)) ∨ p(x, y) and C2 = ¬p(x) ∨ p(x, a).
C1 and C2 do not belong to GC but they belong to WGC. Indeed, C1 has a negative literal which contains all the variables of the clause, but this literal
is functional. Moreover, both C1 and C2 contain constants without being ground.
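The three conditions of the definition can be tested directly on C1 and C2. The sketch below is an illustration, not the code added to SPASS: it assumes that constants count as functional terms (the convention used for the second condition in Section 5.3), that "simple" means depth at most 1, and it uses a hypothetical function name `is_guarded`.

```python
# A literal is (positive, predicate, args); a clause is a list of literals.
# Terms: variables are strings, constants are 1-tuples ("a",), and
# f(t1,...,tn) is the tuple ("f", t1, ..., tn).

def variables(term):
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))

def depth(term):
    if isinstance(term, str) or len(term) == 1:
        return 0
    return 1 + max(depth(arg) for arg in term[1:])

def functional_subterms(term):
    """All non-variable subterms; constants are included (assumption)."""
    if isinstance(term, str):
        return []
    found = [term]
    for arg in term[1:]:
        found.extend(functional_subterms(arg))
    return found

def is_guarded(clause):
    terms = [arg for _, _, args in clause for arg in args]
    all_vars = set().union(*(variables(t) for t in terms))
    # Condition 1: the clause is simple (depth at most 1).
    simple = all(depth(t) <= 1 for t in terms)
    # Condition 2: every functional subterm contains all variables of the clause.
    full = all(variables(s) == all_vars
               for t in terms for s in functional_subterms(t))
    # Condition 3: a non-ground clause has a non-functional negative guard.
    guard = not all_vars or any(
        not positive
        and all(isinstance(arg, str) for arg in args)
        and set(args) == all_vars
        for positive, _, args in clause)
    return simple and full and guard

a = ("a",)
C1 = [(False, "q", ("x", "y", ("f", a))), (True, "p", ("x", "y"))]
C2 = [(False, "p", ("x",)), (True, "p", ("x", a))]
C3 = [(False, "q", ("x", "y")), (True, "p", (("f", "x", "y"),))]

print(is_guarded(C1), is_guarded(C2), is_guarded(C3))
```

C3 = ¬q(x, y) ∨ p(f(x, y)) is an additional example, not taken from the text, of a clause that passes all three tests.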
5.2 Decision procedure
The class composed of guarded clauses with equality (the GC class) has been
proved to be decidable [Ganzinger and De Nivelle, 1999, de Nivelle, 1998]. The
superposition calculus, used with an appropriate choice of ordering and selection
function, provides a resolution decision procedure.
It has been shown that this decision procedure can use, as ordering, a lexicographic path ordering based on a precedence ≻ such that f ≻ c ≻ p, where f
denotes a non-constant function symbol, c denotes a constant and p denotes a
predicate symbol [Ganzinger and De Nivelle, 1999]. To be sure of having
an ordering compatible with the decision procedure, this ordering was
implemented.
An appropriate selection function Σ for the GC class is such that
[Dierkes, 2000, Ganzinger and De Nivelle, 1999]:
• If a clause is non-functional and contains a guard, then one of its guards is selected by Σ.
• If a clause contains a functional negative literal, one of these literals is selected.
• If a clause contains a positive functional literal but no negative functional literal, no literal is selected.
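The three cases are mutually exclusive, so the selection function can be written as a short case distinction. The following sketch illustrates one way of doing so; it is not the SPASS selection code. It assumes that a literal is functional as soon as one of its arguments is not a variable (constants included), and the name `select` is hypothetical.

```python
# A literal is (positive, predicate, args); a clause is a list of literals.
# Terms: variables are strings; any tuple (constant or function application)
# is treated as a functional term (assumption).

def is_functional(literal):
    """A literal is functional if one of its arguments is not a variable."""
    _, _, args = literal
    return any(not isinstance(arg, str) for arg in args)

def clause_variables(clause):
    found = set()
    def walk(term):
        if isinstance(term, str):
            found.add(term)
        else:
            for arg in term[1:]:
                walk(arg)
    for _, _, args in clause:
        for arg in args:
            walk(arg)
    return found

def select(clause):
    """Return the selected literal, or None if no literal is selected."""
    negative_functional = [lit for lit in clause
                           if not lit[0] and is_functional(lit)]
    if negative_functional:
        return negative_functional[0]        # second case
    if any(is_functional(lit) for lit in clause):
        return None                          # third case: only positive functional literals
    all_vars = clause_variables(clause)
    guards = [lit for lit in clause
              if not lit[0] and set(lit[2]) == all_vars]
    return guards[0] if guards else None     # first case

c1 = [(False, "q", ("x", "y")), (False, "r", ("x",)), (True, "p", ("x", "y"))]
c2 = [(False, "q", ("x", "y")), (False, "r", (("f", "x", "y"),)), (True, "p", ("x",))]
c3 = [(False, "q", ("x", "y")), (True, "p", (("f", "x", "y"),))]

print(select(c1))
print(select(c2))
print(select(c3))
```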
The inference rules of the superposition calculus are detailed in
Section 5.4.2.
The GC class is decidable and a decision procedure is known. The aim was to
find a way to make SPASS use this information. The idea was
simple: if SPASS can tell that a problem belongs to the GC class, it knows
when to use the decision procedure described above. From this observation, two
main steps for the implementation of GC class resolution take shape:
• Make SPASS able to recognise that a problem belongs to GC.
• Implement an appropriate decision procedure.
5.3 Recognition of memberships of the GC class
For SPASS to handle the GC class, an analysis module able to tell whether a
given problem belongs to the GC class had to be implemented. As noted in
Chapter 3, SPASS already has an analysis module.
However, this module only gives some characteristics of the problem: it tells whether
the problem is a Horn problem, whether it contains function symbols, and some other
information detailed in Chapter 3. An extension of this module was therefore
necessary.
In the project, only clausal classes have been studied. A problem can be said
to belong to a specific class only after an analysis performed on
its clausal normal form. The recognition step therefore had to be implemented after
the transformation of the problem into clausal normal form. The analysis already
made by SPASS is also performed just after this transformation.
Therefore, an appropriate choice was to extend the
analysis function already implemented in SPASS instead of creating a whole new
analysis module.
SPASS analyses a problem with the function ana_AnalyzeProblem, which is
defined in the file analyze.c. This function is based on elements called features.
These features, defined in analyze.h, play the role of internal flags. They are
declared as BOOL (true or false) and indicate whether the given problem satisfies
some specific characteristics. To make the function recognise whether a problem belongs
to the GC class, a new feature called ana_GC was declared.
Once created, the feature ana_GC, which records whether the problem belongs to the GC class, had to be initialised. The initialisation is done
in the ana_AnalyzeProblem function and is the first action performed when the
function is called. The feature could be initialised to true or false. The first
choice was to initialise it to false and to set it to true if all the clauses of the
problem satisfied the three conditions in the definition of a guarded clause.
This solution seemed appropriate, but it required an additional
BOOL variable to remember that all clauses scanned before
the current one satisfied the conditions. The choice was therefore
changed: the feature is initialised to true and set to false at the first clause
encountered which does not satisfy the conditions.
Pseudo-code describing what happens in ana_AnalyzeProblem is provided below (see Algorithm 1).
The key question was then how to implement the conditions
GC_condition1, GC_condition2 and GC_condition3 that a clause has to satisfy
to be a guarded clause.
Algorithm 1 Problem analysis
Initialization
ana_SetFeature(ana_GC, TRUE)
for all clauses C in the problem do
  if !GC_condition1(C) or !GC_condition2(C) or !GC_condition3(C) then
    ana_SetFeature(ana_GC, FALSE)
  end if
end for
For each condition, a function was written. Each function returns a BOOL
whose value indicates whether the condition is satisfied. To keep the code
clear and easy to navigate, a new file called class_conditions.c
was created. This file groups all the functions associated with the membership
conditions of the different classes. It contains, for example, the functions associated
with the three membership conditions of the GC class.
Now that the basic functioning of the recognition module has been briefly explained, the implementation of the conditions that clauses have to satisfy in order
to belong to GC is described in detail.
First condition: the clause has to be simple.
In the first chapter, it has been said that a simple clause is a clause whose literals
have as arguments only variables, constants or functional terms whose
arguments are variables or constants. This definition means that all terms
in a simple clause can be written in one of the following forms:
• a, where a is a constant,
• x, where x is a variable,
• f(t), where f is a function symbol and t is either a constant or a variable.
This definition rules out terms of the form f(g(...)),
where f and g are function symbols. Therefore, an alternative definition of a
simple clause can be given: a simple clause is a clause whose depth is less than or
equal to 1.
The first condition thus amounts to checking whether the depth of the clause is
less than or equal to 1, and GC_condition1 can be defined as follows:
Algorithm 2 GC_condition1
BOOL b = FALSE
if clause_depth(C) ≤ 1 then
  b = TRUE
end if
return b
Second condition: for every functional term t in C, V(t) = V(C), where V(x)
denotes the set of variables of x.
A useful remark follows directly from this condition: if a
guarded clause C contains a constant, then it is ground, i.e. V(C) = ∅.
This remark is easily explained. A functional term is a term
containing either a constant or a function symbol, so a constant a is a functional term. However, V(a) = ∅. If a clause C contains a constant a,
condition 2 requires V(C) to be equal to V(a), that is, V(C) = ∅.
The remaining issue was how to deal with clauses which do not contain any
constants. The idea was to implement functions which could be reused for the
other decidable classes. The first step was to rewrite the condition so that it
resembles the conditions given in the definition of weakly guarded clauses.
In its initial form, the condition is: “Every functional subterm in the clause C
has to contain all the variables of C”. It can be replaced by: in a guarded clause,
every literal L containing functional subterms has to contain all the variables of
the clause, and all functional subterms occurring in L have to contain the same
variables as L (that is to say, as C, since the set of variables of L is equal to the
set of variables of C). The latter amounts to saying that L is covering. When
constants are not taken into account, the condition can then simply be rewritten
as follows: in a guarded clause C, all literals have to be covering and if the
variable depth of a literal L is greater than 0, then L contains all the variables
present in C.
Therefore, a clause C satisfies the second condition, “Every functional subterm
in C has to contain all the variables of C”, if it satisfies the following conditions:
1. If C contains a constant, then it is ground.
2. All literals in C are covering.
3. For all literals L in C, if τv(L) > 0, then the set of variables of L is equal to
the set of variables of C.
The second and third conditions are now very similar to the conditions of
weakly guarded clauses, which should make the recognition of weakly guarded
clauses easier to implement.
The conditions were then implemented as follows:
Algorithm 3 GC_condition2.1
BOOL b = TRUE
if clause_containsConstant(C) and !clause_isGround(C) then
  b = FALSE
end if
return b
Algorithm 4 GC_condition2.2
BOOL b = TRUE
for all literals L in C do
  if !literal_isCovering(L) then
    b = FALSE
  end if
end for
return b
Algorithm 5 GC_condition2.3
BOOL b = TRUE
for all literals L in C do
  if variable_Depth(L) > 0 and V(L) ≠ V(C) then
    b = FALSE
  end if
end for
return b
As can be seen, these functions rely on intermediate functions such as
literal_isCovering or variable_Depth. These intermediate functions had to be implemented before the ones described above. All the intermediate functions have
been placed in a separate file called complementary_func.c.
Third condition: if C is a non-ground clause, C has to contain a non-functional
negative literal L, called a guard, such that V(L) = V(C).
The only thing to verify is the existence of a guard in every non-ground clause.
The idea was to create, for each non-ground clause C, a BOOL b initialised to
false and to scan the literals of C one by one. At the first literal encountered
which is a guard, b is set to true.
All the conditions described above were implemented. Based on these three
conditions, the feature ana_GC can be correctly set. If, on exit from the function
ana_AnalyzeProblem, the feature ana_GC is still equal to true, then the problem
belongs to the GC class; otherwise it does not.
This section has shown how the recognition module was implemented
for the GC class. The key question is now how an adequate decision procedure
can be set up.
5.4 Implementation of the decision procedure
The decision procedure for the GC class is based on superposition
and uses a specific ordering and selection function [Ganzinger and De Nivelle, 1999].
The first step was to find a way to implement this ordering and this selection function. Once they were implemented, the work involved more
research than coding. The aim was to find all the rules that can or need to be
used in order to preserve the soundness and the completeness of the resolution
method. As all the rules implemented in SPASS are sound and a combination
of sound rules is sound, only the completeness of the resolution method had to
be checked.
5.4.1 Ordering / Selection function
The ordering used in the decision procedure for the GC class is a lexicographic
path ordering, which is based on the precedence ≻ satisfying f ≻ c ≻ p for all
non-constant function symbols f , constants c and predicate symbols p.
A lexicographic path ordering $<_{lpo}$ is defined by $s <_{lpo} t$ if [Bachmair and Ganzinger, 2001]:
• $s$ is a variable occurring in $t$ and $t \neq s$, or
• $s = f(s_1, \dots, s_m)$ and $t = g(t_1, \dots, t_n)$ and
  – $\exists j,\ s \leq_{lpo} t_j$, or
  – $f \prec g$ and $\forall i,\ s_i <_{lpo} t$, or
  – $f = g$, $(s_1, \dots, s_m) <^{lex}_{lpo} (t_1, \dots, t_n)$ and $\forall i,\ s_i <_{lpo} t$
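The three cases of this definition translate almost literally into a recursive comparison. The sketch below is a Python illustration and not the SPASS implementation; the term encoding (variables as strings, f(t1, ..., tn) as a tuple) and the representation of the precedence as a dictionary of ranks are assumptions made for the example.

```python
def occurs(var, term):
    """True if the variable var occurs in term."""
    if isinstance(term, str):
        return var == term
    return any(occurs(var, arg) for arg in term[1:])


def lpo_less(s, t, prec):
    """True if s <_lpo t. Variables are strings, f(t1, ..., tn) is the
    tuple ('f', t1, ..., tn); prec maps each symbol to its rank, a
    larger rank meaning a greater symbol in the precedence."""
    if isinstance(t, str):
        return False                      # nothing is below a variable
    if isinstance(s, str):
        return occurs(s, t)               # t is not a variable, so t != s
    f, s_args = s[0], s[1:]
    g, t_args = t[0], t[1:]
    # Case 1: s <=_lpo t_j for some argument t_j of t
    if any(s == tj or lpo_less(s, tj, prec) for tj in t_args):
        return True
    # Case 2: f is smaller than g and every s_i is below t
    if prec[f] < prec[g]:
        return all(lpo_less(si, t, prec) for si in s_args)
    # Case 3: same head symbol, lexicographic comparison of arguments
    if f == g:
        for si, ti in zip(s_args, t_args):
            if si == ti:
                continue
            return (lpo_less(si, ti, prec)
                    and all(lpo_less(sk, t, prec) for sk in s_args))
    return False
```

With a precedence in which g is greater than f, the call lpo_less(("f", "x"), ("g", "x"), prec) falls into the second case, while a term is never smaller than itself.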
This ordering is not implemented as such in SPASS. However, a closer look at this definition, taken from [Bachmair and Ganzinger, 2001], shows that it simply corresponds to the recursive path ordering implemented in SPASS with the status set to left. SPASS manages the status in its own way. The easiest way to implement the lexicographic path ordering was to create a new file into which the functions related to the RPO were copied and from which the actions corresponding to a status other than left were deleted. A new SPASS flag value, flag_ORDLPO, has been created for this ordering. The ordering flag is still initialised to the default ordering KBO, but it is automatically set to LPO if the problem has been recognised as belonging to the GC class.
The precedence also had to be changed from the one selected by default by SPASS. SPASS automatically defines a precedence ≻ such that f ≻ p ≻ c for all non-constant function symbols f, constants c and predicate symbols p. The precedence is set at the beginning of the function ana_AutoConfiguration, located in the file analyze.c. This function sets the configuration in the automatic mode. So the only thing to do was to change this precedence so that it satisfies the desired condition. The pseudo-code explaining what has been done is provided below:
Algorithm 6 Set the precedence
Predicates = ana_CalculatePredicatePrecedence(Predicates, Clauses)
Functions = ana_CalculateFunctionPrecedence(Functions, Clauses)
Constants = list of constants
Add Functions to the precedence list
if feature ana_GC = TRUE then
    Add Constants to the precedence list
    Add Predicates to the precedence list
else
    Add Predicates to the precedence list
    Add Constants to the precedence list
end if
ana_CalculatePredicatePrecedence (respectively ana_CalculateFunctionPrecedence) is a function which sorts the predicates (respectively the functions). As no indication is given for the precedence among the predicates (functions), the choice was to let SPASS order them as usual. However, the precedence among the constants is not calculated; they are simply assumed to be ordered in the order in which they appear in the clause set.
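The effect of Algorithm 6 on the final precedence can be illustrated with a few lines of Python. The function name build_precedence and the list representation (symbols listed from greatest to smallest) are assumptions for this sketch; the three input lists are supposed to be already sorted, as they are by SPASS.

```python
def build_precedence(functions, constants, predicates, is_gc):
    """Return the symbols from greatest to smallest.

    For a GC problem the result satisfies f > c > p; otherwise it
    follows the default pattern f > p > c.
    """
    order = list(functions)
    if is_gc:
        order += list(constants) + list(predicates)
    else:
        order += list(predicates) + list(constants)
    return order


# Hypothetical symbol sets used only to show the two outcomes.
gc_order = build_precedence(["f", "g"], ["a", "b"], ["P", "Q"], True)
default_order = build_precedence(["f", "g"], ["a", "b"], ["P", "Q"], False)
```

Only the relative position of the constant block and the predicate block changes between the two modes; the order inside each block is left untouched.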
In addition to a specific ordering, the resolution method used to decide the GC class requires a special selection function. Defining this selection function was more complicated than setting the right precedence. The first question was where to specify the selection function in the source code of SPASS.
There were two ideas. The first one was to create a new function in which the selection function would be specified, together with a new flag Flag_GCselect which specifies when to use this function. The second idea was to specify what should be selected at the beginning of the function clause_SelectLiteral, which selects the wanted literals. The latter was chosen. It then only remained to translate into code the definition of the selection function Σ described in Section 5.2.
Algorithm 7 Selection function
if the problem belongs to GC then
    Literal L;
    context1 = FALSE;
    context2 = FALSE;
    // context1: the clause contains a guard
    // context2: the clause contains a functional negative literal
    for all literals L1 in C do
        if L1 is a guard then
            context1 = TRUE
            L = L1
        end if
        if L1 is a functional negative literal then
            context2 = TRUE
            L = L1
        end if
    end for
    if (the clause is not functional and context1) or (context2) then
        Select L
    else
        No literal is selected
    end if
end if
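A runnable variant of this selection can be sketched in Python as follows. It is not the code added to clause_SelectLiteral: the term encoding and the helper names are assumptions, and, unlike the pseudo-code, the sketch keeps the guard and the functional negative literal in two separate variables so that a functional negative literal is always preferred when one exists.

```python
def term_vars(term):
    """Variables are strings; f(t1, ..., tn) is ('f', t1, ..., tn)."""
    if isinstance(term, str):
        return {term}
    return set().union(*(term_vars(arg) for arg in term[1:]))


def literal_vars(literal):
    """A literal is a triple (is_positive, predicate, args)."""
    return set().union(*(term_vars(arg) for arg in literal[2]))


def is_functional(literal):
    """True if some argument is a non-constant functional term."""
    return any(isinstance(a, tuple) and len(a) > 1 for a in literal[2])


def select_literal(clause):
    """Return the selected literal of a guarded clause, or None."""
    clause_vars = set().union(*(literal_vars(lit) for lit in clause))
    guard = None             # plays the role of context1
    functional_neg = None    # plays the role of context2
    for lit in clause:
        if lit[0]:
            continue         # only negative literals can be selected
        if is_functional(lit):
            functional_neg = lit
        elif literal_vars(lit) == clause_vars:
            guard = lit
    if functional_neg is not None:
        return functional_neg
    clause_is_functional = any(is_functional(lit) for lit in clause)
    if not clause_is_functional and guard is not None:
        return guard
    return None              # no literal is selected
```

On the clause {P(x, y), ¬P(y, x)} the guard ¬P(y, x) is selected, on {P(x, y), ¬P(y, f(x, y))} the functional negative literal is selected, and nothing is selected in {P(f(x))}.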
5.4.2 Inference rules
The last section has shown how the specific selection function and ordering were implemented. After this step, the aim was to find a way to make SPASS use the correct decision procedure. It is necessary to ensure that an appropriate set of rules is used in order to make the GC class decidable.
It has been shown in [Ganzinger and De Nivelle, 1999] that a resolution decision procedure for guarded clauses with equality can be based on superposition. However, superposition cannot be used alone. In order for the resolution method to be complete, some other rules have to be used. The inference rules of the superposition calculus have been detailed in [Bachmair and Ganzinger, 1998]. Their definitions, as taken from [Bachmair and Ganzinger, 1998], are provided below.
The first inference rules that obviously have to be used are the superposition rules. They can be defined as follows:
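Positive Superposition
\[ \frac{C \lor s \approx t \qquad D \lor u[s'] \approx v}{(C \lor D \lor u[t] \approx v)\sigma} \]
where $\sigma$ is the mgu of $s$ and $s'$ such that:
• $t\sigma \not\succeq s\sigma$ (1)
• $v\sigma \not\succeq u\sigma$ (2)
• $(s \approx t)\sigma$ is strictly maximal with respect to $C\sigma$, and $C$ contains no selected literal (3)
• $(u \approx v)\sigma$ is strictly maximal with respect to $D\sigma$, and $D$ contains no selected literal (4)
• $s'$ is not a variable (5)
• $(s \approx t)\sigma \not\succeq (u \approx v)\sigma$ (6)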
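Negative Superposition
\[ \frac{C \lor s \approx t \qquad D \lor u[s'] \not\approx v}{(C \lor D \lor u[t] \not\approx v)\sigma} \]
where $\sigma$ is the mgu of $s$ and $s'$ such that (1), (2), (3), (5) and:
• $u \not\approx v$ is selected, or else nothing is selected in this premise and $(u \not\approx v)\sigma$ is strictly maximal with respect to $D\sigma$ (4')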
For equality, the following inference rules are needed too.
Reflexivity resolution
\[ \frac{C \lor u \not\approx v}{C\sigma} \]
where $\sigma$ is the mgu of $u$ and $v$ such that (4')
Equality factoring
\[ \frac{C \lor u \approx v \lor s \approx t}{(C \lor v \not\approx t \lor u \approx t)\sigma} \]
where $\sigma$ is the mgu of $s$ and $u$ such that (6) and:
• $(s \approx t)\sigma$ is maximal with respect to $C\sigma$, and $C$ contains no selected literal
Additionally, ordered factoring has to be used. These rules take into account a specific ordering and selection function, which is why these parameters were specified in Section 5.4.1. The set of rules described above is used only when the problem contains equality. For a problem without equality, ordered resolution and ordered factoring are enough. It has been decided to leave ordered resolution always on and not to deactivate it when the problem contains equality. This choice was made because it was noticed during the testing phase that ordered resolution improves the performance of SPASS. As the basic version of SPASS already uses ordered resolution combined with superposition for some problems, the completeness of the resolution method is not affected.
All these rules are already implemented in SPASS, so the aim was just to find the flags that yield the wanted resolution decision procedure. Positive and negative superposition are activated by the flags flag_ISPR (superposition right) and flag_ISPL (superposition left), respectively. Reflexivity resolution is activated by the flag flag_IERR and equality factoring by the flag flag_IEQF. Once this information was known, the function ana_AutoConfiguration had to be changed so that these rules are activated when the problem belongs to the GC class.
Algorithm 8 Resolution decision procedure
if the problem belongs to GC then
    flag_SetFlagIntValue(Flags, flag_IORE, flag_ORDEREDRESOLUTIONNOEQUATIONS)
    flag_SetFlagIntValue(Flags, flag_IOFC, flag_FACTORINGONLYRIGHT)
    flag_SetFlagIntValue(Flags, flag_ORD, flag_ORDLPO)
    if the problem is a problem with equality then
        flag_SetFlagIntValue(Flags, flag_ISPR, flag_SUPERPOSITIONRIGHTON)
        flag_SetFlagIntValue(Flags, flag_ISPL, flag_SUPERPOSITIONLEFTON)
        flag_SetFlagIntValue(Flags, flag_IERR, flag_REFLEXIVITYRESOLUTIONON)
        flag_SetFlagIntValue(Flags, flag_IEQF, flag_EQUALITYFACTORINGON)
    end if
end if
This inference system based on superposition, reflexivity resolution and equality factoring is complete. A proof is provided in [Hsiang and Rusinowitch, 1991]. Moreover, according to [Ganzinger and De Nivelle, 1999], this calculus does not produce any non-guarded clause from guarded clauses. This fact ensures its termination. The resolution method implemented is therefore a decision procedure for the GC class.
5.4.3 Reduction rules
Once the required inference rules were set, the reduction rules were considered. Due to time constraints, the choice was to leave the configuration of the reduction rules as already implemented in SPASS. Nevertheless, the superposition calculus described above is known to be compatible with the usual simplification and redundancy elimination rules such as subsumption and tautology deletion [Ganzinger and De Nivelle, 1999]. Moreover, it is easy to show that trivial literal elimination, which deletes duplicate literals, does not affect the completeness of the calculus. So the only reduction rules which may affect the completeness of the calculus are forward/backward matching replacement resolution and forward/backward rewriting. According to the tests performed, this does not seem to be the case.
The results provided by the calculus set have been described in the last chapter.
5.5 Issues
During the implementation, some issues showed up.
The first issue appears when the user wants to activate specific rules in addition to those activated automatically. If the user does not specify anything and lets SPASS run in its automatic mode, the calculus is sound and complete. But if the user decides to specify their own rules, nothing ensures that the calculus is still complete. For example, if the user sets another ordering, the resulting instance of SPASS may not be complete. As the user should not be constrained, it was not advisable to prevent them from adding options. It was therefore decided to print a warning message when the user specifies their own rules, as a reminder to be careful about the completeness of the chosen calculus.
The second issue concerns problems which belong to several classes. The question is which resolution calculus should be used. Suppose that the problem belongs to GC and to PVD. Should the superposition calculus or the hyperresolution calculus be used? To answer this question, it was decided to test each resolution decision procedure separately on a certain number of problems. The decision procedure providing the best results is activated in priority.
Chapter 6
Other decidable classes
In this chapter, the implementation of the other decidable classes is detailed. It is divided into two main parts. The first part deals with recognising membership of a decidable class, and the second part concerns the implementation of the different decision procedures.
6.1 Recognition part
The process of recognizing membership of the eight classes described in Chapter 4 (E1, E+, S+, WGC, BS∗, BSH∗, PVD, OCCI) is approximately the same as the process used for the GC class discussed in the previous chapter.
A new internal flag was created for each class. They were declared in analyze.h:
• ana_BS for the BS∗ class
• ana_BSH for the BSH∗ class
• ana_EPS1 for the E1 class
• ana_EPS2 for the E+ class
• ana_S for the S+ class
• ana_PVD for the PVD class
• ana_OCCI for the OCCI class
• ana_WGC for the WGC class
These internal flags are set in the function ana_AnalyzeProblem. Their settings are based on the definitions given in Chapter 4. All these internal flags except ana_BSH are initialised to TRUE and are set to FALSE at the first clause encountered which does not satisfy the required conditions. The case of the BSH∗ class is slightly different because it is necessary to check first whether the problem belongs to BS∗. The flag ana_BSH is initialised to FALSE and is set to TRUE only once the problem is shown to belong to BS∗ and to contain only Horn clauses.
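The flag handling described above can be sketched as follows. This Python fragment only mirrors the control flow; the function names are hypothetical and the BS check is deliberately simplified to "no clause contains a non-constant function symbol" (depth 0), which is an assumption standing in for the full definition of Chapter 4.

```python
def is_horn(clause):
    """A clause is a list of (is_positive, predicate, args) literals."""
    return sum(1 for positive, _, _ in clause if positive) <= 1


def has_depth_zero(clause):
    """Simplified BS check: no argument is a non-constant functional term."""
    return all(not (isinstance(arg, tuple) and len(arg) > 1)
               for _, _, args in clause for arg in args)


def analyze(clauses, checks):
    """checks maps a class name to a per-clause test function."""
    flags = {name: True for name in checks}   # optimistic initialisation
    flags["BSH"] = False                      # the only flag starting at FALSE
    for clause in clauses:
        for name, check in checks.items():
            if flags[name] and not check(clause):
                flags[name] = False           # first violating clause decides
    # BSH is derived afterwards: BS membership plus Horn clauses only
    if flags.get("BS", False) and all(is_horn(c) for c in clauses):
        flags["BSH"] = True
    return flags
```

Calling analyze(clauses, {"BS": has_depth_zero}) on the clause set {P(a, b)}, {P(x, y), ¬P(y, x)} sets both flags, while adding a clause with a functional term clears both.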
For each class definition, a function was implemented. For the same reasons as for the GC class, these functions were declared in the dedicated file class_conditions.c and the required intermediate functions were put in the complementary_func.c file. All these functions make it possible to set the different flags correctly. Once the internal flags are set, it is possible to know whether a problem is a member of one of the studied decidable classes. The names of the classes to which the problem belongs are printed when SPASS is launched.
For these eight classes, a decision procedure is known. In the next section, it is
discussed how the adequate decision procedures were set for each of these classes.
6.2 Implementation of the decision procedures
6.2.1 Classes decidable by hyperresolution
Three of these classes have been shown to be decidable by hyperresolution
calculus based on positive, restricted factoring. A resolvent of two clauses C and
D, where C is positive, is under positive, restricted factoring if factoring is only
applied to C [Noll, 1980, Fermüller and Leitsch, 1993].
This calculus does not require many inference rules. In fact, in addition to the hyperresolution rule, only rules able to play the role of positive, restricted factoring are needed.
Positive, restricted factoring can easily be obtained with the rules present in SPASS. It only needs the activation of the condensation rule and of the ordered factoring rule in the mode where only factoring inferences with positive literals are generated. If the ordered factoring rule is run in its other mode, negative literals are considered for inferences as well and the factoring is then no longer restricted.
Therefore, this resolution method is based only on the ordered factoring, hyperresolution and condensation rules. The ordered factoring rule is activated by the flag IOFC, the condensation rule by the flag RCon and the hyperresolution rule by the flag IOHy.
This calculus has been proved to be complete in [Noll, 1980]. Moreover, it is based only on rules already implemented in SPASS. Therefore, as SPASS only contains sound rules, this calculus is sound. To reach results in a reasonable time, it has been decided to keep the default settings for the reduction rules. These rules are sound and, consequently, the soundness of the calculus is preserved. However, they may affect the completeness of the calculus. Subsumption, tautology deletion and trivial literal elimination have been shown to be compatible with this calculus in [Noll, 1980, Fermüller and Leitsch, 1993]. The only reduction rules which may cause problems are forward/backward matching replacement resolution and forward/backward rewriting. According to the tests performed, this does not seem to be the case.
The rules forming this calculus are summarised in Table 6.1.
Inference rules            Reduction rules
-------------------------  ------------------------------------------------
Ordered hyperresolution    Trivial literal elimination
Factoring only right       Subsumption deletion
Condensation rule          Tautology deletion
                           Forward/backward matching replacement resolution
                           Forward/backward rewriting
Table 6.1: Hyperresolution calculus
During the testing phase, it was noticed that the splitting rule improves the performance of the resolution method for the classes PVD, BSH∗ and OCCI. For some problems such as GEO169+1, GRA015+1, MSC007-1.008 and SYN431-1, the results are reached faster when the splitting rule is activated. The question is whether it is possible to activate the splitting rule without breaking the completeness of the decision procedure. To answer this question, each class has to be studied separately.
As seen in Chapter 2, the splitting rule is defined as follows:
\[ \frac{N \cup \{C \lor D\}}{N \cup \{C\} \mid N \cup \{D\}} \]
where N is a set of clauses and C and D are two variable-disjoint clauses.
The aim is to show that if an initial clause belongs to the class X, the clauses
inferred by the splitting rule applied to this clause still belong to the class X.
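The side condition of the rule, namely that C and D share no variable, can be made concrete with a small Python sketch. It groups the literals of a clause into maximal variable-disjoint parts; a clause can be split exactly when more than one part is returned. The encoding and the function names are assumptions for this illustration, and ground literals are simply treated as parts of their own.

```python
def term_vars(term):
    """Variables are strings; f(t1, ..., tn) is ('f', t1, ..., tn)."""
    if isinstance(term, str):
        return {term}
    return set().union(*(term_vars(arg) for arg in term[1:]))


def literal_vars(literal):
    """A literal is a triple (is_positive, predicate, args)."""
    return set().union(*(term_vars(arg) for arg in literal[2]))


def split_components(clause):
    """Partition the literals of a clause into variable-disjoint parts."""
    components = []                       # list of (variables, literals)
    for lit in clause:
        lit_vars = literal_vars(lit)
        merged_vars, merged_lits = set(lit_vars), [lit]
        untouched = []
        for comp_vars, comp_lits in components:
            if comp_vars & lit_vars:      # shares a variable with lit
                merged_vars |= comp_vars
                merged_lits = comp_lits + merged_lits
            else:
                untouched.append((comp_vars, comp_lits))
        components = untouched + [(merged_vars, merged_lits)]
    return [lits for _, lits in components]
```

For P(x, y) ∨ Q(z) ∨ R(y) the sketch returns the two parts {Q(z)} and {P(x, y), R(y)}, whereas P(x, y) ∨ Q(y, z) cannot be split.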
The BSH∗ class
Consider a clause set X = B ∪ {C ∨ D} belonging to the BSH∗ class.
X has a depth equal to 0 and does not contain any non-Horn clauses. As B ∪ {C ∨ D} has a depth equal to 0, all the literals in B, in C and in D have a depth equal to 0. This implies that the depths of B ∪ {C} and of B ∪ {D} are equal to 0. And as B ∪ {C ∨ D} does not contain any non-Horn clauses, B does not contain any non-Horn clauses and neither C nor D is a non-Horn clause. Therefore, B ∪ {C} and B ∪ {D} do not contain any non-Horn clauses.
It can be deduced from these observations that the clause sets produced by the splitting rule belong to the BSH∗ class.
The PVD class
Now, consider a clause set X = B ∪ {C ∨ D} belonging to the PVD class.
Referring to the definition given in Section 4.2.1, a clause set S belongs to PVD if for each clause Ci ∈ S, the set of variables of {Ci+} is included in the set of variables of {Ci−} and τmax(x, Ci+) ≤ τmax(x, Ci−) for all x ∈ V(Ci+) [Fermüller et al., 1993b].
B ∪ {C ∨ D} satisfies this condition. It can then be deduced that the clause set B satisfies this condition. A problem could only come from the fact that {C ∨ D} is split into two clauses, C and D. But as the clause cannot be split if C and D have common variables, there is no real problem. The clause is split only if C and D are variable-disjoint and, in this case, if the condition is satisfied for {C ∨ D}, then it is satisfied for C and for D.
Therefore, the clause sets produced by the splitting rule belong to the PVD class.
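The PVD condition is checked variable by variable, which is why variable-disjoint parts inherit it. A minimal Python sketch of the per-clause test is given below. The encoding, the function names and the depth convention (a variable that is a direct argument of a predicate has depth 0, and each enclosing function symbol adds 1) are assumptions made for the example.

```python
def var_depths(term, depth, out):
    """Record the depth of every variable occurrence in term.
    Variables are strings; f(t1, ..., tn) is ('f', t1, ..., tn)."""
    if isinstance(term, str):
        out.setdefault(term, []).append(depth)
    else:
        for arg in term[1:]:
            var_depths(arg, depth + 1, out)


def tau_max(literals):
    """Map each variable to its maximal occurrence depth in the literals."""
    out = {}
    for _, _, args in literals:
        for arg in args:
            var_depths(arg, 0, out)
    return {var: max(depths) for var, depths in out.items()}


def is_pvd_clause(clause):
    """V(C+) is included in V(C-) and tau_max(x, C+) <= tau_max(x, C-)."""
    positive = [lit for lit in clause if lit[0]]
    negative = [lit for lit in clause if not lit[0]]
    deep_pos, deep_neg = tau_max(positive), tau_max(negative)
    return all(x in deep_neg and deep_pos[x] <= deep_neg[x]
               for x in deep_pos)


# P(x) v ~Q(x) v R(y) v ~S(y) and its two variable-disjoint parts
whole = [(True, "P", ("x",)), (False, "Q", ("x",)),
         (True, "R", ("y",)), (False, "S", ("y",))]
left, right = whole[:2], whole[2:]
```

Here the whole clause and both parts satisfy the test, in line with the argument above.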
The OCCI class
Finally, consider a clause set X = B ∪ {C ∨ D} belonging to the OCCI class.
Referring to Section 4.2.1, a clause set S belongs to OCCI if for all Ci ∈ S, the number of occurrences in Ci+ is equal to 1 for all variables occurring in Ci+ and τmax(x, Ci+) ≤ τmin(x, Ci−) for all x ∈ V(Ci+) ∩ V(Ci−) [Fermüller et al., 2001a].
By applying a reasoning similar to the previous one, it can be deduced that the clause sets produced by the splitting rule belong to the OCCI class.
6.2.2 Classes decidable by ordered resolution
Concerning the classes decidable by ordered resolution, the choice was to focus on the classes E1 and E+. These classes, considered as extensions of the Ackermann class, are relevant for the project due to their nice properties [Fermüller et al., 1993c].
Let R1 be an ordering defined such that for all atoms A and B, B >R1 A if:
- τ(B) > τ(A) and
- τmax(x, B) > τmax(x, A) for all x ∈ V(A).
The irreflexivity and the transitivity of < imply the irreflexivity and the transitivity of R1. It can be shown that for all atoms A, B and all substitutions θ: A <R1 B implies Aθ <R1 Bθ [Fermüller et al., 1993c]. R1 is therefore an A-ordering.
The ordered resolution based on this A-ordering and combined with the splitting rule provides a decision procedure for the classes E1 and E+. This calculus is sound given that it is based only on sound rules, and it has been proved to be complete in [Fermüller et al., 1993c].
As for the classes discussed earlier, the choice was to keep the default mode for the reduction rules. The use of these rules considerably reduces the time spent to reach a result, and the risk that they affect the completeness of the calculus is quite low. In fact, the use of the usual deletion rules such as subsumption, tautology deletion and trivial literal elimination preserves the soundness and completeness of the calculus [Fermüller et al., 1993c]. Once again, the only rules for which the preservation of completeness is not guaranteed are forward/backward matching replacement resolution and forward/backward rewriting.
Table 6.2 summarises the calculus used for the classes E1 and E+.
Inference rules            Reduction rules
-------------------------  ------------------------------------------------
Ordered resolution         Trivial literal elimination
Factoring only right       Subsumption deletion
Splitting rule             Tautology deletion
                           Forward/backward matching replacement resolution
                           Forward/backward rewriting
Table 6.2: Ordered resolution calculus
The issue concerning this decision procedure was how to implement the ordering R1 in SPASS. After much thought, it was decided to implement the ordering at the beginning of the function ord_LiteralCompare, which compares literals. The idea was to compare the literals with the ordering R1 when the problem is declared to belong to the classes E1 or E+, and to return a result before the function uses the Knuth-Bendix ordering or the recursive path ordering with status.
Algorithm 9 Comparison of literals
if the problem belongs to E1 or E+ then
    Compare the literals according to R1
else
    Use the ordering specified by the user or the default one if nothing is specified (kbo, rpos or lpo)
end if
In the case where the problem belongs to the classes E1 or E+ and the ordering R1 does not specify any order between the two literals given as parameters, the function returns ord_UNCOMPARABLE. Consequently, this ordering is not total. The proof of the completeness of ordered resolution based on an A-ordering does not depend on whether the ordering is total. However, the fact that this ordering is not total could affect the proper functioning of SPASS. Indeed, it is hard to know what SPASS does when the result returned is ord_UNCOMPARABLE. The test phase has not shown any failure: all the results obtained are correct. The fact that the ordering is not total therefore does not seem to cause real problems.
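To make the partiality of R1 concrete, the comparison can be sketched in Python as below. This is not the code placed in ord_LiteralCompare: the atom encoding, the function names and the string results are assumptions for the example. The sketch also assumes that a variable of A which does not occur in B makes the second condition fail.

```python
def term_depth(term):
    """Variables (strings) and constants ('a',) have depth 0;
    f(t1, ..., tn) has depth 1 + the maximal depth of its arguments."""
    if isinstance(term, str) or len(term) == 1:
        return 0
    return 1 + max(term_depth(arg) for arg in term[1:])


def atom_depth(atom):
    """An atom is a pair (predicate, args)."""
    return max((term_depth(arg) for arg in atom[1]), default=0)


def tau_max(atom):
    """Map each variable of the atom to its maximal occurrence depth."""
    depths = {}

    def walk(term, depth):
        if isinstance(term, str):
            depths[term] = max(depth, depths.get(term, 0))
        else:
            for arg in term[1:]:
                walk(arg, depth + 1)

    for arg in atom[1]:
        walk(arg, 0)
    return depths


def greater_r1(b, a):
    """True if b >_R1 a according to the two conditions of R1."""
    if atom_depth(b) <= atom_depth(a):
        return False
    deep_b, deep_a = tau_max(b), tau_max(a)
    return all(x in deep_b and deep_b[x] > deep_a[x] for x in deep_a)


def compare_r1(a, b):
    """Three-way comparison that may fail to order the two atoms."""
    if a == b:
        return "EQUAL"
    if greater_r1(a, b):
        return "GREATER"
    if greater_r1(b, a):
        return "SMALLER"
    return "UNCOMPARABLE"
```

For example, P(x) is smaller than P(f(x)), while P(x) and Q(y) have the same depth and are therefore left uncomparable.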
After having finished the implementation, many tests were required to evaluate
the work done. The next chapter deals with the testing part.
Chapter 7
Tests and results
A large number of tests were run to evaluate the program. Two kinds of tests were used:
- tests written by hand to try different situations, and
- tests coming from an external source to make the evaluation as accurate as possible.
7.1 The TPTP library
The TPTP library is a very good source of first-order logic problems. It is "a library of test problems for automated theorem proving systems" [Sutcliffe, 2009]. This library was created to gather, in an electronic format, the relevant first-order problems that have been studied over the years. It is regularly updated and contains the most important problems, so there is no need to look elsewhere. There is a large variety of problems, ranging from very simple to very difficult. The number of problems available in this library is so large that it is sufficient for significant testing [Sutcliffe and Suttner, 1998]. The problems are classified in a specific way. Figure 7.1, taken from [Sutcliffe and Suttner, 1998], explains how the problems are classified.
In this project, it was decided to classify the problems by the classes to which they belong instead of following the structure of Figure 7.1.
Figure 7.1: Structure of the TPTP library
In order to be useful for the project, the problems had to be translated into the dfg format, which is the only format that SPASS accepts. This translation was made possible by the tptp2X tool, which converts a problem from the TPTP format to a format of our choice, in particular the dfg format.
7.2 Evaluation of the analysis module
The analysis module was tried on many examples. In this section, some examples of tests performed are described.
The example chosen to study the analysis module is presented in the description of SPASS, section 3.2. The problem corresponds to the clause set S =
{C1 , C2 , C3 , C4 }, where:
C1 = {P(a, b)}
C2 = {P(x, y), ¬P(y, f(x, y))}
C3 = {Q(x, y, z), ¬Q(g(z, x, y), x, g(g(y, z, x), x, z))}
C4 = {¬Q(a, b, c)}
This problem was carefully chosen to belong to a large number of classes. It satisfies the conditions to belong to the E1, E+ and S+ classes. All the literals are covering: in C1 and C4 the literals contain only constants, and in C2 and C3 the literals contain either variables or functional terms which contain all the variables occurring in the literal in question. Moreover, all the literals of a given clause have exactly the variables occurring in that clause ({x, y} for the literals of C2 and {x, y, z} for the literals of C3). The problem satisfies the conditions to belong to the PVD and OCCI classes, too. C1 and C4 do not contain any variables. The set of variables of C2+ is equal to the set of variables of C2− (and likewise for C3+ and C3−). So all these clauses satisfy the condition that the set of variables of {Ci+} is included in the set of variables of {Ci−}. Furthermore, τmax(i, C2+) ≤ τmax(i, C2−) for i ∈ {x, y} and τmax(i, C3+) ≤ τmin(i, C3−) for i ∈ {x, y, z}. However, the problem does not belong to the BS∗ or BSH∗ class: the depth of C2 is equal to 1 and the depth of C3 is equal to 2. Nor does it belong to WGC: C2 and C3 have no negative literal L with τv(L) = 0.
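The depths quoted above can be recomputed mechanically. The following Python sketch encodes the four clauses and evaluates their depth; the encoding and the function names are assumptions made for this check, with variables and constants at depth 0 and each function symbol adding 1.

```python
def term_depth(term):
    """Variables (strings) and constants ('a',) have depth 0."""
    if isinstance(term, str) or len(term) == 1:
        return 0
    return 1 + max(term_depth(arg) for arg in term[1:])


def clause_depth(clause):
    """Maximal term depth over all arguments of all literals."""
    return max((term_depth(arg) for _, _, args in clause for arg in args),
               default=0)


a, b, c = ("a",), ("b",), ("c",)
C1 = [(True, "P", (a, b))]
C2 = [(True, "P", ("x", "y")),
      (False, "P", ("y", ("f", "x", "y")))]
C3 = [(True, "Q", ("x", "y", "z")),
      (False, "Q", (("g", "z", "x", "y"), "x",
                    ("g", ("g", "y", "z", "x"), "x", "z")))]
C4 = [(False, "Q", (a, b, c))]

depths = [clause_depth(clause) for clause in (C1, C2, C3, C4)]
```

The list depths contains 0, 1, 2 and 0, so C2 and C3 are the clauses that prevent membership of BS∗ and BSH∗.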
Now the expected result is known, it can be compared to the output which has
been provided by the program. The output is presented in the figure 7.2. It can
be seen from this figure that the results correspond to the analysis done by hand.
-----------------------SPASS-START-------------------------
Input Problem:
4[0:Inp] || Q(a,b,c) -> .
3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U)) -> Q(V,W,U).
2[0:Inp] || P(U,f(V,U)) -> P(V,U).
1[0:Inp] || -> P(a,b).
This is a first-order Horn problem without equality.
This problem belongs to the EPS1 Class.
This problem belongs to the EPS+ Class.
This problem belongs to the S+ Class.
This problem belongs to the PVD Class.
This problem belongs to the OCCI Class.
Axiom clauses: 4 Conjecture clauses: 0
...
Figure 7.2: Analysis of the first test
After this test, the problem was changed so that it belongs to fewer classes. For that purpose, the second clause was transformed to:
C2 = {P(f(f(x, y), y), y), ¬P(y, f(x, y))}
In this context, the inequality τmax(x, C2−) < τmax(x, C2+) holds, so the depth condition τmax(x, C2+) ≤ τmax(x, C2−) fails. Consequently, S no longer belongs to the PVD and OCCI classes. The results of this test are provided in Figure 7.3 and they are as expected.
To test the three remaining classes E1, E+ and S+, the third clause was changed to:
C3 = {Q(x, x, z), ¬Q(g(z, x, y), x, g(g(y, z, x), x, z))}.
In this way, the first literal no longer has the same variables as the entire clause, or even as the other literal of the clause. This implies that the problem no longer belongs to S+, E1 and E+. The output can be seen in Figure 7.4.
-----------------------SPASS-START-------------------------
Input Problem:
4[0:Inp] || Q(a,b,c) -> .
3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U)) -> Q(V,W,U).
2[0:Inp] || P(U,f(V,U)) -> P(f(f(V,U),U),U).
1[0:Inp] || -> P(a,b).
This is a first-order Horn problem without equality.
This problem belongs to the EPS1 Class.
This problem belongs to the EPS+ Class.
This problem belongs to the S+ Class.
Axiom clauses: 4 Conjecture clauses: 0
..
Figure 7.3: Analysis of the second test
----------------------SPASS-START-------------------------
Input Problem:
4[0:Inp] || Q(a,b,c) -> .
3[0:Inp] || Q(g(U,V,W),V,g(g(W,U,V),V,U)) -> Q(V,V,U).
2[0:Inp] || P(U,f(V,U)) -> P(f(f(V,U),U),U).
1[0:Inp] || -> P(a,b).
This is a first-order Horn problem without equality.
Axiom clauses: 4 Conjecture clauses: 0
...
Figure 7.4: Analysis of the third test
All these outputs are those expected. Some other tests were then performed.
A last simple test example is provided below. This test was run to check whether the program can tell when a problem belongs to the BS∗, BSH∗ or WGC classes.
Let S = {C1 , C2 } with:
C1 = {P(a, b)}
C2 = {P(x, y),¬P(y, x)}
This problem is really simple and verifies the conditions of all the implemented
classes.
The program has produced the following output:
-----------------------SPASS-START-------------------------
Input Problem:
2[0:Inp] || P(U,V) -> P(V,U).
1[0:Inp] || -> P(a,b).
This is a first-order Horn problem without equality.
This is a problem that has, if any, a finite domain model.
There are no function symbols.
This problem belongs to the BS* Class.
This problem belongs to the BSH* Class.
This problem belongs to the EPS1 Class.
This problem belongs to the EPS+ Class.
This problem belongs to the S+ Class.
This problem belongs to the PVD Class.
This problem belongs to the OCCI Class.
This problem belongs to the WGC Class.
Axiom clauses: 2 Conjecture clauses: 0
...
Figure 7.5: Analysis of the fourth test
The program has declared that the problem belongs to the BS∗, BSH∗ and WGC classes and to all the other classes, which is the expected result.
In conclusion, all the results given in this section correspond to the expected ones. Several additional tests were done and were all found correct. However, no conclusion could be drawn from this battery of tests alone. More tests were necessary to evaluate the work done correctly.
7.3 Classification of the TPTP library
The TPTP library was used in order to provide an accurate evaluation of the work done. The program was run on almost all the problems included in the TPTP library. This made it possible to classify the problems of the TPTP library according to the classes the program declared they belong to. 18% of the problems could not be translated into the dfg format because they use functions not supported by the tptp2X tool. And 3% of the problems could not be classified because they are too long and take too much time to give a result. All the other problems have been classified. The results are gathered in Table 7.1. This table presents how many problems of each domain belong to the different classes. In total, 1301 problems belong to the BS class, 461 problems belong to the BSH class, 995 problems belong to the E+ class, 939 problems belong to the E1 class, 380 problems belong to the WGC class, 159 problems belong to the GC class, 611 problems belong to the PVD class, 580 problems belong to the OCCI class, 2675 problems belong to the S+ class and 10652 problems do not belong to any class. As some classes are included in others, for example GC is included in WGC and BSH is included in BS, only 3267 problems belong to at least one decidable class. It is not a large number, especially compared to the number of problems which do not belong to any class, but it is already enough to evaluate the program. Too many problems would have taken too much time to be tested.
As can be seen in Table 7.1, the results differ completely depending on the domain. Some domains such as SYN have really nice properties and are really interesting for the project. 763 problems of the SYN domain belong to BS and 457 of them belong to E1. But many other domains are less interesting and contain only problems which do not belong to any class. The two classifications appear to be related to some extent. Problems of the domains COL and GRP tend to belong to the S+ class, problems of the SYN domain tend to belong to the BS class and problems of the PUZ domain are evenly distributed among the different decidable classes.
The classification made by the program was checked for some problems. It
would be impossible to check all the problems one by one: nothing indicates
which classes a problem belongs to, so the verification has to be done by hand. For
all the problems checked, the results were as expected. This new classification
simplified the evaluation of the different resolution
methods.
Domain    BS   BSH    E+    E1   WGC    GC   PVD  OCCI    S+  No class
AGT        0     0     0     0     0     0     0     0     0        52
ALG        0     0   179   179   177     9   179   177   221       211
ANA        0     0     3     3     0     0     0     1    10        81
ARI        1     0     1     1     0     0     0     0     1         0
BOO        0     0     4     4     0     0     0     0    60        21
CAT        0     0     0     0     0     0     0     0     0       130
COL        0     0     8     0     0     0     3     3   155        73
COM        0     0     3     3     0     0     2     2     3        39
CSR       25    25     0     0     0     0    25     0     0       163
DAT        0     0     0     0     0     0     0     0     0         0
FLD        0     0     0     0     0     0     0     0     0       279
GEG        0     0     0     0     0     0     0     0     0         1
GEO        2     1     0     0     0     0     2     0     0       587
GRA       15     0     1     1     1     1    15     1     1        18
GRP      101     1   155   125     0     0    44     0   509       443
HAL        0     0     0     0     0     0     0     0     0         9
HEN        0     0     8     8     0     0     0     0    29        38
HWC        0     0     2     2     1     1     1     1     2         3
HWV       53     0     1     1     9     2     3     1    36        68
KLE        0     0     0     0     0     0     0     0    72       157
KRS       49     1    42    42     1     1     1     1    65       203
LAT        3     3    16    15     6     5     6     6   169       384
LCL       19    14    16    16    13     9    13    13   577       337
LDA        0     0     0     0     0     0     0     0    29        20
MED        0     0     0     0     0     0     0     0     0        10
MGT       12     0     0     0     2     0     6     0     2       144
MSC       16    12    10    10    10     1    11    12    12        12
NLP       58    18     0     0     0     0    58    58     0       458
NUM        1     0    11    11    13     1    11    11    19       885
PLA       13    13     1     1     0     0    13     0     2        36
PRO        0     0     0     0     0     0     0     0     0        72
PUZ       56    21    42    42    35    30    46    45    52        64
REL        0     0     0     0     0     0     0     0     0       218
RNG        0     0     0     0     0     0     0     0    64       197
ROB        0     0     0     0     0     0     0     0    31        14
SCT        0     0     2     2     0     0     0     2     2       184
SET        2     0     6     5     1     0    11    12    18      1238
SEU        3     2     3     3     2     1     1     1    13       807
SEV        0     0     0     0     0     0     0     0     0         0
SWB       32    32     6     6     6     1     7     8     7       170
SWC        0     2     2     2     0     0     0     0    12       846
SWV       77     0    16     6     0     0    28    29    19      1192
SWW        0     0     0     0     0     0     1     1     1       362
SYN      763   316   457   451   103    97   124   195   482       324
SYO        0     0     0     0     0     0     0     0     0         0
TOP        0     0     0     0     0     0     0     0     0       102
Total   1301   461   995   939   380   159   611   580  2675     10652

Table 7.1: Classification of the TPTP library
7.4 Evaluation of the resolution methods
Once the problems of the TPTP library had been listed by class, the decision
procedures of the different decidable classes were tried one by one, and each implemented decision procedure was evaluated. These evaluations helped to choose
which decision procedure should be used first when a problem belongs to
several classes. The program was run for 150 seconds per problem. With a
higher time limit, running all the tests would have taken too long.
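A per-problem time limit of this kind can be sketched in Python as follows. This is only an illustration of the idea, not the test harness used for the evaluation: the function name is hypothetical, and the commands below are stand-ins that call the Python interpreter so that the sketch runs without a prover installed.

```python
import subprocess
import sys


def run_with_time_limit(command, seconds=150):
    """Run one command; return its standard output, or None if the limit is hit."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        # The child process is killed and the problem counts as "no result".
        return None
    return completed.stdout


# Stand-in commands (assumptions): a real run would invoke the prover instead.
fast = [sys.executable, "-c", "print('done')"]
slow = [sys.executable, "-c", "import time; time.sleep(5)"]

fast_result = run_with_time_limit(fast, seconds=30)
slow_result = run_with_time_limit(slow, seconds=1)
```

A problem that exceeds the limit is simply recorded as giving no result, which matches the way the counts in this section are reported.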
The decision procedure of the GC class was tried first. No deterioration was
observed, but no improvement either. However, all
the clauses inferred with the new resolution method belong to the GC class, which
was not the case with the initial method. 154 out of 159 problems belonging to
the GC class provide a result in less than 150 seconds. The results (satisfiable or
unsatisfiable) are the same as those provided by the basic version.
After that, the decision procedures of the PVD, OCCI and BSH classes, which
are based on the hyperresolution calculus, were tried. Several improvements were
observed: 26 problems now give a result whereas they gave none with the
basic version. No deterioration was observed. 328 out of 351 problems belonging
to the PVD class now provide a result in less than 150 seconds, whereas the
basic version can only provide a result for 303 of them. 304 out of 319 problems
belonging to the OCCI class and 410 out of 436 problems belonging to the BSH
class also provide a result with this resolution method. The results were checked
and are as expected.
Finally, the decision procedure of the E+ and E1 classes was tried. 431 out of 579
problems belonging to the E+ and E1 classes provide a result, instead of 426 with the basic
version. This is not a large improvement, but these results
were obtained within 150 seconds, and a higher time limit could give better results. However,
two problems which were solved by the basic version do not provide a result
with the new version.
As for the GC class, it was verified that these two procedures produce only
clauses which belong to the class to which the problem initially belongs. This
test is a good way to check whether the resolution methods used really are decision
procedures for these classes.
The different improvements and deteriorations are summarised in Table 7.2
(which reports the results when the program was run for 150 seconds
per problem). The problems presented in this table are those which give a result
for only one version.
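Listing the problems that give a result for only one version amounts to taking two set differences. A minimal Python sketch, with invented problem names standing in for the real result sets:

```python
def compare_versions(solved_new, solved_basic):
    """Return (improvements, deteriorations) as sorted lists of problem names."""
    improvements = sorted(solved_new - solved_basic)    # result only with the new version
    deteriorations = sorted(solved_basic - solved_new)  # result only with the basic version
    return improvements, deteriorations


# Illustrative sets only; these file names are hypothetical.
solved_new = {"A.dfg", "B.dfg", "C.dfg"}
solved_basic = {"B.dfg", "C.dfg", "D.dfg"}

improvements, deteriorations = compare_versions(solved_new, solved_basic)
```

Problems solved by both versions, or by neither, do not appear in either list, which is why such a table is much shorter than the set of problems tested.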
Only SYN446+1.dfg is a real deterioration. SYN302-1.003.dfg also belongs
to the PVD class, and a result is found if the decision procedure related to the PVD class is used.
The issue was therefore to decide which decision procedure to use
when a problem belongs to several decidable classes. Based on the results of the
tests performed, it was decided to give priority to the decision procedure for the
GC class, then to the hyperresolution calculus, and finally to the resolution method for
the E+ and E1 classes. This choice was made to optimise the performance of the prover.
With this choice, the new version of SPASS can provide a result for 31 more problems
than the basic version on this battery of tests. If the resolution method for
the E+ and E1 classes had priority over the method for the PVD class, one of
the 31 problems now solved by the new version would not give any result.
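The priority rule can be pictured as a scan over an ordered list of procedures. The Python sketch below is an illustration under stated assumptions, not the code added to SPASS: the procedure labels are hypothetical, and the last group is written with the class names E+ and E1.

```python
# Priority order reported in the text: GC first, then the hyperresolution-based
# procedures, then the resolution method for the last group of classes.
# The labels on the left are hypothetical names chosen for this sketch.
PRIORITY = [
    ("gc-procedure", {"GC"}),
    ("hyperresolution", {"PVD", "OCCI", "BSH"}),
    ("e-class-procedure", {"E+", "E1"}),
]


def choose_procedure(problem_classes):
    """Return the first procedure whose classes intersect the problem's classes."""
    for label, classes in PRIORITY:
        if classes & problem_classes:
            return label
    return None  # no implemented decision procedure applies


both = choose_procedure({"PVD", "E1"})   # PVD takes priority over the last group
gc_first = choose_procedure({"GC", "PVD"})
none = choose_procedure({"S+"})          # S+ has no implemented procedure
```

With this ordering, a problem that belongs to both PVD and the last group is handled by hyperresolution, which is the case the text describes for SYN302-1.003.dfg.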
This chapter has shown that several improvements and almost no deterioration
have been observed. However, the results are not as good as could have been hoped,
since many of the problems studied do not belong to any decidable class that has been
implemented.
After the tests had been performed, it was noticed that there was no way to choose
the desired resolution method. Consequently, a new flag was
created for each decision procedure implemented (GC, PVD, OCCI, BSH, EPS). These
methods can then be activated by a simple option. For example, to activate the
decision procedure related to the GC class, the user just has to specify the option
-GC=1 on the command line. With these new flags, the resolution methods studied in
this dissertation can also be used for problems which do not belong to any class.
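As a small illustration, the option could be assembled by a script as sketched below in Python. Only -GC=1 is spelled out in the text; that the other flags take the same -NAME=1 form, the executable name SPASS, and the position of the input file at the end of the command are assumptions made for this sketch.

```python
# Flag names as listed in the text.
FLAGS = ("GC", "PVD", "OCCI", "BSH", "EPS")


def spass_command(problem_file, procedure=None, executable="SPASS"):
    """Build an argument list; the -NAME=1 form is assumed for every flag."""
    command = [executable]
    if procedure is not None:
        if procedure not in FLAGS:
            raise ValueError(f"unknown decision procedure flag: {procedure}")
        command.append(f"-{procedure}=1")
    command.append(problem_file)
    return command


# Hypothetical input file name.
example = spass_command("problem.dfg", "GC")
default = spass_command("problem.dfg")
```

Building the command as a list rather than a single string avoids quoting problems if the list is later passed to a process launcher.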
Problems belonging to E1
  New version:
    PUZ037-3.dfg          Proof found
    SYN434+1.dfg          Completion found
    SYN449-1.dfg          Completion found
    SYN449+1.dfg          Completion found
    SYN456-1.dfg          Completion found
    SYN463+1.dfg          Completion found
    SYN464+1.dfg          Completion found
    Results obtained: 406/554
  Basic version:
    SYN302-1.003.dfg      Completion found
    SYN446+1.dfg          Completion found
    Results obtained: 401/554

Problems belonging to E+
  New version:
    Same improvements as E1
    Results obtained: 431/579
  Basic version:
    Same deteriorations as E1
    Results obtained: 426/579

Problems belonging to GC
  New version:
    No improvement
    Results obtained: 154/159
  Basic version:
    No deterioration
    Results obtained: 154/159

Problems belonging to BSH∗
  New version:
    PUZ056-2.010.dfg      Proof found
    Results obtained: 410/436
  Basic version:
    No deterioration
    Results obtained: 409/436

Problems belonging to OCCI
  New version:
    PUZ056-2.010.dfg      Proof found
    Results obtained: 304/319
  Basic version:
    No deterioration
    Results obtained: 303/319

Problems belonging to PVD
  New version:
    GRA017+1.dfg          Completion found
    GRP128-2.004.dfg      Completion found
    GRP128-2.006.dfg      Proof found
    GRP128-3.004.dfg      Completion found
    GRP128-4.004.dfg      Completion found
    GRP129-1.005.dfg      Completion found
    GRP129-2.004.dfg      Proof found
    GRP129-2.005.dfg      Completion found
    GRP129-3.005.dfg      Completion found
    GRP129-4.005.dfg      Completion found
    GRP130-1.005.dfg      Completion found
    GRP130-2.005.dfg      Completion found
    GRP130-3.004.dfg      Completion found
    GRP130-4.004.dfg      Completion found
    GRP131-1.005.dfg      Completion found
    GRP131-2.005.dfg      Completion found
    GRP132-1.005.dfg      Completion found
    GRP132-2.005.dfg      Completion found
    GRP133-1.004.dfg      Completion found
    GRP133-2.004.dfg      Completion found
    GRP134-1.005.dfg      Completion found
    GRP134-2.005.dfg      Completion found
    GRP135-1.005.dfg      Completion found
    GRP135-2.005.dfg      Completion found
    Results obtained: 328/351
  Basic version:
    Results obtained: 303/351

Table 7.2: Improvements and deteriorations
Chapter 8
Conclusion
The aim of this dissertation was to improve the theorem prover SPASS. The
idea was to make SPASS recognise some classes that are
known to be decidable and select the appropriate method according to this new kind
of analysis.
This project required a lot of research. All the main concepts related to
resolution for first-order logic are summarised in this dissertation. In particular,
it is explained how resolution can be used as a decision procedure for some classes
of problems. From the collected information, relevant decidable classes and their
decision procedures have been listed. The goal was then to allow SPASS to use
the information contained in this list. The basic functioning of SPASS is
described in detail in Chapter 3. An extension of the SPASS analysis module was
necessary to make SPASS recognise when a problem belongs to a decidable class.
The rules forming the studied decision procedures were all already implemented in
the basic version of SPASS. Only some specific orderings and selection functions
had to be implemented in order for SPASS to be able to use one of the decision
procedures listed.
The extended version of SPASS is now able to recognise eight classes and to
activate a decision procedure for six of them. The results provided by the program
are described in Chapter 7. They show that the new version of
SPASS performs better than the old one on the problems tested. All the results obtained are
correct. Thirty-three problems now provide a result whereas they could not with
the old version. Only one deterioration has been observed.
This version of SPASS can, however, still be improved. Because of time constraints, the
decision procedures for the two classes WGC and S+ have not been implemented.
It would therefore be interesting to implement them and observe whether better results
are obtained. Moreover, many problems do not belong to any of the decidable classes
implemented, so the work done has no impact on these problems. Thus,
one way to improve this version would be to add decidable classes which include
a large number of problems, such as Maslov’s class K. The K class is a kind of
superclass that includes a number of other solvable classes [Hustadt and
Schmidt, 1999]. It would also be interesting to apply the main idea of this project
to theorem provers other than SPASS.
Bibliography
[Andréka et al., 1998] Andréka, H., Németi, I., and van Benthem, J. (1998).
Modal languages and bounded fragments of predicate logic. Journal of Philosophical Logic, 27(3):217–274.
[Bachmair and Ganzinger, 1998] Bachmair, L. and Ganzinger, H. (1998). Equational reasoning in saturation-based theorem proving. Automated deduction—a
basis for applications, 1:353–397.
[Bachmair and Ganzinger, 2001] Bachmair, L. and Ganzinger, H. (2001). Resolution theorem proving. Handbook of automated reasoning, 1:19–99.
[Caferra et al., 2004] Caferra, R., Leitsch, A., and Peltier, N. (2004). Automated
model building, volume 31. Springer.
[de Nivelle, 1998] de Nivelle, H. (1998). A resolution decision procedure for the
guarded fragment. In Kirchner, C. and Kirchner, H., editors, Automated Deduction — CADE-15, volume 1421 of Lecture Notes in Computer Science, pages
191–204. Springer Berlin / Heidelberg.
[de Nivelle, 2000] de Nivelle, H. (2000). Deciding the E+-class by an a posteriori,
liftable order. Annals of Pure and Applied Logic, 104(1):219–232.
[Dierkes, 2000] Dierkes, M. (2000). An application of model building in a resolution decision procedure for guarded formulas. Computational Logic—CL 2000,
pages 583–597.
[Fermüller et al., 2001a] Fermüller, C., Leitsch, A., Hustadt, U., and Tammet, T.
(2001a). Resolution decision procedures. In Handbook of Automated Reasoning,
pages 1791–1849. Elsevier Science Publishers BV, Elsevier and MIT Press.
[Fermüller et al., 2001b] Fermüller, C., Leitsch, A., Hustadt, U., and Tammet,
T. (2001b). Unification theory. In Handbook of Automated Reasoning, pages
445–532. Elsevier Science Publishers BV, Elsevier and MIT Press.
[Fermüller and Leitsch, 1993] Fermüller, C. and Leitsch, A. (1993). Model building by resolution. In Börger, E., Jäger, G., Kleine Büning, H., Martini, S., and
Richter, M., editors, Computer Science Logic, volume 702 of Lecture Notes in
Computer Science, pages 134–148. Springer Berlin / Heidelberg.
[Fermüller et al., 1993a] Fermüller, C., Leitsch, A., Tammet, T., and Zamov, N.
(1993a). Completeness of ordering refinements. In Resolution Methods for the
Decision Problem, volume 679 of Lecture Notes in Computer Science, pages
60–92. Springer Berlin / Heidelberg.
[Fermüller et al., 1993b] Fermüller, C., Leitsch, A., Tammet, T., and Zamov,
N. (1993b). Semantic clash resolution as decision procedure. In Resolution
Methods for the Decision Problem, volume 679 of Lecture Notes in Computer
Science, pages 17–59. Springer Berlin / Heidelberg.
[Fermüller et al., 1993c] Fermüller, C., Leitsch, A., Tammet, T., and Zamov, N.
(1993c). Semantic tree based resolution variants. In Resolution Methods for
the Decision Problem, volume 679 of Lecture Notes in Computer Science, pages
93–129. Springer Berlin / Heidelberg.
[Ganzinger and De Nivelle, 1999] Ganzinger, H. and De Nivelle, H. (1999). A
superposition decision procedure for the guarded fragment with equality. In
Logic in Computer Science, 1999. Proceedings. 14th Symposium on, pages 295–
303. IEEE.
[Georgieva et al., 2003] Georgieva, L., Hustadt, U., and Schmidt, R. (2003). Hyperresolution for guarded formulae. Journal of Symbolic Computation, 36(1-2):163–192.
[Horrocks et al., 2007] Horrocks, I., Hustadt, U., Sattler, U., and Schmidt, R.
(2007). Computational modal logic. Studies in Logic and Practical Reasoning,
3:181–245.
[Hsiang and Rusinowitch, 1991] Hsiang, J. and Rusinowitch, M. (1991). Proving
refutational completeness of theorem-proving strategies: the transfinite semantic tree method. Journal of the ACM, 38(3):558–586.
[Hustadt and Schmidt, 1999] Hustadt, U. and Schmidt, R. (1999). Maslov’s class
K revisited. Automated Deduction—CADE-16, pages 678–678.
[Kowalski and Hayes, 1968] Kowalski, R. and Hayes, P. (1968). Semantic trees
in automatic theorem proving. University of Edinburgh.
[Leitsch, 1997] Leitsch, A. (1997). The resolution calculus. Springer-Verlag New
York.
[McCune, 2005] McCune, W. (2005). Release of prover9. In Mile High Conference
on Quasigroups, Loops and Nonassociative Systems, Denver, Colorado.
[McCune et al., 2009] McCune, W. et al. (2009). Prover9 manual.
http://www.cs.unm.edu/~mccune/mace4/manual/2009-11A.
[Noll, 1980] Noll, H. (1980). A note on resolution: How to get rid of factoring
without loosing completeness. In 5th Conference on Automated Deduction Les
Arcs, France, July 8–11, 1980, pages 250–263. Springer.
[Nonnengart and Weidenbach, 2001] Nonnengart, A. and Weidenbach, C. (2001).
Computing small clause normal forms. Handbook of automated reasoning,
1:335–367.
[Rautenberg, 2009] Rautenberg, W. (2009). First-order logic. In A concise introduction to mathematical logic, pages 41–90. Springer Verlag.
[Riazanov and Voronkov, 2001] Riazanov, A. and Voronkov, A. (2001). Vampire
1.1. Automated Reasoning, pages 376–380.
[Schmidt, 2008] Schmidt, R. (2007-2008). Lecture Notes, COMP60121: Automated Reasoning. The University of Manchester.
[Schmidt, 2012] Schmidt, R. (2011-2012). Lecture Notes, COMP61111: Logical
Reasoning and Applications. The University of Manchester.
[Schulz, 2002] Schulz, S. (2002). E - A brainiac theorem prover. AI Communications, 15(2-3):111–126.
[Schulz, 2010] Schulz, S. (2010). The E theorem prover.
http://www4.informatik.tu-muenchen.de/~schulz/E/E.html.
[Skolem, 1955] Skolem, T. (1955). Peano’s axioms and models of arithmetic.
Studies in Logic and the Foundation of Mathematics, pages 1–14.
[Sutcliffe, 2009] Sutcliffe, G. (2009). The TPTP problem library and associated
infrastructure. Journal of Automated Reasoning, 43(4):337–362.
[Sutcliffe and Suttner, 1998] Sutcliffe, G. and Suttner, C. (1998). The
TPTP problem library. Journal of Automated Reasoning, 21:177–203.
10.1023/A:1005806324129.
[Voronkov, 2009a] Voronkov, A. (2009a). Lecture notes: Automated reasoning,
The University of Manchester. http://www.voronkov.com/ar.cgi.
[Voronkov, 2009b] Voronkov, A. (2009b). Vampire’s home page.
http://www.vprover.org.
[Weidenbach, 2005] Weidenbach, C. (2005). Spass: An automated theorem
prover for first-order logic with equality. http://www.spass-prover.org.
[Weidenbach et al., 2009] Weidenbach, C., Dimova, D., Fietzke, A., Kumar,
R., Suda, M., and Wischnewski, P. (2009). Spass version 3.5. Automated
Deduction–CADE-22, pages 140–145.
[Weidenbach et al., 2007a] Weidenbach, C., Schmidt, R., Hillenbrand, T., Rusev,
R., and Topic, D. (2007a). System description: Spass version 3.0. Automated
Deduction–CADE-21, pages 514–520.
[Weidenbach et al., 2007b] Weidenbach, C., Schmidt, R., and Keen, E. (2007b).
Spass handbook version 3.0. Contained in the distribution of SPASS Version,
3.
Appendix A : SPASS flags
Inference Rules
ISor      Sort Constraint Resolution
IEmS      Empty Sort
IEqR      Equality Resolution
IERR      Reflexivity Resolution
ISpL      Superposition Left
IOPm      Ordered Paramodulation
ISPm      Standard Paramodulation
ISpR      Superposition Right
IOFc      Ordered Factoring
ISFc      Standard Factoring
IEqF      Equality Factoring
IMPm      Merging Paramodulation
IORe      Ordered Resolution
ISRe      Standard Resolution
IOHy      Ordered Hyper Resolution
ISHy      Standard Hyper Resolution
Splits    Splitting

Reduction Rules
RSSi      Sort Simplification
RSST      Static Soft Typing
RObv      Trivial Literal Elimination
RFSub     Forward Subsumption Deletion
RBSub     Backward Subsumption Deletion
RCon      Condensation
RTaut     Tautology Deletion
RUnc      Unit Conflict
RTer      Terminator
RFMMR     Forward Matching Replacement Resolution
RBMMR     Backward Matching Replacement Resolution
RFRew     Forward Rewriting
RBRew     Backward Rewriting
RAED      Assignment Equation Deletion