Introduction to logic
1. Logic and Artificial Intelligence: some historical remarks
One of the aims of AI is to reproduce human features by means of a computer system. One of these
features is reasoning. Reasoning can be viewed as the process of having a knowledge base (KB) and
manipulating it to create new knowledge.
Reasoning can be seen as comprising three processes:
1. Perceiving stimuli from the environment;
2. Translating the stimuli in components of the KB;
3. Working on the KB to increase it.
We are going to deal with how to represent information in the KB and how to reason about it. We use
logic as a device to pursue this aim.
These notes are an introduction to modern logic, whose origins can be found in George Boole’s and
Gottlob Frege’s works in the nineteenth century. However, logic in general traces back to the ancient Greek
philosophers who studied under which conditions an argument is valid. An argument is any set of
statements – explicit or implicit – one of which is the conclusion (the statement being defended) and
the others are the premises (the statements providing the defense). The relationship between the
conclusion and the premises is such that the conclusion follows from the premises (Lepore 2003).
Modern logic is a powerful tool to represent and reason on knowledge, and AI has traditionally adopted
it as a working tool. Within logic, one research tradition that strongly influenced AI is the one started by the
German philosopher and mathematician Gottfried Leibniz. His aim was to formulate a Lingua
Rationalis: a universal language based only on rationality, which could resolve all controversies
between different parties.
This language was supposed to be comprised of:
- a characteristica universalis: a set of symbols with which one may express all the sentences of this language; and
- a calculus ratiocinator: a set of rules by which one may deduce all possible truths from an initial list of thoughts (expressed in the form of a characteristica universalis sentence).
Leibniz’s very ambitious project failed: he encountered several difficulties already in the development
of the characteristica universalis due to the absence of an appropriate formalism for this kind of
language.
The necessary steps to the development of logic in its modern form were taken by George Boole
(1854) and Gottlob Frege (1879). Boole revolutionized logic by applying methods from the then-emerging field of symbolic algebra to logic. Where traditional (Aristotelian) logic relied on cataloging
the valid syllogisms of various simple forms, Boole's method provided general algorithms in an
algebraic language which applied to an infinite variety of arguments of arbitrary complexity. Frege
essentially reconceived the discipline of logic by constructing a formal system which, in effect,
constituted the first ‘predicate calculus’. In this formal system, Frege developed an analysis of
quantified statements and formalized the notion of a ‘proof’ in terms that are still accepted today.
Frege then demonstrated that one could use his system to resolve theoretical mathematical statements
in terms of simpler logical and mathematical notions.
There are different types of logic, which we can use to pursue our aim of representing and reasoning
on knowledge, but we start with the simplest one, called Propositional Logic (PL). Then we present a
more powerful type of logic, called First Order Logic (FOL) or Predicate Calculus, which is the one
usually adopted in AI.
2. Introduction to key terms
Before presenting PL and FOL, we briefly introduce some key terms of logic (argument, deductive
validity, soundness) (Lepore 2003) that we will later instantiate within the specific logical languages
we present.
2.1 Arguments
As said, an argument is any set of statements – explicit or implicit – one of which is the conclusion
(the statement being defended) and the others are the premises (statements providing the defense). The
relationship between the conclusion and the premises is such that the conclusion follows from the
premises. Here is one example.
(2.1) Anyone who deliberates about alternative courses of action believes he is free. Since
everybody deliberates about alternative courses of action, it follows that we all believe ourselves to
be free.
A statement is any indicative sentence that is either true or false. ‘Galileo was an astronomer’ is a
statement; ‘Is George Washington president?’ is not a statement.
Putting arguments into a standard format
Having determined that some part of discourse contains an argument, the next task is to put it into a
standard format.
As said, an argument is any set of statements – explicit or implicit – one of which is the conclusion,
and the others are the premises. To put an argument in a standard form requires all the following steps:
- to identify the premises and the conclusion;
- to place the premises first and the conclusion last;
- to make explicit any premise, or even the conclusion, that is only implicit in the argument but essential to the argument itself.
So the argument presented as example above (2.1) has the following standard format.
Premise 1: Anyone who deliberates about alternative courses of action believes he is free.
Premise 2: Everybody deliberates about alternative courses of action.
Conclusion: We all believe ourselves to be free.
To make statements (both premises and conclusions) explicit is often critical when putting an
argument into standard format. Let us consider another example in which both one of the two premises
and the conclusion are hidden in its natural format.
(2.2) What’s the problem today with John? Everybody is afraid to go to the dentist.
Premise 1: Everybody is afraid to go to the dentist.
Premise 2 (hidden): John is going to the dentist today.
Conclusion (hidden): John is afraid to go to the dentist today.
2.2 Deductive validity
Defining an argument, we have said that the relationship between the conclusion and the premises is
such that the conclusion purportedly follows from the premises. But what is it for one statement to
‘follow from’ others? The principal sense of ‘follows from’ derives from the notion of a deductively
valid argument.
A deductively valid argument is an argument such that it is not possible both for its premises to be true
and its conclusion to be false. In other terms, in a valid argument it is not possible that, if the premises
are true, the conclusion is false.
(2.3) All men are mortal. Socrates is a man. So Socrates is mortal.
(2.3) is a deductively valid argument, whose conclusion follows from its premises.
(2.4) All bachelors are nice. John is a bachelor. So John is nice.
(2.4) is also a deductively valid argument. However, its premises may well be false. It is worth noting
that to determine whether a specific statement is true or false is not the aim of logic. Logic cannot say
if the statements ‘All bachelors are nice’ and ‘John is a bachelor’ are true or false. This has to be
determined by means of empirical analysis. The aim of logic, rather, is to determine if the argument is
valid: given that the premises are true, the conclusion has to be true as well.
While in ordinary language the terms ‘valid’ and ‘true’ are often used interchangeably, in logic ‘valid’ and
‘invalid’ apply only to arguments, while ‘true’ and ‘false’ only to statements.
2.3 Soundness
An argument can be deductively valid, but unlikely to persuade anyone. Normally good arguments are
not only deductively valid. They also have true premises. These arguments are called sound.
Argument (2.4) above is valid, but not sound. Another example:
(2.5) All fish fly. Anything which flies talks. So, all fish talk.
This is a deductively valid argument but unlikely to persuade anyone.
3. Propositional Logic
In learning Propositional Logic (PL) we learn ways of representing arguments correctly and ways of
testing the validity of those arguments. There exist various techniques to determine whether arguments
containing these types of statements are deductively valid in virtue of their form and PL is one of
them. PL, like any other logic, can be thought of as comprising three components:
- syntax: which specifies how to build sentences of the logic;
- semantics: which attaches a meaning to these sentences;
- inference rules: which manipulate these sentences to obtain more sentences.
The basic symbolization rule in PL is that a simple statement is symbolized by a single statement
letter.
3.1 Syntax of PL
The syntax defines the allowable sentences in PL. PL sentences are comprised of:
Atomic sentences: indivisible syntactic elements consisting of a single propositional symbol,
usually an upper case letter (P, Q, R). Every symbol represents a sentence that can be true (T) or
false (F).
Complex sentences: constructed from atomic ones by means of logical connectives, ¬ (negation), ∧
(conjunction), ∨ (disjunction), → (conditional), ↔ (biconditional).
Let us analyze the logical connectives in more detail.
Negation ¬
- If P is a PL sentence, ¬P is a PL sentence, too.
- ¬P is called the negation of P.
- If P is an atomic sentence, both P and ¬P are called literals (positive and negative respectively).
Conjunction ∧
- If P and Q are PL sentences, P ∧ Q is a PL sentence.
Disjunction ∨
- If P and Q are PL sentences, P ∨ Q is a PL sentence.
Conditional →
- If P and Q are PL sentences, P → Q is a PL sentence.
Biconditional ↔
- If P and Q are PL sentences, P ↔ Q is a PL sentence.
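Although the notes contain no code, the recursive grammar above maps naturally onto a data structure. The following Python sketch is our own illustration (all names are invented): atomic sentences are plain strings and each connective wraps its arguments in a tagged tuple.

```python
# Illustrative sketch (not from the notes): PL syntax as nested Python tuples.
# Atomic sentences are strings (e.g. "P"); each connective builds a tagged
# tuple, mirroring the recursive grammar above.

def Not(p):        return ("not", p)
def And(p, q):     return ("and", p, q)
def Or(p, q):      return ("or", p, q)
def Implies(p, q): return ("implies", p, q)
def Iff(p, q):     return ("iff", p, q)

# The complex sentence (P ∧ Q) → ¬R:
s = Implies(And("P", "Q"), Not("R"))
print(s)   # ('implies', ('and', 'P', 'Q'), ('not', 'R'))
```

Because the grammar is recursive, arbitrarily deep sentences are built by nesting these constructors.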
3.2 Semantics of PL
The semantics defines the rules for determining the truth of a sentence with respect to a particular
model. A model fixes the truth value (True, False) for every proposition symbol. A model is a pure
mathematical object, with no necessary connection to the actual world. Such connection is given by an
interpretation, which associates atomic sentences to propositions about the world.
Let us consider the following example. If in our KB we make use of the proposition symbols P1, P2,
and P3 a possible model is M1 = { P1 = False, P2 = False, P3 = True}. A possible interpretation is the
following:
P1: Viola lives in Paris.
P2: Viola lives in London.
P3: Viola lives in Milan.
In this interpretation, M1 represents facts from the real world, but we may have a lot of other
interpretations:
P1: the burgers of Burger King are better than those of McDonald's.
P2: rainbows are beautiful.
P3: the set of prime numbers is finite.
The semantics of PL not only associates a truth value with every atomic sentence; it also provides the
rules for determining the truth of complex sentences on the basis of the truth values of their atomic
components and the connectives that combine them.
The truth-value of complex sentences is established recursively, in accordance with the truth tables of
the logical connectives (see Figure 1).
If in a model M1 we have that P is True and Q is False (M1 = {P = True, Q = False}) we also have that
P ∨ Q is true and we say that M1 is a model of P ∨ Q.
Figure 1: Truth tables of the logical connectives.
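The recursive evaluation just described can be sketched in Python (our own illustration, not part of the notes; the tuple representation and all names are assumptions). A sentence is either an atomic string or a tagged tuple such as ("or", "P", "Q"); a model is a dict mapping proposition symbols to True/False.

```python
# Hedged sketch: recursive evaluation of a PL sentence in a model, following
# the truth tables of Figure 1. All names here are illustrative choices.

def truth(sentence, model):
    if isinstance(sentence, str):          # atomic sentence: look it up
        return model[sentence]
    op = sentence[0]
    if op == "not":
        return not truth(sentence[1], model)
    if op == "and":
        return truth(sentence[1], model) and truth(sentence[2], model)
    if op == "or":
        return truth(sentence[1], model) or truth(sentence[2], model)
    if op == "implies":                    # false only when antecedent true, consequent false
        return (not truth(sentence[1], model)) or truth(sentence[2], model)
    if op == "iff":                        # true when both sides agree
        return truth(sentence[1], model) == truth(sentence[2], model)
    raise ValueError(f"unknown connective {op!r}")

# M1 = {P = True, Q = False}: P ∨ Q is true, so M1 is a model of P ∨ Q.
m1 = {"P": True, "Q": False}
print(truth(("or", "P", "Q"), m1))   # True
```

The recursion bottoms out at atomic sentences, exactly as the inductive definition of the semantics prescribes.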
3.3 Inference
Reasoning aims at getting new knowledge. New knowledge comes from getting new true sentences
from other true sentences that are already in our KB.
Logical entailment
Two PL sentences P1 and P2 are said to be in a relation of logical entailment, or logical consequence,
(P1╞ P2) if and only if every model of P1 is also a model of P2. We also say P1 entails P2, P2 follows
logically from P1, or P2 is a logical consequence of P1.
The aim of logical inference is to determine whether KB╞ α, which means that our knowledge base
entails α or that in every model in which all sentences in KB are true, α is true, too.
There exists a simple way to decide whether KB╞ α or not. We check all the models in which KB is
true and see whether α is true, too.
Let us consider the following examples:
KB = {P ∧ Q; Q → R}
α=R
KB╞ α ?
The answer is yes because in all the models in which KB is true, also α is true.
KB’ = {P ∨ Q; Q → R}
α=R
KB’╞ α ?
The answer is no, because in this case there is a model M1 = {P = True, Q = False, R = False} in which
KB’ is true, while α is not.
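The model-checking procedure just applied by hand can be sketched as a small Python program (our own illustration; the tuple representation of sentences and all names are assumptions): it enumerates every assignment to the symbols involved and looks for a counter-model.

```python
# Hedged sketch of the truth-table enumeration procedure:
# KB |= alpha iff alpha is true in every model that makes all KB sentences true.
from itertools import product

def truth(s, model):
    if isinstance(s, str): return model[s]
    op = s[0]
    if op == "not":     return not truth(s[1], model)
    if op == "and":     return truth(s[1], model) and truth(s[2], model)
    if op == "or":      return truth(s[1], model) or truth(s[2], model)
    if op == "implies": return (not truth(s[1], model)) or truth(s[2], model)

def symbols(s, acc=None):
    # collect the proposition symbols occurring in a sentence
    acc = set() if acc is None else acc
    if isinstance(s, str):
        acc.add(s)
    else:
        for sub in s[1:]:
            symbols(sub, acc)
    return acc

def entails(kb, alpha):
    syms = sorted(set().union(*(symbols(s) for s in kb + [alpha])))
    for values in product([True, False], repeat=len(syms)):
        model = dict(zip(syms, values))
        if all(truth(s, model) for s in kb) and not truth(alpha, model):
            return False            # counter-model found
    return True

kb  = [("and", "P", "Q"), ("implies", "Q", "R")]
kb2 = [("or", "P", "Q"), ("implies", "Q", "R")]
print(entails(kb, "R"))    # True
print(entails(kb2, "R"))   # False: e.g. M1 = {P=True, Q=False, R=False}
```

This is exactly the 2^n enumeration mentioned below: fine for a handful of symbols, infeasible for large knowledge bases.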
This truth-table enumeration procedure can be time-consuming: each propositional symbol can be
assigned two different truth values, so that a formula with n symbols has 2^n possible assignments, and
checking the truth value of each of them rapidly becomes infeasible as n increases. Is there another
way to check logical entailment?
Logical inference
Sentences can be dealt with without any concern about their truth value. Algorithms can be defined to
obtain new sentences from other sentences. This is called logical inference. For example we can define
an inference rule c, according to which we can infer P2 ∧ P1 from P1 ∧ P2. We write:
P1 ∧ P2 ├c P2 ∧ P1
Or we can define an inference rule mp such that:
P1, P1 → P2 ├mp P2
What is the relation between logical inference and logical entailment? If we apply an inference rule to
a set of sentences, we obtain a new sentence. This is what inference is about. But what about the truth
value of the new sentence? We would like it to be true, so that we can expand our KB.
Soundness
An inference rule is said to be sound when all inferred/derived sentences are also entailed.
If KB├i α, then KB╞ α
Completeness
An inference rule is said to be complete if it can derive/infer all entailed sentences.
If KB╞ α, then KB├i α
With respect to completeness, we may have a set of inference rules I instead of a single rule i.
KB├I α means that we obtain α from the sentences of KB after applying some of the rules in I a finite
number of times.
If we have a sound and complete set of inference rules I, we may want to use it to check whether
KB╞ α or not, instead of constructing the truth table of all sentences in KB and α.
Why do we need I to be both sound and complete?
If it is not sound, even if we get KB├I α, we cannot be sure that KB╞ α.
If it is not complete, even if KB╞ α, we may not get to the result of having KB├I α.
Logical equivalence
Two sentences P1 and P2 are equivalent if they are true in the same set of models.
We write P1 ≡ P2
We can show by means of truth tables that P ∧ Q and Q ∧ P are logically equivalent.
Some standard logical equivalences are the following.
Commutativity of conjunction
α∧β≡β∧α
Commutativity of disjunction
α∨β≡β∨α
Associativity of conjunction
(α∧β) ∧ γ ≡ α ∧ (β∧γ)
Associativity of disjunction
(α∨β) ∨ γ ≡ α ∨ (β∨γ)
Double-negation elimination
¬(¬α) ≡ α
Contraposition
α → β ≡ ¬β → ¬α
Conditional elimination
α → β ≡ ¬α ∨ β
Biconditional elimination
α ↔ β ≡ (α → β) ∧ (β→ α)
De Morgan (1)
¬(α∧β) ≡ ¬α ∨ ¬β
De Morgan (2)
¬(α∨β) ≡ ¬α ∧ ¬ β
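Any of these equivalences can be verified mechanically by enumerating models, as suggested above for P ∧ Q ≡ Q ∧ P. A minimal Python sketch (our own illustration; the encoding of formulas as Boolean functions is an assumption) checks De Morgan's first law and contraposition for atomic α and β:

```python
# Hedged sketch: two sentences over the atoms a, b are logically equivalent
# iff they agree on all four truth-value assignments.
from itertools import product

def equivalent(f, g):
    # f and g are Python functions from (a, b) truth values to a truth value
    return all(f(a, b) == g(a, b) for a, b in product([True, False], repeat=2))

# De Morgan (1): ¬(α∧β) ≡ ¬α ∨ ¬β
print(equivalent(lambda a, b: not (a and b),
                 lambda a, b: (not a) or (not b)))          # True

# Contraposition: α → β ≡ ¬β → ¬α (with → read as ¬φ ∨ ψ)
print(equivalent(lambda a, b: (not a) or b,
                 lambda a, b: (not (not b)) or (not a)))    # True
```

The same check generalizes to any number of atoms by enlarging the `repeat` argument.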
Validity
A sentence is valid if it is true in all models. A valid sentence is also called a tautology.
P ∨ ¬P is valid, and we can write ╞ P ∨ ¬P
Deduction theorem
For any sentence P1 and P2, P1╞ P2 if and only if P1→ P2 is valid.
Satisfiability
A sentence is said to be satisfiable if it is true in some model.
P ∧ Q is satisfiable, for instance, because there exists M1 = {P=true; Q=true};
it is not valid because there exists M2 ={P=true; Q=false}.
P is valid if and only if ¬P is not satisfiable (unsatisfiable).
P is satisfiable if and only if ¬P is not valid.
An unsatisfiable sentence is also called a contradiction.
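The two dual facts just stated (P is valid iff ¬P is unsatisfiable, and vice versa) can be sketched in Python (our own illustration; modeling a sentence as a Boolean function of its n atoms is an assumption of ours):

```python
# Hedged sketch: validity and satisfiability by brute-force model enumeration.
from itertools import product

def satisfiable(f, n):
    # f is a function of n truth values; is it true in SOME model?
    return any(f(*vals) for vals in product([True, False], repeat=n))

def valid(f, n):
    # true in ALL models iff its negation is unsatisfiable
    return not satisfiable(lambda *vals: not f(*vals), n)

print(valid(lambda p: p or not p, 1))        # True: P ∨ ¬P is a tautology
print(satisfiable(lambda p, q: p and q, 2))  # True: model {P=True, Q=True}
print(valid(lambda p, q: p and q, 2))        # False: e.g. {P=True, Q=False}
print(satisfiable(lambda p: p and not p, 1)) # False: a contradiction
```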
Inference rules
We use inference rules to derive some conclusions from the sentences in our KB.
Modus ponens
α→β
α
β
Conjunction elimination
α∧β
α
Conjunction introduction
α
β
α∧β
Disjunction introduction
α
α∨β
All logical equivalences, for instance contraposition
α→β
¬β → ¬α
We use inference rules to build a so-called proof: a sequence of sentences P1, …, Pn is a proof of Pn
from the sentence-set Δ if and only if
- every Pk is in Δ, or
- it is inferred from one or more sentences that precede it in the sequence by means of an inference rule.
In this case, we write Δ├ Pn and we say that Pn is a theorem of Δ.
Propositional logic has a fundamental property, called monotonicity, that can be expressed in the
following way:
if KB╞ α, then KB ∪{β}╞ α
If we expand the knowledge base, we can have additional conclusions, but such expansion cannot
invalidate a conclusion already inferred like α.
Monotonicity means that we can apply inference rules whenever suitable premises are found in the
knowledge base, regardless of what else is there.
Resolution
Resolution is an inference rule that deserves to be presented separately from the others because of the
fundamental role it played in the development of automated reasoning and theorem proving (Robinson
1965).
The rule works as follows:
l1 ∨ ... ∨ lk
m1 ∨ ... ∨ mn
(where there are two literals li and mj such that one is the negation of the other)
l1 ∨ ... ∨ li-1 ∨ li+1 ∨ ... ∨ lk ∨ m1 ∨ ... ∨ mj-1 ∨ mj+1 ∨ ... ∨ mn
The rule states that if there are two disjunctions of literals such that the former contains a literal li that
is the negation of a literal mj in the latter or vice versa, we can infer a disjunction that merges the two
premises, excluding li and mj.
The resulting clause should contain only one copy of each literal. The removal of multiple copies of
literals in the same clause is called factoring.
A ∨ B
A ∨ ¬B
A ∨ A    (by resolution)
A        (by factoring)
It is straightforward to prove that
A∨A∨α≡A∨α
for any literal A and for any formula α.
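A convenient way to see resolution together with factoring (our own illustrative sketch, not from the notes) is to represent a clause as a set of literals: duplicate literals then collapse automatically, so factoring comes for free.

```python
# Hedged sketch: resolution on clauses represented as frozensets of literals.
# A literal is a string; negation is marked with a leading "~" (our convention).

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    # for every literal in c1 whose negation is in c2, merge the two clauses
    # minus the complementary pair; sets remove duplicates (factoring)
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return out

c1 = frozenset({"A", "B"})
c2 = frozenset({"A", "~B"})
print(resolvents(c1, c2))   # [frozenset({'A'})] -- resolution and factoring in one step
```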
How to transform PL formulas into clauses
Resolution can be applied only to clauses, so if we want to exploit it to reason about a KB, all the
formulas in the KB must be transformed into clauses. Although we skip the relevant proof, it can
be shown that any PL formula can be transformed into an equivalent set of clauses. We will show the steps of
such a procedure with respect to the following example:
A ↔ B∨C
1) ↔, → elimination: every biconditional must be turned into a conjunction of conditionals, and every
conditional (ϕ → ψ) must be transformed into a disjunction (¬ϕ ∨ ψ).
(A → B∨C) ∧ (B∨C → A)
(¬A∨B∨C) ∧ (¬(B∨C) ∨ A)
2) Moving ¬ inwards: all negations should be shifted inwards, by means of double-negation
elimination and the De Morgan laws, so as to negate only single propositional symbols.
(¬A∨B∨C) ∧ (¬B∧¬C ∨ A)
3) Conjunctive normal form: by using the distributivity properties, transform the formula into a
conjunction of disjunctions.
(¬A∨B∨C) ∧ (¬B∨A) ∧ (¬C∨A)
4) ∧ elimination: decompose the conjunctive normal form into single clauses.
¬A∨B∨C
¬B∨A
¬C∨A
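The four steps above can be sketched as a small Python program (our own illustration; the nested-tuple encoding of formulas and all function names are our assumptions). Running it on A ↔ B∨C reproduces the three clauses of the example.

```python
# Hedged sketch of the clause-form procedure on nested-tuple formulas,
# e.g. ("iff", "A", ("or", "B", "C")) for A ↔ B∨C. Names are illustrative.

def eliminate(f):                     # step 1: remove ↔ and →
    if isinstance(f, str): return f
    op, *args = f
    args = [eliminate(a) for a in args]
    if op == "iff":
        p, q = args
        return ("and", ("or", ("not", p), q), ("or", ("not", q), p))
    if op == "implies":
        p, q = args
        return ("or", ("not", p), q)
    return (op, *args)

def push_not(f):                      # step 2: move ¬ inwards
    if isinstance(f, str): return f
    op, *args = f
    if op == "not":
        g = args[0]
        if isinstance(g, str): return f                          # already a literal
        if g[0] == "not": return push_not(g[1])                  # ¬¬α ≡ α
        if g[0] == "and": return ("or",  push_not(("not", g[1])), push_not(("not", g[2])))
        if g[0] == "or":  return ("and", push_not(("not", g[1])), push_not(("not", g[2])))
    return (op, *[push_not(a) for a in args])

def distribute(f):                    # step 3: distribute ∨ over ∧ (CNF)
    if isinstance(f, str) or f[0] == "not": return f
    op, p, q = f[0], distribute(f[1]), distribute(f[2])
    if op == "or":
        if not isinstance(p, str) and p[0] == "and":
            return ("and", distribute(("or", p[1], q)), distribute(("or", p[2], q)))
        if not isinstance(q, str) and q[0] == "and":
            return ("and", distribute(("or", p, q[1])), distribute(("or", p, q[2])))
    return (op, p, q)

def clauses(f):                       # step 4: split the conjunction into clauses
    if not isinstance(f, str) and f[0] == "and":
        return clauses(f[1]) + clauses(f[2])
    return [f]

cnf = distribute(push_not(eliminate(("iff", "A", ("or", "B", "C")))))
for c in clauses(cnf):
    print(c)   # the three clauses ¬A∨B∨C, ¬B∨A, ¬C∨A
```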
Soundness of resolution
Let us take the rule in its general form again to show that it is sound.
l1 ∨ ... ∨ lk    (1)
m1 ∨ ... ∨ mn    (2)
(where there are two literals li and mj such that one is the negation of the other)
l1 ∨ ... ∨ li-1 ∨ li+1 ∨ ... ∨ lk ∨ m1 ∨ ... ∨ mj-1 ∨ mj+1 ∨ ... ∨ mn    (3)
We must show that (3) is true whenever (1) and (2) are both true. As li and mj are each the negation of
the other, one of them is true and the other is false. We examine each possible case.
li is true and mj is false
mj is false while formula (2) to which it belongs is true, hence m1 ∨...∨ mj-1 ∨ mj +1 ∨...∨ mn is true, and,
thus, formula (3) is true because it is comprised of the literals from (1) except for li in disjunction with
a true formula.
li is false and mj is true
we reason in the same way: li is false while formula (1) to which it belongs is true, hence l1 ∨...∨ li-1 ∨
li +1 ∨...∨ lk is true, and, thus, formula (3) is true because it is comprised of the literals from (2) except
for mj in disjunction with a true formula.
Refutation completeness of resolution
It is very easy to show that resolution is not a complete inference rule: A∨B is a logical consequence of
A, because it is true whenever A is true, but it cannot be obtained from A by resolution.
However, there is a specific way to exploit resolution, in which some form of completeness holds. If
we want to check whether A∨B is a logical consequence of A, we may proceed as follows.
We take A as a hypothesis, that is, we suppose that A is true.
As we want to prove that, under this circumstance, A∨B is true, A∨B becomes our thesis.
We negate the thesis: ¬(A∨B).
We conjunct the hypothesis and the negated thesis: A∧¬(A∨B).
Such a conjunction is unsatisfiable if and only if the hypothesis and the negated thesis can never be true
at the same time, that is, if and only if the thesis is a logical consequence of the hypothesis.
A proof that follows this procedure is called a refutation. In particular, when the search for the
contradiction is performed by means of the resolution technique, we call it refutation by resolution: the
conjunction of the hypothesis (or, more generally, of all the hypotheses we have in a KB) with the
negated thesis can be transformed into a set of clauses with the procedure above, and resolution can be
applied to any pair in the set. The new clauses obtained by resolution are added to the set, and can be
the input of further applications of the inference rule. When we say that resolution is refutation
complete, we mean that this procedure always terminates in one of two ways:
- either a pair of clauses consisting of contradictory literals (e.g. A and ¬A) is found, so that by an application of resolution to such a pair we obtain the so-called empty clause (represented by “{}”, and called ‘empty’ because both literals are eliminated in the process and nothing remains),
- or no new clause can be obtained, that is, the application of resolution to any pair from the set yields a clause that already exists.
In the former case we have discovered that adding the negated thesis to the KB leads to a
contradiction: in all models in which the KB is true, the negated thesis cannot also be true,
which means that, in those models, the thesis in its original form must be true. We have proven that the
thesis is a logical consequence.
In the latter case, no contradiction has been found, which means that the KB and the negated thesis can
both be true: hence, the thesis is not a logical consequence of the KB.
It can be shown that, given a KB and a formula to prove, we always end up in one of these two cases.
Resolution is, then, not complete (we cannot find all logical consequences of KB from scratch), but
refutation complete: we can always successfully check, by means of refutation, whether a given
formula α is a logical consequence of KB.
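The refutation procedure described above can be sketched in Python (our own illustration; the clause representation and all names are assumptions): clauses are frozensets of string literals, with "~" marking negation, and we resolve until the empty clause appears or no new clause can be generated.

```python
# Hedged sketch of refutation by resolution: add the negated thesis to the
# clause set and saturate. Terminates because the literal alphabet is finite.
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    return [(c1 - {l}) | (c2 - {negate(l)}) for l in c1 if negate(l) in c2]

def refutes(clause_list):
    clauses = set(clause_list)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolvents(c1, c2):
                if not r:
                    return True          # empty clause {}: contradiction found
                new.add(frozenset(r))
        if new <= clauses:
            return False                 # no new clause: no contradiction
        clauses |= new

# Does A entail A ∨ B?  KB = {A}; the negated thesis ¬(A∨B) yields {¬A}, {¬B}.
kb_plus_negated_thesis = [frozenset({"A"}), frozenset({"~A"}), frozenset({"~B"})]
print(refutes(kb_plus_negated_thesis))   # True: so A |= A ∨ B
```

Note how this recovers by refutation the entailment A ⊨ A∨B that plain resolution could not derive directly.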
Strategies for resolution
How to pick the pairs of clauses in our search for the empty clause? There is no rule to follow, but
some strategies may speed up the process.
The unit preference prescribes that pairs in which one of the two formulas is a literal should be
preferred. The basic idea is that, as we are looking for the empty clause, we had better focus on clauses
that are already short.
There are a number of strategies based on a set of support, that is, a special subset of the clauses from
which we should always pick one clause to perform the resolution with. The new clauses obtained are
to be added to the set. The initial set of support must be chosen carefully, because the refutation
completeness is at stake. However, if the remaining clauses form a satisfiable set of formulas, then the
completeness of the resolution is preserved. A very common choice is to start with a set of support
comprised of only the clauses deriving from the negated thesis. After all, if there is a contradiction, it
should emerge from there.
Another strategy, namely, the input resolution, prescribes that in each pair at least one clause belongs
to the initial set of clauses. While the input resolution is not complete, a generalization called linear
resolution preserves completeness by prescribing that resolution can be applied to clauses P and Q if P
belongs to the initial set or is an ancestor of Q, that is, a chain of resolutions starting from P has
previously yielded Q.
4. First Order Logic (FOL)
Like any other type of logic, FOL is composed of three elements:
- a formal language that models a fragment of natural language;
- a semantics that defines under which conditions a sentence of the language is true or false;
- a calculus that mechanically determines the validity of sentences.
4.1 Formal language
A formal language is defined by means of a dictionary of symbols and a grammar.
Dictionary of symbols
The symbols of a language belong to three different categories: logical symbols, descriptive symbols,
and structural symbols.
Logical symbols. These are the logical operators (connectives and quantifiers), the equality symbol, and
the individual variables. More precisely:
negation, ‘¬’;
conjunction, ‘∧’;
universal quantifier, ‘∀’;
equality, ‘=’;
a countably infinite set V of individual variables (x, y, z, ..., x1, ..., xn, ...).
Descriptive symbols. These refer to individuals and predicate properties of individuals or relations
among individuals. They consist of:
a set C of individual constants (a, b, c, ..., a1, ..., an, ...);
a set P of predicative symbols or predicates (P, Q, R, ..., P1, ..., Pn, ...), with a function
#: P → N that assigns to every symbol an arity (i.e., the number of arguments).
Structural symbols. These give a precise structure to a formula (brackets and the comma).
Terms are the symbols used to refer to individuals. The set T of terms is defined in the following way:
T = V ∪ C.
Functional terms are complex terms f(t1,..., tn), where f is a functional symbol with arity n and the tk are
terms.
Grammar of the formulas
The set A of atomic formulas is the smallest set that contains:
P(t1,...,tn) , where P ∈ P, ti ∈ T and #(P) = n;
(t1 = t2), where t1, t2 ∈ T; and such that
nothing else belongs to A.
Atomic formulas with a predicate of null arity (P, with #(P) = 0), are called atomic propositions.
The set Φ of formulas is the smallest set that contains:
A;
¬ϕ, where ϕ ∈ Φ ;
(ϕ ∧ ψ), where ϕ, ψ ∈ Φ ;
∀xϕ, where x ∈ V and ϕ ∈ Φ ; and such that
nothing else belongs to Φ.
It is worth noting that the definition of set Φ of formulas is recursive.
Similarly to PL, we call literals the atomic formulas in A and the formulas we obtain by negating
them.
Abbreviations
Further logical symbols are introduced as abbreviations:
disjunction: (ϕ ∨ ψ) abbreviates ¬(¬ϕ ∧ ¬ψ);
conditional: (ϕ → ψ) abbreviates (¬ϕ ∨ ψ);
biconditional: (ϕ ↔ ψ) abbreviates ((ϕ → ψ) ∧ (ψ → ϕ));
existential quantifier: ∃xϕ abbreviates ¬∀x¬ϕ;
inequality symbol: (t1 ≠ t2) abbreviates ¬(t1 = t2).
Brackets elimination
External brackets can be eliminated. For instance:
((P(a) ∧ ∃xQ(a,x)) → (R(a) → Q(a,a)))   becomes   (P(a) ∧ ∃xQ(a,x)) → (R(a) → Q(a,a))
Operators have different priorities, {¬, ∀x, ∃x} > ∧ > ∨ > {→, ↔}, which allows further brackets to be dropped:
(P(a) ∧ ∃xQ(a,x)) → R(a) → Q(a,a)   becomes   P(a) ∧ ∃xQ(a,x) → R(a) → Q(a,a)
Open and closed formulas
A formula is called closed if every occurrence of a variable in the formula is bound; it is called open
otherwise.
Sentences
A closed formula is called a sentence.
Predicates
A (first order or elementary) predicate is a property that a given individual can possess or not, or a
relation that may or may not hold among pairs, triples, etc. of individuals.
4.2 Semantics
To assign semantics to a logical language means to define a criterion to determine whether an
argument is valid. The validity of an argument is reduced to the relation of logical entailment among
formulas. This notion of entailment is based on the concept of truth of a formula. But in general a
formula is not true or false per se, but only with respect to a precise interpretation of the descriptive
symbols of its language (predicates and constants). It follows that to specify the semantics of a
language is to define when formula ϕ is true in a model M, which is expressed by the notation M |= ϕ.
Models
A first order model is a mathematical structure that models a portion of reality. It is modeled as a non-empty set of individuals, called universe or domain, among which n-ary relations hold. An n-ary
relation R on a domain D, with n ≥ 0, is a set R ⊆ D^n, namely a set of n-tuples 〈d1,...,dn〉, with di ∈ D.
Domain D is the basis for an interpretation of the descriptive symbols of the language (constants and
predicates). To every predicate P is associated an n-ary relation ext(P) on the domain, called the extension of
the predicate, and to every constant c is associated an individual ref(c) of the domain, called the reference
of the constant:
ext: P → D*, with ext(P) ⊆ D^#(P);
ref: C → D.
A model is, by definition, the ordered triple constituted by D, ext, and ref:
M = 〈D,ext,ref〉.
Value assignment to variables
The truth of a formula in a model depends on the interpretation of the descriptive symbols in the
domain of the model. Variables pose a problem, since they may occur free and thus cannot have a
value fixed by the interpretation once and for all. In general, the truth of a formula depends not only on the
interpretation of the descriptive symbols, but also on the assignment of individuals to variables. If D is a
non-empty set, an assignment of individuals to variables is a function
val: V → D.
If M = 〈D,ext,ref〉 is a model and t is a term (constant or variable), the denotation of t, den(t), is defined
in the following way:
den(t) = ref(t) if t is a constant;
den(t) = val(t) if t is a variable.
Truth conditions
By exploiting the fact that the set Φ of formulas is recursively defined, it is possible to specify the
truth conditions by induction on the structure of the formula, using atomic formulas as the basis of
the induction. We define what it means for a formula to be true in a model M with respect to an
assignment val,
M,val |= ϕ,
and say that M and val verify (or satisfy) ϕ. When M and val do not satisfy ϕ, we write
M,val |≠ ϕ,
and say that ϕ is false in M under val.
If M = 〈D,ext,ref〉 is a model and val an assignment, we define:
M,val |= P(t1,...,tn) iff 〈den(t1),...,den(tn)〉 ∈ ext(P);
M,val |= (t1 = t2) iff den(t1) = den(t2);
M,val |= ¬ϕ iff M,val |≠ ϕ;
M,val |= (ϕ ∧ ψ) iff M,val |= ϕ and M,val |= ψ;
M,val |= ∀xϕ iff M,val’ |= ϕ for every assignment val’ such that val’ ≅x val, that is, val’ is identical to val for every variable in V except possibly x.
The truth of a formula ϕ in a model M is defined as the truth of ϕ in M for every assignment of values to
the variables:
M |= ϕ iff M,val |= ϕ for every assignment val.
When ϕ is true in M we can also say that M satisfies ϕ, or that M verifies ϕ, or that M is a model of ϕ.
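When the domain is finite, the truth conditions above can be evaluated mechanically. The following Python sketch is our own illustration (the example model, the tuple encoding of formulas, and all names are assumptions); it mirrors the inductive definition, with the ∀ clause trying every x-variant of the assignment.

```python
# Hedged sketch: evaluating FOL formulas in a finite model M = (D, ext, ref).
# Formulas are tagged tuples, e.g. ("forall", "x", ("pred", "LivesIn", "x", "p")).

D   = {"viola", "paris", "milan"}
ext = {"LivesIn": {("viola", "milan")}}          # extension of a binary predicate
ref = {"v": "viola", "m": "milan", "p": "paris"} # references of the constants

def den(t, val):
    # denotation of a term: ref for constants, val for variables
    return ref[t] if t in ref else val[t]

def holds(phi, val):
    op = phi[0]
    if op == "pred":                 # P(t1, ..., tn)
        _, p, *terms = phi
        return tuple(den(t, val) for t in terms) in ext[p]
    if op == "eq":                   # (t1 = t2)
        return den(phi[1], val) == den(phi[2], val)
    if op == "not":
        return not holds(phi[1], val)
    if op == "and":
        return holds(phi[1], val) and holds(phi[2], val)
    if op == "forall":               # ∀xϕ: true under every x-variant of val
        _, x, body = phi
        return all(holds(body, {**val, x: d}) for d in D)

# ∀x ¬LivesIn(x, p): nobody lives in Paris -- true in this model
print(holds(("forall", "x", ("not", ("pred", "LivesIn", "x", "p"))), {}))  # True
```

The `{**val, x: d}` update implements exactly the val’ ≅x val condition: val’ agrees with val everywhere except possibly on x.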
4.3 Semantic classification of formulas
There exist formulas that are true in all models, formulas that are false in every model, and formulas that
are true or false depending on the model. For instance, every formula of the form ϕ ∨ ¬ϕ is true in
every model, ϕ ∧ ¬ϕ is false in every model, and ∃xP(x) is true in some models and false in others. By
definition
a formula ϕ is valid or logically true if for any model M and for any assignment val we have
M,val |= ϕ (and then for every model we have M |= ϕ);
a formula ϕ is satisfiable if for some model M and for some assignment val we have
M,val |= ϕ.
A formula is:
invalid if it is not valid, namely false in some models;
unsatisfiable or logically false if it is not satisfiable, namely false in every model;
contingent if it is neither valid nor unsatisfiable, namely true in some models and false in others.
To assert that a formula ϕ is valid, we write
|= ϕ.
We write
|≠ ϕ
to assert that ϕ is invalid.
|= ¬ϕ means that ϕ is unsatisfiable;
|≠ ¬ϕ means that ϕ is satisfiable;
|≠ ϕ and |≠ ¬ϕ mean that ϕ is contingent.
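In the propositional fragment, this three-way classification can be decided mechanically by enumerating every truth assignment (for full FOL no such general procedure exists). A small sketch, with formulas represented as Python functions over an assignment:

```python
from itertools import product

def classify(symbols, f):
    """f maps a dict of truth values to True/False; returns the class of f."""
    values = [f(dict(zip(symbols, row)))
              for row in product([True, False], repeat=len(symbols))]
    if all(values):
        return 'valid'          # true under every assignment
    if not any(values):
        return 'unsatisfiable'  # false under every assignment
    return 'contingent'         # true under some, false under others

# phi ∨ ¬phi is valid, phi ∧ ¬phi is unsatisfiable, phi alone is contingent
print(classify(['p'], lambda v: v['p'] or not v['p']))   # valid
print(classify(['p'], lambda v: v['p'] and not v['p']))  # unsatisfiable
print(classify(['p'], lambda v: v['p']))                 # contingent
```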
4.4 Logical consequence
When dealing with an argument we are not interested in the ‘absolute’ validity of the conclusion, but
in the validity of the conclusion according to the premises; in other words we are not interested in
knowing whether the conclusion is true in all models, but only in those models that satisfy the
premises.
An argument with premises Γ and conclusion ϕ is valid, and we write
Γ |= ϕ,
if and only if
M |= ϕ for every model M such that M |= Γ.
In this case we say that Γ entails ϕ, or that ϕ is a logical consequence of Γ.
4.5 Inference rules
Propositional logic rules can be generalized to the case of FOL: all inference rules that are used in PL
can be used in FOL, too. Moreover, new rules are introduced to perform inferences from formulas
with quantifiers.
All logical equivalences
¬∀x¬ϕ ≡ ∃xϕ
etc.
Universal instantiation
∀xϕ(x)
ϕ(k)
where k is a constant
Existential instantiation
∃xϕ(x)
ϕ(h)
where h is a new constant, not previously used in any formula of the KB
Existential generalization
ϕ(k)
where k is a constant
∃xϕ(x)
Resolution
Again, due to its greater complexity, resolution is presented separately. First, we need to introduce
some basic concepts that support the definition of this inference rule in the context of FOL.
Substitution
A substitution is not an inference rule, but a technique to manipulate a formula. More concretely, a
substitution is a finite set of term/variable pairs:
s = {ti / vi}.
Each term ti is meant to replace the corresponding variable vi in the formula the substitution is applied to.
For instance, if
s = {C / x, A / y},
then we have that:
[P(x, f(y), B)]s = P(C, f(A), B).
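As a sketch of how a substitution acts, terms and atomic formulas can be represented as nested tuples, with (by a convention of this sketch only) lowercase strings as variables and everything else as constants or symbols:

```python
def apply_subst(s, t):
    """Apply substitution s = {variable: term} to a term or atomic formula."""
    if isinstance(t, tuple):                  # compound: function or predicate
        return (t[0],) + tuple(apply_subst(s, a) for a in t[1:])
    return s.get(t, t) if t.islower() else t  # variable: substitute if bound

s = {'x': 'C', 'y': 'A'}
print(apply_subst(s, ('P', 'x', ('f', 'y'), 'B')))
# i.e. [P(x, f(y), B)]s = P(C, f(A), B)
```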
Unification
Two formulas α and β are unified by a substitution s when
[α]s = [β]s,
and s is said to be a unifier of α and β. For example,
α = P(x, f(y), B)
β = P(x, f(B), B)
are unified by
s’ = {A/x, B/y}.
However, s’ is not the only unifier of α and β, and not even the simplest one. We call most general
unifier (MGU) of two formulas α and β a substitution s such that:
- s unifies α and β, and
- for any other unifier s’ there is a substitution s* such that [α]s’ = [[α]s]s*.
The MGU of α and β in the example above is s = {B/y}.
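A most general unifier can be computed mechanically. The sketch below follows the spirit of Robinson's unification algorithm on the same tuple representation (lowercase strings as variables, a convention of this sketch only), including the occurs-check:

```python
def apply_subst(s, t):
    """Apply substitution s to term t, following binding chains."""
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(s, a) for a in t[1:])
    return apply_subst(s, s[t]) if (t.islower() and t in s) else t

def occurs(x, t):
    """Occurs-check: does variable x appear inside term t?"""
    if isinstance(t, tuple):
        return any(occurs(x, a) for a in t[1:])
    return x == t

def mgu(a, b, s=None):
    """Most general unifier of a and b, or None if they do not unify."""
    s = {} if s is None else s
    a, b = apply_subst(s, a), apply_subst(s, b)
    if a == b:
        return s
    if isinstance(a, str) and a.islower():            # a is a variable
        return None if occurs(a, b) else {**s, a: b}
    if isinstance(b, str) and b.islower():            # b is a variable
        return mgu(b, a, s)
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] and len(a) == len(b):
        for u, v in zip(a[1:], b[1:]):                # unify argument by argument
            s = mgu(u, v, s)
            if s is None:
                return None
        return s
    return None  # clash: different symbols, or constant vs compound

alpha = ('P', 'x', ('f', 'y'), 'B')
beta  = ('P', 'x', ('f', 'B'), 'B')
print(mgu(alpha, beta))  # {'y': 'B'}, the MGU of the example above
```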
Like in PL, we call clauses those formulas that are disjunctions of literals, and apply resolution to
pairs of clauses (γ1, γ2) provided the following conditions hold:
- γ1 includes a literal φ, and
- γ2 includes a literal ¬ψ such that
- φ and ψ have an MGU s.
In that case, we can infer by resolution a clause γ3, obtained by applying substitution s to a clause that
includes all the literals from γ1 and γ2, except for φ and ¬ψ:
γ1 (with φ)
γ2 (with ¬ψ)
[φ]s = [ψ]s (s is an MGU)
[γ1–{φ} ∨ γ2 –{¬ψ}]s
For example, from P(x)∨Q(f(x)) and R(g(y))∨¬Q(f(a)) we can infer, through the MGU {a/x}, the
clause P(a)∨R(g(y)).
How to transform FOL formulas into clauses
As we have already seen in PL, resolution calls for a mechanism to transform general formulas into
clauses. It can be shown (although we will not do it here) that any FOL formula, too, can be
transformed into an equisatisfiable set of clauses. The procedure is very similar to the one shown in the case of PL,
with additional steps to deal with variables and quantifiers, as illustrated in the example below:
∀x∀y(P(x)∧Q(x) → R(x,y)) → ∀x∃yS(x,y)
1) ↔, → elimination
¬(∀x∀y(¬(P(x)∧Q(x))∨R(x,y))) ∨ ∀x∃yS(x,y)
2) Moving ¬ inwards: all negations should be shifted inwards, by means of double negation
elimination, the De Morgan laws, and the logical equivalence ¬∀xP(x) ≡ ∃x¬P(x), so as to negate only
atomic sentences.
∃x∃y(P(x)∧Q(x)∧¬R(x,y)) ∨ ∀x∃yS(x,y)
3) Variable standardization: variables that are distinct but are represented with the same letters must
be rewritten to avoid confusion.
∃u∃v(P(u)∧Q(v)∧¬R(u,v)) ∨ ∀x∃yS(x,y)
4) ∃ elimination: existential quantifiers that do not lie within the scope of a universal quantifier are
eliminated by means of existential instantiation, whereas for those within the scope of a universal
quantifier the relevant variable must be substituted with a Skolem function.
The basic idea is that if for every x there exists a y, then such a correspondence can be modeled as
a function: ∀x∃yS(x,y) becomes ∀xS(x,f(x)). The Skolem function’s arguments are the
universally quantified variables within whose scope the existential quantifier lies,
so that ∀x1∀x2∃yT(x1,x2,y) would become ∀x1∀x2T(x1,x2,g(x1,x2)). This process is called
Skolemization, and it prescribes the introduction of a new function symbol for each ∃ elimination.
P(k)∧Q(h)∧¬R(k,h) ∨ ∀xS(x,f(x))
5) Prenex normal form: all the remaining (universal) quantifiers must be moved to the beginning of
the formula which then consists of a sequence of quantifiers followed by a quantifier-free formula,
called matrix.
∀x (P(k)∧Q(h)∧¬R(k,h) ∨ S(x,f(x)))
6) Conjunctive normal form: by means of the distributivity properties, transform the matrix into
conjunctive normal form, that is, a conjunction of disjunctions.
∀x ((P(k)∨S(x,f(x))) ∧ (Q(h)∨S(x,f(x))) ∧ (¬R(k,h)∨S(x,f(x))))
7) ∀ omission: all variables are now universally quantified and, by keeping this fact in mind, we can
omit the universal quantifiers in the beginning of the formula.
(P(k)∨S(x,f(x))) ∧ (Q(h)∨S(x,f(x))) ∧ (¬R(k,h)∨S(x,f(x)))
8) ∧ elimination: decompose the conjunctive normal form into single clauses.
P(k)∨S(x,f(x))
Q(h)∨S(x,f(x))
¬R(k,h)∨S(x,f(x))
All the considerations on soundness, refutation completeness, and strategies that have been illustrated
for resolution in PL can be straightforwardly extended to FOL.
5. From Natural Language to Logic
Let us get back to why logic was introduced in the first place: to formalize natural language and make
arguments convincing. Today, logic has a narrower scope and, instead of scientific and metaphysical
truths, is solely focused on the validity of the deductions that we perform following logical rules.
Hence, the following example expresses facts that are false, that is, without any correspondence with
the reality we know, but it is considered a valid argument from a logical point of view:
All pigs fly.
Anything which flies talks.
So, all pigs talk.
Nevertheless, the need for formalization still exists.
5.1 Formalization of statements into PL formulas
To analyze deductions and to distinguish the valid ones, logic adopts the approach of formalizing
them, that is, expressing them as sequences of symbols which are built following specific rules. Here
we adopt the simplest set of rules, provided by Propositional Logic (PL). Since statements in natural
languages can be characterized by quite complex structures, the first step of the process is to determine
their components.
Rule 1: a simple statement is symbolized by a single propositional symbol.
Example:
War is cruel, and everybody knows it.
So, war is cruel.
If we represent each simple statement with a symbol, we obtain:
A, and B.
So, C.
We can see that we lost something in the formalization.
Rule 2: if a simple statement type occurs more than once in an argument, then each token of that type
must be represented by the same propositional symbol everywhere it occurs.
If we follow Rule 2, we obtain:
A, and B.
So, A.
Although not very significant, the deduction is now valid. Let us see another example, which shows us
that Rule 2 needs some improvement.
If John goes to the store, then he can buy what he wants.
John has gone to the store.
So, John can buy what he wants.
This is what we get if we apply Rule 2 in a strict way:
If A, then B.
C.
So, D.
Those sentences, which seem to be strongly related in the natural language, end up being represented
by different propositional symbols, and, hence, appear to be completely independent in the formalized
version, so that the deduction loses any validity. Different symbols were chosen on the basis of the
hairsplitting differences between ‘goes’ and ‘has gone’, and ‘John’ and ‘he’. It is true that the action
of going is described with different tenses in the two sentences, and that we have referred to the same
person in different ways, that is, with his name and with a pronoun, but it was clear from the context
that the time of the action is not important for the deduction, and that there is only one person, John,
involved. Let us introduce a modified version of the rule.
Rule 2’: if two distinct simple statement types can be properly treated as paraphrases of each other,
relative to the context of an argument in which they occur, then each should be symbolized by the
same statement letter type.
If we apply Rule 2’, we get what we expect from a correct formalization:
If A, then B.
A.
So, B.
In formalizing the previous examples, some parts of the complex statements (i.e. the simple
statements) were substituted by appropriate statement letters, whereas other parts (i.e. if, then, so, and)
were left in natural language. Our next step is to formalize these connective parts into logical
operators.
5.2 Logical operators: Conjunction
Obviously, conjunction (∧) is the logical operator with which we formalize conjunctions in natural
language.
John went to the store (J), and Bill stayed at home (B).
J∧B
We do not use the logical operator of conjunction only when we have an ‘and’ in the sentence we are
going to formalize. The following sentences are all formalizable with a conjunction.
John went to the store, but Bill stayed at home.
John went to the store, while Bill stayed at home.
John went to the store, even though Bill stayed at home.
John went to the store, though Bill stayed at home.
John went to the store, yet Bill stayed at home.
John went to the store; however, Bill stayed at home.
John went to the store; nevertheless, Bill stayed at home.
John went to the store; Bill stayed at home.
Let us add one more example, which is apparently similar, but actually different from the others.
John went to the store because Bill stayed at home.
Imagine a situation in which Bill stays at home, and the only reason why John went to the store is
because he needed some milk. In this case, both J and B are true, but J because B is false, which
shows that we cannot formalize a complex sentence built with a ‘because’ with a conjunction.
In fact, a statement is a logical conjunction just in case from the truth of its component statements its
truth follows, and from its truth, the truth of its component statements follows.
The truth value of the sentence with ‘because’ derives from more than just the truth of its components:
it also needs a causal relation to hold between them.
We can find an ‘and’ in several points of a statement.
John cut and raked the lawn
(conjunct verbs)
John was and will be here
(conjunct auxiliary verbs)
John speaks quickly and quietly
(conjunct adverbs)
John cut the grass and the hedge
(conjunct objects)
John and Bill are tall
(conjunct subjects)
When can we decompose statements into conjunct simple statements, like in this example?
John speaks quickly and quietly
John speaks quickly and John speaks quietly
We have to keep the definition of logical conjunction in mind: if from the truth of the complex
statement follows the truth of both simple statements, and if from the truth of both simple statements
follows the truth of the complex statement, then we can formalize the complex statement as a
conjunction.
Let us examine another example.
Romeo and Juliet are lovers.
Let us try to decompose this statement into:
Romeo is a lover, and
Juliet is a lover.
From the truth of the complex statement the truth of the simple statements indeed follows, but not vice
versa, because the complex statement says more: the fact that Romeo and Juliet love each other is lost
when we split the statement.
Here are some other examples that cannot be formalized as logical conjunctions.
The red and blue flag is on the pole
(compounds)
It’s raining cats and dogs
(idioms)
Some planes stop in Chicago and Detroit
(quantifiers)
5.3 Logical operators: Negation
There are several ways in which negation is expressed in natural language:
It is not the case that Dante is Shakespeare
Dishonesty is not morally acceptable
This steak doesn’t taste good
It is false that you’re taller than me
Paris is unbeatable
It’s impossible to save some money nowadays
A statement T is a logical negation just in case it is analyzable into a component statement S such that
T is true if and only if S is false. In such a case, we can write T as ¬S.
In particular, the previous examples are logical negations of the following statements.
Dante is Shakespeare
Dishonesty is morally acceptable
This steak tastes good
You’re taller than me
Paris is beatable
It’s possible to save money nowadays
It is important to remark that some sentences that appear to be negations are sometimes only
contraries, that is, they cannot be true at the same time, but they do not satisfy the definition of logical
negation. Let us check the following examples.
None of John’s friends came
Every one of John’s friends came
The former is not a negation of the latter. In fact, although they cannot be true at the same time, they
can be both false (e.g. when a few of his friends came).
Actually, the negation of ‘every one’ is not ‘no one’, but ‘not every one’.
‘No one’ is indeed the negation of ‘someone’: it means ‘not someone’.
Analogously, ‘nothing’ means ‘not something’.
‘Nowhere’ means ‘not somewhere’.
‘Nobody’ means ‘not somebody’.
‘Never’ means ‘not ever’.
Still, we always have to keep an eye on the context, because, in particular cases, the previous
equivalences might not hold. For example, if the focal stress is on the word ‘somebody’, the meaning
of
It is not the case that somebody loves me
could be
Many people love me
and not
Nobody loves me.
With more than one type of logical operators involved, we may feel the need for some groupers to
formalize the sentences.
It is not true that Bill is tall and Frank is tall
may be interpreted in two different ways, depending on whether we consider Frank’s tallness being
negated or not. In logic, we can disambiguate by means of parentheses:
¬(B ∧ F)
¬B ∧ F
A word that carries a negative connotation in natural language is ‘without’.
I will do logic, without taking the tests
means:
I will do logic, but I won’t take the tests
which we can formalize as
L ∧ ¬T
This translation, though, does not always work: it is possible to have ‘without’, without negation.
Without going to the store, John can’t eat dinner tonight
This is an example of ‘without’ that cannot be translated with a conjunction and a negation like
¬S ∧ ¬E
because the formula above means that John won’t go to the store and he can’t eat dinner tonight.
These cases are better tackled with ‘if, then’ statements (more on that later).
Negation inside and outside quantifiers, psychological and perceptual verbs, and adverbs can yield
different meanings. The following pairs of sentences do not mean the same thing.
It’s false that some dogs are fuzzy / Some dogs are not fuzzy
John didn’t see Mary leave / John saw Mary not leave
It’s not true that John is particularly competent / John is particularly incompetent
5.4 Logical operators: Disjunction
In English, we create disjunctions by conjoining statements with the expressions ‘Either, or’ or with
‘or’ alone. Here are some examples:
Tom cut the lawn, or Bill cut it
Either Bill left for the gym, or he left for the movies
Either Tom cut the lawn, or he cut the hedge
We can find an ‘either, or’ in several points of a statement.
Tom cut or raked the lawn
(disjunct verbs)
Tom was or will be singing
(disjunct auxiliary verbs)
Tom speaks quickly or quietly
(disjunct adverbs)
Tom cut the lawn or the hedge
(disjunct objects)
Either Tom or Bill won
(disjunct subjects)
Either the red or blue flag is gone
(disjunct adjectives)
More formally, a statement S is a logical disjunction just in case it is analyzable into components T
and U such that from the truth of either T or U the truth of S follows, and vice versa.
As we already know, logical disjunction is symbolized like this:
T∨U
If ‘either…or’ is formalized with a disjunction, ‘neither…nor’ is the negation of a disjunction:
Neither the supermarket nor the shop was open
becomes
¬ (M ∨ S)
not to be confused with a disjunction of negations like
¬M ∨ ¬S
which corresponds to
Either the supermarket wasn’t open or the shop wasn’t open
Again, groupers can be needed to disambiguate:
Mary left and Bill stayed or Julie stayed
is ambiguous because it can be formalized in two different ways:
M ∧ (B ∨ J), or
(M ∧ B) ∨ J
‘Either’ can be helpful to disambiguate in natural language: the two formulas above are the
formalizations of the following sentences, respectively:
Mary left and either Bill stayed or Julie stayed
Either Mary left and Bill stayed or Julie stayed
You may have heard of an exclusive-or operator, which behaves like a disjunction except in the case in
which both components are true: the exclusive-or is then false. Many, in the history of logic, have
proposed it to formalize sentences like:
Today is either Monday or Tuesday
Jack will be with either Bill or Susan today
However, some considerations may make us question the utility of such an operator. In the first
example, in any context in which the terms ‘Monday’ and ‘Tuesday’ are used in the traditional way,
there is no possibility for the two sentences ‘Today is Monday’ and ‘Today is Tuesday’ to be true, so
that a case in which the exclusive-or yields a false value out of two true components is not pertinent.
In the second example, we would not feel inclined to say that the speaker of the sentence has asserted
something false, in case Jack is with both Bill and Susan. Rather than introducing another operator, we
invite the reader to specify, when needed, that the traditional inclusive-or is accompanied by an
explicit negation of the possibility that both disjunct components are true:
(B ∨ S) ∧ ¬ (B ∧ S)
Jack will be with either Bill or Susan today, but not with both of them
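That the suggested formula behaves exactly like an exclusive-or can be verified over all four combinations of truth values; a quick sketch (not part of the text):

```python
from itertools import product

# Check (B ∨ S) ∧ ¬(B ∧ S) against exclusive-or on every row of the truth table.
for B, S in product([True, False], repeat=2):
    suggested = (B or S) and not (B and S)
    assert suggested == (B != S)  # "!=" on booleans is exclusive-or in Python
print('(B ∨ S) ∧ ¬(B ∧ S) coincides with exclusive-or')
```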
It is very important to be aware of some particular cases, in which the ‘or’ in natural language
sentences does not correspond to a logical disjunction. Let us focus on this example:
Tom wants a cat or a dog
It may look straightforward to model such a sentence with
C∨D
but let us look at the components of this disjunction:
Tom wants a cat
Tom wants a dog
Can we be sure about the truthfulness of either sentence? Imagine a situation in which Tom wants a
pet, a cat or a dog, but is completely neutral about which one: when asked whether it is true that he
wants a cat, we can imagine Tom’s doubt in answering. We can be sure that he would answer ‘no’
when asked whether he wants a cat and he wants a dog, which goes against the definition of
disjunction. Hence, we should model the initial sentence with a single propositional symbol instead of a disjunction.
Another critical example is the following:
Karen weighs less than either Alice or Betty
In this case, ‘either…or’ means that we can compare Karen with either of the two other girls, and Karen
would always turn out to be the lighter one. Hence, although there is an ‘or’ in the sentence, the best way
to formalize it is to use a conjunction.
5.5 Logical operators: Conditional
Conditionals are found in English complex sentences consisting of two components linked by the
connectives ‘if, then’, like in
If John stays, then Mary leaves
We call antecedent the component that follows the ‘if’ and precedes the ‘then’, whereas the
component that follows the ‘then’ is called consequent. The ‘if, then’ connectives are not necessary to
have conditionals in natural language: all the following examples are conditionals.
If John stays, Mary leaves
Provided that John stays, Mary leaves
Mary leaves provided that John stays
Should John stay, Mary leaves
Mary leaves should John stay
When John stays, Mary leaves
There is not even the need for two self-standing statements to create a conditional:
John’s staying will result in Mary’s leaving
A statement M is a logical conditional just in case it can be analyzed into two components, S and T,
where S (the antecedent) is a condition for the truth of T (the consequent), and M as a whole is false
when S is true and T is false, and true in any other case. We symbolize the conditional like this:
S→T
Some logicians have argued against this definition, claiming that the conditional should not be
assigned any truth-value when the antecedent is false, but formulas with undefined truth-values cannot
be taken into account in the logical framework we have been illustrating so far, so we will adopt the
mainstream position, according to which conditionals with a false antecedent are true.
The antecedent is also called sufficient condition with respect to the consequent, whereas the
consequent is a necessary condition for the antecedent. In fact, if we have that
Attending the class is necessary for doing well
then, the fact that we are doing well (W) means that we have been attending (A), that is, if we are doing
well, then we have been attending:
W→A
or, in other words, the fact that we are doing well is sufficient to state that it is true that we have been
attending.
‘Only if’ introduces necessary conditions in natural language:
Patty will pass the exam only if she studies hard
Only if Tom is president will Jack be vice-president
are respectively formalized into
P→S
J→T
Although equivalent from a logical point of view, conditionals expressed with a focus on the sufficient
condition or on the necessary condition sound different when stated in natural language, because some
non-logical aspects are taken into account.
My pulse goes up only if I am doing heavy exercise
If my pulse goes up, then I am doing heavy exercise
These two sentences are perfectly equivalent from a logical point of view, but the latter sounds rather
odd, because it describes the situation in a way that we are not accustomed to hearing.
The word ‘unless’ deserves special attention in this context. Let us focus on this rather macabre
example:
James will die unless he is operated on
If we intend to model it with a conditional, we need to clearly understand what the necessary and
sufficient conditions are. The correct interpretation is that James will die for sure if he is not operated
on, so that the lack of an operation is a sufficient condition for James’ death:
¬O → D
Be aware of the fact that the operation is not a sufficient condition for James’ not dying. In general,
statements of the form
A unless B
are to be formalized as
¬B → A
Other natural language expressions that might make you think of a conditional should not be
formalized in this way. ‘Since’ and ‘because’ are two examples.
I’ll leave the window open since it is not raining
I’ll leave the window open because it is not raining
These two sentences do not contain logical conditionals. The difference with a real conditional
I’ll leave the window open if it is not raining
lies in the fact that ‘since’ and ‘because’ introduce what is taken as a true sentence: “since it is not
raining” presupposes that it is not raining, whereas “if it is not raining” does not.
Other expressions, like ‘if and only if’, ‘just in case’ or ‘when and only when’ introduce two
conditionals at the same time: each component is at the same time a necessary and sufficient condition
for the other.
You will get an ‘A’ just in case you work hard
is to be formalized with a conjunction of conditionals, or with the relevant biconditional abbreviation:
(A → W) ∧ (W → A)
A↔W
One fact that differentiates the conditional from the other operators is that we need groupers even
when there is only one type of operator in the sentence. For instance, with sequences of conjunctions,
we do not need any parenthesis because the following formulas are all equivalent:
(A ∧ B) ∧ C
A ∧ (B ∧ C)
A∧B∧C
This does not hold for conditionals:
(A → B) → C
A → (B → C)
are not equivalent, in that, if A is false, B true, and C false, the first formula is false, whereas the
second is true.
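This counterexample can be checked mechanically; a quick sketch (not part of the text), where `imp` encodes the material conditional as "not p or q":

```python
imp = lambda p, q: (not p) or q  # material conditional

# With A false, B true, C false, the two groupings disagree.
A, B, C = False, True, False
print(imp(imp(A, B), C))  # (A → B) → C : False
print(imp(A, imp(B, C)))  # A → (B → C) : True
```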
5.6 Formalization of statements into FOL formulas
First Order Logic is also known as Relational Predicate Logic, because it expresses relations between
entities in the form of so-called predicates.
Entities can come from any domain we intend to talk about in a formal way, and the fact that we can
point them out shows the increase in expressivity that we obtain by switching from Propositional
Logic, which only allows for the formalization of propositions/facts that are true or false, to First
Order Logic.
Let us start from a simple example:
John loves Mary
The formalization of a sentence into FOL starts with determining how many entities there are that stand
in a relation. We call the entities in a relation relata.
Rule 1: Determine the number of relata.
In the example above, it is natural to see John and Mary as relata, being related to each other because
of the feeling that John entertains for Mary.
The relation between the relata is formalized into a predicate, which we indicate with a string, always
starting with a capital letter, followed by a list of terms, referring to the relata. The list is usually
shown between parentheses and the terms are separated by commas.
Loves(John, Mary)
We adopt the convention by which we enumerate the terms in the same order as the relevant relata
appear in the original sentence.
Rule 2: Determine the order of relata.
Be careful in picking the right predicates with the right number of relata. Check these examples:
The truck parked next to the red car and the blue car
(a)
The truck parked between the red car and the blue car
(b)
Do the two verbs in the examples correspond to the same predicate? The answer is no.
“Parking next to” looks like an action that has two relata, whereas “parking between” has three:
PN(-,-)
PB(-,-,-).
Our tendency to formalize them with different predicates is confirmed by the following experiment: let
us substitute the conjunction with a disjunction.
The truck parked next to the red car or the blue car
(a*)
The truck parked between the red car or the blue car
(b*)
Sentence a* sounds perfectly fine, while sentence b* does not make sense, which tells us that the ‘and’
in sentence a is a real conjunction uniting two simpler sentences that could also be connected by
means of a disjunction, whereas the ‘and’ in sentence b is part of the relation that we model with a
predicate and, hence, cannot be substituted by an ‘or’.
In natural language, verbs can be in a passive voice. For example:
Mike was called by Ted
which we know to be equivalent to:
Ted called Mike
The problem with having two types of voice (active, passive) is that once we formalize the sentences,
we may have:
C(M,T)
and
C(T,M)
which we cannot distinguish from “Mike called Ted” and “Ted called Mike”.
We could introduce two different predicates and have:
C_P(M,T)
and
C_A(T,M)
but there is still a problem. Although equivalent, the two sentences cannot be proven to be deducible
from each other, unless we add specific axioms. Hence, let us follow the rule:
Rule 3: Turn passive statements into active ones.
FOL enables us to talk about entities, and we use constants for specific entities (e.g. John, Mary, that
table), and variables for less determined entities (e.g. a table, somebody).
Quantifiers allow us to make universal (all) and existential (some) statements about generic entities.
Variables fix the place that a quantifier binds in a statement.
Check the following examples:
John hit something, and something broke down
John hit something, and it broke down
These two sentences mean different things, and also the formalization with an existential quantifier
reflects that:
∃x H(J,x) ∧ ∃y BD(y)
∃x (H(J,x) ∧ BD(x))
Watch out: if you forget the parentheses in the latter formula, the scope of the existential quantifier
extends only to the first predicate:
∃x H(J,x) ∧ BD(x)
The second x is an unbound variable. A formula with an unbound variable cannot be assigned a
truth-value, because its truth depends on what entity the variable refers to.
Rule 4: Avoid unbound variables.
How to deal with quantifications in sentences
There’s no fixed set of rules to follow, but this is a start.
All Franciscans are saintly.
The verb that we formalize in the form of a predicate is “to be saintly”, and the entity that is the
subject of such a predicate is “all Franciscans”.
Saintly(All.Franciscans)
We turn statements with such a structure into universally quantified conditionals:
∀x(Franciscan(x) → Saintly(x))
We use the same procedure also for more complex sentences:
All Franciscans are saintly and wise.
Saintly(All.Franciscans) ∧ Wise(All.Franciscans)
∀x(Franciscan(x) → (Saintly(x) ∧ Wise(x)))
Rule 5: Sentences with universal quantifications like “All Fs are S” become universally quantified
conditionals.
There is a different rule for existential quantifications:
Some Franciscans are learned.
Learned(Some.Franciscan)
∃x(Franciscan(x) ∧ Learned(x))
Some Franciscans love to read or are learned.
LoveToRead(Some.Franciscan) ∨ Learned(Some.Franciscan)
∃x(Franciscan(x) ∧ (LoveToRead(x) ∨ Learned(x)))
Rule 6: Sentences with existential quantifications like “Some Fs are L” become existentially quantified
conjunctions.
Watch out: if you mix up the rules and obtain an existentially quantified conditional, something went
wrong.
∃x(Franciscan(x) → Learned(x))
Such a formula is not very meaningful, in that it is satisfied in any domain in which there exists at least
one entity that is not a Franciscan.
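Rules 5 and 6, and the pitfall just described, can be read off a toy finite domain; the sets below are hypothetical stand-ins for the predicates (a sketch, not part of the text):

```python
imp = lambda p, q: (not p) or q  # material conditional

domain = ['francis', 'clare', 'dante']
franciscan = {'francis', 'clare'}
saintly = {'francis', 'clare'}
learned = {'clare'}

# Rule 5: ∀x(Franciscan(x) → Saintly(x)) -- a universally quantified conditional
print(all(imp(x in franciscan, x in saintly) for x in domain))    # True
# Rule 6: ∃x(Franciscan(x) ∧ Learned(x)) -- an existentially quantified conjunction
print(any((x in franciscan) and (x in learned) for x in domain))  # True
# Mixed-up version: ∃x(Franciscan(x) → Learned(x)) is trivially true here,
# just because 'dante' is not a Franciscan
print(any(imp(x in franciscan, x in learned) for x in domain))    # True
```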
Let’s try this procedure with a more complex example:
Every man gave every woman some toy.
First of all, we recognize that there is an action of giving with 3 relata: the giver, the receiver, and the
given.
Give(-,-,-)
Then, we see that the 3 relata are universally, universally, and existentially quantified, respectively.
Give(All.Man, All.Woman, Some.Toy)
We follow the order in the sentence and start from the left with the procedure.
∀x(Man(x) → Give(x, All.Woman, Some.Toy))
∀x(Man(x) → ∀y(Woman(y) → Give(x,y,Some.Toy)))
∀x(Man(x) → ∀y(Woman(y) → ∃z(Toy(z) ∧ Give(x,y,z))))
We can move the universal quantifiers (provided that the variable names do not cause confusion) to
the beginning of the formula.
∀x∀y(Man(x) → (Woman(y) → ∃z(Toy(z) ∧ Give(x,y,z))))
We can rewrite the formula into:
∀x∀y(Man(x) ∧ Woman(y) → ∃z(Toy(z) ∧ Give(x,y,z)))
provided that we prove that
P → (Q → R)
is equivalent to
P ∧ Q → R.
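The required equivalence does hold, as a truth table over the eight assignments confirms; a quick sketch (not part of the text):

```python
from itertools import product

imp = lambda p, q: (not p) or q  # material conditional

# P → (Q → R) agrees with (P ∧ Q) → R under every assignment.
assert all(
    imp(P, imp(Q, R)) == imp(P and Q, R)
    for P, Q, R in product([True, False], repeat=3)
)
print('P → (Q → R) is equivalent to (P ∧ Q) → R')
```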