MATH20302 Propositional Logic
Mike Prest
School of Mathematics
Alan Turing Building
Room 1.120
[email protected]
April 10, 2015
Contents
I Propositional Logic
1 Propositional languages
1.1 Propositional terms
1.2 Valuations
1.3 Beth trees
1.4 Normal forms
1.6 Interpolation
2 Deductive systems
2.1 A Hilbert-style system for propositional logic
2.1.1 Soundness
2.1.2 Completeness
2.2 A natural deduction system for propositional logic
II Predicate Logic
3 A brief introduction to predicate logic: languages and structures
3.1 Predicate languages
3.2 The basic language
3.3 Enriching the language
3.4 L-structures
3.5 Some basic examples
3.6 Definable Sets
If you come across any typos or errors, here or in the examples/solutions,
please let me know of them.
Introduction: The domain of logic
By logic I mean either propositional logic (the logic of combining statements)
or first-order predicate logic (a logic which can be used for constructing statements). This course is mostly about the former; we will, however, spend some
time on predicate logic in the later part of the course. In any case, propositional
logic is a part of predicate logic so we must begin with it. Predicate Logic is
dealt with thoroughly in the 3rd/4th-year course by that title; other natural
follow-on courses from this one are Model Theory and Non-Standard Logics.
Propositional logic can be seen as expressing the most basic “laws of thought”
which are used not just in mathematics but also in everyday discourse. Predicate logic, which can also be thought of as “the logic of quantifiers”, is strong
enough to express essentially all formal mathematical argument.
Most of the examples that we will use are taken from mathematics but we do
use natural language examples to illustrate some of the basic ideas. The natural
language examples will be rather “bare”, reflecting the fact that these formal
languages can capture only a small part of the meanings and nuances of ordinary
language. There are logics which capture more of natural language (modality,
uncertainty, etc.) though these have had little impact within mathematics itself
(as opposed to within philosophy and computer science), because predicate logic
is already enough for expressing the results of mathematical thinking.1
1 One should be clear on the distinction between the formal expression of mathematics
(which is as precise and as formal as one wishes it to be) and the process of mathematical
thinking and informal communication of mathematics (which uses mental imagery and all the
usual devices of human communication).
Part I
Propositional Logic
Chapter 1
Propositional languages
1.1 Propositional terms
Propositional logic is the logic of combining already formed statements. It
begins with careful and completely unambiguous descriptions of how to use the
“propositional connectives” which are “and”, “or”, “not”, “implies”. But first
we should be clear on what is meant by a “statement” (the words “assertion”
and “proposition” will be used interchangeably with “statement”).
The distinguishing feature of a statement is that it is either true or false.
“The moon is made of cheese” is a (false) statement and “1 + 1 = 2” is a (true,
essentially by definition) statement. Fortunately, in order to deal with the logic
of statements, we do not need to know whether a given statement is true or false:
it might not be immediately obvious whether “113456 × 65421 = 880459536”
is true or false but certainly it is a statement. A more interesting example is
“There are infinitely many prime pairs.” where by a prime pair we mean a pair,
p, p + 2, of numbers, two apart, where both are prime (for instance 3 and 5 form
a prime pair, as do 17 and 19 but not 19 and 21). It is a remarkable fact that,
to date, no-one has been able to determine whether this statement is true or
false. Yet it is surely1 either false (after some large enough number there are no
more prime pairs) or true (given any prime pair there is always a larger prime
pair somewhere out there).
On the other hand, the following are not statements.
“Is 7 a prime number?”
The first is a question, the second a command.
“x is a prime number.”:
is this a statement? The answer is, “It depends.”: if the context is such that
x already has been given a value then it will be a statement (since then either
x is a prime number or it is not) but otherwise, if no value (or other sufficient
information) has been assigned to x then it is not a statement.
Here’s a silly example (where we can’t tell whether something is a statement
or not). Set x = 7 if there are infinitely many prime pairs but leave the value of
x unassigned if there are not. Is “x is a prime number” a statement? Answer:
(to date) we can’t tell! But this example is silly (the context we have set up is
highly artificial) and quite off the path of what we will be doing.
1 There are some issues there but they are more philosophical than mathematical.
When we discuss mathematical properties of, for instance, numbers, we use
variables, x, y to stand for these numbers. This allows us to make general
assertions. So we can say “for all integers x, y we have x + y = y + x” instead
of listing all the examples of this assertion: ..., “0 + 1 = 1 + 0”, “1 + 1 = 1 + 1”,
..., “2 + 5 = 5 + 2”, ... (not that we could list all the assertions covered by this
general assertion, since there are infinitely many of them). In the same way,
although we will use particular statements as examples, most of the time we use
variables p, q, r, s, t to stand for statements in order that we may make general
assertions.2
As indicated already, propositional logic is the logic of “and”, “or”, “not”,
“implies” (as well as “iff” and other combinations of the connectives). The
words in quotes are propositional connectives: they operate on propositions
(and propositional variables) to give new propositions.
Initially we define these connectives somewhat informally in order to emphasise their intuitive meaning. Then we give their exact definition after we have
been more precise about the context and have introduced the idea of (truth)
valuation.
First, notation: we write ∧ for “and”; ∨ for “or”, ¬ for “not”, → for “implies”
and ↔ for “iff”. So if p is the proposition “the moon is made of cheese” and
q is the proposition “mice like cheese” then p ∧ q, p ∨ q, ¬p, p → q, p ↔ q
respectively may be read as “the moon is made of cheese and mice like cheese”,
“the moon is made of cheese or mice like cheese”, “the moon is not made of
cheese”, “if the moon is made of cheese then mice like cheese” and “the moon
is made of cheese iff mice like cheese”.
A crucial observation is that the truth value (true or false) of a statement
obtained by using these connectives only depends on the truth values of the
“component propositions”. Check through the examples given to see if you agree
(you might have some doubts about the last two examples: we could discuss
these). For another example, you may not know whether or not the following
are true statements: “the third homology group of the torus is trivial”, “every
artinian unital ring is noetherian” but you know that the combined statement
“the third homology group of the torus is trivial and every artinian unital ring
is noetherian” is true exactly if each of the separate statements is true. That
is why it makes sense to apply these propositional connectives to propositional
variables as well as to propositions.
So now the formal definition.
We start with a collection, p, q, r, p_0, p_1, ... of symbols which we call
propositional variables. Then we define, by induction, the propositional
terms by the following clauses:
(0) every propositional variable is a propositional term;
(i) if s and t are propositional terms then so are: s ∧ t, s ∨ t, ¬s, s → t,
s ↔ t;
(ii) that’s it (more formally, there are no propositional terms other than
those which are so by virtue of the first two clauses).
The terms seen in (i) are called, respectively: the conjunction, s ∧ t, of s and t; the
disjunction, s ∨ t, of s and t; the negation, ¬s, of s; an implication, s → t; and
a biimplication, s ↔ t.
2 You might notice that in this paragraph I assigned different uses to the words “assertion”
and “statement” (although earlier I said that I would use these interchangeably). This is because I was making statements about statements. That can be confusing, so I used “assertion”
for the first (more general, “meta”, “higher”) type of use and “statement” for the second type
of use. In logic we make statements about statements (and even statements about statements
which are themselves statements about statements ...).
Remark: Following the usual convention in mathematics we will use symbols
such as p, q, respectively s, t, not just for individual propositional variables,
respectively propositional terms, but also as variables ranging over propositional
variables, resp. propositional terms, (as we did just above).
The definition above is an inductive one, with (0) being the base case and (i)
the induction step(s) but it’s a more complicated inductive structure than that
which uses the natural numbers as “indexing structure”. For there are many
base cases (any propositional variable), not just one (0 in ordinary induction)
and there are (as given) five types of inductive step, not just one (“add 1” in
ordinary induction).
Example 1.1.1. Start with propositional variables p, q, r; these are propositional
terms by clause (0) and then, by clause (i), so are p ∧ p, p ∧ q, ¬q, q → p for
instance. Then, by clause (i) again, (p ∧ p) ∧ p, (p ∧ q) → ¬r, (q → p) → (q → p)
are further propositional terms. Further applications of clause (i) allow us to
build up more and more complicated propositional terms. So you can see that
these little clauses have large consequences. The last clause (ii) simply says that
every propositional term has to be built up in this way.
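The inductive definition can be mirrored in code. The following is a minimal sketch, assuming a representation that is not part of these notes: a variable is a Python string and a compound term is a nested tuple tagged with the hypothetical names "not", "and", "or", "implies", "iff". The function is_term follows clauses (0), (i) and (ii), and show writes a term with every pair of parentheses present.

```python
# Assumed representation (not from the notes): a variable is a string;
# compound terms are ("not", s) or (op, s, t) with op a binary connective.
SYMBOL = {"and": "∧", "or": "∨", "implies": "→", "iff": "↔"}

def is_term(x):
    """Check the three clauses of the definition of a propositional term."""
    if isinstance(x, str):
        return True                                  # clause (0)
    if isinstance(x, tuple) and len(x) == 2 and x[0] == "not":
        return is_term(x[1])                         # clause (i), negation
    if isinstance(x, tuple) and len(x) == 3 and x[0] in SYMBOL:
        return is_term(x[1]) and is_term(x[2])       # clause (i), binary connectives
    return False                                     # clause (ii): nothing else

def show(t):
    """Write a term as a string with all parentheses included."""
    if isinstance(t, str):
        return t
    if t[0] == "not":
        return "(¬" + show(t[1]) + ")"
    return "(" + show(t[1]) + " " + SYMBOL[t[0]] + " " + show(t[2]) + ")"

# The term (p ∧ q) → ¬r from Example 1.1.1.
example = ("implies", ("and", "p", "q"), ("not", "r"))
print(is_term(example))
print(show(example))
```

Tuples are used only because they are immutable and hashable; any tree-shaped data type would serve equally well.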
Notice how we have to use parentheses to write propositional terms. This is
just like the use in arithmetic and algebra: without parentheses the expression
(−3 + 5) × 4 would read −3 + 5 × 4 and the latter is ambiguous. At least, it
would be if we had not become used to the hierarchy of arithmetical symbols
by which − binds more closely than × and ÷, and those bind more closely than
+ and −. Of course parentheses are still needed but such a hierarchy reduces
the number needed and leads to easier readability. A similar hierarchy is used
for propositional terms, by which ¬ binds more closely than ∧ and ∨, which
bind more closely than → and ↔ (at least those are my conventions, but they
are not universal). Therefore ¬p ∧ q → r means ((¬p) ∧ q) → r rather than
¬(p ∧ q) → r or (¬p) ∧ (q → r) or ¬(p ∧ (q → r)) (at least it does to me; if in
doubt, put in more parentheses).
You will recall that in order to prove results about things which are defined
by induction (on the natural numbers) it is often necessary to use proof by
induction. The same is true here: one deals with the base case (propositional
variables) then the inductive steps. In this case there are five different types of
inductive step but we’ll see later that some of the propositional connectives can
be defined in terms of the others. For instance using ∧ and ¬ (or using → and
¬) we can define all the others. Having made that observation, we then need
only prove the inductive steps for ∧ and ¬ (or for → and ¬). Proofs which follow this inductive
construction are often called “proofs by induction on complexity (of terms)”.
If we wish to be more precise about the set of propositional variables that we
are using then we will let L (“L” for “language”) denote the set of propositional
variables. We also introduce notation for the set of propositional terms built
up from these, namely, set S_0 L = L and, having inductively (on n) defined the
set S_n L we define S_{n+1} L to be the set of all propositional terms which may be
built from S_n L using a single propositional connective and, so as to make this
process cumulative, we also include S_n L in S_{n+1} L. More formally:
S_{n+1} L = S_n L ∪ {(s ∧ t), (s ∨ t), (¬s), (s → t), (s ↔ t) : s, t ∈ S_n L}.
We also set SL = ⋃_{n≥0} S_n L to be the union of all these - the set of all propositional terms (sometimes called sentences, hence the “S” in “SL”) which can be
built up from the chosen base set, L = S_0 L, of propositional variables.3
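To get a feel for how quickly these sets grow, the levels can be generated mechanically. This is only a sketch: it assumes terms are stored as nested tuples tagged with hypothetical names such as "and" and "not" (a representation chosen here for illustration), and the function name next_level is likewise an assumption.

```python
from itertools import product

BINARY = ("and", "or", "implies", "iff")

def next_level(level):
    """Given S_n L as a set of terms, return S_{n+1} L (which contains S_n L)."""
    new = set(level)                                   # cumulative: keep S_n L
    new.update(("not", s) for s in level)              # all (¬s) with s in S_n L
    new.update((op, s, t)                              # all (s ∗ t) with s, t in S_n L
               for op in BINARY
               for s, t in product(level, repeat=2))
    return new

S0 = {"p", "q"}          # L = S_0 L, here with two propositional variables
S1 = next_level(S0)
S2 = next_level(S1)
print(len(S0), len(S1), len(S2))
```

With two variables, S_1 L consists of the 2 variables, their 2 negations and 4 × 4 = 16 binary combinations, so 20 terms in all; the next level already has well over a thousand.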
Notice that we place parentheses around all the propositional terms we build;
we discussed already that leaving these out could give rise to ambiguity in
reading them: was “s ∧ t ∨ u” - a term in S_2 L - built up by applying ∧ to
s, t ∨ u ∈ S_1 L or by applying ∨ to s ∧ t, u ∈ S_1 L, that is, should it be read as
s ∧ (t ∨ u) or as (s ∧ t) ∨ u? In practice we can omit some pairs of parentheses
without losing unique readability but, formally, those pairs are there.
In fact, although intuitively it might at first seem obvious that if we look
at a propositional term in SL, then we can figure out how it was constructed
- that is, there is a unique way of reading it - a bit more thought reveals that
there is an issue: how do we detect the “last” connective in its construction?
Clearly, if we can do that then we can proceed inductively to reconstruct its
“construction tree”. We have been precise in setting things up so we should be
able to prove unique readability (if it is true - which it is, as we show now).
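The question of detecting the “last” connective can be made concrete with a small parser. The sketch below is an illustration under assumptions, not the method of these notes: it assumes the input is a fully parenthesised string, that variables are a lower-case letter optionally followed by digits, and it returns nested tuples with hypothetical tags such as "and". It finds the main binary connective as the first one lying at parenthesis depth 0 inside the outermost pair, which is exactly where counting left and right parentheses comes in.

```python
import re

BINARY = {"∧": "and", "∨": "or", "→": "implies", "↔": "iff"}

def tokenize(text):
    """Split a string into variables, connectives and parentheses (spaces ignored)."""
    return re.findall(r"[a-z]\d*|[()¬∧∨→↔]", text)

def parse(tokens):
    """Rebuild the construction tree of a fully parenthesised term."""
    if len(tokens) == 1 and re.fullmatch(r"[a-z]\d*", tokens[0]):
        return tokens[0]                               # a propositional variable
    if len(tokens) < 3 or tokens[0] != "(" or tokens[-1] != ")":
        raise ValueError("not a fully parenthesised propositional term")
    inner = tokens[1:-1]
    if inner[0] == "¬":                                # the form (¬t)
        return ("not", parse(inner[1:]))
    depth = 0                                          # left minus right parentheses so far
    for i, tok in enumerate(inner):
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        elif tok in BINARY and depth == 0:             # the main connective of (t ∗ u)
            return (BINARY[tok], parse(inner[:i]), parse(inner[i + 1:]))
    raise ValueError("not a fully parenthesised propositional term")

print(parse(tokenize("((p ∧ q) → (¬r))")))
```

Strings with parentheses omitted, such as "s ∧ t ∨ u", are rejected rather than guessed at, which matches the formal convention that all the parentheses are there.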
Theorem 1.1.2. Let s ∈ SL be any propositional term. Then exactly one of
the following is the case:
(a) s is a propositional variable;
(b) s has the form (t ∧ u) for some t, u ∈ SL;
(c) s has the form (t ∨ u) for some t, u ∈ SL;
(d) s has the form (¬t) for some t ∈ SL;
(e) s has the form (t → u) for some t, u ∈ SL;
(f ) s has the form (t ↔ u) for some t, u ∈ SL.
Proof. Every propositional term s does have at least one of the listed forms:
because s ∈ SL it must be that s ∈ S_n L for some n and then, just by the
definitions of S_0 L and S_{n+1} L, s does have such a form. We have to show that
it has a unique such form. For this we introduce two lemmas and the following
definitions: if s ∈ SL then by l(s) we denote the number of left parentheses,
“(”, occurring in s and by r(s) we denote the number of right parentheses, “)”,
occurring in s (for purposes of this definition we count all the parentheses that
should be there).
Lemma 1.1.3. For every propositional term s we have l(s) = r(s).
Proof. This is an example of a proof by induction on complexity/construction
of terms.
If s ∈ S_0 L then l(s) = 0 = r(s) so the result is true if s ∈ S_0 L.
For the induction step, suppose that for every s ∈ S_n L we have l(s) = r(s).
Let s ∈ S_{n+1} L; then either s ∈ S_n L already (and there is nothing to prove), or
there is t ∈ S_n L such that s = (¬t), or there are
t, u ∈ S_n L such that s = (t ∧ u) or (t ∨ u) or (t → u) or (t ↔ u). Since t, u ∈ S_n L,
we have l(t) = r(t) and l(u) = r(u) by the inductive assumption. In the first
case, s = (¬t), it follows that l(s) = 1 + l(t) = 1 + r(t) = r(s), as required. In the
second case, s = (t ∧ u), we have, on counting parentheses, l(s) = 1 + l(t) + l(u)
and r(s) = r(t) + r(u) + 1, and so l(s) = r(s), as required. The other three cases
are similar and so we see that in all cases, l(s) = r(s). Thus the inductive step
is proved and so is the lemma.
3 A word about notation: I will tend to use p, q, r for propositional variables, s, t, u for
propositional terms (which might or might not be propositional variables) and v for valuations
(see later). That rather squeezes that part of the alphabet so I will sometimes use other parts
and/or the Greek alphabet for propositional variables and terms.
Digression on proof by induction on complexity. At the start of the proof of
1.1.3 above I said that the proof would be by induction on complexity of terms
but you might have felt that the proof was shaped as a proof by induction on
the natural numbers N = {0, 1, 2, ...}. That’s true; we used the sets S_n L to
structure the proof, and the proof by induction on complexity of terms was
reflected in the various subcases that were considered when going from S_n L to
S_{n+1} L. But the proof could have been given without reference to the sets S_n L.
The argument - the various subcases - would be essentially the same; the hitherto
missing ingredient is the statement of the appropriate Principle of Induction.
Recall that, for N that takes the form “Given a statement P (n), depending on
n ∈ N, if P (0) is true and if from P (n) we can prove P (n + 1), then P (n) is true
for every n ∈ N.”4 The corresponding statement for our “construction tree” for
propositional terms is: “Given a statement P (s), depending on s ∈ SL, if P (p)
is true for every propositional variable p and if, whenever P (s) and P (t) are
true so are P (s ∧ t), P (s ∨ t), P (¬s), P (s → t) and P (s ↔ t), then P (s) is true
for every s ∈ SL.”
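Definitions by recursion on complexity have the same shape as this induction principle: one case for variables and one case for each connective. As a sketch (assuming, for illustration only, that terms are nested tuples with hypothetical tags like "not" and "and"), here is the parenthesis count of Lemma 1.1.3 computed that way and compared with a direct count on the written-out string.

```python
SYMBOL = {"and": "∧", "or": "∨", "implies": "→", "iff": "↔"}

def show(t):
    """Fully parenthesised string for a term given as a nested tuple."""
    if isinstance(t, str):
        return t
    if t[0] == "not":
        return "(¬" + show(t[1]) + ")"
    return "(" + show(t[1]) + " " + SYMBOL[t[0]] + " " + show(t[2]) + ")"

def parens(t):
    """Return (l(t), r(t)), with one case per clause of the induction principle."""
    if isinstance(t, str):                 # base case: a propositional variable
        return (0, 0)
    if t[0] == "not":                      # step from s to (¬s)
        l, r = parens(t[1])
        return (1 + l, r + 1)
    l1, r1 = parens(t[1])                  # step from s, t to (s ∗ t)
    l2, r2 = parens(t[2])
    return (1 + l1 + l2, r1 + r2 + 1)

term = ("not", ("not", ("and", "s", ("or", "t", "u"))))
text = show(term)
print(text, parens(term), (text.count("("), text.count(")")))
```

The recursion never mentions the sets S_n L; it only follows the way the term was built, which is the point of the digression.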
Before the next lemma, notice that every propositional term can be thought
of simply as a string of symbols which, individually, are either: propositional
variables (p, q etc.), connectives (∧, ∨, ¬, →, ↔), or parentheses (left, right).
Then the statement that s, as a string, is, for instance, xyz will mean that x, y,
z are strings and, if we place them next to each other in the given order, then
we get s. For instance if s′ is ¬¬(s ∧ (t ∨ u)) then we could write s′ as xyz
where x, y, z are the strings x = ¬, y = ¬(s, z = ∧(t ∨ u)); we could even write
s′ = xyzw with x, y, z as before and w the empty string (which we allow). We
define the length, lng(x), of any string x to be the number of occurrences of
symbols in it. We extend the notations l(x) and r(x) to count the numbers of
left parentheses, right parentheses in any string x. If the string x has the form
yz then we say that y is a left subword of x, a proper left subword if z ≠ ∅;
similarly z is a right subword of x, proper if y is not the empty string. (We
will use the terms “string” and “word” interchangeably.)
Proposition 1.1.4. For every propositional term s, either s is a propositional
variable or there is just one way of writing s in either of the forms s = (¬t) for
some propositional term t or s = (t ∗ u) for some propositional terms t, u where
∗ is one of the binary propositional connectives.
Proof. We can suppose that s is not a propositional variable. Note that if s
has the form (¬t) then the leftmost symbols of s are (¬, whereas if s has the
form (t ∗ u) then its leftmost symbols are (( or (p where p is a propositional
variable, so we can treat these two cases entirely separately.
4 I follow the convention that 0 is a natural number; this is not followed by everyone but is standard
in mathematical logic.
In the first case, s = (¬t), this is the only possible way of writing it in this
form because t is determined by s. Therefore, since, as we observed above, it
cannot be written in the form (t ∗ u), there is no other way of writing s as a
propositional term.
In the second case, we argue by contradiction and suppose that we can write
s = (t ∗ u) = (t′ ∗′ u′) where t, u, t′, u′ are propositional terms and ∗, ∗′ are
propositional connectives and, for the contradiction, that these are not identical
ways of writing s, hence that either t is a proper left subword of t′ or t′ is a
proper left subword of t. A contradiction will follow immediately once we have
proved the following lemma.
Lemma 1.1.5. If s is a propositional term and if s′ is a proper left subword of
s then either s′ = ∅ or l(s′) − r(s′) > 0; in particular s′ is not a propositional
term.
Similarly, if s″ is a proper right subword of s then either s″ = ∅ or r(s″) −
l(s″) > 0, and s″ is not a propositional term.
Proof. We know that s has the form (¬t) or (t ∗ u).
In the first case, s′ has one of the forms ∅, ( or (¬t′ where t′ is a left subword
(possibly empty) of t. By induction on lengths of propositional terms we can
assume that t′ = ∅ or l(t′) − r(t′) ≥ 0 (“>” if t′ is a proper left subword of t, “=”
by 1.1.3 in the case t′ = t) and so, in each case, it follows that l(s′) − r(s′) > 0.
In the second case, s′ has one of the forms ∅, (, (t′ where t′ is a left subword
of t, (t ∗ u′ where u′ is a left subword of u. Again by induction on lengths of
propositional terms we can assume that l(t′) − r(t′) ≥ 0 and l(u′) − r(u′) ≥ 0;
checking each case, it follows that l(s′) − r(s′) > 0.
By 1.1.3 we deduce that s′ is not a propositional term.
Similarly for the assertion about right subwords.
1.2 Valuations
Now for the key idea of a (truth) valuation. Fix some set L = S_0 L of propositional variables, and hence the corresponding set SL of propositional terms. A
valuation on the set of propositional terms is a function v : SL → {T, F} to
the 2-element set5 {T, F} which satisfies the following conditions.6
For all propositional terms s, t we have
v(s ∧ t) = T iff v(s) = T and v(t) = T;
v(s ∨ t) = T iff v(s) = T or v(t) = T;
v(¬s) = T iff v(s) = F;
v(s → t) = T iff v(s) = F or v(t) = T;
v(s ↔ t) = T iff the values of v(s) and v(t) are the same: v(s) = v(t).
5 really, the two-element boolean algebra
6 Of course, T represents “true” and F “false”. Often the 2-element set {1, 0} is used instead,
normally with 1 representing “true” and 0 representing “false”.
Namely, because all propositional terms are built up from the propositional
variables using the propositional connectives, any valuation is completely determined by its values on the propositional variables (this, see 1.2.1(b) below,
is the formal statement of the point we made (the “crucial observation”) when
discussing mice, cheese and homology groups).
For instance if v(p) = v(q) = T and v(r) = F then we have, since v is a
valuation, v(p∨r) = T and hence v(¬(p∨r)) = F. Similarly, for any propositional
term, t, built from p, q and r, the value v(t) is determined by the above choices
of v(p), v(q), v(r). That does actually need proof. There is the, rather obvious
and easily proved by induction, point that this process works (in the sense that it
gives a value), but there’s also the more subtle point that if there were more than
one way of building up a propositional term then, conceivably, one construction
route might lead to the valuation T and the other to F. But we have seen
already in 1.1.4 that this does not, in fact, happen: every propositional term
has a unique “construction tree”. Therefore if v_0 is a function from the set,
S_0 L, of propositional variables to the set {F, T} then this extends to a unique
valuation v on SL. In particular, if there are n propositional variables there
will be 2^n valuations on the propositional terms built from them. We state this
formally.
Proposition 1.2.1. Let L be a set of propositional variables.
(a) If v_0 : L → {F, T} is any function then there is a valuation v : SL → {F, T}
on propositional terms in L such that v(p) = v_0(p) for every p ∈ L.
(b) If v and w are valuations on SL and if v(p) = w(p) for all p ∈ L then v = w
(so the valuation in part (a) is unique).
(c) If t is a propositional term and if v and w are valuations which agree on all
propositional variables occurring in t then v(t) = w(t).
The proof of part (c), which is a slight strengthening of (b), is left as an
exercise. In order to prove it we could prove the following statement first (by
induction on complexity of terms):
if L′ ⊆ L are sets of propositional variables then for every n, S_n L′ ⊆ S_n L;
furthermore, if v′ is a valuation on SL′ and v is a valuation on SL such that
v(p) = v′(p) for every p ∈ L′ then v(t) = v′(t) for every t ∈ SL′.
From that, part (c) follows easily (take L′ to be the set of propositional variables
actually occurring in t). (You might have noticed that I didn’t actually define
what I mean by a propositional variable occurring in a propositional term;
I hope the meaning is clear but it is easy to give a definition by, what else,
induction on complexity of terms.)
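Parts (a) and (c) of Proposition 1.2.1 can be seen at work in a short program. This is a sketch under assumptions made here for illustration: terms are nested tuples with hypothetical tags ("not", "and", "or", "implies", "iff"), the function v_0 is a Python dict, and True, False stand for T, F. The function evaluate extends v_0 to all terms by recursion on complexity, and variables is one possible definition of “the variables occurring in a term”.

```python
def evaluate(t, v0):
    """Extend v0 (a dict from variables to True/False) to the term t."""
    if isinstance(t, str):
        return v0[t]
    if t[0] == "not":
        return not evaluate(t[1], v0)
    a, b = evaluate(t[1], v0), evaluate(t[2], v0)
    if t[0] == "and":
        return a and b
    if t[0] == "or":
        return a or b
    if t[0] == "implies":
        return (not a) or b
    if t[0] == "iff":
        return a == b
    raise ValueError(f"unknown connective {t[0]!r}")

def variables(t):
    """The set of propositional variables occurring in t, by recursion on complexity."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(s) for s in t[1:]))

# The example from the text: v(p) = v(q) = T and v(r) = F.
v0 = {"p": True, "q": True, "r": False}
term = ("not", ("or", "p", "r"))           # ¬(p ∨ r)
print(variables(term), evaluate(term, v0))
```

Changing the value assigned to q leaves the result for ¬(p ∨ r) unchanged, since q does not occur in that term; this is part (c) in a single instance.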
Truth tables are tables showing evaluation of valuations on propositional
terms. They can also be used to show the effect of the propositional connectives
on truth values. Note that “or” is used in the inclusive sense (“one or the other
or both”) rather than the exclusive sense (“one or the other but not both”).
p  q    p ∧ q   p ∨ q   p → q   p ↔ q
T  T      T       T       T       T
T  F      F       T       F       F
F  T      F       T       T       F
F  F      F       F       T       T

p    ¬p
T    F
F    T
You might feel that the truth table for → does not capture what you consider
to be the meaning of “implies” but, if we are to regard it as a function on truth
values (whatever the material connection or lack thereof between its “input”
propositions) then the definition given is surely the right one. Or just regard
p → q as an abbreviation for ¬p∨q “(not-p) or q”, since they have the same truth
tables. The following example might make the reading of p → q as meaning
¬p ∨ q reasonable: let p be “n = 1” and let q be “(n − 1)(n − 2) = 0”, so p → q
reads “n = 1 implies (n−1)(n−2) = 0” or “If n = 1 then (n−1)(n−2) = 0” and
then consider setting n = 1, 2, 3, . . . in turn and think about the truth values of
p, q and p → q.
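The suggested experiment is easy to run. In this sketch the function name truth_values is a hypothetical choice, and p → q is computed as (not p) or q, following the reading of p → q as ¬p ∨ q.

```python
def truth_values(n):
    """Return (v(p), v(q), v(p → q)) where p is 'n = 1' and q is '(n - 1)(n - 2) = 0'."""
    p = (n == 1)
    q = ((n - 1) * (n - 2) == 0)
    return p, q, (not p) or q

for n in range(1, 5):
    print(n, truth_values(n))
```

For n = 1 both p and q are true; for n = 2 only q is true; from n = 3 onwards both are false. The row “p true, q false” never occurs, which is why “n = 1 implies (n − 1)(n − 2) = 0” holds for every n.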
You will have seen examples of truth tables in the first year Sets, Numbers
and Functions course. Recall that they can be used to determine whether a
propositional term t is a tautology, meaning that v(t) = T for every valuation
v. The “opposite” notion is: if v(t) = F for every valuation v; then we say
that t is unsatisfiable (also called “a contradiction” though that’s not good
terminology to use when we’ll be drawing the distinction between syntax and
semantics). Notice that the use of truth tables implicitly assumes part (c) of
1.2.1.
We say that two propositional terms, s and t, are logically equivalent,
and write s ≡ t, if v(s) = v(t) for every valuation v. It is equivalent to say that
s ↔ t is a tautology. Let’s prove that.
Suppose s ≡ t so, if v is any valuation, then v(s) = v(t) so, from the definition
of valuation, v(s ↔ t) = T. This is so for every valuation so, by the definition
of tautology, s ↔ t is a tautology. For the converse, suppose that s ↔ t is a
tautology and let v be any valuation. Then v(s ↔ t) = T and so (again, by the
definition of valuation) v(s) = v(t). Thus, by definition of equivalence, s and t
are logically equivalent. We see that the proof was just an easy exercise from
the definitions.
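For terms in a few variables, both notions can be checked by running through all valuations, which is what a truth table does. The sketch below assumes terms are nested tuples with hypothetical tags such as "implies", and it only looks at valuations of the variables that actually occur, which relies on part (c) of 1.2.1.

```python
from itertools import product

OPS = {
    "and": lambda a, b: a and b,
    "or": lambda a, b: a or b,
    "implies": lambda a, b: (not a) or b,
    "iff": lambda a, b: a == b,
}

def evaluate(t, v):
    """Value of the term t under the valuation v (a dict on variables)."""
    if isinstance(t, str):
        return v[t]
    if t[0] == "not":
        return not evaluate(t[1], v)
    return OPS[t[0]](evaluate(t[1], v), evaluate(t[2], v))

def variables(t):
    return {t} if isinstance(t, str) else set().union(*(variables(s) for s in t[1:]))

def valuations(names):
    """All 2^n assignments of True/False to the given variables."""
    names = sorted(names)
    for values in product([True, False], repeat=len(names)):
        yield dict(zip(names, values))

def is_tautology(t):
    return all(evaluate(t, v) for v in valuations(variables(t)))

def equivalent(s, t):
    return all(evaluate(s, v) == evaluate(t, v)
               for v in valuations(variables(s) | variables(t)))

s = ("implies", "p", "q")                  # p → q
t = ("or", ("not", "p"), "q")              # ¬p ∨ q
print(equivalent(s, t), is_tautology(("iff", s, t)))
```

The two printed values agree, as the argument above says they must for any pair of terms.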
Now for the semantic notion of entailment; we contrast “semantics” (“meaning” or, at least, notions of being true and false) with “syntax” (construction
and manipulation of strings of symbols). If S is a set of propositional terms
and t is a propositional term then we write S |= t if for every valuation v with
v(S) = T, by which we mean v(s) = T for every s ∈ S, we have v(t) = T:
“whenever S is true so is t”.
Extending the above notions we say that a set S of propositional terms is
tautologous if v(S) = T for every valuation v and S is unsatisfiable if for
every valuation v there is some s ∈ S with v(s) = F - in other words, if no
valuation makes all the terms in S true. We also say that S is satisfiable if
there is at least one valuation v with v(S) = T. So note: tautologous means
every valuation makes all terms in S true; satisfiable means that some valuation
makes all terms in S true; unsatisfiable means that no valuation makes all terms
in S true.
Lemma 1.2.2. Let S be a set of propositional terms and let t, t′, u be propositional terms.
(a) S |= t iff S ∪ {¬t} is unsatisfiable
(b) S ∪ {t} |= u iff S |= t → u
(c) S ∪ {t, t′} |= u iff S ∪ {t ∧ t′} |= u
Proof. These are all simple consequences of the definitions. Before we begin,
we introduce a standard and slightly shorter notation: instead of writing S ∪
{t_1, ..., t_k} |= u we write S, t_1, ..., t_k |= u.
(a) S ∪ {¬t} is unsatisfiable iff
for all valuations v, we have v(s) = F for some s ∈ S or v(¬t) = F iff
for all valuations v, if v(s) = T for all s ∈ S then v(¬t) = F iff
for all valuations v, if v(s) = T for all s ∈ S then v(t) = T iff
S |= t.
(b) S ∪ {t} |= u iff
for every valuation v, if v(s) = T for all s ∈ S and v(t) = T then v(u) = T iff
for every valuation v with v(s) = T for all s ∈ S then, if v(t) = T then v(u) = T
iff
for every valuation v with v(s) = T for all s ∈ S then v(t → u) = T (by the
truth table for “→”) iff
S |= t → u.
(c) S ∪ {t ∧ t′} |= u iff
for every valuation v with v(s) = T for all s ∈ S and v(t ∧ t′) = T we have
v(u) = T iff
(by the truth table for ∧) for every valuation v with v(s) = T for all s ∈ S and
v(t) = T and v(t′) = T, we have v(u) = T iff
S ∪ {t, t′} |= u.
We can use truth tables to determine whether or not S |= u (assuming S is
a finite (and, in practice, not very large) set) but this can take a long time: if
there are n propositional variables appearing then we need to compute a truth
table with 2^n rows. The next section describes a method which sometimes is
more efficient.
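The truth-table test for S |= u can be written down directly. The following sketch makes several assumptions for the sake of illustration: S is a finite Python list, terms are nested tuples with hypothetical tags such as "implies", and the names entails and satisfiable are chosen here. It then checks parts (a) and (b) of Lemma 1.2.2 on one small instance, which illustrates the lemma but of course proves nothing.

```python
from itertools import product

OPS = {
    "and": lambda a, b: a and b,
    "or": lambda a, b: a or b,
    "implies": lambda a, b: (not a) or b,
    "iff": lambda a, b: a == b,
}

def evaluate(t, v):
    if isinstance(t, str):
        return v[t]
    if t[0] == "not":
        return not evaluate(t[1], v)
    return OPS[t[0]](evaluate(t[1], v), evaluate(t[2], v))

def variables(t):
    return {t} if isinstance(t, str) else set().union(*(variables(s) for s in t[1:]))

def valuations(terms):
    """All valuations of the variables occurring in a list of terms (2^n of them)."""
    names = sorted(set().union(*(variables(t) for t in terms)))
    for values in product([True, False], repeat=len(names)):
        yield dict(zip(names, values))

def satisfiable(S):
    """Some valuation makes every term in S true."""
    return any(all(evaluate(s, v) for s in S) for v in valuations(S))

def entails(S, t):
    """S |= t: every valuation making all of S true also makes t true."""
    return all(evaluate(t, v)
               for v in valuations(S + [t])
               if all(evaluate(s, v) for s in S))

S = [("implies", "p", "q")]
print(entails(S + ["p"], "q"), entails(S, ("implies", "p", "q")))    # part (b)
print(entails(S, "q"), not satisfiable(S + [("not", "q")]))          # part (a)
```

Each printed pair agrees, as the lemma predicts. The loop in valuations is the 2^n-row truth table, so the cost doubles with every extra variable.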
1.3 Beth trees
Beth trees provide a method of testing whether a collection of propositional terms
is satisfiable or not (and, if it is satisfiable, of giving a valuation demonstrating this); the method is often more efficient than, and perhaps more interesting than, truth tables. Note that this includes testing whether a propositional term is a
tautology, whether one term implies another, whether S |= t, et cetera.
The input to the method consists of two sets S, T of propositional terms; to
distinguish between these we will write the typical input as S|T . The output
will, if we carry the method to its conclusion (which for some purposes will be
more than we need to do), be all valuations with v(S) = T and v(T ) = F. So if
the output is nonempty then we know that S ∪ {¬t : t ∈ T } is satisfiable. For
instance, t is a tautology if the output from the pair ∅|{t} is empty (which often
will be easier than checking whether the output of {t}|∅ is all valuations).
The actual computation has the form of a tree (as usual in mathematics,
trees grow downwards) and, at each node of the tree, there will be a pair of the
form S′|T′. A node (of a fully or partially-computed Beth tree) is terminal
if it has no node beneath it. A node is a leaf if all the propositional terms at
it are propositional variables. Directly underneath each non-terminal node is
either a branch segment with another node at its end, or two branch segments
with a node at the end of each. A key feature of the tree is that if a node
lies under another then the lower one contains fewer propositional connectives.
That means that if the initial data contains k propositional connectives then no
branch can contain more than k+1 nodes. And that means that the computation
of the tree will terminate.
Before we describe how to compute such trees, here, in order to anchor ideas,
is an example.
Example 1.3.1. We determine whether or not ¬p, (p ∧ q) → r |= ¬r → (q → p).
We will build a tree beginning with the input ¬p, (p ∧ q) → r | ¬r → (q → p)
since there will be a valuation satisfying this condition exactly if ¬p, (p ∧ q) →
r |= ¬r → (q → p) does not hold.
        ¬p, (p ∧ q) → r | ¬r → (q → p)
                      |
        (p ∧ q) → r | p, ¬r → (q → p)
                      |
        ¬r, (p ∧ q) → r | p, q → p
                      |
        ¬r, q, (p ∧ q) → r | p, p
                      |
        q, (p ∧ q) → r | r, p
              /               \
     q, r | r, p         q | r, p, p ∧ q
                           /          \
                  q | q, r, p       q | r, p
The property (∗) below implies that a valuation v satisfies the input conditions (making both ¬p and (p ∧ q) → r true but making ¬r → (q → p) false)
iff it satisfies at least one of the leaves. But we can see immediately that the
only leaf satisfied by any valuation is q | r, p, which is satisfied by the valuation
v with v(q) = T, v(r) = F, v(p) = F. So there is a valuation making both ¬p
and (p ∧ q) → r true but making ¬r → (q → p) false. That is ¬r → (q → p)
does not follow from ¬p and (p ∧ q) → r.
We will list the allowable rules for generating the nodes directly under a
given node. To make sense of these, we first explain the idea. The property
that we want is the following:
(∗) If, at any stage of the construction of the tree with initial node S|T, the
currently terminal nodes are S1|T1, ..., Sk|Tk then, for every valuation v, we have
v(S) = T and v(T) = F iff v(Si) = T and v(Ti) = F for some i.
For this section, when I write v(T) = F I mean v(t) = F for every t ∈ T. This
is a convenient, but bad (because easily misinterpreted), notation.
In order for this property to hold it is enough to have the following two:
(∗1) if a node S′|T′ is immediately followed by a single node S1|T1 then, for
every valuation v we have v(S′) = T and v(T′) = F iff v(S1) = T and v(T1) = F;
(∗2) if a node S′|T′ is immediately followed by the nodes S1|T1 and S2|T2
then, for every valuation v we have: v(S′) = T and v(T′) = F iff [v(S1) = T and
v(T1) = F] or [v(S2) = T and v(T2) = F].
(The fact that these are enough can be proved by an inductive argument.)
In the pair S|T you can think of the left hand side as the “positive” statements and those on the right as the “negative” ones. Each rule involves either
moving one term between the positive and negative sides (with appropriate
change to the term) or splitting one pair into two. Here are the allowable rules.
Rules with a single node underneath:
S, s ∧ t | T      is followed by   S, s, t | T
S, ¬t | T         is followed by   S | t, T
S | ¬t, T         is followed by   S, t | T
S | s ∨ t, T      is followed by   S | s, t, T
S | s → t, T      is followed by   S, s | t, T
Rules with two nodes underneath (the tree branches):
S | s ∧ t, T      is followed by   S | s, T   and   S | t, T
S, s ∨ t | T      is followed by   S, s | T   and   S, t | T
S, s → t | T      is followed by   S | s, T   and   S, t | T
In lectures we will explain a few of these but you should think through why
each one is valid (that is, satisfies (∗1) or (∗2), as appropriate). You should also
note that they cover all the cases: together they allow a single pair to be input
and will output a tree where every terminal node is a leaf. When constructing a
Beth tree there may well be some nodes where there is a choice as to which rule
to apply but no choice of applicable rule is wrong (though some choices might
lead to a shorter computation than others).
Example 1.3.2. We use Beth trees to show that p ∧ q → p is a tautology. We
already suggested that it might be easier to do the equivalent thing of showing
that ¬(p ∧ q → p) is unsatisfiable; here’s the computation for that.
∅ | p ∧ q → p
      |
p ∧ q | p
      |
p, q | p
Clearly no valuation can make both p, q true but make p false; we conclude
that p ∧ q → p is a tautology.
For comparison here is the direct check that p ∧ q → p is a tautology.
        p ∧ q → p | ∅
         /          \
    p | ∅         ∅ | p ∧ q
                   /       \
               ∅ | p      ∅ | q
Now note that every valuation satisfies the condition expressed by at least one
of the leaves, so p ∧ q → p is indeed a tautology.
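The rules of this section can be turned into a short program. What follows is a minimal sketch, not a definitive implementation: terms are assumed to be nested tuples (a variable is a string; compound terms are `("not", a)`, `("and", a, b)`, `("or", a, b)`, `("imp", a, b)`), and the function names `expand` and `open_leaves` are our own. Which rule is applied at a node depends on the order in which Python happens to list the terms, which is harmless because no choice of applicable rule is wrong.

```python
def expand(S, T):
    """Leaves of a completed Beth tree with root S | T.

    S and T are frozensets of terms built from not/and/or/imp only;
    each leaf is a pair (S', T') of frozensets of variables.
    """
    for s in S:
        if isinstance(s, str):
            continue
        rest = S - {s}
        if s[0] == "not":    # S, ¬t | T   is followed by   S | t, T
            return expand(rest, T | {s[1]})
        if s[0] == "and":    # S, s ∧ t | T   is followed by   S, s, t | T
            return expand(rest | {s[1], s[2]}, T)
        if s[0] == "or":     # branch: S, s | T   and   S, t | T
            return expand(rest | {s[1]}, T) + expand(rest | {s[2]}, T)
        if s[0] == "imp":    # branch: S | s, T   and   S, t | T
            return expand(rest, T | {s[1]}) + expand(rest | {s[2]}, T)
    for t in T:
        if isinstance(t, str):
            continue
        rest = T - {t}
        if t[0] == "not":    # S | ¬t, T   is followed by   S, t | T
            return expand(S | {t[1]}, rest)
        if t[0] == "and":    # branch: S | s, T   and   S | t, T
            return expand(S, rest | {t[1]}) + expand(S, rest | {t[2]})
        if t[0] == "or":     # S | s ∨ t, T   is followed by   S | s, t, T
            return expand(S, rest | {t[1], t[2]})
        if t[0] == "imp":    # S | s → t, T   is followed by   S, s | t, T
            return expand(S | {t[1]}, rest | {t[2]})
    return [(S, T)]          # only variables remain: this node is a leaf

def open_leaves(S, T):
    """Leaves satisfied by some valuation: no variable occurs on both sides."""
    return [(A, B) for A, B in expand(frozenset(S), frozenset(T)) if not (A & B)]

# Example 1.3.1: the root ¬p, (p ∧ q) → r | ¬r → (q → p).
root_S = [("not", "p"), ("imp", ("and", "p", "q"), "r")]
root_T = [("imp", ("not", "r"), ("imp", "q", "p"))]
print(set(open_leaves(root_S, root_T)))

# Example 1.3.2: the root ∅ | p ∧ q → p has no open leaf, so the term is a tautology.
print(open_leaves([], [("imp", ("and", "p", "q"), "p")]))
```

For Example 1.3.1 the only open leaf is q | p, r, matching the valuation v(q) = T, v(r) = F, v(p) = F found above.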
1.4 Normal forms
First, we look at some more basic properties of logical equivalence where, recall,
two propositional terms s, t are said to be logically equivalent, s ≡ t, if v(s) =
v(t) for every valuation v (and by 1.2.1(c) it is enough to check for valuations
on just the propositional variables actually occurring in s or t).
Lemma 1.4.1. If s, t are propositional terms then:
(i) s ≡ t iff
(ii) s |= t and t |= s iff
(iii) |= s ↔ t iff
(iv) s ↔ t is a tautology.
Proof. All this is immediate from the definitions. For instance, to prove
(iv)⇒(i) let v be any valuation; then, assuming (iv), v(s ↔ t) = T and by
definition of valuation, we see this can happen only if v(s) = v(t), as required.
Note that this is an equivalence relation on SL; that is, it is reflexive (s ≡ s),
symmetric (s ≡ t implies t ≡ s) and transitive (s ≡ t and t ≡ u together imply
s ≡ u).
Here are some, easily checked, basic logical equivalences.
For any propositional terms s, t, u:
s ∧ t ≡ t ∧ s;
s ∨ t ≡ t ∨ s;
¬(s ∧ t) ≡ ¬s ∨ ¬t;
¬(s ∨ t) ≡ ¬s ∧ ¬t;
¬¬s ≡ s;
s → t ≡ ¬s ∨ t;
(s ∧ t) ∧ u ≡ s ∧ (t ∧ u), so we can write s ∧ t ∧ u without ambiguity;
(s ∨ t) ∨ u ≡ s ∨ (t ∨ u), so we can write s ∨ t ∨ u without ambiguity;
(s ∧ t) ∨ u ≡ (s ∨ u) ∧ (t ∨ u);
(s ∨ t) ∧ u ≡ (s ∧ u) ∨ (t ∧ u);
s ∧ s ≡ s;
s ∨ s ≡ s.
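Each of these equivalences can be checked mechanically. The sketch below assumes our own tuple encoding of terms and treats s, t, u as variables; that covers the general case because the value of each side depends only on the truth values of s, t and u. The helper names are ours.

```python
from itertools import product

def variables(t):
    """Set of variables in a term; a variable is a string, other terms are tuples."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def value(t, v):
    """Truth value of t under the valuation v (a dict: variable -> bool)."""
    if isinstance(t, str):
        return v[t]
    if t[0] == "not":
        return not value(t[1], v)
    a, b = value(t[1], v), value(t[2], v)
    return {"and": a and b, "or": a or b, "imp": (not a) or b}[t[0]]

def equivalent(a, b):
    """a ≡ b: the two terms agree under every valuation of their variables."""
    names = sorted(variables(a) | variables(b))
    return all(value(a, v) == value(b, v)
               for row in product([True, False], repeat=len(names))
               for v in [dict(zip(names, row))])

def Not(a): return ("not", a)
def And(a, b): return ("and", a, b)
def Or(a, b): return ("or", a, b)
def Imp(a, b): return ("imp", a, b)

s, t, u = "s", "t", "u"
laws = [
    (And(s, t), And(t, s)),
    (Or(s, t), Or(t, s)),
    (Not(And(s, t)), Or(Not(s), Not(t))),
    (Not(Or(s, t)), And(Not(s), Not(t))),
    (Not(Not(s)), s),
    (Imp(s, t), Or(Not(s), t)),
    (And(And(s, t), u), And(s, And(t, u))),
    (Or(Or(s, t), u), Or(s, Or(t, u))),
    (Or(And(s, t), u), And(Or(s, u), Or(t, u))),
    (And(Or(s, t), u), Or(And(s, u), And(t, u))),
    (And(s, s), s),
    (Or(s, s), s),
]
print(all(equivalent(a, b) for a, b in laws))
```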
Proposition 1.4.2. Suppose that s, s′, t, t′ are propositional terms with s ≡ s′ and t ≡ t′.
Then:
(i) ¬s ≡ ¬s′;
(ii) s ∧ t ≡ s′ ∧ t′;
(iii) s ∨ t ≡ s′ ∨ t′;
(iv) s → t ≡ s′ → t′.
Proof. To prove (ii): suppose v(s ∧ t) = T. Then by the truth table for ∧, both
v(s) = T and v(t) = T; so v(s′) = T and v(t′) = T and hence v(s′ ∧ t′) = T.
The same argument with the roles of s, t and s′, t′ exchanged gives the converse.
The other parts are equally easy.

We introduce notations for multiple conjunctions and disjunctions; they are
completely analogous to the use of Σ for repeated +. Given propositional terms
s_1, ..., s_n we define ⋀_{i=1}^{n} s_i by induction: ⋀_{i=1}^{1} s_i = s_1 and
⋀_{i=1}^{k+1} s_i = (⋀_{i=1}^{k} s_i) ∧ s_{k+1}.
Similarly we define ⋁_{i=1}^{n} s_i. Because of associativity and commutativity of ∧,
respectively of ∨, if we permute the terms in such a repeated conjunction or
disjunction, then we obtain an equivalent propositional term. Indeed, we have
the following (the proofs of which are left as exercises).
Proposition 1.4.3. If s_1, ..., s_n are propositional terms and v is a valuation
then:
(i) v(⋀_{i=1}^{n} s_i) = T iff v(s_i) = T for all i = 1, ..., n;
(ii) v(⋁_{i=1}^{n} s_i) = T iff v(s_i) = T for some i ∈ {1, ..., n};
(iii) ⋀_{i=1}^{n} s_i ≡ ¬⋁_{i=1}^{n} ¬s_i;
(iv) ⋁_{i=1}^{n} s_i ≡ ¬⋀_{i=1}^{n} ¬s_i.
Proposition 1.4.4. Suppose that s_1, ..., s_n and t_1, ..., t_m are sequences of
propositional terms such that {s_1, ..., s_n} = {t_1, ..., t_m} (thus the sequences
differ only in the order of their terms and possible repetitions). Then
⋁_{i=1}^{n} s_i ≡ ⋁_{j=1}^{m} t_j and ⋀_{i=1}^{n} s_i ≡ ⋀_{j=1}^{m} t_j.
If S = {s_1, ..., s_n} is a finite set of propositional terms then we write ⋁S
for ⋁_{i=1}^{n} s_i and ⋀S for ⋀_{i=1}^{n} s_i. What if S = ∅? Since, roughly, the more
conjuncts there are in ⋀S the harder it is to be true, it makes some sense to
define ⋀∅ to be any tautology (i.e. always true). Dually we define ⋁∅ to be
any unsatisfiable term (so false under every valuation). (Because we are only
interested in the truth values of ⋀∅ and ⋁∅ it doesn’t matter which tautology
or which unsatisfiable term we choose.)
A little more terminology: given a set L of propositional variables, we refer
to any propositional variable p, or any negation, ¬p, of a propositional variable
as a literal.
We are going to show that every propositional term is equivalent to one
which is in a special form (indeed, there are two special forms: disjunctive and
conjunctive).
A propositional term is in disjunctive normal form if it has the form
⋁_{i=1}^{n} ⋀_{j=1}^{m_i} g_ij where each g_ij is a literal.
Proposition 1.4.5. (Disjunctive Normal Form Theorem) If t ∈ SL then there
is a propositional term s ∈ SL which is in disjunctive normal form and such
that s ≡ t. If {p_1, ..., p_k} are the propositional variables appearing in t then
we may suppose that s has the form ⋁_{i=1}^{n} ⋀_{j=1}^{m_i} g_ij with each m_i ≤ k and with
n ≤ 2^k.
Proof. Let v_1, ..., v_n be the distinct valuations v on {p_1, ..., p_k} such that
v(t) = T. For each i = 1, ..., n and j = 1, ..., k, set
g_ij = p_j if v_i(p_j) = T,
g_ij = ¬p_j if v_i(p_j) = F.
Note that v_i(⋀_{j=1}^{k} g_ij) = T and that if v′ ≠ v_i is any other valuation on
{p_1, ..., p_k} then v′(⋀_{j=1}^{k} g_ij) = F. It follows that if w is any valuation on
{p_1, ..., p_k} then w(⋁_{i=1}^{n} ⋀_{j=1}^{k} g_ij) = T iff w is one of v_1, ..., v_n. Therefore for
any valuation v, v(⋁_{i=1}^{n} ⋀_{j=1}^{k} g_ij) = v(t), so t and ⋁_{i=1}^{n} ⋀_{j=1}^{k} g_ij are logically
equivalent, as required.
For the final statement, note that there are 2^k distinct valuations on {p_1, ..., p_k}.
The proof shows how to go about actually constructing an equivalent propositional term in disjunctive normal form, using either truth tables or, the proof
slightly modified, Beth trees.
Example 1.4.6. Consider the propositional term t = (p ∧ q) → (¬p ∨ r). If we
construct its truth table then we find 7 rows/valuations on {p, q, r} which make
it true. For each of these we form the corresponding “gij ”. For instance, the
valuation v1 (p) = v1 (q) = v1 (r) = T is one of those making t true and the
corresponding term is p ∧ q ∧ r. Another row where t is true is that where p is
true and q and r are false, so the corresponding term is p ∧ ¬q ∧ ¬r. Et cetera,
giving the disjunctive normal form term
(p ∧ q ∧ r) ∨ (p ∧ ¬q ∧ r) ∨ (p ∧ ¬q ∧ ¬r) ∨ (¬p ∧ q ∧ r) ∨ (¬p ∧ q ∧ ¬r) ∨ (¬p ∧ ¬q ∧ r) ∨ (¬p ∧ ¬q ∧ ¬r)
equivalent to t.
Normal forms are, however, not unique and you might note that, for example,
the last four disjuncts can be replaced by the logically equivalent term ¬p. From
this point of view, Beth trees are more efficient, as we can illustrate with this
example. If we construct a Beth tree starting with p ∧ q | ¬p ∨ r then very quickly
we reach the single leaf p, q | r, which corresponds to the single valuation making
(p ∧ q) → (¬p ∨ r) false. That corresponds to the term p ∧ q ∧ ¬r, so t is equivalent
to the negation of this, namely ¬(p ∧ q ∧ ¬r), which is equivalent to ¬p ∨ ¬q ∨ r,
a much simpler disjunctive normal form.
You might instead construct a Beth tree starting from (p ∧ q) → (¬p ∨ r) | ∅.
Taking an obvious sequence of steps leads to a completed tree with the leaves
∅ | p, ∅ | q and r | ∅. These correspond to the (conjunctions of) literals: ¬p, ¬q,
r respectively. Therefore this also leads to the form ¬p ∨ ¬q ∨ r.
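The truth-table construction from the proof of 1.4.5 is easy to mechanise. The following is a sketch under our own assumptions (terms as nested tuples, hypothetical helper names `dnf`, `dnf_value`, `show`); it returns the disjunctive normal form as a list of conjunctions, each a list of literals, with the empty list standing for an unsatisfiable term.

```python
from itertools import product

def variables(t):
    """Set of variables in a term; a variable is a string, other terms are tuples."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def value(t, v):
    """Truth value of t under the valuation v (a dict: variable -> bool)."""
    if isinstance(t, str):
        return v[t]
    if t[0] == "not":
        return not value(t[1], v)
    a, b = value(t[1], v), value(t[2], v)
    return {"and": a and b, "or": a or b, "imp": (not a) or b}[t[0]]

def dnf(t):
    """One conjunction of literals for each valuation that makes t true."""
    names = sorted(variables(t))
    disjuncts = []
    for row in product([True, False], repeat=len(names)):
        v = dict(zip(names, row))
        if value(t, v):
            disjuncts.append([p if v[p] else ("not", p) for p in names])
    return disjuncts

def dnf_value(disjuncts, v):
    """Value of the disjunction of conjunctions under the valuation v."""
    return any(all(value(g, v) for g in conj) for conj in disjuncts)

def show(disjuncts):
    """Render the normal form in the notation of the notes."""
    def lit(g):
        return g if isinstance(g, str) else "¬" + g[1]
    return " ∨ ".join("(" + " ∧ ".join(lit(g) for g in c) + ")" for c in disjuncts)

# Example 1.4.6: t = (p ∧ q) → (¬p ∨ r)
t = ("imp", ("and", "p", "q"), ("or", ("not", "p"), "r"))
print(len(dnf(t)))   # 7 disjuncts, one per row of the truth table where t is true
print(show(dnf(t)))
```

As the example shows, this form is correct but rarely the shortest: here the seven disjuncts could be replaced by ¬p ∨ ¬q ∨ r.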
The dual form is as follows: a propositional term is in conjunctive normal
form if it has the form ⋀_{i=1}^{n} ⋁_{j=1}^{m_i} g_ij where each g_ij is a literal.
Proposition 1.4.7. If t ∈ SL then there is a propositional term s ∈ SL which
is in conjunctive normal form and such that s ≡ t. If {p_1, ..., p_k} are the
propositional variables appearing in t then we may suppose that s has the form
⋀_{i=1}^{n} ⋁_{j=1}^{m_i} g_ij with each m_i ≤ k and with n ≤ 2^k.
Proof. The term t is logically equivalent to ¬¬t and, by 1.4.5, ¬t is equivalent
to some term of the form ⋁_{i=1}^{n} ⋀_{j=1}^{m_i} g_ij. So t is equivalent to ¬⋁_{i=1}^{n} ⋀_{j=1}^{m_i} g_ij
which, using De Morgan’s laws (the third and fourth on the list of identities after
1.4.1), is in turn equivalent to ⋀_{i=1}^{n} ⋁_{j=1}^{m_i} ¬g_ij. Since each ¬g_ij is a literal (at
least once we cancel double negations), the result follows.
1.5
The proof of 1.4.5 actually shows that every truth table on a set, p_1, ..., p_k, of
propositional variables can be generated from them by using the propositional
connectives ∧, ∨, ¬. More precisely, every propositional term t in p_1, ..., p_k
defines a function, evaluation-at-t, from the set Val_{p_1,...,p_k} of valuations v on
p_1, ..., p_k, to {T, F}. Conversely, given any function e : Val_{p_1,...,p_k} → {T, F},
one may construct, using ∧, ∨ and ¬, a propositional term t such that e is just
evaluation at t. If we change the set of propositional connectives that we are
“allowed to use” then we can ask the same question. For instance, using just
∧ and ∨ can we construct every truth table/build a term inducing any given
evaluation e? What if we use ¬ and →? And other such questions (the five we
have introduced are not the only possible connectives, indeed not even the only
ones which occur in nature, or at least in Computer Science, where one also sees
NAND=Sheffer stroke, NOR, XOR). (We won’t formulate the general question because
then we would have to give a general definition of “(n-ary) propositional connective”
and would be hard-pressed to distinguish these from propositional terms.)
We say that a set, S, of propositional connectives is adequate if for every
propositional term t (in any number of propositional variables) there is a term
t′ constructed using just the connectives in S such that t and t′ are “logically
equivalent”. (Notice that if S includes some “new” propositional connectives then
we have to extend our definitions of “propositional term” etc. to allow these. That’s
why I used quotation marks just then.) By “logically equivalent” we mean that they
“have the same truth tables” or, a bit more precisely, they define the same function
from Val_{p_1,...,p_n} to {T, F}.
Example 1.5.1. The set {∧, ¬} is adequate.
We have already commented that {∧, ∨, ¬} is adequate so we need only note
that s ∨ t ≡ ¬(¬s ∧ ¬t).
Example 1.5.2. The NAND gate/operator or Sheffer stroke is a binary
(i.e. has two inputs) propositional connective whose effect is as shown in the
truth table below.
p   q   p|q
T   T    F
T   F    T
F   T    T
F   F    T
You can see from this that p|q is logically equivalent to ¬(p ∧ q), hence the name
“NAND”.
If we take our set of connectives to be just S = {|} then we have to re-define
“propositional term” by saying: every propositional variable is a propositional
term; if s, t are propositional terms then so is s|t. We can refer to these as
“(propositional) terms built using | (only)” and can write S_|(L) for the set of
such terms.
It is easy to show that {|} is adequate. All we have to do is to show that, given
propositional variables p, q we can find terms using | only which are equivalent
to ¬p and p ∧ q, because we know that {¬, ∧} is an adequate set of connectives.
Indeed, it is easy to see that p|p is equivalent to ¬p and hence that
(p|q)|(p|q) is equivalent to p ∧ q.
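The two equivalences just stated can be confirmed by running through all truth values. This is a small sketch; the function names are ours.

```python
from itertools import product

def nand(a, b):
    """The Sheffer stroke a|b, as in the truth table above."""
    return not (a and b)

def neg_via_nand(p):
    return nand(p, p)                    # p|p

def and_via_nand(p, q):
    return nand(nand(p, q), nand(p, q))  # (p|q)|(p|q)

for p, q in product([True, False], repeat=2):
    assert neg_via_nand(p) == (not p)
    assert and_via_nand(p, q) == (p and q)
print("p|p ≡ ¬p and (p|q)|(p|q) ≡ p ∧ q hold for all truth values")
```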
Showing that a given set S of connectives is not adequate can take more
thought: how can one show that some propositional terms do not have equivalents built only using connectives from S?
Example 1.5.3. One might feel that, intuitively, ∧ and ∨ together are not adequate since they are both “positive”. How can one turn that intuition into a
proof? One would like to show, for instance, that no term built only using ∧ and
∨ can be logically equivalent to ¬p but, even using only the single propositional
variable p, there are infinitely many propositional terms to check. That might
suggest trying some sort of inductive proof. But a proof of what statement?
What we can do is to prove by induction on complexity/length of a term
that: if t is any term built only from ∧ and ∨ then for every valuation v such
that all the propositional variables appearing in t are assigned the value T by
v, we also have v(t) = T. Once that is done, we can deduce, in particular, that
no term built only using ∧ and ∨ can be equivalent to ¬p.
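For two variables one can also simply compute which truth tables are reachable from a given set of connectives. The sketch below is an experiment, not a proof of (in)adequacy in general: it represents the truth table of a term in p, q as a 4-tuple of values, one for each row, and closes the tables of p and q under the chosen connectives. All names are our own.

```python
from itertools import product

ROWS = list(product([True, False], repeat=2))   # the four valuations of (p, q)
P = tuple(p for p, q in ROWS)                    # truth table of the term p
Q = tuple(q for p, q in ROWS)                    # truth table of the term q

def closure(binary_ops, unary_ops=()):
    """Truth tables of all terms in p, q built using the given connectives."""
    tables = {P, Q}
    while True:
        new = set(tables)
        for f in unary_ops:
            new |= {tuple(f(a) for a in x) for x in tables}
        for f in binary_ops:
            new |= {tuple(f(a, b) for a, b in zip(x, y))
                    for x in tables for y in tables}
        if new == tables:        # nothing new was produced: we are done
            return tables
        tables = new

def conj(a, b): return a and b
def disj(a, b): return a or b
def neg(a): return not a
def nand(a, b): return not (a and b)

NOT_P = tuple(not p for p, q in ROWS)
pos = closure([conj, disj])
print(len(pos), NOT_P in pos)          # few tables, and ¬p is not among them
print(all(table[0] for table in pos))  # row p = q = T always receives T
print(len(closure([nand])), len(closure([conj], [neg])))
```

With ∧ and ∨ alone only four of the sixteen possible tables arise, and each has T in the row where p and q are both T, in line with the inductive argument of Example 1.5.3; with | alone, or with ∧ and ¬, all sixteen arise.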
1.6 Interpolation
Suppose that s and t are propositional terms and that s |= t, equivalently s → t
is a tautology. This could be for the trivial reasons that either s is always false
(unsatisfiable) or that t is always true (a tautology). But if that’s not the case
then the interpolation theorem guarantees that there is some propositional term
u which involves only the propositional variables appearing in both s and t such
that s |= u and u |= t. Such a u is referred to as an interpolant between s and
t.
Theorem 1.6.1. (Interpolation Theorem) Suppose that s ∈ S(L1 ) and t ∈
S(L2 ) are such that s |= t. Then either s is unsatisfiable or t is a tautology or
there is u ∈ S(L3 ), where L3 = L1 ∩ L2 , such that s |= u and u |= t.
Proof. We suppose that s is satisfiable and that t is not a tautology; we must
produce a suitable u.
Since s is satisfiable there is some valuation v1 on L1 such that v1 (s) = T and
since t is not a tautology there is some valuation v2 on L2 such that v2 (t) = F.
First we show that L3 ≠ ∅. If this were not so, that is, if L1 and L2 had
no propositional variables in common, then we could define a valuation v3 on
S(L1 ∪ L2) by setting, for p ∈ L1 ∪ L2,
v3(p) = v1(p) if p ∈ L1,
v3(p) = v2(p) if p ∈ L2.
Then, by 1.2.1(c), we would have v3(s) = v1(s) = T and v3(t) = v2(t) = F, which
contradicts the assumption that s |= t.
Now choose, by 1.4.5, a formula of S(L1) in disjunctive normal form which
is equivalent to s, say ⋁_{i=1}^{n} (⋀_{j=1}^{l_i} g_ij ∧ ⋀_{k=1}^{m_i} h_ik), where we have separated out
the literals into two groups: the g_ij, those belonging to S(L1) \ S(L3); and the
h_ik, those belonging to S(L3). (We allow that some of these conjuncts might
be empty.) We can assume that each disjunct is satisfiable (we can drop any
which are not). Define u to be ⋁_{i=1}^{n} ⋀_{k=1}^{m_i} h_ik. Clearly u ∈ S(L3) and, if v is a
valuation on S(L1) then, if v(s) = T, it must be that v(⋀_{j=1}^{l_i} g_ij ∧ ⋀_{k=1}^{m_i} h_ik) = T
for some i (by 1.4.3) and hence v(⋀_{k=1}^{m_i} h_ik) = T and hence v(u) = T. Thus
s |= u and we have just seen that u is satisfiable.
(You might reasonably ask what happens if, for this value of i, there are no h_ik conjuncts.
That could happen but there must be at least one such value of i (that is, with v making the
ith disjunct true) such that there is an h_ik. Otherwise, arguing as before, we could adjust the
valuation v, keeping the same values on propositional variables in S(L1) \ S(L3) but adjusting
it on those belonging to S(L3), so as to make t false, while keeping s true, contradicting that
s |= t.)
It remains to prove that u |= t. So let v be a valuation on S(L2) such that
v(u) = T. Then there must be some i0 such that v(⋀_{k=1}^{m_{i0}} h_{i0,k}) = T. We define
a valuation w on S(L1 ∪ L2) by setting, for p ∈ L1 ∪ L2,
w(p) = v(p) if p ∈ L2;
w(p) = T if p = g_{i0,j} for some j;
w(p) = F if ¬p = g_{i0,j} for some j;
w(p) = T, say, if p ∈ L1 \ L2 and is not already assigned a value, that is, if p does
not occur in the i0-th disjunct.
Note that w(⋀_{j=1}^{l_{i0}} g_{i0,j} ∧ ⋀_{k=1}^{m_{i0}} h_{i0,k}) = T by construction and hence w(s) = T.
But we assumed that s |= t and so w(t) = T. But w and v agree on all
propositional variables in L2; hence v(t) = T.
We conclude that u |= t, which was what had remained to be proved.

The proof gives an effective procedure for computing interpolants.
Example 1.6.2. Given that (p → (¬r ∧ s)) ∧ (p ∨ (r ∧ ¬s)) |= ((s → r) → t) ∨ (¬t →
(r ∧ ¬s)), how do we find an interpolant involving r and s only? (Note that, in
the notation of the proof, L1 = {p, r, s}, L2 = {r, s, t}, L3 = {r, s}.)
We find a term in disjunctive normal form which is logically equivalent to
(p → (¬r ∧ s)) ∧ (p ∨ (r ∧ ¬s)); one such is (p ∧ s ∧ ¬r) ∨ (¬p ∧ r ∧ ¬s). Following the
procedure in the proof, we obtain the interpolant u which is (s ∧ ¬r) ∨ (r ∧ ¬s).
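The procedure in the proof of the Interpolation Theorem can be sketched as follows. This is an illustration under assumptions of our own: terms are nested tuples, the disjunctive normal form used is the one read off from the truth table, the first term is assumed satisfiable and to share at least one variable with the second, and the names `interpolant` and `as_term` are hypothetical.

```python
from functools import reduce
from itertools import product

def variables(t):
    """Set of variables in a term; a variable is a string, other terms are tuples."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def value(t, v):
    """Truth value of t under the valuation v (a dict: variable -> bool)."""
    if isinstance(t, str):
        return v[t]
    if t[0] == "not":
        return not value(t[1], v)
    a, b = value(t[1], v), value(t[2], v)
    return {"and": a and b, "or": a or b, "imp": (not a) or b}[t[0]]

def entails(a, b):
    """Decide a |= b for single terms by checking every valuation."""
    names = sorted(variables(a) | variables(b))
    return all(value(b, v) or not value(a, v)
               for row in product([True, False], repeat=len(names))
               for v in [dict(zip(names, row))])

def interpolant(a, b):
    """From each disjunct of the truth-table DNF of a keep only the
    literals on variables shared with b (the h_ik of the proof)."""
    shared = sorted(variables(a) & variables(b))    # L3
    names = sorted(variables(a))                    # L1
    disjuncts = set()
    for row in product([True, False], repeat=len(names)):
        v = dict(zip(names, row))
        if value(a, v):
            disjuncts.add(tuple((p, v[p]) for p in shared))
    return sorted(disjuncts)

def as_term(disjuncts):
    """Turn the list of (variable, sign) tuples into a term (non-empty input assumed)."""
    def conj(d):
        return reduce(lambda x, y: ("and", x, y),
                      [p if sign else ("not", p) for p, sign in d])
    return reduce(lambda x, y: ("or", x, y), [conj(d) for d in disjuncts])

# Example 1.6.2
left = ("and", ("imp", "p", ("and", ("not", "r"), "s")),
               ("or", "p", ("and", "r", ("not", "s"))))
right = ("or", ("imp", ("imp", "s", "r"), "t"),
               ("imp", ("not", "t"), ("and", "r", ("not", "s"))))
u = as_term(interpolant(left, right))
print(u)
print(entails(left, u), entails(u, right))
```

For Example 1.6.2 the computed u is (¬r ∧ s) ∨ (r ∧ ¬s), the interpolant found by hand, and both entailments check out.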
Chapter 2 Deductive systems
“The design of the following treatise is to investigate the fundamental laws of
those operations of the mind by which reasoning is performed; to give expression
to them in the symbolical language of a Calculus, and upon this foundation to
establish the science of Logic and construct its method; to make that method
itself the basis of a general method for the application of the mathematical
doctrine of Probabilities; and, finally, to collect from the various elements of
truth brought to view in the course of these inquiries some probable intimations
concerning the nature and constitution of the human mind.”
Thus begins Chapter 1 of George Boole’s “An Investigation of the Laws
of Thought (on which are founded the Mathematical Theories of Logic and
Probabilities)” (1854).
We have already seen the “symbolical language” (though not the way Boole
wrote it) and what Boole meant by a Calculus (or Algebra). Now we discuss
proof/deductive systems further.
Given a propositional term, we may test whether or not it is a tautology by,
for example, constructing its truth table. This is regarded as a “semantic” test
because it is in terms of valuations. The test is recursive in the sense that we
have a procedure which, after a finite amount of time, is guaranteed to tell us
whether or not the term is a tautology.
More generally, suppose that S is a finite set of propositional terms and
that t is a propositional term. Recall that we write S |= t to mean that every
valuation which makes everything in S true also makes t true. Checking whether
or not this is true also is a recursive procedure.
In the case of predicate logic, however, it turns out that there is no corresponding algorithm for determining whether or not a propositional term (“sentence” in that context) is a tautology or whether the truth of a finite set of
propositions implies the truth of another proposition. (In fact the set of tautologies of predicate logic is “recursively enumerable” but not recursive. Saying that the set is recursively enumerable means that there is an algorithm which
will output only tautologies and such that any tautology eventually will be output; but we
can’t predict when.) The best we can do
is to produce a method of “generating” all tautologies or, more generally, of
starting with a set, S, of sentences/statements/propositions which we treat as
axioms and then generating all consequences of those axioms. Such a method of
generating consequences (and, of course, avoiding anything which is not a consequence) is a propositional or predicate calculus. In the following sections
we will describe two such calculi for propositional logic. Of course, for classical
propositional logic no such calculus is necessary because we have methods such
as truth tables or Beth trees. But these calculi will serve as models of calculi for
logics where there is no analogue of those recursive methods. It is also the case
that these calculi do correspond to “Laws of Thought” in the sense that their
axioms and rules of inference capture steps in reasoning that we use in practice.
The calculi that we will see here are considerably simpler than those for
predicate logic but the main concepts and issues ((in)consistency, soundness,
completeness, compactness, how one might prove completeness) all are present
already in this simpler context which provides, therefore, a good opportunity to
understand these fundamental issues.
2.1 A Hilbert-style system for propositional logic
Our (Hilbert-style) calculus will consist of certain axioms and one rule of
deduction (or rule of inference). There are infinitely many axioms, being
all the propositional terms of one of the forms:
(i) s → (t → s)
(ii) (r → (s → t)) → ((r → s) → (r → t))
(iii) ¬¬s → s
(iv) (¬s → ¬t) → (t → s),
where r, s and t may be any propositional terms.
Thus, for instance, the following is an axiom: (p∧¬r) → ((s∨t) → (p∧¬r)).
We refer to (i)-(iv) as axiom schemas.
The single rule of deduction, modus ponens, says that, from s and s → t
we may deduce t.
Then we define the notion of entailment or logical implication, written
⊢, within this calculus. Let S be a set (not necessarily finite) of propositional
terms and let s, t be propositional terms.
(i) If t is an axiom then S ⊢ t (“logical axiom” LA)
(ii) If s ∈ S then S ⊢ s (“non-logical axiom” NLA)
(iii) If S ⊢ s and S ⊢ s → t then S ⊢ t (“modus ponens” MP)
(iv) That’s it. (like the corresponding clause in the definition of propositional
term)
We read S ⊢ t as “S entails t” or “S logically implies (within this particular
calculus) t”.
This definition is, like various definitions we have seen before, an inductive
one: it allows chains of entailments. Here is an example of a deduction of p → r
from S = {p → q, q → r}.
1. S ⊢ q → r                                     NLA
2. S ⊢ (q → r) → (p → (q → r))                   LA(i)
3. S ⊢ p → (q → r)                               MP1,2
4. S ⊢ (p → (q → r)) → ((p → q) → (p → r))       LA(ii)
5. S ⊢ (p → q) → (p → r)                         MP3,4
6. S ⊢ p → q                                     NLA
7. S ⊢ p → r                                     MP5,6
(The line numbers and right-hand entries are there to help any reader follow/check the deduction.)
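Checking a deduction in this calculus is entirely mechanical, as the following sketch suggests. It assumes our own encoding (a variable is a string, `("imp", a, b)` is a → b and `("not", a)` is ¬a); the names `is_axiom` and `check` are hypothetical, and the checker works out the justification of each line itself instead of reading the right-hand entries.

```python
def imp(a, b):
    return ("imp", a, b)

def is_imp(x):
    return isinstance(x, tuple) and x[0] == "imp"

def is_neg(x):
    return isinstance(x, tuple) and x[0] == "not"

def is_axiom(x):
    """Is x an instance of one of the axiom schemas (i)-(iv)?"""
    if not is_imp(x):
        return False
    a, b = x[1], x[2]
    if is_imp(b) and b[2] == a:                          # (i) s → (t → s)
        return True
    if (is_imp(a) and is_imp(a[2])                       # (ii)
            and b == imp(imp(a[1], a[2][1]), imp(a[1], a[2][2]))):
        return True
    if is_neg(a) and is_neg(a[1]) and a[1][1] == b:      # (iii) ¬¬s → s
        return True
    if (is_imp(a) and is_neg(a[1]) and is_neg(a[2])      # (iv) (¬s → ¬t) → (t → s)
            and b == imp(a[2][1], a[1][1])):
        return True
    return False

def check(S, lines):
    """True if every line is a logical axiom, a member of S, or follows
    from two earlier lines by modus ponens."""
    seen = []
    for x in lines:
        if not (is_axiom(x) or x in S or any(imp(a, x) in seen for a in seen)):
            return False
        seen.append(x)
    return True

# The deduction of p → r from S = {p → q, q → r} given above.
p, q, r = "p", "q", "r"
S = {imp(p, q), imp(q, r)}
deduction = [
    imp(q, r),                                            # 1. NLA
    imp(imp(q, r), imp(p, imp(q, r))),                    # 2. LA(i)
    imp(p, imp(q, r)),                                    # 3. MP 1,2
    imp(imp(p, imp(q, r)), imp(imp(p, q), imp(p, r))),    # 4. LA(ii)
    imp(imp(p, q), imp(p, r)),                            # 5. MP 3,4
    imp(p, q),                                            # 6. NLA
    imp(p, r),                                            # 7. MP 5,6
]
print(check(S, deduction))
```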
Note that if S ⊆ T and if S ⊢ s then T ⊢ s because any deduction (such as
that above) of s from S may be changed into a deduction of s from T simply by
replacing every occurrence of “S” by “T”. A point about notation: if we write
something like “S ⊢ t” in a mathematical assertion (as opposed to this being a
line of a formal deduction) you should read this as saying “There is a deduction
of t from S.”
Warning: it can be surprisingly difficult to find deductions, even of simple
things, in this calculus. The (“Gentzen-style/Natural Deduction”) calculus that
we will use later allows deductions to be found more easily. Our main point is,
however, not in the details of the calculus but the fact that there is a calculus
for which one can prove a completeness theorem (2.1.12).
For any such deductive calculus there are two central issues: soundness and
completeness. We say that a deductive calculus is sound if we cannot deduce
things that we should not be able to deduce using it, equivalently if we cannot
deduce contradictions by using it. That is, if S ⊢ t then S |= t. And we
say that a deductive calculus is complete if it is strong enough to deduce all
consequences, that is if S |= t implies S ⊢ t.
So soundness is “If we can deduce t from S then, whenever S is true, t is
true.” and completeness is “If t is true whenever S is true then there will be a
deduction of t from S.”.
In the remainder of this section we will give a proof of soundness (this is the
easier part) and completeness for the calculus above.
2.1.1 Soundness
Suppose that S is a set of propositional terms and that t is a propositional term.
We have to show that if S ⊢ t is true then so is S |= t. Suppose then that S ⊢ t
and let v be a valuation with v(S) = T. We must show that v(t) = T.
In outline, the proof is this. The statement S ⊢ t means that there is a deduction of
t from S. Any such deduction is given by a sequence of (logical and non-logical)
axioms and applications of modus ponens. If we show that v assigns “T” to
every axiom and that modus ponens preserves “T” then every consequence of
a deduction will be “T”. More precisely, we argue as follows (“by induction on
line number”).
If r is a logical axiom then (go back and check that all those axioms are
actually tautologies!) r is a tautology, so certainly v(r) = T. If r is a non-logical
axiom then r ∈ S so, by assumption on v, we have v(r) = T. Suppose now that
we have an application of MP in the deduction of t. That application has the
form (perhaps with intervening lines and with the first two lines occurring in
the opposite order)
S ⊢ r
S ⊢ r → r′
S ⊢ r′
for some propositional terms r, r′. We may assume inductively (inducting on
the length of the deduction) that v(r) = T and that v(r → r′) = T. Then,
from the list of conditions for v to be a valuation, it follows that v(r′) = T, as
required.
On the very last line of the deduction we have
S ⊢ t
so our argument shows that v(t) = T, and we conclude that the calculus is
sound.
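The two facts on which the soundness proof rests, that every axiom is a tautology and that modus ponens preserves “T”, can be checked by brute force. In this sketch r, s, t are treated as truth values, which suffices because the value of an axiom instance depends only on the values of the terms substituted for r, s and t; the names used are our own.

```python
from itertools import product

def imp(a, b):
    """Truth table of →."""
    return (not a) or b

schemas = {
    "(i)":   lambda r, s, t: imp(s, imp(t, s)),
    "(ii)":  lambda r, s, t: imp(imp(r, imp(s, t)), imp(imp(r, s), imp(r, t))),
    "(iii)": lambda r, s, t: imp(not (not s), s),
    "(iv)":  lambda r, s, t: imp(imp(not s, not t), imp(t, s)),
}

# Each schema must come out true for all eight choices of values.
always_true = {
    name: all(f(r, s, t) for r, s, t in product([True, False], repeat=3))
    for name, f in schemas.items()
}
print(always_true)

# Modus ponens: whenever a and a → b are both true, b is true.
mp_ok = all(b for a, b in product([True, False], repeat=2) if a and imp(a, b))
print(mp_ok)
```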
2.1.2 Completeness
Our first step is to prove the Deduction Theorem, which allows us to move terms
in and out of the set of non-logical axioms.
Theorem 2.1.1. (Deduction Theorem) Let S be a set of propositional terms
and let s and t be propositional terms. Then S ⊢ (s → t) iff S ∪ {s} ⊢ t.
Proof. Both directions of the proof are really instructions on how to transform
a deduction of one into a deduction of the other.
From a deduction showing that S ⊢ (s → t) we may obtain a deduction of t
from S ∪ {s} by first replacing each occurrence of S (to the left of “⊢”) by an
occurrence of S ∪ {s} and noting that this is still a valid deduction, then adding
two more lines at the end, namely
S ∪ {s} ⊢ s          NLA
S ∪ {s} ⊢ t          MP(line above and line before that).
Note that this does give a deduction of t from S ∪ {s}.
For the converse, suppose that there is a deduction of t from S ∪ {s}. This
deduction is a sequence of lines
S ∪ {s} ⊢ ti for i = 1, ..., n where tn = t.
We will replace each of these lines by some new lines.
If ti is a logical axiom or member of S then we replace the i-th line by
S ⊢ ti                      LA or NLA
S ⊢ ti → (s → ti)           LA(i)
S ⊢ s → ti                  MP
If ti is s then we replace the i-th line by lines constituting a deduction of
s → s from S (the proof of 2.1.2 below but with “S” to the left of each “⊢”).
If the i-th line is obtained by an application of modus ponens then there are
line numbers j, k < i such that tk is tj → ti. In our transformed deduction there
will be corresponding (also earlier) lines reading
S ⊢ s → tj and
S ⊢ s → (tj → ti)
so we replace the old i-th line by the lines
S ⊢ (s → (tj → ti)) → ((s → tj) → (s → ti))     Ax(ii)
S ⊢ (s → tj) → (s → ti)                          MP(line above and one of the earlier ones)
S ⊢ s → ti                                       MP(line above and one of the earlier ones).
What we end up with is a (valid; you should check that you see this)
deduction with last line
S ⊢ s → tn,
as required (recall that tn is t). (It’s worthwhile applying the process described
to an example just to clarify how this works.)

Next, some lemmas, the first of which was used in the proof above.
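The second half of the proof of the Deduction Theorem is really an algorithm, and it can be sketched in code. The version below is illustrative only and rests on assumptions of our own: terms are nested tuples with `("imp", a, b)` for a → b and `("not", a)` for ¬a, the input is assumed to be a valid deduction from S ∪ {s}, and the names `discharge`, `self_imp`, `is_axiom` and `check` are hypothetical. The five lines produced by `self_imp` are those of the deduction of s → s in Lemma 2.1.2.

```python
def imp(a, b):
    return ("imp", a, b)

def is_imp(x):
    return isinstance(x, tuple) and x[0] == "imp"

def is_neg(x):
    return isinstance(x, tuple) and x[0] == "not"

def is_axiom(x):
    """Is x an instance of one of the axiom schemas (i)-(iv)?"""
    if not is_imp(x):
        return False
    a, b = x[1], x[2]
    return ((is_imp(b) and b[2] == a)
            or (is_imp(a) and is_imp(a[2])
                and b == imp(imp(a[1], a[2][1]), imp(a[1], a[2][2])))
            or (is_neg(a) and is_neg(a[1]) and a[1][1] == b)
            or (is_imp(a) and is_neg(a[1]) and is_neg(a[2])
                and b == imp(a[2][1], a[1][1])))

def check(S, lines):
    """Every line is an axiom, a member of S, or follows by modus ponens."""
    seen = []
    for x in lines:
        if not (is_axiom(x) or x in S or any(imp(a, x) in seen for a in seen)):
            return False
        seen.append(x)
    return True

def self_imp(s):
    """The five lines of the deduction of s → s (Lemma 2.1.2)."""
    ss = imp(s, s)
    l2, l4 = imp(s, imp(ss, s)), imp(s, ss)
    return [imp(l2, imp(l4, ss)), l2, imp(l4, ss), l4, ss]

def discharge(s, lines, S):
    """Turn a deduction `lines` from S ∪ {s} into a deduction from S
    whose last line is s → (last line of `lines`)."""
    out = []
    for i, t in enumerate(lines):
        if t == s:
            out += self_imp(s)
        elif is_axiom(t) or t in S:
            out += [t, imp(t, imp(s, t)), imp(s, t)]
        else:
            # t was obtained by modus ponens from earlier lines tj and tj → t
            tj = next(a for a in lines[:i] if imp(a, t) in lines[:i])
            ax = imp(imp(s, imp(tj, t)), imp(imp(s, tj), imp(s, t)))   # Ax(ii)
            out += [ax, ax[2], imp(s, t)]
    return out

# A deduction of r from S ∪ {p}, where S = {p → q, q → r}.
p, q, r = "p", "q", "r"
S = {imp(p, q), imp(q, r)}
lines = [p, imp(p, q), q, imp(q, r), r]
new = discharge(p, lines, S)
print(check(S | {p}, lines), check(S, new), new[-1] == imp(p, r))
```

The transformed deduction is longer than the one written by hand earlier for p → r, which is typical: the construction is uniform, not economical.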
Lemma 2.1.2. For every propositional term s there is a deduction (independent
of s) with last line ⊢ s → s and hence for every set S of propositional terms
there is a deduction with last line S ⊢ s → s.
Proof. Here’s the deduction.
1. ⊢ (s → ((s → s) → s)) → ((s → (s → s)) → (s → s))     Ax(ii)
2. ⊢ s → ((s → s) → s)                                    Ax(i)
3. ⊢ (s → (s → s)) → (s → s)                              MP(1,2)
4. ⊢ s → (s → s)                                          Ax(i)
5. ⊢ s → s                                                MP(3,4)
To obtain the second statement just put “S” to the left of each “⊢” and note
that the deduction is still valid.

We’ll abbreviate the statement of the following lemmas as in the statement
of the Deduction Theorem. Throughout, s and t are any propositional terms.
Lemma 2.1.3. ⊢ s → (¬s → t)
Proof. The first part of the proof is just to write down a deduction which takes
us close to the end. Then there are two applications of the Deduction Theorem.
We’ve actually incorporated those uses, labelled DT, into the deduction itself, as
a derived rule of deduction. An alternative would be to stop the deduction
at the line “7. {s, ¬s} ⊢ t MP(5,6)” and then say “Therefore {s, ¬s} ⊢ t. By the
Deduction Theorem it follows that {s} ⊢ ¬s → t and then, by the Deduction
Theorem again, ⊢ s → (¬s → t) follows.”
1. {s, ¬s} ⊢ ¬s → (¬t → ¬s)          Ax(i)
2. {s, ¬s} ⊢ ¬s                       NLA
3. {s, ¬s} ⊢ ¬t → ¬s                  MP(1,2)
4. {s, ¬s} ⊢ (¬t → ¬s) → (s → t)      Ax(iv)
5. {s, ¬s} ⊢ s → t                    MP(3,4)
6. {s, ¬s} ⊢ s                        NLA
7. {s, ¬s} ⊢ t                        MP(5,6)
8. {s} ⊢ ¬s → t                       DT
9. ⊢ s → (¬s → t)                     DT

In the next proof we use more derived rules of deduction.
Lemma 2.1.4. ⊢ (s → ¬s) → ¬s
Proof.
1. {s → ¬s} ⊢ ¬¬s → s                                    Ax(iii)
2. {s → ¬s, ¬¬s} ⊢ s                                     DT
3. {s → ¬s, ¬¬s} ⊢ s → ¬s                                NLA
4. {s → ¬s, ¬¬s} ⊢ ¬s                                    MP(2,3)
5. {s → ¬s, ¬¬s} ⊢ s → (¬s → ¬(s → s))                   Lemma 2.1.3
6. {s → ¬s, ¬¬s} ⊢ ¬s → ¬(s → s)                         MP(2,5)
7. {s → ¬s, ¬¬s} ⊢ ¬(s → s)                              MP(4,6)
8. {s → ¬s} ⊢ ¬¬s → ¬(s → s)                             DT
9. {s → ¬s} ⊢ (¬¬s → ¬(s → s)) → ((s → s) → ¬s)          Ax(iv)
10. {s → ¬s} ⊢ (s → s) → ¬s                              MP(8,9)
11. {s → ¬s} ⊢ s → s                                     Lemma 2.1.2
12. {s → ¬s} ⊢ ¬s                                        MP(10,11)
13. ⊢ (s → ¬s) → ¬s                                      DT
Lemma 2.1.5. ⊢ s → ¬¬s
Proof.
⊢ ¬¬¬s → ¬s                        Ax(iii)
⊢ (¬¬¬s → ¬s) → (s → ¬¬s)          Ax(iv)
⊢ s → ¬¬s                          MP
Lemma 2.1.6. ⊢ ¬s → (s → t)
Proof. Exercise!
Lemma 2.1.7. ⊢ s → (¬t → ¬(s → t))
Proof. Exercise!
Now, define a set S of (propositional) terms to be consistent if there is some
term t such that there is no deduction of t from S. Accordingly, say that a set
S is inconsistent if for every term t one has S ⊢ t. You might reasonably have
expected the definition of S being consistent to be that no contradiction can
be deduced from S. But the definition just given is marginally more useful and
is equivalent to the definition just suggested (this follows once we have proved
2.1.12 but is already illustrated by the next lemma).
Lemma 2.1.8. The set S of terms is inconsistent iff for some term s we have
S ⊢ ¬(s → s).
Proof. The direction “⇒” is immediate from the definition.
For the other direction, we suppose that there is some term s such that
S ⊢ ¬(s → s). It must be shown that for every term t we have S ⊢ t. Here is
the proof.
1. S ⊢ s → s                                  Lemma 2.1.2
2. S ⊢ (s → s) → ¬¬(s → s)                    Lemma 2.1.5
3. S ⊢ ¬¬(s → s)                              MP(1,2)
4. S ⊢ ¬¬(s → s) → (¬t → ¬¬(s → s))           Ax(i)
5. S ⊢ ¬t → ¬¬(s → s)                         MP(3,4)
6. S ⊢ (¬t → ¬¬(s → s)) → (¬(s → s) → t)      Ax(iv)
7. S ⊢ ¬(s → s) → t                           MP(5,6)
8. S ⊢ ¬(s → s)                               by assumption
9. S ⊢ t                                      MP(7,8)
Lemma 2.1.9. Let S be a set of terms and let s be a term. Then S ∪ {s} is
inconsistent iff S ⊢ ¬s.
Proof. Suppose first that S ∪ {s} is inconsistent. Then, by definition, S ∪
{s} ⊢ ¬s. So, by the Deduction Theorem, we have S ⊢ s → ¬s. Since also
⊢ (s → ¬s) → ¬s (2.1.4) and hence S ⊢ (s → ¬s) → ¬s, we can apply modus
ponens to obtain S ⊢ ¬s.
For the converse, suppose that S ⊢ ¬s and let t be any term. It must be
shown that S ∪ {s} ⊢ t. We have S ∪ {s} ⊢ s and also, by 2.1.3, S ∪ {s} ⊢ s →
(¬s → t). So, by modus ponens, S ∪ {s} ⊢ ¬s → t follows. Since S ⊢ ¬s, also
S ∪ {s} ⊢ ¬s, so another application of modus ponens gives S ∪ {s} ⊢ t. This
shows that S ∪ {s} is inconsistent, as required.
Lemma 2.1.10. Suppose that S is a set of terms and that s is a term. If both
S ⊢ s and S ⊢ ¬s then S is inconsistent.
Proof. For every term t we have S ⊢ s → (¬s → t) (by 2.1.3). Since also S ⊢ s
and S ⊢ ¬s, two applications of modus ponens give us S ⊢ t (for every t), so
S is inconsistent.
The next lemma is an expression of the finite character of the notion of
deduction.
Lemma 2.1.11. Suppose that S is a set of terms and that s is a term such that
S ⊢ s. Then there is a finite subset, S′, of S such that S′ ⊢ s.
Proof. Any derivation (of s from S) has only a finite number of lines and hence
uses only a finite number of non-logical axioms. Let S′ be the (finite) set of all
those actually used. Replace S by S′ throughout the deduction to obtain a valid
deduction, showing that S′ ⊢ s.
In the proof of the next theorem we make use of the observation that all the
propositional connectives may be defined using just ¬ and → (that is, together,
these two are adequate in the sense of Section 1.5) and so, in order to check
that a function v from the set of propositional terms to {T, F} is a valuation, it
is enough to check the defining clauses for ¬ and → only.
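The adequacy of {¬, →} used here can be checked mechanically. The following sketch assumes one standard choice of definitions (a ∧ b as ¬(a → ¬b), a ∨ b as ¬a → b, and a ↔ b as the conjunction of the two implications); these particular definitions and the function names are illustrative choices rather than something fixed by the notes. Each is compared with the usual truth table.

```python
from itertools import product

def imp(a, b):
    """Truth table of implication."""
    return (not a) or b

# Candidate definitions using only negation and implication.
def and_(a, b):
    return not imp(a, not b)           # a AND b  :=  not (a -> not b)

def or_(a, b):
    return imp(not a, b)               # a OR b   :=  (not a) -> b

def iff(a, b):
    return and_(imp(a, b), imp(b, a))  # a <-> b  :=  (a -> b) AND (b -> a)

rows = list(product([True, False], repeat=2))
checks = {
    "and": all(and_(a, b) == (a and b) for a, b in rows),
    "or": all(or_(a, b) == (a or b) for a, b in rows),
    "iff": all(iff(a, b) == (a == b) for a, b in rows),
}
```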
Theorem 2.1.12. (Completeness Theorem for Propositional Logic, version 1)
Suppose that S is a consistent set of propositional terms. Then there is a valuation v such that v(S) = T.
Proof. Let Γ = {T : T is a consistent set of propositional terms and T ⊇ S}
be the set of all sets of terms which contain S and are still consistent. We begin
by showing, using Zorn’s lemma² (see 2.1.16 below, for this), that
Γ has a maximal element.
Let ∆ be a subset of Γ which is totally ordered by inclusion. Let T = ⋃∆
be the union of all the sets in ∆. It has to be shown that T ∈ Γ and the only
possibly non-obvious point is that T is consistent. If it were not then, choosing
any term s, there would be a deduction T ⊢ ¬(s → s). By 2.1.11 there would
be a finite subset T′ of T with T′ ⊢ ¬(s → s). Since ∆ is totally ordered and
since T′ is finite there would be some T₀ ∈ ∆ such that T₀ ⊇ T′. But then we
would have T₀ ⊢ ¬(s → s). By 2.1.8 it would follow that T₀ is inconsistent,
contradicting the fact that T₀ ∈ ∆ ⊆ Γ.
This shows that every totally ordered subset of Γ has an upper bound in
Γ and so Zorn’s Lemma gives the existence of a maximal element, T say, of Γ.
That is, T is a maximal consistent set of terms containing S. What we will do,
and this is a key step in the proof, is define the valuation v by v(r) = T if r ∈ T
and v(r) = F if r ∉ T, but various things have to be proved in order to show
that this really does give a valuation.
² Chances are you haven’t seen this before. It is needed in the general case but if we assume
that L is countable then there’s a simpler proof of existence of a maximal element, and that’s
the one I’ll give in the lectures.
First, we show that T is “deductively closed” in the sense that
(*1) if T ⊢ r then r ∈ T.
Suppose, for a contradiction, that we had T ⊢ r but r ∉ T. Then, by
maximality of T, the set T ∪ {r} would have to be inconsistent and hence, by
2.1.9, T ⊢ ¬r. By 2.1.3, T ⊢ r → (¬r → t) for any term t, so two applications
of modus ponens give T ⊢ t. Since t was arbitrary that shows inconsistency of
T, a contradiction. Therefore (*1) is proved.
Next we show that T is “complete” in the sense that
(*2) for every term t either t ∈ T or ¬t ∈ T.
For, suppose that t ∉ T. Then, by maximality of T, the set T ∪ {t} is
inconsistent so, by 2.1.9, T ⊢ ¬t. Therefore, by (*1), ¬t ∈ T.
Next we show that
(*3) s → t ∈ T iff ¬s ∈ T or t ∈ T.
For the direction “⇐” suppose first that ¬s ∈ T. Then, by 2.1.6 and (*1),
s → t ∈ T. On the other hand if t ∈ T then s → t ∈ T by Axiom (i) and (*1).
For the converse, “⇒”, if we have neither ¬s nor t in T then, by (*2), both s
and ¬t are in T. Then, by 2.1.7 and (*1), we have ¬(s → t) ∈ T and so, by
consistency of T, s → t ∉ T, as required.
Now define the (purported) valuation v by v(t) = T iff t ∈ T. Since S ⊆ T
certainly v |= S so it remains to check that v really is a valuation. First, if
v(t) = T then t ∈ T so (by consistency of T) ¬t ∉ T so v(¬t) = F. Conversely, if
v(t) = F then t ∉ T so (by (*2)) ¬t ∈ T so v(¬t) = T. That deals with the ¬ clause
in the definition of valuation. The → clause is direct from (*3) which, in terms
of v, becomes v(s → t) = T iff v(¬s) = T or v(t) = T, that is (by what we just
showed), iff v(s) = F or v(t) = T, as required.
Theorem 2.1.13. (Completeness Theorem for Propositional Logic, version 2)
Let S be a set of propositional terms and let t be a propositional term. Then
S ⊢ t iff S |= t.
Proof. The direction “⇒” is the Soundness Theorem. For the converse, suppose
that S ⊬ t. Then, by Axiom (iii) and modus ponens, S ⊬ ¬¬t. It then follows
from 2.1.9 that S ∪ {¬t} is consistent so, by the first version of the Completeness
Theorem, there is a valuation v such that v(S) = T and v(¬t) = T so certainly
we cannot have v(t) = T. Therefore S ⊭ t, as required.
Theorem 2.1.14. (Compactness Theorem for Propositional Logic, version 1)
Let S be a set of propositional terms. There is a valuation v such that v(S) = T
iff for every finite subset S′ of S there is a valuation v′ with v′(S′) = T.
Proof. One direction is immediate: if v(S) = T then certainly v(S′) = T for
any (finite) subset S′ of S. For the converse suppose, for a contradiction, that
there is no v with v(S) = T. Then, by the Completeness Theorem (version
1), S is inconsistent. Choose any term s. Then, by definition of inconsistent,
S ⊢ ¬(s → s). So, by 2.1.11, there is a finite subset, S′, of S with S′ ⊢ ¬(s → s).
By 2.1.8, S′ is inconsistent. So by Soundness there is no valuation v′ with
v′(S′) = T, as required.
Theorem 2.1.15. (Compactness Theorem for Propositional Logic, version 2)
Let S be a set of propositional terms and let t be a propositional term. Then
S |= t iff there is some finite subset S′ of S such that S′ |= t.
Proof. Exercise.
Theorem 2.1.16. (Zorn’s Lemma)³ Suppose that (P, ≤) is a partially ordered
set such that every chain has an upper bound, that is, if {a_i}_{i∈I} ⊆ P is totally
ordered (for all i, j either a_i ≤ a_j or a_j ≤ a_i) then there is some a ∈ P with
a ≥ a_i for all i ∈ I. Then there is at least one maximal element in P (i.e. an
element with nothing in P strictly above it).
This is a consequence of, in fact is equivalent to, the Axiom of Choice from set
theory.
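When the language is countable, a maximal consistent set can be reached without Zorn's Lemma by running once through an enumeration of the terms and keeping each term that can be added consistently. The sketch below illustrates that idea under loud assumptions: it works with two variables only, the enumeration is a small hypothetical finite list, and "consistent" is replaced by "satisfiable" (the two agree by Soundness and Completeness, but that is exactly what the theorem is proving, so this is an illustration and not a proof).

```python
from itertools import product

VARS = ("p", "q")

def evaluate(term, v):
    """Terms: a variable name, ("not", a) or ("imp", a, b)."""
    if isinstance(term, str):
        return v[term]
    if term[0] == "not":
        return not evaluate(term[1], v)
    return (not evaluate(term[1], v)) or evaluate(term[2], v)

def satisfiable(terms):
    """Stand-in for 'consistent' (assumes Soundness and Completeness)."""
    for values in product([True, False], repeat=len(VARS)):
        v = dict(zip(VARS, values))
        if all(evaluate(t, v) for t in terms):
            return True
    return False

def extend(S, enumeration):
    """Keep each enumerated term that can be added without losing
    satisfiability; in a genuinely countable language the result would be
    the union of infinitely many such stages."""
    T = list(S)
    for t in enumeration:
        if t not in T and satisfiable(T + [t]):
            T.append(t)
    return T

# Hypothetical enumeration: variables, their negations, simple implications.
atoms = list(VARS)
enumeration = (
    atoms
    + [("not", a) for a in atoms]
    + [("imp", a, b) for a in atoms for b in atoms]
)
S = [("imp", "p", "q"), ("not", "q")]
T = extend(S, enumeration)
```

For this S the extension contains ¬p and ¬q but neither p nor q, matching the property that exactly one of t, ¬t lies in a maximal consistent set.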
2.2 A natural deduction system for propositional logic
The calculus that we describe in this section has no logical axioms as such but
it has many rules of deduction and it allows much more “natural” proofs. We
define, by induction, a relation S ⇒ t where S is any set of propositional terms
and t is any propositional term. It will turn out to be equivalent to the relation
S ⊢ t because one can prove the Completeness Theorem also for this calculus.
A sequent is a line of the form S ⇒ t where S is a (finite) set of propositional terms and t is a propositional
term. We write s1, . . . , sn ⇒ t instead of {s1, . . . , sn} ⇒ t and we can write ⇒ t
if S is empty. Certain sequents are called
theorems and they are defined inductively by the following rules.
(Ax) Every sequent of the form S, t ⇒ t is a theorem (these sequents play the
role of non-logical axioms in the Hilbert-style calculus).
(→I) If S, s ⇒ t is a theorem then so is S ⇒ s → t.
(→E) If S ⇒ s → t and S ⇒ s are theorems then so is S ⇒ t.
(¬I) If S, s ⇒ t and S, s ⇒ ¬t are theorems then so is S ⇒ ¬s.
(¬¬) If S ⇒ ¬¬t is a theorem then so is S ⇒ t.
The theorems/rules of deduction in this calculus are usually written using a
less linear notation, as follows.

(Ax)    S, t ⇒ t

         S, s ⇒ t
(→I)   -----------
        S ⇒ s → t

        S ⇒ s → t    S ⇒ s
(→E)   --------------------
               S ⇒ t

        S, s ⇒ t    S, s ⇒ ¬t
(¬I)   -----------------------
               S ⇒ ¬s

        S ⇒ ¬¬t
(¬¬)   ---------
         S ⇒ t

³ included only for completeness of exposition
This is a minimal list, corresponding to writing every propositional term up
to equivalence using only {→, ¬} (which, recall, is an adequate set of propositional connectives).
Of course, there are also rules involving ∨ and ∧, as follows.

  S ⇒ s    T ⇒ t            S, s, t ⇒ u
 -----------------         --------------
   S ∪ T ⇒ s ∧ t            S, s ∧ t ⇒ u

  S ⇒ s ∧ t                 S ⇒ s ∧ t
 -----------               -----------
    S ⇒ s                     S ⇒ t

    S ⇒ s                     S ⇒ s
 -----------               -----------
  S ⇒ s ∨ t                 S ⇒ t ∨ s

  S, s ⇒ u    T, t ⇒ u
 ----------------------
   S ∪ T, s ∨ t ⇒ u

  S ⊢ t    S1 ⊇ S
 -----------------
      S1 ⊢ t
As before one may introduce derived rules, for example, Proof by Contradiction which says:
If S, ¬s ⇒ t and S, ¬s ⇒ ¬t are theorems then so is S ⇒ s.
Which may be expressed by

  S, ¬s ⇒ t    S, ¬s ⇒ ¬t
 -------------------------
          S ⇒ s

This rule can be derived from those above as follows:
If S, ¬s ⇒ t and S, ¬s ⇒ ¬t are theorems then so is S ⇒ ¬¬s (by (¬I)) and hence
so is S ⇒ s (by (¬¬)).
Here’s the same argument written using the 2-dimensional notation.

  S, ¬s ⇒ t    S, ¬s ⇒ ¬t
 -------------------------  by (¬I)
         S ⇒ ¬¬s
 -------------------------  by (¬¬).
          S ⇒ s
As in the earlier-described calculus, a sequence of theorems is called a (valid)
deduction. You should, of course, check that you agree that the above all are
“valid rules of deduction”.
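To see how short proofs in this calculus can be, here is a small sketch that encodes the five basic rules as functions on sequents and uses them to derive s → (¬s → t), the statement of Lemma 2.1.3. The encoding is an assumption made purely for illustration: a sequent is stored as a pair (S, t) with S a frozenset, written in the comments with ⇒ between its two sides, and the function names are hypothetical.

```python
def neg(a):
    return ("not", a)

def imp(a, b):
    return ("imp", a, b)

def ax(S, t):
    """(Ax): S, t => t."""
    return (frozenset(S) | {t}, t)

def imp_intro(seq, s):
    """(->I): from S, s => t obtain S => s -> t."""
    S, t = seq
    if s not in S:
        raise ValueError("discharged assumption must occur on the left")
    return (S - {s}, imp(s, t))

def imp_elim(seq1, seq2):
    """(->E): from S => s -> t and S => s obtain S => t."""
    (S1, st), (S2, s) = seq1, seq2
    if S1 != S2 or not isinstance(st, tuple) or st[0] != "imp" or st[1] != s:
        raise ValueError("premises do not match (->E)")
    return (S1, st[2])

def neg_intro(seq1, seq2, s):
    """(notI): from S, s => t and S, s => not t obtain S => not s."""
    (S1, t), (S2, nt) = seq1, seq2
    if S1 != S2 or s not in S1 or nt != neg(t):
        raise ValueError("premises do not match (notI)")
    return (S1 - {s}, neg(s))

def double_neg(seq):
    """(notnot): from S => not not t obtain S => t."""
    S, nnt = seq
    ok = (isinstance(nnt, tuple) and nnt[0] == "not"
          and isinstance(nnt[1], tuple) and nnt[1][0] == "not")
    if not ok:
        raise ValueError("conclusion is not a double negation")
    return (S, nnt[1][1])

s, t = "s", "t"
step1 = ax({s, neg(s), neg(t)}, s)        # s, not s, not t => s
step2 = ax({s, neg(s), neg(t)}, neg(s))   # s, not s, not t => not s
step3 = neg_intro(step1, step2, neg(t))   # s, not s => not not t
step4 = double_neg(step3)                 # s, not s => t
step5 = imp_intro(step4, neg(s))          # s => not s -> t
step6 = imp_intro(step5, s)               # => s -> (not s -> t)
```

Steps 1 to 4 are an instance of Proof by Contradiction; the two uses of (→I) at the end play the role that the Deduction Theorem played in the Hilbert-style proof.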
If S is a (possibly infinite) set of propositional terms and t is any propositional term then we will write S ⊢ t if there is a proof of t from S in
this calculus, more formally, if there is a finite subset S′ of S such that S′ ⇒ t is a
theorem. You can read “S ⊢ t” as “there is a deduction of t from S”. Of course,
we already have such a notation and terminology from the previous section but
ignore that earlier deductive system for the moment.
Some further, easily derived, properties of the relation ⊢ are:

         S, ¬s ⊢ t    S, ¬s ⊢ ¬t
(PbC)   -------------------------
                 S ⊢ s

          S ⊢ t
(Fin)   ---------   for some finite subset S′ ⊆ S
          S′ ⊢ t

         S, φ ⊢ t    S ⊢ φ
(Cut)   -------------------
               S ⊢ t

One can prove soundness and completeness for this calculus. Recall what
the issues are.
• Is the calculus sound? That is, does the calculus generate only tautologies;
more generally, if the sequent S ⇒ t is a theorem then is it true that S |= t?
• Is the calculus complete? That is, does the calculus generate all tautologies;
more generally, does S |= t imply that there is a proof in this calculus of S ⇒ t?
The answer to each question is “yes”. The proof of soundness involves checking that each rule of deduction preserves tautologies (compare the analogous
point in the Hilbert-style calculus). The proof of completeness is entirely analogous to that for the Hilbert-style calculus (though notice that the Deduction
Theorem is already built into this natural deduction calculus). In particular,
one makes the same definition for a set of terms to be (in)consistent and the
heart of the proof is: given a consistent set S of terms, build a valuation which
gives all elements of S the value T.
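The soundness check for the basic rules (Ax), (→I), (→E), (¬I) and (¬¬) can be carried out by brute force in a small case. The sketch below is an illustration under assumptions chosen here: there are only two propositional variables, each term is replaced by the set of valuations at which it is true (a 4-bit mask), and a finite set S is represented by the mask of its conjunction. For every choice of masks, whenever the premises of a rule are semantically valid, so is its conclusion.

```python
# A proposition in two variables is identified with the set of valuations
# (out of 4) at which it is true, stored as a 4-bit mask in range(16).
FULL = 0b1111

def neg(a):
    return FULL & ~a

def imp(a, b):
    return neg(a) | b

def entails(S, t):
    """S |= t: every valuation satisfying S also satisfies t."""
    return (S & neg(t)) == 0

props = range(16)
rule_ok = {
    "Ax": all(entails(S & t, t) for S in props for t in props),
    "->I": all(entails(S, imp(s, t))
               for S in props for s in props for t in props
               if entails(S & s, t)),
    "->E": all(entails(S, t)
               for S in props for s in props for t in props
               if entails(S, imp(s, t)) and entails(S, s)),
    "notI": all(entails(S, neg(s))
                for S in props for s in props for t in props
                if entails(S & s, t) and entails(S & s, neg(t))),
    "notnot": all(entails(S, t)
                  for S in props for t in props
                  if entails(S, neg(neg(t)))),
}
```

The same computation works for any finite number of variables, at the cost of larger masks.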
Part II
Predicate Logic
Chapter 3
A brief introduction to predicate logic: languages and structures
3.1 Predicate languages
As we said in the introduction, propositional logic is about combining already-formed statements into more complex ones, whereas predicate logic allows us
to formulate mathematical (and other) statements. Predicate logic is founded
on the standard view in pure mathematics that the main objects of study are
sets-with-structure. The statements that we can form in predicate logic will be
statements about sets-with-structure. So first we need to be able to talk, in
this logic, about elements of sets; that is reflected in predicate languages having
variables, x, y, ..., which range over the elements of a given set. We also
have the universal quantifier ∀ (“for all”) and the existential quantifier
∃ (“there is”) which prefix variables - so a formula in this language can begin
∀x∃y . . . (“for all x there is a y such that ...”). Of course, at “...” we want to
be able to insert something about x and y and that’s where the “structure” in
“sets-with-structure” comes in. This will take a bit of explaining because the
predicate language that we set up depends on the exact type of “structure” that
we want to deal with.
In brief, using a pick-and-mix approach, we set up a predicate language by
choosing a certain collection of symbols which can stand for constants (specific
and fixed elements of structures), for functions and for relations. Here are
some examples.
Example 3.1.1. One piece of structure that is always there in a set-with-structure
is equality - so we will (in this course) always have the relation = which expresses
equality between elements of a set. This is a binary (=2-ary) relation, meaning
that it relates pairs of elements.
Example 3.1.2. Part of the structure on a set might be an ordering - for example
the integers or reals with the usual ordering. If so then we would also include a
binary relation symbol, different from equality, say ≤, in our language.
Example 3.1.3. Continuing with the examples Z, R, we might want to express
the arithmetic operations of addition, multiplication and taking-the-negative
in our language: so we would add two binary function symbols (i.e. function
symbols taking two arguments), + and ×, and also a unary (=1-ary) function
symbol − (that’s meant to be used for the function a ↦ −a, not the binary
function subtraction). We might also add symbols 0 and 1 as constants.
Example 3.1.4. Functions with more than two arguments are pretty common;
for instance we might want to have some polynomial functions (or, if we were
dealing with C, perhaps some analytic functions) built into our language, say
a 3-ary function symbol f with which we could express the function given by
F(x, y, z) = x² + y² + xz + 1.
Example 3.1.5. Relation symbols with more than two arguments are not so
common but here’s an example. Take the real line and define the relation
B(x, y, z) to mean “y lies (strictly) between x and z”.
To recap: in the case of propositional logic there was essentially just one
language (at least once we had chosen a set of propositional variables): in the
case of predicate logic there are many, in the sense that when defining any such
language one has to make a choice from certain possible ingredients. There is,
however, a basic language which contains none of these extra ingredients and
we’ll introduce that first. Actually even for the basic language there is a choice:
whether or not to include a symbol for equality. The choice between inclusion
or exclusion of equality rather depends on the types of application one has in
mind but for talking about sets-with-structure it’s certainly natural to include
a symbol “=” for equality.
3.2 The basic language
The basic (first-order, finitary, with equality) language L0 has the following:
(i) all the propositional connectives ∧, ∨, ¬, →, ↔
(ii) countably many variables x, y, u, v, v0 , v1 , ...
(iii) the existential quantifier ∃
(iv) the universal quantifier ∀
(v) a symbol for equality =
Then we go on to define “terms” and “formulas”. Both of these, in different
ways, generalise the notion of “propositional term” so remember that the word
“term” in predicate logic has a different meaning from that in propositional
logic.
Formulas and free variables
A term of L0 is nothing other than a variable (you’ll see what “term” really
means when we discuss languages with constant or function symbols). The free
variable of such a term x (say) is just the variable, x, itself: fv(x) = {x}.
An atomic formula of L0 is an expression of the form s = t where s and
t are terms. The set of free variables of the atomic formula s = t is given by
fv(s = t) = fv(s) ∪ fv(t).
The following clauses define what it means to be a formula of L0 (and,
alongside, we define what are the free variables of any formula):
(0) every atomic formula is a formula;
(i) if φ is a formula then so is ¬φ, fv(¬φ) = fv(φ);
(ii) if φ and ψ are formulas then so are φ ∧ ψ, φ ∨ ψ, φ → ψ and φ ↔ ψ, and
fv(φ ∧ ψ) = fv(φ ∨ ψ) = fv(φ → ψ) = fv(φ ↔ ψ) = fv(φ) ∪ fv(ψ);
(iii) if φ is a formula and x is any variable then ∃xφ and ∀xφ are formulas,
and fv(∃xφ) = fv(∀xφ) = fv(φ) \ {x}.
((iv) plus the usual “that’s it” clause)
A sentence is a formula σ with no free variables (i.e. fv(σ) = ∅).
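The inductive clauses for fv translate directly into a recursive function. The sketch below assumes a nested-tuple encoding of L0 formulas, described in the comments; the encoding and the names `fv` and `is_sentence` are illustrative choices, not part of the notes.

```python
# Assumed encoding of L0 formulas as nested tuples:
#   ("eq", x, y)                            atomic formula x = y (x, y variable names)
#   ("not", phi)
#   ("and" | "or" | "imp" | "iff", phi, psi)
#   ("exists" | "forall", x, phi)
def fv(phi):
    """Set of free variables, following clauses (0) to (iii)."""
    op = phi[0]
    if op == "eq":
        return {phi[1], phi[2]}
    if op == "not":
        return fv(phi[1])
    if op in ("and", "or", "imp", "iff"):
        return fv(phi[1]) | fv(phi[2])
    if op in ("exists", "forall"):
        return fv(phi[2]) - {phi[1]}
    raise ValueError(f"not a formula: {phi!r}")

def is_sentence(phi):
    """A sentence is a formula with no free variables."""
    return not fv(phi)

# forall x forall y (x = y -> y = x): a sentence.
symmetry = ("forall", "x", ("forall", "y",
            ("imp", ("eq", "x", "y"), ("eq", "y", "x"))))
# exists x not (x = y): y is free.
open_formula = ("exists", "x", ("not", ("eq", "x", "y")))
```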
Just as with propositional logic we do not need all the above, because we may
define some symbols in terms of the others. For instance, ∧ and ¬, alternatively
→ and ¬, suffice for the propositional connectives. Also each of the quantifiers
may be defined in terms of the other using negation: ∀xφ is logically equivalent
to ¬∃x¬φ (and ∃x is equivalent to ¬∀x¬) so we may (and in inductive proofs
surely would, just to reduce the number of cases in the induction step) drop
reference to ∀ in the last clause of the definition.
We also remark that we follow natural usage in writing, for instance, x ≠ y
rather than ¬(x = y).
If φ is a formula then it is so by virtue of the above definition, so it has
a “construction tree” and we refer to any formula occurring in this tree as a
subformula of φ. We also use this term to refer to a corresponding substring
of φ. Remember that any formula is literally a string of symbols (usually we
mean in the abstract rather than a particular physical realisation) and so we
can also refer to an occurrence of a particular (abstract) symbol in a formula.
As well as defining the set of free variables of a formula we need to define
the notion of free occurrence of a variable. To do that, if x is a variable then:
(i) every occurrence of x in any atomic formula is free;
(ii) the free occurrences of x in ¬φ are just the free occurrences of x in its
subformula φ;
(iii) the free occurrences of x in φ ∧ ψ are just the free occurrences of x in φ
together with the free occurrences of x in ψ;
(iv) there are no free occurrences of x in ∃xφ.
In a formula of the form Qxφ we refer to φ as the scope of the quantifier
Q (∃ or ∀). Any occurrence of x in Qxφ which is a free occurrence of x in φ
(the latter regarded as a subformula of Qxφ) is said to be bound by that initial
occurrence of the quantifier Qx. So a quantifier Qx binds the free occurrences
of x within its scope.
A comment on use of variables when you are constructing formulas. Note
that bound variables are “dummy variables”: the formulas ∃x f(x) = y and
∃z f(z) = y are, intuitively, equivalent. A formula with nested occurrences of
the same variable being bound can be confusing to read: ∃x(∀x(f (x) = x) →
f (x) = x) could be written less confusingly as ∃x(∀y(f (y) = y) → f (x) = x). Of
course these are not the same formula but one can prove that they are logically
equivalent and the second is preferable.
Another informal notation that we will sometimes use is to “collapse repeated
quantifiers”, for example to write ∀x, y(x = y → y = x) instead of ∀x∀y(x =
y → y = x). Sometimes the abbreviations ∃! , ∃≤n , ∃=n are useful.
3.3 Enriching the language
The language L0 described above has little expressive power: there’s really not
much that we can say using it; the following list just about exhausts the kinds
of things that can be said.
∀x(x = x);
∀x∀y(x = y → y = x);
∀x∀y∀z(x = y ∧ y = z → x = z);
∃x∃y∃z(x ≠ y ∧ y ≠ z ∧ x ≠ z ∧ ∀w(w = x ∨ w = y ∨ w = z));
∃x(x ≠ x).
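Since an L0-structure is just a non-empty set, sentences like the five above can be evaluated in a finite set by reading ∀ as `all` and ∃ as `any`. The following sketch does this directly; the function names are hypothetical and the sets {0, ..., n−1} are used only as convenient examples.

```python
def reflexivity(M):
    # forall x (x = x)
    return all(x == x for x in M)

def symmetry(M):
    # forall x forall y (x = y -> y = x)
    return all((not x == y) or y == x for x in M for y in M)

def transitivity(M):
    # forall x forall y forall z (x = y and y = z -> x = z)
    return all((not (x == y and y == z)) or x == z
               for x in M for y in M for z in M)

def exactly_three(M):
    # exists x, y, z pairwise distinct such that every w equals one of them
    return any(
        x != y and y != z and x != z and all(w in (x, y, z) for w in M)
        for x in M for y in M for z in M
    )

def some_x_differs_from_itself(M):
    # exists x (x != x)
    return any(x != x for x in M)

# Evaluate the fourth sentence in sets with 1, ..., 5 elements.
sizes = {n: exactly_three(set(range(n))) for n in range(1, 6)}
```

The first three sentences hold in every set, the last holds in none, and the fourth picks out exactly the three-element sets, which is about as much as L0 can express.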
We are now going to give the formal definitions of the possible extra ingredients for a language but, since this is just a brief introduction to predicate logic,
these definitions are included just so that you have precise definitions to refer to
in case you have a question that is not answered by the perhaps less formal exposition that I will give in lectures. That exposition will focus on a limited class
of examples and on actually making sense of the meanings of various formulas
in specific examples. So what follows is just for reference.
As we discussed earlier, precisely what we should add to the language L0
depends on the type of structures whose properties we wish to capture within
our formal language. We therefore suppose that we have, at our disposal, the
following kinds of symbols with which we may enrich the language:
• n-ary function symbols such as f ( = f (x1 , . . . , xn ));
(since an operation is simply a function regarded in a slightly different way,
we don’t need to introduce operation symbols as well as function symbols, but
we do use “operation notation” where appropriate, writing, for instance, x + y
rather than +(x, y))
• n-ary relation symbols such as R (= R(x1 , . . . , xn ))
(1-ary relation symbols, such as P (= P (x)), are also termed (1-ary) predicate
symbols);
• constant symbols such as c.
Formulas of an enriched language Suppose that L is the language L0 enriched by as many function, relation and constant symbols as we require (the
signature of L is a term used when referring to these extra symbols). Exactly
what is in L will depend on our purpose: in particular, L need not have function
and relation and constant symbols, although I will, for the sake of a uniform
treatment, write as if all kinds are represented. If S is the set of “extra” symbols
we have added then we will write L = L0 ∨ S. (It is notationally convenient
to regard L as being, formally, the set of all formulas of L, so then, writing,
for example, φ ∈ L makes literal sense. Thus the “∨” should be understood as
some sort of “join”, not union of sets.)
The terms of L, and their free variables, are defined inductively by:
(i) each variable x is a term, fv(x) = {x};
(ii) each constant symbol c is a term, fv(c) = ∅;
(iii) if f is an n-ary function symbol and if t1 , . . . , tn are terms, then f (t1 , . . . , tn )
is a term, fv(f (t1 , . . . , tn )) = fv(t1 ) ∪ · · · ∪ fv(tn ).
The atomic formulas of L (and their free variables) are defined as follows:
(i) if s, t are terms then s = t is an atomic formula, fv(s = t) = fv(s) ∪ fv(t);
(ii) if R is an n-ary relation symbol and if t1 , . . . , tn are terms, then R(t1 , . . . , tn )
is an atomic formula, fv(R(t1 , . . . , tn )) = fv(t1 ) ∪ · · · ∪ fv(tn ).
The formulas of L (and their free variables) are defined as follows:
(0) every atomic formula is a formula;
(i) if φ is a formula then so is ¬φ, fv(¬φ) = fv(φ);
(ii) if φ and ψ are formulas then so are φ ∧ ψ, φ ∨ ψ, φ → ψ and φ ↔ ψ and
fv(φ ∧ ψ) = fv(φ ∨ ψ) = fv(φ → ψ) = fv(φ ↔ ψ) = fv(φ) ∪ fv(ψ);
(iii) if φ is a formula and x is any variable then ∃xφ and ∀xφ are formulas,
and fv(∃xφ) = fv(∀xφ) = fv(φ) \ {x}
A sentence of L is a formula σ of L with no free variables (i.e. fv(σ) = ∅).
Since formulas were constructed by induction we prove things about them by
induction (“on complexity”) and, just as in the case of propositional terms, the
issue of unique readability raises its head. Such inductive proofs will be valid
only provided we know that there is basically just one way to construct any given
formula (for two routes would give two paths through the induction and hence,
possibly, two different outcomes). Unique readability holds for formulas and
also for terms. Both proofs are done by induction (on complexity) and are not
difficult.
3.4 L-structures
Suppose that L is a language of the sort discussed above.
Formulas and sentences do not take on meaning until they are interpreted
in a particular structure. Roughly, having fixed a language, a structure for that
language provides: a set for the variables to range over (so, if M is the set then,
“∀x” will mean “for all x in M ”); an element of that set for each constant symbol
to name (so each constant symbol c of the language will name a particular, fixed
element of M ); for each function symbol of the language an actual function (of
the correct arity) on that set; for each relation symbol of the language an actual
relation (of the correct arity) on that set. Here’s the precise definition.
An L-structure M (or structure for the language L) is a non-empty set
M , called the domain or underlying set of M, we write M = |M|, together
with an interpretation in M of each of the function, relation and constant symbols of L. By an interpretation of one of these symbols we mean the following
(and we also insist that the symbol “=” for equality be interpreted as actual
equality between elements of M ):
(i) if f is an n-ary function symbol, then the interpretation of f in M, which
is denoted f^M, must be a function from M^n to M;
(ii) if R is an n-ary relation symbol, then the interpretation of R in M,
which is denoted R^M, must be a subset of M^n (in particular, the interpretation
of a 1-ary predicate symbol is a subset of M);
(iii) if c is a constant symbol, then the interpretation of c in M, which is
denoted c^M, must be an element of M.
If no confusion should arise from doing so, the superscript “M” may be
dropped (thus the same symbol “f ” is used for the function symbol and for the
particular interpretation of this symbol in a given L-structure).
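One possible way to hold an L-structure in code is as a domain together with dictionaries interpreting the symbols. The sketch below is illustrative only: the class `Structure`, the tuple encoding of terms and the example structure (the integers modulo 5 with +, ×, 0, 1 and ≤) are assumptions made here, not examples taken from the notes.

```python
from dataclasses import dataclass, field

@dataclass
class Structure:
    domain: set
    functions: dict = field(default_factory=dict)  # symbol -> function on the domain
    relations: dict = field(default_factory=dict)  # symbol -> set of tuples
    constants: dict = field(default_factory=dict)  # symbol -> element of the domain

def eval_term(term, M, assignment):
    """Value of a term in M. A term is a variable or constant name (str)
    or a tuple (f, t1, ..., tn); variables get values from `assignment`."""
    if isinstance(term, str):
        if term in M.constants:
            return M.constants[term]
        return assignment[term]
    f, *args = term
    return M.functions[f](*(eval_term(a, M, assignment) for a in args))

# Hypothetical example structure: integers mod 5.
Z5 = Structure(
    domain=set(range(5)),
    functions={"+": lambda a, b: (a + b) % 5,
               "×": lambda a, b: (a * b) % 5},
    relations={"≤": {(a, b) for a in range(5) for b in range(5) if a <= b}},
    constants={"0": 0, "1": 1},
)

# The term (x × x) + 1 with x interpreted as 3.
value = eval_term(("+", ("×", "x", "x"), "1"), Z5, {"x": 3})
```

Each function maps the domain to itself and each relation is a set of tuples from the domain, as clauses (i) to (iii) require.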
3.5 Some basic examples
The basic language
An L0 -structure is simply a set so L0 -structures have rather limited value
as illustrations of definitions and results.
In lectures we will give a variety of examples, concentrating on languages L
which contain just one extra binary relation symbol R.
Directed graphs An L = L0 ∨ {R(−, −)}-structure M consists of a set M
together with an interpretation of the binary relation symbol R as a particular
subset, R^M, of M × M. That is, an L-structure consists of a set together with
a specified binary relation on that set.
Given such a structure, its directed graph, or digraph for short, has for its
vertices the elements of M and has an arrow going from vertex a to vertex b
iff (a, b) ∈ R^M. This gives an often useful graphical way of picturing or even
defining a relation R^M (note that the digraph of a relation specifies the relation
completely).
Certain types of binary relation are of particular importance in that they
occur frequently in mathematics (and elsewhere).
Posets A partially ordered set (poset for short) consists of a set P and a
binary relation on it, usually written ≤, which satisfies:
for all a ∈ P , a ≤ a (≤ is reflexive);
for all a, b, c ∈ P , a ≤ b and b ≤ c implies a ≤ c (≤ is transitive);
for all a, b ∈ P , if a ≤ b and b ≤ a then a = b (≤ is weakly antisymmetric).
The Hasse diagram of a poset is a diagrammatic means of representing a poset.
It is obtained by connecting a point on the plane representing an element a of
the poset to each of its immediate successors (if there are any) by a line which
goes upwards from that point. We say that b is an immediate successor of a
if a < b (i.e. a ≤ b and a ≠ b) and if a ≤ c ≤ b implies a = c or c = b: we also
then say that a is an immediate predecessor of b.
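For a finite poset both the axioms and the Hasse diagram can be computed directly from the definitions. In the sketch below the example (the divisors of 12 ordered by divisibility) and the function names are hypothetical choices made for illustration.

```python
def is_partial_order(P, leq):
    """Check reflexivity, transitivity and weak antisymmetry on a finite set."""
    reflexive = all(leq(a, a) for a in P)
    transitive = all(leq(a, c)
                     for a in P for b in P for c in P
                     if leq(a, b) and leq(b, c))
    antisymmetric = all(a == b
                        for a in P for b in P
                        if leq(a, b) and leq(b, a))
    return reflexive and transitive and antisymmetric

def immediate_successors(P, leq, a):
    """All b with a < b such that no c lies strictly between a and b."""
    above = [b for b in P if leq(a, b) and a != b]
    return {
        b for b in above
        if not any(c != a and c != b and leq(a, c) and leq(c, b) for c in P)
    }

# Hypothetical example: divisors of 12, with a <= b meaning "a divides b".
P = {1, 2, 3, 4, 6, 12}

def divides(a, b):
    return b % a == 0

# The lines drawn in the Hasse diagram, as (lower, upper) pairs.
hasse_edges = {(a, b) for a in P for b in immediate_successors(P, divides, a)}
```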
Equivalence relations An equivalence relation, ≡, on a set X is a binary
relation which satisfies:
for all a ∈ X, a ≡ a (≡ is reflexive);
for all a, b ∈ X, a ≡ b implies b ≡ a (≡ is symmetric);
for all a, b, c ∈ X, a ≡ b and b ≡ c implies a ≡ c (≡ is transitive).
The (≡-)equivalence class of an element a ∈ X is denoted [a]≡ , a/ ≡ or
similar, and is {b ∈ X : b ≡ a}. The key point is that equivalence classes are
equal or disjoint: if a, b ∈ X then either [a] = [b] or [a] ∩ [b] = ∅. Thus the
distinct ≡-equivalence classes partition X into disjoint subsets.
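The partition property is easy to observe on a finite set. The sketch below computes the equivalence classes straight from the definition [a] = {b ∈ X : b ≡ a}; the example relation (congruence modulo 3 on {0, ..., 9}) and the function name are assumptions made for illustration.

```python
def equivalence_classes(X, equiv):
    """List of distinct classes [a] = {b in X : b equiv a}."""
    classes = []
    for a in X:
        cls = frozenset(b for b in X if equiv(b, a))
        if cls not in classes:
            classes.append(cls)
    return classes

# Hypothetical example: congruence modulo 3.
X = set(range(10))

def same_residue(a, b):
    return (a - b) % 3 == 0

classes = equivalence_classes(X, same_residue)
```

The classes obtained are pairwise disjoint and their union is X, so their sizes add up to the size of X.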
3.6 Definable Sets
If φ is a formula of a predicate language L and φ has just the one free variable, x say
(in which case we write φ(x) to show the free variable explicitly) then we can
look at the “solution set” of φ in any particular L-structure M. This solution
set is written as φ(M) and it’s a subset of the underlying set M of M, being the
set of all elements a ∈ M such that, if each free occurrence of x in φ is replaced
by a, then the result (a “formula with parameter a”) is true in M.
That is: φ(M) = {a ∈ M : φ(a) is true}, where φ(a) means the expression
we get when we substitute each free occurrence of x by a.
I will give examples in lectures but it’s something that you’ll already have
seen in less formal mathematical contexts, as is illustrated by the following
examples.
Suppose that our structure is the real line R with its usual arithmetic (+,
×, 0, 1) and order (≤) structure (I’ll use the same notation, R, for the structure
and for the underlying set.). Take the formula φ, or φ(x), to be 0 ≤ x ≤ 1.
Then the solution set φ(R) = {a ∈ R : 0 ≤ a ≤ 1} - the closed interval with
endpoints 0 and 1.
Suppose, again with the reals R as the structure, that our formula, with free
variable x, let’s call it ψ this time, is x × x = 1 + 1. Then the solution set
ψ(R) = {a ∈ R : a × a = 1 + 1} = {a ∈ R : a² = 2} = {−√2, √2}.
For yet another example, again using the reals, take the formula, say θ, with
free variable x (so we can write θ(x)) to be ∃y (y × y = x). Then the solution
set θ(R) = {a ∈ R : ∃b ∈ R b² = a} = R≥0 - the set of non-negative reals (since
these are exactly the elements which are the square of some real number).
(The solution set for a formula with more than one free variable can be defined in a similar, and probably obvious, way, but we’ll concentrate on examples
with one free variable.)
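Solution sets in the reals cannot be listed by a program, but the same two formulas can be evaluated in a finite structure, where φ(M) is computed by testing each element. The sketch below uses the integers modulo 7 with addition and multiplication modulo 7; this structure and the names `solution_set`, `plus` and `times` are assumptions chosen for illustration, and the formulas are given as Python predicates instead of being parsed.

```python
def solution_set(domain, phi):
    """phi(M) = {a in M : phi(a) is true in M}, with phi a Python predicate."""
    return {a for a in domain if phi(a)}

# Hypothetical finite structure: integers mod 7 with + and ×.
M = set(range(7))

def plus(a, b):
    return (a + b) % 7

def times(a, b):
    return (a * b) % 7

# theta(x): exists y (y × y = x), i.e. x is a square in this structure.
theta = solution_set(M, lambda a: any(times(b, b) == a for b in M))

# psi(x): x × x = 1 + 1, i.e. x is a square root of 2 in this structure.
psi = solution_set(M, lambda a: times(a, a) == plus(1, 1))
```

Here θ(M) is the set of squares modulo 7 and ψ(M) consists of the two square roots of 2 modulo 7, in parallel with R≥0 and {−√2, √2} for the reals.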