Modal Logic for Artificial Intelligence
Rosja Mastop
Abstract
These course notes were written for an introduction to modal logic for students in Cognitive Artificial Intelligence at Utrecht University. Earlier notes by Rosalie Iemhoff have been used both as a source and as an inspiration; the chapters on completeness and decidability are based on her course
notes. Thanks to Thomas Müller for suggesting the use of the Fitch-style proof system, which has
been adopted from Garson [7]. Thanks to Jeroen Goudsmit and Antje Rumberg for comments and
corrections. Further inspiration and examples have been drawn from a variety of sources, including
the course notes Intensional Logic by F. Veltman and D. de Jongh, Basic Concepts in Modal Logic by
E. Zalta, the textbook Modal Logic by P. Blackburn, M. de Rijke, and Y. Venema [2] and Modal Logic
for Open Minds by J. van Benthem [15].
These notes are meant to present the basic facts about modal logic and so to provide a common
ground for further study. The basics of propositional logic are merely briefly rehearsed here, so that the
notes are self-contained. They can be supplemented with more advanced text books, such as Dynamic
Epistemic Logic by H. van Ditmarsch, W. van der Hoek, and B. Kooi [16], with chapters from one of
the handbooks in logic, or with journal articles.
Contents

1 Propositional logic
  1.1 Language
  1.2 Truth values and truth tables
  1.3 Proof theory: natural deduction
  1.4 Exercises

2 Modal logic and artificial intelligence
  2.1 What is the role of formal logic in artificial intelligence?
  2.2 Modal logic: reasoning about necessity and possibility
  2.3 A brief history of modal logic
  2.4 Modal logic between propositional logic and first order logic

3 Basic Modal Logic I: Semantics
  3.1 The modal language
      Examples of sentences and arguments
      Duality
      The variety of modalities
  3.2 Kripke models and the semantics for the modal language
  3.3 Semantic validity
  3.4 Exercises

4 Characterizability and frame correspondence
  4.1 Characterizability and the modal language
      Different Kripke frames for different modalities
      Expressive power of the modal language
  4.2 Frame correspondence
  4.3 Bisimulation invariance
  4.4 The limits of characterizability: three methods
      Generated subframes
      Disjoint unions
      P-morphisms
  4.5 Exercises

5 Basic Modal Logic II: Proof theory
  5.1 Hilbert system
      Factual premises
  5.2 Natural deduction for modal logic
  5.3 Examples
  5.4 Soundness
  5.5 Adding extra rules or axioms: the diversity of modal logics
  5.6 Exercises

6 Completeness
  6.1 Exercises

7 Decidability
  7.1 Small models
  7.2 The finite model property
  7.3 Decidability
  7.4 Complexity

8 Tense Logic
  8.1 Basic tense logic
  8.2 Additional properties: seriality, transitivity, and linearity
  8.3 Varieties of linearity
  8.4 Time and modality: the Master Argument of Diodorus
  8.5 Aristotle on the sea battle
  8.6 Ockhamist semantics for modal tense logic
  8.7 Computation tree logic
What is formal logic?
Logic is concerned with the study of reasoning or, more specifically, the study of arguments. An argument is an act or process of drawing a conclusion from premises. We call an argument sound if the
premises are all true, and valid if the truth of the premises guarantees the truth of the conclusion. Note
that an argument can be valid without being sound: one or more premises may in fact be false, but if
they were true, then the conclusion would have also been true. Vice versa, an argument may be sound
without being valid: the premises may be true but the conclusion just doesn’t follow from them.
Formal logic is concerned with the study of validity of argument forms. For example, the argument
on the left is valid because of its form, whereas the one on the right is valid because of its content.
Left:
The door is closed or the window is open.
The window is not open.
Therefore, the door is closed.

Right:
The door is closed or the window is open.
The window is closed.
Therefore, the door is closed.
The argument on the right is valid, but only in virtue of the meanings of the words ‘open’ and ‘closed’,
which are such that a window cannot both be open and closed. The argument on the left, however, is
valid in virtue of its form. That is, any argument of the form
A or B
not A
(Therefore) B
is valid, regardless of the sentences we use in the place of A and B. The only items that need to be fixed
are ‘or’ and ‘not’ in this case. If we were to replace ‘not’ by ‘maybe’, then the argument would no longer be
valid. We call ‘or’ and ‘not’ logical constants. Together with ‘and’, ‘if . . . then’ and ‘if, and
only if’, they are the logical constants of propositional logic (see section 1).
A formal logic is a definition of valid argument forms, such as the one above. There are different
methods for doing so. Here we are concerned with two of them: the model-theoretic approach and the
proof-theoretic approach.
The model-theoretic approach
We first need to have a language. Propositional logic, predicate logic and modal logic all have different
languages. In all cases, what we have is a set L of sentences (or: closed formulas, or: well-formed
formulas). This is enough to explain what model theory and proof theory do. So below, we simply assume
that some language L is given.
Now we define what a model is. A model is intended to give a meaning to the symbols of the language
L. Specifically, it specifies for every sentence in the language whether it is true in the model or not. An
argument is valid if in every model in which all of the premises are true, the conclusion is also true.
Definition 1 (Validity, model-theoretic). Let a method T be given for evaluating formulas ϕ ∈ L as being
true or false in a model M, notation M |=T ϕ. The conclusion ψ ∈ L can be validly drawn from a set of
premises Φ ⊆ L, notation Φ |=T ψ if, and only if, in every model in which all of the premises in Φ are
true, the conclusion ψ is also true.
∀M : If M |=T ϕ for all ϕ ∈ Φ, then M |=T ψ
When there are no premises we simply write |=T ψ, meaning that the formula ψ is true in every model.
Such a formula is also called a general validity or tautology.
A formula is satisfiable if there is a model in which the formula is true. A formula is a tautology
if it is true in every model. A formula is contradictory if it is false in every model (hence, if it is not
satisfiable). If a formula is satisfiable and its negation is also satisfiable, then we call it contingent.
Two formulas are logically equivalent if they are true in exactly the same models. Note that if ϕ and
ψ are logically equivalent, then the formula ϕ ↔ ψ is a tautology.
From the model-theoretic standpoint, we can understand what logical constants are: in propositional
logic they are the ones that are entirely truth-functional. If we know what the truth value is of A and B,
then we know what the truth value is of ‘A or B’, ‘not A’, and so on.
Proof-theoretic approach
In proof theory, we try to find a fixed set of axioms and/or inference rules. Axioms are formulas that are
considered to be self-evidently true, for which no proof is required. They may be used at the beginning
of a proof. Inference rules tell us which steps we are allowed to make in a proof.
Valid argument forms, in the sense of proof theory, are those that make use only of the inference
rules. If we also only make use of axioms as the premises of our proof, then the conclusion of the proof
is just as self-evident as the axioms. Those conclusions we call theorems of the logic.
Definition 2 (Validity, proof-theoretic). Let a set of inference rules and axioms S be given. The conclusion ψ ∈ L can be validly drawn from premises Φ ⊆ L, notation Φ ⊢S ψ if, and only if, there is a proof
starting with only the premises Φ and the axioms in S and using only the inference rules in S, that leads
to the conclusion ψ.
We write ⊢S ψ if there is a valid inference to the conclusion ψ starting from no premises (not including
axioms).
Soundness and completeness
Given these two different ways of defining validity, we can also compare them. It would be rather odd
if an argument could be shown to be valid using one method but invalid using the other. If everything
that can be proven valid using inference is also valid model-theoretically, then we say that the inference
system is sound with respect to the model-theoretic interpretation. Vice versa, if everything that is valid
model-theoretically can be proven using deduction, then we call the inference system complete with
respect to the model-theoretic interpretation.
Put differently, soundness means that the inference system does not allow us too much, whereas
completeness means that it enables us to prove everything that is valid.
1 Propositional logic
The following is a brief summary of propositional logic, intended only as a reminder to those who have
taken a course in elementary logic.
1.1 Language
Propositional logic is the logic of propositional formulas. Propositional formulas are constructed from
a set var of elementary or ‘atomic’ propositional variables p, q, and so on, with the connectives ¬
(negation, ‘not’), ∧ (conjunction, ‘and’), ∨ (disjunction, ‘or’), → (implication, ‘if . . . then’), and ↔
(equivalence, ‘if and only if’). If ϕ and ψ are formulas, then so are ¬ϕ and ¬ψ, (ϕ ∧ ψ), (ϕ ∨ ψ), (ϕ → ψ),
and (ϕ ↔ ψ). So p is a formula, ¬(p ∧ q) and q ∨ (q ∧ ¬(r → ¬p)) are formulas, but pq is not a formula,
and neither are p ∧ q → and p¬ ∨ r. We add the simple symbol ⊥ which is called the falsum. We write
this definition in Backus-Naur Form (bnf) notation, as follows:
[Lprop]    ϕ ::= p | ⊥ | ¬ϕ | (ϕ ∧ ϕ) | (ϕ ∨ ϕ) | (ϕ → ϕ) | (ϕ ↔ ϕ)
This means that a formula ϕ can be an atom p, the falsum ⊥, or a complex expression of the other forms,
whereby its subexpressions themselves must be formulas. The language of propositional logic is called
Lprop .
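This grammar can also be written down directly in a program. The following is a small illustrative sketch in Python; the encoding of formulas as nested tuples and the names used are one possible choice, not something fixed by these notes.

# Illustrative sketch: Lprop formulas encoded as nested Python tuples.
# An atom is a string such as 'p'; the falsum is the one-element tuple ('bot',);
# a compound formula is a tuple whose first element names its main connective.

BINARY = {'and', 'or', 'imp', 'iff'}   # stand for the connectives ∧, ∨, →, ↔

def is_formula(x):
    """Check that x is built according to the BNF clauses for Lprop."""
    if isinstance(x, str):                                    # atomic proposition p, q, ...
        return True
    if x == ('bot',):                                         # the falsum
        return True
    if isinstance(x, tuple) and len(x) == 2 and x[0] == 'not':
        return is_formula(x[1])                               # negation
    if isinstance(x, tuple) and len(x) == 3 and x[0] in BINARY:
        return is_formula(x[1]) and is_formula(x[2])          # binary connectives
    return False

# The formulas ¬(p ∧ q) and q ∨ (q ∧ ¬(r → ¬p)) from the text, and a malformed expression:
print(is_formula(('not', ('and', 'p', 'q'))))                                       # True
print(is_formula(('or', 'q', ('and', 'q', ('not', ('imp', 'r', ('not', 'p')))))))   # True
print(is_formula(('and', 'p')))                                                     # False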
Brackets are important to ensure that formulas are unambiguous. The sentence p ∨ q ∧ r could be
understood to mean either (p ∨ q) ∧ r or p ∨ (q ∧ r), which are quite different insofar as their meaning is
concerned. We omit the outside brackets, so we do not write ((p ∨ q) ∧ r).
The symbols ϕ, ψ, χ, . . . are formula variables. So, if it is claimed that the formula ϕ ∨ ¬ϕ is a
tautology, it means that every propositional formula of that form is a tautology. This includes p ∨ ¬p,
(p → ¬q) ∨ ¬(p → ¬q) and any other such formula. In a similar way we formulate axiom schemata and
inference rules by means of formula variables. If ϕ → (ψ → ϕ) is an axiom scheme, then every formula
of that form is an axiom, such as (p ∧ q) → (¬q → (p ∧ q)). And an inference rule that allows us to infer
ϕ ∨ ψ from ϕ allows us to infer p ∨ (q ↔ r) from p, and so on.
1.2 Truth values and truth tables
In propositional logic, the models are (truth) valuations. A truth valuation determines which truth value
the atomic propositions get. It is a function v : var → {0, 1}, where var is the non-empty set of atomic
propositional variables. A proposition is true if it has the value 1, and false if it has the value 0. The truth
value of falsum is always 0, so v(⊥) = 0 for every valuation v.
Whether a complex propositional formula is true in a given model (valuation) can be calculated by
means of truth functions, which take truth values as input and give truth values as output. To each logical
connective corresponds a truth function. They are defined as follows:
f¬(1) = 0    f¬(0) = 1
f∧(1, 1) = 1    f∧(1, 0) = 0    f∧(0, 1) = 0    f∧(0, 0) = 0
f∨(1, 1) = 1    f∨(1, 0) = 1    f∨(0, 1) = 1    f∨(0, 0) = 0
f→(1, 1) = 1    f→(1, 0) = 0    f→(0, 1) = 1    f→(0, 0) = 1
f↔(1, 1) = 1    f↔(1, 0) = 0    f↔(0, 1) = 0    f↔(0, 0) = 1
For instance, if v(p) = 1 and v(q) = 0, then the propositional formula (p → q) ∧ ¬q is false. To see that
this is the case, first, we calculate the truth value of p → q, which is f→ (v(p), v(q)) = f→ (1, 0) = 0, so
p → q is false. Second, we calculate the truth value of ¬q, which is f¬ (v(q)) = f¬ (0) = 1, so ¬q is true.
Finally, we combine the two obtained truth values and get f∧ (0, 1) = 0, so the formula (p → q) ∧ ¬q is
false. We can do the same thing in one step:
f∧ ( f→ (v(p), v(q)), f¬ (v(q))) = f∧ (0, 1) = 0.
Doing this for all of the four possible valuations, we get the following truth table:
p  q  |  p → q   ¬q   (p → q) ∧ ¬q
1  1  |    1      0         0
1  0  |    0      1         0
0  1  |    1      0         0
0  0  |    1      1         1
The second row corresponds to the valuation we considered above (v(p) = 1, v(q) = 0) and the truth value we calculated.
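The same computation can be carried out mechanically. The sketch below (in Python; the tuple encoding of formulas and the names TRUTH_FUNCTIONS and value are again only illustrative choices) reproduces the table for (p → q) ∧ ¬q.

# Illustrative sketch: the truth functions above, and the recursive computation
# of the truth value of a formula under a given valuation v.
from itertools import product

TRUTH_FUNCTIONS = {
    'not': lambda a: 1 - a,
    'and': lambda a, b: a * b,
    'or':  lambda a, b: max(a, b),
    'imp': lambda a, b: max(1 - a, b),
    'iff': lambda a, b: 1 if a == b else 0,
}

def value(formula, v):
    """Truth value (0 or 1) of a tuple-encoded formula under valuation v."""
    if isinstance(formula, str):
        return v[formula]                       # atomic proposition
    if formula == ('bot',):
        return 0                                # the falsum gets value 0 in every valuation
    connective, *parts = formula
    return TRUTH_FUNCTIONS[connective](*(value(part, v) for part in parts))

# The four rows of the truth table for (p → q) ∧ ¬q:
phi = ('and', ('imp', 'p', 'q'), ('not', 'q'))
for p, q in product([1, 0], repeat=2):
    print(p, q, value(phi, {'p': p, 'q': q}))   # prints 0, 0, 0, 1 down the rows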
By means of truth tables we can easily determine in which models complex formulas are true. The
model (truth valuation) is given in each row by the values to the left of the vertical bar. Below is the truth table for all the
simplest non-atomic formulas.
p  q  |  ⊥   ¬p   ¬q   p ∧ q   p ∨ q   p → q   p ↔ q
1  1  |  0    0    0     1       1       1       1
1  0  |  0    0    1     0       1       0       0
0  1  |  0    1    0     0       1       1       0
0  0  |  0    1    1     0       0       1       1
An argument is semantically valid if, and only if, the conclusion is true in every truth valuation in which
the premises are true. Using the truth table above, we can see that p ∨ q is a valid consequence of p.
Therefore, p |= p ∨ q.
A formula is a tautology if it is true in every valuation, and a contradiction if it is false in every
valuation. The formula p ∨ ¬p is a tautology, so |= p ∨ ¬p. No matter what the truth valuation assigns
to the proposition p, the outcome of the truth functional calculation is always 1. Similarly, p ∧ ¬p is a
contradiction.
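Semantic validity, tautology, and contradiction can all be tested by simply running through every valuation of the relevant propositional variables. A minimal sketch (again with tuple-encoded formulas; the helper names atoms and entails are illustrative, and the evaluator is repeated so that the sketch is self-contained):

# Illustrative sketch: checking Phi |= psi by enumerating all truth valuations.
from itertools import product

def value(f, v):
    """Truth value of a tuple-encoded formula f under valuation v."""
    if isinstance(f, str):
        return v[f]
    if f == ('bot',):
        return 0
    op = f[0]
    if op == 'not':
        return 1 - value(f[1], v)
    a, b = value(f[1], v), value(f[2], v)
    if op == 'and': return a * b
    if op == 'or':  return max(a, b)
    if op == 'imp': return max(1 - a, b)
    if op == 'iff': return 1 if a == b else 0
    raise ValueError(op)

def atoms(f):
    """The set of propositional variables occurring in f."""
    if isinstance(f, str):
        return {f}
    if f == ('bot',):
        return set()
    return set().union(*(atoms(x) for x in f[1:]))

def entails(premises, conclusion):
    """Phi |= psi: psi is true in every valuation in which all formulas in Phi are true."""
    props = sorted(set().union(*(atoms(f) for f in premises + [conclusion])))
    for bits in product([0, 1], repeat=len(props)):
        v = dict(zip(props, bits))
        if all(value(f, v) == 1 for f in premises) and value(conclusion, v) == 0:
            return False
    return True

print(entails(['p'], ('or', 'p', 'q')))            # True:  p |= p ∨ q
print(entails([], ('or', 'p', ('not', 'p'))))      # True:  |= p ∨ ¬p, a tautology
print(entails([('or', 'p', 'q')], 'p'))            # False: p ∨ q does not entail p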
1.3 Proof theory: natural deduction
One of the most familiar types of proof system is natural deduction. This system does not work with
axioms, but only with inference rules: two rules for each connective. The introduction rules regulate the
deduction of a formula with the connective, and the elimination rules govern the deduction of a formula
from a premise with that connective. As basic familiarity with natural deduction will be presupposed in
this course, the rules are merely restated below.
Assumptions are written above the horizontal line, and the inferences based on those assumptions are
written below that line. The vertical line indicates that the proof is still dependent on the assumptions. In
a proof that looks like the one below, the conclusion χ has been reached dependent on the assumptions
ϕ and ψ.
ϕ
ψ
..
.
χ
Assumptions can be withdrawn, but only in accordance with specific inference rules: only by means of
the rules for introducing →, ↔ or ¬, or the rule for eliminating ∨. So, for instance, when we prove an
implication ϕ → ψ (Intro →), we first assume ϕ and then, using that assumption, we prove ψ. If
we succeed in doing so, we can withdraw the assumption by concluding that if ϕ, then ψ. The vertical line
then ends just above the formula ϕ → ψ. The rules are stated below.
(In each rule, a subproof from the assumption ϕ to χ is a stretch of proof that starts by assuming ϕ and ends with χ; applying the rule withdraws that assumption.)

Intro ∧:    from ϕ and ψ, conclude ϕ ∧ ψ.
Elim ∧:     from ϕ ∧ ψ, conclude ϕ; from ϕ ∧ ψ, conclude ψ.
Intro ∨:    from ϕ, conclude ϕ ∨ ψ; from ϕ, conclude ψ ∨ ϕ.
Elim ∨:     from ϕ ∨ ψ, a subproof from the assumption ϕ to χ, and a subproof from the assumption ψ to χ, conclude χ.
Intro →:    from a subproof from the assumption ϕ to ψ, conclude ϕ → ψ.
Elim →:     from ϕ → ψ and ϕ, conclude ψ.
Intro ↔:    from a subproof from the assumption ϕ to ψ and a subproof from the assumption ψ to ϕ, conclude ϕ ↔ ψ.
Elim ↔:     from ϕ ↔ ψ, conclude ϕ → ψ; from ϕ ↔ ψ, conclude ψ → ϕ.
Intro ¬:    from a subproof from the assumption ϕ to ⊥, conclude ¬ϕ.
Elim ¬:     from ϕ and ¬ϕ, conclude ⊥.
Double ¬:   from ¬¬ϕ, conclude ϕ.
EFSQ:       from ⊥, conclude ϕ.
Some rules are very simple: if you can prove ϕ and you can prove ψ, then you can also prove their
conjunction ϕ ∧ ψ. Other rules are more complicated. For example, the only way to ‘eliminate’ the
disjunction ϕ ∨ ψ is by proving, first that ϕ ∨ ψ, and second, that some conclusion χ can be proven both
from ϕ alone and from ψ alone.
Conjunction, disjunction, and equivalence are commutative, which means that the order of the subformulas is irrelevant: ϕ ∨ ψ is logically equivalent to ψ ∨ ϕ, and likewise for ϕ ∧ ψ and ϕ ↔ ψ.
Apart from the introduction and elimination rules for each connective, there are two special rules.
The double negation rule says that two negations cancel each other out. The name ‘EFSQ’ is an acronym
for Ex Falso Sequitur Quodlibet, which means literally “From the False follows whatever”. Note that
this is also semantically valid: in every valuation in which ⊥ is true (namely: none), any other formula
is also true.
Lastly, we are allowed to reiterate earlier steps, for ease of exposition of the proof, but only provided
that no assumptions were withdrawn in between. So the reiteration on the left is correct, and the one on
the right is incorrect.
[Diagram: two schematic proofs. Left (allowed): ϕ is reiterated while the subproof in which it occurs is still open. Right (not allowed): ϕ is reiterated after the subproof containing it has been closed.]
Below are two examples of natural deductions. In the first, we prove the validity of the argument
p → q ⊢ ¬q → ¬p. First, we introduce the premise, and then we prove the conclusion. To do so,
we assume ¬q and prove ¬p, after which we can use the introduction rule for implication to draw the
conclusion. The second is the proof for p ∨ q, ¬p ⊢ q. Since one of the premises is a disjunction, we
need to use the rule for elimination of disjunction to prove the conclusion.

Proof of p → q ⊢ ¬q → ¬p:
1   p → q        Assumption
2   ¬q           Assumption
3   p            Assumption
4   q            Elim → 1,3
5   ¬q           Iteration 2
6   ⊥            Elim ¬ 4,5
7   ¬p           Intro ¬ 3,6
8   ¬q → ¬p      Intro → 2,7

Proof of p ∨ q, ¬p ⊢ q:
1   p ∨ q        Assumption
2   ¬p           Assumption
3   p            Assumption
4   ⊥            Elim ¬ 2,3
5   q            EFSQ 4
6   q            Assumption
7   q            Iteration 6
8   q            Elim ∨ 1,5,7
Below are, first, a proof of p ∨ ¬p, which crucially involves the double negation rule and, second,
a proof of one of the ‘distribution laws’, (p ∨ q) ∧ r ⊢ (p ∧ r) ∨ (q ∧ r). It can easily be checked
that they are also semantically valid by checking the truth tables.

Proof of p ∨ ¬p:
1   ¬(p ∨ ¬p)       Assumption
2   p               Assumption
3   p ∨ ¬p          Intro ∨ 2
4   ⊥               Elim ¬ 1,3
5   ¬p              Intro ¬ 2,4
6   p ∨ ¬p          Intro ∨ 5
7   ⊥               Elim ¬ 1,6
8   ¬¬(p ∨ ¬p)      Intro ¬ 1,7
9   p ∨ ¬p          Double ¬ 8

Proof of (p ∨ q) ∧ r ⊢ (p ∧ r) ∨ (q ∧ r):
1    (p ∨ q) ∧ r             Assumption
2    p ∨ q                   Elim ∧ 1
3    r                       Elim ∧ 1
4    p                       Assumption
5    p ∧ r                   Intro ∧ 3,4
6    (p ∧ r) ∨ (q ∧ r)       Intro ∨ 5
7    q                       Assumption
8    q ∧ r                   Intro ∧ 3,7
9    (p ∧ r) ∨ (q ∧ r)       Intro ∨ 8
10   (p ∧ r) ∨ (q ∧ r)       Elim ∨ 2,6,9
1.4 Exercises
1. Explain why, if ϕ is a tautology, ¬ϕ is a contradiction. Explain why every formula that is not a
contradiction is satisfiable.
2. Write down the truth table for (a) ¬(p → (q ∨ p)), (b) (p → q) ∨ (q → p), (c) ¬(p ↔ (q ∧ ¬r)).
3. Show by means of truth tables that (a) |= p → (q → p), (b) (p ∧ r) ∨ q |= p ∨ q, (c) |= ¬(p → q) ↔
(p ∧ ¬q).
4. Give a natural deduction proof for (a) ϕ → ϕ, (b) ϕ → (ψ → ϕ), (c) (ϕ → (ψ → χ)) → ((ϕ →
ψ) → (ϕ → χ)).
5. Write down a natural deduction proof for (a) ⊢ ¬p → (p → q), (b) p ∨ q ⊢ (p → q) → q, (c)
p → ¬p, ¬p → p ⊢ ⊥.
6. In this exercise you are asked to prove the Deduction Theorem.
(a) Prove that if Γ, ϕ ⊢ ψ, then Γ ⊢ ϕ → ψ.
(b) Prove that if Γ ⊢ ϕ → ψ, then Γ, ϕ ⊢ ψ.
7. Write down a formula that is logically equivalent to (a) ϕ ∧ ψ, (b) ϕ → ψ, (c) ϕ ↔ ψ, using only
the connectives ¬ and ∨ and the formulas ϕ and ψ. How can we write the formula ¬ϕ using only
→ and ⊥?
As this last exercise already points out, we can define the logical language using only ¬ and ∨, treating
the other connectives as defined abbreviations. Note that we can also do this with ¬ and ∧, or with ¬
and → (though not with ¬ and ↔).
2 Modal logic and artificial intelligence
2.1 What is the role of formal logic in artificial intelligence?
Formal logic has had a central place in the study of artificial intelligence since its first conception in the
1950s. Why is this so? Logic has not been made formal just to make it suitable for computers and robots.
Formal logic is also a tool for humans. Just as we could write, and execute, our grocery list and daily
schedules in Perl or Haskell, so we could make our daily inferences by means of deduction in formal
logic. Moreover, formal logic wasn’t even invented for machines, but to regiment and explicate human
scientific reasoning.
Aristotle’s logic of syllogisms was intended to regiment, amongst other things, the categorisation of individuals
into species and genera. Frege created predicate logic in an attempt to regiment mathematical proof, and
thereby to provide an ultimate justification of mathematical knowledge on the basis of principles of
pure reasoning, themselves defined in a mathematically precise way. Modal logic grew out of several
endeavours to regiment reasoning about possible situations: utopias and ideals, hypothetical scenarios,
the unknown future, responsibilities and ‘what might have happened’, that which is possible given our
limited knowledge of the facts, and so on. None of these formal logics were invented with artificial
agents in mind.
Formal logic for artificial intelligence can still be understood in two ways: the formal logic can
be implemented as a capacity for automated reasoning by an intelligent agent, and it can be used by
a designer in developing intelligent agents that meet certain criteria. In the latter case, formal logic is
used, again, as an instrument to regiment our own reasoning: about what the agent should choose to do in
certain situations, about what uncertainty the agent might have to cope with, about possible malfunctions
we need to consider, and so on.
The first role is in knowledge representation. Artificial agents can use formal languages for representing the knowledge they gather of the environment in which they have to act intelligently. Because the
knowledge is expressed in formal language, they can use formal methods for drawing inferences from
the knowledge. The agent may know that lightning is always followed by thunder, and know that there
was lightning just now. But in order to know that thunder will follow soon, it has to draw an inference
from its knowledge. Logic is then an instrument of the agent, in the form of a calculus or algorithm to
extend its explicit knowledge with further, inferred knowledge.
The second role of logic in artificial intelligence is agent specification. In this case, the formal
language is used to characterize the agent and its (desired or actual) intelligent behaviour. In designing
an agent, for instance, we want to make sure that it meets certain criteria of behaviour. If it knows that
it is raining and it does not want to get wet, then it should not go outside without an umbrella. Or, if
it knows that some other agent knows that it is raining, then the agent itself knows that it is raining.
Writing down these criteria in a formal language allows us to compute the consequences of those criteria.
For instance, if we design the agent using this set of criteria, will it always stay out of the rain? Will it
not bump into the walls? Will it shut down if it malfunctions?
In order to fulfill the second of these tasks, a formal language needs to be rich enough to express
many things having to do with intelligent behaviour: we need to say things about what the agent knows
or believes, what the agent intends to do, what it is allowed to do, what it can do, what it has done and
will eventually do. As you can see from the above description, it is modal logic that is most suited for
expressing such criteria and for drawing inferences from such expressions. So what is modal logic more
precisely?
2.2 Modal logic: reasoning about necessity and possibility
A modality is a ‘mode of truth’ of a proposition: about when that proposition is true, or in what way,
or under which circumstances it would, could, or might have been true. Modal logic, accordingly, is
the study of reasoning about modalities, inferring from modal premisses that some modal conclusion is
valid. The following examples illustrate what we mean by this.
Imagine that you’re holding a pen in your hand and you release your grip on it. What will the pen
do? Presumably, the answer will be ‘It will fall down until it hits the ground and then it will be at rest.’
Yet, many will admit that this is no mere accident. The pen will be subject to the gravitational pull from
the earth, making its fall inevitable: it has to fall, it cannot happen otherwise. Here we speak of alethic
modality: distinguishing between what is necessary, possible or accidental, and impossible.
Likewise, imagine that you buy a pack of cookies. When you are outside the store, you open the pack
and take a cookie. Now, most likely no one is going to complain that you do so. You are allowed to eat a
cookie outside of the store. However, before you bought the pack, when you were still inside the store,
opening the pack and eating one of the cookies would not have been allowed, but prohibited. You may
not eat the cookie under those circumstances—legally, you cannot do so. This is called deontic modality:
that something is obligatory, permissible, or prohibited.
Third, consider the situation that you come home after attending a lecture and you see that a window
is broken, the door is open, things from the cupboard are spread across the floor and the tv and stereo
are missing. You infer that you must have been burglarized. If a police officer asks if there is anything
else that might have happened, you should rightly respond ‘no.’ There is no other possibility than that
a burglar broke in to your home and stole your belongings, that must be what happened, it cannot be
otherwise. These are the epistemic modalities: that something is certain (known, verified), undecided
(consistent with what is known), or excluded (known to be false, falsified).
Further examples can be offered, including power, time, belief, and more.
In all these examples, we are confronted with something that is not merely true, but necessarily so:
necessary in virtue of the (physical) nature of things, necessary in virtue of property law, or necessary
in virtue of the evidence. If we think of the language of propositional logic, then what we add to this
language is two operators, 2 (‘box’) and 3 (‘diamond’). Given some arbitrary formula ϕ, the modal
formula 2ϕ reads: “It is necessarily the case that ϕ”. The formula 3ϕ reads: “It is possibly the case that
ϕ”. So, if p is the proposition that the pen falls to the ground, then 2p says that necessarily, the pen falls
to the ground. Similarly, if q is the proposition that you own the pack of cookies and r the proposition
that you eat one of the cookies from the pack, then q → 3r says that, if you own the pack, it is possible
(i.e., allowed) that you eat one of the cookies.
Modal logic is concerned with reasoning about necessities and possibilities such as these. For example, it can be used to reason about what is permissible under precise circumstances given the entire penal
code: if we can rewrite the penal code in a formal language saying what is permissible and what is not
permissible, and we also write down the precise details of the circumstances, then using modal logic we
can draw conclusions about what we are allowed to do (what the penal code allows us to do) under the
circumstances.
Consider, for example, the following inference:
2(This pen falls)
(Therefore) This pen falls
This says that if the pen has to fall, then in fact it does fall.
The next example is more complicated:
2(Door open → (Forgot to close ∨ Burglarized))
2(Window broken → (Football accident ∨ Burglarized))
Window broken ∧ Door open
¬3(Forgot to close ∧ Football accident)
(Therefore) 2Burglarized
This argument expresses the formal structure of the following reasoning: I know that, if the door is open,
then either I forgot to close it, or I was burglarized. Secondly, I also know that, if the window is broken,
then either there was a football accident, or I was burglarized. Now, as a matter of fact, the window is
broken and the door is open. But, it is inconceivable that I both forgot to close the door and there was a
football accident. Therefore, I know that I was burglarized.
Modal logic is concerned with the task of determining whether such an argument is valid, or not.
It is not concerned with the question whether the premisses are in fact true. Questions such as ‘how
do you know that something is necessary?’ and ‘Is this action really permissible?’ are not addressed.
Rather, an argument with such premisses is valid only if the premisses, if true, guarantee that
the conclusion is true as well. When you have completed this course, you will be able to prove that the
second argument above is indeed valid, or rather, that every argument of the same logical form is valid.
Since there are various kinds of modality, there are also various kinds of modal logic:
logic:                about which modalities:
alethic logic         necessary / possible / actually
tense logic           always / sometimes / never / until / eventually
epistemic logic       certainly / perhaps / ‘it is known that’
deontic logic         obligatory / permissible / forbidden
dynamic logic         makes sure / allows / enables / avoids
provability logic     provable / consistent
...                   ...

2.3 A brief history of modal logic
The study of modal logic dates back to the very beginning of logic itself, in the work of Aristotle.
Although most of Aristotle’s logic is concerned with “categorical” statements such as “All horses are
four-legged” and “Some houses are not wooden”, he also considered the logical relationships between
possibility and necessity. In his Prior Analytics he discusses a logical relation between those two concepts, that has come to be called the “duality” of possibility and necessity:
If it is necessary that A is true, then it is not possible that A is not true.
If it is possible that A is true, then it is not necessary that A is not true.
Using the notation above, with 2 for necessity and 3 for possibility, we can state these things as follows:
D1   2A ↔ ¬3¬A.
D2   3A ↔ ¬2¬A.
This duality is similar to the duality of ∀ and ∃ in predicate logic: ∃xPx ↔ ¬∀x¬Px.
When modern formal logic was invented at the end of the nineteenth century, there was not much
attention paid to ‘necessity’ and ‘possibility’ at first. Many authors at that time thought it was senseless to
talk about anything more than what is actually true or false and so they were generally skeptical of the
very idea of ‘necessity’, apart from ‘logical necessity’ (i.e., tautology).
. . . there is no one fundamental logical notion of necessity, nor consequently of possibility.
If this conclusion is valid, the subject of modality ought to be banished from logic, since
propositions are simply true or false . . . (B. Russell [13])
A necessity for one thing to happen because another has happened does not exist. There is
only logical necessity. (L. Wittgenstein [17], 6.37)
What makes some formula a tautology could be understood by pointing out that it can be deduced
from axioms, which are the most fundamental principles of reasoning and so immediately grasped to be
tautological themselves. (Wittgenstein, who was displeased with the concept of an axiom, proposed an
alternative way of thinking about tautologies, suggesting that each tautology on its own can be grasped
to be true independently of the way the world is.) Accordingly, there was no need to express in a formal
language that something is necessary.
This attitude started to change after C.I. Lewis published his Survey of Symbolic Logic in 1918, in
which he discussed several proof systems for reasoning about possibility and necessity. Specifically, he
was displeased with the way the connectives from propositional logic were analyzed as truth functions.
He was of the opinion that we need to distinguish two meanings of disjunctions, implications, and so on:
an extensional meaning and an intensional meaning.
The extensional implication is what is also known as the ‘material implication’. It is the familiar
truth functional connective of propositional logic. The intensional implication is also called ‘strict implication’. The statement that A strictly implies B means that B logically follows from A.
MI   Material implication: A → B. Means that either A is false, or B is true.
SI   Strict implication: A J B. Means that we can validly infer B from A.
In propositional logic we cannot express that something is a tautology, or that some proposition is a
logical consequence of some other proposition. Thus, Lewis proceeded to develop different logics for
strict implication. The five logical systems he came up with have been named S1 to S5. These logics are
all defined by means of axioms. They are proof-theoretic descriptions.
The model-theoretic approach to modal logic came later. It followed not long after the development
of model theory. The first attempt was made by R. Carnap [4, 5]. He introduced models consisting of
sets of state descriptions, to which the truth value of propositions (or rather, sentences) is relativized. So
in one state description it is true that unicorns exist, but in another state description that proposition is
false. And if propositions A and B are true in state description S , their conjunction A ∧ B is also true
there.
Now a model-theoretic account of the meaning of necessity statements could be given. A sentence
of the form “It is necessary that A” is true if, and only if, it is true in all state descriptions. Using the 2
for “It is necessary that”, we can spell this out precisely. A state description S is simply a set of atomic
propositional variables p, q, r, and so on.
p is true in S          if, and only if,   p is a member of S
¬A is true in S         if, and only if,   A is not true in S
A ∧ B is true in S      if, and only if,   A is true in S and B is true in S
2A is true in S         if, and only if,   for all state descriptions T, A is true in T
Given Carnap’s semantics for necessity, there is no real point in saying things like 22A or 23A. If A is
true everywhere, then it is necessary in S , but it is also necessary in T . In short, if A is true everywhere,
it is immediately also necessary everywhere. Thus, if 2A is true in S , then 2A is true everywhere; so
22A is true in S ; so 22A is true everywhere, and so on. The same point can be made for 3.
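The collapse can be seen by writing Carnap’s clauses out as a small program. In the illustrative Python sketch below (the space of state descriptions is a made-up example, and the names are not fixed by the notes), 2A and 22A always receive the same truth value.

# Illustrative sketch of Carnap-style evaluation: a state description is a set of
# atomic propositions, and 'box' quantifies over all state descriptions in the space.
def true_in(formula, S, space):
    if isinstance(formula, str):
        return formula in S                     # p is true in S iff p is a member of S
    op = formula[0]
    if op == 'not':
        return not true_in(formula[1], S, space)
    if op == 'and':
        return true_in(formula[1], S, space) and true_in(formula[2], S, space)
    if op == 'box':                             # true iff true in every state description
        return all(true_in(formula[1], T, space) for T in space)
    raise ValueError(op)

space = [set(), {'p'}, {'q'}, {'p', 'q'}]       # a made-up space of state descriptions
A = ('and', 'p', 'q')
for S in space:
    # 2A and 22A get the same truth value at every state description:
    print(true_in(('box', A), S, space) == true_in(('box', ('box', A)), S, space))   # True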
Incidentally, Carnap’s model-theoretic interpretation of “Necessarily” leads to the same logic as Lewis’
S5, if we define A J B as 2(A → B).
The next important development in modal logic came from A.N. Prior [11]. His major concern was
with the logic of time. A statement about the future, for instance, can be thought of as a statement about
all moments in the future. These moments are a bit like Carnap’s state descriptions. In fact, as Prior
suggested, we can think of the ‘time line’ as the set of all integers ω, ordered by the relation ‘smaller
than’ <. Now we can give an interpretation of the statement ‘always in the future A will be true’.
If we use the symbol G as the operator “It is always going to be the case that . . . ”, we can define the
meaning of GA as follows:
GA is true at moment m if, and only if, for all moments m′ > m, A is true at m′.
Now it was also possible to define the same thing for the past, by modifying the definition. Let HA mean
that “It has always been the case that A”. Then it can be evaluated in the same model with the definition
given here:
HA is true at moment m if, and only if, for all moments m′ < m, A is true at m′.
Thus, with a simple change in the ordering, we can choose to say something either about the future or
about the past. In different words, we use the ordering relation to restrict the moments we are talking
about: not all moments in the domain ω, but only those that come later (or earlier) than m.
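As an illustration, Prior’s clauses for G and H can be evaluated over a small, made-up stretch of moments ordered by <. The Python sketch below is only meant to show how the ordering restricts the moments that are quantified over; the finite set of moments and the atomic fact used are invented for the example.

# Illustrative sketch: G ("it is always going to be the case that") and
# H ("it has always been the case that") over a finite stretch of moments.
moments = range(-3, 4)                                        # a made-up, finite piece of the time line
facts = {m: {'rain'} if m < 0 else set() for m in moments}    # it rained at the moments before 0

def true_at(formula, m):
    if isinstance(formula, str):
        return formula in facts[m]
    op, sub = formula
    if op == 'G':                               # for all moments m' > m, sub is true at m'
        return all(true_at(sub, n) for n in moments if n > m)
    if op == 'H':                               # for all moments m' < m, sub is true at m'
        return all(true_at(sub, n) for n in moments if n < m)
    if op == 'not':
        return not true_at(sub, m)
    raise ValueError(op)

print(true_at(('H', 'rain'), 0))                # True: it has always rained before moment 0
print(true_at(('G', ('not', 'rain')), 0))       # True: it is never going to rain after moment 0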
This was a major innovation compared to Carnap’s models, in which we could only say that something is
necessary in the sense that it is true in absolutely all state descriptions. Now, using Prior’s idea, we could
begin to make sense of, for instance, the difference between 2A and 22A or the difference between
3A and 23A. Such differences are especially important if we think about the modalities of knowledge,
action, and time. The examples below illustrate the meaningfulness of such statements with multiple
modalities.
(1) It is not certain that I have enough money for a pizza, but it is certain that this is not certain.
(2) It is possible that I open the door; necessarily, if the door is open, then it is possible that I leave the room. Therefore, it is possible that it is possible that I leave the room.
(3) If it is true now that it rains, then it will always be true in the future that it has been true in the past that it rained.
The meaning of such statements could be understood by restricting the state descriptions, moments, or
possibilities that are relevant to evaluate them. In other words, we need to think of these modalities, not
as absolute, i.e. for all possibilities, but as relative, i.e. limited to what is possible now, possible for us,
or possible in a given situation. Just like in Prior’s case, where we look at only those times that are in
the future of our time, so we need to look at only those situations that are achievable for me by acting,
and we need to look at only those situations that are in agreement with what I know given my actual
evidence.
Who was the first to come up with the idea of relativized possibility is not quite clear. But it was the
formulation by S. Kripke [8] that has come to be the standard way of giving a semantics for modal logic.
Kripke’s models are built out of three things:
1. Possible worlds. These are similar to the state descriptions of Carnap, but Kripke used the metaphysical terminology of seventeenth century philosopher G.W. Leibniz, who argued that the world
God created is the best of all possible worlds, and who proposed that necessary truths are “eternal
truths. Not only will they hold as long as the world exists, but also they would have held if God
had created the world according to a different plan” (Leibniz, as quoted by Mates [10] p.107).
2. Accessibility relation. Not all possible worlds are ‘accessible’ from a given possible world w. A
sentence of the form “Possibly, A” is true in w only if there is a possible world where A is true,
that is accessible from w. Similarly, a sentence of the form “Necessarily, A” is true in w only if A
is true in all accessible possible worlds.
3. Valuation. The valuation determines for atomic propositions whether they are true or false at
a possible world. So, the valuation determines for each possible world, which of the atomic
propositional variables are true in that world and which ones are not.
These ‘Kripke models’, as they have come to be called, made it possible to understand better the meaning
of the modal axioms that had been proposed and were being debated. It also led to comparisons of
different modal logics and comparisons between modal logic and first order logic.
The invention of Kripke models led to a systematic method for studying all of these kinds of modality
in a mathematically similar way. If we want a Kripke model for the logic of epistemic modality (certainty,
knowledge), then we take the possible worlds to be different ways the world might be, and such a world
w will be accessible from a world v if, and only if, the world might be like w on the basis of the evidence
you have in v. To obtain a Kripke model for the logic of time the possible worlds become the moments
in time. Then, a world w will be future-accessible from a world v if w is in the future of v—and past-accessible if the ordering is the other way around. We can begin to reason about combinations of time,
knowledge and action by combining the domains of those Kripke models (e.g., ways a moment in the
future might be), and combining the accessibility relations (e.g., distinguishing what I know now about
the present from what I knew in the past about the present).
The development of modal logic after this point has been rapid and very diverse. Logicians realized
that Kripke models are, from a mathematical point of view, nothing other than labelled graphs. We can
think of Kripke models as interpretations of a modal language but, vice versa, we can also think of
modal languages as tools to talk about (labelled) graphs. Modal logic then becomes an instrument for
describing properties of graphs, and for proving that a graph has certain properties. We could also use
first order logic for this purpose, but for various reasons modal logic has been a popular alternative to
first order logic.
2.4 Modal logic between propositional logic and first order logic
We could very easily construct a formal language for reasoning about necessity by means of first order
logic. To say that it is necessary that the pen falls to the ground after it is released, we could write
that “for all x, if x is a possibility given the laws of physics, and that possibility is such that the pen is
released, then that possibility is also such that the pen falls to the ground.” Or, in a more formal notation:
∀x(Physically possible(x) ∧ Released(pen, x) → Falls(pen, x)).
Instead of this, modal logic has only the box and the diamond, and no variables. We do not write
“3xpx” for “There is a possibility x, and x is such that p is true” but we write “3p”, saying that “There
is a possibility that p is true”. This second, modal logic way of expressing possibility and necessity is
more limited than the first order logical way that was suggested above.
For at least two reasons this is not the way modal reasoning has been formalized.
The first reason for preferring a ‘Box’ over a predicate logical approach to modality is that the first
order logical approach presupposes a bird’s eye point of view on the domain of quantification—as in the
above example we are quantifying over a domain of physical possibilities x. This makes sense when
we think of a domain of objects, for instance the books in one’s bookcase. If I say “everything is a
paperback” then we can easily think of this as saying that for all books b in the domain of books in my
bookcase, it is true that b is a paperback. However, when we are concerned with modal concepts this
bird’s eye view on the domain does not make equally good sense.
For one example, consider the logic of time. Here, a predicate logical approach to the temporal operator ‘always’ presupposes that we can refer to all moments in time, so ‘Time’ has to be thought of as
a big domain of moments about which we can make meaningful statements. And if all future moments indeed ‘exist’ already, does that also imply that there are facts about those future moments, statements
about the future that are already true now? This issue is an ongoing controversy in philosophy.
A similar point could be made about possibility. If we treat the statement “necessarily, all objects have
a mass” as a statement about a domain of all possibilities, then it would seem that we are presupposing
the ‘existence’ of possibilities. But what does it mean for a possibility to exist, or to be real? Are
possibilities, such as the possibility for me to stop writing after this sentence, real? Perhaps they are, or
were at some time. Again here, philosophers argue about this matter of modal realism.
These philosophical concerns are somewhat different once a model-theoretic perspective on modal
logic is accepted. In first order logic we evaluate sentences relative to a model, which includes a domain
of entities. Similarly, in modal logic we evaluate sentences relative to a model, which has a domain of
possibilities. Still, the important difference is that in models for modal logic we also want to single out
the actual situation, the present time, the real world, and so on: one possibility that is special in the sense
that it is ‘our’ possibility, the real one. Looking at a domain as a totality of possibilities, it is not so clear
what makes any of these ‘special’. So even if we adopt a model-theoretic perspective on modal logic,
it is still different from first order logic because we need to adopt a perspective in that model, select our
own possibility, in order to determine what is true there.
The second reason for dispreferring a predicate logical language is a more technical one. A major
disadvantage of predicate logic is the fact that it is undecidable. This means that it is impossible to
determine for every argument whether it is logically valid or not. Basic modal logic has the advantage
that it is decidable. Many variants of modal logic are in the computational complexity class PSPACE.
Languages with modalities (i.e., box and diamond) are less expressive than a language with universal
and existential quantifiers. But from a computational point of view this often means that they are more
‘manageable’ to work with.
Modal logics provide, as we will see, a natural way of reasoning about relational structures or graphs.
This is what many computer scientists look for in a logic. For this reason, it is often discovered that some
logic invented in a particular area in computer science turns out to be a kind of modal logic. So-called
Description Logic is a good example of this. Once it is recognized that this logic is a modal logic, all the
meta-logical techniques that have been developed for modal logic can be used to study the properties of
Description Logic. So, as a matter of fact, modal logic is also a useful instrument in computer science.
Still, the distinction between propositional logic, modal logic, and first-order logic is not as
strict as all of this suggests. In particular, logicians have developed logical systems that are a hybrid of
modal and first-order logic—appropriately called hybrid logic. Thereby, the distinction between modal
and first order logic has become that of two ends of a scale, with many intermediate logics. Furthermore,
logicians have studied the combination of modal and first-order logic, mostly called modal predicate
logic. In such a logic we could express such things as “It is possible that there exists a unicorn” and
“There exists something for which it is possible that it is a unicorn”.
3 Basic Modal Logic I: Semantics
3.1 The modal language
Modal logic has been developed in order to analyse, and make precise, reasoning about different possibilities
and necessities. In predicate logic we cannot draw a distinction between “All men will die” and “All men
could die”, or between “Susan is certainly the best pilot” and “Susan might be the best pilot”, or “Marie
makes sure that the light is on” and “Marie allows for the light to be on”.
In the language of modal logic, these differences are expressed by means of the operators 2 and
3. The formula 2ϕ means “It is necessary that ϕ” or, in other words, “ϕ is the case in every possible
circumstance”. The formula 3ϕ means “It is possible that ϕ” or, in other words, “ϕ is the case in at least
one possible circumstance”.
We can add modal operators to propositional logic, and we can also add them to predicate logic.
In this course we restrict ourselves to propositional modal logic. The language of propositional modal
logic consists of everything from propositional logic plus the modal operators. So, we have a set var of
atomic propositions p, q, and so on, complex propositional formulas such as ¬p ∨ (q → r) and
¬((⊥ ∧ q) ↔ q), and complex modal formulas such as 2(p ∧ ¬q) and 3(q ↔ ¬q) → ¬22(p ∧ r). In a BNF
expression:
[Lmodal]    ϕ ::= p | ⊥ | ¬ϕ | ϕ ∧ ϕ | ϕ ∨ ϕ | ϕ → ϕ | ϕ ↔ ϕ | 2ϕ | 3ϕ
The expression > is also sometimes used as an abbreviation of ¬⊥, and so it means something that is
always true. In contrast to the falsum, > is also called the verum. This language is called Lmodal .
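In the illustrative tuple encoding used earlier, the modal language only requires two extra unary operators. The following minimal Python sketch (names are again illustrative) checks the example formula 3(q ↔ ¬q) → ¬22(p ∧ r).

# Illustrative sketch: the tuple encoding of formulas, extended with the modal operators.
# ('box', phi) stands for 2phi and ('dia', phi) for 3phi.
def is_modal_formula(x):
    if isinstance(x, str) or x == ('bot',):
        return True
    if isinstance(x, tuple) and len(x) == 2 and x[0] in {'not', 'box', 'dia'}:
        return is_modal_formula(x[1])
    if isinstance(x, tuple) and len(x) == 3 and x[0] in {'and', 'or', 'imp', 'iff'}:
        return is_modal_formula(x[1]) and is_modal_formula(x[2])
    return False

example = ('imp', ('dia', ('iff', 'q', ('not', 'q'))),
                  ('not', ('box', ('box', ('and', 'p', 'r')))))
print(is_modal_formula(example))                # True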
Examples of sentences and arguments
Considering the intuitive meaning of some modal formulas helps to get a better grasp of the things that
can be expressed using the modal language.
The formula 3(p ∧ q) says that it is possible that p and q are true together, whereas the formula
3p ∧ 3q says that p is possible and q is possible, but not that it is possible that they are true together.
2ϕ → ϕ states that, if ϕ is necessary, it is in fact also true. This seems to be correct: for example, if it
is necessary that the sun will implode at some point in the future, then it is true that the sun will implode
at some point in the future. Nevertheless, as we shall see, there are also uses of modal logic in which this
is not the case.
If it is necessary that I open the door before I leave the house, then 2(p → q) is true, if the propositional variable p stands for the proposition ‘I leave the house’ and q stands for the proposition ‘I open the
door’. Note that this formula is different in meaning from 2p → 2q, which says that if it is necessary
that I leave the house, then it is necessary that I open the door.
Another question to consider is whether 2(p ∨ ¬p) should be true. If something is logically valid, or
tautological, then it would seem reasonable that it is also necessarily true. How could there be a situation
in which p ∨ ¬p is not true? Vice versa, 3⊥ can never be true: it cannot be possible that a contradiction
is true. The falsum is false by definition, so there cannot be some possible situation in which it is true.
Duality
As mentioned above, an important insight of modal logic is the duality of possibility and necessity. That
is, the formula 2ϕ means the same as ¬3¬ϕ. If it is necessary that ϕ is true, whatever ϕ may be, then it
cannot be possible that ϕ is false. Therefore, it is then not possible that not ϕ, or ¬3¬ϕ. Similarly, 3ϕ
intuitively means the same as ¬2¬ϕ. If it is possible that ϕ is true, then it cannot also be necessary that
¬ϕ is true, hence necessary that ϕ is false. Given this duality of 2 and 3, possibility can be defined in
terms of necessity:
3ϕ is an abbreviation of ¬2¬ϕ.
By standard propositional logic we can infer 2¬ϕ ↔ ¬3ϕ from 3ϕ ↔ ¬2¬ϕ, and also 2ϕ ↔ ¬3¬ϕ.
17
The variety of modalities
Necessity and possibility are not the only modalities. The introduction of Kripke’s models showed that
the different modalities could all be understood in the same way: the content of the 2 and 3 is different,
but their logical form is the same.
logic              meaning of 2                     meaning of 3
alethic logic      necessarily                      possibly
tense logic        always in the future (past)      sometime in the future (past)
epistemic logic    certainly / known                perhaps
deontic logic      obligatorily                     permissibly
dynamic logic      decided / determined             undecided / left open
In all these logics, the 2 and 3 are considered dual. So, for instance, it is permitted to enter the zoo
if, and only if, it is not obligatory to not enter (stay out of) the zoo. And it is consistent with what is
known that the sun will rise tomorrow if, and only if, it is not known that the sun will not rise tomorrow.
There are some differences between the modalities. It is natural to think that, if it is known that
Bob is in front of the door, then Bob actually is in front of the door. We would not call it knowledge if it
weren’t so. On the other hand, we would not say that if it is obligatory that personal data are kept private,
it is true that personal data are kept private. So, within epistemic logic the formula 2ϕ → ϕ would be
considered logically valid, but in deontic logic that same formula would be considered invalid. More on
this in section 5.
3.2 Kripke models and the semantics for the modal language
For propositional logic all we needed as a model was the truth valuation. That won’t do for our modal
language: to evaluate whether something is necessary, or possible, we have to look beyond what is
actually true or false. To do so, we make use of Kripke models.
The concept of a Kripke model was introduced in the previous chapter. As was pointed out there,
these models consist of three items: (1) a set of possible worlds, (2) a relation that determines whether
one possible world is accessible from another, and (3) a valuation that determines whether an atomic
proposition is true or false in a given possible world. In some cases we want to consider the Kripke model
in abstraction from the valuation. For this reason we also define Kripke frames, which consist of only the
set of possible worlds and the accessibility relation.
Definition 3 (Kripke frame and Kripke model). A Kripke frame is a tuple F = hW, Ri, such that
- W is a non-empty set of possible worlds,
- R ⊆ (W × W) is a binary relation on W; if wRv we say that v is accessible from w.
A Kripke model is a tuple M = hW, R, Vi, such that
- hW, Ri is a Kripke frame (the frame underlying M, or the frame M is based on),
- V : W → Pow(var) is a valuation for the set of atomic propositions var; proposition p is true in
world w if p ∈ V(w), and false in w if p ∉ V(w).
When wRv, v is accessible from w. We also say that v is a successor of w.
The Kripke models (not the frames) are dependent on the choice of propositional variables. In practice we often omit an explicit definition of the variables and assume a countably infinite set of them.
We can represent Kripke frames by means of graphs. Here are three simple Kripke frames:

[Diagram: three simple Kripke frames drawn as directed graphs; the leftmost and rightmost ones are described in the text below.]
In the rightmost of these, the set of possible worlds is {w, v, u, t} and the accessibility relation is wRv,
vRv, uRt, and tRu. So, in world w the only accessible world is v, and for v the only accessible world is v
itself. The leftmost frame has the same set of possible worlds, but the relation is wRv, wRt, tRv, vRu and
uRu.
Kripke models can be represented by writing the valuation set V(w) next to world w in the graphs.
[Diagram: a Kripke model with worlds t (p, r), w (q, r), v (p), and u (no atoms true), and arrows tRw, tRv, tRu, and uRw.]
In this Kripke model, the frame consists of four possible worlds and the relation is tRw, tRv, tRu and
uRw. The valuation is such that V(t) = {p, r}, so p and r are true at t and q is false there. At u all atomic
propositions are false.
Modal formulas are evaluated in a Kripke model. For all formulas in the modal language and for all
possible worlds in all models, the truth definition determines whether that formula is true in that possible
world in that model. The connectives from propositional logic are evaluated in the same way. Given the
model represented above, the propositions q and r are both true in world w, and therefore also q ∧ r is
true in w. Proposition p is true in world v, and therefore formula p ∨ ¬q is also true in v. In world u, q is
false, and therefore q → r is true.
The relation R is only relevant when it comes to evaluating 2 and 3 formulas.
Given a model M = hW, R, Vi, the formula 2ϕ is true in possible world w if, and only if, ϕ
is true in every possible world that is accessible from w.
Given a model M = hW, R, Vi, the formula 3ϕ is true in possible world w if, and only if, ϕ
is true in some possible world that is accessible from w.
The following definition sums up the semantic evaluation of formulas in a model.
Definition 4 (Truth in a model). Let M = hW, R, Vi be a model, w a possible world in W, and var a set
of atomic propositional variables. The truth value of modal formulas is inductively defined relative to M
and w, in the manner given below.
M, w |= p        ⇔  p ∈ V(w)     (for propositions p ∈ var)
M, w 6|= ⊥           (⊥ is never true at any world)
M, w |= ¬ϕ       ⇔  it is not the case that M, w |= ϕ
M, w |= ϕ ∧ ψ    ⇔  M, w |= ϕ and M, w |= ψ
M, w |= ϕ ∨ ψ    ⇔  M, w |= ϕ or M, w |= ψ
M, w |= ϕ → ψ    ⇔  M, w |= ϕ implies that M, w |= ψ
M, w |= ϕ ↔ ψ    ⇔  M, w |= ϕ if, and only if, M, w |= ψ
M, w |= 2ϕ       ⇔  for all v : wRv implies that M, v |= ϕ
M, w |= 3ϕ       ⇔  there is a v such that wRv and M, v |= ϕ
Note that ⊥ is by definition false in every possible world. Therefore, ⊤ = ¬⊥ is true in every possible
world. In exercise 2 below, you are asked to determine the truth value of formulas such as 2⊥ and 3⊤.
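The truth definition can be read directly as a recursive procedure. The following is a minimal sketch in Python; the tuple encoding of formulas is simply one convenient choice, and the model is the four-world model discussed just before Definition 4 (t with p, r; w with q, r; v with p; u with nothing; relation tRw, tRv, tRu, uRw).

# Formulas: an atom is a string, 'bot' and 'top' are constants, and compound formulas are tuples:
# ('not', f), ('and', f, g), ('or', f, g), ('imp', f, g), ('iff', f, g), ('box', f), ('dia', f).
def holds(W, R, V, w, phi):
    """Return True iff M, w |= phi, following Definition 4."""
    if phi == 'bot':
        return False
    if phi == 'top':
        return True
    if isinstance(phi, str):                                    # atomic proposition
        return phi in V[w]
    op = phi[0]
    if op == 'not':
        return not holds(W, R, V, w, phi[1])
    if op == 'and':
        return holds(W, R, V, w, phi[1]) and holds(W, R, V, w, phi[2])
    if op == 'or':
        return holds(W, R, V, w, phi[1]) or holds(W, R, V, w, phi[2])
    if op == 'imp':
        return (not holds(W, R, V, w, phi[1])) or holds(W, R, V, w, phi[2])
    if op == 'iff':
        return holds(W, R, V, w, phi[1]) == holds(W, R, V, w, phi[2])
    if op == 'box':                                             # true in all accessible worlds
        return all(holds(W, R, V, v, phi[1]) for (x, v) in R if x == w)
    if op == 'dia':                                             # true in some accessible world
        return any(holds(W, R, V, v, phi[1]) for (x, v) in R if x == w)
    raise ValueError(phi)

# The model discussed just before Definition 4.
W = {'w', 'v', 'u', 't'}
R = {('t', 'w'), ('t', 'v'), ('t', 'u'), ('u', 'w')}
V = {'t': {'p', 'r'}, 'w': {'q', 'r'}, 'v': {'p'}, 'u': set()}
print(holds(W, R, V, 'w', ('and', 'q', 'r')))            # True: q ∧ r is true at w
print(holds(W, R, V, 'v', ('or', 'p', ('not', 'q'))))    # True: p ∨ ¬q is true at v
print(holds(W, R, V, 'u', ('imp', 'q', 'r')))            # True: q is false at u, so q → r holds there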
Example 1. Consider the Kripke model M represented below.
[Diagram: a Kripke model with worlds w (p), v (r), u (p), and t (p, q); the accessibility relation includes wRv, wRt, tRw, tRv, and uRu, and v has no accessible worlds.]
1. In u only one possible world is accessible, u itself, and p is true there. Hence, in all accessible
possible worlds the proposition p is true. Therefore, 2p is true in u, or M, u |= 2p.
2. In t, by contrast, there is some accessible possible world in which p is true (namely w), but there is
also an accessible world in which p is false (namely v). Therefore, p is possible, but not necessary:
M, t |= 3p but M, t |= ¬2p.
3. In w, only two worlds are accessible. In one of them r is true, in the other one q is true. So in both
of these worlds q ∨ r is true. This means that q ∨ r is true in all accessible worlds for w, and so
M, w |= 2(q ∨ r).
4. In v, no possible world is accessible. Therefore, trivially, ⊥ is true in ‘all’ accessible possible
worlds: M, v |= 2⊥. And because v is accessible from w: M, w |= 32⊥.
3.3 Semantic validity
To define what a normal modal logic is semantically, we have to define not only truth but also validity. At
this point we have only defined truth in a world of a model. We generalize this notion now in three steps
to arrive at general validity of a formula. (The concept of a frame class will be made clear in section 4.)
Definition 5 (Validity). Let ϕ be a modal formula, M a Kripke model, F a Kripke frame, and C a class
of Kripke frames.
M |= ϕ     ϕ is valid on M          ⇔  M, w |= ϕ, for all possible worlds w in M
F |= ϕ     ϕ is valid on F          ⇔  M |= ϕ, for all models M based on F
|=C ϕ      ϕ is valid on C          ⇔  F |= ϕ, for all frames F in frame class C
|= ϕ       ϕ is generally valid     ⇔  F |= ϕ, for all frames F
An inference from Φ to ψ is generally valid, notation Φ |= ψ, if, and only if, for all models M and worlds
w: M, w |= ϕ for all ϕ ∈ Φ implies that M, w |= ψ.
An inference from Φ to ψ is valid in frame class C, notation Φ |=C ψ, if, and only if, for all models M
based on a frame in C and worlds w: M, w |= ϕ for all ϕ ∈ Φ implies that M, w |= ψ.
Something is valid on a Kripke frame just in case it is true at every possible world in every possible
Kripke model based on that frame. Similarly, something is valid on all Kripke frames just in case it
is true at every possible world in every possible Kripke model (based on any possible Kripke frame).
Because of this, the definition of semantic validity is just the same one as Definition 1 in chapter , when
we take the relativization of truth to possible worlds into consideration. A modal formula ϕ is generally
valid, |= ϕ, if it is valid in all Kripke frames. That just means that it is true in all Kripke models (models
based on any possible frame) at every possible world in that model.
The notion of a frame class will be made more clear in the next chapter.
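The levels of validity can also be made concrete computationally, at least for small finite frames. Here is an illustrative Python sketch: validity on a model is checked by quantifying over its worlds, and validity on a frame by additionally quantifying over all valuations built from a fixed, finite stock of atoms. The tuple encoding of formulas is the same one used in the sketch after Definition 4, but a small evaluator is repeated so that the snippet runs on its own.

from itertools import product

def holds(R, V, w, phi):
    """Evaluator for the small fragment used here: atoms, ('not', f), ('imp', f, g), ('box', f)."""
    if isinstance(phi, str):
        return phi in V[w]
    if phi[0] == 'not':
        return not holds(R, V, w, phi[1])
    if phi[0] == 'imp':
        return (not holds(R, V, w, phi[1])) or holds(R, V, w, phi[2])
    if phi[0] == 'box':
        return all(holds(R, V, v, phi[1]) for (x, v) in R if x == w)
    raise ValueError(phi)

def valid_on_model(W, R, V, phi):
    """M |= phi: phi is true at every world of the model."""
    return all(holds(R, V, w, phi) for w in W)

def subsets(atoms):
    atoms = list(atoms)
    return [{a for j, a in enumerate(atoms) if (i >> j) & 1} for i in range(2 ** len(atoms))]

def valid_on_frame(W, R, phi, atoms=('p',)):
    """F |= phi: phi is valid on every model based on F whose valuation uses the given atoms."""
    worlds = sorted(W)
    return all(valid_on_model(W, R, dict(zip(worlds, choice)), phi)
               for choice in product(subsets(atoms), repeat=len(worlds)))

# A reflexive two-world frame; 2p -> p turns out to be valid on it (this ties in with the next chapter).
W = {'a', 'b'}
R = {('a', 'a'), ('b', 'b'), ('a', 'b')}
print(valid_on_frame(W, R, ('imp', ('box', 'p'), 'p')))   # True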
Example 2 (Duality). On the basis of the definition of semantic validity, we can now show that the
duality of 2 and 3 is valid. In other words, the formulas 3ϕ and ¬2¬ϕ are logically equivalent. And,
by simple propositional logic, the same goes for 2ϕ and ¬3¬ϕ.
Proposition 1 (Semantic validity of duality). 3ϕ ↔ ¬2¬ϕ is generally valid.
Proof. We show that if 3ϕ is true anywhere, ¬2¬ϕ must be true there, and if ¬2¬ϕ is true anywhere,
3ϕ must be true there.
Let M be an arbitrary model, w a world in M, and ϕ a modal formula, such that M, w |= 3ϕ. Then
there is a possible world v such that wRv and v |= ϕ. Now suppose that M, w |= 2¬ϕ. Then ¬ϕ is true
in every possible world that is accessible from w. Since v is accessible from w, it must be the case that
M, v |= ¬ϕ. However, then M, v |= ϕ ∧ ¬ϕ, which cannot be the case. So M, w 6|= 2¬ϕ, and therefore
M, w |= ¬2¬ϕ.
Vice versa, suppose that M, w |= ¬2¬ϕ. Then it is not true that, in all possible worlds that are
accessible from w, ¬ϕ is true. In other words, there must be some possible world v, with wRv, in which
¬ϕ is false. So M, v |= ϕ. Now, given the semantic definition of 3, it is true that M, w |= 3ϕ.
Given some modal formula ϕ, we may want to consider whether or not it is a general validity of
modal logic. If so, it must be true at every world in every model. If not, then there must be at least some
world in some Kripke model where ϕ is false. Such a model is called a ‘countermodel’ to the claim that
ϕ is a general validity. Similarly, a countermodel to the claim that Φ |= ψ is a model where, in some
world, all the formulas in Φ are true, but ψ is false.
Example 3. Consider the two formulas 2(p → q) and 2p → 2q. In fact, 2p → 2q 6|= 2(p → q). A
countermodel to prove this is one where at some world 2p → 2q is true and 2(p → q) is false. A first
step is this:
[Diagram: a model with two worlds, where w has v as its only accessible world and no atomic proposition is true at either world.]
In w, there is an accessible possible world (namely v) in which p is false. Therefore, M, w |= ¬2p.
As a consequence, M, w |= 2p → 2q (look this up in the truth table for → to see this).
Now, we need to make sure that 2(p → q) is false in w. Note that it is true in the model above: in v,
p → q is true (again, look this up in the truth table for →). And, since this is the only accessible world
for w, it is the case that M, w |= 2(p → q). So, we add another accessible world for w, called u, such
that p → q is false there.
[Diagram: the same model extended with a third world u, also accessible from w, at which p is true and q is false.]
In this model, in w, 2p is still false (because p is false at v) and so 2p → 2q is still true. But
2(p → q) is not true anymore, because p → q is false at u. Hence, the above model is a countermodel
for 2p → 2q |= 2(p → q). There are models and worlds where the premise is true but the conclusion is
false.
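The countermodel of Example 3 can also be checked mechanically. The snippet below is an illustrative Python sketch that evaluates the premise 2p → 2q and the conclusion 2(p → q) at w directly, without a general evaluator.

# The countermodel: wRv and wRu, with p true only at u and no other atoms true anywhere.
W = {'w', 'v', 'u'}
R = {('w', 'v'), ('w', 'u')}
V = {'w': set(), 'v': set(), 'u': {'p'}}

succ_w = {v for (x, v) in R if x == 'w'}                      # worlds accessible from w
box_at_w = lambda pred: all(pred(y) for y in succ_w)          # evaluate a 2-formula at w

box_p = box_at_w(lambda y: 'p' in V[y])                       # False: p fails at v
box_q = box_at_w(lambda y: 'q' in V[y])                       # False: q fails everywhere
premise = (not box_p) or box_q                                # 2p -> 2q is true at w
conclusion = box_at_w(lambda y: 'p' not in V[y] or 'q' in V[y])   # 2(p -> q) fails because of u
print(premise, conclusion)                                    # True False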
3.4 Exercises
1. (a) Let 2ϕ stand for “It is known that ϕ”. Explain why the formulas 2ϕ → 22ϕ and ¬2ϕ →
2¬2ϕ are also called ‘knowledge introspection’. (b) If 2ϕ were to mean that ϕ is obligatory, then
which formula would say that ϕ is permitted? And how about a formula saying that ϕ is forbidden?
2. Given the Kripke model below, which of the following statements is true?
[Diagram: a Kripke model with worlds w (no atoms true), v (p), t (p), and u (q); the accessibility relation is as drawn in the original figure.]
a. w |= ¬p
b. v |= 2p
c. t |= 3⊤
d. t |= 2⊥
e. w |= 2(p ∨ q)
f. w |= 2p ∨ 2q
g. u |= 22⊥
h. v |= 2q → ¬p
i. w |= 3⊥
j. w |= 2⊤
3. Given the frame drawn below, define a valuation on the frame such that all of the following is true: w |= 3p, w |= 22p, v |= 3q, u |= ¬q. Does there exist more than one valuation that validates these constraints?

[Diagram: the frame for exercise 3, with worlds w, v, u, and t.]

4. i. Consider the frame drawn below, with the valuation V(w) = {p, q}, V(v) = {p}, V(t) = ∅, V(u) = {q}, V(s) = {p}, V(r) = {p, q}. In which world(s) is the following formula true? (a) 32⊥, (b) 22q, (c) 32q, (d) 3^n (q ∨ 2q), for every sequence of 3’s of length n ≥ 1.

[Diagram: the frame for exercise 4, with worlds w, v, u, t, s, and r.]

ii. Give an alternative valuation such that the following formulas are all, simultaneously, true on the resulting model: (a) w |= 32p → 2p, (b) r |= ¬p → 33q, (c) u |= 232¬q; (d) v |= p → 3p; (e) s |= q → 2¬q.
5. Which of the following formulas is a tautology, that is, which formulas are true in all worlds of all
models? If a formula is not a tautology, give a countermodel.
a. 2⊤
b. 2⊥
c. 3⊤
d. 3⊥
e. 2p → 3p
f. (2ϕ ∧ 3ψ) → 3ϕ
g. (2(p ∨ q) ∧ 3¬p) → 3q
h. 2(ϕ → ϕ)
6. Explain why the following formulas are generally valid.
a. ϕ ∨ ¬ϕ
b. ϕ, 2ϕ, 22ϕ, . . . , for propositional tautologies ϕ
c. 3ϕ → 3⊤
d. (2ϕ ∧ 2ψ) → 2(ϕ ∧ ψ)
e. (2ϕ ∨ 2ψ) → 2(ϕ ∨ ψ)
f. (2(ϕ → ψ) ∧ 2ϕ) → 2ψ
7. Show that the other direction of the true formula (2ϕ ∨ 2ψ) → 2(ϕ ∨ ψ) given above, i.e. 2(ϕ ∨
ψ) → (2ϕ ∨ 2ψ), is not generally valid. That is, give formulas ϕ and ψ and a model and world at
which 2(ϕ ∨ ψ) → (2ϕ ∨ 2ψ) does not hold.
8. Give instances of ϕ and ψ for which 2(ϕ ∨ ψ) → (2ϕ ∨ 2ψ) does hold in all worlds in all models.
4 Characterizability and frame correspondence
4.1 Characterizability and the modal language
Different Kripke frames for different modalities
As has been said, there are various modalities: epistemic for knowledge, deontic for obligation, tense
for the flow of time, dynamic for action and causation, and so on. Since the work of Kripke and others,
the common idea is that the truth values of all modal formulas can be captured by Kripke models. The
only difference between those modalities concerns what it means that one ‘possible world’ is ‘accessible’
from another one. For epistemic modality ‘accessible’ means that we cannot distinguish one world (i.e.
one picture of how the world might be) from another on the basis of what we know. For deontic modality
some world, or situation, is ‘accessible’ from the current one if it is permissible to change the current
situation into the other one. For tense, the ‘worlds’ are time points and accessibility represents the
passing of time, from one point in time to the next.
However, even if we accept that all modalities can be understood in terms of Kripke models, that
does not imply that all Kripke models make sense for every modality. For instance, the most common
picture of time is that of a (continuous) line, without loops, circles, or jumps. For any two times, they
are the same, or one comes before the other. (If you believe that you cannot change the past but you
can change the future, then the time line might be ‘branching’ towards the future.) So, with this specific
interpretation of ‘accessibility’ comes a more specific demand on the kinds of Kripke frames we consider
applicable.
Similarly, suppose we think of the accessibility relation such that wRv means that, on the basis of our
knowledge, we cannot distinguish between the possible worlds w and v. Then it would be natural that
for all possible worlds w and v, if wRv, then also vRw.
And consider the deontic accessibility relation, such that wRv means that, in situation w we are
allowed to perform an action that changes the situation to v. In that sense of ‘accessibility’ it would not
be necessary that wRv implies vRw: if we are in a bad situation, it would be permissible to improve it,
but then it would not be permissible to change the improved situation back to the old one. On the other
hand, we might like to think that, morally, we are always permitted to do something. Our best action is
the most that can, morally speaking, be asked for. Hence, there is always some world that is accessible.
In this way, different senses of ‘accessibility’ lead to different ideas of which models make sense
and which do not. Specifically, in the example with knowledge, only those Kripke frames are applicable
where
∀w∀v(wRv → vRw),
whereas in the case of permissible action the frames should be restricted to those where
∀w∃v(wRv).
In case accessibility represents the flow of time, the only frames we wish to include are those where the
worlds follow each other consecutively, continuously, non-cyclically, and eternally.
The question we then have to deal with is what the consequences of such a restriction of the Kripke
frames (or models) are for reasoning about those modalities. General validity is defined in terms of
what is true in all possible worlds in all Kripke models, or, equivalently, in terms of what is valid in
all Kripke frames. Hence, if we restrict ourselves to only some of the Kripke frames, perhaps more
formulas will become valid: there might be formulas that are valid on all Kripke frames with the condition ∀w∀v(wRv → vRw), but not valid on some other Kripke frames. Those formulas could then be
considered ‘valid for the epistemic modality’, but not valid for the deontic modality.
Vice versa, we would like to be able to express using our modal language that the accessibility relation
has this additional structure. For instance, we would like to have a modal formula that says “If I cannot
distinguish this world from that one, then I cannot distinguish that world from this one”. In other words,
a modal formula that is true if, and only if, the accessibility relation is such that ∀w∀v(wRv → vRw).
Due to the limitations in the expressive power of the modal language, this search will be successful in
some cases but not in others. For instance, in view of the proposed condition on epistemic accessibility,
the formula 32ϕ → ϕ is ‘valid for epistemic modality’, and that it is valid in this sense also means, vice
versa, that the frame fulfils this condition. On the other hand, we will not be able to find such a modal
formula characterizing the direction of time.
Expressive power of the modal language
The expressive power of the modal language is importantly constrained by the fact that (i) modal formulas are evaluated at a world in a model, and (ii) their truth values can only depend on the worlds that are
accessible from it (and the worlds accessible from there, and so on). It is sometimes said that the modal
language gives us an ‘internal’ perspective on a model. We evaluate a formula in the given world ‘where
we are’, and then the accessibility relation allows us to ‘travel’ to another world in the model, where
we can evaluate another formula, and so on. This contrasts with predicate logic, in which we evaluate
formulas relative to the model as a whole, thus taking an ‘external’ perspective on the model.
Two different models or frames can seem to be the same, given an internal perspective, although they
can be clearly distinguished once we adopt an external perspective. For instance, if the proposition p is
true everywhere on a model of which the frame is represented by the natural numbers, then ‘travelling’
through it would be indistinguishable from ‘travelling’ from a possible world a to a itself all the time.
All that we would observe is that, no matter how much we travel, the proposition p is true all the time.
0 → 1 → 2 → · · · → n → n + 1 → · · ·            a (with aRa)
As a consequence of this internal perspective, the modal language is sometimes not able to distinguish
two worlds, frames, or models, when the predicate logical language is able to distinguish them. The
predicate logical formula ∀x∀y(x = y ∧ xRy) is true only for the right hand model. Yet, in both a and in
0 (or any other natural number), it would be true that
p ∧ 3p ∧ 2p ∧ 33p ∧ 22p ∧ . . .
so as far as the modal language goes, those two models are equivalent. We will say, later on, that there
is a “bisimulation” between the two models, or that 0 in the one model and a in the other model are
“bisimilar”.
In what follows, we first discuss the properties on frames, the extra demands on the accessibility
relation. As we will see, we can define some of them using the modal language. Others we cannot
define. To understand why this is so, we will give a precise definition of what it means that two worlds,
models, or frames, are indistinguishable as far as the modal language goes. Several techniques are
presented for determining whether a property of accessibility relations can be expressed or not.
4.2 Frame correspondence
The first thing to do, then, is to consider the properties a frame has, so that we may determine to what
extent those properties can be expressed in the modal language. Of course we can mention the size of
the frame, i.e., the number of possible worlds, or for instance the fact that there are exactly three worlds
for which no world is accessible. Here we restrict ourselves to structural uniformities in a frame, such
as the property that an accessible world exists for every world in the frame, or the property that the
accessibility relation is transitive, or symmetric. The definition below lists a number of these properties,
stated in the language of predicate logic.
Definition 6 (Frame properties). The following list defines various properties of frames using predicate
logical notation.
hW, Ri is reflexive              ⇔  ∀w(wRw)
hW, Ri is irreflexive            ⇔  ∀w¬(wRw)
hW, Ri is serial                 ⇔  ∀w∃v(wRv)
hW, Ri is symmetric              ⇔  ∀w∀v(wRv → vRw)
hW, Ri is asymmetric             ⇔  ∀w∀v(wRv → ¬vRw)
hW, Ri is anti-symmetric         ⇔  ∀w∀v((wRv ∧ w ≠ v) → ¬vRw)
hW, Ri is weakly ordered         ⇔  ∀w∀v(w ≠ v → (wRv ∨ vRw))
hW, Ri is transitive             ⇔  ∀w∀v∀u((wRv ∧ vRu) → wRu)
hW, Ri is Euclidean              ⇔  ∀w∀v∀u((wRv ∧ wRu) → vRu)
hW, Ri is dense                  ⇔  ∀w∀v(wRv → ∃u(wRu ∧ uRv))
hW, Ri is deterministic          ⇔  ∀w∀v∀u((wRv ∧ wRu) → v = u)
hW, Ri is piecewise connected    ⇔  ∀w∀v∀u((wRv ∧ wRu) → (vRu ∨ uRv))
hW, Ri is universal              ⇔  ∀w∀v(wRv)
hW, Ri is disconnected           ⇔  ∀w∀v¬(wRv)
A preordered frame is transitive and reflexive. A partially ordered frame is anti-symmetric and preordered. An equivalence frame is symmetric and preordered.
The frame properties each define a frame class. For instance, C = {F | F is transitive} is the class of
all and only transitive frames. The class of preordered frames is the intersection of the class of reflexive
frames and the class of transitive frames, so Cpreorder = Creflexive ∩ Ctransitive .
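On a finite frame, each of these first-order conditions is easy to test directly. The following is an illustrative Python sketch, with the relation represented as a set of pairs, checking a few of the properties for the rightmost frame from the previous chapter (wRv, vRv, uRt, tRu).

def reflexive(W, R):   return all((w, w) in R for w in W)
def serial(W, R):      return all(any((w, v) in R for v in W) for w in W)
def symmetric(W, R):   return all((v, w) in R for (w, v) in R)
def transitive(W, R):  return all((w, u) in R for (w, v) in R for (x, u) in R if x == v)
def euclidean(W, R):   return all((v, u) in R for (w, v) in R for (x, u) in R if x == w)

W = {'w', 'v', 'u', 't'}
R = {('w', 'v'), ('v', 'v'), ('u', 't'), ('t', 'u')}
print(reflexive(W, R), serial(W, R), transitive(W, R), euclidean(W, R))   # False True False False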
The question we are concerned with is whether there are modal formulas that express these properties. Or, more precisely, whether there are modal formulas that, if valid on a frame, guarantee that
the frame has this property (that it belongs to the frame class). This is what the following definition is
concerned with.
Definition 7 (Definable frame class). A set of formulas Φ characterizes a class of frames C if, and only
if the formulas in Φ are jointly valid on all, and only, the frames F in class C.
F |= ϕ for all ϕ ∈ Φ ⇔ F ∈ C
A class of frames is modally definable if there is a set of modal formulas that characterises it.
If Φ = {ϕ}, we also loosely say that ϕ, rather than {ϕ}, characterizes C.
Given this definition, trivially every set of formulas characterizes some class of frames. The issue
we are interested in, however, is, vice versa, to what extent classes of frames are modally definable. The
following theorem states some positive results in this direction. Many frame classes can be modally
defined by means of a single formula.
Theorem 1 (Correspondence Theorems (1)).
- 2ϕ → ϕ characterizes the class of reflexive frames.
- 32ϕ → ϕ characterizes the class of symmetric frames.
- 2ϕ → 22ϕ characterizes the class of transitive frames.
- 3ϕ → 23ϕ characterizes the class of Euclidean frames.
- 22ϕ → 2ϕ characterizes the class of dense frames.
- 3ϕ → 2ϕ characterizes the class of deterministic frames.
- 2ϕ → 3ϕ characterizes the class of serial frames.
Reflexivity means that every possible world is accessible to itself. The opposite of this is irreflexivity:
then no world is accessible to itself. There can also be frames in which some worlds are ‘auto-accessible’
and others are not. Any frame that is not reflexive is called ‘non-reflexive’. So, all irreflexive frames are
non-reflexive, but not vice versa.
Asymmetry means that, for any two worlds in the frame, if the first is accessible to the second, then the
second is not accessible to the first. To see that asymmetry implies that the frame is also irreflexive,
consider the case where the two worlds are one and the same. Anti-symmetric frames allow for both
reflexive and non-reflexive frames, by restricting the asymmetry-condition to only two different worlds.
Symmetric frames can be either reflexive or non-reflexive.
The difference between transitive and Euclidean frames can be illustrated as follows. The continuous
lines represent the condition and the dotted line indicates what the frame property then stipulates.
[Diagram: for transitivity, solid arrows w → v and v → u force the dotted arrow w → u; for the Euclidean property, solid arrows w → v and w → u force the dotted arrow v → u.]
Note 1: An R-cycle wR . . . Rw in a transitive frame implies that for any two worlds in that chain, those
worlds are mutually R-accessible. (You can calculate this for yourself.) In other words, if we would limit
the model to the worlds in that cycle, the relation would be universal. A Euclidean relation implies that
the set of worlds accessible from a world are always accessible to each other. Both facts are illustrated
below.
[Diagram: left, a transitive frame with a cycle through w, v, and u, in which all worlds on the cycle are mutually accessible; right, a Euclidean frame in which the successors of w are all accessible to each other.]
Note 2: If a frame is reflexive, then it is Euclidean if, and only if, it is symmetric and transitive. The
resulting relation on the frame is called an equivalence relation.
Two further correspondence results can be added to the list in Theorem 1:
- 2ϕ characterizes the class of disconnected frames.
- 2(2ϕ → ψ) ∨ 2(2ψ → ϕ) characterizes the class of piecewise connected frames.
In other words, a frame is reflexive if, and only if, the formula 2ϕ → ϕ is valid on it. It is important to
realize that these correspondence theorems relate formulas to frames, not models. The formula 2p → p
can be valid on a model that does not have a reflexive frame. But if it is valid on the frame, then that
frame is reflexive. The models below illustrate this point: they all have the same frame, but only on the
right hand one is the formula 2p → p false.
[Diagram: three models on the same two-world frame, in which w and v are accessible from each other but not from themselves. In the first model p is true at both worlds, in the second at neither, and in the third at w only; only in the third model is 2p → p false (namely at v).]
In order to see how to prove these theorems, next we give two such proofs. The first is a proof of the
modal definability of reflexivity, the one thereafter proves the modal definability of transitivity.
Proof. for F |= 2ϕ → ϕ ⇔ F is reflexive.
⇐: Suppose F = hW, Ri is reflexive. We have to show that F |= 2ϕ → ϕ, that is, for all formulas ϕ, for
all valuations V, for all w ∈ W, w |= 2ϕ → ϕ in the model hW, R, Vi. Thus consider an arbitrary formula
ϕ, an arbitrary valuation V and an arbitrary world w in W. Since R is reflexive wRw has to hold. Now
suppose that w |= 2ϕ. This means that for all v, if wRv, then v |= ϕ. Since wRw, this implies that w |= ϕ.
This proves that w |= 2ϕ → ϕ, and we are done.
⇒: This direction we show by contraposition. Thus we assume F = hW, Ri is not reflexive, and then
show that F 6|= 2ϕ → ϕ for some formula ϕ. In other words, we have to show that if F is not reflexive,
then there is a formula ϕ and a valuation V and a world w in W such that w 6|= 2ϕ → ϕ in the model
hW, R, Vi. Note that w 6|= 2ϕ → ϕ is the same as w |= 2ϕ ∧ ¬ϕ. Thus suppose F is not reflexive. This
means that there is at least one world w such that not wRw. Now define the valuation V on F as follows
(with only one atomic proposition p). For all worlds v:
p ∈ V(v) ⇔ wRv
Observe that in this definition the v are arbitrary, but w is the particular world such that not wRw that
we fixed above. The definition implies that v |= p if wRv, and for all other worlds z in W it implies z 6|= p,
i.e., z |= ¬p. For instance, as in this model:
[Diagram: an example of such a model: p is false at w, the worlds u and v accessible from w both satisfy p, and a further world z that is not accessible from w has p false.]
Since not wRw, we have w |= ¬p. But the definition of V implies that all accessible worlds v of w, i.e., all
worlds such that wRv, have v |= p. Thus w |= 2p. Hence w |= 2p ∧ ¬p and so w 6|= 2p → p. Therefore,
there is a formula ϕ, namely the formula p, such that w 6|= 2ϕ → ϕ.
Proof. for F |= 2ϕ → 22ϕ ⇔ F is transitive.
⇐: Suppose F = hW, Ri is transitive. We have to show that F |= 2ϕ → 22ϕ, that is, that for all
formulas ϕ, for all valuations V, for all w ∈ W, w |= 2ϕ → 22ϕ in the model hW, R, Vi. Thus consider
an arbitrary formula ϕ, an arbitrary valuation V and an arbitrary world w in W. Now suppose w |= 2ϕ.
We have to show that w |= 22ϕ, i.e. for all v such that wRv, v |= 2ϕ. Thus consider a v such that wRv.
To show v |= 2ϕ, we have to show that for all u with vRu, u |= ϕ. Thus consider a u such that vRu. The
transitivity of R now implies that wRu. Since w |= 2ϕ, this means that all successors of w force ϕ. Since
wRu, u is a successor of w. Hence u |= ϕ. Thus we have shown that for all u with vRu, u |= ϕ. Hence
v |= 2ϕ. And that is what we had to show, as it proves that w |= 2ϕ → 22ϕ.
⇒: This direction we show by contraposition. Thus we assume F = hW, Ri is not transitive, and then
show that F 6|= 2ϕ → 22ϕ for some ϕ. In other words, we have to show that if F is not transitive, then
there is a formula ϕ and a valuation V and a world w in W such that w 6|= 2ϕ → 22ϕ in the model
hW, R, Vi. Note that w 6|= 2ϕ → 22ϕ is the same as w |= 2ϕ ∧ ¬22ϕ. Thus suppose F is not transitive.
Then there are at least three worlds w, v and u such that wRv and vRu and not wRu. Now define the
valuation V on F as follows:
p ∈ V(x) ⇔ wRx.
Thus, we put v |= p if wRv, and for all other nodes u in W we put u 6|= p, i.e. u |= ¬p. E.g. as in this
model:
[Diagram: an example of such a model: a chain w → v → u, with p false at w, p true at v, and p false at u (u is not accessible from w).]
Since not wRu, we have u |= ¬p. This implies that v |= ¬2p. But this again implies that w |= ¬22p.
But the definition of V implies that all successors v of w, i.e. all nodes such that wRv, have v |= p. Thus
w |= 2p. Hence w |= 2p ∧ ¬22p. Thus w 6|= 2p → 22p. Thus there is a formula ϕ, namely p, such
that w 6|= 2ϕ → 22ϕ. This proves that the formula 2ϕ → 22ϕ characterizes the class of transitive
frames.
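The countermodel construction in the last proof can be replayed concretely. The following illustrative Python snippet takes the non-transitive three-world frame with wRv and vRu but not wRu, builds the valuation p ∈ V(x) ⇔ wRx, and confirms that 2p is true at w while 22p is not.

# A non-transitive frame: wRv and vRu, but not wRu.
W = {'w', 'v', 'u'}
R = {('w', 'v'), ('v', 'u')}
V = {x: ({'p'} if ('w', x) in R else set()) for x in W}        # p ∈ V(x) ⇔ wRx

succ = lambda x: {y for (z, y) in R if z == x}
box_p_at = lambda x: all('p' in V[y] for y in succ(x))         # does x |= 2p hold?

print(box_p_at('w'))                          # True: the only successor of w is v, and p is true at v
print(all(box_p_at(y) for y in succ('w')))    # False: 22p fails at w, since v has the successor u where p is false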
We can get a bit further with our quest for modal definability by means of the following proposition.
It states that we can freely combine the formulas characterizing frame properties into sets of formulas
characterizing combined frame properties.
Proposition 2 (Combination). If Φ1 characterizes frame class C1 and Φ2 characterizes frame class C2 ,
then Φ1 ∪ Φ2 characterizes frame class C1 ∩ C2 .
Proof. The proof for this is an exercise.
Some interesting frame classes are defined by a set of properties, such as the preordered frames
and the equivalence frames. A direct consequence of Proposition 2 is that these frame classes are also
modally definable:
Theorem 2 (Correspondence Theorems (2)).
- {2ϕ → ϕ, 2ϕ → 22ϕ} characterizes the class of preordered frames.
- {2ϕ → ϕ, 2ϕ → 22ϕ, 32ϕ → ϕ} characterizes the class of equivalence frames.
A proof of the correspondence theorem for equivalence frames can be found in the book by van Ditmarsch
et al. [16].
All of this seems to suggest that the frame properties that are definable in the modal language are
always expressible as predicate logical formulas starting with a universal quantifier. However, that is not
quite true. There are a few properties that cannot be defined by means of the predicate logical language
that are nonetheless modally definable.
Theorem 3 (Correspondence Theorems (3)).
- 23ϕ → 32ϕ characterizes the class of McKinsey frames.
- 2(2ϕ → ϕ) → 2ϕ characterizes the class of Gödel-Löb frames.
The Gödel-Löb frames are transitive and conversely well-founded. A frame is well-founded if
there is always a finite series of steps to a ‘root’ world, in other words, there is no infinite chain of
worlds . . . Rw2 Rw1 Rw0 . A frame is conversely well-founded if there is no infinite chain of worlds
w0 Rw1 Rw2 R . . .. The property of converse well-foundedness is not expressible in predicate logic, and it
is also not modally definable as such. But in combination with transitivity it is modally defined by the
Gödel-Löb formula. The name originates from the two founders of Provability Logic, which is a modal
logic in which 2ϕ is understood to mean “ϕ is provable”. The modal formula is an axiom in that logic
(see the next section) and it means that there is a proof of ϕ if there is a proof of the fact that a proof of
ϕ implies that ϕ is true.
We leave aside the question of what exactly the McKinsey frame property is.
4.3 Bisimulation invariance
Not all frame properties can be modally defined. To show that a property cannot be defined, there are
several methods available. They are all based on the fact that two possible worlds in two different models
can make exactly the same formulas true. In this section we define a relation between worlds in models
called a ‘bisimulation’, that guarantees that the two worlds are ‘semantically equivalent’, i.e., they make
the same modal formulas true. Put differently, no modal formula can distinguish the two possible worlds,
by being true in the one world and false in the other. In the next section we will use this concept to define
three types of relations between Kripke frames that can be used to show that a frame property cannot be
modally defined.
Definition 8 (Bisimulation). Given two models M = hW, R, Vi and M 0 = hW 0 , R0 , V 0 i, a bisimulation
between M and M 0 is a relation Z on W × W 0 such that
1. wZw0 implies that the same propositions are true in w and w0 , V(w) = V 0 (w0 ),
2. wZw0 and wRv implies that there is a v0 ∈ W 0 such that w0 R0 v0 and vZv0 ,
3. wZw0 and w0 R0 v0 implies that there is a v ∈ W such that wRv and vZv0 .
When there is a bisimulation relating world w in M to world w0 in M 0 , we write M, w ↔ M 0 , w0 .
Here is a graphical representation of the second and third conditions, the so-called ‘forth’ and ‘back’
conditions. The continuous lines indicate the conditions under which the dotted lines should be found.
[Diagram: the ‘forth’ condition: given wZw0 and wRv (solid), there must be a v0 with w0 R0 v0 and vZv0 (dotted); the ‘back’ condition: given wZw0 and w0 R0 v0 (solid), there must be a v with wRv and vZv0 (dotted).]
These graphs illustrate the way to check for bisimilarity. You start with a certain world w in one model
and another world w0 in another model, with at least the same propositional valuation V(w) = V 0 (w0 )
(condition 1). Then you look to the worlds that are accessible from w and for each of them you connect
it with a world in the other, that must be accessible from w0 . And you do so vice versa for the worlds
that are accessible from w0 : they must be connected with a world in the first model. Then, for all of the
worlds you ‘connected’ with each other you do the same thing, until you have reached the point where
there are no more worlds to connect.
Here is a simple example of a bisimulation between the worlds of two different models:
[Diagram: on the left, a model with two worlds w and v, both satisfying q, that are accessible from each other (via R); on the right, a model with a single world a, satisfying q, that is accessible from itself (via R0 ).]
The bisimulation is wZa and vZa. For the first condition we check that V(w) = V(a), which is correct,
and V(v) = V(a), which is also correct. Then we move on to the second condition: w can access v, and
wZa, so there must be some world in the right hand model that is (i) accessible from a, and (ii) bisimilar
to v. Clearly, that world is a itself: (i) aR0 a, and (ii) vZa. And v can access w, so there must be a world
in the right hand model that is accessible from a and bisimilar to w. Again, this world is a itself. For the
third condition we have to do the same thing in the reverse direction. We observe that wZa and aR0 a, so
we have to find a world x in the left hand model such that wRx and xZa. That world can only be v: wRv
and vZa. We are done.
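On finite models, a candidate bisimulation can also be verified mechanically against the three conditions of Definition 8. The sketch below is an illustration in Python; it confirms the relation Z = {(w, a), (v, a)} from the example above.

def is_bisimulation(Z, M1, M2):
    """Check conditions 1-3 of Definition 8 for a candidate relation Z between the two models."""
    (W1, R1, V1), (W2, R2, V2) = M1, M2
    for (w, w2) in Z:
        if V1[w] != V2[w2]:                                           # condition 1: same atoms true
            return False
        for v in {b for (a, b) in R1 if a == w}:                      # condition 2: 'forth'
            if not any((w2, v2) in R2 and (v, v2) in Z for v2 in W2):
                return False
        for v2 in {b for (a, b) in R2 if a == w2}:                    # condition 3: 'back'
            if not any((w, v) in R1 and (v, v2) in Z for v in W1):
                return False
    return True

# The example above: w and v (both q) see each other; a (q) sees itself.
M1 = ({'w', 'v'}, {('w', 'v'), ('v', 'w')}, {'w': {'q'}, 'v': {'q'}})
M2 = ({'a'}, {('a', 'a')}, {'a': {'q'}})
print(is_bisimulation({('w', 'a'), ('v', 'a')}, M1, M2))   # True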
The notion of bisimilarity goes to the heart of what modal logic is. Bisimulations were introduced
by Van Benthem [14], who actually defined modal logic to be the fragment of first order logic that is
invariant under bisimulation. To understand this characterization, we would have to explain how modal
logic can be seen as a fragment of first-order logic. We leave it to the reader to look this up in for instance
Blackburn et al. [2].
Theorem 4 (Bisimulation theorem). If for two models M = hW, R, Vi and M 0 = hW 0 , R0 , V 0 i there is a
bisimulation Z such that wZw0 for some w ∈ W and w0 ∈ W 0 , then the same formulas are true in w and
w0 . That is,
M, w |= ϕ ⇔ M 0 , w0 |= ϕ.
Next, we will give a proof of the Bisimulation theorem by means of formula induction. With an
induction on the complexity of a formula, we obtain a general method of showing that the models make
all the same formulas true. We know that they are the same for the atomic propositions. And if they are
equivalent on p and q, then they are equivalent on p ∨ q. And if they are equivalent for p ∨ q, then they
are equivalent for 3(p ∨ q), and so on.
Proof. For reasons of symmetry, we only have to prove it for the direction from M to M 0 . The other
direction is completely analogous. So we assume that M, w |= ϕ, for arbitrary ϕ, and prove that w0 |= ϕ.
We prove this by induction on the complexity of ϕ. That is, we prove that for every complex formula
ϕ, it is true that w |= ϕ ⇔ w0 |= ϕ, given that this is true for the less complex subformulas. This is the
induction hypothesis.
Induction hypothesis: if ψ is a subformula of ϕ, then M, w |= ψ ⇔ M 0 , w0 |= ψ.
Basic case: suppose ϕ = p, some atomic proposition, and so w |= p. Then it follows directly from the
definition of bisimilarity, condition 1, that w0 |= p. So w0 |= ϕ.
Negation: suppose ϕ = ¬ψ, and so w |= ¬ψ. This means that w 6|= ψ. Then, on the induction
hypothesis, it is true that w |= ψ ⇔ w0 |= ψ. And since w 6|= ψ, also w0 6|= ψ. Then, by the definition of ¬
once again, it follows that w0 |= ¬ψ, so w0 |= ϕ.
Disjunction: suppose ϕ = ψ ∨ χ, and so w |= ψ ∨ χ. By the induction hypothesis, we can assume
that w |= ψ ⇔ w0 |= ψ and w |= χ ⇔ w0 |= χ. Because w |= ψ ∨ χ, by the semantic definition of ∨ we
know that w |= ψ or w |= χ. Suppose that w |= ψ (the other case is similar). Then it is also the case that
w0 |= ψ. Therefore, by the semantic definition of ∨ once again, it follows that w0 |= ψ ∨ χ, and so w0 |= ϕ.
The other connectives are similar (and can be defined in terms of negation and disjunction).
Possibility: suppose that ϕ = 3ψ, and so M, w |= 3ψ. By the truth definition, there must be a world
v in W such that wRv and M, v |= ψ. From the fact that w and w0 are linked by the bisimulation Z, we
may infer that there is some world v0 accessible from w0 in M 0 , such that vZv0 (this is condition 2 in the
definition of bisimulation). The inductive hypothesis gives us that M 0 , v0 |= ψ. But then we may conclude
from the fact that w0 R0 v0 that M 0 , w0 |= 3ψ, which is precisely what we are after.
The case of necessity is similar.
For each connective, we use first the truth definition to reduce the truth value of the complex formula
to the truth value of some more simple formula. Then we use the induction hypothesis to show that in
the bisimilar world that formula has the same truth value. Then finally we use the truth definition again
to establish the truth value of the complex formula (in the other world) on the basis of the truth values of
the simpler formulas. For the modalities we also have to use condition 2, from M to M 0 , and condition
3, vice versa.
Bisimulations are relations between worlds in models. The bisimulation theorem shows how bisimulations make those related worlds modally indistinguishable. This result can also be used to establish
dependencies between the larger structures themselves: models validating all of the same modal formulas, and frames validating all of the same modal formulas. Those consequences of the bisimulation
theorem are presented below.
Corollary 1. Call a bisimulation between M and M 0 complete for M 0 if all the worlds in M 0 are related
to a world in M. If there is a bisimulation between M and M 0 that is complete for M 0 , then
M |= ϕ ⇒ M 0 |= ϕ.
Proof. Suppose that there is a bisimulation Z between M and M 0 that is complete for M 0 . This means
that for all worlds w0 in model M 0 , there is some world w in model M such that wZw0 . Therefore, any
formula that is true in w is true in w0 . Now, if it is true that M |= ϕ, then ϕ is true in every world in M,
including world w. So, M, w |= ϕ, and by the bisimulation theorem M 0 , w0 |= ϕ. Seeing as this is true for
all w0 in M 0 , ϕ is true in all worlds in M 0 , and therefore M 0 |= ϕ.
Corollary 2. Let F and F 0 be two frames. If for any valuation V 0 on F 0 we can define a valuation on F,
such that there is a bisimulation between the resulting models M and M 0 , complete for M 0 , then
F |= ϕ ⇒ F 0 |= ϕ.
Proof. The proof for this corollary is an exercise.
This corollary will be the key to grasping the limitations of characterizability of frames. We will
consider three kinds of relations between frames that are such that for all valuations on the one frame we
can always find a valuation on the other frame that creates a bisimulation that is complete for the first
model. Using this corollary, we can then conclude that all formulas that are valid on the second frame
are valid on the first frame. As a consequence, no formula can be such that it defines a class of frames
including the first frame but not the second one.
4.4 The limits of characterizability: three methods
We will define three kinds of relations between frames. Those relations are such that whatever is valid
on the one frame is also valid on the other frame. Therefore, if the one frame has a property that the other
frame does not have, then that property cannot be modally characterized. We will use the Bisimulation
theorem, and its corollaries, to show this.
One of the best known results in modal logic is the Goldblatt-Thomason theorem, which pinpoints
precisely the frame properties that can be modally defined. We will only explain the first three concepts
involved, so we cannot prove the theorem here, but see Blackburn et al. [2] for an extensive treatment.
Theorem 5 (Goldblatt-Thomason Theorem). If a frame property can be formulated in the language of
predicate logic (so not the Gödel-Löb or McKinsey properties), then it is modally definable if, and only
if, it is closed under taking generated subframes, disjoint unions, p-morphic images, and its complement
is closed under taking ultrafilter extensions.
Generated subframes
The intuition behind generated subframes is perhaps relatively easy to grasp. Suppose that a frame F
is a member of the definable class C. Then some formula (or set of formulas) must be true in all the
worlds in any model based on F. We can in some cases throw some worlds out of the frame, without
‘disrupting’ the remaining worlds in the frame. That is, those remaining worlds make all of the same
formulas true, in any model based on that resulting frame. If that is so, then the characterizing formula
must also be true in all of the remaining worlds, so the resulting frame must also be in the class C.
Now we ask: which are those ‘irrelevant’ worlds that can be eliminated without disrupting the remaining worlds? Say that a world w can ‘see’ another world v if v is accessible from w, possibly via intermediate
steps. Then if w cannot see v, whatever is the case in v cannot be relevant for w: no amount of 2 and 3
can make the truth value of a formula in w dependent on what is the case in v. Moreover, if w cannot see
v, then neither can any of the worlds w can see. Hence, if we eliminate the worlds that w cannot see, and
keep all of the other worlds, then the resulting frame should validate all of the formulas that are valid on
the original frame.
The first step is to make this concept of ‘seeing’ more precise.
Definition 9 (Hereditary closure). Let F = hW, Ri. The hereditary closure of R is R∗ .
wR∗ v if, and only if, there are worlds u1 . . . un such that wRu1 Ru2 R . . . Run Rv
The worlds w can see are Ww = {v | v = w or wR∗ v}.
The set Ww thus includes all those possible worlds that might be relevant for evaluating a modal
formula in world w: its ‘immediate successors’ (with wRv) are relevant for whether 2p is true in w, the
ones that are accessible from there are relevant for evaluating whether 23q is true, and so on. The only
thing we have to do is define the frame resulting from restricting the worlds to Ww . This is really self-explanatory, but stated here for completeness. We retain all the relational dependencies vRu for worlds u
and v in Ww .
Definition 10 (Generated subframe). Let F = hW, Ri be a frame and w ∈ W. The w-generated subframe
of F is Fw = hWw , Rw i, where vRw u if, and only if, vRu, v ∈ Ww , and u ∈ Ww .
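On a finite frame, Ww is just the set of worlds reachable from w, which can be computed with a simple closure loop. The sketch below is illustrative Python; it uses the leftmost frame from the previous chapter (wRv, wRt, tRv, vRu, uRu), extended with an extra world s that can see w but cannot be seen from w, so that something actually gets cut away.

def generated_subframe(W, R, w):
    """Compute W_w (w together with every world reachable from it) and the restricted relation R_w."""
    Ww, frontier = {w}, {w}
    while frontier:                                   # closure step: follow R from the newest worlds
        frontier = {v for (x, v) in R if x in frontier} - Ww
        Ww |= frontier
    Rw = {(x, v) for (x, v) in R if x in Ww and v in Ww}
    return Ww, Rw

W = {'w', 'v', 'u', 't', 's'}
R = {('w', 'v'), ('w', 't'), ('t', 'v'), ('v', 'u'), ('u', 'u'), ('s', 'w')}
Ww, Rw = generated_subframe(W, R, 'w')
print(Ww)    # {'w', 'v', 't', 'u'}: s is cut away, since w cannot see it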
We can then make the following observation.
Proposition 3. Let F be a frame, Fw its w-generated subframe, and Mw is a model based on Fw . Then
there is a model M based on F such that M, w ↔ Mw , w is an Mw -complete bisimulation.
Proof. For clarity, we name the worlds in Ww w0 , v0 , u0 and so on, to avoid confusion. Nevertheless,
w0 = w, v0 = v and so on.
We take any valuation V such that V(v) = Vw (v0 ) for all v0 ∈ Ww , and define M = hW, R, Vi. We have
to show that the relation Z such that wZw0 for all w0 ∈ Ww is a bisimulation that is complete for Mw .
First, the defined valuation is such that, if vZv0 , then V(v) = Vw (v0 ). Second, suppose that vRu and
vZv0 . Then v0 ∈ Ww , which means that wR∗ v and, because vRu, also wR∗ u, so there is some u0 (i.e. u
itself), such that v0 Rw u0 and, according to the bisimulation we defined, uZu0 . Third, suppose that v0 Rw u0
and vZv0 . Then, from the fact that Rw is a restriction (subset) of R it immediately follows that vRu, with
uZu0 . Thus, the relation Z is a bisimulation. Finally, the bisimulation is complete for Mw , because for
all of the worlds v0 in Mw , there is a world v in M, with vZv0 .
As will be clear, this proposition implies by Corollary 2 that every formula that is valid on a frame
is valid on its generated subframes. From there it is also straightforward to conclude that no frame class
can be modally defined if some frame belongs to that class but some of its generated subframes do not.
Corollary 3. Let F be a frame and Fw its w-generated subframe. For all ϕ: if F |= ϕ, then Fw |= ϕ. If C
is a characterizable class of frames and F ∈ C, then Fw ∈ C.
This corollary was the reason why we defined generated subframes and submodels above. The corollary demonstrates a limitation to the expressive power of modal logic for characterizing frame classes.
It can now be proven, for instance, that the class of non-reflexive frames is not modally definable. You
will be asked to prove this as an exercise.
Disjoint unions
A second technique for proving that a frame property cannot be characterized is by means of a disjoint
union of two frames. Such a union is ‘disjoint’ if there is no overlap between the possible worlds of the
two frames. In order to guarantee disjointness, the general definition is as follows:
Definition 11 (Disjoint Union). Let F1 = hW1 , R1 i and F2 = hW2 , R2 i be two Kripke frames. Their
disjoint union, F1 t F2 is hW, Ri, where
- W = {(w, i) | w ∈ Wi }, and
- R = {h(w, i), (v, i)i | wRi v}, with i ∈ {1, 2}.
Provided that W1 and W2 are already disjoint, the disjoint union of the two frames is simply F =
hW1 ∪ W2 , R1 ∪ R2 i. If w is a member of both W1 and W2 , then there will be two ‘copies’ of it in the
disjoint union: (w, 1) and (w, 2). This guarantees that the structure of the two frames will not be disrupted
by their union. For instance, the disjoint union of two linearly ordered frames will consist of two linearly
ordered frames—even if the two frames have overlapping members. We can also create a disjoint union
by putting two ‘instances’ or ‘copies’ of a single frame together.
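The tagging trick of Definition 11 is easy to carry out directly. The sketch below is illustrative Python; it builds a disjoint union and uses two copies of a one-world universal frame to show why the result is no longer universal (the point made at the end of this subsection).

def disjoint_union(F1, F2):
    """Build F1 ⊔ F2 by tagging every world with the index of the frame it comes from."""
    (W1, R1), (W2, R2) = F1, F2
    W = {(w, 1) for w in W1} | {(w, 2) for w in W2}
    R = {((w, 1), (v, 1)) for (w, v) in R1} | {((w, 2), (v, 2)) for (w, v) in R2}
    return W, R

# Two copies of the one-world frame in which a sees itself (a universal frame).
F = ({'a'}, {('a', 'a')})
W, R = disjoint_union(F, F)
print(sorted(W))                      # [('a', 1), ('a', 2)]: two tagged copies of a
print((('a', 1), ('a', 2)) in R)      # False: the copies cannot see each other, so the union is not universal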
That there is a bisimulation, for every valuation on F1 , relating (w, 1) in F to w in F1 is a trivial
observation. That this bisimulation is complete for F1 is also easily seen. There is no ‘influence’ from
‘the other frame’ in F, because the two are disjoint.
However, what we need is a stronger claim.
Proposition 4. Let F = F1 t F2 . Then,
F |= ϕ ⇔ F1 |= ϕ and F2 |= ϕ.
We want to establish that everything that is valid in both of the smaller frames is also valid in their
disjoint union.
Proof. We only do the direction ⇐. Assume that it is the case that, F1 |= ϕ and F2 |= ϕ. We need to show,
for their disjoint union F, that F |= ϕ. We do this by contraposition.
So, suppose F 6|= ϕ. Then there is some model M = hW, R, Vi based on F, and some (w, i) ∈ W such
that M, (w, i) 6|= ϕ. Now take the model Mi = hWi , Ri , Vi i, where Vi is defined by Vi (w) = V((w, i)). It can
easily be checked that the relation Z such that (w, i)Zw, for all w ∈ Wi , is a bisimulation. From that it
follows that M, (w, i) |= ϕ if, and only if, Mi , w |= ϕ. But then, Mi , w 6|= ϕ, and so Fi 6|= ϕ. That contradicts
our initial assumption.
We use this fact about disjoint union to prove the non-characterizability of, among others, (strong/weak)
connectedness and universality. The disjoint union of two universally connected frames (i.e., ∀w∀v(wRv))
is not itself universally connected, because for no two worlds (w, 1) and (v, 2) is it the case that (w, 1)R(v, 2).
P-morphisms
P-morphisms are functions between frames. They exist when there is a certain similarity between the
frames. That is, given a p-morphism f one can define valuations on the frames such that a node and its
image under the p-morphism cannot be distinguished modally: w |= ϕ ⇔ f (w) |= ϕ.
Definition 12 (P-morphism). Given two frames F = hW, Ri and F 0 = hW 0 , R0 i, a p-morphism f : W →
W 0 between F and F 0 is a map such that
1. f is a surjection,
2. wRv implies f (w)R0 f (v),
3. f (w)R0 v0 implies that there is a v ∈ W such that wRv and f (v) = v0 .
F 0 is called a p-morphic image of F.
Note the difference between p-morphisms and bisimulations: the former are functions between two
frames, while bisimulations are relations between the worlds in two models. As for bisimulations, the
second (left) and third (right) condition on p-morphisms can be depicted as follows:
[Diagram: condition 2: if wRv (solid), then f (w)R0 f (v) (dotted); condition 3: if f (w)R0 v0 (solid), then there is a v with wRv (dotted) and f (v) = v0 .]
The similarity with the bisimulation conditions is obvious from this depiction. The differences are
worth pointing out. First, because f is a function, in case wRv on the left, there must be images f (w) and
f (v) on the right. So condition 2 only needs to state that those images stand in the accessibility relation,
f (w)R0 f (v). Therefore, only the arrow from w0 to v0 is dotted. Second, in case f (w)R0 v0 (picture on the
right), condition 1 that f is surjective implies that there are one or more v such that f (v) = v0 . Condition
3 only requires that, for one of those v, it is moreover true that wRv.
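For finite frames, the three conditions of Definition 12 can be checked directly. The sketch below is an illustration in Python; the particular frames and the ‘parity’ map are hypothetical examples chosen here, not taken from the notes.

def is_p_morphism(f, F1, F2):
    """Check conditions 1-3 of Definition 12 for a candidate map f, given as a dict from W to W'."""
    (W1, R1), (W2, R2) = F1, F2
    surjective = set(f.values()) == set(W2)                                # condition 1
    forth = all((f[w], f[v]) in R2 for (w, v) in R1)                       # condition 2
    back = all(any((w, v) in R1 and f[v] == v2 for v in W1)
               for w in W1 for v2 in W2 if (f[w], v2) in R2)               # condition 3
    return surjective and forth and back

# A four-world cycle 0 -> 1 -> 2 -> 3 -> 0 mapped onto the two-world cycle a <-> b by parity.
F1 = ({0, 1, 2, 3}, {(0, 1), (1, 2), (2, 3), (3, 0)})
F2 = ({'a', 'b'}, {('a', 'b'), ('b', 'a')})
f = {0: 'a', 1: 'b', 2: 'a', 3: 'b'}
print(is_p_morphism(f, F1, F2))   # True: the two-world cycle is a p-morphic image of the four-world cycle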
Proposition 5. Let f : W → W 0 be a p-morphism between F = hW, Ri and F 0 = hW 0 , R0 i, and let
M 0 = hW 0 , R0 , V 0 i be a model based on F 0 . Then there is a model M = hW, R, Vi based on F such that
there is a bisimulation M, w ↔ M 0 , f (w) that is complete for M 0 .
Proof. To obtain the model M we choose V such that V(x) = V 0 ( f (x)). We now have to show that the
relation Z, defined by wZ f (w) for all w ∈ W is a bisimulation, and that this bisimulation is complete for
M0.
For the first condition, we chose V such that for any possible world v, V(v) = V 0 ( f (v)).
For the second condition, suppose that wRv and wZ f (w). Then the p-morphism definition guarantees
that f (w)R0 f (v). So there is a v0 ∈ W 0 , namely f (v) such that vZ f (v) and f (w)R0 f (v).
For the third condition, suppose that f (w)R0 v0 . Then the p-morphism definition guarantees that there
is a v such that wRv and f (v) = v0 . Since vZ f (v), this immediately fulfils the third condition for a
bisimulation.
Finally, to show that the bisimulation is complete for M 0 , note that the p-morphism is a surjection, which
means that every possible world in W 0 is in the image of the function. Accordingly, the bisimulation
with wZ f (w) for every w, relates to every world in M 0 some world in M. Therefore the bisimulation is
complete for M 0 .
It directly follows from this that if a frame has some modally definable property X, i.e. belongs to some characterizable frame class C, then
a p-morphic image of it must also have that property, and so belong to that same frame class.
Corollary 4. Let f be a p-morphism between F and F 0 . For all ϕ: if F |= ϕ, then F 0 |= ϕ. If C is a
characterizable class of frames and F ∈ C, then F 0 ∈ C.
Below is one example of a p-morphism, showing that the property of antisymmetry is not modally
definable. The frame on the left is antisymmetric, but the frame on the right is not: in fact it is symmetric.
The dotted lines show one possible p-morphism from the left frame to the right one. By means of the
corollary above we can then conclude that antisymmetry is not modally definable.
[Diagram: on the left, an antisymmetric frame with worlds w, v, t, and u; on the right, a symmetric frame with two worlds a and b that are accessible from each other. The dotted lines indicate a p-morphism f from the left frame onto the right one.]
4.5 Exercises
1. Give a bisimulation between the following two models such that w and a become bisimilar.
[Diagram: two models; the left one has worlds w, v (p), u (p), x (q), y (q), and z (p), the right one has worlds a, b (p), c (q), and d (p), with accessibility arrows as in the original figure.]
2. Prove the correspondence theorems for (a) disconnected, (b) serial, (c) symmetric, (d) preorder.
3. Which formula characterizes the class of frames where there are no three worlds w, v, and u such
that wRv and vRu?
4. Give a model which is not based on a transitive frame, on which the formula 2p → 22p is valid.
Give a different model based on the same frame where that formula is not true.
5. Show that (a) every reflexive frame is serial and dense, (b) every well-founded frame is irreflexive
and antisymmetric, (c) reflexive frames are Euclidean if, and only if, they are symmetric and
transitive.
6. (a) Prove that 2ϕ ↔ 22ϕ holds on all reflexive transitive frames. (b) Give a formula that characterizes the class of reflexive transitive frames. (c) Show that the formula in (a) is not such a
formula.
7. Show by means of a simple infinite frame that transitivity plus irreflexivity is not characterizable.
8. To get a feeling for what generated subframes look like it is useful to prove the following:
(a) Show that Ww is the smallest subset of W such that the following holds:
(i) w ∈ Ww ; (ii) for all v ∈ Ww : if vRu, then u ∈ Ww .
(b) Show that if R is transitive, then Ww = {v ∈ W | w = v or wRv} and if R is reflexive also
Ww = {v ∈ W | wRv}.
(c) Show that if v ∈ Ww , then Wv = (Ww )v .
9. Prove Corollary 1.
10. Prove Corollary 3.
11. Give an example of a non-reflexive frame and a reflexive generated subframe. What can we prove
on that basis?
12. Give an example of a p-morphism showing that partial ordering is not modally definable.
13. Can you give an example of an irreflexive frame that has a reflexive frame as its p-morphic image?
What does that prove?
5 Basic Modal Logic II: Proof theory
5.1 Hilbert system
The standard way of describing a proof system for modal logic is with a so-called Hilbert system, named
after the German mathematician David Hilbert. In these systems there are only two inference rules and a
variety of axioms. Although those systems are very easy to state, they are nearly impossible to use. For
the interested reader, a statement of the Hilbert system for the basic modal logic is given here.
This system is built on the basis of a Hilbert-system for propositional logic, which has one inference
rule called Modus Ponens (which is the same as Elim →) and a set of axiom schemata (meaning that
we can substitute any propositional formula for the variables ϕ, ψ, and χ). Given the fact that every
propositional formula can be rewritten using only → and ¬, the three axiom schemata below suffice.
(Modus Ponens) From ϕ → ψ and ϕ, infer ψ.
(Axiom 1)       ϕ → (ψ → ϕ)
(Axiom 2)       (ϕ → (ψ → χ)) → ((ϕ → ψ) → (ϕ → χ))
(Axiom 3)       (¬ϕ → ¬ψ) → ((¬ϕ → ψ) → ϕ)
To obtain the basic modal logic K, named after Kripke, we allow modal formulas to be used in the
above system (so that, e.g., 2p → (q → 2p) is an instance of axiom scheme 1), and we add one more
inference rule and one more axiom scheme:
(Necessitation) From ϕ, infer 2ϕ.
(Axiom K)       2(ϕ → ψ) → (2ϕ → 2ψ)
As may be clear from the length of these formulas, making proofs with only these tools available will be
quite tedious. Therefore, we will be using the lesser known natural deduction system for modal logic.
However, in some cases we are merely interested in claiming that some proof system can be given, with
certain properties. In those cases it is convenient that we can restrict ourselves to a proof system of six
lines only.
Factual premises
If we make an assumption that some atomic proposition p is actually true, we could seemingly infer
from this that 2p is true. However, that would be clearly mistaken: the statement “If it is raining, then
it is necessary that it is raining” is not (supposed to be) a tautology. Because of this, we cannot use
Necessitation based on such premises.
This shows that we must formulate the Necessitation rule more clearly. It says that, if we can prove
ϕ, then we can also prove 2ϕ. So it should be understood as follows: from ⊢K ϕ infer ⊢K 2ϕ. The same
applies to Modus Ponens, although there we do not have the same risk of misunderstanding.
5.2 Natural deduction for modal logic
We make use of the Natural Deduction system for modal logic from Garson [7]. To obtain our proof
system, we use Natural Deduction for Propositional Logic (see section 1) and we add four rules. They
involve a special type of assumption, expressed by the 2.
[Schema: within a vertical proof column, a new subproof is opened whose assumption line is simply the symbol 2, below which further formulas may be derived.]
When we make the 2 assumption we, so to say, go into a ‘necessity mode’ in our proof. To make
an analogy with the semantics, the 2 indicates the ‘shift’ from a possible world to the set of all of
its accessible worlds. If we can derive some formula ϕ from the 2 premise, then we can withdraw the
assumption and conclude that 2ϕ (Intro 2). Next to the introduction rule for the 2 there is an elimination
rule for 2. After we have proven that 2ϕ, we can make the assumption 2 and, in the ‘necessity mode’,
infer ϕ. The result of Elim 2 is therefore that ϕ is inferred only under the assumption of 2. There is an
introduction rule for the 3, although we could do without it, because it is derivable from the rules for 2,
using Def 3.
The system K is the natural deduction system defined by the inference rules for Propositional logic
(see section 1) plus the following four inference rules.
Elim 2:   from 2ϕ, open a subproof with assumption 2 and conclude ϕ inside it.
Intro 2:  from a subproof with assumption 2 in which ϕ has been derived, conclude 2ϕ.
Intro 3:  from 3ϕ and a subproof with assumption 2 in which ψ has been derived from the further assumption ϕ, conclude 3ψ.
Def 3:    from 3ϕ conclude ¬2¬ϕ, and from ¬2¬ϕ conclude 3ϕ.
The 2 assumption is really different from other assumptions, including assumptions of the form 2ϕ.
We can use our familiarity with Kripke models to understand this. If we are trying to prove that, in any
arbitrary possible world, ϕ is true, then we might make an assumption that 2ψ is true in that arbitrary
world. But this is different from the reasoning about what is then true in the worlds accessible from that
arbitrary world, which is what the 2 assumption allows us to do. So, when we make that assumption
twice, from the semantic point of view we are considering the worlds accessible ‘in two steps’.
22p            w |= 22p
  2
  2p           wRv ⇒ v |= 2p
    2
    p          wRvRu ⇒ u |= p
Because of the special nature of the 2 assumption, there is an important further rule to take into
consideration. We need to constrain Reiteration in a specific way. The normal cases of reiteration are
still permissible, including when the premise is a modal formula of whatever form.
[Schema: ϕ may be reiterated into a subproof whose assumption is an ordinary formula ψ (allowed), and also into a subproof whose assumption is a modal formula such as 2ψ (allowed).]
But reiteration ‘into’ a subproof is not permissible in case the assumption is 2 (see below, on the
left). That would amount to inferring, from the mere fact that ϕ is true, that ϕ is true in every accessible world, i.e. that 2ϕ is true. In
this respect, the carefulness required here with reiteration is essentially the same as the concerns with the
Necessitation rule in the context of the Hilbert-style proof system. The other limitation for reiteration
also still holds: we cannot reiterate intermediate steps after we dropped an assumption (on the right).
[Schema: on the left, ϕ reiterated into a subproof whose assumption is 2 (not allowed); on the right, a formula ϕ reiterated after the assumption under which it was derived has been dropped (not allowed).]

5.3 Examples
With these inference rules in place, we can make simple natural deductions in the basic modal logic K.
The first example is a proof for ⊢K (2ϕ ∧ 2ψ) → 2(ϕ ∧ ψ).
1     2ϕ ∧ 2ψ                     Assumption
2     2ϕ                          Elim ∧, 1
3     2ψ                          Elim ∧, 1
4       2                         2
5       ϕ                         Elim 2, 2
6       ψ                         Elim 2, 3
7       ϕ ∧ ψ                     Intro ∧, 5, 6
8     2(ϕ ∧ ψ)                    Intro 2, 7
9   (2ϕ ∧ 2ψ) → 2(ϕ ∧ ψ)          Intro →, 1, 8
If we assume that ϕ is necessary and ψ is necessary as well, then if we start to reason about what is
necessarily the case, we can infer that ϕ is, and ψ is, and therefore ϕ ∧ ψ is. Consequently, we draw the
conclusion that necessarily, ϕ ∧ ψ.
The next example presents a derivation of the K-axiom from the Hilbert-style proof system.
1     2(ϕ → ψ)                     Assumption
2       2ϕ                         Assumption
3         2
4         ϕ → ψ                    Elim 2, 1
5         ϕ                        Elim 2, 2
6         ψ                        Elim →, 4, 5
7       2ψ                         Intro 2, 6
8     2ϕ → 2ψ                      Intro →, 2, 7
9   2(ϕ → ψ) → (2ϕ → 2ψ)           Intro →, 1, 8
Observe that these proofs do not involve reiteration in the context of the 2 assumptions. We only
introduce a fact χ inside the 2 context when 2χ has been proven outside of that context. A third example of
a 2 ‘distribution rule’ is the following. It proves that 2ϕ ∨ 2ψ `K 2(ϕ ∨ ψ), using the elimination rule
for disjunction.
1    2ϕ ∨ 2ψ                       Premise
2      2ϕ                          Assumption
3        2
4        ϕ                         Elim 2, 2
5        ϕ ∨ ψ                     Intro ∨, 4
6      2(ϕ ∨ ψ)                    Intro 2, 5
7      2ψ                          Assumption
8        2
9        ψ                         Elim 2, 7
10       ϕ ∨ ψ                     Intro ∨, 9
11     2(ϕ ∨ ψ)                    Intro 2, 10
12   2(ϕ ∨ ψ)                      Elim ∨, 1, 6, 11
As for the 3, in fact we can do without it, since it can be treated as a defined operator. However, here
is a short proof of 3(ϕ ∧ ψ) `K 3ϕ ∧ 3ψ, using the 3 introduction rule.
1    3(ϕ ∧ ψ)                      Premise
2      2
3      ϕ ∧ ψ                       Assumption
4      ϕ                           Elim ∧, 3
5    3ϕ                            Intro 3, 1, 4
6      2
7      ϕ ∧ ψ                       Assumption
8      ψ                           Elim ∧, 7
9    3ψ                            Intro 3, 1, 8
10   3ϕ ∧ 3ψ                       Intro ∧, 5, 9
The only way we can validly derive a formula 3ϕ is in case we have first established that something
is possible and, secondly, that necessarily, if that possible something is true, then ϕ is also true. Hence,
if we assume that it is possible that ϕ ∧ ψ, we can derive that, necessarily (assumption 2), if ϕ ∧ ψ is the
case (assumption 3), ϕ is also the case. And therefore we can then conclude that it is possible that ϕ is
the case.
The rule for defining 3 can be used to prove the following: 3¬ϕ `K ¬2ϕ.
1     3¬ϕ                          Premise
2     ¬2¬¬ϕ                        Def 3, 1
3       2ϕ                         Assumption
4         2
5         ϕ                        Elim 2, 3
6           ¬ϕ                     Assumption
7           ⊥                      Elim ¬, 5, 6
8         ¬¬ϕ                      Intro ¬, 6, 7
9       2¬¬ϕ                       Intro 2, 4, 8
10      ¬2¬¬ϕ                      Reit, 2
11      ⊥                          Elim ¬, 9, 10
12    ¬2ϕ                          Intro ¬, 3, 11
The proof for 2¬ϕ `K ¬3ϕ is shorter, and proceeds by means of the assumption that 3ϕ. The proofs
in the other directions, from negated modalities to the dual modality of a negation, are also shorter.
They can both be formulated with assuming the contrary and using negation introduction and the double
negation rule in the last steps.
We can use derivations such as these to form derived rules, which are, in effect, shortcuts in a derivation. That is, whenever we have proven 3¬ϕ, for some modal formula ϕ, we can continue the
derivation with the above steps 2–12 to reach ¬2ϕ, and then continue the derivation from there. So, instead,
we can shortcut the derivation, going immediately from 1 to 12 and leaving out the steps 2–11.
As a final example, the following is a rather lengthy demonstration of the correctness of the 3 introduction rule. Or, more precisely, it is a derivation proving that 2(ϕ → ψ), 3ϕ `K 3ψ. The derivation
uses two, permissible, instances of the reiteration rule.
1     2(ϕ → ψ)                     Premise
2     3ϕ                           Premise
3     ¬2¬ϕ                         Def 3, 2
4       2¬ψ                        Assumption
5         2
6         ¬ψ                       Elim 2, 4
7         ϕ → ψ                    Elim 2, 1
8           ϕ                      Assumption
9           ψ                      Elim →, 7, 8
10          ¬ψ                     Reit, 6
11          ⊥                      Elim ¬, 9, 10
12        ¬ϕ                       Intro ¬, 8, 11
13      2¬ϕ                        Intro 2, 5, 12
14      ¬2¬ϕ                       Reit, 3
15      ⊥                          Elim ¬, 13, 14
16    ¬2¬ψ                         Intro ¬, 4, 15
17    3ψ                           Def 3, 16

5.4 Soundness
A proof system is called ‘sound’, in relation to the semantics, if it only allows us to make derivations that
are semantically in order, i.e., only derivations in which the truth of the premises guarantees the truth
of the conclusion. The logic K is sound (in relation to the Kripke model semantics). The proof of this
is rather standard as long as we restrict ourselves to the Hilbert system, but for our natural deduction
system we need some more detail.
Because of the 2 in the derivations, we cannot simply define ∆ |= ϕ in the usual manner: ∆ is not
just a conjunction of premises, but rather a sequence of premises and boxes. In the schema below, the
premises at the point where we infer χ are: ∆ = ϕ, 2, ψ, 2.
    ϕ
       2
       ψ
          2
          .
          .
          χ                  ∆ = ϕ, 2, ψ, 2        ∆ `K χ
Accordingly, in order to state the soundness theorem, we first need a concept of ‘truth of the premises
∆’ in a world and a model. The main obstacle is that there is no semantics for the 2 assumption, as it
stands. However, the 2 has a natural interpretation in the context of the proof, as was already illustrated
in the demonstration of the inference rules. This allows us to define a semantic counterpart for it.
Here is an inductive definition for truth of a premise sequence ∆ in a world and a model. We decompose the sequence of premises and 2 assumptions, starting at the right hand side and working our way
toward the left hand side.
M, w |= ∆, ϕ   ⇔   M, w |= ϕ and M, w |= ∆;
M, w |= ∆, 2   ⇔   ∃v, vRw and M, v |= ∆.
To some extent, we can read the 2 assumption in the semantics as “the present world is accessible from
a world where the preceding sequence is true”.
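The two clauses can be implemented directly. The Python sketch below is an illustration, not part of the notes: a premise sequence is a list whose entries are formulas or the marker BOX, its truth is computed from the right as in the definition, and the empty sequence is taken to be true everywhere (an assumption the text leaves implicit). The helper holds is a minimal stand-in covering only atoms and boxes.

    BOX = "BOX"   # the special 2 entry in a premise sequence

    # A small model, used only for illustration.
    R = {("w", "v")}
    V = {"w": {"p"}, "v": {"q"}}

    def holds(world, phi):
        """Minimal evaluator: atoms and ('box', A) only (enough for the example)."""
        if isinstance(phi, str):
            return phi in V[world]
        if phi[0] == "box":
            return all(holds(y, phi[1]) for (x, y) in R if x == world)
        raise ValueError(phi)

    def holds_seq(world, delta):
        """M, w |= Delta, following the inductive definition above."""
        if not delta:                       # the empty sequence is true everywhere
            return True
        *rest, last = delta
        if last == BOX:                     # there is a v with vRw and v |= rest
            return any(holds_seq(x, rest) for (x, y) in R if y == world)
        return holds(world, last) and holds_seq(world, rest)

    # Delta = p, 2, q is true at v: v satisfies q, and v is accessible
    # from w, which satisfies p.
    print(holds_seq("v", ["p", BOX, "q"]))  # True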
Now, for soundness:
Theorem 6 (Soundness Theorem). Every inference in K is generally valid.
∆ `K ϕ ⇒ ∀M∀w(M, w |= ∆ ⇒ M, w |= ϕ)
Proof. The proof is an induction on the length of a proof. We need to prove that every inference rule is
generally valid. We do so for selected cases.
(Elim →) If ∆ `K ϕ → ψ and ∆ `K ϕ, then ∆ `K ψ. We assume that ∆ |= ϕ → ψ and ∆ |= ϕ, and we
prove that ∆ |= ψ. Consider an arbitrary world w such that w |= ∆. Then, by assumption, w |= ϕ → ψ
and w |= ϕ. The definition of truth of sequences tells us that w |= ϕ → ψ. It now follows immediately
from the semantic clause for implication that w |= ψ.
(Elim 2) If ∆ `K 2ϕ, then ∆, 2 `K ϕ. We assume that ∆ |= 2ϕ and prove that ∆, 2 |= ϕ. Consider
an arbitrary world v such that v |= ∆, 2. By the definition of truth of a sequence, there is some world w
such that wRv and w |= ∆. Then, by our assumption, w |= 2ϕ. The Truth definition, together with the
fact that wRv, now implies that v |= ϕ.
(Intro 2) If ∆, 2 `K ϕ, then ∆ `K 2ϕ. Suppose this is not sound. So, we assume that ∆, 2 |= ϕ, but
there is some world w in some model, such that w |= ∆ and w 6|= 2ϕ. Hence, there is some world v such
that wRv and v 6|= ϕ. Then, according to the definition of truth of a sequence of formulas above, v |= ∆, 2.
After all, there is w such that wRv and w |= ∆. But, by the fact that ∆, 2 |= ϕ, it must now be true that
v |= ϕ. That contradicts our assumption that v 6|= ϕ.
(Intro 3) If ∆ `K 3ϕ and ∆, 2, ϕ `K ψ, then ∆ `K 3ψ. We assume that ∆ |= 3ϕ and ∆, 2, ϕ |= ψ. We
prove that ∆ |= 3ψ. Consider an arbitrary w such that w |= ∆. Then, by assumption, w |= 3ϕ. According
to the Truth definition, this implies that there is a v such that wRv and v |= ϕ. Also, in view of the
definition of truth of a sequence, it is then true that v |= ∆, 2 and, combining these things, v |= ∆, 2, ϕ.
Then, by our other assumption, v |= ψ. Hence, there is a world v accessible from w, such that v |= ψ. The
Truth definition now tells us that w |= 3ψ.
The reverse of the soundness theorem is called ‘completeness’, because it says that the proof system
is complete with respect to the semantics: every argument that is semantically valid can be derived with
the proof system. This theorem will be discussed later.
5.5 Adding extra rules or axioms: the diversity of modal logics
A major difference between predicate logic and modal logic, in practice, is that there is a plurality of
modal logics, that are all obtained from the basic modal logic K with additional rules or axioms. That
these are ‘different logics’ is debatable, as also van Benthem [15] explains: in effect they are the same
logic but we make use of an extra inference rule. Also, they have the same semantics, but with extra
restrictions on the models.
Just as modal logic is used to reason about necessity, time, knowledge, action, and more, so predicate
logic is used to reason about physical objects, people, events, spatial locations, and more. Yet, we do not
talk about ‘different predicate logics’, but only about different domains of objects and predicate logical
theories about those domains (i.e., sets of formulas valid on those domains). For instance, the statement
that x and y are in the same location may be false for all physical objects, but not false for all events. In
essence modal logic is no different. Instead of claiming that “the logic of knowledge is K plus inference
rules X” (see van Ditmarsch et al. [16]), we might also say that inference rules X constitute a theory
about the domain of knowledge: as long as 2 is understood as the epistemic modality, those inference
rules are acceptable to us. As a matter of historical fact, though, the different modal logics were invented
before Kripke semantics showed how we could standardize all of them. This explains why modal logic
is sometimes presented as comprising a diversity of logics, unlike predicate logic.
Nevertheless, there are clearly differences with respect to the natural assumptions for certain modalities. For instance, it is natural to think that what is necessarily the case is actually the case. But it is not
natural at all to think that what is the case always in the future is the case now, or what is a necessary
outcome of executing a certain program is also true prior to executing that program. This example gives
a hint to the kind of additional rules that we might add. The statement that whatever is necessary is true
is represented by the formula 2ϕ → ϕ that, as we saw in the previous section, modally defines the class
of reflexive frames. We may therefore add this formula as an axiom scheme, or add 2ϕ ` ϕ as a rule
of inference. And so, likewise, we can add other frame-class defining formulas as rules or axioms. The
logic we then obtain also corresponds to a class of frames.
Name    Logic                Corresponding frame class
K       K                    All frames
D       K + 2ϕ → 3ϕ          Serial frames
T       K + 2ϕ → ϕ           Reflexive frames
B       K + 32ϕ → ϕ          Symmetric frames
4       K + 2ϕ → 22ϕ         Transitive frames
5       K + 3ϕ → 23ϕ         Euclidean frames
S4      T + 2ϕ → 22ϕ         Preordered frames
S5      S4 + 32ϕ → ϕ         Equivalence frames
An alternative name for S4 is KT4; alternative names for S5 are KTB4, KT45 and KT5. (To understand why, answer exercise 5(c) of section 4.) Many more combinations are possible, considering
Proposition 2 in section 4, which states that any combination of frame class characterizing formulas
modally defines the intersection of those frame classes. So for instance the logic KD45 corresponds to
the class of all serial, transitive and euclidean frames.
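Each of the frame conditions in the table is a simple first-order property of the accessibility relation, so on a finite frame it can be tested directly. The following Python sketch is an illustration only; the three-world frame at the end is a hypothetical example.

    def serial(W, R):
        return all(any((w, v) in R for v in W) for w in W)

    def reflexive(W, R):
        return all((w, w) in R for w in W)

    def symmetric(W, R):
        return all((v, w) in R for (w, v) in R)

    def transitive(W, R):
        return all((w, u) in R for (w, v) in R for (x, u) in R if v == x)

    def euclidean(W, R):
        # if wRv and wRu then vRu
        return all((v, u) in R for (w, v) in R for (x, u) in R if w == x)

    # A hypothetical three-world frame with the total relation.
    W = {1, 2, 3}
    R = {(a, b) for a in W for b in W}

    # The total relation is an equivalence relation, so the frame lies in the
    # frame class of S5 and also in that of KD45 (serial + transitive + euclidean).
    print(reflexive(W, R), symmetric(W, R), transitive(W, R),
          euclidean(W, R), serial(W, R))   # True True True True True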
These correspondence results lead to the formulation of soundness theorems for the different logics:
Theorem 7 (Soundness for L). For all of the modal logics L defined above, L is sound with respect to its
corresponding frame class CL .
Φ `L ψ ⇒ Φ |=CL ψ
See section 3.3 for the definition of the validity notion on the right hand side of this implication.
The additions to the logic can naturally be understood as additional axiom schemes, in the Hilbert
style. But we can also easily understand them as inference rules:
D                 T                 B                 4                 5

2ϕ                2ϕ                32ϕ               2ϕ                3ϕ
......            ......            ......            ......            ......
3ϕ                ϕ                 ϕ                 22ϕ               23ϕ
A few more examples will illustrate the resulting logics. First, a proof in the system 5 of
`5 3ϕ → 223ϕ. Second, a simple and useful proof of ϕ `T 3ϕ, which comes in handy as a derived
rule: given the T rule, so assuming reflexivity, if ϕ is true, then there is a possible world where ϕ is true.

1     3ϕ                           Assumption
2     23ϕ                          Rule 5, 1
3       2
4       3ϕ                         Elim 2, 2
5       23ϕ                        Rule 5, 4
6     223ϕ                         Intro 2
7   3ϕ → 223ϕ                      Intro →, 1, 6

1   ϕ                              Premise
2     2¬ϕ                          Assumption
3     ¬ϕ                           Rule T, 2
4     ϕ                            Reit, 1
5     ⊥                            Elim ¬, 3, 4
6   ¬2¬ϕ                           Intro ¬, 2, 5
7   3ϕ                             Def 3, 6
The second derivation above shows that we can always infer 3ϕ from ϕ if we may use Rule T.
Given a derivation of some formula ϕ, we can continue the derivation along the steps 2-7 and arrive at
3ϕ. So we can also add a derived rule: if `T ϕ, then `T 3ϕ.
For a last example, we prove Rule B in system KT5. In semantic terms, the following derivation
illustrates that every frame that is reflexive and Euclidean is also symmetric. We use the derived rule for
T proven above. We also use two rules derived with the help of Def 3. The first of these has been proven
earlier, and the second one is an exercise.
1     32ϕ                          Assumption
2       ¬ϕ                         Assumption
3       3¬ϕ                        Rule T (derived), 2
4       23¬ϕ                       Rule 5, 3
5         2
6         3¬ϕ                      Elim 2, 4
7         ¬2ϕ                      Def 3 (derived), 6
8       2¬2ϕ                       Intro 2, 7
9       ¬32ϕ                       Def 3 (derived), 8
10      ⊥                          Elim ¬, 1, 9
11    ¬¬ϕ                          Intro ¬, 2, 10
12    ϕ                            Double ¬, 11
13  32ϕ → ϕ                        Intro →
5.6 Exercises
1. For additional understanding of natural deduction, consider the exercises in section 1.
2. Give proofs for the following statements in K.
(a) 2p `K 2(p ∨ q);
(b) 2(ϕ ∧ ψ) `K 2ϕ ∧ 2ψ;
(c) 2(ϕ ∨ ψ), 2(ϕ → χ), 2(ψ → χ) `K 2χ;
(d) 2(ϕ → ψ), 2¬ψ `K 2¬ϕ.
3. In this section there is a derivation showing that `K 3¬ϕ ⇒ `K ¬2ϕ is an admissible rule. In
other words, we may use this fact to legitimate the inference rule from 3¬ϕ to ¬2ϕ. Show that
`K 2¬ϕ ⇒ `K ¬3ϕ is also an admissible rule, using the hints in the text. Using Intro 3, Def 3,
and these two additional admissible rules, give derivations for
(a) 2¬2ϕ `K 23¬ϕ;
(b) 3¬3ϕ `K 32¬ϕ;
(c) (3ϕ ∨ 3ψ) `K 3(ϕ ∨ ψ);
(d) 3(ϕ ∨ ψ) `K (3ϕ ∨ 3ψ).
4. Give derivations for
(a) 232ϕ `KTB ϕ;
(b) 2ϕ `S4 ϕ ∧ 22ϕ;
(c) `S4 2ϕ ↔ 22ϕ;
(d) `KD5 2ϕ → 23ϕ.
5. Prove that `K ϕ → ψ ⇒ `K 2ϕ → 2ψ is a derived rule in the logic K, (a) using the Hilbert system;
(b) using the Natural Deduction system.
6. Prove that `K 2(ϕ → ψ) ⇒ `K 3ϕ → 3ψ can be derived in the Hilbert system, defining 3 as
¬2¬, and using the derived rule that `K ϕ → ψ ⇒ `K ¬ψ → ¬ϕ for all modal formulas ϕ and ψ.
7. Prove, on the basis of the soundness of K, that the following logics are also sound: (a) T; (b) B;
and (c) S5. That is, prove that the additional inference rules only allow us to derive conclusions
that are valid on all frames in the corresponding class of frames, in the manner of the proof of
Theorem 6.
8. In the proof of Theorem 6 the step for Def 3 is skipped. Write down the proof-part for that
inference rule (both versions).
6 Completeness
Soundness and completeness theorems link the syntax and semantics of modal logics, by providing a
correspondence between derivability (`) and validity (|=). Soundness means that we can only derive
conclusions that are valid (on the class of frames for that logic). Completeness means that we can derive
everything that is valid. In other words, what can be expressed in the language, and is generally valid (or
valid on the class of frames), can be derived using the proof system. The general outline of the proof of
this main theorem in modal logic will be given below, in the section on canonical models.
Theorem 8 (Completeness theorem).
ϕ is valid on all frames                  ⇒   `K ϕ
ϕ is valid on all reflexive frames        ⇒   `T ϕ
ϕ is valid on all transitive frames       ⇒   `4 ϕ
ϕ is valid on all preordered frames       ⇒   `S4 ϕ
ϕ is valid on all equivalence frames      ⇒   `S5 ϕ
Thus for these logics derivability is connected to a frame property in an elegant way. Because of the
correspondence theorems we also know that these classes of frames can be characterized by one single
formula, e.g. 2ϕ → ϕ in case of the reflexive frames, the formula that is the characteristic inference rule
(or axiom) of T.
Every modal logic has one special model that is in some sense as general as possible. It is close
to the syntax of the logic because its worlds are sets of formulas. This model is called the canonical
model. Its importance stems from the fact that from the existence of such a model one can in some cases
easily prove the completeness of the logic in question. We will do so at the end of this section. We will
consider the canonical model in detail for the logic K and later comment on its construction for other
modal logics. Some definitions first.
Definition 13 (Maximally consistent set). Given a modal logic L, a set of formulas ∆ is L-consistent if
one cannot derive a contradiction from it, i.e. if ⊥ cannot be inferred from it, in the proof system for L.
A set of formulas ∆ is called maximally L-consistent if it is L-consistent and for every formula ϕ, either
ϕ belongs to the set or ¬ϕ does.
We will mainly work with K in this section, therefore the K-part is often omitted, so consistent means
K-consistent, and so on. A simple but important observation:
Proposition 6. If a set of formulas is true on a model (for L), in a world, then it is consistent.
Proof. For if not, ϕ ∧ ¬ϕ could be derived from it, for some ϕ. But then, by soundness, ϕ ∧ ¬ϕ would have
to hold in the model, which cannot be.
Because of this, the set {p, 2q} clearly is consistent, as there are models in which both the formulas
hold. The same argument applies to the set
{p, ¬2p, 22p, ¬222p, 2222p, . . . }.
Obviously, the set {ϕ, ¬ϕ} is not consistent, as it derives ϕ ∧ ¬ϕ. Also the set {2(ϕ → ψ), 2(⊤ → ϕ), 3¬ψ}
is inconsistent, since 2ψ ∧ ¬2ψ follows from it. The set {p, 2q} is not maximally consistent since neither
q nor ¬q belongs to the set (and the same goes for many other formulas).
Lemma 1 (Lindenbaum lemma). Every consistent set of formulas can be extended to a maximal consistent set of formulas.
That this lemma is true can be shown by means of the following method for constructing a maximal
consistent set Γ out of any possible (non-maximal) consistent set ∆. This construction begins by choosing
an enumeration of all the formulas in the modal language. Note that, with a countably infinite set of
atomic propositions there is a countably infinite set of modal formulas (you can try proving this yourself
in the exercises). We then extend ∆ step by step, considering each formula in the language along the
way. We begin with ∆ itself.
Γ0     =   ∆
Γn+1   =   Γn ∪ {ϕn+1 },   if Γn ∪ {ϕn+1 } is consistent;
Γn+1   =   Γn ,            otherwise.

Finally, we define the maximal consistent set as the ‘end point’ of this construction: Γ = ⋃n∈N Γn .
Proof. You are asked to prove that the result of this construction method is a maximal consistent set,
containing the original consistent set. That proves the Lindenbaum lemma.
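Read as an algorithm, the construction takes an enumeration of formulas and a consistency test and extends ∆ step by step. The Python sketch below (not part of the notes) mirrors the stages Γn; the consistency oracle is a hypothetical stand-in (for modal logic proper it would require a theorem prover), replaced here by a toy test so that the example runs.

    def lindenbaum(delta, enumeration, is_consistent):
        """Extend the consistent set delta along the given (finite, for this sketch)
        enumeration of formulas, exactly as in the construction of the Gamma_n."""
        gamma = set(delta)                     # Gamma_0 = Delta
        for phi in enumeration:                # consider phi_1, phi_2, ...
            if is_consistent(gamma | {phi}):   # Gamma_{n+1} = Gamma_n plus phi_{n+1}
                gamma = gamma | {phi}
            # otherwise Gamma_{n+1} = Gamma_n: leave gamma unchanged
        return gamma                           # the 'end point' of the construction

    # Toy illustration: formulas are strings, and a set counts as 'consistent'
    # here when it contains no formula together with its negation (a stand-in only).
    def toy_consistent(s):
        return not any(("not " + a) in s for a in s)

    print(sorted(lindenbaum({"p"}, ["q", "not p", "not q"], toy_consistent)))
    # ['p', 'q']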
Examples of maximal consistent sets are a bit harder to describe. The typical example is the following. Given a possible world w in a model, the set of formulas L = {ϕ | w |= ϕ} is a maximal consistent set.
That it is consistent is clear, as it has a model (and ⊥ is not true in any world). That it is also maximal in
this respect follows from the fact that for any formula ϕ, either w |= ϕ or w |= ¬ϕ, and thus either ϕ ∈ L
or ¬ϕ ∈ L. Thus we see that worlds in a Kripke model naturally correspond to maximally consistent sets
of formulas. This is the guiding idea behind the canonical model.
One more observation on the correspondence between worlds and maximally consistent sets of formulas. Given that wRv holds in a model, then for the sets
Lw = {ϕ | w |= ϕ}
Lv = {ϕ | v |= ϕ},
it holds that 2ϕ ∈ Lw implies ϕ ∈ Lv , for all formulas ϕ. This immediately follows from the truth
definition.
We are ready for the definition of a canonical model.
Definition 14. The K-canonical model is the Kripke model MK = hWK , RK , VK i, where
1. WK = {Γ | Γ is a maximally K-consistent set of formulas},
2. ΓRK ∆ ⇔ ∀ϕ (2ϕ ∈ Γ ⇒ ϕ ∈ ∆),
3. p ∈ V(Γ) ⇔ p ∈ Γ, for atomic propositions p.
Thus the canonical model consists of all maximally consistent sets, with arrows between them at the
appropriate places (think of the remark on Lw and Lv above). As explained above, for every world w in
a model, the set {ϕ | w |= ϕ} is maximally K-consistent. Thus one could view the canonical model as
containing all possible Kripke models together, and putting arrows between two sets {ϕ | w |= ϕ} and
{ϕ | v |= ϕ} if for all 2ψ ∈ {ϕ | w |= ϕ} we have ψ ∈ {ϕ | v |= ϕ}.
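The arrow condition itself is nothing more than a set-inclusion test: Γ RK ∆ holds when every formula boxed in Γ occurs, unboxed, in ∆. A minimal Python illustration (small finite sets stand in for the infinite maximally consistent sets):

    def canonical_arrow(gamma, delta):
        """Gamma R_K Delta iff {phi : ('box', phi) in Gamma} is a subset of Delta."""
        boxed = (psi[1] for psi in gamma
                 if isinstance(psi, tuple) and psi[0] == "box")
        return all(phi in delta for phi in boxed)

    # Toy fragments of two 'worlds' of the canonical model.
    gamma = {("box", "p"), ("box", ("box", "q")), "r"}
    delta = {"p", ("box", "q"), "s"}

    print(canonical_arrow(gamma, delta))   # True: p and 2q both belong to delta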
The next step is then to reduce the truth of a formula in a maximal consistent set to membership
of that set, which is the content of the truth lemma. This lemma is rather involved, because we have
to consider all possible formulas. Therefore, we first present the valuation lemma, which functions as
an intermediate step. In this lemma we establish several dependencies for membership of a maximal
consistent set.
Lemma 2 (Valuation lemma). For any maximal consistent set Γ, the following are true.
1. Γ is deductively closed: Γ `K ϕ implies that ϕ ∈ Γ;
2. ϕ ∈ Γ if, and only if, ¬ϕ ∉ Γ;
3. ϕ ∧ ψ ∈ Γ if, and only if, ϕ ∈ Γ and ψ ∈ Γ;
4. 2ϕ ∈ Γ if, and only if, (ϕ ∈ ∆ for all ∆ such that ΓRK ∆).
The lemma could be extended with other connectives from propositional logic, in an obvious way.
That is left to the reader as an exercise. The propositional cases are simple, but the case for necessity is
more involved.
Proof. The four statements are proven below.
1. Because Γ is maximally consistent, either ϕ ∈ Γ or ¬ϕ ∈ Γ. Given that Γ `K ϕ, if ¬ϕ ∈ Γ, then
also Γ `K ¬ϕ and so Γ `K ⊥ (by the inference rule ‘negation elimination’), which would contradict
consistency of Γ. Therefore, it can only be the case that ϕ ∈ Γ.
2. ⇒: given ϕ ∈ Γ, if ¬ϕ ∈ Γ, then Γ `K ⊥ (elim ¬), and so Γ is not consistent. Thus, ¬ϕ ∉ Γ.
⇐: since Γ is maximally consistent, either ϕ ∈ Γ or ¬ϕ ∈ Γ. Therefore, if ¬ϕ ∉ Γ, then ϕ ∈ Γ.
3. ⇒: if ϕ ∧ ψ ∈ Γ, then Γ `K ϕ and Γ `K ψ (elim ∧). By deductive closure, ϕ ∈ Γ and ψ ∈ Γ.
⇐: if Γ `K ϕ and Γ `K ψ, then Γ `K ϕ ∧ ψ (intro ∧). Therefore, by deductive closure, ϕ ∧ ψ ∈ Γ.
4. ⇒: follows immediately from the definition of RK .
⇐: suppose that 2ϕ ∉ Γ; we prove that there is some ∆ such that ΓRK ∆ and ϕ ∉ ∆.
Consider the set Ψ = {ψ | 2ψ ∈ Γ}. Either Ψ ∪ {¬ϕ} is consistent, or it is not. If it is not consistent, then
Ψ `K ϕ, for otherwise we could not derive ⊥ by adding ¬ϕ to Ψ. But then from Ψ2 = {2ψ | 2ψ ∈ Γ} we
could derive Ψ2 , 2 `K ϕ (using 2 elimination), and so Ψ2 `K 2ϕ (by 2 introduction). Since Ψ2 ⊆ Γ, we
know that also Γ `K 2ϕ and so, by deductive closure of Γ (see the first item), 2ϕ ∈ Γ, which contradicts
our assumption that 2ϕ ∉ Γ. So Ψ ∪ {¬ϕ} cannot be inconsistent, hence it is consistent. Given that
Ψ ∪ {¬ϕ} is consistent, it also has a maximal consistent extension (by the Lindenbaum lemma), and since
¬ϕ is in the set, ϕ is not. This maximal consistent set is now our ∆. The definition of the canonical model
guarantees that ΓRK ∆, since Ψ ⊆ ∆, and it has already been established that ϕ ∉ ∆.
Now we can formulate the truth lemma, and prove it. This lemma crucially establishes that truth of a
formula in a ‘world’ in the canonical model comes down to being a member of that maximal consistent
set.
Lemma 3 (Truth lemma). For any maximally K-consistent set of formulas Γ (that is, for any world in
the canonical model), for any formula ϕ:
MK , Γ |= ϕ ⇔ ϕ ∈ Γ.
Note that here MK , Γ |= ϕ means that in the canonical model, in ‘world’ Γ, formula ϕ is true. The
proof for this lemma is by induction on the complexity of the formula, just as in the proof for the
bisimulation theorem (Theorem 4). So we prove that the lemma is correct for atomic propositions, and
then we prove that no way of making a formula more complex (adding ¬, ∧, 2 or another connective)
poses a problem: if it is correct for p and for q, then also for ¬p, p ∧ q, and 2q, and so on; and therefore
also for ¬(p ∧ q), ¬p ∧ 2q, and so on. The induction hypothesis thus says that, given formulas ψ and χ
of arbitrary complexity, if the truth lemma is correct for those formulas, then also for their negation, and
for their conjunction, and for their ‘necessitation’, and so on.
Proof. (Lemma 3) By induction on the complexity of the formula.
Basic case: Suppose ϕ = p, for some atomic proposition. From the definition of the canonical model
we know that p ∈ Γ if, and only if, p ∈ VK (Γ). That latter fact, by the truth definition, is equivalent to
MK , Γ |= p. So p ∈ Γ is equivalent to MK , Γ |= p.
Negation: Suppose ϕ = ¬ψ. From the valuation lemma we know that ¬ψ ∈ Γ if, and only if, ψ ∉ Γ.
By the induction hypothesis, ψ ∉ Γ is equivalent to MK , Γ 6|= ψ. And, according to the truth definition,
that is equivalent to MK , Γ |= ¬ψ. Hence, ¬ψ ∈ Γ is equivalent to MK , Γ |= ¬ψ.
Conjunction: Suppose ϕ = ψ ∧ χ. From the valuation lemma we know that ψ ∧ χ ∈ Γ if, and only if,
ψ ∈ Γ and χ ∈ Γ. By the induction hypothesis, that is equivalent to MK , Γ |= ψ and MK , Γ |= χ, respectively.
Lastly, applying the truth definition, this is equivalent to MK , Γ |= ψ ∧ χ. Therefore, ψ ∧ χ ∈ Γ if, and
only if, MK , Γ |= ψ ∧ χ.
Necessity: Suppose ϕ = 2ψ. By the valuation lemma, 2ψ ∈ Γ is equivalent to (*) ψ ∈ ∆ for every
∆ such that ΓRK ∆. But, by the induction hypothesis, (*) is in turn equivalent to (+) MK , ∆ |= ψ for all ∆
accessible from Γ. Then, by the truth definition, (+) is equivalent to MK , Γ |= 2ψ. So, all in all, 2ψ ∈ Γ
is equivalent to MK , Γ |= 2ψ.
The proof only mentions negation, conjunction and necessity. These are all we need, in principle,
as we can define the other connectives in terms of only those three. You can try to extend the proof for
those other connectives yourself in the exercises.
Now we are ready to prove the completeness theorem for the basic modal logic K.
Proof. (Theorem 8) The logic K is complete: if |= ϕ (ϕ is valid on all frames), then `K ϕ.
We prove this by contraposition, showing that 0K ϕ implies 6|= ϕ. If 0K ϕ, there is a maximally
consistent set Γ containing ¬ϕ, as the Lindenbaum lemma shows. By the definition of canonical model,
Γ is a world in this model. By the Truth lemma we have that MK , Γ |= ¬ϕ ⇔ ¬ϕ ∈ Γ. And thus
MK , Γ |= ¬ϕ, since ¬ϕ ∈ Γ. Hence there is a Kripke model, namely MK , and a world in it, namely Γ,
where ¬ϕ is true and ϕ is false. Therefore, 6|= ϕ, and that is what we had to show.
The proofs of the completeness theorem for the other logics follow the same pattern as the proof for
K given above. For instance, if we want to prove that the logic S4 is complete, we need to show that
everything that can be derived in that logic is valid on the class of all preordered frames. So we define
the canonical model MS4 in precisely the same way. Then, we add the following lemma:
Lemma 4 (Correspondence lemma).
If the rule T is valid in logic L, then the canonical relation RL is reflexive.
If the rule 4 is valid in logic L, then the canonical relation RL is transitive.
Proof. Given that T is valid, for every maximal consistent set Γ, if 2ϕ ∈ Γ, then Γ `L ϕ and so, by
deductive closure, ϕ ∈ Γ. So, for all ϕ such that 2ϕ ∈ Γ, also ϕ ∈ Γ. Then, by the definition of the
canonical relation for L, ΓRL Γ. This proves that the relation is reflexive.
Suppose that in the canonical model ΓRL ∆RL E. We need to prove that ΓRL E. Given that 4 is valid, if
2ϕ ∈ Γ, then Γ `L 22ϕ and so, by deductive closure, 22ϕ ∈ Γ. Therefore, according to the definition
of the canonical relation, 2ϕ ∈ ∆. And, again by the definition of the canonical relation, ϕ ∈ E. So, for
every ϕ, if 2ϕ ∈ Γ, then ϕ ∈ E. Then, applying the definition of the canonical relation once more, it is
the case that ΓRL E. This is what we had to prove.
So we prove that the canonical model for the logic S4 is a preorder, and we prove completeness in the
same way as before. Hence, we know that, if some formula is not derivable in the proof system S4, then
it is not valid on the class of all preordered frames (because there is a counterexample in the canonical
model for S4).
The book of van Ditmarsch et al. [16] contains a completeness proof for the logic S5, on p. 180 and
further. The accessibility relation ∼ca in the canonical model is then an equivalence relation, and Ka
(with intended meaning “agent a knows that . . . ”) is the necessity operator. The completeness proof is
slightly different, because they define the accessibility relation in such a way that it is immediately an
equivalence relation.
Note that we have not discussed completeness relative to the 2-subproofs in the natural deduction system. We have only proven completeness of `K ϕ with respect to |= ϕ, but not completeness of ∆ `K ϕ
with respect to ∆ |= ϕ. Such an extension is possible, but has been left out of the present discussion.
6.1 Exercises
1. Show that {2(ϕ → ψ), 2ϕ, 3¬ψ} is inconsistent.
2. Show that this set is K-consistent. Is it T-consistent? Why so/not so?
{p, ¬2p, 22p, ¬222p, 2222p, . . .}
3. If 2ϕ ∈ Γ and ¬2ϕ ∈ ∆, is it possible that ΓRS5 ∆? And ∆RS5 Γ? Explain your answer.
4. Given that wRv holds in a model, show that for the sets
Lw = {ϕ | w |= ϕ}
Lv = {ϕ | v |= ϕ}
it holds that 2ϕ ∈ Lw implies that ϕ ∈ Lv , for all formulas ϕ.
5. Finish the proof of the Lindenbaum lemma.
6. Which clauses can be added to the valuation lemma for disjunction ∨, and for implication →?
Give the proofs for those extra clauses, using the proof for ∧ as an example.
7. Which clause can be added to the valuation lemma for possibility 3? Give the proof for this extra
clause, by applying the (already proven) clauses for ¬ and 2.
8. Which clauses could be added to the truth lemma for disjunction ∨, and for possibility 3? Extend
the formula induction in the proof of the truth lemma with clauses for disjunction and possibility.
(Tip: complete exercises 6 and 7 first, and use your answers here.)
9. Extend the correspondence lemma for rule 5 and prove completeness for S5.
10. Show that the logic KB is complete. That is, show that if we add the inference rule `KB 32ϕ ⇒
`KB ϕ, then everything that is valid on the class of symmetrical frames can be derived in the logic
KB.
7 Decidability
By proving soundness and completeness, we have ‘reduced’ the issue of whether a modal formula ϕ is
provable with natural deduction, `K ϕ, to the issue of whether ϕ is generally valid, |= ϕ. Similarly, we
know that ϕ is not provable if we know that there is a frame on which it is not valid, i.e., if there is a
Kripke model with a possible world where ϕ is false. Given that there are infinitely many frames, this
might not be an easy task. However, we can restrict the frames that we have to consider in such a way
that in order to check whether there is a frame that refutes ϕ, we only have to check a finite number of
finite frames, which implies the ‘decidability’ of the logic. This is the content of this section. We will
see that the number of frames we need to check only depends on the size of the formula ϕ.
A logic L is decidable if, and only if, an effective procedure (Turing machine) exists by means of
which for every formula ϕ in the language it can be settled whether ϕ is generally valid in L (i.e., whether
ϕ is a theorem in L). Propositional logic is decidable, because the truth table method is such an effective
procedure. Predicate logic is not decidable. A. Turing, who proved this fact (as well as A. Church), gave
a theoretical definition for ‘effective procedure’, that later came to be called a ‘Turing machine’. Using
this concept, a logic is decidable if there is a Turing machine that will compute the general (in)validity
of every formula.
As we will see, normal modal logics such as K and S5 are decidable. This provides a computational
reason for preferring formal reasoning in modal logic over predicate logic, so, for instance, characterizing transitivity of a frame as 2ϕ → 22ϕ instead of ∀w∀v∀u((wRv ∧ vRu) → wRu). In general,
the more expressive a logic is (see section 4), the less likely it is to be decidable, and the more computationally complex it is. Logicians sometimes describe this as a trade-off between the expressivity and the complexity of a logical
formalism.
A by now common method of proving that a logic is decidable is by means of the finite model
property. A logic L has the finite model property if every formula that is not generally valid has a
finite countermodel; in other words, if the fact that a formula is valid on all finite frames is enough to
conclude that it is valid on all frames, finite or not. When a modal logic has this property, we only need
to check the finite frames to see whether some formula is valid, and it turns out that this is an effective
procedure—also in Turing’s sense.
If a sound and complete logic L has only a finite number of axioms and inference rules and it has the
finite model property, then it is decidable. We know that the modal logics we have considered are all
sound and complete, and they have only a finite number of axioms (in the Hilbert system) and inference
rules (in both the Hilbert and natural deduction systems). Therefore, if we can prove the finite model
property for modal logic L, we can conclude that L is decidable.
7.1 Small models
In order to prove the finite model property, we first prove that to establish validity of a formula in a frame
we only need to check frames of a finite depth, depending on the size of the formula ϕ. Intuitively, we
establish “how far up” we have to inspect the frame in order to establish whether a certain node forces a
formula. It turns out that the number of boxes decides this. First, consider the following example.
Example 4.
[Figure: a model in which w (where ¬p holds) has successors u and v (where p holds in both), and x (where ¬p holds) lies one step further up, accessible from a successor of w but not from w itself.]
To see that w |= 2p it suffices to consider v and u and check whether p is true in them. In other words,
the truth of w |= 2p depends only on the valuation at the successors of w and not on the world x,
which is not a successor of w. If p were true in x, this would not change the truth of w |= 2p,
whereas a change in the valuation of u or v could. On the other hand, for a formula with two boxes, like
22p, whether w |= 22p holds (it does not) depends on the valuation of p in x.
Before we continue we need a definition.
Definition 15. The depth of a frame F is the maximum length of a path from a root of the frame (a
lowest world, a world that is no successor of another world) to the top. Formally: the depth of a frame F
is the maximum number n for which there exists a chain w1 Rw2 R . . . Rwn Rwn+1 in the frame, where all
wi are distinct. Clearly, frames can have infinite depth.
The depth of a world v from a world w is the length of the shortest path from w to v. v is of depth 0
from w when it is equal to w or when it cannot be reached from w by travelling along the arrows.
Let |ϕ| be the size of ϕ, i.e. the number of symbols in it, and let b(ϕ) denote the maximal nesting of
boxes in ϕ. The size of a frame is the number of worlds in it.
Example 5. This frame has depth 2:
[Figure: a frame with worlds w, u, v, x and arrows w → u, w → v and v → x.]
The world x has depth 2 from w and depth 1 from v and depth 0 from x and from u. And this frame has
depth 0:
[Figure: a single world w with a reflexive arrow.]
In this frame there are no worlds with depth > 0.
The maximal nesting of boxes in (22p ∧ 2q) is 2, and in 2(2p → 2(2p ∧ q)) it is 3 (coming
from the box in front of p, and the box in front of the conjunction, and finally the box in front of the
implication). Note that the nesting of boxes in 23p is 2, not 1.
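The nesting b(ϕ) is computed by a straightforward recursion on the formula. The Python sketch below is an illustration only, with formulas encoded as nested tuples; it counts a diamond as a box, since 3 abbreviates ¬2¬, which is exactly why b(23p) = 2.

    def b(phi):
        """Maximal nesting of boxes in phi; diamonds count too, since 3 = ¬2¬."""
        if isinstance(phi, str):                       # atomic proposition
            return 0
        op, *args = phi
        inner = max(b(a) for a in args)
        return inner + 1 if op in ("box", "dia") else inner

    box = lambda a: ("box", a)
    dia = lambda a: ("dia", a)
    conj = lambda a, c: ("and", a, c)
    impl = lambda a, c: ("imp", a, c)

    print(b(conj(box(box("p")), box("q"))))                  # 2
    print(b(box(impl(box("p"), box(conj(box("p"), "q"))))))  # 3
    print(b(box(dia("p"))))                                  # 2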
Returning to the first example, it seems to suggest that to evaluate a formula ϕ in a node w in a model
M, we have to consider only the nodes in M that are of depth ≤ b(ϕ) from w. Here follow two more
examples to support this claim.
First we consider the case that the number of boxes in a formula ϕ is 0, i.e. b(ϕ) = 0. This means that
the formula does not contain boxes. Considering the definition of w |= ϕ, it is not difficult to see that to
establish w |= ϕ for a formula without boxes, one only has to know which atomic propositions are true
in w and which are not. Thus the truth of ϕ at w is independent of the model outside w.
In the following example,
[Figure: a model in which w has a reflexive arrow and an arrow to v, and v has arrows to x and to y.]
the truth of w |= 2p does not depend on x or y. In other words, w |= 2p holds if and only if w |= p and
v |= p, no matter whether p is true in x or y or not. However, the truth of v |= 2p depends on the valuation
at x and y, since v |= 2p if and only if x |= p and y |= p. On the other hand, to verify whether w |= 2p → 22q
all the nodes w, v, x, and y have to be taken into account.
This intuition is captured by the following theorem.
Theorem 9 (Finite depth theorem). For all numbers n, for all models M and all nodes w in M there
exists a model N of depth n with root w′ such that for all ϕ with b(ϕ) ≤ n:
M, w |= ϕ ⇔ N, w′ |= ϕ.
Proof. We do not formally prove this statement, but only sketch the idea. Given a model M with world
w, consider Mw . By Corollary 3, concerning generated subframes, we have for all formulas ϕ that for
all v in Mw ,
M, v |= ϕ ⇔ Mw , v |= ϕ,
but this does not prove the lemma as Mw may still have depth > n. Therefore, in Mw we cut out all worlds
that have depth > n from w and call this model N. Observe that the root of N is w. The ideas explained
above imply that for all formulas ϕ with b(ϕ) ≤ n we have M, w |= ϕ if and only if N, w |= ϕ.
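The cutting step in this sketch is an ordinary breadth-first computation: keep only the worlds that are reachable from w in at most n steps (the generated submodel Mw cut off at depth n). A Python illustration, where the chain model is a hypothetical example:

    from collections import deque

    def truncate(R, w, n):
        """Keep only the worlds reachable from w in at most n steps."""
        depth = {w: 0}
        queue = deque([w])
        while queue:
            x = queue.popleft()
            if depth[x] == n:          # do not expand beyond depth n
                continue
            for (a, c) in R:
                if a == x and c not in depth:
                    depth[c] = depth[x] + 1
                    queue.append(c)
        keep = set(depth)
        return keep, {(a, c) for (a, c) in R if a in keep and c in keep}

    # Chain w -> v -> x -> y; for a formula with b(phi) = 2 it suffices to keep
    # w, v and x together with the arrows between them; y is cut away.
    R = {("w", "v"), ("v", "x"), ("x", "y")}
    print(truncate(R, "w", 2))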
Corollary 5.
`K ϕ ⇔ F |= ϕ for all frames F of depth ≤ b(ϕ).
Proof. ⇒: this direction is the soundness theorem.
⇐: this direction we show by contraposition. Thus assuming 0K ϕ we show that there is a frame F
of depth ≤ b(ϕ) such that F 6|= ϕ. Thus suppose 0K ϕ. By the completeness theorem, there should be a
frame G such that G 6|= ϕ. Thus there is a model M on this frame and a world w such that M, w |= ¬ϕ.
By Theorem 9 there is a model N of depth ≤ b(¬ϕ) and a world v such that N, v |= ¬ϕ. Since the number
of boxes in ϕ and ¬ϕ is the same, b(¬ϕ) = b(ϕ). Let F be the frame of N. This then shows that F has
depth ≤ b(ϕ) and F 6|= ϕ, and we are done.
7.2 The finite model property
Results similar to Corollary 5 hold for various modal logics. The result can also be improved in such
a way that in the completeness theorem not only can we restrict ourselves to frames of finite depth, but
even to frames that are finite. The precise formulation is as follows.
Theorem 10.
`K ϕ    ⇔   ϕ holds on all frames of size ≤ 2|ϕ| .
`T ϕ    ⇔   ϕ holds on all reflexive frames of size ≤ 2|ϕ| .
`4 ϕ    ⇔   ϕ holds on all transitive frames of size ≤ 2|ϕ| .
`S4 ϕ   ⇔   ϕ holds on all preordered frames of size ≤ 2|ϕ| .
`S5 ϕ   ⇔   ϕ holds on all equivalence frames of size ≤ 2|ϕ| .
We say that a logic has the finite model property (FMP) if, whenever a formula ϕ is not derivable in
the logic, there is a finite model of the logic (a model in which all formulas of the logic are true) that
contains a world in which ϕ is refuted.
Corollary 6. The logics K, T, 4, S4, S5 have the finite model property.
Proof. We prove it for T. Suppose 0T ϕ. Then by Theorem 10 there is a reflexive frame F of size ≤ 2|ϕ| on
which ϕ does not hold. Thus there is a model M on the frame and a node w such that w |= ¬ϕ. By the
correspondence theorem 2ϕ → ϕ holds on all reflexive frames. That is, T holds on all reflexive frames.
Thus M is a finite model of T with a world in which ¬ϕ is true. This proves that T has the finite model
property.
7.3 Decidability
Recall that a language is decidable if there is a Turing machine that decides it. We can define a similar
notion for logics, by considering them as languages, namely as the set of all formulas that are derivable
in the logic. We say that a formula belongs to a logic when it is derivable in it. For example, with a logic L
we associate the set {ϕ | `L ϕ}. We call a Turing machine a decider for L when it decides {ϕ | `L ϕ}. In
general, we call a logic L decidable if there is a Turing machine that is a decider for L. The previous
theorem implies the decidability of all modal logics mentioned there.
Corollary 7. The logics K, T, 4, S4, S5 are decidable.
Proof. We show that K is decidable and leave the other logics to the reader. Thus we have to construct a
Turing machine that, given a formula ϕ, outputs “yes” if `K ϕ and “no” otherwise. By Theorem 10, `K ϕ
is equivalent to ϕ being valid in all frames of size ≤ 2|ϕ| . Thus the Turing machine has to do the following.
Given ϕ it tests for all worlds w in all models M on all frames of size ≤ 2|ϕ| whether M, w |= ϕ. If in
all cases the answer is positive, it accepts, and otherwise it rejects. It is clear that this Turing machine
decides K.
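For very small bounds this procedure can be written out directly: enumerate every frame up to the bound, every valuation over the atoms of ϕ, and every world, and test ϕ. The Python sketch below (not part of the notes) illustrates the idea only; it uses a tiny, hypothetical bound in place of 2|ϕ| , since the search explodes quickly.

    from itertools import product

    def holds(world, phi, R, V):
        if isinstance(phi, str):
            return phi in V[world]
        op = phi[0]
        if op == "not":
            return not holds(world, phi[1], R, V)
        if op == "imp":
            return (not holds(world, phi[1], R, V)) or holds(world, phi[2], R, V)
        if op == "box":
            return all(holds(v, phi[1], R, V) for (u, v) in R if u == world)
        raise ValueError(op)

    def valid_up_to(phi, atoms, max_size):
        """True iff phi holds at every world of every model on every frame of
        size <= max_size (the role played by 2^|phi| in Theorem 10)."""
        for n in range(1, max_size + 1):
            worlds = list(range(n))
            pairs = [(u, v) for u in worlds for v in worlds]
            for rel_bits in product([False, True], repeat=len(pairs)):
                R = {p for p, bit in zip(pairs, rel_bits) if bit}
                for val_bits in product([False, True], repeat=n * len(atoms)):
                    V = {w: {a for i, a in enumerate(atoms)
                             if val_bits[w * len(atoms) + i]} for w in worlds}
                    if not all(holds(w, phi, R, V) for w in worlds):
                        return False
        return True

    # The K-axiom survives the search; 2p -> p does not (it fails on an
    # irreflexive frame), so it is not a theorem of K.
    k_axiom = ("imp", ("box", ("imp", "p", "q")),
               ("imp", ("box", "p"), ("box", "q")))
    print(valid_up_to(k_axiom, ["p", "q"], 2))                 # True
    print(valid_up_to(("imp", ("box", "p"), "p"), ["p"], 2))   # False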
7.4 Complexity
In terms of complexity the Turing machine constructed in the proof above might not do so well since
there are at least exponentially many frames of size ≤ 2|ϕ| . The exponential factor is likely to be essential,
as for many of these logics, including K, T, 4 and S4, one can show that the corresponding satisfiability
problems are PSPACE-complete. That is, it can be solved in polynomial space whether a formula belongs
to such a logic or not, and any problem in PSPACE can be reduced to such problems. (Recall that the
satisfiability problem for propositional logic is NP-complete.) On the other hand, decidability is still
nice. Recall that predicate logic is not decidable. Of course, propositional logic is, but since modal
logics are extensions of propositional logic with much more expressive power, their decidability is not
apparent, and indeed these facts have nontrivial proofs that, regrettably, fall outside the scope of this
exposition.
8 Tense Logic
So far we have only looked at modal logic in general, leaving the meaning of ‘necessary’ vague and with
Kripke frames in which possible worlds are ‘accessible’. Now we go into a more specific form of modal
logic, namely, the logic of time. We start out with standard tense logic.
To understand the idea behind tense logic, consider that there are two ways to think of propositions:
- Eternal propositions: specify exact circumstances as part of the content.
For example: There is a vase on top of a table at 12h:45m:17s, at 7 October 2011, on earth, latitude:
52.087325, longitude: 5.108177.
- Indexical propositions: specify circumstances relative to a perspective of evaluation (or index).
For example: That vase is standing on top of my table (now, here).
In modal logic we always take propositions to be indexical in some or other respect: we evaluate them
from the perspective of some ‘possible world’, but we can also adopt the perspective of the present
moment, ‘now’, or from our current location, or from the perspective of the speaker, and so on. Expressions such as ‘I’, ‘here’ and ‘now’ are also called indexical expressions. What they mean in a particular
utterance depends on who is speaking, where, and when.
In tense logic we take propositions to be indexical in one respect. They are evaluated relative to
a point in time. The same idea can be expressed differently, looking not at the language but at the
models. These are now no longer populated by different ‘possible worlds’ but rather by different ‘possible
moments’ or time points. And accessibility is no longer a matter of determining which worlds are
possible from a given world, but rather for describing which time points are earlier or later than some
given time point.
8.1 Basic tense logic
Tense Logic was introduced by Arthur Prior as a logic for reasoning about past, present and future. It is a
logic that has two 2-like operators, from which we can also define their corresponding 3-like operators.
The operator G (for ‘it is always Going to be the case that’) is for the future. A formula Gϕ says that,
at all possible future moments in time, ϕ is true. The other operator H (for ‘it Has always been the case
that’) is the reverse of G. So Hϕ means that, at all possible moments in the past, ϕ is true.
The dual of G is F (for Future) and the dual of H is P (for Past). Putting these things together, we
get the following table:
                     always     sometimes
towards future:      G          F
towards past:        H          P
The language of tense logic can be written in Backus-Naur form as follows:
[Ltense ]
ϕ ::= p | ⊥ | ¬ϕ | ϕ ∧ ϕ | ϕ ∨ ϕ | ϕ → ϕ | ϕ ↔ ϕ | Gϕ | Fϕ | Hϕ | Pϕ
The statement FH p → p means that, if at some point in the future, p will always have been true
up to that moment, then p is true now. Intuitively, many people take this to be true. Analogously, the
statement PG p → p says that, if at some moment in the past it was true that p would always be true from
that moment on, then p is true now. These two statements are true in tense logic. They characterize the
most general class of all tense frames (see below) and formulated as inference rules they are the basic
rules of tense logic.
To evaluate the sentences of the language of tense logic, we use a Kripke model with only one
accessibility relation. So we have two different modal operators which are defined by means of one
accessibility relation. We could use the same notation with W and R as usual, but instead we will use T
for the domain of times, or moments, < for the earlier-than relation and > for the later-than relation. So
a tense frame is a tuple F = ⟨T, <, >⟩, if it meets the additional constraint that

for all times t and t′:   t > t′ if, and only if, t′ < t.

This is to guarantee that the earlier-than and later-than relations are each other's mirror image, in the natural way.
A more abstract formulation of this condition is: > = <⁻¹. The ‘⁻¹’ means that the relation is inverted. A
tense model can be obtained from a tense frame by adding a valuation, so a model is a tuple M = ⟨T, <, >, V⟩.
Below we only display the semantic definitions for the two modal operators.
M, t |= Gϕ   if, and only if,   for all t′, if t < t′, then M, t′ |= ϕ
M, t |= Hϕ   if, and only if,   for all t′, if t > t′, then M, t′ |= ϕ
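On a finite set of time points these clauses can be evaluated directly. The Python sketch below is an illustration only; the five-point time line and the valuation are hypothetical examples.

    T = range(5)                        # time points 0 < 1 < 2 < 3 < 4
    V = {0: set(), 1: {"p"}, 2: {"p"}, 3: {"p"}, 4: {"p"}}

    def holds(t, phi):
        """Formulas: atoms, ('not', A), ('G', A), ('F', A), ('H', A), ('P', A)."""
        if isinstance(phi, str):
            return phi in V[t]
        op, arg = phi
        if op == "not":
            return not holds(t, arg)
        if op == "G":                   # at all strictly later points
            return all(holds(s, arg) for s in T if t < s)
        if op == "F":                   # at some strictly later point
            return any(holds(s, arg) for s in T if t < s)
        if op == "H":                   # at all strictly earlier points
            return all(holds(s, arg) for s in T if s < t)
        if op == "P":                   # at some strictly earlier point
            return any(holds(s, arg) for s in T if s < t)
        raise ValueError(op)

    print(holds(0, ("G", "p")))             # True: p holds at every later point
    print(holds(0, ("H", "p")))             # True, vacuously: 0 has no predecessors
    print(holds(2, ("P", ("G", "p"))))      # True: at 1 it was the case that Gp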
The basic tense logic P (for Prior) is obtained from the basic modal logic K (for both modalities),
plus the following:
Tmp1                 Tmp2

PGϕ                  FHϕ
......               ......
ϕ                    ϕ
The inference rules characterize the class of tense frames. That is, the condition that < and > are each other's mirror image, in the sense defined above, is characterized by the two inference rules. To prove this,
we assume arbitrary frames with two accessibility relations, R1 and R2 , and two corresponding modal
operators 21 (and 31 ) and 22 (and 32 ).
Proof. ⇐: Suppose that F = ⟨W, R1 , R2 ⟩ satisfies the property that R2 = R1⁻¹. We have to show that
for every model M based on F and every possible world w ∈ W, if M, w |= 31 22 ϕ, then M, w |= ϕ. So
suppose, for arbitrary such M and w, M, w |= 31 22 ϕ. By the semantic definition this means that there
is some possible world v such that wR1 v and M, v |= 22 ϕ. Now, we assumed that R2 = R1⁻¹. Therefore,
from wR1 v it follows that vR2 w. According to the semantic definition, M, v |= 22 ϕ means that ϕ is true in
all worlds accessible by R2 from v. Since vR2 w, w is one of those worlds, so M, w |= ϕ. This is what we
needed to prove.
⇒: By contraposition, we prove that if frame F does not satisfy the property that R2 = R1⁻¹, then it
is not true at all worlds in all models based on F that if 31 22 ϕ is true, ϕ is true as well. Suppose that
R2 ≠ R1⁻¹. This means that for some w and v, wR1 v but not vR2 w (or vice versa; in that case the problem is with
the other inference rule). Now, consider this w. We define a model based on F by means of a valuation V
which is such that proposition p is true everywhere except at w: so p ∈ V(u) if, and only if, u ≠ w. Given
our assumption, not vR2 w. Therefore, all the worlds R2 -accessible from v are worlds where p is true. The
semantic definition then tells us that M, v |= 22 p. We also assumed that wR1 v. Therefore, applying the
semantic definition once more, M, w |= 31 22 p. But the valuation of p guarantees that M, w 6|= p.
8.2 Additional properties: seriality, transitivity, and linearity
Note that, apart from the connection between < and >, the earlier-than relation can be anything: circular,
linear, or even universal. If we have some intuition about what the earlier-than relation is like, we have to
restrict the class of tense frames accordingly.
Some people would say that time is serial: that there is no such thing as a last moment of time, or
a beginning of time: time has always been and will always continue. If this is true, then the temporal
ordering relations are serial.
- ∀x∃y(x < y)
- ∀x∃y(x > y)
A further thought is that whatever is in the future of the future, is itself in the future. The same could
be said of the past. If this is correct, then the earlier-than relation is transitive.
- ∀x∀y∀z((x < y ∧ y < z) → x < z)
- ∀x∀y∀z((x > y ∧ y > z) → x > z)
A common idea is that time is linear. We commonly speak of a ‘time line’. A tense frame is linear if
any two points in the future are ordered as earlier or later—and similarly for the past.
- ∀xyz((x > y ∧ x > z) → (y > z ∨ y = z ∨ z > y))     (past linear)
- ∀xyz((x < y ∧ x < z) → (y < z ∨ y = z ∨ z < y))     (future linear)
Mostly, it is agreed that there is only one past, so the past is linear. There is not one past in which
Philip the Second ruled the Netherlands and another past in which he did not. For the future it is
somewhat more debatable: some people say that the future is ‘open’: at the present moment it is not yet
determined what the future will be like. For example, who the next prime minister of the Netherlands
will be depends on how people will vote. Voting is a free choice, so it is not fixed now how I will
vote in (perhaps) two years’ time. So at least in some sense, many people say, there are several possible
futures, not a single one. This view is further strengthened by certain common interpretations of quantum
mechanics according to which, among other things, radioactive decay does not happen according to strict,
deterministic laws. Others reject this idea, arguing either that the future is determined, or that there are
several futures possible, but only one future is the real future. They will maintain that only future-linear
frames can be real tense frames.
The additional rules for characterizing linearity are the ones below.
Fwd-lin                          Bwd-lin

Fϕ                               Pϕ
......                           ......
G(Pϕ ∨ ϕ ∨ Fϕ)                   H(Pϕ ∨ ϕ ∨ Fϕ)
Proof. An exercise.
The logic Lin consists of the basic tense logic P with additionally the rules D and 4 for both modalities and both of the linearity rules, Fwd-lin and Bwd-lin. The class of frames characterized by the logic
Lin consists of all frames that are serial, transitive, linear in both directions, and in which the two accessibility relations are each other’s mirror image. That is, the characterized class of frames is simply the
intersection of the frame properties characterized by the additional rules. This class of frames includes
also frames in which time is circular, frames in which times precede themselves (since the irreflexivity
of < cannot be modally characterized), and frames in which there are two time lines side by side (since
connectedness cannot be modally characterized either).
Still, we can define the frame class of genuine time lines—with an acyclic, irreflexive and connected
ordering of times. It turns out that the logic Lin is complete with respect to this frame class: every tense
logical validity on that frame class is provable in the logic. The proof for this fact is more involved
than the straightforward Henkin proof for completeness given earlier and will not be presented here. It
is important to realize that a logic can be complete for a frame class even if that frame class is more
restrictive than the frame class that the logic characterizes.
Below is a representation of a time line, with closed arrows for the future direction, left to right, and
the interrupted arrows for the past direction.
[Figure: a time line . . . t1 , t2 , t3 , t4 , t5 , t6 , . . . with solid arrows between successive points running to the right (future) and dashed arrows running to the left (past).]
The alternative with only backward linearity and not forward linearity has an open future and a fixed
past. Such a tense frame can be pictured as a tree or, in the words of the writer Jorge Luis Borges, as a
“garden of forking paths”.
[Figure: a branching tree of moments; starting from t1 , time repeatedly forks to the right into different possible futures (t2 , t3 , . . . , t11 ).]
Time progresses from left to right, but there are many forks in the road, where time can develop in
different ways. So time can progress in different ways: in one future you visit a museum tomorrow, in
another future you stay at home and watch a movie instead. In this situation we speak of “branching
time”. We call the logic with only backwards linearity Tree. So Lin = Tree+Fwd-lin.
8.3 Varieties of linearity
Definition 16. Let F = ⟨T, <, >⟩ be a tense frame in the class Tree.
- F has a beginning if, and only if, ∃x∀y(y < x → y = x);
- F is discrete if, and only if, ∀x∃y(x < y ∧ ∀z((x < z ∧ z ≠ y) → y < z));
- F has finite intervals if, and only if, for any two points, there are only finitely many time points in
between them;
- F is dense if, and only if, ∀x∀y(x < y → ∃z(x < z ∧ z < y));
- F is continuous if, and only if, no cut determines a gap.

A cut is a partition of the domain into two parts such that (i) the parts are non-empty, (ii) together
they are the entire domain, and (iii) if x is in the first part and y in the second part, then x < y.
A cut is a gap if, and only if, the first part has no last element and the second part has no first element. So continuity says:
∀C(∃x(x ∈ C ∧ ∀y(y ∈ C → (y < x ∨ y = x))) ∨ ∃x(x ∉ C ∧ ∀y(y ∉ C → (x < y ∨ x = y)))), where C is a cut-set, i.e., a set such that ∀x∀y((x ∈ C ∧ y < x) → y ∈ C).
Characteristic formulas.
Theorem 11. Let F be a frame in the class Ctree .
- F has a beginning if, and only if, F |= H⊥ ∨ PH⊥.
- F is discrete if, and only if, F |= (ϕ ∧ Hϕ) → FHϕ.
- F has only finite intervals if, and only if, F |= G(Gϕ → ϕ) → (FGϕ → ϕ) and F |= H(Hϕ →
ϕ) → (PHϕ → ϕ).
- F is dense if, and only if, F |= Fϕ → FFϕ.
- F is continuous if, and only if, F |= (Fϕ ∧ O¬ϕ ∧ ¬O(ϕ ∧ P¬ϕ)) → O((ϕ ∧ G¬ϕ) ∨ (¬ϕ ∧ Hϕ)).
Here, Oϕ (‘once ϕ’) is short for Pϕ ∨ ϕ ∨ Fϕ.
The logic Lin-Z consists of Lin plus the axioms for discreteness and only finite intervals. Everything
that is valid on the frame Z, the frame in which time points are ordered like the integers, can be validly
inferred from this logic. In other words, the logic is complete for the class of frames {Z}. The logic
does not characterize this class, because there are also other frames which have exactly the same set of
validities as the frame Z.
To get from a logic matching Z to a logic matching N we need to replace the rule for seriality D of
the past by a rule that expresses that there is a beginning of time.
The logic Lin-Q is obtained by adding the axiom (or rule) for density to the logic Lin. This logic is
complete for the class of frames {Q}—the class consisting of the single frame constituted by the rational
numbers. That is, if we think of time as the rational numbers ordered by the smaller/greater than relations,
then every tense logical validity is provable in the logic Lin-Q.
By adding to Lin-Q the axiom (or rule) for continuity, we obtain the logic Lin-R, which is complete
for the class of frames {R}. So if time is structured like the real line, then all tense logical validities to be
had are those provable in Lin-R.
8.4 Time and modality: the Master Argument of Diodorus
Tense logic can be studied for various reasons. One such reason is historical. Prior himself wanted to
reconstruct the so-called ‘Master argument’ of the ancient philosopher Diodorus Cronus. Diodorus used
this argument to defend what might be called determinism about the future. Unfortunately, Diodorus’
own exposition has not survived. His argument is only known indirectly through reports by later ancient
authors. Epictetus writes ([6] Book II, chapter 19):
The Master Argument seems to be based on premisses of this sort. There is a general conflict
among these three statements:
1. Everything past and true is necessary;
2. The impossible does not follow from the possible;
3. There is something possible which neither is nor will be true.
Seeing this conflict, Diodorus relied on the plausibility of the first two to establish: Nothing
is possible that neither is true nor will be.
The ancient philosophers responded differently to this trilemma. Some were inclined to reject the second
statement and maintain the first and the third, for instance.
Diodorus’s conclusion, that the third proposition is false, can be reformulated as the statement:
Dio If it is possible that p, then either p is true now, or will be true some time in the future.
This is also what we can make out of the following statement by Boethius ([3], 234.22-26).
Diodorus defines as possible that which either is or will be; the impossible as that which,
being false, will not be true; the necessary as that which, being true, will not be false; and
the non-necessary as that which either is already or will be false.
A natural—and common—way to interpret Dio is as a statement of determinism. Namely, it says that
the only possible future is the actual future: nothing is possible other than what is true now or true in the
future.
Prior [12] attempted to reconstruct Diodorus’ argument by means of a combination of (alethic) modal
logic and tense logic. The language of this logic is simply that of tense logic plus the operators 2 and
3. We will first look at Prior’s argument and only later come to discuss what kind of models and
interpretations would make sense for this language.
In the language of modal tense logic, the conclusion by Diodorus is:
Dio 3ϕ → (ϕ ∨ Fϕ)
The trilemma is then expressed by the following three formulas, of which the third is the negation of
Dio.
D1 Pϕ → 2Pϕ
D2 2(ϕ → ψ) → (3ϕ → 3ψ)
D3 ¬(3ϕ → (ϕ ∨ Fϕ))     (= ¬Dio)
More precisely, D3 should be, not that the negation of Dio is valid, but that Dio itself is invalid. According to the formulation by Epictetus, it says that there is some proposition for which Dio is not true.
Nevertheless, the conclusion that Diodorus drew is that this is false, so that for no proposition D3 is true,
which means that Dio is generally valid.
In order to come to a valid inference of this kind, Prior added two further premisses. These two
premisses are, he claimed, reasonable for an ancient logician such as Diodorus. (Mates [9] agrees with
Prior’s assessment, but some other commentators have been more critical.)
P1 2(ϕ → HFϕ)
P2 (¬ϕ ∧ ¬Fϕ) → P¬Fϕ
Now Prior attempted to show that Diodorus’s reasoning can be validated using modal tense logic. This
means that we now have four premisses, D1, D2, P1, and P2, leading to the conclusion Dio. The first
of these says (intuitively) that the past is necessary. The second one is a straightforward validity of the
basic modal logic K. Furthermore, we can observe that ϕ → HFϕ is a validity of basic tense logic (it is
essentially the same as the inference rule Tmp2). We can infer P1 from this validity using Necessitation
(in the Hilbert-style inference system), or by a simple natural deduction. Finally, the premise P2 says:
Of whatever is and always will be false (i.e. what neither is nor ever will be true), it has
already been the case that it will always be false.
Effectively, then, the system we assume is basic modal tense logic, with D1 and P2 as the genuinely additional premisses.
Prior’s reconstruction of the Master Argument is a valid inference of ‘Dio’ using these four premisses.
Below is a sketch of the proof, skipping some of the inference steps that you can fill in for yourself. At
two points it makes use of the derived rule called ‘Contraposition’.
ϕ → ψ
......
¬ψ → ¬ϕ
You can prove that this is a derived rule in all (modal) logics. The deduction sketch below then proves
that D1, D2, P1, P2 ⊢ Dio.
1.   Pψ → 2Pψ                                      Premise D1
2.   2(ψ → χ) → (3ψ → 3χ)                          Premise D2
3.   2(ϕ → HFϕ)                                    Premise P1
4.   (¬ϕ ∧ ¬Fϕ) → P¬Fϕ                             Premise P2
5.   2(ϕ → ¬P¬Fϕ)                                  from 3, duality of H and P
6.   2(ϕ → ¬P¬Fϕ) → (3ϕ → 3¬P¬Fϕ)                  instance of D2 (ψ = ϕ and χ = ¬P¬Fϕ)
7.   3ϕ → 3¬P¬Fϕ                                   from 5, 6, Elim →
8.   ¬3¬P¬Fϕ → ¬3ϕ                                 from 7, contraposition
9.   2P¬Fϕ → ¬3ϕ                                   from 8, duality of 2 and 3
10.  P¬Fϕ → 2P¬Fϕ                                  instance of D1 (ψ = ¬Fϕ)
11.  (¬ϕ ∧ ¬Fϕ) → 2P¬Fϕ                            from 4 and 10, using Intro/Elim →
12.  (¬ϕ ∧ ¬Fϕ) → ¬3ϕ                              from 9 and 11, using Intro/Elim →
13.  3ϕ → (ϕ ∨ Fϕ)                                 from 12, contraposition and De Morgan
The last of these is the conclusion Diodorus drew.
If we view this inference from the perspective of modern modal logic, D2 and P1 are part of basic
modal and basic tense logic, respectively. If we accept those as given, then logically we have to either
give up D1 or P2, or accept Dio. Accepting Dio would mean accepting that the future is settled from the
beginning of the world: at the Big Bang it was determined that you would be reading this sentence now.
On closer inspection, however, the two premisses are not that plausible.
First, consider P2. In fact this formula implies that time is discrete. More precisely, that there is, at
any given time, an immediate past moment.
Theorem 12 (Discreteness). Let F be a frame in the class of all tense frames. Then, F |= (¬ϕ ∧ ¬Fϕ) →
P¬Fϕ if, and only if, every time has a unique closest predecessor: ∀x∃y(y < x ∧ ∀z(z < x → (z < y ∨ z = y))).
Proof. An exercise.
Second, consider D1. This formula seems to express the plausible idea that the past is fixed, but if
we apply it to future tensed statements it goes beyond the fixedness of the past.
Yesterday it was true that it is going to rain tomorrow. Therefore, it is now necessary that yesterday it
was true that it is going to rain tomorrow.
The first sentence in this example is true if, in the actual world, it is raining tomorrow. The second
sentence is true if, in all worlds that are possible now, it is raining tomorrow. In other words, if we use a
future tensed statement in D1, it implies that the future is determined. This is precisely what is involved
in the Master Argument, on Prior’s reconstruction, as can be seen in the deduction sketch above. Perhaps
the past is fixed in the sense that ‘what’s done is done’, without the past being fixed in the sense that
everything that was true in the past (including statements about the future) is fixed.
So the Master argument, on Prior’s reconstruction, says that a discrete time with a (strongly) fixed
past also has a fixed future.
Time is discrete + the past is strongly fixed = determinism
8.5 Aristotle on the sea battle
Aristotle’s argument in favour of the idea of an open future has been very influential. It has been viewed
as a response to the so-called Megaric school, of which Diodorus was a member. Aristotle makes a distinction
between something being necessary simpliciter and something being necessary when it happens. Using
this distinction he can combine the open future with the idea that everything that happens is necessary
(when it happens). Aristotle uses the example of a sea battle that may or may not take place tomorrow.
The only thing that is necessary is that the sea battle either does or does not happen tomorrow: it must
be one or the other, but neither outcome, taken by itself, is necessary.
Now that which is must needs be when it is, and that which is not must needs not be when
it is not. Yet it cannot be said without qualification that all existence and non-existence is
the outcome of necessity. For there is a difference between saying that that which is, when
it is, must needs be, and simply saying that all that is must needs be, and similarly in the
case of that which is not. In the case, also, of two contradictory propositions this holds
good. Everything must either be or not be, whether in the present or in the future, but it is
not always possible to distinguish and state determinately which of these alternatives must
necessarily come about.
Let me illustrate. A sea-fight must either take place tomorrow or not, but it is not necessary
that it should take place tomorrow, neither is it necessary that it should not take place, yet
it is necessary that it either should or should not take place tomorrow. Since propositions
correspond with facts, it is evident that when in future events there is a real alternative, and
a potentiality in contrary directions, the corresponding affirmation and denial have the same
character. (Aristotle [1], Ch.9)
In our formal language of modal tense logic, what Aristotle proposes is that we acknowledge that, necessarily, there either will or will not be a sea battle (p) tomorrow,
2(F p ∨ F¬p) and 2F(p ∨ ¬p),
although it is not necessary now that in the future there is going to be a sea battle, nor necessary that it is
not going to happen; that is, we do not accept
2F p ∨ 2¬F p.
What does hold is that the sea battle necessarily takes place when it does take place:
F p → F2p and F¬p → F2¬p.
8.6 Ockhamist semantics for modal tense logic
There are different ways to make models for combinations of tense and modality. One common approach
is the Ockhamist semantics. This semantics interestingly corresponds with the intuitions of Aristotle. It
makes use of the tree-like frames of tense logic, in which the future is not linear but the past is. We do
not alter these models, but we introduce the notion of a history:
A history is a maximal chain of times. If t and t′ are in history h, then so are all times in between
them. And if the tree goes on for ever into the future (or past), then the history also goes on for
ever into the future (or past).
These histories are also called branches of the tree. They pick out the different possible courses time
might take on the tree. We say that a history “goes through” some moment in time, and vice versa that a
time or moment “occurs in” a history. Observe that, in a tree structure, any moment in time has only one
past, but a multitude of futures. So all histories going through time t have the same past, but they may
have different futures.
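To get a feel for this, here is a small sketch in Python that enumerates the histories of a finite tree given
by an immediate-successor relation. The tree and the names are made up for the illustration; in general,
histories may of course be infinite, so this only works for finite trees.

    # A small finite tree, given by immediate successors; t0 is the root
    # and the tree branches after t1.
    successors = {
        "t0": ["t1"],
        "t1": ["t2a", "t2b"],   # the branching point
        "t2a": [],
        "t2b": [],
    }

    def histories(t):
        """Maximal chains of the subtree rooted at t; called on the root,
        these are exactly the histories of the tree."""
        if not successors[t]:
            return [[t]]
        return [[t] + rest for s in successors[t] for rest in histories(s)]

    print(histories("t0"))
    # [['t0', 't1', 't2a'], ['t0', 't1', 't2b']]
    # The two histories share the past {t0, t1} and differ only in the future.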
In the Ockhamist semantics for modal tense logic, we relativize truth not just to a moment of time,
but rather to the combination of a moment and a history. This means that we get the following truth
definition:
M, t, h |= p     if, and only if,   p ∈ V(t)
M, t, h |= 2ϕ    if, and only if,   for all histories h′ going through time t, M, t, h′ |= ϕ
M, t, h |= Gϕ    if, and only if,   for all times t′ > t occurring in history h, M, t′, h |= ϕ
M, t, h |= Hϕ    if, and only if,   for all times t′ < t occurring in history h, M, t′, h |= ϕ
The 2 modality makes a shift in the history-dimension and stays fixed at the same moment in time. The
tense modalities stay fixed on the given history: they only make steps forward and backward on the
time-dimension.
Truth of an atomic propositional variable depends only on the time point and not on the history. So
the valuation is V(t) and not V(t, h). A consequence of this is that, for all atomic propositions:
p → 2p.
This does not generalize to all formulas. It is not valid on all tree frames that F p → 2F p, for instance.
Suppose that there will actually be a sea battle tomorrow, but this is not necessary now. Then yesterday
it was actually true that there will be a sea battle in two days time, but this does not mean that it was
necessary yesterday that there will be a sea battle.
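The sea battle example can be checked mechanically. The sketch below (with made-up names, and with the two
histories simply listed by hand) implements the truth clauses for atoms, F and 2 from the definition above,
restricted to finite histories; it confirms that F p holds at the branching moment on the history where the
battle takes place, while 2F p does not.

    # The branching tree t0 -> t1 -> {battle, peace}, with its two histories
    # listed by hand as ordered lists of times.
    h1 = ["t0", "t1", "battle"]
    h2 = ["t0", "t1", "peace"]
    HISTORIES = [h1, h2]

    # Valuation: p ("a sea battle is taking place") is true only at 'battle'.
    V = {"battle": {"p"}}

    def holds(formula, t, h):
        """Ockhamist truth at a pair of a time t and a history h through t."""
        op = formula[0]
        if op == "atom":                       # M, t, h |= p  iff  p in V(t)
            return formula[1] in V.get(t, set())
        if op == "F":                          # some strictly later time on h
            i = h.index(t)
            return any(holds(formula[1], s, h) for s in h[i + 1:])
        if op == "box":                        # all histories going through t
            return all(holds(formula[1], t, g) for g in HISTORIES if t in g)
        raise ValueError(op)

    p = ("atom", "p")
    print(holds(("F", p), "t1", h1))           # True: on h1 the battle happens
    print(holds(("box", ("F", p)), "t1", h1))  # False: on h2 it never happens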
A number of other interesting validities are listed here:
- S5 for 2
The relation “h′ is a history going through the same time t as h” is an equivalence relation. This means
that 2 is characterized by the validities 2ϕ → ϕ, 2ϕ → 22ϕ and 32ϕ → ϕ.
- H2ϕ → 2Hϕ
If it has always been necessary that ϕ, then it is now necessary that ϕ has always been the case. This is
so in virtue of the fact, mentioned above, that for any given moment in time there is only one past (all
histories going through t have the same past before t). The converse does not hold in general: at a past
moment t′ there may be histories that branch off before reaching t, and ϕ may fail on those, so that 2ϕ
fails at t′ even though Hϕ holds on every history through t.
- P2ϕ → 2Pϕ
If it was once necessary that ϕ, then now it is necessary that ϕ was once the case.
- 2Gϕ → G2ϕ
For the future, too, only one direction is valid: if in all histories it will henceforth always be true
that ϕ, then henceforth it will always be necessary that ϕ. The possible histories for any later time
are always a subset of the possible histories now.
Some philosophers have been more restrictive in their acceptance of the distinction between actual
and possible future. C.S. Peirce, for example, rejected the idea of an actual future. On his view, the
only notion of the future is the one in which Fϕ is true if in all possible futures there is a moment where
ϕ is true. In terms of the language used here, Peirce would only accept as meaningful 2Fϕ, not the
actual future Fϕ, which assumes that we can speak of the ‘real’ course of time in advance of its actual
occurrence. At the other extreme, many philosophers have supported a view we might call determinism,
according to which there is no future other than the actual one. The very idea of a future that could
happen but never actually happens is an absurdity to these philosophers. If we accepted that view, there
would be only one branch. We would have the inference rule of Forward-linearity for the tense modality,
but also 3ϕ ↔ 2ϕ. A consequence of this is Fϕ ↔ 2Fϕ. So, interestingly, at both extremes the distinction
between Fϕ and 2Fϕ collapses.
8.7 Computation tree logic
The logic CTL∗ is a version of tense logic for branching time that has been proposed in computer science.
It is used for the automated verification of software, and for debugging hardware circuits and communication
protocols.
This logic is similar in interesting ways to the modal tense logic we discussed above, and its semantics
is comparable to the Ockhamist semantics. Its language is defined by means of a simultaneous induction.
We distinguish two separate types of (modal) formulas: state formulas ϕ and path formulas π.
ϕ := p | Aπ | Eπ
π := ϕ | ¬π | π ∧ ρ | Fπ | Xπ | πUρ
U is a binary (or dyadic) modality: whereas 2, like the ¬-operator, has one immediate subformula, U, like
the ∧-operator, has two.
By defining the language in this way we can preclude certain combinations of operators. For example, XAp
and (E p)U(Aq) are path formulas but not state formulas: U and X always yield path formulas, and a path
formula only becomes a state formula when it is prefixed with A or E.
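The simultaneous induction can be mirrored in a typed representation, which makes this sort discipline
explicit. The sketch below (illustrative only, with invented class names) encodes a few of the clauses; a
state formula can be used as a path formula by ‘lifting’ it, but a path formula only becomes a state formula
by wrapping it in A or E.

    from dataclasses import dataclass
    from typing import Union

    # State formulas: p | A pi | E pi   (only a few clauses, to keep it short)
    @dataclass
    class Atom:
        name: str

    @dataclass
    class A:
        path: "PathFormula"

    @dataclass
    class E:
        path: "PathFormula"

    StateFormula = Union[Atom, A, E]

    # Path formulas: phi | X pi | pi U rho   (again only some of the clauses)
    @dataclass
    class Lift:
        state: StateFormula        # a state formula used as a path formula

    @dataclass
    class X:
        sub: "PathFormula"

    @dataclass
    class U:
        left: "PathFormula"
        right: "PathFormula"

    PathFormula = Union[Lift, X, U]

    # E(p U Aq) is a state formula: p U Aq is a path formula, wrapped by E.
    example = E(U(Lift(Atom("p")), Lift(A(Lift(Atom("q"))))))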
Now for the models. They are basically the branching time structures Ctree. Instead of histories, we
now define the concept of a path. A path is like a part of a branch with an initial state. That is, a path is
a sequence of states s0 s1 s2 s3 . . . ordered by the precedence relation, so s0 < s1 < s2 < . . .
- P(k) is the kth element of the path P;
- Pk is the path obtained by ‘cutting off’ the initial k states, so that Pk (0) = P(k);
M, s |= p      iff   p ∈ V(s)
M, s |= Aπ     iff   ∀P (P(0) = s ⇒ M, P |= π)
M, s |= Eπ     iff   ∃P (P(0) = s and M, P |= π)
M, P |= ϕ      iff   M, P(0) |= ϕ
M, P |= Fπ     iff   ∃k : M, Pk |= π
M, P |= Xπ     iff   M, P1 |= π
M, P |= πUρ    iff   ∃k : M, Pi |= π for all i < k, and M, Pk |= ρ
M, P |= Iπ     iff   M, Pi |= π for infinitely many i
(The last clause is for an additional path modality I, read ‘infinitely often’, which is sometimes added to
the language.)
We see that A plays essentially the role of 2, and F works much as it did in the Ockhamist semantics: it
looks forward along the given path.
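As an illustration of these clauses, here is a sketch of an evaluator over a small transition system.
Everything in it is made up for the example, and it cheats in one respect: instead of the infinite paths of
the official semantics it enumerates paths of a fixed finite length, reading ‘some k’ as ‘some position on
the given finite path’. With that caveat, the clauses below follow the truth definition directly.

    # A tiny transition system: s0 branches to s1 (where p holds) and s2.
    SUCC = {"s0": ["s1", "s2"], "s1": ["s1"], "s2": ["s2"]}
    V = {"s1": {"p"}}

    def paths_from(s, length):
        """All paths of the given finite length starting in s (a crude
        stand-in for the infinite paths of the official semantics)."""
        if length == 1:
            return [[s]]
        return [[s] + rest for t in SUCC[s] for rest in paths_from(t, length - 1)]

    def holds_state(s, phi, bound=5):
        op = phi[0]
        if op == "atom":                       # M, s |= p
            return phi[1] in V.get(s, set())
        if op == "A":                          # all paths starting at s
            return all(holds_path(P, phi[1]) for P in paths_from(s, bound))
        if op == "E":                          # some path starting at s
            return any(holds_path(P, phi[1]) for P in paths_from(s, bound))
        raise ValueError(op)

    def holds_path(P, pi):
        op = pi[0]
        if op in ("atom", "A", "E"):           # state formulas are read at P(0)
            return holds_state(P[0], pi)
        if op == "not":
            return not holds_path(P, pi[1])
        if op == "and":
            return holds_path(P, pi[1]) and holds_path(P, pi[2])
        if op == "X":                          # next: evaluate on P^1
            return holds_path(P[1:], pi[1])    # assumes the finite path is long enough
        if op == "F":                          # eventually: some suffix P^k
            return any(holds_path(P[k:], pi[1]) for k in range(len(P)))
        if op == "U":                          # pi[2] at some k, pi[1] at all i < k
            return any(holds_path(P[k:], pi[2]) and
                       all(holds_path(P[i:], pi[1]) for i in range(k))
                       for k in range(len(P)))
        raise ValueError(op)

    p = ("atom", "p")
    print(holds_state("s0", ("E", ("F", p))))  # True: the path via s1 reaches p
    print(holds_state("s0", ("A", ("F", p))))  # False: the path via s2 never does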
References
[1] Aristotle. On Interpretation (ca. 350 B.C., translated by E. M. Edghill). Kessinger Publishing, 2004.
[2] Patrick Blackburn, Maarten de Rijke, and Yde Venema. Modal Logic, volume 53 of Cambridge Tracts in Theoretical
Computer Science. Cambridge University Press, 2001.
[3] Boethius. Commentary on Aristotle’s De Interpretatione.
[4] R. Carnap. Introduction to Semantics. Harvard University Press, 1942.
[5] R. Carnap. Meaning and Necessity: a Study in Semantics and Modal Logic. University of Chicago Press, 1947.
[6] Epictetus. Discourses. In B. Inwood and L.P. Gerson, editors, Hellenistic Philosophy. Indianapolis: Hackett Publishing
Company, 1988.
[7] J.W. Garson. Modal Logic for Philosophers. Cambridge University Press, 2006.
[8] Saul A. Kripke. A completeness theorem in modal logic. The Journal of Symbolic Logic, 24(1):1–13, 1959.
[9] Benson Mates. Review of Prior, ‘Diodorean Modalities’. The Journal of Symbolic Logic, 21(2):199–200, 1956.
[10] Benson Mates. The Philosophy of Leibniz. Oxford University Press, 1986.
[11] A.N. Prior. Time and Modality. Oxford University Press, 1957.
[12] Arthur N. Prior. Past, Present and Future. Oxford: Clarendon Press, 1967.
[13] B. Russell. Necessity and possibility [1905]. In A. Urquhart and A.C. Lewis, editors, Foundations of Logic, 1903-05.
Routledge, 1994.
[14] J. van Benthem. Modal Correspondence Theory. PhD thesis, University of Amsterdam, 1976.
[15] Johan van Benthem. Modal Logic for Open Minds. CSLI Publications, 2010.
[16] Hans van Ditmarsch, Wiebe van der Hoek, and Barteld Kooi. Dynamic Epistemic Logic, volume 337 of Synthese Library.
Springer, 2007.
[17] Ludwig Wittgenstein. Tractatus Logico-Philosophicus. London: Routledge and Kegan Paul, 1922. Translated by C.K.
Ogden, with an Introduction by Bertrand Russell.