University of Stellenbosch
Department of Mathematical Sciences (Mathematics Division)
Elements of Mathematical Reasoning: First-Order Logic
Prof IM Rewitzky
April 2012

0.1 Syntax

Due to the expressive power of predicate calculus, its language is more complex than that of PropCal. There are two sorts of things involved in a first-order logic formula:
(i) objects, such as individuals (e.g. Andrew, Paul), variables (e.g. x, y) and function symbols applied to arguments (e.g. m(x), g(x, y));
(ii) properties denoting truth values, such as predicates (e.g. Younger(andrew, paul)).

The predicate vocabulary consists of 4 sets:
• V of variable symbols, denoted by x, y, z, ...
• C of constant symbols, denoted by a, b, c, ...
• F of n-ary function symbols, denoted by f, g, h, ...
• P of n-ary predicate symbols, denoted by P, Q, R, ...
Note: each predicate symbol and each function symbol comes with an arity n, the number of arguments it expects.

Terms are defined as follows:
• Every variable is a term.
• Every constant in C is a term.
• If t1, t2, ..., tn are terms and f ∈ F has arity n, then f(t1, t2, ..., tn) is a term.
• Nothing else is a term.
Note: functions may be nested; constants may be thought of as function symbols of arity 0, so C ⊂ F.

Exercise 1 Assume g is a unary function symbol and h is a ternary function symbol. Which of the following are terms?
g(y); h(c, x); h(g(c), y, y); g(h); g(g(h(x, x, x), y))

The formulae of FOL are defined as follows:
• If t1, ..., tn are terms and P is an n-ary predicate symbol, then P(t1, ..., tn) is a formula, called an atomic formula.
• If φ is a formula then so is ¬φ.
• If φ and ψ are formulae then so are (φ ∧ ψ), (φ ∨ ψ), (φ → ψ).
• If φ is a formula and x is a variable then (∀xφ) and (∃xφ) are formulae.
• Nothing else is a formula.

Convention: For convenience, we retain the usual binding priorities for the connectives of propositional logic, and add that ∀y and ∃y bind like ¬.
Thus, the order is: ¬, ∀y, ∃y bind most tightly, then ∨ and ∧, then →. We often omit brackets around quantifiers, provided that doing so introduces no ambiguities.

Note: The collection of formal symbols is countable, i.e. can be arranged in a sequence (s1, s2, s3, ...).

The following abbreviations are used:
φ ∨ ψ abbreviates ¬φ → ψ
φ ∧ ψ abbreviates ¬(φ → ¬ψ)
φ ↔ ψ abbreviates (φ → ψ) ∧ (ψ → φ)
∃xφ abbreviates ¬∀x¬φ
Later we shall see that these abbreviations reflect the intended meaning of the connectives ∨, ∧, ↔ and the quantifier ∃.

Exercise 2 Let g be a unary function symbol, h a ternary function symbol, and B, Q binary predicate symbols. Which of the following expressions are formulae? Give reasons, and list all terms and atomic formulae in each case.
(i) ∀x(B(x, y) → g(z))
(ii) ∀x(B(x, y) → Q(g(z), y)
(iii) ∀x(B(g(y)) → Q(x, x)
(iv) ∀x(g(B(x, x)) → Q(x, c)
(v) ∀x(B(x, c) → B(c, x))

A subformula of a formula is any constituent part that is itself a formula.

Definition 0.1 In a quantified formula ∀xφ or ∃xφ, x is the quantified variable and φ is the scope of the quantified variable. It is not required that x actually appear in the scope of its quantification.

The concept of scope in formulae of first-order logic is similar to the concept of scope of variables in a block-structured programming language like Pascal.

Definition 0.2 Let φ be a formula. An occurrence of a variable x in φ is free in φ if and only if it is not within the scope of a quantifier ∀x or ∃x. An occurrence which is not free is bound. Let ∆ be a set of formulae. A variable x is free in ∆ if it is free in at least one formula in ∆. If ∆ = {} then no variable is free in ∆.

Notation: φ(x1, ..., xn) indicates that the set of free variables of the formula φ is a subset of {x1, ..., xn}. If a formula has no free variables it is closed. A closed formula is often called a sentence. If {x1, ..., xn} is the set of all free variables of φ, then the universal closure of φ is ∀x1 ... ∀xn φ and the existential closure is ∃x1 ... ∃xn φ.

Example 0.3 φ(x, y) has two free variables x and y; ∃yφ(x, y) has one free variable x and one bound variable y.

Note: In the formula ∀yψ(x) the variable y does not occur in ψ(x), so this formula says the same thing as ψ(x).

Consider the formula
(∀x(P(x) ∧ Q(x))) → (¬P(x) ∨ Q(y))
The occurrences of x in the subformula P(x) ∧ Q(x) are bound since they are in the scope of ∀x, but the occurrence of x in the subformula ¬P(x) ∨ Q(y) is free.

Example 0.4 Identify the scope of each quantifier and all bound and free variables.
(i) B(x, y) → ∀z∀x[B(x, y) → B(z, y)]
Answer: The first x is free, the second and third are bound. All occurrences of y are free, and all occurrences of z are bound. The scope of ∀z is ∀x[B(x, y) → B(z, y)], and the scope of ∀x is the part in square brackets.
(ii) ∀y∀z[x = y · z → (y = 1 ∨ z = 1)], where · denotes a binary function symbol and = a binary predicate symbol, both written in infix notation.
Answer: only x is free. The scope of ∀y and ∀z is the rest of the formula following the respective quantifier.

Variables are placeholders, so we must have some means of replacing them with more concrete information. We often need to replace a free variable v by an entire term t, i.e. substitute t for v. In substituting t for v we have to leave untouched the bound occurrences of v, since they are in the scope of some ∃v or ∀v. If φ is a formula, v is a variable, and t is a term, then φ(t/v) is the result of replacing each free occurrence of v in φ by t.

Example 0.5 If φ is the formula ∃y(x < y), where < denotes a binary predicate written in infix notation, then φ(2/x) is ∃y(2 < y) and φ(y/x) is ∃y(y < y). Although φ is intuitively true for any natural number x, the second substitution results in a formula that is false. The problem is that when y is substituted for the free variable x, it falls within the scope of the quantifier ∃y and hence becomes bound.
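The capture in Example 0.5 can be reproduced mechanically. The following is a minimal Python sketch, assuming a hypothetical encoding of terms and formulae as nested tuples; the tags ("var", "const", "fn", "pred", "not", "imp", "forall", "exists") and all function names are choices made for this illustration and are not part of the notes. It computes the free variables of a formula and the substitution φ(t/v), replacing only free occurrences of v and making no attempt to detect capture.

```python
# Hypothetical encoding (an assumption of this sketch):
# Terms:    ("var", name) | ("const", name) | ("fn", name, (t1, ..., tn))
# Formulas: ("pred", name, (t1, ..., tn)) | ("not", p) | ("imp", p, q)
#           | ("forall", x, p) | ("exists", x, p)

def term_vars(t):
    """Set of variable names occurring in a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    return set().union(*(term_vars(a) for a in t[2]))

def free_vars(p):
    """Set of variable names with at least one free occurrence in p."""
    tag = p[0]
    if tag == "pred":
        return set().union(*(term_vars(a) for a in p[2]))
    if tag == "not":
        return free_vars(p[1])
    if tag == "imp":
        return free_vars(p[1]) | free_vars(p[2])
    # Quantifier: the quantified variable is bound inside its scope.
    return free_vars(p[2]) - {p[1]}

def subst_term(s, t, v):
    """Replace every occurrence of variable v in term s by term t."""
    if s[0] == "var":
        return t if s[1] == v else s
    if s[0] == "const":
        return s
    return ("fn", s[1], tuple(subst_term(a, t, v) for a in s[2]))

def subst(p, t, v):
    """phi(t/v): replace each FREE occurrence of v in p by t (no capture check)."""
    tag = p[0]
    if tag == "pred":
        return ("pred", p[1], tuple(subst_term(a, t, v) for a in p[2]))
    if tag == "not":
        return ("not", subst(p[1], t, v))
    if tag == "imp":
        return ("imp", subst(p[1], t, v), subst(p[2], t, v))
    if p[1] == v:
        # v is bound by this quantifier, so its occurrences below stay untouched.
        return p
    return (tag, p[1], subst(p[2], t, v))

# Example 0.5: phi is ∃y(x < y).
x, y = ("var", "x"), ("var", "y")
two = ("const", "2")
phi = ("exists", "y", ("pred", "<", (x, y)))
phi_2 = subst(phi, two, "x")   # ∃y(2 < y)
phi_y = subst(phi, y, "x")     # ∃y(y < y): the substituted y is captured

print(free_vars(phi))    # {'x'}
print(free_vars(phi_y))  # set()
```

Here phi has the single free variable x, while phi_y has no free variables at all: the y that was substituted for x now lies in the scope of ∃y, exactly as described in Example 0.5.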
To avoid this problem of 'variable capture' we require the following definition: If v is a variable and t is a term, then φ admits t for v (or t is free for v in φ) if no variable in t becomes bound in φ(t/v).

Question: What happens if there are no free occurrences of v in φ? Then φ admits every term t for v: nothing is substituted, so the problematic variable capture cannot occur. In fact, φ(t/v) is just φ again.

Exercise 3 Let φ be ∃x(P(y, z) ∧ (∀y(¬Q(y, x) ∨ P(y, z)))) where P and Q are binary predicates.
(i) Is there a variable in φ which has free and bound occurrences?
(ii) Consider the terms w (w is a variable), f(x) and g(y, z), where f is a unary function symbol and g is a binary function symbol.
(a) Compute φ(w/x), φ(w/y), φ(f(x)/y) and φ(g(y, z)/z).
(b) Which of w, f(x), g(y, z) are free for x in φ?
(c) Which of w, f(x), g(y, z) are free for y in φ?
(iii) What is the scope of ∃x in φ?
(iv) Suppose we change φ to ∃x(P(y, z) ∧ (∀x(¬Q(x, x) ∨ P(x, z)))). What is the scope of ∃x now?

Exercise 4
(i) Translate into mathematical English.
(a) ¬∃x(x ∈ Q ∧ x² = 2)
(b) ∀x∀y∃z∀w[w ∈ z ↔ (w ∈ x ∨ w ∈ y)]
(c) ∃x∀y¬(y ∈ x)
(d) ∀y∃x¬(y ∈ x)
(e) ∀x∀y[(x ∈ Q ∧ y ∈ Q ∧ x < y) → ∃z(z ∈ Q ∧ x < z ∧ z < y)]
(ii) Translate the following sentences into predicate logic.
(a) Anyone who is persistent can learn logic.
(b) Nobody loves a loser.
(c) You can fool some of the people all of the time, and you can fool all the people some of the time, but you cannot fool all the people all the time.

Predicate logic is also called first order logic, FOL. The adjective 'first order' distinguishes the language defined in this section from languages in which predicates may take other predicates or functions as arguments, or in which quantification over predicates or functions is permitted, or both.

0.2 Semantics

We evaluate the propositional formula p ∨ ¬q → (q → p) using truth-tables.
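For the propositional case the whole evaluation fits in a few lines. The following Python sketch (the function names are illustrative only) tabulates p ∨ ¬q → (q → p), read as (p ∨ ¬q) → (q → p) under the usual binding priorities, over all four truth assignments.

```python
from itertools import product

def implies(a, b):
    """Truth function of the connective →."""
    return (not a) or b

def formula(p, q):
    """(p ∨ ¬q) → (q → p)"""
    return implies(p or not q, implies(q, p))

# One row per assignment of truth values to p and q.
rows = [(p, q, formula(p, q)) for p, q in product([True, False], repeat=2)]
for p, q, value in rows:
    print(p, q, value)

is_tautology = all(value for _, _, value in rows)
print(is_tautology)  # True
```

All four rows evaluate to true, so this particular formula is a tautology. The table has 2² rows because the formula contains two propositional variables.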
What about the FOL formula ∀x∃y((P(x) ∨ ¬Q(y)) → (Q(x) → P(y)))? Could we assume truth values for P(x), Q(y), Q(x), P(y) and compute a truth value as before? Not quite! We need to reflect the meanings of ∀x, ∃y, their dependencies and the actual parameters P, Q. For instance, ∀x∃yR(x, y) means something different from ∃y∀xR(x, y).

Recall: variables are place-holders for any, or some unspecified, concrete values (birds, numbers, etc.). When evaluating ∃yψ we try to find some instance (concrete value) of y such that ψ holds; if we succeed then ∃yψ is true, otherwise it is false. When evaluating ∀yψ we show that ψ evaluates to true for all possible values of y. Such evaluations of formulae require a fixed universe of concrete values: the things we are talking about. Some examples should make the idea clearer.

Example 0.6 Let c be a constant symbol, f a binary function symbol, B a binary predicate symbol, and consider the formula φ = ∀x∃y[B(f(x, y), c)], which is actually a sentence. We can interpret this sentence in many different situations.
(i) For example, in the natural numbers N, let c be mapped to the number 0, let f be mapped to the addition function and let B be mapped to the equality relation. Then φ is false since there are no negative numbers in N.
(ii) In another example based on the positive rationals Q+, we could let c be mapped to 1, f mapped to multiplication and B mapped to the equality relation again. Then φ is true since any positive rational number has a multiplicative inverse.

From the examples, we get the idea that the truth value of a formula depends on, and varies with, the actual choice of values and the meaning of the function and predicate symbols involved. Thus to interpret a language, we need:
• A set of objects.
• Some distinguished objects which will correspond to the constant symbols.
• Some particular functions that correspond to the function symbols.
• Some particular relations on the set of objects that correspond to the predicate symbols.
The next definition makes all this precise:

Definition 0.7 A structure (or interpretation) has the form
U = (U, f1^U, f2^U, ..., c1^U, c2^U, ..., G1^U, G2^U, ...)
where
• U is a nonempty set, called the universe or domain of interpretation,
• if fi is n-ary, then fi^U is a function from U^n to U,
• each ci^U is a distinguished element of U, and
• if Gi is n-ary, then Gi^U is an n-ary relation on U, i.e. Gi^U ⊆ U^n.

Note that the superscript U is used to distinguish between the symbols and their particular interpretation (meaning) in the structure U. This distinction is vital for a proper understanding of the semantics of FOL, and also clarifies many ambiguities of mathematics, where the same symbol is used in different contexts. In practice we often omit the superscript when it is clear what the intended meaning of the symbol is.

The variables are allowed to range over the elements of U. (This is what distinguishes first order logic from higher order logics. In second order logic there are also variables that range over functions and relations on U.)

The concept of a structure is quite common in mathematics, though usually not explicit. The natural numbers N, the integers Z, the rationals Q, the reals R and the complex numbers C are all examples of structures, each with their usual operations (+^N, +^Z, ×^N, ...) and relations (=^N, =^Z, ≤^N, ...) defined on them.

If there are no function symbols, the structure is called a relational structure. If there are no predicate symbols except the = symbol, then the structure is an algebraic structure or algebra. Examples of relational structures are partially ordered sets and graphs. On the other hand groups, rings, fields, lattices and Boolean algebras are examples of algebras.

We are aiming of course to establish when a formula is true in a structure U.
The next example shows that the truth value of a formula depends on what values we assign to the free variables.

Example 0.8 Let us take as domain the set of all positive integers and interpret the predicate symbol A(u, v) as u ≤ v. Then
(i) A(x, y) represents the expression 'x ≤ y', which is satisfied by all pairs (a, b) of positive integers such that a ≤ b.
(ii) ∀yA(x, y) represents the expression 'For all positive integers y, x ≤ y', which is satisfied only by the integer 1.
(iii) ∃x∀yA(x, y) is a true sentence asserting that there is a smallest positive integer.
If we were to take as domain the set of all integers, then ∃x∀yA(x, y) would be false.

Exercise 5 For the following wffs and for the given interpretations, indicate for what values the wffs are true (if they contain free variables) or whether they are true or false (if they are closed wffs).
(i) A(f(x, y), c)
(ii) A(x, y) → A(y, x)
(iii) ∀x∀y∀z(A(x, y) ∧ A(y, z) → A(x, z)).
(a) The domain is the set of positive integers, A(y, z) is y ≥ z, f(y, z) is y · z, c is 2.
(b) The domain is the set of integers, A(y, z) is y = z, f(y, z) is y + z, c is 0.
(c) The domain is the set of all sets of integers, A(y, z) is y ⊆ z, f(y, z) is y ∩ z, c is the empty set ∅.

So we see that to find the truth value of a formula, we first need to assign values to the free variables. Formally this is done as follows:

Definition 0.9 An assignment in a structure U is a function h from the set V of variables to U. Such an assignment h is extended inductively to the set of all terms by defining h(ci) = ci^U for each constant ci, and
h(fi(t1, ..., tn)) = fi^U(h(t1), ..., h(tn))
for each n-ary function symbol fi and terms t1, ..., tn.

We now define inductively the satisfaction relation U, h ⊨ θ, read as 'U with assignment h satisfies formula θ'.
• U, h ⊨ G(t1, ..., tn) iff (h(t1), ..., h(tn)) ∈ G^U
• U, h ⊨ ¬φ iff U, h ⊭ φ, i.e. iff it is not the case that U, h ⊨ φ
• U, h ⊨ φ → ψ iff U, h ⊭ φ or U, h ⊨ ψ
• U, h ⊨ ∀vφ iff U, h′ ⊨ φ for all assignments h′ that satisfy h(w) = h′(w) for all variables w ≠ v.

At first sight, the evaluation of terms and formulae in a structure seems like a rather cumbersome and technical notion. In practice, however, one quickly notices that it is simply a precise way of defining something that is done informally in all areas of mathematics: deciding whether a statement is true or false (given some specific values for all free variables).

Instead of writing U, h ⊨ φ, it is often convenient to write
U ⊨ φ[hx, hy, hz, ...]
where hv = h(v), and x, y, z, x1, ... is an initial part of the list of variables in 'alphabetical' order that includes all variables appearing in φ. With this notation the assignment is given explicitly in the square brackets by listing the elements of the structure U to which the variables are assigned. For a specific variable, say y, we can now give a very natural definition of ∀y:
U ⊨ ∀yP[hx, hy, hz, ...] iff for all u ∈ U, we have U ⊨ P[hx, u, hz, ...].

In the exercises below, try to apply the above definitions rigorously, rather than just relying on your intuition.

Exercise 6 Let P be the formula x + y = z (written in infix notation) and consider the algebra of natural numbers (N, +^N, =^N), where + and = are interpreted in the usual way, i.e. as addition and equality. Decide which of the following formulae are satisfied in N under the given assignment: (i) P[1, 2, 3], (ii) P[2, 3, 4], (iii) ∀zP[1, 2, 3], (iv) ∀z¬P[1, 2, 3], (v) ∀z(P → z = x1)[1, 2, 4, 3], (vi) ∀x1P[1, 2, 3, 4] and (vii) ¬∀z¬P[1, 2, 2].
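When the universe is finite, Definition 0.9 and the satisfaction clauses can be executed directly. The following Python sketch assumes a hypothetical encoding that is not part of the notes: terms and formulae are nested tuples, and a structure is a dictionary with keys "universe", "consts", "funcs" and "preds". The two structures at the end are finite stand-ins chosen for illustration (addition modulo 5, and the maximum function); they are not the structures N and Q+ of Example 0.6, whose universes are infinite and cannot be searched exhaustively.

```python
# Hypothetical encoding (an assumption of this sketch):
# Terms:    ("var", name) | ("const", name) | ("fn", name, (t1, ..., tn))
# Formulas: ("pred", name, (t1, ..., tn)) | ("not", p) | ("imp", p, q)
#           | ("forall", x, p)

def eval_term(struct, h, t):
    """Extend the assignment h from variables to all terms."""
    kind = t[0]
    if kind == "var":
        return h[t[1]]
    if kind == "const":
        return struct["consts"][t[1]]
    args = [eval_term(struct, h, a) for a in t[2]]
    return struct["funcs"][t[1]](*args)

def satisfies(struct, h, p):
    """Decide U, h |= p for a structure with a finite universe."""
    tag = p[0]
    if tag == "pred":
        values = tuple(eval_term(struct, h, a) for a in p[2])
        return values in struct["preds"][p[1]]
    if tag == "not":
        return not satisfies(struct, h, p[1])
    if tag == "imp":
        return (not satisfies(struct, h, p[1])) or satisfies(struct, h, p[2])
    if tag == "forall":
        # {**h, v: u} agrees with h on every variable except possibly v.
        return all(satisfies(struct, {**h, p[1]: u}, p[2])
                   for u in struct["universe"])
    raise ValueError(f"unknown formula tag: {tag!r}")

def exists(v, p):
    """The abbreviation: ∃v p stands for ¬∀v¬p."""
    return ("not", ("forall", v, ("not", p)))

# The sentence of Example 0.6: ∀x∃y B(f(x, y), c).
x, y, c = ("var", "x"), ("var", "y"), ("const", "c")
phi = ("forall", "x", exists("y", ("pred", "B", (("fn", "f", (x, y)), c))))

universe = range(5)
equality = {(u, u) for u in universe}

# Illustrative structure 1: c is 0, f is addition modulo 5, B is equality.
z5 = {"universe": universe, "consts": {"c": 0},
      "funcs": {"f": lambda a, b: (a + b) % 5}, "preds": {"B": equality}}
# Illustrative structure 2: same universe, but f is the maximum function.
with_max = {"universe": universe, "consts": {"c": 0},
            "funcs": {"f": max}, "preds": {"B": equality}}

print(satisfies(z5, {}, phi))        # True: every element has an additive inverse mod 5
print(satisfies(with_max, {}, phi))  # False: max(1, y) is never 0
```

The empty assignment suffices here because phi is a sentence. As in Example 0.6, the same sentence is true in one structure and false in another; only the interpretation of f has changed.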
Exercise 7 Rewriting φ ∧ ψ as ¬(φ → ¬ψ), φ ∨ ψ as ¬φ → ψ and ∃vφ as ¬∀v¬φ, use the above definition of evaluation to prove the following:
(i) U, h ⊨ φ ∧ ψ iff U, h ⊨ φ and U, h ⊨ ψ
(ii) U, h ⊨ φ ∨ ψ iff U, h ⊨ φ or U, h ⊨ ψ
(iii) U, h ⊨ ∃vφ iff U, h′ ⊨ φ for some assignment h′ that satisfies h(w) = h′(w) for all variables w ≠ v
(iv) U ⊨ ∃yP[hx, hy, hz, ...] iff there exists u ∈ U such that U ⊨ P[hx, u, hz, ...]

Exercise 8 Again, let P be the formula x + y = z (written in infix notation) and let (N, +^N, =^N) be the additive algebra of natural numbers. Decide which of the following are satisfied: (i) ∃zP[1, 2, 3], (ii) ∃y∀zP[1, 2, 2], (iii) ∀y∃zP[1, 1, 3], (iv) ∀z∃yP[1, 2, 1], (v) ∀x∃y∃zP[1, 2, 1], (vi) ∃y∃zP[1, 2, 1].

Exercise 9 Repeat the above exercise, but replace the formula P by x² + y² = z² and add the squaring function to the structure N.

From the exercises above (and our intuition) we suspect that the truth value of φ depends only on the values substituted for the free variables. The next theorem shows this is true. Its proof is a straightforward induction on the structure of the formula φ.

Theorem 0.10 Let φ be a formula, and suppose h and k are two assignments that agree on all free variables of φ. Then
U, h ⊨ φ iff U, k ⊨ φ

Proof. We first prove the following claim by induction on the structure of terms.
Claim: if t is a term that occurs in an atomic formula φ, then h(t) = k(t).
For constant symbols this follows directly from the definition since h(c) = c^U = k(c), and for variables it follows from our assumption that h and k agree on all free variables of φ, since in an atomic formula all variables are free (end of base step). Assume now that the claim holds for terms t1, ..., tn, and let f be an n-ary function symbol. Then
h(f(t1, ..., tn)) = f^U(h(t1), ..., h(tn)) = f^U(k(t1), ..., k(tn)) = k(f(t1, ..., tn))
so we have shown that the claim holds for f(t1, ..., tn) (end of inductive step).

Now we show, by induction on the structure of formulae, that for any formula θ, U, h ⊨ θ iff U, k ⊨ θ. For an atomic formula G(t1, ..., tn) we use the above claim about terms and the definition of the satisfaction of atomic formulae (end of base step).

Assume the theorem holds for formulae φ and ψ (inductive hypothesis), and consider the formulae ¬φ and φ → ψ. Then U, h ⊨ ¬φ iff U, h ⊭ φ iff U, k ⊭ φ iff U, k ⊨ ¬φ, where the middle 'iff' follows from the inductive hypothesis. A similar argument for φ → ψ makes use of the observation that all free variables of φ and of ψ are also free in φ → ψ.

Now let θ be the formula ∀vφ, and assume that h, k are two assignments that agree on all free variables of θ. Note that this does not imply h(v) = k(v), since the variable v is bound in θ (while it might be free in φ). Now assume U, h ⊨ ∀vφ, which means
(∗) U, h′ ⊨ φ for all h′ satisfying h(w) = h′(w) for w ≠ v.
We want to show U, k ⊨ ∀vφ. So let k′ be any assignment that satisfies k(w) = k′(w) for w ≠ v. We need to show that U, k′ ⊨ φ. Note that k, k′ and h all agree on the free variables of θ. Let h′ be defined by
h′(w) = k′(v) if w = v, and h′(w) = h(w) otherwise.
Then U, h′ ⊨ φ by (∗), and h′, k′ agree on all free variables of φ: they agree at v since h′(v) = k′(v), and every other free variable w of φ is also free in θ, so h′(w) = h(w) = k(w) = k′(w). So by the inductive hypothesis, we obtain U, k′ ⊨ φ, as required. The proof for the reverse implication is the same, with the roles of h and k interchanged (end of inductive step). □

Lemma 0.11 Suppose φ admits t for v. Let U be any structure, h any assignment, and define an assignment ĥ by
ĥ(w) = h(t) if w = v, and ĥ(w) = h(w) otherwise.
Then U, h ⊨ φ(t/v) iff U, ĥ ⊨ φ.

Proof. This result follows by induction on the structure of the formula φ. If φ is an atomic formula, say G(s1, ..., sn), then φ(t/v) is G(s1′, ..., sn′), where si′ is the result of replacing each v in si by t. Since ĥ(v) = h(t), it follows (by a simple induction over terms) that h(si′) = ĥ(si).
Hence U, h ⊨ φ(t/v) iff U, ĥ ⊨ φ.

As the induction hypothesis, assume the claim holds for formulae ψ and θ. If φ is of the form ¬ψ or ψ → θ, then the claim holds by the definition of ⊨. The interesting case is if φ is ∀uψ. Suppose first that v is not free in φ. Then φ(t/v) is φ, and h and ĥ agree on all free variables of φ, so the claim holds by Theorem 0.10.

Now suppose v does occur free in φ. Then v ≠ u, hence φ(t/v) is ∀uψ(t/v) and ψ admits t for v. Since φ admits t for v, the variable u does not occur in t (otherwise u would be bound by the ∀u, contrary to the definition of 'admits'). Therefore h(t) is independent of the value assigned to u. Let k be an assignment such that k(w) = h(w) for all w ≠ u. The assignment k̂ is defined like ĥ, so k̂(v) = k(t). But since h and k agree everywhere except possibly at u, k(t) = h(t) (here we make use of the fact that u does not appear in t). Therefore k̂(w) = ĥ(w) for all w ≠ u. By the inductive hypothesis for ψ we have U, k ⊨ ψ(t/v) iff U, k̂ ⊨ ψ. As k ranges over all assignments that agree with h except possibly at u, k̂ ranges over all assignments that agree with ĥ except possibly at u. Hence U, h ⊨ φ(t/v) iff U, ĥ ⊨ φ. □

We have already seen that some formulae will be true in some structures while they will be false in others. Let's settle this with a definition.

Definition 0.12
• A formula φ is said to be true in U if U, h ⊨ φ for all assignments h. In this case we also say that U satisfies φ, or that U is a model of φ, and write U ⊨ φ (without any reference to h).
• φ is said to be false in U if U, h ⊭ φ for all assignments h.

Question: If φ is not true in a structure, is it false? In general, this is not the case if φ has free variables.

Example 0.13 Let φ be the formula B(x, c) in the following structures:
(i) N with B^N the relation ≥^N and c^N = 0; then φ is true in N.
(ii) N with B^N the relation ≥^N and c^N = 3; then φ is neither true nor false in N.
(iii) N with B^N the relation <^N and c^N = 0; then φ is false in N.

Exercise 10 Prove the following results.
(i) U ⊨ φ iff ¬φ is false in U.
(ii) No formula is both true and false in the same structure.
(iii) φ → ψ is false in U iff φ is true in U and ψ is false in U.
(iv) φ is true in U iff the universal closure of φ is true in U.

Although we've seen that a formula need be neither true nor false in a given structure, the situation is much improved for sentences. A sentence is either true or false in a given structure. This follows from Theorem 0.10 since a sentence has no free variables (by definition), so for a sentence φ, the relation U, h ⊨ φ is independent of what assignment h is used.

Corollary 0.14 For any sentence φ and any structure U, either U is a model of φ or U is a model of ¬φ.

Having settled the idea of truth in a particular structure, we can think about those formulae which are true in every structure. They represent universal facts and will be called logical laws.

Definition 0.15 Let φ be a FOL formula.
• φ is a logical law iff φ is true in every structure. In this case we also say that φ is logically valid, and write ⊨ φ.
• φ is satisfiable iff φ is true in some structure.
• φ is a contradiction iff φ is false in every structure.

Example 0.16
(i) ∀xφ → ∃xφ is a logical law since structures always have nonempty universes.
(ii) ∃x∀yφ → ∀y∃xφ is a logical law.
(iii) The converse of (ii) is not a logical law.
(iv) ∀x(φ ∨ ¬φ) is a logical law for any formula φ.

Exercise 11 Decide if the following are logical laws (for any formulae φ, ψ). If one of them is not a logical law, give a counter-example, i.e. a structure in which the formula is not true.
(i) ∀x(φ ∧ ψ) ↔ ∀xφ ∧ ∀xψ
(ii) ∀x(φ ∨ ψ) → ∀xφ ∨ ∀xψ
(iii) ∀xφ ∨ ∀xψ → ∀x(φ ∨ ψ)
(iv) ∃x(φ ∧ ψ) → ∃xφ ∧ ∃xψ
(v) ∃xφ ∧ ∃xψ → ∃x(φ ∧ ψ)
(vi) ∃x(φ ∨ ψ) ↔ ∃xφ ∨ ∃xψ
(vii) ¬∀x(φ → ψ) ↔ ∃x(φ ∧ ¬ψ)

Given a structure U and an assignment h in U, we have seen how to check whether U with assignment h satisfies a formula φ, i.e. whether U, h ⊨ φ. This is different from what we did in propositional logic, where we wrote φ1, φ2, ..., φn ⊨ ψ.
How can we define a notion of semantic entailment for predicate logic? Recall the earlier notion of a structure being a model of some formula. More generally, a structure U is called a model of a set of formulae ∆ if U is a model of every formula in ∆.

Definition 0.17 Let ∆ be a set of formulae and φ any formula. We say that φ is a semantic entailment of ∆ (or that ∆ semantically implies φ) if every model of ∆ is a model of φ. In this case we write ∆ ⊨ φ. If φ is not a consequence of ∆, i.e. if there exists a structure (called a counter-example) which is a model of ∆ but not of φ, then we write ∆ ⊭ φ.

Note that every structure is a model of the empty set. So if ∆ is the empty set, then ∆ ⊨ φ means that every structure is a model of φ, i.e. φ is a logical law as defined above.

0.3 Deduction and Proof

0.3.1 Natural deduction

The natural deduction system for propositional logic can be extended to cover first-order logic as well, with the addition of the following rules for quantifiers:
• Universal elimination or specialisation
• Universal introduction or generalisation
• Existential introduction
• Existential elimination.

Universal elimination or Specialisation

Consider the following motivating example.

Example 0.18 From the formula ∀x∃y y > x, we can correctly conclude that ∃y y > 2 is true, and we can even substitute a term such as z² for x, since the formula ∃y y > z² is also true regardless of what value z takes on. However, the term y² cannot be substituted for x, since ∃y y > y² is false in N. The problem is that y² is not admissible for x in ∃y y > x.

As before, ∆ is a set of formulae, φ is a formula, t is a term and v is a variable. The following result is a direct consequence of axiom A5 and MP.

Theorem 0.19 (∀ Elim) If ∆ ⊢ ∀vφ and φ admits t for v, then ∆ ⊢ φ(t/v).

In a deduction it is applied as follows:
    m.  ∀vφ
        ...
        φ(t/v)        ∀ Elim m
provided that φ admits t for v.

Universal introduction or Generalisation

This rule is the formal counterpart of arguments that prove a result about an arbitrary object v, and then conclude that the result is true for all objects v.

Example 0.20 Let x be any natural number and let φ be the formula ∃y(y > x). This formula is true in N since we may take y to be x + 1. By generalisation we may conclude that ∀xφ is true in N.

To see that this rule must be applied with some care, consider the following bogus proof that every natural number is even: Assume that x is a natural number of the form 2 × y. Then the formula ∃y(x = 2 × y) is true. By generalisation we conclude that ∀x∃y(x = 2 × y) is true (Not!). The mistake is that some assumptions were made about x, so x did not represent an 'arbitrary' object.

The formal rule is given by the following theorem. A variable is not free in a set of formulae if it does not occur free in any member of the set.

Theorem 0.21 (∀ Intro) If ∆ ⊢ φ and v is not free in ∆, then ∆ ⊢ ∀vφ.

Proof. Suppose v is not free in ∆. We prove by induction on n:
(∗) if ψ_1, ..., ψ_n is a deduction from ∆, then there is another deduction from ∆ in which ∀vψ_1, ..., ∀vψ_n appear.
Basis: If n = 1, then ψ_1 is an axiom or ψ_1 ∈ ∆.
Case ψ_1 is an axiom: If v is free in ψ_1, then by Ax Gen, ∀vψ_1 is also an axiom, so it can be deduced in one step. If v is not free in ψ_1, then we apply modus ponens to ψ_1 and A6 to obtain ∀vψ_1.
Case ψ_1 ∈ ∆: Then by assumption, v is not free in ψ_1, so again we apply modus ponens to ψ_1 and A6.
Induction Step: Assume (∗) holds for deductions of length n. Let ψ_1, ..., ψ_{n+1} be a deduction from ∆. By (∗) there exists a deduction D from ∆ in which ∀vψ_1, ..., ∀vψ_n occur. If ψ_{n+1} is an axiom or a member of ∆, we proceed as in the Basis step. The remaining case is that ψ_{n+1} follows by modus ponens from ψ_j, ψ_k where j, k ≤ n and ψ_k is ψ_j → ψ_{n+1}.
Since ∀vψ_j and ∀v(ψ_j → ψ_{n+1}) occur in D, say at steps m_j and m_k, we can add the following three steps to the m steps of D:
    m + 1.  ∀v(ψ_j → ψ_{n+1}) → (∀vψ_j → ∀vψ_{n+1})    A4
    m + 2.  ∀vψ_j → ∀vψ_{n+1}                          MP m_k, m + 1
    m + 3.  ∀vψ_{n+1}                                  MP m_j, m + 2
□

In a deduction the rule is applied in the following way:
    n.  φ
        ...
        ∀vφ        ∀ Intro n
provided that v is not free in any undischarged assumption on which φ depends.

Recall that an undischarged assumption is any assumption made at the beginning of a subproof which has not yet ended at that point in the proof. The side condition is violated in the above bogus proof: assuming that x is of the form 2 × y is an undischarged assumption in which x is free. If v is free in φ, the same provision also prevents a general deduction like φ ⊢ ∀vφ, since then we would have to treat φ as an undischarged assumption.

Example 0.22
    1.  ∀v(φ ↔ ψ)                     as
    2.  φ ↔ ψ                         ∀ Elim 1
    3.  ∀vφ                           as
    4.  φ                             ∀ Elim 3
    5.  ψ                             ↔ Elim 2, 4
    6.  ∀vψ                           ∀ Intro 5
    7.  ∀vψ                           as
    8.  ψ                             ∀ Elim 7
    9.  φ                             ↔ Elim 2, 8
    10. ∀vφ                           ∀ Intro 9
    11. ∀vφ ↔ ∀vψ                     ↔ Intro 3-6, 7-10
    12. ∀v(φ ↔ ψ) → (∀vφ ↔ ∀vψ)       → Intro 1-11

Existential introduction

This rule is quite simple: Let φ be a formula, v a variable, and suppose that for some term t the formula φ(t/v) is true. (Recall that φ(t/v) is obtained by replacing every free occurrence of v by the term t.) Since the term t 'evaluates' to some object, there exists an object for which φ is true. So from φ(t/v) we may deduce ∃vφ. But as in the ∀ elimination rule, we require that φ admits t for v, to avoid variable capture.

Example 0.23 From ∀y 0 ≤ y we can deduce ∃x∀y x ≤ y. But from ∀y y ≤ y we cannot obtain ∃x∀y y ≤ x, since if t is the term y then t is not admissible for x in ∀y y ≤ x.

Theorem 0.24 (∃ Intro) If ∆ ⊢ φ(t/v) and φ admits t for v, then ∆ ⊢ ∃vφ.

Proof. Assume φ admits t for v. We first show that ⊢ φ(t/v) → ∃vφ. Since ∃vφ is an abbreviation for ¬∀v¬φ, this is the same as showing ⊢ φ(t/v) → ¬∀v¬φ.
By contraposition and ¬¬ Elim this formula follows from ⊢ ∀v¬φ → ¬φ(t/v). Now we note that since φ admits t for v, we also have that ¬φ admits t for v, so the previous formula is an instance of A5.
Now assume that ∆ ⊢ φ(t/v). Using the result in the above paragraph and modus ponens, we conclude that ∆ ⊢ ∃vφ. □

In a deduction the rule is applied as follows:
    m.  φ(t/v)
        ...
        ∃vφ        ∃ Intro m
provided that φ admits t for v.

There are two simple cases of ∀ Elim and ∃ Intro that deserve to be mentioned: if the term t is simply the variable v, then φ admits t for v, and φ(v/v) is the same as φ, so we have ∀vφ ⊢ φ and φ ⊢ ∃vφ. Combining these two in the given order shows that for any formula φ we have ∀vφ ⊢ ∃vφ.

Existential Elimination

This rule is more subtle and requires a subproof. It is similar to the ∨ Elim rule.

Example 0.25 From the existence of a Greek, and the fact that every Greek is human, we can deduce the existence of a human: Given ∃xG(x), let the formula G(x) be a temporary assumption in a subproof. Since G(x) implies H(x), we can deduce H(x), so the sentence ∃xH(x) follows by the above rule of ∃ Intro. This sentence does not have x as a free variable, so it can be moved out of the subproof, and is the result of ∃ Elim.

Theorem 0.26 (∃ Elim) If ∆ ⊢ ∃vφ and ∆, φ ⊢ ψ and if v is not free in ∆ or ψ, then ∆ ⊢ ψ.

Proof. Suppose ∆ ⊢ ∃vφ, ∆, φ ⊢ ψ and v is not free in ∆ or ψ. Then ∆ ⊢ φ → ψ by the deduction theorem, and by contraposition ∆ ⊢ ¬ψ → ¬φ. Now we can add the following steps to an n-step deduction from ∆ that includes ∃vφ and ¬ψ → ¬φ at steps k and m:
    n + 1.  ¬ψ                as
    n + 2.  ¬φ                MP m, n + 1
    n + 3.  ∀v¬φ              ∀ Intro n + 2, since v is not free in ψ or ∆
    n + 4.  ¬∀v¬φ             same as k
    n + 5.  ∀v¬φ ∧ ¬∀v¬φ      ∧ Intro n + 3, n + 4
    n + 6.  ¬¬ψ               ¬ Intro
    n + 7.  ψ                 ¬¬ Elim
□

In a proof outline, the deduction for ∆, φ ⊢ ψ is done in a subproof.
    k.  ∃vφ
    m.      φ
            ...
    n.      ψ
        ψ        ∃ Elim k, m-n
provided that v is not free in ψ or in any undischarged assumption (other than φ) on which ψ depends.
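The quantifier rules can also be tried out in a proof assistant. The following Lean 4 sketch is offered only as an illustration and is not the deduction system of these notes: Lean's built-in rules play the role of ∀ Elim (function application), ∀ Intro (fun), ∃ Intro (Exists.intro) and ∃ Elim (Exists.elim). The type α standing for the universe and the hypothesis names are assumptions of the sketch.

```lean
-- Example 0.25: from ∃x G(x) and ∀x (G(x) → H(x)) deduce ∃x H(x).
-- Exists.elim opens a subproof with a witness x and the assumption G x.
example {α : Type} (G H : α → Prop)
    (hG : ∃ x, G x) (hGH : ∀ x, G x → H x) : ∃ x, H x :=
  Exists.elim hG (fun x hx => Exists.intro x (hGH x hx))

-- Example 0.22: ∀v(φ ↔ ψ) → (∀vφ ↔ ∀vψ).
example {α : Type} (φ ψ : α → Prop) :
    (∀ v, φ v ↔ ψ v) → ((∀ v, φ v) ↔ (∀ v, ψ v)) :=
  fun h => Iff.intro
    (fun hφ v => (h v).mp (hφ v))
    (fun hψ v => (h v).mpr (hψ v))

-- ∀vφ ⊢ ∃vφ relies on the universe being nonempty,
-- which is made explicit here by the Inhabited instance.
example {α : Type} [Inhabited α] (φ : α → Prop)
    (h : ∀ v, φ v) : ∃ v, φ v :=
  Exists.intro default (h default)
```

In the first example the witness x exists only inside the function passed to Exists.elim, which mirrors the side condition that v must not be free in the conclusion ψ.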
Examples of incorrect and correct use of quantifier rules

The first proof below is incorrect: it shows a correct application of the ∃ Elim rule, but in the subproof ∀ Intro is applied incorrectly, since φ might contain v as a free variable in step 2. If we assume v is not free in φ, then the proof is correct.

Example 0.27
(i)
    1.  ∃vφ
    2.  φ          as
    3.  ∀vφ        ∀ Intro 2 (illegal!)
    4.  ∀vφ        ∃ Elim 1, 2-3
(ii) Assume ψ is a formula with no free occurrence of v in it. Give justifications for each step.
    1.  ∃vφ → ψ
    2.  φ
    3.  ∃vφ
    4.  ψ
    5.  φ → ψ
    6.  ∀v(φ → ψ)
(iii)
    1.  ¬∃vφ       as
    2.  ¬¬∀v¬φ
    3.  ∀v¬φ
(iv)
    1.  ¬∀vφ       as
    2.  ¬∃v¬φ      as
    3.  ¬φ         as
    4.  ∃v¬φ
    5.  ∃v¬φ ∧ ¬∃v¬φ
    6.  ¬¬φ
    7.  φ
    8.  ∀vφ
    9.  ∀vφ ∧ ¬∀vφ
    10. ¬¬∃v¬φ
    11. ∃v¬φ

Exercise 12 Find deductions for the following:
(i) ∀v¬φ ⊢ ¬∃vφ
(ii) ∃v¬φ ⊢ ¬∀vφ
(iii) ⊢ ∀vφ ↔ φ if v is not free in φ
(iv) ⊢ ∃vφ ↔ φ if v is not free in φ
(v) ⊢ ∀u∀vφ ↔ ∀v∀uφ
(vi) ⊢ ∃u∃vφ ↔ ∃v∃uφ
(vii) ⊢ ∃u∀vφ → ∀v∃uφ
(viii) ⊢ ∀v(φ ∧ ψ) ↔ ∀vφ ∧ ∀vψ
(ix) ⊢ ∃v(φ ∨ ψ) ↔ ∃vφ ∨ ∃vψ
(x) ⊢ ∃v(φ ∧ ψ) → ∃vφ ∧ ∃vψ
(xi) ⊢ ∀vφ ∨ ∀vψ → ∀v(φ ∨ ψ)
(xii) ⊢ ∀v(φ ∨ ψ) ↔ ∀vφ ∨ ψ if v is not free in ψ
(xiii) ⊢ ∃v(φ ∨ ψ) ↔ ∃vφ ∨ ψ if v is not free in ψ
(xiv) ⊢ ∀v(φ ∧ ψ) ↔ ∀vφ ∧ ψ if v is not free in ψ
(xv) ⊢ ∃v(φ ∧ ψ) ↔ ∃vφ ∧ ψ if v is not free in ψ
(xvi) ⊢ ∀v(φ ↔ ψ) → (∀vφ ↔ ψ) if v is not free in ψ
(xvii) ⊢ ∀v(φ ↔ ψ) → (∃vφ ↔ ∃vψ)
(xviii) ⊢ ∀v(φ → ψ) ↔ ∃vφ → ψ if v is not free in ψ
(xix) ⊢ ∀v(φ → ψ) ↔ φ → ∀vψ if v is not free in φ

0.4 First order theories with equality

So far we have studied a proof system for general first order logic, with no specific mathematical theory in mind. Now we add further axioms to our system that are not necessarily true in all situations, but which do hold in situations that are of common interest. All the results from before are still applicable.
A first order theory T is defined by a signature σT (also called language or type) which is a set of predicate symbols (including at least =), function symbols and constant symbols, together with a set of formulae AxT, called the axioms of T. The formulae of T are all the formulae constructible from the symbols in σT together with ¬, →, ∀, the parentheses and the comma, and the variables. It is assumed that all axioms of T are formulae of T. We say that a set of formulae ∆ in a theory T yields φ, denoted by ∆ ⊢T φ, if ∆ ∪ AxT ⊢ φ in first order logic with equality.

Often one presents first-order logic such that there is always a special predicate = available to denote equality; it has two arguments and x = y has the intended meaning that x and y compute the same thing. We will also abbreviate ¬(x = y) as x ≠ y. In mathematics the equality symbol has very special properties, which are captured by the following formulae:

E1: ∀x x = x (i.e. = is reflexive)
E2: ∀x1 · · · ∀xn ∀y1 · · · ∀yn (x1 = y1 ∧ · · · ∧ xn = yn → (G(x1, . . . , xn) ↔ G(y1, . . . , yn))) for each n-ary predicate symbol G
E3: ∀x1 · · · ∀xn ∀y1 · · · ∀yn (x1 = y1 ∧ · · · ∧ xn = yn → f(x1, . . . , xn) = f(y1, . . . , yn)) for each n-ary operation symbol f

Since we want these formulae to be available for use in deductions, we add them as additional axioms to our proof system, thereby obtaining first order logic with equality. To be explicit we should write ∆ ⊢E φ to indicate that φ was derived from ∆ with possible use of axioms E1, E2, E3 (as well as A1-A6, Ax Gen and MP). However, since equality is part of practically every mathematical theory, we will drop the subscript E and henceforth redefine the meaning of ⊢.

Exercise 13
(i) ⊢ ∀x∀y(x = y → y = x) i.e. = is symmetric.
Hint: use E2 with G taken to be ‘=’, and then ∀ eliminate with x1, x2, y1, y2 replaced by x, x, y, x respectively. This gives x = y ∧ x = x → (x = x ↔ y = x), and x = x is available from E1.
(ii) ⊢ ∀x∀y∀z(x = y ∧ y = z → x = z) i.e. = is transitive.
Hint: again use E2 with G taken to be ‘=’, and ∀ eliminate with x1, x2, y1, y2 replaced by x, z, y, z respectively.

Therefore, the interpretation of ‘=’ in any model is an equivalence relation. However, the axioms do not force the interpretation of ‘=’ to be equality. (Counterexample?)

These properties of = allow us to collapse chains of equalities of the form t1 = t2, t2 = t3, . . . , tn−1 = tn and conclude ti = tj for any i, j ≤ n. Since this occurs frequently in deductions, we introduce a rule for this reasoning, denoted = m1, . . . , mn−1 where the mi’s are the line numbers of the equality formulae used in the chain.

Example 0.28
    1.   t1 = t2      as
    2.   t4 = t3      as
    3.   t3 = t2      as
    4.   t1 = t4      = 1, 2, 3

Another very useful property of equality that can be proved from the above axioms is that ‘equals can be substituted by equals’. The following two results are proved by induction in a manner similar to the proof of the Equivalence Theorem (see an introductory logic book for details). The first one justifies ‘substitution of equals’ in terms, and the second one justifies it in formulae.

Theorem 0.29 Suppose r, r′ and t are terms, and t′ is the result of replacing some occurrences of r as a subterm of t by r′. Then ⊢ r = r′ → t = t′.

Theorem 0.30 Suppose r, r′ are terms, φ is a formula, and φ′ is the result of replacing some occurrences of r as a subterm of φ by r′. Then ⊢ r = r′ → (φ ↔ φ′).

Let r, r′, t, t′, φ, φ′ be as in the previous two theorems. The following useful rules are justified by these theorems and MP.

    m.   r = r′
         ...
         t = t′       sub m

    m.   r = r′
         ...
         φ ↔ φ′       sub m

    m.   r = r′
         ...
    n.   φ
         ...
         φ′           sub m in n

The statement ‘there is exactly one u such that φ’ is common enough in mathematics to deserve being abbreviated as ∃!uφ. Formally this is an abbreviation for

    ∃uφ ∧ ∀u∀v(φ ∧ φ(v/u) → u = v)

where v is a variable that does not occur in φ. This is simply the conjunction of an existence formula with a uniqueness formula.
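To see how the abbreviation ∃!uφ unfolds, here is a minimal Python sketch that evaluates the two conjuncts over a finite domain. It assumes that ‘=’ is interpreted as genuine equality (Python's `==`) and that φ is given as a Python function of one argument; both the function names and this representation are choices made for the illustration only.

```python
def exists_unique(domain, phi):
    """Evaluate ∃uφ ∧ ∀u∀v(φ ∧ φ(v/u) → u = v) over a finite domain.

    `domain` is a list of distinct objects; `phi` maps an object to a bool.
    """
    # Existence conjunct: ∃uφ
    existence = any(phi(u) for u in domain)
    # Uniqueness conjunct: ∀u∀v(φ ∧ φ(v/u) → u = v)
    uniqueness = all(
        (not (phi(u) and phi(v))) or u == v
        for u in domain
        for v in domain
    )
    return existence and uniqueness

def exactly_one_by_counting(domain, phi):
    """Informal reading: exactly one element of the domain satisfies φ."""
    return sum(1 for u in domain if phi(u)) == 1
```

On a list of distinct elements the two functions agree. For example, over `list(range(5))` the property `u * u == 4` is satisfied by 2 alone, whereas `u % 2 == 0` fails the uniqueness conjunct and `u > 10` fails the existence conjunct.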
Example 0.31 Here is a formal proof that for every object there is a unique object that is equal to it (namely itself).

    1.   x = x                                            =
    2.   ∃y(y = x)                                        ∃ Intro 1
    3.     y = x ∧ z = x                                  as
    4.     y = z                                          = 3
    5.   y = x ∧ z = x → y = z                            → Intro 3-4
    6.   ∀y∀z(y = x ∧ z = x → y = z)                      ∀ Intro 5
    7.   ∃y(y = x) ∧ ∀y∀z(y = x ∧ z = x → y = z)          ∧ Intro 2, 6
    8.   ∃!y(y = x)                                       same as 7
    9.   ∀x∃!y(y = x)                                     ∀ Intro

Exercise 14 Assuming v is a variable that does not occur in φ, prove the following theorems.
(i) ⊢ ∃!uφ ↔ ∃u∀v(u = v ↔ φ(v/u))
(ii) ⊢ ∃!uφ ↔ ∃u(φ ∧ ∀v(φ(v/u) → u = v))

We now consider some examples of first order theories.

The theory L of (strict) linear orders uses two binary predicate symbols = and <, no operation symbols and no constant symbols, hence the signature σL is {=, <}. The only terms of this theory are the variables. The axioms are:

L1 ∀x ¬(x < x)
L2 ∀x∀y∀z(x < y ∧ y < z → x < z)
L3 ∀x∀y(x < y ∨ x = y ∨ y < x)

The law of trichotomy states that exactly one of x < y, x = y or y < x holds in a linear order. Although this is seemingly stronger than L3, it follows from the above axioms.

Exercise 15 ⊢L ¬(x < y ∧ x = y) ∧ ¬(x < y ∧ y < x)

If we omit axiom L3, we get the theory of (strict) partial orders. The theory P of (reflexive) partial orders has a slightly different presentation: its signature is {=, ≤} and it has axioms

P1 ∀x(x ≤ x)                            reflexivity
P2 ∀x∀y(x ≤ y ∧ y ≤ x → x = y)          antisymmetry
P3 ∀x∀y∀z(x ≤ y ∧ y ≤ z → x ≤ z)        transitivity.

The precise relationship between these two presentations is given by the following exercise.

Exercise 16 Show that if we consider x ≤ y as an abbreviation for x < y ∨ x = y then L1, L2 imply P1, P2 and P3. Conversely, show that if x < y is an abbreviation for x ≤ y ∧ x ≠ y then P1, P2, P3 imply L1 and L2.

First-order theories of natural number arithmetic are often referred to as ‘Peano Arithmetic’ (although Peano’s axiomatization of arithmetic was not first-order). We use the letter N for a version of this theory.
The signature of N is {+, ·, ′, 0, =} where ′ is a unary operation symbol that is written postfix and denotes the successor operation. The axioms of N are

N1 ∀x∀y(x′ = y′ → x = y)
N2 ∀x(x′ ≠ 0)
N3 ∀x(x + 0 = x)
N4 ∀x∀y(x + y′ = (x + y)′)
N5 ∀x(x · 0 = 0)
N6 ∀x∀y(x · y′ = x · y + x)
N7 For each formula φ, φ(0/v) ∧ ∀v(φ → φ(v′/v)) → ∀vφ.

The last axiom is the induction principle, and it is really an axiom schema since it has infinitely many instances.

Exercise 17 Show that the following formulae can be derived from N1–N7.
(i) ∀x(x + 0 = 0 + x)
(ii) ∀x∀y(x′ + y = x + y′)
(iii) commutativity of addition: ∀x∀y(x + y = y + x)
(iv) ∀x(x · 0 = 0 · x)
(v) ∀x(x · 0′ = x)
(vi) ∀x(0′ · x = x)
(vii) commutativity of multiplication: ∀x∀y(x · y = y · x)
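Axioms N3–N6 can be read as recursive definitions of + and · in terms of 0 and successor. The Python sketch below takes that reading on the standard natural numbers (Python's non-negative integers stand in for the numerals, and `y - 1` plays the role of the predecessor of a successor; these are assumptions of the illustration) and tests some of the identities of Exercise 17 on a bounded range. A bounded test is only a sanity check: the actual derivations need the induction schema N7.

```python
def succ(x):
    """Successor x′ on the standard natural numbers."""
    return x + 1

def add(x, y):
    """Addition unfolded by N3 (x + 0 = x) and N4 (x + y′ = (x + y)′)."""
    return x if y == 0 else succ(add(x, y - 1))

def mul(x, y):
    """Multiplication unfolded by N5 (x · 0 = 0) and N6 (x · y′ = x · y + x)."""
    return 0 if y == 0 else add(mul(x, y - 1), x)

def bounded_check(bound):
    """Test some identities of Exercise 17 for all x, y < bound (not a proof)."""
    nums = range(bound)
    return all(
        add(x, 0) == add(0, x)                    # (i)
        and add(succ(x), y) == add(x, succ(y))    # (ii)
        and add(x, y) == add(y, x)                # (iii)
        and mul(x, succ(0)) == x                  # (v)
        and mul(x, y) == mul(y, x)                # (vii)
        for x in nums
        for y in nums
    )
```

Calling `bounded_check(8)` exercises each identity on all pairs of numbers below 8. The recursion in `add` and `mul` follows the second argument, which mirrors the fact that the derivations in Exercise 17 proceed by induction on that argument.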