Peano and Heyting Arithmetic
3. Peano Arithmetic
3.1. Language and Axioms.
Definition 3.1. The language of arithmetic consists of:
• A 0-ary function symbol (i.e. a constant) 0,
• A unary function symbol S,
• Two binary function symbols +, ·,
• Two binary relation symbols =, <,
• For each n, infinitely many n-ary predicate symbols X^n_i.
We often abbreviate ¬(x = y) by x ≠ y and sometimes ¬(x < y) by x ≮ y. We write x ≤ y as an abbreviation for x < y ∨ x = y and s + t, s · t as “abbreviations” for +st and ·st.
We intend these symbols to represent their usual meanings regarding arithmetic. S is the successor operation. The predicate symbols X^n_i intentionally have no fixed meaning; their purpose is that if we prove a formula φ containing one of them, then not only have we proven φ[ψ/X^n_i] (the formula where we replace X^n_i with the formula ψ) for any ψ in our language, we have proven φ[ψ/X^n_i] for any formula ψ in any extension of the language of arithmetic.
Definition 3.2. P⁻ consists of the formulas:
• ∀x(x = x),
• ∀x∀y(x = y → φ[x/z] → φ[y/z]) where φ is atomic and x and y are substitutable for z in φ,
• ∀x(Sx ≠ 0),
• ∀x∀y(Sx = Sy → x = y),
• ∀x∀y(x < Sy ↔ x ≤ y),
• ∀x(x ≮ 0),
• ∀x∀y(x < y ∨ x = y ∨ y < x),
• ∀x(x + 0 = x),
• ∀x∀y(x + Sy = S(x + y)),
• ∀x(x · 0 = 0),
• ∀x∀y(x · Sy = x · y + x).
The second equality axiom is a bit subtle. In particular, note that φ is
allowed to contain x or y, so we can easily derive
x = y → x = x → y = x
(taking φ to be z = x) and
y = x → y = w → x = w
(taking φ to be z = w).
Finally, in order to prove anything interesting, we need to add an induction scheme.
Definition 3.3. The axioms of arithmetic, Γ_PA, consist of P⁻ plus, for every formula φ and each variable x, the formula
φ[0/x] → ∀x(φ → φ[Sx/x]) → ∀xφ.
We write PA ⊢ Γ ⇒ Σ if Fc ⊢ Γ_PA, Γ ⇒ Σ and HA ⊢ Γ ⇒ Σ if Fi ⊢ Γ_PA, Γ ⇒ Σ.
PA stands for Peano Arithmetic while HA stands for Heyting arithmetic.
Definition 3.4. The numerals are the terms built only from 0 and S. If n is a natural number, we write n for the corresponding numeral, given recursively by:
• the numeral for 0 is the term 0,
• the numeral for n + 1 is S followed by the numeral for n.
3.2. Basic Properties. We will generally not try to give even simple proofs
explicitly in the sequent calculus: even very simple arguments rapidly become infeasible. (Consider, for example, that the most basic arguments
involving substitution generally take several inference rules.)
Instead, we will accept from our previous work that Fc already captures
most ordinary logical reasoning, and we will give careful arguments from the
axioms, using informal logic.
Theorem 3.5. HA proves that addition is commutative:
∀x∀y x + y = y + x.
Proof. By induction on x. It suffices to show ∀y(0 + y = y + 0) and ∀y(x + y = y + x) → ∀y(Sx + y = y + Sx).
For the first of these, since y + 0 = y, it suffices to show that 0 + y = y;
we show this by induction on y. 0 + 0 = 0 is an instance of an axiom, and
if 0 + y = y then 0 + Sy = S(0 + y) = Sy.
Now assume that ∀y(x + y = y + x). Again, we go by induction on y.
Sx + 0 = Sx and we have already shown that 0 + Sx = Sx. Suppose
Sx + y = y + Sx; then
Sx + Sy = S(Sx + y) = S(y + Sx) = SS(y + x) = SS(x + y) = S(x + Sy) = S(Sy + x) = Sy + Sx.
(Consider just how many inference rules it would take to completely formalize applying the transitivity of = over seven equalities, which forms only one of the three inductive arguments in the proof.)
By similar arguments, HA (and so also PA) proves all the standard facts
about the arithmetic operations. These systems are (more than) strong
enough to engage in sensible coding of more complicated (but still finite)
objects. The details of how to accomplish this sort of coding are tedious and
sufficiently described elsewhere, but we will briefly describe what it means
to code something in the language of arithmetic with an example.
One of the first things that needs to be coded is the notion of a finite
sequence of natural numbers. What we mean by this is that we wish to
informally set up a correspondence between finite sequences and natural
numbers. Let us name a function π which is an injective map from finite sequences to natural numbers. The range of π should be definable; that is, there should be a formula φ_π such that HA ⊢ φ_π(n) when n = π(σ) for some σ and HA ⊢ ¬φ_π(n) when n is not in the range of π.
Then we need the natural operations on sequences to be definable. For instance, we would like to be able to take a sequence σ and a natural number n and define the sequence σ⌢⟨n⟩ which consists of appending n to the sequence σ. To code this, we should have a formula φ_⌢ such that:
• If m = π(σ⌢⟨n⟩) then HA ⊢ φ_⌢(π(σ), n, m),
• If m ≠ π(σ⌢⟨n⟩) then HA ⊢ ¬φ_⌢(π(σ), n, m),
• HA ⊢ ∀x, y, z, z′(φ_⌢(x, y, z) ∧ φ_⌢(x, y, z′) → z = z′), and
• HA ⊢ ∀x, y(φ_π(x) → ∃z φ_⌢(x, y, z)).
The first two clauses state that HA proves that φ_⌢ correctly identifies π(σ⌢⟨n⟩) for actual sequences σ and natural numbers n. But this isn’t enough to give the last two clauses, because HA can’t actually prove that the numerals n are the only numbers. So the last two clauses say that HA can actually prove that φ_⌢ represents a well-defined function. (For instance, the last two clauses ensure that in a nonstandard model, which has “nonstandard sequences”, the ⌢ operation still has a sensible interpretation.)
Coding sequences is crucial to the power of HA because we can carry out induction along sequences. In particular, this lets us define exponentiation: we say x = y^z if there is a sequence σ of length z such that σ(0) = y, σ(i + 1) = σ(i) · y for each i, and the last element of σ is equal to x. And once we have done this, we could define iterated exponentiation, and so on.
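To make the coding concrete, here is a minimal sketch in Python of one possible choice of π—coding a sequence by prime-power exponents—together with the ⌢ operation and the definition of exponentiation by sequences just described. This particular coding, and all the helper names, are illustrative assumptions rather than the coding used by any particular development of HA.

    def nth_prime(i):
        """Return the i-th prime (0-indexed); trial division is fine for small examples."""
        count, n = -1, 1
        while count < i:
            n += 1
            if all(n % d for d in range(2, int(n ** 0.5) + 1)):
                count += 1
        return n

    def encode(seq):
        """pi(sigma): code a finite sequence by prod_i p_i^(sigma(i)+1); injective, empty sequence -> 1."""
        code = 1
        for i, x in enumerate(seq):
            code *= nth_prime(i) ** (x + 1)
        return code

    def length(code):
        """Number of entries of the coded sequence: how many initial primes divide the code."""
        i = 0
        while code % nth_prime(i) == 0:
            i += 1
        return i

    def append(code, n):
        """The frown operation: the code of the sequence with n appended at the end."""
        return code * nth_prime(length(code)) ** (n + 1)

    def exp_by_sequences(y, z):
        """x = y^z in the sense of the text: build sigma with sigma(0) = y and
        sigma(i+1) = sigma(i) * y, of length z, and return its last element
        together with the code pi(sigma)."""
        assert z >= 1                      # the text glosses over the case z = 0
        last, code = y, encode([y])
        for _ in range(z - 1):
            last *= y
            code = append(code, last)
        return last, code

    # exp_by_sequences(3, 4) == (81, encode([3, 9, 27, 81]))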
Once we can code sequences, it also becomes much easier to define other
notions, since we can use sequences to combine multiple pieces of information
in a single number. For instance, we could define a finite group to consist of
a quadruple ⟨G, e, +_G, ⁻¹⟩ where G is a number coding a finite set, e is an element of G, +_G and ⁻¹ are numbers coding finite sets of pairs, and then
write down a long formula describing what has to happen for this quadruple
to properly define a group.
To illustrate just how much is provable, we quote Harvey Friedman’s
“grand conjecture”
Every theorem published in the Annals of Mathematics whose
statement involves only finitary mathematical objects (i.e.,
what logicians call an arithmetical statement) can be proved
in EFA. EFA is the weak fragment of Peano Arithmetic based
on the usual quantifier-free axioms for 0, 1, +, ×, exp, together
with the scheme of induction for all formulas in the language
all of whose quantifiers are bounded.
(We will define the notion of a bounded quantifier below.) In other words,
almost all of conventional combinatorics, number theory, finite group theory,
and so on can be coded up and then proven, not only inside PA, but in a
comparatively small fragment of PA. (Later on we’ll have a way to quantify
how strong fragments of PA are, and we’ll learn that EFA is very small
indeed.)
3.3. The Arithmetical Hierarchy.
Definition 3.6. We write ∀x < y φ as an abbreviation for ∀x(x < y → φ)
and ∃x < y φ as an abbreviation for ∃x(x < y ∧ φ).
Note that PA ⊢ ¬∀x < y ¬φ ↔ ∃x < y φ and PA ⊢ ¬∃x < y ¬φ ↔ ∀x < y φ, just as we would expect. We call these bounded quantifiers. As we will
see, formulas in which all quantifiers are bounded behave like quantifier-free
formulas. We call other quantifiers unbounded.
Because HA can describe sequences in a single number, there is no real
difference between a single quantifier ∃x and a block of quantifiers of the
same type, ∃x1 ∃x2 · · · ∃xn —anything said with the latter could be coded up
and expressed with a single quantifier. Furthermore, all the coding necessary
can be done using only bounded quantifiers. Therefore we will generally
simply write a single quantifier, knowing that it could stand for multiple
quantifiers of the same type.
Definition 3.7. The ∆0 formulas are those in which all quantifiers are
bounded. Σ0 and Π0 are alternate names for ∆0.
The Σn+1 formulas are formulas of the form
∃xφ
(possibly with a block of several existential quantifiers) where φ is Πn .
The Πn+1 formulas are formulas of the form
∀xφ
(possibly with a block of several universal quantifiers) where φ is Σn .
In particular, the truth of ∆0 formulas is computable, in the sense that given numeric values for the free variables of a ∆0 formula, we can easily run a computer program which checks in finite time whether the formula is true (under the intended interpretation in the natural numbers).
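To illustrate the claim, here is a minimal sketch of such a checker in Python; the tuple encoding of terms and formulas is an ad hoc choice made only for this example.

    def term_value(t, env):
        """Evaluate a term: an int literal, a variable name, or ("S"|"+"|"*", ...)."""
        if isinstance(t, int):
            return t
        if isinstance(t, str):
            return env[t]
        op = t[0]
        if op == "S":
            return term_value(t[1], env) + 1
        if op == "+":
            return term_value(t[1], env) + term_value(t[2], env)
        if op == "*":
            return term_value(t[1], env) * term_value(t[2], env)
        raise ValueError(op)

    def delta0_true(f, env):
        """Decide a Delta_0 formula in the standard model, given values for its
        free variables; this terminates because every quantifier is bounded."""
        op = f[0]
        if op == "=":
            return term_value(f[1], env) == term_value(f[2], env)
        if op == "<":
            return term_value(f[1], env) < term_value(f[2], env)
        if op == "not":
            return not delta0_true(f[1], env)
        if op == "and":
            return delta0_true(f[1], env) and delta0_true(f[2], env)
        if op == "or":
            return delta0_true(f[1], env) or delta0_true(f[2], env)
        if op == "imp":
            return (not delta0_true(f[1], env)) or delta0_true(f[2], env)
        if op in ("all", "ex"):
            _, x, bound, body = f
            values = (delta0_true(body, {**env, x: n}) for n in range(term_value(bound, env)))
            return all(values) if op == "all" else any(values)
        raise ValueError(op)

    # Example: y is composite iff  exists x < y exists z < y. (SSx)*(SSz) = y.
    composite = ("ex", "x", "y",
                 ("ex", "z", "y", ("=", ("*", ("S", ("S", "x")), ("S", ("S", "z"))), "y")))
    print([n for n in range(2, 30) if not delta0_true(composite, {"y": n})])
    # prints the primes below 30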
By the same argument that shows every formula is equivalent in Fc to
a prenex formula, PA shows that every formula is equivalent to a formula
with its unbounded quantifiers in front, which must be Σn or Πn for some
n.
Lemma 3.8. If t is a closed term then there is a natural number k such that HA ⊢ t = k.
Proof. By induction on the construction of the term t. If t = 0 then this is
trivial since 0 = 0 is derivable.
If t = St′ then by IH we have HA ⊢ t′ = k′, and therefore we can derive St′ = Sk′. To keep the formulas manageable, we write ψ(y) as an abbreviation for t′ = y → St′ = St′ → St′ = Sy; this is an instance of the substitution axiom.
The sequent derivation, written out linearly, runs roughly as follows: from the axiom ψ(k′) ⇒ ψ(k′), applications of L∀ (instantiating the universally quantified substitution axiom in Γ_PA at t′ and then at k′) give Γ_PA ⇒ ψ(k′); similarly, from the axiom St′ = St′ ⇒ St′ = St′ and L∀ applied to the reflexivity axiom we obtain Γ_PA ⇒ St′ = St′. Combining the inductive hypothesis Γ_PA ⇒ t′ = k′ with Γ_PA ⇒ ψ(k′) yields Γ_PA ⇒ St′ = St′ → St′ = Sk′, and combining this with Γ_PA ⇒ St′ = St′ yields Γ_PA ⇒ St′ = Sk′.
If t = t0 + t1 then by IH we have HA ⊢ t0 = k0 and HA ⊢ t1 = k1, and then by induction on k1 we can construct a deduction of HA ⊢ t = k where k = k0 + k1. The t = t0 · t1 case is similar.
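In effect, Lemma 3.8 says that closed terms can be evaluated; here is a minimal sketch of that evaluation (using the same kind of ad hoc tuple encoding of terms as above), whose recursion mirrors the induction on terms in the proof.

    def numeral(n):
        """The numeral for n: the term S(S(...S(0)...)) with n occurrences of S."""
        t = "0"
        for _ in range(n):
            t = ("S", t)
        return t

    def value(t):
        """Evaluate a closed term; this is the k with HA proving t = numeral(k)."""
        if t == "0":
            return 0
        op = t[0]
        if op == "S":
            return value(t[1]) + 1
        if op == "+":
            return value(t[1]) + value(t[2])
        if op == "*":
            return value(t[1]) * value(t[2])
        raise ValueError(op)

    t = ("+", ("S", "0"), ("*", numeral(2), numeral(3)))   # S0 + (2 * 3)
    print(value(t))                                        # 7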
Lemma 3.9. If φ is atomic then HA ⊢ φ ∨ ¬φ.
Proof. We first observe that not only does HA have excluded middle in the
form t = 0 ∨ t ≠ 0, HA even has the slightly stronger form t = 0 ∨ ∃y t = Sy.
This is due to the presence of the induction axiom: certainly 0 = 0 ∨ ∃y 0 =
Sy, and in the inductive case we ignore the hypothesis entirely and note
that Sx = 0 ∨ ∃y Sx = Sy. This means we can argue by cases: if we show
φ(0) and ∀xφ(Sx) then we have ∀xφ(x).
The only atomic formulas are those of the form t0 = t1 or t0 < t1. We first consider the case of =. We proceed by induction on t0. In the case where t0 = 0, we split into cases: 0 = 0, so 0 = 0 ∨ 0 ≠ 0, and 0 ≠ Sy so 0 = Sy ∨ 0 ≠ Sy.
Suppose ∀y(x = y ∨ x ≠ y). Again, we split into cases. Sx ≠ 0, so Sx = 0 ∨ Sx ≠ 0. Sx = Sy is equivalent to x = y, and we assumed that x = y ∨ x ≠ y, so also Sx = Sy ∨ Sx ≠ Sy.
The case for x < y is even simpler, using the fact that x = y ∨ x ≠ y: we already have x < y ∨ x = y ∨ y < x. If x = y then we have x ≮ y, and if x ≠ y then we have x < y ∨ y < x, the latter of which implies x < y ∨ x ≮ y.
Theorem 3.10. If φ is ∆0 then HA ⊢ φ ∨ ¬φ.
Proof. By induction on φ. For φ atomic, this is the previous lemma. Observe that from φ ∨ ¬φ and ψ ∨ ¬ψ, we can derive (φ ~ ψ) ∨ ¬(φ ~ ψ) for ~ any of the connectives ∧, ∨, →.
Suppose the formula is ∃x < t φ. We show by induction that
∀y(∃x < y φ ∨ ¬∃x < y φ).
For y = 0, this is derivable, since we can show that ¬∃x x < 0.
Suppose ∃x < y φ ∨ ¬∃x < y φ. We must show ∃x < Syφ ∨ ¬∃x < Sy φ.
If ∃x < y φ then clearly ∃x < Sy φ. Also if φ(y) then ∃x < Sy φ. Otherwise
we have ¬∃x < y φ and ¬φ(y), and since x < Sy implies x < y or x = y, we
have ¬∃x < Sy φ.
The ∀x < t φ case is similar.
3.4. The Friedman-Dragalin Translation. One interpretation of the last
theorem of the previous section is that ∆0 formulas behave like classical
ones, even in intuitionistic logic. A consequence is that classical logic and
intuitionistic logic have to agree on simple formulas:
Theorem 3.11. If φ is Π2 and PA ⊢ φ then HA ⊢ φ.
This statement is not true if we deduce a sequent of Π2 formulas instead
of a single formula.
For the proof, we need another translation of formulas:
Definition 3.12. Fix a formula θ.
• ⊥^FD is θ,
• If p is atomic and not ⊥, p^FD is p ∨ θ,
• (φ ~ ψ)^FD is φ^FD ~ ψ^FD,
• (Qxφ)^FD is Qx(φ^FD).
Note that this is the result of the ∗ translation from intuitionistic to
minimal logic followed by replacing every occurrence of ⊥ with θ.
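Viewed as an operation on syntax trees, the translation is a short recursion; here is a minimal sketch in Python, again over an ad hoc tuple encoding of formulas (with "bot" standing for ⊥), used only for illustration.

    def fd(formula, theta):
        """The Friedman-Dragalin translation: replace bottom by theta, disjoin theta
        onto other atomic formulas, and commute with connectives and quantifiers."""
        op = formula[0]
        if op == "bot":
            return theta
        if op == "atom":                       # e.g. ("atom", "x < y")
            return ("or", formula, theta)
        if op in ("and", "or", "imp"):
            return (op, fd(formula[1], theta), fd(formula[2], theta))
        if op in ("all", "ex"):                # ("all", x, body)
            return (op, formula[1], fd(formula[2], theta))
        raise ValueError(op)

    theta = ("atom", "theta")
    phi = ("ex", "y", ("imp", ("atom", "x < y"), ("bot",)))
    print(fd(phi, theta))
    # ('ex', 'y', ('imp', ('or', ('atom', 'x < y'), ('atom', 'theta')), ('atom', 'theta')))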
Lemma 3.13. If Fm ⊢ Γ ⇒ Σ and Γ[θ/⊥], Σ[θ/⊥] are the result of replacing every occurrence of ⊥ with the formula θ, then Fm ⊢ Γ[θ/⊥] ⇒ Σ[θ/⊥].
Proof. Proof sketch: this follows from the fact that ⊥ has no special properties in minimal logic. We proceed by induction on deductions, and the only way ⊥ can be introduced is by weakening or by the axiom ⊥ ⇒ ⊥.
Theorem 3.14. If HA ⊢ Γ ⇒ Σ then HA ⊢ Γ^FD ⇒ Σ^FD.
Proof. We proved for first-order logic in general that if Fi ⊢ Γ_PA, Γ ⇒ Σ then Fm ⊢ Γ_PA^*, Γ^* ⇒ Σ^*. The previous lemma then shows that Fm ⊢ Γ_PA^FD, Γ^FD ⇒ Σ^FD.
Furthermore, we have already seen that Fi ⊢ φ → φ^*, and by the same argument, Fi ⊢ φ → φ^FD. In particular, we may apply cuts over all the axioms of Γ_PA actually used in the original proof to obtain Fi ⊢ Γ_PA, Γ^FD ⇒ Σ^FD, and therefore HA ⊢ Γ^FD ⇒ Σ^FD.
Lemma 3.15. If φ is ∆0 and no free variable in θ appears bound in φ then HA ⊢ φ^FD → φ ∨ θ.
Proof. By induction on φ. If φ is ⊥, this is trivial since ⊥^FD is θ. If φ is atomic then φ^FD is exactly φ ∨ θ.
If φ = ψ0 ∨ ψ1, it is easy to derive ψ0 ∨ θ ⇒ (ψ0 ∨ ψ1) ∨ θ and ψ1 ∨ θ ⇒ (ψ0 ∨ ψ1) ∨ θ. Since the inductive hypothesis gives ψ0^FD → ψ0 ∨ θ and ψ1^FD → ψ1 ∨ θ, we can conclude ψ0^FD ∨ ψ1^FD → (ψ0 ∨ ψ1) ∨ θ.
The cases for ∧, → are similar.
Suppose φ is ∃x < t ψ. We show by induction that ∀y((∃x < y ψ)^FD → (∃x < y ψ ∨ θ)). Note that (∃x < y ψ)^FD is ∃x((x < y ∨ θ) ∧ ψ^FD). If y = 0 then since x ≮ 0, the premise immediately implies θ. Suppose the claim holds for y, and we set out to show it for Sy. Assume ∃x((x < Sy ∨ θ) ∧ ψ^FD); using the main inductive hypothesis, we have ∃x((x < Sy ∨ θ) ∧ (ψ ∨ θ)). This easily implies (∃x < Sy ψ) ∨ θ.
The case for ∀x < t ψ is similar.
Theorem 3.16. If φ is Π2 and PA ⊢ φ then HA ⊢ φ.
Proof. We have φ = ∀xθ where θ = ∃yψ is Σ1. We have a deduction of θ in PA. Using the double negation translation yields a deduction HA ⊢ (∀y(ψ → ⊥)) → ⊥.
Applying the Friedman-Dragalin translation gives us
HA ⊢ ∀y(ψ^FD → θ) → θ.
We have by the previous lemma HA ⊢ ψ^FD → ψ ∨ θ, and since ψ → θ, we actually have HA ⊢ ψ^FD → θ and so HA ⊢ ∀y(ψ^FD → θ). Combining these, we obtain a deduction of HA ⊢ θ.
3.5. Ordinals. In order to discuss cut-elimination for Peano Arithmetic, it
is helpful to have a theory of ordinals.
We will be concerned with linear orders which can be defined in HA—
that is, there is a formula ≺ (x, y) with exactly the two listed free variables,
where HA can prove that ≺ is a linear order. We will in fact primarily
be interested in the case where ≺ is ∆0 . We will write x ≺ y in place of
≺ (x, y). We are mostly interested in the interpretation of ≺ as an ordering
on the actual natural numbers, and so we will sometimes equate formulas
which define orderings with the ordering itself.
Definition 3.17. A definable linear ordering of ω is a formula ≺ (x, y) with
exactly the two listed free variables such that HA deduces:
• x ⊀ x,
• If x ≺ y and y ≺ z then x ≺ z,
• If x ≠ y, either x ≺ y or y ≺ x.
≺ is a well-ordering if there is no infinite descending sequence n1 ≻ n2 ≻ · · · .
The statement that ≺ is a well-ordering can’t be directly expressed in the
language of arithmetic, but we can make a coherent attempt. We use the
presence of the fresh predicate symbols to represent the idea of quantifying
over all sequences: we view a binary predicate X as a sequence, saying
X(s, t) holds if s is the t-th element of the sequence. Then the statement
WO(≺) is:
∃x∀y∀z((X(x, y) ∧ X(Sx, z)) → z ⊀ y).
In other words, X does not list an infinite descending sequence in ≺. If
HA ⊢ WO(≺) then it is actually true that, in the standard model, ≺ describes a well-ordering. (In a nonstandard model, this may not be the case
because in such models X describes sequences of “nonstandard length”.)
Of course, there are many examples of formulas ≺ which actually describe
well-orderings, but where HA cannot prove WO(≺).
Being a well-ordering is equivalent to saying that every non-empty set
contains a least element. We can’t quite state this inside arithmetic, so we
prove it externally.
Theorem 3.18. ≺ is a well-ordering iff whenever Y is non-empty, there is
a ≺-least element of Y .
Proof. Suppose Y is non-empty but has no ≺-least element. Let x1 ∈ Y .
Since x1 is not ≺-least, there is an x2 ≺ x1 with x2 ∈ Y . Similarly, x2 is
not ≺-least in Y . Iterating, we obtain an infinite decreasing sequence in ≺,
which shows that ≺ is not well-ordered.
Conversely if ≺ is not well-ordered then there is an infinite descending sequence x1 ≻ x2 ≻ · · · , and clearly {xn} is a non-empty set with no ≺-least element.
In particular, every well-ordering other than the one with empty domain
has a least element, which we generally call 0.
One special feature of well-orderings is that they are precisely the orders
on which transfinite induction makes sense.
Theorem 3.19. Suppose (X, ≺) is a non-empty well-ordering. Let Z ⊆ X
be a set such that 0 ∈ Z and such that for any x ∈ X, if every y ≺ x belongs
to Z then x belongs to Z. Then Z = X.
Proof. Suppose Z ⊊ X. Then X \ Z is non-empty, and therefore has a
≺-least element x ∈ X \ Z. But then for every y ≺ x, y ∈ Z, and therefore
x ∈ Z, a contradiction.
Moreover, transfinite induction can be stated inside arithmetic (in the
rough way that being a well-ordering can be stated): we write T I(≺, X) for
the formula
∀x[(∀y ≺ xX(y)) → X(x)] → ∀xX(x).
We can write T I(≺, φ) if we are interested in particular cases of transfinite
induction, or T I(≺, X) to indicate the statement with one of our fresh predicates X. Note that if we can prove T I(≺, X) with X a fresh predicate then
we can prove T I(≺, φ) for any formula φ.
Another key property of well-orderings is that they are in some sense
unique.
Definition 3.20. An initial segment of X (under ≺) is a set Z ⊆ X such
that whenever z ∈ Z and x ≺ z, x ∈ Z.
Theorem 3.21. Let (X, ≺) and (Y, ≺′) be well-orderings. Then either there
is an order-preserving bijection from X to an initial segment of Y , or an
order-preserving bijection from Y to an initial segment of X.
Proof. If either is empty, this is trivial. Otherwise, we will define, by transfinite recursion, a function f from an initial segment of X to an initial
segment of Y which is a bijection on these initial segments and which is
order-preserving (so f (x) ≺ f (y) iff x ≺ y).
Initially we set f(0) = 0. Suppose X′ ⊆ X, Y′ ⊆ Y are initial segments and we have defined an order-preserving bijection f : X′ → Y′. If X′ = X then f is an order-preserving bijection from X to an initial segment of Y. If Y′ = Y then f⁻¹ is an order-preserving bijection from Y to an initial segment of X.
Otherwise there is a least x ∈ X \ X′ and a least y ∈ Y \ Y′, and we extend f by setting f(x) = y. Clearly X′ ∪ {x} and Y′ ∪ {y} are initial segments and the extended f is an order-preserving bijection.
This means that even though the underlying sets X and Y might be
different, we can find a copy of one of these orderings inside the other.
In particular, this allows us to induce an ordering on well-orderings themselves: (X, ≺) is less than or equal to (Y, ≺′) if there is an order-preserving
bijection from (X, ≺) to an initial segment of Y . (The initial segment could
be all of Y , so we allow for “equality”.) In fact, this is a well-ordering on
the well-orders!
We use the term ordinal to mean an equivalence class of well-orderings—
that is to say, the order itself, rather than some particular description of the
order.
Let’s consider some concrete examples of well-orders which are definable
in PA. Each finite number is an ordinal, and since there is only one linear
ordering on a finite set (up to isomorphism), there is a unique finite ordinal of
each size. In other words, 0 is the smallest ordinal, 1 (the ordinal consisting
of a single point) is the next smallest, then 2 (the ordinal with two points,
one smaller than the other), and so on.
Above all these ordinals is the ordering of the natural numbers, which we
call ω. This ordinal has infinitely many elements ordered in a row. Clearly
ω is definable, by the formula x < y.
A more interesting ordering is given by
x ≺_{ω+1} y ↔ [(0 < x ∧ x < y) ∨ (0 < x ∧ y = 0)].
The smallest element in this order is 1, followed by 2, then 3, and so on,
with 0 larger than any positive number. In other words, this ordering looks
like ω, but with an extra element tacked on at the end, larger than any finite
element.
Next we could define ω+2, which looks like ω+1 but with another number
added on after. In general, if α is any ordering, we could define α + 1 to be
the ordering that looks like α, but with one additional element larger than
any element of α.
Theorem 3.22. If HA proves that α is well-ordered then HA proves that
α + 1 is well-ordered.
Proof. It suffices to show that if X is an infinite descending sequence in
α + 1 then we can define from X an infinite descending sequence in α.
This is easily done: take the sequence x ↦ X(x + 1) (that is, the formula
Y (x, y) ↔ X(x + 1, y)). Certainly every element of the sequence X after the
first must be below the largest element, and therefore must belong to the
ordering α.
We could keep going, and eventually get
x ≺_{ω+ω} y ↔ [(x and y are either both even or both odd and x < y) ∨ (x is odd and y is even)].
This ordering starts with all the odd numbers in their usual order—which
looks like a copy of ω—and then above them is another copy of ω.
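Here is a minimal sketch of these two orderings as Python predicates, transcribing the defining formulas above; the final lines just check that sorting an initial segment of the natural numbers by ≺_{ω+ω} lists the odds before the evens.

    import functools

    def prec_omega_plus_1(x, y):
        """x comes before y in the ordering 1, 2, 3, ..., 0: a copy of omega with 0 placed on top."""
        return (0 < x and x < y) or (0 < x and y == 0)

    def prec_omega_plus_omega(x, y):
        """x comes before y in the ordering 1, 3, 5, ..., 0, 2, 4, ...: the odds, then the evens."""
        return (x % 2 == y % 2 and x < y) or (x % 2 == 1 and y % 2 == 0)

    cmp = lambda x, y: -1 if prec_omega_plus_omega(x, y) else (1 if prec_omega_plus_omega(y, x) else 0)
    print(sorted(range(8), key=functools.cmp_to_key(cmp)))   # [1, 3, 5, 7, 0, 2, 4, 6]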
We note that there is a significant difference between well-orderings like
ω and ω + ω on the one hand, and well-orderings like ω + 7 on the other.
Some well-orderings have largest elements, and some do not. (0 is a special
case.)
Definition 3.23. We say α is a successor ordinal if there is some β such
that α = β + 1. If α is neither 0 nor a successor ordinal, we say α is a limit
ordinal.
Definition 3.24. Let β1 < β2 < · · · be an increasing sequence of ordinals.
We define supn βn to be the least ordinal larger than any βn .
Note that supn βn is well-defined, since the ordinals are themselves well-ordered, so there is a least such ordinal.
Lemma 3.25. Suppose α is a limit ordinal which can be represented with
domain the natural numbers. Then there is a sequence β1 < β2 < · · · such
that α = supn βn .
Proof. Consider some representation of α as a well-ordering ≺ on the natural
numbers. For each n, define γn = {m | m ≺ n}. γn is an initial segment
of α, so is itself a well-ordering. γn does not include n, so γn is a proper initial
segment, in particular γn < α. Define β0 = γ0 and given βn , define βn+1 to
be γm where m is least such that βn < γm .
We have supn βn ≤ α since each βn < α. Suppose δ < α; then δ may
be mapped to some proper initial segment of α, so in particular α \ δ is
non-empty, and there must be some least k belonging to α \ δ. Then δ =
{m | m ≺ k}, and therefore δ < γk+1 ≤ βk+1 . This holds for every δ < α,
so α ≤ supn βn .
We can define addition on well-orderings:
Definition 3.26.
• α + 0 = α,
• α + (β + 1) = (α + β) + 1,
• If λ = supn βn is a limit, α + λ = supn (α + βn ).
An immediate consequence of this definition is that addition is not commutative. It is easy to see why: addition corresponds to the operation of
placing one ordering after another. So ω < ω + 1, because adding a new element at the end of ω gives a larger ordering. But 1 + ω = ω, since ω
already has an infinite increasing sequence, and adding an element to the
beginning doesn’t change its length.
We can similarly define multiplication as iterated addition:
Definition 3.27.
• α · 0 = 0,
• α · (β + 1) = α · β + α,
• If λ = supn βn is a limit, α · λ = supn (α · βn ).
Again, this is not commutative. For instance, ω · 2 = ω + ω is two copies
of ω, as we have already seen. But 2 · ω is infinitely many pairs, which is
really the same as ω.
To consider the first really non-trivial example, ω · ω = ω 2 consists of a
copy of ω, followed by a second copy of ω, followed by a third, and so on.
An easy representation is in terms of pairs: we think of the pair (n, m) as representing ω · n + m, so (n, m) < (n′, m′) if either n < n′ or n = n′ and m < m′.
Although we will not prove it, both addition and multiplication are still
associative.
Naturally, the next step is exponentiation.
Definition 3.28.
• α^0 = 1,
• α^(β+1) = α^β · α,
• If λ = supn βn then α^λ = supn(α^βn).
We will really only use the cases where α = 2 or α = ω. It is important to note that ordinal exponentiation is not cardinal exponentiation. In particular, 2^ω = ω, which is very different from 2^ℵ0.
It turns out that there is a natural representation of exponentiation.
Lemma 3.29. Consider the collection X of functions x : β → α (here we equate α and β with the set of smaller ordinals) such that x(γ) is non-zero at only finitely many values γ. We may order such functions by setting x ≺ y if, when γ < β is largest such that x(γ) ≠ y(γ), we have x(γ) < y(γ). Then X is a representation of the ordinal α^β.
Choosing γ largest here is possible since x and y are non-zero at only finitely many places.
Proof. By induction on β. When β = 0, |X| = 1, since it contains only the empty function. Suppose the claim holds for β, and we show it for β + 1: each function x ∈ X can be viewed as a pair (γ_x, x′) where γ_x < α and x′ is a function from β to α. Clearly x ≺ y if either γ_x < γ_y or γ_x = γ_y and x′ ≺ y′. Therefore X can be viewed as α copies of X′ (the corresponding collection for β) in order, which is exactly α^β · α.
If λ = supn βn, observe that every element of X_λ is an element of X_βn for some n.
One special feature of all these operations is that they have fixed points.
Definition 3.30. α is additively principal if whenever β, γ < α, β + γ < α.
α is multiplicatively principal if whenever β, γ < α, β · γ < α.
α is exponentially principal if whenever β, γ < α, β^γ < α.
Lemma 3.31. α > 0 is additively principal iff α = ω^β for some β.
Proof. By induction on β. If β = 0 then α = ω^0 = 1, and the claim is obvious. Suppose the claim holds for β; if γ, δ < ω^(β+1) = ω^β · ω then there must be n, m < ω such that γ < ω^β · n and δ < ω^β · m. Then γ + δ < ω^β · n + ω^β · m = ω^β · (n + m) < ω^(β+1).
If λ = supn βn and γ, δ < ω^λ, then, since the claim holds for each βn, there is some n such that γ, δ < ω^βn, and therefore γ + δ < ω^βn < ω^λ.
Similarly, α > 2 is multiplicatively principal iff α = ω^(ω^β) for some β. (0, 1, 2 are multiplicatively principal as well.)
The first exponentially principal ordinal greater than ω is named ε₀, and it has a special relationship with PA. ε₀ is the limit of taking exponents: define ω0 = 0, ωn+1 = ω^(ωn). Then ε₀ = supn ωn.
Our next step will be obtaining a description of ε₀ inside arithmetic. We will do this by providing a normal form—a canonical way of writing the ordinals below ε₀.
Lemma 3.32. If α is additively principal and β < α then β + α = α.
Proof. If α = 0 or α = 1, this is trivial. Otherwise α is a limit, say α = supn αn, so β + α = supn(β + αn); each β + αn < α since α is additively principal, while αn ≤ β + αn, so this supremum is exactly α.
Lemma 3.33. If α is not additively principal, there are β, γ < α such that
β + γ = α.
Proof. Choose β, γ < α such that β + γ ≥ α. Let γ′ be least such that β + γ′ ≥ α; clearly γ′ ≤ γ < α. If γ′ = δ + 1 then we have β + δ < α, so β + γ′ ≤ α, and therefore β + γ′ = α. If γ′ = supn δn then for each n, β + δn < α, and therefore supn(β + δn) ≤ α, so again β + γ′ = α.
Lemma 3.34. Suppose β, γ are additively principal, α < γ, α < β, and γ + α = β + α. Then γ = β.
Proof. Suppose the claim fails, and let γ be smallest so that this fails, so α < γ, α < β, γ, β are additively principal, and γ + α = β + α, but γ ≠ β. If β < γ then β would be an example of an ordinal smaller than γ for which the same statement holds, so we must have γ < β.
But since γ < β, α < β, and β is additively principal, γ + α < β ≤ β + α, a contradiction.
Theorem 3.35 (Additive Normal Form). For any α, there is a unique
sequence of additively principal ordinals α1 ≥ α2 ≥ · · · ≥ αn such that
α = α1 + α2 + · · · + αn .
Proof. We define the sequence explicitly as follows. We let α1 be the largest
additively principal ordinal ≤ α. To see that this exists, observe that the
supremum of additively principal ordinals is itself additively principal, so we
may take α1 to be the supremum of all additively principal ordinals ≤ α.
Suppose we have chosen α1 ≥ · · · ≥ αk so that α1 + · · · + αk ≤ α. If
these are equal, we are done, so suppose α1 + · · · + αk < α. Let αk+1 be
the largest additively principal ordinal such that α1 + · · · + αk + αk+1 ≤ α
(again, the largest such ordinal exists by taking it to be the supremum of
all such ordinals). We have αk+1 ≤ αk since if αk+1 > αk ,
α1 + · · · + αk + αk+1 = α1 + · · · + αk−1 + αk+1 ≤ α
contradicting the maximality of αk .
It remains to show that this process terminates. Since the ordinals are
well-founded, the sequence α1 ≥ · · · ≥ αk · · · cannot be strictly decreasing
infinitely many times, so in order for the process to fail to terminate, there
would have to be some k so that αk = αk+n for all n. That is,
α1 + · · · + αk · n ≤ α
for all n. But then α1 + · · · + αk · ω ≤ α, and since αk · ω is additively
principal and αk < αk · ω, we contradict the maximality of αk .
Now we need to show uniqueness. Suppose β1 ≥ · · · ≥ βm , each βi is
additively principal, and β1 + · · · + βm = α. We will show by induction on
i that βi = αi . Suppose βj = αj for j < i. If αi < βi then, by maximality
of αi ,
α < α1 + · · · + αi−1 + βi = β1 + · · · + βi−1 + βi ≤ β1 + · · · + βm ,
contradicting the assumption that β1 + · · · + βm = α. If βi < αi then βi′ < αi for every i′ ≥ i, and therefore
β1 + · · · + βi−1 + βi + · · · + βm = α1 + · · · + αi−1 + βi + · · · + βm < α1 + · · · + αi−1 + αi ≤ α,
and so β1 + · · · + βm < α, again contradicting the assumption.
Theorem 3.36. Suppose 0 < α < ε₀. Then there is a unique sequence of ordinals α1 ≤ α2 ≤ · · · ≤ αn < α such that α = ω^αn + · · · + ω^α1.
Definition 3.37. We define the Cantor normal forms as follows:
• 0 is a Cantor normal form,
• If α1 ≥ α2 ≥ · · · ≥ αn are in Cantor normal form then so is
ω^α1 + ω^α2 + · · · + ω^αn.
Since each Cantor normal form is in additive normal form, the Cantor
normal form is unique. Note that it is easy to code the Cantor normal form
in arithmetic using sequences.
We need one more arithmetic operation, a modification of addition which
is commutative.
Definition 3.38. The natural or commutative sum of α and β, written #,
is given as follows. Suppose the additive normal forms of α and β are
α = α1 + · · · + αn
and
β = αn+1 + · · · + αn+m .
Then
α#β = απ(1) + · · · + απ(n+m)
where π : [1, n + m] → [1, n + m] is a permutation such that απ(i+1) ≤ απ(i)
for all i < n + m.
For instance, 1#ω = ω + 1. More elaborately,
(ω^ω + ω^2 + 1)#(ω^3 + ω^2 + ω) = ω^ω + ω^3 + ω^2 · 2 + ω + 1.
This choice of permutation π is precisely the choice that makes α#β as large
as possible.
Lemma 3.39.
(1) α#β = β#α,
(2) α < β implies α#γ < β#γ and γ#α < γ#β,
(3) # is associative,
(4) If α is additively principal and β, γ < α then β#γ < α.
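Ordinals below ε₀ can be manipulated directly through their Cantor normal forms. Here is a minimal sketch, representing an ordinal by the descending list of the exponents in its normal form (each exponent again such a list); this representation and the function names are one illustrative choice. The natural sum is then just a merge of the two exponent lists.

    import functools

    # [] is 0, [[]] is omega^0 = 1, [[[]]] is omega^1 = omega, and in general
    # [a1, a2, ...] with a1 >= a2 >= ... stands for omega^a1 + omega^a2 + ...

    def cmp_ord(a, b):
        """Compare two Cantor normal forms, returning -1, 0, or 1."""
        for x, y in zip(a, b):
            c = cmp_ord(x, y)
            if c != 0:
                return c
        return (len(a) > len(b)) - (len(a) < len(b))

    def natural_sum(a, b):
        """alpha # beta: merge the exponent lists of alpha and beta into descending order."""
        return sorted(a + b, key=functools.cmp_to_key(cmp_ord), reverse=True)

    zero, one = [], [[]]
    omega = [one]                                              # omega^1
    print(natural_sum(one, omega))                             # [[[]], []]  =  omega + 1
    print(natural_sum(one, omega) == natural_sum(omega, one))  # commutativity, Lemma 3.39(1)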
3.6. Cut-Elimination. The cut-elimination theorem for first-order logic
applies to Peano Arithmetic, but it isn’t very useful: given a deduction of Γ_PA, Γ ⇒ Σ, there is a cut-free deduction, but since the axioms in Γ_PA include induction instances for every formula, we lose all the useful properties of cut-elimination. What we would really like is to be able to obtain a deduction without induction axioms—that is, given a deduction of Γ_PA, Γ ⇒ Σ, we would like a cut-free deduction of P⁻, Γ ⇒ Σ. This would have two benefits: first, it would give us (most of) the consequences of cut-elimination back, since the axioms of P⁻ are very simple formulas. Further, such a result says something about the consistency of PA: in particular, if PA ⊢ ⊥ then ⊢ P⁻ ⇒ ⊥. Since the axioms of P⁻ are essentially just definitions, the latter is impossible, so we could conclude that PA is consistent.
There is one problem: the proposed theorem isn’t quite true. However it
is true if we restrict ourselves to formulas of a specific form. In particular,
we will show that if PA ⊢ Σ where Σ consists only of Σ1 formulas then Fc ⊢ P⁻ ⇒ Σ.
The proof of cut-elimination for Peano Arithmetic is a step beyond anything we have done so far. In order to simplify the proof, we will take a
strange route. We will introduce a new sequent calculus which allows infinitary rules—that is, rules which have infinitely many branches. We will
show how to embed proofs from regular Peano Arithmetic into this infinitary system, and then we will prove that a form of cut-elimination holds in this infinitary system. Specifically, we will prove that if Fc ⊢ Γ_PA, Γ ⇒ Σ then F∞^cf ⊢ P⁻, Γ ⇒ Σ.
In order to complete the proof, we will have to move from the infinitary
system back to regular Peano Arithmetic. In general, this is not possible,
but it will be possible when the statement we have proven consists only of
Σ1 formulas.
We first define the system F∞, which is obtained from Fc by replacing the R∀ rule and the L∃ rule with two new rules, known as the ω rules:
Rω: from the premises Γ ⇒ Σ, φ[0/x], Γ ⇒ Σ, φ[1/x], . . . , Γ ⇒ Σ, φ[n/x], . . . (one for each numeral), infer Γ ⇒ Σ, ∀xφ.
Lω: from the premises Γ, φ[n/x] ⇒ Σ for every numeral n, infer Γ, ∃xφ ⇒ Σ.
We also add the requirement that all sequents consist entirely of sentences—that is, there are no free variables.
Observe that in F∞, the induction rule is derivable! First, note that we can derive Fc ⊢ φ[0/x], ∀x(φ → φ[Sx/x]) ⇒ φ[n/x] for any n. We show this by induction on n: for n = 0, this is trivial. Suppose the claim holds for n. Then from the inductive hypothesis φ[0/x], ∀x(φ → φ[Sx/x]) ⇒ φ[n/x] and the axiom φ[Sn/x] ⇒ φ[Sn/x], an application of L→ gives φ[0/x], ∀x(φ → φ[Sx/x]), φ[n/x] → φ[Sn/x] ⇒ φ[Sn/x], and an application of L∀ then gives φ[0/x], ∀x(φ → φ[Sx/x]) ⇒ φ[Sn/x].
Now the induction axiom follows from a single application of the ω-rule followed by two applications of R→.
Theorem 3.40. If Fc ⊢ Γ ⇒ Σ where Γ and Σ have no free variables then F∞ ⊢ Γ ⇒ Σ.
Proof. By induction on deductions, we show:
If Fc ` Γ ⇒ Σ, where x1 , . . . , xn are the free variables in
Γ ⇒ Σ, then whenever t1 , . . . , tn are closed terms, there is a
deduction F∞ ` Γ[t1 /x1 ] · · · [tn /xn ] ⇒ Σ[t1 /x1 ] · · · [tn /xn ].
If the last inference is anything other than L∃ or R∀ then the claim follows
immediately from IH, since all other inference rules of Fc are also rules of
F∞ .
Suppose the final rule is R∀. Then the preceding step was Γ ⇒ Σ, φ[y/x] for some variable y. By IH, for each n, there is a deduction of Γ ⇒ Σ, φ[n/x], and therefore the claim follows by an application of Rω. The Lω case is similar.
So suppose we have a deduction of Fc ⊢ Γ_PA, Γ ⇒ Σ. By compactness, we may assume we used finitely many axioms from Γ_PA, and in particular, finitely many induction axioms, say Γ₀. By the previous theorem, there is a deduction of F∞ ⊢ Γ₀, P⁻, Γ ⇒ Σ. We may then apply finitely many cuts with derivations of the induction axioms to conclude that F∞ ⊢ P⁻, Γ ⇒ Σ.
Definition 3.41. The height of a deduction in F∞ is given recursively by:
• The height of an axiom is 1,
• If a deduction d is formed from subdeductions {d_i} then the height of d is the smallest ordinal greater than the height of every d_i.
We write ⊢^α_r Γ ⇒ Σ if there is a deduction of Γ ⇒ Σ such that all cuts in this deduction have rank < r and the height is ≤ α.
We still have our old friends the inversion lemmas:
Lemma 3.42.
(1) Suppose ⊢^α_r Γ ⇒ Σ, φ ∧ ψ. Then ⊢^α_r Γ ⇒ Σ, φ and ⊢^α_r Γ ⇒ Σ, ψ.
(2) Suppose ⊢^α_r Γ, φ ∨ ψ ⇒ Σ. Then ⊢^α_r Γ, φ ⇒ Σ and ⊢^α_r Γ, ψ ⇒ Σ.
(3) Suppose ⊢^α_r Γ, φ → ψ ⇒ Σ. Then ⊢^α_r Γ, ψ ⇒ Σ and ⊢^α_r Γ ⇒ Σ, φ.
(4) Suppose ⊢^α_r Γ ⇒ Σ, ∀xφ. Then for any n, ⊢^α_r Γ ⇒ Σ, φ[n/x].
(5) Suppose ⊢^α_r Γ, ∃xφ ⇒ Σ. Then for any n, ⊢^α_r Γ, φ[n/x] ⇒ Σ.
And the reduction lemmas:
Lemma 3.43.
(1) Suppose ⊢^α_r Γ ⇒ Σ, φ ∧ ψ and ⊢^β_r Γ, φ ∧ ψ ⇒ Σ where rk(φ ∧ ψ) ≤ r. Then ⊢^(α#β)_r Γ ⇒ Σ.
(2) Suppose ⊢^α_r Γ, φ ∨ ψ ⇒ Σ and ⊢^β_r Γ ⇒ Σ, φ ∨ ψ where rk(φ ∨ ψ) ≤ r. Then ⊢^(α#β)_r Γ ⇒ Σ.
(3) Suppose ⊢^α_r Γ, φ → ψ ⇒ Σ and ⊢^β_r Γ ⇒ Σ, φ → ψ where rk(φ → ψ) ≤ r. Then ⊢^(α#β)_r Γ ⇒ Σ.
(4) Suppose ⊢^α_r Γ ⇒ Σ, ∀xφ and ⊢^β_r Γ, ∀xφ ⇒ Σ where rk(∀xφ) ≤ r. Then ⊢^(α#β)_r Γ ⇒ Σ.
(5) Suppose ⊢^α_r Γ, ∃xφ ⇒ Σ and ⊢^β_r Γ ⇒ Σ, ∃xφ where rk(∃xφ) ≤ r. Then ⊢^(α#β)_r Γ ⇒ Σ.
Proof. We prove the first of these; the others are similar. We proceed by induction on β. We consider two cases.
For the first case, suppose the last inference of the deduction of Γ, φ ∧ ψ ⇒ Σ had main formula φ ∧ ψ. Then the immediate subdeduction must have been a deduction of either Γ, φ ∧ ψ, φ ⇒ Σ or of Γ, φ ∧ ψ, ψ ⇒ Σ, and had height δ < β for some δ. Without loss of generality, we assume the former. By IH, there is a deduction of Γ, φ ⇒ Σ of height α#δ, and by inversion there is a deduction of Γ ⇒ Σ, φ of height α. We obtain a deduction of Γ ⇒ Σ by applying a cut over φ; its height is the least ordinal greater than max{α#δ, α}. Since β > δ, α#β > α#δ, and also β > 0 so α#β > α. Therefore this deduction has height at most α#β.
For the second case, if the last inference has some other main formula, we apply IH to each immediate subdeduction (paired with the given deduction of Γ ⇒ Σ, φ ∧ ψ) and then reapply the same inference; the height bound follows as above.
Lemma 3.44. Suppose ⊢^α_{r+1} Γ ⇒ Σ. Then ⊢^(2^α)_r Γ ⇒ Σ.
Proof. By induction on α. If the last inference of the deduction is anything other than a cut over a formula of rank r, the claim follows by applying IH to all subdeductions and then applying the same inference. All subdeductions have height < α, so IH gives deductions of height < 2^α.
Suppose the last inference is a cut over a formula of rank r. The two subdeductions have heights β, β′ < α, and by IH, there are deductions of height at most 2^β, 2^β′ with all cuts having rank < r. We then apply the previous lemma, obtaining a deduction of Γ ⇒ Σ of height at most 2^β # 2^β′ ≤ 2^(max{β,β′}+1) ≤ 2^α.
Definition 3.45. Define 2^α_0 = α and 2^α_(r+1) = 2^(2^α_r).
Theorem 3.46. If ⊢^α_r Γ ⇒ Σ then ⊢^(2^α_r)_0 Γ ⇒ Σ.
For arbitrary sequents Γ ⇒ Σ, having a cut-free proof in F∞ doesn’t do
us much good.
Theorem 3.47. Consider a deduction of Γ ⇒ Σ in F∞^cf where every formula in Γ has the form ∀xφ with φ quantifier-free and every formula in Σ has the form ∃xψ with ψ quantifier-free. Then this is a deduction in Fc^cf.
Proof. Easily seen since, by the generalized subformula property, the ω rules
do not appear in such a deduction.
3.7. Consequences of Cut-elimination. We can ask what it would take
to formalize the argument just given—that is, to carry it out, not in ordinary
mathematics, but inside some sequent calculus. PA includes more than
enough knowledge about natural numbers to code deductions and make
statements about PA itself.
Very careful work shows that the following is enough. IΣ1 is the restriction of PA in which the only induction axioms allowed are those where φ is
Σ1 .
(It is usual to use, in place of IΣ1 , an even weaker theory, PRA (“primitive recursive arithmetic”), in which there are no quantifiers in the language—
and therefore, none in the induction axioms—but where some additional
functions—the primitive recursive functions—are added to make enough
coding definable.)
Definition 3.48. Let α be a description of an ordinal in the language of arithmetic (that is, an injection π : α → N such that there are formulas r and <_α such that r(n) holds iff n is in the range of π, and <_α(n, m) holds iff n and m are in the range of π and π⁻¹(n) < π⁻¹(m)). We write TI(α, φ) for the formula
∀x[(∀y <_α x φ(y)) → φ(x)] → ∀x φ(x).
Theorem 3.49. IΣ1 + {TI(ε₀, φ) | φ is Σ1} proves that PA is 1-consistent.
Idea of the proof: With great care, one can actually carry out the proof of cut-elimination just described entirely within the formal system of PRA together with induction up to ε₀ on quantifier-free formulas. This isn’t at all
obvious—after all, the proof given involved infinite objects. However when
the sequent being proven is Σ1 , the ω-rule can be systematically replaced
by a constructive ω-rule, in which there is a computable function f with
the property that for each n, f (n) is a code describing a deduction of Γ ⇒
Σ, φ[n/x]. This code might have to reference other functions coding other
ω rules, so the details are quite complicated.
Since Gödel’s Incompleteness Theorem applies to PA, it follows that the
argument just given cannot be carried out inside PA, nor in any fragment
of it. Therefore we have:
Corollary 3.50. PA does not prove TI(ε₀, X) for any representation of the ordinal ε₀.
Indeed, the following is true:
Theorem 3.51. For every α < ε₀, there is a representation of α such that PA ⊢ TI(α, X).
In fact, PA ⊢ TI(α, X) for the “natural” representations of α. However
there are “artificial” representations of even, say, ω, such that PA cannot
prove transfinite induction. For instance, consider the following ordering:
x ≺ y if either x < y and PA is consistent, or y < x and PA
is not consistent.
If PA is consistent, this is a representation of ω, but if PA is not consistent,
this is a representation of the ordering which is ω reversed, which obviously
has an infinite decreasing sequence 0 ≻ 1 ≻ 2 ≻ · · · . So if PA could prove
transfinite induction for this ordering, it could prove its own consistency.
Extensions T of IΣ1 , such as PA and its extensions and fragments, often
have an ordinal α for which the following are all true:
• α is the supremum of those ordinals β such that there is some representation of β for which T ⊢ TI(β),
• α is the least ordinal such that T ⊬ TI(α),
• α is least such that IΣ1 + TI(α) proves that T is 1-consistent (to say that a theory is 1-consistent means that every Σ1 sentence it proves is actually true),
• If T ⊢ ∀x∃yφ(x, y) where φ is ∆0 then the function mapping x to the least such y is “≺α-computable” (this means that the function is not only computable, but computable by a machine which, at each step, decrements a timer, where the timer is always an ordinal < α, and where the machine always finishes by the time the timer reaches 0),
• If T ⊢ ∀x∃yφ(x, y) where φ is ∆0 then the function mapping x to the least such y is bounded by some fast-growing function (see below) f_β with β < α, and T proves that each f_β for β < α is total.
It is possible to contrive artificial theories in which these properties do not
align, but for “natural” theories, these properties all occur at the same
ordinal. We call this the proof-theoretic ordinal of T.
There is an analogous approach to proof-theoretic ordinals for theories of
sets (specifically, weak fragments of ZFC) rather than theories of arithmetic;
in this case the proof-theoretic ordinal generally aligns with the least α such
that every Π2 formula provable in the theory is satisfied at Lα , the α-th
level of the constructible hierarchy.
Proof-theoretic ordinals sort theories into a rough hierarchy of strength.
If the ordinal of S is less than the ordinal of T (and both are theories of—
possibly extensions of—the language of arithmetic) then any Π2 consequence
of S (in their common language) will typically also be a consequence of T.
This is one of the reasons for the special role of Π2 formulas, and computable
functions, in proof-theory.
Definition 3.52. Suppose that α is a countable ordinal and for every
limit ordinal λ ≤ α we have fixed an increasing sequence λ[n] such that
λ = supn λ[n]. Then we define the fast-growing hierarchy of functions by
recursion on ordinals α:
• f_0(x) = x + 1,
• f_{β+1}(x) = f_β^x(x) (the x-fold iterate of f_β applied to x),
• f_λ(x) = f_{λ[x]}(x).
Observe that f_1(x) = f_0^x(x) = x + x = 2x and f_2(x) = f_1^x(x) = 2^x · x. As a result, these functions grow very quickly indeed!
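For finite indices the recursion can be run directly. Here is a minimal sketch covering only the finite levels f_0, f_1, f_2, . . .; the limit clause would require fixing fundamental sequences λ[n] and is omitted.

    def f(beta, x):
        """The fast-growing hierarchy at a finite index: f_0(x) = x + 1, and
        f_{beta+1}(x) is the x-fold iterate of f_beta applied to x."""
        if beta == 0:
            return x + 1
        for _ in range(x):
            x = f(beta - 1, x)
        return x

    print([f(1, x) for x in range(5)])   # [0, 2, 4, 6, 8]    f_1(x) = 2x
    print([f(2, x) for x in range(5)])   # [0, 2, 8, 24, 64]  f_2(x) = 2^x * x
    # f(3, x) is already a tower-like function; evaluating it even at x = 3 is hopeless.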
One consequence of these results is that there is a cap to how quickly
functions which PA can prove total are allowed to grow, and therefore one
way to show that something cannot be proven in PA is to prove that it
grows faster than f_α for every α < ε₀. (“Grows faster” here could mean f_α(x) < g(x) for infinitely many x.)
3.8. Goodstein’s Theorem. All this leads to an example of a “natural” statement unprovable in PA.
Definition 3.53. We define a hereditary base n notation for a number inductively by:
• 0 is a hereditary base n notation,
• If for each i ≤ k, a_i is a hereditary base n notation and i < j implies a_i ≤ a_j, then
n^(a_k) + n^(a_{k−1}) + · · · + n^(a_0)
is a hereditary base n notation.
This is a generalization of the usual way of writing a number in base n,
with the addition that the exponents themselves must also be written in
base n.
For example, in hereditary base 2, the first few numbers are:
2^0, 2^(2^0), 2^(2^0) + 2^0, 2^(2^(2^0)), 2^(2^(2^0)) + 2^0, 2^(2^(2^0)) + 2^(2^0), 2^(2^(2^0)) + 2^(2^0) + 2^0, . . .
For a larger example, to write 221 in hereditary base 3, we first write 221 in regular base 3:
221 = 3^4 + 3^4 + 3^3 + 3^3 + 3 + 1 + 1
and then we rewrite each exponent itself in base 3:
221 = 3^(3+1) + 3^(3+1) + 3^3 + 3^3 + 3 + 1 + 1
finally obtaining:
221 = 3^(3^(3^0) + 3^0) + 3^(3^(3^0) + 3^0) + 3^(3^(3^0)) + 3^(3^(3^0)) + 3^(3^0) + 3^0 + 3^0.
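Here is a minimal sketch that prints the hereditary base n notation of a number, written with repeated terms as in the definition above; the string format is an arbitrary choice.

    def hereditary(x, n):
        """The hereditary base n notation of x, written with repeated terms n^(...)."""
        if x == 0:
            return "0"
        terms, e = [], 0
        while x > 0:
            x, d = divmod(x, n)
            terms = [f"{n}^({hereditary(e, n)})"] * d + terms
            e += 1
        return " + ".join(terms)

    print(hereditary(221, 3))
    # 3^(3^(3^(0)) + 3^(0)) + 3^(3^(3^(0)) + 3^(0)) + 3^(3^(3^(0))) + 3^(3^(3^(0))) + 3^(3^(0)) + 3^(0) + 3^(0)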
Definition 3.54. We define the function ι_{a,b}(x) to be the result of writing the number x in hereditary base a notation and then replacing every occurrence of a with b.
For example
ι_{3,4}(221) = ι_{3,4}(3^(3^(3^0) + 3^0) + 3^(3^(3^0) + 3^0) + 3^(3^(3^0)) + 3^(3^(3^0)) + 3^(3^0) + 3^0 + 3^0)
= 4^(4^(4^0) + 4^0) + 4^(4^(4^0) + 4^0) + 4^(4^(4^0)) + 4^(4^(4^0)) + 4^(4^0) + 4^0 + 4^0
= 4^5 + 4^5 + 4^4 + 4^4 + 4 + 1 + 1
= 2566.
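The function ι_{a,b} can be computed by recursing through the base a digits rather than by manipulating notations as strings; a minimal sketch:

    def iota(a, b, x):
        """iota_{a,b}(x): write x in hereditary base a and replace every a by b."""
        result, e = 0, 0
        while x > 0:
            x, d = divmod(x, a)
            result += d * b ** iota(a, b, e)
            e += 1
        return result

    print(iota(3, 4, 221))   # 2566
    print(iota(2, 3, 4))     # 27  (2^2 becomes 3^3)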
Definition 3.55. For any x, the Goodstein sequence starting with x is the sequence a1, a2, . . . where:
• a1 = x,
• a_{k+1} = ι_{k+1,k+2}(a_k) − 1.
More generally, a generalized Goodstein sequence is a sequence a1, a2, . . . together with an auxiliary sequence h1, h2, . . . such that for every k, h_k < h_{k+1} and
a_{k+1} < ι_{h_{k+1},h_{k+2}}(a_k).
For example, the Goodstein sequence starting with 3 is the sequence
• a1 = 3 = 2^1 + 2^0,
• a2 = ι_{2,3}(2^1 + 2^0) − 1 = 3^1 + 3^0 − 1 = 3,
• a3 = ι_{3,4}(3) − 1 = 4^1 − 1 = 3,
• a4 = ι_{4,5}(4^0 + 4^0 + 4^0) − 1 = 2,
• a5 = ι_{5,6}(5^0 + 5^0) − 1 = 1,
• a6 = 0.
On the other hand, the Goodstein sequence starting with 4 begins:
• a1 = 4 = 2^2,
• a2 = ι_{2,3}(2^2) − 1 = 3^3 − 1 = 26 = 3^2 + 3^2 + 3 + 3 + 1 + 1,
• a3 = ι_{3,4}(26) − 1 = 4^2 + 4^2 + 4 + 4 + 1 = 41,
• a4 = ι_{4,5}(41) − 1 = 5^2 + 5^2 + 5 + 5 = 60.
In fact, this sequence will eventually start decreasing, and will eventually reach 0—after 3 · 2^402653211 − 2 steps!
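With ι in hand, the Goodstein sequence itself is only a few lines; a minimal sketch (ι is repeated so the snippet stands alone), which reproduces the two sequences computed above:

    def iota(a, b, x):
        """Write x in hereditary base a and replace every a by b."""
        result, e = 0, 0
        while x > 0:
            x, d = divmod(x, a)
            result += d * b ** iota(a, b, e)
            e += 1
        return result

    def goodstein(x, steps):
        """The first values a_1, a_2, ... of the Goodstein sequence starting with x
        (stopping early once 0 is reached)."""
        seq, base = [x], 2
        while len(seq) < steps and seq[-1] > 0:
            seq.append(iota(base, base + 1, seq[-1]) - 1)
            base += 1
        return seq

    print(goodstein(3, 10))   # [3, 3, 3, 2, 1, 0]
    print(goodstein(4, 6))    # [4, 26, 41, 60, 83, 109]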
Theorem 3.56. For every h and every x, the h-Goodstein sequence starting
with x eventually reaches 0.
Proof. We prove this by transfinite induction up to ε₀. For any number x, we may define ι_{a,ω}(x), the result of replacing a in the hereditary base a notation with ω. The result is always an ordinal in Cantor normal form, and in particular, an ordinal < ε₀. For instance, consider the Goodstein sequence starting with 4:
sequence starting with 4:
•
•
•
•
ι2,ω (a1 ) = ω ω ,
ι3,ω (a2 ) = ω 2 + ω 2 + ω + ω + 1 + 1,
ι4,ω (a3 ) = ω 2 + ω 2 + ω + ω + 1,
ι5,ω (a4 ) = ω 2 + ω 2 + ω + ω.
21
This suggests the main point: no matter what h is, ι_{h_{k+2},ω}(a_{k+1}) < ι_{h_{k+1},ω}(a_k). This is easily seen, since ι_{b,ω}(ι_{a,b}(x)) = ι_{a,ω}(x), and therefore
ι_{h_{k+2},ω}(a_{k+1}) < ι_{h_{k+2},ω}(a_{k+1} + 1) ≤ ι_{h_{k+2},ω}(ι_{h_{k+1},h_{k+2}}(a_k)) = ι_{h_{k+1},ω}(a_k).
Therefore the sequence ι_{h_{k+1},ω}(a_k) is a strictly decreasing sequence of ordinals below ε₀, and therefore eventually must hit 0.
Theorem 3.57. Suppose that for every h and every x, the h-Goodstein
sequence starting with x eventually reaches 0. Then ε₀ is well-founded.
Proof. Suppose g were an infinite descending sequence below ε₀, g(1) >
g(2) > · · · . We can easily choose an h so that ιk+1,ω (ak ) = g(k) for all
k > 1, simply by setting h(k) = ιω,k+1 (g(k)) − ιω,k+1 (g(k + 1)).
In particular, it follows that PA cannot prove that every h-Goodstein
sequence eventually terminates. In fact, with a bit more care, it is possible
to show that the function mapping x to the number of steps in the Goodstein
sequence starting with x grows at roughly the speed of f_ε₀, and therefore
PA cannot even prove that regular Goodstein sequences terminate.