Proof Theory: From Arithmetic to Set Theory
Michael Rathjen
Accompanying notes for a course given at the Nordic Spring School, Nordfjordeid, 27–30 May 2013

Contents
• A brief history of proof theory
• Sequent calculi for classical and intuitionistic logic, Gentzen's Hauptsatz: cut elimination
• Consequences of the Hauptsatz: subformula property, Herbrand's Theorem, existence and disjunction property, geometric theories
• Ordinal functions and representations up to Γ0
• Ordinal analysis of Peano arithmetic, PA, and some subsystems of second order arithmetic
• Limits for the deducibility of transfinite induction
• Kripke-Platek set theory, KP
• The Bachmann-Howard ordinal
• KP goes infinite, RS
• Impredicative cut elimination theorem
• Interpreting KP in RS

1 A short and biased history of logic till 1938
• Logical principles - principles connecting the syntactic structure of sentences with their truth and falsity, their meaning, or the validity of arguments in which they figure - can be found in scattered locations in the work of Plato (428–348 B.C.).
• The Stoic school of logic was founded some 300 years B.C. by Zeno of Citium (not to be confused with Zeno of Elea). After Zeno's death in 264 B.C., the school was led by Cleanthes, who was followed by Chrysippus. It was largely through the copious writings of Chrysippus that the Stoic school became established, though many of these writings have been lost.
• The patterns of reasoning described by Stoic logic are the patterns of interconnection between propositions that are completely independent of what those propositions say.
• The first known systematic study of logic which involved quantifiers, components such as "for all" and "some", was carried out by Aristotle (384–322 B.C.), whose work was assembled by his students after his death as a treatise called the Organon, the first systematic treatise on logic.
• Aristotle tried to analyze logical thinking in terms of simple inference rules called syllogisms.
These are rules for deducing one assertion from exactly two others.
• An example of a syllogism is:
P1. All men are mortal.
P2. Socrates is a man.
C. Socrates is mortal.
• In the case of the above syllogism, it is obvious that there is a general pattern, namely:
P1. All M are P.
P2. S is an M.
C. S is P.
• Some of the other syllogisms Aristotle formulated are less obvious. E.g.
P1. No M is P.
P2. Some S is M.
C. Some S is not P.
• Aristotle appears to have believed that any logical argument can, in principle, be broken down into a series of applications of a small number of syllogisms. He listed a total of 19.
• The syllogism was found (much later) to be too restrictive.
• For almost 2000 years Aristotle was revered as the ultimate authority on logical matters.

Bachelors and Masters of Arts who do not follow Aristotle's philosophy are subject to a fine of five shillings for each point of divergence, as well as for infractions of the rules of the Organon. – Statutes of the University of Oxford, fourteenth century.

When did Modern Logic start?
• Aristotle's logic was very weak by modern standards.
• The ideas of creating an artificial formal language patterned on mathematical notation in order to clarify logical relationships - called characteristica universalis - and of reducing logical inference to a mechanical reasoning process in a purely formal language - called calculus ratiocinator - were due to Gottfried Wilhelm Leibniz (1646–1716).
• Leibniz's contributions include an arithmetization of syllogistic, a theory of relations, modal logic and logical grammar.
• Much of this was published only posthumously, in 1903, by Couturat in Opuscules et fragments inédits de Leibniz.
• Logic as we know it today has only emerged over the past 140 years.
• Chiefly associated with this emergence is Gottlob Frege (1848–1925). In his Begriffsschrift of 1879 (Concept Script) he invented the first fully formal language.
• His Begriffsschrift marked a turning point in the history of logic.
It broke new ground, including a rigorous treatment of quantifiers and the ideas of functions and variables.
• Frege wanted to show that mathematics grew out of logic.
• Charles Peirce (1839–1914) is another pioneer of modern logic.
• Another strand is algebraic logic, which stresses logic as a calculus: Augustus De Morgan (1806–1871), George Boole (1815–1864), Ernst Schröder (1841–1902).
• Modern logic was codified in Principia Mathematica (1910, 1912, 1913) by Bertrand Russell (1872–1970) and Alfred N. Whitehead (1861–1947).

The Origins of Proof Theory (Beweistheorie)
• David Hilbert (1862–1943)
• Hilbert's second problem (1900): consistency of analysis
• Hilbert's Programme (1922, 1925)

The Grundlagenkrise: the usual suspects
• Inconsistency in Frege's Grundgesetze.
• Cantor had already observed that in set theory the unrestricted Comprehension Principle (CP) leads to contradictions. CP allows one to build sets by collecting all the sets having in common a property P to form a new set {x | P(x)}.
• Russell's Paradox (1901)
• Hermann Weyl: "Über die neue Grundlagenkrise in der Mathematik" (1921)

19th century: growth of the subject
• Beginning of the 19th century: mathematics was concrete, constructive, algorithmic.
• End of the 19th century: much abstract, non-constructive, non-algorithmic mathematics was under development; a growing preference for short conceptual non-computational proofs over long computational proofs.
• Non-Euclidean geometries: statements can be true in one geometry and false in another.
• But also consolidation: (more) rigorous foundations of analysis: Cauchy (1789–1857), Bolzano (1781–1848), Weierstrass (1815–1897).

New (non-constructive) proof methods
• Abstract notion of function (in Euler's time functions were explicitly defined via an analytic expression)
• Indirect existence proofs (Hilbert's Basis Theorem)
• Zermelo's proof that R (the reals) can be well-ordered (1904)

Axiom of Choice
Let I be a set. Suppose that A_i is a non-empty set for each i ∈ I.
Then there exists a function

    f : I → ⋃_{i∈I} A_i

such that f(i) ∈ A_i holds for all i ∈ I.

Borel, Baire, Lebesgue against the Axiom of Choice
1905 Borel: It seems to me that the objection against it is also valid for every reasoning where one assumes an arbitrary choice made an uncountable number of times, for such reasoning does not belong in mathematics.

Acceptance of AC
• By the 1930s AC was widely accepted.
• With AC, every vector space has a basis.
• Let V, W be vector spaces over the same field, u ∈ V, w ∈ W and u, w ≠ 0. Then there is a linear mapping f : V → W such that f(u) = w.

Reactions and Cures
• Brouwer (1908) rejects the law of excluded middle (A ∨ ¬A for arbitrary statements A): intuitionistic mathematics.
• Russell (1908): Vicious Circle Principle
• H. Weyl (1885–1955) criticizes impredicative set formation principles: mathematics is a house built on sand (1918).

Hilbert's way out
• Platonists, logicists and intuitionists seem to agree that a mathematical concept, or sentence, or a theory is acceptable (or properly understood) only if all terms which occur in it can be interpreted directly.
• By contrast, the formalist holds that direct interpretability is not a necessary condition for the acceptability of a mathematical theory. To understand a theory means to be able to follow its logical development and not, necessarily, to interpret, or give a denotation for, its individual terms.

Hilbert's two-tiered approach
1. Interpreted (material, "inhaltlich") mathematics: basic rules of reasoning and arithmetic whose validity is self-evident.
2. Uninterpreted (or formal) mathematics, obtained by the adjunction of "ideal" (uninterpreted) elements to material ("inhaltliche") mathematics.
In Hilbert's case, interpreted mathematics was finitistic mathematics, wherein reference to actual infinite sets was taboo.

Hilbert's Program (1922, 1925)
• I. Codify the whole of mathematical reasoning in a formal theory T.
• II. Prove the consistency of T by finitistic means.
• "No one shall drive us from the paradise which Cantor has created for us."

Finitism
• The exact meaning of "finitistic means" was never precisely delineated by Hilbert.
• Finitistic means form the basis of any scientific reasoning.
• They do not refer to the actual infinite and do not include any objectionable proof methods.

Hilbert's Ontology
Real objects: the natural numbers, finite strings of symbols (something a computer can deal with).
Ideal objects: the other mathematical objects: abstract functions, choice functions, Hilbert spaces, ultrafilters, etc.
• Real objects are the main concern of mathematicians. They exist.
• Ideal/abstract objects exist merely as a façon de parler. But they are important for the progress of mathematics.

The method of ideal elements
• Solve a mathematical problem regarding a specific mathematical structure by adding new ideal elements to the structure.
• Hilbert: The method of ideal elements is of great importance to the progress of mathematical research.

Examples
Elementary geometry → points and lines at ∞ → projective geometry
Elementary number theory → number fields, ideals → algebraic number theory
Analysis/number theory → ultrafilters → set theory

Indispensable condition
• Hilbert: Es gibt nämlich eine Bedingung, eine einzige, aber auch absolut notwendige, an die die Anwendung der Methode der idealen Elemente geknüpft ist, und diese ist der Nachweis der Widerspruchsfreiheit: die Erweiterung durch Zufügung von Idealen ist nämlich nur dann statthaft, wenn dadurch im alten, engeren Bereiche keine Widersprüche entstehen, wenn also die Beziehungen, die sich bei Elimination der idealen Gebilde für die alten Gebilde herausstellen, stets im alten Bereiche gültig sind.
• There is just one condition, albeit an absolutely necessary one, connected with the method of ideal elements.
That condition is a proof of consistency, for the extension of a domain by the addition of ideal elements is legitimate only if the extension does not cause contradictions to appear in the old, narrower domain, or, in other words, only if the relations that obtain among the old structures when the ideal structures are deleted are always valid in the old domain.
• Another reading of Hilbert's Programme: elimination of ideal elements.

Maybe we should refrain from ontological talk
• Abraham Robinson (1918–74): Non-standard Analysis (1966)
• "this book ... appears to affirm the existence of all sorts of infinitary entities. However, from a formalist point of view we may look at our theory syntactically and may consider that what we have done is to introduce new deductive procedures rather than new mathematical entities."

Mathematical statements
[Diagram: among all mathematical statements, the real statements form a subclass; the remaining ones are the ideal statements.]
Real statements are of the following forms:

    ∀x_1 · · · ∀x_r f(x_1, . . . , x_r) = g(x_1, . . . , x_r);
    ∀x_1 · · · ∀x_r f(x_1, . . . , x_r) ≠ g(x_1, . . . , x_r);
    ∀x_1 · · · ∀x_r f(x_1, . . . , x_r) ≤ g(x_1, . . . , x_r)

where f, g are basic functions (polynomials) on the naturals.

Examples of real statements
• Goldbach's conjecture: Every even number n > 2 is the sum of two primes. (Confirmed up to at least 10^18.)
• Vinogradov's Three Primes Theorem 1937: Every odd integer > 10^13000 is the sum of three primes.
• Fermat's conjecture (Wiles' Theorem 1995): "For all naturals a, b, c, n, if a · b · c ≠ 0 and n > 2 then a^n + b^n ≠ c^n."
• Riemann hypothesis: All non-trivial zeros s of ζ satisfy Re(s) = 1/2.
• Four colour theorem

Ideal statements
• The axiom of choice.
• Every vector space has a basis.
• If R is a noetherian ring, then so is the polynomial ring R[X].
• (Schröder-Bernstein Theorem) If f : X → Y and g : Y → X are both injective functions, then there exists a 1-1 correspondence between X and Y.
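Real statements of this kind are finitistically meaningful in a very concrete sense: each individual instance can be decided by a terminating computation. As an illustration (my own, not part of the notes), a small Python sketch that checks Goldbach instances by searching for a witness pair of primes:

```python
# Checking instances of a real statement: every even n > 2 should be
# a sum of two primes.  Each instance is decided by a finite search.

def is_prime(n: int) -> bool:
    """Trial division; enough for small instances."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def goldbach_witness(n: int):
    """Return the least pair of primes (p, q) with p + q == n, or None."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return p, n - p
    return None

# Every even number in 4..1000 has a witness.
assert all(goldbach_witness(n) is not None for n in range(4, 1001, 2))
print(goldbach_witness(100))  # (3, 97)
```

Of course, no amount of such instance checking proves the universal statement; that is exactly the gap Hilbert's programme was meant to bridge.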
Example of a real statement proved by using ideal elements
Theorem: 1.1 (Hadamard, de la Vallée Poussin 1896; Prime Number Theorem)

    lim_{x→∞} π(x) / (x / ln(x)) = 1

where π(x) = number of prime numbers ≤ x.
The original proof used contour integration of curves over ℂ. Atle Selberg and Paul Erdős (1949) found proofs using only the means of elementary number theory.

Hilbert's Conservation Programme
• A consequence of Hilbert's Programme.
• Hilbert's hope: If a real statement Ψ is provable in non-finitistic mathematics, then Ψ can also be proved by purely finitistic means.

THEOREM. Let Ψ be a real statement, T a theory, and F := finitistic mathematics. Then

    T ⊢ Ψ  ⟹  F + Con_T ⊢ Ψ,

i.e., if T proves Ψ, then F plus the consistency of T proves Ψ.

Hilbert's Consistency Proofs
• Grundlagen der Geometrie (1899). Shows the consistency of theories of geometries (Euclidean and non-Euclidean) by reduction to the theory of arithmetic.
• Über die Grundlagen der Logik und Arithmetik (1904) contains a consistency proof of a weak theory of arithmetic (an almost equational theory).
• He shows that in this theory one can only deduce homogeneous equations, hence no contradiction.
• Hilbert in lectures 1920, 1921: new techniques for consistency proofs. The ε-substitution method. Eliminates quantifiers. Clear distinction between finitistic metatheory and object theory.

Hilbert School I
• Wilhelm Ackermann (1896–1962): Begründung des tertium non datur mittels der Hilbertschen Theorie der Widerspruchsfreiheit (1925).
• Consistency proof for a theory of arithmetic with second order variables (ranging over functions). Function space closed under primitive recursion.
• The proof uses Hilbert's ε-substitution. Very difficult to follow.
• The proof seems to require a transfinite induction up to ω^(ω^ω).
• John von Neumann (1903–1957): Zur Hilbertschen Beweistheorie (1927)

Hilbert School II
• Gerhard Gentzen (1909–1945)
• Untersuchungen über das logische Schliessen (1934), dissertation: introduces the natural deduction system and the sequent calculus. Proves cut elimination.
• Die Widerspruchsfreiheit der reinen Zahlentheorie (1936): proves the consistency of Peano arithmetic.

Herbrand
• Jacques Herbrand (1908–1931)
• Sur la non-contradiction de l'Arithmétique (1931)

The most important structure
• The set of natural numbers N = {0, 1, 2, 3, 4, . . .} with the operations of addition (+) and multiplication (×) and the less-than relation (<):

    N = (N; 0, 1, +, ×, <)

• Richard Dedekind (1831–1916), Giuseppe Peano (1858–1932): axiomatization of N, called Peano Arithmetic (PA): the usual laws for +, × and <.
• Axiom scheme of mathematical induction.
• Many of the famous theorems and problems of mathematics (including the above examples) can be formalized as a sentence ϕ of the language of N and thus are equivalent to the question whether N |= ϕ: Is Ψ true in N?

Axiomatizing the Structure N
Peano Arithmetic, PA.
Language of PA: predicate symbols =, <; function symbols +, ·, S (successor); constant symbol 0.
(N1) ∀x (Sx ≠ 0)
(N2) ∀xy [Sx = Sy → x = y]
(N3) ∀x [x + 0 = x]
(N4) ∀xy [x + Sy = S(x + y)]
(N5) ∀x [x · 0 = 0]
(N6) ∀xy [x · Sy = (x · y) + x]
(N7) ∀x ¬(x < 0)
(N8) ∀xy [x < Sy ↔ x < y ∨ x = y]
(N9) ∀xy [x < y ∨ x = y ∨ y < x]
(IND) ϕ(0) ∧ ∀x [ϕ(x) → ϕ(Sx)] → ∀x ϕ(x)

2 The sequent calculus
Remark: 2.1 The most common logical calculi are Hilbert-style systems. They are specified by delineating a collection of schematic logical axioms and some inference rules. The choice of axioms and rules is more or less arbitrary, only subject to the desire to obtain a complete system. In model theory it is usually enough to know that there is a complete calculus for first order logic, as this already entails the compactness theorem.
There are, however, proof calculi without this arbitrariness of axioms and rules. The natural deduction calculus and the sequent calculus were both invented by Gentzen in 1934. Both calculi are pretty illustrations of the symmetries of logic. In this course I shall focus on the sequent calculus, since it is a central tool in ordinal analysis and allows for generalizations to infinitary logics. Gentzen's main theorem about the sequent calculus is the Hauptsatz, i.e. cut elimination.

2.1 Languages
As we will also consider intuitionistic theories and the intuitionistic version of the sequent calculus, it is in order to spell out what we consider to be the ingredients of a first order theory.

Definition: 2.2 All first order languages will share the same logical symbols: ∧, ∨, →, ¬, ∀, ∃, bound variables x0, x1, x2, x3, . . . and free variables a0, a1, a2, . . . . A first order language L is specified by its non-logical symbols. These symbols are separated into three groups: LC, LF, and LR. LC is the set of constant symbols, LF is the set of function symbols, and LR is the set of relation symbols. Each function symbol f ∈ LF comes equipped with an arity #f which is a number > 0. Likewise each relation symbol R ∈ LR comes equipped with an arity #R > 0.

The distinction between free and bound variables is not essential, but it is extremely useful and simplifies arguments a great deal. Terms can be freely substituted for variables since variables occurring in them are always free and thus cannot be captured by quantifiers. Without the distinction, the cut elimination theorem to be proved below would also have to be reformulated in a slightly awkward way. For example, P(x, y) → ∃y ∃x P(y, x) would not have a cut free proof.

Convention: 2.3 We will use metavariables x, y, z, u, v, . . . , y1, y2, . . . to range over bound variables and a, b, c, d, b1, b2, b3, . . . to range over free variables. We shall use c, d, e, . . . , c0, c1, c2, . . . to range over constants.
Variables P, Q, R, S, R0, R1, R2, . . . will range over relation symbols, while f, g, h, f0, f1, f2, f3, . . . , g0, g1, g2, . . . range over function symbols.

Definition: 2.4 The terms of L are inductively defined as follows:
1. Every free variable is a term.
2. Every constant symbol (of L) is a term.
3. If f is an n-ary function symbol and s1, . . . , sn are terms, then f(s1, . . . , sn) is a term.
Terms are often denoted by t, s, t1, t2, . . . .

The formulas of L are inductively defined as follows:
1. If R is an n-ary relation symbol of L and t1, . . . , tn are terms, then R(t1, . . . , tn) is a formula. R(t1, . . . , tn) is called an atomic formula.
2. If A and B are formulas, then so are (¬A), (A ∧ B), (A ∨ B) and (A → B).
3. If A is a formula, a is a free variable and x is a bound variable not occurring in A, then ∀x A′ and ∃x A′ are formulas, where A′ is the expression obtained from A by replacing a everywhere in A by x.
Henceforth A, B, C, . . . , F, G, H, . . . will be metavariables ranging over formulas.

Definition: 2.5 A formula without free variables will be called a closed formula or sentence. In order to emphasize that they belong to a specific language L, a term or formula of L will sometimes be called an L-term or L-formula.

To increase readability we shall omit parentheses whenever possible. Outer parentheses will always be omitted. We shall observe the following priority rules: ¬ takes precedence over each of ∧ and ∨, and each of the latter two takes precedence over →. For example, ¬A ∧ B is short for (¬A) ∧ B, and A ∧ B → A ∨ B is short for (A ∧ B) → (A ∨ B). Parentheses will also be omitted in the case of double negations: e.g. ¬¬A stands for ¬(¬A). A ↔ B is short for (A → B) ∧ (B → A).

Convention: 2.6 If t is a term, the result of substituting t for a free variable a in A is denoted by A(t/a).
To simplify notation, we adopt the convention that if A is a formula and s is a term, we often write A(s) to refer to the formula A with some (or even no) occurrences of s in A indicated. If we then write A(t) afterwards in the same context, we refer to the result of replacing these indicated occurrences of s in A by t. We say that the variable a is fully indicated in A(a) if all occurrences of a in A are indicated.

2.2 The rules
Definition: 2.7 A sequent (of L) is an expression Γ ⇒ ∆ where Γ and ∆ are finite sequences of L-formulas A1, . . . , An and B1, . . . , Bm, respectively. Γ ⇒ ∆ is read, informally, as Γ yields ∆ or, rather, the conjunction of the Ai yields the disjunction of the Bj. In particular,
• If Γ is empty, the sequent asserts the disjunction of the Bj.
• If ∆ is empty, it asserts the negation of the conjunction of the Ai.
• If Γ and ∆ are both empty, it asserts the impossible, i.e. a contradiction.
We use upper case Greek letters Γ, ∆, Λ, Θ, Ξ, . . . to range over finite sequences of formulae.

Definition: 2.8 We spell out the axioms and the inference rules of the sequent calculus.

Identity Axiom
    A ⇒ A
where A is any formula. In point of fact, we shall limit this axiom to the case of atomic formulae A.

CUT
    Γ ⇒ ∆, A    A, Λ ⇒ Θ
    ---------------------- Cut
        Γ, Λ ⇒ ∆, Θ
A is called the cut formula of the inference.
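These notions are concrete enough to be programmed. The following Python sketch is my own illustration (all names such as Atom, Bin, Sequent, cut are ad hoc); it anticipates the identification of sequents with finite sets of formulas adopted in Convention 2.11 below, representing formulas as trees, sequents as pairs of frozensets, and the cut rule as an operation on two premisses:

```python
# Formulas as trees, sequents Γ ⇒ ∆ as pairs of frozensets, and the
# cut rule as a function on premisses.  An illustrative sketch only.
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Neg:
    sub: 'Formula'

@dataclass(frozen=True)
class Bin:            # conn is one of '∧', '∨', '→'
    conn: str
    left: 'Formula'
    right: 'Formula'

Formula = Union[Atom, Neg, Bin]

@dataclass(frozen=True)
class Sequent:        # antecedent Γ and succedent ∆, both finite sets
    ante: frozenset
    succ: frozenset

def cut(s1: Sequent, s2: Sequent, a: Formula) -> Sequent:
    """From  Γ ⇒ ∆, A  and  A, Λ ⇒ Θ  infer  Γ, Λ ⇒ ∆, Θ."""
    assert a in s1.succ and a in s2.ante, "A must be the cut formula"
    return Sequent(s1.ante | (s2.ante - {a}),
                   (s1.succ - {a}) | s2.succ)

p, q = Atom('P'), Atom('Q')
s1 = Sequent(frozenset({p}), frozenset({Bin('→', p, q), p}))   # P ⇒ P→Q, P
s2 = Sequent(frozenset({p}), frozenset({q}))                   # P ⇒ Q
conclusion = cut(s1, s2, p)   # P ⇒ P→Q, Q  (P has been cut out)
print(conclusion)
```

Using frozensets makes exchange and contraction invisible, exactly the bureaucracy-saving move discussed in Convention 2.11.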
Structural Rules (Exchange, Weakening, Contraction)

    Γ, A, B, Λ ⇒ ∆               Γ ⇒ ∆, A, B, Λ
    --------------- Xl           --------------- Xr
    Γ, B, A, Λ ⇒ ∆               Γ ⇒ ∆, B, A, Λ

    Γ ⇒ ∆                        Γ ⇒ ∆
    ---------- Wl                ---------- Wr
    Γ, A ⇒ ∆                     Γ ⇒ ∆, A

    Γ, A, A ⇒ ∆                  Γ ⇒ ∆, A, A
    ------------ Cl              ------------ Cr
    Γ, A ⇒ ∆                     Γ ⇒ ∆, A

LOGICAL INFERENCES

Negation
    Γ ⇒ ∆, A                     B, Γ ⇒ ∆
    ---------- ¬L                ---------- ¬R
    ¬A, Γ ⇒ ∆                    Γ ⇒ ∆, ¬B

Implication
    Γ ⇒ ∆, A    B, Γ ⇒ Θ         A, Γ ⇒ ∆, B
    --------------------- →L      ------------- →R
    A → B, Γ ⇒ ∆, Θ              Γ ⇒ ∆, A → B

Conjunction
    A, Γ ⇒ ∆                     B, Γ ⇒ ∆                 Γ ⇒ ∆, A    Γ ⇒ ∆, B
    ------------- ∧L1            ------------- ∧L2        --------------------- ∧R
    A ∧ B, Γ ⇒ ∆                 A ∧ B, Γ ⇒ ∆             Γ ⇒ ∆, A ∧ B

Disjunction
    A, Γ ⇒ ∆    B, Γ ⇒ ∆         Γ ⇒ ∆, A                 Γ ⇒ ∆, B
    --------------------- ∨L      ------------- ∨R1        ------------- ∨R2
    A ∨ B, Γ ⇒ ∆                 Γ ⇒ ∆, A ∨ B             Γ ⇒ ∆, A ∨ B

Quantifiers
    F(t), Γ ⇒ ∆                  Γ ⇒ ∆, F(a)
    --------------- ∀L            ---------------- ∀R
    ∀x F(x), Γ ⇒ ∆               Γ ⇒ ∆, ∀x F(x)

    F(a), Γ ⇒ ∆                  Γ ⇒ ∆, F(t)
    --------------- ∃L            ---------------- ∃R
    ∃x F(x), Γ ⇒ ∆               Γ ⇒ ∆, ∃x F(x)

In ∀L and ∃R, t is an arbitrary term. The variable a in ∀R and ∃L is an eigenvariable of the respective inference, i.e. a is not to occur in the lower sequent.

Definition: 2.9 The formulae displayed explicitly in the upper sequents of a logical inference are called the minor formulae of that inference, while the formula newly introduced in its lower sequent is the principal formula of that inference. The other formulae of an inference are called side formulae.

A proof (aka deduction or derivation) D is a tree of sequents satisfying the following conditions:
• The topmost sequents of D are identity axioms.
• Every sequent in D except the lowest one is an upper sequent of an inference whose lower sequent is also in D.

Definition: 2.10 (The INTUITIONISTIC case.) The intuitionistic sequent calculus is obtained by requiring that all sequents be intuitionistic. A sequent Γ ⇒ ∆ is said to be intuitionistic if ∆ consists of at most one formula. Specifically, in the intuitionistic sequent calculus there are no inferences corresponding to contraction right or exchange right.

Our first example is a deduction of the law of excluded middle:

    A ⇒ A
    --------- ¬R
    ⇒ A, ¬A
    ------------- ∨R
    ⇒ A, A ∨ ¬A
    ------------- Xr
    ⇒ A ∨ ¬A, A
    ------------------ ∨R
    ⇒ A ∨ ¬A, A ∨ ¬A
    ------------------ Cr
    ⇒ A ∨ ¬A

Notice that the above proof is not intuitionistic since it involves sequents that are not intuitionistic.
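Definition 2.10's criterion is purely syntactic and trivial to mechanize. A small Python sketch of mine (not from the notes; formulas are just strings here) checks the sequents of the excluded-middle derivation above:

```python
# A sequent Γ ⇒ ∆ is intuitionistic iff its succedent ∆ has at most one
# formula (Definition 2.10).  Sequents are (antecedent, succedent) pairs.

def is_intuitionistic(sequent) -> bool:
    _ante, succ = sequent
    return len(succ) <= 1

# The sequents occurring in the law-of-excluded-middle derivation:
lem_proof = [
    (['A'], ['A']),
    ([], ['A', '¬A']),          # two formulas on the right
    ([], ['A', 'A∨¬A']),
    ([], ['A∨¬A', 'A']),
    ([], ['A∨¬A', 'A∨¬A']),
    ([], ['A∨¬A']),
]
print(all(is_intuitionistic(s) for s in lem_proof))  # False
```

The check fails at the second sequent already, which is exactly why this derivation of A ∨ ¬A has no intuitionistic counterpart.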
The second example is an intuitionistic deduction:

    F(a) ⇒ F(a)
    ----------------- ∃R
    F(a) ⇒ ∃x F(x)
    --------------------- ¬L
    ¬∃x F(x), F(a) ⇒
    --------------------- Xl
    F(a), ¬∃x F(x) ⇒
    --------------------- ¬R
    ¬∃x F(x) ⇒ ¬F(a)
    ------------------------ ∀R
    ¬∃x F(x) ⇒ ∀x ¬F(x)
    ----------------------------- →R
    ⇒ ¬∃x F(x) → ∀x ¬F(x)

Convention: 2.11 Logics without (some of the) structural rules became important in the 1980s. In particular Linear Logic attracted a great deal of attention back then. For our purposes the structural rules just add an additional layer of bureaucracy. We would really like to sweep them under the carpet. We will achieve this by identifying a sequence of formulas A1, . . . , An with the set of formulas {A1, . . . , An}. Henceforth variables ∆, Γ, Λ, . . . will range over finite sets of formulas. We will interpret a comma between these sets as set-theoretic union. Thus Γ, ∆ stands for Γ ∪ ∆. We also adopt the convention that Γ, A stands for Γ ∪ {A}. Likewise A1, . . . , An stands for {A1, . . . , An} and Γ, ∆, A stands for Γ ∪ ∆ ∪ {A}, etc.

Since in the curly bracket notation {A1, . . . , An} the ordering of the formulas does not matter and repeating a formula doesn't make a difference, this takes care of the exchange and the contraction rules automatically. This still leaves the weakening rules. However, we are going to ditch them completely in the classical case, since it is always possible to add more side formulas already at the leaves of a proof tree. Thus we adopt as Axioms all sequents of the form

    Γ, A ⇒ ∆, A

where A is an atomic formula. Henceforth we no longer consider explicit structural rules in the classical case.

The left rule for → can be simplified a bit in the classical case. Henceforth we adopt this rule:

    Γ ⇒ ∆, A    B, Γ ⇒ ∆
    --------------------- →L
    A → B, Γ ⇒ ∆

while the intuitionistic rule takes the form

    Γ ⇒ A    B, Γ ⇒ ∆
    ------------------- →L
    A → B, Γ ⇒ ∆

with ∆ containing at most one formula.

In the intuitionistic case, we shall also ditch the structural rules, with one exception. Here the Axioms will be all the sequents of the form ∆, A ⇒ A with A atomic.
As a result we no longer need the left weakening rule. However, we still need the right weakening rule, that is, from Γ ⇒ we may infer Γ ⇒ B for any formula B. This rule could also be called ex falso quodlibet.

Definition: 2.12 A sequent deduction D is a proof tree, and we can measure a tree by its height, i.e. its longest branch. We use |D| to denote the height of D. We shall write ⊢ Γ ⇒ ∆ to express that there is a deduction of Γ ⇒ ∆, while ⊢[n] Γ ⇒ ∆ is used to convey that there is a deduction of Γ ⇒ ∆ with height ≤ n. We use I ⊢[n] Γ ⇒ ∆ to convey that there is a deduction of Γ ⇒ ∆ with height ≤ n in the intuitionistic sequent calculus, and I ⊢ Γ ⇒ ∆ to say that there is an intuitionistic deduction.

The length |A| of a formula A is defined as follows: |A| = 0 if A is atomic; |¬A| = |A| + 1; |A ♦ B| = max(|A|, |B|) + 1 if ♦ is one of the connectives ∨, ∧, →; |∃x A| = |A| + 1; |∀x A| = |A| + 1.

We write ⊢[n, k] Γ ⇒ ∆ if there is a deduction of Γ ⇒ ∆ of height ≤ n such that all cuts in this deduction have cut formulas of length < k. I ⊢[n, k] Γ ⇒ ∆ is defined similarly.

Lemma: 2.13 For every formula A there is an intuitionistic deduction of A ⇒ A.
Proof: Exercise. □

We list some technical lemmata that will be useful for proving cut elimination.

Lemma: 2.14 (Substitution) Let Γ(a) and ∆(a) be sets of formulas with all occurrences of a indicated. Let s be an arbitrary term.
(i) If ⊢[n, k] Γ(a) ⇒ ∆(a), then ⊢[n, k] Γ(s) ⇒ ∆(s).
(ii) If I ⊢[n, k] Γ(a) ⇒ ∆(a), then I ⊢[n, k] Γ(s) ⇒ ∆(s).

Lemma: 2.15 (Weakening)
(i) If ⊢[n, k] Γ ⇒ ∆, then ⊢[n, k] Γ, Γ′ ⇒ ∆, ∆′.
(ii) If I ⊢[n, k] Γ ⇒ ∆, then I ⊢[n, k] Γ, Γ′ ⇒ ∆.
Proof: Just add Γ′ and ∆′ to all sequents in the deduction. Formally one proves this by induction on n. In the cases of quantifier rules with eigenvariable conditions one might have to replace these variables by 'fresh' ones, using Lemma 2.14.
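The length measure |A| of Definition 2.12 is what the cut elimination inductions below run on. A minimal Python sketch (my own ad hoc encoding of formulas as nested tuples, not part of the notes):

```python
# |A| from Definition 2.12.  Formulas are encoded as nested tuples:
# atoms are strings, ('¬', A), ('∀', A), ('∃', A) are unary,
# ('∧', A, B), ('∨', A, B), ('→', A, B) are binary.

def length(formula) -> int:
    if isinstance(formula, str):           # atomic formula: |A| = 0
        return 0
    op = formula[0]
    if op in ('¬', '∀', '∃'):              # |¬A| = |∀xA| = |∃xA| = |A| + 1
        return length(formula[1]) + 1
    # |A ∧ B| = |A ∨ B| = |A → B| = max(|A|, |B|) + 1
    return max(length(formula[1]), length(formula[2])) + 1

a = ('→', ('∧', 'P', 'Q'), ('∃', ('¬', 'R')))
print(length(a))  # max(|P∧Q|, |∃x¬R|) + 1 = max(1, 2) + 1 = 3
```

Note that |A| counts logical depth, not symbols; this is what makes the "cut formulas of length < k" bookkeeping in ⊢[n, k] work.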
□

Lemma: 2.16 (Inversion) (Here ⊢[n, k] Γ ⇒ ∆ means that Γ ⇒ ∆ has a deduction of height ≤ n in which every cut formula has length < k.)
(i) If ⊢[n, k] Γ, A ∧ B ⇒ ∆ then ⊢[n, k] Γ, A, B ⇒ ∆.
(ii) If ⊢[n, k] Γ ⇒ ∆, A ∧ B then ⊢[n, k] Γ ⇒ ∆, A and ⊢[n, k] Γ ⇒ ∆, B.
(iii) If ⊢[n, k] Γ, A ∨ B ⇒ ∆ then ⊢[n, k] Γ, A ⇒ ∆ and ⊢[n, k] Γ, B ⇒ ∆.
(iv) If ⊢[n, k] Γ ⇒ ∆, A ∨ B then ⊢[n, k] Γ ⇒ ∆, A, B.
(v) If ⊢[n, k] Γ ⇒ A → B, ∆ then ⊢[n, k] A, Γ ⇒ ∆, B.
(vi) If ⊢[n, k] Γ, A → B ⇒ ∆ then ⊢[n, k] Γ ⇒ ∆, A and ⊢[n, k] Γ, B ⇒ ∆.
(vii) If ⊢[n, k] Γ ⇒ ¬A, ∆ then ⊢[n, k] Γ, A ⇒ ∆.
(viii) If ⊢[n, k] Γ, ¬A ⇒ ∆ then ⊢[n, k] Γ ⇒ ∆, A.
(ix) If ⊢[n, k] Γ ⇒ ∆, ∀x B(x) then ⊢[n, k] Γ ⇒ ∆, B(s) for any term s.
(x) If ⊢[n, k] Γ, ∃x B(x) ⇒ ∆ then ⊢[n, k] Γ, B(s) ⇒ ∆ for any term s.
(xi) With the exception of (iv), (vi) and (viii), the above inversion properties remain valid for the intuitionistic sequent calculus. One half of (vi) also remains valid intuitionistically: If I ⊢[n, k] Γ, A → B ⇒ ∆ then I ⊢[n, k] Γ, B ⇒ ∆.

Proof: All are provable by easy inductions on n. □

We have laid the groundwork for cut elimination. Here is an example of how to eliminate cuts of a special form:

    A, Γ ⇒ ∆, B              Λ ⇒ Θ, A    B, Ξ ⇒ Φ
    --------------- →R        --------------------- →L
    Γ ⇒ ∆, A → B             A → B, Λ, Ξ ⇒ Θ, Φ
    ------------------------------------------------ Cut
    Γ, Λ, Ξ ⇒ ∆, Θ, Φ

is replaced by

    Λ ⇒ Θ, A    A, Γ ⇒ ∆, B
    ------------------------- Cut
    Λ, Γ ⇒ Θ, ∆, B              B, Ξ ⇒ Φ
    -------------------------------------- Cut
    Γ, Λ, Ξ ⇒ ∆, Θ, Φ

So we have replaced a cut with cut formula A → B by cuts with formulas of smaller length. By doing this systematically we arrive at the Reduction Lemma. Well, actually it is not that easy when contractions are involved, i.e. when the principal formula of an inference is also a side formula:

    A, Γ ⇒ ∆, B, A → B        Λ, A → B ⇒ Θ, A    B, Ξ, A → B ⇒ Φ
    -------------------- →R    ----------------------------------- →L
    Γ ⇒ ∆, A → B              A → B, Λ, Ξ ⇒ Θ, Φ
    --------------------------------------------------------- Cut
    Γ, Λ, Ξ ⇒ ∆, Θ, Φ

Lemma: 2.17 (Reduction) Suppose k ≤ |C|. If ⊢[n, k] Γ, C ⇒ ∆ and ⊢[m, k] Ξ ⇒ Θ, C, then ⊢[2(n+m), |C|] Γ, Ξ ⇒ ∆, Θ.

Proof: Of course we could derive Γ, Ξ ⇒ ∆, Θ by an application of the cut rule, but the resulting derivation would have cut rank |C| + 1. The proof is by induction on n + m. Let D1 be a derivation of Γ, C ⇒ ∆ with cut rank ≤ k and length ≤ n. Likewise let D2 be a derivation of Ξ ⇒ C, Θ with cut rank ≤ k and length ≤ m.
Case 1: Γ, C ⇒ ∆ is an axiom whose principal formula is not C, i.e., Γ = Γ′, A and ∆ = ∆′, A for some atom A. Then Γ, Ξ ⇒ ∆, Θ is an axiom too and the desired assertion follows. Similarly, if Ξ ⇒ Θ, C is an axiom whose principal formula is different from C, then Ξ ⇒ Θ is an axiom and so is Γ, Ξ ⇒ ∆, Θ.

Case 2: Both Γ, C ⇒ ∆ and Ξ ⇒ Θ, C are axioms with principal formula C. Then ∆ = ∆′, C and Ξ = Ξ′, C for some ∆′ and Ξ′. Hence Γ, Ξ ⇒ ∆, Θ is an axiom as well.

Henceforth we may assume that Γ, C ⇒ ∆ or Ξ ⇒ Θ, C is not an axiom. Hence at least one of the derivations ends with an inference, which will be called its last inference.

Case 3: D1 ends with an inference whose principal formula is different from C. Then the premisses of the last inference are of the form Γi, C ⇒ ∆i and we have ⊢[ni, k] Γi, C ⇒ ∆i where ni < n. Since ni + m < n + m we can apply the induction hypothesis to the premisses and obtain ⊢[2(ni + m), |C|] Γi, Ξ ⇒ ∆i, Θ. By applying the same inference we get ⊢[2(n + m), |C|] Γ, Ξ ⇒ ∆, Θ. If the last inference comes with an eigenvariable condition it might be necessary to substitute a new variable. But by Lemma 2.14 this can be done without increasing length and cut rank of derivations.

Case 4: D2 ends with an inference whose principal formula is different from C. This is analogous to the previous case.

We may from now on assume that C is the principal formula of the last inference of both D1 and D2. In particular C is not an atom.

Case 5: C is of the form A ∧ B. Then we have

    ⊢[n1, k] Γ, C, A ⇒ ∆    (1)
or
    ⊢[n1, k] Γ, C, B ⇒ ∆    (2)
as well as
    ⊢[m1, k] Ξ ⇒ Θ, C, A    (3)
and
    ⊢[m2, k] Ξ ⇒ Θ, C, B    (4)

for some n1 < n and m1, m2 < m. Note that C could have been a side formula of any of the last inferences of D1 and D2, and, moreover, that by weakening (Lemma 2.15) we can always add C as a side formula without increasing the length or the cut rank of the derivation.

If (1) obtains, we apply the induction hypothesis with (1) and ⊢[m, k] Ξ ⇒ Θ, C to arrive at

    ⊢[2(n1 + m), |C|] Γ, Ξ, A ⇒ ∆, Θ.    (5)

Applying the Inversion Lemma 2.16 (ii) to (3) we have

    ⊢[m1, k] Ξ ⇒ Θ, A.    (6)

Cutting A out of (5) and (6) gives the desired ⊢[2(n+m), |C|] Γ, Ξ ⇒ ∆, Θ, since |A| < |C|.

If (2) obtains, we apply the induction hypothesis with (2) and ⊢[m, k] Ξ ⇒ Θ, C to arrive at

    ⊢[2(n1 + m), |C|] Γ, Ξ, B ⇒ ∆, Θ.    (7)

Applying the Inversion Lemma 2.16 (ii) to (4) we have

    ⊢[m2, k] Ξ ⇒ Θ, B.    (8)

Cutting B out of (7) and (8) gives the desired ⊢[2(n+m), |C|] Γ, Ξ ⇒ ∆, Θ.

Case 6: C is of the form ∀x A(x). Then we have

    ⊢[n1, k] Γ, C, A(s) ⇒ ∆    (9)
and
    ⊢[m1, k] Ξ ⇒ Θ, C, A(a)    (10)

for some n1 < n and m1 < m, with a being an eigenvariable. Applying the induction hypothesis to (9) and ⊢[m, k] Ξ ⇒ Θ, C we get

    ⊢[2(n1 + m), |C|] Γ, Ξ, A(s) ⇒ ∆, Θ.    (11)

By applying first inversion (Lemma 2.16) to (10) and subsequently substitution (Lemma 2.14) (or the other way round) we get

    ⊢[m1, k] Ξ ⇒ Θ, A(s).    (12)

A cut performed on (11) and (12) yields ⊢[2(n+m), |C|] Γ, Ξ ⇒ ∆, Θ.

Case 7: C is of the form A → B. Then we have

    ⊢[n1, k] Γ, C ⇒ ∆, A    (13)
and
    ⊢[n2, k] Γ, C, B ⇒ ∆    (14)
as well as
    ⊢[m1, k] Ξ, A ⇒ Θ, C, B    (15)

for some n1, n2 < n and m1 < m.

(13) can be linked up with ⊢[m, k] Ξ ⇒ Θ, C to furnish a pair to which we can apply the induction hypothesis. Whence we get

    ⊢[2(n1 + m), |C|] Γ, Ξ ⇒ ∆, Θ, A.    (16)

Another pair to which we can apply the induction hypothesis is given by (15) and ⊢[n, k] Γ, C ⇒ ∆. Thus

    ⊢[2(n + m1), |C|] Γ, Ξ, A ⇒ ∆, Θ, B.    (17)

Applying a cut to (17) and (16) yields

    ⊢[max(2(n + m1), 2(n1 + m)) + 1, |C|] Γ, Ξ ⇒ ∆, Θ, B.    (18)

Applying the Inversion Lemma 2.16 (vi) to (14) yields

    ⊢[n2, k] Γ, B ⇒ ∆.    (19)

Cutting out B from (18) and (19) we arrive at

    ⊢[max(2(n + m1), 2(n1 + m)) + 2, |C|] Γ, Ξ ⇒ ∆, Θ.    (20)

As max(2(n + m1), 2(n1 + m)) + 2 ≤ 2(n + m) we get the desired result from (20).

Case 8: C is of the form A ∨ B. Then we have

    ⊢[n1, k] Γ, C, A ⇒ ∆    (21)
and
    ⊢[n2, k] Γ, C, B ⇒ ∆    (22)
and also
    ⊢[m1, k] Ξ ⇒ Θ, C, A    (23)
or
    ⊢[m1, k] Ξ ⇒ Θ, C, B    (24)

for some n1, n2 < n and m1 < m.

To (21) and ⊢[m, k] Ξ ⇒ Θ, C we apply the induction hypothesis to arrive at

    ⊢[2(n1 + m), |C|] Γ, Ξ, A ⇒ ∆, Θ.    (25)

To (22) and ⊢[m, k] Ξ ⇒ Θ, C we apply the induction hypothesis to arrive at

    ⊢[2(n2 + m), |C|] Γ, Ξ, B ⇒ ∆, Θ.    (26)

From (23) as well as (24) we get

    ⊢[m1, k] Ξ ⇒ Θ, A, B    (27)

by the Inversion Lemma 2.16 (iv). Cutting A out of (25) and (27) yields

    ⊢[2(n1 + m) + 1, |C|] Γ, Ξ ⇒ ∆, Θ, B.    (28)

Performing a cut on (26) and (28) gives ⊢[2(n+m), |C|] Γ, Ξ ⇒ ∆, Θ.

Case 9: C is of the form ∃x A(x). Then we have

    ⊢[n1, k] Γ, C, A(a) ⇒ ∆    (29)
and
    ⊢[m1, k] Ξ ⇒ Θ, C, A(s)    (30)

for some n1 < n and m1 < m, with a being an eigenvariable. Applying the induction hypothesis with (30) and ⊢[n, k] Γ, C ⇒ ∆ we get

    ⊢[2(n + m1), |C|] Γ, Ξ ⇒ ∆, Θ, A(s).    (31)

By applying first inversion (Lemma 2.16) to (29) and subsequently substitution (Lemma 2.14) (or the other way round) we get

    ⊢[n1, k] Γ, A(s) ⇒ ∆.    (32)

A cut performed on (31) and (32) yields ⊢[2(n+m), |C|] Γ, Ξ ⇒ ∆, Θ.

Case 10: C is of the form ¬A. Then we have

    ⊢[n1, k] Γ, C ⇒ ∆, A    (33)
and
    ⊢[m1, k] Ξ, A ⇒ Θ, C    (34)

for some n1 < n and m1 < m. The induction hypothesis applies to (33) and ⊢[m, k] Ξ ⇒ Θ, C, furnishing

    ⊢[2(n1 + m), |C|] Γ, Ξ ⇒ ∆, Θ, A.    (35)

Now apply the Inversion Lemma 2.16 (vii) to (34) to get

    ⊢[m1, k] Ξ, A ⇒ Θ.    (36)

Cutting out A from (35) and (36) we arrive at ⊢[2(n+m), |C|] Γ, Ξ ⇒ ∆, Θ. □

Theorem: 2.18 (Cut Reduction) If ⊢[n, k+1] Γ ⇒ ∆ then ⊢[4^n, k] Γ ⇒ ∆.

Proof: We use induction on n. Suppose D is a derivation of Γ ⇒ ∆ with length ≤ n and cut rank ≤ k + 1. If Γ ⇒ ∆ is an axiom then we clearly get the desired result. So let's assume that Γ ⇒ ∆ is not an axiom. Then D has a last inference (I) with premisses Γi ⇒ ∆i. First suppose the last inference was not a cut, or was a cut whose cut formula has length < k. We then have ⊢[ni, k+1] Γi ⇒ ∆i for some ni < n. By the induction hypothesis we have ⊢[4^(ni), k] Γi ⇒ ∆i. Applying the same inference (I) yields ⊢[4^n, k] Γ ⇒ ∆ since 4^(ni) < 4^n.

Now suppose the last inference was a cut with a cut formula C satisfying |C| = k.
By the induction hypothesis we have 4n1 k Γ, C ⇒ ∆ 24 and 4n2 k Γ ⇒ ∆, C for some n1 , n2 < n. We can then apply the Reduction Lemma 2.17 to these deriva2(4n1 +4n2 ) tions and arrive at k Γ ⇒ ∆ . Since 2(4n1 +4n2 ) ≤ 4n the desired conclusion follows. t u m m 4r Corollary: 2.19 (Gentzen’s Hauptsatz) Let 4m 0 = m and 4r+1 = 4 . If n k Γ ⇒ ∆ then 4n k 0 Γ ⇒ ∆. As a result, there is a cut free derivation of Γ ⇒ ∆. Proof: Just apply the previous result k times. Formally that is an induction on k. t u Definition: 2.20 For a formula A we define its set of subformulae, Subf(A) as follows: If A is an atom then Subf(A) = {A}. Subf(¬A) = Subf(A) ∪ {¬A}. Subf(A♦B) = Subf(A) ∪ Subf(B) ∪ {A♦B} if ♦ is one of the connectives ∧, ∨, →. [ Subf(Qx F (x)) = {Qx F (x)} ∪ Subf(F (s)) s∈T erm where Q is ∀ or ∃ and T erm is the set of terms. B is said to be a subformula of A if B ∈ Subf(A). Corollary: 2.21 (The subformula property) The Hauptsatz 2.19 has an important corollary. If a sequent Γ ⇒ ∆ is deducible, then it has a deduction such that every formula occurring in it is a subformula of some formula in γ ∪ ∆. Proof: Take a cut free proof of Γ ⇒ ∆. Then it’s clear the the entire deduction is made of subformulas of formulas in Γ and ∆. t u Corollary: 2.22 A contradiction, i.e. the empty sequent cannot be deduced. Proof: The empty sequent cannot have a cut free deduction. What could have been the last inference? t u 2.3 Cut elimination for the intuitionistic sequent calculus Lemma: 2.23 (Reduction) Suppose k ≤ |C|. If I then 2(n+m) Γ, Ξ ⇒ ∆ . I n k Γ, C ⇒ ∆ and I m k Ξ ⇒ C, |C| t u Proof: The proof is similar to the classical case (Lemma 2.17). m m 4r Corollary: 2.24 (Gentzen’s Hauptsatz) Let 4m 0 = m and 4r+1 = 4 . If I n k Γ ⇒ ∆ then I 4n k 0 Γ ⇒ ∆. As a result, there is a cut free intuitionistic derivation of Γ ⇒ ∆. 25 3 Consequences of the Hauptsatz Definition: 3.1 A formula is said to be existential if it is quantifier free or of the form ∃x1 . . . , ∃xr B(x1 , . . . 
, xr ) with B(a1 , . . . , br ) quantifier free. Note that a subformula of an existential formula is existential too. Lemma: 3.2 Suppose that Γ consists of quantifier free formulae and ∆ consists entirely of existential formulae. Let ∃x C(x) be an existential formula. If Γ ⇒ ∆, ∃x C(x) then there exist terms t1 , . . . , tk such that Γ ⇒ ∆, C(t1 ), . . . , C(tk ) . Proof: By the Hauptsatz we have a cut free deduction D of Γ ⇒ ∆, ∃x C(x). We proceed by induction on n = |D|. If n = 0 then Γ ⇒ ∆ is already an axiom. Now let n > 0. The D ended with an inference. First suppose the last inference of D does not have ∃x C(x) as principal formula. Then its premisses are of the form Γi ⇒ ∆i , ∃x C(x). Note that the formulae of Γi must also be quantifier free and those in ∆i must be existential too. Let’s assume we have two premisses. Inductively we Γi ⇒ ∆i , C(ti1 ), . . . , C(tiri ) for some terms and by applying weakening then have and the same inference we get Γ ⇒ ∆, C(t11 ), . . . , C(t1r1 ), C(t21 ), . . . , C(t2r2 ) . If ∃x C(x) is the principal formula of the last inference of D then this must have been ∃R and its premiss is of the form Γ ⇒ ∆, ∃x C(x), C(t) for some term t. Inductively we have terms t01 , . . . , t0l such that Γ ⇒ ∆, C(t01 ), . . . , C(t0l ), C(t) and we are done. t u We shall sometimes write ` ∆ and I ` ∆ for tively. ⇒ ∆ and I ⇒ ∆ , respec- Theorem: 3.3 (Herbrand’s Theorem) If A(~a, ~b ) is quantifier free and ` ∀~x ∃~y A(~x, ~y ) then there are finitely many term tuples t1 , . . . , tn each of the same length as ~b whose free variables are among ~a such that ` A(~a, t1 ) ∨ . . . ∨ A(~a, tn ). Proof: Using inversion 2.16 (ix) several times we have ` ∃~y A(~a, ~y ). Now use Lemma 3.2 several times followed by several ∨R inferences. t u The intuitionistic case is much easier to prove. Lemma: 3.4 If I ∃y F (y) then I F (t) for some term t. 26 Proof: We have I n 0 ∃y F (y) for some n. The last inference of the pertaining deducn−1 tion must have been ∃R. 
Hence I 0 F (t) for some term t since in the intuitionistic case we can not have side formulas in the antecedent. t u Corollary: 3.5 If I ∀~x ∃~y A(~x, ~y ) then there exists a term tuple t of the same length as ~b whose free variables are among ~a such that I A(~a, t) . Proof: Use ∀-inversion and apply the previous Lemma several times. t u Examples: 3.6 In the classical case we cannot always find a single term as the following example demonstrates. Let L be a language that has two constants 0, 1 and two unary predicate symbols P and R. Then in classical logic we have ` ∃y [(P (0) → R(0)) ∧ (¬P (0) → R(1)) → R(y)] but we can not prove (P (0) → R(0)) ∧ (¬P (0) → R(1)) → R(t) for any term t. (Exercise) Definition: 3.7 A theory T is a set of sentences, called its axioms. T is said to be universal (or open) if all of its axioms are of the form ∀~x A(~x ) with A(~a ) quantifier free. If ~s is a tuple of terms (of the same length as ~x) then A(~s ) will be called a substitution instance of ∀~x A(~x ). Theorem: 3.8 (Hilbert-Ackermann Consistency Theorem) A universal theory T is inconsistent iff there is a tautology which is a disjunction of negations of substitution instances of the axioms of T . In other words T is inconsistent iff there are substitution instances B1 , . . . , Bn of axioms of T such that ` ¬B1 ∨ . . . ∨ ¬Bn . Proof: Clearly if ` ¬B1 ∨ . . . ∨ ¬Bn holds then T must be inconsistent since T proves each Bi . Conversely, if T is inconsistent then there are finitely many axioms A1 , . . . , An of T such that A1 , . . . , An ⇒ . (37) Each Ai is of the form ∀~x Ci (~x ) with Ci (~a ) quantifier free. By applying ¬R to (37) n times we obtain ⇒ ¬A1 , . . . , ¬An . (38) Since ` ¬Ai ⇒ ∃~x ¬Ci (~x ) holds for all i we can employ n cuts to (38) to arrive at ⇒ ∃~x ¬C1 (~x ), . . . , ∃~x ¬Cn (~x ). (39) Now apply Lemma 3.2 to (39) several times to get rid of the existential quantifiers and subsequently apply ∨R several times to get the desired result. 
t u

Remark: 3.9 There are many examples of universal theories: the theory of equality, the theory of groups with a constant symbol for the neutral element and a function symbol for the inverse operation, the theory of linear orderings, and many equational theories. Next we will turn to a richer class of theories, the so-called geometric theories.

Definition: 3.10 The geometric formulae are inductively defined as follows: Every atom is a geometric formula. If A and B are geometric formulae then so are A ∨ B, A ∧ B and ∃x A. Another way of saying this is that a formula is geometric iff it does not contain any of the particles →, ¬, ∀. A formula is called a geometric implication if it is of either form ∀~x A or ∀~x ¬A or ∀~x (A → B) with A and B being geometric formulae. Here ∀~x may be empty. In particular geometric formulae and their negations are geometric implications. A theory is geometric if all its axioms are geometric implications.

Examples: 3.11 (i) Robinson arithmetic. The language has a constant 0, a unary successor function suc and binary functions + and ·. Axioms are the equality axioms and the universal closures of the following:

1. ¬ suc(a) = 0.
2. suc(a) = suc(b) → a = b.
3. a = 0 ∨ ∃y a = suc(y).
4. a + 0 = a.
5. a + suc(b) = suc(a + b).
6. a · 0 = 0.
7. a · suc(b) = a · b + a.

A classically equivalent axiomatization is obtained if (3) is replaced by ¬ a = 0 → ∃y a = suc(y), but this is not a geometric implication.

(ii) The theories of groups, rings, and local rings have geometric axiomatizations.

(iii) The theories of fields, ordered fields, algebraically closed fields and real closed fields have geometric axiomatizations. To express algebraic closure replace the axioms

s ≠ 0 → ∃x (s·x^n + t_1·x^{n−1} + . . . + t_{n−1}·x + t_n = 0)

by

s = 0 ∨ ∃x (s·x^n + t_1·x^{n−1} + . . . + t_{n−1}·x + t_n = 0)

where s·x^k is short for s · x · . . . · x with k many x.

(iv) The theory of projective geometry has a geometric axiomatization.
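The syntactic criterion in Definition 3.10 — no occurrence of →, ¬ or ∀ — is mechanical, so it can be checked by a short recursive function. A minimal sketch in Python, using an ad-hoc tuple encoding of formulas that is mine rather than the notes':

```python
# Definition 3.10 operationalized: a formula is geometric iff it contains
# none of the particles ->, negation, or the universal quantifier.
# Formulas are encoded as nested tuples (an illustrative encoding, not
# from the notes): ('atom', s), ('and', A, B), ('or', A, B),
# ('implies', A, B), ('not', A), ('exists', x, A), ('forall', x, A).

def is_geometric(formula):
    tag = formula[0]
    if tag == 'atom':
        return True
    if tag in ('and', 'or'):
        return is_geometric(formula[1]) and is_geometric(formula[2])
    if tag == 'exists':
        return is_geometric(formula[2])
    # 'implies', 'not' and 'forall' never occur in geometric formulae
    return False

# Axiom (3) of Robinson arithmetic:  a = 0 \/ exists y (a = suc(y))
ax3 = ('or', ('atom', 'a = 0'), ('exists', 'y', ('atom', 'a = suc(y)')))
# Its classically equivalent variant with -> is not geometric:
ax3_impl = ('implies', ('not', ('atom', 'a = 0')),
            ('exists', 'y', ('atom', 'a = suc(y)')))

print(is_geometric(ax3))       # True
print(is_geometric(ax3_impl))  # False
```

This makes concrete why the two axiomatizations of Robinson arithmetic's axiom (3) differ: the disjunctive form passes the check, the implicational form does not.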
We want to show that a geometric implication which is classically deducible in a geometric theory T is also intuitionistically deducible in T. We need some simple observations.

Lemma: 3.12 Let π : {1, . . . , n} → {1, . . . , n} be a bijection.

1. If I Γ ⇒ A1 ∨ . . . ∨ An then I Γ ⇒ Aπ(1) ∨ . . . ∨ Aπ(n).
2. If I Γ ⇒ D ∨ F(s) then I Γ ⇒ D ∨ ∃x F(x).
3. If I Γ ⇒ D ∨ B and I Γ ⇒ D ∨ C then I Γ ⇒ D ∨ (B ∧ C).
4. If I Γ ⇒ A ∨ B then I Γ, ¬A ⇒ B.
5. If I Γ, B ⇒ C and I Γ ⇒ C ∨ A then I Γ, A → B ⇒ C.

Proof: Exercise. t u

Lemma: 3.13 For a finite set of formulas ∆ = {A1, . . . , An} let ∨∆ be the formula A1 ∨ . . . ∨ An. If ∆ is empty then ∨∆ is the empty set. Let Γ be a finite set of geometric implications and ∆ be a finite set of geometric formulas. If Γ ⇒ ∆ then I Γ ⇒ ∨∆.

Proof: Let D be a cut free deduction of Γ ⇒ ∆. The proof proceeds by induction on n = |D|. If Γ ⇒ ∆ is an axiom then there exists an atom A such that A ∈ Γ ∩ ∆. If ∆ has no other formulae we are done. If there are other formulae in ∆, say D1, . . . , Dk, then apply ∨R k times to arrive at I Γ ⇒ ∨∆. Let n > 0. We inspect the last inference of D. Note that ∀R, ¬R and →R are ruled out since they have non-geometric principal formulas. If the last inference was ∀L, ∃L, ∧L, or ∨L we can simply apply the induction hypothesis to the premisses and re-apply the same inference. If the last inference was ∃R apply the induction hypothesis to its premiss and subsequently use Lemma 3.12 (2) to get the desired result. If the last inference was ∧R apply the induction hypothesis to its premisses and subsequently use Lemma 3.12 (3). If the last inference was ¬L then its minor formula must be geometric. Then apply the induction hypothesis to its premiss and subsequently use Lemma 3.12 (4). If the last inference was →L then apply the induction hypothesis to its premisses and subsequently use Lemma 3.12 (5).
t u Theorem: 3.14 Let T be a geometric theory and suppose that there is a classical proof of a geometric implication G in T . Then there is an intuitionistic proof of G from the axioms of T . Proof: G is of the form ∀~x F (~x ) where F (~a ) is a geometric formula or the negation of a geometric formula or an implication of two geometric formulae. We have A1 , . . . , Ak ⇒ G 29 for some axioms A1 , . . . , Ak of T . Using the Inversion Lemma 2.16 (ix) we get A1 , . . . , Ak ⇒ F (~a ) . If F (~a ) is geometric we obtain I A1 , . . . , Ak ⇒ F (~a ) by Lemma 3.13 so that via (several) ∀R inferences we arrive at the desired result. If F (~a ) is of the form ¬F0 (~a ) with F0 (~a ) geometric we apply the Inversion Lemma 2.16 (vii) to get A1 , . . . , Ak , F0 (~a ) ⇒ . By Lemma 3.13 we infer that I I A1 , . . . , Ak , F0 (~a ) ⇒ and thus, by ¬R, we have A1 , . . . , Ak ⇒ ¬F0 (~a ) so that via ∀R we arrive at I A1 , . . . , Ak ⇒ ∀~x ¬F0 (~x ) . If F (~a ) is of the form F0 (~a ) → F1 (~a ) with Fi (~a ) geometric we apply the Inversion Lemma 2.16 (v) to get A1 , . . . , Ak , F0 (~a ) ⇒ F1 (~a ) . By Lemma 3.13 we infer that I A1 , . . . , Ak , F0 (~a ) ⇒ F1 (~a ) . Via → R we get I A1 , . . . , Ak ⇒ F0 (~a ) → F1 (~a ) and via ∀R we arrive at I A1 , . . . , Ak ⇒ ∀~x (F0 (~x ) → F1 (~x )) . t u The previous result W can be extended toVinfinitary languages which accommodate infinite disjunctions Φ and conjunctions Φ, where Φ is set of (infinitary) formulas such that the total number of variables (free and bounded) occurring in the formulas of Φ is finite. In this richer language a formula W is said to be coherent if in addition to ∨, ∧, ∃ one also allows infinite disjunctions Φ, where Φ is already a set of coherent formulas satisfying the above proviso on the number of variables. Then a theorem similar to 3.14 can be shown for coherent theories, that is theories axiomatized by coherent implications. 
An example of an axiom expressible in this richer language via a coherent implication is the Archimedian axiom: ∀x (x < 1 ∨ x < 1 + 1 ∨ . . . ∨ x < 1 + . . . + 1 ∨ . . .) or in more compact way: ∀x _ x < n. n∈N Geometric theories are quite ubiquitous. There exists a simple method which is sometimes called Morleyisation (in honour of the logician Michael Morley) by which every theory can be given a geometric axiomatization in a richer language. The technique actually goes back to Skolem. Albeit Skolemization would be more appropriate that name is already used for something else. Wilfrid Hodges called the procedure to find a ∀∃ axiomatization in a richer language atomization. 30 Definition: 3.15 Below ∀~x (A1 (~x ) A2 (~x )) will stand for two formulas namely ∀~x (A1 (~x ) → A2 (~x )) and ∀~x (A2 (~x ) → A1 (~x )). Let T be a theory in a first order language L. For each formula A(a1 , . . . , an ) of L with all free variables indicated we add two new n-ary relation symbols PA(~a ) and NA(~a ) to the language, where ~a = a1 , . . . , an . Call the new language La . The theory T a in the language La has the following axioms: 1. ∀~x ¬(PA(~a ) (~x) ∧ NA(~a ) (~x)). 2. ∀~x (PA(~a ) (~x) ∨ NA(~a ) (~x)). 3. If A(~a ) is atomic add the axioms ∀~x (PA(~a ) (~x) A(~x )). 4. If A(~a ) is B(~a ) ∧ C(~a ) add ∀~x (PA(~a ) (~x ) PB(~a ) (~x ) ∧ PC(~a ) (~x )). 5. If A(~a ) is B(~a ) ∨ C(~a ) add ∀~x (PA(~a ) (~x ) PB(~a ) (~x ) ∨ PC(~a ) (~x )). 6. If A(~a ) is ¬B(~a ) add ∀~x (PA(~a ) (~x ) NB(~a) (~x )). 7. If A(~a ) is B(~a ) → C(~a ) add ∀~x (PA(~a ) (~x ) NB(~a ) (~x ) ∨ PC(~a ) (~x )). 8. If A(~a ) is ∃yB(~a, y) add ∀~x (PA(~a ) (~x ) ∃y PB(~a,b) (~x, y)). 9. If A(~a ) is ∀yB(~a, y) add ∀~x (NA(~a ) (~x ) ∃y NB(~a,b) (~x, y)). 10. Finally, for each axiom ∀~x A(~x ) of T add ∀~x PA(~a) (~x ) as an axiom to T a . Clearly T a is a geometric theory. Theorem: 3.16 Let T and T a as above. 
(i) For every formula A(~a ) of L with all free variables indicated, T a ` ∀~x [A(~x ) ↔ PA(~a) (~x )]. (ii) Every model A of T can be expanded in just one way to an La -structure Aa which is a model of T a . (iii) T a is conservative over T , that is, for every L-sentence B, T ` B iff T a ` B. t u Proof: Exercise. 31 4 Ordinal functions and representation systems The strength of appropriate theories can be aptly measured via transfinite ordinals. To be able to denote these ordinals and have a sufficient supply of them we shall go beyond the operations of addition, multiplication and exponentiation on ordinals and study a hierarchy of functions introduced by O. Veblen in 1908. In what follows we will work informally in a sufficiently strong classical set theory, e.g. ZF. Lower case Greek letters α, β, γ, δ, . . . will be assumed to range over the class of ordinals ON. 0 is the smallest ordinal. Every ordinal α has a successor which we denote by α + 1, i.e., α + 1 is the smallest ordinal that is bigger than α. An ordinal of the form α + 1 is a successor ordinal or just a successor. A limit ordinal or just a limit is an ordinal which is not a successor and > 0. We denote the ordering of ordinals by < and the less-than-or-equal relation by ≤. As per usual we identify an ordinal α with the set {β | β < α}. In set theory an ordinal is defined to be a transitive set whose elements are transitive too. Moreover, < on ordinals coincides with ∈ and thus α = {β | β ∈ α}. Some crucial properties about ordinals that we shall assume are the following. Postulates: 4.1 (Ordinals) (O1) < is a total linear ordering on ON, i.e. α 6< α and α < β ∨ β < α ∨ α = β hold for all α and β. (O2) Every non-empty class X of ordinals contains a least element (necessarily unique), i.e., there exists α0 ∈ X such that for all α ∈ X, α0 ≤ α. This ordinal will be denoted by min X. (O3) Whenever X is a set and f : X → ON is a function then there exists ξ ∈ ON such that f (u) < ξ for all u ∈ X. 
In future I shall not explicitly mention the above postulates but note that (O2) is equivalent to the principle of transfinite induction on ON: ∀α (∀ξ < α ξ ∈ X → α ∈ X) → ON ⊆ X . Definition: 4.2 Let N be the smallest set of ordinals which contains 0 and with α also contains α+1. Then all the ordinals in N different from 0 are successor ordinals. The first ordinal that does not belong to N is the least limit ordinal, denoted by ω. Definition: 4.3 A class U ⊆ ON is said to be an initial segment or just a segment if ∀α ∈ U ∀β < α β ∈ U . Any segment is either an ordinal α (i.e. the set of ordinals < α) or the class of ordinals ON. Let X, Y ⊆ ON and f : X → Y be a function. For V ⊆ X let f [V ] = {f (α) | α ∈ V }. f is strictly increasing or order preserving if ∀α, β ∈ X (α < β → f (α) < f (β)). f is said to be an enumeration function of Y or listing function of Y or ordering function of Y if f is strictly increasing, f [X] = Y and X is a segment. Given a set U ⊆ ON we denote by sup U the smallest ordinal ξ such that ∀α ∈ U α ≤ ξ. 32 Lemma: 4.4 Let X be a segment of ON and f : X → ON be order preserving. Then α ≤ f (α) holds for all α ∈ X. t u Proof: Use transfinite induction on α. Lemma: 4.5 Every Y ⊆ ON has a unique enumeration function EnumY . Proof: Existence. Define the collapsing function CY : Y → ON by CY (α) = {CY (ξ) | ξ ∈ Y ∧ ξ < α}. Then CY is 1–1 and X := CY [Y ] is a segment. Now let EnumY := (CY )−1 . Here is another way of defining EnumY : Let c be a set which is not an ordinal, e.g. c = {1}, where 1 = {0}. Define F : ON → ON by transfinite recursion via min(Y \ {F (β) | β < α}) if Y \ {F (β) | β < α} = 6 ∅ F (α) = c otherwise. Then let X := {α | F (α) ∈ Y } and EnumY (α) = F (α) for α ∈ X. The proof that any of the above provides indeed an enumeration function for Y is left to the reader. Uniqueness. Let f : X → Y and g : X 0 → Y both be ordering functions of Y . Then X ⊆ X 0 or X 0 ⊆ X since both are segments. 
In the first case show by induction on α ∈ X that f (α) = g(α). But since f [X] = Y and g is 1-1 this implies X = X 0 . The argument in the case X 0 ⊆ X is of course analogous. t u Definition: 4.6 Let X ⊆ ON. X is unbounded if for all α there exists γ ∈ X such that γ > α. X is closed if sup U ∈ X whenever U is a non-empty subset of X. We use the phrase X is club or a club to convey that X is closed and unbounded. A function f : ON → ON is continuous if f (sup U ) = sup f [U ] for all nonempty sets of ordinals U . f : ON → ON is a normal function if f is order preserving and continuous. Lemma: 4.7 Let Y ⊆ ON. EnumY is a normal function iff Y is closed and unbounded (Y is club). Proof: Set f := EnumY . First suppose that f is normal. As dom(f ) = ON, Y must be unbounded. Let V ⊆ Y be a non-empty set. Let U = f −1 [V ] = {ξ | f (ξ) ∈ V }. Since f is continuous we have sup V = sup f [U ] = f (sup U ) ∈ Y . Conversely assume that Y is unbounded. Then the domain of EnumY must be ON. If Y is closed and U 6= ∅ is a set of ordinals we have sup f [U ] ∈ Y , hence sup f [U ] = f (α) for some α. Clearly, ξ ≤ α holds for all ξ ∈ U , and hence sup U ≤ α. On the other hand, if δ < α then f (δ) < f (ξ) for some ξ ∈ U , and hence δ < sup U . As a result, sup U = α, thus sup f [U ] = f (sup U ). t u 33 Definition: 4.8 Let ON≥α := {δ | δ ≥ α}. Define the ordinal sum α + ξ by α + ξ := EnumON≥α (ξ). Since ON≥α is obviously a club, the function ξ 7→ α + ξ is a normal function by Lemma 4.7. Lemma: 4.9 The following properties hold for ordinal addition: 1. α + 0 = α. 2. α + (ξ + 1) = (α + ξ) + 1. 3. α + λ = supξ<λ (α + ξ) for limits λ. 4. ξ < η implies α + ξ < α + η. 5. α ≤ α + ξ and ξ ≤ α + ξ. 6. α + (β + γ) = (α + β) + γ. Proof: These are straightforward consequences of ξ 7→ α + ξ being an enumeration function. (5) is proved by induction on γ. If γ = 0 this follows from (1). If γ = γ0 +1, then (2) (2) i.h. 
(2) α + (β + γ) = α + ((β + γ0 ) + 1) = (α + (β + γ0 )) + 1 = ((α + β) + γ0 ) + 1 = (α + β) + γ. If γ is a limit then i.h. (α + β) + γ = sup((α + β) + ξ) = sup(α + (β + ξ)) ≤ α + (β + γ). ξ<γ ξ<γ Suppose ζ < α + (β + γ). Then ζ < α or ζ = α + ζ0 for some ζ0 < β + γ. In the latter case ζ0 < β or ζ0 = β + ξ for some ξ < γ. Thus in every case we have ζ < supξ<γ (α + (β + ξ)), showing that α + (β + γ) ≤ supξ<γ (α + (β + ξ)). t u Definition: 4.10 We say that an ordinal α > 0 is an additive principal number or additively indecomposable if ξ, η < α implies ξ + η < α. The class of additive principal numbers we denote by AP. Lemma: 4.11 1. Let α > 0. α ∈ / AP iff there exist η, ξ < α such that η + ξ = α. 2. 1 is the smallest additive principal number and ω is the next one. Additive principal number > 1 are limit ordinals. 3. Every infinite cardinal is in AP. 4. AP is a club. 34 Proof: (1) Assume α ∈ / AP. Then α ≤ ξ + δ for some ξ, δ < α. Since α ∈ ON≥ξ there exists η such that α = ξ + η. Hence η ≤ δ < α. Conversely if α = ξ + η for some ξ, η < α then α ∈ / AP. (2) is obvious. (3) Clearly ω ∈ AP. Let ρ be an infinite cardinal > ω. Note that if ξ, η < ρ then the cardinalities of ξ and η are smaller than ρ and the cardinality of ξ + η is not bigger than the maximum of the cardinalities of ξ, η, ω, and hence < ρ. (4) To show unboundedness, take any α and define α0 = α+1 and αn+1 = αn +αn . Let β := sup{αn | n ∈ N}. Since αn > 0 we have αn < αn + αn = αn+1 . Clearly, α < β. If ξ, η < β then ξ, β < αn for some n, and hence ξ + η < αn + αn = αn+1 < β. Thus β ∈ AP. As for closure, let U ⊆ AP be a non-empty set. Let α = sup U . If ξ, η < α then ξ < ξ 0 and η < η 0 for some ξ 0 , η 0 ∈ U . Hence ξ + η < max(ξ 0 , η 0 ) ≤ α. t u Definition: 4.12 Let ω α := EnumAP (α). Lemma: 4.13 1. ω 0 = 1 and ω 1 = ω. 2. ω λ = supξ<λ ω ξ . 3. If α < β then ω α < ω β . t u Proof: Obvious. Lemma: 4.14 Let α > 0. Then α ∈ AP iff for all ξ < α, ξ + α = α. Proof: This is true for α = 1. 
Let α ∈ AP and α > 1. Then α is a limit and hence ξ + α = supδ<α (ξ + δ) ≤ α. On the other hand, ξ + α ≥ α. Conversely assume ξ + α = α for all ξ < α. Then if ξ, η < α we have ξ + η < ξ + α = α, whence α ∈ AP. t u Definition: 4.15 We write α =N F α1 +. . .+αn if α = α1 +. . .+αn , α1 , . . . , αn ∈ AP and α1 ≥ . . . ≥ αn . Theorem: 4.16 (Cantor’s normal form, Cantor 1897) For every α > 0 there are uniquely determined α1 , . . . , αn ∈ AP such that α =N F α1 + . . . + αn . Proof: We prove the existence by induction on α. If α ∈ AP, the α =N F α. If α∈ / AP then by Lemma 4.11 there exist 0 < η, ξ < α such that η + ξ = α. By the inductive assumption we have η =N F η1 + . . . + ηm and ξ =N F ξ1 + . . . + ξn for some η1 , . . . , ηm , ξ1 , . . . , ξn ∈ AP. As a result, α =N F η1 + . . . + ηj + ξ1 + . . . + ξn 35 where j is the largest index such that ηj ≥ ξ1 . Note that there exists such a j since η1 ≥ ξ1 for otherwise we would have η + ξ = ξ = α. To show uniqueness assume α =N F α1 + . . . + αm and α =N F α1∗ + . . . + αn∗ . We show m = n and αi = αi∗ by induction on m. As α1 < α1∗ would entail α1 + . . . + αm < α1∗ we have α1 ≥ α1∗ . Thus, by symmetry, α1 = α1∗ . Hence m = n = 1 or α2 + . . . + αm = α2∗ + . . . + αn∗ . In the latter case the induction t u hypothesis tells us that m = n and αi = αi∗ for 2 ≤ i ≤ m. Corollary: 4.17 Let α =N F α1 + . . . + αm and β =N F β1 + . . . + βm . Then α < β iff one of the following holds: (i) m < n and αi = βi for all i ≤ m; (ii) there exists j ≤ min(m, n) such that αj < βj and αi = βi holds for all 1 ≤ i < j. t u Proof: Obvious. Definition: 4.18 We define ordinal multiplication and exponentiation as follows: α·0 = 0 α · (β + 1) = α · β + α α · λ = sup{α · ξ | ξ < λ} when λ is a limit. α0 = 1 αβ+1 = αβ · α αλ = sup{αξ | ξ < λ} when λ is a limit. Note that on account of Lemma 4.14, definitions 4.12 and 4.18 give rise to the same function ξ 7→ ω ξ . Lemma: 4.19 1. α < β and γ > 0 iff γ · α < γ · β. 2. If α ≤ β then α · γ ≤ β · γ. 3. 
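Cantor normal forms make ordinals below ε0 into finite syntactic objects, so the comparison of Corollary 4.17 and the absorption property ξ + α = α of Lemma 4.14 yield executable comparison and addition. A sketch under a nested-list encoding of my own (an ordinal is the decreasing list of the exponents in its normal form):

```python
# Ordinals < epsilon_0 in Cantor normal form omega^{a1} + ... + omega^{an}
# (a1 >= ... >= an, Theorem 4.16) are coded as the list [a1, ..., an] of
# their exponents, each exponent again such a list.  [] codes 0, [[]]
# codes omega^0 = 1, [[[]]] codes omega.  The encoding is mine.

def cmp_ord(a, b):
    """Comparison as in Corollary 4.17: -1, 0 or 1 as a <, =, > b."""
    for x, y in zip(a, b):
        c = cmp_ord(x, y)
        if c != 0:
            return c          # first differing exponent decides
    return (len(a) > len(b)) - (len(a) < len(b))  # proper prefix is smaller

def add_ord(a, b):
    """Addition: summands of a below b's leading term are absorbed,
    by Lemma 4.14 (xi + omega^gamma = omega^gamma for xi < omega^gamma)."""
    if not b:
        return a
    return [x for x in a if cmp_ord(x, b[0]) >= 0] + b

ZERO, ONE, OMEGA = [], [[]], [[[]]]
print(add_ord(ONE, OMEGA) == OMEGA)   # True: 1 + omega = omega
print(add_ord(OMEGA, ONE))            # [[[]], []], i.e. omega + 1
```

Note how non-commutativity of ordinal addition falls out of the absorption clause: 1 + ω = ω while ω + 1 keeps both summands.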
α · (β + γ) = α · β + α · γ. 4. (α · β) · γ = α · (β · γ). 5. ω α+1 = ω α · ω. 6. ω α+β = ω α · ω β . t u Proof: Exercises. 36 4.1 Veblen’s functions Definition: 4.20 For f : ON → ON define Fix(f ) := {α | f (α) = α}; f 0 := EnumFix(f ) . Veblen called f 0 the derivative of f . Lemma: 4.21 (i) If f : ON → ON is normal then Fix(f ) is a club and f 0 is a normal function, too. (ii) Let ρ > 0. If Xξ is a sequence of clubs for ξ < ρ then \ Xξ ξ<ρ is also a club. Proof: (i) By Lemma 4.7 it suffices to show that Fix(f ) is a club. For unboundedness let α be arbitrary and define α0 = α + 1, αn+1 = f (αn ) and α∗ = sup{αn | n ∈ ω}. Then α∗ > α and f (α∗ ) = sup{f (αn ) | n ∈ ω} = sup{αn+1 | n ∈ ω} = α∗ whence α∗ ∈ Fix(f ). For closure assume U ⊆ Fix(f ) is a non-empty set. Then f (sup U ) = sup f [U ] = sup U since f is continuous and f [U ] = U since U consists of fixed points of f . Thus sup U ∈ Fix(f ). (ii) Closure is obvious as each class Xξ is closed. For unboundedness let α be arbitrary. Recursively define αn and αnξ for ξ < ρ as follows. Set α0 = α + 1. For ξ < ρ choose αnξ in such a way that αnξ ∈ Xξ and αnξ > αn . Let αn+1 = supξ<ρ αnξ . Put α+ = supk αk . Then we have αn < αnξ ≤ αn+1 and hence α+ = supk αkξ for all ξ < ρ. Whence α+ ∈ Xξ for all ξ < ρ. t u Definition: 4.22 (Veblen 1908) Define Cr(0) = AP; Cr(α + 1) = Fix(ϕα ); \ Cr(λ) = Cr(ξ) if λ is a limit; ξ<λ ϕα = EnumCr(α) . Corollary: 4.23 For every α, Cr(α) is a club and ϕα is a normal function. t u Proof: Lemma 4.11 and Lemma 4.21. 37 Lemma: 4.24 1. If α ≤ β then Cr(β) ⊆ Cr(α). 2. ϕ0 (α) = ω α . 3. ϕα is strictly increasing. 4. β ≤ ϕα (β). 5. If α < β then Cr(β) is a proper subclass of Cr(α), ϕα (γ) ≤ ϕβ (γ), and ϕα (ϕβ (γ)) = ϕβ (γ). Proof: (1) follows readily by induction on β. (2) and (3) are immediate and (4) follows from Lemma 4.4. As to (5) note that ϕβ (γ) ∈ Cr(α + 1) by (1) and hence ϕα (ϕβ (γ)) = ϕβ (γ). 
As ϕα (0) < ϕα (ϕβ (0)) = ϕβ (0) it follows that ϕα (0) ∈ / Cr(β) and hence Cr(β) is a proper subclass of Cr(α). t u Theorem: 4.25 (ϕ-comparison) lowing conditions is satisfied: (i) ϕα1 (β1 ) = ϕα2 (β2 ) holds iff one of the fol- 1. α1 < α2 and β1 = ϕα2 (β2 ) 2. α1 = α2 and β1 = β2 3. α2 < α1 and ϕα1 (β1 ) = β2 . (ii) ϕα1 (β1 ) < ϕα2 (β2 ) holds iff one of the following conditions is satisfied: 1. α1 < α2 and β1 < ϕα2 (β2 ) 2. α1 = α2 and β1 < β2 3. α2 < α1 and ϕα1 (β1 ) < β2 . Proof: We prove (i) and (ii) simultaneously. Case 1: α1 < α2 . Then ϕα1 (ϕα2 (β2 )) = ϕα2 (β2 ) and hence ϕα1 (β1 ) = ϕα2 (β2 ) ϕα1 (β1 ) < ϕα2 (β2 ) iff iff β1 = ϕα2 (β2 ); β1 < ϕα2 (β2 ). Case 2: α1 = α2 . Then ϕα1 (β1 ) = ϕα2 (β2 ) ϕα1 (β1 ) < ϕα2 (β2 ) iff iff β1 = β2 ; β1 < β2 . Case 3: α1 > α2 . Then ϕα2 (ϕα1 (β1 )) = ϕα1 (β1 ) and hence ϕα1 (β1 ) = ϕα2 (β2 ) ϕα1 (β1 ) < ϕα2 (β2 ) iff iff β2 = ϕα1 (β1 ); ϕα1 (β1 ) < β2 . t u Corollary: 4.26 If α < β then ϕα (0) < ϕβ (0). Hence α ≤ ϕα (0). 38 Proof: The first part follows from Theorem 4.25(ii)(1). Thus the function α 7→ ϕα (0) is order preserving, so α ≤ ϕα (0) follows by Lemma 4.4. t u Theorem: 4.27 (ϕ normal form) For every α ∈ AP there exist uniquely determined ordinals ξ and η such that α = ϕξ (η) and η < α. Proof: For existence, let ξ := min{δ | α < ϕδ (α)}. ξ exists by Corollary 4.26. If ξ = 0 we have α = ϕ0 (η) for some η < α since α ∈ AP. If ξ > 0 then ϕζ (α) = α for all ζ < ξ and hence α ∈ Cr(ξ) which implies α = ϕξ (η) for some η < α. It remains to show uniqueness. If α = ϕξ (η) = ϕξ0 (η 0 ) where η, η 0 < α, then the cases (1) and (3) from Theorem 4.25(i) cannot hold and hence η = η 0 and ξ = ξ 0 . t u Definition: 4.28 Let SC := {α | ϕα (0) = α} and Γβ = EnumSC (β). Theorem: 4.29 SC is a club and hence β 7→ Γβ is a normal function. Proof: By Corollary 4.26 we know that α 7→ ϕα (0) is strictly increasing. One can also show that this function is continuous. Hence its class of fixed points SC forms a club. 
t u Lemma: 4.30 SC = {α | α > 0 ∧ ∀ξ, η < α ϕξ (η) < α}. t u Proof: Exercise. 4.2 Two ordinal representation systems Let ε0 be the first ordinal α such that such that ω α = α. Then ∀β < ε0 β < ω β . Another notation for ε0 is ϕ1 (0). Also note that if ρ ∈ AP and ρ < Γ0 then there exist (unique) α, β < ρ such that α = ϕα (β). Definition: 4.31 (i) The set OT(ε0 ) is inductively defined by the following clauses: 1. 0 ∈ OT(ε0 ). 2. If α1 , . . . , αn ∈ OT(ε0 ) ∩ AP and α1 ≥ . . . ≥ αn and n > 1 then α1 + . . . + αn ∈ OT(ε0 ). 3. If α ∈ OT(ε0 ) then ω α ∈ OT(ε0 ). (ii) OT(Γ0 ) is inductively defined by the following clauses: 1. 0 ∈ OT(Γ0 ). 2. If α1 , . . . , αn ∈ OT(Γ0 ) ∩ AP and α1 ≥ . . . ≥ αn and n > 1 then α1 + . . . + αn ∈ OT(Γ0 ). 3. If α, β ∈ OT(Γ0 ) and α, β < ϕα (β) then ϕα (β) ∈ OT(Γ0 ). Corollary: 4.32 (i) OT(ε0 ) = ε0 . 39 (ii) OT(Γ0 ) = Γ0 . Proof: Use induction on α < ε0 to show that α ∈ OT(ε0 ). Similarly, use induction on α < Γ0 to show that α ∈ OT(Γ0 ). t u Ordinals β < Γ0 have a unique normal form, namely either β = 0 or β =N F β1 + . . . βn with β1 , βn ∈ AP and n > 1 or β =N F ϕγ (δ) with γ, δ < β. Thus every 0 < β < Γ0 can be uniquely represented in terms of smaller ordinals which again can be uniquely represented in terms of yet smaller ordinals and 0 etc. As this process terminates after finitely many steps, every β < Γ0 has a unique term representation over the alphabet 0, +, ϕ. Corollary: 4.33 There is a primitive recursive set A0 ⊆ N, a primitive recursive relation ≺ on A0 and primitive binary recursive functions +̂ and ϕ̂ such that f : (OT(Γ0 ), <, +, ϕ) ∼ = (A0 , ≺, +̂, ϕ̂) for some structural isomorphism f . Moreover (OT(ε0 ), <, +, ϕ0 ) ∼ = (B0 , ≺1 , +̂, ϕ̂0 ), where B0 = {x ∈ A0 | x ≺ f (ε0 )}, ≺1 is the restriction of ≺ to A0 and +̂ and ϕ̂0 are the restrictions of these functions to B0 . Proof: Ordinals < Γ0 can be coded by natural numbers. For instance a coding function d . 
e : Γ0 −→ N could be defined as follows: if α = 0 0 h1, dα1 e, . . . , dαn ei if α =N F α1 + · · · + αn where n > 1 dαe = h2, dα1 e, dα2 ei if α =N F ϕα1 (α2 ) where hk1 , · · · , kn i := 2k1 +1 · . . . · pknn +1 with pi being the ith prime number (or any other coding of tuples). Further define: A0 := range of d.e dαe +̂ dβe := dα + βe dαe ≺ dβe :⇔ α < β ϕ̂(dαe, dβe) := dϕα (β)e. Then hΓ0 , +, ϕ, <i ∼ = hA0 , +̂, ϕ̂, ≺i. It remains to show that A0 , ≺, +̂, ϕ̂ are primitive recursive. This can be seen by defining them via a simultaneous primitive recursive definition, viewing Corollary 4.18 and Theorem 4.25 as the recursive clauses for defining ≺. t u 40 5 Ordinal analysis of PA and some subsystems of second order arithmetic The most important structure in mathematics is arguably the structure of the natural numbers N = (N; 0N , 1N , +N , ×N , E N , <N ), where 0N denotes zero, 1N denotes the number one, +N , ×N , E N denote the successor, addition, multiplication, and exponentiation function, respectively, and <N stands for the less-than relation on the natural numbers. In particular, E N (n, m) = nm . Many of the famous theorems and problems of mathematics such as Fermat’s and Goldbach’s conjecture, the Twin Prime conjecture, and Riemann’s hypothesis can be formalized as sentences of the language of N and thus concern questions about the structure N. Definition: 5.1 A theory designed with the intent of axiomatizing the structure N is Peano arithmetic, PA. The language of PA has the predicate symbols =, <, the function symbols +, ×, E (for addition, multiplication,exponentiation) and the constant symbols 0 and 1. The Axioms of PA comprise the usual equations and laws for addition, multiplication, exponentiation, and the less-than relation. In addition, PA has the Induction Scheme (IND) A(0) ∧ ∀x[A(x) → A(x + 1)] → ∀xA(x) for all formulae A(a) of the language of PA. Gentzen showed that transfinite induction up to the ordinal ω ε0 = sup{ω, ω ω , ω ω , . . 
.} = least α such that ω^α = α suffices to prove the consistency of PA. To appreciate Gentzen's result it is pivotal to note that he applied transfinite induction up to ε0 solely to elementary computable predicates and that, besides, his proof used only finitistically justified means. Hence, a more precise rendering of Gentzen's result is

F + EC-TI(ε0) ⊢ Con(PA),   (40)

where F signifies a theory that embodies only finitistically acceptable means, EC-TI(ε0) stands for transfinite induction up to ε0 for elementary computable predicates, and Con(PA) expresses the consistency of PA. Finally, we should spell out the scheme EC-TI(ε0) in the language of PA:

∀x [∀y (y ≺ x → P(y)) → P(x)] → ∀x P(x)

for all elementary computable predicates P, where ≺ is a fixed elementary well-ordering of order type ε0 (e.g. the one provided by Corollary 4.33). Gentzen also showed that his result is the best possible in that PA proves transfinite induction up to α for arithmetic predicates for any α < ε0. The compelling picture conjured up by the above is that the non-finitist part of PA is encapsulated in EC-TI(ε0) and therefore “measured” by ε0, thereby tempting one to adopt the following definition of the proof-theoretic ordinal of a theory T:

|T|Con = least α. F + EC-TI(α) ⊢ Con(T).   (41)

In the above, many notions were left unexplained. We will now consider them one by one. The elementary computable functions are exactly the Kalmar elementary functions, i.e. the class of functions which contains the successor, projection, zero, addition, multiplication, and modified subtraction functions and is closed under composition and bounded sums and products. A predicate is elementary computable if its characteristic function is elementary computable.

According to an influential analysis of finitism due to W.W. Tait, finitistic reasoning coincides with a system known as Primitive Recursive Arithmetic, PRA. For the purposes of ordinal analysis, however, it suffices to identify F with an even more restricted theory known as Elementary Recursive Arithmetic, EA.
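As an illustration of the closure conditions just mentioned, here is a small Python sketch (purely illustrative, not part of the formal development) modelling the base functions together with bounded sums and products, and using them to build the characteristic function of the divisibility relation — the sense in which a predicate counts as elementary computable:

```python
# Illustrative sketch: the Kalmar elementary functions are generated from
# successor, projections, zero, +, *, and modified subtraction, and are
# closed under composition and bounded sums/products.

def monus(x, y):
    """Modified subtraction x -' y (cut off at 0)."""
    return x - y if x > y else 0

def bounded_sum(f):
    """n |-> sum of f(i) for i < n."""
    return lambda n: sum(f(i) for i in range(n))

def bounded_prod(f):
    """n |-> product of f(i) for i < n."""
    def g(n):
        p = 1
        for i in range(n):
            p *= f(i)
        return p
    return g

def sg(x):
    """sg(x) = 0 if x = 0, else 1; definable as 1 -' (1 -' x)."""
    return monus(1, monus(1, x))

def chi_div(x, y):
    """Characteristic function of 'y divides x', built only from the
    generators above: sg( sum_{q <= x} (1 -' |x - q*y|) )."""
    return sg(bounded_sum(lambda q: monus(1, monus(x, q * y) + monus(q * y, x)))(x + 1))
```

For instance, chi_div(6, 3) is 1 and chi_div(7, 3) is 0; likewise the factorial, as the bounded product of the successor function, comes out elementary.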
EA is a weak subsystem of PA having the same defining axioms for +, ×, E, < but with induction restricted to elementary computable predicates. We shall add a new unary predicate symbol U to the language of PA which will serve the purpose of a free predicate variable.

Definition: 5.2 We shall formalize PAU in the sequent calculus. In addition to the rules of the (classical) sequent calculus we have to add the following axioms:

(=ref) Γ ⇒ ∆, t = t
(=sym) Γ, s = t ⇒ ∆, t = s
(=tran) Γ, s1 = s2, s2 = s3 ⇒ ∆, s1 = s3
(=sub) Γ, s1 = t1, . . . , sn = tn, A(s1, . . . , sn) ⇒ ∆, A(t1, . . . , tn) only for atomic formulas A(~s).
(suc1) Γ ⇒ ∆, suc(s) ≠ 0 and Γ, suc(s) = suc(t) ⇒ ∆, s = t.
(+) Γ ⇒ ∆, s + 0 = s and Γ ⇒ ∆, s + suc(t) = suc(s + t).
(·) Γ ⇒ ∆, s · 0 = 0 and Γ ⇒ ∆, s · suc(t) = s · t + s.
(IND) Γ, A(0), ∀x [A(x) → A(x + 1)] ⇒ ∆, ∀xA(x) for all formulas A(a).

As the ultimate goal of this course is to carry out an ordinal analysis of a system of set theory, we shall not particularly dwell on an ordinal analysis of PA. To soften the ascent to set theory, however, we will first give an ordinal analysis of two subsystems of second order arithmetic. The analysis of PAU will arise as a corollary.

Ordinal analysis is concerned with theories serving as frameworks for formalising significant parts of mathematics. It is known that virtually all of ordinary mathematics can be formalized in Zermelo–Fraenkel set theory with the axiom of choice, ZFC. Hilbert and Bernays [25] showed that large chunks of mathematics can already be formalized in second order arithmetic. Owing to these observations, proof theory has been focusing on set theories and subsystems of second order arithmetic. Further scrutiny revealed that a small fragment is sufficient. Under the rubric of Reverse Mathematics, a research programme was initiated by Harvey Friedman some thirty years ago.
The idea is to ask whether, given a theorem, one can prove its equivalence to some axiomatic system, with the aim of determining what proof-theoretical resources are necessary for the theorems of mathematics. More precisely, the objective of reverse mathematics is to investigate the role of set existence axioms in ordinary mathematics. The main question can be stated as follows: Given a specific theorem τ of ordinary mathematics, which set existence axioms are needed in order to prove τ?

Central to the above is the reference to what is called ‘ordinary mathematics’. This concept, of course, doesn't have a precise definition. Roughly speaking, by ordinary mathematics we mean mainstream, non-set-theoretic mathematics, i.e. the core areas of mathematics which make no essential use of the concepts and methods of set theory and do not essentially depend on the theory of uncountable cardinal numbers.

Subsystems of second order arithmetic. The framework chosen for studying set existence in reverse mathematics, though, is second order arithmetic rather than set theory. Second order arithmetic, Z2, is a two-sorted formal system with free and bound first order variables (also called numerical variables; the same as for PA) and free set variables U0, U1, U2, . . . as well as bound set variables X0, X1, X2, . . . supposed to range over sets of natural numbers. The language L2 of second-order arithmetic also contains the symbols of PA, and in addition has a binary relation symbol ∈ for elementhood. Formulae are built from atomic formulae s = t, s < t, and s ∈ U (where s, t are numerical terms, i.e. terms of PA) by closing off under the connectives ∧, ∨, →, ¬, numerical quantifiers ∀x, ∃x, and set quantifiers ∀X, ∃X. The basic arithmetical axioms in all theories of second-order arithmetic are the defining axioms for 0, 1, +, ×, E, < (as for PA) and the induction axiom

∀X(0 ∈ X ∧ ∀x(x ∈ X → x + 1 ∈ X) → ∀x(x ∈ X)).
We consider the axiom schema of C-comprehension for formula classes C which is given by C − CA ∃X∀u(u ∈ X ↔ F (u)) for all formulae F ∈ C in which X does not occur. Natural formula classes are the arithmetical formulae, consisting of all formulae without second order quantifiers ∀X and ∃X, and the Π1n -formulae, where a Π1n -formula is a formula of the form ∀X1 . . . QXn A(X1 , . . . , Xn ) with ∀X1 . . . QXn being a string of n alternating set quantifiers, commencing with a universal one, followed by an arithmetical formula A(X1 , . . . , Xn ). ACA0 denotes the theory consisting of the basic arithmetical axioms plus the scheme ∃X∀u(u ∈ X ↔ F (u)) for all arithmetical formula F (a) in which X does not occur. ACA denotes the theory ACA0 augmented by the scheme of induction for all L2 -formulae. 43 5.1 The semi-formal system RA∗ of Ramified Analysis Definition: 5.3 RA∗ has the following symbols: • Bound number variables: x0 , x1 , x2 , . . .. • Free predicate variables of level α for each ordinal α < Γ0 : U0α , U1α , U2α . . . • Bound predicate variables of level β for each ordinal 0 < β < Γ0 : X0β , X1β , X2β , . . .. • The symbols 0, suc, +, ·. • The logical symbols ∧, ∨, →, ¬, ∀, ∃ and λ. • Symbols for primitive recursive functions and relations. • Parentheses. Inductive definition of formulas and predicators. 1. Every numerical atomic formula is a formula of level 0. 2. Every free predicate variable of level α is a predicator of level α. 3. If P α is a predicator of level α and t is a term, then t ∈ P α is a formula of level α. 4. If A and B are formulas of level α and β then A ∧ B, A ∨ B, A → B are formulas of level max(α, β) and ¬A is a formula of level α. 5. If F (0) is a formula of level α and x is a bound number variable which does not occur in F (0), then ∀x F (x) and ∃x F (x) are formulae of level α and λx F (x) is a predicator of level α. 6. 
If U^β is a free predicate variable of level β ≠ 0, F(U^β) is a formula of level α and X^β a bound predicate variable of level β which does not occur in F, then ∀X^β F(X^β) and ∃X^β F(X^β) are formulae of level max(α, β).

Inductive definition of the length |A| of a formula A.
1. Every atomic numerical formula A has length 0, |A| = 0.
2. |U^α(t)| = ω · α.
3. If A and B are formulas then |A ∧ B| = |A ∨ B| = |A → B| = max(|A|, |B|) + 1 and |¬A| = |A| + 1.
4. |∀x F(x)| = |∃x F(x)| = |λx F(x)| = |F(0)| + 1.
5. |∀X^β F(X^β)| = |∃X^β F(X^β)| = max(ω · β, |F(U^0)| + 1).

Definition: 5.4 We define the infinitary proof system RA∗. A true (false) atomic formula is an atomic formula without free variables (and hence closed) which is true (false) on the standard interpretation. The axioms of RA∗ are the following:

(A1) Γ ⇒ ∆, A where A is a true atomic formula.
(A2) Γ, A ⇒ ∆ where A is a false atomic formula.
(A3) Γ, U^α(s) ⇒ ∆, U^α(t) where s and t have the same numerical value and U^α is a free predicate variable.

The inference rules of RA∗ comprise those of the sequent calculus with the exception of (∀R) and (∃L). The latter are replaced by two infinitary rules, i.e. rules with infinitely many premisses. They correspond to the so-called ω-rule:

(ωR)
  Γ ⇒ ∆, F(0);  Γ ⇒ ∆, F(1);  . . . ;  Γ ⇒ ∆, F(n);  . . .
  ---------------------------------------------------------
  Γ ⇒ ∆, ∀x F(x)

(ωL)
  F(0), Γ ⇒ ∆;  F(1), Γ ⇒ ∆;  . . . ;  F(n), Γ ⇒ ∆;  . . .
  ---------------------------------------------------------
  ∃x F(x), Γ ⇒ ∆

The price to pay will be that deductions become infinite objects, i.e. infinite well-founded trees. We will also need rules for the higher order quantifiers and predicators. Variables P, P0, P1, . . . will range over predicators and variables P^α, P0^α, P1^α, . . . will range over predicators of level α. We write lev(P) for the level of P. Pβ stands for the collection of predicators with levels < β.
F (t), Γ ⇒ ∆ PL λxF (x)(t), Γ ⇒ ∆ F (P ), Γ ⇒ ∆, Γ ⇒ ∆, F (P ) all P ∈ Pβ ∀β L ∀X β F (X β ), Γ ⇒ ∆ F (P ), Γ ⇒ ∆ all P ∈ Pβ β Γ ⇒ ∆, F (t) PR Γ ⇒ ∆, λxF (x)(t) β Γ ⇒ ∆, ∀X β F (X β ) Γ ⇒ ∆, F (P ) ∃β L β ∃X F (X ), Γ ⇒ ∆ β ∀β R ∃β R Γ ⇒ ∆, ∃X F (X ) where in ∀β L and ∃β R, P is a predicator of level < β. Definition: 5.5 RA∗ α ρ Γ ⇒ ∆ is defined inductively as follows: (i) If Γ ⇒ ∆ is an axiom, then RA∗ α ρ Γ ⇒ ∆ for any α, ρ. αi (ii) If RA∗ ρ Γi ⇒ ∆i holds for all premisses Γi ⇒ ∆i of an inference of RA∗ other than (Cut) with conclusion Γ ⇒ ∆ and αi < α holds for all i, then α RA∗ ρ Γ ⇒ ∆ . α1 (iii) If RA∗ ρ Γ, C ⇒ ∆ , RA∗ α RA∗ ρ Γ ⇒ ∆ . Lemma: 5.6 α1 ρ Γ ⇒ ∆, C , |C| < ρ and α1 , α2 < α, then (i) If B is a formula of level α then |A| = ω · α + n for some n < ω. 45 (ii) For every formula A(U ) and comprehension term P α with α < β, |A(P α )| < |∀X β A(X β )|, |∃X β A(X β )|. t u Proof: Exercise. Lemma: 5.7 For every formula C of RA∗ , RA∗ 2·|A| 0 Γ, C ⇒ ∆, C . Proof: Use induction on |A|. t u We list some technical lemmata that will be useful for proving cut elimination. Lemma: 5.8 (Substitution) Let Γ(s) and ∆(s) be sets of formulas with some occurrences of s indicated and let t be a term with the same numerical value. If α ρ Γ(s) ⇒ ∆(s) , then α ρ Γ(t) ⇒ ∆(t) . t u Proof: Use induction on α. Lemma: 5.9 (Weakening) α If RA∗ ρ Γ ⇒ ∆ , Γ ⊆ Γ0 and ∆ ⊆ ∆0 , then RA∗ α ρ Γ0 ⇒ ∆0 . t u Proof: Use induction on α. Lemma: 5.10 (Inversion) (i) If RA∗ α ρ Γ, A ∧ B ⇒ ∆ then RA∗ α ρ Γ, A, B ⇒ ∆ . (ii) If RA∗ α ρ Γ ⇒ ∆, A ∧ B then RA∗ α ρ Γ ⇒ ∆, A and RA∗ α ρ Γ ⇒ ∆, B . (iii) If RA∗ α ρ Γ, A ∨ B ⇒ ∆ then RA∗ α ρ Γ, A ⇒ ∆ and RA∗ α ρ Γ, B ⇒ ∆ . (iv) If RA∗ α ρ Γ ⇒ ∆, A ∨ B then RA∗ α ρ Γ ⇒ ∆, A, B . (v) If RA∗ α ρ Γ ⇒ A → B, ∆ then RA∗ α ρ A, Γ ⇒ ∆, B . (vi) If RA∗ α ρ Γ, A → B ⇒ ∆ then RA∗ α ρ Γ ⇒ ∆, A and RA∗ (vii) If RA∗ α ρ Γ ⇒ ¬A, ∆ then RA∗ α ρ Γ, A ⇒ ∆ . (viii) If RA∗ α ρ Γ, ¬A ⇒ ∆ then RA∗ α ρ Γ ⇒ ∆, A . 
(ix) If RA∗ α ρ Γ ⇒ ∆, ∀x B(x) then RA∗ α ρ Γ ⇒ ∆, B(s) for any closed term s. (x) If RA∗ α ρ Γ, ∃x B(x) ⇒ ∆ then RA∗ α ρ Γ, B(s) ⇒ ∆ for any closed term s. α ρ Γ, B ⇒ ∆ . α α ρ Γ ⇒ ∆, B(P ) for any predicator α α ρ Γ, B(P ) ⇒ ∆ for any predicator (xi) If RA∗ ρ Γ ⇒ ∆, ∀X β B(X β ) then RA∗ P ∈ Pβ . (xii) If RA∗ ρ Γ, ∃X β B(X β ) ⇒ ∆ then RA∗ in P ∈ Pβ . 46 (xiii) If RA∗ α ρ Γ ⇒ ∆, λxF (x)(t) then RA∗ α ρ Γ ⇒ ∆, F (t) . (xiv) If RA∗ α ρ Γ, λxF (x)(t) ⇒ ∆ then RA∗ α ρ Γ, F (t) ⇒ ∆ . t u Proof: All are provable by easy inductions on α. Lemma: 5.11 (Reduction) α Suppose ρ ≤ |C|. If RA∗ ρ Γ, C ⇒ ∆ and RA∗ RA∗ α#α#β#β |C| β ρ Ξ ⇒ Θ, C , then Γ, Ξ ⇒ ∆, Θ . Proof: The proof is by induction on α#α#β#β and very similar to Lemma 2.17. We only look at two cases where C and was the principal formula of the last inference in both derivations. Case 1: The first is when C is of the form ∀X β A(X β ). Then we have RA∗ α1 ρ Γ, C, A(P 0 ) ⇒ ∆ RA∗ βP ρ Ξ ⇒ Θ, C, A(P ) and for some α1 < α and predicator P 0 ∈ Pβ as well as βP < β for all predicators P ∈ Pβ . By the induction hypothesis we obtain α1 #α1 #β#β RA∗ and RA∗ |C| α#α#βP 0 #βP 0 |C| α#α#β#β Cutting out A(P 0 ) gives RA∗ |C| Γ, Ξ, A(P 0 ) ⇒ ∆, Θ Γ, Ξ ⇒ ∆, Θ, A(P 0 ) . Γ, Ξ ⇒ ∆, Θ . Case 2: The second case is when C is of the form ∀x A(x) Then we have RA∗ α1 ρ Γ, C, A(t) ⇒ ∆ RA∗ βn ρ Ξ ⇒ Θ, C, A(n) and for some α1 < α and closed term t as well as βn < β for all numbers n. Let m be the numerical value of t. By Lemma 5.10(ix) we have RA∗ α1 ρ Γ, C, A(m) ⇒ ∆ . By the induction hypothesis we thus get α1 #α1 #β#β RA∗ and RA∗ |C| α#α#βm #βm |C| Cutting out A(m) gives RA∗ α#α#β#β |C| Γ, Ξ, A(m) ⇒ ∆, Θ Γ, Ξ ⇒ ∆, Θ, A(m) . Γ, Ξ ⇒ ∆, Θ . 47 t u Theorem: 5.12 (First Cut Elimination Theorem) α 4α If RA∗ δ+1 Γ ⇒ ∆ then RA∗ δ Γ ⇒ ∆ . Proof: We use induction on α. If Γ ⇒ ∆ is an axiom then we clearly get the desired result. So let’s assume that Γ ⇒ ∆ is not an axiom. Then we have a last inference (I) with premisses Γi ⇒ ∆i . 
Suppose the inference was not a cut or was a cut of degree < δ. We then have RA∗ αi δ+1 Γi ⇒ ∆i for some αi < α. By the induction hypothesis we have RA∗ 4^αi δ Γi ⇒ ∆i. Applying the same inference (I) yields RA∗ 4^α δ Γ ⇒ ∆ since 4^αi < 4^α.

Now suppose the last inference was a cut with a cut formula C satisfying |C| = δ. By the induction hypothesis we have

RA∗ 4^α1 δ Γ, C ⇒ ∆  and  RA∗ 4^α2 δ Γ ⇒ ∆, C

for some α1, α2 < α. We can then apply the Reduction Lemma 5.11 to these derivations and arrive at RA∗ 4^α1 # 4^α1 # 4^α2 # 4^α2 δ Γ ⇒ ∆. Since 4^α1 # 4^α1 # 4^α2 # 4^α2 ≤ 4^α the desired conclusion follows. □

Theorem: 5.13 (Second Cut Elimination Theorem) If RA∗ α ρ+ω^ν Γ ⇒ ∆ then RA∗ ϕν(α) ρ Γ ⇒ ∆.

Proof: We use induction on ν with a subsidiary induction on α. The assertion holds for ν = 0 by the First Cut Elimination Theorem 5.12. Now suppose ν > 0. If Γ ⇒ ∆ is an axiom then we clearly get the desired result. So let's assume that Γ ⇒ ∆ is not an axiom. Then we have a last inference (I) with premisses Γi ⇒ ∆i. Suppose the inference was not a cut or was a cut of rank < ρ. We then have RA∗ αi ρ+ω^ν Γi ⇒ ∆i for some αi < α. By the subsidiary induction hypothesis we have RA∗ ϕν(αi) ρ Γi ⇒ ∆i. Applying the same inference (I) yields RA∗ ϕν(α) ρ Γ ⇒ ∆.

Now suppose the last inference was a cut with cut formula C such that ρ ≤ |C| < ρ + ω^ν. Then there exist ν0 < ν and n < ω such that |C| < ρ + ω^ν0 · n. Applying the subsidiary induction hypothesis to the premisses and then performing the cut with C, we have RA∗ ϕν(α) ρ+ω^ν0·n Γ ⇒ ∆. We also have ϕν0(ϕν(α)) = ϕν(α). Therefore, by n-fold application of the main induction hypothesis, we obtain RA∗ ϕν(α) ρ Γ ⇒ ∆. □

5.2 Interpretation of subsystems of Z2 in RA∗

To facilitate the interpretation of subsystems of Z2 in RA∗ we will assume that they are formalized via the sequent calculus.

Definition: 5.14 The sequent calculus version of ACA0 has all the axioms of PAU given in Definition 5.2 but with IND excluded. Further axioms are:

(IA) Γ ⇒ ∆, ∀X [0 ∈ X ∧ ∀u (u ∈ X → u + 1 ∈ X) → ∀u u ∈ X].
(A-CA) Γ ⇒ ∆, ∃Y ∀u [u ∈ Y ↔ A(u)] where A(a) is an arithmetic formula in which Y does not occur. In addition to the usual inference rules of the sequent calculus we also need inference rules for the second order quantifiers: F (V ), Γ ⇒ ∆, ∀L ∀X F (X), Γ ⇒ ∆ Γ ⇒ ∆, F (U ) ∀R Γ ⇒ ∆, ∀X F (X) F (U ), Γ ⇒ ∆ ∃L ∃X F (X), Γ ⇒ ∆ Γ ⇒ ∆, F (V ) ∃R Γ ⇒ ∆, ∃X F (X) where the variable U in ∀2 R and ∃2 L is an eigenvariable of the respective inference, i.e. U is not to occur in the lower sequent. The sequent calculus version of ACA also has the axiom scheme (IND) from Definition 5.2. The theory of ∆11 -analysis (that’s the name Schütte gave it in [52, VIII.20]) or 1 (∆1 − CR) comprises ACA and in addition has the rule of ∆11 -comprehension: ⇒ ∀x [∀XA(X, x) ↔ ∃Y B(Y, x)] Γ ⇒ ∆, ∃Z∀x [x ∈ Z ↔ ∀XA(X, x)] ∆11 -CR where A(U, a) and B(U, a) are arithmetic formulae. Note that the premiss of an instance of ∆11 -CR does not have any side formulas. Definition: 5.15 Let 0 < σ < Γ0 . Let Ξ ⇒ Θ be an L2 -sequent. We call an L∗RS -sequent Ξσ ⇒ Θσ a σ-instance of Ξ ⇒ Θ if is obtained by the following steps: 1. Write Ξ ⇒ Θ as Ξ(a1 , . . . , ak , U1 , . . . , Ur ) ⇒ Θ(a1 , . . . , ak , U1 , . . . , Ur ) fully indicating all free variables occurring in it. 2. Replace every free variable ai by a number mi and every variable Uj by a predicator Pj of level < σ. 3. Finally add to every bound variable occurring in Ξ(m1 , . . . , mk , P1 , . . . , Pr ) ⇒ Θ(m1 , . . . , mk , P1 , . . . , Pr ) a superscript σ (i.e., X changes to X σ ) and the result is Ξσ ⇒ Θσ . If Γ ⇒ ∆ is a sequent of PAU , we say that Γ0 ⇒ ∆0 is a numerical instance of Γ ⇒ ∆ if it is obtained by the following steps: 49 1. Write Γ ⇒ ∆ as Γ(a1 , . . . , an ) ⇒ ∆(a1 , . . . , an ), where all free number variables are fully indicated. 2. Replace every ai by the same numeral mi . 3. In Γ(m1 , . . . , mn ) ⇒ ∆(m1 , . . . , mn ) replace every expression U (t) by t ∈ U00 , and the result is Γ0 ⇒ ∆0 . 
Lemma: 5.16 RA∗ 2·|F (0)|+ω F (0), ∀x [F (x) → F (x + 1)] ⇒ ∀x F (x) 0 Proof: We show 2·(|F (0)|+n) RA∗ 0 F (0), ∀x [F (x) → F (x + 1)] ⇒ F (n) (42) by induction on n. Let η := |F (0)|. By Lemma 5.7 we have RA∗ 2·η RA∗ 2·(η+n) 0 F (0), ∀x [F (x) → F (x + 1)] ⇒ F (0) . Assume 0 F (0), ∀x [F (x) → F (x + 1)] ⇒ F (n) . (43) 2·η We have RA∗ 0 F (n + 1) ⇒ F (n + 1) by Lemma 5.7 and thus via an inference (→ L) we obtain RA∗ 2·(η+n)+1 0 F (0), ∀x [F (x) → F (x + 1)], F (n) → F (n + 1) ⇒ F (n + 1) . Using (∀L) we arrive at RA∗ 2·(η+n)+2 0 F (0), ∀x [F (x) → F (x + 1)] ⇒ F (n + 1) (44) which is what we want as 2 · (η + n) + 2 = 2 · (η + n + 1). As a consequence of (42) we get the desired assertion via an inference (ωR). t u Theorem: 5.17 (First Interpretation Theorem) there exist n, k < ω such that RA∗ ω+n k (i) If PAU Γ ⇒ ∆ then Γ0 ⇒ ∆0 holds for every numerical instance of Γ0 ⇒ ∆0 of Γ ⇒ ∆. (ii) If ACA0 that ∀X A(X) where A(U ) is arithmetic then there exist n, k < ω such RA∗ (iii) If ACA ω+n k ∀X 1 A(X 1 ) . Γ ⇒ ∆ then there exist n, k < ω such that RA∗ ω+ω+n ω+k Γ1 ⇒ ∆1 holds for every 1-instance Γ1 ⇒ ∆1 of Γ ⇒ ∆. 50 Proof: (i) Use induction on the length of the derivation in PAU . Numerical instances of the axioms of PAU other than (IA) are axioms of RA∗ . (IA) is deducible cut free and with length ω + 1 by Lemma 5.16. For the induction step note that inferences of PAU other than (∀R) and (∃L) are inferences of RA∗ too. If the last inference was (∀R) use (ωR) instead and if it was (∃L) use (ωL). Also note that if A is numerical instance of a formula of PAU then |A| < ω. (ii) follows from (i) since if ACA0 ∀X A(X) with A(U ) is arithmetic then PAU A(U ) . (iii) Again use induction on the length of the derivation. Note that a 1-instance of a formula of ACA has length < ω + ω. t u Theorem: 5.18 (Second Interpretation Theorem) If (∆11 -CR) RA∗ ω·σ+ω+6·n ω·σ+ω n Γ ⇒ ∆ then Γσ ⇒ ∆σ holds for any σ = ω n · β with β > 0 and σ-instance Γσ ⇒ ∆σ of Γ ⇒ ∆. 
Proof: Homework #5, Problem 5. □

Corollary: 5.19
(i) If PAU ⊢ Γ ⇒ ∆ then there exists α < ε0 such that RA∗ α 0 Γ0 ⇒ ∆0 holds for every numerical instance Γ0 ⇒ ∆0 of Γ ⇒ ∆.
(ii) If ACA0 ⊢ ∀XA(X), where A(U) is arithmetic and has no free number variables, then there exists α < ε0 such that RA∗ α 0 ∀X^1 A(X^1).
(iii) If ACA ⊢ Γ ⇒ ∆ then there exists α < εε0 such that RA∗ α 0 Γ1 ⇒ ∆1 holds for every 1-instance Γ1 ⇒ ∆1 of Γ ⇒ ∆.
(iv) If (∆11-CR) ⊢ ∀XA(X), where A(U) is arithmetic and has no free number variables, then there exists α < ϕω(0) such that RA∗ α 0 ∀X^1 A(X^1).

6 The limits of the deducibility of transfinite induction

Definition: 6.1 Let ≺ be a relation on N. For a formula F(a) define

Prog(≺, F) := ∀x (∀y ≺ x F(y) → F(x));
TI(≺, F) := Prog(≺, F) → ∀x F(x).

Also define

Prog(≺, U) := ∀x (∀y ≺ x y ∈ U → x ∈ U);
TI(≺, U) := Prog(≺, U) → ∀x x ∈ U.

If ≺ is well-founded we define

|n|≺ = sup{|k|≺ + 1 | k ≺ n}
‖≺‖ = sup{|n|≺ | n ∈ N}.

For a theory T whose language comprises that of PAU define

‖T‖sup = sup{‖≺‖ | T ⊢ TI(≺, U) where ≺ is primitive recursive}.

Definition: 6.2 We define the notion of a U-positive (U-negative) formula of PAU. A formula in which U does not occur is both U-positive and U-negative. A formula t ∈ U is U-positive and ¬t ∈ U is U-negative. If A, B and F(a) are U-positive (U-negative) then so are A ∧ B, A ∨ B, ∀xF(x) and ∃xF(x). If A is U-positive (U-negative) then ¬A is U-negative (U-positive). If A is U-negative (U-positive) and B is U-positive (U-negative) then A → B is U-positive (U-negative).

If A(U) is a formula of PAU without free number variables and X ⊆ N, we write (N, X) |= A(U) if A(U) becomes true on interpreting U by X. Note that if A(U) is U-positive, X ⊆ Y ⊆ N and (N, X) |= A(U), then (N, Y) |= A(U). We shall refer to this fact as monotonicity of U-positive formulae. Similarly, U-negative formulae behave in an anti-monotonic way.
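The polarity clauses of Definition 6.2 amount to a simple recursion on the build-up of a formula. Here is a minimal Python sketch (illustrative only; the tuple encoding of formulae is ad hoc, and bounded quantifiers are treated like plain quantifiers since quantifiers do not affect polarity):

```python
# Formulas as tuples: ("atom",) = atomic without U, ("inU",) = t in U,
# ("not", A), ("and", A, B), ("or", A, B), ("imp", A, B),
# ("all", A), ("ex", A).

def polarity(f):
    """Return (is_U_positive, is_U_negative) following Definition 6.2."""
    tag = f[0]
    if tag == "atom":                    # U does not occur: both polarities
        return (True, True)
    if tag == "inU":                     # t in U: positive only
        return (True, False)
    if tag == "not":                     # negation swaps polarities
        p, n = polarity(f[1])
        return (n, p)
    if tag in ("and", "or"):             # conjunction/disjunction: componentwise
        p1, n1 = polarity(f[1]); p2, n2 = polarity(f[2])
        return (p1 and p2, n1 and n2)
    if tag in ("all", "ex"):             # quantifiers leave polarity unchanged
        return polarity(f[1])
    if tag == "imp":                     # A -> B positive iff A negative, B positive
        p1, n1 = polarity(f[1]); p2, n2 = polarity(f[2])
        return (n1 and p2, p1 and n2)
    raise ValueError(tag)

# Prog(<, U) = forall x (forall y < x (y in U) -> x in U), schematically:
prog = ("all", ("imp", ("all", ("inU",)), ("inU",)))
```

Note that polarity(prog) comes out (False, False): Prog(≺, U) is neither U-positive nor U-negative, since t ∈ U occurs positively in the antecedent of the implication. This is why Proposition 6.3 below keeps Prog(≺, U) and the assumptions ti ∈ U on the left of the sequent rather than folding them into Γ.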
If Γ is a non-empty finite set of formulae we denote by ⋁Γ and ⋀Γ the disjunction and conjunction of all formulae in Γ, respectively. Also define ⋀∅ to be the formula 0 = 0 and ⋁∅ to be the formula 0 = 1.

Proposition: 6.3 Assume that ≺ is a well-founded relation on N which is defined by an arithmetic formula, i.e. there is an arithmetic formula B(a, b) with exactly the exhibited free variables such that n ≺ m iff B(n, m) holds in the standard model. Let ∆ be a finite set of U-positive arithmetic formulae and Γ be a finite set of U-negative arithmetic formulae with no other free variables than U. We identify U with U0^0. If δ = max(|t1|≺, . . . , |tr|≺) and

RA∗ β 0 t1 ∈ U, . . . , tr ∈ U, Prog(≺, U), Γ ⇒ ∆

then (N, {m | |m|≺ < δ + 2^β}) |= ⋀Γ → ⋁∆.

Proof: We employ induction on β. If the entire sequent is an axiom one readily checks that the claim is true. If the last inference introduced a principal formula belonging to Γ or ∆ the claim follows readily from the induction hypothesis applied to the premisses. Now assume that the last inference had Prog(≺, U) as its principal formula. Then we have

RA∗ β0 0 t1 ∈ U, . . . , tr ∈ U, Prog(≺, U), ∀y ≺ t y ∈ U → t ∈ U, Γ ⇒ ∆

for some closed term t and β0 < β. Using (→ L)-inversion we get

RA∗ β0 0 t1 ∈ U, . . . , tr ∈ U, Prog(≺, U), Γ ⇒ ∆, ∀y ≺ t y ∈ U;   (45)
RA∗ β0 0 t1 ∈ U, . . . , tr ∈ U, t ∈ U, Prog(≺, U), Γ ⇒ ∆.   (46)

Note that ∀y ≺ t y ∈ U is a U-positive formula, and hence we may apply the induction hypothesis to (45) to arrive at

(N, {m | |m|≺ < δ + 2^β0}) |= ⋀Γ → (⋁∆ ∨ ∀y ≺ t y ∈ U).

If (N, {m | |m|≺ < δ + 2^β0}) |= ⋀Γ → ⋁∆ we are done owing to monotonicity. If the latter is not the case, then we have (N, {m | |m|≺ < δ + 2^β0}) |= ∀y ≺ t y ∈ U, which entails that |t|≺ ≤ δ + 2^β0. As a result, the induction hypothesis applied to (46) with δ′ = δ + 2^β0 yields

(N, {m | |m|≺ < δ′ + 2^β0}) |= ⋀Γ → ⋁∆.

As δ′ + 2^β0 = δ + 2^β0 + 2^β0 ≤ δ + 2^β we are done again by monotonicity.
□

Corollary: 6.4 If RA∗ β 0 Prog(≺, U) → ∀x x ∈ U then ‖≺‖ ≤ 2^β.
Proof: The assumption entails that RA∗ β 0 Prog(≺, U) ⇒ ∀x x ∈ U, and hence, by the previous Proposition, |n|≺ < 2^β holds for all n, whence ‖≺‖ ≤ 2^β. □

Corollary: 6.5
(i) ‖PAU‖sup = ε0.
(ii) ‖ACA0‖sup = ε0.
(iii) ‖ACA‖sup = εε0.
(iv) ‖(∆11-CR)‖sup = ϕω(0).
Proof: The “≤” estimates follow from Corollary 5.19 in combination with Corollary 6.4. The “≥” estimates in (i), (ii), (iii) follow from homework assignment #6, problems 3 and 4. The “≥” part in (iv) will be another exercise. □

6.1 Proof-theoretical reductions

Ordinal analyses of theories allow one to compare the strength of theories. This subsection defines the notions of proof-theoretic reducibility and proof-theoretic strength that will be used henceforth. All theories T considered in the following are assumed to contain a modicum of arithmetic. For definiteness let this mean that the system PRA of Primitive Recursive Arithmetic is contained in T, either directly or by translation.

Definition: 6.6 Let T1, T2 be a pair of theories with languages L1 and L2, respectively, and let Φ be a (primitive recursive) collection of formulae common to both languages. Furthermore, Φ should contain the closed equations of the language of PRA. We then say that T1 is proof-theoretically Φ-reducible to T2, written T1 ≤Φ T2, if there exists a primitive recursive function f such that

PRA ⊢ ∀φ ∈ Φ ∀x [ProofT1(x, φ) → ProofT2(f(x), φ)].   (47)

T1 and T2 are said to be proof-theoretically Φ-equivalent, written T1 ≡Φ T2, if T1 ≤Φ T2 and T2 ≤Φ T1. The appropriate class Φ is revealed in the process of reduction itself, so that in the statement of theorems we simply say that T1 is proof-theoretically reducible to T2 (written T1 ≤ T2) and that T1 and T2 are proof-theoretically equivalent (written T1 ≡ T2), respectively. Alternatively, we shall say that T1 and T2 have the same proof-theoretic strength when T1 ≡ T2.
Feferman’s notion of proof-theoretic reducibility (in S. Feferman: Hilbert’s program relativized: Proof-theoretical and foundational reductions, J. Symbolic Logic 53 (1988) 364–384) is more relaxed in that he allows the reduction to be given by a T2 -recursive function f , i.e. T2 ` ∀φ ∈ Φ ∀x [ProofT1 (x, φ) → ProofT2 (f (x), φ)]. (48) The disadvantage of (48) is that one forfeits the transitivity of the relation ≤Φ . Furthermore, in practice, proof-theoretic reductions always come with a primitive recursive reduction, so nothing seems to be lost by using the stronger notion of reducibility. 6.2 The general form of ordinal analysis In this subsection I attempt to say something general about all ordinal analyses that have been carried out thus far. One has to bear in mind that these concern “natural” theories. Also, to circumvent countless and rather boring counter examples, I will only address theories that have at least the strength of PA and and always assume the pertinent ordinal representation systems are closed under α 7→ ω α . Before delineating the general form of an ordinal analysis, we need several definitions. We first garner some features (following that ordinal representation systems used in proof theory always have, and collectively call them “elementary ordinal representation system”. One reason for singling out this notion is that it leads to an elegant characterization of the provably recursive functions of theories equipped with transfinite induction principles for such ordinal representation systems. 54 Definition: 6.7 Elementary recursive arithmetic, EA, is a weak system of number theory, in a language with 0, 1, +, ×, E (exponentiation), <, whose axioms are: 1. the usual recursion axioms for +, ×, E, <. 2. induction on ∆0 -formulae with free variables. EA is referred to as elementary recursive arithmetic since its provably recursive functions are exactly the Kalmar elementary functions, i.e. 
the class of functions which contains the successor, projection, zero, addition, multiplication, and modified subtraction functions and is closed under composition and bounded sums and products.

Definition: 6.8 For a set X and a binary relation ≺ on X, let LO(X, ≺) abbreviate that ≺ linearly orders the elements of X and that for all u, v, whenever u ≺ v, then u, v ∈ X. A linear ordering is a pair ⟨X, ≺⟩ satisfying LO(X, ≺).

Definition: 6.9 An elementary ordinal representation system (EORS) for a limit ordinal λ is a structure ⟨A, ≺, n ↦ λn, +, ×, x ↦ ω^x⟩ such that:

(i) A is an elementary subset of N.
(ii) ≺ is an elementary well-ordering of A.
(iii) |≺| = λ.
(iv) Provably in EA, ≺λn is a proper initial segment of ≺ for each n, and ⋃n ≺λn = ≺. In particular, EA ⊢ ∀y λy ∈ A ∧ ∀x ∈ A ∃y [x ≺ λy].
(v) EA ⊢ LO(A, ≺).
(vi) +, × are binary and x ↦ ω^x is unary. They are elementary functions on elementary initial segments of A. They correspond to ordinal addition, multiplication and exponentiation to base ω, respectively. The initial segments of A on which they are defined are maximal. n ↦ λn is an elementary function.
(vii) ⟨A, ≺, +, ×, x ↦ ω^x⟩ satisfies “all the usual algebraic properties” of an initial segment of ordinals. In addition, these properties of ⟨A, ≺, +, ×, x ↦ ω^x⟩ can be proved in EA.
(viii) Let ñ denote the nth element in the ≺-ordering of A. Then the correspondence n ↔ ñ is elementary.
(ix) Let α = ω^β1 + · · · + ω^βk, β1 ≥ · · · ≥ βk (Cantor normal form). Then the correspondence α ↔ ⟨β1, . . . , βk⟩ is elementary.

Elements of A will often be referred to as ordinals, and denoted α, β, . . ..

Definition: 6.10 Suppose LO(A, ≺) and F(u) is a formula. Then TI⟨A,≺⟩(F) is the formula

∀n ∈ A [∀x ≺ n F(x) → F(n)] → ∀n ∈ A F(n).   (49)

TI(A, ≺) is the schema consisting of TI⟨A,≺⟩(F) for all F. Given a linear ordering ⟨A, ≺⟩ and α ∈ A let Aα = {β ∈ A : β ≺ α} and ≺α be the restriction of ≺ to Aα.
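To make the clauses of Definition 6.9 concrete for λ = ε0, here is a Python sketch (purely illustrative, not part of the notes' formal apparatus) built around clause (ix): an ordinal < ε0 is represented by the list of exponents of its Cantor normal form, and the ordering and addition become simple recursions on these terms:

```python
# An ordinal < epsilon_0 with Cantor normal form
#   omega^b1 + ... + omega^bk   (b1 >= ... >= bk)
# is represented as the list [b1, ..., bk] of its (recursively
# represented) exponents; zero is the empty list.

def lt(a, b):
    """a < b for CNF terms: lexicographic comparison of exponent lists,
    comparing the exponents themselves recursively."""
    for x, y in zip(a, b):
        if lt(x, y):
            return True
        if lt(y, x):
            return False
    return len(a) < len(b)      # proper initial segment is smaller

def add(a, b):
    """Ordinal addition in CNF: summands of a below the leading
    exponent of b are absorbed."""
    if not b:
        return a
    return [e for e in a if not lt(e, b[0])] + b

ZERO, ONE = [], [[]]            # 0 and 1 = omega^0
OMEGA = [[[]]]                  # omega = omega^(omega^0)

assert lt(ONE, OMEGA) and not lt(OMEGA, ONE)
assert add(ONE, OMEGA) == OMEGA          # 1 + omega = omega
assert add(OMEGA, ONE) == [[[]], []]     # omega + 1 > omega
```

Under a standard coding of nested lists by numbers, lt and add are elementary operations on codes, which is the sense in which clauses (ii) and (vi) can be met; the non-commutativity 1 + ω = ω ≠ ω + 1 is visible directly on the term representations.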
In what follows, quantifiers and variables are supposed to range over the natural numbers. When n denotes a natural number, n̄ is the canonical name in the language under consideration which denotes that number. Observation: 6.11 Every ordinal analysis of a classical or intuitionistic theory T that has ever appeared in the literatureSprovides an EORS hA, , . . .i such that T is proof-theoretically reducible to PA + α∈A TI(Aᾱ , ᾱ ). S Moreover, if T is a classical theory, then T and PA + α∈A TI(Aᾱ , ᾱ ) prove the sameSarithmetic sentences, whereas if T is based on intuitionististic, then T and HA + α∈A TI(Aᾱ , ᾱ ) prove the same arithmetic sentences. Furthermore, k T ksup =k k. Remark: 6.12 There is a lot of leeway in stating the latter observation. For instance, instead of PA one could take PRA or EA as the base theory, Sand the scheme 0 of transfiniteSinduction could be restricted to Σ1 formulae as PA + α∈A TI(Aᾱ , ᾱ ) and EA + α∈A Σ01 -TI(Aᾱ , ᾱ ) have the same proof-theoretic strength, providing that A is closed under exponentiation α 7→ ω α . Observation 6.11 lends itself to a formal definition of the notion of proof-theoretic ordinal of a theory T . Of course, before one can go about determining the prooftheoretic ordinal of T , one needs to be furnished with representations of ordinals. Not surprisingly, a great deal of ordinally informative proof theory has been concerned with developing and comparing particular ordinal representation systems. Assuming that a sufficiently strong EORS hA, , . . .i has been provided, we define [ |T |hA,,...i := least ρ ∈ A. T ≡ PA + TI(Aᾱ , ᾱ ) (50) αρ and call |T |hA,,...i , providing this ordinal exists, the proof-theoretic ordinal of T with respect to hA, , . . .i. Since, in practice, the ordinal representation systems used in proof theory are comparable, we shall frequently drop mentioning of hA, , . . .i and just write |T | for |T |hA,,...i . 
Note, however, that |T |hA,,...i might not exist even if the order-type of is bigger than k T ksup . A simple example is provided by the theory PA + Con(PA) (where Con(PA) expresses the consistency of PA) when we take hA, , . . .i to be a standard EORS for ordinals > ε0 ; the reason S being that PA + Con(PA) is prooftheoreticallySstrictly stronger than PA + αε0 TI(Aᾱ , ᾱ ) but also strictly weaker than PA + αε0 +1 TI(Aᾱ , ᾱ ). Therefore, as opposed to k · ksup , the norm |·|hA,,...i is only partially defined and does not induce a prewellordering on theories T with k T ksup <k k. The remainder of this subsection expounds on important consequences of ordinal analyses that follow from Observation 6.11. 56 S S Proposition: 6.13 PA + α∈A TI(Aᾱ , ᾱ ) and HA + α∈A TI(Aᾱ , ᾱ ) prove the same sentences in the negative fragment, where a sentence is in the negative fragment if it is built from atomic formulae via ∧, →, ¬, ∀x. S S Proof: PA + α∈A TI(Aᾱ , ᾱ ) can be interpreted in HA + α∈A TI(Aᾱ , ᾱ ) via the Gödel–Gentzen ¬¬-translation. Observe that for an instance of the schema of transfinite induction we have (∀u [∀x (∀y [y ≺ x → φ(y)] → φ(x)) → φ(u)])¬¬ ≡ (∀u [∀x (∀y [¬¬y ≺ x → ¬¬φ(y)] → ¬¬φ(x)) → ¬¬φ(u)]). Thus for primitive recursive ≺ the ¬¬-translation is HA equivalent to an instance of the same schema. t u Corollary: 6.14 PA + same Π01 sentences. S α∈A TI(Aᾱ , ᾱ ) and HA + S α∈A TI(Aᾱ , ᾱ ) prove the Since many well-known and important theorems as well as conjectures from number theory are expressible in Π01 form (examples: the quadratic reciprocity law, Wiles’ theorem, also known as Fermat’s conjecture, Goldbach’s conjecture, the Riemann hypothesis), Π01 conservativity ensures that many mathematically important theorems which turn out to be provable in S will be provable in T , too. However, Π01 conservativity is not always a satisfactory conservation result. 
Some important number-theoretic statements are Π⁰₂ (examples are: the twin prime conjecture, miniaturized versions of Kruskal's theorem, totality of the van der Waerden function), and in particular, formulas that express the convergence of a recursive function for all arguments. Consider a formula ∀n ∃m P(n, m), where P(n, m) is a primitive recursive formula expressing that "m codes a complete computation of algorithm A on input n". The ¬¬-translation of this formula is ∀n ¬∀m ¬P(n, m), conveying the convergence of the algorithm A for all inputs only in a weak sense. Fortunately, Proposition 6.13 can be improved to hold for sentences of Π⁰₂ form.

Proposition 6.15 PA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) and HA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) prove the same Π⁰₂ sentences.

The missing link to get from Proposition 6.13 to Proposition 6.15 is usually provided by Markov's Rule for primitive recursive predicates, MR_PR: if ¬∀n ¬Q(n) (or, equivalently, ¬¬∃n Q(n)) is a theorem, where Q is a primitive recursive relation, then ∃n Q(n) is a theorem. Kreisel [30] showed that MR_PR holds for HA. A variety of intuitionistic systems have since been shown to be closed under MR_PR, using a variety of complicated methods, notably Gödel's dialectica interpretation and normalizability. A particularly elegant and short proof for closure under MR_PR is due to Friedman [18] and, independently, to Dragalin [12]. However, though the Friedman–Dragalin argument works for a host of systems, it doesn't seem to work in the case of HA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ).

Proof of Proposition 6.15: We will give a direct proof, i.e. without using Proposition 6.13. So suppose

PA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) ⊢ ∀x ∃y φ(x, y),

where φ is ∆₀. Then there already exists a δ ∈ A such that

PA + TI(A_δ̄, ◁_δ̄) ⊢ ∀x ∃y φ(x, y).    (51)

We now use the coding of infinitary PA∞ derivations presented in [53], section 4.2.2. Let d ⊢^β_ρ ψ signify that d is the code of a PA∞ derivation with length ≤ β, cut-rank ρ and end formula ψ.
(51) implies that there are a d₀ and n < ω such that

HA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) ⊢ ⌜d₀ ⊢^{δ·ω}_n ∀x ∃y φ(x, y)⌝.    (52)

To obtain a cut-free proof of ∀x ∃y φ(x, y) in PA∞ one needs transfinite induction up to the ordinal ω_n^{δ·ω}, where ω₀^γ := γ and ω_{m+1}^γ := ω^{ω_m^γ}. This amount of transfinite induction is available in our background theory HA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ), as A is closed under ξ ↦ ω^ξ. Also note that the cut-elimination procedure is completely effective. Thus from (52) we obtain, for some d*,

HA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) ⊢ ⌜d* ⊢^{ω_n^{δ·ω}}_0 ∀x ∃y φ(x, y)⌝,    (53)

and further

HA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) ⊢ ∀x ∃d ⌜d ⊢^{ω_n^{δ·ω}}_0 ∃y φ(ẋ, y)⌝    (54)

(where Feferman's dot convention has been used here). Let Tr_{Σ₁} be a truth predicate for Gödel numbers of disjunctions of Σ₁ formulae (cf. [59], section 1.5, in particular 1.5.7). We claim that

HA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) ⊢ ∀d ∀β ≤ ω_n^{δ·ω} ∀Γ ⊆ Σ₁ [⌜d ⊢^β_0 Γ⌝ → Tr_{Σ₁}(⌜⋁Γ⌝)],    (55)

where ∀Γ ⊆ Σ₁ is a quantifier ranging over Gödel numbers of finite sets of Σ₁ formulae and ⋁Γ stands for the Gödel number corresponding to the disjunction of all formulae of Γ. (55) is proved by induction on β by observing that all formulae occurring in a cut-free PA∞ proof of a set of Σ₁ formulae are Σ₁ themselves, and the only inferences therein are either axioms or instances of the (∃) rule or improper instances of the ω rule. Combining (54) and (55) we obtain

HA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) ⊢ ∀x Tr_{Σ₁}(⌜∃y φ(ẋ, y)⌝).    (56)

As HA ⊢ ∀x [Tr_{Σ₁}(⌜∃y φ(ẋ, y)⌝) ↔ ∃y φ(x, y)] (cf. [59], Theorem 1.5.6), we finally obtain

HA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) ⊢ ∀x ∃y φ(x, y). □

In section 2 we considered the ordinal |T|_Con. What is the relation between |T|_Con and |T|_⟨A,◁,…⟩? First we have to delineate the meaning of |T|_Con, though. The latter is only determined with respect to a given ordinal representation system ⟨B, ≺, …⟩. Thus let

|T|_Con = least α ∈ B such that PRA + PR-TI(α) ⊢ Con(T).
It turns out that the two ordinals are the same when T is proof-theoretically reducible to PA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ), A is closed under α ↦ ω^α, and ⟨B, ≺, …⟩ is a proper end extension of ⟨A, ◁, …⟩. The reasons are as follows:

Proposition 6.16 The consistency of PA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) can be proved in the theory PRA + PR-TI(A, ◁), where PR-TI(A, ◁) stands for transfinite induction along ◁ for primitive recursive predicates.

Hint of proof: First note that PRA + PR-TI(A, ◁) ⊢ Π⁰₁-TI(A, ◁). The key to showing this is that for each α ∈ A and each x ∈ ω we can code α and x by the ordinal ω·α + x, which is less than ω·(α+1) and therefore in A.

Secondly, one has to show that an ordinal analysis of PA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) can be carried out in PRA + Π⁰₁-TI(A, ◁). The main tool to achieve this is to embed PA + ⋃_{α∈A} TI(A_ᾱ, ◁_ᾱ) into a system of Peano arithmetic with an infinitary rule, the so-called ω-rule, and a repetition rule, Rep, which simply repeats the premise as the conclusion. The ω-rule allows one to infer ∀x φ(x) from the infinitely many premises φ(0̄), φ(1̄), φ(2̄), … (where n̄ denotes the nth numeral); its addition accounts for the fact that the infinitary system enjoys cut-elimination. The addition of the Rep rule enables one to carry out a continuous cut elimination, due to Mints [35], which is a continuous operation in the usual tree topology on proof trees. A further pivotal step consists in making the ω-rule more constructive by assigning codes to proofs, where codes for applications of finitary rules contain codes for the proofs of the premises, and codes for applications of the ω-rule contain Gödel numbers for primitive recursive functions enumerating codes of the premises. Details can be found in [53]. The main idea here is that we can do everything with primitive recursive proof trees instead of arbitrary derivations.
A proof tree is a tree in which each node is labelled by: a sequent; a rule of inference or the designation "Axiom"; two sets of formulas specifying the sets of principal and minor formulas, respectively, of that inference; and two ordinals (length and cut-rank), such that the sequent is obtained from those immediately above it through application of the specified rule of inference. The well-foundedness of a proof tree is then witnessed by the (first) ordinal "tags", which are in reverse order of the tree order. As a result, the notion of being a (code of a) proof tree is Π⁰₁. The cut elimination for infinitary proofs with finite cut rank (as presented in [53]) can be formalized in PRA + Π⁰₁-TI(A, ◁). The last step consists in recognizing that every end formula of Π⁰₁ form of a cut-free infinitary proof is true. The latter employs Π⁰₁-TI(A, ◁). For details see [53]. □

7 Kripke-Platek Set Theory

One of the fragments of ZF which has been studied intensively is Kripke-Platek set theory, KP. Its standard models are called admissible sets. One of the reasons that this is a truly remarkable theory is that a great deal of set theory requires only the axioms of KP. An even more important reason is that admissible sets have been a major source of interaction between model theory, recursion theory and set theory (cf. [4]¹). KP arises from ZF by completely omitting the Powerset axiom and restricting Separation and Collection to absolute predicates (cf. [4]), i.e. ∆₀ formulas. These alterations are suggested by the informal notion of 'predicative'.

The axiom systems for set theories considered in this paper are formulated in the usual language of set theory (called L∈ hereafter), containing ∈ as the only non-logical symbol besides =. Formulae are built from prime formulae a ∈ b and a = b by use of propositional connectives and quantifiers ∀x, ∃x. Quantifiers of the forms ∀x ∈ a, ∃x ∈ a are called bounded.
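Membership in the syntactic classes built from bounded quantifiers, defined precisely below, is decidable by a straightforward recursion on the formula. A sketch, again over a hypothetical tuple encoding (('ball', x, a, A) for (∀x ∈ a)A, ('bex', x, a, A) for (∃x ∈ a)A, and ('all', x, A), ('ex', x, A) for unbounded quantifiers):

```python
# Deciding membership in the ∆0 and Σ formula classes of KP.
# Hypothetical encoding: ('atom', s), ('not', A), ('and', A, B), ('or', A, B),
# ('imp', A, B), ('ball', x, a, A), ('bex', x, a, A), ('all', x, A), ('ex', x, A).

def is_delta0(f):
    """True iff all quantifiers in f are bounded."""
    tag = f[0]
    if tag == 'atom':
        return True
    if tag == 'not':
        return is_delta0(f[1])
    if tag in ('and', 'or', 'imp'):
        return is_delta0(f[1]) and is_delta0(f[2])
    if tag in ('ball', 'bex'):
        return is_delta0(f[3])
    return False                # 'all'/'ex': an unbounded quantifier

def is_sigma(f):
    """Smallest class containing ∆0, closed under ∧, ∨, bounded quantifiers, ∃."""
    if is_delta0(f):
        return True
    tag = f[0]
    if tag in ('and', 'or'):
        return is_sigma(f[1]) and is_sigma(f[2])
    if tag in ('ball', 'bex'):
        return is_sigma(f[3])
    if tag == 'ex':
        return is_sigma(f[2])
    return False                # ¬, → and unbounded ∀ only enter via the ∆0 part

# ∃x (∀y ∈ x) y ∈ a  is Σ but not ∆0:
phi = ('ex', 'x', ('ball', 'y', 'x', ('atom', 'y in a')))
print(is_delta0(phi), is_sigma(phi))
```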
Bounded or ∆₀-formulae are the formulae wherein all quantifiers are bounded; Σ₁-formulae are those of the form ∃x ϕ(x) where ϕ(a) is a ∆₀-formula. For n > 0, Πₙ-formulae (Σₙ-formulae) are the formulae with a prefix of n alternating unbounded quantifiers starting with a universal (existential) one, followed by a ∆₀-formula. The class of Σ-formulae is the smallest class of formulae containing the ∆₀-formulae which is closed under ∧, ∨, bounded quantification and unbounded existential quantification.

Definition 7.1 By a ∆₀ formula or bounded formula we mean a formula of set theory in which all the quantifiers appear restricted, that is, have one of the forms (∀x ∈ b) or (∃x ∈ b).

The axioms of KP are:

Extensionality: ∀x (x ∈ a ↔ x ∈ b) → a = b.

Set Induction: ∀x [∀y ∈ x G(y) → G(x)] → ∀x G(x).

Pair: ∃x (x = {a, b}).

Union: ∃x (x = ⋃a).

Infinity: ∃x [x ≠ ∅ ∧ (∀y ∈ x)(∃z ∈ x)(y ∈ z)].

∆₀ Separation: ∃x ∀u [u ∈ x ↔ (u ∈ a ∧ F(u))] for all ∆₀-formulas F.

∆₀ Collection: (∀x ∈ a) ∃y G(x, y) → ∃z (∀x ∈ a)(∃y ∈ z) G(x, y) for all ∆₀-formulas G.

¹ J. Barwise: Admissible sets and structures
(Springer, Berlin, 1975).

To be more precise, the axioms of KP consist of Extensionality, Pair, Union, Infinity, Bounded Separation

∃x ∀u [u ∈ x ↔ (u ∈ a ∧ F(u))]

for all bounded formulae F(u), Bounded Collection

∀x ∈ a ∃y G(x, y) → ∃z ∀x ∈ a ∃y ∈ z G(x, y)

for all bounded formulae G(x, y), and Set Induction

∀x [(∀y ∈ x H(y)) → H(x)] → ∀x H(x)

for all formulae H(x).

A transitive set A such that (A, ∈) is a model of KP is called an admissible set. Of particular interest are the models of KP formed by segments of Gödel's constructible hierarchy L. The constructible hierarchy is obtained by iterating the definable powerset operation through the ordinals:

L₀ = ∅,
L_{β+1} = {X : X ⊆ L_β; X definable over ⟨L_β, ∈⟩},
L_λ = ⋃{L_β : β < λ} for λ a limit.

So any element of L of level α is definable from elements of L with levels < α and the parameter L_α. An ordinal α is admissible if the structure (L_α, ∈) is a model of KP.

Formulae of L₂ can be easily translated into the language of set theory. Some of the subtheories of Z₂ considered above have set-theoretic counterparts, characterized by extensions of KP. KPi is an extension of KP via the axiom

∀x ∃y [x ∈ y ∧ y is an admissible set].    (Lim)

KPl denotes the system KPi without Bounded Collection. It turns out that (Π¹₁-AC) + BI proves the same L₂-formulae as KPi, while (Π¹₁-CA) proves the same L₂-formulae as KPl. The intuitionistic version of KP will be denoted by IKP. By IKP₀ we denote the system IKP bereft of Set Induction.

7.1 Basic principles

The intent of this section is to explore which of the well-known provable consequences of KP carry over to IKP.

7.1.1 Ordered Pairs

By the Pairing axiom, for sets a, b we get a set y such that ∀x (x ∈ y ↔ x = a ∨ x = b). This set is unique by Extensionality; we call this set {a, b}. {a} = {a, a} is the set whose unique element is a. ⟨a, b⟩ = {{a}, {a, b}} is the ordered pair of a and b. We claim that if ⟨a, b⟩ = ⟨c, d⟩ then a = c and b = d.
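Before turning to the proof, the claim can be checked mechanically on hereditarily finite sets, modelling sets by Python frozensets (a brute-force illustration only, not part of the formal development):

```python
# Kuratowski pairs ⟨a,b⟩ = {{a},{a,b}} over hereditarily finite sets.

def pair(a, b):
    return frozenset({frozenset({a}), frozenset({a, b})})

# Brute-force check of the claim ⟨a,b⟩ = ⟨c,d⟩ → a = c ∧ b = d on a small domain.
dom = range(3)
for a in dom:
    for b in dom:
        for c in dom:
            for d in dom:
                if pair(a, b) == pair(c, d):
                    assert (a, b) == (c, d)
print("pairing is injective on", list(dom))
```

Note that pair(a, a) collapses to {{a}}, which is exactly the degenerate case the intuitionistic proof below has to handle without appealing to excluded middle.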
The usual classical proof argues by cases depending, for example, on whether or not a = b. This method is not available here, as we cannot assume that instance of the classical law of excluded middle. Instead we can argue as follows. Assume that ⟨a, b⟩ = ⟨c, d⟩. As {a} is an element of the left hand side, it is also an element of the right hand side, and so either {a} = {c} or {a} = {c, d}. In either case a = c. As {a, b} is an element of the left hand side, it is also an element of the right hand side, and so either {a, b} = {c} or {a, b} = {c, d}. In either case b = c or b = d. If b = c then a = c = b, so that the two sets in ⟨a, b⟩ are equal, and hence {c} = {c, d}, giving c = d and hence b = d. So in either case b = d. □

We will also have use for ordered triples ⟨a, b, c⟩, ordered quadruples ⟨a, b, c, d⟩, etc. They are defined by iterating ordered pair formation as follows: ⟨a⟩ = a and ⟨a₁, …, a_r, a_{r+1}⟩ = ⟨⟨a₁, …, a_r⟩, a_{r+1}⟩.

Proposition 7.2 (IKP₀) If c, d are sets then so is the class c × d.

Proof: Let c, d be sets. Then, as {a} × d = {⟨a, b⟩ | b ∈ d} is a set by Replacement, so is

c × d = ⋃_{a∈c} ({a} × d)

by Replacement and Union. □

Definition 7.3 The collection of Σ formulae is the smallest collection containing the ∆₀ formulae which is closed under conjunction, disjunction, bounded quantification and unbounded existential quantification. The collection of Π formulae is the smallest collection containing the ∆₀ formulae which is closed under conjunction, disjunction, bounded quantification and unbounded universal quantification.

Given a formula A and a variable w not appearing in A, we write A^w for the result of replacing each unbounded quantifier ∃x and ∀x in A by ∃x ∈ w and ∀x ∈ w, respectively.

Lemma 7.4 For each Σ formula A the following are intuitionistically valid:
(i) A^u ∧ u ⊆ v → A^v,
(ii) A^u → A.

Proof: Both facts are proved by induction following the inductive definition of Σ formula. □

Theorem 7.5 (Σ Reflection Principle).
For all Σ formulae A we have the following: IKP₀ ⊢ A ↔ ∃a A^a. (Here a is any set variable not occurring in A; we will not continue to make these annoying conditions on variables explicit.) In particular, every Σ formula is equivalent to a Σ₁ formula in IKP₀.

Proof: We know from the previous lemma that ∃a A^a → A, so the axioms of IKP₀ come in only in showing A → ∃a A^a. The proof is by induction on A. We treat the main cases, leaving the remaining ones to the reader.

Case 0. If A is ∆₀ then A ↔ A^a holds for every set a.

Case 1. A is B ∧ C. By induction hypothesis, IKP₀ ⊢ B ↔ ∃a B^a and IKP₀ ⊢ C ↔ ∃a C^a. Let us work in IKP₀, assuming B ∧ C. Now there are a₁, a₂ such that B^{a₁}, C^{a₂}, so let a = a₁ ∪ a₂. Then B^a and C^a hold by the previous lemma, and hence A^a.

Case 2. A is B ∨ C. By induction hypothesis, IKP₀ ⊢ B ↔ ∃a B^a and IKP₀ ⊢ C ↔ ∃a C^a. Let us work in IKP₀, assuming B ∨ C. Then B^{a₁} for some set a₁, or there is a set a₂ such that C^{a₂}. In the first case we have B^a ∨ C^a with a := a₁, while in the second case we have B^a ∨ C^a with a := a₂.

Case 3. A is ∀u ∈ v B(u). The inductive assumption yields IKP₀ ⊢ B(u) ↔ ∃a B(u)^a. Again, working in IKP₀, assume ∀u ∈ v B(u) and show ∃a ∀u ∈ v B(u)^a. For each u ∈ v there is a b such that B(u)^b, so by ∆₀ Collection there is an a₀ such that ∀u ∈ v ∃b ∈ a₀ B(u)^b. Let a = ⋃a₀. Now, for every u ∈ v, we have ∃b ⊆ a B(u)^b; so ∀u ∈ v B(u)^a by the previous lemma.

Case 4. A is ∃u B(u). Inductively we have IKP₀ ⊢ B(u) ↔ ∃b B(u)^b. Working in IKP₀, assume ∃u B(u). Pick u₀ such that B(u₀), and b such that B(u₀)^b. Letting a = b ∪ {u₀} we get u₀ ∈ a and B(u₀)^a by the previous lemma. Thence ∃a ∃u ∈ a B(u)^a. □

In Platek's original definition of admissible set he took the Σ Reflection Principle as basic. It is very powerful, as we'll see below. ∆₀ Collection is easier to verify, however.

Theorem 7.6 (The Strong Σ Collection Principle).
For every Σ formula A the following is a theorem of IKP₀: If ∀x ∈ a ∃y A(x, y), then there is a set b such that ∀x ∈ a ∃y ∈ b A(x, y) and ∀y ∈ b ∃x ∈ a A(x, y).

Proof: Assume that ∀x ∈ a ∃y A(x, y). By Σ Reflection there is a set c such that

∀x ∈ a ∃y ∈ c A(x, y)^c.    (57)

Let

b = {y ∈ c | ∃x ∈ a A(x, y)^c},    (58)

which is a set by ∆₀ Separation. Now, since A(x, y)^c → A(x, y) by 7.4, (57) gives us ∀x ∈ a ∃y ∈ b A(x, y), whereas (58) gives us ∀y ∈ b ∃x ∈ a A(x, y). □

Theorem 7.7 (Σ Replacement). For each Σ formula A(x, y) the following is a theorem of IKP₀: If ∀x ∈ a ∃!y A(x, y), then there is a function f, with dom(f) = a, such that ∀x ∈ a A(x, f(x)).

Proof: By Σ Reflection there is a set d such that ∀x ∈ a ∃y ∈ d A(x, y)^d. Since A(x, y)^d implies A(x, y), we get ∀x ∈ a ∃!y ∈ d A(x, y)^d. Thus, defining f = {⟨x, y⟩ ∈ a × d | A(x, y)^d} by ∆₀ Separation, f is a function satisfying dom(f) = a and ∀x ∈ a A(x, f(x)). □

The above is sometimes infeasible because of the uniqueness requirement ∃! in the hypothesis. In these situations it is usually the next result which comes to the rescue.

Theorem 7.8 (Strong Σ Replacement). For each Σ formula A(x, y) the following is a theorem of IKP₀: If ∀x ∈ a ∃y A(x, y), then there is a function f with dom(f) = a such that for all x ∈ a, f(x) is inhabited and ∀x ∈ a ∀y ∈ f(x) A(x, y).

Proof: Exercise. □

One principle of KP that is not provable in IKP is ∆₁ Separation.

Proposition 7.9 (KP₀) (∆₁ Separation). If A is a Σ formula and B is a Π formula, then

KP₀ ⊢ ∀x ∈ a [A(x) ↔ B(x)] → ∃z ∀u [u ∈ z ↔ (u ∈ a ∧ A(u))].

Proof: The reason is that classically ∀x ∈ a [A(x) ↔ B(x)] entails ∀x ∈ a [A(x) ∨ ¬B(x)], which is classically equivalent to a Σ formula. □

7.2 Σ Recursion in IKP

The mathematical power of KP resides in the possibility of defining Σ functions by ∈-recursion and the fact that many interesting functions in set theory are definable by Σ Recursion.
Moreover, the scheme of ∆₀ Separation allows for an extension with provable Σ functions occurring in otherwise bounded formulae.

Proposition 7.10 (Definition by Σ Recursion in IKP.) If G is a total (n+2)-ary Σ definable class function of IKP, i.e.

IKP ⊢ ∀x⃗ y z ∃!u G(x⃗, y, z) = u,

then there is a total (n+1)-ary Σ class function F of IKP such that²

IKP ⊢ ∀x⃗ y [F(x⃗, y) = G(x⃗, y, (F(x⃗, z) | z ∈ y))].

² (F(x⃗, z) | z ∈ y) := {⟨z, F(x⃗, z)⟩ : z ∈ y}

Proof: Let A(f, x⃗) be the formula

[f is a function] ∧ [dom(f) is transitive] ∧ [∀y ∈ dom(f) (f(y) = G(x⃗, y, f↾y))].

Set B(x⃗, y, f) = [A(f, x⃗) ∧ y ∈ dom(f)].

Claim: IKP ⊢ ∀x⃗, y ∃!f B(x⃗, y, f).

Proof of Claim: By ∈ induction on y. Suppose ∀u ∈ y ∃g B(x⃗, u, g). By Strong Σ Collection we find a set A such that ∀u ∈ y ∃g ∈ A B(x⃗, u, g) and ∀g ∈ A ∃u ∈ y B(x⃗, u, g). Let f₀ = ⋃{g : g ∈ A}. By our general assumption there exists a u₀ such that G(x⃗, y, (f₀(u) | u ∈ y)) = u₀. Set f = f₀ ∪ {⟨y, u₀⟩}. Since for all g ∈ A, dom(g) is transitive, we have that dom(f₀) is transitive. If u ∈ y, then u ∈ dom(f₀). Thus dom(f) is transitive and y ∈ dom(f). We have to show that f is a function. But it is readily shown that if g₀, g₁ ∈ A, then ∀x ∈ dom(g₀) ∩ dom(g₁) [g₀(x) = g₁(x)]. Therefore f is a function. This also shows that ∀w ∈ dom(f) [f(w) = G(x⃗, w, f↾w)], confirming the claim (using Set Induction).

Now define F by

F(x⃗, y) = w :⇔ ∃f [B(x⃗, y, f) ∧ f(y) = w]. □

Corollary 7.11 There is a Σ function TC of IKP such that

IKP ⊢ ∀a [TC(a) = a ∪ ⋃{TC(x) : x ∈ a}].

Proposition 7.12 (Definition by TC-Recursion) Under the assumptions of Proposition 7.10 there is an (n+1)-ary Σ class function F of IKP such that

IKP ⊢ ∀x⃗ y [F(x⃗, y) = G(x⃗, y, (F(x⃗, z) | z ∈ TC(y)))].

Proof: Hint: Let C(f, x⃗, y) be the Σ formula

[f is a function] ∧ [dom(f) = TC(y)] ∧ [∀u ∈ dom(f) [f(u) = G(x⃗, u, f↾TC(u))]].

Prove by ∈-induction that ∀y ∃!f C(f, x⃗, y). □
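On hereditarily finite sets the ∈-recursion defining TC can be run directly; a sketch, with frozensets again standing in for sets:

```python
# TC(a) = a ∪ ⋃{TC(x) : x ∈ a}, computed by ∈-recursion
# on hereditarily finite sets (modelled as frozensets).

def tc(a):
    out = set(a)
    for x in a:
        out |= tc(x)            # terminates: ∈ is well-founded on HF sets
    return frozenset(out)

# Von Neumann numerals 0 = ∅, 1 = {0}, 2 = {0, 1}:
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
# 2 is transitive, so TC(2) = 2; and TC({2}) = {0, 1, 2}.
print(tc(two) == two, tc(frozenset({two})) == frozenset({zero, one, two}))
```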
8 An ordinal representation system for the Bachmann-Howard ordinal

Serving as a miniature example of an ordinal analysis of an impredicative system, we carry out an ordinal analysis of KP. The first step is to find a sufficiently strong ordinal representation system.

Definition 8.1 Let Ω be a "big" ordinal, e.g. Ω = ℵ₁. By recursion on α we define sets B(α) and the ordinal ψΩ(α) as follows:

B(α) = closure of {0, Ω} under: +, (ξ ↦ ω^ξ), (ξ, η ↦ ϕ_ξ(η)), (ξ ↦ ψΩ(ξ))_{ξ<α};    (59)

ψΩ(α) = min{ρ < Ω | ρ ∉ B(α)}    (60)

if the set {ρ < Ω | ρ ∉ B(α)} is non-empty. As per definition, ψΩ(α) might not be defined, but the next lemma shows that it is a total function.

Lemma 8.2 (i) B(α) is a countable set.
(ii) ψΩ(α) is always defined and ψΩ(α) < Ω.

Proof: (i) B(α) = ⋃_{n<ω} B_n(α), where B₀(α) = {0, Ω} and

B_{n+1}(α) = B_n(α) ∪ {η + δ | η, δ ∈ B_n(α)} ∪ {ϕ_η(δ) | η, δ ∈ B_n(α)} ∪ {ψΩ(ξ) | ξ ∈ B_n(α) ∧ ξ < α}.

(Closure under ξ ↦ ω^ξ is subsumed, since ω^ξ = ϕ₀(ξ).) Inductively, each of the sets B_n(α) is countable (actually finite), and therefore B(α) is countable.

(ii) Ω is assumed to be a regular uncountable cardinal; thus B(α) ∩ Ω cannot be unbounded in Ω. □

Lemma 8.3 (i) If α ≤ δ then B(α) ⊆ B(δ) and ψΩ(α) ≤ ψΩ(δ).
(ii) If α ∈ B(δ) ∩ δ then ψΩ(α) < ψΩ(δ).
(iii) If α ≤ δ and [α, δ) ∩ B(α) = ∅ then B(α) = B(δ).
(iv) If λ is a limit then B(λ) = ⋃_{ξ<λ} B(ξ).

Proof: (i): B(α) ⊆ B(δ) is clearly true if α ≤ δ, and thus ψΩ(α) ≤ ψΩ(δ) follows by definition and Lemma 8.2.

(ii): From α ∈ B(δ) ∩ δ we get ψΩ(α) ∈ B(δ) and also, by (i), ψΩ(α) ≤ ψΩ(δ). Since ψΩ(δ) ∉ B(δ), this entails ψΩ(α) < ψΩ(δ).

(iii): By induction on n one easily shows that B_n(δ) ⊆ B(α). This is obvious for n = 0. Assume it is true for n. If β < δ and β ∈ B_n(δ), then inductively we have β ∈ B(α) and hence β < α (as [α, δ) ∩ B(α) = ∅), yielding ψΩ(β) ∈ B(α). Thus we get B_{n+1}(δ) ⊆ B(α).

(iv): By (i) we have ⋃_{ξ<λ} B(ξ) ⊆ B(λ). To show the reverse inclusion we only need to show that ⋃_{ξ<λ} B(ξ) is closed under the operations that define B(λ).
This is obvious for + and ϕ. So assume that δ ∈ ⋃_{ξ<λ} B(ξ) ∩ λ. Then δ < ξ₀ and δ ∈ B(ξ₁) for some ξ₀, ξ₁ < λ. Thus, letting ξ* = max(ξ₀, ξ₁), we have ψΩ(δ) ∈ B(ξ*) ⊆ ⋃_{ξ<λ} B(ξ). □

Lemma 8.4 ψΩ(α) ∈ SC, i.e., ϕ_{ψΩ(α)}(0) = ψΩ(α).

Proof: If ψΩ(α) = ξ + δ for some ξ, δ < ψΩ(α), then ξ, δ ∈ B(α) and therefore ψΩ(α) = ξ + δ ∈ B(α), contradicting the definition of ψΩ(α). Likewise, if ψΩ(α) = ϕ_ρ(η) for some ρ, η < ψΩ(α), then ρ, η ∈ B(α) and therefore ψΩ(α) = ϕ_ρ(η) ∈ B(α), contradicting the definition of ψΩ(α). Thus ψΩ(α) ∈ SC follows by Lemma 4.30. □

Theorem 8.5 B(α) ∩ Ω = ψΩ(α).

Proof: Clearly, ψΩ(α) ⊆ B(α) ∩ Ω. To conclude equality, it suffices to show that X := ψΩ(α) ∪ {δ ∈ B(α) | δ ≥ Ω} is closed under the operations that define B(α). Closure of X under + and ϕ follows from Lemma 8.4. To show closure under ψΩ for arguments < α, assume β ∈ X and β < α. Then ψΩ(β) < ψΩ(α) by Lemma 8.3(ii), and hence ψΩ(β) ∈ X. □

Corollary 8.6 If λ is a limit then ψΩ(λ) = sup_{ξ<λ} ψΩ(ξ).

Proof:

ψΩ(λ) = B(λ) ∩ Ω = (⋃_{ξ<λ} B(ξ)) ∩ Ω = ⋃_{ξ<λ} (B(ξ) ∩ Ω) = ⋃_{ξ<λ} ψΩ(ξ) = sup_{ξ<λ} ψΩ(ξ).

Here the first and fourth equality follow from Theorem 8.5, while the second equality is a consequence of Lemma 8.3(iv). □

Definition 8.7 Let β^Γ denote the least ordinal ρ > β such that ρ ∈ SC.

Lemma 8.8 (i) ψΩ(α+1) ≤ (ψΩ(α))^Γ.
(ii) α ∈ B(α+1) implies ψΩ(α+1) = (ψΩ(α))^Γ.
(iii) α ∉ B(α) implies B(α) = B(α+1) and ψΩ(α+1) = ψΩ(α).

Proof: (i): It suffices to show that

Y := (B(α+1) ∩ (ψΩ(α))^Γ) ∪ {δ ∈ B(α+1) | δ ≥ Ω}

is closed under the operations that define B(α+1). Clearly, Y is closed under + and ϕ. If β ∈ Y and β < α+1, then ψΩ(β) ≤ ψΩ(α) and hence ψΩ(β) ∈ Y.

(ii): α ∈ B(α+1) yields ψΩ(α) < ψΩ(α+1). Let ψΩ(α) < η < (ψΩ(α))^Γ. Then η ∉ SC. By induction on η one therefore easily shows that η < ψΩ(α+1). Together with (i) this implies ψΩ(α+1) = (ψΩ(α))^Γ.
(iii) follows from Lemma 8.3(iii), since α ∉ B(α) yields B(α) ∩ [α, α+1) = ∅. □

Theorem 8.9 (i) If ξ < ψΩ(Ω) then ξ < Γ_ξ = ψΩ(ξ) < ψΩ(Ω).
(ii) Γ_{ψΩ(Ω)} = ψΩ(Ω).
(iii) If ψΩ(Ω) ≤ ξ ≤ Ω then ψΩ(ξ) = ψΩ(Ω).

Proof: Exercise.

Definition 8.10 We write δ =_NF ϕ_ξ(η) if δ = ϕ_ξ(η) and ξ, η < δ. We write δ =_NF ψΩ(α) if δ = ψΩ(α) and α ∈ B(α). Note that by Lemma 8.3, δ =_NF ψΩ(α) and δ =_NF ψΩ(β) implies α = β.

Lemma 8.11 (i) If β =_NF β₁ + … + βₙ and β ∈ B(α) then β₁, …, βₙ ∈ B(α).
(ii) If δ =_NF ϕ_ξ(η) ∈ B(α) then ξ, η ∈ B(α).
(iii) If δ =_NF ψΩ(β) ∈ B(ρ) then β ∈ B(ρ) and β < ρ.

Proof: (i): Define X := {β ∈ B(α) | if β =_NF β₁ + … + βₙ for some β₁, …, βₙ then β₁, …, βₙ ∈ B(α)}. Show that X is closed under the operations that define B(α).

(ii): Define Y := {β ∈ B(α) | if β =_NF ϕ_ξ(η) for some ξ, η then ξ, η ∈ B(α)}. Show that Y is closed under the operations that define B(α).

(iii): ψΩ(β) ∈ B(ρ) implies ψΩ(β) < ψΩ(ρ) and hence β < ρ. As β ∈ B(β), we also get β ∈ B(ρ). □

Remark 8.12 It is essential to require that Ω ∈ B(α). If one required only 0 ∈ B(α) instead of 0, Ω ∈ B(α), then ⋃_{α∈ON} B(α) = σ, where σ is the least ordinal such that Γ_σ = σ.

8.1 The ordinal representation system OT(Ω)

We will single out a set of ordinals that can be viewed as an ordinal representation system in that all ordinals in it have a unique representation over the alphabet 0, Ω, +, ϕ, ψΩ.

Definition 8.13 The set OT(Ω) and Gα for α ∈ OT(Ω) are inductively defined by the following clauses:

(R1) 0, Ω ∈ OT(Ω) and G0 = GΩ := 0.
(R2) If α =_NF α₁ + … + αₙ, n > 1 and α₁, …, αₙ ∈ OT(Ω), then α ∈ OT(Ω) and Gα = max(Gα₁, …, Gαₙ) + 1.
(R3) If α =_NF ϕ_β(δ), β, δ < Ω and β, δ ∈ OT(Ω), then α ∈ OT(Ω) and Gα = max(Gβ, Gδ) + 1.
(R4) If α =_NF ω^β, β > Ω and β ∈ OT(Ω), then α ∈ OT(Ω) and Gα = (Gβ) + 1.
(R5) If α =_NF ψΩ(β), β ∈ OT(Ω) and β ∈ B(β), then α ∈ OT(Ω) and Gα = (Gβ) + 1.
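The degree Gα just defined, and the coefficient sets Kα introduced below, are plain structural recursions on the term representation of an ordinal. A sketch with terms as nested tuples (the encoding, and the treatment of the ω^β case for K, are our own assumptions):

```python
# OT(Ω) terms as tuples: ('0',), ('Om',), ('+', t1, ..., tn),
# ('phi', b, d), ('w', b) for ω^b with b > Ω, ('psi', b).

def G(t):
    """Degree Gα of Definition 8.13."""
    tag = t[0]
    if tag in ('0', 'Om'):
        return 0
    if tag == '+':
        return max(G(s) for s in t[1:]) + 1
    if tag == 'phi':
        return max(G(t[1]), G(t[2])) + 1
    return G(t[1]) + 1          # 'w' and 'psi'

def K(t):
    """Coefficient set Kα of Definition 8.14 (ω^β clause assumed to be Kβ)."""
    tag = t[0]
    if tag in ('0', 'Om'):
        return set()
    if tag == '+':
        return set().union(*(K(s) for s in t[1:]))
    if tag == 'phi':
        return K(t[1]) | K(t[2])
    if tag == 'w':
        return K(t[1])
    return K(t[1]) | {t[1]}     # 'psi': Kβ ∪ {β}

a = ('psi', ('psi', ('0',)))    # the term ψΩ(ψΩ(0))
print(G(a), K(a))
# Lemma 8.16 on this example: every β ∈ Kα has Gβ < Gα
assert all(G(b) < G(a) for b in K(a))
```

Representing the K-sets by the subterms themselves (rather than the ordinals they denote) is what makes the condition β ∈ B(β) of (R5) decidable via Lemma 8.15.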
It follows from earlier results that any α ∈ OT(Ω) enters this set according to exactly one of the rules (R1)-(R5) in exactly one way, and thus Gα is defined unambiguously. In particular, OT(Ω) can be viewed as a set of terms which are composed of the symbols 0, Ω, +, ϕ, ψΩ in a unique way. What we are driving at next is a procedure that enables us to decide for α, β ∈ OT(Ω) with α ≠ β whether α < β or α > β solely by inspection of their term representations. We also need a recipe to decide whether an expression made up of the symbols 0, Ω, +, ϕ, ψΩ represents an ordinal of OT(Ω). The main obstacle is raised by (R5), since we do not know how to deal with the condition β ∈ B(β). This problem gives rise to the following definition.

Definition 8.14 Inductive definition of Kα for α ∈ OT(Ω).

(K1) K0 = KΩ = ∅.
(K2) Kα = Kα₁ ∪ … ∪ Kαₙ if α =_NF α₁ + … + αₙ, where n > 1.
(K3) Kα = Kβ ∪ Kδ if α =_NF ϕ_β(δ).
(K4) Kα = Kβ if α =_NF ω^β.
(K5) Kα = Kβ ∪ {β} if α =_NF ψΩ(β).

If X is a set of ordinals, we write X < η to convey that ξ < η holds for all ξ ∈ X. Note that Kα is always a finite set.

Lemma 8.15 Let α ∈ OT(Ω). Then α ∈ B(ρ) if and only if Kα < ρ.

Proof: We proceed by induction on Gα. If α =_NF α₁ + … + αₙ with n > 1, then: α ∈ B(ρ) iff α₁, …, αₙ ∈ B(ρ) iff Kα₁ ∪ … ∪ Kαₙ < ρ iff Kα < ρ, using Lemma 8.11(i) and the induction hypothesis. Likewise, if α =_NF ϕ_η(β), then: α ∈ B(ρ) iff η, β ∈ B(ρ) iff Kη ∪ Kβ < ρ iff Kα < ρ, using Lemma 8.11(ii) and the induction hypothesis. Now let α =_NF ψΩ(β). Then: α ∈ B(ρ) iff β ∈ B(ρ) ∧ β < ρ iff Kβ < ρ ∧ β < ρ iff Kα < ρ, using Lemma 8.11(iii) and the induction hypothesis. □

Lemma 8.16 If α ∈ OT(Ω) then ∀β ∈ Kα Gβ < Gα.

Proof: Use induction on Gα. □

Summarizing results from section 4 and this section, we arrive at a primitive recursive characterization of < on OT(Ω). Below we write α ∈ SC if α = Ω or α =_NF ψΩ(δ) for some δ.

Lemma 8.17 Let α, β ∈ OT(Ω). Then α < β holds if and only if one of the following conditions is satisfied:

1.
α = 0 and β ≠ 0.
2. α =_NF α₀ + … + αₙ, β =_NF β₀ + … + β_m, 0 < n < m and ∀i ≤ n (α_i = β_i).
3. α =_NF α₀ + … + αₙ, β =_NF β₀ + … + β_m, 0 < n, m and ∃i ≤ min(n, m) [∀j < i (α_j = β_j) ∧ α_i < β_i].
4. α =_NF α₀ + … + αₙ, n > 0, β ∈ AP and α₀ < β.
5. α ∈ AP, β =_NF β₀ + … + βₙ, n > 0 and α ≤ β₀.
6. α =_NF ϕ_{α₁}(α₂), β =_NF ϕ_{β₁}(β₂), α₁ < β₁ and α₂ < β.
7. α =_NF ϕ_{α₁}(α₂), β =_NF ϕ_{β₁}(β₂), α₁ = β₁ and α₂ < β₂.
8. α =_NF ϕ_{α₁}(α₂), β =_NF ϕ_{β₁}(β₂), β₁ < α₁ and α < β₂.
9. α =_NF ϕ_{α₁}(α₂), α₁, α₂ < β and β ∈ SC.
10. α ∈ SC, β =_NF ϕ_{β₁}(β₂) and α ≤ max(β₁, β₂).
11. α =_NF ψΩ(α₀), β =_NF ψΩ(β₀) and α₀ < β₀.
12. α =_NF ψΩ(α₀) and β = Ω.

Proposition 8.18 OT(Ω) ⊆ B(ε_{Ω+1}) ∩ ε_{Ω+1}.

Proof: Use induction on Gα for α ∈ OT(Ω). □

9 KP goes infinite: LRS

A peculiarity of PA is that every object n of the intended model has a canonical name in the language, namely, the nth numeral. It is not clear, though, how to bestow a canonical name on each element of the set-theoretic universe. This is where Gödel's constructible universe L comes in handy. As L is "made" from the ordinals, it is pretty obvious how to "name" sets in L once one has names for ordinals. These will be taken from OT(Ω). Henceforth, we shall restrict ourselves to ordinals from OT(Ω).

Definition 9.1 Up to now the basic symbols of our set-theoretic language have been = and ∈. For technical reasons we would like to get rid of =. We simply define a = b to be an abbreviation for (∀x ∈ a) x ∈ b ∧ (∀x ∈ b) x ∈ a. The axiom of extensionality then becomes a triviality. However, its role is taken over by the equality axioms, which we have not explicitly considered hitherto. The role of extensionality is then played by the axiom

c = d ∧ c ∈ a → d ∈ a,

the unabbreviated version of which is

(∀x ∈ c) x ∈ d ∧ (∀x ∈ d) x ∈ c ∧ c ∈ a → d ∈ a.

Exercise 9.2 Show that from the previous axiom one can deduce c = d ∧ F(c) → F(d) for any formula F(c).
Definition 9.3 The set terms and their ordinal levels are defined inductively.

(i) For each α ∈ OT(Ω) ∩ Ω, there will be a set term L_α. Its ordinal level is declared to be α.

(ii) If F(a, b⃗) is a set-theoretic formula, i.e. a formula of KP (whose free variables are among the indicated), and s⃗ ≡ s₁, …, sₙ are set terms with levels < α, then the formal expression {x ∈ L_α | F(x, s⃗)^{L_α}} is a set term of level α. Here F(x, s⃗)^{L_α} results from F(x, s⃗) by restricting all unbounded quantifiers to L_α.

A formula of RS is any expression of the form F(s₁, …, sₙ), where F(a₁, …, aₙ) is a formula of KP with all free variables indicated and s₁, …, sₙ are set terms. In the sequel, RS-formulae will be referred to just as formulae. If A is a formula, then

k(A) := {α : L_α occurs in A}.

Here any occurrence of L_α, i.e. also those inside of terms, has to be considered. For a term s we set k(s) := k(s = s).

In what follows s, t, p, q, r, s₁, s₂, … will range over set terms. For a set term s we shall notate the level of s by |s|. We also write s < t instead of |s| < |t|. For terms s, t with |s| < |t| we set

s ∈̊ t :≡ B(s) if t ≡ {x ∈ L_β | B(x)},
s ∈̊ t :≡ s ∉ L₀ if t ≡ L_β

(note that s ∉ L₀ is a canonically true formula, as L₀ has no elements).

The collection of set terms will serve as a formal universe for a theory LRS with infinitary rules. The infinitary rule for the universal quantifier on the right takes the form: from Γ ⇒ ∆, F(t) for all RS-terms t, conclude Γ ⇒ ∆, ∀x F(x). There are also rules for bounded universal quantifiers: from Γ ⇒ ∆, F(t) for all RS-terms t with levels < α, conclude Γ ⇒ ∆, (∀x ∈ L_α) F(x). The corresponding rule for introducing a universal quantifier bounded by a term of the form {x ∈ L_α : F(x, s⃗)^{L_α}} is slightly more complicated. With the help of these infinitary rules it is now possible to give logical deductions of all axioms of KP with the exception of Bounded Collection.
The latter can be deduced from the rule of Σ-Reflection: from Γ ⇒ ∆, C conclude Γ ⇒ ∆, ∃z C^z, for every Σ-formula C. The class of Σ-formulae is the smallest class of formulae containing the bounded formulae which is closed under ∧, ∨, bounded quantification and unbounded existential quantification. C^z is obtained from C by replacing all unbounded quantifiers ∃x in C by ∃x ∈ z.

The lengths and cut ranks of KP∞-deductions will be measured by ordinals from OT(Ω). If KP ⊢ F(u₁, …, u_r), then

LRS ⊢^{Ω·m}_{Ω+n} F(s₁, …, s_r)

holds for some m, n and all set terms s₁, …, s_r; m and n depend only on the KP-derivation of F(u⃗).

Definition 9.4 The inference rules of KP∞ include all the propositional inferences of the sequent calculus (i.e., those pertaining to ∧, ∨, →, ¬) as well as the cut rule (Cut). In addition, KP∞ has the following rules, where in (∈ R), (b∀ L) and (b∃ R) it is also assumed that s < t:

Elementhood
(∈∞) From p ∈̊ t ∧ r = p, Γ ⇒ ∆ for all p < t, conclude r ∈ t, Γ ⇒ ∆.
(∈ R) From Γ ⇒ ∆, s ∈̊ t ∧ r = s, conclude Γ ⇒ ∆, r ∈ t.

Bounded Quantifiers
(b∀ L) From s ∈̊ t → F(s), Γ ⇒ ∆, conclude (∀x ∈ t) F(x), Γ ⇒ ∆.
(b∀∞) From Γ ⇒ ∆, p ∈̊ t → F(p) for all p < t, conclude Γ ⇒ ∆, (∀x ∈ t) F(x).
(b∃∞) From p ∈̊ t ∧ F(p), Γ ⇒ ∆ for all p < t, conclude (∃x ∈ t) F(x), Γ ⇒ ∆.
(b∃ R) From Γ ⇒ ∆, s ∈̊ t ∧ F(s), conclude Γ ⇒ ∆, (∃x ∈ t) F(x).

Unbounded Quantifiers
(∀ L) From F(t), Γ ⇒ ∆, conclude ∀x F(x), Γ ⇒ ∆.
(∀∞) From Γ ⇒ ∆, F(p) for all p, conclude Γ ⇒ ∆, ∀x F(x).
(∃∞) From F(p), Γ ⇒ ∆ for all p, conclude ∃x F(x), Γ ⇒ ∆.
(∃ R) From Γ ⇒ ∆, F(t), conclude Γ ⇒ ∆, ∃x F(x).

Σ-Reflection
(Σ-Ref) From Γ ⇒ ∆, A, conclude Γ ⇒ ∆, ∃x A^x, where A is a Σ-formula.

Definition 9.5 The rank of formulae and terms is determined as follows.

1. rk(L_α) = ω·α.
2. rk({x ∈ L_α | F(x)}) = max{ω·α + 1, rk(F(L₀)) + 2}.
3. rk(s ∈ t) := max{rk(s) + 6, rk(t) + 1}.
4. rk(¬A) := rk(A) + 1.
5. rk(A ∧ B) = rk(A ∨ B) = rk(A → B) = max(rk(A), rk(B)) + 1.
6. rk((∃x ∈ t) F(x)) := rk((∀x ∈ t) F(x)) := max{rk(t), rk(F(L₀)) + 2}.
7. rk(∃x F(x)) := rk(∀x F(x)) := max{Ω, rk(F(L₀)) + 1}.
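Definition 9.5 can be prototyped directly. A hedged sketch: we restrict levels α to natural numbers and encode the ordinals that actually arise, all of the shape Ω·k + ω·a + n, as lexicographically ordered triples (k, a, n); the tuple encoding of terms and formulae, and taking quantified bodies as already instantiated at L₀, are our own simplifying assumptions.

```python
# Ranks per Definition 9.5, with ordinals of the form Ω·k + ω·a + n
# encoded as triples (k, a, n), compared lexicographically.
# Sketch only: levels α are natural numbers, and quantified bodies
# are taken already instantiated at L0 (sidestepping substitution).

OMEGA = (1, 0, 0)

def add(o, m):                       # ordinal + finite m
    k, a, n = o
    return (k, a, n + m)

def rk(e):
    tag = e[0]
    if tag == 'L':                   # rk(L_α) = ω·α
        return (0, e[1], 0)
    if tag == 'sep':                 # {x ∈ L_α | F(x)}: max{ω·α+1, rk(F(L0))+2}
        return max((0, e[1], 1), add(rk(e[2]), 2))
    if tag == 'in':                  # rk(s ∈ t) = max{rk(s)+6, rk(t)+1}
        return max(add(rk(e[1]), 6), add(rk(e[2]), 1))
    if tag == 'not':
        return add(rk(e[1]), 1)
    if tag in ('and', 'or', 'imp'):
        return add(max(rk(e[1]), rk(e[2])), 1)
    if tag in ('ball', 'bex'):       # bounded quantifier over term e[1]
        return max(rk(e[1]), add(rk(e[2]), 2))
    if tag in ('all', 'ex'):         # unbounded quantifier
        return max(OMEGA, add(rk(e[1]), 1))
    raise ValueError(f'unknown tag: {tag}')

# (∀x ∈ L2)(L0 ∈ L1) has rank ω·2; an unbounded ∃ pushes the rank up to Ω.
print(rk(('ball', ('L', 2), ('in', ('L', 0), ('L', 1)))))
print(rk(('ex', ('in', ('L', 0), ('L', 0)))))
```

The example illustrates the key design feature of Definition 9.5: bounded formulae keep ranks below Ω, while any unbounded quantifier forces the rank to at least Ω, which is what the cut elimination procedure later exploits.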
There is plenty of leeway in designing the actual rank of a formula.

Definition: 9.6 Let Pow(ON) = {X | X is a set of ordinals}. A class function H : Pow(ON) → Pow(ON) will be called an operator if the following conditions are met for all X, X′ ∈ Pow(ON):

(H0) 0 ∈ H(X).
(H1) For α =_NF ω^{α1} + ··· + ω^{αn}:  α ∈ H(X) ⟺ α1, ..., αn ∈ H(X). (In particular, (H1) implies that H(X) is closed under + and σ ↦ ω^σ, i.e., if α, β ∈ H(X), then α + β, ω^α ∈ H(X).)
(H2) X ⊆ H(X).
(H3) X′ ⊆ H(X) ⟹ H(X′) ⊆ H(X).

Note that an operator is monotone, i.e., if X′ ⊆ X then X′ ⊆ H(X) by (H2), and hence H(X′) ⊆ H(X) using (H3).

Definition: 9.7 (i) When f is a mapping f : ON^k → ON, then H is said to be closed under f if, for all X ∈ Pow(ON) and α1, ..., αk ∈ H(X), f(α1, ..., αk) ∈ H(X).

(ii) α ∈ H := α ∈ H(∅);  s ∈ H := k(s) ⊆ H.

(iii) X ⊆ H := X ⊆ H(∅).

(iv) If Y is a set of ordinals we denote by H[Y] the operator with (H[Y])(X) := H(Y ∪ X).

(v) For a set term s let H[s] denote the operator H[k(s)].

The next lemma garners some simple properties of operators.

Lemma: 9.8 Let H be an operator, s be a set term and Y be a set of ordinals.

(i) H[Y] and H[s] are operators.
(ii) Y ⊆ H ⟹ H[Y] = H.
(iii) ∀X, X′ ∈ Pow(ON) [X′ ⊆ X ⟹ H(X′) ⊆ H(X)].

For a set of formulae Γ = {A1, ..., An} let k(Γ) = k(A1) ∪ ... ∪ k(An).

Definition: 9.9 We define the relation H ⊢^α_ρ Γ ⇒ ∆ by recursion on α by requiring that k(Γ) ∪ k(∆) ∪ {α} ⊆ H(∅) holds and one of the following conditions is satisfied:

1. Γ ⇒ ∆ is the result of a propositional inference (pertaining to one of the connectives ∧, ∨, →, ¬) with premisses Γi ⇒ ∆i and H ⊢^{αi}_ρ Γi ⇒ ∆i for some αi < α.

2. H ⊢^{α1}_ρ Γ, A ⇒ ∆ and H ⊢^{α2}_ρ Γ ⇒ ∆, A for some α1, α2 < α and formula A with rk(A) < ρ.

3. Γ is of the form r ∈ t, Γ′ and H[p] ⊢^{αp}_ρ p ∈◦ t ∧ r = p, Γ′ ⇒ ∆ holds for all p < t for some αp < α.

4. ∆ is of the form ∆′, r ∈ t and H ⊢^{α0}_ρ Γ ⇒ ∆′, s ∈◦ t ∧ r = s holds for some s < t with |s| < α and some α0 < α.

5.
Γ is of the form (∀x ∈ t) F(x), Γ′ and H ⊢^{α0}_ρ s ∈◦ t → F(s), Γ′ ⇒ ∆ holds for some s < t with |s| < α and α0 < α.

6. ∆ is of the form ∆′, (∀x ∈ t) F(x) and H[p] ⊢^{αp}_ρ Γ ⇒ ∆, p ∈◦ t → F(p) holds for all p < t for some αp < α.

7. Γ is of the form (∃x ∈ t) F(x), Γ′ and H[p] ⊢^{αp}_ρ p ∈◦ t ∧ F(p), Γ′ ⇒ ∆ holds for all p < t for some αp < α.

8. ∆ is of the form ∆′, (∃x ∈ t) F(x) and H ⊢^{α0}_ρ Γ ⇒ ∆′, s ∈◦ t ∧ F(s) holds for some s < t with |s| < α and some α0 < α.

9. Γ is of the form ∀x F(x), Γ′ and H ⊢^{α0}_ρ F(s), Γ′ ⇒ ∆ holds for some s with |s| < α and α0 + 2 < α.

10. ∆ is of the form ∆′, ∀x F(x) and H[p] ⊢^{αp}_ρ Γ ⇒ ∆, F(p) holds for all p for some αp + 2 < α.

11. Γ is of the form ∃x F(x), Γ′ and H[p] ⊢^{αp}_ρ F(p), Γ′ ⇒ ∆ holds for all p for some αp + 2 < α.

12. ∆ is of the form ∆′, ∃x F(x) and H ⊢^{α0}_ρ Γ ⇒ ∆′, F(s) holds for some s with |s| < α and some α0 + 2 < α.

13. α ≥ Ω and ∆ is of the form ∆′, ∃z A^z, where A is a Σ-formula, and H ⊢^{α0}_ρ Γ ⇒ ∆′, A holds for some α0 + 1 < α.

Lemma: 9.10 (i) If Γ′ ⊆ Γ, ∆′ ⊆ ∆, k(Γ), k(∆) ⊆ H, α ∈ H, α′ ≤ α, ρ′ ≤ ρ and H ⊢^{α′}_{ρ′} Γ′ ⇒ ∆′, then H ⊢^α_ρ Γ ⇒ ∆.

(ii) If H ⊢^α_ρ Γ ⇒ ∆, (∀x ∈ Lβ) F(x), γ ∈ H and γ ≤ β, then H ⊢^α_ρ Γ ⇒ ∆, (∀x ∈ Lγ) F(x).

Proof: (i) is proved by a straightforward induction on α′. For (ii) we use induction on α. The only interesting case is when (∀x ∈ Lβ) F(x) was the principal formula of the last inference, which must have been (b∀∞). So we have H[p] ⊢^{αp}_ρ Γ ⇒ ∆, (∀x ∈ Lβ) F(x), p ∉ L0 → F(p) for all p < β, where αp < α. By the induction hypothesis we get H[p] ⊢^{αp}_ρ Γ ⇒ ∆, (∀x ∈ Lγ) F(x), p ∉ L0 → F(p) for all p < γ, and thus, via another (b∀∞) inference, we get the desired result. □

Lemma: 9.11 If k(s) ⊆ H, α ∈ H and α > 0, then H ⊢^α_0 ⇒ s ∉ L0.

Proof: We have H[p] ⊢^{αp}_0 p ∈◦ L0 ∧ p = s ⇒ for all p < 0 (vacuously, since there are no such p). Hence, via an inference (∈∞), we get H ⊢^0_0 s ∈ L0 ⇒ , from which we get H ⊢^α_0 ⇒ s ∉ L0 via (¬R).
□

Lemma: 9.12 The inversions (i)–(viii) of Lemma 5.10 for RA∗ concerning propositional logic also hold for RS. In addition the following inversions hold for RS.

(i) If H ⊢^α_ρ r ∈ t, Γ ⇒ ∆ then H[p] ⊢^α_ρ p ∈◦ t ∧ r = p, Γ ⇒ ∆ holds for all p < t.

(ii) If H ⊢^α_ρ Γ ⇒ ∆, (∀x ∈ t) F(x) then H[p] ⊢^α_ρ Γ ⇒ ∆, p ∈◦ t → F(p) holds for all p < t.

(iii) If H ⊢^α_ρ (∃x ∈ t) F(x), Γ ⇒ ∆ then H[p] ⊢^α_ρ p ∈◦ t ∧ F(p), Γ ⇒ ∆ holds for all p < t.

(iv) If H ⊢^α_ρ Γ ⇒ ∆, ∀x F(x) then H[s] ⊢^α_ρ Γ ⇒ ∆, F(s) holds for all s.

(v) If H ⊢^α_ρ ∃x F(x), Γ ⇒ ∆ then H[s] ⊢^α_ρ F(s), Γ ⇒ ∆ holds for all s.

Proof: All are straightforward by induction on α. □

Lemma: 9.13 (Reduction) Let ρ = rk(C) ≠ Ω. If H ⊢^α_ρ Γ, C ⇒ ∆ and H ⊢^β_ρ Ξ ⇒ Θ, C, then

    H ⊢^{α#α#β#β}_ρ Γ, Ξ ⇒ ∆, Θ.

Proof: The proof is by induction on α#α#β#β and is very similar to Lemma 5.11. We only look at two cases, where C was the principal formula of the last inference in both derivations. It is essential to notice that C is not the principal formula of an inference (Σ-Ref) since rk(C) ≠ Ω.

Case 1: The first is when C is of the form r ∈ t. Then we have

    H[p] ⊢^{αp}_ρ Γ, C, p ∈◦ t ∧ r = p ⇒ ∆ for all p < t with αp < α

and

    H ⊢^{β0}_ρ Ξ ⇒ Θ, C, s ∈◦ t ∧ r = s for some β0 < β and term s < t with |s| < β.

Since k(s) ⊆ H we also have H = H[s]. By the induction hypothesis we obtain

    H ⊢^{αs#αs#β#β}_ρ Γ, Ξ, s ∈◦ t ∧ r = s ⇒ ∆, Θ  and  H ⊢^{α#α#β0#β0}_ρ Γ, Ξ ⇒ ∆, Θ, s ∈◦ t ∧ r = s.

Cutting out s ∈◦ t ∧ r = s gives H ⊢^{α#α#β#β}_ρ Γ, Ξ ⇒ ∆, Θ.

Case 2: The second case is when C is of the form (∀x ∈ t) A(x). Then we have

    H ⊢^{α1}_ρ Γ, C, s ∈◦ t → A(s) ⇒ ∆ for some α1 < α and s < t with |s| < α.

And we also have

    H[s] ⊢^{βs}_ρ Ξ ⇒ Θ, C, s ∈◦ t → A(s) for some βs < β.

Since k(s) ⊆ H we have H[s] = H. By the induction hypothesis we thus get

    H ⊢^{α1#α1#β#β}_ρ Γ, Ξ, s ∈◦ t → A(s) ⇒ ∆, Θ  and  H ⊢^{α#α#βs#βs}_ρ Γ, Ξ ⇒ ∆, Θ, s ∈◦ t → A(s).

Cutting out s ∈◦ t → A(s) gives H ⊢^{α#α#β#β}_ρ Γ, Ξ ⇒ ∆, Θ. □
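The bound α#α#β#β in the Reduction Lemma uses the natural (Hessenberg) sum #, which adds Cantor normal forms coefficient-wise and, unlike ordinary ordinal addition, is commutative. The following is a small illustrative sketch for ordinals below ε0 (the encoding and function names are mine, not from the notes): an ordinal is encoded as the descending list of its CNF exponents, each exponent again such a list.

```python
from functools import cmp_to_key

# Ordinals < epsilon_0 in Cantor normal form, encoded as the descending
# list of their CNF exponents; each exponent is again such a list.
#   []          encodes 0
#   [[]]        encodes omega^0           = 1
#   [[[]]]      encodes omega^1           = omega
#   [[[]], []]  encodes omega^1 + omega^0 = omega + 1

def compare(a, b):
    """Compare two CNF-encoded ordinals; returns -1, 0 or 1."""
    for x, y in zip(a, b):
        c = compare(x, y)
        if c != 0:
            return c
    # All compared exponents agree: the ordinal with more terms is larger.
    return (len(a) > len(b)) - (len(a) < len(b))

def natural_sum(a, b):
    """Hessenberg sum a # b: merge the two exponent lists in descending order."""
    return sorted(a + b, key=cmp_to_key(compare), reverse=True)

ONE, OMEGA = [[]], [[[]]]
# Ordinary ordinal addition has 1 + omega = omega, but 1 # omega = omega + 1:
print(natural_sum(ONE, OMEGA))         # [[[]], []]  i.e. omega + 1
# (omega + 1) # omega = omega*2 + 1:
print(natural_sum([[[]], []], OMEGA))  # [[[]], [[]], []]
```

Since the merge ignores the order of the arguments, commutativity of # is immediate from the encoding, which is exactly what makes the symmetric bound α#α#β#β well behaved under the induction in the lemma.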
Theorem: 9.14 (First Cut Elimination Theorem) If H ⊢^α_{δ+1} Γ ⇒ ∆ and δ ≠ Ω, then H ⊢^{4^α}_δ Γ ⇒ ∆.

Proof: Use induction on α and the previous lemma. □

Theorem: 9.15 (Predicative cut elimination) Let H be closed under ϕ. If H ⊢^α_{ρ+ω^ν} Γ ⇒ ∆, Ω ∉ [ρ, ρ+ω^ν[ and ν ∈ H, then H ⊢^{ϕν(α)}_ρ Γ ⇒ ∆.

Proof: By main induction on ν and subsidiary induction on α. The assertion holds for ν = 0 by the First Cut Elimination Theorem 9.14 since ρ ≠ Ω. Now suppose ν > 0. There will be a last inference (I) with premisses Γi ⇒ ∆i. Suppose the inference was not a cut, or a cut of rank < ρ. We then have H[i] ⊢^{αi}_{ρ+ω^ν} Γi ⇒ ∆i for some αi < α. By the subsidiary induction hypothesis we have H[i] ⊢^{ϕν(αi)}_ρ Γi ⇒ ∆i. Applying the same inference (I) yields H ⊢^{ϕν(α)}_ρ Γ ⇒ ∆. Now suppose the last inference was a cut with cut formula C such that ρ ≤ rk(C) < ρ + ω^ν. Then there exist ν0 < ν and n < ω such that rk(C) < ρ + ω^{ν0}·n. After performing the cut with C we have

    H ⊢^{ϕν(α)}_{ρ+ω^{ν0}·n} Γ ⇒ ∆.

We also have ϕν0(ϕν(α)) = ϕν(α). Therefore by n-fold application of the main induction hypothesis we obtain H ⊢^{ϕν(α)}_ρ Γ ⇒ ∆. □

Lemma: 9.16 (Bounding Lemma) Let B be a Σ-formula and A be a Π-formula. Suppose α ≤ β < Ω and β ∈ H.

(i) If H ⊢^α_ρ Γ ⇒ ∆, B then H ⊢^α_ρ Γ ⇒ ∆, B^{Lβ}.

(ii) If H ⊢^α_ρ Γ, A ⇒ ∆ then H ⊢^α_ρ Γ, A^{Lβ} ⇒ ∆.

Proof: (i) Use induction on α. Note that the deductions cannot contain any inference (Σ-Ref) since α < Ω. If B is not the principal formula of the last inference then the assertion follows readily from the induction hypothesis. So let us assume that B was the principal formula of the last inference. If B is a ∆0-formula or of one of the forms B0 ∨ B1, B0 ∧ B1, (∀x ∈ t) F(x), or (∃x ∈ t) F(x), then the assertion follows readily from the induction hypothesis. So suppose B is of the form ∃x F(x). Then we have H ⊢^{α0}_ρ Γ ⇒ ∆, B, F(s) for some α0 + 2 < α and a term s with |s| < α. Inductively we have

    (∗)  H ⊢^{α0}_ρ Γ ⇒ ∆, B^{Lβ}, F(s)^{Lβ}.

We also have

    (∗∗)  H
⊢^{α0+1}_ρ Γ ⇒ ∆, B^{Lβ}, s ∉ L0

by Lemma 9.11. Thus from (∗) and (∗∗) we get

    H ⊢^{α0+2}_ρ Γ ⇒ ∆, B^{Lβ}, s ∉ L0 ∧ F(s)^{Lβ}

via (∧R). The latter is the same as H ⊢^{α0+2}_ρ Γ ⇒ ∆, B^{Lβ}, s ∈◦ Lβ ∧ F(s)^{Lβ} since |s| < β, and hence, using (b∃ R), we get H ⊢^α_ρ Γ ⇒ ∆, B^{Lβ}. □

10 Impredicative Cut Elimination

The usual cut elimination procedure works unless the cut formulae have been introduced by Σ-reflection rules. The obstacle to pushing cut elimination further is exemplified by the following scenario:

    ⊢^δ_Ω Γ ⇒ ∆, C                      ⊢^{ξs}_Ω Ξ, C^s ⇒ Λ   (all |s| < Ω)
    ────────────────── (Σ-Ref)          ─────────────────── (∃L)
    ⊢^ξ_Ω Γ ⇒ ∆, ∃z C^z                 ⊢^ξ_Ω Ξ, ∃z C^z ⇒ Λ
    ────────────────────────────────────────────────────── (Cut)
                   ⊢^α_{Ω+1} Γ, Ξ ⇒ ∆, Λ

In order to be able to remove these critical cuts, i.e. cuts which were introduced by (Σ-Ref), we have to forgo arbitrary operators. We shall need operators H such that an H-controlled derivation satisfying certain extra conditions can be "collapsed" into a derivation with much smaller ordinal labels. From now on we will identify ON with B(ΩΓ). All operators are therefore supposed to act just on subsets of B(ΩΓ).

Definition: 10.1 The operator Hη for η < εΩ+1 is defined by

    Hη(X) = ⋂ {B(β) | X ⊆ B(β) ∧ η < β}.

Lemma: 10.2 (i) Hη is an operator.

(ii) η < η′ ⟹ Hη(X) ⊆ Hη′(X).

(iii) Hη is closed under ϕ and ψΩ↾(η+1).

Proof: (i): X ⊆ Hη(X) follows by definition. If X′ ⊆ Hη(X), then, for any β > η such that X ⊆ B(β), we have X′ ⊆ B(β), and therefore Hη(X′) ⊆ B(β); hence Hη(X′) ⊆ Hη(X). So far we have verified (H0), (H2) and (H3). As to (H1), suppose α =_NF ω^{α1} + ... + ω^{αn}. We have to show α ∈ Hη(X) iff α1, ..., αn ∈ Hη(X). But this is a consequence of: α ∈ B(β) iff α1, ..., αn ∈ B(β), which holds by Lemma 8.11(i). (ii) is obvious. (iii) follows from the fact that the sets B(β) with β > η are closed under ϕ and ψΩ↾(η+1). □

Lemma: 10.3 Suppose η ∈ Hη. Define β̂ := η + ω^{Ω+β}.

(i) If α ∈ Hη then α̂, ψΩ(α̂) ∈ Hα̂.

(ii) If α0 ∈ Hη and α0 < α then ψΩ(α̂0) < ψΩ(α̂).

Proof: Obviously, Hη(∅) = B(η+1).
From α, η ∈ B(η+1) we obtain α̂ ∈ B(α̂), and hence ψΩ(α̂) ∈ B(α̂+1) = Hα̂(∅). This shows (i). Now suppose α0 ∈ Hη and α0 < α. By the preceding argument we then have ψΩ(α̂0) ∈ B(α̂), thus ψΩ(α̂0) < ψΩ(α̂). □

Lemma: 10.4 (Persistence) Let δ ∈ H.

(i) If H ⊢^α_ρ Γ ⇒ ∆, ∀x F(x) then H ⊢^α_ρ Γ ⇒ ∆, (∀x ∈ Lδ) F(x).

(ii) If H ⊢^α_ρ ∃x F(x), Γ ⇒ ∆ then H ⊢^α_ρ (∃x ∈ Lδ) F(x), Γ ⇒ ∆.

Proof: (i): We proceed by induction on α. The only interesting case is when the last inference was (∀∞). Thus H[s] ⊢^{αs}_ρ Γ ⇒ ∆, ∀x F(x), F(s) holds for all s for some αs + 2 < α. Inductively we have

    H[s] ⊢^{αs}_ρ Γ, s ∈◦ Lδ ⇒ ∆, (∀x ∈ Lδ) F(x), F(s)

and hence

    H[s] ⊢^{αs+1}_ρ Γ ⇒ ∆, (∀x ∈ Lδ) F(x), s ∈◦ Lδ → F(s)

for all |s| < δ. Thus, via (b∀∞), we conclude that H ⊢^α_ρ Γ ⇒ ∆, (∀x ∈ Lδ) F(x). (ii) is similar. □

Theorem: 10.5 (Collapsing and Impredicative Cut Elimination) Let Γ be a set of Π-formulae and ∆ be a set of Σ-formulae. Suppose that η ∈ Hη. Then

    Hη ⊢^α_{Ω+1} Γ ⇒ ∆   implies   Hα̂ ⊢^{ψΩ(α̂)}_{ψΩ(α̂)} Γ ⇒ ∆,

where α̂ = η + ω^{Ω+α}. This result can also be established for the intuitionistic version of RS provided one adds the extra assumption that all formulae in Γ have rank at most Ω.

Proof: We proceed by induction on α.

Case 0: If the last inference was propositional then the assertion follows easily from the induction hypothesis.

Case 1: Suppose the last inference was (b∀∞). Then a formula (∀x ∈ t) F(x) appears in ∆ and

    H[p] ⊢^{αp}_{Ω+1} Γ ⇒ ∆, p ∈◦ t → F(p)

holds for all p < t for some αp < α. Since k(t) ⊆ H we have k(t) ⊆ B(η+1) and thus |t| < ψΩ(η+1). As a result, |p| < ψΩ(η+1) and therefore k(p) ⊆ H holds for all p < t, and hence H[p] = H for all p < t. The formula p ∈◦ t → F(p) might not be a Σ-formula, but F(p) is a Σ-formula since (∀x ∈ t) F(x) is.
Using inversion (Lemma 9.12) we have

    (61)  H ⊢^{αp}_{Ω+1} Γ, p ∈◦ t ⇒ ∆, F(p)   for all p < t.

Thus we can apply the induction hypothesis to (61), yielding

    Hα̂p ⊢^{ψΩ(α̂p)}_{ψΩ(α̂p)} Γ, p ∈◦ t ⇒ ∆, F(p),

and hence

    (62)  Hα̂p ⊢^{ψΩ(α̂p)+1}_{ψΩ(α̂p)} Γ ⇒ ∆, p ∈◦ t → F(p)   for all p < t.

As ψΩ(α̂p) + 1 < ψΩ(α̂) holds by Lemma 10.3(ii), we can apply an inference (b∀∞) to get Hα̂ ⊢^{ψΩ(α̂)}_{ψΩ(α̂)} Γ ⇒ ∆.

Case 3: Suppose the last inference was (Σ-Ref). Then ∆ contains a formula ∃z A^z, where A is a Σ-formula, and H ⊢^{α0}_{Ω+1} Γ ⇒ ∆, A for some α0 < α. By the induction hypothesis we have

    Hα̂0 ⊢^{ψΩ(α̂0)}_{ψΩ(α̂0)} Γ ⇒ ∆, A.

Using the Bounding Lemma 9.16 we get

    Hα̂0 ⊢^{ψΩ(α̂0)}_{ψΩ(α̂0)} Γ ⇒ ∆, A^{L_{ψΩ(α̂0)}}.

Via an inference (∃ R) we get

    Hα̂0 ⊢^{ψΩ(α̂0)+2}_{ψΩ(α̂0)} Γ ⇒ ∆, ∃z A^z.

Since ψΩ(α̂0) + 2 < ψΩ(α̂), by Lemma 10.3, and ∃z A^z is in ∆, we also have Hα̂ ⊢^{ψΩ(α̂)}_{ψΩ(α̂)} Γ ⇒ ∆.

Case 4: Suppose the last inference was a cut. Then there exists a formula C with rk(C) ≤ Ω and α0 < α such that

    (63)  H ⊢^{α0}_{Ω+1} Γ, C ⇒ ∆;
    (64)  H ⊢^{α0}_{Ω+1} Γ ⇒ ∆, C.

Case 4.1: rk(C) < Ω. Then we can apply the induction hypothesis to both (63) and (64), so that

    (65)  Hα̂0 ⊢^{ψΩ(α̂0)}_{ψΩ(α̂0)} Γ, C ⇒ ∆;
    (66)  Hα̂0 ⊢^{ψΩ(α̂0)}_{ψΩ(α̂0)} Γ ⇒ ∆, C.

Since k(C) ⊆ Hη this implies rk(C) < ψΩ(η+1). Thus applying a cut to (65) and (66) yields Hα̂ ⊢^{ψΩ(α̂)}_{ψΩ(α̂)} Γ ⇒ ∆.

Case 4.2: rk(C) = Ω. Then C is of the form Qx F(x) with Q ∈ {∃, ∀} and F(L0) being ∆0. Let us first suppose that C is ∃x F(x). Then we can apply the induction hypothesis to (64) and we get

    (67)  Hα̂0 ⊢^{ψΩ(α̂0)}_{ψΩ(α̂0)} Γ ⇒ ∆, C.

Using the Persistence Lemma 10.4 and the fact that ψΩ(α̂0) ∈ Hα̂0 (invoking Lemma 10.3(i)), we infer from (63) that

    (68)  Hα̂0 ⊢^{α0}_{Ω+1} Γ, (∃x ∈ L_{ψΩ(α̂0)}) F(x) ⇒ ∆.

Since (∃x ∈ L_{ψΩ(α̂0)}) F(x) is ∆0, the induction hypothesis can be applied to (68), yielding

    (69)  Hα1 ⊢^{ψΩ(α1)}_{ψΩ(α1)} Γ, (∃x ∈ L_{ψΩ(α̂0)}) F(x) ⇒ ∆,

where α1 = α̂0 + ω^{Ω+α0}.
Since α1 < η + ω^{Ω+α} = α̂ and rk((∃x ∈ L_{ψΩ(α̂0)}) F(x)) < ψΩ(α̂) hold, cutting (67) with (69) furnishes Hα̂ ⊢^{ψΩ(α̂)}_{ψΩ(α̂)} Γ ⇒ ∆. If C is ∀x F(x) the argument is similar. □

11 Interpreting KP in RS

Theorem: 11.1 (Interpretation Theorem) If KP ⊢ A, where A is a sentence, then there exists m < ω such that

    H0 ⊢^{Ω·ω^m}_{Ω+m} A.

Proof: The proof is too long to be incorporated here. □

Corollary: 11.2 (i) If A is a Σ-sentence of KP and KP ⊢ A, then LψΩ(εΩ+1) ⊨ A.

(ii) If KP ⊢ C, where C is a sentence of the form ∀x∃y F(x, y) with F(a, b) a Σ-formula, then LψΩ(εΩ+1) ⊨ C.

(iii) There is no ordinal < ψΩ(εΩ+1) that satisfies (i).

(iv) ‖KP‖ = ψΩ(εΩ+1).

Proof: (i): Suppose KP ⊢ A. By Theorem 11.1 we find m < ω such that H0 ⊢^{Ω·ω^m}_{Ω+m} A. We may assume that m > 1. Using the First Cut Elimination Theorem 9.14 (m−1) times we get

    (70)  H0 ⊢^{σ0}_{Ω+1} A,

where σ0 := ω_{m−1}(Ω·ω^m), with ω_0(δ) := δ and ω_{k+1}(δ) := ω^{ω_k(δ)}. Note that to (70) we can apply Impredicative Cut Elimination 10.5, and hence, since 0 + ω^{Ω+σ0} = ω^{Ω+σ0},

    (71)  Hσ1 ⊢^{ψΩ(σ1)}_{ψΩ(σ1)} A,

where σ1 = ω^{Ω+σ0}. By the Bounding Lemma 9.16 it follows that

    (72)  Hσ1 ⊢^{σ2}_{σ2} A^{Lσ2},

where σ2 = ψΩ(σ1). By Predicative Cut Elimination 9.15 we conclude from (72) that

    (73)  Hσ1 ⊢^{ϕσ2(σ2)}_0 A^{Lσ2}.

As the derivation (73) contains no inference (Σ-Ref), one then shows by induction on ϕσ2(σ2) that all sequents appearing in the derivation are true in Lσ2 under the standard interpretation. Obviously, ϕσ2(σ2) < ψΩ(εΩ+1). As A is a Σ-formula it follows that LψΩ(εΩ+1) ⊨ A.

(ii) follows from (i) and (iii) using Theorem 2.1 from M. Rathjen: Fragments of Kripke–Platek set theory with infinity, in: P. Aczel, H. Simmons, S. Wainer (eds.): Proof Theory (Cambridge University Press, Cambridge, 1992) 251–273.

(iii) requires a well-ordering proof in KP.

(iv) follows from the fact that PA + TI(ψΩ(εΩ+1)) proves the consistency of KP, together with a cunning argument involving Löb's Theorem. □

ψΩ(εΩ+1) is also known as the Bachmann–Howard ordinal.

References

[1] T.
Arai: Proof theory for theories of ordinals I: recursively Mahlo ordinals, Annals of Pure and Applied Logic 122 (2003) 1–85.

[2] T. Arai: Proof theory for theories of ordinals II: Π3-Reflection, Annals of Pure and Applied Logic.

[3] H. Bachmann: Die Normalfunktionen und das Problem der ausgezeichneten Folgen von Ordinalzahlen, Vierteljahresschrift Naturforsch. Ges. Zürich 95 (1950) 115–147.

[4] J. Barwise: Admissible Sets and Structures (Springer, Berlin, 1975).

[5] W. Buchholz: Eine Erweiterung der Schnitteliminationsmethode, Habilitationsschrift (München, 1977).

[6] W. Buchholz: A simplified version of local predicativity, in: Aczel, Simmons, Wainer (eds.), Leeds Proof Theory 1991 (Cambridge University Press, Cambridge, 1993) 115–147.

[7] W. Buchholz, S. Feferman, W. Pohlers, W. Sieg: Iterated inductive definitions and subsystems of analysis (Springer, Berlin, 1981).

[8] W. Buchholz and K. Schütte: Proof theory of impredicative subsystems of analysis (Bibliopolis, Naples, 1988).

[9] W. Buchholz: Explaining Gentzen’s consistency proof within infinitary proof theory, in: G. Gottlob et al. (eds.), Computational Logic and Proof Theory, KGC ’97, Lecture Notes in Computer Science 1289 (1997).

[10] G. Cantor: Beiträge zur Begründung der transfiniten Mengenlehre II, Mathematische Annalen 49 (1897) 207–246.

[11] T. Carlson: Elementary patterns of resemblance, Annals of Pure and Applied Logic 108 (2001) 19–77.

[12] A.G. Dragalin: New forms of realizability and Markov’s rule (Russian), Dokl. Acad. Nauk. SSSR 2551 (1980) 543–537; translated in: Sov. Math. Dokl. 10, 1417–1420.

[13] F. Drake: Set Theory: An Introduction to Large Cardinals (North-Holland, Amsterdam, 1974).

[14] S. Feferman: Systems of predicative analysis, Journal of Symbolic Logic 29 (1964) 1–30.

[15] S. Feferman: Proof theory: a personal report, in: G. Takeuti, Proof Theory, 2nd edition (North-Holland, Amsterdam, 1987) 445–485.

[16] S.
Feferman: Hilbert’s program relativized: Proof-theoretical and foundational reductions, Journal of Symbolic Logic 53 (1988) 364–384.

[17] S. Feferman: Remarks for “The Trends in Logic”, in: Logic Colloquium ’88 (North-Holland, Amsterdam, 1989) 361–363.

[18] H. Friedman: Classically and intuitionistically provably recursive functions, in: G.H. Müller, D.S. Scott (eds.): Higher Set Theory (Springer, Berlin, 1978) 21–27.

[19] H. Friedman, K. McAloon, and S. Simpson: A finite combinatorial principle which is equivalent to the 1-consistency of predicative analysis, in: G. Metakides (ed.): Patras Logic Symposium (North-Holland, Amsterdam, 1982) 197–220.

[20] H. Friedman, N. Robertson, P. Seymour: The metamathematics of the graph minor theorem, Contemporary Mathematics 65 (1987) 229–261.

[21] H. Friedman and A. Ščedrov: Large sets in intuitionistic set theory, Annals of Pure and Applied Logic 27 (1984) 1–24.

[22] H. Friedman and S. Sheard: Elementary descent recursion and proof theory, Annals of Pure and Applied Logic 71 (1995) 1–45.

[23] G.H. Hardy: A theorem concerning the infinite cardinal numbers, Quarterly Journal of Mathematics 35 (1904) 87–94.

[24] D. Hilbert: Die Grundlegung der elementaren Zahlentheorie, Mathematische Annalen 104 (1931).

[25] D. Hilbert and P. Bernays: Grundlagen der Mathematik II (Springer, Berlin, 1938).

[26] G. Jäger: Zur Beweistheorie der Kripke–Platek-Mengenlehre über den natürlichen Zahlen, Archiv f. Math. Logik 22 (1982) 121–139.

[27] G. Jäger and W. Pohlers: Eine beweistheoretische Untersuchung von ∆^1_2-CA + BI und verwandter Systeme, Sitzungsberichte der Bayerischen Akademie der Wissenschaften, Mathematisch-Naturwissenschaftliche Klasse (1982).

[28] A. Kanamori, M. Magidor: The evolution of large cardinal axioms in set theory, in: G.H. Müller, D.S. Scott (eds.): Higher Set Theory, Lecture Notes in Mathematics 669 (Springer, Berlin, 1978) 99–275.

[29] G.
Kreisel: On the interpretation of non-finitist proofs II, Journal of Symbolic Logic 17 (1952) 43–58.

[30] G. Kreisel: Mathematical significance of consistency proofs, Journal of Symbolic Logic 23 (1958) 155–182.

[31] G. Kreisel: Generalized inductive definitions, in: Stanford Report on the Foundations of Analysis (mimeographed, Stanford, 1963) Section III.

[32] G. Kreisel: A survey of proof theory, Journal of Symbolic Logic 33 (1968) 321–388.

[33] G. Kreisel: Notes concerning the elements of proof theory, course notes of a course on proof theory at U.C.L.A., 1967–1968.

[34] G. Kreisel, G. Mints, S. Simpson: The use of abstract language in elementary metamathematics: Some pedagogic examples, in: Lecture Notes in Mathematics, vol. 453 (Springer, Berlin, 1975) 38–131.

[35] G.E. Mints: Finite investigations of infinite derivations, Journal of Soviet Mathematics 15 (1981) 45–62.

[36] W. Pohlers: Cut elimination for impredicative infinitary systems, part II: Ordinal analysis for iterated inductive definitions, Archiv f. Math. Logik 22 (1982) 113–129.

[37] W. Pohlers: Proof theory and ordinal analysis, Archive for Mathematical Logic 30 (1991) 311–376.

[38] M. Rathjen: Ordinal notations based on a weakly Mahlo cardinal, Archive for Mathematical Logic 29 (1990) 249–263.

[39] M. Rathjen: Proof-theoretic analysis of KPM, Archive for Mathematical Logic 30 (1991) 377–403.

[40] M. Rathjen: How to develop proof-theoretic ordinal functions on the basis of admissible sets, Mathematical Logic Quarterly 39 (1993) 47–54.

[41] M. Rathjen: Collapsing functions based on recursively large ordinals: A well-ordering proof for KPM, Archive for Mathematical Logic 33 (1994) 35–55.

[42] M. Rathjen: Proof theory of reflection, Annals of Pure and Applied Logic 68 (1994) 181–224.

[43] M. Rathjen: Recent advances in ordinal analysis: Π^1_2-CA and related systems, Bulletin of Symbolic Logic 1 (1995) 468–485.

[44] M. Rathjen: The realm of ordinal analysis, in: S.B. Cooper and J.K. Truss (eds.): Sets and Proofs
(Cambridge University Press, Cambridge, 1999) 219–279.

[45] M. Rathjen: An ordinal analysis of stability, Archive for Mathematical Logic 44 (2005) 1–62.

[46] M. Rathjen: An ordinal analysis of parameter-free Π^1_2 comprehension, Archive for Mathematical Logic 44 (2005) 263–362.

[47] W. Richter and P. Aczel: Inductive definitions and reflecting properties of admissible ordinals, in: J.E. Fenstad, Hinman (eds.): Generalized Recursion Theory (North-Holland, Amsterdam, 1973) 301–381.

[48] K. Schütte: Beweistheoretische Erfassung der unendlichen Induktion in der Zahlentheorie, Mathematische Annalen 122 (1951) 369–389.

[49] K. Schütte: Beweistheorie (Springer, Berlin, 1960).

[50] K. Schütte: Eine Grenze für die Beweisbarkeit der transfiniten Induktion in der verzweigten Typenlogik, Archiv für Mathematische Logik und Grundlagenforschung 67 (1964) 45–60.

[51] K. Schütte: Predicative well-orderings, in: Crossley, Dummett (eds.), Formal Systems and Recursive Functions (North-Holland, Amsterdam, 1965) 176–184.

[52] K. Schütte: Proof Theory (Springer, Berlin, 1977).

[53] H. Schwichtenberg: Proof theory: Some applications of cut-elimination, in: J. Barwise (ed.): Handbook of Mathematical Logic (North-Holland, Amsterdam, 1977) 867–895.

[54] S. Simpson: Nichtbeweisbarkeit von gewissen kombinatorischen Eigenschaften endlicher Bäume, Archiv f. Math. Logik 25 (1985) 45–65.

[55] S. Simpson: Subsystems of Second Order Arithmetic (Springer, Berlin, 1999).

[56] G. Takeuti: Consistency proofs of subsystems of classical analysis, Annals of Mathematics 86 (1967) 299–348.

[57] G. Takeuti: Proof theory and set theory, Synthese 62 (1985) 255–263.

[58] G. Takeuti, M. Yasugi: The ordinals of the systems of second order arithmetic with the provably ∆^1_2-comprehension and the ∆^1_2-comprehension axiom respectively, Japan J. Math. 41 (1973) 1–67.

[59] A.S. Troelstra: Metamathematical investigations of intuitionistic arithmetic and analysis (Springer, Berlin, 1973).

[60] A. S. Troelstra and D.
van Dalen: Constructivism in Mathematics: An Introduction, volumes I and II (North-Holland, Amsterdam, 1988).

[61] O. Veblen: Continuous increasing functions of finite and transfinite ordinals, Trans. Amer. Math. Soc. 9 (1908) 280–292.