A Beginner’s Guide to Modern Set Theory

Martin Dowd

Product of Hyperon Software
PO Box 4161
Costa Mesa, CA 92628
www.hyperonsoft.com

Copyright © 2010 by Martin Dowd

1. Introduction
2. Formal logic
3. Axioms of equality
4. The integers
5. Informal set theory
6. Structures and models
7. Models of Peano arithmetic
8. The real numbers
9. Computability
10. Independence
11. ZFC
12. Proper classes
13. Ordinals and cardinals
14. The real numbers (II)
15. The continuum hypothesis
16. Absoluteness
17. Admissible sets
18. Formalization of syntax
19. Constructible sets
20. CH is true in L
21. Forcing
22. ¬CH is consistent
23. Clubs, stationary sets, and diamond
24. Trees
25. The Suslin hypothesis
26. Diamond implies ¬SH
27. Iterated forcing
28. Martin’s axiom
29. SH is consistent
30. Inaccessible cardinals
31. Mahlo cardinals
32. Greatly Mahlo cardinals
33. Reflection principles
34. Indescribable cardinals
35. Ultrapowers
36. Measurable cardinals
37. Indiscernibles
38. 0#
39. Relative constructibility
40. Direct limits
41. L[U] and iterated ultrapowers
42. The sharp operator
43. Cardinals larger than measurable
44. Kunen’s theorem
45. Rudimentary functions
46. The Jensen hierarchy
47. Fine structure
48. Upward extension
49. Fine structural ultrapowers
50. The covering lemma
51. Cardinal arithmetic
52. Square
53. Independence of AC
54. Proper forcing
55. Core models
56. Consistency strength
57. Descriptive set theory
58. Determinacy
59. Determinacy and descriptive set theory
60. Determinacy and 0#
61. Determinacy and large cardinals
62. Forcing axioms
63. Some observations
Appendix 1. Axioms for plane geometry
Appendix 2. Computability (II)
References
Index
Index of symbols

1. Introduction.

As the title suggests, this book is intended to provide an introduction to modern set theory, to readers with little or no knowledge of mathematical logic. As such, it should be useful to anyone interested in learning about modern set theory, without having to wade through an entire text such as the “Millennium Edition” [Jech2]. Readers might fall into two categories: those who are not interested in reading further, and those who are. For the latter, this book hopefully provides useful orientation.

It is hoped that advanced high school students will find this book useful. Admittedly only the most intrepid student would finish it in high school; but the first 15 chapters, and the two appendices, are hopefully fairly accessible.
Resources for advanced high school mathematics are mainly in calculus and linear algebra, with some resources in other areas. Resources in mathematical logic have typically been scarce, one example being a 1958 book on Gödel’s proof [NagNew]. The website [Wiki, Mathematical logic] has overviews of various topics, and links to additional resources.

The present book contains an introduction to mathematical logic sufficient for its purposes, and thus should serve as a useful introduction for other purposes. Various other topics are covered for the same reason, so that the book is fairly self-contained.

Set theory, like any branch of contemporary mathematics, consists of an overwhelming volume of technical definitions and arguments. On the other hand, non-technical introductions sometimes engage in circumlocutions intended to avoid technical detail, so convoluted that they become confusing. The present book pursues an intermediate course, covering technical details in outline and giving references, so that the main content can be given with some discussion of technical details.

The book consists of a series of sections, each covering a particular topic. The table of contents gives a list of the sections. The end of a proof is denoted using the symbol “⊳”.

The author thanks Dr. Herbert Enderton for reading a draft of the manuscript.

2. Formal logic.

It is a discovery of late 19th and early 20th century mathematics, that mathematical theorems can be stated and proved in formal logic. This discovery did not change the way mathematics is done; theorems are proved by working mathematicians using informal logic, which other mathematicians can follow, and which may refer to extensive amounts of material already accepted as fact. Rather, formal logic brought complete precision to the analysis of mathematical reasoning, clarified various issues which had been under debate, and produced formal logic as itself a branch of mathematics.
Formal logic relies on the fact that statements of mathematics can be specified in a formal language. Indeed, this observation holds in other areas, and formal logic has found uses in addition to its use in mathematics. Statements are finite strings of symbols, each symbol being chosen from an “alphabet” of symbols. For this reason, formal logic is also called symbolic logic.

The alphabet of the formal language of mathematics is divided into groups of symbols, as follows.

    Logical symbols
        Punctuation marks            ( ) ,
        Propositional connectives    ¬ ∧ ∨ ⇒ ⇔
        Quantifiers                  ∀ ∃
        Variables                    x_0, x_1, ...
    Non-logical symbols
        Predicate symbols            P_0^n, P_1^n, ...
        Function symbols             f_0^n, f_1^n, ...
        Constant symbols             c_0, c_1, ...

The superscript n in predicate and function symbols is an integer giving its “valency”, i.e., the number of arguments it applies to; it will invariably be omitted.

Not every string of symbols is “legal”; those that are, are called formulas. These may be defined by giving rules for building them up, as follows.

A term is either a variable, a constant symbol, or f(t_1, ..., t_n) where f is a function symbol of valency n and t_1, ..., t_n are terms.

An atomic formula is a formula P(t_1, ..., t_n) where P is a predicate symbol of valency n and t_1, ..., t_n are terms.

A formula is either an atomic formula, ¬F, F_1 ∧ F_2, F_1 ∨ F_2, F_1 ⇒ F_2, F_1 ⇔ F_2, ∀xF, or ∃xF, where F, F_1, and F_2 are formulas and x is a variable.

The preceding style of definition, where objects which are already built up can be used to build up new objects, is called “recursive”. A “shortcut” has been taken; the subformulas in the definition of a formula should be enclosed in parentheses, to avoid ambiguity, although some of the parentheses can be made optional (requiring a more laborious recursive definition).

The notion of the free and bound occurrences of variables in a formula is an important one, and may be defined recursively as follows.
In an atomic formula, all occurrences of variables are free. In a propositional combination of formulas, all occurrences of variables are free or bound as they are in the constituent subformulas. In ∀xF or ∃xF, any free occurrence of x in F becomes bound; all other occurrences are free or bound as they are in F. A sentence is a formula in which all occurrences of variables are bound.

Later it will be seen that, given an interpretation in a mathematical setting of the non-logical symbols, a meaning can be assigned to any formula. Some discussion is useful here. In general, a formula defines a “predicate” on the “universe of discourse”: if values from the universe are assigned to the free variables, the formula takes on the value of either true or false. In particular, a sentence is a statement which is either true or false. A brief statement of the meaning of the propositional connectives and quantifiers can be given, as follows.

    ¬F          means “not F” (negation)
    F_1 ∧ F_2   means “F_1 and F_2” (conjunction)
    F_1 ∨ F_2   means “F_1 or F_2” (disjunction)
    F_1 ⇒ F_2   means “if F_1 then F_2” (implication)
    F_1 ⇔ F_2   means “F_1 if and only if F_2” (bi-implication)
    ∀xF         means “for all x, F” (universal quantification)
    ∃xF         means “there exists x, F” (existential quantification)

Having a formal definition of a mathematical statement, a formal definition can now be given of a proof. Certain formulas are specified as “axioms”, and rules are given for deducing formulas from formulas already deduced. Some axioms are axioms of formal logic, and are called “logical”. Other axioms are specific to a particular setting, and are called “non-logical”. The rules are all logical. The logical axioms of formal logic are chosen so that they are true in any setting, and in any setting the rules produce true statements from statements already known to be true. The non-logical axioms are true in settings of interest.
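The recursive definitions of terms, formulas, and free variables given earlier in this section can be sketched in code. The following is only an illustrative sketch, not part of the formal development: the class and function names (Var, Const, Func, Atom, Not, Binary, Quant, term_vars, free_vars, is_sentence) are hypothetical choices made here, and the code tracks the set of variables having at least one free occurrence rather than individual occurrences.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Terms: a variable, a constant symbol, or f(t_1, ..., t_n).
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Func:
    symbol: str
    args: Tuple["Term", ...]


Term = Union[Var, Const, Func]


# Formulas: atomic, negation, binary connective, or quantified.
@dataclass(frozen=True)
class Atom:
    predicate: str
    args: Tuple[Term, ...]


@dataclass(frozen=True)
class Not:
    body: "Formula"


@dataclass(frozen=True)
class Binary:
    op: str  # hypothetical tags: "and", "or", "implies", "iff"
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Quant:
    kind: str  # hypothetical tags: "forall", "exists"
    var: Var
    body: "Formula"


Formula = Union[Atom, Not, Binary, Quant]


def term_vars(t):
    """Names of the variables occurring in a term."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Const):
        return set()
    return set().union(*(term_vars(a) for a in t.args))


def free_vars(f):
    """Names of the variables with at least one free occurrence in f."""
    if isinstance(f, Atom):
        # In an atomic formula every occurrence is free.
        return set().union(*(term_vars(a) for a in f.args))
    if isinstance(f, Not):
        return free_vars(f.body)
    if isinstance(f, Binary):
        # Occurrences are free or bound as in the constituent subformulas.
        return free_vars(f.left) | free_vars(f.right)
    # Quantifier case: free occurrences of the quantified variable become bound.
    return free_vars(f.body) - {f.var.name}


def is_sentence(f):
    """A sentence has no free occurrences of variables."""
    return not free_vars(f)


x, y = Var("x"), Var("y")
f = Quant("forall", x, Atom("P", (x, y)))  # ∀x P(x, y)
assert free_vars(f) == {"y"}
assert is_sentence(Quant("exists", y, f))  # ∃y ∀x P(x, y)
```

Note that a variable may occur both free and bound in one formula, as in P(x) ∧ ∀x Q(x); the sketch reports x as free in that case.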
Even though the principles are clear without giving one, an example of a system of logical axioms and rules will be given. Such will be given for a variation of the alphabet, namely a smaller one. A larger alphabet is more expressive, but a smaller alphabet results in fewer axioms and rules. Needless to say, the variation is inessential; in particular, the larger alphabet can be expressed in terms of the smaller one.

The alphabet of the axioms and rules will be ¬ ⇒ ∀. In the following, let F, G, H be formulas. If F is a formula, x a variable, and t a term, F_{t/x} will denote the formula obtained from F by replacing each free occurrence of x by t.

There are three propositional logical axioms.

    F ⇒ (G ⇒ F)
    (H ⇒ (F ⇒ G)) ⇒ ((H ⇒ F) ⇒ (H ⇒ G))
    (¬F ⇒ G) ⇒ ((¬F ⇒ ¬G) ⇒ F)

There is one propositional rule.

    From F and F ⇒ G, deduce G.

There is one quantifier axiom.

    F ⇒ ∀xG |= F ⇒ G_{t/x}, provided no occurrence of a variable of t becomes bound.

There is one quantifier rule.

    From F ⇒ G deduce F ⇒ ∀xG, provided x does not occur free in F.

Note that arbitrary formulas may occur in a proof, and not just sentences. This is an artifact of the method; quantifiers get introduced as the formulas of the proof become more complex. A formula is considered to be true if it is true regardless of the values assigned to the free variables (that is, if its “universal closure” is a true sentence).

As has been seen, the “syntax” of formal (or mathematical) logic consists of an alphabet, and rules for building formulas. Statements of mathematics are proved to be true using the axioms and rules of a formal system for making deductions. The semantics of mathematical logic consists of assigning in a rigorous manner a meaning to each formula; this requires some additional concepts, and is left to section 6. Once all this is specified, theorems may be proved about mathematical logic itself, which delineate the way in which it captures mathematical reasoning.
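To illustrate how the propositional axioms and the propositional rule above are used, here is a short worked deduction of A ⇒ A for an arbitrary formula A. It is a standard exercise rather than something taken from this text; the letter A and the step numbering are choices made here for illustration.

```latex
\begin{align*}
1.\quad & (A \Rightarrow ((A \Rightarrow A) \Rightarrow A)) \Rightarrow
          ((A \Rightarrow (A \Rightarrow A)) \Rightarrow (A \Rightarrow A))
        && \text{second axiom, with } H, F, G \text{ taken as } A,\ A \Rightarrow A,\ A \\
2.\quad & A \Rightarrow ((A \Rightarrow A) \Rightarrow A)
        && \text{first axiom, with } F, G \text{ taken as } A,\ A \Rightarrow A \\
3.\quad & (A \Rightarrow (A \Rightarrow A)) \Rightarrow (A \Rightarrow A)
        && \text{propositional rule, from 2 and 1} \\
4.\quad & A \Rightarrow (A \Rightarrow A)
        && \text{first axiom, with } F, G \text{ both taken as } A \\
5.\quad & A \Rightarrow A
        && \text{propositional rule, from 4 and 3}
\end{align*}
```

Even this trivial-looking statement takes five lines, which suggests why working mathematicians continue to write informal proofs.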
There are a number of introductions to mathematical logic, among them [Belaniuk], [Enderton], [Mendelson], [Magnus], and chapter 11 of the author’s self-published advanced undergraduate algebra text [Dowd1].

As will be seen in section 11, formal logic is an essential ingredient of modern set theory. Historically, early developments in mathematical logic and set theory overlapped and influenced each other.

A relatively recent development in mathematical logic is the use of computers to produce “formal proofs” of mathematical theorems, using a known “informal proof” as a starting point. The December 2008 issue of the Notices of the American Mathematical Society contains several articles on the subject.

3. Axioms of equality.

The equality predicate, for which the symbol = is used, has a special status in formal logic. It is a binary (valency 2) predicate. As for many common binary predicates, the notation x = y is used in mathematical writing rather than =(x, y). In settings where equality is present, it is meant to be interpreted as equality, that is, x = y holds only when x and y are assigned the same value. There are some subtleties in handling the special status of the equality predicate; and some variations in how this is done. More will be said in section 6.

If equality is present, the axioms for it may be considered to be added as “quasi-logical” (standardized non-logical) axioms. These are as follows.

    x = x
    x = y ⇒ y = x
    x = y ⇒ y = z ⇒ x = z
    x_1 = y_1 ⇒ ··· ⇒ x_n = y_n ⇒ P(x_1, ..., x_n) ⇒ P(y_1, ..., y_n), for any valency n predicate symbol P.
    x_1 = y_1 ⇒ ··· ⇒ x_n = y_n ⇒ f(x_1, ..., x_n) = f(y_1, ..., y_n), for any valency n function symbol f.

In the foregoing, x, y, etc. denote variables. Also, the abbreviation F_1 ⇒ ··· ⇒ F_k is used for F_1 ⇒ (··· ⇒ F_k); this may also be written as (F_1 ∧ ··· ∧ F_{k−1}) ⇒ F_k, or just F_1 ∧ ··· ∧ F_{k−1} ⇒ F_k. The axioms of equality are written without quantifiers, all variables being implicitly universally quantified.
This is a serendipitous coincidence between common use in mathematical writing, and a convention of formal logic.

4. The integers.

The integers are fundamental mathematical objects, which are familiar from everyday life. With modern machinery, a theory of the integers can be given either for all the integers, including negative integers; or for only the non-negative integers. Historically, the theory of the non-negative integers has been important in the development of mathematical logic, and it continues to play a significant role.

The non-negative integers 0, 1, 2, ... comprise a universe of discourse concerning which mathematical statements can be made. A set of nonlogical symbols which turns out to be satisfactory as those of the formal language for such statements is as follows: a constant 0; a valency 1 function s, the successor function; a valency 2 function +, addition; a valency 2 function ·, multiplication; and the equality predicate =. The notation x^s will be used for the successor function; x^s equals x + 1. Even though it is not ordinarily used in mathematical writing, it is convenient and traditional to have it as one of the symbols of the formal language in this setting.

The above symbols comprise the language of Peano arithmetic. Let F denote a formula in this language, and let x, y, etc., denote variables. The following formulas are known as Peano’s axioms.

    1. x^s = y^s ⇒ x = y
    2. ¬ x^s = 0
    3. x + 0 = x
    4. x + y^s = (x + y)^s
    5. x · 0 = 0
    6. x · y^s = (x · y) + x
    7_F. F_{0/x} ∧ ∀x(F ⇒ F_{x^s/x}) ⇒ ∀xF

Again, axioms 1 to 6 are written without quantifiers, and all variables are implicitly universally quantified. Peano’s axioms are clearly basic facts about the non-negative integers. In accordance with the axiomatic method, they are taken as true, and more complex statements deduced to be true by mathematical reasoning.

Axiom 7 is an infinite family of axioms, one for each formula F (and variable x).
Such a system of axioms is called an axiom scheme, and these occur frequently in mathematical logic. Note that x is not required to occur free in F; some authors do require this, but it is unnecessary to do so.

This axiom scheme is a formal statement of the principle of mathematical induction. Mathematical induction may be stated in a version using sets of integers; but the formal machinery given so far does not provide for this, and Peano’s axioms provide a method for giving axioms for the non-negative integers within the confines of basic formal logic. Historically, this was a reason for their introduction.

They remain a topic of considerable interest in mathematical logic, even though they are subsumed by formal set theory, as will be seen. In particular, the “logical strength” of Peano’s axioms is of great interest. As will be noted in section 10, not every true statement about the integers can be proved using them (this is in fact the case for any formal system for arithmetic which proves only true statements); but stronger systems can be given. Whether a particular true statement about the non-negative integers can be proved using Peano’s axioms is a topic of interest in mathematical logic.

Of course, Peano’s axioms are of interest because they are strong enough that a wide variety of basic facts about the non-negative integers can be proved using them. Treatments of this topic can be found in [Mendelson] and [Shoenfield1]. Among these facts are the following.
- The basic properties of + and · are provable.
- There is a formula defining the order relation ≤ (indeed, x ≤ y if and only if ∃w(y = x + w)), and its basic properties are provable.
- The “division law” states that for any nonnegative integer x and positive integer d there are unique nonnegative integers q and r such that x = q · d + r and r < d; this is provable.
- The exponential function is definable, that is, there is a formula E(x, y, z) which is true if and only if z = x^y.
  The basic properties of the exponential function are provable.
- More generally, any of the class of functions known as the primitive recursive functions (see appendix 2) is definable.

Another result of mathematical logic of interest concerning Peano’s axioms, is that there is no finite set of axioms from which the statements provable are exactly those provable in Peano arithmetic. [Shoenfield1] has a proof of this.

5. Informal set theory.

Informal set theory has become so indispensable to mathematical discourse that it is now taught early in mathematical education. Like the integers, the sets are mathematical objects which comprise a mathematical universe of discourse. Indeed, they comprise a single universe of discourse for all of mathematics. This is a more advanced topic, but in view of the fact, it should not be surprising that the notion of a set is useful throughout mathematics.

Basic set theory and logic are both tools used throughout mathematics, in particular in the consideration of each other. This results in the need for “forward references” in the presentation of the two topics, which various authors handle in various ways. A formal definition of the meaning of formulas has been deferred to section 6, and until then the reader’s existing knowledge will be relied on, indeed already has been in the preceding section.

The language of set theory has a single binary predicate symbol, called “membership” and denoted ∈. The fact that x ∈ y is stated variously as, x is a member of y, x is an element of y, or x belongs to y. The notation x ∉ y is used to abbreviate ¬(x ∈ y).

The equality predicate will also be considered a basic symbol, although in set theory it can be defined. The formula x = y ⇔ ∀w(w ∈ x ⇔ w ∈ y) is called the extensionality axiom. It is assumed as an axiom of set theory if equality is considered to be a predicate symbol; or it may be taken as the definition of equality.
The concepts of informal set theory can all be defined in terms of membership and equality. However, it is necessary to posit that certain construction operations can be carried out to obtain new sets from already known sets. The axioms of set theory give formal rules for these constructions. For example, if objects x_1, ..., x_k are given then there is a set {x_1, ..., x_k} whose elements are exactly these objects.

In set theory there is no distinction between an object and a set; but in specific settings it may be convenient to make such a distinction. For example, one can consider the integers as objects, and then consider sets of integers. The integers can be defined within set theory as specific sets, in a way which by now is standard; this will be discussed further in section 13.

The set containing no elements is called the empty set and denoted ∅. The axioms of set theory ensure that it exists and is unique. It plays a role in set theory analogous to 0 in arithmetic.

The main topics of informal set theory can be organized into the following areas.
- Subsets, the power set, and operations on the power set.
- Ordered n-tuples and the Cartesian product.
- Relations.
- Functions.
Each of these will be considered in turn. The website [Wiki, Naive set theory] is one of numerous references covering these topics, and has links to additional resources. Introductory set theory books such as [Monk1] cover them also, deriving basic facts from the axioms. Textbooks in other areas of mathematics frequently review informal set theory in introductory material, [Dowd1] for example.

A set x is said to be a subset of a set y, written x ⊆ y, if w ∈ x ⇒ w ∈ y. By the extensionality axiom, x = y if and only if x ⊆ y and y ⊆ x. If x ⊆ y but x ≠ y then x is said to be a proper subset of y, and this is written x ⊂ y. It should be noted that, as usual, the foregoing is just one of various notational conventions in use.
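For finite sets, the definitions of subset and proper subset just given, and the consequence of extensionality that x = y exactly when x ⊆ y and y ⊆ x, can be checked mechanically. The following is a minimal sketch using Python's built-in frozenset as a stand-in for finite sets; the function names is_subset and is_proper_subset are hypothetical.

```python
def is_subset(x, y):
    """x ⊆ y: every member of x is a member of y."""
    return all(w in y for w in x)


def is_proper_subset(x, y):
    """x ⊂ y: x ⊆ y but x ≠ y."""
    return is_subset(x, y) and x != y


a = frozenset({1, 2})
b = frozenset({2, 1, 1})  # same members as a, listed differently
c = frozenset({1, 2, 3})

# Extensionality: sets with the same members are equal.
assert a == b
# x = y if and only if x ⊆ y and y ⊆ x.
for x in (a, b, c):
    for y in (a, b, c):
        assert (x == y) == (is_subset(x, y) and is_subset(y, x))
assert is_proper_subset(a, c)
assert not is_proper_subset(a, b)
# The empty set is a subset of every set (the condition holds vacuously).
assert is_subset(frozenset(), c)
```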
If x is a set then the collection of all its subsets comprises a set, called the power set of x, and denoted Pow(x). This is one of the construction principles provided in the axioms of set theory (indeed, it is the power set axiom). Note that ∅ ⊆ x (the defining formula holds “vacuously”, since there are no w satisfying w ∈ ∅); and hence ∅ ∈ Pow(x) for any set x.

Suppose U is a set; then the following operations may be defined on Pow(U).
- union: w ∈ x ∪ y if and only if w ∈ x or w ∈ y.
- intersection: w ∈ x ∩ y if and only if w ∈ x and w ∈ y.
- complement: w ∈ x^c if and only if w ∈ U and w ∉ x.

The following formulas are the axioms for the structures known as Boolean algebras, with the binary functions ∪ and ∩, the unary function ^c, and the constants ∅ and U (structures are defined in section 6).
- x ∪ y = y ∪ x, x ∩ y = y ∩ x
- x ∪ (y ∪ z) = (x ∪ y) ∪ z, x ∩ (y ∩ z) = (x ∩ y) ∩ z
- x ∪ (y ∩ z) = (x ∪ y) ∩ (x ∪ z), x ∩ (y ∪ z) = (x ∩ y) ∪ (x ∩ z)
- x ∪ ∅ = x, x ∩ U = x
- x ∪ x^c = U, x ∩ x^c = ∅

It is easy to verify that Pow(U) forms a Boolean algebra with the operations given above. Further identities involving these operations may be proved from the axioms, with the advantage that they then have been shown not only for Pow(U), but for any Boolean algebra. Such identities may be found in various references, including [Dowd1].

The operations x ∪ y and x ∩ y are in fact defined for any pair of sets. A generalization of the union operation is important in the development of formal set theory. The complementation operation however is only defined on the subsets of a given set. The relative complement, or difference, x − y may be defined for any sets x and y: w ∈ x − y if and only if w ∈ x and w ∉ y.

The use of the minus sign for both subtraction of real numbers and relative complement causes no confusion. The context makes clear which is intended, with rare exceptions which can be clarified explicitly.
For readers familiar with the concept of “overloading” from programming languages, the minus sign is overloaded, and may have arguments which are real numbers (or more generally elements of a commutative group); or sets.

Additional terminology includes the following. A set y is said to be a superset of x, written y ⊇ x, if x is a subset of y. Sets x and y are said to be disjoint if x ∩ y = ∅. A set z is the disjoint union of sets x and y if z = x ∪ y and x ∩ y = ∅. The symmetric difference x ⊕ y of two sets equals (x − y) ∪ (y − x).

As noted above, if x and y are objects there is a set {x, y} such that w ∈ {x, y} if and only if w = x or w = y. This in fact is the axiom of pairing. If x and y are the same object then {x, y} only contains a single object, otherwise it contains two objects. Also, {x, y} and {y, x} are the same set.

One of the basic constructions of set theory is that of the ordered pair ⟨x, y⟩ of two objects x and y. This is designed to have the property that ⟨x_1, y_1⟩ = ⟨x_2, y_2⟩ if and only if x_1 = x_2 and y_1 = y_2. It is not necessary to add this as a basic construction principle; ⟨x, y⟩ may be defined to be {{x}, {x, y}}. It follows using the axioms of extensionality and pairing that with this definition ⟨x, y⟩ has the desired property. A history of the notion of ordered pair can be found in [Kanamori1]; the modern definition is therein credited to Kuratowski.

The Cartesian product x × y of two sets x and y is defined to be the set such that w ∈ x × y if and only if w = ⟨w_1, w_2⟩ where w_1 ∈ x and w_2 ∈ y. In a more convenient notation, the definition may be written as

    x × y = {⟨w_1, w_2⟩ : w_1 ∈ x, w_2 ∈ y}.

From here on such notation will be used without further comment. In formal set theory the Cartesian product is proved to exist from the axioms. In informal set theory the existence may be accepted as intuitively obvious; note, however, that x × y ⊆ Pow(Pow(x ∪ y)), and this fact is part of the formal existence proof.
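The definition of the ordered pair as {{x}, {x, y}} and the inclusion x × y ⊆ Pow(Pow(x ∪ y)) can be checked on small finite examples. The sketch below again uses frozenset as a stand-in for finite sets; the function names pair, cartesian_product, and power_set are hypothetical, and a check over a few small values is of course an illustration, not a proof.

```python
from itertools import combinations


def pair(x, y):
    """Ordered pair of x and y, defined as the set {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def cartesian_product(x, y):
    """The set of all ordered pairs with first entry in x and second in y."""
    return frozenset(pair(u, v) for u in x for v in y)


def power_set(x):
    """The set of all subsets of the finite set x."""
    items = list(x)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )


# Unordered pairs ignore order; ordered pairs do not.
assert frozenset({0, 1}) == frozenset({1, 0})
assert pair(0, 1) != pair(1, 0)
# When both entries coincide, the pair collapses to {{x}}.
assert pair(1, 1) == frozenset({frozenset({1})})

# Characteristic property, checked exhaustively over a few values.
values = [0, 1, 2]
assert all(
    (pair(a, b) == pair(c, d)) == (a == c and b == d)
    for a in values for b in values for c in values for d in values
)

x = frozenset({0, 1})
y = frozenset({1, 2})
product = cartesian_product(x, y)
assert len(product) == 4
# Each pair is a set of subsets of x ∪ y, hence a member of Pow(Pow(x ∪ y)).
assert product.issubset(power_set(power_set(x | y)))
```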
The Cartesian product x_1 × ··· × x_n of n sets may be defined recursively to be x_1 × (x_2 × ··· × x_n). There is an obvious correspondence between ⟨w_1, ⟨w_2, w_3⟩⟩ and ⟨⟨w_1, w_2⟩, w_3⟩, which can usually be ignored, and the triple written as ⟨w_1, w_2, w_3⟩, which in tedious formality is the first version. Similar remarks hold for other nested Cartesian products.

An n-ary relation on a set x is defined to be a subset of x × ··· × x, where there are n factors of x. If n = 1 the relation is called unary; a unary relation is the same thing as a subset. If n = 2 the relation is called binary.

A function f from a set x to a set y is a subset of x × y, such that for all u ∈ x there exists a unique v ∈ y, such that ⟨u, v⟩ ∈ f. A function assigns an element of y to each element of x. [Kanamori1] notes that the definition of a function in this generality was an early triumph of set theory, with Felix Hausdorff being a major contributor. Having a definition such as this, a function may be considered as an object, as is done in calculus for example.

The notation f : x ↦ y is used to denote that f is a function from x to y. Basic definitions concerning such a function include the following.
- f(u) = v may be written, rather than ⟨u, v⟩ ∈ f; similarly f(u) may be used for v in formulas.
- In mathematical writing, the terminology “graph of f” is used for the relation f, although in formal set theory f as an object is the relation.
- The domain of f is x; Dom(f) will be used to denote it.
- For x′ ⊆ x, f[x′] denotes {v : ∃u ∈ x′ (f(u) = v)}.
- The range of f equals f[x]; Ran(f) will be used to denote it.
- If x′ ⊆ x the restriction of f to x′ is the set {⟨u, v⟩ ∈ f : u ∈ x′}. This is a function from x′ to y, which is denoted f ↾ x′.
- f is said to be injective, or 1-1, if f(u_1) = f(u_2) implies u_1 = u_2.
- f is said to be surjective, or onto, if its range is y.
- f is said to be bijective, or a 1-1 correspondence, if it is both injective and surjective.
- If f : x ↦ y and g : y ↦ z then there is a function g ◦ f : x ↦ z, defined by the formula (g ◦ f)(u) = g(f(u)). This function is called the composition of g and f.
- An n-ary function on a set x is just a function from x × ··· × x to x, where there are n factors of x in the domain.

A function is also called a mapping or map, emphasizing the fact that, in addition to constituting an object itself, it has an “active” aspect.

The function from X_1 × X_2 to X_i where i is 1 or 2, which maps ⟨x_1, x_2⟩ to x_i, is called a projection function. These functions are quite convenient, and will be denoted as π_1 and π_2. Note that, for example, Dom(f) = π_1[f].

In formal set theory, a notion of the “size”, or “cardinality”, of an arbitrary set may be defined; this was an early triumph of set theory, due to Cantor. A treatment will be given in section 13; here a few facts are noted which will be needed before section 13. Given two cardinalities, one is greater than or equal to the other; and given any cardinality there are larger ones.

For a nonnegative integer n, let N_n be the set {0, ..., n − 1}; N_0 is the empty set (it will be seen in section 13 that in set theory the notation N_n is unnecessary). A set x is said to be finite if there is a bijection from N_n to x for some n. It may be shown by induction on k that if f : N_k ↦ N_l is a bijection then l = k. It follows that n is unique; this unique n is said to be the cardinality of x. A set is said to be infinite if it is not finite. Letting N denote the set of all natural numbers, a set x is said to be countably infinite if there is a bijection f : N ↦ x. Such a set is infinite.

Rather than attempting to be encyclopedic in this section, additional definitions of basic set theory will be introduced as needed.

6. Structures and models.

As already noted, set theory is a tool required in the development of mathematical logic.
The notion of a universe of discourse referred to in earlier sections can be formalized using it. A first-order language is defined to be a set of predicate, function, and constant symbols. Each predicate or function symbol has a valency associated with it. For many purposes, the set may be finite; however there are contexts where infinite sets are used, and the definition may easily be given in this generality. In a special case of frequent interest, there may be an infinite set of constants, while the predicates and functions are a fixed finite set.

Given a first-order language L, a structure for L consists of a nonempty set D, called the domain or universe of the structure, together with the following.
- For each n-ary predicate symbol P of L, an n-ary relation P̂ on D.
- For each n-ary function symbol f of L, an n-ary function f̂ on D.
- For each constant symbol c of L, an element ĉ of D.

The relation, function, or constant assigned to a symbol is called its interpretation. Predicate symbols are also called relation symbols. In this section, if = is in L, initially no restriction is placed on its interpretation.

Set-theoretically, a structure is a domain D, together with a function assigning to each symbol of L its interpretation. A frequently used notational abbreviation is to let D denote the structure, with the function understood, and let P̂, etc. denote the interpretation of P according to the structure. The interpretation of a valency n predicate symbol is an n-ary relation. From here on a valency n predicate symbol will be called n-ary. Likewise, a valency n function symbol will be called n-ary.

A formal definition of the meaning of a formula F in a first-order language, in a structure D for the language, will now be given. As is typical of mathematical logic, the definition is a tedious and long-winded formalization of a fact which is completely obvious. To begin with, the semantics of the propositional connectives must be specified.
Let {t, f} be the two element set of “truth values” true and false. A propositional connective denotes a function on this set; the same symbol will be used to denote this function as the connective itself. For ¬ the function is unary, with ¬t = f and ¬f = t. For the other connectives the function is binary, as follows.

X   Y   X ∧ Y   X ∨ Y   X ⇒ Y   X ⇔ Y
t   t     t       t       t       t
t   f     f       t       f       f
f   t     f       t       t       f
f   f     f       f       t       t

Given a structure D and a set of variables V, an assignment to V is defined to be a function α which assigns to each x ∈ V an element of D. For a term t, let V_t be the variables which occur in t. Similarly for a formula F let V_F be the variables which occur free in F.

Given a structure D, the interpretation t̂ of a term t is a function from assignments to V_t, to D. It is defined recursively as follows.
- If t is a variable x then t̂ is the function which assigns to the assignment α to {x}, the value α(x).
- If t is a constant c then t̂ is the function which assigns to the empty assignment, the value ĉ of c in the interpretation. The ambiguity of the notation causes no confusion.
- If t = f(t_1, …, t_n) and α is an assignment to V_t, for 1 ≤ i ≤ n let α_i be the assignment to V_{t_i} induced by α, i.e., α ↾ V_{t_i}. Then t̂(α) = f̂(t̂_1(α_1), …, t̂_n(α_n)).

Similarly, given a structure D, the interpretation F̂ of a formula F is a function from assignments to V_F, to {t, f}. It is defined recursively as follows.
- If F is an atomic formula P(t_1, …, t_n) then F̂(α) = P̂(t̂_1(α_1), …, t̂_n(α_n)), where α_i is as for terms.

For the remaining cases let α_i = α ↾ V_{F_i}.
- If F is ¬F_1 then F̂(α) = ¬F̂_1(α_1).
- If F is F_1 ∧ F_2 then F̂(α) = F̂_1(α_1) ∧ F̂_2(α_2).
- If F is F_1 ∨ F_2 then F̂(α) = F̂_1(α_1) ∨ F̂_2(α_2).
- If F is F_1 ⇒ F_2 then F̂(α) = F̂_1(α_1) ⇒ F̂_2(α_2).
- If F is F_1 ⇔ F_2 then F̂(α) = F̂_1(α_1) ⇔ F̂_2(α_2).
- If F is ∀xF_1 then F̂(α) = t if and only if F̂_1(β) = t for all assignments β to V_{F_1} such that β ↾ V_F = α.
- If F is ∃xF_1 then F̂(α) = t if and only if F̂_1(β) = t for some assignment β to V_{F_1} such that β ↾ V_F = α.

Some basic definitions from mathematical logic are as follows. Fix a first-order language L.
- A formula is said to be a formula in (or over) L if its non-logical symbols are all in L.
- If A is a set of formulas in L, and F is a formula in L, the notation A ⊢ F is used to denote the fact that there is a proof of F in formal logic, using axioms from A, where all formulas of the proof are in L.
- Given a structure D for L, and a formula F in L, ⊨_D F is used to denote the fact that F is true in D.
- Given a set A of formulas in L, the fact that ⊨_D F holds for every F ∈ A is denoted ⊨_D A, and D is said to be a model of (or for) A.
- A set of formulas A is said to be consistent if for no sentence F do both F and ¬F have proofs from A.

Suppose ⊨_D A, and A ⊢ F. It is straightforward (if tedious) to show that ⊨_D F. This fact is called the “soundness” of formal logic; it states that the logical axioms and rules are “sound”. A proof of this fact may be found in any of various introductory logic texts, including [Enderton], [Mendelson], and chapter 11 of [Dowd1]. Note that “extra” symbols may be allowed in a proof; this follows by simply enlarging (the technical term is “expanding”) L.

Suppose for any D, if ⊨_D A then ⊨_D F; then A ⊢ F. This fact is called the “completeness” of formal logic. Not only does formal logic prove only true statements, it proves all statements which follow “by logic alone” from the non-logical axioms. That is, either a formula is true in some models and false in others (so additional axioms are needed); or it follows from the axioms by formal logic. The completeness theorem was first proved by Kurt Gödel in 1929; a proof may be found in any of the above cited references.

Given a proof of F from A, let A_0 be the formulas of A which occur in the proof; this set is finite. Let L_0 be the symbols of L which occur in A_0 or F.
A model D for A_0 in L_0 may be considered a model in L; and since there is a proof of F, it is true in D considered as a model in L, whence it is true in D considered as a model in L_0. By completeness, then, there is a proof of F from A_0 which uses only symbols from L_0. There are “syntactic” proofs of facts such as this, using “Gentzen systems” for example; see [Smullyan].

If a set A of formulas has a model then it is consistent, since for a sentence F only one of F and ¬F can be true in the model, so only one can be provable. It follows by completeness that if a set A of formulas is consistent then it has a model. In fact, this is usually proved first, and completeness deduced from it.

In some cases, a system of axioms A is intended to be used to prove theorems about a particular structure; Peano’s axioms are an example. It is a fact of mathematical logic, however, that such systems will generally have other models than the intended one. Indeed, it follows from the “Löwenheim-Skolem” theorem that if A has infinite models then it has a model of any infinite cardinality greater than or equal to the cardinality of the language. A proof of this may be found in the above cited references, and a version is given in section 20; see [Wiki, Lowenheim-Skolem theorem] for some historical comments. In the next section, a few comments will be made on models of Peano’s axioms.

On the other hand, some systems of axioms A are intended to be used to prove theorems about any of a variety of structures, namely those which are models of the axioms. This is a basic tool of abstract algebra; a system of axioms for structures of a certain type is specified, and the theory of these developed by deducing facts from the axioms. An example has already been seen, namely Boolean algebras in section 5; additional examples will be seen in section 8.

If the language contains the equality predicate, say that a model is an E-model if = is interpreted as equality.
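
For a finite structure, whether a sentence is true in it (and hence whether the structure is a model of a finite set of axioms) can be checked by carrying out the recursive definition of F̂ from this section directly. The following is only a minimal sketch under stated assumptions: the tuple encoding of terms and formulas, the dictionary layout of a structure, and the names eval_term, holds, Z3, SAT and C4 are all hypothetical choices made for illustration. For simplicity an assignment is a dictionary that may mention extra variables, rather than being restricted to the free variables as in the definition above. Both example structures interpret = as equality, so they are candidates for E-models.

```python
def eval_term(t, S, alpha):
    """Interpretation of a term: a variable is a string, any other term is
    a tuple (function_symbol, arg_1, ..., arg_n); constants have n = 0."""
    if isinstance(t, str):
        return alpha[t]
    f, *args = t
    return S["funcs"][f](*(eval_term(a, S, alpha) for a in args))


def holds(F, S, alpha=None):
    """Truth value of formula F in the finite structure S under alpha."""
    alpha = alpha or {}
    op = F[0]
    if op == "pred":
        return S["preds"][F[1]](*(eval_term(t, S, alpha) for t in F[2:]))
    if op == "not":
        return not holds(F[1], S, alpha)
    if op == "and":
        return holds(F[1], S, alpha) and holds(F[2], S, alpha)
    if op == "or":
        return holds(F[1], S, alpha) or holds(F[2], S, alpha)
    if op == "implies":
        return (not holds(F[1], S, alpha)) or holds(F[2], S, alpha)
    if op == "iff":
        return holds(F[1], S, alpha) == holds(F[2], S, alpha)
    if op in ("forall", "exists"):
        # Extend the assignment by each possible value of the bound variable.
        results = (holds(F[2], S, {**alpha, F[1]: d}) for d in S["domain"])
        return all(results) if op == "forall" else any(results)
    raise ValueError(f"unknown connective: {op}")


# Two three-element structures for the language 0 + =.
Z3 = {"domain": [0, 1, 2],
      "funcs": {"0": lambda: 0, "+": lambda a, b: (a + b) % 3},
      "preds": {"=": lambda a, b: a == b}}
SAT = {"domain": [0, 1, 2],
       "funcs": {"0": lambda: 0, "+": lambda a, b: min(a + b, 2)},
       "preds": {"=": lambda a, b: a == b}}

# The sentence "for all x there exists y such that x + y = 0".
C4 = ("forall", "x", ("exists", "y", ("pred", "=", ("+", "x", "y"), ("0",))))

print(holds(C4, Z3), holds(C4, SAT))
```

The sentence holds in the first structure (addition modulo 3) and fails in the second (addition truncated at 2), so only the first is a model of it.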
By completeness, a consistent set A of formulas, which includes the axioms of equality, has a model M. M need not be an E-model; however an E-model can be constructed from M. It follows that, in considering systems of axioms where = is in the language and the axioms of equality are assumed, only E-models need be considered. An outline of the construction of an E-model will be given; see for example [Dowd1] for details.

A binary relation satisfying the first three axioms of equality is called an equivalence relation. Given an equivalence relation ≡, let [x] = {y : y ≡ x}; [x] is called the equivalence class of x. By the axioms, x ∈ [x], and two equivalence classes are either disjoint or equal.

A binary relation on the domain of a structure D which satisfies all the axioms of equality is called a congruence relation. A structure D/≡ may be constructed, called the quotient of D by ≡. This has as the elements of its domain, the equivalence classes. The value P([x_1], …, [x_n]) for a predicate symbol P may be defined as P(x_1, …, x_n); the axioms ensure that the value depends only on the equivalence classes, and not the particular choice x_1, …, x_n of “representatives” of the classes. Similarly f([x_1], …, [x_n]) may be defined as [f(x_1, …, x_n)].

If α is an assignment in D, let α′ be the assignment in D/≡ which assigns to x the value [α(x)]. A straightforward induction shows that for any formula F, F̂(α) in D equals F̂(α′) in D/≡. In particular, if M is a model of A and ≡ is the interpretation of =, then M/≡ is a model of A. Clearly, it is an E-model.

Assignments are somewhat cumbersome, and are used in mathematical logic for the definition of the semantics of formulas, etc. There is a more convenient method of referring to the semantics of a formula, which is in common use and will be used in this text (assignments will be used occasionally also). Suppose F is a formula, and v⃗ = v_1, …
, v_k is a list of variables which includes the free variables of F. Given elements x⃗ = x_1, …, x_k in a structure S, let F_v⃗(x⃗) be F̂(a) where a assigns x_i to v_i for 1 ≤ i ≤ k. It is common practice to use F(x⃗) as an abbreviation for F_v⃗(x⃗), when the explicit list of the variables is not needed. Another variation in use is F(x̊_1, …, x̊_k); the variables are x_1, …, x_k, and x̊_i is assigned to x_i for 1 ≤ i ≤ k. The letter k will frequently be used to denote the length of a list v⃗.

Thus, F_v⃗ is a k-ary predicate on S. A predicate P which is F_v⃗ for some F and v⃗ is said to be definable; the formula F defines P in S. For a formula to define a predicate, a correspondence must be given between the argument places of the predicate and the free variables of the formula. The value of the predicate depends only on the values assigned to the free variables; additional variables are allowed for convenience.

7. Models of Peano arithmetic.

Models of Peano arithmetic have become a topic of interest in mathematical logic, [Kaye] being one reference on the subject. Let L_A denote the language 0 s + · =. Let N denote the structure of the non-negative integers over this language. This may be defined in set theory; facts to be given here provide some description of it. For a nonnegative integer n, let n̄ be the term consisting of 0 followed by n occurrences of s; this is called the numeral for n.

Given a structure D in a language L, let Th(D) be the set of formulas in L which are true in D. Th(D) is called the theory of the structure D. Let PA denote the formulas which are provable from Peano’s axioms. Let Q denote the formulas which are provable from the first 6 of Peano’s axioms, and the formula x ≠ 0 ⇒ ∃y(x = y^s).

Let D_1 and D_2 be structures for a language L. Let ˆ denote the interpretation in D_1, and ˜ the interpretation in D_2. D_2 is said to be a substructure of D_1 if the domain of D_2 is a subset of the domain of D_1 and the following requirements hold, where x_1, …, x_n ∈ D_2.
- For each predicate P, P̃(x_1, …
, x_n) if and only if P̂(x_1, …, x_n).
- For each function f, f̃(x_1, …, x_n) = f̂(x_1, …, x_n).
- For each constant c, c̃ = ĉ.

A function h : D_1 ↦ D_2 is said to be a homomorphism if the following requirements hold, where x_1, …, x_n ∈ D_1.
- For each predicate P, P̃(h(x_1), …, h(x_n)) if and only if P̂(x_1, …, x_n).
- For each function f, f̃(h(x_1), …, h(x_n)) = h(f̂(x_1, …, x_n)).
- For each constant c, c̃ = h(ĉ).

The third requirement is redundant, since a constant is a 0-ary function symbol. Some authors (such as [Dowd1]) weaken the requirement for predicates, and call a homomorphism as above a strong homomorphism; others (such as [Sacks1]) give the above definition. It is readily seen that if h is a homomorphism then h[D_1] may be made into a substructure of D_2 in a unique way (or see [Dowd1]). If h is an injection then it is called an isomorphic embedding of D_1 in D_2. If h is a bijection then it is called an isomorphism of D_1 with D_2.

If D is any structure for L_A, the predicate x ≤ y is defined by the formula ∃w(y = x + w). The following are some basic facts concerning the above defined concepts. Let M denote a model of Q.
1. Th(N) has models other than N; such models are called nonstandard.
2. Q ⊆ PA ⊆ Th(N).
3. The map h which sends n to the interpretation in M of the numeral n̄ is an isomorphic embedding of N in M.
4. If y ∈ M and y ≤ h(n) for some n ∈ N then y = h(m) for some m ∈ N (h[N] is said to be an initial segment of M).
5. Suppose M satisfies the “second order induction axiom”, that is, for any subset S ⊆ M, if 0 ∈ S, and ∀x(x ∈ S ⇒ x^s ∈ S), then ∀x(x ∈ S). Then h is an isomorphism.

Fact 1 was first observed by T. Skolem in 1933; a proof is as follows. Let ∞ be a new non-logical symbol, and add to Th(N) the formulas n̄ < ∞ for each integer n. If the enlarged set of formulas were inconsistent, there would be some finite set of the added formulas which, when added to Th(N), would result in an inconsistent system.
But this is impossible, because the ordinary integers with a large enough value assigned to ∞ would be a model. Thus, the enlarged set has a model, and this is a model of Th(N) which contains an element greater than every “standard” integer.

To prove fact 2 it is only necessary to give a proof in PA of x ≠ 0 ⇒ ∃y(x = y^s); this is an easy exercise, or may be found in [Yasuhara].

Fact 3 follows from the following facts, where ⊢ denotes provability in Q and k̄ denotes the numeral for k.
- If k + l = m then ⊢ k̄ + l̄ = m̄.
- If k · l = m then ⊢ k̄ · l̄ = m̄.
- If k ≠ l then ⊢ k̄ ≠ l̄.

Fact 4 follows from the additional fact
- ⊢ x ≤ k̄ ⇒ (x = 0̄ ∨ ··· ∨ x = k̄).

Proofs of these facts can be found in [Yasuhara].

To prove fact 5, let S be h[N]. The axiom of fact 5 is called “second order” because it involves the use of subsets of the universe of discourse, and must be formalized within set theory (or at least an adequate fragment of it). Together with the preceding facts, it may be seen that second order methods are stronger than strict first-order methods.

8. The real numbers.

Like the integers, the real numbers are fundamental mathematical objects, which are familiar from everyday life, and form a mathematical universe of discourse. The real numbers may be constructed from the non-negative integers N in informal set theory, and second order axioms can be given which completely characterize the structure. It is valuable to first construct some substructures which are themselves fundamental mathematical objects. The structures to be constructed are the integers Z, the rational numbers Q, and the real numbers R. Some families of structures will be defined, of which the preceding structures are important examples.

The language of commutative rings is 0 1 + · =.
The axioms for commutative rings are
C1 (x + y) + z = x + (y + z)
C2 x + y = y + x
C3 x + 0 = x
C4 For all x there exists y such that x + y = 0
C5 (x · y) · z = x · (y · z)
C6 x · y = y · x
C7 x · 1 = x
C8 x · (y + z) = x · y + x · z

Various additional facts can be shown readily from the axioms; these may be found in any of numerous introductions to abstract algebra, including [Dowd1]. In particular, subtraction may be defined, and its basic laws proved.

N is not a commutative ring, because axiom C4 does not hold. N can easily be enlarged to a structure which is a commutative ring, by adding the negative integers. One method of doing this is as follows. On N × N, define the binary functions
- ⟨m_1, n_1⟩ + ⟨m_2, n_2⟩ = ⟨m_1 + m_2, n_1 + n_2⟩ and
- ⟨m_1, n_1⟩ · ⟨m_2, n_2⟩ = ⟨m_1 m_2 + n_1 n_2, m_1 n_2 + m_2 n_1⟩;
and the binary predicate
- ⟨m_1, n_1⟩ ≡ ⟨m_2, n_2⟩ if and only if m_1 + n_2 = n_1 + m_2.

By straightforward if tedious calculation ≡ is verified to be a congruence relation on N × N with respect to + and ·. The equivalence class [⟨m, n⟩] will represent m − n. In the quotient (N × N)/≡, + and · are defined by the above equations. Another straightforward calculation shows that the quotient is a commutative ring, with [⟨0, 0⟩] as 0, [⟨1, 0⟩] as 1, and [⟨m, n⟩] + [⟨n, m⟩] = 0. This is the ring Z. The function h where h(n) = [⟨n, 0⟩] is an isomorphic embedding of N in Z.

A binary predicate ≤ on a set D is said to be a partial order if the following hold.
1. x ≤ x (reflexive law)
2. x ≤ y and y ≤ z imply x ≤ z (transitive law)
3. x ≤ y and y ≤ x imply x = y (antisymmetry law)

A partial order is a linear order if the following also holds.
4. x ≤ y or y ≤ x

The subset order on Pow(U) for a set U is an example of a partial order which is not a linear order (provided U has at least two elements). The relation ≤ on N, defined to hold if ∃w(y = x + w), is a linear order. Given a partial order ≤, the predicate x < y may be defined by the formula x ≤ y ∧ x ≠ y.
This relation is called the strict part of the partial order, and satisfies the transitive law and x ≮ x. On the other hand, given such a predicate the relation x < y ∨ x = y is a partial order.

If ≤ is a partial order on D and S ⊆ D then x ∈ S is said to be a least element of S if x ≤ y for all y ∈ S. An element x ∈ D is said to be an upper bound for S if y ≤ x for all y ∈ S. An upper bound x for S is a least upper bound if x ≤ x′ whenever x′ is an upper bound.

An ordered commutative ring is one where a unary predicate P (positive) has been added to the language, and satisfying the following additional axioms.
O1 ¬P(0).
O2 If x ≠ 0, exactly one of P(x) or P(−x) holds.
O3 P(x) ∧ P(y) ⇒ P(x + y).
O4 P(x) ∧ P(y) ⇒ P(x · y).

Properties which follow immediately include the following:
- 1 is positive (unless 0 = 1 and the ring is trivial);
- the relation P(x − y) is the strict part x > y of a linear order x ≥ y on the ring;
- if x < y then x + z < y + z, and if x ≤ y then x + z ≤ y + z; and
- if x < y then −y < −x, and if x ≤ y then −y ≤ −x.

The absolute value |x| is defined to be x if x is positive or 0, else −x. This satisfies the triangle inequality |x + y| ≤ |x| + |y|. Axioms can be given using the order predicate; using positivity results in a slightly simpler set of axioms.

Z is an ordered commutative ring; the elements [⟨n, 0⟩] for n ≠ 0 constitute a set of positive elements. If M is any ordered commutative ring, mapping 0 and 1 to 0 and 1 induces a unique isomorphic embedding of Z in M. The following second order axiom ensures that the embedding is in fact an isomorphism.
- If S ⊆ M is nonempty and bounded below then S has a least element.

A proof will be outlined. Call elements of the image of the embedding “integers”. There can be no element greater than every integer. If there were, let S be the set of such elements (it is bounded below by any integer), and let a be the least element of S. Then a − 1 ∉ S, so a − 1 ≤ m for some integer m, whence a ≤ m + 1, a contradiction.
There can be no element less than every integer; if a is such then −a is greater than every integer. Suppose m < a < m + 1 where m is an integer. Then 0 < b < 1 where b = a − m. The set {b^j : j ∈ N} is then a set which is bounded below but has no least element, contradicting the axiom.

A field is a commutative ring which satisfies the following additional axioms.
F1 For all x, if x ≠ 0 then there exists y such that x · y = 1
F2 0 ≠ 1

An ordered field is an ordered commutative ring satisfying F1 and F2.

Z is not a field, because there is no x such that 2 · x = 1, as may be easily verified. Z may be enlarged, to construct a field, as follows (in fact this construction may be carried out in any “integral domain”, which is a commutative ring satisfying some additional axioms). Let Z^≠ denote the nonzero elements of Z. On Z × Z^≠, define the binary functions
- ⟨m_1, n_1⟩ + ⟨m_2, n_2⟩ = ⟨m_1 n_2 + m_2 n_1, n_1 n_2⟩ and
- ⟨m_1, n_1⟩ · ⟨m_2, n_2⟩ = ⟨m_1 m_2, n_1 n_2⟩;
and the binary predicate
- ⟨m_1, n_1⟩ ≡ ⟨m_2, n_2⟩ if and only if m_1 n_2 = m_2 n_1.

By straightforward calculation ≡ is verified to be a congruence relation on Z × Z^≠ with respect to + and ·. The equivalence class [⟨m, n⟩] will represent m/n. In the quotient (Z × Z^≠)/≡, + and · are defined by the above equations. Another straightforward calculation shows that the quotient is a field, with [⟨0, 1⟩] as 0, [⟨1, 1⟩] as 1, and, provided m ≠ 0, [⟨m, n⟩] · [⟨n, m⟩] = 1. This is the field Q. The function h where h(n) = [⟨n, 1⟩] is an isomorphic embedding of Z in Q.

Q is an ordered field; the elements [⟨m, n⟩] where m, n > 0 constitute a set of positive elements. If M is any ordered field, mapping 0 and 1 to 0 and 1 induces a unique isomorphic embedding of Q in M. Clearly Q is the unique ordered field which is isomorphically embedded in any ordered field; this seems to be the best uniqueness property for Q.

The rational numbers suffer from a deficiency.
Let S = {q ∈ Q : q^2 < 2}; it is not difficult to show that if S has a least upper bound r then r^2 = 2; and there is no r ∈ Q such that r^2 = 2 (this is proved in the ancient Greek text “Euclid’s Elements”). Thus, S does not have a least upper bound in Q.

Q can be enlarged, so that the deficiency just mentioned is eliminated. This was an important issue in the history of mathematics, and its resolution was important to early set theory. See [MacTutor, Real numbers] for remarks on the history of the subject. The construction to be outlined below can be found in numerous references, [Rudin] for example.

A linearly ordered set D is said to have the least upper bound property if, whenever S ⊆ D is nonempty and has an upper bound, then S has a least upper bound. Q does not have this property. One method of constructing the real numbers is to enlarge Q to a linearly ordered set which does have the property. It turns out that there is exactly one way to do this.

If D is a set with a partial order on it, say that a subset S ⊆ D is ≤-closed if x ∈ S ∧ w ≤ x ⇒ w ∈ S. Considering Q with its usual order, define a cut to be a set of rationals which is nonempty, bounded above, ≤-closed, and has no greatest element. Let R be the set of cuts; R will be equipped with interpretations for 0 1 + · = P, to produce a structure for this language. For q ∈ Q let q^< denote {r ∈ Q : r < q}; this is readily seen to be a cut.

To begin with, some facts about R will be proved using only the order ≤ on Q; these facts are of interest in themselves. A linear order is said to be a dense linear order without endpoints if it satisfies the additional axioms ∀x∀y(x < y ⇒ ∃z(x < z < y)), ∀x∃y(y < x), and ∀x∃y(y > x). Later in the section it will be shown that if such a structure is countably infinite then it is isomorphic as a linear order to Q; for now only the easily verified fact that Q is such an order is needed.
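
The deficiency of Q noted above can be made concrete with exact rational arithmetic. The following is a small sketch, not taken from the text: the map q ↦ (2q + 2)/(q + 2) is a standard device, and the names nudge and in_S are hypothetical. Writing q′ = (2q + 2)/(q + 2) for positive q, one computes q′^2 − 2 = 2(q^2 − 2)/(q + 2)^2 and q′ − q = (2 − q^2)/(q + 2). Hence every positive member of S has a larger member of S above it (so the set of rationals below some positive member of S is a cut with no greatest element), and every positive rational upper bound of S has a smaller rational upper bound below it (so S has no least upper bound in Q).

```python
from fractions import Fraction


def in_S(q):
    """Membership in S = {q in Q : q^2 < 2}."""
    return q * q < 2


def nudge(q):
    """For positive rational q, return (2q + 2) / (q + 2).
    The result stays on the same side of the missing square root of 2
    as q does, but is strictly closer to it."""
    return (2 * q + 2) / (q + 2)


q = Fraction(1)   # a positive member of S
r = Fraction(2)   # a rational upper bound for S
for _ in range(5):
    q_next, r_next = nudge(q), nudge(r)
    assert in_S(q_next) and q_next > q            # a larger member of S
    assert not in_S(r_next) and 0 < r_next < r    # a smaller upper bound
    q, r = q_next, r_next

print(q, r)
```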
The notation sup(S) is commonly used for the least upper bound of a subset S of a partially ordered set; henceforth it will be adopted. The notation is derived from the fact that “supremum” is a synonym for “least upper bound”. The notation inf(S) is used for the greatest lower bound (infimum).

A map h between linear orders is said to be order-preserving if x ≤ y ⇒ h(x) ≤ h(y); suppose h is such a map. It is easy to see that h is an isomorphic embedding if and only if x < y ⇒ h(x) < h(y); and in this case h(x) < h(y) ⇒ x < y. Such a map will be said to be strictly order-preserving. A subset S of a linear order is said to be order-dense if whenever x < y then there is a q ∈ S such that x < q < y.

The subset relation induces a partial order on R. To simplify the notation, let p, q, r denote elements of Q and x, y, z elements of R. If q ∉ x then q is an upper bound for x; for if r ∈ x, q ≤ r cannot hold, else q ∈ x, whence r < q. Thus, given x, y, and q ∈ y − x, x ⊂ y follows; this shows that ⊆ is a linear order on R.

Suppose x ⊂ y; then there is some q ∈ y − x. Clearly q^< ⊂ y. Also, x ⊆ q^<, and x = q^< if and only if q = sup(x). In the latter case, x ∪ {q} must be a proper subset of y, because y has no greatest element, and therefore there is an r such that q < r and r ∈ y. Replacing q by r if necessary, a q has been found such that x ⊂ q^< ⊂ y; in particular, the subset order on R is dense.

If x ∈ R then x is bounded above, so there is a q with x ⊂ q^<. Also, there is a q ∈ x, and q^< ⊂ x. In particular, the subset order on R has no endpoints.

Let h_R denote the map from Q to R, where h_R(q) = q^<. Using facts already observed, it follows that h_R is an isomorphic embedding of linear orders.

R has the least upper bound property. Indeed, if S is a nonempty set of cuts which has an upper bound let b = {q : ∃x ∈ S(q ∈ x)} (readers who are familiar with infinite unions will recognize that this is just the union of the members of S).
Then b is nonempty, bounded above, ≤-closed, and has no greatest element; that is, it is a cut. It is the least upper bound because the infinite union is the least upper bound in the subset order; this will be shown in section 11.

To summarize, R has the following properties.
1. It is a dense linear order without endpoints.
2. It has the least upper bound property.
3. There is an isomorphic embedding of Q.
4. The image of this embedding is an order-dense subset.

Suppose that M is any linear order having properties 1-4 above, and let h denote the embedding. For x ∈ M let C_x = {q ∈ Q : h(q) < x}. Then C_x is nonempty (there is some q with h(q) < x), C_x is bounded above (there is some q with h(q) > x), C_x is ≤-closed (r < q ⇒ h(r) < h(q)), and C_x contains no largest element (if h(q) < x then there is an r such that h(q) < h(r) < x). Thus, C_x is a cut, and so there is a map g such that the following hold.
a. g : M ↦ R.
b. If x < y then g(x) < g(y) (since x < h(q) < y for some q).
c. g ∘ h = h_R (g(h(q)) = {r : h(r) < h(q)} = {r : r < q} = h_R(q)).

In fact, if g is any map having properties a-c then g(x) = C_x. This follows because q ∈ C_x if and only if h(q) < x if and only if g(h(q)) < g(x) if and only if h_R(q) < g(x) if and only if q ∈ g(x).

Suppose X ∈ R; then h[X] is nonempty and bounded above, so sup(h[X]) exists. Letting x = sup(h[X]), it is not difficult to verify that g(x) = X, that is, q ∈ X if and only if h(q) < sup(h[X]).

To summarize, if M is any linear order having properties 1-4 above, where h is the isomorphic embedding, then there is a unique map g having properties a-c above, and it is an isomorphism. Also, any x ∈ M equals sup(h[C_x]) where C_x is as above; this may be seen because it is true of h_R.

Define a commutative group to be a structure in the language 0 + = which satisfies axioms C1-C4. An ordered commutative group is a commutative group, with P added to the language, satisfying axioms O1-O3.
Given h : Q ↦ M with properties 1-4, there is a unique commutative group structure on M which makes h an isomorphic embedding of ordered commutative groups. Indeed (writing q for h(q)), q < x + y if and only if ∃r, s(r < x ∧ s < y ∧ q = r + s), and it follows that x + y must equal sup({r + s : r < x ∧ s < y}). In particular, having constructed R as the Dedekind cuts in Q, considered as a linear order, there is a unique way of defining the function + on R so that h_R is an isomorphic embedding of ordered commutative groups.

Multiplication may be handled similarly; given h : Q ↦ M with properties 1-4, positive elements x, y, and positive q, q < x · y if and only if ∃r, s(0 < r < x ∧ 0 < s < y ∧ q = r · s), and it follows that x · y must equal sup({r · s : 0 < r < x ∧ 0 < s < y}). The function · may then be extended uniquely by algebra (that is, by logic from the axioms for ordered fields) to all pairs x, y ∈ M.

Given an ordered field M, it is easily seen that there is a unique isomorphic embedding of Q in M. From these facts, any two ordered fields having the least upper bound property are isomorphic by a unique isomorphism. R is such a structure, and this may be taken as a formal definition of the real numbers.

Returning to the topic of countable dense linear orders without endpoints, let A and B be two such. The assumption that A is countably infinite means that there is a bijection a : N ↦ A. The convention of writing a_n for a(n) is a frequently used one, and one may say, “let A be enumerated as a_0, a_1, …”. Likewise, let B be enumerated as b_0, b_1, …. The following procedure (the “back and forth” procedure) produces a 1-1 correspondence between A and B, which is an order isomorphism. Let A_d be the elements of A which have been assigned a value so far, and similarly for B_d. Repeat the following.
a. Let m be smallest such that a_m ∉ A_d. Assign to a_m an element of B which bears the same relation to B_d which a_m bears to A_d.
b. Proceed similarly, with B and A exchanged.
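
The back and forth procedure can be run for finitely many rounds to see the partial isomorphism grow. The sketch below is an illustration under stated assumptions rather than part of the text: A is taken to be Q and B the dyadic rationals (both countable dense linear orders without endpoints), the particular enumerations rationals and dyadics are hypothetical choices, and the element “bearing the same relation” is chosen as the first one in the enumeration that lies in the correct gap, which is one way among many of making the choice definite.

```python
from fractions import Fraction
from itertools import count
from math import gcd


def rationals():
    """Enumerate every rational exactly once."""
    yield Fraction(0)
    for total in count(2):
        for q in range(1, total):
            p = total - q
            if gcd(p, q) == 1:
                yield Fraction(p, q)
                yield Fraction(-p, q)


def dyadics():
    """Enumerate every dyadic rational k / 2**j exactly once."""
    yield Fraction(0)
    for total in count(1):
        for j in range(total):
            k = total - j
            if j == 0 or k % 2 == 1:      # keep only lowest terms
                yield Fraction(k, 2 ** j)
                yield Fraction(-k, 2 ** j)


def extend(x, fwd, bwd, enum_other):
    """Assign to x a partner lying in the same gap of the other order."""
    below = [fwd[u] for u in fwd if u < x]
    above = [fwd[u] for u in fwd if u > x]
    lo = max(below) if below else None
    hi = min(above) if above else None
    # Density and the absence of endpoints guarantee the search succeeds.
    for y in enum_other():
        if (lo is None or lo < y) and (hi is None or y < hi):
            fwd[x], bwd[y] = y, x
            return


def back_and_forth(enum_a, enum_b, rounds):
    f, g = {}, {}                          # f : A -> B and its inverse g
    for _ in range(rounds):
        a = next(x for x in enum_a() if x not in f)   # step a
        extend(a, f, g, enum_b)
        b = next(y for y in enum_b() if y not in g)   # step b
        extend(b, g, f, enum_a)
    return f


f = back_and_forth(rationals, dyadics, 8)
for a in sorted(f):
    print(a, "->", f[a])
```

After n rounds the first n elements of each enumeration have been assigned, which is why the limit of the procedure is defined on all of A and onto all of B.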
Although of peripheral interest, the relation between a rigorous theory of the real numbers, and a rigorous theory of the plane of plane geometry, is of sufficient interest that it was considered by David Hilbert, after whom one system of axioms for plane geometry is named. A treatment of this topic may be found in Appendix 1.

Another topic of peripheral interest concerns weaker systems than full set theory in which the theory of the real numbers can be developed. This topic has been of recent interest; see [Simpson].

9. Computability.

Computability theory is concerned with mechanical procedures involving formal objects. The need for such a theory was already evident in 1900, when David Hilbert asked whether there was “a process according to which it can be determined in a finite number of operations”, whether a polynomial with integer coefficients had an integer solution. A negative answer to this question (Hilbert’s tenth problem) was given in 1970 by Yuri Matijasevic; the formal theory of computability developed in the mid 1930’s was necessary to its solution. Computability theory has also proved useful in mathematical logic, as will be seen in the next section.

A mechanical procedure might do any of the following.
1. Enumerate a set of integers, or more generally a predicate as a set of n-tuples. Such a procedure “runs forever”, outputs only elements of the set, eventually outputs every element of the set, and in general may output the same element more than once.
2. Decide whether an integer is in a set (more generally whether an n-ary predicate holds). Such a procedure always halts after a finite number of steps, with a “yes” or “no” answer.
3. Compute the value of an n-ary function. Such a procedure always halts after a finite number of steps, and produces a numeric value.
4. Compute the value of an n-ary “partial function”.
Such a procedure may halt after a finite number of steps and produce a numeric value; or run forever and produce no output, in which case the value of the partial function is undefined.

The notion of a partial function is useful in computability theory, and occasionally in other discussions. An (n + 1)-ary predicate P(x⃗, y) (where x⃗ is used to denote x_1, …, x_n) is said to be single-valued if for all x⃗ there is at most one y such that P(x⃗, y). P is said to be total if for all x⃗ there is at least one y such that P(x⃗, y). Thus, an n-ary function is just an (n + 1)-ary predicate which is both single-valued and total. An n-ary partial function is only required to be single-valued.

Shortly, formal definitions of the predicates and functions computable using procedures of the above types will be given. An informal discussion is of value, though. In particular, informal arguments can be given showing how a procedure of one type can be converted into a procedure of another type. More generally, procedures can be (and usually are) given “informally”, relying on the experience of mathematics to conclude that the predicate or function computed by the procedure has a proper formal procedure. This principle (that informal procedures can be translated to formal ones) is often called “Church’s thesis” (although Church’s original statement was more specialized).

A predicate computed by a procedure P of type 1 (an enumeration procedure) can be computed by a procedure Q which halts on a given input if and only if the predicate holds (a semi-decision procedure). Given P, and given an input x⃗, run P, and halt if x⃗ appears. On the other hand, given a semi-decision procedure Q, at stage n run Q for n steps, on all inputs where x_i ≤ n for all i, and output x⃗ if Q halts on x⃗. The set of inputs on which a semi-decision procedure halts is called its domain.

It is interesting to note that already some characteristics of a “mechanical procedure” are apparent.
There should be some record of a finite amount of data, and some operations which can be performed on it in a step. It should also be noted that the “informal” transformation just given can be formalized, once a formal definition of a mechanical computation has been given.

A set (or predicate) which is the output of an enumeration procedure (or the domain of a semi-decision procedure) is called variously “computably enumerable”, “recursively enumerable”, or “semi-decidable”. Until recently (e.g., as in [Rogers]), the term “recursively enumerable” was preferred. However, this terminology is a “historical artifact”, and progressively since [Soare] the term “computably enumerable” has been seen as carrying less linguistic baggage.

A procedure of type 2 is called a decision procedure. A predicate computed by such a procedure is called variously “computable”, “decidable”, or “recursive”. Again, the term “computable” is displacing the older standard “recursive”. The term “decidable” remains in current use, especially in certain contexts, in particular logical theories.

If a predicate R is computable, then ¬R is, by simply reversing the roles of “yes” and “no”. Also, R is computably enumerable, by running forever instead of halting and answering “no”. If R is enumerated by procedure P, and ¬R is enumerated by procedure Q, a procedure for deciding R is as follows. Given an input x⃗, at alternate stages, run a step of P, or a step of Q. When x⃗ is enumerated, answer “yes” if P enumerated it, and “no” if Q enumerated it.

A partial function computable by a procedure of type 4 is said to be a “computable partial function” or “partial recursive function”, with the first terminology being the more recent. If φ is an n-ary partial function computed by procedure P, then as a set of (n + 1)-tuples, φ is enumerated by the following procedure. At stage n, run P for n steps on all inputs x⃗ with x_i ≤ n for all i. Whenever P yields a value y, output ⟨x⃗, y⟩.
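The stage-by-stage enumeration just described can be sketched in Python. This is only an illustration under assumptions not made in the text: a procedure is modeled as a generator which yields once per step and returns its value when it halts, only unary partial functions are treated, and the names run_for, enumerate_graph, and half_if_even are hypothetical.

```python
from itertools import count, islice

def run_for(proc, x, steps):
    """Run the generator-based procedure proc on input x for at most `steps` steps.
    Returns (True, value) if it halted within the budget, else (False, None)."""
    gen = proc(x)
    for _ in range(steps):
        try:
            next(gen)                    # perform one step
        except StopIteration as stop:
            return True, stop.value      # the procedure halted with a value
    return False, None

def enumerate_graph(proc):
    """At stage n, run proc for n steps on every input x <= n,
    and output the pair (x, y) whenever proc yields a value y."""
    for n in count():
        for x in range(n + 1):
            halted, y = run_for(proc, x, n)
            if halted:
                yield (x, y)

def half_if_even(x):
    """Toy partial function: x/2 for even x, undefined (runs forever) for odd x."""
    if x % 2 == 1:
        while True:
            yield
    for _ in range(x):                   # pretend the computation takes x steps
        yield
    return x // 2
```

Taking the first few outputs with islice(enumerate_graph(half_if_even), 20) shows the same pair being output at many stages, which is permitted for an enumeration procedure; no pair with an odd first component ever appears.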
On the other hand, if Q is a procedure enumerating φ as a set of (n + 1)-tuples, φ is computed by the following type 4 procedure. On input x⃗, enumerate the (n + 1)-tuples of φ. If an (n + 1)-tuple ⟨x⃗, y⟩ appears, produce y as the value and halt.

A function computable by a procedure of type 3 is said to be a “computable function” or “recursive function”, with the first terminology being the more recent. A computable function is just a computable partial function which is total.

Needless to say, there are many ways of giving a formal definition of computability. Further, as can be seen from the foregoing discussion, a formal definition can be given of, say, computable partial functions, and other classes of predicates and functions defined in terms of this, instead of more directly from the formal definition. Likewise, a formal definition might instead be given for a computably enumerable predicate. In fact, a definition of the latter type is already available. Using this as the basic definition is uncommon; usually some other definition is given, and the one to be given here is shown to be equivalent. An outline of such methods will be given in appendix 2. More extensive treatments of this topic can be found in any of numerous introductions to computability theory, including [Mendelson], [Yasuhara], and chapter 12 of [Dowd1]. Early workers in the subject include (in alphabetical order) Church, Gödel, Kleene, Markov, Post, and Turing.

The formal definition will involve formulas in the language with symbols 0, s, +, ·, =, called L_A in section 7. Recall that for n ∈ N, n̄ is the numeral with value n. Also, for a formula F, variables x_1, …, x_k, and terms t_1, …, t_k, F_{t_1/x_1,…,t_k/x_k} denotes the formula obtained from F by replacing each free occurrence of x_i by t_i. Let ⊢ denote provability from the axioms of the system Q defined in section 7. A k-ary predicate P is said to be computably enumerable if there is a formula F, with k free variables x_1, …, x_k, such that P(n_1, …, n_k) if and only if ⊢ F_{n̄_1/x_1,…,n̄_k/x_k}. A predicate P is computable if and only if both P and ¬P are computably enumerable. A partial function or total function is computable if and only if it is computably enumerable as a set of (n + 1)-tuples.

The formal definition is consistent with the informal one. The formulas provable from the axioms of Q (or any similar finite set of axioms) can be mechanically enumerated, as follows. A list is maintained of all formulas proved so far. At stage n, formulas which follow from those in the list by a rule are added; and also axioms involving the first n variables.

A technical point of considerable significance has been ignored in the previous paragraph. A computably enumerable set is a subset of N, whereas formulas are finite strings over an alphabet, which as given contains infinitely many variables, even if there are only finitely many non-logical symbols. Mapping formulas to integers is called “arithmetization of syntax”. This is an omnipresent ingredient of mathematical logic, first carried out by Gödel, in the course of proving the incompleteness theorems, to be described in the next section.

A mapping needs to be given, such that functions and predicates on the formulas correspond to computable functions and predicates on N. There are many ways of doing this. One simple method relies on a correspondence between strings over a finite alphabet and nonnegative integers. Let m be the size of the alphabet. The letters of the alphabet may be taken as the integers 1, …, m. The finite string l_{t−1} … l_0 is mapped to the integer Σ_i l_i m^i. This notation is called m-adic notation, and has advantages over the more familiar “m-ary” notation, where the letters are considered to be 0, …, m − 1. In m-adic notation, the map from strings to N is a 1-1 correspondence; the empty string denotes 0.
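The m-adic correspondence can be checked concretely with a short Python sketch. The function names madic_encode and madic_decode are hypothetical, and the alphabet is passed as a Python string whose k-th character (counting from 1) plays the role of the letter k; the leftmost letter of a string is taken as the most significant, matching l_{t−1} … l_0 above.

```python
def madic_encode(s, alphabet):
    """Map a string over `alphabet` to its m-adic value, where m = len(alphabet)
    and the letters stand for the integers 1, ..., m."""
    m = len(alphabet)
    value = 0
    for ch in s:                              # most significant letter first
        value = value * m + (alphabet.index(ch) + 1)
    return value

def madic_decode(n, alphabet):
    """Inverse map: the unique string over `alphabet` whose m-adic value is n."""
    m = len(alphabet)
    letters = []
    while n > 0:
        d = n % m
        if d == 0:                            # digits range over 1..m, never 0
            d = m
        letters.append(alphabet[d - 1])
        n = (n - d) // m
    return "".join(reversed(letters))
```

With the two-letter alphabet "ab", the integers 0, 1, 2, 3, 4, 5, 6 decode to the empty string, a, b, aa, ab, ba, bb respectively, so each integer has exactly one string.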
In m-ary notation there are multiple strings corresponding to the same integer, unless leading 0’s are disallowed; and 0 rather than the empty string customarily denotes 0.

A countably infinite alphabet can be replaced by a finite alphabet. In the case of a first order language with finitely many non-logical symbols, the variable x_i can be replaced by the string xN where N is the 2-adic notation for i.

It is sometimes useful to be able to distinguish between a string s over an m letter alphabet, and the numeric value corresponding to it under m-adic notation. In such cases, a common notation for the integer is ⌜s⌝, and the integer is called the Gödel number of the string.

In appendix 2, an outline will be given of an argument that various functions and predicates are computable. This is an illustration of Church’s thesis, in that these functions and predicates are readily computed by informally given procedures; readers with experience with a programming language for example should be able to see that such procedures could be expressed in the language.

In particular, there is a computably enumerable binary predicate U, whose pairs are the pairs ⟨f, n⟩ such that n is in the set enumerated by the formula F with one free variable, where f = ⌜F⌝. It follows from this that the unary predicate K(x) = U(x, x) is computably enumerable; indeed, if F is a formula for U in the free variables x and y then F_{x/y} is a formula for K.

Theorem 1. K^c is not computably enumerable.
Proof: Suppose to the contrary that F was a formula which yielded an enumeration of K^c, and let f = ⌜F⌝. Then f ∈ K^c if and only if U(f, f) (because U with first argument f enumerates K^c), if and only if f ∈ K (by definition of K). Thus, the assumption of the existence of F leads to a contradiction, so F does not exist. ⊳

As a corollary, K is not decidable. This is a fact of interest; but K is a “contrived” set (although a basic one in computability theory).
Undecidability of a more “naturally occurring” set will be proved in the next section.

10. Independence.

The emergence of formal logic in the late 19th century showed that mathematical reasoning could be reduced to the formal manipulation of strings of symbols. The question then arose, what could be proved about formal logic using “finitary” methods, which involve reasoning directly about the formal objects without introducing abstract concepts. In the 1920’s there were attempts to make the notion of finitary methods more precise; see [SEP, Hilbert’s Program].

David Hilbert asked (Hilbert’s second problem, 1900) whether a system of axioms could be given for arithmetic, and a proof given of the consistency of the axioms, which was finitary. Later (Hilbert’s program, 1920) he asked the same question for a system of axioms for all of mathematics. It is generally recognized that the second incompleteness theorem, which was proved by Kurt Gödel in 1931, provided a negative answer to both questions.

The first incompleteness theorem, proved in the same paper, already showed that the application of formal logic to mathematics was unable to provide a method of proving all true statements of mathematics. As observed in section 6, the axioms and rules of formal logic are complete. Thus, the insufficiency is due to the fact that no allowable system of non-logical axioms for all of mathematics can be given.

It must be specified what systems of axioms are allowable; the set of all true statements for example should not be. It has become clear that the correct notion of an allowable system of axioms is one which is computably enumerable.

Fix a first order language L. A theory over L is defined to be a set S of formulas, which is closed under logical consequence, that is, such that if F_1, …, F_k are in S and F_1, …, F_k ⊢ G then G is in S. The notation ⊢_T F will be used to denote that the formula F is in the theory T.
Recall from section 6 that a theory is consistent if for no sentence F are both F and ¬F in S; and that a theory is consistent if and only if it has a model. A theory is inconsistent if it is not consistent.

A theory S is said to be complete if for all sentences F, either F or ¬F is in S. Recall from section 7 that for a structure D for L, Th(D) is the set of formulas which are true in D. Th(D) is an example of a consistent and complete theory. A theory is incomplete if it is not complete.

A sentence F is independent of a theory S if neither F nor ¬F is in S. A theory is clearly incomplete if and only if there is an independent sentence. However the fact that a particular sentence is independent may be of particular interest.

If L has only finitely many non-logical symbols (this restriction can of course be relaxed) a method was given in the preceding section for assigning integers (Gödel numbers) to formulas. A theory is said to be computably enumerable if its set of Gödel numbers is; and decidable if its set of Gödel numbers is.

Recall the theory PA from section 7. Since PA is a subset of Th(N), it is consistent. It will next be shown that PA is not complete; this is the “first incompleteness theorem”.

Let Sub(x, y) be the function whose value is ⌜F_{t/v}⌝ if x = ⌜F⌝ where F is a formula with one free variable v, and y = ⌜t⌝ where t is a term. Let Num(x) be the function whose value is ⌜x̄⌝, where as in section 7 x̄ is the term representing x. It is shown in appendix 2 that these are computable.

Lemma 1. Suppose F is a formula with one free variable v. Then there is a sentence G such that ⊢_Q G ⇔ F(N_G), where N_G is the numeral for the Gödel number ⌜G⌝ of G.
Proof: The notation will be abused by writing y = e(x) for the formula E_{x/x,y/y}, where E is the formula for e. Let H be ∃w(w = Sub(v, Num(v)) ∧ F_{w/v}), let N_H be the numeral for ⌜H⌝, and let G be H_{N_H/v}.
By definition of Num, ⌜N_H⌝ = Num(⌜H⌝); and since Num is computable, ⊢_Q ⌜N_H⌝ = Num(⌜H⌝), or ⊢_Q ⌜N_H⌝ = Num(N_H). By definition of Sub, ⌜G⌝ = Sub(⌜H⌝, ⌜N_H⌝), and since Sub is computable, ⊢_Q ⌜G⌝ = Sub(⌜H⌝, ⌜N_H⌝). By predicate logic, ⊢_Q Sub(N_H, Num(N_H)) = N_G. Again by predicate logic, ⊢_Q F_{N_G/v} ⇔ H_{N_H/v}, and the right side is G. ⊳

Let Prf_T(x) be the function which is 0 if x is a proof in T, else 1. Let Thm(x) be the function whose value is the last formula of the proof x; Thm is shown to be computable in appendix 2.

Theorem 2. Suppose T is an extension of Q, possibly in a language with finitely many additional non-logical symbols, which is consistent, and such that Prf_T(x) is computable. Let F be the formula ¬∃y(Prf_T(y) ∧ Thm(y) = x), and let G be the sentence obtained as in lemma 1. Then ⊬_T G.
Proof: Suppose G were provable; then the following would hold.
a. ⊢_T G.
b. For some m ∈ N, ⊢_T Prf_T(m) ∧ Thm(m) = N_G (by computability of Prf_T).
c. ⊢_T G ⇔ ¬∃y(Prf_T(y) ∧ Thm(y) = N_G) (by lemma 1 and the hypothesis Q ⊆ T).
By predicate logic, T is inconsistent, contradicting the hypotheses. ⊳

Corollary 3. PA is not complete.
Proof: N is a model of PA (this is provable in set theory). Since G is not provable, it is true in N (since it says it is not provable). Therefore ¬G is not provable either. ⊳

The methods of this section can be used to show that the axioms of set theory are incomplete, indeed will remain so if they are expanded. A great quantity of modern set theory is concerned with showing that specific statements of set theory are independent, and with implications which hold between independent statements.

Many other facts about PA and other theories containing Q can be proved. A discussion of two such will be given here; further discussion can be found in any of several references, including chapter 12 of [Dowd1] and [Shoenfield1].

Theorem 4.
Suppose T is an extension of Q, possibly in a language with finitely many additional non-logical symbols, which is consistent; then T is undecidable.
Proof: It is shown in appendix 2 that if P is a computable predicate then there is a formula F with free variable v such that if P(n) then ⊢_Q F_{n̄/v} and if ¬P(n) then ⊢_Q ¬F_{n̄/v}; say that F represents P. Let U_T(f, n) be the predicate which is true if and only if Sub(f, Num(n)) is in T, considered as a set of integers; the predicate U mentioned in section 9 is U_Q. It is easily seen that if F represents a computable predicate P, then U_T(⌜F⌝, n) = U(⌜F⌝, n) for all n. Let K_T(x) = U_T(x, x). As in theorem 9.1, there is no f such that K_T^c(n) = U_T(f, n) for all n. It follows that K_T^c is not computable. From this it follows that T is not computable, since a decision procedure for T would yield one for K_T^c. ⊳

In particular, PA is undecidable. Q was invented by A. Tarski, and used to prove undecidability of various theories. Some discussion may be found in chapter 12 of [Dowd1].

Call a theory T which satisfies the hypotheses of theorem 2 a G1 theory. Given such, let □F abbreviate ∃w(Prf_T(w) ∧ Thm(w) = v_F); the “placeholder” v_F will be used in various ways. The “derivability conditions” for T are the following.
1. If ⊢_T F then ⊢_T □F.
2. ⊢_T □(F ⇒ G) ⇒ □F ⇒ □G.
3. ⊢_T □F ⇒ □□F.
In condition 1, v_F denotes ⌜F⌝. In condition 2, v_F and v_G are variables and □(F ⇒ G) is an abbreviation for a function giving this formula from its subformulas. In condition 3, v_F is a variable and the inner □F is an abbreviation for a function giving this formula from F.

Call a G1 theory T a G2 theory if it satisfies the derivability conditions. It was observed in the proof of theorem 2 that the first condition holds for a G1 theory. The above notation is used because proofs can be made more readable; it is the notation used in “provability logic”. The following theorem is called Löb’s theorem.

Theorem 4. Suppose T is a G2 theory. If ⊢_T □F ⇒ F then ⊢_T F.
Proof: Using lemma 1 let G be such that ⊢ G ⇔ (□G ⇒ F); then ⊢ □(G ⇒ □G ⇒ F), whence ⊢ □G ⇒ □□G ⇒ □F. But ⊢ □G ⇒ □□G, so ⊢ □G ⇒ □F, and so ⊢ □G ⇒ F. Thus ⊢ G, so ⊢ □G, so ⊢ F. ⊳

Let Con(T) be the sentence ¬□(F_0 ∧ ¬F_0) for some arbitrarily chosen sentence F_0.

Corollary 5. If T is a G2 theory then Con(T) is not provable in T.
Proof: Apply Löb’s theorem with F being F_0 ∧ ¬F_0. ⊳

Corollary 5 is called the second incompleteness theorem. Although this will not be proved here, PA is a G2 theory, and so Con(PA) is not provable in PA. Con(PA) is provable in set theory; in fact it is provable in weaker theories, and this is a topic of interest in mathematical logic. A proof that a theory is a G2 theory involves a fair amount of labor; [HajPud], [Monk2], and [Smorynski] are among various references containing treatments.

11. ZFC.

The language of set theory is ∈, =. The axioms of equality are axioms of set theory. The other non-logical axioms are as follows; some make use of abbreviations which will be defined later, and are familiar from informal set theory.

Extensionality: ∀w(w ∈ x ⇔ w ∈ y) ⇒ x = y
Pairing: ∃x∀w(w ∈ x ⇔ w = u ∨ w = v)
Union: ∃x∀w(w ∈ x ⇔ ∃u(u ∈ y ∧ w ∈ u))
Power Set: ∃x∀w(w ∈ x ⇔ w ⊆ y)
Separation or subset: For any formula F where x does not occur free, ∃x∀w(w ∈ x ⇔ w ∈ y ∧ F)
Replacement: For any formula F, ∀x∀y∀z(F ∧ F_{z/y} ⇒ y = z) ⇒ ∀u∃v∀y(y ∈ v ⇔ ∃x(x ∈ u ∧ F))
Infinity: ∃x(∃w(w ∈ x ∧ IsEmpt(w)) ∧ ∀w(w ∈ x ⇒ ∃v(v ∈ x ∧ IsSuc(w, v))))
Foundation or regularity: ∃w(w ∈ x) ⇒ ∃w(w ∈ x ∧ ∀u(u ∈ w ⇒ ¬u ∈ x))
Choice: ∀w(w ∈ x ⇒ ∃v(v ∈ w)) ⇒ ∃f(IsChoiceFunc(f, x))

This set of axioms is known as ZFC, “Zermelo-Fraenkel with Choice”. Fraenkel added replacement to Zermelo’s original set, and choice is considered separately because of philosophical issues; in particular ZF is the axioms, with choice omitted. ZFC has been considered the “official” axiom system for mathematics since the 1930’s and even earlier.
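The set-building axioms in the list above can be tried out on hereditarily finite sets, modeled in Python by nested frozensets. This is an illustrative sketch only: it is a finite model of the operations the axioms assert to exist, not of ZFC, and the names EMPTY, pair, union, power_set, and separation are hypothetical.

```python
from itertools import combinations

EMPTY = frozenset()                      # the empty set

def pair(u, v):
    """The set whose members are exactly u and v (pairing)."""
    return frozenset({u, v})

def union(y):
    """The set of all w with w in u for some u in y (union)."""
    return frozenset(w for u in y for w in u)

def power_set(y):
    """The set of all subsets of y (power set)."""
    items = list(y)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def separation(y, pred):
    """The set of those w in y for which pred(w) holds (separation)."""
    return frozenset(w for w in y if pred(w))
```

For example, power_set(power_set(EMPTY)) has exactly two members, the empty set and the set whose only member is the empty set. Extensionality is built into the model, since two frozensets are equal exactly when they have the same members.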
The extensionality axiom defines how ∈ and = are related; the converse implication follows by the axioms of equality. The axioms pairing, union, and power set describe how to build up new sets. The separation axiom describes how to cut down a set, to those elements having some property. The replacement axiom states that, if a formula gives a partial function on the universe of all sets, then the image of a set is a set. The axiom of infinity states that an infinite set exists.

Further discussion of each axiom will be given. Some definitions will be given as well; these are useful in the formal development, and include several concepts used in informal set theory. The treatment will be as brief as possible; various references, including [Monk1] and [Jech2], provide a more extensive treatment.

The pairing axiom states that given sets u and v, there is a set x such that the 3-ary predicate ∀w(w ∈ x ⇔ w = u ∨ w = v) holds. Using extensionality, it follows that this set is unique; the notation {u, v} is used for it. In writing formulas, x = {u, v} is used as an abbreviation for the 3-ary predicate. This is common throughout mathematics, and is known as introduction by definition of function symbols.

The union axiom states that given a collection y of sets, there is a set x, where w ∈ x if and only if w ∈ u for some set u in the collection. By extensionality this set is unique; it is called the union of the collection y, and the notation ∪y is used to denote it. It may also be denoted ∪_{u∈y} u.

A set w is said to be a subset of the set x, written w ⊆ x, if ∀u(u ∈ w ⇒ u ∈ x). The power set axiom states that given a set y, there is a set x whose members are the subsets of y. The power set is unique; in this text the notation Pow(y) is used to denote the power set.

The set x stated to exist in the axiom of separation is unique, and x = {w ∈ y : F} may be written. The reader might wonder why the axiom is not simply ∃x∀w(w ∈ x ⇔ F). This axiom is incorrect: taking F to be w ∉ w yields Russell’s paradox.
There is an axiom system where it is correct, called Bernays-Gödel set theory (a treatment may be found in [Jech2]). This treats proper classes, to be defined in the next section, in a different manner than ZFC; but is essentially equivalent to ZFC for practical purposes.

The hypothesis of the replacement axiom requires that for any x there can be at most one y; degenerate cases where x or y does not occur free in F are allowed. Given a set u of x’s, the corresponding y’s can be collected into a set. Notation such as v = F[u] is introduced by some authors.

The notation w ∉ x is used for ¬w ∈ x. A set x is said to be empty if ∀w(w ∉ x). That there is a set follows by predicate logic; ∃x(x = x) is provable. By separation and extensionality there is a unique empty set; ∅ is used to denote it.

The notation IsEmpt(x) in the axiom of infinity can be replaced by x = ∅, and will no longer be used. Likewise, IsSuc(w, v) will no longer be used; it can be replaced by v = w ∪ {w}. This axiom implies the existence of an infinite set; but this requires a fair amount of formal development and further discussion is deferred to section 13.

By separation the set {w : w ∈ x ∧ w ∈ y} exists; it is called the intersection of x and y, and denoted x ∩ y. The requirement on w in the axiom of foundation is thus w ∩ x = ∅; w and x are said to be disjoint in this case. If w ∈ x is thought of as implying that w is simpler than x, the axiom of foundation states that a nonempty set x has an element which is as simple as possible among the elements of x. Such an element will be called ∈-minimal.

A set x is an ordered pair if and only if ∃u∃v(x = {{u}, {u, v}}); u is the first component and v is the second. The notation ⟨u, v⟩ is used for an ordered pair. Nested use of defined functions is handled by introducing existentially quantified variables. For example x = {{u}, {u, v}} can be written as ∃s∃t(s = {u} ∧ t = {u, v} ∧ x = {s, t}). A relation is a set of ordered pairs.
The domain of a relation is the set of its first components (a more formal definition is left to the reader), and the range is the set of its second components. A function is a relation f which is single-valued in the second component, i.e., ∀x∀y∀z(⟨x, y⟩ ∈ f ∧ ⟨x, z⟩ ∈ f ⇒ y = z).

A choice function on a set x of nonempty sets is a function f whose domain is x, such that if ⟨u, v⟩ ∈ f then v ∈ u. The terminology “f is a choice function on x” will be used, rather than IsChoiceFunc(f, x) as written in the statement of the axiom of choice. The axiom of choice states that a system of choices can be made, of elements from the sets of a collection of nonempty sets.

The pairing axiom is actually redundant; it is included for historical reasons, and because it is so fundamental. To prove it, first use the power set axiom to prove that Pow(Pow(∅)) exists and has two elements. Then use the replacement axiom with the formula (x = ∅ ∧ y = u) ∨ (x = {∅} ∧ y = v). Likewise the separation axiom is redundant. Given F as in the separation axiom, apply replacement with the formula F ∧ y = x (with variables renamed as necessary). Some authors include an axiom stating the existence of the empty set; as already seen this is redundant.

ZFC is accepted as the axiom system for mathematics on the basis of experience. It is possible to give arguments that the axioms are true facts about sets; see [Shoenfield2]. Basic facts of set theory may be proved using ZFC; indeed, this is evidence that ZFC is adequate. Outlines of such proofs will be given as necessary in later sections.

Here the fact mentioned in section 8, that ∪x is the least upper bound of the elements of x in the subset order, will be shown. Let y = ∪x (which recall equals {v : ∃w(w ∈ x ∧ v ∈ w)}). For w ∈ x, if v ∈ w then v ∈ y by definition of y; thus w ⊆ y. This shows that y is an upper bound. If z is any upper bound, and v ∈ y, then by definition v ∈ w for some w ∈ x. By the assumption that z is an upper bound, w ⊆ z, and so v ∈ z.
Thus, y ⊆ z has been shown, and y is the least upper bound. The greatest lower bound also exists, namely {v : ∀w(w ∈ x ⇒ v ∈ w)}, provided x is nonempty. This is a set, because it equals {v ∈ w_0 : ∀w(w ∈ x ⇒ v ∈ w)} where w_0 is any member of x. It is denoted ∩x. Letting y = ∩x, it may be seen that y is a lower bound, and for any lower bound z, z ⊆ y.

While ZFC has proved adequate for contemporary mathematics, as will be seen it suffers from the deficiency that basic questions of mathematics are independent of it. To settle independent questions ZFC must be enlarged. In current mathematics there is intensive research underway, as to how this should be done. Some references are: [Bagaria], [Dowd2], [Foreman], [Friedman], [Koellner1], [Steel1]. Of course, enlarging ZFC would settle some questions; but as observed in section 10, others would remain independent.

12. Proper classes.

If x ∈ x then {x} violates the axiom of foundation; thus, x ∉ x is a theorem of ZFC. The universe of discourse of set theory, that is, the collection of all sets, is denoted by the symbol V. This cannot be a set, else V ∈ V.

Mathematics has dealt with this situation by considering V to be some sort of collection which is not a set, so that special reasoning must be applied to such collections. Indeed, Cantor realized the need for such caution in his later work. In current common usage, such collections are called proper classes. A proper class is a “large subset of V”.

Suppose F is a formula. It might be provable in ZFC that {x : F} is a set. On the other hand it might be provable that it is not; indeed, it was just proved that {x : x = x}, which is V, is not a set. Other proper classes are commonly encountered in the development of set theory. From the point of view of ZFC, a proper class is a collection {x : F} which is not a set.
From a more general perspective these are only very few of the proper classes; however this perspective must remain intuitive, at least if arguments are to be carried out in ZFC.

Note that a set is a “small subset of V”, because the elements of a set are themselves sets, since in set theory every object is a set. This sometimes seems unreasonable to beginning students; but it is a fundamental fact of mathematics that taking this approach yields an axiom system for all of mathematics.

Since V is not a set, the application of formal logic to set theory encounters complications. This concerned some set theorists in the early stages of discussion, for example Skolem, but modern set theory accepts them as a fact of life. In particular, if ZFC is consistent then it has a countable model M, which is a set; and “uncountable” sets exist in M only as objects which satisfy the relevant formula, and are not “actually” uncountable.

V is the domain of the “natural” model of ZFC. The logical complications arise because this is not a set. The domains of mathematics are all sets, so the problem generally arises only in set theory. Other proper classes may be models of set theory, but in a sense which must be specified. An example is given in section 19.

The term “class” may be used for {x : F}, even if it will be shown later to be a set. If F is a formula with free variable x defining a class C, a convenient notational device is to write x ∈ C instead of the subformula F, in a formula. Also, the term “class” is sometimes used, rather than proper class; the context should clarify the usage.

13. Ordinals and cardinals.

The theory of ordinal and cardinal numbers has been a topic of set theory of fundamental importance since the earliest days of set theory, Cantor’s work in the 1870’s. The modern treatment involves some technical concepts, and even in an informal discussion some discussion of these must be given.
In developing basic set theory, facts must be proved in a certain order; for example there is yet no definition of the non-negative integers, so the notion of an infinite sequence x_i for i ∈ N cannot yet be used. The ordinal numbers must be defined first.

A set x is called transitive if ∈ satisfies a version of the transitivity law, namely, v ∈ w ∧ w ∈ x ⇒ v ∈ x. Informally speaking, x is closed under the iterated operation of taking elements. Although the notion of a transitive set is a technical one, it has turned out to be quite useful in set theory.

A transitive set x is called an ordinal if in addition the trichotomy law holds, that is, if v ∈ x ∧ w ∈ x ⇒ (v ∈ w ∨ v = w ∨ w ∈ v). Greek letters α, β, γ, δ will be used to denote ordinals, as is commonly done in set theory.

Lemma 1.
a. If x ⊆ α is transitive then x is an ordinal.
b. If x ∈ α then x is an ordinal.
c. If β ⊂ α then β ∈ α.
d. Either α ⊆ β or β ⊆ α.
Proof: Part a follows because if v, w ∈ x then v, w ∈ α, so they are related by ∈. For part b, x ⊆ α because α is transitive. If w ∈ x and v ∈ w then v, x ∈ α, so v ∈ x, v = x, or x ∈ v; but the latter two possibilities contradict foundation. Thus x is transitive, and part a applies. For part c, suppose γ is a ∈-minimal element of α − β. If δ ∈ γ then δ ∈ α, so δ ∈ β else γ is not ∈-minimal. If δ ∈ β then δ ∈ γ, since δ = γ or γ ∈ δ both imply γ ∈ β. Thus, β and γ have the same elements and so are the same set (by extensionality), and β ∈ α as was to be shown. For part d, α ∩ β is readily verified to satisfy the defining properties of an ordinal. If α ∩ β = α then α ⊆ β, and if α ∩ β = β then β ⊆ α; in the remaining case by part c α ∩ β ∈ α and α ∩ β ∈ β, so α ∩ β ∈ α ∩ β, a contradiction. ⊳

The collection of ordinals is denoted Ord. If Ord were a set then by the lemma it would be an ordinal, contradicting Ord ∉ Ord. Thus, Ord is a proper class. The notation α < β is used for α ∈ β. It follows from the lemma that < satisfies the axioms for the strict part of a linear order on Ord.

Lemma 2.
a. If x is a transitive set of ordinals then x is an ordinal.
b. If x is a set of ordinals then ∪x is an ordinal.
Proof: For part a, if α, β ∈ x then trichotomy holds because α, β are ordinals. For part b, suppose α ∈ ∪x, say α ∈ β where β ∈ x; and γ ∈ α. Then γ ∈ β, so γ ∈ ∪x. Thus, ∪x is transitive, and it is a set of ordinals, so by part a it is an ordinal. ⊳

Lemma 3.
a. ∅ is an ordinal, denoted 0. 0 ≤ α for any α.
b. α ∪ {α} is an ordinal, called the successor of α and denoted α + 1. If α ≤ β ≤ α + 1 then either β = α or β = α + 1.
Proof: For part a, ∅ satisfies the requirements for an ordinal vacuously; and ∅ ⊆ α. For part b, the requirements are readily verified, and α ⊆ β ⊆ α ∪ {α}. ⊳

An ordinal α is called a successor ordinal if there is a β < α such that α = β + 1. An ordinal α is called a limit ordinal if it is not 0, and for all β < α, β + 1 < α. If α is not 0 or a successor ordinal then α is a limit ordinal, since if β < α then β + 1 cannot equal α, so must be less than α.

The axiom of infinity states that there is a set x with the property
(∗) ∅ ∈ x ∧ ∀w(w ∈ x ⇒ w ∪ {w} ∈ x).
The subsets of x having property (∗) form a nonempty set y, and ∩y is the smallest set having property (∗). Let ω denote this set.

Theorem 4. ω is the smallest limit ordinal.
Proof: Let S = {α ∈ ω : α ∈ Ord}. S is readily seen to have property (∗), whence ω ⊆ S; since clearly S ⊆ ω, S = ω, and ω is a set of ordinals. Let T = {α ∈ ω : α ⊆ ω}. T is readily seen to have property (∗), so ω ⊆ T, so α ∈ ω ⇒ α ⊆ ω. Thus, ω is a transitive set of ordinals, whence it is an ordinal. Since ω has property (∗) it is a limit ordinal. Any limit ordinal β has property (∗), so ω ⊆ β. ⊳

The elements of ω are the ordinals 0 + 1 + ··· + 1, where 0 denotes the empty set and there are n 1’s added. That is, ω is a copy of the integers, with n ↦ n + 1 the successor function. It is easily seen that n = {0, …, n − 1}.
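The finite ordinals can be built explicitly as nested frozensets, which makes the identity n = {0, …, n − 1} easy to check for small n. The following Python sketch is illustrative only; the names successor, von_neumann, is_transitive, and is_ordinal are hypothetical, and the tests apply the definitions of this section to hereditarily finite sets.

```python
def successor(a):
    """The successor ordinal a + 1, defined as a ∪ {a}."""
    return a | frozenset({a})

def von_neumann(n):
    """The ordinal 0 + 1 + ... + 1 with n 1's added, starting from the empty set."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a

def is_transitive(x):
    """v ∈ w and w ∈ x imply v ∈ x."""
    return all(v in x for w in x for v in w)

def is_ordinal(x):
    """Transitive, and any two elements are related by ∈ or are equal."""
    return is_transitive(x) and all(
        v in w or v == w or w in v for v in x for w in x
    )
```

For instance, von_neumann(4) equals the set of von_neumann(k) for k = 0, 1, 2, 3, and has four elements. The set whose only element is von_neumann(1) is not transitive, hence not an ordinal.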
As promised in section 5, in set theory there is no need to define N_n, since it is the same thing as n.

The ordinals are in many ways a generalization of the integers. For example, there is an induction principle for ordinals, called transfinite induction, stated in the following theorem. To state it, a notational convenience will be adopted. If F is a formula let ∀αF denote ∀α(α ∈ Ord ⇒ F), so that α stands for a variable ranging over ordinals. Similarly ∃αF denotes ∃α(α ∈ Ord ∧ F).

Theorem 5. ∀α(∀β(β < α ⇒ F_{β/α}) ⇒ F) ⇒ ∀αF.
Proof: By contraposition and replacing F by ¬F it suffices to prove ∃αF ⇒ ∃α(F ∧ ∀β(β < α ⇒ ¬F_{β/α})). To prove this, suppose F holds at γ. Then {α ≤ γ : F} is a nonempty set of ordinals. Let α be an ∈-minimal element; then F ∧ ∀β(β < α ⇒ ¬F_{β/α}) holds. ⊳

Another fundamental tool making use of the ordinals is transfinite recursion, where a function F from Ord to V is defined by a recursion involving a function G from V to V. The use of the term “function” needs to be clarified; G is a proper class given by a formula, which has free variables x and y. For each value for x there is a unique value for y such that G holds (i.e., this has been proved in ZFC). Likewise F is a proper class, its formula is derived from that for G, and it may be proved in ZFC that it is a function in the same sense. F(α) = y if and only if there exists a function f with domain α such that f(β) = G(f ↾ β) for all β < α, and y = G(f). Using transfinite induction it can be shown that for all α there is a unique f and y; details are omitted.

A binary relation < on a set S is said to be well-founded if for every nonempty subset T ⊆ S there is an element x ∈ T such that for any y ∈ T, y ≮ x. Such an element x of T is said to be a minimal element; < is well-founded if and only if every nonempty subset contains a minimal element. An infinite descending chain in S is a function f : ω ↦ S such that f(i + 1) < f(i) for all i ∈ ω.

Theorem 6.
A binary relation < on S is well-founded if and only if there is no infinite descending chain.

Proof: Suppose < is well-founded, and f : ω ↦ S. Then f[ω] ⊆ S, so there is a minimal element x ∈ f[ω], say x = f(i). In particular, f(i + 1) ≮ f(i), so f is not an infinite descending chain. Suppose < is not well-founded; then there is a nonempty subset T ⊆ S such that ∀x ∈ T ∃y ∈ T (y < x). Using the axiom of choice there is a function g : T ↦ T such that g(x) < x. Using recursion there is a function f : ω ↦ T such that f(i + 1) < f(i) for all i ∈ ω. ⊳

If S is a set then ∈ can be considered as a binary relation on S. By the axiom of foundation this relation is well-founded, and so there is no infinite descending chain x_i of elements of S, i.e., with x_{i+1} ∈ x_i for all i ∈ ω. To show that there is no infinite descending chain at all, it suffices to show that given any x_0 there is a transitive set S with x_0 ⊆ S. In fact there is a smallest such S. Define sets S_i for i ∈ ω by the recursion S_0 = x_0, S_{i+1} = ∪S_i; and let S = ∪_{i∈ω} S_i.

A binary relation < on a set S is said to be a well-order if it is the strict part of a linear order, and is well-founded. For example, if α is an ordinal then ∈ is a well-order on α.

Lemma 7. If f : α ↦ β is an order isomorphism then α = β and f is the identity function, that is, f(γ) = γ for all γ < α.

Proof: If not there is a least γ < α such that f(γ) ≠ γ, say f(γ) = δ. If δ < γ then f(δ) = δ, a contradiction. If δ > γ then f(ζ) = γ for some ζ, and ζ < γ must hold, whence f(ζ) = ζ, again a contradiction. Thus, f(γ) = γ for all γ < α, whence β = {γ : γ < α}, whence β = α. ⊳

Theorem 8. If < is a well-order on S then there is a unique ordinal α such that there is an order isomorphism f : α ↦ S; further the isomorphism is unique.

Proof: Similarly to a definition in section 8, say that a subset T ⊆ S is <-closed if x ∈ T ∧ y < x ⇒ y ∈ T.
Let C be the class of functions g : α ↦ S such that g is an order isomorphism from α to g[α] and g[α] is <-closed. Suppose g_1 : α_1 ↦ S and g_2 : α_2 ↦ S are in C. Then if β ∈ α_1 ∩ α_2 then g_1(β) = g_2(β). Suppose not, and let β be the smallest counterexample; suppose without loss of generality that g_1(β) = x < y = g_2(β) where x, y ∈ S. Since g_2[α_2] is <-closed, g_2(γ) = x for some γ, and γ < β must hold. But then g_1(γ) = g_2(γ) = x, a contradiction. Thus, g_1 ⊆ g_2 or g_2 ⊆ g_1. Since there are only a set of initial segments of S, it follows using lemma 7 and replacement that C is a set. Let f = ∪C. If g : α ↦ S is in C and β < α then g ↾ β is in C; it follows that Dom(f) is an ordinal α. It is easy to check that f is strictly order-preserving; to show that it is an order isomorphism it suffices to show that it is surjective. Suppose not, and let x be least such that x ∉ Ran(f). Then f : α ↦ S is in C, and if y < x then y ∈ f[α]. Let f′ be the function with domain α + 1, which is the same as f below α, and with f′(α) = x. It is easy to check that f′ is in C. This yields a contradiction showing that x does not exist; and f is surjective. Uniqueness of α and f follows using lemma 7. ⊳

The ordinal α is called the order type of the well-order. Thus, the ordinals are a "system of representatives" for the well-orders. This was a topic of concern in set theory, until a nice definition of the ordinals was given (by John von Neumann). The "equivalence relation" that two well-orders are order isomorphic is a proper class, so one cannot define the ordinals by taking a quotient. The modern definition of the ordinals has a number of additional desirable properties, and proofs of basic facts are simple.

Theorem 9. If S is a set then there is an ordinal α and a bijection f : α ↦ S.

Proof: Let g be a choice function for Pow(S) − {∅}. By transfinite recursion define f so that f(β) = g(S − f[β]).
Since f is injective and its range is a set, by replacement its domain is a set, in fact a transitive set of ordinals, that is, an ordinal. ⊳

That is, "any set can be well-ordered", a fact known as the well-ordering principle. Cantor believed this was true, but did not know how to prove it. In fact, given the other axioms of set theory the well-ordering principle is equivalent to the axiom of choice; a proof is omitted.

Note that α will in general depend on g, and the ordinal of the well-ordering is not unique (theorem 8 only says that it is unique if S already is equipped with some well-order). For example ω can be well-ordered in "natural" order; or by listing the even integers in natural order, followed by the odd integers in natural order. This latter order is denoted as ω + ω, or ω · 2. Indeed, the following functions on the ordinals may be defined by transfinite recursion.
- α + 0 = α, α + (β + 1) = (α + β) + 1, α + β = ∪_{γ<β} (α + γ) when β is a limit ordinal.
- α · 0 = 0, α · (β + 1) = (α · β) + α, α · β = ∪_{γ<β} (α · γ) when β is a limit ordinal.

Basic facts about "ordinal arithmetic" include the following. Proofs may be found in introductory treatments of set theory, in [Jech2] or [Monk1] for example.
- The type of the order obtained by appending an order of type β to an order of type α is α + β.
- The type of the order obtained by appending β copies of an order of type α, one after another, is α · β.
- On ω the operations + and · are the usual integer operations.
- α + (β + γ) = (α + β) + γ and α · (β · γ) = (α · β) · γ.
- + is not commutative; for example 1 + ω = ω.
- · is not commutative; for example 2 · ω = ω.
- α · (β + γ) = α · β + α · γ.
- If δ > 0 then for any α there are unique β and γ with γ < δ such that α = δ · β + γ.

The cardinal numbers play a role regarding the size of sets, similar to the role played by the ordinals regarding the well-ordering of sets. Two sets are considered to be the "same size" if there is a 1-1 correspondence between them.
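
As a concrete illustration of "same size", the "evens, then odds" well-order of ω mentioned above gives a 1-1 correspondence between ω and the order type ω + ω. The sketch below is only an illustration under an assumed encoding: a position ω · c + k (with c ∈ {0, 1}) is represented by the pair (c, k), and the function names are hypothetical.

```python
def key(n):
    """Sort key for the 'evens, then odds' well-order on the natural numbers."""
    return (n % 2, n)

def position(n):
    """Position of n in that order, as a pair (c, k) standing for ω·c + k."""
    return (n % 2, n // 2)

def element(c, k):
    """Inverse of position: the natural number sitting at position ω·c + k."""
    return 2 * k + c

window = range(20)
listed = sorted(window, key=key)  # evens in natural order, then odds

for n in window:
    assert element(*position(n)) == n  # position is invertible
for m in window:
    for n in window:
        # position preserves order (pairs are compared lexicographically)
        assert (key(m) < key(n)) == (position(m) < position(n))
```

So ω and ω + ω differ as order types but are in 1-1 correspondence as sets.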
An ordinal is said to be a cardinal if and only if it is not in 1-1 correspondence with any smaller ordinal. The following theorem is known as the Bernstein-Cantor-Schröder theorem.

Theorem 10. Suppose injective functions f : S ↦ T and g : T ↦ S are given; then there is a bijection from S to T.

Proof: By identifying T with its image under g, we may consider T to be a subset of S; we thus have T ⊆ S and f : S ↦ T injective. Define S_0 = S, S_{i+1} = f[S_i]; and T_0 = T, T_{i+1} = f[T_i]. Since f is injective, f[S_i − T_i] = f[S_i] − f[T_i]. It follows that each point of S either belongs to some S_i − T_i, some T_i − S_{i+1}, or every S_i. We may map points x in the first category to f(x), and the remaining points to themselves. ⊳

It follows that an ordinal is a cardinal if and only if it has no injection to a smaller ordinal, if and only if it has no surjection from a smaller ordinal. The class of cardinal numbers will be denoted Card; Greek letters κ, λ, µ, ν are customarily used to denote cardinals. The order relation < on Ord induces an order relation on Card; given two cardinals κ and λ, either κ < λ, κ = λ, or κ > λ, and κ ≤ λ if and only if there is an injection from κ to λ. If κ and λ are distinct cardinals there is no 1-1 correspondence between them. Every ordinal is in 1-1 correspondence with exactly one cardinal, namely the least ordinal with which it is in 1-1 correspondence. By the well-ordering principle every set S is in 1-1 correspondence with some ordinal, and hence in 1-1 correspondence with exactly one cardinal κ; κ is called the cardinality of S, and denoted |S|. As noted in section 5 for −, the use of |r| for the absolute value of a real number, and |S| for the cardinality of a set, ordinarily causes no confusion.

The following theorem was already noted in section 5; it is called the pigeonhole principle.

Theorem 11. If m and n are integers with m > n then there is no injection from m to n.

Proof: The proof is by induction on m. If m = 0 the theorem is vacuously true.
If f : m + 1 ↦ n is an injection then there is a bijection g : n ↦ n such that if f′ = g ◦ f then f′(m) = n − 1. Then f′ ↾ m is an injection from m to n − 1, which contradicts the induction hypothesis. To obtain g, if f(m) ≠ n − 1 let g(n − 1) = f(m) and g(f(m)) = n − 1; in other cases let g(i) = i. ⊳

Theorem 12. If x is a set of cardinals then ∪x is a cardinal.

Proof: By lemma 2 ∪x is an ordinal. Let κ = |∪x|. If κ ∈ ∪x then κ ∈ λ for some λ ∈ x. Then λ ⊆ ∪x, so λ ≤ |∪x| = κ, a contradiction. Thus, κ = ∪x. ⊳

By theorem 11 the integers are distinct cardinals. By theorem 12 ω is a cardinal. As noted in section 5, a set is said to be finite if its cardinality is an integer, else infinite. A set is said to be countably infinite if its cardinality is ω.

Theorem 13. If κ is an infinite cardinal then κ is a limit ordinal.

Proof: If α + 1 is an infinite successor ordinal then ω ⊆ α, and an injective map from α + 1 to α can be constructed. Namely, for i ∈ ω map i to i + 1, map α to 0, and map other elements to themselves. ⊳

Theorem 14. For any set x, |Pow(x)| > |x|.

Proof: Let f : x ↦ Pow(x), and let y = {w ∈ x : w ∉ f(w)}. Suppose f(w) = y; then w ∈ f(w) if and only if w ∈ y if and only if w ∉ f(w). Hence w does not exist, so f is not a surjection. Since f was arbitrary, there is no surjection from x to Pow(x). ⊳

This is another of the many basic theorems of set theory due to Cantor. By theorems 12 and 14 Card is a proper class; if it were a set there would be a cardinal larger than any cardinal in the set. It also follows that for any infinite cardinal κ there is a next larger cardinal, which will be denoted κ^+. By transfinite recursion the function ℵ from Ord to Card can be defined, where writing ℵ_α for the value at α, ℵ_0 = ω, ℵ_{α+1} = (ℵ_α)^+, and ℵ_α = ∪_{β<α} ℵ_β when α is a limit ordinal. It is readily verified that every infinite cardinal is ℵ_α for some α.

For the next theorem a well-order on the ordered pairs of ordinals will be defined. This well-order has a variety of other uses.
Note that α ∪ β is the larger of α and β. Letting γ_i = α_i ∪ β_i for i = 1, 2, say that ⟨α_1, β_1⟩ <_OP ⟨α_2, β_2⟩ if and only if γ_1 < γ_2 or γ_1 = γ_2 ∧ α_1 < α_2 or γ_1 = γ_2 ∧ α_1 = α_2 ∧ β_1 < β_2.

Lemma 15. <_OP satisfies the axioms for a well-order.

Proof: To simplify the notation, for an integer i let P_i = ⟨α_i, β_i⟩, and let γ_i = α_i ∪ β_i. Suppose P_1 <_OP P_2 <_OP P_3. If γ_1 < γ_2 or γ_2 < γ_3 then P_1 <_OP P_3; otherwise if α_1 < α_2 or α_2 < α_3 then P_1 <_OP P_3; otherwise β_1 < β_2 and β_2 < β_3 so P_1 <_OP P_3. Thus, <_OP is transitive. Clearly it is irreflexive. If P_1 ≠ P_2 then either γ_1 < γ_2 or γ_2 < γ_1, or α_1 < α_2 or α_2 < α_1, or β_1 < β_2 or β_2 < β_1; thus, <_OP is a linear order. If ⟨P_i⟩ is an infinite nonincreasing sequence then γ_i must eventually become constant, then α_i must, then β_i must. Hence, <_OP is a well-order. ⊳

Let L_OP(α, β) denote {⟨α′, β′⟩ : ⟨α′, β′⟩ <_OP ⟨α, β⟩}. Let Γ(α, β) be the order type of L_OP(α, β). Γ is a proper class order isomorphism from Ord × Ord (the proper class of ordered pairs of ordinals) ordered by <_OP, to Ord.

Theorem 16. For an infinite cardinal κ, Γ[κ × κ] = κ.

Proof: For an integer n, |Γ[n × n]| = n^2, and the claim follows for κ = ω. Suppose inductively that the claim holds for infinite cardinals λ < κ. If ζ < κ for an infinite ordinal ζ, let λ = |ζ|; then |ζ × ζ| equals |λ × λ| (because there is a bijection), equals |Γ[λ × λ]| (because Γ is a bijection), equals λ (by the induction hypothesis); thus, |ζ × ζ| < κ. Suppose ⟨α, β⟩ ∈ κ × κ, and let γ = α ∪ β; then L_OP(α, β) ⊆ (γ + 1) × (γ + 1). It follows that |Γ(α, β)| = |L_OP(α, β)| < κ, whence Γ(α, β) < κ. Thus, Γ[κ × κ] ⊆ κ has been shown. Clearly, |Γ[κ × κ]| ≥ κ, and Γ[κ × κ] is an ordinal, and Γ[κ × κ] = κ follows. ⊳

There are other ordinals α for which Γ[α × α] = α; see theorem 46.13.

Computing the cardinality of sets is frequently done in set theory.
For this purpose, and other purposes, κ + λ is defined as the cardinality of the disjoint union of κ and λ, and κ · λ is the cardinality of κ × λ. It is easily seen using theorem 16 that when κ and λ are infinite, κ + λ = κ · λ = sup(κ, λ) = κ ∪ λ. Again, the use of + and · for both the ordinal and cardinal operations rarely causes confusion; however some authors use different symbols for the ordinal operations.

As an example of computing cardinalities, if S has infinite cardinality κ then S^k (the set of ordered k-tuples) does also, for any k ≥ 1. The set of finite sequences does also; there is an injection to ω × κ mapping a sequence of length k to the pair ⟨k, α⟩ where α is the "code" (ordinal rank in an enumeration of S^k) for the k-tuple.

Ordinal and cardinal numbers are used throughout mathematics, including set theory. Additional properties will be given as they are needed. Also, further details of various basic arguments, omitted so far, will be given in section 17.

14. The real numbers (II).

Section 1 of [Miller] is titled, "What are the reals, anyway?" At least five answers to the question of what a real number is are commonly encountered:
1. An element of R.
2. An element of the completion of Q.
3. A subset of N.
4. A function from N to {0, 1}.
5. A function from N to N.

Definitions 1 and 2 are equivalent, and either may be taken as the "official" definition. After earlier work, both constructions were published in 1872, the first by Dedekind and the second by Cantor. Definitions 3 to 5 are used in specific circumstances as a matter of convenience. As will be seen, they are "nearly equivalent" to the official definition; further, in these cases additional structure may be imposed on the collection of reals.

The main purpose of this section is to show that |R| = |Pow(ω)|. Other facts of interest will also be shown.

R has already been given as the unique ordered field having the least upper bound property.
A brief description will be given of the second construction; further details may be found in any of numerous references, including chapter 17 of [Dowd1].

A distance function or metric function (often called simply a "metric") on a set S is a binary function d such that the following hold.
1. d(x, y) ≥ 0.
2. d(x, y) = 0 if and only if x = y.
3. d(x, y) = d(y, x).
4. d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).

The function |x − y| is readily verified to be a metric on R, and a fortiori on Q. A metric space is defined to be a set S equipped with a metric function. The "open ball" B_{xε} in S is defined to be {y ∈ S : d(x, y) < ε}.

By an infinite sequence in S is meant a function f : ω ↦ S; such is frequently written ⟨x_i : i ∈ ω⟩, or simply ⟨x_i⟩. A point x is said to be a limit of the infinite sequence ⟨x_i⟩ if every open ball B_{xε} contains all but finitely many points of the sequence. In this case, the sequence is said to converge to x.

An infinite sequence converges to at most one point, since given two distinct points x and y there are disjoint open balls B_{xε} and B_{yε}. However there may be no such point. Certainly ⟨i⟩ does not converge. A more relevant example is ⟨⌊√2 · i⌋/i : i > 0⟩ (the reader is assumed to be familiar with the "greatest integer" function ⌊x⌋ on R). This converges in R but not in Q.

An infinite sequence ⟨x_i⟩ in a metric space is called a Cauchy sequence if for all ε > 0 there is an n such that d(x_i, x_j) < ε if i, j ≥ n. A metric space is called complete if every Cauchy sequence converges to some limit.

It is not difficult to see that R is complete. Let ⟨x_n⟩ be a Cauchy sequence. The set of values taken on by the sequence is bounded above; choose any ε > 0, choose an N so that if i, j ≥ N then |x_i − x_j| < ε, and consider x_N + ε. This bounds above x_n for n ≥ N, so an upper bound can be obtained by considering the maximum of this and the x_n for n < N. A similar argument shows that the set of values is bounded below.
It is clear that sup{x_k : k ≥ n} exists for each n. Letting b_n denote this value, it is clear that inf{b_n} exists. Let x denote this value; the sequence x_n converges to x. Given ε, N ≤ M ≤ L may successively be chosen so that |x − b_N| < ε/3, |b_N − x_M| < ε/3, and |x_M − x_n| < ε/3 for n ≥ L.

A function d satisfying the defining properties of a metric, with 2 replaced by d(x, x) = 0, is called a pseudo-metric. A pseudo-metric space is a set equipped with a pseudo-metric. Given such, the relation d(x, y) = 0 is an equivalence relation, and the distance d([x], [y]) between two classes may be defined as d(x, y). This yields the "quotient" metric space.

Suppose X is a metric space, with metric d; let X_1 be the set of Cauchy sequences. If x = ⟨x_i⟩ and y = ⟨y_i⟩ are two such, using the triangle inequality, ⟨d(x_i, y_i)⟩ is a Cauchy sequence in R. The limit thus exists; define d_1(x, y) to be this limit. The function d_1 is a pseudo-metric. The quotient metric space is called the completion of X. It may be shown that it is the essentially unique complete metric space which contains X as a "dense" subspace. Since R is complete and Q is a dense subspace of R, R is the completion of Q (i.e., there is an isomorphism of metric spaces).

The use of the term "dense" above deserves clarification. A topological space is defined to be a set S, equipped with a family T of subsets of S with the following properties.
- ∅, S ∈ T;
- if Q ⊆ T then ∪Q ∈ T;
- if S_1, S_2 ∈ T, then S_1 ∩ S_2 ∈ T.
T is called a topology on S, and its members are called open sets.

Let T_b be a set of subsets of S and let T = {∪Q : Q ⊆ T_b}. T is a topology if and only if ∪T_b = S, and for all U, V ∈ T_b and all x ∈ U ∩ V there is a W ∈ T_b with x ∈ W ⊆ U ∩ V. Under these conditions T_b is called a base for T.

The open balls B_{xε} form a base for a topology on a metric space S, called the metric topology. A set U ⊆ S is open if and only if for all x ∈ U there is an open ball B_{xε} such that B_{xε} ⊆ U.
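
On a finite set the base condition can be tested by brute force. The following is a minimal sketch, assuming the usual form of the condition (each x ∈ U ∩ V lies in some base set W ⊆ U ∩ V, and the base covers S); the function names and the two sample families are illustrative only.

```python
from itertools import combinations

def is_base(space, base):
    """Base condition: the base covers the space, and every x in U ∩ V
    lies in some base set W with W ⊆ U ∩ V."""
    covers = frozenset().union(*base) == space
    return covers and all(
        any(x in w and w <= (u & v) for w in base)
        for u in base for v in base for x in (u & v)
    )

def generate(base):
    """All unions of subfamilies of the base (the empty union is the empty set)."""
    base = list(base)
    opens = set()
    for r in range(len(base) + 1):
        for family in combinations(base, r):
            opens.add(frozenset().union(*family))
    return opens

def is_topology(space, opens):
    """The three axioms; for a finite family pairwise unions suffice."""
    return (
        frozenset() in opens and space in opens
        and all(u | v in opens for u in opens for v in opens)
        and all(u & v in opens for u in opens for v in opens)
    )

space = frozenset({1, 2, 3})
good = [frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})]
bad = [frozenset({1, 2}), frozenset({2, 3})]  # {2} contains no base set

assert is_base(space, good) and is_topology(space, generate(good))
assert not is_base(space, bad) and not is_topology(space, generate(bad))
```

In the second family the generated sets fail to be closed under intersection, which is exactly what the base condition rules out.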
A subset Q of a topological space S is said to be dense if for every nonempty open set U, Q ∩ U ≠ ∅. In the case of a metric space, it suffices that Q ∩ U ≠ ∅ for every open ball U.

The last characterization of course does not require the introduction of topological spaces. However there is an alternative characterization of the topology on R, which shows that the foregoing definition of a dense subset agrees with that given in section 8. If S is a linear order without endpoints, let (x, y) denote the "open interval" {w : x < w < y}. These form a base for a topology on S, called the order topology. R is both a metric space and a linear order without endpoints, and the metric and order topologies are the same.

In set theory the notation x^y is heavily overloaded. If x and y are sets, x^y denotes {f : f : y ↦ x}. If κ and λ are cardinals, κ^λ denotes |{f : f : λ ↦ κ}|. Which is intended can be confusing. Here 2^ω will be used in the former sense, and 2^{ℵ_0} in the latter.

There is an obvious bijection from 2^y to Pow(y), mapping f to {w ∈ y : f(w) = 1}. In particular, |Pow(ω)| = |2^ω| = 2^{ℵ_0}. Also, reals as in definition 3 may be identified with reals as in definition 4. The notation c (the cardinality of the continuum) is frequently used for 2^{ℵ_0}.

If t : n ↦ 2 is a finite string of 0's and 1's, let U_t denote {f ∈ 2^ω : f ↾ n = t}. The sets U_t form the base for a topology on 2^ω. This topological space is called Cantor space; the notation C will be used to denote it.

A function f : S_1 ↦ S_2 between topological spaces is said to be continuous if for any open set U ⊆ S_2, f^{−1}[U] is an open set. For metric spaces, this is equivalent to the "ε-δ" characterization: given x ∈ S_1 and ε > 0 there is a δ > 0 such that d(f(x), f(y)) < ε whenever d(x, y) < δ. A homeomorphism is a bijection f such that both f and f^{−1} are continuous. A homeomorphic embedding is an injection which gives a homeomorphism to its range. In R, let [a, b] denote the "closed interval" {r : a ≤ r ≤ b}.

Theorem 1. There is a homeomorphic embedding of C in [0, 1].

Proof: For f ∈ C let j(f) = Σ_{i∈ω} (2 · f(i))/3^{i+1}; it is easy to show that j(f) ∈ [0, 1]. Given distinct f_0 and f_1 let i be the first position where they differ, and suppose without loss of generality that f_0(i) = 0 and f_1(i) = 1; it is easy to show that j(f_0) < j(f_1). Given an open interval V in [0, 1] and an f with j(f) ∈ V, a finite prefix t of f may be found so that the partial sum j(t) is in V, and this may be extended to a longer prefix t′ so that j[U_{t′}] ⊆ V; this shows that j is continuous. To see that j is a homeomorphic embedding it suffices to show that for any finite 0-1 sequence t there is an open interval V in R such that j[C] ∩ V = j[U_t] (since this shows that j is an "open" map to j[C], meaning it maps open sets to open sets, and the open sets of j[C] are those of the "relative" or "subspace" topology, namely the intersections of the subspace with the open sets of the parent space). The left endpoint r of j[U_t] is contained in some open interval V ⊆ R, such that if s < r and s ∈ V then s ∉ j[C]; and a similar fact holds for the right endpoint. (It can also be concluded that j is a homeomorphic embedding by a general fact of topology, since C is compact and [0, 1] is Hausdorff.) ⊳

The image of C under the embedding given above is called the "Cantor", or "Cantor middle thirds", set. It can be described as obtained from [0, 1] by successively removing the middle third of remaining closed intervals, for infinitely many stages (the intervals removed are open). This set has a number of properties which make it an interesting example in topology; a few such will be given below. In the following, C will be used ambiguously to denote the image, a subset of [0, 1].

A subset K of a topological space is said to be closed if its complement K^c is open; equivalently if given x ∉ K there is an open set U such that x ∈ U and K ∩ U = ∅. C is a closed subset of [0, 1]; this may be seen since it equals [0, 1] − ∪_{i∈ω} V_i where each V_i is open.
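
The correspondence between finite 0-1 strings and the closed intervals remaining at each stage of the middle-thirds construction can be computed exactly. The sketch below assumes the ternary-sum formula j(f) = Σ 2·f(i)/3^{i+1} and uses the fact that the tail of the sum after n terms is at most 3^{−n}; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def j_prefix(t):
    """Partial sum of j over a finite 0-1 string t: sum of 2*t[i] / 3**(i+1)."""
    return sum((Fraction(2 * bit, 3 ** (i + 1)) for i, bit in enumerate(t)),
               Fraction(0))

def interval(t):
    """Closed interval [lo, hi] containing j(f) for every f extending t.
    The remaining digits contribute between 0 and 3**-len(t)."""
    lo = j_prefix(t)
    return lo, lo + Fraction(1, 3 ** len(t))

def stage(n):
    """The 2**n closed intervals left after n removal steps, left to right."""
    return [interval(t) for t in product((0, 1), repeat=n)]

# Stage 2 of the construction: four intervals of length 1/9.
assert stage(2) == [
    (Fraction(0), Fraction(1, 9)),
    (Fraction(2, 9), Fraction(1, 3)),
    (Fraction(2, 3), Fraction(7, 9)),
    (Fraction(8, 9), Fraction(1)),
]

# Lexicographically smaller strings give intervals strictly to the left,
# separated by a removed open interval.
for n in range(1, 6):
    s = stage(n)
    assert all(s[k][1] < s[k + 1][0] for k in range(len(s) - 1))
```

The strict gaps between consecutive intervals are the removed open middle thirds, which is why the image is closed and why distinct strings of the same length give disjoint pieces.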
It is easy to verify that a closed subset of a complete space is complete; hence C is complete. It follows that C, given as 2^ω with a topology, is "completely metrizable", meaning that a metric can be defined, whose metric topology is the given topology, and the resulting metric space is complete.

The closure of a subset W of a topological space S is the set of points x ∈ S, such that any open set containing x has nonempty intersection with W. This is the smallest closed set containing W, and if W is closed the closure of W is W. The interior of a subset W of S is the set of x ∈ W such that there is an open set U such that x ∈ U ⊆ W. This is the largest open set contained in W, and if W is open the interior of W is W.

A subset of S is said to be nowhere dense if its closure has empty interior. It is not difficult to see that C is nowhere dense in [0, 1].

A topological space S is said to be totally disconnected if for any two distinct points x and y there are disjoint open sets U and V, such that x ∈ U, y ∈ V, and S = U ∪ V. C is readily seen to be totally disconnected (consider the topology on 2^ω).

Theorem 2. There is a continuous surjection from C to [0, 1].

Proof: For f ∈ C let e(f) = Σ_{i∈ω} f(i)/2^{i+1}. If r ∈ [0, 1] let q_i be the largest rational number of the form m/2^{i+1} such that q_i ≤ r. It is easily seen that ⟨q_i⟩ yields an f such that e(f) = r. If V is an open interval in [0, 1] and r ∈ V, let f be such that e(f) = r. A sufficiently long prefix t of f can be found, such that e[U_t] ⊆ V. Thus, e is continuous. ⊳

The surjection of the proof ("binary notation") is almost bijective; reals r ∈ (0, 1) of the form m/2^{i+1} have two representations, t1 followed by all 0's, and t0 followed by all 1's; all other r have a single representation.

There is a homeomorphism from (0, 1) to R, for example (2x − 1)/(1 − (2x − 1)^2). It has been shown that |C| ≤ |R| = |(0, 1)| ≤ |[0, 1]| ≤ |C|, and so |R| = |C|. By theorem 13.14, |Pow(ω)| > |ω| = ℵ_0.
Since |C| = |Pow(ω)|, |R| = |C| = 2^{ℵ_0} > ℵ_0. A set is said to be uncountable if its cardinality is greater than ℵ_0; thus, R, and also C, is uncountable.

Letting ω^ω denote {f : f : ω ↦ ω}, a topology may be defined on ω^ω in much the same manner as the topology on 2^ω. If t : n ↦ ω is a finite string of non-negative integers let U_t denote {f ∈ ω^ω : f ↾ n = t}. The sets U_t form the base for a topology. This topological space is called Baire space; the notation N will be used to denote it.

Theorem 3. There is a homeomorphic embedding of N in R.

Proof: Define sets S_i of open intervals in R for i ∈ ω. S_0 contains only (−∞, ∞), the entirety of R. At stage i + 1, given an interval I in S_i choose rationals q_j for j ∈ ω such that q_{2j} for j ≥ 0 increases to the right endpoint; q_{2j+1} for j ≥ 0 decreases to the left endpoint; and q_1 < q_0. Choose the q_j so that q′ − q < 2^{−(i+1)} for two successive chosen rationals. For each interval of S_i, add to S_{i+1} the intervals (q, q′) for each successive chosen pair of rationals within the interval. To ensure that every rational is eventually chosen, enumerate the rationals in some countably infinite sequence r_i, and ensure that r_i is chosen in stage i. An element f of N determines a nested sequence of intervals I_i, namely let I_0 = R, and if f(i) = j choose that interval within I_i whose left endpoint is q_j as I_{i+1}. Since the length of the intervals decreases to 0 a unique real r is singled out; let j(f) = r. Note that r is not an endpoint of any interval. That j is a homeomorphism to its range follows because j[U_t] is the intersection of the image of the embedding with an open interval in R. ⊳

The construction ensures that the image of the embedding is R − Q, the set of "irrational" real numbers. For this reason Baire space is sometimes called the irrationals.

There is a "classical" embedding of N in R, which has additional properties of interest in basic number theory (see [HardWr]).
Briefly, there is an embedding of P^ω in (1, ∞), as follows, where P = ω − {0} is the positive integers. For such a sequence ⟨a_i⟩, define p_0 = a_0, q_0 = 1, p_1 = a_1 a_0 + 1, q_1 = a_1, and for n ≥ 2, p_n = a_n p_{n−1} + p_{n−2} and q_n = a_n q_{n−1} + q_{n−2}. The image of ⟨a_i⟩ under the embedding is lim_{n→∞} (p_n/q_n).

There is an obvious injection from C to N, and it follows that |N| = 2^{ℵ_0}. This can be shown more easily of course; in particular it is not difficult to construct an injection from N to C. Also, it may be seen using the following facts of cardinal arithmetic.

Lemma 4.
a. (κ · λ)^µ = κ^µ · λ^µ.
b. κ^{λ+µ} = κ^λ · κ^µ.
c. κ^{λ·µ} = (κ^λ)^µ.
d. If κ_1 ≤ κ_2 then κ_1^λ ≤ κ_2^λ.
e. If λ_1 ≤ λ_2, and either λ_1 ≠ 0 or κ ≠ 0, then κ^{λ_1} ≤ κ^{λ_2}.

Proof: These facts are stated following lemma 3.3 of [Jech2]. Let S, T, and U be sets. For part a, there is a bijection from (S × T)^U to S^U × T^U; f corresponds to ⟨π_1 ◦ f, π_2 ◦ f⟩. For part b, if T and U are disjoint there is a bijection from S^{T∪U} to S^T × S^U; f corresponds to ⟨f ↾ T, f ↾ U⟩. For part c, there is a bijection from S^{T×U} to (S^T)^U; f corresponds to f̄, where f̄(u)(t) = f(t, u). For part d, if S ⊆ T then there is an injection from S^U to T^U; f corresponds to itself. For part e, if T ⊆ U and either S or T is nonempty then there is an injection from S^T to S^U; f corresponds to f′ where f′ is any function such that f′ ↾ T = f. ⊳

Theorem 5. If 2 ≤ λ ≤ κ and κ is infinite then λ^κ = 2^κ.

Proof: 2^κ ≤ λ^κ ≤ (2^λ)^κ = 2^{λ·κ} = 2^κ. ⊳

15. The continuum hypothesis.

In section 14 it was proved that |R| = 2^{ℵ_0}. Since it is a cardinal, 2^{ℵ_0} (the "cardinality of the continuum") equals ℵ_α for some α. It was shown by Paul Cohen in 1963 that the value of α cannot be determined from ZFC. The continuum hypothesis (CH) is the statement "2^{ℵ_0} = ℵ_1".

The terminology "continuum hypothesis" arises from the use of the term "continuum" to denote the real line.
In more recent usage, the term "continuum" denotes any dense linear order without endpoints which has the least upper bound property. The real line is such an object, indeed the unique such containing the rationals as a dense subset; but there are others, of varying cardinalities. Thus, the terminology "continuum hypothesis" has become a bit of a historical artifact.

The history of the continuum hypothesis goes back to Cantor, who discovered the problem of the cardinality of the continuum, and made great efforts to prove that it was ℵ_1, but could not do so. He did prove that every closed uncountable subset of R has cardinality 2^{ℵ_0}, and later results along this line were important advances. In 1934 Sierpinski published a monograph giving various mathematical consequences of the continuum hypothesis.

Although the independence of consequences of CH has been an ongoing topic of research, it is a surprising fact that these are fairly technical, and no question of great importance in everyday mathematics seems to require CH for its proof. In what follows, six examples of implications of CH will be given.

Implications of CH for the real numbers occur concerning the meager, and the measure 0, subsets of R. These are both families of "small" subsets, which are of interest to "set-theoretic topology", as well as topology itself. A set is meager if it is a countable union of nowhere dense sets. The Lebesgue measure is a function assigning nonnegative real numbers to some subsets of the real line, which assigns b − a to an interval [a, b] where a < b; and which has various geometrically motivated properties. By a set of measure 0 is meant one whose Lebesgue measure is 0.

For the first example, some facts about the topology of R are needed. First, there are 2^{ℵ_0} open sets, and so 2^{ℵ_0} closed sets.
To see this, there are ℵ_0 open intervals (q, r) with q and r rational numbers; and any open set is a union of countably many of these; so there are at most ℵ_0^{ℵ_0} = 2^{ℵ_0} open sets. Second, a nowhere dense set is contained in a closed nowhere dense set, namely its closure, because the closure of a closed set K equals K.

Example 1, a Lusin set exists. A Lusin set is an uncountable subset S ⊆ R, such that for any meager set N, |S ∩ N| ≤ ℵ_0.

Theorem 1. If CH is true a Lusin set exists.

Proof: Under CH the closed nowhere dense sets may be enumerated as ⟨N_α : α < ℵ_1⟩. Let x_α be a real number which is not in ∪_{β<α} N_β and is distinct from x_β for all β < α, and let S = {x_α : α < ℵ_1}. ⊳

For the second example, some facts about the Lebesgue measure will be needed, which will be stated without proof. A G_δ subset of R is one which is a countable intersection of open sets. As above, there are 2^{ℵ_0} G_δ sets. Any measure 0 set is a subset of a G_δ measure 0 set. Also, the measure 0 sets are closed under countable union.

Example 2, a Sierpinski set exists. A Sierpinski set is an uncountable subset S ⊆ R, such that for any measure 0 set N, |S ∩ N| ≤ ℵ_0. The proof that such a set exists if CH is true is similar to the proof of theorem 1, but using an enumeration of the G_δ measure 0 sets.

Example 3, cardinal invariants are all ℵ_1. Cardinal invariants are cardinals associated with families of subsets of the real numbers, or functions on the real numbers. They are between ℵ_1 and 2^{ℵ_0}, so if CH holds they all equal ℵ_1. The Cichoń diagram is a diagram of inequalities which hold between ten such invariants. Various combinations of strict inequalities which are permitted by the Cichoń diagram are consistent with ZFC; see [Bart], and [Jech2] for some discussion.

Example 4, the iterated integrals can be unequal. The following was proved by Sierpinski in 1920. It was proved in 1980 that the question is independent of ZFC.

Theorem 2.
If CH holds, then there is a function f : [0, 1] × [0, 1] ↦ [0, 1] such that the iterated Lebesgue integrals ∫_0^1 ∫_0^1 f(x, y) dx dy and ∫_0^1 ∫_0^1 f(x, y) dy dx are unequal.

Remarks on proof: Let ≤ be a well-ordering of [0, 1] of order-type ℵ_1 (by CH such exists). Let f(x, y) = 1 if x ≤ y, else 0. For fixed y the set {x : f(x, y) = 1} is countable, because it is a proper initial segment of ℵ_1. It follows that ∫_0^1 f(x, y) dx = 0, whence ∫_0^1 ∫_0^1 f(x, y) dx dy = 0. For fixed x the set {y : f(x, y) = 0} is countable, and it follows that ∫_0^1 f(x, y) dy = 1, whence ∫_0^1 ∫_0^1 f(x, y) dy dx = 1. ⊳

Example 5, Kaplansky's problem. This problem is only mentioned; it is a problem in analysis of whether a "discontinuous homomorphism" exists, in a certain setting. In 1976 it was shown that the existence of such follows from CH; and also that it is consistent with ZFC that there is not one.

Example 6, no atomless measure. Say that a function µ : Pow([0, 1]) ↦ [0, 1] is an atomless measure if it is countably additive, µ([0, 1]) = 1, and µ(S) = 0 for every singleton set S. Banach and Kuratowski proved in 1929 that if CH holds then no such measure exists; see corollary 10.17 of [Jech2].

The generalized continuum hypothesis (GCH) is the statement "2^{ℵ_α} = ℵ_{α+1} for all α". One consequence is that the cardinal exponentiation function ℵ_α^{ℵ_β} is determined (see [Jech2]). Another fact of interest, proved by Sierpinski in 1946, is that in ZF, the axiom of choice (AC) follows from GCH. A proof may be found in [TakZar1].

16. Absoluteness.

A structure for the language of set theory has (in addition to equality) a binary relation ∈̂ on the domain D, which is the interpretation of the membership predicate. Although there are some contexts where an arbitrary relation is of interest, in common contexts ∈̂ is the membership relation of V. A structure of this type is called by various names, such as "standard" or an "∈-structure".
An ∈-structure is thus a set D, considered as a structure for the language of set theory, by letting the interpretation of the nonlogical symbol ∈ be simply the membership relation. In such a structure, the formula x ∈ y holds between two elements of D if and only if it holds between them in V.

In general, a formula F is said to be absolute for a set (or class) D (considered as a structure) if whenever elements of D are assigned to the free variables of F, the formula is true in D if and only if it is true in V. F is absolute for a family of structures (sets or classes) if it is absolute for each structure in the family. Absoluteness was defined by Kurt Gödel in 1940, and has proved to be a useful definition in set theory.

The foregoing requires clarification; for example “true in V” must be made precise. This is accomplished as usual, by “formalizing” the required facts as formulas of set theory, and proving them in ZFC. One might suppose that the definition of the truth of a formula in a structure might be formalized. Indeed this can be done; but a simpler approach can be used, and includes cases such as class structures.

A common abbreviation in set theory is to use ∀w ∈ x F for ∀w(w ∈ x ⇒ F) and ∃w ∈ x F for ∃w(w ∈ x ∧ F). These are called bounded universal and existential quantification respectively. They occur in many contexts, and have useful properties, such as theorem 1 below. A formula in the language of set theory is said to be ∆0 if all its quantifiers are bounded (i.e., the clauses in the recursive definition of a formula involving the quantifiers are modified to require bounded quantifiers).

The “relativization” F^D of a formula F to the set D is obtained by replacing all quantifiers ∀wF by ∀w ∈ D F, and similarly for ∃wF. This can be more formally specified by giving a recursion on the formation of F. The relativization can be to a class D, recalling that in this case x ∈ D is an abbreviation for a formula.
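The definitions of ∆0 formula and of relativization are both recursions on the formation of a formula, and can be sketched in code. The following Python fragment is only an illustration under an assumed encoding: formulas are nested tuples, and the tags ("in", "forall_in", and so on) and the function names is_delta0 and relativize are hypothetical choices made for this sketch, not notation from the text. Bounded quantifiers are first expanded according to the abbreviation above and then relativized.

```python
# Hypothetical encoding of formulas as nested tuples:
#   ("in", x, y), ("eq", x, y)                 atomic formulas
#   ("not", F), ("and", F, G), ("or", F, G),
#   ("implies", F, G), ("iff", F, G)           propositional connectives
#   ("forall", w, F), ("exists", w, F)         unbounded quantifiers
#   ("forall_in", w, x, F), ("exists_in", w, x, F)   bounded quantifiers
ATOMS = {"in", "eq"}
CONNECTIVES = {"not", "and", "or", "implies", "iff"}


def is_delta0(f):
    """Return True if every quantifier occurring in f is bounded."""
    tag = f[0]
    if tag in ATOMS:
        return True
    if tag in CONNECTIVES:
        return all(is_delta0(g) for g in f[1:])
    if tag in ("forall_in", "exists_in"):
        return is_delta0(f[3])
    if tag in ("forall", "exists"):
        return False
    raise ValueError(f"unknown formula tag: {tag!r}")


def relativize(f, d):
    """Return the relativization of f to d: every quantifier ranges over d."""
    tag = f[0]
    if tag in ATOMS:
        return f
    if tag in CONNECTIVES:
        return (tag,) + tuple(relativize(g, d) for g in f[1:])
    if tag == "forall":
        return ("forall_in", f[1], d, relativize(f[2], d))
    if tag == "exists":
        return ("exists_in", f[1], d, relativize(f[2], d))
    if tag == "forall_in":
        # "forall w in x F" abbreviates "forall w (w in x => F)"
        _, w, x, body = f
        return ("forall_in", w, d,
                ("implies", ("in", w, x), relativize(body, d)))
    if tag == "exists_in":
        # "exists w in x F" abbreviates "exists w (w in x and F)"
        _, w, x, body = f
        return ("exists_in", w, d,
                ("and", ("in", w, x), relativize(body, d)))
    raise ValueError(f"unknown formula tag: {tag!r}")


# x is a subset of y, written with a bounded quantifier
subset = ("forall_in", "w", "x", ("in", "w", "y"))
# "x is nonempty", written with an unbounded quantifier
nonempty = ("exists", "w", ("in", "w", "x"))
```

Here is_delta0(subset) holds while is_delta0(nonempty) does not, and relativize(nonempty, "D") produces a formula all of whose quantifiers are bounded by D.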
Also, a transitive class is a class S such that x ∈ y ∧ y ∈ S ⇒ x ∈ S. Let Tran(x) be the formula stating that x is transitive.

Theorem 1. Suppose D is a set or class, and F is a ∆0 formula with free variables x1, . . . , xk. Then Tran(D) ∧ x1 ∈ D ∧ · · · ∧ xk ∈ D ⇒ (F ⇔ F^D).

Proof: By induction on F, the required formula follows by predicate logic. If F is an atomic formula then F^D is F. For ¬F, (¬F)^D equals ¬(F^D) and the claim follows easily; similarly for the other propositional connectives. For the bounded universal quantifier, let (1) be ∀w(w ∈ x ⇒ F) and let (2) be ∀w(w ∈ x ∧ w ∈ D ⇒ F). (1)⇒(2) follows directly; and (2)⇒(1) follows using the hypotheses x ∈ D and w ∈ x ∧ x ∈ D ⇒ w ∈ D. The bounded existential quantifier is similar. ⊳

Many predicates may be shown to be absolute using this theorem, by simply writing ∆0 formulas for them. Examples include the following.
- x = ∅: ∀w ∈ x(w ≠ w).
- x ⊆ y: ∀w ∈ x(w ∈ y).
- s = {u, v}: u ∈ s ∧ v ∈ s ∧ ∀w ∈ s(w = u ∨ w = v).
- p = ⟨u, v⟩: ∃s ∈ p∃t ∈ p(s = {u} ∧ t = {u, v} ∧ p = {s, t}).
- p is an ordered pair: ∃u ∈ s∃v ∈ t(p = ⟨u, v⟩) where s, t are as in the previous formula.
- u = π1(p), v = π2(p), π1(p1) = π1(p2), π2(p1) = π2(p2): ∃v ∈ t(p = ⟨u, v⟩), etc.
- r is a relation: ∀p ∈ r(p is an ordered pair).
- x = π1[r], y = π2[r]: ∀w ∈ x∃p ∈ r(w = π1(p)) ∧ ∀p ∈ r∃w ∈ x(w = π1(p)), etc.
- f is a function: f is a relation ∧ ∀p1 ∈ f∀p2 ∈ f(π1(p1) = π1(p2) ⇒ π2(p1) = π2(p2)).
- w = f(v): ∃p ∈ f(p = ⟨v, w⟩).
- g = f ↾ x: ∀p ∈ g(p ∈ f ∧ π1(p) ∈ x) ∧ ∀p ∈ f(π1(p) ∈ x ⇒ p ∈ g).
- z = ∪x: ∀v ∈ x∀u ∈ v(u ∈ z) ∧ ∀u ∈ z∃v ∈ x(u ∈ v).
- z = x ∩ y: ∀v ∈ x(v ∈ y ⇒ v ∈ z) ∧ ∀v ∈ z(v ∈ x ∧ v ∈ y).
- z = x − y: ∀v ∈ x(v ∉ y ⇒ v ∈ z) ∧ ∀v ∈ z(v ∈ x ∧ v ∉ y).
- z = ∩x: ∃w ∈ x∀u ∈ w(∀v ∈ x(u ∈ v) ⇒ u ∈ z) ∧ ∀u ∈ z∀v ∈ x(u ∈ v).
- z = x × y: ∀p ∈ z∃u ∈ x∃v ∈ y(p = ⟨u, v⟩) ∧ ∀u ∈ x∀v ∈ y∃p ∈ z(p = ⟨u, v⟩).
- x is transitive: ∀v ∈ x(v ⊆ x).
- x is an ordinal: x is transitive ∧ ∀u ∈ x∀v ∈ x(u ∈ v ∨ u = v ∨ v ∈ u).
- x is a limit ordinal: x is an ordinal ∧ x ≠ ∅ ∧ ∀u ∈ x∃v ∈ x(u ∈ v).
- x ∈ ω: x is an ordinal ∧ x is not a limit ordinal ∧ ∀u ∈ x(u is not a limit ordinal).
- x = ω: x is a limit ordinal ∧ ∀u ∈ x(u is not a limit ordinal).

Theorem 1 illustrates the utility of transitive sets, as well as bounded quantifiers. The next theorem shows that a transitive structure can be obtained from a structure satisfying a milder hypothesis. It is called the Mostowski collapsing lemma, and is a technical tool of considerable significance. It will be used many times throughout the remainder of the text, lemma 20.6 being a notable example.

Theorem 2. Suppose D is a set satisfying the axiom of extensionality. Then there is a transitive set T, and a bijection π : D ↦ T, such that ∀x, y ∈ D(x ∈ y ⇔ π(x) ∈ π(y)) (π is an ∈-isomorphism). Further T and π are unique.

Remarks on proof: π is defined by recursion on the well-founded relation ∈, in a manner similar to definition by transfinite recursion as described in section 13, to be the unique function on D such that π(x) = {π(w) : w ∈ x ∩ D}. That is, π(x) = y if and only if x ∈ D and there exists a function f with domain x such that f(w) = G(f ↾ w) for all w ∈ x, and y = G(f ↾ x), where G(g) = {t : ∃s ∈ D(⟨s, t⟩ ∈ g)}. It follows by ∈-induction that there is a unique π satisfying the recursion equation.

Let T = Ran(π). Using the axiom of replacement T is a set. Suppose x′ ∈ T ∧ v′ ∈ x′; then ∃x ∈ D(x′ = π(x)), so ∃v ∈ x ∩ D(v′ = π(v)), so v′ ∈ T. This shows that T is transitive. Also, if v ∈ x for v, x ∈ D then π(v) ∈ π(x) (π is an ∈-homomorphism). This much follows without the extensionality hypothesis.

Suppose π is not injective (it is surjective onto T by definition), so that {x1 ∈ D : ∃x2 ∈ D(x2 ≠ x1 ∧ π(x2) = π(x1))} is nonempty. Let x1 be an ∈-minimal element, with x2 as a witness. By the hypothesis of extensionality, either (1) ∃w1 ∈ D(w1 ∈ x1 ∧ w1 ∉ x2) or (2) ∃w2 ∈ D(w2 ∉ x1 ∧ w2 ∈ x2). In case 1, π(w1) ∈ π(x1), so π(w1) ∈ π(x2), so ∃w2 ∈ D ∩ x2(π(w1) = π(w2)). Since w1 ∉ x2, w1 ≠ w2.
In case 2, exchanging the roles of 1 and 2 again yields w1, w2 with w1 ∈ x1, w2 ∈ x2, w1 ≠ w2, and π(w1) = π(w2). In either case, w1 contradicts the minimality of x1. Thus, π is bijective.

Suppose π(v) ∈ π(x). Then there is a v′ ∈ x ∩ D such that π(v′) = π(v). Since π is injective v′ = v, so v ∈ x. This shows that π is an ∈-isomorphism.

If π′ is any ∈-isomorphism on D whose range is transitive, and w ∈ x for w, x ∈ D, then π′(w) ∈ π′(x); thus, {π′(w) : w ∈ x ∩ D} ⊆ π′(x). If w′ ∈ π′(x) for x ∈ D, since the range of π′ is transitive ∃w ∈ D(π′(w) = w′), and π′(w) ∈ π′(x), whence w ∈ x. Thus, {π′(w) : w ∈ x ∩ D} = π′(x). Thus, π′ = π, and it follows that T is unique also. For further details see theorem 6.15 of [Jech2]. ⊳

The set T of the theorem is called the transitive collapse (or Mostowski collapse), and the map π is called the collapsing isomorphism. Theorem 2 gives the transitive collapse of an ∈-structure. The transitive collapse may be applied to more general structures; see section 36.

A formula of set theory is said to be Π1 (resp. Σ1) if it is of the form ∀x⃗F (resp. ∃x⃗F) for a ∆0 formula F. Suppose D is a transitive class. A formula F is said to be down-absolute (resp. up-absolute) for D if for any assignment of members of D to the free variables of F, if F holds in V then it holds in D (resp. if F holds in D then it holds in V). It is not difficult to show, using the methods of theorem 1, that a Π1 formula is down-absolute and a Σ1 formula is up-absolute.

The predicate z = Pow(x) is Π1; it holds if and only if ∀w(w ∈ z ⇔ w ⊆ x). Thus, it is down-absolute; that is, if the power set of x is an element of D then it is the power set in D. However, it may not be in D. Further, the predicate is not absolute. This follows because there are countable transitive models of the power set axiom. Indeed, it can be shown (theorem 12.14 of [Jech2]) that there is a countable transitive model of any finite set of axioms of ZFC.
The predicate “x is a cardinal” is Π1; it holds if and only if x is an ordinal and for all functions f : α ↦ x where α ∈ x, f[α] ≠ x. This predicate is not absolute either, as uncountable cardinals can be proved to exist using only finitely many axioms.

17. Admissible sets.

The predicate z = x × y is absolute; however, in a set D, the Cartesian product may not always exist. If D is a model of ZFC then it will; however there are various contexts in mathematical logic where it is convenient to consider models of theories which are subtheories of ZFC. The models of such theories are shown to be closed under various functions f, that is, ∀x∃yF holds in the models, where F is the formula defining the function. The class of models of interest varies, but using a larger class than needed for one application avoids redoing the work when another application requires the larger class.

Although there are others, the two main classes of models in use are the rudimentarily closed sets, and the admissible sets (see [Mathias], [Rathjen1] for other classes). The former are important in the branch of set theory known as constructibility theory, and are discussed in section 45. The latter have continued to find applications in many areas in mathematical logic.

The system of axioms KP (Kripke-Platek) consists of the following axioms. For brevity bounded quantifiers are used (which was not done for ZFC).
- extensionality, pairing, union
- foundation axiom scheme: For any formula F, ∃wF ⇒ ∃w(F ∧ ∀u ∈ w¬F_{u/w})
- ∆0 separation: For any ∆0 formula F where x does not occur free, ∃x∀w(w ∈ x ⇔ w ∈ y ∧ F)
- ∆0 collection: For any ∆0 formula F where w does not occur free, ∀x ∈ z∃yF ⇒ ∃w∀x ∈ z∃y ∈ wF

As noted in section 13, the foundation axiom scheme is provable in ZF. The axiom of ∆0 collection is also (see theorem 5 below). Thus, KP is a subtheory of ZF. It lacks the axiom of infinity, and also the power set axiom.
As will be seen, in basic constructibility theory, certain functions must be shown to exist (that is, the existence condition ∀x∃yF must be shown), and proofs given that the functions have various properties. If proofs are given in ZFC then the functions have been shown to exist and have the properties in models of ZFC, and in various cases this is really all that is required. However, even in some such cases, proofs can be given in KP. Theorems proved using KP hold in arbitrary structures satisfying KP, and not just the universe of sets. This is a useful fact in several branches of mathematical logic, including set theory itself.

Recall from section 5 that a k-tuple ⟨x1, x2, . . .⟩ is formally defined as ⟨x1, ⟨x2, . . .⟩⟩. Any formula ∃x⃗G where G is ∆0 is (provably in KP, in fact in weaker systems) equivalent to ∃pG′ where G′ is ∃s1 ∈ p∃x1 ∈ s1 · · · (p = ⟨x1, . . .⟩ ∧ G). This transformation is called “contraction of quantifiers”. Note also that if ⟨x, y⟩ ∈ w then x ∈ ∪∪w; this may be seen by applying ∪ twice to {{{x}, {x, y}}} ⊆ w.

Theorem 1 (Σ1 collection). For any Σ1 formula F where w does not occur free, ⊢KP ∀x ∈ z∃yF ⇒ ∃w∀x ∈ z∃y ∈ wF.

Proof: Suppose F is ∃v⃗G where G is ∆0; then ∀x ∈ z∃y∃v⃗G. Using contraction of quantifiers ∀x ∈ z∃pG′, so by ∆0 collection ∃w′∀x ∈ z∃p ∈ w′G′, so ∃w∀x ∈ z∃y ∈ wF where w = ∪∪w′. The above argument can be formalized in KP. ⊳

A predicate is said to be Σ1-definable (resp. Π1-definable) if there is a Σ1 (resp. Π1) formula defining it. A predicate is said to be ∆1-definable if there are both a Σ1 and a Π1 formula defining it. For brevity, the term “Σ1” alone may be used to abbreviate “Σ1-definable”; and similarly for Π1 and ∆1. The following theorem is known as ∆1 separation.

Theorem 2. Suppose ⊢KP F ⇔ G where F is a Σ1 formula, G is a Π1 formula, and x does not occur free in F or G. Then ⊢KP ∃x∀w(w ∈ x ⇔ w ∈ y ∧ F).

Proof: By contraction of quantifiers it may be assumed that F is ∃vF′ and G is ∀vG′.
It follows from F ⇔ G by predicate logic that (1) ∃v(F′ ∨ ¬G′). It follows from (1) using ∆0 collection that (2) ∃u∀w ∈ y∃v ∈ u(F′ ∨ ¬G′). It follows by ∆0 separation that (3) ∃x∀w(w ∈ x ⇔ w ∈ y ∧ ∃v ∈ uF′). If ∃v ∈ uF′ then ∃vF′. On the other hand, if w ∈ y and ∃vF′, then ∀vG′, so by (2) ∃v ∈ uF′. That is, (4) w ∈ y ⇒ (∃v ∈ uF′ ⇔ ∃vF′). The claim follows by (3) and (4). ⊳

Note that a predicate defined by a formula F as in the theorem, often called a ∆1^KP predicate, is absolute for transitive sets which are models of KP (and similarly if KP is replaced by other theories, such as ZFC).

The utility of theorems 1 and 2 is enhanced by methods for showing that predicates are Σ1 or ∆1^KP; the next theorem is one such.

Theorem 3. Suppose G and H are Σ1 (resp. Π1) formulas, and F is either G ∧ H, G ∨ H, ∀x ∈ yG, or ∃x ∈ yG. Then there is a Σ1 (resp. Π1) formula F′ such that ⊢KP F ⇔ F′.

Proof: All cases follow by predicate logic, except ∀x ∈ y∃v⃗G′ and ∃x ∈ y∀v⃗G′. The second follows from the first by applying ¬. For the first, by contraction of quantifiers the formula may be assumed to be ∀x ∈ y∃vG′. By ∆0 collection the formula ∃u∀x ∈ y∃v ∈ uG′ follows, and in fact this formula is equivalent, as can be seen by predicate logic. ⊳

Formulas as in this theorem are often called Σ1^KP (resp. Π1^KP) formulas. Even though they are not in the required form, they are provably equivalent to formulas that are, and can be used in proofs as if they were.

Another useful method for showing that predicates are Σ1 or ∆1 is “substitution”. Given a predicate P(y) and a function f(x), the predicate P(f(x)) is commonly considered; the methods for dealing with such substitutions formally are standard, but a bit involved. Suppose the function f(x) is defined by a formula F, expressing the predicate y = f(x).
To use f as a function in proofs, it is necessary to make use of the formula stating that f is a function, that is, “for all x there is a unique y such that F holds”. This occurs so commonly that the abbreviation “∀x∃!yF” is used for it. The existence condition is (omitting the leading universal quantifier) ∃yF. The uniqueness condition may be expressed as either ∃z∀y(F ⇒ y = z) or F ∧ F_{z/y} ⇒ z = y, where z does not occur free in F; the two are equivalent by predicate logic. The existence and uniqueness conditions may be combined into a single formula in various ways. Further discussion is omitted, except to note that the exercises in predicate logic involved require some mastery, which should be acquired in a course in predicate logic.

Using the existence and uniqueness conditions, it follows in predicate logic that ∃y(F ∧ G) is equivalent to ∀y(F ⇒ G). Thus, if G is a formula for P, there is a y with y = f(x) and P(y) if and only if P(y) holds whenever y = f(x). If F is Σ1, then if G is Σ1 then ∃y(F ∧ G) is Σ1; and if G is Π1 then ∀y(F ⇒ G) is Π1. If G is ∆1^KP then so is G “with f(x) substituted for y”, when f is a Σ1-definable function whose existence and uniqueness conditions are provable in KP. The same is true in various other theories, such as ZFC or PA.

Many of the ∆0 predicates listed in section 16 are definitions of functions. For all of these, the existence and uniqueness conditions are provable in KP; some examples will be given.

For g = f ↾ x, rewrite the predicate as g = {p ∈ f : π1(p) ∈ x}. Existence follows by ∆0 separation. Uniqueness follows by supposing g1 and g2 both satisfy the predicate, and showing that if p ∈ g1 then p ∈ g2; whence by symmetry g1 = g2.

For z = x × y, using ∆0 collection, ∀u ∈ x∃p_u A where A is ∀v ∈ y∃w ∈ p_u(w = ⟨u, v⟩). Again using ∆0 collection, ∃p′∀u ∈ x∃p_u ∈ p′A. Let p = ∪p′; then ∀u ∈ x∀v ∈ y∃w ∈ p(w = ⟨u, v⟩). Using ∆0 separation, the existence condition follows.
Uniqueness follows by proceeding as in the first example.

In the proof of theorem 16.2, a definition by recursion on ∈ was given. With suitable hypotheses, such definitions can be given in models of KP. In the following, f = F ↾ x is an abbreviation for ∀w ∈ x∀y(F_{w/x} ⇔ f(w) = y).

Theorem 4. Given a Σ1 formula G, let F be the formula ∃f(I ∧ G), where I is the formula “f is a function and π1[f] = x and ∀w ∈ x∃y∃g(y = f(w) ∧ g = f ↾ w ∧ G_{w/x,g/f})”. Then ∃!yF is provable in KP from ∃!yG; F ⇔ ∃f(f = F ↾ x ∧ G) is also provable.

Proof: Suppose I, I_{f′/f}, and w ∈ x. Suppose inductively that f′(w′) = f(w′) for w′ ∈ w; then f′ ↾ w = f ↾ w. Using this and the uniqueness condition for G, f′(w) = f(w). Thus, by ∈-induction, (1) I ∧ I_{f′/f} ⇒ f′ = f. By a similar argument using the existence condition for G, (2) ∃fI. The uniqueness condition for F follows by (1) and the uniqueness condition for G.

Suppose inductively that ∀x ∈ x0∃yF, that is, ∀x ∈ x0∃y∃f(I ∧ G). Then by Σ1 collection, for some c_y and c_f, ∀x ∈ x0∃y ∈ c_y∃f ∈ c_f(I ∧ G). I is ∆1^KP, so using ∆1 separation, let f0 = ∪{f ∈ c_f : ∃x ∈ x0 I}. Using (1) and (2), f0 is a function with domain x0. I_{x0/x,f0/f} is readily seen to hold, whence ∃y(I_{x0/x,f0/f} ∧ G_{x0/x,f0/f}) holds. ∃yF follows by ∈-induction.

For the last claim, it suffices to show that I ⇔ f = F ↾ x; since ∃!f(f = F ↾ x) it suffices to show I ⇒ f = F ↾ x. Suppose I ∧ w ∈ x. From the definition of f0 above, I_{w/x,g/f} ⇒ g = f ↾ w, from which I_{w/x,g/f} ⇔ g = f ↾ w. By the definition of I, ∃y(y = f(w) ∧ F_{w/x}), whence F_{w/x} ⇔ f(w) = y. ⊳

Recall from section 13 the definition of the smallest transitive set containing a set x as a subset; it is called the transitive closure of x. Informally, it equals x ∪ (∪x) ∪ (∪∪x) ∪ · · ·, that is, the sets which are members of x, members of members of x, etc. It follows by the theorem that the function TC(x), whose value at x is the transitive closure of x, is Σ1^KP.
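For hereditarily finite sets the informal description x ∪ (∪x) ∪ (∪∪x) ∪ · · · can be carried out directly. The following Python fragment is a minimal sketch under the assumption that such sets are represented as nested frozensets; the names tc and is_transitive are hypothetical, introduced only for this illustration.

```python
def tc(x):
    """Transitive closure of a hereditarily finite set given as a frozenset.

    Collects the members of x, the members of members of x, and so on,
    until no new sets appear.
    """
    result = set(x)
    frontier = set(x)
    while frontier:
        members = set()
        for w in frontier:
            members |= w          # add all members of w
        frontier = members - result
        result |= frontier
    return frozenset(result)


def is_transitive(x):
    """x is transitive if every member of x is a subset of x."""
    return all(w <= x for w in x)


EMPTY = frozenset()               # the ordinal 0
ONE = frozenset({EMPTY})          # the ordinal 1 = {0}
TWO = frozenset({EMPTY, ONE})     # the ordinal 2 = {0, 1}
x = frozenset({TWO})              # the set {2}, which is not transitive
```

With these definitions tc(x) is {0, 1, 2}, which is transitive and contains x as a subset, while tc(TWO) is TWO itself, since an ordinal is already transitive.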
Basic properties of the TC operation include the following.
- x ⊆ TC(x)
- x = TC(x) if and only if x is transitive.
- If w ∈ x then TC(w) ⊂ TC(x).
- TC(x) = x ∪ (∪_{w∈x} TC(w))

It clearly follows that |x| ≤ |TC(x)|, and |w| ≤ |TC(x)| for all w ∈ x. On the other hand, if κ is an infinite cardinal, |x| ≤ κ, and |TC(w)| ≤ κ for all w ∈ x, then |TC(x)| ≤ κ. This follows since by the last fact above, |TC(x)| ≤ κ + κ · κ.

An admissible set is defined to be a transitive set which is a model of KP. The use of the term “model” requires clarification. A definition has been given in informal set theory; this can be given in formal set theory, indeed this will be done shortly. On the other hand, in set theory D can be said to be a model of a sentence F if F^D is true in V. If D is a set the two definitions agree (this is proved in the next section); the latter definition can be used when D is a class.

If the existence and uniqueness conditions have been proven in KP for a function f with a Σ1 definition, then an admissible set D is “closed” under f, that is, if x1, . . . , xk ∈ D then f(x⃗) ∈ D.

If D is a transitive set, D ∩ Ord is an ordinal, indeed the least ordinal which is not an element of D. This ordinal is called the ordinal of D; the notation “o(D)” is sometimes used to denote it, but here D ∩ Ord will be used. An ordinal α is said to be admissible if and only if there is an admissible set D with α = D ∩ Ord. In early stages of admissible set theory admissible ordinals had been given various “intrinsic” characterizations; the characterization in terms of admissible sets has turned out to be a useful one.

Some examples of admissible sets will shortly be given. In addition, some material of general interest will be covered, including Vα, the rank function, Hκ, cofinality, and regular and singular cardinals.

Let V0 = ∅, Vα+1 = Pow(Vα), and Vα = ∪_{β<α} Vβ for limit ordinals α. The following are basic facts about these sets.
1. Every Vα is transitive.
2. If β ≤ α then Vβ ⊆ Vα.
3.
For every x ∈ V there is an α such that x ∈ Vα.

For fact 1, Pow(S) is transitive for any transitive set S, and the union of transitive sets is transitive, and the claim follows by transfinite induction. Fact 2 follows by transfinite induction on α; for example Vα ∈ Vα+1, so by fact 1 Vα ⊆ Vα+1. For fact 3, if every element of x is in some Vβ then using the axiom of replacement there is a Vα containing all of them, and x ∈ Vα+1. Thus, the claim follows by induction on ∈.

Any set D is already a model of the foundation scheme. Suppose C is the set of w ∈ D such that F is true of w in D. If C is nonempty then there is an ∈-minimal element w′ in C, and this has the necessary properties in D.

Any transitive set D is already a model of extensionality. If x, y ∈ D then all elements of either x or y are also in D, so if these sets are the same in D they are the same.

Thus for any α, Vα, being transitive, satisfies the extensionality axiom and the foundation axiom scheme. If α is a limit ordinal, Vα also satisfies the pairing, union, and power set axioms, because applying these operations to elements of Vα yields an element of Vα. Vα satisfies the separation axiom, because any subset of a member of Vα is a member of Vα. Vα satisfies the axiom of choice, because given a set of nonempty sets in Vβ for some β < α, there is a choice function in Vβ+i where i is a small integer. If α > ω then Vα satisfies the axiom of infinity, because ω ∈ Vα.

The rank ρ(x) of a set is defined to be the least α such that x ∈ Vα+1. Vα may be seen as the αth “level” of the “cumulative hierarchy” of sets. Each set is a set of objects, which themselves are simpler sets. The rank gives a quantitative meaning to the notion of simpler. The following are basic facts about the rank function.
1. Vα = {x : ρ(x) < α}. If ρ(x) = β < α then x ∈ Vβ+1 ⊆ Vα; and if ρ(x) ≥ α then x ∉ Vα.
2. If w ∈ x then ρ(w) < ρ(x). If ρ(x) = α then x ∈ Vα+1, so x ⊆ Vα, so w ∈ Vα, so ρ(w) < α.
3. Vα ∩ Ord = α.
This follows by transfinite induction. For example at successor stages, inductively α ⊆ Vα, so α ∈ Vα+1, so α + 1 = α ∪ {α} ⊆ Vα+1. If α + 1 ∈ Vα+1 then α + 1 ⊆ Vα, so α ∈ Vα, contradicting the induction hypothesis.
4. ρ(α) = α. This follows by fact 3.
5. ρ(x) = sup{ρ(w) + 1 : w ∈ x}. If ρ(x) = α and w ∈ x then ρ(w) < α, so ρ(w) + 1 ≤ α; thus, sup{ρ(w) + 1 : w ∈ x} ≤ α. If α = β + 1 there must be some w ∈ x with ρ(w) = β, else x ⊆ Vβ; so sup{ρ(w) + 1 : w ∈ x} = α in this case. If α is a limit ordinal then for all β < α there must be a w ∈ x with ρ(w) ≥ β, else x ⊆ Vβ for some β < α; thus sup{ρ(w) + 1 : w ∈ x} = α in this case also.
6. If ρ(x) = α and γ < α then ρ(w) = γ for some w ∈ TC(x). This follows by induction on α. If α = β + 1 then as in the proof of fact 5 there is a w ∈ x with ρ(w) = β, and for γ < β the claim follows inductively. If α is a limit ordinal then there is a w ∈ x with ρ(w) = β and γ < β, and the claim follows inductively.
7. |ρ(x)| ≤ |TC(x)|. This follows by the previous fact, since distinct ordinals γ < ρ(x) are the ranks of distinct elements of TC(x).

Item 5 may be used to give a Σ1^KP definition of the rank function in any admissible set; thus, it is defined and absolute for admissible sets. The following theorem shows how the rank function may be used to prove a basic fact about KP.

Theorem 5. Given the other axioms of KP, Σ1 replacement follows from ∆0 collection, and ∆0 collection follows from Σ2 replacement (the replacement axiom with F restricted to Σ2 formulas).

Remarks on proof: The replacement axiom differs from the collection axiom in two ways. In the collection axiom Fxy is required to define a relation which is total when restricted to the domain u, and it is asserted that there exists a set v containing the range of the restricted relation. In the replacement axiom Fxy is required to define a relation which is single valued, and it is asserted that if the domain is restricted to a set u then the range v is a set.
Given a Σ1 formula F which defines a single valued relation, and a set u, by the collection axiom there is a set w which contains the range. The required set v is then {y ∈ w : ∃x ∈ uF}. Since F is single valued it may be written in Π1 form, and by theorem 3 ∃x ∈ uF may be also. By ∆1 separation (theorem 2) v is a set.

Let KP′ be KP, with ∆0 collection replaced by Σ1 replacement. Theorem 4 may be proved in KP′ by suitably modifying the proof in KP (since the uniqueness condition holds for F, replacement may be used instead of collection to prove the existence condition). Given a ∆0 formula Fxy let G(x, β) be the formula, “∃y(ρ(y) ≤ β ∧ F) ∧ ∀y(F ⇒ ρ(y) ≥ β)”. Then G is Σ2, and the uniqueness condition is provable, so by Σ2 replacement the image of u is a set. ⊳

By modifying the above proof, the axiom of collection for any formula F follows in ZFC. Also, let “strong Σ1 collection” be the axiom scheme “∃v∀x ∈ u(∃yF ⇒ ∃y ∈ vF)” for Σ1 F where v does not occur free; this follows from Σ2 replacement.

For an infinite cardinal κ let Hκ be {x : |TC(x)| < κ}. Using fact 7 above, if x ∈ Hκ then |TC(x)| < κ, so ρ(x) < κ, so x ∈ Vκ. That is, Hκ ⊆ Vκ. In particular, Hκ is a set. Hκ is transitive; it follows from the definition that if x ∈ Hκ and w ∈ x then w ∈ Hκ. Similarly to the case of Vα for limit α, Hκ satisfies extensionality, the foundation scheme, pairing, union, separation, and choice.

Before considering collection in Hκ, the notion of cofinality will be introduced. Let α be a limit ordinal. A subset S ⊆ α is said to be “unbounded” if for all β < α there exists γ ∈ S such that γ ≥ β. The “cofinality” cf(α) of α is the smallest ordinal β such that there is a function f : β ↦ α whose range is unbounded in α. If β is a successor ordinal then f[β] cannot be unbounded; it follows that cf(α) is a limit ordinal. For example, the cofinality of ℵω is ω. The map n ↦ ℵn shows that it is at most ω, and ω is the smallest limit ordinal.
A strictly order-preserving map between linear orders is also called “increasing”. It does not matter whether the function f in the definition of cf(α) is required to be increasing. Given an arbitrary f with domain β and range unbounded in α, a strictly order-preserving f′ with domain β′ ≤ β and unbounded range can be defined. Using transfinite recursion, let f′(γ) be the least δ ∈ f[β] which is greater than any element of f′[γ], if any. A function which is increasing and has unbounded range will be said to be increasing and unbounded.

The cofinality cf(α) of a limit ordinal α is a cardinal number κ with κ ≤ α. Indeed, if f : β ↦ α, f[β] is unbounded, and g : κ ↦ β is a bijection where κ is the cardinality of β, then (f ◦ g)[κ] is unbounded; it follows that the smallest β must be a cardinal.

Suppose α, β, and γ are limit ordinals, f : β ↦ α, and g : γ ↦ β. It is easy to verify that if f and g are increasing then f ◦ g is increasing; and that if f and g are increasing and f[β] and g[γ] are unbounded then (f ◦ g)[γ] is unbounded. Supposing that f is increasing and unbounded, it follows that cf(α) ≤ cf(β).

In fact, cf(β) = cf(α). Suppose cf(α) = κ and h : κ ↦ α is increasing and unbounded. Let g : κ ↦ β be constructed inductively, letting g(δ) = µζ(f(ζ) ∉ h[δ]). Since κ = cf(α) the recursion must continue until the domain κ of g is exhausted, and ζ must eventually exceed any element of β. This shows that cf(β) ≤ cf(α) also.

A cardinal κ is said to be regular if cf(κ) = κ; otherwise it is said to be singular. cf(α) is a regular cardinal. Indeed, if f : κ ↦ α is increasing and unbounded, and g : λ ↦ κ is increasing and unbounded, then as noted above f ◦ g is increasing and unbounded.

Suppose κ = ℵα is an infinite cardinal. It is said to be a successor cardinal if and only if κ = λ^+ for some λ, or equivalently if α is a successor ordinal. Otherwise, α = 0 or α is a limit ordinal, and κ is a union of smaller cardinals; in this case κ is said to be a limit cardinal.

Theorem 6.
A successor cardinal is regular.

Proof: Suppose f : λ ↦ ℵα+1 where λ ≤ ℵα. Then |f(δ)| ≤ ℵα for all δ < λ, whence |∪_{δ<λ} f(δ)| ≤ λ · ℵα ≤ ℵα. Thus, f[λ] is not unbounded. ⊳

For limit cardinals other than ω, cf(ℵα) = cf(α). The question arises of whether there are any regular limit cardinals other than ω. This question is independent of ZFC; more will be said in section 30.

Lemma 7. Suppose κ is regular, λ < κ, and |Sα| < κ for α < λ. Then |∪_{α<λ} Sα| < κ.

Proof: Because κ is regular, if µ = sup{|Sα| : α < λ} then µ < κ; so |∪_{α<λ} Sα| ≤ λ · µ < κ. ⊳

“Full collection” is the axiom scheme ∀u ∈ x∃vF ⇒ ∃y∀u ∈ x∃v ∈ yF for any formula F.

Theorem 8. If κ is a regular cardinal then Hκ satisfies full collection, and is an admissible set.

Proof: Suppose ∀u ∈ x∃vF holds in Hκ, and for each u ∈ x let v_u ∈ Hκ be such that F holds at u and v_u; let y = {v_u : u ∈ x}. Since x ∈ Hκ, |x| < κ, so |y| < κ. Since v_u ∈ Hκ, |TC(v)| < κ for each v ∈ y. Using lemma 7, |TC(y)| < κ, so y ∈ Hκ. ⊳

Hω is known as the collection of hereditarily finite sets. Since ω is a regular cardinal, Hω is an admissible set. It is easy to see that any set in Vω is hereditarily finite, whence Vω = Hω. Further, if D is any admissible set, then Vω ⊆ D must hold, using the pairing and union axioms. That is, the hereditarily finite sets comprise the smallest admissible set.

Vω is a model of all the axioms of ZFC, except the axiom of infinity. That replacement holds can be verified by proving full collection as in theorem 8, and using the observation following theorem 5 (alternatively a direct proof can readily be given).

In fact, Hκ is admissible for any infinite cardinal κ. The proof requires additional methods, and will be omitted. The textbooks [Barwise] and [Devlin] have proofs. From this, it follows that every infinite cardinal is an admissible ordinal. There are many more, though; indeed there are many admissible countable ordinals.

18. Formalization of syntax.
Just as in the theory of PA, formalization of syntax plays a role in the theory of ZFC. A predicate Sat can be defined, so that Sat(D, f, a) is true if and only if the formula F, where f = ⌜F⌝, is true in the set D (considered as a structure for the language of set theory), with assignment a to the free variables of F. In the notation of section 6, Sat(D, f, a) if and only if F̂(a) = t in the structure D. The “Gödel numbering” of formulas used differs from that of arithmetic, and uses sets in Vω rather than integers representing strings.

This predicate is called the “satisfaction predicate”. It has various uses; one will be given in the next section. It is important to show that this predicate is ∆1^ZF. Since it is little more work, it will be shown to be ∆1^KP. This latter fact has applications in admissible set theory.

A function is said to be Σ1^KP if the predicate “f(x⃗) = y” is Σ1^KP. If the existence and uniqueness conditions are provable in KP then the predicate is in fact ∆1^KP, since it is provably equivalent to ∀w(F_{w/y} ⇒ w = y).

The language of set theory has no constants or function symbols. Thus, a formula F may be given a code ⌜F⌝ as follows.
- xn ∈ xm: ⟨0, ⟨n, m⟩⟩
- xn = xm: ⟨1, ⟨n, m⟩⟩
- ¬F: ⟨2, ⌜F⌝⟩
- F ◦ G: ⟨i, ⟨⌜F⌝, ⌜G⌝⟩⟩ where i is 3, 4, 5, 6 when ◦ is ∧, ∨, ⇒, ⇔ respectively
- Qxn F: ⟨i, ⟨n, ⌜F⌝⟩⟩ where i is 7, 8 when Q is ∀, ∃ respectively

Theorem 17.4 does not allow the above recursion to be formalized immediately; however a generalization does, wherein F may be defined from its values at elements of TC(x), rather than members of x. The corresponding induction rule (lemma 1) is needed for the proof of the theorem (theorem 2) showing that the recursively defined function exists. The abbreviation “∀w ∈ TC(x)F_{w/x}” is used for “∃t(t = TC(x) ∧ ∀w ∈ tF_{w/x})”, and similar abbreviations.

Lemma 1. ∀x(∀w ∈ TC(x)F_{w/x} ⇒ F) ⇒ ∀xF.

Proof: Suppose ∀w ∈ TC(x)F_{w/x} ⇒ F. Let F′ be ∀u ∈ TC(x ∪ {x})F_{u/x}. Suppose ∀w ∈ xF′_{w/x}; then ∀u ∈ TC(x)F_{u/x}, so F, so F′. Thus, F′ follows by ∈-induction, and F follows from F′. ⊳

Theorem 2. Given a Σ1 formula G, let F be the formula ∃f(I ∧ G), where I is the formula “f is a function and π1[f] = TC(x) and ∀w ∈ TC(x)∃y∃g(y = f(w) ∧ g = f ↾ TC(w) ∧ G_{w/x,g/f})”. Then ∃!yF is provable in KP from ∃!yG; F ⇔ ∃f(f = F ↾ TC(x) ∧ G) is also provable.

Proof: The theorem may be proved by modifying the proof of theorem 17.4; a few observations on the modifications will be made. The proofs of the existence and uniqueness conditions for I make use of lemma 1, and the existence and uniqueness conditions for TC. In the proof of the existence condition for F let f0 = ∪{f ∈ c_f : ∃x ∈ TC(x0)I}; then I_{x0/x,f0/f} follows. Lemma 1 is used in concluding ∃yF. For the last claim, I ⇒ f = F ↾ TC(x) is shown. Suppose I ∧ w ∈ TC(x). I_{w/x,g/f} ⇒ g = f ↾ TC(w) follows, from which I_{w/x,g/f} ⇔ g = f ↾ TC(w). F_{w/x} ⇔ f(w) = y follows as before. ⊳

The predicate IsForm(x) can readily be defined by a recursion as in theorem 2. It holds if and only if x = ⟨0, ⟨n, m⟩⟩ or x = ⟨1, ⟨n, m⟩⟩ or ∃w ∈ TC(x)(x = ⟨2, w⟩ . . .) or . . . . There is a slight technicality, in that a 0-1 valued function IsFormF must be defined first; IsForm(x) then equals IsFormF(x, 1), or ∀y(IsFormF(x, y) ⇒ y = 1). Alternatively, versions of theorems 17.4 and 2 can be given for predicates.

Using theorem 2 it is a routine exercise to show that the following functions and predicates are ∆1^KP, and that the existence and uniqueness conditions for the functions are provable in KP.
- FrVar(f), the set of free variables of the formula coded by the set f (or ∅ if f is not the code of a formula).
- IsAsn(D, s, a), a : s ↦ D where s is a set of variables.
- Sat(D, f, a).

As an example, Sat in the case where F is ∀xi G, is ∀u ∈ D Sat(D, g, a′) where g = ⌜G⌝ and a′ is a ∪ {⟨i, u⟩} (i.e., the value of a formula which defines this).

If F is a formula with free variables xi1 , . . .
, x_{i_k} let Asn_F be a formula with the additional free variable a, which is true if and only if a is the assignment to x_{i_1}, . . . , x_{i_k} determined by their values (i.e., such that a(i_j) = x_{i_j} for 1 ≤ j ≤ k). The following theorem uses abbreviations which should be familiar by now. It shows that for sets, the two notions of a model for a sentence of set theory mentioned in section 17 are equivalent.

Theorem 3. For any formula F, ⊢_KP Sat(D, ⌜F⌝, Asn_F(x⃗)) ⇔ F^D.
Proof: The proof is a straightforward if tedious induction on F. ⊳

19. Constructible sets.

Suppose S is a set, F is a formula, x_i is a free variable of F, and a is an assignment to the remaining free variables of F. Then the subset T = {u ∈ S : Sat(S, ⌜F⌝, a′) where a′ = a ∪ {⟨i, u⟩}} is said to be the subset defined by F and a in S. It should be clear from the preceding section that the function T = DefBy(S, f, a) stating that this is the case is Δ_1^KP. The existence condition follows using Δ_1 collection.

Let Def(S) be {T : ∃f∃a(T = DefBy(S, f, a))}. This is the subset of Pow(S) consisting of those subsets which are defined in S by some F and a. In model theory, the definable sets (or more generally predicates) in a structure are similarly defined. A definition of a set T is said to have parameters if a is nonempty, else to be parameter-free. Thus, Def(S) is just the subsets of the structure S for the language of set theory which have a definition with parameters.

Theorem 1. The function Def(S) is Δ_1^KP.
Proof: The existence condition can be proved, because f can be limited to V_ω, and a to x^{<ω}, the “finite sequences” in x, i.e., the functions from n to x for some integer n. See the proof of theorem 6 below for further details. ⊳

The following definition was given by Kurt Gödel in 1939, and is one of the most important definitions in set theory.
- L_0 = ∅
- L_{α+1} = Def(L_α)
- L_α = ∪_{β<α} L_β for limit ordinals α

In the following, the notation x ↦ t will be used.
Here, t is a term involving x, and x ↦ t denotes the function whose value at x is t. This is a convenient method for denoting a function, without having to give it a name.

Theorem 2. The function α ↦ L_α is Δ_1^KP.
Proof: This follows from theorem 1 using theorem 17.4. ⊳

Theorem 3.
a. L_α is transitive.
b. L_β ∈ L_α if β < α.
c. L_α ⊆ V_α.
d. L_n = V_n for n ∈ ω, and L_ω = V_ω.
e. L_α ∩ Ord = α.
Proof: Part a follows by induction using the fact that if S is transitive then Def(S) is transitive; indeed, if w ∈ x ∈ Def(S) then x ⊆ S, so w ∈ S, so w is defined by x_i ∈ x_j and {⟨j, w⟩}. For part b, S ∈ Def(S) for any set S, so L_α ∈ L_{α+1}, and by part a L_α ⊆ L_{α+1}. The claim, together with L_β ⊆ L_α, follows by induction on α. Part c follows by induction, since Def(S) ⊆ Pow(S), and X ⊆ Y ⇒ Pow(X) ⊆ Pow(Y). For part d, if S is finite then it is easily seen that Def(S) = Pow(S), and the claim follows. For part e, by part c L_α ∩ Ord ⊆ α. On the other hand it is easily seen that if L_α ∩ Ord = α then α ∈ L_{α+1}, and the claim follows. ⊳

Although it is a secondary topic, some further remarks may be made about KP. Let KP∞ be KP, with the axiom of infinity added. Theorems 1 and 2 hold with ZF replaced by KP∞, and also theorem 6 below. The only admissible set not satisfying the axiom of infinity is V_ω. Any other admissible set D contains ω, V_ω, and x^{<ω} for any set x ∈ D. Theorem 3 holds in any admissible set (except the second claim of part e in V_ω). Suppose D is an admissible set and D ∩ Ord = α. By absoluteness, L_β in D is the same as L_β for β < α. L_α is a subset of D. L_α is an admissible set (this follows by arguments similar to theorem 4), and is the smallest admissible set D with D ∩ Ord = α.

Let L be the class ∪_α L_α. As usual, this is an abbreviation for a formula stating that x is in some L_α. Note that L is a transitive class; it is known as the class of constructible sets. Theorem 5 below shows that it is a model of ZF in the sense stated in section 17.
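Theorem 3 can be checked by brute force at the finite stages. The following is only a sketch under a stated assumption: it does not implement Def, but instead relies on the fact quoted in the proof of part d, that Def(S) = Pow(S) for finite S, so that L_n = V_n is computed by iterating the power set. The helper names (powerset, stage, is_transitive, ordinal, ordinals_in) are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def powerset(s):
    """All subsets of the finite set s, as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def stage(n):
    """V_n, which equals L_n for finite n (assumption: Def(S) = Pow(S) for finite S)."""
    v = frozenset()
    for _ in range(n):
        v = powerset(v)
    return v


def is_transitive(s):
    """Every member of a member of s is itself a member of s."""
    return all(w in s for x in s for w in x)


def ordinal(n):
    """The finite von Neumann ordinal n = {0, ..., n-1}."""
    o = frozenset()
    for _ in range(n):
        o = o | {o}
    return o


def ordinals_in(s):
    """Members of s that are ordinals: transitive sets of transitive sets."""
    return frozenset(
        x for x in s if is_transitive(x) and all(is_transitive(w) for w in x)
    )


L4 = stage(4)
print(len(L4), is_transitive(L4), ordinals_in(L4) == ordinal(4), stage(2) in stage(3))
```

For n = 4 this exercises parts a, b and e of the theorem on a set with 16 elements; the next stage already has 65536 elements, so the check is only practical for very small n.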
For lemma 4, the notation “LimOrd” is introduced for the class of limit ordinals. Also, the notion of a “subformula” of a formula F is required; the set of these is defined by an obvious recursion. Neither lemma 4 nor theorem 5 requires the axiom of choice.

Lemma 4. Suppose F is a formula with free variables x_1, . . . , x_k. Then ⊢_ZF ∀α∃β ≥ α(β ∈ LimOrd ∧ ∀x_1, . . . , x_k ∈ L_β(F^L ⇔ F^{L_β})).
Proof: For a formula G with free variables w, x_1, . . . , x_k let B_{Gw}(α) be the least β > α such that for all x_1, . . . , x_k ∈ L_α, if ∃w ∈ L G^L then ∃w ∈ L_β G^L; such a β exists by full collection. Let α_0 = α; let α_{i+1} be the supremum, over subformulas G of F and free variables w of G, of B_{Gw}(α_i); and let β = ∪_i α_i. It will be shown by induction on G that if G is a subformula of F, with free variables y_1, . . . , y_l, and y_1, . . . , y_l ∈ L_β, then G^L ⇔ G^{L_β}. The claim is immediate for atomic formulas, and follows trivially for propositional connectives. It suffices to show the claim for ∃wG. Suppose ∃w ∈ L G^L; then ∃w ∈ L_β G^L, so inductively ∃w ∈ L_β G^{L_β}. Suppose ∃w ∈ L_β G^{L_β}; then inductively ∃w ∈ L_β G^L, so ∃w ∈ L G^L. ⊳

Theorem 5. If F is an axiom of ZF then ⊢_ZF F^L.
Proof: Since L is transitive, the axioms of extensionality and foundation hold in L. If a, b ∈ L_α then {a, b} is defined in L_α by x = a ∨ x = b, so pairing holds. If a ∈ L_α then ∪a is defined in L_α by ∃b ∈ a(x ∈ b), so union holds. ω ∈ L_{ω+1}, so infinity holds. For separation, let F be a formula, and suppose the free variables are assigned values in L; then for some α the values are in L_α. By lemma 4 β may be chosen such that F^L ⇔ F^{L_β} when the free variables are assigned values in L_β. Let x = {w ∈ y : F^{L_β}}; by the definition of L_{β+1} and theorem 18.3, x ∈ L_{β+1}. By choice of β, x = {w ∈ y : F^L}. Since x ∈ L, the axiom of separation for F has been shown. For power set, suppose y ∈ L. Let x = {w ⊆ y : w ∈ L}. By power set in V, x is a set, and is a subset of L.
Using replacement in V, x ⊆ L_α for some α. By separation in L (which has just been shown), x ∈ L. For replacement, suppose F defines a single valued predicate in L, and u ∈ L. Let v = {y ∈ L : ∃x ∈ u F^L}. By replacement in V, v is a set, and is a subset of L. As for power set, it follows that v ∈ L. ⊳

Theorem 5 is a remarkable fact. The axioms of set theory seem clear enough, and there is no reason to suspect that such a proper class might exist. Gödel’s discovery of it in 1939 raised questions which remain under study to this day, the most obvious being whether every set is constructible. The statement that this is so is called the hypothesis of constructibility, and V = L is frequently written for it. It will be seen later that it is independent of ZFC.

Suppose A is a transitive model of ZF, and α = A ∩ Ord (or Ord if A is a proper class). Then x ∈ L^A if and only if ∃β ∈ A(x ∈ L^A_β). Using absoluteness, x ∈ L^A if and only if ∃β ∈ α(x ∈ L_β). That is, L^A = L_α. If A is a proper class, L^A = L. Thus, L is the smallest transitive proper class which is a model of ZF. Also, L^L = L, and (since clearly V^L = L), V = L holds in L.

Let ⊥ denote an obviously contradictory statement. Suppose V = L ⊢_ZF ⊥. Modify the proof by relativizing each statement to L. The resulting sequence of formulas may be expanded to a ZF proof of ⊥^L; this follows using theorem 5, some facts about the predicate calculus, and ⊢_ZF (V = L)^L. By suitable formalization, it follows that Con(ZF + V = L) is provable in ZF (in fact PA) from Con(ZF).

The hypothesis of constructibility is a very strong statement. In [Shelah2] the statement is made that “A major preliminary obstacle to this dream is the lack of a good candidate to be a test problem, since so many questions have already been settled under the assumption V = L”. V = L is so powerful because L can be shown to have a variety of properties; these are not provable in ZFC for “all” sets.
For example, L has a “definable well-order”. The notion of a (strict) well-order < on a class S needs to be clarified. It is a class of ordered pairs, such that for each pair p, π_1(p) ∈ S and π_2(p) ∈ S. It satisfies the axioms for a strict partial order, namely x < y ∧ y < z ⇒ x < z and x ≮ x; and trichotomy x ∈ S ∧ y ∈ S ⇒ (x < y ∨ x = y ∨ y < x), so that it is a strict linear order on S. For the remaining restrictions usage varies. At the least, every nonempty subset x ⊆ S must contain a <-minimal element v (i.e., ∀u ∈ x(u ≮ v)). The additional requirement that {u : u < v} be a set for all v ∈ S may be imposed. Since relations satisfying only the first restriction have uses, these will be called well-orders; if the second restriction holds also the well-order will be said to have small extensions.

Theorem 6. There are binary predicates <_α on L_α such that, for α < β, <_α ⊆ <_β, and x ∈ L_α ∧ y ∈ L_β − L_α ⇒ x <_β y. The map α ↦ <_α is Σ_1 and the existence and uniqueness conditions are provable in KP.
Proof: The proof is an exercise in definition by Σ_1 recursion; an outline is given below. ⊳

Write x <_L y for ∃α(x <_α y). It is readily seen from the theorem that <_L is a well-order on L with small extensions, and has a Σ_1 definition. As a consequence, the axiom of choice holds in L; indeed there is a definable function on the nonempty sets in L, which assigns to each nonempty set x its <_L-least element. It follows that AC is provable in ZF from V = L, from which it follows that if ZF is consistent then ¬AC is not provable.

It is also true that <_L^L (the predicate defined by the relativized formula) equals <_L. Since the formula is Σ_1, x <_L^L y ⇒ x <_L y. Since <_L satisfies trichotomy, x <_L y if and only if x ∈ L ∧ y ∈ L ∧ x ≠ y ∧ y ≮_L x, and x <_L^L y follows.

The well-orders <_α are defined by recursion. <_0 = ∅, and x <_{α+1} y if and only if
x ∈ L_{α+1} ∧ y ∈ L_{α+1} ∧
(x ∈ L_α ∧ y ∈ L_α ∧ x <_α y ∨
x ∈ L_α ∧ y ∉ L_α ∨
x ∉ L_α ∧ y ∉ L_α ∧ P(x, y)).
The predicate P(x, y) holds if there is a definition of x which precedes the first definition of y; an outline of its definition will be given below. For a limit ordinal α, <_α = ∪_{β<α} <_β.

For the definition of P some functions of general use will be defined first. These have Σ_1 definitions and the existence and uniqueness conditions are provable in ZF. If v is a collection of subsets of x let Add1(v, x) equal {s ∪ {a} : s ∈ v, a ∈ x}. Let [x]^{≤n} denote the subsets s ⊆ x with |s| ≤ n; this may be defined by the recursion [x]^{≤0} = {∅}, [x]^{≤n+1} = [x]^{≤n} ∪ Add1([x]^{≤n}, x). The set of finite subsets [x]^{<ω} equals ∪_{n∈ω} [x]^{≤n}. A similar recursion may be used to define the set x^n of sequences in x of length n; and the set x^{<ω} of finite sequences.

The finite rank sets may be defined without the power set axiom, by the recursion V_0 = ∅, V_{n+1} = [V_n]^{<ω}, and V_ω = ∪_n V_n. Let <^f_n be the well-order on V_n defined recursively as follows. <^f_0 = ∅, and x <^f_{n+1} y if and only if
x ∈ V_{n+1} ∧ y ∈ V_{n+1} ∧
(x ∈ V_n ∧ y ∈ V_n ∧ x <^f_n y ∨
x ∈ V_n ∧ y ∉ V_n ∨
x ∉ V_n ∧ y ∉ V_n ∧ ∃s ∈ V_n(s ∉ x ∧ s ∈ y ∧ ∀t ∈ V_n(t <^f_n s ⇒ (t ∈ x ⇔ t ∈ y)))).
Let <^f denote ∪_n <^f_n; this is a well-order on V_ω.

Suppose ≤ is a linear order on a set x, and γ is an ordinal. Let <_lex be the relation on x^γ such that f <_lex g if and only if ∃β < γ(f(β) < g(β) ∧ ∀α < β(f(α) = g(α))). A straightforward verification shows that this relation is the strict part of a linear order. This order has various uses in mathematics, including set theory, and is called the lexicographic order. If < is a well-order, in general <_lex is not; however if γ is a finite ordinal n then <_lex is a well-order. To see this, given any nonempty set s ⊆ x^n, define a “leftmost” element f inductively, by letting f(i) be the least value such that ⟨f(0), . . . , f(i)⟩ is a prefix of an element of s.

Given a well-order < on a set x, let <_l be the binary relation on x^{<ω} defined as follows. For s, t ∈ x^{<ω} let m = Dom(s) and n = Dom(t).
Then s <_l t if and only if m < n ∨ m = n ∧ s <_lex t. This relation is easily seen to be a well-order on x^{<ω}.

To define P(x, y), an element of L_α is considered to be given by a pair d = ⟨f, a⟩ where f ∈ V_ω and a ∈ L_α^{<ω}. Let D denote the set of these; for d_1, d_2 ∈ D let d_1 <_D d_2 if and only if f_1 <^f f_2, or f_1 = f_2 and a_1 <_l a_2 (where <_l is obtained from <_α); and let V_D(d) denote DefBy(L_α, f, a). Then P is ∃d_1, d_2 ∈ D(x = V_D(d_1) ∧ y = V_D(d_2) ∧ ∀d_3 ∈ D(d_3 <_D d_2 ⇒ y ≠ V_D(d_3)) ∧ d_1 <_D d_2).

20. CH is true in L.

A constructible x ⊆ ω must be constructed by some stage L_α. The nature of L allows using “model theoretic” arguments to first construct a substructure containing x, second collapse it using the Mostowski collapsing lemma, and third show that the result is L_β for some β < ℵ_1^L. The same argument works for any infinite cardinal κ, and it follows that GCH holds in L.

There are other methods of proving this. For example Pow(κ) ⊆ H_{κ^+}, and by a model theoretic construction it can be shown that in L, H_{κ^+} ⊆ L_{κ^+}. This proof may be found in [Drake]. The first proof will be given here, because all three steps are of wide use in set theory. The third is called the “condensation lemma”.

If S is a structure for a first order language, and T ⊆ S is a substructure, then T is said to be an elementary substructure if, for every formula F (and suitable list of variables), and vector x⃗ with x_i ∈ T for 1 ≤ i ≤ k, F̂(x⃗) has the same value in T as it has in S. The notation T ≺ S is used to denote this. For future use, in the case of set theory (and other settings where bounded quantifiers are present), if the requirement need only hold for Σ_n formulas, T is said to be a Σ_n-elementary substructure, and this is denoted T ≺_n S. Δ_0 will be used for Σ_0; the cases Δ_0 and Σ_1 are frequently of special interest. Note that if T ≺_n S then T is also a “Π_n-elementary substructure”, that is, the truth value of a Π_n formula with free variables assigned values in T is the same in either S or T.
The first step is accomplished by constructing an elementary substructure of L_α, which contains x and has small cardinality. This is a standard construction in mathematical logic (e.g., proposition 3.3.2 of [ChaKei]); lemma 2 is a version for set theory. Lemma 1 is known as Tarski’s criterion for a substructure to be elementary.

Lemma 1. Suppose T ⊆ S is a substructure. Suppose for each formula F, and each t⃗ with t_i ∈ T for 1 ≤ i ≤ k, if F(s, t⃗) is true in S for some s ∈ S then F(s, t⃗) is true in S for some s ∈ T. Then T is an elementary substructure.
Proof: The proof is by induction on F. For ¬G, since the values in T and S are the same for G they are the same for ¬G; the other propositional connectives are similar. For ∃wG, if the value is true in T then (using the induction hypothesis) it is true in S, and if it is true in S then (using the induction hypothesis and the hypothesis of the lemma) it is true in T. ⊳

Lemma 2. Suppose S is a structure, and X ⊆ S. Then there is a structure T with X ⊆ T ≺ S, and |T| = max(|X|, ℵ_0).
Proof: Suppose F is a formula, w is a free variable, and x_1, . . . , x_k are the remaining free variables in alphabetic order. Let σ_{Fw}(s⃗) be a value r such that F_{w,x⃗}(r, s⃗) if such an r exists, else ∅. Let T be the “closure” of X under the functions σ_{Fw}. That is, let T_0 = X; let T_{n+1} equal T_n, with σ_{Fw}(s⃗) added for all F, w, and s⃗; and let T = ∪_n T_n. It is easy to see by lemma 1 that T is an elementary substructure, and the cardinality claim is also easily seen. ⊳

The function σ_{Fw} is called a Skolem function. A substructure constructed as in the lemma is called a Skolem hull. The axiom of choice is used in the proof of lemma 2; this can be avoided in the case S = L_α. Namely, let σ_{Fw} be the <_L-least r. With these Skolem functions, T is the smallest elementary substructure of L_α containing X; see lemma II.5.3 of [Devlin].

Lemma 3. The predicate “y = L_γ” is absolute for any L_α where α ∈ LimOrd and α > ω.
Also, the existence condition for y holds in L_α.
Remarks on proof: The existence condition follows by theorem 19.3.b. Let 𝒢 be the class of functions y = f(x⃗) such that there is a formula ∃wG defining f where G is Δ_0, and an integer i, such that when γ < α, γ > ω, and x_j ∈ L_γ for all j, then y ∈ L_{γ+i}; and there is a w ∈ L_{γ+i}. The lemma may be proved by showing that the functions Sat, DefBy, Def, y = L_γ, and any others that are needed are in 𝒢. Since the definitions have only been sketched, a detailed proof is omitted. Note that in a definition by recursion, if a bound γ + i on the rank of y is known then a bound on, say, F ↾ x is also, since F ↾ x is a definable subset of L_{γ+i+2}. A detailed proof can be found in [Devlin]. ⊳

Lemma 4. Suppose S is a set satisfying the axiom of extensionality, π : S ↦ T is the collapsing isomorphism, and X ⊆ S is transitive. Then π(x) = x for all x ∈ X.
Proof: This follows by induction on ∈: π(x) = {π(w) : w ∈ x ∩ S} = {w : w ∈ x} = x. ⊳

Lemma 5. Suppose N ≺_1 M, F is a Σ_1 formula, and ⊨_M ∀uF; then ⊨_N ∀uF.
Proof: If u ∈ N then ⊨_M F(u) so ⊨_N F(u). ⊳

Note that if S satisfies extensionality (for example if it is transitive), and T ≺_0 S, then T satisfies extensionality, and so the transitive collapse of T may be taken.

Lemma 6 (Condensation lemma). Suppose α ∈ LimOrd, S ≺_1 L_α, and T is the transitive collapse of S. Then T = L_β for some β ∈ LimOrd with β ≤ α.
Proof: Let β = T ∩ Ord; then by lemma 5 and the fact that T is ∈-isomorphic to S, β ∈ LimOrd. Let ∃wG be a formula as in lemma 3 defining “y = L_γ”. Then ⊨_{L_α} ∀γ∃y∃wG, so by lemma 5 and the fact that T is ∈-isomorphic to S, ⊨_T ∀γ∃y∃wG. Thus, L_γ ∈ T for all γ < β (because T is transitive, G is Δ_0, and Ord is Δ_0), and so ∪_{γ<β} L_γ ⊆ T. Also ⊨_{L_α} ∀x∃γ∃y∃w(G ∧ x ∈ y), whence ⊨_S ∀x∃γ∃y∃w(G ∧ x ∈ y), and T ⊆ ∪_{γ<β} L_γ. Since L_β = ∪_{γ<β} L_γ the lemma is proved. ⊳

Lemma 7. For α ≥ ω, |L_α| = |α|.
Proof: By theorem 19.3.e, |α| ≤ |L_α|. That |L_α| ≤ |α| follows by induction on α.
First, |L_ω| = ℵ_0. Second, |L_{α+1}| ≤ ℵ_0 · |L_α| ≤ |α + 1|. Third, for α ∈ LimOrd, |∪_{β<α} L_β| ≤ |α| · |α| = |α|. ⊳

Theorem 8. GCH^L.
Proof: Suppose x ⊆ κ and x ∈ L; then x ∈ L_α for some α ≥ κ. Let S be such that S ≺_1 L_α, κ ∪ {x} ⊆ S, and |S| = κ. Let L_γ be the transitive collapse of S; then γ < κ^+, and by lemma 4 κ ⊆ L_γ, whence π(x) = x and x ∈ L_γ. Hence, every constructible subset of κ is in L_{κ^+}. The foregoing is an argument in ZF, and it follows that in L, |Pow(κ)| = κ^+. ⊳

From this, V = L ⊢_ZF GCH follows, and also, if ZF is consistent then so is ZFC+GCH.

21. Forcing.

That GCH is consistent with ZFC is proved by showing that there is a proper class (namely L) which is a model of ZFC+GCH. It is not possible to show that ¬GCH is consistent with ZFC by this method, called the method of inner models, where an inner model is a transitive class which is a model of ZF. Indeed, it is not possible to show that V ≠ L is consistent using an inner model. Suppose M is a transitive proper class, the axioms of ZF hold in M, and ⊢_ZF (V ≠ L)^M. Then since V = L ⊢_ZF M = L, V = L ⊢_ZF (V ≠ L)^L, so V = L ⊢_ZF V ≠ L, so ⊢_ZF V ≠ L, so ZF is inconsistent. The transitivity requirement can be removed; if M is any model then ∪_α TC(M ∩ V_α) is transitive and ∈-isomorphic to M.

An alternative approach, which turns out to work, is to start with a transitive model M of ZFC. A set G which is not in M is then added, to obtain a new model M[G]. G can be constructed in such a way that various statements in the language of set theory (for example ¬GCH) can be “forced” to hold in M[G]. This approach was discovered by Paul Cohen in 1963, and has been in extensive use ever since.

There are some logical subtleties, which are discussed at the end of the section. For example, it is not provable in ZFC that there is an ∈-model of ZFC. However, these complications can be circumvented, in a manner well-known to set theorists, who therefore usually argue in the above manner.
To quote [Geschke]: “In order to prove the consistency of ZFC+¬CH we pretend that there is a transitive set M such that (M, ∈) is a model of ZFC. Using M we construct another transitive set N such that (N, ∈) satisfies ZFC but not CH.” The method is quite flexible; M can be a transitive class, even though if V = L there is no way of “actually” enlarging M.

Say that a partially ordered set is a pair ⟨P, ≤⟩ where P is a set and ≤ is a partial order on P. As usual, P rather than ⟨P, ≤⟩ is often used to denote a partially ordered set. Also, the abbreviation “poset” is in common use.

To construct G, a partially ordered set P which is a member of M is constructed. The pair ⟨M, P⟩ is called a “notion of forcing”, or a “setting for forcing”. A definition is given of when a subset of P is “generic”. Supposing G is a generic subset, a model M[G], the “generic extension” of the “ground model” M by G, is defined. In a generic extension, sentences will be true if and only if they are “forced” to be, where this is a predicate which can be defined using the partial order.

To specify further details some notions from partial order theory are needed. First it should be noted that there are two conventions in use for notions of forcing. Elements of P are called forcing conditions, and there is a notion of one condition being stronger than another. Some authors use p ≤ q to denote that p is stronger than q, and some use p ≥ q. [Jech2] uses p ≤ q, and this will be used here.

If P is a partially ordered set say that a subset S ⊆ P is ≤-closed if p ∈ S and q ≤ p imply q ∈ S. For p ∈ P let p^≤ = {q : q ≤ p}. Then S is ≤-closed if and only if S = ∪_{p∈S} p^≤. ≥-closed sets are defined similarly, and also p^≥, p^<, etc.

A subset D ⊆ P is said to be dense if ∀p ∈ P ∃q ∈ D(q ≤ p). This definition may be given a topological interpretation. Recall that a subset of a topological space is said to be dense if it intersects every nonempty open set.
In a partially ordered set P, the ≤-closed sets form a topology, with the sets p^≤ for p ∈ P forming a base. D is dense as defined above if and only if it is dense in this topology.

A subset F ⊆ P is said to be a filter if it is nonempty, ≥-closed, and whenever p, q ∈ F there is an r ∈ F with r ≤ p and r ≤ q. Given a notion of forcing ⟨M, P⟩, a filter G ⊆ P is said to be M-generic if G ∩ D ≠ ∅ for any dense D ⊆ P such that D ∈ M.

A basic example of an M-generic filter is as follows. Let P be the set of functions p : d ↦ {0, 1} where d is a finite subset of ω. Let the partial order on P be ⊇, so that a condition is stronger if its value is defined for further integers. If M is a transitive model of ZFC then P ∈ M. Suppose G ⊆ P is an M-generic filter.
- Since G is a filter, it follows that f = ∪G is a function with Dom[f] ⊆ ω.
- For any n ∈ ω the set {p ∈ P : n ∈ Dom[p]} is a dense subset of P, and is in M. Since G is generic, f must be defined at n. That is, Dom[f] = ω.
The function f is called a “Cohen generic real”. This is an example of the use of the term “real” in sense 4 of chapter 14.

A generic extension M[G] will contain G as an element. Under mild restrictions on P, G ∉ M, and M[G] is a proper extension. Elements p, q in a partially ordered set P are said to be compatible if they have a common extension, that is, if there is an r ∈ P such that r ≤ p and r ≤ q; this may also be stated as p^≤ ∩ q^≤ ≠ ∅. Elements which are not compatible are said to be incompatible. Suppose P has the property that for any p, p^≤ contains incompatible elements. Suppose F ⊆ P is a filter and F ∈ M. Given p, if p ∉ F then clearly p^≤ ∩ (P − F) is nonempty; otherwise there are incompatible elements r_1, r_2 ∈ p^≤, and at most one of them can be in F. Thus, P − F is dense, and is an element of M, so G ∩ (P − F) ≠ ∅, so G ≠ F.

There are various methods of constructing the model M[G].
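Before turning to these, the basic example above can be made concrete. The following is a minimal sketch, assuming conditions are represented as Python dicts from integers to {0, 1}; the function names are hypothetical. It illustrates the order (a stronger condition extends a weaker one), the density of the sets {p : n ∈ Dom[p]}, the existence of incompatible extensions below any condition, and the union of a descending chain of conditions. It meets only finitely many dense sets, so it approximates, but does not construct, an M-generic filter.

```python
def stronger(q, p):
    """q <= p in the forcing order: q extends p as a finite partial function."""
    return all(n in q and q[n] == v for n, v in p.items())


def compatible(p, q):
    """p and q have a common extension iff they agree wherever both are defined."""
    return all(q[n] == v for n, v in p.items() if n in q)


def extend_into_dense(p, n):
    """Return some q <= p with n in Dom[q]; this witnesses that D_n is dense."""
    q = dict(p)
    q.setdefault(n, 0)  # the value 0 is an arbitrary choice
    return q


def incompatible_extensions(p):
    """Two incompatible conditions below p, differing at a fresh integer."""
    n = max(p, default=-1) + 1
    return {**p, n: 0}, {**p, n: 1}


def union_of_chain(chain):
    """Union of pairwise compatible conditions, which is again a function."""
    f = {}
    for p in chain:
        assert compatible(f, p)
        f.update(p)
    return f


# Build a descending chain meeting D_0, ..., D_4, starting from a sample condition.
p = {0: 1, 3: 0}
chain = [p]
for n in range(5):
    chain.append(extend_into_dense(chain[-1], n))
f = union_of_chain(chain)
print(sorted(f.items()))
```

The chain here is chosen deterministically; for a genuinely generic filter the choices cannot be made inside M, which is the point of the argument above that G ≠ F for any filter F ∈ M.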
The method of Boolean valued models, at least as treated in [Jech2], consists of the following steps.
1. A Boolean algebra B is constructed from P.
2. A “Boolean valued model” V^B is defined by giving a recursive definition of a class. An element is a function mapping elements of lower rank to B.
3. Let M^B be the class defined in M by the definition of V^B; each element of M^B is a “name”, and is an element of M.
4. For each name x ∈ M^B, a value x_G is defined by a recursive definition; and M[G] = {x_G : x ∈ M^B}.
The forcing predicate p ⊩ F is defined for each forcing condition p and formula F in the “forcing language”, i.e., the language of set theory with M^B names added as constants.

There are more direct methods, such as that used in [Kunen1].
1’. A class V^P is defined recursively; an element is a relation τ ⊆ N × P, where N is the set of elements of lower rank. Alternatively, it is a function whose domain is a set of elements of lower rank, with f(w) a subset of P.
2’. Let M^P be the class defined in M by the definition of V^P; each element of M^P is a “name”, and is an element of M.
3’. For each name τ ∈ M^P, a value τ_G is defined by a recursive definition; and M[G] = {τ_G : τ ∈ M^P}.
The forcing predicate p ⊩ F is defined for each forcing condition p and formula F in the “forcing language”, i.e., the language of set theory with M^P names added as constants.

The first method has the disadvantage that the extra machinery of the Boolean algebra needs to be introduced; and the advantage that the definition of the forcing relation is simpler, and many proofs are also. Hereafter, the first approach will be used.

Recall the definition of a Boolean algebra from section 5; the symbols ⊔, ⊓, †, 0, 1 will be used for the operations, rather than ∪, ∩, ^c, ∅, and U, with the latter reserved for their usual meaning on Pow(U) for a set U.

In a partially ordered set P, an element x ∈ P is said to be an upper bound for a subset S ⊆ P if x ≥ y for all y ∈ S.
An upper bound x is a least upper bound (or supremum) if x ≤ x′ for any upper bound x′. If a supremum exists then it is unique, as is easily seen. The notions of lower bound and greatest lower bound (or infimum) are defined “dually” (x ≤ y for all y ∈ S; x ≥ x′ for any lower bound x′).

Lemma 1. Suppose B is a Boolean algebra, and x, y ∈ B.
a. x ⊔ x = x, x ⊓ x = x, x ⊔ 1 = 1, x ⊓ 0 = 0, x ⊔ (x ⊓ y) = x, and x ⊓ (x ⊔ y) = x.
b. x ⊔ y = y if and only if x ⊓ y = x.
c. Defining x ≤ y to hold if and only if x ⊔ y = y, ≤ is a partial order on B.
d. x ⊔ y is the least upper bound of {x, y} in the order ≤, and x ⊓ y is the greatest lower bound.
Proof: x ⊔ x = (x ⊔ x) ⊓ 1 = (x ⊔ x) ⊓ (x ⊔ x^†) = x ⊔ (x ⊓ x^†) = x ⊔ 0 = x. The proof that x ⊓ x = x is dual. x ⊔ 1 = x ⊔ (x ⊔ x^†) = x ⊔ x^† = 1. The proof that x ⊓ 0 = 0 is dual. x ⊔ (x ⊓ y) = (x ⊓ 1) ⊔ (x ⊓ y) = x ⊓ (1 ⊔ y) = x ⊓ 1 = x. The proof that x ⊓ (x ⊔ y) = x is dual.
If x ⊔ y = y then x ⊓ y = x ⊓ (x ⊔ y) = x. The argument that x ⊓ y = x implies x ⊔ y = y is dual.
x ≤ x since x ⊔ x = x. If x ⊔ y = y and y ⊔ z = z then x ⊔ z = x ⊔ (y ⊔ z) = (x ⊔ y) ⊔ z = y ⊔ z = z. If x ⊔ y = y and y ⊔ x = x then x = y. This shows that ≤ is a partial order.
That x ≤ x ⊔ y follows since x ⊔ (x ⊔ y) = (x ⊔ x) ⊔ y = x ⊔ y; y ≤ x ⊔ y follows similarly. If x, y ≤ u then x ⊔ u = y ⊔ u = u; hence (x ⊔ y) ⊔ u = x ⊔ (y ⊔ u) = x ⊔ u = u, so x ⊔ y ≤ u. This shows that x ⊔ y is the least upper bound. The argument that x ⊓ y is the greatest lower bound is dual. ⊳

A Boolean algebra is said to be complete if any subset S ⊆ B has a least upper bound (denoted ⊔S) and a greatest lower bound (denoted ⊓S).

The construction of the Boolean algebra B of step 1 above makes use of the notion of the regular open sets of a topological space X. Recall the definitions of the interior and closure of a subset W ⊆ X, given in section 14; these will be denoted W^int and W^cl respectively. These operations have the following properties.
- If W_1 ⊆ W_2 then W_1^int ⊆ W_2^int and W_1^cl ⊆ W_2^cl.
- For an open set U, U ⊆ (U^cl)^int (since U ⊆ U^cl and U^int = U).
- For a closed set K, (K^int)^cl ⊆ K (since K^int ⊆ K and K^cl = K).
- For an open set U, ((U^cl)^int)^cl = U^cl (since U^cl ⊆ ((U^cl)^int)^cl ⊆ U^cl).
- For a closed set K, ((K^int)^cl)^int = K^int (since K^int ⊇ ((K^int)^cl)^int ⊇ K^int).

An open set U is said to be regular open if (U^cl)^int = U. It is easily seen from the above facts that if U is an open set then (U^cl)^int is a regular open set. An example of an open set which is not regular open is R − {0}; the closure of this is R, and the interior of the closure is again R.

If T is a topological space let ro(T) denote the regular open sets, equipped with the following operations.
- x ⊔ y = ((x ∪ y)^cl)^int.
- x ⊓ y = x ∩ y.
- x^† = (x^c)^int.
- 0 = ∅.
- 1 = T.

Theorem 2. ro(T) is a complete Boolean algebra.
Remarks on proof: A structure with operations ⊔ and ⊓ is said to be a lattice if it satisfies the axioms
- x ⊔ x = x, x ⊔ y = y ⊔ x, x ⊔ (y ⊔ z) = (x ⊔ y) ⊔ z
- x ⊓ x = x, x ⊓ y = y ⊓ x, x ⊓ (y ⊓ z) = (x ⊓ y) ⊓ z
- x ⊔ (x ⊓ y) = x, x ⊓ (x ⊔ y) = x
The definition of ≤ in a Boolean algebra given above may be given in any lattice. An element l such that l ≤ x for all x is unique if it exists, and is called a 0 element. An element g such that g ≥ x for all x is unique if it exists, and is called a 1 element.

A lattice is said to be complete if the greatest lower bound and least upper bound exist for every subset. It suffices that the greatest lower bound exist: Suppose S is a subset, and let U be the set of upper bounds of S. The greatest lower bound of U is the least upper bound of S.

A Heyting algebra is a lattice with a 1 element, and a binary operation →, such that the following axioms hold.
- x → x = 1;
- x ⊓ (x → y) = x ⊓ y;
- y ⊓ (x → y) = y;
- x → (y ⊓ z) = (x → y) ⊓ (x → z).
In a Heyting algebra, x → y is the largest element among the elements z such that x ⊓ z ≤ y.
Conversely if such an element always exists, the operation → may be defined.

Suppose L is a Heyting algebra with a 0 element. The pseudocomplement x^p of an element is defined to be x → 0. The following may be shown.
- If x ≤ y then x^p ≥ y^p; and x ≤ y^p if and only if x^p ≥ y.
- x ≤ x^pp, x^ppp = x^p, and (x ⊔ y)^p = x^p ⊓ y^p.
- Letting L^p denote {x^p : x ∈ L}, L^p = {x ∈ L : x^pp = x}.
- L^p is a Boolean algebra, with greatest lower bound x ⊓ y, least upper bound (x^p ⊓ y^p)^p = (x ⊔ y)^pp, and complement x^p.

A complete lattice has a Heyting algebra → operation if and only if it satisfies the distributive law x ⊓ (⊔_i y_i) = ⊔_i (x ⊓ y_i) for any family {y_i}. Such a lattice is called a complete Heyting algebra. If L is a complete Heyting algebra then the greatest lower bound in L^p is the same as the greatest lower bound in L. It follows that L^p is a complete Boolean algebra. Further details regarding the above stated facts may be found in [Dowd1].

The open sets of a topological space T, with the subset order, form a complete Heyting algebra; ∪ is the least upper bound, ∩ is the greatest lower bound, ∅ is a 0 element, and T is a 1 element. Let Ω(T) denote this algebra. For U ∈ Ω(T), U^p = (U^c)^int since by definition this is the largest open set contained in U^c. If U is regular open then V^p = U where V = (U^cl)^c, and if V^p = U for some open V then U is regular. That is, the regular open sets are exactly the complete Boolean algebra Ω(T)^p. ⊳

As noted above, the ≤-closed sets in a partial order P form a topology on P. Let ro(P) denote the set of regular open sets in this topology. By theorem 2 ro(P) is a complete Boolean algebra.

The next two lemmas give some facts about partial orders and Boolean algebras, which will be required. Say that a map f : P ↦ Q between partially ordered sets is order-preserving if x ≤ y ⇒ f(x) ≤ f(y).

Lemma 3. Let P be a partially ordered set, with the ≤-closed sets as the open sets of a topology on P. Suppose S ⊆ P.
a.
For S ⊆ P, w ∈ S^cl if and only if w^≤ ∩ S ≠ ∅.
b. An open subset U ⊆ P is regular open if and only if p^≤ ⊆ U^cl ⇒ p ∈ U.
c. If i : P ↦ Q is a bijection of partially ordered sets, with inverse function j : Q ↦ P, and both i and j are order preserving, then i and j are order isomorphisms.
d. If i : P ↦ Q is an order isomorphism of partially ordered sets, then i preserves all greatest lower bounds and least upper bounds which exist.
Proof: For part a, w ∈ S^cl if and only if U ∩ S ≠ ∅ for any open set U with w ∈ U; the latter is clearly true if and only if w^≤ ∩ S ≠ ∅. Part b follows because p ∈ (U^cl)^int if and only if p^≤ ⊆ U^cl. For part c, that i and j are bijective is a well-known fact of informal set theory. If i(p_1) ≤ i(p_2) then p_1 = j(i(p_1)) ≤ j(i(p_2)) = p_2; and similarly for j. For part d, if b is a least upper bound for S, then i(b) is an upper bound for i[S]. If i(b′) is any other upper bound, then b′ is an upper bound for S, so b′ ≥ b, so i(b′) ≥ i(b). The argument for greatest lower bounds is dual. ⊳

Lemma 4.
a. In a Boolean algebra, x ⊓ y ≤ z ⇔ y ≤ x^† ⊔ z.
b. In a complete Boolean algebra, x ⊓ (⊔Y) = ⊔_{y∈Y} (x ⊓ y).
c. An order isomorphism between Boolean algebras is a Boolean algebra isomorphism.
Proof: For part a, if y ≤ x^† ⊔ z then x ⊓ y ≤ x ⊓ (x^† ⊔ z) = x ⊓ z ≤ z. Also, y = (x ⊔ x^†) ⊓ y = (x ⊓ y) ⊔ (x^† ⊓ y), so if x ⊓ y ≤ z then y ≤ z ⊔ (x^† ⊓ y) ≤ x^† ⊔ z. For part b, let j = ⊔Y; since y ≤ j for y ∈ Y, x ⊓ y ≤ x ⊓ j for y ∈ Y. Suppose x ⊓ y ≤ d for y ∈ Y; then y ≤ x^† ⊔ d for y ∈ Y, so j ≤ x^† ⊔ d, so x ⊓ j ≤ x ⊓ d ≤ d. This shows that x ⊓ j is the least upper bound of {x ⊓ y : y ∈ Y}. For part c, if i is the map then by lemma 3.d i preserves ⊔ and ⊓. Since i(0) ≤ i(b) for all b, and i is surjective, i preserves 0; similarly i preserves 1. Since c = b^† if and only if b ⊔ c = 1 and b ⊓ c = 0, i preserves † as well.
⊳

A map e : P ↦ Q from a partial order P to a partial order Q is a dense embedding if it satisfies the following requirements.
a. x ≤ y ⇒ e(x) ≤ e(y).
b. e(x) and e(y) are compatible if and only if x and y are.
c. e[P] is a dense subset of Q.
A map e : P ↦ B from a partial order P to a Boolean algebra B is said to be a dense embedding if it is a dense embedding of P in B − {0}.

Theorem 5. Let e0 : P ↦ ro(P) be the map where e0(p) = ((p^≤)^cl)^int. Then e0 is a dense embedding of P in ro(P).

Proof: For a regular open set U, p ∈ U if and only if p^≤ ⊆ U if and only if e0(p) ⊆ U. In particular, p ∈ e0(p), and e0(p) ≠ ∅. Since x ≤ y ⇒ x^≤ ⊆ y^≤, and the closure and interior operations preserve inclusion, requirement a for a dense embedding follows. If w ≤ x and w ≤ y then e0(w) ≤ e0(x) and e0(w) ≤ e0(y), so compatible elements map to compatible elements (this much follows by requirement a). If e0(x) and e0(y) are compatible then there is a w ∈ P such that w ∈ ((x^≤)^cl)^int and w ∈ ((y^≤)^cl)^int. Then w^≤ ⊆ (x^≤)^cl and w^≤ ⊆ (y^≤)^cl. But u ∈ (x^≤)^cl if and only if u is compatible with x; thus there is a u1 ≤ w with u1 ≤ x, and a u2 ≤ u1 with u2 ≤ y, so x and y are compatible. Requirement b is thus proved. An element of ro(P) is a ≤-closed subset S ⊆ P such that (S^cl)^int = S. For such an S, x ∈ S if and only if x^≤ ⊆ S if and only if e0(x) = ((x^≤)^cl)^int ⊆ (S^cl)^int = S. Requirement c follows by choosing any x ∈ S. ⊳

Lemma 6. Suppose e : P ↦ B is a dense embedding of P in a complete Boolean algebra B. For b ∈ B − {0} let U_b = {p ∈ P : e(p) ≤ b}. Then U_b is a nonempty regular open set, and ⊔e[U_b] = b.

Proof: U_b is nonempty by requirement c for a dense embedding. If p ∈ U_b and q ≤ p then e(q) ≤ e(p) ≤ b; thus U_b is open. Suppose p ∉ U_b. Then e(p) ≰ b, so e(p) ⊓ b† ≠ 0, so e(q) ≤ e(p) ⊓ b† for some q ∈ P. In particular e(q) and e(p) are compatible, so p and q are compatible. Choose r ∈ P with r ≤ q and r ≤ p.
Then e(r) ≤ e(q) ≤ b†, whence r^≤ ∩ U_b = ∅, whence r ∉ U_b^cl. Thus, p^≤ ⊈ U_b^cl has been shown, and so U_b is regular by lemma 3.b. Let b_0 = ⊔e[U_b]. Clearly b_0 ≤ b. If b_0 < b, it is easily seen that b ⊓ b_0† ≠ 0, so e(p) ≤ b ⊓ b_0† for some p ∈ P. Then p ∈ U_b, so e(p) ≤ b_0; since also e(p) ≤ b_0†, this gives e(p) = 0, a contradiction. Thus, b_0 = b. ⊳

Theorem 7. If e : P ↦ B is a dense embedding of P in a complete Boolean algebra B then B is isomorphic to ro(P).

Proof: Let i : ro(P) − {∅} ↦ B be the map where i(U) = ⊔e[U]. If U is a regular open set and p ∈ U then e(p) ≤ ⊔e[U], whence Ran(i) is contained in B − {0}. Let j : B − {0} ↦ ro(P) − {∅} where j(b) = U_b, where U_b is as in lemma 6. By lemma 6 i is surjective, indeed j is a right inverse. To show that j is a left inverse, it suffices to show that for p ∈ P, e(p) ≤ ⊔e[U] if and only if p ∈ U. If p ∈ U then clearly e(p) ≤ ⊔e[U]. Suppose e(p) ≤ ⊔e[U]. If e(p) ⊓ e(q) = 0 for all q ∈ U then by lemma 4.b e(p) ⊓ (⊔e[U]) = 0, a contradiction; and it follows that p is compatible with q for some q ∈ U, whence p^≤ ∩ U ≠ ∅. The same argument holds for any p′ ≤ p. By lemma 3.b, p ∈ U. Clearly S ⊆ T ⇒ i(S) ≤ i(T), and i is order-preserving. Clearly b ≤ c ⇒ U_b ⊆ U_c, and j is order-preserving. Extending i by i(∅) = 0, it follows by lemmas 3.c and 4.c that i is a Boolean algebra isomorphism. ⊳

Thus, a partial order P can be “converted” to a complete Boolean algebra B = ro(P) in a canonical way. This fact permits a simple construction of M[G], where G ⊆ P is a generic filter. The filter G can be “converted” to a subset G_B of B as follows: let G_B = {b ∈ B : ∃p ∈ G (e(p) ≤ b)}; it is readily seen that G_B is the smallest filter in B − {0} containing e[G].

Let B be a complete Boolean algebra. The class V^B is defined recursively as follows.
- V^B_0 = ∅
- V^B_α+1 = {f : f : d ↦ B where d ⊆ V^B_α}
- For a limit ordinal α, V^B_α = ∪β<α V^B_β
- V^B = ∪α∈Ord V^B_α
This definition “mimics” the definition of Vα, except that elements of V^B_α+1 have elements of V^B_α as members according to elements of B, rather than {0, 1}. The formula for V^B has the parameter B.
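For a finite Boolean algebra B the first few levels of this hierarchy are small enough to list, which makes the recursion concrete. The following Python sketch is an illustration only and is not taken from the text: the function names next_level and levels are hypothetical, a name is represented as a frozenset of (name, value) pairs, and B is any finite collection standing in for the elements of the algebra (only its size matters for the count).

```python
from itertools import combinations, product


def next_level(level, B):
    """Return all functions f : d -> B with d a subset of `level`.

    Each function is encoded as a frozenset of (name, value) pairs,
    so the empty function (domain d = empty set) is frozenset().
    """
    names = list(level)
    out = set()
    for r in range(len(names) + 1):
        for d in combinations(names, r):          # choose the domain d
            for values in product(B, repeat=r):   # choose a value in B for each member of d
                out.add(frozenset(zip(d, values)))
    return out


def levels(B, n):
    """Return the list [V_0, V_1, ..., V_n] of finite levels for B."""
    out = [set()]  # V_0 is empty
    for _ in range(n):
        out.append(next_level(out[-1], B))
    return out


two = (0, 1)  # stand-in for the two-element Boolean algebra
sizes = [len(v) for v in levels(two, 3)]
print(sizes)  # [0, 1, 3, 27]
```

Since a function is determined by choosing a domain d inside the previous level and a value in B for each member of d, the sizes satisfy |V^B_n+1| = (1 + |B|)^|V^B_n|, which agrees with the counts 1, 3, 27 printed for the two-element algebra.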
Basic theorems concerning the class are still theorems of ZFC, though, as they are true for any B.

As already indicated, in any transitive model M of ZFC (set or class) containing P, the formula for V^B defines a class in M, which will be denoted M^B. Elements of M^B are called “names”, as they are to be used as names for the elements of M[G]. Indeed, given an M-generic filter G ⊆ P, the value x^G of a name x is defined recursively as follows.
- ∅^G = ∅
- x^G = {w^G : w ∈ Dom[x] ∧ x(w) ∈ G_B}
The model M[G] equals {x^G : x ∈ M^B}.

M is not directly a substructure of M[G]; however there is a “canonical embedding”, wherein each x ∈ M is assigned a name x̌. This is defined recursively as follows.
- ∅̌ = ∅
- x̌ is the function with domain {w̌ : w ∈ x}, and x̌(w̌) = 1 for all w ∈ x.
Of course, it remains to show that this is an embedding; this will be shown below, along with various other basic facts.

The forcing language may be defined to be the pairs ⟨F_v⃗, x⃗⟩ where, as in section 20, F is a formula in the language of set theory, v⃗ is a list of variables including the free variables of F, and x⃗ is a corresponding list of names (elements of M^B, or in some cases V^B). Similarly to remarks in section 6, the notation F(x1, . . . , xn) will be used for ⟨F_v⃗, x⃗⟩, when no confusion results.

Recall from the proof of theorem 2 the operation → in a Heyting algebra. A Boolean algebra is a Heyting algebra when x → y is defined to be x† ⊔ y. As will be seen, this operation is convenient when giving the technical details of forcing.

Given a formula F(x1, . . . , xn) in the forcing language, its “truth value” ⟦F(x1, . . . , xn)⟧, an element of B, may be defined. First, a recursive definition is given for atomic formulas, as follows.
- ⟦x ∈ y⟧ = ⊔v∈π1[y] (⟦v = x⟧ ⊓ y(v))
- ⟦x = y⟧ = V_xy ⊓ V_yx where V_xy = ⊓w∈π1[x] (x(w) → ⟦w ∈ y⟧)
The value for any formula may be defined by recursion on formulas, as follows.
- ⟦¬F(x⃗)⟧ = ⟦F(x⃗)⟧†
- ⟦F(x⃗) ∧ G(x⃗)⟧ = ⟦F(x⃗)⟧ ⊓ ⟦G(x⃗)⟧
- ⟦F(x⃗) ∨ G(x⃗)⟧ = ⟦F(x⃗)⟧ ⊔ ⟦G(x⃗)⟧
- ⟦F(x⃗) ⇒ G(x⃗)⟧ = ⟦F(x⃗)⟧ → ⟦G(x⃗)⟧
- ⟦∀vF(x⃗)⟧ = ⊓w∈M^B ⟦F(w, x⃗)⟧
- ⟦∃vF(x⃗)⟧ = ⊔w∈M^B ⟦F(w, x⃗)⟧

The forcing relation is now easy to define. For a condition p and a formula F of the forcing language, p ⊩ F if and only if e(p) ≤ ⟦F⟧.

From here on proofs will be omitted. They may be found in any of numerous standard references; specific references to [Jech2] will be given. The following lemma gives some basic facts about the truth value function.

Lemma 8.
a. ⟦x = x⟧ = 1
b. ⟦x = y⟧ = ⟦y = x⟧
c. ⟦x = y⟧ ⊓ ⟦y = z⟧ ≤ ⟦x = z⟧
d. ⟦x′ = x⟧ ⊓ ⟦x ∈ y⟧ ≤ ⟦x′ ∈ y⟧
e. ⟦y′ = y⟧ ⊓ ⟦x ∈ y⟧ ≤ ⟦x ∈ y′⟧
f. ⟦∀w(w ∈ x ⇔ w ∈ y)⟧ ≤ ⟦x = y⟧
g. If S ⊆ B, and for each b ∈ S x_b is an element of M^B, then there is an element y ∈ M^B such that for each b ∈ S, b ≤ ⟦y = x_b⟧.
h. If F is a formula of the forcing language, with free variable v, then there is an element w ∈ M^B such that ⟦∃vF(x⃗)⟧ = ⟦F(w, x⃗)⟧.
i. If F(x⃗) is a ∆0 formula with values xi ∈ M then ⊨_M F(x⃗) if and only if ⟦F(x̌1, . . . , x̌k)⟧ = 1.

Remarks on proof: Part a is lemma 14.15. Part b is immediate from the definition. Parts c-e are lemma 14.16. Parts f-h are lemmas 14.17 to 14.19. Part i is lemma 14.21. ⊳

The next lemma gives a fundamental relation between truth in M[G] and the truth value function.

Lemma 9. Suppose x, xi, y ∈ M^B, and F(x⃗) is a formula of the forcing language.
a. x^G ∈ y^G if and only if ⟦x ∈ y⟧ ∈ G_B
b. x^G = y^G if and only if ⟦x = y⟧ ∈ G_B
c. ⊨_{M[G]} F(x1^G, . . . , xk^G) if and only if ⟦F(x⃗)⟧ ∈ G_B

Remarks on proof: Parts a and b are lemma 14.28. Part c is theorem 14.29. This is proved by an induction on formulas, using parts a and b for atomic formulas, and some properties of G_B. In particular, for a Boolean algebra B, an M-generic filter H in B − {0} is an M-generic ultrafilter in B (exercise 14.10).
H is an ultrafilter if and only if for any b ∈ B, either b ∈ H or b† ∈ H. H is M-generic if and only if for any subset S ⊆ H such that S ∈ M, ⊓S ∈ H. ⊳

The next lemma shows that the axioms of ZFC are “true” in V^B (or M^B).

Lemma 10. If F is an axiom of ZFC then ⟦F⟧ = 1.

Remarks on proof: This is theorem 14.24. Elements may be shown to exist by explicitly constructing them. ⊳

Having proved sufficiently many lemmas, the main theorems of forcing theory can be proved. These are the main tools for using forcing, although the lemmas are often useful also. The following theorem is known as the “forcing theorem”.

Theorem 11. If F(x⃗) is a sentence of the forcing language then ⊨_{M[G]} F(x1^G, . . . , xk^G) if and only if ∃p ∈ G (p ⊩ F(x⃗)).

Remarks on proof: This is theorem 14.6. It follows easily from lemma 9.c above, and the definition of the forcing relation. ⊳

The next theorem states basic properties of the forcing relation.

Theorem 12. For formulas F, G of the forcing language, and p ∈ P, the following hold.
a. If p ⊩ F and q ≤ p then q ⊩ F
b. For no p do both p ⊩ F and p ⊩ ¬F hold
c. For every p there is a q ≤ p such that either q ⊩ F or q ⊩ ¬F
d. p ⊩ F ∧ G if and only if p ⊩ F and p ⊩ G
e. p ⊩ ¬F if and only if for all q ≤ p, q ⊮ F
f. p ⊩ ∃vF(x⃗) if and only if ∀q ≤ p ∃r ≤ q ∃w ∈ M^B (r ⊩ F(w, x⃗))
g. p ⊩ F ∨ G if and only if ∀q ≤ p ∃r ≤ q (r ⊩ F or r ⊩ G)
h. p ⊩ ∀vF(x⃗) if and only if ∀w ∈ M^B (p ⊩ F(w, x⃗))

Remarks on proof: This is theorem 14.7. It can be proved from the definition of ⊩, along with some facts from lemma 8 above. In some treatments, items d to f are used in defining ⊩. ⊳

The next theorem is called the “generic model theorem”.

Theorem 13. Suppose M is a transitive model of ZFC, and G is a generic filter in a notion of forcing in M.
a. M[G] is a model of ZFC
b. M ⊆ M[G]
c. G ∈ M[G]
d. If N is a transitive model of ZFC such that M ⊆ N and G ∈ N then N ⊇ M[G]
e. Ord^M[G] = Ord^M

Remarks on proof: This is theorem 14.5.
Part a follows by lemma 10 above and the fact that 1 ∈ G_B for any generic filter G. Part b follows because x ↦ x̌^G preserves ∈ and M is transitive. For part c, let Ġ be the name where Ġ(b̌) = b for all b ∈ B. Then Ġ^G = G_B, and G = {p ∈ P : e(p) ∈ G_B}. A proof of this last fact, and many other useful facts about filters in partial orders and Boolean algebras, can be found in [TakZar2]. ⊳

Suppose M is a standard model of ZFC, P is the notion of forcing for adding a Cohen generic real described above, and G is a generic ultrafilter. By theorem 13 and the absoluteness of α ↦ Lα, L^M = L^M[G]. Since G ∉ M, G ∉ L^M[G]. Since G ∈ M[G], in M[G], V ≠ L.

The above observation is only a statement about models of ZFC. It can’t immediately be used to conclude that V ≠ L is consistent with ZFC, because the existence of M, or of G, has not been established (in fact M cannot be proved to exist). G does exist if M is countable; see section 28.

There are various ways of “converting” a construction of a model M[G] of a sentence F, to a consistency proof. A simple one is to note that the construction can be transformed into a proof that ⟦F⟧ ≠ 0 (indeed, ordinarily ⟦F⟧ = 1). On the other hand, if ⊢ZFC ¬F then ⟦¬F⟧ = 1.

22. ¬CH is consistent.

Adding a single Cohen generic real produces a model where V = L is false. Adding a large quantity of them produces a model where 2^ℵ0 is as large as desired. However, there are restrictions on what 2^ℵ0 can be; see section 51.

In this section let P be a notion of forcing in a transitive model M of ZFC, and let G be a generic set. Forcing arguments often make use of properties of P, to prove properties of M[G]. A frequently encountered such property is the following. P is said to satisfy the κ-chain condition (κ-c.c.), if whenever S ⊆ P and any two elements of S are incompatible, then |S| < κ. The ℵ1-chain condition is commonly called the countable chain condition (c.c.c.).
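The definition can be made concrete for conditions that are finite partial functions into {0, 1}, as in the notion of forcing for Cohen reals. The Python sketch below is illustrative only: conditions are modeled as dicts, and it is assumed (the usual convention, not restated in this section) that two such conditions are compatible exactly when they agree on their common domain, so that their union is a common extension. The helper names are hypothetical.

```python
from itertools import product


def compatible(p, q):
    """Assumed convention: p and q are compatible iff they agree wherever both are defined."""
    return all(q[k] == v for k, v in p.items() if k in q)


def pairwise_incompatible(conditions):
    """True iff any two distinct members of the family are incompatible."""
    cs = list(conditions)
    return all(
        not compatible(cs[i], cs[j])
        for i in range(len(cs))
        for j in range(i + 1, len(cs))
    )


def total_functions(domain):
    """All functions from the finite set `domain` to {0, 1}, as dicts."""
    domain = list(domain)
    return [dict(zip(domain, bits)) for bits in product((0, 1), repeat=len(domain))]


family = total_functions(range(3))
print(len(family), pairwise_incompatible(family))  # 8 True
print(compatible({0: 1}, {1: 0}))                  # True
print(compatible({0: 1}, {0: 0}))                  # False
```

The 2^n total functions on an n-element domain are pairwise incompatible, so finite families of pairwise incompatible conditions can be arbitrarily large; the c.c.c. only asserts that such a family cannot be uncountable.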
Authors (for example [Fremlin2]) have observed that the terminology “chain condition” is strained, as chains (defined below) are not involved; but it is in such wide use that this fact is ignored. Further, it may be seen that a complete Boolean algebra satisfies the κ-c.c. if and only if there is no ascending chain of length κ.

Theorem 1. Suppose P satisfies the κ-c.c., λ ≥ κ, and cf(α) = λ in M. Then cf(α) = λ in M[G].

Proof: Let λ′ denote cf(α) in M[G]. If f : λ ↦ α in M then f : λ ↦ α in M[G] because M ⊆ M[G], so λ′ ≤ λ. Suppose µ < λ and f : µ ↦ α in M[G]; it suffices to show that the range of f is bounded. Let ḟ be a name for f, and let p ∈ G be such that p ⊩ ḟ : µ̌ ↦ α̌. For each β < µ let Sβ = {γ < α : ∃q ≤ p (q ⊩ ḟ(β̌) = γ̌)}. For each γ ∈ Sβ choose a qγ ≤ p such that qγ ⊩ ḟ(β̌) = γ̌. Then the qγ are pairwise incompatible, so since P satisfies the κ-c.c. there are fewer than κ of them, so |Sβ| < κ. Since λ is regular in M, it follows that |∪β<µ Sβ| < λ, whence ∪β<µ Sβ ⊆ δ for some δ < α, whence p ⊩ ḟ[µ̌] ⊆ δ̌, whence ⊨_{M[G]} f[µ] ⊆ δ. ⊳

Corollary 2. Suppose P satisfies the κ⁺-c.c., and λ ≥ κ⁺. Then λ is a cardinal in M if and only if λ is a cardinal in M[G].

Proof: As noted in section 16, the property of being a cardinal is down-absolute, and one direction follows. The other direction follows by induction. If λ is a successor cardinal in M then it is regular, so by the theorem it is a regular cardinal in M[G]. If λ is a limit cardinal in M then in M[G] it is a union of cardinals, so is a cardinal. ⊳

For any cardinal κ, κ many Cohen reals may be added by considering the notion of forcing, where the elements are the functions p : d ↦ {0, 1} where d is a finite subset of κ × ω. As in the case of a single Cohen real, if G is a generic filter then ∪G is a function with domain κ × ω.

Lemma 3 (∆-system lemma). Let S be an uncountable set of finite sets. Then there is a finite set r, and an uncountable subset D ⊆ S, such that for any distinct s, t ∈ D, s ∩ t = r.
Remarks on proof: This is a classic theorem, proved by N. Shanin in 1946. It is theorem 9.18 of [Jech2]. ⊳

A system D where any two sets intersect in the same set r is called a ∆-system, with root r.

Corollary 4. If S is an uncountable set of functions f : d ↦ {0, 1} where d is a finite set then S satisfies the c.c.c.

Proof: By lemma 3 there is an uncountable set S2 ⊆ S such that given any two functions in S2, their domains intersect in a fixed set r. In turn there is an uncountable subset S3 ⊆ S2, such that f ↾ r is the same for all f ∈ S3. Any two f ∈ S3 are compatible. ⊳

Lemma 5. Suppose GCH holds. If λ < cf(κ) then κ^λ = κ.

Proof: If f : λ ↦ κ then since λ < cf(κ), f is bounded. It follows that κ^λ = ∪α<κ α^λ. Now, α^λ ≤ 2^(|α|·λ) = (|α| · λ)⁺ ≤ κ, and the lemma follows. ⊳

Let P be a partially ordered set. A chain is defined to be a subset C ⊆ P which is linearly ordered, i.e., such that if x, y ∈ C then x ≤ y or y ≤ x. A maximal element in P is an element p ∈ P such that if q ≥ p then q = p.

Lemma 6 (Zorn’s lemma). Suppose P is a partially ordered set such that for every chain C ⊆ P there is an element of P which is an upper bound for C. Then P contains a maximal element.

Proof: Using the axiom of choice, there is a function c which assigns to each S ⊆ P a strict upper bound if it has one, else ∅. Define by transfinite induction a sequence Cα of chains, where Cα+1 = Cα ∪ {c(Cα)}, if Cα has a strict upper bound; and Cα = ∪β<α Cβ if α ∈ LimOrd. Using the axiom of replacement, eventually a Cα must be obtained with no strict upper bound. If p is an upper bound for Cα then p is a maximal element of P. ⊳

Lemma 7. If P is a partially ordered set and S ⊆ P, then there is a pairwise incompatible subset T ⊆ S, which is maximal among the set of such, ordered by inclusion.

Proof: Suppose C is a chain of pairwise incompatible subsets. Let T = ∪C, and suppose x, y ∈ T. Then x ∈ Sx and y ∈ Sy for some Sx, Sy ∈ C.
Since C is a chain, either Sx ⊆ Sy or Sy ⊆ Sx, so for some S ∈ C, x, y ∈ S, and so x and y are incompatible if they are distinct. Thus T is an upper bound for C among the pairwise incompatible subsets of S, and Zorn’s lemma applies. ⊳

Lemma 8. Suppose P is a partially ordered set, U ⊆ P is a regular open set, and M is a maximal pairwise incompatible subset of U. Then U = ⊔{e(p) : p ∈ M}.

Proof: Let V = ⊔{e(p) : p ∈ M}. Clearly V ⊆ U. If V ⊂ U then there is a q ∈ P such that e(q) ⊆ U ∩ (V^c)^int. From this, it follows that q is incompatible with every element of M, contradicting the maximality of M. ⊳

Theorem 9. Suppose M is a model of GCH. Suppose κ has uncountable cofinality (in M). Let M[G] be a generic extension by the notion of forcing adding κ Cohen generic reals. Then in M[G], 2^ℵ0 = κ.

Proof: Let f = ∪G and for α < κ let fα : ω ↦ {0, 1} be the function where fα(n) = f(α, n). Then for any α ≠ β, {p ∈ P : ∃n (p(α, n) ≠ p(β, n))} is dense, and since G is generic, ⊨_{M[G]} fα ≠ fβ. From this, it follows that ⊨_{M[G]} 2^ℵ0 ≥ κ, since there is an injection in M[G] and by corollaries 2 and 4 κ is a cardinal in M[G].

Given any generic extension, and any cardinal λ in M, let µ1 = |2^λ| in M[G], let µ′1 = |2^λ| in M, and let µ2 = |B|^λ in M. For each S ⊆ λ in M[G] let Ṡ be a name, and let g_S : λ ↦ B be the function where g_S(α) = ⟦α̌ ∈ Ṡ⟧. If g_S = g_T then S = T, since whether α ∈ S is determined by ⟦α̌ ∈ Ṡ⟧. Thus, in M, µ′1 ≤ µ2. µ1 ≤ µ′1 by down-absoluteness of cardinality.

Using lemma 8 and corollary 4, |B| ≤ |P|^ℵ0 = κ^ℵ0. Using GCH in M, |B| ≤ κ. Finally, ⊨_{M[G]} 2^ℵ0 ≤ |B|^ℵ0 = κ. ⊳

Only the first paragraph of the proof is required to show that ¬GCH holds in M[G]. The additional argument gives the exact size of 2^ℵ0 in M[G]. The hypothesis of GCH is not needed to do so; in general it is κ^ℵ0 (see [Jech2]).

Theorem 9 can be generalized to larger cardinals, by using the functions f : d ↦ {0, 1} where d ⊆ λ and |d| < κ. If κ is regular and 2^<κ = κ (this requirement follows from GCH) then P satisfies the κ⁺-c.c.
To show that no new cardinals ≤ κ are introduced, the notion of a κ-closed partially ordered set is introduced. If P is κ-closed, then no new cardinals ≤ κ are introduced; further the above forcing notion is <κ-closed. It follows that if λ > κ and λ^κ = λ (which follows from GCH if κ < cf(λ)) then ⊨_{M[G]} 2^κ = λ. See chapter 15 of [Jech2].

23. Clubs, stationary sets, and diamond.

In 1972 R. Jensen defined a principle about sets, which has proved to be of importance. Before giving the definition, it is necessary to first give some preliminary definitions, which have many uses in set theory.

Suppose S is a set of ordinals. An ordinal β is said to be a limit point of S if S ∩ β is unbounded below β. A limit point β must be a limit ordinal, since if β = γ + 1 and δ ∈ S ∩ β then δ ≤ γ, so S is bounded below β. Suppose α is a limit ordinal. A subset S ⊆ α is said to be closed if, whenever β < α is a limit point of S then β ∈ S. A subset S which is closed and unbounded is said to be a club subset, or just a club.

The notion of an ideal in a partially ordered set P is dual to that of a filter, that is, a subset I ⊆ P is an ideal if it is nonempty, ≤-closed, and whenever p, q ∈ I there is an r ∈ I with r ≥ p and r ≥ q. If P is a family of subsets of a set X, ordered by inclusion, it is a common notion that a filter is a collection of “large” subsets, and an ideal is a collection of “small” subsets. For example the meager and measure 0 subsets of R are ideals of small sets.

The club subsets are of most interest when α is an uncountable cardinal (as will be seen in the proof of theorem 2, though, the completely general definition is sometimes needed). In this case, the club subsets are closed under intersection (this will be shown in section 31). Thus, the subsets which contain a club subset form a filter. The complements of subsets in the club filter form an ideal, and are called thin subsets; a subset T is thin if there is a club subset C such that T ∩ C = ∅.
A subset which is not thin is called stationary. Thus, S is stationary if and only if for any club subset C, S ∩ C ≠ ∅.

The diamond principle (often denoted ♦) is as follows. There is a system of subsets ⟨Sα⟩ of ℵ1 such that
1. Sα ⊆ α for any α < ℵ1, and
2. for any subset X ⊆ ℵ1, {α < ℵ1 : X ∩ α = Sα} is stationary.

Theorem 1. ♦ ⊢ZFC CH.

Proof: If X ⊆ ω then it follows from ♦ that for some ordinal α ≥ ω, X = X ∩ α = Sα. Thus every subset of ω occurs among the ℵ1 sets Sα, so 2^ℵ0 ≤ ℵ1. ⊳

Theorem 2. V = L ⊢ZFC ♦.

Remarks on proof: If ♦ does not hold for ⟨Sα⟩, there is a set S ⊆ ℵ1 and a club C ⊆ ℵ1 such that S ∩ α ≠ Sα for all α ∈ C. The idea of the proof is to construct ⟨Sα⟩ to rule out every “condensed” version of such an occurrence; a sequence Cα, where Cα is a club subset of α for α ∈ LimOrd, is constructed along with the Sα. Let C0 = S0 = ∅. Let Cα+1 = Sα+1 = α + 1. For α ∈ LimOrd, if there is an S ⊆ α and a club C ⊆ α such that S ∩ ξ ≠ Sξ for all ξ ∈ C, let ⟨Sα, Cα⟩ be the <L-least such pair; otherwise let Cα = Sα = α.

If ♦ does not hold for ⟨Sα⟩, let ⟨S, C⟩ be the <L-least pair which is a counterexample. Then ⟨Sα⟩, ⟨Cα⟩, S, and C are in Lℵ2, and satisfy their defining formulas in Lℵ2. Let M be a countable elementary substructure of Lℵ2. Then ⟨Sα⟩, ⟨Cα⟩, S, and C are in M. Also, M ∩ ℵ1 = δ for some δ < ℵ1. Suppose π : M ↦ Lβ is the collapsing isomorphism. Then π(ℵ1) = δ, π(⟨Sα⟩) = ⟨Sα : α < δ⟩, π(⟨Cα⟩) = ⟨Cα : α < δ⟩, π(S) = S ∩ δ, and π(C) = C ∩ δ. In Lβ, and hence in L, ⟨S ∩ δ, C ∩ δ⟩ is the <L-least pair ⟨S′, C′⟩ such that C′ is club and S′ ∩ ξ ≠ Sξ for all ξ ∈ C′. Hence by the definition of ⟨Sα⟩, S ∩ δ = Sδ. On the other hand C ∩ δ is unbounded below δ, and C is closed, so δ ∈ C. This is a contradiction. For more details of the proof, see [Devlin] or [Jech2]. ⊳

There are generalizations of ♦ to larger cardinals.
In particular, suppose κ is a cardinal of uncountable cofinality, and E ⊆ κ is stationary; then ♦κ(E) is the principle that there exists a sequence ⟨Sα : α ∈ E⟩ with Sα ⊆ α, such that for any X ⊆ κ, {α ∈ E : X ∩ α = Sα} is a stationary subset of κ. For κ a regular uncountable cardinal ♦κ(E) follows from V = L; this was proved by Jensen in 1972, and a proof may be found in [Devlin].

Stronger results have been proved since. To state these, a definition is required, which has various uses. Suppose κ is a regular uncountable cardinal, and λ < κ is a regular cardinal. Let E^κ_λ denote {α < κ : cf(α) = λ}. Let Card denote the class of cardinals.

Theorem 3. E^κ_λ is stationary.

Proof: Let C be a club in κ. Let f : κ ↦ κ be the function that enumerates C in increasing order. By remarks in section 17, cf(f(λ)) = cf(λ) = λ, so f(λ) ∈ C ∩ E^κ_λ. ⊳

In 1976 J. Gregory showed that for κ regular uncountable and λ regular, ♦κ⁺(E^κ⁺_λ) follows from 2^κ = κ⁺ and κ^λ = κ. A proof may be found in [Jech2]. It was shown by S. Shelah in 2007 that for κ uncountable, ♦κ⁺(E) follows from 2^κ = κ⁺, for any stationary subset E ⊆ {α < κ⁺ : cf(α) ≠ cf(κ)}. A proof may be found in [Komjath].

¬♦ is consistent with CH; more will be said about this in section 29.

24. Trees.

The term “tree” is used in many areas of mathematics, and what are called trees in one area may not be the same as what are called trees in another. In set theory, there is a general notion of a tree; in some contexts, however, a more specialized type may be meant. According to [Kanamori1], general transfinite trees were first studied systematically in the 1935 Ph.D. thesis of D. Kurepa. They have become indispensable in modern set theory.

A tree is a partially ordered set T, such that for all x ∈ T, x^< is well-ordered by ≤. Thus, x^< is a chain, and there are no infinite descending sequences. Following are some basic notions concerning trees.
- Elements of T are called nodes.
- A node x is said to be of level α if α is the order type of x^<; Tα will be used to denote the set of nodes of level α.
- The height of a tree T is the least α such that Tα = ∅.
- A tree is said to be a κ-tree if its height is κ, and |Tα| < κ for each α.
Some authors impose additional restrictions on a κ-tree; here such will be stated explicitly. Although the definition has been given in general, the notion of a κ-tree is less interesting when κ is singular, and from here on it will be assumed that κ is regular.

A tree will be said to be rooted if it has a single node of level 0, in which case the node is called the root. Often this is required, but the most general definition omits the requirement.

A branch of a tree is a maximal chain. Clearly it is well-ordered, and its order type is called its length. Elements x, y in a partial order are said to be comparable if x ≤ y or y ≤ x; else they are incomparable. An antichain is a set of pairwise incomparable elements. An application of Zorn’s lemma shows that any antichain is a subset of some maximal antichain.

Theorem 1 (König’s infinity lemma). An ℵ0-tree has an infinite branch.

Proof: Each level is finite, and the tree is infinite. Let x0 be a node at level 0 such that {y : x0 ≤ y} is infinite. Inductively, let xi+1 be a node at level i + 1 with xi < xi+1, and {y : xi+1 ≤ y} infinite. ⊳

A regular uncountable cardinal κ is said to have the tree property if any κ-tree has a branch of length κ. A basic question of set theory is whether there are any such cardinals. A counterexample to the tree property, that is, a κ-tree with no branch of length κ, is called a κ-Aronszajn tree. A κ-tree with no branch of length κ and no antichain of size κ is called a κ-Suslin tree.

Following are some known facts. Some will be considered in later sections. Inaccessible cardinals are defined in section 30, and Π^1_1-indescribable cardinals in section 34.
- If κ = ℵ1 then it is provable in ZFC that an Aronszajn tree exists ([Jech2], theorem 9.16). If V = L then a Suslin tree exists (section 26). It is consistent that there is no Suslin tree (section 29).
- If κ > ℵ1 is a successor cardinal, if V = L, a Suslin tree exists ([Devlin], theorem 2.4; some further remarks are made in section 52).
- If κ = λ⁺ and 2^<λ = λ then an Aronszajn tree exists (see [Jackson]).
- If κ is an inaccessible cardinal, then κ has the tree property if and only if κ is Π^1_1-indescribable (see section 34).
- If κ is an inaccessible cardinal, and V = L, then there is no κ-Suslin tree if and only if κ is Π^1_1-indescribable ([Devlin], theorem VII.1.3).
- If ℵ2 has the tree property then ℵ2 is Π^1_1-indescribable in L ([Jech2], theorem 28.23).
- If there is a Π^1_1-indescribable cardinal then there is a generic extension in which ℵ2 has the tree property ([Jech2], theorem 28.24).

The question of the consistency of the tree property for successors of singular cardinals is a matter of current research; see [MagShel] for example.

Independent questions arise for a third type of tree. A κ-Kurepa tree is a κ-tree with at least κ⁺ branches of length κ. In the case when κ is an inaccessible cardinal, to avoid triviality an additional restriction is imposed, namely, |Tα| ≤ |α| for infinite ordinals α. Following are some known facts. See [Devlin] for the definition of an ineffable cardinal.
- If V = L, if κ is a successor cardinal then there is a Kurepa tree ([Devlin], theorem IV.3.3).
- If there exists an inaccessible cardinal then there is a generic extension in which there is no ℵ1-Kurepa tree ([Jech2], theorem 27.9).
- If there is no ℵ1-Kurepa tree then ℵ2 is inaccessible in L ([Jech2], exercise 27.5).
- If κ is an ineffable cardinal then there is no κ-Kurepa tree ([Devlin], theorem VII.2.6).
- If V = L, κ is an inaccessible cardinal, and there is no κ-Kurepa tree then κ is an ineffable cardinal ([Devlin], theorem VII.2.7).

25. The Suslin hypothesis.
It follows from results in section 8 that the real line is the unique order R (up to order isomorphism) such that
1. R is a dense linear order without endpoints,
2. R has the least upper bound property, and
3. R contains a countable order-dense subset.
It follows from property 3 that
4. a set of pairwise disjoint open intervals in R is at most countable;
indeed any open interval contains an element of the dense subset.

In 1920 Suslin asked whether, if property 3 is replaced by property 4, there is still only a unique order. A counterexample, that is, an order with properties 1, 2, and 4 which does not have property 3, is called a Suslin line. The Suslin hypothesis (SH) is the hypothesis that no Suslin line exists.

In his 1935 dissertation on trees, D. Kurepa reduced the question of the existence of a Suslin line to that of the existence of a Suslin tree, where by the latter is meant an ℵ1-Suslin tree. The question was finally shown to be independent of ZFC in 1968 to 1971 by T. Jech, R. Jensen, R. Solovay, and S. Tennenbaum.

Theorem 1. If there is a Suslin line then there is a Suslin tree.

Proof: A sequence Iα for α < ℵ1 of closed intervals of the given Suslin line will be defined, where Iα = [aα, bα] with aα < bα. Let I0 be arbitrary. For α > 0, let S = {aβ : β < α} ∪ {bβ : β < α}. Then S is countable, so is not dense, so there is a closed interval disjoint from S; let [aα, bα] be any such. If for β < α Iβ and Iα intersect then Iβ ⊇ Iα. Letting T = {Iα : α < ℵ1}, with I ≤ J if and only if I ⊇ J, T is a tree. An antichain consists of disjoint intervals, so is countable (consider the interiors). In particular the levels of T are countable, and |T| = ℵ1, so the height of T is ℵ1. In a branch, the left endpoints form an increasing sequence lα, and the intervals (lα, lα+1) are disjoint; it follows that the branch is countable. ⊳

Lemma 2. If there is an order with properties 1 and 4 but not property 3 then there is a Suslin line.

Proof: Let R be the given order.
The set of cuts R̄ may be defined as in section 8 for the rationals, and ordered by ⊆. As in the case of the rationals, R̄ is a dense linear order without endpoints, has the least upper bound property, and contains (an isomorphic copy of) R as an order-dense subset. Any open interval in R̄ contains an open interval with endpoints in R. It follows that any collection of pairwise disjoint open intervals in R̄ is at most countable. If R̄ had property 3 it would be isomorphic to the real numbers; but then R would have property 3, since it is a dense subset of the real numbers (iterate the process of taking a point in between successive points). ⊳

By a subtree of a tree T will be meant any subset T′ ⊆ T, equipped with the inherited order. If b′ ⊆ T′ is a branch then there is a branch b ⊆ T such that b′ ⊆ b. First, close b′ downward in T, and then extend the result in any way to a branch. If a′ ⊆ T′ is an antichain then a′ is an antichain in T.

For a node x of level α in a tree T let sons(x) denote the elements y of T of level α + 1 with x ≤ y. Define the following “normality” properties of a κ-tree T.
N1. For any x ∈ T of level α, and any β > α with β < κ, there is a y ∈ T of level β, with x ≤ y.
N2. T is rooted.
N3. T has unique limits, that is, if x is a node at level α where α ∈ LimOrd, there is no distinct node y at level α such that x^< = y^<.
N4. For any x ∈ T, |sons(x)| ≥ 2.
N5. For any x ∈ T, sons(x) is infinite.

Lemma 3. Suppose T0 is a κ-tree, where κ is a regular cardinal. Then there is a subtree T3 ⊆ T0 such that T3 is a κ-tree having properties N1-N3.

Proof: In any κ-tree, if |x^≥| = κ, x ∈ Tα, and β > α, then there is a y > x with y ∈ Tβ and |y^≥| = κ, else |x^≥| would be < κ. Let T1 = {x ∈ T0 : |x^≥| = κ}. It follows that the level of a node in T1 does not change from T0, and T1 is a κ-tree. Further, T1 has property N1. To ensure property N2, choose any x at level 0 of T1 and let T2 = x^≥.
To ensure property N3, proceed inductively on α ∈ LimOrd to choose a single node x from each set {y ∈ Tα : y^< = x^<}. ⊳

Lemma 4. Suppose T0 is a Suslin tree. Then there is a subtree T5 ⊆ T0 such that T5 is a Suslin tree having properties N1-N5.

Proof: Let T3 be as in lemma 3, applied to T0; then T3 is a Suslin tree. For any x ∈ T3, |{y ∈ x^≥ : |sons(y)| ≥ 2}| = ℵ1, since otherwise there would be a branch of length ℵ1. Let T4 = {x ∈ T3 : |sons(x)| ≥ 2}. By transfinite recursion a function f : ℵ1 ↦ ℵ1 may be defined, so that nodes in T4 of level α have level ≤ f(α) in T3. It follows that T4 is an ℵ1-tree. It has properties N1-N4, and is a Suslin tree. Suppose x ∈ Tα in T4; then {y ∈ T4 : y ∈ Tα+ω and x ≤ y} is infinite. It suffices to note that if S ⊆ {0, 1}^ω and |S| = k then at most k of the finite strings of length l, where 2^l > k, can occur as prefixes of members of S. Let T5 be the nodes of T4 whose level is 0 or a limit ordinal. It follows that T5 is as required. ⊳

Theorem 5. If there is a Suslin tree then there is a Suslin line.

Proof: Let T be a Suslin tree with properties N1-N5. For each x ∈ T choose a bijection of sons(x) with Q, inducing an order of type Q on sons(x). Let R be the set of branches of T; order these lexicographically. It is easy to see that under this order R is dense and without endpoints. For x ∈ T let B_x = {b ∈ R : x ∈ b}. If (b1, b2) is an open interval in R then there is an x ∈ T such that B_x ⊆ (b1, b2). Given a set S of pairwise disjoint intervals, any set of such x is an antichain in T, hence at most countable, whence S is at most countable. Suppose S is a countable subset of R. Then there is a level α such that no branch of S has an element at level α in T. For any x ∈ T at level α, B_x does not contain an element of S. Thus, S is not dense, and since S was arbitrary R does not have a countable dense subset. ⊳

26. Diamond implies ¬SH.
That ¬SH is consistent was first proved by constructing a generic extension in which there is a Suslin line. Later, Jensen showed that ¬SH follows from V = L, indeed from ♦.

If T is a κ-tree and α < κ let T_{<α} denote ∪_{β<α} Tβ. Properties N1-N4 of section 25 will be used. Note that these can be defined for trees of height α where α ∈ LimOrd, and not just κ-trees.

Lemma 1. Suppose α is a countable limit ordinal and T is a tree of height α having countable levels and properties N1-N4. Suppose x ∈ T. Then there is a branch b ⊆ T of length α with x ∈ b.

Proof: Suppose x is of level β0. Choose an increasing sequence β_i with i ∈ ω, unbounded below α. Let x0 = x, and inductively let x_{i+1} be such that x_{i+1} ∈ T_{β_{i+1}} and x_{i+1} > x_i. Let b be the union of the sets x_i^≤. ⊳

Lemma 2. Suppose T is an ℵ1-tree having properties N1-N4, and no uncountable antichains. Then T is a Suslin tree.

Proof: Suppose b is a branch. For each x ∈ b let y_x be a son of x, other than the son in b. Then {y_x : x ∈ b} is an antichain, so is countable, and it follows that b is countable. ⊳

Lemma 3. Suppose T is an ℵ1-tree having properties N1-N4. Suppose A ⊆ T is a maximal antichain. Let C = {α : A ∩ T_{<α} is a maximal antichain in T_{<α}}. Then C is a club subset of ℵ1.

Proof: Suppose A ∩ T_{<β_i} is a maximal antichain in T_{<β_i}, where β_i is an increasing sequence of ordinals with limit α. If x ∈ T_{<α} then x ∈ T_{<β_i} for some i, so x is comparable with y for some y ∈ A ∩ T_{<β_i}. It follows that C is closed. Choose any α0 < ℵ1. Given α_i, since T_{<α_i} is countable there is some α_{i+1} with α_i < α_{i+1} < ℵ1, such that every x ∈ T_{<α_i} is comparable with some y ∈ A ∩ T_{<α_{i+1}}. Let α = ∪_{i∈ω} α_i; then A ∩ T_{<α} is a maximal antichain in T_{<α}, so C is unbounded. ⊳

Theorem 4. ♦ ⊢_ZFC ¬SH.

Proof: A Suslin tree will be constructed, by constructing T_{<α} recursively for α < ℵ1; T_{<α} will have properties N1-N4. Let T0 be a single node. To add T_{α+1} to T_{<α+1}, give each node of Tα two sons. Suppose α ∈ LimOrd.
By lemma 1, for each x ∈ T_{<α} there is a branch b ⊆ T_{<α} of length α having x as a member. Tα is obtained by choosing for each x ∈ T_{<α} a branch b_x having x as a member, and adding a node at level α “extending” b_x. To specify how b_x is chosen, the nodes of T_{<α} for α ∈ LimOrd will be enumerated as ν_ξ for ξ < α. This can be achieved recursively, by enumerating the nodes in ∪_{α≤β<α+ω} Tβ as ν_{α+i} for i ∈ ω in some manner (for example using the pairing function of appendix 2 to obtain i for the jth node of the kth subtree). Let ⟨Sα : α < ℵ1⟩ be a diamond sequence. Identifying the nodes of T_{<α} with their indices in the enumeration, suppose Sα happens to be a maximal antichain in T_{<α}. Then for any x ∈ T_{<α} there is a y ∈ Sα which is comparable with x, and b_x may be chosen to contain such a y. If Sα is not a maximal antichain in T_{<α}, each b_x may be chosen arbitrarily.

Letting T = ∪_α T_{<α}, clearly T is an ℵ1-tree having properties N1-N4. By lemma 2, and the fact that any antichain can be enlarged to a maximal one, to show that T is a Suslin tree it suffices to show that any maximal antichain is countable. Let A be a maximal antichain in T. Let C = {α ∈ LimOrd : A ∩ T_{<α} is a maximal antichain in T_{<α}}. By lemma 3 C is a club set (the proof shows that α ∈ LimOrd may be required). Since ⟨Sα⟩ is a diamond sequence there is an α ∈ C such that A ∩ α = Sα. Since α ∈ C, A ∩ T_{<α} is a maximal antichain in T_{<α}. Suppose x ∈ T is of level ≥ α; then there is a y ∈ Tα such that y ≤ x. Since A ∩ α = Sα is a maximal antichain in T_{<α}, by construction there is a z ∈ A ∩ α such that z ≤ y. It follows that A ∩ T_{<α} is a maximal antichain in T, whence A ∩ T_{<α} = A, whence A is countable. ⊳

27. Iterated forcing.

Many forcing arguments make use of “successive” extensions of the ground model. There are two variations: the successive partial orders are in the ground model, or in the preceding extension. These are called “product forcing” and “iterated forcing” respectively.
In either case, the overall extension turns out to be a generic extension of the ground model. As in the case of a simple generic extension, properties of the overall forcing notion may be proved, necessary to ensure properties of M[G]. These in turn may be seen to hold by general arguments, from properties of the partial orders Pi where Pi is used at stage i.

The simplest example is the product of two forcing notions P1 and P2. In partial order theory, the product P1 × P2 is the Cartesian product (the set of ordered pairs ⟨p1, p2⟩), with the product order, where ⟨p1, p2⟩ ≤ ⟨q1, q2⟩ if and only if p1 ≤ q1 and p2 ≤ q2.

Lemma 1. Suppose P1 and P2 are notions of forcing in a transitive model M. Let G be a subset of P1 × P2. Let Gi = πi[G] for i = 1, 2. Then G is an M-generic filter if and only if G1 is an M-generic filter and G2 is an M[G1]-generic filter, if and only if G2 is an M-generic filter and G1 is an M[G2]-generic filter. In this case, M[G] = M[G1][G2] = M[G2][G1].

Remarks on proof: This is lemma 15.9 of [Jech2]. ⊳

One application of the product of two notions may be found in lemma 15.19 of [Jech2], a fact used in the proof of Easton’s theorem (stated in section 51).

Various products of a family Pi where i ranges over an index set I are considered. In such cases, often each Pi is required to have a largest element 1. The support of a sequence ⟨pi : i ∈ I⟩ in the Cartesian product is {i ∈ I : pi ≠ 1}. The overall notion of forcing is a subset of the Cartesian product, namely those sequences whose support satisfies some restriction. For example, the notion of forcing that adds κ Cohen reals is the same as the product with finite support of κ copies of the notion that adds a single Cohen real. (The largest element is the function with empty domain.)

By lemma 1, forcing with P1 × P2 is the same as forcing with P1, and then with P2 (or P2 and then P1).
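As a concrete illustration of the product order and of supports, the following minimal sketch works with small finite posets of bit strings, ordered so that q ≤ p when q extends p (the empty string is then the largest element, as for Cohen conditions). All names here (`extends`, `product_order`, `is_partial_order`, `support`, `P1`, `P2`) are hypothetical and chosen only for this example; finite posets say nothing about genericity and serve only to make the definitions tangible.

```python
from itertools import product

def extends(q, p):
    """Toy order: q <= p exactly when the bit string q extends p."""
    return q[:len(p)] == p

def product_order(leq1, leq2):
    """Product order: <p1, p2> <= <q1, q2> iff p1 <= q1 and p2 <= q2."""
    def leq(p, q):
        return leq1(p[0], q[0]) and leq2(p[1], q[1])
    return leq

def is_partial_order(elems, leq):
    """Brute-force check of reflexivity, antisymmetry and transitivity."""
    elems = list(elems)
    reflexive = all(leq(p, p) for p in elems)
    antisymmetric = all(p == q or not (leq(p, q) and leq(q, p))
                        for p in elems for q in elems)
    transitive = all(leq(p, r) or not (leq(p, q) and leq(q, r))
                     for p in elems for q in elems for r in elems)
    return reflexive and antisymmetric and transitive

def support(seq, top):
    """Support of a sequence (a dict index -> condition): indices with value != top."""
    return {i for i, p in seq.items() if p != top}

# Two toy posets (assumed for illustration): bit strings of bounded length.
P1 = ["", "0", "1", "00", "01", "10", "11"]
P2 = ["", "0", "1"]
pairs = list(product(P1, P2))
leq = product_order(extends, extends)

print(is_partial_order(pairs, leq))
print(leq(("01", "1"), ("0", "")), leq(("01", "1"), ("1", "")))
print(support({0: "", 1: "01", 2: ""}, ""))
```

A sequence with finite support, in this picture, is simply one whose `support` is a finite set; in the toy case every sequence qualifies, so the restriction only becomes meaningful for infinite index sets.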
P2 is an element of the ground model M, and the generic filter for the second forcing is a subset of P2. More generally, forcing with P2, where this is an element of M[G1], can be considered. Forcing in this manner is called iterated forcing. It was first used in 1971 by R. Solovay and S. Tennenbaum, and has since seen extensive use.

To define a notion of forcing in M which is equivalent to the iterated forcing, a name for P2 must be used. Recall that this is an element of M^B1 where B1 = ro(P1); Ṗ2 will be used to denote it. The notation P1 ∗ Ṗ2 is used to denote the notion equivalent to a two-step iteration. The elements of P1 ∗ Ṗ2 are ordered pairs ⟨p1, ṗ2⟩ where p1 ∈ P1 and ⟦ṗ2 ∈ Ṗ2⟧ = 1. The truth value is taken with respect to the first forcing, and is an element of B1. This is the definition used in [Jech2]; other authors use variations, which are mostly inessential, although some differences may result.

To define the partial order on P1 ∗ Ṗ2, let ⟨p1, ṗ2⟩ ≤ ⟨q1, q̇2⟩ if and only if p1 ≤ q1 and p1 ⊩ ṗ2 ≤ q̇2. Ṗ2 is required to be a name for a partial order, that is, the statement “Ṗ2 is a partial order” is required to have truth value 1, whence it is forced by any condition. Using this, it is easy to verify that the relation ≤ on P1 ∗ Ṗ2 is a partial order. For example, suppose ⟨p1, ṗ2⟩ ≤ ⟨q1, q̇2⟩ and ⟨q1, q̇2⟩ ≤ ⟨r1, ṙ2⟩. Then p1 ≤ q1 and q1 ≤ r1, whence p1 ≤ r1 since P1 is a partial order. Also p1 ⊩ ṗ2 ≤ q̇2 and q1 ⊩ q̇2 ≤ ṙ2, whence, since p1 ≤ q1, p1 ⊩ ṗ2 ≤ q̇2 ∧ q̇2 ≤ ṙ2. By the requirement on Ṗ2, p1 ⊩ ṗ2 ≤ ṙ2. The reflexivity and antisymmetry properties may be similarly verified.

Lemma 2. Suppose G1 ⊆ P1 is an M-generic filter, P2 = Ṗ2^G1, and G2 ⊆ P2 is an M[G1]-generic filter. Let G1 ∗ G2 = {⟨p1, ṗ2⟩ : p1 ∈ G1 and ṗ2^G1 ∈ G2}. Then G1 ∗ G2 ⊆ P1 ∗ Ṗ2 is an M-generic filter, and M[G1 ∗ G2] = M[G1][G2].

Remarks on proof: This is lemma 16.2(i) of [Jech2]. ⊳

Lemma 3. Suppose G ⊆ P1 ∗ Ṗ2 is an M-generic filter.
Let G1 = π1[G], P2 = Ṗ2^G1, and G2 = {ṗ2^G1 : ṗ2 ∈ π2[G]}. Then G1 ⊆ P1 is an M-generic filter, G2 ⊆ P2 is an M[G1]-generic filter, and G = G1 ∗ G2.

Remarks on proof: This is lemma 16.2(ii) of [Jech2]. ⊳

It should not come as a surprise that the foregoing two-step iteration can be used at successor stages of a transfinite iteration. To specify an iterated forcing in general, an overall poset P is defined as a poset of sequences of length η for some ordinal η > 0. The notation Pα will be used for {p ↾ α : p ∈ P}, for 1 ≤ α ≤ η. To simplify the notation, for a partial order P let M^P denote M^ro(P).

P must have various properties, which are specified inductively using the Pα. At stages where α is a successor ordinal β + 1, a two-step iteration is done, using Pβ and a name Q̇β in M^Pβ for a notion of forcing. The case α = 1 is special, and involves a notion of forcing Q0 in M. Q0 and the Q̇α are all required to contain a largest element, which will be denoted 1.

The notation ≤α is used for the order on Pα. Likewise ⟦F⟧α denotes the truth value in M^Pα, and ⊩α the corresponding forcing relation. The requirements that Pα be the partial order of a two-step iteration when α = β + 1 are as follows, where p, q ∈ Pα.
1. p ↾ β ∈ Pβ and ⟦p(β) ∈ Q̇β⟧β = 1.
2. p ≤ q if and only if p ↾ β ≤ q ↾ β and p ↾ β ⊩β p(β) ≤ q(β).
This differs from the general definition slightly, in that Pβ × M^Pβ is identified with Pα. P1 is just the set of length 1 sequences p with p(0) ∈ Q0; p ≤1 q if and only if p(0) ≤ q(0).

When α is a limit ordinal, p ∈ Pα only if p ↾ β ∈ Pβ for β < α; and p ≤ q if and only if p ↾ β ≤ q ↾ β for β < α. The exact definition of Pα varies, depending on the type of iterated forcing.

Some “regularity” properties may be required. In [Jech2], P is required to contain the sequence p where p(α) = 1 for all α < η. Also, if β < α, p ∈ Pα, q ∈ Pβ, and q ≤ p ↾ β, then p′ ∈ Pα, where p′(γ) equals q(γ) if γ < β, else p(γ).
As above, the support of a sequence p is {α < η : p(α) ≠ 1}. A general type of iteration considers those p whose support is in a specified ideal I ⊆ Pow(η) which contains the finite sets; such a notion of forcing is called an iteration with I-support. At limit stages, all such p are taken. The simplest type of iterated forcing is iteration with finite support, which is iteration with I-support, where I is the finite sets.

The properties which P must have, and the methods used to prove that they hold, again vary with the type of forcing.

Theorem 4. Suppose κ is a regular uncountable cardinal. Suppose P is a finite support iteration of length η, where ⟦“Q̇α satisfies the κ-c.c.”⟧α = 1 for all α < η. Then P satisfies the κ-c.c.

Remarks on proof: This is lemma 16.9 of [Jech2]. ⊳

28. Martin’s axiom.

As noted in section 21, it is of little interest in forcing arguments whether a generic filter actually exists. It has turned out that the existence question is of interest in set theory. The following fact is a classic theorem of partial order theory, called by some authors the Rasiowa-Sikorski lemma.

Theorem 1. Suppose P is a poset, and C is a countable collection of dense subsets of P. Then there is a filter G ⊆ P, such that G ∩ D ≠ ∅ for every D ∈ C.

Proof: Enumerate C as D0, . . .. Let p0 be an arbitrary element of P. Given pn, let p_{n+1} be any element p ∈ Dn such that p ≤ pn; one exists since Dn is dense. Let G = {q : q ≥ pn for some n}. ⊳

In accordance with the terminology of forcing, a set G as in the theorem is often said to be “C-generic”. Martin’s axiom (MA) is a statement more general than theorem 1:

Suppose P is a poset satisfying the c.c.c., and C is a collection of dense subsets of P, with |C| < 2^ℵ0. Then there is a filter G ⊆ P, such that G ∩ D ≠ ∅ for every D ∈ C.

MA follows from CH by theorem 1; indeed the hypothesis that P be c.c.c. is unnecessary. If CH is false, however, then MA asserts the existence of generic filters in additional cases.
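The chain construction in the proof of theorem 1 can be sketched on a toy poset. The sketch below assumes conditions are finite bit strings with q ≤ p when q extends p, and represents each dense set by a hypothetical "picker" function that returns an extension of its argument lying in the set. It follows only finitely many steps of the enumeration D0, D1, ..., so it illustrates the recursion rather than the full countable construction; every name in it is invented for this example.

```python
def extends(q, p):
    """q <= p in the toy poset: the bit string q extends the bit string p."""
    return q[:len(p)] == p

def long_enough(n):
    """Picker for the dense set of strings of length >= n."""
    def pick(p):
        return p if len(p) >= n else p + "0" * (n - len(p))
    return pick

def differs_from(f):
    """Picker for the dense set of strings disagreeing somewhere with the
    0/1 sequence f, where f is given as a function from indices to bits."""
    def pick(p):
        if any(int(b) != f(i) for i, b in enumerate(p)):
            return p
        return p + str(1 - f(len(p)))
    return pick

def descending_chain(p0, pickers):
    """p_{n+1} is an element of D_n below p_n, as in the proof."""
    chain = [p0]
    for pick in pickers:
        chain.append(pick(chain[-1]))
    return chain

def in_generated_filter(q, chain):
    """Membership in G = {q : q >= p_n for some n}."""
    return any(extends(p, q) for p in chain)

zeros = lambda n: 0
alternating = lambda n: n % 2
chain = descending_chain(
    "", [long_enough(3), differs_from(zeros), differs_from(alternating)])
print(chain)
print(in_generated_filter("00", chain), in_generated_filter("01", chain))
```

Each step only ever extends the current condition, which is why the upward closure of the chain is a filter meeting every dense set that was visited.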
MA+¬CH has turned out to be an assumption of interest in considering independent statements; it settles a variety of them, sometimes in the same manner as V = L, and sometimes in the opposite manner.

To begin the discussion, MA will be reduced to a special case, where only “small” P need be considered. Let MA_s denote the following statement:

Suppose P is a poset satisfying the c.c.c. and |P| < 2^ℵ0, and C is a collection of dense subsets of P, with |C| < 2^ℵ0. Then there is a filter G ⊆ P, such that G ∩ D ≠ ∅ for every D ∈ C.

Lemma 2. MA follows from MA_s.

Proof: Given any P, and C, a poset P′ ⊆ P with |P′| ≤ sup(ℵ0, |C|) will be constructed, in such a way that MA_s may be applied. Let p be an element of P. Let f_c : P × P ↦ P be such that if p and q are compatible and r = f_c(p, q) then r ≤ p, q. For D ∈ C let f_D : P ↦ D be such that f_D(p) ≤ p. P′ will be taken as the smallest subset of P containing p and closed under f_c and f_D for D ∈ C. By standard arguments |P′| ≤ sup(ℵ0, |C|). A subset of P′ which is pairwise incompatible in P′ is pairwise incompatible in P, so is countable. Each set D ∩ P′ is dense in P′. Thus, by MA_s there is a filter F′ ⊆ P′ which has nonempty intersection with every D ∩ P′ for D ∈ C. Let F be the upward closure of F′; then F is a filter which has nonempty intersection with every D ∈ C. ⊳

Theorem 3. Suppose M is a transitive model of ZFC+GCH. Suppose that (in M) κ > ℵ1 is a regular cardinal. There is a notion of forcing P satisfying the c.c.c., such that a generic extension M[G] satisfies MA, and 2^ℵ0 = κ.

Remarks on proof: P will be an iteration with finite support of length κ. The idea of the proof is to choose Q̇α such that each small poset Q in M[G] gets used as Q̇α at some stage. P_{α+1} will then contain a generic filter for Q. Some further details will be given; for complete details see theorem 16.13 of [Jech2]. Q̇α is chosen so that “Q̇α satisfies the c.c.c.” and “|Q̇α| < κ” have truth value 1 in V^Pα.
From this, and GCH in M, the following can be concluded: Pα and P satisfy the c.c.c., |Pα| ≤ κ, there are at most κ candidates for Q̇α, and “2^ℵ0 ≤ κ” has truth value 1 in V^P. Recalling the function Γ from section 13, let α = Γ(β, γ), and at stage α let Q̇α be the γth name in V^Pβ of a suitable small poset. Let G be a generic filter for P, and let Gα = G ↾ Pα. It may be shown that if λ ≤ κ, X ⊆ λ, and X ∈ M[G] then X ∈ M[Gα] for some α < κ. Using this, it follows that if Q ∈ M[G] is a suitable poset, and C ∈ M[G] is a collection of dense subsets with |C| < κ, then there is a C-generic filter F ∈ M[G] for Q. This shows that M[G] satisfies MA_s. Further, if X ⊆ {0, 1}^ω and |X| < κ, Q and C may be chosen to conclude that X ≠ {0, 1}^ω. Thus, 2^ℵ0 ≥ κ, whence 2^ℵ0 = κ, as 2^ℵ0 ≤ κ was shown already. ⊳

In what follows a brief list of consequences of MA will be given. There is an entire book [Fremlin1] devoted to the subject. In some cases the consequence might follow from a weaker hypothesis.

Recall from section 15 the notions of meager and measure 0 subsets of R. It is provable in ZFC that these are closed under countable unions. Hence, if CH is true, they are closed under unions of size < 2^ℵ0. In fact it follows from MA that they are closed under unions of size < 2^ℵ0. A proof may be found in [Jech2], theorem 26.39, or [Ciesielski], theorems 8.2.6 and 8.2.7.

It is independent whether the measure 0 sets are closed under unions of size < 2^ℵ0. Indeed, if M is countable, P is the notion of forcing for adding ℵ2 reals, and G ⊆ P is M-generic, then in M[G] the interval [0, 1] is a union of < 2^ℵ0 measure 0 sets. This is corollary 9.4.7 of [Ciesielski]. It follows that ¬MA is consistent with ¬CH.

Recall from section 15 the notion of a cardinal invariant. It follows from MA that these all equal 2^ℵ0. Discussions may be found in [Jech2] and [Roitman]. A related result concerns scales, which are certain families of functions f : ω ↦ ω.
MA implies that these exist, and all have cardinality 2^ℵ0; see [Roitman].

CH implies that there are partial orders satisfying the c.c.c., whose product does not satisfy it (theorem 8.1.12 of [Ciesielski]). MA+¬CH implies that for any two partial orders satisfying the c.c.c., their product satisfies it (theorem 8.2.10 of [Ciesielski]).

Ultrafilters were mentioned in section 21. An ultrafilter in a Boolean algebra B is a filter F, such that for each x ∈ B, either x ∈ F or x† ∈ F. A p-point is an ultrafilter in Pow(ω), which has certain properties. It was shown in 1970 that MA implies the existence of p-points; [Jech2] gives a proof in theorem 16.27. It had been shown by W. Rudin in 1956 that CH implies the existence of p-points. In 1982 it was shown by S. Shelah that there are models of ZFC where there are no p-points.

The notion of a commutative (or Abelian) group was defined in section 8. Free Abelian groups are an important type. Whitehead groups are defined as groups which have an important property of free groups (trivial extension group). In the 1950’s the problem was raised of whether every Whitehead group was free. This was shortly shown to be the case for countable groups. The problem remained open for arbitrary cardinality until 1974, when S. Shelah showed the following.
1. If V = L, in fact if a certain ♦ principle holds, then every Whitehead group is free.
2. If MA+¬CH holds then there are Whitehead groups which are not free.

Finally, SH follows from MA+¬CH; this is shown in the next section.

29. SH is consistent.

SH was the first principle proved consistent using iterated forcing. It was almost immediately observed that SH follows from MA+¬CH, that the consistency of MA+¬CH follows by a basic iterated forcing argument, and that MA+¬CH had various other consequences.

It is sometimes valuable to refine the hypothesis MA+¬CH. For an infinite cardinal κ, MA_κ is the following statement.
Suppose P is a poset satisfying the c.c.c., and C is a collection of dense subsets of P, with |C| ≤ κ. Then there is a filter G ⊆ P, such that G ∩ D ≠ ∅ for every D ∈ C.

MA is the statement ∀κ < 2^ℵ0 (MA_κ). MA_ℵ0 is true, by theorem 28.1.

Theorem 1. MA_κ can only hold if κ < 2^ℵ0.

Proof: Let P be the poset for adding a single real. For each f ∈ {0, 1}^ω let D_f = {p ∈ P : ∃n ∈ ω(p(n) ≠ f(n))}. Suppose κ ≥ 2^ℵ0, and G is a filter generic for this collection, together with the dense sets {p ∈ P : n ∈ dom(p)} for n ∈ ω. Since G is a filter, h = ∪G exists, and is an element of {0, 1}^ω. But this is a contradiction, since h cannot equal any f ∈ {0, 1}^ω. ⊳

Theorem 2. MA_ℵ1 implies SH.

Proof: Suppose T is a Suslin tree, with property N1 of section 25. Let P be T with the order reversed, so that p < q if p is farther along a branch of the tree than q. If two elements of P are incompatible in P, then they are incomparable in T, and since any pairwise incomparable subset of T is countable, any pairwise incompatible subset of P is countable. That is, P satisfies the c.c.c. For α < ℵ1 let Dα be the set of nodes of T of level > α. It is easy to see using property N1 that Dα is a dense subset of P. Since |P| = |T| = ℵ1, by MA_ℵ1 there is a filter G which intersects every Dα. Since G is a filter, its union must be a branch of T, and since it intersects every Dα, it must have length ℵ1, contradicting the hypothesis that T is a Suslin tree. ⊳

From theorems 23.1 and 26.4, CH+¬SH is consistent with ZFC. From theorems 1 and 2, ¬CH+SH is consistent with ZFC. That ¬CH+¬SH is consistent with ZFC follows by a theorem of Shelah, that adding a Cohen real adds a Suslin tree. This is theorem 28.12 of [Jech2], or theorem 46 of [Roitman].

That CH+SH is consistent with ZFC was shown by Jensen. The proof is quite involved, and may be found in [DevJohn]. Later, Shelah gave a proof using “proper forcing”, which according to [Kanamori2] he invented partly to prove this theorem. Still later, the proof was simplified [AbrShel].
It was remarked in section 23 that CH+¬♦ is consistent with ZFC; this follows because CH+SH is, and SH ⇒ ¬♦.

30. Inaccessible cardinals.

Recall that a cardinal κ is a limit cardinal if λ^+ < κ whenever λ < κ. A cardinal κ is said to be a strong limit cardinal if 2^λ < κ whenever λ < κ. If GCH holds then the two notions are equivalent, but since GCH is independent the two notions must be considered separately. Recall also that a cardinal κ is regular if there is no (increasing) map f : α ↦ κ from an ordinal α < κ, whose range is unbounded. A cardinal κ is said to be weakly inaccessible (resp. strongly inaccessible) if it is uncountable, regular, and a limit (resp. strong limit) cardinal. Many authors use the term “inaccessible” by itself to mean “strongly inaccessible”, and this will be done here.

Lemma 1. Suppose κ is an inaccessible cardinal.
a. If α < κ then |Vα| < κ.
b. |Vκ| = κ.

Proof: Part a is proved by induction on α. At successor stages, |V_{α+1}| = 2^|Vα|; |Vα| < κ by induction, so |V_{α+1}| < κ since κ is a strong limit cardinal. At limit stages, |Vβ| < κ by induction, so ∪_{β<α} |Vβ| < κ since κ is regular and α < κ, so |Vα| < κ. For part b, using part a |Vκ| = ∪_{α<κ} |Vα| ≤ κ; and clearly κ ≤ |Vκ| for any cardinal κ. ⊳

It was observed in section 17 that if α is a limit ordinal and α > ω then Vα is a model of the axioms of set theory, with the possible exception of the replacement axiom. The replacement axiom can be given in a “second order” form, ∀F(“F is a partial function” ⇒ ∀u∃v(v = F[u])).

Theorem 2. If κ is an inaccessible cardinal then Vκ is a model of second order replacement, and hence a model of ZFC.

Proof: If u ∈ Vκ then u ∈ Vλ for some λ < κ, so u ⊆ Vλ, so |u| ≤ |Vλ| < κ. Let r(x) equal ρ(F(x)) if F(x) exists, else 0. Since κ is regular there is an α < κ such that x ∈ u ⇒ r(x) < α, whence F[u] ⊆ Vα, whence F[u] ∈ V_{α+1}, whence F[u] ∈ Vκ. This shows that Vκ is a model of second order replacement.
A fortiori Vκ is a model of the replacement axiom scheme, hence by remarks above a model of ZFC. ⊳

Lemma 3. The following predicates are Π1:
a. κ is a cardinal.
b. κ is a regular cardinal.
c. κ is a limit cardinal.
d. κ is a strong limit cardinal.
Further, if κ is an inaccessible cardinal and λ < κ has one of these properties in Vκ then λ has the property.

Proof: Part a was already observed in section 16. For part b, a cardinal κ is regular if and only if ¬∃f ∃α < κ, the domain of f is a subset of α and f[α] is unbounded in κ. For part c, a cardinal κ is a limit cardinal if and only if for all α < κ ∃λ < κ, α < λ and λ is a cardinal. For part d, a cardinal κ is a strong limit cardinal if and only if ¬∃f ∃α < κ, any element in the domain of f is a subset of α and the range of f is κ. For the second claim, suppose λ is an ordinal and f : α ↦ λ where α < λ and f[α] = λ. Since every ordered pair of f is in Vλ, f ∈ Vκ. Similarly if f is a function contradicting that λ is a regular cardinal or a strong limit cardinal then f ∈ Vκ. Finally if λ = µ^+ then µ ∈ Vκ. ⊳

Theorem 4. If ZFC is consistent then it is not provable in ZFC that an inaccessible cardinal exists.

Proof: Let I be the statement that an inaccessible cardinal exists. Let F be the statement “for all M, if M is a model of ZFC then M is a model of I”. Using lemma 3, it follows from I that there is a model of ZFC in which ¬I holds, namely Vκ where κ is the smallest inaccessible cardinal. Writing ⊢ for ⊢_ZFC, it has been shown that ⊢ I ⇒ ¬F. On the other hand, if ⊢ I then ⊢ F. Thus, if ⊢ I then both ⊢ ¬F and ⊢ F, and ZFC is inconsistent. ⊳

Theorem 5. If ZFC is consistent then it is not provable in ZFC that a weakly inaccessible cardinal exists.

Proof: Let W be the statement that a weakly inaccessible cardinal exists. By an argument similar to the proof of theorem 4, if W is provable in ZFC+V = L then ZFC+V = L is inconsistent. But if W is provable in ZFC then W is provable in ZFC+V = L, and if ZFC+V = L is inconsistent then ZFC is inconsistent.
⊳

Set theorists have long suspected that inaccessible cardinals exist. [Godel] contains arguments in favor of their existence. [Hauser] states that “Their existence is intrinsically plausible on the basis of . . . the doctrine that the universe of all sets V is beyond any determination”. [Bagaria] states that the axioms for set theory which should be found intuitively obvious include ZFC “plus, perhaps, some small large-cardinal existence axioms”. See also remarks at the end of chapter 1 of [Kanamori3], including a quote from a 1930 paper of Zermelo.

A compelling argument can be given via “meta-logical” considerations concerning the universe of discourse of set theory. This behaves in many respects like a set, indeed like a domain of discourse of mathematics which is a set, satisfying some axioms. A type structure can be erected on top of it, proper classes being “type 1” collections. Axioms which are intuitively obvious can be given. Among these is the axiom which states that the universe satisfies second order replacement.

These observations can be seen as putting the existence of inaccessible cardinals on the same footing as the existence of sets in general. Since there is “something” that behaves in this way, there is a set that does. The universe is a concept more vague than a set, and aspects of the situation are “reflected” by the existence of inaccessible cardinals.

In addition, a universe without an inaccessible cardinal can be obtained by “truncating” a universe with inaccessible cardinals, at the smallest such. This truncation seems arbitrary, once the existence of inaccessible cardinals is considered.

A “large cardinal” is a cardinal at least as large as the smallest inaccessible cardinal. Throughout the history of set theory various types of large cardinals have been defined, and their properties studied.
While there is still debate on which types of large cardinals should be accepted as existing, most set theorists would probably agree that the existence of inaccessible cardinals should be added to the axioms of set theory. This has not yet been done, though, probably because it makes little difference to the rest of mathematics.

As a corollary of lemma 3, if κ is an inaccessible cardinal then it remains one in L. As will be seen, some types of large cardinals have this property, and some do not.

31. Mahlo cardinals.

In the preceding section, it was observed that inaccessible cardinals can be argued to exist, because the universe behaves like a set, so there is a set that behaves like the universe. This principle might be called “collecting the universe”. At stages of the cumulative hierarchy ⟨Vα⟩ where second order replacement is satisfied, the collection process may be continued.

It should be clear that according to the principle of collecting the universe, the inaccessible cardinals form a proper class. The inaccessible cardinals should be unbounded, and the universe should retain the characteristics of an inaccessible cardinal. Continuing, there are inaccessible cardinals κ where the inaccessible cardinals are unbounded below κ. Such cardinals are called hyperinaccessible.

To continue further, it is useful to introduce the operator Lim on classes of ordinals. If X is a class of ordinals let Lim(X) = {α : X ∩ α is unbounded below α} denote the limit points of X, as defined in section 23. This is standard notation, used for example in [Jech2]. The notation Lim′(X) will be used for X ∩ Lim(X) (there doesn’t seem to be a standard notation for this).

Let Inac denote the class of inaccessible cardinals. The hyperinaccessible cardinals are clearly the class Lim′(Inac). Applying Lim′ again results in the hyper-hyperinaccessible cardinals, etc.
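Written out, the first steps of this iteration read as follows. The superscript notation for the n-fold iterate of Lim′ is an assumption introduced only for this display; it is not notation from the text, which iterates Lim′ through schemes later in the section.

```latex
\begin{align*}
\mathrm{Lim}'(X) &= X \cap \mathrm{Lim}(X), \\
\mathrm{Lim}'(\mathrm{Inac}) &= \{\kappa \in \mathrm{Inac} :
    \mathrm{Inac} \cap \kappa \text{ is unbounded below } \kappa\}
    \quad \text{(the hyperinaccessible cardinals)}, \\
\mathrm{Lim}'^{\,0}(\mathrm{Inac}) &= \mathrm{Inac}, \qquad
\mathrm{Lim}'^{\,n+1}(\mathrm{Inac}) =
    \mathrm{Lim}'\bigl(\mathrm{Lim}'^{\,n}(\mathrm{Inac})\bigr).
\end{align*}
```

Under this reading, the hyper-hyperinaccessible cardinals are the class obtained at n = 2.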
It seems clear that continuing to apply the operation Lim′ results in cardinals which can be argued to exist using the principle of collecting the universe. A more complete discussion of this topic may be found in [Dowd2].

Mahlo cardinals are named after P. Mahlo, who defined weakly Mahlo cardinals in 1911. There had been suspicion that Mahlo cardinals represented some sort of “limit” of iterating the operation Lim′; see [Drake] for example. A result to this effect was proved in 1967 in [Gaifman]. The proof has been simplified by the author and others. It will be given here, since it provides evidence for the existence of Mahlo cardinals, and introduces various basic facts about clubs and stationary sets.

An inaccessible cardinal κ is said to be Mahlo if the inaccessible cardinals below κ are stationary. A weakly inaccessible cardinal κ is said to be weakly Mahlo if the weakly inaccessible cardinals below κ are stationary; weakly Mahlo cardinals will not be further considered.

To begin with, some additional facts about Lim will be noted. The operation Lim is “local”; given any ordinal α, there is an operation Lim acting on the subsets of α, where Lim(X ∩ α) = Lim(X) ∩ α. Recalling a definition from section 23, a class X of ordinals is closed if and only if Lim(X) ⊆ X. Clearly this is the case if and only if Lim(X) = Lim′(X). More generally, X is said to be closed in Y if Lim(X) ∩ Y ⊆ X.

Lemma 1. The operation Lim satisfies
a. X ⊆ Y ⇒ Lim(X) ⊆ Lim(Y),
b. Lim(Lim(X)) ⊆ Lim(X), and
c. Lim(X ∪ Y) = Lim(X) ∪ Lim(Y).

Proof: Part a is obvious. For part b, suppose α ∈ Lim(Lim(X)). If β < α then ∃γ ∈ Lim(X) ∩ α(β < γ), whence ∃δ ∈ X ∩ γ(β < δ). This shows that α ∈ Lim(X). For part c, Lim(X) ∪ Lim(Y) ⊆ Lim(X ∪ Y) by part a. If α ∈ Lim(X ∪ Y) then (X ∪ Y) ∩ α is unbounded below α, whence X ∩ α or Y ∩ α must be. ⊳

Next, some additional facts about club subsets and stationary subsets of an uncountable cardinal κ will be noted.
The diagonal intersection △_{ξ<κ} Xξ of an indexed sequence Xξ of subsets of κ is the subset which contains an ordinal α if and only if α ∈ Xξ for ξ < α.

Lemma 2. Suppose κ is an uncountable cardinal.
a. If Cξ is a club subset for ξ < η where η < κ then ∩_{ξ<η} Cξ is a club subset.
b. If Cξ is a club subset for ξ < κ then △_{ξ<κ} Cξ is a club subset.
c. If C is a club subset then Lim(C) is a club set.

Proof: For part a, Lim(∩_ξ Cξ) ⊆ Lim(Cξ) ⊆ Cξ, so Lim(∩_ξ Cξ) ⊆ ∩_ξ Cξ; thus, ∩_ξ Cξ is closed. Given α < κ, at stage η · i + ξ choose an element of Cξ greater than the elements chosen so far (greater than α at stage 0). Let γ be the supremum of the elements chosen. Then γ ∈ Lim(Cξ) for each ξ, so γ ∈ Cξ for each ξ, so γ ∈ ∩_ξ Cξ.

For part b, suppose C = △_{ξ<κ} Cξ and α ∈ Lim(C). If ξ < α then there is an increasing sequence ⟨α_ζ⟩ with ξ < α_ζ < α and α_ζ ∈ C, whose supremum is α. Then α_ζ ∈ Cξ for all ζ, whence since Cξ is closed α ∈ Cξ. Since ξ was arbitrary, α ∈ △_{ξ<κ} Cξ, and it has been shown that C is closed. Given α < κ, choose β0 > α with β0 ∈ C0. At stage n + 1, choose β_{n+1} > βn with β_{n+1} ∈ C′_{βn}, where C′_ξ = ∩_{ζ≤ξ} C_ζ. Let β = sup{βn}. If ξ < β then ξ < βn for some n. For k > n, βk ∈ C′_{βn}, and so β ∈ C′_{βn}, and so β ∈ Cξ. Since ξ was arbitrary, β ∈ △_{ξ<κ} Cξ, and it has been shown that C is unbounded.

For part c, since Lim(Lim(C)) ⊆ Lim(C), Lim(C) is closed. If X ⊆ κ is unbounded then Lim(X) is unbounded: Given α ∈ κ, choose an ascending chain α0 < α1 < · · · of length ω, of elements of X, with α0 > α. The limit of this chain is in Lim(X), and is greater than α. ⊳

If β is a limit ordinal of uncountable cofinality the notion of a club subset of β is still a sensible notion. Parts a and c of the lemma may be seen to hold, with essentially the same proofs, where η < cf(β) in part a.

Lemma 3. Suppose κ is an uncountable cardinal. Suppose X ⊆ Y ⊆ κ, Y is stationary, and X is closed in Y and unbounded; then the following hold. a. X is stationary. b.
Lim′(X) is closed in Y and unbounded.

Proof: Since X is unbounded Lim(X) is a club subset (see the proof of lemma 2.c). Given a club subset C, Lim(X) ∩ C is a club subset, so Lim(X) ∩ Y ∩ C is nonempty; since Lim(X) ∩ Y ⊆ X, X ∩ C is nonempty. This proves part a. For part b, X ∩ Lim(X) ∩ Y ⊆ X ∩ Lim(X), and X ∩ Lim(X) ∩ Y = Lim(X) ∩ Y is unbounded, so X ∩ Lim(X) is unbounded. ⊳

The Lim′ operation may be iterated through ordinals, by taking intersections at limit ordinals. To avoid triviality, at a stage α where cf(α) < κ, an intersection of length η < κ should be taken; and if cf(α) = κ a diagonal intersection should be taken.

These observations may be captured in the notion of a “scheme” in κ; this is a specification of the intersections and diagonal intersections to be taken. A length is given, which is a successor ordinal ρ + 1 < κ^+. For each α ≤ ρ which is a limit ordinal, an increasing unbounded sequence ⟨α_ξ⟩ with domain η ≤ κ is given. It follows that η = κ if and only if cf(α) = κ.

For a scheme Σ in κ, an operation F on subsets of κ, and a subset X of κ, the result F^Σ(X) of using Σ to iterate F on X may be defined. Indeed, for α ≤ ρ let Xα be the result of applying F through α steps according to Σ. This is defined by recursion, with the definition falling into 4 cases:
0. X0 = X.
1. X_{β+1} = F(Xβ).
2. Xα = ∩_{ξ<η} X_{α_ξ}, if α is a limit ordinal and η < κ.
3. Xα = △_{ξ<κ} X_{α_ξ}, if α is a limit ordinal and η = κ.
F^Σ(X) equals Xρ.

Lemma 4. If Y ⊆ κ is stationary then for any scheme Σ in κ, Lim′^Σ(Y) is closed in Y and unbounded.

Proof: The proof is by induction on Σ. The basis is trivial. The claim follows for Lim′(Lim′^Σ(Y)) from the claim for Lim′^Σ(Y) by lemma 3. Suppose Xξ is closed in Y and unbounded for ξ < η where η < κ. Clearly ∩_{ξ<η} Xξ is closed in Y. Also, ∩_{ξ<η} Lim(Xξ) is club, so Y ∩ (∩_{ξ<η} Lim(Xξ)) is unbounded. Suppose Xξ is closed in Y for ξ < κ, and suppose α ∈ Lim(△_ξ Xξ) ∩ Y. Let α_η be a sequence in △_ξ Xξ converging to α. If ξ < α then some suffix of the sequence converges in Xξ to α, so α ∈ Xξ. But this shows that α ∈ △_ξ Xξ.
The argument for unboundedness is similar to the intersection case. ⊳
Lemma 5. If Y ⊆ κ is not stationary then for some scheme Σ in κ, Lim′^Σ(Y) = ∅.
Proof: Let Z ⊆ κ be a club set disjoint from Y. Enumerate Z in natural order as ⟨α_γ : γ < κ⟩. Choose any scheme of rank κ where the limiting sequence for κ is ⟨α_γ⟩. By induction Y_α ∩ α = ∅ for α < κ. It follows that α ∉ Y_κ for α ∈ Lim, whence Lim′(Y_κ) = ∅. ⊳
Theorem 6. Suppose κ ∈ Inac; the following are equivalent.
a. κ is Mahlo.
b. For any scheme Σ in κ, Lim′^Σ(Inac ∩ κ) is closed in Inac and unbounded.
c. For any scheme Σ in κ, Lim′^Σ(Inac ∩ κ) is stationary.
d. For any scheme Σ in κ, Lim′^Σ(Inac ∩ κ) ≠ ∅.
Proof: b follows from a by lemma 4. c follows from b by lemma 3. d follows from c immediately. a follows from d by lemma 5. ⊳
Thinking of κ as Ord, the theorem indicates that Ord has the Mahlo property exactly if no iteration of Lim′ exhausts the inaccessible cardinals. Since it is reasonable to suppose that this is the case, it is reasonable to suppose that the universe has the Mahlo property, whence it is reasonable to suppose that Mahlo cardinals exist.
The theorem can be recast in terms of filters. Let F be the subsets X ⊆ κ which contain a club set. As mentioned in section 23, it follows by lemma 2 that F is a filter, called the club filter. Say that a filter is proper, or nontrivial, if it does not contain ∅; clearly F is proper. Say that a filter of subsets of κ is κ-complete if it is closed under intersections of length less than κ; and normal if it is closed under diagonal intersections. Lemma 2 shows that F is a κ-complete normal proper filter.
Theorem 7. Suppose κ ∈ Inac. Then κ is Mahlo if and only if there is a κ-complete normal proper filter of subsets of κ, containing Inac ∩ κ and closed under Lim′.
Proof: Suppose κ is Mahlo. Using theorem 6.d, let F0 be the sets Lim′^Σ(Inac ∩ κ) for Σ a scheme in κ. Let F be the sets containing a set in F0.
Then F is a filter with the required properties. If F is a filter with the required properties then F0 ⊆ F and the requirement in theorem 6.d holds. ⊳
The statements “X is a closed subset” and “X is an unbounded subset” of an ordinal α are readily verified to be Δ_0. The statement “κ is a Mahlo cardinal” is Π_1; κ is a Mahlo cardinal if and only if κ is an inaccessible cardinal, and any club subset of κ contains an inaccessible cardinal. Thus, if κ is a Mahlo cardinal then κ is a Mahlo cardinal in L.
Mahlo cardinals appear in various topics in set theory. Some examples may be found in [Jech2], for example corollary 18.4. A fairly recent result concerning Mahlo cardinals and Aronszajn trees can be found in [Todor1]. In [Friedman3] a statement about the integers is given, whose proof requires the existence of Mahlo cardinals. Large cardinals, including weakly Mahlo cardinals, have been used in “ordinal analysis”; see [Rathjen2] for a survey.
32. Greatly Mahlo cardinals.
Like Lim′ of the previous section, the Mahlo operation is defined on Pow(κ) for an inaccessible cardinal κ. Several variations have been considered; the definition used here will be as follows: H(X) = {λ ∈ Inac ∩ X : X ∩ λ is stationary below λ}. Say that an inaccessible cardinal κ is greatly Mahlo if and only if there is a κ-complete normal proper filter of subsets of κ, containing Inac ∩ κ and closed under H.
Theorem 1. Suppose κ ∈ Inac. Then κ is greatly Mahlo if and only if for any scheme Σ in κ, H^Σ(Inac ∩ κ) ≠ ∅.
Proof: The proof is just like the proof of theorem 31.6. ⊳
The greatly Mahlo cardinals were so-named in [BTW]. They had already been considered, along with even larger Mahlo-type cardinals, in [Gaifman]. The considerations of the previous section can be continued, suggesting that it is reasonable to suppose that greatly Mahlo cardinals exist.
No attempt will be made to give a detailed argument here, but the Mahlo cardinals should be a proper class, and generalizing using theorem 31.6, applying Mahlo’s operation should not exhaust the cardinals. Iterating Mahlo’s operation using schemes should not either.
Theorem 2. If a cardinal κ is greatly Mahlo then this is true in L.
Remarks on proof: First, the predicate “Σ is a scheme” is Δ_0. A scheme Σ is a function whose domain is an ordinal. Σ(α) = 0 if α is a successor, else it is an increasing unbounded sequence with domain ≤ κ. By induction on Σ, H^Σ(Inac ∩ κ) ⊆ (H^Σ(Inac ∩ κ))^L. The basis follows because “λ is inaccessible” is Π_1. At successor stages, Y = H(X) where inductively X ⊆ X^L. If λ ∈ Y then λ ∈ X, so λ ∈ X^L; and X is stationary below λ, so X^L is. “X is stationary below λ” is Π_1, so X^L is stationary below λ in L, that is, λ ∈ Y^L. Intersection and diagonal intersection are straightforward. It follows that if H^Σ(Inac ∩ κ) is nonempty for any Σ, then this is true in L. ⊳
Greatly Mahlo cardinals have been appearing in topics in set theory. [Jech2] mentions one example.
A further characterization of greatly Mahlo cardinals can be given, using an additional method, first introduced in [Jech1]. This method has continued to find uses since, so an outline will be given. For the rest of the section suppose κ is an inaccessible cardinal, although various facts hold more generally.
For X, Y ∈ Pow(κ) let X ⊆_t Y denote that X − Y is thin. A binary relation which is reflexive and transitive will be called a quasi-order; other terminology is in use, such as preorder.
Lemma 3. The relation X ⊆_t Y on Pow(κ) is a quasi-order.
Proof: X ⊆_t X is immediate. If X ⊆_t Y and Y ⊆_t Z then X ⊆ Y ∪ T_1 and Y ⊆ Z ∪ T_2 where T_1 and T_2 are thin. Then X ⊆ Z ∪ T_1 ∪ T_2, and T_1 ∪ T_2 is thin since the thin sets form the ideal dual to the club filter. ⊳
Let H_i(X) = {λ ∈ Inac : X ∩ λ is stationary below λ}. Then H(X) = X ∩ H_i(X); note the resemblance to Lim′ and Lim.
Lemma 4.
a. If X ⊆ Y then H_i(X) ⊆ H_i(Y).
b. H_i(H_i(X)) ⊆ H_i(X).
c. H_i(X ∪ Y) = H_i(X) ∪ H_i(Y).
Proof: Part a is obvious. For part b, suppose λ ∉ H_i(X) where λ ∈ Inac. Then there is a club subset C ⊆ λ such that C ∩ X = ∅. If µ ∈ Lim(C) ∩ Inac then C ∩ µ is a club subset of µ, whence X ∩ µ is a thin subset of µ, whence µ ∉ H_i(X). Since µ was arbitrary, Lim(C) ∩ H_i(X) = ∅, whence λ ∉ H_i(H_i(X)). Part c follows just as lemma 31.c. ⊳
Lemma 5.
a. If X is thin then H_i(X) is thin.
b. If X ⊆_t Y then H_i(X) ⊆_t H_i(Y).
c. If X ⊆_t Y then H(X) ⊆_t H(Y).
Proof: For part a, suppose H_i(X) is stationary. Let C be a club subset; then Lim(C) is a club subset, so there is some λ ∈ Lim(C) ∩ H_i(X). Since λ ∈ Lim(C), C ∩ λ is a club subset of λ; and since λ ∈ H_i(X), X ∩ λ is a stationary subset of λ. Thus, C ∩ X is nonempty, and since C was arbitrary X is stationary. For part b, if X ⊆_t Y then X ⊆ Y ∪ T where T is thin, so H_i(X) ⊆ H_i(Y ∪ T) = H_i(Y) ∪ H_i(T), and H_i(T) is thin by part a, so H_i(X) ⊆_t H_i(Y). Part c follows from part b, and the fact that the operation ∩ (and also ∪) respects ⊆_t, whose proof is left to the reader. ⊳
Lemma 6. If X_ξ is a family of subsets of κ then Lim(∩_ξ X_ξ) ⊆ ∩_ξ Lim(X_ξ).
Proof: This follows by lemma 31.1.a, and ∩_ξ X_ξ ⊆ X_ξ. ⊳
Lemma 7. If X is a stationary subset of κ and C is a club subset, then X ∩ C is a stationary subset.
Proof: If D is a club subset then X ∩ C ∩ D is nonempty because C ∩ D is a club subset. ⊳
Lemma 8. If X, Y are subsets of κ and C is a club subset, and X ⊆ H(Y), then X ∩ Lim(C) ⊆ H(Y ∩ C).
Proof: An element λ of X ∩ Lim(C) is an element of H(Y) also, whence it is an inaccessible cardinal, and λ ∈ C and λ ∈ Y. C ∩ λ is a club subset of λ, and Y ∩ λ is a stationary subset of λ, so by lemma 7 Y ∩ C ∩ λ is a stationary subset of λ. It follows that λ ∈ H(Y ∩ C). ⊳
The binary relation X ≺ Y on the stationary subsets of an inaccessible cardinal κ will be defined to hold if and only if Y ⊆_t H(X).
This is a variation from [Jech1] in that a variation of the Mahlo operation is used.
Lemma 9. The relation X ≺ Y is transitive.
Proof: If Y ⊆_t H(X) and Z ⊆_t H(Y) then H(Y) ⊆_t H(H(X)) by lemma 5, and H(H(X)) ⊆ H(X), whence Z ⊆_t H(X) by lemma 3. ⊳
Lemma 10. The relation X ≺ Y is well-founded.
Proof: Suppose X_0 ≻ X_1 ≻ X_2 ≻ · · · is an infinite descending chain. Let C_i be a club set disjoint from X_i − H(X_{i+1}), so that X_i ∩ C_i ⊆ H(X_{i+1}). Let C′_i = ∩_{j≥0} Lim^(j)(C_{i+j}), where Lim^(j)(X) denotes Lim applied to X j times. Let X′_i = X_i ∩ C′_i. By lemma 8, X_i ∩ C_i ∩ Lim(C′_{i+1}) ⊆ H(X_{i+1} ∩ C′_{i+1}). Using lemma 6, C_i ∩ Lim(C′_{i+1}) ⊆ C′_i. Thus, X′_i ⊆ H(X′_{i+1}). Let λ_i be the least element of X′_i. Then λ_i ∈ H(X′_{i+1}), so λ_i > λ_{i+1}, which yields an infinite descending chain of ordinals. ⊳
Given a well-founded relation on a set S, for an ordinal ρ, the set S_ρ of nodes of rank ρ is defined by the transfinite recursion: S_ρ equals the minimal elements of S − ∪_{ν<ρ} S_ν. For x ∈ S let ρ(x) be the unique ρ such that x ∈ S_ρ. The function ρ : S ↦ Ord will be referred to as the canonical rank function of the relation.
If a well-founded relation < on a set S is transitive the canonical rank function has the following properties.
1. If ν < ρ(y) then there is a z < y with ρ(z) = ν. (y is not minimal in S_1 = S − ∪_{ξ<ν} S_ξ, so there is a minimal z ∈ S_2 = {u ∈ S_1 : u < y}; z is easily seen to be minimal in S_1, so ρ(z) = ν.)
2. If y > x then ρ(y) > ρ(x). (If ρ(x) = ρ(y) then x and y are incomparable. If ρ(x) > ρ(y) then there is a y′ with ρ(y′) = ρ(y) and x > y′; but then y > y′, a contradiction.)
3. If x, y are such that y > z ⇒ x > z then ρ(y) ≤ ρ(x). (If ρ(y) > ρ(x) then there is a z such that y > z and ρ(z) = ρ(x); but then x ≯ z.)
4. The range of ρ is an ordinal α. Since ρ : S ↦ α is surjective, α < |S|^+.
Let S denote the stationary subsets of (Inac ∩ κ) ∪ {0}. Let ρ_≺ denote the rank function of the order ≺ restricted to S. This rank has the following properties.
1.
If X ⊆_t Y then ρ_≺(X) ≥ ρ_≺(Y). (This follows since then Y ≺ Z implies X ≺ Z.)
2. If ρ_≺(Y) ≥ α, and Z ⊆_t Y whenever ρ_≺(Z) ≥ α, then ρ_≺(Y) = α. (If ρ_≺(Y) > α and Z is such that ρ_≺(Z) = α and Y ≺ Z then Z ⊆_t Y and Y ⊆_t H(Z), a contradiction.)
A stationary set M ∈ S will be said to be maximal of rank ρ if ρ_≺(M) = ρ, and X ⊆_t M whenever X ∈ S and ρ_≺(X) ≥ ρ.
Lemma 11. Suppose X′ and X_ξ are elements of S.
0. If Inac ∩ κ is stationary then it is in S and has rank 0.
1a. If X = H(X′) is stationary then X ∈ S and ρ_≺(X) ≥ ρ_≺(X′) + 1; and
1b. if X′ is maximal then X is maximal and ρ_≺(X) = ρ_≺(X′) + 1.
2a. Suppose η < κ is a limit ordinal, ρ_ξ for ξ < η is an increasing sequence of ordinals with limit ρ, ρ_≺(X_ξ) = ρ_ξ for ξ < η, X = ∩_{ξ<η} X_ξ, and X is stationary. Then X ∈ S and ρ_≺(X) ≥ ρ; and
2b. if X_ξ is maximal for ξ < η then X is maximal and ρ_≺(X) = ρ.
3a. Suppose ρ_ξ for ξ < κ is an increasing sequence of ordinals with limit ρ, ρ_≺(X_ξ) = ρ_ξ for ξ < κ, X = △_{ξ<κ} X_ξ, and X is stationary. Then X ∈ S and ρ_≺(X) ≥ ρ; and
3b. if X_ξ is maximal for ξ < κ then X is maximal and ρ_≺(X) = ρ.
Proof: Part 0 is immediate. For part 1a, X ∈ S is clear, and since X ≺ X′, ρ_≺(X) > ρ_≺(X′). For part 1b, write ρ = ρ_≺(X′); if ρ_≺(Y) ≥ ρ + 1 then ρ_≺(Y) > ρ, so there is a Z such that ρ_≺(Z) = ρ and Y ≺ Z. Inductively, Z ⊆_t X′, so Y ⊆_t H(Z) ⊆_t H(X′). For part 2a, again X ∈ S is clear, and since X ⊆ X_ξ for each ξ, ρ_≺(X) ≥ ρ_ξ for each ξ, whence ρ_≺(X) ≥ ρ. For part 2b, if ρ_≺(Y) ≥ ρ then ρ_≺(Y) ≥ ρ_ξ for all ξ < η, so Y ⊆_t X_ξ for all ξ < η, so Y ⊆_t X. Parts 3a and 3b are similar to part 2, noting that 0 ∈ X is allowed by the definition of S. ⊳
Corollary 12. For any scheme Σ in κ, if H^Σ(Inac ∩ κ) is stationary then it is maximal.
Proof: The proof is by induction on Σ. ⊳
The rank ρ_≺(κ) of κ is defined to be the least ρ such that for every ν < ρ, there is a stationary set S ⊆ Inac with ρ_≺(S) ≥ ν.
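The canonical rank function used throughout this section is defined by a recursion that can be run directly when the underlying set is finite. The following Python sketch is only an illustration under that finiteness assumption; the function name canonical_rank and the toy relation (proper divisibility on {1, ..., 12}) are choices made here, not part of the text. Level ρ is computed as the set of minimal elements of what remains after the earlier levels are removed.

```python
def canonical_rank(elements, less_than):
    """Return {x: rank} for a well-founded relation on a finite set.

    less_than(a, b) is True when a < b in the relation.  Level rho is the
    set of minimal elements of S minus the union of all earlier levels.
    """
    remaining = set(elements)
    rank = {}
    level = 0
    while remaining:
        minimal = {x for x in remaining
                   if not any(less_than(y, x) for y in remaining if y != x)}
        if not minimal:
            # On a finite set this can only happen if the relation has a cycle.
            raise ValueError("relation is not well-founded")
        for x in minimal:
            rank[x] = level
        remaining -= minimal
        level += 1
    return rank


# Hypothetical toy relation: a < b when a properly divides b.
def properly_divides(a, b):
    return a != b and b % a == 0


ranks = canonical_rank(range(1, 13), properly_divides)
# 1 has rank 0, the primes have rank 1, and 8 and 12 have rank 3.
```

In this example the relation is transitive, so the properties listed for transitive well-founded relations can be observed: the rank strictly increases along the relation, and the range of the rank function is the ordinal 4 = {0, 1, 2, 3}.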
With the foregoing conventions, any κ has rank ≥ 0, and the rank of κ is ≥ 1 if and only if Inac ∩ κ is stationary, if and only if κ is Mahlo, if and only if S is nonempty. If arbitrary stationary sets rather than sets in S were allowed, the rank would always be at least 1.
Theorem 13. An inaccessible cardinal κ is greatly Mahlo if and only if ρ_≺(κ) ≥ κ^+.
Proof: κ is greatly Mahlo if and only if H^Σ(Inac ∩ κ) is nonempty for all schemes Σ in κ. Lemma 31.5 clearly holds with Lim′ replaced by H, whence κ is greatly Mahlo if and only if H^Σ(Inac ∩ κ) is stationary for all schemes Σ in κ. By corollary 12 κ is greatly Mahlo if and only if there is a stationary set of rank ρ for all ρ < κ^+, which is so if and only if its Mahlo rank is ≥ κ^+. ⊳
33. Reflection principles.
A reflection principle in set theory is a statement to the effect that if some statements hold in a model then they hold in a submodel. There are a variety of such principles, some provable in ZFC, and some stronger than ZFC. One example, lemma 20.2, has already been given.
It is a theorem of ZFC that a single formula can be reflected. Theorem 1.8.1 of [Devlin] is a general theorem of this type. Namely, suppose W = ∪_{α∈Ord} W_α, where W_α is a transitive set, the predicate x ∈ W_α is definable, α ≤ β ⇒ W_α ⊆ W_β, and if α ∈ LimOrd then W_α = ∪_{β<α} W_β. Let F be a formula, with free variables among x⃗. Then for any α there is a limit ordinal β > α such that ∀x⃗ ∈ W_β (F^W ⇔ F^{W_β}). The version where W = V and W_α = V_α is proved in theorem 12.14 of [Jech2]; and theorem 3.6.3 of [Drake], where it is called the Montague-Levy reflection principle.
Inaccessible cardinals can be characterized in terms of a reflection principle. Let L be the first order language ⟨=, ∈, P⟩ where P is a unary predicate. By a structure for L will be meant a pair ⟨M, X⟩ where M is a transitive set and X = P̂, where as in section 6 P̂ denotes the interpretation of P. A substructure is a pair ⟨N, X ∩ N⟩ where N ⊆ M.
Say that an ordinal α is Π^1_0-indescribable if for every sentence F in L, if ⊨_{V_α, X} F then for some β < α, ⊨_{V_β, X∩V_β} F.
Theorem 1. α is an inaccessible cardinal if and only if α is Π^1_0-indescribable.
Remarks on proof: Suppose α is an inaccessible cardinal κ. Let X ⊆ V_κ be an interpretation for P, and let {f} be a set of Skolem functions for the formulas of the language L. Since κ is inaccessible, for any β < κ there is a γ with β < γ < κ, such that for all x⃗ ∈ V_β and all Skolem functions f, f(x⃗) ∈ V_γ; let e(β) be the least such γ. Let γ_0 = 0, and given γ_n let γ_{n+1} = e(γ_n). Let γ = sup{γ_n}. Using lemma 20.1, it follows that ⟨V_γ, X ∩ V_γ⟩ is an elementary substructure of ⟨V_κ, X⟩.
The opposite implication is proved by showing that if α fails to have a property of an inaccessible cardinal, then a sentence F can be found which is not reflected. Sentences using more than one predicate may be used, since these can be converted to sentences using a single predicate (code two classes X_1, X_2 as the single class ∪_{i=1,2} {⟨i, x⟩ : x ∈ X_i}). If α = γ + 1 let X = {γ}, and let F be ∃xP(x). If γ < α, f : γ ↦ α, and f[γ] is unbounded, let X_1 = {γ} and X_2 = f, let G(x) be “P_2 is a function with domain x taking ordinal values”, and let F be ∃x(P_1(x) ∧ G(x)). If γ < α, f : Pow(γ) ↦ α, and f is surjective, let X_1 = {γ} and X_2 = f, let G(x) be “P_2 is a function with domain Pow(x) taking ordinal values”, and let F be ∃x(P_1(x) ∧ G(x)). If α = ω let F be “∀x∃y(x ∈ y)”. For further details see theorem 9.1.3 of [Drake]. ⊳
34. Indescribable cardinals.
New types of large cardinals can be obtained by generalizing the formulas that are considered in the reflection principle of theorem 33.1. The language of the more general formulas has two sorts of variables, “first order”, which range over elements of the universe of discourse, and “second order”, which range over subsets.
Second order variables are sometimes more convenient than unary predicate symbols, provided some technical details are attended to.
The atomic formulas are x ∈ y, x = y, x ∈ Y, X = Y, where x, y are first order variables and X, Y are second order variables. Some authors continue to use Y(x) rather than x ∈ Y. Second order variables may be quantified over.
The notion of a structure has already been indicated; second order variables range over subsets, and the satisfaction definition only needs to be modified accordingly. The restriction of the value X of a free second order variable to a substructure N is X ∩ N.
A formula with no bound second order variables is called Δ^1_0; it is convenient to consider these to be the Π^1_0 and Σ^1_0 formulas as well. This differs slightly from the definition in section 33, in that several free second order variables are allowed, rather than coding them into a single one. A Π^1_{n+1} (resp. Σ^1_{n+1}) formula is obtained from a Σ^1_n (resp. Π^1_n) formula by preceding it with universal (resp. existential) quantifications of second order variables.
Variables of order higher than second order can be considered. This is omitted here; see [Koellner2] for one discussion.
Using notation from chapter 20, an inaccessible cardinal κ is said to be Π^1_n-indescribable if for every Π^1_n formula F and suitable variable list W⃗, if ⊨_{V_κ} F_W⃗(X⃗) then for some α < κ, ⊨_{V_α} F_W⃗(X_1 ∩ V_α, ..., X_k ∩ V_α). By theorem 33.1, there is no loss of generality in requiring κ to be an inaccessible cardinal. For simplicity the free variables are required to be second order; these can be used to “simulate” first order free variables.
To avoid repetition, let F̄ denote a formula, together with a suitable list of variables and assignment X_1, ..., X_k of subsets of V_κ to the variables; and for α < κ let F̄↾α denote the result of replacing X_i by X_i ∩ V_α.
There is no point to defining Σ^1_n-indescribable cardinals.
Omitting free variables, if F̄ is ∃X Ḡ, and ⊨_{V_κ} F̄, then ⊨_{V_κ} Ḡ where Ḡ extends the assignment to the variable X in some way, so ⊨_{V_α} Ḡ for some α < κ, so ⊨_{V_α} F̄.
If κ is a Π^1_n-indescribable cardinal, a subset Q ⊆ κ is said to be Π^1_n-enforceable if there is an F̄ such that ⊨_{V_κ} F̄, and {α : ⊨_{V_α} F̄↾α} ⊆ Q. By the definition of a Π^1_n-indescribable cardinal, a Π^1_n-enforceable set is nonempty. In fact, such a set is stationary. If F̄ witnesses that Q is Π^1_n-enforceable, and C is a club subset of κ, let Ḡ be the formula “C is unbounded and F̄”. Then ⊨_{V_κ} Ḡ, so ⊨_{V_α} Ḡ↾α for some α. Since C is a club subset, α ∈ C; and since F̄ holds, α ∈ Q. Thus, Q ∩ C is nonempty, and since C was arbitrary, Q is stationary.
A Π^1_n formula and assignment F̄ may be coded as a subset of V_κ; F̃ will be used to denote such a code. The predicate “F̃ is true” will be denoted “Tru_n(F̃)”. There is a Π^1_n formula which defines this predicate in any V_κ for κ an inaccessible cardinal (in fact, in any V_α where α ∈ LimOrd). This predicate is Π^1_n. Only a brief discussion will be given here; see [Drake] for further details. By usual methods, the predicate Tru_0(F̃) is Δ^1_1. Tru_1(F̃) states that “for all G̃, if G̃ is properly derived from F̃ then Tru_0(G̃)”; and similarly for additional quantifiers.
The predicate “F̃ is true in M” will be denoted Sat_{s.o.}(M, F̃). There is a Π^1_n formula which defines this predicate in any V_α where α ∈ LimOrd.
Using the preceding predicates, it may be shown that the set of Π^1_n-indescribable cardinals is a Π^1_{n+1}-enforceable set. Let E_n be the sentence “∀F̃(Tru_n(F̃) ⇒ ∃α(Sat_{s.o.}(V_α, F̃)))”. E_n is Π^1_{n+1}, and an inaccessible cardinal λ is a Π^1_n-indescribable cardinal if and only if ⊨_{V_λ} E_n. In particular, if κ is a Π^1_{n+1}-indescribable cardinal then ⊨_{V_κ} E_n.
From here on only Π^1_1-indescribable cardinals will be considered.
Theorem 1. Suppose κ is a Π^1_1-indescribable cardinal.
The collection of Π^1_1-enforceable sets is a proper normal filter which contains Inac ∩ κ and is closed under H.
Remarks on proof: The formula witnessing that Inac ∩ κ is enforceable is “∃x(x = ω) and ∀x(Pow(x) is a set) and ∀F∀x(if F is a function then F[x] is a set)”. If F̄ witnesses that Q is enforceable then “F̄ and ∀C(C club implies C ∩ Q nonempty)” witnesses that H(Q) is enforceable. Suppose Q_ξ ⊆ κ for ξ < κ is witnessed by F̄_ξ. Let F̃ = {⟨ξ, x⟩ : x ∈ F̃_ξ}. Then △_{ξ<κ} Q_ξ is witnessed by ∃x(x = ω) ∧ ∀x∃y(x ∈ y) ∧ ∀ξ(G̃ = F̃_ξ ⇒ Tru′(ξ, c, P)), where G̃ = F̃_ξ is an abbreviation for a formula involving G̃ and F̃. For the case ∩_{ξ<µ} Q_ξ, replace ∃x(x = ω) by ∃x(x = µ), and ∀ξ by ∀ξ < µ. For further details see section 9.1 of [Drake]. ⊳
Thus, a Π^1_1-indescribable cardinal is greatly Mahlo. It may be shown that the greatly Mahlo cardinals below a Π^1_1-indescribable cardinal comprise a Π^1_1-enforceable set. It is a topic of current research, what ρ_≺(κ) is for a Π^1_1-indescribable cardinal, and whether a determination of this would provide evidence that Π^1_1-indescribable cardinals exist.
Theorem 2. If κ is a Π^1_1-indescribable cardinal then κ has the tree property.
Remarks on proof: This is theorem 9.15 of [Drake], and follows from theorem 17.18 of [Jech2]. Suppose κ is inaccessible and T is a κ-tree, which recall from section 24 is a tree whose height is κ, and where |T_α| < κ for each α. By enumerating the levels successively, it may be assumed that T_α ⊆ V_β, where the function α ↦ β is an increasing function from κ to κ. If T has no branch of length κ then the formula “T is unbounded and for all X, if X is a branch of T then X is bounded” is true in V_κ but not in V_α for any α < κ. ⊳
As mentioned in section 24, if an inaccessible cardinal has the tree property then it is Π^1_1-indescribable. Since the proof is somewhat involved it will be omitted. However some comments will be made.
First, some standard notation for “Ramsey-theoretic” properties of sets will be introduced. Let [x]^n denote the set of n element subsets of a set x, where n is an integer. If f : [x]^n ↦ I, a subset y ⊆ x is said to be homogeneous for f if there is an i ∈ I such that f(s) = i for all s ∈ [y]^n. For cardinals κ, µ, and λ, and an integer n, the notation κ → (λ)^n_µ is used for the statement “for any function f : [κ]^n ↦ µ, there is a homogeneous subset of κ of cardinality λ”. A classic example of a Ramsey-theoretic theorem states that 6 → (3)^2_2. That is, if the lines between 6 points are colored red and green, then either there is a red triangle or a green triangle.
Following is a list of properties which a cardinal might have.
1. Inaccessible and tree property
2. Uncountable and κ → (κ)^2_2
3. Inaccessible and a certain infinitary language has a certain compactness property
4. A certain elementary extension property
5. Π^1_1-indescribable
These properties are equivalent. Following are references to the proof in [Jech2]. See also theorem 10.2.1 of [Drake]. 1⇔2 is proved in lemmas 9.9 and 9.26. 1⇔3 is proved in theorem 17.13.(i). 3⇔4 is proved in lemma 17.17. 4⇔5 is proved in theorem 17.18. If the compactness property of property 3 is modified slightly the requirement of inaccessibility becomes redundant; see proposition 4.4 of [Kanamori3].
If a cardinal is Π^1_1-indescribable then it remains so in L. Again, the proof is somewhat involved, and may be found in theorem 17.22 of [Jech2]. The claim is proved for the property κ → (κ)^2_2.
35. Ultrapowers.
Ultrapowers are a construction from mathematical logic which have various uses in set theory; one will be seen in the next section. Ultrapowers are a special case of a slightly more general construction, ultraproducts.
Recall that a filter of subsets of a set S is a nonempty subset F of Pow(S) such that if A ∈ F and A ⊆ B then B ∈ F, and if A, B ∈ F then A ∩ B ∈ F.
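On a finite set the two closure conditions in the definition of a filter can be checked exhaustively. The following Python sketch is offered only as an illustration; the helper names powerset, is_filter and principal_filter and the three-element set are assumptions made here. It confirms that the family of all supersets of a fixed set is a filter, and that a family which is upward closed but not closed under intersection is not.

```python
from itertools import combinations


def powerset(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(family, s):
    """Check the filter axioms for a family of subsets of the finite set s."""
    family = set(family)
    if not family:
        return False
    subsets = powerset(s)
    # Upward closure: A in F and A a subset of B imply B in F.
    upward = all(b in family for a in family for b in subsets if a <= b)
    # Closure under binary intersection.
    meets = all((a & b) in family for a in family for b in family)
    return upward and meets


def principal_filter(generator, s):
    """All subsets of s that contain the set `generator`."""
    g = frozenset(generator)
    return {a for a in powerset(s) if g <= a}


S = frozenset({0, 1, 2})
F = principal_filter({0}, S)  # the four subsets of S containing 0
G = {S, frozenset({0, 1}), frozenset({0, 2})}  # upward closed, but {0} is missing

print(is_filter(F, S))  # True
print(is_filter(G, S))  # False
```

Every filter on a finite set arises in this way from a generating set, so examples that are not of this form require an infinite underlying set.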
A filter is
- proper if ∅ ∉ F,
- µ-complete for a cardinal µ if the intersection of fewer than µ sets in F is in F,
- an ultrafilter if it is proper and for any A ⊆ S, either A ∈ F or A^c ∈ F, and
- principal if there is some A_0 ⊆ S such that A ∈ F if and only if A_0 ⊆ A.
It is readily verified that in a principal ultrafilter, A_0 must be a singleton set, i.e., of the form {x} for some x ∈ S. Indeed, if x ∈ A_0 and {x}^c ∈ F then A_0 − {x} ∈ F.
Let {M_i : i ∈ I} be a family of structures for a first order language. The Cartesian product ×_{i∈I} M_i is the set of sequences ⟨x_i : i ∈ I⟩ where x_i ∈ M_i. Let U be an ultrafilter on I. A relation ≡_U is defined on ×_{i∈I} M_i by the requirement, ⟨x_i⟩ ≡_U ⟨y_i⟩ if and only if {i ∈ I : x_i = y_i} ∈ U.
Lemma 1. ≡_U is an equivalence relation.
Proof: ⟨x_i⟩ ≡_U ⟨x_i⟩ because I ∈ U. If ⟨x_i⟩ ≡_U ⟨y_i⟩ then obviously ⟨y_i⟩ ≡_U ⟨x_i⟩. If {i ∈ I : x_i = y_i} = A where A ∈ U and {i ∈ I : y_i = z_i} = B where B ∈ U then {i ∈ I : x_i = z_i} ⊇ A ∩ B. ⊳
Let (×_{i∈I} M_i)/≡_U denote the set of equivalence classes. This is called the ultraproduct of the M_i, by the ultrafilter U.
Lemma 2. (×_{i∈I} M_i)/≡_U may be made into a structure M for the language by setting
- c^M = [⟨c^{M_i}⟩] for a constant c,
- f^M([⟨x_{1i}⟩], ..., [⟨x_{ki}⟩]) = [⟨f^{M_i}(x_{1i}, ..., x_{ki})⟩] for a function f, and
- P^M([⟨x_{1i}⟩], ..., [⟨x_{ki}⟩]) if and only if {i : P^{M_i}(x_{1i}, ..., x_{ki})} ∈ U.
Proof: Suppose for 1 ≤ j ≤ k that {i : x_{ji} = y_{ji}} = A_j where A_j ∈ U. Then for i ∈ ∩_j A_j, f^{M_i}(x_{1i}, ..., x_{ki}) = f^{M_i}(y_{1i}, ..., y_{ki}) and P^{M_i}(x_{1i}, ..., x_{ki}) if and only if P^{M_i}(y_{1i}, ..., y_{ki}). ⊳
If = is a symbol of the language, and is interpreted as equality in M_i, then the interpretation of = in M is equality, since if {i : x_i = y_i} ∈ U then [⟨x_i⟩] = [⟨y_i⟩].
Theorem 3. Suppose (×_{i∈I} M_i)/≡_U is the ultraproduct of the M_i, by the ultrafilter U. Suppose F is a formula, with a suitable variable list. Suppose [⟨x_{1i}⟩], ...
, [⟨x_{ki}⟩] are elements of M. Then ⊨_M F([⟨x_{1i}⟩], ..., [⟨x_{ki}⟩]) if and only if {i : ⊨_{M_i} F(x_{1i}, ..., x_{ki})} ∈ U.
Proof: First, for a term t with k variables in order, the value of t in M at [⟨x_{1i}⟩], ..., [⟨x_{ki}⟩] equals [⟨t̂_i⟩], where t̂_i is the value in M_i of t at x_{1i}, ..., x_{ki}; this follows by induction on t. Second, for an atomic formula F = P(t_1, ..., t_l), ⊨_M F if and only if {i : P^{M_i}(t̂_{1i}, ..., t̂_{li})} ∈ U, and P^{M_i}(t̂_{1i}, ..., t̂_{li}) if and only if ⊨_{M_i} F. Third, for F = ¬G, ⊨_M F if and only if not ⊨_M G, if and only if {i : ⊨_{M_i} G} ∉ U, if and only if {i : ⊨_{M_i} ¬G} ∈ U. Fourth, for F = G ∧ H, ⊨_M F if and only if ⊨_M G and ⊨_M H, if and only if {i : ⊨_{M_i} G} ∈ U and {i : ⊨_{M_i} H} ∈ U, if and only if {i : ⊨_{M_i} G and ⊨_{M_i} H} ∈ U, if and only if {i : ⊨_{M_i} G ∧ H} ∈ U. Fifth, suppose F = ∃vG. If ⊨_M F then ⊨_M G where a value [⟨w_i⟩] is assigned to v. Then {i : ⊨_{M_i} G} ∈ U, whence {i : ⊨_{M_i} F} ∈ U. If on the other hand A ∈ U where A = {i : ⊨_{M_i} F}, then for each i ∈ A, ⊨_{M_i} G where a value w_i is assigned to v. Then ⊨_M G where v is assigned [⟨w_i⟩] (w_i being arbitrary if i ∉ A), and so ⊨_M F. ⊳
The preceding theorem is called Łoś's theorem. It states that F holds in M if and only if F holds in M_i for “almost all” i, i.e., the set of such i is in the filter of measure 1 sets.
If M_i = M for all i the ultraproduct is called an ultrapower. This may be denoted M^I/≡_U. The ultrapower construction provides a construction of “elementary extensions” of M, as the following theorem shows.
A map j : M ↦ N between structures is said to be an elementary embedding if j is an injective homomorphism, and j[M] is an elementary substructure of N. Equivalently, for any formula F, and x_1, ..., x_k ∈ M, ⊨_M F(x_1, ..., x_k) if and only if ⊨_N F(j(x_1), ..., j(x_k)). The requirement that j be injective is redundant if equality is present, since then x = y if and only if j(x) = j(y).
Theorem 4.
Suppose M^I/≡_U is the ultrapower of I copies of M, by the ultrafilter U. Suppose F is a formula. Suppose j : M ↦ M^I/≡_U is the map where j(x) equals [C_x] where C_x is the sequence which is constantly x. Then j is an elementary embedding.
Proof: The induction in the proof of theorem 3 shows that for x_1, ..., x_k ∈ M, ⊨_{M^I/≡_U} F(j(x_1), ..., j(x_k)) if and only if {i : ⊨_{M_i} F(x_1, ..., x_k)} = I, if and only if ⊨_M F(x_1, ..., x_k). Since C_x ≡_U C_y only if x = y, j is injective. ⊳
36. Measurable cardinals.
Measurable cardinals were first defined by S. Ulam in 1930. In 1962 some facts about the size of a measurable cardinal were established. In 1961 D. Scott had already shown that the existence of a measurable cardinal had implications for statements of set theory which were not concerned with large cardinals. This was another milestone of set theory dating from the early 1960’s. Measurable cardinals are among the smallest large cardinals for which this is the case, and a considerable amount of research since has concerned implications which stronger types of large cardinals have.
A measurable cardinal is defined to be an uncountable cardinal κ, such that there is a κ-complete nonprincipal ultrafilter U ⊆ Pow(κ). The name arose from connections with measure theory; see [Jech2] for a discussion of this topic.
If U is a κ-complete nonprincipal ultrafilter on κ, and |A| < κ, then A ∉ U. This follows because {α} ∉ U for any α < κ, and if |A| < κ then A is the union of fewer than κ such sets.
Suppose κ is a measurable cardinal, and U ⊆ Pow(κ) is a κ-complete nonprincipal ultrafilter. Even though V is a proper class, the ultrapower V^κ/≡_U may be defined. It is a proper class. Rather than taking the entire equivalence class of a sequence ⟨x_α : α < κ⟩, only the elements of least rank are taken. The predicate [⟨x_i⟩] ∈̂ [⟨y_i⟩] holds if and only if {i : x_i ∈ y_i} ∈ U.
Recall from section 19 that a binary relation < which is a class has small extensions if {u : u < v} is a set for all v ∈ S. It will be said to be well-founded if there are no infinite descending chains.
Lemma 1. If κ is a cardinal and U ⊆ Pow(κ) is a countably complete ultrafilter then V^κ/≡_U has small extensions and is well-founded.
Proof: Suppose [⟨x_α⟩] ∈̂ [⟨y_α⟩]. Then A ∈ U where A = {α : x_α ∈ y_α}. Let x′_α equal x_α if α ∈ A, else ∅. Then [⟨x′_α⟩] ∈̂ [⟨y_α⟩] and ρ(⟨x′_α⟩) ≤ ρ(⟨y_α⟩). It follows that V^κ/≡_U has small extensions. Suppose that [s_i] for i ∈ ω were a sequence of elements with [s_{i+1}] ∈̂ [s_i], where s_i = ⟨x_{iα}⟩. Then A_i ∈ U where A_i = {α : x_{i+1,α} ∈ x_{iα}}, so ∩_i A_i is in U, and for any α ∈ ∩_i A_i the x_{iα} form an infinite descending ∈-sequence, a contradiction. ⊳
The elementary embedding of theorem 35.4 may be adapted to this case. For a formula F, the statement that F is true in a class may be cast as a statement in the language of set theory, using relativization to a class, as discussed preceding theorem 16.1. Theorem 35.4 may then be proved “formula by formula”, so that it is replaced by an infinite set of formulas, all of which are provable in ZFC. The terminology “elementary embedding” is in common use to describe this situation, in addition to its use for maps between structures which are sets. See the end of the section for further discussion.
Theorem 16.2 holds with D a class, with an arbitrary relation which has small extensions and is well-founded, and satisfies the axiom of extensionality; T and π are classes. The modifications to the proof given in section 16 are minimal. Recursion on the well-founded relation is used; this is a generalization of recursion on ∈. Theorem 6.15 of [Jech2] gives a complete treatment.
If κ is a measurable cardinal and U ⊆ Pow(κ) is a κ-complete nonprincipal ultrafilter, let Ult_{U0} denote V^κ/≡_U; and let j_{U0} : V ↦ Ult_{U0} be the canonical embedding of theorem 35.4.
Let Ult_U denote the transitive collapse of Ult_{U0}, and let j_U denote π ◦ j_{U0} where π is the collapsing isomorphism.
Theorem 2. j_U : V ↦ Ult_U is an elementary embedding.
Proof: j_{U0} is an elementary embedding and π is an isomorphism. ⊳
Some useful observations about an elementary embedding j are as follows.
- If y = f(x) is defined by a formula, then y = f(x) if and only if j(y) = f(j(x)); thus, j(f(x)) = f(j(x)).
- If j : V ↦ M where M is a transitive class then M is a model of ZFC, and by adapting an argument in section 17, the rank function on M is absolute.
The identity map is called the trivial embedding; any other embedding of structures is said to be nontrivial.
Lemma 3. Suppose j : V ↦ M is a nontrivial elementary embedding of V in a transitive class. Then there is a least ordinal α such that j(α) > α; j ↾ V_α is the identity map.
Proof: Since β < α if and only if j(β) < j(α), j ↾ Ord is increasing, so j(α) ≥ α. Inductively, if j(β) = β for β < α then j(x) = x for x ∈ V_{<α}. This is clear if α = 0 or α is a limit ordinal. If α = γ + 1 and x ∈ V_{<α} then x ⊆ V_γ, whence j(x) ⊆ j(V_γ) = V_{j(γ)} = V_γ. Thus, if w ∈ j(x) then j(w) = w, whence j(w) ∈ j(x), whence w ∈ x. Thus, j(x) ⊆ x. If w ∈ x then w = j(w) ∈ j(x); thus, x ⊆ j(x) also. If j(α) = α for all α then j is the identity map. It follows that j(α) > α for some α; since j(β) = β for β < α, j ↾ V_α is the identity map. ⊳
Theorem 4. Suppose j : V ↦ M is an elementary embedding of V in a transitive class, and j is not the identity map.
a. Let α be the least ordinal such that j(α) > α. Then α is a cardinal κ.
b. Let U = {X ⊆ κ : κ ∈ j(X)}. Then U is a κ-complete nonprincipal ultrafilter on κ, and κ is a measurable cardinal.
Proof: To begin with, let U = {X ⊆ α : α ∈ j(X)}. If X, Y ∈ U then α ∈ j(X) ∩ j(Y), and j(X) ∩ j(Y) = j(X ∩ Y), so X ∩ Y ∈ U. If X ∈ U and X ⊆ Y then j(X) ⊆ j(Y), so α ∈ j(Y), so Y ∈ U. Since j(α − X) = j(α) − j(X) and α ∈ j(α), either α ∈ j(X) or α ∈ j(α − X) but not both.
If β < α then j({β}) = {j(β)} = {β}, so {β} ∉ U. Thus, U ⊆ Pow(α) is a nonprincipal ultrafilter. Suppose X_ξ ∈ U for ξ < η where η < α. Then j(⟨X_ξ⟩) is a sequence of length j(η) whose j(ξ)-th element is j(X_ξ). Since j(η) = η and j(ξ) = ξ for ξ < η, j(⟨X_ξ⟩) = ⟨j(X_ξ)⟩. It follows that j(∩_ξ X_ξ) = ∩_ξ j(X_ξ), whence ∩_ξ X_ξ ∈ U. That is, U is “α-complete”. Let κ = |α|; then α is the union of κ singleton sets, so if κ were less than α then α ∉ U by α-completeness, a contradiction. Thus, α = κ. It has already been shown that U is a κ-complete nonprincipal ultrafilter on κ. To complete the proof that κ is measurable, it need only be shown that it is uncountable. For α ≤ ω, j(α) = α, since α is definable and j is elementary. Thus, κ > ω. ⊳

In the remainder of the section, κ_j and U_j will be used to denote κ and U of the theorem.

Recall the definitions of diagonal intersection and normal filter from section 31. A function f : S ↦ Ord where S is a set of ordinals is said to be regressive if f(α) < α for every nonzero α ∈ S. Say that a function f is constant on a subset S of its domain if and only if f(x_1) = f(x_2) for all x_1, x_2 ∈ S. In the case of the club filter, the forward direction of the following lemma is known as Fodor’s theorem.

Lemma 5. Suppose κ is a regular uncountable cardinal and F ⊆ Pow(κ) is a κ-complete filter. Let I = {X^c : X ∈ F} be the dual ideal. Then F is normal if and only if
(∗) for every regressive function f : X ↦ κ where X ∉ I, there is a subset Y ⊆ X such that Y ∉ I and f is constant on Y.

Proof: Suppose F is normal. Let X_ξ = {α ∈ X : f(α) = ξ}. Suppose X_ξ ∈ I for every ξ < κ. Let D = △_{ξ<κ} X_ξ^c. Since F is normal, D ∈ F; and since X ∉ I, D ∩ X ≠ ∅. But if α ∈ D ∩ X then α ∈ X_ξ^c for all ξ < α, whence f(α) ≠ ξ for all ξ < α, whence f(α) ≥ α, a contradiction. Suppose property (∗) holds, and X_ξ ∈ F for ξ < κ. Let D = △_{ξ<κ} X_ξ. If α ∈ D^c then α ∉ X_ξ for some ξ < α; let f(α) be any such ξ. Suppose D ∉ F.
By property (∗), there is a subset E ⊆ D^c and a ξ < κ, such that E ∉ I and f(α) = ξ for all α ∈ E. Since E ∉ I and X_ξ ∈ F, X_ξ ∩ E is nonempty. But if α ∈ E ∩ X_ξ then f(α) = ξ, so α ∉ X_ξ, a contradiction. Thus, D must be in F. ⊳

Define the diagonal sequence in an ultrapower V^κ/≡_U to be [⟨x_ξ⟩], where x_ξ = ξ.

Lemma 6. Suppose j : V ↦ M is an elementary embedding. Suppose U ⊆ Pow(κ) is a κ-complete nonprincipal ultrafilter.
a. U_j is normal.
b. κ_{j_U} = κ.
c. Suppose U is normal. Then κ is the image under the collapsing isomorphism of the diagonal sequence.

Proof: For part a, suppose κ is the least ordinal moved by j, X ⊆ κ, X^c ∉ U_j, and f : X ↦ κ is a regressive function. Then X ∈ U_j since U_j is an ultrafilter, j(f) : j(X) ↦ j(κ), j(f) is a regressive function, and κ ∈ j(X) by definition of U_j. Let γ = j(f)(κ); then γ < κ. Let Y = {α : f(α) = γ}. Then κ ∈ j(Y), so Y ∈ U_j; and f is constant on Y.

For part b, suppose α < κ and β < j_U(α) in Ult_U. Then [⟨x_ξ⟩] ∈̂ [⟨α⟩], where [⟨x_ξ⟩] is the preimage of β under the collapsing isomorphism. It follows that ∪_{γ<α} {ξ : x_ξ = γ} ∈ U, and it follows by κ-completeness that β = j_U(γ) for some γ < α. It then follows by induction that j_U(α) = α for α < κ. In V^κ/≡_U let d be the diagonal sequence. It is easily seen that [⟨α⟩] ∈̂ d for α < κ and d ∈̂ [⟨κ⟩], and so j_U(κ) > κ. This completes the proof of part b.

For part c, let d be the diagonal sequence. If [⟨x_ξ⟩] ∈̂ d then ξ ↦ x_ξ is a regressive function, and by normality ⟨x_ξ⟩ ≡_U ⟨γ⟩ for some γ < κ. ⊳

By parts a and b, if κ is a measurable cardinal then there is a normal κ-complete nonprincipal ultrafilter on κ.

Theorem 7. Suppose U ⊆ Pow(κ) is a normal κ-complete nonprincipal ultrafilter. Suppose F is a second order formula with free variables among W⃗, and ⊨_{V_κ} F_{W⃗}(X⃗). Let R_F = {α < κ : ⊨_{V_α} F_{W⃗}(X_1 ∩ V_α, . . . , X_k ∩ V_α)}. Then R_F ∈ U.

Remarks on proof: Some preliminary facts are required. First, if κ is measurable then κ is regular.
Indeed, suppose κ were singular, say κ = ∪_{ξ<λ} S_ξ for some λ < κ where |S_ξ| < κ for all ξ < λ. Then S_ξ ∉ U for all ξ < λ, whence κ ∉ U, a contradiction.

Second, if κ is measurable then κ is inaccessible. Indeed, suppose S is a set of functions f : λ ↦ {0, 1} where λ < κ and |S| = κ, and regard U as an ultrafilter on S by means of a bijection between S and κ. Let D_{ξ,i} = {f ∈ S : f(ξ) = i} for ξ < λ and i = 0, 1. Let D_ξ = D_{ξ,0} if D_{ξ,0} ∈ U, else D_{ξ,1}. Let D = ∩_{ξ<λ} D_ξ. Then D ∈ U; but |D| ≤ 1, a contradiction.

Third, suppose j : V ↦ M and κ = κ_j. Then j[V_κ] ⊆ M, and by lemma 3 j[V_κ] = V_κ. It follows that V_κ^M = {x ∈ M : ρ(x) < κ} = V_κ. Also, if X ⊆ V_κ then j(X) ∩ V_κ = X, and so X ∈ M, and so V_{κ+1}^M = V_{κ+1}.

Fourth, under the hypotheses of the theorem, a subset X ⊆ V_κ is represented by [⟨X ∩ V_ξ⟩]. Indeed, if x ∈ X then x ∈ V_ξ for some ξ < κ, so [⟨x⟩] ∈̂ [⟨X ∩ V_ξ⟩]. On the other hand suppose [⟨x_ξ⟩] ∈̂ [⟨X ∩ V_ξ⟩], and let A = {ξ : x_ξ ∈ X ∩ V_ξ}. Then A ∈ U, and ρ(x_ξ) < ξ for ξ ∈ A, whence since U is normal there is a B ⊆ A and a γ < κ such that B ∈ U and ρ(x_ξ) = γ for ξ ∈ B. Thus, x_ξ ∈ V_{γ+1} for ξ ∈ B. Since κ is inaccessible |V_{γ+1}| < κ, so by κ-completeness there is a w ∈ V_{γ+1} and a C ⊆ B such that C ∈ U and x_ξ = w for ξ ∈ C.

Let N denote Ult_U. Let F′ be F, with first (resp. second) order variables limited to V_κ (resp. V_{κ+1}). Let G state that κ is a limit ordinal and F′ holds in V_κ. Using the third fact above, ⊨_N G(κ, X⃗). Using the fourth fact, ⊨_{V^κ/≡_U} G([⟨ξ⟩], [⟨X_1 ∩ V_ξ⟩], . . . , [⟨X_k ∩ V_ξ⟩]). Using Łoś’s theorem, {ξ < κ : G(ξ, X_1 ∩ V_ξ, . . . , X_k ∩ V_ξ)} is in U, and this is a subset of R_F. See theorem 9.3.1 of [Drake] or lemma 17.15 of [Jech2] for further details. ⊳

As a corollary, if κ is measurable then κ is Π^1_n-indescribable for any n. By extending the argument it may be shown that κ is Π^2_1-indescribable; see theorem 9.3.1 of [Drake]. It is easily seen that club subsets of κ are Π^1_0-enforceable.
Thus as another corollary, club subsets of κ are in a normal ultrafilter U, whence sets in U are stationary.

It is a theorem of D. Scott that if V = L then there are no measurable cardinals. A stronger result will be proved in section 38. A direct proof may be found in theorem 17.1 of [Jech2].

The following theorem gives a relationship between j and j_{U_j}; it will be useful in section 44.

Theorem 8. Given an elementary embedding j : V ↦ M, there is an elementary embedding k : Ult_{U_j} ↦ M, such that j = k ◦ j_{U_j}.

Proof: Write κ for κ_j and U for U_j. A map k′ on V^κ/≡_U will be defined; k = k′ ◦ π^{−1} where π is the collapsing isomorphism. Let k′([⟨x_ξ⟩]) = j(⟨x_ξ⟩)(κ). Let X = {ξ : x_ξ = y_ξ}; if X ∈ U then κ ∈ j(X), and j(⟨x_ξ⟩)(κ) = j(⟨y_ξ⟩)(κ) follows. Thus, k′ is a well-defined function. Suppose ⊨_{V^κ/≡_U} F([⟨x_{1ξ}⟩], . . . , [⟨x_{kξ}⟩]). Then there is a D ∈ U such that F(x_{1ξ}, . . . , x_{kξ}) is true for ξ ∈ D. It follows that κ ∈ j(D), whence ⊨_M F(j(⟨x_{1ξ}⟩)(κ), . . . , j(⟨x_{kξ}⟩)(κ)), whence ⊨_M F(k′([⟨x_{1ξ}⟩]), . . . , k′([⟨x_{kξ}⟩])), which shows that k′ is elementary. Finally, k(j_U(x)) = k′([⟨x⟩]) = j(⟨x⟩)(κ) = j(x). ⊳

Some further comments on elementary embeddings which are proper classes will be made. If j, M, and N are proper classes, it is safe to say that j is an elementary embedding if for every formula φ_{x⃗} in the language of set theory, the formula E_φ is true (which fact would be demonstrated by proving it in ZFC), where E_φ is
y_1 = j(x_1) ∧ · · · ∧ y_k = j(x_k) ∧ φ^M ⇒ φ^N_{y_1/x_1,...,y_k/x_k}
and y_1, . . . , y_k are new free variables. Section 6.2 of [Drake] uses this method.

The above method does not, however, yield a single statement of ZFC which states that j is an elementary embedding. Such a statement can be given, making use of the observation that truth for Σ_1 formulas is definable. A discussion may be found in section 5 of [Kanamori3]. For any n there is a single statement E_n which states that j is a Σ_n-elementary embedding.
If E_1 holds then E_n holds for any n, and E_φ holds for any formula φ.

There are types of cardinals “in between” Π^1_n-indescribable cardinals and measurable cardinals. Recent interest has been in cardinals larger than measurable; smaller cardinals will generally not be covered here. Discussions may be found in [Drake] and [Jech2]. The notation [X]^{<ω} will be used to denote the set of finite subsets of X. Types considered include the following.
1. Totally indescribable cardinals
2. ν-indescribable cardinals
3. Cardinals for which κ → (ω)^{<ω}_2
4. Cardinals for which κ → (ℵ_1)^{<ω}_2
5. Jónsson cardinals
6. Rowbottom cardinals
7. Ramsey cardinals, cardinals for which κ → (κ)^{<ω}_2

In some cases, a cardinal of a higher numbered type is a cardinal of a lower numbered type, but not always. Types 1-3 are consistent with V = L, and higher numbered types are not. Types 3 and 4 are called Erdős, or partition, cardinals.

Subtle and ineffable cardinals are related to indescribable cardinals. They are consistent with V = L. They have found various uses; see for example [Friedman1]. Ineffable cardinals were mentioned in section 24.

Because the fact will be used later (section 42), it will be shown that measurable cardinals are Ramsey.

Theorem 9. If κ is a measurable cardinal then κ → (κ)^{<ω}_2.

Proof: This is theorem 10.22 of [Jech2]. It suffices to show that if U ⊆ Pow(κ) is a κ-complete normal ultrafilter then for any partition of [κ]^n into finitely many parts there is a homogeneous subset H ⊆ κ with H ∈ U. The proof is by induction on n. For the basis n = 1, at least one of the parts must be in U, because the dual ideal is closed under finite union. Suppose P_i for 1 ≤ i ≤ t are the parts of a partition of [κ]^{n+1}. Suppose α < κ. Let P_i^α be the sets in P_i with least element α, and let Q_i^α be these sets with α removed. Let T = κ − (α + 1), and let U′ = {S ∩ T : S ∈ U}; it is readily seen that U′ ⊆ Pow(T) is a κ-complete normal ultrafilter, and U′ ⊆ U.
It follows using the induction hypothesis that there is a J_α ∈ U such that [J_α]^n ⊆ Q_i^α for some i, which will be denoted i_α. Let R_i = {α : i_α = i}; then there is a j such that R_j ∈ U. Let H = R_j ∩ △_α J_α. If x⃗ is a set in [H]^{n+1} let α = x_1 and y⃗ = x_2, . . . , x_{n+1}. Then x⃗ ⊆ R_j, so i_α = j; and y⃗ ⊆ J_α. Thus [J_α]^n ⊆ Q_j^α, so y⃗ ∈ Q_j^α, so x⃗ is in P_j; this shows that H is homogeneous. ⊳

37. Indiscernibles.

Like ultrapowers, indiscernibles are a construction from mathematical logic with uses in set theory. Suppose M is a structure for some first order language. Suppose I ⊆ M and < is a linear order on I. I is said to be a set of indiscernibles for M if, for any formula F and suitable list of variables, and sequences x_1 < · · · < x_n and y_1 < · · · < y_n of elements of I, ⊨_M F(x⃗) if and only if ⊨_M F(y⃗).

If I is a set of indiscernibles for M, the set of formulas {F : ⊨_M F(x⃗) for x_1 < · · · < x_k ∈ I} may be considered. This set will be called the EM-set, where EM is an abbreviation for Ehrenfeucht-Mostowski; other names include EM-formula, theory of indiscernibles, and character. Note that each F must be given with a suitable list of variables; this can be formally defined in a straightforward manner, say as a (k+1)-tuple ⟨F, v⃗⟩. An EM-set is defined to be a set of formulas which is the EM-set for some structure M and set of indiscernibles I ⊆ M.

A standard theorem states that if T is a theory with infinite models, and I is a set with a linear order <, then there is a model M of T which has (an isomorphic copy of) I as a set of indiscernibles (theorem 3.3.10 of [ChaKei]). Since this is not needed, a proof will be omitted. The following, called the stretching lemma, is another standard fact.

Theorem 1. Suppose I is an infinite set of indiscernibles for a structure M, and Σ is the EM-set. Suppose J is an infinite linearly ordered set. Then there is a structure containing J as a set of indiscernibles and having Σ as its EM-set.
Remarks on proof: This is theorem 3.3.11.b of [ChaKei]. Introduce additional constants c_j for j ∈ J, and consider the theory of M together with the formulas F_{v⃗}(c_{j_1}, . . . , c_{j_k}) where F_{v⃗} is in Σ and j_1 < · · · < j_k. Since I is infinite, any finite subset of the enlarged theory T has a model, so T is consistent, so T has a model. The interpretations of the c_j are a copy of J, and may be linearly ordered according to the indexes j. It is readily seen that J is a set of indiscernibles and Σ is the EM-set. ⊳

Indiscernibles have further properties of interest for structures with an additional property. A structure M is said to have “definable Skolem functions” if for every formula F there is a formula F^s such that
1. ⊨_M ∃!y F^s; and
2. ⊨_M F^s ⇒ F.

A standard theorem states that any structure can be “expanded” to one with definable Skolem functions (proposition 3.3.4 of [ChaKei]). Various particular structures already have them, in particular some structures of interest in set theory, as will be seen in the next section.

Recalling the definition of the theory of a structure from section 7, it is clear that if M_1 and M_2 have the same theory, and M_1 has definable Skolem functions, then M_2 has definable Skolem functions, indeed defined by the same formulas.

Recalling the proof of lemma 2 of section 20 and following comments, given a structure M with definable Skolem functions, and X ⊆ M, if M′ is the Skolem hull of X obtained using the definable Skolem functions, then M′ ≺ M. M′ will be called the definable hull of X.

Theorem 2. Suppose the following hold.
1. For j = 1, 2, M_j is a structure, and I_j ⊆ M_j with order <_j is a set of indiscernibles.
2. Σ is the EM-set in both cases.
3. M_1 (hence M_2) has definable Skolem functions.
4. f : I_1 ↦ I_2 is strictly order-preserving.
5. For j = 1, 2, H_j is the definable hull of I_j in M_j.
Then there is a unique elementary embedding f̄ : H_1 ↦ H_2, whose restriction to I_1 is f. Further, f̄[H_1] is the definable hull of f[I_1].
Remarks on proof: This is theorem 3.3.11.d of [ChaKei]. Each element of H_1 is given by a term, involving applying Skolem functions to elements of I_1. Such a term determines an element of H_2, by replacing an element x ∈ I_1 by f(x), and using the “same” Skolem functions, i.e., those defined in M_2 by the same formulas. If t_1 and t_2 determine the same element of H_1, there is a formula stating that t_1 = t_2, which holds in M_1 at certain x ∈ I_1. By hypotheses 2 and 4, replacing each x by f(x), the formula holds in M_2. Thus, f̄(w) may be defined to be the value determined in M_2 by t, where t is any term determining w in M_1. Given a formula F, and values x⃗ in H_1, there is a formula G, and values y⃗ in I_1, such that ⊨_{H_1} F(x⃗) if and only if ⊨_{H_1} G(y⃗). Let y⃗* denote the vector where y*_i = f(y_i); then ⊨_{H_1} G(y⃗) if and only if ⊨_{H_2} G(y⃗*). Let x⃗* denote the vector where x*_i = f̄(x_i); then ⊨_{H_2} G(y⃗*) if and only if ⊨_{H_2} F(x⃗*). This shows that f̄ is an elementary embedding. Let H′_2 denote the definable hull of f[I_1]. It is readily seen from the definition of f̄(w) that f̄[H_1] ⊆ H′_2. If w′ ∈ H′_2 then there is a term t determining it in M_2, involving elements x⃗* where x*_i = f(x_i) with x_i ∈ I_1. Let w be the element of H_1 determined in M_1 by replacing x*_i by x_i in t; then f̄(w) = w′, so H′_2 ⊆ f̄[H_1]. ⊳

38. 0#.

0# may be defined as an EM-set with certain properties. It will be seen that if V = L then there is no such set. In particular, unless ZFC is inconsistent, it is not provable in ZFC that 0# exists. Thus, the principle “0# exists”, discovered by Solovay and Silver in the late 1960s, extends ZFC in a manner contrary to V = L.

It will be seen that “0# exists” follows from “there exists a measurable cardinal”. As mentioned in section 36, that the existence of a measurable cardinal implies V ≠ L was first proved directly, by Scott in 1961. Various other principles imply “0# exists”, and hence that V ≠ L.
Since its introduction, the notion of 0# has found many uses.

In this section, by “EM-set” will be meant the EM-set for a structure L_γ in the language of set theory, where γ ∈ LimOrd, and an infinite set I ⊆ γ of indiscernibles, with the natural order. The structure L_γ has definable Skolem functions; indeed F^s is the formula, “y is the <_L-least y such that F if there is any such y, else ∅”.

Theorem 1. Suppose Σ is an EM-set. For each infinite ordinal α there is a structure M_{Σ,α}, unique up to isomorphism, and a set of indiscernibles I ⊆ Ord^M, such that I has order type α, Σ is the EM-set, and M_{Σ,α} is the definable hull of I.

Remarks on proof: This is theorem 18.7 of [Jech2]. Existence follows by taking the definable hull of I in the structure given by theorem 37.1. The proof of theorem 37.1 needs to be modified, by adding the formulas “c_j is an ordinal”, and c_{j_1} ∈ c_{j_2} for j_1 < j_2, to T. Uniqueness follows by theorem 37.2. ⊳

There are three important properties an EM-set might have. An EM-set may be 1. well-founded, 2. unbounded, or 3. remarkable. These will be discussed in turn. The statement “0# exists” is the statement “there is an EM-set with properties 1-3”. Theorem 7 below gives an equivalent condition.

By a Skolem term will be meant a term involving Skolem functions. Say that a structure M_{Σ,α} is well-founded if the relation ∈̂ is well-founded.

Lemma 2. If M_{Σ,α} is well-founded then M_{Σ,α} is isomorphic to L_γ for some γ ∈ LimOrd. Further the set of indiscernibles for L_γ is uniquely determined.

Remarks on proof: Using remarks following lemma 36.1, M_{Σ,α} is isomorphic to a transitive set. Also, by the hypotheses on Σ it is elementarily equivalent to L_{γ_0} for some γ_0 ∈ LimOrd. By a version of the condensation lemma, lemma 20.6 (see section V.2 of [Devlin] for example), the transitive set is L_γ for some γ ∈ LimOrd. Let I and J be two sets of indiscernibles. There is a unique order isomorphism f from I to J.
This induces an isomorphism f̄ from L_γ to L_γ, which extends f. Since f̄ must be the identity, f is. ⊳

Say that an EM-set Σ is well-founded if every structure M_{Σ,α} is well-founded.

Lemma 3. An EM-set Σ is well-founded if and only if there is an uncountable ordinal α such that M_{Σ,α} is well-founded.

Remarks on proof: This is theorem 18.9 of [Jech2]. Suppose for some α_1 that M_{Σ,α_1} is not well-founded, and ⟨a_i⟩ is an infinite descending chain, i.e., a_{i+1} ∈̂ a_i for all i < ω. Let a_i be defined from indiscernibles by the Skolem term t_i. Let I_0 be the set of indiscernibles involved in any t_i. Then I_0 is countable; let α_2 be its order type. The definable hull of I_0 in M_{Σ,α_1} is not well-founded, since it contains the a_i. Also, it is M_{Σ,α_2}. However, M_{Σ,α_2} is the definable hull in M_{Σ,α} of the first α_2 indiscernibles, and hence, being a substructure of a well-founded structure, is well-founded. This is a contradiction to the hypothesis that α_1 exists. ⊳

For α ∈ LimOrd, say that M_{Σ,α} is unbounded if the indiscernibles are unbounded in Ord^M. Say that an EM-set Σ is unbounded if for every α ∈ LimOrd, M_{Σ,α} is unbounded. In the statement of the following lemma, and later in the section, if t is a Skolem term, the formula defining it will be denoted as t also.

Lemma 4. For an EM-set Σ, the following are equivalent.
a. Σ is unbounded.
b. For some α ∈ LimOrd, M_{Σ,α} is unbounded.
c. Suppose t_{w⃗} is a Skolem term. Then Σ contains the formula F_{w⃗,v}, “if t_{w⃗} is an ordinal then t_{w⃗} < v”.

Remarks on proof: This is theorem 18.10 of [Jech2]. a⇒b is trivial. For b⇒c, let x⃗ be any increasing finite sequence of indiscernibles. If, in M_{Σ,α}, t_{w⃗}(x⃗) is an ordinal, then there is some indiscernible y with x_n < y and t_{w⃗}(x⃗) < y. Thus, F_{w⃗,v}(x⃗, y) is true, so F_{w⃗,v} is in Σ. For c⇒a, let α be a limit ordinal, let I be the set of indiscernibles in M_{Σ,α}, and let a be an ordinal. Then a = t where t is a Skolem term involving the increasing sequence x⃗ of indiscernibles.
If y is any indiscernible greater than x_n, then by hypothesis F_{w⃗,v} is in Σ, whence F_{w⃗,v}(x⃗, y) is true, so a < y. ⊳

For α ∈ LimOrd, say that M = M_{Σ,α} is remarkable if it is unbounded, and whenever b is a limit point (in M) of the set of indiscernibles I, every c ∈̂ b is in the definable hull of I ∩ b (in M). Say that an EM-set Σ is remarkable if for every α ∈ LimOrd, M_{Σ,α} is remarkable. In the statement of the following lemma, and later in the section, if t_{v⃗} is a Skolem term with variables as indicated, t_{u⃗} denotes the term where v_i is replaced by u_i.

Lemma 5. For an unbounded EM-set Σ, the following are equivalent.
a. Σ is remarkable.
b. For some α ∈ LimOrd with α > ω, if b is the ωth indiscernible of M_{Σ,α}, every c ∈̂ b is in the definable hull of I ∩ b.
c. Suppose t_{w⃗,v⃗} is a Skolem term. Then Σ contains the formula F_{w⃗,v⃗,u⃗}, “if t_{w⃗,v⃗} is an ordinal and t_{w⃗,v⃗} < v_1 then t_{w⃗,v⃗} = t_{w⃗,u⃗}”.

Remarks on proof: This is theorem 18.11 of [Jech2]. a⇒b is trivial. For b⇒c, let x⃗, y⃗, z⃗ be an increasing finite sequence of indiscernibles, where x⃗ is the first k indiscernibles and y_1 is the ωth. Given a Skolem term t, let a = t_{w⃗,v⃗}(x⃗, y⃗). If a is an ordinal and a < y_1 then by hypothesis a = s where s is a term involving a finite set J of indiscernibles, each less than y_1. There is a formula G which, at the values x⃗, J, and y⃗ in increasing order, states that t = s, so is true. Since the values are indiscernibles, G is true when y_i is replaced by z_i. It follows that F_{w⃗,v⃗,u⃗}(x⃗, y⃗, z⃗) is true, whence F_{w⃗,v⃗,u⃗} is in Σ. For c⇒a, let α be a limit ordinal, let I be the set of indiscernibles in M_{Σ,α}, let b be a limit point of I, and suppose c ∈̂ b. Then for some Skolem term t = t_{w⃗,v⃗}, c = t(x⃗, y⃗) where y_1 = b. Choose p⃗ and q⃗ of the same length as y⃗ such that x⃗, p⃗, y⃗, q⃗ is an increasing sequence of indiscernibles. Since F_{w⃗,v⃗,u⃗} is in Σ, and c < y_1, t(x⃗, y⃗) = t(x⃗, q⃗).
By applying indiscernibility to the formula stating this equality, t(x⃗, p⃗) = t(x⃗, y⃗) = c, so c is in the definable hull of I ∩ b. ⊳

Lemma 6. Suppose Σ is an EM-set with properties 1-3. Suppose κ is an uncountable cardinal, and M_{Σ,κ} is isomorphic to L_γ. Then γ = κ.

Remarks on proof: This is corollary 18.13 of [Jech2]. Suppose I is the set of indiscernibles. Since I ⊆ γ and |I| = κ, γ ≥ κ. Suppose γ > κ. Since M_{Σ,κ} is unbounded and I has uncountable order type, there is a β ∈ Lim(I) so that κ < β. Since M_{Σ,κ} is remarkable, β is a subset of the definable hull H of I ∩ β. Let α be the order type of I ∩ β; then α < κ, so |H| < κ, which is a contradiction since κ ⊆ H. ⊳

Theorem 7. There is an EM-set with properties 1-3, if and only if there is a γ ∈ LimOrd such that L_γ has an uncountable set of indiscernibles.

Remarks on proof: This is corollary 18.18 of [Jech2]. If the EM-set exists, let κ be any uncountable cardinal; by lemma 6 L_κ has an uncountable set of indiscernibles.

Suppose I is an uncountable set of indiscernibles for L_γ; choose γ as small as possible. Let Σ be the EM-set, and let H be the definable hull of I in L_γ. Σ is well-founded because H is.

If H is not unbounded, there is a Skolem term t_{v⃗}(x⃗) whose value γ′ in H is greater than any element of I. It may be assumed that γ′ ∈ LimOrd; if not, subtract an integer from it. Let I′ = {x ∈ I : x > x_k} where x_k is the last (largest) element of x⃗. Since L_{γ′} is definable from γ′ in L_γ, for each formula F_{w⃗} there is a formula G_{v⃗,w⃗} such that for y_1, . . . , y_l ∈ I′, G_{v⃗,w⃗}(x⃗, y⃗) is true in L_γ if and only if F_{w⃗}(y⃗) is true in L_{γ′}. Since I is a set of indiscernibles for L_γ, I′ is a set of indiscernibles for L_{γ′}. Since γ′ < γ this is impossible, so H is unbounded, and so Σ is.

Since γ is as small as possible, H is isomorphic to L_γ. By applying the collapsing isomorphism to I, H = L_γ may be assumed. Choose I so that the ωth element i_ω of I is as small as possible. Suppose L_γ is not remarkable.
Then there is a Skolem term t = t_{v⃗,w⃗} such that if x⃗, y⃗, z⃗ is an increasing sequence of indiscernibles, then in L_γ, t(x⃗, y⃗) is an ordinal, t(x⃗, y⃗) < y_1, and t(x⃗, y⃗) ≠ t(x⃗, z⃗). If v⃗ has length k and w⃗ has length l, partition I into a vector x⃗ of length k, followed by successive vectors y⃗_ζ of length l. Let η_ζ = t(x⃗, y⃗_ζ). The η_ζ must form either a decreasing or increasing sequence, and the former is impossible. The set J = {η_ζ} is readily seen to be a set of indiscernibles for L_γ, since a formula involving elements of J can be transformed into one involving elements of I. Since i_ω = y_{ω,1}, η_ω = t(x⃗, y⃗_ω) < i_ω. If π is the collapsing isomorphism, π[J] is a set of indiscernibles for L_γ with a smaller ωth element than I, contradicting the choice of I. Thus, Σ is remarkable. ⊳

Next, some consequences of “0# exists” will be given. First, a lemma about remarkable EM-sets is given.

Lemma 8. Suppose Σ is a remarkable EM-set, M = M_{Σ,α} where α ∈ LimOrd, and I is the set of indiscernibles. Then in M, I is a closed and unbounded class of ordinals.

Remarks on proof: This is lemma 18.12 of [Jech2]. By definition I is unbounded. Suppose β < α is a limit ordinal and b is the βth element of I. It suffices to show that b is the limit of I ∩ b. Let H be the definable hull in M of I ∩ b; then H is M_{Σ,β}. Since M is remarkable, b is the limit of the ordinals in H. Also, H is unbounded, and the claim follows. ⊳

Theorem 9. Let Σ be an EM-set with properties 1-3. For each uncountable cardinal κ, let I_κ ⊆ κ be the set of indiscernibles for L_κ. Suppose λ < κ is an uncountable cardinal.
a. I_λ = I_κ ∩ λ.
b. L_λ is the definable hull of I_λ in L_κ.
c. L_λ ≺ L_κ.
d. λ ∈ I_κ.
e. Σ is uniquely determined.

Remarks on proof: Parts a and b are lemma 18.14 of [Jech2]. Let l be the λth element of I_κ, and let H be the definable hull of I_κ ∩ l. Then H equals M_{Σ,λ}, hence it is isomorphic to L_λ, hence the ordinals of H equal λ. Also, l is the limit of the ordinals of H, so l = λ.
Now, H is closed under the definable function α ↦ L_α, whence L_λ ⊆ H, whence L_λ = H since the collapsing isomorphism must be the identity. Part b is proved; part a follows because I_κ ∩ λ is a set of indiscernibles for H = L_λ. Part c follows because H ≺ M since H is a Skolem hull (see lemma 20.2). Part d has already been shown, since l = λ and l ∈ I_κ. For part e, by the foregoing each ℵ_n for 1 ≤ n < ω is in I_{ℵ_ω}. Thus, F_{v⃗} is in Σ if and only if ⊨_{L_{ℵ_ω}} F_{v⃗}(ℵ_1, . . . , ℵ_k). ⊳

The union of the I_κ is a class of ordinals, called the Silver indiscernibles.

Corollary 10. Suppose 0# exists, and κ is an uncountable cardinal. Then “L_κ ≺ L”, that is, for any formula F and x_1, . . . , x_k ∈ L_κ, F_{v⃗}(x⃗)^L ⇔ F_{v⃗}(x⃗)^{L_κ}.

Remarks on proof: By the reflection principle referred to in section 33, there is an uncountable cardinal λ > κ such that F_{v⃗}(x⃗)^L ⇔ F_{v⃗}(x⃗)^{L_λ}. The corollary follows by theorem 9.c. ⊳

Corollary 11. Suppose 0# exists. For any infinite cardinal κ, |V_κ ∩ L| = κ.

Proof: Let F be the formula “w ∈ x if and only if ρ(w) < κ”. F(x, κ) holds in L if and only if x = V_κ ∩ L. ∃x F(x, κ) holds in L, so it holds in L_{κ^+}. Thus, V_κ ∩ L ∈ L_{κ^+}, and the corollary follows. ⊳

In particular, there are only countably many constructible subsets of ω, and certainly V ≠ L.

There are various other basic facts concerning 0#, including the following.
- If 0# exists, and κ is an uncountable cardinal, then in L, κ → (ω)^{<ω}_2 (theorem V.2.15.ii of [Devlin]).
- There is a Π_1 formula F with free variable v, such that F_v(x) holds if and only if x is 0#. It follows that 0# is not constructible (theorem V.3.1 and corollary V.3.2 of [Devlin]).
- There is a Π^1_2 formula F of arithmetic with a free second order variable V, such that F_V(X) holds if and only if X is 0#, considered as a set of integers (lemma 25.30 of [Jech2]).
- 0# exists if and only if there is a nontrivial embedding j : L ↦ L (this will be shown in section 44).
It will be seen in section 42 that if a measurable cardinal exists then 0# exists (in fact stronger facts hold). Among other principles which imply “0# exists” are Chang’s conjecture (corollary 18.28 of [Jech2]), and the existence of a Jónsson cardinal (corollary 18.29 of [Jech2]). Another example will be seen in section 60.

39. Relative constructibility.

The constructible sets may be generalized, and the resulting construction has numerous applications. If L is a proper subclass of V then models of interest may be constructed by starting with some sets or classes of interest, and “closing up” under the constructibility process. Notation varies; a general construction along the lines of [TakZar2] will be given first, and some special cases with more or less standard notation defined in terms of it.

In this section the language of set theory may be expanded with unary predicate symbols P_1, . . . , P_k. These may occur in the formulas in the separation and replacement axiom schemes; the expanded axiom system will be denoted ZFC_{P_1,...,P_k}. This is done for convenience; if the P_i are interpreted as sets or classes defined by formulas, the added predicates may be removed.

The predicate Sat(D, f, a) of section 18 can be generalized, to consider formulas F in the language expanded with unary predicate symbols P_1, . . . , P_k, to the predicate Sat(D, A_1, . . . , A_k, f, a). This is true as in the case of Sat(D, f, a), with the added proviso that for 1 ≤ i ≤ k, A_i ⊆ D and P_i is interpreted as A_i.

The function DefBy(S, A⃗, f, a) generalizing the function DefBy(S, f, a) of section 19 is readily defined. Let Def_{P⃗}(S) be {T : ∃f ∃a (T = DefBy(S, P_1 ∩ S, . . . , P_k ∩ S, f, a))}. Theorem 19.1 holds, in ZFC_{P⃗}.

Given a class A, and a transitive class B, let L_0(A; B) = ∅, L_{α+1}(A; B) = Def_{A,B}(L_α(A; B)) ∪ (B ∩ V_{α+1}), and L_α(A; B) = ∪_{β<α} L_β(A; B) for limit ordinals α. Let L(A; B) = ∪_α L_α(A; B).
A and B are used as additional predicates in adding sets to L(A; B), and also the sets of B are added. Theorem 19.2 holds, in ZFC_{A,B}. Theorem 19.3 also holds; for the proof, it is only necessary to observe that part a still follows, because B is transitive.

Theorem 1. L(A; B) is the smallest class M having the following properties.
a. M is transitive.
b. Ord ⊆ M.
c. M is a model of ZF.
d. For any x ∈ M, x ∩ A ∈ M.
e. For any x ∈ M, x ∩ B ∈ M.
f. B ⊆ M.

Remarks on proof: That L(A; B) has properties a and b has already been observed. To prove property c, only minimal changes are needed to the proofs of lemma 19.4 and theorem 19.5. A similar claim is proved in theorem 7.25 of [TakZar2]. For part d, if x ∈ L(A; B) then for some α, x ∈ L_α(A; B). By part a, x ⊆ L_α(A; B). It follows that x ∩ A = {w ∈ L_α(A; B) : ⊨_{L_α(A;B)} w ∈ x ∧ A(w)}, from which it follows that x ∩ A ∈ L_{α+1}(A; B). Part e is similar. For part f, if x ∈ B has rank α then x will be added to L_{α+1}(A; B). Suppose M has properties a-f. Using absoluteness and the hypotheses, it follows readily by induction that L_α(A; B)^M = L_α(A; B), whence L(A; B) = L(A; B)^M ⊆ M. ⊳

Although usage varies, L[A] is commonly used to denote L(A; ∅) (the usual definition involves only a single predicate symbol in the extended language). The proof of theorem 19.6, with hardly any modification, shows that L[A] satisfies AC. For another useful fact, letting Ã = A ∩ L[A], it follows readily by induction that L_α[Ã] = L_α[A], whence L[Ã] = L[A]. If A is a set then Ã ∈ L[Ã] follows.

A version of the condensation lemma (lemma 20.6) holds for L[A]. It is necessary to add the hypothesis “TC(A ∩ L_α) ⊆ S”. The proof of lemma 20.6 may then be adapted, since π[A ∩ L_α] = A ∩ L_α, and the relativizing predicate does not change due to the collapsing, so that T = L_β[A]. (See exercise II.4.A of [Devlin].)

Using this condensation lemma it may be shown that in L[A] for a set A, for “sufficiently large” κ, 2^κ = κ^+.
One version is used in showing that if U is a normal κ-complete ultrafilter on a measurable cardinal κ then the GCH holds in L[U] (theorem 19.3 of [Jech2]).

For a set b, L(b) denotes L(∅; TC({b})). This may be defined by the recursion L_0(b) = TC({b}), L_{α+1}(b) = Def(L_α(b)), and L_α(b) = ∪_{β<α} L_β(b) for limit ordinals α. L(b) is the smallest class with properties a-c of theorem 1, which contains b as an element. If there is a well-ordering of b in L(b) then L(b) satisfies AC. The most commonly encountered example, however, is L(R), and as will be seen in section 53 it is independent of ZF whether R can be well-ordered.

For a transitive model M of ZFC which is a class, and a set x which is a subset of M, L(∅; M ∪ {x}) is commonly denoted M[x]. M[x] is the smallest class N with properties a-c of theorem 1, such that M ⊆ N and x ∈ N.

M[x] may be defined for M a set. L_α(A; B) is defined as usual for α ∈ M, and M[x] is the union over these ordinals. The proof of theorem 1.c may be modified to show that M[x] is a model of ZF.

The model M[x] is sometimes of use in forcing arguments. For example, if x is a Cohen real then M[x] and M[G] are the same, since x and G are definable from each other. This follows using theorem 21.13.d.

40. Direct limits.

In this section some facts from model theory will be covered, which are among the many facts of model theory that are useful in set theory. One use will be seen in the next section.

A poset P is said to be directed if for any x, y ∈ P there is a z ∈ P such that x ≤ z and y ≤ z. A chain is an example of a directed poset. For many applications of the constructions of this section a chain suffices, but the general construction is only slightly more involved.

Recall the definition of a homomorphism h : S ↦ T from a structure S for a first order language to a structure T. An injective homomorphism is called a monomorphism.
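The definition of a directed poset given above can be checked by brute force on small finite examples. The following Python sketch does so; the function name `is_directed` and the encoding of the order as a predicate `leq` are assumptions made here purely for illustration, and are not part of the text.

```python
from itertools import product

def is_directed(elements, leq):
    """Return True if every pair x, y has an upper bound z in `elements`.

    `elements` is a finite collection; `leq(a, b)` is assumed to be a
    partial order on it (this is not verified here).
    """
    elements = list(elements)
    return all(
        any(leq(x, z) and leq(y, z) for z in elements)
        for x, y in product(elements, repeat=2)
    )

# A finite chain 0 <= 1 <= 2 <= 3: max(x, y) is always an upper bound.
chain = range(4)
chain_leq = lambda a, b: a <= b

# Divisibility order: a <= b means a divides b.
divides = lambda a, b: b % a == 0
```

With these definitions, `is_directed(chain, chain_leq)` and `is_directed([1, 2, 3, 6], divides)` hold, since 6 bounds every pair in the second example, while `is_directed([1, 2, 3], divides)` fails because 2 and 3 have no common upper bound in the set.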
A direct system of structures is a family {D_i : i ∈ P} of structures, where P is a directed poset, together with homomorphisms h_{ij} : D_i ↦ D_j for i ≤ j, such that h_{ii} is the identity map, and h_{jk} ∘ h_{ij} = h_{ik} whenever i ≤ j ≤ k.

Suppose {D_i : i ∈ P} is a direct system of structures. Let U be the disjoint union of the D_i (that is, {⟨i, x⟩ : i ∈ P, x ∈ D_i}). Let ≡ be the relation on U, where ⟨i, x⟩ ≡ ⟨j, y⟩ if and only if for some k ≥ i, j, h_{ik}(x) = h_{jk}(y).

Lemma 1. With notation as above, ≡ is an equivalence relation on U.
Proof: It is immediate that ⟨i, x⟩ ≡ ⟨i, x⟩, and if ⟨i, x⟩ ≡ ⟨j, y⟩ then ⟨j, y⟩ ≡ ⟨i, x⟩. Suppose ⟨i_1, x_1⟩ ≡ ⟨i_2, x_2⟩ and ⟨i_2, x_2⟩ ≡ ⟨i_3, x_3⟩. There is a k_a with k_a ≥ i_1, i_2 and h_{i_1 k_a}(x_1) = h_{i_2 k_a}(x_2). Similarly there is a k_b with k_b ≥ i_2, i_3 and h_{i_2 k_b}(x_2) = h_{i_3 k_b}(x_3). There is a k with k ≥ k_a, k_b. Then h_{i_1 k}(x_1) = h_{k_a k}(h_{i_1 k_a}(x_1)) = h_{k_a k}(h_{i_2 k_a}(x_2)) = h_{i_2 k}(x_2) = h_{k_b k}(h_{i_2 k_b}(x_2)) = h_{k_b k}(h_{i_3 k_b}(x_3)) = h_{i_3 k}(x_3). ⊳

Let D_∞ be the set of equivalence classes of U under ≡.

Lemma 2. Suppose that [⟨i_t, x_t⟩] ∈ D_∞ for 1 ≤ t ≤ k. Suppose j_1 and j_2 both satisfy j_s ≥ i_t for 1 ≤ t ≤ k. Let x^s_t = h_{i_t j_s}(x_t). Then for any predicate symbol P, P(x^1_1, ..., x^1_k) holds in D_{j_1} if and only if P(x^2_1, ..., x^2_k) holds in D_{j_2}; and for any function symbol f, x^1_k = f(x^1_1, ..., x^1_{k−1}) holds in D_{j_1} if and only if x^2_k = f(x^2_1, ..., x^2_{k−1}) holds in D_{j_2}.
Proof: Choose j_3 ≥ j_1, j_2. Letting x^3_t = h_{i_t j_3}(x_t), for s = 1, 2, x^3_t = h_{j_s j_3}(x^s_t). Thus, P(x^s_1, ..., x^s_k) holds in D_{j_s} if and only if P(x^3_1, ..., x^3_k) holds in D_{j_3}. The argument for functions is similar. ⊳

Thus, a structure may be defined on D_∞, by letting P([⟨i_1, x_1⟩], ..., [⟨i_k, x_k⟩]) hold if and only if P(x′_1, ..., x′_k) holds in D_j for some (and hence any) j with j ≥ i_t for 1 ≤ t ≤ k, where x′_t = h_{i_t j}(x_t). Similarly [⟨i_k, x_k⟩] = f([⟨i_1, x_1⟩], ..., [⟨i_{k−1}, x_{k−1}⟩]) if and only if x′_k = f(x′_1, ..., x′_{k−1}) in D_j, where j ≥ i_t for 1 ≤ t ≤ k and x′_t = h_{i_t j}(x_t). The structure D_∞ is called the direct limit of the direct system S = {D_i}, and denoted dirlim(S). The “canonical homomorphism” h_{i∞} : D_i ↦ D_∞ is defined to be the map where for x ∈ D_i, h_{i∞}(x) = [⟨i, x⟩].

Lemma 3.
a. With notation as above, h_{i∞} is a homomorphism. Also, for i ≤ j, h_{j∞} ∘ h_{ij} = h_{i∞}.
b. Suppose D′ is a structure, and h′_i : D_i ↦ D′ are homomorphisms with h′_j ∘ h_{ij} = h′_i. Then there is a unique homomorphism h′_∞ : D_∞ ↦ D′ such that h′_∞ ∘ h_{i∞} = h′_i for all i ∈ P.
Proof: For part a, by definition, for a predicate P, and x_1, ..., x_k ∈ D_i, P(h_{i∞}(x_1), ..., h_{i∞}(x_k)) holds in D_∞ if and only if P(x_1, ..., x_k) holds in D_i. A similar claim holds for functions. Thus, h_{i∞} is a homomorphism. If i ≤ j then h_{i∞}(x) = [⟨i, x⟩] = [⟨j, h_{ij}(x)⟩] = h_{j∞}(h_{ij}(x)). For part b, h′_∞([⟨i, x⟩]) = h′_i(x) must hold, and straightforward calculation shows that this yields a well-defined homomorphism. ⊳

If E is another structure having the properties of D_∞ as stated in the lemma, then E is isomorphic to D_∞. Indeed, there are homomorphisms in both directions, and by the uniqueness of the map in part b, it may be seen that their compositions are the identity. Thus, the properties of the lemma characterize the direct limit. This is an example of a fact from category theory, the lemma stating the existence of the colimit of a direct system in the category of structures and homomorphisms; see [Dowd1].

Lemma 4. If h_{ij} is a monomorphism for all i ≤ j then h_{i∞} is a monomorphism for all i.
Proof: If h_{i∞}(x) = h_{i∞}(y) then ⟨i, x⟩ ≡ ⟨i, y⟩, so for some k ≥ i, h_{ik}(x) = h_{ik}(y). Since h_{ik} is a monomorphism, x = y. ⊳

An important special case occurs when D_i ⊆ D_j for i ≤ j, and h_{ij} : D_i ↦ D_j is the “inclusion map”, that is, h_{ij}(x) = x. In this case D_∞ may be taken as ∪_i D_i, and h_{i∞} as the inclusion map. Indeed, to determine the value of a predicate P(x_1, ..., x_k) in D_∞, choose j such that x_t ∈ D_j for 1 ≤ t ≤ k, and take the value in D_j; and similarly for functions. In particular, in the case of a chain, this produces the union of the chain as a structure.

Theorem 5. If h_{ij} is an elementary embedding for all i ≤ j then h_{i∞} is an elementary embedding for all i.
Proof: This is theorem 10.1 of [Sacks1], where it is attributed to Tarski and Vaught. See also lemma 12.2 of [Jech2]. It follows by induction on the formula F that for all i, if x_1, ..., x_k ∈ D_i then F(x_1, ..., x_k) holds in D_i if and only if F(x′_1, ..., x′_k) holds in D_∞, where x′_t = h_{i∞}(x_t) for 1 ≤ t ≤ k. For convenience let x⃗ denote x_1, ..., x_k, and similarly for x⃗′. If F is atomic the claim follows because h_{i∞} is a homomorphism. If F is ¬G then F(x⃗) is true in D_i if and only if G(x⃗) is false in D_i, if and only if (using the induction hypothesis) G(x⃗′) is false in D_∞, if and only if F(x⃗′) is true in D_∞. The claim follows similarly for the other propositional connectives. Suppose F is ∃vG. If F(x⃗) is true in D_i then G(w, x⃗) is true in D_i for some w ∈ D_i, so by induction G(h_{i∞}(w), x⃗′) is true in D_∞, so F(x⃗′) is true in D_∞. If F(x⃗′) is true in D_∞ then G(h_{j∞}(w), x⃗′) is true in D_∞ for some j ≥ i and w ∈ D_j, so by induction G(w, h_{ij}(x_1), ..., h_{ij}(x_k)) is true in D_j, so F(h_{ij}(x_1), ..., h_{ij}(x_k)) is true in D_j. Since h_{ij} is elementary, F(x_1, ..., x_k) is true in D_i. ⊳

As for elementary substructures (as noted in section 20), the notion of a Σ_n-elementary embedding may be defined, by requiring the defining property to hold only for formulas which are Σ_n (and it holds for Π_n formulas also). The cases Δ_0 and Σ_1 are often of particular interest. Theorem 5 holds when “elementary” is replaced by “Δ_0-elementary”, or “Σ_1-elementary”. The proof requires only minor adjustments.

41. L[U] and iterated ultrapowers.

Class models of ZFC have been of considerable interest in modern set theory.
An early example of such is L[U] where U is a κ-complete nonprincipal ultrafilter on a cardinal κ. Its study helped spur later developments. Chapter 19 of [Jech2] contains a thorough treatment of the basic theory. Here, some facts of interest will be stated; the reader is referred to [Jech2] for proofs and further facts. See also sections 8 and 9 of [KanMag].

Let Ũ = L[U] ∩ U. As noted in section 39, L[U] = L[Ũ].

Lemma 1. In L[U], Ũ is a κ-complete nonprincipal ultrafilter on κ. If U is normal then Ũ is normal.
Remarks on proof: This is lemma 19.1 of [Jech2]; see also theorem 6.5.1 of [Drake]. ⊳

By results of section 39, L[U] is a model of ZFC.

Lemma 2. L[U] is a model of GCH.
Remarks on proof: This is lemma 19.2 of [Jech2]; see also lemma 6.5.2 and theorem 6.5.3 of [Drake]. ⊳

Theorem 5 below shows that L[U] is “tied” to the measure U; the following lemma is an initial such fact. Also, if V has many measurable cardinals then L[U] is quite unlike V.

Lemma 3. In L[U], κ is the only measurable cardinal.
Remarks on proof: This is lemma 19.4 of [Jech2]. ⊳

Further properties of L[U] may be shown using iterated ultrapowers, which were first considered by Gaifman, and subsequently by Kunen. Suppose U is a κ-complete nonprincipal ultrafilter on a cardinal κ. For an ordinal α the α-th iterated ultrapower Ult_α may be defined by recursion, together with elementary embeddings i_{βα} : Ult_β ↦ Ult_α for β < α. The embeddings will satisfy the compatibility condition, that for γ < β < α, i_{βα} ∘ i_{γβ} = i_{γα}. The cardinal κ_α may be defined as i_{0α}(κ) and the κ-complete nonprincipal ultrafilter U_α on κ_α as i_{0α}(U). To start the recursion, Ult_0 = V, κ_0 = κ, and U_0 = U. Let Ult′_{α+1} be the ultrapower of Ult_α by U_α. If this is not well-founded, the iteration stops.
Otherwise, Ult_{α+1} is the transitive collapse of Ult′_{α+1}, and i_{α,α+1} is the composition of the transitive collapse map with the canonical embedding (the remaining i_{β,α+1} are determined by the compatibility requirement). If α is a limit ordinal, let Ult′_α be the direct limit of the system {Ult_β : β < α}, where the map from Ult_{β′} to Ult_β for β′ < β is i_{β′β}. If this is not well-founded, the iteration stops. Otherwise, Ult_α is the transitive collapse of Ult′_α, and i_{β,α} is the composition of the transitive collapse map with the direct limit map.

Theorem 4. Every iterated ultrapower Ult_α exists.
Remarks on proof: This is theorem 19.7 of [Jech2]. The proof makes use of the “factor lemma”, lemma 19.5, which states that when the β-th iterate is taken inside the model Ult_α, the result is Ult_{α+β}. ⊳

Theorem 5.
a. If U is a normal measure on κ in L[U] then U is the only such.
b. For every ordinal κ there is at most one U ⊆ Pow(κ) such that U is a normal measure on κ in L[U].
c. Suppose κ_1 < κ_2 are ordinals, and for i = 1, 2, U_i is a normal measure on κ_i in L[U_i]. Then there is an ordinal α such that L[U_2] = Ult_α in L[U_1], and U_2 = i_{0α}(U_1).
Remarks on proof: This is theorem 19.14 of [Jech2]. Lemma 19.13 (the representation lemma) gives a characterization of Ult_α as an ultrapower defined by an ultrafilter on a certain Boolean algebra. Using this, certain values of i_{0α} can be computed (lemma 19.15). In turn, Ult_λ in L[U] can be characterized for certain λ (lemma 19.17), and uniqueness can then be shown (lemma 19.18). Part c is shown by iterating in L[U_1] “at least to U_2”, and proving a lemma (lemma 19.19) which shows that the iteration in fact “hits U_2”. ⊳

42. The sharp operator.

By generalizing the definition of 0^#, the notion of x^# where x ⊆ ω may be defined. Thus, # may be considered an operator on subsets of ω. The sharp operator is occasionally useful in various discussions.
Various facts of section 38 continue to hold, merely expanding the language of set theory with a unary predicate symbol, and considering structures L_γ[x] in the definition of an EM set, etc. In particular lemma 38.2 holds with this modification, for limit ordinals γ > ω. This may be seen using the appropriate condensation lemma, for example lemma 1.7 of [Mitchell1], noting that x ∪ {x} ∈ L_γ[x]. The proofs of the remaining facts (with L_γ replaced by L_γ[x]), in particular theorem 38.9, require few if any changes. In fact, x^# can be defined for any set of ordinals; see [KanMag] and [Mitchell1] for some remarks.

Theorem 1. If there is a cardinal κ such that κ → (ℵ_1)^{<ω}_2 then a^# exists for all a ⊆ ω.
Remarks on proof: This is theorem 9.19 of [Kanamori3]. By the relativized version of theorem 38.7, it suffices to show that there is an uncountable set of indiscernibles for L_κ[a]. Number the formulas with free variables indicated, so that in F_{n,v⃗}, v⃗ has length at most n. Partition [κ]^{<ω} into two parts, where a set corresponding to the sequence ξ_1 < ··· < ξ_n is in one part if F_{n,v⃗}(ξ_1, ..., ξ_n) is true, else in the other part. A homogeneous subset is a set of indiscernibles. ⊳

In some topics, the hypothesis “a^# exists for all a ⊆ ω” is of interest. By theorem 1, theorem 36.9, and the fact that if κ → (κ)^{<ω}_2 then κ → (ℵ_1)^{<ω}_2, this hypothesis follows from the hypothesis that a measurable cardinal exists.

43. Cardinals larger than measurable.

In this section, let “j : V ↦_κ M” denote that “j : V ↦ M is an elementary embedding of V in the transitive class M, and κ is the least ordinal moved”. By results of section 36, if j is nontrivial then κ exists and is a measurable cardinal. The term “critical point” is often used for κ. A cardinal κ is said to be
- measurable if ∃j, M (j : V ↦_κ M);
- strong if ∀α ∃j, M (j : V ↦_κ M ∧ V_α ⊆ M);
- superstrong if ∃j, M (j : V ↦_κ M ∧ V_{j(κ)} ⊆ M); and
- supercompact if ∀α ∃j, M (j : V ↦_κ M ∧ M^α ⊆ M).
The first definition agrees with that given previously by results of section 36. All of these types of cardinals are measurable. As observed in the proof of theorem 36.7, for a measurable cardinal, V_{κ+1} ⊆ M. Suppose V_{κ+2} ⊆ M. Then the ultrafilter determined by j is in M, so κ is measurable in M, so there is a measurable cardinal below j(κ) in M, so there is a measurable cardinal below κ in V. The above list of types of large cardinals elaborates on this theme, by imposing various requirements on M, to the effect that it “more closely resemble” V, to obtain cardinals which are “larger than measurable”.

A cardinal κ is said to be Woodin if for all A ⊆ V_κ there is a cardinal λ < κ such that ∀α < κ ∃j, M (j : V ↦_λ M ∧ V_α ⊆ M ∧ α < j(λ) ∧ A ∩ V_α = j(A) ∩ V_α). As will be seen below, though Woodin cardinals need not be measurable, they lie between strong and superstrong in “consistency strength”.

The above definitions are not formalizable in the first order language of set theory, since they involve quantifying over the proper class j. As has been seen in section 36, for measurable cardinals a first order definition can be given (and it can then be shown that there is a definable embedding). However, using just a single ultrafilter U results in an ultrapower M = Ult_U where U ∉ M (lemma 17.9 of [Jech2]), and so V_{κ+2} ⊈ M. In the 1980's it was discovered that systems of ultrafilters can be used to characterize cardinals larger than measurable.

The ultrafilters in a system are indexed by elements of [λ]^{<ω} where λ is some ordinal with κ < λ. The ultrafilter E_a for a ∈ [λ]^{<ω} is an ultrafilter on [ζ]^{|a|} for some ordinal ζ ≥ κ. It is convenient to adopt the convention that, when a finite set of ordinals is written as {α_1, ..., α_n}, α_1 < ··· < α_n is understood. If a = {α_1, ..., α_n}, b = {β_1, ..., β_m}, and a ⊆ b, a map t : {1, ..., n} ↦ {1, ..., m} is induced, where α_i = β_{t(i)}. This in turn induces a map π : [ζ]^{|b|} ↦ [ζ]^{|a|}, where {ξ_1, ..., ξ_m} maps to {ξ_{t(1)}, ..., ξ_{t(n)}}. When necessary, π may be written as π_{ba}. It is easily checked that for a ⊆ b ⊆ c, π_{ca} = π_{ba} ∘ π_{cb}.

The following convenient notation will be introduced. Suppose a ⊆ b. Given X ⊆ [ζ]^{|a|} let X_{ab} = π_{ba}^{−1}[X] (X is transformed to a subset of [ζ]^{|b|}). Given a function f with domain [ζ]^{|a|} let f_{ab} be f ∘ π_{ba} (f is transformed to a function with domain [ζ]^{|b|}). For the following lemma, for a finite set of ordinals s let s_i denote the i-th element in increasing order (so that s = {s_1, ..., s_n}); and for ξ ∈ s let i(ξ, s) be that i such that ξ = s_i.

Lemma 1. Suppose j : V ↦ M is an elementary embedding with critical point κ; λ > κ is an ordinal; and ζ is the least ordinal with ζ ≥ κ and λ ≤ j(ζ). For a ∈ [λ]^{<ω} let E_a = {X ⊆ [ζ]^{|a|} : a ∈ j(X)}. The following hold.
a. ζ ≥ κ and [ζ]^{|a|} ∈ E_a.
b. E_a is a κ-complete ultrafilter on [ζ]^{|a|}.
c. For some a, E_a is not κ^+-complete.
d. If β < ζ then there is an a such that E_a is not κ^+-complete, and {s ∈ [ζ]^{|a|} : β ∈ s} ∈ E_a.
e. For a ⊆ b and X ⊆ [ζ]^{|a|}, X ∈ E_a if and only if X_{ab} ∈ E_b.
f. Suppose f : [ζ]^{|a|} ↦ V, and {s ∈ [ζ]^{|a|} : f(s) < max(s)} ∈ E_a. Then for some b with a ⊆ b, {s ∈ [ζ]^{|b|} : f_{ab}(s) ∈ s} ∈ E_b.
g. Suppose for each i ∈ ω, a_i ∈ [λ]^{<ω} and X_i ∈ E_{a_i}. Then there is a function d : ∪_i a_i ↦ ζ such that for all i ∈ ω, d[a_i] ∈ X_i.

Remarks on proof: This is stated following (20.39) of [Jech2]; it is also exercise 26.3.a of [Kanamori3]. Since λ ≤ j(ζ), a ∈ [λ]^{|a|} ⊆ [j(ζ)]^{|a|} = j([ζ]^{|a|}), proving part a. For part b, the proof of theorem 36.4 may be adapted, using part a. For part c, for α < κ let X_α = {{ξ} : ξ ≥ α}. Since j(α) = α, j(X_α) = {{ξ} : ξ ≥ α}, whence {κ} ∈ j(X_α), so X_α ∈ E_{{κ}}. Since {κ} ∈ [j(κ)]^1 = j([κ]^1), [κ]^1 ∈ E_{{κ}}. Letting Y_α = X_α ∩ [κ]^1, Y_α ∈ E_{{κ}}. But ∩_{α<κ} Y_α = ∅, so E_{{κ}} is not κ^+-complete. For part d, A = B ∪ C where A = {{ξ_1, ξ_2} : β ∈ {ξ_1, ξ_2}}, B = {{ξ_1, β}}, and C = {{β, ξ_2}}. First suppose β ≥ κ. Since β < ζ, j(β) < λ; let a = {κ, j(β)}.
Then a ∈ j(B), whence B ∈ E_a, so A ∈ E_a. For α < κ let X_α = {{ξ, β} : α ≤ ξ < κ}; then a ∈ j(X_α); this family shows that E_a is not κ^+-complete. Now suppose β < κ, whence j(β) = β; let a = {β, κ}. Then a ∈ j(C), whence C ∈ E_a, so A ∈ E_a. For α < κ let X_α = {{β, ξ} : α ≤ ξ < κ}; this family shows that E_a is not κ^+-complete. For part e, since b ∈ j([ζ]^{|b|}), it suffices to show that a ∈ j(X) if and only if b ∈ π_{ba}^{−1}[j(X)]. This is clear, since π_{ba}(b) = a. For part f, let X denote {s ∈ [ζ]^{|a|} : f(s) < max(s)}. Then ∀s ∈ X ∃β < max(s) (f(s) = β), whence in M the formula is true with X replaced by j(X) and f replaced by j(f). By hypothesis a ∈ j(X), whence there is a β such that β < max(a) and j(f)(a) = β. Writing b for a ∪ {β}, let P(s, a, β) be the predicate “β < max(a) and s ∈ [ζ]^{|b|} and f_{ab}(s) = s_{i(β,b)}”. Let Y denote {s ∈ [ζ]^{|b|} : f_{ab}(s) = s_{i(β,b)}}. Then ∀s ∀a ∀β (P ⇒ s ∈ Y), so replacing f by j(f) and Y by j(Y) it is true in M. Choosing β as above, Y ∈ E_b; part f follows. For part g, for each i, ∃s (s : n ↦ λ ∧ s[n] ∈ j(X_i)), whence ∃s (s : n ↦ j(ζ) ∧ s[n] ∈ j(X_i)), whence ∃s (s : n ↦ ζ ∧ s[n] ∈ X_i). Choosing s_i as the value for X_i, d can be constructed from the s_i. ⊳

Given a cardinal κ and an ordinal λ > κ, a system of ultrafilters satisfying the properties of lemma 1 is called a (κ, λ)-extender. It should be noted that [Jech2] only considers the case ζ = κ and λ ≤ j(κ). Extenders where λ > j(κ) and ζ > κ are called “long”; it will shortly be seen that they are useful. Also, [Kanamori3] considers j : N ↦ M with N not necessarily V; but this will not be needed here.

Given an extender, an ultraproduct may be taken. Let U_0 = {⟨a, f⟩ : a ∈ [λ]^{<ω}, f : [ζ]^{|a|} ↦ V}. Let ≡_0 be the binary relation on U_0, where ⟨a, f⟩ ≡_0 ⟨b, g⟩ if and only if {s ∈ [ζ]^{|c|} : f_{ac}(s) = g_{bc}(s)} ∈ E_c, where c = a ∪ b. Let ∈_0 be the binary relation, where ⟨a, f⟩ ∈_0 ⟨b, g⟩ if and only if {s ∈ [ζ]^{|c|} : f_{ac}(s) ∈ g_{bc}(s)} ∈ E_c, where c = a ∪ b.

Lemma 2.
≡_0 is a congruence relation on U_0, equipped with ∈_0.
Remarks on proof: This construction is mentioned preceding lemma 20.29 of [Jech2]. First, for a ⊆ b, if f(s) = g(s) for s ∈ X where X ∈ E_a then f_{ab}(s) = f(π_{ba}(s)) = g(π_{ba}(s)) = g_{ab}(s) for s ∈ X_{ab}, where X_{ab} ∈ E_b. That ≡_0 is reflexive follows from [ζ]^{|a|} ∈ E_a. That ≡_0 is symmetric is immediate. Given a, b, c let c_1 = a ∪ b, c_2 = b ∪ c, and c_3 = a ∪ b ∪ c. Suppose f_{ac_1}(s) = g_{bc_1}(s) for s ∈ X_{c_1} where X_{c_1} ∈ E_{c_1}, and g_{bc_2}(s) = h_{cc_2}(s) for s ∈ X_{c_2} where X_{c_2} ∈ E_{c_2}. Then f_{ac_3}(s) = g_{bc_3}(s) for s ∈ X_{c_1 c_3} and g_{bc_3}(s) = h_{cc_3}(s) for s ∈ X_{c_2 c_3}, whence f_{ac_3}(s) = h_{cc_3}(s) for s ∈ X_{c_1 c_3} ∩ X_{c_2 c_3}. This proves that ≡_0 is transitive. Given a, b, a′, b′ let c_1 = a ∪ b, c_2 = a ∪ a′, c_3 = b ∪ b′, and c_4 = a ∪ b ∪ a′ ∪ b′. Suppose f_{ac_1}(s) ∈ g_{bc_1}(s) for s ∈ X_{c_1} where X_{c_1} ∈ E_{c_1}, f_{ac_2}(s) = f′_{a′c_2}(s) for s ∈ X_{c_2} where X_{c_2} ∈ E_{c_2}, and g_{bc_3}(s) = g′_{b′c_3}(s) for s ∈ X_{c_3} where X_{c_3} ∈ E_{c_3}; arguing as above, f′_{a′c_4}(s) ∈ g′_{b′c_4}(s) for s ∈ Y where Y ∈ E_{c_4}, which proves that ≡_0 respects ∈_0. ⊳

Letting E denote the extender, let Ult^E_0 be the quotient of U_0 by ≡_0. This is a structure for the language of set theory. To simplify the notation, write [a, f] for [⟨a, f⟩].

Lemma 3 (Łoś theorem). Suppose φ is a Δ_0 formula. Letting N denote Ult^E_0, suppose [a_i, f_i] is an element of N for 1 ≤ i ≤ k. Let c = ∪_i a_i. Then
a. ⊨_N φ([a_1, f_1], ..., [a_k, f_k]) if and only if
b. X_φ ∈ E_c where X_φ = {s ∈ [ζ]^{|c|} : ⊨_V φ((f_1)_{a_1 c}(s), ..., (f_k)_{a_k c}(s))}.
Proof: The proof is by induction on the formation of φ. The claim holds for atomic formulas by definition. The claim for φ = ¬ψ follows from the claim for ψ, since X_φ = [ζ]^{|c|} − X_ψ. The claim for φ = ψ ∧ θ follows from the claim for ψ and θ, since X_φ = X_ψ ∩ X_θ. It might be necessary to add variables to the list for ψ or θ; that this is permissible follows by property (e) of a (κ, λ)-extender. Suppose φ(y, x⃗) is ∃z ∈ y ψ(z, x⃗). If ⊨_N φ([b, g], [a_1, f_1], ..., [a_k, f_k]) then for some c, h, ⊨_N [c, h] ∈ [b, g] ∧ ψ([c, h], [a_1, f_1], ..., [a_k, f_k]). Using the induction hypothesis, and letting c_1 = ∪_i a_i ∪ b and c_2 = c_1 ∪ c, {s : ⊨_V h_{cc_2}(s) ∈ g_{bc_2}(s) ∧ ψ(h_{cc_2}(s), (f_1)_{a_1 c_2}(s), ..., (f_k)_{a_k c_2}(s))} ∈ E_{c_2}, whence {s : ⊨_V φ(g_{bc_1}(s), (f_1)_{a_1 c_1}(s), ..., (f_k)_{a_k c_1}(s))} ∈ E_{c_1}. Conversely, suppose {s : ⊨_V φ(g_{bc_1}(s), (f_1)_{a_1 c_1}(s), ..., (f_k)_{a_k c_1}(s))} ∈ E_{c_1}; and let h : [ζ]^{|c_1|} ↦ ∪Ran(g) be a function where h(s) is some y ∈ ∪Ran(g) such that ⊨_V y ∈ g_{bc_1}(s) ∧ ψ(y, (f_1)_{a_1 c_1}(s), ..., (f_k)_{a_k c_1}(s)) if such a y exists, else ∅. Then {s : ⊨_V h(s) ∈ g_{bc_1}(s) ∧ ψ(h(s), (f_1)_{a_1 c_1}(s), ..., (f_k)_{a_k c_1}(s))} ∈ E_{c_1}. Using the induction hypothesis, ⊨_N [c_1, h] ∈ [b, g] ∧ ψ([c_1, h], [a_1, f_1], ..., [a_k, f_k]); and a follows. ⊳

Lemma 4. Ult^E_0 has small extensions and is well-founded.
Proof: Given [a, f], to consider its elements it suffices to consider [b, g] with b ∈ [λ]^{<ω} and g where either g(s) ∈ f(s) or g(s) = ∅; this shows that Ult^E_0 has small extensions. Suppose Ult^E_0 is not well-founded, and let [a_i, f_i] be such that [a_{i+1}, f_{i+1}] ∈ [a_i, f_i] for i ∈ ω. Let c_i = a_i ∪ a_{i+1} and X_i = {s ∈ [ζ]^{|c_i|} : (f_{i+1})_{a_{i+1} c_i}(s) ∈ (f_i)_{a_i c_i}(s)}. Then X_i ∈ E_{c_i}, so using property (g) of an extender (applied to the sets c_i), let d : ∪_i a_i ↦ ζ be such that d[c_i] ∈ X_i. Then (f_{i+1})_{a_{i+1} c_i}(d[c_i]) ∈ (f_i)_{a_i c_i}(d[c_i]), (f_{i+2})_{a_{i+2} c_{i+1}}(d[c_{i+1}]) ∈ (f_{i+1})_{a_{i+1} c_{i+1}}(d[c_{i+1}]), and π_{c_i a_{i+1}}(d[c_i]) = π_{c_{i+1} a_{i+1}}(d[c_{i+1}]), yielding an infinite descending chain of sets. ⊳

Since E_a is an ultrafilter, there is an ultrapower V^{[ζ]^{|a|}} / ≡_{E_a}. Let Ult^{E_a}_0 denote this, and j^{E_a}_0 : V ↦ Ult^{E_a}_0 the canonical embedding.

Lemma 5. The map f ↦ f_{ab} induces an elementary embedding j^0_{ab} : Ult^{E_a}_0 ↦ Ult^{E_b}_0; these maps together with the objects Ult^{E_a}_0 form a direct system. Ult^E_0 is the direct limit; the map j^0_a : Ult^{E_a}_0 ↦ Ult^E_0 is that induced by f ↦ ⟨a, f⟩.
Remarks on proof: This is a routine verification; an outline will be given. If f ≡_{E_a} g then f_{ab} ≡_{E_b} g_{ab}, whence j^0_{ab} is well-defined. Using Łoś theorem twice, it follows that ⊨_{Ult^{E_a}_0} φ([a, f_1], ..., [a, f_k]) if and only if ⊨_{Ult^{E_b}_0} φ([b, (f_1)_{ab}], ..., [b, (f_k)_{ab}]); that is, j^0_{ab} is elementary. Since j^0_{ac} = j^0_{bc} ∘ j^0_{ab}, the j^0_{ab} are the maps of a direct system. If f ≡_{E_a} g then ⟨a, f⟩ ≡_0 ⟨a, g⟩, whence j^0_a is well-defined. That it is elementary again follows using Łoś theorem twice. For a ⊆ b, [a, f] = [b, f_{ab}], and j^0_a = j^0_b ∘ j^0_{ab}. Suppose U′ is another structure with maps j′_a : Ult^{E_a}_0 ↦ U′ satisfying j′_a = j′_b ∘ j^0_{ab}. If there is a map j_∞ : Ult^E_0 ↦ U′ such that j_∞ ∘ j^0_a = j′_a for all a, then it must be the case that j_∞([a, f]) = j′_a([f]). By remarks following lemma 40.3, to complete the proof that Ult^E_0 is the direct limit, it suffices to show that this prescription yields a well-defined elementary embedding. But if [a, f] = [b, g] then, letting c = a ∪ b, [c, f_{ac}] = [c, g_{bc}], whence [f_{ac}] = [g_{bc}]; and so j′_a([f]) = j′_c([f_{ac}]) = j′_c([g_{bc}]) = j′_b([g]). Finally, let c = ∪_i a_i. Since [a_1, f_1] = [c, (f_1)_{a_1 c}] = j^0_c([(f_1)_{a_1 c}]) and j′_c([(f_1)_{a_1 c}]) = j′_{a_1}([f_1]) = j_∞([a_1, f_1]), ⊨_{Ult^E_0} φ([a_1, f_1], ...) if and only if ⊨_{Ult^E_0} φ(j^0_c([(f_1)_{a_1 c}]), ...) if and only if ⊨_{Ult^{E_c}_0} φ([(f_1)_{a_1 c}], ...) if and only if ⊨_{U′} φ(j′_c([(f_1)_{a_1 c}]), ...) if and only if ⊨_{U′} φ(j_∞([a_1, f_1]), ...). ⊳

The preceding lemma is not needed, but is included because many authors define Ult^E_0 as the direct limit. It follows that for each a the ultrapower Ult^{E_a}_0 is well-founded. It also follows that, letting Ult^{E_a} and Ult^E denote the transitive collapses, and j_{ab} and j_a denote the maps composed with the appropriate transitive collapse isomorphisms or their inverses, Ult^E is the direct limit of the Ult^{E_a}.

For x ∈ V let C_x denote the function {⟨∅, x⟩}. Let j^E_0 : V ↦ Ult^E_0(M) be the map given by j^E_0(x) = [∅, C_x]. Let U be the function on [ζ]^1 given by {⟨{ξ}, ξ⟩ : ξ < ζ}. For n ∈ ω let I_n denote the identity function on [ζ]^n.

Lemma 6. a.
j^E_0 is an elementary embedding.
b. For α < κ, the elements of [∅, C_α] are the elements [∅, C_β] for β < α (whence [∅, C_α] is the ordinal α in Ult^E_0).
c. For α < λ, the elements of [{α}, U] are the elements [{β}, U] for β < α (whence [{α}, U] is the ordinal α in Ult^E_0).
d. For a ∈ [λ]^{<ω}, the elements of [a, I_{|a|}] are the elements [{α}, U] for α ∈ a (whence [a, I_{|a|}] is the set a in Ult^E_0).
e. For all [a, f], [∅, C_f]([a, I_{|a|}]) = [a, f].
f. For all a ∈ [λ]^{<ω}, X ∈ E_a if and only if a ∈ j^E_0(X).
g. The critical point of j^E_0 equals κ.
h. ζ is the least ordinal with ζ ≥ κ and λ ≤ j^E_0(ζ).

Remarks on proof: These claims are proved in [Kanamori3], many of them in the proof of lemma 26.2. For part a, in the notation of lemma 3, X_φ(C_{x_1}, ..., C_{x_k}) equals {⟨∅, ..., ∅⟩} if ⊨_M φ(x_1, ..., x_k), else ∅. That j^E_0 is elementary follows using lemma 3, and since equality is present j^E_0 is injective. For part b, it is straightforward to verify that [∅, C_β] is an element of [∅, C_α]. Suppose [a, f] ∈ [∅, C_α]; then {s : f(s) ∈ α} ∈ E_a. Since E_a is a κ-complete ultrafilter, there is some β < α such that {s : f(s) = β} ∈ E_a, and [a, f] = [∅, C_β]. The last claim follows inductively. For part c, it is straightforward to verify that [{β}, U] is an element of [{α}, U]. Suppose [a, f] ∈ [{α}, U]. Let c_1 = a ∪ {α} and X = {s : f_{ac_1}(s) ∈ U_{{α}c_1}(s)}; then X ∈ E_{c_1}. But U_{{α}c_1}(s) is some member of s, so for s ∈ X, f_{ac_1}(s) < max(s). Using property (f) of a (κ, λ)-extender, for some c_2 ⊇ c_1, {s ∈ [ζ]^{|c_2|} : f_{ac_2}(s) ∈ s} ∈ E_{c_2}, whence for some i, {s ∈ [ζ]^{|c_2|} : f_{ac_2}(s) = s_i} ∈ E_{c_2}; that is, [a, f] = [{β}, U] where β is the i-th element of c_2. Since x ∈ y ⇒ ¬(y ∈ x ∨ y = x) holds in V, it holds in Ult^E_0. It follows that β < α must hold. For part d, it is straightforward to verify that [{α}, U] is an element of [a, I_{|a|}]. Suppose [b, f] ∈ [a, I_{|a|}]. Let c_1 = a ∪ b; then {s : f_{bc_1}(s) ∈ (I_{|a|})_{ac_1}(s)} ∈ E_{c_1}.
It follows that for some position i of an element of a in c_1, {s : f_{bc_1}(s) = s_i} ∈ E_{c_1}. It follows from this that for some α ∈ a, [b, f] = [{α}, U]. For part e, {s ∈ [ζ]^{|a|} : f(s) = ((C_f)_{∅a}(s))(I_{|a|}(s))} equals [ζ]^{|a|}. For part f, X ∈ E_a if and only if {s ∈ [ζ]^{|a|} : I_{|a|}(s) ∈ (C_X)_{∅a}(s)} ∈ E_a if and only if [a, I_{|a|}] ∈ [∅, C_X]. For part g, suppose κ̄ is the critical point of j^E_0. If ν < κ̄ and X_α ∈ E_a for α < ν then by part f, a ∈ j^E_0(X_α) for α < ν, whence a ∈ ∩_α j^E_0(X_α). Since ν < κ̄ it follows as usual that a ∈ j^E_0(∩_α X_α). This shows that E_a is κ̄-complete. By property (c) of an extender, κ < κ̄ cannot hold, whence by part b κ̄ = κ. For part h, ζ ≥ κ since otherwise E_a would be principal and hence κ^+-complete for all a. If α < λ then {{ξ} : ξ ∈ ζ} ∈ E_{{α}}, so [{α}, U] ∈ [∅, C_ζ]; it follows that λ ≤ j^E_0(ζ). Suppose κ ≤ ζ′ < ζ. By property (d) of an extender, [∅, C_{ζ′}] ∈ [a, I_{|a|}] for some a. It follows that j^E(ζ′) ∈ a ⊆ λ. ⊳

Let π denote the transitive collapse map for Ult^E_0, and let j^E = π ∘ j^E_0 (so that j^E : V ↦ Ult^E). By part b, j^E(α) = α for α < κ. By part c, π([{α}, U]) = α for α < λ. By part d, π([a, I_{|a|}]) = a. By part f, E_a = {X ⊆ [ζ]^{|a|} : a ∈ j^E(X)}. By part e, Ult^E = {j^E(f)(a) : f : [ζ]^{|a|} ↦ V, a ∈ [λ]^{<ω}}. j^E_0 may be replaced by j^E in parts g and h. Using lemma 5 it is easy to verify that j^E_0 = j^0_a ∘ j^{E_a}_0. Composing with the appropriate transitive collapses and their inverses, j^E = j_a ∘ j^{E_a}. See remarks preceding lemma 26.1 of [Kanamori3].

Lemma 7. Suppose j : V ↦_κ M, λ > κ, and E is the (κ, λ)-extender derived from j.
a. The map ⟨a, f⟩ ↦ j(f)(a) induces an elementary embedding k : Ult^E ↦ M.
b. k ∘ j^E = j.
c. If |V_γ| ≤ λ then V_γ ⊆ Ran(k), whence k ↾ V_{j^E(β)} is the identity.
d. k ↾ λ is the identity.
Remarks on proof: These claims are proved in the proof of 26.1 of [Kanamori3].
For part a, given a formula φ and ⟨a_i, f_i⟩ for 1 ≤ i ≤ k, let X = {s ∈ [ζ]^{|c|} : φ((f_1)_{a_1 c}(s), ···)} where c = ∪_i a_i. Then s ∈ X ⇔ φ((f_1)_{a_1 c}(s), ···), and it follows that s ∈ j(X) ⇔ φ(j(f_1)_{a_1 c}(s), ···). Letting φ be equality, c ∈ j(X) if and only if j(f)(a) = j(g)(b), showing k is well-defined. Letting φ be arbitrary, it follows that k is elementary. Part b follows because j(C_x)(∅) = j(x). For part c, the claim is clear for β < ω, so suppose β ≥ ω. It is easy to see that there is a bijection g_β : |V_β| ↦ V_β, such that for ω ≤ γ ≤ β, g_β[|V_γ|] = V_γ. For example to extend g_β to V_{β+1}, use a bijection from |V_{β+1}| to V_{β+1} − V_β. Let f be the function where f({ξ}) = g_β(ξ), for ξ < min(ζ, |V_β|). Then if [|V_γ|]^1 ⊆ Dom(f) then f[[|V_γ|]^1] = V_γ. Hence this is true in Ult^E for j(f). If |V_γ| ≤ λ then [|V_γ|]^1 ⊆ Dom(f). The first claim follows by part a, and the second claim follows because k is an elementary embedding. For part d, if α < λ then α is represented by [{α}, U], whence k(α) = j(U)({α}) = α. ⊳

Theorem 8. The following are equivalent, for a cardinal κ and ordinal α.
a. ∃j, M (j : V ↦_κ M ∧ V_{κ+α} ⊆ M).
b. For some λ there is a (κ, λ)-extender E such that V_{κ+α} ⊆ Ult^E.
Remarks on proof: For related results see exercise 26.7 of [Kanamori3]. Suppose a holds. Let λ = |V_{κ+α}| and let E be the (κ, λ)-extender derived from j. By lemma 7, V_{κ+α} ⊆ Ult^E. Suppose b holds. Letting j = j^E and M = Ult^E, it is immediate that a holds. ⊳

Thus, κ is strong if and only if b holds for all α. This latter fact is expressible in the language of set theory. A similar argument shows that κ is superstrong if and only if, for some λ, there is a (κ, λ)-extender E such that V_{j^E(κ)} ⊆ Ult^E, whence this also is expressible in the language of set theory.

Given a cardinal λ, an ordinal α, and a set A, say that λ is α-strong for A if ∃j, M (j : V ↦_λ M ∧ V_α ⊆ M ∧ α < j(λ) ∧ A ∩ V_α = j(A) ∩ V_α).

Theorem 9. For a cardinal κ, consider the following properties. a.
∀A ⊆ V_κ, {λ < κ : ∀α < κ (λ is α-strong for A)} is nonempty.
b. ∀A ⊆ V_κ, {λ < κ : ∀α < κ (λ is α-strong for A)} is stationary.
c. For any function f : κ ↦ κ there is a cardinal λ < κ with f[λ] ⊆ λ, and a j, M such that j : V ↦_λ M, and V_{j(f)(λ)} ⊆ M.
d. For any function f : κ ↦ κ there is a cardinal λ < κ with f[λ] ⊆ λ, and an extender E ∈ V_κ, such that j^E has critical point λ, V_{j^E(f)(λ)} ⊆ Ult^E, and j^E(f)(λ) = f(λ).
If property c holds then κ is Mahlo and there is a stationary set of measurable cardinals below κ. All 4 properties are equivalent.
Remarks on proof: See exercise 26.10 and lemma 26.14 of [Kanamori3], or lemma 34.2 of [Jech2]. ⊳

Property (a) is just the definition of a Woodin cardinal given earlier. Property (d) is expressible in the language of set theory. In fact, the property of being a Woodin cardinal is “Π^1_1-describable”; whence the smallest Woodin cardinal is not Π^1_1-indescribable, and hence not measurable.

An uncountable regular cardinal κ is said to be strongly compact if, for any set S, every κ-complete filter on S can be extended to a κ-complete ultrafilter on S.

If A is a set with |A| ≥ κ, and x ∈ [A]^{<κ}, let x̂ = {y ∈ [A]^{<κ} : x ⊆ y}. Let F be the filter on [A]^{<κ} generated by {x̂ : x ∈ [A]^{<κ}}.

Lemma 10. For an uncountable regular cardinal κ, F is κ-complete.
Proof: If x ⊆ y then ŷ ⊆ x̂. Given x̂_ξ for ξ < µ where µ < κ, let u = ∪_ξ x_ξ; then u ∈ [A]^{<κ} and û ⊆ ∩_ξ x̂_ξ. ⊳

A fine measure on [A]^{<κ} is defined to be a κ-complete ultrafilter which extends F.

Let L_{κ,ω} denote a language where conjunctions and disjunctions of µ subformulas are allowed, for µ < κ. Also, there are κ variables. A formal definition is straightforward and will be omitted.

Theorem 11. For an uncountable regular cardinal κ, the following are equivalent.
a. κ is strongly compact.
b. For any set A with |A| ≥ κ, there exists a fine measure on [A]^{<κ}.
c.
Suppose S is a set of sentences in the language L_{κ,ω}, and every subset T ⊆ S with |T| < κ has a model; then S has a model.
Remarks on proof: This is lemma 20.2 of [Jech2]. ⊳

Say that a fine measure U on [A]^{<κ} is normal if whenever f : [A]^{<κ} ↦ A is such that f(x) ∈ x for x ∈ S where S ∈ U, then f is constant on some S′ ⊆ S with S′ ∈ U.

Theorem 12. For an uncountable regular cardinal κ, the following are equivalent.
a. κ is supercompact.
b. For any set A with |A| ≥ κ, there exists a normal fine measure on [A]^{<κ}.
Remarks on proof: This follows by lemma 20.14 of [Jech2]. ⊳

Property (b) is expressible in the language of set theory. There is an embedding characterization of strongly compact cardinals, namely, a cardinal κ is strongly compact if and only if ∀α ∃j, M (j : V ↦_κ M ∧ ∀X ⊆ M (|X| ≤ α ⇒ ∃Y (Y ∈ M ∧ X ⊆ Y ∧ ⊨_M |Y| < j(κ)))). A proof may be found in theorem 22.17 of [Kanamori3].

Given types of large cardinals T_1 and T_2, some relationships which might hold include the following.
- T_2(κ) ⇒ T_1(κ)
- ∃κ T_2(κ) ⇒ ∃κ T_1(κ)
- If ∃κ T_2(κ) then in an inner model, ∃κ T_1(κ) holds
- Con(ZFC + ∃κ T_2(κ)) ⇒ Con(ZFC + ∃κ T_1(κ))
(That is, these statements are provable in ZFC.) Write these as T_2 ⇒ T_1, T_2 ⇒_e T_1, T_2 ⇒_i T_1, and T_2 ⇒_c T_1 respectively. All four implications are transitive. It is easily seen that if T_2 ⇒ T_1 then T_2 ⇒_e T_1; and if T_2 ⇒_e T_1 then T_2 ⇒_i T_1. It is also true that if T_2 ⇒_i T_1 then T_2 ⇒_c T_1. Indeed, for sentences φ, ψ, to show that Con(ZFC + φ) ⇒ Con(ZFC + ψ), it suffices to show (in ZFC) that φ implies that ψ has an inner model (see remarks preceding theorem II.4.1 of [Devlin]).

Theorem 13.
a. supercompact ⇒ superstrong
b. superstrong ⇒ Woodin
c. Woodin ⇒_i strong
d. supercompact ⇒ strong
e. Woodin ⇒_e measurable
f. supercompact ⇒ strongly compact
g. strongly compact ⇒_i Woodin
With the exception of Woodin cardinals, all the above types of cardinals are measurable.
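The weakening rules stated before theorem 13 (⇒ yields ⇒_e, which yields ⇒_i, which yields ⇒_c) together with transitivity allow further relationships to be read off from parts a-g mechanically. The following is a minimal sketch of that bookkeeping, not part of the original text: the names `LEVELS`, `THEOREM_13` and `closure` are hypothetical, the table simply transcribes parts a-g as stated, and composing two relationships is assumed to give the weaker of the two kinds, which is what weakening followed by transitivity provides.

```python
# Kinds of implication from the text, strongest first:
# index 0 is "=>", 1 is "=>e", 2 is "=>i", 3 is "=>c".
LEVELS = ["=>", "=>e", "=>i", "=>c"]

# Parts a-g of theorem 13, transcribed as (T2, T1, kind).
THEOREM_13 = [
    ("supercompact", "superstrong", 0),       # a
    ("superstrong", "Woodin", 0),             # b
    ("Woodin", "strong", 2),                  # c
    ("supercompact", "strong", 0),            # d
    ("Woodin", "measurable", 1),              # e
    ("supercompact", "strongly compact", 0),  # f
    ("strongly compact", "Woodin", 2),        # g
]


def closure(edges):
    """Return the strongest kind derivable for each pair of types.

    Two relationships are chained by weakening both to the weaker kind
    (the larger index) and then applying transitivity at that kind.
    """
    best = {}
    for t2, t1, kind in edges:
        best[(t2, t1)] = min(kind, best.get((t2, t1), kind))
    changed = True
    while changed:
        changed = False
        for (a, b), k1 in list(best.items()):
            for (c, d), k2 in list(best.items()):
                if b == c and a != d:
                    kind = max(k1, k2)
                    if kind < best.get((a, d), len(LEVELS)):
                        best[(a, d)] = kind
                        changed = True
    return best


derived = closure(THEOREM_13)
for (t2, t1), kind in sorted(derived.items()):
    print(f"{t2} {LEVELS[kind]} {t1}")
```

For instance, parts a and b chain to give supercompact ⇒ Woodin, while parts g and c only give strongly compact ⇒_i strong. The table records only what follows from parts a-g; relationships obtained by other arguments do not appear in it.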
Remarks on proof: Say that a cardinal κ is α-supercompact if ∃j, M (j : V 7→κ M ∧ M α ⊆ M ). If x ∈ M and M is 2|x| -supercompact then Pow(x) ∈ M , because Pow(x) can be constructed from a wellordering of x and a string of 0’s and 1’s of length |x| · 2|x| . Parts a and d follow. For part b, see proposition 26.12 of [Kanamori3]. Part b follows by exercise 34.3 of [Jech2]. Part c follows because each λ in the definition of a Woodin cardinal is strong in Vκ . Part e follows because each λ in the definition of a Woodin cardinal is measurable. Part f follows by theorems 11 and 12. Part g follows using sophisticated results in inner model theory; see [SchSt]. It has already been observed that supercompact, superstrong, and strong cardinals are measurable. That strongly compact cardinals are measurable follows by theorem 11.b. ⊳ It is not true that a superstrong cardinal is strong; see exercise 26.9 141 of [Kanamori3]. It is considered a major open question of set theory, whether strongly compact ⇒c supercompact. An “order of measurability”, called the Mitchell order, can be defined for a cardinal κ. For normal ultrafilters U1 , U2 ⊆ κ, say that U1 < U2 if U1 ∈ UltU2 . Theorem 14. < is transitive and well-founded. Remarks on proof: Suppose U2 < U1 < U0 . U1 is represented by a function Ũ1 , such that Ũ1 (α) is an ultrafilter in α, on a set I1 ∈ U0 . W.l.g., I1 can be assumed to be a set of cardinals, since the cardinals are a club subset of κ and U0 is normal. Since U0 is normal κ is represented by the diagonal function. A subset x ⊆ κ is represented by the function whose value at α is x∩α. Letting U1p (x) denote {λ ∈ I1 : x∩λ ∈ Ũ1 (λ)}, x ∈ U1 if and only if U1p (x) ∈ U0 . Similarly y ∈ U2 if and only if U2p (y) ∈ U1 . So y ∈ U2 if and only if U1p (U2p (y)) ∈ U0 . For t ⊆ λ let U2pλ (t) = {µ ∈ I2 ∩ λ : t ∩ µ ∈ Ũ2 (µ)}; then U2p (x) ∩ λ = U2pλ (x ∩ λ) Thus λ ∈ U1p (U2p (y)) if and only if λ ∈ I1 and U2p (y) ∩ λ ∈ Ũ1 (λ), if and only if λ ∈ I1 and U2pλ (y ∩ λ) ∈ Ũ1 (λ). 
Let W̃(λ) = {t ⊆ λ : U2pλ(t) ∈ Ũ1(λ)}. Then y ∈ U2 if and only if {λ ∈ I1 : y ∩ λ ∈ W̃(λ)} ∈ U0, so to show U2 < U0 it suffices to show that for λ ∈ I1, W̃(λ) is a λ-complete normal ultrafilter in λ. Let J denote U1p(I2), i.e., {λ ∈ I1 : I2 ∩ λ ∈ Ũ1(λ)}. Since I2 ∈ U1, J ∈ U0. For the following assume λ ∈ J, whence λ ∈ I1 and I2 ∩ λ ∈ Ũ1(λ) (although λ ∈ I1 suffices for some cases).
Upward closure: Suppose t ∈ W̃(λ) and t ⊆ s. Then U2pλ(t) ∈ Ũ1(λ), and U2pλ(t) ⊆ U2pλ(s), so U2pλ(s) ∈ Ũ1(λ), so s ∈ W̃(λ).
Completeness: Suppose η < λ and tξ ∈ W̃(λ) for ξ < η. Let Kξ = U2pλ(tξ) = {µ ∈ I2 ∩ λ : tξ ∩ µ ∈ Ũ2(µ)}; then Kξ ∈ Ũ1(λ). Let K = (∩ξ<η Kξ) ∩ (η, λ); then K ∈ Ũ1(λ). If µ ∈ K then η < µ and tξ ∩ µ ∈ Ũ2(µ) for all ξ < η, so (∩ξ<η tξ) ∩ µ ∈ Ũ2(µ). Thus, ∩ξ<η tξ ∈ W̃(λ).
Ultrafilter property: Write t^c for λ − t. For any µ ∈ I2 ∩ λ, either t ∩ µ ∈ Ũ2(µ) or t^c ∩ µ ∈ Ũ2(µ). So if K1 = {µ ∈ I2 ∩ λ : t ∩ µ ∈ Ũ2(µ)} and K2 = {µ ∈ I2 ∩ λ : t^c ∩ µ ∈ Ũ2(µ)}, then I2 ∩ λ = K1 ∪ K2. Since I2 ∩ λ ∈ Ũ1(λ), either K1 ∈ Ũ1(λ) or K2 ∈ Ũ1(λ); that is, either t ∈ W̃(λ) or t^c ∈ W̃(λ).
Normality: Suppose tξ ∈ W̃(λ) for ξ < λ. Let Kξ = U2pλ(tξ) = {µ ∈ I2 ∩ λ : tξ ∩ µ ∈ Ũ2(µ)}; then Kξ ∈ Ũ1(λ). Let K = △ξ<λ Kξ; then K ∈ Ũ1(λ). If µ ∈ K then tξ ∩ µ ∈ Ũ2(µ) for all ξ < µ, so △ξ<µ (tξ ∩ µ) ∈ Ũ2(µ), so (△ξ<λ tξ) ∩ µ ∈ Ũ2(µ). Thus, △ξ<λ tξ ∈ W̃(λ).
This completes the proof that < is transitive. The proof that it is well-founded may be found in lemma 19.31 of [Jech2]. ⊳
Let ρ< denote the rank function for < defined in section 32. For a cardinal κ let ρ<(κ) = sup{ρ<(U) + 1 : U is a normal ultrafilter on κ}. Clearly ρ<(κ) > 0 if and only if κ is measurable. ρ<(κ) > 1 if and only if there is a normal ultrafilter U on κ such that ρ<(U) ≥ 1. By lemma 19.33 of [Jech2], this is so if and only if {λ : ρ<(λ) ≥ 1} ∈ U, i.e., the measurable cardinals below κ comprise a set which is in U. In general ρ< is a measure of the “order of measurability” of κ. The following may be shown.
- ρ<(κ) ≤ (2^κ)^+; if GCH holds then ρ<(κ) ≤ κ^++ (remarks following lemma 19.34 of [Jech2]).
- If κ is strong then ρ< (κ) ≤ κ++ (exercise 20.17 of [Jech2]). From one point of view, the types of large cardinals considered above comprise the most important types larger than measurable. Even larger types are considered, though. A cardinal κ which was the critical point of an elementary embedding j : V 7→ V would be very large. However in 1971 it was shown that there is no such (definable proper class) j. This is theorem 17.7 of [Jech2]; and also theorem 23.12 of [Kanamori3], where three proofs are given. It became of interest what types of cardinals can be defined, for which it was not known whether the existence of such cardinals was demonstrably false. The main such which have been defined are I0, I1, I2, and I3 (the rank-into-rank types); huge cardinals and related types; and extendible cardinals and related types. The main properties of these types of cardinals can be found in [Kanamori3]; [Jech2] has some discussion. Although not of as great interest as the types considered above, these types continue to be studied. 44. Kunen’s theorem. As has been seen in section 38, the principle “0# exists” is greatly at variance with the principle “V = L”. In 1974 R. Jensen proved the “covering lemma”, which is a statement to the effect that if 0# does not exist (written “¬0#”) then there are restrictions on how greatly V can differ from L. In particular, the “singular cardinals hypothesis”, which follows from GCH and hence from V = L, follows from ¬0#. The proof of the covering lemma is quite involved, no matter how it is done. It was originally proved using the “fine structure theory”. Later, proofs were given which did not require this. More recently, proofs using fine structure theory have been recognized as being of additional interest, in particular to generalizations of the covering lemma to “core models”. An introduction to these methods may be found in [Mitchell2]. 
An overview of a proof of the covering lemma for L will be given in section 50; the sections from here through section 49 discuss preliminary results. The singular cardinal hypothesis will be discussed in section 51. As seen in section 36, the existence of a measurable cardinal is equivalent to the existence of a non-trivial elementary embedding of V. A theorem of Kunen states that “0# exists” is equivalent to the existence of a non-trivial elementary embedding of L. This is clearly a fact of interest in itself, and will be needed in section 50.
An elementary embedding j : M ↦ N between transitive models (sets or classes) of ZFC is defined as usual. In this section, only the case that M is a proper class will be considered, in which case N is also. The case that M is a set is also of use; a discussion may be found in [Cummings] for example. Various facts from the case M = V are readily adapted. In particular, an ultrafilter may be defined from j. It need not be in M, though, so the following definition is needed: If κ is a cardinal in M, an M-ultrafilter on κ is an ultrafilter in the Boolean algebra Pow(κ)^M. Say that U is M-κ-complete if ∩ξ<η Xξ ∈ U whenever η < κ and ⟨Xξ : ξ < η⟩ is an element of M. Say that U is M-normal if △ξ<κ Xξ ∈ U whenever ⟨Xξ : ξ < κ⟩ is an element of M.
Given an M-ultrafilter U on a cardinal κ of M, the ultrapower (M^κ)^M / ≡_U can be constructed as follows. (M^κ)^M is the class of functions f in M with domain κ. Say that two such are equivalent if they are equal on a set in U. An element of the ultrapower is the set of elements of least rank of an equivalence class. The predicate ∈̂ is defined as in section 36, and theorems 35.3 and 35.4 hold. By the same argument as in the proof of lemma 36.1, (M^κ)^M / ≡_U has small extensions. In general it is not necessarily well-founded, even if U is M-κ-complete. If U is M-κ-complete then κ is a regular cardinal in M; this follows by an argument given in the proof of theorem 36.7.
In general ensuring further properties of κ requires placing further restrictions on U; see [Kunen2].
Suppose j : M ↦ N is an elementary embedding where M (and hence N) is a proper class which is a model of ZFC. Lemma 36.3 may be adapted, replacing Vα by Vα^M. Theorem 36.4 may be adapted, letting U in the proof equal {X ⊆ α : X ∈ M and α ∈ j(X)}. Lemma 36.5 may be adapted; U is M-normal if and only if for every regressive function f : X ↦ κ where f ∈ M and X ∈ U, there is a subset Y ⊆ X such that Y ∈ U and f is constant on Y. Lemma 36.6 may be adapted. Theorem 36.8 may be adapted; rather than from M_{U_j}, the map constructed is from (M^κ)^M / ≡_{U_j}. With j as above, it follows by the adaptation of theorem 36.8 that (M^{κ_j})^M / ≡_{U_j} is well-founded, since a descending chain would yield one in N.
Theorem 1. If 0# exists then there is a non-trivial elementary embedding j : L ↦ L.
Proof: Let I = ∪Iκ be the Silver indiscernibles (defined in section 38). Let j0 : I ↦ I be any order-preserving map other than the identity. By theorem 38.9 every element of L equals t_v⃗(x⃗) where t is a Skolem term and x⃗ is an increasing sequence of Silver indiscernibles. Define j(t_v⃗(x⃗)) to be t_v⃗(j0(x1), . . . , j0(xk)). Standard arguments show that j is a well-defined function, and is an elementary embedding. ⊳
Suppose j : L ↦ L is a non-trivial elementary embedding. The facts outlined above can be used to transform j to an embedding with an additional property. Write κ for κ_j and U for U_j. With M = L, (M^κ)^M / ≡_U is well-founded, and its transitive collapse is a model of set theory containing the ordinals, so equals L, giving rise to an elementary embedding j_U : L ↦ L; by the adaptation of lemma 36.6, κ_{j_U} = κ_j.
Lemma 2. If λ is a limit cardinal of cofinality greater than κ_j then j_U(λ) = λ.
Remarks on proof: This is lemma 18.23 of [Jech2]. ⊳
From here on j will be assumed to have the property of lemma 2.
The next definition makes use of an operator on classes of ordinals which has uses in various settings. If X is a class of ordinals, there is a function eX on the ordinals, where eX (ξ) is the ξth element of X in increasing order. A “fixed point” of eX is an ordinal ξ such that eX (ξ) = ξ. The fixed points of eX are also called the fixed points of X. Clearly, ξ is a fixed point of X if and only if the order type of X ∩ ξ equals ξ. If X is a class of cardinals then λ is a fixed point if and only if |X ∩ λ| = λ. Let FP(X) denote the set of fixed points of X. Given j, let X0 be the set of limit cardinals of cofinality greater than κj . Let Xα+1 = FP(Xα ). For α ∈ LimOrd let Xα = ∩β<α Xβ . Lemma 3. Each Xα is a proper class. Remarks on proof: This is proved in remarks preceding lemma 18.24 of [Jech2]. ⊳ Now let κl be any element of Xℵ1 . Let jl be the restriction of j to Lκl . Since j(κl ) = κl , jl is an elementary embedding of Lκl in Lκl . For α < ℵ1 let Mα be the definable hull in Lκl of κj ∪ (Xα ∩ κl ). Since |Xα | = κl its transitive collapse is Lκl ; let πα be the collapsing isomorphism. Let γα = πα−1 (κj ). Lemma 4. The set {γα : α < ℵ1 } is a set of indiscernibles for Lκl . Remarks on proof: This is proved in lemmas 18.24 to 18.26 of [Jech2]. ⊳ Theorem 5. If there is a non-trivial elementary embedding j : L 7→ L then 0# exists. Remarks on proof: This follows by lemma 4 and theorem 38.7. ⊳ 45. Rudimentary functions. The rudimentary functions were introduced independently in [Gandy] and [Jensen] in the early 1970’s, and have since been of considerable use in set theory. The rudimentary functions are functions f : V k 7→ V . As usual, these are proper classes, so each is defined by a formula. However an “informal” style may be used in defining them; 145 the definition could be “translated” to more formal one, at the cost of great tedium. 
The rudimentary functions may be initially defined to be the smallest class of function containing those in clauses 1-3 below; and closed under the operations of clauses 4 and 5. 1. F (x1 , . . . , xk ) = xi , for any k and 1 ≤ i ≤ k. 2. F (x1 , . . . , xk ) = {xi , xj }, for any k and 1 ≤ i, j ≤ k. 3. F (x1 , . . . , xk ) = xi − xj , for any k and 1 ≤ i, j ≤ k. 4. F (x1 , . . . , xk ) = ∪w∈x1 G(w, x2 , . . . , xk ) (union on the first argument). 5. F (x1 , . . . , xk ) = G(H1 (x1 , . . . , xk ), . . . , Hl (x1 , . . . , xk )) (composition). The composition operator is a strict variety. A more general composition operator allows F (x1 , . . . , xk ) to equal G(t1 , . . . , tl ) where ti is Hi (xi1 , . . . , xiki ), or some xj . It is easily seen by induction that allowing the more general composition operator does not change the collection of rudimentary functions. The initial definition may be used to prove basic properties of the rudimentary functions. In this section various of these will be stated; proofs will be omitted, and can be found in the original paper [Jensen], [Devlin], or [Dodd]. There is a brief treatment of rudimentary functions in [Jech2], which omits proofs. A predicate P (~x) is said to be rudimentary if there is a rudimentary function f such that P (~x) ⇔ f (~x) 6= ∅. Lemma VI.1.1 of [Devlin] shows that various functions and predicates are rudimentary, and shows the following. - A predicate P (x) is rudimentary if and only if its characteristic function χP (~x) is rudimentary, where χP (~x) = 1 if P (~x), else 0. - The predicates x = y and x ∈ y are rudimentary. - The rudimentary predicates are closed under Boolean operations and bounded quantification. In particular a ∆0 predicate (i.e., one definable by a ∆0 formula) is rudimentary. - Various standard functions with ∆0 graphs are rudimentary, including p = hu, vi, u = π1 (p), and v = π2 (p); z = x × y; and standard functions concerning relations and functions. Lemma 1. 
A rudimentary function has a ∆0 graph. Remarks on proof: Say that a function f (~x) is simple if R(f (~x), ~y ) is ∆0 whenever R(w, ~y ) is ∆0 . After a preliminary lemma, it follows by induction on f that every rudimentary function is simple. The proof may be found in lemmas VI.1.2 and VI.1.3 of [Devlin]. ⊳ As a corollary, a k-ary predicate on V is rudimentary if and only if it has a ∆0 definition in the language of set theory (if and only if the 146 characteristic function has a ∆0 graph). The rudimentary functions can be characterized using “basis functions”. These are as follows, where recall hx1 , x2 , . . . , xk i denotes hx1 , hx2 , . . . , xk ii. 0. {x, y} 1. x − y 2. x × y 3. {hu1 , u2 , u3 i : u2 ∈ x ∧ hu1 , u3 i ∈ y} 4. {hu1 , u2 , u3 i : u3 ∈ x ∧ hu1 , u2 i ∈ y} 5. ∪x 6. π1 (x) 7. ∈ ∩(x × x) 8. {π2 (x ↾ z) : z ∈ y} Lemma 2. A function is rudimentary if and only if it can be obtained by (general) composition from the basis functions. Remarks on proof: Let B be the collection of functions which can be obtained by general composition from the basis functions. It is not difficult to see that the basis functions, and hence all the functions in B, are rudimentary. For the converse, if F is a formula of set theory together with a suitable list of k variables, let dF be the function of a single argument u, whose value is the subset of uk defined in u by F . It follows by induction on F that dF is in B. Given a function f : V k 7→ V let f ∗ : V 7→ V be the function where f ∗ (u) = f [un ]. It follows by induction (using dF ’s) that if f is rudimentary then f ∗ is in B. From this, it follows that if f is rudimentary then f ∈ B. For details see lemma VI.1.11 of [Devlin]. ⊳ A set X is said to be “rudimentarily closed” (abbreviated “rudclosed”) if whenever x1 , . . . , xk ∈ X and f is a rudimentary function then f (x1 , . . . , xk ) ∈ X. Strictly speaking this definition is unsatisfactory, since it quantifies over a set of proper classes. 
However, it is intuitively correct, and the difficulties are removable. For example, it suffices that X be closed under the basis functions. Lemma 2 is subject to similar comment. It is not a theorem of ZFC; rather, it is a “meta-theorem” about ZFC. It is easy to see that if α is a limit ordinal then Vα is rud-closed; and that the intersection of a family of rud-closed sets is rud-closed. Thus, for any set X there is a smallest rud-closed set Y such that X ⊆ Y . Y is called the “rudimentary closure” (abbrevated rud-closure) of X. As usual it may be described as ∪i∈ω Xi where Xi+1 is obtained from Xi by adding to Xi the result of applying each of the basis functions in every possible way to elements of Xi . Lemma 3. If X is a transitive set then the rud-closure of X is 147 transitive. Remarks on proof: Let Y be the rud-closure. Say that x ∈ Y is valid if TC({x}) ⊆ Y . It suffices to show that if f is rudimentary, and xi is valid for 1 ≤ i ≤ k, then f (~x) is valid. Indeed, if x ∈ X then x is valid; and it follows inductively that any x ∈ Y is valid. The claim for f is proved by induction on the formation of f according to the initial definition of a rudimentary function. For details see lemma VI.1.7 of [Devlin]. ⊳ The following two technical lemmas will be used in section 46. If M is a structure, a subset S ⊆ M will be said to be definable in M if it is defined by some formula without parameters, and definable in M from parameters if is defined by some formula with parameters, Lemma 4. Suppose X is transitive and rud-closed, and Y ≺1 X. Then Y is rud-closed and satisfies the axiom of extensionality. Suppose π : Y 7→ Z is the transitive collapse; then π commutes with rudimentary functions (i.e., π(f (~x)) = f (π(x1 ), . . . , π(xk ))). Remarks on proof: That Y satisfies extensionality follows because Y ≺1 X, and x 6= y can be written in Σ1 form. That Y is rud-closed follows because the existence condition is Σ1 , by lemma 1. 
The final claim follows by induction on f. For details see lemma VI.1.22 of [Devlin]. ⊳
Lemma 5. If f is a rudimentary function then there is an integer b such that ρ(f(x⃗)) ≤ max(ρ(x1), . . . , ρ(xk)) + b.
Proof: This follows by induction from the fact that it is true of the basis functions, which is easily checked. ⊳
In many applications of the rudimentary functions it is necessary to consider them relativized to an additional unary predicate A. Examples will be seen in section 47. As in section 39, for convenience ZFC_A can be used, and A removed when it is definable or a set. To define the A-relative rudimentary functions, add to the initial definition the following clause.
6. A ∩ x
This is a unary function, for any A.
Lemma 6. If f is an A-relative rudimentary function then f equals a general composition of rudimentary functions and the function A ∩ x.
Remarks on proof: The claim follows by induction on f. All cases are straightforward except clause 4. For details see lemma VI.1.8 of [Devlin]. ⊳
Note that this is a meta-theorem of ZFC_A, and the claim holds for any A (“uniformly in A”). As a corollary, if the function A ∩ x is added as a 9th basis function, the A-relative rudimentary functions are those obtained from the expanded set of basis functions by general composition. Expanding a remark preceding lemma 1, note that if x ∈ A then A ∩ {x} = {x}, else A ∩ {x} = ∅. It follows that if a predicate is ∆0 in A then it is an A-relative rudimentary function; further the function definition depends only on the formula and not on A. The “A-relative rud-closure” of a set X is defined as the rud-closure, except the additional function A ∩ x is included in the basis functions.
A structure ⟨M, A⟩ for the language, expanded with a unary predicate symbol, is said to be amenable if A ∩ x ∈ M for all x ∈ M. The structure will be said to be transitive or rud-closed if M is.
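The description of the rud-closure as a union of stages Xi, given earlier in this section, can be made concrete on hereditarily finite sets. The following Python fragment is only an illustrative sketch: it models sets as frozensets, and it uses just three of the basis functions (pairing, difference, and union) rather than the full list, so the stages it computes are smaller than the true ones. The names pair, diff, big_union and next_stage are chosen here for illustration and do not come from the text.

```python
from itertools import product

def pair(x, y):
    """Basis function {x, y}."""
    return frozenset({x, y})

def diff(x, y):
    """Basis function x - y."""
    return x - y

def big_union(x):
    """Basis function: union of the members of x."""
    out = frozenset()
    for member in x:
        out |= member
    return out

# (function, number of arguments); an assumed subset of the basis functions
BASIS = [(pair, 2), (diff, 2), (big_union, 1)]

def next_stage(stage):
    """Add to `stage` every value of a basis function on elements of `stage`."""
    new = set(stage)
    for func, arity in BASIS:
        for args in product(stage, repeat=arity):
            new.add(func(*args))
    return frozenset(new)

EMPTY = frozenset()
X0 = frozenset({EMPTY})   # the starting set X = {∅}
X1 = next_stage(X0)       # {∅, {∅}}
X2 = next_stage(X1)       # {∅, {∅}, {{∅}}, {∅, {∅}}}
print(len(X0), len(X1), len(X2))
```

Each stage is finite, but the sizes grow quickly, so only the first few stages are practical to compute this way.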
If ⟨M, A⟩ is rud-closed and amenable then M is closed under the A-relative rudimentary functions; and if a predicate is ∆0 relative to A then it is given by an A-relative rudimentary function, acting on M.
Just like an admissible set, a transitive rud-closed set is a model of a sufficient fragment of set theory that definitions can be given, and “computations” carried out, which are valid in any such set (lemma 8 below will be an example). However, rud-closure is not as stringent a requirement as admissibility, and more care is needed. (Just as for admissible sets, there are axiom systems for the rud-closed sets, but this will not be considered; see [Dodd], [Gandy], or [Mathias].)
As in the case of admissible sets, it is useful to give Σ1 definitions, which hold in any transitive rud-closed set. It is easily seen that the set Vω of hereditarily finite sets is rud-closed, and is a subset of any rud-closed set. Suppose ∃w P(w, x⃗) is a predicate on Vω where P is ∆0; if w can be taken to be in Vω when it exists, then the definition holds in any transitive rud-closed set. Lemma 7 uses this observation to provide some useful predicates. Recall from section 16 that the predicate “x is an integer” is ∆0.
Lemma 7. For every recursively enumerable predicate P on ω there is a Σ1 formula which defines P in any transitive rud-closed set.
Remarks on proof: First, k = i + j has such a formula; a witness is a function f with domain j + 1, such that f(0) = i, f(t + 1) = f(t) + 1 for all t ∈ j, and k = f(j). The required f is in Vω. Second, there is a Σ1 formula N_m(n, f) which holds if f is the m-adic notation of n (f(0) being the low-order digit). This states the existence of a sequence v of integers, and w of witnesses, where if Dom(f) = l then Dom(v) = l + 1, Dom(w) = l, v(0) = 0, v(l) = n, and for t < l, w(t) witnesses that v(t + 1) = m · v(t) + f(t). Again, the required witnesses are in Vω. If the state of a Turing machine (see appendix 2) is coded as a sequence over a finite alphabet, the step predicate and “halted” predicate are rudimentary. An input integer can be transformed to the initial state using N_2. Thus, the predicate determined by the Turing machine is defined by a Σ1 formula, where the witnesses may all be taken in Vω. ⊳
There is a well-known bijection E : ω ↦ Vω. This may be defined by the following recursion: E(0) = ∅, and E(2^{i0} + i1) = {E(i0)} ∪ E(i1) where i1 < 2^{i0}. Using the methods indicated above, it may be seen that E is Σ1, i.e., has a Σ1 definition which holds in any transitive rud-closed set. Thus, any function from Vω to Vω, such that the induced function on the integer codes is Σ1, is Σ1 also.
Lemma 8. There is a Σ1 formula in the language expanded with a unary predicate symbol A, which defines the predicate “the ∆0 formula φ is true in ⟨M, A⟩, with assignment a to the free variables”, in any transitive rud-closed amenable structure ⟨M, A⟩. There is also such a Π1 formula.
Remarks on proof: This is lemma 1.12 of [Jensen]; see also definition 1.16 of [Dodd]. Using the function E defined after lemma 7, it is irrelevant whether φ is coded as an integer or a hereditarily finite set; integer codes will be assumed. As observed preceding lemma 1, given a ∆0 formula φ defining the predicate Q, there is a rudimentary function definition dφ such that Q holds if and only if dφ = 1. By lemma 2, dφ may be translated to a term tφ involving the basis functions. The proofs provide explicit methods, allowing the conclusion that the function φ ↦ tφ is recursive (indeed primitive recursive). These remarks extend readily to the relativized case. By lemma 7 the predicate P1(t, φ), which holds if t = tφ, is Σ1. The amenability requirement ensures that every term t in the basis functions defines a total function. The predicate P2(y, t, a), which holds if y is the value of the term t at the assignment a to the variables of t, may be seen to be Σ1.
The final predicate P is then ∃t(P2 ∧ P1). This may be written in Π1 form as ¬P(¬φ, a). ⊳
Corollary 9. For n ≥ 1, there is a Σn formula in the expanded language, which defines the predicate “the Σn formula φ is true in ⟨M, A⟩, with assignment a to the free variables”, in any transitive rud-closed amenable structure ⟨M, A⟩.
Remarks on proof: See lemma II.6.4 of [Devlin] for a related result. Let P1(φ, a) be the predicate of lemma 8. Let P2(θ, ā, a1, . . . , an, φ, a) be the following predicate: “θ is the matrix of φ, ai is an assignment to the ith block of bound variables, and ā is the concatenation of a and the ai”. P2 may be seen to have a Σ1 definition, which holds in any transitive rud-closed set. If n is odd the final predicate is ∃a1 · · · ∃an ∃θ∃ā(P2 ∧ P1). If n is even the final predicate is ∃a1 · · · ∀an ∀θ∀ā(P2 ⇒ P1′) where P1′ is the Π1 form. ⊳
The following fact will be used in section 46.
Lemma 10. Suppose M is transitive and rud-closed, and A ⊆ M is ∆0-definable in M from parameters. Then A ∩ x ∈ M for all x ∈ M.
Remarks on proof: A is rudimentary, so A ∩ x is rudimentary; the parameters are in M, and M is rud-closed. For details see lemma VI.1.6.v of [Devlin]. ⊳
46. The Jensen hierarchy. The Jensen hierarchy is an alternative to Gödel’s Lα hierarchy for the constructible sets. In various circumstances facts about L can be proven more straightforwardly using the Jensen hierarchy than the Gödel hierarchy, and it has become widely used since Jensen introduced it in 1972. Again, in this section proofs will mostly be omitted, and may be found in [Jensen] or [Devlin].
Let Rud(X) denote the rud-closure of X ∪ {X}. For an ordinal α, define the set Jα by transfinite recursion as follows.
J0 = ∅
Jα+1 = Rud(Jα)
Jα = ∪β<α Jβ for limit ordinals α
Although the Lα hierarchy can be dispensed with entirely ([Schindler] for example does so), both hierarchies are widely used. Lemma 5 below gives a relation between them.
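As an aside on the integer coding of hereditarily finite sets used in section 45, the recursion for the bijection E : ω ↦ Vω can be sketched in Python. This is only an illustration, under the assumption that the recursion reads E(2^{i0} + i1) = {E(i0)} ∪ E(i1) with i1 < 2^{i0}; sets are modeled as frozensets, and the inverse function code is a name introduced here, not one used in the text.

```python
def E(n):
    """Hereditarily finite set coded by the non-negative integer n."""
    if n == 0:
        return frozenset()
    i0 = n.bit_length() - 1   # position of the highest binary digit of n
    i1 = n - (1 << i0)        # the remaining digits, so i1 < 2**i0
    return frozenset({E(i0)}) | E(i1)

def code(x):
    """Inverse of E: the integer whose binary digits mark the codes of the members of x."""
    return sum(1 << code(member) for member in x)

# E(3) = {E(1)} ∪ E(1) = {{∅}, ∅}
print(E(3) == frozenset({frozenset(), frozenset({frozenset()})}))
```

In other words, E(n) has E(i) as a member exactly when the ith binary digit of n is 1, which is why the two functions invert each other.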
The definition of the Lα hierarchy using the “Def” operator has the advantage of yielding a quite straightforward definition of L.
If U is transitive then U ∪ {U} is, so Rud(U) is. Thus, each Jα is transitive. Since clearly Jα ∈ Jα+1, it follows that Jβ ∈ Jα if β < α, and Jβ ⊆ Jα if β ≤ α. Each Jα is clearly rud-closed. This is a significant advantage of the Jensen hierarchy. For example, Lα is not even closed under ordered pairs in general (it is if α is a limit ordinal). As observed in the previous section, Vω is rud-closed, from which it follows that J1 = Vω.
Lemma 1. Jα ∩ Ord = ω · α.
Proof: By lemma 45.5, Jα+1 ∩ Ord ≤ (Jα ∩ Ord) + ω. It is easy to see that equality holds. The lemma follows by induction on α. ⊳
Some authors adopt the convention that Jα receives the index ω · α rather than α. The indexing here is that used by [Jensen], [Devlin], and [Jech2].
For the following let ∆0-Def(S) be the collection of subsets of S which are defined in S by a ∆0 formula with parameters from S (as opposed to Def(S), which allows any formula).
Lemma 2. For a set S, ∆0-Def(S ∪ {S}) ∩ Pow(S) = Def(S).
Remarks on proof: If A ⊆ S is defined in S by F (with free variable x), let F′ (with free variables v, x) be F with all quantifiers replaced by quantifiers restricted to range over v, where v is a new variable; then A is defined in S ∪ {S} by x ∈ v ∧ F′, with S assigned to v. For the converse, it follows by induction on F that there is a formula F′, such that if F defines X in S and F′ defines X′ in S ∪ {S}, then X = X′ ∩ S. For details see lemma VI.1.17 of [Devlin]. ⊳
Lemma 3. For a transitive set S, Rud(S) ∩ Pow(S) = ∆0-Def(S ∪ {S}) ∩ Pow(S).
Proof: Suppose A ⊆ S is defined in S ∪ {S} by the ∆0 formula F. Since S ∪ {S} and Rud(S) are both transitive, A is defined in Rud(S) by F. By lemma 45.10, A = A ∩ S is in Rud(S). Suppose A ∈ Rud(S) and A ⊆ S. Then for some rudimentary function f and p ∈ S, A = f(p, S).
As noted in the proof of lemma 45.1, f is simple, so x ∈ f(p, S) is defined by a ∆0 formula Fv, with parameters p, S. By absoluteness, it is so defined in S ∪ {S}; that is, A is in ∆0-Def(S ∪ {S}) ∩ Pow(S). ⊳
Corollary 4. For a transitive set S, Rud(S) ∩ Pow(S) = Def(S).
Proof: Immediate by lemmas 2 and 3. ⊳
It follows that a subset of Jα is definable in Jα from parameters if and only if it is an element of Jα+1.
The following provides a comparison of the Gödel and Jensen hierarchies. Neither inclusion should be surprising; the first is reasonable by corollary 4, and the second by lemma 1.
Lemma 5. For any ordinal α, Lα ⊆ Jα ⊆ Lω·α.
Proof: See lemmas VI.2.3 and VI.2.4 of [Devlin]. ⊳
In particular, L = ∪α Jα. Also, if α = ω · α then Jα = Lα. The class of ordinals for which α = ω · α is a large one; it is closed and unbounded for example.
The following definition is often useful in developing the theory of the Jensen hierarchy. Let B_i for 0 ≤ i ≤ 8 denote the ith basis function of section 45, and let k_i denote its valency. Let
s(u) = u ∪ ∪_{0≤i≤8} B_i[u^{k_i}]
S(u) = s(u ∪ {u})
S0 = ∅
Sα+1 = S(Sα)
Sα = ∪β<α Sβ for limit ordinals α
Lemma 6. Rud(u) = ∪i<ω S^i(u).
Proof: Clearly ∪i<ω s^i(u) is the rud-closure of u. Also, if u ⊆ v then s(u) ⊆ s(v), whence s(u) ⊆ S(u). The inclusion Rud(u) ⊆ ∪i<ω S^i(u) follows. For the opposite inclusion, by the proof of lemma 45.2, B_i* is rudimentary, where B_i*(u) = B_i[u^{k_i}]. Thus, if v ∈ Rud(u) then B_i*(v) ∈ Rud(u). It is easily seen that if v, w ∈ Rud(u) then v ∪ w ∈ Rud(u). Thus, if v ∈ Rud(u) then s(v) ∈ Rud(u). But if v ∈ Rud(u) then v ∪ {v} ∈ Rud(u), so if v ∈ Rud(u) then S(v) ∈ Rud(u). It is now easy to see by induction that S^i(u) ⊆ Rud(u) for all i. Indeed, if v = S^i(u) then S^{i+1}(u) = s(v ∪ {v}), and v ∪ {v} ⊆ Rud(u), so s(v ∪ {v}) ⊆ Rud(u). ⊳
It follows by induction that Jα = Sω·α.
Lemma 7. For the following predicates, there is a Σ1 formula which defines the predicate in Jα for any α.
a. y = Sβ
b.
y = Jβ
Remarks on proof: There is a ∆0 predicate P(f) which states that “f is a function whose domain is an ordinal; f(0) = 0; at successor stages f(γ + 1) = S(f(γ)); and at limit stages f(γ) = ∪δ<γ f(δ)”. y = Sβ is defined by “∃f(P(f) ∧ y = f(β))”. There is at most one f with a given domain. That there is an f, and part a of the lemma, may be shown by induction on α. For the induction step at successor stages, the function y = Sβ is Σ1-definable in Jα, so is in Jα+1. Thus, there is an f with domain ω · α, and it follows that there is an f with domain ω · α + n for any n ∈ ω. Part b follows by giving a suitable definition of the predicate “β = ω · γ”. Further details may be found in lemma 2.2 of [Jensen]; a more involved proof may be found in lemma VI.2.5 of [Devlin]. ⊳
Lemma 8. There is a rudimentary function W, such that if r is a well-order of v then W(v, r) is a well-order of s(v). Further, W(v, r) is an end-extension of r, i.e., r ⊆ W(v, r), and if x ∈ v and y ∉ v then x precedes y in W(v, r).
Remarks on proof: Given x and y, say that x precedes y if one of the following holds, where a clause presumes that preceding clauses do not hold.
- x and y are in v and x precedes y in r.
- x ∈ v and y ∉ v.
- The smallest i such that x ∈ B_i[v^{k_i}] precedes the smallest j such that y ∈ B_j[v^{k_j}].
- The smallest i and j are equal, and the least a⃗ such that x = B_i(a⃗) precedes the least b⃗ such that y = B_i(b⃗), in the lexicographic order on v^{k_i}.
The predicate “precedes” is ∆0, and W(v, r) is the subset of s(v) × s(v) defined by it. For further details see lemma 1.21 of [Devlin]. ⊳
Define well-orders <Sα as follows.
<S0 = ∅
<Sα+1 = W(<Sα)
<Sα = ∪β<α <Sβ for limit ordinals α
Let <Jα equal <Sω·α. Let <J equal ∪α <Jα.
Lemma 9. For the following predicates, there is a Σ1 formula which defines the predicate in Jα for any α.
a. y = <Sβ
b. y = <Jβ
Remarks on proof: The proof of lemma 7 need only be modified as needed.
⊳ It should come as no surprise to the reader that there are Skolem functions for the Jα with a variety of additional properties. As will be seen, it is a fact of considerable importance that such can be defined when the language has an additional unary predicate symbol A, in a manner which holds for any A meeting the amenability requirement. For the following lemma (and for later use) let φi denote the ith formula in some fixed computable (and hence Σ1 in Vω ) enumeration of the Σ1 formulas with free variables y and x, in the language expanded with a unary predicate symbol A. Lemma 10. There is a Σ1 formula in the expanded language, which defines a Σ1 Skolem function in any structure M = hJα , Ai which is amenable. That is, it defines a partial function hM J (i, x), such that if ∃yφi (x̊) is true in M , then φi (h(i, x̊), x̊) is true. Remarks on proof: Define the following predicates: - P1 (φ, y, x) is the predicate of corollary 45.9; this is Σ1 . - P2 (i, y, x) = P1 (φi , y, x); this is also Σ1 . - Write P2 as ∃w P3 (w, i, y, x) where P3 is ∆0 . - P4 (z, i, x) = P3 (π1 (z), i, π2 (z), x); this is ∆0 . - P5 (z, i, x) holds if and only if z is the <J -least z ′ such that P4 (z ′ , i, x) holds; this is Σ1 . Then y = hJ (i, x) if and only if ∃z(P5 (z, i, x)∧y = π2 (z)). To see that P5 is Σ1 , write it as P4 (z, i, p)∧∃v(v = {z ′ : z ′ <J z}∧∀z ′ ∈ v¬P4 (z ′ , i, x)); that v = {z ′ : z ′ <J z} is Σ1 follows by lemma 9. ⊳ The superscript M may be omitted when it is clear. As the following lemma shows, a Skolem function such as hJ permits taking a Skolem hull with a single function application. A more general fact will be proved, which is useful in fine structure theory. Let M̄ denote a structure hM, Ai where A is a unary predicate (only a single unary predicate is considered here, but the case of several unary predicates is essentially the same). 
A function h : D ↦ M where D ⊆ ω × M is said to be a Σn Skolem function for M̄ if, whenever ∃y F_yx(x̊) is true in M̄ then F_yx(h(i, x̊), x̊) is true for some i. This is a slight generalization of the hypothesis of lemma 10, where F = φi; the more general hypothesis suffices.

Recall from section 5 that in set theory, a function of several variables is an abbreviation for a function on a Cartesian product. In particular, to simplify the notation, angle brackets may be omitted in the argument list of a function. This abbreviation is used in the following lemma, and occasionally subsequently.

Lemma 11. Suppose M̄ = ⟨M, A⟩ is a structure as above with M rud-closed. Suppose h is a Σn Skolem function, which is defined in M by a Σn formula with parameter p ∈ M. Suppose X ⊆ M is nonempty and closed under ordered pairs. Let N = h[ω × {p} × X], and let N̄ = ⟨N, A ∩ N⟩. Then X ⊆ N and N̄ ≺n M̄.

Proof: Suppose x ∈ X, and consider the formula “w = π2(⟨p, x⟩)”. The unique value of w satisfying this is x, from which it follows that x = h(i, ⟨p, x⟩) for some i, and so x ∈ N. Suppose F(w, q1, …, qk) is true for some w, where qj ∈ N for 1 ≤ j ≤ k. Then qj = h(ij, ⟨p, xj⟩) for some xj ∈ X. A Σn formula F′ can thus be defined, so that for any w, F(w, q1, …, qk) holds in M̄ if and only if F′(w, ⟨p, ⟨x1, …, xk⟩⟩) holds (the integers ij may be defined by formulas). The latter holds for w = h(i, ⟨p, ⟨x1, …, xk⟩⟩) for some i, which is in N. Thus, F holds for some w ∈ N. By Tarski’s criterion, lemma 20.1, suitably modified for Σn formulas, it follows that N̄ ≺n M̄. See also remarks preceding lemma 2.8 of [Jensen]. ⊳

The parameter p is unnecessary for the function hJ, but is needed in other situations. From the proof it is easily seen that p ∈ N.

There is a condensation lemma for the Jα hierarchy. For the following, by lemma 45.4, if N ≺1 Jα for some α then the transitive collapse of N may be taken.

Lemma 12.
Suppose α ∈ Ord and N ≺1 Jα, and let π be the collapsing isomorphism. Then π[N] = Jβ where β ≤ α. Further, for x ∈ N, π(x) ≤J x.

Remarks on proof: By lemma 7.a, if γ ∈ ω·α ∩ N then Jγ ∈ N, since the formula giving Jγ from γ is Σ1, so holds in N, and Σ1 formulas are up-absolute, so the formula defines Jγ from γ in N. Using lemma 45.4 and induction on γ, it may be seen that π(Sγ) = Sπ(γ). Let β = π[N ∩ Ord]. That Jβ ⊆ π[N] is straightforward. The opposite inclusion holds because if x = π(w) then ∃γ(w ∈ Sγ) is true in Jα, so is true in N, and x ∈ Sπ(γ) follows.

For the second claim, since <J is uniformly Σ1, x <J y if and only if π(x) <J π(y). If x <J π(x) for some x, let x be the <J-least such. Since π(x) ∈ Jβ, x ∈ Jβ, and so x = π(w) for some w ∈ N. Then w <J x, so π(w) ≤J w by choice of x, so x ≤J w, a contradiction. For details see lemma VI.2.9 of [Devlin]. ⊳

If ⟨N, B⟩ ≺0 ⟨M, A⟩ then B = A ∩ N, since A(x) is a ∆0 formula. Thus, if M̄ = ⟨Jα, A⟩, and N̄ ≺1 M̄ where N̄ = ⟨N, A ∩ N⟩, then the conclusion of the lemma holds.

Following are some further lemmas, which will be needed later. Recall the function Γ defined in section 13. An ordinal number α is called a δ-number if whenever β, γ < α then β·γ < α. A treatment of δ-numbers may be found in [Monk1].

Lemma 13.
a. Γ has a Σ1 definition, which is provably total in KP.
b. If α is an infinite δ-number then Γ[α × α] = α.
c. For each α there is a surjection g : ω·α ↦ (ω·α) × (ω·α), which is Σ1-definable in Jα from parameters.
d. If p is a parameter such that there is a function g as in part c, which has a definition from p, then Jα = hJ[ω × {p} × ω·α].
e. For each α there is a surjection g : ω·α ↦ Jα, which is Σ1-definable in Jα from parameters.

Remarks on proof: Part a is proved in lemma II.6.6 of [Devlin]. For part b, it follows by ordinal arithmetic that Γ(β, γ) ≤ (max(β, γ) + 1)^2; see lemma 1 of [Linden]. Thus if α is an infinite δ-number and β, γ < α then Γ(β, γ) < α.
That Γ[α × α] ⊇ α follows because the map γ ↦ Γ(0, γ) is increasing. This proves part b. Part c is proved in lemma VI.3.15 of [Devlin]. Part d is proved in lemma 2.10 of [Jensen]. Part e follows using part d. ⊳

Suppose α is an infinite δ-number. It is easily seen that either α = ω or ω·α = α. The function Γ^(−1), which has a parameter-free definition, may be used as g in part c. Likewise, there is a function which has a parameter-free Σ1 definition, which may be used as g in part e. Note that an admissible ordinal is a δ-number, since the existence condition for ordinal multiplication is provable in KP.

A function φ ⊆ X × Y is said to uniformize a relation R ⊆ X × Y if π1[φ] = π1[R] and R(φ(y⃗), y⃗). A structure M is said to be Σn-uniformizable if and only if, whenever R is a predicate which is Σn-definable in M from parameters, there is a function φ, which is Σn-definable in M from parameters, such that φ uniformizes R. Uniformization is useful in constructibility theory, and will be encountered again in section 58.

Lemma 14. Suppose ⟨M, A⟩ is a transitive rud-closed amenable structure. If ⟨M, A⟩ is Σn-uniformizable then it has a Σn-definable Σn Skolem function.

Remarks on proof: This is lemma VI.3.12 of [Devlin]. It follows using corollary 45.9. ⊳

Recall the strong Σ1 collection axiom from section 17. Say that an admissible set is strongly admissible if it satisfies this axiom. Say that an ordinal ω·α is nonprojectible if there is no function f, Σ1-definable in Jα from parameters, with domain a set of ordinals bounded below ω·α, and range ω·α.

Lemma 15.
a. Jα is admissible if and only if there is no function f, definable in Jα from parameters, with domain an ordinal γ < ω·α, and range unbounded in ω·α.
b. Jα is strongly admissible if and only if there is no function f, definable in Jα from parameters, with domain a set of ordinals bounded below ω·α, and range unbounded in ω·α.
c.
If ω·α is nonprojectible then Jα is strongly admissible.

Remarks on proof: These statements, and some further statements, can be found in lemmas 2.11 to 2.13 of [Jensen]. ⊳

The relativized Jensen hierarchy J^A_α is defined by the same recursion as the unrelativized hierarchy, except J^A_(α+1) = Rud_A(J^A_α), where Rud_A(X) denotes the A-relativized rud-closure of X ∪ {X}, as defined in section 45. Likewise, s^A(u) = u ∪ ∪(i=0..9) Bi[u^ki], and S^A(u) and S^A_α are defined in terms of s^A. Discussion of this hierarchy will be cursory, but it is of considerable importance in modern set theory. [Dodd] contains a comprehensive introduction. Some properties of the unrelativized hierarchy continue to hold. In particular:
- Every J^A_α is transitive. Indeed, if the set of basis functions is enlarged then every S^A_α is transitive. (See lemma 2.2 of [Dodd].)
- Lemmas 7 to 9 continue to hold. (See lemmas 1.10 and 1.11 of [SchZem].)
- Every structure ⟨J^A_α, A ∩ J^A_α⟩ is amenable (remarks following definition 1.9 of [SchZem]).
- There is a uniform Σ1 Skolem function, which will be denoted h_J^M (lemma 1.15 of [SchZem]).
- Lα[A] ⊆ J^A_α ⊆ L_(ω·α)[A]. (The proof of lemma VI.2.4 of [Devlin] may be modified as necessary.) In particular, L[A] = ∪α J^A_α.

Lemma 16.
a. If A is definable in Jα then J^A_(α+1) = J_(α+1).
b. If ⟨Jα, A⟩ is amenable then J^A_α = Jα.

Proof: For part a, A ∈ J_(α+1), so by induction S^A_γ ⊆ J_(α+1) for γ < ω·(α + 1). Part b is similar. ⊳

47. Fine structure.

The “fine structure theory” of the Jensen hierarchy involves defining a system of structures. These are used in various ways, once a definition has been given. Fine structure theory may be considered for L, or more generally L[B]. Initially, the case of L will be of principal interest, but the case of L[B] will be of interest later, so some preliminary facts will be given for both cases.

A structure M = ⟨J^B_α, B ∩ J^B_α, A⟩ where α > 0 will be called a J-structure. ⟨Jα, A⟩ and Jα are special cases of interest.
As in [Dodd], the notation J̄^B_α will be used to denote ⟨J^B_α, B ∩ J^B_α⟩; ⟨J̄^B_α, A⟩ denotes M. The notation “p ∈ M” will be used for “p ∈ J^B_α”.

Let M be a J-structure. The notation “Σ_1^M” will be used to abbreviate “Σ1-definable in M from parameters” (and similarly “Σ_n^M”).
a. Let ρ^a_M be the largest ordinal ρ such that, if X ⊆ J^B_ρ is Σ_1^M, then ⟨J^B_ρ, B ∩ J^B_ρ, X⟩ is amenable.
b. Let ρ^b_M be the smallest ordinal ρ such that there is a subset X ⊆ ω·ρ which is Σ_1^M, such that X ∉ J^B_α.
c. Let ρ^c_M be the smallest ordinal ρ such that there is a partial function f which is Σ_1^M, such that Dom(f) ⊆ ω·ρ and Ran(f) = ω·α.

Theorem 1.
a. ρ^a_M ≤ ρ^b_M.
b. ρ^b_M ≤ ρ^c_M.
c. If M = Jα then ρ^c_M ≤ ρ^a_M.

Proof: For part a, suppose η < ρ^a_M, and X ⊆ ω·η is Σ_1^M. ⟨J_(ρ^a_M), X⟩ is amenable, X = X ∩ Jη, and Jη ∈ J_(ρ^a_M), so X ∈ J_(ρ^a_M), so X ∈ Jα. Thus, η < ρ^b_M; since this is so whenever η < ρ^a_M, ρ^a_M ≤ ρ^b_M.

For part b, suppose f is as in the definition of ρ^c_M. Let g : ω·α ↦ Jα be a surjection as in lemma 46.13. Let f′ = g ∘ f; then f′ is Σ_1^M. Let X = {γ < ω·ρ^c_M : γ ∉ f′(γ)}; then X is Σ_1^M. If X ∈ Jα then X = f′(γ0) for some γ0 < ω·ρ^c_M. But then γ0 ∈ X if and only if γ0 ∈ f′(γ0) if and only if γ0 ∉ X, a contradiction. Thus, X ∉ Jα, and this shows that ρ^b_M ≤ ρ^c_M.

For any ordinal γ > 0 there is a Σ_1^M surjection g : ω·γ ↦ ω·(γ + 1); namely map 2m to m, map 2m + 1 to ω·γ + m, and remaining values to themselves. From this it is easily seen that if ρ^c_M > 1 then ρ^c_M ∈ LimOrd.

For part c, it suffices to show that if X ⊆ J_(ρ^c_M) is Σ_1^M, then ⟨J_(ρ^c_M), X⟩ is amenable. If ρ^c_M = 1 then J_(ρ^c_M) = Vω, and the claim follows, because a subset of an element of Vω is in Vω. Thus, it may be supposed that ρ^c_M is a limit ordinal. Suppose γ < ρ^c_M. Let Y = X ∩ Jγ; then Y is defined in Jα by a Σ1 formula with parameter q ∈ Jα. Let Z = hJ[ω × {q} × Jγ]. By lemma 46.11, it follows that Z ≺1 Jα, Jγ ⊆ Z, and q ∈ Z.
By lemma 46.12, π[Z] = Jᾱ for some ᾱ, where π is the collapsing isomorphism. Since Y ⊆ Jγ, π ↾ Y is the identity map. Let q̄ = π(q). Since π is an isomorphism, Y has a Σ1 definition in Jᾱ with parameter q̄. It follows that Y ∈ J_(ᾱ+1).

Using the definition of Z, and the partial function π[hJ ∩ (ω × Z) × Z], it is easy to see that there is a partial function f1 which is Σ_1^(Jᾱ), such that Dom(f1) ⊆ Jγ and Ran(f1) = Jᾱ. Let f2 be a partial function which is Σ_1^(Jα), such that Dom(f2) ⊆ ω·ρ^c_M and Ran(f2) = ω·α. If ᾱ ≥ ρ^c_M held, then using lemma 46.13, f1, an easily defined surjection from Jᾱ to ω·ᾱ, and f2, a Σ1 partial function from ω·γ onto ω·α could be defined, contradicting γ < ρ^c_M. Thus, ᾱ < ρ^c_M, whence Y ∈ J_(ρ^c_M).

Thus far it has been shown that if γ < ρ^c_M then X ∩ Jγ ∈ J_(ρ^c_M). If x ∈ J_(ρ^c_M) then x ∈ Jγ for some γ < ρ^c_M. Then x = x ∩ Jγ, so x ∩ X = x ∩ Jγ ∩ X; since x and Jγ ∩ X are in J_(ρ^c_M), x ∩ Jγ ∩ X is. This completes the proof that ρ^c_M ≤ ρ^a_M. ⊳

Note that the proof that ρ^c_M ≤ ρ^a_M does not apply in general, because it can only be concluded that Y has a definition in ⟨J^(π[B])_ᾱ, π[A]⟩.

In the case M = Jα, the common value of the theorem is called the Σ1-projectum of α; ρα will be used to denote it. As will be seen, the projectum is central to fine structure theory. It is also used in the branch of higher recursion theory known as “α-recursion theory”; see [Sacks2]. J_(ρα) is a “reduct” of Jα, and has properties not held by an arbitrary Jα, for example the following.

Lemma 2. For any J-structure M, ω·ρ^c_M is nonprojectible.

Proof: By definition there is no Σ_1^M partial function f such that Dom(f) is a bounded subset of ω·ρ^c_M and Ran(f) = ω·ρ^c_M. A fortiori there is no such f which is Σ1 in Jα. ⊳

By lemma 46.15.c, ω·ρ^c_M is admissible, and hence as noted in section 46 is a δ-number.

It is useful to have a definition of the projectum ρ_M for any J-structure M. In [DoddJen1] (and [SchZem] and [Mitchell2]), ρ^b_M is used, and will be here.
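For reference, the three candidate definitions of the projectum and the relations established in theorem 1 can be collected in one display. This is only a restatement, written under the assumption that the third component of the structure in the definition of ρ^a_M is the set X itself (which is how the proof of theorem 1 uses it); max and min are shorthand for “largest” and “smallest”.

```latex
\begin{align*}
\rho^a_M &= \max\{\rho : \langle J^B_\rho,\, B\cap J^B_\rho,\, X\rangle
            \text{ is amenable for every } \Sigma_1^M \text{ set } X\subseteq J^B_\rho\},\\
\rho^b_M &= \min\{\rho : \text{some } \Sigma_1^M \text{ set } X\subseteq\omega\cdot\rho
            \text{ satisfies } X\notin J^B_\alpha\},\\
\rho^c_M &= \min\{\rho : \text{some } \Sigma_1^M \text{ partial function } f \text{ has }
            \mathrm{Dom}(f)\subseteq\omega\cdot\rho \text{ and } \mathrm{Ran}(f)=\omega\cdot\alpha\},\\
\rho^a_M &\le \rho^b_M \le \rho^c_M,
            \qquad\text{with } \rho^c_M \le \rho^a_M \text{ as well when } M = J_\alpha .
\end{align*}
```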
A value p ∈ M is called a good parameter if there is a subset X ⊆ ω·ρ_M which is Σ1-definable in M from the parameter p, such that X ∉ Jα. Since ρ_M is defined as ρ^b_M, a good parameter exists. Let p_M be the <J-least good parameter. The fact that p_M can be defined in this way for any M is a main advantage of using ρ^b_M as the definition of the projectum.

[DoddJen1], and various other authors, allow only finite sets of ordinals as parameters, while others, such as [Welch1], allow any element of Jα, as has been done here. The less restrictive definition permits connecting the recursion equations given below with the original definitions (as found in [Devlin]). The more restrictive definition has additional properties of use in some applications. They are equivalent to some extent, since J^B_α = h_J^M[ω × (ω·α)^<ω] (see lemma 2.36 of [Dodd]).

The projectum of Jα may be iterated, provided structures M = ⟨Jα, A⟩ are considered. For any such M, for any p ∈ Jα let A^p_M be the set of pairs ⟨i, x⟩ such that x ∈ J_(ρ_M) and φi(x, p) is true in M, where φi is the enumeration of formulas as in lemma 46.10. Using corollary 45.9, this is readily seen to be Σ1-definable in M, from the parameter ⟨p, ρ_M⟩, or p if ρ_M = α. Let A_M be A^p_M where p = p_M. (These quantities can be defined in the relativized case, but the definition is more complicated; see below.) Let P(M) denote ⟨J_(ρ_M), A_M⟩.
- P0(Jα) = ⟨Jα, ∅⟩
- ρ^(n+1)_α = ρ_(Pn(Jα))
- p^(n+1)_α = p_(Pn(Jα))
- A^(n+1)_α = A_(Pn(Jα))
- P(n+1)(Jα) = P(Pn(Jα))
It is convenient to define ρ^0_α = α, p^0_α = ∅, and A^0_α = ∅. A simple induction shows that Pn(Jα) = ⟨J_(ρ^n_α), A^n_α⟩. Also, for n > 0, p^n_α ∈ J_(ρ^(n−1)_α) and A^n_α ⊆ J_(ρ^n_α).

ρ^n_α is called the Σn-projectum of α, p^n_α the standard parameter, and A^n_α the standard code (or master code). Originally (in [Jensen]), these values were defined as those having certain properties, and various basic facts, including the above recursion equations, proved to hold.
This will be done here, following the presentation in [Devlin]. Some additional facts are needed, since the definition of p_M differs from that given in [Devlin]. Let ρ^na_α, ρ^nb_α, and ρ^nc_α be defined as in theorem 1, but for Σn formulas.

Theorem 3. Suppose M = Jα and n ≥ 1. Then M is Σn-uniformizable.

Remarks on proof: References will be to [Devlin]. Let an be the statement of the theorem for a particular n. Let bn be the statement that ⟨Jα, X⟩ is amenable for any X ⊆ J_(ρ^nc_α) which is Σ_n^M. Let cn be the statement that, if R(x, y⃗) is Σ_n^M and u ∈ J_(ρ^nc_α) then ∀x ∈ u R(x, y⃗) is Σ_(n+1)^M. Lemma VI.3.13 states that a1. Lemma VI.4.3 states that an ⇒ bn. Lemma VI.4.4 states that cn ∧ an ⇒ an+1. Using these, it suffices to show that cn follows from cj for 1 ≤ j < n. The proof of this may be found in theorem VI.4.5; by the induction hypotheses and facts above, aj and bj may be assumed for 1 ≤ j ≤ n. ⊳

A proof of theorem 3, for Lα where α is an admissible ordinal, may be found in theorem 1.27 of [Chong].

Theorem 4. ρ^na_α = ρ^nb_α = ρ^nc_α.

Remarks on proof: By theorem 1, ρ^na_α ≤ ρ^nb_α and ρ^nb_α ≤ ρ^nc_α. That ρ^nc_α ≤ ρ^na_α follows mutatis mutandis as part c of theorem 1 (see lemma VI.4.3 of [Devlin]). By theorem 3 and lemma 46.14, there is a Σn Skolem function which is Σn-definable in Jα from some parameter p; p must be included in taking the Skolem hull Z, as in lemma 46.11. ⊳

Note that ρ^n_α is defined earlier; it will be seen in theorem 8 that it equals the common value of the theorem.

In a structure S, a formula F_x is said to define an element x̊ if F_x(ẘ) is true if and only if ẘ = x̊. If S is an ∈-structure there is some ambiguity in the use of the term “definable” for an element, in that it might be defined as an element, or as a unary predicate; usually this causes no confusion.

Lemma 5. For any J-structure M, ordinal ρ ≤ α, and element p ∈ M, the following are equivalent.
a. Every x ∈ M has a Σ1 definition in M from parameters in J^B_ρ ∪ {p}.
b.
J^B_α = h_J^M[ω × {p} × Jρ].
c. There is a partial function f which is Σ1-definable in M from p, such that f[Jρ] = J^B_α.
If ρ is an infinite δ-number then the preceding hold if and only if
d. there is a partial function f which is Σ1-definable in M from p, such that f[ω·ρ] = Jα.

Proof: Write p̊ for the element, and x for the variable, etc. Suppose ψ(x, ⟨p̊, ẘ⟩) defines x̊ in M, where ẘ ∈ J^B_ρ. Then ∃xψ(x, ⟨p̊, ẘ⟩) is true in M. Since there is exactly one possible value for x, x̊ = h_J^M(i, ⟨p̊, ẘ⟩), where φi = ∃xψ. Thus, a⇒b. For b⇒c, let f(v) = hJ(i, p, w) if v is of the form ⟨i, w⟩. For c⇒a, the formula x = f(ẘ) defines x̊ when x̊ = f(ẘ). For part d, let g be a function as in lemma 46.13.e, which may be taken to have a parameter-free definition by the additional hypothesis. Given f as in part c, f ∘ g is as in part d; and the opposite implication is trivial. ⊳

A parameter p ∈ M is said to be very good if J^B_α = h_J^M[ω × {p} × J_(ρ_M)].

Lemma 6. A very good parameter is good.

Remarks on proof: This is lemma 3.0.4 of [Welch1]. Given a very good parameter, an f as in the definition of ρ^c_M can be constructed, showing that ρ^c_M ≤ ρ^b_M, whence ρ^b_M = ρ^c_M, whence ω·ρ_M is a δ-number. Let f be as in part d of lemma 5. Let X = {γ < ω·ρ_M : γ ∉ f(γ)}; by the argument in the proof of theorem 1.b, X ∉ Jα. ⊳

The following additional observations may be made.
- If p_M is a very good parameter then it is the <J-least very good parameter (by lemma 6).
Suppose M = ⟨Jα, A⟩.
- If there is a very good parameter then ρ^a_M = ρ^b_M = ρ^c_M (as in theorem 4).
- If there is a very good parameter then P(M) is amenable (since A_M is Σ_1^(Jα) and ρ^a_M = ρ^b_M).

To avoid complication, the following lemma will be given only for the unrelativized case, although a suitably formulated version holds in the relativized case. Let P(M, p) denote ⟨J_(ρ_M), A^p_M⟩.

Lemma 7. Suppose M = ⟨Jα, A⟩ is an amenable structure, and p ∈ Jα is a good parameter.
Suppose j : K ↦ P(M, p) is a ∆0-elementary embedding. Then there is an amenable structure N = ⟨Jβ, B⟩ and a very good parameter q ∈ Jβ, such that K = P(N, q). Further, there is a Σ1-elementary embedding ĵ : N ↦ M such that j ⊆ ĵ and ĵ(q) = p. Finally, if p equals p^M_1 then q = p^N_1.

Remarks on proof: This is (the unrelativized case of) lemma 3.3 of [SchZem] (with less restricted parameters). Let X = Ran(j), let Y = h_J^M[ω × {p} × X], and let π be the transitive collapse map for Y. It is easily seen that X is closed under ordered pairs, whence by lemmas 46.10 and 46.11 Y ≺1 M, whence by lemma 46.12 π[Y] equals Jβ for some β ≤ α. Let B equal π[A ∩ Y]. Let ĵ be the inverse of π. Write K as ⟨Jρ̄, Ā⟩.

Suppose x ∈ X, y ∈ Y, and y ∈ x. Then x ∈ P(M, p) since X ⊆ P(M, p), whence y ∈ P(M, p) since P(M, p) is transitive. Also, y = hJ(i, p, z) for some z ∈ X. This may be expressed as A^p_M(k, y, z) (which recall is an abbreviation for A^p_M(⟨k, ⟨y, z⟩⟩)) for some k, and ∃y ∈ x A^p_M(k, y, z) is true. Let x̄, z̄ be such that x = j(x̄) and z = j(z̄); since j is ∆0-elementary, ∃ȳ ∈ x̄ Ā(k, ȳ, z̄) is true; choosing some ȳ, A^p_M(k, j(ȳ), z) is true, so j(ȳ) = hJ(i, p, z), so j(ȳ) = y, and y ∈ X.

It follows by the preceding paragraph that π ↾ X is the inverse map to j, and j ⊆ ĵ. Let q = π(p). It also follows that N = h_J^N[ω × {q} × Jρ̄]. From this it follows that ρ_N ≤ ρ^c_N ≤ ρ̄.

For x ∈ Jρ̄ and i ∈ ω, φi(x, q) holds in N if and only if φi(j(x), p) holds in M if and only if A^p_M(i, j(x)) if and only if Ā(i, x).

Suppose P ⊆ Jρ̄ is Σ_1^N. Since N = h_J^N[ω × {q} × Jρ̄] there is a w ∈ Jρ̄ and a Σ1 formula Q such that P(x) ⇔ Q(x, w, q). By the preceding paragraph there is an i such that P(x) if and only if Ā(i, x, w). If γ < ω·ρ̄ then {x ∈ Jγ : Ā(i, x, w)} is an element of Jρ̄, and it follows that P ∩ Jγ is an element of Jρ̄. As noted in the proof of theorem 1, it follows that ⟨Jρ̄, P⟩ is amenable. Since P was arbitrary, ρ̄ ≤ ρ^a_N ≤ ρ_N. Thus, ρ̄ = ρ_N.
By the above noted fact about Ā, Ā = A^q_N; and so K = P(N, q). That q is a very good parameter has already been noted.

The proof of the last claim may be found in the proof of lemma 3.6 of [Mitchell2]. Certainly p_N ≤J q. Suppose p_N <J q. Since q is very good, p_N is Σ1-definable in Jβ from q. Since ĵ is Σ1-elementary, ĵ(p_N) is Σ1-definable in Jα from ĵ(q) = p_M. This implies that ĵ(p_N) is a very good parameter, which is a contradiction since ĵ(p_N) <J ĵ(q) = p_M. ⊳

Theorem 8. For α > 0 and n > 0 the following hold.
a. ρ^n_α = ρ^nc_α.
b. p^n_α is a very good parameter.
c. A^n_α is Σ_n^(Jα).
d. A subset of J_(ρ^n_α) is Σ_(n+1)^(Jα) if and only if it is Σ_1^(Pn(Jα)).
e. Suppose N = ⟨Jβ, B⟩ and j : N ↦ Pn(Jα) is a ∆0-elementary embedding. There is an ᾱ ≤ α such that N = Pn(Jᾱ). There is a Σn-elementary embedding ĵ : Jᾱ ↦ Jα such that j ⊆ ĵ and ĵ(p^i_ᾱ) = p^i_α for 1 ≤ i ≤ n.

Remarks on proof: The proof is by induction on n. M will be used to denote P(n−1)(Jα).

First, since ρ^nc_α is a δ-number there is a Σ_n^(Jα) partial function f such that Dom(f) ⊆ ω·ρ^nc_α and Ran(f) = Jα. Let f̄ = f ∩ (ω·ρ^nc_α × J_(ρ^(n−1)_α)). Using lemma 46.9, f̄ is Σ_n^(Jα), whence by part d of the induction hypothesis, f̄ is Σ_1^M. It follows that ρ^c_M ≤ ρ^nc_α.

Second, ρ^na_α = ρ^a_M. This follows using part d of the induction hypothesis, as in the proof of lemma VI.5.4 of [Devlin].

Part a follows from the preceding two paragraphs and theorem 1.a and 1.b.

Part b is lemma 3.4 of [Mitchell2]; see also lemma 9.2 of [SchZem]. Let Y = h_J^M[ω × {p_M} × J_(ρ_M)], and let π be the transitive collapse map for Y.

If n = 1, as in lemma 7 π[Y] equals Jᾱ for some ᾱ ≤ α. Now, ρ^1_α = ρα, whence there is an X ⊆ ω·ρ^1_α, which is Σ1-definable in Jα from p^M_1, such that X ∉ Jα. Since J_(ρ^M_1) ⊆ Y and p^M_1 ∈ Y, it follows that X is Σ1-definable in Jᾱ (from π(p^M_1)). If ᾱ < α then X ∈ Jα would follow, a contradiction. Thus, ᾱ = α. By lemma 46.12, π(p^M_1) ≤J p^M_1.
Since A^M_1 is Σ1-definable in Jα from π(p^M_1), it follows that π(p^M_1) = p^M_1. It follows that π is the identity map and Y = Jα, and so p^M_1 is a very good parameter.

For n > 1, by lemma 7 with j the identity map on Pn(Jα), π[Y] equals Jρ̄ for some ρ̄ ≤ ρ^(n−1)_α. Let Ā = π[A^(n−1)_α]. By part e of the induction hypothesis, ρ̄ = ρ^(n−1)_ᾱ and Ā = A^(n−1)_ᾱ for some ᾱ ≤ α. By definition of ρ^n_α, there is an X ⊆ ω·ρ^n_α, which is Σ_1^M, such that X ∉ J_(ρ^(n−1)_α). Since J_(ρ^M_1) ⊆ Y and p^M_1 ∈ Y, it follows that X is Σ_1^⟨Jρ̄, Ā⟩. It follows that X is Σ_1^(J^Ā_ρ̄), whence X ∈ J^Ā_(ρ̄+1). Since ⟨Jρ̄, Ā⟩ is amenable, it follows by lemma 46.16 that X ∈ J_(ρ̄+1). As in the case n = 1, ρ̄ = ρ^(n−1)_α; and further π(p^M_1) = p^M_1, whence π is the identity map.

In remarks preceding lemma VI.5.3 of [Devlin], p^n_α is defined as the least very good parameter, as may be seen using lemma 5. By part b and lemma 6, this equals p^n_α as defined here. Parts c and d follow by the proof of lemma VI.5.3.

For part e, using lemma 7 there is a Σ1 embedding j1 : ⟨Jγ, C⟩ ↦ P(n−1)(Jα) which extends j, and with j1(q) = p^n_α where q is a very good parameter in Jγ. By part e of the induction hypothesis (or lemma 46.12 if n = 1), γ = ρ^(n−1)_ᾱ for some ᾱ, and C = A^(n−1)_ᾱ. It follows that β = ρ^n_ᾱ. It also follows that q = p^n_ᾱ, by the last claim of lemma 7. ĵ is obtained from ĵ1 (or is the inverse of the transitive collapse if n = 1). The proof that ĵ is Σ_n^(Jα) follows inductively using lemma 3.2 of [SchZem]. ⊳

Theorem 8.d is a principal fact of fine structure theory. In Jα, the standard code “codes” the Σ(n+1) subsets of Jρ as Σ1 subsets, where ρ is the n-th projectum. Some additional facts will be given in the next few paragraphs.

Theorem 8.d can be strengthened. For any l ≥ 1, a subset of J_(ρ^n_α) is Σ_(n+l)^(Jα) if and only if it is Σ_l^(Pn(Jα)). See lemma VI.5.3 of [Devlin].

The map ĵ of lemma 7 is unique. Indeed, suppose j : P(N, q) ↦ P(M, p) is a ∆0-elementary embedding, and q is a very good parameter.
Then there is a unique ∆0-elementary embedding ĵ : N ↦ M such that j ⊆ ĵ and ĵ(q) = p; further ĵ is Σ1-elementary. See lemma 3.1 of [SchZem].

Theorem 8.e can be strengthened. Both ᾱ and ĵ are unique. If j is Σm-elementary, then for 0 ≤ i ≤ n, ĵ ↾ J_(ρ^i_ᾱ) is a Σ(n−i+m)-elementary embedding of Pi(Jᾱ) in Pi(Jα). See theorem 8.6 of [Devlin].

As already indicated, suitably reformulated, various facts of fine structure theory continue to hold for the relativized Jensen hierarchy. Relativized fine structure theory is an essential ingredient of the branch of set theory known as “core model theory”.

A J-structure M is said to be acceptable if the following holds:
- Suppose β < α, γ < ω·β, and Pow(γ) ∩ (J^B_(β+1) − J^B_β) ≠ ∅; then there is a function f ∈ J^B_(β+1) which is a surjection from γ to ω·β.
In [DoddJen1] the notion defined above is called “strong acceptability”, and shown to imply a property called “acceptability”; various facts can be proved from the weaker notion. In [Dodd] the weaker notion is given as an axiom of the system RA, and various facts proved in RA.

Some facts which follow from the assumption that M is acceptable will be stated without proof; unless otherwise specified references are to [SchZem]. For a cardinal ρ of M (where M is a transitive ∈-structure), H^M_ρ denotes {x ∈ M : |TC(x)|^M < ρ}.
- A version of GCH holds in M (corollary 2.13 of [DoddJen1]).
- Suppose ρ ∈ M is an infinite cardinal in M, and a ⊆ u where u ∈ J^B_ρ and a ∈ J^B_α; then a ∈ J^B_ρ (1.23).
- Suppose ρ ∈ M is an infinite limit cardinal in M; then J^A_ρ = H^M_(ωρ).
- If ρ_M < α then ρ_M is a cardinal in M (2.2).
- If X ⊆ J_(ρ_M) is Σ_1^M, then ⟨J_(ρ_M), X⟩ is amenable (2.4.a).
- If p^M_1 is a very good parameter then every good parameter is very good (6.8).
- Each Jα is acceptable (9.1).

48. Upward extension.

Lemma 47.7 states how an embedding of an amenable structure K into P(M, p) may be extended to an embedding of a larger structure N into M.
It is often called a “downward extension” lemma, since N is obtained by a collapse. The “dual” problem of extending an embedding of P(M, p) into an amenable structure K, to an embedding of M into a larger structure N, is called the “upward extension” problem.

Suppose for this section that M = ⟨Jα, A⟩ and K = ⟨Jγ, C⟩ are amenable structures (as for lemma 47.7, more general structures can be considered), p ∈ Jα is a very good parameter, and j : P(M, p) ↦ K is Σ1-elementary. Also, let P denote the unary predicate symbol of the expanded language.

To construct N, the pair ⟨i, x⟩ for i ∈ ω and x ∈ Jγ will represent hJ(i, q, x) where q is an appropriate parameter. The construction is based on the assumption that C is A^q_N, which can later be justified. The pair ⟨i, x⟩ can represent an element of N only if ∃y(y = hJ(i, q, x)), which can be expressed as C(⟨iv, ⟨i, x⟩⟩) for some iv; such a pair will be called valid. Similarly, the following predicates may be defined.
- Let ≡K be the relation which holds for the valid pairs ⟨i1, x1⟩ and ⟨i2, x2⟩, if and only if hJ(i1, q, x1) = hJ(i2, q, x2); let i≡ be such that this holds if and only if C(⟨i≡, ⟨i1, x1, i2, x2⟩⟩).
- Let ∈K be the relation which holds for the valid pairs ⟨i1, x1⟩ and ⟨i2, x2⟩, if and only if hJ(i1, q, x1) ∈ hJ(i2, q, x2); let i∈ be such that this holds if and only if C(⟨i∈, ⟨i1, x1, i2, x2⟩⟩).
- Let PK be the relation which holds for the valid pair ⟨i, x⟩, if and only if P(hJ(i, q, x)); let iP be such that this holds if and only if C(⟨iP, ⟨i, x⟩⟩).

Lemma 1. The relation ≡K is an equivalence relation on the valid pairs, and respects ∈K and PK.

Remarks on proof: See the proof of lemma 4.2 of [SchZem]. The relation ≡P(M,p), defined from P(M, p) in the same way as ≡K, is an equivalence relation, since the role of C is played by A^p_M. Using i≡, this may be expressed as a Π1 formula in the expanded language.
Thus, the formula holds of C in K since j is Σ1-elementary by hypothesis, and thus preserves Π1 formulas as well. The argument that ≡K respects ∈K and PK is similar. ⊳

Let K̂ denote the set of equivalence classes of the valid pairs under ≡K. K̂ will also be used to denote the structure with this set as its domain, and where ∈ is interpreted as ∈K and P is interpreted as PK. Let ix (resp. ip) be the number of the formula y = π2(x) (resp. y = π1(x)). Then in K̂, ⟨ix, x⟩ represents x and ⟨ip, ∅⟩ represents the parameter. Let pK denote [⟨ip, ∅⟩].

If S is a structure for the language of set theory, with membership relation ∈S, a substructure T ⊆ S is said to be an initial substructure of S if w ∈ T whenever x ∈ T and w ∈S x. S is also said to be an end extension of T. See definition I.8.2 of [Barwise].

For part c of the following lemma, let R′ be the system of axioms in the language of set theory expanded with a unary predicate symbol, consisting of extensionality, foundation, and the existence conditions for the basis functions for the rudimentary functions; see remarks following lemma 1.4 of [Dodd].

Lemma 2.
a. There is a well-defined map jK : M ↦ K̂, where jK(hJ(i, p, x)) = [⟨i, j(x)⟩].
b. jK is Σ2-elementary.
c. In K̂ the axioms of the system R′ defined above hold; V = L also holds.
d. The map x ↦ [⟨ix, x⟩] is a bijection from Jγ to an initial substructure of K̂.
In the remaining clauses K will be identified with its image under this map.
e. jK is an extension of j.
f. jK(p) = pK.
g. C(⟨i, x⟩) is true in K if and only if φi(x, pK) is true in K̂.
h. Every x ∈ K̂ equals hJ(i, pK, w) for some w ∈ K.

Remarks on proof: See the proof of lemma 4.2 of [SchZem]. For part a, if hJ(i1, p, x1) = hJ(i2, p, x2) then this is attested to by A^p_M, and ⟨i1, x1⟩ ≡K ⟨i2, x2⟩ follows.
For part b, a formula ∀x∃yφ(x, y, z̊) with z̊ ∈ M may be rewritten as ∀ix∀xψ(ix, x, ı̊z, z̊), where ψ is Σ1, underlined variables and constants are in P(M, p), and z̊ = hJ(ı̊z, z̊). ψ is equivalent to a ∆0 formula, and the second claim follows.

Part c is immediate from part b, since these are all expressed by sentences which are at worst Π2, and they hold in M. Part d follows because the formulas ⟨ix, x1⟩ ≡K ⟨ix, x2⟩ ⇒ x1 = x2 and ⟨ix, w⟩ ∈K ⟨ix, x⟩ ⇒ ∃u ∈ x(⟨ix, u⟩ ≡K ⟨ix, w⟩) may be expressed in Π1 form, and hold in P(M, p).

For part e, for x ∈ P(M, p), jK(x) = jK(hJ(ix, p, x)) = [⟨ix, j(x)⟩]. For part f, jK(p) = jK(hJ(ip, p, x)) = [⟨ip, ∅⟩] = pK.

Let T(i, x, q) be the Σ1 predicate defined as in lemma 45.8, for i the number of a Σ1 formula in two free variables. The hypotheses of lemma 45.8 are needlessly restrictive, and T in fact defines the truth predicate in any model of R′. See definition 1.16 of [Dodd]. Also, that T(i, x, q) ↔ φi(x, q) for each i is provable in R′ (exercise for the reader; see for example theorem 16.49 of [TakZar1]). The formula for T may be translated into a formula Ť(i, i1, x1, i2, x2), which holds in P(M, p) if and only if T(i, hJ(i1, x1), hJ(i2, x2)) holds in M. The sentence P(i, x) ⇔ Ť(i, ix, x, ip, ∅) holds in P(M, p), and may be written in Π1 form, so it holds in K. Part g follows.

The formula hJ(i, p, x) = hJ(i, hJ(ip, p, ∅), hJ(ix, p, x)) is expressible (using A^p_M) as a ∆0 formula with free variables i and x. Its universal quantification is true in P(M, p), so is true in K. Using part g, part h follows. ⊳

Lemma 3. Suppose K̂ is well-founded. There is a β and a “transitive collapse” map π : K̂ ↦ Jβ. Let B = π[PK], let q = π(pK), and let N = ⟨Jβ, B⟩. Then K equals P(N, q), q is a very good parameter, and N and q are unique.

Remarks on proof: See the proof of lemma 4.2 of [SchZem]. By remarks following lemma 36.1, the transitive collapse of K̂ may be taken.
Since K̂ is rud-closed and satisfies V = L, an argument similar to the proof of lemma 46.12 shows that π[K̂] is Jβ for some β, where π is the collapse map. The transitive collapse of the image in K̂ of Jγ equals Jγ since it is an initial substructure.

There is a map from ρ_M onto J_(ρ_M), which is Σ1-definable without parameters, and it follows that there is a map from γ onto Jγ, which is Σ1-definable without parameters. Using this and lemma 2.h, it follows that ρ^c_N ≤ γ. Suppose X ⊆ Jγ is Σ_1^N. Then X is ∆_0^K, so by lemma 45.10 if x ∈ Jγ then x ∩ X ∈ Jγ. This shows that γ ≤ ρ^a_N. Since γ ≤ ρ^a_N ≤ ρ^c_N ≤ γ, γ = ρ_N. It now follows by lemma 2.g that C equals A^q_N, and by lemma 2.h that q is a very good parameter.

Suppose N1 and q1, and N2 and q2, satisfy the conclusion of the lemma. By lemma 2.g, A^(q1)_(N1) and A^(q2)_(N2) both equal C. Let σ : N1 ↦ N2 be such that σ(hJ(i, q1, w)) = hJ(i, q2, w) for w ∈ Jγ. Using C and translating Σ1 to ∆0 as usual, it follows that σ is a well-defined function from N1 to N2, and is an isomorphism. Also, σ(q1) = q2. ⊳

49. Fine structural ultrapowers.

There are different varieties of fine structural ultrapowers, and they have many uses in modern set theory. An example of their use will be seen in the next section, in a proof of the covering lemma. They are also used in “core model” theory.

One variety of fine structural ultrapowers is called “extender ultrapowers”. These are fashioned after the extender ultrapowers of large cardinal theory, described in section 43. The version given in [Schindler] will be given here; other authors give different versions.

Suppose j : M ↦ N is a ∆0-elementary embedding. [Schindler] assumes that M and N are acceptable structures, which recall are amenable structures ⟨J^B_α, A⟩ satisfying a certain requirement (and are models of the axiom system RA of [Dodd]).
This is a convenient assumption; in particular, the structures will be transitive, rud-closed, amenable, and models of V = L[B] (lemma 5.17 of [Schindler]). Δ_0 separation holds (lemma 1.2 of [Dodd]). In applications here, B is invariably ∅.

Say that j is ∈-cofinal if, for all x ∈ N, there is some w ∈ M such that x ∈ j(w). [Schindler] assumes that j is ∈-cofinal. However, given any j, let N′ be the downward closure under ∈ of j[M]. Then j, considered as a map to N′ (such a map is called a co-restriction), is ∈-cofinal. It is also Δ_0-elementary, and sup(j[M] ∩ Ord) = N′ ∩ Ord.

Lemma 1. If j : M ↦ N is ∈-cofinal and j is a Δ_0-elementary embedding then j is Σ_1-elementary.

Proof: Suppose F_{x,y⃗} is a Δ_0 formula, and x̊ ∈ N and ẙ_1, …, ẙ_n ∈ M are such that ⊨_N F(x̊, j(ẙ_1), …, j(ẙ_n)). Since j is ∈-cofinal, there is some ẘ ∈ M such that ⊨_N ∃x ∈ j(ẘ) F(x, j(ẙ_1), …, j(ẙ_n)). Since j is Δ_0-elementary, ⊨_M ∃x ∈ ẘ F(x, ẙ_1, …, ẙ_n), whence ⊨_M ∃x F(x, ẙ_1, …, ẙ_n). ⊳

Thus, suppose j : M ↦ N is Δ_0-elementary (if needed, its co-restriction to N′ as above can be considered). Suppose that j is nontrivial, and κ is the least ordinal moved. Suppose λ < sup(j[M] ∩ Ord). For each a ∈ [λ]^{<ω} let µ_a be the smallest µ such that a ⊆ j(µ); and let E_a = {X ⊆ [µ_a]^|a| : X ∈ M, a ∈ j(X)}. Given a ⊆ b the map π_ba may be defined as in section 43, except now π_ba : [µ_b]^|b| ↦ [µ_a]^|a|. For X ⊆ [µ_a]^|a| let X_ab = π_ba^{−1}[X] ∩ [µ_b]^|b| (a subset of [µ_b]^|b|). Given a function f with domain [µ_a]^|a| let f_ab be f ∘ π_ba (a function with domain [µ_b]^|b|).

Recall the definition of an M-ultrafilter from section 44; and the notation s_i and i(ξ, s) used in the proof of lemma 43.1.

Lemma 2.
a. µ_a is the least µ ∈ M such that [µ]^|a| ∈ E_a.
b. E_a is an M-ultrafilter on [µ_a]^|a|, which is κ-complete for sequences in M.
c. For X ⊆ [µ_a]^|a| with X ∈ M, X ∈ E_a if and only if X_ab ∈ E_b.
d. µ_{κ} = κ.
e.
Suppose f : [µ_a]^|a| ↦ µ_a where f ∈ M, and suppose {s ∈ [µ_a]^|a| : f(s) < max(s)} ∈ E_a. Then for some β < max(a), {s ∈ [µ_a]^|b| : f_ab(s) = s_{i(β,b)}} ∈ E_b where b = a ∪ {β}.

Remarks on proof: This is lemma 10.18 of [Schindler], where the proof is left as an exercise. For part a, write n for |a|; [µ]^n ∈ E_a if and only if a ∈ j([µ]^n). Also, j([µ]^n) = [j(µ)]^n. It follows readily that a ⊆ j(µ) if and only if a ∈ j([µ]^n). For part b, the proof of theorem 36.4 may be adapted, as already indicated in section 44 for M-ultrafilters. That [µ_a]^|a| ∈ E_a is needed, which was already proved in part a. Part c follows as lemma 43.1.e, noting that b ∈ j([µ_b]^|b|). Part d follows because κ ∉ j(α) for α < κ, but κ ∈ j(κ). Part e follows as lemma 43.1.f, with the following modifications. Let X denote {s ∈ [µ_a]^|a| : f(s) < max(s)}. ∀s ∈ X ∃β < max(s)(f(s) = β) is true in M, and after the substitution is true in N. Let Y denote {s ∈ [µ_a]^|b| : f_ab(s) = s_{i(β,b)}}. ∀s∀a∀β(P ⇒ s ∈ Y) is true in M, so replacing f by j(f) and Y by j(Y) it is true in the reduced codomain N′. ⊳

In this section, by a (κ, λ)-extender over M is meant a system of ultrafilters with the properties of lemma 2. As in section 43, given such, an ultraproduct may be taken. U_0, ≡_0, and ∈_0 are defined as in section 43. [ζ]^|a| is replaced by [µ_a]^|a|, etc.; and a function f must be in M. In addition, let P_0 be the unary predicate, where P_0(⟨a, f⟩) if and only if {s ∈ [µ_a]^|a| : P(f(s))} ∈ E_a.

Lemma 3. ≡_0 is a congruence relation on U_0, equipped with ∈_0 and P_0.

Remarks on proof: This is stated in the proof of theorem 10.20 of [Schindler]. The proof is as that for lemma 43.2, with the following modifications. That ≡_0 is reflexive follows from [µ_a]^|a| ∈ E_a. Given a, a′ let c = a ∪ a′. Suppose P(f(s)) on s ∈ X where X ∈ E_a, and f_ac(s) = f′_a′c(s) for s ∈ X_c where X_c ∈ E_c. Then f_ac(s) = f′_a′c(s) for s ∈ X_ac ∩ X_c, which proves that ≡_0 respects P_0.
⊳

As in section 43, let E denote the extender and let Ult^E_0(M) be the quotient of U_0 by ≡_0. This is a structure for the language of set theory expanded with a unary predicate symbol. To simplify the notation, write [a, f] for [⟨a, f⟩].

Lemma 4 (Łoś theorem). Suppose φ is a Δ_0 formula. Letting N denote Ult^E_0(M), suppose [a_i, f_i] is an element of N for 1 ≤ i ≤ k. Let c = ∪_i a_i. Then
a. ⊨_N φ([a_1, f_1], …, [a_k, f_k]) if and only if
b. X_φ ∈ E_c where X_φ = {s ∈ [µ_c]^|c| : ⊨_M φ(f_{1,a_1c}(s), …, f_{k,a_kc}(s))}.

Remarks on proof: This is “claim 1” in the proof of theorem 10.20 of [Schindler]. The proof is the same as that of lemma 43.3, with some modifications. V is replaced by M. The claim for φ = ¬ψ follows using X_φ = [µ_c]^|c| − X_ψ. Free variables may be added to subformulas using property (c) of an extender. Suppose φ_{y,x⃗} is ∃z ∈ y ψ_{z,x⃗}. To prove b⇒a, suppose {s : ⊨_M φ(g_{bc_1}(s), f_{1,a_1c_1}(s), …, f_{k,a_kc_1}(s))} ∈ E_{c_1}; and let h : [µ_{c_1}]^|c_1| ↦ ∪Ran(g) be the function where h(s) is the <_J-least y ∈ ∪Ran(g) such that ⊨_M y ∈ g(s) ∧ ψ(y, f_{1,a_1c_1}(s), …, f_{k,a_kc_1}(s)) if such a y exists, else ∅. By the assumptions on M and the fact that g ∈ M, it follows that h ∈ M. Further, {s : ⊨_M h(s) ∈ g(s) ∧ ψ(h(s), f_{1,a_1c_1}(s), …, f_{k,a_kc_1}(s))} ∈ E_{c_1}. Using the induction hypothesis, ⊨_N [c_1, h] ∈ [b, g] ∧ ψ([c_1, h], [a_1, f_1], …, [a_k, f_k]); and a follows. ⊳

For x ∈ M let C_x denote the function {⟨∅, x⟩} (this is a map from [µ_∅]^0 to M). Let j^E_0 : M ↦ Ult^E_0(M) be the map given by j^E_0(x) = [∅, C_x]. Let U_α, with domain [µ_{α}]^1, be the function {⟨{ξ}, ξ⟩ : ξ < µ_{α}}. For a ∈ [λ]^{<ω} let I_a denote the identity function on [µ_a]^|a|. Clearly U_α, I_a ∈ M.

Lemma 5.
a. j^E_0 is a Δ_0-elementary embedding.
b. j^E_0 is ∈-cofinal.
c. For α < κ, the elements of [∅, C_α] are the elements [∅, C_β] for β < α.
d. For α < λ, the elements of [{α}, U_α] are the elements [{β}, U_β] for β < α.
e.
The elements of [a, I_a] in Ult^E_0(M) are the elements [{α}, U_α] for α ∈ a.
f. For all [a, f] in Ult^E_0(M), [∅, f]([a, I_a]) = [a, f].
g. For all a ∈ [λ]^{<ω}, X ∈ E_a if and only if a ∈ j^E_0(X).
h. The critical point of j^E_0 equals κ.

Remarks on proof: This is proved in the proof of theorem 10.20 of [Schindler]. For part a, in the notation of the proof of lemma 4, X_φ(C_{x_1}, …, C_{x_k}) equals {⟨∅, …, ∅⟩} if ⊨_M φ(x_1, …, x_k), else ∅. That j^E_0 is Δ_0-elementary follows using lemma 4, and since equality is present j^E_0 is injective. For part b, suppose [⟨a, f⟩] is an element of Ult^E_0(M). Let x = Ran(f). It is easily seen that X_∈(f, C_x) = [µ_a]^|a|, whence by lemma 4 [a, f] ∈ [∅, C_x]. Part c is proved as lemma 43.6.b, except the fact that E_a is an M-κ-complete M-ultrafilter is used. Part d is proved as lemma 43.6.c, except U_α is used rather than U; property (e) of a (κ, λ)-extender over M is used; [µ_a]^|a| is used rather than [ζ]^|a|, etc.; c_2 = c_1 ∪ {β}; and x ∈ y ⇒ ¬(y ∈ x ∨ y = x) holds in Ult^E_0(M) by parts a and b and lemma 1. Part e is proved as lemma 43.6.d. Parts f and g are proved as lemmas 43.6.e and 43.6.f, except s ∈ [µ_a]^|a|. For part h, by parts c and d it suffices to show that [{κ}, U_κ] ∈ [∅, C_κ]. By property (d) of a (κ, λ)-extender over M, µ_{κ} = κ, whence {s ∈ [µ_{κ}]^1 : U_κ(s) ∈ κ} equals [µ_{κ}]^1 and is in E_{κ}. ⊳

Theorem 10.20 of [Schindler] states that j^E_0 has some properties, easily derived from the foregoing. It also states that Ult^E_0(M) and j^E_0 are characterized by these properties.

If Ult^E_0(M) is well-founded its transitive collapse may be taken; Ult^E(M) will be used to denote this; j^E denotes the composition of the transitive collapse map with j^E_0. If Ult^E_0(M) is well-founded the comments following lemma 43.6 apply (with M rather than V, etc.; in particular E_a = {X ⊆ [ζ]^|a| : X ∈ M and a ∈ j^E(X)}).

A second variety of fine structural ultrapowers, called pseudo-ultrapowers, will be briefly covered.
An early construction of these may be found in [DoddJen2]; [Welch2] contains a more recent treatment. For simplicity only amenable structures of the form ⟨J_α, A⟩ will be considered; but the treatment may be extended to acceptable structures.

Suppose
- j : J_γ̄ ↦ J_γ is a Δ_0-elementary embedding,
- M = ⟨J_α, A⟩ is an amenable structure with α ≥ γ̄,
- F is a set of functions f with f ∈ J_α and Dom(f) ∈ J_γ̄, and
- λ ≤ ω · γ is an ordinal.

Let U_0 = {⟨a, f⟩ : f ∈ F, a ∈ [λ]^{<ω}, and a ∈ j(Dom(f))}. Given f, g ∈ F let D_=(f, g) = {⟨u, v⟩ : f(u) = g(v)}, let D_∈(f, g) = {⟨u, v⟩ : f(u) ∈ g(v)}, and let D_P(f) = {u : P(f(u))}. Say that F is good for γ̄ if for all f, g ∈ F, D_=(f, g), D_∈(f, g), and D_P(f) are in J_γ̄.

Supposing F to be good for γ̄, let ≡_0 be the binary relation on U_0, where ⟨a, f⟩ ≡_0 ⟨b, g⟩ if and only if ⟨a, b⟩ ∈ j(D_=(f, g)). Let ∈_0 be the binary relation, where ⟨a, f⟩ ∈_0 ⟨b, g⟩ if and only if ⟨a, b⟩ ∈ j(D_∈(f, g)). Let P_0 be the unary predicate, where P_0(⟨a, f⟩) if and only if a ∈ j(D_P(f)).

Lemma 6. ≡_0 is a congruence relation on U_0, equipped with ∈_0 and P_0.

Remarks on proof: This is stated without proof in remarks preceding proposition 3.9 of [Mitchell2]; see also remarks following definition 3.4 in [DoddJen2]. Let d_s = {⟨u, u⟩ : u ∈ s}; that t = d_s is expressible by a Δ_0 formula, so j(d_s) = d_{j(s)}. Thus, if a ∈ j(Dom(f)) then ⟨a, a⟩ ∈ j(d_{Dom(f)}). Since d_{Dom(f)} ⊆ D_=(f, f), ⟨a, a⟩ ∈ j(D_=(f, f)), which shows that ≡_0 is reflexive. If r is a binary relation then j(r) is a binary relation, and j commutes with the transpose operation. It follows that ≡_0 is symmetric. Similarly, j commutes with composition of binary relations, and it follows that ≡_0 is transitive. The composition of D_=(f′, f), D_∈(f, g), and D_=(g, g′) is contained in D_∈(f′, g′). It follows that if ⟨a′, a⟩ ∈ j(D_=(f′, f)), ⟨a, b⟩ ∈ j(D_∈(f, g)), and ⟨b, b′⟩ ∈ j(D_=(g, g′)) then ⟨a′, b′⟩ ∈ j(D_∈(f′, g′)).
That ≡_0 respects P_0 follows similarly, since the composition of D_=(g, f) and D_P(f) is contained in D_P(g). ⊳

Let Ult^{jλ}_0(M) be the quotient of U_0 by ≡_0, provided this exists, that is, provided F is good for γ̄. This is a structure for the language of set theory expanded with a unary predicate symbol. For a Δ_0 formula φ over this language, and f_1, …, f_n ∈ F, let D_φ(f_1, …, f_n) denote {⟨u_1, …, u_n⟩ : ⊨_M φ(f_1(u_1), …, f_n(u_n))}. As previously, write [a, f] for [⟨a, f⟩].

Lemma 7. Suppose F is good for γ̄, and let N denote Ult^{jλ}_0(M). Suppose φ is a Δ_0 formula, f_1, …, f_n ∈ F, and a_i ∈ [λ]^{<ω} ∩ j(Dom(f_i)) for 1 ≤ i ≤ n. Then D_φ(f_1, …, f_n) ∈ J_γ̄, and
a. ⊨_N φ([a_1, f_1], …, [a_n, f_n]) if and only if
b. ⟨a_1, …, a_n⟩ ∈ j(D_φ(f_1, …, f_n)).

Proof: Write M′ for J_γ̄. The claim holds for atomic formulas by definition. The claim for φ = ¬ψ follows using the claim for ψ, and the fact that D_φ(f_1, …, f_n) = Dom(f_1) × · · · × Dom(f_n) − D_ψ(f_1, …, f_n) (and the fact that j commutes with Cartesian product, set difference, etc.). The claim for φ = ψ ∧ θ follows using D_φ(f_1, …, f_n) = D_ψ(f_1, …, f_n) ∩ D_θ(f_1, …, f_n) (and the fact that D_ψ(f_1, …, f_n) may be obtained from D_ψ(f_{i_1}, …, f_{i_t}) and Dom(f_i) for 1 ≤ i ≤ n by a rudimentary function, etc.).

Suppose φ is ∃z ∈ y ψ(z, x_1, …, x_n), and g, f_1, …, f_n ∈ M have domains in M′. Let h be the function with domain Dom(g) × Dom(f_1) × · · · × Dom(f_n), where h(v, u_1, …, u_n) equals the <_J-least z ∈ g(v) such that ⊨_M ψ(z, f_1(u_1), …, f_n(u_n)), or ∅ if there is no such z. Clearly Dom(h) ∈ M′. To see that h ∈ M, in M let r = {⟨z, v, u⃗⟩ : z ∈ g(v) ∧ ψ(z, f_1(u_1), …, f_n(u_n))}; then r ∈ M follows by Δ_0-separation. Thus, r ∈ S_ν for a sufficiently large ν ∈ M. The restriction o of <_J to S_ν is in M (lemma 2.7.iii of [Devlin], or lemma 2.23 of [Dodd] for more general structures).
Finally w = h(v, u⃗) if and only if r(w, v, u⃗) ∧ ∀z ∈ g(v)(⟨z, w⟩ ∈ o ⇒ ¬r(z, v, u⃗)).

Let θ denote z ∈ y ∧ ψ(z, x_1, …, x_n). Using the induction hypothesis, it follows that D_θ(h, g, f⃗) ∈ M′, from which D_φ(g, f⃗) ∈ M′ follows, since it is the composition with w = ⟨v, u⃗⟩, restricted to Ran(g) × Dom(g) × Dom(f_1) × · · · × Dom(f_n). If b holds then h witnesses that a does. Suppose a holds (with [b, g] for the value of y), and let [c, h] be a value of z witnessing the fact. Using the induction hypothesis it follows that ⟨c, a_1, …, a_n⟩ ∈ j(D_ψ(f_1, …, f_n)) and ⟨c, b⟩ ∈ j(D_∈(h, g)); ⟨b, a_1, …, a_n⟩ ∈ j(D_φ(g, f_1, …, f_n)) follows. ⊳

For x ∈ M let C_x denote the function {⟨∅, x⟩}. Let j^{jλ}_0 : M ↦ Ult^{jλ}_0(M) be the map given by j^{jλ}_0(x) = [∅, C_x].

Lemma 8.
a. j^{jλ}_0 is a Δ_0-elementary embedding.
b. j^{jλ}_0 is ∈-cofinal.

Proof: For part a, D_φ(C_{x_1}, …, C_{x_n}) equals {⟨∅, …, ∅⟩} if ⊨_M φ(x_1, …, x_n), else ∅. The result follows using lemma 7. For part b, suppose [a, f] is an element of Ult^{jλ}_0(M). Let x = Ran(f). Then D_∈(f, C_x) = Dom(f) × {∅}, so ⟨a, ∅⟩ ∈ j(D_∈(f, C_x)), that is, [a, f] ∈ j^{jλ}_0(x). ⊳

Lemma 9. Suppose j : J_α ↦ J_β is a Σ_1-elementary embedding, γ ≤ α, and j ↾ ω · γ is the identity map. Then j ↾ J_γ is the identity map.

Proof: Since ξ ↦ S_ξ is Σ_1, j(S_ξ) = S_ξ for ξ < ω · γ. It follows by induction that for such ξ, j ↾ S_ξ is the identity. ⊳

Lemma 10. If j is nontrivial then j^{jλ}_0 is nontrivial.

Proof: Consider the co-restriction of j as in remarks preceding lemma 1. Using lemmas 1 and 9, it follows that j ↾ ω · γ̄ is not the identity. As in the proof of lemma 36.3, there is a least ordinal κ such that j(κ) ≠ κ, and j(κ) > κ. For x ∈ J_γ̄ let I_x denote the identity function restricted to x. Let x = {{ζ} : ζ < κ}. Then ∀ζ(ζ < κ ⇒ {ζ} ∈ x), so ∀ζ(ζ < j(κ) ⇒ {ζ} ∈ j(x)), so {κ} ∈ j(x). Thus [{κ}, I_x] is in Ult^{jκ}_0(M). D_=(I_x, C_y) is empty unless y ∈ x, in which case it equals {⟨y, ∅⟩}.
It follows that ⟨a, I_x⟩ ≡_0 ⟨∅, C_y⟩ if and only if y ∈ x and a = j(y); and thus [{κ}, I_x] does not equal [∅, C_y] for any y ∈ J_γ̄. ⊳

Lemma 11. Suppose N = Ult^{jκ}_0(M) is well-founded and let π : N ↦ J_ᾱ be the collapsing isomorphism. Let γ′ = sup(j[ω · γ̄]). Then β ∩ γ′ ⊆ J_ᾱ; and if ξ ∈ J_γ̄ and j(ξ) < β then π(j^{jλ}_0(ξ)) = j(ξ).

Proof: Suppose ζ ∈ β ∩ γ′; then there is a θ ∈ J_γ̄ such that ζ < j(θ). Let J_θ = {⟨{ζ}, {ζ}⟩ : ζ < θ}. Since D_=(J_θ1, J_θ2) = {⟨{ζ}, {ζ}⟩ : ζ < min(j(θ_1), j(θ_2))}, ⟨{ζ_1}, J_θ1⟩ ≡_0 ⟨{ζ_2}, J_θ2⟩ if and only if ζ_1 = ζ_2 and ζ_1 < min(j(θ_1), j(θ_2)). Let φ(x) be the Δ_0 formula stating that x is a singleton set whose element is an ordinal. Then D_φ(J_θ) = Dom(J_θ), and using lemma 7 it follows that if ζ < j(θ) then ⊨_N φ([{ζ}, J_θ]). Let ψ(x_1, x_2) be the Δ_0 formula stating that x_1 = {ζ_1}, x_2 = {ζ_2}, and ζ_1 < ζ_2. Again using lemma 7, it follows that if ζ_1 < ζ_2 < j(θ) then ⊨_N ψ([{ζ_1}, J_θ], [{ζ_2}, J_θ]). Let ζ′ be the ordinal such that π([{ζ}, J_θ]) = {ζ′}. Thus, if ζ_1 < ζ_2 then ζ_1′ < ζ_2′. It follows that ζ′ = ζ. That j({ξ}) = [j({ζ}), J_θ] follows by facts given in the proof of lemma 10, and the last claim follows. ⊳

In applications of pseudo-ultrapowers, various methods are used to ensure that the collection of functions F is good for γ̄.

50. The covering lemma.

As mentioned in section 40, modern proofs of the covering lemma make use of fine structure theory. The proof given in [Schindler] will be outlined here. It involves the use of fine structural extender ultrapowers. A proof of the covering lemma using fine structural pseudo-ultrapowers may be found in [Rasch].

The proof makes use of two definitions, that of an F-dense set, and that of a specific set W. Lemmas 5 and 6 will prove properties of these, which will be used in the proof of the covering lemma in theorem 7.

Suppose κ is an uncountable cardinal and A is a set with |A| ≥ κ. Recall that [A]^κ denotes the set of subsets of A of cardinality κ.
Say that a subset S ⊆ [A]^κ is F-dense if, whenever {f_γ : γ < η} for η ≤ κ is a set of functions from A to A, then there is a set x ∈ S such that f_γ[x] ⊆ x for all γ < η.

For the definition of W, some preliminary definitions are needed. As in earlier sections, Σ_n will be used as an abbreviation for “Σ_n-definable in J_β from parameters”. For ordinals α ≤ ω · β and n ∈ ω, α is said to be a Σ_n-cardinal in J_β if there is no Σ_n function f, with Dom(f) a bounded subset of α and Ran(f) = ω · β. In the case n = 0, equivalently, if there is no function f ∈ J_β, with Dom(f) an ordinal γ < α and Ran(f) = ω · β. α will be said to be a cardinal in J_β if, as usual, α ∈ J_β and the preceding holds.

Lemma 1. Suppose α ≤ β, n > 0, and in J_β, ω · α is a Σ_{n−1}-cardinal but not a Σ_n-cardinal. Then ρ^n_β < α ≤ ρ^{n−1}_β. Further ω · ρ^n_β is the least ρ such that there is a Σ_n map of a subset of ρ onto ω · α.

Remarks on proof: The first two claims are lemma 2.1.i of [DevJen]. If ω · α is a Σ_l-cardinal then there is no map onto ω · α, whence clearly there is none onto ω · β. ρ^l_β ≥ α follows by theorem 47.4 (characterization (c) of ρ^n_β). Let ρ be least such that there is a Σ_n map from a subset of ω · ρ onto ω · α; by hypothesis ρ < α. Again using theorem 47.4, ρ^n_β ≥ ρ. Let f be Σ_n, such that Dom(f) ⊆ ω · ρ and Ran(f) = J_α. Let a = {ν ∈ Dom(f) : ν ∉ f(ν)}. If a ∈ J_α then a = f(ν) for some ν, and ν ∈ f(ν) ⇒ ν ∉ f(ν); thus, a ∉ J_α. Since a is Σ_n, it follows by theorem 47.4 (characterization (b) of ρ^n_β) that ρ^n_β ≤ ρ. Thus, ρ = ρ^n_β and ρ^n_β < α. ⊳

Lemma 2. If n > 0 then ρ^n_β is a Σ_n-cardinal in J_β.

Proof: This follows by theorem 47.4; if there is a map f : γ ↦ ω · ρ^n_β then there is a map g : γ ↦ ω · β. ⊳

Suppose β is the largest ordinal such that κ is a cardinal in J_β. Then κ is not a cardinal in J_{β+1}, so there exists a surjection f : γ ↦ κ with γ < κ and f ∈ J_{β+1}. Such an f is Σ_n for some n. Let n be least such that there is such an f.
Then in J_β, κ is a Σ_{n−1}-cardinal but not a Σ_n-cardinal.

Let η be an ordinal and let ⟨κ_i : i < α⟩ be the cardinals of J_η. For i < α let β_i be the largest β such that κ_i is a cardinal in J_β if such exists, else ∞. Let κ_α equal ω · η. Define β_α as for i < α, except that it equals η if ω · η is not a cardinal in J_{η+1}.

Lemma 3. Suppose i, j ≤ α.
a. β_i ≥ η.
b. If η ≤ β < β_i then for all n, ρ^n_β ≥ κ_i.
c. If β_i < ∞ then for some n, ρ^n_{β_i} < κ_i.
d. If i ≤ j then β_j ≤ β_i.
e. {β_i : i < α} is finite.

Remarks on proof: These observations are made preceding lemma 10.32 of [Schindler]. Part a is immediate. Parts b and c follow by lemma 1 and the remarks following it. For part d, there is nothing to prove if β_i = ∞, so suppose otherwise. Then for some n, ρ^n_{β_i} < κ_i < κ_j. However if β_i < β_j then ρ^n_{β_i} ≥ κ_j for all n. Part e is an immediate consequence of part d. ⊳

Let n_i be the least n such that ρ^{n+1}_{β_i} < κ_i ≤ ρ^n_{β_i} if β_i < ∞, else 0.

Lemma 4. If i ≤ j and β_i = β_j then n_j ≤ n_i.

Remarks on proof: This observation is also made preceding lemma 10.32 of [Schindler]. The claim is immediate if β_i = ∞. Otherwise, ρ^{n_i+1}_{β_i} < κ_i ≤ κ_j ≤ ρ^{n_i}_{β_i} and the claim follows. ⊳

By lemma 2 each ρ^{n_i+1}_{β_i} equals κ_l for some l; let I_Y be the set of such l, together with α. By lemmas 3 and 4 I_Y is finite.

Suppose that µ is a regular cardinal, Y ⊆ H_µ is an elementary substructure, and j : X ↦ H_µ is the inverse of the transitive collapse map for Y. Suppose j is nontrivial and let κ_c denote the critical point.
- Let η = X ∩ Ord, and let α, and κ_i and β_i for i ≤ α, be defined as above.
- For i ≤ α, if κ_c < κ_i let ν_i = sup(j[κ_i]), and let E_i be the (κ_c, ν_i)-extender derived from j ↾ J_{κ_i}.
- Recall the definition of P^n from section 47. Since κ_i ≤ ρ^{n_i}_{β_i}, there is a map j_i : P^{n_i}(J_{β_i}) ↦ Ult^{E_i}(P^{n_i}(J_{β_i})).
- If Ult^{E_i}(P^{n_i}(J_{β_i})) is well-founded let i : J_{β_i} ↦ M_i denote the map given by lemma 48.2.

Suppose κ is an uncountable cardinal, θ ≥ κ is an ordinal, and µ > θ is a regular cardinal.
With notation as above, let W denote the set of elementary substructures Y ⊆ H_µ such that M_i exists and is well-founded for all i ∈ I_Y with κ_c < κ_i.

Lemma 5. {Y ∩ θ : Y ∈ W and |Y ∩ θ| = κ} is an F-dense subset of [θ]^κ.

Remarks on proof: This is proved in lemma 10.33 of [Schindler] for κ a regular cardinal; the general case then follows by lemma 10.34. ⊳

Lemma 6. If there is no nontrivial embedding e : L ↦ L then {Y ∩ Ord : Y ∈ W} ∈ L.

Remarks on proof: This is lemma 10.32 of [Schindler]. ⊳

It should be mentioned that various details are omitted from the proofs of lemmas 5 and 6 cited above; no effort to provide these will be made here, though. Various references have other proofs of the covering lemma, including [DevJen], [Devlin], [Jech2]; some of these do not use fine structure theory.

Theorem 7. The following are equivalent.
a. There is no nontrivial elementary embedding j : L ↦ L.
b. If κ is an uncountable cardinal and θ is a cardinal such that θ ≥ κ, then [θ]^κ ∩ L is an F-dense subset of [θ]^κ.
c. If x is an uncountable set of ordinals then there is a constructible set of ordinals y such that x ⊆ y and |y| = |x|.
d. 0# does not exist.

Remarks on proof: That a⇒b follows by lemmas 5 and 6. For b⇒c, suppose x as in c is given. Let κ = |x| and let θ ≥ κ be a cardinal such that x ⊆ θ. Let γ ↦ x_γ be a bijection from κ to x. Let f_γ be the function which is constantly x_γ on θ. By part b, there is a constructible set y ∈ [θ]^κ such that x_γ ∈ y for all γ, so that x ⊆ y. The proof that c⇒d may be found in remarks preceding corollary 18.31 of [Jech2]. Suppose 0# exists. First, if κ is an uncountable cardinal then κ is regular in L. This follows because the Silver indiscernibles are “L-indiscernibles”, the uncountable cardinals are Silver indiscernibles, and ℵ_1 is regular in L (using lemma 30.3); see theorem 2.15 of [Devlin], and also theorem 38.9. In particular ℵ_ω is regular in L.
Thus, ℵ_1 ∪ {ℵ_n : n ∈ ω} cannot be covered by any constructible set of ordinals of cardinality less than ℵ_ω. Part d implies part a by theorem 44.5. ⊳

The implication d⇒c is the classical covering lemma. An “official” proof first appeared in [DevJen]; slightly earlier, handwritten notes containing a proof had been provided by Jensen. ¬0# is a weaker hypothesis than V = L. When part c was seen to follow from it, various consequences for the universe of sets were seen to hold, some of which will be given in the next section. The implication a⇒d was already proved, in theorem 44.1.

51. Cardinal arithmetic.

The notions of κ^+, ℵ_α, κ + λ, and κ · λ are defined in section 13; κ^λ is defined in section 14. Various basic facts about these operations have already been given. κ + λ and κ · λ are determined by ZFC. By results of sections 20 and 22, however, 2^ℵ0 is not determined.

Suppose ⟨κ_i : i ∈ I⟩ is a set of cardinal numbers. Let Σ_{i∈I} κ_i = |D| where D is the disjoint union of the κ_i. Let Π_{i∈I} κ_i = |C| where C is the Cartesian product of the κ_i, that is, {f : I ↦ ∪_i κ_i} where f(i) ∈ κ_i for all i ∈ I.

Theorem 1 (König’s theorem). If κ_i < λ_i for all i ∈ I then Σ_{i∈I} κ_i < Π_{i∈I} λ_i.

Proof: This is theorem 5.10 of [Jech2]. Let C be the Cartesian product of the λ_i. Let π_i be the projection from C to the i-th coordinate. Suppose D_i ⊆ C is a subset with |D_i| ≤ κ_i, for each i ∈ I. Then |π_i[D_i]| ≤ κ_i < λ_i, so π_i[D_i] ⊂ λ_i. Thus there is an f ∈ C such that f(i) ∉ π_i[D_i] for any i, and so f ∉ D_i for any i. ⊳

Corollary 2. κ^cf(κ) > κ.

Proof: This is corollary 5.14 of [Jech2]. If κ = Σ_{i<cf(κ)} κ_i where κ_i < κ then κ < Π_{i<cf(κ)} κ = κ^cf(κ). ⊳

There are restrictions on the behavior of 2^λ and κ^λ which are provable in ZFC. In particular:
- If κ_1 ≤ κ_2 then 2^{κ_1} ≤ 2^{κ_2}.
- κ < cf(2^κ).

Theorem 3 (Easton’s theorem). Suppose F is a function from cardinals to cardinals such that if κ_1 ≤ κ_2 then F(κ_1) ≤ F(κ_2), and κ < cf(F(κ)).
Then there is a model of ZFC in which 2^κ = F(κ) for any regular cardinal κ.

Remarks on proof: This is theorem 15.18 of [Jech2]. The proof uses “Easton forcing”, a type of forcing with a class of conditions, over a ground model satisfying GCH. ⊳

The situation for singular cardinals is more complicated.

Theorem 4.
a (Bukovský-Hechler). Suppose κ is a singular cardinal. If 2^λ = µ for all sufficiently large λ < κ then 2^κ = µ.
b (Silver). Suppose κ is a singular cardinal of uncountable cofinality. If 2^λ = λ^+ for all cardinals λ < κ then 2^κ = κ^+.
c (Galvin-Hajnal). Suppose κ = ℵ_α is a strong limit singular cardinal of uncountable cofinality. Then 2^κ < ℵ_γ where γ = (2^|α|)^+.
d (Shelah). Suppose ℵ_ω is a strong limit cardinal. Then 2^ℵω < ℵ_{ℵ_4}.

Remarks on proof: These may all be found in [Jech2]. Part a is corollary 5.17, part b is theorem 8.12, part c is theorem 24.1, and part d is theorem 24.33. ⊳

The proof of theorem 4.d uses “PCF theory”, a theory concerning ultraproducts of sets of regular cardinals, which has found a variety of applications. Additional consequences for cardinal arithmetic have been stated by various authors (although these were doubtless known to Shelah). For example,
- Suppose δ is a limit ordinal and |δ|^cf(δ) < ℵ_δ. Then ℵ_δ^cf(δ) < ℵ_γ where γ = |δ|^{+4} (theorem 7.3 of [AbrMag]).

The “singular cardinals problem” is to give a set of rules describing the behavior of the function 2^κ on singular cardinals. This turns out to depend on what types of large cardinals are allowed. See [Gitik] for a survey.

The function λ^κ is not determined by the function 2^κ. It is determined by the function κ^cf(κ); see corollary 5.18 and corollary 5.21 of [Jech2]. It is determined if GCH holds, as follows.

Theorem 5. Suppose GCH holds and κ, λ are infinite cardinals.
- If κ < cf(λ) then λ^κ = λ.
- If cf(λ) ≤ κ < λ then λ^κ = λ^+.
- If λ ≤ κ then λ^κ = κ^+.

Remarks on proof: This is theorem 5.15 of [Jech2]; the third claim was already proved in theorem 14.5.
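As an illustration of the three cases of theorem 5 (a check added here, not part of the cited proof; the particular cardinals are chosen only as examples), assume GCH and instantiate λ and κ as follows.

```latex
% Worked instances of theorem 5, assuming GCH throughout.
% The specific cardinals below are illustrative choices.
\begin{align*}
% Case 1: \kappa < \mathrm{cf}(\lambda), so \lambda^\kappa = \lambda.
% \aleph_2 is regular, so \mathrm{cf}(\aleph_2) = \aleph_2 > \aleph_1;
% \mathrm{cf}(\aleph_{\omega_1}) = \aleph_1 > \aleph_0.
\aleph_2^{\aleph_1} &= \aleph_2, &
\aleph_{\omega_1}^{\aleph_0} &= \aleph_{\omega_1}, \\
% Case 2: \mathrm{cf}(\lambda) \le \kappa < \lambda, so \lambda^\kappa = \lambda^+.
% \mathrm{cf}(\aleph_\omega) = \aleph_0 and \mathrm{cf}(\aleph_{\omega_1}) = \aleph_1.
\aleph_\omega^{\aleph_0} &= \aleph_{\omega+1}, &
\aleph_{\omega_1}^{\aleph_1} &= \aleph_{\omega_1+1}, \\
% Case 3: \lambda \le \kappa, so \lambda^\kappa = \kappa^+ = 2^\kappa.
\aleph_1^{\aleph_3} &= \aleph_4, &
\aleph_0^{\aleph_0} &= \aleph_1.
\end{align*}
```

Note that case 2 can only arise when λ is singular, since a regular λ has cf(λ) = λ.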
⊳

By corollary 2, κ^cf(κ) ≥ κ^+. If GCH holds then κ^cf(κ) = κ^+ (theorem 5). The singular cardinals hypothesis (SCH) is the statement that κ^cf(κ) = κ^+ for any cardinal κ such that 2^cf(κ) < κ. Note that for the hypothesis to hold, κ must be singular. SCH was isolated around 1970 as being of particular interest. If SCH holds then the function λ^κ is determined by the function 2^κ, as follows.

Theorem 6. Suppose SCH holds and κ, λ are infinite cardinals.
- If λ > 2^κ and κ < cf(λ) then λ^κ = λ.
- If λ > 2^κ and κ ≥ cf(λ) then λ^κ = λ^+.
- If λ ≤ 2^κ then λ^κ = 2^κ.

Remarks on proof: This is theorem 5.22.ii of [Jech2]. ⊳

Theorem 7. Suppose ¬0#.
a. SCH holds.
b. If κ is singular then κ is singular in L.
c. If κ is singular then (κ^+)^L = κ^+.

Remarks on proof: Part a is corollary 18.33 of [Jech2]. Suppose 2^cf(κ) < κ (whence κ is singular). Let C = {Y ⊆ κ : Y ∈ L and |Y| = max(ℵ_1, cf(κ))}. Then |C| ≤ |Pow^L(κ)| = |(κ^+)^L| ≤ κ^+. Also, |[Y]^cf(κ)| = max(ℵ_1, cf(κ))^cf(κ) = 2^cf(κ). If X ∈ [κ]^cf(κ) then by the covering lemma there is a Y ∈ C with X ⊆ Y. Using the hypothesis 2^cf(κ) < κ, it follows that |[κ]^cf(κ)| ≤ κ^+. Now, κ^cf(κ) is the number of functions from cf(κ) to κ. It follows that κ^cf(κ) ≤ |[κ]^{≤cf(κ)}| · cf(κ)^cf(κ). It is easily seen that |[κ]^{≤cf(κ)}| ≤ |[κ]^cf(κ)| · cf(κ), and 2^cf(κ) ≤ |[κ]^cf(κ)|; κ^cf(κ) = |[κ]^cf(κ)| follows. Part b is corollary 18.31 of [Jech2]. Part c is corollary 18.32 of [Jech2]. ⊳

Part c is called the “weak covering” principle. It is of interest in “core model theory”, since if L is replaced by a core model K, the weak covering principle might hold, even though the full covering principle does not. Some other conclusions which follow from ¬0# via the covering lemma will be given in the next section.

52. Square.

Like the diamond principle ♦, the square principle □ is a “combinatorial” principle which follows from V = L, and has various consequences. It was defined in 1972 by R. Jensen in the same paper ([Jensen]) as the diamond principle.
Recall from section 23 that the club subsets of α are defined for any limit ordinal α. If S is a set of ordinals let Otp(S) denote the order type of S, with the order inherited from Ord.

Let κ be an infinite cardinal, and let E be a subset of κ^+. The principle □_κ(E) will be defined; this is a generalization of □_κ useful in developing the theory. □_κ(E) is the statement that there is a sequence ⟨C_α⟩ for α < κ^+ with α ∈ LimOrd, such that the following hold.
1. C_α is a club subset of α;
2. if cf(α) < κ then Otp(C_α) < κ; and
3. if β ∈ Lim(C_α) then C_β = C_α ∩ β, and β ∉ E.

□_κ denotes □_κ(∅). It is readily seen that requirement 2 may be alternatively stated as |C_α| < κ.

Theorem 1. If V = L then □_κ holds.

Remarks on proof: A proof is given in section IV.5 of [Devlin]. Let S = {α < κ^+ : α > κ, ω · α = α, and ∀γ < α(|γ|^{J_α} ≤ κ)}. S may be seen to be a club subset of κ^+. Using fine structure theory, a sequence ⟨C_α⟩ for α ∈ S is constructed, such that
1. C_α is a closed subset of α;
2. if cf(α) > ω then C_α is unbounded in α;
3. the order type of C_α is at most κ; and
4. if β ∈ C_α then C_β = C_α ∩ β.
Via the order preserving bijection S ↦ Lim(κ), ⟨C_α⟩ becomes a sequence ⟨B_α⟩ for α < κ^+ with α ∈ LimOrd, such that
1. B_α is a closed subset of Lim(α);
2. if cf(α) > ω then B_α is unbounded in α;
3. the order type of B_α is at most κ; and
4. if β ∈ B_α then B_β = B_α ∩ β.
The sequence ⟨B_α⟩ can in turn be converted to a sequence of sets as required for □_κ (lemma IV.5.1). ⊳

Recall the principle ♦_κ(E) defined in section 23.

Lemma 2. Let W denote {α < κ^+ : cf(α) = ω}.
a. Suppose κ is uncountable and □_κ holds; then there is a stationary set E ⊆ W such that □_κ(E) holds, and if ♦_{κ^+}(W) holds then ♦_{κ^+}(E) holds.
b. Suppose GCH holds, κ is uncountable, cf(κ) = ω, and □_κ holds; then ♦_{κ^+}(W) holds.
c. Suppose GCH holds, κ is infinite, and cf(κ) > ω; then ♦_{κ^+}(W) holds.

Remarks on proof: These may be found in [Devlin]. Part a is lemma IV.2.10. Part b is lemma IV.2.8.
Part c is lemma IV.2.7. ⊳

Theorem 3. Suppose κ is an infinite cardinal, and there is a stationary subset E ⊆ κ^+ such that □_κ(E) and ♦_{κ^+}(E) both hold. Then there is a κ^+-Suslin tree.

Remarks on proof: This is theorem IV.2.5 of [Devlin]. ⊳

Corollary 4. Suppose GCH holds, κ is uncountable, and □_κ holds; then there exists a κ^+-Suslin tree.

Remarks on proof: This is theorem IV.2.10 of [Devlin]. Using lemma 2 there is a stationary subset E ⊆ κ^+ such that □_κ(E) and ♦_{κ^+}(E) both hold. ⊳

It follows using theorem 1 and corollary 4 that if V = L then for any infinite cardinal κ, a κ^+-Suslin tree exists. Using the covering lemma, the following may be shown.

Theorem 5. Suppose ¬0#. Let κ be a singular cardinal.
a. □_κ holds.
b. Suppose that GCH holds also. Then there is a κ^+-Suslin tree.

Remarks on proof: Part a is theorem V.5.6 of [Devlin]. Part b (theorem V.5.7 of [Devlin]) follows from part a by theorem 3. ⊳

The following result will be of interest in section 56.

Theorem 6. For any cardinal κ, if □_κ is false then κ^+ is Mahlo in L.

Remarks on proof: This was noted in [Jensen]. First, by modifying the proof that □_κ holds in L it may be shown that, if A ⊆ κ^+ and ∀α < κ^+(|α|^{L[A∩α]} ≤ κ) then □_κ holds in L[A]. See exercise IV.5 of [Devlin] and theorem 6.36 of [BJW]. The theorem then follows (exercise IV.6 of [Devlin]). ⊳

Corollary 7. If ¬□_κ for a singular cardinal κ then 0# exists.

Remarks on proof: This follows from theorem 6 using theorem 51.7.c. ⊳

The square principle, like many other statements of modern set theory, has become of interest in various topics. [CFM] is one example of research in this area.

53. Independence of AC.

As noted in theorem 21.13, if M[G] is a generic extension of a transitive model M of ZFC then M[G] is a model of ZFC. To construct a model in which the axiom of choice fails, additional steps are needed.

A set x is said to be ordinal definable if
(∗) there is a formula F_{u,p⃗}, and ordinals α⃗, such that u ∈ x if and only if F(u, α⃗) holds.
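In symbols, (∗) says that x is the extension of a formula with ordinal parameters. The display below restates this and gives two standard instances; the instances are illustrations added here and are not taken from the cited references.

```latex
% (*) restated: x is ordinal definable iff, for some formula F and
% ordinals \alpha_1, \dots, \alpha_n,
\[
  x = \{\, u : F(u, \alpha_1, \dots, \alpha_n) \,\}.
\]
% Instance 1: every ordinal \alpha is ordinal definable,
% taking F(u, \alpha) to be the formula u \in \alpha.
\[
  \alpha = \{\, u : u \in \alpha \,\}.
\]
% Instance 2: every V_\alpha is ordinal definable,
% taking F(u, \alpha) to be the formula rank(u) < \alpha.
\[
  V_\alpha = \{\, u : \operatorname{rank}(u) < \alpha \,\}.
\]
```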
The notion of ordinal definability was discussed by K. Gödel in 1947, and has found various uses in set theory since. Discussions can be found in [Drake], [Jackson], [Jech2], [Schindler], and other references.

The definition (∗) cannot be formalized in ZFC. A formal definition is easy to give, however: let OD be the class of sets x such that there is an ordinal β such that x is definable in V_β from ordinal parameters. Using a reflection principle it follows that x ∈ OD if and only if (∗) holds (this is noted following lemma 13.25 of [Jech2]).

OD can be characterized in various other ways. In [Jech2] it is characterized as those x which are obtained from elements V_α by “Gödel operations”, also called “fundamental operations”. The exact set of operations varies from author to author. The basis functions of section 45 can be used, since theorem 13.4 of [Jech2] holds for these; this may be seen from results of section VI.1 of [Devlin].

Let HOD be the sets x which are “hereditarily” OD, that is, such that x ∈ OD and TC(x) ⊆ OD. Basic properties of these classes include the following.

Theorem 1.
a. OD has a definable well-ordering.
b. L ⊆ HOD ⊆ OD ⊆ V.
c. If HOD = OD then V = HOD.
d. HOD is a model of ZFC.

Remarks on proof: Part a is lemma 13.25 of [Jech2]. Since {V_β : β < α} is well-ordered, so is its closure under the Gödel operations (see for example lemma 46.8). For part b, it is only necessary to show that L ⊆ OD; this follows because there is a definable bijection between Ord and L (see lemma 5.8.4 of [Drake]). For part c, if HOD = OD then every V_α is HOD and hence every set is OD (remarks following theorem 8.8 of [Drake]). Part d may be proved by various methods; that of theorem 13.26 of [Jech2] uses a general fact of interest. Namely, a transitive class which is closed under the Gödel operations, and “almost universal”, is a model of ZF (theorem 13.9 of [Jech2], theorem 14.11 of [TakZar1]).
⊳

Possible inequalities which are not excluded by theorem 1.b and theorem 1.c are consistent; see [Drake] for further remarks.

Since HOD satisfies AC, to violate AC the method must be generalized. For a set A, let OD_A be the sets x such that there is a formula F(u, p⃗), ordinals α⃗, and elements a_i ∈ A, such that u ∈ x if and only if F(u, α⃗, a⃗) holds. Let HOD_A be the sets x such that x ∈ OD_A and TC(x) ⊆ OD_A. By arguments similar to those already given, OD_A is definable in ZFC, and HOD_A is a model of ZF. The notation HOD(A) will be used for HOD_{A∪{A}}, as in [Jech2].

Let M be a ground model in which V = L holds. Let P be the partial order whose elements are functions f : s ↦ {0, 1} where s is a finite subset of ω × ω, with f < g if and only if f ⊃ g. Let G be an M-generic subset of P. For i ∈ ω let a_i be the set of j ∈ ω such that f(i, j) = 1 for some f ∈ G. Let A = {a_i : i ∈ ω}.

Theorem 2. In the model HOD(A) where A is defined above, there is no well-order of the real numbers.

Remarks on proof: This follows from lemma 14.39 of [Jech2]. ⊳

54. Proper forcing.

Properness is a property of notions of forcing, which has found many applications since it was defined by S. Shelah in 1982. Since it will be referred to in subsequent sections, a brief treatment will be given. See chapter 31 of [Jech2] and [Abraham] for more extensive treatments. Some variations of proper forcing which have subsequently been seen also to be of interest will be described as well.

First, the notion of the club filter in [A]^µ, for a cardinal µ of uncountable cofinality and a set A with |A| ≥ µ, will be defined. This notion was first defined by T. Jech in 1972, and has since become commonly used. Chapter 8 of [Jech2] includes a discussion of this topic.

The set [A]^µ becomes a poset when ordered by the subset relation. A subset X ⊆ P of a quasi-order P is said to be cofinal if for any p ∈ P there is an x ∈ X with p ≤ x. The term "unbounded" is also used.
A subset C ⊆ [A]^µ is said to be a club subset if it is cofinal, and closed under unions of ascending chains of length some ordinal α ≤ µ.

Lemma 1. Suppose µ and A are as above. The subsets of [A]^µ which contain a club subset comprise a filter.

Proof: It suffices to show that if C_0 and C_1 are club subsets then C_0 ∩ C_1 is a club subset. Suppose s_0 is any element of [A]^µ. Define s_{ij} for i ∈ ω and j = 0, 1 inductively. Let s_{00} be any f ∈ C_0 such that s_0 ⊆ f. Let s_{i1} be any f ∈ C_1 such that s_{i0} ⊆ f. Let s_{i+1,0} be any f ∈ C_0 such that s_{i1} ⊆ f. Let s = ∪_{i,j} s_{ij}. Then s_0 ⊆ s, and s ∈ C_0 ∩ C_1 since s is the union of the ascending chain of the s_{i0} in C_0 and also of the ascending chain of the s_{i1} in C_1. Given an ascending chain of elements of C_0 ∩ C_1 of length < µ^+, its union is in both C_0 and C_1, so is in C_0 ∩ C_1. ⊳

The terminology "club subset" is used because facts concerning the usual club filter may be adapted to this filter. In particular, a subset is called thin if its complement is in the club filter; and stationary if it is not thin.

A notion of forcing ⟨M, P⟩ is said to be proper if, whenever, in M, S is a stationary subset of [λ]^ω for an uncountable cardinal λ, S is stationary in the generic extension M[G]. Some sufficient conditions for properness include the following.
- If P satisfies the countable chain condition then P is proper (lemma 31.2 of [Jech2]).
- If P is ω-closed then P is proper (lemma 31.3 of [Jech2]).
- If P "satisfies axiom A" then P is proper (lemma 31.10 of [Jech2]).

There is an important equivalent formulation, which involves some preliminary definitions. Suppose (N, P) is a notion of forcing.
- A cardinal λ is sufficiently large if λ > 2^{|P|}.
- For λ a sufficiently large cardinal, by an elementary substructure of H_λ will be meant one in an expanded structure including any items of interest, for example P.
- A set D ⊆ P is predense if every p ∈ P is compatible with some q ∈ D.
- If M is an elementary substructure of H_λ where λ is a sufficiently large cardinal, a condition q ∈ P is said to be (M, P)-generic if for every maximal antichain A ∈ M the set A ∩ M is predense below q.

The following holds:
- P is proper if and only if for all sufficiently large cardinals λ there is a club subset of [H_λ]^ω of countable elementary submodels M such that for all p ∈ M there is a q ≤ p which is (M, P)-generic (theorem 31.7 of [Jech2]).

One of the most important properties of proper forcing is the following. Recall from section 27 that in an iteration having countable support, the overall forcing notion involves sequences which are not 1 at only countably many places.
- If P_α is a countable support iteration of {Q̇_β : β < α} such that every Q̇_β is a proper forcing notion in M^{P_β}, then P_α is proper (theorem 31.15 of [Jech2]).

A weakened version of properness has been seen to have various applications. Suppose M is an elementary substructure of H_λ where λ is a sufficiently large cardinal.
- For a condition q ∈ P, q is (M, P)-generic if and only if, for any name α̇ for an ordinal (i.e., such that ⟦α̇ is an ordinal⟧ = 1), q ⊩ α̇ ∈ M (lemma 31.6 of [Jech], fact 18.31 of [Roitman]).
- Say that a condition q ∈ P is (M, P)-semigeneric if for any name α̇ for a countable ordinal, q ⊩ α̇ ∈ M.
- P is defined to be semiproper if, for all sufficiently large cardinals λ there is a club subset of [H_λ]^ω of countable elementary submodels M such that for all p ∈ M there is a q ≤ p which is (M, P)-semigeneric.

Properties of interest of semiproper notions of forcing include the following.
- If ⟨M, P⟩ is semiproper then every stationary set S ⊆ ℵ_1 remains stationary in M[G] (theorem 34.4 of [Jech2]). A notion of forcing is said to be "stationary set preserving" if it has the foregoing property.
- In general, countable support iteration does not preserve semiproperness. There is a type of iteration, called "revised countable support" (RCS) iteration, which does.
See chapter 37 of [Jech2] for some discussion.

55. Core models.

Core models are inner models which are constructed by certain methods, and have certain properties. There is no precise definition, but several models have been constructed, which set theorists agree are examples of core models.

The first construction appeared in [DoddJen1], of the "Dodd-Jensen" core model, which will be denoted K^DJ. An alternative treatment may be found in [Dodd], and a brief overview in [Jech2]. An overview will be given here.

A notion central to core model theory is that of a "premouse". The definition depends on the type of core model, and even for the same type there are variations between authors. The definition in [DoddJen1] will be considered here.

Recalling the definition of the relativized Jensen hierarchy from section 46, structures of the form 𝐉^A_α = ⟨J^A_α, A ∩ J^A_α⟩ will be considered. As noted in section 46, such structures are amenable; they are a restricted type of J-structure, as defined in section 47.

A premouse at κ is defined to be a structure N = 𝐉^U_α where in N, κ is a cardinal and U is a normal N-ultrafilter on κ. (U is a subset of J^U_α, but not necessarily an element.)

Suppose N is a premouse at κ. On N^κ ∩ N, let
- f ≡_0 g if and only if {α < κ : f(α) = g(α)} ∈ U;
- f ∈_0 g if and only if {α < κ : f(α) ∈ g(α)} ∈ U; and
- U_0(f) if and only if {α < κ : U(f(α))} ∈ U.

By arguments as given in earlier sections, ≡_0 is a congruence relation on N^κ ∩ N, equipped with ∈_0 and U_0. Let Ñ denote the quotient, with ∈_Ñ and U_Ñ the induced relations. Using familiar arguments, the following may be shown. Łoś's theorem holds for Σ_0 formulas. For x ∈ N let C_x denote the function on κ where C_x(α) = x for all α < κ; the map x ↦ [C_x] is Σ_0-elementary. It is also ∈-cofinal, so in fact is Σ_1-elementary.

If Ñ is well-founded, let N^+ be the transitive collapse, and let j_N : N ↦ N^+ be x ↦ [C_x] composed with the transitive collapse map π.
Again, by familiar arguments, j_N ↾ κ is the identity map, Pow(κ) ∩ N^+ = Pow(κ) ∩ N, and if x = π([f]) then j_N(f)(κ) = x.

Lemma 1. N^+ = 𝐉^{U^+}_{α^+} where α^+ = sup(j[α]) and U^+(π([f])) if and only if U_Ñ([f]). Letting κ′ denote j_N(κ), κ′ is a cardinal in N^+ and U^+ is a normal N^+-ultrafilter on κ′. In particular, N^+ is a premouse at κ′.

Remarks on proof: This is lemma 3.8 of [DoddJen1]. All claims follow using the fact that j_N is Σ_1-elementary. ⊳

Lemma 1 permits iterating the step N ↦ N^+, provided Ñ is well-founded. Say that N is 0-iterable, let N_0 = N, and let j_{00} be the identity. If N is α-iterable and Ñ_α is well-founded say that N is (α + 1)-iterable, and let N_{α+1} = N_α^+, for β ≤ α let j_{β,α+1} = j_{N_α} ◦ j_{β,α}, and let j_{α+1,α+1} be the identity. If α is a limit ordinal and N is β-iterable for all β < α, let N_l be the direct limit of the N_β with the maps j_{γβ} for γ < β < α. If N_l is well-founded say that N is α-iterable, let N_α be the transitive collapse of N_l, and let j_{βα} be the direct limit map, composed with the transitive collapse map.

A premouse N is said to be iterable if it is α-iterable for all ordinals α. If M is an iterable premouse at κ the sequence ⟨M_i, j_{ij}, κ_i⟩, where κ_i = j_{0i}(κ), is called the iteration of M. The following may be verified (lemma 3.12 of [DoddJen1]):
- j_{ij} is Σ_1-elementary and cofinal.
- j_{ij} ↾ κ_i is the identity.
- If i < j then κ_i < κ_j and Pow(κ_i) ∩ M_j = Pow(κ_i) ∩ M_i.
- κ_i is the critical point of j_{ij}.
- Any element of M_j is definable by a Σ_0 formula with parameters from Ran(j_{ij}) ∪ {κ_h : i ≤ h < j}.

The core model K^DJ (written simply K if there is no possibility of confusion) is constructed by singling out certain iterable premice, called mice, which satisfy certain "fine structural" requirements. It turns out that the existence of a mouse, the existence of an iterable premouse, and the existence of 0#, are equivalent (see chapter 12 of [Dodd]).
The definition of mice and the derivation of their relevant properties in [DoddJen1] is lengthy, and only some remarks will be made here, referring to the treatment there.

For convenience, conventions following [DoddJen1] will be used, which are slightly at variance with those used in section 47. M, N will generally be used to denote structures of the form 𝐉^A_α. These may be required to satisfy additional hypotheses, e.g., to be acceptable or to be a premouse. The term "strongly acceptable" will be used for the notion of "acceptable" defined in section 47. The parameter p_M is defined to be the least finite set of ordinals (under a well-order on such). A_M is defined to be {⟨i, x⟩ ∈ J_{ρ_M} : ⊨_M φ_i(x, p_M)}. M* denotes 𝐉^{A_M}_{ρ_M}.

Lemma 2.
a. M* is strongly acceptable.
b. J^{A_M}_{ρ_M} = H^M_{ωρ_M}.

Remarks on proof: Part a is lemma 2.18 of [DoddJen1], and part b is lemma 2.19. ⊳

Let h_M denote the uniformly defined Σ_1 Skolem function for M (denoted h^M_J in section 46). A structure M = 𝐉^A_α is said to be sound if J^A_α = h_M[ω × (J_{ρ_M} ∪ {p_M})] (that is, making suitable adjustments to the definitions, if p_M is a very good parameter).

Lemma 3. Suppose A ⊆ J^A_{ρ_M}.
a. M is sound.
b. J^{A_M}_{ρ_M} = J^A_{ρ_M}.

Remarks on proof: See lemma 4.2 and remarks following lemma 4.4 of [DoddJen1]. Part a may be proved as the case n = 0 of lemma 9.7 of [Dodd]. For part b, by remarks following lemma 47.6 ⟨J^A_α, A_M⟩ is amenable. The claim follows by arguments as in the proof of lemma 46.16. ⊳

Definition 4.5 of [DoddJen1] gives the recursion equations for iterating the projectum, as follows. Let M^(0) = M, ρ^0_M = α, p^0_M = ∅, and A^0_M = A. Let M^(i+1) = M^(i)*, ρ^{i+1}_M = ρ_{M^(i)}, p^{i+1}_M = p_{M^(i)}, and A^{i+1}_M = A_{M^(i)}. It follows by induction that M^(i) = 𝐉^{A^i_M}_{ρ^i_M}.

M is said to be n-sound if M^(i) is sound for i < n.
Using lemma 3 and standard methods such as may be found in the proof of theorem 47.8, it may be seen that if A ⊆ J^A_{ρ^n_M} then M is n-sound, and suitably reformulated, theorems 47.8.c and 47.8.d hold (lemma 4.6.a of [DoddJen1]).

A premouse N = 𝐉^U_α at κ is said to be critical if N is acceptable, and there is a subset of κ, definable from parameters, which is not an element of N.

Lemma 4. Suppose N is critical, and n is least such that there is a subset of κ, Σ_n-definable from parameters, which is not an element of N. Then ρ^{n+1}_N ≤ κ < ρ^n_N.

Remarks on proof: This is stated following Definition 5.1 of [DoddJen1]. Let δ_N denote the least δ ≤ α such that U ⊆ J^U_δ. Using facts given above, ρ^{n+1}_N < δ_N ≤ ρ^n_N. The claim follows using lemma 4.8 of [DoddJen1]. ⊳

The integer n of lemma 4 is called the critical number, and denoted n(N). Let N′ = N^(n); it follows by facts above that N′ = 𝐉^U_{ρ′} where ρ′ = ρ^n_N.

A premouse N = 𝐉^U_α at κ is said to be a mouse (at κ) if
- N is critical,
- N′ is iterable with iteration ⟨(N′)_i, (j′)_{ij}, (κ′)_i⟩, and
- for each i ∈ Ord there is a critical premouse N_i such that n(N_i) = n(N) and (N_i)′ = (N′)_i.

Lemma 5. Suppose N is a mouse and σ : M ↦ N′ is Σ_1-elementary. Then there is a unique N̄ such that N̄′ = M. Further, N̄ is a mouse and n(N̄) = n(N).

Remarks on proof: See lemma 5.6 of [DoddJen1]. ⊳

If M is the transitive collapse of h_{N′}[ω × (J_{ρ_{N′}} ∪ {p_{N′}})], by the preceding lemma there is a unique mouse N̄ with N̄′ = M. N̄ is called the core of N.

It may be shown that, for a mouse N, the iteration of N′ may be "lifted" to a system of structures and Σ_{n+1}-elementary embeddings, starting at N (remarks following Definition 5.4 of [DoddJen1]; and also lemma 9.15 of [Dodd]). This system is called the mouse iteration of N.

Let C_N = ∩(U ∩ h_{N′}[ω × (J_{ρ_{N′}} ∪ {p_{N′}})]).
It follows that C_N is the set of Σ_1 generating indiscernibles for N′ with suitable constants added to the language, which is unique, and equals {κ̄_i : i < λ}, where the κ̄_i are the critical points of the mouse iteration of the core N̄ of N and N = N̄_λ in the iteration (see remarks following lemma 5.14 of [DoddJen1]).

Lemma 6. For any κ, there is at most one N such that N is a mouse at κ and |C_N| = ω.

Remarks on proof: See lemma 6.2 of [DoddJen1]. ⊳

Let D = {⟨ξ, κ⟩ : there is a mouse N at κ with |C_N| = ω and ξ ∈ C_N}; K_α = J^D_α; and K = ∪_α K_α = L[D]. K is the (Dodd-Jensen) core model. After proving various lemmas about the above constructs, the following may be shown (references are to [DoddJen1]).
- If β is an infinite cardinal in K then K_β = H^K_β (lemma 6.9).
- ⊨_K GCH (corollary 6.10).
- If 0# does not exist then K = L (Examples 1 of section 6).
- If 0# exists then 0# ∈ K (remarks following Definition 5.4).
- If 0# exists and β is an uncountable cardinal in K then K_β = ∪{N ∈ K_β : N is a mouse} (lemma 6.11).
- If there is a set U such that U is a normal measure in L^U then K = ∪_i H^{L^{U_i}}_{κ_i} = ∩_i L^{U_i} where ⟨L^{U_i}, κ_i⟩ is the iteration (Examples 3 of section 6).
- It follows by standard observations that the hypothesis of the preceding item holds if and only if there is an inner model containing a measurable cardinal.
- In K, there is no set U such that U is a normal measure in L^U (remarks in fourth paragraph).
- An iterable premouse is acceptable (lemma 5.21).

K is "between" L and L[U] for a normal measure U. It exists even if there is no such U, and contains 0# if and only if 0# exists.

A standard method in applications of core models is as follows.
- Let P be a property of cardinals, to the effect that they are "large".
- Let H be the statement, "there does not exist an inner model of ∃κP(κ)".
- Let K be a core model.
- Let S_K be a statement involving K, such that H ⇒ S_K.
- Let S_V be a statement not involving K, such that S_K ⇒ S_V.
- It follows that if ¬S_V is consistent then ∃κP(κ) is consistent.

The statement H is called the "anti-large cardinal hypothesis". K must be constructed so that H ⇒ S_K follows, for various S_K of interest. This is often stated, "K is the core model below, or up to, a cardinal with property P". K has a "maximality" property, in that considerable "large cardinal structure" up to ∃κP(κ) is incorporated; it has a "minimality" property in that this is done in a minimal fashion, invariably as L[E] for some class E.

That H ⇒ S_K is not as written a statement of set theory; what is actually proved is the contrapositive ¬S_K ⇒ (∃κP(κ))^M for some inner model M (so that ZFC^M also). The consistency implication then follows as in remarks preceding theorem 43.13.

In the introduction to part four of [Dodd] it is stated that "The covering lemma and the SCH are the most important applications of K". This application is an example (indeed the first) of the above described method:
- P is "measurable".
- K is K^DJ.
- S_K is the "covering property", that if x is an uncountable set of ordinals then there is a set of ordinals y ∈ K such that x ⊆ y and |y| = |x|.
- S_V is SCH.

That H ⇒ S_K was first proved in [DoddJen2]; a proof can also be found in lemma 18.26 of [Dodd]. That S_K ⇒ S_V is lemma 21.10 of [Dodd]. There is a covering lemma for L[U], and consequences for SCH; see [DoddJen3].

"Higher" core models, that is, up to larger cardinals, have been of considerable interest in modern set theory. Chapter 35 of [Jech2] mentions three such, up to a measurable cardinal κ of Mitchell order κ (K^m), up to a strong cardinal (K^strong), and up to a Woodin cardinal (K^steel).

The methods needed for constructing successive core models become increasingly complex. Roughly, K^m requires coherent sequences of measures, K^strong requires coherent sequences of extenders, and K^steel requires iteration trees.
Iteration trees are used for various purposes, including constructing fine structural inner models which need not be core models. Further remarks are omitted. Several articles in The Handbook of Set Theory provide further information.

56. Consistency strength.

If T_1 and T_2 are first order theories, T_1 is said to have consistency strength at least that of T_2 if Con(T_1) ⇒ Con(T_2). This notion has already been encountered in section 43, where the theories are specialized to "ZFC+∃κP(κ)" where P defines a type of large cardinal.

The question arises of what methods may be allowed in proving the implication. For extensions of ZFC, ZFC may be allowed. For weaker theories (and even set theory), though, the methods might be restricted, say to primitive recursive arithmetic (for which see [Smorynski]).

T_1 and T_2 are said to be equiconsistent if Con(T_1) ⇒ Con(T_2) and Con(T_2) ⇒ Con(T_1). T_1 is said to have consistency strength greater than that of T_2 if Con(T_1) ⇒ Con(T_2), but ¬(Con(T_2) ⇒ Con(T_1)).

Some simple examples are as follows.
- ZFC+GCH is equiconsistent with ZFC (since L is an inner model).
- ZFC+¬GCH is equiconsistent with ZFC (since there is a generic extension in which CH is false).

Let I be the statement "there exists an inaccessible cardinal".
- ZFC+I has consistency strength strictly greater than ZFC (since Con(ZFC) is provable in ZFC+I, and using the second incompleteness theorem).
- ZFC+¬I is equiconsistent with ZFC (since a model of ZFC+I may be "truncated" to a model of ZFC+¬I).

Certain statements of set theory may be singled out as "large cardinal axioms". Often these are of the form "∃κP(κ)" where P is a predicate singling out some class of large cardinals. Some other statements are considered as "having large cardinal strength", though, for example "0# exists" (see theorem 9.17 of [Kanamori]).

The following are empirical observations of modern set theory.
- If C_1 and C_2 are large cardinal axioms then either Con(ZFC + C_1) ⇒ Con(ZFC + C_2) or Con(ZFC + C_2) ⇒ Con(ZFC + C_1).
- If S is a statement of interest then there is a large cardinal axiom C such that ZFC+S is equiconsistent with ZFC+C.
- That Con(ZFC + C) ⇒ Con(ZFC + S) gives an "upper bound" on the consistency strength of S. Upper bound proofs often involve constructing a generic extension satisfying S, from a ground model satisfying C. Sometimes, the implication C ⇒ S can be shown.
- That Con(ZFC + S) ⇒ Con(ZFC + C) gives a "lower bound" on the consistency strength of S. Lower bound proofs often involve constructing an inner model (in many cases of interest a core model) satisfying C, starting with a model of S.
- Sometimes bounds are shown for Con(ZF + S).

For example, the negation of SCH is equiconsistent with the existence of a measurable cardinal κ of Mitchell order κ^{++}. The existence of a measurable cardinal κ such that 2^κ > κ^+ also is. These results were proved by Mitchell and Gitik, in the period from 1984 to 1991. For further information see corollary 35.18 and theorem 36.1 of [Jech2], and theorem 4.41 of [Mitchell2]. For a survey of subsequent developments see [Kanamori4].

For another example, if there exists a Mahlo cardinal then there is a forcing model in which ¬□_{ℵ_1} holds (exercise 27.2 of [Jech2]). By theorem 52.6, ¬□_{ℵ_1} is equiconsistent with the existence of a Mahlo cardinal.

Further examples of consistency strength bounds will be given in subsequent sections.

57. Descriptive set theory.

Descriptive set theory is a subject dating to the last half of the 1910s. Its subject matter is sets of real numbers, which satisfy any of various definability conditions. It has been found convenient to consider the real numbers in the alternative form mentioned in section 14, as elements of Baire space N, that is, ω^ω with basic open sets U_t where t ∈ ω^{<ω}.

Definable subsets are a topic of interest in any topological space.
Spaces called "Polish spaces" are of particular interest. Descriptive set theory in more general spaces is thoroughly covered in [Kechris] and [Moschovakis]. For some applications, including set theory, descriptive set theory in some particular spaces is all that is required. Such treatments may be found in [Jech2] and [Kanamori3].

Many facts of descriptive set theory, when suitably formulated, hold for Polish spaces in general. For various applications the cases of greatest interest are R, N, and C. No two of these are homeomorphic. (For readers familiar with some topology, R is connected and not compact, N is totally disconnected and not compact, and C is totally disconnected and compact. See [Dowd1] for definitions.) Any two uncountable Polish spaces are however "Borel isomorphic" (see theorem 15.6 of [Kechris]).

Proofs might be given for any Polish space. Alternatively, as a matter of convenience, a proof might be given for a particular Polish space. This can then be "transferred" to an arbitrary Polish space via Borel isomorphism. It can also be transferred to some other particular Polish space by a specific method, avoiding the need to develop the theory of Borel isomorphism. Most results given here will be for particular spaces.

The starting point of descriptive set theory is the Borel sets. These were first defined in the 19th century by E. Borel. The class of Borel subsets may be defined for any topological space. A "σ-algebra" of subsets of some set is defined to be a family of subsets which is closed under the operations of countable union and complement. The Borel sets are defined as the smallest σ-algebra containing the open sets (the σ-algebra generated by the open sets).

If the space satisfies some restrictions, the Borel sets may be stratified into the Borel hierarchy. In [Kechris] the restriction of "metrizability" is imposed (a topological space is metrizable if there is a metric whose metric topology is the given topology).
In a metrizable space:
- A subset is defined to be 𝚺^0_1 if it is open.
- For α ≥ 1 a subset is 𝚷^0_α if it is the complement of a 𝚺^0_α set.
- For α > 1 a subset is 𝚺^0_α if it is a union of finitely or countably many sets which are 𝚷^0_β for some β < α.
- For α ≥ 1 a subset is 𝚫^0_α if it is both 𝚺^0_α and 𝚷^0_α.

The use of "boldface" to denote these classes is standard. Proposition I.3.7 of [Kechris] shows that in a metrizable space, every closed subset is a countable intersection of open sets. It follows by induction that for α < β, 𝚺^0_α, 𝚷^0_α ⊆ 𝚫^0_β. It is not difficult to see that ∪_{α<ℵ_1} 𝚺^0_α is closed under countable union and complement, and equals the class of Borel sets.

In developing descriptive set theory it is convenient to consider the space A^ω of sequences from any set A. Generalizing a definition from section 14, for t ∈ A^n let U_t denote {f ∈ A^ω : f↾n = t}. The sets U_t form the base for a topology on A^ω. This space is metrizable; indeed, the function d(s, t) = 1/2^n where n is least such that s(n) ≠ t(n) (with d(s, t) = 0 if s = t), is readily verified to be a metric.

It is also convenient to consider product spaces. In general, the product of topological spaces X_1, . . . , X_k has as underlying set the Cartesian product X_1 × · · · × X_k. The sets U_1 × · · · × U_k where U_i is an open subset of X_i may be taken as the base of a topology (in fact the U_i may be required to be basic, given a base for the topology on X_i for each i). If for each i X_i is a metric space, with metric function d_i, then the product space may be given any of several metrics, with the resulting metric topology being the product topology, the function max(d_1(x_1, y_1), . . . , d_k(x_k, y_k)) for example.

A product space A_1^ω × · · · × A_k^ω is readily seen to be homeomorphic to (A_1 × · · · × A_k)^ω, via the map ⟨⟨x_{1i}⟩, . . . , ⟨x_{ki}⟩⟩ ↦ ⟨⟨x_{1i}, . . . , x_{ki}⟩⟩. In descriptive set theory, it is convenient to blur the distinction. For example, ⟨x_1, . . . , x_k⟩ may be used to denote an element of (A_1 × · · · × A_k)^ω, so that no special notation is needed for this. Also, in a product A_1^ω × A_2^ω, a factor may itself be a product space; more generally, the usual identifications may be invoked, so that the order of association of a product may be ignored.

By a "standard space" will be meant a space A^ω where A = A_1 × · · · × A_k, where A_i is 2 or ω. A standard space is a product of copies of N and C. Many authors only consider spaces N^k; allowing factors of C as well is sometimes convenient.

Another homeomorphism of interest arises between A^ω and B^ω when |A| = |B|. Let f : A ↦ B be any bijection, and map ⟨x_i⟩ to ⟨f(x_i)⟩. Letting ≅ denote homeomorphism, it follows that for infinite A, (A^ω)^k ≅ (A^k)^ω ≅ A^ω. For example, N^k ≅ N.

In many areas in mathematics, including descriptive set theory, it is useful to specify a method of "coding" a finite sequence of integers as a single integer. The "prime power coding" is a standard such (discussed in [Yasuhara] for example). It makes use of the "fundamental theorem of arithmetic", that every integer n ≥ 2 may be uniquely expressed in the form p_1^{e_1} · · · p_k^{e_k} where p_1 < · · · < p_k are "prime numbers". A proof of this may be found in [Dowd1].

The code for a finite sequence s = ⟨s_0, . . . , s_{l−1}⟩ will be taken as 2^{s_0+1} 3^{s_1+1} · · · p^{s_{l−1}+1} where p is the l-th prime number in increasing numerical order. The empty sequence is coded by the empty product, whose value is by a standard convention equal to 1. The code for s will be denoted Cd(s). The standard operations on sequences translate to primitive recursive functions on their codes; see [Yasuhara] for some examples.

Let FS(i) be the sequence whose code is i if such exists, else the empty sequence. This provides a computable listing of the finite sequences of integers. It is readily seen that |FS(i)| ≤ i, a fact which is sometimes useful.

Theorem 1 (Parameterization theorem). Let A^ω be a standard space.
For any α > 0 there is a 𝚺^0_α subset Ũ ⊆ N × A^ω such that for any 𝚺^0_α subset X ⊆ A^ω there is an element s ∈ N such that X = {x : ⟨s, x⟩ ∈ Ũ}.

Remarks on proof: The proof may be given for A = ω; modifications for the general case are slight. For α = 1 let Ũ_j = {⟨s, x⟩ : x ∈ U_{FS(s(j))}}; and let Ũ = ∪_j Ũ_j. Ũ_j is open, since for ⟨s, x⟩ ∈ Ũ_j, U_{s↾(j+1)} × U_{FS(s(j))} ⊆ Ũ_j; hence Ũ is open. If X is open then X = ∪_j U_{FS(s(j))} for some s, and the theorem follows when α = 1. The claim for α > 1 may be proved by induction; see lemma 11.2 of [Jech2]. ⊳

Corollary 2. In a standard space,
a. there is a subset which is 𝚺^0_α but not 𝚷^0_α; and
b. if α < β then 𝚺^0_α, 𝚷^0_α ⊂ 𝚫^0_β.

Remarks on proof: Again, the case ω^ω is typical. Given α, let Ũ be as in the theorem. Let D = {s : ⟨s, s⟩ ∈ Ũ}. D is the inverse image of Ũ under the function s ↦ ⟨s, s⟩. This function is readily verified to be continuous. The inverse image of a 𝚺^0_α subset under a continuous function is 𝚺^0_α (theorem 1C.2 of [Moschovakis]). Thus, D is 𝚺^0_α. If D were 𝚷^0_α then D^c would be 𝚺^0_α, so for some s_0, D^c equals {x : ⟨s_0, x⟩ ∈ Ũ}. But then s_0 ∈ D if and only if ⟨s_0, s_0⟩ ∈ Ũ, if and only if s_0 ∈ D^c. This contradiction shows that D is not 𝚷^0_α. If α < β then 𝚺^0_α ⊆ 𝚫^0_β, and by part a the inclusion is proper; 𝚺^0_α ⊂ 𝚫^0_β follows. Taking complements, 𝚷^0_α ⊂ 𝚫^0_β also. ⊳

A second hierarchy of sets is defined in a standard space A^ω as follows.
- A subset X is 𝚺^1_n if there is a subset Y ⊆ N × A^ω, which is closed if n = 1, or 𝚷^1_{n−1} if n > 1, such that X = π_2[Y]. X is said to be the projection of Y; x ∈ X if and only if ∃w(⟨w, x⟩ ∈ Y).
- A subset is 𝚷^1_n if it is the complement of a 𝚺^1_n set.
- A subset is 𝚫^1_n if it is both 𝚺^1_n and 𝚷^1_n.

These classes are called the classes of the projective hierarchy, and their elements are called projective sets. The sets in 𝚺^1_1 are called analytic sets. See the introduction of [Moschovakis] for remarks on the history of the definition of the analytic and projective sets.
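The prime power coding Cd and the listing FS, which the proof of the parameterization theorem relies on, can be made concrete. The following is a minimal Python sketch under the definitions given earlier in this section; the names primes, cd, and fs are chosen here for illustration and do not appear in the text, and trial division is used only because it is adequate for small examples.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for small examples)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def cd(s):
    """Code of a finite sequence s: the product of p_i ** (s_i + 1),
    where p_i is the i-th prime; the empty sequence has code 1."""
    code = 1
    for p, x in zip(primes(), s):
        code *= p ** (x + 1)
    return code


def fs(i):
    """Sequence whose code is i, or the empty sequence if i is not a code."""
    if i < 1:
        return ()
    seq = []
    for p in primes():
        if i == 1:
            return tuple(seq)
        e = 0
        while i % p == 0:
            i //= p
            e += 1
        if e == 0:
            return ()  # a prime was skipped, so i is not a code
        seq.append(e - 1)
```

For instance, cd((1, 0, 2)) is 2^2 · 3^1 · 5^3 = 1500, and fs(1500) recovers (1, 0, 2); an integer such as 10 = 2 · 5, which skips the prime 3, is not a code, so fs(10) is the empty sequence.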
The definition of an analytic subset applies in any Polish space X, that is, a subset Y ⊆ X is analytic if and only if there is a closed subset W ⊆ N × X such that Y = π_2[W]. Lemmas 11.6 and 11.7 of [Jech2] give some other characterizations.

Lemma 3. Suppose A^ω is a standard space and let Γ_A denote the class of analytic subsets of A^ω.
a. Γ_A is closed under countable union and intersection.
b. Γ_A contains the closed sets and the open sets.
c. If W ∈ Γ_{ω×A} and X is the projection of W then X ∈ Γ_A.
d. If W ∈ Γ_A and f : A^ω ↦ B^ω is continuous then f[W] ∈ Γ_B.

Remarks on proof: A proof of part a may be found in the proof of lemma 11.6 of [Jech2]. A closed set X equals the projection of N × X. Since a standard space has a countable base of sets which are both open and closed, any open set is a countable union of closed sets, so using part a Γ_A contains the open sets. For part c, ∃u∃vW may be converted to ∃wW′, homeomorphically. For part d, since W is the projection of a closed set W may be assumed to be closed. The graph of f may be verified to be a closed subset of A^ω × B^ω, and f[W] may be obtained by projection. ⊳

Theorem 4. Let A^ω be a standard space. For any n > 0 there is a 𝚺^1_n subset Ũ ⊆ N × A^ω such that for any 𝚺^1_n subset X ⊆ A^ω there is an element s ∈ N such that X = {x : ⟨s, x⟩ ∈ Ũ}.

Remarks on proof: See lemma 11.8 of [Jech2]. ⊳

Corollary 5. In a standard space,
a. there is a subset which is 𝚺^1_n but not 𝚷^1_n; and
b. if n < m then 𝚺^1_n, 𝚷^1_n ⊂ 𝚫^1_m.

Remarks on proof: Similar to corollary 2. ⊳

By lemma 3 a Borel set is 𝚫^1_1. By corollary 5 there are analytic sets which are not Borel.

Say that disjoint subsets X, Y of a set are separated by a subset Z if X ⊆ Z and Y ⊆ Z^c.

Theorem 6. If X and Y are disjoint analytic subsets then there is a Borel subset Z which separates them.

Remarks on proof: See lemma 11.11 of [Jech2]. Other proofs may be found in proposition 13.4 of [Kanamori3] and theorem 2E.1 of [Moschovakis].
These involve further concepts, some of which will be discussed below. ⊳ Theorem 7. A subset is ∆11 if and only if it is Borel. Remarks on proof: It has already been observed that Borel subsets are ∆11 . If X is ∆11 then both X and X c are analytic, whence they are separated by some Borel set, which must be X. ⊳ There are “effective” (“lightface”) versions of classes of sets (“boldface”) defined above. A simple method for defining these is to consider a language with three sorts of variables, intended to range over ω (integer variables), ω ω (function variables), and and 2ω (set variables). The functions and relations of the language consist of 0,1,+,·,≤ on integer values; function application, which may be written as the term f (n) as usual; and predicate application, which may be written as the atomic formula X(n) as usual. Set variables may only occur free in formulas. As mentioned previously, many authors omit set variables. The quantifier complexity of formulas in this language is defined “as usual” (see section 34). A formula of the form Q1 ~x1 · · · Qn ~xn G, where the blocks of quantifiers alternate in type, is said to be Σ0n (resp. Π0n ) if the quantifiers are number quantifiers, G has only bounded number quantifiers, and Q1 is ∃ (resp. ∀). The formula is said to be Σ1n (resp. Π1n ) if the quantifiers are function quantifiers, G has only number quantifiers, and Q1 is ∃ (resp. ∀). A formula with free variables f1 , . . . , fk , where fi denotes either a function or set variable, defines a predicate on some standard space. This predicate is said to be Σ0n , Π0n , Σ1n , or Π1n , if there is a formula of the specified type defining it. The ∆0n and ∆1n predicates are defined as usual. Lemma 8. a. A Σ11 subset may be defined by a formula ∃~g G where G is Π01 . b. A subset is Σ01 if and only if it can be defined in the form ∃nR(Cd(f1 ↾n), . . . , Cd(fk ↾n), n) where R is a computable predicate. c. 
A subset is Σ^1_1 if and only if it can be defined in the form ∃g∀n R(Cd(f_1↾n), …, Cd(f_k↾n), Cd(g↾n), n) where R is a computable predicate.

Remarks on proof: See section 12 of [Kanamori3]. ⊳

Part a is an effective version of the fact that an analytic set is the projection of a closed set. Part c is used by some authors (such as [Jech2]) as the definition of a Σ^1_1 subset. Using part b it is easy to show that (when k = 1) a subset is Σ^0_1 if and only if it is empty or is of the form ∪_i U_{FS(e(i))} for some computable function e.

The lightface classes may be defined “relative to” an arbitrary h ∈ ω^ω. This is readily accomplished by adding a symbol for the function to the language (so that h(t) may occur in a formula; but h may not be quantified). The resulting classes are denoted Σ^0_n(h), etc. Lemma 8.c “relativizes”, i.e., a Σ^1_1(h) subset may be defined in such form, where R is computable from h.

Theorem 9. A subset of a standard space is Σ^0_1 (boldface) if and only if it is Σ^0_1(h) (lightface) for some h. The same claim holds for the other classes.

Remarks on proof: See proposition 12.6 of [Kanamori3]. ⊳

One may say that a lightface subset is one definable without parameters; and a boldface subset is one definable with parameters. The method of the proof of the theorem may be used to strengthen theorem 1 when α is a nonzero integer n, to require that Ũ be (lightface) Σ^0_n; theorem 4 may similarly be strengthened (see proposition 12.7 of [Kanamori3]). As a consequence, the lightface hierarchies are strict; indeed there is a Σ^0_n subset which is not Π^0_n, etc. There is a notion of an effective transfinitely Borel set, and an effective version of theorem 7; see corollary 27.4 of [Miller].

For s ∈ A^{<ω}, the length of s is just the cardinality of s as a set of ordered pairs, so that |s| may be used to denote the length of s.

The Lebesgue measure was mentioned in section 15. A self-contained treatment may be found in chapter 23 of [Dowd1]; an overview may be found in chapter 11 of [Jech2].
Some subsets of R^n (the “measurable” sets) are assigned a “volume”. The Lebesgue measurable sets form a σ-algebra containing the Borel sets. A measure on C, which is also called the Lebesgue measure, may be defined by assigning the measure 2^{−|t|} to U_t. This is readily generalized, to a Lebesgue measure on C^k for any k > 0.

Recall the definition of a meager set from section 15. A subset X of R^k is said to have the property of Baire if for some open set U, X ⊕ U is meager. The sets with the property of Baire form a σ-algebra containing the Borel sets.

A subset of R^k is said to be perfect if it is nonempty, closed, and contains no isolated points. A subset is said to have the perfect set property if it is countable, or contains a perfect subset.

The following theorem is a version for particular spaces. Suitably stated, it holds for more general spaces; see exercises 2H.8, 2H.5, and 2C.2 of [Moschovakis]. In particular, measures may be defined on Baire space; but this will be omitted, so theorems regarding measure will be stated for the Lebesgue measure on C^k.

Theorem 10. Every analytic subset of R^k or C^k is Lebesgue measurable, has the property of Baire, and has the perfect set property.

Remarks on proof: See theorem 11.18 of [Jech2]. ⊳

As a corollary, Π^1_1 subsets also are Lebesgue measurable and have the property of Baire. This theorem was proved early in the history of descriptive set theory. No progress was made until 1938, when Gödel announced some parts of theorem 16 below. Before giving this, some further methods of descriptive set theory will be outlined, in particular trees.

For a set A, a subset T ⊆ A^{<ω} which is closed under prefix (i.e., if s ∈ T and t is a prefix of s then t ∈ T) is a type of tree. To distinguish this type from general trees, authors use adjectives, such as “sequential” in [Jech2]. For the rest of this section, by a tree will be meant one of this type. These trees are extremely useful in descriptive set theory.
For example, closed subsets of A^ω can be characterized in terms of trees. For a set X ⊆ A^ω the set {s : s is a prefix of f for some f ∈ X} is a tree; let Pr(X) denote it. For a tree T let Br(T) denote {f ∈ A^ω : for all n, f↾n ∈ T}. An element of Br(T) is called a branch of T.

Lemma 11. For any tree T, Br(T) is closed. For any subset X, Br(Pr(X)) is the closure of X.

Proof: If f is not a branch of T then there is an n such that f↾n ∉ T, whence U_{f↾n} contains f and is disjoint from Br(T). This shows that Br(T) is closed. If f ∈ X then every prefix of f is in Pr(X), so f is a branch of Pr(X). This shows that X ⊆ Br(Pr(X)). If f ∈ Br(Pr(Y)) then for any n, f↾n is a prefix of some g_n ∈ Y. The g_n converge to f, so if Y is closed f ∈ Y. This shows that if Y is closed then Y = Br(Pr(Y)). Thus, if Y is closed and X ⊆ Y then Br(Pr(X)) ⊆ Br(Pr(Y)) = Y. This shows that Br(Pr(X)) is the closure of X. ⊳

The relation ⊃ (proper extension) is a transitive irreflexive relation on A^{<ω}. A tree is said to be well-founded if this relation is, that is, if there are no infinite branches, or Br(T) = ∅.

Lemma 12. A tree T is well-founded if and only if there is a function f : T ↦ Ord such that if s ⊃ t then f(s) < f(t). Further, the range of f may be taken to be a subset of |T|^+.

Proof: If f exists then clearly T is well-founded. If T is well-founded let f be the canonical rank function defined in section 32. ⊳

For a product space A_1^ω × ⋯ × A_k^ω, a tree may be defined as a subset of (A_1 × ⋯ × A_k)^{<ω} which is closed under prefix. Some authors define such a tree as a subset of the set of k-tuples of finite sequences ⟨s_1, …, s_k⟩, where s_i ∈ A_i^l for all i, that is, all s_i are the same length; this is readily seen to amount to the same thing. In particular, sequences s_i in A_i, all of the same length l, may be combined into a sequence in (A_1 × ⋯ × A_k)^l.
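The tree notions above can be tried out on finite data. The following is a minimal sketch, not part of the original text: it replaces infinite sequences by finite tuples, so it only illustrates prefix closure, a rank function as in lemma 12 (any finite tree is well-founded), and the combination of equal-length sequences. The names prefixes, pr, is_tree, rank and combine are hypothetical.

```python
def prefixes(s):
    """All prefixes of the finite sequence s (a tuple), including () and s itself."""
    return [s[:n] for n in range(len(s) + 1)]

def pr(X):
    """Finite analogue of Pr(X): every prefix of every sequence in X."""
    return {p for s in X for p in prefixes(s)}

def is_tree(T):
    """True if T is closed under prefix."""
    return all(p in T for s in T for p in prefixes(s))

def rank(T):
    """Rank function on a finite tree T (assumed closed under prefix):
    f(s) < f(t) whenever s properly extends t."""
    f = {}
    # Longest sequences first, so each one-step extension is ranked before its parent.
    for t in sorted(T, key=len, reverse=True):
        kids = [s for s in T if len(s) == len(t) + 1 and s[:len(t)] == t]
        f[t] = max((f[k] + 1 for k in kids), default=0)
    return f

def combine(*seqs):
    """Combine sequences s_1, ..., s_k of equal length l into one sequence of k-tuples."""
    assert len({len(s) for s in seqs}) == 1, "all sequences must have the same length"
    return tuple(zip(*seqs))

T = pr({(0, 1, 0), (0, 2)})
f = rank(T)
```

Here T consists of the five prefixes of the two given sequences, and f assigns larger values to shorter sequences, as lemma 12 requires. The function combine is the finite version of the operation that merges k sequences of length l into a single sequence of k-tuples.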
As for infinite sequences, the notation ⟨s_1, …, s_k⟩ may be used ambiguously to denote this operation.

If T is a tree on A × B, and x ∈ B^ω, let T⊘x = {s ∈ A^{<ω} : ⟨s, x↾|s|⟩ ∈ T} (the notation T_x is in common use for this operation). It is easy to see that ⟨w, x⟩ ∈ Br(T) if and only if w ∈ Br(T⊘x). B is usually ω, although the case B = 2 is also of interest. A is often ω also; however letting it be a larger cardinal adds a fundamental tool to descriptive set theory.

For an infinite cardinal κ, a subset S of a space A^ω is said to be κ-Suslin if there is a tree T on κ × A, such that S = π2[Br(T)], i.e., x ∈ S if and only if ∃w ∈ κ^ω (⟨w, x⟩ ∈ Br(T)). This last requirement can be restated variously as follows:
- Br(T⊘x) ≠ ∅.
- T⊘x is not well-founded.
- ∃y ∈ κ^ω (⟨x, y⟩ ∈ K) where K is a closed subset of (ω × κ)^ω, equipped with the usual topology (this follows using lemma 11).
In particular, a set is analytic if and only if it is ω-Suslin.

The following lemma is a fundamental fact of descriptive set theory.

Lemma 13. Suppose h ∈ N. A subset S of a standard space A^ω is Π^1_1(h) if and only if there is a tree T on ω × A which is computable from h, such that x ∈ S if and only if T⊘x is well-founded.

Remarks on proof: See theorem 13.1 of [Kanamori3], and also theorem 25.3 of [Jech2]. ⊳

Lemma 14. Suppose T is a tree on ω, M is a model of a sufficient fragment of ZFC, and T ∈ M. Then T is well-founded if and only if it is well-founded in M.

Remarks on proof: See lemma 25.4 of [Jech2]. ⊳

An absoluteness theorem for Π^1_1 predicates follows, called Mostowski’s absoluteness lemma. This, and lemma 13, are used in the proof of the following lemma.

Lemma 15. Suppose S is a subset of a standard space A^ω. Then S is Σ_1-definable without parameters in H_{ℵ1} if and only if S is Σ^1_2.

Remarks on proof: Theorem 25.25 of [Jech2] proves this for subsets of N; the argument for subsets of A^ω requires only minor changes. See also theorem 19.1 of [Miller]. ⊳

Theorem 16.
Suppose V = L.
a. There is a Δ^1_2 subset of C^2 which is neither Lebesgue measurable nor has the property of Baire.
b. There is a Π^1_1 subset of C which does not have the perfect set property.

Remarks on proof: For part a, <_L ∩ C^2 is such a subset. Suppose x ∈ H_{ℵ1} ∩ L. Let y = TC(x ∪ {x}); then |y| < ℵ1 and y ∈ L. As in the proof of theorem 20.8, and since y is transitive, y ∈ L_α for some α < ℵ1. It follows that <_L ∩ C^2 = <_{L_{ℵ1}} ∩ C^2 is Σ_1-definable without parameters in L_{ℵ1}, whence is Σ_1-definable without parameters in H_{ℵ1}. By lemma 15, <_L ∩ C^2 is Σ^1_2. See corollary 25.28 of [Jech2] for the rest of the proof, and also corollary 13.10 of [Kanamori3]. For part b, see theorem 13.12 of [Kanamori3]; this provides a more direct proof than that given for corollary 25.37 of [Jech2]. ⊳

It is a classic result that it follows in ZF that if R can be well-ordered then there is a subset of R which does not have the perfect set property, is not Lebesgue measurable, and does not have the property of Baire (theorem 11.4 of [Kanamori3]; see also exercises 10.1 and 11.7 of [Jech2]).

Models of ZF in which AC fails are of interest in various topics. It may be of interest that a weakened version of AC holds. An example of such is the principle of dependent choices, denoted DC. This states that if R is a binary relation on a nonempty set S, and ∀x ∈ S ∃y ∈ S R(x, y), then there is an infinite sequence ⟨x_i : i ∈ ω⟩ of elements of S such that ∀i ∈ ω R(x_i, x_{i+1}).

Theorem 17. Suppose M is a transitive model of ZFC containing an inaccessible cardinal. There is a generic extension M[G] with the following properties. Let S be the class in M[G] of infinite sequences of ordinals.
a. HOD(S) is a model of ZF+DC, in which every set of reals is Lebesgue measurable, has the property of Baire, and has the perfect set property.
b. OD(S) is a model of ZFC, in which every projective set of reals is Lebesgue measurable, has the property of Baire, and has the perfect set property.
Remarks on proof: This is theorem 26.14 of [Jech2]. The proof makes use of methods which have many applications, for example Levy collapse and random reals. ⊳

This is one of the earliest examples of a consistency strength bound. “There is an inaccessible cardinal” is an upper bound on the consistency strength of “every projective set is Lebesgue measurable” and “ZF + every set is Lebesgue measurable”. In [Shelah1] it is shown that if every Σ^1_3 set of reals is Lebesgue measurable then ℵ1 is an inaccessible cardinal in L, so the bound is exact.

One further fact about the projective hierarchy will be given, which illustrates additional basic methods of descriptive set theory.

Lemma 18. Suppose A^ω is a standard space, κ is an infinite cardinal, and Y ⊆ N × A^ω is κ-Suslin. Then X = π2[Y] is κ-Suslin.

Remarks on proof: See proposition 13.13.d of [Kanamori3]. Using a bijection of κ × ω with κ, two existentially quantified variables can be combined into one. ⊳

Theorem 19. Suppose A^ω is a standard space, h ∈ N, and X ⊆ A^ω is Σ^1_2(h). Then X is the projection of a tree T̂ ∈ L[h] on ℵ1 × A, and in particular is ℵ1-Suslin.

Remarks on proof: See theorem 13.14 of [Kanamori3]. By the proof of lemma 18, it suffices to prove the claim when X is Π^1_1(h); the bijection used in the proof can be taken to be defined by a sufficiently simple formula. Let T ∈ L[h] be a tree on ω × A such that x ∈ X if and only if T⊘x is well-founded. Let T̂ be the tree on ℵ1 × A, such that ⟨s, t⟩ ∈ T̂ if and only if ∀i, j ≤ |s| (⟨FS(i), t↾|FS(i)|⟩ ∈ T ∧ FS(i) ⊃ FS(j) ⇒ s(i) < s(j)). This definition is sensible since, using a fact noted above, |FS(i)| ≤ i ≤ |s| = |t|. Given a branch ⟨ŵ, x⟩ of T̂, for t ∈ T⊘x let f(t) = ŵ(Cd(t)). Using lemma 12, it follows that T⊘x is well-founded. Conversely, if T⊘x is well-founded, let f be as in lemma 12, and let ŵ be any function such that f(t) = ŵ(Cd(t)) for t ∈ T⊘x.
⊳

There has been considerable effort in modern set theory to determine relations between independent questions of descriptive set theory, and other questions, in particular concerning the determinacy of two-person games. Some discussion will be given in the next four sections. In particular, it will be shown in section 60 that if there is a measurable cardinal then theorem 10 holds for Σ^1_2 subsets. An alternative proof of this fact may be found in theorems 8.G.4,9 of [Moschovakis].

58. Determinacy.

The theory of infinite games dates back to the 1930’s (see [Telgarsky]). Papers concerning relations with set theory appeared as early as 1953, with additional results appearing throughout the 1960’s. The subject has continued to evolve, and has become a major one in modern set theory.

The “plays” of an infinite two-person game on a set A are the elements of A^ω. A “position” is an element s ∈ A^{<ω}; if |s| is even (resp. odd) “player 1” (resp. 2) plays, by appending an element of A to s. The game is specified by giving a subset W ⊆ A^ω, which are the plays in which player 1 wins. The notation G_A(W) is in common use to denote this game. When A is fixed this may be abbreviated to G(W).

A strategy for player 1 (resp. 2) in G_A(W) is a function σ1 : ∪_{l even} A^l ↦ A (resp. σ2 : ∪_{l odd} A^l ↦ A). A play ⟨x_l⟩ accords with σ1 (resp. σ2) if x_l = σ1(⟨x_0, …, x_{l−1}⟩) whenever l is even (resp. x_l = σ2(⟨x_0, …, x_{l−1}⟩) whenever l is odd). A strategy σ1 (resp. σ2) is a winning strategy if any play that accords with the strategy is in W (resp. A^ω − W). Thus, no matter how the other player plays, a player playing the strategy wins. A game G_A(W), or the set W, is said to be determined if either player 1 or player 2 has a winning strategy.

Recall from section 57 that A^ω may be considered a topological space, where a basis for the topology is {U_t : t ∈ A^{<ω}}.

Theorem 1. If W is a closed subset of A^ω then G(W) is determined.
Proof: Say that a position s is viable if player 2 does not have a winning strategy from then on, i.e., no σ2 with which s accords is winning. If ⟨x_0, …, x_{l−1}⟩ is viable, where l is even, then there must be some x_l such that for any x_{l+1}, ⟨x_0, …, x_{l−1}, x_l, x_{l+1}⟩ is viable. Suppose player 2 does not have a winning strategy, so that the empty sequence is viable. Let σ1 be any strategy where at ⟨x_0, …, x_{l−1}⟩, player 1 plays any x_l as described above. Suppose f ∈ A^ω accords with σ1. If f ∉ W then since W is closed, there is some position s ⊆ f such that f ∈ U_s ⊆ W^c. But then s is not viable, contradicting the assumption that f accords with σ1. Hence f ∈ W. This shows that σ1 is a winning strategy. ⊳

D. Martin proved in 1975 that every Borel subset of A^ω is determined. A proof may be found in [Kechris]; some remarks will be given here. Before proceeding, some basic facts about games will be noted.
- Games G_1 and G_2 are said to be equivalent if player 1 (resp. 2) has a winning strategy in G_1 if and only if he has one in G_2.
- A tree T is said to be pruned if it has no finite maximal branch, i.e., for any s ∈ T there is a t ∈ T with t ⊃ s.
- The notion of a game may be generalized. Given a pruned tree L of “legal” positions, and W ⊆ Br(L), G(L, W) is the game restricted to those plays where all positions (finite prefixes) are in L. (For an “unrestricted” game, L = A^{<ω}.)
- A strategy on L is a function f whose domain is the even length, or odd length, sequences which are elements of L, and such that if s = ⟨x_0, …, x_{l−1}⟩ ∈ L is in the domain then ⟨x_0, …, x_{l−1}, f(s)⟩ ∈ L.
- A game G(L, W) can be converted to an equivalent game G(W′), by adding to W those plays f where the least l such that f↾l ∉ L is even. Thus, G_A(L, W) may be considered an abbreviation.
- If π : A ↦ B is a bijection then G_B(W′) is equivalent to G_A(W), where W′ = {π ∘ f : f ∈ W}.
- If A ⊆ B then G_A(W) is equivalent to G_B(A^{<ω}, W).
- A game G_ω(W) can be converted to an equivalent game G_2(W′).
Let c^0_n (resp. c^1_n) be the sequence of 2n 0’s followed by 10 (resp. 01). Given f ∈ ω^ω let f′ be the concatenation of the strings c^{p_i}_{n_i} where n_i = f(i) and p_i equals 0 if i is even, and 1 if i is odd. W′ = W_1 ∪ W_2 where W_1 is {f′ : f ∈ W}, and W_2 is the “illegal” strings where the first mistake is made by player 2.
- A game G(W) can be converted to a game G(W′) such that player 1 (resp. 2) has a winning strategy in G(W) if and only if player 2 (resp. 1) has a winning strategy in G(W′). Given f, let f′ be the sequence where f′(n) = f(n + 1) (i.e., the first element is deleted); then W′ = {f : f′ ∈ W^c}.
- Theorem 1 generalizes, to G(L, W) where W is a closed or open subspace of the subspace Br(L) ⊆ A^ω (theorem 20.1 of [Kechris]).

There is a standard method for proving determinacy, which cannot be better described than by the following quotation from [Kechris]: “The idea is . . . to associate to the game G(T, X) an auxiliary game G(T∗, X∗), which is known to be determined, usually a closed or open game, in such a way that a winning strategy for any of the players in G(T∗, X∗) gives a winning strategy for the corresponding player in G(T, X).” Typically, a position of the auxiliary game has “additional information”; this may be discarded, producing a position of the original game.

In the case of Borel determinacy, the auxiliary game is constructed using the notion of a “cover” of a set L of legal positions. Such is given by the following:
- a set L̃ of legal positions of the auxiliary game;
- a map π : L̃ ↦ L; and
- a map φ from strategies on L̃ to strategies on L.
These must satisfy the following restrictions.
- For f̃ ∈ Br(L̃), |π(f̃)| = |f̃|.
- For strategies σ̃, τ̃ on L̃, and any n, if σ̃ and τ̃ agree on positions s with |s| ≤ n then so do φ(σ̃) and φ(τ̃).
- If f ∈ Br(L) accords with φ(σ̃) then there is an f̃ ∈ Br(L̃) which accords with σ̃, such that π(f̃) = f.

Lemma 2. Suppose ⟨L̃, π, φ⟩ is a cover of L, and W ⊆ Br(L).
If σ̃ is a winning strategy in G(L̃, π^{−1}[W]) then φ(σ̃) is a winning strategy in G(L, W).

Proof: Suppose σ̃_j wins in G(L̃, π^{−1}[W]) where j = 1, 2, and f ∈ Br(L) accords with φ(σ̃_j). By the definition of a cover there is an f̃ ∈ Br(L̃) which accords with σ̃_j and such that π(f̃) = f. Then since σ̃_j is winning, if j = 1 then f̃ ∈ π^{−1}[W], and if j = 2 then f̃ ∈ π^{−1}[W^c]. Thus, if j = 1 then f ∈ W, and if j = 2 then f ∈ W^c. ⊳

A subset of a topological space is said to be “clopen” if and only if it is both closed and open. The basic open sets U_t ⊆ A^ω are readily seen to be clopen. For an integer k ≥ 0 a cover is said to be a k-cover if L̃ ∩ ω^{≤2k} = L ∩ ω^{≤2k} and π↾(L̃ ∩ ω^{≤2k}) is the identity. A cover is said to unravel W ⊆ Br(L) if π^{−1}[W] is a clopen subset of Br(L̃).

Lemma 3. Suppose L is a nonempty pruned tree and W ⊆ Br(L) is closed. For each k ≥ 0 there is a k-cover of L that unravels W.

Remarks on proof: This is lemma 20.7 of [Kechris]. Its proof comprises the bulk of the work in proving Borel determinacy. ⊳

Lemma 4. Suppose k ∈ ω, and ⟨T_{i+1}, π_{i+1}, φ_{i+1}⟩ is a (k+i)-cover of T_i for all i ∈ ω. Then there is a pruned tree T_ω, and for each i ∈ ω maps π_{ωi} and φ_{ωi}, such that for all i ∈ ω, ⟨T_ω, π_{ωi}, φ_{ωi}⟩ is a (k+i)-cover of T_i, π_{i+1} ∘ π_{ω,i+1} = π_{ωi}, and φ_{i+1} ∘ φ_{ω,i+1} = φ_{ωi}.

Remarks on proof: This is lemma 20.8 of [Kechris]. ⊳

Theorem 5. Suppose L is a nonempty pruned tree and W ⊆ Br(L) is Borel. For each k ≥ 0 there is a k-cover of L that unravels W.

Remarks on proof: This is theorem 20.6 of [Kechris]. ⊳

The following theorem was proved in 1953, in the same paper as theorem 1.

Theorem 6. There is a set of reals W such that G_ω(W) is not determined.

Proof: The number of strategies is ℵ0^{ℵ0} = 2^{ℵ0}. Let σ_i^α for α < 2^{ℵ0} be an enumeration of the σ_i for i = 1, 2. Sets W_i = {w_i^α : α < 2^{ℵ0}} for i = 1, 2 may be constructed by transfinite recursion as follows.
At stage α, first add to W_2 some w ∈ ω^ω − (W_1 ∪ W_2) so that σ_1^α is defeated, then add to W_1 some w ∈ ω^ω − (W_1 ∪ W_2) so that σ_2^α is defeated. Such w exist, because there are 2^{ℵ0} elements of N according with a given strategy, while fewer than 2^{ℵ0} elements have been added at earlier stages. Clearly neither player has a winning strategy in G_ω(W_1). ⊳

The axiom of determinacy (AD) states that every subset of N is determined. This can only hold in models where AC fails. By facts noted above AD holds if and only if games in 2^ω are determined. Various questions regarding determinacy have been of considerable interest in modern set theory, including the following.
- Does determinacy hold for sets of higher complexity than Borel?
- What effect does the assumption of determinacy for further types of sets have on the universe, in particular the regularity properties of sets of reals?
- What properties must models of ZF+AD have?
Chapters 27 to 32 of [Kanamori3] contain a survey of this work; other references include [Jech2]. Some discussion will be given in the next three sections.

Theorem 7. In ZF, AD implies that every countable family of nonempty subsets of ω^ω has a choice function.

Remarks on proof: See lemma 33.2 of [Jech2]. ⊳

In consideration of models where AD holds, sometimes the above theorem suffices as a substitute for AC, and sometimes DC is assumed to hold.

59. Determinacy and descriptive set theory.

Lebesgue measurability, the property of Baire, and the perfect set property, for a set of reals, are known as regularity properties. As seen in section 57, analytic sets have these properties. It was realized in 1964, and earlier for the property of Baire, that these properties could be formulated in terms of games.

The Banach-Mazur game has alphabet A = 2^{<ω} − {∅}. Any play is legal. A play p may be transformed to an element f ∈ 2^ω by concatenating the sequences p(i). Let χ denote the map p ↦ f.

Theorem 1. In the game as above, suppose X ⊆ C and W = χ^{−1}[X].
a.
Player 1 has a winning strategy in G_A(W) if and only if U_t − X is meager for some t ∈ 2^{<ω}.
b. Player 2 has a winning strategy in G_A(W) if and only if X is meager.
c. G_A(χ^{−1}[X − ∪_t {U_t : U_t − X is meager}]) is determined if and only if X has the property of Baire.

Remarks on proof: See proposition 27.3 and corollary 27.4 of [Kanamori3]. These are proved for N; the changes for C are minimal. ⊳

The perfect set game has alphabet A = 2^{<ω} ∪ 2. A play is legal if elements in even (resp. odd) positions are in 2^{<ω} (resp. 2). A play p may be transformed to an element f ∈ 2^ω by concatenating the sequences p(i) (elements of 2 being considered sequences of length 1). Let χ denote the map p ↦ f.

Theorem 2. In the game as above, suppose X ⊆ C, L is the legal positions, and W = χ^{−1}[X].
a. Player 1 has a winning strategy in G_A(L, W) if and only if X has a perfect subset.
b. Player 2 has a winning strategy in G_A(L, W) if and only if X is countable.
c. G_A(L, W) is determined if and only if X has the perfect set property.

Remarks on proof: For parts a and b see proposition 27.5 of [Kanamori3]. Part c follows easily. ⊳

The covering game has alphabet 2 ∪ (2^{<ω})^{<ω}, and a real parameter ε > 0. Given an element t̃ = ⟨t_1, …, t_k⟩ ∈ (2^{<ω})^{<ω}, let N_t̃ = ∪_{i=1}^{k} U_{t_i}. Let µ denote the Lebesgue measure. A play p is legal if:
1. elements in even (resp. odd) positions are in 2 (resp. (2^{<ω})^{<ω}); and
2. if p(2n + 1) = t̃ then µ(N_t̃) < ε/2^{2(n+1)}.
Given a legal play p, let r_p be the concatenation of the values in the even positions, and let O_p be the union of the neighborhoods in the odd positions. Given X ⊆ 2^ω, let W = {p : r_p ∈ X − O_p}. Y is said to be a minimal cover of X if X ⊆ Y, Y is measurable, and µ(Y) is as small as possible among such Y (it is not difficult to show that such Y exists).

Theorem 3. In the game as above, suppose X ⊆ C, and L is the legal positions.
a.
If player 1 has a winning strategy in G_{A,ε}(L, W) then there is a measurable Y ⊆ X such that µ(Y) > 0.
b. If player 2 has a winning strategy in G_{A,ε}(L, W) then there is an open Y ⊇ X such that µ(Y) < ε.
c. Let Y be a minimal cover of X, and suppose that for any ε > 0, G_{A,ε}(L, Y − X) is determined; then X is Lebesgue measurable.

Remarks on proof: Proposition 27.7 and corollary 27.8 of [Kanamori3] give a version for N; this is readily adapted to the case of C. ⊳

Theorem 4. It follows in ZF from AD that every subset of C has the property of Baire, has the perfect set property, and is Lebesgue measurable.

Proof: By remarks in section 58, the games of theorems 1, 2, and 3 are determined. One may verify that this follows in ZF; for example there is a definable well-order on the positions. ⊳

The theorem is stated for subsets of C so that the Lebesgue measure may be used; as usual more general facts hold.

This theorem may be seen to follow for various “pointclasses” Γ, i.e., if determinacy holds for all sets in Γ then the regularity properties hold for all sets in Γ. The notion of a pointclass is frequently used in descriptive set theory. A pointclass is a set of subsets of one or more Polish spaces. Some authors (such as [Kanamori3] and [Kechris]) call them classes.

Theorem 5. It follows in ZF from determinacy for Γ that every set in Γ has the property of Baire, has the perfect set property, and is Lebesgue measurable, where Γ is the Σ^1_n(h) or Π^1_n(h) subsets of C for n ≥ 1 and h ∈ N.

Remarks on proof: The transformations involved in proving theorem 4 are all computable. ⊳

Note that by remarks in section 58 determinacy for Σ^1_n(h) is equivalent to determinacy for Π^1_n(h). For a more general version of the theorem see for example exercises 6A.12,16,19 of [Moschovakis]. The theorem follows for the boldface classes by theorem 56.9 (or directly).
Projective determinacy (PD) is the statement that determinacy holds for any projective set; it follows from PD that any projective subset of C has the regularity properties.

Theorem 6. It follows in ZF from determinacy for the Π^1_n(h) subsets of C that every Σ^1_{n+1}(h) subset of C has the property of Baire, has the perfect set property, and is Lebesgue measurable, for n ≥ 1 and h ∈ N.

Remarks on proof: This is proved for the boldface classes in N, in theorem 27.14 of [Kanamori3]. The method of proof is to define an “unfolded” version of the games given above, where player 1’s plays are augmented with a value y(i), where y will be the value of the leading existentially quantified variable. See also sections 21.B,C of [Kechris]. ⊳

Various other properties of pointclasses are decided by assuming determinacy. Among the most important are reduction and uniformization. The modern theory of these involves the use of norms and scales. These in turn have become mainstays of descriptive set theory and related areas.

In what follows a pointclass is assumed to be “suitable”, i.e., to have any required properties. The classes of the lightface or boldface projective hierarchy of subsets of standard spaces as considered in section 57 are suitable.

Suppose Γ is a pointclass. Given X ∈ Γ let X^c denote T − X where T is the space of which X is a subset. Let ¬Γ denote {X^c : X ∈ Γ}, the “dual” pointclass. Let Δ_Γ denote Γ ∩ ¬Γ. Let ∃^1 Γ denote the sets X which can be written in the form {x : ∃w(⟨w, x⟩ ∈ Y)} where Y ∈ Γ and Y ⊆ N × T for some space T. ∀^1 Γ is similarly defined.

Recall from section 32 that a quasi-order is a reflexive and transitive binary relation. If ≤ is a quasi-order then the relation x ≤ y ∧ y ≰ x is a transitive irreflexive relation, called the strict part, and denoted x < y. A pre-well-order is defined to be a quasi-order, such that x ≤ y ∨ y ≤ x, and < is well-founded. In descriptive set theory, a norm on a set X is a function f : X ↦ Ord.
Given such, the relation f(x) ≤ f(y) is readily verified to be a pre-well-ordering. On the other hand, given a pre-well-ordering ≤, the canonical rank function ρ for < (defined in section 32) is a norm; further, as is easily verified, x ≤ y if and only if ρ(x) ≤ ρ(y).

For a pointclass Γ a norm ρ on a set X ∈ Γ is said to be a Γ-norm if the relation “x ∈ X ∧ ρ(x) ≤ ρ(y)” is in Δ_Γ. Γ is said to have the pre-well-ordering property if every X ∈ Γ has a Γ-norm.

A pointclass Γ is said to have the reduction property if whenever X, Y ∈ Γ there are X′, Y′ ∈ Γ such that X′ ⊆ X, Y′ ⊆ Y, X′ ∪ Y′ = X ∪ Y, and X′ ∩ Y′ = ∅. A pointclass Γ is said to have the separation property if whenever X, Y ∈ Γ and X ∩ Y = ∅ there is a Z ∈ Δ_Γ such that X ⊆ Z and Y ⊆ Z^c.

Some facts about the foregoing pointclass properties will be stated without proof; references are given to proofs in [Kanamori3].
- Γ has the reduction property if and only if ¬Γ has the separation property (29.2).
- If Γ has the reduction property then it does not have the separation property (assuming that Γ has universal sets) (29.3).
- If Γ has the pre-well-ordering property then it has the reduction property (29.7).
- For h ∈ ω^ω, Π^1_1(h) has the pre-well-ordering property (29.8).
- Suppose V = L. For h ∈ ω^ω and n ≥ 2, Σ^1_n(h) has the pre-well-ordering property (29.11).
- (First periodicity theorem.) It follows in ZF+DC that if determinacy holds for Δ_Γ and ∃^1 Γ ⊆ Γ, then if Γ has the pre-well-ordering property then ∀^1 Γ does (29.13).
- Suppose PD holds. For h ∈ ω^ω, the classes Π^1_n(h) for n odd and Σ^1_n(h) for n even and nonzero, have the pre-well-ordering property (29.14).

A semiscale on a set X is a sequence ⟨ρ_i : i ∈ ω⟩ of norms on X such that, if ⟨x_i : i ∈ ω⟩ is a sequence in X, x = lim_{i→∞} x_i, and for each n there is a λ_n such that ρ_n(x_i) = λ_n for sufficiently large i, then x ∈ X. A scale is a semiscale such that in addition, for all n, ρ_n(x) ≤ λ_n.
(The term “scale” is used for other purposes in set theory also.) For a pointclass Γ a scale ⟨ρ_i⟩ on a set X ∈ Γ is said to be a Γ-scale if the 3-ary relation “x ∈ X ∧ ρ_n(x) ≤ ρ_n(y)” is in Δ_Γ. Γ is said to have the scale property if every X ∈ Γ has a Γ-scale.

Recall the definition of a function uniformizing a relation from section 46. Γ is said to have the uniformization property if whenever R is a binary relation in Γ, there is a function f in Γ which uniformizes R.

Again, the following will be stated without proof, with references being to [Kanamori3] unless otherwise indicated.
- If Γ has the uniformization property then Γ has the reduction property (exercise 1C.8 of [Moschovakis]).
- If Γ has the scale property and ∀^1 Γ ⊆ Γ then Γ has the uniformization property (30.4).
- For h ∈ ω^ω, Π^1_1(h) has the scale property (see theorem 8 below).
- Suppose V = L. For h ∈ ω^ω and n ≥ 2, Σ^1_n(h) has the scale property (30.5).
- (Second periodicity theorem.) It follows in ZF+DC that if determinacy holds for Δ_Γ and if ∃^1 Γ ⊆ Γ, then if Γ has the scale property then ∀^1 Γ does (30.8).
- Suppose PD holds. For h ∈ ω^ω, the classes Π^1_n(h) for n odd and Σ^1_n(h) for n even and nonzero, have the scale property (30.9).

An even stronger hypothesis than PD is AD^{L(R)}. Consequences of AD have become of interest, since if AD^{L(R)} holds then such consequences hold in L(R), and if they are absolute, in V. The reader is referred to [Kanamori3] and [Jackson] for surveys of this extensive and ongoing work.

To illustrate the use of scales, an outline of a proof of a fact stated above will be given. To begin with, note that for an ordinal γ and a nonempty subset X ⊆ γ^ω, if X is closed then X has a lexicographically least (“leftmost”) branch. To see this, let s_0 be the empty sequence. Letting ∗ denote concatenation, let s_{i+1} = s_i ∗ α where α is the least ordinal such that s_i ∗ α ∈ Pr(X). By induction, s_i is in Pr(X) and is lexicographically less than or equal to x↾i for any x ∈ X.
Since X is closed, x_l = ∪_i s_i is in X, and is the leftmost branch.

The leftmost branch x_l of a subset X ⊆ γ^ω has the property that, for any x ∈ X with x ≠ x_l, there exists an i ∈ ω such that x_l(i) < x(i). A branch x_0 ∈ X is said to be the honest leftmost branch of X if, for any x ∈ X, x_0(i) ≤ x(i) for all i ∈ ω. Such is clearly the leftmost branch of X if it exists.

For an ordinal γ, a norm f : X ↦ Ord is said to be “into γ” if f[X] ⊆ γ. A scale is said to be “into γ” if its norms are. Recall the notion of a standard space from section 57.

Lemma 7. Let A^ω be a standard space. For any subset X ⊆ A^ω, there is a scale on X into γ if and only if there is a tree T on γ × A, such that X = π2[Br(T)], and Br(T⊘x) has an honest leftmost branch for all x ∈ X.

Remarks on proof: See proposition 30.2 of [Kanamori3]. ⊳

Theorem 8. For h ∈ ω^ω, the class of Π^1_1(h) subsets of a standard space has the scale property.

Remarks on proof: Suppose X ⊆ A^ω, and T is a tree on ω × A such that x ∈ X if and only if T⊘x is well-founded. Let T̂ be the tree on ℵ1 × A, derived from T as in the proof of theorem 56.18. Suppose x ∈ X. Let ρ be the canonical rank function on T⊘x. Let ŵ be the function such that ŵ(i) = ρ(FS(i)) if FS(i) ∈ T⊘x, else 0. This function is the honest leftmost branch of Br(T̂⊘x). To see this, note that if ρ_0 is the canonical rank function on a well-founded tree, and ρ is any other rank function, then it follows by induction on ρ_0(t) that ρ_0(t) ≤ ρ(t). See exercise 30.3 of [Kanamori3] for further details. ⊳

The following are noted without proof; see exercise 30.7 of [Kanamori3].
- If Γ has the scale property and ∀^1 Γ ⊆ Γ then ∃^1 Γ has the scale property.
- For h ∈ ω^ω, the class of Σ^1_2(h) subsets of a standard space has the scale property.

60. Determinacy and 0#.

Consider the following two statements.
1. a# exists for all subsets a ⊆ ω.
2. Π^1_1 games on ω are determined.
D. Martin proved in 1970 that 1⇒2. L. Harrington proved in 1978 that 2⇒1.
In fact lightface versions can be proved. These results show that before determinacy beyond Π^1_1 can be assumed, the existence of 0# must first be; in particular the consequences of determinacy for higher complexity sets have no bearing on the latter question. An overview of these results will be given in this section; this will involve a treatment of further basic methods in descriptive set theory. The underlying space will be N for the first statement, and C for the second.

For an ordinal γ, let <_KB be the relation on γ^{<ω}, where s <_KB t if and only if s ⊃ t, or for some i ∈ Dom(s) ∩ Dom(t), ∀j < i (s(j) = t(j)) ∧ s(i) < t(i). This order is called the Kleene-Brouwer order. A routine verification shows that it is a linear order.

Lemma 1. Suppose T is a tree on γ. Then T is well-founded if and only if T is well-ordered by <_KB.
Remarks on proof: See exercise 13.2 of [Kanamori3]. ⊳

Suppose X is Π^1_1, and T is a tree on ω such that x ∈ X if and only if T⊘x is well-founded. For x ∈ ω^ω let <_x be the relation on ω where i <_x j if and only if
FS(i) ∉ T⊘x ∧ FS(j) ∉ T⊘x ∧ i < j, ∨
FS(i) ∉ T⊘x ∧ FS(j) ∈ T⊘x, ∨
FS(i) ∈ T⊘x ∧ FS(j) ∈ T⊘x ∧ FS(i) <_KB FS(j).
The codes of non-members of T⊘x come first, well-ordered by <; these are followed by the codes of members of T⊘x, ordered by <_KB on the sequences they code. It follows that <_x is a well-order if and only if <_KB restricted to T⊘x is a well-order, if and only if x ∈ X.

For t ∈ ω^{<ω} let T⊘t = {s ∈ A^{<ω} : |s| < |t| ∧ ⟨s, t↾|s|⟩ ∈ T}; let <_t be the relation on {i : i < |t|} defined as <_x, except using T⊘t rather than T⊘x. It readily follows that <_t is a linear order, that if t_1 ⊆ t_2 then <_{t_1} ⊆ <_{t_2}, and that <_x = ∪_{t⊆x} <_t. Let T* be the tree on ℵ_1 × ω such that ⟨s, t⟩ ∈ T* if and only if ∀i, j < |t| (i <_t j ⇒ s(i) < s(j)). It readily follows that <_x is a well-order if and only if there is a map w : ω ↦ ℵ_1 such that i <_x j ⇒ w(i) < w(j), if and only if there is a w ∈ ℵ_1^ω such that ⟨w, x⟩ ∈ Br(T*).
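As an illustration of the Kleene-Brouwer order defined earlier in this section, the following is a minimal Python sketch for finite sequences of natural numbers, represented as tuples. The names kb_less and kb_sorted are hypothetical helpers introduced only for this example; they are not part of the text or of any library.

```python
from functools import cmp_to_key

def kb_less(s, t):
    """Return True if s <_KB t, for finite sequences s and t given as tuples."""
    # Find the first position where both sequences are defined and differ.
    for a, b in zip(s, t):
        if a != b:
            return a < b
    # No difference on the common domain: s precedes t exactly when
    # s is a proper extension of t.
    return len(s) > len(t)

def kb_sorted(nodes):
    """Sort a finite collection of sequences into Kleene-Brouwer order."""
    def compare(s, t):
        if kb_less(s, t):
            return -1
        if kb_less(t, s):
            return 1
        return 0
    return sorted(nodes, key=cmp_to_key(compare))

# A small finite tree on omega: every initial segment of a member is a member.
tree = [(), (0,), (1,), (0, 0), (0, 1), (1, 0)]
ordered = kb_sorted(tree)
```

In the sorted list each node appears after all of its proper extensions, so the empty sequence comes last. A finite tree is trivially well-founded, and the list is correspondingly a finite well-order, in line with lemma 1.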
For the following, an element h ∈ ω^ω can be considered as its code, a real; h# may be used to denote h̃#.

Theorem 2. Suppose h# exists where h ∈ ω^ω. Then Π^1_1(h) subsets of N are determined.
Remarks on proof: See theorem 31.2 of [Kanamori3]. Let A be the alphabet (ℵ_1 × ω) ∪ ω. A position is legal if the elements in even (resp. odd) positions are in ℵ_1 × ω (resp. ω). Let M be the set of legal positions. Given a play p, let x be the element of ω^ω where x(i) = π_2(p(i)) for i even, and p(i) for i odd; and let w be the element of ℵ_1^ω where w(i) = π_1(p(2i)). Let W be the plays such that ⟨w, x⟩ ∈ Br(T*). It is readily seen that W is closed in Br(M) (if player 1 loses there is some i such that w↾2i is not order-preserving). By theorem 20.1 of [Kechris], G_A(M, W) is determined. By lemma 56.13 T can be chosen in L[h]. Given such a T, it follows readily that T* and W are in L[h], so in fact G_A(M, W) is determined in L[h].

Suppose σ_1 ∈ L[h] is a winning strategy for player 1 in G_A(M, W) in L[h]. If there is a play in V according with σ_1 which is in Br(M) − W, then since Br(M) − W is open there are such plays in L[h]. Thus, σ_1 is a winning strategy for player 1 in V. Let σ_1′ be the strategy for G_ω(X), where player 1 reconstructs the values w(i) and then plays according to σ_1; this is a winning strategy for player 1 in G_ω(X).

Suppose σ_2 ∈ L[h] is a winning strategy for player 2 in G_A(M, W) in L[h]. Let P ⊆ M be the positions according with σ_2 which are “viable” for player 1, i.e., there is a play extending the position which player 1 wins. P must be well-founded in L[h]. By absoluteness of well-foundedness (lemma 0.3 of [Kanamori3]), P is well-founded in V. It follows that σ_2 is a winning strategy for player 2 in V. By the hypothesis that h# exists, there is a Skolem term defining σ_2 involving Silver indiscernibles of L[h]; further the indiscernibles may be taken as less than γ where γ is an ordinal less than ℵ_1.
Given a position u of length 2i + 1, let s and t be defined as w and x for a play p as above. If s(j) ≥ γ for all j < 2i + 1, and the elements s(j) are distinct, then σ_2(u) does not depend on s; let σ_2′ be the strategy for player 2 in G_ω(X), where σ_2′(t) equals σ_2(u) for any such s. Suppose x accords with σ_2′. If x ∈ X then there is a w ∈ ℵ_1^ω such that ⟨w, x⟩ is in Br(T*) and accords with σ_2. But then x ∉ X, a contradiction. Thus, σ_2′ is a winning strategy for player 2 in G_ω(X). ⊳

By theorem 56.9, if h# exists for every h ∈ ω^ω, then any game in Σ^1_1 is determined. By theorem 58.6, it follows that every Σ^1_2 set has the regularity properties. As noted in section 42, the hypothesis follows if a measurable cardinal exists. This proves a fact mentioned at the end of section 57.

The converse of theorem 2 also holds (theorem 9 below). An outline of a proof will be given, following [Harrington]. This will be for the unrelativized lightface pointclass; but as noted in [Harrington], the argument may be adapted to the lightface pointclass relative to an oracle. The proof makes use of a notion of forcing, that of tagged trees; see [Sami] for a forcing-free proof.

By a real will be meant an element of C, indifferently considered as a subset of ω. Given reals a and b, a ≤_T b will denote that a is computable from b; and a ≡_T b that a ≤_T b ∧ b ≤_T a. A set X of reals is said to be Turing closed if a ∈ X ∧ a ≡_T b ⇒ b ∈ X. A Turing cone is a set of the form {b : a ≤_T b}.

Various sets may be coded as a real. A binary relation R on ω may be coded as {GP(x, y) : R(x, y)} (GP is defined in appendix 2). The code for a function (in particular a strategy) is a special case. A countable structure for the language of set theory may be coded by the code for the membership relation, after choosing some enumeration of the domain. A tree T may be coded as the set of codes of the sequences in T.

Lemma 3. Suppose X is Turing closed. If player 1 (resp.
2) has a winning strategy in G_2(X) then X (resp. X^c) contains a Turing cone.
Proof: Let f be a winning strategy for player 1, coded as a real. Suppose f ≤_T g. Let h be the play according with f, where player 2 plays g. It is easily seen that h ≡_T g. Also, h ∈ X, whence g ∈ X since X is Turing closed. The proof for player 2 is similar. ⊳

For the rest of this section, for a real a let ω_1(a) denote the smallest ordinal α > ω such that L_α[a] is admissible. Let A_0 = {x ∈ C : ∃y ≤_T x (y codes an end extension of L_{ω_1(x)})}.

Lemma 4. A_0 is Turing closed and Σ^1_1.
Remarks on proof: This is stated without proof following definition 3.1 of [Harrington]. Clearly A_0 is Turing closed. The statement that y codes a structure for the language of set theory having the required properties is first order with free second order variables x and y. ⊳

Lemma 5. A_0 is cofinal in the quasi-order ≤_T on C.
Remarks on proof: See lemma 3.2 of [Harrington]. Given x, using facts from model theory, choose a model M of KP such that M contains no nonstandard integers, x ∈ M, and there is an element α ∈ M which is a nonstandard countable ordinal in M. Let w be a real which codes L_α in M. Let y as in the definition of A_0 be GP(x, w). ⊳

Lemma 6. If A_0 is determined then it contains a Turing cone.
Proof: This follows by lemmas 3 and 5. ⊳

For a limit ordinal α, let Q(α) be the poset whose elements are the pairs ⟨t, r⟩ where t is a finite tree on ω and r : t ↦ α ∪ {∞} is a function with the following properties:
1. if s ⊃ t and r(t) ≠ ∞ then r(s) < r(t); and
2. r(∅) = ∞.
The relation ⟨t_1, r_1⟩ ≥_{Q(α)} ⟨t_2, r_2⟩ holds if t_1 ⊆ t_2 and r_1 = r_2↾t_1.

Given a notion of forcing ⟨M, Q(α)⟩ (where M is a transitive set which is a model of some fragment of set theory), an M-generic filter in Q(α) yields a tree T and an extended norm R : T ↦ α ∪ {∞}. It follows by basic facts about forcing that if R(t) ≠ ∞ then R(t) = sup{R(s) + 1 : s is a son of t}. A proof of the last mentioned fact will be outlined.
In a poset P, for p ∈ P let p_≤ denote {q : q ≤ p}; this may be considered a sub-poset, with the inherited order. A subset X ⊆ P is said to be dense below p if X is a dense subset of p_≤, i.e., X ⊆ p_≤ and ∀q ≤ p ∃r ∈ X (r ≤ q). If G is an M-generic filter for a notion of forcing ⟨M, P⟩, p ∈ G, and D is dense below p, then D ∩ G ≠ ∅. This may be seen by noting that D′ = D ∪ {q ∈ P : p and q are incompatible} is dense.

Suppose ⟨t, r⟩ ∈ Q(α), u ∈ t is a node, r(u) > β, and β > r(v) for any son v of u. Then the set of conditions extending ⟨t, r⟩ in which u has a son labeled β is readily seen to be dense below ⟨t, r⟩. The claim for R follows. Further discussion of this notion of forcing will be omitted; see [Harrington] and [Steel2].

Lemma 7. Suppose {x : a_0 ≤_T x} ⊆ A_0, and α > ω is a countable ordinal such that L_α[a_0] is admissible. Then α is a cardinal in L.
Remarks on proof: See lemma 3.4 of [Harrington]. Suppose κ < α is a cardinal in L, X ⊆ κ, and X ∈ L. Choose a countable limit ordinal ξ > α with X ∈ L_ξ. Choose an L_ξ-generic filter in Q(α). Let T be its tree, and let b = GP(a_0, T); then ξ < ω_1(b). Since b ∈ A_0, let N be an end-extension of L_ξ, whose code is computable from b. Then N ∈ L_{κ+1}. From this it may be shown that X ∈ L_{κ·3}. It then follows that X ∈ L_α[a_0, T]. Since T is generic a forcing argument may be given to conclude that X ∈ L_α[a_0].
Thus far it has been shown that if X is a constructible subset of κ then X ∈ L_α[a_0]. An ordinal less than κ^{+L} is coded by such an X, so is in L_α[a_0] since L_α[a_0] is admissible, so is in L. Thus, α is a cardinal in L. ⊳

Corollary 8. Suppose {x : a_0 ≤_T x} ⊆ A_0, and α > ω is an ordinal such that L_α[a_0] is admissible. Then α is a cardinal in L.
Remarks on proof: See lemma 7.22 of [MansWeit]. ⊳

Theorem 9. Suppose Σ^1_1 subsets of C which are Turing closed are determined. Then 0# exists.
Remarks on proof: This is theorem 4.1 of [Harrington]. Let A_0 be as above.
By hypothesis and lemma 4, A_0 is determined, whence by lemma 6 some a_0 with {x : a_0 ≤_T x} ⊆ A_0 exists. In L[a_0], an elementary substructure X ⊆ L_{ℵ_3}[a_0] may be chosen so that |X| = ℵ_1, ℵ_2 ∈ X, and X^ω ⊆ X. Let j : L_α[a_0] ↦ L_{ℵ_3}[a_0] be the inverse of the transitive collapse. It follows that L_α[a_0] is admissible, whence by corollary 8 α is a cardinal in L. Also, α < ℵ_2. Since |X| = ℵ_1, j is nontrivial; let κ be the critical point. Let U = {Y ⊆ κ : Y ∈ L, κ ∈ j(Y)}. It follows that U is a countably complete L-ultrafilter, and the theorem follows by theorem 44.5 and theorem 36.2 adapted to L-ultrafilters. ⊳

61. Determinacy and large cardinals.
It has already been seen in section 60 that Π^1_1 determinacy has large cardinal strength. An investigation into the connection between determinacy and large cardinals led to dramatic developments in modern set theory. To quote [Neeman1]:
“In 1985 the faith in this connection was fully justified. A sequence of results of Foreman, Magidor, Martin, Shelah, Steel, and Woodin . . . brought the identification of a new class of large cardinals, known now as Woodin cardinals, new structures of iterated ultrapowers, known now as iteration trees, and new proofs of determinacy, including a proof of AD^{L(R)}. Additional results later on obtained Woodin cardinals from determinacy axioms, and indeed established a deep and intricate connection between the descriptive set theory of L(R) under AD, and inner models for Woodin cardinals.”

The articles [Neeman1] and [KoeWood] contain an extensive survey of these results. Some brief remarks will be made on these articles. The article [Neeman1] contains self-contained proofs of the following.
- Suppose that there are n Woodin cardinals and a measurable cardinal above them. Let A ⊆ ω^ω be Π^1_{n+1}. Then G_ω(A) is determined. (Corollary 5.30.)
- Suppose that there are ω Woodin cardinals and a measurable cardinal above them. Then AD holds in L(R). (Theorem 8.24.)
The methods used in the proofs have many other uses, and an extensive history. These include homogeneous trees, homogeneously Suslin sets, extender models, and iteration trees. Some of these topics are covered in [Jech2] and [Kanamori3].

The article [KoeWood] contains a self-contained proof of the following.
- Assume ZF+AD. Then there is a model of ZFC+“There are ω-many Woodin cardinals”. (See theorem 6.2.)

62. Forcing axioms.
A forcing axiom is a statement which asserts the existence of a generic set for a family of notions of forcing. One example, Martin’s axiom, has already been seen in section 28. As for Martin’s axiom, it is of interest what facts hold concerning a forcing axiom. The following are among the most commonly encountered forcing axioms:
- PFA: If P is a proper notion of forcing and 𝒟 is a family of predense subsets of P with |𝒟| ≤ ℵ_1 then there is a filter G ⊆ P such that G ∩ D ≠ ∅ for all D ∈ 𝒟.
- SPFA: As for PFA, but with P semiproper.
- MM: As for PFA, but with P stationary set preserving.
- The “bounded versions” BPFA, BSPFA, and BMM of the above, where the sets D ∈ 𝒟 are restricted to have |D| ≤ ℵ_1.

It is virtually immediate from the definitions that

MM   ⇒  SPFA   ⇒  PFA
⇓        ⇓         ⇓
BMM  ⇒  BSPFA  ⇒  BPFA.

In the unbounded cases, “predense” can be replaced by “dense” in the definition (exercise 14.4 of [Jech2]).

After earlier results, it was shown in [Moore] that BPFA ⇒ 2^{ℵ_0} = ℵ_2. This result is one reason for interest in forcing axioms, since no known large cardinal axiom settles CH one way or the other.

Recall the definition of MA_{ℵ_1} from section 29. As observed in section 54, if P is c.c.c. then P is proper, and it follows that PFA ⇒ MA_{ℵ_1}. By the preceding paragraph, it then follows that PFA ⇒ MA. As seen in section 28, the consistency of MA follows from the consistency of ZFC. BPFA, on the other hand, was shown in [GoldShe] to be equiconsistent with the existence of a “Σ_1-reflecting” cardinal, which is stronger than inaccessible but weaker than Mahlo.
It was shown in 1988 that if there is a supercompact cardinal then there is a forcing model in which SPFA holds (theorem 37.9 of [Jech2]). It is also true that SPFA ⇒ MM (theorem 37.10 of [Jech2]). Better upper and lower bounds are known for the consistency strength of various forcing axioms. This is an active area of research; see [Neeman2] for some remarks.

These axioms, and related axioms, have a variety of consequences. Among the most notable such implications are the following.
- In [Steel4] it is shown that PFA implies AD^{L(R)}.
- In [Todor2] it is shown that PFA implies ¬□_κ for any cardinal κ ≥ ℵ_1 (and hence, by corollary 52.7, that 0# exists).
- In [Vaile] a statement CP is defined, and shown to follow from the statements MRP, PID, and “there exists a strongly compact cardinal”. Both MRP and PID were already known to follow from PFA. CP in turn implies SCH.

63. Some observations.
In a poll taken in 2000, 31 out of 31 set theorists replied that they did not believe that V = L. The author has been maintaining that the situation should be re-assessed, and the possibility that V = L taken more seriously.

Since Cohen’s invention of forcing in 1963, set theory has undergone a series of dramatic advances. Various of the most complex of these involve statements which are false if V = L, and set theorists argue that this complexity is indicative of the fact that V = L “impoverishes” the universe and should be rejected. That large cardinal hypotheses and PD result in a “richer” universe is “a posteriori” evidence in favor of them.

It is clear, though, that independent questions must be resolved by more fundamental considerations. Set theorists since Gödel have been attempting to come to grips with what such might be, and the explosion of knowledge has not clarified the situation in the least.

One question which seems fundamental is the existence of Suslin trees.
The question of their existence was raised in 1920, and shown to be independent of ZFC in the early 1970’s. The proof of theorem 26.4 “constructs” a Suslin tree by judiciously choosing a countable set of branches at limit stages. On the other hand, the proof of theorem 29.2 involves a “blanket” assumption, which has as a consequence that an uncountable branch exists. It is clear that the first proof has at least equal claim to representing the truth about the real numbers as the second proof.

Another fundamental question is whether ω_1^L = ω_1. This assumption does not require assigning a value to |Pow(ω)|. In the process of obtaining constructible bijections of ω with countable ordinals, ω_1 steps of the constructibility process may be performed. Although most set theorists currently disagree, one view is that assumptions which limit the number of constructible bijections to be countable must somehow be “pathological”.

Appendix 1. Axioms for plane geometry.
In a modern system of axioms for the Euclidean plane, the plane is considered to consist of a set P of points. In addition, certain subsets of P, called lines, are given. Roman letters x, y, . . . will be used to denote points. Greek letters α, β, . . . will be used to denote lines. The language of plane geometry is ∈, B, C. B(x, y, z) is intended to hold if y lies on the line through x and z, and is strictly between them. C(w, x, y, z) is intended to hold if the distance between w and x equals the distance between y and z.

Before giving the axioms it is useful to introduce some defined concepts and notation (used only in this appendix), as follows. To simplify the notation, x-y-z is used for B(x, y, z), and wx≡yz for C(w, x, y, z).
- Lines α and β are said to intersect, at the point x, if α ∩ β = {x}; and to be parallel if α ∩ β = ∅.
- Three points are said to be collinear if they lie on a common line.
- Given points x and y let (xy) denote {z : B(x, z, y)}.
- Let [xy] denote (xy) ∪ {x, y}; this is called the line segment, or simply segment, between x and y, and (xy) is called its interior.
- If wx≡yz the line segments [wx] and [yz] are said to be congruent.
- The notation xyz is used as an abbreviation for ⟨x, y, z⟩.
- The notation xyz≡x′y′z′ denotes that xy≡x′y′, xz≡x′z′, and yz≡y′z′ all hold.
- An ordered triple xyz is called a triangle if the points are distinct and non-collinear. The points x, y, and z are called its vertices, and the line segments [xy], [xz], and [yz] its sides. Side [xy] is said to be opposite z, side [xz] opposite y, and side [yz] opposite x. A side not opposite a vertex is called adjacent to it.
- Two triangles xyz and x′y′z′ are called congruent if xyz≡x′y′z′.

The axioms can be divided up into several groups. The first group are the “incidence axioms”, giving the restrictions on the way in which points can lie in lines.
I1 Given two distinct points there is exactly one line containing them.
I2 Each line contains at least three distinct points.
I3 There is a set of three distinct non-collinear points.
These axioms may not seem to say much, but already they have some consequences. For example two distinct lines intersect in at most one point; for if there were two common points x and y then by axiom I1 there could only be one line. Also, given any line α there is a point x which does not lie on it, else axiom I3 would be contradicted since all points would lie on α. We use the notation α(xy) to denote the unique line containing the distinct points x and y.

The next axiom is called the axiom of parallels.
P Given a line α and a point x not on α there is exactly one line β such that x is in β and β is parallel to α.
As an immediate consequence of the parallel axiom, if α is parallel to β, and β is parallel to γ, then α is parallel to γ or equals γ.
For if α and γ intersect at the point x, then α and γ would both contain x and be parallel to β, and so by axiom P they would be identical.

The next group of axioms concerns the betweenness relation.
B1 If x-y-z then x, y, z are distinct collinear points.
B2 If x, y, z are distinct collinear points then exactly one of x-y-z, y-x-z, y-z-x holds.
B3 If x-y-z then z-y-x.
B4 Suppose xyz is a triangle; α is a line containing none of its points; and α intersects (yz). Then either α intersects (xy) or α intersects (xz).
Axiom B4 is called Pasch’s axiom; it states that if a line intersects the interior of one side of a triangle then it intersects the interior of a second, provided the line does not contain any vertex.

The axioms concerning equality of distance are as follows.
D1 xx≡yz if and only if y = z.
D2 xy≡xy.
D3 xy≡yx.
D4 If uv≡wx then wx≡uv.
D5 If uv≡wx and wx≡yz then uv≡yz.
D6 If x-y-z, x′-y′-z′, xy≡x′y′, and yz≡y′z′ then xz≡x′z′.
D7 Suppose xyz and x′y′z′ are triangles with xyz≡x′y′z′; w, y, z are collinear; w′, y′, z′ are collinear; yw≡y′w′; and wz≡w′z′. Then xw≡x′w′.
D8 Given x in α and distinct y, z there are exactly two points u = u_1, u_2 in α such that xu≡yz. Further u_1-x-u_2.
D9 If xyz is a triangle and y′z′≡yz then there are exactly two points x′ = x′_1, x′_2 such that xyz≡x′y′z′. Further α(y′z′) intersects (x′_1 x′_2).

The final axiom requires some definitions. A cut of a line α consists of two nonempty subsets A and B of α such that
1. every point of α is in exactly one of A or B, and
2. whenever x and z are both in the same set and x-y-z, then y is in that set also.
Given a cut, a cutpoint is defined to be a point p such that whenever x ∈ A and y ∈ B then either x = p, y = p, or x-p-y.
C If A and B form a cut of a line α then there is a cutpoint.

A model for these (second order) axioms can be given, with P = R × R, which is also denoted as R^2.
The following conventions will be adopted for points in R^2.
- A point x equals ⟨x_1, x_2⟩ for some x_1, x_2 ∈ R.
- The sum x + y of two points equals ⟨x_1 + y_1, x_2 + y_2⟩.
- The scalar multiple rx of a point by a real number r equals ⟨r x_1, r x_2⟩.
- For distinct x, y, let α(xy) denote {x + t(y − x) : t ∈ R}.
- The “Euclidean distance” d(x, y) is defined to be √((x_1 − y_1)^2 + (x_2 − y_2)^2).
The lines of the model are taken as the sets α(xy) for distinct x, y (the set of such sets; different pairs might yield the same line). B(x, y, z) is taken to hold if ∃t (0 < t < 1 ∧ y = x + t(z − x)). C(w, x, y, z) is taken to hold if d(w, x) = d(y, z).

Theorem 1. R^2, with lines, B, and C interpreted as above, satisfies the axioms for the Euclidean plane.
Proof: The proof is a straightforward but tedious verification that all the axioms hold. Proofs will assume basic facts about the real numbers, which are well-known and can easily be proved from the axioms. In particular, for a non-negative r ∈ R there is a unique non-negative s ∈ R such that s^2 = r, called the square root of r and denoted √r.

For axiom I1, clearly x and y are elements of α = α(xy). Suppose x, y ∈ β where β = α(x′y′), say x = x′ + t_1(y′ − x′) and y = x′ + t_2(y′ − x′); then y − x = (t_2 − t_1)(y′ − x′). Thus, x + t(y − x) = x′ + (t_1 + t(t_2 − t_1))(y′ − x′), showing that α ⊆ β. Also x′ = x − (t_1/(t_2 − t_1))(y − x) and y′ = x + ((1 − t_1)/(t_2 − t_1))(y − x), so x′ ∈ α and y′ ∈ α, whence β ⊆ α also.

For axiom I2, (1/2)(x + y) is a third point. For axiom I3, ⟨0, 0⟩, ⟨1, 0⟩, and ⟨0, 1⟩ are non-collinear.

For axiom P, let α(x_1 y_1) and α(x_2 y_2) be two lines. Let v_i = y_i − x_i for i = 1, 2. A point of intersection would be determined by values t_1 and t_2 such that x_1 + t_1 v_1 = x_2 + t_2 v_2, and t_1 and t_2 would be a solution to the linear equations
\[
\begin{pmatrix} v_{11} & -v_{21} \\ v_{12} & -v_{22} \end{pmatrix}
\begin{pmatrix} t_1 \\ t_2 \end{pmatrix}
=
\begin{pmatrix} x_{21} - x_{11} \\ x_{22} - x_{12} \end{pmatrix},
\]
where v_{ij} and x_{ij} denote the j-th coordinates of v_i and x_i. By linear algebra there are three cases: v_2 is not a scalar multiple of v_1 and the lines intersect at a single point; v_2 = sv_1 for some nonzero s ∈ R and x_2 = x_1 + tv_2 for some t ∈ R and the lines are the same; or the remaining case and the lines are parallel. Suppose in axiom P α = α(x_1 y_1). Then any α(x_2 y_2) containing x and parallel to α must have v_2 a scalar multiple of v_1. One such line is {x + tv_1 : t ∈ R}, which is parallel to α by the third case, since x is not on α. Given two such lines α(x_2 y_2) and α(x′_2 y′_2), x lies on both and v′_2 is a scalar multiple of v_2, so the lines are the same.

By the definitions, if B(x, y, z) then x, y, z are collinear; further they are distinct, since x + t(z − x) takes on distinct values as t does. This proves axiom B1. Axiom B2 follows by algebra, by considering the cases t < 0, 0 < t < 1, and t > 1. Axiom B3 follows because x + t(z − x) = z + (1 − t)(x − z).

For axiom B4, suppose without loss of generality that α intersects (yz) at w = y + t(z − y) where 0 < t < 1. If α is not parallel to α(xy) or α(xz) then for some p, q, r, s, v
y + r(x − y) = w + pv and z + s(x − z) = w + qv.
Given v there is exactly one solution; given v, p, q, the solution is r = (p/q)(1 − t) + t and s = (1 − t) + (q/p)t. If s ≥ 1 then q/p ≥ 1 so 0 < p/q ≤ 1 and so t < r ≤ 1; and if s ≤ 0 then q/p ≤ −(1 − t)/t so −t/(1 − t) ≤ p/q < 0 and so 0 ≤ r < t. But r ≠ 0, 1 and so if 0 < s < 1 is false then 0 < r < 1 is true. If α is parallel to α(xy) then z + s(x − z) = w + q(x − y) where q = s = 1 − t. The case α parallel to α(xz) is similar.

Recalling from section 14 the definition of a metric function, it is next shown that the Euclidean distance is a metric function. Note first that x^2 ≥ 0, and the sum of non-negative values is non-negative, so d(x, y) is defined for any two points x, y. Requirement 1 follows because √x ≥ 0 if it is defined. Requirement 2 follows because x^2 > 0 if x ≠ 0, and other basic properties of ≤. Requirement 3 is immediate. The proof of M4 benefits from some additional definitions.
The “inner product” x · y of two elements of R^2 is defined to be x_1 y_1 + x_2 y_2, and the “norm” |x| is defined to be √(x · x); then d(x, y) = |x − y|. M4 follows from
|x + y| ≤ |x| + |y| (1),
since then |x − z| ≤ |x − y| + |y − z|. Both sides of (1) are nonnegative, so squaring both sides yields an equivalent relation; after canceling terms it remains to show that x · y ≤ |x||y|. Indeed |x · y| ≤ |x||y| holds, a fact known as the Cauchy-Schwarz inequality. For nonzero x and y this follows by observing that
\[
0 \le \left| \frac{x}{|x|} \pm \frac{y}{|y|} \right|^2 = 2 \pm 2\,\frac{x}{|x|} \cdot \frac{y}{|y|}.
\]
It is easily seen from the above proof that equality holds in (1) if and only if x · y = |x||y|, and that this holds for nonzero x, y if and only if x/|x| = y/|y|, or equivalently y = cx for some scalar c > 0. Equality holds in M4 for distinct points if and only if y − x is a positive scalar multiple of z − y, or equivalently if x-y-z.

Axioms D1-D5 follow easily from the fact that d is a metric function (and the axioms of equality for R).

For axiom D6, if x-y-z then as observed in the proof of M4, d(x, y) + d(y, z) = d(x, z); similarly d(x′, y′) + d(y′, z′) = d(x′, z′). By hypothesis d(x, y) = d(x′, y′) and d(y, z) = d(y′, z′), so d(x, z) = d(x′, z′).

For axiom D7, from the hypotheses it follows that w = y + t(z − y) and w′ = y′ + t′(z′ − y′) where t′ = t. Thus
\[
\begin{aligned}
d(x, w)^2 &= ((x_1 - y_1) - t(z_1 - y_1))^2 + ((x_2 - y_2) - t(z_2 - y_2))^2 \\
&= ((x_1 - y_1)^2 + (x_2 - y_2)^2) - 2t((x_1 - y_1)(z_1 - y_1) + (x_2 - y_2)(z_2 - y_2)) + t^2((z_1 - y_1)^2 + (z_2 - y_2)^2) \\
&= ((x_1 - y_1)^2 + (x_2 - y_2)^2) - t\bigl(((x_1 - y_1)^2 + (x_2 - y_2)^2) - ((x_1 - z_1)^2 + (x_2 - z_2)^2) + ((y_1 - z_1)^2 + (y_2 - z_2)^2)\bigr) + t^2((z_1 - y_1)^2 + (z_2 - y_2)^2)
\end{aligned}
\]
where the last step uses the equality 2(x − y)(z − y) = (x − y)^2 − (x − z)^2 + (y − z)^2. A similar equation is obtained for (d(x′, w′))^2, with all variables primed. By the hypotheses the right sides are equal, so d(x′, w′) = d(x, w).
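The interpretations of B and C in R^2, and two of the algebraic facts used in the verification so far, can be spot-checked on particular points. The following is a minimal Python sketch that restricts attention to rational coordinates so that the arithmetic is exact; C is tested by comparing squared distances, which is equivalent since distances are non-negative. All function names are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def sub(x, y):
    """Componentwise difference x - y of two points of the plane."""
    return (x[0] - y[0], x[1] - y[1])

def dot(x, y):
    """Inner product x . y."""
    return x[0] * y[0] + x[1] * y[1]

def dist2(x, y):
    """Squared Euclidean distance d(x, y)^2."""
    d = sub(x, y)
    return dot(d, d)

def between(x, y, z):
    """B(x, y, z): y = x + t(z - x) for some t with 0 < t < 1."""
    v, w = sub(z, x), sub(y, x)
    if dot(v, v) == 0:
        return False              # x = z, so no point lies strictly between
    if v[0] * w[1] - v[1] * w[0] != 0:
        return False              # y is not on the line through x and z
    t = Fraction(dot(w, v), dot(v, v))
    return 0 < t < 1

def congruent(w, x, y, z):
    """C(w, x, y, z): d(w, x) = d(y, z), tested via squared distances."""
    return dist2(w, x) == dist2(y, z)

def polarization_holds(x, y, z):
    """The identity 2(x - y).(z - y) = |x - y|^2 - |x - z|^2 + |y - z|^2."""
    lhs = 2 * dot(sub(x, y), sub(z, y))
    rhs = dist2(x, y) - dist2(x, z) + dist2(y, z)
    return lhs == rhs

def cauchy_schwarz_holds(x, y):
    """The squared form (x . y)^2 <= |x|^2 |y|^2 of Cauchy-Schwarz."""
    return dot(x, y) ** 2 <= dot(x, x) * dot(y, y)
```

For example, between((0, 0), (1, 2), (3, 6)) holds with t = 1/3, while between((0, 0), (3, 6), (1, 2)) fails because the corresponding t equals 3. Such checks only test instances and are no substitute for the proof.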
For axiom D8, let w be any other point of α and let u_1 = x + t(w − x), u_2 = x − t(w − x), where t = d(y, z)/d(w, x). Then d(u_i, x) = td(w, x) = d(y, z) for i = 1, 2; further x = u_1 + (1/2)(u_2 − u_1), so u_1-x-u_2.

For axiom D9, let d = d(y, z), v = ⟨−(z_2 − y_2), z_1 − y_1⟩, and
\[
t = \frac{(x_1 - y_1)(z_1 - y_1) + (x_2 - y_2)(z_2 - y_2)}{d^2}, \qquad
u = \frac{-(x_1 - y_1)(z_2 - y_2) + (x_2 - y_2)(z_1 - y_1)}{d^2}.
\]
Then it is a straightforward exercise to show that x − y = t(z − y) + uv, d(x, y)^2 = (t^2 + u^2)d^2, and d(x, z)^2 = ((1 − t)^2 + u^2)d^2. These facts hold similarly with all quantities primed, and so xyz≡x′y′z′ if and only if t^2 + u^2 = t′^2 + u′^2 and (1 − t)^2 + u^2 = (1 − t′)^2 + u′^2, which holds if and only if t′ = t and u′ = ±u. Finally (x′_1 x′_2) intersects α(y′z′) at y′ + t(z′ − y′), because this point is x′_1 + (1/2)(x′_2 − x′_1).

The relation B(x, y, z) may be defined in R by the formula x < y < z ∨ x > y > z. Define a cut-pair in R to be a pair of subsets A and B as in the definition of a cut in a line; and a cutpoint also. It may be assumed without loss of generality that elements of A are less than elements of B. It follows by algebra and the least upper bound property that either A has a least upper bound, or B has a greatest lower bound. Further, that point is a cutpoint.

For axiom C, let α = α(xy). The cut of α induces a cut-pair of R; t lies in the subset in R corresponding to the subset that x + t(y − x) lies in. The cutpoint of this pair corresponds to a cutpoint for the cut in α. This concludes the proof of theorem 1. ⊳

Before stating the next theorem, some definitions and lemmas needed in its proof will be given. To distinguish points of the Euclidean plane from points of R^2, capital letters will be used for the former. Facts about the Euclidean plane will be proved using the axioms for it.

Lemma 2. If XYZ is a triangle and α is a line not containing any of its vertices then it is not the case that α intersects all of (XY), (XZ), and (YZ).
Proof: Suppose to the contrary that U, V, W are the points of intersection; suppose also without loss of generality that U-V-W. Then the triangle YUW and the line α(XZ) contradict axiom B4. ⊳

Given any line α, say that two points X and Y not lying on α are on opposite sides of α if α intersects (XY), and are on the same side otherwise. Using axioms I3 and D9, there are points X and Y not on α and lying on opposite sides.

Lemma 3. Suppose X and Y, and Y and Z, are on the same side of α, or X and Y, and Y and Z, are on opposite sides of α. Then X and Z are on the same side of α.
Proof: Suppose the first possibility holds. Consider the triangle XYZ; if α intersects (XZ) it must intersect (XY) or (YZ). If the second possibility holds the claim follows similarly using lemma 2. ⊳

Lemma 4. For O, A, X, Y in α suppose X-O-A and Y-O-A, or ¬X-O-A and ¬Y-O-A; then ¬X-O-Y.
Proof: The case X-O-A and Y-O-A follows from the case ¬X-O-A and ¬Y-O-A by reversing the roles of O and A; so consider the latter case. The claim is straightforward if X or Y equals O or A, so suppose not. Choose P not in α. Choose Q in α(AP), Q ≠ A, P. If A-P-Q, considering the triangle QAX together with the line α(OP) there is an R in α(OP) with Q-R-X. Similarly there is an S in α(OP) with Q-S-Y. By lemma 2, considering the triangle QXY and the line α(OP), ¬X-O-Y. If A-Q-P, reverse the roles of P and Q in the preceding argument. If Q-A-P, by lemma 3 X and Y are on the same side of α(OP), whence ¬X-O-Y. ⊳

Lemma 5. Given a line α, an order < can be defined on it such that X-Y-Z if and only if either X < Y < Z or Z < Y < X. There are exactly two such orders, <_1 and <_2, and X <_1 Y if and only if Y <_2 X.
Proof: Suppose α = α(OU). A unique order may be defined such that O < U, which is consistent with B in the sense of the lemma. Partition α into the 5 classes λ = {X : X-O-U}, o = {O}, µ = {X : O-X-U}, υ = {U}, and ρ = {X : O-U-X}.
Writing χ_1 < χ_2 for ∀X ∈ χ_1 ∀Y ∈ χ_2 (X < Y), the order on the classes must be λ < o < µ < υ < ρ. Add the pairs X < Y to the relation whenever X and Y are in different classes. If X, Y ∈ λ add X < Y if X-Y-O, else add Y < X. If X, Y ∈ µ add X < Y if O-X-Y, else add Y < X. If X, Y ∈ ρ add X < Y if O-X-Y, else add Y < X. Clearly, for distinct X and Y exactly one of X < Y or Y < X holds; and X ≮ X. It remains to verify the transitivity law X < Y ∧ Y < Z ⇒ X < Z. This is clear unless X, Y, and Z are all in the same class. In these cases either X-O-A and Y-O-A or ¬X-O-A and ¬Y-O-A, and so ¬X-O-Y; similarly ¬X-O-Z. Suppose X-Y-O and Y-Z-O; then ¬O-Y-Z, and so if ¬X-Y-Z then ¬X-Y-O. Hence, X-Y-Z, so ¬Z-X-Y; also ¬O-X-Y and so ¬O-X-Z. Together with ¬X-O-Z this shows X-Z-O. Suppose O-X-Y and O-Y-Z; then ¬O-Y-X, and so if ¬Z-Y-X then ¬O-Y-Z. Hence, Z-Y-X, so ¬X-Z-Y; also ¬O-Z-Y and so ¬O-Z-X. Together with ¬X-O-Z this shows O-X-Z. ⊳

Lemma 6. If XYZ≡X′Y′Z′ and X, Y, Z are collinear then X′, Y′, Z′ are collinear.
Proof: Suppose without loss of generality that X-Y-Z. If X′, Y′, Z′ are not collinear let Z_1 be a point with XYZ_1≡X′Y′Z′, and let X_1 be a point on α(YZ_1) with XYZ≡X_1YZ_1. Then XYX_1≡X_1YX, X-Y-Z, X_1-Y-Z_1, XZ≡X_1Z_1, and YZ≡YZ_1, so by axiom D7 XZ_1≡X_1Z. By assumption, XZ≡XZ_1; XZX_1≡XZ_1X_1 follows, and so by axiom D9 α(XX_1) intersects (ZZ_1). By axiom B4 α(XX_1) intersects either (YZ) or (YZ_1); but this is impossible. ⊳

Lemma 7. Suppose X-Y-Z, XY≡X′Y′ and XZ≡X′Z′, and Y′ and Z′ are on the same side of X′. Then X′-Y′-Z′ and YZ≡Y′Z′.
Proof: Under the stated hypotheses there are by axiom D8 two points Z′_1 and Z′_2 with YZ≡Y′Z′_i; only one of these, say Z′_1, is on the opposite side of Y′ from X′. By axiom D6 XZ≡X′Z′_1. Also both Z′ and Z′_1 are on the same side of X′ as Y′. Using axiom D8 again it must be the case that Z′ and Z′_1 are identical. ⊳

Lemma 8.
Given distinct points X and Y there is a unique point M, called the midpoint of the segment [XY], such that X-M-Y and XM ≡ MY.

Proof: Choose W not in α(XY); choose W′ on the opposite side of α(XY) with XWY ≡ YW′X, and let α(WW′) intersect α(XY) in M. Let M′ in α(XY) satisfy XMY ≡ YM′X. Using axiom D7 WM ≡ W′M′ and WM′ ≡ W′M, and hence WMW′ ≡ W′M′W. By lemma 6 M′ is in α(WW′), and so M′ = M. If, say, X-Y-M then since MX ≡ MY and MY ≡ MY, by lemma 7 XY ≡ YY would hold, which contradicts the assumption that X and Y are distinct. Similarly M-X-Y is impossible, so X-M-Y. To prove uniqueness, order the line so that X < M < Y, and suppose M′ is another midpoint; if M′ ≠ M suppose without loss of generality that M < M′. Then the point M′′ such that MM′Y ≡ MM′′X must be identical to M′, which is impossible. ⊳

Lemma 9. Given a line α = α(OU), ordered so that O < U, there is a unique order-preserving bijection P : R ↦ α with the following properties, where P_r is written for P(r).
1. P_0 = O, P_1 = U, and
2. P_r P_s ≡ P_t P_u if and only if |s − r| = |u − t|.

Proof: Let Q_d denote the rational numbers whose denominator is a power of 2. There is a unique order-preserving map P : Q_d ↦ α such that properties 1 and 2 hold; this may be obtained by first “marking off” the integers, and then successively taking midpoints. Since Q_d is a dense linear order without endpoints, a unique order-preserving bijection g : α ↦ R is induced; let P be the inverse of g. It remains to show that property 2 holds in general. Let ⊕ be the binary function on α, where S = X ⊕ Y if and only if XS ≡ OY and S > X, S = X, or S < X according to whether Y > O, Y = O, or Y < O. To see that ⊕ satisfies axiom C1, let S = X + Y, T = Y + Z, U = (X + Y) + Z, and V = X + (Y + Z). Then XS ≡ OY and SU ≡ YT, so (considering signs and using axiom D6 and lemma 7) XU ≡ OT; also OX ≡ VT, so OU ≡ OV.
To see that ⊕ satisfies axiom C2, let S = X + Y and T = Y + X. Then XS ≡ OY and YT ≡ OX, so (considering signs and using axiom D6 and lemma 7) OS ≡ OT. That ⊕ satisfies axiom C3, with O for 0, follows by axiom D1. To see that ⊕ satisfies axiom C4, let X′ be the other point on α such that OX ≡ OX′. Let S = X + X′; OX ≡ XS follows, whence O = S since O and S are on the same side of X. That ⊕ satisfies axiom O3 follows, since if S = X + Y where X and Y are positive, then O < X < X + Y. Thus, α forms an ordered commutative group with ⊕ and <, which has the least upper bound property. It is readily verified that ⊕ and + agree on Q_d; and hence they agree on α. It follows that P_r P_{r+s} ≡ P_0 P_s, and property 2 follows. ⊳

It is useful to have a definition of the distance between two points; when this is used it is assumed that two distinct points O and U have been selected, to give the “unit length”. By the lemma this defines for each r ∈ R a point P_r ∈ α(OU). Define the distance D(X, Y) between points X and Y to be the unique real number r ≥ 0 such that XY ≡ OP_r. Note that WX ≡ YZ if and only if D(W, X) = D(Y, Z).

Lemma 10. If α is parallel to β, and β is parallel to γ, then α is parallel to γ, or α equals γ.

Proof: If α and γ intersected in the point X, then α and γ would both contain X and be parallel to β, and so by axiom P would be identical. ⊳

Lemma 11. Suppose that OX1Y1 ≡ O′X1′Y1′, X2 (resp. Y2, X2′, Y2′) is on the same side of O (resp. O, O′, O′) as X1 (resp. Y1, X1′, Y1′), OX2 ≡ O′X2′, and OY2 ≡ O′Y2′. Then X2Y2 ≡ X2′Y2′.

Proof: By axiom D6 or lemma 7 OY2 ≡ O′Y2′, and so by axiom D7 X1Y2 ≡ X1′Y2′. Repeating the argument with the triangles OX1Y2 and O′X1′Y2′ yields X2Y2 ≡ X2′Y2′. ⊳

It is useful to introduce some further notation. Given distinct points X and Y, let α>(XY) denote the points of α(XY) which are on the same side of X as Y. Given noncollinear points O, X, and Y write ∠OXY for the triple OXY, when it is to be considered as an “angle”.
Say that angles ∠OXY and ∠O′X′Y′ are congruent, written ∠OXY ≡ ∠O′X′Y′, if X1Y1 ≡ X1′Y1′ for some (and hence any) X1 ∈ α>(OX), X1′ ∈ α>(O′X′), Y1 ∈ α>(OY), and Y1′ ∈ α>(O′Y′), with OX1 ≡ O′X1′ and OY1 ≡ O′Y1′. It is easy to verify that this is an equivalence relation, and ∠OXY ≡ ∠OYX.

Lemma 12. Suppose Z1 and Z2 are on the same side of α(XY), and ∠XYZ1 ≡ ∠XYZ2. Then α(YZ1) and α(YZ2) coincide.

Proof: Let Z3 ∈ α>(YZ2) be such that YZ3 ≡ YZ1. Then XYZ1 ≡ XYZ3. Z1 and Z3 are on the same side of α(XY), so by axiom D9 Z3 = Z1. Thus, α(YZ1) = α(YZ3) = α(YZ2). ⊳

For readers familiar with plane geometry, lemma 11 may be seen as a version of the “side-angle-side” criterion for congruent triangles. The next lemma is the “angle-side-angle” criterion.

Lemma 13. Given triangles XYZ and X′Y′Z′, if XY ≡ X′Y′, ∠XYZ ≡ ∠X′Y′Z′, and ∠YXZ ≡ ∠Y′X′Z′ then XYZ ≡ X′Y′Z′.

Proof: Let Z2′ ∈ α>(X′Z′) be such that XZ ≡ X′Z2′. By lemma 11 XYZ ≡ X′Y′Z2′, and ∠X′Y′Z′ ≡ ∠X′Y′Z2′ follows. By lemma 12 α(Y′Z′) = α(Y′Z2′). Hence Z2′ and Z′ both lie on α(Y′Z′) and α(X′Z′), and hence Z2′ = Z′. ⊳

Two distinct lines intersecting at a point determine four angles. Pairs of angles which share a common side are said to be supplementary; there are four of these. Pairs of angles which have no common side are said to be opposite; there are two of these.

Lemma 14. Suppose ∠OXY and ∠OYZ are supplementary, ∠O′X′Y′ and ∠O′Y′Z′ are supplementary, and ∠OXY ≡ ∠O′X′Y′. Then ∠OYZ ≡ ∠O′Y′Z′.

Proof: It may be assumed that OX ≡ O′X′, OY ≡ O′Y′, and OZ ≡ O′Z′. By hypothesis XY ≡ X′Y′. Using axiom D7, YZ ≡ Y′Z′. ⊳

Corollary 15. If ∠OXY and ∠OX′Y′ are opposite angles then ∠OXY ≡ ∠OX′Y′.

Proof: ∠OXY and ∠OXY′, and ∠OXY′ and ∠OX′Y′, are supplementary. ⊳

Lemma 16.
Suppose X1, X2, and X3 are points on the line α with X1-X2-X3, and Y1 and Y2 are points on the same side of α. Then ∠X1X2Y1 ≡ ∠X2X3Y2 if and only if α(X1Y1) and α(X2Y2) are parallel.

Proof: Suppose α(X1Y1) and α(X2Y2) intersect at W1. Let W2 be the midpoint of [X1X2], and let W3 ∈ α(W1W2) be such that W1W2 ≡ W2W3 and W1 ≠ W3. Then by congruence of opposite angles and side-angle-side, W1X1W2 ≡ W3X2W2. Further W3 is not in α(W1X2), because W2 lies in α(W1W3) and W2 and X2 are distinct. Choose W4 with X2-X1-W4. Choose W5 ∈ α(X1Y1) and W6 ∈ α(X2Y2) on the same side of α as W3. By congruence of opposite angles, ∠X1X2Y1 ≡ ∠X1W5W4 and ∠X2X3Y2 ≡ ∠X2W6W4; and ∠X1W1W2 ≡ ∠X2W3W2 has already been shown. Since ∠X2W4W3 ≢ ∠X2W4W6, ∠X1X2Y1 ≢ ∠X2X3Y2 follows. Suppose now that α(X1Y1) and α(X2Y2) are parallel. Let Y3 be a point on the same side of α(X1X2) as Y2 such that ∠X1X2Y1 ≡ ∠X2X3Y3 (such exists, using axioms D8 and D9). Then by what was just shown, α(X2Y3) is parallel to α(X1Y1). By axiom P α(X2Y2) and α(X2Y3) are identical, whence ∠X1X2Y1 ≡ ∠X2X3Y2. ⊳

The ordered sequence WXYZ of four points is said to be the vertices of a parallelogram if α(WX) and α(YZ) are parallel, and α(WZ) and α(XY) are parallel. It follows that the points must be distinct, with no three collinear. The segments [WX], [XY], [YZ] and [ZW] are called the sides of the parallelogram. The segments [WY] and [XZ] are called the diagonals.

Lemma 17. In a parallelogram WXYZ, WX ≡ YZ and XY ≡ ZW.

Proof: By lemma 16 and opposite angles, ∠WXY ≡ ∠YZW and ∠WZY ≡ ∠YXW. By angle-side-angle, the lemma follows. ⊳

Lemma 18. Suppose WXYZ are distinct points such that α(WX) and α(YZ) are parallel, W and Z are on the same side of α(XY), and WX ≡ YZ. Then WXYZ is a parallelogram.

Proof: Let β be the line through W parallel to α(XY). Let Z′ be the point where β intersects α(YZ).
Then WXYZ′ is a parallelogram, so by lemma 17 WX ≡ YZ′, so Z = Z′. ⊳

Lemma 19. The diagonals of a parallelogram WXYZ intersect.

Proof: Let M be the midpoint of [WX] and let N be the midpoint of [YZ]. It is easy to see that WM ≡ ZN and MX ≡ NY (let N′ be the point on α>(ZY) with WM ≡ ZN′; by lemmas 17 and 7 MX ≡ N′Y). By lemma 18 WMNZ and MXYN are parallelograms. W and Y are on opposite sides of α(MN), so [WY] intersects α(MN); let V be the point of intersection. By angle-side-angle, WMV ≡ YNV, so V is the midpoint of [MN]. A similar argument shows that [XZ] intersects [MN] in the midpoint. ⊳

Say that a point W is inside ∠XYZ if W is on the same side of α(XY) as Z, and W is on the same side of α(XZ) as Y.

Lemma 20. The following are equivalent.
a. W is inside ∠XYZ.
b. W ∈ (Y′Z′) for some Y′ ∈ α>(XY) and Z′ ∈ α>(XZ).
c. α(XW) intersects (YZ), and if V is the intersection point then V ∈ α>(XW).

Proof: Suppose a holds. Let β be the line through X, parallel to α(YZ). Y and Z are on the same side of β. Since α(XW) is not parallel to β it is not parallel to α(YZ) either, so intersects it at some point V. V, indeed α>(XW), is on the Y, Z side of β, the Z side of α(XY), and the Y side of α(XZ). In particular c holds. Further, let γ be the line through W parallel to β, and let Y′ be the intersection with α(XY) and Z′ the intersection with α(XZ). Y′ and Z′ are on the Y, Z side of β, so Y′ ∈ α>(XY) and Z′ ∈ α>(XZ). That is, b holds also. If b holds then W and Z′ are on the same side of α(XY), and Z′ and Z are also, whence W and Z are. Similarly W and Y are on the same side of α(XZ). If c holds then W and V are on the same side of α(XY), and V and Z are also, whence W and Z are. Similarly W and Y are on the same side of α(XZ). ⊳

Lemma 21. If W is inside ∠XYZ, W′ is inside ∠X′Y′Z′, ∠XYZ ≡ ∠X′Y′Z′, and ∠XYW ≡ ∠X′Y′W′, then ∠XWZ ≡ ∠X′W′Z′.
Proof: Let V be the intersection point of α(XW) and (YZ); and let V′ be the intersection point of α(X′W′) and (Y′Z′). The lemma follows by lemma 7. ⊳

Say that ∠O1X1Y1 < ∠O2X2Y2 if there is a point W inside ∠O2X2Y2 such that ∠O1X1Y1 ≡ ∠O2X2W.

Lemma 22. If ∠O1X1Y1 < ∠O2X2Y2 and ∠O2X2Y2 < ∠O3X3Y3 then ∠O1X1Y1 < ∠O3X3Y3.

Proof: Without loss of generality it may be assumed that O1 = O2 = O3, X1 = X2 = X3, Y1 is inside ∠O1X1Y2, and Y2 is inside ∠O1X1Y3. The lemma follows by considering the segment (Y1Y3). ⊳

Lemma 23. Suppose XYZ is a triangle and W-Y-Z. Then ∠WXZ < ∠YXZ.

Proof: Let α be the line through W parallel to α(XY) and β the line through X parallel to α(YZ). Then YXVW is a parallelogram, where V is the point of intersection of α and β. By lemma 19 (WX) is inside ∠WYV, which is congruent to ∠YXZ. ⊳

If OUV is a triple of distinct non-collinear points, let U′ be the other point of α(OU) such that OU ≡ OU′. Say that OUV forms a right angle if VU ≡ VU′. Using lemma 16, if this is so then all four angles at the intersection O of α(OU) and α(OV) are right angles. It is readily verified that two right angles are congruent. If all four angles at the intersection of lines α and β are right angles, the lines α and β are said to be perpendicular. Using lemma 16, if β1 and β2 are distinct lines perpendicular to α then β1 and β2 are parallel; also if β1 is a perpendicular and β2 is parallel to β1 then β2 is a perpendicular.

Lemma 24. Given a point X and a line α, there is a unique line β perpendicular to α, such that X ∈ β.

Proof: Suppose α = α(OU), and let U′ be the distinct point of α with U′O ≡ OU. Let W be a point not in α, and let W′ be the point on the same side of α as W with UWU′ ≡ U′W′U. Let V be the midpoint of [WW′]. Using the triangles UWW′ and U′W′W it follows by axiom D7 that UV ≡ U′V. This shows that there is a perpendicular β1 to α with O ∈ β1.
Such a perpendicular is unique by lemma 12. If X is on β1 let β = β1; otherwise let β be the line through X parallel to β1. ⊳

Lemma 25. Suppose XYZ is a triangle with ∠XYZ a right angle. Then ∠YXZ < ∠XYZ and ∠ZXY < ∠XYZ.

Proof: Let α be the line through Y parallel to α(XZ) and β the line through Z parallel to α(XY). Then WYXZ is a parallelogram, where W is the point of intersection of α and β. Further ∠YXW is a right angle and Z is inside it, so ∠YXZ < ∠YXW; similarly ∠ZXY < ∠ZXW where ∠ZXW is a right angle. ⊳

Lemma 26. Suppose XYZ is a triangle with ∠XYZ a right angle. Suppose α is the perpendicular to α(YZ) passing through X, and that α and α(YZ) intersect in the point W. Then Y-W-Z.

Proof: W = Y or W = Z are ruled out by lemma 25. Y-Z-W and W-Y-Z are ruled out by lemmas 23, 25, and 22. ⊳

Theorem 2. Suppose OUV is a right angle with OU ≡ OV. Then there is a unique bijection from the Euclidean plane to R² with the following properties, where X̃ is used to denote the value at X.
a. Õ = ⟨0, 0⟩, Ũ = ⟨1, 0⟩, and Ṽ = ⟨0, 1⟩.
b. Z ∈ α(XY) if and only if Z̃ = X̃ + t(Ỹ − X̃) for some t ∈ R.
c. D(X, Y) = d(X̃, Ỹ).
d. X-Z-Y if and only if Z̃ = X̃ + t(Ỹ − X̃) for some t ∈ R such that 0 < t < 1.

Proof: By the proof of lemma 9, the requirements imply that for P_r in α(OU), P̃_r = ⟨r, 0⟩. Likewise, for P_r in α(OV), P̃_r = ⟨0, r⟩. Let α be the line through P_r on α(OU), which is parallel to α(OV); let X be another point on α. Then X̃ must be ⟨r, s⟩ for some s ∈ R, else α would intersect α(OV). Choose X to be the point such that OV ≡ P_r X, which is on the same side of α(OU) as V. Then (since VX cannot intersect OP_r), X̃ must be ⟨r, 1⟩. This determines X̃ for all X ∈ α. Finally, every X occurs in some α, namely that α with X ∈ α which is parallel to α(OV); this must intersect α(OU), else it is parallel to both α(OU) and α(OV), which intersect but are not equal. Call α(OU) the x-axis, and α(OV) the y-axis.
It has been shown that if there is a bijection with the required properties then it must be as just described; indeed it suffices that it have the required properties on the x-axis and all lines parallel to the y-axis. It remains to show that this map has the required properties in all other cases. Let P_{xy} denote the point such that P̃_{xy} = ⟨x, y⟩. Using lemma 19, P_{xy} lies on the line through P_{0y} which is parallel to the x-axis; further this line has properties b, c (meaning it holds for all X and Y on the line), and d.

Let α be a line not parallel to the x-axis or the y-axis. Let P^α_x be the point where α intersects the line through P_{x0} and parallel to the y-axis. Let y0 be such that P^α_0 = P_{0,y0}, and let s be such that P^α_1 = P_{1,y0+s}. Given x1 < x2, let y1 be such that P^α_{x1} = P_{x1,y1}, let y2 be such that P^α_{x2} = P_{x2,y2}, and let y3 be such that P^α_{x2−x1} = P_{x2−x1,y3}. Using lemma 16 and angle-side-angle, y2 = y1 + (y3 − y0). By induction, P^α_n = P_{n,y0+ns} for n ∈ Z. It then follows that P^α_x = P_{x,y0+xs} for x ∈ Q_d. Finally, this holds for all x ∈ R using axiom C. A similar argument shows that D(P_{x,y0+xs}, P_{00}) = D(P_{1,y0+s}, P_{00}). Properties b and d for α follow readily.

To prove that property c holds, it suffices to show that l² = x² + y² where l = D(P_{00}, P_{xy}) (the “Pythagorean theorem”); it may be assumed that x > 0 and y > 0. Let β be the line perpendicular to α(P_{00}P_{xy}), with P_{x0} ∈ β. Let Q be the point of intersection of β and α(P_{00}P_{xy}). By lemma 26 P_{00}-Q-P_{xy}. Let l1 = D(Q, P_{00}), l2 = D(Q, P_{xy}), and l3 = D(Q, P_{x0}); then l = l1 + l2. Using the facts that ∠P_{00}P_{x0}Q ≡ ∠P_{00}QP_{x0}, and ∠P_{x0}P_{00}P_{xy} and ∠QP_{00}P_{x0} are both right angles, it follows that l1/x = l3/y = x/l. Similarly l3/x = l2/y = y/l. Thus, x² = l1·l and y² = l2·l, and using l = l1 + l2, l² = x² + y² follows. This concludes the proof of theorem 2.
⊳

Theorems 1 and 2 show that the second order axioms for the Euclidean plane characterize it as an essentially unique structure.

Appendix 2. Computability (II).

In applications of computability theory it is necessary to have a repertoire of functions which have been shown to be computable, in particular various “syntactic functions” involved in the arithmetization of syntax, and other string manipulations. Showing that such functions are computable is somewhat lengthy no matter how it is done, and usually involves giving additional characterizations of the computable functions.

The following classes of functions will be defined: representable, µ-recursive, primitive recursive, and Turing computable. Outlines will be given of proofs of the following facts.
1. A representable function is computable.
2. A µ-recursive function is representable.
3. A primitive recursive function is µ-recursive.
4. Various syntactic functions are primitive recursive.
5. A Turing computable function is µ-recursive.
6. Further syntactic functions are Turing computable.
7. A Turing computable function is computable.
In particular, µ-recursive and Turing computable are the same as computable. There are methods which do not involve as many overall steps; in [Shoenfield1] for example the Turing computable functions are not defined and step 6 is carried out using µ-recursion directly. The proof is still lengthy, however; and the coding of strings is more technical. The method here provides additional facts. In particular, it will be noted that the syntactic functions are members of a class of functions which is considerably smaller than the entire class of computable functions. Many omitted details can be found in chapter 12 of [Dowd1].

A k-ary function f is said to be representable if there is a formula F, with k + 1 free variables x1, ..., xk, y, such that f(n1, ..., nk) = m if and only if ⊢ F_{n1/x1,...,nk/xk} ⇔ y = m (in this section ⊢ denotes provability from the axioms of Q). It is readily seen that fact 1 holds. Representability is a slightly stronger concept than computability, of only technical interest.

The µ-recursive functions are defined by a recursive definition. To streamline the definition some preliminary definitions are helpful. Variables may be assumed to be ordered in “alphabetical” order x0, x1, .... A term t involving functions which have already been assigned a meaning determines a function f_t. If t has k variables then f_t is k-ary; argument position i corresponds to the ith variable in the alphabetical order. A recursive definition may be given, similar to the definition of t̂ given in section 6.

If g(x⃗, y) is a (k + 1)-ary function the minimization (or minimalization) of g (on y) is the k-ary partial function φ(x⃗), such that φ(x⃗) = y if and only if y is the least value such that g(x⃗, y) = 0. In general φ is a partial function, since there might be no y such that g(x⃗, y) = 0. The µ-recursive functions are defined as follows.
1. The functions s, +, and · are µ-recursive.
2. If t is a term involving 0 and already defined µ-recursive functions then f_t is µ-recursive.
3. If f is obtained from the already defined µ-recursive function g by minimization then f is µ-recursive.
Note that if t has no variables then f_t is a constant; for reasons such as this it is convenient to consider constants to be 0-ary functions.

Theorem 1. If f is µ-recursive then f is representable.

Proof: Recall from section 7 that if k + l = m then ⊢ k + l = m, and if k · l = m then ⊢ k · l = m, where ⊢ denotes provability from the axioms of Q. It follows using these, the axioms of equality, and propositional logic that 0 is represented by y = 0, s by y = x^s, + by y = x1 + x2, and · by y = x1 · x2. The reader may verify this, or see the “Equality theorem” of [Shoenfield1].
Suppose f is represented by F, and for 1 ≤ i ≤ k, ti is represented by Gi. By renaming variables, the argument variables of F may be assumed to be z1, ..., zk, and the value variable of Gi to be zi. Then f(t1, ..., tk) is represented by ∃z1 ... ∃zk (G1 ∧ ··· ∧ Gk ∧ F). The required properties of this formula may be derived from the inductively assumed ones by predicate logic; details may be found in [Shoenfield1].

Recall the definition of ≤ from section 7; and that ⊢ x ≤ k ⇒ (x = 0 ∨ ··· ∨ x = k). Also, ⊢ x ≤ n ∨ n ≤ x (see [Yasuhara]), and ⊢ ¬n ≤ x ⇒ x = 0 ∨ ··· ∨ x = n − 1 follows. If h(w, x⃗) is represented by H(w, x⃗, y) then µw(h(w, x⃗) = 0) is represented by H(w, x⃗, 0) ∧ ∀v(H(v, x⃗, 0) ⇒ w ≤ v). This follows by predicate logic from the induction hypothesis and the above facts. ⊳

Given a k-ary function g and a (k + 2)-ary function h, there is a unique (k + 1)-ary function f such that f(0, y⃗) = g(y⃗) and f(x^s, y⃗) = h(x, y⃗, f(x, y⃗)). In this case f is said to be obtained by primitive recursion from the functions g and h. The existence and uniqueness of f can be proved in basic set theory. The primitive recursive functions are defined as follows.
1. The function s is primitive recursive.
2. If t is a term involving 0 and already defined primitive recursive functions then f_t is primitive recursive.
3. If f is obtained from already defined primitive recursive functions g and h by primitive recursion then f is primitive recursive.

Some facts about the integers are required for the proof of the next theorem. If n and d are elements of N with d ≠ 0, there are unique q, r ∈ N such that n = dq + r and r < d. The notation Rem(n, d) will be used for r. Other facts required concern prime factorization, properties of relative primality, and the Chinese remainder theorem. Discussions of these topics can be found in various sources, including [Dowd1]. The notation n! is used for the factorial function. Let GP(x, y) = ((x + y)(x + y + 1))/2 + y.
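A minimal sketch in Python of the function GP just defined, together with one way of recovering x and y from the value. The names gp and gp_inverse, and the use of an integer square root to invert the formula, are illustrative assumptions and are not taken from the text.

```python
from math import isqrt


def gp(x: int, y: int) -> int:
    """GP(x, y) = ((x + y)(x + y + 1))/2 + y, for nonnegative integers x, y."""
    return (x + y) * (x + y + 1) // 2 + y


def gp_inverse(z: int) -> tuple:
    """Return the pair (x, y) with gp(x, y) == z."""
    # w = x + y is the largest integer with w(w + 1)/2 <= z.
    w = (isqrt(8 * z + 1) - 1) // 2
    # The offset of z above the triangular number w(w + 1)/2 is y.
    y = z - w * (w + 1) // 2
    return (w - y, y)


if __name__ == "__main__":
    # Enumerate the first few values of z with their pairs.
    for z in range(6):
        print(z, gp_inverse(z))
```

Checking that gp(*gp_inverse(z)) == z and gp_inverse(gp(x, y)) == (x, y) on a finite range is a quick sanity test, though of course not a proof.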
The function GP is a bijection from N × N to N, as the reader may verify. It is called either the Godel or the Cantor pairing function. Let GP1(x) and GP2(x) denote the first and second components of the pair corresponding to x.

Theorem 2. If f is obtained from µ-recursive functions g and h by primitive recursion then f is µ-recursive.

Proof: Let β(c, d, i) = Rem(c, 1 + (i + 1)d). Given a sequence a0, ..., a_{k−1} of nonnegative integers there are c and d such that β(c, d, i) = ai for 0 ≤ i < k. Indeed, let d = n! where n ≥ k − 1 and n ≥ ai for all i; then the numbers 1 + (i + 1)d for 0 ≤ i < k are pairwise relatively prime, and each is greater than every ai. This follows because a prime divisor of 1 + (i + 1)d must be greater than n; and a prime divisor of two such values must divide their difference, and hence be at most n. Now by the Chinese remainder theorem c may be chosen. Let γ(x, i) = β(GP1(x), GP2(x), i), so that c and d are coded by a single value. Suppose f is defined by f(0, y⃗) = g(y⃗) and f(x^s, y⃗) = h(x, y⃗, f(x, y⃗)). Using µ to denote minimization, let
Vseq_f(x, y⃗) = µs(γ(s, 0) = g(y⃗) ∧ µi(γ(s, i + 1) ≠ h(i, y⃗, γ(s, i))) ≥ x).
Then f(x, y⃗) = γ(Vseq_f(x, y⃗), x). ⊳

The method of sequence coding used in the proof of theorem 2 is due to Godel. As an almost immediate consequence of the theorem, every primitive recursive function is µ-recursive.

The primitive recursive functions are an easily defined class of computable functions which contain functions of interest for arithmetization of syntax and similar technical details. Godel used them for this purpose. Some basic primitive recursive functions relevant to string manipulation are as follows, where m ≥ 2.
Predecessor: Pred(0) = 0, Pred(x^s) = x.
Limited difference: Ldif(x, 0) = x, Ldif(x, y^s) = Pred(Ldif(x, y)).
Conditional: Cond(0, y, z) = y, Cond(x^s, y, z) = z.
m-ary conditional: Cond_m(x, y0, ..., y_{m−1}) = Cond(x, y0, Cond(Pred(x), y1, ···)).
Right digit of m-adic notation: Rdig_m(0) = 0, Rdig_m(x^s) = Cond_{m+1}(Rdig_m(x), 1, Rdig_m(x) + 2, ..., Rdig_m(x) + 1).
Trim right digit of m-adic notation: Trdig_m(0) = 0, Trdig_m(x^s) = Cond_{m+1}(Rdig_m(x), 0, Trdig_m(x), ..., Trdig_m(x)^s).
Trim digits from right of m-adic notation: Trdigs_m(0, y) = y, Trdigs_m(x^s, y) = Trdig_m(Trdigs_m(x, y)).
Length of m-adic notation up to limit: Lenl_m(0, x) = 0, Lenl_m(w^s, x) = Cond_m(Trdigs_m(w, x), Lenl_m(w, x), Lenl_m(w, x)^s).
Length of m-adic notation: Len_m(x) = Lenl_m(x, x).

Let App^i_m be the function x + ··· + x + i where x is repeated m times. A function f is said to be defined from functions g and hi for 1 ≤ i ≤ m by recursion on m-adic notation if f(0, y⃗) = g(y⃗) and f(App^i_m(x), y⃗) = hi(x, y⃗, f(x, y⃗)).

Lemma 3. For m ≥ 2 the primitive recursive functions are closed under recursion on m-adic notation.

Proof: Define by primitive recursion the function k(w, x, y⃗) which is the value of f(u, y⃗) where u is the leftmost w digits of x. Thus (omitting the subscript m), let Ldigs(x, y) = Trdigs(Ldif(Len(y), x), y); and let k(0, x, y⃗) = g(y⃗), and k(w + 1, x, y⃗) = Cond_{m+1}(Rdig(Ldigs(w + 1, x)), 0, t1, ..., tm) where ti = hi(Ldigs(w, x), y⃗, k(w, x, y⃗)). Then f(x, y⃗) = k(x, x, y⃗). ⊳

The definition of Turing computability involves the notion of a Turing machine. A Turing machine has a tape alphabet A, and a “head state” alphabet Q. One of the tape symbols is distinguished as a blank symbol, denoted b. There is a set of rules, of the form qs → q′s′d where q, q′ ∈ Q, s, s′ ∈ A, and d is L (left) or R (right). A Turing machine is called deterministic if no pair qs occurs more than once on the left side of a rule; Turing machines will here be required to be deterministic. The state of the machine is represented by a string over Q ∪ A, which contains exactly one symbol from Q. If the state is of the form αqsβ, the rule with left side qs (if any) is applied to update the state, resulting in a step of the computation.
If the right side is q′s′R the new state is αs′q′β, which may be described as, “if the tape head is reading s in state q then it overwrites s with s′, switches to state q′, and moves one square to the right”. The remaining types of transitions are as follows. If the state is αq and there is a rule with left side qb and right side q′s′R the new state is αs′q′ (the tape grows a blank square). If the rule is qs → q′s′L then αtqsβ becomes αq′ts′β and qsβ becomes q′bs′β; and if s = b then αtq becomes αq′ts′ and q becomes q′bs′.

A computation is a sequence of states, where each follows from the previous according to a rule. The transition from one state to the next is called a step of the computation. Note that blanks may be added to the beginning or end of the initial state, without affecting the computation. A state may be reached which has no successor, in which case the computation has halted. Alternatively a computation may continue forever.

To view a Turing machine as computing a partial function from N to N, conventions for coding the input and output must be adopted. A simple choice is to represent a nonnegative integer n by its 2-adic notation; the initial state is then the input preceded by the initial head state. If the computation halts in state αqβ, the output value is the longest valid integer following q (the longest string of 1’s and 2’s). A Turing machine may be viewed as computing a k-ary partial function φ(x1, ..., xk), by adding a comma to the tape alphabet.

Lemma 4. For any Turing machine M the step function Step_M(x) is primitive recursive.

Proof: The following functions are primitive recursive, where S is a subset of {1, ..., m}. The abbreviation xi is used for App^i_m(x), 1 ≤ i ≤ m.
Concatenation: Concat(x, 0) = x, Concat(x, yi) = Concat(x, y)i.
Leftmost member of S: Lftm_S(0) = 0, Lftm_S(xi) = Lftm_S(x) if i ∉ S, Lftm_S(xi) = Cond(Lftm_S(x), Lftm_S(x), i) if i ∈ S.
Similarly the functions SlLftm_S(x) (substring to the left of it) and SrLftm_S(x) (substring to the right of it) may be defined. Remaining details are left to the reader. ⊳

Theorem 5. If a Turing machine M computes a function f then f is µ-recursive.

Proof: The following functions are primitive recursive:
- Halted_M(x), which is 0 if x is a halted state, else 1.
- Comp_M(x, y), which is the state starting at x, after y steps, with steps after halting doing nothing.
- In_M(x), which translates a 2-adic integer to the initial state.
- Out_M(x), which translates a state to the output value.
The function f is a composition of the above functions, and µy(Halted_M(Comp_M(x, y)) = 0). ⊳

To show that a function is computable it suffices to show that it is Turing computable, and this is often easier than showing directly that it is µ-recursive. According to [Soare], Godel was not convinced that a correct formal definition of computation had been given, until the appearance of Turing machines.

As an example, the function IsTerm(x), which is 0 if x is (the Godel number of) a term, else 1, is Turing computable. A Turing machine which computes this function repeats the following step, until the leftmost function symbol is checked. Find the leftmost symbol which is either 0; x; or a function symbol followed by a left parenthesis, checked symbols or commas with at least one checked symbol after the left parenthesis and each comma, and a right parenthesis. In the first case check 0; in the second check x and following 1’s and 2’s; and in the third check all symbols from the function symbol to the right parenthesis. The checkmarks needed for this can be placed on a separate “track”; the tape alphabet letters can be considered as tuples, with the ith element being the symbol in the ith track. A more detailed consideration of “Turing machine programming” can be found in numerous references, for example [AHU].
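The transition rules for Turing machines given earlier in this appendix can be sketched in Python as follows. This is an illustration under stated assumptions, not a construction from the text: the state string αqβ is held as a triple (α, q, β) rather than coded as a single integer, and the names step, run, output and the example machine RULES (which rewrites every 1 as 2 and returns the head to the left end) are hypothetical.

```python
BLANK = "b"


def step(rules, state):
    """Apply one rule to the state (alpha, q, beta); return None if halted."""
    left, q, right = state
    # A state of the form "alpha q" is treated as reading a blank.
    s = right[0] if right else BLANK
    rule = rules.get((q, s))
    if rule is None:
        return None  # no rule with left side qs: the computation has halted
    q2, s2, d = rule
    rest = right[1:]
    if d == "R":
        # alpha q s beta  becomes  alpha s' q' beta
        return (left + [s2], q2, rest)
    if left:
        # alpha t q s beta  becomes  alpha q' t s' beta
        return (left[:-1], q2, [left[-1], s2] + rest)
    # q s beta  becomes  q' b s' beta (the tape grows a blank on the left)
    return ([], q2, [BLANK, s2] + rest)


def run(rules, q0, tape, max_steps=10_000):
    """Run from the initial state q0 followed by the input; return the halted state."""
    state = ([], q0, list(tape))
    for _ in range(max_steps):
        nxt = step(rules, state)
        if nxt is None:
            return state
        state = nxt
    raise RuntimeError("no halt within max_steps")


def output(state):
    """The longest string of 1's and 2's following the head state symbol."""
    digits = []
    for sym in state[2]:
        if sym not in ("1", "2"):
            break
        digits.append(sym)
    return "".join(digits)


# Hypothetical example machine: q0 sweeps right turning each 1 into 2,
# q1 sweeps back left, and q2 (which has no rules) is the halting state.
RULES = {
    ("q0", "1"): ("q0", "2", "R"),
    ("q0", "2"): ("q0", "2", "R"),
    ("q0", BLANK): ("q1", BLANK, "L"),
    ("q1", "1"): ("q1", "1", "L"),
    ("q1", "2"): ("q1", "2", "L"),
    ("q1", BLANK): ("q2", BLANK, "R"),
}

if __name__ == "__main__":
    print(output(run(RULES, "q0", "121")))
```

Storing the rules in a dictionary keyed by the pair (q, s) enforces determinism automatically, since a key can occur only once. The bound max_steps is a practical guard only; a computation may of course continue forever.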
After some labor, it should be possible to outline programs to compute the following functions.
- Sub(x, y), which equals ⌜F_{t/v}⌝ if x = ⌜F⌝ where F is a formula with one free variable v, and y = ⌜t⌝ where t is a term.
- Num(x), which equals ⌜x̄⌝ where as in section 7 x̄ is the numeral for x.
- Prf_Q(x), which is 0 if x is a proof in Q, else 1.
- Thm(x), which is the last formula of the proof x.

Theorem 6.
a. If P is computably enumerable then there is a Turing machine which halts on input x⃗ if and only if P(x⃗).
b. If f is computable then f is Turing computable.

Proof: For part a, let F be a formula showing P is computably enumerable. On input n⃗, the Turing machine successively computes Prf_Q(y) for y = 0, 1, .... If y is a proof, the machine checks whether it is a proof of F_{n1/x1,...,nk/xk}, and halts if so. For part b, the Turing machine checks for a proof of F_{n1/x1,...,nk/xk,m/y} for some m, and produces m as output when one is found. ⊳

Theorem 7. If P is a computable k-ary predicate then there is a formula H with free variables v1, ..., vk such that if P(n⃗) then ⊢ H_{n1/v1,...,nk/vk}, and if ¬P(n⃗) then ⊢ ¬H_{n1/v1,...,nk/vk}.

Proof: To simplify the notation the proof will be given for k = 1, writing x for the free variable. Suppose P is a computable predicate. Let F be the formula such that P(n) if and only if ⊢ F_{n/x}, and let G be the formula such that ¬P(n) if and only if ⊢ G_{n/x}. The function f_P(n), which equals 0 if P(n) and 1 if ¬P(n), is µ-recursive. Indeed, for a formula F let w_F(w, x) = Prf_Q(w, Sub(N_F, Num(x))) be the µ-recursive function which states that w is a proof that F holds at x. Then f_P(x) = w_F(µw(w_F(w, x) = 0 ∨ w_G(w, x) = 0), x). Suppose H′ is the formula with free variables x and y, representing f_P. Let H be H′_{0/y}. ⊳

The predicate U(f, n) mentioned in section 9 can be defined as follows. U(f, n) if and only if for some w, w is a proof of F_{n/v}, where f = ⌜F⌝ and v is the free variable of F.
The length ℓ(x) of a string x over a finite alphabet is defined to be the number of symbols comprising the string. Let ∗ denote 2-adic concatenation, which is defined by x ∗ 0 = x, x ∗ yi = (x ∗ y)i for i = 1, 2. 233 The operation of concatenating x with itself ℓ(y) times is defined by x ⊛ 0 = 0, x ⊛ yi = x ∗ (x ⊛ y). This is an easily defined function with ℓ(x ⊛ y) = ℓ(x) · ℓ(y). A function f is said to be defined from functions g, hi for 1 ≤ i ≤ m, and b by bounded recursion on m-adic notation if f (0, ~y) = g(~y ) and f (xi, ~y ) = min(b(x, ~y ), hi (x, ~y , f (x, ~y ))). The class L is defined as the least class containing x1, x2, ∗, and ⊛; and closed under definition by terms and bounded recursion on notation. Cobham’s theorem states that a function f (~x) is in L if and only if there is a Turing machine M computing f and a polynomial p such that M halts within ℓ(x1 ) + · · · + ℓ(xk ) steps. This is stated here without proof (see [Dowd1] for additional details), and also the following. Various functions defined above are in L, including + and · ; the string manipulation functions; StepM , HaltedM , InM , OutM ; and Sub, Num, PrfQ , and Thm. 234 References [AHU] A. Aho, J. Hopcroft, and J. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, 1974. [Abraham] U. Abraham, “Proper Forcing”, In Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear. [AbrMag] U. Abraham and M. Magidor, “Cardinal Arithmetic”, In Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear. [AbrShel] U. Abraham and S. Shelah, “A ∆22 Well-Order of the Reals And Incompactness of L(QMM )”, manuscript, 1998. [BJW] A. Beller, R. Jensen, and P. Welch, “Coding the Universe”, London Mathematical Society Lecture Notes No. 47, 1982. [Bagaria] J. Bagaria, “Natural Axioms of Set Theory and the Continuum Problem”, in Proceedings of the 12th International Congress of Logic, Methodology, and Philosophy of Science, King’s College London, 2005, 43–64. [Bart] T. 
Bartoszynski, “Invariants of measure and category”, in Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear.
[Barwise] J. Barwise, Admissible Sets and Structures, Springer-Verlag, 1971.
[Belaniuk] S. Bilaniuk, A Problem Course in Mathematical Logic, euclid.trentu.ca/math/sb/pcml, 2003.
[BTW] J. Baumgartner, A. Taylor, and S. Wagon, “On splitting stationary subsets of large cardinals”, J. Symb. Logic 42 (1977), 203–214.
[CFM] J. Cummings, M. Foreman, and M. Magidor, “Squares, scales and stationary reflection”, Journal of Mathematical Logic 1 (2001), 35–99.
[ChaKei] C. Chang and H. Keisler, Model Theory, North-Holland Publishing Company, 1973.
[Chong] C. Chong, Techniques of Admissible Recursion Theory, Lecture Notes in Mathematics 1106, Springer-Verlag, 1984.
[Ciesielski] K. Ciesielski, Set Theory for the Working Mathematician, London Math Society Student Texts 39, Cambridge University Press, 1997.
[Cummings] J. Cummings, “Iterated Forcing and Elementary Embeddings”, in Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear.
[DevJen] K. Devlin and R. Jensen, “Marginalia to a theorem of Silver”, Lecture Notes in Mathematics 499, Springer, 1975, 115–142.
[DevJohn] K. Devlin and H. Johnsbraten, The Souslin Problem, Lecture Notes in Mathematics 405, Springer-Verlag, 1974.
[Devlin] K. Devlin, Constructibility, Springer-Verlag, 1984.
[Dodd] A. Dodd, The Core Model, Cambridge University Press, 1982.
[DoddJen1] A. Dodd and R. Jensen, “The Core Model”, Annals of Mathematical Logic 20 (1981), 43–75.
[DoddJen2] A. Dodd and R. Jensen, “The Covering Lemma for K”, Annals of Mathematical Logic 22 (1982), 1–30.
[DoddJen3] A. Dodd and R. Jensen, “The Covering Lemma for L[U]”, Annals of Mathematical Logic 22 (1982), 127–135.
[Dowd1] M. Dowd, Introduction to Algebra, Topology, and Category Theory, www.hyperonsoft.com, 2006.
[Dowd2] M. Dowd, “Some New Axioms for Set Theory”, submitted to IJPAM.
[Drake] F.
Drake, Set Theory, An Introduction to Large Cardinals, North-Holland, 1974.
[Enderton] H. Enderton, A Mathematical Introduction to Logic, Academic Press, 1972.
[Foreman] M. Foreman, “Generic Large Cardinals: New Axioms for Mathematics?”, manuscript, 2001, www.math.uiuc.edu/documenta/xvol-icm/01/01.html.
[Friedman1] H. Friedman, “Subtle Cardinals and Linear Orderings”, manuscript, 1998.
[Friedman2] H. Friedman, “Does Mathematics Need New Axioms?”, manuscript, 2000.
[Friedman3] H. Friedman, “Boolean Relation Theory and the Incompleteness Phenomena”, manuscript, 2007.
[Fremlin1] D. Fremlin, Consequences of Martin’s Axiom, Cambridge University Press, 1984.
[Fremlin2] D. Fremlin, Measure Theory, manuscript, 2006.
[Gaifman] H. Gaifman, “A generalization of Mahlo’s method for obtaining large cardinal numbers”, Israel J. Math. 5 (1967), 188–200.
[Gandy] R. Gandy, “Set-theoretic functions for elementary syntax”, in Proceedings of Symposia in Pure Mathematics 13, Part II, American Mathematical Society, 1974, 103–126.
[Geschke] S. Geschke, “Models of Set Theory”, manuscript, 2008.
[Gitik] M. Gitik, “The Power Set Function”, Proceedings of the International Conference of Mathematics, 2002, 507–513.
[Godel] K. Godel, “What is Cantor’s continuum problem?”, 1947.
[GoldShe] M. Goldstern and S. Shelah, “The Bounded Proper Forcing Axiom”, J. Symbolic Logic 60 (1995), 58–73.
[HajPud] P. Hajek and P. Pudlak, Metamathematics of First-Order Arithmetic, Springer-Verlag, 1993.
[Harrington] L. Harrington, “Analytic determinacy and $0^{\#}$”, J. Symbolic Logic 43 (1978), 685–693.
[HardWr] G. Hardy and E. Wright, An Introduction to the Theory of Numbers, Oxford University Press, 1968.
[Hauser] K. Hauser, “Godel’s Program Revisited Part I: the Turn to Phenomenology”, The Bulletin of Symbolic Logic 12 (2006), 529–590.
[Jackson1] S. Jackson, Math 6010 Notes, manuscript, 2003.
[Jackson2] S. Jackson, “Structural Consequences of AD”, in Handbook of Set Theory, ed. M. Foreman and A.
Kanamori, to appear.
[Jech1] T. Jech, “Stationary subsets of inaccessible cardinals”, in Axiomatic Set Theory, Contemporary Mathematics 31 (1984), 115–142, ed. J. Baumgartner, D. Martin, and S. Shelah, American Mathematical Society.
[Jech2] T. Jech, Set Theory, Springer, 2003.
[Jensen] R. Jensen, “The fine structure of the constructible hierarchy”, Ann. Math. Logic 4 (1972), 229–308.
[KanMag] A. Kanamori and M. Magidor, “The Evolution of Large Cardinal Axioms in Set Theory”, in Higher Set Theory, Lecture Notes in Mathematics 669, Springer, 1978, 99–275.
[Kanamori1] A. Kanamori, “Set Theory From Cantor to Cohen”, manuscript, 2007.
[Kanamori2] A. Kanamori, “Tennenbaum and Set Theory”, manuscript, 2007.
[Kanamori3] A. Kanamori, The Higher Infinite, Springer, 2003.
[Kanamori4] A. Kanamori, in “Reviews”, The Bulletin of Symbolic Logic 9 (2003), 237–241.
[Kaye] R. Kaye, Models of Peano Arithmetic, Clarendon Press, 1991.
[Kechris] A. Kechris, Classical Descriptive Set Theory, Springer-Verlag, 1995.
[Komjath] P. Komjath, “Shelah’s proof of diamond”, manuscript, 2008.
[KoeWood] P. Koellner and H. Woodin, “Large Cardinals from Determinacy”, in Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear.
[Koellner1] P. Koellner, “The Search for New Axioms”, Ph. D. thesis, Massachusetts Institute of Technology, 2003.
[Koellner2] P. Koellner, “On Reflection Principles”, manuscript, 2008.
[Kunen1] K. Kunen, “Some applications of iterated ultrapowers in set theory”, Annals of Mathematical Logic 1 (1970), 179–227.
[Kunen2] K. Kunen, Set Theory: an Introduction to Independence Proofs, North-Holland, 1980.
[Linden] T. Linden, “Equivalences between Godel’s definitions of constructibility”, in Sets, Models, and Recursion Theory, J. Crossley, Ed., North-Holland, 1967.
[MacTutor] The MacTutor History of Mathematics archive, University of St Andrews, www-groups.dcs.st-and.ac.uk/ history
[MagShel] M. Magidor and S.
Shelah, “The tree property at successors of singular cardinals”, manuscript, 2003.
[Magnus] P. Magnus, forall x: An Introduction to Formal Logic, manuscript, 2008.
[MansWeit] R. Mansfield and G. Weitkamp, Recursive Aspects of Descriptive Set Theory, Oxford University Press, 1985.
[Mathias] A. Mathias, “Weak systems of Gandy, Jensen and Devlin”, manuscript, 2006.
[Mendelson] E. Mendelson, Introduction to Mathematical Logic, van Nostrand, 1964.
[Miller] A. Miller, Descriptive Set Theory and Forcing, Lecture Notes, University of Wisconsin, 1995.
[Mitchell1] W. Mitchell, “Beginning Inner Model Theory”, in Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear.
[Mitchell2] W. Mitchell, “The Covering Lemma”, in Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear.
[Monk1] J. Monk, Introduction to Set Theory, McGraw-Hill, 1969.
[Monk2] J. Monk, Mathematical Logic, Springer-Verlag, 1976.
[Moore] T. Moore, “Set Mapping Reflection”, J. Math. Logic 5 (2005), 87–97.
[Moschovakis] Y. Moschovakis, Descriptive Set Theory, North-Holland, 1980.
[NagNew] E. Nagel and J. Newman, Godel’s Proof, Routledge, 1989.
[Neeman1] I. Neeman, “Determinacy in L(R)”, in Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear.
[Neeman2] I. Neeman, “Hierarchies of forcing axioms II”, Journal of Symbolic Logic 73 (2008), 522–542.
[Rasch] T. Rasch, “Erweiterbarkeit von Einbettungen”, Diploma thesis, Humboldt University in Berlin, 2000.
[Rathjen1] M. Rathjen, “A proof-theoretic characterization of the primitive recursive set functions”, Journal of Symbolic Logic, 1992.
[Rathjen2] M. Rathjen, “The Higher Infinite in Proof Theory”, manuscript, 1995.
[Rogers] H. Rogers, Theory of Recursive Functions and Effective Computability, McGraw-Hill, 1967.
[Roitman] J. Roitman, “Notes on Forcing”, manuscript, 2005.
[Rudin] W. Rudin, Principles of Mathematical Analysis, McGraw-Hill, 1964.
[SEP] Stanford Encyclopedia of Philosophy, plato.stanford.edu.
[Sacks1] G.
Sacks, Saturated Model Theory, W. A. Benjamin, 1972.
[Sacks2] G. Sacks, Higher Recursion Theory, Springer-Verlag, 1990.
[Sami] R. Sami, “Analytic determinacy and $0^{\#}$: A forcing-free proof of Harrington’s theorem”, Fundamenta Mathematicae 160 (1999), 153–159.
[SchSt] E. Schimmerling and J. Steel, “Fine Structure for Tame Inner Models”, The Journal of Symbolic Logic 61 (1996), 621–639. Transactions of the American Mathematical Society 351 (), 3119–3141.
[Schindler] R. Schindler, Set Theory, manuscript, 2007.
[SchZem] R. Schindler and M. Zeman, “Fine structure”, in Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear.
[Shelah1] S. Shelah, “Can you take Solovay’s inaccessible away?”, Israel J. Math. 48 (1984), 1–47.
[Shelah2] S. Shelah, “Logical Dreams”, Bulletin of the American Mathematical Society 40 (2003), 203–228.
[Shoenfield1] J. Shoenfield, Mathematical Logic, Addison-Wesley, 1967.
[Shoenfield2] J. Shoenfield, “Axioms of Set Theory”, in Handbook of Mathematical Logic, ed. J. Barwise, North-Holland, 1977.
[Simpson] S. Simpson, Subsystems of Second Order Arithmetic, to appear.
[Smorynski] C. Smorynski, Self-Reference and Modal Logic, Springer-Verlag, 1985.
[Smullyan] R. Smullyan, First-Order Logic, Springer-Verlag, 1968.
[Soare] R. Soare, “Computability and Recursion”, manuscript, 1996.
[Steel1] J. Steel, “Mathematics Needs New Axioms”, manuscript, 2000.
[Steel2] J. Steel, “Forcing with Tagged Trees”, Annals of Mathematical Logic 15 (1978), 55–74.
[Steel3] J. Steel, “An Outline of Inner Model Theory”, in Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear.
[Steel4] J. Steel, “PFA implies $\mathrm{AD}^{L(\mathbb{R})}$”, manuscript, 2007.
[TakZar1] G. Takeuti and W. M. Zaring, “Introduction to Axiomatic Set Theory”, Springer-Verlag, 1971.
[TakZar2] G. Takeuti and W. M. Zaring, “Axiomatic Set Theory”, Springer-Verlag, 1973.
[Telgarsky] R.
Telgarsky, “Topological Games: on the 50th Anniversary of the Banach-Mazur game”, Rocky Mountain Journal of Mathematics 17 (1987), 227–276.
[Todor1] S. Todorcevic, “Coherent Sequences”, in Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear.
[Todor2] S. Todorcevic, “A note on the Proper Forcing Axiom”, in Axiomatic Set Theory, 1983, 209–218.
[Viale] M. Viale, “Applications of the Proper Forcing Axiom to Cardinal Arithmetic”, Ph. D. thesis, University of Paris, 2006.
[Welch1] P. Welch, “An introduction to inner model theory”, lecture notes.
[Welch2] P. Welch, “$\Sigma^*$ Fine Structure”, in Handbook of Set Theory, ed. M. Foreman and A. Kanamori, to appear.
[Wiki] Wikipedia, the free encyclopedia, en.wikipedia.org
[Yasuhara] A. Yasuhara, Recursive Function Theory & Logic, Academic Press, 1971.