Set Theory: an Introduction

1. Intro

• A set theorist is a mathematician who admits not to know what the real numbers are.
• Some commonly used axioms, theorems, statements and patterns of logical reasoning are dangerous. One has to know the danger and should not ignore it.
• Naive set theory (Cantor: everything with a property is a set, i.e. any definable collection is a set) and Russell's paradox: the set of all sets $\supset$ the set of all sets not containing themselves as an element.
Let us call a set "abnormal" if it is a member of itself, and "normal" otherwise. For example, take the set of all squares in the plane. That set is not itself a square in the plane, and therefore is not a member of the set of all squares in the plane. So it is "normal". On the other hand, if we take the complementary set that contains all non-(squares in the plane), that set is itself not a square in the plane and so should be one of its own members, as it is a non-(square in the plane). It is "abnormal". Now we consider the set of all normal sets, $R$. Determining whether $R$ is normal or abnormal is impossible: if $R$ were a normal set, it would be contained in the set of normal sets (itself), and therefore be abnormal; and if $R$ were abnormal, it would not be contained in the set of all normal sets (itself), and therefore be normal. This leads to the conclusion that $R$ is neither normal nor abnormal: Russell's paradox.
• There is a model of $\mathbb{R}$ in which there is an $\varepsilon$-$\delta$ discontinuous function that is sequentially continuous. From ZF it cannot be proven that a sequentially continuous function on a compact interval attains its maximum.

2. What is a set?

• Main obstacle: A set cannot contain itself as an element.
• Solutions: certain axiomatics. Sierpinski: a set is of higher hierarchy than its elements.
• More common: the Zermelo-Fraenkel axioms.
• A set contains elements (objects): $a \in A$.
• Two sets are equal iff they have the same elements.
• No set is its own element.
• Given a condition, we do not know beforehand that there is an object fulfilling it. So it is convenient to define $\emptyset$ (the empty set) as the set that does not contain any element.
• Sets might be elements of other sets.
• Subsets are sets (collections of some elements in the set).
• The empty set is a subset of all sets.
• Unions of sets are sets: if $A$ is a set and for each $\alpha \in A$ there is a set $B_\alpha$, then the union $\bigcup_\alpha B_\alpha$ is defined as the set containing all the elements of at least one of the $B_\alpha$.
• Complements are sets: $B \setminus A = C$ is a set.
• Intersections of sets: if $A$ is a set and for each $\alpha \in A$ there is a set $B_\alpha$, then the intersection $\bigcap_\alpha B_\alpha$ is defined as the set containing all the elements that are elements of all $B_\alpha$.
• Cartesian product of sets: $A \times B$ is the set of ordered pairs $(x, y)$, $x \in A$, $y \in B$.
• Exponents of sets: $A^B$ is the set of all functions from $B$ into $A$. Example: $A^{\mathbb{N}}$ is the set of all sequences in $A$.

3. Consistency of ZF

By Gödel's theorem, ZF's consistency cannot be proven within ZF. For this one needs the existence of "large" cardinals (another axiom in set theory). However, redundancy is known.

4. Classes

• A class is a collection of sets (or sometimes other mathematical objects) that can be unambiguously defined by a property that all its members share.
• A class that is not a set (informally in Zermelo-Fraenkel) is called a proper class.
• Examples:
– The class of all sets
– The class of all one-element sets
– The class of all groups, rings, fields, vector spaces, etc.
• One way to prove that a class is proper is to place it in bijection with the class of all ordinal numbers.

5. Equivalence relations

Let $M$ be a set.

Definition 5.1. A relation "$\sim$" on $M$ (i.e. a subset of $M \times M$) is an equivalence relation if
(1) Reflexivity: $a \sim a$ for all $a \in M$.
(2) Symmetry: $a \sim b \implies b \sim a$.
(3) Transitivity: $a \sim b$ and $b \sim c \implies a \sim c$.

Definition 5.2. Partition into classes:
$$M = \bigcup_\alpha M_\alpha, \qquad \alpha \neq \beta \implies M_\alpha \cap M_\beta = \emptyset.$$
The $M_\alpha$ are called classes.

Lemma 5.1.
Any equivalence relation gives a partition into classes and vice versa.

Proof. ($\Longrightarrow$) Put $M_a = \{x \in M : x \sim a\}$; then $M_a = M_b \iff a \sim b$. (Reverse) Given a partition, set $a \sim b \iff a \in M_b$, where $M_b = M_\alpha$ denotes the class containing $b$. Two classes either coincide or are disjoint!

6. Equivalence of sets

Definition 6.1. $A \sim B$ iff there is a bijection $f : A \to B$.

Remark 6.1. Two finite sets are equivalent iff they have the same number of elements (generalization!).

Example 6.1. $\mathbb{N}^{\mathbb{N}} \sim [0, 1] \setminus \mathbb{Q}$ via continued fraction expansion. Note that $\mathbb{N}^{\mathbb{N}}$ is the set of all sequences of natural numbers.

Definition 6.2. A set is of infinite power (infinite) if it is not equivalent to any finite set.

Definition 6.3. A set is countable iff it is equivalent to $\mathbb{N}$. If a set is neither finite nor countable it is called uncountable.

Example 6.2.
• $\mathbb{Z}$ is countable.
• The even numbers are countable.
• $\mathbb{Q}$ is countable.
• Any subset $B$ of a countable set $A$ is countable or finite. (Enumerate $A$: $a_1, a_2, a_3, \dots$ and let $B$: $a_{n_1}, a_{n_2}, a_{n_3}, \dots$; either the enumeration of $B$ is finite or it gives a correspondence to $\omega$.)
• Any countable union of countable sets with specified bijections to $\mathbb{N}$ is countable (write a table with one row for each $A_n$: $\dots, a_{n,i}, \dots$ and mimic the enumeration of $\mathbb{Q}$).

Definition 6.4. A set is called Dedekind finite if it does not contain a countable set as a subset. Otherwise it is called Dedekind infinite.

Lemma 6.1 ($\omega$AC equivalent). Any infinite set is Dedekind infinite.

Proof.
• Choose arbitrary $a_1 \in M$ (OK!).
• Choose $a_2 \in M \setminus \{a_1\}$ (OK!).
• Continue $\dots$, since there are always elements left because $M$ is infinite (OK???????).
We run into the same problems as with sequential continuity. There is a model of $\mathbb{R}$ where $\mathbb{R}$ contains infinite sets that are Dedekind finite! This also gives an example of a discontinuous function that is sequentially continuous!

Theorem 6.1. $I := [0, 1] \cap \mathbb{R}$ is uncountable (in ZF without AC!).

Proof.
• Any $x$ has a unique binary expansion with infinitely many 0's. So $I \sim B \subset \{0, 1\}^{\mathbb{N}}$. (The latter is the set of all 0-1 sequences.)
• $\{0, 1\}^{\mathbb{N}} \sim C_{1/3} \subset I$ (use base 3 expansion).
• This shows by the Cantor-Bernstein Theorem that $I \sim \{0, 1\}^{\mathbb{N}}$.
• Use Cantor's diagonal argument to show that $\mathbb{N} \not\sim \{0, 1\}^{\mathbb{N}}$: Consider a table $[a_{k,l}]_{k,l=1}^{\infty}$ with countably many rows indexed by $k$, in each row countably many entries indexed by $l$, and each entry either 0 or 1. This table represents an enumeration (bijection with $\mathbb{N}$) of the 0-1 sequences. Let $0^* = 1$ and $1^* = 0$. Then the sequence $(a^*_{n,n})_{n=1}^{\infty}$ is not in the table, and hence there is no bijection between $\mathbb{N}$ and $\{0, 1\}^{\mathbb{N}}$.

7. The Cantor–Bernstein Theorem

Theorem 7.1 (Cantor–Bernstein). Let $A \sim B_1$, $B \sim A_1$, $A_1 \subset A$ and $B_1 \subset B$. Then $A \sim B$.

Proof. Let $f : A \to B_1$ and $g : B \to A_1$ be (any) corresponding bijective maps. Consider chains: $a \mapsto b$ iff $a \in A$, $b \in B_1 \subset B$ and $f(a) = b$. Similarly, $b \mapsto a$ iff $b \in B$, $a \in A_1 \subset A$ and $g(b) = a$.
• Each chain is "infinite to the right".
• Each element of $A$, respectively $B$, is contained in exactly one chain.
• There are 3 (disjoint) possibilities: a chain $C$ is "infinite in both directions" (type 1), the "least element" belongs to $A$ (type 2), or the "least element" belongs to $B$ (type 3).
• $f$ maps the elements of $A$ that belong to chains of type 1 or 2 in a 1-to-1 way onto $B^{1,2} \subset B$, the set of elements of $B$ in chains of type 1 or 2. The remaining elements of $A$ are in chains of type 3, and $g^{-1}$ maps $A \setminus \{\text{chains of type 1 or type 2}\}$ in a 1-to-1 fashion onto $B \setminus B^{1,2}$.

8. The power of a set, Cardinals

Definition 8.1. A cardinal number is an equivalence class of (nonempty) sets.

Definition 8.2. The power or cardinality of a set $A$ is the corresponding cardinal number, denoted by $m(A)$.

Remark 8.1. Some facts.
• The cardinality of a finite set is the natural number equal to the number of elements contained in the set.
• The set $\mathbb{N}$ of all natural numbers has the cardinality denoted by $\aleph_0$.
• The set $\mathbb{R}$ of all real numbers has the cardinality denoted by $c$.
• There are 4 possibilities:
– $A \sim B_1 \subset B$ and $B \sim A_1 \subset A$
– $A \sim B_1 \subset B$ but $\forall A_1 \subset A$: $A_1 \not\sim B$
– $B \sim A_1 \subset A$ but $\forall B_1 \subset B$: $B_1 \not\sim A$
– $\forall A_1 \subset A$: $A_1 \not\sim B$ and $\forall B_1 \subset B$: $B_1 \not\sim A$.
In the first case (Cantor-Bernstein) $m(A) = m(B)$, in the second we write $m(A) < m(B)$, in the third $m(A) > m(B)$. The most interesting case is the fourth. Without any further axioms it can happen or not. The assumption that the fourth case does not happen, i.e. that we can compare any two sets (Trichotomy), is equivalent to the Axiom of Choice that we will study later.

For a set $M$ we write $\mathcal{P}(M)$ for its power set, i.e. the set of all subsets of $M$ (including the empty set!). We note that for a finite set $M$ with $n$ elements its power set has exactly $2^n$ elements. The next theorem shows that there is no "largest set".

Theorem 8.1 (Cantor). $m(M) < m(\mathcal{P}(M))$.

Proof.
• $M \ni x \mapsto \{x\} \in \mathcal{P}(M)$ is a bijection onto its image. So $M$ and $\mathcal{P}(M)$ are comparable.
• Assume there is a bijection $x \mapsto f(x) = M_x \in \mathcal{P}(M)$.
• Consider the set $X := \{x \in M : x \notin M_x\} \subset M$, i.e. $X \in \mathcal{P}(M)$.
• As in Russell's paradox, $X \neq f(y)$ for all $y \in M$. If it were, then $y$ could be neither in $X$ nor in its complement!

Remark 8.2. We summarize:
• There is no "largest" cardinal.
• The collection of all cardinal numbers is a proper class!
• Notation: For a finite set $A$ with $n$ elements we write $m(A) = n$.
• Notation: $m(\mathcal{P}(M)) = 2^{m(M)}$, in analogy to finite sets.
• $2^{\aleph_0} = c$, where $c = m(\mathbb{R})$ is the power of the continuum.
• $m(\{0, 1\}^{\mathbb{N}}) = c$.
• $m(\mathbb{N}^{\mathbb{N}}) = c$. For the last two statements consider the equivalence to the real numbers.
• The continuum hypothesis states that there is no cardinal number $m$ such that $\aleph_0 < m < c$.
• One can prove that $m(\text{Borel sets on } \mathbb{R}) = c$. All subsets of $C_{1/3}$ have outer Lebesgue measure 0 and hence are Lebesgue measurable; since $m(C_{1/3}) = c$ there are $2^c$ of them. So $m(\text{Lebesgue measurable sets in } \mathbb{R}) > c = m(\text{Borel sets in } \mathbb{R})$.

9. Cardinal arithmetics

Let $n, k, l$ be cardinals. By the very definition we can choose sets $B, C, D$, pairwise disjoint (no two of them have common elements), such that $n = m(B)$, $k = m(C)$, $l = m(D)$. We define
• $l + n = m(D \cup B)$,
• $l \cdot n = m(D \times B)$,
• $l^n = m(D^B)$.
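For finite sets the three definitions above reduce to ordinary arithmetic, which can be checked by brute force. The following is only an illustrative sketch: the concrete sets `D` and `B` and the helper `all_functions` are assumptions made for this example and are not part of the notes.

```python
from itertools import product

def all_functions(domain, codomain):
    """Enumerate every function domain -> codomain as a dict (hypothetical helper)."""
    domain = sorted(domain)
    codomain = sorted(codomain)
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]

# Two disjoint representatives: l = m(D) = 3 and n = m(B) = 2 (assumed example data).
D = {"d1", "d2", "d3"}
B = {"b1", "b2"}
assert D.isdisjoint(B)  # disjointness is needed for the sum only

sum_card = len(D | B)                  # l + n = m(D ∪ B)
prod_card = len(set(product(D, B)))    # l · n = m(D × B)
pow_card = len(all_functions(B, D))    # l^n   = m(D^B), functions from B into D

print(sum_card, prod_card, pow_card)   # 5 6 9
```

The exponent counts functions from `B` into `D`, matching the convention that $A^B$ is the set of all functions from $B$ into $A$.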
These definitions are well-posed, since for equivalent sets $B', C', D'$ one has $m(D \cup B) = m(D' \cup B')$, $m(D \times B) = m(D' \times B')$ and $m(D^B) = m(D'^{B'})$, because the primed sets come from "renaming" the elements via the corresponding bijections. Only for addition do we need the assumption that the involved sets are disjoint! We have the following rules:
• $n + l = l + n$ since $B \cup D = D \cup B$.
• $n \cdot l = l \cdot n$ since $B \times D \sim D \times B$.
• $n \cdot (k + l) = n \cdot k + n \cdot l$ since $B \times (C \cup D) \sim (B \times C) \cup (B \times D)$.

Lemma 9.1. The notation $m(\mathcal{P}(A)) = 2^{m(A)}$ is justified by the computation rules. Moreover,
$$m(\mathcal{P}(\mathcal{P}(A))) = 2^{2^{m(A)}}, \qquad m(\mathcal{P}(\mathcal{P}(\mathcal{P}(A)))) = 2^{2^{2^{m(A)}}},$$
etc.

Proof. Let $B = \{0, 1\}$. Then $B$ has cardinality 2, and the set $B^A$ has cardinality $2^{m(A)}$. By definition the set $B^A$ consists of all functions $f : A \to \{0, 1\}$, i.e. for $a \in A$, $f(a) = 0$ or $f(a) = 1$. Setting $A_f = \{a \in A : f(a) = 1\}$ we have that $f$ is the characteristic (indicator) function of the subset $A_f \in \mathcal{P}(A)$, and vice versa. So we obtain a 1-to-1 correspondence between the subsets of $A$ and the elements of $B^A$.

For the proofs of the following statements we assume that the chosen representatives of the cardinals are pairwise disjoint.

Lemma 9.2. $m^{n_1 + n_2} = m^{n_1} \cdot m^{n_2}$.

Proof. Follows since we can biject the functions $f : N_1 \cup N_2 \to M$ with the ordered pairs of functions $(f_1 = f|_{N_1}, f_2 = f|_{N_2})$.

Lemma 9.3. $(m^n)^k = m^{n \cdot k}$.

Proof. $(m^n)^k$ is the cardinality of the set of functions from $K$ to the set of functions from $N$ to $M$, i.e. $f(k) = g_k$ with $g_k(n) = m_{k,n} \in M$, $k \in K$, $n \in N$. On the other hand, $m^{n \cdot k}$ is the cardinality of the set of functions $h : N \times K \ni (n, k) \mapsto m \in M$. We bijectively set $f \sim h$ iff $g_k(n) = h(n, k)$ for all $k \in K$, $n \in N$.

Lemma 9.4. For $m \le n$ and $k \le l$ we have $m + k \le n + l$ and $m \cdot k \le n \cdot l$.

Proof. Obviously, $m(M \cup K) \le m(N \cup L)$ and $m(M \times K) \le m(N \times L)$.

Lemma 9.5. $(m_1 \cdot m_2)^n = m_1^n \cdot m_2^n$.

Proof.
Follows since we can biject the functions $f : N \to M_1 \times M_2$ given by $f(n) = (m_1, m_2)$ with the elements of $M_1^N \times M_2^N$, i.e. ordered pairs of functions given by $(f_1(n) = m_1, f_2(n) = m_2)$.

Lemma 9.6. If $\mathfrak{n} \ge \aleph_0$ is a cardinal and $n$ is finite, then $\mathfrak{n} = \mathfrak{n} + n = \mathfrak{n} + \aleph_0$.

Proof. By the assumption we find sets $N$, $N_1$ with $m(N) = \mathfrak{n}$, $N = N_1 \cup \mathbb{N}$ and $N_1 \cap \mathbb{N} = \emptyset$ (note that $\mathfrak{n} \ge \aleph_0$ means that there are representatives of $\mathfrak{n}$ that contain the natural numbers, since the two cardinals are comparable). We define bijections between $N$ and $N \cup \{1, 2, \dots, n\}$ and $N \cup \mathbb{N}$, respectively (the unions being understood as disjoint unions), by taking the identity on $N_1$ and a bijective map from $\mathbb{N}$ to $\mathbb{N} \cup \{1, 2, \dots, n\}$ or from $\mathbb{N}$ to $\mathbb{N} \cup \mathbb{N}$, respectively. Those bijections have been considered before (sometimes called "Hilbert's hotel").

Lemma 9.7. $\aleph_0^2 = \aleph_0$.

Proof. The statement can be interpreted as $m(\mathbb{N} \times \mathbb{N}) = m(\mathbb{N})$, i.e. we have to biject (ordered) pairs of natural numbers to natural numbers. An effective way is to choose $\mathbb{N} \times \mathbb{N} \ni (m, n) \mapsto 2^m (2n + 1) \in \mathbb{N}$.

Lemma 9.8. For $n \ge \aleph_0$ and $k \le 2^n$ we have $2^n + k = 2^n$.

Proof. Since $n \ge \aleph_0$ we have $n + 1 = n$ and
$$2^n \le 2^n + k \le 2^n + 2^n = 2 \cdot 2^n = 2^{n+1} = 2^n.$$

10. CH and GCH, the continuum hypothesis and the generalized continuum hypothesis

CH: There is no cardinal $n$ such that $\aleph_0 = m(\mathbb{N}) < n < c = 2^{\aleph_0} = m(\mathbb{R})$.

GCH: For any infinite cardinal number $m$ there is no other cardinal $n$ such that $m < n < 2^m$.

11. Well-ordering

Definition 11.1. A (partial) ordering $a \le b$ is a relation, i.e. a subset of $M \times M$, with:
• $a \le a$ (reflexivity)
• $a \le b$, $b \le c \implies a \le c$ (transitivity)
• $a \le b$, $b \le a \implies a = b$ (antisymmetry)

Example 11.1. $\mathbb{N}$, $\mathbb{R}$, $\mathcal{P}(M)$ with the subset relation, $\dots$

Definition 11.2. A set $M$ is totally ordered if for any distinct $a, b \in M$ either $a < b$ or $b < a$.

Example 11.2. $\mathbb{N}$, $\mathbb{R}$, but not $\mathcal{P}(M)$.

Definition 11.3. A well-ordering of a set $M$ is a total ordering such that any non-empty subset $M_1 \subset M$, i.e. $M_1 \in \mathcal{P}(M)$, has a least element, i.e. $\exists x \in M_1$ such that for all $a \in M_1$ we have $x \le a$.

Example 11.3. $\mathbb{N}$, but not $\mathbb{R}$.

Definition 11.4.
A subset $M_1 \subset M$ of a (partially) ordered set $M$ is called a chain if it is totally ordered, i.e. $\forall a, b \in M_1$ either $a \le b$ or $b \le a$.

Definition 11.5. An element $a \in M$ of a (partially) ordered set $M$ is said to be an upper bound for a subset $M_1 \subset M$ if $x \le a$ for all $x \in M_1$. A lower bound is defined analogously. A subset with an upper/lower bound is said to be bounded from above/below. An element $a \in M$ of a (partially) ordered set $M$ is maximal if $a \le x \implies a = x$.

Definition 11.6. An element $a \in M_1 \subset M$ of a subset of a (partially) ordered set is said to be comparable in $M_1$ if for all $x \in M_1$ either $a \le x$ or $x \le a$.

Remark 11.1. A subset $M_1 \subset M$ of a (partially) ordered set $M$ is a chain iff each of its elements is comparable in $M_1$.

12. The Axiom of Choice and related statements

AC: For any collection $\{A_\alpha\}_{\alpha \in I}$ of non-empty sets ($A_\alpha$, $I$ are sets) there is a function $f : \{A_\alpha\}_{\alpha \in I} \to \bigcup_\alpha A_\alpha$ such that for any $\alpha$ we have $f(A_\alpha) \in A_\alpha$.

Remark 12.1. In contrast to ZF, this axiom allows one to build sets without specifying the elements. One can choose one shoe from an infinite collection of pairs of shoes (always choose the left!), but one needs AC to choose from pairs of socks. Since any non-empty set contains an element, AC is not needed for a finite collection of sets! AC is not harmless at all, but it has many convenient applications. Sometimes one uses AC even when it is not needed (definition of a differentiable structure). One should always be aware when one uses AC!

12.1. Statements equivalent to AC.

The AC is equivalent to the following statements, each of which has its advantages in different applications.

Theorem 12.1 (Maximal chain theorem of Hausdorff). In any partially ordered set, every totally ordered subset (chain) is contained in a maximal totally ordered subset (chain).

AC implies the Maximal chain theorem of Hausdorff.
• First change $\le$ to $\subset$ by defining $M \supset M_x := \{y \in M : y \le x\}$.
• Let $X$ be a non-empty collection (it contains $\emptyset$) of subsets of $M$ with the properties.
Every subset of a set in $X$ belongs to $X$, and the union of each chain in $X$ belongs to $X$.
• Let $f : \mathcal{P}(M) \setminus \{\emptyset\} \to M$ be a choice function for $M$ (AC!). For each $A \in X$ let $A^* := \{x \in M : A \cup \{x\} \in X\}$. Define $g : X \to X$ by $g(A) = A$ if $A^* \setminus A = \emptyset$, and $g(A) = A \cup \{f(A^* \setminus A)\}$ otherwise. Then $g(A)$ contains at most one element more than $A$.
• We want to prove that $g(A) = A$ for some $A \in X$. That will imply the theorem.
• We say a subcollection $J$ of $X$ is a tower if
– $\emptyset \in J$.
– If $A \in J$ then $g(A) \in J$.
– If $C$ is a chain in $J$ then $\bigcup_{A \in C} A \in J$.
• The intersection of towers is a tower. Let $J_0$ be the intersection of all towers, i.e. the smallest one. We are going to prove that it is a chain.
• Let $C$ be comparable in $J_0$.
• Assume $A \subsetneq C$; then $g(A) \subset C$. Otherwise $A \subsetneq C \subsetneq g(A)$, contradicting that $g(A)$ has at most one more element than $A$.
• Let $U$ be the collection of sets $A \in J_0$ such that $A \subset C$ or $g(C) \subset A$. We want to show that $U$ is a tower.
– $\emptyset \in U$.
– If $A \subsetneq C$ then (previously) $g(A) \subset C$, i.e. $g(A) \in U$.
– If $A = C$ then $g(C) = g(A)$ (i.e. $g(C) \subset g(A)$) and $g(A) \in U$.
– If $g(C) \subset A$ then $g(C) \subset A \subset g(A)$ and $g(A) \in U$.
– The union over a chain of elements of $U$ is, by the definition of $U$, contained in $U$.
• $U = J_0$.
• If $C$ is comparable then so is $g(C)$ by the previous considerations: if $A \in J_0 = U$ then either $A \subset C \subset g(C)$ or $g(C) \subset A$.
• $g$ maps comparable sets to comparable sets. The union of comparable sets over a chain is comparable. That implies that the comparable sets constitute a tower, and hence $J_0$ consists of comparable sets only, i.e. $J_0$ is a chain itself.
• Since $J_0$ is a chain and a tower, the union $A$ over all elements of $J_0$ is in $J_0$. Therefore $g(A) \subset A$, since $A$ includes all sets in $J_0$. On the other hand $A \subset g(A)$. Therefore $A = g(A)$.

Theorem 12.2 (Zorn's Lemma). Every non-empty partially ordered set in which every chain (i.e., totally ordered subset) has an upper bound contains at least one maximal element.

Maximal chain theorem of Hausdorff implies Zorn's Lemma.
Take a maximal chain $C$ in the set and an upper bound $a$ for it. Then $a \in C$ and $a$ is a maximal element. Otherwise there is an element $b$ in the set such that $a < b \notin C$, and the chain $C$ is not maximal.

Theorem 12.3 (Zermelo's Well-Ordering Principle (WOP)). Every set can be well-ordered.

Zorn's lemma implies WOP.
• If $A$, $B$ are two well-ordered sets then $A$ is said to be a continuation of $B$ if $B \subset A$ and the subset inclusion preserves the order in $B$, $A$.
• If a collection of well-ordered sets forms a chain $C$ with respect to continuation, then there is a unique well-ordering of $U$, the union of the sets in the chain, that is a continuation of the well-ordering of all sets in $C$. This well-order is defined in the following way: take $a, b \in U$. Then there are $a \in A \in C$, $b \in B \in C$. By the continuation property either $A = B$ or one is the continuation of the other. That defines the order between $a$ and $b$. It is clearly a well-ordering.
• Consider $\mathcal{U}$, the collection of well-ordered subsets of $M$, i.e. subsets together with a (chosen) well-ordering. Then $\mathcal{U}$ is partially ordered by continuation. This collection contains the empty set. If $C$ is a chain in $\mathcal{U}$ then the union of the sets in $C$ is an upper bound. Hence, there is a maximal well-ordered set $M'$ in $\mathcal{U}$. This set must be equal to $M$, since otherwise we could add a "missing" element $x \in M \setminus M'$ to the chain by setting $x > y$ for all $y \in M'$.

WOP implies AC. Consider $U = \bigcup_{\alpha \in I} A_\alpha$ and choose (no AC needed for this!) a well-ordering on the set $U$. Since $A_\alpha \subset U$ we can define $f(A_\alpha)$ as the minimal element of $A_\alpha \subset U$. This is a choice function.

Theorem 12.4 (Trichotomy of Cardinals). If two sets are given, then either they have the same cardinality, or one has a smaller cardinality than the other.

Proof. This proof comes in section 14.

Theorem 12.5 (Tychonov's theorem). Let $\{K_\alpha\}_{\alpha \in I}$, $I$ any index set, be non-empty compact spaces. Then the product $\prod_{\alpha \in I} K_\alpha$ endowed with the product topology is a compact non-empty space.

Proof.
The proof is postponed until section 17.

Remark 12.2. If the spaces $K_\alpha$ are assumed to be Hausdorff, Tychonov's theorem follows from the Ultrafilter Lemma without using AC, as can be seen from the forthcoming proof.

There are other statements commonly used in analysis, topology, combinatorics, $\dots$ that are equivalent to AC. We will list them without proofs.

Theorem 12.6. Every vector space over any field $F$ has a basis. In particular (not equivalent to AC!), $\mathbb{R}$ has a basis (Hamel basis) as a vector space over $\mathbb{Q}$. The latter statement will be proved in section 17.

Remark 12.3. This theorem does not hold if one considers modules instead of vector spaces: $\prod_{n=1}^{\infty} \mathbb{Z}$ is not a free $\mathbb{Z}$-module, i.e. it has no basis as a module over $\mathbb{Z}$.

Theorem 12.7. Every (non-trivial) ring with a unit has a maximal ideal.

Theorem 12.8. The closed unit ball of the dual of a normed vector space over $\mathbb{R}$ has an extreme point.

Theorem 12.9. Any connected graph has a spanning tree.

12.2. Statements implied by AC but strictly weaker.

The following are some important statements that are not implied by ZF but can be proven in ZF+AC and do not imply AC.
• Any union of countably many countable sets is countable. (Compare with the statement in section 6! It is equivalent to $\omega$AC.)
• An infinite set is Dedekind-infinite (equivalent to $\omega$AC).
• The existence of (Lebesgue-) non-measurable subsets of $\mathbb{R}$ (follows from the Ultrafilter Lemma, see section 15).
• The Banach-Tarski paradox.
• Every subgroup of a free group is free.
• The additive groups of $\mathbb{R}$ and $\mathbb{C}$ are isomorphic.
• The Hahn-Banach theorem (follows from the Ultrafilter Lemma): Any (semi-)norm bounded linear functional on a linear subspace $U$ of a (semi-)normed vector space $V$ over $\mathbb{R}$ or $\mathbb{C}$ has a linear extension to all of $V$.
• The Ultrafilter Lemma (see section 15).
• Every set can be totally ordered (follows from the Ultrafilter Lemma but is strictly weaker!).
• The Alaoglu theorem (equivalent to the Ultrafilter Lemma): The closed unit ball of the dual of a normed vector space is compact with respect to the weak* topology.
• The Baire Category Theorem (equivalent to the Axiom of dependent choices): A non-empty complete metric space is not the countable union of nowhere-dense closed sets.
• The open mapping and closed graph theorems (rely on the Baire Category Theorem).
• Every infinite-dimensional topological vector space has a discontinuous linear functional.
• Every Tychonov space has a Stone-Čech compactification (relies on the Ultrafilter Lemma).

Another interesting statement is the following.

Theorem 12.10 (Ulam). Under CH (or weaker) there is no finite, $\sigma$-additive, non-atomic measure such that any subset of $\mathbb{R}$ is measurable.

There are also stronger axioms that imply AC. The most familiar one is the generalized continuum hypothesis (GCH). On the other hand, CH is independent of ZFC (Cohen).

13. Ordinal numbers

Lemma 13.1. The collection of all ordinals is a proper class.

Lemma 13.2. Given any sequence of infinite ordinals strictly less than a given ordinal $\beta \ge \omega_1$, they are all prefixes of an ordinal $\beta' < \beta$.

This proves that $[0, \omega_1)$ is not compact but sequentially compact.

14. Cardinals and alephs

Definition 14.1. The cardinal number of a well-ordered set is called an Aleph: $\aleph$.

Definition 14.2. An ordinal number $\omega_a$ is an initial ordinal for an $\aleph$ if $\aleph = m(\omega_a)$ and for every ordinal $\omega$ with $m(\omega) \ge \aleph$ it follows that $\omega \ge \omega_a$.

Lemma 14.1. The $\aleph$'s inherit a well-ordering from their initial ordinals.

Lemma 14.2 (Hartogs, Sierpinski). The following statement holds without the use of AC. For any set $M$ there is a set $f(M) \in \mathcal{P}(\mathcal{P}(\mathcal{P}(M)))$ such that
• $f(M)$ is well-ordered,
• $m(f(M)) \not\le m(M)$,
• $m(f(M)) \le 2^{2^{2^{m(M)}}}$.

Proof.
• Let $\mathcal{M} \in \mathcal{P}(\mathcal{P}(M))$ be the subsets of the subsets of $M$ that are well-ordered with respect to $\subset$.
• Let $\Phi \in \mathcal{P}(\mathcal{P}(\mathcal{P}(M)))$ be the set of their equivalence classes with respect to their order type.
$\Phi$ is well-ordered itself (by the magnitude of the order types of the equivalence classes) and we set $f(M) = \Phi$.
• By construction: $f(M)$ is well-ordered and $m(f(M)) \le 2^{2^{2^{m(M)}}}$.
• Assume $m(f(M)) \le m(M)$. Then $\Phi \sim M_1 \subset M$, and that would give a well-ordering to $M_1$ of the same type $\omega_\Phi$ as $\Phi$. Now the collection $\bigcup_{\alpha < \omega_\Phi} (M_1)_\alpha$ of all the initial segments of $M_1$ forms a well-ordered chain in $\mathcal{M}$, and this chain has the same type as $\Phi$. We therefore get
$$\bigcup_{\alpha < \omega_\Phi} (M_1)_\alpha \approx \omega_\Phi \approx \omega(M_1),$$
a contradiction.

Corollary 14.1. From this follows (without AC): There is no injective net in any set (a map $f$ : Ordinal numbers $\to M$ with $f(\alpha) = x_\alpha$, $\alpha$ ranging over all ordinal numbers, and $x_\alpha \neq x_\beta$ if $\alpha \neq \beta$). This gives an alternative proof of Zorn's lemma (use AC to construct an injective net if there is no maximal element). Roughly speaking: "There are more ordinal numbers than any set can have elements."

Lemma 14.3 (Zermelo). Without AC: If $A$ is an infinite well-ordered set then $(m(A))^2 = m(A)$.

Proof. We have seen that $\aleph_0^2 = \aleph_0$. Assume that the statement is wrong. Since the $\aleph$'s are well-ordered there is a smallest one, $\aleph_\alpha$, such that $\aleph_\alpha^2 > \aleph_\alpha$ and $\aleph_\beta^2 = \aleph_\beta$ for all $\beta < \alpha$. Let $\omega_\alpha$ be its initial ordinal. Let
$$P := \{(\mu, \nu) : \mu < \omega_\alpha,\ \nu < \omega_\alpha\},$$
and for $\lambda < \omega_\alpha$
$$P_\lambda := \{(\mu, \nu) : \mu + \nu = \lambda\}.$$
• $P = \bigcup_{\lambda < \omega_\alpha} P_\lambda$: We need to show that $\mu < \omega_\alpha$, $\nu < \omega_\alpha \implies \mu + \nu < \omega_\alpha$. But this follows by the definition of $\alpha$: $m(\mu) + m(\nu) \le m(\lambda)^2 = m(\lambda) < \aleph_\alpha$.
• We well-order $P$:
– $\mu < \mu' \implies (\mu, \nu) < (\mu', \nu')$
– $\nu < \nu' \implies (\mu, \nu) < (\mu, \nu')$.
• Obviously $m(P_\lambda) \le m(\{\mu : \mu \le \lambda\})$, since for each $\mu \le \lambda$ there is exactly one order type $\nu$ such that $\mu + \nu = \lambda$. Hence, $m(P_\lambda) \le m(\lambda + 1) = m(\lambda) < \aleph_\alpha$.
• For any given initial segment $P_{(\mu,\nu)}$ of $P$ there are at most $\lambda_0 = \mu + \nu$ "diagonals" $P_{\lambda_0}$ not exceeding $(\mu, \nu)$, with $\lambda_0 < \omega_\alpha$. Therefore
$$m(A)^2 = m(P) \le m(\lambda_0 + 1)^2 = m(\lambda_0)^2 = m(\lambda_0) \le \aleph_\alpha,$$
a contradiction. Thus the order type of $P$ is at most $\omega_\alpha$, but then $\aleph_\alpha^2 \le m(P) = \aleph_\alpha$, contradicting the definition of $\alpha$.

Lemma 14.4. If $n \ge \aleph_0$ then $n + n < 2^n$ (without AC!).
The statement is obviously true for $3 \le n \in \mathbb{N}$.

Proof. Let $n \ge \aleph_0$ and $m(N) = n$. Since $N \ni x \mapsto \{x\} \in \mathcal{P}(N)$ and $N \ni x \mapsto N \setminus \{x\} \in \mathcal{P}(N)$, with $N \setminus \{x\} \neq \{y\}$, are bijections onto their (disjoint) images, we have $n + n \le 2^n$. We also have $n + n = 2 \cdot n \le n^2$. Hence it suffices to show that $n^2 \not\ge m(\mathcal{P}(N))$. Assume there is an injection $g : \mathcal{P}(N) \hookrightarrow N \times N$. Let $Y \subset N$ be an infinite subset with a well-ordering $\prec_R$.
• By Lemma 14.3 there is a bijection $h = h_{(Y, \prec_R)} : Y \to Y \times Y$.
• Let $A = A_{(Y, \prec_R)} := \{y \in Y : g^{-1}(h(y)) \text{ is defined and } y \notin g^{-1}(h(y))\}$. If $g(A) \in Y \times Y$ then there is a $y_0 \in Y$ such that $h(y_0) = g(A)$. But then the question "$y_0 \in A$ or $y_0 \notin A$?" leads to a contradiction (diagonal argument as by Russell or Cantor). Hence,
• $g(A) \notin Y \times Y$.
• We define a map $F$ on infinite well-ordered subsets of $N$ with
$$F((Y, \prec_R)) \in N \setminus Y \qquad (1)$$
by $F((Y, \prec_R)) = \pi(g(A_{(Y, \prec_R)}))$, where $\pi$ maps an element $(x, z) \in (N \times N) \setminus (Y \times Y)$ to $x$ if $x \notin Y$, and to $z$ otherwise.
• We will start with $N \supsetneq Y_0 \sim \mathbb{N}$. The latter equivalence is possible since $m(N) \ge \aleph_0$. So $Y_0$ is of the form $y_1 < y_2 < \dots < y_n < \cdots$. We put $F((Y_0, \prec_{\mathbb{N}})) = x \in N \setminus Y_0$, and for $n \in \mathbb{N}$ we put $F(y_1 < \dots < y_n) = y_{n+1}$.
• We call a set $Y \subset N$ equipped with a well-ordering $\prec_R$ a g-well-ordered set if for any $y \in Y$
$$F(\{x \in Y : x \prec_R y\}) = y.$$
• $(Y_0, \prec_{\mathbb{N}})$ is a g-well-ordered set.
• If $(Y, \prec_R)$ and $(Z, \prec_S)$ are both g-well-ordered sets then one is an initial segment of the other. For:
– Since we can compare any two well-ordered sets, w.l.o.g. there is an order-preserving injection $i : (Y, \prec_R) \to (Z, \prec_S)$ mapping $Y$ onto an initial segment of $Z$.
– We need to show that $i = \mathrm{id}|_Y$. Assume not, and let $x \in Y$ be the least element such that $i(x) \neq x$. Then
$$\{y \in Y : y \prec_R x\} = \{z \in Z : z \prec_S i(x)\}.$$
But by the g-well-ordering property
$$i(x) = F(\{z \in Z : z \prec_S i(x)\}) = F(\{y \in Y : y \prec_R x\}) = x,$$
a contradiction.
• Let now $W$ be the union of all g-well-ordered sets. By the preceding statement there is a (unique) g-well-ordering $\prec_g$ on $W$.
• $F((W, \prec_g)) \in W$, for otherwise $W \cup \{F(W)\}$ can be equipped with a g-well-ordering $\prec'_g$ by setting $\prec'_g = \prec_g$ on $W$ and declaring $F(W)$ larger than any element in $W$ (this is a well-defined g-well-ordering since $F(W) \in N \setminus W$). This contradicts the definition of $W$.
• On the other hand, by (1) we should have $F((W, \prec_g)) \in N \setminus W$, a contradiction.

Theorem 14.1. Trichotomy for cardinals is equivalent to the axiom of choice.

Proof.
• By the WOP we can well-order any two sets $A$, $B$. Then one is order-equivalent to an initial segment of the other, and the order-equivalence gives a bijection of one set onto a subset of the other.
• If one can compare a set $A$ with any other set, one can do so, in particular, with any well-ordered set. By Lemma 14.2, $A$ is comparable with $f(A)$ and the possibility $m(A) \ge m(f(A))$ is excluded. Hence, $m(A) < m(f(A))$ and we inherit on $A$ a well-ordering from $f(A)$.

Theorem 14.2. The statement $n^2 = n$ for all infinite cardinals $n$ is equivalent to the axiom of choice.

Proof. The AC is equivalent to the statement that any cardinal number is an $\aleph$. Thus Lemma 14.3 shows AC $\implies n^2 = n$ for all infinite cardinals $n$. Conversely, let $n^2 = n$, $k^2 = k$, $(n + k)^2 = n + k$. Hence,
$$n + k = n^2 + 2 n \cdot k + k^2 = n + 2 n \cdot k + k \ge n \cdot k,$$
and, writing $n = n_1 + 1$ and $k = k_1 + 1$,
$$n \cdot k = (n_1 + 1) \cdot (k_1 + 1) = n_1 \cdot k_1 + n_1 + k_1 + 1 \ge 1 + n_1 + k_1 + 1 \ge n + k.$$
Therefore $n + k = n \cdot k$ and there are (pairwise disjoint) sets $N_1, N_2, K_1, K_2$, $N_1 \sim N_2$, $K_1 \sim K_2$, $m(N_i) = n$, $m(K_i) = k$, such that $N_1 \times K_1 \sim N_2 \cup K_2$. We will choose $K_1 = f(N_1)$ from Lemma 14.2. There are 2 possibilities:
• $\exists n_1 \in N_1$ such that $\forall k_1 \in K_1$ we have $(n_1, k_1) \mapsto n_2 \in N_2$. Then, $k = m(K_1) \le m(N_2) = n$. This possibility is excluded since $k = m(f(N_1)) \not\le m(N_1) = n$.
• $\forall n_1 \in N_1$ $\exists k_1 \in K_1$ such that $(n_1, k_1) \mapsto k_2 \in K_2$. For fixed $n_1 \in N_1$ we choose $g(n_1)$ to be the smallest $k \in K_1$ (note that $K_1$ is well-ordered) such that $(n_1, k) \mapsto k_2 \in K_2$. We set $M = \{(n_1, g(n_1))\}$ and get
$$n = m(N_1) = m(M) \le m(K_2) = k = m(K_1).$$
The latter means that $N_1$ is equivalent to an initial segment of $f(N_1) = K_1$ and hence well-ordered.

Lemma 14.5. The generalized continuum hypothesis implies: If $k + n = 2^n$, $n \ge \aleph_0$, then $k = 2^n$.

Proof. We have $k \le 2^n$ and $n \le n + n \le 2^n + 2^n = 2^{n+1} = 2^n$. But $n + n < 2^n$, and by GCH $n + n = n$. Now $2^n \cdot 2^n = 2^{n+n} = 2^n = n + k$. Hence, there are sets $N$, $K$ and a bijection $h : \mathcal{P}(N) \times \mathcal{P}(N) \to N \cup K$, with $m(N) = n$, $m(K) = k$ and $N \cap K = \emptyset$. Since $n < 2^n$ there is a subset $N_0 \subset N$ such that for all $M \subset N$ we have $(N_0, M) \notin h^{-1}(N)$. Therefore $h|_{(N_0, \mathcal{P}(N))} \hookrightarrow K$ is an injection and $k = m(K) \ge m(\mathcal{P}(N)) = 2^n$.

Theorem 14.3. The generalized continuum hypothesis implies the axiom of choice.

Proof. Let $n$ be an arbitrary non-finite cardinal. We need to prove that under GCH $n$ is an Aleph, i.e. if $m(N) = n$ then $N$ can be well-ordered (for finite $n$ there is nothing to prove!). Since $n$ is not finite, the cardinal $k = n + \aleph_0 \ge \aleph_0$ is non-finite too. If $k = m(K)$ and $K$ can be well-ordered, so can $N$ itself, since it corresponds to a segment of $K$ (i.e. $k \ge n$). We have
$$\aleph_0 \le k < 2^k < 2^{2^k} < 2^{2^{2^k}}.$$
Also
$$k + 2^k = 2^k, \qquad 2^k + 2^{2^k} = 2^{2^k}, \qquad 2^{2^k} + 2^{2^{2^k}} = 2^{2^{2^k}}.$$
By Lemma 14.2, $m(f(K)) \le 2^{2^{2^k}}$ and therefore
$$m(f(K)) + 2^{2^k} \le 2^{2^{2^k}} + 2^{2^k} = 2^{2^{2^k}}.$$
If $m(f(K)) + 2^{2^k} = 2^{2^{2^k}}$ then by Lemma 14.5:
$$m(f(K)) = 2^{2^{2^k}} > k \ge n.$$
Since $f(K)$ is well-ordered this induces a well-ordering on $n$.

Suppose $2^{2^k} \le m(f(K)) + 2^{2^k} < 2^{2^{2^k}}$; then GCH implies
$$2^{2^k} = m(f(K)) + 2^{2^k}, \qquad 2^{2^k} \ge m(f(K)).$$
Hence,
$$m(f(K)) + 2^k \le 2^{2^k} + 2^k = 2^{2^k}.$$
Again: If $m(f(K)) + 2^k = 2^{2^k}$ then by Lemma 14.5:
$$m(f(K)) = 2^{2^k} > k \ge n.$$
Again this induces a well-ordering on $n$.

Suppose $2^k \le m(f(K)) + 2^k < 2^{2^k}$; then GCH implies
$$2^k = m(f(K)) + 2^k, \qquad 2^k \ge m(f(K)).$$
Hence,
$$m(f(K)) + k \le 2^k + k = 2^k.$$
Again: If $m(f(K)) + k = 2^k$ then by Lemma 14.5:
$$m(f(K)) = 2^k > k \ge n.$$
Again this induces a well-ordering on $n$.

Finally we should have by GCH: $m(f(K)) + k = k$, thus $m(f(K)) \le k$, a contradiction.

Proof of Ulam's theorem 12.10.
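Let $\mu$ be a finite, non-atomic, $\sigma$-additive measure on $\mathbb{R}$. Since under CH and AC $c = m(\mathbb{R}) = \aleph_1$ is the first uncountable Aleph, there is a well-ordering $\prec$ on $\mathbb{R}$ such that for each $y \in \mathbb{R}$ the set
$$A_y := \{x \in \mathbb{R} : x \prec y\}$$
is at most countable. Choose 1-1 mappings (AC) $f(\cdot, y) : A_y \to N_y \subset \mathbb{N}$. So $f$ is an integer-valued function defined for all pairs $(x, y)$ with $y \in \mathbb{R}$, $x \prec y$. By definition
$$x \prec x' \prec y \implies f(x, y) \neq f(x', y).$$
For $x \in \mathbb{R}$, $n \in \mathbb{N}$ we define
$$F_x^n := \{y \in \mathbb{R} : x \prec y \text{ and } f(x, y) = n\}.$$
We form a table
$$\begin{array}{ccccc}
F_{x_1}^1 & F_{x_2}^1 & \cdots & F_x^1 & \cdots \\
F_{x_1}^2 & F_{x_2}^2 & \cdots & F_x^2 & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
F_{x_1}^n & F_{x_2}^n & \cdots & F_x^n & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots
\end{array}$$
with $\aleph_0$ rows and $\aleph_1$ columns. It has the following properties.
• The sets in a row are pairwise disjoint. Otherwise suppose $y \in F_x^n \cap F_{x'}^n$. Then $x \prec y$, $x' \prec y$ and $f(x, y) = f(x', y) = n$. Since $f(\cdot, y)$ is a bijection it follows that $x = x'$.
• $\bigcup_{n \in \mathbb{N}} F_x^n = \mathbb{R} \setminus X_x$ where $m(X_x) \le \aleph_0$, i.e. the union of each column fills $\mathbb{R}$ up to a countable set. For if $y \succ x$ then $y \in F_x^n$ for some $n$, namely the one for which $f(x, y) = n$. Hence
$$\mathbb{R} \setminus \bigcup_{n \in \mathbb{N}} F_x^n = \{y \in \mathbb{R} : y \preceq x\}.$$
The set on the right-hand side is countable by the definition of $\prec$.
Since $\mu(\mathbb{R}) < \infty$, in each row
$$m(\{x \in \mathbb{R} : \mu(F_x^n) > 0\}) \le \aleph_0.$$
Therefore there are at most countably many sets of positive measure in the entire table. Because there are $\aleph_1$ columns there is an $x_0 \in \mathbb{R}$ such that $\mu(F_{x_0}^n) = 0$ for all $n \in \mathbb{N}$. Then by $\sigma$-additivity
$$0 = \mu\Big(\bigcup_n F_{x_0}^n\Big) = \mu(\mathbb{R} \setminus X_{x_0}) = \mu(\mathbb{R}).$$
The last equality is due to the assumption that $\mu$ is non-atomic and $m(X_{x_0}) \le \aleph_0$.

15. Ultrafilter

Definition 15.1 (Filter). A filter $\mathcal{F}$ on a set $M$ is an element of $\mathcal{P}(\mathcal{P}(M))$ such that for all $N, K \in \mathcal{P}(M)$
• $M \in \mathcal{F}$, but $\emptyset \notin \mathcal{F}$.
• $N \in \mathcal{F}$, $N \subset K \implies K \in \mathcal{F}$.
• $K \in \mathcal{F}$, $N \in \mathcal{F} \implies N \cap K \in \mathcal{F}$.

We can introduce a partial order on the filters of a set $M$.

Definition 15.2. We say that the filter $\mathcal{F}'$ is finer than the filter $\mathcal{F}$ if $N \in \mathcal{F}$ implies $N \in \mathcal{F}'$. We will write $\mathcal{F} \subset \mathcal{F}'$. This gives a partial order on the set of all filters on a (fixed) set.

Definition 15.3.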
A non-empty family $(A_\alpha)_{\alpha \in I}$ of subsets $\emptyset \ne A_\alpha \in \mathcal{P}(M)$ is said to have the finite intersection property if for any finite subset $\{\alpha_1, \alpha_2, \dots, \alpha_n\} \in \mathcal{P}(I)$, $n \in \mathbb{N}$, the intersection $\bigcap_{k=1}^n A_{\alpha_k} \ne \emptyset$.

Definition 15.4. A non-empty family $(A_\alpha)_{\alpha \in I}$ of subsets $\emptyset \ne A_\alpha \in \mathcal{P}(M)$ is said to have the centered finite intersection property if the following holds: whenever $\bigcap_{\alpha \in J} A_\alpha = \emptyset$ for some subset $J \subset I$, there is a finite subset $\{\alpha_1, \alpha_2, \dots, \alpha_n\} \in \mathcal{P}(J)$, $n \in \mathbb{N}$, such that already the finite intersection $\bigcap_{k=1}^n A_{\alpha_k} = \emptyset$.

Lemma 15.1. Let $\mathcal{M} \in \mathcal{P}(\mathcal{P}(M))$ have the finite intersection property. Then
\[ \mathcal{F} := \Big\{ N \subset M : \exists M_1, M_2, \dots, M_n \in \mathcal{M} \text{ such that } \bigcap_{k=1}^n M_k \subset N \Big\} \]
is a filter on $M$. This is the smallest filter containing $\mathcal{M}$, and $\mathcal{M}$ is called a filter basis.

Proof. Since $\emptyset \ne \bigcap_{k=1}^n M_k \cap \bigcap_{k=1}^{n'} M_k' \subset N \cap N' \subset M$ the axioms of a filter are fulfilled. Any filter containing $\mathcal{M}$ must also contain any finite intersection of elements of $\mathcal{M}$ and hence also all the supersets of these intersections.

Example 15.1.
• The principal filter $p_x$ of an element $x \in M$ is the family $p_x := \{N \subset M : x \in N\}$.
• In a topological space $X$ the system of all neighborhoods of a point $x$ is a filter, called the neighborhood filter $V_x := \{Y \subset X : x \in U(x) \subset Y,\ U(x) \text{ open}\}$.
• Let $M$ be an infinite set. The cofinite filter $\mathcal{F}^M_{\mathrm{cofin}}$ is defined as
\[ \mathcal{F}^M_{\mathrm{cofin}} := \{N \subset M : \emptyset \ne N,\ m(M \setminus N) < \infty\}. \]

Definition 15.5 (Ultrafilter). A filter $\mathcal{F}$ on $M$ is called an ultrafilter if for any set $N \in \mathcal{P}(M)$ either $N \in \mathcal{F}$ or $(M \setminus N) \in \mathcal{F}$.

Remark 15.1.
• Any principal filter $p_x$ is an ultrafilter since $x \in N$, $x \in K \implies x \in N \cap K$.
• Conversely, if for a filter $\mathcal{F}$ we have $\{x\} \in \mathcal{F}$ then $\mathcal{F} = p_x$ because it must contain all supersets of $\{x\}$ and cannot contain their complements.
• An ultrafilter that is not principal (if it exists!) is called free.
• $\mathcal{F}^M_{\mathrm{cofin}}$ is an ultrafilter if and only if $M$ cannot be partitioned into two disjoint infinite subsets, i.e. if $M \sim M_1 \cup M_2$, $M_1 \cap M_2 = \emptyset$ and $\forall n \in \mathbb{N}$, $m(M_i) > n$, then $\mathcal{F}^M_{\mathrm{cofin}}$ is not an ultrafilter.
The latter is true since otherwise exactly one of the $M_i$ belongs to $\mathcal{F}^M_{\mathrm{cofin}}$, but neither of them is cofinite. If there is no such partition of $M$ into two infinite subsets then any set is either finite or its complement is finite, i.e. any subset or its complement (if the set is a proper subset) belongs to $\mathcal{F}^M_{\mathrm{cofin}}$.
– In ZF without choice there can be amorphous sets, that is, infinite sets each of whose subsets is either finite or cofinite.
– Amorphous sets cannot be totally ordered, for otherwise the set
\[ A := \{x \in M : m(\{y \in M : y \le x\}) > n \text{ and } m(\{z \in M : x < z\}) > n,\ \forall n \in \mathbb{N}\} \ne \emptyset \]
and $M$ is the disjoint union of $\{y \le x\}$ and $\{x < z\}$.
– The ultrafilter lemma implies the existence of a total order on any set. Hence there are no amorphous sets assuming UFL.
• Note that exactly one of the two sets $N$ and $M \setminus N$ is contained in an ultrafilter since $N \cap (M \setminus N) = \emptyset \notin \mathcal{F}$.
• An ultrafilter $p$ can be interpreted as a finitely additive 0-1 measure on $M$, measuring all subsets. So the family of ultrafilters on a given set $M$ can be interpreted as the family of all finitely additive 0-1 measures on $\mathcal{P}(M)$.
• $N \cup K \in p$, $p$ is an ultrafilter $\implies N \in p$ or $K \in p$.
• $K \subset N$, $N \notin \mathcal{F} \implies K \notin \mathcal{F}$ for any filter $\mathcal{F}$.
• Let $p$ be an ultrafilter. If $M = \bigcup_{k=1}^n N_k$, $N_k \cap N_l = \emptyset$, $k \ne l$, then there is exactly one $1 \le k \le n$ such that $N_k \in p$ and $N_l \notin p$, $l \ne k$.
• Any ultrafilter on a finite set is principal, since a finite set is the finite union of its distinct elements.
• If $p$ is a free ultrafilter on an infinite set $M$ (if it exists) and the subset $N = \{x_1, \dots, x_n\} \in \mathcal{P}(M)$ is a finite set then $N \notin p$. For otherwise
\[ N = \bigcup_{k=1}^n \{x_k\}, \qquad \{x_k\} \cap \{x_l\} = \emptyset,\ k \ne l, \]
and one of the singletons $\{x_k\}$ would belong to $p$, making $p$ principal.
• An ultrafilter $p$ on $M$ is free if and only if it contains the cofinite filter $\mathcal{F}^M_{\mathrm{cofin}}$. For if $p$ is free and $N \in \mathcal{F}^M_{\mathrm{cofin}}$ then $m(M \setminus N) < \infty$, hence $(M \setminus N) \notin p$ and $N \in p$. On the other hand (we may assume that $M$ is an infinite set since otherwise the cofinite filter does not exist) the cofinite set $M \setminus \{x\} \in \mathcal{F}^M_{\mathrm{cofin}}$ is not an element of $p_x$.
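The statement that any ultrafilter on a finite set is principal can be checked by brute force on a small example. The following is only a sketch under stated assumptions: the three-element set `M` and the helper names (`powerset`, `is_filter`, `is_ultrafilter`, `principal_filter`) are chosen here for illustration and do not come from the notes. It enumerates every family of subsets of `M`, keeps those satisfying Definition 15.1 and Definition 15.5, and compares the ultrafilters with the principal filters $p_x$.

```python
from itertools import combinations

def powerset(items):
    """Return all subsets of `items` as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(F, M, subsets):
    """Check the three filter axioms of Definition 15.1 for a family F."""
    if M not in F or frozenset() in F:
        return False
    # Upward closed: N in F and N subset of K implies K in F.
    if any(N <= K and K not in F for N in F for K in subsets):
        return False
    # Closed under pairwise intersections.
    return all((N & K) in F for N in F for K in F)

def is_ultrafilter(F, M, subsets):
    """Definition 15.5: every subset or its complement belongs to F."""
    return is_filter(F, M, subsets) and all(
        N in F or (M - N) in F for N in subsets)

def principal_filter(x, subsets):
    """The principal filter p_x: all subsets containing x."""
    return frozenset(N for N in subsets if x in N)

M = frozenset({0, 1, 2})          # hypothetical example set
subsets = powerset(M)             # the 8 subsets of M
families = powerset(subsets)      # all 2^8 = 256 families of subsets

filters = [F for F in families if is_filter(F, M, subsets)]
ultrafilters = {F for F in families if is_ultrafilter(F, M, subsets)}
principal = {principal_filter(x, subsets) for x in M}

# If N ∪ K lies in an ultrafilter p, then N or K lies in p.
union_property = all(N in p or K in p
                     for p in ultrafilters
                     for N in subsets for K in subsets if (N | K) in p)

print(len(filters), len(ultrafilters))   # 7 3
print(ultrafilters == principal)         # True
print(union_property)                    # True
```

On a finite set every filter consists of all supersets of one non-empty set, which is why exactly seven filters appear here and the three ultrafilters are those generated by singletons. No such enumeration is available for free ultrafilters on infinite sets.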
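The next lemma establishes an alternative definition of an ultrafilter.

Lemma 15.2. A filter $\mathcal{F}$ on $M$ is an ultrafilter if and only if $\mathcal{F}$ is maximal, i.e. there is no filter $\mathcal{F}'$ strictly finer than $\mathcal{F}$.

Proof.
• ($\implies$) Assume $\mathcal{F} \subsetneq \mathcal{F}'$ and $\emptyset \ne N \in \mathcal{F}' \setminus \mathcal{F}$. Since $\mathcal{F}$ is an ultrafilter we have $(M \setminus N) \in \mathcal{F} \subset \mathcal{F}'$, and this implies that $N \in \mathcal{F}'$ and $(M \setminus N) \in \mathcal{F}'$, a contradiction.
• ($\impliedby$) Assume $\mathcal{F}$ is not an ultrafilter but maximal. Hence, there is a set $N$ such that neither $N$ nor $(M \setminus N)$ is contained in $\mathcal{F}$. Assume further there is a $K \in \mathcal{F}$ with $K \subset M \setminus N$. But then $(M \setminus N) \in \mathcal{F}$, which contradicts our assumption. So every $K \in \mathcal{F}$ intersects $N$, i.e. $B \cap N \ne \emptyset$ for all $B \in \mathcal{F}$. So the family $\mathcal{F} \cup \{N\}$ has the finite intersection property and can be extended to a filter containing $N$. So $\mathcal{F}$ cannot be maximal.

The existence of free ultrafilters is not clear a priori. It is impossible to derive it from ZF only. It is impossible to construct a free ultrafilter explicitly even when its existence is established. The next theorem shows that AC implies the existence of free ultrafilters.

Theorem 15.1 (Ultrafilter Lemma I). AC implies: Any filter $\mathcal{F}$ on a set $M$ is a subset of an ultrafilter $p$ on $M$.

Note that the ultrafilter $p$ does not have to be unique!

Proof. Let $\mathcal{C}$ be a chain (with respect to the "finer" ordering) of filters on $M$ containing $\mathcal{F}$. First we observe that $\mathcal{F}_{\mathcal{C}} := \bigcup_{\mathcal{G} \in \mathcal{C}} \{N \in \mathcal{G}\}$ is a filter:
• $\emptyset \notin \mathcal{G}\ \forall \mathcal{G} \in \mathcal{C} \implies \emptyset \notin \mathcal{F}_{\mathcal{C}}$.
• $N \in \mathcal{F}_{\mathcal{C}} \implies \exists \mathcal{G} \in \mathcal{C} : N \in \mathcal{G}$; then $N \subset K$ implies $K \in \mathcal{G}$ and therefore $K \in \mathcal{F}_{\mathcal{C}}$.
• $N, K \in \mathcal{F}_{\mathcal{C}} \implies \exists \mathcal{G} \in \mathcal{C} : N, K \in \mathcal{G}$ (since $\mathcal{C}$ is a chain); then $N \cap K \in \mathcal{G}$ and therefore $N \cap K \in \mathcal{F}_{\mathcal{C}}$.
By Theorem 12.1 (equivalent to AC) $\mathcal{C} \subset \mathcal{C}_{\max}$ where $\mathcal{C}_{\max}$ is a maximal chain. Therefore $\mathcal{F}_{\max} := \bigcup_{\mathcal{G} \in \mathcal{C}_{\max}} \{N \in \mathcal{G}\}$ is itself a filter, finer than any other in $\mathcal{C}_{\max}$ by maximality. So by Lemma 15.2 it is an ultrafilter containing $\mathcal{F}$.

An equivalent (seemingly stronger) statement is

Theorem 15.2 (Ultrafilter Lemma II). (AC implies:) Any family $\mathcal{M}$ of subsets of $M$ with the finite intersection property can be extended to an ultrafilter.

Proof.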
Any filter has the finite intersection property, and any family of sets having the finite intersection property can be extended to a filter by Lemma 15.1.

Remark 15.2. The equivalent Theorems 15.1 and 15.2 can be stated as an axiom UF if one wants to avoid AC. This axiom is strictly weaker than AC, as will be indicated in the proof of Theorem 12.5.

Theorem 15.3. The Ultrafilter Lemma implies (without AC) that any set can be totally ordered.

Proof. We consider a strict partial order "$<$" on a set $M$. It is clear that this exists.
0) On any finite set $M$ we can extend the partial order "$<$" to a total order "$<_M$": it is clearly true for $m(M) = 1$. Assume we can extend the partial order for a set of $n$ elements. Let $M' = M \cup \{x_{n+1}\}$, $m(M') = n + 1$. Let $y$ be the maximal (with respect to $<_M$) element in $M$ such that $y < x_{n+1}$ and $z$ the minimal element such that $x_{n+1} < z$. Then by compatibility of $<$ and $<_M$ we have $y <_M z$. For $w \in M$ put either $w <_{M'} x_{n+1}$ if $w <_M y$ or $w = y$, or $x_{n+1} <_{M'} w$ if $y <_M w$, and for all $u, w \in M$ let $u <_{M'} w \iff u <_M w$. This gives the desired total order of $M'$ and we conclude by the induction principle (for $\mathbb{N}$).
i) We consider the set of maps $S := \{f : G_f \subset M \times M \to \{0, 1\}\}$ such that
a) $(x, y), (y, z), (x, z) \in G_f$, $f(x, y) = f(y, z) = 1 \implies f(x, z) = 1$.
b) $(x, y), (y, x) \in G_f \implies (f(x, y) = 1 \iff f(y, x) = 0)$.
c) $(x, y) \in G_f$, $x < y \implies f(x, y) = 1$.
ii) For each finite subset $G \subset M \times M$, $m(G) < \infty$, there is an $f \in S$ with $G_f = G$. We just take $f$ to be the characteristic function of the extended total order $<_G$ on $G$, i.e. $f(x, y) = 1$ if $x <_G y$ and $f(x, y) = 0$ otherwise.
iii) $f \in S \iff f|_G \in S$ for any finite subset $G \subset G_f$.
– ($\implies$) follows immediately from the definition of restriction.
– ($\impliedby$) Let $(x, y), (y, x), (y, z) \in G_f$. We consider the following finite subsets of $G_f$: $G_a = \{(x, y), (y, z)\}$, $G_b = \{(x, y), (y, x)\}$, $G_c = \{(x, y)\}$. So a), b), c) follow from applying the restriction properties to $G_a$, $G_b$ and $G_c$, respectively.
iv)
There is an element $f_0 \in S$ such that $G_{f_0} = M \times M$.
– For finite $F, G \subset M \times M$ let $S_F = \{f \in S : F \subset G_f\}$. Then $S_F \cap S_G = S_{F \cup G} \ne \emptyset$: for each $f \in S_{F \cup G}$ we have $F \cup G \subset G_f$, hence $F \subset G_f$ and $G \subset G_f$, and so $f \in S_F$ and $f \in S_G$. Conversely, $f \in S_F \cap S_G$ implies $F \subset G_f$ and $G \subset G_f$. Thus $F \cup G \subset G_f$.
– This means $\{S_G : G \subset M \times M,\ m(G) < \infty\}$ has the finite intersection property. Hence there is an ultrafilter $p$ on $S$ that includes $\{S_G : G \subset M \times M,\ m(G) < \infty\}$.
– For $(x, y) \in M \times M$ we have
\[ p \ni \{f \in S_{(x,y)}\} = \{f \in S_{(x,y)} : f(x, y) = 0\} \cup \{f \in S_{(x,y)} : f(x, y) = 1\}. \]
Hence exactly one of the sets $\{f \in S_{(x,y)} : f(x, y) = i\}$, $i = 0, 1$, is contained in $p$. We define $i_{(x,y)}$ equal to 0 or 1 so that $\{f \in S_{(x,y)} : f(x, y) = i_{(x,y)}\} \in p$.
– Let $G \subset M \times M$, $m(G) < \infty$. Then the finite intersection
\[ \bigcap_{(x,y) \in G} \{f \in S_{(x,y)} : f(x, y) = i_{(x,y)}\} \in p. \]
Thus $\bigcap_{(x,y) \in G} \{f \in S_{(x,y)} : f(x, y) = i_{(x,y)}\} \ne \emptyset$. Choose any $f_0 \in \bigcap_{(x,y) \in G} \{f \in S_{(x,y)} : f(x, y) = i_{(x,y)}\}$. So $f_0 \in S$ and $G \subset G_{f_0}$ and $f_0|_G \in S$. Since this holds for any finite subset $G$ of $M \times M$, by iii) $G_{f_0} = M \times M$.
v) Define $(x <_M y) \iff f_0(x, y) = 1$. By the preceding arguments this will give a compatible total order on $M$.

Theorem 15.4. The Ultrafilter Lemma implies the existence of non-measurable sets.

Proof.
• To each $x \in \mathbb{R} \cap [0, 1]$ we associate its binary expansion containing infinitely many zeros, i.e. $x \mapsto (x_n)_{n \in \mathbb{N}}$, $x_n \in \{0, 1\}\ \forall n \in \mathbb{N}$.
• We consider the sequence $(X_n)_{n \in \mathbb{N}}$ of independent (with respect to Lebesgue measure $\mathcal{L}$) random variables $X_n(x) := x_n$.
• We define an equivalence relation $\sim$ on $[0, 1]$: $x \sim y \iff m(\{n \in \mathbb{N} : x_n \ne y_n\}) < \infty$.
• A subset $A \subset [0, 1]$ is called a tail set if $A$ respects $\sim$, i.e. if for all $y \sim x$: $x \in A \iff y \in A$.
• Kolmogorov's 0-1 law states: If $A$ is a (Lebesgue) measurable tail event then $\mathcal{L}(A)^2 = \mathcal{L}(A)$, i.e. $\mathcal{L}(A) = 0$ or $\mathcal{L}(A) = 1$.
• The inversion $j : [0, 1] \to [0, 1]$ defined by $x_n \mapsto x_n + 1 \pmod 2$ leaves the Lebesgue measure invariant, i.e. $\mathcal{L}(B) = \mathcal{L}(j^{-1}(B))$ for all measurable $B \subset [0, 1]$.
• Let $p$ be a free ultrafilter on $\mathbb{N}$.
We define $A_p := \{x \in [0, 1] : \{n \in \mathbb{N} : x_n = 1\} \in p\}$.
• $A_p$ is a tail set: Assume $x \in A_p$ and $y \sim x$. Then
\[ \{n \in \mathbb{N} : x_n = 1\} \subset \{n \in \mathbb{N} : y_n = 1\} \cup \{n \in \mathbb{N} : x_n \ne y_n\} \in p. \]
The set $\{n \in \mathbb{N} : x_n \ne y_n\}$ is finite and therefore, $p$ being free, does not belong to $p$. Hence $\{n \in \mathbb{N} : y_n = 1\} \in p$.
• $j(A_p) = [0, 1] \setminus A_p$: First we note that since $\mathbb{N} \in p$ the sequence $(1)_{n \in \mathbb{N}} \in A_p$. Moreover, for all $x \in [0, 1]$ with $j(x) = y$ we have $x_n = 1 \iff y_n = 0$. Hence, we have the disjoint union $\mathbb{N} = \{n \in \mathbb{N} : x_n = 1\} \cup \{n \in \mathbb{N} : y_n = 1\}$. The latter implies that $x \in A_p$ if and only if $j(x) \notin A_p$.
• We conclude that $0 < \mathcal{L}(A_p) = \mathcal{L}([0, 1] \setminus A_p) = \frac{1}{2} < 1$ if $A_p$ were measurable. This contradicts Kolmogorov's 0-1 law. Hence, $A_p$ is not measurable.

16. Non-standard analysis

17. Ultrafilter in topology

Definition 17.1. A set $X$ is called a topological space if there is a collection of sets $T \in \mathcal{P}(\mathcal{P}(X))$ with the properties
• $\emptyset \in T$ and $X \in T$.
• $U_\alpha \in T \implies \bigcup_\alpha U_\alpha \in T$.
• $U_i \in T$, $i = 1, \dots, n \implies \bigcap_{i=1}^n U_i \in T$.
The sets $U \in T$ are called open sets. Their complements are called closed. A set $V \subset X$ with $x \in U \subset V$, $U \in T$, is called a neighborhood of $x$.

Definition 17.2. A topological space $(X, T)$ is called a Hausdorff or $T_2$-space if for all distinct $x, y \in X$ there are open sets $U \ni x$, $V \ni y$ such that $U \cap V = \emptyset$.

Definition 17.3. A subset $Y \subset X$ of a topological space is called compact if for every open cover $\bigcup_\alpha U_\alpha \supset Y$ there are indices $\alpha_1, \dots, \alpha_n$ such that $\bigcup_{i=1}^n U_{\alpha_i} \supset Y$, i.e. any open cover has a finite subcover.

Definition 17.4. A subset $Y \subset X$ of a topological space is called sequentially compact if any sequence $(x_n)_{n \in \mathbb{N}}$, $x_n \in Y$, contains a subsequence converging to a point of $Y$: $\lim_{k \to \infty} x_{n_k} = y \in Y$.

As we will see later these two definitions of compactness are independent.

Example 17.1. Let $X$ be any set. Then $T = \{\emptyset, X\}$ defines the trivial topology. If $X$ contains at least two points this topology is not Hausdorff. Also any subset is compact (there are only two open sets altogether) but only $\emptyset$ and $X$ are closed!
Let $X$ be any set and $T = \mathcal{P}(X)$ (this topology is called the discrete topology). This topology is Hausdorff. Any subset is at the same time open and closed. The only compact sets are the finite subsets (cover an infinite set by its points; this open cover has no finite subcover).

Let $X$ be any infinite set and $T = \mathcal{F}^X_{\mathrm{cofin}} \cup \{\emptyset\}$ (this topology is called the cofinite topology). This topology is not Hausdorff: any non-empty $U, V \in T$ have a non-empty (infinite) intersection. Any subset is compact: Let $Y \subset X$ and $\bigcup_\alpha U_\alpha \supset Y$ be an open cover, i.e. the $U_\alpha$ are cofinite. Choose $U_{\alpha_0} \ne \emptyset$. Then $Y \setminus U_{\alpha_0} = \{x_1, \dots, x_n\}$ is a finite set. Now we choose $U_{\alpha_i} \ni x_i$, $1 \le i \le n$, and obtain a finite subcover.

Definition 17.5. Let $(X_i, T_i)$, $i = 1, 2$, be topological spaces and $f : X_1 \to X_2$ a mapping. $f$ is said to be continuous if $V \in T_2 \implies f^{-1}(V) \in T_1$.

Definition 17.6. A (semi-)group is called a topological (semi-)group if it carries a Hausdorff topology such that the group operations $g \mapsto hg$ and $g \mapsto gh$ are continuous. For groups we also ask that $g \mapsto g^{-1}$ is continuous.

The following proposition can be found in any book on topology.

Proposition 17.1. $X$ is a separable metric space $\overset{(1)}{\implies}$ $X$ is a Hausdorff space $\overset{(2)}{\implies}$ all compact subsets of $X$ are closed $\overset{(3)}{\implies}$ the family of compact subsets has the centered finite intersection property.

Proof.
(1) $\implies$: For $x \ne y$ in $X$ set $U(x) = \{z \in X : d(x, z) < \frac{1}{2} d(x, y)\}$ and $U(y) = \{z \in X : d(y, z) < \frac{1}{2} d(x, y)\}$. Then $x \in U(x)$, $y \in U(y)$ and $U(x) \cap U(y) = \emptyset$.
(2) $\implies$: Let $K \subset X$ be compact. If $K = X$ the statement is true since $X$ is closed by definition. So assume $K \subsetneq X$ and $x \in X \setminus K$. For $y \in K$ let $U_y(x)$, $U_x(y)$ be open sets such that $U_x(y) \cap U_y(x) = \emptyset$. Then $\bigcup_{y \in K} U_x(y)$ is an open cover of $K$ and we can extract a finite subcover $\bigcup_{k=1}^n U_x(y_k) \supset K$. The set $U(x) = \bigcap_{k=1}^n U_{y_k}(x)$ is open and disjoint from $K$. Hence, $X \setminus K$ is open.
(3) $\implies$: Let $K_\alpha$, $\alpha \in I$, be a family of compact sets with $\bigcap_{\alpha \in I} K_\alpha = \emptyset$. Choose a non-empty compact set $K \in \{K_\alpha : \alpha \in I\}$.
Since $K_\alpha$ is closed the sets $K \setminus K_\alpha$ are open (in $K$) and
\[ \bigcup_{\alpha \in I} (K \setminus K_\alpha) = K \setminus \Big(\bigcap_{\alpha \in I} K_\alpha\Big) = K. \]
We extract a finite subcover $\bigcup_{i=1}^n (K \setminus K_{\alpha_i}) \supset K$. Since
\[ K \subset \bigcup_{i=1}^n (K \setminus K_{\alpha_i}) = K \setminus \bigcap_{i=1}^n K_{\alpha_i} \]
we conclude $K \cap \bigcap_{i=1}^n K_{\alpha_i} = \emptyset$.

Theorem 17.1. A topological space $X$ is compact if and only if any ultrafilter $p$ on $X$ converges.

Proof. Coming soon.

17.1. Stone–Čech compactification of $\mathbb{N}$.

Lemma 17.1. $\beta\mathbb{N}$ is a compact semi-group, i.e. it is a compact Hausdorff space with continuous addition (and also multiplication).

Proof. Coming soon.

17.2. Proof of Tychonov's Theorem. Tychonov's Theorem is equivalent to AC. Coming soon.

17.3. Some more statements from topology and analysis.

Example 17.2.
• $X = [1, \dots, \omega_1)$ equipped with the order topology is not compact but sequentially compact:
– The open cover $X = \bigcup_{\alpha < \omega_1} [1, \dots, \alpha)$ has no countable (and therefore no finite) subcover.
– Any sequence $(\alpha_n)_{n \in \mathbb{N}}$, $\alpha_n \prec \omega_1$, is by Lemma 13.2 contained in some $[1, \dots, \beta')$ with $\beta' < \omega_1$ and hence has a least upper bound $\gamma < \omega_1$. That $\gamma$ is the limit point of some subsequence $(\alpha_{n_k})_{k \in \mathbb{N}}$.
• $\beta\mathbb{N}$ is compact but not sequentially compact.
– $\beta\mathbb{N}$ is compact by Lemma 17.1.
– Consider the sequence $(p_n)_{n \in \mathbb{N}}$ of principal ultrafilters corresponding to $n \in \mathbb{N}$. We need to show that it does not contain a convergent subsequence $(p_{n_k})_{k \in \mathbb{N}}$. Let $p$ be an arbitrary point of $\beta\mathbb{N}$. Then by the properties of an ultrafilter at least one of the two disjoint infinite sets
\[ S_e = \{n_2, \dots, n_{2k}, \dots\}, \qquad S_o = \{n_1, \dots, n_{2k+1}, \dots\} \]
does not belong to $p$, say $S_e$. Since $\beta\mathbb{N}$ is a Hausdorff space there is a neighborhood $U(p)$ not containing the infinite subsequence $(p_l)_{l \in S_e}$, and the subsequence $(p_{n_k})_{k \in \mathbb{N}}$ does not converge to $p$, i.e. does not converge at all since $p$ was arbitrary.

Theorem 17.2 (Hamel). There exists a Hamel basis in $\mathbb{R}$, i.e. a basis of $\mathbb{R}$ as a vector space over $\mathbb{Q}$.

Proof. Coming soon.

Corollary 17.1.
There are (discontinuous) non-linear solutions to Cauchy's functional equation
\[ f(x + y) = f(x) + f(y), \quad \forall x, y \in \mathbb{R}. \]

Proof. Coming soon.

Theorem 17.3 (Ellis). Any compact topological semi-group contains an idempotent element.

Proof. Coming soon.

Theorem 17.4. $\beta\mathbb{N}$ contains an idempotent element.

Proof. Coming soon.

18. Ultrafilter in infinite combinatorics and ergodic theory

Theorem 18.1 (Furstenberg). Let $f : X \to X$ be a continuous map of a compact metric space $X$. Then there is a minimal compact invariant set $K \subset X$.

Proof. Coming soon.

Proposition 18.1 (Corollary to Theorem 17.2). There is a proper non-empty subset $A \subset \mathbb{T}^1 = \mathbb{R}/\mathbb{Z}$ which is periodic under every rotation, i.e. $\forall \alpha \in \mathbb{R}\ \exists n \in \mathbb{N} : A + n\alpha \pmod 1 = A$.

Proof. Coming soon.

Theorem 18.2 (Ramsey). Any complete infinite finitely colored graph has a complete infinite monochromatic subgraph.

Proof. Let the complete infinite graph $(V, E)$ with $E = (V \times V)/((x, y) \sim (y, x)) \setminus \{(x, x) : x \in V\}$ be finitely colored, i.e. there is a function $f : E \to \{1, \dots, n\}$ for some fixed $n \in \mathbb{N}$. Since $V$ is an infinite set, there is a free ultrafilter $p$ on $V$. We set $V_i(x) := \{y \in V : f(x, y) = i\}$. We have $V \setminus \{x\} \in p$ and
\[ V \setminus \{x\} = \bigcup_{i=1}^n V_i(x) \]
and this union is disjoint. Next we define $g(x) = j$, where $j$ is the unique index such that $V_j(x) \in p$. Let $G_i := \{x \in V : g(x) = i\}$; then again
\[ V = \bigcup_{i=1}^n G_i \]
and this union is disjoint. Let $G_{i_0}$ be the unique set such that $G_{i_0} \in p$. Choose $x_1 \in G_{i_0}$. Continue (using a choice principle weaker than AC) by choosing
\[ x_{n+1} \in G_{i_0} \cap \bigcap_{k=1}^{n} V_{i_0}(x_k) \ne \emptyset \]
(a finite intersection of elements of a filter). The complete graph with vertices $x_k$, $k \in \mathbb{N}$, is monochromatic.

Theorem 18.3 (Hindman). Any finite coloring of $\mathbb{N}$ has a monochromatic IP-subset.

Proof. Coming soon.
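Theorem 18.2 (Ramsey) concerns infinite graphs and cannot be verified by computation, but its best-known finite analogue can. The sketch below is an illustration only and not part of the proof above; it assumes two colors and uses hypothetical helper names (`has_mono_triangle`, `every_coloring_has_mono_triangle`). It checks by exhaustive search that every 2-coloring of the edges of the complete graph on 6 vertices contains a monochromatic triangle, while the complete graph on 5 vertices admits a coloring without one.

```python
from itertools import combinations, product

def has_mono_triangle(n, color):
    """True if some triangle on vertices 0..n-1 has three edges of one color.

    `color` maps each edge (a, b) with a < b to 0 or 1.
    """
    return any(color[(a, b)] == color[(a, c)] == color[(b, c)]
               for a, b, c in combinations(range(n), 3))

def every_coloring_has_mono_triangle(n):
    """Exhaustively test all 2-colorings of the edges of the complete graph K_n."""
    edges = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, bits))):
            return False  # found a coloring with no monochromatic triangle
    return True

print(every_coloring_has_mono_triangle(5))  # False
print(every_coloring_has_mono_triangle(6))  # True
```

The search covers $2^{10}$ colorings for 5 vertices and $2^{15}$ for 6, so it finishes quickly. The ultrafilter argument in the proof of Theorem 18.2 replaces this kind of exhaustive search, which has no counterpart for infinite vertex sets.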