Set Theory-an Introduction
1. Intro
• A set theorist is a mathematician who admits not to
know what the real numbers are.
• Some commonly used axioms, theorems, statements and modes of
logical reasoning are dangerous. One has to know the danger and should
not ignore it.
• Naive set theory (Cantor: everything with a property is a set,
i.e. any definable collection is a set) and Russell's paradox: The
set of all sets ⊃ the set of all sets that do not contain themselves
as an element. Let us call a set "abnormal" if it is a member of
itself, and "normal" otherwise. For example, take the set of all
squares in the plane. That set is not itself a square in the plane,
and therefore is not a member of the set of all squares in the
plane. So it is "normal". On the other hand, if we take the
complementary set that contains all non-(squares in the plane),
that set is itself not a square in the plane and so should be
one of its own members as it is a non-(square in the plane). It
is "abnormal". Now we consider the set of all normal sets, R.
Determining whether R is normal or abnormal is impossible: if
R were a normal set, it would be contained in the set of normal
sets (itself), and therefore be abnormal; and if R were abnormal,
it would not be contained in the set of all normal sets (itself),
and therefore be normal. This leads to the conclusion that R is
neither normal nor abnormal: Russell's paradox.
• There is a model of R such that there is an ε-δ discontinuous
function that is sequentially continuous. From ZF it cannot
be proven that a sequentially continuous function on a compact
interval attains its maximum.
2. What is a set?
• Main obstacle: A set cannot contain itself as an element.
• Solutions: certain axiomatics, e.g. Sierpinski: a set is of higher
hierarchy than its elements.
• More common: Zermelo-Fraenkel axioms
• A set contains elements (objects): a ∈ A.
• Two sets are equal iff they have the same elements.
• No set is its own element.
• Given a condition we do not know beforehand that there is an
object fulfilling it. So it is convenient to define ∅ (empty set)
as the set that does not contain any element.
• Sets might be elements of other sets.
• Subsets are sets (collection of some elements in the set). The
empty set is a subset of all sets.
• Unions of sets are sets: A a set, for each α ∈ A there is a set
B_α, then the union ⋃_α B_α is defined as the set containing all the
elements of at least one of the B_α.
• Complements are sets: B \ A = C is a set.
• Intersection of sets: A a set, for each α ∈ A there is a set B_α,
then the intersection ⋂_α B_α is defined as the set containing all
the elements that are elements of all B_α.
• Cartesian product of sets: A × B is the set of ordered pairs
(x, y), x ∈ A, y ∈ B.
• Exponents of sets: A^B is the set of all functions from B into
A. Example: A^N is the set of all sequences in A.
3. Consistency of ZF
By Gödel's second incompleteness theorem the consistency of ZF cannot
be proven within ZF. For this one needs the existence of "large"
cardinals (another axiom in set theory).
However, redundancy is known.
4. Classes
• A class is a collection of sets (or sometimes other mathematical
objects) that can be unambiguously defined by a property that
all its members share.
• A class that is not a set (informally in Zermelo–Fraenkel) is
called a proper class.
• Examples:
– The class of all sets
– The class of all one-element sets
– The class of all groups, rings, fields, vector spaces, etc.
• One way to prove that a class is proper is to place it in bijection
with the class of all ordinal numbers.
5. Equivalence relations
Let M be a set.
Definition 5.1. A relation "∼" on M (i.e. a subset of M × M) is an
equivalence relation if
(1) Reflexivity: a ∼ a for all a ∈ M.
(2) Symmetry: a ∼ b ⟹ b ∼ a.
(3) Transitivity: a ∼ b and b ∼ c ⟹ a ∼ c.
Definition 5.2. Partition into classes:
M = ⋃_α M_α, α ≠ β ⟹ M_α ∩ M_β = ∅.
The sets M_α are called classes.
Lemma 5.1. Any equivalence relation gives a partition into classes
and vice versa.
Proof.
(⟹) M_a = {x ∈ M : x ∼ a}, M_a = M_b = M_α ⟺ a ∼ b.
(reverse) a ∼ b ⟺ a ∈ M_b.
Two classes either coincide or are disjoint!
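For a finite set M the passage from an equivalence relation to its classes can be sketched in Python. This is only an illustration under assumptions: the function name equivalence_classes and the example (congruence modulo 3) are hypothetical and not part of the notes, and the relation is assumed to really be reflexive, symmetric and transitive.

```python
from typing import Callable, Hashable, Iterable, List, Set


def equivalence_classes(elements: Iterable[Hashable],
                        related: Callable[[Hashable, Hashable], bool]) -> List[Set]:
    """Split a finite set into classes M_a = {x : x ~ a}.

    Assumes `related` is an equivalence relation; only then is it
    enough to compare x against one representative per class.
    """
    classes: List[Set] = []
    for x in elements:
        for cls in classes:
            representative = next(iter(cls))
            if related(x, representative):
                cls.add(x)
                break
        else:
            # x is related to no existing class: it starts a new one
            classes.append({x})
    return classes


# Hypothetical example: congruence modulo 3 on {0, ..., 9}.
M = range(10)
classes = equivalence_classes(M, lambda a, b: (a - b) % 3 == 0)
```

The resulting classes cover M and are pairwise disjoint, which is the finite shadow of the statement that two classes either coincide or are disjoint.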
6. Equivalence of sets
Definition 6.1. A ∼ B iff there is a bijection f : A → B.
Remark 6.1. Two finite sets are equivalent iff they have the same
number of elements (generalization!)
Example 6.1. N^N ∼ [0, 1] \ Q via continued fraction expansion. Note
that N^N is the set of all sequences of natural numbers.
Definition 6.2. A set is of infinite power (infinite) if it is not equivalent to any finite set.
Definition 6.3. A set is countable iff it is equivalent to N. If a set
is neither finite nor countable it is called uncountable.
Example 6.2.
• Z is countable.
• The even numbers are countable.
• Q is countable.
• Any subset B of a countable set A is countable or finite. (Enumerate
A: a_1, a_2, a_3, ··· and let B: a_{n_1}, a_{n_2}, a_{n_3}, ···; either the
enumeration of B is finite or it gives a correspondence to ω.)
• Any countable union of countable sets with specified bijections to
N is countable (write a table with one row for each A_n: ···, a_{n,i}, ···
and mimic the enumeration of Q).
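The table argument for countable unions can be made concrete. The following Python sketch walks the table along anti-diagonals, as in the usual enumeration of Q; the names diagonal_pairs and enumerate_union and the example family A_n = multiples of n are illustrative assumptions. The entry a(n, i) plays the role of a_{n,i}, i.e. the bijections of the A_n with N are given explicitly.

```python
from itertools import count, islice
from typing import Callable, Iterator, Tuple


def diagonal_pairs() -> Iterator[Tuple[int, int]]:
    """Enumerate all pairs (n, i) with n, i >= 1 along anti-diagonals n + i = s."""
    for s in count(2):
        for n in range(1, s):
            yield (n, s - n)


def enumerate_union(a: Callable[[int, int], int]) -> Iterator[int]:
    """Enumerate the union of A_n = {a(n, 1), a(n, 2), ...}, skipping repeats."""
    seen = set()
    for n, i in diagonal_pairs():
        x = a(n, i)
        if x not in seen:
            seen.add(x)
            yield x


# Hypothetical example: A_n = multiples of n, so a(n, i) = n * i.
first_pairs = list(islice(diagonal_pairs(), 6))
first_elements = list(islice(enumerate_union(lambda n, i: n * i), 8))
```

Every table entry is reached after finitely many steps, which is all the argument needs; the specified bijections are what allows the code to compute a(n, i) without any choice.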
Definition 6.4. A set is called Dedekind finite if it does not contain
a countable set as a subset. Otherwise it is called Dedekind infinite.
Lemma 6.1 (ωAC equivalent). Any infinite set is Dedekind infinite.
Proof.
• Choose arbitrary a_1 ∈ M (OK!).
• Choose a_2 ∈ M \ {a_1} (OK!).
• Continue ··· since there are always elements left because M is
infinite (OK???????).
We run into the same problems as with sequential continuity. There
is a model of R where R contains infinite sets that are Dedekind finite! This gives also an example of a discontinuous function that is
sequentially continuous!
Theorem 6.1. I := [0, 1] ∩ R is uncountable (in ZF without AC!).
Proof.
• Any x has a unique binary expansion with infinitely many 0's.
So I ∼ B ⊂ {0, 1}^N. (The latter is the set of all 0–1 sequences.)
• {0, 1}^N ∼ C_{1/3} ⊂ I (use base 3 expansion).
• This shows by the Cantor–Bernstein Theorem that I ∼ {0, 1}^N.
• Use Cantor's diagonal argument to show that N ≁ {0, 1}^N:
Consider a table [a_{k,l}]_{k,l=1}^∞ with countably many lines indexed by
k and in each line countably many entries indexed by l, each
entry either 0 or 1. This table reflects a numeration (bijection
with N) of 0–1 sequences. Let 0* = 1 and 1* = 0. Then
the sequence (a*_{n,n})_{n=1}^∞ is not in the table and hence there is no
desired bijection N → {0, 1}^N.
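The diagonal step can be checked on a finite excerpt of such a table. The Python sketch below is only an illustration: the function name diagonal_complement and the 4 × 4 table are made up, and a finite square table stands in for the infinite one.

```python
from typing import List


def diagonal_complement(table: List[List[int]]) -> List[int]:
    """Return (a*_{n,n}) for a square 0-1 table, where 0* = 1 and 1* = 0."""
    return [1 - table[n][n] for n in range(len(table))]


# Hypothetical 4 x 4 excerpt of a table of 0-1 sequences.
table = [
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
]
d = diagonal_complement(table)
```

By construction d differs from the n-th row in position n, so it cannot coincide with any row.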
7. The Cantor–Bernstein Theorem
Theorem 7.1 (Cantor–Bernstein). Let A ∼ B1 , B ∼ A1 , A1 ⊂ A and
B1 ⊂ B. Then A ∼ B.
Proof. Let f : A → B_1 and g : B → A_1 be (any) corresponding
bijective maps. Consider chains: a ↦ b iff a ∈ A, b ∈ B_1 ⊂ B and
f(a) = b. Similarly, b ↦ a iff b ∈ B, a ∈ A_1 ⊂ A and g(b) = a.
• Each chain is "infinite to the right".
• Each element of A, respectively B, is contained in exactly one
chain.
• There are 3 (disjoint) possibilities: a chain C is "infinite in both
directions" (type 1), the "least element" belongs to A (type 2)
or the "least element" belongs to B (type 3).
• f maps the elements of A that belong to chains of type 1 or 2 in
a 1-to-1 way onto B_{1,2} ⊂ B, the set of elements of B in chains of
type 1 or 2. The remaining elements of A are in chains of type 3
and g^{-1} maps A \ {chains of type 1 or type 2}
in a 1-to-1 fashion onto B \ B_{1,2}.
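The chain argument is constructive once preimages can be computed. The Python sketch below follows it under a simplifying assumption that is not in the notes: every chain has a least element (types 2 and 3 only), so that tracing a chain to the left terminates. The names cantor_bernstein, shift and unshift and the example A = B = {0, 1, 2, ···} with f(a) = a + 1, g(b) = b + 1 are hypothetical.

```python
from typing import Callable, Optional


def cantor_bernstein(f: Callable[[int], int],
                     f_inv: Callable[[int], Optional[int]],
                     g_inv: Callable[[int], Optional[int]]) -> Callable[[int], int]:
    """Build a bijection h : A -> B from injections f : A -> B and g : B -> A.

    f_inv and g_inv return the preimage, or None if there is none.
    Assumption of this sketch: every chain has a least element,
    so the backward walk below always stops.
    """
    def h(a: int) -> int:
        x, in_a = a, True            # walk the chain of a to the left
        while True:
            prev = g_inv(x) if in_a else f_inv(x)
            if prev is None:
                break
            x, in_a = prev, not in_a
        # least element in A (type 2): use f; in B (type 3): use g^{-1}
        return f(a) if in_a else g_inv(a)
    return h


# Hypothetical example: A = B = {0, 1, 2, ...}, f(a) = a + 1, g(b) = b + 1.
def shift(n): return n + 1
def unshift(n): return n - 1 if n >= 1 else None

h = cantor_bernstein(shift, unshift, unshift)
values = [h(a) for a in range(6)]
```

In this example the chains starting in A contain the even elements of A and the chains starting in B the odd ones, so h swaps 2j and 2j + 1.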
8. The power of a set, Cardinals
Definition 8.1. A cardinal number is an equivalence class of (nonempty) sets.
Definition 8.2. The power or cardinality of a set A is the corresponding cardinal number denoted by m(A).
Remark 8.1. Some facts.
• The cardinality of a finite set is the natural number equal to the
number of elements contained in the set.
• The set N of all natural numbers has the cardinality denoted by
ℵ0 .
• The set R of all real numbers has the cardinality denoted by c.
• There are 4 possibilities:
– A ∼ B_1 ⊂ B and B ∼ A_1 ⊂ A
– A ∼ B_1 ⊂ B but ∀A_1 ⊂ A: A_1 ≁ B
– B ∼ A_1 ⊂ A but ∀B_1 ⊂ B: B_1 ≁ A
– ∀A_1 ⊂ A: A_1 ≁ B and ∀B_1 ⊂ B: B_1 ≁ A.
In the first case (Cantor–Bernstein) m(A) = m(B), in the
second we write m(A) < m(B), in the third m(A) > m(B).
The most interesting case is the fourth. Without any further
axioms it can happen or not. The assumption that the
fourth case does not happen, i.e. we can compare any two
sets (Trichotomy), is equivalent to the Axiom of Choice
that we will study later.
For a set M we write P(M) for its power set, i.e. the set of all
subsets of M (including the empty set!). We note that for a finite set
M with n elements its power set has exactly 2^n elements.
The next theorem shows that there is no ”largest set”.
Theorem 8.1 (Cantor). m(M ) < m(P(M )).
Proof.
• M ∋ x ↦ {x} ∈ P(M) is a bijection onto its image. So M and
P(M) are comparable.
• Assume there is a bijection x ↦ f(x) = M_x ∈ P(M).
• Consider the set X := {x ∈ M : x ∉ M_x} ⊂ M, i.e. X ∈
P(M).
• Like in Russell's paradox, X ≠ f(y) for all y ∈ M. If it were,
then y could be neither in X nor in its complement!
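For a small finite M the diagonal set X can be tested against every map f : M → P(M). The following Python sketch does this exhaustively for a three-element set; the names power_set and diagonal_set are illustrative assumptions, and the finite check is of course only a sanity test, not a proof.

```python
from itertools import combinations, product


def power_set(m):
    """All subsets of the finite set m, as frozensets."""
    items = list(m)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(m, f):
    """X = {x in M : x not in f(x)} for a map f : M -> P(M) given as a dict."""
    return frozenset(x for x in m if x not in f[x])


M = [0, 1, 2]
subsets = power_set(M)
# Try every one of the 8**3 maps f : M -> P(M); X is never in the image.
missed_every_time = all(
    diagonal_set(M, dict(zip(M, images))) not in images
    for images in product(subsets, repeat=len(M))
)
```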
Remark 8.2. We summarize:
• There is no ”largest” cardinal.
• The collection of all cardinal numbers is a proper class!
• Notation: For a finite set A with n elements we write m(A) = n.
• Notation: m(P(M)) = 2^{m(M)} in analogy to finite sets.
• 2^{ℵ_0} = c, where c = m(R) is the power of the continuum.
• m({0, 1}^N) = c.
• m(N^N) = c. For both last statements consider the equivalence
to the real numbers.
• The continuum hypothesis states that there is no cardinal
number m such that ℵ_0 < m < c.
• One can prove that m(Borel sets on R) = c. All subsets of C_{1/3}
have outer Lebesgue measure 0 and hence are Lebesgue measurable. So
m(Lebesgue measurable sets in R) > c = m(Borel sets in R).
9. Cardinal arithmetics
Let n, k, l be cardinals. By the very definition we can choose sets
B, C, D, pairwise disjoint (no two of them have common elements), such
that
n = m(B), k = m(C), l = m(D).
We define
• l + n = m(D ∪ B),
• l · n = m(D × B),
• l^n = m(D^B).
These definitions are well-stated since for equivalent sets B′, C′, D′ one
has m(D ∪ B) = m(D′ ∪ B′), m(D × B) = m(D′ × B′) and m(D^B) =
m(D′^{B′}), since the ′-sets come from "renaming" the elements via
the corresponding bijections. Only for addition we need the assumption
that the involved sets are disjoint!
We have the following rules:
• n + l = l + n since B ∪ D = D ∪ B.
• n · l = l · n since B × D ∼ D × B.
• n · (k + l) = n · k + n · l since B × (C ∪ D) ∼ (B × C) ∪ (B × D).
Lemma 9.1. The notation m(P(A)) = 2^{m(A)} is justified by the
computation rules. Moreover,
m(P(P(A))) = 2^{2^{m(A)}},
m(P(P(P(A)))) = 2^{2^{2^{m(A)}}},
etc.
Proof. Let B = {0, 1}. Then B has cardinality 2, and the
set B^A has cardinality 2^{m(A)}. By definition the set B^A consists of all
functions f : A → {0, 1}, i.e. for a ∈ A, f(a) = 0 or f(a) = 1. Setting
A_f = {a ∈ A : f(a) = 1} we have that f is the characteristic (indicator)
function of the subset A_f ∈ P(A) and vice versa. So we obtain a
1-to-1 correspondence between the subsets of A and the elements of
B^A.
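The correspondence f ↦ A_f can be listed explicitly for a small set. In the Python sketch below the names indicator_functions and subset_of and the three-element set A are illustrative assumptions; functions A → {0, 1} are represented as dictionaries.

```python
from itertools import product


def indicator_functions(a):
    """All functions f : A -> {0, 1}, each represented as a dict."""
    items = list(a)
    return [dict(zip(items, bits)) for bits in product((0, 1), repeat=len(items))]


def subset_of(f):
    """A_f = {a in A : f(a) = 1}."""
    return frozenset(x for x, bit in f.items() if bit == 1)


A = ["p", "q", "r"]
functions = indicator_functions(A)
subsets = {subset_of(f) for f in functions}
```

There are 2^3 = 8 functions and they produce 8 distinct subsets, so f ↦ A_f is 1-to-1 and onto P(A) in this example.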
For the proofs of the following statements we assume that the chosen
representatives of a cardinal are pairwise disjoint.
Lemma 9.2.
m^{n_1 + n_2} = m^{n_1} · m^{n_2}.
Proof. Follows since we can biject the functions f : N_1 ∪ N_2 → M
with the ordered pairs of functions given by (f_1 = f|_{N_1}, f_2 = f|_{N_2}).
Lemma 9.3.
(m^n)^k = m^{n·k}.
Proof. (m^n)^k is the cardinality of the set of functions from K to the set
of functions from N to M, i.e. f(k) = g_k with g_k(n) = m_{k,n} ∈ M, k ∈ K,
n ∈ N. On the other hand m^{n·k} is the cardinality of the set of functions
h : N × K ∋ (n, k) ↦ m ∈ M. We bijectively set f ∼ h iff g_k(n) = h(n, k)
for all k ∈ K, n ∈ N.
Lemma 9.4. For m ≤ n and k ≤ l we have
m + k ≤ n + l,
m · k ≤ n · l.
Proof. Obviously,
m(M ∪ K) ≤ m(N ∪ L)
and
m(M × K) ≤ m(N × L).
Lemma 9.5.
(m_1 · m_2)^n = m_1^n · m_2^n.
Proof. Follows since we can biject the functions f : N → M_1 × M_2
given by f(n) = (m_1, m_2) with the elements of M_1^N × M_2^N, i.e. ordered
pairs of functions given by (f_1(n) = m_1, f_2(n) = m_2).
Lemma 9.6. If n ≥ ℵ_0 is a cardinal and k ∈ ℕ is finite then
n = n + k = n + ℵ_0.
Proof. By the assumption we find sets N, N_1 with m(N) = n, N = N_1 ∪ ℕ
and N_1 ∩ ℕ = ∅ (here ℕ denotes the natural numbers; note that n ≥ ℵ_0
means that there are representatives of n that contain the natural numbers
since the two cardinals are comparable). We define bijections between N
and N ∪ {1, 2, ···, k} and N ∪ ℕ (disjoint unions), respectively, by taking
the identity on N_1 and a bijective map from ℕ to ℕ ∪ {1, 2, ···, k} or from
ℕ to ℕ ∪ ℕ, respectively. Those bijections have
been considered before (sometimes called "Hilbert's hotel").
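Lemma 9.7.
ℵ_0^2 = ℵ_0.
Proof. The statement can be interpreted as m(N × N) = m(N), i.e. we
have to biject (ordered) pairs of natural numbers to natural numbers.
An effective way is to choose (m, n) ↦ 2^m (2n + 1) for m, n ∈ {0, 1, 2, ···};
this is a bijection onto {1, 2, 3, ···} since every positive integer is
uniquely a power of 2 times an odd number.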
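The pairing map and its inverse are easy to write down. The Python sketch below assumes the convention m, n ≥ 0 with values in {1, 2, 3, ···}; the function names pair and unpair are illustrative.

```python
def pair(m: int, n: int) -> int:
    """Map (m, n) with m, n >= 0 to 2^m * (2n + 1) >= 1."""
    return 2 ** m * (2 * n + 1)


def unpair(k: int):
    """Inverse of pair: split k >= 1 into a power of 2 times an odd number."""
    m = 0
    while k % 2 == 0:
        k //= 2
        m += 1
    return m, (k - 1) // 2


# All codes with m < 7 and n < 64, sorted; they start 1, 2, 3, ...
codes = sorted(pair(m, n) for m in range(7) for n in range(64))
```

Injectivity and surjectivity both come from the uniqueness of the factorization k = 2^m · (odd number), which is exactly what unpair computes.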
Lemma 9.8. For n ≥ ℵ_0 and k ≤ 2^n
2^n + k = 2^n.
Proof. Since n ≥ ℵ_0 we have n + 1 = n and
2^n ≤ 2^n + k ≤ 2^n + 2^n = 2 · 2^n = 2^{n+1} = 2^n.
10. CH and GCH, the continuum hypothesis and the
generalized continuum hypothesis
CH: There is no cardinal n such that
ℵ_0 = m(N) < n < c = 2^{ℵ_0} = m(R).
GCH: For any infinite cardinal number m there is no other cardinal n
such that m < n < 2^m.
11. Well-ordering
Definition 11.1. A (partial) ordering a ≤ b is a relation, i.e. a subset
of M × M, with:
• a ≤ a (reflexivity)
• a ≤ b, b ≤ c ⟹ a ≤ c (transitivity)
• a ≤ b, b ≤ a ⟹ a = b (antisymmetry)
Example 11.1. N, R, P(M ) with subset relation, · · ·
Definition 11.2. A set M is totally ordered if for any distinct a, b ∈
M either a < b or b < a.
Example 11.2. N, R but not P(M ).
Definition 11.3. A well-ordering of a set M is a total ordering such
that any non-empty subset M_1 ⊂ M, i.e. ∅ ≠ M_1 ∈ P(M), has a least
element, i.e. ∃x ∈ M_1 such that for all a ∈ M_1 we have x ≤ a.
Example 11.3. N but not R.
Definition 11.4. A subset M1 ⊂ M of a (partially) ordered set M is
called a chain if it is totally ordered, i.e.∀a, b ∈ M1 either a ≤ b or
b ≤ a.
Definition 11.5. An element a ∈ M of a (partially) ordered set M is
said to be an upper bound for a subset M_1 ⊂ M if x ≤ a for all
x ∈ M_1. A lower bound is defined analogously. A subset with an
upper/lower bound is said to be bounded from above/below.
An element a ∈ M of a (partially) ordered set M is maximal if for
all x ∈ M: a ≤ x ⟹ a = x.
Definition 11.6. An element a ∈ M_1 ⊂ M of a subset of a (partially)
ordered set is said to be comparable in M_1 if for all x ∈ M_1:
a ≤ x or x ≤ a.
Remark 11.1. A subset M1 ⊂ M of a (partially) ordered set M is a
chain iff any of its elements is comparable in M1 .
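On a finite partially ordered set the notions of chain and maximal element can be checked directly. The Python sketch below uses divisibility on {1, ···, 10} as a hypothetical example; the names is_chain, maximal_elements and divides are not from the notes.

```python
def is_chain(subset, leq):
    """True if any two elements of subset are comparable under leq."""
    return all(leq(a, b) or leq(b, a) for a in subset for b in subset)


def maximal_elements(m, leq):
    """Elements a of m with: a <= x implies a = x."""
    return [a for a in m if all(x == a or not leq(a, x) for x in m)]


# Hypothetical example: M = {1, ..., 10} ordered by divisibility.
def divides(a, b):
    return b % a == 0


M = range(1, 11)
chain_example = is_chain([1, 2, 4, 8], divides)
not_a_chain = is_chain([2, 3], divides)
maximal = maximal_elements(M, divides)
```

Note that there are several maximal elements here and none of them is a largest element, which is typical for orderings that are partial but not total.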
12. The Axiom of Choice and related statements
AC: For any collection of non-empty sets {A_α}_{α∈I}, where the A_α and I
are sets, there is a function f : {A_α}_{α∈I} → ⋃_α A_α such that for any α
we have f(A_α) ∈ A_α.
Remark 12.1. In contrast to ZF this axiom allows to build sets without
specifying the elements. One can choose one shoe from an infinite
collection of pairs of shoes (choose always the left!) but one needs AC
to choose from pairs of socks.
Since any non-empty set contains elements, AC is not needed for a finite
collection of sets!
AC is not harmless at all but it has many convenient applications.
Sometimes one uses AC even when it is not needed (definition of a
differentiable structure).
One should always be aware when one uses AC!
12.1. Statements equivalent to AC.
The AC is equivalent to the following statements. Each of which has
its advantages in different applications.
Theorem 12.1 (Maximal chain theorem of Hausdorff). In any partially ordered set, every totally ordered subset (chain) is contained in a
maximal totally ordered subset (chain).
AC implies the Maximal chain theorem of Hausdorff.
• First change ≤ to ⊂ by defining M ⊃ M_x := {y ∈ M : y ≤ x}.
• Let X be a non-empty collection (it contains ∅) of subsets of M
with the following properties: every subset of a set in X belongs
to X, and the union of each chain in X is contained in X.
• Let f : P(M) \ {∅} → M be a choice function for M (AC!). For
each A ∈ X let A* := {x ∈ M : A ∪ {x} ∈ X}. Define g : X → X by
g(A) = A if A* \ A = ∅ and g(A) = A ∪ {f(A* \ A)} otherwise.
Then g(A) contains at most 1 element more than A.
• We want to prove that g(A) = A for some A ∈ X. That will
imply the theorem.
• We say that a subcollection J of X is a tower if
– ∅ ∈ J.
– If A ∈ J then g(A) ∈ J.
– If C is a chain in J then ⋃_{A∈C} A ∈ J.
• The intersection of towers is a tower. Let J_0 be the intersection
of all towers, i.e. the smallest tower. We are going to prove that
it is a chain.
• Let C be comparable in J_0.
• Assume A ∈ J_0 and A ⊊ C; then g(A) ⊂ C. Otherwise A ⊊ C ⊊ g(A),
contradicting that g(A) has at most one more element than A.
• Let U be the collection of sets A ∈ J_0 such that A ⊂ C or
g(C) ⊂ A. We want to show that U is a tower.
– ∅ ∈ U.
– If A ⊊ C then (previously) g(A) ⊂ C, i.e. g(A) ∈ U.
– If A = C then g(C) = g(A) (i.e. g(C) ⊂ g(A)) and g(A) ∈ U.
– If g(C) ⊂ A then g(C) ⊂ A ⊂ g(A) and g(A) ∈ U.
– The union of the elements over a chain is by the definition
of U contained in U.
• U = J_0.
• If C is comparable then so is g(C) by the previous considerations:
if A ∈ J_0 = U then either A ⊂ C ⊂ g(C) or g(C) ⊂ A.
• g maps comparable sets to comparable sets. The union of comparable
sets over a chain is comparable. That implies that the comparable sets
constitute a tower and hence J_0 consists of comparable sets only,
i.e. J_0 is a chain itself.
• Since J_0 is a chain and a tower, the union A over all elements of
J_0 is in J_0. Therefore g(A) ⊂ A since A includes all sets in J_0.
On the other hand A ⊂ g(A). Therefore A = g(A).
Theorem 12.2 (Zorn’s Lemma). Every non-empty partially ordered
set in which every chain (i.e., totally ordered subset) has an upper
bound contains at least one maximal element.
Maximal chain theorem of Hausdorff implies Zorn's Lemma.
Take a maximal chain C in the set and an upper bound a for it. Then
a ∈ C and a is a maximal element. Otherwise there is an element b in
the set such that a < b ∉ C and the chain C is not maximal.
Theorem 12.3 (Zermelo’s Well-Ordering Principle (WOP)). Every
set can be well-ordered.
Zorn’s lemma implies WOP.
• If A, B are two well-ordered sets then A is said to be a continuation of B if B ⊂ A and the subset inclusion preserves the
order in B, A.
• If a collection of well-ordered sets forms a chain C with respect
to continuation then there is a unique well-ordering of U , the
union of the sets in the chain, that is a continuation of the
well-ordering of all sets in C. This well-order is defined in the
following way: take a, b ∈ U. Then there are A, B ∈ C with a ∈ A and
b ∈ B. By the continuation property either A = B or one is
the continuation of the other. That defines the order between
a and b. It is clearly a well-ordering.
• Consider U, the collection of well-ordered subsets of M, i.e. subsets
together with a (chosen) well-ordering. Then U is partially
ordered by continuation. This collection contains the empty
set. If C is a chain in U then the union of the sets in C is an
upper bound. Hence, there is a maximal well-ordered set M′
in U. This set must be equal to M since otherwise we could
add the "missing" element x ∈ M \ M′ to the chain by x > y,
∀y ∈ M′.
WOP implies AC.
Consider U = ⋃_{α∈I} A_α and choose (no AC needed for this!) a
well-ordering on the set U. Since A_α ⊂ U we can well-define f(A_α) as the
minimal element of A_α ⊂ U. This is a choice function.
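Once a well-ordering of U is fixed, the choice function is a one-liner. The Python sketch below models the well-ordering of a finite U by a list; the names choice_function, U_order and the example family are hypothetical and only illustrate that no further choices are made after the ordering is given.

```python
def choice_function(rank):
    """Return f with f(A) = least element of A in the well-ordering given by rank.

    `rank` maps each element of U to its position in the well-ordering.
    """
    def f(a_alpha):
        return min(a_alpha, key=rank)
    return f


# Hypothetical family of non-empty subsets of U = {"a", ..., "e"}.
U_order = ["c", "a", "e", "b", "d"]            # a chosen well-ordering of U
rank = {x: i for i, x in enumerate(U_order)}.__getitem__
family = [{"a", "b"}, {"d"}, {"b", "d", "e"}]
f = choice_function(rank)
chosen = [f(A) for A in family]
```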
Theorem 12.4 (Trichotomy of Cardinals). If two sets are given, then
either they have the same cardinality, or one has a smaller cardinality
than the other.
Proof. This proof comes in section 14.
Theorem 12.5 (Tychonov's theorem). Let {K_α}_{α∈I}, I any index set,
be non-empty compact spaces. Then the product ∏_{α∈I} K_α endowed with
the product topology is a compact non-empty space.
Proof. The proof is postponed till section 17.
Remark 12.2. If the spaces Kα are assumed to be Hausdorff Tychonov’s theorem follows from the Ultrafilter Lemma without using AC
as can be seen from the forthcoming proof.
There are other statements commonly used in analysis, topology,
combinatorics,· · · that are equivalent to AC. We will list them without
proofs.
Theorem 12.6. Every vector space over any field F has a basis. In
particular (not equivalent to AC!), R has a basis (Hamel basis) as a
vector space over Q. The latter statement will be proved in section 17.
Remark 12.3. This theorem does not hold if one considers modules
instead of vector spaces: ∏_{n=1}^∞ Z is not a free Z-module, i.e. has no
basis as a module over Z.
Theorem 12.7. Every non-trivial ring with a unit has a maximal ideal.
Theorem 12.8. The closed unit ball of the dual of a normed vector
space over R has an extreme point.
Theorem 12.9. Any connected graph has a spanning tree.
12.2. Statements implied by AC but strictly weaker. The following are some important statements that are not implied by ZF but
can be proven in ZF+AC and do not imply AC.
• Any union of countably many countable sets is countable. (Compare with the statement in section 6! It is equivalent with ωAC.)
• An infinite set is Dedekind-infinite (equivalent to ωAC).
• The existence of (Lebesgue-) non-measurable subsets of R (follows from the Ultrafilter Lemma, see section 15).
• The Banach-Tarski paradox.
• Every subgroup of a free group is free.
• The additive groups of R and C are isomorphic.
• The Hahn-Banach theorem (Follows from the Ultrafilter Lemma):
Any (semi-)norm bounded linear functional on a linear subspace
U of a (semi-)normed vector space V over R or C has a linear
extension to all of V .
• The Ultrafilter Lemma (see section 15).
• Every set can be totally ordered (follows from the Ultrafilter
Lemma but is strictly weaker!)
• The Alaoglu theorem (equivalent to the Ultrafilter Lemma):
The closed unit ball of the dual of a normed vector space is
compact with respect to the weak* topology.
• The Baire Category Theorem (equivalent to the Axiom of dependent choices) : A non-empty complete metric space is not
the countable union of nowhere-dense closed sets.
• The open mapping and closed graph theorems (rely on the Baire
Category Theorem).
• Every infinite-dimensional topological vector space has a discontinuous linear functional.
• Every Tychonov space has a Stone-Čech compactification (Relies on the Ultrafilter Lemma).
Another interesting statement is the following
Theorem 12.10 (Ulam). Under CH (or weaker) there is no finite, σ-additive, non-atomic measure such that any subset of R is measurable.
There are also stronger axioms that imply AC. The most familiar one
is the generalized continuum hypothesis (GCH). On the other hand, CH
is independent of ZFC (Cohen).
13. Ordinal numbers
Lemma 13.1. The collection of all ordinals is a proper class.
Lemma 13.2. Given any sequence of infinite ordinals strictly less than
a given ordinal β ≥ ω_1, they are all prefixes of an ordinal β′ < β.
This proves that [0, ω_1) is not compact but sequentially compact.
14. Cardinals and alephs
Definition 14.1. The cardinal number of a well-ordered set is called
an Aleph: ℵ.
Definition 14.2. An ordinal number ωa is an initial ordinal for an ℵ
if ℵ = m(ωa ) and ∀ω with m(ω) ≥ ℵ follows ω ≥ ωa .
Lemma 14.1. The ℵ’s inherit a well-ordering from their initial ordinals.
Lemma 14.2 (Hartogs, Sierpinski). The following statement holds
without the use of AC. For any set M there is a set f(M) ∈ P(P(P(M)))
such that
• f(M) is well-ordered,
• m(f(M)) ≰ m(M),
• m(f(M)) ≤ 2^{2^{2^{m(M)}}}.
Proof.
• Let ℳ ∈ P(P(M)) be the subsets of the subsets of M that are
well-ordered with respect to ⊂.
• Let Φ ∈ P(P(P(M))) be the set of their equivalence classes with
respect to their order type. Φ is well-ordered itself (by the
magnitude of the order types of the equivalence classes) and
we set f(M) = Φ.
• By construction: f(M) is well-ordered and m(f(M)) ≤ 2^{2^{2^{m(M)}}}.
• Assume m(f(M)) ≤ m(M). Then Φ ∼ M_1 ⊂ M and that
would give a well-ordering to M_1 of the same type ω_Φ as Φ.
Now the collection ⋃_{α<ω_Φ} (M_1)_α of all the initial segments of
M_1 forms a well-ordered chain in ℳ and this chain has the same
type as Φ. We therefore get
⋃_{α<ω_Φ} (M_1)_α ≈ ω_Φ ≈ ω(M_1),
a contradiction.
Corollary 14.1. From this follows (without AC): There is no injective
net in any set (a map f : Ordinal numbers → M with f(α) = x_α,
α ranging over all ordinal numbers, and x_α ≠ x_β if α ≠ β). This gives an
alternative proof of Zorn's lemma (use AC to construct an injective net
if there is no maximal element). Roughly speaking: "There are more
ordinal numbers than any set can have elements."
Lemma 14.3 (Zermelo). Without AC: If A is an infinite well-ordered
set then
(m(A))^2 = m(A).
Proof. We have seen that ℵ_0^2 = ℵ_0. Assume that the statement is
wrong. Since the ℵ's are well-ordered there is a smallest one, ℵ_α, such
that ℵ_α^2 > ℵ_α and ℵ_β^2 = ℵ_β for all β < α. Let ω_α be its initial
ordinal. Let
P := {(µ, ν) : µ < ω_α, ν < ω_α},
and for λ < ω_α
P_λ := {(µ, ν) : µ + ν = λ}.
• P = ⋃_{λ<ω_α} P_λ: We need to show that
µ < ω_α, ν < ω_α ⟹ µ + ν < ω_α.
But this follows by the definition of α:
m(µ) + m(ν) ≤ m(λ)^2 = m(λ) < ℵ_α.
• We well-order P:
– µ < µ′ ⟹ (µ, ν) < (µ′, ν′)
– ν < ν′ ⟹ (µ, ν) < (µ, ν′).
• Obviously m(P_λ) ≤ m({µ : µ ≤ λ}) since for each µ ≤ λ there
is exactly one order type ν such that µ + ν = λ. Hence,
m(P_λ) ≤ m(λ + 1) = m(λ) < ℵ_α.
• For any given initial segment P_{(µ,ν)} of P there are at most
λ′ = µ + ν < ω_α "diagonals" P_λ not exceeding (µ, ν). Therefore
m(P_{(µ,ν)}) ≤ m(λ′ + 1)^2 = m(λ′)^2 = m(λ′) < ℵ_α.
Thus the order type of P is ≤ ω_α, but then ℵ_α^2 = m(P) ≤ ℵ_α,
contradicting the definition of α.
Lemma 14.4. If n ≥ ℵ_0 then n + n < 2^n (without AC!).
The statement is obviously true for 3 ≤ n ∈ N.
Proof. Let n ≥ ℵ_0 and m(N) = n. Since N ∋ x ↦ {x} ∈ P(N) and
N ∋ x ↦ N \ {x} ∈ P(N) are bijections onto their images, and
N \ {x} ≠ {y}, we have n + n ≤ 2^n. We also have:
n + n = 2 · n ≤ n^2.
Hence it suffices to show that n^2 ≱ m(P(N)). Assume there is an
injection
g : P(N) ↪ N × N.
Let Y ⊂ N be an infinite subset with a well-ordering ≺R .
• By Lemma 14.3 there is a bijection
h = h_{(Y,≺_R)} : Y → Y × Y.
• Let
A = A_{(Y,≺_R)} := {y ∈ Y : g^{-1}(h(y)) is defined and y ∉ g^{-1}(h(y))}.
If g(A) ∈ Y × Y then there is a y_0 ∈ Y such that h(y_0) = g(A).
But then the question "y_0 ∈ A or y_0 ∉ A?" leads to a contradiction
(diagonal argument as by Russell or Cantor). Hence,
• g(A) ∉ Y × Y.
• We define a map on infinite well-ordered subsets of N with
F((Y, ≺_R)) ∈ N \ Y    (1)
by
F((Y, ≺_R)) = π(g(A_{(Y,≺_R)}))
where π maps an element (x, z) ∈ (N × N) \ (Y × Y) to x if
x ∉ Y or to z otherwise.
• We will start with N ⊋ Y_0 ∼ ℕ. The latter equivalence is
possible since m(N) ≥ ℵ_0. So Y_0 is of the form
y_1 < y_2 < ··· < y_n < ···
We put F((Y_0, ≺_ℕ)) = x ∈ N \ Y_0 and for n ∈ ℕ we put
F(y_1 < ··· < y_n) = y_{n+1}.
• We call a set Y ⊂ N equipped with a well-ordering ≺_R a
g-well-ordered set if for any y ∈ Y
F({x ∈ Y : x ≺_R y}) = y.
• (Y_0, ≺_ℕ) is a g-well-ordered set.
• If (Y, ≺_R) and (Z, ≺_S) are both g-well-ordered sets then one is
an initial segment of the other. For:
– Since we can compare any two well-ordered sets, w.l.o.g.
there is an order-preserving injection i : (Y, ≺_R) → (Z, ≺_S)
mapping Y onto an initial segment of Z.
– We need to show that i = id|_Y. Assume not, and let x ∈ Y be
the least element such that i(x) ≠ x. Then
{y ∈ Y : y ≺_R x} = {z ∈ Z : z ≺_S i(x)}.
But by the g-well-ordering property
i(x) = F({z ∈ Z : z ≺_S i(x)}) = F({y ∈ Y : y ≺_R x}) = x,
a contradiction.
• Let now W be the union of all g-well-ordered sets. By the
preceding statement there is a (unique) g-well-ordering ≺_g on
W.
• F((W, ≺_g)) ∈ W, for otherwise W ∪ {F(W)} can be equipped
with a g-well-ordering ≺′_g by setting ≺′_g = ≺_g on W and letting F(W)
be larger than any element in W (this is a well-defined
g-well-ordering since F(W) ∈ N \ W). This contradicts the definition
of W.
• On the other hand by (1) we should have F((W, ≺_g)) ∈ N \ W,
a contradiction.
Theorem 14.1. Trichotomy for cardinals is equivalent to the axiom
of choice.
Proof.
• By the WOP we can well-order any two sets A, B. Then one
is order-equivalent to an initial segment of the other and the
order-equivalence gives a bijection of one set onto a subset of
the other.
• If one can compare a set A with any other set, one can do so,
in particular, with any well-ordered set. By Lemma 14.2, A
is comparable with f(A) and the possibility m(A) ≥ m(f(A))
is excluded. Hence, m(A) < m(f(A)) and we inherit on A a
well-ordering from f(A).
Theorem 14.2. The statement n^2 = n for all infinite cardinals n is
equivalent to the axiom of choice.
Proof. The AC is equivalent to the statement that any cardinal number
is an ℵ. Thus Lemma 14.3 shows
AC ⟹ n^2 = n ∀ infinite cardinals n.
Conversely, assume
n^2 = n, k^2 = k, (n + k)^2 = n + k.
Hence,
n + k = n^2 + 2n · k + k^2 = n + 2n · k + k ≥ n · k
and
n · k = (n_1 + 1) · (k_1 + 1) = n_1 · k_1 + n_1 + k_1 + 1 ≥ 1 + n_1 + k_1 + 1 ≥ n + k.
Therefore n + k = n · k and there are (pairwise disjoint) sets N_1, N_2,
K_1, K_2, N_1 ∼ N_2, K_1 ∼ K_2, m(N_i) = n, m(K_i) = k such that
N_1 × K_1 ∼ N_2 ∪ K_2. We will choose K_1 = f(N_1) from Lemma 14.2.
There are 2 possibilities:
• ∃n_1 ∈ N_1 such that ∀k_1 ∈ K_1 we have (n_1, k_1) ↦ n_2 ∈ N_2.
Then,
k = m(K_1) ≤ m(N_2) = n.
This possibility is excluded since k = m(f(N_1)) ≰ m(N_1) = n.
• ∀n_1 ∈ N_1 ∃k_1 ∈ K_1 such that (n_1, k_1) ↦ k_2 ∈ K_2. For fixed
n_1 ∈ N_1 we choose g(n_1) to be the smallest k ∈ K_1 (note
that K_1 is well-ordered) such that (n_1, k) ↦ k_2 ∈ K_2. We set
M = {(n_1, g(n_1)) : n_1 ∈ N_1} and get
n = m(N_1) = m(M) ≤ m(K_2) = k = m(K_1).
The latter means that N_1 is equivalent to an initial segment of
f(N_1) = K_1 and hence, well-ordered.
Lemma 14.5. The generalized continuum hypothesis implies:
If k + n = 2^n, n ≥ ℵ_0 then k = 2^n.
Proof. We have k ≤ 2^n and
n ≤ n + n ≤ 2^n + 2^n = 2^{n+1} = 2^n.
But n + n < 2^n and by GCH n + n = n.
Now 2^n · 2^n = 2^{n+n} = 2^n = n + k. Hence, there are sets N, K and
a bijection h : P(N) × P(N) → N ∪ K, m(N) = n, m(K) = k and
N ∩ K = ∅. Since n < 2^n there is a subset N_0 ⊂ N such that ∀M ⊂ N
we have (N_0, M) ∉ h^{-1}(N).
Therefore h|_{{N_0} × P(N)} ↪ K is an injection and
k = m(K) ≥ m(P(N)) = 2^n.
Theorem 14.3. The generalized continuum hypothesis implies the axiom of choice.
Proof. Let n be an arbitrary non-finite cardinal. We need to prove that
under GCH n is an Aleph, i.e. if m(N ) = n then N can be well-ordered
(for finite n there is nothing to prove!).
Since n is not finite the cardinal k = n + ℵ_0 ≥ ℵ_0 is non-finite too. If
k = m(K) and K can be well-ordered so can N itself since it corresponds
to a segment of K (i.e. k ≥ n).
We have
ℵ_0 ≤ k < 2^k < 2^{2^k} < 2^{2^{2^k}}.
Also
k + 2^k = 2^k, 2^k + 2^{2^k} = 2^{2^k}, 2^{2^k} + 2^{2^{2^k}} = 2^{2^{2^k}}.
By Lemma 14.2, m(f(K)) ≤ 2^{2^{2^k}} and therefore
m(f(K)) + 2^{2^k} ≤ 2^{2^{2^k}} + 2^{2^k} = 2^{2^{2^k}}.
If m(f(K)) + 2^{2^k} = 2^{2^{2^k}} then by Lemma 14.5:
m(f(K)) = 2^{2^{2^k}} > k ≥ n.
Since f(K) is well-ordered this induces a well-ordering on n.
Suppose 2^{2^k} ≤ m(f(K)) + 2^{2^k} < 2^{2^{2^k}}; then GCH implies
2^{2^k} = m(f(K)) + 2^{2^k}, 2^{2^k} ≥ m(f(K)).
Hence,
m(f(K)) + 2^k ≤ 2^{2^k} + 2^k = 2^{2^k}.
Again: If m(f(K)) + 2^k = 2^{2^k} then by Lemma 14.5:
m(f(K)) = 2^{2^k} > k ≥ n.
Again this induces a well-ordering on n.
Suppose 2^k ≤ m(f(K)) + 2^k < 2^{2^k}; then GCH implies
2^k = m(f(K)) + 2^k, 2^k ≥ m(f(K)).
Hence,
m(f(K)) + k ≤ 2^k + k = 2^k.
Again: If m(f(K)) + k = 2^k then by Lemma 14.5:
m(f(K)) = 2^k > k ≥ n.
Again this induces a well-ordering on n.
Finally we should have by GCH:
m(f(K)) + k = k thus m(f(K)) ≤ k,
a contradiction.
20
Proof of Ulam’s theorem 12.10.
Let $\mu$ be a finite, non-atomic, $\sigma$-additive measure on $\mathbb{R}$.
Since under CH and AC $c = m(\mathbb{R}) = \aleph_1$ is the first uncountable
Aleph there is a well-ordering $\prec$ on $\mathbb{R}$ such that for each $y \in \mathbb{R}$ the set
$A_y := \{x \in \mathbb{R} : x \prec y\}$ is at most countable. Choose 1-1 mappings
(AC) $f(\cdot, y) : A_y \to N_y \subset \mathbb{N}$. So $f$ is an integer-valued function defined
for all pairs $(x, y)$ with $y \in \mathbb{R}$, $x \prec y$. By definition
\[ x \prec x' \prec y \implies f(x, y) \ne f(x', y). \]
For $x \in \mathbb{R}$, $n \in \mathbb{N}$ we define
\[ F_x^n := \{y \in \mathbb{R} : x \prec y \text{ and } f(x, y) = n\}. \]
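We form a table
\[
\begin{array}{ccccc}
F_{x_1}^{1} & F_{x_2}^{1} & \cdots & F_{x}^{1} & \cdots \\
F_{x_1}^{2} & F_{x_2}^{2} & \cdots & F_{x}^{2} & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
F_{x_1}^{n} & F_{x_2}^{n} & \cdots & F_{x}^{n} & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots
\end{array}
\]
with $\aleph_0$ rows and $\aleph_1$ columns. It has the following properties.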
• The sets in a row are pairwise disjoint. Otherwise suppose
$y \in F_x^n \cap F_{x'}^n$ with $x \ne x'$. Then $x \prec y$, $x' \prec y$ and $f(x, y) = f(x', y) = n$.
Since $f(\cdot, y)$ is 1-1 it follows $x = x'$, a contradiction.
• $\bigcup_{n \in \mathbb{N}} F_x^n = \mathbb{R} \setminus X_x$ where $m(X_x) \le \aleph_0$, i.e. the union of each
column fills $\mathbb{R}$ up to a countable set. If $y \succ x$ then $y \in F_x^n$ for
some $n$, i.e. the one for which $f(x, y) = n$. Hence
\[ \mathbb{R} \setminus \bigcup_{n \in \mathbb{N}} F_x^n = \{y \in \mathbb{R} : y \preceq x\}. \]
The set on the right-hand side is countable by the definition of $\prec$.
Since $\mu(\mathbb{R}) < \infty$, in each row
\[ m(\{x \in \mathbb{R} : \mu(F_x^n) > 0\}) \le \aleph_0. \]
Therefore there are at most countably many sets of positive measure
in the entire table. Because there are $\aleph_1$ columns there is an $x_0 \in \mathbb{R}$ such
that $\mu(F_{x_0}^n) = 0$ for all $n \in \mathbb{N}$. Then by $\sigma$-additivity
\[ 0 = \mu\Big( \bigcup_{n} F_{x_0}^n \Big) = \mu(\mathbb{R} \setminus X_{x_0}) = \mu(\mathbb{R}). \]
The last equality is due to the assumption that $\mu$ is non-atomic and
$m(X_{x_0}) \le \aleph_0$.
Let
15. Ultrafilter
Definition 15.1 (Filter). A filter $F$ on a set $M$ is an element of
$P(P(M))$ such that for all $N, K \in P(M)$
• $M \in F$, but $\emptyset \notin F$.
• $N \in F$, $N \subset K \implies K \in F$.
• $K \in F$, $N \in F \implies N \cap K \in F$.
We can introduce a partial order on filters of a set M .
Definition 15.2. We say that the filter $F'$ is finer than the filter $F$
if for any $N \in F$ it follows $N \in F'$. We will write $F \subset F'$. This gives
a partial order on the set of all filters on a (fixed) set.
Definition 15.3. A non-empty family $(A_\alpha)_{\alpha \in I}$ of subsets $\emptyset \ne A_\alpha \in P(M)$
is said to have the finite intersection property if for any
finite subset $\{\alpha_1, \alpha_2, \dots, \alpha_n\} \in P(I)$, $n \in \mathbb{N}$, the intersection $\bigcap_{k=1}^{n} A_{\alpha_k} \ne \emptyset$.
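Definition 15.4. A non-empty family $(A_\alpha)_{\alpha \in I}$ of subsets $\emptyset \ne A_\alpha \in P(M)$
is said to have the centered finite intersection property if the following holds:
whenever $\bigcap_{\alpha \in J} A_\alpha = \emptyset$ for some subset $J \subset I$, there is a finite subset
$\{\alpha_1, \alpha_2, \dots, \alpha_n\} \in P(J)$, $n \in \mathbb{N}$, such that already the finite intersection
$\bigcap_{k=1}^{n} A_{\alpha_k} = \emptyset$.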
Lemma 15.1. Let $\mathcal{M} \in P(P(M))$ have the finite intersection property.
Then
\[ F := \Big\{ N \subset M : \exists M_1, M_2, \dots, M_n \in \mathcal{M} \text{ such that } \bigcap_{k=1}^{n} M_k \subset N \Big\} \]
is a filter on $M$. This is the smallest filter containing $\mathcal{M}$, and $\mathcal{M} $ is
called a filter basis.
Proof. Since $\emptyset \ne \bigcap_{k=1}^{n} M_k \cap \bigcap_{k=1}^{n'} M'_k \subset N \cap N' \subset M$ the axioms of a
filter are fulfilled. Any filter containing $\mathcal{M}$ must also contain any finite
intersection of elements of $\mathcal{M}$ and hence also all the supersets of these
intersections.
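On a finite set the construction of Lemma 15.1 can be carried out by brute force. The following is a minimal sketch, not part of the original notes; the helper names powerset, has_fip, generated_filter and is_filter are hypothetical and chosen only for illustration.

```python
from itertools import chain, combinations

def powerset(base):
    """All subsets of `base` as frozensets."""
    items = list(base)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def has_fip(family):
    """Finite intersection property: every non-empty subfamily meets."""
    family = list(family)
    return all(frozenset.intersection(*sub)
               for r in range(1, len(family) + 1)
               for sub in combinations(family, r))

def generated_filter(base, family):
    """Supersets (inside `base`) of finite intersections of members of `family`."""
    family = list(family)
    cores = [frozenset.intersection(*sub)
             for r in range(1, len(family) + 1)
             for sub in combinations(family, r)]
    return {n for n in powerset(base) if any(core <= n for core in cores)}

def is_filter(base, f):
    """Check the three axioms of Definition 15.1 on a finite base set."""
    base = frozenset(base)
    if base not in f or frozenset() in f:
        return False
    upward = all(k in f for n in f for k in powerset(base) if n <= k)
    meets = all((n & k) in f for n in f for k in f)
    return upward and meets

M = {1, 2, 3, 4}
family = [frozenset({1, 2, 3}), frozenset({2, 3, 4})]
F = generated_filter(M, family)
```

Here F consists of the supersets of {2, 3}, the intersection of the two generating sets.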
Example 15.1.
• The principal filter $p_x$ of an element $x \in M$ is the family
\[ p_x := \{N \subset M : x \in N\}. \]
• In a topological space $X$ the system $U_x$ of all neighborhoods of
a point $x$ is a filter, called the neighborhood filter
\[ U_x := \{Y \subset X : x \in U(x) \subset Y,\ U(x) \text{ open}\}. \]
• Let $M$ be an infinite set. The cofinite filter $F^{M}_{cofin}$ is defined
as
\[ F^{M}_{cofin} := \{N \subset M : \emptyset \ne N,\ m(M \setminus N) < \infty\}. \]
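A cofinite subset of N can be stored through its finite complement, which makes the filter axioms for the cofinite filter easy to see. The following is a minimal sketch under that assumption; the class name Cofinite and its field missing are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cofinite:
    """A cofinite subset of the natural numbers, stored via its finite complement."""
    missing: frozenset

    def __contains__(self, n):
        return n not in self.missing

    def __and__(self, other):
        # (N \ A) ∩ (N \ B) = N \ (A ∪ B), and A ∪ B is again finite.
        return Cofinite(self.missing | other.missing)

    def __le__(self, other):
        # N \ A is a subset of N \ B exactly when B is a subset of A.
        return other.missing <= self.missing

a = Cofinite(frozenset({0, 1}))
b = Cofinite(frozenset({1, 5}))
c = a & b   # again cofinite: the intersection of two members stays in the filter
```

The intersection only enlarges the finite complement, so it never becomes empty on an infinite set.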
Definition 15.5 (Ultrafilter). A filter $F$ on $M$ is called an ultrafilter
if for any set $N \in P(M)$ either $N \in F$ or $(M \setminus N) \in F$.
Remark 15.1.
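• Any principal filter $p_x$ is an ultrafilter since $x \in N$, $x \in K \implies x \in N \cap K$,
and for any $N \subset M$ either $x \in N$ or $x \in M \setminus N$.
• Conversely, if for a filter $F$ we have $\{x\} \in F$ then $F = p_x$ because it must contain all supersets of $\{x\}$ and cannot contain their
complements.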
• An ultrafilter that is not principal (if it exists!) is called free.
• $F^{M}_{cofin}$ is an ultrafilter if and only if $M$ cannot be partitioned into
two disjoint infinite subsets, i.e. if $M = M_1 \cup M_2$, $M_1 \cap M_2 = \emptyset$
and $m(M_i) > n$ for all $n \in \mathbb{N}$, then $F^{M}_{cofin}$ is not an ultrafilter. The latter
is true since otherwise exactly one of the $M_i$ belongs to $F^{M}_{cofin}$, but
neither is cofinite. If there is no such partition of $M$ into two
infinite subsets then any set is either finite or its complement is
finite, i.e. any subset or its complement (if the set is a proper
subset) belongs to $F^{M}_{cofin}$.
– In ZF without choice there can be amorphous sets, that
means infinite sets such that each of their subsets is either
finite or cofinite.
– Amorphous sets cannot be totally ordered, for otherwise the
set
\[ A := \{x \in M : m(\{y \le x\}) > n \text{ and } m(\{x < z\}) > n,\ \forall n \in \mathbb{N}\} \ne \emptyset \]
and $M$ is the disjoint union of $\{y \le x\}$ and $\{x < z\}$.
– The ultrafilter lemma implies the existence of a total order
on any set. Hence there are no amorphous sets assuming
UFL.
• Note that exactly one of the two sets $N$ and $M \setminus N$ is contained
in an ultrafilter since $N \cap (M \setminus N) = \emptyset \notin F$.
• An ultrafilter $p$ can be interpreted as a finitely additive 0-1-measure on $M$, measuring all subsets. So the family of ultrafilters on a given set $M$ can be interpreted as the family of all
finitely additive 0-1-measures on $P(M)$.
• $N \cup K \in p$, $p$ is an ultrafilter $\implies N \in p$ or $K \in p$.
• $K \subset N$, $N \notin F \implies K \notin F$ for any filter $F$.
• Let $p$ be an ultrafilter. If $M = \bigcup_{k=1}^{n} N_k$, $N_k \cap N_l = \emptyset$, $k \ne l$,
then there is exactly one $1 \le k \le n$ such that $N_k \in p$ and
$N_l \notin p$, $l \ne k$.
• Any ultrafilter on a finite set is principal, since a finite set is
the finite union of its distinct elements.
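• If $p$ is a free ultrafilter on an infinite set $M$ (if it exists) and the
subset $N = \{x_1, \dots, x_n\} \in P(M)$ is a finite set then $N \notin p$.
For otherwise
\[ N = \bigcup_{k=1}^{n} \{x_k\}, \qquad \{x_k\} \cap \{x_l\} = \emptyset,\ k \ne l, \]
so $\{x_k\} \in p$ for some $k$ and $p = p_{x_k}$ would be principal.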
• An ultrafilter $p$ on $M$ is free if and only if it contains the cofinite
filter $F^{M}_{cofin}$.
For if $p$ is free and $N \in F^{M}_{cofin}$ then $m(M \setminus N) < \infty$, hence
$(M \setminus N) \notin p$ and $N \in p$. On the other hand (we may assume
that $M$ is an infinite set since otherwise the cofinite filter does
not exist) the cofinite set $M \setminus \{x\} \in F^{M}_{cofin}$ is not an element of
$p_x$.
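The remark that every ultrafilter on a finite set is principal can be confirmed by exhaustive search on a three-element set. This is only an illustrative sketch; the helper names powerset, is_filter, is_ultrafilter and principal are hypothetical.

```python
from itertools import chain, combinations

def powerset(base):
    """All subsets of `base` as frozensets."""
    items = list(base)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def is_filter(base, f):
    base = frozenset(base)
    if base not in f or frozenset() in f:
        return False
    upward = all(k in f for n in f for k in powerset(base) if n <= k)
    meets = all((n & k) in f for n in f for k in f)
    return upward and meets

def is_ultrafilter(base, f):
    """A filter containing exactly one of N and its complement, for every N."""
    base = frozenset(base)
    return is_filter(base, f) and all((n in f) != ((base - n) in f) for n in powerset(base))

def principal(base, x):
    """The principal filter p_x = {N : x in N}."""
    return frozenset(n for n in powerset(base) if x in n)

M = {0, 1, 2}
families = powerset(powerset(M))   # all 2^8 = 256 families of subsets of M
ultrafilters = {fam for fam in families if is_ultrafilter(M, fam)}
principals = {principal(M, x) for x in M}
```

The search finds exactly the three principal filters, one per point of M.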
The next lemma establishes an alternative definition of an ultrafilter.
Lemma 15.2. A filter $F$ on $M$ is an ultrafilter if and only if $F$ is
maximal, i.e. there is no filter $F' \ne F$ finer than $F$.
Proof.
• ($\implies$) Assume $F \subsetneq F'$ and $\emptyset \ne N \in F' \setminus F$. Since $F$ is an
ultrafilter we have $(M \setminus N) \in F \subset F'$ and this implies that
$N \in F'$ and $(M \setminus N) \in F'$, a contradiction.
• ($\impliedby$) Assume $F$ is not an ultrafilter but maximal. Hence,
there is a set $N$ such that neither $N$ nor $(M \setminus N)$ is contained
in $F$. Assume further there is a $K \in F$ with $K \subset M \setminus N$. But
then $(M \setminus N) \in F$ which is a contradiction to our assumption.
So every $K \in F$ intersects $N$, i.e. $B \cap N \ne \emptyset$ for all $B \in F$. So
the family $F \cup \{N\}$ has the finite intersection property and can
be extended to a filter containing $N$. So $F$ cannot be maximal.
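Lemma 15.2 can be checked mechanically on a small finite set by listing all filters and comparing the maximal ones with the ultrafilters. The sketch below encodes subsets of M = {0, 1, 2} as bitmasks; this encoding and the names FULL, SUBSETS, is_filter and is_ultrafilter are assumptions made for illustration.

```python
from itertools import combinations

N_POINTS = 3
FULL = (1 << N_POINTS) - 1      # bitmask of the whole set M = {0, 1, 2}
SUBSETS = range(FULL + 1)       # every subset of M as a bitmask

def is_filter(fam):
    if FULL not in fam or 0 in fam:
        return False
    upward = all(k in fam for n in fam for k in SUBSETS if (n & k) == n)
    meets = all((n & k) in fam for n in fam for k in fam)
    return upward and meets

def is_ultrafilter(fam):
    # FULL ^ n is the complement of n inside M.
    return is_filter(fam) and all((n in fam) != ((FULL ^ n) in fam) for n in SUBSETS)

# Enumerate all 2^8 families of subsets and keep the filters.
filters = [frozenset(fam)
           for r in range(len(SUBSETS) + 1)
           for fam in combinations(SUBSETS, r)
           if is_filter(frozenset(fam))]

# A filter is maximal if no other filter properly contains it.
maximal = [f for f in filters if not any(f < g for g in filters)]
ultra = [f for f in filters if is_ultrafilter(f)]
```

On this set there are seven filters (one for each non-empty subset), and the maximal ones coincide with the ultrafilters.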
The existence of free ultrafilters is not clear a priori. It cannot be
derived from ZF alone, and a free ultrafilter cannot be constructed
explicitly even when its existence is established. The next theorem shows that
AC implies the existence of free ultrafilters.
Theorem 15.1 (Ultrafilter Lemma I). AC implies: Any filter F on a
set M is a subset of an ultrafilter p on M . Note that the ultrafilter p
does not have to be unique!
Proof. Let $C$ be a chain (with respect to the "finer" ordering) of filters
on $M$ containing $F$. First we observe that $F_C := \bigcup_{G \in C} G$ is a
filter:
• $\emptyset \notin G$ $\forall G \in C \implies \emptyset \notin F_C$.
• $N \in F_C \implies \exists G \in C : N \in G$; then $N \subset K$ implies $K \in G$ and
therefore $K \in F_C$.
• $N, K \in F_C \implies \exists G \in C : N, K \in G$ (since $C$ is a chain); then $N \cap K \in G$
and therefore $N \cap K \in F_C$.
By Theorem 12.1 (equivalent to AC) $C \subset C_{max}$ where $C_{max}$ is a maximal
chain.
Therefore $F_{max} := \bigcup_{G \in C_{max}} G$ is itself a filter, finer than any
other in $C_{max}$ by maximality. So by Lemma 15.2 it is an ultrafilter
containing $F$.
An equivalent (seemingly stronger) statement is
Theorem 15.2 (Ultrafilter Lemma II). (AC implies:) Any family $\mathcal{M}$
of subsets of $M$ with the finite intersection property can be extended to
an ultrafilter.
Proof. Any filter has the finite intersection property and any family of
sets having the finite intersection property can be extended to a filter
by Lemma 15.1.
Remark 15.2. The equivalent Theorems 15.1 and 15.2 can be stated as
an axiom UF if one wants to avoid AC. This axiom is strictly weaker
than AC as will be indicated in the proof of Theorem 12.5.
Theorem 15.3. The Ultrafilter lemma implies (without AC) that any
set can be totally ordered.
Proof. We consider a strict partial order ”<” on a set M . It is clear
that this exists.
0) On any finite set $M$ we can extend the partial order "<" to a
total order "$<_M$": it is clearly true for $m(M) = 1$. Assume
we can extend the partial order for a set of $n$ elements. Let
$M' = M \cup \{x_{n+1}\}$, $m(M') = n + 1$. Let $y$ be the maximal (with
respect to $<_M$) element in $M$ such that $y < x_{n+1}$ and $z$ the
minimal element such that $x_{n+1} < z$. Then by compatibility of $<$
and $<_M$ we have $y <_M z$. Put for $w \in M$ either $w <_{M'} x_{n+1}$ if
$w <_M y$ or $w = y$, or $x_{n+1} <_{M'} w$ if $y <_M w$, and for all $u, w \in M$ let
$u <_{M'} w \iff u <_M w$. This gives the desired total order of
$M'$ and we conclude by the induction principle (for $\mathbb{N}$).
i) We consider the set of maps $S := \{f : G_f \subset M \times M \to \{0, 1\}\}$
such that
a) $(x, y), (y, z), (x, z) \in G_f$, $f(x, y) = f(y, z) = 1 \implies f(x, z) = 1$.
b) $(x, y), (y, x) \in G_f \implies (f(x, y) = 1 \iff f(y, x) = 0)$.
c) $(x, y) \in G_f$, $x < y \implies f(x, y) = 1$.
ii) For each finite subset $G \subset M \times M$, $m(G) < \infty$, there is an $f \in S$
with $G_f = G$. We just take $f$ to be the characteristic function
of the extended total order $<_G$ on $G$, i.e. $f(x, y) = 1$ if $x <_G y$
and $f(x, y) = 0$ otherwise.
iii) $f \in S \iff f|_G \in S$ for any finite subset $G \subset G_f$.
– ($\implies$) follows immediately from the definition of restriction.
– ($\impliedby$) Let $(x, y), (y, x), (y, z), (x, z) \in G_f$. We consider the following finite subsets of $G_f$: $G_a = \{(x, y), (y, z), (x, z)\}$, $G_b = \{(x, y), (y, x)\}$, $G_c = \{(x, y)\}$. So a), b), c) follow from
applying the restriction properties to $G_a$, $G_b$ and $G_c$, respectively.
iv) There is an element $f_0 \in S$ such that $G_{f_0} = M \times M$.
– For finite $F, G \subset M \times M$ let $S_F = \{f \in S : F \subset G_f\}$.
Then $S_F \cap S_G = S_{F \cup G} \ne \emptyset$: For each $f \in S_{F \cup G}$ we have
$F \cup G \subset G_f$, hence $F \subset G_f$ and $G \subset G_f$ and $f \in S_F$ and $f \in S_G$.
Conversely, $f \in S_F \cap S_G$ implies $F \subset G_f$ and $G \subset G_f$. Thus
$F \cup G \subset G_f$.
– This means $\{S_G : G \subset M \times M,\ m(G) < \infty\}$ has the finite
intersection property. Hence there is an ultrafilter $p$ on $S$
that includes $\{S_G : G \subset M \times M,\ m(G) < \infty\}$.
– For $(x, y) \in M \times M$ we have
\[ p \ni S_{(x,y)} = \{f \in S_{(x,y)} : f(x, y) = 0\} \cup \{f \in S_{(x,y)} : f(x, y) = 1\}. \]
Hence exactly one of the sets $\{f \in S_{(x,y)} : f(x, y) = i\}$, $i = 0, 1$,
is contained in $p$. We define $i_{(x,y)}$ equal to 0 or 1 so
that $\{f \in S_{(x,y)} : f(x, y) = i_{(x,y)}\} \in p$.
– Let $G \subset M \times M$, $m(G) < \infty$. Then the finite intersection
\[ \bigcap_{(x,y) \in G} \{f \in S_{(x,y)} : f(x, y) = i_{(x,y)}\} \in p. \]
Thus $\bigcap_{(x,y) \in G} \{f \in S_{(x,y)} : f(x, y) = i_{(x,y)}\} \ne \emptyset$. Choose any
$f_G$ in this intersection. So $f_G \in S$,
$G \subset G_{f_G}$ and $f_G|_G \in S$. Define $f_0(x, y) := i_{(x,y)}$ on $M \times M$; then
$f_0|_G = f_G|_G \in S$. Since this holds for any finite
subset $G \subset M \times M$, by iii) $f_0 \in S$ and $G_{f_0} = M \times M$.
v) Define $(x <_M y) \iff f_0(x, y) = 1$. By the preceding arguments this will give a compatible total order on $M$.
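The induction of step 0) is effectively an insertion procedure: each new element is placed directly after the last element of the current total order that lies below it in the partial order. The following is a minimal sketch of that finite step only (no ultrafilters involved); the function names transitive_closure and linear_extension and the sample data are hypothetical.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing `pairs`; (a, b) means a < b."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def linear_extension(elements, pairs):
    """Insert the elements one at a time, following the induction in step 0)."""
    less = transitive_closure(pairs)
    if any(a == b for (a, b) in less):
        raise ValueError("relation is not a strict partial order")
    order = []   # the current total order, smallest element first
    for x in elements:
        # Indices of already placed elements y with y < x in the partial order.
        below = [i for i, y in enumerate(order) if (y, x) in less]
        pos = max(below) + 1 if below else 0
        order.insert(pos, x)
    return order

elements = ["a", "b", "c", "d", "e"]
pairs = {("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")}
order = linear_extension(elements, pairs)
index = {x: i for i, x in enumerate(order)}
```

Every element above x in the partial order is also above the chosen predecessor y by transitivity, so it already sits after the insertion position.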
Theorem 15.4. The Ultrafilter Lemma implies the existence of nonmeasurable sets.
Proof.
• To each $x \in \mathbb{R} \cap [0, 1]$ we associate its binary expansion containing infinitely many zeros, i.e. $x \mapsto (x_n)_{n \in \mathbb{N}}$, $x_n \in \{0, 1\}$ $\forall n \in \mathbb{N}$.
• We consider the sequence $(X_n)_{n \in \mathbb{N}}$ of independent (with respect
to Lebesgue measure $\mathcal{L}$) random variables
\[ X_n(x) := x_n. \]
• We define an equivalence relation $\sim$ on $[0, 1]$:
\[ x \sim y \iff m(\{n \in \mathbb{N} : x_n \ne y_n\}) < \infty. \]
• A subset $A \subset [0, 1]$ is called a tail set if $A$ respects $\sim$, i.e. if
\[ \forall y \sim x: \quad x \in A \iff y \in A. \]
• Kolmogorov's 0-1-law states: If $A$ is a (Lebesgue) measurable
tail event then
\[ \mathcal{L}(A)^2 = \mathcal{L}(A), \text{ i.e. } \mathcal{L}(A) = 0 \text{ or } \mathcal{L}(A) = 1. \]
• The inversion $j : [0, 1] \to [0, 1]$ defined by $x_n \mapsto x_n + 1 \pmod 2$
leaves the Lebesgue measure invariant, i.e.
\[ \mathcal{L}(B) = \mathcal{L}(j^{-1}(B)) \quad \forall \text{ measurable } B \subset [0, 1]. \]
• Let $p$ be a free ultrafilter on $\mathbb{N}$. We define
\[ A_p := \{x \in [0, 1] : \{n \in \mathbb{N} : x_n = 1\} \in p\}. \]
• $A_p$ is a tail set: Assume $x \in A_p$ and $y \sim x$. Then
\[ p \ni \{n \in \mathbb{N} : x_n = 1\} \subset \{n \in \mathbb{N} : y_n = 1\} \cup \{n \in \mathbb{N} : x_n \ne y_n\}, \]
so the union on the right-hand side belongs to $p$. The set
$\{n \in \mathbb{N} : x_n \ne y_n\}$ is finite and hence does not belong to the free ultrafilter $p$. Hence $\{n \in \mathbb{N} : y_n = 1\} \in p$.
• $j(A_p) = [0, 1] \setminus A_p$: First we note that since $\mathbb{N} \in p$ the sequence
$(1)_{n \in \mathbb{N}} \in A_p$. Moreover for all $x \in [0, 1]$ with $j(x) = y$ we have $x_n = 1 \iff y_n = 0$. Hence, we have the disjoint union
\[ \mathbb{N} = \{n \in \mathbb{N} : x_n = 1\} \cup \{n \in \mathbb{N} : y_n = 1\}. \]
The latter implies that $x \in A_p$ if and only if $j(x) \notin A_p$.
• We conclude that $0 < \mathcal{L}(A_p) = \mathcal{L}([0, 1] \setminus A_p) = \tfrac{1}{2} < 1$ if
$A_p$ were measurable. This contradicts Kolmogorov's 0-1-law.
Hence, $A_p$ is not measurable.
16. Non-standard analysis
17. Ultrafilter in topology
Definition 17.1. A set $X$ is called a topological space if there is a
collection of sets $T \in P(P(X))$ with the properties
• $\emptyset \in T$ and $X \in T$.
• $U_\alpha \in T \implies \bigcup_\alpha U_\alpha \in T$.
• $U_i \in T$, $i = 1, \dots, n \implies \bigcap_{i=1}^{n} U_i \in T$.
The sets $U \in T$ are called open sets. Their complements are called
closed. A set $V \subset X$ with $x \in U \subset V$, $U \in T$ is called a neighborhood of $x$.
Definition 17.2. A topological space (X, T) is called a Hausdorff or
T2 -space if for all distinct x, y ∈ X there are open sets x ∈ U , y ∈ V
such that U ∩ V = ∅.
Definition 17.3. A subset $Y \subset X$ of a topological space is called compact if for every open cover $\bigcup_\alpha U_\alpha \supset Y$ there are indices $\alpha_1, \dots, \alpha_n$
such that $\bigcup_{i=1}^{n} U_{\alpha_i} \supset Y$, i.e. any open cover has a finite subcover.
Definition 17.4. A subset $Y \subset X$ of a topological space is called
sequentially compact if any sequence $(x_n)_{n \in \mathbb{N}}$, $x_n \in Y$, contains a
subsequence converging to a point of $Y$: $\lim_{k \to \infty} x_{n_k} = y \in Y$.
As we will see later these two definitions of compactness are independent.
Example 17.1.
• Let $X$ be any set. Then $T = \{\emptyset, X\}$ defines the trivial topology. If
$X$ contains at least two points this topology is not Hausdorff. Also any
subset is compact (there are only two open sets altogether) but only $\emptyset$
and $X$ are closed!
• Let $X$ be any set and $T = P(X)$ (this topology is called the discrete topology). This topology is Hausdorff. Any subset is at the
same time open and closed. The only compact sets are the finite subsets (cover an infinite set by its points; this open cover has no finite
subcover).
• Let $X$ be any infinite set and $T = F^{X}_{cofin} \cup \{\emptyset\}$ (this topology is called the
cofinite topology). This topology is not Hausdorff: any non-empty
$U, V \in T$ have a non-empty (infinite) intersection. Any subset is compact:
Let $Y \subset X$ and $\bigcup_\alpha U_\alpha \supset Y$ be an open cover, i.e. the $U_\alpha$ are cofinite.
Choose $U_{\alpha_0} \ne \emptyset$. Then $Y \setminus U_{\alpha_0} = \{x_1, \dots, x_n\}$ is a finite set. Now we
choose $U_{\alpha_i} \ni x_i$, $1 \le i \le n$, and $U_{\alpha_0}, U_{\alpha_1}, \dots, U_{\alpha_n}$ form a finite subcover.
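For a finite set the axioms of Definition 17.1 and the Hausdorff property of Definition 17.2 can be tested directly. The following is a minimal sketch with hypothetical helper names (powerset, is_topology, is_hausdorff); it covers only the trivial and discrete examples, since the cofinite topology on a finite set coincides with the discrete one.

```python
from itertools import chain, combinations

def powerset(base):
    """All subsets of `base` as a set of frozensets."""
    items = list(base)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(s) for s in subsets}

def is_topology(X, T):
    """On a finite set, closure under pairwise unions and intersections suffices."""
    X = frozenset(X)
    if frozenset() not in T or X not in T:
        return False
    return all((U | V) in T and (U & V) in T for U in T for V in T)

def is_hausdorff(X, T):
    """Distinct points have disjoint open neighborhoods."""
    return all(any(x in U and y in V and not (U & V) for U in T for V in T)
               for x in X for y in X if x != y)

X = {0, 1, 2}
trivial = {frozenset(), frozenset(X)}
discrete = powerset(X)
```

As stated in the example, the trivial topology on more than one point fails to be Hausdorff, while the discrete topology separates all points by singletons.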
Definition 17.5. Let $(X_i, T_i)$, $i = 1, 2$, be topological spaces and $f : X_1 \to X_2$
a mapping. $f$ is said to be continuous if
\[ V \in T_2 \implies f^{-1}(V) \in T_1. \]
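On finite spaces the preimage condition can be checked set by set. A minimal sketch, assuming the map is given as a dictionary; the names preimage and is_continuous and the sample spaces are hypothetical.

```python
def preimage(f, domain, V):
    """The set f^{-1}(V) for a map f given as a dict on `domain`."""
    return frozenset(x for x in domain if f[x] in V)

def is_continuous(f, X1, T1, T2):
    """Every open set of the target must have an open preimage."""
    return all(preimage(f, X1, V) in T1 for V in T2)

X1 = {0, 1}
X2 = {"a", "b"}
discrete1 = {frozenset(), frozenset({0}), frozenset({1}), frozenset(X1)}
trivial1 = {frozenset(), frozenset(X1)}
discrete2 = {frozenset(), frozenset({"a"}), frozenset({"b"}), frozenset(X2)}

f = {0: "a", 1: "b"}
```

The same map f is continuous from the discrete topology but not from the trivial one, because the preimage {0} of the open set {"a"} is not open in the trivial topology.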
Definition 17.6. A (semi-) group is called a topological (semi-)
group if it carries a Hausdorff topology such that the group operations
$g \mapsto hg$ and $g \mapsto gh$ are continuous. For groups we also ask that
$g \mapsto g^{-1}$ is continuous.
The following proposition can be found in any book on topology.
Proposition 17.1. $X$ is a separable metric space $\overset{(1)}{\implies}$ $X$ is a Hausdorff space $\overset{(2)}{\implies}$ all compact subsets of $X$ are closed $\overset{(3)}{\implies}$ the family
of compact subsets has the centered finite intersection property.
Proof.
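$\overset{(1)}{\implies}$ For $x \ne y$ in $X$ set $U(x) = \{z \in X : d(x, z) < \tfrac{1}{2} d(x, y)\}$ and
$U(y) = \{z \in X : d(y, z) < \tfrac{1}{2} d(x, y)\}$. Then $x \in U(x)$, $y \in U(y)$
and $U(x) \cap U(y) = \emptyset$.
$\overset{(2)}{\implies}$ Let $K \subset X$ be compact. If $K = X$ the statement is true since $X$
is closed by definition. So assume $K \subsetneq X$ and $x \in X \setminus K$. For
$y \in K$ let $U_y(x) \ni x$, $U_x(y) \ni y$ be open sets such that $U_x(y) \cap U_y(x) = \emptyset$.
Then $\bigcup_{y \in K} U_x(y)$ is an open cover of $K$ and we can extract a
finite subcover $\bigcup_{k=1}^{n} U_x(y_k) \supset K$. The set $U(x) = \bigcap_{k=1}^{n} U_{y_k}(x)$
is open and disjoint from $K$. Hence, $X \setminus K$ is open.
$\overset{(3)}{\implies}$ Let $K_\alpha$, $\alpha \in I$, be a family of compact sets with $\bigcap_{\alpha \in I} K_\alpha = \emptyset$.
Choose a non-empty compact set $K \in \{K_\alpha : \alpha \in I\}$. Since $K_\alpha$
is closed the sets $K \setminus K_\alpha$ are open in $K$ and
\[ \bigcup_{\alpha \in I} (K \setminus K_\alpha) = K \setminus \Big( \bigcap_{\alpha \in I} K_\alpha \Big) = K. \]
We extract a finite subcover $\bigcup_{i=1}^{n} (K \setminus K_{\alpha_i}) \supset K$. Since
\[ K \subset \bigcup_{i=1}^{n} (K \setminus K_{\alpha_i}) = K \setminus \bigcap_{i=1}^{n} K_{\alpha_i} \]
we conclude $K \cap \bigcap_{i=1}^{n} K_{\alpha_i} = \emptyset$.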
Theorem 17.1. A topological space $X$ is compact if and only if any
ultrafilter $p$ on $X$ converges.
Proof. Coming soon.
17.1. Stone–Čech compactification of N.
Lemma 17.1. βN is a compact semi-group, i.e. it is a compact Hausdorff space with continuous addition (and also multiplication).
Proof. Coming soon.
17.2. Proof of Tychonov’s Theorem.
Tychonov’s Theorem is equivalent to AC. Coming soon.
17.3. Some more statements from topology and analysis.
Example 17.2.
• $X = [1, \dots, \omega_1)$ equipped with the order topology is not compact
but sequentially compact:
– The open cover $X = \bigcup_{\alpha < \omega_1} [1, \dots, \alpha)$ has no countable
(and therefore no finite) subcover.
– Any sequence $(\alpha_n)_{n \in \mathbb{N}}$, $\alpha_n < \omega_1$, is by Lemma 13.2 contained
in some $[1, \dots, \beta')$ with $\beta' < \omega_1$ and hence has a least upper bound
$\gamma < \omega_1$. That $\gamma$ is the limit point for some subsequence
$(\alpha_{n_k})_{k \in \mathbb{N}}$.
• $\beta\mathbb{N}$ is compact but not sequentially compact.
– $\beta\mathbb{N}$ is compact by Lemma 17.1.
– Consider the sequence $(p_n)_{n \in \mathbb{N}}$ of principal ultrafilters corresponding to $n \in \mathbb{N}$. We need to show that it does not
contain a convergent subsequence $(p_{n_k})_{k \in \mathbb{N}}$. Let $p$ be an arbitrary point of $\beta\mathbb{N}$. Then by the properties of an ultrafilter
at least one of the two disjoint infinite sets
\[ S_e = \{n_2, \dots, n_{2k}, \dots\}, \qquad S_o = \{n_1, \dots, n_{2k+1}, \dots\} \]
does not belong to $p$, say $S_e$. Since $\beta\mathbb{N}$ is a Hausdorff space
there is a neighborhood $U(p)$ not containing the infinite
subsequence $(p_l)_{l \in S_e}$ and the subsequence $(p_{n_k})_{k \in \mathbb{N}}$ does not
converge to $p$, i.e. does not converge at all since $p$ was
arbitrary.
Theorem 17.2 (Hamel). There exists a Hamel basis in $\mathbb{R}$, i.e. a basis
of $\mathbb{R}$ as a vector space over $\mathbb{Q}$.
Proof. Coming soon.
Corollary 17.1. There are (discontinuous) non-linear solutions to
Cauchy's functional equation $f(x + y) = f(x) + f(y)$, $\forall x, y \in \mathbb{R}$.
Proof. Coming soon.
Theorem 17.3 (Ellis). Any compact topological semi-group contains
an idempotent element.
Proof. Coming soon.
Theorem 17.4. βN contains an idempotent element.
Proof. Coming soon.
18. Ultrafilter in infinite combinatorics and ergodic
theory
Theorem 18.1 (Furstenberg). Let f : X → X be a continuous map of
a compact metric space X. Then there is a minimal compact invariant
set K ⊂ X.
Proof. Coming soon.
Proposition 18.1 (Corollary to Theorem 17.2). There is a proper non-empty subset $A \subset \mathbb{T}^1 = \mathbb{R}/\mathbb{Z}$ which is periodic under every rotation,
i.e.
\[ \forall \alpha \in \mathbb{R}\ \ \exists n \in \mathbb{N}: \quad A + n\alpha \pmod 1 = A. \]
Proof. Coming soon.
Theorem 18.2 (Ramsey). Any complete infinite graph whose edges are finitely colored has a
complete infinite monochromatic subgraph.
Proof. Let the complete infinite graph
\[ (V, E = (V \times V)/((x, y) \sim (y, x)) \setminus \{(x, x) : x \in V\}) \]
be finitely colored, i.e. there is a function $f : E \to \{1, \dots, n\}$ for some
fixed $n \in \mathbb{N}$. Since $V$ is an infinite set, there is a free ultrafilter $p$ on
$V$. We set $V_i(x) := \{y \in V : f(x, y) = i\}$. We have $V \setminus \{x\} \in p$ and
\[ V \setminus \{x\} = \bigcup_{i=1}^{n} V_i(x) \]
and this union is disjoint. Next we define $g(x) = j$, where $j$ is the
unique index such that $V_j(x) \in p$. Let $G_i := \{x \in V : g(x) = i\}$; then
again
\[ V = \bigcup_{i=1}^{n} G_i \]
and this union is disjoint. Let $G_{i_0}$ be the unique set such that $G_{i_0} \in p$.
Choose $x_1 \in G_{i_0}$. Continue (weaker than AC) by choosing
\[ x_{m+1} \in G_{i_0} \cap \bigcap_{k=1}^{m} V_{i_0}(x_k) \ne \emptyset \quad \text{(finite intersection of elements in a filter)}. \]
The complete graph with vertices $x_k$, $k \in \mathbb{N}$, is monochromatic.
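The infinite statement cannot be tested by computer, but its smallest finite analogue can: every 2-coloring of the edges of the complete graph on 6 vertices contains a monochromatic triangle, while 5 vertices do not suffice. The sketch below is an exhaustive check of that finite fact, not of the theorem above; the function names and the pentagon coloring are introduced here for illustration.

```python
from itertools import combinations, product

def has_mono_triangle(n, color):
    """`color` maps each edge (i, j) with i < j of K_n to a color."""
    return any(color[(a, b)] == color[(a, c)] == color[(b, c)]
               for a, b, c in combinations(range(n), 3))

def every_coloring_has_mono_triangle(n, colors=2):
    """Try all colorings of the edges of K_n (2^15 of them for n = 6)."""
    edges = list(combinations(range(n), 2))
    return all(has_mono_triangle(n, dict(zip(edges, assignment)))
               for assignment in product(range(colors), repeat=len(edges)))

# Pentagon coloring of K_5: the edges of the 5-cycle get color 0, the diagonals color 1.
pentagon = {(i, j): 0 if (j - i) in (1, 4) else 1
            for i, j in combinations(range(5), 2)}
```

Both color classes of the pentagon coloring are 5-cycles, which contain no triangle, so 6 is the smallest number of vertices that forces a monochromatic triangle.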
Theorem 18.3 (Hindman). Any finite coloring of N has a monochromatic IP-subset.
Proof. Coming soon.