1 Cardinality and the Pigeonhole Principle

In this section we consider the basic problem of naive set theory: what it means for two sets to be the same size, or for one to be smaller than another, etc. The word we use for the size of a set A is the cardinality of A, denoted |A| or #A (the second being used more for finite sets). Exactly what kind of thing a cardinality is in general is not clear (the possible cardinalities are called cardinal numbers), but whatever they are we can certainly compare them. The definition is

Definition 1. Let A, B be sets. |A| = |B| if and only if there exists a bijection A → B. In this case we say[1] that A and B are equinumerous. If there exists an injection A → B we write |A| ≤ |B|. The symbols <, ≥, > are defined as usual from ≤.

[1] Or at least we could say. It's a fun word but not that common.

This is common sense: even if you can't count you can still check if, say, a bag of apples and a bag of oranges contain the same number of fruit by pairing each apple with an orange and vice versa. If you can pair up all the fruits it seems reasonable to assert the numbers were the same. If you run out of apples but still have oranges left then there were at least as many oranges as apples.

The reader may be confused as to why in the previous sentence we say "at least as many" instead of "more". For actual bags of fruit we would be justified in the stronger conclusion. Consider, though, an infinite set, say N. Certainly |N| = |N| since the identity function x ↦ x is bijective. The function x ↦ x + 1, while injective, is not surjective, and we can't very well conclude that N has more elements than itself.

In writing the symbols "=, ≤" one hopes that the usual properties of these symbols hold. Otherwise the notation would be more confusing than enlightening. Thus a first order of business is to show this is indeed true.

Theorem 1. Let A, B, C be sets. Then
1. (|A| = |B|) ∧ (|B| = |C|) ⟹ |A| = |C|
2. (|A| ≤ |B|) ∧ (|B| ≤ |C|) ⟹ |A| ≤ |C|
3. (|A| ≤ |B|) ∧ (|B| ≤ |A|) ⟹ |A| = |B|
4. (|A| ≤ |B|) ∨ (|B| ≤ |A|)

Proof.
1. (Note this is implied by the next two points. We give a separate proof since point 3 is hard.) If |A| = |B| then by definition there is a bijection f : A → B. Similarly if |B| = |C| there is a bijection g : B → C. But then g ∘ f : A → C is a bijection, so |A| = |C|.
2. Similarly, the composition of two injective functions is injective.
3. This is somewhat hard, and we omit the proof. It goes by the name of the Cantor-Schroeder-Bernstein theorem.
4. This assertion is not provable in ZF, and is in fact equivalent to the axiom of choice (Hartogs' theorem). We omit the proof.

Together we read these as saying that ≤ defines an order on the cardinal numbers. The first three points (which are true without the axiom of choice) show that ≤ gives a partial order. Again note that the first point is redundant.

One presumably already has an intuitive understanding of what a finite set is. Eccles defines a collection of standard sets, one for each finite cardinal:

Definition 2. $N_n = \{1, 2, 3, \ldots, n\} \subset N$. A set A is said to be finite if $|A| = |N_n|$ for some n. In this case we say that |A| = n.

Again, this is secretly a common sense definition, or at least one you've known since you were very young. When we want to know how many apples are in a bag and don't have our standardized bags of oranges[2] to compare them to, we simply point at the apples one by one while uttering the words "one, two, three", etc. If we ever exhaust the apple supply we then assert the apple set to be finite and declare the last number spoken to be the number of apples, having created a bijection between $N_k$ (where k is the last number spoken) and the apples.

Of interest to pedants, but perhaps still worth noting, is that we need to check this notion is well defined, i.e. that the "definition" actually is one. That is the content of this theorem:

Theorem 2. $|N_n| = |N_m|$ if and only if n = m.
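For small n and m, Theorem 2 and the matching statement for ≤ can be checked exhaustively by listing every function $N_n → N_m$. The following is only a sketch: the function names are made up for illustration, and a finite check like this is evidence for the theorem, not a proof of it.

```python
from itertools import product


def standard_set(n):
    """Return N_n = {1, ..., n} as a list (empty when n == 0)."""
    return list(range(1, n + 1))


def exists_injection(n, m):
    """Brute force: is there an injective function N_n -> N_m?"""
    domain, codomain = standard_set(n), standard_set(m)
    # Each tuple lists the images f(1), ..., f(n) of one function N_n -> N_m.
    for images in product(codomain, repeat=len(domain)):
        if len(set(images)) == len(images):  # no two inputs share an image
            return True
    return False


def exists_bijection(n, m):
    """Brute force: is there a bijective function N_n -> N_m?"""
    domain, codomain = standard_set(n), standard_set(m)
    for images in product(codomain, repeat=len(domain)):
        injective = len(set(images)) == len(images)
        surjective = set(images) == set(codomain)
        if injective and surjective:
            return True
    return False


def check_small_cases(limit):
    """Check |N_n| <= |N_m| iff n <= m, and |N_n| = |N_m| iff n = m, for n, m <= limit."""
    return all(
        exists_injection(n, m) == (n <= m) and exists_bijection(n, m) == (n == m)
        for n in range(limit + 1)
        for m in range(limit + 1)
    )


if __name__ == "__main__":
    print(check_small_cases(4))
```

The search space has $m^n$ functions, so this is only practical for very small sets.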
Eccles goes into a lot of detail proving Theorem 2. While it's important to recognize that this fact requires proof, I trust that the reader wholeheartedly believes it already. This is often used in conjunction with the following:

Proposition 1 (Pigeonhole Principle). Let A, B be sets with |A| > |B| and f : A → B be any function. Then there are two distinct elements $a_1, a_2 \in A$ with $f(a_1) = f(a_2)$.

Proof. By definition if |A| > |B| then |A| ≰ |B|. Again by definition this means there is no injection A → B. In particular f is not injective. But not being injective means precisely that two different elements of the domain have the same image, as desired.

This is a simple statement, but often gets used in surprising ways. In any problem involving sets you should be open to the possibility of using the principle.

Recall that a person is called inbred if the parents of one of their ancestors were related (i.e. two of their ancestors had a common ancestor). This is a pejorative term, since a high degree of inbreeding is associated with increased risk of genetic diseases.

[2] Don't leave home without them.

"Theorem" 1. You are inbred.

Proof. Assume not. Then you would have $2^k$ distinct ancestors k generations back, and the ball of your ancestors 4000 years ago would have a mass greater than the entire Earth (calculation omitted). Since your ancestors at the time were a strict subset of the things on Earth this is absurd.

A more precise example:

Example 1. Assume that people are either mutual friends or not (i.e. if I'm your friend then you are mine too). Then within every group of at least two people there are two people with the same number of friends within the group.

Proof. Let n denote the number of people in the group and label them with the numbers in $N_n$. We claim the function $f : N_n → \{0, \ldots, n − 1\}$, (person in group) ↦ (number of friends), cannot be surjective, because if it were then there would be one person who had n − 1 friends (so was friends with everyone) and one with no friends, a contradiction. But then |Im f| < n, so the pigeonhole principle applies to give the result.

Exercise 1. As above, assume people are either mutual friends or not. Find the minimum size of group such that there must be either a group of 3 people all of whom are friends with each other or a group of 3 none of whom are friends.

Another example:

Example 2. Assume that every point on the plane is assigned one of n colors (i.e. we're given a function $R × R → N_n$). Show that there exists a rectangle ("monochromatic rectangle") with all 4 corners the same color.

Proof. Consider a grid in the plane with n + 1 rows and $n^{n+1} + 1$ columns. We will find our rectangle within this grid. Observe there are $n^{n+1}$ possible colorings of each column of n + 1 points. By the pigeonhole principle there are then two columns colored identically. Again by the pigeonhole principle each of these columns has one color occurring in two positions (the same two positions in both, since the columns are colored identically), and these four points define a monochromatic rectangle.

The following question is possibly much harder, although maybe there's some trick I haven't thought of (tell me!)

Question 1. Same set-up as above, does there necessarily exist a monochromatic square?

For the next exercise recall that a sequence of real numbers $(a_i)$ is simply a list $(a_1, a_2, a_3, \ldots)$ (more formally a function N → R or $N_n$ → R). A sequence is increasing if $i > j ⟹ a_i > a_j$, and nondecreasing if $i > j ⟹ a_i ≥ a_j$. Decreasing and nonincreasing sequences are defined the same way. Such sequences are called monotonic or monotone increasing/decreasing/etc.

A subsequence of a sequence S is the sequence defined by some subset of the indices of S (in order). For example (1, 2, 2, 3) is a nondecreasing subsequence of (5, 1, 2, 7, 2, −4, 5, 3, 0).

Exercise 2.
1. Show that for any n every sufficiently long sequence of real numbers has a nondecreasing or nonincreasing subsequence of length n.
2. (Stolen from Nikita Selinger) Show that every sequence of length $n^2$ has such a subsequence.
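Before attempting a proof it can help to experiment. The sketch below computes the lengths of the longest nondecreasing and nonincreasing subsequences of a sequence, and checks the claim of part 2 over every sequence of length $n^2$ drawn from a small set of values. The function names are illustrative, and restricting to a small value set is an assumption made to keep the search feasible, so a passing check is evidence rather than a proof.

```python
from itertools import product


def longest_monotone(seq):
    """Return (longest nondecreasing, longest nonincreasing) subsequence lengths."""
    if not seq:
        return (0, 0)
    # up[i] / down[i]: longest nondecreasing / nonincreasing subsequence ending at index i.
    up = [1] * len(seq)
    down = [1] * len(seq)
    for i in range(len(seq)):
        for j in range(i):
            if seq[j] <= seq[i]:
                up[i] = max(up[i], up[j] + 1)
            if seq[j] >= seq[i]:
                down[i] = max(down[i], down[j] + 1)
    return (max(up), max(down))


def claim_holds(n, values):
    """Check every sequence of length n*n with entries from `values`."""
    for seq in product(values, repeat=n * n):
        up, down = longest_monotone(seq)
        if max(up, down) < n:
            return False  # found a counterexample
    return True


if __name__ == "__main__":
    print(longest_monotone((5, 1, 2, 7, 2, -4, 5, 3, 0)))
    print(claim_holds(3, range(3)))
```

The first call uses the example sequence from the text, whose nondecreasing subsequence (1, 2, 2, 3) has length 4.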
Hint for Exercise 2: For every element of the sequence, consider a pair (x, y) where x is the length of the longest nondecreasing subsequence starting at this element and y is the length of the longest nonincreasing subsequence starting at this element. Apply the Pigeonhole Principle.

Note that the second part is strictly stronger than the first. I recommend trying to solve the first part first, since this will give you some intuition as to what's going on. With the hint the second isn't really any harder, just very clever. The easy constructive proofs are no more correct but somehow more convincing.

Note that you can easily improve the bounds from the second part to other quadratic polynomials less than $n ↦ n^2$. Can you do better than quadratic bounds? Linear bounds can't work (exercise), but something in between might.

2 Counting: The sum rule, product rule, and inclusion exclusion

Herein we describe various techniques for counting the elements in finite sets. The most basic one is fairly obvious:

Proposition 2 (Sum Rule). If A and B are two finite sets and A ∩ B = ∅ (i.e. A and B are disjoint) then |A ∪ B| = |A| + |B|.

Proof. Let $f : A → N_{|A|}$ and $g : B → N_{|B|}$ be bijections. Then $h : A ∪ B → N_{|A|+|B|}$ defined by

$$h(x) = \begin{cases} f(x) & \text{if } x \in A \\ g(x) + |A| & \text{if } x \in B \end{cases}$$

is a bijection too.

By the obvious induction the same rule holds for any finite number of pairwise disjoint sets. Moreover if we have several sets that aren't disjoint we can still break them up further using intersections to get a collection of smaller disjoint sets, and apply the sum rule from there. This procedure is important enough that it has a name:

Proposition 3 (Inclusion-Exclusion Principle). Let $A_1, A_2$ be any two finite sets. Then $|A_1 ∪ A_2| = |A_1| + |A_2| − |A_1 ∩ A_2|$. More generally if $\{A_i\}_{i \in N_n}$ is a set of finite sets then

$$\left|\bigcup_{i \in N_n} A_i\right| = \sum_{I \subseteq N_n,\ I \neq \emptyset} (-1)^{|I|+1} \left|\bigcap_{i \in I} A_i\right|$$

Proof. We give only the 2 set case, with the general case being left as an exercise.
Observe that $U = (A_1 − A_1 ∩ A_2) ∪ (A_2 − A_1 ∩ A_2) ∪ (A_1 ∩ A_2)$ is a disjoint union. Moreover $A_1 = (A_1 − A_1 ∩ A_2) ∪ (A_1 ∩ A_2)$ and $A_2 = (A_2 − A_1 ∩ A_2) ∪ (A_1 ∩ A_2)$, so $U = A_1 ∪ A_2$. These unions are also disjoint, so by the sum rule $|A_i − A_1 ∩ A_2| = |A_i| − |A_1 ∩ A_2|$ for i = 1, 2. Applying the sum rule to the first expression for U gives $|U| = |A_1 − A_1 ∩ A_2| + |A_2 − A_1 ∩ A_2| + |A_1 ∩ A_2|$, and substituting in the previous result proves the claim.

Somehow the inclusion-exclusion principle is a very deep way of dealing with things, which I don't quite understand. The reader is encouraged to prove it as many ways as she can.

There is also a product rule:

Proposition 4 (Product Rule). For A, B finite sets |A × B| = |A||B|.

Proof. Let $f : A → N_{|A|}$ and $g : B → N_{|B|}$ be bijections. Then $h : A × B → N_{|A||B|}$ given by $h((a, b)) = |B| · (f(a) − 1) + g(b)$ is a bijection.

Again an obvious induction argument extends the result to any finite product of sets.

Exercise 3. A binary string is a finite word in the digits "0" and "1". For example "110100" and "000000" are both binary strings of length 6. For any n
1. Find the total number of binary strings of length n.
2. Find the total number of binary strings of length less than or equal to n (including the empty string ""). Write a closed form expression for this (i.e. without summing or giving a recurrence).

Exercise 4. Call a binary string golden if the substring "00" doesn't occur (i.e. the digit 0 is never repeated). Show that the number of golden strings of length n is $F_{n+2}$, where $F_i$ is the Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, ...

One often uses a twisted version of the product rule which is hard to state formally (which is probably the reason Eccles omits it) but so often used that we give an informal statement and examples.

Fact 1 (General Product Rule). Suppose the objects we wish to count can be determined by some procedure, where there are always $n_1$ choices for the first step, $n_2$ choices for the second, $n_3$ for the third, etc. for a total of m steps. Then assuming each sequence of choices leads to a distinct object there are $\prod_{i=1}^{m} n_i$ total objects[3].

[3] Notation: $\prod_{i=1}^{m} n_i = n_1 · n_2 \cdots n_m$

We can now count some classical combinatorial objects.

Definition 3. Let X be a finite set. A bijective function X → X is called a permutation of X.

Proposition 5. There are exactly n! permutations of $N_n$.

Proof. We use the general product rule, constructing all permutations step by step. In constructing a permutation $f : N_n → N_n$ there are exactly n choices for f(1) if the images of the other elements are still to be determined. Given f(1) there are n − 1 choices for f(2), since in order to be bijective we need f(1) ≠ f(2), all other possibilities being valid. Proceeding inductively there are n − i + 1 choices for f(i), all the way down to 1 choice for f(n), so the total number is $n · (n − 1) · (n − 2) \cdots 2 · 1 = n!$.

Note this proof is informal (since we didn't state the general product rule formally), but this type of argument is perfectly good for most purposes. Eccles gives a more formal proof, which you should read.

Permutations are often written in "two line notation". For example the permutation p of $N_5$ written

$$p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 5 & 4 & 1 \end{pmatrix}$$

takes 1 to 3, 2 to 2, 3 to 5, etc. Notice p(2) = 2, p(4) = 4. We say that p fixes 2 and 4.

Definition 4. A permutation is called a derangement if it doesn't fix any elements.

We briefly introduce the important binomial coefficients $\binom{n}{r}$.

Definition 5. $\binom{n}{r}$, read "n choose r", is defined as the number of subsets of $N_n$ of cardinality r. That is, $\binom{n}{r} = |\{S \in 2^{N_n} : |S| = r\}|$.

A few obvious properties:

Proposition 6.
1. $\binom{n}{0} = 1$
2. $\binom{n}{1} = n$
3. $\binom{n}{r} = \binom{n}{n-r}$
4. If r > 0 then $\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}$

Proof.
1. ∅ is the unique zero element set.
2. The set of one element subsets of $N_n$ is precisely $\{\{i\} : i \in N_n\}$.
3. The map from r element subsets to n − r element subsets given by $S ↦ S^c$ is clearly invertible (its inverse is just taking complements again), so bijective.
4. Any r element subset of $N_n$ either contains n or doesn't. There are $\binom{n-1}{r-1}$ of the first type (we need to choose r − 1 more elements from the n − 1 remaining) and $\binom{n-1}{r}$ of the second.

The proposition gives a handy way to calculate small binomial coefficients:

Pascal's Triangle
n = 0:         1
n = 1:       1   1
n = 2:     1   2   1
n = 3:   1   3   3   1
n = 4: 1   4   6   4   1

Each row gives the binomial coefficients for the corresponding n, and each entry is obtained by adding together the two entries immediately above it. To calculate large binomial coefficients we use a formula.

Proposition 7. $\binom{n}{r} = \frac{n!}{r!(n-r)!}$

Proof. We give two proofs, one conceptual and one using the induction for Pascal's Triangle.

In the first case, note that $\binom{n}{r}$ is the same as the number of ways to color r elements of $N_n$ red and the remaining n − r yellow. If we have r red tokens and n − r yellow tokens there are n! different ways to assign them one by one to the numbers in $N_n$ (n choices for the first, n − 1 for the second, etc.). But each coloring can arise in many ways: we don't care in which of the r! orders we put down the red tokens or which of the (n − r)! orders we put down the yellow. Hence the total number of distinct colorings is $\frac{n!}{r!(n-r)!}$.

Alternatively use induction on n. For n = 1 the result is clear. Assume then it holds for n = k. We calculate $\binom{k+1}{0} = 1 = \frac{(k+1)!}{0!(k+1)!}$, and similarly $\binom{k+1}{k+1} = 1 = \frac{(k+1)!}{(k+1)!\,0!}$, so we can assume 0 < r ≤ k. In this case

$$\begin{aligned}
\binom{k+1}{r} &= \binom{k}{r-1} + \binom{k}{r} \\
&= \frac{k!}{(r-1)!(k+1-r)!} + \frac{k!}{r!(k-r)!} && \text{by the inductive hypothesis} \\
&= \frac{r \cdot k! + (k+1-r) \cdot k!}{r!(k+1-r)!} && \text{find a common denominator} \\
&= \frac{(k+1)!}{r!(k+1-r)!}
\end{aligned}$$

so the result holds by induction.

Counting derangements is an interesting application of our techniques.

Theorem 3. There are

$$\binom{n}{0} n! - \binom{n}{1}(n-1)! + \binom{n}{2}(n-2)! - \cdots + (-1)^{n-1}\binom{n}{n-1} 1! + (-1)^{n}\binom{n}{n} 0! \qquad (1)$$

derangements of $N_n$. This can be written:

$$n!\left(1 - \frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{(-1)^{n-1}}{(n-1)!} + \frac{(-1)^{n}}{n!}\right) \qquad (2)$$

Proof.
We prove the first form, the second following from it by the expression for binomial coefficients given above. Denote the total number of derangements by !n. Let $F_\emptyset$ denote the set of all n! permutations of $N_n$, and for any subset $S ⊂ N_n$ let $F_S$ denote the set of permutations of $N_n$ fixing (at least) the elements of S, that is

$$F_S = \{p \in F_\emptyset \mid \forall i \in S,\ p(i) = i\}$$

Then since every non-derangement must fix at least one element we have

$$!n = n! - \left|\bigcup_{i \in N_n} F_{\{i\}}\right|$$

Noting that $\bigcap_{i \in S} F_{\{i\}} = F_S$, inclusion-exclusion gives

$$!n = \sum_{S \subseteq N_n} (-1)^{|S|} |F_S|$$

(Take a second look at this. What happened to the n!?) We can break up the sum by the cardinality of S:

$$!n = \sum_{i \in \{0, 1, \ldots, n\}} (-1)^{i} \sum_{S \subseteq N_n,\ |S| = i} |F_S|$$

Then we simply observe that $|F_S| = (n − |S|)!$ and that there are $\binom{n}{i}$ subsets S of each cardinality i, so each term of the outer sum is equal to the corresponding term of expression (1) and the result follows.

The alert reader will notice that the parenthesised sum in expression (2) is the first n + 1 terms of the usual series for $e^{-1}$, so we have the following cute result:

Corollary 1. The chance that a random permutation of $N_n$ is a derangement is approximately $1/e ≈ 37\%$.
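As a sanity check, the count from expression (1) can be compared against a direct enumeration of permutations, and the ratio !n / n! against 1/e. This is a minimal sketch with illustrative function names; the brute-force count visits all n! permutations, so it is only usable for small n.

```python
from itertools import permutations
from math import comb, e, factorial


def derangements_by_formula(n):
    """Expression (1): sum over i of (-1)^i * C(n, i) * (n - i)!."""
    return sum((-1) ** i * comb(n, i) * factorial(n - i) for i in range(n + 1))


def derangements_by_brute_force(n):
    """Count permutations p of N_n with p(i) != i for every i."""
    points = range(1, n + 1)
    return sum(
        1
        for p in permutations(points)
        if all(image != i for i, image in zip(points, p))
    )


if __name__ == "__main__":
    for n in range(1, 8):
        formula = derangements_by_formula(n)
        brute = derangements_by_brute_force(n)
        ratio = formula / factorial(n)
        print(n, formula, brute, round(ratio, 5), round(1 / e, 5))
```

The ratio settles close to 1/e very quickly, because the error in truncating the series for $e^{-1}$ after n + 1 terms is at most 1/(n + 1)!.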