Lecture 3
The entropy method and implications of Keevash’s theorems
Zur Luria∗

∗ Institute of Theoretical Studies, ETH, 8092 Zurich, Switzerland. [email protected]. Research supported by Dr. Max Rössler, the Walter Haefner Foundation and the ETH Foundation.

1 Upper bounds: The entropy method
The entropy method is a very powerful method of obtaining bounds, usually upper bounds. First,
let us recall the notion of information entropy and some of its basic properties.
Let X be a random variable with a finite range. We will only be interested in X’s distribution,
and so we can really think of X as a finite collection {p1 , ..., pn } of nonnegative numbers that sum
to 1. The entropy of X is defined to be
H(X) = \sum_{i=1}^{n} p_i \log(1/p_i).
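As a quick aside (my own illustration, not part of the lecture), here is a minimal Python sketch of this definition; the helper name `entropy` is my choice, and the base of the logarithm only changes the unit (natural log gives nats, base 2 gives bits).

```python
import math

def entropy(p):
    """H(X) = sum_i p_i log(1/p_i) for a finite distribution p = [p_1, ..., p_n]."""
    return sum(pi * math.log(1.0 / pi) for pi in p if pi > 0)

# A biased distribution on 4 values versus the uniform one.
print(entropy([0.5, 0.25, 0.125, 0.125]))  # about 1.2130 (in nats)
print(entropy([0.25] * 4))                 # log(4) = 1.3863, the maximum for 4 values
```

The second value also illustrates property 2 below: the uniform distribution attains the maximum log(n).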
1. Intuitively, the entropy of X is the amount of information it encodes, measured in bits (when the logarithm is taken to base 2). There are theorems (Shannon's source coding theorem) that justify this interpretation, saying that the average number of bits needed to encode X's value is essentially H(X).
2. For any random variable X taking on n possible values there holds H(X) ≤ log(n), with
equality iff X is uniform.
3. Let X, Y be discrete random variables. The joint entropy of X and Y is just the entropy of the
pair (X, Y ) considered as a single random variable whose distribution is the joint distribution
of X and Y .
4. Let X, Y be discrete random variables. The conditional entropy of X given Y is

H(X|Y) = \sum_{y ∈ Range(Y)} Pr(Y = y) \sum_{x ∈ Range(X)} Pr(X = x | Y = y) \log(1/Pr(X = x | Y = y))
       = \sum_{y ∈ Range(Y)} Pr(Y = y) H(X | Y = y) = E_Y[H(X | Y = y)].
This quantity may be interpreted as the average amount of information that X gives us if we
already know Y .
5. The chain rule: For a sequence of random variables X_1, ..., X_n, we have

H(X_1, ..., X_n) = H(X_1) + H(X_2 | X_1) + ... + H(X_n | X_1, ..., X_{n−1}).
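To make the conditional entropy and the chain rule concrete, here is a small Python check (again my own illustration, not from the notes); it computes H(X), H(Y|X), and the joint entropy H(X, Y) for an explicit joint distribution and verifies the two-variable chain rule H(X, Y) = H(X) + H(Y|X). The distribution itself is an arbitrary choice.

```python
import math
from collections import defaultdict

def H(dist):
    """Entropy of a distribution given as a dict mapping values to probabilities."""
    return sum(p * math.log(1.0 / p) for p in dist.values() if p > 0)

# An explicit joint distribution of (X, Y) on {0, 1} x {0, 1, 2}.
joint = {(0, 0): 0.2, (0, 1): 0.1, (0, 2): 0.1,
         (1, 0): 0.05, (1, 1): 0.25, (1, 2): 0.3}

# Marginal of X, and H(Y|X) = sum_x Pr(X = x) * H(Y | X = x).
marg_x = defaultdict(float)
for (x, _y), p in joint.items():
    marg_x[x] += p

H_Y_given_X = sum(px * H({y: p / px for (x2, y), p in joint.items() if x2 == x})
                  for x, px in marg_x.items())

print(H(joint))                       # H(X, Y)
print(H(dict(marg_x)) + H_Y_given_X)  # H(X) + H(Y|X) -- matches H(X, Y)
```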
For an entropy upper bound on the number of STSs, see [?].
We will illustrate the entropy method by proving an upper bound on the number of Sudoku squares which, as far as I know, is new.
Theorem 1.1. For a square N = n^2, let S_N denote the number of order-N Sudoku squares. Then

S_N ≤ \left((1 + o(1)) \frac{N}{e^3}\right)^{N^2}.
Proof. Fix N = n^2, and let X be an order-N Sudoku square chosen uniformly at random. Then H(X) = log(SN). Let Xi,j denote the entry of X in cell (i, j). Using the chain rule, for any ordering of the Xi,j's we have

H(X) = \sum_{i,j} E[H(X_{i,j} | values for the previous variables)].
Assume now that we choose a random ordering of the variables by choosing xi,j ∼ U[0, 1] independently for each (i, j), and ordering the variables in order of decreasing xi,j. Let x = (xi,j)i,j, and observe that this is, of course, nothing but a uniformly random ordering, but it is convenient to use these xi,j's.
A value s is unavailable for Xi,j given previously observed variables if we already observed a
variable in the same column, row or box whose value was s. Let Ni,j denote the number of values
that are available for Xi,j given the previous variables. Then we have
H(X) ≤ E_x\Big[ \sum_{i,j} E[H(X_{i,j} | values for the previous variables)] \Big]
     ≤ E_X\Big[ \sum_{i,j} E_x[\log(N_{i,j})] \Big]
     ≤ E_X\Big[ \sum_{i,j} E_{x_{i,j}}\big[ \log\big( E_{x_{k,l} : (k,l) ≠ (i,j)}[N_{i,j}] \big) \big] \Big].

Here the second inequality uses property 2 above (an entropy is at most the log of the number of possible values), and the last inequality is Jensen's inequality, using the concavity of the logarithm.
So let us calculate the inner expectation. The situation is that X, the Sudoku square, is fixed, and xi,j, the random number given to (i, j), is also fixed, and we are calculating the expectation of Ni,j over the choices of xk,l for (k, l) ≠ (i, j).
Using linearity of expectation,

E_{x_{k,l} : (k,l) ≠ (i,j)}[N_{i,j}] = \sum_{s=1}^{N} Pr(s is available for X_{i,j} given the previously seen variables | X, x_{i,j}).
There are three cases for s, depending on X.
• If Xi,j = s then s is clearly available for Xi,j no matter which variables were seen before.
• If Xi,j ≠ s and Xk,l = s for some (k, l) that lies in the same box as (i, j) and also in the same row or column as (i, j), then there are exactly two variables that can rule out s, since the occurrence of s in the box coincides with its occurrence in the row (or column) of (i, j). The value s remains available if Xi,j precedes them both, which means that their x-values are smaller than xi,j; this happens with probability x_{i,j}^2. There are always 2n − 2 such values s.
• Otherwise, there are exactly three variables that can rule out s: the s-valued variables in the same row, column and box as (i, j). There are N − 2n + 1 = (n − 1)^2 such values s, and the probability that such an s is available is x_{i,j}^3, similarly to the previous case.
Therefore

E_{x_{k,l} : (k,l) ≠ (i,j)}[N_{i,j}] = 1 + (2n − 2)x_{i,j}^2 + (n − 1)^2 x_{i,j}^3.
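Before plugging this in, here is a quick Monte Carlo sanity check of this formula (my own sketch, not part of the lecture) for n = 2, i.e. a 4 × 4 Sudoku square; the particular square, the cell (0, 0), and the value x_{i,j} = 0.7 are arbitrary choices for illustration.

```python
import random

# A valid 4x4 (n = 2) Sudoku square.
square = [[1, 2, 3, 4],
          [3, 4, 1, 2],
          [2, 1, 4, 3],
          [4, 3, 2, 1]]
n, N = 2, 4
i, j = 0, 0    # the fixed cell (i, j)
x_ij = 0.7     # the fixed value of x_{i,j}

def sample_N_ij():
    """One sample of N_{i,j}: draw x_{k,l} for every other cell and count the values
    not ruled out by cells observed before (i, j), i.e. cells with a larger x-value."""
    seen = set()
    for k in range(N):
        for l in range(N):
            if (k, l) == (i, j):
                continue
            observed_before = random.random() > x_ij
            same_line_or_box = (k == i or l == j or (k // n, l // n) == (i // n, j // n))
            if observed_before and same_line_or_box:
                seen.add(square[k][l])
    return sum(1 for s in range(1, N + 1) if s not in seen)

trials = 200000
empirical = sum(sample_N_ij() for _ in range(trials)) / trials
predicted = 1 + (2 * n - 2) * x_ij ** 2 + (n - 1) ** 2 * x_ij ** 3
print(empirical, predicted)  # both close to 1 + 2(0.7)^2 + (0.7)^3 = 2.323
```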
Plugging this in, we get

H(X) ≤ N^2 · \int_0^1 \log(1 + 2(n − 1)x^2 + (n − 1)^2 x^3)\,dx = N^2 · (\log(N) − 3 + o(1)).

The last equality isn't obvious at all, but Mathematica solves the integral in about ten seconds. Heuristically, for all but very small x and large n the integrand is close to \log((n − 1)^2 x^3) = \log(N) + 3\log(x) + o(1), and \int_0^1 3\log(x)\,dx = −3. Thus,
S_N ≤ \exp(N^2 · (\log(N) − 3 + o(1))) = \left((1 + o(1)) \frac{N}{e^3}\right)^{N^2}.
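Since the asymptotics of this integral are the one non-elementary step, here is a short numerical check (my own, using scipy rather than Mathematica) that \int_0^1 \log(1 + 2(n − 1)x^2 + (n − 1)^2 x^3)\,dx indeed approaches log(N) − 3 as n grows.

```python
import math
from scipy.integrate import quad  # requires scipy

def proof_integral(n):
    """Numerically evaluate the integral appearing in the proof for a given n."""
    f = lambda x: math.log(1 + 2 * (n - 1) * x ** 2 + (n - 1) ** 2 * x ** 3)
    value, _abserr = quad(f, 0, 1)
    return value

for n in (10, 100, 1000, 10000):
    N = n * n
    print(n, round(proof_integral(n), 4), round(math.log(N) - 3, 4))
# The gap between the two columns shrinks as n grows, consistent with log(N) - 3 + o(1).
```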
This proof can be adapted to give upper bounds on many different types of combinatorial objects,
including designs, high dimensional permutations, Latin transversals, and various generalizations of
these objects.
2 Immediate consequences of Keevash’s result
Before we delve into Keevash’s proof, there are some interesting implications that can be inferred, using Theorem ?? and the lower bound on the number of Steiner triple systems as black boxes.
• Typical designs: One natural direction to take in the study of designs is to investigate the
properties of a uniformly random design. For random regular graphs this has been a fertile area,
and what was missing for designs until now was good estimates on their number. Here are a
couple of examples.
Theorem 2.1. There exists ε > 0 such that for every set of triples B ⊆ \binom{[n]}{3} with |B| ≥ n^{3−ε}, with high probability an order-n STS chosen uniformly at random contains an element of B.
Proof. It is straightforward to get an entropy upper bound on the number of order-n STSs that avoid B. Dividing this bound by Keevash’s lower bound gives the result.
Theorem 2.2. With high probability an order-n STS chosen uniformly at random doesn’t
contain an order-u STS for u ≥ n/5.
Proof. First, we obtain an entropy upper bound for the number of STSs containing a smaller
STS on a fixed set U ⊂ [n] of size u for some u ≥ n/5, and then divide by Keevash’s lower
bound. This is an upper bound on the probability that U is a sub-STS of the random STS.
We get the result via a union bound over all such sets U .
• Another application of Keevash’s theorem is the construction of designs (and related objects)
with desirable properties. If the property is monotone (i.e., adding edges can only help), then all we have to do is to construct a partial STS that has the desired property and make sure that the remaining uncovered edges satisfy the requirements of Theorem [?]. In this manner,
we can prove results such as the following.
Theorem 2.3.
– There exists a constant M > 0 such that for every large enough n there is an order-n STS that contains a triple t ∈ \binom{S}{3} for every S ⊆ [n] whose size is at least M(n log n)^{1/2}.
– There exists a constant M′ > 0 such that for every large enough n there is an order-n Latin square that contains an element (a, b, c) ∈ A × B × C for every A, B, C ⊆ [n] such that |A||B||C| ≥ M′n^2.
Proof. We can prove these kinds of statements by analyzing the random greedy process, and
showing that with high probability the triangles chosen by the random greedy process already
have this property. Let us prove the first item.
Fix S ⊆ [n] and let s = |S|. The number of triples in \binom{S}{3} is about s^3/6, while the total number of triples is about n^3/6, and so the probability that the first triangle is in S is about s^3/n^3. Let Gi denote the graph consisting of all of the uncovered edges before the i-th step of the greedy algorithm, and let Ai denote the event that the triangle chosen in the i-th step isn't in S.
Let 0 < α < 1/6 be a constant whose value will be determined later. The probability that no triangle of S is chosen during the first αn^2 steps of the greedy algorithm is

Pr(A_1) · Pr(A_2 | A_1) · ... · Pr(A_{αn^2} | A_1, ..., A_{αn^2 − 1}).
Fix 1 ≤ k ≤ αn^2. Note that Pr(Ak | A1, ..., Ak−1) is one minus the number of triangles of S in Gk (which we shall denote by T(S, Gk)) divided by the total number of triangles in Gk, so

Pr(A_k | A_1, ..., A_{k−1}) ≤ 1 − \frac{T(S, G_k)}{\binom{n}{3}}.
It remains to give a lower bound on T(S, Gk). Since we know that previous steps did not choose a triangle in S, the only way a triangle in S can be eliminated by previous choices is if a previous triangle had an edge in S. Every such triangle can eliminate at most s triangles of S, and so if tk is the number of triangles chosen prior to the k-th step with an edge in S, then T(S, Gk) ≥ \binom{s}{3} − s · tk. The next step is to obtain an upper bound on tk.
The number of triangles in Gj is at least \binom{n}{3} − 3(j − 1)n. Since we are only considering the first αn^2 steps, this is at least \binom{n}{3} − 3αn^3, and so the probability that the j-th triangle has an edge in S is at most

\frac{\binom{s}{2} n}{\binom{n}{3} − 3αn^3} ≈ \frac{3s^2 n}{(1 − 18α)n^3}.
Therefore, we can upper bound ti by the sum of i independent Bernoulli random variables with p = 3s^2 n/((1 − 18α)n^3). Chernoff's bound tells us that

Pr\big(t_{αn^2} > 6αs^2/(1 − 18α)\big) ≤ \exp\left(− \frac{αs^2}{1 − 18α}\right)

(the threshold 6αs^2/(1 − 18α) is twice the mean αn^2 · p = 3αs^2/(1 − 18α) of this sum).
So whp this does not happen, and we have ti ≤ 6αs^2/(1 − 18α) for all 1 ≤ i ≤ αn^2, so

Pr(A_k | A_1, ..., A_{k−1}) ≤ 1 − \frac{\binom{s}{3} − 6αs^3/(1 − 18α)}{\binom{n}{3}} ≈ 1 − \frac{1 − 54α}{1 − 18α} · \frac{s^3}{n^3}.

So for an appropriately small constant α, we have Pr(Ak | A1, ..., Ak−1) ≤ 1 − s^3/(2n^3) (α = 0.01 suffices), and then the probability that no triangle of S is chosen during the first αn^2 steps of the greedy algorithm is at most

\exp\left(− \frac{αs^2}{1 − 18α}\right) + \left(1 − \frac{s^3}{2n^3}\right)^{αn^2} ≤ 2\exp(−αs^3/(2n)).
Using the union bound over all sets S whose size is at least \frac{2}{\sqrt{α}}\sqrt{n \log n} implies that whp there are no such sets that don't contain any chosen triangle.
For the second statement, recall that every STS is equivalent to a Latin square, so proving the
result for Latin squares reduces to proving a similar result for STSs.
There is a nice interpretation of this result. The chromatic index of an STS is the smallest number of colors needed to color the vertices such that there is no monochromatic triple. The previous theorem tells us that in Keevash’s STSs every color class must have size less than M(n log n)^{1/2}, and hence the chromatic index is at least Ω((n/\log n)^{1/2}).
References
[1] P. Keevash, The existence of designs, arXiv preprint arXiv:1401.3665 (2014).
[2] P. Keevash, Counting designs, arXiv preprint arXiv:1504.02909 (2015).