CS–174 Combinatorics & Discrete Probability, Spring 2007
Quiz 1 Solutions
Note: These solutions are not necessarily model answers. Rather, they are designed to be tutorial in nature,
and sometimes contain a little more explanation than an ideal solution. Also, bear in mind that there may
be more than one correct solution. The maximum total number of points available is 92. For each incorrect
answer in the multiple choice problems, 1 point was deducted.
1.
(a) [3pts] Upon picking the first card, 48 out of the remaining 51 cards do not have the same numerical value as the first one. Continuing this line of argument, we obtain the required expression $\frac{48\cdot 44\cdot 40\cdot 36}{51\cdot 50\cdot 49\cdot 48}$.
(b) [3pts] There are $13\cdot\binom{4}{3}\cdot\binom{48}{2}$ hands in which exactly 3 cards are of the same value (13 ways to pick the value, $\binom{4}{3}$ ways to pick the 3 out of 4 cards of that value, and $\binom{48}{2}$ ways to pick the other two cards); similarly, there are $13\cdot 48$ hands in which exactly 4 cards are of the same value. These are mutually exclusive events, so the required probability is $\frac{13\left(\binom{4}{3}\binom{48}{2}+48\right)}{\binom{52}{5}}$.
Several students picked expressions such as $\frac{13\cdot 4\cdot 48\cdot 47}{\binom{52}{5}}$ instead, due to over-counting, etc. Please refer to the solutions to homework 1(b) for another example with similar calculations.
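As a sanity check, here is a short Monte Carlo sketch (not part of the original solutions). It assumes the events in question are "all five cards have distinct values" for (a) and "three or more cards share a value" for (b), as the calculations above suggest.

```python
import random
from collections import Counter
from math import comb

deck = [v for v in range(13) for _ in range(4)]   # 13 values x 4 suits
trials = 200_000
distinct = triples = 0
for _ in range(trials):
    hand = random.sample(deck, 5)                 # 5 cards without replacement
    top = max(Counter(hand).values())             # size of largest equal-value group
    distinct += (top == 1)
    triples += (top >= 3)

print(distinct / trials, 48 * 44 * 40 * 36 / (51 * 50 * 49 * 48))            # ~0.507
print(triples / trials, 13 * (comb(4, 3) * comb(48, 2) + 48) / comb(52, 5))  # ~0.0228
```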
2.
(a) [3pts] Let Y, Z be the r.v.'s for the number of heads and number of tails respectively. Clearly, Y (resp. Z) is a binomial r.v. with parameters n and p (resp. n and 1 − p). Then X = Y − Z, and E[X] = E[Y] − E[Z] = np − n(1 − p) = n(2p − 1).
(b) [3pts] We have X = Y − (n − Y) = 2Y − n. Hence, Var[X] = 4 Var[Y] = 4np(1 − p).
Note that, if Y and Z were independent, we would have X = Y − Z and thus Var[X] = Var[Y] + Var[Z] = 2np(1 − p). But Y, Z are clearly not independent, so this value is invalid.
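A quick simulation illustrates why the correct factor is 4np(1 − p) rather than 2np(1 − p); the parameter values below are arbitrary.

```python
import random

n, p, trials = 30, 0.3, 100_000
xs = []
for _ in range(trials):
    heads = sum(random.random() < p for _ in range(n))
    xs.append(2 * heads - n)                 # X = Y - Z = 2Y - n

mean = sum(xs) / trials
var = sum((x - mean) ** 2 for x in xs) / trials
print(mean, n * (2 * p - 1))                 # ~ -12
print(var, 4 * n * p * (1 - p))              # ~ 25.2, not 2np(1-p) = 12.6
```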
3.
(a) [3pts] The expected number of balls in the first bin is $n\cdot\frac{1}{2n} + n\cdot\frac{1}{n} = \frac{3}{2}$ (namely, 1/2 from the first phase and 1 from the second). The same applies to the second and third bins. By linearity of expectation, the expected number of balls in the first 3 bins is $3\cdot\frac{3}{2} = \frac{9}{2}$.
It appears from the scratch work that several students misread the question.
(b) [3pts] The probability that the first bin is empty after both phases is $(1-\frac{1}{2n})^n\cdot(1-\frac{1}{n})^n \sim e^{-1/2}\cdot e^{-1} = e^{-3/2}$. By linearity of expectation, the expected number of empty bins amongst the first 3 bins is therefore $3e^{-3/2}$.
Quite a few students did not know how to compute asymptotics.
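The following sketch simulates one plausible reading of the experiment (phase 1 throws n balls into 2n bins, phase 2 throws n more balls into the first n bins, each ball landing uniformly at random); both expectations above can be checked this way.

```python
import random
from math import exp

n, trials = 50, 20_000
balls_first3 = empties_first3 = 0
for _ in range(trials):
    bins = [0] * (2 * n)
    for _ in range(n):                        # phase 1: n balls into 2n bins
        bins[random.randrange(2 * n)] += 1
    for _ in range(n):                        # phase 2: n balls into the first n bins
        bins[random.randrange(n)] += 1
    balls_first3 += sum(bins[:3])
    empties_first3 += sum(b == 0 for b in bins[:3])

print(balls_first3 / trials, 9 / 2)               # ~4.5
print(empties_first3 / trials, 3 * exp(-3 / 2))   # ~0.67 (for large n)
```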
4.
(a) [3pts] Observe that
$$E[X \mid X_1 = 3] = E[X_1 + X_2 \mid X_1 = 3] = 3 + E[X_2 \mid X_1 = 3] = 3 + E[X_2] = 3 + \frac{7}{2} = \frac{13}{2}.$$
(b) [3pts] E[E[X | X1]] = E[X] = E[X1] + E[X2] = 7.
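Since E[X2] = 7/2 suggests X1, X2 are fair six-sided dice (an assumption here), both identities can be verified by exact enumeration:

```python
import itertools

outcomes = list(itertools.product(range(1, 7), repeat=2))     # all (X1, X2) pairs
cond = [x1 + x2 for x1, x2 in outcomes if x1 == 3]
print(sum(cond) / len(cond))                     # E[X | X1 = 3] = 6.5 = 13/2
print(sum(x1 + x2 for x1, x2 in outcomes) / 36)  # E[E[X | X1]] = E[X] = 7
```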
5.
(a) [3pts] By the same analysis as in the regular coupon collector problem, the expected number of boxes that need to be bought until n/2 different coupons are obtained is:
$$1 + \frac{n}{n-1} + \frac{n}{n-2} + \cdots + \frac{n}{n/2} = n\sum_{i=n/2}^{n}\frac{1}{i} = n(H_n - H_{n/2-1}) = n(\ln n - \ln(n/2) + O(1)) = n(\ln 2 + O(1)) = \Theta(n),$$
where as usual we write $H_n = \sum_{i=1}^{n}\frac{1}{i} = \ln n + O(1)$ for the harmonic series.
Many students got at least one of parts (a) and (b) wrong. Note that the expression for $H_n$ was covered in class and reviewed in homeworks 2 and 4 and again in section on Feb 21 (along with the trick for expressing partial sums as a difference).
(b) [3pts] Similarly, the expected number of boxes that need to be bought until $n - \sqrt{n}$ different coupons are obtained is:
$$n\sum_{i=\sqrt{n}}^{n}\frac{1}{i} = n(H_n - H_{\sqrt{n}-1}) = n(\ln n - \ln\sqrt{n} + O(1)) = n(\tfrac{1}{2}\ln n + O(1)) = \Theta(n\log n).$$
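Both formulas can be checked against a direct simulation of the coupon collector process; the choice n = 400 below is arbitrary.

```python
import random
from math import isqrt

def draws_until(n, target):
    """Draw uniformly from {0, ..., n-1} until `target` distinct values have appeared."""
    seen, draws = set(), 0
    while len(seen) < target:
        seen.add(random.randrange(n))
        draws += 1
    return draws

n, trials = 400, 2_000
H = [0.0] * (n + 1)
for i in range(1, n + 1):
    H[i] = H[i - 1] + 1 / i                      # harmonic numbers H_1..H_n

half = sum(draws_until(n, n // 2) for _ in range(trials)) / trials
most = sum(draws_until(n, n - isqrt(n)) for _ in range(trials)) / trials
print(half, n * (H[n] - H[n // 2 - 1]))          # both ~ n ln 2 ~ 277
print(most, n * (H[n] - H[isqrt(n) - 1]))        # both ~ (n/2) ln n ~ 1200
```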
6. [3pts] This is essentially the coupon-collecting problem, wherein the cereal box corresponds to a roll of the die and there are 6 different coupons (corresponding to the 6 different outcomes of the roll). The expected number of boxes that need to be bought until we collect 4 different coupons is $1 + \frac{6}{5} + \frac{6}{4} + \frac{6}{3} = \frac{57}{10}$.
Many students correctly guessed that the answer must be greater than 4 but incorrectly circled $\frac{33}{5}$.
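A quick Monte Carlo confirmation of the value 57/10:

```python
import random

trials = 200_000
total = 0
for _ in range(trials):
    seen, rolls = set(), 0
    while len(seen) < 4:                 # roll until 4 distinct faces appear
        seen.add(random.randrange(6))
        rolls += 1
    total += rolls
print(total / trials, 57 / 10)           # ~5.7
```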
7.
— Pr[X ≥ 6] ≤ 1/3: [1pt] must be true; by Chebyshev's inequality.
— Pr[X ≥ Y] < 1: [1pt] may not be true (e.g., consider X that assumes the value $3 - \sqrt{3} \approx 1.3$ w.p. 1/2 and $3 + \sqrt{3}$ w.p. 1/2).
— $E[X^2] = 12$: [1pt] true, since $E[X^2]$ = Var[X] + E[X]².
— Cov(X, Y) = 2/3: [1pt] false; Cov(X, Y) = 0 since X and Y are independent.
— $\mathrm{Var}[Y^2] = 2/9$: [1pt] true, since $Y^2 = Y$ (property of a 0-1 r.v.) and Var[Y] = 2/9.
— Var[X + 3Y] = 11/3: [1pt] false; by independence, Var[X + 3Y] = Var[X] + 9 Var[Y] = 3 + 2 = 5.
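The claims about X and Y can be verified exactly; here we assume Y is Bernoulli with Pr[Y = 1] = 1/3, one of the two choices consistent with Var[Y] = 2/9 (the other, 2/3, gives the same variance):

```python
from math import sqrt

# distributions as (value, probability) pairs
xs = [(3 - sqrt(3), 0.5), (3 + sqrt(3), 0.5)]    # E[X] = 3, Var[X] = 3
ys = [(0, 2 / 3), (1, 1 / 3)]                    # 0-1 r.v. with Var[Y] = 2/9

def mean(d): return sum(v * p for v, p in d)
def var(d):  return sum(v * v * p for v, p in d) - mean(d) ** 2

print(var(xs) + mean(xs) ** 2)     # E[X^2] = 12
print(var(xs) + 9 * var(ys))       # Var[X + 3Y] = 3 + 2 = 5 by independence
```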
8.
(a) [3pts] X is a binomial r.v. with parameters n and p = 1/2, so E[X] = np = n/2.
(b) [3pts] Similarly, Var[X] = np(1 − p) = n/4.
(c) [2pts] Consider two scenarios: in the first scenario we condition on the first ball being some particular value (green, say), and in the second scenario we condition on the ith ball being this same value. Then the
set of possibilities for the other n − 1 balls in the sample is exactly the same in both scenarios. Hence
Y1 and Yi have the same distribution. Another way to see this is to note that the probability of any
given sample without replacement is the same regardless of the order in which the balls are chosen.
A lot of people did a calculation here to show that Y2 (and in some cases even Y3 ) has the same
distribution as Y1 . Note that this is not a valid argument for general i.
(d) [3pts] $Y = \sum_{i=1}^{n} Y_i$, and by part (c) $E[Y_i] = E[Y_1] = 1/2$, so by linearity of expectation E[Y] = nE[Yi] = n/2.
(e) [6pts] Note that Var[Y] = E[Y²] − E[Y]², and $E[Y^2] = \sum_i E[Y_i^2] + \sum_{i\neq j} E[Y_i Y_j]$. By similar reasoning to part (c), we have that $E[Y_i Y_j] = E[Y_1 Y_2] = \Pr[\text{first two balls are green}] = \frac{1}{2}\cdot\frac{9}{19} = \frac{9}{38}$. Therefore, we get
$$\mathrm{Var}[Y] = nE[Y_1] + n(n-1)E[Y_1 Y_2] - E[Y]^2 = \frac{n}{2} + n(n-1)\cdot\frac{9}{38} - \left(\frac{n}{2}\right)^2 = \frac{n}{4} - \frac{n(n-1)}{76}.$$
[Note that this evaluates to 1/4 when n = 1 and to zero when n = 20, as expected.]
A lot of people missed this part. The most common problem was just to assume that $Y_i, Y_j$ are independent, so that $E[Y_i Y_j] = \frac{1}{2}\cdot\frac{1}{2} = \frac{1}{4}$. With this value, Var[Y] would just be the same as Var[X], which is n/4. But of course $Y_i, Y_j$ are not independent. Many people realized this, and realized that they needed to compute the value of $E[Y_i Y_j]$, but were unable to do so. The hint was intended to make you think along the same lines as part (c), so that you would come up with $E[Y_i Y_j] = E[Y_1 Y_2]$, which is easy to calculate.
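A simulation of the sampling experiment (assuming an urn of 20 balls, 10 of them green, as the calculation of Pr[first two balls are green] indicates) confirms the formula:

```python
import random

balls = [1] * 10 + [0] * 10                 # 10 green (1) and 10 yellow (0)
n, trials = 5, 200_000
ys = [sum(random.sample(balls, n)) for _ in range(trials)]  # without replacement

mean = sum(ys) / trials
var = sum((y - mean) ** 2 for y in ys) / trials
print(mean, n / 2)                          # ~2.5
print(var, n / 4 - n * (n - 1) / 76)        # ~0.987, below Var[X] = n/4 = 1.25
```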
(f) [2pts] It is clear from the answers to parts (b) and (e) that Var[Y] ≤ Var[X] always (with equality only when n = 1). Moreover, as the sample size n increases, the ratio Var[Y]/Var[X] = (20 − n)/19 decreases monotonically from 1 to 0. This is in line with the intuition that, in sampling without replacement, the proportion of the sample that is green becomes more concentrated around its expectation, because if the sample has (say) too many green balls, at the next step it is more likely to pick a yellow ball. (This is sometimes referred to as the "self-correcting" nature of sampling without replacement.) Ultimately, of course, when n = 20, the sample is guaranteed to have exactly the expected number of green balls! [Note that there is nothing in this problem that makes use of the fact that 50% of the balls are green; we would get analogous behavior for any other proportion (other than 0% or 100%).]
9. This problem was intended to be fairly straightforward as it is virtually the same as problem 4 on HW2.
On this note, we encourage students to read the homework solutions even for problems they solved on
the homework. Often the solutions present a cleaner approach to the problem, in addition to a clearer
presentation.
(a) [2pts] $X = \sum_{e\in E} X_e$.
(b) [4pts] Fix an edge e and one endpoint. The probability that the other endpoint is assigned a different color is 2/3. Hence, $E[X_e] = 2/3$, and by linearity of expectation, $E[X] = \sum_{e\in E} E[X_e] = \frac{2}{3}m$. Moreover, OPT ≤ m, so E[X] ≥ (2/3)·OPT.
We took off 1 point for claiming $E[X_e] = 2/3$ without proof.
(c) [3pts] If e, e′ do not share an endpoint, then $X_e, X_{e'}$ are clearly independent, so $E[X_e X_{e'}] = E[X_e]E[X_{e'}] = (2/3)^2 = 4/9$. If e, e′ share an endpoint, then e and e′ are well-colored iff the other endpoints of e, e′ are assigned different colors from the common vertex. Again we have $E[X_e X_{e'}] = \Pr[X_e = X_{e'} = 1] = (2/3)^2 = 4/9$.
Note that the above calculation actually shows that the r.v.'s $X_e$ and $X_{e'}$ are independent even when e, e′ share an endpoint. However, this is not obvious without doing the calculation. We penalized students who claimed independence without justifying it.
(d) [3pts] $E[X^2] = \sum_{e\in E} E[X_e^2] + \sum_{e\neq e'} E[X_e X_{e'}] = m\cdot\frac{2}{3} + m(m-1)\cdot\frac{4}{9} = \frac{4}{9}m^2 + \frac{2}{9}m$. Therefore, Var[X] = E[X²] − E[X]² = (2/9)m.
Quite a few students failed to compute the variance correctly. Several failed to expand the square of the sum of indicator r.v.'s correctly, and many also did not know how to write $\sum_e E[X_e^2]$ as $\sum_e E[X_e] = E[X] = \frac{2}{3}m$.
(e) [5pts] Since OPT ≤ m, we have p = Pr[X ≥ (5/9)·OPT] ≥ Pr[X ≥ (5/9)m]. We use Chebyshev's inequality to obtain an upper bound on the tail probability, namely:
$$\Pr\left[X < \tfrac{5}{9}m\right] \le \Pr\left[\left|X - \tfrac{2}{3}m\right| > \tfrac{1}{9}m\right] \le \frac{\mathrm{Var}[X]}{(m/9)^2} = \frac{\frac{2}{9}m}{(m/9)^2} = \frac{18}{m}.$$
Hence, p ≥ 1 − 18/m, i.e. p = 1 − O(1/m).
Many students failed to obtain full credit for this part. There is no credit for simply stating Chebyshev's inequality in its general form. We highlight several common mistakes:
– Several students applied Chebyshev directly to compute Pr[X ≥ (5/9)m] or Pr[X ≥ (5/9)·OPT]; that is incorrect: we always use Chebyshev (and Chernoff for that matter) for tail bounds, i.e., to obtain an upper bound for deviation from the mean.
– Several students simply manipulated inequality signs (flipping them at will, possibly applying Chebyshev to derive a lower bound) to obtain the desired expression.
– A few students were confused by the expression 1 − O(1/m); this is technically a lower bound because O(1/m) is technically an upper bound (we used similar notation on HW3).
– A few students did not point out explicitly the relationship between Pr[X ≥ (5/9)m] and Pr[X ≥ (5/9)·OPT], or claimed that they are equal; there was no penalty for this, but we emphasize that you should understand why Pr[X ≥ (5/9)·OPT] is at least as big.
– A few students tried to analyze Y = m − X, and did the calculations incorrectly. There does not appear to be any advantage to analyzing Y here; Chebyshev is applicable to both X and Y.
– Several students did calculations of the kind:
$$\Pr[X \ge \tfrac{5}{9}m] = 1 - \Pr[X < \tfrac{5}{9}m] \ge \cdots \ge 1 - \frac{\mathrm{Var}[X]}{(m/9)^2}.$$
Some carried this out correctly; some did not. We discourage doing calculations this way because it is difficult to keep track of the signs (for both you and us!).
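The moments from (b) and (d) and the bound from (e) can all be checked by simulating the random coloring; the choice of graph below (a cycle, so that the number of vertices equals the number of edges m) is arbitrary.

```python
import random

m = 60                                       # cycle graph: 60 vertices, m = 60 edges
edges = [(i, (i + 1) % m) for i in range(m)]
trials = 50_000
xs = []
for _ in range(trials):
    color = [random.randrange(3) for _ in range(m)]          # uniform 3-coloring
    xs.append(sum(color[u] != color[v] for u, v in edges))   # well-colored edges

mean = sum(xs) / trials
var = sum((x - mean) ** 2 for x in xs) / trials
tail = sum(x >= 5 * m / 9 for x in xs) / trials
print(mean, 2 * m / 3)                       # ~40
print(var, 2 * m / 9)                        # ~13.3
print(tail, 1 - 18 / m)                      # empirical p vs the bound 0.7
```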
(f) [2pts] The m random variables $X_e$ are not independent, so the Chernoff bound is not applicable.
Surprisingly, not everyone who attempted this part got it correct. If indeed the random variables $X_e$ were all independent, then the analysis would in fact be correct. The only small step that was not explicitly pointed out (which ought to have been carried out in (e)) is that we would in fact be applying Chernoff to bound Pr[X ≤ (5/9)m].
10.
(a) [5pts] Note first that |X − p| ≥ εp ⇔ |tX − tp| ≥ εtp. So we can work with the r.v. Y = tX; Y is the sum of 0-1 r.v.'s, so we can apply Chernoff bounds to it. Assuming ε < 1 and applying the bound in the form $\Pr[|Y - \mu| \ge \varepsilon\mu] \le 2\exp(-\frac{\varepsilon^2\mu}{3})$, with µ = E[Y] = tp, we get
$$\Pr[|X - p| \ge \varepsilon p] = \Pr[|Y - tp| \ge \varepsilon tp] \le 2\exp\left(-\tfrac{\varepsilon^2 tp}{3}\right).$$
This latter quantity is less than δ provided $t \ge \frac{3}{\varepsilon^2 p}\ln(\frac{2}{\delta})$, as required.
This part was a bit of a gift as we did exactly this calculation in class a few weeks ago. Nonetheless, quite a few people got confused by attempting to apply the Chernoff bound to X rather than to Y. Those who did this generally were unable to correctly incorporate t into their bound (though in most cases t appeared “by magic” in order to get the right answer).
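A sketch of the estimator with t chosen as above (the parameter values are arbitrary); the observed failure rate should be at most δ:

```python
import math
import random

p, eps, delta = 0.2, 0.3, 0.1
t = math.ceil(3 / (eps**2 * p) * math.log(2 / delta))
runs, fails = 5_000, 0
for _ in range(runs):
    x = sum(random.random() < p for _ in range(t)) / t   # empirical mean of t samples
    fails += abs(x - p) >= eps * p
print(t, fails / runs, "<=", delta)                      # failure rate well below delta
```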
(b) We simply substitute δ = 1/4 into the bound of part (a), which becomes $t = O(\frac{1}{\varepsilon^2 p})$.
(c) Call an execution of step (1) “good” if its value falls within the range [(1 − ε)p, (1 + ε)p]. By our choice of t, each execution is good with probability at least 3/4. Our final output is the median of s such executions, and equation (∗) says that we want the median to be good with probability at least 1 − δ. Notice that, if the median is not good, then at least s/2 of the executions must not be good. The key idea is to formulate this as a coin-tossing problem: if we think of each execution as a coin toss with Heads probability 3/4 (Heads corresponding to being good), we want to bound the probability that s/2 or fewer of the tosses are Heads. Since executions are independent, we can use a Chernoff bound with µ = 3s/4 and deviation s/4 to get
$$\Pr[\#\text{Heads} \le s/2] = \Pr[\#\text{Heads} \le (1 - \tfrac{1}{3})\mu] \le \exp\left(-\tfrac{(1/3)^2}{2}\mu\right) = \exp\left(-\tfrac{s}{24}\right).$$
To make this value less than δ, it is enough to take s ≥ 24 ln(1/δ), as required.
Many people seemed to be lost in this part, and either could not find the correct r.v. to which to apply
a Chernoff bound, or ignored the hint and didn’t even try to use a Chernoff bound.
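And a sketch of the median trick itself, reusing the single-run estimator with δ = 1/4 from part (b); taking the median of s executions drives the overall failure probability below δ:

```python
import math
import random
from statistics import median

def estimate(p, eps):
    """One execution of step (1), with t chosen as in part (b) for delta = 1/4."""
    t = math.ceil(3 / (eps**2 * p) * math.log(8))    # ln(2 / (1/4)) = ln 8
    return sum(random.random() < p for _ in range(t)) / t

p, eps, delta = 0.2, 0.3, 0.05
s = math.ceil(24 * math.log(1 / delta))              # s >= 24 ln(1/delta)
runs, fails = 300, 0
for _ in range(runs):
    med = median(estimate(p, eps) for _ in range(s))
    fails += abs(med - p) >= eps * p
print(s, fails / runs, "<=", delta)                  # failure rate <= delta
```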
(d) [2pts] The total number of samples is clearly just st, which by parts (b) and (c) is $O\left(\frac{\ln(1/\delta)}{\varepsilon^2 p}\right)$.