CS–174 Combinatorics & Discrete Probability, Spring 2007
Quiz 1 Solutions
Note: These solutions are not necessarily model answers. Rather, they are designed to be tutorial in nature,
and sometimes contain a little more explanation than an ideal solution. Also, bear in mind that there may
be more than one correct solution. The maximum total number of points available is 92. For each incorrect
answer in the multiple choice problems, 1 point was deducted.
1.
(a) [3pts] Upon picking the first card, 48 out of the remaining 51 cards do not have the same numerical value as the first one. Continuing this line of argument, we obtain the required expression $\frac{48\cdot 44\cdot 40\cdot 36}{51\cdot 50\cdot 49\cdot 48}$.
(b) [3pts] There are $13\cdot\binom{4}{3}\cdot\binom{48}{2}$ hands in which exactly 3 cards are of the same value (13 ways to pick the value, $\binom{4}{3}$ ways to pick the 3 out of 4 cards of that value, and $\binom{48}{2}$ ways to pick the other two cards); similarly, there are $13\cdot 48$ hands in which exactly 4 cards are of the same value. These are mutually exclusive events, so the required probability is $\frac{13\left(\binom{4}{3}\binom{48}{2}+48\right)}{\binom{52}{5}}$.
Several students picked expressions such as $\frac{13\cdot 4\cdot 48\cdot 47}{\binom{52}{5}}$ instead, due to over-counting, etc. Please refer to the solutions to homework 1(b) for another example with similar calculations.
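As a sanity check, here is a short Monte Carlo sketch (not part of the original solutions). It assumes the events in question are "all five cards have distinct values" for (a) and "three or more cards share a value" for (b), as the calculations above suggest.

```python
import random
from collections import Counter
from math import comb

deck = [v for v in range(13) for _ in range(4)]   # 13 values x 4 suits
trials = 200_000
distinct = triples = 0
for _ in range(trials):
    hand = random.sample(deck, 5)                 # 5 cards without replacement
    top = max(Counter(hand).values())             # size of largest equal-value group
    distinct += (top == 1)
    triples += (top >= 3)

print(distinct / trials, 48 * 44 * 40 * 36 / (51 * 50 * 49 * 48))            # ~0.507
print(triples / trials, 13 * (comb(4, 3) * comb(48, 2) + 48) / comb(52, 5))  # ~0.0228
```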
2.
(a) [3pts] Let Y, Z be the r.v.'s for the number of heads and number of tails respectively. Clearly, Y (resp. Z) is a binomial r.v. with parameters n and p (resp. n and 1 − p). Then X = Y − Z, and E[X] = E[Y] − E[Z] = np − n(1 − p) = n(2p − 1).
(b) [3pts] We have X = Y − (n − Y) = 2Y − n. Hence, Var[X] = 4 Var[Y] = 4np(1 − p).
Note that, if Y and Z were independent, we would have X = Y − Z and thus Var[X] = Var[Y] + Var[Z] = 2np(1 − p). But Y, Z are clearly not independent, so this value is invalid.
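A quick simulation illustrates why the correct factor is 4np(1 − p) rather than 2np(1 − p); the parameter values below are arbitrary.

```python
import random

n, p, trials = 30, 0.3, 100_000
xs = []
for _ in range(trials):
    heads = sum(random.random() < p for _ in range(n))
    xs.append(2 * heads - n)                 # X = Y - Z = 2Y - n

mean = sum(xs) / trials
var = sum((x - mean) ** 2 for x in xs) / trials
print(mean, n * (2 * p - 1))                 # ~ -12
print(var, 4 * n * p * (1 - p))              # ~ 25.2, not 2np(1-p) = 12.6
```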
3.
(a) [3pts] The expected number of balls in the first bin is $n\cdot\frac{1}{2n} + n\cdot\frac{1}{n} = \frac{3}{2}$ (namely, 1/2 from the first phase and 1 from the second). The same applies to the second and third bins. By linearity of expectation, the expected number of balls in the first 3 bins is $3\cdot\frac{3}{2} = \frac{9}{2}$.
It appears from the scratch work that several students misread the question.
(b) [3pts] The probability that the first bin is empty after both phases is $(1-\frac{1}{2n})^n\cdot(1-\frac{1}{n})^n \sim e^{-1/2}\cdot e^{-1} = e^{-3/2}$. By linearity of expectation, the expected number of empty bins amongst the first 3 bins is therefore $3e^{-3/2}$.
Quite a few students did not know how to compute asymptotics.
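The following sketch simulates one plausible reading of the experiment (phase 1 throws n balls into 2n bins, phase 2 throws n more balls into the first n bins, each ball landing uniformly at random); both expectations above can be checked this way.

```python
import random
from math import exp

n, trials = 50, 20_000
balls_first3 = empties_first3 = 0
for _ in range(trials):
    bins = [0] * (2 * n)
    for _ in range(n):                        # phase 1: n balls into 2n bins
        bins[random.randrange(2 * n)] += 1
    for _ in range(n):                        # phase 2: n balls into the first n bins
        bins[random.randrange(n)] += 1
    balls_first3 += sum(bins[:3])
    empties_first3 += sum(b == 0 for b in bins[:3])

print(balls_first3 / trials, 9 / 2)               # ~4.5
print(empties_first3 / trials, 3 * exp(-3 / 2))   # ~0.67 (for large n)
```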
4.
(a) [3pts] Observe that
$$E[X \mid X_1 = 3] = E[X_1 + X_2 \mid X_1 = 3] = 3 + E[X_2 \mid X_1 = 3] = 3 + E[X_2] = 3 + \frac{7}{2} = \frac{13}{2}.$$
(b) [3pts] E[E[X | X1]] = E[X] = E[X1] + E[X2] = 7.
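Since E[X2] = 7/2 suggests X1, X2 are fair six-sided dice (an assumption here), both identities can be verified by exact enumeration:

```python
import itertools

outcomes = list(itertools.product(range(1, 7), repeat=2))     # all (X1, X2) pairs
cond = [x1 + x2 for x1, x2 in outcomes if x1 == 3]
print(sum(cond) / len(cond))                     # E[X | X1 = 3] = 6.5 = 13/2
print(sum(x1 + x2 for x1, x2 in outcomes) / 36)  # E[E[X | X1]] = E[X] = 7
```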
5.
(a) [3pts] By the same analysis as in the regular coupon collector problem, the expected number of boxes that need to be bought until n/2 different coupons are obtained is:
$$1 + \frac{n}{n-1} + \frac{n}{n-2} + \cdots + \frac{n}{n/2} = n\sum_{i=n/2}^{n}\frac{1}{i} = n(H_n - H_{n/2-1}) = n(\ln n - \ln(n/2) + O(1)) = n(\ln 2 + O(1)) = \Theta(n),$$
where as usual we write $H_n = \sum_{i=1}^{n}\frac{1}{i} = \ln n + O(1)$ for the harmonic series.
Many students got at least one of parts (a) and (b) wrong. Note that the expression for $H_n$ was covered in class and reviewed in homeworks 2 and 4 and again in section on Feb 21 (along with the trick for expressing partial sums as a difference).
(b) [3pts] Similarly, the expected number of boxes that need to be bought until $n - \sqrt{n}$ different coupons are obtained is:
$$n\sum_{i=\sqrt{n}}^{n}\frac{1}{i} = n(H_n - H_{\sqrt{n}-1}) = n(\ln n - \ln\sqrt{n} + O(1)) = n(\tfrac{1}{2}\ln n + O(1)) = \Theta(n\log n).$$
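Both formulas can be checked against a direct simulation of the coupon collector process; the choice n = 400 below is arbitrary.

```python
import random
from math import isqrt

def draws_until(n, target):
    """Draw uniformly from {0, ..., n-1} until `target` distinct values have appeared."""
    seen, draws = set(), 0
    while len(seen) < target:
        seen.add(random.randrange(n))
        draws += 1
    return draws

n, trials = 400, 2_000
H = [0.0] * (n + 1)
for i in range(1, n + 1):
    H[i] = H[i - 1] + 1 / i                      # harmonic numbers H_1..H_n

half = sum(draws_until(n, n // 2) for _ in range(trials)) / trials
most = sum(draws_until(n, n - isqrt(n)) for _ in range(trials)) / trials
print(half, n * (H[n] - H[n // 2 - 1]))          # both ~ n ln 2 ~ 277
print(most, n * (H[n] - H[isqrt(n) - 1]))        # both ~ (n/2) ln n ~ 1200
```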
6. [3pts] This is essentially the coupon-collecting problem, wherein the cereal box corresponds to a roll of the die and there are 6 different coupons (corresponding to the 6 different outcomes of the roll). The expected number of boxes that need to be bought until we collect 4 different coupons is $1 + \frac{6}{5} + \frac{6}{4} + \frac{6}{3} = \frac{57}{10}$.
Many students correctly guessed that the answer must be greater than 4 but incorrectly circled $\frac{33}{5}$.
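A quick Monte Carlo confirmation of the value 57/10:

```python
import random

trials = 200_000
total = 0
for _ in range(trials):
    seen, rolls = set(), 0
    while len(seen) < 4:                 # roll until 4 distinct faces appear
        seen.add(random.randrange(6))
        rolls += 1
    total += rolls
print(total / trials, 57 / 10)           # ~5.7
```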
7.
— Pr[X ≥ 6] ≤ 1/3: [1pt] must be true; by Chebyshev's inequality.
— Pr[X ≥ Y] < 1: [1pt] may not be true (e.g., consider X that assumes the value $3 - \sqrt{3} \approx 1.3$ w.p. 1/2 and $3 + \sqrt{3}$ w.p. 1/2).
— $E[X^2] = 12$: [1pt] true, since $E[X^2]$ = Var[X] + E[X]².
— Cov(X, Y) = 2/3: [1pt] false; Cov(X, Y) = 0 since X and Y are independent.
— $\mathrm{Var}[Y^2] = 2/9$: [1pt] true, since $Y^2 = Y$ (property of a 0-1 r.v.) and Var[Y] = 2/9.
— Var[X + 3Y] = 11/3: [1pt] false; by independence, Var[X + 3Y] = Var[X] + 9 Var[Y] = 3 + 2 = 5.
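The claims about X and Y can be verified exactly; here we assume Y is Bernoulli with Pr[Y = 1] = 1/3, one of the two choices consistent with Var[Y] = 2/9 (the other, 2/3, gives the same variance):

```python
from math import sqrt

# distributions as (value, probability) pairs
xs = [(3 - sqrt(3), 0.5), (3 + sqrt(3), 0.5)]    # E[X] = 3, Var[X] = 3
ys = [(0, 2 / 3), (1, 1 / 3)]                    # 0-1 r.v. with Var[Y] = 2/9

def mean(d): return sum(v * p for v, p in d)
def var(d):  return sum(v * v * p for v, p in d) - mean(d) ** 2

print(var(xs) + mean(xs) ** 2)     # E[X^2] = 12
print(var(xs) + 9 * var(ys))       # Var[X + 3Y] = 3 + 2 = 5 by independence
```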
8.
(a) [3pts] X is a binomial r.v. with parameters n and p = 1/2, so E[X] = np = n/2.
(b) [3pts] Similarly, Var[X] = np(1 − p) = n/4.
(c) [2pts] Consider two scenarios: in the first scenario we condition on the first ball being some particular value (green, say), and in the second scenario we condition on the ith ball being this same value. Then the
set of possibilities for the other n − 1 balls in the sample is exactly the same in both scenarios. Hence
Y1 and Yi have the same distribution. Another way to see this is to note that the probability of any
given sample without replacement is the same regardless of the order in which the balls are chosen.
A lot of people did a calculation here to show that Y2 (and in some cases even Y3 ) has the same
distribution as Y1 . Note that this is not a valid argument for general i.
(d) [3pts] $Y = \sum_{i=1}^{n} Y_i$, and by part (c) $E[Y_i] = E[Y_1] = 1/2$, so by linearity of expectation E[Y] = nE[Yi] = n/2.
(e) [6pts] Note that Var[Y] = E[Y²] − E[Y]², and $E[Y^2] = \sum_i E[Y_i^2] + \sum_{i\neq j} E[Y_i Y_j]$. By similar reasoning to part (c), we have that $E[Y_i Y_j] = E[Y_1 Y_2] = \Pr[\text{first two balls are green}] = \frac{1}{2}\cdot\frac{9}{19} = \frac{9}{38}$. Therefore, we get
$$\mathrm{Var}[Y] = nE[Y_1] + n(n-1)E[Y_1 Y_2] - E[Y]^2 = \frac{n}{2} + n(n-1)\cdot\frac{9}{38} - \left(\frac{n}{2}\right)^2 = \frac{n}{4} - \frac{n(n-1)}{76}.$$
[Note that this evaluates to 1/4 when n = 1 and to zero when n = 20, as expected.]
A lot of people missed this part. The most common problem was just to assume that $Y_i, Y_j$ are independent, so that $E[Y_i Y_j] = \frac{1}{2}\cdot\frac{1}{2} = \frac{1}{4}$. With this value, Var[Y] would just be the same as Var[X], which is n/4. But of course $Y_i, Y_j$ are not independent. Many people realized this, and realized that they needed to compute the value of $E[Y_i Y_j]$, but were unable to do so. The hint was intended to make you think along the same lines as part (c), so that you would come up with $E[Y_i Y_j] = E[Y_1 Y_2]$, which is easy to calculate.
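A simulation of the sampling experiment (assuming an urn of 20 balls, 10 of them green, as the calculation of Pr[first two balls are green] indicates) confirms the formula:

```python
import random

balls = [1] * 10 + [0] * 10                 # 10 green (1) and 10 yellow (0)
n, trials = 5, 200_000
ys = [sum(random.sample(balls, n)) for _ in range(trials)]  # without replacement

mean = sum(ys) / trials
var = sum((y - mean) ** 2 for y in ys) / trials
print(mean, n / 2)                          # ~2.5
print(var, n / 4 - n * (n - 1) / 76)        # ~0.987, below Var[X] = n/4 = 1.25
```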
(f) [2pts] It is clear from the answers to parts (b) and (e) that Var[Y] ≤ Var[X] always (with equality only when n = 1). Moreover, as the sample size n increases, the ratio Var[Y]/Var[X] = (20 − n)/19 decreases monotonically from 1 to 0. This is in line with the intuition that, in sampling without replacement, the proportion of the sample that is green becomes more concentrated around its expectation, because if the sample has (say) too many green balls, at the next step it is more likely to pick a yellow ball. (This is sometimes referred to as the "self-correcting" nature of sampling without replacement.) Ultimately, of course, when n = 20, the sample is guaranteed to have exactly the expected number of green balls! [Note that there is nothing in this problem that makes use of the fact that 50% of the balls are green; we would get analogous behavior for any other proportion (other than 0% or 100%).]
9. This problem was intended to be fairly straightforward as it is virtually the same as problem 4 on HW2.
On this note, we encourage students to read the homework solutions even for problems they solved on
the homework. Often the solutions present a cleaner approach to the problem, in addition to a clearer
presentation.
(a) [2pts] $X = \sum_{e\in E} X_e$.
(b) [4pts] Fix an edge e and one endpoint. The probability that the other endpoint is assigned a different color is 2/3. Hence, $E[X_e] = 2/3$, and by linearity of expectation, $E[X] = \sum_{e\in E} E[X_e] = \frac{2}{3}m$. Moreover, OPT ≤ m, so E[X] ≥ (2/3)·OPT.
We took off 1 point for claiming $E[X_e] = 2/3$ without proof.
(c) [3pts] If e, e′ do not share an endpoint, then $X_e, X_{e'}$ are clearly independent, so $E[X_e X_{e'}] = E[X_e]E[X_{e'}] = (2/3)^2 = 4/9$. If e, e′ share an endpoint, then e and e′ are well-colored iff the other endpoints of e, e′ are assigned different colors from the common vertex. Again we have $E[X_e X_{e'}] = \Pr[X_e = X_{e'} = 1] = (2/3)^2 = 4/9$.
Note that the above calculation actually shows that the r.v.'s $X_e$ and $X_{e'}$ are independent even when e, e′ share an endpoint. However, this is not obvious without doing the calculation. We penalized students who claimed independence without justifying it.
(d) [3pts] $E[X^2] = \sum_{e\in E} E[X_e^2] + \sum_{e\neq e'} E[X_e X_{e'}] = m\cdot\frac{2}{3} + m(m-1)\cdot\frac{4}{9} = \frac{4}{9}m^2 + \frac{2}{9}m$. Therefore, Var[X] = E[X²] − E[X]² = (2/9)m.
Quite a few students failed to compute the variance correctly. Several failed to expand the square of the sum of indicator r.v.'s correctly, and many also did not know how to write $\sum_e E[X_e^2]$ as $\sum_e E[X_e] = E[X] = \frac{2}{3}m$.
(e) [5pts] Since OPT ≤ m, we have p = Pr[X ≥ (5/9)·OPT] ≥ Pr[X ≥ (5/9)m]. We use Chebyshev's inequality to obtain an upper bound on the tail probability, namely:
$$\Pr\left[X < \tfrac{5}{9}m\right] \le \Pr\left[\left|X - \tfrac{2}{3}m\right| > \tfrac{1}{9}m\right] \le \frac{\mathrm{Var}[X]}{(m/9)^2} = \frac{\frac{2}{9}m}{(m/9)^2} = \frac{18}{m}.$$
Hence, p ≥ 1 − 18/m, i.e. p = 1 − O(1/m).
Many students failed to obtain full credit for this part. There is no credit for simply stating Chebyshev's inequality in its general form. We highlight several common mistakes:
– Several students applied Chebyshev directly to compute Pr[X ≥ (5/9)m] or Pr[X ≥ (5/9)·OPT]; that is incorrect: we always use Chebyshev (and Chernoff for that matter) for tail bounds, i.e., to obtain an upper bound for deviation from the mean.
– Several students simply manipulated inequality signs (flipping them at will, possibly applying Chebyshev to derive a lower bound) to obtain the desired expression.
– A few students were confused by the expression 1 − O(1/m); this is technically a lower bound because O(1/m) is technically an upper bound (we used similar notation on HW3).
– A few students did not point out explicitly the relationship between Pr[X ≥ (5/9)m] and Pr[X ≥ (5/9)·OPT], or claimed that they are equal; there was no penalty for this, but we emphasize that you should understand why Pr[X ≥ (5/9)·OPT] is at least as big.
– A few students tried to analyze Y = m − X, and did the calculations incorrectly. There does not appear to be any advantage to analyzing Y here; Chebyshev is applicable to both X and Y.
– Several students did calculations of the kind:
$$\Pr[X \ge \tfrac{5}{9}m] = 1 - \Pr[X < \tfrac{5}{9}m] \ge \cdots \ge 1 - \frac{\mathrm{Var}[X]}{(m/9)^2}.$$
Some carried this out correctly; some did not. We discourage doing calculations this way because it is difficult to keep track of the signs (for both you and us!).
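The moments from (b) and (d) and the bound from (e) can all be checked by simulating the random coloring; the choice of graph below (a cycle, so that the number of vertices equals the number of edges m) is arbitrary.

```python
import random

m = 60                                       # cycle graph: 60 vertices, m = 60 edges
edges = [(i, (i + 1) % m) for i in range(m)]
trials = 50_000
xs = []
for _ in range(trials):
    color = [random.randrange(3) for _ in range(m)]          # uniform 3-coloring
    xs.append(sum(color[u] != color[v] for u, v in edges))   # well-colored edges

mean = sum(xs) / trials
var = sum((x - mean) ** 2 for x in xs) / trials
tail = sum(x >= 5 * m / 9 for x in xs) / trials
print(mean, 2 * m / 3)                       # ~40
print(var, 2 * m / 9)                        # ~13.3
print(tail, 1 - 18 / m)                      # empirical p vs the bound 0.7
```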
(f) [2pts] The m random variables $X_e$ are not independent, so the Chernoff bound is not applicable.
Surprisingly, not everyone who attempted this part got it correct. If indeed the random variables $X_e$ were all independent, then the analysis would in fact be correct. The only small step that was not explicitly pointed out (which ought to have been carried out in (e)) is that we would in fact be applying Chernoff to bound Pr[X ≤ (5/9)m].
10.
(a) [5pts] Note first that |X − p| ≥ εp ⇔ |tX − tp| ≥ εtp. So we can work with the r.v. Y = tX; Y is the sum of 0-1 r.v.'s, so we can apply Chernoff bounds to it. Assuming ε < 1 and applying the bound in the form $\Pr[|Y - \mu| \ge \varepsilon\mu] \le 2\exp(-\frac{\varepsilon^2\mu}{3})$, with µ = E[Y] = tp, we get
$$\Pr[|X - p| \ge \varepsilon p] = \Pr[|Y - tp| \ge \varepsilon tp] \le 2\exp\left(-\tfrac{\varepsilon^2 tp}{3}\right).$$
This latter quantity is less than δ provided $t \ge \frac{3}{\varepsilon^2 p}\ln(\frac{2}{\delta})$, as required.
This part was a bit of a gift as we did exactly this calculation in class a few weeks ago. Nonetheless, quite a few people got confused by attempting to apply the Chernoff bound to X rather than to Y. Those who did this generally were unable to correctly incorporate t into their bound (though in most cases t appeared “by magic” in order to get the right answer).
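A sketch of the estimator with t chosen as above (the parameter values are arbitrary); the observed failure rate should be at most δ:

```python
import math
import random

p, eps, delta = 0.2, 0.3, 0.1
t = math.ceil(3 / (eps**2 * p) * math.log(2 / delta))
runs, fails = 5_000, 0
for _ in range(runs):
    x = sum(random.random() < p for _ in range(t)) / t   # empirical mean of t samples
    fails += abs(x - p) >= eps * p
print(t, fails / runs, "<=", delta)                      # failure rate well below delta
```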
(b) We simply substitute δ = 1/4 into the bound of part (a), which becomes $t = O(\frac{1}{\varepsilon^2 p})$.
(c) Call an execution of step (1) “good” if its value falls within the range [(1 − ε)p, (1 + ε)p]. By our choice of t, each execution is good with probability at least 3/4. Our final output is the median of s such executions, and equation (∗) says that we want the median to be good with probability at least 1 − δ. Notice that, if the median is not good, then at least s/2 of the executions must not be good. The key idea is to formulate this as a coin-tossing problem: if we think of each execution as a coin toss with Heads probability 3/4 (Heads corresponding to being good), we want to bound the probability that s/2 or fewer of the tosses are Heads. Since executions are independent, we can use a Chernoff bound with µ = 3s/4 and deviation s/4 to get
$$\Pr[\#\text{Heads} \le s/2] = \Pr[\#\text{Heads} \le (1 - \tfrac{1}{3})\mu] \le \exp\left(-\tfrac{(1/3)^2}{2}\mu\right) = \exp\left(-\tfrac{s}{24}\right).$$
To make this value less than δ, it is enough to take s ≥ 24 ln(1/δ), as required.
Many people seemed to be lost in this part, and either could not find the correct r.v. to which to apply
a Chernoff bound, or ignored the hint and didn’t even try to use a Chernoff bound.
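And a sketch of the median trick itself, reusing the single-run estimator with δ = 1/4 from part (b); taking the median of s executions drives the overall failure probability below δ:

```python
import math
import random
from statistics import median

def estimate(p, eps):
    """One execution of step (1), with t chosen as in part (b) for delta = 1/4."""
    t = math.ceil(3 / (eps**2 * p) * math.log(8))    # ln(2 / (1/4)) = ln 8
    return sum(random.random() < p for _ in range(t)) / t

p, eps, delta = 0.2, 0.3, 0.05
s = math.ceil(24 * math.log(1 / delta))              # s >= 24 ln(1/delta)
runs, fails = 300, 0
for _ in range(runs):
    med = median(estimate(p, eps) for _ in range(s))
    fails += abs(med - p) >= eps * p
print(s, fails / runs, "<=", delta)                  # failure rate <= delta
```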
(d) [2pts] The total number of samples is clearly just st, which by parts (b) and (c) is $O\left(\frac{\ln(1/\delta)}{\varepsilon^2 p}\right)$.