Homework 5 (due October 27, 2009)
Problem 1. Each night different meteorologists give us the probability that it will rain the
next day. To judge how well meteorologists predict, we will score each of them as follows: If
a meteorologist says that it will rain with probability p, then he or she will receive a score of
$1 - (1-p)^2$ if it does rain,
$1 - p^2$ if it does not rain.
We will then keep track of scores over a certain time span, and conclude that the meteorologist with the highest average score is the best predictor of weather. Suppose now that
a given meteorologist is aware of this and so wants to maximize his or her expected score.
If this person truly believes that it will rain tomorrow with probability p∗ , what value of p
should he or she assert so as to maximize the expected score?
Solution Let $p^*$ be the probability that I truly believe it will rain tomorrow, and let $p$ be the probability that I will say it will rain tomorrow. The question is what $p$ should be to maximize my expected score. Let $X$ be the random variable equal to the score the meteorologist receives upon saying it will rain with probability $p$. There are only two values $X$ can take: $1 - (1-p)^2$ or $1 - p^2$. Thus, when computing the expected value of $X$, we sum those two values weighted by the probabilities that $X$ takes them:
$$E[X] = \bigl(1 - (1-p)^2\bigr) P\{X = 1 - (1-p)^2\} + (1 - p^2) P\{X = 1 - p^2\}.$$
Now $P\{X = 1 - (1-p)^2\}$ is the probability that my score is $1 - (1-p)^2$, but I get this score exactly when it rains, so it equals the probability that it rains. Similarly, $P\{X = 1 - p^2\} = P\{\text{it does not rain}\}$. But I believe it will rain with probability $p^*$ and that it won't rain with probability $1 - p^*$. Thus:
$$E[X] = \bigl(1 - (1-p)^2\bigr) p^* + (1 - p^2)(1 - p^*) = 2pp^* - p^2 + 1 - p^*.$$
Now I seek to maximize this expected value with respect to $p$. It is a quadratic function of $p$ with a negative coefficient on $p^2$, so its graph is a downward-opening parabola with a single maximum where the derivative is 0. Taking the derivative with respect to $p$ and setting it equal to 0 gives $2p^* - 2p = 0$, so the only critical point is $p = p^*$. Since the second derivative is $-2 < 0$, reporting $p = p^*$ indeed maximizes the expected score.
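As a quick numerical sanity check (not part of the original solution), we can evaluate the expected score over a grid of reported probabilities and confirm that the maximizer is the true belief; the value $p^* = 0.7$ below is an arbitrary illustration:

```python
# Grid search over reported probabilities p, assuming a true belief p* = 0.7
# (an arbitrary illustrative value, not from the problem statement).
def expected_score(p, p_star):
    """Expected score when the forecaster believes p_star but reports p."""
    return (1 - (1 - p) ** 2) * p_star + (1 - p ** 2) * (1 - p_star)

p_star = 0.7
grid = [k / 1000 for k in range(1001)]
best_p = max(grid, key=lambda p: expected_score(p, p_star))
print(best_p)  # the best report coincides with the true belief: 0.7
```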
Problem 2. For a nonnegative integer-valued random variable X, show that
$$E[X] = \sum_{n=0}^{\infty} P(X > n).$$
Hint: Note that
$$\sum_{n=0}^{\infty} P(X > n) = \sum_{n=0}^{\infty} \sum_{m=n+1}^{\infty} P(X = m)$$
and change the order of the sums to show that this can be written as $E[X]$.
Solution
$$\sum_{n=0}^{\infty} P(X > n) = \sum_{n=0}^{\infty} \sum_{m=n+1}^{\infty} P(X = m) = \sum_{m=1}^{\infty} \sum_{n=0}^{m-1} P(X = m).$$
This last equality comes from the fact that if I fix $m$, then $n$ ranges from 0 to $m - 1$. Now the inner sum in the above formula is over $n$, so $P(X = m)$, which does not depend on $n$, can be pulled out of the inner sum:
$$= \sum_{m=1}^{\infty} P(X = m) \sum_{n=0}^{m-1} 1.$$
The inner sum just adds 1, $m$ times. Thus we have:
$$= \sum_{m=1}^{\infty} m P(X = m) = \sum_{m=0}^{\infty} m P(X = m) = E[X],$$
where adding the $m = 0$ term changes nothing and yields exactly the definition of $E[X]$.
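Since the identity is easy to verify numerically, here is a short Python sketch (an illustration only; the pmf is an arbitrary choice) comparing $E[X]$ with the tail sum:

```python
# Tail-sum identity E[X] = sum_{n>=0} P(X > n) for a nonnegative
# integer-valued X; the pmf below is an arbitrary illustrative choice.
pmf = {0: 0.2, 1: 0.3, 2: 0.4, 5: 0.1}

expectation = sum(k * q for k, q in pmf.items())
tail_sum = sum(
    sum(q for k, q in pmf.items() if k > n)  # P(X > n)
    for n in range(max(pmf))                 # terms with n >= max(X) are zero
)
print(round(expectation, 10), round(tail_sum, 10))  # 1.6 1.6
```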
Problem 3. Let X be a random variable having expected value µ and variance σ 2 . Find
the expected value and the variance of the random variable
$$Y = \frac{X - \mu}{\sigma}.$$
Solution
$$E[Y] = E\left[\frac{X - \mu}{\sigma}\right] = \frac{E[X]}{\sigma} - \frac{\mu}{\sigma} = 0.$$
The second-to-last equality uses the fact, proved in class as well as in the book, that $E[aX + b] = aE[X] + b$; we then used $\mu = E[X]$. To compute the variance of $Y$, note that $E[Y] = 0$, so $Var(Y) = E[Y^2]$:
$$E[Y^2] = E\left[\frac{(X - \mu)^2}{\sigma^2}\right] = \frac{1}{\sigma^2} E[(X - \mu)^2] = 1.$$
The last equality comes from recognizing that $E[(X - \mu)^2] = Var(X) = \sigma^2$.
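A small numerical illustration (the three-point distribution below is an arbitrary choice) confirms that standardizing gives mean 0 and variance 1:

```python
# Standardize a random variable: Y = (X - mu) / sigma should have
# mean 0 and variance 1; the three-point pmf is an arbitrary example.
pmf = {1: 0.25, 2: 0.5, 4: 0.25}

mu = sum(x * q for x, q in pmf.items())
sigma = sum((x - mu) ** 2 * q for x, q in pmf.items()) ** 0.5

mean_Y = sum((x - mu) / sigma * q for x, q in pmf.items())
var_Y = sum(((x - mu) / sigma) ** 2 * q for x, q in pmf.items())
print(mean_Y, var_Y)  # mean ~0, variance ~1 up to floating-point error
```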
Problem 4. Airlines find that each passenger who reserves a seat fails to turn up with probability $\frac{1}{10}$, independently of the other customers. So Teeny Weeny Airlines (TWA) always sell 10 tickets for their 9-seat airplane, while Blockbuster Airways (BA) always sell 20 tickets for their 18-seat airplane. Which is more often over-booked?
Solution First consider TWA and let the random variable $X$ = the number of people who show up. Then $X$ is a binomial random variable with parameters $(n = 10, p = 9/10)$. The probability that TWA is overbooked is:
$$P(X > 9) = P(X = 10) = \left(\frac{9}{10}\right)^{10} \approx 0.349.$$
For BA we do the same thing. Let $Y$ = the number of people that show up; then $Y$ is a binomial random variable with parameters $(n = 20, p = 9/10)$. Here the probability of overbooking is:
$$P(Y > 18) = P(Y = 19) + P(Y = 20) = \binom{20}{19}\left(\frac{9}{10}\right)^{19}\frac{1}{10} + \left(\frac{9}{10}\right)^{20} \approx 0.3917.$$
Thus the second airline (BA) is more likely to be overbooked.
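The two overbooking probabilities are easy to recompute; this Python sketch (standard library only) mirrors the binomial calculation above:

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

show_up = 9 / 10                                    # chance a passenger turns up
twa = binom_pmf(10, show_up, 10)                    # all 10 ticket holders arrive
ba = binom_pmf(20, show_up, 19) + binom_pmf(20, show_up, 20)
print(round(twa, 4), round(ba, 4))  # 0.3487 0.3917
```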
Problem 5. A man claims to have extrasensory perception. As a test, a fair coin is flipped
8 times, and the man is asked to predict the outcome in advance. He gets 6 out of 8 correct.
What is the probability that he would have done at least this well if he had no extrasensory
perception?
Solution In this problem we think of flipping a coin 8 times and let $X$ = the number of times we guess correctly. Since we have a 1/2 chance of guessing each flip correctly, $X$ is a binomial random variable with parameters $(n = 8, p = 0.5)$. Thus the probability of guessing at least 6 flips correctly is:
$$P(X \ge 6) = \sum_{k=6}^{8} P(X = k) = \binom{8}{6}\left(\frac{1}{2}\right)^{8} + \binom{8}{7}\left(\frac{1}{2}\right)^{8} + \binom{8}{8}\left(\frac{1}{2}\right)^{8} = (0.5)^8 (28 + 8 + 1) = \frac{37}{2^8} \approx 0.1445.$$
Consequently, it is fairly unlikely (about a 14% chance) that someone without extrasensory perception would guess at least this well.
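The tail probability can be checked directly with a one-line computation (standard library only):

```python
from math import comb

# P(at least 6 correct out of 8 fair-coin guesses) by pure chance.
p_at_least_6 = sum(comb(8, k) for k in range(6, 9)) / 2 ** 8
print(p_at_least_6)  # 37/256 = 0.14453125
```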
Problem 6. To determine whether or not they have a certain disease, a large number
of people are to have their blood tested. However, rather than testing each individual
separately, it has been decided first to group people in groups of 10. The blood samples of
the 10 people in each group will be pooled and analyzed together. If the test is negative,
one test will suffice for the 10 people; whereas, if the test is positive, each of the 10 people
will also be individually tested and, in all, 11 tests will be made on this group. Assume that
the probability that a person has the disease is p for all people, independently of each other,
and that the pooled test will be positive if at least one person in the pool has the disease.
Let T be a random variable equal to the number of tests needed for a group of 10 people.
Find the expected number E[T ] of tests necessary for each group. Calculate the numerical
values of E[T ] for the cases p = 0.001, p = 0.01, p = 0.05, p = 0.1, and p = 0.5. Discuss
briefly the practical applicability of this method (from purely probabilistic point of view).
Solution If $T$ is the random variable defined above, then it can take on two possible values: 1 or 11. Thus the expected value of $T$ is given by:
$$E[T] = 1 \cdot P(T = 1) + 11 \cdot P(T = 11) = (1-p)^{10} + 11\left(1 - (1-p)^{10}\right) = 11 - 10(1-p)^{10}.$$
The second-to-last equality comes from the fact that $P(T = 1)$ is the probability that no one in the group has the disease, which is $(1-p)^{10}$. Similarly, $P(T = 11)$ is the probability that someone has the disease, which is the complement of no one having the disease, namely $1 - (1-p)^{10}$.
Computing this for the values indicated we get:
• p = .001 gives E[T ] ≈ 1.1
• p = .01 gives E[T ] ≈ 1.956
• p = .05 gives E[T ] ≈ 5.01
• p = .1 gives E[T ] ≈ 7.51
• p = .5 gives E[T ] ≈ 10.99
What this tells us is that if the probability of having the disease is too high, then this method of testing results in more tests on average than simply testing everyone, since the expected number of tests exceeds 10. Precisely, $E[T] > 10$ when $(1-p)^{10} < 1/10$, i.e. when $p > 1 - 10^{-1/10} \approx 0.206$; in particular, pooling loses whenever $p \ge 1/2$. From a purely probabilistic point of view, the method pays off only when the disease is fairly rare.
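The formula $E[T] = 11 - 10(1-p)^{10}$ is simple to tabulate; this sketch reproduces the values listed above:

```python
# Expected tests per pooled group of 10: E[T] = 11 - 10 * (1 - p)**10.
def expected_tests(p, group=10):
    return (group + 1) - group * (1 - p) ** group

for p in (0.001, 0.01, 0.05, 0.1, 0.5):
    print(p, round(expected_tests(p), 2))  # 1.1, 1.96, 5.01, 7.51, 10.99
```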
Problem 7. We toss n coins, and each one shows heads with probability p, independently
of the others. Each coin which shows heads is tossed again. Let Y be the number of heads
resulting from the second round of tosses. Find the probability mass function, fY (k), the
expectation, E[Y ], and the variance, var(Y ), of the random variable Y .
Solution This problem is not too complicated once you recognize the following: $Y$ counts the number of successes, where a success is a coin being flipped twice and landing heads both times, which occurs with probability $p^2$. So $Y$ is a binomial random variable with parameters $(n, p^2)$, and everything we seek follows immediately:
$$f_Y(k) = \binom{n}{k} p^{2k} (1 - p^2)^{n-k}, \quad k = 0, 1, \ldots, n,$$
$$E[Y] = np^2, \qquad Var(Y) = np^2(1 - p^2).$$
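A Monte Carlo sketch can confirm that the two-round experiment has mean $np^2$; the values $n = 10$ and $p = 0.6$ below are arbitrary illustrative choices:

```python
import random

# Monte Carlo check of the two-round tossing experiment with illustrative
# values n = 10, p = 0.6: toss n coins, re-toss only the coins that showed
# heads, and count the second-round heads Y.
random.seed(0)
n, p, trials = 10, 0.6, 100_000

def second_round_heads():
    first = sum(random.random() < p for _ in range(n))      # round-one heads
    return sum(random.random() < p for _ in range(first))   # round-two heads

mean_Y = sum(second_round_heads() for _ in range(trials)) / trials
print(mean_Y)  # close to n * p**2 = 3.6
```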