Statistics 100A
Homework 1 Solutions
Ryan Rosario
Problem 1
Suppose we flip a fair coin 4 times independently.
(1) What is the sample space?
By definition, the sample space, denoted as Ω, is the set of all possible outcomes in the
experiment. A coin has two sides, heads denoted H and tails denoted T . Each coin flip
will turn up either H or T , and 4 independent coin flips will yield a sequence of length 4
containing H and/or T . There are a total of $2^4 = 16$ possible outcomes, thus |Ω| = 16. The
sample space is

Ω = { {H, H, H, H}, {H, T, H, H}, {T, H, H, H}, {T, T, H, H},
      {H, H, H, T}, {H, T, H, T}, {T, H, H, T}, {T, T, H, T},
      {H, H, T, H}, {H, T, T, H}, {T, H, T, H}, {T, T, T, H},
      {H, H, T, T}, {H, T, T, T}, {T, H, T, T}, {T, T, T, T} }
(2) What is the set that corresponds to the event that the number of heads is exactly 2? What is
its probability?
We want to find all cases in Ω such that the number of heads H is 2. Let’s call this set A.
Then,
A = { {H, H, T, T}, {H, T, H, T}, {H, T, T, H},
      {T, T, H, H}, {T, H, T, H}, {T, H, H, T} }
Note that there are 6 such cases, so |A| = 6. The probability that we get 2 heads on 4
independent coin flips is then
$$\frac{|A|}{|\Omega|} = \frac{6}{16} = \frac{3}{8}$$
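The enumeration above is small enough to check by machine. Here is a minimal Python sketch, not part of the original solution; the names `omega`, `A`, and `prob_A` are illustrative, and outcomes are stored as ordered tuples of flips.

```python
from fractions import Fraction
from itertools import product

# Enumerate the sample space: all length-4 sequences of H and T.
omega = list(product("HT", repeat=4))

# Event A: exactly two of the four flips are heads.
A = [w for w in omega if w.count("H") == 2]

# With equally likely outcomes, the probability is |A| / |Omega|.
prob_A = Fraction(len(A), len(omega))
print(len(omega), len(A), prob_A)  # 16 6 3/8
```

Using `Fraction` keeps the answer exact, so it can be compared directly with the hand computation.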
Aside: In this problem, we performed a process called enumeration to solve the problem.
That is, we enumerated all possible cases (Ω) and used it to compute the probability |A|/|Ω|.
There is another way to solve this problem. If we get 2 heads, this means we also get 2
non-heads (tails). What is the probability of this case? Recall from basic probability that
with independent events, the word and implies multiplication. We know that the probability
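of a head on one flip is $p = \frac{1}{2}$, so
$$p^2 (1 - p)^2 = \left(\frac{1}{2}\right)^2 \left(1 - \frac{1}{2}\right)^2 = \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^2 = \frac{1}{4} \cdot \frac{1}{4} = \frac{1}{16}$$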
but this computation only takes one particular configuration of the two heads and tails into
account. If we take the number of ways we can get two tails and two heads, we get
$$\binom{4}{2} \left(\frac{1}{2}\right)^2 \left(1 - \frac{1}{2}\right)^2 = \frac{6}{16}$$
This is called the binomial distribution, and has the form
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$$
where X is a random variable representing the number of heads in the sequence (see the
next problem), n is the number of flips/trials, p is the probability of success (getting a
head on one flip/trial), and k is the number of successes/flips that are heads.
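The binomial formula translates almost directly into code. The sketch below is an illustration only, assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` is made up for this example.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """Return P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = Fraction(1, 2)            # fair coin
print(binomial_pmf(2, 4, p))  # 3/8
```

Passing `p` as a `Fraction` gives an exact result that matches 6/16 = 3/8 from the enumeration.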
(3) Let Zi = 1 if the i-th flip is head, and Zi = 0 otherwise, for i = 1, 2, 3, 4. Let X be the number
of heads. Express X in terms of Zi .
Recall that Zi is the result of each individual flip i = 1, 2, 3, 4 and will be either one or zero.
Zi can be represented by the indicator variable:
$$Z_i = \mathbf{1}_{\{\text{flip } i \text{ is heads}\}} = \begin{cases} 1, & \text{if flip } i \text{ is a head} \\ 0, & \text{otherwise} \end{cases}$$
Then the number of heads on the four flips is just a sum over Z:
$$X = \sum_{i=1}^{4} Z_i$$
(4) What is the probability distribution of X? That is, what is P r(X = k) for k = 0, 1, 2, 3, 4?
Note that X is discrete, that is, it takes on one of a fixed number of values. Thus, we
represent the distribution of X as a table. To compute each probability, just use enumeration
as we did in parts (1) and (2). In the future, you will also be able to use the binomial
distribution directly.
Note: On a midterm (or on your assignment), only the columns labeled “k” and “P r(X =
k)” would be required. n = 4 since there are 4 flips/trials.
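k    Binomial                                                     |A|/|Ω|    P r(X = k)
0    $\binom{4}{0} \left(\frac{1}{2}\right)^0 \left(1 - \frac{1}{2}\right)^4$    1/16       1/16
1    $\binom{4}{1} \left(\frac{1}{2}\right)^1 \left(1 - \frac{1}{2}\right)^3$    4/16       4/16
2    $\binom{4}{2} \left(\frac{1}{2}\right)^2 \left(1 - \frac{1}{2}\right)^2$    6/16       6/16
3    $\binom{4}{3} \left(\frac{1}{2}\right)^3 \left(1 - \frac{1}{2}\right)^1$    4/16       4/16
4    $\binom{4}{4} \left(\frac{1}{2}\right)^4 \left(1 - \frac{1}{2}\right)^0$    1/16       1/16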
Recall that there are two requirements for a probability distribution:
1. 0 ≤ P r(X = k) ≤ 1, k ∈ Dom(X)
2. $\sum_{k \in \mathrm{Dom}(X)} Pr(X = k) = 1$
To conclude, we see that $X \sim \mathrm{Bin}\left(n = 4, p = \frac{1}{2}\right)$.
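The distribution of X and the two requirements can also be checked by brute force. This is a minimal Python sketch, not part of the original solution: each outcome is written as a tuple of the indicators (Z1, Z2, Z3, Z4), X is their sum, and the names `outcomes`, `counts`, and `dist` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Each outcome is a tuple (Z1, Z2, Z3, Z4) of 0/1 indicators; X is their sum.
outcomes = list(product([0, 1], repeat=4))
counts = Counter(sum(z) for z in outcomes)

# P(X = k) = (number of outcomes with k heads) / 16.
dist = {k: Fraction(counts[k], len(outcomes)) for k in range(5)}
for k, pr in dist.items():
    print(k, pr)

# The two requirements for a probability distribution.
assert all(0 <= pr <= 1 for pr in dist.values())
assert sum(dist.values()) == 1
```

Note that `Fraction` prints reduced values, so 4/16 appears as 1/4 and 6/16 as 3/8.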
Problem 2
Suppose we roll a fair die twice independently. Let X and Y be the two numbers we get.
(1) What is the sample space? Let A be the event that X > 4, and B be the event that Y > 4.
What are P r(A), P r(B)?
The sample space contains all |Ω| = 36 possible combinations of the two rolls.
Ω = {(x, y) : x, y ∈ {1, 2, 3, 4, 5, 6}}
X corresponds to the pips on the first rolled die. The only way for X > 4 is if the first roll
showed a 5 or a 6. Since we are only placing a restriction on the first roll (X), the second
roll (Y ) can be anything.
There are only 2 possible ways for the first roll to show a value greater than 4. For each of
them, there are 6 possible values for the second roll, thus
|A| = 2 · 6 = 12
Then, $Pr(A) = Pr(X > 4) = \frac{12}{36} = \frac{1}{3}$.
Notice that there is another way to solve this problem. Since the rolls are independent, and
the restriction only applies to the first roll, we can ignore the second roll entirely and get
$\frac{2}{6} = \frac{1}{3}$.
Using the same exact reasoning,
$$Pr(B) = Pr(Y > 4) = \frac{12}{36} = \frac{1}{3}$$
(2) Let C be the event that min (X, Y ) > 4. What is P r(C)? What is the relationship between
P r(C), P r(A), P r(B)?
There are two ways to approach this problem. First, min (X, Y ) > 4 means that both rolls
exceed 4, that is, X = 5, Y = 5, or X = 5, Y = 6, or X = 6, Y = 5, or X = 6, Y = 6.
Typically, P (A∪B) = P (A)+P (B)−P (A∩B) for two events A and B. For more than two
events, this formula is more complicated and is based on the principle of inclusion and
exclusion (Math 113), but it contains intersections. Since these four outcomes are disjoint,
the intersections evaluate to 0, so:
$$\begin{aligned}
P(C) &= P(\{X = 5, Y = 6\} \cup \{X = 6, Y = 6\} \cup \{X = 6, Y = 5\} \cup \{X = 5, Y = 5\}) \\
&= P(X = 5, Y = 6) + P(X = 6, Y = 6) + P(X = 6, Y = 5) + P(X = 5, Y = 5) \\
&= \frac{1}{6} \cdot \frac{1}{6} + \frac{1}{6} \cdot \frac{1}{6} + \frac{1}{6} \cdot \frac{1}{6} + \frac{1}{6} \cdot \frac{1}{6} \\
&= \frac{4}{36} = \frac{1}{9}
\end{aligned}$$
This was the intuitive method. The other method involves listing out all 36 possible rolls and
retaining the ones where min (X, Y ) > 4.
Another more clever method (really just restating the above), yields the answer to the final
question. Note that
P (min (X, Y ) > 4) = P (X > 4 ∩ Y > 4)
= P (A ∩ B)
= P (A) · P (B)
= P (C)
where P (A ∩ B) = P (A)P (B) because A and B are independent. Note that we get the same
answer as before, 1/9.
Thus, $P(C) = P(A)P(B) = \frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}$.
(3) Let D be the event that max (X, Y ) > 4. What is P r(D)? What is the relationship between
P r(D) and P r(A), P r(B), P r(C)?
Again, there are multiple ways to solve this problem, but only one way will easily yield the
answer to the second question. max (X, Y ) > 4 means that at least one of the rolls must show
a value greater than 4. This is the union of the two events.
P (D) = P (max (X, Y ) > 4)
= P (X > 4 ∪ Y > 4)
= P (X > 4) + P (Y > 4) − P (X > 4 ∩ Y > 4)
Recall that X and Y are independent, so this reduces to
= P (X > 4) + P (Y > 4) − P (X > 4) P (Y > 4)
= P (A) + P (B) − P (A) · P (B)
= P (A) + P (B) − P (C)
The probability is then
$$P(D) = P(A) + P(B) - P(A) \cdot P(B) = \frac{1}{3} + \frac{1}{3} - \frac{1}{9} = \frac{2}{3} - \frac{1}{9} = \frac{5}{9}$$
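The relationships among A, B, C, and D can be confirmed by listing all 36 rolls. The following Python sketch is an illustration only; the helper `pr`, which takes an event written as a function of (x, y), is a made-up name for this example.

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))  # all 36 (x, y) pairs

def pr(event):
    """Probability of an event given as a predicate on (x, y)."""
    return Fraction(sum(1 for x, y in omega if event(x, y)), len(omega))

pr_A = pr(lambda x, y: x > 4)
pr_B = pr(lambda x, y: y > 4)
pr_C = pr(lambda x, y: min(x, y) > 4)
pr_D = pr(lambda x, y: max(x, y) > 4)

print(pr_A, pr_B, pr_C, pr_D)       # 1/3 1/3 1/9 5/9
print(pr_C == pr_A * pr_B)          # True: C is the intersection of A and B
print(pr_D == pr_A + pr_B - pr_C)   # True: D is the union of A and B
```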
(4) If I tell you that X > 4, what is the probability that X = 6?
There are a couple of ways to solve this problem. First, consider the intuitive approach. We
are given that X > 4. This means that X = 5 or X = 6. Only two cases satisfy X > 4. Of
these two cases, only one satisfies X = 6, thus
$$P(X = 6 \mid X > 4) = \frac{1}{2}$$
The other method uses conditional probability/Bayes rule:
$$P(X = 6 \mid X > 4) = \frac{P(X = 6 \cap X > 4)}{P(X > 4)}$$
but note that the events X = 6 and X > 4 are not independent, so we cannot just multiply
the probabilities together. Instead we need to use Bayes rule in two different ways. Note
that:
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$$
and also,
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
Thus, we have that
$$P(A \cap B) = P(A) P(B \mid A) = P(B) P(A \mid B)$$
Then we can use conditional probability as follows
$$P(X = 6 \mid X > 4) = \frac{P(X = 6 \cap X > 4)}{P(X > 4)} = \frac{P(X = 6) P(X > 4 \mid X = 6)}{P(X > 4)}$$
Note that P (X > 4|X = 6) = 1 because it is trivial that X > 4.
$$P(X = 6 \mid X > 4) = \frac{P(X = 6) \cdot 1}{P(X > 4)} = \frac{P(X = 6)}{P(X > 4)} = \frac{1/6}{2/6} = \frac{1}{2}$$
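Both approaches to this conditional probability fit in a few lines of code. This is a minimal Python sketch, not part of the original solution; `given`, `both`, `direct`, and `ratio` are illustrative names.

```python
from fractions import Fraction

faces = range(1, 7)
given = [x for x in faces if x > 4]   # conditioning event X > 4: faces 5 and 6
both = [x for x in given if x == 6]   # X = 6 and X > 4

# Intuitive approach: restrict the sample space to the given event.
direct = Fraction(len(both), len(given))

# Formula approach: P(X = 6 and X > 4) / P(X > 4).
ratio = Fraction(len(both), 6) / Fraction(len(given), 6)

print(direct, ratio)  # 1/2 1/2
```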
(5) Let Z = X + Y , what is the probability distribution of Z? That is, what is P r(Z = k), where
k goes through all the possible values that Z can take?
The probability distribution for a discrete random variable is just a table describing the
probabilities for all possible values of the random variable!
We know that 1 ≤ X ≤ 6 and 1 ≤ Y ≤ 6 so 2 ≤ Z ≤ 12 since Z = X + Y . So, we need to
compute the probabilities P (Z = k), k ∈ {2, . . . , 12}. The easiest way to compute all of the
probabilities is to use the sample space from part (1) containing the 36 possible combinations
of rolls.
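k     Possibilities (X, Y)                              P (Z = k)
2     (1,1)                                             1/36
3     (1,2), (2,1)                                      2/36
4     (1,3), (2,2), (3,1)                               3/36
5     (1,4), (2,3), (3,2), (4,1)                        4/36
6     (1,5), (2,4), (3,3), (4,2), (5,1)                 5/36
7     (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)          6/36
8     (2,6), (3,5), (4,4), (5,3), (6,2)                 5/36
9     (3,6), (4,5), (5,4), (6,3)                        4/36
10    (4,6), (5,5), (6,4)                               3/36
11    (5,6), (6,5)                                      2/36
12    (6,6)                                             1/36
In your response, you only need the columns k and P (Z = k).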
Again, note that all of the probabilities are between 0 and 1, and they all sum to 1.
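The distribution of the sum can be tabulated by counting how many of the 36 rolls give each total. Here is a short Python sketch, offered as an illustration only; `counts` and `dist` are made-up names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (x, y) pairs give each sum.
counts = Counter(x + y for x, y in product(range(1, 7), repeat=2))

# P(Z = k) = (number of pairs summing to k) / 36.
dist = {k: Fraction(c, 36) for k, c in sorted(counts.items())}
for k, p in dist.items():
    print(k, p)

assert sum(dist.values()) == 1  # the probabilities sum to 1
```

The printed fractions are reduced, so 2/36 shows up as 1/18 and 6/36 as 1/6.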
Problem 3
Suppose we draw two random numbers uniformly and independently from [0, 1]. Let X and Y be
the two numbers.
We say that X ∼ U [0, 1] and Y ∼ U [0, 1]. The uniform distribution is a very simple distribution,
and is somewhat boring. It is defined by two endpoints, and the height of the distribution function
(the top of the rectangle shown in class) is defined such that the area of the rectangle is exactly
1.
[Figure: the density of U [0, 1], a rectangle of height 1 over the interval [0, 1].]
(1) What is P r(.1 ≤ X ≤ .4)? In general, for A ⊂ [0, 1], what is P r(X ∈ A)?
In class, I used a number line to answer this question. This works under certain cases. Here,
I will use the standard definition of the Uniform distribution (the rectangle).
[Figure: the U [0, 1] density with the region between 0.1 and 0.4 shaded.]
We see above that P r(.1 ≤ X ≤ .4) is just the area of the shaded region.
$$Pr(.1 \le X \le .4) = l_A \cdot h_A = (0.4 - 0.1) \cdot 1 = 0.3$$
Now let’s switch gears and consider any two endpoints a′ and b′ in [0, 1] where b′ > a′, and
let A = [a′, b′]. Then,
$$P(X \in A) = b' - a'$$
How would we solve this problem if instead of using the domain [0, 1] we use [a, b]?
Draw a picture! Remember that the area of the rectangle must be 1, because U [a, b] is a
probability distribution! Thus, $l_\Omega \cdot h_\Omega = 1$. Thus, $h_\Omega = \frac{1}{l_\Omega}$, that is:
$$h_\Omega = \frac{1}{b - a}$$
Aside: The function that gives the height of a distribution at a particular point in its domain
is called the probability density function (PDF) and will be covered later in the course.
The PDF for U [a, b] is
$$f(x) = \frac{1}{b - a}, \quad a \le x \le b$$
The probability of X ∈ [a′, b′] is then
$$l_A \cdot h_A = \frac{b' - a'}{b - a}$$
This is a difference of two values of the cumulative distribution function (CDF) of U [a, b],
and is simply the integral of the PDF over the region of interest:
$$P(a' \le X \le b') = \int_{a'}^{b'} \frac{1}{b - a}\,dx = \left[\frac{x}{b - a}\right]_{a'}^{b'} = \frac{b' - a'}{b - a}$$
And in general,
$$P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$$
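For the uniform distribution, this integral is just length times height, which makes for a one-line function. The sketch below is an illustration only: the name `uniform_interval_prob` is made up, and clipping the interval to [a, b] is an added assumption so that endpoints outside the domain contribute nothing.

```python
def uniform_interval_prob(lo, hi, a=0.0, b=1.0):
    """P(lo <= X <= hi) for X ~ U[a, b].

    Assumption: the interval [lo, hi] is clipped to [a, b], because the
    density is zero outside the domain.
    """
    if b <= a:
        raise ValueError("need a < b")
    lo, hi = max(lo, a), min(hi, b)
    # length of the (clipped) interval times the height 1 / (b - a)
    return max(hi - lo, 0.0) / (b - a)

print(uniform_interval_prob(0.1, 0.4))            # approximately 0.3
print(uniform_interval_prob(2.0, 3.0, 0.0, 4.0))  # 0.25
```

The first call reproduces P r(.1 ≤ X ≤ .4) up to floating-point rounding.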
(2) What is P r(X + Y > 1.5)?
[Figure: the unit square with axes from 0 to 1 and the line X + Y = 1.5; the triangle above the line, where X + Y > 1.5, is shaded.]
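After drawing the picture, one can conclude that the shaded region of interest is 1/8th of
the total area of Ω, so P (X + Y > 1.5) = 1/8. Of course, we can also compute the area of this
triangle, whose base and height are both 1/2, and use that:
$$P(X + Y > 1.5) = \frac{1}{2} b_A \cdot h_A = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \left(\frac{1}{2}\right)^3 = \frac{1}{8}$$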
(3) Let A be X < .3, and let B be Y > .6, what is P r(A ∩ B)? What is the relationship between
P r(A ∩ B) and P r(A), P r(B)?
By now, it should be trivial that P (A) = P (X < .3) = 0.3 and P (B) = P (Y > .6) = 0.4.
Since X and Y are independent, and since events A and B contain only variables X and Y
respectively, we have that
P (A ∩ B) = P (A)P (B)
Or,
P (X < .3 ∩ Y > .6) = P (X < .3)P (Y > .6) = 0.3 · 0.4 = 0.12
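A quick simulation is a useful sanity check on area arguments like the ones in parts (2) and (3). This is a rough Monte Carlo sketch, not an exact computation; the helper `estimate`, the sample size, and the fixed seed are all choices made for this illustration.

```python
import random

def estimate(event, n=200_000, seed=0):
    """Estimate P(event) by drawing n independent pairs (x, y) from U[0, 1]."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n) if event(rng.random(), rng.random()))
    return hits / n

p_sum = estimate(lambda x, y: x + y > 1.5)           # part (2), exact value 1/8
p_both = estimate(lambda x, y: x < 0.3 and y > 0.6)  # part (3), exact value 0.12
print(p_sum, p_both)
```

The estimates should land close to 0.125 and 0.12, with an error that shrinks as `n` grows.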
(4) What is $Pr(X^2 + Y^2 > 1)$?
This problem is very similar to finding P (X + Y > 1.5). Note that $X^2 + Y^2 = 1$ is a circle,
centered on the origin, with radius 1. Note that on the unit square [0, 1] × [0, 1], we are looking
at 1/4 of a circle, and the region we are interested in is the complement of this quarter circle
as in the graphic below.
[Figure: the unit square with axes from 0 to 1 and the arc $X^2 + Y^2 = 1$; the region of interest lies outside the quarter circle.]
Recall that the area of a circle is $\pi r^2$, and since r = 1, the area of the circle would be π. The
area of the quarter circle is thus π/4. The area of the region of interest is thus
$$P(X^2 + Y^2 > 1) = 1 - \frac{\pi}{4}$$
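The area outside the quarter circle can also be checked numerically. The sketch below uses a midpoint-rule sum over vertical strips, which is a technique chosen for this illustration rather than anything from the original solution; the function name and the number of strips are arbitrary.

```python
import math

def area_outside_quarter_circle(n=100_000):
    """Midpoint-rule estimate of the area of {x^2 + y^2 > 1} in the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        # For this x, y must exceed sqrt(1 - x^2), leaving a strip
        # of height 1 - sqrt(1 - x^2) and width h.
        total += (1.0 - math.sqrt(1.0 - x * x)) * h
    return total

print(area_outside_quarter_circle(), 1 - math.pi / 4)
```

The two printed numbers should agree to several decimal places.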
(5) Suppose I tell you X + Y > 1, what is the probability that X > 1/2?
There are a few different ways to solve this problem. First, we can lightly shade the region
corresponding to the assumption/given that X + Y > 1 and note that this is a triangle. We
can then discard everything outside of the triangle.
[Figure: the unit square with axes X and Y. The triangle X + Y > 1 is lightly shaded; the part of it with X > 1/2, labeled X > 1/2 | X + Y > 1, is darkly shaded.]
Given the assumption that X + Y > 1, this triangle becomes our Ω, if you will. Then, we
can look at only the portion of the triangle such that X > 1/2 and let this trapezoid form our
A, if you will. Then the probability is
$$\begin{aligned}
P\left(X > \tfrac{1}{2} \mid X + Y > 1\right) &= \frac{\text{Area of Dark Shaded Region}}{\text{Area of Light Shaded Region}} \\
&= \frac{\text{Area of Trapezoid}}{\text{Area of Triangle}} \\
&= \frac{\text{Area of a Triangle} + \text{Area of a Rectangle}}{\text{Area of Triangle}} \\
&= \frac{\frac{1}{2} b_A h_A + w_A l_A}{\frac{1}{2} b_\Omega h_\Omega} \\
&= \frac{\frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2}}{\frac{1}{2} (1 \cdot 1)} \\
&= \frac{3/8}{1/2} = \frac{6}{8} = \frac{3}{4}
\end{aligned}$$
$b_A$ and $h_A$ are the base and height of the triangle in A, $w_A$ and $l_A$ are the width and length
of the rectangle in A, and $b_\Omega$ and $h_\Omega$ are the base and height of the triangle in Ω.
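The idea of discarding everything outside the given region carries over directly to simulation: keep only the draws with X + Y > 1 and see what fraction of them have X > 1/2. This is a rough Monte Carlo sketch with an arbitrary sample size and seed, meant only as a check on the area computation.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
given = 0  # draws that satisfy the condition X + Y > 1
both = 0   # draws that satisfy X + Y > 1 and also X > 1/2

for _ in range(200_000):
    x, y = rng.random(), rng.random()
    if x + y > 1:          # discard everything outside the triangle
        given += 1
        if x > 0.5:
            both += 1

estimate = both / given
print(estimate)  # should be close to 3/4
```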
Aside: There is another way that yields the same answer, and points out some interesting
facts that will help you in this course. By Bayes rule (aka conditional probability), we have
that:
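$$P\left(X > \tfrac{1}{2} \mid X + Y > 1\right) = \frac{P\left(X > \tfrac{1}{2} \cap X + Y > 1\right)}{P(X + Y > 1)}$$
But what is $P\left(X > \tfrac{1}{2} \cap X + Y > 1\right)$? Events A = {X > 1/2} and B = {X + Y > 1} are not
independent because B uses “part” of A. So
$$P\left(X > \tfrac{1}{2} \cap X + Y > 1\right) \ne P\left(X > \tfrac{1}{2}\right) P(X + Y > 1)$$
Using the trick with Bayes’ rule, we see that
$$\begin{aligned}
P\left(X > \tfrac{1}{2} \cap X + Y > 1\right) &= P(X + Y > 1)\, P\left(X > \tfrac{1}{2} \mid X + Y > 1\right) \\
&= P\left(X > \tfrac{1}{2}\right) P\left(X + Y > 1 \mid X > \tfrac{1}{2}\right)
\end{aligned}$$
Thus,
$$\begin{aligned}
P\left(X > \tfrac{1}{2} \mid X + Y > 1\right) &= \frac{P\left(X > \tfrac{1}{2} \cap X + Y > 1\right)}{P(X + Y > 1)} \\
&= \frac{P(X + Y > 1)\, P\left(X > \tfrac{1}{2} \mid X + Y > 1\right)}{P(X + Y > 1)} \\
&= P\left(X > \tfrac{1}{2} \mid X + Y > 1\right)
\end{aligned}$$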
And this term is computed exactly as in the previous method: the intersection has area 3/8 and
P (X + Y > 1) = 1/2, so the ratio is again 3/4. Note that the intuitive method presented earlier
gets us to the answer much quicker, but understanding Bayes’ rule and its uses is important.