Math 244 Exam 1
April 24, 2009
Name:
• Show all work and indicate your final answers clearly. Answers without clear justification will be given very little credit.
Problem   Possible Points   Score
1         12
2         10
3         12
4         12
5         15
6         21
7         15
Total     97
1. (12 Points) There is a common mutation for each of three different genes (A, B and
C). Suppose that 70% of people have the mutation of gene A, 60% have the mutation
of gene B and 52% have the mutation of gene C. 30% have the mutation of all three
genes. 40% have the mutation of both A and B, 32% have the mutation of both A and
C and 42% have the mutation of both B and C. Define the events:
A = {a randomly chosen person has the mutation on gene A}
B = {a randomly chosen person has the mutation on gene B}
C = {a randomly chosen person has the mutation on gene C}
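One way to handle this problem is to begin by drawing a Venn diagram and filling in
the probability of each region:
[Venn diagram of A, B and C. Region probabilities: A only .28; B only .08; C only .08; A and B only .10; A and C only .02; B and C only .12; all three .30; none of the three .02]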
(a) (15 Points) Express the event that a randomly chosen person will not have a mutation on any of the three genes in terms of A, B and C. Then find the probability
of this event.
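This event can be expressed as $A' \cap B' \cap C'$. Its probability is the region outside all three circles of the Venn diagram, which can also be obtained by inclusion-exclusion:
\[ P(A' \cap B' \cap C') = 1 - P(A \cup B \cup C) = 1 - (.7 + .6 + .52 - .4 - .32 - .42 + .3) = 1 - .98 = .02 \]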
(b) Find P (A ∪ C).
One can find this answer using the Venn Diagram or by
P (A ∪ C) = P (A) + P (C) − P (A ∩ C) = .7 + .52 − .32 = .9
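(c) Find $P(A' \cap (B \cup C))$.
To do this one can use the Venn diagram by recognizing the event $A' \cap (B \cup C)$ as
the part of $B \cup C$ that lies outside A. It consists of three regions: B only (.08), C only (.08), and B and C without A (.12). Thus,
\[ P(A' \cap (B \cup C)) = .08 + .08 + .12 = .28 \]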
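The three answers in Problem 1 can be cross-checked without the diagram. The following is a minimal sketch in Python, using exact fractions and inclusion-exclusion; the variable names are illustrative and not part of the exam.

```python
from fractions import Fraction as F

# Probabilities given in the problem statement, kept exact.
pA, pB, pC = F(70, 100), F(60, 100), F(52, 100)
pAB, pAC, pBC = F(40, 100), F(32, 100), F(42, 100)
pABC = F(30, 100)

# Inclusion-exclusion for the union of the three events.
p_union = pA + pB + pC - pAB - pAC - pBC + pABC

p_none = 1 - p_union              # part (a): no mutation at all
p_A_or_C = pA + pC - pAC          # part (b): P(A union C)
p_notA_and_BorC = p_union - pA    # part (c): inside the union but outside A

print(float(p_none), float(p_A_or_C), float(p_notA_and_BorC))
```

Part (c) uses the fact that A is contained in the union, so removing P(A) from P(A ∪ B ∪ C) leaves exactly the part of B ∪ C outside A.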
2. (10 Points) Suppose that in a box there are four pieces of paper with identical dimensions. One piece says “Win prize A”, one says “Win prize B”, one says “Win prize C”
and the remaining piece says “Win prizes A, B, and C!”. Suppose that you select a
random piece of paper from the box. Define the events:
A = {you win prize A}
B = {you win prize B}
C = {you win prize C}
(a) Show that the two events A and B are independent.
To show that the two events A and B are independent we must show that
$P(A) = P(A \mid B)$. I will argue separately that each of them is 1/2.
First, there are two slips of paper for which one would win prize A (the one that
says “Win prize A” and the one that says “Win prizes A, B, and C!”). Since there are
4 equally likely choices, $P(A) = 1/2$.
Similarly, $P(B) = 1/2$ as well. Now, $P(A \cap B)$ is the probability that
one wins both prizes A and B. Since only one choice achieves this,
$P(A \cap B) = 1/4$. Using these probabilities and the definition of conditional probability
we then have that:
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{1/4}{1/2} = \frac{1}{2} \]
Since P (A) = P (A|B) then A and B are independent.
(b) Show that the set of three events A, B and C is NOT an independent set
of events by considering $P(A \cap B \cap C)$.
For this part of the problem we recall that for a set of three events to be an
independent set they must not only be pairwise independent, but we must also have
that $P(A \cap B \cap C) = P(A)P(B)P(C)$. However, there is only one outcome where
one wins all three prizes, so $P(A \cap B \cap C) = 1/4$. On the other hand, we argued above
that $P(A) = P(B) = 1/2$, and the same argument gives $P(C) = 1/2$. Thus,
\[ P(A)P(B)P(C) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8} \neq \frac{1}{4} = P(A \cap B \cap C) \]
Thus the three events do not form an independent set.
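Because the sample space has only four equally likely outcomes, both claims in Problem 2 can be checked by direct enumeration. The sketch below is one way to do that in Python; the `slips` list and the `prob` helper are illustrative names, not part of the exam.

```python
from fractions import Fraction

# Each slip is modeled as the set of prizes it awards; all four are equally likely.
slips = [{"A"}, {"B"}, {"C"}, {"A", "B", "C"}]

def prob(*prizes):
    """Probability that the drawn slip awards every prize listed."""
    hits = sum(1 for slip in slips if set(prizes) <= slip)
    return Fraction(hits, len(slips))

# Pairwise independence: P(X and Y) == P(X) P(Y) for every pair.
pairwise_ok = all(
    prob(x, y) == prob(x) * prob(y)
    for x, y in [("A", "B"), ("A", "C"), ("B", "C")]
)

# Mutual independence additionally needs the triple product rule.
mutual_ok = prob("A", "B", "C") == prob("A") * prob("B") * prob("C")

print(pairwise_ok, mutual_ok)
```

The enumeration shows the usual lesson of this example: every pair passes the product rule, but the triple does not.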
3. (12 Points) One box contains six red balls and four green balls, and a second box
contains seven red balls and three green balls. A ball is randomly chosen from the first
box and placed in the second box. Then a ball is randomly selected from the second
box and placed in the first box.
(a) What is the probability that a red ball is selected from the second box?
To begin to answer this question I will define some events:
R1 = {A red is drawn first}
R2 = {A red is drawn second}
It is relatively easy to see that $P(R_1) = 6/10$ and $P(R_1^c) = 4/10$. Additionally, once we
know the first draw it is easy to find the conditional probabilities for the second draw, because the second box then holds 11 balls. Namely, $P(R_2 \mid R_1) = 8/11$ and
$P(R_2^c \mid R_1) = 3/11$, while $P(R_2 \mid R_1^c) = 7/11$ and $P(R_2^c \mid R_1^c) = 4/11$. Now we also recognize that $R_1$
and $R_1^c$ form a mutually exclusive and exhaustive set. Thus using the rule of total
probability we have that:
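\[
\begin{aligned}
P(R_2) &= P(R_1)P(R_2 \mid R_1) + P(R_1^c)P(R_2 \mid R_1^c) \\
&= \frac{6}{10} \cdot \frac{8}{11} + \frac{4}{10} \cdot \frac{7}{11} \\
&= \frac{48 + 28}{110} = \frac{76}{110} \approx 0.6909
\end{aligned}
\]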
(b) At the conclusion of the selection process, what is the probability that the numbers
of red and green balls in the first box are identical to the numbers at the beginning?
To do this problem we recognize that the only way that the numbers of red and green
balls can be identical to the start is if both draws are the same color. These are
described by the events $R_1 \cap R_2$ and $R_1^c \cap R_2^c$. Since these two events are mutually
exclusive we can write:
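\[
\begin{aligned}
P((R_1 \cap R_2) \cup (R_1^c \cap R_2^c)) &= P(R_1 \cap R_2) + P(R_1^c \cap R_2^c) \\
&= P(R_1)P(R_2 \mid R_1) + P(R_1^c)P(R_2^c \mid R_1^c) \\
&= \frac{6}{10} \cdot \frac{8}{11} + \frac{4}{10} \cdot \frac{4}{11} \\
&= \frac{64}{110} \approx 0.5818
\end{aligned}
\]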
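The two-stage calculation in Problem 3 is easy to get wrong by hand, so an exact check is worthwhile. The following is a minimal sketch in Python that derives the conditional probabilities from the ball counts rather than typing them in; all names are illustrative.

```python
from fractions import Fraction as F

# Starting contents of the two boxes.
box1 = {"red": 6, "green": 4}
box2 = {"red": 7, "green": 3}

def second_draw(moved, color):
    """P(second draw is `color` | a ball of color `moved` was added to box 2)."""
    contents = dict(box2)
    contents[moved] += 1
    return F(contents[color], sum(contents.values()))

n1 = sum(box1.values())
p_first = {c: F(box1[c], n1) for c in box1}

# Part (a): law of total probability over the color of the first draw.
p_red_second = sum(p_first[c] * second_draw(c, "red") for c in box1)

# Part (b): box 1 is unchanged exactly when both draws have the same color.
p_unchanged = sum(p_first[c] * second_draw(c, c) for c in box1)

print(p_red_second, float(p_red_second))
print(p_unchanged, float(p_unchanged))
```

Using `Fraction` keeps the arithmetic exact, so the printed fractions can be compared directly with the hand computation.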
4. (12 Points) Suppose a 5 letter word is selected at random by choosing each of the
letters from all 26 letters. Repeated letters are not allowed.
(a) What is the probability that the word contains at least one vowel (A, E, I, O or U)?
One of the ways to do this problem is to consider the complement of this event, which
is the event that no vowels are chosen. To do this one counts all the possible words,
of which there are 26 · 25 · 24 · 23 · 22, and the words with only consonants, of which there are
21 · 20 · 19 · 18 · 17. Thus the probability is given by:
\[ P(\text{at least one vowel}) = 1 - P(\text{no vowels}) = 1 - \frac{21 \cdot 20 \cdot 19 \cdot 18 \cdot 17}{26 \cdot 25 \cdot 24 \cdot 23 \cdot 22} \approx 0.69065 \]
(b) What is the probability that the word has the letter combination CH in it?
We already counted in part (a) the number of words in the sample space. Now we
need to count the number of words that contain CH. One way to think about this is
to first place the CH in the word. There are four different cases:
CH _ _ _
_ CH _ _
_ _ CH _
_ _ _ CH
For each of these four cases we then have to fill the remaining three slots with the other 24
letters. There are 24 · 23 · 22 ways to do this in each case. Thus the probability of
getting one of these words is given by
\[ P(\text{getting a CH}) = \frac{4 \cdot 24 \cdot 23 \cdot 22}{26 \cdot 25 \cdot 24 \cdot 23 \cdot 22} = \frac{4}{26 \cdot 25} \approx .006154 \]
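Both counts in Problem 4 are falling products, which Python exposes as `math.perm` (available from Python 3.8). The sketch below recomputes the two probabilities; the variable names are illustrative.

```python
from math import perm

total = perm(26, 5)               # ordered 5-letter words with no repeated letter
no_vowel = perm(21, 5)            # words built from the 21 consonants only

# Part (a): complement of "no vowels".
p_vowel = 1 - no_vowel / total

# Part (b): 4 positions for the block "CH", then 3 ordered letters
# chosen from the remaining 24.
with_ch = 4 * perm(24, 3)
p_ch = with_ch / total

print(round(p_vowel, 5), round(p_ch, 6))
```

Since no letter may repeat, a word can contain the block CH at most once, so the four cases do not overlap and can simply be added.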
5. (15 Points) If a student scores 400 or less on the SAT exam they have a 3% chance
of being accepted into SU. Similarly, if a student scores from 401 to 600 inclusive or
from 601 to 800 inclusive they have a 30% or 78% chance, respectively, of getting accepted into SU.
Among all those taking the SAT exam, 10% score at or below 400, 60% score between
401 and 600 inclusive, and 30% score 601 or higher.
To begin this problem I will first define 4 events:
A1 = {A randomly selected student scores between 0 and 400}
A2 = {A randomly selected student scores between 401 and 600}
A3 = {A randomly selected student scores 601 or higher}
S = {A randomly selected student is accepted at SU}
(a) What is the probability that a randomly selected student among all SAT takers
scores 601 or higher but gets rejected from SU?
This question asks us to compute $P(A_3 \cap S^c)$. We can evaluate this by recognizing
that:
\[
\begin{aligned}
P(A_3 \cap S^c) &= P(A_3) - P(A_3 \cap S) \\
&= P(A_3) - P(A_3)P(S \mid A_3) \\
&= P(A_3)(1 - P(S \mid A_3)) \\
&= (0.3)(1 - .78) = (.3)(.22) \\
&= .066
\end{aligned}
\]
(b) What is the probability that a randomly selected student among all SAT takers
gets accepted at SU?
This is a question of total probability. We seek to compute $P(S)$:
\[
\begin{aligned}
P(S) &= P(A_1)P(S \mid A_1) + P(A_2)P(S \mid A_2) + P(A_3)P(S \mid A_3) \\
&= (.1)(.03) + (.6)(.3) + (.3)(.78) \\
&= 0.417
\end{aligned}
\]
(c) What is the probability that a randomly selected student scored between 401 and
600 inclusive if it is known that the student was accepted at SU?
This is a question that uses Bayes’ Theorem. We seek to compute $P(A_2 \mid S)$:
\[
\begin{aligned}
P(A_2 \mid S) &= \frac{P(S \mid A_2)P(A_2)}{P(A_1)P(S \mid A_1) + P(A_2)P(S \mid A_2) + P(A_3)P(S \mid A_3)} \\
&= \frac{(.6)(.3)}{(.1)(.03) + (.6)(.3) + (.3)(.78)} \\
&\approx 0.43165
\end{aligned}
\]
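All three parts of Problem 5 use the same two tables of numbers, so they can be computed together. The following is a minimal sketch in Python; the dictionary names `prior` and `accept` are illustrative, and floating-point arithmetic is used, so results are compared with a small tolerance.

```python
# P(A_i): share of SAT takers in each score band.
prior = {"A1": 0.10, "A2": 0.60, "A3": 0.30}
# P(S | A_i): acceptance rate within each band.
accept = {"A1": 0.03, "A2": 0.30, "A3": 0.78}

# Part (a): in the top band and rejected.
p_a3_and_reject = prior["A3"] * (1 - accept["A3"])

# Part (b): law of total probability.
p_s = sum(prior[k] * accept[k] for k in prior)

# Part (c): Bayes' theorem, giving P(A_i | S) for every band at once.
posterior = {k: prior[k] * accept[k] / p_s for k in prior}

print(round(p_a3_and_reject, 3), round(p_s, 3), round(posterior["A2"], 5))
```

Computing the whole posterior rather than only the A2 entry gives a cheap sanity check, since the three posterior probabilities must add up to 1.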
6. (21 points) In a greenhouse 10 seeds are planted and after 3 weeks their heights are
recorded in centimeters. The ten measurements are:
0.7, 4.9, 5.7, 5.8, 5.9, 8.5, 9.0, 9.0, 9.1, 9.9
(a) Compute the sample mean and median.
The mean is the average of the 10 numbers:
\[ \bar{x} = \frac{0.7 + 4.9 + 5.7 + 5.8 + 5.9 + 8.5 + 9.0 + 9.0 + 9.1 + 9.9}{10} = 6.85 \]
The median is the middle number or the average of the two middle numbers. In this
case the median is:
\[ \tilde{x} = \frac{5.9 + 8.5}{2} = 7.2 \]
(b) Compute sample standard deviation.
The standard deviation is the square root of the variance, which is given by:
\[ s^2 = \frac{(0.7^2 + 4.9^2 + 5.7^2 + 5.8^2 + 5.9^2 + 8.5^2 + 9.0^2 + 9.0^2 + 9.1^2 + 9.9^2) - \frac{(0.7 + 4.9 + 5.7 + 5.8 + 5.9 + 8.5 + 9.0 + 9.0 + 9.1 + 9.9)^2}{10}}{9} = 7.920556 \]
Thus the standard deviation is given by:
\[ s = \sqrt{7.920556} \approx 2.814 \]
(c) Compute the 20% trimmed sample mean.
The 20% trimmed mean is the average of the middle 6 numbers: since 20% of 10 is 2,
we remove the 2 largest and the 2 smallest observations:
\[ \bar{x}_{tr(20)} = \frac{5.7 + 5.8 + 5.9 + 8.5 + 9.0 + 9.0}{6} \approx 7.32 \]
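The summary statistics in parts (a) through (c) can be reproduced with Python's standard `statistics` module. In the sketch below, `trimmed_mean` is a small helper written for this example (it is not a library function), and it assumes the trimming fraction times the sample size is a whole number, as it is here.

```python
import statistics

heights = [0.7, 4.9, 5.7, 5.8, 5.9, 8.5, 9.0, 9.0, 9.1, 9.9]

mean = statistics.mean(heights)
median = statistics.median(heights)
variance = statistics.variance(heights)   # sample variance, divisor n - 1
std_dev = statistics.stdev(heights)       # square root of the sample variance

def trimmed_mean(data, fraction):
    """Drop `fraction` of the observations from each end, then average the rest."""
    ordered = sorted(data)
    k = round(fraction * len(ordered))    # number dropped from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

tmean = trimmed_mean(heights, 0.20)

print(round(mean, 2), round(median, 2), round(variance, 6),
      round(std_dev, 3), round(tmean, 2))
```

`statistics.variance` and `statistics.stdev` use the n − 1 divisor, which matches the shortcut formula with 9 in the denominator.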
(d) Compute the fourth spread.
For the fourth spread we must first compute the upper and lower fourth. The upper
fourth is the median of the largest 5 numbers in this case which is 9. The lower fourth
is the median of the smallest 5 observations which in this case is 5.7. The difference
of these two is the fourth spread. Thus,
fs = 9 − 5.7 = 3.3
(e) Decide if there are outliers or extreme outliers in the data.
Outliers are more than 1.5 · fs = 1.5 · 3.3 = 4.95 below the lower fourth or more
than 4.95 above the upper fourth. Since 5.7 − 4.95 = .75 and .7 < .75, the observation .7 is
indeed an outlier. Since 9 + 4.95 = 13.95 and there are no observations larger than
this number, there are no outliers at this end. To check whether .7 is an extreme
outlier we must subtract 3(fs) from the lower fourth: 5.7 − 3(3.3) = −4.2. Since this
is smaller than .7, there are no extreme outliers.
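The fourth spread and the outlier rule in parts (d) and (e) can be sketched in a few lines of Python. The helper name `beyond` is illustrative, and the split into halves assumes an even sample size as in this data set; for odd n, the median is conventionally included in both halves, which this sketch does not handle.

```python
import statistics

heights = sorted([0.7, 4.9, 5.7, 5.8, 5.9, 8.5, 9.0, 9.0, 9.1, 9.9])
half = len(heights) // 2                  # n is even, so the halves do not overlap

lower_fourth = statistics.median(heights[:half])   # median of the smallest 5
upper_fourth = statistics.median(heights[half:])   # median of the largest 5
fs = upper_fourth - lower_fourth

def beyond(data, k):
    """Observations lying more than k * fs outside the fourths."""
    lo = lower_fourth - k * fs
    hi = upper_fourth + k * fs
    return [x for x in data if x < lo or x > hi]

outliers = beyond(heights, 1.5)   # mild or extreme outliers
extreme = beyond(heights, 3.0)    # extreme outliers only

print(lower_fourth, upper_fourth, round(fs, 2), outliers, extreme)
```

An observation that appears in `outliers` but not in `extreme` is a mild outlier.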
(f) Construct a stem and leaf diagram using 10 stems.
Here the stem is the ones digit and the leaf is the tenths digit.
0 | 7
1 |
2 |
3 |
4 | 9
5 | 7 8 9
6 |
7 |
8 | 5
9 | 0 0 1 9
(g) What features of the data can you identify from the stem and leaf diagram?
From this diagram we can see that the data is bimodal with peaks somewhere near 5
and 9. The tails are heavier to the negative side for each of the peaks. The peak at
9 seems to be larger than the peak at 5. There are gaps at stems 6 and 7. One can
see that the observation at .7 is most likely an outlier.
7. (15 Points) A computer lab has 10 computers in it. The probability that any computer
has a virus is .2 independent of any other computer.
(a) What is the probability that all of the computers have the virus?
With our knowledge of binomial random variables we can now think of the quantity
X = the number of computers with the virus, as a binomial random variable with
parameters n = 10 and p = .2. We are then after the probability $P(X = 10)$.
Using our pdf we know that:
\[ P(X = 10) = \binom{10}{10}(.2)^{10}(.8)^{10-10} = (.2)^{10} \approx 1.024 \times 10^{-7} \]
(b) What is the probability that at least one computer in the lab has a virus?
Using the same random variable as above we can phrase this as:
\[ P(X \ge 1) = 1 - P(X = 0) = 1 - \binom{10}{0}(.2)^{0}(.8)^{10-0} = 1 - (.8)^{10} \approx .8926 \]
(c) What is the probability that exactly one of the computers has a virus?
Using the same random variable we are then asked for $P(X = 1)$. Using our pdf
formula we have that:
\[ P(X = 1) = \binom{10}{1}(.2)^{1}(.8)^{10-1} \approx .26844 \]
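The three binomial probabilities in Problem 7 can be checked with a direct implementation of the binomial formula. The following is a minimal sketch in Python using `math.comb`; the function name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.2

p_all = binom_pmf(10, n, p)              # part (a): all ten infected
p_at_least_one = 1 - binom_pmf(0, n, p)  # part (b): complement of none infected
p_exactly_one = binom_pmf(1, n, p)       # part (c): exactly one infected

print(p_all, round(p_at_least_one, 4), round(p_exactly_one, 5))
```

As a sanity check, summing `binom_pmf(k, n, p)` over k = 0, ..., 10 should give 1 up to floating-point rounding.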