Introduction to the Practice of Statistics
Fifth Edition
Moore, McCabe
Section 4.3 Homework Answers
4.47 Example 4.14 (page 269) gives a model for an attempt to send an email message to someone you
don’t know by sending email to an acquaintance who you think is closer to your target and asking
him or her to pass on the message. At each step after you send the initial email, the probability is
0.37 that the recipient will pass on your message. Steps are independent. Let T be the total length
of the chain. That is, if your friend does not relay your message, T= 2 because the chain stops at the
2nd person. If your friend passes it on but the next person does not, T= 3, and so on.
a) What is P(T = 2)? What is P(T = 3)?
I am going to create a tree diagram (we will look at this more formally later) to help me visualize
the situation. The decisions of each person are independent, so I can reuse the same probability values at
each stage. Notice that organizing the work, with both the tree and function notation, makes the task
easier.
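[Tree diagram: starting at "Get message", each recipient either passes the message on (probability 0.37)
or does not pass it (probability 0.63). The "No Pass" branches at the first, second, and third stages end
the chain at T = 2, T = 3, and T = 4, respectively.]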
As I answer the following probability questions, look at the tree and at the definition of the
random variable T.
P(T = 2) = P(not pass) = 0.63
P(T = 3) = P(pass AND not pass) = 0.37(0.63) = 0.2331
P(T = 4) = P(pass AND pass AND not pass) = (0.37)^2(0.63) = 0.086247. Yes, I know this was not asked
for, but I will need it for the next question, so why not do it now.
b) Express in words the meaning of P(T ≤ 4). What is the value of this probability?
In words, P(T ≤ 4) is the probability that the chain stops at or before the fourth person.
P(T ≤ 4) = P(T = 2 OR T = 3 OR T = 4)
= 0.63 + 0.2331 + 0.086247
= 0.949347
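The arithmetic in parts (a) and (b) can be checked with a short Python sketch. The function names `chain_length_pmf` and `chain_length_cdf` are made up for this illustration; the only inputs taken from the exercise are the pass probability 0.37 and the rule that the chain stops at the first person who does not pass the message on.

```python
P_PASS = 0.37  # probability that a recipient passes the message on (from the exercise)


def chain_length_pmf(k: int, p_pass: float = P_PASS) -> float:
    """P(T = k): k - 2 passes in a row, then one person who does not pass."""
    if k < 2:
        return 0.0  # the chain always reaches at least the 2nd person
    return (1 - p_pass) * p_pass ** (k - 2)


def chain_length_cdf(k: int, p_pass: float = P_PASS) -> float:
    """P(T <= k): add up P(T = 2), P(T = 3), ..., P(T = k)."""
    return sum(chain_length_pmf(j, p_pass) for j in range(2, k + 1))


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(f"P(T = {k}) = {chain_length_pmf(k):.6f}")
    print(f"P(T <= 4) = {chain_length_cdf(4):.6f}")
```

Running the script should reproduce the values 0.63, 0.2331, 0.086247, and 0.949347 worked out by hand.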
c) No, there is no part (c). But I will take the opportunity to make a few remarks about this
distribution. It is discrete. What is the sample space? Here is the set of possible values of T:
{2, 3, 4, 5, 6, …}. Notice there is no upper value, since in theory the chain could go on forever.
Yet it is still a discrete distribution, because I can assign a probability to each individual value of T:
P(T = k) = (0.63)(0.37)^(k − 2), for k = 2, 3, 4, …
This is unlike a continuous distribution. If the random variable H is continuous, then P(H = 6) = 0,
regardless of what H represents or the shape of its distribution.
A real chain can't go on forever, but in this mathematical model it can. Of course,
the chance that T = 1000 is almost nothing. If you could list all the probabilities, their sum would
equal 1.
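The claim that the probabilities sum to 1 is not worked out in the answer key. One way to see it, written here in LaTeX notation, is the geometric series formula with ratio 0.37:

```latex
\sum_{k=2}^{\infty} P(T = k)
  = \sum_{k=2}^{\infty} (0.63)(0.37)^{k-2}
  = 0.63 \sum_{j=0}^{\infty} (0.37)^{j}
  = \frac{0.63}{1 - 0.37}
  = 1
```

The substitution j = k − 2 turns the sum into a standard geometric series, which converges because 0.37 < 1.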
4.48 Some games of chance rely on tossing two dice. Each die has six faces, marked with 1, 2, 3, …, 6
spots called pips. The dice used in casinos are carefully balanced so that each face is equally likely to
come up. When two dice are tossed, each of the 36 possible pairs of faces is equally likely to come up.
The outcome of interest to a gambler is the sum of the pips on the two up-faces. Call this random
variable X.
a) Write down all possible pairs of faces.
Columns give the first number in each pair and rows give the second.
        1      2      3      4      5      6
1     (1,1)  (2,1)  (3,1)  (4,1)  (5,1)  (6,1)
2     (1,2)  (2,2)  (3,2)  (4,2)  (5,2)  (6,2)
3     (1,3)  (2,3)  (3,3)  (4,3)  (5,3)  (6,3)
4     (1,4)  (2,4)  (3,4)  (4,4)  (5,4)  (6,4)
5     (1,5)  (2,5)  (3,5)  (4,5)  (5,5)  (6,5)
6     (1,6)  (2,6)  (3,6)  (4,6)  (5,6)  (6,6)
b) Each pair has a 1/36 chance of appearing.
c) Write the value of X next to each pair of faces and use this information with the results of b) to
give the probability distribution of X. Draw a probability histogram to display the distribution.
X      2     3     4     5     6     7     8     9     10    11    12
P(x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
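The distribution of X can also be built by brute force: list all 36 equally likely pairs and count how many give each sum. The sketch below does this in Python; the function name `dice_sum_distribution` is a label chosen for this illustration, and `Fraction` is used so that the results are exact.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def dice_sum_distribution(faces: int = 6) -> dict:
    """Exact distribution of the sum of two fair dice, found by listing every pair."""
    pairs = product(range(1, faces + 1), repeat=2)  # all (first die, second die) pairs
    counts = Counter(a + b for a, b in pairs)       # how many pairs give each sum
    total = faces * faces                           # 36 pairs for ordinary dice
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


if __name__ == "__main__":
    for total_pips, prob in dice_sum_distribution().items():
        print(total_pips, prob)
```

Note that `Fraction` reduces to lowest terms, so 2/36 prints as 1/18 and 6/36 as 1/6; the values are the same as in the table.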
4.52 Weary of low turnout in student elections, a college administrator decides to choose an SRS of three
students to form an advisory board that represents student opinion. Suppose that 40% of all students
oppose the use of student fees to fund student interest groups, and that the opinions of the three
students on the board are independent. Then the probability is 0.4 that each opposes the funding of
interest groups.
a) Call the three students A, B, and C. What is the probability that A and B support funding and C
opposes it?
P(A supports AND B supports AND C opposes) = (0.6)(0.6)(0.4) = 0.144
b) List all possible combinations of opinions that can be held by students A, B, and C. Then give the
probabilities of each of these outcomes. Note that they are not all equally likely.
Define the ordered triple SSO to mean that A supports, B supports, and C opposes.
The sample space is then {SSS, SSO, SOS, OSS, SOO, OSO, OOS, OOO}.
P(SSS) = (0.6)^3 = 0.216
P(SSO) = P(SOS) = P(OSS) = (0.6)^2(0.4) = 0.144
P(SOO) = P(OSO) = P(OOS) = (0.6)(0.4)^2 = 0.096
P(OOO) = (0.4)^3 = 0.064
c) Let the random variable X be the number of student representatives who oppose the funding of
interest groups. Give the probability distribution of X.
X      P(X)
0      (0.6)^3 = 0.216
1      3(0.6)^2(0.4) = 0.432
2      3(0.6)(0.4)^2 = 0.288
3      (0.4)^3 = 0.064
d) Express the event “a majority of the advisory board opposes funding” in terms of X and find its
probability.
P(X ≥ 2) = P(X = 2 OR X = 3)
= P(X = 2) + P(X = 3)
= 3(0.6)(0.4)^2 + (0.4)^3
= 0.288 + 0.064
= 0.352
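Parts (b) through (d) follow one recipe: list the eight outcomes, multiply probabilities along each one (independence), then group outcomes by the number of O's. A minimal Python sketch of that recipe is below. The helper names `outcome_probabilities` and `distribution_of_x` are invented for this illustration; the probabilities 0.6 and 0.4 come from the exercise.

```python
from itertools import product

# S = supports funding, O = opposes funding (probabilities from the exercise)
OPINION_PROB = {"S": 0.6, "O": 0.4}


def outcome_probabilities() -> dict:
    """Probability of each ordered triple of opinions for students A, B, and C."""
    outcomes = {}
    for triple in product("SO", repeat=3):
        prob = 1.0
        for opinion in triple:
            prob *= OPINION_PROB[opinion]  # independence: multiply the three probabilities
        outcomes["".join(triple)] = prob
    return outcomes


def distribution_of_x(outcomes: dict) -> dict:
    """Group the outcomes by X = number of students who oppose funding."""
    dist = {x: 0.0 for x in range(4)}
    for label, prob in outcomes.items():
        dist[label.count("O")] += prob
    return dist


if __name__ == "__main__":
    outcomes = outcome_probabilities()
    dist = distribution_of_x(outcomes)
    print({x: round(p, 3) for x, p in dist.items()})
    print("P(X >= 2) =", round(dist[2] + dist[3], 3))
```

The factor of 3 in P(X = 1) and P(X = 2) shows up here as three separate outcomes landing in the same group.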
4.53 Let X be a random number between 0 and 1 produced by the idealized uniform random number
generator described in Example 4.18 and Figure 4.9. Find the following probabilities:
(a) P(X < 0.5) = 1(0.5 – 0) = 0.5
[Figure: the uniform density curve, with height 1/(1 − 0) = 1 between 0 and 1.]
(b) P(X ≤ 0.5) = 1(0.5 – 0) = 0.5
(c) What important fact about continuous random variables does comparing your answers to (a) and
(b) illustrate?
This shows that for a uniform distribution, which is an example of a continuous distribution, the
probability of getting an exact value is zero. That is P(X = 0.5) = 0.
4.54 Many random number generators allow users to specify the range of the random numbers to be
produced. Suppose that you specify that the range is to be all numbers between 0 and 2. Call
the random number generated Y. Then the density curve of the random variable Y has
constant height between 0 and 2, and height 0 elsewhere.
a) What is the height of the density curve between 0 and 2? Draw a graph of the density curve.
The height is 1/(2 − 0) = 0.5.
[Figure: the density curve of Y, with constant height 0.5 between 0 and 2.]
b) Use your graph from (a) and the fact that probability is area under the curve to find P(Y ≤ 1).
P(Y ≤ 1) = (0.5)(1 – 0) = 0.5
c) Find P(0.5 < Y < 1.3)
P(0.5 < Y < 1.3) = (0.5)(1.3 – 0.5) = 0.4
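Every probability in 4.54 is the area of a rectangle: height times width. The Python sketch below wraps that idea in one function. The name `uniform_prob` and its clipping behavior (ignoring any part of the interval outside 0 to 2) are choices made for this illustration, not something taken from the textbook.

```python
def uniform_prob(a: float, b: float, low: float = 0.0, high: float = 2.0) -> float:
    """P(a < Y < b) for Y uniform between low and high: rectangle height times width."""
    height = 1.0 / (high - low)              # total area under the density must be 1
    left, right = max(a, low), min(b, high)  # keep only the part inside [low, high]
    return height * max(0.0, right - left)


if __name__ == "__main__":
    print("height          =", 1.0 / (2.0 - 0.0))
    print("P(Y <= 1)       =", uniform_prob(0.0, 1.0))
    print("P(0.5 < Y < 1.3) =", round(uniform_prob(0.5, 1.3), 4))
```

Because single points have zero area, the function gives the same answer whether the endpoints are included or not.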
4.56 Generate two random numbers between 0 and 1 and take Y to be their sum. Then Y is a continuous
random variable that can take any value between 0 and 2. The density curve of Y is the triangle shown in
Figure 4.12.
FIGURE 4.12 The density curve for the sum Y of two random numbers, for Exercise 4.56.
(a) Verify by geometry that the area under this curve is 1.
The area of a triangle has the following formula: area = ½ (base)(height). What this problem is trying to
emphasize again is the fact that, in order to calculate probabilities or relative frequencies for continuous
random variables (measurements which in theory are real numbers, even though measuring instruments
are limited in accuracy), we find areas under the density curve.
Area = ½ (2)(1) = 1.
(b) What is the probability that Y is less than 1? (Sketch the density curve, shade the area that represents
the probability, then find that area. Do this for (c) also.)
P(Y < 1) = ½(1)(1)
= 0.5
(c) What is the probability that Y is less than 0.5?
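P(Y < 0.5) = ½ (0.5)(0.5)
= 0.125
The height here is 0.5, not 1: the density rises in a straight line from 0 at Y = 0 to 1 at Y = 1, so the
shaded triangle has base 0.5 and height 0.5.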
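For the triangular density, the area to the left of any point can be written down from geometry. The Python sketch below assumes the shape described in the exercise: a triangle on 0 to 2 with its peak of height 1 at Y = 1, so the height at a point y is y on the left half and 2 − y on the right half. The function name `triangle_cdf` is made up for this illustration.

```python
def triangle_cdf(y: float) -> float:
    """P(Y < y) for the triangular density on [0, 2] that peaks at height 1 when y = 1."""
    if y <= 0:
        return 0.0
    if y <= 1:
        # Left of the peak: a small triangle with base y and height y.
        return 0.5 * y * y
    if y < 2:
        # Right of the peak: total area 1 minus the triangle remaining to the right.
        return 1.0 - 0.5 * (2.0 - y) * (2.0 - y)
    return 1.0


if __name__ == "__main__":
    print("total area =", triangle_cdf(2.0))
    print("P(Y < 1)   =", triangle_cdf(1.0))
    print("P(Y < 0.5) =", triangle_cdf(0.5))
```

The key point is that the height used in the area formula is the height of the curve at the cutoff, which is 0.5 when the cutoff is 0.5.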
4.57 How many close friends do you have? Suppose that the number of close friends adults claim to have
varies from person to person with mean µ = 9 and standard deviation σ = 2.5. An opinion poll asks this
question of an SRS of 1100 adults. We will see in the next chapter that in this situation the sample mean
response x̄ has approximately the normal distribution with mean 9 and standard deviation 0.075. What is
P(8 ≤ x̄ ≤ 10), the probability that the statistic x̄ estimates the parameter µ to within ± 1?
The key to this question is that they tell you (1) what is being measured, which is x̄; (2) how this
measurement is distributed, which is normal; and (3) the parameters needed to work with a normal
distribution, µ = 9 and σ = 0.075.
This information allows me to draw a figure and shade the area that corresponds
to the question P(8 ≤ x̄ ≤ 10).
[Figure: normal density curve for x̄, centered at 9.]
If I go three standard deviations on either side of the mean, what region do I cover?
(9 – 3(0.075), 9 + 3(0.075)) = (8.775, 9.225). According to the 68-95-99.7 rule, in the long run the
sample mean falls in this region 99.7% of the time. But I am interested in an even bigger region, (8, 10).
I can safely say that a sample mean x̄ is almost guaranteed to fall in this region. As a matter of fact,
we would be astounded if the sample mean were not in the range (8, 10). This is why the answer to this
question is P(8 ≤ x̄ ≤ 10) ≈ 1.
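The 68-95-99.7 argument can be confirmed numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the variable names are chosen for this illustration, and the mean 9 and standard deviation 0.075 are the values given in the exercise.

```python
from statistics import NormalDist

# Distribution of the sample mean as given in the exercise.
sample_mean_dist = NormalDist(mu=9, sigma=0.075)

# Probability of landing within three standard deviations of the mean.
three_sigma = sample_mean_dist.cdf(9.225) - sample_mean_dist.cdf(8.775)

# Probability of landing within 1 of the mean, the region the question asks about.
within_one = sample_mean_dist.cdf(10) - sample_mean_dist.cdf(8)

if __name__ == "__main__":
    print(f"P(8.775 <= sample mean <= 9.225) = {three_sigma:.4f}")
    print(f"P(8 <= sample mean <= 10)        = {within_one:.6f}")
    print("standard deviations from 9 to 10 =", round((10 - 9) / 0.075, 2))
```

The endpoints 8 and 10 sit more than 13 standard deviations from the mean, which is why the second probability is indistinguishable from 1.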
4.58 A sample survey contacted an SRS of 663 registered voters in Oregon shortly after an election and
asked respondents whether they had voted. Voter records show that 56% of registered voters had
actually voted. We will see in the next chapter that in this situation the proportion p̂ of the sample
who voted has approximately the normal distribution with mean µ = 0.56 and standard deviation
σ = 0.019.
a) If the respondents answer truthfully, what is P(0.52 ≤ p̂ ≤ 0.60)? This is the probability that the
statistic p̂ estimates the parameter 0.56 within plus or minus 0.04.
P(0.52 ≤ p̂ ≤ 0.60) = P( (0.52 − 0.56)/0.019 ≤ Z ≤ (0.60 − 0.56)/0.019 )
= P(−2.105 ≤ Z ≤ 2.105)
= 0.9647
b) In fact, 72% of the respondents said they had voted (p̂ = 0.72). If respondents answer truthfully,
what is P(p̂ ≥ 0.72)? This probability is so small that it is good evidence that some people who did
not vote claimed that they did vote.
P(p̂ ≥ 0.72) = P( Z ≥ (0.72 − 0.56)/0.019 )
= P(Z ≥ 8.42)
≈ 0
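Both parts of 4.58 use the same two steps: standardize to a z-score, then look up the standard normal probability. The Python sketch below mirrors those steps with `NormalDist` from the standard `statistics` module in place of a normal table. The helper name `z_score` is invented for this illustration; the mean 0.56 and standard deviation 0.019 are from the exercise.

```python
from statistics import NormalDist

MU, SIGMA = 0.56, 0.019        # mean and standard deviation of p-hat (from the exercise)
STANDARD_NORMAL = NormalDist() # mean 0, standard deviation 1


def z_score(value: float) -> float:
    """Number of standard deviations that value lies from the mean."""
    return (value - MU) / SIGMA


# Part (a): probability that p-hat falls within 0.04 of 0.56.
z_low, z_high = z_score(0.52), z_score(0.60)
p_within = STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low)

# Part (b): probability that p-hat is 0.72 or larger.
z_claimed = z_score(0.72)
p_tail = 1.0 - STANDARD_NORMAL.cdf(z_claimed)

if __name__ == "__main__":
    print(f"z range for (a): {z_low:.3f} to {z_high:.3f}")
    print(f"P(0.52 <= p-hat <= 0.60) = {p_within:.4f}")
    print(f"z for (b): {z_claimed:.2f}")
    print(f"P(p-hat >= 0.72) = {p_tail:.2e}")
```

A z-score above 8 is far outside the range of any printed normal table, which is why the hand calculation simply reports the tail probability as approximately 0.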