Introduction to the Practice of Statistics
Sixth Edition
Moore, McCabe
Section 5.1 Homework Answers
5.18 Attitudes toward drinking and behavior studies. Some of the methods in this section are
approximations rather than exact probability results. We have given rules of thumb for safe use of these
approximations.
(a) You are interested in attitudes toward drinking among the 75 members of a fraternity. You choose 30
members at random to interview. One question is "Have you had five or more drinks at one time during the
last week?" Suppose that in fact 30% of the 75 members would say "Yes." Explain why you cannot safely
use the B(30, 0.3) distribution for the count X in your sample who say "Yes."
The binomial distribution assumes independent trials. Here the 30 members are drawn without
replacement from a population of only 75, so the trials are not independent, and the
probabilities change too much from draw to draw for us to disregard that.
P(drink) = 0.3,
P(drink | drink and drink and not drink and not drink and not drink and not drink) = 0.2857, and
this is after only the first few draws; by the 20th draw, the probability of success need not be close to 0.3.
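One way to gauge the size of the problem is to compare the spread of the count under B(30, 0.3) with the spread under sampling without replacement from 75 members. The sketch below is an illustration, not part of the textbook solution. It assumes the standard finite population correction factor √((N – n)/(N – 1)) and treats p = 0.3 as exact even though 30% of 75 is not a whole number. The function names are made up for this example.

```python
import math

def binomial_sd(n, p):
    """Standard deviation of a B(n, p) count."""
    return math.sqrt(n * p * (1 - p))

def finite_population_sd(N, n, p):
    """SD of the count when n items are drawn without replacement from N
    (binomial SD times the finite population correction)."""
    return binomial_sd(n, p) * math.sqrt((N - n) / (N - 1))

N, n, p = 75, 30, 0.3
print(f"sample fraction n/N    = {n / N:.2f}")
print(f"binomial SD            = {binomial_sd(n, p):.3f}")
print(f"without-replacement SD = {finite_population_sd(N, n, p):.3f}")
```

The sample is 40% of the population, and under these assumptions the binomial model overstates the spread of the count by a noticeable margin.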
(b) The National AIDS Behavioral Surveys found that 0.2% (that's 0.002 as a decimal fraction) of adult
heterosexuals had both received a blood transfusion and had a sexual partner from a group at high risk of
AIDS. Suppose that this national proportion holds for your region. Explain why you cannot safely use the
Normal approximation for the sample proportion who fall in this group when you interview an SRS of
1000 adults.
The criteria are np ≥ 10 and n(1 – p) ≥ 10.
The probability of success is p = 0.002.
np = 1000(0.002) = 2, which is less than 10, so the Normal approximation is not safe here.
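The rule-of-thumb check can be written as a small helper. This is a minimal sketch; the function name and the default threshold of 10 simply mirror the criteria stated above.

```python
def normal_approx_ok(n, p, threshold=10):
    """Return (np, n(1-p), ok) for the rule of thumb np >= 10 and n(1-p) >= 10."""
    successes = n * p
    failures = n * (1 - p)
    ok = successes >= threshold and failures >= threshold
    return successes, failures, ok

# Part (b): SRS of 1000 adults with p = 0.002
print(normal_approx_ok(1000, 0.002))
# For contrast, the Gallup setting of Exercise 5.22
print(normal_approx_ok(1016, 0.49))
```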
5.22 The ideal number of children. "What do you think is the ideal number of children for a family to
have?" A Gallup Poll asked this question of 1016 randomly chosen adults. Almost half (49%) thought two
children was ideal. Suppose that p = 0.49 is exactly true for the population of all adults. Gallup announced
a margin of error of ±3 percentage points for this poll. What is the probability that the sample proportion p̂
for an SRS of size n = 1016 falls between 0.46 and 0.52? You see that it is likely, but not certain, that polls
like this give results that are correct within their margin of error. We will say more about margins of error
in Chapter 6.
A similar question was asked in chapter 4, section 3, 4.66.
n = 1016, p = 0.49. The expected number of successes is 1016(0.49) = 497.84
and the expected number of failures is 1016(0.51) = 518.16,
so the distribution in this binomial situation
resembles a normal distribution.
Let X count the number of people who think 2 children is
ideal.
Sample space for X: {0, 1, 2, …, 1016}
Or, for p̂: {0, 1/1016, 2/1016, …, 1015/1016, 1}
Each sample space consists of 1017 values.
0.46(1016) = 467.36, 0.52(1016) = 528.32
Here is what we want expressed as either a count or proportion:
P(0.46 ≤ p̂ ≤ 0.52) ≈ P(467 ≤ X ≤ 528)
You can see that all we can do is approximate the question, since a count of 467.36, for example,
is impossible.
P(467 ≤ X ≤ 528) = P(X ≤ 528) – P(X ≤ 466) = 0.9728 – 0.02454 = 0.9483
=binomdist(528, 1016, 0.49,true) – binomdist(466,1016,0.49, true)
Why did I change from 467 to 466? Because I want to include 467 in the calculation, and since I have
a discrete distribution, I need to take away 466, 465, and so on.
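For readers without a spreadsheet, the same cumulative binomial difference can be sketched in Python. The helper names binom_pmf and binom_cdf are made up for this example; binom_cdf(k, n, p) is intended to play the role of binomdist(k, n, p, true). The pmf is evaluated in log space because terms such as 0.49 raised to a large power underflow in floating point.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), evaluated in log space to avoid underflow."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coeff + k * math.log(p) + (n - k) * math.log(1 - p))

def binom_cdf(k, n, p):
    """P(X <= k), the analogue of binomdist(k, n, p, true)."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

n, p = 1016, 0.49
# Subtract P(X <= 466) so that the count 467 itself stays in the interval.
prob = binom_cdf(528, n, p) - binom_cdf(466, n, p)
print(f"P(467 <= X <= 528) = {prob:.4f}")
```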
Normal Approximation: in the scenario posed, most researchers will typically use a normal
approximation rather than the exact binomial calculation.
Again we meet the criteria np ≥ 10 and n(1 – p) ≥ 10, since 1016(0.49) = 497.84 and
1016(0.51) = 518.16.
0.46(1016) = 467.36, 0.52(1016) = 528.32
P(0.46 ≤ p̂ ≤ 0.52) ≈ P(X ≤ 528.32) – P(X ≤ 467.36)
≈ P(Z ≤ (528.32 – 1016(0.49)) / √(1016(0.49)(0.51))) – P(Z ≤ (467.36 – 1016(0.49)) / √(1016(0.49)(0.51)))
≈ P(Z < 1.91) – P(Z < –1.91)
≈ 0.9719 – 0.0281
≈ 0.9438 Notice that this value is slightly smaller than the one using the binomial
routine.
The same calculation in terms of p̂:
P(0.46 ≤ p̂ ≤ 0.52) ≈ P(Z ≤ (0.52 – 0.49) / √((0.49)(0.51)/1016)) – P(Z ≤ (0.46 – 0.49) / √((0.49)(0.51)/1016))
≈ P(Z < 1.91) – P(Z < –1.91)
≈ 0.9719 – 0.0281
≈ 0.9438
Using a normal approximation: the sample size here is large and p is near the middle of the possible
range of p values, [0, 1]. Thus the normal approximation above will be close to the exact value. Below
are the steps with the continuity correction.
P(467 ≤ X ≤ 528) ≈ P(X ≤ 528.5) – P(X ≤ 466.5)
≈ P(Z ≤ (528.5 – 1016(0.49)) / √(1016(0.49)(0.51))) – P(Z ≤ (466.5 – 1016(0.49)) / √(1016(0.49)(0.51)))
≈ P(Z < 1.924) – P(Z < –1.967)
≈ 0.9728 – 0.0246
≈ 0.9482
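The normal-approximation steps for Exercise 5.22, with and without the continuity correction, can be reproduced with the standard library. This is a sketch under the assumption that the standard normal CDF is computed from math.erf; the helper names are invented for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_approx(lo, hi, n, p, correction=0.0):
    """Approximate P(lo <= X <= hi) for X ~ B(n, p) using N(np, sqrt(np(1-p))).
    Pass correction=0.5 with integer bounds for the continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    upper = normal_cdf((hi + correction - mu) / sigma)
    lower = normal_cdf((lo - correction - mu) / sigma)
    return upper - lower

n, p = 1016, 0.49
plain = normal_approx(0.46 * n, 0.52 * n, n, p)           # bounds 467.36 and 528.32
corrected = normal_approx(467, 528, n, p, correction=0.5)  # bounds 466.5 and 528.5
print(f"without continuity correction: {plain:.4f}")
print(f"with continuity correction:    {corrected:.4f}")
```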
5.24 How do the results depend on the sample size? Return to the Gallup Poll setting of Exercise 5.22.
We are supposing that the proportion of all adults who think that two children is ideal is p = 0.49. What is
the probability that a sample proportion p̂ falls between 0.46 and 0.52 (that is, within ±3
percentage points of the true p) if the sample is an SRS of size n = 300? Of size n = 5000? Combine these
results with your work in Exercise 5.22 to make a general statement about the effect of larger samples in a
sample survey.
Size n = 300
Crunch it.
P(0.46 ≤ p̂ ≤ 0.52) = P(0.46(300) ≤ X ≤ 0.52(300))
= P(138 ≤ X ≤ 156)
= P(X ≤ 156) – P(X ≤ 137)
= binomdist(156, 300, 0.49, true) – binomdist(137, 300, 0.49, true)
= 0.8637 – 0.1363
= 0.7275
See the answer to problem 5.22 for a pictorial representation.
Normal Approximation.
P(0.46 ≤ p̂ ≤ 0.52) = P(0.46(300) ≤ X ≤ 0.52(300))
≈ P(X ≤ 156) – P(X ≤ 138)
≈ P(Z ≤ (156 – 300(0.49)) / √(300(0.49)(0.51))) – P(Z ≤ (138 – 300(0.49)) / √(300(0.49)(0.51)))
or, if using p-hats,
≈ P(Z ≤ (0.52 – 0.49) / √((0.49)(0.51)/300)) – P(Z ≤ (0.46 – 0.49) / √((0.49)(0.51)/300))
≈ P(Z < 1.039) – P(Z < –1.039)
≈ 0.8506 – 0.1494
≈ 0.7012
Normal Approximation, continuity correction.
P(0.46 ≤ p̂ ≤ 0.52) = P(0.46(300) ≤ X ≤ 0.52(300))
≈ P(X ≤ 156.5) – P(X ≤ 137.5)
≈ P(Z ≤ (156.5 – 300(0.49)) / √(300(0.49)(0.51))) – P(Z ≤ (137.5 – 300(0.49)) / √(300(0.49)(0.51)))
≈ P(Z < 1.097) – P(Z < –1.097)
≈ 0.8637 – 0.1363
≈ 0.7274
Size n = 5000
Normal Approximation.
P(0.46 ≤ p̂ ≤ 0.52) = P(0.46(5000) ≤ X ≤ 0.52(5000))
≈ P(X ≤ 2600) – P(X ≤ 2300)
≈ P(Z ≤ (2600 – 5000(0.49)) / √(5000(0.49)(0.51))) – P(Z ≤ (2300 – 5000(0.49)) / √(5000(0.49)(0.51)))
≈ P(Z < 4.24) – P(Z < –4.24)
≈ 0.999989 – 0.000011
≈ 1
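Exact binomial calculation.
P(0.46 ≤ p̂ ≤ 0.52) = P(0.46(5000) ≤ X ≤ 0.52(5000))
= P(X ≤ 2600) – P(X ≤ 2299)
= binomdist(2600, 5000, 0.49, true) – binomdist(2299, 5000, 0.49, true)
= 0.999979
General statement: with p = 0.49, the probability that p̂ falls within ±3 percentage points of p rises
from about 0.73 at n = 300 to about 0.95 at n = 1016 and to nearly 1 at n = 5000. Larger samples give
sample proportions that are more tightly concentrated around the true p.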
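The exact binomial results for the three sample sizes can be placed side by side. The sketch below is illustrative only: the helper names are invented, and the integer count bounds are the ones used in the answers to Exercises 5.22 and 5.24 rather than values computed from the proportions.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), evaluated in log space to avoid underflow."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coeff + k * math.log(p) + (n - k) * math.log(1 - p))

def prob_between(lo, hi, n, p):
    """Exact P(lo <= X <= hi) for integer bounds lo and hi."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))

# (n, lowest count, highest count) as used in the worked answers
cases = [(300, 138, 156), (1016, 467, 528), (5000, 2300, 2600)]
results = {n: prob_between(lo, hi, n, 0.49) for n, lo, hi in cases}
for n, prob in results.items():
    print(f"n = {n:5d}: P(p-hat within 3 points of p) = {prob:.6f}")
```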
5.28 Admitting students to college. A selective college would like to have an entering class of 950
students. Because not all students who are offered admission accept, the college admits more than 950
students. Past experience shows that about 75% of the students admitted will accept. The college
decides to admit 1200 students. Assuming that students make their decisions independently, the
number who accept has the B(1200, 0.75) distribution. If this number is less than 950, the college will
admit students from its waiting list.
(a) What are the mean and the standard deviation of the number X of students who accept?
Notice that we want the mean and standard deviation of the count, that is, of the number X of
students who accept.
µX = 1200(0.75) = 900
σX = √(1200(0.75)(0.25)) = √225 = 15
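As a quick check, the mean and standard deviation of a binomial count follow from np and √(np(1 – p)). The function name below is hypothetical and only wraps those two formulas.

```python
import math

def binomial_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1-p)) of a B(n, p) count."""
    return n * p, math.sqrt(n * p * (1 - p))

mean, sd = binomial_mean_sd(1200, 0.75)
print(f"mean = {mean}, sd = {sd}")
```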
(b) The college does not want more than 950 students. What is the probability that more than 950 will
accept?
P(X ≥ 951) = 0.00030194
= 1 – binomdist(950, 1200,0.75,true)
Normal Approximation.
1200(0.75) = 900 ≥ 10 and 1200(0.25) = 300 ≥ 10.
P(X ≥ 951) ≈ P(Z > (951 – 1200(0.75)) / √(1200(0.75)(0.25)))
≈ P(Z > 3.4)
≈ 0.000337
Normal Approximation with continuity correction.
1200(0.75) = 900 ≥ 10 and 1200(0.25) = 300 ≥ 10.
P(X ≥ 951) ≈ P(Z > (950.5 – 1200(0.75)) / √(1200(0.75)(0.25)))
≈ P(Z > 3.37)
≈ 1 – normsdist(3.37)
≈ 0.000376
(c) If the college decides to increase the number of admission offers to 1300, what is the probability
that more than 950 will accept?
P(X ≥ 951) = 0.940834
= 1 – binomdist(950,1300,0.75,true)
Normal Approximation.
1300(0.75) = 975 ≥ 10 and 1300(0.25) = 325 ≥ 10.
P(X ≥ 951) ≈ P(Z > (951 – 1300(0.75)) / √(1300(0.75)(0.25)))
≈ P(Z > –1.5372)
≈ 0.9379
Normal Approximation with continuity correction.
1300(0.75) = 975 ≥ 10 and 1300(0.25) = 325 ≥ 10.
P(X ≥ 951) ≈ P(Z > (950.5 – 1300(0.75)) / √(1300(0.75)(0.25)))
≈ P(Z > –1.56926)
≈ 0.9417
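Parts (b) and (c) can be compared in one pass. The following is a rough sketch, not the textbook's method: the helper names are invented, the exact tail is summed directly from the binomial pmf, and the normal tail uses math.erfc so that very small probabilities keep their precision.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), evaluated in log space to avoid underflow."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coeff + k * math.log(p) + (n - k) * math.log(1 - p))

def upper_tail_exact(k, n, p):
    """Exact P(X >= k) for X ~ B(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def upper_tail_normal(x, n, p):
    """Normal approximation to P(X >= x); pass x - 0.5 for the continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    z = (x - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

tails = {}
for n in (1200, 1300):
    tails[n] = (
        upper_tail_exact(951, n, 0.75),     # exact binomial
        upper_tail_normal(951, n, 0.75),    # normal approximation
        upper_tail_normal(950.5, n, 0.75),  # with continuity correction
    )
    exact, plain, corrected = tails[n]
    print(f"admit {n}: exact={exact:.6f} normal={plain:.6f} corrected={corrected:.6f}")
```

Raising the number of offers from 1200 to 1300 moves the mean from below 950 to above it, which is why the probability of more than 950 acceptances jumps from well under 1% to roughly 94%.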