NAME (Please Print): KEY
HONOR PLEDGE (Please Sign):
Midterm II
Statistics 101
Professor:
• This is a closed book exam.
• You may use your calculator and a single page of notes.
• The room is crowded. Please be careful to look only at your own exam. Try to sit
one seat apart; the proctors may ask you to randomize your seating a bit.
• Report all numerical answers to at least two correct decimal places or (when appropriate) write them as a fraction.
• All question parts count for 1 point unless otherwise indicated.
1. Vaccine A enables 40% of the population to survive avian flu; vaccine B gives a 50%
survival rate, and vaccine C protects 30%. Suppose that the pool of vaccines is 60%
type A, 10% type B, and the rest are type C.
.13 You receive a random vaccine, and then contract and survive the flu. What is
the probability that you received type B?
P [B|Surv] = P [Surv|B] ∗ P [B]/(P [Surv|A] ∗ P [A] + P [Surv|B] ∗ P [B] + P [Surv|C] ∗
P [C]) = (.5 ∗ .1)/[.4 ∗ .6 + .5 ∗ .1 + .3 ∗ .3] = .1316.
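The Bayes' rule arithmetic above can be checked with a short sketch. The dictionary names are illustrative only; the rates and pool shares are the ones given in the problem.

```python
# Illustrative names; the numbers come from the problem statement.
prior = {"A": 0.60, "B": 0.10, "C": 0.30}      # share of each vaccine in the pool
survive = {"A": 0.40, "B": 0.50, "C": 0.30}    # P[Surv | vaccine]

# Law of total probability: P[Surv] = sum over vaccines of P[Surv|v] * P[v]
p_surv = sum(survive[v] * prior[v] for v in prior)

# Bayes' rule for vaccine B
p_b_given_surv = survive["B"] * prior["B"] / p_surv

print(round(p_b_given_surv, 4))  # 0.1316
```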
2. You draw a simple random sample of 250 people in DC and ask whether they support
the I-95 bridge over the Potomac. Suppose 180 say yes.
State the box model for this problem. Describe the tickets and how to draw. (3
points.)
1 point for saying there is one ticket in the box for each person in DC. 1 point
for saying the tickets are marked 0 or 1, for no or yes. 1 point for saying the
draws are made without replacement.
Set a two-sided 90% confidence interval on the proportion of people who support
the project.
L, U = 180/250 ± 1.65 ∗ √[(180/250) ∗ (1 − 180/250)/250] = .6731, .7669.
L = .67
U = .77
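A minimal sketch of the interval calculation, assuming the table value z = 1.65 that the key uses for a two-sided 90% interval. Variable names are illustrative.

```python
import math

n, yes = 250, 180
z = 1.65                      # table value used in the key for 90% two-sided

p_hat = yes / n               # sample proportion, .72
se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of the proportion

lower = p_hat - z * se
upper = p_hat + z * se
print(round(lower, 2), round(upper, 2))   # 0.67 0.77
```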
.69 Suppose the sample of 250 were drawn without replacement from the 400 residents on Telegraph Ave. What is the 95% one-sided lower CI?
Use the FPCF. L = (180/250) − 1.65 ∗ √[(.72 ∗ .28)/250] ∗ √[(400 − 250)/(400 − 1)] = .6913.
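The finite population correction can be sketched as below: the usual standard error is multiplied by √[(N − n)/(N − 1)], with N = 400 residents and n = 250 sampled. The names are illustrative, and z = 1.65 is the key's table value for a one-sided 95% bound.

```python
import math

N, n, yes = 400, 250, 180
z = 1.65                                   # key's value for a one-sided 95% bound

p_hat = yes / n
se = math.sqrt(p_hat * (1 - p_hat) / n)    # SE ignoring the finite population
fpcf = math.sqrt((N - n) / (N - 1))        # finite population correction factor

lower_bound = p_hat - z * se * fpcf
print(round(lower_bound, 4))               # 0.6913
```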
3. Suppose that the sd in household income in Durham is $25K.
2377 How many households would you need to sample to ensure that a 95% confidence
interval on the mean had width less than $2K?
width = 2 = 2 ∗ 1.95 ∗ sd/√n = 2 ∗ 1.95 ∗ 25/√n. Solving gives n = 2376.56,
and one must round up since households aren’t fractional.
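As a rough check of the sample-size algebra, the sketch below solves width = 2 ∗ z ∗ sd/√n for n and rounds up. It keeps the key's multiplier z = 1.95; the variable names are illustrative.

```python
import math

sd = 25.0        # household income SD, in $K
width = 2.0      # target full width of the interval, in $K
z = 1.95         # multiplier used in the key

# width = 2 * z * sd / sqrt(n)  =>  n = (2 * z * sd / width) ** 2
n_exact = (2 * z * sd / width) ** 2
n_needed = math.ceil(n_exact)   # round up: a sample has whole households

print(n_needed)  # 2377
```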
.65 Suppose you drew a sample of size 500. What is the probability that your sample
average is wrong by more than $.5K?
By the CLT, this is P [z > .5/(25/√500) or z < −.5/(25/√500)] = P [|z| > .447] =
.6527.
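The same two-tailed probability can be computed with the exact normal CDF from Python's standard library. This is only a sketch with illustrative names; the exact CDF gives about .655, slightly above the key's .6527, which appears to come from a table lookup at z = .45. Both round to .65.

```python
import math
from statistics import NormalDist

sd, n, err = 25.0, 500, 0.5          # all in $K

se = sd / math.sqrt(n)               # standard error of the sample average
z = err / se                         # about .447

# Two-tailed probability that the average misses by more than err
p_wrong = 2 * (1 - NormalDist().cdf(z))
print(round(z, 3), round(p_wrong, 2))
```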
4. To make more money in a casino, you attempt to “shave” a die so that the sides marked
2 and 6 each have probability .2, and the other faces are equally likely. Shaving this
precisely is difficult; you test your result by rolling the die 600 times, getting 85 ones,
130 twos, 85 threes, 95 fours, 90 fives, and 115 sixes.
In practical words, what is the alternative hypothesis?
It is not true that P[2]=P[6]=.2 and P[1]=P[3]=P[4]=P[5]=.6/4; at least some
of these equalities are wrong.
1.87 or 1.88 What is the value of the test statistic?
ts = (85 − 90)^2/90 + (130 − 120)^2/120 + (85 − 90)^2/90 + (95 − 90)^2/90 + (90 −
90)^2/90 + (115 − 120)^2/120 = 1.875.
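The goodness-of-fit statistic above can be sketched as follows. The list names are illustrative; the null probabilities are .2 for faces 2 and 6 and .6/4 = .15 for the other four faces.

```python
observed = [85, 130, 85, 95, 90, 115]            # counts for faces 1..6
null_probs = [0.15, 0.20, 0.15, 0.15, 0.15, 0.20]

n_rolls = sum(observed)                          # 600
expected = [n_rolls * p for p in null_probs]     # 90 or 120 per face

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                           # 6 categories, so 5 df

print(round(chi_sq, 3), df)                      # 1.875 5
```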
χ² with 5 df What distribution do you use to find the P-value? (Include degrees of freedom
if appropriate.)
What is the P-value (give a range if appropriate)?
Between .9 and .7.
State your conclusion in words.
No reason to reject the null; this was a good job of shaving.
Suppose your test leads you to believe that your shaving effort was successful.
But you want further confirmation, and now toss the shaved die 6 billion times.
What conclusion will you reach and why? (2 points)
We would reject the null hypothesis of perfect shaving; small errors will be found.
No one can shave perfectly.
5. You are picking a college and want to determine whether UNC and Duke have different
gender ratios (at the 0.05 level). A random sample of 100 Duke students gives 60
males. A random sample of 200 UNC students gives 90 males.
In words, what is the alternative hypothesis? Subtract UNC from Duke.
The proportion of males at Duke is different from the proportion at UNC; or the
difference does not equal zero.
2.49 What is the value of your test statistic?
ts = (.6 − .45)/√[(.6 ∗ .4)/100 + (.45 ∗ .55)/200] = 2.487.
z What kind of distribution does your test statistic have under the null?
.01 Calculate the p-value for this test.
P [z > 2.49 or z < −2.49] = .0124.
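A sketch of the two-sample z test, using the unpooled standard error with 100 Duke students and 200 UNC students. Names are illustrative. The exact normal CDF gives a two-sided p-value of about .013, a little above the key's .0124, which looks like a table lookup at z = 2.5; both round to .01.

```python
import math
from statistics import NormalDist

p_duke, n_duke = 60 / 100, 100
p_unc, n_unc = 90 / 200, 200

# Unpooled standard error of the difference in sample proportions
se_diff = math.sqrt(p_duke * (1 - p_duke) / n_duke + p_unc * (1 - p_unc) / n_unc)
z_stat = (p_duke - p_unc) / se_diff

# Two-sided p-value from the standard normal
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
print(round(z_stat, 3), round(p_value, 2))
```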
What is your conclusion?
We reject the null hypothesis; there is a difference in the proportion of males at the
two schools.
.23 What is a 90% upper confidence bound on the difference?
U = (.6 − .45) + 1.3 ∗ √[(.6 ∗ .4)/100 + (.45 ∗ .55)/200] = .2284.
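The one-sided upper bound reuses the same unpooled standard error, with the key's multiplier 1.3 for a 90% one-sided bound. A brief sketch with illustrative names:

```python
import math

diff = 0.60 - 0.45                       # Duke minus UNC
se_diff = math.sqrt(0.60 * 0.40 / 100 + 0.45 * 0.55 / 200)
z_upper = 1.3                            # multiplier used in the key

upper_bound = diff + z_upper * se_diff
print(round(upper_bound, 4))             # 0.2284
```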
6. There are 5 urns, and each contains 5 marbles. The first urn has one black marble and
four white marbles; the second urn has two black marbles and three white marbles;
and so forth. You are then blindfolded and led to one of the urns at random. You
pick two marbles (without replacement) from that urn and go to another room, where
your blindfold is removed.
.3 Suppose your null hypothesis is that the urn from which you drew had 3 or fewer
black marbles. If both marbles you drew were black, what is the significance
probability?
The alternative hypothesis is supported by getting black marbles, since the null
says that there are three or fewer black marbles. Among the urns allowed by the null,
the third urn makes two black marbles most likely, so the chance of drawing
something this supportive of the alternative is 3/5 ∗ 2/4 = .3.
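To see why the third urn is the one to use, the sketch below computes the chance of two black marbles (drawn without replacement) for each urn allowed by the null and takes the largest. The function name is hypothetical; exact fractions avoid rounding.

```python
from fractions import Fraction

def p_both_black(black, total=5):
    """Chance that two draws without replacement are both black."""
    return Fraction(black, total) * Fraction(black - 1, total - 1)

# Urns consistent with the null: 1, 2, or 3 black marbles
null_probs = {b: p_both_black(b) for b in (1, 2, 3)}

# The significance probability is the largest of these, from urn 3
sig_prob = max(null_probs.values())
print(sig_prob)   # 3/10
```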
.34 Suppose you had picked 5 marbles with replacement, and exactly four are black.
What is your significance probability for the above hypothesis?
Since we are drawing with replacement, the number of black marbles is binomial. If the
null hypothesis is true, then the case most favorable to black marbles is urn 3, so the
probability of getting a black marble is p = .6, and you are making n = 5 tries. So
the probability of a result this supportive or more supportive of the alternative
is P [4 or 5 black marbles] = C(5,5) ∗ .6^5 ∗ .4^0 + C(5,4) ∗ .6^4 ∗ .4^1 = .33696.
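The binomial tail above can be checked directly with `math.comb`. This is a minimal sketch with illustrative names, assuming urn 3 (p = .6) as in the answer.

```python
from math import comb

n_draws, p_black = 5, 0.6    # urn 3: three black marbles out of five

# P[4 or 5 black] = sum of binomial probabilities for k = 4 and k = 5
tail = sum(
    comb(n_draws, k) * p_black**k * (1 - p_black) ** (n_draws - k)
    for k in (4, 5)
)
print(round(tail, 5))        # 0.33696
```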
7. A sample of 10 Duke students studies for an average of 122 minutes per day, with the
sample SD of 10 (study times are roughly normal). MIT students study for exactly
130 minutes per day. You want to prove that Duke students study less than those at
MIT (at the .05 level).
What is the null hypothesis (in words)?
On average, Duke studies for 130 minutes or more per day.
-2.40 What is the value of your test statistic?
ts = (122 − 130)/[10/√(10 − 1)] = −2.4.
t with 9 df What kind of distribution does your test statistic have under the null?
.025 to .01 What is the range of the p-value?
What is your conclusion?
Strong evidence that Duke’s average is less than 130.
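The t statistic in this problem can be reproduced as below. Note that the key divides the SD by √(n − 1) rather than √n; the sketch follows the key, on the assumption that the quoted sample SD was computed with divisor n. Names are illustrative.

```python
import math

n, sample_mean, sample_sd = 10, 122.0, 10.0
mu_null = 130.0                          # MIT's study time under the null

# Key's convention: standard error = SD / sqrt(n - 1)
se_mean = sample_sd / math.sqrt(n - 1)
t_stat = (sample_mean - mu_null) / se_mean
df = n - 1

print(round(t_stat, 2), df)              # -2.4 9
```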
8. List all the true statements.
C, E, G, H, I, K
A. Response bias arises when some people refuse to answer.
B. One should use the FPCF when sampling with replacement.
C. Non-response bias arises when people cannot be contacted by the interviewer.
D. The Belmont Report establishes the principles for conducting unbiased surveys.
E. The frame is a list of all people who are within the scope of the survey.
F. If everyone is equally likely to be chosen, it is a simple random sample.
G. The Current Population Survey uses half-sampling to estimate standard error.
H. The Current Population Survey uses multistage cluster sampling.
I. The Current Population Survey is used to estimate unemployment.
J. As the sample size increases, so does the width of the confidence interval.
K. Gosset discovered the t-distribution.
9. State the CLT for averages.
As the sample size increases, the distribution of (X̄ − EV)/(sd/√n) is approximately
N(0,1).
10. Among 30 in-state students, 10 fail statistics. Among 40 out-of-state students, 10 fail
statistics.
1.50 What is the odds ratio of failing for in-state students compared to out-of-state
students?
Here failing plays the role of heart attack and in-state plays the role of aspirin, so the
table is:
        in-state   out-state
fail       10          10
pass       20          30
So the odds ratio is (A/B)/(C/D) = (10/20)/(10/30)=1.5.
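The odds ratio is the odds of failing among in-state students divided by the odds of failing among out-of-state students. A short sketch with exact fractions (variable names are illustrative):

```python
from fractions import Fraction

# Odds of failing = (number failing) / (number passing)
odds_in_state = Fraction(10, 20)
odds_out_state = Fraction(10, 30)

odds_ratio = odds_in_state / odds_out_state
print(odds_ratio, float(odds_ratio))   # 3/2 1.5
```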
.58 You want to test whether failing and whether a student is from North Carolina
are independent. What is the value of your test statistic?
The contingency table is shown below, with the expected values in each cell written in
parentheses. Recall that the expected values are the row sum times the column sum
divided by the overall total; for the first cell, this is 30*20/70 = 8.571.
         in-state       out-state      total
fail     10 (8.571)     10 (11.429)     20
pass     20 (21.429)    30 (28.571)     50
total    30             40              70
The test statistic is the sum over all four cells of (observed − expected)^2/expected,
which gives .5836.
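The expected counts and the test statistic can be sketched as follows, with illustrative names. With unrounded expected values the sum comes to about .5833; the key's .5836 appears to come from using expected values rounded to three decimals. Both round to .58.

```python
observed = [[10, 10],    # fail: in-state, out-of-state
            [20, 30]]    # pass: in-state, out-of-state

row_totals = [sum(row) for row in observed]          # [20, 50]
col_totals = [sum(col) for col in zip(*observed)]    # [30, 40]
grand_total = sum(row_totals)                        # 70

# Expected count = row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq_indep = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
print(round(expected[0][0], 3), round(chi_sq_indep, 2))   # 8.571 0.58
```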
.5 to .3 What is the range of the p-value?
What is your conclusion?
There is no evidence of dependence; being in-state or out-of-state doesn’t affect the
failure chance.