NAME (Please Print): KEY
HONOR PLEDGE (Please Sign):

Midterm II
Statistics 101
Professor:

• This is a closed book exam.
• You may use your calculator and a single page of notes.
• The room is crowded. Please be careful to look only at your own exam. Try to sit one seat apart; the proctors may ask you to randomize your seating a bit.
• Report all numerical answers to at least two correct decimal places or (when appropriate) write them as a fraction.
• All question parts count for 1 point unless otherwise indicated.

1. Vaccine A enables 40% of the population to survive avian flu; vaccine B gives a 50% survival rate, and vaccine C protects 30%. Suppose that the pool of vaccines is 60% type A, 10% type B, and the rest are type C.

You receive a random vaccine, and then contract and survive the flu. What is the probability that you received type B?

Answer: .13

By Bayes' rule,
P[B|Surv] = P[Surv|B] * P[B] / (P[Surv|A] * P[A] + P[Surv|B] * P[B] + P[Surv|C] * P[C])
= (.5 * .1) / (.4 * .6 + .5 * .1 + .3 * .3)
= .1316.

2. You draw a simple random sample of 250 people in DC and ask whether they support the I-95 bridge over the Potomac. Suppose 180 say yes.

State the box model for this problem. Describe the tickets and how to draw. (3 points.)

1 point for saying there is one ticket in the box for each person in DC.
1 point for saying the tickets are marked 0 or 1, for no or yes.
1 point for saying the draws are made without replacement.

Set a two-sided 90% confidence interval on the proportion of people who support the project.

Answer: L = .67, U = .77

L, U = 180/250 ± 1.65 * sqrt((180/250) * (1 − 180/250) / 250) = .6731, .7669.

Suppose the sample of 250 were drawn without replacement from the 400 residents on Telegraph Ave. What is the 95% one-sided lower CI? Use the FPCF.

Answer: .69

L = (180/250) − 1.65 * sqrt((.72 * .28)/250) * sqrt((400 − 250)/(400 − 1)) = .6913.

3. Suppose that the sd in household income in Durham is $25K.
How many households would you need to sample to ensure that a 95% confidence interval on the mean had width less than $2K?

Answer: 2377

width = 2 = 2 * 1.95 * sd / sqrt(n) = 2 * 1.95 * 25 / sqrt(n). Solving gives n = 2376.56, and one must round up since households aren’t fractional.

Suppose you drew a sample of size 500. What is the probability that your sample average is wrong by more than $.5K?

Answer: .65

By the CLT, this is P[z > .5/(25/sqrt(500)) or z < −.5/(25/sqrt(500))] = P[|z| > .447] = .6527.

4. To make more money in a casino, you attempt to “shave” a die so that the sides marked 2 and 6 each have probability .2, and the other faces are equally likely. Shaving this precisely is difficult; you test your result by rolling the die 600 times, getting 85 ones, 130 twos, 85 threes, 95 fours, 90 fives, and 115 sixes.

In practical words, what is the alternative hypothesis?

It is not true that P[2] = P[6] = .2 and P[1] = P[3] = P[4] = P[5] = .6/4; at least some of these equalities are wrong.

What is the value of the test statistic?

Answer: 1.87 or 1.88

ts = (85 − 90)^2/90 + (130 − 120)^2/120 + (85 − 90)^2/90 + (95 − 90)^2/90 + (90 − 90)^2/90 + (115 − 120)^2/120 = 1.875.

What distribution do you use to find the P-value? (Include degrees of freedom if appropriate.)

Answer: χ² with 5 degrees of freedom

What is the P-value (give a range if appropriate)?

Between .9 and .7.

State your conclusion in words.

No reason to reject the null; this was a good job of shaving.

Suppose your test leads you to believe that your shaving effort was successful. But you want further confirmation, and now toss the shaved die 6 billion times. What conclusion will you reach and why? (2 points)

We would reject the null hypothesis of perfect shaving; small errors will be found. No one can shave perfectly.

5. You are picking a college and want to determine whether UNC and Duke have different gender ratios (at the 0.05 level). A random sample of 100 Duke students gives 60 males. A random sample of 200 UNC students gives 90 males.
In words, what is the alternative hypothesis? Subtract UNC from Duke.

The proportion of males at Duke is different from the proportion at UNC; or the difference does not equal zero.

What is the value of your test statistic?

Answer: 2.49

ts = (.6 − .45) / sqrt((.6 * .4)/100 + (.45 * .55)/200) = 2.487.

What kind of distribution does your test statistic have under the null?

Answer: z

Calculate the p-value for this test.

Answer: .01

P[z > 2.49 or z < −2.49] = .0124.

What is your conclusion?

We reject the null hypothesis; there is a difference in the proportion of males at the two schools.

What is a 90% upper confidence bound on the difference?

Answer: .23

U = (.6 − .45) + 1.3 * sqrt((.6 * .4)/100 + (.45 * .55)/200) = .2284.

6. There are 5 urns, and each contains 5 marbles. The first urn has one black marble and four white marbles; the second urn has two black marbles and three white marbles; and so forth. You are then blindfolded and led to one of the urns at random. You pick two marbles (without replacement) from that urn and go to another room, where your blindfold is removed.

Suppose your null hypothesis is that the urn from which you drew had 3 or fewer black marbles. If both marbles you drew were black, what is the significance probability?

Answer: .3

The alternative hypothesis is supported by getting black marbles, since the null says that there are three or fewer black marbles. The chance of drawing something this supportive of the alternative from the third urn is 3/5 * 2/4 = .3.

Suppose you had picked 5 marbles with replacement, and exactly four are black. What is your significance probability for the above hypothesis?

Answer: .34

Since we are drawing with replacement, the number of black marbles is binomial. If the null hypothesis is true, then you are drawing from urn 3, the probability of getting a black marble is p = .6, and you are making n = 5 tries. So the probability of a result this supportive or more supportive of the alternative is
P[4 or 5 black marbles] = C(5,5) * .6^5 * .4^0 + C(5,4) * .6^4 * .4^1 = .33696.
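
The two significance probabilities in problem 6 can be checked numerically. The sketch below is only an illustration, not part of the original key: the function names are made up for this example, and it assumes (as the key does) that the probabilities are computed at the boundary of the null, urn 3.

```python
from math import comb


def p_two_black_without_replacement(black, total=5):
    """P(both of two marbles drawn without replacement are black)."""
    return comb(black, 2) / comb(total, 2)


def p_at_least_k_black_with_replacement(k, n, p):
    """Binomial upper tail P(X >= k) for n draws with replacement."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Assumption: evaluate at urn 3, the boundary of the null "3 or fewer black".
boundary_black = 3

p_without = p_two_black_without_replacement(boundary_black)
p_with = p_at_least_k_black_with_replacement(4, 5, boundary_black / 5)

print(round(p_without, 4))  # matches 3/5 * 2/4
print(round(p_with, 5))     # matches the binomial tail in the key
```

Using the boundary urn gives the largest probability that any urn allowed by the null can produce, which is why the key works with urn 3 rather than urns 1 or 2.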
7. A sample of 10 Duke students studies for an average of 122 minutes per day, with the sample SD of 10 (study times are roughly normal). MIT students study for exactly 130 minutes per day. You want to prove that Duke students study less than those at MIT (at the .05 level).

What is the null hypothesis (in words)?

On average, Duke studies for 130 minutes or more per day.

What is the value of your test statistic?

Answer: −2.40

ts = (122 − 130) / [10 / sqrt(10 − 1)] = −2.4.

What kind of distribution does your test statistic have under the null?

Answer: t with 9 degrees of freedom

What is the range of the p-value?

Answer: .025 to .01

What is your conclusion?

Strong evidence that Duke’s average is less than 130.

8. List all the true statements.

Answer: C, E, G, H, I, K

A. Response bias arises when some people refuse to answer.
B. One should use the FPCF when sampling with replacement.
C. Non-response bias arises when people cannot be contacted by the interviewer.
D. The Belmont Report establishes the principles for conducting unbiased surveys.
E. The frame is a list of all people who are within the scope of the survey.
F. If everyone is equally likely to be chosen, it is a simple random sample.
G. The Current Population Survey uses half-sampling to estimate standard error.
H. The Current Population Survey uses multistage cluster sampling.
I. The Current Population Survey is used to estimate unemployment.
J. As the sample size increases, so does the width of the confidence interval.
K. Gosset discovered the t-distribution.

9. State the CLT for averages.

As the sample size increases, the distribution of (X̄ − EV)/(sd/sqrt(n)) is approximately N(0,1).

10. Among 30 in-state students, 10 fail statistics. Among 40 out-of-state students, 10 fail statistics.

What is the odds ratio of failing for in-state students compared to out-of-state students?

Answer: 1.50
Here failing plays the role of heart attack and in-state plays the role of aspirin, so the table is:

          in-state   out-state
fail         10         10
pass         20         30

So the odds ratio is (A/B)/(C/D) = (10/20)/(10/30) = 1.5.

You want to test whether failing and whether a student is from North Carolina are independent. What is the value of your test statistic?

Answer: .58

The contingency table is shown below, with the expected values in each cell written in parentheses. Recall that the expected values are the row sum times the column sum divided by the overall total; for the first cell, this is 30*20/70 = 8.571.

          in-state       out-state      total
fail      10 (8.571)     10 (11.429)      20
pass      20 (21.429)    30 (28.571)      50
total     30             40               70

The test statistic is the sum over all four cells of (observed − expected)^2 / expected, which gives .5836.

What is the range of the p-value?

Answer: .5 to .3

What is your conclusion?

There is no evidence of dependence; being in-state or out-of-state doesn’t affect the failure chance.
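
The odds ratio and the chi-square statistic in problem 10 can be reproduced with a few lines of code. This is a minimal sketch rather than part of the original key; the function names and the table layout (rows are fail/pass, columns are in-state/out-of-state) are choices made for this illustration. With unrounded expected counts the statistic comes out near .5833, which agrees with the key's .58 to two decimals.

```python
def odds_ratio(table):
    """Odds ratio for a 2x2 table laid out as [[a, c], [b, d]],
    i.e. (a/b) / (c/d): odds of row 1 vs row 2 in column 1,
    divided by the same odds in column 2."""
    (a, c), (b, d) = table
    return (a / b) / (c / d)


def chi_square_independence(table):
    """Sum over all cells of (observed - expected)^2 / expected, where
    expected = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Rows: fail, pass. Columns: in-state, out-of-state.
observed_table = [[10, 10],
                  [20, 30]]

ratio = odds_ratio(observed_table)
stat = chi_square_independence(observed_table)

print(round(ratio, 2))
print(round(stat, 4))
```

The statistic is compared against a chi-square distribution with (2 − 1) * (2 − 1) = 1 degree of freedom, which is where the p-value range of .5 to .3 comes from.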