Name:
TA's name:
Student number:

MID-TERM BIOL 300: October 2009

For all statistical tests, make sure that you clearly state your hypotheses. Unless otherwise stated, assume α = 0.05. Show your work. Be as precise as possible about P-values.

Some questions have a box for the final answer. Please put the final answer in this box, and show all work in the other space provided, including the back of the page if necessary.

By taking this test and putting your name above, you are declaring that your answers on this test are all your own work.

Make sure that your copy of the test includes 6 pages, including this one.

Points
Q1      12
Q2       7
Q3      30
Q4      17
Q5      22
Q6      12
Total  100

1. (12 points) A flycatcher is trying to catch passing bugs. The probability that it catches a bug on any given try is 20%.

a. What is the probability that it catches its first bug on its fourth try?

For the first catch to come on the fourth try, the bird must have missed the first three times. The probability of a miss is $(1 - 0.2) = 0.8$, so the probability of three misses is $0.8^3 = 0.512$. Multiply this by the probability that it succeeds on the fourth try, 0.2, to get the answer:

$0.512\,(0.2) = 0.1024$

b. What is the probability that it catches its first bug after four tries, assuming that it keeps trying until it is successful?

The probability that it catches the first bug after four tries is equal to one minus the probability that it catches its first bug in its first four tries. Following the logic of part (a), the probabilities of the first success falling on each of the first four tries are:

1st try: $0.2$
2nd try: $0.8\,(0.2)$
3rd try: $0.8^2\,(0.2)$
4th try: $0.8^3\,(0.2)$

Adding these up, we get Pr(first bug captured during first four tries) = 0.5904. So the probability that the first success comes after four tries is $1 - 0.5904 = 0.4096$.

2. (7 points) Define Type II error. If the significance level of a test is increased, will the Type II error increase, decrease, or stay the same? Explain.
Type II error is failing to reject a false null hypothesis. If the significance level (α) of a test is increased, it becomes easier to reject any null hypothesis, including false ones. As a result, the Type II error rate goes down when the significance level is increased.

3. (5 points each) For each of the following scenarios, identify the best statistical test to use and state the null hypothesis. (Please note, do not give the answer to the specific question, but simply state the best test to use and the null hypothesis for the scenario.)

a. Testing the accuracy of a new scale by using the mean of repeated measures of a 1 gram standard weight.
One-sample t-test
H0: The mean weight measured on the scale is 1 g.

b. Asking whether prairie dog colonies were randomly distributed over the area of a national park.
χ² goodness-of-fit test
H0: The number of prairie dog colonies per acre follows a Poisson distribution.

c. Asking whether the number of cases of swine flu in a city was proportional to the population size of that city.
χ² goodness-of-fit test
H0: The number of cases of swine flu is proportional to the population size of each city.

d. Asking whether smokers or non-smokers were more likely to get swine flu over a two-week period, assuming that the sample sizes in the study were very large.
χ² contingency analysis
H0: Smokers and non-smokers are equally likely to get swine flu.

e. Asking whether smokers or non-smokers were more likely to get swine flu over a two-week period, where the number of smokers in the sample expected to get swine flu was only 1.8.
Fisher's exact test
H0: Smokers and non-smokers are equally likely to get swine flu.

f. Asking whether the number of pea plants with a tall phenotype (compared to a short phenotype) was as predicted by Mendel's rules in a genetics experiment.
Binomial test
H0: The proportion of tall plants is equal to the proportion predicted by Mendel.

4.
Researchers in Britain asked a few hundred people about their daily coffee intake. Of the 22 people from that group who had the highest coffee intake, nine reported sometimes hearing voices when no other person was present. Of the 22 people with the lowest coffee intake, three reported sometimes hearing voices.

a. (12 points) Do an appropriate hypothesis test to investigate a possible link between coffee intake and hearing voices.

H0: Coffee intake and hearing voices are independent.
HA: Coffee intake and hearing voices are not independent.

Here is the contingency table with observed values and, in parentheses, expected values.

                       high coffee   low coffee   total
hears voices           9 (6)         3 (6)        12
doesn't hear voices    13 (16)       19 (16)      32
total                  22            22           44

The expected value for the high coffee / hears voices cell was calculated from 12/44 (the estimated probability of hearing voices) times the probability of being in the high-coffee group (22/44), multiplied by the 44 total observations. The others can be calculated by subtraction from the row and column totals.

We can use a χ² contingency test, because all of the cells have expected values of 5 or more.

$$\chi^2 = \frac{(9-6)^2}{6} + \frac{(3-6)^2}{6} + \frac{(13-16)^2}{16} + \frac{(19-16)^2}{16} = 4.125$$

We have (r − 1)(c − 1) = 1 degree of freedom, so we compare this to the critical value $\chi^2_{0.05,1} = 3.84$. Therefore P < 0.05 and we can reject the null hypothesis. Heavy coffee drinkers are more likely to hear voices than light coffee drinkers.

b. (5 points) Are the results from part (a) sufficient evidence to support the statement that coffee drinking sometimes causes people to hear voices? Why, or why not?

No, because the data merely show an association. It is possible that hearing voices causes people to drink more coffee, or that a third factor causes both.

5. (22 points) Researchers have removed nearly every gene in the yeast genome one by one, to create lines of yeast that are missing one gene each.
The relative growth rates of these yeast lines have been measured (expressed relative to the growth rate of the wild type that has all the genes). Below is a random sample of these relative growth rates:

0.86, 1.02, 1.02, 1.01, 1.02, 1, 0.99, 1.01, 0.91, 0.83, 1.01

a. (6 points) What is the best estimate of the relative growth rate of yeast which is missing a random gene? Give a 95% confidence interval for this estimate.

Best estimate: $\bar{Y} = 0.971$.

$$\bar{Y} \pm SE_{\bar{Y}}\; t_{\alpha(2),df}$$
$$\bar{Y} \pm \frac{s}{\sqrt{n}}\; t_{0.05(2),10}$$
$$0.9709 \pm \frac{0.06992}{\sqrt{11}}\,(2.23)$$
$$0.971 \pm 0.047$$

b. (3 points) What is the median relative growth rate from this sample?

1.01

c. (6 points) What is the variance of relative growth rate? Give a 95% confidence interval for this estimate.

$s^2 = 0.004889$

$$\frac{df\, s^2}{\chi^2_{\alpha/2,\,df}} \le \sigma^2 \le \frac{df\, s^2}{\chi^2_{1-\alpha/2,\,df}}$$
$$\frac{10\,(0.0048891)}{\chi^2_{0.025,10}} \le \sigma^2 \le \frac{10\,(0.0048891)}{\chi^2_{0.975,10}}$$
$$\frac{0.048891}{20.48} \le \sigma^2 \le \frac{0.048891}{3.25}$$
$$0.00239 \le \sigma^2 \le 0.01504$$

d. (4 points) What is the standard deviation of relative growth rate? Give a 95% confidence interval for this estimate.

$s = 0.06992$

The confidence interval can be calculated approximately from the square root of the confidence interval of the variance:

$$\sqrt{0.00239} \le \sigma \le \sqrt{0.01504}$$
$$0.0489 \le \sigma \le 0.123$$

e. (3 points) What is the coefficient of variation of relative growth rate?

$$CV = 100\%\,\frac{s}{\bar{Y}} = 100\%\,\frac{0.06992}{0.9709} = 7.2\%$$

6. (12 points) In a study in Scotland, researchers left a total of 240 wallets around town, as though the wallets were lost. Each contained contact information including an address. 101 of the wallets were returned by people who found them. Estimate the proportion of lost wallets that will be returned, and give a 95% confidence interval for this estimate. Use the confidence interval to say whether the data are consistent with a true probability of wallets being returned of 50%.

Estimated proportion returned = 101/240 = 0.421

95% confidence interval for the proportion:

$$p' = \frac{X + 2}{n + 4} = \frac{101 + 2}{240 + 4} = 0.4221$$
$$p' - Z\sqrt{\frac{p'(1-p')}{n+4}} \le p \le p' + Z\sqrt{\frac{p'(1-p')}{n+4}}$$
$$0.4221 - 1.96\sqrt{\frac{0.4221\,(1-0.4221)}{240+4}} \le p \le 0.4221 + 1.96\sqrt{\frac{0.4221\,(1-0.4221)}{240+4}}$$
$$0.360 \le p \le 0.484$$

Because 0.50 lies outside this interval, the data are not consistent with a true return probability of 50%.
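
The arithmetic in the answers to questions 1, 4, 5 and 6 can be rechecked with a short script. What follows is a minimal Python sketch, not part of the original key. It uses only the standard library, takes the critical values (2.23, 20.48, 3.25, 1.96) directly from the worked answers rather than computing them, and all variable names are illustrative assumptions.

```python
import math
import statistics

# Q1: first success on a given try, with per-try success probability 0.2.
p = 0.2
first_on_fourth = (1 - p) ** 3 * p                      # three misses, then a catch
within_four = sum((1 - p) ** k * p for k in range(4))   # first catch on try 1, 2, 3 or 4
after_four = 1 - within_four                            # same as (1 - p) ** 4

# Q4: chi-square statistic for the 2 x 2 table.
# Rows: hears voices / doesn't hear voices. Columns: high / low coffee.
observed = [[9, 3], [13, 19]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

# Q5: summary statistics and intervals for the yeast growth rates.
growth = [0.86, 1.02, 1.02, 1.01, 1.02, 1, 0.99, 1.01, 0.91, 0.83, 1.01]
n = len(growth)
mean = statistics.mean(growth)
median = statistics.median(growth)
s2 = statistics.variance(growth)        # sample variance, divides by n - 1
s = math.sqrt(s2)
T_CRIT = 2.23                           # t critical value quoted in the key (df = 10)
half_width = T_CRIT * s / math.sqrt(n)
CHI2_UPPER, CHI2_LOWER = 20.48, 3.25    # chi-square critical values quoted in the key
var_ci = ((n - 1) * s2 / CHI2_UPPER, (n - 1) * s2 / CHI2_LOWER)
sd_ci = (math.sqrt(var_ci[0]), math.sqrt(var_ci[1]))
cv = 100 * s / mean

# Q6: interval based on the adjusted proportion p' = (X + 2) / (n + 4).
Z = 1.96
returned, wallets = 101, 240
p_adj = (returned + 2) / (wallets + 4)
margin = Z * math.sqrt(p_adj * (1 - p_adj) / (wallets + 4))
prop_ci = (p_adj - margin, p_adj + margin)

print(f"Q1a {first_on_fourth:.4f}  Q1b {after_four:.4f}")
print(f"Q4 chi-square {chi2:.3f}")
print(f"Q5 mean {mean:.3f} +/- {half_width:.3f}, median {median}, CV {cv:.1f}%")
print(f"Q5 variance CI {var_ci[0]:.5f} to {var_ci[1]:.5f}")
print(f"Q5 SD CI {sd_ci[0]:.4f} to {sd_ci[1]:.3f}")
print(f"Q6 proportion CI {prop_ci[0]:.3f} to {prop_ci[1]:.3f}")
```

The critical values are hard-coded so that the script depends on nothing beyond the standard library; with a statistics package available, they could be looked up instead.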