THE UNIVERSITY OF MANITOBA
FINAL EXAMINATION

DATE: DEC. 21, 2005, 6:00 p.m.-8:00 p.m.
PAPER NO.: 608
DEPARTMENT & COURSE NO.: 005.200
EXAMINATION: Basic Statistical Analysis II
DURATION: 2 hours
PAGE NO.: Page 1 of 16
EXAMINER:

INSTRUCTIONS

I. You have been provided with: (A) the examination paper, (B) a multiple choice answer sheet, (C) a formula sheet at the end of the exam paper, (D) a booklet of tables.

II. The total number of marks possible is 50.

III. This exam consists of 36 multiple choice questions. It is suggested that you first complete the questions on the examination paper by choosing the BEST answer out of five in each case; then transfer your answers to the multiple choice answer sheet by blackening the appropriate space with an "HB" or "F" pencil. Only one space should be blackened; otherwise, the question will be marked wrong. The questions are of equal value. There is no penalty for guessing; therefore, all questions should be attempted.

IV. Calculators are permitted. However, programmable calculators, graphing calculators and cell phones are NOT permitted.

V. At the end of the examination period, turn in (i) your multiple choice answer sheet and (ii) the booklet of tables. Be sure to write your NAME, STUDENT NUMBER and INSTRUCTOR'S NAME on your MULTIPLE CHOICE ANSWER SHEET. Also, using a pencil, fill in the bubbles corresponding to your (7-digit) student number.

VI. The backsides of the test may be used for rough work. Answers will be posted on the department's bulletin board, outside Room 311 on the 3rd floor of Machray Hall, and will also be posted on the department website.

1. The time needed for college students to complete a certain paper-and-pencil maze follows a normal distribution with a mean of 30 seconds and a standard deviation of 3 seconds. We wish to see if the mean reaction time required is affected by vigorous exercise, so a group of 9 students exercises vigorously for 30 minutes and then completes the maze. We compute the sample mean and use this information to test the hypotheses H0: µ = 30 vs. Ha: µ ≠ 30 at the 0.05 level of significance. The power of the test against Ha: µ = 32 seconds is approximately:
(A) 0.4840 (B) 0.0630 (C) 0.5160* (D) 0.4877 (E) 0.7190

2. A random variable X is known to follow a normal distribution with standard deviation σ = 3.61. A sample of 24 individuals is selected. A hypothesis test of H0: µ = 17.8 vs. Ha: µ > 17.8 is to be conducted for the population mean µ. It is decided that the null hypothesis will be rejected if $\bar{X} \ge 19.09$. The probability of a Type I error is closest to:
(A) 0.01 (B) 0.03 (C) 0.04* (D) 0.05 (E) 0.07

Questions 3 to 4 refer to the following: In the following analysis of variance problem to compare the means of three normally distributed populations, the following table of means and standard deviations is obtained from three independent samples. The overall mean turns out to be 78.

Sample   No. of observations   Mean   Std. Dev.
1        8                     83     10
2        5                     88     12
3        15                    72     6

3. A pooled estimate of the common standard deviation is:
(A) 93.33 (B) 9.33 (C) 73.57 (D) 71.20 (E) 8.08*

4. The mean square for groups is:
(A) 134 (B) 620* (C) 1780 (D) 71.20 (E) 1240

5. Roger Ebert and Richard Roeper are film critics who host a weekly television show on which they rate movies currently showing in theatres. Each of them rates a movie either as "Thumbs Up" (if they like the movie) or "Thumbs Down" (if they dislike it). Let X be the total number of "Thumbs Up" a movie receives from the two critics.
Suppose the probability distribution of X is $P(X = x) = \frac{cx + 1}{2}$, x = 0, 1, 2. What is the value of the constant c?
(A) 0 (B) -1/5 (C) 1/4 (D) 1/3 (E)* -1/3

6. In order to test the equality of four population means, a sample of size ten is taken from each of the four populations of interest. The following sums of squares and F-ratio are calculated:

SSG = 472.14   SSE = 1378.51   F = 4.11

In conducting the appropriate hypothesis test, we would:
(A) reject H0 at α = 0.01.
(B) reject H0 at α = 0.025, but fail to reject H0 at α = 0.01.*
(C) reject H0 at α = 0.05, but fail to reject H0 at α = 0.025.
(D) fail to reject H0 at α = 0.05.
(E) need to be given the P-value to make our decision.

Questions 7 and 8 refer to the following. Fourteen toddlers are observed over several months at a nursery school. Here are data on the average Time in minutes that a child spent at the table and the average number of Calories the child consumed during lunch.

Time       37.7   33.5   21.4   39.5   22.8   34.1   33.9   42.4   28.6   30.6   35.1   33.0   43.7
Calories   465    461    472    437    508    442    479    450    489    479    439    444    410

The means and standard deviations are summarized as follows:

           Mean     Standard Deviation
Time       33.56    6.67
Calories   459.62   26.12

To assist your calculations, the following JMP INTRO output is provided.

Bivariate Fit of Calories By Time
Linear Fit
Calories = 560.30 - 3.00 Time

Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model      1    4801.54          4801.54       15.6100   0.0023
Error      11   3383.54          307.59
C. Total   12   8185.08

7. The value of the appropriate statistic to test for zero slope is:
(A) 3.95 (B) –3.95* (C) –0.77 (D) 0.77 (E) 0.59

8. The residual for the child who spent an average of 33 minutes at the table is:
(A) 1.68 (B) –1.68 (C) 461.30 (D) –17.3* (E) 17.3

Questions 9 and 10 refer to the following situation: Two scientists developed a revised method for evaluating the precision and accuracy of "kits" for chemical analysis. The accuracy may be measured by comparison with the "true" assay value found in analysis by a reference method. In a glucose study, 100 ml of a patient's serum are analyzed at 10 different times using the "kit" and the reference method, with results as shown below. Do the data support the conclusion that the kit consistently overestimates the reference value?

Analysis    1    2    3    4     5     6    7    8     9     10
Kit         89   96   92   100   106   99   79   101   110   100
Reference   86   92   87   94    90    91   92   93    86    91

9. The P-value for testing the above hypothesis using a sign test is:
(A) 0.0010 (B) 0.0107* (C) 0.0214 (D) 0.9990 (E) 0.9893

10. The value of the test statistic based on ranks (using Reference – Kit) is:
(A) 8* (B) 36 (C) 44 (D) 47 (E) 55

11. Are men and women in Manitoba homogeneous with respect to the proportion of people who prefer each of the three political parties? The following contingency table gives voting data for the most recent provincial election:

Contingency Analysis of Gender By Party
Contingency Table: Party By Gender
(each cell shows Count, Expected, Cell Chi^2)

               Female                 Male                   Total
Conservative   25   29.03   ???       28   23.97   0.68      53
Liberal        26   21.36   1.01      13   17.64   1.22      39
NDP            35   35.61   0.01      30   29.39   0.01      65
Total          86                     71                     157

Using the appropriate test statistic, the P-value of the test is:
(A) less than 0.01
(B) exactly 0.01
(C) between 0.01 and 0.05
(D) between 0.05 and 0.10
(E) greater than 0.10*

12. A candy manufacturer sells packages containing six gumdrops.
The gumdrops are made in many different colours and the colours of gumdrops vary from one package to another. We take a sample of 150 packages and count the number of orange gumdrops in each one. Packages can contain anywhere from zero to six orange gumdrops. We will conduct a chi-square goodness of fit test to determine if the number of orange gumdrops per package follows a binomial distribution. The parameter p is unknown and must be estimated from the sample data. Assuming all expected cell counts are at least five, the appropriate degrees of freedom for the test are:
(A) 4 (B) 148 (C) 6 (D) 5* (E) 149

Questions 13 and 14 refer to the following situation: The arrangement of test items was studied for its effect on anxiety. Two independent samples of Statistics 5.200 students with the same 5.100 final grades were selected. Both groups of students were given a test consisting of the same 40 multiple-choice questions, but the questions for one group were arranged from easy to difficult and the questions for the other group were arranged from difficult to easy. Numbers of correct answers in the two groups were as follows. At the 5% level of significance, is there sufficient evidence to indicate that the two samples are from populations with different medians?

Easy to Difficult:   34   49   28   43   48   35   38
Difficult to Easy:   42   30   20   28   38

13. The value of the rank sum test statistic (computed on Easy to Difficult group) is:
(A) 15 (B) 23 (C)* 24 (D) 28 (E) 54

14. The 5% rejection region for the test statistic is:
(A) T ≤ 21
(B) T ≥ 44
(C) T ≥ 57
(D)* T ≤ 20 or T ≥ 45
(E) T ≤ 33 or T ≥ 58

15. Which of the following is NOT CORRECT?
(A) Nonparametric procedures require fewer assumptions than parametric procedures.
(B) The SIGNED-RANK test can be used for paired data.
(C) Nonparametric procedures can be used with ordered data since all that is needed are the relative sizes of the values.
(D) Tied values are assigned a rank equal to the average of the ranks associated with the tied values.
(E)* The assumption of random samples is not important for non-parametric tests.

16. There is an approximate linear relationship between the height of females and their age (from 5 to 18 years) described by:

height = 50.3 + 6.01(age)

where height is measured in cm and age in years. Which of the following is NOT CORRECT?
(A) The estimated slope is 6.01, which implies that children's height increases by about 6 cm for each year they grow older.
(B) The estimated height of a child who is 10 years old is about 110 cm.
(C)* The estimated intercept is 50.3 cm, which implies that children reach this height when they are 50.3/6.01 = 8.4 years old.
(D) The average height of children when they are 5 years old is about 50% of the average height when they are 18 years old.
(E) My niece is about 8 years old and is about 115 cm tall. She is taller than average.

17. Bottles of a popular cola are supposed to contain 300 milliliters (ml) of cola. There is some variation from bottle to bottle because the filling machinery is not perfectly precise. The distribution of the contents is normal with standard deviation σ = 3 ml. An inspector who suspects that the bottler is underfilling decides to test the hypothesis H0: µ = 300 against Ha: µ < 300, using a sample of n = 9. He rejects H0 if $\bar{x} \le 297.515$. The power against the alternative µ = 295 is:
(A) 0.0059 (B)* 0.9941 (C) 0.7291 (D) 0.2709 (E) 0.8907

18. For a least squares regression line, we find $r^2 = 0.8$. Which of the following statements is true?
(A) About 80% of the variation in the response variable is caused by the variation in the explanatory variable.
(B) About 80% of the variation in the explanatory variable is explained by the least squares regression line.
(C)* About 80% of the variation in the response variable is explained by the least squares regression line.
(D) The correlation between the response variable and the explanatory variable is 0.8.
(E) None of the above.

19. The Gallup Poll has decided to increase the size of its random sample of Manitoba voters from about 1500 people to about 4000 people when estimating the proportion of Manitobans who would vote for the NDP party. The effect of this increase is to:
(A) reduce the bias of the sample proportion.
(B) increase the standard error of the sample proportion.
(C)* reduce the variability of the sample proportion.
(D) increase the width of the confidence interval for the population parameter.
(E) have no effect since the population size is the same.

20. A random sample of 900 individuals has been selected from a large population. It was found that 180 are regular users of vitamins. Thus, the proportion of regular users of vitamins in the population is estimated to be 0.20. An estimate of the standard error of this estimate is:
(A) 0.1600 (B) 0.0002 (C) 0.4000 (D)* 0.0133 (E) 0.0267

Questions 21 to 24 refer to the following situation: A national survey was conducted to obtain information on alcohol consumption patterns of U.S. adults by marital status. A random sample of 1772 residents, 18 years old or older, yielded the data displayed in the following table. We want to see whether there is an association between marital status and alcohol consumption.

NOTE: (1) Expected values for some cells are enclosed in brackets.
(2) Cell χ² values are given in the second row of some cells.
(3) The sum of the given χ² values in the table is 63.49.

                   Drinks per month
Marital Status     Abstain        1-60           Over 60        Total
Single             67 (117.9)     213 (    )     74 (    )      354
                   21.95          2.49           18.78
Married            411 (390.6)    633 (633.5)    129 (148.9)    1173
                   1.07           .00            2.67
Widowed            85 (47.6)      51 (    )      7 (    )       143
                                  8.91           6.86
Divorced           27 (34)        60 (55.1)      15 (13.0)      102
                                  .44            .32
Total              590            957            225            1772

21. The null hypothesis we usually test for data such as this is:
(A) Alcohol consumption depends on marital status.
(B) The four categories of marital status are equally likely for all alcohol consumption categories.
(C) Married people drink less than unmarried people.
(D)* Alcohol consumption is independent of marital status.
(E) None of the above.

22. The expected frequency for the (3,2) cell (number of widowed people who have 1-60 drinks per month) is:
(A)* 77.2 (B) 51 (C) 47.7 (D) 147.7 (E) 239.5

23. The degrees of freedom and 5% critical value for the appropriate test statistic are:
(A) 12 & 21.03 (B) 2 & 5.99 (C) 11 & 19.68 (D) 6 & 14.45 (E)* 6 & 12.59

24. The value of the appropriate test statistic is:
(A) 18.27 (B) 30.83 (C) 64.06 (D) 94.32 (E)* 81.75

25. A researcher wished to compare the average amount of time spent in extracurricular activities by high school students in a suburban school district with that in a school district of a large city. The researcher obtained an SRS of 10 high school students in a large suburban school district and found the mean time spent in extracurricular activities per week to be 6 hours with a standard deviation of 3 hours.
The researcher also obtained an independent SRS of 12 high school students in a large city school district and found the mean time spent in extracurricular activities per week to be 4 hours with a standard deviation of 1 hour. Let µ1 and µ2 represent the mean amount of time spent in extracurricular activities per week by the populations of all high school students in the suburban and city school districts, respectively. The researcher is willing to assume that the two populations are normally distributed with equal variances. Using the above data, suppose the researcher wished to test whether the time spent on extracurricular activities is different in the two types of districts. The P-value of the test is:
(A) larger than .10
(B)* between .05 and .1
(C) between .025 and .05
(D) between .01 and .025
(E) below .01

26. A restaurant manager is considering a new location for her restaurant. The projected annual cash flow for the new location is:

Annual Cash Flow   $10,000   $30,000   $70,000   $90,000   $100,000
Probability        0.10      0.15      0.50      0.15      0.10

The expected cash flow for the new location is:
(A) $61,000 (B)* $64,000 (C) $70,000 (D) $60,000 (E) $50,000

27. A newspaper conducted a province-wide survey concerning the 1999 election. The newspaper took a random sample of 1200 registered voters and found that 570 would vote for the NDP. Let p represent the proportion of registered voters in the province that would vote for the NDP. Using the above data, suppose you wished to see if the NDP had a "clear" majority. To do this you test the hypothesis

H0: p = 0.45 vs. Ha: p > 0.45

The approximate P-value of your test is:
(A) 0.959 (B) 0.88 (C) 0.12 (D) 0.05 (E)* 0.041

28. An agricultural researcher wishes to see if a kelp extract helps prevent frost damage on tomato plants. Two similar small plots are planted with the same variety of tomato. Plants in both plots are treated identically, except the plants on plot 1 are sprayed weekly with a kelp extract while the plants on plot 2 are not. After the first frost in the autumn, the percentage of damaged fruit is determined. For plants in plot 1, 20 of the 100 tomatoes on the vine exhibited damage. For plants in plot 2, 36 of the 100 tomatoes on the vine showed damage. Let p1 and p2 be the actual proportion of all tomatoes of this variety that would experience crop damage under the kelp and no kelp treatments, respectively, when grown under conditions similar to those in the experiment. What is a 99% confidence interval for p1 - p2?
(A) -.16 ± 0.062 (B) -.16 ± 0.122 (C)* -.16 ± 0.161 (D) 16 ± 0.062 (E) -.16 ± 0.141

29. An experiment is conducted to compare the effectiveness of two vaccines. Vaccine A is given to 200 randomly selected people and Vaccine B is given to a separate sample of 300 randomly selected people. Of the 200 people who received vaccine A, 80 were infected. For those receiving vaccine B, 105 were infected. Is there a difference in the proportion of people receiving the two vaccines who got infected? The appropriate test statistic is:

(A) $\dfrac{\frac{80}{200} - \frac{105}{300}}{\sqrt{\frac{80}{200}\cdot\frac{20}{200} + \frac{105}{300}\cdot\frac{195}{300}}}$

(B) $\dfrac{\frac{80}{200} - \frac{105}{300}}{\sqrt{\frac{80}{200}\cdot\frac{20}{200}\cdot\frac{1}{200} + \frac{105}{300}\cdot\frac{195}{300}\cdot\frac{1}{300}}}$

(C) $\dfrac{\frac{80}{200} - \frac{105}{300}}{\sqrt{\frac{185}{500}\cdot\frac{1}{200} + \frac{185}{500}\cdot\frac{1}{300}}}$

(D) $\dfrac{\frac{80}{200} - \frac{105}{300}}{\sqrt{\frac{185}{500}\cdot\frac{315}{500}}}$

(E)* $\dfrac{\frac{80}{200} - \frac{105}{300}}{\sqrt{\frac{185}{500}\cdot\frac{315}{500}\left(\frac{1}{200} + \frac{1}{300}\right)}}$

30. It is hypothesized that an experiment results in outcomes K, L, M and N, with probabilities 1/5, 3/10, 1/10 and 2/5 respectively. Forty independent repetitions of the experiment have results as follows:

Outcome     K    L    M   N
Frequency   11   14   5   10

The chi-square goodness of fit statistic is used to test the above hypothesis.
Let r be the observed value of the test statistic (without pooling), and let s be the critical value corresponding to a significance level of 0.01. The values of r and s are respectively:
(A) r = 95/24 and s = 13.28
(B) r = 28/24 and s = 11.35
(C)* r = 95/24 and s = 11.35
(D) r = 28/24 and s = 13.28
(E) r = 28/24 and s = 0.30

31. Are all employees equally prone to having accidents? To investigate this hypothesis, Parry (1985) looked at a light manufacturing plant and classified the accidents by type and by age of the employee.

               Accident Type
Age            Sprain   Burn   Cut
Under 25       9        17     5
25 or over     61       13     12

The observed value of the chi-square test statistic is 20.78. If we test at level of significance α = .025:
(A) There appears to be no relation between accident type and age.
(B) Age seems to be independent of accident type.
(C)* Accident type seems to be related to age.
(D) There appears to be a 20.78% correlation between accident type and age.
(E) The proportion of sprain, cuts and burns seems to be similar for both age classes.

Questions 32 to 36 refer to the following situation: The following data give the weights (in pounds) of paper discarded by a sample of households along with the sizes of the households.

HH Size (x)   2      3      3      6      4      2      5      1
Weight (y)    10.4   12.6   18.9   18.8   16.8   13.8   22.8   13.6

Moreover, $\bar{x} = 3.25$, $\bar{y} = 15.9625$, $\sum (x - \bar{x})^2 = 19.5$, $\sum (x - \bar{x})(y - \bar{y}) = 35.475$.

[Figure: Bivariate Fit of Weight By HH Size; horizontal axis HH Size, vertical axis Weight]

Linear Fit
Weight = 10.05 + 1.82 HH Size

Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Ratio
Model
Error                            8.6836        Prob > F
C. Total        116.63875

Parameter Estimates
Term        Estimate   Std Error   t Ratio   Prob>|t|
Intercept   10.05      2.406047
HH Size     1.82       0.667317

32. The respective numerator and denominator degrees of freedom for testing whether the slope parameter is zero using the F test are:
(A) 1 and 7 (B) 7 and 6 (C) 6 and 1 (D)* 1 and 6 (E) 6 and 7

33. A 90% prediction interval for the pounds of paper discarded by a household of 5 people is closest to:
(A) 19.15 ± 8.96
(B) 19.15 ± 6.32
(C)* 19.15 ± 6.48
(D) 19.15 ± 5.49
(E) Cannot determine such an interval because that requires extrapolation.

34. The 95% confidence interval for the mean weight of paper discarded by all households of 2 people is closest to:
(A)* 13.69 ± 2.447 × 1.335
(B) 13.69 ± 2.447 × 3.933
(C) 13.69 ± 2.365 × 1.335
(D) 13.69 ± 2.365 × 3.933
(E) 13.69 ± 1.96 × 1.335

35. The observed value of the test statistic for zero correlation ρ is:
(A) (2.727)² (B)* 2.727 (C) 4.176 (D) 8.684 (E) 1.322

36. To see whether the data provides sufficient evidence to indicate a positive population slope β, the P-value is:
(A) less than 0.015
(B)* between 0.015 and 0.025
(C) between 0.025 and 0.05
(D) between 0.05 and 0.10
(E) greater than 0.10

Selected Formulae

1. $SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ with df = smaller of $n_1 - 1$ and $n_2 - 1$

2. $SE(\bar{x}_1 - \bar{x}_2) = s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$ with df = $n_1 + n_2 - 2$ if $\sigma_1^2 = \sigma_2^2$, where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

3. $SSG = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X})^2$

4. Poisson Distribution: $P(X = k) = \dfrac{e^{-\lambda} \lambda^k}{k!}$, $k = 0, 1, 2, \ldots$

5. $t = \dfrac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$

6. $s_b = \dfrac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}}$, $s_e = \sqrt{MSE}$

7. $SE_{\hat{\mu}} = s_e \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$

8. $SE_{\hat{y}} = s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$

9. $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$ if $p_1 \ne p_2$

   $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$ if $p_1 = p_2$, where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$
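
As an illustration only (not part of the examination), the standard errors in formulas 2, 6, 8 and 9 of the formula sheet can be sketched in Python using only the standard library. The function names below are hypothetical helpers chosen for this sketch; the numbers in the usage section are taken from Questions 25, 28, 29 and 32 to 36 above.

```python
import math


def pooled_se(s1, n1, s2, n2):
    """Formula 2: SE of (x1bar - x2bar) assuming equal variances.

    Returns the standard error and its degrees of freedom.
    """
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    return se, n1 + n2 - 2


def slope_se(mse, sxx):
    """Formula 6: s_b = s_e / sqrt(sum (x_i - xbar)^2), with s_e = sqrt(MSE)."""
    return math.sqrt(mse) / math.sqrt(sxx)


def prediction_se(mse, n, x_star, x_bar, sxx):
    """Formula 8: SE for predicting a single new response at x = x_star."""
    return math.sqrt(mse) * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)


def se_diff_props(x1, n1, x2, n2, pooled=False):
    """Formula 9: SE of (p1hat - p2hat), pooled (p1 = p2) or unpooled."""
    if pooled:
        p = (x1 + x2) / (n1 + n2)
        return math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    p1, p2 = x1 / n1, x2 / n2
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


if __name__ == "__main__":
    # Question 25: s1 = 3, n1 = 10, s2 = 1, n2 = 12
    se, df = pooled_se(3, 10, 1, 12)
    print("Q25 pooled SE:", round(se, 4), "df:", df)

    # Question 28: 20/100 damaged with kelp, 36/100 without (unpooled SE)
    print("Q28 SE:", round(se_diff_props(20, 100, 36, 100), 4))

    # Question 29: z statistic using the pooled SE
    z = (80 / 200 - 105 / 300) / se_diff_props(80, 200, 105, 300, pooled=True)
    print("Q29 z:", round(z, 3))

    # Questions 32 to 36: MSE = 8.6836, sum (x - xbar)^2 = 19.5, n = 8
    print("slope SE:", round(slope_se(8.6836, 19.5), 4))
    print("prediction SE at x* = 5:",
          round(prediction_se(8.6836, 8, 5, 3.25, 19.5), 4))
```

Multiplying a standard error by the appropriate critical value from the booklet of tables gives the margin of error for the corresponding interval; the critical values themselves are not computed here.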