Comparing Means Between Groups
Michael Ash
Lecture 6

Summary of Main Points

- Comparing means between groups is an important method for program evaluation by policy analysts and public administrators.
  - The question “Does a program work?” is often answered in terms of the program’s effect on the mean of an important outcome variable, by comparing the mean of a treated group with the mean of a comparison group.
- Comparing means between groups is an important method for identifying discrimination and other social problems.
  - Examples: income by white or non-white; drop-out risk by single-parent or two-parent household; body mass index (BMI) by urban or suburban residence.
- The treated group and the comparison group are samples from two different populations. Sampling variation (rather than true underlying differences in the populations) may account for differences in the sample means between groups.
- Statistical methods apply.

Caveats

- Outcome: The outcome must measure something worth knowing.
- Confounding factors and selection into treatment: The treatment and comparison groups may differ in ways other than the receipt of treatment.
- Mean: The population mean does not fully describe the distribution of outcomes. For example, two groups with equal population mean income could have different probabilities of extreme poverty.

Means for Different Populations

Example of two populations:

1. Population of women recently graduated from college, mean earnings $\mu_w$
2. Population of men recently graduated from college, mean earnings $\mu_m$

Hypothesis Test for the Difference Between Two Means

The null hypothesis is that the difference is some amount $d_0$ specified by the researcher.

$$H_0: \mu_m - \mu_w = d_0 \qquad H_1: \mu_m - \mu_w \neq d_0$$

For example, $d_0 = 0$ would set up the test that there is no difference in mean earnings between recent male and female college graduates.

Procedure to Test a Null about Differences

1. $\bar{Y}_w$ is a good estimate of $\mu_w$, and $\bar{Y}_m$ is a good estimate of $\mu_m$.
2. $\bar{Y}_m - \bar{Y}_w$ is a good estimate of the difference in population means, $\mu_m - \mu_w$.
3. $\bar{Y}_w$ and $\bar{Y}_m$ are subject to sampling variation, as is the difference $\bar{Y}_m - \bar{Y}_w$. We will need an estimate of the standard deviation of $\bar{Y}_m - \bar{Y}_w$.
4. We want to know whether, under the null hypothesis, the random variable $(\bar{Y}_m - \bar{Y}_w) - d_0$ (the gap between the difference in sample means and the null-hypothesized difference in population means) is likely to be as large as the gap actually observed in our particular sample.

The Hypothesis Test

- A test statistic for the difference between the difference in sample means and the null-hypothesized difference in population means:

$$t = \frac{(\bar{Y}_m - \bar{Y}_w) - d_0}{SE(\bar{Y}_m - \bar{Y}_w)}$$

- Under the null hypothesis, this test statistic is distributed approximately $N(0, 1)$ if the two samples are reasonably large.
- If the test statistic is “large” (bigger than 1.96 in absolute value), then we reject the null hypothesis. Why? The actual difference in sample means would be unlikely to be as big as it is if the null were true.

Standard error of the difference in sample means

$$SE(\bar{Y}_m - \bar{Y}_w) = \sqrt{\frac{s_m^2}{n_m} + \frac{s_w^2}{n_w}}$$

- $s_m^2$, sample variance for men’s earnings:

$$s_m^2 = \frac{1}{n_m - 1} \sum_{i=1}^{n_m} \left(Y_i - \bar{Y}_m\right)^2$$

- $s_w^2$, sample variance for women’s earnings:

$$s_w^2 = \frac{1}{n_w - 1} \sum_{j=1}^{n_w} \left(Y_j - \bar{Y}_w\right)^2$$

Real-world data

- Table 3.1 presents summary statistics from real-world data.
- It is a useful exercise to think about the underlying data:
  - What is the unit of observation?
  - What variables are reported for each observation?

    . use cps_ch3
    . list in 1/7

         +-------------------------+
         | a_sex   year      ahe98 |
         |-------------------------|
      1. |     1   1992   12.99912 |
      2. |     1   1992   11.61796 |
      3. |     1   1992   17.37729 |
      4. |     2   1992   10.06127 |
      5. |     1   1992   16.75668 |
         |-------------------------|
      6. |     2   1992   9.216171 |
      7. |     2   1992   15.95874 |
         +-------------------------+

Comparing means with Stata

Stata can tabulate and summarize data for us.
    . tabulate a_sex if year==1992, summarize(ahe98)

                |          Summary of ahe98
          a_sex |        Mean   Std. Dev.       Freq.
    ------------+------------------------------------
              1 |   17.574572   7.4964888        1591
              2 |   15.220472   5.9732026        1371
    ------------+------------------------------------
          Total |   16.484946    6.932766        2962

With just one command, we have moved from “raw” individual data to the summary statistics in the first line of Table 3.1. (Think about how long it would take to do this in Excel, or by hand.)

Comparing means with Stata

In fact, we can now test a null of equality ($d_0 = 0$) of mean hourly earnings for men and women in 1992, or $H_0: \mu_m - \mu_w = 0$.

$$t = \frac{(\bar{Y}_m - \bar{Y}_w) - d_0}{SE(\bar{Y}_m - \bar{Y}_w)} = \frac{17.57 - 15.22 - 0}{SE(\bar{Y}_m - \bar{Y}_w)} = \frac{2.35}{SE(\bar{Y}_m - \bar{Y}_w)}$$

$$SE(\bar{Y}_m - \bar{Y}_w) = \sqrt{\frac{s_m^2}{n_m} + \frac{s_w^2}{n_w}} = \sqrt{\frac{7.50^2}{1591} + \frac{5.97^2}{1371}} \approx 0.25$$

That is, about 25 cents per hour.

Aside: is this SE of 25 cents plausible?

- The SE for men’s earnings is $s_m / \sqrt{n_m} = 7.50 / \sqrt{1591} \approx 0.19$.
- The SE for women’s earnings is $s_w / \sqrt{n_w} = 5.97 / \sqrt{1371} \approx 0.16$.
- The SE for the difference should not be tremendously different from the SE for each group. (If you computed an SE of 7, you should be worried.)

Returning to our test statistic

$$t = \frac{(\bar{Y}_m - \bar{Y}_w) - d_0}{SE(\bar{Y}_m - \bar{Y}_w)} = \frac{17.57 - 15.22 - 0}{SE(\bar{Y}_m - \bar{Y}_w)} = \frac{2.35}{0.25} = 9.4$$

This is a very large t-statistic (a t-statistic of about 2 in absolute value is all that is required to reject the null hypothesis). So we reject the null hypothesis of equal wages with very high confidence: there is a very low probability that a difference in sample means this large is due only to sampling variation.

Applying the method to a different null

Is the difference between male and female earnings 1.50 dollars per hour?
$$H_0: \mu_m - \mu_w = 1.50$$

$$t = \frac{(\bar{Y}_m - \bar{Y}_w) - d_0}{SE(\bar{Y}_m - \bar{Y}_w)} = \frac{(17.57 - 15.22) - 1.50}{SE(\bar{Y}_m - \bar{Y}_w)} = \frac{0.85}{0.25} = 3.4$$

We can reject this null hypothesis as well ($\Pr(|t| > 3.4) < 0.001$).

Young men’s earnings over time

$$H_0: \mu_{m,1998} - \mu_{m,1992} = 0$$

$$t = \frac{(\bar{Y}_{m,1998} - \bar{Y}_{m,1992}) - d_0}{SE(\bar{Y}_{m,1998} - \bar{Y}_{m,1992})} = \frac{(17.94 - 17.57) - 0}{SE(\bar{Y}_{m,1998} - \bar{Y}_{m,1992})} = \frac{0.37}{0.28} \approx 1.32$$

N.B. We are looking at two different samples of young men from two different cohorts.

Young men’s earnings over time

1. This t-statistic is well below 1.96.
2. $\Pr(|t| > 1.32) \approx 0.19$: about 19 percent of the time, the sample means will differ by at least this much if there is no true difference in the population means.
3. We cannot reject the null hypothesis at the 5 percent significance level: there is no statistically significant evidence that the wages of recent male college graduates were higher in the late 1990s than they had been in the early 1990s.

Bernoulli outcomes

A very common application.

1. What is the fraction of positive outcomes ($Y = 1$) in the population?
2. Does the fraction of positive outcomes ($Y = 1$) differ between two groups?

The methods are identical to those for continuous variables, but the interpretation and computations differ slightly.

Example question: “Do you approve of the job . . . is doing as your President?”

Bernoulli outcomes

An individual’s response is yes ($Y_i = 1$) or no ($Y_i = 0$). Call $p_x$ the mean population approval of President $x$ and $\hat{p}_x$ the mean sample approval of President $x$. Note that

$$\hat{p}_x = \frac{1}{n} \sum_{i=1}^{n} Y_i$$

                    Sample size   Fraction “yes”
    President I             250             0.54
    President II            300             0.44

(Think about the underlying data.)

Is approval different from 50 percent?

$$H_0: p_I = 0.5$$

$$t = \frac{\hat{p}_I - p_{I,0}}{SE(\hat{p}_I)} = \frac{\hat{p}_I - 0.5}{SE(\hat{p}_I)} = \frac{0.54 - 0.5}{SE(\hat{p}_I)}$$

The standard error starts from the same formula as before (no difference so far) and then uses the special form of $s_Y^2$ for a Bernoulli variable, $s_Y^2 = \hat{p}(1 - \hat{p})$:

$$SE(\hat{p}_I) = \sqrt{\frac{s_Y^2}{n}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \sqrt{\frac{0.54 \cdot 0.46}{250}} \approx 0.0315$$

$$t = \frac{0.54 - 0.5}{0.0315} \approx 1.27$$

The t-statistic is smaller than 1.96, so we cannot reject the null hypothesis.
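The one-sample approval test above can be checked numerically. The following is a minimal Python sketch, not part of the original lecture (which uses Stata); the function names `proportion_se`, `one_sample_t`, and `two_sided_p_value` are hypothetical helpers introduced only for illustration, and the p-value relies on the large-sample N(0, 1) approximation described in the slides.

```python
import math


def proportion_se(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


def one_sample_t(p_hat, n, p_null):
    """t-statistic for H0: p = p_null, using the Bernoulli standard error."""
    return (p_hat - p_null) / proportion_se(p_hat, n)


def two_sided_p_value(t):
    """Two-sided p-value, 2 * (1 - Phi(|t|)), under the N(0, 1) approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# President I from the slides: 250 respondents, 54 percent approve, H0: p = 0.5.
se = proportion_se(0.54, 250)
t = one_sample_t(0.54, 250, 0.5)
p_value = two_sided_p_value(t)

print(f"SE = {se:.4f}, t = {t:.2f}, two-sided p = {p_value:.2f}")
print("reject H0 at 5 percent" if abs(t) > 1.96 else "cannot reject H0 at 5 percent")
```

Carrying the unrounded standard error through the division is what yields a t-statistic near 1.27; rounding the SE first shifts the second decimal slightly.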
Polling: margin of error

By the way, poll results are often expressed with a “margin of error” that is, in fact, the half-width of the 95 percent confidence interval.

$$\Pr\left(\hat{p} - 1.96\,SE(\hat{p}) \le p \le \hat{p} + 1.96\,SE(\hat{p})\right) = 0.95$$
$$\Pr(0.54 - 1.96 \times 0.0315 \le p \le 0.54 + 1.96 \times 0.0315) = 0.95$$
$$\Pr(0.54 - 0.06 \le p \le 0.54 + 0.06) = 0.95$$
$$\Pr(0.48 \le p \le 0.60) = 0.95$$

The margin of error would be reported as $\pm 1.96 \times SE(\hat{p}) = \pm 0.06$.

Note the importance of sample size for determining the standard error and the margin of error of a poll:

$$SE(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

You can push down the SE, and with it the margin of error, by increasing the sample size.

Approval rating for two presidents

Is approval for President I different from approval for President II?

$$H_0: p_I - p_{II} = 0$$

$$t = \frac{(\hat{p}_I - \hat{p}_{II}) - d_0}{SE(\hat{p}_I - \hat{p}_{II})} = \frac{(\hat{p}_I - \hat{p}_{II}) - 0}{SE(\hat{p}_I - \hat{p}_{II})} = \frac{0.54 - 0.44}{SE(\hat{p}_I - \hat{p}_{II})}$$

$$SE(\hat{p}_I - \hat{p}_{II}) = \sqrt{\frac{s_{Y_I}^2}{n_I} + \frac{s_{Y_{II}}^2}{n_{II}}} = \sqrt{\frac{\hat{p}_I(1 - \hat{p}_I)}{n_I} + \frac{\hat{p}_{II}(1 - \hat{p}_{II})}{n_{II}}} = \sqrt{\frac{0.54 \cdot 0.46}{250} + \frac{0.44 \cdot 0.56}{300}} \approx 0.0426$$

$$t = \frac{0.54 - 0.44}{0.0426} \approx 2.35$$

The t-statistic exceeds 1.96, so we can reject the null that approval for the two presidents is equal.
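The two-group comparison and the margin of error can be reproduced with a few lines of code. This is a hedged Python sketch rather than the lecture's own Stata workflow; `se_difference` and `t_statistic` are hypothetical helper names, and the inputs are the rounded summary statistics quoted in the slides, so results may differ slightly from a calculation on the raw data.

```python
import math


def se_difference(var_a, n_a, var_b, n_b):
    """SE of a difference in sample means: sqrt(var_a / n_a + var_b / n_b)."""
    return math.sqrt(var_a / n_a + var_b / n_b)


def t_statistic(estimated_diff, d0, se):
    """t = (estimated difference - null-hypothesized difference) / SE."""
    return (estimated_diff - d0) / se


# Approval example: for a Bernoulli outcome the sample variance is p_hat * (1 - p_hat).
p1, n1 = 0.54, 250  # President I
p2, n2 = 0.44, 300  # President II
se_approval = se_difference(p1 * (1 - p1), n1, p2 * (1 - p2), n2)
t_approval = t_statistic(p1 - p2, 0.0, se_approval)

# Margin of error for President I's poll: 1.96 times the SE of p_hat.
margin_of_error = 1.96 * math.sqrt(p1 * (1 - p1) / n1)

# The same two functions cover the continuous case (1992 hourly earnings).
se_earnings = se_difference(7.50 ** 2, 1591, 5.97 ** 2, 1371)
t_earnings = t_statistic(17.57 - 15.22, 0.0, se_earnings)

print(f"approval: SE = {se_approval:.4f}, t = {t_approval:.2f}")
print(f"margin of error (President I) = {margin_of_error:.3f}")
print(f"earnings: SE = {se_earnings:.3f}, t = {t_earnings:.2f}")
```

Because the code does not round the earnings SE to 0.25 before dividing, its earnings t-statistic comes out slightly larger than the hand calculation; the conclusion (reject the null) is unchanged.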