Testing Differences between Means, continued
Statistics for Political Science
Levin and Fox, Chapter Seven

Testing Differences between Means

To test the significance of a mean difference, we need the standard deviation of the distribution of mean differences. However, we rarely know this standard deviation, since we rarely have population data. Fortunately, it can be estimated from two samples that we draw from the same population. Remember that the z formula requires the standard deviation of the distribution of mean differences.

Step 2b: Translate our sample mean difference into units of standard deviation.

$$ z = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{\sigma_{\bar{X}_1 - \bar{X}_2}} $$

Where
$\bar{X}_1$ = mean of the first sample
$\bar{X}_2$ = mean of the second sample
0 = zero, the value of the mean of the sampling distribution of differences between means (we assume that $\mu_1 - \mu_2 = 0$)
$\sigma_{\bar{X}_1 - \bar{X}_2}$ = standard error of the difference between means (standard deviation of the distribution of the difference between means)

We can reduce this equation to the following:

$$ z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}} $$

Child Rearing: Comparing Males and Females

Result (assuming $\sigma_{\bar{X}_1 - \bar{X}_2}$ equals 2):

$$ z = \frac{45 - 40}{2} = +2.5 $$

Thus, a difference of 5 between the means of the two samples (women and men) falls 2.5 standard deviations from a mean of zero.

Standard Error of the Difference between Means

Here is how the standard error of the difference between means can be calculated:

$$ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)} $$

Where

$$ s_1^2 = \frac{\sum X_1^2}{N_1} - \bar{X}_1^2 \qquad s_2^2 = \frac{\sum X_2^2}{N_2} - \bar{X}_2^2 $$

The formula for $s_{\bar{X}_1 - \bar{X}_2}$ combines the information from the two samples.

A large difference between $\bar{X}_1$ and $\bar{X}_2$ can result if (1) one mean is very small, (2) one mean is very large, or (3) one mean is moderately small and the other is moderately large.
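As a minimal sketch, Step 2b and the raw-score variance formula can be written in Python. The function names are illustrative and do not come from the text; the standard error of 2 is the value assumed in the child-rearing example, and the three demonstration scores are made up purely to exercise the variance formula.

```python
def z_for_mean_difference(mean1, mean2, std_error, hypothesized_diff=0.0):
    """Express a difference between two sample means in standard-error units."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    return ((mean1 - mean2) - hypothesized_diff) / std_error


def variance_from_raw_scores(scores):
    """Raw-score formula: (sum of squared scores) / N minus the squared mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(x * x for x in scores) / n - mean ** 2


# Child-rearing example: means of 45 and 40, standard error assumed to be 2.
z = z_for_mean_difference(45, 40, 2)
print(z)  # 2.5

# Hypothetical scores (not from the text), only to show the variance formula.
demo_scores = [1, 2, 3]
print(round(variance_from_raw_scores(demo_scores), 3))  # 0.667
```

The `hypothesized_diff` argument defaults to zero because the null hypothesis assumes that the two population means are equal.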
Variance: Weeks on Unemployment

Step 1: Calculate the mean.
Step 2: Calculate the deviation of each raw score from the mean.
Step 3: Calculate the sum of squared deviations.

X (weeks)    Deviation (X - mean)    Squared deviation
9            9 - 5 = 4               4² = 16
8            8 - 5 = 3               3² = 9
6            6 - 5 = 1               1² = 1
4            4 - 5 = -1              (-1)² = 1
2            2 - 5 = -3              (-3)² = 9
1            1 - 5 = -4              (-4)² = 16
ΣX = 30                              Σ(X - mean)² = 52

N = 6, so the mean is $\bar{X} = 30 / 6 = 5$.

Step 4: Calculate the mean of the squared deviations.

Variance:

$$ s^2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{52}{6} = 8.67 \text{ (weeks squared)} $$

Testing the Difference between Means

Let's say that we have the following information about two samples, one of liberals and one of conservatives, on the progressive scale:

Liberals                 Conservatives
$N_1 = 25$               $N_2 = 35$
$\bar{X}_1 = 60$         $\bar{X}_2 = 49$
$s_1 = 12$               $s_2 = 14$

We can use this information to calculate the estimate of the standard error of the difference between means. We start with our formula:

$$ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)} $$

$$ = \sqrt{\left(\frac{(25)(12)^2 + (35)(14)^2}{25 + 35 - 2}\right)\left(\frac{25 + 35}{(25)(35)}\right)} $$

$$ = \sqrt{\left(\frac{3{,}600 + 6{,}860}{58}\right)\left(\frac{60}{875}\right)} $$

$$ = \sqrt{(180.3448)(.0686)} = \sqrt{12.3717} = 3.52 $$

The standard error of the difference between means is 3.52. We can now use this result to translate the difference between sample means into a t ratio:

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{60 - 49}{3.52} = \frac{11}{3.52} = 3.13 $$

REMEMBER: We use t instead of z because we do not know the true population standard deviation.

We aren't finished yet! Turn to Table C.

1) Because we are estimating both $\sigma_1$ and $\sigma_2$ from $s_1$ and $s_2$, we use a wider t distribution, with degrees of freedom $N_1 + N_2 - 2$.
2) For each standard deviation that we estimate, we lose 1 degree of freedom from the total number of cases.

N = 60
df = 25 + 35 - 2 = 58

In Table C, use the row for df = 40, since 58 is not given. We see that our t value of 3.13 exceeds every critical value in that row except the one for the .001 level.
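The two worked calculations above (the variance of weeks on unemployment and the t ratio for liberals versus conservatives) can be checked with a short Python sketch. The function and variable names are illustrative, not from the text. The critical values are the Table C row for df = 40 as quoted in this transcript. Note that the sketch carries full precision, whereas the slides round at intermediate steps.

```python
import math


def variance(scores):
    """Mean of the squared deviations from the mean (divides by N, as in the slides)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


def se_of_difference(n1, s1, n2, s2):
    """Estimated standard error of the difference between two independent means."""
    pooled = (n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled * (n1 + n2) / (n1 * n2))


# Weeks on unemployment, N = 6.
weeks = [9, 8, 6, 4, 2, 1]
weeks_variance = variance(weeks)

# Liberals (sample 1) and conservatives (sample 2) on the progressive scale.
n1, mean1, s1 = 25, 60, 12
n2, mean2, s2 = 35, 49, 14

se = se_of_difference(n1, s1, n2, s2)
t = (mean1 - mean2) / se
df = n1 + n2 - 2

# Table C row for df = 40 (used because df = 58 is not listed).
TABLE_C_DF40 = {0.20: 1.303, 0.10: 1.684, 0.05: 2.021,
                0.02: 2.423, 0.01: 2.704, 0.001: 3.551}
rejected_levels = [alpha for alpha, critical in TABLE_C_DF40.items() if t > critical]

print(round(weeks_variance, 2))    # 8.67
print(round(se, 2), round(t, 2))   # 3.52 3.13
print(df, rejected_levels)         # 58 [0.2, 0.1, 0.05, 0.02, 0.01]
```

Using the df = 40 row for df = 58 is conservative: the critical values for 40 degrees of freedom are slightly larger than those for 58, so any result that is significant against this row would also be significant against the exact one.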
df     .20      .10      .05      .02      .01      .001
40     1.303    1.684    2.021    2.423    2.704    3.551

Therefore, depending on the significance level we established BEFORE our study, we reject the null hypothesis at the .10, .05, or .01 level.

Comparing the Same Sample Measured Twice

Some research employs a panel design or before-and-after test (testing the same sample at two points in time). In these studies the same sample is tested twice. It is not two samples from the same population; it is the same group of people measured twice.

CRITICAL POINTS TO NOTE:
1. The same sample measured twice uses the t test of difference between means for the same sample measured twice.
2. Different samples from the same population selected at two points in time use the t test of difference between means for independent groups.

Example Problem of Test of Difference Between Means for Same Sample Measured Twice

Null Hypothesis ($\mu_1 = \mu_2$): The degree of neighborliness does not differ before and after relocation.
Research Hypothesis ($\mu_1 \neq \mu_2$): The degree of neighborliness differs before and after relocation.
Where $\mu_1$ is the mean score of neighborliness at time 1
Where $\mu_2$ is the mean score of neighborliness at time 2

Respondent    Before (X1)    After (X2)    Difference (D = X1 - X2)    Difference squared (D²)
Johnson       2              1             1                           1
Robinson      1              2             -1                          1
Brown         3              1             2                           4
Thomas        3              1             2                           4
Smith         1              2             -1                          1
Holmes        4              1             3                           9
              ΣX1 = 14       ΣX2 = 8                                   ΣD² = 20

The formula for obtaining the standard deviation for the distribution of before-after difference scores:

$$ s_D = \sqrt{\frac{\sum D^2}{N} - (\bar{X}_1 - \bar{X}_2)^2} $$

$s_D$ = standard deviation of the distribution of before-after difference scores
D = before-move raw score minus after-move raw score
N = number of cases or respondents in the sample

From this, we get the formula for the standard error of the difference between the means, written here as $s_{\bar{D}}$ to distinguish it from $s_D$:

$$ s_{\bar{D}} = \frac{s_D}{\sqrt{N - 1}} $$

Step 1: Find the mean for each point in time.

$$ \bar{X}_1 = \frac{\sum X_1}{N} = \frac{14}{6} = 2.33 \qquad \bar{X}_2 = \frac{\sum X_2}{N} = \frac{8}{6} = 1.33 $$

Step 2: Find the standard deviation for the difference between the times.

$$ s_D = \sqrt{\frac{20}{6} - (2.33 - 1.33)^2} = 1.53 $$

Step 3: Find the standard error for the difference between the times.

$$ s_{\bar{D}} = \frac{1.53}{\sqrt{6 - 1}} = .68 $$

Step 4: Translate the mean difference into a t score.

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{D}}} = \frac{2.33 - 1.33}{.68} = 1.47 $$

Step 5: Calculate the degrees of freedom.

df = N - 1 = 6 - 1 = 5

Step 6: Compare the obtained t ratio with the t ratio in Table C.

Obtained t = 1.47
Table t = 2.571
df = 5
α = .05

df     .20      .10      .05      .02      .01      .001
5      1.476    2.015    2.571    3.365    4.032    6.859

In order to reject the null hypothesis at the .05 significance level with five degrees of freedom, we must obtain a calculated t ratio of at least 2.571. Because our t ratio is only 1.47, we retain the null hypothesis.

Two Sample Test of Proportions

$$ z = \frac{P_1 - P_2}{s_{P_1 - P_2}} $$

Where $P_1$ and $P_2$ are the respective sample proportions.

The standard error of the difference in proportions is:

$$ s_{P_1 - P_2} = \sqrt{P^*(1 - P^*)\left(\frac{N_1 + N_2}{N_1 N_2}\right)} $$

Where $P^*$ is the combined sample proportion:

$$ P^* = \frac{N_1 P_1 + N_2 P_2}{N_1 + N_2} $$

Requirements to consider when judging the appropriateness of the t ratio as a test of significance (for testing the difference between means):

1. The t ratio is used to make comparisons between two means.
2. The assumption is that we are working with interval-level data.
3. We used a random sampling process.
4. The sample characteristic is normally distributed.
5. The t ratio for independent samples assumes that the population variances are equal.

So how do you interpret the results and state them for inclusion in your research?

“Since the observed value of t (state the test statistic) exceeds the critical value (state the critical value), the null hypothesis is rejected in favor of the directional alternative hypothesis. The probability that the observed difference (state the difference between means) would have occurred by chance, if in fact the null hypothesis is true, is less than .05.”
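The same-sample-measured-twice example and the two-sample test of proportions can be sketched in Python as follows. The function names are illustrative, not from the text. The neighborliness scores are the six respondents from the example; the proportions passed to `two_proportion_z` are hypothetical, since the transcript gives the formulas without data. Carrying full precision gives t of about 1.46; the slides' 1.47 comes from rounding the standard error to .68 before dividing.

```python
import math


def paired_t(before, after):
    """t ratio and degrees of freedom for the same sample measured twice."""
    n = len(before)
    diffs = [b - a for b, a in zip(before, after)]
    mean_diff = sum(before) / n - sum(after) / n
    # Standard deviation of the before-after difference scores.
    s_d = math.sqrt(sum(d * d for d in diffs) / n - mean_diff ** 2)
    # Standard error of the difference between the two means.
    std_error = s_d / math.sqrt(n - 1)
    return mean_diff / std_error, n - 1


def two_proportion_z(n1, p1, n2, p2):
    """z ratio for the difference between two sample proportions."""
    p_star = (n1 * p1 + n2 * p2) / (n1 + n2)  # combined sample proportion
    std_error = math.sqrt(p_star * (1 - p_star) * (n1 + n2) / (n1 * n2))
    return (p1 - p2) / std_error


# Neighborliness before and after relocation (Johnson ... Holmes).
before = [2, 1, 3, 3, 1, 4]
after = [1, 2, 1, 1, 2, 1]
t, df = paired_t(before, after)
critical_t = 2.571  # Table C, df = 5, alpha = .05
print(round(t, 2), df)  # 1.46 5
print("reject" if abs(t) >= critical_t else "retain")  # retain

# Hypothetical proportions (not from the text): 60% of 100 versus 50% of 100.
z = two_proportion_z(100, 0.60, 100, 0.50)
print(round(z, 2))  # 1.42
```

Either way of rounding leads to the same decision here, because both 1.46 and 1.47 fall well below the critical value of 2.571.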