Testing the Differences between Means
Statistics for Political Science
Levin and Fox, Chapter Seven

Exam 4 Review Topics
1. Null hypothesis
2. Research hypothesis
3. Standard error of the difference between means
4. t score/ratio
5. Degrees of freedom for t score/ratio
6. t chart (Table C)
7. Sum of squares (total, within, between)
8. Mean square
9. Between-groups and within-groups degrees of freedom
10. Between-groups and within-groups mean square
11. F ratio

Standard Error of the Difference between Means?
Are two populations the same: µ1 = µ2? If so, retain the null hypothesis.
Are two populations different: µ1 ≠ µ2? If so, reject the null hypothesis (accept the research hypothesis).

What is hypothesis testing?
Hypothesis testing means evaluating sample data collected about a particular population to see how likely the sample results are, given our hypothesis about the population.
If the sample results are plausible under the hypothesis about the population, we retain the hypothesis and attribute any departure from our expected results to pure chance based on sampling error.
If the sample results are unlikely (less than 5 chances in 100), we reject the hypothesis.

The Null Hypothesis
Null Hypothesis: the hypothesis that two samples have been drawn from equivalent populations. Any observed difference between samples is a chance occurrence resulting from sampling error alone. A difference in sample means does not by itself imply a difference in population means.
To conclude that sampling error is responsible for an obtained difference between sample means is to retain the null hypothesis:
µ1 = µ2
Where
µ1 = mean of the first population
µ2 = mean of the second population
To Retain: Retaining the null hypothesis does not imply that we have proven the population means are equal, but rather that we lack sufficient evidence to say otherwise (that is, to say that there is a difference between the populations).

The Research Hypothesis for Means Difference
Research Hypothesis: If we reject the null hypothesis, we automatically accept the research hypothesis that a true population difference does exist. The difference between sample means is too large to be accounted for by sampling error.
The research hypothesis for mean differences states that the population means are not equal:
µ1 ≠ µ2

Sampling Distribution of Differences between Means
Recall from our long-distance phone calling example that a researcher who takes multiple samples can build a sampling distribution of means (rather than of raw scores).
Paired Samples: What if the researcher, while gathering samples, studies or compares two samples at a time? The difference between the two sample means in each pair then has a sampling distribution of its own.

Testing Hypotheses with the Distribution of Differences between Means
The sampling distribution of differences between means:
1) Assumes that all sample pairs differ only by virtue of sampling error and not as a function of true population differences.
2) Has a mean of zero (the positive and negative differences tend to cancel each other out).
3) Approximates the normal curve (most of the mean differences fall near zero, which is expected since any difference between means is a product of sampling error).

Probability and Sampling Distribution of Differences between Means: Since the sampling distribution of differences between means approximates the normal curve, we can use the properties of the normal curve to make statements of probability about mean differences, specifically whether a given mean difference is likely to be the result of chance/sampling error or of true population differences.
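
The three properties of the sampling distribution listed above can be checked with a small simulation. The sketch below is illustrative only: the population (normal, mean 50, standard deviation 10), the sample size of 30, and the 2,000 sample pairs are assumed values chosen for the demonstration, not figures from the chapter.

```python
import random
import statistics


def simulate_mean_differences(pairs=2000, n=30, mu=50.0, sigma=10.0, seed=1):
    """Draw `pairs` pairs of samples from ONE population (so the null
    hypothesis is true by construction) and return the difference
    between the two sample means for each pair."""
    rng = random.Random(seed)
    differences = []
    for _ in range(pairs):
        first = [rng.gauss(mu, sigma) for _ in range(n)]
        second = [rng.gauss(mu, sigma) for _ in range(n)]
        differences.append(statistics.mean(first) - statistics.mean(second))
    return differences


differences = simulate_mean_differences()

# Property 2: the mean of the differences should sit close to zero.
mean_of_differences = statistics.mean(differences)

# The standard deviation of this distribution is the standard error
# of the difference between means.
sd_of_differences = statistics.stdev(differences)

# Property 3: roughly 95% of the differences should fall within
# 1.96 standard errors of zero if the distribution is about normal.
share_within = sum(
    abs(d) <= 1.96 * sd_of_differences for d in differences
) / len(differences)

print(f"mean of differences: {mean_of_differences:.3f}")
print(f"standard error:      {sd_of_differences:.3f}")
print(f"share within 1.96:   {share_within:.3f}")
```

Because both samples in every pair come from the same assumed population, every nonzero difference here is sampling error by construction, which is exactly the situation the null hypothesis describes.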

Testing Hypotheses with the Distribution of Differences between Means
Null hypothesis: the closer a mean difference is to zero, the more likely it is to be sampling error.
Research hypothesis: the further a mean difference is from zero, the less likely it is to be sampling error.

Probability and Sampling Distribution of Differences between Means:
If the obtained difference between means lies so far from a difference of zero that it has only a small probability of occurrence in the sampling distribution of differences between means, we reject the null hypothesis.
If our sample mean difference falls so close to zero that its probability of occurrence is large, we must retain the null hypothesis and treat the obtained difference as sampling error.

Example: Child Rearing: Comparing Males and Females
What if the researcher examines one pair of samples (as opposed to 70 pairs) containing 30 men and 30 women? (Subtract the second mean from the first.)
Results:
Women: sample mean = 45.0
Men: sample mean = 40.0
Difference between means: (45.0 - 40.0) = +5.0
How far does +5.0 fall from the mean of zero?

Child Rearing: Comparing Males and Females
Step 2b: Translate our sample mean difference into units of standard deviation.
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$
Where
$\bar{X}_1$ = mean of the first sample
$\bar{X}_2$ = mean of the second sample
0 = zero, the value of the mean of the sampling distribution of differences between means (we assume that µ1 - µ2 = 0)
$\sigma_{\bar{X}_1 - \bar{X}_2}$ = standard error of the difference between means (the standard deviation of the distribution of differences between means)
We can reduce this equation to the following:
$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$

Child Rearing: Comparing Males and Females
Result (assuming $\sigma_{\bar{X}_1 - \bar{X}_2}$ equals 2):
$$z = \frac{45 - 40}{2} = +2.5$$
Thus, a difference of 5 between the means of the two samples (women and men) falls 2.5 standard deviations from a mean of zero.

What is the probability that a difference of 5 between sample means could be caused by sampling error?
The probability of getting a difference of 5 or more (above or below the mean) because of sampling error is roughly P = .01 (1 in 100).
[Figure: normal curve with z = 2.50 marked on each side of the mean of 0. The area between the mean and z = 2.50 is 49.38% (P = .4938) on each side. Each tail beyond z = 2.50 holds .62% (P = .006), so the two tails together hold 1.24% (P = .012).]

Levels of Significance
Is a mean difference of 5, which has a P = .01 chance of resulting from sampling error, statistically significant? That is, does it reflect a population difference?
Levels of Significance: We need to establish a level of significance to determine whether or not our obtained sample difference is statistically significant.
The α (alpha) value is the level of probability at which the null hypothesis can be rejected with confidence and the research hypothesis accepted with confidence.
We decide to reject the null hypothesis if the probability is very small. This is symbolized as P ≤ .05.

Things to Know about Levels of Significance:
A small probability is symbolized by P ≤ .05.
Alpha is generally set at α = .05, the .05 level of significance (corresponding to a 95% confidence level).
This means that we are willing to reject the null hypothesis if an obtained sample difference would occur by chance less than 5 times out of 100.
Thus, a mean difference of 5 between men and women with regard to their approach to child rearing is statistically significant: we attribute it to a difference between the populations rather than to sampling error.

Critical Values
The z scores that mark off the level of significance are called critical values. With α = .05, the z score ±1.96 is the critical value. If we obtain a z score beyond ±1.96 (z > 1.96 or z < -1.96), the result is statistically significant.
Critical or rejection regions are the areas from the critical z score out to the tail of the normal curve. Scores within these areas lead us to reject the null hypothesis.

The Difference between P and α
P is the exact probability of obtaining a result at least as extreme as the sample result if the null hypothesis is true. It is not the probability that the null hypothesis itself is true.
Alpha is the threshold below which a probability is considered so small that we decide to reject the null hypothesis.
We reject the null hypothesis if the P value is less than or equal to the alpha value.
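
The comparison of P with α for the child-rearing example can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names are made up for the example, and the standard error of 2 is the value assumed in the text.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist()


def z_score(mean_1, mean_2, standard_error):
    """Distance of the obtained mean difference from zero,
    in units of the standard error of the difference."""
    return ((mean_1 - mean_2) - 0) / standard_error


def two_tailed_p(z):
    """Area in both tails beyond +z and -z."""
    return 2 * (1 - standard_normal.cdf(abs(z)))


# Values from the child-rearing example: women 45.0, men 40.0,
# standard error of the difference assumed to be 2.
z = z_score(45.0, 40.0, 2.0)
p = two_tailed_p(z)

alpha = 0.05
# Critical value that leaves alpha / 2 in each tail.
critical_z = standard_normal.inv_cdf(1 - alpha / 2)

reject_null = p <= alpha

print(f"z = {z:.2f}, P = {p:.4f}")
print(f"critical z at alpha = {alpha}: {critical_z:.2f}")
print("reject the null hypothesis" if reject_null else "retain the null hypothesis")
```

The two decision rules agree by construction: P falls at or below α exactly when the obtained z lies at or beyond the critical value.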

Critical Values: Z Score
[Figure: normal curve with critical values z = -1.96 and z = +1.96. The area between them is 95% (47.5% on each side of the mean); each tail holds 2.50%.]
If we obtain a z score beyond ±1.96, it is called statistically significant.

Critical Values: Z Score
[Figure: the same curve with labeled regions. Between z = -1.96 and z = +1.96: statistically insignificant, retain the null hypothesis. In either 2.50% tail: statistically significant, reject the null hypothesis.]

The Difference between P and α
Example: a mean difference of 5 has a P of .006 × 2 = .012, roughly .01 (1 in 100), whereas α = .05 places .025 in each tail: .025 × 2 = .05 (5 chances in 100).
[Figure: normal curve comparing the two. At z = 2.50 the area between the mean and z is 49.38% (P = .4938) and the tail is .62% (P = .006). At α = .05 (z = ±1.96) the central area is 95.0% and each tail is 2.50% (P = .025).]
Any mean difference with a probability below 5 chances in 100 supports the research hypothesis and is statistically significant.

Testing Differences between Means, continued

Testing Differences between Means
To test the significance of a mean difference we need the standard deviation of the distribution of mean differences.
However, we rarely know this standard deviation, since we rarely have population data.
Fortunately, it can be estimated from the two samples that we draw.

Steps for testing the difference between two sample means (sample data)
1. Calculate the standard error of the difference between means
2. Calculate the t score
3. Calculate the degrees of freedom
4. Determine the alpha (.05)
5. Consult the t chart

Standard Error of the Difference between Means
Step One: Calculate the standard error of the difference between means.
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}$$
The formula for $s_{\bar{X}_1 - \bar{X}_2}$ combines the information from the two samples.
Step Two: Calculate the t score:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}$$
REMEMBER: We use t instead of z because we do not know the true population standard deviation.
We aren't finished yet!
Step 3: Calculate the degrees of freedom: N1 + N2 - 2.
Step 4: Determine the alpha (.05)
Step 5: Consult the t chart

df    .20     .10     .05     .02     .01     .001
40    1.303   1.684   2.021   2.423   2.704   3.551

Testing the Difference between Means
Example: Let's say that we have the following information about two samples, one of liberals and one of conservatives, on the progressive scale:

Liberals            Conservatives
N1 = 25             N2 = 35
mean X1 = 60        mean X2 = 49
s1 = 12             s2 = 14

We can use this information to calculate the estimate of the standard error of the difference between means.
We start with our formula:
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}$$
$$= \sqrt{\left(\frac{(25)(12)^2 + (35)(14)^2}{25 + 35 - 2}\right)\left(\frac{25 + 35}{(25)(35)}\right)}$$
$$= \sqrt{\left(\frac{3{,}600 + 6{,}860}{58}\right)\left(\frac{60}{875}\right)}$$
$$= \sqrt{(180.3448)(.0686)}$$
$$= \sqrt{12.3717}$$
$$= 3.52$$
The standard error of the difference between means is 3.52.
We can now use this standard error to translate the difference between sample means into a t ratio:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{60 - 49}{3.52} = \frac{11}{3.52} = 3.13$$
We aren't finished yet! Turn to Table C.
1) The degrees of freedom are N1 + N2 - 2.
2) For each standard deviation that we estimate, we lose 1 degree of freedom from the total number of cases.
N = 60
df = (25 + 35 - 2) = 58
In Table C, use the row for df = 40, since 58 is not given.
We see that our t value of 3.13 exceeds all the standard critical points except for the .001 level.

df    .20     .10     .05     .02     .01     .001
40    1.303   1.684   2.021   2.423   2.704   3.551

Therefore, depending on the level we established BEFORE our study, we reject the null hypothesis at the .10, .05, or .01 level.
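
The liberal and conservative example above can be reproduced with a short Python sketch. The function names are illustrative, and the dictionary of critical values simply copies the df = 40 row of Table C as printed in the text. Note that the code carries full precision through the calculation, so the value under the square root comes out slightly different from the hand calculation that rounds 60/875 to .0686, while the rounded results (3.52 and 3.13) agree.

```python
import math


def standard_error_of_difference(n1, s1, n2, s2):
    """Estimated standard error of the difference between two sample
    means, using the formula from Step One."""
    combined = (n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(combined * (n1 + n2) / (n1 * n2))


def t_ratio(mean_1, mean_2, standard_error):
    """Difference between sample means in standard error units."""
    return (mean_1 - mean_2) / standard_error


# Sample data from the example: liberals (1) and conservatives (2).
n1, mean_1, s1 = 25, 60, 12
n2, mean_2, s2 = 35, 49, 14

se = standard_error_of_difference(n1, s1, n2, s2)
t = t_ratio(mean_1, mean_2, se)
df = n1 + n2 - 2

# Row for df = 40 in Table C, used because df = 58 is not listed.
critical_values = {
    0.20: 1.303, 0.10: 1.684, 0.05: 2.021,
    0.02: 2.423, 0.01: 2.704, 0.001: 3.551,
}
# Levels at which the obtained t exceeds the critical value.
significant_at = [
    level for level, critical in critical_values.items() if abs(t) > critical
]

print(f"standard error = {se:.2f}, t = {t:.2f}, df = {df}")
print("null hypothesis rejected at levels:", significant_at)
```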

Analysis of Variance

Analysis of Variance
Sometimes it is necessary to make comparisons over three or more groups.
The analysis of variance yields an F ratio (which we will cover a little bit later), which indicates the size of the variation between groups relative to the size of the variation within each group.
The larger the F ratio, the greater the probability of rejecting the null hypothesis and accepting the research hypothesis.

A Note on Process:
The analysis of variance is a multi-step process.
1. Sum of Squares
2. Mean Square
3. F Ratio
Comparing the F ratio with Table D is the final step in the analysis of variance.

Sum of Squares
Sum of Squares: The sum of squares is found simply by:
1. Squaring the deviations from the mean of the distribution and
2. Adding them up.
Now we will work to understand the components of the analysis of variance.
The Sum of Squares: Found by squaring the deviations from the mean of a distribution and adding these squared deviations together.
$$SS = \sum (X - \bar{X})^2$$
This is the general equation you must know to calculate the different types of sum of squares.

Sum of Squares
Comparing Groups: When groups are compared, there is more than one type of sum of squares:
Total Sum of Squares (SS total)
Between Groups Sum of Squares (SS between)
Within Groups Sum of Squares (SS within)
Each type represents the sum of squared deviations from a mean.
We will use THESE formulas for computation:

The Computational Formulas for Sum of Squares
$$SS_{total} = \sum X_{total}^2 - N_{total}\bar{X}_{total}^2$$
$$SS_{within} = \sum X_{total}^2 - \sum N_{group}\bar{X}_{group}^2$$
$$SS_{between} = \sum N_{group}\bar{X}_{group}^2 - N_{total}\bar{X}_{total}^2$$
Where
$\sum X_{total}^2$ = all the scores squared and then summed
$\bar{X}_{total}$ = total mean of all groups combined
$N_{total}$ = total number of scores in all groups combined
$\bar{X}_{group}$ = mean of any group
$N_{group}$ = number of scores in any group

Analysis of Variance is a multi-step process.
1) Sum of Squares:
a. sum of scores
b. sum of squared scores
c. number of scores
d. mean
e. SS total
f. SS within
g. SS between
2) Mean Square
a. MS between
b. MS within
c. df between
d. df within
3) F Ratio (Table D)

Before applying the formulas we have to find the sum of scores (1), the sum of squared scores (2), the number of scores (3), and the mean (4).
Next we calculate the sums of squares.

Mean Square (MS)
Mean Square (MS): The value of the sum of squares becomes larger as variation increases. The sum of squares also increases with sample size.
Because of this, the SS cannot be viewed as a pure measure of variation. Another measure of variation that we can use is the mean square.

Calculating the Mean Square for within and between groups:
$$MS_{between} = \frac{SS_{between}}{df_{between}}$$
$$MS_{within} = \frac{SS_{within}}{df_{within}}$$
Where
MS between = between-groups mean square
SS between = between-groups sum of squares
df between = between-groups degrees of freedom
MS within = within-groups mean square
SS within = within-groups sum of squares
df within = within-groups degrees of freedom

Use the following equations to obtain the correct degrees of freedom:
$$df_{between} = k - 1$$
$$df_{within} = N_{total} - k$$
Where k = number of groups

Calculating the Mean Square (using Table 8.2 data)
df between = k (number of groups) - 1
df within = N total (number of cases) - k (number of groups)
MS between = SS between / df between
MS within = SS within / df within

The F Ratio
The analysis of variance yields an F ratio. The F ratio compares the variation between groups with the variation within groups.
$$F = \frac{MS_{between}}{MS_{within}}$$
The larger our calculated F ratio, the greater the likelihood that we will have a statistically significant result.
1. Go to Table D in Appendix B.
2. Use the df between (the numerator) across the top of the table.
3. Use the df within (the denominator) along the side of the table.

Example: does family size vary by religious affiliation?
Step 1: Find the mean for each sample.
Step 2: Calculate (1) the sum of scores, (2) the sum of squared scores, (3) the number of subjects, and (4) the mean.
Finding: Reject the null hypothesis. Family size does vary by religion.
To reject the null hypothesis at the .05 significance level with 2 and 12 degrees of freedom, our calculated F ratio must exceed 3.88.
Because our obtained F ratio of 8.24 exceeds this value, we reject the null hypothesis.

Requirements for using the F ratio:
1) It must be a comparison between three or more means.
2) We must be working with interval data.
3) Our sample must have been collected randomly from the research population.
4) We must assume that the sample characteristics are normally distributed.
5) We must assume that the variances of the groups are all equal.
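
The full sequence (sums of squares, mean squares, F ratio) can be sketched in Python using the computational formulas. The scores below are hypothetical, chosen only so that there are 3 groups of 5 and therefore 2 and 12 degrees of freedom as in the example; they are not the family-size data from the text. The function name and the returned dictionary keys are likewise illustrative, and the critical value is the approximate .05 entry for 2 and 12 degrees of freedom in standard F tables.

```python
def one_way_anova(groups):
    """Compute the sums of squares, mean squares, and F ratio for a
    list of groups, using the computational formulas for SS."""
    all_scores = [x for group in groups for x in group]
    n_total = len(all_scores)
    k = len(groups)

    sum_of_squared_scores = sum(x * x for x in all_scores)
    total_mean = sum(all_scores) / n_total
    # Sum over groups of N_group times the squared group mean.
    group_term = sum(len(g) * (sum(g) / len(g)) ** 2 for g in groups)

    ss_total = sum_of_squared_scores - n_total * total_mean ** 2
    ss_within = sum_of_squared_scores - group_term
    ss_between = group_term - n_total * total_mean ** 2

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_total": ss_total,
        "ss_within": ss_within,
        "ss_between": ss_between,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


# Hypothetical scores for three groups (NOT the data from the text).
hypothetical_groups = [
    [2, 3, 4, 3, 3],
    [4, 5, 6, 5, 5],
    [6, 7, 8, 7, 7],
]
result = one_way_anova(hypothetical_groups)

# Approximate critical F at the .05 level for 2 and 12 degrees of freedom.
critical_f = 3.88
reject_null = result["f_ratio"] > critical_f

for name, value in result.items():
    print(f"{name}: {value}")
print("reject the null hypothesis" if reject_null else "retain the null hypothesis")
```

A useful check on any hand calculation is that SS within and SS between add up to SS total, which the computational formulas guarantee.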