Testing the Differences between Means
Statistics for Political Science
Levin and Fox
Chapter Seven
What is hypothesis testing?
Hypothesis testing means evaluating sample data collected about a particular
population to see how likely the sample results are, given our hypothesis
about the population.
If the sample results are plausible under the hypothesis about the
population, we retain the hypothesis and attribute any departure from our
expected results to pure chance based on sampling error.
If the sample results are unlikely (fewer than 5 chances in 100), we reject
the hypothesis.
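A rough sketch of this decision rule in Python (the function name and the example P values are illustrative, not from the text):

```python
# Minimal sketch of the decision rule described above: retain the hypothesis
# when the sample result is plausible under it, reject it when the result's
# probability falls below the conventional .05 cutoff.
def decide(p_value, alpha=0.05):
    """Return a decision given the probability of the sample result under
    the hypothesis being tested (p_value) and the cutoff (alpha)."""
    if p_value <= alpha:
        return "reject the hypothesis"
    return "retain the hypothesis"

print(decide(0.20))   # plausible result -> retain the hypothesis
print(decide(0.01))   # unlikely result  -> reject the hypothesis
```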
The Null Hypothesis
Null Hypothesis:
It is the hypothesis that says that two samples have been drawn from equivalent
populations. Any observed difference between samples is a result of chance
occurrence resulting from sampling error alone. The difference in sample
means does not imply a difference in population means.
To conclude that sampling error is responsible for obtaining a difference
between sample means is to retain the null hypothesis:
µ1 = µ2
Where µ1 = mean of the first population
µ2 = mean of the second population
To Retain: Retaining the null hypothesis does not imply that we have proven the
population means are equal, but rather that we lack sufficient evidence to say
otherwise (that is, to say that there is a difference between the populations).
The Research Hypothesis for Means Difference
Research Hypothesis:
If we reject the null hypothesis, then we automatically accept the research
hypothesis that a true population difference does exist. The difference
between sample means is too large to be accounted for by sampling error.
The research hypothesis for mean differences is symbolized by (the population
means are not equal):
µ1 ≠ µ2
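Putting the two hypotheses side by side (a LaTeX rendering; the H_0 and H_1 labels are standard notation rather than symbols used on these slides):

```latex
% Null and research hypotheses for the difference between two population means.
% H_0 / H_1 labels are conventional notation, not taken from the slides.
\[ H_0 \colon \mu_1 = \mu_2 \qquad\qquad H_1 \colon \mu_1 \neq \mu_2 \]
```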
Null and Research Hypothesis
Hypothesis: Example
Men are more permissive than women with regards to disciplining children.
Null Hypothesis:
The null hypothesis holds that there is no difference between Men and Women (as
populations) when it comes to disciplining children. Any observed difference is the
result of sampling error (rather than actual difference).
Null and Research Hypothesis
Hypothesis: Example
Men are more permissive than women with regards to disciplining children.
Research Hypothesis:
The research hypothesis holds that there IS a difference between Men and
Women (as populations) when it comes to disciplining children.
Sampling Distribution of Differences between Means
Sampling Distribution of Differences between Means:
Recall from our long-distance phone-calling example that if a researcher were to
take multiple samples, he or she could build a sampling distribution of means
(rather than of raw scores).
Paired Samples:
What if the researcher, while gathering samples, studies or compares two
samples at a time?
Sampling Distribution of Differences between Means
Example: Child Rearing: Comparing Males and Females
To test the difference, a researcher constructs a scale of permissiveness
from 1 (Strict: not very permissive) to 100 (very permissive). Then they
study two random samples of 30 men and 30 women.
Results:
Women: (sample mean) = 58.0 (more permissive)
Men: (sample mean) = 54.0 (less permissive)
Difference Between Means: (58.0 – 54.0) = + 4.0
Is the difference the result of chance alone/sampling error (Null hypothesis)? Or is there
a difference between men and women (as populations) (Research Hypothesis)?
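Before asking that question, note that the difference itself is simple arithmetic; here is a minimal sketch, where the individual scores are invented placeholders chosen only so the sample means match the slide's 58.0 and 54.0:

```python
from statistics import mean

def mean_difference(scores_first, scores_second):
    """Difference between sample means: first sample minus second sample."""
    return mean(scores_first) - mean(scores_second)

# Hypothetical permissiveness scores (1-100 scale), invented for illustration.
women = [58.0] * 30   # stand-in for 30 women's scores averaging 58.0
men = [54.0] * 30     # stand-in for 30 men's scores averaging 54.0

print(mean_difference(women, men))   # 4.0 -> the slide's +4.0
```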
Sampling Distribution of Differences between Means
Example: Child Rearing: Comparing Males and Females
What if the researcher continued to take samples, collecting 70 additional pairs
of samples, each containing 30 women and 30 men? Each pair would give us, as
the first pair did, a difference between the means.
Sampling Distribution of Differences between Means:
And, just as we did first with raw scores and then with sample means, once we
have a distribution of mean differences we can construct a Sampling
Distribution of Differences between Means.
Sampling Distribution of Differences between Means
Example: Child Rearing: Comparing Males and Females
What if the researcher continued to take samples, collecting 70 additional pairs
of samples, each containing 30 women and 30 men?
Population: µ = ?
Note: You always subtract the second sample mean (men) from the first sample
mean (women).
Pair   Women (mean of 30)   Men (mean of 30)   Difference
1      57                   54                 +3
2      55                   56                 −1
3      59                   57                 +2
…
70
The Purpose and Function of a Sampling Distribution of
Differences between Means
Child Rearing: Males and Females
Here is what it looks like as a
frequency distribution.
Mean Difference     f
+3                  1
+2                  5
+1                  7
 0                 13
−1                  8
−2                  4
−3                  1
               N = 35
Testing Hypotheses with the Distribution of Differences
between Means
Sampling Distribution of Differences between Means:
1) It assumes that all sample pairs differ only by virtue of sampling error
and not as a function of true population differences.
2) The mean of the differences between means equals zero (this is so
because the positive and negative differences tend to cancel each
other out).
3) It approximates the normal curve (most of the mean differences fall near
zero, which is expected, since any difference between means is a product
of sampling error). A simulation sketch after this list illustrates points 2 and 3.
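These properties can be checked with a small simulation sketch (the population mean of 56 and standard deviation of 10 are hypothetical values, not from the textbook example): draw many pairs of samples of 30 from the same population, so that the null hypothesis is true by construction, and examine the resulting mean differences.

```python
import random
from statistics import mean, stdev

random.seed(1)

# Simulate many pairs of samples drawn from the SAME population, so any
# difference between sample means reflects sampling error alone.
def sample_mean(n=30, mu=56, sigma=10):
    return mean(random.gauss(mu, sigma) for _ in range(n))

differences = [sample_mean() - sample_mean() for _ in range(10_000)]

print(round(mean(differences), 2))    # close to 0, as property 2 states
print(round(stdev(differences), 2))   # spread of the distribution of differences

# A crude frequency table, echoing the slide's table of mean differences:
for d in range(-3, 4):
    count = sum(1 for x in differences if d - 0.5 <= x < d + 0.5)
    print(f"{d:+d}: {count}")
```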
Testing Hypotheses with the Distribution of Differences
between Means
Probability and Sampling Distribution of Differences between Means:
Since the sampling distribution of differences between means approximates the
normal curve, we can use the properties of the normal curve to make
statements of probability about mean differences, specifically whether a
given mean difference is likely to be a result of chance/sampling error or
of a true population difference.
Testing Hypotheses with the Distribution of Differences
between Means
Figure: The sampling distribution of differences between means. Differences
closer to zero are more likely to be sampling error (Null); differences farther
from zero are less likely to be sampling error (Research).
Probability and Sampling Distribution of Differences between
Means:
If the obtained difference between means lies so far from a difference of zero
that it has only a small probability of occurrence in the sampling distribution
of differences between means, we reject the null hypothesis.
If our sample mean difference falls so close to zero that its probability of
occurrence is large, we must retain the null hypothesis and treat the
obtained difference as a sampling error.
Testing Hypotheses with the Distribution of Differences
between Means
Example: Child Rearing: Comparing Males and Females
What if the researcher examines just one pair (as opposed to 70 pairs),
containing 30 men and 30 women? (Subtract the second mean from the first.)
Results:
Women: (sample mean) = 45.0
Men: (sample mean) = 40.0
Difference Between Means: (45.0 – 40.0) = + 5.0
How far does + 5.0 fall from the mean of zero?
Child Rearing: Comparing Males and Females
So, to determine how far our obtained difference between means lies from the
mean difference of zero, we must translate our obtained difference into units
of standard deviation.
Step 1: Recall the formula for translating a raw score into units of standard
deviation:
z = (X − µ) / σ
where
X = raw score
µ = mean of the distribution of raw scores
σ = standard deviation of the distribution of raw scores
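As a quick worked example of this formula (the raw score of 70, mean of 60, and standard deviation of 10 are hypothetical values, not from the slides):

```latex
% Hypothetical worked example: raw score 70, distribution mean 60, SD 10.
\[ z = \frac{X - \mu}{\sigma} = \frac{70 - 60}{10} = +1.0 \]
```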
Child Rearing: Comparing Males and Females
Step 2a: Use this formula as a step in translating the mean scores in a distribution
of sample means into units of standard deviation:
z = (X̄ − µ) / σ_X̄
where
X̄ = sample mean
µ = population mean (mean of means)
σ_X̄ = standard error of the mean (standard deviation of the distribution of means)
Child Rearing: Comparing Males and Females
Step 2b: Translate our sample mean difference into units of standard deviation:
z = ((X̄1 − X̄2) − 0) / σ_(X̄1 − X̄2)
where
X̄1 = mean of the first sample
X̄2 = mean of the second sample
0 = the mean of the sampling distribution of differences between means
(we assume that µ1 − µ2 = 0)
σ_(X̄1 − X̄2) = standard error of the difference between means (standard deviation
of the sampling distribution of differences between means)
We can reduce this equation to:
z = (X̄1 − X̄2) / σ_(X̄1 − X̄2)
Child Rearing: Comparing Males and Females
Result (assuming σ_(X̄1 − X̄2) equals 2):
z = (45 − 40) / 2 = +2.5
Thus, a difference of 5 between the means of the two samples (women and
men) falls 2.5 standard deviations from a mean of zero.
What is the probability that a difference of 5 between sample means
could be caused by sampling error?
The probability of getting a difference of 5 or more (above or below the mean
of zero) because of sampling error alone is roughly P = .01 (1 in 100). For a
difference of 5 or more in one direction only, P = .006 (about 0.6 in 100).
Figure: Normal curve with z = 2.50 marked. The area between zero and z = 2.50
is 49.38% (P = .4938); the area beyond z = 2.50 in one tail is 0.62% (P = .006);
the area beyond ±2.50 in both tails is 1.24% (P = .012).
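These tail areas can be reproduced with the standard normal distribution in Python's standard library (Python 3.8+); the standard error of 2 is the value the slides assume:

```python
from statistics import NormalDist

# Slide values: sample means of 45 (women) and 40 (men), and an assumed
# standard error of the difference between means of 2.
mean_women, mean_men, std_error_diff = 45.0, 40.0, 2.0

z = (mean_women - mean_men) / std_error_diff   # standard deviations from zero
one_tail = 1 - NormalDist().cdf(z)             # area beyond z = 2.50 in one tail
two_tail = 2 * one_tail                        # area beyond +/-2.50 in both tails

print(round(z, 2))          # 2.5
print(round(one_tail, 4))   # about 0.0062 -> the slide's P = .006
print(round(two_tail, 4))   # about 0.0124 -> the slide's P = .012 (roughly 1 in 100)
```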
Levels of Significance
Is a mean difference of 5, which has only a P = .01 chance of resulting from
sampling error, statistically significant? That is, does it reflect a population
difference rather than chance?
Levels of Significance: We need to establish a level of significance to determine
whether or not our obtained sample difference is statistically significant.
The α (alpha) value is the level of probability at which the null hypothesis can be
rejected with confidence and the research hypothesis accepted with
confidence.
We decide to reject the null hypothesis if the probability is very small. This is
symbolized as
P ≤ .05
(P is less than or equal to .05)
Things to Know about Levels of Significance:
A small probability is symbolized by
– P ≤ .05
Alpha is generally set at the .05 level of significance (corresponding to 95% confidence)
– α = .05
This means that we are willing to reject the null hypothesis if an obtained
sample difference occurs by chance less than 5 times out of 100.
Thus, a mean difference of 5 between men and women with regard to their
approach to child-rearing is statistically significant; it is not the result of
sampling error but of differences between the populations.
Critical Values
In this case, the z scores are called critical values.
With α = .05, the z score ±1.96 is a critical value.
If we obtain a z score that exceeds 1.96 (z>1.96 or z<-1.96), it is statistically
significant.
Critical or rejection regions are the areas beyond the critical value, toward the
tails of the normal curve; scores that fall within these areas lead us to reject
the null hypothesis.
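A minimal sketch of this decision rule in Python, using the z score of 2.50 obtained earlier:

```python
# Two-tailed decision rule at alpha = .05: the critical value is 1.96.
critical_value = 1.96
z = 2.50   # obtained z score from the child-rearing example

if abs(z) > critical_value:
    print("statistically significant: reject the null hypothesis")
else:
    print("not significant: retain the null hypothesis")
```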
Critical Values: Z Score
Figure: Normal curve showing critical values at z = −1.96 and z = +1.96. The
middle 95% of the area (47.5% on each side of zero) lies between the critical
values; 2.50% lies in each tail beyond them. A z score that exceeds ±1.96 is
statistically significant: reject the null hypothesis.
Critical Values: Z Score
Figure: The same curve with the decision regions labeled. Scores between
z = −1.96 and z = +1.96 are statistically insignificant (accept the null
hypothesis); scores in the 2.50% tails beyond ±1.96 are statistically
significant (reject the null hypothesis).
Significance levels
Significance levels can be set at any degree of probability.
NOTE: Levels of significance do not give us an absolute statement about the
correctness of the null hypothesis. Whether we accept or reject the null
hypothesis, we may still be wrong.
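The two-tailed critical value that matches any chosen significance level can be looked up from the standard normal distribution; a sketch using Python's standard library:

```python
from statistics import NormalDist

def critical_z(alpha):
    """Two-tailed critical value: the z score leaving alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

print(round(critical_z(0.05), 2))   # 1.96, the value used above
print(round(critical_z(0.01), 2))   # 2.58, used with the .01 level below
```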
Type I Errors
Type I Error: Rejecting the null hypothesis when it should have been retained.
For example, if we reject the null hypothesis at the .05 level of significance
and conclude that there are gender differences in child-rearing attitudes,
then there are 5 chances out of 100 that we are wrong; that is, P = .05 that we
committed a Type I error and that gender actually has no effect.
The more stringent our level of significance (the farther out in the tail it lies),
the less likely we are to make a Type I error.
The probability of a Type I error is represented by α or alpha.
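A simulation sketch of this point (the population mean of 56 and standard deviation of 10 are hypothetical): when two samples really do come from the same population, rejections at the .05 level occur in roughly 5 out of 100 repeated studies, which is the Type I error rate α.

```python
import random
from statistics import mean

random.seed(2)

def z_for_one_study(n=30, mu=56, sigma=10):
    """z for the difference between two sample means drawn from the SAME
    population, using the known standard error sigma * sqrt(2 / n)."""
    m1 = mean(random.gauss(mu, sigma) for _ in range(n))
    m2 = mean(random.gauss(mu, sigma) for _ in range(n))
    std_error_diff = sigma * (2 / n) ** 0.5
    return (m1 - m2) / std_error_diff

trials = 10_000
type_i = sum(1 for _ in range(trials) if abs(z_for_one_study()) > 1.96)
print(type_i / trials)   # roughly 0.05: about 5 rejections per 100 true nulls
```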
Type II Errors
Type II Error: Accepting the null hypothesis when it should have been
rejected
The farther out in the tail of the curve that our critical value falls, the
greater the risk of a Type II error.
The research hypothesis may still be correct, despite the decision to reject
it and retain the null hypothesis.
One method for reducing the risk of a Type II error is to increase the size
of the sample so that the true population difference is more likely to be
represented.
The probability of a Type II error is β or beta.
Error Types: Type I
Type I: Rejecting the null hypothesis when we should have retained it.
Example: 95% confidence level, α = .05, critical values z = ±1.96.
The larger the significance level (and thus the larger the percentage in the
tails), the more likely we are to mistakenly reject the null hypothesis.
Figure: Normal curve with 2.50% rejection regions in each tail, beyond
z = −1.96 and z = +1.96.
Error Types: Type I
Type I: Rejecting the null hypothesis when we should have retained it.
Example: 99% confidence level, α = .01, critical values z = ±2.58.
The smaller the significance level (and thus the smaller the percentage in the
tails), the less likely we are to mistakenly reject the null hypothesis.
Figure: Normal curve with 0.5% rejection regions in each tail, beyond
z = −2.58 and z = +2.58.
Error Types: Type II
Type II: Accepting the null hypothesis when we should have rejected it.
Example: 99% confidence level, α = .01, critical values z = ±2.58.
The smaller the significance level (and thus the smaller the percentage in the
tails), the more likely we are to mistakenly accept the null hypothesis.
Figure: Normal curve with 0.5% rejection regions in each tail, beyond
z = −2.58 and z = +2.58.
Some notes on Type I and Type II Errors
The probabilities of Type I and Type II errors are inversely related.
The larger the level of significance, the larger the chance of a Type I error.
We predetermine our level of significance for a hypothesis test depending on
which type of error would be more damaging or costly.
If it would be far worse to reject a true null hypothesis (that is, to suggest
statistical significance, or different populations, where there is none: a Type
I error) than to retain a false null hypothesis (to suggest there is no
population difference where there is one: a Type II error), we should use a
smaller level of significance, such as α = .01.
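A simulation sketch of this trade-off (the population difference of 5, baseline mean of 56, and standard deviation of 10 are hypothetical): with a real difference built in, tightening α from .05 to .01 increases the share of studies that mistakenly retain the null, which is the Type II error rate.

```python
import random
from statistics import mean, NormalDist

random.seed(3)

def z_one_study(true_diff, n=30, sigma=10):
    """z for a pair of samples whose populations really differ by true_diff."""
    m1 = mean(random.gauss(56 + true_diff, sigma) for _ in range(n))
    m2 = mean(random.gauss(56, sigma) for _ in range(n))
    return (m1 - m2) / (sigma * (2 / n) ** 0.5)

trials = 5_000
for alpha in (0.05, 0.01):
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    misses = sum(1 for _ in range(trials) if abs(z_one_study(true_diff=5)) <= crit)
    print(alpha, round(misses / trials, 3))   # Type II rate grows as alpha shrinks
```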
The Difference between P and α
P is the exact probability of obtaining our sample result (or one even more
extreme) if the null hypothesis is true.
Alpha is the threshold below which a P value is considered so small that we
decide to reject the null hypothesis.
We reject the null hypothesis if the P value is less than (or equal to) the alpha value.
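A minimal sketch comparing the two quantities for the child-rearing example:

```python
from statistics import NormalDist

# P for the obtained z of 2.50 (two tails), compared with alpha = .05.
p_value = 2 * (1 - NormalDist().cdf(2.50))
alpha = 0.05

print(round(p_value, 4))   # about 0.0124
print(p_value <= alpha)    # True -> reject the null hypothesis
```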
The Difference between P and α
Example: a mean difference of 5 has a one-tailed P of .006, or .006 × 2 = roughly
.01 in both tails (about 1 chance in 100), whereas α = .05 cuts off the rejection
region at .025 in each tail (.025 × 2 = .05, or 5 chances in 100).
Figure: Normal curve comparing the two quantities. For the obtained z = 2.50,
the area between zero and z is 49.38% (P = .4938) and the area beyond z in one
tail is 0.62% (P = .006). For α = .05, 95.0% of the area lies between the
critical values of z = ±1.96, with 2.50% (P = .025) in each tail.
The Difference between P and α
Example: a mean difference of 5 has a one-tailed P of .006, or .006 × 2 = roughly
.01 in both tails (about 1 chance in 100), whereas α = .05 cuts off the rejection
region at .025 in each tail (.025 × 2 = .05, or 5 chances in 100).
Any mean difference whose probability under the null hypothesis is below 5
chances in 100 is statistically significant and supports the research hypothesis.
Figure: Normal curve showing the obtained difference at z = 2.50 (0.62% beyond
it in one tail, P = .006) falling farther out than the α = .05 critical value of
z = 1.96 (2.50% beyond it in each tail, P = .025).