The Statistical Imagination
Chapter 11: Bivariate Relationships: t-test for Comparing the Means of Two Groups
© 2008 McGraw-Hill Higher Education

Bivariate Analysis
• Bivariate – or “two variable” – analysis involves searching for statistical relationships between two variables
• A statistical relationship between two variables asserts that the measurements of one variable tend to change consistently with the measurements of the other, making one variable a good predictor of the other

Independent and Dependent Variables
• The predictor variable is the independent variable
• The predicted variable is the dependent variable

Three Approaches to Measuring Statistical Relationships
1. Difference of means testing (Ch. 11 & 12)
2. Counting the frequencies of joint occurrences of attributes of two nominal/ordinal variables (Ch. 13)
3. Measuring the correlation between two interval/ratio variables (Ch. 14 & 15)

Difference of Means Testing
• Compares means of an interval/ratio variable among the categories or groups of a nominal/ordinal variable
• Chapter 11. The two-group difference of means test – for a dependent interval/ratio and an independent dichotomous nominal/ordinal variable
• Chapter 12. Analysis of variance – to test for a difference among three or more group means

Frequencies of Joint Occurrences of Two Nominal Variables
• Chapter 13. Chi-square test – to determine a relationship between two nominal variables
• Web site Chapter Extensions to Chapter 13: Gamma test – to determine a relationship between two ordinal variables

Measuring Correlation
• Chapter 14-15.
Correlation – to determine a relationship between two interval/ratio variables
• Web site Extensions to Chapter 15: Rank-order correlation test – to determine a relationship between two numbered ordinal-level variables

2-Group Difference of Means Test: Independent Samples (t-test)
• Useful for testing a hypothesis that the means of a variable differ between two populations composed of different groups of individuals

When to Use an Independent Samples t-test
• Two variables from one population and sample, one interval/ratio and one dichotomous nominal/ordinal
• Or: There are two populations and samples and one interval/ratio variable; the samples are representative of their populations
• The interval/ratio variable is typically the dependent variable
• The groups do not consist of the same subjects
• Population variances are assumed equal

Features of an Independent Samples t-test
• The t-test focuses on the computed difference between two sample means and addresses the question of whether the observed difference between the sample means reflects a real difference in the population means or is simply due to sampling error

Features of an Independent Samples t-test (cont.)
• Step 1. Stating the H0: The mean of population 1 equals the mean of population 2
• That is, there is no difference in the means of the interval/ratio variable, X, for the two populations

Features of an Independent Samples t-test (cont.)
• Step 2.
The sampling distribution is the approximately normal t-distribution
• The pooled variance formula for the standard error is used when we can assume that population variances are equal
• The separate variance formula for the standard error is used when we cannot assume that population variances are equal

Features of an Independent Samples t-test (cont.)
• Step 4. The effect is the difference between the sample means
• The test statistic is the effect divided by the standard error
• The p-value is estimated using the t-distribution table

Assumption of Equality of Population Variances
• When the larger sample variance is no more than twice the size of the smaller, this suggests that the two population variances are equal and we assume equality of variances
• In that case we may use the pooled variance estimate of the standard error
• Equality of variances is also termed homogeneity of variances or homoscedasticity

Assumption of Equality of Population Variances (cont.)
• Heterogeneity of variances, or heteroscedasticity, is when variances of the two populations appear unequal
• Here we use the separate variance estimate of the standard error and calculate degrees of freedom differently

Test for Nonindependent or Matched-Pair Samples
• This is a test of the difference of means between two sets of scores of the same research subjects, such as two questionnaire items or scores measured at two points in time
• This test is especially useful for before-after or test-retest experimental designs

When to Use a Nonindependent Samples t-test
• There is one population with a representative sample from it
• There are two interval/ratio variables with the same score design
• Or: There is a single variable measured twice for the same sample subjects
• There is a target value of the variable (usually zero) to which we may compare the mean of the differences between the two sets of scores

Features of a Nonindependent Samples or Matched-Pair t-test
• Step 1. Stating the H0: The mean of differences between the scores in a population is equal to zero

Nonindependent Samples or Matched-Pair t-test (cont.)
• Step 2. The sampling distribution is the approximately normal t-distribution
• The standard error is calculated as the standard deviation of differences between scores divided by the square root of n - 1

Nonindependent Samples or Matched-Pair t-test (cont.)
• Step 4.
The effect is the mean of differences between scores
• The test statistic is the effect divided by the standard error
• The p-value is estimated using the t-distribution table

Distinguishing Between Practical and Statistical Significance
• A hypothesis test determines significance in terms of likely sampling error – whether a sample difference is so large that there probably is a difference in the populations
• Practical significance is an issue of substance. A statistically significant difference may not be practically significant

Practical and Statistical Significance (cont.)
• E.g., a hypothesis test reveals a statistically significant difference in the mean number of personal holidays of men and women in a corporation: women average 0.1 days per year more. The test tells us with 95% confidence that the 0.1 day difference in the samples truly exists in the populations
• However, is one-tenth day per year meaningful? Might such a small statistical effect be accounted for by some other variable?

Four Aspects of Statistical Relationships
• When examining a relationship between two variables, we can address four things: existence, direction, strength, and practical applications
• These four aspects provide a checklist for what to say in writing up the results of a hypothesis test

Existence of a Relationship
• Existence: On the basis of statistical analysis of a sample, can we conclude that a relationship exists between two variables among all subjects in the population?
• Established by rejection of the H0
• Testing for the existence of a relationship is the first step in any analysis.
If a relationship is found not to exist, the other three aspects of a relationship are irrelevant

Direction of a Relationship
• Direction: Can the dependent variable be expected to increase or decrease as the independent variable increases?
• Direction is stated in the alternative hypothesis (HA) of step 1 of the six steps of statistical inference

Strength of a Relationship
• Strength: To what extent are errors reduced in predicting the scores of a dependent variable when an independent variable is used as a predictor?

Practical Applications of a Relationship
• Practical Applications: In practical, everyday terms, how does knowledge of a relationship between two variables help us understand and predict outcomes of the dependent variable?

Existence of a Relationship for 2-Group Difference of Means Test
• Existence: Established by using independent samples or nonindependent samples t-test
• When the H0 is rejected, a relationship exists

Direction of a Relationship for 2-Group Difference of Means Test
• For the two group tests, direction and strength are not relevant
• Direction: Not relevant
• Strength: Not relevant

Practical Applications of Relationship for a 2-Group Difference of Means Test
• Practical Applications: Describe the effect of the test in everyday terms, where the effect of the independent variable on the dependent variable is the difference between sample means

Statistical Follies
• Avoid a common tendency: Difference in means testing is so widely used that researchers often focus too heavily on mean differences while ignoring the differences in variances (or standard deviations)
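
Worked sketch: independent samples t-test
The independent samples steps described earlier (effect, standard error, test statistic) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the textbook's own procedure: the slides do not print the formulas, so the pooled and separate variance standard errors below use their common textbook forms, and the degrees of freedom for the separate variance case use the Welch-Satterthwaite approximation, which is an assumption (the slides only say the degrees of freedom are calculated differently). The function name `independent_t` and the sample scores are hypothetical.

```python
import math
from statistics import mean, variance


def independent_t(sample1, sample2):
    """Two-group difference of means test statistic for independent samples."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = variance(sample1), variance(sample2)  # sample variances (n - 1 divisor)

    # Rule of thumb from the slides: assume equal population variances
    # when the larger sample variance is no more than twice the smaller.
    equal_var = max(v1, v2) <= 2 * min(v1, v2)

    if equal_var:
        # Pooled variance estimate of the standard error (common form, assumed).
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Separate variance estimate of the standard error.
        # df: Welch-Satterthwaite approximation (an assumption, see lead-in).
        a, b = v1 / n1, v2 / n2
        se = math.sqrt(a + b)
        df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

    effect = mean(sample1) - mean(sample2)  # difference between sample means
    t = effect / se                         # test statistic = effect / standard error
    return {"effect": effect, "se": se, "t": t, "df": df, "equal_var": equal_var}


# Hypothetical scores for two groups of different subjects.
group_a = [4, 6, 5, 7, 8]
group_b = [3, 5, 4, 6, 2]
result = independent_t(group_a, group_b)
print(round(result["t"], 3), result["df"])  # 2.0 8
```

The p-value is deliberately left out: as in the slides, it would be read from a t-distribution table using the returned test statistic and degrees of freedom.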
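
Worked sketch: nonindependent (matched-pair) t-test
The matched-pair steps described earlier can be sketched the same way. One assumption is labeled here and in the code: the slides divide the standard deviation of the differences by the square root of n - 1, which matches the more familiar s / sqrt(n) only if that standard deviation is computed with an n divisor, so the sketch uses `statistics.pstdev`. The function name `matched_pair_t`, the `target` parameter, and the before/after scores are hypothetical.

```python
import math
from statistics import mean, pstdev


def matched_pair_t(before, after, target=0.0):
    """Test statistic for the mean of differences between paired scores."""
    if len(before) != len(after):
        raise ValueError("each subject needs a score in both sets")

    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)

    effect = mean(diffs)  # mean of differences between scores

    # Standard deviation of differences with an n divisor (assumption),
    # divided by sqrt(n - 1) as stated in the slides.
    se = pstdev(diffs) / math.sqrt(n - 1)

    # Compare the mean difference with the target value (usually zero).
    t = (effect - target) / se
    return {"effect": effect, "se": se, "t": t, "df": n - 1}


# Hypothetical before-after scores for the same five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 13, 12, 12, 16]
result = matched_pair_t(before, after)
print(round(result["t"], 3), result["df"])  # 4.472 4
```

As with the independent samples case, the p-value would be estimated from a t-distribution table using the test statistic and n - 1 degrees of freedom.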