Two Groups Too Many? Try Analysis of Variance (ANOVA)
• t-test: compares two groups. Tests the null hypothesis that two populations have the same average.
• ANOVA: compares more than two groups. Tests the null hypothesis that all of the populations being compared have the same average.
• The test statistic for ANOVA is the F-test (named for R. A. Fisher, the creator of the statistic).

Three types of ANOVA
• One-way ANOVA
• Within-subjects ANOVA (repeated measures, randomized complete block). Will not be covered in this class.
• Factorial ANOVA (two-way ANOVA)

One-way ANOVA example
• Example: Curricula A, B, C.
• You want to know whether the population average score on the test of computer operations differs among the children who were taught using Curriculum A, B, or C.
• Null hypothesis: the population averages are identical regardless of the curriculum used.
• Alternative hypothesis: the population averages differ for at least one pair of the populations.

Verbal Explanation of the Logic of Comparing Means
• If two or more populations have identical averages, the averages of random samples selected from those populations ought to be fairly similar as well.
• Sample statistics vary from one sample to the next; however, large differences among the sample averages would cause us to question the hypothesis that the samples were selected from populations with identical averages.
• How much should the sample averages differ before we conclude that the null hypothesis of equal population averages should be rejected?

Logic of ANOVA
• Okay, we've dealt with this logic in the t-test, too. How is it different in ANOVA?
• In the t-test, we calculated a t-statistic.
• In ANOVA, we calculate an F-statistic.
• The F-statistic is obtained by comparing "the variation among the sample averages" to "the variation among observations within each of the samples".
F = (variation among the sample averages) / (variation among observations within each of the samples)

• Only if the variation among the sample averages is substantially larger than the variation within the samples (in other words, only if the F-statistic is substantially large) do we conclude that the populations must have had different averages.

Sources of Variation
• Three sources of variation: 1) total, 2) between groups ("the variation among the sample averages"), 3) within groups ("the variation among observations within each of the samples").
• Sum of squares (SS): reflects variation. Depends on sample size.
• Degrees of freedom (df): the number of independent pieces of information behind each SS; for the between-groups source, one less than the number of population averages being compared.
• Mean square (MS): SS adjusted by df (MS = SS / df). Mean squares can be compared with each other.

Computing the F-statistic
• SS total: total variation in the data. df total = total sample size (N) - 1. MS total = SS total / df total.
• SS between: variation among the groups compared. df between = number of groups - 1. MS between = SS between / df between.
• SS within: variation among the scores that are in the same group. df within = total sample size - number of groups. MS within = SS within / df within.
• F-statistic = MS between / MS within.

Interpreting SPSS output

Univariate Analysis of Variance

Between-Subjects Factors
  Employment Category | Value Label | N
  1                   | Clerical    | 363
  2                   | Custodial   | 27
  3                   | Manager     | 84

Descriptive Statistics (Dependent Variable: Current Salary)
  Employment Category | Mean     | Std. Deviation | N
  Clerical            | 27838.54 | 7567.995       | 363
  Custodial           | 30938.89 | 2114.616       | 27
  Manager             | 63977.80 | 18244.776      | 84
  Total               | 34419.57 | 17075.661      | 474

Interpreting SPSS output

Tests of Between-Subjects Effects (Dependent Variable: Current Salary)
  Source          | Type III Sum of Squares | df  | Mean Square | F        | Sig. | Partial Eta Squared
  Corrected Model | 8.944E+10 (a)           | 2   | 4.472E+10   | 434.481  | .000 | .648
  Intercept       | 2.915E+11               | 1   | 2.915E+11   | 2832.005 | .000 | .857
  JOBCAT          | 8.944E+10               | 2   | 4.472E+10   | 434.481  | .000 | .648
  Error           | 4.848E+10               | 471 | 102925714.5 |          |      |
  Total           | 6.995E+11               | 474 |             |          |      |
  Corrected Total | 1.379E+11               | 473 |             |          |      |
  a. R Squared = .648 (Adjusted R Squared = .647)
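The SS, df, MS, and F bookkeeping above can be sketched in Python with SciPy. The three small groups below are made-up curriculum scores (not the SPSS salary data), and the hand computation is cross-checked against scipy.stats.f_oneway:

```python
# One-way ANOVA "by hand", following the SS -> df -> MS -> F steps above.
# The groups are hypothetical scores for Curricula A, B, and C.
from scipy import stats

a = [82, 90, 85, 88, 95]
b = [70, 78, 74, 72, 76]
c = [88, 94, 91, 89, 93]
groups = [a, b, c]

all_scores = [x for g in groups for x in g]
N, k = len(all_scores), len(groups)
grand_mean = sum(all_scores) / N

# SS between: each group mean's squared deviation from the grand mean,
# weighted by the number of scores in that group
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# SS within: squared deviations of each score from its own group mean
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_between = k - 1   # number of groups - 1
df_within = N - k    # total sample size - number of groups
ms_between = ss_between / df_between
ms_within = ss_within / df_within
F = ms_between / ms_within
p = stats.f.sf(F, df_between, df_within)  # upper-tail probability of the F distribution

# Cross-check against SciPy's built-in one-way ANOVA
F_scipy, p_scipy = stats.f_oneway(a, b, c)
print(F, p)
```

Note that df_within here plays the role of the Error df in the SPSS table (471 in the salary example, since 474 - 3 = 471).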
Interpreting Significance
• p < .05: the probability of observing an F-statistic at least this large by chance is less than .05.
• Therefore, we can infer that the difference we observe in the sample would also be observed in the population.
• Therefore, reject the null hypothesis that there are no differences among the population means.
• Accept the research hypothesis that there is a difference between at least one pair of the populations.

Writing up the result
• "A one-way ANOVA was conducted in order to evaluate the relationship between salary and job category. The result of the one-way ANOVA was significant, F(2, 471) = 434.48, p < .001, partial η² = .65, which indicates that the mean salary of at least one pair of job categories is significantly different."
• Report the descriptive statistics after this.
• If not doing a follow-up test, describe and summarize the general conclusions of the analysis.

Follow-up test
• But we don't know which pairs are significantly different from each other!
• Conduct a follow-up test to see specifically which means are different from which other means.
• Instead of repeating the t-test for each combination (which can lead to alpha inflation), there are modified versions of the t-test that adjust for alpha inflation.
• Most recommended: Tukey HSD test (when equal variance is assumed); Dunnett's C test (when equal variance is not assumed).
• Other popular tests: Bonferroni test, Scheffé test.

What's Alpha Inflation?
• Conducting multiple tests incurs a large risk that at least one of them will be statistically significant just by chance (a Type I error).
• Example: 2 tests at alpha = .05. The probability of not making a Type I error on one test is .95, so for both tests it is .95 × .95 = .9025. The probability of at least one Type I error is therefore 1 - .9025 = .0975, close to 10%.
• Therefore, when you repeat a number of the same tests, use a more stringent criterion, e.g.
.001 instead of .05.

Interpreting SPSS output

Levene's Test of Equality of Error Variances (a) (Dependent Variable: Current Salary)
  F      | df1 | df2 | Sig.
  59.733 | 2   | 471 | .000
Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept + JOBCAT

Interpreting SPSS output

Post Hoc Tests: Employment Category

Multiple Comparisons (Dependent Variable: Current Salary)
  Test      | (I) Employment Category | (J) Employment Category | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound
  Tukey HSD | Clerical  | Custodial | -3100.35   | 2023.760 | .277 | -7858.50  | 1657.80
  Tukey HSD | Clerical  | Manager   | -36139.26* | 1228.352 | .000 | -39027.29 | -33251.22
  Tukey HSD | Custodial | Clerical  | 3100.35    | 2023.760 | .277 | -1657.80  | 7858.50
  Tukey HSD | Custodial | Manager   | -33038.91* | 2244.409 | .000 | -38315.84 | -27761.98
  Tukey HSD | Manager   | Clerical  | 36139.26*  | 1228.352 | .000 | 33251.22  | 39027.29
  Tukey HSD | Manager   | Custodial | 33038.91*  | 2244.409 | .000 | 27761.98  | 38315.84
  Dunnett C | Clerical  | Custodial | -3100.35*  | 568.679  |      | -4476.97  | -1723.73
  Dunnett C | Clerical  | Manager   | -36139.26* | 2029.912 |      | -40981.02 | -31297.50
  Dunnett C | Custodial | Clerical  | 3100.35*   | 568.679  |      | 1723.73   | 4476.97
  Dunnett C | Custodial | Manager   | -33038.91* | 2031.840 |      | -37895.87 | -28181.95
  Dunnett C | Manager   | Clerical  | 36139.26*  | 2029.912 |      | 31297.50  | 40981.02
  Dunnett C | Manager   | Custodial | 33038.91*  | 2031.840 |      | 28181.95  | 37895.87
Based on observed means. *. The mean difference is significant at the .05 level.

Writing up the result of the follow-up test
• "A follow-up test was conducted in order to determine which job categories were different from the others. Because Levene's test indicated that equal variance cannot be assumed across the groups, Dunnett's C test was used for the follow-up test in order to control for Type I error across the pairwise comparisons. The result of the follow-up test indicated that the salaries of all three job categories are significantly different from each other."

Relation between the t-test and the F-test
• When two groups are compared, the t-test and the F-test will lead to the same answer.
• t² = F.
• So by squaring t you get F (in other words, the square root of F is t).

Formula for Sum of Squares in ANOVA
  Formula Name               | How To
  Sum of Squares Total (SST) | Subtract the mean of the entire sample from each of the scores. Square each of those deviations. Add those up within each group, then add the groups together.
  Sum of Squares Among (SSA) | Subtract the overall sample mean from each group mean, square it, multiply by how many scores are in that group, then sum those up. For two groups, we just sum together two numbers.
  Sum of Squares Within (SSW)| Here's a shortcut: just find the SST and the SSA and take the difference. What's left over is the SSW.
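Both facts on this page, the t² = F relation for two groups and the SST = SSA + SSW shortcut, can be checked numerically. The two small groups below are made-up numbers used only to illustrate the identities:

```python
# Check two identities from above on made-up data:
#   (1) with two groups, the equal-variance t-statistic squared equals the one-way ANOVA F
#   (2) SS Total = SS Among + SS Within, so SSW = SST - SSA is a valid shortcut
from scipy import stats

g1 = [12.0, 15.0, 11.0, 14.0, 13.0]
g2 = [18.0, 17.0, 20.0, 16.0, 19.0]
groups = [g1, g2]

# (1) t^2 = F: pooled-variance t-test vs. one-way ANOVA on the same two groups
t, _ = stats.ttest_ind(g1, g2, equal_var=True)
F, _ = stats.f_oneway(g1, g2)
print(t ** 2, F)  # the two values agree

# (2) SST = SSA + SSW, computed from the definitions in the table above
all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)
sst = sum((x - grand_mean) ** 2 for x in all_scores)
ssa = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
print(sst, ssa + ssw)  # equal, confirming the SSW shortcut
```

The t² = F identity holds for the pooled (equal-variance) t-test, which is the version that shares its within-groups variance estimate with ANOVA.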