ANOVA: Analysis of Variance
• Why do these sample means differ as much as they do (variance)?
• The Standard Error of the Mean (the "variance" of means) depends upon the population variance (σ²/n)
• Why do subjects differ as much as they do from one another?
  Many random causes ("Error Variance"), or
  many random causes plus a specific cause ("Treatment"), making the sample means more different than the SEM

Why Not the t-Test
If 15 samples are ALL drawn from the same population:
• 105 possible pairwise comparisons
• Expect about 5 alpha errors (105 x 0.05) if using the p<0.05 criterion
• If you make your criterion 105 x more conservative (p<0.0005), you will lose power

The F-Test
ANOVA tests the null hypothesis that ALL samples came from the same population
• Maintains experiment-wide alpha at p<0.05 without losing power
• A significant F-test indicates that at least one sample came from a different population (at least one X-Bar is estimating a different Mu)

The Structure of the F-Ratio
Evaluation:
        The differences (among the sample means) you got          (Estimation, of SEM)
    F = -----------------------------------------------------
        The differences you could expect to find (if H0 true)    (Expectation)
(If this doesn't sound familiar, Bite Me!)
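The arithmetic on the "Why Not the t-Test" slide can be checked with a short sketch. The function names below are illustrative, not from the slides, and dividing alpha by the number of comparisons is the adjustment usually called a Bonferroni correction (a name the slides do not use).

```python
from math import comb


def pairwise_comparisons(k: int) -> int:
    """Number of distinct pairs among k samples: k choose 2."""
    return comb(k, 2)


def expected_alpha_errors(k: int, alpha: float = 0.05) -> float:
    """Expected number of false rejections if every pair is tested
    at `alpha` and all samples come from the same population."""
    return pairwise_comparisons(k) * alpha


def corrected_alpha(k: int, alpha: float = 0.05) -> float:
    """Per-comparison criterion after dividing alpha by the number
    of comparisons (assumption: a Bonferroni-style adjustment)."""
    return alpha / pairwise_comparisons(k)


n_pairs = pairwise_comparisons(15)           # 105 comparisons
expected_errors = expected_alpha_errors(15)  # about 5 alpha errors
per_test_alpha = corrected_alpha(15)         # roughly 0.0005

print(n_pairs, expected_errors, per_test_alpha)
```

The stricter per-comparison criterion is what costs power: each individual t-test now needs a much larger difference to reach significance.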
The Structure of the F-Ratio
If H0 true:
        Average error of estimation of Mu by the X-Bars
    F = -----------------------------------------------
        Variability of subjects within each sample
The size of the denominator determines the size of the numerator.
If there is a treatment effect (H0 false): the numerator will be larger than predicted by the denominator.

The Structure of the F-Ratio
        Between Group Variance
    F = ----------------------
        Within Group Variance
If H0 true:
        Error Variance
    F = --------------
        Error Variance
Numerator and denominator are approximately equal, with random variation.
If there is a treatment effect (H0 false):
        Error plus Treatment Variance
    F = -----------------------------
        Error Variance
The numerator is larger.

Probability of F as F Exceeds 1
Same ratio as above: F = Between Group Variance / Within Group Variance. It is approximately 1 if H0 is true and larger than 1 if there is a treatment effect.

For U Visual Learners
Sampling distributions:
• H0 true: reflects SEM (error)
• H0 false: error plus treatment

Keep the Data, Burn the Formulas

Do These Measures Depend on What Drug You Took?
Drug A and Drug B don't look different, but Drug C looks different from Drugs A and B.

Partitioning the Variance
Each subject's deviation score can be decomposed into 2 parts:
• How much his group mean differs from the grand mean
• How he differs from his group mean
If grand mean = 100:
Score 1 in Group A = 117; Group A mean = 115
(117 - 100) = (115 - 100) + (117 - 115)
17 = 15 + 2
Score 2 in Group A = 113; Group A mean = 115
(113 - 100) = (115 - 100) + (113 - 115)
13 = 15 - 2

Partitioning the Variance in the Data Set
SS-Total: total variance (total sum of squared deviations from the grand mean)
    Sum (Xi - Grand Mean)^2
SS-Within: variance among subjects within each group (sample)
    Sum (Xi - Group Mean)^2 for all subjects in all groups
SS-Between: variance among samples
    Sum (X-Bar - Grand Mean)^2 for all sample means (each multiplied by n; see Step 2)
SS-Total = SS-Between + SS-Within

Step 1: Calculate SS-Total
Group    Xi   Xi - GM (dev-score)   sq-dev
Drug A   9     3.583333333          12.84028
Drug A   8     2.583333333           6.673611
Drug A   7     1.583333333           2.506944
Drug A   5    -0.416666667           0.173611
Drug B   9     3.583333333          12.84028
Drug B   7     1.583333333           2.506944
Drug B   6     0.583333333           0.340278
Drug B   5    -0.416666667           0.173611
Drug C   4    -1.416666667           2.006944
Drug C   3    -2.416666667           5.840278
Drug C   1    -4.416666667          19.50694
Drug C   1    -4.416666667          19.50694
Grand mean = 5.416667
SStot = 84.91667

Step 2: Calculate SS-Between
Group    X-Bar   GM         dev        sq-dev     n   sq-dev * n
Drug A   7.25    5.416667    1.833333   3.36111   4   13.44444
Drug B   6.75    5.416667    1.333333   1.777777  4    7.111108
Drug C   2.25    5.416667   -3.16667   10.02778   4   40.11112
SS-Bet = 60.66667
Multiply by n (sample size) because each subject's raw score is composed of:
• A deviation of his sample mean from the grand mean
• (and a deviation of his raw score from his sample mean)

Step 3: Calculate SS-Within
SS-Total - SSb = SSw
84.91667 - 60.66667 = 24.25
This should agree with the direct calculation.

Direct Calculation of SSw
Group    Xi   X-Bar   Xi - X-Bar (dev-score)   sq-dev
Drug A   9    7.25     1.75                    3.0625
Drug A   8    7.25     0.75                    0.5625
Drug A   7    7.25    -0.25                    0.0625
Drug A   5    7.25    -2.25                    5.0625
Drug B   9    6.75     2.25                    5.0625
Drug B   7    6.75     0.25                    0.0625
Drug B   6    6.75    -0.75                    0.5625
Drug B   5    6.75    -1.75                    3.0625
Drug C   4    2.25     1.75                    3.0625
Drug C   3    2.25     0.75                    0.5625
Drug C   1    2.25    -1.25                    1.5625
Drug C   1    2.25    -1.25                    1.5625
SS-Within = 24.25

Step 4: Use SS to Compute Mean Squares & F-ratio
df-Tot = N - 1 = 11
df-B = k - 1 = 2
df-W = dfTot - dfB = 9
MSb = SSb/df = 60.66667 / 2 = 30.33334
MSw = SSw/df = 24.25 / 9 = 2.694444
F = MSb/MSw = 30.33334 / 2.694444 = 11.25773
The differences among the sample means are over 11 x greater than they would be if:
• All three samples came from the same population
• None of the drugs had a different effect
Look up the probability of F with 2 & 9 dfs
• Critical F(2,9) for p<0.01 = 8.02
• Reject H0
• Not ALL of the drugs have the same effect

The F-Table

The ANOVA Summary Table

What Do You Do Now?
A significant F-ratio means at least one sample came from a different population.
Which samples are different from which other samples?
Use Tukey's Honestly Significant Difference (HSD) Test.

Tukey's HSD Test
• Can only be used if the overall ANOVA is significant
• A "post hoc" test
• Used to make "pair-wise" comparisons
• Structure: analogous to the t-test, but uses the estimated Standard Error of the Mean in the denominator
• Hence a different critical value (HSD) table

Tukey's HSD Test
Equal N; Unequal N

Assumptions of ANOVA
1. All populations normally distributed
2. Homogeneity of variance
3. Random assignment
ANOVA is robust to all but gross violations of these theoretical assumptions.

Effect Size
S = 0.10, M = 0.25, L = 0.40
MStreatment is really MSb, which is T + E

What's the Question?
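Steps 1 through 4 of the drug example can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the dictionary layout are illustrative choices, not part of the slides. The scores are the Drug A, B, and C data from Step 1.

```python
from statistics import mean


def one_way_anova(groups):
    """One-way ANOVA by partitioning sums of squares.

    groups: dict mapping a group label to its list of scores.
    Returns a dict with SS, df, MS, and the F-ratio.
    """
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)

    # Step 1: squared deviations of every score from the grand mean
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

    # Step 2: squared deviation of each group mean from the grand mean,
    # multiplied by that group's n
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )

    # Step 3 (direct calculation): squared deviations of each score
    # from its own group mean
    ss_within = sum(
        (x - mean(scores)) ** 2
        for scores in groups.values()
        for x in scores
    )

    # Step 4: degrees of freedom, mean squares, F-ratio
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


drug_scores = {
    "Drug A": [9, 8, 7, 5],
    "Drug B": [9, 7, 6, 5],
    "Drug C": [4, 3, 1, 1],
}

result = one_way_anova(drug_scores)
for name, value in result.items():
    print(f"{name}: {value:.5f}")
```

The sketch computes SS-Within directly rather than by subtraction, so comparing `ss_between + ss_within` with `ss_total` gives the same consistency check as Step 3. The probability of the resulting F still has to be looked up in an F-table with 2 and 9 degrees of freedom.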