Analyzing the Results of an Experiment
• Not straightforward.
– Why not? Variability and random (chance) outcomes.

Inferential Statistics
• Statistical analysis appropriate for inferring causal relationships and effects.
• Many different formulas... which one do you use?

Inferential Statistic Selection
• Determine that you are analyzing the results of an experimental manipulation, not a correlation.
• Identify the IV and DV.
• The IV will always be nominal on some level, even when it may seem to be continuous (e.g., low, medium, and high doses of a drug).

Inferential Statistic Selection
• What is the scale of the DV?

Scale of DV     Statistic to use
Nominal         Chi-squared
Ordinal         Mann-Whitney U-test
Continuous      t-test or ANOVA

t-test or ANOVA?
• How many levels of the IV are there?
– 2 levels: t-test or ANOVA
– More than 2 levels: ANOVA

There are different forms of t-tests and ANOVAs
• Did the study use a within-group or between-group experimental design?
• Between group
– Only 2 levels of the IV: unpaired t-test (or "t for independent samples"), or ANOVA (the basic ANOVA is fitted for between-group designs)
– More than 2 levels of the IV: ANOVA
• Within group
– Only 2 levels of the IV: paired t-test (or "t for dependent samples"), or within-group ANOVA (often referred to as a "repeated measures ANOVA")
– More than 2 levels of the IV: repeated measures ANOVA

In some ways all inferential stats are similar
• They estimate how probable a result this large would be if only random variability, and not the IV, were at work.
• Let's focus on the basic ANOVA, since it is the statistic you are likely to use most commonly.

ANOVA
• ANOVA produces an F value.
• F values are the ratio of overall between-group variability to the mean within-group variability:

F = between-group variability (+ chance) / mean within-group variability (+ chance)

What does this mean? Let's suppose:
• Experiment: the IV is marijuana
– Control
– Placebo control
– Low dose
– High dose
• The dependent variable is performance on a short-term memory task, measured as number correct out of 10 test items.
• 9 subjects in each group

Possible Outcome 1

Group       Scores (number correct out of 10)
Control     4 5 6 5 5 6 4 3 7
Placebo     2 3 4 6 5 5 4 4 3
Low dose    2 3 4 4 5 4 5 6 3
High dose   2 3 5 3 4 4 4 6 5

[Figures: histograms (count vs. score) for the control, placebo, low-dose, and high-dose samples, and for the population distribution of scores]

F value relatively low
[Figure: score distributions for the high, low, placebo, and control groups, with within-group and between-group variability marked]

Now consider this:

Possible Outcome 2

Group       Scores (number correct out of 10)
Control     4 5 6 5 5 6 4 3 7
Placebo     2 3 4 6 5 5 4 4 3
Low dose    2 3 4 4 5 4 5 6 3
High dose   2 3 5 3 4 4 4 6 5

[Figures: histograms (count vs. score) for the control, placebo, low-dose, and high-dose samples]

F value relatively high
[Figure: score distributions for the high, low, placebo, and control groups, with within-group and between-group variability marked]

The high F value reflects
• Logic!
• Distributions of scores are much more obviously separated, and in this case are completely non-overlapping.
• Low F values indicate highly overlapping score distributions.

So how do we decide if an F value is large enough to consider the result as causal?
• We consult a table of established probabilities of different F values, within the context of the degrees-of-freedom terms.

ANOVA Significance table

Where is/are the difference(s)?
[Figure: bar chart, y-axis 0 to 70, for the categories Neutral, Positive, Negative, Sex, Drug, and Taboo]

Inferential Statistics
The story of "Scratch"

Why not just use repeated t-tests?
Probability pyramiding
• 15 t-tests required for this data set
• Post-hocs include compensations for repeated testing of a large data set
[Figure: bar chart, y-axis 0 to 70, for the categories Neutral, Positive, Negative, Sex, Drug, and Taboo]

After all this, where do we stand?
• We can still be wrong.

Factors that affect "power"
• Sample size
• One- vs. two-tailed testing
• Effect size
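
The two calculations discussed in these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used to prepare the slides. It computes the F ratio for the Possible Outcome 1 scores using the standard one-way (between-group) ANOVA formulas, then shows why six conditions require 15 pairwise t-tests and how quickly the chance of at least one false positive grows. The function names `one_way_f` and `familywise_error` are hypothetical, the per-test alpha of .05 is an assumption, and the pyramiding formula treats the tests as independent, which repeated t-tests on shared groups are not, so the figure is a rough guide only.

```python
from itertools import combinations


def one_way_f(groups):
    """F ratio for a between-group ANOVA: MS between / MS within."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]

    # Between-group variability: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variability: how far each score sits from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive across n_tests (assumes independence)."""
    return 1 - (1 - alpha) ** n_tests


# Scores listed on the slide for Possible Outcome 1 (9 subjects per group).
outcome_1 = {
    "control": [4, 5, 6, 5, 5, 6, 4, 3, 7],
    "placebo": [2, 3, 4, 6, 5, 5, 4, 4, 3],
    "low dose": [2, 3, 4, 4, 5, 4, 5, 6, 3],
    "high dose": [2, 3, 5, 3, 4, 4, 4, 6, 5],
}
f_value = one_way_f(list(outcome_1.values()))
print(f"F(3, 32) = {f_value:.2f}")  # prints F(3, 32) = 1.50

# Six conditions in the bar chart give 15 possible pairwise comparisons.
conditions = ["Neutral", "Positive", "Negative", "Sex", "Drug", "Taboo"]
n_pairs = len(list(combinations(conditions, 2)))
print(n_pairs)  # prints 15
print(f"{familywise_error(n_pairs):.2f}")  # prints 0.54
```

With heavily overlapping groups the between-group term stays small relative to the within-group term, which is why the F ratio here is low. The second result shows the pyramiding problem: at .05 per test, 15 uncorrected tests give roughly a one-in-two chance of at least one spurious "significant" difference.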