Confidence Intervals & Effect Size

Outline of Today’s Discussion
1. Confidence Intervals
2. Effect Size
3. Thoughts on Independent Group Designs

The Research Cycle
[Diagram; labels: Real World, Abstraction, Generalization, Research Conclusions, Research Representation, Methodology ***, Data Analysis, Research Results]
1. Observational
2. Survey
3. Experimental

Part 1
Confidence Intervals
A.K.A. How Big is Your Error Bar?

Confidence Intervals
“A picture is worth a thousand…p-values!” (say it with me)
[Figure: two bar graphs, each titled “The Effectiveness of Drug X”; x-axis: Treatment (Drug X, Placebo); y-axis: Mean Effectiveness, 0 to 12]

Confidence Intervals
Which graph makes a more convincing case for Drug X, and why?

Confidence Intervals
In some graphs, the error bars reflect the range (min to max).

Confidence Intervals
In some graphs, the error bars reflect the inter-quartile range.

Confidence Intervals
In some graphs, the error bars reflect one standard deviation.
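The error-bar conventions listed so far (range, inter-quartile range, one standard deviation) can be compared on a single sample. The following is a minimal sketch, not part of the original slides: the function name `error_bar_limits` and the scores are hypothetical, and the quartiles use the default method of Python's `statistics.quantiles`, which is only one of several quartile conventions.

```python
import statistics


def error_bar_limits(scores):
    """Return (lower, upper) error-bar limits under three conventions."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD (n - 1 in the denominator)
    q1, _median, q3 = statistics.quantiles(scores, n=4)
    return {
        "range": (min(scores), max(scores)),
        "inter-quartile range": (q1, q3),
        "one standard deviation": (mean - sd, mean + sd),
    }


# Hypothetical effectiveness scores for one treatment group (not from the slides).
drug_x_scores = [4, 6, 7, 8, 9, 10, 12]

for label, (lower, upper) in error_bar_limits(drug_x_scores).items():
    print(f"{label}: {lower:.2f} to {upper:.2f}")
```

The same bar height can carry very different error bars depending on the convention, which is why a graph should always state which one it uses.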
Confidence Intervals
In some graphs, the error bars reflect one standard error (of the mean).*

Confidence Intervals
In still other graphs, the error bars reflect a confidence interval.*

Confidence Intervals
1. Standard Error (of the Mean) – The standard deviation of the “distribution of means” (D.O.M.).
• The standard deviation describes the average extent to which a RAW SCORE (that’s one raw score) deviates from the mean of the distribution of raw scores.
• The standard error describes the average extent to which a SAMPLE MEAN (that’s the mean of one sample) deviates from the mean of the distribution of means (D.O.M.).

Confidence Intervals
Three Kinds of Distributions
There are three kinds of distributions:
A. The distribution of the population of individuals
B. The distribution of a sample
C. The distribution of means (of samples)
Critical Thinking Question: Why is the D.O.M. so ‘skinny’?

Confidence Intervals
Main Points on the D.O.M.
• Q: Why would we want to use the standard deviation of the D.O.M.?
• A: So we can put a mean in context!
• This is similar to the rationale for knowing the SD of a distribution of raw scores…whether we have a raw score or a mean, we want some CONTEXT.

Confidence Intervals
Main Points on the D.O.M.
• Example: Your new drug is given to a sample of depressed patients. Subsequently, the sample’s mean mood score is 25, whereas the mean for the population of all depressed people is 20.
• Did our drug have a significant effect?
• IT DEPENDS!!!!
• If the D.O.M. has a standard deviation of 10 units, then our sample mean is not so different from the D.O.M. mean.
Our drug isn’t so special.
• If the D.O.M. has a standard deviation of 1 unit, then our sample mean is very different from the D.O.M. mean. Our drug is hot stuff!!!

Confidence Intervals
Main Points on the D.O.M.
• The standard error IS the standard deviation of the distribution of means (D.O.M.).
• We can estimate the standard deviation of the D.O.M. from a sample. To do so, we use the equation S.E. = SD_sample / sqrt(n). Please memorize this formula!

Confidence Intervals
1. Confidence Interval – A range of values assumed, with a specified degree of confidence (i.e., probability), to include a population parameter (usually the mean).
2. Example 1: We might be, say, 95% confident that the mean height in our room is in the range between 5’ 7’’ and 5’ 9’’.
3. Example 2: We might be, say, 99% confident that the mean height in our room is in the range between 5’ 6’’ and 5’ 10’’.
4. Critical Thinking Question: Why is the 99% confidence interval wider than the 95% confidence interval?

Confidence Intervals
1. Each confidence interval has an upper bound and a lower bound.
2. The upper & lower bounds depend on
   - The mean
   - The standard error [ S.D. / sqrt(n) ]
   - The confidence level (95% versus 99%)
3. The confidence level sets the critical value of ‘t’ (the number to beat)…

Confidence Intervals
1. If we want a 95% confidence interval, we’ll need to find the critical value of ‘t’ at α = 0.05 (two-tailed).
2. If we want a 99% confidence interval, we’ll need to find the critical value of ‘t’ at α = 0.01 (two-tailed).
3. Upper Bound = Mean + (t_crit * S.E.)
4. Lower Bound = Mean - (t_crit * S.E.)

Confidence Intervals
1. Practice Item 1: Assume that a sample in your experiment has the following features:
   Mean = 10
   S.D. = 8
   n = 16
   D.F. = 15
   t_crit(15) = 2.13 at the 0.05 alpha level
2. Compute the 95% confidence interval.

Confidence Intervals
1. Practice Item 2: Assume that a sample in your experiment has the following features:
   Mean = 10
   S.D. = 8
   n = 16
   D.F. = 15
   t_crit(15) = 2.95 at the 0.01 alpha level
2. Compute the 99% confidence interval.

Confidence Intervals
1. To summarize, researchers can make their error bars equal to confidence intervals, instead of the standard deviation.
2. The researchers might then say: “We are 95% confident that the population mean falls between (lower bound) and (upper bound).”
3. Higher confidence levels produce wider confidence intervals.

Part 2
Effect Size

Effect Size & Meta-Analysis
There is Trouble in Paradise (Say it with me)

Effect Size & Meta-Analysis
1. One major problem with Null Hypothesis Testing (i.e., inferential statistics) is that the outcome depends on sample size.
2. For example, a particular set of scores might generate a non-significant t-test with n=10. But if the exact same numbers were duplicated (n=20), the t-test suddenly becomes “significant”.

Effect Size & Meta-Analysis
1. Effect Size – The magnitude of the influence that the IV has on the DV.
2. Effect size does NOT depend on sample size! (“And there was much rejoicing!”)

Effect Size & Meta-Analysis
1. A commonly used measure of effect size is Cohen’s d.
2. Conventions for Cohen’s d:
   d = 0.2 small effect
   d = 0.5 medium effect
   d = 0.8 large effect

Effect Size & Meta-Analysis
1. A statistically significant effect is said to be a ‘reliable effect’… it would be found repeatedly if the sample size were sufficient.
2. Statistically significant effects are NOT LIKELY due to chance.
3. An effect can be statistically significant, yet ‘puny’.
4. There is an important distinction between statistical significance and practical significance…

Effect Size
Examples that distinguish effect size and statistical significance…
1. Analogy to a Roulette Wheel – An effect can be small, but reliable.
2. Anecdote about the discovery of the planet Pluto – An effect can be small, but reliable.
3. Anecdote about buddy’s doctoral thesis, “Systematic non-linearities in the production of time intervals”.
4. Denison versus “Other” in S.A.T. scores.

Effect Size & Meta-Analysis
1. Potential Pop Quiz Question – Using two sentences, generate your own novel example of a meta-analysis. http://en.wikipedia.org/wiki/Meta-analysis
2. Potential Pop Quiz Question – In your own words, explain how Cohen’s d can be helpful in a meta-analysis.

Part 3
Thoughts on Independent Groups Designs
From Shaughnessy, Zechmeister, Zechmeister (2012)

Independent Groups Designs
1. Potential Pop Quiz Question – As we’ve seen, inferential statistics can address the issue of reliability. Statistically significant effects are ‘reliable effects’. What is the ultimate test of an experiment’s reliability? (One word will do.)
2. Potential Pop Quiz Question – In your own words, explain what a conceptual replication is. Use an example of your own, or from the readings.

Independent Groups Designs
1. Potential Pop Quiz Question – In your own words, explain what a matched groups design is, and when it can be advantageously used.
2. Potential Pop Quiz Question – As we’ve noted many times, the scientific method has 4 goals. Which goal or goals can be met by a natural groups design, and which cannot? Explain your reasoning.
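The confidence-interval arithmetic from Part 1 and the sample-size point from Part 2 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not material from the slides. The function names are hypothetical, and the mood scores are made up. Cohen’s d is computed as (sample mean - population mean) / sample SD, which is one common form; the slides do not give a formula for d.

```python
import math
import statistics


def confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) as Mean -/+ (t_crit * S.E.), with S.E. = SD / sqrt(n)."""
    standard_error = sd / math.sqrt(n)
    margin = t_crit * standard_error
    return mean - margin, mean + margin


def one_sample_t_and_d(scores, population_mean):
    """Return (t, d) for a sample compared with a known population mean.

    Assumption: d = (sample mean - population mean) / sample SD.
    """
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD (n - 1 in the denominator)
    t = (mean - population_mean) / (sd / math.sqrt(n))
    d = (mean - population_mean) / sd
    return t, d


# Practice items: Mean = 10, S.D. = 8, n = 16, D.F. = 15.
ci_95 = confidence_interval(mean=10, sd=8, n=16, t_crit=2.13)
ci_99 = confidence_interval(mean=10, sd=8, n=16, t_crit=2.95)
print(f"95% CI: {ci_95[0]:.2f} to {ci_95[1]:.2f}")  # 5.74 to 14.26
print(f"99% CI: {ci_99[0]:.2f} to {ci_99[1]:.2f}")  # 4.10 to 15.90

# Hypothetical mood scores (not from the slides), compared with a population mean of 20.
scores = [18, 25, 21, 30, 19, 27, 22, 16, 28, 24]
t_10, d_10 = one_sample_t_and_d(scores, 20)
t_20, d_20 = one_sample_t_and_d(scores * 2, 20)  # the exact same numbers, duplicated
print(f"n=10: t = {t_10:.2f}, d = {d_10:.2f}")
print(f"n=20: t = {t_20:.2f}, d = {d_20:.2f}")
```

With these made-up scores, t rises from about 2.06 (below the two-tailed 0.05 critical value of 2.262 for 9 degrees of freedom) to 3.00 (above the critical value of 2.093 for 19 degrees of freedom), while d moves only from about 0.65 to 0.67. The small shift in d comes from the n - 1 denominator in the sample SD, not from any change in the underlying effect.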