CHAPTER EIGHT
Confidence Intervals, Effect Size,
and Statistical Power
NOTE TO INSTRUCTORS
Students have a tendency to think that if something is statistically
significant, the story is over and that’s all that a person needs to
know. In other words, they frequently confuse “statistically
significant” with “meaningful.” This chapter will help students
recognize that this is not always the case. Aside from using the
discussion questions and classroom exercises, present examples of
studies that demonstrate a significant difference between groups but
not a very meaningful one. It is also important to break students of
the habit of using phrases such as “very significant” by discussing
effect sizes. Although students might be tempted to describe an
effect as “very significant,” emphasize that they should use effect
sizes for this purpose instead.
OUTLINE OF RESOURCES
I. Confidence Intervals
 Discussion Question 8-1
 Classroom Activity 8-1: Understanding Confidence Intervals
 Discussion Question 8-2
II. Effect Size and p_rep
 Discussion Question 8-3
 Discussion Question 8-4
III. Next Steps: p_rep
IV. Statistical Power
 Discussion Question 8-5
 Discussion Question 8-6
 Classroom Activity 8-2: Working with Confidence Intervals and Effect Size
V. Next Steps: Meta-Analysis
 Discussion Question 8-7
 Classroom Activity 8-3: Analyzing Meta-Analyses
VI. Additional Reading
Online Resources
Handouts
 Handout 8-1: Classroom Activity: Working with Confidence Intervals and Effect Size
 Handout 8-2: Analyzing Meta-Analyses
CHAPTER GUIDE
I. Confidence Intervals
1. A point estimate is a summary statistic from a sample that is just
one number as an estimate of the population parameter.
2. Instead of using a point estimate, it is wiser to use an interval
estimate, which is based on a sample statistic and provides a
range of plausible values for the population parameter.
> Discussion Question 8-1
What is the difference between a point estimate and an interval estimate?
Your students’ answers should include:
 A point estimate is a summary statistic from a sample that is just
one number as an estimate of the population parameter. Point
estimates are useful for gauging the central tendency, but by
themselves can be misleading.
 An interval estimate is based on a sample statistic and provides a
range of plausible values for the population parameter. Interval
estimates are frequently used in media reports, particularly when
reporting political polls.
Classroom Activity 8-1
Understanding Confidence Intervals
The following Web site provides a nice applet to help your students
understand confidence intervals:
http://www.ruf.rice.edu/~lane/stat_sim/conf_interval/index.html
The applet simulates a known population mean and standard
deviation and allows you to control the sample size, providing a
graphical display of the resulting confidence intervals.
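If the applet is unavailable, the same demonstration can be sketched in Python; the population mean (50), standard deviation (10), sample size (25), and number of simulations below are illustrative assumptions, not values from the text.

    # Hypothetical demonstration: repeatedly sample from a known population and
    # count how often the 95% confidence interval captures the population mean.
    import random
    import statistics

    POP_MEAN, POP_SD = 50, 10   # assumed "known" population parameters
    N, Z_CRIT = 25, 1.96        # sample size and critical z for a 95% interval
    SIMULATIONS = 1000

    covered = 0
    for _ in range(SIMULATIONS):
        sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
        m = statistics.mean(sample)
        se = POP_SD / N ** 0.5                  # standard error of the mean
        lower, upper = m - Z_CRIT * se, m + Z_CRIT * se
        covered += lower <= POP_MEAN <= upper   # did this interval capture the mean?

    print(f"{covered / SIMULATIONS:.1%} of the intervals captured the population mean")

Students should see that roughly 95% of the simulated intervals capture the population mean, mirroring what the applet displays graphically.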
3. A confidence interval is an interval estimate, based on a sample
statistic, that would include the population mean a certain
percentage of the time were we to sample from the same population
repeatedly.
4. With a confidence interval, we expect to find a mean in this
interval 95% of the time that we conduct the same study (if our
confidence level is 95%).
5. To calculate a confidence interval with a z test, we first draw a
normal curve that has the sample mean in the center.
6. We then indicate the bounds of the confidence interval on either
end and write the percentages under each segment of the
curve.
7. Next, we look up the z statistics for the lower and upper ends of
the confidence interval in the z table.
8. We then convert the z statistics to raw means for the lower and
upper ends of the confidence interval. To do so, we first
calculate the standard error as our measure of spread using the
formula σM = σ/√N. Then, with this standard error and the
sample mean, we can calculate the raw means at the lower and
upper ends of the confidence interval. For the lower end we use
the formula MLower = −z(σM) + MSample; for the upper end we
use the formula MUpper = z(σM) + MSample.
9. Lastly, we should check our answer to ensure that each end of
the confidence interval is exactly the same distance from the
sample mean.
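A brief worked version of steps 5 through 9 in Python may help students check their hand calculations; the sample mean (105), population standard deviation (15), and N (30) below are made up for illustration.

    # Hypothetical example: 95% confidence interval around a sample mean (z test).
    M_SAMPLE = 105     # sample mean (center of the curve)
    SIGMA = 15         # population standard deviation
    N = 30             # sample size
    Z_CRIT = 1.96      # z statistics marking the bounds of a 95% interval

    sigma_m = SIGMA / N ** 0.5                 # standard error: sigma_M = sigma / sqrt(N)
    m_lower = -Z_CRIT * sigma_m + M_SAMPLE     # M_lower = -z(sigma_M) + M_sample
    m_upper = Z_CRIT * sigma_m + M_SAMPLE      # M_upper =  z(sigma_M) + M_sample

    print(f"95% CI: [{m_lower:.2f}, {m_upper:.2f}]")
    # Check (step 9): both ends are the same distance from the sample mean.
    assert abs((M_SAMPLE - m_lower) - (m_upper - M_SAMPLE)) < 1e-9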
> Discussion Question 8-2
How would you calculate a confidence interval with a z test?
Your students’ answers should include:
To calculate a confidence interval with a z test:
 Draw a normal curve with a sample mean in the center.
 Indicate the bounds of the confidence interval on either end and
write the percentages under each segment of the curve.
 Look up the z statistics for the lower and upper ends of the
confidence interval in the z table.
 Convert the z statistics to raw means for the lower and upper ends
of the confidence interval. For the lower end, use the formula:
MLower = −z(σM) + MSample. For the upper end, use the formula:
MUpper = z(σM) + MSample.
 Lastly, check the answer to ensure that each end of the
confidence interval is exactly the same distance from the sample
mean.
II. Effect Size and p_rep
1. Increasing the sample size can lead to an increased test statistic
during hypothesis testing. In other words, it becomes
progressively easier to declare statistical significance as we
increase the sample size.
2. An effect size indicates the size of a difference and is unaffected
by sample size.
3. Effect size tells us how much two populations do not overlap.
Two populations can overlap less if either their means are
farther apart or the variation within each population is smaller.
> Discussion Question 8-3
What is an effect size, and why would reporting it be useful?
Your students’ answers should include:
 An effect size is a measure of the degree to which groups differ in
the population on the dependent variable.
 It is useful to report the effect size because it provides you with a
standardized value of the degree to which two populations do not
overlap and addresses the relative importance and generalizability
of your sample statistics.
4. Cohen’s d is a measure of effect size that assesses the
difference between two means in terms of standard deviation,
not standard error.
5. The formula for Cohen’s d for a z distribution is: d = (M − μ)/σ.
6. A d of .2 is considered a small effect size, a d of .5 is considered
a medium effect size, and a d of .8 is considered a large effect
size.
7. The sign of the effect size does not matter.
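A short, hypothetical calculation can make these guidelines concrete; the sample mean, population mean, and standard deviation below are invented, and the small/medium/large labels are only rough conventions.

    # Hypothetical example: Cohen's d for a z distribution, d = (M - mu) / sigma.
    M, MU, SIGMA = 105, 100, 15    # made-up sample mean, population mean, and SD

    d = (M - MU) / SIGMA           # note: divided by sigma, not by standard error
    size = "small" if abs(d) < 0.5 else "medium" if abs(d) < 0.8 else "large"
    print(f"d = {d:.2f} (roughly {size} by Cohen's guidelines)")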
> Discussion Question 8-4
Imagine you obtain an effect size of –0.3. How would you interpret this
number?
Your students’ answers should include:
 If you obtained an effect size of –0.3, you would interpret this as a
small effect size; the negative sign indicates only the direction of
the difference, not its magnitude.
III. Next Steps: p_rep
1. Another method we can use in hypothesis testing is p_rep, the
probability of replicating an effect given a particular population
and sample size. It is interpreted as “This effect will replicate
100(p_rep)% of the time.”
2. To calculate p_rep, we first calculate the specific p value
associated with our test statistic.
3. Next, using Excel, we enter into one cell the formula
=NORMSDIST(NORMSINV(1-P)/SQRT(2)), where we substitute
the actual p value for P.
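For classes that do not use Excel, a minimal Python sketch of the same calculation is shown below, using the standard library’s NormalDist in place of NORMSDIST and NORMSINV; the p value of .05 is just an example.

    # Sketch of the p_rep calculation, assuming the formula given above.
    from math import sqrt
    from statistics import NormalDist

    def p_rep(p):
        """Estimated probability of replicating an effect, given its p value."""
        z = NormalDist().inv_cdf(1 - p)        # NORMSINV(1 - P)
        return NormalDist().cdf(z / sqrt(2))   # NORMSDIST(... / SQRT(2))

    print(f"p_rep for p = .05: {p_rep(0.05):.3f}")   # roughly .88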
IV. Statistical Power
1. Statistical power is a measure of our ability to reject the null
hypothesis given that the null hypothesis is false. In other
words, it is the probability that we will not make a Type II error,
or the probability that we will reject the null hypothesis when we
should reject the null hypothesis.
2. Our calculation of statistical power ranges from a probability of
0.00 to 1.00. Historically, statisticians have used a probability of
.80 as the minimum for conducting a study.
3. There are three steps to calculating statistical power. In the first
step, we determine the information needed to calculate
statistical power, including the population mean, the population
standard deviation, the hypothesized mean for the sample, the
sample size, and the standard error based on this sample size.
4. In step two, we calculate the critical value in terms of the z
distribution and in terms of the raw mean so that statistical
power can be calculated.
5. In step three we calculate the statistical power or the percentage
of the distribution of means for population 2 (the distribution
centered on the hypothesized sample mean) that falls above
the critical value.
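A minimal Python sketch of these three steps follows; the population mean (100), standard deviation (15), hypothesized sample mean (108), N of 30, and one-tailed alpha of .05 are hypothetical values chosen only for illustration.

    # Hypothetical power calculation for a one-tailed z test.
    from statistics import NormalDist

    MU, SIGMA = 100, 15           # step 1: population mean and standard deviation
    M_HYP, N = 108, 30            #         hypothesized sample mean and sample size
    sigma_m = SIGMA / N ** 0.5    #         standard error based on this sample size

    z_crit = NormalDist().inv_cdf(0.95)    # step 2: critical z for one-tailed alpha = .05
    raw_cutoff = MU + z_crit * sigma_m     #         critical value as a raw mean

    # Step 3: percentage of the distribution of means for population 2
    # (centered on the hypothesized sample mean) that falls above the cutoff.
    power = 1 - NormalDist(M_HYP, sigma_m).cdf(raw_cutoff)
    print(f"Statistical power = {power:.2f}")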
> Discussion Question 8-5
What is statistical power, and how would you calculate it?
Your students’ answers should include:
 Statistical power is the probability of rejecting the null hypothesis
when it is false.
 You calculate statistical power in three steps. First, determine the
characteristics of the two populations. Next, calculate the critical
value in terms of the z distribution and as a raw mean cutoff.
Finally, determine the percentage of the distribution of means for
population 2 that falls at or above that cutoff.
6. There are five ways that we can increase the power of a
statistical test. First, we can increase alpha. Second, we could
turn a two-tailed hypothesis into a one-tailed hypothesis. Third,
we could increase N. Fourth, we could exaggerate the levels of
the independent variable. Lastly, we could decrease the
standard deviation.
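The short sketch below illustrates the third of these levers, holding the hypothetical values from the previous sketch constant and letting N grow; power should rise toward 1.00.

    # Hypothetical demonstration: power increases with sample size.
    from statistics import NormalDist

    MU, SIGMA, M_HYP, ALPHA = 100, 15, 108, 0.05   # same made-up values as above

    for n in (10, 20, 30, 60):
        sigma_m = SIGMA / n ** 0.5
        cutoff = MU + NormalDist().inv_cdf(1 - ALPHA) * sigma_m
        power = 1 - NormalDist(M_HYP, sigma_m).cdf(cutoff)
        print(f"N = {n:3d}: power = {power:.2f}")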
> Discussion Question 8-6
What are ways that you could increase statistical power?
Your students’ answers should include:
Five ways that you could increase your statistical power are:
 Adopt a more lenient alpha level.
 Use a one-tailed test in place of a two-tailed test.
 Increase the size of the sample.
 Exaggerate the levels of the independent variable.
 Decrease the standard deviation.
Classroom Activity 8-2
Working with Confidence Intervals and Effect Size
For this activity, you will need to have the class take a sample IQ
test. You can find many examples of abbreviated IQ tests online
(www.iqtest.com is one such site). Have students anonymously
submit their scores and compare the class data to data for the
general population (population mean = 100, population standard
deviation = 15). Using these data:
 Have students calculate the confidence interval for the analysis.
 Have students calculate the effect size.
Use Handout 8-1, found at the end of this chapter, to complete the
activity.

V. Next Steps: Meta-Analysis
1. A meta-analysis is a study that involves the calculation of a
mean effect size from the individual effect sizes of many
studies.
2. A meta-analysis can provide added statistical power by
considering many studies at once. In addition, a meta-analysis
can help to resolve debates fueled by contradictory research
findings.
> Discussion Question 8-7
What is a meta-analysis, and why is it useful?
Your students’ answers should include:
 A study that involves the calculation of a mean effect size from the
individual effect sizes of many studies.
 It is useful because it considers many studies at once and helps to
resolve debates fueled by contradictory research findings.
3. The first step in a meta-analysis is to choose the topic and make
a list of criteria for which studies will be included.
4. Our next step is to gather every study that can be found on a
given topic and calculate an effect size for every study that was
found.
5. Lastly, we calculate statistics—ideally, summary statistics, a
hypothesis test, a confidence interval, and a visual display of
the effect sizes (a brief sketch of this step appears after this list).
6. A file-drawer analysis is a statistical calculation following a
meta-analysis of the number of studies with null results that
would have to exist so that a mean effect size is no longer
statistically significant.
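As a rough illustration of steps 4 and 5, the sketch below computes an unweighted mean effect size from invented study-level effect sizes; real meta-analyses typically weight studies (for example, by sample size) and add a hypothesis test, confidence interval, and visual display.

    # Hypothetical meta-analysis summary: mean of the individual effect sizes.
    from statistics import mean, stdev

    effect_sizes = [0.42, 0.15, 0.60, 0.33, 0.51, 0.08]   # one made-up Cohen's d per study

    print(f"Number of studies:  {len(effect_sizes)}")
    print(f"Mean effect size:   {mean(effect_sizes):.2f}")
    print(f"SD of effect sizes: {stdev(effect_sizes):.2f}")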
Classroom Activity 8-3
Analyzing Meta-Analyses
Directions: In this activity, have students find a meta-analysis within
the psychological literature. You may want to point them in the right
direction by suggesting journals that typically publish meta-analyses,
such as Psychological Bulletin, Personality and Social
Psychology Review, or the Journal of Applied Psychology. Once
students have found their meta-analysis, they should answer the
questions in Handout 8-2.
Additional Readings
Cohen, J. (1988). Statistical power analysis for the behavioral
sciences. Hillsdale, NJ: Lawrence Erlbaum.
This is arguably the definitive source for power analysis. Many
of the procedural guidelines for determining power that are useful in
many types of research design are clearly laid out in this text.
Neyman, J. (1937). Outline of a theory of statistical estimation based
on the classical theory of probability. Philosophical Transactions of
the Royal Society of London. Series A, 236, 333–380.
This is considered the seminal paper for confidence intervals.
Rosenthal, R. (1994). Parametric measures of effect size. In Cooper,
H., and Hedges, L. V. (Eds.), The handbook of research synthesis
(pp. 231–244). New York: Russell Sage Foundation.
A very readable account of many of the techniques for
calculating effect sizes. The chapter also includes a lot of
background information about these techniques and how to interpret
them.
Online Resources
This is an excellent Web site with numerous statistical
demonstrations that you can run in your classroom to help explain
the concepts concretely: http://onlinestatbook.com/. Here you will
find demonstrations of effect size, goodness of fit, and power.
MathWorld is an excellent and extensive resource site, providing
background information and succinct explanations for all of the
statistical concepts covered in the textbook and beyond.
http://mathworld.wolfram.com/topics/ProbabilityandStatistics.html
PLEASE NOTE: Due to formatting, the Handouts are only available in Adobe
PDF®.