Hypothesis Testing
Samples and Sampling
• We use small samples to make inferences about large populations.
• Good samples are:
  - Representative
  - Selected at random
  - Truly independent
Sampling Distributions
• One of the most crucial ideas in stats:
1. Take a sample.
2. Calculate the mean.
3. Repeat.
You will get a sampling distribution of the mean.
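The three steps above can be sketched as a small simulation. This is a minimal illustration, not part of the original slides: the population (the integers 1 to 1000), the sample size, and the function name `sampling_distribution_of_mean` are all assumptions chosen for the example.

```python
import random
import statistics

def sampling_distribution_of_mean(population, n, repeats, seed=0):
    """Draw `repeats` random samples of size n and return the list of sample means."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]

# Hypothetical population: the integers 1..1000 (population mean 500.5).
population = list(range(1, 1001))
means = sampling_distribution_of_mean(population, n=25, repeats=2000)

# The sample means cluster around the population mean,
# and they vary far less than the individual observations do.
print("mean of sample means:", round(statistics.mean(means), 1))
print("SD of sample means:  ", round(statistics.stdev(means), 1))
print("SD of population:    ", round(statistics.pstdev(population), 1))
```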
Central Limit Theorem
The sampling distribution of means from random samples of n observations approaches the normal distribution as n increases, regardless of the shape of the parent population.
See CLT example
Central Limit Theorem: Example
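One way to watch the theorem at work is a rough simulation. The sketch below assumes a strongly right-skewed parent population (exponential) and tracks the skewness of the sample means as n grows; a value near 0 indicates a symmetric, more normal-looking distribution. The population choice, the sample sizes, and the helper names are illustrative assumptions.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_means(n, repeats, rng):
    """Means of `repeats` samples of size n from an exponential (right-skewed) population."""
    return [statistics.mean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(repeats)]

rng = random.Random(1)
skew_by_n = {n: skewness(sample_means(n, 3000, rng)) for n in (1, 5, 30, 100)}

# Skewness shrinks toward 0 as the sample size grows.
for n, sk in skew_by_n.items():
    print(f"n = {n:3d}  skewness of sample means = {sk:.2f}")
```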
So what?
• From the CLT, we can establish the accuracy of the estimate of the mean from one sample.
• Standard Error of the mean (SE) = SD/√n
• Confidence Intervals (CI):
  .68 CI = Mean ± 1(SE)
  .95 CI = Mean ± 1.96(SE)
  .99 CI = Mean ± 2.576(SE)
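A short worked example of these formulas, using a hypothetical sample of 10 scores (the values and the function name `confidence_interval` are made up for illustration). It uses the normal multipliers listed above; for a sample this small, a t-based multiplier would usually be preferred and would give a somewhat wider interval.

```python
import math
import statistics

def confidence_interval(sample, z=1.96):
    """Return (mean, SE, lower, upper) using CI = mean ± z * SE, with SE = SD / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return mean, se, mean - z * se, mean + z * se

# Hypothetical sample of 10 scores (illustrative values only).
scores = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
mean, se, lo, hi = confidence_interval(scores)
print(f"mean = {mean:.2f}, SE = {se:.3f}, .95 CI = [{lo:.2f}, {hi:.2f}]")
# prints: mean = 13.50, SE = 0.500, .95 CI = [12.52, 14.48]
```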
Now we can talk about hypotheses
• Scientific vs. Statistical Hypotheses
• Ho : the null hypothesis (the one you want to nullify)
• H1 : the alternative hypothesis
Example:
Ho : Mean(control group) = Mean(experimental group)
H1 : Mean(control group) < Mean(experimental group)
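To make the directional hypothesis concrete, here is a minimal sketch of a one-sided test of this Ho against this H1. It swaps in a large-sample normal (z) approximation so that it needs only the standard library; with groups this small a t-test would normally be used. The data and the function name `one_sided_z_test` are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def one_sided_z_test(control, experimental):
    """Approximate test of Ho: means equal vs. H1: mean(control) < mean(experimental)."""
    diff = statistics.mean(experimental) - statistics.mean(control)
    se_diff = math.sqrt(statistics.variance(control) / len(control)
                        + statistics.variance(experimental) / len(experimental))
    z = diff / se_diff
    p = 1 - NormalDist().cdf(z)  # upper-tail probability if Ho were true
    return z, p

# Hypothetical scores (illustrative values only).
control = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 11]
experimental = [13, 14, 12, 15, 14, 13, 12, 14, 15, 13, 14, 13]

z, p = one_sided_z_test(control, experimental)
print(f"z = {z:.2f}, one-sided p = {p:.2g}")
print("Reject Ho" if p < 0.05 else "Do not reject Ho")
```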
How do you know your sample or group is really different from another sample, group or population?
When comparing two samples (or a sample to an assumed population), two errors are possible.
Statistical Decision     True state of null hypothesis
                         Ho True             Ho False
Reject Ho                Type I error (α)    Correct
Do not reject Ho         Correct             Type II error (β)
Power
• Two of the most important and most neglected concepts are power and effect size.
• Power (1 - β): probability of correctly rejecting a false null hypothesis.
• Thus, the previous table could be expressed in terms of probability:
Statistical Decision     True state of null hypothesis
                         Ho True                   Ho False
Reject Ho                Type I error (α) = .05    Correct = .80 (Power)
Do not reject Ho         Correct = .95             Type II error (β) = .20
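The four cells of this table can be approximated by simulation. The sketch below is an illustration under stated assumptions: two groups of 64 drawn from normal populations with SD 1, a two-sided large-sample z-test at α = .05, and a true difference of either 0 (Ho true) or 0.5 SD (Ho false). The function names are hypothetical, and the estimated rates vary slightly with the random seed.

```python
import math
import random
from statistics import NormalDist, mean, variance

def rejects_h0(a, b, alpha=0.05):
    """Two-sided large-sample z-test: True if Ho (equal means) is rejected."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(b) - mean(a)) / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def rejection_rate(true_diff, n, sims, rng):
    """Share of simulated studies that reject Ho when the true difference is true_diff SDs."""
    hits = 0
    for _ in range(sims):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        hits += rejects_h0(a, b)
    return hits / sims

rng = random.Random(42)
type_i_rate = rejection_rate(0.0, n=64, sims=1000, rng=rng)  # Ho true
power = rejection_rate(0.5, n=64, sims=1000, rng=rng)        # Ho false

print(f"Ho true : reject = {type_i_rate:.3f}, do not reject = {1 - type_i_rate:.3f}")
print(f"Ho false: reject = {power:.3f}, do not reject = {1 - power:.3f}")
```

With these settings the rejection rate should land near .05 when Ho is true and near .80 when Ho is false, matching the pattern in the table.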
Effect Size
• Which result is a larger effect?
  - Significant difference between groups (p < .05)
  - Significant difference between groups (p < .01)
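Neither p-value answers the question on its own, because p depends on sample size as well as on the size of the effect. A rough sketch, assuming two equal-sized groups and a normal approximation (the function name and the numbers are illustrative):

```python
import math
from statistics import NormalDist

def two_sided_p(d, n_per_group):
    """Approximate two-sided p-value for an observed standardized mean
    difference d between two groups of n_per_group each."""
    z = d * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# A large effect in a small study vs. a small effect in a large study.
p_large_effect_small_n = two_sided_p(d=0.8, n_per_group=15)
p_small_effect_large_n = two_sided_p(d=0.2, n_per_group=400)

print(f"d = 0.8, n = 15 per group : p = {p_large_effect_small_n:.4f}")
print(f"d = 0.2, n = 400 per group: p = {p_small_effect_large_n:.4f}")
```

Here the smaller effect produces the smaller p-value, simply because its sample is much larger.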
Effect Size
• Two roads to a significant result:
  - Small effect but a large sample
  - Large effect
• Effect size statistics provide estimates that are independent of idiosyncrasies of any given sample. They typically translate mean differences into standard deviation units.
  e.g. Cohen’s d = (Mean1 - Mean2)/SD
  For this stat, small = .2, medium = .5, and large = .8
  (See Cohen, J. (1992). A Power Primer. Psychological Bulletin, 112, 155-159.)
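A minimal sketch of the calculation. The slide does not say which SD goes in the denominator; this example assumes the pooled standard deviation of the two groups, which is one common choice. The data, the function names, and the use of the benchmarks as cut-offs in `label` are illustrative assumptions.

```python
import math
import statistics

def cohens_d(group1, group2):
    """Cohen's d = (mean1 - mean2) / pooled SD (pooled SD is an assumption here)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    return (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(pooled_var)

def label(d):
    """Apply the conventional benchmarks (.2, .5, .8) to |d| as rough cut-offs."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"

# Hypothetical scores (illustrative values only).
experimental = [5, 6, 7, 8, 9]
control = [4, 5, 6, 7, 8]

d = cohens_d(experimental, control)
print(f"d = {d:.2f} ({label(d)})")
```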
How does it all relate?
• There are 4 variables involved with statistical inference: sample size (N), significance criterion (α), effect size, and statistical power.
• You can get the value of any one of these with the values of the other 3.
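As a rough illustration of how the four quantities determine one another, the sketch below assumes a two-sided comparison of two equal-sized groups and a normal approximation (exact t-based results differ slightly). The function names `power` and `detectable_effect` are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power(d, n_per_group, alpha=0.05):
    """Approximate power, given effect size d, n per group, and alpha."""
    return Z.cdf(d * math.sqrt(n_per_group / 2) - Z.inv_cdf(1 - alpha / 2))

def detectable_effect(n_per_group, alpha=0.05, power_target=0.80):
    """Approximate smallest effect size detectable, given n per group, alpha, and power."""
    return (Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power_target)) * math.sqrt(2 / n_per_group)

print(f"power for d = .5, n = 64 per group: {power(0.5, 64):.2f}")
print(f"effect detectable with n = 64 per group at power .80: {detectable_effect(64):.2f}")
```

Fixing any three of the values and solving for the fourth follows the same pattern.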
Power analysis in proposals
• You can use power to determine the N needed for your study.
• If you estimate your expected effect size (e.g. mean difference of ½ SD = ES of .5), know your alpha (probably .05), and set your desired power to detect differences (typically ≥ .80), you can get the sample size necessary to detect that difference.
• Tables in Cohen (1988) or on-line.
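In place of a table lookup, the required sample size can be approximated directly. This sketch assumes a two-sided, two-group comparison with equal group sizes and uses a normal approximation, so its answers run slightly below published t-based tables. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power_target=0.80):
    """Approximate n needed in each of two groups to detect effect size d."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for a two-sided test
    z_power = z.inv_cdf(power_target)    # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} per group")
```

For the ½ SD example above (ES of .5, alpha .05, power .80), this approximation gives about 63 per group.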
What does it all mean?
• Low power (< .80) makes real effects easy to miss. That is, it increases Type II error: failure to reject a false Ho.
• Small effect sizes, regardless of p level, are just small effects.
Recommended Reading and OnLine resources
Cohen, J. (1992). A Power Primer. Psychological Bulletin, 112, 155-159.
Rosnow, R.L., & Rosenthal, R. (2003). Effect sizes for experimenting psychologists. Canadian Journal of Experimental Psychology, 57(3), 221-237.
http://calculators.stat.ucla.edu/powercalc/