Chapter Summaries
Chapter 4: Hypothesis Tests
Hypothesis tests are used to investigate claims about population parameters. We use
the question of interest to determine the two competing hypotheses: The null hypothesis (H0) is generally that there is no effect or no difference while the alternative
hypothesis (Ha) is the claim for which we seek evidence. We conclude in favor of the
alternative hypothesis if the sample supports the alternative hypothesis and provides
strong evidence against the null hypothesis. We measure the strength of evidence a
sample shows against the null hypothesis with a p-value.
The p-value is the probability of obtaining a sample statistic as extreme as (or
more extreme than) the observed sample statistic, when the null hypothesis is true.
A small p-value means that, if the null hypothesis is true, sample results as extreme
as those observed would be unlikely to happen just by random chance.
When making formal decisions based on the p-value, we use a pre-specified significance level, 𝛼.
• If p-value < 𝛼, we reject H0 and have statistically significant evidence for Ha.
• If p-value ≥ 𝛼, we do not reject H0, the test is inconclusive, and the results are not
statistically significant.
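The decision rule above can be written as a short Python sketch. The function name `decide` and the default of 𝛼 = 0.05 are assumptions made for illustration; the summary only requires that 𝛼 be chosen before looking at the data.

```python
def decide(p_value, alpha=0.05):
    """Apply the formal decision rule: reject H0 only when p-value < alpha.

    `alpha` defaults to 0.05 purely for illustration; it should be
    specified in advance for the study at hand.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject H0"         # statistically significant evidence for Ha
    return "do not reject H0"      # inconclusive; not statistically significant


# A p-value of 0.042 is significant at alpha = 0.05 but not at alpha = 0.01.
print(decide(0.042, alpha=0.05))
print(decide(0.042, alpha=0.01))
```

Note that a p-value exactly equal to 𝛼 falls in the "do not reject" case, matching the ≥ in the second bullet.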
The key idea is: The smaller the p-value, the stronger the evidence against the null
hypothesis and in support of the alternative hypothesis. Rather than making a formal
reject/do not reject decision, we sometimes interpret the p-value as a measure of
strength of evidence.
One way to estimate a p-value is to construct a randomization distribution of
sample statistics that we might see by random chance, if the null hypothesis were
true. The p-value is the proportion of randomization statistics that are as extreme as
(or more extreme than) the observed sample statistic. If the observed statistic falls
far out in the tails of the randomization distribution, then a result that extreme is
unlikely to occur if the null hypothesis is true, providing evidence against the null.
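The randomization procedure described above can be sketched in Python for a difference in two group means. This is a minimal illustration under stated assumptions, not the software used in the text: the function name `randomization_p_value`, the two-tailed reading of "as extreme", the default of 1000 simulations, and the small data lists are all hypothetical, and the values are not the SleepCaffeine data.

```python
import random
from statistics import mean


def randomization_p_value(group_a, group_b, n_sim=1000, seed=0):
    """Estimate a two-tailed p-value for H0: mu_a = mu_b.

    Under H0 the group labels are interchangeable, so we repeatedly
    reshuffle the pooled values into two groups of the original sizes
    and record how often the simulated difference in means is at least
    as far from zero as the observed difference.
    """
    rng = random.Random(seed)            # fixed seed so runs are repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_sim):
        rng.shuffle(pooled)              # random reassignment to groups
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):   # as extreme or more extreme
            extreme += 1
    return observed, extreme / n_sim


# Hypothetical scores, invented for this sketch only.
group_a = [15, 17, 12, 14, 19, 16]
group_b = [11, 13, 10, 15, 12, 9]

observed, p_value = randomization_p_value(group_a, group_b)
print(f"observed difference = {observed:.3f}")
print(f"estimated p-value   = {p_value:.3f}")
```

Because the p-value is estimated from a finite number of random reshuffles, it varies slightly with the seed and with `n_sim`; more simulations give a more stable estimate.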
A randomization distribution for difference in mean memory recall between
sleep and caffeine groups for data in SleepCaffeine is shown. Each dot is a difference
in means that might be observed just by random assignment to treatment groups, if
there were no difference in terms of mean (memory) response. We see that a proportion
0.042 of the simulated statistics are at least as extreme as the observed statistic
(x̄s − x̄c = 3), so the
p-value is 0.042. This p-value is less than 0.05, so the results are statistically significant
at 𝛼 = 0.05, giving moderately strong evidence that sleeping is better than drinking
caffeine for memory.
[Figure: Randomization dotplot of x̄1 − x̄2 under the null hypothesis μ1 = μ2 (# samples = 1000, mean = 0.092, st. dev. = 1.48). Two-tail proportions: 0.021 below −2.833, 0.958 in the middle, and 0.021 above 3. Caption: Randomization distribution of differences in means.]