Download Lecture 14: Hypothesis testing, continued

Document related concepts

Bootstrapping (statistics) wikipedia , lookup

Degrees of freedom (statistics) wikipedia , lookup

Psychometrics wikipedia , lookup

Taylor's law wikipedia , lookup

Foundations of statistics wikipedia , lookup

Analysis of variance wikipedia , lookup

Statistical hypothesis testing wikipedia , lookup

Misuse of statistics wikipedia , lookup

Resampling (statistics) wikipedia , lookup

Student's t-test wikipedia , lookup

Transcript
Lecture 14: Hypothesis testing, continued
Announcements:
Reading for this week’s subject: pp. 61 - 79 in Vasilj.
Hypothesis testing: example
Hypothesis testing: example
We have a new variety of tomato.
Hypothesis testing: example
We have a new variety of tomato. Does it have the same
resistance to fusarium wilt as the old variety?
Hypothesis testing: example
We have a new variety of tomato. Does it have the same
resistance to fusarium wilt as the old variety? To answer this
question, we do an experiment.
Hypothesis testing: example
We have a new variety of tomato. Does it have the same
resistance to fusarium wilt as the old variety? To answer this
question, we do an experiment. We expose 100 new plants to
fusarium, and found that the mean growth rate was 35 with
standard deviation 10.
Hypothesis testing: example
We have a new variety of tomato. Does it have the same
resistance to fusarium wilt as the old variety? To answer this
question, we do an experiment. We expose 100 new plants to
fusarium, and found that the mean growth rate was 35 with
standard deviation 10. We know from previous experience that
the mean growth rate of the old genotype is 33.
Hypothesis testing: procedure
1
We ask a yes/no question about a population.
Hypothesis testing: procedure
1
2
We ask a yes/no question about a population.
We answer the question yes, and answer the question no,
using symbols for the population means.
Hypothesis testing: procedure
1
2
3
We ask a yes/no question about a population.
We answer the question yes, and answer the question no,
using symbols for the population means.
We label one answer the null hypothesis and the other
answer the alternative hypothesis.
Hypothesis testing: procedure
1
2
3
4
We ask a yes/no question about a population.
We answer the question yes, and answer the question no,
using symbols for the population means.
We label one answer the null hypothesis and the other
answer the alternative hypothesis.
We decide the criterion for rejecting the null hypothesis.
Hypothesis testing: procedure
1
2
3
4
5
We ask a yes/no question about a population.
We answer the question yes, and answer the question no,
using symbols for the population means.
We label one answer the null hypothesis and the other
answer the alternative hypothesis.
We decide the criterion for rejecting the null hypothesis.
The test is one of: two-tailed, right-tailed, or left-tailed.
Hypothesis testing: procedure
1
2
3
4
5
6
We ask a yes/no question about a population.
We answer the question yes, and answer the question no,
using symbols for the population means.
We label one answer the null hypothesis and the other
answer the alternative hypothesis.
We decide the criterion for rejecting the null hypothesis.
The test is one of: two-tailed, right-tailed, or left-tailed.
We take a sample, and calculate our test statistic (Z or t
for now)
Hypothesis testing: procedure
1
2
3
4
5
6
7
We ask a yes/no question about a population.
We answer the question yes, and answer the question no,
using symbols for the population means.
We label one answer the null hypothesis and the other
answer the alternative hypothesis.
We decide the criterion for rejecting the null hypothesis.
The test is one of: two-tailed, right-tailed, or left-tailed.
We take a sample, and calculate our test statistic (Z or t
for now)
We find if the observed test statistic is in the rejection
region (critical region or tail) of the distribution.
Hypothesis testing: procedure
1
2
3
4
5
6
7
8
We ask a yes/no question about a population.
We answer the question yes, and answer the question no,
using symbols for the population means.
We label one answer the null hypothesis and the other
answer the alternative hypothesis.
We decide the criterion for rejecting the null hypothesis.
The test is one of: two-tailed, right-tailed, or left-tailed.
We take a sample, and calculate our test statistic (Z or t
for now)
We find if the observed test statistic is in the rejection
region (critical region or tail) of the distribution.
If the statistic is in the rejection region, we reject the null
hypothesis and accept the alternative hypothesis.
Hypothesis testing: procedure
1
2
3
4
5
6
7
8
9
We ask a yes/no question about a population.
We answer the question yes, and answer the question no,
using symbols for the population means.
We label one answer the null hypothesis and the other
answer the alternative hypothesis.
We decide the criterion for rejecting the null hypothesis.
The test is one of: two-tailed, right-tailed, or left-tailed.
We take a sample, and calculate our test statistic (Z or t
for now)
We find if the observed test statistic is in the rejection
region (critical region or tail) of the distribution.
If the statistic is in the rejection region, we reject the null
hypothesis and accept the alternative hypothesis.
If the statistic is not in the rejection region, we retain the
null hypothesis, and do not accept the alternative
hypothesis.
Rules
If the alternative hypothesis contains the 6= sign, the test
is two-tailed.
Rules
If the alternative hypothesis contains the 6= sign, the test
is two-tailed.
If the alternative hypothesis contains the < sign, the test
is left-tailed.
Rules
If the alternative hypothesis contains the 6= sign, the test
is two-tailed.
If the alternative hypothesis contains the < sign, the test
is left-tailed.
If the alternative hypothesis contains the > sign, the test
is right-tailed.
Hypothesis testing: example
We have a new variety of tomato. Does it have the same
resistance to fusarium wilt as the old variety? To answer this
question, we do an experiment. We expose 100 new plants to
fusarium, and found that the mean growth rate was 35 with
standard deviation 10. We know from previous experience that
the mean growth rate of the old genotype is 33.
Hypothesis testing: example
Hypothesis testing: example
We have a new variety of tomato. Does it have lower
resistance to fusarium wilt than the old variety? To answer this
question, we do an experiment. We expose 100 new plants to
fusarium, and found that the mean growth rate was 35 with
standard deviation 10. We know from previous experience that
the mean growth rate of the old genotype is 33.
Hypothesis testing: example
Hypothesis testing: example
We have a new variety of tomato. Does it have higher
resistance to fusarium wilt than the old variety? To answer this
question, we do an experiment. We expose 100 new plants to
fusarium, and found that the mean growth rate was 35 with
standard deviation 10. We know from previous experience that
the mean growth rate of the old genotype is 33.
Hypothesis testing: example
Some terminology
If we reject the null hypothesis, then we conclude our data
are significant, or significantly different from the null.
Some terminology
If we reject the null hypothesis, then we conclude our data
are significant, or significantly different from the null.
The probability of the test is the probability of getting
the observed test result, or more extreme, than expected
by chance, if the null hypothesis were true.
Some terminology
If we reject the null hypothesis, then we conclude our data
are significant, or significantly different from the null.
The probability of the test is the probability of getting
the observed test result, or more extreme, than expected
by chance, if the null hypothesis were true.
In a two-tailed test, the probability is twice the tail
probability. In a one-tailed test, the probability is the
one-tail probability.
Some terminology
If we reject the null hypothesis, then we conclude our data
are significant, or significantly different from the null.
The probability of the test is the probability of getting
the observed test result, or more extreme, than expected
by chance, if the null hypothesis were true.
In a two-tailed test, the probability is twice the tail
probability. In a one-tailed test, the probability is the
one-tail probability.
This probability is also called the significance or the
significance level of the test.
Some more terminology
The least significant difference:
Some more terminology
The least significant difference:
Some more terminology
The least significant difference:
t=
observed − expected
sobserved
Some more terminology
The least significant difference:
t=
observed − expected
sobserved
tcrit =
observed − expected
sobserved
Some more terminology
The least significant difference:
t=
observed − expected
sobserved
tcrit =
observed − expected
sobserved
LSD =
Some more terminology
The least significant difference:
t=
observed − expected
sobserved
tcrit =
observed − expected
sobserved
LSD = tcrit
Some more terminology
The least significant difference:
t=
observed − expected
sobserved
tcrit =
observed − expected
sobserved
LSD = tcrit sobserved
Hypothesis testing: example
We have a new variety of tomato. Does it have the same
resistance to fusarium wilt as the old variety? To answer this
question, we do an experiment. We expose 100 new plants to
fusarium, and found that the mean growth rate was 35 with
standard deviation 10. We know from previous experience that
the mean growth rate of the old genotype is 33.
Hypothesis testing: example
Hypothesis testing: example
Hypothesis testing: example
We have a new variety of tomato. Does it have the same
resistance to fusarium wilt as the old variety? To answer this
question, we do an experiment. We expose 100 new plants and
100 old plants to fusarium, and found that the mean growth
rate was 35 with standard deviation 10 for the new plants, and
mean 33 and standard deviation 8 for the old plants.
Hypothesis testing: example
Unequal variance t-test
Calculate separate variances for the two samples:
Unequal variance t-test
Calculate separate variances for the two samples:
t=
Unequal variance t-test
Calculate separate variances for the two samples:
t=
x1 − x2
Unequal variance t-test
Calculate separate variances for the two samples:
t=q
x1 − x2
s12 /n1 + s22 /n2
Unequal variance t-test
Calculate separate variances for the two samples:
t=q
x1 − x2
s12 /n1 + s22 /n2
What is the degrees of freedom?
Unequal variance t-test
Calculate separate variances for the two samples:
t=q
x1 − x2
s12 /n1 + s22 /n2
What is the degrees of freedom? Either use the smallest of
n1 − 1 and n2 − 1,
Unequal variance t-test
Calculate separate variances for the two samples:
t=q
x1 − x2
s12 /n1 + s22 /n2
What is the degrees of freedom? Either use the smallest of
n1 − 1 and n2 − 1, or use R to calculate the following degrees
of freedom:
Unequal variance t-test
Calculate separate variances for the two samples:
t=q
x1 − x2
s12 /n1 + s22 /n2
What is the degrees of freedom? Either use the smallest of
n1 − 1 and n2 − 1, or use R to calculate the following degrees
of freedom:
d.f. =
Unequal variance t-test
Calculate separate variances for the two samples:
t=q
x1 − x2
s12 /n1 + s22 /n2
What is the degrees of freedom? Either use the smallest of
n1 − 1 and n2 − 1, or use R to calculate the following degrees
of freedom:
d.f. =
(s12 /n1 + s22 /n2 )2
Unequal variance t-test
Calculate separate variances for the two samples:
t=q
x1 − x2
s12 /n1 + s22 /n2
What is the degrees of freedom? Either use the smallest of
n1 − 1 and n2 − 1, or use R to calculate the following degrees
of freedom:
d.f. =
(s12 /n1 + s22 /n2 )2
.
(s12 /n1 )2 /(n1 − 1) + (s22 /n2 )2 /(n2 − 1)
Hypothesis testing: example
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 +
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
s=
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
s=
p
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
s
s=
p
pooled variance =
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
s
s=
p
pooled variance =
(n1 − 1)si2 +
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
s
s=
p
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
s
s=
p
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
s
s=
p
pooled variance =
t=
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
s
s=
p
pooled variance =
t=
x1 − x2
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
s
s=
p
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
x1 − x2
t=p
=
2
s /n1 + s 2 /n2
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
s
s=
p
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
x1 − x2
t=p
=
2
s /n1 + s 2 /n2
x1 − x2
Equal variance t-test: use the pooled variance to
calculate the sample standard deviation
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
s
s=
p
pooled variance =
(n1 − 1)si2 + (n2 − 1)s22
n1 + n2 − 2
x1 − x2
x1 − x2
t=p
= p
2
2
s 1/n1 + 1/n2
s /n1 + s /n2
Hypothesis testing: example
We have a new variety of tomato. Does it have the same
resistance to fusarium wilt as the old variety? To answer this
question, we do an experiment. We expose 100 new plants and
100 old plants to fusarium, and found that the mean growth
rate was 35 with standard deviation 10 for the new plants, and
mean 33 and standard deviation 8 for the old plants.
Hypothesis testing: example
Hypothesis testing: one-sample test of proportion
example
We have a new variety of tomato. Does it have the same
resistance to fusarium wilt as the old variety? To answer this
question, we do an experiment. We expose 100 new plants to
fusarium, and found that 72 are resistant. We know in advance
that 65% of the old plants are resistant.
Hypothesis testing: example
Hypothesis testing: two-sample test of proportions
example
We have a new variety of tomato. Does it have the same
resistance to fusarium wilt as the old variety? To answer this
question, we do an experiment. We expose 100 new plants and
100 old plants to fusarium, and found that 72 new are
resistant, and 65 old are resistant.
Hypothesis testing: example
Two sample test of proportions
Given x1 successes in a sample of n1 , and x2 successes in a
sample of n3 . Therefore pˆ1 = x1 /n1 , and pˆ2 = x2 /n2 .
+x2
We define p = nx11 +n
, and q = 1 − p.
2
Then our test statistic is:
pˆ1 − pˆ2
Z=q
pq
pq
n1 + n2