Sociology 5811:
Lecture 11: T-Tests for
Difference in Means
Copyright © 2005 by Evan Schofer
Do not copy or distribute without
permission
Announcements
• Problem Set #3 due today
• Midterm in 2 weeks
• Details coming soon
• We are a bit ahead of readings
• Try to start on readings for next week NOW!
Hypothesis Testing
• Definition: Two-tailed test: A hypothesis test in
which the α-area of interest falls in both tails of a
Z or T distribution.
• Example: H0: μ = 4; H1: μ ≠ 4
• Definition: One-tailed test: A hypothesis test in
which the α-area of interest falls in just one tail
of a Z or T distribution.
• Example: H0: μ ≥ 4; H1: μ < 4
• Example: H0: μ ≤ 4; H1: μ > 4
• This is called a “directional” hypothesis test.
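To make the one-tail vs. two-tail distinction concrete, here is a minimal sketch that computes critical Z-values with Python's standard library. The helper name `critical_z` is illustrative and not part of the lecture; it assumes a standard normal (Z) distribution.

```python
from statistics import NormalDist


def critical_z(alpha: float, two_tailed: bool = True) -> float:
    """Return the positive critical Z-value for a given alpha level."""
    # Two-tailed: alpha is split across both tails, so each tail holds alpha/2.
    # One-tailed: the whole alpha area sits in a single tail.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


two_tailed_z = critical_z(0.05, two_tailed=True)
one_tailed_z = critical_z(0.05, two_tailed=False)
print(f"two-tailed: {two_tailed_z:.3f}, one-tailed: {one_tailed_z:.3f}")
```

At α = .05 this prints 1.960 for the two-tailed test and 1.645 for the one-tailed test: putting the whole α-area in one tail lowers the cutoff.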
Hypothesis Tests About Means
• A one-tailed test: H1: μ < 4
• Entire α-area is on left, as opposed to half (α/2)
on each side. SO: the critical t-value changes.
Hypothesis Tests About Means
• T-value changes because the alpha area (e.g., 5%)
is all concentrated in one side of the distribution,
rather than split half and half.
• One tail: α = .05 in a single tail, vs. two-tail: α/2 = .025 in each tail
Looking Up T-Tables
How much does the 95% t-value change when you switch from
a 2-tailed to a 1-tailed test?
• Two-tailed test (20 df): t = 2.086
• One-tailed test (20 df): t = 1.725
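The table lookup can also be reproduced numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names (`t_pdf`, `t_cdf`, `critical_t`) are illustrative and not from the lecture; in practice a statistics package would normally do this.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The distribution is symmetric, so the area left of zero is 0.5.
    return 0.5 + total * h / 3


def critical_t(alpha: float, df: int, two_tailed: bool = True) -> float:
    """Positive critical t-value, found by bisection on the CDF."""
    target = 1 - (alpha / 2 if two_tailed else alpha)
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_two = critical_t(0.05, 20, two_tailed=True)
t_one = critical_t(0.05, 20, two_tailed=False)
print(f"20 df, alpha = .05: two-tailed {t_two:.3f}, one-tailed {t_one:.3f}")
```

The results should agree with the tabled values for 20 df to about three decimal places.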
Review: Hypothesis Tests
• T-value changes because the alpha area (e.g., 5%)
is all concentrated in one side of the distribution,
rather than split half and half.
• One tail: α = .05 in a single tail, vs. two-tail: α/2 = .025 in each tail
• Concentrating the alpha area in one tail reduces
the critical T-value needed to reject H0
Tests for Differences in Means
• A more useful application: Two groups
• Issue: Whenever you compare two groups, you’ll
observe different means
• Question: Is the difference just an artifact of the particular
samples, while the populations have the same mean?
• Or can we infer that the populations are different?
• Example: Test scores for 20 boys, 20 girls in the
2nd grade
• Y-bar (boys) = 72.75, s = 8.80
• Y-bar (girls) = 78.20, s = 9.55
Example: Boys’ Test Scores
[Figure: histogram of test scores for BOYS. Mean = 72.8, Std. Dev = 8.80, N = 20]
Example: Girls’ Test Scores
[Figure: histogram of test scores for GIRLS. Mean = 78.2, Std. Dev = 9.55, N = 20]
Differences in Means
• Inferential statistics can help us determine if
group population means are really different
• The hypotheses we must consider are:
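$$H_0: \mu_{\text{Boys}} = \mu_{\text{Girls}}$$
$$H_1: \mu_{\text{Boys}} \neq \mu_{\text{Girls}}$$
• An alternate (equivalent) formulation:
$$H_0: \mu_{\text{Boys}} - \mu_{\text{Girls}} = 0$$
$$H_1: \mu_{\text{Boys}} - \mu_{\text{Girls}} \neq 0$$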
Differences in Means
• Issue:
$$\bar{Y}_{\text{Boys}} - \bar{Y}_{\text{Girls}} = -5.45$$
$$\mu_{\text{Boys}} - \mu_{\text{Girls}} = \; ?$$
• How likely is it to draw means with a difference
of -5.45, if the difference in population means is
really 0?
• If common, we can’t conclude anything
• If rare, we can conclude that the population
means differ.
Strategy for Mean Difference
• We never know true population means
• So, we never know true value of difference in means
• So, we don’t know if groups really differ
• If we can figure out the sampling distribution of
the difference in means…
• We can guess the range in which it typically falls
• If it is improbable for the sampling distribution to
overlap with zero, then the population means
probably differ
• An extension of the Central Limit Theorem provides
information necessary to do calculations!
Strategy for Mean Difference
• Logic of tests about differences in means:
• The C.L.T. defines how sample means (Y-bars)
cluster around the true mean:
• The center and width of the sampling distribution
• This tells us the range of values where Y-bars fall
• For any two means, the difference will also fall in
a certain range:
• Group 1 means range from about 6.0 to 8.0
• Group 2 means range from about 1.0 to 2.0
• Estimates of the difference in means will range from about
4.0 to 7.0!
A Corollary of the C.L.T.
• Visually: If each group has a sampling
distribution of the mean, the difference does too:
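[Figure: sampling distributions of the mean for group 1 (center μ1, S.D. σ(Y-bar1)) and group 2 (center μ2, S.D. σ(Y-bar2)), and the sampling distribution of differences in means (center μ2 − μ1, S.D. σ(Y-bar2 − Y-bar1))]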
A Corollary of the C.L.T.
• Example: If population means are 7 and 10,
observed difference in means will cluster around 3
[Figure: sampling distributions centered at μ1 = 7 and μ2 = 10; the sampling distribution of differences in means is centered at μ2 − μ1 = 3]
• If group 1 sample mean is 7.4, group 2 is 9.8… Difference is 2.4
A Corollary of the C.L.T.
• Example: If two groups have similar means, the
difference will be near zero
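[Figure: sampling distributions centered at μ1 = 7 and μ2 = 7.2; the sampling distribution of differences in means is centered at μ2 − μ1 = .2]
• When group means are similar, differences are usually near zero.
• But, even if group means are identical, the difference in sample
means won’t be exactly zero in most cases.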
Sampling Distribution for
Difference in Means
• The mean (Y-bar) is a variable that changes
depending on the particular sample we took
• Similarly, the difference in means for two groups varies,
depending on which two samples we chose
• The distribution of all possible estimates of the
difference in means is a sampling distribution!
• The “sampling distribution of differences in means”
• It reflects the full range of possible estimates of the
difference in means.
A Corollary of the C.L.T
• For any two random samples (of size N1, N2), drawn from
populations with means μ1, μ2 and S.D. σ1, σ2:
• The sampling distribution for the difference of
two means is normal, with mean and S.D.:
1. $\mu_{(\bar{Y}_1 - \bar{Y}_2)} = \mu_1 - \mu_2$
2. $\sigma_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\dfrac{\sigma_1^2}{N_1} + \dfrac{\sigma_2^2}{N_2}}$
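A quick simulation can illustrate this corollary. The sketch below repeatedly draws two samples and records the difference in sample means. The population means 7 and 10 follow the earlier slide example; the standard deviations, sample sizes, seed, and number of repetitions are assumptions chosen only for illustration.

```python
import math
import random
import statistics

random.seed(5811)  # arbitrary seed so the run is repeatable

MU1, SIGMA1, N1 = 7.0, 2.0, 25    # hypothetical population 1 and sample size
MU2, SIGMA2, N2 = 10.0, 2.0, 25   # hypothetical population 2 and sample size
REPS = 4000                       # number of simulated pairs of samples


def sample_mean(mu: float, sigma: float, n: int) -> float:
    """Draw n values from a normal population and return their mean."""
    return statistics.fmean(random.gauss(mu, sigma) for _ in range(n))


# Each entry is one estimate of the difference in means (group 2 minus group 1).
diffs = [sample_mean(MU2, SIGMA2, N2) - sample_mean(MU1, SIGMA1, N1)
         for _ in range(REPS)]

simulated_center = statistics.fmean(diffs)
simulated_sd = statistics.stdev(diffs)
theoretical_sd = math.sqrt(SIGMA1**2 / N1 + SIGMA2**2 / N2)

print(f"center of differences: {simulated_center:.2f} (theory: {MU2 - MU1:.2f})")
print(f"S.D. of differences:   {simulated_sd:.3f} (theory: {theoretical_sd:.3f})")
```

The simulated differences should cluster near 3, with a spread close to the value given by the standard error formula.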
A Corollary of the C.L.T
• We can calculate the standard error of
differences in means
• It is the standard deviation of the sampling distribution of
differences in means:
$$\sigma_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}}$$
• This formula tells us the dispersion of our estimates of
the difference in means.
A Corollary of the C.L.T
• Hypothesis tests using Z-distribution depend on:
• N being large
• N of both groups > 60, ideally > 100
• And, we must estimate population standard
deviations based on sample standard deviations:
$$\hat{\sigma}_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$$
Z-Values for Mean Differences
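• Finally, we can calculate a Z-value using the Z-score formula:
$$Z_{(\bar{Y}_1 - \bar{Y}_2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{\hat{\sigma}_{(\bar{Y}_1 - \bar{Y}_2)}} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}}$$
• This will be compared to a critical Z-value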
Z-Values for Mean Differences
• Visually:
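[Figure: two cases. Small Z: the difference Y-bar1 − Y-bar2 is small relative to the standard error σ̂(Y-bar1 − Y-bar2). Large Z: the difference is large relative to the standard error.]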
• Question: In which case can we reject H0?
• Answer: If observed Z is large, it is improbable
that difference in means of populations is zero.
Z-Values for Mean Differences
• Back to the example: Test score differences for
boys and girls
• Y-bar (boys) = 72.75, s = 8.80
• Y-bar (girls) = 78.20, s = 9.55
• Pretend our total N (of both groups) is “large”
• Choose α = .05, two-tailed test: critical Z = 1.96
$$H_0: \mu_{\text{Boys}} = \mu_{\text{Girls}}$$
$$H_1: \mu_{\text{Boys}} \neq \mu_{\text{Girls}}$$
Z-Values for Mean Differences
• Strategy: Calculate Z-value using formula:
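$$Z_{(\bar{Y}_1 - \bar{Y}_2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}}$$
$$Z_{(\bar{Y}_1 - \bar{Y}_2)} = \frac{5.45}{\sqrt{\dfrac{8.80^2}{20} + \dfrac{9.55^2}{20}}}$$
• Here 5.45 is the size of the difference, |72.75 − 78.20|; the sign only tells us which group scored higher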
Z-Values for Mean Differences
• Strategy: Calculate Z-value using formula:
$$Z_{(\bar{Y}_1 - \bar{Y}_2)} = \frac{5.45}{\sqrt{3.87 + 4.56}} = \frac{5.45}{2.90} = 1.88$$
• Observed Z = 1.88, critical Z = 1.96
• Question: Can we reject H0?
• Answer: NO! We are less than 95% confident that the population means differ
• Also, our N is too small to do a Z-test.
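The Z calculation above can be checked with a short script. This is a minimal sketch using the boys' and girls' figures from the example; the function name `z_for_mean_difference` is illustrative, and boys are treated as group 1, so the statistic comes out negative (about −1.877 before rounding).

```python
import math
from statistics import NormalDist


def z_for_mean_difference(mean1, s1, n1, mean2, s2, n2):
    """Return (Z, estimated standard error) for the difference mean1 - mean2."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / se, se


# Group 1 = boys, group 2 = girls (values from the example).
z, se = z_for_mean_difference(72.75, 8.80, 20, 78.20, 9.55, 20)

# Two-tailed test: compare the size of Z with the critical value 1.96.
reject_h0 = abs(z) > 1.96
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"standard error = {se:.2f}, Z = {z:.3f}")
print(f"reject H0 at alpha = .05 (two-tailed)? {reject_h0}")
print(f"two-tailed tail area = {p_two_tailed:.3f}")
```

The two-tailed tail area is slightly above .05, which matches the decision not to reject H0.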
Mean Differences for Small Samples
• Sample Size: rule of thumb
• Total N (of both groups) > 100 can safely be
treated as “large” in most cases
• Total N (of both groups) < 100 is possibly
problematic
• Total N (of both groups) < 60 is considered
“small” in most cases
• If N is small, the sampling distribution of mean
difference cannot be assumed to be normal
• Again, we turn to the T-distribution.
Mean Differences for Small Samples
• To use T-tests for small samples, the following
criteria must be met:
• 1. Both samples are randomly drawn from
normally distributed populations
• 2. Both samples have roughly the same variance
(and thus same standard deviation)
• To the extent that these assumptions are violated,
the T-test will become less accurate
• Check histogram to verify!
• But, in practice, T-tests are fairly robust.
Mean Differences for Small Samples
• For small samples, the estimator of the Standard
Error is derived from the variance of both groups
(i.e. it is “pooled”)
• Formula:
$$s_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{(N_1 - 1)(s_1^2) + (N_2 - 1)(s_2^2)}{N_1 + N_2 - 2}}$$
Probabilities for Mean Difference
• A T-value may be calculated:
$$t_{(N_1 + N_2 - 2)} = \frac{(\bar{Y}_1 - \bar{Y}_2)}{s_{(\bar{Y}_1 - \bar{Y}_2)} \sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}}$$
• Where (N1 + N2 – 2) refers to the number of
degrees of freedom
– Recall, t is a “family” of distributions
– Look up t-dist for “N1 + N2 -2” degrees of freedom.
T-test for Mean Difference
• Back to the example: 20 boys & 20 girls
• Boys: Y-bar = 72.75, s = 8.80
• Girls: Y-bar = 78.20, s = 9.55
• Let’s do a hypothesis test to see if the means
differ:
• Use α-level of .05
• H0: Means are the same (μ_boys = μ_girls)
• H1: Means differ (μ_boys ≠ μ_girls).
T-test for Mean Difference
• Calculate t-value:
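$$t_{(N_1 + N_2 - 2)} = \frac{(\bar{Y}_1 - \bar{Y}_2)}{s_{(\bar{Y}_1 - \bar{Y}_2)} \sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}}$$
$$t_{(38)} = \frac{(5.45)}{s_{(\bar{Y}_1 - \bar{Y}_2)} \sqrt{\dfrac{1}{20} + \dfrac{1}{20}}}$$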
T-Test for Mean Difference
• We need to calculate the pooled estimate that goes into the
Standard Error of the difference in means:
$$s_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{(N_1 - 1)(s_1^2) + (N_2 - 1)(s_2^2)}{N_1 + N_2 - 2}}$$
$$s_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{(19)(8.80^2) + (19)(9.55^2)}{38}}$$
T-Test for Mean Difference
• Completing the pooled calculation for the Standard Error of
the difference in means:
$$s_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{(1471.36) + (1732.85)}{38}}$$
$$s_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{84.32} = 9.18$$
T-test for Mean Difference
• Plugging in Values:
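$$t_{(N_1 + N_2 - 2)} = \frac{(\bar{Y}_1 - \bar{Y}_2)}{s_{(\bar{Y}_1 - \bar{Y}_2)} \sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}}$$
$$t_{(38)} = \frac{(5.45)}{(9.18) \sqrt{\dfrac{1}{20} + \dfrac{1}{20}}}$$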
T-test for Mean Difference
$$t_{(38)} = \frac{(5.45)}{(9.18)(.316)}$$
$$t_{(38)} = \frac{(5.45)}{(2.90)} = 1.88$$
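The pooled calculation can be reproduced in a few lines. This is a minimal sketch using the example's values; the function names `pooled_sd` and `pooled_t` are illustrative, and boys are again treated as group 1, so the t-value is negative.

```python
import math


def pooled_sd(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled estimate from both groups' variances, weighted by N - 1."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, degrees of freedom) for the difference mean1 - mean2."""
    sp = pooled_sd(s1, n1, s2, n2)
    se = sp * math.sqrt(1 / n1 + 1 / n2)  # standard error of the difference
    return (mean1 - mean2) / se, n1 + n2 - 2


sp = pooled_sd(8.80, 20, 9.55, 20)
t_value, df = pooled_t(72.75, 8.80, 20, 78.20, 9.55, 20)

print(f"pooled estimate = {sp:.2f}")
print(f"t({df}) = {t_value:.2f}")
```

Because both groups have N = 20, the pooled standard error equals the unpooled one used for the Z-value, so the two statistics coincide for this example.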
T-Test for Mean Difference
• Question: What is the critical value for α = .05,
two-tailed T-test, 38 degrees of freedom (df)?
• Answer: Critical Value = approx. 2.03
• Observed T-value = 1.88
• Can we reject the null hypothesis (H0)?
• Answer: No! Not quite!
• We reject only when the observed t exceeds the critical value
(in absolute value, for a two-tailed test)
T-Test for Mean Difference
• The two-tailed test hypotheses were:
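$$H_0: \mu_{\text{Boys}} = \mu_{\text{Girls}}$$
$$H_1: \mu_{\text{Boys}} \neq \mu_{\text{Girls}}$$
• Question: What hypotheses would we use for the
one-tailed test?
$$H_0: \mu_{\text{Boys}} \geq \mu_{\text{Girls}}$$
$$H_1: \mu_{\text{Boys}} < \mu_{\text{Girls}}$$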
T-Test for Mean Difference
• Question: What is the critical value for α = .05,
one-tailed T-test, 38 degrees of freedom (df)?
• Answer: Around 1.684 (the table value for 40 df, the nearest row)
• One-tailed test: T = 1.88 > 1.684
• We can reject the null hypothesis!!!
• Moral of the story:
• If you have strong directional suspicions ahead of time, use
a one-tailed test. It increases your chances of rejecting H0.
• But, it wouldn’t have made a difference at α = .01
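Another way to see these decisions is to compute the tail area beyond the observed t and compare it with α directly. The sketch below does this with the standard library only, integrating the t density by Simpson's rule. The function names are illustrative, and the observed value 1.88 with 38 df is taken from the example.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def upper_tail_area(t: float, df: int, steps: int = 2000) -> float:
    """P(T >= t) for t >= 0: 0.5 minus the Simpson's-rule area over [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


T_OBS, DF = 1.88, 38               # observed t and degrees of freedom

p_one = upper_tail_area(T_OBS, DF)  # one-tailed: area in a single tail
p_two = 2 * p_one                   # two-tailed: same area in both tails

print(f"one-tailed area = {p_one:.3f}, two-tailed area = {p_two:.3f}")
print(f"two-tailed, alpha = .05: reject? {p_two < 0.05}")
print(f"one-tailed, alpha = .05: reject? {p_one < 0.05}")
print(f"one-tailed, alpha = .01: reject? {p_one < 0.01}")
```

The one-tailed area falls between .01 and .05, so H0 is rejected only for the one-tailed test at α = .05, in line with the table-based conclusions.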
T-Test for Mean Difference
• Question: What if you wanted to compare 3 or
more groups, instead of just two?
• Example: Test scores for students in different educational
tracks: honors, regular, remedial
• Can you use T-tests for 3+ groups?
• Answer: Sort of… You can do a T-test for every
combination of groups
• e.g., honors & reg, honors & remedial, reg & remedial
• But, the possibility of a Type I error
proliferates… 5% for each test
• With 5 groups there are 10 pairwise tests, so the chance of at least
one error can approach 50% (about 40% if the tests were independent)
• Solution: ANOVA.
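The way Type I error accumulates across pairwise tests can be sketched numerically. The function names below are illustrative. The "independent tests" figure is an assumption made for simplicity, since pairwise t-tests that share groups are not truly independent; the simple sum of 5% per test is an upper bound.

```python
import math


def pairwise_tests(groups: int) -> int:
    """Number of distinct pairs of groups, i.e. 'groups choose 2'."""
    return math.comb(groups, 2)


def error_if_independent(groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one Type I error, ASSUMING independent tests."""
    return 1 - (1 - alpha) ** pairwise_tests(groups)


def error_upper_bound(groups: int, alpha: float = 0.05) -> float:
    """Upper bound: alpha added up over every test (capped at 1)."""
    return min(1.0, pairwise_tests(groups) * alpha)


for k in (3, 4, 5):
    print(f"{k} groups: {pairwise_tests(k)} tests, "
          f"error if independent = {error_if_independent(k):.2f}, "
          f"upper bound = {error_upper_bound(k):.2f}")
```

With 5 groups there are 10 tests, giving roughly a 40% chance of at least one false rejection under independence and a bound of 50%.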