Student’s t-Distribution
The t-Distribution, t-Tests, Measures
of Effect Size, & Managing Violations
of Assumptions
Sampling Distributions Redux
• Chapter 7 opens with a return to the
concept of sampling distributions from
chapter 4
– Sampling distributions of the mean
Sampling Distribution of the Mean
• Because the SDotM is so important in statistics,
you should understand it
• The SDotM is governed by the Central Limit
Theorem
Given a population with a mean μ and a
variance σ², the sampling distribution
of the mean (the distribution of sample
means) will have a mean equal to μ, a
variance equal to σ²/n, and a standard
deviation equal to σ/√n. The
distribution will approach the normal
distribution as n, the sample size,
increases. (p. 178)
Sampling Distribution of the Mean
Translation:
1. For any population with a given mean and
variance, the sampling distribution of the mean
will have:
• µx̄ = µ
• σx̄² = σ²/n
• σx̄ = σ/√n
2. As n increases, the sampling distribution of the
mean approaches a normal curve
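The two claims above can be checked with a small simulation. This is only a sketch under stated assumptions: the exponential population (mean 1, variance 1), the sample size of 25, the number of replications, and the function name `simulate_sample_means` are all illustrative choices, not part of the lecture.

```python
import random
import statistics

def simulate_sample_means(n, reps=20000, seed=1):
    """Draw `reps` samples of size n from a skewed (exponential) population
    with mean 1 and variance 1, and return the list of sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

n = 25
means = simulate_sample_means(n)
mean_of_means = statistics.fmean(means)      # should sit close to mu = 1
var_of_means = statistics.pvariance(means)   # should sit close to sigma^2 / n

print(f"mean of sample means:     {mean_of_means:.3f} (theory: 1.000)")
print(f"variance of sample means: {var_of_means:.4f} (theory: {1 / n:.4f})")
```

Even though the population is strongly skewed, the sample means cluster around µ with a variance near σ²/n, which is what the theorem predicts.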
Sampling Distribution of the Mean
• Analysis:
– Although µx̄ matches µ regardless of sample size…
– The relationships between…
• σx̄² and σ²
• σx̄ and σ
– …differ as a function of the sample size
• We saw this in our sampling distribution of the
mean example from chapter 4…
So, you wanna test
a hypothesis, do ya?
• Our understanding of sampling and
sampling distributions now allows us to
test hypotheses
• How we test a hypothesis depends on the
information we have available
Choosing a Test
1. Which variables are available?
– µ?
– σ?
– s?
2. How many data sets are you presented with?
– 1
– 2
3. Do your data sets come from 1 or 2 groups?
– 1
– 2
Testing Hypotheses about Means:
The Rare Case of Knowing σ
• So far, to test the probability of finding a particular score, we’ve used the Standard Normal Distribution
– IQ = 122
– µ = 100
– σ = 15
$z = \frac{x - \mu}{\sigma} = \frac{122 - 100}{15} = \frac{22}{15} = 1.47$
−1.96 < z < 1.96, so we fail to reject H0
How the z-Test Works
• How does our test change when we test
group means, not just individual scores?
– We use the central limit theorem
How the z-Test Works
n = 1: $z = \frac{122 - 100}{15 / \sqrt{1}} = \frac{22}{15 / 1} = \frac{22}{15} = 1.47$
n = 2: $z = \frac{122 - 100}{15 / \sqrt{2}} = \frac{22}{15 / 1.41} = \frac{22}{10.64} = 2.07$
n = 100: $z = \frac{122 - 100}{15 / \sqrt{100}} = \frac{22}{15 / 10} = \frac{22}{1.5} = 14.67$
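The three calculations above follow one formula, so they can be reproduced in a few lines. This is a minimal sketch; the helper name `z_for_mean` is hypothetical and only the IQ numbers come from the slides.

```python
import math

def z_for_mean(sample_mean, mu, sigma, n):
    """z = (x-bar - mu) / (sigma / sqrt(n)); with n = 1 this is the
    ordinary z-score for a single observation."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu) / standard_error

# Same mean difference (122 vs. 100, sigma = 15) at three sample sizes
for n in (1, 2, 100):
    print(f"n = {n:>3}: z = {z_for_mean(122, 100, 15, n):.2f}")
```

The numerator never changes; only the standard error shrinks as n grows, which is why the same 22-point difference becomes easier to detect.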
How the z-Test Works
• Large samples reduce the amount of random
variance (sampling error)
– More confidence that the sample mean approximates the
population mean
• Larger samples improve our ability to detect
differences between samples and populations
For n = 1: $z = \frac{x - \mu}{\sigma} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Testing Hypotheses:
When σ Is Unknown
• Generally, the population standard
deviation, σ, is unknown to us
• Occasionally, we will know the population
mean, µ, when we don’t know σ
• In these situations, the standard normal
distribution no longer meets our needs
Testing Hypotheses:
When σ Is Unknown
• Knowing µ…
– We can produce an estimate of σ from s
– Using s changes the nature of the test we are
conducting, as s is not distributed in the same
fashion as σ
• Sampling distribution of the sample standard
deviation is NOT normally distributed
– Strong positive skew
Testing Hypotheses:
When σ Is Unknown
[Figure: the sampling distribution of s compared with the sampling distribution of σ]
So How Does s Estimate σ?
• Given the differences in distribution shape, it is
easy to conclude that s ≠ σ
– s² is an unbiased estimator of σ² over repeated
samplings
– However, because its sampling distribution is positively skewed, a SINGLE value of s² is more likely to
underestimate σ² than to overestimate it
• Because of this fact, small samples will systematically
tend to underestimate σ when it is estimated by s
– Dividing by an estimate that is too small leads any given statistic calculated from this
distribution to tend to be > a comparable value of z
– We can no longer use z → we use t
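The skew argument above can be illustrated by simulation. The sketch below assumes a standard normal population (σ² = 1) and samples of n = 5; the seed and the number of replications are arbitrary choices made for the example.

```python
import random
import statistics

rng = random.Random(42)
sigma2 = 1.0    # true population variance (standard normal population)
n = 5           # a deliberately small sample size
reps = 20000

# Sample variance (n - 1 in the denominator) for each simulated sample
variances = [statistics.variance([rng.gauss(0, 1) for _ in range(n)])
             for _ in range(reps)]

mean_s2 = statistics.fmean(variances)                     # long-run average of s^2
share_below = sum(v < sigma2 for v in variances) / reps   # how often s^2 < sigma^2

print(f"average s^2 over {reps} samples: {mean_s2:.3f}")
print(f"share of samples with s^2 below sigma^2: {share_below:.3f}")
```

The average of the sample variances lands near 1, yet clearly more than half of the individual samples give a variance below 1. That combination (unbiased on average, too small most of the time) is what pushes the test statistic above the comparable z.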
t and the t-Distribution
• Developed by “Student” (the pen name of William Sealy Gosset) while he was
working for the Guinness Brewing Co.
1. The shape of the t-distribution is a direct
function of the size of the sample we are
examining
2. For small samples, the t-distribution is
somewhat flatter than the standard normal
distribution, with a lower peak and fatter tails
t and the t-Distribution
3. As sample size increases:
• The t-distribution approaches a normal
distribution
• Theoretically, the closer the sample size
comes to infinity, the more the t-distribution looks like a
normal distribution
• Practically, the two are nearly identical when n ~ 100 – 120
t and the t-Distribution
4. Identifying values of t associated with a
given rejection region depends on:
– α
– the number of tails associated with the test
– the degrees of freedom available in the analysis
• For this one-sample test, df = n − 1, because we used
one degree of freedom calculating s using the sample
mean and not the population mean.
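To make the dependence on α, tails, and df concrete, the sketch below computes critical values without a printed table. It is one possible approach, not the method used in the lecture: the names `t_cdf` and `t_crit` are hypothetical, the cumulative probability is obtained by Simpson's rule on the t density, and the critical value is found by bisection.

```python
import math

def t_cdf(x, df, steps=1000):
    """P(T <= x) for Student's t, integrating the density from 0 to |x|
    with Simpson's rule and using the symmetry of the distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_crit(alpha, tails, df):
    """Positive critical value that leaves alpha / tails in the upper tail."""
    target = 1 - alpha / tails
    lo, hi = 0.0, 100.0
    for _ in range(50):                # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"two-tailed, alpha = .05, df = 23: {t_crit(0.05, 2, 23):.3f}")
print(f"one-tailed, alpha = .05, df = 20: {t_crit(0.05, 1, 20):.3f}")
```

Changing any one of the three inputs moves the boundary of the rejection region, which is why all three must be fixed before the test is run.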
One-Sample t-Test
$t = \frac{\bar{x} - \mu}{s_{\bar{x}}}$ or $t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}}$ or $t = \frac{\bar{x} - \mu}{\sqrt{s_x^2 / n}}$
z-Test vs. One-Sample t-Test
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
$t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}}$
Note the similarities between these tests: ONLY the
source of “variance” and the distribution you test
against have changed!
Using the One-Sample t-Test
• You are on the admissions board for a
graduate school of Psychology.
• You are attempting to determine whether the GRE
scores of the students applying to your program
are competitive with the national average.
– µVerbal = 569
• SPSS output from your data
Descriptive Statistics
                     N    Range    Mean       Std. Deviation
GRE                  24   310.00   659.7917   86.43267
Valid N (listwise)   24
Using the One-Sample t-Test
• Research Hypothesis:
– The GRE scores from your applicants differ
from the population norms
• H1: µa ≠ µp
or
• ES > 0
• Null Hypothesis
– The GRE scores from your applicants do not
differ from the population norms
• H0: µa = µp
or
• ES = 0
• Evaluate the students’ GRE-V scores
Using the One-Sample t-Test
• Select:
– Rejection region
• α = .05
– “Tail” or directionality
• We don’t know exactly how the students will
score: we just expect them to show scores
differing from the population values
• Might predict higher scores…
Using the One-Sample t-Test
• Generate sampling distribution of the
mean assuming H0 is true
– One-Sample t-test
• Given our sampling distribution:
– Conduct the statistical test
Using the One-Sample t-Test
µVerbal = 569
x-bar = 659.79
s = 86.43
n = 24
$t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}} = \frac{659.79 - 569}{86.43 / \sqrt{24}} = \frac{90.79}{86.43 / 4.90} = \frac{90.79}{17.64} = 5.15$
This numerical value is called tobt
tobt(23) = 5.15
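The hand calculation can be reproduced from the summary statistics alone. A minimal sketch, assuming the unrounded mean and standard deviation from the SPSS descriptives; the function name `one_sample_t` is illustrative.

```python
import math

def one_sample_t(sample_mean, mu, s, n):
    """Return (t, df) for a one-sample t-test computed from summary statistics."""
    standard_error = s / math.sqrt(n)      # s_x / sqrt(n)
    t = (sample_mean - mu) / standard_error
    return t, n - 1

# GRE-Verbal example: x-bar = 659.7917, s = 86.43267, n = 24, mu = 569
t_obt, df = one_sample_t(659.7917, 569, 86.43267, 24)
print(f"t({df}) = {t_obt:.3f}")
```

Carrying full precision gives a value that rounds to 5.15, in line with the hand calculation.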
Using the One-Sample t-Test
• SPSS Output
One-Sample Test (Test Value = 569, i.e., µ)
                                                       95% Confidence Interval of the Difference
       t       df   Sig. (2-tailed)   Mean Difference   Lower      Upper
GRE    5.146   23   .000              90.79167          54.2943    127.2890
tobt(23) = 5.15
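The confidence interval in the SPSS output can be approximated by hand as the mean difference plus or minus tcrit times the standard error. The sketch below assumes the tabled two-tailed critical value of 2.069 for df = 23 at α = .05; because that value is rounded, the limits differ from SPSS in the second decimal place.

```python
import math

mean_diff = 659.7917 - 569            # sample mean minus the test value
standard_error = 86.43267 / math.sqrt(24)
t_critical = 2.069                    # assumed: tabled value, two-tailed, alpha = .05, df = 23

lower = mean_diff - t_critical * standard_error
upper = mean_diff + t_critical * standard_error
print(f"95% CI of the difference: [{lower:.2f}, {upper:.2f}]")
```

Because the interval excludes 0, it leads to the same decision as comparing tobt with tcrit.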
Evaluating Statistical
Significance of the t-Test
• First note:
– α = .05
– Tail or directionality: two-tailed
– t-Value = 5.15
– Degrees of freedom (df)
• For the One-Sample t-Test, df = n-1 (24-1 = 23)
• Estimating s from x-bar (not σ from µ)
Evaluating Statistical
Significance of the t-Test
• In the past you…
– Identified a tabled value of tcrit
– Compared tcrit to the tobt value
– If tobt falls into the rejection region identified by
tcrit, then we reject H0
– If tobt does not fall into the rejection region
identified by tcrit, then we fail to reject H0
• SPSS simplifies matters by exactly
calculating p for us
Using the One-Sample t-Test
• SPSS Output
One-Sample Test (Test Value = 569, i.e., µ)
                                                       95% Confidence Interval of the Difference
       t       df   Sig. (2-tailed)   Mean Difference   Lower      Upper
GRE    5.146   23   .000              90.79167          54.2943    127.2890
tobt(23) = 5.15, p < .05
Exact probability ≈ .000003
Evaluating Statistical
Significance of the t-Test
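[Figure: t-distribution centered at 0, with rejection regions beyond tcrit = −2.069 and tcrit = 2.069; tobt = 5.15 lies in the upper rejection region]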
Because tobt falls within the rejection region
identified by tcrit we reject H0
Testing Hypotheses:
Two Matched (Repeated) Samples
• Sometimes, we’re interested in how a single set
of scores changes over time
– Psychotherapy tx influences depression
– Patients respond to medication
– Consumer attitudes before and after an advertisement
– Changes in citizen attitudes following the State of the
Union address
• When we look at two sets of scores collected
from a single sample at different time points, we
need to use a matched samples test
Matched Samples
• Matched samples
– Use the same participants at two or more
different time points to collect similar data
• MUST BE THE SAME SAMPLE!
[Diagram: Time 1 (BDI-II) → Wait 30 Days → Time 2 (BDI-II)]
Matched Samples Test
• With a matched samples test, you are
testing the change in scores between the
two administrations of the test
– H0: µ1 = µ2
– H0: µ1 - µ2 = 0
or
ES = 0
• This is truly the null hypothesis for the matched
samples test
Matched Samples Test
• Essentially, the group means at each time
point mean little to us
– Change in scores is the key
– Conduct this test by obtaining the average
difference score between the two time points
Matched Samples Test
$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}}$
D-bar represents the average
difference score
between time points
sD is the standard
deviation of the difference
scores
−0 may seem redundant,
but isn’t: it is the mean difference expected under H0 (µ1 − µ2 = 0)
Calculating the
Matched Samples t-Test
• You are a researcher examining the
impact of a new therapy intervention on
the incidence of self-injurious behavior
(SIB)
• You collect a measure of the frequency of
self-injurious acts when clients enter your
treatment (time 1)
• You collect a measure of the frequency of
self-injurious acts two weeks later (time 2)
Calculating the
Matched Samples t-Test
• Research Hypothesis:
– The new treatment will change SIB scores
• H1: µ1 ≠ µ2
or
• ES > 0
• Null Hypothesis
– The SIB scores at time 2 will be the same as
the scores at time 1 (no change)
• H0: µ1 = µ2
• H0: µ1 − µ2 = 0
or
• ES = 0
• Evaluate SIB at time 1 & time 2
Using the One-Sample t-Test
• Select:
– Rejection region
• α = .05
– “Tail” or directionality
• We don’t know exactly how the treatment will
work, so we’d better use a two-tailed test
Using the One-Sample t-Test
• Generate sampling distribution of the
mean assuming H0 is true
– Matched Samples t-test
• Given our sampling distribution:
– Conduct the statistical test
Calculating the
Matched Samples t-Test
Time 1   13  14   8  10  11  13  15  16  19  10   7
Time 2    8  10   4   7  10   9  11   9  17   6   2
D         5   4   4   3   1   4   4   7   2   4   5
D²       25  16  16   9   1  16  16  49   4  16  25
∑D = 43
D-bar = 3.91
∑D² = 193
(∑D)² = 1849
Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
time1                11   7.00      19.00     12.3636   3.58532
time2                11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   11
Calculating the
Matched Samples t-Test
$s_D^2 = \frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n - 1} = \frac{193 - \frac{43^2}{11}}{11 - 1} = \frac{193 - \frac{1849}{11}}{10} = \frac{193 - 168.09}{10} = \frac{24.91}{10} = 2.49$
$s_D = \sqrt{2.49} = 1.58$
Calculating the
Matched Samples t-Test
$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}} = \frac{3.91 - 0}{1.58 / \sqrt{11}} = \frac{3.91}{1.58 / 3.32} = \frac{3.91}{.48} = 8.15$
tobt = 8.15
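The same test can be run directly on the raw scores, which avoids the rounding in the hand calculation. A minimal sketch using only the Python standard library; the variable names are illustrative.

```python
import math
import statistics

time1 = [13, 14, 8, 10, 11, 13, 15, 16, 19, 10, 7]
time2 = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]

# Difference score for each participant (time 1 minus time 2)
diffs = [a - b for a, b in zip(time1, time2)]

d_bar = statistics.mean(diffs)       # average difference score
s_d = statistics.stdev(diffs)        # SD of the differences (n - 1 denominator)
n = len(diffs)                       # number of PAIRS
t_obt = (d_bar - 0) / (s_d / math.sqrt(n))

print(f"D-bar = {d_bar:.2f}, sD = {s_d:.2f}, t({n - 1}) = {t_obt:.3f}")
```

With unrounded intermediate values the statistic comes out slightly above the hand-computed 8.15; the gap arises because the hand calculation rounds the standard error to .48.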
Evaluating Statistical
Significance of the t-Test
• First note:
– α = .05
– Tail or directionality: two-tailed
– t-Value = 8.15
– Degrees of freedom (df)
• For the Matched Samples t-Test:
– df = number of PAIRS of scores -1
– df = 11 - 1 = 10
– Again, we can calculate p exactly with SPSS
Calculating the
Matched Samples t-Test
• SPSS Output
Paired Samples Correlations
                         N    Correlation   Sig.
Pair 1 time1 & time2     11   .916          .000
Paired Samples Test
Paired Differences, Pair 1 time1 − time2:
Mean      Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t       df   Sig. (2-tailed)
3.90909   1.57826          .47586            2.84880        4.96938        8.215   10   .000
tobt (10) = 8.15, p < .05 (SPSS reports t = 8.215; the hand calculation differs only through rounding)
p ≈ .0000009
Evaluating Statistical
Significance of the t-Test
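[Figure: t-distribution centered at 0, with rejection regions beyond tcrit = −2.228 and tcrit = 2.228; tobt = 8.15 lies in the upper rejection region]
Because tobt falls within the rejection region
identified by tcrit we reject H0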
Testing Hypotheses:
Two Independent Samples
• Probably the most common use of the t-Test and the t-distribution
• Compare the mean scores of two groups
on a single variable
– IV: Groups
– DV: Variable of interest
• Groups must be independent of one
another
– Scores in 1 group cannot influence scores in
the other group
Independent Samples t-Test
$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{x}_1 - \bar{x}_2}}$
or
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
This test is calculated by dividing the mean
difference between two groups by the “dispersion”
or “variation” observed between the two groups
Independent Samples t-Test:
Degrees of Freedom
• 1 df lost for each σ estimated by s using x-bar
• Since there are two independent groups in
this analysis, we must estimate σ twice
• df = (n1 + n2) - 2
Independent Samples t-Test:
Example
• Let’s return to the example used for the
matched samples test
• As a competent researcher, you realize
that simply showing a change over time is
not enough to prove the efficacy of your
treatment
– People spontaneously change over time
• Show that an untreated control group does
not change over the same period of time
that your treatment group does change
Independent Samples t-Test:
Example
[Diagram: SIB scores for the Tx group and the Ctrl group at Time 1 (=), Time 2 (?), and Time 3, with Tx delivered between time points]
Independent Samples t-Test:
Example
• At time 1, the control and treatment SIB
groups have equal SIB scores
• Administer the treatment for 2 weeks to Tx
group
– The Control group receives no intervention
during these two weeks
• Compare SIB scores of Tx and Control
group after 2 weeks
• Provide Control group w/ intervention if
desired
Independent Samples t-Test:
Example
• Research Hypothesis:
– Your treatment for SIB will reduce SIB
scores in the Tx group after 2 weeks
• H1: µt < µc
• Null Hypothesis
– Your treatment for SIB will have no effect
• H0: µt = µc
• Evaluate the efficacy of your treatment
Independent Samples t-Test:
Example
Time 2 Data
Control  12  13  10   9  11   8  16  13  15  16  12
Tx        8  10   4   7  10   9  11   9  17   6   2
          Ctrl Group   Tx Group
∑x        135          93
∑x²       1729         941
(∑x)²     18225        8649
x-bar     12.27        8.45
s²        7.22         15.47
s         2.69         3.93
n         11           11
Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
ctrl                 11   8.00      16.00     12.2727   2.68667
tx                   11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   11
Independent Samples t-Test:
Example
• Select:
– Rejection region
• α = .05
– “Tail” or directionality
• We have evidence that the treatment probably
works, so we make a one-tailed hypothesis here
(scores for the Tx group will be lower than the
Control group at time 2)
Independent Samples t-Test:
Example
• Generate sampling distribution of the
mean assuming H0 is true
– Independent Samples t-Test
• Given our sampling distribution:
– Conduct the statistical test
Independent Samples t-Test:
Example
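$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{8.45 - 12.27}{\sqrt{\frac{15.47}{11} + \frac{7.22}{11}}} = \frac{-3.82}{\sqrt{1.41 + .66}} = \frac{-3.82}{\sqrt{2.07}} = \frac{-3.82}{1.44} = -2.65$
tobt(20) = -2.65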
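The calculation above can be checked against the raw Time 2 scores. A minimal sketch with the standard library only; treating the Tx group as group 1 follows the order used in the hand calculation.

```python
import math
import statistics

ctrl = [12, 13, 10, 9, 11, 8, 16, 13, 15, 16, 12]
tx = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]

# Standard error of the difference: sqrt(s1^2/n1 + s2^2/n2)
var_tx = statistics.variance(tx)
var_ctrl = statistics.variance(ctrl)
se_diff = math.sqrt(var_tx / len(tx) + var_ctrl / len(ctrl))

t_obt = (statistics.mean(tx) - statistics.mean(ctrl)) / se_diff
df = len(tx) + len(ctrl) - 2

print(f"s2_tx = {var_tx:.2f}, s2_ctrl = {var_ctrl:.2f}")
print(f"SE of difference = {se_diff:.5f}, t({df}) = {t_obt:.3f}")
```

The sign of t only reflects which group is subtracted from which; SPSS reports the same magnitude with the groups in the opposite order.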
Evaluating Statistical
Significance of the t-Test
• First note:
– α = .05
– Tail or directionality: one-tailed
– t-Value = -2.65
– Degrees of freedom (df)
• For the Independent Samples t-Test
– (n1 + n2) - 2
– (11+11)-2
– 22 - 2 = 20
Evaluating Statistical
Significance of the t-Test
• SPSS Output
Independent Samples Test: Self-Injurious Behavior
Levene's Test for Equality of Variances: F = .518, Sig. = .480
t-test for Equality of Means
                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       2.658   20       .015              3.81818           1.43625
Equal variances not assumed   2.658   17.663   .016              3.81818           1.43625
tobt(20) = -2.65, p < .05
p ≈ .015
Evaluating Statistical
Significance of the t-Test
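[Figure: t-distribution centered at 0, with a one-tailed rejection region below tcrit = −1.725; tobt = −2.65 lies in the rejection region]
Because tobt falls within the rejection region
identified by tcrit we reject H0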
Independent Samples t-Test: One
Complication
• There is a slight
problem with the form
of the equation we
used…
– Paired with df = (n1 + n2) − 2, it can ONLY be applied
to groups with equal
sample sizes
– A major limitation in
real-world research
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Pooled Variance Estimate
• This equation permits tests with different
sample sizes
• Generates a single estimate of the variance by
averaging the two group variances, weighted by the
size (df) of each group
– Therefore, larger samples have a greater
impact on the variance estimate
– Vice-versa for small samples
Pooled Variance Estimate
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Using the Pooled Variance
Estimate
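$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
becomes
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$
which simplifies to
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$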
Using the Pooled Variance
Estimate: Example
Time 2 Data
Control  11  16  13  15  16  12  (No Data for the remaining 5 participants)
Tx        8  10   4   7  10   9  11   9  17   6   2
          Ctrl Group   Tx Group
∑x        83           93
∑x²       1171         941
(∑x)²     6889         8649
x-bar     13.83        8.45
s²        4.57         15.47
s         2.14         3.93
n         6            11
Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
ctrl                 6    11.00     16.00     13.8333   2.13698
tx                   11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   6
Using the Pooled Variance
Estimate: Example
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(11 - 1)15.47 + (6 - 1)4.57}{11 + 6 - 2} = \frac{(10)15.47 + (5)4.57}{15} = \frac{154.7 + 22.85}{15} = \frac{177.55}{15} = 11.84$
Using the Pooled Variance
Estimate: Example
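$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{8.45 - 13.83}{\sqrt{11.84 \left( \frac{1}{11} + \frac{1}{6} \right)}} = \frac{-5.38}{\sqrt{11.84(.0909 + .1667)}} = \frac{-5.38}{\sqrt{11.84(.2576)}} = \frac{-5.38}{\sqrt{3.05}} = \frac{-5.38}{1.75} = -3.07$
tobt(15) = -3.07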
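The pooled-variance calculation for unequal group sizes can be sketched in code as follows. The function name `pooled_t` is hypothetical; the data are the Time 2 scores from the example, with the Tx group as group 1.

```python
import math
import statistics

def pooled_t(group1, group2):
    """Independent-samples t using the pooled variance estimate.
    Returns (t, df, pooled_variance)."""
    n1, n2 = len(group1), len(group2)
    s1_sq = statistics.variance(group1)
    s2_sq = statistics.variance(group2)
    df = n1 + n2 - 2
    # Each group's variance is weighted by its degrees of freedom
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    se_diff = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_diff
    return t, df, sp_sq

tx = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]
ctrl = [11, 16, 13, 15, 16, 12]

t_obt, df, sp_sq = pooled_t(tx, ctrl)
print(f"pooled variance = {sp_sq:.2f}, t({df}) = {t_obt:.3f}")
```

Working from unrounded values gives a magnitude of about 3.08, which matches the SPSS output; the hand-computed −3.07 differs only through rounding.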
Evaluating Statistical
Significance of the t-Test
• First note:
– α = .05
– Tail or directionality: one-tailed
– t-Value = -3.07
– Degrees of freedom (df)
• For the Independent Samples t-Test
– (n1 + n2) - 2
– (11+6)-2
– 17 - 2 = 15
Evaluating Statistical
Significance of the t-Test
• SPSS Output
Independent Samples Test: Self-Injurious Behavior
Levene's Test for Equality of Variances: F = .714, Sig. = .411
t-test for Equality of Means
                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       3.080   15       .008              5.37879           1.74614
Equal variances not assumed   3.653   14.979   .002              5.37879           1.47232
tobt(15) = -3.07, p < .05
p ≈ .0076
Evaluating Statistical
Significance of the t-Test
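[Figure: t-distribution centered at 0, with a one-tailed rejection region below tcrit = −1.753; tobt = −3.07 lies in the rejection region]
Because tobt falls within the rejection region
identified by tcrit we reject H0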
Effect Size of The Independent
Samples t-Test
$d = \frac{\mu_1 - \mu_2}{\sigma}$
or
$d = \frac{\bar{X}_1 - \bar{X}_2}{s_p}$
We use the same effect size conventions
we identified for the Matched Samples test
Effect Size of The Independent
Samples t-Test
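$d = \frac{\bar{X}_1 - \bar{X}_2}{s_p} = \frac{8.45 - 13.83}{\sqrt{11.84}} = \frac{-5.38}{3.44} = -1.56$
Note that the denominator is the pooled standard deviation sp, the square root of
the pooled variance (11.84), not the variance itself
An effect size well beyond the convention
for a large effect (|d| = .80)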
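Because d is defined with the pooled standard deviation sp in the denominator, the pooled variance of 11.84 has to be square-rooted before dividing. The sketch below recomputes d from the raw Time 2 scores on that basis; the function name `cohens_d` is illustrative.

```python
import math
import statistics

def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled STANDARD DEVIATION."""
    n1, n2 = len(group1), len(group2)
    sp_sq = ((n1 - 1) * statistics.variance(group1)
             + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    s_p = math.sqrt(sp_sq)             # square root of the pooled variance
    return (statistics.mean(group1) - statistics.mean(group2)) / s_p

tx = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]
ctrl = [11, 16, 13, 15, 16, 12]

d = cohens_d(tx, ctrl)
print(f"d = {d:.2f}")
```

Dividing by the variance instead of the standard deviation would shrink the result to roughly −.45 and understate the effect.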
t-test Assumptions
• Although the t-test is generally a robust
test, it can be affected by violations of
underlying test assumptions
– Normality – sampling distribution is normally
distributed
– Sample size – samples for each group should
be of roughly equal size
– Homogeneity of variance – σ1 = σ2
t-test Assumptions
• One sample t-test
– Normality - √
– Sample size - X
– Homogeneity of variance – X
• Matched & Independent samples t-test(s)
– Normality - √
– Sample size - √
– Homogeneity of variance – √
Impact of Violated Assumptions
• For equal sample sizes…
– …violating homogeneity of variance…
• Minimal impact (α = .05 ± .02)
– …with minor normality violations…
• Similar results as above
– …with major normality violations…
• Severe skew (particularly in opposite directions)
can lead to significant problems unless variances
are fairly equal
Impact of Violated Assumptions
• Unequal sample sizes…
– Much more difficult to interpret
– Unequal sample sizes + heterogeneity of
variance = distortions in p
• Possibly increased risk of Type I error
• Risk of error increases as more assumptions are
violated
Coping with Violated Assumptions
• What can we do to prevent or cope with
violated assumptions?
1. Maintain equal sample sizes
2. Use trimmed samples…
3. Use a distribution-free (i.e., non-parametric)
test
4. Apply a statistical correction to t
Coping with Violated Assumptions
• SPSS Output
Independent Samples Test: Self-Injurious Behavior
Levene's Test for Equality of Variances: F = .714, Sig. = .411
t-test for Equality of Means
                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       3.080   15       .008              5.37879           1.74614
Equal variances not assumed   3.653   14.979   .002              5.37879           1.47232
If the Sig. value for Levene's test (pF) is < .05, use the “Equal
variances not assumed” row
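The “Equal variances not assumed” row can be reproduced with the separate-variance t and the Welch-Satterthwaite approximation for df. The slides do not name the correction, so identifying it as Welch's test is an assumption here, although the computed values line up with the SPSS row. The function name `welch_t` is illustrative.

```python
import math
import statistics

def welch_t(group1, group2):
    """Separate-variance t with Welch-Satterthwaite degrees of freedom.
    Returns (t, df, standard_error)."""
    n1, n2 = len(group1), len(group2)
    v1 = statistics.variance(group1) / n1     # s1^2 / n1
    v2 = statistics.variance(group2) / n2     # s2^2 / n2
    se_diff = math.sqrt(v1 + v2)
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_diff
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff

tx = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]
ctrl = [11, 16, 13, 15, 16, 12]

t_obt, df, se_diff = welch_t(tx, ctrl)
print(f"t = {t_obt:.3f}, df = {df:.3f}, SE = {se_diff:.5f}")
```

The correction leaves the mean difference untouched; it changes the standard error and lowers df, so the statistic is tested against a slightly different t-distribution.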
Statistical Tests We Have Learned
1. z-Test
• 1 group
• 1 set of data
• µ & σ known
2. One-Sample t-Test
• 1 group
• 1 set of data
• µ known
• Estimate σ with s using x-bar
3. Matched Samples t-Test
• 1 group
• 2 sets of data
• µ & σ unknown
• Estimate σD with sD using D-bar
4. Independent Samples t-Test
• 2 groups
• 2 sets of data
• µ & σ unknown
• Estimate σ twice with s using x-bar
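The summary above amounts to a small decision rule, sketched below. The function `choose_test` and its parameter names are hypothetical and cover only the four tests listed in this lecture.

```python
def choose_test(n_groups, n_datasets, sigma_known=False):
    """Pick one of the four tests from the number of groups, the number of
    data sets, and whether the population SD (sigma) is known."""
    if n_groups == 1 and n_datasets == 1:
        # One group measured once: z if sigma is known, otherwise t
        return "z-Test" if sigma_known else "One-Sample t-Test"
    if n_groups == 1 and n_datasets == 2:
        # Same participants measured twice
        return "Matched Samples t-Test"
    if n_groups == 2 and n_datasets == 2:
        # Two separate groups, one data set each
        return "Independent Samples t-Test"
    raise ValueError("combination not covered by the four tests in this lecture")

print(choose_test(1, 1, sigma_known=True))
print(choose_test(1, 2))
print(choose_test(2, 2))
```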
Choosing the Best Test
• Flow-chart available on the website:
– http://www.personal.kent.edu/~marmey
• Also refer to the diagram on p. 11 of your
Howell text
• Try the review problems on the website for
an example of the types of questions I
might ask on an exam!