Two-sample t-test

Review
[Figure: plot of z[1:8000] titled "Sampling Distribution under 1", vertical axis p; parts of the plot carry the numeric labels 1 to 4.]
A. Null hypothesis, critical value, alpha, test statistic
B. Alternative hypothesis, critical region, p-value, independent variable
C. Null hypothesis, critical region, p-value, test statistic
D. Alternative hypothesis, critical value, alpha, independent variable
Review
The average person can hold his or her breath for 47 seconds. Curious how you
compare to this value, you time yourself ten times, calculate your mean, and do
a t-test.
Which of these would be a 2-tailed alternative hypothesis?
A. Your mean time is 47 seconds
B. Your mean time is different from 47 seconds
C. Your mean time is greater than 47 seconds
D. Your mean time is less than 47 seconds
Review
A pharmaceutical company regularly runs experiments to test whether
candidate drugs do what they’re supposed to. Their experiments use an alpha
level of .01.
Of all the drugs that really do work, 20% are mistakenly abandoned because
the experiment fails to find evidence that they work. What is the power of the
company’s experiments?
A. 1%
B. 20%
C. 80%
D. 99%
Two-sample t-tests
10/14
[Figure: two groups of scores, +14 and +14 (M = 14) versus -27, -41, and -41 (M = -36.3); label "Flood".]
Independent-samples t-test
• Often interested in whether two groups have same mean
– Experimental vs. control conditions
– Comparing learning procedures, with vs. without drug, lesions, etc.
– Men vs. women, depressed vs. not
• Comparison of two separate populations
– Population A, sample A of size nA, mean MA estimates μA
– Population B, sample B of size nB, mean MB estimates μB
– μA = μB?
• Example: maze times
– Rats without hippocampus: Sample A = [37, 31, 27, 46, 33]
– With hippocampus: Sample B = [43, 26, 35, 31, 28]
– MA = 34.8, MB = 32.6
– Is the difference reliable? μA > μB?
• Null hypothesis: μA = μB
– No assumption about the value of either mean (e.g., not μA = 10, μB = 10)
• Alternative hypothesis: μA ≠ μB
Finding a Test Statistic
• Goal: Define a test statistic for deciding mA = mB vs. mA ≠ mB
• Constraints (apply to all hypothesis testing):
– Must be function of data (both samples)
– Sampling distribution must be fully determined by H0
• Can only assume μA = μB
• Can’t depend on μA or μB separately, or on σ
– Alternative hypothesis should predict extreme values
• Statistic should measure deviation from μA = μB
• so that if μA ≠ μB, we’ll be able to reject H0
• Answer (preview):
– Based on MA – MB (just like M – μ0 for one-sample t-test)
– $\dfrac{M_A - M_B}{\text{"Standard Error"}}$
– (MA – MB) has Normal distribution
– Standard error has (modified) chi-square distribution
– Ratio has t distribution
Likelihood Function for MA – MB
• Central Limit Theorem
– $M_A \sim \text{Normal}\left(\mu, \dfrac{\sigma}{\sqrt{n_A}}\right)$
– $M_B \sim \text{Normal}\left(\mu, \dfrac{\sigma}{\sqrt{n_B}}\right)$
• Distribution of MA – MB
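– Subtract the means: E(MA – MB) = E(MA) – E(MB) = μ – μ = 0
– Add the variances: $\dfrac{\sigma^2}{n_A} + \dfrac{\sigma^2}{n_B} = \sigma^2\left(\dfrac{1}{n_A} + \dfrac{1}{n_B}\right)$
– $(M_A - M_B) \sim \text{Normal}\left(0, \sigma\sqrt{\dfrac{1}{n_A} + \dfrac{1}{n_B}}\right)$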
• Just divide by standard error?
– $\dfrac{M_A - M_B}{\sigma\sqrt{\frac{1}{n_A} + \frac{1}{n_B}}} \sim \text{Normal}(0, 1)$
– Same problem as before: We don’t know σ
– Need to estimate from data
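The claim that MA – MB has standard deviation σ·sqrt(1/nA + 1/nB) when both samples come from the same population can be checked by simulation. The following is only a sketch: the sample sizes, μ, σ, number of replications, and the function name are hypothetical choices, not values from the lecture.

```python
import math
import random
import statistics

def simulate_mean_differences(n_a, n_b, mu, sigma, reps, seed=0):
    """Return M_A - M_B for `reps` pairs of samples drawn from Normal(mu, sigma)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        m_a = sum(rng.gauss(mu, sigma) for _ in range(n_a)) / n_a
        m_b = sum(rng.gauss(mu, sigma) for _ in range(n_b)) / n_b
        diffs.append(m_a - m_b)
    return diffs

# Hypothetical population and sample sizes (assumptions for illustration).
n_a, n_b, mu, sigma = 5, 8, 30.0, 7.0
diffs = simulate_mean_differences(n_a, n_b, mu, sigma, reps=20000)

theoretical_se = sigma * math.sqrt(1 / n_a + 1 / n_b)
simulated_se = statistics.pstdev(diffs)   # spread of the simulated differences
simulated_mean = statistics.fmean(diffs)  # should be near 0 under H0

print(f"theoretical SE = {theoretical_se:.3f}")
print(f"simulated SE   = {simulated_se:.3f}")
print(f"simulated mean = {simulated_mean:.3f}")
```

The simulated mean should land near 0 and the simulated spread near the theoretical standard error, up to sampling noise.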
Estimating σ
• Already know best estimator for one sample
– $s = \sqrt{\dfrac{\sum (X - M)^2}{n - 1}}$
• Could just use one sample or the other
– sA or sB
– Works, but not best use of the data
• Combining sA and sB
– Both come from averages of (X – M)²
– Average them all together:
– $\dfrac{\sum_A (X - M_A)^2 + \sum_B (X - M_B)^2}{n_A + n_B - 2}$
• Degrees of freedom
– (nA – 1) + (nB – 1) = nA + nB – 2
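One way to read "average them all together": the pooled mean square equals the average of the two sample variances, each weighted by its degrees of freedom. The sketch below checks this on the maze-time samples given earlier. The function name is hypothetical, and `statistics.variance` is the usual n – 1 sample variance.

```python
import statistics

def pooled_mean_square(sample_a, sample_b):
    """Sum of squared deviations from each sample's own mean, divided by nA + nB - 2."""
    m_a = statistics.mean(sample_a)
    m_b = statistics.mean(sample_b)
    ss_a = sum((x - m_a) ** 2 for x in sample_a)
    ss_b = sum((x - m_b) ** 2 for x in sample_b)
    df = len(sample_a) + len(sample_b) - 2
    return (ss_a + ss_b) / df

# Maze times from the lecture example.
sample_a = [37, 31, 27, 46, 33]   # rats without hippocampus
sample_b = [43, 26, 35, 31, 28]   # rats with hippocampus

ms = pooled_mean_square(sample_a, sample_b)

# Same quantity, written as a df-weighted average of the two sample variances.
df_a, df_b = len(sample_a) - 1, len(sample_b) - 1
var_a = statistics.variance(sample_a)
var_b = statistics.variance(sample_b)
weighted = (df_a * var_a + df_b * var_b) / (df_a + df_b)

print(f"pooled MS = {ms:.2f}, weighted average of variances = {weighted:.2f}")
```

With equal sample sizes the weights are equal, so the pooled value is the plain average of the two variances. With unequal sizes the larger sample counts for more.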
Independent-Samples t Statistic
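$t = \dfrac{M_A - M_B}{\text{Standard Error}}$
– Numerator: difference between sample means
– Denominator: typical difference expected by chance
$\text{Standard Error} = \sqrt{MS \cdot \left(\dfrac{1}{n_A} + \dfrac{1}{n_B}\right)}$
– MS: estimate of σ²
– 1/nA term: variance from MA
– 1/nB term: variance from MB
– Product under the square root: variance of MA – MB
$MS = \dfrac{\sum_A (X - M_A)^2 + \sum_B (X - M_B)^2}{n_A + n_B - 2}$
– Numerator: sum of squared deviations
– Denominator: degrees of freedom
– Mean Square; estimates σ²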
Steps of Independent Samples t-test
1. State clearly the two hypotheses
2. Determine null and alternative hypotheses
• H0: μA = μB
• H1: μA ≠ μB
3. Compute the test statistic t from the data
• $t = \dfrac{M_A - M_B}{\sqrt{MS \cdot \left(\frac{1}{n_A} + \frac{1}{n_B}\right)}}$
• Difference between sample means, divided by standard error
4. Determine likelihood function for test statistic according to H0
• t distribution with nA + nB – 2 degrees of freedom
5. Choose alpha level
6. Find critical value
7a. t beyond tcrit: Reject null hypothesis, μA ≠ μB
7b. t within tcrit: Retain null hypothesis, μA = μB
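The steps above can be sketched in a few lines of code. This is a minimal illustration, not a replacement for a statistics package. The function names are hypothetical, and the critical value is passed in by hand: 2.306 is the two-tailed α = .05 value for 8 degrees of freedom from a standard t table, an assumption of this sketch rather than a number from the slides.

```python
import math

def independent_t(sample_a, sample_b):
    """Return (t, df) for the pooled-variance independent-samples t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    m_a = sum(sample_a) / n_a
    m_b = sum(sample_b) / n_b
    # Sum of squared deviations, each sample around its own mean.
    ss = sum((x - m_a) ** 2 for x in sample_a) + sum((x - m_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2
    ms = ss / df                                   # mean square, estimates sigma^2
    se = math.sqrt(ms * (1 / n_a + 1 / n_b))       # standard error of M_A - M_B
    return (m_a - m_b) / se, df

def decide(t, t_crit):
    """Two-tailed decision rule: reject H0 when |t| is beyond the critical value."""
    return "reject H0" if abs(t) > t_crit else "retain H0"

# Maze times from the lecture: without vs. with hippocampus.
sample_a = [37, 31, 27, 46, 33]
sample_b = [43, 26, 35, 31, 28]

t, df = independent_t(sample_a, sample_b)
t_crit = 2.306  # assumed: two-tailed alpha = .05, df = 8, from a t table
decision = decide(t, t_crit)
print(f"t = {t:.3f}, df = {df}, decision: {decision}")
```

Keeping the critical value as an argument leaves the choice of alpha and of one or two tails to the caller, matching steps 5 and 6.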
Example
Rats without hippocampus: Sample A = [37, 31, 27, 46, 33]
With hippocampus: Sample B = [43, 26, 35, 31, 28]
MA = 34.8, MB = 32.6, MA – MB = 2.2
df = nA + nB – 2 = 5 + 5 – 2 = 8
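Sample A
X     X–MA    (X–MA)²
37     2.2      4.84
31    -3.8     14.44
27    -7.8     60.84
46    11.2    125.44
33    -1.8      3.24
ΣA(X–MA)² = 208.80

Sample B
X     X–MB    (X–MB)²
43    10.4    108.16
26    -6.6     43.56
35     2.4      5.76
31    -1.6      2.56
28    -4.6     21.16
ΣB(X–MB)² = 181.20

$MS = \dfrac{\sum_A (X - M_A)^2 + \sum_B (X - M_B)^2}{df} = \dfrac{208.8 + 181.2}{8} = 48.75$
$s_{(M_A - M_B)} = \sqrt{MS \cdot \left(\dfrac{1}{n_A} + \dfrac{1}{n_B}\right)} = \sqrt{48.75 \times \left(\dfrac{1}{5} + \dfrac{1}{5}\right)} = 4.42$
$t = \dfrac{M_A - M_B}{s_{(M_A - M_B)}} = \dfrac{2.2}{4.42} = .498$
tcrit = 1.86 (the one-tailed value for α = .05, df = 8)
[Figure: t distribution with 8 degrees of freedom (t8), with tcrit marked.]
t = .498 is within tcrit: retain the null hypothesis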
Mean Squares
• Average of squared deviations
• Used for estimating variance
Population: $MS = \dfrac{\sum (X - \mu)^2}{N}$
– Population variance, σ²
Sample: $MS = \dfrac{\sum (X - M)^2}{n - 1}$
– Sample variance, s²
– Estimates σ²
Two samples: $MS = \dfrac{\sum_A (X - M_A)^2 + \sum_B (X - M_B)^2}{n_A + n_B - 2}$
– Also estimates σ²
Degrees of Freedom
• Applies to any sum-of-squares type formula
– $\sum (X - M)^2$
– $\sum_A (X_A - M_A)^2 + \sum_B (X_B - M_B)^2$
– $\sum (X - \hat{X})^2$
• Tells how many numbers are really being added
– n = 2: only one number
X     X–M    (X–M)²
3     -2      4
7      2      4
– In general: one number determined by the rest
• Every statistic in formula that’s based on X removes 1 df
– M, MA, MB
– Algebraically rewriting formula in terms of only X results in fewer summands
• I will always tell you the rule for df for each formula
• To get Mean Square, divide sum of squares by df
– $s^2 = \dfrac{\sum (X - M)^2}{n - 1}$
– $MS = \dfrac{\sum_A (X_A - M_A)^2 + \sum_B (X_B - M_B)^2}{n_A + n_B - 2}$
• Sampling distribution of a statistic depends on its degrees of freedom
– χ², t, F
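The point that "one number is determined by the rest" can be seen directly: deviations from the sample mean always sum to zero, so the last deviation is fixed once the others are known. The short sketch below uses the X = 3, 7 example from the slide. The variable and function names are hypothetical.

```python
def deviations(xs):
    """Deviations of each score from the sample mean."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

xs = [3, 7]                      # the n = 2 example from the slide
devs = deviations(xs)

# Deviations sum to zero, so the last one is implied by the others.
implied_last = -sum(devs[:-1])

sum_of_squares = sum(d ** 2 for d in devs)
df = len(xs) - 1                 # one df removed because M is computed from X
mean_square = sum_of_squares / df

print("deviations:", devs)
print("last deviation implied by the rest:", implied_last)
print("sum of squares:", sum_of_squares, "df:", df, "mean square:", mean_square)
```

Here the two squared deviations are both 4, but only one of them is free to vary, which is why the sum of squares is divided by 1 rather than 2.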
Independent vs. Paired Samples
• Independent-samples t-test assumes no relation
between Sample A and Sample B
– Unrelated subjects, randomly assigned
– Necessary for standard error of (MA – MB) to be correct
• Sometimes samples are paired
– Each score in Sample A goes with a score in Sample B
– Before vs. after, husband vs. wife, matched controls
– Paired-samples t-test
Paired-samples t-test
• Data are pairs of scores, (XA, XB)
– Form two samples, XA and XB
– Samples are not independent
• Same null hypothesis as with independent samples
– μA = μB
– Equivalent to mean(XA – XB) = 0
• Approach
– Compute difference scores, Xdiff = XA – XB
– One-sample t-test on difference scores, with μ0 = 0
Example
• Breath holding underwater vs. on land
– 8 subjects
– Water: XA = [54, 98, 67, 143, 82, 91, 129, 112]
– Land: XB = [52, 94, 69, 139, 79, 86, 130, 110]
• Difference: Xdiff = [2, 4, -2, 4, 3, 5, -1, 2]
• Mean: $M_{diff} = \dfrac{\sum X_{diff}}{n} = \dfrac{17}{8} = 2.13$
• Mean Square: $s_{diff}^2 = \dfrac{\sum (X_{diff} - M_{diff})^2}{n - 1} = 6.13$
• Standard Error: $s_{M_{diff}} = \dfrac{s_{diff}}{\sqrt{n}} = \sqrt{\dfrac{6.13}{8}} = .88$
• Test Statistic: $t = \dfrac{M_{diff}}{s_{M_{diff}}} = \dfrac{2.13}{.88} = 2.43$
• Critical value
> qt(.025,7,lower.tail=FALSE)
[1] 2.364624
• Reliably longer underwater
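The paired-samples computation above is a one-sample t-test on the difference scores, which can be sketched as follows. The function name is hypothetical, and the critical value is copied from the `qt` output in the example rather than computed.

```python
import math

def paired_t(xa, xb):
    """Return (t, df, diffs) for a paired-samples t-test of mean difference = 0."""
    if len(xa) != len(xb):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(xa, xb)]
    n = len(diffs)
    m_diff = sum(diffs) / n
    # Mean square of the difference scores (sample variance, n - 1 df).
    ms = sum((d - m_diff) ** 2 for d in diffs) / (n - 1)
    se = math.sqrt(ms / n)        # standard error of the mean difference
    return m_diff / se, n - 1, diffs

# Breath-holding times from the example.
water = [54, 98, 67, 143, 82, 91, 129, 112]
land = [52, 94, 69, 139, 79, 86, 130, 110]

t, df, diffs = paired_t(water, land)
t_crit = 2.364624  # from qt(.025, 7, lower.tail=FALSE) in the example
reject = abs(t) > t_crit
print(f"t = {t:.2f}, df = {df}, reject H0: {reject}")
```

Working from the difference scores means each subject serves as their own control, so between-subject variability does not enter the standard error.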
Comparison of t-tests
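• One sample
– Data: X
– t: $\dfrac{M - \mu_0}{s_M}$
– Standard Error: $\dfrac{s}{\sqrt{n}} = \sqrt{MS \cdot \dfrac{1}{n}}$
– Mean Square: $s^2 = \dfrac{\sum (X - M)^2}{df}$
– df: n – 1
• 2-Indep.
– Data: XA, XB
– t: $\dfrac{M_A - M_B}{s_{M_A - M_B}}$
– Standard Error: $\sqrt{MS \cdot \left(\dfrac{1}{n_A} + \dfrac{1}{n_B}\right)}$
– Mean Square: $\dfrac{\sum (X_A - M_A)^2 + \sum (X_B - M_B)^2}{df}$
– df: nA + nB – 2
• 2-Paired
– Data: Xdiff = XA – XB
– t: $\dfrac{M_{diff}}{s_{M_{diff}}}$
– Standard Error: $\dfrac{s_{diff}}{\sqrt{n}} = \sqrt{MS \cdot \dfrac{1}{n}}$
– Mean Square: $s_{diff}^2 = \dfrac{\sum (X_{diff} - M_{diff})^2}{df}$
– df: n – 1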
Review
A study of adult and infant speech observes brief interactions between 10 two-year-olds and their mothers. For each dyad, we record the number of words
spoken by the parent and by the child. The question is who speaks more on
average.
What type of t-test should be used to analyze the data?
A. One-sample t-test
B. Independent-samples t-test
C. Paired-samples t-test
D. Depends on your choice of null hypothesis
Review
Find the mean difference score, Mdiff, from the data below.
Dyad   Parent   Child
1      27       33
2      44       40
3      29       25
4      32       33
5      47       41
6      53       59
7      24       21
8      59       50
9      38       34
10     17       16
A. 1.8
B. 4.4
C. 36.1
D. 72.2
Review
Now calculate t.
Dyad   Parent   Child   Diff
1      27       33      -6
2      44       40       4
3      29       25       4
4      32       33      -1
5      47       41       6
6      53       59      -6
7      24       21       3
8      59       50       9
9      38       34       4
10     17       16       1
Mdiff = 1.8
sdiff = 4.9
A. 0.12
B. 0.37
C. 1.16
D. 2.72