“T”s and “F”s
Statistical testing for means
FETP India
Competency to be gained
from this lecture
Test the statistical significance of the
difference between two means
Key elements
• Paired and unpaired data
• Paired t-test
• Unpaired t-test
• F test
Application of the concept of
statistical testing
• Means
• Proportion
• Measures of association
Paired and unpaired data
Statistical testing for means
• Means
 T-test for paired data
 T-test for unpaired data
 F-test to test the difference in variances
• Proportion
• Measures of association
Comparing unpaired data
• Concept
 Comparing a bag of observations against another
bag of observations
• Example
 The mean height of the children in one class
versus the mean height of the children in
another
Comparing paired data
• Concept
 Comparing pairs of observations that are linked
with each other
• Example
 Pre- and post-treatment values of a parameter in
a group of subjects
Paired and unpaired t-tests
• To test the difference between two sample
means that are paired (e.g., before and
after treatment) or matched (e.g., patients
matched for age, sex, etc.)
 Use PAIRED t-test
• To test the difference between two sample
means that are not paired / unmatched
 Use UNPAIRED (independent) t-test
Example
• Drug trial
 Drug A and Drug B
 Two groups have equal initial blood sugar levels
• Question:
 Does the drug have an impact on the blood sugar
level?
• Null hypothesis
 There is no difference between the mean blood
sugar levels before and after treatment
Options available for the
example considered
• Two paired t-tests
 Each group has an initial and a post-treatment
value
 One paired t-test is possible for each group
 This option suits a research question
examining the individual relevance of each drug
• One unpaired t-test on the final values
 This option suits a research question
comparing the two drugs
Methods to calculate the paired t-test:
Concept
• We test the null hypothesis that the mean of the
differences between the paired values is equal to 0
Paired t-test
Methods to calculate the paired t-test:
Formula (1/2)
• Number of pairs: n
• Value before Rx: a
• Value after treatment: b
• Difference: d = a - b
• Mean of d: $\bar{d} = \frac{\sum d}{n}$
Methods to calculate the paired t-test:
Formula (2/2)
• Variance of d: $s^2 = \frac{1}{n-1}\left[\sum d^2 - \frac{(\sum d)^2}{n}\right]$
• $t = \frac{\bar{d} - 0}{s/\sqrt{n}} = \frac{\bar{d}}{s/\sqrt{n}}$
Illustration of an application of the t-test
Fasting blood sugar (mg%)
| Drug | No. of patients | Initial | Final | Decrease |
|---|---|---|---|---|
| A | 30 | 178 | 153 | 25* |
| B | 31 | 179 | 119 | 60* |
* Statistically significant (P < 0.05)
Numerical example of paired t-test
Erythrocyte sedimentation rate, 1 hour (mm)
| Patient number | Before Rx (a) | After Rx (b) | Difference (a - b) = d | Square of difference (d²) |
|---|---|---|---|---|
| 1 | 25 | 8 | 17 | 289 |
| 2 | 43 | 10 | 33 | 1,089 |
| 3 | 38 | 6 | 32 | 1,024 |
| 4 | 20 | 7 | 13 | 169 |
| 5 | 41 | 10 | 31 | 961 |
| 6 | 48 | 5 | 43 | 1,849 |
| 7 | 15 | 8 | 7 | 49 |
| 8 | 28 | 9 | 19 | 361 |
| 9 | 35 | 4 | 31 | 961 |
| 10 | 33 | 3 | 30 | 900 |
| Total | 326 | 70 | 256 | 7,652 |
$\sum d = 256$; $n = 10$; $\bar{d} = 256/10 = 25.6$
$\sum d^2 = 7652$
Variance: $s^2 = \frac{1}{n-1}\left[\sum d^2 - \frac{(\sum d)^2}{n}\right] = \frac{1}{10-1}\left[7652 - \frac{(256)^2}{10}\right] = 122.04$
$s = \sqrt{s^2} = \sqrt{122.04} = 11.047$
$t = \frac{\bar{d}}{s/\sqrt{n}} = \frac{25.6}{11.047/\sqrt{10}} = 7.33$ with 9 d.f.
Inference
• Calculated value of t = 7.33
 9 degrees of freedom (df)
• Tabulated value of t (df = 9) (0.1%) = 4.781
• The calculated t (t-cal) exceeds the tabulated t (t-tab)
• The treatment had a significant benefit in reducing
the erythrocyte sedimentation rate (P < 0.001)
• The mean erythrocyte sedimentation rate after
treatment (7.0 mm) is significantly lower than the
mean pre-treatment ESR value (32.6 mm)
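The arithmetic of this example can be checked with a few lines of Python. This is only a sketch that follows the slide formulas using the standard library; the function name `paired_t` is chosen here for illustration, and the critical value 4.781 is the tabulated figure quoted above, not something the code derives.

```python
import math

# ESR values (mm, 1 hour) from the numerical example
before = [25, 43, 38, 20, 41, 48, 15, 28, 35, 33]
after = [8, 10, 6, 7, 10, 5, 8, 9, 4, 3]


def paired_t(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    d = [x - y for x, y in zip(a, b)]      # differences d = a - b
    n = len(d)
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    mean_d = sum_d / n
    variance = (sum_d2 - sum_d ** 2 / n) / (n - 1)
    se = math.sqrt(variance) / math.sqrt(n)  # standard error of the mean difference
    return mean_d / se, n - 1


t_value, df = paired_t(before, after)
T_TAB = 4.781  # tabulated t for df = 9 at the 0.1% level, as quoted on the slide

print(f"t = {t_value:.2f} with {df} d.f.")
print("exceeds tabulated value:", t_value > T_TAB)
```

If SciPy is available, `scipy.stats.ttest_rel(before, after)` should give the same statistic together with a P value.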
Methods to calculate the unpaired t-test:
Concept
• The pooled variance is a weighted average of
the two variances
• If the two sample sizes are equal, the pooled
variance is the mean of the two variances
• The t-table is identical for unpaired and
paired data
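The equal-sample-size case can be verified in one line. Writing $s_1^2$ and $s_2^2$ for the two sample variances and assuming $n_1 = n_2 = n$, the weighted average reduces to the plain mean:

```latex
s^2 = \frac{(n-1)\,s_1^2 + (n-1)\,s_2^2}{(n-1) + (n-1)}
    = \frac{(n-1)\,(s_1^2 + s_2^2)}{2\,(n-1)}
    = \frac{s_1^2 + s_2^2}{2}
```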
Unpaired t-test
Methods to calculate the unpaired t-test:
Formula
| | Size | Mean | Variance |
|---|---|---|---|
| Sample I | $n_1$ | $\bar{x}_1$ | $s_1^2$ |
| Sample II | $n_2$ | $\bar{x}_2$ | $s_2^2$ |
To test the significance of the difference between the two sample
means, calculate
$t = \frac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
where $s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$
t follows a t distribution with $(n_1 + n_2 - 2)$ df
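The pooled-variance calculation can be sketched in Python from summary statistics alone. The function name `unpaired_t` and the numbers passed to it are hypothetical, chosen only so that the arithmetic is easy to follow by hand; they do not come from the lecture.

```python
import math


def unpaired_t(n1, mean1, var1, n2, mean2, var2):
    """Return (t, degrees of freedom) using the pooled variance."""
    # Pooled variance: weighted average of the two sample variances
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / ((n1 - 1) + (n2 - 1))
    # Standard error of the difference between the two means
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, n1 + n2 - 2


# Hypothetical summary statistics (illustration only)
t_value, df = unpaired_t(n1=10, mean1=12.0, var1=4.0,
                         n2=10, mean2=10.0, var2=6.0)
print(f"t = {t_value:.2f} with {df} d.f.")
```

With these made-up inputs the pooled variance is 5.0 and the standard error is 1.0, so the statistic is simply the difference in means.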
Numerical example of unpaired t-test
• Comparing the 24-hour total energy expenditure among an obese and a lean group
• Null hypothesis:
 There is no difference in mean energy expenditure between the two groups
t-cal > t-tab indicates that the mean energy expenditure
in the obese group (10.3) is significantly (P < 0.001) higher
than that in the lean group (8.1)
Underlying assumptions of
the unpaired t-test
1. The distributions of x1 and x2 are normal
2. The population variances of x1 and x2 are
equal
However, minor deviations from these
assumptions do not affect the validity
of the test
Unpaired t-test on paired data
• It would be inefficient to test paired
observations as though they were unpaired
• Consequences:
 Underestimation of t - value
 Overestimation of probability value
 Undercalling of significant difference
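These consequences can be traced to the standard errors of the two tests. A brief derivation, assuming $n$ pairs $(a, b)$ and writing $s_{ab}$ for the sample covariance between the paired values (a symbol introduced here, not on the slides):

```latex
s_d^2 = s_a^2 + s_b^2 - 2\,s_{ab}
\qquad\Rightarrow\qquad
SE_{\text{paired}}^2 = \frac{s_d^2}{n}
  = \underbrace{\frac{s_a^2 + s_b^2}{n}}_{SE_{\text{unpaired}}^2} - \frac{2\,s_{ab}}{n}
```

When the paired values are positively correlated ($s_{ab} > 0$), which is the usual situation for repeated measurements on the same subject, the unpaired standard error is the larger of the two, so the unpaired t is smaller and the P value larger.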
Unequal variances
• Variances in the two samples may differ
considerably from one another
• Example:
 Two technicians, one experienced (more consistent) and
the other relatively inexperienced (more variable)
undertake a blood count
 Both technicians are estimating the same population mean
value
 The more experienced one will have a smaller variability
in his readings than the less experienced one
F test
Possible course of action for situations
with unequal variances
• No course of action will suit all situations
• Options:
 Transform the values to some other scale (e.g.
logarithmic) to equalize variances
 Use specific methods when this is not possible:
• Modified t-test
• Behrens-Fisher test
Variance ratio test (F- test)
• To test the equality of two variances, $s_1^2$ and $s_2^2$,
we use a statistical test called the ‘variance ratio’
test (F-test)
• Calculate the ratio of the larger variance to the
smaller variance, i.e., $F = \frac{s_1^2}{s_2^2}$ (where $s_1^2$ is the larger variance)
• F follows an F-distribution with $(n_1 - 1)$ and $(n_2 - 1)$
degrees of freedom
Example of variance ratio test (F-test)
• Variance in the infected group
 10.9 (n1= 10)
• Variance in the control group
 5.9 (n2 = 12)
• F is calculated as 10.9 / 5.9 = 1.85
 9, 11 degrees of freedom (n1 - 1 and n2 - 1)
• Tabulated F 9,11 (5%) = 2.92
• A calculated “F” (Fcal) smaller than the tabulated
“F” (Ftab) indicates no significant difference between the variances
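The same example can be reproduced in a few lines of Python. This is a minimal sketch: the helper name `variance_ratio` is made up for illustration, and the critical value 2.92 is taken from the slide rather than computed.

```python
def variance_ratio(var_a, n_a, var_b, n_b):
    """Return (F, numerator df, denominator df), larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


# Infected group: variance 10.9 (n = 10); control group: variance 5.9 (n = 12)
f_value, df1, df2 = variance_ratio(10.9, 10, 5.9, 12)
F_TAB = 2.92  # tabulated F for 9, 11 d.f. at 5%, as quoted on the slide

print(f"F = {f_value:.2f} with {df1}, {df2} d.f.")
print("variances differ significantly:", f_value > F_TAB)
```

Putting the larger variance in the numerator inside the helper means the caller does not need to order the two groups beforehand.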
Assumptions of the variance ratio F test
• The two samples must be independent
 e.g., Two series of patients and not the same
patients tested twice (before and after
treatment)
• Both samples must have come from a normal
distribution
What test should be used to test the
difference between two means?
Test the difference between two sample mean values
• Values are paired / matched
 Paired t-test
• Values are unpaired / unmatched
 Check if variances are equal
    • Equal variances: unpaired t-test
    • Different variances: modified t-test, Behrens-Fisher test
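The decision path above can also be written as a small function. This is an illustrative sketch only: `choose_test` and its parameters are hypothetical names, and whether the variances are equal is assumed to have been judged beforehand (for example with the F-test).

```python
def choose_test(paired: bool, equal_variances: bool = True) -> str:
    """Name the test suggested by the decision path for comparing two means."""
    if paired:
        # Paired or matched values: variances are not compared
        return "paired t-test"
    if equal_variances:
        return "unpaired t-test"
    return "modified t-test or Behrens-Fisher test"


print(choose_test(paired=True))
print(choose_test(paired=False, equal_variances=True))
print(choose_test(paired=False, equal_variances=False))
```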
Key messages
• Determine whether the data are paired
• Use paired t-test for paired data
• Use unpaired t-test for unpaired data if the
variances are comparable
• Test for the difference in variances with the F-test and use other tests if variances differ