Department of Epidemiology and Public Health
Unit of Biostatistics and Computational Sciences
Descriptive statistical methods and
comparison measures
PD Dr. C. Schindler
Swiss Tropical and Public Health Institute
University of Basel
[email protected]
Annual meeting of the Swiss Societies of Neurophysiology,
Neurology and Stroke, Lucerne, May 19th 2011
Contents
Tabular representations
Graphical representations
Comparison measures for quantitative variables
(difference in means, geometric mean ratio)
Comparison measures for binary variables
(risk difference, relative risk, odds ratio)
Comparison measures for count data
(incidence rate ratio)
Non-parametric comparison measures
(AUC)
General rules for tabular and graphical representations
Tables and figures should be self-explanatory.
T + F: title
T + F: caption
F: clear axis titles with indication of units
F: explanation of different graphical elements
(colors, symbols, line types, etc.)
(T = tables, F = figures)
Tabular representations
Table 1 (longitudinal study report)
Comparison of the different groups with respect to baseline
characteristics (sex, age, etc., incl. baseline of the outcome variable)
Qualitative variables: relative frequencies in % + absolute frequencies
Quantitative variables: mean (standard deviation) (1)
or median (minimum – maximum) or median (lower – upper quartile) (2)
(1) if the QQ-plot does not deviate systematically from a straight line
(2) if the QQ-plot shows clear curvature or a wave pattern
Statistical properties of the normal distribution
[Figure: normal density curve with µ = mean and σ = standard deviation.]
~ 2/3 of all values (in fact: 68%) lie between µ - σ and µ + σ.
~ 95% of all values (in fact: 95.4%) lie between µ - 2σ and µ + 2σ,
with about 2.5% of values in each tail beyond µ - 2σ and µ + 2σ.
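The coverage figures above (68% within one standard deviation, 95.4% within two) can be checked numerically. A minimal sketch using only the Python standard library; the function name normal_coverage is our own:

```python
from math import erf, sqrt

def normal_coverage(k):
    """Fraction of a normal distribution lying within mean ± k standard deviations."""
    # For a standard normal, P(|Z| <= k) = erf(k / sqrt(2))
    return erf(k / sqrt(2))

print(round(normal_coverage(1), 3))  # 0.683
print(round(normal_coverage(2), 3))  # 0.954
```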
Huang HY et al., The Effects of Vitamin C Supplementation on Serum
Concentrations of Uric Acid: Results of a Randomized Controlled Trial,
Arthritis & Rheumatism, Vol. 52, No. 6, June 2005, pp 1843–1847,
DOI 10.1002/art.21105
Table 1 (cross-sectional study report)
Description of the sample studied and comparison with persons not
included in the sample (with respect to demographic characteristics
and health-relevant variables).
Same rules as for Table 1 of a longitudinal study report.
Alkerwi et al., Comparison of participants and non-participants to the
ORISCAV-LUX population-based study on cardiovascular risk factors in
Luxembourg, BMC Medical Research Methodology 2010,
http://www.biomedcentral.com/content/pdf/1471-2288-10-80.pdf
Graphical representations
Boxplot (box plot)
Graphical representation of the distribution of a quantitative
variable based on a few important measures
(minimum, lower quartile, median, upper quartile, maximum).
Outlying values are represented as individual points.
[Figure: boxplots of body mass index (BMI) in adults aged 30 to 70 years
in Basel (SAPALDIA study), shown separately for men and women.
Marked elements: upper fence*, 3rd quartile (75th percentile), median,
1st quartile (25th percentile), lower fence*.]
* lower (upper) fence: smallest (largest) observation which is still within 1.5
box lengths of the lower (upper) end of the box.
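The five summary measures and the fence rule from the footnote can be made concrete in code. A minimal sketch (quantile interpolation conventions vary between statistical packages; this uses linear interpolation, and the function name is our own):

```python
def five_number_summary(values):
    """Median, quartiles, and Tukey fences as drawn in a boxplot."""
    xs = sorted(values)
    n = len(xs)

    def quantile(p):
        # linear interpolation between adjacent order statistics
        idx = p * (n - 1)
        lo, hi = int(idx), min(int(idx) + 1, n - 1)
        frac = idx - lo
        return xs[lo] * (1 - frac) + xs[hi] * frac

    q1, med, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
    iqr = q3 - q1  # box length
    # fences: most extreme observations still within 1.5 box lengths of the box
    lower_fence = min(x for x in xs if x >= q1 - 1.5 * iqr)
    upper_fence = max(x for x in xs if x <= q3 + 1.5 * iqr)
    # anything beyond the fences is plotted as an individual point
    outliers = [x for x in xs if x < lower_fence or x > upper_fence]
    return q1, med, q3, lower_fence, upper_fence, outliers
```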
Number of discharges as percentage of total number of patients,
by day of week
Wong HJ et al., Real-time operational feedback: daily discharge rate as a novel hospital efficiency metric,
Qual Saf Health Care 2010;19:1-5 doi:10.1136/qshc.2010.040832
Bar charts
1. Representation of the distribution of a qualitative variable or
of a quantitative variable with few values (e.g. parity of a woman).
Each value of the variable is assigned a bar, whose height
equals the absolute or relative frequency of the value.
2. Representation of group statistics (e.g., group means of the
outcome variable)
or of statistics of complex observational units (e.g., regions,
hospitals, etc.)
Bar charts representing the distribution of a categorical variable
[Figure: two bar charts for a categorical variable with categories
A, B, C, D in Group 1 and Group 2. Left panel: bars of relative
frequency (%) per category, shown side by side for the two groups.
Right panel: stacked bars per group, adding up to 100%.]
Bars represent different categories (or levels) of the respective
categorical variable.
Heights of bars are proportional to the relative frequencies of the
associated categories.
Representation of group means by bar charts
Here, bars represent group means and error intervals are
mean ± 1 standard error (68%-confidence interval).
95%-confidence intervals (mean ± 2 · standard error)
would be better.
Smith HAB et al., Nitric oxide precursors and congenital heart surgery: A randomized controlled
trial of oral citrulline, J Thorac Cardiovasc Surg 2006; 132:58-65
Scatter plots
[Figure: scatter plot of the z-score of lower extremity latency (y-axis)
against the z-score of upper extremity latency (x-axis).]
Scatter plots serve to visualize the association between two numerical variables
(here z-scores of upper and lower extremity latencies in RRMS and SPMS patients).
Comparison measures
a) for quantitative data
b) for binary data
c) for count data
Comparison measures
for
quantitative variables
Differences in means
Application: Comparison of different groups with respect to
a) Outcome of interest at follow-up
and / or
b) Change in outcome variable during follow-up.
Example: Effect of vitamin C on serum uric acid level.
Comparison measure:
Difference between the mean change in serum uric acid
level in the treatment group (vitamin C supplementation)
and the mean change in serum uric acid level in the
placebo group.
Huang HY et al., The Effects of Vitamin C Supplementation on Serum Concentrations of Uric Acid: results
of a randomized controlled trial, Arthritis Rheum. 2005; 52:1843-7.
Remarks
The difference in the mean of an outcome variable between two
independent samples is generally assessed using the t-test
(validity condition: approximate normality and similar variability of
the data in both groups or sufficiently large sample sizes.)
If data have a skewed distribution (e.g., lab measurements),
approximate normality of the data may often be achieved by a
logarithmic transformation of the data (cf. next topic)
But a data transformation is not always appropriate, e.g., if mean
costs are to be compared.
In this case, bootstrap methods or permutation tests may help to
achieve valid statistical comparisons.
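The bootstrap approach mentioned above can be sketched as follows. This is a minimal illustration of a percentile bootstrap for the difference in means, not the exact method of any cited study; the function name is our own:

```python
import random

def bootstrap_ci_mean_diff(group1, group2, n_boot=5000, seed=1):
    """Percentile bootstrap CI for the difference in means (group2 - group1)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # resample each group with replacement, keeping the original group sizes
        s1 = [rng.choice(group1) for _ in group1]
        s2 = [rng.choice(group2) for _ in group2]
        diffs.append(sum(s2) / len(s2) - sum(s1) / len(s1))
    diffs.sort()
    # 2.5th and 97.5th percentiles of the bootstrap distribution -> ~95% CI
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot)]
```

No normality assumption enters here, which is why this works for skewed quantities such as costs.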
Geometric mean ratios
In many cases, the original outcome has a skewed distribution.
But, on a logarithmic scale, it becomes approximately normal.
In this case, the data should first be log-transformed.
Then the group means of the log-transformed data should be
compared.
Example: Neurofilament heavy chain protein in cerebrospinal fluid
across healthy controls and different groups of MS-patients
[Figure: left panel: boxplots of NFH-protein concentration (0–200) in
controls, CIS, PPMS, SPMS and RRMS; right panel: boxplots of
ln(NFH-protein concentration) for the same groups.]

Group      Median   Geometric mean     Mean of ln(NFH)
Controls   27.1     exp(3.30) = 27.1   3.30
CIS        32.9     exp(3.48) = 32.5   3.48
PPMS       47.8     exp(3.97) = 53.0   3.97
SPMS       51.2     exp(3.83) = 46.1   3.83
RRMS       43.4     exp(3.84) = 46.5   3.84
QQ-plots (of ln(NFH))
[Figure: five QQ-plots of lognfh against the inverse normal,
one panel per group (HC, CIS, PPMS, SPMS, RRMS).]
If points are close to a straight line, the distribution can be
considered as approximately normal.
Geometric mean – mathematical definition
Let mean(ln(X)) denote the sample mean of a log-transformed
variable ln(X). Then, after back-exponentiation, this mean
turns into the so-called geometric mean of X:
geometric mean of X = exp(mean(ln(X)))  (*)
If the distribution of ln(X) is approximately symmetrical,
then the geometric mean of X is a good approximation of the
median of X.
(*) e^u = exp(u) = Euler's exponential function (e = 2.71828... = Euler's number)
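Definition (*) translates directly into code; a minimal sketch (the function name is our own):

```python
from math import exp, log

def geometric_mean(values):
    """Back-exponentiated mean of the logs: exp(mean(ln(x)))."""
    return exp(sum(log(x) for x in values) / len(values))

print(geometric_mean([1, 10, 100]))  # 10.0 (up to floating-point error)
```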
Geometric mean ratios
Let mean1(ln(X)) = mean of ln(X) in sample 1
    mean2(ln(X)) = mean of ln(X) in sample 2.
Then, after back-exponentiation, the difference
∆mean = mean2(ln(X)) – mean1(ln(X))
turns into the so-called geometric mean ratio between the two samples:
exp(∆mean) = exp(mean2(ln(X)) – mean1(ln(X)))
           = exp(mean2(ln(X))) / exp(mean1(ln(X)))
           = GM2(X) / GM1(X)
In many cases, this ratio is close to the ratio of medians.
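The identity above (back-exponentiated mean difference = ratio of geometric means) can be sketched in code; the function name is our own:

```python
from math import exp, log

def geometric_mean_ratio(sample1, sample2):
    """exp(mean(ln X) in sample 2 - mean(ln X) in sample 1) = GM2(X) / GM1(X)."""
    m1 = sum(log(x) for x in sample1) / len(sample1)
    m2 = sum(log(x) for x in sample2) / len(sample2)
    return exp(m2 - m1)
```

For example, a sample with geometric mean 40 against one with geometric mean 10 gives a ratio of 4.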
Geometric mean ratios

Group      Mean        Geometric mean      Geometric mean        Mean difference
           (log scale)                     ratio                 (log scale)
Controls   3.30        exp(3.30) = 27.1    1                     0
CIS        3.48        exp(3.48) = 32.5    32.5 / 27.1 = 1.20    0.18
PPMS       3.97        exp(3.97) = 53.0    53.0 / 27.1 = 1.96    0.67
SPMS       3.83        exp(3.83) = 46.1    46.1 / 27.1 = 1.70    0.53
RRMS       3.84        exp(3.84) = 46.5    46.5 / 27.1 = 1.72    0.54
(each geometric mean ratio equals exp(mean difference on the log scale))

Digression:
95%-confidence limits of geometric means:
exp [ mean log scale ± 1.96 SE(mean log scale) ]
95%-confidence limits of geometric mean ratios:
exp [ ∆mean log scale ± 1.96 SE(∆mean log scale) ]
Comparison measures
for
binary variables
Binary outcome variables
X1 = "Treatment was effective in patient P"
X1 = 1, if P was successfully treated,
X1 = 0, if the result of the treatment in patient P did not meet expectations
X2 = "Subject P developed cancer during follow-up"
X2 = 1, if this happened with P,
X2 = 0, if P did not develop cancer during follow-up
X3 = "Patient P was satisfied with treatment"
X3 = 1, if P expressed satisfaction,
X3 = 0, if P was not satisfied
Comparison measures for binary outcome variables
A) Frequency or risk difference (RD)
Difference in risks (relative frequencies) between the two groups
B) Relative risk (RR)
Ratio of risks (relative frequencies) between the two groups
C) Odds ratio (OR)
Ratio of odds between the two groups, where odds = risk / (1 – risk)
Risk and Odds (examples)

Risk        Odds
0.1 (10%)   10 / 90 = 0.11
0.2 (20%)   20 / 80 = 0.25
0.5 (50%)   50 / 50 = 1.0
0.6 (60%)   60 / 40 = 1.5
0.8 (80%)   80 / 20 = 4.0

For risks < 10%, odds and risks are essentially the same.
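The risk-to-odds conversion in the table is a one-liner; a minimal sketch (the function name is our own):

```python
def odds(risk):
    """Convert a risk (probability) into odds = risk / (1 - risk)."""
    return risk / (1 - risk)

print(odds(0.5))   # 1.0
print(odds(0.8))   # 4.0 (up to floating-point error)
print(odds(0.05))  # ~0.053, already close to the risk itself
```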
These comparison measures can be computed directly from the
underlying 2 by 2 table:

             with outcome   without outcome   total
exposed*     64 (80%)       16 (20%)           80
unexposed    72 (60%)       48 (40%)          120
total        136            64                200

RD = 64/80 - 72/120 = 0.80 - 0.60 = 0.2
RR = 64/80 : 72/120 = (64·120) / (72·80) = 1.33
OR = 64/16 : 72/48 = (64·48) / (16·72) = 2.67

* "exposed" can also stand for a specific treatment, in which case
subjects with the control treatment are said to be unexposed.
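The three calculations above can be wrapped in a small helper; a minimal sketch (the function name and cell labels a, b, c, d are our own):

```python
def two_by_two_measures(a, b, c, d):
    """RD, RR, OR from a 2x2 table:
    exposed:   a with outcome, b without
    unexposed: c with outcome, d without
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rd = risk_exposed - risk_unexposed          # risk difference
    rr = risk_exposed / risk_unexposed          # relative risk
    odds_ratio = (a * d) / (b * c)              # cross-product ratio
    return rd, rr, odds_ratio

rd, rr, or_ = two_by_two_measures(64, 16, 72, 48)
print(round(rd, 2), round(rr, 2), round(or_, 2))  # 0.2 1.33 2.67
```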
                      Intervention group      Control group           Comparison measure    p-value
                      (n = 80)                (n = 120)               (95%-conf. interval)
                      (95%-conf. interval)    (95%-conf. interval)

Risk difference
Successful treatment  80% (71%, 89%)          60% (49%, 71%)          20% (8%, 32%)         0.003
Satisfied patients    90% (83%, 97%)          80% (71%, 89%)          10% (<0%, 20%)        0.06

Relative risk
Successful treatment  80% (71%, 89%)          60% (49%, 71%)          1.33 (1.11, 1.60)     0.003
Satisfied patients    90% (83%, 97%)          80% (71%, 89%)          1.13 (<1.00, 1.26)    0.06

Odds ratio
Successful treatment  80% (71%, 89%)          60% (49%, 71%)          2.67 (1.38, 5.15)     0.003
Satisfied patients    90% (83%, 97%)          80% (71%, 89%)          2.25 (0.96, 5.30)     0.06
Why odds ratios?
Odds ratios are commonly used to describe associations
between binary outcomes and predictor variables because:
a) Unlike the relative risk, the odds ratio is a meaningful measure
not only in cohort but also in case-control studies.
b) Logistic regression models provide effect estimates in the
form of odds ratios.
How to interpret odds ratios?
There are 3 possibilities:
a) 1 < RR < OR
b) OR < RR < 1
c) RR = 1 = OR
Odds ratios are always farther away from 1 than the
corresponding relative risks.
With low risks (i.e., risks < 10%), odds ratios may be interpreted
as relative risks.
Comparison measures
for
count data
Count variables
Examples
Number of doctor's visits of a patient during a certain time period.
Number of deaths within a specific region during a certain time
period.
Number of children with epilepsy manifesting in the first 5 years of
life in Denmark 1979-2002
Incidence rate
If observational units are individual persons:
IR = number of events / length of the observation period
If observational units are populations
IR = number of events / person time observed
Example: IR of epilepsy in first 5 years of life in Denmark:
low birth weight:     361 / 272318 person years = 179 / 10^5 pyrs
normal birth weight: 1342 / 1513527 person years = 89 / 10^5 pyrs
Sun et al., Gestational Age, Birth Weight, Intrauterine Growth and Risk of Epilepsy,
Am J Epidemiol 2007; 167: 262-70
Incidence rate
If the event is unique (e.g., death), then the period of observation
of a person with this event equals the time between the beginning of
the observation period and the event.
[Diagram: timeline of the observation period showing a complete
observation without event, an incomplete observation without event,
and an observation ending with the event (♦).]
Incidence rate ratio
IRR = IR in group 2 / IR in group 1   ( = 179 / 89 = 2.01 )
95%-confidence interval (approximate)*:
IRR · exp( ±1.96 · √(1/n1 + 1/n2) )
( = 2.01 · exp( ±1.96 · √(1/361 + 1/1342) ) = (1.71, 2.37) )
n1 = number of events in group 1
n2 = number of events in group 2
* holds if n1 and n2 have a Poisson-distribution
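A sketch of a large-sample confidence interval for the IRR, using the standard error √(1/n1 + 1/n2) of ln(IRR) on the log scale (the function name is our own, and this standard formula may give a slightly different interval than the one printed on the slide):

```python
from math import exp, sqrt

def irr_with_ci(rate1, rate2, n1, n2, z=1.96):
    """Incidence rate ratio (group 2 vs group 1) with an approximate 95% CI.

    rate1, rate2: incidence rates in the two groups
    n1, n2: numbers of events behind those rates (Poisson assumption)
    """
    irr = rate2 / rate1
    se_log = sqrt(1 / n1 + 1 / n2)  # approximate SE of ln(IRR)
    return irr, irr * exp(-z * se_log), irr * exp(z * se_log)

# epilepsy example: group 1 = normal birth weight, group 2 = low birth weight
irr, lo, hi = irr_with_ci(89, 179, 1342, 361)
print(round(irr, 2))  # 2.01
```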
Adjusted and unadjusted comparison measures
In observational studies, but also in randomised trials with a
remaining imbalance of certain factors, differences between groups
may be confounded.
E.g., the difference in mean blood pressure between normal and
overweight persons is confounded by age (since both weight and
blood pressure tend to increase with age).
Without adjustment for the influence of age, the effect of overweight
on blood pressure is therefore overestimated.
There exist different statistical methods by which comparison
measures can be adjusted for such confounding influences:
-> stratification, standardization, regression models
Non-parametric comparison measures
Receiver Operating Characteristic (ROC) curve
[Figure: ROC curve plotting sensitivity (true positive rate) against
1 - specificity (false positive rate); AUC = 0.83.]
Outcome: worsening of EDSS-score by > 0.5 units over 14 years
Predictor: score involving z-values of latencies from eyes
and upper extremities at baseline
AUC = area under the curve
Area under the ROC-curve
The ROC-curve of X as a predictor of membership in population 2
(as opposed to population 1) has the property
AUC = proportion of pairs (x1, x2) with x1 from group 1 and
      x2 from group 2 satisfying x2 > x1
    + 0.5 · (proportion of pairs (x1, x2) with x1 from group 1 and
      x2 from group 2 satisfying x2 = x1)
This is an estimate of the probability that a randomly selected member
of population 2 will have a higher value of X than a randomly
selected member of population 1.
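The pairwise definition of the AUC can be computed directly by enumerating all cross-group pairs; a minimal sketch (the function name is our own, and for large samples a rank-based computation would be more efficient):

```python
def auc(group1, group2):
    """Pairwise AUC: P(X2 > X1) + 0.5 * P(X2 = X1) over all cross-group pairs."""
    wins = ties = 0
    for x1 in group1:
        for x2 in group2:
            if x2 > x1:
                wins += 1
            elif x2 == x1:
                ties += 1
    return (wins + 0.5 * ties) / (len(group1) * len(group2))

print(auc([1, 2, 3], [4, 5, 6]))  # 1.0 (perfect separation)
print(auc([1, 2], [1, 2]))        # 0.5 (no discrimination)
```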
AUC > 0.5: values of X are higher in group 2 than in group 1
AUC = 0.5: X does not discriminate between the two groups
AUC < 0.5: values of X are lower in group 2 than in group 1
! AUC can also be applied with ordinal variables and
provides a natural way of comparing such variables.
Moreover, AUC has a direct link to the Wilcoxon rank sum test:
a significant result of the Wilcoxon rank sum test is equivalent
to a significant difference between AUC and 0.5.
Summary: Tabular and graphical representations of distributions
Basic rule: all such representations should be self-explanatory.
Tables: categorical variables: relative (%) and absolute frequencies (n)
        numerical variables: mean ± SD (if ~ normally distributed)
                             median + quartiles or min / max (otherwise)
Figures: boxplots for numerical variables
         bar charts for categorical variables
         scatter plots to display the association between two numerical variables
         (normal probability plot for visual assessment of the "degree"
         of normality of the data distribution)
Summary: comparison measures
Numerical variables: difference in means (data ~ normally distributed
                     or no other measure wanted)
                     geometric mean ratio (data have ~ log-normal distribution)
Binary variables:    risk difference (or frequency difference)
                     relative risk
                     odds ratio
Count data:          incidence rate ratio
Numerical and ordinal data: area under the ROC-curve
All comparison measures always with 95%-confidence intervals!
Thank you for your attention!