Department of Epidemiology and Public Health
Unit of Biostatistics and Computational Sciences

Descriptive statistical methods and comparison measures

PD Dr. C. Schindler
Swiss Tropical and Public Health Institute, University of Basel
[email protected]
Annual meeting of the Swiss Societies of Neurophysiology, Neurology and Stroke, Lucerne, May 19th 2011

Contents
- Tabular representations
- Graphical representations
- Comparison measures for quantitative variables (difference in means, geometric mean ratio)
- Comparison measures for binary variables (risk difference, relative risk, odds ratio)
- Comparison measures for count data (incidence rate ratio)
- Non-parametric comparison measures (AUC)

General rules for tabular and graphical representations
Tables and figures should be self-explanatory:
- tables and figures: title and caption
- figures: clear axis titles with indication of units
- figures: explanation of the different graphical elements (colors, symbols, line types, etc.)

Tabular representations

Table 1 (longitudinal study report)
Comparison of the different groups with respect to baseline characteristics (sex, age, etc., including the baseline value of the outcome variable).
- Qualitative variables: relative frequencies in % + absolute frequencies
- Quantitative variables: mean (standard deviation) [1], or median (minimum - maximum) or (lower - upper quartile) [2]
[1] if the QQ-plot does not deviate systematically from a straight line
[2] if the QQ-plot shows clear curvature or a wave pattern

Statistical properties of the normal distribution
With mean µ and standard deviation σ:
- about 2/3 of all values (in fact 68%) lie between µ - σ and µ + σ
- about 95% of all values (in fact 95.4%) lie between µ - 2σ and µ + 2σ, leaving 2.5% in each tail
[Figure: normal density curve with the intervals µ ± σ and µ ± 2σ marked]

Huang HY et al., The Effects of Vitamin C Supplementation on Serum Concentrations of Uric Acid: Results of a Randomized Controlled Trial, Arthritis & Rheumatism 2005; 52(6): 1843-1847, DOI 10.1002/art.21105

Table 1 (cross-sectional study report)
Description of the sample studied and comparison with persons not included in the sample (with respect to demographic characteristics and health-relevant variables). Same rules as for Table 1 of a longitudinal study report.

Alkerwi A et al., Comparison of participants and non-participants to the ORISCAV-LUX population-based study on cardiovascular risk factors in Luxembourg, BMC Medical Research Methodology 2010, http://www.biomedcentral.com/content/pdf/1471-2288-10-80.pdf

Graphical representations

Boxplot (box plot)
Graphical representation of the distribution of a quantitative variable based on a few important measures (minimum, lower quartile, median, upper quartile, maximum). Outlying values are represented as individual points.

[Figure: boxplots of body mass index in adults aged 30 to 70 years in Basel (SAPALDIA study), separately for men and women; marked elements: lower fence*, 1st quartile (25th percentile), median, 3rd quartile (75th percentile), upper fence*]
* Lower (upper) fence: smallest (largest) observation which is still within 1.5 box lengths of the lower (upper) end of the box.

[Figure: number of discharges as percentage of the total number of patients, by day of week]
Wong HJ et al., Real-time operational feedback: daily discharge rate as a novel hospital efficiency metric, Qual Saf Health Care 2010; 19: 1-5, DOI 10.1136/qshc.2010.040832

Bar charts
1. Representation of the distribution of a qualitative variable or of a quantitative variable with few values (e.g., parity of a woman). Each value of the variable is assigned a bar whose height equals the absolute or relative frequency of the value.
2. Representation of group statistics (e.g., group means of the outcome variable) or of statistics of complex observational units (e.g., regions, hospitals, etc.)
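The fence rule from the boxplot slide above can be sketched in a few lines of Python. This is a minimal sketch, not the SAPALDIA analysis: the BMI values are invented for illustration, and the quartile interpolation method is an assumption (statistical packages differ here).

```python
# Sketch of the boxplot fence rule: the fences are the smallest / largest
# observations still within 1.5 box lengths (IQR) of the box ends.
# Assumption: "inclusive" quartile interpolation; software may differ.
from statistics import quantiles

def box_fences(data):
    """Return (lower fence, upper fence) as defined on the boxplot slide."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # box length
    inside = [x for x in data if q1 - 1.5 * iqr <= x <= q3 + 1.5 * iqr]
    return min(inside), max(inside)

bmi = [19.4, 21.0, 22.5, 23.1, 24.8, 25.2, 26.0, 27.3, 29.9, 41.5]
print(box_fences(bmi))  # (19.4, 29.9)
```

The value 41.5 lies more than 1.5 box lengths above the upper quartile, so it falls outside the fences and would be drawn as an individual point.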
[Figure: bar charts representing the distribution of a categorical variable with categories A-D in two groups, shown once with side-by-side bars and once as stacked bars; y-axis: relative frequency (%)]
Bars represent different categories (or levels) of the respective categorical variable. The heights of the bars are proportional to the relative frequencies of the associated categories.

Representation of group means by bar charts
[Figure: bar chart of group means with error bars]
Here, bars represent group means and error intervals are mean ± 1 standard error (68% confidence interval). 95% confidence intervals would be better (mean ± 2 standard errors).
Smith HAB et al., Nitric oxide precursors and congenital heart surgery: A randomized controlled trial of oral citrulline, J Thorac Cardiovasc Surg 2006; 132: 58-65

Scatter plots
[Figure: scatter plot of z-scores of lower extremity latency against z-scores of upper extremity latency]
Scatter plots serve to visualize the association between two numerical variables (here, z-scores of upper and lower extremity latencies in RRMS and SPMS patients).

Comparison measures
a) for quantitative data
b) for binary data
c) for count data

Comparison measures for quantitative variables

Differences in means
Application: comparison of different groups with respect to
a) the outcome of interest at follow-up, and/or
b) the change in the outcome variable during follow-up.
Example: effect of vitamin C on serum uric acid level.
Comparison measure: difference between the mean change in serum uric acid level in the treatment group (vitamin C supplementation) and the mean change in serum uric acid level in the placebo group.

Huang HY et al., The Effects of Vitamin C Supplementation on Serum Concentrations of Uric Acid: Results of a Randomized Controlled Trial, Arthritis Rheum. 2005; 52: 1843-7.
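The comparison measure just described (difference in mean change between two groups) can be sketched as follows. The change values are invented for illustration, not the trial's data, and the confidence interval uses a normal approximation; a real analysis would use the t-test discussed in the Remarks below.

```python
# Sketch: difference in mean change between two groups, with an
# approximate 95% confidence interval (normal approximation, z = 1.96).
# The data below are made up for illustration only.
from math import sqrt
from statistics import fmean, stdev

def mean_difference_ci(x, y, z=1.96):
    """Difference in means (y - x) with an approximate 95% CI."""
    d = fmean(y) - fmean(x)
    se = sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return d, (d - z * se, d + z * se)

placebo_change = [-0.1, 0.0, 0.2, -0.3, 0.1, 0.0, -0.2, 0.1]
vitc_change    = [-0.5, -0.3, -0.6, -0.2, -0.4, -0.7, -0.3, -0.4]
diff, (lo, hi) = mean_difference_ci(placebo_change, vitc_change)
print(round(diff, 2), round(lo, 2), round(hi, 2))  # -0.4 with a CI excluding 0
```

Since the whole interval lies below 0 here, the (made-up) treatment group shows a larger mean decrease than placebo at the 5% level.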
Remarks
The difference in the mean of an outcome variable between two independent samples is generally assessed using the t-test (validity conditions: approximate normality and similar variability of the data in both groups, or sufficiently large sample sizes).
If data have a skewed distribution (e.g., lab measurements), approximate normality of the data may often be achieved by a logarithmic transformation of the data (cf. next topic).
But a data transformation is not always appropriate, e.g., if mean costs are to be compared. In this case, bootstrap methods or permutation tests may help to achieve valid statistical comparisons.

Geometric mean ratios
In many cases, the original outcome has a skewed distribution but becomes approximately normal on a logarithmic scale. In this case, the data should first be log-transformed; then the group means of the log-transformed data should be compared.
Example: neurofilament heavy chain (NFH) protein in cerebrospinal fluid across healthy controls and different groups of MS patients.

[Figure: boxplots of the NFH protein concentration and of ln(NFH protein concentration) in controls, CIS, PPMS, SPMS and RRMS patients]

Group      Mean of ln(NFH)   Geometric mean     Median
Controls   3.30              exp(3.30) = 27.1   27.1
CIS        3.48              exp(3.48) = 32.5   32.9
PPMS       3.97              exp(3.97) = 53.0   47.8
SPMS       3.83              exp(3.83) = 46.1   51.2
RRMS       3.84              exp(3.84) = 46.5   43.4

QQ-plots of ln(NFH)
[Figure: QQ-plots (lognfh against inverse normal) for the groups HC, CIS, PPMS, SPMS and RRMS]
If the points are close to a straight line, the distribution can be considered approximately normal.

Geometric mean - mathematical definition
Let mean(ln(X)) denote the sample mean of a log-transformed variable ln(X).
Then, after back-exponentiation, this mean turns into the so-called geometric mean of X:

geometric mean of X = e^mean(ln(X)) (*)

If the distribution of ln(X) is approximately symmetrical, then the geometric mean of X is a good approximation of the median of X.
(*) e^u = exp(u) = Euler's exponential function (e = 2.71828... = Euler's number)

Geometric mean ratios
Let mean1(ln(X)) = mean of ln(X) in sample 1 and mean2(ln(X)) = mean of ln(X) in sample 2.
Then, after back-exponentiation, the difference ∆mean = mean2(ln(X)) - mean1(ln(X)) turns into the so-called geometric mean ratio between the two samples:

e^∆mean = e^(mean2(ln(X)) - mean1(ln(X))) = e^mean2(ln(X)) / e^mean1(ln(X)) = GM2(X) / GM1(X)

In many cases, this ratio is close to the ratio of medians.

Geometric mean ratios (example)

Group      Mean (log scale)   Geometric mean     Mean difference (log scale)   Geometric mean ratio
Controls   3.30               exp(3.30) = 27.1   0                             1
CIS        3.48               exp(3.48) = 32.5   0.18                          32.5 / 27.1 = 1.20 = exp(0.18)
PPMS       3.97               exp(3.97) = 53.0   0.67                          53.0 / 27.1 = 1.96 = exp(0.67)
SPMS       3.83               exp(3.83) = 46.1   0.53                          46.1 / 27.1 = 1.70 = exp(0.53)
RRMS       3.84               exp(3.84) = 46.5   0.54                          46.5 / 27.1 = 1.72 = exp(0.54)

Digression:
95% confidence limits of geometric means: exp[mean(log scale) ± 1.96 · SE(mean(log scale))]
95% confidence limits of geometric mean ratios: exp[∆mean(log scale) ± 1.96 · SE(∆mean(log scale))]

Comparison measures for binary variables

Binary outcome variables
X1 = "Treatment was effective in patient P": X1 = 1 if P was successfully treated, X1 = 0 if the result of the treatment in patient P did not meet expectations.
X2 = "Subject P developed cancer during follow-up": X2 = 1 if this happened with P, X2 = 0 if P did not develop cancer during follow-up.
X3 = "Patient P was satisfied with treatment": X3 = 1 if P expressed satisfaction, X3 = 0 if P was not satisfied.

Comparison measures for binary outcome variables
A) Frequency or risk difference (RD): difference in risks (relative frequencies) between the two groups
B) Relative risk (RR): ratio of risks (relative frequencies) between the two groups
C) Odds ratio (OR): ratio of odds* between the two groups, where odds = risk / (1 - risk)

Risk and odds (examples)

Risk        Odds
0.1 (10%)   10 / 90 = 0.11
0.2 (20%)   20 / 80 = 0.25
0.5 (50%)   50 / 50 = 1.0
0.6 (60%)   60 / 40 = 1.5
0.8 (80%)   80 / 20 = 4.0

For risks < 10%, odds and risks are essentially the same.

These comparison measures can be computed directly from the underlying 2-by-2 table:

              with outcome   without outcome   total
exposed*      64 (80%)       16 (20%)          80
unexposed     72 (60%)       48 (40%)          120
total         136            64                200

RD = 64/80 - 72/120 = 0.80 - 0.60 = 0.2
RR = (64/80) : (72/120) = (64 · 120) / (72 · 80) = 1.33
OR = (64/16) : (72/48) = (64 · 48) / (16 · 72) = 2.67

* "Exposed" can also stand for a specific treatment, in which case subjects with the control treatment are said to be unexposed.

                       Intervention group   Control group        Risk difference
                       (n = 80, 95% CI)     (n = 120, 95% CI)    (95% CI)             p-value
Successful treatment   80% (71%, 89%)       60% (49%, 71%)       20% (8%, 32%)        0.003
Satisfied patients     90% (83%, 97%)       80% (71%, 89%)       10% (<0%, 20%)       0.06

                                                                 Relative risk
Successful treatment   80% (71%, 89%)       60% (49%, 71%)       1.33 (1.11, 1.60)    0.003
Satisfied patients     90% (83%, 97%)       80% (71%, 89%)       1.13 (<1.00, 1.26)   0.06

                                                                 Odds ratio
Successful treatment   80% (71%, 89%)       60% (49%, 71%)       2.67 (1.38, 5.15)    0.003
Satisfied patients     90% (83%, 97%)       80% (71%, 89%)       2.25 (0.96, 5.30)    0.06

Why odds ratios?
Odds ratios are commonly used to describe associations between binary outcomes and predictor variables because:
a) unlike the relative risk, the odds ratio is a meaningful measure not only in cohort but also in case-control studies;
b) logistic regression models provide effect estimates in the form of odds ratios.

How to interpret odds ratios?
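As a worked illustration, the 2-by-2 table from the slides above can be pushed through a few lines of Python; note in the output that the odds ratio (2.67) lies farther from 1 than the relative risk (1.33). This is a minimal sketch of the textbook formulas, not a replacement for software that also provides confidence intervals.

```python
# Risk difference, relative risk and odds ratio from a 2-by-2 table.
def two_by_two_measures(a, b, c, d):
    """a, b = exposed with / without outcome; c, d = unexposed with / without."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rd = risk_exposed - risk_unexposed
    rr = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)  # cross-product ratio
    return rd, rr, odds_ratio

# The table from the slide: 64/16 exposed, 72/48 unexposed
rd, rr, or_ = two_by_two_measures(64, 16, 72, 48)
print(round(rd, 2), round(rr, 2), round(or_, 2))  # 0.2 1.33 2.67
```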
There are three possibilities:
a) 1 < RR < OR
b) OR < RR < 1
c) RR = 1 = OR
Odds ratios are always farther away from 1 than the corresponding relative risks. With low risks (i.e., risks < 10%), odds ratios may be interpreted as relative risks.

Comparison measures for count data

Count variables
Examples:
- Number of doctor's visits of a patient during a certain time period.
- Number of deaths within a specific region during a certain time period.
- Number of children with epilepsy manifesting in the first 5 years of life in Denmark, 1979-2002.

Incidence rate
If the observational units are individual persons:
IR = number of events / length of the observation period
If the observational units are populations:
IR = number of events / person-time observed
Example: IR of epilepsy in the first 5 years of life in Denmark:
- low birth weight: 361 / 272318 person-years = 179 / 10^5 person-years
- normal birth weight: 1342 / 1513527 person-years = 89 / 10^5 person-years
Sun Y et al., Gestational Age, Birth Weight, Intrauterine Growth and Risk of Epilepsy, Am J Epidemiol 2007; 167: 262-70

If the event is unique (e.g., death), then the period of observation of a person with this event equals the time between the beginning of the observation period and the event.
[Figure: timelines of complete and incomplete observations with and without an event during the observation period]

Incidence rate ratio
IRR = IR in group 2 / IR in group 1 (= 179 / 89 = 2.01)
Approximate 95% confidence interval*:
IRR · exp(±1.96 · √(1/n1 + 1/n2)) = 2.01 · exp(±1.96 · √(1/361 + 1/1342)) = (1.71, 2.37)
where n1 = number of events in group 1 and n2 = number of events in group 2.
* Holds if n1 and n2 have a Poisson distribution.

Adjusted and unadjusted comparison measures
In observational studies, but also in randomised trials with a remaining imbalance of certain factors, differences between groups may be confounded.
For example, the difference in mean blood pressure between normal-weight and overweight persons is confounded by age (since both weight and blood pressure tend to increase with age). Without adjustment for the influence of age, the effect of overweight on blood pressure is therefore overestimated.
There exist different statistical methods by which comparison measures can be rid of such confounding influences: stratification, standardization, regression models.

Non-parametric comparison measures

Receiver operating characteristic (ROC) curve
[Figure: ROC curve, sensitivity (true positive rate) against 1 - specificity (false positive rate), with AUC = 0.83. Outcome: worsening of the EDSS score by > 0.5 units over 14 years. Predictor: score involving z-values of latencies from eyes and upper extremities at baseline.]
AUC = area under the curve

Area under the ROC curve
The ROC curve of X as a predictor of membership in population 2 (as opposed to population 1) has the property
AUC = proportion of pairs (x1, x2), with x1 from group 1 and x2 from group 2, satisfying x2 > x1
      + 0.5 · (proportion of pairs (x1, x2), with x1 from group 1 and x2 from group 2, satisfying x2 = x1)
This is an estimate of the probability that a randomly selected member of population 2 will have a higher value of X than a randomly selected member of population 1.

AUC > 0.5: values of X are higher in group 2 than in group 1
AUC = 0.5: X does not discriminate between the two groups
AUC < 0.5: values of X are lower in group 2 than in group 1
Note: AUC can also be applied with ordinal variables and provides a natural way of comparing such variables. Moreover, AUC has a direct link to the Wilcoxon rank-sum test.
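The pairwise definition of AUC above can be computed directly, which also makes the link to the Wilcoxon (Mann-Whitney) statistic concrete: the sum of "wins" plus half the ties is the Mann-Whitney U, and AUC = U / (n1 · n2). The two small samples below are made up for illustration.

```python
# AUC from its pairwise definition: the probability that a randomly chosen
# member of group 2 exceeds a randomly chosen member of group 1,
# counting ties as one half.
def auc(group1, group2):
    pairs = [(x1, x2) for x1 in group1 for x2 in group2]
    wins = sum(x2 > x1 for x1, x2 in pairs)
    ties = sum(x2 == x1 for x1, x2 in pairs)
    return (wins + 0.5 * ties) / len(pairs)  # = U / (n1 * n2)

controls = [1.0, 1.2, 1.5, 2.0, 2.2]   # hypothetical latency scores, group 1
patients = [1.4, 2.1, 2.4, 2.8, 3.0]   # hypothetical latency scores, group 2
print(auc(controls, patients))  # 0.84
```

With identical groups the function returns 0.5, matching the "no discrimination" case above.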
A significant result of the Wilcoxon rank-sum test is equivalent to a significant difference between AUC and 0.5.

Summary: tabular and graphical representations of distributions
Basic rule: all such representations should be self-explanatory.
Tables:
- categorical variables: relative (%) and absolute frequencies (n)
- numerical variables: mean ± SD (if approximately normally distributed); median + quartiles or min/max (otherwise)
Figures:
- boxplots for numerical variables
- bar charts for categorical variables
- scatter plots to display the association between two numerical variables
- (normal probability plot for visual assessment of the "degree" of normality of a data distribution)

Summary: comparison measures
Numerical variables:
- difference in means (data approximately normally distributed, or no other measure wanted)
- geometric mean ratio (data have an approximately log-normal distribution)
Binary variables:
- risk difference (or frequency difference)
- relative risk
- odds ratio
Count data:
- incidence rate ratio
Numerical and ordinal data:
- area under the ROC curve
All comparison measures always with 95% confidence intervals!

Thank you for your attention!