Selecting Right Statistics
Hyungjin Myra Kim, Sc.D.
The University of Michigan

Choosing an Analytic Method (1)
• First, the analytic plan should be considered while planning the study.
• What do you plan to study (or measure)? The primary outcome measure determines the type of dependent variable:
  - Continuous (ex: hours of sleep)
  - Dichotomous (ex: binge drinking or not)
  - Ordinal (ex: depression diagnosis)
  - Categorical (ex: choice of treatment)
  - Time to event (ex: time to relapse)

Choosing an Analytic Method (2)
• Sometimes, there is no dependent variable:
  - Factor analysis
  - Cluster analysis
  - Higher-way contingency table analyses
  - Agreement (kappa)
  - Correlation analysis (correlation coefficient)
  - Accuracy (sensitivity, specificity)
(We will not discuss the above today.)

Choosing an Analytic Method (3)
Study design
• Do you have a primary comparison?
  - Determines the nature of the primary predictor variable (independent variable)
    (ex) 2-group or 3-group comparison?
    (ex) Evaluating the relationship between happiness and the ratio of leisure to work hours
• How often do you plan to measure?
  - Cross-sectional, longitudinal, cross-over (X-over)
  - Determines the number of dependent variables
    (ex) pre/post has two measurements per person

The choice of analysis will also depend on
• Univariate vs. bivariate analysis
• Bivariate vs. multivariate analysis
  - Potential confounder?
  - Adjust for covariates?
• Data skewed or sample size small?
  - Transformation
  - Parametric vs. non-parametric

Study Designs: Examples by Dependent Variable (Outcome) Type

Study Design | Continuous outcome | Binary (yes/no) outcome
------------------------------------------------------------
Pre/Post | Effect of nightly exercise on hours of sleep before/after in insomniacs | Patient satisfaction before vs. after color change in hospital ward
Matched pairs | Mastectomy vs. lumpectomy on QOL in patients matched by age & family history | Mastectomy vs. lumpectomy on survival in patients matched by age & family history
1-group | Cholesterol in diabetic patients: is it higher than in the general public? | Depression in substance abusers
2-group | Writing skill between teaching methods A vs. B | Comparison of drugs A vs. B on relapse to heavy drinking
2-group, pre/post | Weight before/after in exercise vs. no exercise group | Satisfaction before & after between 2 skin products
3-group | Comparing effectiveness of three drugs on cholesterol | Pain reduction with three different pain relief medications
Continuous predictor | Does pack-year of smoking predict cognitive deficit? | Is average nightly sleep predictive of hair loss?

What Type of Analysis?
• Descriptive
  - Numerical: tables of means, counts, proportions
  - Graphical: histograms, box plots, scatter plots, etc.
• Inferential
  - Estimation: point estimates / confidence intervals
  - Hypothesis tests

Analytic Methods by Dependent Variable (Outcome) Type

Study Design | Continuous outcome (multiple regression for multivariate analysis) | Binary (yes/no) outcome (logistic regression for multivariate analysis)
------------------------------------------------------------
Pre/Post | Paired t-test | McNemar’s test
Matched pairs | Paired t-test | McNemar’s test
1-group | One-group t-test | One proportion test
2-group* | Two-group t-test | Two proportion test or chi-square test
2-group, pre/post* | Analysis of covariance or multiple regression | Repeated measures logistic regression
3-group* | Analysis of variance | Chi-square test
Continuous predictor | Simple regression | Logistic regression
* Bivariate relationships

Binary Dependent Variable
Descriptive Statistics: Proportion
To estimate a proportion or prevalence, subjects must be a representative sample from the population. Assuming the subjects are representative and independent, the rate is estimated as:
  p = n/N
where n is the number of subjects with the attribute and N is the total number of subjects tested (or studied).

Binary Dependent Variable (2)
When Only One Group is of Interest:
• Test: proportion compared to a null value (one proportion test)
  Ex) Are substance abusers more likely to be depressed than the general public?
• Confidence Interval (95% CI: proportion ± 1.96*SE)
  Ex) Prevalence of depression in substance abusers
  Ex) Sensitivity and specificity of a new short depression instrument compared with the physician’s gold standard depression diagnosis
• More on interpretation of 95% CI tomorrow.

Binary Dependent Variable (3)
When Comparing Two Independent Groups:
Ex) Comparing drugs A vs. B on relapse to heavy drinking
• Essentially a 2 by 2 table
• Comparative Test
  - Chi-square test
  - Two proportion test
• Comparative Statistics (summary effect size)
  - Absolute difference in proportions
  - Odds ratios (OR)
  - Relative risks (RR)
  For both OR and RR, 1 means no difference.
• Can calculate 95% CI for any of the above (OR, etc.)

Binary Dependent Variable (4)
When Comparing Two Independent Groups:
If sample size is small (rule of thumb: expected cell count < 5)
• Comparative Test: Fisher’s Exact Test

        |   A    B | Total
--------------------------
yes     |   3    6 |     9
no      |   9    2 |    11
--------------------------
Total   |  12    8 |    20

Pearson chi-square test p-value = 0.028
Fisher's exact test p-value = 0.065

Continuous Dependent Variable
Descriptive Statistics
• Mean and standard deviation if data are symmetric
• Median and inter-quartile range if data are skewed
  - The mean can be affected by one very large or one very small value, and is therefore sensitive to outlying values.
  - The median is robust to an outlying value because it is simply the value at the center when data are ranked in order.
• If mean and median are very different, data are skewed.
• Always graphically explore the distribution (e.g., using histogram, box plot) and choose the appropriate descriptive statistics.
• More on mean vs. median tomorrow.

Continuous Dependent Variable (2)
When Only One Group is of Interest:
• Test (one mean compared to a null value): one sample t-test
  Ex) Is cholesterol higher in diabetic patients compared with the general public?
• Confidence Interval (95% CI = mean ± 1.96*SE)
  Ex) Sample mean cholesterol = 124, sample SD = 10, N = 200
  95% CI for mean cholesterol = 124 ± 1.96*10/sqrt(200) = (122.6, 125.4)

Continuous Dependent Variable (3)
When Only One Group is of Interest:
When the sample size is small (N < 25), or when one cannot assume that the dependent variable is interval and normally distributed, use a non-parametric test.
• One Sample Median Test
  - Sign test
  - Signed rank test

Continuous Dependent Variable (4)
When Comparing Two Independent Groups:
• Test: two independent group t-test
  Ex) Writing skill comparison between teaching methods A vs. B
• Comparative Statistics: difference in means
  Ex) Difference in mean writing skill scores between those who were taught with method A vs. method B
• Confidence Interval for Difference in Means
  95% CI = difference ± 1.96*SE (of difference)

Continuous Dependent Variable (5)
When Comparing Two Independent Groups:
When the sample size is small (N < 25), or when one cannot assume that the dependent variable is interval and normally distributed, use a non-parametric test (test of median).
• Wilcoxon rank-sum test (tests equality of medians)

Graphical Methods to Compare Groups: Box Plots
[Figure: box plots of resting heart rate for the No Exercise, Mild Exercise, and Strenuous Exercise groups]

Using Subjects as Their Own Controls: Cross-Over Designs
Same subject undergoes 2 or more treatments
• Advantage
  - Maximizes power: fewest subjects needed
• Limitations of reusing the same subject
  - May not be possible
  - Carryover effect of treatment: need washout
  - Length of experiment
  - Order effect: order should be randomized and balanced
  - Period effect

Cross-Over Designs (2)
Examples
• Pre-post study (poor design, why?)
  Ex) Weight before an exercise program and weight after a month of the exercise program
• Traditional X-over Study
  Ex) Alternating exposure to a guided imagery procedure between a stressful situation and a natural relaxing situation on different days in random order, and assessing the effect on craving
    - Stressful Image, then washout period, then Relaxing Image
    - Relaxing Image, then washout period, then Stressful Image
  Ex) Drug A then cross over to B

Cross-Over Designs (3)
Analytic Method
• Pre-post study
  - Analyze the change-score or gain-score and treat it as a one sample problem
    Ex) Change in weight within a person before and after the exercise program
• Traditional X-over Study
  - Analysis must first assess carryover effect, order effect and period effect.
  - If any of these effects is present, the analysis must account for it.

Multiple Comparison: Doing Many Tests
• α-level (significance level): the probability of claiming that there is a difference when there is no true difference
  - Small α is good.
  - We usually set the α-level at 0.05.
  - This means we allow a 5% chance of making the kind of error where we declare a significant difference (reject the null hypothesis) when the result happened by chance (Type 1 error).

Multiple Comparison (2)
• When there are ≥2 comparisons, α (5%) should be reduced to adjust for the number of comparisons.
• Suppose we are performing two independent statistical tests. Then:
  - P(rejecting the 1st null hypothesis when true) is 0.05
  - P(rejecting the 2nd null hypothesis when true) is 0.05
• What is the probability of rejecting at least one?
  - P(accepting the 1st when true) is 0.95
  - P(accepting the 2nd when true) is 0.95
  - Therefore, P(accepting both) = 0.95 x 0.95 = 0.9025
  - That is, P(rejecting at least one) = 1 - 0.9025 = 0.0975

Multiple Comparison (3)

Number of independent tests | Probability of rejecting at least one null hypothesis when all are true
------------------------------------------------------------
 1 | 0.05
 2 | 0.0975
 3 | 0.143
 5 | 0.226
10 | 0.401

If you perform enough significance tests, you are almost sure to find significant results by chance alone even when no true difference exists.

Multiple Comparisons: What to do? (4)
For independent tests, one easy way of adjusting the level of significance is to use 0.05/k, where k is the number of tests to be performed. Therefore, instead of 0.05:
  - When there are 5 tests, use 0.01
  - When there are 10 tests, use 0.005

Multiple Comparison (5)
• When testing a pre-specified relationship, use a significance level of 5%.
• When screening for interesting relationships, use a significance level of 1% so as not to identify too many false relationships.

Confounding
Example 1: Sex bias in graduate admissions? (UC Berkeley, 1973)
Overall: 44% of males admitted, 35% of females admitted
Admissions are made by department.

Confounding (2)

Major | Male: Number of Applicants | Male: Percent Admitted | Female: Number of Applicants | Female: Percent Admitted
------------------------------------------------------------
A     |  825 | 62% |  108 | 82%
B     |  560 | 63% |   25 | 68%
C     |  325 | 37% |  593 | 34%
D     |  417 | 33% |  375 | 35%
E     |  191 | 28% |  393 | 24%
F     |  373 |  6% |  341 |  7%
Total | 2691 | 45% | 1835 | 30%
Weighted Average: 39% (male), 43% (female)

Confounding (3)
Example 2: Is the psychiatric hospitalization rate different in substance users versus non-users?

         | Hospitalization: Yes | No  | Rate
--------------------------------------------
User     | 20 | 373 | 5.1%
Non-user |  6 | 316 | 1.9%

Substance use looks to be associated with a higher psychiatric hospitalization rate.

Separated by Bipolar Status

         | No Bipolar: Yes | No  | Rate | Bipolar I/II: Yes | No  | Rate
------------------------------------------------------------
User     | 3 | 176 | 1.7% | 17 | 197 | 7.9%
Non-user | 4 | 293 | 1.4% |  2 |  23 | 8.0%

Confounding (4)
Example 3: Smoking versus MI

Combined | Smoker | Non-Smoker
------------------------------
MI       | 51  | 54
No MI    | 43  | 67
MI rate  | 54% | 44.6%
OR = 1.47

Male     | Smoker | Non-Smoker
------------------------------
MI       | 37  | 25
No MI    | 24  | 20
MI rate  | 61% | 56%
OR = 1.23

Female   | Smoker | Non-Smoker
------------------------------
MI       | 14  | 29
No MI    | 19  | 47
MI rate  | 42% | 38%
OR = 1.19

Smokers have a higher MI rate, but the magnitude of the relative likelihood of MI (measured as odds ratio (OR)) is larger in the combined data than within either sex.
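The odds ratios in Example 3 can be reproduced directly from the cell counts. The following is a minimal sketch in Python; the helper name `odds_ratio` and the tuple layout are assumptions made for illustration and are not part of the original slides.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio for a 2 by 2 table: (a * d) / (b * c)."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)

# Counts from Example 3, ordered as:
# (smoker MI, smoker no MI, non-smoker MI, non-smoker no MI)
male = (37, 24, 25, 20)
female = (14, 19, 29, 47)

# The combined table is the cell-by-cell sum of the two strata.
combined = tuple(m + f for m, f in zip(male, female))

or_by_group = {
    "combined": odds_ratio(*combined),
    "male": odds_ratio(*male),
    "female": odds_ratio(*female),
}

for name, value in or_by_group.items():
    print(f"{name}: OR = {value:.2f}")
```

Both within-sex odds ratios come out smaller than the combined one, which is the pattern the slide attributes to confounding by sex.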
Confounding (5)
Example 4:

1) Regression of Happiness on Smoker Group
          | Coef  | SE   | p-value
-----------------------------------
Intercept | 65.05 | 1.48 | 0.000
Smoke     |  4.80 | 2.03 | 0.020

2) Regression of Happiness on Age
          | Coef  | SE   | p-value
-----------------------------------
Intercept |  7.48 | 2.45 | 0.003
Age       |  1.85 | 0.07 | 0.000

3) Regression of Happiness on Age and Smoke
          | Coef  | SE   | p-value
-----------------------------------
Intercept |  2.65 | 2.07 | 0.203
Age       |  2.08 | 0.07 | 0.000
Smoke     | -5.25 | 0.70 | 0.000

Confounding (6)
[Figure: Happiness by Smoking Status (Not vs. Smoke). Without considering age, smokers appear to have a higher mean.]
[Figure: Relationship between Happiness and Age, with points marked by smoking status.]
• Increasing age is associated with greater happiness.
• Smokers tend to be older, making it look like smoking is associated with greater happiness when not adjusting for age.
• But smokers tend to be less happy than non-smokers of the same age.

Developing a Statistical Analysis Plan
• Comparing two groups
  - Continuous: t-test
  - Proportion: chi-square test
• Comparing multiple groups (continuous): ANOVA
  - Adjusted for other factors: ANCOVA, or regression
• Dichotomous outcome: Logistic regression
• Count outcome: Poisson regression
• Survival time outcome: Cox regression
• Watch for correlated data (repeated measures, clusters, e.g., teeth in the mouth)

To Keep in Mind
• Typically, multiple appropriate methods are available to analyze the same data, and each could yield a legitimate answer.
• Try to use at least two different available methods to confirm your results.
• Always look at the raw data and display data graphically, so learn to choose the right displays (ex: cross tabs, scatter plots, box plots).
• It helps to make sample tables summarizing results before you start the analysis.
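As one illustration of checking a result with two methods, the small 2 by 2 table from the Fisher's exact test slide (yes: 3 and 6; no: 9 and 2) can be analyzed with both the Pearson chi-square test and Fisher's exact test. The sketch below uses only the Python standard library; the function names are hypothetical, the chi-square test is computed without a continuity correction, and the two-sided Fisher p-value is assumed to sum all tables no more probable than the observed one. In practice a statistics package would normally be used instead.

```python
from math import comb, erfc, sqrt

def pearson_chi2_p(a, b, c, d):
    """Pearson chi-square p-value for a 2 by 2 table (1 df, no continuity correction)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail probability is erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))

def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact p-value: total probability of all tables
    with the same margins that are no more likely than the observed table."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(n - row1, col1 - x) / comb(n, col1)

    lowest, highest = max(0, row1 + col1 - n), min(row1, col1)
    p_observed = prob(a)
    # Small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Table from the slides: rows are yes/no, columns are groups A/B.
p_chi2 = pearson_chi2_p(3, 6, 9, 2)
p_fisher = fisher_exact_p(3, 6, 9, 2)

print(f"Pearson chi-square p-value = {p_chi2:.3f}")
print(f"Fisher's exact p-value     = {p_fisher:.3f}")
```

With a small table like this one, the two methods fall on opposite sides of 0.05, which is why the expected cell count rule of thumb matters when choosing between them.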