Biostatistics and Epidemiology: ASPHO 2015
Lillian Sung MD, PhD

Disclosure Information
• Lillian Sung – No disclosures

Outline
• Principles of Use of Biostatistics in Research
• Principles of Epidemiology and Clinical Research Design
• Applying Research to Clinical Practice

A. Principles of Use of Biostatistics in Research

1. Types of variables
• Distinguish types of variables (eg, continuous, categorical, ordinal, nominal) (slide 7)
• Understand how the type of variable (eg, continuous, categorical, nominal) affects the choice of statistical test (slides 7, 12-16)

2. Distribution of Data
• Understand how distribution of data affects the choice of statistical test (slides 8, 12, 13)
• Differentiate normal from skewed distribution of data (slide 8)
• Understand the appropriate use of the mean, median, and mode (slide 8)
• Understand the appropriate use of standard deviation (slide 9)
• Understand the appropriate use of standard error (slide 9)

3. Hypothesis testing
• Distinguish the null hypothesis from an alternative hypothesis (slides 10, 11)
• Interpret the results of hypothesis testing (slides 10, 11, 18)

4. Statistical tests
• Understand the appropriate use of the chi-square test versus a t-test (slides 12-16)
• Understand the appropriate use of analysis of variance (ANOVA) (slides 12, 13)
• Understand the appropriate use of parametric (eg, t-test, ANOVA) versus non-parametric (eg, Mann-Whitney U, Wilcoxon) statistical tests (slides 12, 13)
• Interpret the results of chi-square tests (slides 15, 16)
• Interpret the results of t-tests (slide 14)
• Understand the appropriate use of a paired and non-paired t-test (slides 14, 17)
• Determine the appropriate use of a 1- versus 2-tailed test of significance (slides 19, 20)
• Interpret a p-value (slides 18-21)
• Interpret a p-value when multiple comparisons have been made (slide 21)
• Interpret a confidence interval (slide 22)
• Identify a type I error (slide 23)
• Identify a type II error (slide 23)

5.
Measurement of association
• Differentiate relative risk reduction from absolute risk reduction (slide 24)
• Calculate and interpret a relative risk (slide 25)
• Calculate and interpret an odds ratio (slide 25)
• Interpret a hazard ratio (slide 25)
• Understand the uses and limitations of a correlation coefficient (slide 26)

6. Regression
• Identify when to apply regression analysis (eg, linear, logistic) (slide 27)
• Interpret a regression analysis (eg, linear, logistic) (slide 27)
• Identify when to apply survival analysis (eg, Kaplan-Meier) (slides 28-30)
• Interpret a survival analysis (eg, Kaplan-Meier) (slides 28-30)

7. Diagnostic tests
• Recognize the importance of an independent "gold standard" in evaluating a diagnostic test (slides 31, 32)
• Calculate and interpret sensitivity and specificity (slides 31, 32)
• Calculate and interpret positive and negative predictive values (slide 31)
• Understand how disease prevalence affects the positive and negative predictive value of a test (slide 32)
• Calculate and interpret likelihood ratios (slide 33)
• Interpret a receiver operating characteristic curve (slide 34)
• Interpret and apply a clinical prediction rule (slide 34)

8.
Systematic reviews and meta-analysis
• Understand the purpose of a systematic review (slide 35)
• Understand the advantages of adding a meta-analysis to a systematic review (slide 35)
• Interpret the results of a meta-analysis (slide 35)
• Identify the limitations of a systematic review (slide 35)
• Identify the limitations of a meta-analysis (slide 35)

Types of Variables

  Type                  Example
  Dichotomous           Induction death: yes/no
  Categorical/Nominal   Leukemia type: AML, ALL, CML, other
  Ordinal               CTCAE toxicity grade: 1, 2, 3, 4, 5
  Continuous            Serum creatinine
  Survival              Time to death

The nature of the outcome variable (dichotomous, categorical, ordinal, survival) drives the choice of statistical tests.

Distribution of Data
[Figure: a normal and a skewed density curve. For the normal curve, mean = median = mode; for the skewed curve, the mode, median, and mean separate, with the mean pulled toward the tail.]
• Mean – average value (use if normal)
• Median – middle value (use if skewed)
• Mode – most common value
The distribution of the data also drives the choice of statistical tests.

Standard Deviation and Standard Error
• Standard deviation – average spread of values around the mean; the wider the spread, the larger the SD
• Standard error – SD/sqrt(n)

Hypothesis Testing
• Define the null hypothesis (the treatment does not work) and the alternative hypothesis (the treatment works)
• Determine the probability of observing data as or more extreme than those seen, assuming the treatment does not work
• If this probability is sufficiently small, reject the null hypothesis
[Figure: standard normal curve with rejection regions beyond Z = −1.96 and Z = +1.96.]

Statistical Tests
The nature of the outcome variable and the distribution of the data drive the choice of statistical tests.

Outcome is continuous – number of groups of interest:
• Two groups – parametric: Student's t-test; non-parametric: Mann-Whitney U (Wilcoxon rank sum)
• Three or more groups – parametric: analysis of variance (ANOVA); non-parametric: Kruskal-Wallis

Student's t-test
• Two groups with a continuous outcome measure
• t = [mean(gp1) – mean(gp2)] divided by the variability of that difference
• Larger t ~ smaller p value
• Assumptions: data normally distributed; observations independent
• If data are matched (eg, blood pressure before and after), use a paired t-test

Outcome is dichotomous – two or more groups: chi-square or Fisher's exact test

Chi-Square Test
• Compares proportions in 2 or more groups:

                        WBC ≥ 200    WBC < 200
  Induction Death Yes       A            B
  Induction Death No        C            D

• Calculate expected values for each cell
• χ² = Σ (O − E)²/E
• Larger χ² ~ smaller p value

Matched/Paired Versus Independent
Are the exposure groups independent of one another? Ways to induce matching:
• Compare outcome within an individual (eg, pre-post intervention, cross-over trial)
• Create a match by how you select subjects (eg, in a case-control study, match cases and controls)

P Values
• P value: probability of obtaining a test statistic at least as extreme as the one actually observed, assuming the null hypothesis is true
• Translation: the chance of getting the results you saw, assuming that the treatment doesn't work
• P = 0.05: 5% chance of seeing data this extreme assuming the null hypothesis
  – Translation: assuming that the treatment doesn't work, there is a 5% chance of observing a difference this large by chance alone

Two- vs One-Sided P Value
• A two-sided P value evaluates both that the treatment is better and that the treatment is worse than control; typically, two-sided P values are used
[Figure: rejection regions of 2.5% in each tail, beyond Z = −1.96 and Z = +1.96.]
• A one-sided P value evaluates only that the treatment is better than control; it makes it easier to show "statistical significance" and is less commonly used
[Figure: a single 5% rejection region beyond Z = +1.645.]
• Multiple testing: if you do many tests, you increase the chance of finding P < 0.05 by chance alone; therefore adjust the P value for multiple comparisons

Confidence Intervals
Confidence interval: probability that the interval contains the
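As a concrete illustration of the test statistics above, here is a short Python sketch (not part of the original slides; the function names are my own, and the confidence interval uses the normal approximation):

```python
import math
from statistics import mean, variance

def t_statistic(group1, group2):
    # Pooled two-sample Student's t: difference in means divided by the
    # standard error of that difference.  Larger |t| ~ smaller p value.
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se_diff

def chi_square_2x2(a, b, c, d):
    # Chi-square for a 2x2 table [[a, b], [c, d]]: compute the expected
    # count for each cell, then sum (O - E)^2 / E over the cells.
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [(a + b) * (a + c) / n,   # row total x column total / n
                (a + b) * (b + d) / n,
                (c + d) * (a + c) / n,
                (c + d) * (b + d) / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def ci95_mean(sample):
    # Approximate 95% confidence interval for a mean:
    # mean +/- 1.96 * standard error, where SE = SD / sqrt(n).
    se = math.sqrt(variance(sample) / len(sample))
    return mean(sample) - 1.96 * se, mean(sample) + 1.96 * se
```

Identical groups give t = 0, and a 2×2 table whose observed cells equal the expected cells gives χ² = 0; the further the observed counts depart from the expected counts, the larger the statistic and the smaller the p value.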
true parameter. For example, a 95% CI around the mean – 95% probability that the interval contains the true mean.

Type I and Type II Errors

  CONCLUSION FROM TEST     TRUTH: difference exists              TRUTH: no difference exists
  Difference exists        Correct (power, sensitivity)          Type I or alpha error (false positive)
  No difference exists     Type II or beta error (false negative)  Correct

Measures of Association

              Outcome Yes   Outcome No
  Treatment     A (1)         B (9)
  Control       C (4)         D (6)

Risk of outcome in treatment group = A/(A+B) = 0.10
Risk of outcome in control group = C/(C+D) = 0.40
• Absolute risk reduction: decrease in risk of an outcome associated with an intervention
  ARR = C/(C+D) − A/(A+B) = 0.40 − 0.10 = 0.30
• Relative risk reduction: absolute risk reduction divided by the event rate in the control arm
  RRR = 0.30/0.40 = 0.75
• Number needed to treat = 1/ARR = 1/0.30 = 3.3

Odds Ratio and Relative Risk

              Outcome Yes   Outcome No
  Treatment       A             B
  Control         C             D

• Risk in treatment = A/(A+B); risk in control = C/(C+D)
  Relative risk = [A/(A+B)] / [C/(C+D)]; RR = 2.5 means 2.5 times the risk of the outcome if treated
• Odds in treatment = A/B; odds in control = C/D
  Odds ratio = (A/B) / (C/D) = AD/BC; OR = 2.5 means 2.5 times the odds of the outcome if treated
• Hazard ratio: analogous to a relative risk, used in survival analysis

Correlation Coefficient
• Strength of the linear relationship between two continuous variables
• −1 ≤ r ≤ +1
[Figure: two scatter plots, one with r = 1.0 and one with r = −1.0.]
• A measure of correlation, not a measure of concordance

Regression
Used to define relationships or to predict an outcome based on one or more exposure variables
• Univariate: single exposure variable, single outcome
• Multivariable: multiple exposure variables, single outcome
The type of regression depends on the nature of the outcome variable:
• Dichotomous – logistic regression
• Continuous – linear regression
• Survival – Cox proportional hazards model

Survival Analysis
• Outcome is time to event
• Censor patients who don't have an event when last observed
• Most data are right censored
[Figure: study timeline from start of study to stop of study, with censored subjects marked.]

Kaplan-Meier Method
• An example of how to display survival data
• Calculates the survival probability whenever an event occurs
[Figure: Kaplan-Meier curve of survival probability versus months.]
• Use to describe survival at a given time, eg, survival at 30 months is 40%

When to Use Survival Analysis
• Time to event data
• Each individual has a different length of follow-up
• Patients may be lost to follow-up
• Patients may be censored

Diagnostic Tests

                      Gold Standard True   Gold Standard False
  New Test Positive           A                    B
  New Test Negative           C                    D

A gold standard is needed.
• Sensitivity = A/(A+C) – proportion of those with the disease who have a positive test
• Specificity = D/(B+D) – proportion of those without the disease who have a negative test
• Positive predictive value = A/(A+B) – proportion of those who test positive who have the disease
• Negative predictive value = D/(C+D) – proportion of those who test negative who do not have the disease

Influence of Prevalence on Diagnostic Tests
• Because (A+C) reflects the prevalence of disease:
  – Prevalence influences PPV and NPV
  – Prevalence does not influence sensitivity and specificity

Likelihood Ratios and Other Diagnostic Test Issues
• LR+: how much the odds of a disease increase when the test is positive = sensitivity/(1 − specificity)
• LR−: how much the odds of a disease decrease when the test is negative = (1 − sensitivity)/specificity
• Receiver operating characteristic curve: plots sensitivity against (1 − specificity) to evaluate the optimal threshold for a diagnostic test
• Clinical prediction rule: using signs, symptoms and tests to predict a clinical outcome

Systematic Reviews
• Identify studies that address a similar question and synthesize the data either qualitatively or quantitatively
• Quantitative review – meta-analysis
• Limitations:
  – Heterogeneity in treatment effect – may not be appropriate to combine
  – Publication bias

Principles of Epidemiology and Clinical Research Design and Applying Research to Clinical Practice

B. Principles of Epidemiology and Clinical Research Design

1.
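The worked 2×2 example on the measures-of-association slides (ARR = 0.30, RRR = 0.75, NNT ≈ 3.3) can be reproduced in a few lines of Python; this is an illustrative helper of my own, not material from the talk:

```python
def association_measures(a, b, c, d):
    # 2x2 table as on the slide: treatment row = (a, b), control row = (c, d),
    # first column = outcome yes, second column = outcome no.
    risk_treatment = a / (a + b)
    risk_control = c / (c + d)
    arr = risk_control - risk_treatment       # absolute risk reduction
    return {
        "ARR": arr,
        "RRR": arr / risk_control,            # relative risk reduction
        "NNT": 1 / arr,                       # number needed to treat
        "RR": risk_treatment / risk_control,  # relative risk
        "OR": (a * d) / (b * c),              # odds ratio = AD/BC
    }

# Slide example: A=1, B=9, C=4, D=6 -> ARR 0.30, RRR 0.75, NNT ~3.3
m = association_measures(1, 9, 4, 6)
```

Note that the relative risk here (0.25) and the odds ratio (0.17) differ: the OR only approximates the RR when the outcome is rare.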
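To see numerically why prevalence moves the predictive values while sensitivity and specificity stay put, here is a small Python sketch (my own illustration, using the same 2×2 layout as the diagnostic-test slide):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    # 2x2 against the gold standard: tp = A, fp = B, fn = C, tn = D.
    sens = tp / (tp + fn)               # A/(A+C)
    spec = tn / (fp + tn)               # D/(B+D)
    return {
        "sens": sens,
        "spec": spec,
        "ppv": tp / (tp + fp),          # A/(A+B)
        "npv": tn / (fn + tn),          # D/(C+D)
        "LR+": sens / (1 - spec),
        "LR-": (1 - sens) / spec,
    }

# The same 90%-sensitive, 90%-specific test at two prevalences:
high = diagnostic_metrics(tp=90, fp=10, fn=10, tn=90)   # 50% prevalence
low = diagnostic_metrics(tp=9, fp=99, fn=1, tn=891)     # 1% prevalence
```

At 50% prevalence the PPV is 0.90; at 1% prevalence the same test's PPV falls below 0.10, even though sensitivity, specificity, and the likelihood ratios are unchanged.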
Study types
• Distinguish between Phase I, II, III, and IV clinical trials (slide 39)
• Recognize a retrospective study (slide 40)
• Understand the strengths and limitations of retrospective studies (slide 40)
• Recognize a case series (slide 41)
• Understand the strengths and limitations of case series (slide 41)
• Recognize a cross-sectional study (slide 42)
• Understand the strengths and limitations of cross-sectional studies (slide 42)
• Recognize a case-control study (slide 43)
• Understand the strengths and limitations of case-control studies (slide 44)
• Recognize a longitudinal study (slide 48)
• Understand the strengths and limitations of longitudinal studies (slide 48)
• Recognize a cohort study (slide 45)
• Understand the strengths and limitations of cohort studies (slide 46)
• Recognize a randomized-controlled study (slide 47)
• Understand the strengths and limitations of randomized-controlled studies (slide 47)
• Recognize a before-after study (slide 48)
• Understand the strengths and limitations of before-after studies (slide 48)
• Recognize a crossover study (slide 48)
• Understand the strengths and limitations of crossover studies (slide 48)
• Recognize an open-label study (slide 49)
• Understand the strengths and limitations of open-label studies (slide 49)
• Recognize a post-hoc analysis (slide 49)
• Understand the strengths and limitations of post-hoc analyses (slide 49)
• Recognize a subgroup analysis (slide 49)
• Understand the strengths and limitations of subgroup analyses (slide 49)

2. Bias and Confounding
• Understand how bias affects the validity of results (slide 50)
• Understand how confounding affects the validity of results (slide 50)
• Identify common strategies in study design to avoid or reduce bias (slide 51)
• Identify common strategies in study design to avoid or reduce confounding (slide 51)
• Understand how study results may differ between distinct sub-populations (effect modification) (slide 52)

3.
Causation
• Understand the difference between association and causation (slide 53)
• Identify factors that strengthen causal inference in observational studies (eg, temporal sequence, dose response, repetition in a different population, consistency with other studies, biologic plausibility) (slide 54)

4. Incidence and Prevalence
• Distinguish disease incidence from disease prevalence (slide 55)

5. Screening
• Understand factors that affect the rationale for screening for a condition or disease (eg, prevalence, test accuracy, risk-benefit, disease burden, presence of a presymptomatic state) (slide 56)

6. Decision analysis
• Understand the strengths and limitations of decision analyses (slide 57)
• Interpret a decision analysis

7. Cost-benefit, cost-effectiveness, and outcomes
• Differentiate cost-benefit from cost-effectiveness analysis (slide 58)
• Understand how quality-adjusted life years are used in cost analyses (slide 58)
• Understand the multiple perspectives (eg, of an individual, payor, society) that influence interpretation of cost-benefit and cost-effectiveness analyses (slide 58)

8. Sensitivity analysis
• Understand the strengths and limitations of sensitivity analysis (slide 59)
• Interpret the results of sensitivity analysis (slide 59)

9.
Measurement
• Understand the types of validity that relate to measurement (eg, face, construct, criterion, predictive, content) (slide 61)
• Distinguish validity from reliability (slides 60, 61)
• Distinguish internal from external validity (slide 61)
• Distinguish accuracy from precision (slide 60)
• Understand and interpret measurements of interobserver reliability (eg, kappa) (slide 60)
• Understand and interpret Cronbach's alpha (slide 60)

Study Types

Phases of Drug Studies
• Phase I: intended to find the dose range that is tolerated and safe (MTD); usually very small sample size, typically without a control group
• Phase II: preliminary efficacy information; larger than Phase I, but still limited sample size
• Phase III: definitive efficacy information and some common side effects; usually large sample size, may be blinded
• Phase IV: post-marketing surveillance to detect rare side effects; very large sample size

Retrospective Study
• Exposures and outcomes have already occurred
• Strengths: feasible and inexpensive
• Limitations: limited availability of confounders; no control over when or how exposure or outcome measured; recall bias

Case Series
• Describes similar cases, treatments or outcomes
• Strengths: feasible and inexpensive
• Limitations: cannot test hypotheses; selection bias

Cross-Sectional Study
• All measures obtained on a single occasion
• Strengths: fast and inexpensive; no loss to follow-up
• Limitations: difficult to establish causal relationships; can measure prevalence but not incidence

Case-Control Studies
• Identify those with and without the outcome
• Look BACK to see how many had the potential predictor
[Figure: cases and controls traced back to predictor present/absent.]
• Strengths: good for rare outcomes or a long latency between predictor and outcome
• Limitations: cannot estimate incidence or prevalence of disease; can only study one outcome; prone to bias (sampling – selection of controls is critical; recall bias)

Cohort Studies
• Identify those with and without the potential predictor
• Look FORWARD to see how many have the outcome; can be prospective or retrospective
[Figure: predictor-present and predictor-absent groups followed forward to outcome yes/no.]
• Strengths: time sequence strengthens inference; absence of recall bias; can calculate incidence
• Limitations: expensive

Randomized Controlled Trials
Randomization:
• Ensures that known and unknown potential confounders are equally distributed among the treatment and control groups
• Avoids allocation bias
Strengths:
• Strongest design to make inferences about therapy
• Limits the influence of confounders and allocation bias
Limitations:
• Expensive
• Usually lack generalizability

Other Study Types
• Longitudinal study: track the same individuals over a period of time and repeat measurements
  – Strengths: natural history
  – Limitations: hard to determine causation; loss to follow-up
• Before and after study: evaluate an outcome before and following institution of an intervention
  – Strengths: feasible
  – Limitations: confounders; regression to the mean
• Crossover study: evaluate an outcome in two time periods within the same individual – typically randomize the order
  – Strengths: reduces variability, improves power
  – Limitations: needs chronic stable conditions, short onset of action, and a condition that cannot be "cured" by the intervention
• Open label study: not blinded
  – Strengths: feasible; ethics
  – Limitations: co-interventions (may treat groups differently); contamination; observer bias
• Post-hoc analysis: examining data after a study has been completed for relationships not hypothesized a priori
  – Limitations: multiple testing
• Sub-group analysis: examining patterns in a sub-group of
patients
  – Strengths: there may be sub-groups of patients who respond differently to treatment
  – Limitations: multiple testing; limited power

Bias and Confounding
• Bias: systematic error
  – Selection bias
  – Measurement bias
• Confounder: a third variable that is associated with both the exposure and outcome variables and is not in the causal pathway
Both bias and confounders are major threats to the validity of any study.

Strategies to Avoid Bias/Confounding
• Bias: randomization, double-blinding, standardized measurement of outcomes
• Confounding:
  – Restriction – only include specific sub-groups
  – Stratification – analyze by sub-group
  – Stratified randomization – ensure equal distribution
  – Multiple regression techniques

Effect Modifier
• Interaction: two exposure variables of interest, where the effect of one exposure variable depends on the second exposure variable
• Example: the influence of BMI on systolic blood pressure differs between males and females
[Figure: systolic BP plotted against BMI separately for males and females, with different slopes.]
Understand how results may differ between distinct sub-populations (effect modification).

Causation
• Association: correlation
• Causation: outcome causally related to exposure
Factors that strengthen causal inference in observational studies (Bradford Hill criteria):
• Temporality – cause precedes effect
• Strength – large relative risk
• Dose-response
• Reversibility
• Consistency – repeatedly observed
• Biologic plausibility
• Specificity – one cause, one effect
• Analogy – same relationship observed in a different disease

Incidence and Prevalence
• Incidence: number of new cases per unit time
• Prevalence: total number of cases

Screening
Factors that affect the rationale for screening:
• Disease prevalence and burden (important)
• Latent stage of disease (presymptomatic)
• Test available, accurate and acceptable
• Screening will do more benefit than harm
• Availability of treatment

Decision Analysis
Decision analysis: a quantitative method to determine the treatment option with the best expected outcome
Strengths:
• Quantitative comparison of different treatment strategies that can incorporate benefits, side effects and costs
Limitations:
• Less useful for individual patient decision making
• Many probabilities and outcomes (such as quality of life) are not known

Cost Analyses
• Cost-benefit analysis: both benefits and costs expressed in monetary terms – "net present value"
• Cost-effectiveness analysis: compares costs and outcomes of different strategies, eg, cost per quality-adjusted life year (QALY)
• QALYs = quality of life × length of life
  – In cost-effectiveness analysis, illustrates the cost to gain quality and quantity of life
• Different perspectives – individual, society, healthcare system – influence CBA and CEA

Sensitivity Analysis
• In decision and cost analyses, probabilities and outcomes are uncertain
• Sensitivity analysis: vary the probabilities and outcomes to determine whether the conclusions change
• If the findings don't change, the conclusions are robust

Measurement
Reliability: consistency or reproducibility (precise)
• Test-retest (repeated over time); interrater (repeated by different raters)
• Measure agreement with the kappa statistic (dichotomous outcomes)
• Cronbach's alpha – internal consistency: extent to which items are correlated

Validity
Validity: the degree to which an instrument measures what it claims to measure (accurate)
• Internal – within the specific study
• External – generalizability
• How to determine:
  – Face – do the items make sense
  – Content – does the instrument seem to contain the correct items
  – Criterion – if there is a gold standard
  – Construct – extent to which the measure behaves in the hypothesized manner
  – Predictive – extent to which a score predicts some criterion measure

Principles of Epidemiology and Clinical Research Design and Applying Research to Clinical Practice

C. Applying Research to Clinical Practice

1.
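The kappa statistic mentioned under Measurement can be computed by hand for two raters scoring a yes/no item. This Python sketch (mine, not the presenter's) uses the standard formula kappa = (observed agreement − chance agreement) / (1 − chance agreement):

```python
def cohen_kappa(both_yes, yes_no, no_yes, both_no):
    # Interrater agreement on a dichotomous rating; the four arguments are
    # the cells of the 2x2 agreement table (rater1 x rater2).
    n = both_yes + yes_no + no_yes + both_no
    p_observed = (both_yes + both_no) / n
    # Chance agreement from each rater's marginal "yes" rate:
    rater1_yes = (both_yes + yes_no) / n
    rater2_yes = (both_yes + no_yes) / n
    p_chance = rater1_yes * rater2_yes + (1 - rater1_yes) * (1 - rater2_yes)
    return (p_observed - p_chance) / (1 - p_chance)
```

Perfect agreement gives kappa = 1, agreement no better than chance gives kappa = 0, so kappa corrects the raw percent agreement for the agreement expected by chance alone.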
Assessment of study design, performance and analysis (internal validity)
• Recognize when appropriate control groups have been selected for a case-control study (slide 64)
• Recognize when appropriate control groups have been selected for a cohort study (slide 64)
• Recognize the use and limitations of surrogate endpoints (slide 64)
• Understand the use of intent-to-treat analysis (slide 65)
• Understand how sample size affects the power of a study (slide 65)
• Understand how sample size may limit the ability to detect adverse events (slide 65)
• Understand how to calculate an adequate sample size for a controlled trial (ie, clinically meaningful difference, variability in measurement, choice of alpha and beta) (slide 66)

2. Assessment of generalizability (external validity)
• Identify factors that contribute to or jeopardize generalizability (slide 67)
• Understand how non-representative samples can bias results (slide 67)
• Assess how the data source (eg, diaries, billing data, discharge diagnostic code) may affect study results (slide 68)

3. Application of information for patient care
• Estimate the post-test probability of a disease, given the pretest probability of the disease and the likelihood ratio for the test (slide 69)
• Calculate absolute risk reduction (slide 69)
• Calculate and interpret the number-needed-to-treat (slide 69)
• Distinguish statistical significance from clinical importance (slide 69)

4.
Using the medical literature
• Given the need for specific clinical information, identify a clear, structured, searchable clinical question (slide 70)
• Identify the study design most likely to yield valid information about the accuracy of a diagnostic test (slide 70)
• Identify the study design most likely to yield valid information about the benefits and/or harms of an intervention (slide 70)
• Identify the study design most likely to yield valid information about the prognosis of a condition (slide 70)

Internal Validity
• Appropriate control groups for a case-control study: from the population at risk, otherwise similar to the cases
• Appropriate control groups for a cohort study: non-exposed subjects otherwise similar to the exposed group
• Surrogate endpoints: a laboratory test or physical finding used instead of a clinically meaningful endpoint
  – Strengths: may be detected earlier or in more patients
  – Limitations: effects on the surrogate endpoint may not correspond to effects on the clinical endpoint
• Intention-to-treat analysis: includes all randomized subjects in the group to which they were assigned
  – Important because patients do not cross over randomly
  – Should be the primary analysis
• Power: probability that the test will reject the null hypothesis
  – Larger sample size → more power → greater ability to detect adverse events

Sample Size Considerations
To calculate the sample size for a controlled trial, specify:
• Alpha – typically 0.05 (5% probability of finding a difference by chance alone if one does not exist)
• Beta (1 − power) – typically 0.1 or 0.2 (eg, a 20% probability of not finding a difference if one really exists)
• The minimal clinically important difference and the variability of the outcome

External Validity
• Factors that can jeopardize generalizability:
  – Highly selected subjects
  – Intervention applied in a manner not feasible for routine practice
  – Maneuvers to enhance compliance
  – Excessive monitoring
• If the study sample is not representative, results are biased because they are not generalizable

Influence of Data Sources
• Diaries – missing data, recall bias
• Administrative data (eg, billings data, discharge diagnosis codes):
  – Validity needs to be established
  – Lacks clinical information such as disease risk status
  – Errors

Application of Information
• Post-test odds of disease = pretest odds of disease × likelihood ratio of the test; convert the resulting odds back to a probability
• ARR = risk (control) − risk (treatment) (slide 21); NNT = 1/ARR
• Significance
  – Statistical: set by alpha – more likely as sample size increases
  – Clinical: what is important – does not change with sample size

Using the Medical Literature
• To conduct a search, you need a clear, structured, searchable research question
• Optimal study designs:

  Purpose                                    Design that yields the most valid information
  Benefits and/or harms of an intervention   Randomized controlled trials
  Prognosis                                  Cohort studies
  Diagnostic test                            Cross-sectional studies

THANKS!
More questions? Email: [email protected] Phone: 416-813-5287
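The post-test probability calculation works through odds: convert the pretest probability to odds, multiply by the likelihood ratio, and convert the result back to a probability. A minimal Python sketch (illustrative, not from the slides):

```python
def post_test_probability(pretest_prob, likelihood_ratio):
    # Probability -> odds, apply the likelihood ratio, odds -> probability.
    pretest_odds = pretest_prob / (1 - pretest_prob)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Eg: pretest probability 20% with LR+ = 9 gives post-test odds
# 0.25 * 9 = 2.25, ie a post-test probability of about 69%.
```

A likelihood ratio of 1 leaves the probability unchanged; an LR− below 1 pushes the probability down.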
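The sample-size ingredients listed above (alpha, beta, the minimal clinically important difference, and the variability of the outcome) combine in the usual normal-approximation formula for comparing two proportions. This Python sketch is my own illustration under that assumption, not a formula from the talk:

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    # Approximate sample size per arm for detecting a difference between
    # proportions p1 and p2 (normal approximation, two-sided alpha).
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # eg 1.96 for alpha=0.05
    z_beta = NormalDist().inv_cdf(power)            # eg 0.84 for power=0.80
    p_bar = (p1 + p2) / 2                           # pooled proportion
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)
```

Halving the clinically important difference roughly quadruples the required sample size, and asking for more power or a smaller alpha also drives the number up.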