Medical Council of Canada Certifying Exam Review
April 16, 2007
Jeffrey P Schaefer MSc MD FRCPC

Epidemiology

Clinical Epidemiology: Commonest Clinical Questions
• Harm
• Diagnosis
• Therapy
• Prognosis
• Prevention

Measures of Disease

Measures of Disease Frequency
• Disease Incidence
– refers to new cases of disease among a population at risk for that disease over a specified period of time.

• Cumulative Incidence (CI)
CI = new cases / population at risk / time interval
Example: population 120; population at risk 100; population not at risk 20; new cases over 1 yr = 25
CI = 25/100/yr

• Incidence Density (ID)
ID = new cases / person-time at risk
(person-time = population at risk x time period each person is observed)
Useful when there are losses to follow-up among the population under observation.

• Example: a study plans to observe 20,000 people for 1 year, but 10,000 leave town after 6 months. 15 cases of disease are observed among the population.
10,000 observed x 1 yr = 10,000 p-yr
10,000 observed x 0.5 yr = 5,000 p-yr
Total person-years = 15,000 p-yr
ID = 15 cases / 15,000 person-years
= 1 case / 1,000 person-years
= 10 cases / 10,000 persons per year

Incidence of Cancer of the Lung, Trachea, and Bronchus, Both Sexes Combined, All Ages, 1998, by Province or Territory (from Statistics Canada)
Can    61.10
Nfld   41.74
PEI    78.57
NS     75.54
NB     72.69
Que    74.72
Ont    55.83
Man    62.72
Sask   54.05
Alb    52.30
BC     54.62
Yuk    48.30
NWT   122.25

Incidence Data
• Challenge – ensure the denominator is truly at risk.

Measures of Disease Frequency
• Disease Prevalence
– refers to the proportion of individuals who have the disease at a specific time among a given population.
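The person-time arithmetic in the incidence density example above can be checked with a short Python sketch. The function name `incidence_density` and the `(people, years)` input layout are illustrative assumptions, not part of the original slides.

```python
def incidence_density(new_cases, follow_up):
    """Return new cases per person-year of observation.

    follow_up is a list of (number_of_people, years_observed) pairs;
    this layout is an assumption made for the sketch.
    """
    person_years = sum(people * years for people, years in follow_up)
    return new_cases / person_years


# Slide example: 10,000 people observed for a full year,
# 10,000 people observed for only half a year, 15 cases in total.
follow_up = [(10_000, 1.0), (10_000, 0.5)]
rate = incidence_density(15, follow_up)

print(f"{rate * 1000:.1f} cases per 1,000 person-years")
```

Keeping the follow-up as separate groups makes the effect of losses to follow-up explicit: the half-year group contributes only 5,000 person-years to the denominator.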
(not a rate)
Prevalence = existing cases at a point in time / total population

Measures of Disease Frequency
• Point Prevalence
– ‘snapshot’ in calendar time or event-related time
– number of cases of diabetes on June 30, 2002 in Calgary
– number of people with an infected incision on postoperative day 3 on the General Surgery unit at Foothills Hospital
• Period Prevalence
– longer duration under consideration
– number of people with diabetes between Jan 1 and Dec 31, 2002 in Calgary
– number of people with an infected incision between surgery and postoperative day 90 on the General Surgery unit at Foothills Hospital

Figure: Period Prevalence of Diabetes, Canada, 1996-97
Figure: Global Cigarette Consumption

Special Measures of Disease Frequency
• Mortality Rate – incidence of death (cause-specific vs all-cause mortality)
• Infant Mortality Rate – death of a live-born infant within the 1st year of life
• Neonatal Death – death of an infant under 28 days of age
• Post-neonatal Death – death of an infant between 4 weeks and 1 year of age

Figure: Infant Mortality Rates, Selected Countries, 1995
Figure: Infant Mortality Rates by Income Quintile, Urban Canada, 1991

Prevention (along the time course of disease)
• primary prevention – risk factors only
• secondary prevention – subclinical disease
• tertiary prevention – clinical disease, complicated disease

Harm
• Exposures that cause disease = Harm
• Harmful exposures are Risk Factors
• Risk Factors may be etiologies of disease.

EXPOSURE -> DISEASE
A harmful exposure is a RISK FACTOR

• Robert Koch
– Dec 11, 1843 - May 27, 1910
– Nobel Prize - 1905
• Anthrax
• Tuberculosis
• Cholera

Harm
• Koch’s Postulates (1882)
• Criteria for Causation
– the organism must be present in every case of disease,
– the organism must be isolated from the diseased animal and grown in pure culture,
– the organism causes the same disease when inoculated into a healthy animal,
– the organism must be recovered from that animal (step 3) and identified.
Harm
• Limitations of Koch’s Postulates
– not all diseases are infectious
– one risk factor may cause > 1 disease
– one disease may have > 1 risk factor

Figure 4-4: Leading causes of death, number and percentage of deaths, Canada, 1999
Cause                              Deaths      %
All Cardiovascular Disease         78,942     36%
  Other IHD                        21,693     10%
  AMI                              20,926     9.5%
  Other CVD                        20,914     9.5%
  Cerebrovascular Disease          15,409     7%
Cancer                             62,606     29%
Respiratory                        22,026     10%
Accidents/Poisoning/Violence       13,996     6%
Diabetes                            6,137     3%
Infectious Diseases                 2,583     1%
Other                              33,240     15%
Total Number of Deaths            219,530
Cardiovascular (ICD-9 390-459); Respiratory (ICD-9 460-519); Diabetes (ICD-9 250); Cancer (ICD-9 140-239); Infectious Diseases (ICD-9 001-139); Accidents/Poisonings/Violence (ICD-9 E800-E999)
Source: Health Canada, using data from Mortality File, Statistics Canada. The Growing Burden of Heart Disease and Stroke in Canada 2003.

Harm (one risk factor, several diseases)
Smoking -> Emphysema, Chronic Bronchitis, Lung Cancer, Bladder Cancer, Lip Cancer, Coronary dx, Stroke, Ischemic Limb

Harm (one disease, several risk factors)
Hypertension, Hypercholesterolemia, Male Sex, Family History, Smoking -> Myocardial Infarction

Causality for the Modern Era
• Austin Bradford Hill
– July 8, 1897 - April 18, 1991
– English epidemiologist and statistician
– pioneered the randomized clinical trial
– published criteria for causality

Harm
• Austin Bradford Hill’s criteria for causality.
• Temporality – exposure precedes onset of disease
• Strength of Association – exposure strongly associated with disease frequency
• Dose-Response – more exposure associates with higher disease frequency or severity
• Reversibility – reduction in exposure associates with lower rates of disease
• Consistency – association between exposure and disease is observed by different persons in different places during different circumstances
• Specificity – one cause leads to one effect
• Analogy – cause and effect relationship already established for a similar risk factor or disease
• Biological Plausibility – is the association biologically plausible (keeping in mind that plausibility changes with time)

Measures of Risk for Disease
• 1964 cohort study data
Population: stratify on smoking status, then observe for several years
– Smokers (RF+): Lung Cancer Death / Not Lung Cancer Death
– Non-smokers (RF-): Lung Cancer Death / Not Lung Cancer Death

Measures of Risk for Disease
lung cancer mortality
cigarette smokers    0.96 / 1000 / year
non-smokers          0.07 / 1000 / year
all subjects         0.56 / 1000 / year
prevalence of cigarette smoking = 56%

Measures of Risk for Disease
• Q: What is the additional risk (incidence) of disease following exposure, over and above that experienced by people who are not exposed?
• Attributable Risk

AR = Attributable Risk
I_EXP = disease incidence among exposed
I_UNEXP = disease incidence among not exposed
AR = I_EXP - I_UNEXP

lung cancer mortality
cigarette smokers    0.96 / 1000 / year
non-smokers          0.07 / 1000 / year
0.96 - 0.07 = 0.89
Attributable Risk = 0.89 deaths / 1000 / yr
For every 10,000 smokers compared to 10,000 non-smokers there were about 9 more deaths each year.

Measures of Risk for Disease
• Q: How many times are exposed persons more likely to get the disease relative to nonexposed persons?
• Relative Risk

RR = Relative Risk
I_EXP = disease incidence among exposed
I_UNEXP = disease incidence among not exposed
RR = I_EXP / I_UNEXP

lung cancer mortality
cigarette smokers    0.96 / 1000 / year
non-smokers          0.07 / 1000 / year
0.96 / 0.07 = 13.7
Relative Risk = 13.7
Smokers are 13.7 times more likely to die from lung cancer than non-smokers.

Measures of Risk for Disease
• Q: How much does a risk factor contribute to the overall rates of disease in groups of people, rather than individuals?
• Population Attributable Risk

• PAR = Population Attributable Risk
• AR = attributable risk
• P = prevalence of exposure
PAR = AR x P

lung cancer mortality
cigarette smokers    0.96 / 1000 / year
non-smokers          0.07 / 1000 / year
smokers              56% of the population
(0.96 - 0.07) x 0.56 = 0.4984 ~ 0.5
Population Attributable Risk = 0.5 / 1000 / yr
Annually, smoking accounts for 1 lung cancer death per 2000 people in the population.

Measures of Risk for Disease
• Example of utility of Population Attributable Risk
• Hypertension (high blood pressure) is a risk factor for stroke.
What would be more effective in lowering the incidence of stroke in a population?
– Treating mild hypertension
– Treating severe hypertension
Surprise! Treating mild hypertension.
The prevalence of mild hypertension far exceeds the prevalence of severe hypertension.

Measures of Risk for Disease
• What fraction of disease in a population is attributable to exposure to a risk factor?
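The three risk measures worked so far (attributable risk, relative risk, and population attributable risk) can be reproduced from the 1964 cohort figures with a minimal Python sketch. The function and variable names are illustrative assumptions; incidences are in deaths per 1000 per year, as on the slides.

```python
def attributable_risk(i_exp, i_unexp):
    """AR = incidence among exposed minus incidence among unexposed."""
    return i_exp - i_unexp


def relative_risk(i_exp, i_unexp):
    """RR = incidence among exposed divided by incidence among unexposed."""
    return i_exp / i_unexp


def population_attributable_risk(i_exp, i_unexp, prevalence):
    """PAR = AR multiplied by the prevalence of exposure."""
    return attributable_risk(i_exp, i_unexp) * prevalence


# Lung cancer mortality per 1000 per year, from the slides.
i_smokers = 0.96
i_nonsmokers = 0.07
p_smoking = 0.56  # prevalence of cigarette smoking

ar = attributable_risk(i_smokers, i_nonsmokers)
rr = relative_risk(i_smokers, i_nonsmokers)
par = population_attributable_risk(i_smokers, i_nonsmokers, p_smoking)

print(f"AR  = {ar:.2f} deaths / 1000 / yr")
print(f"RR  = {rr:.1f}")
print(f"PAR = {par:.4f} deaths / 1000 / yr")
```

AR and PAR keep the units of the incidence (deaths per 1000 per year), whereas RR is a unitless ratio.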
• Population Attributable Fraction

• PAF = Population Attributable Fraction
• PAR = population attributable risk
• I_Total = disease incidence in total population
PAF = PAR / I_Total

lung cancer mortality
all subjects    0.56 / 1000 / year
PAR             0.4984 / 1000 / year
0.4984 / 0.56 = 0.89
Population Attributable Fraction = 0.89
Eighty-nine percent (89%) of the lung cancer deaths were accounted for by cigarette smoking.

Diagnosis

Medical Diagnostics: general process
• history
• physical examination
• diagnostic tests

What’s the best test?
• Gold Standard: A method, procedure, or measurement that is widely accepted as being the best available.
• Gold Standard = ‘reference standard’

Issues in Diagnostic Testing
• Invasiveness – urine sample versus brain biopsy versus autopsy
• Cost – glucoscan strip ~ $1.00 versus MRI $770.00
• Availability – hemogram versus Positron Emission Tomography
• Patient Acceptability – urine sample versus 3 day fecal fat collection
• How well the test performs!!!

Test Characteristics
• Sensitivity
• Specificity
• Positive predictive value
• Negative predictive value
• Accuracy
• Likelihood ratio

PE - diagnosis
• Pulmonary angiogram - gold standard
• PE - diagnosis (spiral CT scan)

                    DISEASE
TEST        Present           Absent
Positive    TRUE POSITIVE     FALSE POSITIVE
Negative    FALSE NEGATIVE    TRUE NEGATIVE

Hypothetical Test Results
                       DISEASE (PE)
TEST (V/Q scan)   Present                   Absent
Positive          TRUE POSITIVE a = 80      FALSE POSITIVE b = 20    a + b = 100
Negative          FALSE NEGATIVE c = 10     TRUE NEGATIVE d = 90     c + d = 100
                  a + c = 90                b + d = 110              a + b + c + d = 200

Sensitivity
• Probability that test is positive given that disease is present.
P (T+ | D+)

Sensitivity
                       DISEASE (PE)
TEST (V/Q scan)   Present                   Absent
Positive          TRUE POSITIVE a = 80      FALSE POSITIVE b = 20    a + b = 100
Negative          FALSE NEGATIVE c = 10     TRUE NEGATIVE d = 90     c + d = 100
                  a + c = 90                b + d = 110              a + b + c + d = 200
Sensitivity = a / (a + c) = 80 / (80 + 10) = 88.9%

Specificity
• Probability that test is negative given that disease is absent.
P (T- | D-)
Specificity = d / (b + d) = 90 / (90 + 20) = 81.8% (same table)

Sensitivity - Specificity Trade-Off
• Most test results are not simply positive or negative.
• There is often a selected value
– over which a test is said to be positive
– under which a test is said to be negative.
• As a result….
– increasing sensitivity results in loss of specificity
– increasing specificity results in loss of sensitivity

Figure: Sensitivity / Specificity Trade-off (as the selected value moves, sensitivity decreases while specificity increases)
Figure: Sensitivity / Specificity Trade-off: Receiver Operating Characteristic (ROC) curve

Test Characteristic Issues
• Highly Sensitive Tests:
– tend to be less invasive, less risky, less costly
– best for screening programs
– best for ruling out disease: “SNOUT”
• Highly Specific Tests:
– tend to be more invasive, more risky, more costly
– best for confirming (ruling in) disease: “SPIN”

Positive Predictive Value
• Probability that disease is present given that the test was positive.
P (D+ | T+)
Positive Predictive Value = a / (a + b) = 80 / (80 + 20) = 80.0% (same table)

Negative Predictive Value
• Probability that disease is absent given that the test was negative.
P (D- | T-)

Negative Predictive Value
                       DISEASE (PE)
TEST (V/Q scan)   Present                   Absent
Positive          TRUE POSITIVE a = 80      FALSE POSITIVE b = 20    a + b = 100
Negative          FALSE NEGATIVE c = 10     TRUE NEGATIVE d = 90     c + d = 100
                  a + c = 90                b + d = 110              a + b + c + d = 200
Negative Predictive Value = d / (c + d) = 90 / (90 + 10) = 90.0%

Test Characteristic Issues
• Positive and Negative Predictive Values suffer from depending on disease prevalence
• This is a major drawback.* (* excellent exam question)

Change Disease Prevalence from 90 to 110 per 200
                       DISEASE (PE)
TEST (V/Q scan)   Present                Absent
Positive          a = 80 -> 97.7         b = 20 -> 16.4        a + b = 114.1
Negative          c = 10 -> 12.2         d = 90 -> 73.6        c + d = 85.8
                  a + c = 90 -> 110      b + d = 110 -> 90     a + b + c + d = 200

prevalence = 110 / 200 = 0.55 = 55% (was 45%)
sensitivity = 97.7 / 110 = 88.8% (unchanged, allowing for rounding)
specificity = 73.6 / 90 = 81.7% (unchanged, allowing for rounding)
positive predictive value = 97.7 / 114.1 = 85.6% (was 80%)
negative predictive value = 73.6 / 85.8 = 85.8% (was 90%)

Accuracy
• Probability that the test result is correct.
• (not a useful concept as you’ll see later)

Accuracy
                       DISEASE (PE)
TEST (V/Q scan)   Present                   Absent
Positive          TRUE POSITIVE a = 80      FALSE POSITIVE b = 20    a + b = 100
Negative          FALSE NEGATIVE c = 10     TRUE NEGATIVE d = 90     c + d = 100
                  a + c = 90                b + d = 110              a + b + c + d = 200
Accuracy = (a + d) / (a + b + c + d) = (80 + 90) / (80 + 20 + 10 + 90) = 85.0%

Test Characteristic Issues
• Accuracy:
– not a useful characteristic
– a high sensitivity / low specificity test may have the same accuracy as a low sensitivity / high specificity test

(Positive) Likelihood Ratio
• Ratio of:
probability of positive test when disease is present / probability of positive test when disease is absent

Positive Likelihood Ratio (same table)
LR+ = (a / (a + c)) / (b / (b + d)) = (80 / 90) / (20 / 110) = 4.89

Utility of (Positive) Likelihood Ratios
• expresses how many times more likely a test result is to be found in diseased, compared to nondiseased, people.
• can estimate the post-test probability of disease if prevalence is known.

Pre-test Probability of Disease
• Consider: a female presents for a screening mammogram for breast cancer.
• What’s her pre-test probability of disease? Prevalence of Disease

Positive Test Result
• Say that her mammogram shows a 1 cm spiculated calcification
• Say that this finding is associated with a likelihood ratio of 20 (a very suspicious lesion).

Highly suspicious lesion
What is the post-test probability of disease?
Answer: Pretest odds x Likelihood Ratio = Posttest odds
(the use of odds rather than probabilities makes the math convoluted)

What is the post-test probability of disease?
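As a cross-check on the test characteristics worked above, the following Python sketch recomputes them from the four cells of the hypothetical V/Q scan table and then applies the odds form of the likelihood ratio. The function names are illustrative assumptions, and the 1% pre-test probability passed to `post_test_probability` is an assumed prevalence chosen for illustration.

```python
def two_by_two_metrics(a, b, c, d):
    """Test characteristics from a 2 x 2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "accuracy": (a + d) / (a + b + c + d),
        "lr_positive": sensitivity / (1 - specificity),
    }


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the LR, convert back."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical V/Q scan results from the slides.
results = two_by_two_metrics(a=80, b=20, c=10, d=90)
for name, value in results.items():
    print(f"{name}: {value:.3f}")

# Assumed 1% pre-test probability with a likelihood ratio of 20.
post = post_test_probability(0.01, 20)
print(f"post-test probability: {post:.3f}")
```

Sensitivity, specificity, and the likelihood ratio are computed within the disease columns, so they do not change with prevalence; the predictive values and accuracy mix the columns and therefore do.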
Pretest odds x Likelihood Ratio = Posttest odds
Assume: prevalence = 10 / 1000 = 1% (P = 0.01)
Odds = probability of event / (1 - probability of event)
Pre-test Odds = (10/1000) / (1 - (10/1000)) = 0.0101

What is the post-test probability of disease?
Pretest odds x Likelihood Ratio = Posttest odds
0.0101 x 20 = 0.2020
Probability = Odds / (1 + Odds)
Posttest Probability = 0.2020 / (1 + 0.2020)
Posttest Probability = 0.168 = 16.8%

Utility of (Positive) Likelihood Ratio
Pre-test Probability = 1%
Post-test Probability = 16.8%
Prudent Course: move from screening test to confirmatory test!

Treatment and Prevention

Measures of Treatment & Prevention Effect
• Results – Incidence of Stroke over 5 years:
– treatment: 5.2 / 100 participants
– placebo: 8.2 / 100 participants

Measures of Treatment & Prevention Effect
• Absolute Risk Reduction
• Relative Risk Reduction
• Relative Risk
• Odds Ratio
• Number Needed to Treat

             treatment   placebo
%            5.2%        8.2%
probability  0.052       0.082

Absolute Risk Reduction
ARR = P(event placebo) - P(event treatment)
ARR = 0.082 - 0.052 = 0.03

Relative Risk Reduction
RRR = (P(event placebo) - P(event treatment)) / P(event placebo)
RRR = (0.082 - 0.052) / 0.082 = 0.37

Relative Risk
RR = P(event treatment) / P(event placebo)
RR = 0.052 / 0.082 = 0.63

RR & RRR
• Relative Risk + Relative Risk Reduction = 1
0.63 + 0.37 = 1.00

Number Needed to Treat
NNT = 1 / ARR
NNT = 1 / (0.082 - 0.052) = 33

Odds Ratio: ‘Ratio of the Odds’
Odds = P(event) / (1 - P(event))
Odds (treatment) = 0.052 / (1 - 0.052) = 0.055
Odds (placebo) = 0.082 / (1 - 0.082) = 0.089
Odds Ratio = Odds (treatment) / Odds (placebo) = 0.055 / 0.089 = 0.61

Measures of Treatment & Prevention Effect
• What’s the fuss?
– Absolute risk graphs may not convey the whole story.
– A small absolute effect can appear as a large relative effect
– see data from a recent large trial...

Figure: Measures of Treatment & Prevention Effect (event curves plotted on a 100% scale on the y axis)

Measures of Treatment & Prevention Effect
• Raw data:
– losartan: 508 events / 4605 subjects = 11.0%
– atenolol: 588 events / 4588 subjects = 12.8%
– ARR: 0.018 (1.8%)
– RRR: 0.14 (14.0%)
– RR: 0.86 (86.0%)
– NNT = 55.5
Which number would you like?

Measures of Treatment & Prevention Effect (examples)
Therapy                                                 Endpoint                                NNT (5yr)
stepped care for diastolic BP of 115 - 129              death, stroke, myocardial infarction    3
coronary artery bypass grafting for left main disease   death                                   6
ASA for transient ischemic attack                       death, stroke                           6
cholestyramine for hypercholesterolemia                 death, myocardial infarction            89
INH for inactive tuberculosis                           active tuberculosis                     96
stepped care for diastolic BP 90 - 109                  death, stroke, myocardial infarction    141

Efficacy versus Effectiveness
• Clinical Trials (treatment efficacy):
– ideal setting
• Everyday Practice (treatment effectiveness)
– subjects not excluded owing to artificial criteria
– less monitoring by health care workers
– patients will be less adherent on average
– less rigor in measurement

Prognosis
• natural history of disease – disease course without intervention
• clinical course of disease – disease course with intervention

Prognosis
• Common outcomes of disease
– 5-year survival: percent of patients surviving for 5 years from some point in their disease
– Case Fatality Rate: percent of patients with a disease who die of it
– Disease-specific Mortality: number of people per 10,000 (or 100,000) population dying of a specific disease
– Response: percent of patients showing some evidence of improvement following an intervention
– Remission: percent of patients entering a phase in which disease is no longer detectable
– Recurrence: percent of patients who have return of disease after a disease-free interval

Prognosis
• Common outcomes of disease
– not all outcomes are as definite (or bleak) as those just
shown
– health related quality of life
• more qualitative (than quantitative)
• may be more important to patients than mortality outcomes
• is likely to become increasingly important

Prognostic Factor versus Risk Factor
• A good example:
– low blood pressure is protective against a myocardial infarction
– low blood pressure predicts a poor prognosis among those having a myocardial infarction
– Why?
• High BP is associated with atherosclerosis and thickening of the heart muscle, which both increase the risk of heart attack (MI)
• Low BP at the time of an MI is consistent with muscle damage so extensive that the heart cannot generate a satisfactory blood pressure.

Research Design – study designs
• case reports and case series
• correlational research
• cross-sectional surveys
• case series
• case-control
• quasi-experimental designs

Case Report
• detailed report of the diagnosis, treatment, and follow-up of an individual patient
• contains some demographic information about the patient
– age
– gender
– ethnic origin
– (not usually name)

Case Series
• detailed report of the diagnosis, treatment, and follow-up of > 1 patient
• contains some demographic information about the patients
– age
– gender
– ethnic origin
– (not usually name)
• may be consecutive or non-consecutive

Case Report
• BHIVA Conf 2005 Apr 20-23;11:P112D
• INTRODUCTION
– The effects of methotrexate on patients with HIV are not well known. We report a case of a man with a relatively high CD4 count, not requiring treatment with HAART, displaying evidence of immune failure while on methotrexate.
• CASE PRESENTATION
– We describe the case of a 48 year old British Caucasian man who presented with a 2 year history of lower limb synovitis requiring recurrent steroid injections; he became progressively more debilitated. He initially tested negative for HIV. He also complained of early morning back pain for a number of years and HLA B27 confirmed clinical suspicion of Ankylosing Spondylitis (AS).
Repeated HIV testing 2 years after initial presentation confirmed HIV with a CD4 count of 622 cells/µl (25%) and a viral load of 41,400 HIV-1 RNA copies/ml.
– Initial results suggested that HAART should be withheld. Escalating doses of methotrexate have coincided with evidence of impaired T cell function, manifest as widespread Molluscum and violaceous lesions on his 1st MTP joint. Biopsy confirmed Kaposi’s sarcoma (KS). His arthropathy remains difficult to control; however he now has an AIDS illness requiring treatment with HAART.
• DISCUSSION
– Steroid use in patients with HIV is an established risk for the development of opportunistic infection and KS. The effects of methotrexate are not as clear and there is very little literature on the interaction between HIV and AS.

Case Series
• In the period October 1980-May 1981, five young men, all active homosexuals, were treated for biopsy-confirmed Pneumocystis carinii pneumonia at three different hospitals in Los Angeles, California. Two of the patients died.
-- Morbidity & Mortality Weekly Report, 6/5/1981

Case Reports & Case Series
• Strengths
– inexpensive
– rapid (Hanta Virus, Legionnaires’ disease, SARS)
• Weaknesses
– non-systematic (generalizability issue)
– draws attention to the unusual
– may be factually inaccurate
• recall issues
• incomplete information or initial workup
– What’s the clinical question?

Correlational Research (Ecological Study Design)
• Correlational Research
– compares health outcomes between groups or populations (Not Individuals)
– the units of analysis are populations or groups of people, rather than individuals

Population A: Exposure? Disease?
Population B: Exposure? Disease?
Population C: Exposure? Disease?
Population D: Exposure? Disease?
• Correlational Study
– mean population exposure & mean population disease
– measured during the same period of time

Figure: Meat consumption and Colon Cancer Incidence

Correlational / Ecological Research
• Strengths
– inexpensive
– rapid
• Weaknesses
– individual exposure and outcome unknown
– confounding
– cause and effect cannot be determined

Cross Sectional Survey
• XSS:
– exposure and disease status are measured among individuals at the same time

Individual A, Individual B, Individual C, Individual D: Measure Exposure & Disease
Cross-sectional survey: measure exposure and disease among individuals at the same encounter

Cross Sectional Survey
BMJ 2002;324:1152. Injury and financial status.

Cross Sectional Survey
• Strengths
– can study entire populations
– provides estimates of prevalence of all factors measured
• Weaknesses
– cause and effect may not be determined
– not good for rare diseases or rare exposures

Case-control Design
• Exposure is measured among individuals with a disease and among controls.

Individuals with Lung Cancer: radon gas exposed / radon gas unexposed
Individuals without Lung Cancer: radon gas exposed / radon gas unexposed
Case-control design: first establish disease, then exposure

Case-control design
American Journal of Epidemiology Vol. 138, No. 5: 281-293. Asthma death and prescription of inhaled medication.

Case Control Study
• Strengths
– relatively low cost
– short duration
– good for diseases with long latent periods
– good for rare diseases
– can examine multiple exposures for a single disease
– study does not impart risk to patient
• Weaknesses
– cannot determine disease incidence
– may be hard to establish a temporal relationship between exposure and disease
– it is particularly prone to selection and recall bias
– not good for rare exposures.

Cohort Study
• Disease incidence is measured among exposed individuals and controls.
Hard Rock Miners: interstitial lung disease / no interstitial lung disease
Not Hard Rock Miners: interstitial lung disease / no interstitial lung disease
Cohort Study: first identify exposure, then measure disease

American Journal of Epidemiology. 152(4):297-306, August 15, 2000. Categories of exposure among German rubber workers.

Cohort Study
• Strengths
– able to estimate disease incidence
– good for determining temporality
– if prospective, little misclassification of disease status
– can examine multiple effects of a single exposure
– study does not impart risk to subjects
– retrospective cohort studies are inexpensive and require a short duration to complete
– good for rare exposures
• Weaknesses
– not good for rare diseases
– if retrospective, requires availability of adequate records
– if prospective, validity is affected by losses to follow-up
– if prospective, may be very costly and may take several years to complete

Quasi-experimental designs
• ‘Almost experimental research designs’
• Several and Varied
– e.g. non-equivalent groups design
– e.g.
pre-post (before-after design)

Canmore: Acute MI (baseline) -> Lifestyle Education Program -> Acute MI (re-measured)
Cochrane: Acute MI (baseline) -> no program -> Acute MI (re-measured)
• An example of a Quasi-experimental Design
– non-equivalent groups: measure baseline in two groups, intervene in one group, remeasure incidence in both groups

Quasi Experimental Designs
• Weakness
– comparison may or may not be valid

Validity and Reliability
• We accept that:
– we cannot measure all persons in a population
– we cannot make perfect measurements
– we cannot be certain in our associations

Threats to Validity and Reliability (Schaefer’s Simpleton View of the World)
• Bias
– any process at any stage of research that systematically produces results that deviate from the true values
• Confounding
– occurs when two factors are associated and the effect of one is confused with or distorted by the effect of the other.
• Play of Chance
– it is not possible to perfectly control for all factors

Bias
• Any process at any time during research
• Systematically causes a result deviation
• Usually results from one group behaving or being treated differently from the other
• Examples
– Selection bias
– Measurement bias
– Recall bias

Confounding
• ‘Confusion of effects…’
• If an association exists between two exposures we may be confused as to which caused the disease

Confounding: Down’s Syndrome and Birth Order (what’s wrong here?)
Figure: Cases of Down syndrome by birth order
(y axis: cases per 100 000 live births, 0 to 180; x axis: birth order, 1 to 5)

Play of Chance
• Random error
– will treat both groups equally in the long run
– biases toward the null hypothesis (no difference exists between groups)
– reduces reliability

Figure: Random Error (y axis: per cent; x axis: size of induration (mm))
Figure: Systematic Error (Bias) (y axis: per cent; x axis: size of induration (mm))

Biostatistics
• Type of Data
• Measures of Central Tendency
• Measures of Dispersion
• Expressing Results

Types of Data
• Nominal
• Ordinal
• Ranked
• Discrete
• Continuous

Nominal Data
• Data is placed into ‘named’ categories.
• E.g.
– 1 = pneumonia
– 2 = heart disease
– 3 = abdominal pain
• Mathematical analysis usually inappropriate. (exception might be 0 = male, 1 = female)

Ordinal Data
• Data relates to a logical order.
• Example:
– 5 = fatal
– 4 = severe
– 3 = moderate
– 2 = mild
– 1 = none
• Mathematical analysis usually inappropriate. Does mild + moderate = fatal?

Ranked Data
• Data relates to position within a sequence.
• E.g. Causes of death…
– 1 = cardiovascular disease
– 2 = neoplasm
• Mathematical analysis usually inappropriate. However, information is usually useful and often quoted.

Discrete Data
• Data represents ‘counts’.
• E.g.
– number of children
– number of accidents
– number dying of heart failure
• Mathematics are appropriate although the result may not be. e.g. 2.4 children / family

Continuous Data
• Data has any numerical value (ratio data)
• E.g.
– cholesterol values
– blood pressures
• Mathematics is usually appropriate. e.g.
Average hemoglobin was 120 g/L

Staging of Heart Failure: NYHA Cardiac Status
• Class I: uncompromised
• Class II: slightly compromised
• Class III: moderately compromised
• Class IV: severely compromised
– updated from old NYHA Classification
• ‘usual activities’ ‘minimal exertion’

Specific Activity Scale (Goldman, Circulation 64:1227, 1981)

Stage I
• patients can perform to completion any activity requiring 7 metabolic equivalents
– can carry 24 lb up eight steps
– carry objects that weigh 80 lb
– do outdoor work [shovel snow, spade soil]
– do recreational activities [skiing, basketball, squash, handball, jog/walk 5 mph]

Stage II
• patients can perform to completion any activity requiring 5 metabolic equivalents
– have sexual intercourse without stopping
– garden, rake, weed, roller skate
– dance fox trot, walk at 4 mph on level ground
– but cannot and do not perform to completion activities requiring 7 metabolic equivalents

Stage III
• patients can perform to completion any activity requiring 2 metabolic equivalents
– dress, shower without stopping, strip and make bed, clean windows
– walk 2.5 mph, bowl, play golf, dress without stopping
– but cannot and do not perform to completion any activities requiring 5 metabolic equivalents

Stage IV
• patients cannot or do not perform to completion activities requiring 2 metabolic equivalents
– CAN’T:
• dress without stopping
• shower without stopping
• strip and make bed
• walk 2.5 mph
• bowl, play golf

Prognosis varies with Class
Stage IV is NOT 4 x more serious than Stage I heart failure (the classes are ordinal data).
Measures of Central Tendency
• Mean
• Median
• Mode
• others exist
– truncated mean
– geometric mean
– weighted mean

Mean
• Average
mean = sum of all observations / number of observations
2, 3, 6, 8, 10, 12
41 / 6 = 6.83333

Median
• The 50th percentile (or ‘middlemost’ value).
3, 6, 7, 19, 10, 13, 2, 1, 21, 4, 22
sorted: 1, 2, 3, 4, 6, 7, 10, 13, 19, 21, 22
Median = 7
(Use Average of the Two Middle Values if Even Number of Observations)
1, 2, 3, 4, 6, 6, 7, 10, 13, 19, 21, 22
Median = (6 + 7)/2 = 6.5

Mode
• Most common value.
3, 6, 7, 4, 19, 4, 10, 13, 10, 2, 1, 21, 4, 22
Mode = 4

Measures of Central Tendency
• Medicine and Health – mainly mean and median
• Mean:
– sensitive to outliers
– does not convey multimodal distributions
• Median:
– less intuitive
– less suitable for mathematical analysis

Hospital Length of Stay: typical example where a few patients (e.g. complication of surgery) require longer stays
Figure: distributions with the same mean age

Normal Distribution
• mean = median = mode
• bell shaped (single peak) and symmetrical

Measures of Dispersion (variability)
• Range
• Variance
• Standard Deviation
• Standard Error
• Confidence Intervals

Range
• The difference between largest and smallest values. (Usually expressed as smallest to largest)
2, 4, 6, 10, 12, 14, 17, 20
range = 18
The range was 2 to 20.

Interquartile Range
• the distance between the 25th percentile and the 75th percentile
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
IQR = 4 to 9

Variance (sample)
115, 116, 118, 114, 117: mean = 116, range = 4
44, 80, 110, 180, 166: mean = 116, range = 136
Range is helpful but depends only on two numbers.
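The worked examples above can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the helper `summarize` and the variable names are illustrative assumptions. The sample standard deviation is included alongside the range because the two data sets share a mean yet differ widely in spread.

```python
import statistics

# Central tendency, using the slide examples.
mean_value = statistics.mean([2, 3, 6, 8, 10, 12])
median_odd = statistics.median([3, 6, 7, 19, 10, 13, 2, 1, 21, 4, 22])
median_even = statistics.median([1, 2, 3, 4, 6, 6, 7, 10, 13, 19, 21, 22])
mode_value = statistics.mode([3, 6, 7, 4, 19, 4, 10, 13, 10, 2, 1, 21, 4, 22])

print(f"mean = {mean_value:.5f}")
print(f"median (11 values) = {median_odd}")
print(f"median (12 values) = {median_even}")
print(f"mode = {mode_value}")


def summarize(values):
    """Return mean, range, and sample standard deviation (n - 1 denominator)."""
    return {
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "sample_sd": statistics.stdev(values),
    }


# Two data sets with the same mean but very different spread.
tight_summary = summarize([115, 116, 118, 114, 117])
wide_summary = summarize([44, 80, 110, 180, 166])

print(tight_summary)
print(wide_summary)
```

`statistics.stdev` divides by n - 1 (the sample formula); `statistics.pstdev` would divide by n, which applies when the whole population has been measured.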
Variance (sample)
115, 116, 118, 114, 117   mean = 116   range = 4

observations     115   116   118   114   117
mean             116   116   116   116   116
difference        -1     0     2    -2     1    sum = 0
diff squared       1     0     4     4     1    sum = 10

divide by (obs - 1): 10 / (5-1) = 2.5 = variance
take square root of variance = √2.5 = 1.58 = std dev

Variance (sample)
44, 80, 110, 180, 166   mean = 116   range = 136

observations      44    80   110   180   166
mean             116   116   116   116   116
difference       -72   -36    -6    64    50    sum = 0
diff squared    5184  1296    36  4096  2500    sum = 13,112

divide by (obs - 1): 13,112 / (5-1) = 3,278 = variance
take square root of variance = √3,278 = 57.3 = std dev

Normal Distribution
• +/- 1 sd 68%
• +/- 2 sd 95%
• +/- 3 sd 99.7%

Variance (Population)
• Variance of a Population
– population is where everyone is measured
– denominator = number of observations
• Variance of a Sample
– a sample of the population is selected
– denominator = number of observations - 1

Standard Error
• Imagine a data set with 1,000 values
– Select 100 values, calculate mean
– Select 100 values, calculate mean
– Select 100 values, calculate mean
– Select 100 values, calculate mean
– and so on, and so on…
– Plot the means
– Calculate the standard deviation of these means

Standard Error
Another method: Standard Error = Standard Dev / √(sample size)

Confidence Interval
• General Formula:
  95% Confidence Interval = mean – (1.96 x Standard Error) to mean + (1.96 x Standard Error)

So what does this actually mean?
• Confidence Interval – the range over which the TRUE VALUE is covered 95% of the time.

Expressing Our Results
• Point Estimate and Confidence Interval

RALES Trial NEJM 1999;341:709-17
placebo: 753 / 841 = 0.895
spironolactone: 515 / 822 = 0.625
Relative Risk = 0.625 / 0.895 = 0.70 (70%)   Point Estimate
95% CI (0.59 to 0.82)   Measure of Precision
What if C.I. included 1.0?

Graphs
• Box Plots
• Survival Curves
• There are others, which you are likely familiar with.
– pie, line, bar…

Box Plots

Survival (Kaplan – Meier Curve)
- plots events over time (not necessarily death)
- takes into consideration losses to followup
- be able to identify this graph type

Independent versus Dependent Variables
• Independent Variables
– those that are manipulated
– includes the ‘populations’ of interest
– e.g. experimental drug vs placebo
– e.g. population with diabetes vs controls
• Dependent Variables
– those that are only measured or registered
– includes the ‘outcome’ of interest
– e.g. mortality, morbidity
– e.g. health related quality of life

Hypothesis Testing
• Is there an association between an independent and dependent variable?
• Generate a null hypothesis
  There is no association between these variables.
1. Reject the Null Hypothesis or
2. Do Not Reject the Null Hypothesis

Implications
• We do not ‘accept’ the null hypothesis…
– Failing to demonstrate an association does not ensure that an association does not exist!
– Equivalency trials

Errors: two possibilities
• Type 1
– alpha error or rejection error
– ‘rejecting the null hypothesis when in fact there is no association’
– bias
– confounding
– play of chance
• P-value: accept a probability of Type 1 error = 0.05 (5%)

Errors: two possibilities
• Type 2 Error
– beta error
– ‘error of missed opportunity’: not rejecting the null hypothesis when in fact there is an association
– inter-related reasons
  • high variance among the outcomes – population attributes
  • small sample size relative to variance
  • intervention was insufficient (too low a dose)
  • intervention was too brief (too short a trial)
– Power = 1 – Beta
– ‘The power to detect a difference was …’ values typically vary from 80% to 95%

End of the Line… Any Questions or Suggestions?
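As a recap of the variance, standard error, and confidence interval slides above, the arithmetic can be sketched in a few lines of Python. The function names `sample_summary` and `ci_excludes_null` are assumptions made for this illustration. The 1.96 multiplier is the general formula from the slide, which with only five observations is a rough approximation.

```python
import math
import statistics


def sample_summary(values):
    """Sample variance, standard deviation, standard error, and 95% CI of the mean."""
    n = len(values)
    mean = statistics.mean(values)
    variance = statistics.variance(values)   # sample variance: denominator is n - 1
    std_dev = math.sqrt(variance)
    std_err = std_dev / math.sqrt(n)         # Standard Dev / sqrt(sample size)
    ci_95 = (mean - 1.96 * std_err, mean + 1.96 * std_err)
    return {
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "std_err": std_err,
        "ci_95": ci_95,
    }


def ci_excludes_null(low, high, null_value=1.0):
    """True if the interval does not contain the 'no association' value.

    For a relative risk the null value is 1.0 (hypothetical helper).
    """
    return not (low <= null_value <= high)


if __name__ == "__main__":
    print(sample_summary([115, 116, 118, 114, 117]))   # tightly clustered data
    print(sample_summary([44, 80, 110, 180, 166]))     # same mean, far wider spread
    # Population variance uses n, not n - 1, in the denominator:
    print(statistics.pvariance([115, 116, 118, 114, 117]))
    print(ci_excludes_null(0.59, 0.82))                # RALES interval from the slide
```

Both data sets share a mean of 116, but the second yields a much larger standard error and therefore a wider confidence interval. That is the sense in which the interval is a measure of precision.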