Download RMEBM - Calgary Internal Medicine

Medical Council of Canada
Certifying Exam Review
April 16, 2007
Jeffrey P Schaefer MSc MD FRCPC
Epidemiology
Clinical Epidemiology
Commonest Clinical Questions
• Harm
• Diagnosis
• Therapy
• Prognosis
• Prevention
Measures of Disease
Measures of Disease Frequency
• Disease Incidence
– refers to new cases of disease among a
population at risk for that disease over a
specified period of time.
Measures of Disease Frequency
• Cumulative Incidence (CI)
CI = new cases / population at risk / time interval
Example (observed for 1 yr):
– Population = 120
– Population at risk = 100
– Population not at risk = 20
– New cases = 25
CI = 25/100/yr
Measures of Disease Frequency
• Incidence Density
ID = new cases / person-time of the population at risk
Useful when there are losses to follow-up among the population under
observation
Measures of Disease Frequency
• A study plans to observe 20,000 people for 1
year, but 10,000 left town after 6 months. 15
cases of disease were observed among the
population.
10,000 observed x 1 yr   = 10,000 p-yr
10,000 observed x 0.5 yr =  5,000 p-yr
Total person-years       = 15,000 p-yr
Measures of Disease Frequency
15 cases / 15,000 person-years
= 1 case / 1,000 person-years
= 10 cases / 10,000 persons per year
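The person-time arithmetic above can be sketched in a few lines of Python. This is only an illustration using the slide's figures; the variable names are assumptions, not part of the original material.

```python
# Incidence density from person-time, using the figures on the slide.
# Each tuple is (number of people, years each of them was observed).
follow_up = [(10_000, 1.0), (10_000, 0.5)]
new_cases = 15

# Person-years: sum of observation time over everyone at risk.
person_years = sum(people * years for people, years in follow_up)

# Incidence density, expressed per 1,000 person-years.
rate_per_1000 = new_cases / person_years * 1000

print(f"{person_years:,.0f} person-years")
print(f"{rate_per_1000:.1f} cases per 1,000 person-years")
```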
Incidence of Cancer of the Lung, Trachea, and Bronchus, Both Sexes
Combined, All Ages, 1998, from Statistics Canada, by Province or Territory.
Can    Nfld   PEI    NS     NB     Que    Ont    Man    Sask   Alb    BC     Yuk    NWT
61.10  41.74  78.57  75.54  72.69  74.72  55.83  62.72  54.05  52.30  54.62  48.30  122.25
Incidence Data
• Challenge
– ensure the denominator is truly at risk.
Measures of Disease Frequency
• Disease Prevalence
– refers to the proportion of individuals who have
the disease at a specific time among a given
population. (not a rate)
Prevalence = existing cases at a point in time / total population
Measures of Disease Frequency
• Point Prevalence
– ‘snapshot’ in calendar time or event related
time
– number of cases of diabetes on June 30, 2002
in Calgary
– number of people with infected incision on postoperative day 3 on General Surgery unit at
Foothills Hospital
Measures of Disease Frequency
• Period Prevalence
– longer duration under consideration
– number of people with diabetes between Jan 1
and Dec 31, 2002 in Calgary
– number of people with infected incision between
surgery and post-operative day 90 on General
Surgery unit at Foothills Hospital
Period Prevalence of Diabetes
Canada 1996-97
Global Cigarette Consumption
Special Measures of Disease Frequency
• Mortality Rate
– incidence of death (cause specific vs all cause
mortality)
• Infant Mortality Rate
– death of a live born infant within the 1st year of
life
• Neonatal Death
– death of an infant under 28 days of age
• Post-neonatal Death
– death of an infant between 4 weeks and 1 year
of age
Infant Mortality Rates Selected Countries, 1995
Infant Mortality Rates by Income Quintile,
Urban Canada, 1991
Prevention
Stages of disease over time, with the level of prevention that applies:
• risk factors only → primary prevention
• subclinical disease → secondary prevention
• clinical disease → tertiary prevention
• complicated disease
Harm
Harm
• Exposures that cause disease = Harm
• Harmful exposures are Risk Factors
• Risk Factors may be etiologies of disease.
Harm
EXPOSURE → DISEASE
A harmful exposure is a RISK FACTOR
• Robert Koch
– Dec 11, 1843 - May 27, 1910
– Nobel Prize - 1905
• Anthrax
• Tuberculosis
• Cholera
Harm
• Koch’s Postulates (1882)
• Criteria for Causation
– the organism must be present in every case of
disease,
– the organism be isolated from the diseased
animal and grown in pure culture,
– the organism causes the same disease when
inoculated into a healthy animal,
– the organism must be recovered from that
animal (step 3) and identified.
Harm
• Limitations of Koch’s Postulates
– not all diseases are infectious
– one risk factor may cause more than one disease
– one disease may have more than one risk factor
Figure 4-4
Leading causes of death, number and percentage of
deaths, Canada, 1999
Cause                              Deaths     Percent
All Cardiovascular Disease         78,942     36%
  – AMI                            20,926     9.5%
  – Other IHD                      21,693     10%
  – Cerebrovascular Disease        15,409     7%
  – Other CVD                      20,914     9.5%
Cancer                             62,606     29%
Respiratory                        22,026     10%
Accidents/Poisoning/Violence       13,996     6%
Diabetes                            6,137     3%
Infectious Diseases                 2,583     1%
Other                              33,240     15%
Total Number of Deaths: 219,530
Cardiovascular (ICD-9 390-459); Respiratory (ICD-9 460-519); Diabetes (ICD-9 250); Cancer (ICD-9 140-239);
Infectious Diseases (ICD-9 001-139); Accidents/Poisonings/Violence (ICD-9 E800-E999)
Source: Health Canada, using data from Mortality File, Statistics Canada
The Growing Burden of Heart Disease and Stroke in Canada 2003
Harm
One risk factor, many diseases. Smoking →
• Emphysema
• Chronic Bronchitis
• Lung Cancer
• Bladder Cancer
• Lip Cancer
• Coronary dx
• Stroke
• Ischemic Limb
Harm
One disease, many risk factors. Myocardial Infarction ←
• Hypertension
• Hypercholesterolemia
• Male Sex
• Family History
• Smoking
Causality for the Modern Era
• Austin Bradford Hill
– July 8, 1897 - April 18, 1991
– English epidemiologist and statistician
– pioneered the randomized clinical trial
– published criteria for causality
Harm
• Austin Bradford Hill’s criteria for causality.
• Temporality
– exposure precedes onset of disease
• Strength of Association
– exposure strongly associated with disease
frequency
Harm
• Dose – Response
– more exposure associates with higher disease
frequency or severity
• Reversibility
– reduction in exposure associates with lower
rates of disease
• Consistency
– association between exposure and disease is
observed by different persons in different
places under different circumstances
Harm
• Specificity
– one cause leads to one effect
• Analogy
– cause and effect relationship already
established for similar risk factor or disease
• Biological Plausibility
– is the association biologically plausible
(keeping in mind that plausibility changes with
time)
Measures of Risk for Disease
• 1964 cohort study data
Population → stratify on smoking status → observe for several years
– Smokers (RF+): Lung Cancer Death / Not Lung Cancer Death
– Not Smokers (RF-): Lung Cancer Death / Not Lung Cancer Death
Measures of Risk for Disease
lung cancer mortality
– cigarette smokers: 0.96 / 1000 / year
– non-smokers: 0.07 / 1000 / year
– all subjects: 0.56 / 1000 / year
prevalence of cigarette smoking = 56%
Measures of Risk for Disease
• Q: What is the additional risk (incidence) of
disease following exposure, over and above
that experienced by people who are not
exposed?
• Attributable Risk
Measures of Risk for Disease
AR = Attributable Risk
I_EXP = disease incidence among exposed
I_UNEXP = disease incidence among not exposed
AR = I_EXP – I_UNEXP
Measures of Risk for Disease
lung cancer mortality
– cigarette smokers: 0.96 / 1000 / year
– non-smokers: 0.07 / 1000 / year
0.96 - 0.07 = 0.89
Attributable Risk = 0.89 deaths / 1000 / yr
For every 10,000 smokers compared to
10,000 non-smokers there were about 9 more
deaths each year.
Measures of Risk for Disease
• Q: How many times are exposed persons
more likely to get the disease relative to
nonexposed persons?
• Relative Risk
Measures of Risk for Disease
RR = Relative Risk
I_EXP = disease incidence among exposed
I_UNEXP = disease incidence among not exposed
RR = I_EXP / I_UNEXP
Measures of Risk for Disease
lung cancer mortality
– cigarette smokers: 0.96 / 1000 / year
– non-smokers: 0.07 / 1000 / year
0.96 / 0.07 = 13.7
Relative Risk = 13.7
Smokers are 13.7 times more likely to die
from lung cancer than non-smokers.
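Attributable risk and relative risk use the same two incidences, one as a difference and one as a ratio. A minimal Python sketch with the 1964 cohort figures follows; the function names are illustrative assumptions.

```python
def attributable_risk(i_exposed, i_unexposed):
    """Extra incidence among the exposed (a difference)."""
    return i_exposed - i_unexposed


def relative_risk(i_exposed, i_unexposed):
    """How many times higher the incidence is among the exposed (a ratio)."""
    return i_exposed / i_unexposed


# Lung cancer deaths per 1,000 per year, from the slides.
smokers, non_smokers = 0.96, 0.07

ar = attributable_risk(smokers, non_smokers)
rr = relative_risk(smokers, non_smokers)

print(f"AR = {ar:.2f} deaths / 1000 / yr")
print(f"RR = {rr:.1f}")
```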
Measures of Risk for Disease
• Q: How much does a risk factor contribute
to the overall rates of disease in groups of
people, rather than individuals?
• Population Attributable Risk
Measures of Risk for Disease
• PAR = Population Attributable Risk
• AR = attributable risk
• P = prevalence of exposure
PAR = AR x P
Measures of Risk for Disease
lung cancer mortality
– cigarette smokers: 0.96 / 1000 / year
– non-smokers: 0.07 / 1000 / year
– smokers: 56% of the population
(0.96 - 0.07) x 0.56 = 0.4984 ~ 0.5
Population Attributable Risk = 0.5 / 1000 / yr
Annually, smoking accounts for 1 lung cancer
death per 2,000 people in the
population.
Measures of Risk for Disease
• Example of utility of Pop. Attributable Risk
• Hypertension (high blood pressure) is a risk
factor for stroke.
What would be more effective in lowering the
incidence of stroke in a population?
– Treating mild hypertension
– Treating severe hypertension
Surprise! Treating mild hypertension.
The prevalence of mild hypertension far
exceeds the prevalence of severe hypertension.
Measures of Risk for Disease
• What fraction of disease in a population is
attributable to exposure to a risk factor?
• Population Attributable Fraction
Measures of Risk for Disease
• PAF = Population Attributable Fraction
• PAR = population attributable risk
• I_Total = disease incidence in total population
PAF = PAR / I_Total
Measures of Risk for Disease
lung cancer mortality
– all subjects: 0.56 / 1000 / year
– PAR: 0.4984 / 1000 / year
0.4984 / 0.56 = 0.89
Population Attributable Fraction = 0.89
Eighty-nine percent (89%) of the lung cancer deaths
were accounted for by cigarette smoking.
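The two population-level measures chain together: PAR scales the attributable risk by how common the exposure is, and PAF divides PAR by the overall incidence. The sketch below simply replays the slide's arithmetic; the variable names are assumptions.

```python
# Lung cancer deaths per 1,000 per year, from the slides.
i_exposed = 0.96      # cigarette smokers
i_unexposed = 0.07    # non-smokers
i_total = 0.56        # all subjects
prevalence = 0.56     # proportion of the population who smoke

# Population attributable risk: AR x prevalence of exposure.
par = (i_exposed - i_unexposed) * prevalence

# Population attributable fraction: PAR / total incidence.
paf = par / i_total

print(f"PAR = {par:.4f} deaths / 1000 / yr")
print(f"PAF = {paf:.2f}")
```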
Diagnosis
Medical Diagnostics
general process: history → physical examination → diagnostic tests
What’s the best test?
• Gold Standard: A method, procedure, or
measurement that is widely accepted as
being the best available.
• Gold Standard = ‘reference standard’
Issues in Diagnostic Testing
• Invasiveness
– urine sample versus brain biopsy versus autopsy
• Cost
– glucoscan strip ~ $1.00 versus MRI $770.00
• Availability
– hemogram versus Positron Emission Tomography
• Patient Acceptability
– urine sample versus 3 day fecal fat collection
• How well the test performs!!!
Test Characteristics
• Sensitivity
• Specificity
• Positive predictive value
• Negative predictive value
• Accuracy
• Likelihood ratio
PE - diagnosis
Pulmonary angiogram - gold standard
PE - diagnosis (spiral CT scan)
                 DISEASE Present     DISEASE Absent
TEST Positive    TRUE POSITIVE       FALSE POSITIVE
TEST Negative    FALSE NEGATIVE      TRUE NEGATIVE
Hypothetical Test Results
                   DISEASE (PE)               DISEASE (PE)
TEST (V/Q scan)    Present                    Absent                     Total
Positive           TRUE POSITIVE  a = 80      FALSE POSITIVE  b = 20     a + b = 100
Negative           FALSE NEGATIVE c = 10      TRUE NEGATIVE   d = 90     c + d = 100
Total              a + c = 90                 b + d = 110                a+b+c+d = 200
Sensitivity
• Probability that test is positive given that
disease is present.
P (T+ | D+)
Sensitivity
Sensitivity = a / (a + c) = 80 / (80 + 10) = 88.9%
Specificity
• Probability that test is negative given that
disease is absent.
P (T- | D-)
Specificity
Specificity = d / (b + d) = 90 / (90 + 20) = 81.8%
Sensitivity - Specificity Trade-Off
• Most test results are not simply positive or negative; they fall on a continuous scale.
• There is often a selected value
– over which a test is said to be positive
– under which a test is said to be negative.
• As a result….
– increasing sensitivity results in loss of specificity
– increasing specificity results in loss of sensitivity
Sensitivity / Specificity Trade-off
Sensitivity Decreases
Specificity Increases
Sensitivity / Specificity Trade-off
• Receiver Operating Characteristic (ROC) curve
Test Characteristic Issues
• Highly Sensitive Tests:
– tend to be less invasive, less risky, less costly
– best for screening programs
– best for ruling out disease: “SNOUT”
Test Characteristic Issues
• Highly Specific Tests:
– tend to be more invasive, more risky, more costly
– best for confirming (ruling in) disease: “SPIN”
Positive Predictive Value
• Probability that disease is present given
that the test was positive.
P (D+ | T+)
Positive Predictive Value
Positive Predictive Value = a / (a + b) = 80 / (80 + 20) = 80.0%
Negative Predictive Value
• Probability that disease is absent given that
the test was negative.
P (D- | T-)
Negative Predictive Value
Negative Predictive Value = d / (c + d) = 90 / (90 + 10) = 90.0%
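The four test characteristics defined so far all come from the same 2 x 2 table, read by column (sensitivity, specificity) or by row (predictive values). A small Python sketch using the hypothetical V/Q scan counts is given below; the class and method names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    a: int  # true positives  (test +, disease present)
    b: int  # false positives (test +, disease absent)
    c: int  # false negatives (test -, disease present)
    d: int  # true negatives  (test -, disease absent)

    def sensitivity(self):
        return self.a / (self.a + self.c)   # P(T+ | D+)

    def specificity(self):
        return self.d / (self.b + self.d)   # P(T- | D-)

    def ppv(self):
        return self.a / (self.a + self.b)   # P(D+ | T+)

    def npv(self):
        return self.d / (self.c + self.d)   # P(D- | T-)


table = TwoByTwo(a=80, b=20, c=10, d=90)
for name in ("sensitivity", "specificity", "ppv", "npv"):
    print(f"{name}: {getattr(table, name)():.1%}")
```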
Test Characteristic Issues
• Positive and Negative Predictive Values
depend on disease prevalence
• This is a major drawback.*
(* excellent exam question)
Change Disease Prevalence from 90 to 110 per 200
(sensitivity and specificity held constant; cell counts rescaled)
                   DISEASE (PE)         DISEASE (PE)
TEST (V/Q scan)    Present              Absent               Total
Positive           a = 97.7 (was 80)    b = 16.4 (was 20)    a + b = 114.1
Negative           c = 12.2 (was 10)    d = 73.6 (was 90)    c + d = 85.8
Total              a + c = 110          b + d = 90           a+b+c+d = 200
prevalence = 110 / 200 = 0.55 = 55% (was 45%)
sensitivity = 97.7 / 110 ≈ 89% (unchanged)
specificity = 73.6 / 90 ≈ 82% (unchanged)
positive predictive value = 97.7 / 114.1 ≈ 86% (was 80%)
negative predictive value = 73.6 / 85.8 ≈ 86% (was 90%)
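The dependence of predictive values on prevalence can be checked directly. The sketch below holds sensitivity and specificity fixed at the values from the hypothetical V/Q scan table and recomputes PPV and NPV at the two prevalences; the function name is an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence                # true positives, as a proportion
    fn = (1 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1 - prevalence)          # true negatives
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    return tp / (tp + fp), tn / (tn + fn)


sens, spec = 80 / 90, 90 / 110   # from the original 2 x 2 table

ppv_45, npv_45 = predictive_values(sens, spec, 90 / 200)
ppv_55, npv_55 = predictive_values(sens, spec, 110 / 200)

print(f"prevalence 45%: PPV {ppv_45:.1%}, NPV {npv_45:.1%}")
print(f"prevalence 55%: PPV {ppv_55:.1%}, NPV {npv_55:.1%}")
```

As prevalence rises, PPV rises and NPV falls, even though the test itself has not changed.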
Accuracy
• Probability that the test result is correct (true positives plus true negatives, over everyone tested).
• (not a useful concept as you’ll see later)
Accuracy
Accuracy = (a + d) / (a + b + c + d) = (80 + 90) / (80 + 20 + 10 + 90) = 85.0%
Test Characteristic Issues
• Accuracy:
– not useful characteristic
– high sensitivity / low specificity test may have
same accuracy as low sensitivity / high specificity
test
(positive) Likelihood Ratio
• Ratio of:
probability of positive test when disease is present
-------------------------------------------------------------------
probability of positive test when disease is absent
= sensitivity / (1 – specificity)
Positive Likelihood Ratio
LR+ = [a / (a + c)] / [b / (b + d)] = (80 / 90) / (20 / 110) = 4.89
Utility of (Positive) Likelihood Ratios
• expresses how many times more likely a
test result is to be found in diseased,
compared to nondiseased, people.
• can estimate the post-test probability of
disease if prevalence is known.
Pre-test Probability of Disease
• Consider: a female presents for a screening
breast mammogram for breast cancer.
• What’s her pre-test probability of disease?
Prevalence of Disease
Positive Test Result
• Say that her mammogram shows her to have
a 1 cm spiculated calcification
• Say that this finding is associated with a
likelihood ratio of 20 (a very suspicious
lesion).
Highly suspicious lesion
What is the post-test probability of
disease?
Answer:
Pretest odds x Likelihood Ratio = Posttest odds
(the use of odds, rather than probabilities, makes the math convoluted)
What is the post-test probability of disease?
Pretest odds x Likelihood Ratio = Posttest odds
Assume: prevalence = 10 / 1000 = 1% = P(0.01)
Odds = probability of event / (1 - probability of event)
Pre-test Odds = (10/1000) / (1 - (10/1000)) = 0.0101
What is the post-test probability of
disease?
Pretest odds x Likelihood Ratio = Posttest odds
0.0101 x 20 = 0.2020
Probability = Odds / (1 + Odds)
Posttest Probability = 0.2020 / (1 + 0.2020)
Posttest Probability = 0.168 = 16.8%
Utility of (Positive) Likelihood Ratio
Pre-test Probability = 1%
Post-test Probability = 16.8%
Prudent Course: move from screening test
to confirmatory test!
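The probability → odds → probability round trip is easy to get wrong by hand, so a short Python sketch of the same steps may help. It assumes the slide's figures (prevalence of 1%, likelihood ratio of 20); the function names are illustrative.

```python
def prob_to_odds(p):
    return p / (1 - p)


def odds_to_prob(odds):
    return odds / (1 + odds)


def post_test_probability(pre_test_prob, likelihood_ratio):
    """Pretest odds x likelihood ratio = posttest odds, converted back to a probability."""
    post_odds = prob_to_odds(pre_test_prob) * likelihood_ratio
    return odds_to_prob(post_odds)


pre = 10 / 1000    # assumed prevalence, used as the pre-test probability
lr = 20            # likelihood ratio attached to the mammogram finding

post = post_test_probability(pre, lr)
print(f"pre-test {pre:.1%} -> post-test {post:.1%}")
```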
Treatment and Prevention
Measures of Treatment & Prevention Effect
• Results
– Incidence of Stroke over 5 years:
– treatment: 5.2 / 100 participants
– placebo: 8.2 / 100 participants
Measures of Treatment & Prevention Effect
• Absolute Risk Reduction
• Relative Risk Reduction
• Relative Risk
• Odds Ratio
• Number Needed to Treat
Absolute Risk Reduction
ARR = P(event placebo) – P(event treatment)
treatment: 5.2% = 0.052
placebo:   8.2% = 0.082
ARR = 0.082 – 0.052 = 0.03
Relative Risk Reduction
RRR = [P(event placebo) – P(event treatment)] / P(event placebo)
treatment: 5.2% = 0.052
placebo:   8.2% = 0.082
RRR = (0.082 – 0.052) / 0.082 = 0.37
Relative Risk
RR = P(event treatment) / P(event placebo)
treatment: 5.2% = 0.052
placebo:   8.2% = 0.082
RR = 0.052 / 0.082 = 0.63
RR & RRR
• Relative Risk + Relative Risk Reduction = 1
0.63 + 0.37 = 1.00
Number Needed to Treat
NNT = 1 / ARR
treatment: 5.2% = 0.052
placebo:   8.2% = 0.082
NNT = 1 / (0.082 – 0.052) = 33
Odds Ratio
‘Ratio of the Odds’
Odds = P(event) / ( 1 – P(event) )
             %       prob     odds
treatment    5.2%    0.052    0.052 / (1 – 0.052) = 0.0549
placebo      8.2%    0.082    0.082 / (1 – 0.082) = 0.0893
Odds Ratio = Odds (treatment) / Odds (placebo) = 0.0549 / 0.0893 = 0.61
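All five measures come from the same two event probabilities. The sketch below recomputes them for the stroke example (5.2% on treatment, 8.2% on placebo); the variable names are assumptions.

```python
p_treatment = 0.052   # 5-year stroke incidence on treatment
p_placebo = 0.082     # 5-year stroke incidence on placebo


def odds(p):
    return p / (1 - p)


arr = p_placebo - p_treatment                 # absolute risk reduction
rrr = arr / p_placebo                         # relative risk reduction
rr = p_treatment / p_placebo                  # relative risk
nnt = 1 / arr                                 # number needed to treat
odds_ratio = odds(p_treatment) / odds(p_placebo)

print(f"ARR = {arr:.3f}")
print(f"RRR = {rrr:.2f}")
print(f"RR  = {rr:.2f}")
print(f"NNT = {nnt:.1f}")
print(f"OR  = {odds_ratio:.2f}")
```

Because the event is fairly uncommon here, the odds ratio lands close to the relative risk.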
Measures of Treatment & Prevention Effect
• What’s the fuss?
– Absolute risk graphs may not convey the whole
story.
– Small effect → large relative risks
– see data from a recent large trial...
Measures of Treatment & Prevention Effect
• 100% scale on y axis
Measures of Treatment & Prevention Effect
• Raw data:
– losartan: 508 events / 4605 subjects = 11.0%
– atenolol: 588 events / 4588 subjects = 12.8%
– ARR: 0.018 (1.8%)
– RRR: 0.14 (14.0%)
– RR: 0.86 (86.0%)
– NNT = 55.5
Which number would you like?
Measures of Treatment & Prevention Effect
(examples)
Therapy                                            Endpoint                                 NNT (5yr)
stepped care for diastolic BP of 115 - 129         death, stroke, myocardial infarction     3
coronary artery bypass grafting for left main      death                                    6
disease
ASA for transient ischemic attack                  death, stroke                            6
cholestyramine for hypercholesterolemia            death, myocardial infarction             89
INH for inactive tuberculosis                      active tuberculosis                      96
stepped care for diastolic BP 90 - 109             death, stroke, myocardial infarction     141
Efficacy versus Effectiveness
• Clinical Trials (treatment efficacy):
– ideal setting
• Everyday Practice (treatment effectiveness)
– subjects not excluded owing to artificial criteria
– less monitoring by health care workers
– patients will be less adherent on average
– less rigor in measurement
Prognosis
Prognosis
• natural history of disease
– disease course without intervention
• clinical course of disease
– disease course with intervention
Prognosis
• Common outcomes of disease
– 5-year survival: percent of patients surviving for 5 years
from some point in their disease
– Case Fatality Rate: percent of patients with a disease who
die of it
– Disease-specific Mortality: number of people per 10,000 (or
100,000) population dying of a specific disease
– Response: percent of patients showing some evidence of
improvement following an intervention
– Remission: percent of patients entering a phase in which
disease is no longer detectable
– Recurrence: percent of patients who have return of disease
after a disease-free interval
Prognosis
• Common outcomes of disease
– not all outcomes are as definite (or bleak) as those
just shown
– health related quality of life
• more qualitative (than quantitative)
• may be more important to patients than mortality
outcomes
• is likely to become increasingly important
Prognostic Factor versus Risk Factor
• A good example:
– low blood pressure is protective against a
myocardial infarction
– low blood pressure predicts a poor prognosis
among those having a myocardial infarction
– Why?
Prognosis versus Risk Factor
• High BP is associated with atherosclerosis
and thickening of the heart muscle which
both increase risk of heart attack (MI)
• Low BP at the time of an MI suggests muscle
damage severe enough that the heart cannot
generate a satisfactory blood pressure.
Research Design
– study designs
• case reports and case series
• correlational research
• cross-sectional surveys
• case-control
• cohort
• quasi-experimental designs
Case Report
• detailed report of the diagnosis, treatment,
and follow-up of an individual patient
• contain some demographic information
about the patient
– age
– gender
– ethnic origin
– (not usually name)
Case Series
• detailed report of the diagnosis, treatment,
and follow-up of > 1 patient
• contain some demographic information
about the patient
– age
– gender
– ethnic origin
– (not usually name)
• may be consecutive or non-consecutive
Case Report
• BHIVA Conf 2005 Apr 20-23;11:P112D
• INTRODUCTION
– The effects of methotrexate on patients with HIV are not well known. We report a
case of a man with a relatively high CD4 count, not requiring treatment with HAART,
displaying evidence of immune failure while on methotrexate.
• CASE PRESENTATION
– We describe the case of a 48 year old British Caucasian man who presented with a
2 year history of lower limb synovitis requiring recurrent steroid injections; he
became progressively more debilitated. Initially tested negative for HIV. He also
complained of early morning back pain for a number of years and HLA B27
confirmed clinical suspicion of Ankylosing Spondylitis (AS). Repeated HIV testing 2
years after initial presentation confirmed HIV with CD4 count 622 cells/µl (25%) and
41,400 viral load HIV-1 RNA copies/ml.
– Initial results suggested that HAART should be withheld. Escalating doses of
methotrexate have coincided with evidence of impaired T cell function manifest as
widespread Molluscum and violaceous lesions on his 1st MTP joint. Biopsy
confirmed Kaposi's sarcoma (KS). His arthropathy remains difficult to control;
however he now has an AIDS illness requiring treatment with HAART.
• DISCUSSION
– The use of steroids in patients with HIV is an established risk for the
development of opportunistic infection and KS. The effects of methotrexate are not
as clear and there is very little literature of the interaction between HIV and AS.
Case Series
• In the period October 1980-May 1981, five
young men, all active homosexuals, were
treated for biopsy-confirmed Pneumocystis
carinii pneumonia at three different
hospitals in Los Angeles, California. Two of
the patients died.
-- Morbidity & Mortality Weekly Report, 6/5/1981
Case Reports & Case Series
• Strengths
– inexpensive
– rapid (Hanta Virus, Legionnaire’s dx, SARS)
• Weaknesses
– non-systematic → generalizability issue
– draws attention to the unusual
– may be factually inaccurate
• recall issues
• incomplete information or initial workup
– What’s the clinical question?
Correlational Research
(Ecological Study Design)
• Correlational Research
– compares health outcomes between groups or
populations (Not Individuals)
– the units of analysis are populations or groups
of people, rather than individuals
Populations A, B, C and D: for each population, Exposure? Disease?
• Correlational Study
– mean population exposure & mean population disease
– measure during the same period of time
Meat consumption and Colon Cancer Incidence
Correlational / Ecological Research
• Strengths
– inexpensive
– rapid
• Weaknesses
– individual exposure and outcome unknown
– confounding
– cause and effect
Cross Sectional Survey
• XSS:
– exposure and disease status is measured
among individuals at the same time
Individuals A, B, C and D: measure Exposure & Disease in each
Cross-sectional survey: measure exposure and
disease among individuals at same encounter
Cross Sectional Survey
BMJ 2002:324:1152 Injury and financial
status.
Cross Sectional Survey
• Strengths
– can study entire populations
– provide estimates of prevalence of all factors
measured
• Weaknesses
– cause and effect may not be determined
– not good for rare diseases or rare exposures
Case-control Design
• Exposure is measured among individuals
with a disease and among controls.
Individuals with Lung Cancer → radon gas exposed / radon gas unexposed
Individuals without Lung Cancer → radon gas exposed / radon gas unexposed
Case-control design: first establish disease then exposure
Case-control design
American Journal of Epidemiology Vol. 138, No. 5: 281-293
Asthma death and prescription of inhaled medication
Case Control Study
• Strengths
– relatively low cost
– short duration
– good for diseases with long latent periods
– good for rare disease
– can examine multiple exposures for a single disease
– study does not impart risk to patient
• Weaknesses
– cannot determine disease incidence
– may be hard to establish a temporal relationship
between exposure and disease if disease
– it is particularly prone to selection and recall bias
– not good for rare exposures.
Cohort Study
• Disease incidence is measured among
exposed individuals and controls.
Hard Rock Miners → interstitial lung disease / no interstitial lung disease
Not Hard Rock Miners → interstitial lung disease / no interstitial lung disease
Cohort Study: first identify exposure, then measure disease
American Journal of Epidemiology. 152(4):297-306, August 15, 2000.
Categories of exposure among German rubber workers.
Cohort Study
• Strengths
– able to estimate disease incidence
– good for determining temporality
– if prospective, little misclassification of disease status
– can examine multiple effects of a single exposure
– study does not impart risk to subjects
– retrospective cohort studies are inexpensive and require a short
duration to complete
– good for rare exposures
• Weaknesses
– not good for rare diseases
– if retrospective, requires availability of adequate records
– if prospective, validity is affected by losses to follow-up
– if prospective, may be very costly and may take several years to
complete
Quasi-experimental designs
• ‘Almost experimental research designs’
• Several and Varied
– e.g. non-equivalent groups design
– e.g. pre-post (before-after design)
• An example of a Quasi-experimental Design
– non-equivalent groups: measure baseline in two groups, intervene
in one group, remeasure incidence in both groups
– Canmore: Acute MI (baseline) → Lifestyle Education Program → Acute MI (remeasured)
– Cochrane: Acute MI (baseline) → no program → Acute MI (remeasured)
Quasi Experimental Designs
• Weakness
– comparison may or may not be valid
Validity and Reliability
• We accept that:
– we cannot measure all persons in a population
– we cannot make perfect measurements
– we cannot be certain in our associations
Threats to Validity and Reliability
Schaefer’s Simpleton View of the World
• Bias
– any process at any stage of research that
systematically produces results that deviate from
the true values
• Confounding
– occurs when two factors are associated and the
effect of one is confused with or distorted by the
effect of the other.
• Play of Chance
– it is not possible to perfectly control for all
factors
Bias
• Any process at any time during research
• Systematically causes a result deviation
• Usually results from one group behaving
or being treated differently from the other
• Examples
– Selection bias
– Measurement bias
– Recall bias
Confounding
• ‘Confusion of effects…’
• If an association exists between two
exposures → we may be confused as to
which caused the disease
Confounding
Down’s Syndrome and Birth Order
[Figure: Cases of Down syndrome by birth order (what’s wrong here?). y-axis: cases per 100 000 live births (0 to 180); x-axis: birth order (1 to 5)]
Play of Chance
• Random error
– will treat both groups equally in the long run
– biases toward the null hypothesis (no difference
exists between groups)
– reduces reliability
Random Error
[Figure: y-axis: Per Cent (0 to 14); x-axis: Size of induration (mm) (0 to 35)]
Systematic Error (Bias)
[Figure: y-axis: Per Cent (0 to 14); x-axis: Size of induration (mm) (0 to 30)]
Biostatistics
• Type of Data
• Measures of Central Tendency
• Measures of Dispersion
• Expressing Results
Types of Data
• Nominal
• Ordinal
• Ranked
• Discrete
• Continuous
Nominal Data
• Data is placed into ‘named’ categories.
• E.g.
• 1 = pneumonia
• 2 = heart disease
• 3 = abdominal pain
Mathematical analysis usually inappropriate.
(exception might be 0 = male, 1 = female)
Ordinal Data
• Data relates to a logical order.
• Example:
• 5 = fatal
• 4 = severe
• 3 = moderate
• 2 = mild
• 1 = none
• Mathematical analysis usually inappropriate.
Does mild + moderate = fatal?
Ranked Data
• Data relates to position within a sequence.
E.g. Causes of death…
• 1 = cardiovascular disease
• 2 = neoplasm
• Mathematical analysis usually
inappropriate. However, information is
usually useful and often quoted.
Discrete Data
• Data represents ‘counts’.
• E.g.
– number of children
– number of accidents
– number dying of heart failure
• Mathematics are appropriate although result
may not be. e.g. 2.4 children / family
Continuous Data
• Data has any numerical value (ratio data)
• E.g.
– cholesterol values
– blood pressures
• Mathematics is usually appropriate. e.g.
Average hemoglobin was 120 g/l
Staging of Heart Failure
NYHA Cardiac Status
• Class I: uncompromised
• Class II: slightly compromised
• Class III: moderately compromised
• Class IV: severely compromised
– updated from old NYHA Classification
• ‘usual activities’ ‘minimal exertion’
Specific Activity Scale
Goldman Circulation 64:1227, 1981
Stage I
• patients can perform to completion any
activity requiring 7 metabolic equivalents
– can carry 24 lb up eight steps
– carry objects that weigh 80 lb
– do outdoor work [shovel snow, spade soil]
– do recreational activities [skiing, basketball,
squash, handball, jog/walk 5 mph]
Specific Activity Scale
Goldman Circulation 64:1227, 1981
Stage II
• patients can perform to completion any
activity requiring 5 metabolic equivalents
– have sexual intercourse without stopping
– garden, rake, weed, roller skate
– dance fox trot, walk at 4 mph on level ground
– but cannot and do not perform to completion
activities requiring 7 metabolic equivalents
Specific Activity Scale
Goldman Circulation 64:1227, 1981
Stage III
• patients can perform to completion any
activity requiring 2 metabolic equivalents
– dress, shower without stopping, strip and make
bed, clean windows
– walk 2.5 mph, bowl, play golf, dress without
stopping
– but cannot and do not perform to completion
any activities requiring 5 metabolic equivalents
Specific Activity Scale
Goldman Circulation 64:1227, 1981
Stage IV
• patients cannot or do not perform to
completion activities requiring 2 metabolic
equivalents
– CAN’T:
• dress without stopping
• shower without stopping
• strip and make bed
• walk 2.5 mph
• bowl, play golf
Prognosis varies with Class
Stage IV NOT 4 X more serious than stage I heart failure.
Measures of Central Tendency
• Mean
• Median
• Mode
• others exist
– truncated mean
– geometric mean
– weighted mean
Mean
• Average
Mean = sum of all observations / number of observations
2, 3, 6, 8, 10, 12
41 / 6 = 6.83
Median
• The 50th percentile (or ‘middlemost’ value).
3, 6, 7, 19, 10, 13, 2, 1, 21, 4, 22
1, 2, 3, 4, 6, 7, 10, 13, 19, 21, 22
Median = 7
(Use Average of the Two Middle Values
if Even Number of Observations)
1, 2, 3, 4, 6, 6, 7, 10, 13, 19, 21, 22
Median = (6 + 7)/2 = 6.5
Mode
• Most common value.
3, 6, 7, 4, 19, 4, 10, 13, 10, 2, 1, 21, 4, 22
Mode = 4
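The three worked examples above can be reproduced with Python's standard `statistics` module. This is just a convenience sketch; the list names are assumptions.

```python
import statistics

mean_data = [2, 3, 6, 8, 10, 12]
median_odd = [3, 6, 7, 19, 10, 13, 2, 1, 21, 4, 22]   # 11 observations
median_even = median_odd + [6]                         # 12 observations
mode_data = [3, 6, 7, 4, 19, 4, 10, 13, 10, 2, 1, 21, 4, 22]

mean_value = statistics.mean(mean_data)

# median() sorts internally; with an even count it averages the two middle values.
median_odd_value = statistics.median(median_odd)
median_even_value = statistics.median(median_even)

mode_value = statistics.mode(mode_data)

print(f"mean = {mean_value:.2f}")
print(f"median (odd n) = {median_odd_value}")
print(f"median (even n) = {median_even_value}")
print(f"mode = {mode_value}")
```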
Measures of Central Tendency
• Medicine and Health
– mainly mean and median
• Mean:
– sensitive to outliers
– does not convey multimodal distributions
• Median:
– less intuitive
– less suitable for mathematical analysis
Hospital Length of Stay: typical example of where a few patients
(e.g. complication of surgery) require longer stays
same mean age
Normal Distribution
• mean = median = mode
• bell shaped (single peak) and symmetrical
Measures of Dispersion (variability)
• Range
• Variance
• Standard Deviation
• Standard Error
• Confidence Intervals
Range
• The difference between largest and smallest
values. (Usually expressed as smallest to
largest)
2, 4, 6, 10, 12, 14, 17, 20
range = 18
The range was 2 to 20.
Interquartile Range
• the distance between the 25th percentile
and the 75th percentile
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
IQR = 4 to 9
Variance (sample)
115, 116, 118, 114, 117
→ mean = 116
→ range = 4
44, 80, 110, 180, 166
→ mean = 116
→ range = 136
Range is helpful but depends only on two numbers.
Variance (sample)
115, 116, 118, 114, 117
→ mean = 116
→ range = 4
observations    115   116   118   114   117
mean            116   116   116   116   116
difference       -1     0     2    -2     1    sum = 0
diff squared      1     0     4     4     1    sum = 10
divide by (n - 1): 10 / (5-1) = 2.5 = variance
take square root of variance = √2.5 = 1.58 → std dev
Variance (sample)
44, 80, 110, 180, 166
→ mean = 116
→ range = 136
observations     44    80   110   180   166
mean            116   116   116   116   116
difference      -72   -36    -6    64    50    sum = 0
diff squared   5184  1296    36  4096  2500    sum = 13,112
divide by (n - 1): 13,112 / (5-1) = 3,278 = variance
take square root of variance = √3,278 = 57.3 → std dev
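Both worked examples follow the same recipe: subtract the mean, square, sum, divide by n - 1, then take the square root. The sketch below spells the steps out and cross-checks them against the standard library; the helper name is an assumption.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (n - 1)."""
    mean = sum(values) / len(values)
    squared_diffs = [(x - mean) ** 2 for x in values]
    return sum(squared_diffs) / (len(values) - 1)


narrow = [115, 116, 118, 114, 117]
wide = [44, 80, 110, 180, 166]

for data in (narrow, wide):
    var = sample_variance(data)
    sd = math.sqrt(var)
    # statistics.variance also uses the n - 1 denominator.
    assert math.isclose(var, statistics.variance(data))
    print(f"range = {max(data) - min(data)}, variance = {var:.1f}, std dev = {sd:.2f}")
```

Both samples share a mean of 116, yet their standard deviations differ by more than an order of magnitude.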
Normal Distribution
• +/- 1 sd → 68%   +/- 2 sd → 95%   +/- 3 sd → 99.7%
Variance (Population)
• Variance of a Population
– population is where everyone is measured
– denominator = number of observations
• Variance of a Sample
– a sample of the population is selected
– denominator = number of observations - 1
Standard Error
• Imagine a data set with 1,000 values
– Select 100 values, calculate mean
– Select 100 values, calculate mean
– Select 100 values, calculate mean
– Select 100 values, calculate mean
– and so on, and so on…
– Plot the means
– Calculate the standard deviation of these means
Standard Error
Another method: Standard Error = Standard Deviation / √(sample size)
Confidence Interval
• General Formula:
95% Confidence Interval =
mean – (1.96 x Standard Error)
to
mean + (1.96 x Standard Error)
So what does this actually mean?
• Confidence Interval
– if the study were repeated many times, 95% of the
intervals built this way would cover the TRUE VALUE.
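The general formula can be applied to any sample. As a rough sketch, the code below reuses the five-value sample from the variance slides; note the assumption that 1.96 is an adequate multiplier, which really holds for large samples (for n = 5 a t-distribution multiplier would be wider).

```python
import math
import statistics

sample = [115, 116, 118, 114, 117]

mean = statistics.mean(sample)
sd = statistics.stdev(sample)                 # sample standard deviation (n - 1)
standard_error = sd / math.sqrt(len(sample))  # SE = SD / sqrt(sample size)

# 95% confidence interval using the normal-approximation multiplier 1.96.
lower = mean - 1.96 * standard_error
upper = mean + 1.96 * standard_error

print(f"mean = {mean}, SE = {standard_error:.3f}")
print(f"95% CI = {lower:.2f} to {upper:.2f}")
```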
Expressing Our Results
• Point Estimate and Confidence Interval
Rales Trial NEJM 1999;341:709-17
Relative Risk 0.7 → Point Estimate
placebo: 753 / 841 = 0.895
spirono: 515 / 822 = 0.627
0.627 / 0.895 = 70%
95% CI (0.59 to 0.82) → Measure of Precision
What if C.I. included 1.0?
Graphs
• Box Plots
• Survival Curves
• There are others, which you are likely
familiar with.
– pie, line, bar…
Box Plots
Survival (Kaplan – Meier Curve)
- plots events over time (not necessarily death)
- takes into consideration losses to follow-up
- be able to identify this graph type
Independent versus Dependent Variables
• Independent Variables
– those that are manipulated
– includes the ‘populations’ of interest
– e.g. experimental drug vs placebo
– e.g. population with diabetes vs controls
• Dependent Variables
– those that are only measured or registered
– includes the ‘outcome’ of interest
– e.g. mortality, morbidity
– e.g. health related quality of life
Hypothesis Testing
• Is there an association between an
independent and dependent variable?
• Generate a null hypothesis
There is no association between these variables.
1. Reject the Null Hypothesis or
2. Do Not Reject the Null Hypothesis
Implications
• We do not ‘accept’ the null hypothesis…
– Failing to demonstrate an association does not
ensure that an association does not exist!
– Equivalency trials
Errors → two possibilities
• Type 1
– alpha error or rejection error
– ‘rejecting the null hypothesis when in fact there is
no association’
– bias
– confounding
– play of chance
• P-value → accept a probability of Type 1 error = 0.05 (5%)
Errors → two possibilities
• Type 2 Error
– beta error
– ‘error of missed opportunity’
– inter-related reasons
• high variance among the outcomes
– population attributes
• small sample size relative to variance
• intervention was insufficient (too low a dose)
• intervention was too brief (too short a trial)
– Power = 1 – Beta
– ‘The power to detect a difference was …’ (values
typically vary from 80% to 95%)
End of the Line…
Any Questions or Suggestions?