A short introduction to epidemiology
Chapter 2b: Conducting a case-control study
Neil Pearce
Centre for Public Health Research
Massey University
Wellington, New Zealand
Chapter 2 (additional material)
Case-control studies
• This presentation includes additional
material on conducting a case-control
study
• More information on data analysis is
given in chapter 9
Chapter 2 (additional material)
Case-control studies
• Reasons for doing a case-control
study
• Basic study design
• Selection of cases
• Selection of controls
– control sampling strategies
– sources of controls
– issues in control selection
[Figure: follow-up timeline from Birth to End of Follow up, with labels: Death, “non-diseased”, other death, symptoms, lost to follow up, severe disease]
A Hypothetical Incidence Study
                        Exposed   Non-exposed   Ratio
Cases                     1,813           952
Non-cases                 8,187         9,048
Total                    10,000        10,000
Person-years             90,635        95,163
Incidence rate           0.0200        0.0100    2.00
Incidence proportion     0.1813        0.0952    1.90
Incidence odds           0.2214        0.1052    2.11
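The three occurrence measures in this table can be recomputed from the counts and person-years. The following is a minimal sketch; the dictionary layout and the function name `occurrence_measures` are illustrative assumptions, not part of the original slides. With the rounded counts shown here the odds ratio comes out at about 2.10, whereas the table reports 2.11, presumably because it was computed from unrounded values.

```python
# Counts and person-years copied from the hypothetical incidence study.
cohort = {
    "exposed": {"cases": 1813, "total": 10000, "person_years": 90635},
    "non_exposed": {"cases": 952, "total": 10000, "person_years": 95163},
}


def occurrence_measures(group):
    """Return incidence rate, incidence proportion and incidence odds."""
    cases = group["cases"]
    non_cases = group["total"] - cases
    return {
        "rate": cases / group["person_years"],   # cases per person-year
        "proportion": cases / group["total"],    # risk over the follow-up
        "odds": cases / non_cases,               # cases per non-case
    }


exposed = occurrence_measures(cohort["exposed"])
non_exposed = occurrence_measures(cohort["non_exposed"])

# Ratio of each measure, exposed versus non-exposed.
ratios = {name: exposed[name] / non_exposed[name] for name in exposed}

for name in ("rate", "proportion", "odds"):
    print(f"{name:<11}{exposed[name]:.4f}  {non_exposed[name]:.4f}  ratio {ratios[name]:.2f}")
```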
A Hypothetical Case-Control Study
\[
\begin{aligned}
\text{Odds ratio} &= \frac{1813/8187}{952/9048} = \frac{a/c}{b/d} = \frac{ad}{bc} \\
&= \frac{1813/952}{8187/9048} = \frac{a/b}{c/d} = \frac{ad}{bc}
\end{aligned}
\]
Reasons for Doing a Case-Control
Study
• It may be inefficient to have to obtain exposure
information on all people in the source population
• It is sufficient to obtain information on all of the
2,765 deaths and a control sample (e.g. 2,765
controls) of the 17,235 survivors
• We therefore only need to get exposure information
on 5,530 people instead of 20,000
• This gain in efficiency is much greater when the
disease is “rare” (e.g. if it were 1/10th as “common”
then we would have 277 cases and 277 controls)
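The arithmetic behind these bullets can be checked with a short sketch. The one-control-per-case ratio follows the example on the slide; the function name `subjects_needed` is a hypothetical helper introduced only for illustration.

```python
import math

SOURCE_POPULATION = 20_000  # size of the hypothetical cohort
CASES = 2_765               # deaths observed in the cohort


def subjects_needed(n_cases, controls_per_case=1):
    """People whose exposure must be measured in a case-control study."""
    return n_cases + controls_per_case * n_cases


common = subjects_needed(CASES)       # 2,765 cases plus 2,765 controls
rare_cases = math.ceil(CASES / 10)    # 276.5, rounded up to 277 cases
rare = subjects_needed(rare_cases)    # 277 cases plus 277 controls

print(f"Disease as in the example: {common} of {SOURCE_POPULATION} people")
print(f"Disease 1/10th as common:  {rare} of {SOURCE_POPULATION} people")
```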
Reasons for Doing a Case-Control
Study
• Rare disease
• Long induction time
• Smaller study size permits collection and
analysis of more detailed exposure information
• Cohort difficult to enumerate (registry-based
studies)
Basic Case-Control Study Design
• Every study is based on a particular source
population followed over a particular period of
time (the risk period)
• Ideally the study base should be made explicit
• We study all cases of the outcome and a sample
of controls drawn from the source population
• The case-control design thus involves all of the
potential biases involved in a full cohort study, as
well as additional biases involved in sampling
controls
• Information bias is not an inherent feature of
such studies
Cohort-Based (Nested) Case-Control
Studies
• Enumerate the cohort (source population)
and its experience over time (the risk period)
• Ascertain all cases generated by this study
base
• Sample controls from the person-time (or
persons) that generated the cases
Registry-Based Case-Control Studies
• Ascertain all cases appearing in the registry
during a specified period of time
• Sample controls from the source population
for the registry
Selection of Cases
Cohort-based
• All cases (or deceased cases) generated by the
cohort study
• Living cases may be added from other sources
(e.g. hospital records, cancer registrations)
Registry-based
• All eligible cases appearing in the “registry”
during a specified period of time
Control Sampling Strategies
• Cumulative incidence sampling
• Case-base sampling
• Density sampling
Cumulative Incidence Sampling
• “Traditional” method of control selection in
nested case-control studies
• Controls are sampled from the “non-cases”,
those free of disease at the end of the follow-up
period, i.e. the survivors
• I.e. controls are sampled from the
denominators for (cohort) odds ratio analyses
A Hypothetical Incidence Study
                  Exposed     Non-exposed   Total    Odds ratio
Cases             1,813       952           2,765
Non-cases         8,187       9,048         17,235
Total             10,000      10,000        20,000
Incidence odds    1813/8187   952/9048               2.11
A Hypothetical Case-control Study
            Exposed     Non-exposed   Total   Odds ratio
Cases       1,813       952           2,765
Controls    1,313       1,452         2,765
Odds        1813/1313   952/1452              2.11
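The control counts in this table are what one would expect, on average, when 2,765 controls are drawn at random from the 17,235 non-cases. The sketch below reproduces that expectation; it assumes simple random sampling with no matching, and the variable names are illustrative. Because the expected controls have the same exposure odds as the non-cases, the case-control odds ratio equals the cohort odds ratio computed from the counts.

```python
# Cumulative incidence sampling: controls come from the non-cases
# (survivors) at the end of follow-up.
cases = {"exposed": 1813, "non_exposed": 952}
non_cases = {"exposed": 8187, "non_exposed": 9048}
n_controls = 2765  # one control per case, as in the example

total_non_cases = sum(non_cases.values())  # 17,235 survivors

# Expected number of controls in each exposure group under random sampling.
expected_controls = {
    group: n_controls * count / total_non_cases
    for group, count in non_cases.items()
}

case_odds = cases["exposed"] / cases["non_exposed"]
control_odds = expected_controls["exposed"] / expected_controls["non_exposed"]
odds_ratio = case_odds / control_odds

# Odds ratio from the full cohort, for comparison.
cohort_odds_ratio = (1813 / 8187) / (952 / 9048)

print({group: round(n) for group, n in expected_controls.items()})
print(round(odds_ratio, 3), round(cohort_odds_ratio, 3))
```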
Cumulative Incidence Sampling
• Estimates the (cohort) odds ratio (without any
rare disease assumption)
• Estimates the risk ratio and rate ratio
approximately (with a rare disease
assumption)
• May involve matching on age, etc.
• Exposure is usually only considered up to the
“time” (year or age) that the case occurred
Case-cohort Sampling
• Controls can be selected from those at risk at
the beginning of the follow-up period, i.e.
from the entire source population
• I.e. controls are selected from the
denominators for (cohort) risk ratio analyses
A Hypothetical Incidence Study
                        Exposed      Non-exposed   Total    Risk ratio
Cases                   1,813        952           2,765
Total                   10,000       10,000        20,000
Incidence proportion    1813/10000   952/10000              1.90
A Hypothetical Case-control Study
            Exposed     Non-exposed   Total   Odds ratio
Cases       1,813       952           2,765
Controls    1,383       1,383         2,766
Odds        1813/1383   952/1383              1.90
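A similar check applies when controls are drawn from everyone at risk at the start of follow-up. The sketch below assumes simple random sampling of 2,765 controls from the 20,000 cohort members; the expected count is 1,382.5 per exposure group, which the table shows rounded up to 1,383. The variable names are illustrative.

```python
# Case-cohort sampling: controls come from the whole source population
# at the start of follow-up, so future cases can also be controls.
cases = {"exposed": 1813, "non_exposed": 952}
population = {"exposed": 10000, "non_exposed": 10000}
n_controls = 2765

total_population = sum(population.values())  # 20,000 people at baseline

# Expected controls per exposure group under random sampling.
expected_controls = {
    group: n_controls * count / total_population
    for group, count in population.items()
}

case_odds = cases["exposed"] / cases["non_exposed"]
control_odds = expected_controls["exposed"] / expected_controls["non_exposed"]
odds_ratio = case_odds / control_odds

# Risk ratio from the full cohort, for comparison.
risk_ratio = (1813 / 10000) / (952 / 10000)

print(expected_controls)
print(round(odds_ratio, 2), round(risk_ratio, 2))
```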
Case-cohort Sampling
• Estimates the risk ratio (without any rare disease
assumption)
• Requires minor modifications to the standard
formulas for confidence intervals and p-values
• May involve matching on age, etc.
• Once again, exposure is usually only considered up
until the “time” that the case occurred
[Figure: follow-up timeline from Birth to End of Follow up, with labels: Death, “non-diseased”, other death, symptoms, lost to follow up, severe disease]
Density Sampling
• Controls are selected longitudinally
throughout the course of the study, i.e.
from the person-time of the study base
• I.e. controls are sampled from the
denominators for the rate ratio analyses
• In general, controls are selected from the
“risk set” of persons at risk at the “time”
that each case occurred
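Risk-set sampling as described in the bullets above can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: the record layout (`id`, `entry`, `exit`, `is_case`), the function name `risk_set_sample`, and the tiny cohort are all hypothetical, and a person is treated as at risk at time t when `entry <= t < exit`. Someone who becomes a case later is still eligible as a control at earlier times.

```python
import random


def risk_set_sample(cohort, controls_per_case=1, seed=0):
    """For each case, draw controls from the people at risk when it occurred."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    matched_sets = []
    for case in (p for p in cohort if p["is_case"]):
        t = case["exit"]  # "time" (here: age) at which the case occurred
        risk_set = [
            p for p in cohort
            if p["id"] != case["id"] and p["entry"] <= t < p["exit"]
        ]
        k = min(controls_per_case, len(risk_set))
        controls = [p["id"] for p in rng.sample(risk_set, k)]
        matched_sets.append({"case": case["id"], "time": t, "controls": controls})
    return matched_sets


# Hypothetical cohort: ages at entry and exit, and whether exit was the disease.
cohort = [
    {"id": "A", "entry": 40, "exit": 55, "is_case": True},
    {"id": "B", "entry": 40, "exit": 70, "is_case": False},
    {"id": "C", "entry": 50, "exit": 62, "is_case": True},
    {"id": "D", "entry": 58, "exit": 75, "is_case": False},
    {"id": "E", "entry": 45, "exit": 52, "is_case": False},
]

matched = risk_set_sample(cohort)
for s in matched:
    print(s)
```

In this toy cohort the risk set for case A (age 55) is B and C, and the risk set for case C (age 62) is B and D, so C can serve as a control for A before becoming a case.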
A Hypothetical Incidence Study
                  Exposed      Non-exposed   Total     Rate ratio
Cases             1,813        952           2,765
Person-years      90,635       95,163        185,798
Incidence rate    1813/90635   952/95163               2.00
A Hypothetical Case-control Study
            Exposed     Non-exposed   Total   Odds ratio
Cases       1,813       952           2,765
Controls    1,349       1,416         2,765
Odds        1813/1349   952/1416              2.00
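Under density sampling the expected exposure distribution of the controls follows the person-years rather than the head counts. The sketch below assumes 2,765 controls allocated in proportion to person-time; the variable names are illustrative. The resulting odds ratio matches the rate ratio from the incidence study.

```python
# Density sampling: controls are drawn in proportion to person-time at risk.
cases = {"exposed": 1813, "non_exposed": 952}
person_years = {"exposed": 90635, "non_exposed": 95163}
n_controls = 2765

total_person_years = sum(person_years.values())  # 185,798

# Expected controls per exposure group, proportional to person-years.
expected_controls = {
    group: n_controls * py / total_person_years
    for group, py in person_years.items()
}

case_odds = cases["exposed"] / cases["non_exposed"]
control_odds = expected_controls["exposed"] / expected_controls["non_exposed"]
odds_ratio = case_odds / control_odds

# Rate ratio from the full cohort, for comparison.
rate_ratio = (1813 / 90635) / (952 / 95163)

print({group: round(n) for group, n in expected_controls.items()})
print(round(odds_ratio, 2), round(rate_ratio, 2))
```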
Density Sampling
• The “time” variable is usually taken to be
age rather than calendar time (year)
• Estimates the rate ratio (without any rare
disease assumption)
• Matching may also be done on other time-related
factors, although this is usually not necessary
• Usual method of sampling in registry-based studies
Selecting Controls
Cohort-based studies
• Sample of the cohort (preferably by density
sampling on age)
Registry-based studies
• Sample of the source population for the
Registry (usually by density sampling on
year, perhaps with matching on age)
Selecting Controls in
Registry-Based Studies
• Cases chosen from all lung cancer cases at
hospitals in the City
• Controls chosen from general population of
the City?
Selecting Controls in
Registry-Based Studies
• All lung cancer cases at all hospitals in the City
• Controls chosen from general population of the
City?
• Restrict cases to those living in the City (exclude
those who have come to the City for treatment)
• Restrictions that apply to one group (e.g. having a
telephone, being on Electoral Roll, having health
insurance) should also be applied to the other
Selecting Controls in
Registry-Based Studies
• Cases chosen from all lung cancer cases at
the main hospital in the City
• What is the source population for these
cases?
• “All those who would have come to the main
hospital in the City for treatment if they had
developed lung cancer”
Issues in Control Selection
• Controls are usually sampled at random from
the entire study base
• However, it is sometimes desirable to restrict
the controls to a sample of a subset of the
study base
• In particular, we may select controls from
persons with other diseases generated by the
same study base (e.g. other deaths, other
cancers, other hospital admissions)
“Other Disease” Controls
• All other diseases
• All other diseases except those known to be
related to exposure
• A disease “known to be unrelated to exposure”
Reasons for Using “Other Disease”
Controls
The cohort (source population) is not enumerated
• E.g. if the cases are identified from hospital
admissions (e.g. for lung cancer) then the study
base is “all persons who would have been
admitted to this hospital if they had developed
lung cancer”
• Controls might be selected from other
admissions to the same hospital
Reasons for Using “Other Disease”
Controls
Comparability of information
• E.g. in a case-control study of non-Hodgkin’s
lymphoma and pesticide exposure, cases
might be more likely to recall brief exposures
• We might therefore select controls from
“other cancer” registrations rather than from
the entire source population for the Cancer
Registry
Selection Bias in Case-Control Studies
• In a case-control study, the controls are a
sample of the source population
• Selection bias can occur if the sample is
non-random, and the selection of controls is
related to exposure status
• In other words, selection bias can occur if the
controls are not representative of the
exposure in the source population
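A small numerical sketch shows how exposure-related control selection distorts the odds ratio. It reuses the counts from the hypothetical incidence study; the selection weights (exposed non-cases twice as likely to be selected as non-exposed non-cases) are invented purely for illustration, as is the function name.

```python
cases = {"exposed": 1813, "non_exposed": 952}
non_cases = {"exposed": 8187, "non_exposed": 9048}


def control_based_odds_ratio(selection_weight):
    """Expected odds ratio when controls are sampled from the non-cases
    with relative selection probabilities that may differ by exposure."""
    control_odds = (
        non_cases["exposed"] * selection_weight["exposed"]
    ) / (non_cases["non_exposed"] * selection_weight["non_exposed"])
    case_odds = cases["exposed"] / cases["non_exposed"]
    return case_odds / control_odds


# Selection unrelated to exposure: controls represent the source population.
unbiased = control_based_odds_ratio({"exposed": 1.0, "non_exposed": 1.0})

# Hypothetical: exposed people are twice as likely to be selected as controls.
biased = control_based_odds_ratio({"exposed": 2.0, "non_exposed": 1.0})

print(round(unbiased, 2), round(biased, 2))
```

With these assumed weights the control odds of exposure double, so the estimated odds ratio is halved and falls close to the null.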
Selection Bias in Case-Control Studies:
Solutions
• Selection bias can occur if the selection of
controls is related to exposure status
• In the analysis, we can control for the
determinants of control selection (e.g. social
class)
• An exception is when we have chosen “other
disease” controls and the other diseases are
directly caused by the main exposure of
interest: this selection bias cannot be removed
General Population and “Other
Cancer” Controls
General population
• Represents study base
• May be more prone to
recall bias if cases are
more likely to recall
exposures
• Difficult to keep interviewer
blind, and may get
interviewer bias
“Other cancers”
• Other diseases may be
caused by exposure
(selection bias)
• Equal motivation and recall
in cases and controls
• Easier to keep interviewer
blind
Reasons for Matching
Practical efficiency
• e.g. if we are using hospital controls then it is
usually more efficient to select a control
admitted on the same day as the case, rather
than sampling at random from all admissions
for the year
Reasons for Matching
Statistical efficiency
• e.g. if we select general population controls
at random in a lung cancer case-control
study then the cases will be mostly “old” and
the controls will be mostly “young”. It will
therefore be difficult to stratify on, and control
for, age
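One common way to avoid this imbalance is frequency matching, where controls are recruited so that their age distribution mirrors that of the cases. The sketch below only computes how many controls to recruit per age band; the case ages, the 10-year band width, and the function names are hypothetical choices made for illustration.

```python
from collections import Counter


def age_band(age, width=10):
    """Label the age band containing `age`, e.g. 67 -> '60-69'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def controls_per_band(case_ages, controls_per_case=1):
    """Number of controls to recruit in each age band so that the
    controls' age distribution matches that of the cases."""
    bands = Counter(age_band(age) for age in case_ages)
    return {band: n * controls_per_case for band, n in sorted(bands.items())}


# Hypothetical ages of eight lung cancer cases.
case_ages = [52, 58, 61, 67, 68, 73, 74, 79]
quota = controls_per_band(case_ages, controls_per_case=2)
print(quota)
```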
Reasons for Not Matching
Practical efficiency
• matching can be costly and time-consuming
and is usually not necessary since we can
adjust for the major matching factors (e.g.
age, gender, smoking status) in the analysis
Reasons for Not Matching
Statistical efficiency
• Matching on a weak risk factor (or a non-risk
factor) that is strongly correlated with the
main exposure can dramatically reduce
efficiency
Matching
Only match on risk factors that are:
• Not of intrinsic interest in themselves
(e.g. age)
• Strong risk factors for disease
• Not too difficult to match on
Common misconceptions about case-control studies
• Fundamentally different type of study design
that proceeds from disease to exposure (i.e.
reverse causality)
• Inherently less valid (more biased) than
cohort studies
• Require a rare-disease assumption
• Odds ratio only approximates the rate ratio or
risk ratio (under the rare disease assumption)