(290-297) CTO-1059-Red series-Cancer...qxp
EDUCATIONAL SERIES
22/05/2007
13:24
Página 290
Clin Transl Oncol (2007) 9:290-297
DOI 10.1007/s12094-007-0056-x
Red series*
Cancer epidemiology: study designs and data analysis
N. Malats and G. Castaño-Vinyals
Centre de Recerca en Epidemiologia Ambiental (CREAL). Institut Municipal d’Investigació Mèdica (IMIM). Barcelona, Spain
Abstract Among the scientific interests of cancer epidemiology is the identification of both environmental
and genetic factors associated with cancer development.
Observational designs requiring sophisticated methodology are applied to control for potential confounding factors. The enormous biotechnological potential developed in the last two decades has allowed the integration
of a plethora of new biomarkers in epidemiological
studies to better define the exposure and “neoclassic”
outcomes, as well as incorporating genetic susceptibility
factors in both classical and new epidemiological designs. The integration of scopes, objectives, data and
tools coming from different disciplines also benefits
epidemiology, thus evolving into “systems epidemiology”. In this manuscript, we review the basic concepts of
study designs and data analysis and introduce readers to
the more innovative aspects that are now being applied
in epidemiological studies.
Key words Case-control study • Exposure biomarker •
Disease biomarker • Genetic susceptibility • Data analysis
Malats N, Castaño-Vinyals G (2007) Cancer epidemiology:
study designs and data analysis. Clin Transl Oncol 9:290-297
*Supported by an unrestricted educational grant from
Roche Farma S.A.
N. Malats (✉)
Centre de Recerca en Epidemiologia Ambiental (CREAL)
Institut Municipal d’Investigació Mèdica (IMIM)
Carrer del Dr. Aiguader, 88
E 08003 Barcelona, Spain
E-mail: [email protected]
Introduction
Epidemiology aims at identifying the determinants that
affect health in the population, at assessing their distribution according to specific subgroups of individuals
and at controlling the diseases [1]. Cancer is a major
area of epidemiological interest because of its importance as a public health problem [2]. The scientific focus of cancer epidemiology ranges from aetiological objectives, aimed at identifying both environmental and
genetic factors associated with cancer development, to
more clinical aims that involve both diagnostic and
prognostic aspects. In this paper, we comment only on
the aetiological arena of cancer epidemiology.
The research on cancer epidemiology has allowed
estimation of the percentage of cancer cases attributed
to several environmental exposures and lifestyle factors
– attributable risk. For example, tobacco use accounts
for 30% of all cancer cases, unhealthy diet for 10–25%,
obesity for 15%, physical inactivity for 5%, alcohol
consumption for 4% and viruses for 3% [3–5].
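As a rough illustration of how such percentages can be derived, the sketch below uses Levin's formula for the population attributable fraction. The article does not state which formula underlies the cited figures, so this is an assumption, and the prevalence and relative risk values are hypothetical.

```python
def attributable_fraction(prevalence: float, relative_risk: float) -> float:
    """Levin's population attributable fraction for a single exposure.

    prevalence    -- proportion of the population that is exposed (0 to 1)
    relative_risk -- risk in the exposed divided by risk in the unexposed
    """
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs chosen for illustration only (not from the article).
paf = attributable_fraction(prevalence=0.30, relative_risk=2.0)
print(f"Attributable fraction: {paf:.1%}")
```

With these made-up inputs, roughly a quarter of cases would be attributed to the exposure; real estimates depend on the population and on how the relative risk was adjusted.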
Epidemiology deals with people, hence ethical issues often hamper their inclusion in experimental studies. Instead, “observational” designs are commonly applied, the most classical being case-control and cohort
studies. This limitation has forced epidemiology to develop a sophisticated and complex methodology to control for potential confounding factors. This methodology
mainly regards the type of design, selection of subjects
and sample size, collection of information and statistical
analyses.
While what is known as “classical” epidemiology restricts its interest to environmental exposures, “molecular” and “genetic” epidemiology apply a wide variety of
biomarkers to better define both the exposure and “neoclassic” outcomes, as well as the inherited genetic patterns that predispose or protect against the disease or the
variation in its natural history.
The enormous biotechnological potential developed
in the last two decades has allowed the integration of a
plethora of new biomarkers in epidemiological studies,
such as gene expression, genetic polymorphisms and
epigenetic patterns. This fact is provoking, at the same
time, the development of methodological innovations
regarding all aspects of epidemiology: new designs and
information, including the development of strategies to
reduce the very large sets of genetic information, new
statistical analytical tools and methods to adjust for genetic complexity. This is the era of integration of scopes,
objectives, data and tools coming from different disciplines such as molecular and cellular biology, genetics,
toxicology, statistics, bioinformatics and clinical specialities. Epidemiology also benefits from this integrative approach, thus evolving into “systems epidemiology”.
In this manuscript, we review the basic concepts of
study designs and data analysis and introduce readers to
the more innovative aspects that are now being applied
in epidemiological studies.
1) 2 x 2 table (Fig. 1, panel 1)

                Cases    Controls
Exposed         A1       C1
Non-exposed     A0       C0

$$\mathrm{OR}_{crude} = \frac{\dfrac{A_1/(A_1+C_1)}{1 - A_1/(A_1+C_1)}}{\dfrac{A_0/(A_0+C_0)}{1 - A_0/(A_0+C_0)}} = \frac{A_1 \times C_0}{C_1 \times A_0}$$

If confounder is present:

                Cases    Controls
Exposed         A1       C1
Non-exposed     A0       C0

If confounder is absent:

                Cases    Controls
Exposed         A1       C1
Non-exposed     A0       C0

Stratum-specific odds ratios: OR1, OR2
Classical study design and analytical strategies
Research studies can be classified in two main groups:
experimental and observational. The first is mostly a
laboratory-based approach that allows the researcher to
manipulate the conditions of the exposed subjects. On
the other hand, in observational studies the researcher
“just” observes the subjects’ behaviour and exposures
and assesses their association with disease occurrence
[6].
There are different study designs on individuals as
observation units: cohort, case-control, cross-sectional
and case-crossover studies [7]. Probably, the most common design is the case-control study. It consists of identifying diseased – cases – and healthy subjects – controls, collecting past history of exposures, and then
assessing whether there are differences in the distribution of exposures between cases and controls through
association tests that are interpreted as the exposure effect. In addition to an appropriate sample size, the success of this design relies on very careful selection of
controls. The natural history and the relatively low frequency of cancer in the population (12% of all mortality
causes) make it an ideal topic to be studied by case-control design.
The measure of association most commonly used in case-control studies is the odds ratio (OR), defined as the ratio of the odds of disease among the exposed to the odds of disease among the non-exposed. When the disease is rare, the OR approximates the relative risk used in cohort studies (see below). Because case-control studies sample cases and controls rather than the whole population at risk, the measure of risk has to be derived from the available information (see Fig. 1). Without getting into further details of the development of the equation, the OR is the cross-product ratio, (A1×C0)/(A0×C1) in the notation of Fig. 1. The statistical method applied to test whether the observed distribution departs from the expected one is the χ² test of association.
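The cross-product ratio and the χ² test can be sketched in a few lines. This is a minimal illustration that follows the cell labels of Fig. 1 (A1, C1, A0, C0); the counts are hypothetical, and the p-value uses the standard identity for a χ² variable with one degree of freedom rather than a statistical package.

```python
import math


def odds_ratio(a1: int, c1: int, a0: int, c0: int) -> float:
    """Crude OR: exposed cases x unexposed controls over unexposed cases x exposed controls."""
    return (a1 * c0) / (a0 * c1)


def chi_square_2x2(a1: int, c1: int, a0: int, c0: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) and p-value for a 2 x 2 table."""
    n = a1 + c1 + a0 + c0
    numerator = n * (a1 * c0 - a0 * c1) ** 2
    # Product of the four marginal totals: exposed, unexposed, cases, controls.
    denominator = (a1 + c1) * (a0 + c0) * (a1 + a0) * (c1 + c0)
    statistic = numerator / denominator
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: 40 exposed cases, 20 exposed controls,
# 60 unexposed cases, 80 unexposed controls.
print(f"OR = {odds_ratio(40, 20, 60, 80):.3f}")
stat, p = chi_square_2x2(40, 20, 60, 80)
print(f"chi-square = {stat:.3f}, p = {p:.4f}")
```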
2) 2 x 3 table

                Cases, group 1    Cases, group 2    Controls
Exposed         A1                B1                C1
Non-exposed     A0                B0                C0

$$\mathrm{OR}_{group 1} = \frac{A_1 \times C_0}{C_1 \times A_0} \qquad \mathrm{OR}_{group 2} = \frac{B_1 \times C_0}{C_1 \times B_0}$$

3) 2 x 4 table for assessing gene-environment interaction

The 2 x 2 table (model 1) for risk assessment in the context of case-control studies can also be displayed as:

Exp/gene    Cases    Controls    Odds ratio
0           A0       C0          1
1           A1       C1          A1C0 / A0C1

A simple gene-environment interaction model in the context of case-control studies is then displayed in a 2 x 4 table:

Exposure    Susceptibility genotype    Cases    Controls    Odds ratio
0           0                          A00      C00         1
0           1                          A01      C01         Rg = A01C00 / A00C01
1           0                          A10      C10         Re = A10C00 / A00C10
1           1                          A11      C11         Rge = A11C00 / A00C11

4) 2 x 2 table for case-only design

                Cases, group 1    Cases, group 2
Exposed         A1                B1
Non-exposed     A0                B0

$$\mathrm{OR}_{crude} = \frac{A_1 \times B_0}{B_1 \times A_0}$$

Fig. 1 Cross-tabulation of exposure and disease
When the OR only takes into account one exposure,
it is called the crude OR. Confounding by a third factor
–whether an exposure or not– can bias this association.
A confounder is defined as a factor that is causally associated with the outcome; it is also associated with the
exposure (either causally or non-causally) and it is not
an intermediate in the causal pathway between the exposure and the outcome. In the case of suspicion of the
presence of a confounder, this should be checked by
stratifying the association according to the potential
confounder strata, obtaining one OR for each category
in the confounder variable. If the presence of a confounder is verified, an adjusted estimate should be obtained, for example with the Mantel-Haenszel method, yielding an ORadjusted.
For more complex modelling, such as adjusting for
several factors, statistical packages need to be used. For
a binary outcome, such as diseased and non-diseased
subjects, the logistic regression model is the one used to
estimate the association between case-control status and
an exposure. This method allows adjustment for different variables.
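The stratified adjustment described above can be sketched with the Mantel-Haenszel summary odds ratio. The function name and the two strata below are hypothetical; each stratum is written in the (A1, C1, A0, C0) order used in Fig. 1.

```python
def mantel_haenszel_or(strata: list[tuple[int, int, int, int]]) -> float:
    """Mantel-Haenszel summary OR over strata of a potential confounder.

    Each stratum is (a1, c1, a0, c0): exposed cases, exposed controls,
    unexposed cases, unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a1, c1, a0, c0 in strata:
        n = a1 + c1 + a0 + c0
        numerator += a1 * c0 / n
        denominator += a0 * c1 / n
    return numerator / denominator


# Hypothetical data: one 2 x 2 table per confounder stratum.
strata = [(30, 10, 20, 20), (10, 10, 40, 60)]

# Crude OR obtained by collapsing the strata into a single table.
a1, c1, a0, c0 = (sum(cells) for cells in zip(*strata))
crude_or = (a1 * c0) / (a0 * c1)

print(f"crude OR = {crude_or:.3f}")
print(f"Mantel-Haenszel OR = {mantel_haenszel_or(strata):.3f}")
```

In this made-up example the adjusted estimate is lower than the crude one, which is the pattern expected when the stratifying factor confounds the association. Adjusting for several factors at once would normally be done with logistic regression in a statistical package instead.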
Unlike case-control studies, cohort studies recruit healthy subjects, regardless of exposures, and follow them until they develop the disease of interest (i.e., cancer). They assess the association between exposure and disease by providing estimates of disease incidence and by applying survival analysis methods. The estimate of risk computed in cohort studies is named relative risk (RR). The RR is the ratio of the risk of developing the disease among the exposed individuals over the risk among the unexposed. This is, in effect, a ratio of the incidence among the exposed to that among the non-exposed individuals. Important limitations of this design are the very extensive follow-up periods commonly needed for the study to accrue enough cases and sufficient statistical power, and the relatively large number of drop-outs over that time. For many cancers, this requires up to 20 years.
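A minimal sketch of the RR calculation follows, with hypothetical cohort counts. It also computes the odds ratio from the same data to show how closely the two measures agree when the disease is uncommon.

```python
def relative_risk(cases_exposed: int, n_exposed: int,
                  cases_unexposed: int, n_unexposed: int) -> float:
    """Risk (cumulative incidence) in the exposed divided by risk in the unexposed."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


def cohort_odds_ratio(cases_exposed: int, n_exposed: int,
                      cases_unexposed: int, n_unexposed: int) -> float:
    """Odds ratio computed from the same cohort counts, for comparison."""
    odds_exposed = cases_exposed / (n_exposed - cases_exposed)
    odds_unexposed = cases_unexposed / (n_unexposed - cases_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical cohort: 30 cases among 1000 exposed, 10 among 1000 unexposed.
rr = relative_risk(30, 1000, 10, 1000)
or_ = cohort_odds_ratio(30, 1000, 10, 1000)
print(f"RR = {rr:.3f}, OR = {or_:.3f}")
```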
Molecular epidemiology: from exposures to the disease
“Analysis of biomarkers is increasingly being incorporated into cross-sectional, retrospective, prospective, or
nested case-control studies to gain improved resolution
of the risk factors and mechanisms responsible for cancer.” [8]
Carcinogenesis is a multistage process evolving
from environmental carcinogen exposure to the development of preneoplastic and neoplastic phenotypes and to
cancer progression. It is now possible to obtain biochemical and molecular information – biomarkers – of
the intermediate states in this process and integrate it in
large epidemiological studies [9, 10]. The use of biomarkers in epidemiology constitutes so-called molecular epidemiology [11].
In a broader sense, a biomarker is a parameter that
we can measure in humans, in general using non-invasive
techniques, and that is relevant for our object of
study [12]. An already classical scheme has been widely
used to describe the molecular epidemiology path [8,
13–15].
The aims of using biomarkers in epidemiologic studies are to increase the sensitivity and specificity in detecting exposure to carcinogens, to evaluate more precisely the interplay between genetic and environmental
determinants of cancer, to detect earlier the pro-carcinogenic effects of exposures, to characterise the disease
subtypes and to evaluate primary prevention measures
[16]. An additional objective is to consider inherited genetic variants (polymorphisms) as effect modifier factors – factors that modify the effect/association of environmental factors with a disease by either increasing or
decreasing the risk. This phenomenon is called gene–environment interaction. Advances in detailing the complex aetiological picture of cancer demand the application of new designs, such as the case-only, case-case-control, case-parental control and affected sib-pair designs, among others, for the comparison of the frequency of factors among unrelated affected and unaffected individuals. In addition to these new designs, molecular epidemiology studies often demand large sample sizes and innovative methods for biobanking, selection of subjects and statistical analysis. Table 1 lists free online databases and statistical packages.
Biomarkers can be broadly classified into biomarkers of exposure, effect and susceptibility. The first category is subdivided into internal dose biomarkers, which mainly include the chemical compound (or its metabolites) detected in biological media, and biologically effective dose biomarkers, mainly adducts of DNA or proteins [17–19]. Biomarkers of effect comprise early biological effect biomarkers (such as chromosomal aberrations, sister chromatid exchanges, micronuclei and specific gene mutations) as well as subclinical and clinical disease biomarkers and prognostic biomarkers. The first of these groups, early biological effect biomarkers, can be used as exposure markers as well (see Fig. 2, panel B). Markers of susceptibility include a wide category of conditions, both genetic and
non-genetic, that can participate at all the stages of carcinogenesis [13]. There is no perfect classification: recent reviews suggest no longer using this classification
of biomarkers, as one specific group of them, i.e., DNA
adducts, can integrate exposure, effect and susceptibility
[20].
Biomarkers
Once a substance enters the body, it undergoes metabolic transformation. An internal dose biomarker is the
measure of either the parent substance or one of its
metabolites in one of the body fluids, such as urine,
Table 1 Free online interactive links

Resource  URL

Databases
Entrez Gene  http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene
OMIM  http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
National Cancer Institute  http://www.cancer.gov
IARC Handbooks on Cancer Prevention  http://www.iarc.fr/IARCPress/general/prev.pdf
IARC Monographs on the Evaluation of Carcinogenic Risks to Humans  http://monographs.iarc.fr/
National Center for Health Statistics, National Health and Nutrition Survey  http://www.cdc.gov/nchs/nhanes.htm
NCI Early-Detection Research Network  http://www3.cancer.gov/prevention/cbrg/edrn/
US Surgeon General Reports on Smoking and Health  http://profiles.nlm.nih.gov/NN/ListByDate.html

Packages to estimate statistical power and sample size (from Cordell and Clayton [35])
QUANTO  http://hydra.usc.edu/gxe
Genetic power calculator  http://statgen.iop.kcl.ac.uk/gpc/index.html
Stata power and sample size programs  http://cruk.leeds.ac.uk/katie
TDTPOWER71  http://www.uni-bonn.de/~umt70e/soft.htm
TDTASP  http://odin.mdacc.tmc.edu/anonftp/
TDT-PC  http://www.biostat.jhsph.edu/~wmchen/pc.html

Standard statistical packages
Stata  http://www.stata.com/
SAS  http://www.sas.com/
S-Plus  http://www.insightful.com/products/splus/
R  http://www.r-project.org/
blood or saliva [13]. Several biomarkers of exposure to polycyclic aromatic hydrocarbons (PAHs), such as 1-hydroxypyrene, were analysed in children exposed to air pollution [21]; this marker was also used in several environmental and occupational studies of exposure to carcinogens [22, 23].
Internal dose biomarkers usually replace environmental
exposure in classical study designs, both case-control
and cohort studies.
In a subsequent stage in the carcinogenesis process,
biomarkers of biologically effective dose measure the interaction between the carcinogen and cellular substructure molecules such as DNA or proteins [13], DNA
adducts being the most representative type. An adduct
can be defined as the covalent bond between a biological molecule – such as DNA or proteins – and a toxic
substance, either the parent compound or one of its
metabolites. The half-life of these adducts depends on
both the half-life of the biological molecule and the stability of the chemical. The study of adducts can yield
clues about the mechanism of the disease; Peluso et al.
found a positive association between DNA-PAH adducts
and lung cancer [24]. In a meta-analysis of DNA
adducts and risk of cancer [17], an association between
the biomarkers studied and the risk of lung and bladder
cancer was observed, restricted to current smokers.
Again, these types of markers usually replace – or refine – environmental exposures in classical study designs, both case-control and cohort studies.
Early biological effect biomarkers are those alterations that have a biological effect, sometimes irreversible, due to the interaction with a toxic substance.
Their detection depends on the DNA repair capacity and
the cellular turnover. They may be specific and non-specific. The most classical assays that refer to the non-specific markers are chromosomal aberrations, sister chromatid exchanges and micronuclei [25, 26]. Among the
specific marker group, there are those that imply
changes in a key oncogene or tumour suppressor gene,
such as alterations in the Tp53 or ras genes, cases being
subclassified according to these alterations.
As mentioned before, these can be considered as
biomarkers of exposure, effect and even susceptibility
[20]. If they are “used” as effect markers, they are included in case-control studies by splitting the case
group according to the molecular characteristics of cases. So, we would have 2 or more groups of cases and
one control group (Fig. 1.2). Analytically, each one of
the case groups is compared with the control group by
applying the so-called multinomial or polytomous logistic regression. Usually, only two groups of cases are
considered, those with and those without the molecular
characteristic of interest. Statistically speaking, it would
be a case-case-control analysis. There is evidence from
the literature on studies that observe differences between the case subgroups. Porta et al. studied the association between serum concentrations of several
organochlorine compounds, such as PCBs, DDT and
DDE, and K-ras mutations in pancreatic cancer. DDT
and DDE concentrations were significantly higher
among K-ras mutated cases than among K-ras wild-type
Fig. 2 Molecular cancer epidemiology models
cases and controls [27]. Other examples relate N-ras mutations to occupational exposures in patients with acute myeloid leukaemia [28], K-ras mutations to asbestos exposure in lung adenocarcinoma [29], and the K-ras p21 protein to vinyl chloride exposure in angiosarcomas of the liver [30]. A more recent example studied different dietary characteristics, as exposure factors, in relation to colorectal adenomas [31]. Cases
were classified according to the K-ras mutation, and
each group (mutated or wild-type) was compared with
the control group, finding differences in the risk for the
disease depending on the K-ras status.
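The comparison of each case subgroup with the control group can be sketched from the 2 x 3 table of Fig. 1.2. The code below computes only the crude, unadjusted odds ratios; it is not the polytomous logistic regression itself, which would be fitted in a statistical package when adjustment is needed. The counts and the function name are hypothetical.

```python
def case_group_odds_ratios(a1: int, a0: int, b1: int, b0: int,
                           c1: int, c0: int) -> tuple[float, float]:
    """Crude ORs of each case subgroup versus the shared control group.

    a1, a0 -- exposed / non-exposed cases in group 1 (e.g. mutated tumours)
    b1, b0 -- exposed / non-exposed cases in group 2 (e.g. wild-type tumours)
    c1, c0 -- exposed / non-exposed controls
    """
    or_group1 = (a1 * c0) / (c1 * a0)
    or_group2 = (b1 * c0) / (c1 * b0)
    return or_group1, or_group2


# Hypothetical counts for illustration only.
or_mutated, or_wild_type = case_group_odds_ratios(a1=30, a0=20, b1=15, b0=35,
                                                  c1=40, c0=60)
print(f"OR group 1 = {or_mutated:.2f}, OR group 2 = {or_wild_type:.2f}")
```

Odds ratios that differ markedly between the two case groups, as in this made-up example, are what would suggest that the exposure is related to one molecular subtype of the disease and not the other.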
Susceptibility markers provide information on those
physiological statuses, both genetic and non-genetic,
that predispose the individuals to develop the disease of
interest. Among the non-genetic markers, we can find
the nutritional, immunological and disease statuses [13],
as well as global methylation levels [32].
Regarding inherited factors, genetic variants can be
classified according to their penetrance in two groups:
high and low penetrance genetic factors. The latter group comprises most of the inherited variations in a gene sequence that translate into a functional variation (e.g., in gene expression levels or protein function) that does not cause the disease by itself but predisposes to its development. These markers are classified as genetic susceptibility factors. We broadly name these
variants as polymorphisms. Many of them are single nucleotide polymorphisms (SNP), meaning the molecular
change involves only one nucleotide. The number of
SNPs estimated in the genome exceeds 10,000,000 and
they can be transmitted together within the population
(i.e., in linkage disequilibrium) following haplotype
blocks.
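Linkage disequilibrium between two SNPs is often summarised by the squared correlation r² between their alleles. The article does not specify a measure, so the choice of r² here is an assumption, and the haplotype and allele frequencies are hypothetical.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared allelic correlation between two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A and allele B
    p_a  -- frequency of allele A at the first locus
    p_b  -- frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies: r-squared is 0 under independence and 1 when
# one SNP perfectly predicts the other.
print(f"r^2 = {ld_r_squared(p_ab=0.40, p_a=0.50, p_b=0.50):.2f}")
```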
The choice of strategies to identify genes that increase/decrease individual susceptibility to develop cancer depends on the number of genes involved (for a specific site), their frequency, penetrance and interactions.
When low-frequency, dominant alleles of moderate-to-high penetrance are suspected to be involved in the development of a specific phenotype, linkage analysis in extended pedigrees of individuals in whom cancer was diagnosed at an early age or who show strong familial aggregation (e.g., breast and colon cancer) is the proposed approach. In contrast, when multiple and frequent loci are suspected to be involved, following recessive inheritance patterns or more complex patterns resulting from interactions with environmental factors, association (observational) studies are the best option [33].
As described, this type of study can provide sufficient power to distinguish slight variations in disease
risks, being more sensitive than linkage methods when
the genes of interest contribute to disease susceptibility
but are neither necessary nor sufficient to cause disease.
Association studies imply the comparison of the frequency of candidate alleles in candidate genes, or
TagSNPs along the genome (whole genome scan),
among unrelated affected and unaffected individuals.
The alleles analysed may be thought to contribute to the
disease or be in linkage disequilibrium with any such
causative variation. Another advantage of these approaches is that they use the same methodology as epidemiological studies (cohort and case-control design),
although new non-traditional designs have also been
proposed.
Case-control is, again, the most commonly applied
design. It overcomes the assumption of temporality –as
the genetic cause always precedes the effect– and it allows the assessment of a large number of genetic and
environmental factors, as well as their interaction.
Furthermore, this design provides the opportunity to
conduct unique analyses such as the case-control design
with extreme phenotypes. To assess gene×environment
interaction, the 2×2 table expands (2×4 table) to include
an additional column that refers to the distribution of
subjects according to genetic factor (see Fig. 1.3). The first row of the table is taken as the reference category, the ORs in the next two rows refer to the effect of the genetic or the environmental factor alone, and the last row refers to the joint effect of both factors, from which the gene×environment interaction is assessed.
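The 2 x 4 table can be worked through numerically as follows. The cell names mirror Fig. 1.3 (first index exposure, second index genotype); the counts are hypothetical, and the interaction is expressed on the multiplicative scale as the ratio of the joint OR to the product of the two single-factor ORs.

```python
def gene_environment_ors(a00: int, c00: int, a01: int, c01: int,
                         a10: int, c10: int, a11: int, c11: int) -> dict[str, float]:
    """ORs from a 2 x 4 gene-environment table, reference = unexposed non-carriers."""
    rg = (a01 * c00) / (a00 * c01)    # genotype alone
    re = (a10 * c00) / (a00 * c10)    # exposure alone
    rge = (a11 * c00) / (a00 * c11)   # exposure and genotype together
    return {
        "Rg": rg,
        "Re": re,
        "Rge": rge,
        # A value of 1 means the joint effect equals the product Rg x Re.
        "multiplicative_interaction": rge / (rg * re),
    }


# Hypothetical counts (cases, controls) for the four rows of the table.
result = gene_environment_ors(a00=50, c00=100, a01=30, c01=40,
                              a10=40, c10=40, a11=45, c11=15)
for name, value in result.items():
    print(f"{name} = {value:.2f}")
```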
Gene×environment interactions can also be assessed
through a case-only design conducted only with cases
(see Fig 1.4). Instead of controls, this design distributes
exposures between cases with and without the genetic
characteristic [34]. This design does not allow investigators to evaluate the independent effect of the exposure alone or the genotype alone. The interaction assessed is a departure from the multiplicative joint effect, and the design assumes independence between exposure and allele in the source population. The associations observed may be due to linkage disequilibrium between the genetic marker and the true susceptibility
allele at a neighbouring locus.
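A small numerical sketch may help show why the independence assumption matters. With hypothetical counts in which exposure and genotype are independent among controls, the case-only odds ratio of Fig. 1.4 equals the multiplicative interaction that a full case-control analysis would give. The function names are illustrative.

```python
def case_only_or(a1: int, a0: int, b1: int, b0: int) -> float:
    """Case-only OR relating exposure to genotype among cases.

    a1, a0 -- exposed / non-exposed cases with the genetic characteristic
    b1, b0 -- exposed / non-exposed cases without it
    """
    return (a1 * b0) / (b1 * a0)


def multiplicative_interaction(cases: tuple[int, int, int, int],
                               controls: tuple[int, int, int, int]) -> float:
    """Rge / (Rg x Re) from a full 2 x 4 table.

    Cell order in each tuple: (exposure 0 / gene 0, 0 / 1, 1 / 0, 1 / 1).
    """
    a00, a01, a10, a11 = cases
    c00, c01, c10, c11 = controls
    rg = (a01 * c00) / (a00 * c01)
    re = (a10 * c00) / (a00 * c10)
    rge = (a11 * c00) / (a00 * c11)
    return rge / (rg * re)


# Hypothetical data. Among controls, exposure and genotype are independent:
# (20 x 100) / (40 x 50) = 1.
cases = (50, 30, 40, 45)
controls = (100, 50, 40, 20)

co = case_only_or(a1=45, a0=30, b1=40, b0=50)
full = multiplicative_interaction(cases, controls)
print(f"case-only OR = {co:.3f}, case-control interaction = {full:.3f}")
```

If exposure and genotype were associated among controls, the two quantities would differ by exactly that control odds ratio, which is why the case-only estimate is biased when the assumption fails.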
In addition to misclassification of genotypes and phenotypes, and confounding by population structure, lack of statistical power and false-positive results arising by chance from multiple testing are important caveats of these types of design. This is mainly so when gene×environment interactions are being assessed.
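The scale of the multiple-testing problem can be illustrated with a short calculation. The Bonferroni correction used here is one common and conservative choice; the article does not prescribe a specific method, so it is shown as an assumption, and the numbers of tests are hypothetical.

```python
def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


# Hypothetical scenario: 20 candidate SNPs, each tested at the 5% level.
print(f"Chance of at least one false positive: {family_wise_error_rate(0.05, 20):.2f}")

# Hypothetical whole-genome scan of 500,000 markers.
print(f"Bonferroni per-test threshold: {bonferroni_threshold(0.05, 500_000):.1e}")
```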
There are other, yet less popular, designs that are
placed in between familial and population studies and
that are suitable to explore genetic associations under
certain conditions. Among them are the affected sib-pair
and the case-parent triads. The latter consists of assessing the significance of the parental and the case allele’s
distribution through the transmission/disequilibrium
test, conditional logistic regression or log-linear models
[35]. This approach also allows analysis of gene×environment interactions by stratifying the allele distribution
tables according to the presence or the absence of the
environmental exposure. This type of study is less powerful than case-control studies and requires the genotyping of the parents; hence, for most patients with the
common types of cancers, it cannot be applied. On the
other hand, it is robust to genetic stratification and can
estimate maternal and imprinting effects [35].
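For the case-parent triad design, the transmission/disequilibrium test can be sketched in its basic form for a biallelic marker: among heterozygous parents, the number of times the candidate allele is transmitted to the affected child is compared with the number of times it is not. The counts below are hypothetical, and the p-value uses the χ² distribution with one degree of freedom.

```python
import math


def tdt(transmitted: int, not_transmitted: int) -> tuple[float, float]:
    """Transmission/disequilibrium test statistic and p-value.

    transmitted     -- heterozygous parents who passed the candidate allele
                       to the affected child
    not_transmitted -- heterozygous parents who passed the other allele
    """
    statistic = (transmitted - not_transmitted) ** 2 / (transmitted + not_transmitted)
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts from 100 heterozygous parents.
stat, p = tdt(transmitted=60, not_transmitted=40)
print(f"TDT chi-square = {stat:.2f}, p = {p:.4f}")
```

Because the comparison is made within families, the statistic is unaffected by differences in allele frequency between subpopulations, which is the robustness to stratification mentioned above.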
The integration of epidemiology and bioinformatics
in cancer research
If you do not look for the unexpected you will never
find it because it is painful and hard to find.
Heraclitus of Ephesus, 6th–5th centuries B.C.
Among factors hampering the progress in cancer
epidemiological research are the different molecular and
pathological phenotypes of the disease, the complex nature of the yet unknown environmental factors, the complexity of the genetic networks acting at different steps
in carcinogenesis, and the dynamic interaction of causes
acting during the natural history of the disease. We have
discussed how the integration of information from toxicology and molecular biology permits the first two
points to be dealt with. As for the last two, the interaction of epidemiology with bioinformatics should allow
epidemiology to move ahead. The potential of information technology in managing large amounts of information and searching for dynamic interactions between
factors –through what is currently known as systems biology methods– should also be considered in epidemiology, adding a new dimension, that of systems epidemiology.
As with systems biology, epidemiology should consider the integration of large amounts of data from high
throughput sources both to create new hypotheses and to
build disease models. Computational models allowing
the exploration of dynamic data may be a useful tool for
epidemiologists too, as they are proving useful for biologists and bioinformaticians [36].
Epidemiology should move ahead from reductionism to a holistic systems perspective because: (1) there are properties that are only possessed by the system as a whole and not by its individual components; (2) there is an interactive and dynamic stability concerning time, space and context; (3) models are probably non-linear, displaying homeostatic, bistable, oscillatory, and chaotic types of behaviour; and (4) multidimensional analytical methods require computational and mathematical tools. Several initiatives have been reported in the last year on the application of complex system methodology to disentangle the complex associations and interactions driving pathology and drug discovery [37–41].
As Ness et al. recently proposed, dynamic systems models, reflecting that diseases are caused within complex molecular, biological and social systems, with positive and negative feedback, should be incorporated as one component of the epidemiologic toolbox [42].
Acknowledgement This work was partially supported by the Fondo de Investigación Sanitaria, Spain (G03/174, G03/160, C03/09, C03/10, PI051436, PI061614) and Fundació Marató TV3.
References
1. Last JM (2001) A dictionary of epidemiology, 4th Edn. Oxford University Press, Oxford
2. Ferlay J, Bray F, Pisani P, Parkin DM (2004) GLOBOCAN 2002: Cancer Incidence, Mortality and Prevalence Worldwide, Version 1.0. IARC Cancer Base No. 5. IARC Press, Lyon
3. Adami HO, Hunter D, Trichopoulos D (eds) (2002) Textbook of cancer epidemiology. Oxford University Press, New York
4. Colditz GA, Sellers TA, Trapido E (2005) Epidemiology – identifying the causes and preventability of cancer? Nat Rev Cancer 6:75–83
5. Schottenfeld D, Fraumeni J (eds) (2006) Cancer epidemiology and prevention, 3rd Edn. Oxford University Press, New York
6. dos Santos Silva I (1999) Cancer epidemiology: principles and methods. IARC Scientific Publication No. 157. IARC, Lyon
7. Szklo M, Nieto FJ (2000) Epidemiology: beyond the basics. Jones and Bartlett Publishers, Sudbury, MA
8. Perera FP (1996) Molecular epidemiology: insights into cancer susceptibility, risk assessment, and prevention. J Natl Cancer Inst 88:496–509
9. Denissenko MF, Pao A, Tang M, Pfeifer GP (1996) Preferential formation of benzo[a]pyrene adducts at lung cancer mutational hotspots in P53. Science 274:430–432
10. Smith LE, Denissenko MF, Bennett WP et al (2000) Targeting of lung cancer mutational hotspots by polycyclic aromatic hydrocarbons. J Natl Cancer Inst 92:803–811
11. Caporaso N (2000) Molecular epidemiology. In: Gail MH, Benichou J (eds) Encyclopedia of epidemiologic methods. John Wiley & Sons, Ltd., Chichester, West Sussex, England, pp 612–617
12. Collins AR (1998) Molecular epidemiology in cancer research. Mol Aspects Med 19:359–432
13. Toniolo P, Boffetta P, Shuker DEG et al (eds) (1997) Application of biomarkers in cancer epidemiology. IARC Scientific Publications No. 142. IARC, Lyon
14. Schulte PA, Rothman N, Perera FP, Talaska G (1995) Biomarkers of exposure in cancer epidemiology. Epidemiology 6:637–638
15. Rothman N, Stewart WF, Schulte PA (1995) Incorporating biomarkers into cancer epidemiology: a matrix of biomarker and study design categories. Cancer Epidemiol Biomarkers Prev 4:301–311
16. Merlo DF, Sormani MP, Bruzzi P (2006) Molecular epidemiology: new rules for new tools? Mutat Res 600:3–11
17. Veglia F, Matullo G, Vineis P (2003) Bulky DNA adducts and risk of cancer: a meta-analysis. Cancer Epidemiol Biomarkers Prev 12:157–160
18. Vineis P, Kogevinas M, Simonato L et al (2000) Levelling-off of the risk of lung and bladder cancer in heavy smokers: an analysis based on multicentric case-control studies and a metabolic interpretation. Mutat Res 463:103–110
19. Vineis P (2002) DNA adducts and the protective role of fruits and vegetables. IARC Sci Publ 156:469–474
20. Buffler P, Rice J, Bird M, Boffetta P (eds) (2004) Mechanisms of carcinogenesis: contributions of molecular epidemiology. IARC Scientific Publication No. 157. IARC, Lyon
21. Ruchirawat M, Settachan D, Navasumrit P et al (2007) Assessment of potential cancer risk in children exposed to urban air pollution in Bangkok, Thailand. Toxicol Lett 168:200–209
22. Sram RJ, Binkova B (2000) Molecular epidemiology studies on occupational and environmental exposure to mutagens and carcinogens, 1997–1999. Environ Health Perspect 108[Suppl 1]:57–70
23. Castano-Vinyals G, D'Errico A, Malats N, Kogevinas M (2004) Biomarkers of exposure to polycyclic aromatic hydrocarbons from environmental air pollution. Occup Environ Med 61:e12
24. Peluso M, Munnia A, Hoek G et al (2005) DNA adducts and lung cancer risk: a prospective study. Cancer Res 65:8042–8048
25. Hagmar L, Bonassi S, Strömberg U et al (1998) Chromosomal aberrations in lymphocytes predict human cancer: a report from the European Study Group on Cytogenetic Biomarkers and Health (ESCH). Cancer Res 58:4117–4121
26. Bonassi S, Ugolini D, Kirsch-Volders M et al (2005) Human population studies with cytogenetic biomarkers: review of the literature and future prospectives. Environ Mol Mutagen 45:258–270
27. Porta M, Malats N, Jariod M et al (1999) Serum concentrations of organochlorine compounds and K-ras mutations in exocrine pancreatic cancer. PANKRAS II Study Group. Lancet 354:2125–2129
28. Taylor JA, Sandler DP, Bloomfield CD et al (1992) ras oncogene activation and occupational exposures in acute myeloid leukemia. J Natl Cancer Inst 84:1626–1632
29. Husgafvel-Pursiainen K, Hackman P, Ridanpaa M et al (1993) K-ras mutations in human adenocarcinoma of the lung: association with smoking and occupational exposure to asbestos. Int J Cancer 53:250–256
30. De Vivo I, Marion MJ, Smith SJ et al (1994) Mutant c-Ki-ras p21 protein in chemical carcinogenesis in humans exposed to vinyl chloride. Cancer Causes Control 5:273–278
31. Wark PA, Van der KW, Ploemacher J et al (2006) Diet, lifestyle and risk of K-ras mutation-positive and -negative colorectal adenomas. Int J Cancer 119:398–405
32. Feinberg AP, Ohlsson R, Henikoff S (2006) The epigenetic progenitor origin of human cancer. Nat Rev Genet 7:21–33
33. Risch N (2001) The genetic epidemiology of cancer: interpreting family and twin studies and their implications for molecular genetic approaches. Cancer Epidemiol Biomarkers Prev 10:733–741
34. García-Closas M, Malats N, Silverman D et al (2005) NAT2 slow acetylation, GSTM1 null genotype, and risk of bladder cancer: results from the Spanish Bladder Cancer Study and meta-analyses. Lancet 366:649–659
35. Cordell HT, Clayton DG (2005) Genetic association studies. Lancet 366:1121–1131
36. Bosl WJ (2007) Systems biology by the rules: hybrid intelligent systems for pathway modelling and discovery. BMC Syst Biol 1:e13
37. Whitcomb DC, Barmada MM (2007) A systems biology approach to genetic studies of pancreatitis and other complex diseases. Cell Mol Life Sci (in press)
38. Chen JY, Shen C, Yan Z et al (2006) A systems biology case study of ovarian cancer drug resistance. Comput Syst Bioinformatics Conf 389–398
39. Baranzini SE (2006) Systems-based medicine approaches to understand and treat complex diseases. The example of multiple sclerosis. Autoimmunity 39:651–662
40. Wagenmakers AJ, van Riel NA, Frenneaux MP, Stewart PM (2006) Integration of the metabolic and cardiovascular effects of exercise. Essays Biochem 42:193–210
41. Mujagic H (2006) Systems biology: potential to improve decision making in pharmaceutical development. Drug News Perspect 9:575–583
42. Ness RB, Koopman JS, Roberts MS (2007) Causal system modelling in chronic disease epidemiology: a proposal. Ann Epidemiol (in press)