
Clin Transl Oncol (2007) 9:290-297
DOI 10.1007/s12094-007-0056-x

EDUCATIONAL SERIES
Red series*

Cancer epidemiology: study designs and data analysis

N. Malats and G. Castaño-Vinyals
Centre de Recerca en Epidemiologia Ambiental (CREAL). Institut Municipal d’Investigació Mèdica (IMIM). Barcelona, Spain

Abstract
Among the scientific interests of cancer epidemiology is the identification of both environmental and genetic factors associated with cancer development. Observational designs requiring sophisticated methodology are applied to control for potential confounding factors. The enormous biotechnological potential developed in the last two decades has allowed the integration of a plethora of new biomarkers in epidemiological studies to better define the exposure and “neoclassic” outcomes, as well as the incorporation of genetic susceptibility factors in both classical and new epidemiological designs. The integration of scopes, objectives, data and tools coming from different disciplines also benefits epidemiology, which is thus evolving into “systems epidemiology”. In this manuscript, we review the basic concepts of study designs and data analysis and introduce readers to the more innovative aspects that are now being applied in epidemiological studies.

Key words
Case-control study • Exposure biomarker • Disease biomarker • Genetic susceptibility • Data analysis

Malats N, Castaño-Vinyals G (2007) Cancer epidemiology: study designs and data analysis. Clin Transl Oncol 9:290-297

*Supported by an unrestricted educational grant from Roche Farma S.A.

N. Malats (✉)
Centre de Recerca en Epidemiologia Ambiental (CREAL)
Institut Municipal d’Investigació Mèdica (IMIM)
Carrer del Dr. Aiguader, 88
E 08003 Barcelona, Spain
E-mail: [email protected]

Introduction

Epidemiology aims at identifying the determinants that affect health in the population, at assessing their distribution according to specific subgroups of individuals and at controlling disease [1]. Cancer is a major area of epidemiological interest because of its importance as a public health problem [2]. The scientific focus of cancer epidemiology ranges from aetiological objectives, aimed at identifying both environmental and genetic factors associated with cancer development, to more clinical aims that involve both diagnostic and prognostic aspects. In this paper, we comment only on the aetiological arena of cancer epidemiology.

Research on cancer epidemiology has allowed estimation of the percentage of cancer cases attributed to several environmental exposures and lifestyle factors, i.e., the attributable risk. For example, tobacco use accounts for 30% of all cancer cases, unhealthy diet for 10–25%, obesity for 15%, physical inactivity for 5%, alcohol consumption for 4% and viruses for 3% [3–5].

Epidemiology deals with people, hence ethical issues often hamper their inclusion in experimental studies. Instead, “observational” designs are commonly applied, the most classical being case-control and cohort studies. This limitation has forced epidemiology to develop a sophisticated and complex methodology to control for potential confounding factors. This methodology mainly concerns the type of design, the selection of subjects and sample size, the collection of information and the statistical analyses.

While what is known as “classical” epidemiology restricts its interest to environmental exposures, “molecular” and “genetic” epidemiology apply a wide variety of biomarkers to better define both the exposure and “neoclassic” outcomes, as well as the inherited genetic patterns that predispose to or protect against the disease or the variation in its natural history.
The enormous biotechnological potential developed in the last two decades has allowed the integration of a plethora of new biomarkers in epidemiological studies, such as gene expression, genetic polymorphisms and epigenetic patterns. At the same time, this is driving methodological innovation in all aspects of epidemiology: new designs and types of information, including strategies to reduce very large sets of genetic information, new statistical analytical tools and methods to adjust for genetic complexity. This is the era of integration of scopes, objectives, data and tools coming from different disciplines such as molecular and cellular biology, genetics, toxicology, statistics, bioinformatics and clinical specialities. Epidemiology also benefits from this integrative approach, thus evolving into “systems epidemiology”.

In this manuscript, we review the basic concepts of study designs and data analysis and introduce readers to the more innovative aspects that are now being applied in epidemiological studies.

Classical study design and analytical strategies

Research studies can be classified in two main groups: experimental and observational. The first is mostly a laboratory-based approach that allows the researcher to manipulate the conditions of the exposed subjects. In observational studies, on the other hand, the researcher “just” observes the subjects’ behaviour and exposures and assesses their association with disease occurrence [6].

There are different study designs that take individuals as observation units: cohort, case-control, cross-sectional and case-crossover studies [7]. Probably the most common design is the case-control study. It consists of identifying diseased subjects (cases) and healthy subjects (controls), collecting their past history of exposures, and then assessing whether the distribution of exposures differs between cases and controls through association tests that are interpreted as the exposure effect. In addition to an appropriate sample size, the success of this design relies on a very careful selection of controls. The natural history and the relatively low frequency of cancer in the population (12% of all mortality causes) make it an ideal topic to be studied by a case-control design.

The measure of association most commonly used in case-control studies is the odds ratio (OR), defined as the ratio of the odds of disease among the exposed to the odds of disease among the non-exposed. When the disease is rare, this is an approximation of the relative risk used in cohort studies (see below). Because case-control studies sample subjects according to their disease status, the population denominators needed to compute incidence are lacking, and the measure of risk has to be derived from the available information (see Fig. 1). Without getting into further details of the development of the equation, the OR is the cross-product ratio, (A1×C0)/(A0×C1) in the notation of Fig. 1. The statistical method applied to test whether the observed distribution departs from the expected one is the χ2 test of association.

Fig. 1 Cross-tabulation of exposure and disease

1) 2 x 2 table

| | Cases | Controls |
|---|---|---|
| Exposed | A1 | C1 |
| Non-exposed | A0 | C0 |

$$\mathrm{OR}_{crude} = \frac{\dfrac{A_1/(A_1+C_1)}{1 - A_1/(A_1+C_1)}}{\dfrac{A_0/(A_0+C_0)}{1 - A_0/(A_0+C_0)}} = \frac{A_1 \times C_0}{C_1 \times A_0}$$

The same 2 x 2 layout (Exposed/Non-exposed by Cases/Controls, with cells A1, C1, A0, C0) is repeated within each stratum of a potential confounder (confounder present, confounder absent), giving the stratum-specific odds ratios OR1 and OR2.

2) 2 x 3 table

| | Cases – group 1 | Cases – group 2 | Controls |
|---|---|---|---|
| Exposed | A1 | B1 | C1 |
| Non-exposed | A0 | B0 | C0 |

$$\mathrm{OR}_{group1} = \frac{A_1 \times C_0}{C_1 \times A_0} \qquad \mathrm{OR}_{group2} = \frac{B_1 \times C_0}{C_1 \times B_0}$$

3) 2 x 4 table for gene-environment assessment

The 2 x 2 table (model 1) for risk assessment in the context of case-control studies can also be displayed as:

| Exp/gene | Cases | Controls | Odds ratio |
|---|---|---|---|
| 0 | A0 | C0 | 1 |
| 1 | A1 | C1 | A1C0 / A0C1 |

A simple gene-environment interaction model in the context of case-control studies is then displayed in a 2 x 4 table:

| Exposure | Susceptibility genotype | Cases | Controls | Odds ratio |
|---|---|---|---|---|
| 0 | 0 | A00 | C00 | 1 |
| 0 | 1 | A01 | C01 | Rg = A01C00 / A00C01 |
| 1 | 0 | A10 | C10 | Re = A10C00 / A00C10 |
| 1 | 1 | A11 | C11 | Rge = A11C00 / A00C11 |

4) 2 x 2 table for case-only design

| | Cases – group 1 | Cases – group 2 |
|---|---|---|
| Exposed | A1 | B1 |
| Non-exposed | A0 | B0 |

$$\mathrm{OR}_{crude} = \frac{A_1 \times B_0}{B_1 \times A_0}$$

When the OR takes into account only one exposure, it is called the crude OR. Confounding by a third factor, whether an exposure or not, can bias this association. A confounder is defined as a factor that is causally associated with the outcome, that is also associated with the exposure (either causally or non-causally) and that is not an intermediate in the causal pathway between the exposure and the outcome. When a confounder is suspected, its presence should be checked by stratifying the association according to the strata of the potential confounder, obtaining one OR for each category of the confounder variable. If the presence of a confounder is verified, an adjusted model should be used, for example with the Mantel-Haenszel method, yielding an adjusted OR (ORadjusted). For more complex modelling, such as adjusting for several factors, statistical packages need to be used. For a binary outcome, such as diseased versus non-diseased subjects, logistic regression is the model used to estimate the association between case-control status and an exposure. This method allows adjustment for several variables at once.
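The crude and adjusted odds ratios described above can be illustrated with a short Python sketch. It uses only the standard library and the cell names of Fig. 1; the class and function names, and all counts, are hypothetical and chosen purely for illustration. The χ2 statistic is the uncorrected Pearson form, and the adjusted estimate uses the standard Mantel-Haenszel summary formula, which is assumed here rather than taken from the text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Table2x2:
    """Counts as in Fig. 1: A = cases, C = controls; 1 = exposed, 0 = non-exposed."""

    a1: float  # exposed cases
    c1: float  # exposed controls
    a0: float  # non-exposed cases
    c0: float  # non-exposed controls

    @property
    def total(self) -> float:
        return self.a1 + self.c1 + self.a0 + self.c0


def crude_odds_ratio(t: Table2x2) -> float:
    """Cross-product ratio (A1 x C0) / (A0 x C1)."""
    return (t.a1 * t.c0) / (t.a0 * t.c1)


def pearson_chi2(t: Table2x2) -> float:
    """Pearson chi-square statistic for a 2 x 2 table (1 df, no continuity correction)."""
    num = t.total * (t.a1 * t.c0 - t.a0 * t.c1) ** 2
    den = (t.a1 + t.c1) * (t.a0 + t.c0) * (t.a1 + t.a0) * (t.c1 + t.c0)
    return num / den


def mantel_haenszel_or(strata: list[Table2x2]) -> float:
    """Mantel-Haenszel summary odds ratio across the strata of a confounder."""
    num = sum(s.a1 * s.c0 / s.total for s in strata)
    den = sum(s.a0 * s.c1 / s.total for s in strata)
    return num / den


# Hypothetical counts: within each stratum the OR is 1, but pooling the strata
# produces a spurious association because the confounder is related to both
# exposure and disease.
stratum_present = Table2x2(a1=40, c1=10, a0=20, c0=5)
stratum_absent = Table2x2(a1=10, c1=20, a0=20, c0=40)
pooled = Table2x2(a1=50, c1=30, a0=40, c0=45)

print("crude OR:", crude_odds_ratio(pooled))
print("chi-square:", pearson_chi2(pooled))
print("Mantel-Haenszel OR:", mantel_haenszel_or([stratum_present, stratum_absent]))
```

With these invented counts the crude OR is about 1.9, whereas the Mantel-Haenszel OR is 1.0, which is the pattern expected when an apparent association is entirely due to confounding. In practice, a statistical package (see Table 1) would also supply confidence intervals and logistic regression for multi-factor adjustment.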
Unlike case-control studies, cohort studies recruit healthy subjects, regardless of their exposures, and follow them until they develop the disease of interest (i.e., cancer). They assess the association between exposure and disease by providing estimates of disease incidence and by applying survival analysis methods. The estimate of risk computed in cohort studies is the relative risk (RR), i.e., the ratio of the risk of developing the disease among the exposed individuals over the risk among the unexposed. This is actually a ratio of incidence rates among the exposed and non-exposed individuals. An important limitation of this design is that very extensive follow-up periods are commonly needed, with a relatively large number of drop-outs, before the study accrues enough cases and sufficient statistical power. For many cancers, this requires up to 20 years.

Molecular epidemiology: from exposures to the disease

“Analysis of biomarkers is increasingly being incorporated into cross-sectional, retrospective, prospective, or nested case-control studies to gain improved resolution of the risk factors and mechanisms responsible for cancer.” [8]

Carcinogenesis is a multistage process evolving from environmental carcinogen exposure to the development of preneoplastic and neoplastic phenotypes and to cancer progression. It is now possible to obtain biochemical and molecular information (biomarkers) on the intermediate states of this process and to integrate it in large epidemiological studies [9, 10]. The use of biomarkers in epidemiology constitutes so-called molecular epidemiology [11]. In a broader sense, a biomarker is a parameter that we can measure in humans, in general using non-invasive techniques, and that is relevant for our object of study [12]. An already classical scheme has been widely used to describe the molecular epidemiology path [8, 13–15].
The aims of using biomarkers in epidemiologic studies are to increase the sensitivity and specificity in detecting exposure to carcinogens, to evaluate more precisely the interplay between genetic and environmental determinants of cancer, to detect earlier the pro-carcinogenic effects of exposures, to characterise the disease subtypes and to evaluate primary prevention measures [16]. An additional objective is to consider inherited genetic variants (polymorphisms) as effect modifiers, i.e., factors that modify the effect/association of environmental factors with a disease by either increasing or decreasing the risk. This phenomenon is called gene–environment interaction.

The advances in detailing the complex aetiological picture of cancer demand the application of new designs, such as the case-only, case-case-control, case-parental control and affected sib-pair designs, among others, for the comparison of the frequency of factors among unrelated affected and unaffected individuals. In addition to these new designs, molecular epidemiology studies often demand large sample sizes and innovative methods for biobanking, selection of subjects and statistical analyses. Table 1 displays free online interactive links to databases and statistical packages.

Biomarkers can be broadly classified into biomarkers of exposure, effect and susceptibility. The first category is subdivided into internal dose biomarkers, which mainly include the chemical compound (or its metabolites) detected in biological media, and biologically effective dose biomarkers, mainly adducts of DNA or proteins [17–19]. Biomarkers of effect comprise early biological effect biomarkers (such as chromosomal aberrations, sister chromatid exchanges, micronuclei and specific gene mutations) as well as subclinical and clinical disease biomarkers and prognostic biomarkers. The first of these groups, early biological effect biomarkers, can be used as exposure markers as well (see Fig. 2, panel B).
Markers of susceptibility include a wide category of conditions, both genetic and non-genetic, that can participate at all stages of carcinogenesis [13]. There is no perfect classification: recent reviews suggest abandoning this classification of biomarkers, as one specific group of them, namely DNA adducts, can integrate exposure, effect and susceptibility [20].

Biomarkers

Once a substance enters the body, it undergoes metabolic transformation. An internal dose biomarker is the measure of either the parent substance or one of its metabolites in one of the body fluids, such as urine, blood or saliva [13]. Several biomarkers of exposure to PAHs, such as 1-hydroxypyrene, were analysed in children exposed to air pollution [21]; this marker was also used in several environmental and occupational studies of exposure to carcinogens [22, 23]. Internal dose biomarkers usually replace environmental exposure in classical study designs, both case-control and cohort studies.

Table 1 Free online interactive links

Databases

| Resource | Link |
|---|---|
| Entrez Gene | http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene |
| OMIM | http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM |
| National Cancer Institute | http://www.cancer.gov |
| IARC Handbooks on Cancer Prevention | http://www.iarc.fr/IARCPress/general/prev.pdf |
| IARC Monographs on the Evaluation of Carcinogenic Risks to Humans | http://monographs.iarc.fr/ |
| National Center for Health Statistics, National Health and Nutrition Survey | http://www.cdc.gov/nchs/nhanes.htm |
| NCI Early-Detection Research Network | http://www3.cancer.gov/prevention/cbrg/edrn/ |
| US Surgeon General Reports on Smoking and Health | http://profiles.nlm.nih.gov/NN/ListByDate.html |

Packages to estimate statistical power and sample size (from Cordell and Clayton [35])

| Package | Link |
|---|---|
| QUANTO | http://hydra.usc.edu/gxe |
| Genetic power calculator | http://statgen.iop.kcl.ac.uk/gpc/index.html |
| Stata power and sample size programs | http://cruk.leeds.ac.uk/katie |
| TDTPOWER71 | http://www.uni-bonn.de/~umt70e/soft.htm |
| TDTASP | http://odin.mdacc.tmc.edu/anonftp/ |
| TDT-PC | http://www.biostat.jhsph.edu/~wmchen/pc.html |

Standard statistical packages

| Package | Link |
|---|---|
| Stata | http://www.stata.com/ |
| SAS | http://www.sas.com/ |
| S-Plus | http://www.insightful.com/products/splus/ |
| R | http://www.r-project.org/ |

In a subsequent stage of the carcinogenesis process, biomarkers of biologically effective dose measure the interaction between the carcinogen and cellular substructure molecules such as DNA or proteins [13], DNA adducts being the most representative type. An adduct can be defined as the product of a covalent bond between a biological molecule, such as DNA or a protein, and a toxic substance, either the parent compound or one of its metabolites. The half-life of these adducts depends on both the half-life of the biological molecule and the stability of the chemical. The study of adducts can yield clues about the mechanism of the disease; Peluso et al. found a positive association between DNA-PAH adducts and lung cancer [24]. In a meta-analysis of DNA adducts and risk of cancer [17], an association between the biomarkers studied and the risk of lung and bladder cancer was observed, restricted to current smokers. Again, these types of markers usually replace, or refine, environmental exposures in classical study designs, both case-control and cohort studies.

Early biological effect biomarkers are those alterations that have a biological effect, sometimes irreversible, due to the interaction with a toxic substance. Their detection depends on the DNA repair capacity and the cellular turnover. They may be specific or non-specific. The most classical assays for non-specific markers are chromosomal aberrations, sister chromatid exchanges and micronuclei [25, 26].
Among the specific markers are those that imply changes in a key oncogene or tumour suppressor gene, such as alterations in the TP53 or ras genes, cases being subclassified according to these alterations. As mentioned before, these can be considered as biomarkers of exposure, effect and even susceptibility [20]. If they are “used” as effect markers, they are included in case-control studies by splitting the case group according to the molecular characteristics of the cases. We would thus have two or more groups of cases and one control group (Fig. 1.2). Analytically, each of the case groups is compared with the control group by applying so-called multinomial or polytomous logistic regression. Usually, only two groups of cases are considered, those with and those without the molecular characteristic of interest. Statistically speaking, this is a case-case-control analysis.

There is evidence in the literature from studies that observe differences between the case subgroups. Porta et al. studied the association between serum concentrations of several organochlorine compounds, such as PCBs, DDT and DDE, and K-ras mutations in pancreatic cancer. DDT and DDE concentrations were significantly higher among K-ras mutated cases than among K-ras wild-type cases and controls [27]. Other examples relate N-ras mutations to occupational exposures in patients with acute myeloid leukaemia [28], K-ras mutations to asbestos exposure in lung adenocarcinoma [29], and K-ras p21 protein to vinyl chloride exposure in angiosarcomas of the liver [30]. A more recent example studied different dietary characteristics, as exposure factors, in relation to colorectal adenomas [31]. Cases were classified according to the K-ras mutation, and each group (mutated or wild-type) was compared with the control group, finding differences in the risk for the disease depending on the K-ras status.

Fig. 2 Molecular cancer epidemiology models

Susceptibility markers provide information on those physiological statuses, both genetic and non-genetic, that predispose individuals to develop the disease of interest. Among the non-genetic markers, we find the nutritional, immunological and disease statuses [13], as well as global methylation levels [32]. Regarding inherited factors, genetic variants can be classified according to their penetrance in two groups: high and low penetrance genetic factors. The latter group comprises most of the inherited variations in a gene sequence that translate into a functional variation (e.g., in gene expression levels or protein function, among others) and that do not cause the disease by themselves but predispose to its development. These markers are classified as genetic susceptibility factors. We broadly name these variants polymorphisms. Many of them are single nucleotide polymorphisms (SNPs), meaning that the molecular change involves only one nucleotide. The number of SNPs estimated in the genome exceeds 10,000,000 and they can be transmitted together within the population (i.e., in linkage disequilibrium) following haplotype blocks.

The choice of strategy to identify genes that increase or decrease individual susceptibility to develop cancer depends on the number of genes involved (for a specific site), their frequency, their penetrance and their interactions. When dominant, low-frequency alleles of moderate to high penetrance are suspected to be involved in the development of a specific phenotype, linkage analysis is proposed, applied to extended “pedigrees” of individuals in whom cancer was diagnosed at an early age or with strong familial aggregation (e.g., breast and colon cancer).
On the contrary, when it is suspected that multiple and frequent loci are involved, following recessive inheritance patterns or more complex patterns resulting from interactions with environmental factors, association/observational studies are the best option [33]. As described, this type of study can provide sufficient power to distinguish slight variations in disease risk, being more sensitive than linkage methods when the genes of interest contribute to disease susceptibility but are neither necessary nor sufficient to cause disease. Association studies imply the comparison of the frequency of candidate alleles in candidate genes, or of TagSNPs along the genome (whole genome scan), among unrelated affected and unaffected individuals. The alleles analysed may be thought to contribute to the disease or to be in linkage disequilibrium with any such causative variation. Another advantage of these approaches is that they use the same methodology as epidemiological studies (cohort and case-control designs), although new non-traditional designs have also been proposed.

Case-control is, again, the most commonly applied design. It overcomes the problem of temporality, as the genetic cause always precedes the effect, and it allows the assessment of a large number of genetic and environmental factors, as well as their interaction. Furthermore, this design provides the opportunity to conduct unique analyses such as the case-control design with extreme phenotypes. To assess gene×environment interaction, the 2×2 table expands into a 2×4 table that includes an additional column for the distribution of subjects according to the genetic factor (see Fig. 1.3). The first row of the table is taken as the reference category, the ORs assessed in the next two rows refer to the effect of the genetic or the environmental factor alone, and the last row refers to the joint effect (gene×environment interaction).
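The 2×4 layout of Fig. 1.3 can be sketched in Python as follows. The function and variable names and all counts are hypothetical. The last quantity returned, Rge / (Rg × Re), is a commonly used summary of departure from a multiplicative joint effect; it is an assumption added here for illustration and is not defined in the figure.

```python
def odds_ratio(cases: float, controls: float, ref_cases: float, ref_controls: float) -> float:
    """OR of one row of the 2 x 4 table against the reference row (A00, C00)."""
    return (cases * ref_controls) / (ref_cases * controls)


def gene_environment_ors(cases: dict, controls: dict) -> dict:
    """Compute Rg, Re and Rge as laid out in Fig. 1.3.

    Both dictionaries are keyed by (exposure, genotype), each coded 0 or 1.
    """
    a00, c00 = cases[(0, 0)], controls[(0, 0)]
    r_g = odds_ratio(cases[(0, 1)], controls[(0, 1)], a00, c00)   # genotype alone
    r_e = odds_ratio(cases[(1, 0)], controls[(1, 0)], a00, c00)   # exposure alone
    r_ge = odds_ratio(cases[(1, 1)], controls[(1, 1)], a00, c00)  # joint effect
    return {
        "Rg": r_g,
        "Re": r_e,
        "Rge": r_ge,
        # Assumed summary measure: a value of 1 means the joint effect equals
        # the product of the two separate effects (no multiplicative interaction).
        "multiplicative_ratio": r_ge / (r_g * r_e),
    }


# Hypothetical counts, keyed by (exposure, genotype).
example_cases = {(0, 0): 100, (0, 1): 60, (1, 0): 80, (1, 1): 120}
example_controls = {(0, 0): 200, (0, 1): 100, (1, 0): 100, (1, 1): 50}

result = gene_environment_ors(example_cases, example_controls)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

For these invented counts, Rg = 1.2, Re = 1.6 and Rge = 4.8, so the joint effect is 2.5 times larger than the product of the separate effects. A real analysis would estimate the same quantities, with confidence intervals and adjustment for other factors, through logistic regression with an interaction term.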
Gene×environment interactions can also be assessed through a case-only design, conducted only with cases (see Fig. 1.4). Instead of using controls, this design compares the distribution of exposures between cases with and without the genetic characteristic [34]. It does not allow investigators to evaluate the independent effect of the exposure alone or of the genotype alone. The interaction assessed is a departure from a multiplicative joint effect, and the design assumes independence between exposure and allele.

The associations observed in these studies may be due to linkage disequilibrium between the genetic marker and the true susceptibility allele at a neighbouring locus. In addition to misclassification of genotypes and phenotypes and to confounding by population structure, lack of statistical power and false-positive results due to chance because of multiple testing are important caveats of this type of design. This is mainly so when gene×environment interactions are being assessed.

There are other, less popular, designs that sit between familial and population studies and that are suitable for exploring genetic associations under certain conditions. Among them are the affected sib-pair and the case-parent triad designs. The latter consists of assessing the significance of the distribution of the parental and case alleles through the transmission/disequilibrium test, conditional logistic regression or log-linear models [35]. This approach also allows analysis of gene×environment interactions by stratifying the allele distribution tables according to the presence or absence of the environmental exposure. This type of study is less powerful than case-control studies and requires the genotyping of the parents; hence, for most patients with the common types of cancer, it cannot be applied. On the other hand, it is robust to genetic stratification and can estimate maternal and imprinting effects [35].
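As a rough illustration of the transmission/disequilibrium test mentioned above, the following Python sketch uses the usual McNemar-type form of the statistic, (b − c)² / (b + c), where b and c are the numbers of heterozygous parents who did and did not transmit the allele of interest to the affected child. This formula and the 1-df χ2 reference distribution are assumed from the standard description of the test; the function name and the counts are hypothetical.

```python
import math


def tdt(transmitted: int, not_transmitted: int) -> tuple[float, float]:
    """Transmission/disequilibrium test from counts in heterozygous parents.

    Returns the chi-square statistic (1 df) and its p-value.
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    statistic = (transmitted - not_transmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts for all triads together.
overall_stat, overall_p = tdt(transmitted=60, not_transmitted=40)
print(f"all triads: chi2 = {overall_stat:.2f}, p = {overall_p:.4f}")

# Stratifying by the environmental exposure of the case, as the text suggests,
# simply repeats the test within each stratum (counts again invented).
for label, (b, c) in {"exposed": (45, 20), "non-exposed": (15, 20)}.items():
    stat, p = tdt(b, c)
    print(f"{label}: chi2 = {stat:.2f}, p = {p:.4f}")
```

Comparing the stratum-specific results gives only an informal impression of gene×environment interaction; a formal test would rely on the conditional logistic or log-linear models cited in the text, which are available in the packages listed in Table 1.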
The integration of epidemiology and bioinformatics in cancer research

If you do not look for the unexpected you will never find it because it is painful and hard to find.
Heraclitus of Ephesus, 6th–5th centuries B.C.

Among the factors hampering progress in cancer epidemiological research are the different molecular and pathological phenotypes of the disease, the complex nature of the yet unknown environmental factors, the complexity of the genetic networks acting at different steps in carcinogenesis, and the dynamic interaction of causes acting during the natural history of the disease. We have discussed how the integration of information from toxicology and molecular biology permits the first two points to be dealt with. As for the last two, the interaction of epidemiology with bioinformatics should allow epidemiology to move ahead.

The potential of information technology in managing large amounts of information and searching for dynamic interactions between factors, through what is currently known as systems biology methods, should also be considered in epidemiology, adding a new dimension, that of systems epidemiology. As with systems biology, epidemiology should consider the integration of large amounts of data from high-throughput sources both to create new hypotheses and to build disease models. Computational models allowing the exploration of dynamic data may be a useful tool for epidemiologists too, as they are proving useful for biologists and bioinformaticians [36].

Epidemiology should move ahead from reductionism to a holistic systems perspective because: (1) there are properties that are only possessed by the system as a whole and not by its individual components; (2) there is an interactive and dynamic stability concerning time, space and context; (3) models are probably non-linear, displaying homeostatic, bistable, oscillatory and chaotic types of behaviour; and (4) multidimensional analytical methods require computational and mathematical tools.

Several initiatives have been reported in the last year on the application of complex system methodology to disentangle the complex associations and interactions driving pathology and drug discovery [37–41]. As Ness et al. recently proposed, dynamic systems models, reflecting that diseases are caused within complex molecular, biological and social systems, with positive and negative feedback, should be incorporated as one component of the epidemiologic toolbox [42].

Acknowledgement

This work was partially supported by the Fondo de Investigación Sanitaria, Spain (G03/174, G03/160, C03/09, C03/10, PI051436, PI061614) and Fundació Marató TV3.

References

1. Last JM (2001) A dictionary of epidemiology, 4th Edn. Oxford University Press, Oxford
2. Ferlay J, Bray F, Pisani P, Parkin DM (2004) GLOBOCAN 2002: Cancer Incidence, Mortality and Prevalence Worldwide, Version 1.0. IARC Cancer Base No. 5. IARC Press, Lyon
3. Adami HO, Hunter D, Trichopoulos D (eds) (2002) Textbook of cancer epidemiology. Oxford University Press, New York
4. Colditz GA, Sellers TA, Trapido E (2005) Epidemiology – identifying the causes and preventability of cancer? Nat Rev Cancer 6:75–83
5. Schottenfeld D, Fraumeni J (eds) (2006) Cancer epidemiology and prevention, 3rd Edn. Oxford University Press, New York
6. dos Santos Silva I (1999) Cancer epidemiology: principles and methods. IARC Scientific Publication No. 157. IARC, Lyon
7. Szklo M, Nieto FJ (2000) Epidemiology: beyond the basics. Jones and Bartlett Publishers, Sudbury, MA
8. Perera FP (1996) Molecular epidemiology: insights into cancer susceptibility, risk assessment, and prevention. J Natl Cancer Inst 88:496–509
9. Denissenko MF, Pao A, Tang M, Pfeifer GP (1996) Preferential formation of benzo[a]pyrene adducts at lung cancer mutational hotspots in P53. Science 274:430–432
10. Smith LE, Denissenko MF, Bennett WP et al (2000) Targeting of lung cancer mutational hotspots by polycyclic aromatic hydrocarbons. J Natl Cancer Inst 92:803–811
11. Caporaso N (2000) Molecular epidemiology. In: Gail MH, Benichou J (eds) Encyclopedia of epidemiologic methods. John Wiley & Sons, Ltd., Chichester, West Sussex, England, pp 612–617
12. Collins AR (1998) Molecular epidemiology in cancer research. Mol Aspects Med 19:359–432
13. Toniolo P, Boffetta P, Shuker DEG et al (eds) (1997) Application of biomarkers in cancer epidemiology. IARC Scientific Publications No. 142. IARC, Lyon
14. Schulte PA, Rothman N, Perera FP, Talaska G (1995) Biomarkers of exposure in cancer epidemiology. Epidemiology 6:637–638
15. Rothman N, Stewart WF, Schulte PA (1995) Incorporating biomarkers into cancer epidemiology: a matrix of biomarker and study design categories. Cancer Epidemiol Biomarkers Prev 4:301–311
16. Merlo DF, Sormani MP, Bruzzi P (2006) Molecular epidemiology: new rules for new tools? Mutat Res 600:3–11
17. Veglia F, Matullo G, Vineis P (2003) Bulky DNA adducts and risk of cancer: a meta-analysis. Cancer Epidemiol Biomarkers Prev 12:157–160
18. Vineis P, Kogevinas M, Simonato L et al (2000) Levelling-off of the risk of lung and bladder cancer in heavy smokers: an analysis based on multicentric case-control studies and a metabolic interpretation. Mutat Res 463:103–110
19. Vineis P (2002) DNA adducts and the protective role of fruits and vegetables. IARC Sci Publ 156:469–474
20. Buffler P, Rice J, Bird M, Boffetta P (eds) (2004) Mechanisms of carcinogenesis: contributions of molecular epidemiology. IARC Scientific Publication No. 157. IARC, Lyon
21. Ruchirawat M, Settachan D, Navasumrit P et al (2007) Assessment of potential cancer risk in children exposed to urban air pollution in Bangkok, Thailand. Toxicol Lett 168:200–209
22. Sram RJ, Binkova B (2000) Molecular epidemiology studies on occupational and environmental exposure to mutagens and carcinogens, 1997–1999. Environ Health Perspect 108[Suppl 1]:57–70
23. Castano-Vinyals G, D'Errico A, Malats N, Kogevinas M (2004) Biomarkers of exposure to polycyclic aromatic hydrocarbons from environmental air pollution. Occup Environ Med 61:e12
24. Peluso M, Munnia A, Hoek G et al (2005) DNA adducts and lung cancer risk: a prospective study. Cancer Res 65:8042–8048
25. Hagmar L, Bonassi S, Strömberg U et al (1998) Chromosomal aberrations in lymphocytes predict human cancer: a report from the European Study Group on Cytogenetic Biomarkers and Health (ESCH). Cancer Res 58:4117–4121
26. Bonassi S, Ugolini D, Kirsch-Volders M et al (2005) Human population studies with cytogenetic biomarkers: review of the literature and future prospectives. Environ Mol Mutagen 45:258–270
27. Porta M, Malats N, Jariod M et al (1999) Serum concentrations of organochlorine compounds and K-ras mutations in exocrine pancreatic cancer. PANKRAS II Study Group. Lancet 354:2125–2129
28. Taylor JA, Sandler DP, Bloomfield CD et al (1992) ras oncogene activation and occupational exposures in acute myeloid leukemia. J Natl Cancer Inst 84:1626–1632
29. Husgafvel-Pursiainen K, Hackman P, Ridanpaa M et al (1993) K-ras mutations in human adenocarcinoma of the lung: association with smoking and occupational exposure to asbestos. Int J Cancer 53:250–256
30. De Vivo I, Marion MJ, Smith SJ et al (1994) Mutant c-Ki-ras p21 protein in chemical carcinogenesis in humans exposed to vinyl chloride. Cancer Causes Control 5:273–278
31. Wark PA, Van der KW, Ploemacher J et al (2006) Diet, lifestyle and risk of K-ras mutation-positive and -negative colorectal adenomas. Int J Cancer 119:398–405
32. Feinberg AP, Ohlsson R, Henikoff S (2006) The epigenetic progenitor origin of human cancer. Nat Rev Genet 7:21–33
33. Risch N (2001) The genetic epidemiology of cancer: interpreting family and twin studies and their implications for molecular genetic approaches. Cancer Epidemiol Biomarkers Prev 10:733–741
34. García-Closas M, Malats N, Silverman D et al (2005) NAT2 slow acetylation, GSTM1 null genotype, and risk of bladder cancer: results from the Spanish Bladder Cancer Study and meta-analyses. Lancet 366:649–659
35. Cordell HT, Clayton DG (2005) Genetic association studies. Lancet 366:1121–1131
36. Bosl WJ (2007) Systems biology by the rules: hybrid intelligent systems for pathway modelling and discovery. BMC Syst Biol 1:e13
37. Whitcomb DC, Barmada MM (2007) A systems biology approach to genetic studies of pancreatitis and other complex diseases. Cell Mol Life Sci (in press)
38. Chen JY, Shen C, Yan Z et al (2006) A systems biology case study of ovarian cancer drug resistance. Comput Syst Bioinformatics Conf 389–398
39. Baranzini SE (2006) Systems-based medicine approaches to understand and treat complex diseases. The example of multiple sclerosis. Autoimmunity 39:651–662
40. Wagenmakers AJ, van Riel NA, Frenneaux MP, Stewart PM (2006) Integration of the metabolic and cardiovascular effects of exercise. Essays Biochem 42:193–210
41. Mujagic H (2006) Systems biology: potential to improve decision making in pharmaceutical development. Drug News Perspect 9:575–583
42. Ness RB, Koopman JS, Roberts MS (2007) Causal system modelling in chronic disease epidemiology: a proposal. Ann Epidemiol (in press)
