Sources of variation and co-variation in the population
Jaakko Kaprio, University of Helsinki

Epidemiology examines determinants of disease in relation to place, time and person characteristics such as:
– genes
– behavior
– environment
– developmental stage
[Diagram: Place, Person, Time]

Development of life expectancy (U.S. females in 1990, 1995 and projected)
Olshansky et al., Science 2001; 291:1491

In men, changes in cardiovascular risk factors explain 66% of the changes in mortality from stroke
Vartiainen et al., BMJ 1995; 310:901-4

Genes, developmental history and environment as determinants of health
In complex disease, a person's susceptibility genotype and environmental history combine to establish present health status, and the genotype's norm of reaction determines the future health trajectory.

The post-genomic era
Now that the full human genome sequence has been published, we have unprecedented access to genetic information:
– 3 billion base pairs in the human genome
– c. 20 000 to 30 000 genes
Developments in molecular genetic analysis now make it possible to attempt identification of liability genes in complex, multifactorial traits, and to dissect with new precision the role of genetic predisposition and environment/lifestyle factors in these disorders.
New technologies and statistical tools are continuously introduced.
Nonetheless, quantitative genetic methods provide an overall picture of the role of familial and genetic factors.

Monogenic & complex disorders
The majority of human diseases are complex, i.e. they have multiple genetic and non-genetic causes.
Figure: Peltonen & McKusick, Science 2001

Segregation and linkage
Do diseased family members share alleles at a locus more often than expected?
Are these alleles the same in many families?
Sibpairs or large pedigrees can be studied, depending on the disease or trait in question.

Types of genes
Rare inborn errors of metabolism and other Mendelian gene variants (e.g. familial hypercholesterolemia, FH) have a major impact on individuals and families, but little effect at the population level:
– FH accounts for 1% of serum cholesterol variability in the population
– see e.g. OMIM: http://www.ncbi.nlm.nih.gov/Omim/
However, they continue to account for only a small fraction of all cases.

Characteristics of complex traits
Trait values are determined by complex interactions among numerous metabolic and physiological systems, as well as demographic and lifestyle factors.
Variation in a large number of genes can potentially influence interindividual variation of trait values.
The impact of any one gene is likely to be small to moderate in size.
For diseases: monogenic diseases that mimic complex diseases typically account for a small fraction of disease cases (examples in obesity, hypertension, dyslipidemias).

Susceptibility genes
Susceptibility genes increase disease risk only moderately and are context dependent.
– total heritability of cholesterol levels is typically c. 50%
– Apo E accounts for 5-10% of variability in serum cholesterol in many populations, but the effect of the Apo E4 allele is small in individuals
– presence of Apo E4 moderately increases CHD and AD risk in many populations
For example, the frequency of the Apo E4 allele (associated with CHD and Alzheimer's disease) is highest in nomadic populations [e.g. Pygmies (0.407), Khoi San (0.370), Papuans (0.368), some Native Americans (0.280) and Lapps (0.310)], compared with 0.10 to 0.15 in populations of Mediterranean descent.

Genetic epidemiology and behavior genetics
Strategies for family studies:
– Does disease or behavior aggregate in families?
– What are the causes of familial aggregation?
– What is the model of genetic inheritance and which genes are responsible?
– How do genes interact with the environment?

How to detect genetic effects and genes?
Family studies:
– provide estimates of heritability
– information on mode of inheritance
– adoption and twin studies as special cases
Molecular genetic studies:
– genome-wide association studies & SNP heritability
– linkage in families
– animal studies (e.g. 'knockouts')
– known functional variants

What is heritability?
Heritability is the estimated proportion of the total variance of a trait, or of liability to a disease, that is accounted for by genetic variance, i.e. interindividual genetic differences.
Genetic variance may arise from additive effects, due to different alleles at a locus, or from dominance, the interaction of alleles at the same locus.
Heritability is a characteristic of populations, not of individuals or families, and it is affected by both genetic and environmental effects.

FAMILY STUDY
Provides estimates of the degree of familial aggregation.
Risks to siblings, parents and offspring, as well as to other relatives, can be estimated.
Similarity of different types of relatives can permit modelling of genetic versus non-genetic familial influences.

Obesity in families (Quebec Family Study, 1996)
[Figure: bar chart of BMI correlations (axis 0 to 0.3) for parent-child, sibling and spouse pairs]

Genetic epidemiology
To disentangle genes and experience, we study special family groups:
– either family members sharing experiences but differing in shared genes, e.g. twin studies
– or family members sharing genes but differing in their shared experience, e.g. adoption studies

ADOPTION DESIGN
Test for association between trait in adoptees and trait in biological parents (genetic correlation) & test for association between trait in adoptees and trait in adoptive parents.
STRENGTHS: relatively powerful
WEAKNESSES:
(1) poor generalizability
(2) adoptive parents likely to provide 'good homes'
(3) biological parents of adopted children may have had multiple forms of psychopathology (selection)
(4) poor characterization of phenotypes of biological parents

Adoption studies of obesity (Sörensen et al. 1998)
[Figure: bar chart of BMI correlations (axis 0 to 0.25) of adoptees with biological mother, biological father, biological siblings and adoptive parents]

The Classical Twin Study
Monozygotic (MZ) pairs are genetically alike.
Dizygotic (DZ) pairs, like siblings, share on average half of their segregating genes.
DZ pairs can be same-sex or opposite-sex (male-female).
Increased similarity of twin pairs compared to unrelated subjects suggests familial factors.
Increased similarity of MZ pairs compared to DZ pairs provides evidence for genetic factors.

BMI in 25-year-old female twin pairs (rMZ = 0.78, rDZ = 0.37)
[Figure: scatter plots of BMI in twin 1 against BMI in twin 2, by zygosity (DZ and MZ panels); FinnTwin16 study]

The classical twin study: modelling
Model the contribution of additive (A) and non-additive (D) genetic effects, environmental effects shared by family members (C) and unshared effects (E), i.e. those unique to each family member.
Competing models, e.g. E, AE, ACE, can be statistically compared and tested against actual data.
Mx, a statistical program created by Mike Neale, is the most commonly used in genetic modelling: http://views.vcu.edu/mx/
[Path diagram: latent factors A1, C1, E1 act on Twin 1 and A2, C2, E2 on Twin 2; A1 and A2 correlate 1.0 (MZ) / 0.5 (DZ), C1 and C2 correlate 1.0]

Twin similarity for life span at very old age

Extensions of the classical twin study I
Effect modification by age, sex and environmental factors, e.g. smoking or obesity
Assess genetic covariance over time through longitudinal models
Assess sex effects by comparison of like-sexed and opposite-sex DZ pairs
Assess social interaction effects

Age dependence of genetic effects: CHD in twin brothers

Bivariate analyses indicate the genetic and environmental contributions to the relationship of relative weight at birth and in adolescence (Pietiläinen et al., Obes Res 2002)
[Path diagram: bivariate twin model with A, C and E factors for PI at birth (variable 1) and BMI at 16 y (variable 2); FinnTwin16; m = males, f = females]
PI at birth: a2 m: 0.20, f: 0.47; c2 m: 0.42, f: 0.18; e2 m: 0.39, f: 0.35
BMI at 16 y: a2 m: 0.84, f: 0.90; e2 m: 0.16, f: 0.10
Correlations between the two variables: r m: 0.11, f: 0.09; ra m: 0.21, f: 0.13; re m: 0.16, f: 0.07

Different phenotypes, different effects of genes: smoking
Phenotype: genetic effects / non-genetic family effects
– Experimentation (age 12): 11% / 73%
– Initiation/ever smoker (adolescents): 20-36% / 18-59%
– Initiation/ever smoker (adults): 28-80% / 4-50%
– Persistence/cessation: 58-71% / none
– Nicotine dependence (Fagerström or DSM-IV): 60-72% / none

Models of Gene-Environment Interaction
Purcell, S., Variance components models for gene-environment interaction in twin analysis. Twin Research, 2002. 5: p. 554-571
[Path diagram: A, C and E act on trait T through paths moderated by a measured moderator M: a + βX·M for A, c + βY·M for C and e + βZ·M for E; M also enters the trait mean through βM·M]

Parental Monitoring and Smoking Quantity (Dick et al., J Abn Psych, 2006)
[Figure: standardized variance components a2, c2 and e2 (axis 0 to 1) plotted against parental monitoring score, from low (4) to high (12)]

TWIN DESIGN: Weaknesses
(1) Generalizability: having a same-age sibling? Having a genetically identical same-age sibling?
(2) Relative rarity of twin pairs.
(3) Non-orthogonal design: large sample sizes are needed.
(4) If major environmental risk factors are not assessed, interaction of genetic effects and shared environmental effects will be confounded with genetic effects.
(5) Weak for detecting parent-to-offspring environmental influences.
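As a rough companion to the BMI twin correlations quoted earlier (rMZ = 0.78, rDZ = 0.37, FinnTwin16), the sketch below applies Falconer's back-of-envelope formulas. This is a swapped-in shortcut, not the formal model fitting the slides describe (Mx compares E, AE, ACE and related models against the data). The function name `falconer_estimates` is hypothetical, and the formulas assume equal environments for MZ and DZ pairs, purely additive genetic effects and random mating.

```python
def falconer_estimates(r_mz: float, r_dz: float) -> dict:
    """Crude variance shares from twin correlations (Falconer's formulas).

    Assumptions (not taken from the slides): equal environments for MZ and
    DZ pairs, additive genetic effects only, random mating.
    """
    a2 = 2.0 * (r_mz - r_dz)  # additive genetic share
    c2 = 2.0 * r_dz - r_mz    # shared (family) environment share
    e2 = 1.0 - r_mz           # unique environment, incl. measurement error
    return {"a2": a2, "c2": c2, "e2": e2}


if __name__ == "__main__":
    # Correlations from the BMI slide (25-year-old female pairs).
    estimates = falconer_estimates(r_mz=0.78, r_dz=0.37)
    for name, value in estimates.items():
        print(f"{name} = {value:.2f}")
```

With these inputs the crude shared-environment share comes out slightly negative, because rMZ is a little more than twice rDZ. That pattern is usually read as a hint that a model without C, or one including non-additive genetic effects (D), may describe the data better; only formal model comparison can decide.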
Assumptions of the classical twin study
Equality of environmental variances in MZ and DZ pairs. Differences may arise from:
– placentation and in utero effects
– fetal programming hypothesis implications
– differential parental treatment
– zygosity determination errors
Random mating

Perinatal mortality among twins by zygosity and chorionicity
[Figure: bar chart (axis 0 to 9) of fetal, 1-7 day and perinatal mortality for DZ, MZDC (monozygotic dichorionic) and MZMC (monozygotic monochorionic) twins]

Birthweights of twins
East Flanders Prospective Twin Survey (Loos 1998)
[Figure: % of pairs: DZ 64, MZDC 10, MZMC 26; mean birthweights shown: 2476 g, 2401 g and 2314 g]

FAMILY STUDY
Ultimately, sampling regular families must be a key part of any genetic epidemiologic approach.
* Provides tests of generalizability of findings using more specialized twin-family and adoption designs.
* Allows adequate representation of minority groups. Numbers of minority twin pairs available for study, e.g. Swedish-speaking twin pairs in Finland, are often small.

How to detect genetic effects and genes?
Molecular genetic studies:
– candidate genes, genome-wide scans
– association studies & linkage
– animal studies (e.g. 'knockouts')
Family studies:
– provide estimates of heritability
– information on mode of inheritance
– adoption and twin studies as special cases

Increasing the genetic signal in the data...
Ascertain pedigree units that are likely to segregate genes of relevance
– Ex: pedigrees with quasi-Mendelian disease transmission
– affected sib pair approach of linkage analysis
Ascertain families on the basis of individuals with extreme or remarkable phenotypes
– Ex: extremely discordant sibpairs
– ascertain young individuals with the disease
Ascertain individuals from isolated populations
– more homogeneous genetically and culturally as well
Ascertain intermediate phenotypes
– physiologic phenotype is "closer" to sequence variants

Two basic analysis strategies
1. Candidate gene analysis
   motto: study a few good genes
2. Whole-genome searches (genome scans)
   motto: cast out a net that catches all the big fish

Association studies: case-control design
What is the difference between the genes of cases (e.g. with disease or trait) and controls?
Selection of controls is a major challenge, as in all case-control studies.
High rate of false-positive studies:
– many genes are available for study
– population admixture (a confounding factor)
– publication bias

Candidate gene studies
Statistically straightforward: test the association between genotypes and phenotype with contingency tables, chi-square tests or regression.
Principle: if an allele is more frequent in affected than in unaffected individuals, the gene may be close to a disease gene.
Candidacy of a gene can come from a number of different sources:
– biological insights (e.g. gene expressed in a certain tissue)
– homology to other genes
– functional studies in model organisms
– member of a relevant gene family
Challenge: greater biological understanding of the genes

POPULATION STRATIFICATION
Hypothetical example (by Andrew Heath)

NORTHERN EUROPEAN ANCESTRY (N=200)
                 NOT ROMAN CATHOLIC   ROMAN CATHOLIC   ROW %
NOT A1 allele    162                  18               90%
A1 allele        18                   2                10%
COLUMN %         90%                  10%
NO ASSOCIATION

SOUTHERN EUROPEAN ANCESTRY (N=200)
                 NOT ROMAN CATHOLIC   ROMAN CATHOLIC   ROW %
NOT A1 allele    35                   105              70%
A1 allele        15                   45               30%
COLUMN %         25%                  75%
NO ASSOCIATION

MINGLED IN AUSTRALIAN POPULATION (N=400)
                 NOT ROMAN CATHOLIC   ROMAN CATHOLIC
NOT A1 allele    197                  123
A1 allele        33                   47
OR = 2.28, 95% CI 1.39 - 3.73

We would falsely infer that the A1 allele is a risk factor for Roman Catholicism.
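The hypothetical stratification example above can be checked numerically. The sketch below recomputes the odds ratio within each ancestry group and in the mingled population, using the cell counts from the slide. Two things are assumptions added here and not part of the original: the confidence interval uses the Woolf (log odds ratio) approximation, since the slide does not say how its interval was obtained, so the limits may differ slightly from those quoted; and the Mantel-Haenszel stratified estimate is included only to illustrate how adjusting for ancestry removes the spurious association. All function names are hypothetical.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio (a*d)/(b*c) for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def woolf_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI for the OR from the standard error of log(OR)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def mantel_haenszel_or(strata):
    """Stratum-adjusted OR: sum(a*d/n) / sum(b*c/n) over the strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Cell order: (not A1 & not RC, not A1 & RC, A1 & not RC, A1 & RC)
NORTH = (162, 18, 18, 2)
SOUTH = (35, 105, 15, 45)
POOLED = tuple(n + s for n, s in zip(NORTH, SOUTH))  # mingled population

if __name__ == "__main__":
    print("OR, northern stratum:", odds_ratio(*NORTH))
    print("OR, southern stratum:", odds_ratio(*SOUTH))
    print("OR, pooled (crude):  ", round(odds_ratio(*POOLED), 2))
    low, high = woolf_ci(*POOLED)
    print(f"Woolf 95% CI, pooled: {low:.2f} to {high:.2f}")
    print("Mantel-Haenszel OR:  ", mantel_haenszel_or([NORTH, SOUTH]))
```

The crude pooled odds ratio reproduces the spurious association, while both stratum-specific odds ratios and the stratified estimate equal 1, which is the point of the example: allele frequency and the outcome both differ by ancestry, and pooling confounds the two.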
Genome-wide association studies
Large-scale case-control series
– for example, MI patients and matched controls without MI
Use of very large numbers of SNPs to identify all possible genes associated with the disease
– typically 100,000 to 500,000 SNPs
– different technology platforms (Affymetrix, Illumina)

Gene x Environment Interactions
[Figure: three panels plotting liability to illness against environment (protective to predisposing) for genotypes AA, Aa and aa; Kendler & Eaves, 1986]
Panel titles:
– Genes control susceptibility to environmental pathogenesis
– Genes control degree of sensitivity to environmental influence
– Genes and environment have additive, independent effects

Gene-environment correlations refer to genetic effects on individual differences in liability to exposure to particular environmental circumstances. (The background is the extensive evidence that environmental risk exposure is far from randomly distributed.)
Gene-environment interactions concern genetically influenced individual differences in sensitivity to specific environmental factors. (The background is the extensive evidence of huge individual differences in vulnerability to all manner of environmental hazards.)

Examples of social x biological interactive effects
Biology controls sensitivity to environmental effects
– E.g., family stress x serotonin metabolism => depression and anxiety risk (Caspi, Science 2003)
Social context generates undifferentiated risk; biology constrains pathologic specificity
– E.g., childhood neglect => alcoholism in men, eating disorders in women
Biological susceptibilities are amplified during rapid or intense contextual change
– E.g., biological or gender-based vulnerabilities to depression and alcohol use as indexed by pubertal development
Biology controls liability to experiencing predisposing environments
– E.g., genes for skin color

Integration of information at different levels
Developments in molecular genetics now make it possible to attempt identification of liability genes in complex, multifactorial traits, and to dissect with new precision the role of genetic predisposition and environment/lifestyle factors in these disorders.
But an integrative framework is needed.
Gottesman I, Science 1997
Complex picture

Complexity of Complex Diseases
– Classical polygenic or "threshold" inheritance: a certain number of mutations at different loci must be present before a system is sufficiently challenged to result in disease.
– Locus heterogeneity: defects in any of a number of genes or loci confer disease susceptibility independently of each other.
– Epistasis, or gene interaction: interactive effects of mutations, genotypes, and/or their biologic products.
– Environmental vulnerability: gene products are influenced by environmental stimuli.
– Gene × environment interactions: a gene has a deleterious effect only in the presence of a particular environmental stimulus.
– Time-dependent expression of genes
– General aging of the system

Testing of epidemiological causal hypotheses: use of twins
Differences between MZ co-twins in a pair are due to environmental causes (in the very broadest sense):
– somatic mutations and other genetic changes during development
– prenatal environmental and birth order effects
– differential treatment in childhood
– different exposures (occupational, lifestyle)
Exposure/disease discordant DZ pairs are fully matched on early childhood effects, and partially on genetic factors.
Studies of exposure-discordant twin pairs have increased power compared to unmatched case-control series, depending on the degree of familiality of the exposure.
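One common way to analyse disease-discordant twin pairs, offered here as an illustration and not as the analysis the slides had in mind, is the matched-pair odds ratio with McNemar's test. Only pairs in which the two twins differ in exposure carry information. The counts in the usage example are invented purely for illustration, and the function name `matched_pair_or` is hypothetical.

```python
def matched_pair_or(only_case_exposed: int, only_control_exposed: int):
    """Matched-pair odds ratio and McNemar chi-square (1 df, uncorrected).

    only_case_exposed:    pairs where only the affected twin was exposed
    only_control_exposed: pairs where only the unaffected twin was exposed
    Pairs concordant for exposure do not enter the calculation.
    """
    b, c = only_case_exposed, only_control_exposed
    if c == 0:
        raise ValueError("odds ratio undefined when no control-only pairs exist")
    odds_ratio = b / c
    chi_square = (b - c) ** 2 / (b + c)
    return odds_ratio, chi_square


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not from any study).
    or_estimate, chi_square = matched_pair_or(30, 15)
    print(f"matched-pair OR = {or_estimate:.2f}, McNemar chi-square = {chi_square:.2f}")
```

Because each affected twin is compared with their own co-twin, shared early environment (and, in MZ pairs, genotype) is held constant by design, which is the matching advantage the slide describes.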