Sources of variation and co-variation in
the population
Jaakko Kaprio
University of Helsinki
Epidemiology examines determinants
of disease in relation to place, time
and person characteristics such as:
- genes
- behavior
- environment
- developmental stage
Place
Person
Time
Development of life expectancy
(U.S. females in 1990, 1995 and projected)
Olshansky et al., Science 2001; 291:1491
Among men, changes in cardiovascular risk factors explain
66% of the changes in mortality from stroke
Vartiainen et al. BMJ 1995;310:901-4
Genes, developmental history and
environment as determinants of health
In complex disease a
person's susceptibility
genotype and
environmental history
combine to establish
present health status, and
the genotype's norm of
reaction determines
future health trajectory
The post-genomic era
- Now that the full human genome sequence has been published,
we have access to genetic information in an unprecedented
manner:
– 3 billion base pairs in the human genome
– c. 20,000 to 30,000 genes
- Thus, developments in molecular genetic analysis now make it
possible to attempt identification of liability genes in complex,
multifactorial traits, and to dissect out with new precision the
role of genetic predisposition and environment/lifestyle factors
in these disorders.
- New technologies and statistical tools are continuously being
introduced
- Nonetheless, quantitative genetic methods provide an overall
picture of the role of familial and genetic factors
Monogenic & complex disorders
The majority of human diseases are complex, i.e. they have
multiple genetic and non-genetic causes
Figure: Peltonen & McKusick, Science 2001
Segregation and linkage
Do diseased family members share alleles at a locus more
often than expected?
Are these alleles the same in many families?
Sibpairs or large pedigrees can be studied, depending on the
disease or trait in question
Types of genes
Rare inborn errors of metabolism and other Mendelian
gene variants (e.g. familial hypercholesterolemia) have a
major impact on individuals and families, but little
effect at the population level:
– FH accounts for 1% of serum cholesterol variability in the
population
see e.g. OMIM:
http://www.ncbi.nlm.nih.gov/Omim/
However, they continue to account for only a small
fraction of all cases
Characteristics of complex traits
Trait values are determined by complex interactions among
numerous metabolic and physiological systems, as well as
demographic and lifestyle factors
Variation in a large number of genes can potentially influence
interindividual variation of trait values
The impact of any one gene is likely to be small to moderate
in size
For diseases: monogenic diseases that mimic complex
diseases typically account for a small fraction of disease
cases (examples in obesity, hypertension, dyslipidemias).
Susceptibility genes
- Susceptibility genes increase disease risk only moderately and
are context dependent.
– total heritability of cholesterol levels is typically c. 50%
– Apo E accounts for 5-10% of variability in serum cholesterol in many
populations, but the effect of the Apo E4 allele is small in individuals
– presence of apo E4 moderately increases CHD and AD risk in many
populations
- For example, the frequency of the apo E4 allele (associated with CHD
and Alzheimer’s disease) is highest in nomadic populations [e.g. Pygmies
(0.407), Khoi San (0.370), Papuans (0.368), some Native Americans (0.280) and
Lapps (0.310)] compared with 0.10 to 0.15 in populations of Mediterranean descent.
Genetic epidemiology and
behavior genetics
Strategies for family studies:
Does disease or behavior aggregate in families?
What are the causes of familial aggregation?
What is the model of genetic inheritance and
which genes are responsible?
How do genes interact with the environment?
How to detect genetic effects and genes?
Family studies:
– provide estimates of heritability
– information on mode of inheritance
– adoption and twin studies as special cases
Molecular genetic studies:
– genome-wide association studies & SNP heritability
– linkage in families
– animal studies (e.g. ‘knockouts’)
– known functional variants
What is heritability?
Heritability is the estimate of the proportion of the total
variance of a trait or liability to a disease that is
accounted for by genetic variance, i.e. interindividual
genetic differences.
Genetic variance may arise from additive effects of
different alleles at a locus, or may be due to
dominance, the interaction of alleles at the same locus
Heritability is a characteristic of populations, not of
individuals or families, and is affected by both
genetic and environmental effects
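The definition can be written compactly as a variance decomposition. The notation below is an assumption (it is standard in quantitative genetics but does not appear on the slides), and it presumes no gene-environment correlation or interaction: V_P is the total phenotypic variance, V_A the additive genetic variance, V_D the dominance variance and V_E the environmental variance.

```latex
V_P = V_A + V_D + V_E, \qquad
h^2 = \frac{V_A}{V_P} \quad \text{(narrow-sense heritability)}, \qquad
H^2 = \frac{V_A + V_D}{V_P} \quad \text{(broad-sense heritability)}
```

Because V_E is in the denominator, the same genetic variance yields a different heritability in a population with a more or less variable environment, which is why heritability describes a population rather than an individual.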
FAMILY STUDY
- Provides estimates of the degree of familial
aggregation
- Risks to siblings, parents, offspring as well as
to other relatives can be estimated
- Similarity of different types of relatives can
permit modelling of genetic versus non-genetic
familial influences
Obesity in families
(Quebec Family study, 1996)
[Bar chart: BMI correlation (y-axis, 0 to 0.3) for parent-child, sibling and spouse pairs]
Genetic epidemiology
To disentangle genes and experience, we study
special family groups:
either family members who share experiences but
differ in the genes they share (e.g. twin studies), or
family members who share genes but differ in
their shared experience (e.g. adoption studies)
ADOPTION DESIGN
Test for association between a trait in adoptees and the trait in
biological parents (genetic correlation), and
test for association between the trait in adoptees and the trait in
adoptive parents.
STRENGTHS:
relatively powerful
WEAKNESSES:
(1) poor generalizability
(2) adoptive parents likely to provide ‘good homes’
(3) biological parents of adopted children may have
had multiple forms of psychopathology (selection)
(4) poor characterization of phenotypes of biological
parents
Adoption studies of obesity
(Sörensen et al.1998)
[Bar chart: BMI correlation (y-axis, 0 to 0.25) for biological mother, biological father, biological siblings and adoptive parents]
The Classical Twin Study
- Monozygotic (MZ) pairs are genetically alike
- Dizygotic (DZ) pairs, like siblings, share on average half of their
segregating genes
- DZ pairs can be same-sex or opposite-sex (male-female)
- Increased similarity of twin pairs compared to unrelated subjects
suggests familial factors
- Increased similarity of MZ pairs compared to DZ pairs provides
evidence for genetic factors
BMI in 25-year-old female twin pairs
(rMZ = 0.78, rDZ = 0.37)
[Scatter plots by zygosity (DZ and MZ panels): BMI in twin 1 against BMI in twin 2, both axes 10 to 50. FinnTwin16 study]
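As a rough illustration of how the two twin correlations for BMI (rMZ = 0.78, rDZ = 0.37) translate into variance components, the sketch below applies Falconer's approximation. This shortcut is an assumption for illustration only; the slides rely on formal model fitting rather than these formulas, and the function name is hypothetical.

```python
def falconer_estimates(r_mz: float, r_dz: float) -> dict:
    """Rough variance-component estimates from MZ and DZ twin correlations.

    Falconer's approximation (an illustrative shortcut, not model fitting):
      a2 = 2 * (rMZ - rDZ)  additive genetic share
      c2 = 2 * rDZ - rMZ    shared (family) environment share
      e2 = 1 - rMZ          unshared environment, including measurement error
    """
    a2 = 2.0 * (r_mz - r_dz)
    c2 = 2.0 * r_dz - r_mz
    e2 = 1.0 - r_mz
    return {"a2": a2, "c2": c2, "e2": e2}


# Twin correlations for BMI quoted on the slide (FinnTwin16, females aged 25).
estimates = falconer_estimates(r_mz=0.78, r_dz=0.37)
for name, value in estimates.items():
    print(f"{name} = {value:.2f}")
```

With these inputs the shared-environment estimate comes out slightly negative, which happens whenever rMZ exceeds twice rDZ. That pattern is usually read as little or no shared-environment effect, possibly with non-additive genetic effects, and is one reason competing models are compared formally.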
The classical twin study modelling
- Model the contribution of additive (A)
and non-additive (D) genetic effects,
environmental effects shared by
family members (C) and unshared
effects (E), i.e. those unique to each
family member
- Competing models, e.g. E, AE, ACE,
can be statistically compared and
tested against actual data
- Mx, a statistical program created by
Mike Neale, is the one most commonly used in
genetic modelling:
http://views.vcu.edu/mx/
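[Path diagram: ACE model for a twin pair. Latent factors A1, C1 and E1 load on Twin 1; A2, C2 and E2 load on Twin 2. The correlation between A1 and A2 is 1.0 (MZ) / 0.5 (DZ); the correlation between C1 and C2 is 1.0.]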
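The path diagram implies specific expected variances and covariances for MZ and DZ pairs, which is what a model-fitting program compares with the observed data. A minimal sketch, assuming path coefficients a, c and e (the function names and the numeric values are hypothetical, chosen only so that the variance shares sum to 1):

```python
import math


def expected_twin_covariance(a: float, c: float, e: float, zygosity: str):
    """Model-implied 2x2 covariance matrix for a twin pair under an ACE model.

    The A factors correlate 1.0 in MZ and 0.5 in DZ pairs; the C factors
    correlate 1.0 in both; the E factors are uncorrelated.
    """
    r_a = {"MZ": 1.0, "DZ": 0.5}[zygosity]
    variance = a**2 + c**2 + e**2
    covariance = r_a * a**2 + c**2
    return [[variance, covariance], [covariance, variance]]


def implied_correlation(a: float, c: float, e: float, zygosity: str) -> float:
    """Twin correlation implied by the model (covariance / variance)."""
    cov = expected_twin_covariance(a, c, e, zygosity)
    return cov[0][1] / cov[0][0]


# Hypothetical standardized shares: a2 = 0.6, c2 = 0.2, e2 = 0.2.
a, c, e = math.sqrt(0.6), math.sqrt(0.2), math.sqrt(0.2)
r_mz = implied_correlation(a, c, e, "MZ")
r_dz = implied_correlation(a, c, e, "DZ")
print(f"implied rMZ = {r_mz:.2f}, implied rDZ = {r_dz:.2f}")
```

Dropping a parameter (for example setting c to zero for an AE model) changes the implied matrices, and the loss of fit against the observed matrices is what the comparison of E, AE and ACE models tests.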
Twin similarity for life span at very old age
Extensions of the classical twin
study I
Effect modification by age, sex and
environmental factors, e.g. smoking or obesity
Assess genetic covariance over time through
longitudinal models
Assess sex effects by comparison of like-sexed
and opposite-sex DZ pairs
Assess social interaction effects
Age dependence of genetic
effects: CHD in twin brothers
Bivariate analyses indicate the genetic and
environmental contributions to the relationship
of relative weight at birth and in adolescence
(Pietiläinen et al, Obes Res 2002)
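[Path diagram: bivariate ACE model for Twin 1 and Twin 2, with Variable 1 = PI at birth and Variable 2 = BMI at 16 y. Correlations ra, rc and re link the A, C and E factors of the two variables; cross-twin correlations are 1.0 (0.5) for A and 1.0 for C. FinnTwin16]
Values shown in the figure (m = males, f = females):
Correlations between the two variables:
ra m: 0.21, f: 0.13
re m: 0.16, f: 0.07
r m: 0.11, f: 0.09
Variance components:
              a2                 c2                 e2
PI at birth   m: 0.20, f: 0.47   m: 0.42, f: 0.18   m: 0.39, f: 0.35
BMI at 16 y   m: 0.84, f: 0.90                      m: 0.16, f: 0.10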
Different phenotypes, different
effects of genes: smoking
Phenotype                                    Genetic effects   Non-genetic family effects
Experimentation (age 12)                     11%               73%
Initiation/ever smoker (adolescents)         20-36%            18-59%
Initiation/ever smoker (adults)              28-80%            4-50%
Persistence/cessation                        58-71%            None
Nicotine dependence (Fagerström or DSM-IV)   60-72%            None
Models of Gene-Environment Interaction
Purcell, S., Variance components models for gene-environment
interaction in twin analysis. Twin Research, 2002. 5: p. 554-571
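[Path diagram: trait T with latent factors A, C and E. The paths are moderated by a measured variable M: a + βX M for A, c + βY M for C and e + βZ M for E. The trait mean is μ + βM M.]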
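In this moderation model each path is a linear function of the moderator M, so the variance components change with M. The sketch below computes the standardized components at a few moderator values. The parameter values are hypothetical and chosen only for illustration, and the function name is not from the cited paper.

```python
def moderated_components(m, a, c, e, beta_x, beta_y, beta_z):
    """Variance components of trait T at moderator level m.

    Each path is (baseline + beta * m); the variance contributed by a
    factor is the square of its path. Returns standardized shares and
    the total variance.
    """
    var_a = (a + beta_x * m) ** 2
    var_c = (c + beta_y * m) ** 2
    var_e = (e + beta_z * m) ** 2
    total = var_a + var_c + var_e
    return {
        "a2": var_a / total,
        "c2": var_c / total,
        "e2": var_e / total,
        "total": total,
    }


# Hypothetical parameters: genetic path grows and shared-environment path
# shrinks as the moderator increases; the unique-environment path is constant.
params = dict(a=0.5, c=0.7, e=0.5, beta_x=0.1, beta_y=-0.1, beta_z=0.0)
results = {m: moderated_components(m, **params) for m in (0, 1, 2)}
for m, comp in results.items():
    print(f"M={m}: a2={comp['a2']:.2f}, c2={comp['c2']:.2f}, e2={comp['e2']:.2f}")
```

Plotting these standardized shares against M gives the kind of curve shown for parental monitoring and smoking quantity, where the balance of a2, c2 and e2 shifts across levels of the moderator.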
Parental Monitoring and Smoking
Quantity (Dick et al, J Abn. Psych, 2006)
[Line plot: standardized variance (y-axis, 0 to 1) of a2, c2 and e2 against parental monitoring score (x-axis, 4 = low to 12 = high)]
TWIN DESIGN: Weaknesses
(1) Generalizability: having a same-age sibling? Having a genetically identical
same-age sibling?
(2) Relative rarity of twin pairs.
(3) Non-orthogonal design: need large sample sizes.
(4) If major environmental risk factors are not assessed,
interaction of genetic effects and shared environmental
effects will be confounded with genetic effects.
(5) Weak for detecting parent-to-offspring environmental
influences.
Assumptions of the classical twin study
- Equality of environmental variances in MZ and
DZ pairs
- Differences may arise from:
– placentation and in utero effects
– fetal programming hypothesis implications
– differential parental treatment
– zygosity determination errors
- Random mating
Perinatal mortality among twins
by zygosity and chorionicity
[Bar chart: mortality (y-axis, 0 to 9) for DZ, MZDC and MZMC twins in three categories: fetal, 1-7 days, perinatal]
Birthweights of twins
East Flanders Prospective Twin Survey (Loos 1998)
              DZ    MZDC   MZMC
% of pairs    64    10     26
Mean birthweight (values as extracted, group labels lost): 2401 g, 2314 g, 2476 g
FAMILY STUDY
Ultimately, sampling regular families must be a key part of any
genetic epidemiologic approach.
* Provides tests of generalizability of findings obtained with more
specialized twin-family and adoption designs.
* Allows adequate representation of minority groups.
Numbers of minority twin pairs available for study, e.g. Swedish-speaking twin pairs
in Finland, are often small.
How to detect genetic effects and genes?
Molecular genetic studies:
– candidate genes, genome-wide scans
– association studies & linkage
– animal studies (e.g. ‘knockouts’)
Family studies:
– provide estimates of heritability
– information on mode of inheritance
– adoption and twin studies as special cases
Increasing the genetic signal in the data...
ascertain pedigree units that are likely to segregate
genes of relevance
– Ex: pedigrees with quasi-Mendelian disease transmission
– affected sib pair approach of linkage analysis
ascertain families on the basis of individuals with
extreme or remarkable phenotypes
– Ex: extremely discordant sibpairs
– ascertain young individuals with the disease
ascertain individuals from isolated populations:
– more homogeneous genetically, and culturally as well
ascertain intermediate phenotypes
– physiologic phenotype is “closer” to sequence variants
Two basic Analysis Strategies
1. candidate gene analysis
motto: study a few good genes
2. whole-genome searches (genome
scans)
motto: cast out a net that catches all
the big fish
Association studies:
Case-control design
What is the difference between genes of cases (e.g.
with disease or trait) and controls?
- Selection of controls is a major challenge, as in all case-control studies
- High rate of false-positive studies:
– many genes are available for study
– population admixture is a confounding factor
– publication bias
Candidate Gene Studies
statistically straightforward: test the association between
genotypes and phenotype with contingency tables, chi-square test, regression
principle: if an allele is more frequent in affected than in
unaffected individuals, the gene may be close to a disease gene
candidacy of a gene can come from a number of different
sources:
– biological insights (e.g. gene expressed in a certain tissue)
– homology to other genes
– functional studies in model organisms
– member of a relevant gene family
Challenge: greater biological understanding of the genes
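The contingency-table test mentioned above can be sketched for the simplest case, a 2x2 table of allele carrier status by case/control status. The counts below are hypothetical, and the function name is illustrative; the p-value uses the fact that a chi-square variable with 1 degree of freedom has survival function erfc(sqrt(x / 2)).

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (1 df, no continuity correction) for a 2x2 table.

    Table layout:
                  allele present   allele absent
      cases             a                b
      controls          c                d
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 60 of 200 cases and 40 of 200 controls carry the allele.
chi2, p_value = chi_square_2x2(60, 140, 40, 160)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

A nominally significant result from one such test says nothing about how many genes were examined before it, which is why the high rate of false positives noted for association studies matters when reading candidate gene findings.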
POPULATION STRATIFICATION
Hypothetical Example (by Andrew Heath)
NORTHERN EUROPEAN ANCESTRY (N=200)
                      NOT A1 allele   A1 allele   Row %
NOT ROMAN CATHOLIC    162             18          90%
ROMAN CATHOLIC        18              2           10%
Column %              90%             10%
NO ASSOCIATION

SOUTHERN EUROPEAN ANCESTRY (N=200)
                      NOT A1 allele   A1 allele   Row %
NOT ROMAN CATHOLIC    35              105         70%
ROMAN CATHOLIC        15              45          30%
Column %              25%             75%
NO ASSOCIATION

MINGLED IN AUSTRALIAN POPULATION (N=400)
                      NOT A1 allele   A1 allele
NOT ROMAN CATHOLIC    197             123
ROMAN CATHOLIC        33              47
OR = 2.28, 95% CI 1.39 - 3.73
One would falsely infer that the A1 allele is a risk factor for Roman Catholicism.
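The arithmetic behind this example can be checked directly. The sketch below computes the odds ratio within each ancestry group and in the pooled table, using the counts from the slide. The Mantel-Haenszel stratified estimate is added as an assumption for illustration; it is one common way to adjust for a stratifying variable and is not mentioned on the slide.

```python
def odds_ratio(table):
    """Odds ratio for a 2x2 table given as ((a, b), (c, d))."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Stratum-adjusted odds ratio (Mantel-Haenszel estimator)."""
    numerator = sum(a * d / (a + b + c + d) for (a, b), (c, d) in tables)
    denominator = sum(b * c / (a + b + c + d) for (a, b), (c, d) in tables)
    return numerator / denominator


# Rows: not Roman Catholic, Roman Catholic. Columns: not A1 allele, A1 allele.
northern = ((162, 18), (18, 2))
southern = ((35, 105), (15, 45))

# Pool the two groups cell by cell, as if ancestry were unknown.
pooled = tuple(
    tuple(n + s for n, s in zip(row_n, row_s))
    for row_n, row_s in zip(northern, southern)
)

or_northern = odds_ratio(northern)
or_southern = odds_ratio(southern)
or_pooled = odds_ratio(pooled)
or_adjusted = mantel_haenszel_or([northern, southern])

print(f"Northern OR = {or_northern:.2f}")
print(f"Southern OR = {or_southern:.2f}")
print(f"Pooled OR   = {or_pooled:.2f}")
print(f"Mantel-Haenszel OR = {or_adjusted:.2f}")
```

The odds ratio is 1 in each group and about 2.28 only after pooling, because both the allele frequency and the proportion of Roman Catholics differ between the groups. Stratifying by ancestry removes the spurious association.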
Genome-wide association studies
- Large-scale case-control series
- For example, MI patients and matched controls without MI
- Use of very large numbers of SNPs to identify all possible genes
associated with the disease
- Typically 100,000 to 500,000 SNPs
- Different technology platforms (Affymetrix, Illumina)
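Testing this many SNPs makes false positives a practical concern. As a hedged illustration, the sketch below computes a Bonferroni-corrected per-SNP significance threshold for the SNP counts quoted above. The choice of Bonferroni correction and the overall alpha of 0.05 are assumptions; the correction treats SNPs as independent, which is conservative because neighbouring SNPs are correlated.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# SNP counts taken from the slide; alpha = 0.05 is an assumed overall error rate.
thresholds = {n_snps: bonferroni_threshold(0.05, n_snps) for n_snps in (100_000, 500_000)}
for n_snps, threshold in thresholds.items():
    print(f"{n_snps} SNPs: per-SNP threshold = {threshold:.1e}")
```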
Gene x Environment Interactions
[Figure: three panels, each plotting liability to illness against environment (protective to predisposing) for genotypes AA, Aa and aa. Panel titles: "Genes and environment have additive, independent effects"; "Genes control degree of sensitivity to environmental influence"; "Genes control susceptibility to environmental pathogenesis". Kendler & Eaves, 1986]
Gene-environment correlations refer to genetic effects on
individual differences in liability to exposure to particular
environmental circumstances.
(Background is the extensive evidence that
environmental risk exposure is far from
randomly distributed)
Gene-environment interactions concern genetically influenced
individual differences in the sensitivity to specific environmental
factors.
(Background is the extensive evidence of huge
individual differences in vulnerability to all
manner of environmental hazards)
Examples of social x biological interactive effects
- Biology controls sensitivity to environmental effects
– E.g., family stress x serotonin metabolism => depression and anxiety risk (Caspi, Science 2003)
- Social context generates undifferentiated risk; biology constrains pathologic
specificity
– E.g., childhood neglect => alcoholism in men, eating disorders in women
- Biological susceptibilities are amplified during rapid or intense contextual change
– E.g., biological or gender-based vulnerabilities to depression and alcohol use as indexed by
pubertal development
- Biology controls liability to experiencing predisposing environments
– E.g. genes for skin color
Integration of information at different levels
Developments in molecular genetics now make it possible to attempt
identification of liability genes in complex, multifactorial traits,
and to dissect out with new precision the role of genetic
predisposition and environment/lifestyle factors in these disorders.
But an integrative framework is needed.
Gottesman I, Science 1997
Complex picture
Complexity of Complex Diseases
- Classical polygenic or "threshold" inheritance: a certain number
of mutations at different loci must be present before a system is
sufficiently challenged to result in disease.
- Locus heterogeneity, in which defects in any of a number of genes
or loci confer disease susceptibility independently of each other.
- Epistasis, or gene interaction: interactive effects of mutations,
genotypes, and/or their biologic products
- Environmental vulnerability: gene products are influenced by
environmental stimuli.
- Gene × environment interactions: a gene has a deleterious effect
only in the presence of a particular environmental stimulus.
- Time-dependent expression of genes
- General aging of the system
Testing of epidemiological causal
hypotheses – use of twins
- Differences between MZ cotwins in a pair are due to environmental causes
(in the very broadest sense)
– somatic mutations and other genetic changes during development
– prenatal environmental and birth order effects
– differential treatment in childhood
– different exposures (occupational, lifestyle)
- Exposure/disease discordant DZ pairs are fully matched on early childhood
effects, and partially on genetic factors
- Studies of exposure-discordant twin pairs have increased power compared to
unmatched case-control series, depending on the degree of familiality of the
exposure
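A discordant-pair analysis can be sketched as a matched-pair odds ratio. Among pairs discordant for both exposure and disease, the estimate is the number of pairs in which the exposed twin is the affected one divided by the number in which the unexposed twin is affected. The counts below are hypothetical, the function name is illustrative, and the confidence interval uses the usual large-sample approximation on the log scale.

```python
import math


def matched_pair_odds_ratio(exposed_twin_affected: int, unexposed_twin_affected: int):
    """Conditional odds ratio from doubly discordant twin pairs.

    Only pairs discordant for both exposure and disease are informative:
    the odds ratio is the ratio of the two kinds of discordant pair.
    Returns (odds ratio, lower 95% limit, upper 95% limit).
    """
    b, c = exposed_twin_affected, unexposed_twin_affected
    estimate = b / c
    se_log = math.sqrt(1.0 / b + 1.0 / c)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical counts: in 30 pairs the exposed twin is affected,
# in 15 pairs the unexposed twin is affected.
pair_or, ci_low, ci_high = matched_pair_odds_ratio(30, 15)
print(f"matched-pair OR = {pair_or:.2f} (95% CI {ci_low:.2f} - {ci_high:.2f})")
```

Because each affected twin is compared with a co-twin, shared family background is held constant by design, and in MZ pairs genotype is as well.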