Admixture Mapping
Controlled Crosses Are Often Used to
Determine the Genetic Basis of
Differences Between Populations.
When controlled crosses are not an
option, natural admixture can serve as a
substitute, but genetic markers must be
used to determine the degree of
admixture in the sampled individuals.
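Admixture Between Two Demes
[Figure: two ancestral gene pools, European (alleles A and a at frequencies pE and qE) and West African (alleles A and a at frequencies pW and qW), contribute to the gene pools in present North America. European Americans retain frequencies pE and qE. African Americans draw a fraction 1-M from the European pool and a fraction M from the West African pool.]
Note, M is now measuring the amount of African ancestry.
African Americans: pA = (1-M)pE + MpW, with qA = 1 - pA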
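The relation pA = (1-M)pE + MpW can be rearranged to estimate M from a single marker, M = (pA - pE)/(pW - pE). The sketch below illustrates that arithmetic only; the function names and the allele frequencies are hypothetical and are not taken from the study.

```python
def admixed_frequency(p_e, p_w, m):
    """Allele frequency in the admixed pool: pA = (1 - M) * pE + M * pW."""
    return (1 - m) * p_e + m * p_w


def estimate_m(p_a, p_e, p_w):
    """Solve pA = (1 - M) * pE + M * pW for M using one marker."""
    if p_w == p_e:
        # A marker with equal ancestral frequencies carries no ancestry information.
        raise ValueError("marker is not ancestry-informative: pW equals pE")
    return (p_a - p_e) / (p_w - p_e)


# Hypothetical frequencies, chosen only to illustrate the calculation.
p_e, p_w = 0.1, 0.9
p_a = admixed_frequency(p_e, p_w, m=0.8)
m_hat = estimate_m(p_a, p_e, p_w)
print(round(p_a, 2), round(m_hat, 2))
```

The estimate is only as good as the difference pW - pE is large, which is why markers with very different frequencies in the two ancestral populations are preferred.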
End-Stage Kidney (Renal) Disease
(ESKD) Is A Major Disease In the
USA and Other Countries
• 100,000 Americans develop ESKD each year, and
it is associated with high health care costs and
high mortality
• The cumulative lifetime risk for ESKD varies
with ethnicity:
7.5% in African-Americans
2.1% in European-Americans
• The cause of this increased risk is not explained
by socioeconomic status, lifestyle factors, etc.
Admixture Between Two Demes
Basic strategy of admixture mapping:
Subdivide the African American Sample into Cases
(those with ESKD) and Controls (matched for as
many other variables as possible, but do NOT have
ESKD).
Idea: genes increasing risk of ESKD should be in
genomic regions of West African Ancestry.
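A minimal sketch of this idea, assuming local ancestry has already been inferred at each marker for every individual (coded as the proportion of African-derived alleles: 0, 0.5, or 1). The slides do not state which test statistic the study used, so the Welch-style z statistic, the function names, and the toy data below are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def ancestry_z(cases, controls):
    """z statistic for the case-control difference in mean local African ancestry."""
    se = sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    return (mean(cases) - mean(controls)) / se


def scan(case_matrix, control_matrix):
    """One z statistic per marker; rows are individuals, columns are markers."""
    n_markers = len(case_matrix[0])
    return [
        ancestry_z([row[j] for row in case_matrix],
                   [row[j] for row in control_matrix])
        for j in range(n_markers)
    ]


# Hypothetical toy data: local African ancestry at 3 markers for 4 cases and 4 controls.
toy_cases = [[0.5, 1.0, 1.0], [1.0, 1.0, 0.5], [0.5, 1.0, 1.0], [1.0, 0.5, 0.5]]
toy_controls = [[0.5, 0.5, 1.0], [1.0, 0.5, 0.5], [0.5, 0.0, 1.0], [1.0, 0.5, 0.5]]

z_scores = scan(toy_cases, toy_controls)
# The marker with the largest excess of African ancestry in cases is the candidate peak.
peak_marker = max(range(len(z_scores)), key=lambda j: z_scores[j])
print(peak_marker, [round(z, 3) for z in z_scores])
```

In the toy data only the middle marker shows excess African ancestry among cases, so it is the one flagged.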
Ancestry informative markers are used to compute the ancestry across the chromosomes of cases and controls.
[Figure: chromosome segments labeled as European or African ancestry.]
Bercovici S. et al. Genome Res. 2008;18:661-667
©2008 by Cold Spring Harbor Laboratory Press
Admixture Between Two Demes
For this to work you need to have markers that cover the
whole genome and that are informative about the
population differences between Africans and Europeans.
In this study, 2500 SNPs were used.
The SNP panel was then run on a sample of 723 Cases and
1,059 Controls. An independent sample was run by Kopp
et al., and both studies had identical results.
Admixture Between Two Demes
The Sharp Peak Fell Inside A
Large Gene Known as MYH9
MYH9
A Problem of Scale
Linkage, GWAS, and Admixture Mapping All
Require Recombination. Recent Studies Show
That Recombination Is Often Clustered Into
“Hotspots”, Leaving Large Genomic Blocks
Without Recombination.
[Figure: disequilibrium patterns from previous work on MYH9.]
Haplotypes & Recombination
The Next Phase of the Analysis Was To Organize
the SNP Genotype Data Into Haplotype Data and
Determine The Role of Recombination In this
107,000 base pair region.
We used the program PHASE to do the initial
recombination inferences, followed by the more
sensitive Crandall & Templeton algorithm.
Recombination Inferences
[Figure: recombination frequency (y-axis, 0 to 0.14) versus SNP location (x-axis). One hotspot is annotated "Hotspot Not Detected With PHASE".]
Theoretical Decay of LD in a Random-Mating Population
In a genomic region with no recombination,
the LD created by mutation never dissipates.
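The standard result behind this statement is that, under random mating, disequilibrium decays geometrically with the recombination fraction r: D_t = D_0 (1 - r)^t. The sketch below illustrates it; the starting value D_0 = 0.25 and the chosen values of r are illustrative assumptions, not data from the study.

```python
def ld_after(d0, r, generations):
    """Expected disequilibrium after t generations: D_t = D_0 * (1 - r) ** t."""
    return d0 * (1.0 - r) ** generations


d0 = 0.25  # illustrative starting disequilibrium
for r in (0.0, 0.01, 0.5):
    trajectory = [ld_after(d0, r, t) for t in (0, 10, 100)]
    print(r, [round(d, 4) for d in trajectory])
```

With r = 0 the trajectory stays at D_0 forever, which is the no-recombination case described on the slide. With unlinked loci (r = 0.5) D halves every generation.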
In Genomic Regions of Little to
No Recombination, We Can
Estimate Haplotype Trees, and
The Tree Can Be Used to
Analyze Genotype/Phenotype
Associations.
Rationale: Functional Mutations Define Clades (branches of
an evolutionary tree) In Regions of Low Recombination.
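[Figure: haplotype tree drawn against time. Haplotypes 2 through 7 descend from ancestral haplotype 1 through neutral and functional mutations. This entire clade of haplotypes descending from the functional mutation bears the same functional mutation.]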
Central Haplotype Tree
TCTCGCTCCTGTTGTTTTCATC
TCTCGCTCCTGTTGTTTTCGCC
Haplotype Trees
[Figure: recombination frequency (y-axis, 0 to 0.14) versus SNP location (x-axis), with the haplotype trees for the low-recombination blocks.]
Nested Design of Central Tree
[Figure: the central haplotype tree grouped into nested clades labeled 1-1 through 1-9.]
Nested Design of Central Tree
Nesting level: Total Cladogram

Clade    Control   ESKD   (pe-pc)/pc
3-1      122       129    -1.092
3-2      186       449     0.286
Total    308       578

p-value < 0.0001 (alpha = 0.005)
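The clade counts above form a 2 x 2 contingency table (clade 3-1 or 3-2 by Control or ESKD). The slide does not say which test produced the p-value, so the sketch below swaps in an ordinary Pearson chi-square test with 1 degree of freedom as an assumption. It also assumes that pe and pc denote a clade's frequency among ESKD cases and among controls. The function names are hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


def relative_difference(case_count, case_total, control_count, control_total):
    """(pe - pc) / pc, with pe and pc the clade frequencies in cases and controls."""
    pe = case_count / case_total
    pc = control_count / control_total
    return (pe - pc) / pc


# Rows: clade 3-1, clade 3-2. Columns: Control, ESKD.
chi2, p_value = chi_square_2x2(122, 129, 186, 449)
rel_3_2 = relative_difference(449, 578, 186, 308)
print(round(chi2, 2), p_value < 0.0001, round(rel_3_2, 3))
```

Under these assumptions the test agrees with the reported p-value < 0.0001, and the relative difference computed for clade 3-2 matches the tabulated 0.286.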
ESKD Associations In Blocks
[Figure: recombination frequency (y-axis, 0 to 0.14) versus SNP location (x-axis), with the association result for each block annotated. Annotations in extraction order: N.S., < 0.0001, 0.004, 0.005, N.S., 0.004, N.S., < 0.0001.]
ESKD Associations In Hotspots
The SNPs In The Recombinational Hotspots Have Been
Ignored Up To Now, So Each of These SNPs Was Then
Tested For ESKD Associations.
One SNP Showed A Significant Effect.
[Figure: recombination frequency (y-axis, 0 to 0.14) versus SNP location (x-axis).]
ESKD Associations In MYH9
Used Logistic Regression To Test For the
Simultaneous Effects of These Three Genetic
Risk Factors
[Figure: recombination frequency (y-axis, 0 to 0.14) versus SNP location (x-axis), with the tested contrasts labeled: 1-1 vs 1-2; 1-3 vs rest; 3-1 vs 3-2; C vs T.]
ESKD Associations In MYH9
Results of Logistic Regression:
[Figure: recombination frequency (y-axis, 0 to 0.14) versus SNP location (x-axis), with the tested contrasts labeled: 1-1 vs 1-2; 1-3 vs rest; 3-1 vs 3-2; C vs T.]
Detected three separable main effects and one strong interaction
between the central and end haplotype blocks.
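To illustrate what an interaction means in a logistic model, the sketch below uses two binary risk factors x1 and x2 and the model logit(p) = b0 + b1*x1 + b2*x2 + b12*x1*x2. The coefficient values are hypothetical and are not the fitted values from the MYH9 analysis. The point is only that a nonzero b12 makes the odds ratio for one factor depend on the other.

```python
from math import exp


def risk(x1, x2, b0, b1, b2, b12):
    """Disease probability under a logistic model with an interaction term."""
    eta = b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2
    return 1.0 / (1.0 + exp(-eta))


def odds(p):
    return p / (1.0 - p)


def odds_ratio_x1(x2, **coef):
    """Odds ratio for carrying factor x1, evaluated at a fixed value of x2."""
    return odds(risk(1, x2, **coef)) / odds(risk(0, x2, **coef))


# Hypothetical coefficients, chosen only for illustration.
coef = dict(b0=-1.0, b1=0.2, b2=0.3, b12=0.9)

or_x2_absent = odds_ratio_x1(0, **coef)   # equals exp(b1)
or_x2_present = odds_ratio_x1(1, **coef)  # equals exp(b1 + b12)
print(round(or_x2_absent, 3), round(or_x2_present, 3))
```

With b12 = 0 the two odds ratios would be identical, which is the case of separable main effects.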
A Major Advantage of the
Candidate Locus Approach
Is That Interactions
Between Genes (Epistasis)
and Between Genes and
Environments Can Be
Studied.
However, interactions challenge our usual
concepts of causation and scientific inference.
Epistasis Between ApoE and LDLR
[Figure: HDL particle containing cholesterol, with ApoE and ApoB labeled.]
Epistasis Between ApoE and LDLR for LDL Cholesterol
[Figure: serum LDL cholesterol (mg/dl, y-axis 0 to 200) by ApoE genotype (22, 23, 33, 24, 34, 44), plotted separately for LDLR genotypes A1/- and A2/A2.]
Two Populations
Population 1:
• Frequency ApoE-4 Allele = 0.152
• Frequency ApoE-3 Allele = 0.77
• Frequency LDLR A2 Allele = 0.78
Population 2:
• Frequency ApoE-4 Allele = 0.95
• Frequency ApoE-3 Allele = 0.03
• Frequency LDLR A2 Allele = 0.50
Quantitative Genetic Components As a Function
of Allele Frequencies: A. ε4 Allele at ApoE Is
Rare, A2 at LDLR Common; B. Reversed
[Figure: genetic variance partitioned into additive, dominance, and epistatic variance for ApoE & LDLR jointly, for ApoE, and for LDLR. Panel A y-axis runs 0 to 60; panel B y-axis runs 0 to 35.]
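Why variance components shift with allele frequency can be seen even at a single locus. The sketch below uses the textbook one-locus partition under Hardy-Weinberg proportions, with genotypic values +a, d, -a: additive variance VA = 2pq[a + d(q - p)]^2 and dominance variance VD = (2pqd)^2. It is a simplification that ignores epistasis entirely, and the genotypic values a and d are hypothetical. Only the two ApoE-4 frequencies come from the slides.

```python
def variance_components(p, a, d):
    """One-locus additive and dominance variance under Hardy-Weinberg proportions."""
    q = 1.0 - p
    alpha = a + d * (q - p)           # average effect of an allele substitution
    v_additive = 2.0 * p * q * alpha ** 2
    v_dominance = (2.0 * p * q * d) ** 2
    return v_additive, v_dominance


def total_genetic_variance(p, a, d):
    """Variance of genotypic values computed directly from genotype frequencies."""
    q = 1.0 - p
    freqs = (p * p, 2.0 * p * q, q * q)
    values = (a, d, -a)
    mean_value = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean_value) ** 2 for f, v in zip(freqs, values))


a, d = 10.0, 4.0  # hypothetical genotypic values, not estimates from the study
for p in (0.152, 0.95):  # ApoE-4 frequencies in the two populations
    va, vd = variance_components(p, a, d)
    print(p, round(va, 2), round(vd, 2), round(total_genetic_variance(p, a, d), 2))
```

The same genotypic values give different additive and dominance variances in the two populations, so a variance partition describes a population and not the underlying biology of the locus.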
Interactions Also Exist Between
Genetic and Environmental Factors (e.g.,
Macular Degeneration, the cause of 90% of all legal blindness)
Macular Degeneration
[Figure: LD genome scan, chromosome 10q26.]
[Figure: genotype frequencies in subjects with MD, in the general population, and in subjects without MD.]
[Figure: linkage genome scan.]
Schmidt, S., M. A. Hauser, et al. (2006). Am J Hum Genet 78(5): 852-864.
There is a strong relationship between frequency and
effect size in genetic association studies
This category is
expected to be rare in
any system where true
causation arises from
interactions among
components.
TA Manolio et al. Nature 461, 747-753 (2009)
doi:10.1038/nature08494
Autoimmune Diseases & Allergies
Hygiene Hypothesis
Braun-Fahrlander et al. (2002) showed that children
of farmers in Central Europe who were exposed to
high levels of bacterial endotoxin in house dust had
lower levels of allergies & asthma than
children from the same communities exposed to low
levels of endotoxin.
This suggests a candidate environment (endotoxin
loads in house dust) and a candidate gene, CD14.
[Figure: risk by genotype in two contrasting conditions. In one, CC has the highest risk; in the other, CC has the lowest risk.]
Interactions of Genes
With Other Genes and
With Environmental
Factors Ensure That The
Phrase “The Gene For X”
Is Often False and
Actively Misleading.
Using Interactions in Treatments
[Figure: decline in LDL-cholesterol levels (y-axis, 0 to -90) for e3 and e4 under three treatments: exercise training, probucol, and statins.]
Genetics Has the Potential of Making Medicine Much More Individualized
Candidate Genes & Human Disease
The Major Application of Genetics to Risk
Prediction and Treatment Is Not “Gene
Therapy” But Rather In Understanding
Genetic and Environmental Interactions. This
Requires A Shift In Medicine From Treating
The Diseases of Individuals To Treating The
Individual With The Disease.
Complexity Is Both A
Challenge and an Opportunity