Download Tree Scanning - Technion - Laboratory of Computational Biology

Haplotype Trees
Using The Evolutionary History
of Small DNA Regions To
Investigate Common Diseases
Replication
Coalescence
Unrooted Haplotype Tree
Intra-Allelic Sequence Variation (ApoE)
[Table: ApoE haplotypes 1-31 aligned against the chimpanzee (Chimp) sequence, with haplotype counts alongside. The alignment itself is not recoverable from this transcript. Totals shown: 13 (6.8%), 152 (79.2%), 27 (14%).]
Statistical Vs. Maximum Parsimony
A = AGCT
B = TGCT
C = TACT
D = AAGG
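The slide lists the four haplotypes without showing how they are compared. As a minimal sketch, assuming the comparison rests on the number of sites at which two haplotypes differ, the pairwise differences can be counted as follows (the function and variable names are illustrative, not from the slides):

```python
from itertools import combinations

# Haplotypes exactly as listed on the slide.
haplotypes = {"A": "AGCT", "B": "TGCT", "C": "TACT", "D": "AAGG"}


def differences(x: str, y: str) -> int:
    """Number of sites at which two equal-length haplotypes differ."""
    if len(x) != len(y):
        raise ValueError("haplotypes must have the same length")
    return sum(a != b for a, b in zip(x, y))


# Pairwise mutational differences for every pair of haplotypes.
pairwise = {
    (p, q): differences(haplotypes[p], haplotypes[q])
    for p, q in combinations(sorted(haplotypes), 2)
}

for (p, q), d in pairwise.items():
    print(f"{p}-{q}: {d} difference(s)")
```

A, B and C are joined by single differences (A-B and B-C), whereas D is at least three differences away from every other haplotype, so its placement is the part of the tree most open to doubt.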
3
21
14
1575
2907
26
545
1998
7
5361
2440
2
6
4036
3937
471
5361
1998
560
624
624
560
1575
22
16
5
624
3106
4951
4075
10
2
832
5229B
9
4951
560
3673
4951
560
12
8
308
24
18
23
560
3
20
73
1163
560
27
17
4 560 1
29 3701 832
11
19
624
28
25
5361
624
624
The Apoprotein E
Haplotype
Tree
1522
30
13
31
15
4
What Use Are Haplotype Trees?
• Provides an Interpretive Framework When
Integrated With Other Analyses
• Evolutionary History Generates Hypotheses
About Current Significance
• Provides a Powerful Tool For Detecting
Current Genotype-Phenotype Associations
A Haplotype Tree Can Provide an
Interpretive Framework When
Integrated With Other Analyses
[Figure: chart of R² × 100 (axis 0.0 to 8.0) for ApoE sites and pairs of sites, each labeled by site number (g0560 g1163, g0832 g1163, and so on).]
Hamon and Sing estimated interactions for all 53 pairs of ApoE sites for lnApoE variability in North Karelia, females.
Pairs labeled on the figure: 560-832**, 560-1163**, 560-2440**, 832-1163**, 3937-4075.
[Figure: the ApoE haplotype tree, annotated as follows.]
Sites Identified By Hamon and Sing That “Interact” With Site 560
Parallel Mutations At Site 560
Evolutionary Hypothesis: Two Functional Mutations (Occurring On A Specific Haplotype Background) Have Created Three Allelic Clades For the Phenotype Of ln(ApoE); the Red, Blue and Black Clades
[Figure: the ApoE haplotype tree colored by clade.]
[Figure: the ApoE haplotype tree with two annotations.]
The Blue Clade Is Uniquely Defined By These Two Sites
The Red Clade Is Uniquely Defined By These Two Sites
The Red Clade Is Not Uniquely Defined By These Two Sites Due to Homoplasy
[Figure: the ApoE haplotype tree.]
The Apoprotein E Haplotype Tree
Sites 560 and 624 Fall into an Alu Repeat
[Figure: the ApoE haplotype tree.]
Single SNP Analysis of lnApoE in North Karelia, females
[Figure: map of the ApoE gene (scale bar 1 Kb; Exons 1-4) with site 832 labeled. * Indicates a significant single site effect.]
The Single SNP Analysis Identifies Sites With A Weaker Phenotypic Association Because It Cannot Deal With Homoplasy At Site 560
[Figure: the ApoE haplotype tree.]
There is a deliberate attempt
To find SNPs that are
Polymorphic in most or all
Populations and that have
High heterozygosities; that is,
SNPs just like the one at
Site 560.
Linkage Disequilibrium Is
Frequently Used in Association
Studies, But Also Is Frequently
Misinterpreted.
Haplotype Trees Can Aid In
Understanding The Proper
Biological Interpretation
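The slides use linkage disequilibrium without defining it. As a minimal sketch, the standard two-site measures D, D' and r² can be computed from haplotype counts as below. The function name and the counts in the example are hypothetical and are not ApoE data.

```python
import math


def ld_measures(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, D', r^2) for two biallelic sites from the four haplotype counts.

    n_AB counts haplotypes carrying allele A at site 1 and allele B at site 2,
    and so on for the other three combinations.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n          # frequency of allele A at site 1
    p_B = (n_AB + n_aB) / n          # frequency of allele B at site 2
    D = p_AB - p_A * p_B             # departure from random association

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = D * D / denom if denom > 0 else math.nan

    # D' rescales D by the largest value it could take given the allele frequencies.
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = D / d_max if d_max > 0 else math.nan
    return D, d_prime, r2


# Hypothetical counts (not from the slides): one of the four haplotypes is absent.
D, d_prime, r2 = ld_measures(60, 5, 0, 35)
print(f"D = {D:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

None of these measures says which site is functional, nor how far apart the two sites are. They only summarize how the existing haplotypes combine the two sites.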
ApoE Gene
Stengård et al. (1996) showed the amino acid replacement alleles at ApoE have a major impact on mortality due to CAD in a longitudinal study.
[Figure: bar chart of CAD mortality relative to CAD mortality of 3/3 (axis 0 to 7) for genotypes 3/3, 3/4, and 2/4 & 4/4.]
Apoprotein E Gene Region
[Figure: map of the ApoE gene region (scale 0 to 5.5; Exons 1-4) with polymorphic sites 73, 308, 471, 545, 560, 624, 832, 1163, 1522, 1575, 1998, 2440, 2907, 3106, 3673, 3701*, 3937, 4036, 4075, 4951, 5229A, 5229B, 5361.]
These Two Sites Are in Disequilibrium
The Apoprotein E Haplotype Tree
[Figure: the ApoE haplotype tree with two groups of haplotypes marked.]
These haplotypes Are G at Site 832 & T At Site 3937
These haplotypes Are T at Site 832 & C At Site 3937
Apoprotein E Gene Region
[Figure: map of the ApoE gene region (Exons 1-4, sites 73 through 5361) with one portion highlighted.]
Site 3937 Is An Amino Acid Polymorphism That Affects ApoE Function and CAD
Suppose Only This Portion Was Sequenced
Site 832 Would Appear to Have The Strongest Association with ApoE Function and CAD
Would you infer From this Association That the Marker Closest to the Functional Site Was Here?
Haplotype Trees Estimate an
Evolutionary History That Can
Generate Hypotheses About The
Current Significance of Genetic
Variation
LPL Tree
[Figure: haplotype tree for LPL. Haplotypes are labeled by number with the suffixes J, N and/or R (for example 1JNR, 14J, 88R), branches by site number, and nodes by lowercase letters and by T-1 to T-4. The layout is not recoverable from this transcript.]
Detecting Recombination Events in LPL
[Figure: the part of the LPL tree around Branch "A", including haplotypes 5NR, 2JNR, 36J, 11J, 70R, 79R and 31J.]

a=3, b=5, k=3, p = 0.0179, crossover between sites 13 and 29.
Sites 1-69, in blocks of ten:
2JNR    CAGTTTCCCT CAGCACGATC GCAATTGCAC CTCAATGTAT AGTTGTAACC GAGTCCGCAT AACTATAGG
5NR     CAGTTTATCT CACCACGATA GCAATTGCAC CTCAATGTAT AGTTGTAACC GAGTCCGCAT AACTATAGG
Node a  CAGTTTATCT CACCACGATC GCAATTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG

a=2, b=7, k=2, p = 0.0278, crossover between sites 16 and 19.
Node d  CAGTTTATCT CACCACGATC GCAACTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG
11J     CAGTATATCT CACCATGATC GCAACTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG
Node e  CAGTATATCT CACCATGAGC GCAATTGCAC TTTAA?GTAT AGTTGTAACC GAATCAGCAT CACTGGAGA

11J     CAGTATATCT CACCATGATC GCAACTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG
Node e  CAGTATATCT CACCATGAGC GCAATTGCAC TTTAA?GTAT AGTTGTAACC GAATCAGCAT CACTGGAGA
T-1     CAGTTTATCT CACCACGAGC GCAATTGCAC TTTAA?GTAT AGTTGTAACC GAATCAGCAT CACTGGAGA
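The slides do not define a, b and k or say how p was computed. The sketch below makes two assumptions. First, a and b count the sites at which the two putative parents differ and the candidate recombinant matches one parent or the other. Second, p is the chance that a random ordering of those a+b sites places all a sites before all b sites, which is 1/C(a+b, a). Both reported p values agree with this reading, but it is an inference from the numbers and not a statement of the authors' test. The function names are illustrative.

```python
from math import comb


def informative_sites(parent_left: str, candidate: str, parent_right: str):
    """Split the sites where the parents differ by which parent the candidate matches.

    Returns two lists of 1-based site numbers. Sites where any sequence is '?',
    or where the candidate matches neither parent, are ignored.
    """
    seqs = [s.replace(" ", "") for s in (parent_left, candidate, parent_right)]
    left, right = [], []
    for site, (p, c, q) in enumerate(zip(*seqs), start=1):
        if "?" in (p, c, q) or p == q:
            continue
        if c == p:
            left.append(site)
        elif c == q:
            right.append(site)
    return left, right


def single_run_probability(a: int, b: int) -> float:
    """Assumed p value: chance that all a sites precede all b sites, 1 / C(a+b, a)."""
    return 1 / comb(a + b, a)


# Sequences copied from the slide.
H2JNR = "CAGTTTCCCT CAGCACGATC GCAATTGCAC CTCAATGTAT AGTTGTAACC GAGTCCGCAT AACTATAGG"
H5NR = "CAGTTTATCT CACCACGATA GCAATTGCAC CTCAATGTAT AGTTGTAACC GAGTCCGCAT AACTATAGG"
NODE_A = "CAGTTTATCT CACCACGATC GCAATTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG"
NODE_D = "CAGTTTATCT CACCACGATC GCAACTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG"
H11J = "CAGTATATCT CACCATGATC GCAACTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG"
NODE_E = "CAGTATATCT CACCATGAGC GCAATTGCAC TTTAA?GTAT AGTTGTAACC GAATCAGCAT CACTGGAGA"

# 5NR as a candidate recombinant between Node a (left part) and 2JNR (right part).
left, right = informative_sites(NODE_A, H5NR, H2JNR)
p = single_run_probability(len(left), len(right))
print(left, right, round(p, 4))

# 11J as a candidate recombinant between Node e (left part) and Node d (right part).
left2, right2 = informative_sites(NODE_E, H11J, NODE_D)
p2 = single_run_probability(len(left2), len(right2))
print(left2, right2, round(p2, 4))
```

For 5NR the sites matching Node a are 7, 8 and 13 and those matching 2JNR are 29, 31, 33, 53 and 56, which agrees with a=3, b=5 and a crossover between sites 13 and 29. For 11J the split is sites 5 and 16 against sites 19, 25, 29, 61, 65, 66 and 69, which agrees with a=2, b=7 and a crossover between sites 16 and 19.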
Linkage Disequilibrium & The
Recombinational Hotspot in LPL
Haplotype Network in 5’ Region of LPL
[Figure: haplotype network with nodes labeled 5'-1 to 5'-8, haplotypes such as 84R, 49N, 23J, 36J, 44N, 32J, 14J and 8J, and branches labeled by site number.]
Haplotype Network in 3’ Region of LPL
[Figure: haplotype network with nodes labeled 3'-1 to 3'-12 and T-1 to T-4, haplotypes labeled by number with the suffixes J, N or R, and branches labeled by site number.]
[Figure: panels labeled as follows.]
Neutral Genetic Drift, Stable Population Size
Neutral Genetic Drift, Expanding Population Size
Negative Selection
Positive (Directional) Selection or Bottleneck
Positive (Diversifying) Selection or Subdivision
Peeled Haplotype Network of LPL
[Figure: haplotype network with haplotypes labeled by number with the suffixes J, N or R, nodes T-1 to T-4, and branches labeled by site number.]
Evolutionary Inferences On LPL
• 5’ End Subject to Directional Selection,
With A Selective Sweep Enhanced By
Recombination
• 3’ End Subject to Diversifying Selection
• Implies That Most Current Polymorphisms
With Functional Significance Are In 3’ End
Haplotype Trees Provide a
Powerful Tool For Detecting
Current Genotype-Phenotype
Associations
• Nested Clade Analysis
• Tree Scanning
Nested Clade Analysis
• The Nested Clade Method Was Published In 1987 For
Using A Haplotype Tree As A Tool For
Discovering Gene/Phenotype Associations
• Nests The Haplotypes in The Tree Into Evolutionary
Clades (Branches)
• The Resulting Nested Design Provides Asymptotic
Independence And A Priori Contrasts For
Detecting Phenotypic Associations.
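To make the nested design concrete, here is a minimal sketch of how a nested analysis of variance partitions phenotypic variation: among higher-level clades, among lower-level clades within each higher-level clade, and within lower-level clades. The clade labels and phenotype values are hypothetical, and the sketch covers only the sums of squares, not the full published procedure.

```python
def mean(values):
    return sum(values) / len(values)


def nested_sums_of_squares(data):
    """Partition variation for data shaped {higher_clade: {lower_clade: [phenotypes]}}.

    Returns (ss_among_higher, ss_lower_within_higher, ss_within_lower).
    """
    all_values = [x for lows in data.values() for vals in lows.values() for x in vals]
    grand_mean = mean(all_values)
    ss_high = ss_low = ss_within = 0.0
    for lows in data.values():
        high_values = [x for vals in lows.values() for x in vals]
        high_mean = mean(high_values)
        # Contrast of this higher-level clade against the grand mean.
        ss_high += len(high_values) * (high_mean - grand_mean) ** 2
        for vals in lows.values():
            low_mean = mean(vals)
            # Contrast of a lower-level clade against the clade it is nested in.
            ss_low += len(vals) * (low_mean - high_mean) ** 2
            ss_within += sum((x - low_mean) ** 2 for x in vals)
    return ss_high, ss_low, ss_within


# Hypothetical phenotypes for two higher-level clades, each holding two nested clades.
example = {
    "higher clade X": {"nested clade X1": [2.1, 2.5, 2.3], "nested clade X2": [3.0, 3.4]},
    "higher clade Y": {"nested clade Y1": [5.9, 6.2, 6.1], "nested clade Y2": [6.0, 6.4]},
}
ss_high, ss_low, ss_within = nested_sums_of_squares(example)
print(f"among higher clades: {ss_high:.3f}")
print(f"nested clades within higher clades: {ss_low:.3f}")
print(f"within nested clades: {ss_within:.3f}")
```

Each nested clade is compared only with the other clades in its own nesting category, which is why the contrasts are fixed in advance by the tree and not chosen after looking at the phenotypes.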
The Drosophila Adh Haplotype Tree
[Figure: the Adh haplotype tree shown at successive nesting levels, with clades labeled 1-1 to 1-11, then 2-1 to 2-5, then 3-1 and 3-2.]
Results of Nested Analysis of Variance of Adh Activity Using The Adh Haplotype Tree
[Figure: the Adh haplotype tree with significant contrasts marked. *** Significant at 0.1% Level; ** Significant at 1% Level.]
Functional Allelic Categories from the Nested Analysis of Variance of Adh Activity
[Figure: the Adh haplotype tree with the same contrasts marked.]
Phenotypic Distributions Identified Through Nested Clade Analysis
[Figure: three histograms of Number of Lines against Adh Activity: Clade 3-1 vs. Clade 3-2; Clade 1-4 vs. Remainder 3-1; Haplotype 19 vs. Haplotype 23 vs. Remainder 3-2.]
Nested Clade Analyses
• Greater Statistical Power By Focusing On Fewer
Comparisons
• Greater Biological Power In Detecting Mutations
With Phenotypic Effects
• Deals With High Levels of Genetic Variation
Through Pooling Into Clades
• Deals With Linkage Disequilibrium Through
Haplotypes And Tree Branches
• Useful In Ultimately Identifying Causative
Mutations
Nested Clade Analyses
• Although Nesting Is Common In Statistics and
Evolutionary Biology, It Is Unfamiliar and
Daunting To Others
• The Analysis Finds Phenotypic Associations With
Haplotypes or Groups of Haplotypes: Does Not
Deal Directly With Dominance Effects Or
Genotypes.
• Is Inherently A Single Locus (Or Smaller)
Analysis: Does Not Deal Directly With Epistasis
Tree Scanning
A New Method for Using
Haplotype Trees At Candidate
Loci To Investigate Genotype-Phenotype Associations.
E.g., A Genome Scan for Lupus (Gray-McGuire et al. 2000)
Tree Scanning
• Make All Possible Bi-Allelic Partitions of the
Haplotype Tree.
• Test For Phenotypic Heterogeneity Among the
Resulting Genotypes Using Standard Statistics
(ANOVA, t-Tests, Etc.)
• Because Tests Are Not Independent Across The
Bi-Allelic Partitions, Randomly Permute
Phenotypes Across Genotypes 10,000 Times To
Determine the Treewise Type I Error Rate
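The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The tree, genotypes and phenotypes are hypothetical. Heterogeneity is measured with a one-way ANOVA F statistic. The treewise p value is taken from the permutation distribution of the largest F over all partitions, which is one common way to control the treewise error rate and may differ from the procedure used in the talk.

```python
import random
from collections import defaultdict


def partitions(edges):
    """Yield (branch, side) for each branch of the tree.

    Cutting a branch splits the haplotypes in two; `side` is the set still
    connected to the branch's first endpoint, and defines one of the two alleles.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for u, v in edges:
        side, stack = {u}, [u]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in side and {x, y} != {u, v}:
                    side.add(y)
                    stack.append(y)
        yield (u, v), frozenset(side)


def f_statistic(groups):
    """One-way ANOVA F statistic; 0.0 if fewer than two non-empty groups."""
    groups = [g for g in groups if g]
    k, n = len(groups), sum(len(g) for g in groups)
    if k < 2 or n <= k:
        return 0.0
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def scan(edges, genotypes, phenotypes):
    """F statistic among the genotypes produced by each bi-allelic partition."""
    stats = {}
    for branch, side in partitions(edges):
        groups = defaultdict(list)
        for (h1, h2), y in zip(genotypes, phenotypes):
            groups[(h1 in side) + (h2 in side)].append(y)  # copies of the `side` allele
        stats[branch] = f_statistic(list(groups.values()))
    return stats


def tree_scan(edges, genotypes, phenotypes, n_perm=10_000, seed=0):
    """Return {branch: (F, treewise p)} using permutations of the phenotypes."""
    rng = random.Random(seed)
    observed = scan(edges, genotypes, phenotypes)
    shuffled = list(phenotypes)
    exceed = dict.fromkeys(observed, 0)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_f = max(scan(edges, genotypes, shuffled).values())
        for branch, f in observed.items():
            exceed[branch] += max_f >= f
    return {b: (observed[b], (exceed[b] + 1) / (n_perm + 1)) for b in observed}


# Hypothetical five-haplotype tree and twelve diploid individuals.
EDGES = [("H1", "H2"), ("H2", "H3"), ("H3", "H4"), ("H3", "H5")]
GENOTYPES = [("H1", "H1"), ("H1", "H2"), ("H2", "H2"), ("H1", "H2"),
             ("H3", "H3"), ("H3", "H4"), ("H4", "H5"), ("H5", "H5"),
             ("H1", "H3"), ("H2", "H4"), ("H2", "H5"), ("H1", "H5")]
PHENOTYPES = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1, 2.0, 2.1, 1.9, 2.05]

for branch, (f, p) in tree_scan(EDGES, GENOTYPES, PHENOTYPES, n_perm=1000).items():
    print(branch, f"F = {f:.1f}", f"treewise p = {p:.4f}")
```

Because every permuted data set is scanned over all branches, the permutation distribution accounts for the dependence among partitions that share most of the tree.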
Scanning The Drosophila Adh Haplotype Tree
Significant Results of Adh Tree Scan
(Proportion of Phenotypic Variance Explained)
Sequential Adh Tree Scan
(Fix Two or More Alleles From First Scan Defined By Distinct Peaks,
Then Examine All Possible Partitions Into Three Alleles)
Significant Peaks In Second Round of the
Sequential Adh Tree Scan (Color Changes) vs.
The Nested Clade Analysis (*)
Tree Scanning
• Is Less Powerful Statistically Than A Nested
Clade Analysis, But Tends To Identify The Same
Functional Allelic Categories
• Is Easier To Implement and Automate Than A
Nested Clade Analysis
• Detects Phenotypic Heterogeneity Among
Genotypes And Therefore Can Detect Dominance
Effects, Etc.
• Is Superior To Single SNP Association Tests
• It Is Computationally Feasible To Exhaustively
Examine All Combinations of Bi-Allelic Partitions
At Two Separate Genes And Therefore Detect
Epistasis
Haplotype Trees
Provide a valuable tool in the
investigation of common diseases
whose potential has not yet been fully
explored or developed.
Genomic Approaches to Common
Chronic Disease
A Research Project Supported by: National Institute of General Medical Sciences
(NIGMS), P50-GM65509
U. of Texas, Houston, TX: James E. Hixson (Component 1), Eric Boerwinkle (Component 2), Myriam Fornage, Craig Hanis, Andrei Rodin
Cornell U., Ithaca, NY: Andrew G. Clark (Component 3), S. Malia Fullerton
U. of Michigan, Ann Arbor, MI: Charles F. Sing (PI, Component 4), Sharon L. Kardia, Kathy L. Klos
Northwestern U.: Kiang Liu
U. of Alabama: Heather McCreath, O. Dale Williams
Washington U. St. Louis, MO
Alan R. Templeton (Project Consultant)
Support From MDECODE and
A Burroughs-Wellcome Fund
Innovation Award In Functional
Genomics Are Gratefully
Acknowledged