QTLs, Part Two (Ed Buckler)

QTL Analysis: Concept
[Figure: QTL mapping workflow. Parents are crossed (×) to produce the F1, followed by the F2 and F2:3 generations. Laboratory: individuals 1 to N are genotyped (A, B, or H) at markers 1 to M. Field: plant height (PHT, cm) is recorded for each individual (e.g. 210, 190, 203, 159, 206, ..., 171). Office: marker and trait data are combined into a LOD score profile along chromosome 1.]
Alternatives: BC1, RIL, DHL
QTL Analysis: Single Marker Analysis
[Figure: plant height (cm, axis 160 to 240) by genotype class at markers umc157 and umc130.]

Marker class means (cm); overall mean 196:

Marker    AA     Aa     aa     F test
umc157    195    197    195    F = 0.48 ns
umc130    201    196    191    F = 6.47**
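The F test behind a single marker analysis is a one-way ANOVA across the marker genotype classes. The sketch below computes it by hand. The individual plant heights are hypothetical values chosen only so that the class means match the umc130 means on the slide (201, 196, 191); they are not the data behind F = 6.47, and the function name `one_way_f` is an assumption.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individuals around their own class mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical plant heights (cm) per marker genotype class; illustration only.
heights = {
    "AA": [203.0, 199.0, 201.0],
    "Aa": [198.0, 194.0, 196.0],
    "aa": [193.0, 189.0, 191.0],
}
f_stat, df_b, df_w = one_way_f(list(heights.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the marker classes differ in mean height by more than the within-class scatter would explain.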
QTL Analysis: Single Marker Model (F2)
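A marker locus (alleles M, m) is linked to a QTL (alleles Q, q) with recombination frequency r.

Frequencies of the QTL genotypes within each F2 marker class:

Marker class       QQ            Qq                   qq            Class mean
MM                 $(1-r)^2$     $2r(1-r)$            $r^2$         $\mu_{MM}$
Mm                 $r(1-r)$      $(1-r)^2 + r^2$      $r(1-r)$      $\mu_{Mm}$
mm                 $r^2$         $2r(1-r)$            $(1-r)^2$     $\mu_{mm}$
Genotypic value    $\mu_1$       $\mu_2$              $\mu_3$

Additive effect:
$(\mu_{MM} - \mu_{mm}) / 2 = a(1 - 2r)$
Dominance effect:
$\mu_{Mm} - (\mu_{MM} + \mu_{mm}) / 2 = d(1 - 2r)^2$
F tests on the contrasts of marker classes test the following hypothesis:
a > 0
d > 0
r < 0.5
Schön, 2002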
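The two contrasts follow directly from the frequency table, and a short numerical check makes that concrete. This is a sketch only: it assumes the usual parameterization of the genotypic values, QQ = m + a, Qq = m + d, qq = m - a, which the slide does not spell out, and the values of m, a, d, and r are arbitrary.

```python
def marker_class_means(m, a, d, r):
    """Expected trait means of the F2 marker classes MM, Mm, mm."""
    qtl_values = (m + a, m + d, m - a)  # assumed values for QQ, Qq, qq
    # P(QTL genotype | marker genotype), as in the frequency table.
    freqs = {
        "MM": ((1 - r) ** 2, 2 * r * (1 - r), r ** 2),
        "Mm": (r * (1 - r), (1 - r) ** 2 + r ** 2, r * (1 - r)),
        "mm": (r ** 2, 2 * r * (1 - r), (1 - r) ** 2),
    }
    return {g: sum(p * v for p, v in zip(ps, qtl_values)) for g, ps in freqs.items()}


def marker_contrasts(means):
    """Additive and dominance contrasts of the marker class means."""
    additive = (means["MM"] - means["mm"]) / 2
    dominance = means["Mm"] - (means["MM"] + means["mm"]) / 2
    return additive, dominance


# Arbitrary illustrative parameters.
m, a, d, r = 196.0, 10.0, 4.0, 0.2
additive, dominance = marker_contrasts(marker_class_means(m, a, d, r))
print(additive, a * (1 - 2 * r))        # both equal a(1 - 2r)
print(dominance, d * (1 - 2 * r) ** 2)  # both equal d(1 - 2r)^2
```

Both contrasts shrink toward zero as r approaches 0.5, which is why marker contrasts alone cannot separate a small QTL close to the marker from a large QTL far away.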
QTL Analysis: Single Marker Model (F2)
Example: plant height, umc130

Marker class means: X(MM) = 201 cm, X(Mm) = 196 cm, X(mm) = 191 cm

Case 1: marker and QTL completely linked (haplotypes MQ and mq, r = 0)
Case 2: marker and QTL separated by recombination (r = 0.2)

PHT (cm)       r = 0     r = 0.2    r = 0.4
Add. effect    5.0       8.3        25.0
X(QQ)          201.0     204.3      221.0
X(Qq)          196.0     196.0      196.0
X(qq)          191.0     187.7      171.0
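The table values can be reproduced by inverting the additive contrast, a = [(X(MM) - X(mm)) / 2] / (1 - 2r). The sketch below assumes no dominance (d = 0), so that the heterozygote stays at the midpoint, which is what the constant X(Qq) = 196 in the table implies. The function name is hypothetical.

```python
def qtl_means_from_marker_means(x_MM, x_mm, r):
    """Infer the additive effect and QTL genotype means for a given r (d = 0 assumed)."""
    marker_effect = (x_MM - x_mm) / 2       # a(1 - 2r), observed at the marker
    a = marker_effect / (1 - 2 * r)         # additive effect at the QTL itself
    midpoint = (x_MM + x_mm) / 2            # equals X(Qq) when d = 0
    return a, midpoint + a, midpoint, midpoint - a


for r in (0.0, 0.2, 0.4):
    a, x_QQ, x_Qq, x_qq = qtl_means_from_marker_means(201.0, 191.0, r)
    print(f"r = {r}: a = {a:.1f}, X(QQ) = {x_QQ:.1f}, X(Qq) = {x_Qq:.1f}, X(qq) = {x_qq:.1f}")
```

The same 5 cm difference at the marker is compatible with a 5 cm effect at r = 0 or a 25 cm effect at r = 0.4, which is the confounding of effect and position that single marker analysis cannot resolve.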
4. Association Analysis Concepts
Dissecting A Quantitative Trait: Time Versus Resolution
[Figure: research time in years (1 to 5) versus resolution in bp (1 to 1x10^7) for positional cloning, NILs, RI QTL mapping, F2 QTL mapping, and associations.]
Resolution Versus Allelic Range
[Figure: alleles evaluated (1 to >40) versus resolution in bp (1 to 1x10^7) for associations in diverse germplasm, associations in narrow germplasm, positional cloning, NIL, pedigree, and F2 or RIL mapping.]
Association Tests
• Evaluate whether nucleotide polymorphisms associate with phenotype
• Natural populations
• Exploit extensive recombination

Haplotype    Phenotype
ACGAG        1.3 m
ACGAT        1.4 m
ATAAG        1.5 m
CTAGT        1.8 m
ATGGT        2.0 m
ATGGG        2.0 m
Association mapping
• Mainstay of human genetics
– One of a few possible approaches
– Reproducibility was an issue
• Cystic fibrosis
– Kerem, et al. (1989). Science 245, 1073-1080.
• Alzheimer's disease
– Corder et al. (1994). Nature Genet. 7, 180-184.
Associations may result from at
least three causes
1. The locus is the cause of the phenotype
2. The locus is in linkage disequilibrium with
the cause of the phenotype
Linked and highly correlated
Complete Linkage Disequilibrium
Same mutational history and no recombination.
No resolution
[Figure: two-locus haplotypes (Locus 1, Locus 2) with haplotype counts 6 and 6.]
$D' = 1$
$r^2 = 1$
Adapted from Rafalski (2002) Curr Opin Plant Biol 5:94-100.
Linkage Disequilibrium
Different mutational history and no recombination.
Some resolution
[Figure: two-locus haplotypes (Locus 1, Locus 2) with haplotype counts 3, 3, and 6.]
$D' = 1$
$r^2 = 0.33$
Linkage Equilibrium
Same mutational history with recombination.
Resolution
[Figure: two-locus haplotypes (Locus 1, Locus 2) with haplotype counts 3, 3, 3, and 3.]
$D' = 0$
$r^2 = 0$
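The D' and r² values quoted for the three scenarios can be recomputed from haplotype counts. The sketch below uses the standard definitions D = P(AB) - p(A)p(B), D' = |D| / D_max, and r² = D² / (p(A)p(a)p(B)p(b)). Which count belongs to which haplotype is an assumption read off the figure residue (for the middle case, 6 AB, 3 aB, 3 ab, and no Ab), and the function name is hypothetical.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D', r^2) for two biallelic loci from the four haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    p_a, p_b = 1 - p_A, 1 - p_B
    D = n_AB / n - p_A * p_B
    # Largest |D| attainable given the allele frequencies.
    d_max = min(p_A * p_B, p_a * p_b) if D < 0 else min(p_A * p_b, p_a * p_B)
    d_prime = abs(D) / d_max if d_max > 0 else 0.0
    r2 = D * D / (p_A * p_a * p_B * p_b)
    return d_prime, r2


scenarios = {
    "complete LD": (6, 0, 0, 6),         # two haplotypes only
    "LD": (6, 0, 3, 3),                  # three haplotypes, no recombinant
    "linkage equilibrium": (3, 3, 3, 3), # all four haplotypes equally common
}
for name, counts in scenarios.items():
    d_prime, r2 = ld_stats(*counts)
    print(f"{name}: D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

D' stays at 1 whenever one of the four haplotypes is missing, while r² also drops when the allele frequencies at the two loci differ; that is why the middle case has D' = 1 but r² = 0.33.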
3. Population structure can produce associations
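[Figure: G and T alleles at one polymorphism in U.S. and Andean samples, with plant height (axis 80 to 200) and kernel hue (axis 0 to 10) compared between the G and T classes; reported significance levels are P << 0.001 and P = 0.04.]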
These non-functional associations can be accounted for by
estimating the population structure using random markers.
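A toy calculation shows how structure alone creates an association. All numbers below are hypothetical: two populations differ in both allele frequency and mean plant height, but the allele has no effect within either population.

```python
from statistics import mean

# Hypothetical records: (population, allele, plant height in cm).
# Within each population, G and T lines have the same height.
lines = (
    [("US", "G", 180.0)] * 8 + [("US", "T", 180.0)] * 2
    + [("Andes", "G", 120.0)] * 2 + [("Andes", "T", 120.0)] * 8
)


def allele_difference(records):
    """Mean height of G lines minus mean height of T lines."""
    g = [h for _, allele, h in records if allele == "G"]
    t = [h for _, allele, h in records if allele == "T"]
    return mean(g) - mean(t)


pooled = allele_difference(lines)
within = {
    pop: allele_difference([rec for rec in lines if rec[0] == pop])
    for pop in ("US", "Andes")
}
print("pooled G - T difference:", pooled)
print("within-population differences:", within)
```

The pooled sample shows a 36 cm difference between alleles even though the difference is zero inside each population. Conditioning on population membership, estimated from random markers, removes it.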
5. QTL mapping analysis
QTL Analysis: Interval Mapping
Simple Interval Mapping
Composite Interval Mapping
[Figure: a QTL (Q/q) located between flanking markers M1/m1 and M2/m2, with recombination frequency r1 between M1 and Q, r2 between Q and M2, and r between M1 and M2.]
[Figure: PlabQTL LOD profile along the chromosome (positions 0 to 150 cM, LOD axis 0.0 to 4.7), with the peak at 96 cM.]
QTL Analysis: Power of QTL detection
Power: probability of finding a QTL
[Figure: power (%) of QTL detection versus heritability (0.4 to 1.0) for population sizes N = 100, 300, and 600.]
Heritability:
$h^2 = \sigma_g^2 / \sigma_p^2$
Utz and Melchinger, 1994
QTL Analysis: Conclusions
There are a number of QTL. The largest ones are the easiest to detect in an analysis, BUT
their presence makes detection of the others difficult.
Models can adjust for the large QTL and so detect the others.
QTL Analysis: Conclusions
QTL mapping combines qualitative linkage analysis
with quantitative genetic analysis: it tests for association
between marker genotypes and phenotypic trait values.
Single marker analysis is easy to perform but QTL
effect and position are confounded. This results in
low power of QTL detection.
Interval mapping approaches increase power of QTL
detection and allow the estimation of QTL effects and
position.
QTL Analysis: Conclusions
Estimates of QTL effects and the proportion of the
genotypic variance explained by QTL are biased due to
genotypic and environmental sampling.
Estimates of QTL position show low precision.
With large populations a large number of QTL is found
for complex traits.
When conducting a QTL study you may wish to use a
large population size.
6. Candidate Genes
Functional Genomics Using Diversity
[Diagram: Forward Genetics and Reverse Genetics. Boxes: Trait, QTL, Positionally clone gene, Candidate gene, Candidate Polymorphism, Mutagenesis.]
Candidate Genes
[Diagram: inputs feeding into candidate genes: Comparative Genomics; Molecular & Expression Analyses; Biochemical Analyses, Physiology, Morphology; QTL Mapping (Positional Candidate Genes).]
Survey Diverse Races For:
1. Phenotype
2. Candidate Gene Sequence
3. Population History
Evolutionary Analysis and Association Tests
Identify Genes with Phenotypic Effects
Identification of More Favorable Alleles
Move Alleles into Elite Lines with Transgenics and Introgression
Enhanced Marker Assisted Breeding
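7. Linkage Disequilibrium Analysis
Properties of LD
• The basic measure of LD is:
$D_{AB} = P_{AB} - p_A p_B$
$(D_{AB} = -D_{Ab} = -D_{aB} = D_{ab})$

Haplotype frequencies:

         B                              b                              Total
A        $P_{AB} = p_A p_B + D_{AB}$    $P_{Ab} = p_A p_b - D_{AB}$    $p_A$
a        $P_{aB} = p_a p_B - D_{AB}$    $P_{ab} = p_a p_b + D_{AB}$    $p_a$
Total    $p_B$                          $p_b$                          1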
Linkage Disequilibrium versus Generations Since its Creation
$r_{AB} \propto (1 - c)^g$
[Figure: disequilibrium $r_{AB}$ (0 to 1) versus generation $g$ (0 to 500) for recombination rates c = 0.1, 0.02, 0.01, 0.005, and 0.001.]
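The decay curves can be tabulated directly from the $(1 - c)^g$ relationship. This sketch assumes disequilibrium starts at 1 in generation 0, as the plotted curves appear to, and ignores drift, selection, and mutation. The function names are hypothetical.

```python
import math


def ld_after(generations, c, initial=1.0):
    """Disequilibrium remaining after g generations at recombination rate c."""
    return initial * (1 - c) ** generations


def generations_to_halve(c):
    """Number of generations for disequilibrium to fall to half its starting value."""
    return math.log(0.5) / math.log(1 - c)


for c in (0.1, 0.02, 0.01, 0.005, 0.001):
    remaining = ld_after(100, c)
    half_life = generations_to_halve(c)
    print(f"c = {c}: LD after 100 generations = {remaining:.3f}, half-life = {half_life:.0f} generations")
```

Loosely linked loci (c = 0.1) lose essentially all disequilibrium within 100 generations, while tightly linked loci (c = 0.001) retain about 90% of it.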
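Other Measures of LD
Can divide $D_{AB}$ by the maximum value it can obtain:
$D'_{AB} = D_{AB} / \max(-p_A p_B, -p_a p_b)$ if $D_{AB} < 0$
$D'_{AB} = D_{AB} / \min(p_A p_b, p_a p_B)$ if $D_{AB} > 0$
The sampling properties of $D'_{AB}$ are not well understood.
$r^2_{AB} = D_{AB}^2 / (p_A p_B p_a p_b)$
$E(r^2) = 1 / (1 + 4Nc)$, where $N$ is the effective population size and $c$ the recombination rate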
LD generally decays rapidly with distance
[Figure: $r^2$ (0.00 to 1.00) versus distance in bp (0 to 10000) for the maize genes d8, id1, sh1, tb1, d3, fae2, su1, bt2, sh2, and wx1.]
Remington, D. L., et al. 2001. PNAS-USA 98:11479-11484, and unpublished.
Population Effect on Linkage Disequilibrium in Maize

Investigator    Population Studied    Extent of LD
Gaut            Landraces             <1000 bp
Buckler         Diverse Inbreds       2000 bp
Rafalski        Elite Lines           100 kb? (6 kb euchromatin?)
Reviewed in Flint-Garcia, S. A. et al. 2003. Annual
Review of Plant Biology 54:357-374.
8. Association Analysis
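Allele Case-Control Test

Marker allele counts:

              allele 1                   allele 2                   Total
Affected      $n_{1|\mathrm{aff}}$       $n_{2|\mathrm{aff}}$       $2 n_{\mathrm{aff}}$
Unaffected    $n_{1|\mathrm{unaff}}$     $n_{2|\mathrm{unaff}}$     $2 n_{\mathrm{unaff}}$
Total         $n_1$                      $n_2$                      $2N$ ($N$ individuals)

If $n_{\mathrm{aff}} = n_{\mathrm{unaff}}$:
$X^2 = \sum_i \frac{(n_{i|\mathrm{aff}} - n_{i|\mathrm{unaff}})^2}{n_{i|\mathrm{aff}} + n_{i|\mathrm{unaff}}} \sim \chi^2_{(k-1)}$ ($k$ alleles)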
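The statistic is simple enough to compute by hand. The sketch below implements the equal-sample-size form of the test given above; the allele counts are hypothetical, and the P value helper only covers the two-allele case (1 degree of freedom), where the chi-square tail reduces to the complementary error function.

```python
import math


def allele_case_control_chi2(aff_counts, unaff_counts):
    """Chi-square statistic and degrees of freedom for equal numbers of cases and controls."""
    if sum(aff_counts) != sum(unaff_counts):
        raise ValueError("this form of the test assumes n_aff = n_unaff")
    stat = sum(
        (a - u) ** 2 / (a + u)
        for a, u in zip(aff_counts, unaff_counts)
        if a + u > 0
    )
    return stat, len(aff_counts) - 1


def chi2_pvalue_df1(stat):
    """Upper-tail P value of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical allele counts for 50 affected and 50 unaffected diploid individuals.
affected = (60, 40)    # allele 1, allele 2
unaffected = (40, 60)
stat, df = allele_case_control_chi2(affected, unaffected)
p_value = chi2_pvalue_df1(stat)
print(f"X^2 = {stat:.1f}, df = {df}, P = {p_value:.4f}")
```

For more than two alleles the statistic is unchanged, but the P value has to come from a chi-square distribution with k - 1 degrees of freedom.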
Population Stratification: American Indian and Diabetes

Frequency of the Gm3;5,13,14 haplotype:

Population                                  +       -       NIDDM prevalence
Full heritage American Indian population    ~1%     ~99%    ≈ 40%
Caucasian population                        ~66%    ~34%    ≈ 15%

Study without knowledge of genetic background:

Gm3;5,13,14 haplotype    Cases    Controls
+                        7.8%     92.2%
-                        29.0%    71.0%
OR = 0.27, 95% CI = 0.18 to 0.40

Proportion with NIDDM by heritage and marker status:

Index of Indian Heritage    Gm3;5,13,14 +    Gm3;5,13,14 -
0                           17.8%            19.9%
4                           28.3%            28.8%
8                           35.9%            39.3%

Knowler 1988 Am J Hum Genet 43, 520-526.
Use SSR Markers to Estimate Population Structure
Method: Pritchard, J. K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945-59.
Example: Remington, D. L., et al. 2001. Proc Natl Acad Sci U S A 98:11479-11484.
[Figure: estimated % Stiff Stalk versus % Non-Stiff Stalk ancestry (0% to 100%) for each line; groups: 8 Stiff Stalk, 38 Non-Stiff Stalk, 30 Sub-Tropical.]
Logistic Regression Ratio Test For Association
• Adapted from Pritchard case-control approach
$\Lambda = \Pr_1(C; T, \hat{Q}) / \Pr_0(C; \hat{Q})$
• Where:
– C = candidate polymorphism distribution
– T = trait value
– Q = matrix of population membership
• Evaluated by logistic regression
• Significance evaluated by permutation based on haplotype distribution in populations
Pritchard, J. K., M. Stephens, N. A. Rosenberg, and P. Donnelly. 2000. Am J Hum Genet 67:170-181.
Population Structure Estimates Greatly Reduce Estimated Type I Error Rates
[Figure: SSR estimated Type I error rate (0.00 to 0.25) for flowering time and height in fields 1 to 4, comparing no population structure estimate, with population structure estimate, and population structure with rescaling.]
Su1
• Sugary1 is an isoamylase, a starch debranching enzyme
• Sequenced fully from 32 diverse lines
• Sampled 2 small parts of gene from 102 lines
Whitt, S. R., et al. 2002. PNAS-USA 99:12959-12962.
su1 Promoter & 1st Exon
[Figure: haplotype network for the su1 promoter and first exon, including the 64:DE polymorphism; lines classed as Sweet, Pop, and Dent + Flint.]
• Two distinct alleles
• Sweet phenotype not associated
su1 Coding Region
[Figure: haplotype network for the su1 coding region, with polymorphisms 578:WR, 163:FL, and 662:KE; lines classed as Sweet, Pop, and Dent + Flint.]
• Two distinct alleles
• Sweet phenotype associated with W578R
Su1
[Figure: haplotype network for Su1, with 578:WR, 163:FL, and 662:KE marked; lines classed as Sweet, Pop, and Dent + Flint.]
Based on survey of 12 kbp from 32-102 lines.
Dwarf8 functional variation
[Figure: Dwarf8 gene structure showing the 2 amino acid deletion, MITE, indel, and SH2 domain.]
When controlling for population structure, the variation associates with flowering time & plant height across 12 environments.
Thornsberry et al. 2001 Nat. Genet.
[Figure: days to silking relative to B73 (axis 0.6 to 1.8) by D8 SH2 variant.]
9. Type I and Type II Error
Statistics - Hypothesis Test

                                  Null Hypoth True     Null Hypoth False
Reject Null Hypothesis            Type I Error (α)     Correct
Fail to Reject Null Hypothesis    Correct              Type II Error (β)

Significance level (P-value threshold) = α
Power = 1 - β
Experimentwise P value
• Each statistical test has a Type I error rate
– Test 20 independent SNPs and, on average, one will be significant at P<0.05 by chance alone
• Bonferroni correction essentially divides the P threshold by the number of tests
– Often too conservative (little power), as markers are correlated
• Churchill and Doerge permutation testing helps estimate the experimentwise P
– Permutes the entire genotype relative to the phenotypes
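The points above can be illustrated numerically. This is a sketch, not the Churchill and Doerge implementation: the marker scores and phenotypes are simulated, the test statistic (absolute correlation between marker score and phenotype) is an assumption chosen for brevity, and all function names are hypothetical. The key step is that phenotypes are shuffled as a block against the intact genotype matrix, so correlations among markers are preserved.

```python
import random


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the experimentwise rate at or below alpha."""
    return alpha / n_tests


def abs_correlation(x, y):
    """Absolute Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5


def permutation_threshold(markers, phenotypes, n_perm=200, alpha=0.05, seed=1):
    """Experimentwise critical value: the (1 - alpha) quantile of the
    maximum statistic over all markers under shuffled phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the genotype-phenotype link only
        maxima.append(max(abs_correlation(m, shuffled) for m in markers))
    maxima.sort()
    return maxima[min(n_perm - 1, round((1 - alpha) * n_perm))]


print("expected false positives in 20 tests:", 20 * 0.05)
print("P(at least one false positive):", round(familywise_error(0.05, 20), 3))
print("Bonferroni per-test threshold:", bonferroni_threshold(0.05, 20))

# Simulated data: 20 markers scored 0/1/2 on 50 individuals, no true QTL.
rng = random.Random(0)
markers = [[rng.choice((0, 1, 2)) for _ in range(50)] for _ in range(20)]
phenotypes = [rng.gauss(190.0, 15.0) for _ in range(50)]
threshold = permutation_threshold(markers, phenotypes)
print("experimentwise critical |correlation|:", round(threshold, 3))
```

A marker is then declared significant only if its observed statistic exceeds the permutation threshold, which adapts automatically to the correlation among markers instead of assuming independence.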
Power of approaches
• Sample size
– 100 to 1000 are typical
• Heritability of trait
– H2 = 10% - 90%
– Depends on ability to measure trait
– Interactions with environment
• Depends on statistical properties of test
Association Approaches Complement QTL Linkage Mapping

                                Association     Linkage (RILs)
Resolution                      2000 bp         10,000,000 bp
Genome Scan                     Little Power    High Power
Allelic Range                   High (10s)      Low (1 or 2)
Statistical Power per Allele    Low             High