QTL Analysis: Concept
[Figure: workflow of a QTL study. Generations: Parents (crossed, ×), F1, F2, F2:3. Procedure: Laboratory (marker genotypes scored A, B, or H for markers 1..M on individuals 1..N), Field (plant height, PHT [cm], per individual: 210, 190, 203, 159, 206, ..., 171), Office (LOD score profile along chromosome 1). Alternatives: BC1, RIL, DHL.]

QTL Analysis: Single Marker Analysis
Mean plant height (cm) by marker class, X_MC:

Marker    Class    X_MC (cm)    F test
Total              196
umc157    AA       195
          Aa       197
          aa       195          F = 0.48 ns
umc130    AA       201
          Aa       196
          aa       191          F = 6.47**

QTL Analysis: Single Marker Model (F2)
Marker M/m and QTL Q/q are linked with recombination frequency r. Frequencies of the QTL genotypes within each marker class:

Marker class    QQ         Qq               qq         Marker class mean
MM              (1-r)^2    2r(1-r)          r^2        μ(MM)
Mm              r(1-r)     (1-r)^2 + r^2    r(1-r)     μ(Mm)
mm              r^2        2r(1-r)          (1-r)^2    μ(mm)
QTL mean        μ1         μ2               μ3

Additive effect: $(\mu(MM) - \mu(mm)) / 2 = a(1 - 2r)$
Dominance effect: $\mu(Mm) - (\mu(MM) + \mu(mm)) / 2 = d(1 - 2r)^2$
F tests on the contrasts of marker classes test the following hypotheses: a > 0, d > 0, r < 0.5
Schön, 2002

QTL Analysis: Single Marker Model (F2)
Example: Plant height, umc130
X(MM) = 201 cm, X(Mm) = 196 cm, X(mm) = 191 cm
Case 1: marker and QTL coincide (M-Q, m-q; r = 0). Case 2: marker and QTL are linked with r = 0.2.

           Add. effect    PHT (cm)
                          X(QQ)    X(Qq)    X(qq)
r = 0      5.0            201.0    196.0    191.0
r = 0.2    8.3            204.3    196.0    187.7
r = 0.4    25.0           221.0    196.0    171.0

4. Association Analysis Concepts

Dissecting A Quantitative Trait: Time Versus Resolution
[Figure: research time in years (1 to 5) plotted against resolution in bp (1 to 1x10^7) for Positional Cloning, NILs, RI QTL Mapping, F2 QTL Mapping, and Associations.]

Resolution Versus Allelic Range
[Figure: alleles evaluated (1 to >40) plotted against resolution in bp (1 to 1x10^7) for Associations In Diverse Germplasm, Associations In Narrow Germplasm, Positional Cloning, NIL, Pedigree, and F2 or RIL Mapping.]

Association Tests
• Evaluate whether nucleotide polymorphisms associate with phenotype
• Natural populations
• Exploit extensive recombination

Haplotype    Phenotype
A C G A G    1.3 m
A C G A T    1.4 m
A T A A G    1.5 m
C T A G T    1.8 m
A T G G T    2.0 m
A T G G G    2.0 m

Association mapping
• Mainstay of human genetics
  – One of a few possible approaches
  – Reproducibility was an issue
• Cystic fibrosis
  – Kerem, et al. (1989). Science 245, 1073-1080.
• Alzheimer's disease
  – Corder et al. (1994). Nature Genet. 7, 180-184.

Associations may result from at least three causes
1. The locus is the cause of the phenotype
2. The locus is in linkage disequilibrium with the cause of the phenotype
   Linked and highly correlated

Complete Linkage Disequilibrium
Same mutational history and no recombination. No resolution.
[Figure: haplotypes at Locus 1 and Locus 2, with haplotype counts 6 and 6.] D' = 1, r^2 = 1
Adapted from Rafalski (2002) Curr Opin Plant Biol 5:94-100.

Linkage Disequilibrium
Different mutational history and no recombination. Some resolution.
[Figure: haplotypes at Locus 1 and Locus 2, with haplotype counts 3, 3, and 6.] D' = 1, r^2 = 0.33

Linkage Equilibrium
Same mutational history with recombination. Resolution.
[Figure: haplotypes at Locus 1 and Locus 2, with haplotype counts 3, 3, 3, and 3.] D' = 0, r^2 = 0

3. Population structure can produce associations
[Figure: a G/T marker in U.S. and Andes samples; panels for Plant Height and Kernel Hue by marker allele, with reported significance levels P << 0.001 and P = 0.04.]
These non-functional associations can be accounted for by estimating the population structure using random markers.

5. QTL mapping analysis

QTL Analysis: Interval Mapping
[Figure: a QTL (Q/q) located between flanking markers M1/m1 and M2/m2, with recombination frequencies r1, r2, and r.]
• Simple Interval Mapping
• Composite Interval Mapping
[Figure: PlabQTL LOD plot along the chromosome, 0 to 150 cM; LOD peak of 4.7 at 96 cM.]

QTL Analysis: Power of QTL detection
Power: probability of finding a QTL
Heritability: $h^2 = \sigma_g^2 / \sigma_p^2$
[Figure: power (%) plotted against heritability (0.4 to 1.0) for N = 100, N = 300, and N = 600.]
Utz and Melchinger, 1994

QTL Analysis: Conclusions
• A trait is usually controlled by a number of QTL; in an analysis the largest ones are the easiest to detect
• BUT their presence makes detection of the others difficult
• Models can adjust for this and detect the others

QTL Analysis: Conclusions
• QTL mapping combines qualitative linkage analysis with quantitative genetic analysis.
  – Association between marker genotypes and phenotypic trait values.
• Single marker analysis is easy to perform, but QTL effect and position are confounded. This results in low power of QTL detection.
• Interval mapping approaches increase the power of QTL detection and allow the estimation of QTL effects and position.

QTL Analysis: Conclusions
• Estimates of QTL effects and of the proportion of the genotypic variance explained by QTL are biased due to genotypic and environmental sampling.
• Estimates of QTL position show low precision.
• With large populations a large number of QTL is found for complex traits.
• When conducting a QTL study you may wish to use a large population size.

6. Candidate Genes

Functional Genomics Using Diversity
[Figure: two routes from trait to gene. Forward Genetics: Trait, QTL, Positionally clone gene, Candidate gene. Reverse Genetics: Trait, Candidate Polymorphism.]

Candidate Genes
[Figure: sources of candidate genes: Mutagenesis, Comparative Genomics, Molecular & Expression, Biochemical Analyses, Physiology, Morphology, and QTL Mapping (Positional Candidate Genes).]

Survey Diverse Races For:
1. Phenotype
2. Candidate Gene Sequence
3. Population History
[Figure: flow from the survey through Evolutionary Analysis and Association Tests to: Identify Genes with Phenotypic Effects; Identification of More Favorable Alleles; Move Alleles into Elite Lines with Transgenics and Introgression; Enhanced Marker Assisted Breeding.]

7. Linkage Disequilibrium Analysis

Properties of LD
The basic measure of LD is:
$D_{AB} = P_{AB} - p_A p_B$    ($D_{AB} = -D_{Ab} = -D_{aB} = D_{ab}$)

         B                           b                           Total
A        P_AB = p_A p_B + D_AB       P_Ab = p_A p_b - D_AB       p_A
a        P_aB = p_a p_B - D_AB       P_ab = p_a p_b + D_AB       p_a
Total    p_B                         p_b                         1

Linkage Disequilibrium versus Generations Since its Creation
[Figure: disequilibrium, r_AB (0 to 1), plotted against generation, g (0 to 500), following (1 - c)^g for recombination rates c = 0.1, 0.02, 0.01, 0.005, and 0.001.]

Other Measures of LD
Can divide D_AB by the maximum value it can obtain:
$D'_{AB} = D_{AB} / \max(-p_A p_B, -p_a p_b)$ if $D_{AB} < 0$
$D'_{AB} = D_{AB} / \min(p_A p_b, p_a p_B)$ if $D_{AB} > 0$
The sampling properties of D'_AB are not well understood.
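The definitions of D_AB and D'_AB above, and the (1 - c)^g decay shown in the generations figure, can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names are hypothetical, the haplotype frequencies in the usage lines are made-up inputs, and applying the (1 - c)^g factor directly to a starting disequilibrium value is an assumption about how the plotted curves were generated.

```python
def ld_d(p_ab_haplotype, p_a, p_b):
    """Basic LD measure: D_AB = P_AB - p_A * p_B."""
    return p_ab_haplotype - p_a * p_b


def ld_d_prime(d, p_a, p_b):
    """Scale D_AB by the most extreme value it can take for these allele frequencies."""
    q_a, q_b = 1.0 - p_a, 1.0 - p_b  # frequencies of alleles a and b
    if d == 0:
        return 0.0
    if d < 0:
        bound = max(-p_a * p_b, -q_a * q_b)
    else:
        bound = min(p_a * q_b, q_a * p_b)
    return d / bound


def ld_after_generations(initial, c, g):
    """Assumed decay: disequilibrium shrinks by a factor (1 - c) each generation."""
    return initial * (1.0 - c) ** g


# Hypothetical input: P_AB = 0.5, p_A = 0.5, p_B = 0.75
d = ld_d(0.5, 0.5, 0.75)
d_prime = ld_d_prime(d, 0.5, 0.75)
remaining = ld_after_generations(1.0, 0.01, 100)
```

With these inputs D_AB is 0.125 and D'_AB is 1, because haplotype Ab is absent and D_AB therefore sits at its upper bound.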
$r^2_{AB} = D^2_{AB} / (p_A \, p_a \, p_B \, p_b)$
$E(r^2) = 1 / (1 + 4Nc)$

LD generally decays rapidly with distance
[Figure: r^2 (0 to 1) plotted against distance in bp (0 to 10,000) for the genes d8, id1, sh1, tb1, d3, fae2, su1, bt2, sh2, and wx1.]
Remington, D. L., et al. 2001. PNAS-USA 98:11479-11484. & unpublished

Population Effect on Linkage Disequilibrium in Maize

Investigator    Population Studied    Extent of LD
Gaut            Landraces             <1000 bp
Buckler         Diverse Inbreds       2000 bp
Rafalski        Elite Lines           100 kb? (6 kb euchromatin?)

Reviewed in Flint-Garcia, S. A. et al. 2003. Annual Review of Plant Biology 54:357-374.

8. Association Analysis

Allele Case-Control Test

Marker        allele 1      allele 2      Total
Affected      n_1|aff       n_2|aff       2 n_aff
Unaffected    n_1|unaff     n_2|unaff     2 n_unaff
Total         n_1           n_2           2N (N individuals)

If n_aff = n_unaff:
$X^2 = \sum_i \frac{(n_{i|aff} - n_{i|unaff})^2}{n_{i|aff} + n_{i|unaff}} \sim \chi^2_{(k-1)}$  (k alleles)

Population Stratification: American Indian and Diabetes

Population                                 Gm3;5,13,14 +    Gm3;5,13,14 -    NIDDM prevalence
Full heritage American Indian Population   ~1%              ~99%             40%
Caucasian Population                       ~66%             ~34%             15%

Study without knowledge of genetic background:

Gm3;5,13,14 haplotype    Cases    Controls
+                        7.8%     92.2%
-                        29.0%    71.0%
OR = 0.27, 95% CI = 0.18 to 0.40

Proportion with NIDDM by heritage and marker status

Index of Indian Heritage    Gm3;5,13,14 +    Gm3;5,13,14 -
0                           17.8%            19.9%
4                           28.3%            28.8%
8                           35.9%            39.3%

Knowler 1988 Am J Hum Genet 43, 520-526.

Use SSR Markers to Estimate Population Structure
Method: Pritchard, J. K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945-59.
[Figure: estimated population membership of the lines, plotted as % Stiff Stalk against % Non-Stiff Stalk (0% to 100%); groups labeled 8 Stiff Stalk, 38 Non-Stiff Stalk, and 30 Sub-Tropical.]
Example: Remington, D. L., et al. 2001. Proc Natl Acad Sci U S A 98:11479-11484.
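Returning to the allele case-control test in this section, the statistic for the balanced case (n_aff = n_unaff) can be sketched as follows. This is an illustrative sketch only: the function names and the allele counts are hypothetical, and the P-value helper covers just the two-allele case (1 degree of freedom), using the identity P = erfc(sqrt(X^2 / 2)) for a chi-square variable with 1 df.

```python
import math


def allele_case_control_chi2(affected_counts, unaffected_counts):
    """X^2 = sum_i (n_i|aff - n_i|unaff)^2 / (n_i|aff + n_i|unaff).

    Each argument lists allele counts (one entry per allele).
    The formula assumes equal numbers of affected and unaffected individuals.
    Returns the statistic and its degrees of freedom, k - 1.
    """
    if len(affected_counts) != len(unaffected_counts):
        raise ValueError("both groups must list the same alleles")
    if sum(affected_counts) != sum(unaffected_counts):
        raise ValueError("this form of the test assumes n_aff = n_unaff")
    chi2 = 0.0
    for n_aff, n_unaff in zip(affected_counts, unaffected_counts):
        total = n_aff + n_unaff
        if total > 0:  # skip alleles not observed in either group
            chi2 += (n_aff - n_unaff) ** 2 / total
    return chi2, len(affected_counts) - 1


def chi2_pvalue_df1(chi2):
    """Upper-tail P value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: 100 affected and 100 unaffected individuals (200 alleles each)
chi2, df = allele_case_control_chi2([60, 140], [100, 100])
p_value = chi2_pvalue_df1(chi2)
```

For these made-up counts the statistic is 1600/160 + 1600/240, about 16.67 on 1 df. As the stratification example shows, a significant value by itself does not separate a functional association from one caused by population structure.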
Logistic Regression Ratio Test For Association
• Adapted from Pritchard case-control approach
• Test statistic: $\Pr_1(C; T, \hat{Q}) \, / \, \Pr_0(C; \hat{Q})$, where:
  – C = candidate polymorphism distribution
  – T = trait value
  – Q = matrix of population membership
• Evaluated by logistic regression
• Significance evaluated by permutation based on haplotype distribution in populations
Pritchard, J. K., M. Stephens, N. A. Rosenberg, and P. Donnelly. 2000. Am J Hum Genet 67:170-181.

Population Structure Estimates Greatly Reduce Estimated Type I Error Rates
[Figure: SSR-estimated Type I error rate (0.00 to 0.25) for Flowering Time and Height in fields 1 to 4; series: No Pop. Structure Estimate, With Pop. Structure Estimate, Pop. Structure with Rescaling.]

Su1
• Sugary1 is an isoamylase, a starch debranching enzyme
• Sequenced fully from 32 diverse lines
• Sampled 2 small parts of gene from 102 lines
Whitt, S. R., et al. 2002. PNAS-USA 99:12959-12962.

su1 Promoter & 1st Exon
[Figure: haplotype diagram; labeled polymorphism 64:DE; groups: Sweet, Pop, Dent + Flint.]
• Two distinct alleles
• Sweet phenotype not associated

su1 Coding Region
[Figure: haplotype diagram; labeled polymorphisms 578:WR, 163:FL, 662:KE; groups: Sweet, Pop, Dent + Flint.]
• Two distinct alleles
• Sweet phenotype associated with W578R

Su1
[Figure: coding-region haplotype diagram with 578:WR marked; groups: Sweet, Pop, Dent + Flint.]
Based on survey of 12 kbp from 32-102 lines.

Dwarf8 functional variation
[Figure: Dwarf8 gene with the 2 Amino Acid Deletion, MITE, Indel, and SH2 Domain marked; days to silking relative to B73 (0.6 to 1.8) by D8 SH2 variant.]
When controlling for population structure, associates with flowering time & plant height across 12 environments.
Thornsberry et al. 2001 Nat. Genet.

9. Type I and Type II Error

Statistics - Hypothesis Test

                                 Null Hypothesis True    Null Hypothesis False
Reject Null Hypothesis           Type I Error (α)        Correct
Fail to Reject Null Hypothesis   Correct                 Type II Error (β)

P-value threshold = α
Power = 1 - β

Experimentwise P value
• Each statistical test has a Type I error rate
  – Test 20 independent SNPs, and on average one is expected to be significant at P<0.05 by chance
• Bonferroni correction essentially divides the P threshold by the number of tests
  – Often too conservative (no power), as markers are correlated
• Churchill and Doerge permutation helps estimate the experimentwise P
  – Permutes the entire genotype relative to the phenotypes

Power of approaches
• Sample size
  – 100 to 1000 are typical
• Heritability of trait
  – H2 = 10% - 90%
  – Depends on ability to measure trait
  – Interactions with environment
• Depends on statistical properties of test

Association Approaches Complement QTL Linkage Mapping

                               Association     Linkage (RILs)
Resolution                     2000 bp         10,000,000 bp
Genome Scan                    Little Power    High Power
Allelic Range                  High (10s)      Low (1 or 2)
Statistical Power per Allele   Low             High
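The experimentwise P value points above can be illustrated with a short Python sketch. Everything here is an assumption made for illustration rather than the Churchill and Doerge procedure as published: the function names are hypothetical, markers are coded 0/1, the per-marker statistic is a simple difference in class means, and the marker and plant height values are made up. What it does keep from the slide is the key step: the phenotypes are permuted relative to the entire multi-marker genotype, so correlation among markers is preserved.

```python
import random


def expected_false_positives(alpha, n_tests):
    """Expected number of tests significant by chance when every null is true."""
    return alpha * n_tests


def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the experimentwise rate at or below alpha."""
    return alpha / n_tests


def class_mean_difference(genotypes, phenotypes):
    """Absolute difference in mean phenotype between marker classes 0 and 1."""
    class0 = [y for g, y in zip(genotypes, phenotypes) if g == 0]
    class1 = [y for g, y in zip(genotypes, phenotypes) if g == 1]
    if not class0 or not class1:
        return 0.0
    return abs(sum(class1) / len(class1) - sum(class0) / len(class0))


def permutation_threshold(markers, phenotypes, alpha=0.05, n_perm=1000, seed=0):
    """Experimentwise threshold from the null distribution of the maximum statistic.

    markers: one genotype list per marker, all in the same individual order.
    Shuffling the phenotype vector keeps each individual's genotype intact.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_stats.append(max(class_mean_difference(m, shuffled) for m in markers))
    max_stats.sort()
    index = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return max_stats[index]


if __name__ == "__main__":
    # Hypothetical data: 2 markers scored on 8 plants, plant height in cm
    markers = [[0, 0, 0, 1, 1, 1, 0, 1], [0, 1, 0, 1, 0, 1, 0, 1]]
    heights = [171, 159, 190, 203, 206, 210, 180, 196]
    print(expected_false_positives(0.05, 20))
    print(familywise_error_rate(0.05, 20))
    print(bonferroni_threshold(0.05, 20))
    print(permutation_threshold(markers, heights))
```

An observed marker statistic is declared experimentwise significant only if it exceeds the permutation threshold. Because the threshold comes from the maximum over all markers, it adapts to marker correlation instead of assuming independent tests as the Bonferroni division does.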