Download 1 - r - Barley World

Basic QTL Analysis

Is there an association between marker genotype and quantitative trait phenotype?
- Classify progeny by marker genotype
- Compare phenotypic means between classes (t-test or ANOVA)
- Significance = marker linked to QTL
- Difference between means = estimate of QTL effect

g = genotypic effect
µ1 = trait mean for genotypic class AA
µ2 = trait mean for genotypic class aa
g = (µ1 - µ2)/2

[Figure: trait value y plotted against genotypic classes (aa, AA) on the x axis; β0 marks the intercept.]

Notations for single-QTL models in backcross and F2 populations

Model                 Genotype   Value   Genetic effect
Backcross (Qq x QQ)   QQ         µ1      g = 0.5(µ1 - µ2)
                      Qq         µ2
DH (qq x QQ)          QQ         µ1      g = 0.5(µ1 - µ2)
                      qq         µ2
F2 (Qq x Qq)          QQ         µ1      Additive a = 0.5(µ1 - µ3)
                      Qq         µ2      Dominance d = 0.5(2µ2 - µ1 - µ3)
                      qq         µ3

Single-marker analysis

• How it works
  – Finds associations between marker genotype and trait value

$$ y_j = \mu + f(A) + \varepsilon_j $$

  A (marker) --- r --- Q (putative QTL)

• When to use
  – Order of markers unknown or incomplete maps
  – Quick scan
  – Find best possible QTLs
  – Identify missing or incorrectly formatted data
• Limitations
  – Underestimates QTL number and effects
  – QTL position cannot be precisely determined

r = recombination fraction
yj = trait value for the jth individual in the population
μ = population mean
f(A) = function of marker genotype
εj = residual associated with the jth individual

Single-marker analysis in backcross progeny

• Parents: AAQQ x aaqq
• Backcross: aaqq x AaQq, or AaQq x AAQQ
• BC progeny:

aaqq x AaQq   AaQq x AAQQ   Expected frequency
AaQq          AAQQ          0.5(1 - r)
Aaqq          AAQq          0.5r
aaQq          AaQQ          0.5r
aaqq          AaQq          0.5(1 - r)

r is the recombination frequency between A and Q

Expected QTL genotypic frequencies conditional on marker genotypes

                        Marker     Observed   Marginal     QTL genotype                Expected trait value
                        genotype   count      frequency    QQ           Qq
Joint frequency         AA         n1         0.5          0.5(1 - r)   0.5r
                        Aa         n2         0.5          0.5r         0.5(1 - r)
Conditional frequency   AA         n1         0.5          1 - r        r              (1 - r)µ1 + rµ2
                        Aa         n2         0.5          r            1 - r          rµ1 + (1 - r)µ2

Single-marker analysis

  A (marker) --- r --- Q (putative QTL)

- Simple t-test
- Analysis of variance
- Linear regression
- Likelihood
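As a rough illustration of the conditional frequencies tabulated for backcross progeny, the sketch below computes the expected trait mean of each marker class from µ1, µ2 and r. The function name and the numeric values are made up for illustration and are not part of the original slides. It shows that the difference between marker-class means shrinks by the factor (1 - 2r) and vanishes at r = 0.5.

```python
def expected_marker_class_means(mu1, mu2, r):
    """Expected trait means of the two backcross marker classes.

    mu1, mu2: trait means of QTL genotypes QQ and Qq.
    r: recombination fraction between marker A and QTL Q (0 <= r <= 0.5).
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie in [0, 0.5]")
    # Conditional QTL frequencies given the marker genotype:
    # P(QQ | AA) = 1 - r, P(Qq | AA) = r
    mean_AA = (1 - r) * mu1 + r * mu2
    # P(QQ | Aa) = r, P(Qq | Aa) = 1 - r
    mean_Aa = r * mu1 + (1 - r) * mu2
    return mean_AA, mean_Aa


if __name__ == "__main__":
    # Hypothetical QTL genotype means, chosen only for illustration.
    for r in (0.0, 0.1, 0.3, 0.5):
        m_AA, m_Aa = expected_marker_class_means(mu1=10.0, mu2=6.0, r=r)
        print(f"r={r:.1f}  E[y|AA]={m_AA:.2f}  E[y|Aa]={m_Aa:.2f}  "
              f"difference={m_AA - m_Aa:.2f}")
```

Because the observable difference is (1 - 2r)(µ1 - µ2), a single marker cannot separate a small QTL effect at tight linkage from a large effect at loose linkage, which is the limitation the slides list.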
Simple t-test using backcross progeny

H0: [μAa - μaa] = 0, i.e. (a + d) = 0 or r = 0.5

$$ Y_{j(i)k} = \mu + M_i + g(M)_{j(i)} + e_{j(i)k} $$

With the pooled within-class variance:

$$ t_M = \frac{\hat{\mu}_{Aa} - \hat{\mu}_{aa}}{\sqrt{\hat{s}_M^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} $$

With separate class variances:

$$ t_M = \frac{\hat{\mu}_{Aa} - \hat{\mu}_{aa}}{\sqrt{\frac{\hat{s}_{Aa}^2}{n_1} + \frac{\hat{s}_{aa}^2}{n_2}}} $$

t-distribution with df = N - 2

Yj(i)k = trait value for individual j with genotype i in replication k
μ = population mean
Mi = effect of the marker genotype
g(M)j(i) = genotypic effect which cannot be explained by the marker genotype
ej(i)k = error term
µAa = trait mean for genotypic class Aa
µaa = trait mean for genotypic class aa
s2M = pooled variance within the two classes
n1, n2 = number of individuals in classes Aa and aa

If tM is significant, then a QTL is declared to be near the marker.

Analysis of variance using backcross progeny

H0: [μAa - μaa] = 0, i.e. (a + d) = 0 or r = 0.5

Source           df          MS (Mean Square)   Expected MS
Total Genetics   N - 1       MSG                $\sigma_e^2 + b\,\sigma_G^2$
Marker           1           MSM                $\sigma_e^2 + b\,[\sigma_{G(QTL)}^2 + 4r(1-r)a^2] + bc(1-2r)^2 a^2$
G(Marker)        N - 2       MSG(M)             $\sigma_e^2 + b\,[\sigma_{G(QTL)}^2 + 4r(1-r)a^2]$
Residual         N(b - 1)    MSE                $\sigma_e^2$

$$ F = \frac{MSM}{MSG(M)} $$

F-distribution with 1 and N - 2 df
If F is significant, then a QTL is declared to be near the marker.
F = t² if the df for the numerator is 1

N = no. of individuals in the population
b = no. of replications
r = recombination fraction

Analysis of variance using SAS (a simple example)

    data a;
      /* Marker1 and Marker2 hold character genotype codes, hence the $ */
      input Individuals Trait1 Marker1 $ Marker2 $;
      cards;
    1 1.57 A B
    2 1.35 B A
    3 10.7 B B
    …
    ;
    proc glm;
      class Marker1 Marker2;
      model Trait1 = Marker1 Marker2;
      lsmeans Marker1 Marker2;
    run;

Linear regression using backcross progeny

$$ y_j = \beta_0 + \beta_1 x_j + \varepsilon_j $$

H0: [μAa - μaa] = 0, i.e. (a + d) = 0 or r = 0.5
R2: percent of the phenotypic variance explained by the QTL

yj = trait value for the jth individual
xj = dummy variable
β0 = intercept for the regression
β1 = slope for the regression
εj = random error

Dummy variables: aa = -1, Aa = 1

Expectations:
E(β0) = 0.5(µAa + µaa) = mean for the trait
E(β1) = 0.5(1 - 2r)(µAa - µaa) = (1 - 2r)g = 0.5(a + d)(1 - 2r)

[Figure: regression line of y on the dummy variable x for genotypic classes aa (x = -1) and Aa (x = 1), with intercept β0 and slope β1.]

Linear regression using backcross progeny

Interpretation of results depends on the coding of the dummy variables.

Panel 1: y = 3 + x + e; µ = 3, µAa = 4, µaa = 2, g = 0.5(µAa - µaa) = 1
Panel 2: y = 3 - x + e; µ = 3, µAa = 2, µaa = 4, g = 0.5(µAa - µaa) = -1

[Figure: two panels plotting y against the dummy variable x for genotypic classes aa and Aa.]

A likelihood approach using backcross progeny

Joint distribution function:

$$ L = \left( \frac{1}{\sqrt{2\pi\sigma^2}} \right)^{N} \prod_{i=1}^{N} \sum_{j=1}^{2} p(Q_j \mid M_i) \exp\left[ -\frac{(y_i - \mu_j)^2}{2\sigma^2} \right] $$

A likelihood approach using backcross progeny (cont.)

$$ \ln L(\mu_1, \mu_2, \sigma^2, r) = \sum_{i=1}^{N} \ln \left\{ \sum_{j=1}^{2} p(Q_j \mid M_i) \exp\left[ -\frac{(y_i - \mu_j)^2}{2\sigma^2} \right] \right\} - \frac{N}{2} \ln(2\pi\sigma^2) $$

$$ \ln L(\mu_1 = \mu_2 = \mu) = -\frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - \mu)^2 - \frac{N}{2} \ln(2\pi\sigma^2) $$

$$ \ln L(r = 0.5) = \sum_{i=1}^{N} \ln \left\{ \frac{1}{2} \exp\left[ -\frac{(y_i - \mu_1)^2}{2\sigma^2} \right] + \frac{1}{2} \exp\left[ -\frac{(y_i - \mu_2)^2}{2\sigma^2} \right] \right\} - \frac{N}{2} \ln(2\pi\sigma^2) $$

A likelihood approach using backcross progeny (cont.)
(Weller, 1986)

G-statistics: likelihood ratio test statistics (LR)

H0: [μAa - μaa] = 0, i.e. (a + d) = 0 or r = 0.5

$$ G = 2 \left[ \ln L(\hat{\mu}_{Aa}, \hat{\mu}_{aa}, \hat{\sigma}^2, \hat{r}) - \ln L(r = 0.5) \right] $$

Here L(r = 0.5) is the probability of occurrence of the data under the null hypothesis.
G is distributed asymptotically as a chi-square variable with one degree of freedom.

$$ G = 2 \left[ \ln L(\hat{\mu}_{Aa}, \hat{\mu}_{aa}, \hat{\sigma}^2, \hat{r}) - \ln L(\mu_{Aa} = \mu_{aa} = \mu) \right] $$

The t-test is approximately equivalent to the likelihood ratio test using this second formula.

G-statistics and LOD score

LOD: logarithm of the odds ratio, i.e. the base-10 logarithm of the likelihood ratio that G is built from
LR = 2 ln(10) × LOD = 4.605 LOD
LOD = 0.217 LR

- LOD is interpreted as an odds ratio (probability of observing the data under linkage / probability of observing the same data under no linkage)
- No theoretical distribution is needed to interpret a LOD score
- Key value: ≥ 3 (H1 is 1000 times more likely than H0, no linkage) (approx: p = 0.001)

p = probability of type I error
Type I error: false positive (declaring a QTL when there is no QTL)

Single-marker analysis: Summary

• Identify marker-trait associations
• Identify missing or incorrectly formatted data
• Genetic map is not required
• Divide the population into subpopulations based on the allelic segregation of individual loci (one marker at a time)
• Get trait means for each subpopulation (genotypic class)
• Determine if the subpopulations' trait means are significantly different
• Limitations
  – Underestimates QTL number and effects
  – QTL position cannot be precisely determined
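The single-marker tests summarized above can be tried out on a small example. The following is a minimal sketch, assuming one trait value per individual (no replication, b = 1) and the dummy coding Aa = +1, aa = -1. The function names and the trait values are invented for illustration and do not come from the slides. It computes the pooled t statistic, the one-way ANOVA F (which equals t² when the numerator df is 1), the regression intercept and slope, and the LR/LOD conversion.

```python
import math
from statistics import mean, variance


def pooled_t(y_Aa, y_aa):
    """Two-sample t statistic with pooled within-class variance, df = N - 2."""
    n1, n2 = len(y_Aa), len(y_aa)
    s2 = ((n1 - 1) * variance(y_Aa) + (n2 - 1) * variance(y_aa)) / (n1 + n2 - 2)
    t = (mean(y_Aa) - mean(y_aa)) / math.sqrt(s2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def anova_f(y_Aa, y_aa):
    """One-way ANOVA F for the marker effect (1 and N - 2 df)."""
    n1, n2 = len(y_Aa), len(y_aa)
    grand = mean(list(y_Aa) + list(y_aa))
    # Marker mean square has 1 df, so it equals the between-class sum of squares.
    ms_marker = n1 * (mean(y_Aa) - grand) ** 2 + n2 * (mean(y_aa) - grand) ** 2
    ms_within = ((n1 - 1) * variance(y_Aa) + (n2 - 1) * variance(y_aa)) / (n1 + n2 - 2)
    return ms_marker / ms_within


def regression_coefficients(y_Aa, y_aa):
    """Least-squares intercept b0 and slope b1 with coding Aa = +1, aa = -1."""
    x = [1.0] * len(y_Aa) + [-1.0] * len(y_aa)
    y = list(y_Aa) + list(y_aa)
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def lod_to_lr(lod):
    """LR = 2 ln(10) * LOD, about 4.605 * LOD."""
    return 2 * math.log(10) * lod


def lr_to_lod(lr):
    """LOD = LR / (2 ln(10)), about 0.217 * LR."""
    return lr / (2 * math.log(10))


if __name__ == "__main__":
    # Made-up trait values for the two marker classes (illustration only).
    y_Aa = [4.2, 3.8, 4.5, 3.9, 4.1]
    y_aa = [2.1, 1.8, 2.4, 1.9, 2.3]
    t, df = pooled_t(y_Aa, y_aa)
    b0, b1 = regression_coefficients(y_Aa, y_aa)
    print(f"t = {t:.3f} on {df} df, F = {anova_f(y_Aa, y_aa):.3f}, t^2 = {t * t:.3f}")
    print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
    print(f"LOD 3 corresponds to LR = {lod_to_lr(3.0):.3f}")
```

With this coding the fitted slope is half the difference between the class means, so swapping which class is coded +1 flips the sign of the slope without changing the test.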
