Linkage Analysis: An Introduction
Pak Sham
Twin Workshop 2001

Linkage Mapping
• Compares the inheritance pattern of a trait with the inheritance pattern of chromosomal regions
• First gene mapping in 1913 (Sturtevant)
• Uses naturally occurring DNA variation (polymorphisms) as genetic markers
• >400 Mendelian (single-gene) disorders mapped
• Current challenge is to map QTLs

Linkage = Co-segregation
[Figure: pedigree annotated with marker genotypes A3A4, A1A2, A1A3, A1A2, A1A4, A2A4, A3A4, A2A3, A3A2]
Marker allele A1 cosegregates with dominant disease

Recombination
Parental genotypes: A1 Q1 / A2 Q2
• Likely gametes (non-recombinants): A1 Q1, A2 Q2
• Unlikely gametes (recombinants): A1 Q2, A2 Q1

Recombination of three linked loci
Recombination fractions $\theta_1$ and $\theta_2$ in the two intervals; gamete probabilities:
• $(1-\theta_1)(1-\theta_2)$: no recombination in either interval
• $(1-\theta_1)\theta_2$: recombination in interval 2 only
• $\theta_1(1-\theta_2)$: recombination in interval 1 only
• $\theta_1\theta_2$: recombination in both intervals

Map distance
• Map distance between two loci (Morgans) = expected number of crossovers per meiosis
• Note: map distances are additive

Recombination & map distance
[Figure: recombination fraction (0 to 0.5) plotted against map distance (0 to 1 M)]
Haldane map function: $\theta = \frac{1 - e^{-2m}}{2}$

Methods of Linkage Analysis
• Model-based lod scores
    Assumes explicit trait model
• Model-free allele sharing methods
    Affected sib pairs
    Affected pedigree members
• Quantitative trait loci
    Variance-components models

Double Backcross: Fully Informative Gametes
Grandparents: aabb x AABB
Parents: AaBb x aabb
Offspring:
• Non-recombinant: AaBb, aabb
• Recombinant: Aabb, aaBb

Linkage Analysis: Fully Informative Gametes
Count data
• Recombinant gametes: R
• Non-recombinant gametes: N
Parameter
• Recombination fraction: $\theta$
Likelihood
    $L(\theta) = \theta^{R} (1-\theta)^{N}$
Parameter estimate
    $\hat{\theta} = \frac{R}{N + R}$
Chi-square (natural logarithms)
    $\chi^2 = 2\left[ R \log\hat{\theta} + N \log(1-\hat{\theta}) - (R+N)\log(0.5) \right]$

Phase Unknown Meioses
Parents: AaBb x aabb
Offspring: AaBb, aabb, Aabb, aaBb
• Either: AaBb, aabb non-recombinant and Aabb, aaBb recombinant
• Or: AaBb, aabb recombinant and Aabb, aaBb non-recombinant

Linkage Analysis: Phase-unknown Meioses
Count data
• Either recombinant gametes: X, non-recombinant gametes: Y
• Or recombinant gametes: Y, non-recombinant gametes: X
Likelihood
    $L(\theta) = \theta^{X}(1-\theta)^{Y} + \theta^{Y}(1-\theta)^{X}$
An example of incomplete data: mixture distribution likelihood function

Parental genotypes unknown
Offspring: AaBb, aabb, Aabb, aaBb
Likelihood will be a function of
• allele frequencies (population parameters)
• $\theta$ (transmission parameter)

Trait phenotypes
Penetrance parameters

Phenotype   Genotype
            AA        Aa        aa
Disease     f2        f1        f0
Normal      1 - f2    1 - f1    1 - f0

Each phenotype is compatible with multiple genotypes.

General Pedigree Likelihood
Likelihood is a sum of products (mixture distribution likelihood):
    $L = \sum_{G} \prod_{i=1}^{n} \mathrm{pen}(x_i \mid g_i) \prod_{i=1}^{f} \mathrm{pop}(g_i) \prod_{i=f+1}^{n} \mathrm{trans}(g_i \mid g_{if}, g_{im})$
Number of terms = $(m_1 m_2 \cdots m_k)^{2n}$, where $m_j$ is the number of alleles at locus $j$

Elston-Stewart algorithm
Reduces computations by peeling:
Step 1: Condition likelihoods of family 1 on genotype of X.
[Figure: pedigree with families labelled 1 and 2 and individual X]
Step 2: Joint likelihood of families 2 and 1

Lod Score: Morton (1955)
    $\mathrm{Lod} = \log_{10} \frac{L(\theta)}{L(\theta = 0.5)}$
• Lod > 3: conclude linkage
    Prior odds of linkage 1:50, likelihood ratio 1000, posterior odds 20:1
• Lod < -2: exclude linkage

Linkage Analysis: Admixture Test
Model
• Probability of linkage in family = $\alpha$
Likelihood
    $L(\alpha, \theta) = \alpha L(\theta) + (1-\alpha) L(\theta = 1/2)$

Allele sharing (non-parametric) methods
Penrose (1935): Sib pair linkage
For rare disease, IBD sharing by sib-pair type:
• Concordant affected
• Concordant normal
• Discordant
[Table: IBD column entries for each pair type lost in extraction]
Therefore: affected sib pair design
Test H0: proportion of alleles IBD = 1/2

Affected sib pairs: incomplete marker information
• Parameters: IBD sharing probabilities Z = (z0, z1, z2)
• Marker genotype data: M
• Finite mixture likelihood:
    $L(Z) = \sum_{i=0}^{2} z_i \, P(M \mid \mathrm{IBD} = i)$
• Programs: SPLINK, ASPEX

Joint distribution of Pedigree IBD
• IBD of relative pairs are not independent
    e.g. if IBD(1,2) = 2 and IBD(1,3) = 2, then IBD(2,3) = 2
• Inheritance vector gives joint IBD distribution
• Each element indicates whether the paternally inherited allele is transmitted (1) or the maternally inherited allele is transmitted (0)
• Vector of 2N elements (N = number of non-founders)

Pedigree allele-sharing methods

Method                                     Problem
APM: Affected family members               Uses IBS
ERPA: Extended Relative Pairs Analysis     Dodgy statistic
Genehunter NPL: Non-Parametric Linkage     Conservative
Genehunter-PLUS: Likelihood ("tilting")

• All these methods consider affected members only

Convergence of parametric and non-parametric methods
• Curtis and Sham (1995)
    MFLINK: treats penetrance as parameter
• Terwilliger et al (2000)
    Complex recombination fractions
    Parameters with no simple biological interpretation

Quantitative Sib Pair Linkage
X, Y standardised to mean 0, variance 1
r = sib correlation
VA = additive QTL variance
$\pi$ = proportion of alleles IBD
Haseman-Elston Regression (1972)
    $(X - Y)^2 = 2(1 - r) - 2 V_A (\pi - 0.5) + \varepsilon$
Haseman-Elston Revisited (2000)
    $XY = r + V_A (\pi - 0.5) + \varepsilon$

Improved Haseman-Elston
Sham and Purcell (2001)
Use as dependent variable
    $\frac{(X + Y)^2}{(1 + r)^2} - \frac{(X - Y)^2}{(1 - r)^2}$
Gives equivalent power to variance components model for sib pair data

Variance components linkage
• Models trait values of pedigree members jointly
• Assumes multivariate normality conditional on IBD
• Covariance between relative pairs = $V r + V_A [\pi - E(\pi)]$
Where
    V = trait variance
    r = correlation (depends on relationship)
    VA = QTL additive variance
    E($\pi$) = expected proportion IBD

QTL linkage model for sib-pair data
[Path diagram: sib phenotypes PT1 and PT2, each with latent factors N, S and Q and path coefficients n, s and q; correlation 1 between the S factors and [0 / 0.5 / 1] between the Q factors]
[Figure: panels "No linkage" and "Under linkage"]

Incomplete Marker Information
• IBD sharing cannot be deduced from marker genotypes with certainty
• Obtain probabilities of all possible IBD values
Finite mixture likelihood
    $L = \sum_{i} Z_i \, L(X \mid \mathrm{IBD} = i; V_A)$
Pi-hat likelihood
    $L = L(X \mid \mathrm{IBD} = 2\hat{\pi}; V_A)$

QTL linkage model for sib-pair data
[Path diagram: as above, with correlation 1 between the S factors and $\hat{\pi}$ between the Q factors]

Conditioning on Trait Values
Usual test
    $\ln LR = \max_{V_A} \left[ \ln \sum_{i} Z_i \, L(X \mid \mathrm{IBD} = i; V_A) - \ln L(X; V_A = 0) \right]$
Conditional test
    $\ln LR = \max_{V_A} \left[ \ln \sum_{i} Z_i \, L(X \mid \mathrm{IBD} = i; V_A) - \ln \sum_{i} P_i \, L(X \mid \mathrm{IBD} = i; V_A) \right]$
Zi = IBD probability estimated from marker genotypes
Pi = IBD probability given relationship

QTL linkage: some problems
• Sensitivity to misspecification of marker allele frequencies and positions
• Sensitivity to non-normality / phenotypic selection
• Heavy computational demand for large pedigrees or many marker loci
• Sensitivity to marker genotype and relationship errors
• Low power and poor localisation for minor QTL
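
Worked sketch: sib-pair likelihoods under incomplete marker information
The finite mixture likelihood, the pi-hat likelihood and the conditional test from the slides above can be illustrated for a single sib pair. This is a minimal sketch, not the implementation used by any linkage program. It assumes standardised traits (V = 1), E(pi) = 0.5 for sibs, prior IBD probabilities P = (1/4, 1/2, 1/4), and a fixed VA rather than a maximisation over VA. All function names are hypothetical.

```python
import math

def bivariate_normal_density(x, y, rho):
    """Density of a standardised bivariate normal (means 0, variances 1)
    with correlation rho, which must lie strictly between -1 and 1."""
    det = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def sib_covariance(pi, r, va):
    """Covariance V*r + VA*(pi - E(pi)), assuming V = 1 and E(pi) = 0.5."""
    return r + va * (pi - 0.5)

def mixture_likelihood(x, y, z, r, va):
    """Finite mixture likelihood: sum over i of Z_i * L(X | IBD = i; VA).
    z = (z0, z1, z2) are the IBD probabilities; IBD = i means pi = i / 2."""
    return sum(
        z_i * bivariate_normal_density(x, y, sib_covariance(i / 2.0, r, va))
        for i, z_i in enumerate(z)
    )

def pi_hat_likelihood(x, y, z, r, va):
    """Pi-hat likelihood: a single density evaluated at pi_hat = z1/2 + z2."""
    pi_hat = 0.5 * z[1] + z[2]
    return bivariate_normal_density(x, y, sib_covariance(pi_hat, r, va))

def conditional_log_lr(x, y, z, r, va, p=(0.25, 0.5, 0.25)):
    """Conditional test statistic at a fixed VA (no maximisation here):
    ln sum_i Z_i L_i  -  ln sum_i P_i L_i."""
    return math.log(mixture_likelihood(x, y, z, r, va)) - math.log(
        mixture_likelihood(x, y, p, r, va)
    )

if __name__ == "__main__":
    # A concordant pair with markers indicating IBD = 2 (illustrative values).
    x, y, r, va = 1.5, 1.5, 0.3, 0.4
    z = (0.0, 0.0, 1.0)
    print("mixture likelihood:", mixture_likelihood(x, y, z, r, va))
    print("pi-hat likelihood: ", pi_hat_likelihood(x, y, z, r, va))
    print("conditional ln LR: ", conditional_log_lr(x, y, z, r, va))
```

When the markers are fully informative (one Z_i equals 1) the mixture and pi-hat likelihoods coincide, and when Z equals the prior P the conditional statistic is zero, which is the sense in which the conditional test measures only the information contributed by the marker genotypes.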