Biomath-HG 207B / Biostat 237
February 5, 2004
Lecture 5

Linkage Analysis = Gene Mapping

Think of the markers as genetic guideposts along the chromosome.

[Figure: a chromosome with markers 1 to 11 and the location of the disease susceptibility gene.]

The distortion in the expected segregation pattern of 1/4:1/4:1/4:1/4 depends on the frequency of crossovers between the two loci. That frequency depends on the distance between the two loci.

The two members of the same autosomal pair duplicate and pair up.

[Figure: duplicated, paired homologous chromosomes carrying alleles A1/A2 at one locus and B1/B2 at the other.]

Let the crossover frequency be 10%, so the recombination fraction is q = 10%. The gamete frequencies are then:

Gamete      A1 B1   A1 B2   A2 B1   A2 B2
Frequency   45%     5%      5%      45%

LOD = log base 10 of the odds of linkage
= log base 10 of (probability of the data when the recombination fraction equals q / probability of the data when the recombination fraction equals 1/2).

The pedigrees are independent of one another, so we sum their contributions to the overall LOD:

LOD = LOD(family 1) + LOD(family 2) + ...

Calculate the LOD for different values of q. The value of q that maximizes the LOD is the most likely value of q.

In certain cases it is easy to calculate the LOD and find the maximizing q. For example, when the disease is due to a single locus acting in a Mendelian manner with complete penetrance. It is particularly easy when the phase (whether the disease allele and a marker allele were inherited from the same parent or not) is known. In this case, we just need to count the number of crossovers, r, among the informative meioses, n. The contribution to the overall lod score for this family, for a specified q, is

$$\mathrm{LOD} = \log_{10}\frac{\left(\frac{q}{2}\right)^{r}\left(\frac{1-q}{2}\right)^{n-r}}{\left(\frac{1/2}{2}\right)^{r}\left(\frac{1-1/2}{2}\right)^{n-r}} = \log_{10}\left\{2^{n}\, q^{r}\,(1-q)^{n-r}\right\}$$

Example of calculating the LOD when phase is known

Let the disease be rare, dominant and fully penetrant.

[Pedigree: grandparents 2 (marker genotype A-B) and 1 (A-A); father 4 (B-B) and mother 3 (A-B); children 5 to 9 with marker genotypes A-B, B-B, B-B, B-B, B-B.]

We know that mother 3 inherited the disease allele, D, and marker allele B from her father, and the normal allele, d, and marker allele A from her mother.
Father 4 is homozygous, d-d and B-B. Three children inherited D and B from mother, and one child inherited d and A from mother. Child 7 inherited d and B from mother, a recombinant. So r = 1 and n = 5. If q = 0.20, the contribution to the lod is

$$\log_{10}\left\{2^{5}\left(\tfrac{1}{5}\right)\left(\tfrac{4}{5}\right)^{4}\right\} \approx 0.42$$

Example of calculating the LOD when phase is unknown

Let the disease be rare, dominant and fully penetrant.

[Pedigree: father 4 (marker genotype B-B) and mother 3 (A-B); children 5 to 9 with marker genotypes A-B, B-B, B-B, B-B, B-B.]

We no longer know whether mother 3 inherited D and B from the same parent. There are two equally probable possibilities: she got D and B from one parent and d and A from the other, or she got D and A from one parent and d and B from the other. These are mutually exclusive events.

Prob(data) = Prob(data | mom is D,B/d,A) P(D,B/d,A) + Prob(data | mom is D,A/d,B) P(D,A/d,B)

If mom got D and B from one parent, then there are 1 recombinant and 4 nonrecombinant informative meioses. If mom got D and A from one parent, then there are 4 recombinants and 1 nonrecombinant.

$$\mathrm{Prob}(\text{data}) = \mathrm{Prob}(\text{data} \mid 3 \text{ is } D,B/d,A)\,\mathrm{Prob}(D,B/d,A) + \mathrm{Prob}(\text{data} \mid 3 \text{ is } D,A/d,B)\,\mathrm{Prob}(D,A/d,B)$$

$$= \left(\frac{q}{2}\right)^{r}\left(\frac{1-q}{2}\right)^{n-r}\frac{1}{2} + \left(\frac{1-q}{2}\right)^{r}\left(\frac{q}{2}\right)^{n-r}\frac{1}{2}$$

The contribution to the overall lod score for this family is

$$\mathrm{LOD} = \log_{10}\frac{\left(\frac{q}{2}\right)^{r}\left(\frac{1-q}{2}\right)^{n-r}\frac{1}{2} + \left(\frac{1-q}{2}\right)^{r}\left(\frac{q}{2}\right)^{n-r}\frac{1}{2}}{\left(\frac{1/2}{2}\right)^{r}\left(\frac{1-1/2}{2}\right)^{n-r}\frac{1}{2} + \left(\frac{1-1/2}{2}\right)^{r}\left(\frac{1/2}{2}\right)^{n-r}\frac{1}{2}}$$

$$= \log_{10}\left\{2^{n-1}\left[q^{r}(1-q)^{n-r} + (1-q)^{r}q^{n-r}\right]\right\}$$

For our family with 5 sibs (r = 1, n = 5), the contribution to the overall lod when q is 0.20 is

$$\log_{10}\left\{2^{4}\left[(0.2)(0.8)^{4} + (0.8)(0.2)^{4}\right]\right\} = \log_{10}(1.3312) \approx 0.12$$

Plot LOD by recombination fraction.

[Figure: LOD (vertical axis, 0 to 5) plotted against recombination fraction q (horizontal axis, 0 to 0.5).]

$$\mathrm{lod} = \log_{10}\frac{\mathrm{prob}(Y \mid q)}{\mathrm{prob}(Y \mid q = \tfrac{1}{2})}$$

Comments about using pedigrees to locate genes:

(1) Statistical mapping (linkage analysis) is the first step in the strategy of positional cloning of disease loci. Prior to cloning, linkage analysis can be used for risk prediction.

(2) An unqualified success for mapping disease genes with regular Mendelian patterns of inheritance, e.g.
cystic fibrosis and hypercholesterolemia.

(3) Based on the fact that genes close in distance violate Mendel's law of independent assortment of loci. The actual relationship between physical distance and genetic distance is complicated.

(4) Trait loci are positioned (mapped) relative to nondisease markers.

The genetic model based methods have been successful in localizing genes for single gene traits with high penetrance (Mendelian traits).

In genetic model based linkage analysis we need to know or estimate:

(1) Penetrance: What is the probability of having the trait given a particular genotype? What is the probability of the marker phenotype given the marker genotype?

(2) Priors: Marker and gene genotype frequencies must be determined.

(3) Transmission probabilities: How much recombination is there between genes and markers?

$$L(Y, M \mid \Omega) = \sum_{g_1}\sum_{m_1}\cdots\sum_{g_s}\sum_{m_s} P(Y, M \mid g, m)\,P(g, m)$$

$$= \sum_{g_1}\sum_{m_1}\cdots\sum_{g_s}\sum_{m_s}\;\prod_{i=1}^{s} P(Y_i \mid g_i)\,P(M_i \mid m_i)\prod_{j \in \text{founders}} P(g_j)\,P(m_j)\prod_{\substack{\{k,l,n\} \\ k \in \text{offspring}}} P(g_k, m_k \mid g_l, m_l, g_n, m_n, q)$$

Does genetic model based linkage work with complex traits?

Yes, provided the model is correctly (or nearly correctly) specified. It has worked with complex diseases (examples: Alzheimer's disease, breast cancer, hemolytic disease of the newborn) if one can identify a population, the environmental factors involved, etc. In some cases, there is a subpopulation where the trait seems to be segregating in a Mendelian (single gene) manner.

Model based linkage analysis has worked best when the disease gene is necessary and sufficient for disease expression.

Genetic model based linkage analysis can give misleading results if
(1) Incorrect ascertainment correction is used;
(2) Phenotypes are incorrectly classified (the penetrance is misspecified);
(3) The degree of heterogeneity is misspecified (populations differ in the causal gene); or
(4) The mode of transmission is misspecified.
(Clerget-Darpoux, 1986; J.
Ott, 1991) All of these problems can occur in linkage analysis of a complex trait.
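
The single-family LOD formulas from the phase-known and phase-unknown examples earlier in this lecture, together with the rule that independent families are summed and the LOD is then maximized over q, can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the original lecture: the function names (lod_phase_known, lod_phase_unknown, maximize_lod), the representation of each family as an (r, n) pair, and the 0.01-step grid over q are all hypothetical choices made for the example.

```python
import math


def lod_phase_known(r, n, q):
    """Phase-known LOD: log10( 2^n * q^r * (1-q)^(n-r) ).

    r = number of recombinants, n = number of informative meioses,
    q = recombination fraction, assumed to satisfy 0 < q <= 0.5.
    """
    return n * math.log10(2) + math.log10(q**r * (1 - q) ** (n - r))


def lod_phase_unknown(r, n, q):
    """Phase-unknown LOD: log10( 2^(n-1) * [q^r (1-q)^(n-r) + (1-q)^r q^(n-r)] ).

    The two phases are weighted equally (probability 1/2 each).
    """
    mix = q**r * (1 - q) ** (n - r) + (1 - q) ** r * q ** (n - r)
    return (n - 1) * math.log10(2) + math.log10(mix)


def maximize_lod(lod, families, grid=None):
    """Sum the per-family LODs at each q on a grid and return (best_q, best_lod).

    families is a list of (r, n) pairs; the default grid 0.01, 0.02, ..., 0.50
    is an assumption of this sketch (q = 0 is excluded to avoid log10(0)).
    """
    if grid is None:
        grid = [i / 100 for i in range(1, 51)]
    totals = [(sum(lod(r, n, q) for r, n in families), q) for q in grid]
    best_lod, best_q = max(totals)
    return best_q, best_lod


# The five-sib family from the examples: 1 recombinant in 5 informative meioses.
print(round(lod_phase_known(1, 5, 0.20), 2))    # 0.42
print(round(lod_phase_unknown(1, 5, 0.20), 2))  # 0.12
print(maximize_lod(lod_phase_known, [(1, 5)])[0])  # 0.2
```

With phase known, the grid maximum falls at q = r/n = 0.2, and at q = 1/2 both functions return a LOD of zero, as the definition of the LOD as a ratio against q = 1/2 requires.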