Linkage Mapping 2
CS741 2009 Jim Holland

Recombination Frequency Estimates

Gene Pair      Recombination Frequency
F - Rj1        0.41 ± 0.05
F - Idh1       0.22 ± 0.03
Rj1 - Idh1     0.25 ± 0.03

• They are ESTIMATES!
• Variance or standard error of the estimates can be obtained from 2nd derivative (see Mather, 1951).
• You can get better estimates (smaller std. deviations) by:
  – Increasing sample size
  – Using codominant genes/markers
  – Progeny testing F2:3s to better classify genotypes with dominant markers (e.g., distinguish AA from Aa)
  – Using F2 populations with codominant markers
  – Using backcross populations with dominant markers

3-point linkage mapping

F ---- 0.22 ---- Idh1 ---- 0.25 ---- Rj1

Recombination frequencies are not additive!
Notice: 0.22 + 0.25 > 0.41!

• The number of crossovers that occur along a chromosome is additive, but there is not a 1:1 relationship between crossover events and recombination.
• What is the relationship between crossovers and recombination?
• Remember that crossovers are the CAUSE and recombinations are the RESULT.

One crossover

A --------------- B        A --------------- B   Parental gamete
A --------------- B   →    A --------------- b   Recombinant gamete
a --------------- b        a --------------- B   Recombinant gamete
a --------------- b        a --------------- b   Parental gamete

50% recombinant gametes produced!
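The one-crossover tally above can be checked by brute-force enumeration, and the same enumeration extends to any number of crossovers between the two loci. The sketch below is illustrative only. It assumes each crossover involves one randomly chosen chromatid from each homolog (no chromatid interference), which the slide does not state, and the function name `recombinant_fraction` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Rows 0 and 1 are the sister chromatids carrying A...B; rows 2 and 3 carry a...b.
NON_SISTER_PAIRS = [(i, k) for i in (0, 1) for k in (2, 3)]


def recombinant_fraction(n_crossovers):
    """Average fraction of recombinant gametes over every way of placing
    n_crossovers between the two loci, each between a non-sister pair."""
    total = Fraction(0)
    count = 0
    for choice in product(NON_SISTER_PAIRS, repeat=n_crossovers):
        # row[s] = row whose material strand s carries at the current
        # position, walking from locus A toward locus B.
        row = [0, 1, 2, 3]
        for i, k in choice:
            s_i, s_k = row.index(i), row.index(k)
            row[s_i], row[s_k] = k, i  # the two strands swap distal segments
        # Strand s starts in row s at locus A; it is recombinant if it ends
        # on the other homolog's material at locus B.
        recombinants = sum((s < 2) != (row[s] < 2) for s in range(4))
        total += Fraction(recombinants, 4)
        count += 1
    return total / count


for n in range(0, 5):
    print(n, recombinant_fraction(n))
```

Exact fractions are used so that the averages are not blurred by floating-point rounding. Individual placements of several crossovers can give 0, 2, or 4 recombinant chromatids; only the average over placements is reported here.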
Two crossovers

A --------------- B        A --------------- B   Parental gamete
A --------------- B   →    A --------------- B   Parental gamete
a --------------- b        a --------------- b   Parental gamete
a --------------- b        a --------------- b   Parental gamete

A --------------- B        A --------------- b   Recombinant gamete
A --------------- B   →    A --------------- b   Recombinant gamete
a --------------- b        a --------------- B   Recombinant gamete
a --------------- b        a --------------- B   Recombinant gamete

A --------------- B        A --------------- B   Parental gamete
A --------------- B   →    A --------------- b   Recombinant gamete
a --------------- b        a --------------- b   Parental gamete
a --------------- b        a --------------- B   Recombinant gamete

A --------------- B        A --------------- b   Recombinant gamete
A --------------- B   →    A --------------- B   Parental gamete
a --------------- b        a --------------- B   Recombinant gamete
a --------------- b        a --------------- b   Parental gamete

Averaged over all four possibilities – 50% recombinant gametes are produced!

Crossovers and Recombination

• Any number of crossovers greater than zero will produce 50% recombinant gametes on average.
• This is why recombination frequency is not a linear function of the average number of crossovers between two loci.
• If loci are widely separated on the chromosome, several crossovers may occur between them regularly at each meiosis, but they will still have only a maximum recombination frequency of 50%.

Mapping Functions

• How to make a genetic map linear, so that map distances of separate intervals can be added up to equal distance of combined intervals?
• Mapping Functions! - genetic distances as a function of the presumed number of crossovers that occur between genes.
• An interval in which an average of one crossover occurs every meiosis is defined to have a genetic map distance of 50 cM.
• The problem is that we do not actually observe crossovers, we observe the resulting recombinations.
• By making assumptions about the level of interference (whether crossovers are really independent along chromosome or if one crossover reduces the probability of a nearby crossover), the number of crossovers can be estimated from the observed recombination frequency.
• So, we are working backwards from the observed result (recombinations) to the unobserved cause (crossovers).
• The expected number of crossovers across different intervals then can be summed and treated as a linear map distance.
• Haldane mapping function assumes no interference.
• Kosambi mapping function assumes a constant and specific level of interference:

$$\mathrm{cM} = \frac{1}{4}\,\ln\!\left(\frac{1 + 2r}{1 - 2r}\right) \times 100$$

• Kosambi is usually preferred, as there is evidence for interference, but in reality the level of interference is unknown, and anyway seems to vary across the genome (Sherman & Stack, 95)

Map Distance vs. Recombination Frequency

[Figure: Kosambi cM (y-axis, 0 to 140) plotted against Recombination Frequency (x-axis, 0 to 0.48).]

Map Distance: Example

Gene Pair      Recomb. Freq.    cM
F - Idh1       0.22 ± 0.03      24
Rj1 - Idh1     0.25 ± 0.03      27
F - Rj1        0.41 ± 0.05      58

Notice that the map units are supposed to be more additive than the original recombination units, but in reality they are not always so! Differences between the actual level of interference and the assumptions in the mapping function cause this.

LOD scores

• An alternative method of measuring the evidence for linkage:
• "logarithm of odds":
• From soybean example:

$$L(\hat{r} = 0.41) = \frac{250!}{144!\,44!\,39!\,23!}\,(0.587025)^{144}\,(0.162975)^{(44+39)}\,(0.087025)^{23}$$
$$\mathrm{LOD} = \log_{10}\!\left(\frac{L(\hat{r}_{MLE})}{L(r = 0.5)}\right)$$

$$\mathrm{LOD} = \log_{10}\!\left(\frac{L(\hat{r} = 0.41)}{L(r = 0.5)}\right) = \log_{10}\!\left(\frac{(0.587025)^{144}\,(0.162975)^{(44+39)}\,(0.087025)^{23}}{(9/16)^{144}\,(3/16)^{(44+39)}\,(1/16)^{23}}\right) = \log_{10}\!\left(\frac{8.02425 \times 10^{-124}}{9.59378 \times 10^{-125}}\right) = \log_{10}(8.364) = 0.9224$$

• LOD 1 means estimated recombination frequency is 10 times more likely than 50% recombination.
• LOD 3 means estimated recombination frequency is 1000 times more likely than 50% recombination.
• Typically in full genome linkage mapping, we consider genes with LOD 2.5 or more linked. Why use such a stringent level of significance?

Multipoint Mapping in Mapmaker

• First, must select the linear order of loci in the linkage group.
• With many loci, it is not easy to know the correct order (huge numbers of possible orders).
• Tightly linked loci make it even harder!
• So, usually start with a subset of loci to make ordering simpler and a best guess of the order.
• Given the order, the maximum likelihood estimate (MLE) of the recombination frequencies is computed by extending the two locus multinomial probability function to a multiple locus probability function.
• Try different orders of loci.
• The LOD scores of the MLE for the different orders can be compared, and the order with the highest LOD score is selected.
• Additional loci can be added one at a time to the framework order to build up the map of the linkage group.
• At each step, compute the MLE and LOD score and also shuffle the order of loci a few times and compute LODs for each new order.
• Keep the order with best LOD score.
• You are not guaranteed the most likely order! Two different people can get two different maps from the same data set!
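The two-locus likelihood that the multipoint procedure extends can be sketched in a few lines, using the soybean counts from the LOD example earlier in this section. This is a minimal sketch, not Mapmaker's algorithm. It assumes the class probabilities come from a coupling-phase F2 with both loci dominant, which is inferred from the numbers in the example rather than stated there, and the grid search is a crude stand-in for a proper maximizer. All function names are hypothetical.

```python
import math

# Observed F2 phenotype counts from the soybean example:
# both dominant, two single-dominant classes, double recessive.
COUNTS = (144, 44, 39, 23)


def class_probs(r):
    """Expected F2 class frequencies for two dominant loci in coupling phase
    (assumed model; it gives 0.587025, 0.162975, 0.087025 at r = 0.41)."""
    theta = (1.0 - r) ** 2
    return ((2.0 + theta) / 4.0, (1.0 - theta) / 4.0,
            (1.0 - theta) / 4.0, theta / 4.0)


def log10_likelihood(r, counts=COUNTS):
    """log10 of the multinomial kernel; the factorial term is omitted
    because it cancels in any likelihood ratio."""
    return sum(n * math.log10(p) for n, p in zip(counts, class_probs(r)))


def lod(r, counts=COUNTS):
    """LOD score of recombination frequency r against free recombination."""
    return log10_likelihood(r, counts) - log10_likelihood(0.5, counts)


# Crude maximum-likelihood estimate: grid search over 0.001 .. 0.500.
grid = [i / 1000 for i in range(1, 501)]
r_hat = max(grid, key=log10_likelihood)

print(f"r_hat = {r_hat:.3f}, LOD = {lod(r_hat):.4f}")
```

Working in log10 avoids underflow: the raw likelihoods are on the order of 10^-124, and the products for larger data sets would fall below what a floating-point number can represent.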
Linkage Map Example
Portyanko et al., 2001

[Figure: linkage maps labeled OT1 (KO22) and OT2, with marker names listed beside interval distances in cM; some markers carry bracketed cross-references such as [22,24;E]. Total map lengths shown are 72.5 cM and 97.0 cM.]

• Markers in bold are framework markers, positions are certain with LOD > 3.0
• Markers in italics have positions supported at LOD < 3.0, so not used to compute map distances. They may be in wrong interval.
• The sum of interval distances is used as total map length for the linkage group
• Using common markers, you can compare maps across populations and species
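Total map lengths like these are sums of interval distances in cM, and each interval distance comes from a recombination frequency passed through a mapping function. As a rough illustration, the sketch below applies the Kosambi function to the soybean three-point example from earlier in this section. The Haldane formula is the standard no-interference form and is not written out in the slides, and the function names are hypothetical.

```python
import math


def kosambi_cm(r):
    """Kosambi map distance in cM for recombination frequency 0 <= r < 0.5."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def haldane_cm(r):
    """Haldane map distance in cM (no interference) for 0 <= r < 0.5."""
    return -50.0 * math.log(1.0 - 2.0 * r)


# Soybean three-point example: two adjacent intervals and the flanking pair.
pairs = {"F - Idh1": 0.22, "Idh1 - Rj1": 0.25, "F - Rj1": 0.41}

for name, r in pairs.items():
    print(f"{name:12s} r = {r:.2f}  Kosambi = {kosambi_cm(r):5.1f} cM  "
          f"Haldane = {haldane_cm(r):5.1f} cM")

# Map length of F..Rj1 as the sum of its two intervals.
summed = kosambi_cm(0.22) + kosambi_cm(0.25)
print(f"Sum of the two intervals (Kosambi): {summed:.1f} cM")
```

The summed interval distance comes to roughly 51 cM, against 58 cM from converting the flanking-pair frequency directly. That gap is the imperfect additivity noted in the map distance example.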