BIOINFORMATICS APPLICATIONS NOTE
Vol. 20 no. 12 2004, pages 1966–1967
doi:10.1093/bioinformatics/bth168

Non-linear conversion between genetic and physical chromosomal distances

Claudia Voigt, Steffen Möller, Saleh M. Ibrahim and Pablo Serrano-Fernández∗
Proteome Center Rostock, Joachim-Jungius-Strasse 9, 18059 Rostock, Germany

Received on January 29, 2004; revised and accepted on March 8, 2004
Advance Access publication March 22, 2004

ABSTRACT
Summary: A supervised non-linear interpolation significantly improves the reliability of conversions from genetic distances to physical distances compared with linear conversions. A web-accessible application was created that addresses this question with a graphical presentation and that may be wrapped by local installations.
Motivation: Genetic linkage maps and radiation hybrid (RH) maps are based on the rate of uncoupling between linked genetic markers. This rate is usually measured in centiMorgan (cM) when uncoupling originates from natural recombination, or in centiRay (cR) when chromosomes are irradiated artificially to separate the markers. Physical maps arise from genome-wide DNA sequencing and are measured in bp. This work was originally motivated as an extension of the software application Expressionview (Fischer et al., 2003), exploring its range of application by combining different mapping systems. The relationship between physical and genetic maps is known not to be always linear (Yu et al., 2001). The deviation from the linear model seems to depend on local idiosyncrasies of the chromosomes and on the kind of genetic map used. The present application addresses this problem for the first time.
Availability: http://qtl.pzr.uni-rostock.de/cartographer.php
Contact: [email protected]

SOFTWARE DESCRIPTION
Cartographer is implemented for the human, mouse and rat genomes and includes a total of 18 different mapping systems.
Chromosomal features from EnsEMBL (Clamp et al., 2003), such as chromosome bands and the density curves of single nucleotide polymorphisms, DNA repeats, G–C content and genes, have also been included. Chromosomes are plotted for different genetic linkage units against physical positions in a two-dimensional graph. Deviations from an ideal linear correspondence appear as changes in the slope of the regression curve. A steep slope stands for a smaller chance of genetic uncoupling, e.g. recombination, per physical unit and vice versa. For a global comparison between different maps on a single chromosome, the correlation coefficient (c.c.) and the Spearman rank c.c. were calculated. The Spearman c.c. avoids some difficulties associated with the traditional c.c.: it is robust to outliers and does not require normally distributed data (Sachs, 2002).

∗ To whom correspondence should be addressed.

GENERAL RELATIONSHIPS
Centromeres are characterized by abrupt steps between both chromosomal arms when genetic maps are plotted against physical maps with Cartographer. For non-acrocentric chromosomes, the situation at the telomeres is the inverse (Fig. 1). The telomeres show a higher recombination rate, in accordance with previously stated models for the distribution of recombination in the genome (Akhunov et al., 2003; Phillips et al., 2003). These models also predict that gene density and recombination likelihood should mostly be proportional. The application shows this to be consistent even across different genetic linkage maps; for instance, at the human chromosome bands 12q13–14 in both the Genethon (Fig. 1) and the Marshfield maps. However, there are exceptions such as the major histocompatibility (MHC) locus, where recombination is greatly suppressed within HLA class II subregions (Cullen et al., 1997). Besides the relationship between different genetic and physical maps, the software also analyzes the correlation between distinct genetic maps.
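The difference between the two correlation coefficients named under SOFTWARE DESCRIPTION can be illustrated with a minimal sketch. The marker positions below are hypothetical and the helper names (`pearson`, `ranks`, `spearman`) are illustrative only; this is not Cartographer's code. For a map relationship that is monotone but non-linear, the rank-based coefficient stays at 1 while the traditional coefficient drops.

```python
from math import sqrt

def pearson(xs, ys):
    """Traditional (Pearson) correlation coefficient of two sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank c.c.: the Pearson c.c. computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical markers: genetic position (cM) and physical position (Mb).
# The order is conserved, but the bp-per-cM ratio changes along the map.
cm = [0.0, 5.0, 12.0, 20.0, 21.0, 22.0, 40.0]
mb = [1.0, 3.0, 9.0, 30.0, 60.0, 90.0, 100.0]

print("Pearson c.c.: ", round(pearson(cm, mb), 3))
print("Spearman c.c.:", round(spearman(cm, mb), 3))
```

Because the Spearman coefficient depends only on marker order, it measures whether two maps agree on the sequence of markers, independently of how strongly the local conversion factor varies.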
Genetic linkage maps are shown to deviate more strongly from the linear model than radiation hybrid maps, as expected under the hypothesis that linkage disequilibrium depends on the intimate structure of chromosomes and plays a decisive role in their evolution (Phillips et al., 2003).

CONVERSION PARTICULARITIES
Some markers localized unambiguously in genetic linkage units are found to be mapped in multiple physical positions all over a single chromosome. The result is a distorted regression curve and distorted confidence intervals at those positions, which reduces the reliability of conversions. Such markers are detected and ignored for conversions.

The plot of genetic markers for chromosome 2 of the mouse with the MIT Genetic Map illustrates an unexpected split of the chromosome into two parallel arms. The split is consistent across markers from different sources. The inclusion of gene positions as additional genetic markers resolves the ambiguity, and the lower arm of the split emerges as the one that best matches the extended data set.

Global negative slopes (e.g. the rat chromosome 18; Oxford Genetic Map versus physical map) and, to a lesser extent, local negative slopes that reflect an error at the assembly stage (e.g. the human chromosome 19; Stanford TNG RH Map versus physical map, near the end telomere) would make direct linear conversions simply absurd. These particularities are supervised by the user in the Cartographer application and are, together with the foregoing examples, the best arguments against linear or averaged map conversions.

[Figure 1: plot with axes "Position in cM" and "Position in bps (Mill.)", annotated with chromosome bands p13.31 to q24.33 and a gene density track ("Genes").]
Fig. 1. Human chromosome 12. x-axis: position in cM (Genethon map). y-axis: position in bp. Markers known for both mapping systems are plotted as black rectangles. Filled circles depict the regression curve and the flanking confidence intervals. The telomeres and the position marked by a box show a higher recombination rate coinciding with a greater gene density (horizontal bars). This figure can be viewed in colour on Bioinformatics online.

Bioinformatics 20(12) © Oxford University Press 2004; all rights reserved.

ACKNOWLEDGEMENTS
Financial support was provided by the BMBF Leitprojekt ‘Proteom-Analyse des Menschen’ (FKZ 01GG9831) and the BMBF program NBL3 (FKZ 01ZZ0108). Brigitte Müller-Hilke is thanked for discussions. Hans-Jürgen Thiesen and Michael O. Glocker are thanked for their support.

REFERENCES
Akhunov,E.D., Goodyear,A.W., Geng,S., Qi,L.L., Echalier,B., Gill,B.S., Miftahudin, Gustafson,J.P., Lazo,G., Chao,S. et al. (2003) The organization and rate of evolution of wheat genomes are correlated with recombination rates along chromosome arms. Genome Res., 13, 753–763.
Clamp,M., Andrews,D., Barker,D., Bevan,P., Cameron,G., Chen,Y., Clark,L., Cox,T., Cuff,J., Curwen,V. et al.
(2003) Ensembl 2002: accommodating comparative genomics. Nucleic Acids Res., 31, 38–42.
Cullen,M., Noble,J., Erlich,H., Thrope,K., Beck,S., Klitz,W., Trousdale,J. and Carrington,M. (1997) Characterization of recombination in the HLA class II region. Am. J. Hum. Genet., 60, 397–407.
Fischer,G., Ibrahim,S.M., Brockmann,G.A., Pahnke,J., Bertocci,E., Thiesen,M.J., Serrano-Fernandez,P. and Moller,S. (2003) Expressionview: visualization of quantitative trait loci and gene expression data in EnsEMBL. Genome Biol., 4, R77.
Phillips,M.S., Lawrence,R., Sachidanandam,R., Morris,A.P., Balding,D.J., Donaldson,M.A., Studebaker,J.F., Ankener,W.M., Alfisi,S.V., Kuo,F.S. et al. (2003) Chromosome-wide distribution of haplotype blocks and the role of recombination hot spots. Nat. Genet., 33, 382–387.
Sachs,L. (2002) Verteilungsunabhängige Abhängigkeitsmaße. In Sachs,L. (ed.) Angewandte Statistik, 10th edn., Springer-Verlag, Berlin, Heidelberg, New York, pp. 511–518.
Yu,A., Zhao,C., Fan,Y., Jang,W., Mungall,A.J., Deloukas,P., Olsen,A., Doggett,N.A., Ghebranious,N., Broman,K.W. and Weber,J.L. (2001) Comparison of human genetic and sequence-based physical maps. Nature, 409, 951–953.
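
SUPPLEMENTARY SKETCH
The contrast drawn under CONVERSION PARTICULARITIES between a marker-guided conversion and a single averaged conversion factor can be illustrated as follows. This is a minimal sketch under stated assumptions, not Cartographer's algorithm: the text does not specify the regression method, so plain piecewise-linear interpolation between flanking markers stands in for the non-linear regression curve. The marker names, positions and function names are hypothetical, and the cM positions of the retained markers are assumed to be distinct.

```python
from bisect import bisect_right
from collections import defaultdict

def drop_ambiguous(markers):
    """Keep only markers that map to exactly one physical position.

    markers: list of (name, cM, bp) tuples. A name that occurs with more
    than one distinct bp value is treated as ambiguous and discarded.
    """
    positions = defaultdict(set)
    for name, _cm, bp in markers:
        positions[name].add(bp)
    return [m for m in markers if len(positions[m[0]]) == 1]

def interpolate_bp(markers, query_cm):
    """Piecewise-linear cM -> bp conversion between the flanking markers.

    Stand-in for a non-linear regression curve (assumption of this sketch).
    """
    pts = sorted((cm, bp) for _name, cm, bp in markers)
    cms = [cm for cm, _bp in pts]
    if not cms[0] <= query_cm <= cms[-1]:
        raise ValueError("query lies outside the mapped interval")
    hi = min(bisect_right(cms, query_cm), len(pts) - 1)
    (cm0, bp0), (cm1, bp1) = pts[hi - 1], pts[hi]
    frac = (query_cm - cm0) / (cm1 - cm0)
    return bp0 + frac * (bp1 - bp0)

def linear_bp(markers, query_cm):
    """Global linear conversion with one average bp-per-cM factor."""
    pts = sorted((cm, bp) for _name, cm, bp in markers)
    (cm0, bp0), (cm1, bp1) = pts[0], pts[-1]
    return bp0 + (query_cm - cm0) * (bp1 - bp0) / (cm1 - cm0)

# Hypothetical markers; "m5" is mapped to two physical positions.
markers = [
    ("m1", 0.0, 1_000_000),
    ("m2", 10.0, 2_000_000),
    ("m3", 20.0, 12_000_000),
    ("m4", 30.0, 13_000_000),
    ("m5", 15.0, 5_000_000),
    ("m5", 15.0, 40_000_000),
]

clean = drop_ambiguous(markers)
print("markers kept:      ", [name for name, _cm, _bp in clean])
print("interpolated (bp): ", interpolate_bp(clean, 12.0))
print("global linear (bp):", linear_bp(clean, 12.0))
```

In this toy data set the interval between m2 and m3 covers far more bp per cM than its neighbours, so the two conversions of the same cM position disagree by well over a megabase. A negative slope between neighbouring markers would still pass through this sketch unnoticed, which is the kind of case the text leaves to user supervision.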