BIOINFORMATICS APPLICATIONS NOTE
Vol. 20 no. 12 2004, pages 1966–1967
doi:10.1093/bioinformatics/bth168
Non-linear conversion between genetic and
physical chromosomal distances
Claudia Voigt, Steffen Möller, Saleh M. Ibrahim and Pablo
Serrano-Fernández∗
Proteome Center Rostock, Joachim-Jungius-Strasse 9, 18059 Rostock, Germany
Received on January 29, 2004; revised and accepted on March 8, 2004
Advance Access publication March 22, 2004
ABSTRACT
Summary: A supervised non-linear interpolation significantly
improves the reliability of conversions from genetic distances
to physical distances compared with linear conversion. A
web-accessible application was created that addresses this
question with a graphical presentation and that may be wrapped
by local installations.
Motivation: Genetic linkage maps and radiation hybrid (RH)
maps are based on the rate of uncoupling between linked
genetic markers. This rate is usually measured in centiMorgan
(cM) when uncoupling arises from natural recombination,
or in centiRay (cR) when chromosomes are irradiated artificially to separate the markers. Physical maps arise from
genome-wide DNA sequencing and are measured in base pairs (bp).
This work was originally motivated as an extension of the
software application Expressionview (Fischer et al., 2003),
exploring its range of application when different mapping systems are combined. The relationship between physical and genetic
maps is known not to be always linear (Yu et al., 2001). The deviation
from the linear model seems to depend on local idiosyncrasies
of the chromosomes and on the kind of genetic map used. The
present application addresses this problem for the first time.
Availability: http://qtl.pzr.uni-rostock.de/cartographer.php
Contact: [email protected]
SOFTWARE DESCRIPTION
Cartographer is implemented for the human, mouse and rat
genomes and includes a total of 18 different mapping systems.
Chromosomal features from EnsEMBL (Clamp et al., 2003),
such as chromosome bands and the density curves of single
nucleotide polymorphisms, DNA repeats, G–C content and
genes, have also been included.
∗ To whom correspondence should be addressed.
Chromosomes are plotted for different genetic linkage units
against physical positions in a two-dimensional graph. Inconsistencies with an ideal linear correspondence are represented
by changes in the slope of the regression curve. A steep
slope stands for a smaller chance of genetic uncoupling, e.g.
recombination, per physical unit and vice versa.
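The reading of the slope can be illustrated with a small sketch. The text does not specify the regression method used by Cartographer, so the sketch below swaps in plain piecewise-linear interpolation between flanking markers as a stand-in. The marker values and the function names (`local_slopes`, `cm_to_bp`, `cm_to_bp_linear`) are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical anchor markers: (genetic position in cM, physical position in bp),
# sorted by genetic position. The values are illustrative, not taken from any map.
MARKERS = [(0.0, 0), (10.0, 2_000_000), (20.0, 12_000_000), (30.0, 14_000_000)]


def local_slopes(markers):
    """Return the slope (bp per cM) of each interval between adjacent markers."""
    return [
        (bp1 - bp0) / (cm1 - cm0)
        for (cm0, bp0), (cm1, bp1) in zip(markers, markers[1:])
    ]


def cm_to_bp(cm, markers):
    """Convert cM to bp by interpolating linearly between the flanking markers."""
    cms = [m[0] for m in markers]
    if not cms[0] <= cm <= cms[-1]:
        raise ValueError("position outside the range covered by the markers")
    # Index of the interval containing cm (the last interval when cm == cms[-1]).
    i = min(bisect_right(cms, cm) - 1, len(markers) - 2)
    (cm0, bp0), (cm1, bp1) = markers[i], markers[i + 1]
    return bp0 + (cm - cm0) * (bp1 - bp0) / (cm1 - cm0)


def cm_to_bp_linear(cm, markers):
    """Single-slope conversion that uses only the first and the last marker."""
    (cm0, bp0), (cm1, bp1) = markers[0], markers[-1]
    return bp0 + (cm - cm0) * (bp1 - bp0) / (cm1 - cm0)
```

With these made-up markers the middle interval has the steepest slope (1,000,000 bp per cM), i.e. the least recombination per physical unit. At 10 cM the interpolated position is 2,000,000 bp, whereas the single-slope conversion gives roughly 4,670,000 bp, which shows how far a purely linear conversion can drift.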
For a global comparison between different maps on a
single chromosome, the correlation coefficient (c.c.) and the
Spearman rank c.c. were calculated. The Spearman c.c. avoids
some difficulties of the traditional c.c.: it is robust
to outliers and does not require the data to be normally
distributed (Sachs, 2002).
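As a rough illustration of the two coefficients, the sketch below computes the traditional (Pearson) c.c. and the Spearman rank c.c. with the standard library only. It is not the Cartographer implementation; the function names and the example positions are assumptions made for this sketch.

```python
from math import sqrt


def pearson(xs, ys):
    """Traditional (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank c.c.: the Pearson c.c. computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical marker positions: monotone but strongly non-linear.
cm = [0, 10, 20, 30, 40]
bp = [0, 2_000_000, 12_000_000, 14_000_000, 90_000_000]
print(pearson(cm, bp), spearman(cm, bp))
```

For these made-up positions the marker order is identical in both maps, so the Spearman c.c. is 1, while the Pearson c.c. is pulled below 1 by the outlying last marker.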
GENERAL RELATIONSHIPS
Centromeres are characterized by abrupt steps between the two
chromosomal arms when genetic maps are plotted against physical
maps with Cartographer. For non-acrocentric chromosomes, the situation at the telomeres is the inverse (Fig. 1).
Telomeres show a higher recombination rate, in agreement with previously stated models for the distribution of recombination
in the genome (Akhunov et al., 2003; Phillips et al., 2003).
These models also predict that gene density and recombination likelihood should be mostly proportional. The application shows this to hold across different genetic
linkage maps; for instance, at the human chromosome bands
12q13–14 in both the Genethon (Fig. 1) and the Marshfield
maps. However, there are exceptions such as the major histocompatibility complex (MHC) locus, where recombination is greatly
suppressed within HLA class II subregions (Cullen et al.,
1997).
Besides the relationship between genetic and physical maps, the software also analyzes the correlation between
distinct genetic maps. Genetic linkage maps are shown to
deviate more strongly from the linear model than radiation hybrid
maps, as expected under the hypothesis that linkage disequilibrium
depends on the fine structure of chromosomes
and plays a decisive role in their evolution (Phillips et al.,
2003).
[Figure 1: plot of marker positions along human chromosome 12. Axes: position in cM; position in bp (millions). Tracks: chromosome bands p13.31 to q24.33; gene density.]
Fig. 1. Human chromosome 12. x-axis: position in cM (Genethon map). y-axis: position in bp. Markers known for both mapping systems
are plotted as black rectangles. Filled circles depict the regression curve and the flanking confidence intervals. The telomeres and the position
marked by a box show a higher recombination rate coinciding with a greater gene density (horizontal bars). This figure can be viewed in
colour on Bioinformatics online.
Bioinformatics 20(12) © Oxford University Press 2004; all rights reserved.
CONVERSION PARTICULARITIES
Some markers localized unambiguously in genetic linkage
units are found to be mapped in multiple physical positions all
over a single chromosome. The result is a distorted regression
curve and confidence intervals at those positions, thus reducing the reliability of conversions. Such markers are detected
and ignored for conversions.
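A minimal sketch of such a filter is given below. It assumes marker records of the form (marker name, chromosome, physical position in bp); the record layout, the marker names and the function name `split_ambiguous` are hypothetical, and the criterion actually used by Cartographer is not described in the text.

```python
from collections import defaultdict


def split_ambiguous(records):
    """Separate markers with one physical position from those with several.

    records: iterable of (marker, chromosome, bp) tuples.
    Returns (unique, ambiguous), both keyed by (marker, chromosome).
    """
    positions = defaultdict(set)
    for marker, chrom, bp in records:
        positions[(marker, chrom)].add(bp)
    # Markers with exactly one distinct position are kept for the conversion.
    unique = {
        key: next(iter(bps)) for key, bps in positions.items() if len(bps) == 1
    }
    # Markers with several distinct positions are set aside and reported.
    ambiguous = {
        key: sorted(bps) for key, bps in positions.items() if len(bps) > 1
    }
    return unique, ambiguous


# Hypothetical records; M2 is mapped to two places on the same chromosome.
records = [
    ("M1", "12", 1_000),
    ("M2", "12", 5_000),
    ("M2", "12", 90_000),
    ("M3", "12", 7_000),
    ("M3", "12", 7_000),  # exact duplicate, not an ambiguity
]
unique, ambiguous = split_ambiguous(records)
```

Only the markers in `unique` would then be passed on to the regression; a real filter might also tolerate positions that differ by a few bases, which this sketch does not attempt.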
The plot of genetic markers for mouse chromosome 2
with the MIT Genetic Map illustrates an unexpected split of
the chromosome into two parallel arms. The split is consistent
across markers from different sources. The inclusion of gene
positions as additional genetic markers resolves the ambiguity:
the lower arm of the split emerges as the one that best matches the
extended data set.
Negative slopes reflect an error at the assembly stage. Global ones (e.g.
rat chromosome 18; Oxford Genetic Map versus physical map) and, to a lesser
extent, local ones (e.g. human chromosome 19; Stanford TNG RH Map versus
physical map, near the telomere at the end of the chromosome) would make direct
linear conversions simply absurd. These particularities are
supervised by the user in the Cartographer application and
are, together with the foregoing examples, the best arguments
for avoiding linear or averaged map conversions.
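One simple way to flag negative slopes for the user's attention is sketched below. It assumes markers given as (genetic position, physical position in bp) pairs sorted by genetic position. The distinction between 'global' and 'local' uses a heuristic invented for this sketch (the sign of the trend from the first to the last marker), and the function names are hypothetical.

```python
def negative_intervals(markers):
    """Return adjacent marker pairs whose physical position decreases."""
    return [(a, b) for a, b in zip(markers, markers[1:]) if b[1] < a[1]]


def classify_slopes(markers):
    """Label a map comparison as 'consistent', 'local' or 'global'."""
    if not negative_intervals(markers):
        return "consistent"
    # Heuristic: the overall trend is taken from the first and last marker.
    overall_negative = markers[-1][1] < markers[0][1]
    return "global" if overall_negative else "local"


# Hypothetical examples (genetic position, physical position in Mbp).
print(classify_slopes([(0, 0), (10, 5), (20, 9)]))           # consistent
print(classify_slopes([(0, 9), (10, 5), (20, 0)]))           # global
print(classify_slopes([(0, 0), (10, 5), (20, 9), (30, 7)]))  # local
```

A conversion routine could refuse to interpolate across the intervals returned by `negative_intervals` and leave the decision to the user, in keeping with the supervised approach described above.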
ACKNOWLEDGEMENTS
Financial support was provided by the BMBF Leitprojekt
‘Proteom-Analyse des Menschen’ (FKZ 01GG9831) and the
BMBF program NBL3 (FKZ 01ZZ0108). Brigitte Müller-Hilke is thanked for discussions. Hans-Jürgen Thiesen and
Michael O. Glocker are thanked for their support.
REFERENCES
Akhunov,E.D., Goodyear,A.W., Geng,S., Qi,L.L., Echalier,B.,
Gill,B.S., Miftahudin, Gustafson,J.P., Lazo,G., Chao,S. et al.
(2003) The organization and rate of evolution of wheat genomes
are correlated with recombination rates along chromosome arms.
Genome Res., 13, 753–763.
Clamp,M., Andrews,D., Barker,D., Bevan,P., Cameron,G., Chen,Y.,
Clark,L., Cox,T., Cuff,J., Curwen,V. et al. (2003) Ensembl 2002:
accommodating comparative genomics. Nucleic Acids Res., 31,
38–42.
Cullen,M., Noble,J., Erlich,H., Thorpe,K., Beck,S., Klitz,W.,
Trowsdale,J. and Carrington,M. (1997) Characterization of recombination in the HLA class II region. Am. J. Hum. Genet., 60,
397–407.
Fischer,G., Ibrahim,S.M., Brockmann,G.A., Pahnke,J., Bertocci,E.,
Thiesen,M.J., Serrano-Fernandez,P. and Moller,S. (2003) Expressionview: visualization of quantitative trait loci and gene expression data in EnsEMBL. Genome Biol., 4, R77.
Phillips,M.S., Lawrence,R., Sachidanandam,R., Morris,A.P.,
Balding,D.J., Donaldson,M.A., Studebaker,J.F., Ankener,W.M.,
Alfisi,S.V., Kuo,F.S. et al. (2003) Chromosome-wide distribution
of haplotype blocks and the role of recombination hot spots. Nat.
Genet., 33, 382–387.
Sachs,L. (2002) Verteilungsunabhängige Abhängigkeitsmaße. In
Sachs,L. (ed.) Angewandte Statistik, 10th edn., Springer-Verlag,
Berlin, Heidelberg, New York, pp. 511–518.
Yu,A., Zhao,C., Fan,Y., Jang,W., Mungall,A.J., Deloukas,P.,
Olsen,A., Doggett,N.A., Ghebranious,N., Broman,K.W. and
Weber,J.L. (2001) Comparison of human genetic and sequence-based physical maps. Nature, 409, 951–953.