Is Interlineage Recombination Responsible for Low Divergence of Mitochondrial nad3 Genes in Mytilus galloprovincialis?

Artur Burzyński and Beata Śmietanka
Department of Genetics and Marine Biotechnology, Polish Academy of Sciences, Institute of Oceanology, Sopot, Poland

The existence of mtDNA recombination in animals has been confirmed by several case studies. Still, for Mytilus mussels possessing two divergent mitochondrial genomes (M and F), which can recombine, no recombination between coding sequences of highly diverged M and F genomes has been shown. Based on the full sequences of both genomes, it has been suggested that particularly low divergence observed within the mitochondrial nad3 gene of the Mytilus galloprovincialis mussel may be caused by its exceptionally low evolutionary rate. Here, we contribute a new pair of mitochondrial genomes typical for M. galloprovincialis and show that this low divergence is not a sign of evolutionary conservation but is rather caused by the acquisition of an F-related sequence by the published M genome of M. galloprovincialis. The most likely scenario for this apparent mtDNA-coding region recombination case is an assembly artifact.

Key words: DUI, mtDNA recombination, assembly errors, mitogenomics.
E-mail: [email protected].
Mol. Biol. Evol. 26(7):1441–1445. 2009 doi:10.1093/molbev/msp085 Advance Access publication April 22, 2009
© The Author 2009. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]

Introduction

Mytilus mussels have an unusual system of mitochondrial inheritance: In addition to the classic, maternal route (F genome), the males pass another mtDNA (M genome) to sons (Skibinski et al. 1994; Zouros et al. 1994). Because these two genomes can be quite divergent, Mytilus mussels became a hunting ground for mitochondrial recombination. Indeed, several different variants of a fragment of the cox3 gene have been cloned from gonadal tissue of an atypical Mytilus galloprovincialis male bearing two similar (~3% divergence) M and F genomes. Some of them showed a recombination signature (Ladoukakis and Zouros 2001). Haplotypes with mosaic M–F control region sequences are widely known in Mytilus (Burzyński et al. 2003, 2006; Rawson 2005; Venetis et al. 2007; Filipowicz et al. 2008).

Only analyses of entire mtDNA genomes can answer the question of whether other parts of the molecule also recombine in Mytilus. Mizi et al. (2005) described full sequences of both M and F genomes of M. galloprovincialis. One of the most interesting findings was that the nad3 gene showed the lowest divergence among all protein-coding genes (0.127). The explanation given for this fact was “the low rate of evolution of nad3” in Mytilus (p 962). The hypothesis that nad3 may contain the origin of replication of the lagging strand (OL) was given as a possible cause for this conservation.

Several mitochondrial (mt) genomes of Mytilus mussels have been published, including the sequence of F (Boore et al. 2004) and M (Breton et al. 2006) genomes of Mytilus edulis. To test the “conserved nad3” hypothesis, an alignment of four genomes (both haplotypes of M. galloprovincialis and both haplotypes of M. edulis) has been created. The sliding window plots of M–F divergence are presented in figure 1A. Contrary to the description given by Mizi et al. (2005), it is not the “nad3 plus the adjacent 100 bp of UR4” (p 937) that is responsible for the lowest divergence region in this plot. This region is located at the very beginning of nad3. Importantly, the relevant comparison of M. edulis genomes (fig. 1A, gray line) does not show this. This is the only region of the genome for which there is such a difference in M–F divergences between M. edulis and M. galloprovincialis. It argues against the “conserved nad3” hypothesis predicting that the nad3 gene should exhibit low divergence also in M.
edulis, whereas the overall M–F distance of nad3 is 0.27 in M. edulis (for the most conservative cox1, the distance is 0.19). The most likely position of OL (Rodakis et al. 2007) is outside the region (fig. 1, arrow), so its influence on the divergence anomaly is unlikely.

A closer look at the region in question (fig. 1B) shows that the distance approaches zero within the anomaly. Such a pattern could be the result of an M–F recombination event in which the M genome acquired a fragment of the F genome. To test this hypothesis, a set of recombination detection algorithms has been applied to the four-genome alignment. All programs detected recombination with strong statistical support (table 1). The positions of the most probable breakpoints are marked with asterisks in figure 1C. Taking these data at face value, it must be concluded that a fragment of the M genome, approximately 150 bp long, is derived from the F genome in M. galloprovincialis. This would be the first case of a mosaic mitochondrial protein-coding gene resulting from homologous recombination between highly divergent (>20%) genomes in animals.

To check whether this is a typical feature of M. galloprovincialis, we sequenced another pair of representative M and F genomes from this species. The descriptive statistics for the newly sequenced genomes are presented in table 2. They differ from the comparable data presented by Mizi et al. (2005) in five genes: cox1, nad1, cob, nad6, and nad3. The differences in length of nad1 and cox1 are merely the result of different annotations chosen for these genes. Had the same annotation convention been used for the sequences of Mizi et al. (2005), these two differences would disappear. The differences in length of cob, as well as both the length and Ka/Ks differences in nad6, result from the fact that the 3′ parts of these two genes differ in their reading frames. These apparent frameshifts have been discussed in detail by Zbawicka et al. (2007).
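Returning to the recombination hypothesis tested on the four-genome alignment, the underlying reasoning can be illustrated with a toy Python sketch. It is not an implementation of any of the programs listed in table 1; it only labels each alignment column by whether a query sequence matches an M reference or an F reference and then reports the longest stretch of F-like columns. The function names `classify_sites` and `longest_f_like_block`, and the short sequences in the usage note, are hypothetical and chosen for illustration only.

```python
def classify_sites(query, m_ref, f_ref):
    """Label each aligned column: 'M' or 'F' when the query matches only that
    reference, '.' when the column is uninformative (gap, M == F, or the
    query matches neither reference)."""
    if not (len(query) == len(m_ref) == len(f_ref)):
        raise ValueError("all sequences must be aligned to the same length")
    labels = []
    for q, m, f in zip(query, m_ref, f_ref):
        if "-" in (q, m, f) or m == f:
            labels.append(".")
        elif q == f:
            labels.append("F")
        elif q == m:
            labels.append("M")
        else:
            labels.append(".")
    return "".join(labels)


def longest_f_like_block(labels):
    """Return (first, last) 0-based columns of the longest run of 'F' sites
    that is not interrupted by an 'M' site, or None if there is no 'F'."""
    best = None
    run_start = None
    for i, label in enumerate(labels):
        if label == "F":
            if run_start is None:
                run_start = i
            if best is None or i - run_start > best[1] - best[0]:
                best = (run_start, i)
        elif label == "M":
            run_start = None  # an M-like site ends the current F-like block
    return best
```

With the made-up inputs `m_ref = "AAAAAAAAAA"`, `f_ref = "CCCCCCCCCC"` and `query = "AAACCCCAAA"`, `classify_sites` returns `"MMMFFFFMMM"` and `longest_f_like_block` returns `(3, 6)`. A real analysis would additionally need a statistical test of whether such a block is longer than expected by chance, which is what the dedicated programs provide.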
Most importantly, the new data do not support a particularly low divergence of nad3: The distance is 0.215, only slightly lower than in M. edulis. The nad3 region was examined in detail, following the approach described above for the original data. Figure 2 shows that the newly sequenced genomes do not reveal any nad3 divergence anomaly. Recombination detection programs also do not indicate mosaic fragments in the newly sequenced mt genomes, neither in nad3 nor in any other coding sequence.

FIG. 1. Comparison of M and F genomes of Mytilus edulis (gray) and Mytilus galloprovincialis (black). Arrow points at the location of OL (Rodakis et al. 2007). (A) M–F nucleotide diversity has been calculated in a 150-bp window moving along the alignment in 50-bp steps and plotted at each window midpoint. The positions of all protein-coding genes and two rRNA genes are shown above the plot. (B) The same as in (A) except the plot is limited to the region from nad2 to cox1, with a smaller window of 75 bp and in 1-bp steps. The positions of two relevant cloned fragments are shown by lines above the protein-coding genes. The position of the single direct PCR product is shown below the protein-coding genes. (C) The details of the alignment over the region from 8374 to 8702 in (A) and (B). Sequences are cited by their GenBank accession numbers. AY484747 and AY497292 represent F haplotypes; AY823623 and AY363687 represent M haplotypes. The nucleotides in bold support the F origin of the first M sequence. The underlined nucleotides support a regular M–F dichotomy. Asterisks over the alignment mark the most probable recombination breakpoints. The positions of genes are indicated under the alignment; the recognition sites for restriction enzymes used for cloning are boxed. The positions and the sequences of the two primers, ND3-f1 and ND3-r, used to obtain the PCR product linking cloned fragments are shown under the alignment.
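The sliding-window divergence profiles described for figure 1 can be sketched in Python as follows. For a single pair of aligned sequences, nucleotide diversity reduces to the proportion of differing sites, so this sketch computes an uncorrected p-distance per window; whether the original plots applied any distance correction is not stated here, so the absence of one is an assumption. The function name `sliding_window_divergence` is hypothetical, alignment columns containing a gap are simply skipped, and the defaults mirror the 150-bp window and 50-bp step of panel A.

```python
def sliding_window_divergence(seq_a, seq_b, window=150, step=50):
    """Return a list of (window_midpoint, p_distance) for two aligned sequences.

    p_distance is the fraction of compared (ungapped) columns that differ.
    Windows in which every column contains a gap are omitted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for start in range(0, len(seq_a) - window + 1, step):
        compared = 0
        differences = 0
        for a, b in zip(seq_a[start:start + window], seq_b[start:start + window]):
            if a == "-" or b == "-":
                continue  # assumption: gapped columns are excluded
            compared += 1
            if a != b:
                differences += 1
        if compared:
            profile.append((start + window // 2, differences / compared))
    return profile
```

Calling the function with `window=75, step=1` would correspond to the finer resolution described for panel B. A region where the returned distance drops toward zero is the kind of anomaly discussed in the text.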
The recombination signal in the original M sequence of Mizi et al. (2005) became even stronger when all six sequences were taken into account (table 1). This was a result of the closer relationship between the newly sequenced M genome and the nonrecombinant majority of the original sequence.

Table 1
Statistical Support (Bonferroni-Corrected Average P Value) for the Recombination in the nad3 of Mytilus galloprovincialis Based on Two Alignments Including Four or Six Genomes

Program     Six genomes      Four genomes
RDP         1.37 × 10^-30    6.17 × 10^-21
GENECONV    1.44 × 10^-24    4.26 × 10^-18
BootScan    7.34 × 10^-31    6.49 × 10^-21
MaxChi      9.13 × 10^-09    1.89 × 10^-08
Chimaera    8.17 × 10^-09    6.38 × 10^-09
SiScan      2.00 × 10^-12    3.93 × 10^-16
LARD        3.73 × 10^-29    5.77 × 10^-29
3Seq        2.00 × 10^-14    1.95 × 10^-14

Recombinant sequences of supposedly mitochondrial origin are not rare in GenBank. Still, many of them may rather represent all kinds of artifacts. According to Piganeau et al. (2004), much care should be taken to avoid errors, and even greater care should be exercised before announcing recombination. Following this recommendation, it is reasonable to ask: Is it possible that the case discussed here does not represent a genuine recombinant mitochondrial sequence?

Mizi et al. (2005) used a three-step procedure to obtain the sequence. It involved two long-range polymerase chain reactions (PCR), digestion of the obtained products with restriction enzymes, and cloning of the resulting fragments followed by sequencing of the obtained clones. The final assembly was facilitated by sequences obtained from additional PCR products spanning clone junctions. The span of the obtained clones is presented in figure 1B, and the relevant restriction sites are shown in figure 1C. Apparently, the cut sites are flanked by potential recombination breakpoints.
Therefore, the whole “recombined” fragment must have been obtained in the PCR with primers ND3-f1 and ND3-r (fig. 1C). Notably, both primers show a bias toward F-like sequences, particularly in the 3′ part. Mizi et al. (2005) used sperm as the source of DNA in the hope of avoiding contamination with the F genome. However, it is difficult, if not impossible, to obtain DNA in which sensitive PCR of defined specificity could not detect both genomes, even if one of them is present only in a minuscule quantity. Because the primers used for PCR show bias toward the F genomes, the advantage of using sperm DNA in an effort to avoid F contamination could have been easily offset by PCR specificity. Therefore, it is possible that PCR amplified the fragment of the F genome and such sequences were then assembled with the sequences from the cloned M genome fragments, leading to the observed mosaic structure.

Table 2
Length, Base Composition, and Sequence Divergence of Newly Sequenced Genomes

                                   Base Composition (%)     Divergence (SE)
mtDNA Gene/Region  Type  Length    T     C     A     G      K              Ks             Ka
Noncoding
  VD1              F     690       28.3  15.4  26.4  30.0   0.446 (0.042)  NA             NA
                   M     491       28.9  16.5  35.4  19.1
  CD               F     355       33.0  14.1  36.6  16.3   0.046 (0.011)  NA             NA
                   M     341       33.4  13.8  36.4  16.4
  VD2              F     149       14.8  12.1  47.0  26.2   0.106 (0.041)  NA             NA
                   M     83        10.8  14.5  43.4  31.3
  Whole CR         F     1,194     28.0  14.6  32.0  25.5   0.232 (0.018)  NA             NA
                   M     915       29.0  15.3  36.5  19.2
tRNA
  All tRNA         F     1,517     34.2  13.8  32.6  19.4   0.120 (0.009)  NA             NA
                   M     1,518     34.5  14.0  32.9  18.6
rRNA
  rrnaL            F     1,244     33.8  13.0  32.3  20.9   0.166 (0.013)  NA             NA
                   M     1,243     31.5  14.9  33.6  20.0
  rrnaS            F     947       31.4  13.4  32.5  22.7   0.134 (0.012)  NA             NA
                   M     949       30.1  14.0  34.8  21.1
  All rRNA         F     2,191     32.7  13.2  32.4  21.7   0.151 (0.009)  NA             NA
                   M     2,192     30.9  14.5  34.1  20.4
Protein
  atp6             F     717       38.4  14.8  22.1  24.6   0.263 (0.023)  0.884 (0.115)  0.081 (0.014)
                   M     717       37.4  14.1  24.8  23.7
  cox1             F     1,656     34.3  15.2  26.7  23.8   0.200 (0.012)  0.835 (0.071)  0.029 (0.006)
                   M     1,740     33.2  16.6  27.4  22.8
  cox2             F     729       33.6  15.6  27.0  23.8   0.237 (0.023)  0.924 (0.135)  0.061 (0.012)
                   M     729       34.6  16.1  27.4  21.9
  cox3             F     936       35.0  16.4  23.4  25.2   0.237 (0.017)  0.831 (0.090)  0.070 (0.013)
                   M     936       34.9  17.0  24.1  23.9
  cob              F     1,308     35.7  17.4  24.4  22.5   0.239 (0.016)  0.757 (0.073)  0.086 (0.011)
                   M     1,311     36.2  18.3  24.9  20.6
  nad1             F     1,079     36.6  15.6  22.8  25.0   0.302 (0.022)  0.895 (0.095)  0.123 (0.015)
                   M     1,067     37.3  14.1  23.3  25.4
  nad2             F     948       31.9  13.9  27.1  27.2   0.350 (0.024)  0.996 (0.116)  0.162 (0.019)
                   M     948       35.7  12.0  27.9  24.4
  nad3             F     351       38.5  12.4  21.6  27.6   0.215 (0.028)  0.707 (0.128)  0.061 (0.017)
                   M     351       38.8  12.9  20.7  27.6
  nad4             F     1,308     33.3  15.5  25.7  25.5   0.295 (0.019)  0.965 (0.091)  0.099 (0.012)
                   M     1,308     35.1  13.5  26.9  24.5
  nad4L            F     282       39.1  12.9  24.7  23.3   0.318 (0.044)  1.083 (0.255)  0.107 (0.025)
                   M     282       38.4  14.3  25.1  22.2
  nad5             F     1,707     35.6  14.6  26.5  23.4   0.273 (0.016)  0.759 (0.062)  0.114 (0.012)
                   M     1,689     36.2  14.2  26.9  22.7
  nad6             F     465       40.0  8.4   24.9  26.6   0.311 (0.033)  0.776 (0.120)  0.147 (0.024)
                   M     465       39.4  10.0  27.7  22.9
  All CDS          F     11,486    35.3  15.0  25.1  24.6   0.264 (0.006)  0.852 (0.031)  0.091 (0.004)
                   M     11,543    35.8  14.9  26.0  23.3
Complete           F     16,780    34.2  14.5  27.5  23.8   0.224 (0.004)  NA             NA
                   M     16,639    34.5  14.6  28.6  22.2

The GenBank database contains sequences from several M. galloprovincialis expressed sequence tag libraries (Venier et al. 2003; Tanguy et al. 2008). A search of these libraries for nad3 returned nine reasonably long sequences: eight of F type, one of M type, and none mosaic. Therefore, even if the presented interpretation is wrong and this is a true case of mtDNA recombination, it would then be most likely limited to this single individual.

FIG. 2. Comparison of newly sequenced Mytilus galloprovincialis M and F genomes over the area shown in figure 1B (gray). The span of overlapping, sequenced PCR products is shown by lines above the protein-coding genes. The comparison of genomes sequenced by Mizi et al. (2005) is shown in black.

Methods

The selection of individual genomes for sequencing was as follows: Two M and three F major phylogenetic clades have been identified among M. galloprovincialis–M. edulis mt haplotypes (Śmietanka et al. 2009). Of the two M clades, one was associated with Mediterranean–Black Sea M. galloprovincialis. One male individual carrying the haplotype representative for this clade has been chosen for sequencing of the entire M genome (ORI27). Likewise, a female carrying the haplotype typical for the M. galloprovincialis F clade was chosen for sequencing of the entire F genome (AZO20). DNA was isolated from the mantle tissue of ripe animals, as described previously (Śmietanka et al. 2009).

The sequencing strategy involved two steps: a long-range PCR (LR-PCR) and a set of rePCRs in which the product of the LR-PCR was highly diluted (1:1,000) and used as a substrate. PCR products were then sequenced directly, as described previously (Zbawicka et al. 2007). The nearly complete M genome (16,432 bp) was amplified in one reaction with the following, highly specific primers: MGL 5′-ACGCTTAGATTCCTTGCCATTGCC-3′ and MGLR 5′-GAAACTATCTCCAACTATCTGCGTATATTCCTG-3′. The remaining 264 bp located between LR-PCR primers was determined by sequencing the M-specific AB40-AB20 PCR product (Burzyński et al. 2006; Filipowicz et al. 2008). The F genome was amplified in two overlapping LR-PCRs (10,782 and 7,709 bp, 1,737-bp overlap) with universal primers AB23–AB50R and AB44–AB33: 5′-AATTGCGCTGTTATCCCTAGAGTAGC-3′ (Burzyński et al. 2006; Zbawicka et al. 2007). The small fragment (164 bp) from between LR-PCR primers was obtained from PCR product AB40CBM2 (Burzyński et al. 2006). The annotation of mt sequences followed the procedure described by Zbawicka et al. (2007). Obtained sequences have been deposited in GenBank (FJ890849–50).

Sliding window plots of M–F divergences were obtained using DnaSP version 4.50 (Rozas et al. 2003). The following programs were used for detecting recombination: RDP (Martin and Rybicki 2000), MaxChi (Maynard Smith 1992), LARD (Holmes et al. 1999), GENECONV (Padidam et al. 1999), SiScan (Gibbs et al. 2000), Chimaera (Posada and Crandall 2001), Bootscan (Martin et al. 2005), and 3Seq (Boni et al. 2007) under RDP version 3.34 with default parameters (Martin et al. 2005). Genetic diversity indices were calculated in MEGA4 (Tamura et al. 2007), following the procedure used by Mizi et al. (2005).

Acknowledgments

This work was partially funded by grant no. N303 418336 from the Polish Ministry of Science to A.B. Research was done at Polish Academy of Sciences, Institute of Oceanology, Department of Genetics and Marine Biotechnology.

Literature Cited

Boni MF, Posada D, Feldman MW. 2007. An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics. 176:1035–1047.
Boore JL, Medina M, Rosenberg LA. 2004. Complete sequences of the highly rearranged molluscan mitochondrial genomes of the scaphopod Graptacme eborea and the bivalve Mytilus edulis. Mol Biol Evol. 21:1492–1503.
Breton S, Burger G, Stewart DT, Blier PU. 2006. Comparative analysis of gender-associated complete mitochondrial genomes in marine mussels (Mytilus spp.). Genetics. 172:1107–1119.
Burzyński A, Zbawicka M, Skibinski DOF, Wenne R. 2003. Evidence for recombination of mtDNA in the marine mussel Mytilus trossulus from the Baltic. Mol Biol Evol. 20:388–392.
Burzyński A, Zbawicka M, Skibinski DOF, Wenne R. 2006. Doubly uniparental inheritance is associated with high polymorphism for rearranged and recombinant control region haplotypes in Baltic Mytilus trossulus. Genetics. 174:1081–1094.
Filipowicz M, Burzyński A, Śmietanka B, Wenne R. 2008. Recombination in mitochondrial DNA of European mussels Mytilus. J Mol Evol. 67:377–388.
Gibbs MJ, Armstrong JS, Gibbs AJ. 2000. Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics. 16:573–582.
Holmes E, Worobey M, Rambaut A. 1999.
Phylogenetic evidence for recombination in dengue virus. Mol Biol Evol. 16:405–409.
Ladoukakis ED, Zouros E. 2001. Direct evidence for homologous recombination in mussel (Mytilus galloprovincialis) mitochondrial DNA. Mol Biol Evol. 18:1168–1175.
Martin DP, Posada D, Crandall KA, Williamson C. 2005. A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints. AIDS Res Hum Retroviruses. 21:98–102.
Martin DP, Rybicki E. 2000. RDP: detection of recombination amongst aligned sequences. Bioinformatics. 16:562–563.
Martin DP, Williamson C, Posada D. 2005. RDP2: recombination detection and analysis from sequence alignments. Bioinformatics. 21:260–262.
Maynard Smith J. 1992. Analyzing the mosaic structure of genes. J Mol Evol. 34:126–129.
Mizi A, Zouros E, Moschonas N, Rodakis GC. 2005. The complete maternal and paternal mitochondrial genomes of the Mediterranean mussel Mytilus galloprovincialis: implications for the doubly uniparental inheritance mode of mtDNA. Mol Biol Evol. 22:952–967.
Padidam M, Sawyer S, Fauquet CM. 1999. Possible emergence of new geminiviruses by frequent recombination. Virology. 265:218–225.
Piganeau G, Gardner M, Eyre-Walker A. 2004. A broad survey of recombination in animal mitochondria. Mol Biol Evol. 21:2319–2325.
Posada D, Crandall KA. 2001. Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci USA. 98:13757–13762.
Rawson P. 2005. Nonhomologous recombination between the large unassigned region of the male and female mitochondrial genomes in the mussel, Mytilus trossulus. J Mol Evol. 61:717–732.
Rodakis GC, Cao L, Mizi A, Kenchington ELR, Zouros E. 2007. Nucleotide content gradients in maternally and paternally inherited mitochondrial genomes of the mussel Mytilus. J Mol Evol. 65:124–136.
Rozas J, Sánchez-DelBarrio JC, Messeguer X, Rozas R. 2003.
DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics. 19:2496–2497.
Skibinski DOF, Gallagher C, Beynon CM. 1994. Mitochondrial DNA inheritance. Nature. 368:817–818.
Śmietanka B, Burzyński A, Wenne R. 2009. Molecular population genetics of male and female mitochondrial genomes in European mussels Mytilus. Mar Biol (Berl). 156:913–925.
Tamura K, Dudley J, Nei M, Kumar S. 2007. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol. 24:1596–1599.
Tanguy A, Bierne N, Saavedra C, et al. (25 co-authors). 2008. Increasing genomic information in bivalves through new EST collections in four species: development of new genetic markers for environmental studies and genome evolution. Gene. 408:27–36.
Venetis C, Theologidis I, Zouros E, Rodakis GC. 2007. A mitochondrial genome with a reversed transmission route in the Mediterranean mussel Mytilus galloprovincialis. Gene. 406:79–90.
Venier P, Pallavicini A, De Nardi B, Lanfranchi G. 2003. Towards a catalogue of genes transcribed in multiple tissues of Mytilus galloprovincialis. Gene. 314:29–40.
Zbawicka M, Burzyński A, Wenne R. 2007. Complete sequences of mitochondrial genomes from the Baltic mussel Mytilus trossulus. Gene. 406:191–198.
Zouros E, Ball AO, Saavedra C, Freeman KR. 1994. Mitochondrial DNA inheritance. Nature. 368:818.

Richard Thomas, Associate Editor
Accepted April 14, 2009