Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Automated Structural Comparisons Clarify the Phylogeny of the Right-Hand-Shaped Polymerases Heli A. M. Mönttinen,1 Janne J. Ravantti,1,2 David I. Stuart,3,4 and Minna M. Poranen*,1 1 Department of Biosciences, Viikki Biocenter, University of Helsinki, Helsinki, Finland Institute of Biotechnology, Viikki Biocenter, University of Helsinki, Helsinki, Finland 3 Division of Structural Biology and the Oxford Protein Production Facility, The Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom 4 Diamond Light Source Limited, Harwell Science and Innovation Campus, Oxfordshire, United Kingdom *Corresponding author: E-mail: [email protected]. Associate editor: Csaba Pal 2 Abstract Polymerases are essential for life, being responsible for replication, transcription, and the repair of nucleic acid molecules. Those that share a right-hand-shaped fold and catalytic site structurally similar to the DNA polymerase I of Escherichia coli may catalyze RNA- or DNA-dependent RNA polymerization, reverse transcription, or DNA replication in eukarya, archaea, bacteria, and their viruses. We have applied novel computational methods for structure-based clustering and phylogenetic analyses of this functionally diverse polymerase superfamily, which currently comprises six families. We identified a structural core common to all right-handed polymerases, composed of 57 amino acid residues, harboring two positionally and chemically conserved residues, the catalytic aspartates. The structural conservation within each of the six families is considerable, for example, the structural core shared by family Y DNA polymerases covers over 90% of the polymerase domain of the Sulfolobus solfataricus Dpo4. Our phylogenetic analyses propose an early separation of RNAdependent polymerases that use primers from those that are primer-independent. Furthermore, the exchange of polymerase genes between viruses and their hosts is evident. Because of this horizontal gene transfer, the phylogeny of polymerases does not always reflect the evolutionary history of the corresponding organisms. Key words: nucleic acid polymerase, protein evolution, structural alignment, structural classification. Introduction 2001; Murakami et al. 2008). In addition, cellular telomerase reverse transcriptases have been assigned to be members of this group (Gillis et al. 2008). The DNA polymerases of the third group share a similar overall fold with the second group, but the topology of the palm subdomain is different as exemplified by the polymerase of Rattus norvegicus (Joyce and Steitz 1995; Aravind and Koonin 1999). To emphasize this difference, these DNA polymerases are sometimes referred to left-handed polymerases (Beard and Wilson 2000). The hallmark of a right-handed polymerase is the canonical palm subdomain with a 1-1-2-3-2-4-fold (fig. 1B) (Ollis et al. 1985; Steitz 1999). Strands 1 and 3 contain catalytic aspartates, which coordinate the two catalytic magnesium ions (fig. 1B) (Steitz 1998). In addition to the polymerase activity that derives from the fingers, thumb, and palm subdomains, the polypeptide chain of many righthanded polymerases contains domains that may carry out, for instance, exonuclease activity (Freemont et al. 1986). Polymerases can use two different initiation mechanisms: Primer dependent or primer independent (de novo) (van Dijk et al. 2004). In primer-dependent initiation, an RNA or a protein primer provides the hydroxyl group for the first incoming NTP, or the polymerase initiates from a 3’-hydroxyl ß The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected] Mol. Biol. Evol. 31(10):2741–2752 doi:10.1093/molbev/msu219 Advance Access publication July 24, 2014 2741 Article Polymerases are responsible for preserving genetic information by replicating and repairing nucleic acid molecules as well as for the expression of genes through transcription. Known replicases, transcriptases, and DNA repair polymerases can be divided into three groups based on the structural fold of their catalytic site. The first group includes multisubunit cellular transcriptases and homodimeric RNA silencing pathway polymerases, which have similar double-psi -barrel structures containing the catalytic residues (Salgado et al. 2006; Werner and Grohmann 2011). The second group comprises polymerases that structurally resemble DNA polymerase I (Pol I) of Escherichia coli, often referred as right-handed polymerases (Ollis et al. 1985; Beard and Wilson 2000). Their overall fold is characterized by thumb, fingers, and palm subdomains (fig. 1A) (Kohlstaedt et al. 1992). Such a protein fold is shared by the DNA polymerases of families A, B, and Y that catalyze DNA synthesis using DNA template, the single-subunit DNA-dependent RNA polymerases (DdRps) that are related to phage T7 RNA polymerase, and viral RNA-dependent polymerases including reverse transcriptases of retroviruses and viral RNA-dependent RNA polymerases (RdRps) (table 1) (Ollis et al. 1985; Kohlstaedt et al. 1992; Sousa et al. 1993; Hansen et al. 1997; Wang et al. 1997; Zhou et al. MBE Mönttinen et al. . doi:10.1093/molbev/msu219 group of a pre-existing DNA strand, as exemplified by telomerases and DNA polymerases involved in DNA repair or employing a rolling-circle mechanism. In de novo initiation, the first (initiation) NTP serves this function (Butcher et al. 2001). Previously, the homology and phylogeny of right-handed polymerases has been investigated mainly by using amino acid sequence alignments (Poch et al. 1989; Delarue et al. 1990; Ito and Braithwaite 1991; Koonin 1991; Braithwaite and Ito 1993; Ohmori et al. 2001). Although such approaches can lead to unexpected notions of protein homology, they are prone to limitations when the sequence similarity is low, as it is between some of the right-handed polymerases. Nowadays, there are many high-resolution polymerase structures available in the Protein Data Bank (PDB; www.pdb.org, last accessed July 28, 2014), making structure-based phylogeny and homology studies potential alternatives. The advantage of such approaches is that similarities and hence homology can still be detected between the protein folds of distantly related, homologous proteins, when the amino acid sequences have lost detectable similarity (Benson et al. 1999). FIG. 1. A typical structure of a right-hand-shaped polymerase and its palm subdomain. (A) The structure of Escherichia coli Pol I (PDBid: 2KFZ). The subdomains are indicated with colors: Palm in green, fingers in yellow, and thumb in pink. A close-up (B) shows the palm subdomain rotated 55 by y axis compared with (A). The order of the secondary structure elements is indicated with numbers and the position of the catalytic aspartates with turquoise spheres. We have applied a structure-based comparison method (Ravantti et al. 2013) for the identification of equivalent residues in right-handed polymerase structures. Based on these equivalences, the structures were classified into four main clusters that were further dissected into subclusters. This approach allowed us to identify the conserved structural core for the right-handed polymerases, that is, the subset of amino acid residues that can be considered to be equivalent in all these enzymes. Similarly, equivalences were defined for the different polymerase clusters and subclusters, that is, RdRps, reverse transcriptases, DdRps, and family A, B, and Y DNA polymerases. On the basis of the similarity of the structural cores, we also present phylogenetic trees for the four major clusters: RNA-dependent polymerases (RdRps and reverse transcriptases), the family Y DNA polymerases, the family A DNA polymerases and the single-subunit DdRps, and the family Y DNA polymerases. Results The Right-Handed Polymerase Structures In the PDB, there are over 1,000 high-resolution structures of right-handed polymerases. However, most of these are for a rather small number of extensively studied enzymes, such as the RdRp of the hepatitis C virus (HCV) and the reverse transcriptase of human immunodeficiency virus (HIV). For the analysis presented here, a single representative structure was selected for each polymerase type from all available species. These 49 polymerases included 18 RdRps, 3 reverse transcriptases, 3 single-subunit DdRps, and 6, 10, and 9 family A, family B, and family Y DNA polymerases, respectively (supplementary table S1, Supplementary Material online), involved in nucleic acid replication, transcription, or repair. The resolution of the selected structures varied between 1.6 and 3.3 Å, and typically, a structure covered 80–100% of the total length of the corresponding polypeptide (see supplementary table S1, Supplementary Material online). Despite the small coverage of some polymerase structures (e.g., only ~35% in the case of human Rev1), all the selected structures contain nearly complete polymerase domains, composed of palm, fingers, and thumb subdomains. RNA templated DNA templated Table 1. Polymerases Used in This Study. a Polymerase Family Family A DNA polymerases Family B DNA polymerases Family Y DNA polymerases Single-subunit DdRps Reverse transcriptases RdRps For example, retrovirus reverse transcriptase. 2742 Activity DNA-dependent DNA polymerization DNA-dependent DNA polymerization DNA-dependent DNA polymerization DNA-dependent RNA polymerization RNA (or DNA)a-dependent DNA polymerization RNA-dependent RNA polymerization Function Processing of the Okazaki fragments, DNA repair, or replication Replication or DNA repair DNA repair Transcription Viral genome replication or telomere synthesis Viral genome replication and transcription Organism Bacteria, phages, and mitochondria Archaea, bacteria, eukaryotes, and phages Archaea, bacteria, and eukaryotes Mitochondria and phages Retroviruses ssRNA viruses and dsRNA viruses Phylogeny of the Right-Hand-Shaped Polymerases . doi:10.1093/molbev/msu219 The Common Structural Core Shared by the Right-Hand-Shaped Polymerases In this study, the selected polymerase structures were aligned and clustered based on their similarity by performing progressive pairwise comparisons between sets of protein structures on the basis of several properties such as geometry, secondary structure, and physicochemical properties of the amino acid residues (see supplementary table S2, Supplementary Material online) to identify equivalent residues. The final parameterization was somewhat different from that originally reported (Ravantti et al. 2013), having been optimized for this problem, as described in Materials and Methods, and was dominated by the properties describing the local geometry and local alignment (supplementary table S2, Supplementary Material online). Instead, the proportional weight of the sequence in the structural alignment was only 3.5%. The optimized parameters for this problem produced more complete and biologically more meaningful structural alignment compared with the original ones (data not shown). The pairwise comparison for a pair or a group of structures led to the identification of a set of equivalent residues defining their common core. This process was gradually continued by merging two closest structures or common structural cores to produce the final hierarchical classification (Ravantti et al. 2013) (fig. 2). Homologous structure finder (HSF)-based structural alignment with the optimized parameters found a total of 57 -carbons common to all the selected right-handed polymerase structures with an average root-mean-square deviation (rmsd) of 4.0 Å (fig. 2). The corresponding residues are depicted for representatives of the six polymerase families in figure 3 (panel CORE). The identified structurally conserved amino acids are mainly located in two -helixes and three -strands (or equivalently positioned polypeptide chains assigned as random coils) within the palm subdomain (fig. 3, panel CORE, green). A closer inspection of structures revealed a fourth -strand in the palm subdomain of all the right-handed polymerases, except in the single-subunit DdRps, which do not have this strand. In these proteins, the C-terminus of the protein partially aligns with the N-terminus of this fourth -strand (fig. 3, panel CORE). The HSF program could not automatically detect any chemically and positionally identical amino acids among the polymerases included in our comparison. This implies that a fold with similar function can be formed by sequences sharing barely any identity. However, when we relaxed the positional constrains and searched for identical amino acids within a sequence window of 1 residues, the two catalytic aspartates located in 1 and 3 strands (fig. 1B) were identified as conserved. The Four Clusters of Right-Handed Polymerases and Their Common Cores The right-handed polymerase structures (supplementary table S1, Supplementary Material online) were classified by the HSF program into four clusters: I) RdRps and reverse transcriptases, II) family Y DNA polymerases, III) family A MBE DNA polymerases and single-subunit DdRps, and IV) family B DNA polymerases (fig. 2). The four main clusters were further divided into several subclusters (fig. 2, indicated with letters). The structure-based grouping clustered the polymerase structures according to the established family classification (fig. 2). Our data imply that the RdRps and reverse transcriptases (cluster I) are structurally closely related sister families (fig. 2). Similar structural relatedness could be observed between the family A DNA polymerases and single-subunit DdRps (cluster III), despite of their different functional specialization. Next, we will describe the four clusters and selected subclusters or groups of subclusters that correspond to the six polymerase families and define structurally conserved cores for each family (fig. 3). The sizes of the cores (number of structurally equivalent -carbons) and the corresponding rmsd values are provided for each cluster and subcluster in figure 2. The coverage of the core as a percentage of the entire polymerase domain is provided for selected polymerase structures in figure 3. Cluster I Cluster I covers 18 RdRp and 3 reverse transcriptase structures, that is, all the RNA-templated polymerases in our data set (fig. 2). Twenty of these polymerases originate from viruses and one from a eukaryotic organism (supplementary table S1, Supplementary Material online). The structurally conserved region shared by the RNA-dependent polymerases is depicted for rabbit hemorrhagic disease virus (RHDV) RdRp and the reverse transcriptase of HIV in figure 3 (panel I, first and second rows, respectively). This conserved core covers most of the palm subdomain (fig. 3, panel I, green), including the binding sites of the catalytic and noncatalytic ions (Mönttinen et al. 2012), and parts of the thumb and fingers subdomains (fig. 3, panel I, pink and yellow, respectively). Cluster I has six subclusters (IA–F) (fig. 2), which follow the established viral family classification (Carstens 2012). Subclusters A–F encompass polymerases that originate from the members of the viral families Birnaviridae and Cystoviridae, Caliciviridae, Picornaviridae, Flaviviridae, Reoviridae, Leviviridae, and Retroviridae, respectively (for the taxonomic classification of the viruses, see supplementary table S1, Supplementary Material online). In addition to the reverse transcriptases of retroviruses, subcluster F contains the telomerase reverse transcriptase from Tribolium castaneum. The conserved core of subclusters IA–E is depicted for the RHDV RdRps in figure 3 panel IA–E. The degree of structural conservation in RdRps is significantly higher, and the secondary structures are more continuously covered than at the previous level of hierarchy, that is, in the RNA-dependent polymerases (fig. 3, first row, compare panels IA–E and I). The core encompasses basically all the secondary structure elements of the palm, fingers, and thumb subdomains of the RHDV RdRp (fig. 3, first row, compare panels M and IA–E) forming a structural scaffold for the smallest RdRps, such as those originating from caliciviruses (e.g., the RHDV RdRp depicted in fig. 3) and picornaviruses. 2743 Mönttinen et al. . doi:10.1093/molbev/msu219 MBE FIG. 2. The hierarchical clustering of the right-handed polymerases. The main clusters are marked with Roman numerals (I, II, III, and IV) and the subclusters with letters. The established polymerase family name is indicated in the surrounding bar (RT refers to reverse transcriptases) and the organism from which the polymerase comes from in the tip of each branch. The background colors indicate the domain of life: Viral polymerases green, eukaryotic polymerases yellow, archaeal polymerases orange, and bacterial polymerases pink. The type or name of the polymerase is also provided when it is essential for clarity (DNA pol refers to DNA polymerase). The number of equivalent amino acids, that is, the size of the common core, and the corresponding average rmsd values based on the common -carbons (in Ångströms) are indicated close to the root of each branch. IBDV, infectious bursal disease virus; ’6, pseudomonas phage ’6; CV, human enterovirus B (coxsackie virus); PV, poliovirus; HMDV, human enterovirus A (handfoot-and-mouth disease virus); HRV, human rhinovirus A; FMDV, foot-and-mouth disease virus; HCV, hepatitis C virus; BVDV, bovine viral diarrhea virus; Q, enterobacteria virus Q; MLV, murine leukemia virus; T. castaneum, Tribolium castaneum; M. smegmatis, Mycobacterium smegmatis; G. kaustophilus, Geobacillus kaustophilus; G. stearothermophilus, Geobacillus stearothermophilus; T. aquaticus, Thermus aquaticus; T7, enterobacteria phage T7; N4, Escherichia phage N4; T. gorgonarius, Thermococcus gorgonarius; P. furiosus, Pyrococcus furiosus; D. tok, Desulfurococcus tok; T. kodakarensis, Thermococcus kodakarensis; P. abyssi, Pyrococcus abyssi; RB69, enterobacteria phage RB69; ’29, Bacillus phage ’29. The structurally common regions characteristic of the RdRps (subclusters IA–E) are located in the fingers and thumb subdomains (fig. 3, panel IA–E, yellow and pink, respectively). Interestingly, two of the seven conserved -helices in the fingers subdomain (fig. 3, panel IA–E, yellow) are conserved also in the structure of the telomerase reverse 2744 transcriptase (PDBid: 3KYL), which is a member of subcluster IF, but not in other members of this subcluster (the retroviral reverse transcriptases, data not shown). This suggests that the fingers subdomain of the telomerase reverse transcriptases is more closely related to RdRps than is the fingers subdomain of retroviral reverse transcriptases. Phylogeny of the Right-Hand-Shaped Polymerases . doi:10.1093/molbev/msu219 MBE FIG. 3. The conserved structural features in the right-handed polymerases. The common structural features at each level of hierarchy are shown for a representative structure from each polymerase family. The four clusters (identified in fig. 2) covering the six polymerase families are indicated on the left (DNA pols refers to DNA polymerases). Green, pink, and yellow refer to residues in the palm, thumb, and fingers subdomains, respectively. Brown indicates regions (subdomains/domains) that are distinct from the basic polymerase domain. The first panel of each row (M) depicts the selected model structures. In the second panel (CORE), the residues that are common for all the right-handed polymerases are shown for each model structure. (continued) 2745 MBE Mönttinen et al. . doi:10.1093/molbev/msu219 Cluster II The family Y DNA polymerases are highly similar in the palm, fingers, and thumb subdomains. The common region covers 93% of the polymerase domain of Sulfolobus solfataricus Dpo4 depicted in figure 3 (third row). The members of family Y DNA polymerases have a unique structure known as the polymerase-associated domain in eukaryotic polymerases and as the little finger in archaeal and bacterial enzymes. However, the similarity of this domain cannot be confirmed because the corresponding region is not present in all the crystal structures used for the comparison (supplementary table S1, Supplementary Material online). Cluster III The family A DNA polymerases and single-subunit DdRps form the two subclusters of cluster III (fig. 2). The grouping of these two polymerase families into the same cluster supports the hypothesis that the single-subunit DdRps and family A DNA polymerases are formed via gene duplication in a phage T7-like organism (Cermakian et al. 1997; Filee and Forterre 2005). The polymerases of this cluster are from bacteria (family A DNA polymerases), phages (family A DNA polymerases and single-subunit DdRps), and mitochondrion (family A DNA polymerases and single-subunit DdRps). In addition to the common palm region, the equivalent residues are mainly located in the fingers and thumb subdomains as depicted for the E. coli Pol I and phage T7 DdRp in figure 3 (fourth and fifth rows, respectively, panel III). The new structural features in the common structural core of the subcluster IIIA that were not observed at the higher level of hierarchy (cluster III) include the fourth -strand in the palm domain (fig. 3, panel IIIA, green) and the structurally conserved residues of the exonuclease domain (fig. 3, panel IIIA, brown). The common core of the IIIB subcluster has two notable features not present at the higher level of hierarchy, the most prominent composed of four -helices (fig. 3, panel IIIB, green), which are connected to 1 of the conserved fold in the palm. This structure is known as the palm-insertion module, and it closes off the back of the substrate tunnel (Jeruzalmi and Steitz 1998). Cluster IV Cluster IV of family B DNA polymerases contains four distinct subclusters (fig. 2). Subclusters A and B contain only one member, the DNA polymerases of phage ’29 of Podoviridae family and phage RB69 of Myoviridae family, respectively. The DNA polymerases from thermophilic archaea are clustered together in subcluster C. Interestingly, these polymerases are structurally very similar with each other (631 common -carbons with rmsd of 3.1 Å; fig. 2) even though they are from two different archaeal phyla, Crenarchaeota and Euryarchaeota. The last subcluster (D) contains three members: Polymerase II (Pol II) from E. coli, polymerase from Saccharomyces cerevisiae, and herpesvirus DNA polymerase. The relatively low structural conservation observed for the thumb subdomain of family B DNA polymerases (fig. 3, panel IV, pink) likely reflects the difference in the angle between the thumb and the palm subdomains in these polymerase structures. Furthermore, the polymerase of phage ’29, the only representative DNA polymerase in our data set using a protein primer, has a notably smaller thumb subdomain (PDBid: 2PYJ) compared with the other family B polymerases (e.g., the E. coli Pol II depicted in fig. 3, last row). Phylogeny of the Right-Handed Polymerases The HSF-based structural comparison (fig. 2) provides information on the degree of structural similarity among righthanded polymerases resulting in a hierarchical clustering of the polymerase families that accurately reflects the accepted classification (Carstens 2012). However, HSF-based clustering is sensitive to the quality of structures, and also the size of the polymerase may influence its clustering as scattered random matches of unrelated residues between compared cores may increase the score between structures under comparison. This may especially happen if there are no close homologies in the data set (e.g., the RdRps of phage ’6 and Q, fig. 2). Consequently, these results should not be used to directly draw conclusions on phylogenetic relationships between individual structures. To obtain a deeper understanding of the phylogenetic relationships among the right-handed polymerases, we constructed phylogenetic trees for the four polymerase clusters (fig. 2). The phylogeny was deduced from the equivalent residues among the members within the studied cluster, calculated for all pairs of structures within the cluster. The advantage of this approach is that only the structurally conserved cores (fig. 3) shared with all structures are considered, instead of comparing pair-wise features that are not shared among all structures in the data set, and thus may not share a common origin. Because of the limited number of equivalent amino acids within the entire group of right-handed polymerases (57 residues), phylogenetic trees were separately inferred for each of the four clusters, using the larger cores of each of these clusters, instead of constructing a single tree for all the right-handed polymerases. As expected, the grouping of the polymerases in these phylogenetic trees (fig. 4) is slightly different from that in the hierarchical tree (fig. 2), fine tuning the analysis of their FIG. 3. Continued The third column (panels I–IV) represents the residues that are structurally equivalent within each of the four clusters (fig. 2). The fourth column shows equally positioned amino acid residues for selected subclusters or groups of subclusters (panels IA–E, IF, IIIA, and IIIB). The coverage of the common core in each cluster/family is given as a percentage of the entire polymerase domain (third and fourth columns). The representative structure shown are the RdRp of RHDV (PDBid: 1KHV; first row), the reverse transcriptase of HIV (PDBid: 2IAJ; second row), the Dpo4 of Sulfolobus solfataricus (PDBid: 1JX4; third row), the Pol I of Escherichia coli (PDBid: 2KFZ; fourth row), the DdRp of the enterobacteria phage T7 (PDBid: 1MSW; fifth row), and the Pol II of E. coli (PDBid: 3K59; last row). 2746 Phylogeny of the Right-Hand-Shaped Polymerases . doi:10.1093/molbev/msu219 MBE FIG. 4. Phylogeny of the right-handed polymerases. The phylogenetic relations among the conserved core structures of RdRps and reverse transcriptases (I), family Y DNA polymerases (II), family A DNA polymerases and single-subunit DdRps (III), and family B DNA polymerases (IV). The colored squares around the names of the organisms indicate the domain of life, for meaning of colors see figure 2. The mechanism of initiation and the primer type are shown with colored background: White, nucleic acid primer; blue, protein primer; purple, primer-independent initiation mechanism. The RdRps of the dsRNA viruses are shown with red branches and the RdRps of the ssRNA viruses with blue branches. For abbreviations, see figure 2 legend. phylogeny. The most notable difference between the hierarchical tree (fig. 2) and the phylogenetic tree of cluster I (fig. 4, panel I) is the location of the phage ’6 RdRp, which in the phylogenetic tree locates separately from the birnavirus (infectious pancreatic necrosis virus [IPNV] and infectious bursal disease virus) RdRps, between the RdRps that originate from flaviviruses (HCV, bovine viral diarrhea virus, West Nile virus, and Dengue virus) and leviviruses (phage Q). Although the relative limited sampling in clusters II, III, and IV could potentially limit the reliability of the results, the main observations on the phylogenies of the DNA-dependent polymerases (fig. 4, panels II–IV) are consistent with previous studies (Filee et al. 2002). Is There Correlation between the Phylogeny of Viral RNA Polymerases and the Taxonomic Classification of Viruses? In the phylogenetic tree constructed for cluster I, the RNAdependent right-handed polymerases are grouped together 2747 MBE Mönttinen et al. . doi:10.1093/molbev/msu219 into clusters that correspond to the taxonomic families of RNA viruses (see supplementary table S1, Supplementary Material online, for taxonomic groups). However, the phylogeny of the viral RdRps does not seem to follow the higher level taxonomic classification of viruses that is based on the genome type (Baltimore 1971; Carstens 2012). Instead, the polymerases originating from single- and double-stranded RNA viruses are scattered around in the phylogenetic tree (fig. 4, panel I, blue and red branches, respectively). Previous sequence-based analyses have also indicated that the polymerase subunits of ssRNA viruses and dsRNA viruses, like caliciviruses and totiviruses, respectively, may be closely related (Koonin et al. 2008). These results are in line with the observation that viruses with homologous major capsid proteins and genome packaging enzymes may have nonhomologous polymerase genes (Krupovič and Bamford 2009, 2010). The Mechanism of Initiation Is Reflected in the Grouping of Polymerases in the Phylogenetic Tree In our structure-based phylogenetic analysis of the 21 RNAdependent polymerases, the polymerases that apply de novo initiation mechanism are clustered together (fig. 4, panel I, purple background) although derived from clearly distinct viral families. Interestingly, this functional grouping of RNAdependent polymerases is evident although the specific features used for initiation, for example, the initiation platforms of reovirus, phage ’6, and HCV RdRps (Butcher et al. 2001; Hong et al. 2001; Tao et al. 2002; Sarin et al. 2012), are not part of the region used in the reconstruction of the tree. The RdRps that apply protein priming form a distinct cluster (fig. 4, panel I, blue background). However, the birnavirus (IPNV and infectious bursal disease virus) RdRps that function both as the polymerase and the protein primer (Dobos 1995; Pan et al. 2007) are clearly distinct from the RdRps of picornaviruses and caliciviruses, which encode separate RdRp and primer polypeptides. Similarly, among the family B DNA polymerases, the phage ’29 polymerase that uses a protein primer is separated from the other family B DNA polymerases that initiate from nucleic acid primers (fig. 4, panel IV). Similar separation based on the priming mechanisms has also been observed in previous sequence-based phylogenetic analyses of family B DNA polymerases (Filee et al. 2002). Within the phylogenetic tree of cluster I, the viral and telomerase reverse transcriptases that initiate nucleic acid synthesis from a pre-existing nucleic acid primer appear to branch off from the RdRps that apply de novo initiation mechanism (fig. 4, panel I). Although located within the same branch, the telomerase reverse transcriptases appear to be structurally more closely related to the RdRps than are the viral reverse transcriptases. Transfer of DNA Polymerase Genes between Viruses and Cellular Organisms It has been proposed that the key enzymes involved in nucleic acid metabolism were initially developed in virus-like organisms and then transferred to cellular genomes (Forterre 2002; Koonin et al. 2006). In our data set, clusters I, III, and IV contain both viral and cellular enzymes (fig. 2) indicating horizontal transfer of polymerase genes between viruses and 2748 cellular organisms, which is consistent with the “virus origins" proposal. In the phylogenetic tree of cluster III, the DNA polymerases and the single-subunit DdRps are located separately, placing the two polymerases of phage T7 into separate branches (fig. 4, panel III) as observed in the hierarchical clustering (fig. 2). The corresponding enzymes of mitochondria display similar grouping to T7-like phages suggesting that the mitochondrial polymerases probably originate from a T7-like phage integrated into the genome of an -proteobacterium at the origin of mitochondria (Filee and Forterre 2005; Shutt and Gray 2006). The family Y DNA polymerases are involved in DNA repair functions and are currently classified into six branches based on sequence alignments: DinB, Rad30A, Rad30B, Rev1, UmuC of Gram-negative bacteria, and UmuC of Gram-positive bacteria (Ohmori et al. 2001). Structural information is available only for the members of Rad30A ( of Homo sapiens and Sa. cerevisiae), Rad30B ( of H. sapiens), DinB ( of H. sapiens, Dpo4 of Mycobacterium smegmatis and S. solfataricus), Dbh (Dbh of S. acidocaldarius), and Rev1 (Rev1 of H. sapiens and Sa. cerevisiae) branches (supplementary table S1, Supplementary Material online) of which Rad30A, Rad30B, and Rev1 proteins are only found from eukaryotes and DinB from all three domains of cellular life (Pata 2010). In the structure-based phylogenetic tree, these family Y DNA polymerases (fig. 4, panel II) do not form clearly separated branches. However, the grouping resembles the established sequence-based clustering (Ohmori et al. 2001), not detected by the hierarchical clustering (fig. 2). These particular branches of family Y DNA polymerases do not include viral representatives (Filee et al. 2002) perhaps partly explaining the limited variation among the polymerases of this cluster (figs. 2 and 3), consistent with the observation that viruses are major reservoirs of gene diversity and a driving force in cellular evolution (Forterre and Prangishvili 2013). Transfer of DNA Polymerase Genes among Cellular Organisms Previous studies have revealed massive horizontal gene transfer between and among archaea and bacteria, especially among halophilic and thermophilic species (Koonin et al. 2001; Dagan et al. 2008). In the phylogenetic tree of cluster IV, the DNA polymerases of the thermophilic archaea are tightly grouped together, separately from the other family B DNA polymerases, indicating a close phylogenetic relation (fig. 4, panel IV). The family B DNA polymerases are only encoded by the members of a single class of the domain Bacteria, gammaproteobacteria (in this study E. coli Pol II, PDBid: 3K59) but are common among phages infecting these bacteria, including members of the families Podoviridae, Myoviridae, and Tectiviridae (Wang et al. 1997, 1989; Kamtekar et al. 2004; phages RB69 and ’29 in this study), suggesting gene transfer between phages and their hosts. Discussion At present, right-handed polymerase structures are available from three different DNA polymerase families (A, B, and Y), Phylogeny of the Right-Hand-Shaped Polymerases . doi:10.1093/molbev/msu219 single-subunit DdRps, RdRps, and reverse transcriptases. However, knowledge of the structural relationships between these functionally varied enzymes is piecemeal. Typically, newly solved polymerase structures are compared with structures of well-studied polymerases, often from organisms with medical importance, and the broader structural commonalities and relationships are not recognized. We have now applied a novel, automated structure-based comparison method to gain insights into the structural conservation and phylogeny of the full set of structurally characterized right-hand-shaped polymerases. The relatively small number of known unique protein structures, their incompleteness, and the differences in conformational states present serious obstacles for a systematic structure-based comparison of the right-handed polymerases. Despite these limitations, the automated structure-based clustering of 49 right-handed polymerases using the HSF program exactly recapitulated the six established polymerase families (Poch et al. 1989; Delarue et al. 1990; Ito and Braithwaite 1991; Braithwaite and Ito 1993; Ohmori et al. 2001) (fig. 2). Subclustering of the branches of the major clusters was also successful (fig. 2), for example, RdRps and reverse transcriptases of cluster I follow the conventional taxonomy of viruses (Carstens 2012). Note that no manual intervention was required for the structural alignment, whereas this is often necessary for sequence alignments. Thus, the results deduced from this analysis can be considered to be objective. The conserved structural core that is common to all righthanded polymerases is relatively small, comprising 57 -carbons (fig. 3, panel CORE) that form three -strands and two -helices. In both RNA-dependent polymerases (RdRps and reverse transcriptases) and DNA polymerases, a fourth structurally common -strand can be identified (fig. 3) which is not present in the single-subunit DdRps suggesting deletion of the corresponding region in an ancestor of the currently known DdRp genes. Indeed, it has been proposed that the single-subunit DdRps and the family A DNA polymerases originated by gene duplication (Cermakian et al. 1997), probably in an ancient T7-like phage, before the integration of the corresponding polymerase genes into the -proteobacterium that subsequently became an early mitochondrion (Filee and Forterre 2005). Such a hypothesis is supported by the phylogenetic tree (fig. 4, panel III). Based on the structural similarity, the positional conservation of the catalytic amino acids, and conserved function, it is likely that the structural cores (fig. 3, panel CORE) of all righthanded polymerases share a common origin. This common core represents the most crucial structure for the catalysis of DNA and RNA polymers from nucleoside-triphosphate precursors in the right-handed polymerases. Consequently, an ancient protein fold reminiscent of this core structure and harboring the catalytic aspartates on secondary structural elements 1 and 3 (fig. 1B) might have been able to catalyze nucleic acid synthesis in a primitive organism with an RNA genome. To our knowledge, this is the first time that the common, characteristic features are described systematically for all the MBE families of the right-handed polymerases (RdRps, reverse transcriptases, single-subunit DdRps, and family A, B, and Y DNA polymerases). However, less extensive phylogenetic analyses on viral RdRps, both sequence- and structure based, have y been conducted previously (e.g., Koonin et al. 2008; Cern et al. 2014). Although different subsets of polymerases were used, the results obtained here and in these previous analyses are consistent, confirming that the HSF program (Ravantti et al. 2013) can accurately classify structures based on their similarity. The structurally conserved cores defined here (fig. 3) cover substantially larger regions of the polymerase domain than the previously identified conserved motifs deduced from sequence comparisons or structure-based sequence alignments alone (Poch et al. 1989; Hansen et al. 1997; Bruenn 2003; Lang et al. 2013), reflecting the fact that sequence-based methods have a limited capacity to detect spatial and functional conservation. In line with this, all the sequence motifs previously identified for RdRps (Poch et al. 1989) map onto their structurally conserved core shown in figure 3 (panel IA–E), except motif F, which is located in a region not present in all the crystal structures used for the comparison. The classification of viruses based on processes involved in genome replication has been a matter of debate (Koonin et al. 2008; Krupovič and Bamford 2009, 2010). The structure-based clustering of viral RdRps and reverse transcriptases (fig. 2) and the phylogenetic tree deduced from their core structure (fig. 4, panel I) follows the established division of viral families (Carstens 2012). However, the phylogeny of viral RdRps presented here does not follow the proposed higher level classification of viruses that is based on the genome structure (Baltimore 1971) or on the similarity of the major coat protein fold and virion architecture (Abrescia et al. 2012; El Omari et al. 2013). This is in agreement with the notion that because the replication process is not directly coupled to the virion assembly, viral polymerase genes were exchanged relatively freely among the dsRNA and ssRNA viruses prior to the division of the currently known viral families. Consequently, the phylogeny of viral RdRps reflects the evolutionary history of RNA viruses only within limited subgroups (e.g., within viral families) and probably should not be used as a scaffold for the reconstruction of the broader aspects of RNA virus evolution. This is borne out by the phylogeny of polymerase subunits of DNA viruses, which is even more complex than that of viral RdRps. The DNA polymerases of phages belonging to Podoviridae family (e.g., phages T7 and ’29) clustered into distinct parts of the hierarchical tree (III and IV, corresponding to the family A and B DNA polymerases, respectively), whereas DNA polymerases from more distantly related DNA viruses, like human herpesvirus and phage RB69, clustered into the same branch (cluster IV) of the tree (fig. 2). Furthermore, both the hierarchical clustering (fig. 2) and the phylogenetic analyses (fig. 4) as well as previous studies (Filee et al. 2002) indicate that mitochondrial family A DNA polymerases and DdRps as well as gammaproteobacterial family B DNA polymerases were possibly acquired from phages. Based on the phylogenetic analyses presented here (fig. 4, panel I) and elsewhere (Nakamura et al. 1997), it seems that 2749 MBE Mönttinen et al. . doi:10.1093/molbev/msu219 cellular and viral reverse transcriptases share a common origin. However, our data imply that the telomerase reverse transcriptase from T. castaneum is structurally more closely related to the viral RdRps than are the reverse transcriptases of HIV and murine leukemia virus. We propose that this reflects the different mutation rates of retroviral and eukaryotic genomes; once an ancient reverse transcriptase gene, the precursor of telomerase reverse transcriptases, integrated into an eukaryotic genome, it was replicated by a cellular DNA polymerase possessing high fidelity, whereas the precursor of currently known retroviral reverse transcriptases continued to be replicated by the low-fidelity reverse transcriptase enzyme. This resulted in a higher accumulation of mutations in the viral reverse transcriptase gene and its faster evolution. This would also mean that the cellular telomerase reverse transcriptase, resembling RdRps, represents a more ancient form of the reverse transcriptase structure. However, more reverse transcriptase structures are needed to make further predictions on the origin and evolution of telomerases. According to the RNA world hypothesis, the first genomes were RNA replicated by primitive RdRps (Leon 1998), applying most likely de novo initiation mechanisms. Later, RdRps that initiate RNA polymerization with the aid of a protein primer and reverse transcriptases that could utilize deoxyribonucleotides and nucleic acid primers evolved from these primerindependent RdRps (fig. 4). The reverse transcriptases catalyze DNA synthesis on both RNA and DNA templates as a part of their normal function, and consequently, their early forms are potential precursors for the different right-hand-shaped DNA polymerases. The change from RNA to DNA genomes required also DdRps for gene expression from the otherwise inert DNA molecules. Potentially, this was initially carried out by the precursors of cellular RdRps, such as the silencing pathway polymerases QDE-1 of Neurospora crassa, which catalyzes both RNA- and DNA-dependent RNA polymerization (Aalto et al. 2010; Lee et al. 2010) or as a side activity of the preexisting RdRps and reverse transcriptases. Also, the single-subunit right-handed DdRps developed to carry out this function in an ancestor of T7-like phages (fig. 4, panel III). Although present-day polymerases tend to be characterized on the basis of a particular activity (table 1), their potential is frequently much wider. Thus, telomerase reverse transcriptases can work not only as an RNA-dependent DNA polymerase but also as an RdRp (Maida et al. 2009). The RdRp of the phage ’6 can use a DNA chain as template (Makeyev and Grimes 2004), and the RdRp of the IPNV (a birnavirus) can function as a reverse transcriptase (Graham et al. 2011). All the members of the DNA polymerase families can also switch the usage from a DNA template to an RNA template (Ricchetti and Buc 1993; Murakami et al. 2003; Franklin et al. 2004; Storici et al. 2007). Furthermore, one mutation in the DdRp of T7 enables the polymerase to carry out the functions of RdRp, reverse transcriptase, DdRp, and DNA polymerase (Sousa and Padilla 1995). These observations suggest that the template and substrate specificities are relatively flexible properties controlled mainly by differences in affinity and reaction kinetics as well as the availability of different substrate and template molecules. 2750 However, the change in priming mechanism (de novo and nucleic acid priming or protein priming), which is accompanied by more substantial changes in the polymerase structure, has likely been less frequent (fig. 4). Therefore, it is quite possible that the switch in template and substrate specificities might have occurred several times during the evolution of right-handed polymerases. Materials and Methods Protein Structures for Structural Alignment Structures of right-handed polymerases were manually collected from the PDB (www.pdb.org, last accessed July 28, 2014; structures before February 1, 2013). Only one structure from each polymerase type per species was selected for the analysis (see supplementary table S1, Supplementary Material online) applying the following criteria: 1) absence or low number of disordered residues, 2) absence or low number of mutations, 3) resolution of the X-ray analysis and finally 4) structures representing the closed conformation were preferred. Structural Alignment and Clustering Alignment of the selected polymerase structures was performed using HSF program (described previously in Ravantti et al. 2013), which automatically clusters protein structures based on a similarity score defined in Ravantti et al. (2013). The alignment parameters (Ravantti et al. 2013) were optimized for the best automatic clustering by using a locally written Python (Summerfield 2010) script for a selected subset of polymerase structures (available from H.A.M.M.). The criterion for a correct clustering was the alignment of the -sheet in the palm subdomain (fig. 1B), which was verified by visual inspection. The parameters fulfilling the alignment criterion and yielding the highest scores produced by HSF were applied for the full data set (the parameters are listed in supplementary table S2, Supplementary Material online). Visual molecular dynamics program (Humphrey et al. 1996) was used for the visualization of the structures. Identification of Conserved Amino Acids Conserved amino acids within the identified structural cores were recognized by using a locally written Python (Summerfield 2010) script (available from H.A.M.M.). From the original structures, a window of 1 residues around the structurally equivalent amino acids was used. If an identical amino acid was detected within the window in all the polymerases for a cluster, it was marked as a conserved amino acid in that position for that cluster. Phylogenetic Tree Phylogenetic trees for each subcluster were made on the basis of the common cores identified by HSF. First, all pairwise scores were calculated by using the set of parameters used in the initial clustering (supplementary table S2, Supplementary Material online). At the moment, there is no standard method for converting structural alignments into distances reflecting evolution. Consequently, scores were first converted to distances by normalizing them as in Phylogeny of the Right-Hand-Shaped Polymerases . doi:10.1093/molbev/msu219 Ravantti et al. (2013). The resulting distance matrix was converted to a phylogenic tree using Fitch-Margoliash algorithm of Fitch program (Felsenstein 1989). Phylogenetic trees were visualized with Dendroscope 3 program (Huson and Scornavacca 2012). Supplementary Material Supplementary tables S1 and S2 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/). Acknowledgments This work was supported by the Academy of Finland (grant numbers 250113 and 256069 to M.M.P.); the Sigrid Juselius Foundation; and the UK Medical Research Council (grant number G1000099 to D.I.S). H.A.M.M. is a fellow of the Integrative Life Science Doctoral Program. This is a contribution from the Finnish and UK Instruct centers. References Aalto AP, Poranen MM, Grimes JM, Stuart DI, Bamford DH. 2010. In vitro activities of the multifunctional RNA silencing polymerase QDE-1 of Neurospora crassa. J Biol Chem. 285:29367–29374. Abrescia NG, Bamford DH, Grimes JM, Stuart DI. 2012. Structure unifies the viral universe. Annu Rev Biochem. 81:795–822. Aravind L, Koonin EV. 1999. DNA polymerase -like nucleotidyltransferase superfamily: identification of three new families, classification and evolutionary history. Nucleic Acids Res. 27:1609–1618. Baltimore D. 1971. Expression of animal virus genomes. Bacteriol Rev. 35: 235–241. Beard WA, Wilson SH. 2000. Structural design of a eukaryotic DNA repair polymerase: DNA polymerase . Mutat Res. 460:231–244. Benson SD, Bamford JK, Bamford DH, Burnett RM. 1999. Viral evolution revealed by bacteriophage PRD1 and human adenovirus coat protein structures. Cell 98:825–833. Braithwaite DK, Ito J. 1993. Compilation, alignment, and phylogenetic relationships of DNA polymerases. Nucleic Acids Res. 21:787–802. BruennJA.2003.Astructuralandprimarysequencecomparisonoftheviral RNA-dependent RNA polymerases. Nucleic Acids Res. 31:1821–1829. Butcher SJ, Grimes JM, Makeyev EV, Bamford DH, Stuart DI. 2001. A mechanism for initiating RNA-dependent RNA polymerization. Nature 410:235–240. Carstens EB. 2012. Introduction to virus taxonomy. In: King AMQ, Adams MJ, Carstens EB, Lefkowitz EJ, editors. Virus taxonomy: classification and nomenclature of viruses: ninth report of the International Committee on Taxonomy of Viruses. San Diego (CA): Elsevier. p. 3–20. Cermakian N, Ikeda TM, Miramontes P, Lang BF, Gray MW, Cedergren R. 1997. On the evolution of the single-subunit RNA polymerases. J Mol Evol. 45:671–681. y J, Cern a Bolfıkova B, Valdes JJ, Grubhoffer L, Růzek D. 2014. Cern Evolution of tertiary structure of viral RNA dependent polymerases. PLoS One 9:e96070. Dagan T, Artzy-Randrup Y, Martin W. 2008. Modular networks and cumulative impact of lateral transfer in prokaryote genome evolution. Proc Natl Acad Sci U S A. 105:10039–10044. Delarue M, Poch O, Tordo N, Moras D, Argos P. 1990. An attempt to unify the structure of polymerases. Protein Eng. 3:461–467. Dobos P. 1995. Protein-primed RNA synthesis in vitro by the virionassociated RNA polymerase of infectious pancreatic necrosis virus. Virology 208:19–25. El Omari K, Sutton G, Ravantti JJ, Zhang H, Walter TS, Grimes JM, Bamford DH, Stuart DI, Mancini EJ. 2013. Plate tectonics of virus shell assembly and reorganization in phage ’8, a distant relative of mammalian reoviruses. Structure 21:1384–1395. MBE Felsenstein J. 1989. PHYLIP—Phylogeny inference package (version 3.2). Cladistics 5:164–166. Filee J, Forterre P. 2005. Viral proteins functioning in organelles: a cryptic origin? Trends Microbiol. 13:510–513. Filee J, Forterre P, Sen-Lin T, Laurent J. 2002. Evolution of DNA polymerase families: evidences for multiple gene exchange between cellular and viral proteins. J Mol Evol. 54:763–773. Forterre P. 2002. The origin of DNA genomes and DNA replication proteins. Curr Opin Microbiol. 5:525–532. Forterre P, Prangishvili D. 2013. The major role of viruses in cellular evolution: facts and hypotheses. Curr Opin Virol. 3:558–565. Franklin A, Milburn PJ, Blanden RV, Steele EJ. 2004. Human DNA polymerase-, an A-T mutator in somatic hypermutation of rearranged immunoglobulin genes, is a reverse transcriptase. Immunol Cell Biol. 82:342–342. Freemont PS, Ollis DL, Steitz TA, Joyce CM. 1986. A domain of the Klenow fragment of Escherichia coli DNA polymerase I has polymerase but no exonuclease activity. Proteins 1:66–73. Gillis AJ, Schuller AP, Skordalakes E. 2008. Structure of the Tribolium castaneum telomerase catalytic subunit TERT. Nature 455:633–637. Graham SC, Sarin LP, Bahar MW, Myers RA, Stuart DI, Bamford DH, Grimes JM. 2011. The N-terminus of the RNA polymerase from infectious pancreatic necrosis virus is the determinant of genome attachment. PLoS Pathog. 7:e1002085. Hansen JL, Long AM, Schultz SC. 1997. Structure of the RNA-dependent RNA polymerase of poliovirus. Structure 5:1109–1122. Hong Z, Cameron CE, Walker MP, Castro C, Yao N, Lau JY, Zhong W. 2001. A novel mechanism to ensure terminal initiation by hepatitis C virus NS5B polymerase. Virology 285:6–11. Humphrey W, Dalke A, Schulten K. 1996. VMD: visual molecular dynamics. J Mol Graph. 14:33–38. Huson DH, Scornavacca C. 2012. Dendroscope 3: an interactive tool for rooted phylogenetic trees and networks. Syst Biol. 61:1061–1067. Ito J, Braithwaite DK. 1991. Compilation and alignment of DNA polymerase sequences. Nucleic Acids Res. 19:4045–4057. Jeruzalmi D, Steitz TA. 1998. Structure of T7 RNA polymerase complexed to the transcriptional inhibitor T7 lysozyme. EMBO J. 17: 4101–4113. Joyce CM, Steitz TA. 1995. Polymerase structures and functions: variations on a theme? J Bacteriol 177:6321–6329. Kamtekar S, Berman AJ, Wang JM, Lazaro JM, de Vega M, Blanco L, Salas M, Steitz TA. 2004. Insights into strand displacement and processivity from the crystal structure of the protein-primed DNA polymerase of bacteriophage ’29. Mol Cell. 16:609–618. Kohlstaedt LA, Wang J, Friedman JM, Rice PA, Steitz TA. 1992. Crystal structure at 3.5 Å resolution of HIV-1 reverse transcriptase complexed with an inhibitor. Science 256:1783–1790. Koonin EV. 1991. The phylogeny of RNA-dependent RNA polymerases of positive-strand RNA viruses. J Gen Virol. 72:2197–2206. Koonin EV, Makarova KS, Aravind L. 2001. Horizontal gene transfer in prokaryotes: quantification and classification. Annu Rev Microbiol. 55:709–742. Koonin EV, Senkevich TG, Dolja VV. 2006. The ancient Virus World and evolution of cells. Biol Direct. 1:29. Koonin EV, Wolf YI, Nagasaki K, Dolja VV. 2008. The Big Bang of picorna-like virus evolution antedates the radiation of eukaryotic supergroups. Nat Rev Microbiol. 6:925–939. Krupovič M, Bamford DH. 2009. Does the evolution of viral polymerases reflect the origin and evolution of viruses? Nat Rev Microbiol. 7:250. Krupovič M, Bamford DH. 2010. Order to the viral universe. J Virol 84: 12476–12479. Lang DM, Zemla AT, Zhou CL. 2013. Highly similar structural frames link the template tunnel and NTP entry tunnel to the exterior surface in RNA-dependent RNA polymerases. Nucleic Acids Res. 41:1464–1482. Lee HC, Aalto AP, Yang Q, Chang SS, Huang G, Fisher D, Cha J, Poranen MM, Bamford DH, Liu Y. 2010. The DNA/RNA-dependent RNA polymerase QDE-1 generates aberrant RNA and dsRNA for RNAi in a process requiring replication protein A and a DNA helicase. PLoS Biol. 8:e1000496. 2751 Mönttinen et al. . doi:10.1093/molbev/msu219 Leon PE. 1998. Inhibition of ribozymes by deoxyribonucleotides and the origin of DNA. J Mol Evol. 47:122–126. Maida Y, Yasukawa M, Furuuchi M, Lassmann T, Possemato R, Okamoto N, Kasim V, Hayashizaki Y, Hahn WC, Masutomi K. 2009. An RNA-dependent RNA polymerase formed by TERT and the RMRP RNA. Nature 461:230–235. Makeyev EV, Grimes JM. 2004. RNA-dependent RNA polymerases of dsRNA bacteriophages. Virus Res. 101:45–55. Mönttinen HAM, Ravantti JJ, Poranen MM. 2012. Evidence for a noncatalytic ion-binding site in multiple RNA-dependent RNA polymerases. PLoS One 7:e40581. Murakami E, Feng JY, Lee H, Hanes J, Johnson KA, Anderson KS. 2003. Characterization of novel reverse transcriptase and other RNA-associated catalytic activities by human DNA polymerase —importance in mitochondrial DNA replication. J Biol Chem. 278: 36403–36409. Murakami KS, Davydova EK, Rothman-Denes LB. 2008. X-ray crystal structure of the polymerase domain of the bacteriophage N4 virion RNA polymerase. Proc Natl Acad Sci U S A. 105:5046–5051. Nakamura TM, Morin GB, Chapman KB, Weinrich SL, Andrews WH, Lingner J, Harley CB, Cech TR. 1997. Telomerase catalytic subunit homologs from fission yeast and human. Science 277:955–959. Ohmori H, Friedberg EC, Fuchs RP, Goodman MF, Hanaoka F, Hinkle D, Kunkel TA, Lawrence CW, Livneh Z, Nohmi T, et al. 2001. The Yfamily of DNA polymerases. Mol Cell. 8:7–8. Ollis DL, Brick P, Hamlin R, Xuong NG, Steitz TA. 1985. Structure of large fragment of Escherichia coli DNA polymerase I complexed with dTMP. Nature 313:762–766. Pan J, Vakharia VN, Tao YJ. 2007. The structure of a birnavirus polymerase reveals a distinct active site topology. Proc Natl Acad Sci U S A. 104:7385–7390. Pata JD. 2010. Structural diversity of the Y-family DNA polymerases. Biochim Biophys Acta. 1804:1124–1135. Poch O, Sauvaget I, Delarue M, Tordo N. 1989. Identification of four conserved motifs among the RNA-dependent polymerase encoding elements. EMBO J. 8:3867–3874. Ravantti J, Bamford D, Stuart DI. 2013. Automatic comparison and classification of protein structures. J Struct Biol. 183:47–56. Ricchetti M, Buc H. 1993. Escherichia coli DNA polymerase I as a reverse transcriptase. EMBO J. 12:387–396. 2752 MBE Salgado PS, Koivunen MR, Makeyev EV, Bamford DH, Stuart DI, Grimes JM. 2006. The structure of an RNAi polymerase links RNA silencing and transcription. PLoS Biol. 4:e434. Sarin LP, Wright S, Chen Q, Degerth LH, Stuart DI, Grimes JM, Bamford DH, Poranen MM. 2012. The C-terminal priming domain is strongly associated with the main body of bacteriophage ’6 RNA-dependent RNA polymerase. Virology 432:184–193. Shutt TE, Gray MW. 2006. Bacteriophage origins of mitochondrial replication and transcription proteins. Trends Genet. 22:90–95. Sousa R, Chung YJ, Rose JP, Wang BC. 1993. Crystal structure of bacteriophage T7 RNA polymerase at 3.3 Å resolution. Nature 364: 593–599. Sousa R, Padilla R. 1995. Mutant T7 RNA polymerase as a DNA polymerase. EMBO J. 14:4609–4621. Steitz TA. 1998. A mechanism for all polymerases. Nature 391:231–232. Steitz TA. 1999. DNA polymerases: structural diversity and common mechanisms. J Biol Chem. 274:17395–17398. Storici F, Bebenek K, Kunkel TA, Gordenin DA, Resnick MA. 2007. RNAtemplated DNA repair. Nature 447:338–341. Summerfield M. 2010. Programming in Python 3: a complete introduction to the Python language. Upper Saddle River (NJ): AddisonWesley. Tao Y, Farsetta DL, Nibert ML, Harrison SC. 2002. RNA synthesis in a cage—structural studies of reovirus polymerase 3. Cell 111: 733–745. van Dijk AA, Makeyev EV, Bamford DH. 2004. Initiation of viral RNA-dependent RNA polymerization. J Gen Virol. 85: 1077–1093. Wang J, Sattar AK, Wang CC, Karam JD, Konigsberg WH, Steitz TA. 1997. Crystal structure of a pol family replication DNA polymerase from bacteriophage RB69. Cell 89:1087–1099. Wang TS, Wong SW, Korn D. 1989. Human DNA polymerase : predicted functional domains and relationships with viral DNA polymerases. FASEB J. 3:14–21. Werner F, Grohmann D. 2011. Evolution of multisubunit RNA polymerases in the three domains of life. Nat Rev Microbiol. 9:85–98. Zhou BL, Pata JD, Steitz TA. 2001. Crystal structure of a DinB lesion bypass DNA polymerase catalytic fragment reveals a classic polymerase catalytic domain. Mol Cell. 8:427–437.