Download Article Automated Structural Comparisons Clarify

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

HIV wikipedia , lookup

Hepatitis B wikipedia , lookup

Herpes simplex virus wikipedia , lookup

Potato virus Y wikipedia , lookup

Transcript
Automated Structural Comparisons Clarify the Phylogeny of
the Right-Hand-Shaped Polymerases
Heli A. M. Mönttinen,1 Janne J. Ravantti,1,2 David I. Stuart,3,4 and Minna M. Poranen*,1
1
Department of Biosciences, Viikki Biocenter, University of Helsinki, Helsinki, Finland
Institute of Biotechnology, Viikki Biocenter, University of Helsinki, Helsinki, Finland
3
Division of Structural Biology and the Oxford Protein Production Facility, The Wellcome Trust Centre for Human Genetics,
University of Oxford, Oxford, United Kingdom
4
Diamond Light Source Limited, Harwell Science and Innovation Campus, Oxfordshire, United Kingdom
*Corresponding author: E-mail: [email protected].
Associate editor: Csaba Pal
2
Abstract
Polymerases are essential for life, being responsible for replication, transcription, and the repair of nucleic acid molecules.
Those that share a right-hand-shaped fold and catalytic site structurally similar to the DNA polymerase I of Escherichia
coli may catalyze RNA- or DNA-dependent RNA polymerization, reverse transcription, or DNA replication in eukarya,
archaea, bacteria, and their viruses. We have applied novel computational methods for structure-based clustering and
phylogenetic analyses of this functionally diverse polymerase superfamily, which currently comprises six families. We
identified a structural core common to all right-handed polymerases, composed of 57 amino acid residues, harboring two
positionally and chemically conserved residues, the catalytic aspartates. The structural conservation within each of the six
families is considerable, for example, the structural core shared by family Y DNA polymerases covers over 90% of the
polymerase domain of the Sulfolobus solfataricus Dpo4. Our phylogenetic analyses propose an early separation of RNAdependent polymerases that use primers from those that are primer-independent. Furthermore, the exchange of polymerase genes between viruses and their hosts is evident. Because of this horizontal gene transfer, the phylogeny of
polymerases does not always reflect the evolutionary history of the corresponding organisms.
Key words: nucleic acid polymerase, protein evolution, structural alignment, structural classification.
Introduction
2001; Murakami et al. 2008). In addition, cellular telomerase
reverse transcriptases have been assigned to be members of
this group (Gillis et al. 2008). The DNA polymerases of the
third group share a similar overall fold with the second group,
but the topology of the palm subdomain is different as exemplified by the polymerase of Rattus norvegicus (Joyce and
Steitz 1995; Aravind and Koonin 1999). To emphasize this
difference, these DNA polymerases are sometimes referred
to left-handed polymerases (Beard and Wilson 2000).
The hallmark of a right-handed polymerase is the canonical
palm subdomain with a 1-1-2-3-2-4-fold (fig. 1B)
(Ollis et al. 1985; Steitz 1999). Strands 1 and 3 contain
catalytic aspartates, which coordinate the two catalytic magnesium ions (fig. 1B) (Steitz 1998). In addition to the polymerase activity that derives from the fingers, thumb, and
palm subdomains, the polypeptide chain of many righthanded polymerases contains domains that may carry out,
for instance, exonuclease activity (Freemont et al. 1986).
Polymerases can use two different initiation mechanisms:
Primer dependent or primer independent (de novo) (van Dijk
et al. 2004). In primer-dependent initiation, an RNA or a
protein primer provides the hydroxyl group for the first
incoming NTP, or the polymerase initiates from a 3’-hydroxyl
ß The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please
e-mail: [email protected]
Mol. Biol. Evol. 31(10):2741–2752 doi:10.1093/molbev/msu219 Advance Access publication July 24, 2014
2741
Article
Polymerases are responsible for preserving genetic information by replicating and repairing nucleic acid molecules as well
as for the expression of genes through transcription. Known
replicases, transcriptases, and DNA repair polymerases can be
divided into three groups based on the structural fold of their
catalytic site. The first group includes multisubunit cellular
transcriptases and homodimeric RNA silencing pathway polymerases, which have similar double-psi -barrel structures
containing the catalytic residues (Salgado et al. 2006; Werner
and Grohmann 2011). The second group comprises polymerases that structurally resemble DNA polymerase I (Pol I) of
Escherichia coli, often referred as right-handed polymerases
(Ollis et al. 1985; Beard and Wilson 2000). Their overall fold is
characterized by thumb, fingers, and palm subdomains
(fig. 1A) (Kohlstaedt et al. 1992). Such a protein fold is
shared by the DNA polymerases of families A, B, and Y that
catalyze DNA synthesis using DNA template, the single-subunit DNA-dependent RNA polymerases (DdRps) that are related to phage T7 RNA polymerase, and viral RNA-dependent
polymerases including reverse transcriptases of retroviruses
and viral RNA-dependent RNA polymerases (RdRps)
(table 1) (Ollis et al. 1985; Kohlstaedt et al. 1992; Sousa
et al. 1993; Hansen et al. 1997; Wang et al. 1997; Zhou et al.
MBE
Mönttinen et al. . doi:10.1093/molbev/msu219
group of a pre-existing DNA strand, as exemplified by telomerases and DNA polymerases involved in DNA repair or
employing a rolling-circle mechanism. In de novo initiation,
the first (initiation) NTP serves this function (Butcher et al.
2001).
Previously, the homology and phylogeny of right-handed
polymerases has been investigated mainly by using amino
acid sequence alignments (Poch et al. 1989; Delarue et al.
1990; Ito and Braithwaite 1991; Koonin 1991; Braithwaite
and Ito 1993; Ohmori et al. 2001). Although such approaches
can lead to unexpected notions of protein homology, they are
prone to limitations when the sequence similarity is low, as it
is between some of the right-handed polymerases. Nowadays,
there are many high-resolution polymerase structures available in the Protein Data Bank (PDB; www.pdb.org, last
accessed July 28, 2014), making structure-based phylogeny
and homology studies potential alternatives. The advantage
of such approaches is that similarities and hence homology
can still be detected between the protein folds of distantly
related, homologous proteins, when the amino acid sequences have lost detectable similarity (Benson et al. 1999).
FIG. 1. A typical structure of a right-hand-shaped polymerase and its
palm subdomain. (A) The structure of Escherichia coli Pol I (PDBid:
2KFZ). The subdomains are indicated with colors: Palm in green, fingers
in yellow, and thumb in pink. A close-up (B) shows the palm subdomain
rotated 55 by y axis compared with (A). The order of the secondary
structure elements is indicated with numbers and the position of the
catalytic aspartates with turquoise spheres.
We have applied a structure-based comparison method
(Ravantti et al. 2013) for the identification of equivalent residues in right-handed polymerase structures. Based on these
equivalences, the structures were classified into four main
clusters that were further dissected into subclusters. This approach allowed us to identify the conserved structural core
for the right-handed polymerases, that is, the subset of amino
acid residues that can be considered to be equivalent in all
these enzymes. Similarly, equivalences were defined for the
different polymerase clusters and subclusters, that is, RdRps,
reverse transcriptases, DdRps, and family A, B, and Y DNA
polymerases. On the basis of the similarity of the structural
cores, we also present phylogenetic trees for the four major
clusters: RNA-dependent polymerases (RdRps and reverse
transcriptases), the family Y DNA polymerases, the family A
DNA polymerases and the single-subunit DdRps, and the
family Y DNA polymerases.
Results
The Right-Handed Polymerase Structures
In the PDB, there are over 1,000 high-resolution structures of
right-handed polymerases. However, most of these are for a
rather small number of extensively studied enzymes, such as
the RdRp of the hepatitis C virus (HCV) and the reverse
transcriptase of human immunodeficiency virus (HIV). For
the analysis presented here, a single representative structure
was selected for each polymerase type from all available species. These 49 polymerases included 18 RdRps, 3 reverse transcriptases, 3 single-subunit DdRps, and 6, 10, and 9 family A,
family B, and family Y DNA polymerases, respectively (supplementary table S1, Supplementary Material online), involved in nucleic acid replication, transcription, or repair.
The resolution of the selected structures varied between 1.6
and 3.3 Å, and typically, a structure covered 80–100% of the
total length of the corresponding polypeptide (see supplementary table S1, Supplementary Material online). Despite
the small coverage of some polymerase structures (e.g., only
~35% in the case of human Rev1), all the selected structures
contain nearly complete polymerase domains, composed of
palm, fingers, and thumb subdomains.
RNA
templated
DNA
templated
Table 1. Polymerases Used in This Study.
a
Polymerase Family
Family A
DNA polymerases
Family B
DNA polymerases
Family Y
DNA polymerases
Single-subunit DdRps
Reverse transcriptases
RdRps
For example, retrovirus reverse transcriptase.
2742
Activity
DNA-dependent DNA
polymerization
DNA-dependent DNA
polymerization
DNA-dependent DNA
polymerization
DNA-dependent RNA
polymerization
RNA (or DNA)a-dependent
DNA polymerization
RNA-dependent RNA
polymerization
Function
Processing of the Okazaki fragments, DNA
repair, or replication
Replication or DNA repair
DNA repair
Transcription
Viral genome replication or telomere
synthesis
Viral genome replication and
transcription
Organism
Bacteria, phages, and
mitochondria
Archaea, bacteria, eukaryotes, and phages
Archaea, bacteria, and
eukaryotes
Mitochondria and phages
Retroviruses
ssRNA viruses and dsRNA
viruses
Phylogeny of the Right-Hand-Shaped Polymerases . doi:10.1093/molbev/msu219
The Common Structural Core Shared by the
Right-Hand-Shaped Polymerases
In this study, the selected polymerase structures were aligned
and clustered based on their similarity by performing progressive pairwise comparisons between sets of protein structures
on the basis of several properties such as geometry, secondary
structure, and physicochemical properties of the amino acid
residues (see supplementary table S2, Supplementary
Material online) to identify equivalent residues. The final parameterization was somewhat different from that originally
reported (Ravantti et al. 2013), having been optimized for this
problem, as described in Materials and Methods, and was
dominated by the properties describing the local geometry
and local alignment (supplementary table S2, Supplementary
Material online). Instead, the proportional weight of the
sequence in the structural alignment was only 3.5%.
The optimized parameters for this problem produced more
complete and biologically more meaningful structural alignment compared with the original ones (data not shown). The
pairwise comparison for a pair or a group of structures led to
the identification of a set of equivalent residues defining their
common core. This process was gradually continued by merging two closest structures or common structural cores to
produce the final hierarchical classification (Ravantti et al.
2013) (fig. 2).
Homologous structure finder (HSF)-based structural alignment with the optimized parameters found a total of 57
-carbons common to all the selected right-handed polymerase structures with an average root-mean-square deviation
(rmsd) of 4.0 Å (fig. 2). The corresponding residues are depicted for representatives of the six polymerase families in
figure 3 (panel CORE). The identified structurally conserved
amino acids are mainly located in two -helixes and three
-strands (or equivalently positioned polypeptide chains
assigned as random coils) within the palm subdomain
(fig. 3, panel CORE, green). A closer inspection of structures
revealed a fourth -strand in the palm subdomain of all the
right-handed polymerases, except in the single-subunit
DdRps, which do not have this strand. In these proteins,
the C-terminus of the protein partially aligns with the
N-terminus of this fourth -strand (fig. 3, panel CORE).
The HSF program could not automatically detect any
chemically and positionally identical amino acids among
the polymerases included in our comparison. This implies
that a fold with similar function can be formed by sequences
sharing barely any identity. However, when we relaxed the
positional constrains and searched for identical amino acids
within a sequence window of 1 residues, the two catalytic
aspartates located in 1 and 3 strands (fig. 1B) were identified as conserved.
The Four Clusters of Right-Handed Polymerases and
Their Common Cores
The right-handed polymerase structures (supplementary
table S1, Supplementary Material online) were classified by
the HSF program into four clusters: I) RdRps and reverse
transcriptases, II) family Y DNA polymerases, III) family A
MBE
DNA polymerases and single-subunit DdRps, and IV) family
B DNA polymerases (fig. 2). The four main clusters were
further divided into several subclusters (fig. 2, indicated
with letters). The structure-based grouping clustered the polymerase structures according to the established family classification (fig. 2). Our data imply that the RdRps and reverse
transcriptases (cluster I) are structurally closely related sister
families (fig. 2). Similar structural relatedness could be
observed between the family A DNA polymerases and
single-subunit DdRps (cluster III), despite of their different
functional specialization.
Next, we will describe the four clusters and selected subclusters or groups of subclusters that correspond to the six
polymerase families and define structurally conserved cores
for each family (fig. 3). The sizes of the cores (number of
structurally equivalent -carbons) and the corresponding
rmsd values are provided for each cluster and subcluster in
figure 2. The coverage of the core as a percentage of the entire
polymerase domain is provided for selected polymerase structures in figure 3.
Cluster I
Cluster I covers 18 RdRp and 3 reverse transcriptase structures, that is, all the RNA-templated polymerases in our data
set (fig. 2). Twenty of these polymerases originate from viruses
and one from a eukaryotic organism (supplementary table S1,
Supplementary Material online). The structurally conserved
region shared by the RNA-dependent polymerases is depicted
for rabbit hemorrhagic disease virus (RHDV) RdRp and the
reverse transcriptase of HIV in figure 3 (panel I, first and
second rows, respectively). This conserved core covers most
of the palm subdomain (fig. 3, panel I, green), including the
binding sites of the catalytic and noncatalytic ions
(Mönttinen et al. 2012), and parts of the thumb and fingers
subdomains (fig. 3, panel I, pink and yellow, respectively).
Cluster I has six subclusters (IA–F) (fig. 2), which follow the
established viral family classification (Carstens 2012).
Subclusters A–F encompass polymerases that originate
from the members of the viral families Birnaviridae and
Cystoviridae, Caliciviridae, Picornaviridae, Flaviviridae,
Reoviridae, Leviviridae, and Retroviridae, respectively (for the
taxonomic classification of the viruses, see supplementary
table S1, Supplementary Material online). In addition to the
reverse transcriptases of retroviruses, subcluster F contains
the telomerase reverse transcriptase from Tribolium
castaneum.
The conserved core of subclusters IA–E is depicted for the
RHDV RdRps in figure 3 panel IA–E. The degree of structural
conservation in RdRps is significantly higher, and the secondary structures are more continuously covered than at the
previous level of hierarchy, that is, in the RNA-dependent
polymerases (fig. 3, first row, compare panels IA–E and I).
The core encompasses basically all the secondary structure
elements of the palm, fingers, and thumb subdomains of the
RHDV RdRp (fig. 3, first row, compare panels M and IA–E)
forming a structural scaffold for the smallest RdRps, such as
those originating from caliciviruses (e.g., the RHDV RdRp depicted in fig. 3) and picornaviruses.
2743
Mönttinen et al. . doi:10.1093/molbev/msu219
MBE
FIG. 2. The hierarchical clustering of the right-handed polymerases. The main clusters are marked with Roman numerals (I, II, III, and IV) and the
subclusters with letters. The established polymerase family name is indicated in the surrounding bar (RT refers to reverse transcriptases) and the
organism from which the polymerase comes from in the tip of each branch. The background colors indicate the domain of life: Viral polymerases green,
eukaryotic polymerases yellow, archaeal polymerases orange, and bacterial polymerases pink. The type or name of the polymerase is also provided when
it is essential for clarity (DNA pol refers to DNA polymerase). The number of equivalent amino acids, that is, the size of the common core, and the
corresponding average rmsd values based on the common -carbons (in Ångströms) are indicated close to the root of each branch. IBDV, infectious
bursal disease virus; ’6, pseudomonas phage ’6; CV, human enterovirus B (coxsackie virus); PV, poliovirus; HMDV, human enterovirus A (handfoot-and-mouth disease virus); HRV, human rhinovirus A; FMDV, foot-and-mouth disease virus; HCV, hepatitis C virus; BVDV, bovine viral diarrhea
virus; Q, enterobacteria virus Q; MLV, murine leukemia virus; T. castaneum, Tribolium castaneum; M. smegmatis, Mycobacterium smegmatis;
G. kaustophilus, Geobacillus kaustophilus; G. stearothermophilus, Geobacillus stearothermophilus; T. aquaticus, Thermus aquaticus; T7, enterobacteria
phage T7; N4, Escherichia phage N4; T. gorgonarius, Thermococcus gorgonarius; P. furiosus, Pyrococcus furiosus; D. tok, Desulfurococcus tok; T. kodakarensis,
Thermococcus kodakarensis; P. abyssi, Pyrococcus abyssi; RB69, enterobacteria phage RB69; ’29, Bacillus phage ’29.
The structurally common regions characteristic of the
RdRps (subclusters IA–E) are located in the fingers and
thumb subdomains (fig. 3, panel IA–E, yellow and pink, respectively). Interestingly, two of the seven conserved -helices
in the fingers subdomain (fig. 3, panel IA–E, yellow) are conserved also in the structure of the telomerase reverse
2744
transcriptase (PDBid: 3KYL), which is a member of subcluster
IF, but not in other members of this subcluster (the retroviral
reverse transcriptases, data not shown). This suggests that the
fingers subdomain of the telomerase reverse transcriptases is
more closely related to RdRps than is the fingers subdomain
of retroviral reverse transcriptases.
Phylogeny of the Right-Hand-Shaped Polymerases . doi:10.1093/molbev/msu219
MBE
FIG. 3. The conserved structural features in the right-handed polymerases. The common structural features at each level of hierarchy are shown for a
representative structure from each polymerase family. The four clusters (identified in fig. 2) covering the six polymerase families are indicated on the left
(DNA pols refers to DNA polymerases). Green, pink, and yellow refer to residues in the palm, thumb, and fingers subdomains, respectively. Brown
indicates regions (subdomains/domains) that are distinct from the basic polymerase domain. The first panel of each row (M) depicts the selected model
structures. In the second panel (CORE), the residues that are common for all the right-handed polymerases are shown for each model structure.
(continued)
2745
MBE
Mönttinen et al. . doi:10.1093/molbev/msu219
Cluster II
The family Y DNA polymerases are highly similar in the palm,
fingers, and thumb subdomains. The common region covers
93% of the polymerase domain of Sulfolobus solfataricus Dpo4
depicted in figure 3 (third row). The members of family Y
DNA polymerases have a unique structure known as the polymerase-associated domain in eukaryotic polymerases and as
the little finger in archaeal and bacterial enzymes. However,
the similarity of this domain cannot be confirmed because
the corresponding region is not present in all the crystal
structures used for the comparison (supplementary table
S1, Supplementary Material online).
Cluster III
The family A DNA polymerases and single-subunit DdRps
form the two subclusters of cluster III (fig. 2). The grouping
of these two polymerase families into the same cluster supports the hypothesis that the single-subunit DdRps and family
A DNA polymerases are formed via gene duplication in a
phage T7-like organism (Cermakian et al. 1997; Filee and
Forterre 2005). The polymerases of this cluster are from bacteria (family A DNA polymerases), phages (family A DNA
polymerases and single-subunit DdRps), and mitochondrion
(family A DNA polymerases and single-subunit DdRps). In
addition to the common palm region, the equivalent residues
are mainly located in the fingers and thumb subdomains as
depicted for the E. coli Pol I and phage T7 DdRp in figure 3
(fourth and fifth rows, respectively, panel III).
The new structural features in the common structural core
of the subcluster IIIA that were not observed at the higher
level of hierarchy (cluster III) include the fourth -strand in
the palm domain (fig. 3, panel IIIA, green) and the structurally
conserved residues of the exonuclease domain (fig. 3, panel
IIIA, brown). The common core of the IIIB subcluster has two
notable features not present at the higher level of hierarchy,
the most prominent composed of four -helices (fig. 3, panel
IIIB, green), which are connected to 1 of the conserved fold
in the palm. This structure is known as the palm-insertion
module, and it closes off the back of the substrate tunnel
(Jeruzalmi and Steitz 1998).
Cluster IV
Cluster IV of family B DNA polymerases contains four distinct
subclusters (fig. 2). Subclusters A and B contain only one
member, the DNA polymerases of phage ’29 of
Podoviridae family and phage RB69 of Myoviridae family, respectively. The DNA polymerases from thermophilic archaea
are clustered together in subcluster C. Interestingly, these
polymerases are structurally very similar with each other
(631 common -carbons with rmsd of 3.1 Å; fig. 2) even
though they are from two different archaeal phyla,
Crenarchaeota and Euryarchaeota. The last subcluster (D)
contains three members: Polymerase II (Pol II) from E. coli,
polymerase from Saccharomyces cerevisiae, and herpesvirus
DNA polymerase.
The relatively low structural conservation observed for the
thumb subdomain of family B DNA polymerases (fig. 3, panel
IV, pink) likely reflects the difference in the angle between the
thumb and the palm subdomains in these polymerase structures. Furthermore, the polymerase of phage ’29, the only
representative DNA polymerase in our data set using a protein primer, has a notably smaller thumb subdomain (PDBid:
2PYJ) compared with the other family B polymerases (e.g., the
E. coli Pol II depicted in fig. 3, last row).
Phylogeny of the Right-Handed Polymerases
The HSF-based structural comparison (fig. 2) provides information on the degree of structural similarity among righthanded polymerases resulting in a hierarchical clustering of
the polymerase families that accurately reflects the accepted
classification (Carstens 2012). However, HSF-based clustering
is sensitive to the quality of structures, and also the size of the
polymerase may influence its clustering as scattered random
matches of unrelated residues between compared cores may
increase the score between structures under comparison. This
may especially happen if there are no close homologies in the
data set (e.g., the RdRps of phage ’6 and Q, fig. 2).
Consequently, these results should not be used to directly
draw conclusions on phylogenetic relationships between individual structures.
To obtain a deeper understanding of the phylogenetic
relationships among the right-handed polymerases, we constructed phylogenetic trees for the four polymerase clusters
(fig. 2). The phylogeny was deduced from the equivalent
residues among the members within the studied cluster,
calculated for all pairs of structures within the cluster. The
advantage of this approach is that only the structurally conserved cores (fig. 3) shared with all structures are considered,
instead of comparing pair-wise features that are not shared
among all structures in the data set, and thus may not share a
common origin. Because of the limited number of equivalent
amino acids within the entire group of right-handed polymerases (57 residues), phylogenetic trees were separately inferred
for each of the four clusters, using the larger cores of each of
these clusters, instead of constructing a single tree for all the
right-handed polymerases.
As expected, the grouping of the polymerases in these
phylogenetic trees (fig. 4) is slightly different from that in
the hierarchical tree (fig. 2), fine tuning the analysis of their
FIG. 3. Continued
The third column (panels I–IV) represents the residues that are structurally equivalent within each of the four clusters (fig. 2). The fourth column shows
equally positioned amino acid residues for selected subclusters or groups of subclusters (panels IA–E, IF, IIIA, and IIIB). The coverage of the common core
in each cluster/family is given as a percentage of the entire polymerase domain (third and fourth columns). The representative structure shown are the
RdRp of RHDV (PDBid: 1KHV; first row), the reverse transcriptase of HIV (PDBid: 2IAJ; second row), the Dpo4 of Sulfolobus solfataricus (PDBid: 1JX4;
third row), the Pol I of Escherichia coli (PDBid: 2KFZ; fourth row), the DdRp of the enterobacteria phage T7 (PDBid: 1MSW; fifth row), and the Pol II of
E. coli (PDBid: 3K59; last row).
2746
Phylogeny of the Right-Hand-Shaped Polymerases . doi:10.1093/molbev/msu219
MBE
FIG. 4. Phylogeny of the right-handed polymerases. The phylogenetic relations among the conserved core structures of RdRps and reverse transcriptases
(I), family Y DNA polymerases (II), family A DNA polymerases and single-subunit DdRps (III), and family B DNA polymerases (IV). The colored squares
around the names of the organisms indicate the domain of life, for meaning of colors see figure 2. The mechanism of initiation and the primer type are
shown with colored background: White, nucleic acid primer; blue, protein primer; purple, primer-independent initiation mechanism. The RdRps of the
dsRNA viruses are shown with red branches and the RdRps of the ssRNA viruses with blue branches. For abbreviations, see figure 2 legend.
phylogeny. The most notable difference between the hierarchical tree (fig. 2) and the phylogenetic tree of cluster I (fig. 4,
panel I) is the location of the phage ’6 RdRp, which in the
phylogenetic tree locates separately from the birnavirus (infectious pancreatic necrosis virus [IPNV] and infectious bursal
disease virus) RdRps, between the RdRps that originate from
flaviviruses (HCV, bovine viral diarrhea virus, West Nile virus,
and Dengue virus) and leviviruses (phage Q). Although the
relative limited sampling in clusters II, III, and IV could
potentially limit the reliability of the results, the main observations on the phylogenies of the DNA-dependent polymerases (fig. 4, panels II–IV) are consistent with previous studies
(Filee et al. 2002).
Is There Correlation between the Phylogeny of Viral RNA
Polymerases and the Taxonomic Classification of Viruses?
In the phylogenetic tree constructed for cluster I, the RNAdependent right-handed polymerases are grouped together
2747
MBE
Mönttinen et al. . doi:10.1093/molbev/msu219
into clusters that correspond to the taxonomic families of
RNA viruses (see supplementary table S1, Supplementary
Material online, for taxonomic groups). However, the phylogeny of the viral RdRps does not seem to follow the higher level
taxonomic classification of viruses that is based on the
genome type (Baltimore 1971; Carstens 2012). Instead, the
polymerases originating from single- and double-stranded
RNA viruses are scattered around in the phylogenetic tree
(fig. 4, panel I, blue and red branches, respectively). Previous
sequence-based analyses have also indicated that the polymerase subunits of ssRNA viruses and dsRNA viruses, like
caliciviruses and totiviruses, respectively, may be closely related (Koonin et al. 2008). These results are in line with the
observation that viruses with homologous major capsid proteins and genome packaging enzymes may have nonhomologous polymerase genes (Krupovič and Bamford 2009, 2010).
The Mechanism of Initiation Is Reflected in the Grouping of
Polymerases in the Phylogenetic Tree
In our structure-based phylogenetic analysis of the 21 RNAdependent polymerases, the polymerases that apply de novo
initiation mechanism are clustered together (fig. 4, panel I,
purple background) although derived from clearly distinct
viral families. Interestingly, this functional grouping of RNAdependent polymerases is evident although the specific features used for initiation, for example, the initiation platforms
of reovirus, phage ’6, and HCV RdRps (Butcher et al. 2001;
Hong et al. 2001; Tao et al. 2002; Sarin et al. 2012), are not part
of the region used in the reconstruction of the tree. The
RdRps that apply protein priming form a distinct cluster
(fig. 4, panel I, blue background). However, the birnavirus
(IPNV and infectious bursal disease virus) RdRps that function
both as the polymerase and the protein primer (Dobos 1995;
Pan et al. 2007) are clearly distinct from the RdRps of picornaviruses and caliciviruses, which encode separate RdRp and
primer polypeptides. Similarly, among the family B DNA polymerases, the phage ’29 polymerase that uses a protein
primer is separated from the other family B DNA polymerases
that initiate from nucleic acid primers (fig. 4, panel IV). Similar
separation based on the priming mechanisms has also been
observed in previous sequence-based phylogenetic analyses of
family B DNA polymerases (Filee et al. 2002).
Within the phylogenetic tree of cluster I, the viral and
telomerase reverse transcriptases that initiate nucleic acid
synthesis from a pre-existing nucleic acid primer appear to
branch off from the RdRps that apply de novo initiation
mechanism (fig. 4, panel I). Although located within the
same branch, the telomerase reverse transcriptases appear
to be structurally more closely related to the RdRps than
are the viral reverse transcriptases.
Transfer of DNA Polymerase Genes between Viruses and
Cellular Organisms
It has been proposed that the key enzymes involved in nucleic
acid metabolism were initially developed in virus-like organisms and then transferred to cellular genomes (Forterre 2002;
Koonin et al. 2006). In our data set, clusters I, III, and IV contain both viral and cellular enzymes (fig. 2) indicating horizontal transfer of polymerase genes between viruses and
2748
cellular organisms, which is consistent with the “virus origins"
proposal.
In the phylogenetic tree of cluster III, the DNA polymerases
and the single-subunit DdRps are located separately, placing
the two polymerases of phage T7 into separate branches
(fig. 4, panel III) as observed in the hierarchical clustering
(fig. 2). The corresponding enzymes of mitochondria display
similar grouping to T7-like phages suggesting that the mitochondrial polymerases probably originate from a T7-like
phage integrated into the genome of an -proteobacterium
at the origin of mitochondria (Filee and Forterre 2005; Shutt
and Gray 2006).
The family Y DNA polymerases are involved in DNA repair
functions and are currently classified into six branches based
on sequence alignments: DinB, Rad30A, Rad30B, Rev1, UmuC
of Gram-negative bacteria, and UmuC of Gram-positive bacteria (Ohmori et al. 2001). Structural information is available
only for the members of Rad30A ( of Homo sapiens and Sa.
cerevisiae), Rad30B ( of H. sapiens), DinB ( of H. sapiens,
Dpo4 of Mycobacterium smegmatis and S. solfataricus), Dbh
(Dbh of S. acidocaldarius), and Rev1 (Rev1 of H. sapiens and
Sa. cerevisiae) branches (supplementary table S1,
Supplementary Material online) of which Rad30A, Rad30B,
and Rev1 proteins are only found from eukaryotes and DinB
from all three domains of cellular life (Pata 2010). In the
structure-based phylogenetic tree, these family Y DNA polymerases (fig. 4, panel II) do not form clearly separated
branches. However, the grouping resembles the established
sequence-based clustering (Ohmori et al. 2001), not detected
by the hierarchical clustering (fig. 2). These particular
branches of family Y DNA polymerases do not include viral
representatives (Filee et al. 2002) perhaps partly explaining
the limited variation among the polymerases of this cluster
(figs. 2 and 3), consistent with the observation that viruses are
major reservoirs of gene diversity and a driving force in cellular
evolution (Forterre and Prangishvili 2013).
Transfer of DNA Polymerase Genes among Cellular Organisms
Previous studies have revealed massive horizontal gene transfer between and among archaea and bacteria, especially
among halophilic and thermophilic species (Koonin et al.
2001; Dagan et al. 2008). In the phylogenetic tree of cluster
IV, the DNA polymerases of the thermophilic archaea are
tightly grouped together, separately from the other family B
DNA polymerases, indicating a close phylogenetic relation
(fig. 4, panel IV). The family B DNA polymerases are only
encoded by the members of a single class of the domain
Bacteria, gammaproteobacteria (in this study E. coli Pol II,
PDBid: 3K59) but are common among phages infecting
these bacteria, including members of the families
Podoviridae, Myoviridae, and Tectiviridae (Wang et al. 1997,
1989; Kamtekar et al. 2004; phages RB69 and ’29 in this
study), suggesting gene transfer between phages and their
hosts.
Discussion
At present, right-handed polymerase structures are available
from three different DNA polymerase families (A, B, and Y),
Phylogeny of the Right-Hand-Shaped Polymerases . doi:10.1093/molbev/msu219
single-subunit DdRps, RdRps, and reverse transcriptases.
However, knowledge of the structural relationships between
these functionally varied enzymes is piecemeal. Typically,
newly solved polymerase structures are compared with structures of well-studied polymerases, often from organisms with
medical importance, and the broader structural commonalities and relationships are not recognized. We have now applied a novel, automated structure-based comparison
method to gain insights into the structural conservation
and phylogeny of the full set of structurally characterized
right-hand-shaped polymerases.
The relatively small number of known unique protein
structures, their incompleteness, and the differences in conformational states present serious obstacles for a systematic
structure-based comparison of the right-handed polymerases.
Despite these limitations, the automated structure-based
clustering of 49 right-handed polymerases using the HSF program exactly recapitulated the six established polymerase
families (Poch et al. 1989; Delarue et al. 1990; Ito and
Braithwaite 1991; Braithwaite and Ito 1993; Ohmori et al.
2001) (fig. 2). Subclustering of the branches of the major
clusters was also successful (fig. 2), for example, RdRps and
reverse transcriptases of cluster I follow the conventional taxonomy of viruses (Carstens 2012). Note that no manual intervention was required for the structural alignment, whereas
this is often necessary for sequence alignments. Thus, the
results deduced from this analysis can be considered to be
objective.
The conserved structural core that is common to all righthanded polymerases is relatively small, comprising 57 -carbons (fig. 3, panel CORE) that form three -strands and two
-helices. In both RNA-dependent polymerases (RdRps and
reverse transcriptases) and DNA polymerases, a fourth structurally common -strand can be identified (fig. 3) which is
not present in the single-subunit DdRps suggesting deletion
of the corresponding region in an ancestor of the currently
known DdRp genes. Indeed, it has been proposed that the
single-subunit DdRps and the family A DNA polymerases
originated by gene duplication (Cermakian et al. 1997), probably in an ancient T7-like phage, before the integration of the
corresponding polymerase genes into the -proteobacterium
that subsequently became an early mitochondrion (Filee and
Forterre 2005). Such a hypothesis is supported by the phylogenetic tree (fig. 4, panel III).
Based on the structural similarity, the positional conservation of the catalytic amino acids, and conserved function, it is
likely that the structural cores (fig. 3, panel CORE) of all righthanded polymerases share a common origin. This common
core represents the most crucial structure for the catalysis of
DNA and RNA polymers from nucleoside-triphosphate precursors in the right-handed polymerases. Consequently, an
ancient protein fold reminiscent of this core structure and
harboring the catalytic aspartates on secondary structural elements 1 and 3 (fig. 1B) might have been able to catalyze
nucleic acid synthesis in a primitive organism with an RNA
genome.
To our knowledge, this is the first time that the common,
characteristic features are described systematically for all the
MBE
families of the right-handed polymerases (RdRps, reverse transcriptases, single-subunit DdRps, and family A, B, and Y DNA
polymerases). However, less extensive phylogenetic analyses
on viral RdRps, both sequence- and structure based, have
y
been conducted previously (e.g., Koonin et al. 2008; Cern
et al. 2014). Although different subsets of polymerases were
used, the results obtained here and in these previous analyses
are consistent, confirming that the HSF program (Ravantti
et al. 2013) can accurately classify structures based on their
similarity.
The structurally conserved cores defined here (fig. 3) cover
substantially larger regions of the polymerase domain than
the previously identified conserved motifs deduced from sequence comparisons or structure-based sequence alignments
alone (Poch et al. 1989; Hansen et al. 1997; Bruenn 2003; Lang
et al. 2013), reflecting the fact that sequence-based methods
have a limited capacity to detect spatial and functional conservation. In line with this, all the sequence motifs previously
identified for RdRps (Poch et al. 1989) map onto their structurally conserved core shown in figure 3 (panel IA–E), except
motif F, which is located in a region not present in all the
crystal structures used for the comparison.
The classification of viruses based on processes involved in
genome replication has been a matter of debate (Koonin et al.
2008; Krupovič and Bamford 2009, 2010). The structure-based
clustering of viral RdRps and reverse transcriptases (fig. 2) and
the phylogenetic tree deduced from their core structure (fig.
4, panel I) follows the established division of viral families
(Carstens 2012). However, the phylogeny of viral RdRps presented here does not follow the proposed higher level classification of viruses that is based on the genome structure
(Baltimore 1971) or on the similarity of the major coat protein
fold and virion architecture (Abrescia et al. 2012; El Omari
et al. 2013). This is in agreement with the notion that because
the replication process is not directly coupled to the virion
assembly, viral polymerase genes were exchanged relatively
freely among the dsRNA and ssRNA viruses prior to the division of the currently known viral families. Consequently, the
phylogeny of viral RdRps reflects the evolutionary history of
RNA viruses only within limited subgroups (e.g., within viral
families) and probably should not be used as a scaffold for the
reconstruction of the broader aspects of RNA virus evolution.
This is borne out by the phylogeny of polymerase subunits
of DNA viruses, which is even more complex than that of viral
RdRps. The DNA polymerases of phages belonging to
Podoviridae family (e.g., phages T7 and ’29) clustered into
distinct parts of the hierarchical tree (III and IV, corresponding
to the family A and B DNA polymerases, respectively),
whereas DNA polymerases from more distantly related
DNA viruses, like human herpesvirus and phage RB69, clustered into the same branch (cluster IV) of the tree (fig. 2).
Furthermore, both the hierarchical clustering (fig. 2) and the
phylogenetic analyses (fig. 4) as well as previous studies (Filee
et al. 2002) indicate that mitochondrial family A DNA polymerases and DdRps as well as gammaproteobacterial family B
DNA polymerases were possibly acquired from phages.
Based on the phylogenetic analyses presented here (fig. 4,
panel I) and elsewhere (Nakamura et al. 1997), it seems that
2749
MBE
Mönttinen et al. . doi:10.1093/molbev/msu219
cellular and viral reverse transcriptases share a common
origin. However, our data imply that the telomerase reverse
transcriptase from T. castaneum is structurally more closely
related to the viral RdRps than are the reverse transcriptases
of HIV and murine leukemia virus. We propose that this reflects the different mutation rates of retroviral and eukaryotic
genomes; once an ancient reverse transcriptase gene, the precursor of telomerase reverse transcriptases, integrated into an
eukaryotic genome, it was replicated by a cellular DNA polymerase possessing high fidelity, whereas the precursor of currently known retroviral reverse transcriptases continued to be
replicated by the low-fidelity reverse transcriptase enzyme.
This resulted in a higher accumulation of mutations in the
viral reverse transcriptase gene and its faster evolution. This
would also mean that the cellular telomerase reverse transcriptase, resembling RdRps, represents a more ancient form
of the reverse transcriptase structure. However, more reverse
transcriptase structures are needed to make further predictions on the origin and evolution of telomerases.
According to the RNA world hypothesis, the first genomes
were RNA replicated by primitive RdRps (Leon 1998), applying
most likely de novo initiation mechanisms. Later, RdRps that
initiate RNA polymerization with the aid of a protein primer
and reverse transcriptases that could utilize deoxyribonucleotides and nucleic acid primers evolved from these primerindependent RdRps (fig. 4). The reverse transcriptases catalyze
DNA synthesis on both RNA and DNA templates as a part of
their normal function, and consequently, their early forms are
potential precursors for the different right-hand-shaped DNA
polymerases. The change from RNA to DNA genomes required also DdRps for gene expression from the otherwise
inert DNA molecules. Potentially, this was initially carried
out by the precursors of cellular RdRps, such as the silencing
pathway polymerases QDE-1 of Neurospora crassa, which catalyzes both RNA- and DNA-dependent RNA polymerization
(Aalto et al. 2010; Lee et al. 2010) or as a side activity of the preexisting RdRps and reverse transcriptases. Also, the single-subunit right-handed DdRps developed to carry out this function
in an ancestor of T7-like phages (fig. 4, panel III).
Although present-day polymerases tend to be characterized on the basis of a particular activity (table 1), their potential is frequently much wider. Thus, telomerase reverse
transcriptases can work not only as an RNA-dependent
DNA polymerase but also as an RdRp (Maida et al. 2009).
The RdRp of the phage ’6 can use a DNA chain as template
(Makeyev and Grimes 2004), and the RdRp of the IPNV (a
birnavirus) can function as a reverse transcriptase (Graham
et al. 2011). All the members of the DNA polymerase families
can also switch the usage from a DNA template to an RNA
template (Ricchetti and Buc 1993; Murakami et al. 2003;
Franklin et al. 2004; Storici et al. 2007). Furthermore, one
mutation in the DdRp of T7 enables the polymerase to
carry out the functions of RdRp, reverse transcriptase,
DdRp, and DNA polymerase (Sousa and Padilla 1995).
These observations suggest that the template and substrate
specificities are relatively flexible properties controlled mainly
by differences in affinity and reaction kinetics as well as the
availability of different substrate and template molecules.
2750
However, the change in priming mechanism (de novo and
nucleic acid priming or protein priming), which is accompanied by more substantial changes in the polymerase structure,
has likely been less frequent (fig. 4). Therefore, it is quite
possible that the switch in template and substrate specificities
might have occurred several times during the evolution of
right-handed polymerases.
Materials and Methods
Protein Structures for Structural Alignment
Structures of right-handed polymerases were manually collected from the PDB (www.pdb.org, last accessed July 28, 2014;
structures before February 1, 2013). Only one structure from
each polymerase type per species was selected for the analysis
(see supplementary table S1, Supplementary Material online)
applying the following criteria: 1) absence or low number of
disordered residues, 2) absence or low number of mutations,
3) resolution of the X-ray analysis and finally 4) structures
representing the closed conformation were preferred.
Structural Alignment and Clustering
Alignment of the selected polymerase structures was performed using HSF program (described previously in
Ravantti et al. 2013), which automatically clusters protein
structures based on a similarity score defined in Ravantti
et al. (2013). The alignment parameters (Ravantti et al.
2013) were optimized for the best automatic clustering by
using a locally written Python (Summerfield 2010) script for a
selected subset of polymerase structures (available from
H.A.M.M.). The criterion for a correct clustering was the alignment of the -sheet in the palm subdomain (fig. 1B), which
was verified by visual inspection. The parameters fulfilling the
alignment criterion and yielding the highest scores produced
by HSF were applied for the full data set (the parameters are
listed in supplementary table S2, Supplementary Material
online). Visual molecular dynamics program (Humphrey
et al. 1996) was used for the visualization of the structures.
Identification of Conserved Amino Acids
Conserved amino acids within the identified structural cores
were recognized by using a locally written Python
(Summerfield 2010) script (available from H.A.M.M.). From
the original structures, a window of 1 residues around the
structurally equivalent amino acids was used. If an identical
amino acid was detected within the window in all the polymerases for a cluster, it was marked as a conserved amino acid
in that position for that cluster.
Phylogenetic Tree
Phylogenetic trees for each subcluster were made on the basis
of the common cores identified by HSF. First, all pairwise
scores were calculated by using the set of parameters used
in the initial clustering (supplementary table S2,
Supplementary Material online). At the moment, there is
no standard method for converting structural alignments
into distances reflecting evolution. Consequently, scores
were first converted to distances by normalizing them as in
Phylogeny of the Right-Hand-Shaped Polymerases . doi:10.1093/molbev/msu219
Ravantti et al. (2013). The resulting distance matrix was converted to a phylogenic tree using Fitch-Margoliash algorithm
of Fitch program (Felsenstein 1989). Phylogenetic trees were
visualized with Dendroscope 3 program (Huson and
Scornavacca 2012).
Supplementary Material
Supplementary tables S1 and S2 are available at Molecular
Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
This work was supported by the Academy of Finland (grant
numbers 250113 and 256069 to M.M.P.); the Sigrid Juselius
Foundation; and the UK Medical Research Council (grant
number G1000099 to D.I.S). H.A.M.M. is a fellow of the
Integrative Life Science Doctoral Program. This is a contribution from the Finnish and UK Instruct centers.
References
Aalto AP, Poranen MM, Grimes JM, Stuart DI, Bamford DH. 2010. In
vitro activities of the multifunctional RNA silencing polymerase
QDE-1 of Neurospora crassa. J Biol Chem. 285:29367–29374.
Abrescia NG, Bamford DH, Grimes JM, Stuart DI. 2012. Structure unifies
the viral universe. Annu Rev Biochem. 81:795–822.
Aravind L, Koonin EV. 1999. DNA polymerase -like nucleotidyltransferase superfamily: identification of three new families, classification
and evolutionary history. Nucleic Acids Res. 27:1609–1618.
Baltimore D. 1971. Expression of animal virus genomes. Bacteriol Rev. 35:
235–241.
Beard WA, Wilson SH. 2000. Structural design of a eukaryotic DNA
repair polymerase: DNA polymerase . Mutat Res. 460:231–244.
Benson SD, Bamford JK, Bamford DH, Burnett RM. 1999. Viral evolution
revealed by bacteriophage PRD1 and human adenovirus coat protein structures. Cell 98:825–833.
Braithwaite DK, Ito J. 1993. Compilation, alignment, and phylogenetic
relationships of DNA polymerases. Nucleic Acids Res. 21:787–802.
BruennJA.2003.Astructuralandprimarysequencecomparisonoftheviral
RNA-dependent RNA polymerases. Nucleic Acids Res. 31:1821–1829.
Butcher SJ, Grimes JM, Makeyev EV, Bamford DH, Stuart DI. 2001. A
mechanism for initiating RNA-dependent RNA polymerization.
Nature 410:235–240.
Carstens EB. 2012. Introduction to virus taxonomy. In: King AMQ,
Adams MJ, Carstens EB, Lefkowitz EJ, editors. Virus taxonomy: classification and nomenclature of viruses: ninth report of the
International Committee on Taxonomy of Viruses. San Diego
(CA): Elsevier. p. 3–20.
Cermakian N, Ikeda TM, Miramontes P, Lang BF, Gray MW, Cedergren
R. 1997. On the evolution of the single-subunit RNA polymerases.
J Mol Evol. 45:671–681.
y J, Cern
a Bolfıkova B, Valdes JJ, Grubhoffer L, Růzek D. 2014.
Cern
Evolution of tertiary structure of viral RNA dependent polymerases.
PLoS One 9:e96070.
Dagan T, Artzy-Randrup Y, Martin W. 2008. Modular networks and
cumulative impact of lateral transfer in prokaryote genome evolution. Proc Natl Acad Sci U S A. 105:10039–10044.
Delarue M, Poch O, Tordo N, Moras D, Argos P. 1990. An attempt to
unify the structure of polymerases. Protein Eng. 3:461–467.
Dobos P. 1995. Protein-primed RNA synthesis in vitro by the virionassociated RNA polymerase of infectious pancreatic necrosis virus.
Virology 208:19–25.
El Omari K, Sutton G, Ravantti JJ, Zhang H, Walter TS, Grimes JM,
Bamford DH, Stuart DI, Mancini EJ. 2013. Plate tectonics of virus
shell assembly and reorganization in phage ’8, a distant relative of
mammalian reoviruses. Structure 21:1384–1395.
MBE
Felsenstein J. 1989. PHYLIP—Phylogeny inference package (version 3.2).
Cladistics 5:164–166.
Filee J, Forterre P. 2005. Viral proteins functioning in organelles: a cryptic
origin? Trends Microbiol. 13:510–513.
Filee J, Forterre P, Sen-Lin T, Laurent J. 2002. Evolution of DNA polymerase families: evidences for multiple gene exchange between cellular and viral proteins. J Mol Evol. 54:763–773.
Forterre P. 2002. The origin of DNA genomes and DNA replication
proteins. Curr Opin Microbiol. 5:525–532.
Forterre P, Prangishvili D. 2013. The major role of viruses in cellular
evolution: facts and hypotheses. Curr Opin Virol. 3:558–565.
Franklin A, Milburn PJ, Blanden RV, Steele EJ. 2004. Human DNA polymerase-, an A-T mutator in somatic hypermutation of rearranged
immunoglobulin genes, is a reverse transcriptase. Immunol Cell Biol.
82:342–342.
Freemont PS, Ollis DL, Steitz TA, Joyce CM. 1986. A domain of the
Klenow fragment of Escherichia coli DNA polymerase I has polymerase but no exonuclease activity. Proteins 1:66–73.
Gillis AJ, Schuller AP, Skordalakes E. 2008. Structure of the Tribolium
castaneum telomerase catalytic subunit TERT. Nature 455:633–637.
Graham SC, Sarin LP, Bahar MW, Myers RA, Stuart DI, Bamford DH,
Grimes JM. 2011. The N-terminus of the RNA polymerase from
infectious pancreatic necrosis virus is the determinant of genome
attachment. PLoS Pathog. 7:e1002085.
Hansen JL, Long AM, Schultz SC. 1997. Structure of the RNA-dependent
RNA polymerase of poliovirus. Structure 5:1109–1122.
Hong Z, Cameron CE, Walker MP, Castro C, Yao N, Lau JY, Zhong W.
2001. A novel mechanism to ensure terminal initiation by hepatitis
C virus NS5B polymerase. Virology 285:6–11.
Humphrey W, Dalke A, Schulten K. 1996. VMD: visual molecular dynamics. J Mol Graph. 14:33–38.
Huson DH, Scornavacca C. 2012. Dendroscope 3: an interactive tool for
rooted phylogenetic trees and networks. Syst Biol. 61:1061–1067.
Ito J, Braithwaite DK. 1991. Compilation and alignment of DNA polymerase sequences. Nucleic Acids Res. 19:4045–4057.
Jeruzalmi D, Steitz TA. 1998. Structure of T7 RNA polymerase complexed to the transcriptional inhibitor T7 lysozyme. EMBO J. 17:
4101–4113.
Joyce CM, Steitz TA. 1995. Polymerase structures and functions: variations on a theme? J Bacteriol 177:6321–6329.
Kamtekar S, Berman AJ, Wang JM, Lazaro JM, de Vega M, Blanco L, Salas
M, Steitz TA. 2004. Insights into strand displacement and processivity from the crystal structure of the protein-primed DNA polymerase of bacteriophage ’29. Mol Cell. 16:609–618.
Kohlstaedt LA, Wang J, Friedman JM, Rice PA, Steitz TA. 1992. Crystal
structure at 3.5 Å resolution of HIV-1 reverse transcriptase complexed with an inhibitor. Science 256:1783–1790.
Koonin EV. 1991. The phylogeny of RNA-dependent RNA polymerases
of positive-strand RNA viruses. J Gen Virol. 72:2197–2206.
Koonin EV, Makarova KS, Aravind L. 2001. Horizontal gene transfer in
prokaryotes: quantification and classification. Annu Rev Microbiol.
55:709–742.
Koonin EV, Senkevich TG, Dolja VV. 2006. The ancient Virus World and
evolution of cells. Biol Direct. 1:29.
Koonin EV, Wolf YI, Nagasaki K, Dolja VV. 2008. The Big Bang of
picorna-like virus evolution antedates the radiation of eukaryotic
supergroups. Nat Rev Microbiol. 6:925–939.
Krupovič M, Bamford DH. 2009. Does the evolution of viral polymerases
reflect the origin and evolution of viruses? Nat Rev Microbiol. 7:250.
Krupovič M, Bamford DH. 2010. Order to the viral universe. J Virol 84:
12476–12479.
Lang DM, Zemla AT, Zhou CL. 2013. Highly similar structural frames link
the template tunnel and NTP entry tunnel to the exterior surface in
RNA-dependent RNA polymerases. Nucleic Acids Res. 41:1464–1482.
Lee HC, Aalto AP, Yang Q, Chang SS, Huang G, Fisher D, Cha J, Poranen
MM, Bamford DH, Liu Y. 2010. The DNA/RNA-dependent RNA
polymerase QDE-1 generates aberrant RNA and dsRNA for RNAi
in a process requiring replication protein A and a DNA helicase.
PLoS Biol. 8:e1000496.
2751
Mönttinen et al. . doi:10.1093/molbev/msu219
Leon PE. 1998. Inhibition of ribozymes by deoxyribonucleotides and the
origin of DNA. J Mol Evol. 47:122–126.
Maida Y, Yasukawa M, Furuuchi M, Lassmann T, Possemato R,
Okamoto N, Kasim V, Hayashizaki Y, Hahn WC, Masutomi K.
2009. An RNA-dependent RNA polymerase formed by TERT and
the RMRP RNA. Nature 461:230–235.
Makeyev EV, Grimes JM. 2004. RNA-dependent RNA polymerases of
dsRNA bacteriophages. Virus Res. 101:45–55.
Mönttinen HAM, Ravantti JJ, Poranen MM. 2012. Evidence for a noncatalytic ion-binding site in multiple RNA-dependent RNA polymerases. PLoS One 7:e40581.
Murakami E, Feng JY, Lee H, Hanes J, Johnson KA, Anderson KS. 2003.
Characterization of novel reverse transcriptase and other RNA-associated catalytic activities by human DNA polymerase —importance in mitochondrial DNA replication. J Biol Chem. 278:
36403–36409.
Murakami KS, Davydova EK, Rothman-Denes LB. 2008. X-ray crystal
structure of the polymerase domain of the bacteriophage N4
virion RNA polymerase. Proc Natl Acad Sci U S A. 105:5046–5051.
Nakamura TM, Morin GB, Chapman KB, Weinrich SL, Andrews WH,
Lingner J, Harley CB, Cech TR. 1997. Telomerase catalytic subunit
homologs from fission yeast and human. Science 277:955–959.
Ohmori H, Friedberg EC, Fuchs RP, Goodman MF, Hanaoka F, Hinkle D,
Kunkel TA, Lawrence CW, Livneh Z, Nohmi T, et al. 2001. The Yfamily of DNA polymerases. Mol Cell. 8:7–8.
Ollis DL, Brick P, Hamlin R, Xuong NG, Steitz TA. 1985. Structure of large
fragment of Escherichia coli DNA polymerase I complexed with
dTMP. Nature 313:762–766.
Pan J, Vakharia VN, Tao YJ. 2007. The structure of a birnavirus polymerase reveals a distinct active site topology. Proc Natl Acad Sci U S A.
104:7385–7390.
Pata JD. 2010. Structural diversity of the Y-family DNA polymerases.
Biochim Biophys Acta. 1804:1124–1135.
Poch O, Sauvaget I, Delarue M, Tordo N. 1989. Identification of four
conserved motifs among the RNA-dependent polymerase encoding
elements. EMBO J. 8:3867–3874.
Ravantti J, Bamford D, Stuart DI. 2013. Automatic comparison and
classification of protein structures. J Struct Biol. 183:47–56.
Ricchetti M, Buc H. 1993. Escherichia coli DNA polymerase I as a reverse
transcriptase. EMBO J. 12:387–396.
2752
MBE
Salgado PS, Koivunen MR, Makeyev EV, Bamford DH, Stuart DI, Grimes
JM. 2006. The structure of an RNAi polymerase links RNA silencing
and transcription. PLoS Biol. 4:e434.
Sarin LP, Wright S, Chen Q, Degerth LH, Stuart DI, Grimes JM, Bamford
DH, Poranen MM. 2012. The C-terminal priming domain is strongly
associated with the main body of bacteriophage ’6 RNA-dependent
RNA polymerase. Virology 432:184–193.
Shutt TE, Gray MW. 2006. Bacteriophage origins of mitochondrial replication and transcription proteins. Trends Genet. 22:90–95.
Sousa R, Chung YJ, Rose JP, Wang BC. 1993. Crystal structure of bacteriophage T7 RNA polymerase at 3.3 Å resolution. Nature 364:
593–599.
Sousa R, Padilla R. 1995. Mutant T7 RNA polymerase as a DNA polymerase. EMBO J. 14:4609–4621.
Steitz TA. 1998. A mechanism for all polymerases. Nature 391:231–232.
Steitz TA. 1999. DNA polymerases: structural diversity and common
mechanisms. J Biol Chem. 274:17395–17398.
Storici F, Bebenek K, Kunkel TA, Gordenin DA, Resnick MA. 2007. RNAtemplated DNA repair. Nature 447:338–341.
Summerfield M. 2010. Programming in Python 3: a complete introduction to the Python language. Upper Saddle River (NJ): AddisonWesley.
Tao Y, Farsetta DL, Nibert ML, Harrison SC. 2002. RNA synthesis in a
cage—structural studies of reovirus polymerase 3. Cell 111:
733–745.
van Dijk AA, Makeyev EV, Bamford DH. 2004. Initiation of
viral RNA-dependent RNA polymerization. J Gen Virol. 85:
1077–1093.
Wang J, Sattar AK, Wang CC, Karam JD, Konigsberg WH, Steitz TA. 1997.
Crystal structure of a pol family replication DNA polymerase from
bacteriophage RB69. Cell 89:1087–1099.
Wang TS, Wong SW, Korn D. 1989. Human DNA polymerase : predicted functional domains and relationships with viral DNA polymerases. FASEB J. 3:14–21.
Werner F, Grohmann D. 2011. Evolution of multisubunit
RNA polymerases in the three domains of life. Nat Rev Microbiol.
9:85–98.
Zhou BL, Pata JD, Steitz TA. 2001. Crystal structure of a DinB lesion
bypass DNA polymerase catalytic fragment reveals a classic polymerase catalytic domain. Mol Cell. 8:427–437.