Tracing the Archaeal Origins of Eukaryotic Membrane-Trafficking System Building Blocks
Christen M. Klinger,†,1 Anja Spang,†,2 Joel B. Dacks,*,‡,1 and Thijs J. G. Ettema‡,2
1 Department of Cell Biology, University of Alberta, Edmonton, AB, Canada
2 Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
† Equally contributing first authors.
‡ Equally contributing senior authors.
*Corresponding author: E-mail: [email protected].
Associate editor: Sergei Kosakovsky Pond
Abstract
In contrast to prokaryotes, eukaryotic cells are characterized by a complex set of internal membrane-bound compartments. A subset of these, and the protein machineries that move material between them, define the membrane-trafficking system (MTS), the emergence of which represents a landmark in eukaryotic evolution. Unlike mitochondria
and plastids, MTS organelles have autogenous origins. Much of the MTS machinery is composed of building blocks,
including small GTPase, coiled-coil, beta-propeller + alpha-solenoid, and longin domains. Despite the identification of
prokaryotic proteins containing these domains, only few represent direct orthologues, leaving the origins and early
evolution of the MTS poorly understood. Here, we present an in-depth analysis of MTS building block homologues in the
composite genome of Lokiarchaeum, the recently discovered archaeal sister clade of eukaryotes, yielding several key
insights. We identify two previously unreported Eukaryotic Signature Proteins; orthologues of the Gtr/Rag family
GTPases, involved in target of rapamycin complex signaling, and of the RLC7 dynein component. We could not identify
golgin or SNARE (coiled-coil) or beta-propeller + alpha-solenoid orthologues, nor typical MTS domain fusions, suggesting
that these either were lost from Lokiarchaeum or emerged later in eukaryotic evolution. Furthermore, our phylogenetic
analyses of lokiarchaeal GTPases support a split into Ras-like and Arf-like superfamilies, with different prokaryotic
antecedents, before the advent of eukaryotes. While no GTPase activating proteins or exchange factors were identified,
we show that Lokiarchaeum encodes numerous roadblock domain proteins and putative longin domain proteins, confirming the latter’s origin from Archaea. Altogether, our study provides new insights into the emergence and early
evolution of the eukaryotic membrane-trafficking system.
Key words: archaea, eukaryogenesis, longin domain, membrane trafficking, roadblock domain, small GTPases,
Lokiarchaeum.
Article
Introduction
The membrane-trafficking system is crucial for normal cellular
function in modern eukaryotes. Composed of organelles including the endoplasmic reticulum, Golgi apparatus, endosomes, lysosomes, and the plasma membrane, this system is a
defining characteristic of eukaryotic cellular organization. Its
emergence represents a milestone in the evolutionary transition from a prokaryotic configuration.
The process of vesicular transport, whereby proteins at a
donating organelle bind cargo and deform the lipid membrane into a carrier vesicle, allows trafficking of material between membrane-trafficking system (MTS) organelles. The
machinery responsible includes GTPases such as Arf or Sar
and their cognate activator or exchange factors, coat proteins,
and cargo adaptors (Bonifacino and Glick 2004). Once at the
target organelle, another set of protein machineries will dock
and tether the carrier vesicle, through the action of Rab
GTPases and tethering factors. The carrier vesicle subsequently
undergoes fusion through the action of SNAREs to deliver the
cargo (Bonifacino and Glick 2004). Studies undertaken overwhelmingly in opisthokont model systems, in particular humans and yeast, have identified these, as well as other proteins
that encode the specificity of trafficking pathways and organelle identity (Cai et al. 2007). Molecular evolutionary analyses
of the protein machinery for vesicular trafficking have shown a
common core complement across the diversity of eukaryotes
(Schlacht et al. 2014, inter alia), implying a shared basic mechanism of vesicular trafficking in eukaryotes (with obvious intriguing differences). It also strongly suggests that the Last
Eukaryotic Common Ancestor (LECA) possessed a sophisticated set of trafficking machinery.
© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]
Mol. Biol. Evol. 33(6):1528–1541 doi:10.1093/molbev/msw034 Advance Access publication February 17, 2016
Phylogenetic investigations of specific trafficking machinery
components have enabled the proposition of a mechanism
for the evolution of endomembrane organelles. This mechanism is encapsulated in the Organelle Paralogy Hypothesis
(OPH), whereby a simple set of core ancestral machineries
can give rise to the complexity seen in extant eukaryotes
through paralogous duplication and co-evolution of interacting organelle identity-encoding proteins (Dacks and Field
2007; Dacks et al. 2008). Therefore, despite the apparent complexity of the modern trafficking machinery, the paralogous
nature of the proteins involved suggests that this machinery
evolved from a smaller set of primordial vesicle formation and
fusion proteins that were present in early stages of eukaryogenesis. More recent phylogenetic studies have even yielded
some insights into the proximal order of events immediately
leading to the complexity seen in LECA (Elias et al. 2012; Hirst
et al. 2014). While these observations suggest that this mechanism contributed significantly to early eukaryotic evolution,
the deepest origins of the MTS remain unclear.
This lack of insight may be due to the fact that, in contrast
to proteins derived from endosymbiotic organelles (i.e., mitochondria and plastids), there are few unambiguous prokaryotic orthologues for components of the eukaryotic
trafficking machinery. The best examples are the archaeal
homologues of the ESCRT-III and III-associated subcomplexes,
which in eukaryotes mediate inward budding of the late endosome as well as cytokinesis (Makarova et al. 2010; Henne
et al. 2011). These proteins were shown to be involved in
cytokinesis in certain members of the archaeal TACK superphylum (Lindås et al. 2008; Ettema and Bernander 2009).
Although prokaryotic orthologues of MTS proteins are
scarce, several prokaryotic proteins that possess domains representing one of the four building blocks of eukaryotic trafficking machinery have been identified before (fig. 1). A
building block is defined as a protein domain that is present
in multiple proteins involved in membrane-trafficking, often
in both vesicle formation and fusion steps (Vedovato et al.
2009). Much of the eukaryotic trafficking machinery is composed of either a single building block (e.g., small GTPases),
combinations of building blocks (e.g., heterotetrameric
adaptor complexes that contain longin domains as well as
beta-propeller + alpha-solenoid coats) or of fusions thereof
(e.g., R-SNAREs that are a fusion of the longin domain with a
coiled-coil forming domain).
The recognition of these building blocks contributes to a
conceptual line from prokaryotic proteins, through fusions of
some building blocks, and to a primordial set of trafficking
machinery that subsequently expanded via the OPH mechanism (Dacks and Field 2007; Vedovato et al. 2009; fig. 1).
Prokaryotic proteins unambiguously containing these building blocks have been identified for the Ras superfamily
GTPases (subsequently referred to as “small GTPases”)
(Dong et al. 2007; Wuichet and Søgaard-Andersen 2014),
coiled-coils (Parry et al. 2008), and beta-propeller + alpha-solenoid proteins (Santarella-Mellwig et al. 2010). The prokaryotic origins of the fourth building block, the longin
domain, are less obvious, with structural connections proposed to roadblock domain proteins found in eukaryotes,
Bacteria, and Archaea (Levine et al. 2013; De Franceschi
et al. 2014). Nonetheless, the specific prokaryotic contributors
of these building blocks for the eukaryotic lineage have been
elusive to date, with insight into the origins of the eukaryotic
MTS restricted to hypotheses based on distant homology
with prokaryotic components.
Since several recent phylogenomic studies indicate an archaeal ancestor for eukaryotes (Cox et al. 2008; Foster et al.
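[Figure 1: schematic. Labels recovered from the image: Longin, GTPase, Beta-propeller+alpha-solenoid, Coiled-coil, question marks for uncertain contributors, (OPH), and the stages LAECA, FECA, and LECA.]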
FIG. 1. Proposed early stages of eukaryotic membrane trafficking evolution. This figure depicts a hypothetical scenario for the establishment of initial eukaryotic membrane trafficking orthologues.
Membrane trafficking building blocks as defined in this study are
represented by filled circles, including longin (magenta), GTPase
(teal), beta-propeller+alpha-solenoid (brown, specifically proteins
involved in coat and NPC complexes), and coiled-coil (orange, specifically proteins belonging to the coiled-coil tether and SNARE families)
domains. The first eukaryotic organism is suggested to have possessed
orthologues of each building block. These might have been derived
directly from the archaeal line or stem from another contributor
(dotted outlines and question marks). In the case of at least the
GTPase domain, diversification into distinct groups occurred even
before eukaryogenesis (darker and lighter teal circles in LAECA).
Additionally, fusions, for example, between longin and SNARE domains (black line joining magenta and orange circles), and complexes
containing multiple building blocks, for example, longin and beta-propeller+alpha-solenoid domain proteins (adjacent brown and
magenta circles), would have emerged either in the first eukaryotic
common ancestor or between this organism and the LECA. Following
the acquisition of these components, they would have expanded and
diversified according to the mechanism outlined in the organelle
paralogy hypothesis, leading to the complexity inferred in the
LECA. LAECA, last archaeal and eukaryotic common ancestor;
FECA, first eukaryotic common ancestor.
2009; Guy and Ettema 2011; Williams et al. 2012; Lasek-Nesselquist and Gogarten 2013; Guy et al. 2014; Raymann
et al. 2015; Spang et al. 2015), we decided to search for components of the MTS in an archaeal lineage closely related to
eukaryotes. Such a lineage was not described in the literature until
recently, when the investigation of metagenomic data from
the Loki’s Castle hydrothermal vent field led to the discovery of the Lokiarchaeota (Spang et al. 2015). This novel
archaeal phylum emerges as a sister group to eukaryotes in
sophisticated phylogenomic analyses (Spang et al. 2015) and
represents the closest known relative of the elusive archaeal
ancestor of eukaryotes.
Excitingly, the Lokiarchaeum composite genome encodes
trafficking machinery proteins previously unreported outside
of eukaryotes. For example, components of all three ESCRT
subcomplexes as well as potential longin domain proteins
were identified (Spang et al. 2015). Furthermore, the lokiarchaeal composite genome is unique among prokaryotes in having a large number of small GTPases (Spang et al. 2015), which
allows the reinvestigation of the origin of the eukaryotic small
GTPase complement. While a phylogenetic analysis of these
sequences provided initial insights into the putative identity of
lokiarchaeal GTPases, it lacked support for many nodes (Spang
et al. 2015). Beyond noting their existence, the complement of
putative longin domains was essentially unexplored.
Here we have undertaken a detailed study of the
Lokiarchaeum composite genome, searching for proteins
containing the building blocks of the eukaryotic MTS. Our
analyses yield new insights into the origins and evolution of
small GTPases and their associated regulatory proteins, with
additional implications for the prokaryotic origins of eukaryotic trafficking machinery.
Results
Lokiarchaeum and the Ancestry of Eukaryotic
GTPases
Eukaryotic small GTPases can be divided into several families,
including Ras, Rho, Ran, and Rab (hereafter referred to as the “Ras-like
superfamily”), and Arf/Sar/SRPRβ (signal recognition particle
receptor subunit beta, hereafter referred to as the “Arf-like superfamily”; Rojas et al. 2012). In contrast to the large number of
small GTPases encoded in many eukaryotes (Diekmann et al.
2011; Elias et al. 2012), genomes of prokaryotes have few
homologues, most of which are restricted to the MglA (gliding
motility associated) and Rup (Ras superfamily GTPase of unknown function in prokaryotes) families (Wuichet and
Søgaard-Andersen 2014). To date, no direct prokaryotic
orthologues have been identified for eukaryotic small
GTPase families, and their early evolutionary origins remain
unresolved.
In an attempt to shed light on the evolution of the small
GTPase superfamily in eukaryotes, we re-assessed its phylogenetic history compared to a subset of lokiarchaeal homologues. Previously identified lokiarchaeal GTPases were
confirmed by HMMer searches of the predicted proteome
and allowed initial separation into sequences more similar to
either Ras or Arf families (supplementary table S1,
Supplementary Material online). In addition, we identified
17 small GTPases not reported in preliminary analyses of
the composite genome (Spang et al. 2015), bringing the total
number encoded by the composite genome to 109.
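The initial separation into Ras-like and Arf-like sequences described above can be pictured as a best-scoring-profile assignment. The following is only a minimal sketch of that idea, not the pipeline used in the study: the function name, the profile labels, the score cutoff, and every score in the usage example are hypothetical, and the input is assumed to be bit scores already extracted from profile searches.

```python
def assign_by_best_profile(hits, min_score=0.0):
    """Assign each sequence to the profile that gives its highest bit score.

    hits: iterable of (sequence_id, profile_name, bit_score) tuples.
    Hits scoring below min_score are ignored. A sequence whose top score is
    shared by two different profiles is labeled "ambiguous".
    Returns a dict {sequence_id: profile_name}.
    """
    best = {}
    for seq_id, profile, score in hits:
        if score < min_score:
            continue
        current = best.get(seq_id)
        if current is None or score > current[1]:
            best[seq_id] = (profile, score)
        elif score == current[1] and profile != current[0]:
            best[seq_id] = ("ambiguous", score)
    return {seq_id: profile for seq_id, (profile, _) in best.items()}


# Hypothetical scores for three made-up sequences (not data from the paper).
example_hits = [
    ("seq_A", "Ras", 85.2),
    ("seq_A", "Arf", 40.1),
    ("seq_B", "Ras", 30.0),
    ("seq_B", "Arf", 77.5),
    ("seq_C", "Ras", 12.0),
]
print(assign_by_best_profile(example_hits, min_score=20.0))
```

A score-based assignment of this kind only gives a first-pass grouping; the phylogenetic analyses that follow are what test whether such groupings reflect actual relationships.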
Various maximum likelihood and Bayesian phylogenetic analyses of carefully selected representative sets of small GTPases
from Bacteria, Archaea, and eukaryotes (Materials and Methods,
supplementary text S1, Supplementary Material online) provided novel insights into the evolution of at least three groups of
eukaryotic small GTPases: the Ras-like and Arf-like superfamilies,
as well as the Gtr/Rag family (fig. 2).
We consistently observed a large clade comprising eukaryotic Ras-like superfamily GTPases together with Rup1 sequences and 45 lokiarchaeal Ras-like GTPases (0.77
posterior probability [PP]/93 bootstrap support [BS], fig. 2).
The internal relationships between members of this clade
were poorly resolved, and a separate analysis aimed at achieving resolution did not yield further insight (supplementary fig.
S1, Supplementary Material online). Importantly, we consistently observed a node separating eukaryotic small GTPases
of the Ras-like and Arf-like superfamilies, together with different prokaryotic origins for each, though support was never
strong (PP = 0.77/BS = 67, fig. 2). Lokiarchaeal sequences
were grouped with each of these superfamilies, albeit not as
direct outgroups to particular eukaryotic GTPase families,
with one striking exception.
By homology searching, we identified 17 sequences with
similarity to the Gtr1_RagA/PF04670 domain (supplementary
table S1, Supplementary Material online). Proteins with this
domain comprise atypical members of the Ras superfamily
functioning in target of rapamycin complex 1 (TORC1) signaling at the lysosome/vacuole (Sekiguchi et al. 2001; Kim
et al. 2008). The 14 lokiarchaeal sequences selected for inclusion in phylogenetic analyses consistently grouped with eukaryotic Gtr/Rag sequences with strong support (PP = 1/BS = 100, fig. 2).
Furthermore, the eukaryotic sequences
emerged in a bifurcating node from within lokiarchaeal homologues, consistent with an archaeal origin and subsequent
gene duplication and diversification in the eukaryotic lineage.
This conclusion was further supported through inspection
of additional sequence elements. Crystallographic studies of
yeast Gtr1 and Gtr2 proteins revealed that a C-terminal extension, known to be important in dimerization and function,
adopts a roadblock fold (Gong et al. 2011). Investigation of
secondary structural elements (supplementary figs. S3 and S5,
Supplementary Material online) and evidence based on
HMM-HMM comparisons (Materials and Methods; supplementary table S4, Supplementary Material online) with other
roadblock domain proteins revealed similar extensions in 12
out of the 14 lokiarchaeal orthologues. These data not only
support the observed orthologous relationship recovered
through phylogenetic analysis, but also indicate that the
observed fusions between roadblock and GTPase domains
are the result of the same ancestral fusion event.
Robust evidence for additional clear lokiarchaeal orthologues of eukaryotic Ras-like or Arf-like superfamilies (fig. 2;
supplementary figs. S1 and S2, Supplementary Material online)
was not obtained. While the small clade of lokiarchaeal Arf-like
sequences almost always placed as an outgroup to the large
clade of eukaryotic Arf-like superfamily sequences, strong support was never obtained (not resolved in figure 2, excluded
from MglA and Ef-Tu clades with PP = 0.59/BS = 70).
However, detailed analyses of sequence motifs, especially
of the “G-box” motifs, which are important in nucleotide
binding and catalysis, corroborated the observed phylogenetic relationships of lokiarchaeal GTPases with eukaryotic
Gtr/Rag family, as well as Ras- and Arf-like superfamily, sequences (fig. 3; supplementary text S1 and fig. S3,
Supplementary Material online). Lokiarchaeal Ras-like sequences encode an aspartic acid residue at the second position of the G1 box and tend to follow a serine-alanine-lysine
pattern for the G5 box similar to eukaryotic Ras-like superfamily sequences (fig. 3). The single clade of lokiarchaeal Arf-like sequences, despite poor node support, closely follows the
eukaryotic Arf G4 box signature (ANKQD). Finally, as additional support for direct orthology between lokiarchaeal and
eukaryotic Gtr/Rag family members, both the G4 box histidine residue as well as the G5 box serine-isoleucine-hydrophobic residue patterns are conserved between, but never
observed outside of, these two groups (fig. 3).
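The motif comparisons described in this section can be illustrated with a simple pattern scan. This is a minimal sketch under stated assumptions, not the alignment-based analysis used in the study. The G1 and G4 regular expressions are the commonly cited consensus forms GxxxxGK[S/T] and [N/T]KxD rather than definitions taken from this paper, the function names are hypothetical, and the toy sequence is invented for illustration. Only the ANKQD string comes from the text.

```python
import re

# Commonly cited consensus patterns (assumed here, x = any residue).
G1_PATTERN = re.compile(r"G.{4}GK[ST]")  # P-loop: GxxxxGK[S/T]
G4_PATTERN = re.compile(r"[NT]K.D")      # G4 box: [N/T]KxD
ARF_G4_SIGNATURE = "ANKQD"               # Arf G4 signature quoted in the text


def find_motif(pattern, sequence):
    """Return (start_index, matched_text) for the first match, or None."""
    match = pattern.search(sequence)
    return (match.start(), match.group()) if match else None


def g1_second_position(sequence):
    """Return the residue at the second position of the first G1 match, or None."""
    hit = find_motif(G1_PATTERN, sequence)
    return hit[1][1] if hit else None


def has_arf_g4_signature(sequence):
    """True if the sequence contains the Arf G4 signature as an exact substring."""
    return ARF_G4_SIGNATURE in sequence


# Invented toy sequence, not a real protein.
toy = "MAAKILLLGLDNAGKTTILYAAAVFANKQDLPNA"
print(find_motif(G1_PATTERN, toy))
print(find_motif(G4_PATTERN, toy))
print(g1_second_position(toy), has_arf_g4_signature(toy))
```

A regular expression only finds the first ungapped match, so it can misplace a motif in a divergent sequence. Reading motifs from a multiple sequence alignment, as done for figure 3, avoids that problem.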
Furthermore, we demonstrate a relationship between the
Arf-like superfamily and Gtr/Rag family of small GTPases,
[Figure 2: phylogenetic tree of bacterial, eukaryotic, and (loki-)archaeal small GTPases (PhyloBayes/RAxML). Clade labels recovered from the image: SRPRβ, Sar, Arf/Arl, Arf-like, RagA/B, RagC/D, Rup2, Ran, Rab, Ras, Rho, Ras-like, Rup1, MglA1, MglA2-5, and Ef-Tu; lokiarchaeal groups ArfL, Rag, RasL I-IV, and MglA. Tip labels and per-node support values are not recoverable from the extracted text.]
FIG. 2. Phylogenetic analysis of bacterial, eukaryotic and (loki-) archaeal GTPase proteins. This figure demonstrates the phylogenetic relationships
between identified lokiarchaeal GTPase proteins and known homologues from Bacteria, Archaea, and eukaryotes, based on aligned amino acid
sequences (127 positions). For this, and all subsequent phylogenies, the best Bayesian topology, as inferred by Phylobayes, is shown with the scale
bar indicating the number of substitutions per site. Support values are indicated as PPs/RAxML bootstrap values; nodes discussed in the text are
shown in larger bold italics. Internal node support values are indicated by symbols, as denoted in the figure legend, whereby a node with support
greater than or equal to the numbers listed for both methods is indicated by the symbol. For lokiarchaeal Arf-like, Rag, and Ras-like group I
sequences, a structure showing the associated roadblock/longin domain is provided for the accession in bold followed by an asterisk. Eukaryotic
sequences followed by a double dagger are those from Elias et al. (2012).
FIG. 3. Comparison of G-box motifs for eukaryotic and lokiarchaeal GTPase proteins. This figure compares observed lokiarchaeal GTPase G-box
motifs to the corresponding consensus sequence for each family of eukaryotic GTPases. Lokiarchaeal GTPase groups, as depicted in figure 2, are
shown together with motif logo representations of aligned sequences for the five G-box motifs involved in nucleotide binding and catalysis. The
RasL III group is excluded due to a paucity of information. Notably, lokiarchaeal ArfL sequences share the eukaryotic Arf G4 signature, and
lokiarchaeal RagL sequences share eukaryotic Rag G4 and G5 signatures. Eukaryotic consensus sequences are derived either from Rojas et al. (2012)
or from alignments.
although the exact relationship is unclear, with the Gtr/Rag
clade emerging either sister to, or within, the larger Arf-like
superfamily (fig. 2; supplementary fig. S2, Supplementary
Material online). Regardless of the topology, we consistently
obtained a node separating these two groups from all other
sequences (PP = 0.56/BS = 74, fig. 2). Despite the moderate
phylogenetic support, further reinforcement of this affiliation
was observed by inspection of sequence motifs, for instance
the G1 box second position leucine shared between Arf-like superfamily and both lokiarchaeal and eukaryotic Gtr/
Rag family members (fig. 3; supplementary text S1,
Supplementary Material online). Thus, our results clarify the
relationship of the Gtr/Rag family to other members of the
Ras superfamily, and provide clear prokaryotic origins for this
family.
Finally, we aimed to determine whether the phylogenetic
relationships we observed, as well as the apparent lack of
direct orthologous relationships, were genuine or an artifact
of specific sequence selection. Therefore, we explored the
relationship of all lokiarchaeal GTPase sequences to other
much larger data sets including additional archaeal, bacterial,
eukaryotic, and environmental sequences (supplementary fig.
S2 and text S1, Supplementary Material online). This analysis
did not yield any new insight into, nor contradict, any observed relationships between lokiarchaeal and eukaryotic
GTPase families. However, it did retrieve the relationship between lokiarchaeal and eukaryotic Gtr/Rag sequences
(BS = 92) as well as between the Arf-like superfamily and
Gtr/Rag family, albeit with low support (BS = 56).
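Node support in this section is reported as a posterior probability paired with a bootstrap value, for example 1/100 or 0.77/67. The sketch below shows one way such labels could be parsed and compared against cutoffs. It is illustrative only: the function names are hypothetical, and the default thresholds (0.95 and 70) are assumptions chosen for the example, not values stated in the paper.

```python
def parse_support(label):
    """Split a 'PP/BS' node label such as '0.77/93' into (float, int)."""
    pp_text, bs_text = label.split("/")
    return float(pp_text), int(bs_text)


def support_category(label, pp_min=0.95, bs_min=70):
    """Report whether both, one, or neither method reaches its cutoff.

    pp_min and bs_min are illustrative defaults, not thresholds from the paper.
    """
    pp, bs = parse_support(label)
    passed = (pp >= pp_min) + (bs >= bs_min)
    return {2: "both", 1: "one", 0: "neither"}[passed]


# Support values quoted in the text above.
for label in ["1/100", "0.77/93", "0.56/74", "0.77/67"]:
    print(label, support_category(label))
```

Treating the two measures separately makes the disagreement visible: with these example cutoffs, the Ras-like clade node (0.77/93) passes only the bootstrap criterion, while the Ras-like/Arf-like split (0.77/67) passes neither.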
Lokiarchaeum Does Not Contain Orthologues of
Coiled-Coil, Fused Beta-Propeller-Alpha-Solenoid
Proteins, or GTPase Exchange or Activating Factors
Having obtained an increased resolution for small GTPases, we
searched for additional building blocks of eukaryotic MTS proteins, including beta-propeller + alpha-solenoid components,
SNAREs, tethers, golgins, and V4R domain proteins (supplementary fig. S4, tables S2 and S3, and texts S2 and S3, Supplementary
Material online). We carried out extensive homology searching
using a combination of position-specific scoring matrix-sequence, HMM-sequence, and HMM-HMM comparisons
against the lokiarchaeal composite genome (Materials and
Methods; supplementary table S3, Supplementary Material online) and employed these sensitive methods to allow for detection of remote homologues by incorporating subtle sequence
similarity and underlying structural information.
Notably, we failed to identify bona fide beta-propeller +
alpha-solenoid fusion protein homologues in the lokiarchaeal
composite genome, including any representatives of clathrin/
adaptor protein or COPI or COPII coat complex subunits,
BBSome/IFT, or nucleoporin families. The presence of at least
eight alpha-solenoid and four beta-propeller proteins (supplementary table S2 and text S2, Supplementary Material online) suggests that the individual domains of this building
block were present in the archaeal ancestor of eukaryotes,
as well as in a wide diversity of prokaryotes (Santarella-Mellwig et al. 2010). Some of the identified proteins model
onto known protocoatomer proteins such as clathrin adaptor core proteins or intraflagellar transport components, but
FIG. 4. Illustration of the phylogenetic analysis of bacterial, eukaryotic, and (loki-) archaeal roadblock domain proteins. This
figure shows the separation of lokiarchaeal roadblock proteins into MglB and RLC7 families, based on aligned amino acid
sequences (150 positions). Gray dots indicate the same range of support values as those in all other phylogenies. Structures
are provided for lokiarchaeal homologues of each family (Lokiarch_06490 and Lokiarch_25010) and the difference in
secondary structural composition is indicated. A version of the same phylogeny with all clades expanded is shown in
supplementary figure S7, Supplementary Material online.
are detectably homologous to unrelated proteins containing
HEAT or TPR repeats (e.g., Lokiarch_11360, supplementary
table S2 and text S2, Supplementary Material online). These
results suggest a subsequent contribution from another
source and/or fusion event between structural units as key
events leading to eukaryogenesis (fig. 1). Alternatively, these
components could have been present in the archaeal ancestor of eukaryotes, but were subsequently lost in the lokiarchaeal lineage.
Similarly, we were unable to identify orthologues of eukaryotic coiled-coil tethers (e.g., EEA1, golgin family members),
SNAREs, or other trafficking-related components (supplementary table S3, supplementary text S2, Supplementary Material
online). The Lokiarchaeum genome does encode four proteins
with a V4R domain (supplementary text S3, Supplementary
Material online), the likely prokaryotic homologue of the Bet3
family of the trafficking protein particle (TRAPP) tethering
complex (Podar et al. 2008). However, direct orthologous relationships to eukaryotic TRAPP components could not be established with high confidence (supplementary fig. S4,
Supplementary Material online).
Given the large number of small GTPases encoded in
Lokiarchaeum, we performed extensive domain and sequence motif searches (supplementary table S3 and text S4,
Supplementary Material online) for various canonical eukaryotic GEFs (guanine nucleotide exchange factors) and GAPs
(GTPase-activating proteins). We could not identify any significant hits in Lokiarchaeum. However, roadblock and longin
domains are also known to be present in diverse GTPase
interacting proteins (Miertzschke et al. 2011; Levine et al.
2013; De Franceschi et al. 2014).
Lokiarchaeal Roadblock Proteins and an Archaeal
Origin of the RLC7 Family
Roadblock domain proteins include the MglB family, ubiquitous in Archaea and Bacteria, and the RLC7 family present in
eukaryotes. The MglB family acts as GAPs for MglA family small
GTPases in, among others, Myxococcus xanthus (Leonardy
et al. 2010). Composed of five β-strands and two to three α-helices, MglB domains (α-β-β-α-β-β-β-α) contain a terminal
α-helix, which is lacking in the eukaryotic RLC7 family of
dynein chain components (Koonin and Aravind 2000).
The abundance of small GTPases encoded in
Lokiarchaeum spurred previous investigations, which revealed that the lokiarchaeal composite genome encodes diverse roadblock domain proteins (Spang et al. 2015). A closer
inspection using more sensitive homology searching methods
(Materials and Methods) revealed a total of 38 potential roadblock domains in the lokiarchaeal composite genome, some
of which were fused to the N-terminus of Ras-like or the C-terminus of Rag-like lokiarchaeal small GTPases as discussed
above (fig. 2; supplementary table S4, Supplementary Material
online). While the N-terminal roadblock domains modeled onto
known roadblock structures when separated from the remainder of the GTPase sequence, the C-terminal roadblock
domains only modeled correctly when the complete sequence was used (supplementary table S4, Supplementary
Material online).
Multiple sequence alignments of lokiarchaeal roadblock
domains (supplementary figs. S5 and S6, Supplementary
Material online) revealed that one protein, Lokiarch_06490,
is most similar to eukaryotic RLC7 homologues and lacks the
C-terminal α-helix present in MglB homologues. Further, in
phylogenetic analyses, Lokiarch_06490 as well as close homologues from a marine sediment metagenome (Kawai et al.
2014) emerge basal to the eukaryotic RLC7 clade (PP = 0.99/
BS = 91, fig. 4, supplementary fig. S7, Supplementary Material
online). Albeit with lower support (BS = 64, supplementary
fig. S8, Supplementary Material online), this relationship was
also recovered in a much larger phylogenetic analysis that
included all identified potential archaeal roadblock homologues (Materials and Methods) as well as a representative
set of homologues from eukaryotes, bacteria, and environmental sequences. In accordance with their secondary structure and predicted tertiary structure (supplementary table S4,
Supplementary Material online), all other roadblock domain
proteins present in Lokiarchaeum group with members of the
prokaryotic MglB family.
Importantly, the presence of a roadblock domain in
Lokiarchaeum lacking the last α-helix and forming a monophyletic clade with eukaryotic homologues suggests that the
archaeal ancestor of eukaryotes already encoded a bona fide
eukaryotic RLC7 protein, a component of dynein light chains.
While not a component of the trafficking machinery per se,
this observation provides evidence for an additional orthologue in Lokiarchaeum of a protein previously held as specific
to eukaryotes.
Tracing the Origin of Eukaryotic Longin Domain
Proteins
Finally, we sought to investigate the suggested presence of
longin domains in Lokiarchaeum (Spang et al. 2015). The
longin domain is composed of 120 amino acids forming
five antiparallel β-strands sandwiched between α-helices
(Tochio et al. 2001), and is present in seven conserved eukaryotic protein superfamilies including longins, heterotetrameric adaptor complexes (HTACs), Sedlins, SANDs (Mon1
proteins), Targetins, DENNs, and AVLs (De Franceschi et al.
2014). Previous domain analysis using IPRscan (Goujon et al.
2010) suggested that the lokiarchaeal composite genome
contains eight longin-like domains (IPR011012 and
IPR010908) or partial MON1 domains (IPR004353), which
thus far had not been found in prokaryotes (Spang et al.
2015). Our more extensive analysis of this protein family in
the lokiarchaeal composite genome extended the number of
putative longin domains to 41, five of which were fused to
lokiarchaeal Arf-like small GTPases (figs. 2 and 5; supplementary
figs. S9 and S10 and table S4, Supplementary Material
online).
Multiple sequence alignments of lokiarchaeal longins combined with secondary structural predictions suggest conservation of the β-β-α-β-β-β-α-α arrangement characteristic of
longin domains (supplementary fig. S9, Supplementary
Material online). While several of these putative longins modeled with moderate to good confidence to various eukaryotic
longin domain proteins, a subset did not model well (supplementary
table S4, Supplementary Material online). This might
in part be due to the absence of closely related homologues in
relevant databases.
Phylogenetic analyses including all major eukaryotic longin
domain families, which, apart from Lokiarchaeum, are lacking
in all other prokaryotes, revealed that many lokiarchaeal
longin domains emerge as single branches from a large
multi-furcation (fig. 5). This lack of overall resolution is not
surprising given the short length of the longin domain, the
deep evolutionary distance involved, as well as the extensive
subfunctionalization of this domain in eukaryotes and, arguably, in Lokiarchaeum as well. Nonetheless, some resolution at
more internal nodes was observed. Three lokiarchaeal sequences (Lokiarch_04850, Lokiarch_13110, Lokiarch_01890)
group with the α-subunit of the eukaryotic signal recognition
particle receptor, albeit with low BS (PP = 0.91/BS = 24, fig.
5). This affiliation was not recovered in phylogenetic analyses
excluding divergent lokiarchaeal homologues (supplementary
fig. S10, Supplementary Material online). Several well-supported clades of related sequences were also observed, including a clade of those lokiarchaeal longins that are fused to Arf-like small GTPases (PP = 1/BS = 100, figs. 2 and 5).
Most notably, we observed a well-supported clade of eukaryotic longin domains comprising medium and small subunits of the TSET, COPI, and Adaptin complexes (PP = 0.95/
BS = 85). Our initial set of analyses also yielded a rooted set of
relationships for these complexes, suggesting, for the first
time, that TSET and COPI are indeed derived from a common
ancestor, to the exclusion of the adaptins (0.99/87 for COPZ
and TSPOON, fig. 5). Additional phylogenies aimed at corroborating these relationships (supplementary figs. S11 and
S12, Supplementary Material online) revealed that the TSET
subunits TCUP and TSPOON are indeed more closely related
to their COPI counterparts, the delta and zeta subunits,
respectively.
Altogether, despite poor backbone resolution, these findings indicate that the archaeal ancestor of eukaryotes already
encoded bona fide longin domains.
Discussion
The composite genome of Lokiarchaeum, which emerges as a
sister group to eukaryotes in phylogenetic analyses, and
whose genome has revealed the presence of an extended
set of novel Eukaryotic Signature Proteins (ESPs) (Spang
et al. 2015), has provided us with the opportunity to reinvestigate the prokaryotic origins for the building blocks of eukaryotic membrane-trafficking components. In the current
study, we demonstrate the origin of two important building
blocks of the eukaryotic trafficking machinery, small GTPases
and longin domains, from an archaeal source. These findings
underscore the notion that the archaeal ancestor of eukaryotes was more complex than previously presumed (Martijn
and Ettema 2013; Koonin and Yutin 2014; Koonin 2015) and,
in particular, they provide more detail on the emergence and
evolution of the critical eukaryotic MTS.
Notably, several aspects of our analyses speak to the veracity of these lokiarchaeal sequences as ancestral eukaryotic
genes contributed by the closest archaeal contributor to
eukaryogenesis, as opposed to either recent horizontal gene
transfer or genome project contamination. First, our analysis
of genomic flanking regions (supplementary fig. S13 and text
S6, Supplementary Material online) showed that the ESPs are
located on contigs that contain clear homologues from prokaryotes, making misassembly or recent horizontal gene
transfer unlikely. Furthermore, while we cannot rule out the
possibility that there may be undiscovered eukaryotic diversity that is not included in our analyses, we included as wide a
sampling of eukaryotic diversity as possible, including sampling environmental databases. In our phylogenetic analyses,
none of the lokiarchaeal sequences group within clades of
eukaryotic sequences, but are rather basal to them. This
makes explaining their presence by horizontal gene transfer
or contamination unparsimonious, as compared to their being distant but distinct archaeal homologues.
Origins of Eukaryotic Small GTPase Families
Except for the Ran family, the diverse eukaryotic small GTPase
families are often expanded and diversified in eukaryotes, with
multiple homologues encoded in each genome. In previous
studies it was suggested that LECA had a single Ran and
SRPRβ (Rojas et al. 2012), up to 23 Rabs (Diekmann et al.
2011; Elias et al. 2012), as well as a single Arf homologue, a
single Sar homologue, and at least four Arl homologues (Li
et al. 2004). Although obtaining phylogenetic resolution of
some members has proven difficult, it was proposed that
LECA might in addition have possessed at least one member
of the Rho family, Rac (Boureux et al. 2007), and at least two
Ras members, Rap and Rheb (Van Dam et al. 2011). In stark
contrast, prokaryotic homologues are few in number and
largely restricted to the MglA and Rup families (Wuichet
and Søgaard-Andersen 2014). In previous studies the Rup2
family was suggested as the prokaryotic ancestor of the Arf
family (Wuichet and Søgaard-Andersen 2014), while at the
same time it was shown that MglA proteins bear some structural and mechanistic similarities to Arf (Miertzschke et al.
2011). Yet, the prokaryotic origins of Arf, as well as the other
eukaryotic small GTPase families, remained ambiguous.
In this study, we undertook a detailed analysis of a subset
of 70 small GTPases encoded in the lokiarchaeal composite
genome. Consistent with previous results (Spang et al. 2015),
the majority of sequences did not form sister groups to specific eukaryotic families, with the striking exception of Gtr/Rag
family GTPases. Our phylogenetic analyses and in-depth analysis of sequence patterns (figs. 2 and 3; supplementary figs.
S1–S3 and text S1, Supplementary Material online) are consistent with other studies suggesting an origin for the Ras-like
superfamily of GTPases as members of a larger clade including
the Rup1 family (Wuichet and Søgaard-Andersen 2014).
Although none of the lokiarchaeal sequences placed phylogenetically as outgroup to any of the eukaryotic families, support for the group as a whole is strong. Future studies
including new sequence data might be able to clarify the
internal relationships between these groups.
In contrast to previous findings (Wuichet and Søgaard-Andersen 2014), our results confirm and extend those of
Spang et al. (2015), indicating that the Rup2 family of small
GTPases is more closely related to the Rup1/Ras-like superfamily clade than to the Arf-like superfamily (fig. 2; supplementary
fig. S1, Supplementary Material online).
Lokiarchaeum does not encode direct orthologues of any
Arf-like family, suggesting a separate origin for members of
this superfamily, possibly from the Korarchaeota (Spang et al.
2015; supplementary text S1 and fig. S2, Supplementary
Material online). Future analyses with new archaeal genome
data may shed more light on the origins of the Arf-like
superfamily.
We identified 14 lokiarchaeal sequences that group with
members of the atypical eukaryotic GTPases of the Gtr/Rag
family with strong support. Although previous biochemical
and sequence analyses placed these GTPases as members of
the Ras superfamily, this view is not without contention and
the relationship of the family to other Ras superfamily members was unclear (Sekiguchi et al. 2001; Rojas et al. 2012). Here,
we show that Gtr/Rag family GTPases are related to the
Arf-like superfamily, and we could firmly establish that
lokiarchaeal sequences represent archaeal orthologues of
this eukaryotic family.
In eukaryotes, Gtr/Rag family GTPases function in the
amino acid-regulated activation of TORC1, which in turn
regulates cell growth, metabolism, and autophagy in response
to amino acid starvation (Kim et al. 2008; Sancak et al. 2008;
Sancak et al. 2010). The GTPases are tethered at the lysosome/vacuole by the pentameric Ragulator complex, a RagA/
B GEF activated by amino acids via the vacuolar ATPase
(Sancak et al. 2010; Zoncu et al. 2011; Bar-Peled et al. 2012).
Ragulator acts both as a GEF and as a tether, the p18 component of which is N-terminally modified with lipid moieties
to mediate its association with the lysosomal/vacuolar compartment (Nada et al. 2009). The remaining four Ragulator
components all possess roadblock domains (Kurzbauer et al.
2004; Lunin et al. 2004; Garcia-Saez et al. 2011; Levine et al.
2013). Recent studies have identified the GATOR1 complex,
including the longin domain proteins Nprl2 and Nprl3
(Levine et al. 2013) as the corresponding GAP (Bar-Peled
et al. 2013; Panchaud et al. 2013). Although direct orthologues
of all of these regulatory components as well as the TORC1
complex are absent from the lokiarchaeal composite genome
(supplementary table S3, Supplementary Material online), the
presence of Gtr/Rag family members, multiple roadblock, and
longin domain proteins, as well as a putative vacuolar ATPase
and a primordial ESCRT complex (Spang et al. 2015) raises the
intriguing possibility that Lokiarchaeum might possess a
primitive endolysosomal-like capacity (Koonin 2015).
Evolutionary Insights into GTPase Regulation
GTPases bind both GDP and GTP with low nanomolar to
picomolar affinities, such that their intrinsic rate of GDP/GTP
turnover and thus GTPase activity is very low. These properties necessitate the presence of additional factors, GEFs and
GAPs, to allow for these reactions to occur on biologically
relevant timescales (Bos et al. 2007).
While our extensive analyses (supplementary table S3 and
text S4, Supplementary Material online) did not recover significant hits in Lokiarchaeum for canonical eukaryotic GEFs
and GAPs, the large number of roadblock (38) and longin (41)
domains encoded is striking (supplementary table S4,
Supplementary Material online). Recently, it was established
that dimerized roadblock domain proteins of the MglB family
in prokaryotes possess GAP activity towards MglA family
GTPases (Miertzschke et al. 2011) by causing a drastic conformational change in the GTPase, which allows the intrinsic
arginine and glutamine residues to hydrolyze GTP. As in other
prokaryotes, the various lokiarchaeal MglB homologues could
serve as GAPs in this organism (supplementary text S5,
Supplementary Material online). This idea is supported by
the observation that a subset of these proteins was directly
fused to specific groups of small GTPases. Thus, roadblock
domain proteins might be involved in the regulation of
lokiarchaeal small GTPases.
Until recently, longin domains had only been found in
eukaryotes, with the proposal that roadblocks gave rise to
FIG. 5. Phylogenetic analysis of lokiarchaeal and eukaryotic longin-domain proteins. This figure demonstrates the phylogenetic relationships of
putative lokiarchaeal longin-domain proteins to a representative set of eukaryotic longin-domain proteins, based on aligned amino acid sequences
(171 positions). Well-supported groups of longin domain proteins are indicated by gray boxes and associated structures are shown for sequences in
bold font followed by an asterisk.
longins by circular permutation no later than in LECA. This
proposition suggests that the last element of the longin domain originated from the first element of the roadblock domain (Levine et al. 2013; De Franceschi et al. 2014). The
identification of longin domains in Lokiarchaeum places
this transition much earlier, before the emergence of eukaryotes from an archaeal ancestor. Notably, while GEFs for some
members of the Ras-like superfamily remain poorly characterized, several GTPase interacting proteins and protein complexes contain longin domains. This includes the TRAPP
complex (Cai et al. 2008), the DENN protein family
(Yoshimura et al. 2010; Wu et al. 2011), the Mon1/Ccz1 complex (Cabrera et al. 2014), and the GATOR1 complex (Levine
et al. 2013). Similar to roadblock domains, longin domains are
found in interacting modules that possess either GEF or GAP
activity. These observations, combined with our findings, including several independent fusions of roadblock and longin
domains directly to archaeal GTPases, suggest that these domains represent ancestral GTPase regulators (supplementary
text S5, Supplementary Material online).
Importantly, although small GTPases are thought to share
a common origin, the various eukaryotic families of GAP and
GEF proteins are structurally and evolutionarily unrelated
(Bos et al. 2007), even between closely related families of
GTPases. For example, the GAPs for Arf and Arl GTPases
are unrelated (East et al. 2012), GEF and GAP regulators differ
between Ras and Rab families (Barr and Lambright 2010; Van
Dam et al. 2011), and even within the Rab family, nonhomologous GEF families operate on different Rab paralogues (Barr
and Lambright 2010). Our phylogenetic analysis of lokiarchaeal small GTPases established that the main divisions into
Ras-like and Arf-like superfamilies occurred even before the
radiation of the eukaryotic lineage, yet Lokiarchaeum does
not encode homologues of canonical eukaryotic GEF and
GAP families. This supports the view that small GTPases diversified in the eukaryotic lineage prior to the acquisition of
GEF and GAP families for each GTPase family, and clarifies the
observed evolutionary relationships between these regulatory
factors. It furthermore highlights the different evolutionary
dynamics of the GTPases and their regulators where diversification from a common ancestor appears dominant for the
GTPases, but where accretion of nonhomologous factors, followed by diversification by gene duplication, plays more of a
role for the regulators.
What Remains to Be Discovered
The ambiguous nature of the prokaryotic connection to eukaryotic trafficking machinery components has hampered
efforts to delve into the deepest roots of the system and
understand the connection to eukaryogenesis. While we identified lokiarchaeal candidates for two key building blocks of
the trafficking system (GTPase and longin domains), we were
unable to detect direct homologues for the beta-propeller
+ alpha-solenoid proteins or eukaryotic coiled-coil domain
proteins such as tethers and SNAREs. The genes encoding
these proteins could have been present in the archaeal ancestor of eukaryotes, but were lost from the lokiarchaeal lineage analyzed in this study. Alternatively, the genes encoding
the building blocks were acquired via other prokaryotic contributors (fig. 1; Pittis and Gabaldon 2016).
Additionally, the presence of building blocks but not of
fusions in the eukaryotic configurations potentially puts some
bounds on the timing of the archaeal contribution. Either the
fusions seen in eukaryotes evolved after the divergence of the
lineages or were present and lost later in Lokiarchaeum.
Important eukaryotic fusions, such as those between a longin
and SNARE domain leading to the R-SNARE family, involve at
least one domain not present in the Lokiarchaeum genome.
While this means that a great deal of membrane-trafficking evolution, via building block acquisition and domain fusion, took place after the sampling point that
Lokiarchaeum represents, the fact that seemingly independent fusions of building blocks were observed in
Lokiarchaeum suggests that the fusion of building blocks
might have represented a general mechanism to generate
functional diversity in the lineage leading to eukaryotes.
Concluding Remarks
While the current analysis of the Lokiarchaeum genome alone
does not explain the origin of the membrane-trafficking components, it provides the first concrete sampling point giving
insight into one starting point of the complicated journey
that gave rise to this sophisticated array of cellular transport
machinery. The demonstration of the direct orthologous relationship of two lokiarchaeal protein families—the roadblock-only RLC7 and Gtr/Rag GTPases—with their
eukaryotic counterparts, provides further support for the relationship of this archaeal lineage with the elusive ancestor of
eukaryotes (Spang et al. 2015). Future efforts to
obtain and analyze additional archaeal lineages related to
Lokiarchaeota will certainly yield even deeper insights into the
origin and evolution of these key eukaryotic features (Saw
et al. 2015).
Materials and Methods
Identification of Trafficking-Related Machinery
Initial domain searches were based on IPRscan (Goujon et al.
2010), Pfam (Finn et al. 2014), and SMART (Letunic et al.
2014) databases. Secondary structure prediction was carried
out using the JPred4 server (Drozdetskiy et al. 2015) or the
integrated JPred3 interface available in Jalview (Waterhouse
et al. 2009). Homology searches were carried out using the
BLASTp and psi-BLAST algorithms (Altschul et al. 1997), with
a maximum of six iterations for psi-BLAST, against the predicted lokiarchaeal proteome. For queries defined by the
presence of one or more domains or structural units, hidden
Markov model (HMM)-based searches using HMMer (Finn
et al. 2011) were performed with curated sequence alignments from the Pfam database (Finn et al. 2014). The
HHSuite set of programs was used to build a database of
HMM models for every protein in the predicted
lokiarchaeal proteome (Söding 2005). HMM models were
built through comparison to the UniProt database clustered
to 20% sequence identity, with a maximum of two iterations.
Single query sequences were also transformed to HMMs in
this manner to allow the use of HMM-HMM pairwise
searches suited for uncovering distant evolutionary relationships; alignment-based queries, such as those obtained from
Pfam, were used as direct inputs for HMM-HMM comparison. In all cases, hits with an E-value ≤ 0.05 were chosen as
potential homologues for further analysis. This involved
BLAST searches against the Homo sapiens and nonredundant
databases (NRDB) at NCBI, as well as domain and structural
predictions. The top hits from each search were also subjected to HMM-HMM comparisons against H. sapiens via
http://toolkit.tuebingen.mpg.de/hhpred (last accessed February 22, 2016).
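The hit-selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the tuple layout of a hit, the function name `select_hits`, and the example identifiers are hypothetical, and treating the 0.05 cutoff as inclusive is also an assumption.

```python
from typing import Iterable, List, Tuple

# Hypothetical hit record: (query_id, target_id, e_value).
Hit = Tuple[str, str, float]

# Cutoff value taken from the text; treating it as inclusive is an assumption.
E_VALUE_CUTOFF = 0.05


def select_hits(hits: Iterable[Hit], cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Return hits whose E-value does not exceed the cutoff, lowest E-value first."""
    kept = [hit for hit in hits if hit[2] <= cutoff]
    return sorted(kept, key=lambda hit: hit[2])


if __name__ == "__main__":
    # Made-up example hits, for illustration only.
    example = [
        ("query_A", "target_1", 1e-12),
        ("query_A", "target_2", 0.2),
        ("query_B", "target_3", 0.03),
    ]
    for query, target, e_value in select_hits(example):
        print(query, target, e_value)
```

In practice the retained candidates would then go on to the reciprocal searches and structural predictions described in the text.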
The ard2 (Fournier et al. 2013) and SMURF (Menke et al.
2010) webservers were used for the prediction of alpha-solenoids and beta-propellers, respectively. Sequence motif prediction and analysis was performed using the MEME suite
(Bailey et al. 2015) while sequence logos were created using
WebLogo (Crooks et al. 2004). Tertiary structure models were
created using the Phyre2 web server, using normal mode
for batch submission and intensive mode for final models
(Kelley et al. 2015).
Unless otherwise stated, all program parameters for homology searching, domain identification, and structural prediction programs were left at their respective defaults.
Analysis of Flanking Regions
To check for potential assembly error/eukaryotic contamination for our candidate ESPs, we performed an analysis of taxonomic affiliation of genes encoded on contigs with either
GTPases, roadblock, or longin domain proteins. Taxonomic
affiliation was inferred using the lowest common ancestor
rule (parameters were set as follows: Min Score, 50; Max
Expected, 0.01; Top Percent, 5; Min Support, 1; Min
Complexity, 0.0) as applied by MEGAN (Huson et al. 2007)
and the results of this analysis were plotted in R (supplementary
fig. S13 and text S6, Supplementary Material online; Spang
et al. 2015). Additionally, we gathered the two flanking genes
upstream and the two downstream of each candidate ESP and subjected them to the homology search methods
described above to determine if any conserved functions or
previously undetected eukaryotic membrane-trafficking components were present (supplementary text S6 and table S3,
Supplementary Material online).
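The core idea of a lowest common ancestor assignment can be illustrated with a short Python sketch. It is a simplified illustration, not MEGAN's implementation: it assumes each retained hit is already represented as a root-to-leaf list of taxon names, and it omits the score, E-value, and top-percent filters listed above. The function name and the example lineages are hypothetical.

```python
from typing import List, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest shared prefix of root-to-leaf taxonomic lineages."""
    if not lineages:
        return []
    shared: List[str] = []
    # zip stops at the shortest lineage, so the prefix can never overrun one.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


if __name__ == "__main__":
    # Made-up lineages for the hits of a single gene.
    hits = [
        ["cellular organisms", "Archaea", "Euryarchaeota"],
        ["cellular organisms", "Archaea", "Crenarchaeota"],
    ]
    print(lowest_common_ancestor(hits)[-1])
```

A gene whose hits span several domains of life would be assigned only to the shared root under this rule, which is why contigs carrying genes with clear prokaryotic affiliations are informative about contamination.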
Sequence Selection for Phylogenetic Analysis
The set of Lokiarchaeal small GTPases previously identified by
IPRscan (Spang et al. 2015) was validated and expanded using
HMMer searches against the lokiarchaeal proteome yielding
additional candidate homologues. From a starting set of 117
putative small GTPases, we removed proteins predicted to be
involved in ATP-dependent processes, those that were
shorter than the proposed minimal Ras domain of 160 amino
acids (Cenatiempo et al. 1987; Parmeggiani et al. 1987), those
missing one or more of the G box motifs responsible for
nucleotide binding and catalysis (Bourne et al. 1991), as
well as those that were highly divergent and/or of uncertain
affiliation in phylogenetic analyses (supplementary text S1,
Supplementary Material online). This resulted in a final set
of 70 sequences for downstream phylogenetic analyses (supplementary
table S1, Supplementary Material online).
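Two of the filtering criteria above, minimum length and presence of G box motifs, lend themselves to a small Python sketch. It is a hedged illustration rather than the procedure actually used: the regular expressions are simplified consensus patterns for the G1, G3, and G4 boxes chosen here as an assumption, the function name is hypothetical, and the other criteria (ATP-dependent annotation, divergence in trees) are not modeled.

```python
import re

# Minimal Ras domain length stated in the text.
MIN_LENGTH = 160

# Simplified consensus patterns; these exact expressions are an assumption.
G_BOX_PATTERNS = {
    "G1": re.compile(r"G.{4}GK[ST]"),  # phosphate-binding loop
    "G3": re.compile(r"D.{2}G"),
    "G4": re.compile(r"[NT]K.D"),
}


def passes_gtpase_filters(sequence: str) -> bool:
    """True if the sequence is long enough and contains every motif pattern."""
    if len(sequence) < MIN_LENGTH:
        return False
    return all(pattern.search(sequence) for pattern in G_BOX_PATTERNS.values())


if __name__ == "__main__":
    # Synthetic sequence built only to exercise the filter.
    synthetic = ("M" + "A" * 10 + "GAAAAGKS" + "A" * 40 + "DAAG"
                 + "A" * 50 + "NKAD" + "A" * 60)
    print(passes_gtpase_filters(synthetic))
```

The sketch checks only that each pattern occurs somewhere in the sequence; a stricter version could also require the motifs to appear in their expected order.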
For the purposes of identifying relationships between
lokiarchaeal small GTPases and known families from prokaryotes and eukaryotes, we chose a strategy to identify marker
sequences that would serve as surrogates for each established
family. For bacterial and archaeal small GTPases, we started
with a set of 526 previously identified proteins (Wuichet and
Søgaard-Andersen 2014), and performed multiple iterations
of tree reconstruction, each time removing branches from the
common group node that were visibly longer than the majority. This is an approximation of the scrollsaw approach that
aims to choose the least divergent orthologues to represent
larger assemblages in phylogenetic analyses (Elias et al. 2012).
This resulted in a set of bacterial and archaeal GTPases belonging to one of seven groups (MglA1-5, Rup1, and Rup2).
For the eukaryotic Ran and Rab families, homologues previously
identified as the least divergent members of their respective
families were used (Elias et al. 2012). For the remainder of the
eukaryotic GTPase families, we devised a pipeline to provide
similar surrogate sequences without necessitating the same
level of in-depth analysis. We used relevant queries from
H. sapiens in BLAST searches against the NRDB and limited
results to the top 10 hits from each eukaryotic phylum, as
defined by the NCBI taxonomy database. Short branches were
selected as described above for prokaryotic sequences. Finally,
seven sequences representing the elongation factor Tu (EF-Tu)
family were included as an outgroup. Two different strategies
to select sequences for expanded GTPase data sets are described in supplementary text S1, Supplementary Material
online.
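The short-branch selection described above was done by visual inspection over repeated rounds of tree reconstruction. A single round can be approximated numerically, as in the following Python sketch. The median-based cutoff, the `factor` parameter, and the function name are all assumptions introduced for illustration; in the procedure described in the text, the tree would be re-inferred before each further round.

```python
from statistics import median
from typing import Dict


def drop_long_branches(lengths: Dict[str, float], factor: float = 2.0) -> Dict[str, float]:
    """Keep tips whose distance to the group node is at most factor times the median."""
    if not lengths:
        return {}
    limit = factor * median(lengths.values())
    return {tip: dist for tip, dist in lengths.items() if dist <= limit}


if __name__ == "__main__":
    # Made-up distances from each tip to the common node of its group.
    distances = {"seq_a": 0.10, "seq_b": 0.12, "seq_c": 0.11, "seq_d": 0.90}
    print(sorted(drop_long_branches(distances)))
```

The median is used here because it is not inflated by the long branches that the step is meant to remove.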
Lokiarchaeal roadblock domain proteins were identified
using IPRscan (IPR004942) as well as by HMMer searches
against the lokiarchaeal composite genome (supplementary
table S4, Supplementary Material online). Close homologues
(more than 80% amino acid identity and sequence coverage)
of these lokiarchaeal roadblock domain proteins were extracted from a marine sediment metagenome known to contain high numbers of organisms affiliating with the Deep Sea
Archaeal Group (DSAG) (Inagaki et al. 2003). A representative
set of eukaryotic, archaeal, and bacterial MglB and RLC7 family domain proteins (IPR004942) that aimed at covering the
taxonomic diversity of members of all three groups was
downloaded from UniProt using SMART (Letunic et al.
2014). In an additional phylogenetic analysis of a more comprehensive set of roadblock homologues, all archaeal arCOGs
(Makarova et al. 2007) related to roadblock domain proteins
were included (arCOG02603, arCOG02605, arCOG03412,
arCOG05211, arCOG05565, arCOG08684, and
arCOG11698), as were all roadblock domain proteins assigned
to IPR004942 and present in a taxonomically diverse set of
eukaryotic and bacterial genomes and lokiarchaeal homologues, which were not fused to small GTPases. In addition,
Lokiarch_06490, which represents the eukaryotic-type roadblock domain protein of Lokiarchaeum, was queried against
the environmental NR database to identify marine sediment
homologues as well as against all archaeal and bacterial genomes, respectively. Notably, and consistent with results from
previous phylogenetic analyses, no homologues of this eukaryotic-type roadblock protein were detected in either bacterial or archaeal genomes. Prior to final alignment,
redundant sequences (threshold 99% identity) were
removed.
Lokiarchaeal longin domain proteins were initially identified using IPRscan (IPR004353 and IPR011012). Subsequently,
additional putative homologues not detected by IPRscan
were found using HMMer (supplementary table S2,
Supplementary Material online). Metagenomic longin/
Mon1 domain proteins assigned to either of these IPRscan
categories were extracted from UniProt (identity > 30%; coverage > 80%; E-value > 1e-4). Additionally, longin domain
containing proteins from diverse eukaryotic protein families
(including the alpha subunit of the signal recognition particle
receptor; synaptobrevins Ykt6, Sec22, and VAMP; TRAPP
subunits TRAPPC1, TRAPPC2, TRAPPC2-like, TRAPPC4;
Mon1 domain proteins; and small and medium subunits
from the various HTAC representatives (COPI, TSET, and
the Adaptins)) were selected from a subset of seven representative eukaryotic species. In some cases, highly divergent
members of these protein families were removed.
Alignments and Phylogenetic Analyses
Alignments were created using MAFFT-L-INS-i version 7
(Katoh and Standley 2013) either using default parameters
(longin domain proteins and small GTPase family proteins) or
by leaving regions with gaps (roadblock alignment).
Alignments were manually curated and adjusted in Jalview
(Waterhouse et al. 2009). Zorro, a probabilistic masking program, was used to assign confidence scores to aligned sites
(Wu et al. 2012) used for maximum likelihood inferences. For
large GTPase and roadblock alignments (supplementary figs.
S2 and S8, Supplementary Material online), alignments were
trimmed using trimAl (Capella-Gutierrez et al. 2009) with
either the “gappyout” (roadblock) or “gt 0.7” (GTPase) options to select columns. Before subjecting the large roadblock
alignment to trimAl, ambiguously aligned positions at the N- and C-termini were trimmed manually. This included the
removal of the last alpha helix, which was absent from eukaryotic RLC7 family proteins as well as from Lokiarch_06490
and some environmental homologues. Bayesian analyses
were carried out using PhyloBayes v3.3 (Lartillot et al. 2009)
with default settings and the CAT-GTR model. For each of the
alignments, four chains were run in parallel, sampling every
100 points until the maximum difference was <0.15 (or <0.3
for the longin-small data set). The first 20% or the respective
generations at which the parameters started to stabilize were
selected as burn-in. Maximum likelihood bootstrapping using
the -f b option was carried out using RAxML v8.1.17
(Stamatakis 2014) under the LG + Γ model to yield 100
nonparametric bootstrap replicates while rapid bootstrapping using the -f a option was performed for the large
GTPase data sets. Bootstrapping was carried out on unmodified alignments, or using Zorro confidence scores as column
weights, and bootstrap scores were mapped onto the best
Bayesian topology using the sumtrees program of the
DendroPy package (Sukumaran and Holder 2010). All
inference under gamma-distributed rates used four distinct
rate categories, unless otherwise stated. Analyses were either
run locally or using the CIPRES Science Gateway webportal
(Miller et al. 2010).
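The stopping rule for the Bayesian chains described above can be illustrated with a minimal Python sketch. It is not the PhyloBayes implementation (PhyloBayes ships its own chain comparison tool), and every name in it is hypothetical. It assumes that the "maximum difference" is the largest spread, across chains, in the posterior frequency of any bipartition after burn-in removal, and that a run counts as converged when this value is at or below the threshold; both points are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Sequence, Set

# Hypothetical encoding: a bipartition is the set of taxa on one side of a branch,
# and a sampled tree is reduced to the set of bipartitions it contains.
Bipartition = FrozenSet[str]
Tree = Set[Bipartition]


def discard_burn_in(samples: Sequence[Tree], fraction: float = 0.2) -> List[Tree]:
    """Drop the first `fraction` of a chain's samples as burn-in."""
    n_burn = int(len(samples) * fraction)
    return list(samples[n_burn:])


def bipartition_frequencies(trees: Sequence[Tree]) -> Dict[Bipartition, float]:
    """Fraction of sampled trees in which each bipartition occurs."""
    if not trees:
        raise ValueError("no samples left after burn-in")
    counts: Dict[Bipartition, int] = {}
    for tree in trees:
        for bp in tree:
            counts[bp] = counts.get(bp, 0) + 1
    return {bp: n / len(trees) for bp, n in counts.items()}


def max_difference(chains: Sequence[Sequence[Tree]], burn_in: float = 0.2) -> float:
    """Largest across-chain spread in bipartition frequency (assumed definition)."""
    freqs = [bipartition_frequencies(discard_burn_in(c, burn_in)) for c in chains]
    all_bps = set().union(*freqs)
    spreads = [
        max(f.get(bp, 0.0) for f in freqs) - min(f.get(bp, 0.0) for f in freqs)
        for bp in all_bps
    ]
    return max(spreads, default=0.0)


def has_converged(chains: Sequence[Sequence[Tree]],
                  threshold: float = 0.15,
                  burn_in: float = 0.2) -> bool:
    """Assumed stopping rule: maximum difference at or below the threshold."""
    return max_difference(chains, burn_in) <= threshold
```

In this sketch, the four parallel chains would be passed as a list of four sample lists, with `threshold=0.15` for most data sets and `threshold=0.3` for the longin-small data set.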
Supplementary Material
Supplementary tables S1–S4, figures S1–S13, and texts S1–S6
are available at Molecular Biology and Evolution online
(http://www.mbe.oxfordjournals.org/).
Acknowledgments
This work was supported by the European Research Council
(ERC Starting grant no. 310039-PUZZLE_CELL) and the
Swedish Foundation for Strategic Research (FFL12-0024) to
T.J.G.E., as well as by a Marie Curie IEF (625521) grant by the
European Union to A.S. C.M.K. is supported by graduate studentships from the Women and Children’s Health Research
Institute and Alberta Innovates Health Solutions. J.B.D. is the
Canada Research Chair in Evolutionary Cell Biology and this
work was supported by an NSERC Discovery grant
(RES0021028) and a grant from Alberta Innovates
Technology Futures (RES0004718). The authors would like
to thank Dr Lionel Guy for an R script to plot lokiarchaeal
contigs.
References
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W,
Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation
of protein database search programs. Nucleic Acids Res.
25:3389–3402.
Bailey TL, Johnson J, Grant CE, Noble WS. 2015. The MEME Suite. Nucleic
Acids Res. 43:39–49.
Bar-Peled L, Chantranupong L, Cherniack AD, Chen WW, Ottina KA,
Grabiner BC, Spear ED, Carter SL, Meyerson M, Sabatini DM. 2013. A
tumor suppressor complex with GAP activity for the Rag GTPases
that signal amino acid sufficiency to mTORC1. Science
340:1100–1106.
Bar-Peled L, Schweitzer LD, Zoncu R, Sabatini DM. 2012. Ragulator is a
GEF for the Rag GTPases that signal amino acid levels to mTORC1.
Cell 150:1196–1208.
Barr F, Lambright DG. 2010. Rab GEFs and GAPs. Curr Opin Cell Biol.
22:461–470.
Bonifacino JS, Glick BS. 2004. The mechanisms of vesicle budding and
fusion. Cell 116:153–166.
Bos J, Rehmann H, Wittinghofer A. 2007. GEFs and GAPs: critical elements in the control of small G proteins. Cell 129:865–877.
Boureux A, Vignal E, Faure S, Fort P. 2007. Evolution of the Rho family of
Ras-like GTPases in eukaryotes. Mol Biol Evol. 24:203–216.
Bourne HR, Sanders DA, McCormick F. 1991. The GTPase superfamily:
conserved structure and molecular mechanism. Nature
349:117–127.
Cabrera M, Engelbrecht-Vandre S, Ungermann C. 2014. Function of the
Mon1-Ccz1 complex on endosomes. Small GTPases 5:1–3.
Cai H, Reinisch K, Ferro-Novick S. 2007. Coats, tethers, Rabs, and SNAREs
work together to mediate the intracellular destination of a transport
vesicle. Dev Cell. 12:671–682.
Cai Y, Chin HF, Lazarova D, Menon S, Fu C, Cai H, Sclafani A, Rodgers
DW, De La Cruz EM, Ferro-Novick S, et al. 2008. The structural basis
for activation of the Rab Ypt1p by the TRAPP membrane-tethering
complexes. Cell 133:1202–1213.
Klinger et al. . doi:10.1093/molbev/msw034
Capella-Gutierrez S, Silla-Martínez JM, Gabaldon T. 2009. trimAl: a tool
for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25:1972–1973.
Cenatiempo Y, Deville F, Dondon J, Grunberg-Manago M, Sacerdot C,
Hershey JW, Hansen HF, Petersen HU, Clark BF, Kjeldgaard M, et al.
1987. The protein synthesis initiation factor 2 G-domain. Study of a
functionally active C-terminal 65-kilodalton fragment of IF2 from
Escherichia coli. Biochemistry 26:5070–5076.
Cox CJ, Foster PG, Hirt RP, Harris SR, Embley TM. 2008. The archaebacterial
origin of eukaryotes. Proc Natl Acad Sci U S A. 105:20356–20361.
Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo: a sequence logo generator. Genome Res. 14:1188–1190.
Dacks JB, Field MC. 2007. Evolution of the eukaryotic membrane-trafficking system: origin, tempo and mode. J Cell Sci. 120:2977–2985.
Dacks JB, Poon PP, Field MC. 2008. Phylogeny of endocytic components
yields insight into the process of nonendosymbiotic organelle evolution. Proc Natl Acad Sci U S A. 105:588–593.
Diekmann Y, Seixas E, Gouw M, Tavares-Cadete F, Seabra MC, Pereira-Leal JB. 2011. Thousands of Rab GTPases for the cell biologist. PLoS
Comput Biol. 7:e1002217.
Dong JH, Wen JF, Tian HF. 2007. Homologs of eukaryotic Ras superfamily
proteins in prokaryotes and their novel phylogenetic correlation
with their eukaryotic analogs. Gene 396:116–124.
Drozdetskiy A, Cole C, Procter J, Barton GJ. 2015. JPred4: a protein secondary structure prediction server. Nucleic Acids Res.
43:W389–W394.
East MP, Bowzard JB, Dacks JB, Kahn RA. 2012. ELMO domains, evolutionary and functional characterization of a novel GTPase-activating
protein (GAP) domain for Arf protein family GTPases. J Biol Chem.
287:39538–39553.
Elias M, Brighouse A, Gabernet-Castello C, Field MC, Dacks JB. 2012.
Sculpting the endomembrane system in deep time: high resolution
phylogenetics of Rab GTPases. J Cell Sci. 125:2500–2508.
Ettema TJG, Bernander R. 2009. Cell division and the ESCRT complex: a
surprise from the archaea. Commun Integr Biol. 2:86–88.
Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger
A, Hetherington K, Holm L, Mistry J, et al. 2014. Pfam: the protein
families database. Nucleic Acids Res. 40:D290–D301.
Finn RD, Clements J, Eddy SR. 2011. HMMER web server: interactive
sequence similarity searching. Nucleic Acids Res. 39:W29–W37.
Foster PG, Cox CJ, Embley TM. 2009. The primary divisions of life: a
phylogenomic approach employing composition-heterogeneous
methods. Philos Trans R Soc Lond B Biol Sci. 364:2197–2207.
Fournier D, Palidwor GA, Shcherbinin S, Szengel A, Schaefer MH, Perez-Iratxeta C, Andrade-Navarro MA. 2013. Functional and genomic
analyses of alpha-solenoid proteins. PLoS One 8:e79894.
De Franceschi N, Wild K, Schlacht A, Dacks JB, Sinning I, Filippini F. 2014.
Longin and GAF domains: structural evolution and adaptation to
the subcellular trafficking machinery. Traffic 15:104–121.
Garcia-Saez I, Lacroix FB, Blot D, Gabel F, Skoufias DA. 2011. Structural
characterization of HBXIP: the protein that interacts with the antiapoptotic protein survivin and the oncogenic viral protein HBx.
J Mol Biol. 405:331–340.
Gong R, Li L, Liu Y, Wang P, Yang H, Wang L, Cheng J, Guan KL, Xu Y.
2011. Crystal structure of the Gtr1p–Gtr2p complex reveals new
insights into the amino acid-induced TORC1 activation. Genes Dev.
25:1668–1673.
Goujon M, McWilliam H, Li W, Valentin F, Squizzato S, Paern J, Lopez R.
2010. A new bioinformatics analysis tools framework at EMBL-EBI.
Nucleic Acids Res. 38:W695–W699.
Guy L, Ettema TJG. 2011. The archaeal “TACK” superphylum and the
origin of eukaryotes. Trends Microbiol. 19:580–587.
Guy L, Saw JH, Ettema TJG. 2014. The archaeal legacy of eukaryotes: a
phylogenomic perspective. Cold Spring Harb Perspect Biol. 6:a016022.
Henne WM, Buchkovich NJ, Emr SD. 2011. The ESCRT pathway. Dev Cell.
21:77–91.
Hirst J, Schlacht A, Norcott JP, Traynor D, Bloomfield G, Antrobus R, Kay
RR, Dacks JB, Robinson MS. 2014. Characterization of TSET, an ancient and widespread membrane trafficking complex. Elife 3:e02866.
Huson DH, Auch AF, Qi J, Schuster SC. 2007. MEGAN analysis of metagenomic data. Genome Res. 17:377–386.
Inagaki F, Suzuki M, Takai K, Oida H, Sakamoto T, Aoki K, Nealson KH,
Horikoshi K. 2003. Microbial communities associated with geological
horizons in coastal subseafloor sediments from the Sea of Okhotsk.
Appl Environ Microbiol. 69:7224–7235.
Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol
Evol. 30:772–780.
Kawai M, Futagami T, Toyoda A, Takaki Y, Nishi S, Hori S, Arai W,
Tsubouchi T, Morono Y, Uchiyama I, et al. 2014. High frequency of
phylogenetically diverse reductive dehalogenase-homologous genes in
deep subseafloor sedimentary metagenomes. Front Microbiol. 5:80.
Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJE. 2015. The
Phyre2 web portal for protein modeling, prediction and analysis. Nat
Protoc. 10:845–858.
Kim E, Goraksha-Hicks P, Li L, Neufeld TP, Guan KL. 2008. Regulation of
TORC1 by Rag GTPases in nutrient response. Nat Cell Biol.
10:935–945.
Koonin EV. 2015. Origin of eukaryotes from within archaea, archaeal
eukaryome and bursts of gene gain: eukaryogenesis just made easier?
Philos Trans R Soc Lond B Biol Sci. 370:20140333.
Koonin EV, Aravind L. 2000. Dynein light chains of the Roadblock/LC7
group belong to an ancient protein superfamily implicated in
NTPase regulation. Curr Biol. 10:774–776.
Koonin EV, Yutin N. 2014. The dispersed archaeal eukaryome and the
complex archaeal ancestor of eukaryotes. Cold Spring Harb Perspect
Biol. 6:a016188.
Kurzbauer R, Teis D, de Araujo MEG, Maurer-Stroh S, Eisenhaber F,
Bourenkov GP, Bartunik HD, Hekman M, Rapp UR, Huber LA,
et al. 2004. Crystal structure of the p14/MP1 scaffolding complex:
how a twin couple attaches mitogen-activated protein kinase signaling to late endosomes. Proc Natl Acad Sci U S A.
101:10984–10989.
Lartillot N, Lepage T, Blanquart S. 2009. PhyloBayes 3: a Bayesian software
package for phylogenetic reconstruction and molecular dating.
Bioinformatics 25:2286–2288.
Lasek-Nesselquist E, Gogarten JP. 2013. The effects of model choice and
mitigating bias on the ribosomal tree of life. Mol Phylogenet Evol.
69:17–38.
Leonardy S, Miertzschke M, Bulyha I, Sperling E, Wittinghofer A, Søgaard-Andersen L. 2010. Regulation of dynamic polarity switching in bacteria by a Ras-like G-protein and its cognate GAP. EMBO J.
29:2276–2289.
Letunic I, Doerks T, Bork P. 2014. SMART: recent updates, new developments and status in 2015. Nucleic Acids Res. 43:D257–D260.
Levine TP, Daniels RD, Wong LH, Gatta AT, Gerondopoulos A, Barr FA.
2013. Discovery of new longin and roadblock domains that form
platforms for small GTPases in ragulator and TRAPP-II. Small
GTPases 4:62–69.
Li Y, Kelly WG, Logsdon JM, Schurko AM, Harfe BD, Hill-Harfe KL, Kahn
RA. 2004. Functional genomic analysis of the ADP-ribosylation factor
family of GTPases: phylogeny among diverse eukaryotes and function in C. elegans. FASEB J. 18:1834–1850.
Lindås A-C, Karlsson EA, Lindgren MT, Ettema TJG, Bernander R. 2008. A
unique cell division machinery in the Archaea. Proc Natl Acad Sci U S
A. 105:18942–18946.
Lunin VV, Munger C, Wagner J, Ye Z, Cygler M, Sacher M. 2004. The
structure of the MAPK scaffold, MP1, bound to its partner, p14: a
complex with a critical role in endosomal MAP kinase signaling. J Biol
Chem. 279:23422–23430.
Makarova KS, Sorokin AV, Novichkov PS, Wolf YI, Koonin EV. 2007.
Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea. Biol Direct. 2:33.
Makarova KS, Yutin N, Bell SD, Koonin EV. 2010. Evolution of diverse cell
division and vesicle formation systems in Archaea. Nat Rev Microbiol.
8:731–741.
Martijn J, Ettema TJG. 2013. From archaeon to eukaryote: the evolutionary dark ages of the eukaryotic cell. Biochem Soc Trans. 41:451–457.
Menke M, Berger B, Cowen L. 2010. Markov random fields reveal an N-terminal double beta-propeller motif as part of a bacterial hybrid
two-component sensor system. Proc Natl Acad Sci U S A.
107:4069–4074.
Miertzschke M, Koerner C, Vetter IR, Keilberg D, Hot E, Leonardy S,
Søgaard-Andersen L, Wittinghofer A. 2011. Structural analysis of
the Ras-like G protein MglA and its cognate GAP MglB and implications for bacterial polarity. EMBO J. 30:4185–4197.
Miller MA, Pfeiffer W, Schwartz T. 2010. Creating the CIPRES Science
Gateway for inference of large phylogenetic trees. 2010 Gateway
Computing Environments Workshop GCE. New Orleans
Convention Centre, New Orleans: IEEE. p. 1–8.
Nada S, Hondo A, Kasai A, Koike M, Saito K, Uchiyama Y, Okada M. 2009.
The novel lipid raft adaptor p18 controls endosome dynamics by
anchoring the MEK-ERK pathway to late endosomes. EMBO J.
28:477–489.
Panchaud N, Peli-Gulli MP, De Virgilio C. 2013. SEACing the GAP that
nEGOCiates TORC1 activation: evolutionary conservation of Rag
GTPase regulation. Cell Cycle 12:2948–2952.
Parmeggiani A, Swart GW, Mortensen KK, Jensen M, Clark BF, Dente L,
Cortese R. 1987. Properties of a genetically engineered G domain of
elongation factor Tu. Proc Natl Acad Sci U S A. 84:3141–3145.
Parry DAD, Fraser RDB, Squire JM. 2008. Fifty years of coiled-coils and
alpha-helical bundles: a close relationship between sequence and
structure. J Struct Biol. 163:258–269.
Pittis AA, Gabaldon T. 2016. Late acquisition of mitochondria by a host
with chimaeric prokaryotic ancestry. Nature doi:10.1038/nature16941.
Podar M, Wall MA, Makarova KS, Koonin EV. 2008. The prokaryotic V4R
domain is the likely ancestor of a key component of the eukaryotic
vesicle transport system. Biol Direct. 3:2.
Raymann K, Brochier-Armanet C, Gribaldo S. 2015. The two-domain tree
of life is linked to a new root for the Archaea. Proc Natl Acad Sci U S
A. 112:6670–6675.
Rojas AM, Fuentes G, Rausell A, Valencia A. 2012. The Ras protein superfamily: evolutionary tree and role of conserved amino acids. J Cell
Biol. 196:189–201.
Sancak Y, Bar-Peled L, Zoncu R, Markhard AL, Nada S, Sabatini DM. 2010.
Ragulator-Rag complex targets mTORC1 to the lysosomal surface
and is necessary for its activation by amino acids. Cell 141:290–303.
Sancak Y, Peterson TR, Shaul YD, Lindquist RA, Thoreen CC, Bar-Peled L,
Sabatini DM. 2008. The Rag GTPases bind raptor and mediate amino
acid signaling to mTORC1. Science 320:1496–1501.
Santarella-Mellwig R, Franke J, Jaedicke A, Gorjanacz M, Bauer U, Budd A,
Mattaj IW, Devos DP. 2010. The compartmentalized bacteria of the
planctomycetes-verrucomicrobia-chlamydiae superphylum have
membrane coat-like proteins. PLoS Biol. 8:e1000281.
Saw JH, Spang A, Zaremba-Niedzwiedzka K, Juzokaite L, Dodsworth JA,
Murugapiran SK, Colman DR, Takacs-Vesbach C, Hedlund BP, Guy L,
et al. 2015. Exploring microbial dark matter to resolve the deep
archaeal ancestry of eukaryotes. Philos Trans R Soc Lond B Biol Sci.
370:20140328.
Schlacht A, Herman EK, Klute MJ, Field MC, Dacks JB. 2014. Missing
pieces of an ancient puzzle: evolution of the eukaryotic membrane-trafficking system. Cold Spring Harb Perspect Biol. 6:a016048.
Sekiguchi T, Hirose E, Nakashima N, Li M, Nishimoto T. 2001. Novel G
Proteins, Rag C and Rag D, Interact with GTP-binding Proteins, Rag
A and Rag B. J Biol Chem. 276:7246–7257.
Söding J. 2005. Protein homology detection by HMM-HMM comparison. Bioinformatics 21:951–960.
Spang A, Saw JH, Jørgensen SL, Zaremba-Niedzwiedzka K, Martijn J, Lind
AE, van Eijk R, Schleper C, Guy L, Ettema TJG. 2015. Complex archaea
that bridge the gap between prokaryotes and eukaryotes. Nature
521:173–179.
Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis
and post-analysis of large phylogenies. Bioinformatics 30:1312–1313.
Sukumaran J, Holder MT. 2010. DendroPy: a Python library for phylogenetic computing. Bioinformatics 26:1569–1571.
Tochio H, Tsui MM, Banfield DK, Zhang M. 2001. An autoinhibitory
mechanism for nonsyntaxin SNARE proteins revealed by the structure of Ykt6p. Science 293:698–702.
Van Dam TJ, Bos JL, Snel B. 2011. Evolution of the Ras-like small GTPases
and their regulators. Small GTPases 2:4–16.
Vedovato M, Rossi V, Dacks JB, Filippini F. 2009. Comparative analysis of
plant genomes allows the definition of the “Phytolongins”: a novel
non-SNARE longin domain protein family. BMC Genomics 10:510.
Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ. 2009.
Jalview version 2: a multiple sequence alignment editor and analysis
workbench. Bioinformatics 25:1189–1191.
Williams TA, Foster PG, Nye TMW, Cox CJ, Embley TM. 2012. A congruent phylogenomic signal places eukaryotes within the Archaea.
Proc Biol Sci. 279:4870–4879.
Wu M, Chatterji S, Eisen JA. 2012. Accounting for alignment uncertainty
in phylogenomics. PLoS One 7:e30288.
Wu X, Bradley MJ, Cai Y, Kummel D, De La Cruz EM, Barr FA, Reinisch
KM. 2011. Insights regarding guanine nucleotide exchange from the
structure of a DENN-domain protein complexed with its Rab
GTPase substrate. Proc Natl Acad Sci U S A. 108:18672–18677.
Wuichet K, Søgaard-Andersen L. 2014. Evolution and diversity of the Ras
superfamily of small GTPases in prokaryotes. Genome Biol Evol.
7:57–70.
Yoshimura SI, Gerondopoulos A, Linford A, Rigden DJ, Barr FA. 2010.
Family-wide characterization of the DENN domain Rab GDP-GTP
exchange factors. J Cell Biol. 191:367–381.
Zoncu R, Bar-Peled L, Efeyan A, Wang S, Sancak Y, Sabatini DM. 2011.
mTORC1 senses lysosomal amino acids through an inside-out
mechanism that requires the vacuolar H(+)-ATPase. Science
334:678–683.