Tracing the Archaeal Origins of Eukaryotic Membrane-Trafficking System Building Blocks

Christen M. Klinger,†,1 Anja Spang,†,2 Joel B. Dacks,*,‡,1 and Thijs J. G. Ettema‡,2
1 Department of Cell Biology, University of Alberta, Edmonton, AB, Canada
2 Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
† Equally contributing first authors.
‡ Equally contributing senior authors.
*Corresponding author: E-mail: [email protected].
Associate editor: Sergei Kosakovsky Pond

Abstract
In contrast to prokaryotes, eukaryotic cells are characterized by a complex set of internal membrane-bound compartments. A subset of these, and the protein machineries that move material between them, define the membrane-trafficking system (MTS), the emergence of which represents a landmark in eukaryotic evolution. Unlike mitochondria and plastids, MTS organelles have autogenous origins. Much of the MTS machinery is composed of building blocks, including small GTPase, coiled-coil, beta-propeller + alpha-solenoid, and longin domains. Despite the identification of prokaryotic proteins containing these domains, only a few represent direct orthologues, leaving the origins and early evolution of the MTS poorly understood. Here, we present an in-depth analysis of MTS building block homologues in the composite genome of Lokiarchaeum, the recently discovered archaeal sister clade of eukaryotes, yielding several key insights. We identify two previously unreported Eukaryotic Signature Proteins: orthologues of the Gtr/Rag family GTPases, involved in target of rapamycin complex signaling, and of the RLC7 dynein component. We could not identify golgin or SNARE (coiled-coil) or beta-propeller + alpha-solenoid orthologues, nor typical MTS domain fusions, suggesting that these either were lost from Lokiarchaeum or emerged later in eukaryotic evolution.
Furthermore, our phylogenetic analyses of lokiarchaeal GTPases support a split into Ras-like and Arf-like superfamilies, with different prokaryotic antecedents, before the advent of eukaryotes. While no GTPase activating proteins or exchange factors were identified, we show that Lokiarchaeum encodes numerous roadblock domain proteins and putative longin domain proteins, confirming the latter’s origin from Archaea. Altogether, our study provides new insights into the emergence and early evolution of the eukaryotic membrane-trafficking system.

Key words: archaea, eukaryogenesis, longin domain, membrane trafficking, roadblock domain, small GTPases, Lokiarchaeum.

Introduction
The membrane-trafficking system is crucial for normal cellular function in modern eukaryotes. Composed of organelles including the endoplasmic reticulum, Golgi apparatus, endosomes, lysosomes, and the plasma membrane, this system is a defining characteristic of eukaryotic cellular organization. Its emergence represents a milestone in the evolutionary transition from a prokaryotic configuration. The process of vesicular transport, whereby proteins at a donating organelle bind cargo and deform the lipid membrane into a carrier vesicle, allows trafficking of material between membrane-trafficking system (MTS) organelles. The machinery responsible includes GTPases such as Arf or Sar and their cognate activator or exchange factors, coat proteins, and cargo adaptors (Bonifacino and Glick 2004). Once at the target organelle, another set of protein machineries will dock and tether the carrier vesicle, through the action of Rab GTPases and tethering factors. The carrier vesicle subsequently undergoes fusion through the action of SNAREs to deliver the cargo (Bonifacino and Glick 2004).
Studies undertaken overwhelmingly in opisthokont model systems, in particular humans and yeast, have identified these, as well as other proteins that encode the specificity of trafficking pathways and organelle identity (Cai et al. 2007). Molecular evolutionary analyses of the protein machinery for vesicular trafficking have shown a common core complement across the diversity of eukaryotes (Schlacht et al. 2014, inter alia), implying a shared basic mechanism of vesicular trafficking in eukaryotes (with obvious intriguing differences). It also strongly suggests that the Last Eukaryotic Common Ancestor (LECA) possessed a sophisticated set of trafficking machinery. Phylogenetic investigations of specific trafficking machinery components have enabled the proposition of a mechanism for the evolution of endomembrane organelles. This mechanism is encapsulated in the Organelle Paralogy Hypothesis (OPH), whereby a simple set of core ancestral machineries can give rise to the complexity seen in extant eukaryotes through paralogous duplication and co-evolution of interacting organelle identity-encoding proteins (Dacks and Field 2007; Dacks et al. 2008). Therefore, despite the apparent complexity of the modern trafficking machinery, the paralogous nature of the proteins involved suggests that this machinery evolved from a smaller set of primordial vesicle formation and fusion proteins that were present in early stages of eukaryogenesis.

© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]. Mol. Biol. Evol. 33(6):1528–1541 doi:10.1093/molbev/msw034 Advance Access publication February 17, 2016

More recent phylogenetic studies have even yielded some insights into the proximal order of events immediately leading to the complexity seen in LECA (Elias et al. 2012; Hirst et al. 2014). While these observations suggest that this mechanism contributed significantly to early eukaryotic evolution, the deepest origins of the MTS remain unclear. This lack of insight may be due to the fact that, in contrast to proteins derived from endosymbiotic organelles (i.e., mitochondria and plastids), there are few unambiguous prokaryotic orthologues for components of the eukaryotic trafficking machinery. The best examples are the archaeal homologues of the ESCRT-III and III-associated subcomplexes, which in eukaryotes mediate inward budding of the late endosome as well as cytokinesis (Makarova et al. 2010; Henne et al. 2011). These proteins were shown to be involved in cytokinesis in certain members of the archaeal TACK superphylum (Lindås et al. 2008; Ettema and Bernander 2009). Although prokaryotic orthologues of MTS proteins are scarce, several prokaryotic proteins that possess domains representing one of the four building blocks of eukaryotic trafficking machinery have been identified before (fig. 1). A building block is defined as a protein domain that is present in multiple proteins involved in membrane-trafficking, often in both vesicle formation and fusion steps (Vedovato et al. 2009). Much of the eukaryotic trafficking machinery is composed of either a single building block (e.g., small GTPases), combinations of building blocks (e.g., heterotetrameric adaptor complexes that contain longin domains as well as beta-propeller + alpha-solenoid coats) or of fusions thereof (e.g., R-SNAREs that are a fusion of the longin domain with a coiled-coil forming domain).
The recognition of these building blocks contributes to a conceptual line from prokaryotic proteins, through fusions of some building blocks, and to a primordial set of trafficking machinery that subsequently expanded via the OPH mechanism (Dacks and Field 2007; Vedovato et al. 2009; fig. 1). Prokaryotic proteins unambiguously containing these building blocks have been identified for the Ras superfamily GTPases (subsequently referred to as “small GTPases”) (Dong et al. 2007; Wuichet and Søgaard-Andersen 2014), coiled-coils (Parry et al. 2008), and beta-propeller + alpha-solenoid proteins (Santarella-Mellwig et al. 2010). The prokaryotic origins of the fourth building block, the longin domain, are less obvious, with structural connections proposed to roadblock domain proteins found in eukaryotes, Bacteria, and Archaea (Levine et al. 2013; De Franceschi et al. 2014). Nonetheless, the specific prokaryotic contributors of these building blocks for the eukaryotic lineage have been elusive to date, with insight into the origins of the eukaryotic MTS restricted to hypotheses based on distant homology with prokaryotic components.

FIG. 1. Proposed early stages of eukaryotic membrane trafficking evolution. This figure depicts a hypothetical scenario for the establishment of initial eukaryotic membrane trafficking orthologues. Membrane trafficking building blocks as defined in this study are represented by filled circles, including longin (magenta), GTPase (teal), beta-propeller + alpha-solenoid (brown, specifically proteins involved in coat and NPC complexes), and coiled-coil (orange, specifically proteins belonging to the coiled-coil tether and SNARE families) domains. The first eukaryotic organism is suggested to have possessed orthologues of each building block. These might have been derived directly from the archaeal line or stem from another contributor (dotted outlines and question marks). In the case of at least the GTPase domain, diversification into distinct groups occurred even before eukaryogenesis (darker and lighter teal circles in LAECA). Additionally, fusions, for example, between longin and SNARE domains (black line joining magenta and orange circles), and complexes containing multiple building blocks, for example, longin and beta-propeller + alpha-solenoid domain proteins (adjacent brown and magenta circles), would have emerged either in the first eukaryotic common ancestor or between this organism and the LECA. Following the acquisition of these components, they would have expanded and diversified according to the mechanism outlined in the organelle paralogy hypothesis, leading to the complexity inferred in the LECA. LAECA, last archaeal and eukaryotic common ancestor; FECA, first eukaryotic common ancestor.

Since several recent phylogenomic studies indicate an archaeal ancestor for eukaryotes (Cox et al. 2008; Foster et al. 2009; Guy and Ettema 2011; Williams et al. 2012; Lasek-Nesselquist and Gogarten 2013; Guy et al. 2014; Raymann et al. 2015; Spang et al. 2015), we decided to search for components of the MTS in an archaeal lineage closely related to eukaryotes. Such a lineage was not described in the literature until recently, when the investigation of metagenomic data from the Loki’s Castle hydrothermal vent field led to the discovery of the Lokiarchaeota (Spang et al. 2015). This novel archaeal phylum emerges as a sister group to eukaryotes in sophisticated phylogenomic analyses (Spang et al. 2015) and represents the closest known relative of the elusive archaeal ancestor of eukaryotes. Excitingly, the Lokiarchaeum composite genome encodes trafficking machinery proteins previously unreported outside of eukaryotes. For example, components of all three ESCRT subcomplexes as well as potential longin domain proteins were identified (Spang et al. 2015).
Furthermore, the lokiarchaeal composite genome is unique among prokaryotes in having a large number of small GTPases (Spang et al. 2015), which allows the reinvestigation of the origin of the eukaryotic small GTPase complement. While a phylogenetic analysis of these sequences provided initial insights into the putative identity of lokiarchaeal GTPases, it lacked support for many nodes (Spang et al. 2015). Beyond noting their existence, the complement of putative longin domains was essentially unexplored. Here we have undertaken a detailed study of the Lokiarchaeum composite genome, searching for proteins containing the building blocks of the eukaryotic MTS. Our analyses yield new insights into the origins and evolution of small GTPases and their associated regulatory proteins, with additional implications for the prokaryotic origins of eukaryotic trafficking machinery.

Results

Lokiarchaeum and the Ancestry of Eukaryotic GTPases
Eukaryotic small GTPases can be divided into several families including Ras, Rho, Ran, and Rab (hereafter referred to as “Ras-like superfamily”) and Arf/Sar/SRPRβ (signal recognition particle receptor subunit beta, hereafter referred to as “Arf-like superfamily”; Rojas et al. 2012). In contrast to the large number of small GTPases encoded in many eukaryotes (Diekmann et al. 2011; Elias et al. 2012), genomes of prokaryotes have few homologues, most of which are restricted to the MglA (gliding motility associated) and Rup (Ras superfamily GTPase of unknown function in prokaryotes) families (Wuichet and Søgaard-Andersen 2014). To date, no direct prokaryotic orthologues have been identified for eukaryotic small GTPase families, and their early evolutionary origins remain unresolved. In an attempt to shed light on the evolution of the small GTPase superfamily in eukaryotes, we re-assessed its phylogenetic history compared to a subset of lokiarchaeal homologues.
Previously identified lokiarchaeal GTPases were confirmed by HMMer searches of the predicted proteome and allowed initial separation into sequences more similar to either Ras or Arf families (supplementary table S1, Supplementary Material online). In addition, we identified 17 small GTPases not reported in preliminary analyses of the composite genome (Spang et al. 2015), bringing the total number encoded by the composite genome to 109. Various maximum likelihood and Bayesian phylogenetic analyses of carefully selected representative sets of small GTPases from Bacteria, Archaea, and eukaryotes (Materials and Methods, supplementary text S1, Supplementary Material online) did provide novel insights into the evolution of, at least, three groups of eukaryotic small GTPases: the Ras-like and Arf-like superfamilies, as well as the Gtr/Rag family (fig. 2). We consistently observed a large clade comprising eukaryotic Ras-like superfamily GTPases together with Rup1 sequences and 45 lokiarchaeal Ras-like GTPases (0.77 posterior probability [PP]/93 bootstrap support [BS], fig. 2). The internal relationships between members of this clade were poorly resolved, and a separate analysis aimed at achieving resolution did not yield further insight (supplementary fig. S1, Supplementary Material online). Importantly, we consistently observed a node separating eukaryotic small GTPases of the Ras-like and Arf-like superfamilies, together with different prokaryotic origins for each, though support was never strong (PP = 0.77/BS = 67, fig. 2). Lokiarchaeal sequences were grouped with each of these superfamilies, albeit not as direct outgroups to particular eukaryotic GTPase families, with one striking exception. By homology searching, we identified 17 sequences with similarity to the Gtr1_RagA/PF04670 domain (supplementary table S1, Supplementary Material online).
Proteins with this domain comprise atypical members of the Ras superfamily functioning in target of rapamycin complex 1 (TORC1) signaling at the lysosome/vacuole (Sekiguchi et al. 2001; Kim et al. 2008). The 14 lokiarchaeal sequences selected for inclusion in phylogenetic analyses consistently grouped with eukaryotic Gtr/Rag sequences with strong support (PP = 1/BS = 100, fig. 2). Furthermore, the eukaryotic sequences emerged in a bifurcating node from within lokiarchaeal homologues, consistent with an archaeal origin and subsequent gene duplication and diversification in the eukaryotic lineage. This conclusion was further supported through inspection of additional sequence elements. Crystallographic studies of yeast Gtr1 and Gtr2 proteins revealed that a C-terminal extension, known to be important in dimerization and function, adopts a roadblock fold (Gong et al. 2011). Investigation of secondary structural elements (supplementary figs. S3 and S5, Supplementary Material online) and evidence based on HMM-HMM comparisons (Materials and Methods; supplementary table S4, Supplementary Material online) with other roadblock domain proteins revealed similar extensions in 12 out of the 14 lokiarchaeal orthologues. These data not only support the observed orthologous relationship recovered through phylogenetic analysis, but also indicate that the observed fusions between roadblock and GTPase domains are the result of the same ancestral fusion event. Robust evidence for additional clear lokiarchaeal orthologues of eukaryotic Ras-like or Arf-like superfamilies (fig. 2; supplementary figs. S1 and S2, Supplementary Material online) was not obtained. While the small clade of lokiarchaeal Arf-like sequences almost always placed as an outgroup to the large clade of eukaryotic Arf-like superfamily sequences, strong support was never obtained (not resolved in figure 2, excluded from MglA and Ef-Tu clades with PP = 0.59/BS = 70).
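The node support values quoted above pair a Bayesian posterior probability with a maximum likelihood bootstrap value, written as PP/BS. As a minimal sketch of how such labels can be read and compared, the following Python snippet parses them and flags nodes that clear both cutoffs. The function names and the cutoffs (PP ≥ 0.95, BS ≥ 80) are assumptions chosen for illustration and are not thresholds stated in the article.

```python
from typing import NamedTuple


class Support(NamedTuple):
    pp: float  # Bayesian posterior probability, 0 to 1
    bs: int    # bootstrap support, 0 to 100


def parse_support(label: str) -> Support:
    """Parse a 'PP/BS' node label such as '0.77/93'."""
    pp_text, bs_text = label.split("/")
    return Support(float(pp_text), int(bs_text))


def is_well_supported(s: Support, min_pp: float = 0.95, min_bs: int = 80) -> bool:
    # Cutoffs are illustrative assumptions, not values taken from the article.
    return s.pp >= min_pp and s.bs >= min_bs


# Support values quoted in the text for figure 2.
nodes = {
    "lokiarchaeal + eukaryotic Gtr/Rag": "1/100",
    "Ras-like clade with Rup1": "0.77/93",
    "Ras-like vs Arf-like split": "0.77/67",
    "lokiarchaeal Arf-like placement": "0.59/70",
}

for name, label in nodes.items():
    support = parse_support(label)
    verdict = "strong" if is_well_supported(support) else "weak or moderate"
    print(f"{name}: PP={support.pp}, BS={support.bs} ({verdict})")
```

Under these assumed cutoffs only the Gtr/Rag node counts as strongly supported by both methods, which matches the way the text singles it out.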
However, detailed analyses of sequence motifs, especially of the “G-box” motifs, which are important in nucleotide binding and catalysis, corroborated the observed phylogenetic relationships of lokiarchaeal GTPases with eukaryotic Gtr/Rag family, as well as Ras- and Arf-like superfamily, sequences (fig. 3; supplementary text S1 and fig. S3, Supplementary Material online). Lokiarchaeal Ras-like sequences encode an aspartic acid residue at the second position of the G1 box and tend to follow a serine-alanine-lysine pattern for the G5 box similar to eukaryotic Ras-like superfamily sequences (fig. 3). The single clade of lokiarchaeal Arf-like sequences, despite poor node support, closely follows the eukaryotic Arf G4 box signature (ANKQD). Finally, as additional support for direct orthology between lokiarchaeal and eukaryotic Gtr/Rag family members, both the G4 box histidine residue and the G5 box serine-isoleucine-hydrophobic residue pattern are conserved between, but never observed outside of, these two groups (fig. 3).

[Figure 2: phylogenetic tree of small GTPases with PhyloBayes/RAxML support values; tree image not reproduced.]

FIG. 2. Phylogenetic analysis of bacterial, eukaryotic and (loki-) archaeal GTPase proteins. This figure demonstrates the phylogenetic relationships between identified lokiarchaeal GTPase proteins and known homologues from Bacteria, Archaea, and eukaryotes, based on aligned amino acid sequences (127 positions). For this, and all subsequent phylogenies, the best Bayesian topology, as inferred by Phylobayes, is shown with the scale bar indicating the number of substitutions per site. Support values are indicated as PPs/RAxML bootstrap values; nodes discussed in the text are shown in larger bold italics. Internal node support values are indicated by symbols, as denoted in the figure legend, whereby a node with support greater than or equal to the numbers listed for both methods is indicated by the symbol. For lokiarchaeal Arf-like, Rag, and Ras-like group I sequences, a structure showing the associated roadblock/longin domain is provided for the accession in bold followed by an asterisk. Eukaryotic sequences followed by a double dagger are those from Elias et al. (2012).

FIG. 3. Comparison of G-box motifs for eukaryotic and lokiarchaeal GTPase proteins. This figure compares observed lokiarchaeal GTPase G-box motifs to the corresponding consensus sequence for each family of eukaryotic GTPases. Lokiarchaeal GTPase groups, as depicted in figure 2, are shown together with motif logo representations of aligned sequences for the five G-box motifs involved in nucleotide binding and catalysis. The RasL III group is excluded due to a paucity of information. Notably, lokiarchaeal ArfL sequences share the eukaryotic Arf G4 signature, and lokiarchaeal RagL sequences share eukaryotic Rag G4 and G5 signatures. Eukaryotic consensus sequences are derived either from Rojas et al. (2012) or from alignments.

Furthermore, we demonstrate a relationship between the Arf-like superfamily and Gtr/Rag family of small GTPases, although the exact relationship is unclear, with the Gtr/Rag clade emerging either sister to, or within, the larger Arf-like superfamily (fig. 2; supplementary fig. S2, Supplementary Material online). Regardless of the topology, we consistently obtained a node separating these two groups from all other sequences (PP = 0.56/BS = 74, fig. 2). Despite the moderate phylogenetic support, further reinforcement of this affiliation was observed by inspection of sequence motifs, for instance the G1 box second position leucine shared between Arf-like superfamily and both lokiarchaeal and eukaryotic Gtr/Rag family members (fig. 3; supplementary text S1, Supplementary Material online).
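The G-box observations reported in the text can be restated as simple string checks. The sketch below is only an illustration of that reasoning: the function name, the hydrophobic residue set, the choice to ignore where in the G4 or G5 box a residue sits, and the motif strings in the usage lines are all assumptions made up for this example. The article's own comparison rests on aligned sequences and motif logos, not on a rule set like this.

```python
import re

# Assumed hydrophobic residue set, for illustration only.
HYDROPHOBIC = "AVLIMFWY"


def gbox_signatures(g1: str, g4: str, g5: str) -> list[str]:
    """Return which of the signatures described in the text a set of motifs matches."""
    hits = []
    # Ras-like: aspartic acid at G1 position 2 and a Ser-Ala-Lys pattern in G5.
    if g1[1:2] == "D" and "SAK" in g5:
        hits.append("Ras-like")
    # Arf-like: the eukaryotic Arf G4 box signature ANKQD.
    if "ANKQD" in g4:
        hits.append("Arf-like")
    # Gtr/Rag: a histidine in G4 and Ser-Ile-hydrophobic in G5
    # (positions within the boxes are not modelled here).
    if "H" in g4 and re.search(f"SI[{HYDROPHOBIC}]", g5):
        hits.append("Gtr/Rag")
    return hits


# Invented motif strings, not sequences from the study.
print(gbox_signatures("GDGGVGKS", "NKCD", "TSAKT"))
print(gbox_signatures("GLDAAGKT", "ANKQD", "CSAT"))
print(gbox_signatures("GLRRSGKS", "HKVD", "SIYD"))
```

A shared G1 second-position leucine, which the text reports for both Arf-like and Gtr/Rag sequences, could be added as one more check of the form `g1[1:2] == "L"`.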
Thus, our results clarify the relationship of the Gtr/Rag family to other members of the Ras superfamily, and provide clear prokaryotic origins for this family. Finally, we aimed to determine whether the phylogenetic relationships we observed, as well as the apparent lack of direct orthologous relationships, were genuine or an artifact of specific sequence selection. Therefore, we explored the relationship of all lokiarchaeal GTPase sequences to other much larger data sets including additional archaeal, bacterial, eukaryotic, and environmental sequences (supplementary fig. S2 and text S1, Supplementary Material online). This analysis did not yield any new insight into, nor contradict, any observed relationships between lokiarchaeal and eukaryotic GTPase families. However, it did retrieve the relationship between lokiarchaeal and eukaryotic Gtr/Rag sequences (BS = 92) as well as between the Arf-like superfamily and Gtr/Rag family, albeit with low support (BS = 56).

Lokiarchaeum Does Not Contain Orthologues of Coiled-Coil, Fused Beta-Propeller-Alpha-Solenoid Proteins, or GTPase Exchange or Activating Factors
Having obtained an increased resolution for small GTPases, we searched for additional building blocks of eukaryotic MTS proteins, including beta-propeller + alpha-solenoid components, SNAREs, tethers, golgins, and V4R domain proteins (supplementary fig. S4, tables S2 and S3, and texts S2 and S3, Supplementary Material online). We carried out extensive homology searching using a combination of position-specific scoring matrix-sequence, HMM-sequence, and HMM-HMM comparisons against the lokiarchaeal composite genome (Materials and Methods; supplementary table S3, Supplementary Material online) and employed these sensitive methods to allow for detection of remote homologues by incorporating subtle sequence similarity and underlying structural information.
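For readers who want to reproduce the HMM-sequence step in outline, the sketch below filters the per-target table that HMMER's `hmmsearch` writes with `--tblout`, where whitespace-separated columns 1, 3, 5 and 6 hold the target name, query name, full-sequence E-value and bit score. The sample rows, the sequence names `seq_A` and `seq_B`, and the E-value cutoff are invented for illustration. The actual search parameters used in the study are described in its Materials and Methods and are not reproduced here.

```python
# Two invented result rows in hmmsearch --tblout layout (values are made up).
SAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias  E-value  score  bias  exp reg clu ov env dom rep inc description
seq_A          -          Gtr1_RagA   PF04670    1.2e-30  105.3  0.1   2.0e-30  104.6  0.1   1.0 1   0   0  1   1   1   1   hypothetical protein
seq_B          -          Gtr1_RagA   PF04670    0.5      12.1   0.0   0.8      11.5   0.0   1.1 1   0   0  1   1   0   0   hypothetical protein
"""


def parse_tblout(text: str, max_evalue: float = 1e-5) -> list[tuple[str, str, float, float]]:
    """Return (target, query, E-value, score) for hits at or below max_evalue."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split(maxsplit=18)  # the 19th field is the free-text description
        target, query = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue <= max_evalue:  # cutoff is an assumption for this sketch
            hits.append((target, query, evalue, score))
    return sorted(hits, key=lambda hit: hit[2])  # best (lowest) E-value first


for target, query, evalue, score in parse_tblout(SAMPLE):
    print(f"{target} matches {query}: E={evalue:g}, score={score}")
```

A permissive cutoff suits a first pass for remote homologues, with candidate hits then checked by alignment, structure prediction and phylogeny as the text describes.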
FIG. 4. Illustration of the phylogenetic analysis of bacterial, eukaryotic, and (loki-) archaeal roadblock domain proteins. This figure shows the separation of lokiarchaeal roadblock proteins into MglB and RLC7 families, based on aligned amino acid sequences (150 positions). Gray dots indicate the same range of support values as those in all other phylogenies. Structures are provided for lokiarchaeal homologues of each family (Lokiarch_06490 and Lokiarch_25010) and the difference in secondary structural composition is indicated. A version of the same phylogeny with all clades expanded is shown in supplementary figure S7, Supplementary Material online.

Notably, we failed to identify bona fide beta-propeller + alpha-solenoid fusion protein homologues in the lokiarchaeal composite genome, including any representatives of clathrin/adaptor protein or COPI or COPII coat complex subunits, BBsome/IFT, or nucleoporin families. The presence of at least eight alpha-solenoid and four beta-propeller proteins (supplementary table S2 and text S2, Supplementary Material online) suggests that the individual domains of this building block were present in the archaeal ancestor of eukaryotes, as well as in a wide diversity of prokaryotes (Santarella-Mellwig et al. 2010). Some of the identified proteins model onto known protocoatomer proteins such as Clathrin adaptor core proteins or intraflagellar transport components, but are detectably homologous to unrelated proteins containing HEAT or TPR repeats (e.g., Lokiarch_11360, supplementary table S2 and text S2, Supplementary Material online). These results suggest a subsequent contribution from another source and/or fusion event between structural units as key events leading to eukaryogenesis (fig. 1).
Alternatively, these components could have been present in the archaeal ancestor of eukaryotes, but were subsequently lost in the lokiarchaeal lineage. Similarly, we were unable to identify orthologues of eukaryotic coiled-coil tethers (e.g., EEA1, golgin family members), SNAREs, or other trafficking-related components (supplementary table S3, supplementary text S2, Supplementary Material online). The Lokiarchaeum genome does encode four proteins with a V4R domain (supplementary text S3, Supplementary Material online), the likely prokaryotic homologue of the Bet3 family of the trafficking protein particle (TRAPP) tethering complex (Podar et al. 2008). However, direct orthologous relationships to eukaryotic TRAPP components could not be established with high confidence (supplementary fig. S4, Supplementary Material online). Given the large number of small GTPases encoded in Lokiarchaeum, we performed extensive domain and sequence motif searches (supplementary table S3 and text S4, Supplementary Material online) for various canonical eukaryotic GEFs (guanine nucleotide exchange factors) and GAPs (GTPase-activating proteins). We could not identify any significant hits in Lokiarchaeum. However, roadblock and longin domains are also known to be present in diverse GTPase interacting proteins (Miertzschke et al. 2011; Levine et al. 2013; De Franceschi et al. 2014).

Lokiarchaeal Roadblock Proteins and an Archaeal Origin of the RLC7 Family
Roadblock domain proteins include the MglB family, ubiquitous in Archaea and Bacteria, and the RLC7 family present in eukaryotes. The MglB family acts as GAPs for MglA family small GTPases in, among others, Myxococcus xanthus (Leonardy et al. 2010). Comprised of five β-strands and two to three α-helices, MglB domains (α-β-β-α-β-β-β-α) contain a terminal alpha helix, which is lacking in the eukaryotic RLC7 family of dynein chain components (Koonin and Aravind 2000).
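The structural distinction drawn here, a terminal alpha helix present in MglB domains and absent from RLC7, can be written as a small rule over a secondary-structure topology string. The sketch below is an assumption-laden illustration: it writes helices as `a` and strands as `b`, the function name and labels are hypothetical, and a real assignment would depend on alignments, structure modelling and phylogeny rather than on element order alone.

```python
def classify_roadblock(topology: str) -> str:
    """Classify a hyphen-separated topology such as 'a-b-b-a-b-b-b-a'.

    'a' stands for an alpha helix and 'b' for a beta strand.
    """
    elements = topology.lower().split("-")
    # The text describes roadblock domains as having five beta strands.
    if elements.count("b") != 5:
        return "not roadblock-like"
    # MglB keeps a helix after the last strand; RLC7 lacks it.
    return "MglB-like" if elements[-1] == "a" else "RLC7-like"


print(classify_roadblock("a-b-b-a-b-b-b-a"))  # MglB arrangement given in the text
print(classify_roadblock("a-b-b-a-b-b-b"))    # same arrangement without the terminal helix
```

The second call shows the pattern the text associates with the eukaryotic RLC7 family of dynein chain components.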
The abundance of small GTPases encoded in Lokiarchaeum spurred previous investigations, which revealed that the lokiarchaeal composite genome encodes diverse roadblock domain proteins (Spang et al. 2015). A closer inspection using more sensitive homology searching methods (Materials and Methods) revealed a total of 38 potential roadblock domains in the lokiarchaeal composite genome, some of which were fused to the N-terminus of Ras-like or the C-terminus of Rag-like lokiarchaeal small GTPases as discussed above (fig. 2; supplementary table S4, Supplementary Material online). While the N-terminal roadblock domains model onto known roadblock structures when separated from the remainder of the GTPase sequence, the C-terminal roadblock domains only modeled correctly when the complete sequence was used (supplementary table S4, Supplementary Material online). Multiple sequence alignments of lokiarchaeal roadblock domains (supplementary figs. S5 and S6, Supplementary Material online) revealed that one protein, Lokiarch_06490, is most similar to eukaryotic RLC7 homologues and lacks the C-terminal α-helix present in MglB homologues. Further, in phylogenetic analyses, Lokiarch_06490 as well as close homologues from a marine sediment metagenome (Kawai et al. 2014) emerge basal to the eukaryotic RLC7 clade (PP = 0.99/BS = 91, fig. 4, supplementary fig. S7, Supplementary Material online). Albeit with lower support (BS = 64, supplementary fig. S8, Supplementary Material online), this relationship was also recovered in a much larger phylogenetic analysis that included all identified potential archaeal roadblock homologues (Materials and Methods) as well as a representative set of homologues from eukaryotes, bacteria, and environmental sequences. In accordance with their secondary structure and predicted tertiary structure (supplementary table S4, Supplementary Material online), all other roadblock domain
proteins present in Lokiarchaeum group with members of the prokaryotic MglB family. Importantly, the presence of a roadblock domain in Lokiarchaeum lacking the last α-helix and forming a monophyletic clade with eukaryotic homologues suggests that the archaeal ancestor of eukaryotes already encoded a bona fide eukaryotic RLC7 protein, a component of dynein light chains. While not a component of the trafficking machinery per se, this observation provides evidence for an additional orthologue in Lokiarchaeum of a protein previously held as specific to eukaryotes.

Tracing the Origin of Eukaryotic Longin Domain Proteins

Finally, we sought to investigate the suggested presence of longin domains in Lokiarchaeum (Spang et al. 2015). The longin domain is composed of 120 amino acids forming five antiparallel β-strands sandwiched between α-helices (Tochio et al. 2001), and is present in seven conserved eukaryotic protein superfamilies including longins, heterotetrameric adaptor complexes (HTACs), Sedlins, SANDs (Mon1 proteins), Targetins, DENNs, and AVLs (De Franceschi et al. 2014). Previous domain analysis using IPRscan (Goujon et al. 2010) suggested that the lokiarchaeal composite genome contains eight longin-like domains (IPR011012 and IPR010908) or partial MON1 domains (IPR004353), which thus far had not been found in prokaryotes (Spang et al. 2015). Our more extensive analysis of this protein family in the lokiarchaeal composite genome extended the number of putative longin domains to 41, five of which were fused to lokiarchaeal Arf-like small GTPases (figs. 2 and 5; supplementary figs. S9 and S10 and table S4, Supplementary Material online). Multiple sequence alignments of lokiarchaeal longins combined with secondary structural predictions suggest conservation of the β-β-α-β-β-β-α-α arrangement characteristic of longin domains (supplementary fig. S9, Supplementary Material online).
While several of these putative longins modeled with moderate to good confidence to various eukaryotic longin domain proteins, a subset did not model well (supplementary table S4, Supplementary Material online). This might in part be due to the absence of closely related homologues in relevant databases. Phylogenetic analyses including all major eukaryotic longin domain families, which, apart from Lokiarchaeum, are lacking in all other prokaryotes, revealed that many lokiarchaeal longin domains emerge as single branches from a large multi-furcation (fig. 5). This lack of overall resolution is not surprising given the short length of the longin domain, the deep evolutionary distance involved, as well as the extensive subfunctionalization of this domain in eukaryotes and, arguably, in Lokiarchaeum as well. Nonetheless, some resolution at more internal nodes was observed. Three lokiarchaeal sequences (Lokiarch_04850, Lokiarch_13110, Lokiarch_01890) group with the α-subunit of the eukaryotic signal recognition particle receptor, albeit with low BS (PP = 0.91/BS = 24, fig. 5). This affiliation was not recovered in phylogenetic analyses excluding divergent lokiarchaeal homologues (supplementary fig. S10, Supplementary Material online). Several well-supported clades of related sequences were also observed, including a clade of those lokiarchaeal longins that are fused to Arf-like small GTPases (PP = 1/BS = 100, figs. 2 and 5). Most notably, we observed a well-supported clade of eukaryotic longin domains comprising medium and small subunits of the TSET, COPI, and Adaptin complexes (PP = 0.95/BS = 85). Our initial set of analyses also yielded a rooted set of relationships for these complexes, suggesting, for the first time, that TSET and COPI are indeed derived from a common ancestor, to the exclusion of the adaptins (PP = 0.99/BS = 87 for COPZ and TSPOON, fig. 5). Additional phylogenies aimed at corroborating these relationships (supplementary figs.
S11 and S12, Supplementary Material online) revealed that the TSET subunits TCUP and TSPOON are indeed more closely related to their COPI counterparts, the delta and zeta subunits, respectively. Altogether, despite poor backbone resolution, these findings indicate that the archaeal ancestor of eukaryotes already encoded bona fide longin domains. Discussion The composite genome of Lokiarchaeum, which emerges as a sister group to eukaryotes in phylogenetic analyses, and whose genome has revealed the presence of an extended set of novel Eukaryotic Signature Proteins (ESP) (Spang et al. 2015), has provided us with the opportunity to reinvestigate the prokaryotic origins for the building blocks of eukaryotic membrane-trafficking components. In the current study, we demonstrate the origin of two important building blocks of the eukaryotic trafficking machinery, small GTPases and longin domains, from an archaeal source. These findings underscore the notion that the archaeal ancestor of eukaryotes was more complex than previously presumed (Martijn and Ettema 2013; Koonin and Yutin 2014; Koonin 2015) and, in particular, they provide more detail on the emergence and evolution of the critical eukaryotic MTS. Notably, several aspects of our analyses speak to the veracity of these lokiarchaeal sequences as ancestral eukaryotic genes contributed by the closest archaeal contributor to eukaryogenesis, as opposed to either recent horizontal gene transfer or genome project contamination. First, our analysis of genomic flanking regions (supplementary fig. S13 and text S6, Supplementary Material online) showed that the ESPs are located on contigs that contain clear homologues from prokaryotes, making misassembly or recent horizontal gene transfer unlikely. 
Furthermore, while we cannot rule out the possibility that there may be undiscovered eukaryotic diversity that is not included in our analyses, we included as wide a sampling of eukaryotic diversity as possible, including sampling environmental databases. In our phylogenetic analyses, none of the lokiarchaeal sequences group within clades of eukaryotic sequences; rather, they are basal to them. This makes explaining their presence by horizontal gene transfer or contamination unparsimonious, as compared to their being distant but distinct archaeal homologues.

Origins of Eukaryotic Small GTPase Families

Except for the Ran family, the diverse eukaryotic small GTPase families are often expanded and diversified, with multiple homologues encoded in each genome. In previous studies it was suggested that LECA had a single Ran and SRPRb (Rojas et al. 2012), up to 23 Rabs (Diekmann et al. 2011; Elias et al. 2012), as well as a single Arf homologue, a single Sar homologue, and at least four Arl homologues (Li et al. 2004). Although obtaining phylogenetic resolution of some members has proven difficult, it was proposed that LECA might in addition have possessed at least one member of the Rho family, Rac (Boureux et al. 2007), and at least two Ras members, Rap and Rheb (Van Dam et al. 2011). In stark contrast, prokaryotic homologues are few in number and largely restricted to the MglA and Rup families (Wuichet and Søgaard-Andersen 2014). In previous studies the Rup2 family was suggested as the prokaryotic ancestor of the Arf family (Wuichet and Søgaard-Andersen 2014), while at the same time it was shown that MglA proteins bear some structural and mechanistic similarities to Arf (Miertzschke et al. 2011). Yet, the prokaryotic origins of Arf, as well as the other eukaryotic small GTPase families, remained ambiguous.
In this study, we undertook a detailed analysis of a subset of 70 small GTPases encoded in the lokiarchaeal composite genome. Consistent with previous results (Spang et al. 2015), the majority of sequences did not form sister groups to specific eukaryotic families, with the striking exception of Gtr/Rag family GTPases. Our phylogenetic analyses and in-depth analysis of sequence patterns (figs. 2 and 3; supplementary figs. S1–S3 and text S1, Supplementary Material online) are consistent with other studies suggesting an origin for the Ras-like superfamily of GTPases as members of a larger clade including the Rup1 family (Wuichet and Søgaard-Andersen 2014). Although none of the lokiarchaeal sequences placed phylogenetically as an outgroup to any of the eukaryotic families, support for the group as a whole is strong. Future studies including new sequence data might be able to clarify the internal relationships between these groups. In contrast to previous findings (Wuichet and Søgaard-Andersen 2014), our results confirm and extend those of Spang et al. (2015), indicating that the Rup2 family of small GTPases is more closely related to the Rup1/Ras-like superfamily clade than to the Arf-like superfamily (fig. 2; supplementary fig. S1, Supplementary Material online). Lokiarchaeum does not encode direct orthologues of any Arf-like family, suggesting a separate origin for members of this superfamily, possibly from the Korarchaeota (Spang et al. 2015; supplementary text S1 and fig. S2, Supplementary Material online). Future analyses with new archaeal genome data may shed more light on the origins of the Arf-like superfamily. We identified 14 lokiarchaeal sequences that group with members of the atypical eukaryotic GTPases of the Gtr/Rag family with strong support.
Although previous biochemical and sequence analyses placed these GTPases as members of the Ras superfamily, this view is not without contention and the relationship of the family to other Ras superfamily members was unclear (Sekiguchi et al. 2001; Rojas et al. 2012). Here, we show that Gtr/Rag family GTPases are related to the Arf-like superfamily, and we could firmly establish that lokiarchaeal sequences represent archaeal orthologues of this eukaryotic family. In eukaryotes, Gtr/Rag family GTPases function in the amino acid-regulated activation of TORC1, which in turn regulates cell growth, metabolism, and autophagy in response to amino acid starvation (Kim et al. 2008; Sancak et al. 2008; Sancak et al. 2010). The GTPases are tethered at the lysosome/vacuole by the pentameric Ragulator complex, a RagA/B GEF activated by amino acids via the vacuolar ATPase (Sancak et al. 2010; Zoncu et al. 2011; Bar-Peled et al. 2012). Ragulator acts both as a GEF and as a tether, the p18 component of which is N-terminally modified with lipid moieties to mediate its association with the lysosomal/vacuolar compartment (Nada et al. 2009). The remaining four Ragulator components all possess roadblock domains (Kurzbauer et al. 2004; Lunin et al. 2004; Garcia-Saez et al. 2011; Levine et al. 2013). Recent studies have identified the GATOR1 complex, including the longin domain proteins Nprl2 and Nprl3 (Levine et al. 2013), as the corresponding GAP (Bar-Peled et al. 2013; Panchaud et al. 2013). Although direct orthologues of all of these regulatory components as well as the TORC1 complex are absent from the lokiarchaeal composite genome (supplementary table S3, Supplementary Material online), the presence of Gtr/Rag family members, multiple roadblock and longin domain proteins, as well as a putative vacuolar ATPase and a primordial ESCRT complex (Spang et al. 2015) raises the intriguing possibility that Lokiarchaeum might possess a primitive endolysosomal-like capacity (Koonin 2015).

Evolutionary Insights into GTPase Regulation

GTPases bind both GDP and GTP with low nanomolar to picomolar affinities, such that their intrinsic rate of GDP/GTP turnover and thus GTPase activity is very low. These properties necessitate the presence of additional factors, GEFs and GAPs, to allow for these reactions to occur on biologically relevant timescales (Bos et al. 2007). While our extensive analyses (supplementary table S3 and text S4, Supplementary Material online) did not recover significant hits in Lokiarchaeum for canonical eukaryotic GEFs and GAPs, the large number of roadblock (38) and longin (41) domains encoded is striking (supplementary table S4, Supplementary Material online). Recently, it was established that dimerized roadblock domain proteins of the MglB family in prokaryotes possess GAP activity towards MglA family GTPases (Miertzschke et al. 2011) by causing a drastic conformational change in the GTPase, which allows the intrinsic arginine and glutamine residues to hydrolyze GTP. As in other prokaryotes, the various lokiarchaeal MglB homologues could serve as GAPs in this organism (supplementary text S5, Supplementary Material online). This idea is supported by the observation that a subset of these proteins was directly fused to specific groups of small GTPases. Thus, roadblock domain proteins might be involved in the regulation of lokiarchaeal small GTPases.

FIG. 5. Phylogenetic analysis of lokiarchaeal and eukaryotic longin-domain proteins. This figure demonstrates the phylogenetic relationships of putative lokiarchaeal longin-domain proteins to a representative set of eukaryotic longin-domain proteins, based on aligned amino acid sequences (171 positions). Well-supported groups of longin domain proteins are indicated by gray boxes and associated structures are shown for sequences in bold font followed by an asterisk.

Until recently, longin domains had only been found in eukaryotes, with the proposal that roadblocks gave rise to longins by circular permutation no later than in LECA. This proposition suggests that the last element of the longin domain originated from the first element of the roadblock domain (Levine et al. 2013; De Franceschi et al. 2014). The identification of longin domains in Lokiarchaeum places this transition much earlier, before the emergence of eukaryotes from an archaeal ancestor. Notably, while GEFs for some members of the Ras-like superfamily remain poorly characterized, several GTPase-interacting proteins and protein complexes contain longin domains. These include the TRAPP complex (Cai et al. 2008), the DENN protein family (Yoshimura et al. 2010; Wu et al. 2011), the Mon1/Ccz1 complex (Cabrera et al. 2014), and the GATOR1 complex (Levine et al. 2013). Similar to roadblock domains, longin domains are found in interacting modules that possess either GEF or GAP activity. These observations, combined with our findings, including several independent fusions of roadblock and longin domains directly to archaeal GTPases, suggest that these domains represent ancestral GTPase regulators (supplementary text S5, Supplementary Material online). Importantly, although small GTPases are thought to share a common origin, the various eukaryotic families of GAP and GEF proteins are structurally and evolutionarily unrelated (Bos et al. 2007), even between closely related families of GTPases. For example, the GAPs for Arf and Arl GTPases are unrelated (East et al. 2012), GEF and GAP regulators differ between Ras and Rab families (Barr and Lambright 2010; Van Dam et al. 2011), and even within the Rab family, nonhomologous GEF families operate on different Rab paralogues (Barr and Lambright 2010).
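
The circular permutation proposed for the roadblock-to-longin transition can be illustrated at the level of secondary-structure element order. The following is a minimal sketch, assuming the element orders quoted in the Results for MglB-type roadblock domains and for longin domains; the function name `rotate_left` is hypothetical, and the comparison operates on strings only, so it is not structural or phylogenetic evidence.

```python
def rotate_left(elements, k=1):
    """Return a circular permutation: the first k elements move to the end."""
    k %= len(elements)
    return elements[k:] + elements[:k]


# Element orders as quoted in the text (a = alpha-helix, b = beta-strand).
ROADBLOCK = "a-b-b-a-b-b-b-a".split("-")
LONGIN = "b-b-a-b-b-b-a-a".split("-")

# Moving the first roadblock element to the end gives the longin order,
# so the last longin element corresponds to the first roadblock element.
permuted = rotate_left(ROADBLOCK, 1)
print("-".join(permuted))
print(permuted == LONGIN)
```

A single-element rotation is sufficient here, which matches the statement that the last element of the longin domain corresponds to the first element of the roadblock domain.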
Our phylogenetic analysis of lokiarchaeal small GTPases established that the main divisions into Ras-like and Arf-like superfamilies occurred even before the radiation of the eukaryotic lineage, yet Lokiarchaeum does not encode homologues of canonical eukaryotic GEF and GAP families. This supports the view that small GTPases diversified in the eukaryotic lineage prior to the acquisition of GEF and GAP families for each GTPase family, and clarifies the observed evolutionary relationships between these regulatory factors. It furthermore highlights the different evolutionary dynamics of the GTPases and their regulators, where diversification from a common ancestor appears dominant for the GTPases, but where accretion of nonhomologous factors, followed by diversification by gene duplication, plays more of a role for the regulators.

What Remains to Be Discovered

The ambiguous nature of the prokaryotic connection to eukaryotic trafficking machinery components has hampered efforts to delve into the deepest roots of the system and understand the connection to eukaryogenesis. While we identified lokiarchaeal candidates for two key building blocks of the trafficking system (GTPase and longin domains), we were unable to detect direct homologues for the beta-propeller + alpha-solenoid proteins or eukaryotic coiled-coil domain proteins such as tethers and SNAREs. The genes encoding these proteins could have been present in the archaeal ancestor of eukaryotes, but were lost from the lokiarchaeal lineage analyzed in this study. Alternatively, the genes encoding the building blocks were acquired via other prokaryotic contributors (fig. 1; Pittis and Gabaldon 2016). Additionally, the presence of building blocks but not of fusions in the eukaryotic configurations potentially puts some bounds on the timing of the archaeal contribution. The fusions seen in eukaryotes either evolved after the divergence of the lineages or were present and later lost in Lokiarchaeum.
Important eukaryotic fusions, such as those between a longin and SNARE domain leading to the R-SNARE family, involve at least one domain not present in the Lokiarchaeum genome. While this means that a great deal of membrane-trafficking evolution, via building block acquisition and domain fusion, took place after the sampling point that Lokiarchaeum represents, the fact that seemingly independent fusions of building blocks were observed in Lokiarchaeum suggests that the fusion of building blocks might have represented a general mechanism to generate functional diversity in the lineage leading to eukaryotes.

Concluding Remarks

While the current analysis of the Lokiarchaeum genome alone does not explain the origin of the membrane-trafficking components, it provides the first concrete sampling point giving insight into one starting point of the complicated journey that gave rise to this sophisticated array of cellular transport machinery. The demonstration of the direct orthologous relationship of two lokiarchaeal protein families (the roadblock-only RLC7 and Gtr/Rag GTPases) with their eukaryotic counterparts provides further support for the relationship of this archaeal lineage with the elusive ancestor of eukaryotes (Spang et al. 2015). Future efforts to obtain and analyze additional archaeal lineages related to Lokiarchaeota will certainly yield even deeper insights into the origin and evolution of these key eukaryotic features (Saw et al. 2015).

Materials and Methods

Identification of Trafficking-Related Machinery

Initial domain searches were based on IPRscan (Goujon et al. 2010), Pfam (Finn et al. 2014), and SMART (Letunic et al. 2014) databases. Secondary structure prediction was carried out using the JPred4 server (Drozdetskiy et al. 2015) or the integrated JPred3 interface available in Jalview (Waterhouse et al. 2009). Homology searches were carried out using the BLASTp and psi-BLAST algorithms (Altschul et al.
1997), with a maximum of six iterations for psi-BLAST, against the predicted lokiarchaeal proteome. For queries defined by the presence of one or more domains or structural units, hidden Markov model (HMM)-based searches using HMMer (Finn et al. 2011) were performed with curated sequence alignments from the Pfam database (Finn et al. 2014). The HHSuite set of programs was used to build a database of HMMs for every protein in the predicted lokiarchaeal proteome (Söding 2005). HMMs were built through comparison to the UniProt database clustered to 20% sequence identity, with a maximum of two iterations. Single query sequences were also transformed to HMMs in this manner to allow the use of HMM-HMM pairwise searches suited for uncovering distant evolutionary relationships; alignment-based queries, such as those obtained from Pfam, were used as direct inputs for HMM-HMM comparison. In all cases, hits meeting an E-value threshold of 0.05 were chosen as potential homologues for further analysis. This involved BLAST searches against the Homo sapiens and nonredundant databases (NRDB) at NCBI, as well as domain and structural predictions. The top hits from each search were also subjected to HMM-HMM comparisons against H. sapiens via http://toolkit.tuebingen.mpg.de/hhpred (last accessed February 22, 2016). The ard2 (Fournier et al. 2013) and SMURF (Menke et al. 2010) webservers were used for the prediction of alpha-solenoids and beta-propellers, respectively. Sequence motif prediction and analysis was performed using the MEME suite (Bailey et al. 2015) while sequence logos were created using WebLogo (Crooks et al. 2004). Tertiary structure models were created using the Phyre2 web server, using the normal mode for batch submission and intensive mode for final models (Kelley et al. 2015).
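
The candidate-selection step described above (retaining hits at the 0.05 E-value threshold for follow-up) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the `Hit` record, the function name `select_candidates`, the example identifiers, and the choice of an inclusive cutoff are all assumptions.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One search hit (hypothetical record layout)."""
    query: str
    target: str
    evalue: float


def select_candidates(hits, max_evalue=0.05):
    """Keep hits at or below the cutoff (inclusive cutoff is an assumption),
    sorted so the lowest E-value comes first."""
    kept = [h for h in hits if h.evalue <= max_evalue]
    return sorted(kept, key=lambda h: h.evalue)


# Made-up example hits; names and values are purely illustrative.
hits = [
    Hit("longin_hmm", "protein_B", 0.04),
    Hit("longin_hmm", "protein_C", 0.8),
    Hit("longin_hmm", "protein_A", 3e-12),
]

for hit in select_candidates(hits):
    print(hit.target, hit.evalue)
```

Hits passing such a filter would then be carried forward to the reciprocal searches and structural predictions described in the text.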
Unless otherwise stated, all program parameters for homology searching, domain identification, and structural prediction programs were left at their respective defaults.

Analysis of Flanking Regions

To check for potential assembly error/eukaryotic contamination for our candidate ESPs, we performed an analysis of taxonomic affiliation of genes encoded on contigs with either GTPases, roadblock, or longin domain proteins. Taxonomic affiliation was inferred using the lowest common ancestor rule (parameters were set as follows: Min Score, 50; Max Expected, 0.01; Top Percent, 5; Min Support, 1; Min Complexity, 0.0) as applied by MEGAN (Huson et al. 2007) and the results of this analysis were plotted in R (supplementary fig. S13 and text S6, Supplementary Material online; Spang et al. 2015). Additionally, we gathered the two genes upstream and the two genes downstream of each candidate ESP and subjected them to the homology search methods described above to determine if any conserved functions or previously undetected eukaryotic membrane-trafficking components were present (supplementary text S6 and table S3, Supplementary Material online).

Sequence Selection for Phylogenetic Analysis

The set of lokiarchaeal small GTPases previously identified by IPRscan (Spang et al. 2015) was validated and expanded using HMMer searches against the lokiarchaeal proteome, yielding additional candidate homologues. From a starting set of 117 putative small GTPases, we removed proteins predicted to be involved in ATP-dependent processes, those that were shorter than the proposed minimal Ras domain of 160 amino acids (Cenatiempo et al. 1987; Parmeggiani et al. 1987), those missing one or more of the G box motifs responsible for nucleotide binding and catalysis (Bourne et al. 1991), as well as those that were highly divergent and/or of uncertain affiliation in phylogenetic analyses (supplementary text S1, Supplementary Material online).
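
Two of the exclusion criteria listed above, minimum length and presence of G box motifs, lend themselves to a short sketch. The regular expressions below are simplified, illustrative approximations of the G1, G3, and G4 motifs and are assumptions rather than the definitions used in the study; the function names and the toy sequence are likewise hypothetical. The remaining criteria (ATP-dependent function, divergence in phylogenies) are not modeled.

```python
import re

MIN_RAS_DOMAIN_LENGTH = 160  # minimal Ras domain length cited in the text

# Simplified, illustrative motif patterns (assumptions, not the study's own).
G_BOX_PATTERNS = {
    "G1": re.compile(r"G.{4}GK[ST]"),  # P-loop
    "G3": re.compile(r"D..G"),
    "G4": re.compile(r"[NT]K.D"),
}


def missing_g_boxes(seq):
    """Return the names of motifs with no match anywhere in the sequence."""
    return [name for name, pat in G_BOX_PATTERNS.items() if not pat.search(seq)]


def passes_filters(seq):
    """True if the sequence is long enough and carries every motif."""
    return len(seq) >= MIN_RAS_DOMAIN_LENGTH and not missing_g_boxes(seq)


# Toy sequence built only to exercise the filter; it is not a real protein.
toy = "M" + "GAGGVGKS" + "A" * 60 + "DTAG" + "A" * 60 + "NKCD" + "A" * 30

print(passes_filters(toy))
print(passes_filters(toy[:100]))
print(missing_g_boxes(toy.replace("GKS", "AAA")))
```

This sketch does not check motif order or spacing, which a more careful implementation would likely need to enforce.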
This resulted in a final set of 70 sequences for downstream phylogenetic analyses (supplementary table S1, Supplementary Material online). For the purposes of identifying relationships between lokiarchaeal small GTPases and known families from prokaryotes and eukaryotes, we chose a strategy to identify marker sequences that would serve as surrogates for each established family. For bacterial and archaeal small GTPases, we started with a set of 526 previously identified proteins (Wuichet and Søgaard-Andersen 2014), and performed multiple iterations of tree reconstruction, each time removing branches from the common group node that were visibly longer than the majority. This is an approximation of the scrollsaw approach that aims to choose the least divergent orthologues to represent larger assemblages in phylogenetic analyses (Elias et al. 2012). This resulted in a set of bacterial and archaeal GTPases belonging to one of seven groups (MglA1-5, Rup1, and Rup2). For eukaryotic Ran and Rab families, homologues previously identified as the least divergent members of their respective families were used (Elias et al. 2012). For the remainder of the eukaryotic GTPase families, we devised a pipeline to provide similar surrogate sequences without necessitating the same level of in-depth analysis. We used relevant queries from H. sapiens in BLAST searches against the NRDB and limited results to the top 10 hits from each eukaryotic phylum, as defined by the NCBI taxonomy database. Short branches were selected as described above for prokaryotic sequences. Finally, seven sequences representing the elongation factor Tu (EF-Tu) family were included as outgroup. Two different strategies to select sequences for expanded GTPase data sets are described in supplementary text S1, Supplementary Material online.
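
The short-branch selection described above was performed by visual inspection across repeated tree reconstructions. As a rough illustration of the selection rule only, the sketch below replaces the visual judgment with a numeric heuristic (drop branches longer than a multiple of the group median) and keeps branch lengths fixed between rounds rather than re-estimating the tree. The function name, the `factor` parameter, and the example values are all hypothetical.

```python
from statistics import median


def prune_long_branches(branch_lengths, factor=2.0):
    """Iteratively drop sequences whose branch length exceeds
    factor * median of the sequences still retained.

    branch_lengths maps sequence name -> branch length from the group node.
    The median-multiple cutoff is an assumed stand-in for visual inspection.
    """
    kept = dict(branch_lengths)
    while len(kept) > 1:
        cutoff = factor * median(kept.values())
        too_long = [name for name, length in kept.items() if length > cutoff]
        if not too_long:
            break
        for name in too_long:
            del kept[name]
    return kept


# Made-up branch lengths for one family; values are purely illustrative.
family = {"seqA": 0.10, "seqB": 0.12, "seqC": 0.11, "seqD": 0.90, "seqE": 0.30}
print(sorted(prune_long_branches(family)))
```

In practice each round would be followed by a fresh alignment and tree, since removing divergent sequences can change the remaining branch lengths.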
Lokiarchaeal roadblock domain proteins were identified using IPRscan (IPR004942) as well as by HMMer searches against the lokiarchaeal composite genome (supplementary table S4, Supplementary Material online). Close homologues (more than 80% amino acid identity and sequence coverage) of these lokiarchaeal roadblock domain proteins were extracted from a marine sediment metagenome known to contain high numbers of organisms affiliating with the Deep Sea Archaeal Group (DSAG) (Inagaki et al. 2003). A representative set of eukaryotic, archaeal, and bacterial MglB and RLC7 family domain proteins (IPR004942) that aimed at covering the taxonomic diversity of members of all three groups was downloaded from UniProt using SMART (Letunic et al. 2014). In an additional phylogenetic analysis of a more comprehensive set of roadblock homologues, all archaeal arCOGs (Makarova et al. 2007) related to roadblock domain proteins were included (arCOG02603, arCOG02605, arCOG03412, arCOG05211, arCOG05565, arCOG08684, and arCOG11698), as were all roadblock domain proteins assigned to IPR004942 and present in a taxonomically diverse set of eukaryotic and bacterial genomes and lokiarchaeal homologues that were not fused to small GTPases. In addition, Lokiarch_06490, which represents the eukaryotic-type roadblock domain protein of Lokiarchaeum, was queried against the environmental NR database to identify marine sediment homologues as well as against all archaeal and bacterial genomes, respectively. Notably, and consistent with results from previous phylogenetic analyses, no homologues of this eukaryotic-type roadblock protein were detected in either bacterial or archaeal genomes. Prior to final alignment, redundant sequences (threshold 99% identity) were removed. Lokiarchaeal longin domain proteins were initially identified using IPRscan (IPR004353 and IPR011012).
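
The redundancy filter mentioned above for the roadblock data set (99% identity threshold) can be sketched as a greedy pass over pre-aligned sequences. This is an illustrative sketch under stated assumptions: identity is computed over columns where both sequences have a residue, a sequence is dropped when it reaches the threshold against any sequence already kept, and input order decides which member of a redundant pair survives. The tool actually used for this step is not stated in the text, and the function names and toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def remove_redundant(aligned, threshold=99.0):
    """Greedily keep sequences that stay below the identity threshold
    against every sequence kept so far (inclusive cutoff is an assumption)."""
    kept = {}
    for name, seq in aligned.items():
        if all(percent_identity(seq, other) < threshold for other in kept.values()):
            kept[name] = seq
    return kept


# Toy aligned sequences; not real roadblock domains.
aligned = {
    "s1": "MKTAYIAKQR",
    "s2": "MKTAYIAKQR",  # identical to s1
    "s3": "MKTAYIGKQR",  # one substitution (90% identity to s1)
    "s4": "MK-AYIAKQR",  # identical to s1 outside the gapped column
}
print(list(remove_redundant(aligned)))
```

Ignoring gapped columns means that fragments can be scored as fully identical to longer sequences, which may or may not be the desired behavior for a given data set.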
Subsequently, additional putative homologues not detected by IPRscan were found using HMMer (supplementary table S2, Supplementary Material online). Metagenomic longin/Mon1 domain proteins assigned to either of these IPRscan categories were extracted from UniProt (identity > 30%; coverage > 80%; E-value > 1e-4). Additionally, longin domain-containing proteins from diverse eukaryotic protein families (including the alpha subunit of the signal recognition particle receptor; synaptobrevins Ykt6, Sec22, and VAMP; TRAPP subunits TRAPPC1, TRAPPC2, TRAPPC2-like, TRAPPC4; Mon1 domain proteins; and small and medium subunits from the various HTAC representatives [COPI, TSET, and the Adaptins]) were selected from a subset of seven representative eukaryotic species. In some cases, highly divergent members of these protein families were removed.

Alignments and Phylogenetic Analyses

Alignments were created using MAFFT L-INS-i version 7 (Katoh and Standley 2013) either using default parameters (longin domain proteins and small GTPase family proteins) or by leaving regions with gaps (roadblock alignment). Alignments were manually curated and adjusted in Jalview (Waterhouse et al. 2009). Zorro, a probabilistic masking program, was used to assign confidence scores to aligned sites (Wu et al. 2012) used for maximum likelihood inferences. For large GTPase and roadblock alignments (supplementary figs. S2 and S8, Supplementary Material online), alignments were trimmed using trimAl (Capella-Gutierrez et al. 2009) with either the “gappyout” (roadblock) or “gt 0.7” (GTPase) options to select columns. Before subjecting the large roadblock alignment to trimAl, ambiguously aligned positions at the N- and C-termini were trimmed manually. This included the removal of the last alpha helix, which was absent from eukaryotic RLC7 family proteins as well as from Lokiarch_06490 and some environmental homologues. Bayesian analyses were carried out using PhyloBayes v3.3 (Lartillot et al.
2009) with default settings and the CAT-GTR model. For each of the alignments, four chains were run in parallel, sampling every 100 points until the maximum difference was 0.15 (or 0.3 for the longin-small data set). The first 20% or the respective generations at which the parameters started to stabilize were selected as burn-in. Maximum likelihood bootstrapping using the –f b option was carried out using RAxML v8.1.17 (Stamatakis 2014) under the LG + Γ model to yield 100 nonparametric bootstrap replicates while rapid bootstrapping using the –f a option was performed for the large GTPase data sets. Bootstrapping was carried out on unmodified alignments, or using Zorro confidence scores as column weights, and bootstrap scores were mapped onto the best Bayesian topology using the sumtrees program of the DendroPy package (Sukumaran and Holder 2010). All inference under gamma distributed rates used four distinct rate categories, unless otherwise stated. Analyses were either run locally or using the CIPRES Science Gateway webportal (Miller et al. 2010).

Supplementary Material

Supplementary tables S1–S4, figures S1–S13, and texts S1–S6 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the European Research Council (ERC Starting grant no. 310039-PUZZLE_CELL) and the Swedish Foundation for Strategic Research (FFL12-0024) to T.J.G.E., as well as by a Marie Curie IEF (625521) grant by the European Union to A.S. C.M.K. is supported by graduate studentships from the Women and Children’s Health Research Institute and Alberta Innovates Health Solutions. J.B.D. is the Canada Research Chair in Evolutionary Cell Biology and this work was supported by an NSERC Discovery grant (RES0021028) and a grant from Alberta Innovates Technology Futures (RES0004718). The authors would like to thank Dr Lionel Guy for an R script to plot lokiarchaeal contigs.

References

Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402. Bailey TL, Johnson J, Grant CE, Noble WS. 2015. The MEME Suite. Nucleic Acids Res. 43:39–49. Bar-Peled L, Chantranupong L, Cherniack AD, Chen WW, Ottina KA, Grabiner BC, Spear ED, Carter SL, Meyerson M, Sabatini DM. 2013. A Tumor suppressor complex with GAP activity for the rag GTPases that signal amino acid sufficiency to mTORC1. Science 340:1100–1106. Bar-Peled L, Schweitzer LD, Zoncu R, Sabatini DM. 2012. Ragulator is a GEF for the Rag GTPases that signal amino acid levels to mTORC1. Cell 150:1196–1208. Barr F, Lambright DG. 2010. Rab GEFs and GAPs. Curr Opin Cell Biol. 22:461–470. Bonifacino JS, Glick BS. 2004. The mechanisms of vesicle budding and fusion. Cell 116:153–166. Bos J, Rehmann H, Wittinghofer A. 2007. GEFs and GAPs: critical elements in the control of small G proteins. Cell 129:865–877. Boureux A, Vignal E, Faure S, Fort P. 2007. Evolution of the Rho family of Ras-like GTPases in eukaryotes. Mol Biol Evol. 24:203–216. Bourne HR, Sanders DA, McCormick F. 1991. The GTPase superfamily: conserved structure and molecular mechanism. Nature 349:117–127. Cabrera M, Engelbrecht-Vandre S, Ungermann C. 2014. Function of the Mon1-Ccz1 complex on endosomes. Small GTPases 5:1–3. Cai H, Reinisch K, Ferro-Novick S. 2007. Coats, tethers, Rabs, and SNAREs work together to mediate the intracellular destination of a transport vesicle. Dev Cell. 12:671–682. Cai Y, Chin HF, Lazarova D, Menon S, Fu C, Cai H, Sclafani A, Rodgers DW, De La Cruz EM, Ferro-Novick S, et al. 2008. The structural basis for activation of the Rab Ypt1p by the TRAPP membrane-tethering complexes. Cell 133:1202–1213. Capella-Gutierrez S, Silla-Martínez JM, Gabaldón T. 2009.
trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25:1972–1973. Cenatiempo Y, Deville F, Dondon J, Grunberg-Manago M, Sacerdot C, Hershey JW, Hansen HF, Petersen HU, Clark BF, Kjeldgaard M, et al. 1987. The protein synthesis initiation factor 2 G-domain. Study of a functionally active C-terminal 65-kilodalton fragment of IF2 from Escherichia coli. Biochemistry 26:5070–5076. Cox CJ, Foster PG, Hirt RP, Harris SR, Embley TM. 2008. The archaebacterial origin of eukaryotes. Proc Natl Acad Sci U S A. 105:20356–20361. Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo: a sequence logo generator. Genome Res. 14:1188–1190. Dacks JB, Field MC. 2007. Evolution of the eukaryotic membrane-trafficking system: origin, tempo and mode. J Cell Sci. 120:2977–2985. Dacks JB, Poon PP, Field MC. 2008. Phylogeny of endocytic components yields insight into the process of nonendosymbiotic organelle evolution. Proc Natl Acad Sci U S A. 105:588–593. Diekmann Y, Seixas E, Gouw M, Tavares-Cadete F, Seabra MC, PereiraLeal JB. 2011. Thousands of rab GTPases for the cell biologist. PLoS Comput Biol. 7:e1002217. Dong JH, Wen JF, Tian HF. 2007. Homologs of eukaryotic Ras superfamily proteins in prokaryotes and their novel phylogenetic correlation with their eukaryotic analogs. Gene 396:116–124. Drozdetskiy A, Cole C, Procter J, Barton GJ. 2015. JPred4: a protein secondary structure prediction server. Nucleic Acids Res. 43:W389–W394. East MP, Bowzard JB, Dacks JB, Kahn RA. 2012. ELMO domains, evolutionary and functional characterization of a novel GTPase-activating protein (GAP) domain for Arf protein family GTPases. J Biol Chem. 287:39538–39553. Elias M, Brighouse A, Gabernet-Castello C, Field MC, Dacks JB. 2012. Sculpting the endomembrane system in deep time: high resolution phylogenetics of Rab GTPases. J Cell Sci. 125:2500–2508. Ettema TJG, Bernander R. 2009. Cell division and the ESCRT complex: a surprise from the archaea. 
Commun Integr Biol. 2:86–88. Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, et al. 2014. Pfam: the protein families database. Nucleic Acids Res. 40:D290–D301. Finn RD, Clements J, Eddy SR. 2011. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res. 39:W29–W37. Foster PG, Cox CJ, Embley TM. 2009. The primary divisions of life: a phylogenomic approach employing composition-heterogeneous methods. Philos Trans R Soc Lond B Biol Sci. 364:2197–2207. Fournier D, Palidwor GA, Shcherbinin S, Szengel A, Schaefer MH, PerezIratxeta C, Andrade-Navarro MA. 2013. Functional and genomic analyses of alpha-solenoid proteins. PLoS One 8:e79894. De Franceschi N, Wild K, Schlacht A, Dacks JB, Sinning I, Filippini F. 2014. Longin and GAF domains: structural evolution and adaptation to the subcellular trafficking machinery. Traffic 15:104–121. Garcia-Saez I, Lacroix FB, Blot D, Gabel F, Skoufias DA. 2011. Structural characterization of HBXIP: the protein that interacts with the antiapoptotic protein survivin and the oncogenic viral protein HBx. J Mol Biol. 405:331–340. Gong R, Li L, Liu Y, Wang P, Yang H, Wang L, Cheng J, Guan KL, Xu Y. 2011. Crystal structure of the Gtr1p—Gtr2p complex reveals new insights into the amino acid-induced TORC1 activation. Genes Dev. 25:1668–1673. Goujon M, McWilliam H, Li W, Valentin F, Squizzato S, Paern J, Lopez R. 2010. A new bioinformatics analysis tools framework at EMBL-EBI. Nucleic Acids Res. 38:W695–W699. Guy L, Ettema TJG. 2011. The archaeal “TACK” superphylum and the origin of eukaryotes. Trends Microbiol. 19:580–587. Guy L, Saw JH, Ettema TJG. 2014. The archaeal legacy of eukaryotes: a phylogenomic perspective. Cold Spring Harb Perspect Biol. 6:a016022. Henne WM, Buchkovich NJ, Emr SD. 2011. The ESCRT pathway. Dev Cell. 21:77–91. Hirst J, Schlacht A, Norcott JP, Traynor D, Bloomfield G, Antrobus R, Kay RR, Dacks JB, Robinson MS. 2014. 
Characterization of TSET, an ancient and widespread membrane trafficking complex. Elife 3:e02866. 1540 MBE Huson DH, Auch AF, Qi J, Schuster SC. 2007. MEGAN analysis of metagenomic data. Genome Res. 17:377–386. Inagaki F, Suzuki M, Takai K, Oida H, Sakamoto T, Aoki K, Nealson KH, Horikoshi K. 2003. Microbial communities associated with geological horizons in coastal subseafloor sediments from the sea of okhotsk. Appl Environ Microbiol. 69:7224–7235. Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 30:772–780. Kawai M, Futagami T, Toyoda A, Takaki Y, Nishi S, Hori S, Arai W, Tsubouchi T, Morono Y, Uchiyama I, et al. 2014. High frequency of phylogenetically diverse reductive dehalogenase-homologous genes in deep subseafloor sedimentary metagenomes. Front Microbiol. 5:80. Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJE. 2015. The Phyre2 web portal for protein modeling, prediction and analysis. Nat Protoc. 10:845–858. Kim E, Goraksha-Hicks P, Li L, Neufeld TP, Guan KL. 2008. Regulation of TORC1 by Rag GTPases in nutrient response. Nat Cell Biol. 10:935–945. Koonin EV. 2015. Origin of eukaryotes from within archaea, archaeal eukaryome and bursts of gene gain: eukaryogenesis just made easier? Philos Trans R Soc Lond B Biol Sci. 370:20140333 Koonin EV, Aravind L. 2000. Dynein light chains of the Roadblock/LC7 group belong to an ancient protein superfamily implicated in NTPase regulation. Curr Biol. 10:774–776. Koonin EV, Yutin N. 2014. The dispersed archaeal eukaryome and the complex archaeal ancestor of eukaryotes. Cold Spring Harb Perspect Biol. 6:a016188. Kurzbauer R, Teis D, de Araujo MEG, Maurer-Stroh S, Eisenhaber F, Bourenkov GP, Bartunik HD, Hekman M, Rapp UR, Huber LA, et al. 2004. Crystal structure of the p14/MP1 scaffolding complex: how a twin couple attaches mitogen-activated protein kinase signaling to late endosomes. Proc Natl Acad Sci U S A. 101:10984–10989. 
Lartillot N, Lepage T, Blanquart S. 2009. PhyloBayes 3: a Bayesian software package for phylogenetic reconstruction and molecular dating. Bioinformatics 25:2286–2288.
Lasek-Nesselquist E, Gogarten JP. 2013. The effects of model choice and mitigating bias on the ribosomal tree of life. Mol Phylogenet Evol. 69:17–38.
Leonardy S, Miertzschke M, Bulyha I, Sperling E, Wittinghofer A, Søgaard-Andersen L. 2010. Regulation of dynamic polarity switching in bacteria by a Ras-like G-protein and its cognate GAP. EMBO J. 29:2276–2289.
Letunic I, Doerks T, Bork P. 2014. SMART: recent updates, new developments and status in 2015. Nucleic Acids Res. 43:D257–D260.
Levine TP, Daniels RD, Wong LH, Gatta AT, Gerondopoulos A, Barr FA. 2013. Discovery of new longin and roadblock domains that form platforms for small GTPases in ragulator and TRAPP-II. Small GTPases 4:62–69.
Li Y, Kelly WG, Logsdon JM, Schurko AM, Harfe BD, Hill-Harfe KL, Kahn RA. 2004. Functional genomic analysis of the ADP-ribosylation factor family of GTPases: phylogeny among diverse eukaryotes and function in C. elegans. FASEB J. 18:1834–1850.
Lindås A-C, Karlsson EA, Lindgren MT, Ettema TJG, Bernander R. 2008. A unique cell division machinery in the Archaea. Proc Natl Acad Sci U S A. 105:18942–18946.
Lunin VV, Munger C, Wagner J, Ye Z, Cygler M, Sacher M. 2004. The structure of the MAPK scaffold, MP1, bound to its partner, p14: a complex with a critical role in endosomal MAP kinase signaling. J Biol Chem. 279:23422–23430.
Makarova KS, Sorokin AV, Novichkov PS, Wolf YI, Koonin EV. 2007. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea. Biol Direct. 2:33.
Makarova KS, Yutin N, Bell SD, Koonin EV. 2010. Evolution of diverse cell division and vesicle formation systems in Archaea. Nat Rev Microbiol. 8:731–741.
Martijn J, Ettema TJG. 2013. From archaeon to eukaryote: the evolutionary dark ages of the eukaryotic cell. Biochem Soc Trans. 41:451–457.
Menke M, Berger B, Cowen L. 2010. Markov random fields reveal an N-terminal double beta-propeller motif as part of a bacterial hybrid two-component sensor system. Proc Natl Acad Sci U S A. 107:4069–4074.
Miertzschke M, Koerner C, Vetter IR, Keilberg D, Hot E, Leonardy S, Søgaard-Andersen L, Wittinghofer A. 2011. Structural analysis of the Ras-like G protein MglA and its cognate GAP MglB and implications for bacterial polarity. EMBO J. 30:4185–4197.
Miller MA, Pfeiffer W, Schwartz T. 2010. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. 2010 Gateway Computing Environments Workshop GCE. New Orleans Convention Centre, New Orleans: IEEE. p. 1–8.
Nada S, Hondo A, Kasai A, Koike M, Saito K, Uchiyama Y, Okada M. 2009. The novel lipid raft adaptor p18 controls endosome dynamics by anchoring the MEK-ERK pathway to late endosomes. EMBO J. 28:477–489.
Panchaud N, Péli-Gulli MP, De Virgilio C. 2013. SEACing the GAP that nEGOCiates TORC1 activation: evolutionary conservation of Rag GTPase regulation. Cell Cycle 12:2948–2952.
Parmeggiani A, Swart GW, Mortensen KK, Jensen M, Clark BF, Dente L, Cortese R. 1987. Properties of a genetically engineered G domain of elongation factor Tu. Proc Natl Acad Sci U S A. 84:3141–3145.
Parry DAD, Fraser RDB, Squire JM. 2008. Fifty years of coiled-coils and alpha-helical bundles: a close relationship between sequence and structure. J Struct Biol. 163:258–269.
Pittis AA, Gabaldón T. 2016. Late acquisition of mitochondria by a host with chimaeric prokaryotic ancestry. Nature doi:10.1038/nature16941.
Podar M, Wall MA, Makarova KS, Koonin EV. 2008. The prokaryotic V4R domain is the likely ancestor of a key component of the eukaryotic vesicle transport system. Biol Direct. 3:2.
Raymann K, Brochier-Armanet C, Gribaldo S. 2015. The two-domain tree of life is linked to a new root for the Archaea. Proc Natl Acad Sci U S A. 112:6670–6675.
Rojas AM, Fuentes G, Rausell A, Valencia A. 2012. The Ras protein superfamily: evolutionary tree and role of conserved amino acids. J Cell Biol. 196:189–201.
Sancak Y, Bar-Peled L, Zoncu R, Markhard AL, Nada S, Sabatini DM. 2010. Ragulator-Rag complex targets mTORC1 to the lysosomal surface and is necessary for its activation by amino acids. Cell 141:290–303.
Sancak Y, Peterson TR, Shaul YD, Lindquist RA, Thoreen CC, Bar-Peled L, Sabatini DM. 2008. The Rag GTPases bind raptor and mediate amino acid signaling to mTORC1. Science 320:1496–1501.
Santarella-Mellwig R, Franke J, Jaedicke A, Gorjanacz M, Bauer U, Budd A, Mattaj IW, Devos DP. 2010. The compartmentalized bacteria of the planctomycetes-verrucomicrobia-chlamydiae superphylum have membrane coat-like proteins. PLoS Biol. 8:e1000281.
Saw JH, Spang A, Zaremba-Niedzwiedzka K, Juzokaite L, Dodsworth JA, Murugapiran SK, Colman DR, Takacs-Vesbach C, Hedlund BP, Guy L, et al. 2015. Exploring microbial dark matter to resolve the deep archaeal ancestry of eukaryotes. Philos Trans R Soc Lond B Biol Sci. 370:20140328.
Schlacht A, Herman EK, Klute MJ, Field MC, Dacks JB. 2014. Missing pieces of an ancient puzzle: evolution of the eukaryotic membrane-trafficking system. Cold Spring Harb Perspect Biol. 6:a016048.
Sekiguchi T, Hirose E, Nakashima N, Li M, Nishimoto T. 2001. Novel G proteins, Rag C and Rag D, interact with GTP-binding proteins, Rag A and Rag B. J Biol Chem. 276:7246–7257.
Söding J. 2005. Protein homology detection by HMM-HMM comparison. Bioinformatics 21:951–960.
Spang A, Saw JH, Jørgensen SL, Zaremba-Niedzwiedzka K, Martijn J, Lind AE, van Eijk R, Schleper C, Guy L, Ettema TJG. 2015. Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521:173–179.
Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313.
Sukumaran J, Holder MT. 2010. DendroPy: a Python library for phylogenetic computing. Bioinformatics 26:1569–1571.
Tochio H, Tsui MM, Banfield DK, Zhang M. 2001. An autoinhibitory mechanism for nonsyntaxin SNARE proteins revealed by the structure of Ykt6p. Science 293:698–702.
Van Dam TJ, Bos JL, Snel B. 2011. Evolution of the Ras-like small GTPases and their regulators. Small GTPases 2:4–16.
Vedovato M, Rossi V, Dacks JB, Filippini F. 2009. Comparative analysis of plant genomes allows the definition of the “Phytolongins”: a novel non-SNARE longin domain protein family. BMC Genomics 10:510.
Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ. 2009. Jalview version 2 - a multiple sequence alignment editor and analysis workbench. Bioinformatics 25:1189–1191.
Williams TA, Foster PG, Nye TMW, Cox CJ, Embley TM. 2012. A congruent phylogenomic signal places eukaryotes within the Archaea. Proc Biol Sci. 279:4870–4879.
Wu M, Chatterji S, Eisen JA. 2012. Accounting for alignment uncertainty in phylogenomics. PLoS One 7:e30288.
Wu X, Bradley MJ, Cai Y, Kümmel D, De La Cruz EM, Barr FA, Reinisch KM. 2011. Insights regarding guanine nucleotide exchange from the structure of a DENN-domain protein complexed with its Rab GTPase substrate. Proc Natl Acad Sci U S A. 108:18672–18677.
Wuichet K, Søgaard-Andersen L. 2014. Evolution and diversity of the Ras superfamily of small GTPases in prokaryotes. Genome Biol Evol. 7:57–70.
Yoshimura SI, Gerondopoulos A, Linford A, Rigden DJ, Barr FA. 2010. Family-wide characterization of the DENN domain Rab GDP-GTP exchange factors. J Cell Biol. 191:367–381.
Zoncu R, Bar-Peled L, Efeyan A, Wang S, Sancak Y, Sabatini DM. 2011. mTORC1 senses lysosomal amino acids through an inside-out mechanism that requires the vacuolar H(+)-ATPase. Science 334:678–683.