Origin and Spread of Photosynthesis Based upon Conserved Sequence Features in Key Bacteriochlorophyll Biosynthesis Proteins

Radhey S. Gupta*
Department of Biochemistry, McMaster University, Hamilton, ON, Canada
*Corresponding author: E-mail: [email protected].
Associate editor: Neelima Sinha

Research article
Mol. Biol. Evol. 29(11):3397–3412 doi:10.1093/molbev/mss145 Advance Access publication May 24, 2012
© The Author 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]

Abstract

The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights in these regards. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. A more ancient gene duplication gave rise to the BchX–BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH–BchX–BchL proteins contains two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH–BchX–BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX–BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs, in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited a close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been laterally transferred between these groups. Other results and observations reported here indicate that the genes for the BchL-N-B proteins in Proteobacteria are derived from the Clade C Cyanobacteria, whereas those in Chlorobi were acquired from Chloroflexus or related bacteria by means of LGTs. Some implications of these observations regarding the origin and spread of photosynthesis are discussed.

Key words: origin of photosynthesis, conserved signature indels, BchL, BchX, NifH, BchN, BchB, Clade C cyanobacteria, phylogenetic trees, Chloroflexi, Chlorobi, Heliobacteriaceae, Proteobacteria, lateral gene transfers.

Introduction

The origin of photosynthesis, which sustains most life on earth, and its spread to various bacterial groups remain important unresolved problems in the evolutionary history of life (Blankenship 1992; Blankenship and Hartman 1998; Hartman 1998; Dismukes et al. 2001; Raymond and Segre 2006; Hohmann-Marriott and Blankenship 2011). With the exception of plants and algae that are secondarily photosynthetic due to endosymbiotic acquisition of cyanobacteria (Morden et al. 1992; Margulis 1993), the (bacterio)chlorophyll [Bchl]-based photosynthesis is found in five discontinuous phyla of cultured bacteria, viz. Cyanobacteria (Cyano), Chloroflexi, Bacteroidetes/Chlorobi, Firmicutes (Heliobacteriaceae), and Proteobacteria (Proteo) (Gest and Favinger 1983; Olson and Pierson 1987; Blankenship 1992; Bryant and Frigaard 2006; Hohmann-Marriott and Blankenship 2011). Additionally, an uncultured bacterium belonging to the phylum Acidobacteria is also inferred to be photosynthetic (Bryant et al. 2007; Raymond 2008). The similarities in the photosynthetic pigments and overall charge transfer mechanisms in the reaction centers (RCs) of various phototrophs suggest that photosynthesis has evolved only once (Nitschke and Rutherford 1991; Golbeck 1993; Blankenship 1994; Schubert et al. 1998; Olson and Blankenship 2004; Nelson and Ben Shem 2005; Sadekar et al. 2006).

However, it has proven difficult to determine in which of these bacterial groups photosynthesis first originated and how other groups acquired this ability (Raymond et al. 2002; Xiong and Bauer 2002; Raymond et al. 2003; Raymond 2009). Different approaches used to investigate this problem have given discordant results indicating the origin of photosynthesis in Proteobacteria (Xiong et al. 2000; Xiong and Bauer 2002), Firmicutes (Vermaas 1994; Gupta et al. 1999; Gupta 2003), Chloroflexi (Pierson 1994; Dismukes et al. 2001), and in anoxygenic ancestors of Cyanobacteria (Mulkidjanian et al. 2006). The origin of photosynthesis has proven difficult to resolve because photosynthesis-related genes, which are clustered in genomes (Xiong et al. 1998; Choudhary and Kaplan 2000), are prone to lateral gene transfers (LGTs) (Raymond et al. 2002; Raymond et al. 2003; Zhaxybayeva et al. 2006; Raymond 2009). This makes it difficult to interpret the results of phylogenetic analyses based on such genes/proteins (Xiong et al. 1998; Xiong et al.
2000; Green and Gantt 2000). Analyses based on other genes and proteins that are less prone to LGTs provide useful insights regarding the branching orders of these bacterial phyla (Olsen et al. 1994; Gupta et al. 1999; Gupta 2000; Gupta 2003; Ciccarelli et al. 2006), but they do not necessarily indicate that photosynthesis evolved in this manner. Thus, an understanding of this important problem can only emerge from some novel characteristics of the photosynthesis-related genes/proteins, which despite the widespread occurrence of LGTs can provide useful insights to resolve this problem.

Recent analyses of genome sequences have revealed that only nine proteins related to photosynthesis, all of which are involved in the biosynthesis of Bchl, are shared by all phototrophs (Raymond et al. 2002; Mulkidjanian et al. 2006). Of these nine proteins, only three, viz. BchL, BchN, and BchB, are uniquely found in all phototrophic lineages, indicating their central importance in the origin of photosynthesis (Raymond et al. 2002; Mulkidjanian et al. 2006). The proteins BchL, BchN, and BchB (referred to as BchL-N-B), which are unique to phototrophs, are part of an enzyme complex, viz. light-independent (or dark-operative) protochlorophyllide oxidoreductase (DPOR), that plays a key role in the biosynthesis of Bchl by converting protochlorophyllide to chlorophyllide a (chlorin) (Burke et al. 1993; Beale 1999; Raymond et al. 2004; Chew and Bryant 2007). A second enzyme complex, chlorin reductase, consisting of the proteins BchX, BchY, and BchZ (referred to as BchX-Y-Z), found in various phototrophic bacteria except cyanobacteria, further reduces chlorin to bacteriochlorin, which serves as the direct precursor for the Bchls (Beale 1999; Chew and Bryant 2007). Importantly, the three proteins BchL-N-B from the DPOR complex exhibit significant sequence similarity to the three subunits (viz.
BchX–Y–Z) of chlorin reductase, indicating that these two sets of proteins have evolved from an ancient gene duplication in a common ancestor of all phototrophs (Burke et al. 1993; Xiong and Bauer 2002; Raymond et al. 2004; Chew and Bryant 2007). Additionally, these two sets of proteins also exhibit significant sequence and structural similarity to the three subunits, viz. NifH, NifD, and NifK, of the nitrogenase complex (Sarma et al. 2008; Muraki et al. 2010), which plays a central role in nitrogen fixation and is sporadically distributed in prokaryotes (Haselkorn 1986; Burke et al. 1993; Xiong et al. 2000; Raymond et al. 2004). Of these proteins, the homologs of BchL, BchX, and NifH show maximal sequence conservation, making them particularly useful for understanding the origin of photosynthesis (Burke et al. 1993; Xiong et al. 2000; Raymond et al. 2004).

In recent years, genome sequences for large numbers of photosynthetic prokaryotes representing different bacterial phyla have become available (Raymond and Swingley 2008; NCBI 2011). They provide a valuable resource for using different approaches to gain insights into the origin of photosynthesis. For understanding ancient evolutionary relationships, one approach that has proven very useful consists of identifying conserved indels (i.e., inserts or deletions) in protein sequences that are uniquely shared by particular groups of organisms (Rivera and Lake 1992; Gupta and Golding 1993; Baldauf and Palmer 1993; Delwiche et al. 1995; Gupta 1998; Gupta 2011).
Because conserved indels in protein sequences (even a 1 amino acid indel) are the results of rare and highly specific genetic changes, their presence or absence in gene/protein sequences is generally not affected by factors such as differences in evolutionary rates at different sites or among different species that greatly influence the branching patterns of species in phylogenetic trees (Felsenstein 1988; Gupta 1998; Moreira and Philippe 2000; Felsenstein 2004). Hence, the shared presence of such markers in different group(s) of species provides a powerful means to establish ancestral evolutionary relationships as well as to identify cases of LGTs among unrelated taxa. Further, depending upon the presence or absence of these characters (viz. indels) in outgroup species, it is possible to infer which of the two character states of the protein (i.e., indel-containing or indel-lacking) is ancestral (Baldauf and Palmer 1993; Gupta 1998; Gupta 2003; Gupta 2010).

In this work, I describe a number of conserved signature indels in the NifH, BchX, BchL, BchN, and BchB proteins, which together with the results of phylogenetic analyses provide valuable information regarding the origin of photosynthesis and its spread among bacterial phyla. Based upon the novel CSIs that are reported here and other sequence characteristics of these proteins, the proteobacterial homologs of the BchL-N-B proteins are specifically related to a clade consisting of the marine unicellular cyanobacteria (Clade C Cyano), whereas those from Chlorobi are closely related to the Chloroflexi. Phylogenetic analyses and other CSIs reported here further indicate that the BchX homologs originated prior to the BchL homologs and that the BchL homologs in Heliobacteriaceae (Firmicutes phylum) are primitive in comparison to those found in other photosynthetic lineages.

Materials and Methods

Using Blastp searches on the BchL, BchN, and BchB proteins (Altschul et al.
1997), homologs of these proteins from different photosynthetic bacteria were retrieved and their multiple sequence alignments were created using the ClustalX 1.83 program (Jeanmougin et al. 1998). These alignments were visually inspected to identify all indels that were flanked on both sides by at least 4–5 identical/conserved residues in the neighboring 30–40 amino acids. The indels that were not flanked by conserved regions were not considered as they do not provide reliable markers for evolutionary studies (Gupta 1998). The species distribution patterns of all potentially useful indels were further evaluated by detailed Blastp searches on short sequence segments (generally between 60–100 amino acids depending upon the lengths of the indels) containing the indels and their flanking conserved regions (Gupta 2009; Gupta and Mathews 2010). The sequence information for various conserved indels from different phototrophic bacteria was compiled into signature files. Sequence information for only representative species from larger photosynthetic phyla is provided in the figures that are shown here. However, all of the indels reported are highly specific for the indicated groups. It should be clarified that the term Chloroflexi in this work refers only to the filamentous anoxygenic phototrophs (FAP) (Bryant and Frigaard 2006; Hanada and Pierson 2006; Tang et al. 2011), as other Chloroflexi that are not photosynthetic do not contain homologs of these proteins. The sequences for BchX and NifH homologs from various photosynthetic organisms as well as other lineages (for NifH) were also retrieved and multiple sequence alignments of these proteins together with the BchL sequences were created. These alignments were also inspected for the presence of conserved indels and their specificities were evaluated as described above.
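The flanking-conservation criterion described above can be sketched in code. The following is a minimal illustration only, not the procedure used in this work (where alignments were inspected visually): it assumes the alignment is supplied as equal-length strings with `-` marking gaps, counts only strictly identical columns as "conserved", and the function names and default thresholds (`window=30`, `min_conserved=4`) are hypothetical choices based on the 4–5 residues in 30–40 positions stated in the text.

```python
from typing import Dict, List, Tuple

GAP = "-"


def conserved_count(columns: List[str]) -> int:
    """Count columns in which every sequence carries the same non-gap residue."""
    return sum(1 for col in columns if GAP not in col and len(set(col)) == 1)


def find_candidate_indels(
    alignment: Dict[str, str], window: int = 30, min_conserved: int = 4
) -> List[Tuple[int, int, List[str]]]:
    """Return (start, end, names_lacking_the_segment) for each gap run that is
    flanked on both sides by at least `min_conserved` identical columns within
    `window` alignment positions. Thresholds are illustrative assumptions."""
    names = list(alignment)
    seqs = [alignment[n] for n in names]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    columns = ["".join(s[i] for s in seqs) for i in range(length)]
    # A column belongs to an indel if some, but not all, sequences are gapped.
    is_indel = [GAP in c and any(ch != GAP for ch in c) for c in columns]

    results = []
    i = 0
    while i < length:
        if not is_indel[i]:
            i += 1
            continue
        j = i
        while j < length and is_indel[j]:
            j += 1
        left = columns[max(0, i - window):i]
        right = columns[j:j + window]
        if (conserved_count(left) >= min_conserved
                and conserved_count(right) >= min_conserved):
            lacking = [n for n, s in zip(names, seqs) if set(s[i:j]) == {GAP}]
            results.append((i, j, lacking))
        i = j
    return results


if __name__ == "__main__":
    toy = {
        "seq1": "MKVLAGGTPLRS",
        "seq2": "MKVLAGGTPLRS",
        "seq3": "MKVLA--TPLRS",
    }
    print(find_candidate_indels(toy, window=5, min_conserved=4))
```

A candidate returned by such a screen would still need the group-specificity check described in the text (Blastp searches with the indel and its flanking regions) before being treated as a signature.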
Phylogenetic trees based on sequence alignments of the BchL, BchX, and NifH homologs were constructed using both neighbor-joining (NJ) and maximum-likelihood (ML) algorithms. Bootstrapped NJ trees based on these sequences were constructed using the Kimura model (Kimura 1983) employing the TREECON 1.3b program (Van de Peer and De Wachter 1994). ML analysis based on these sequences was carried out using the WAG + F model with gamma distribution of evolutionary rates with four categories and 10,000 puzzling steps employing the TREE-PUZZLE program (Schmidt et al. 2002). The pair-wise identity and similarity between different homologs for the BchL, BchN, and BchB proteins were determined using the EMBOSS molecular biology program package with default parameters (Rice et al. 2000).

Results

Identification of a Conserved Indel in the BchL Protein Suggesting the Primitive Nature of the BchL Homolog from Heliobacteriaceae

In the sequence alignments of BchL homologs from various phototrophs, two different CSIs were identified (fig. 1A). Both these CSIs are of defined lengths and they are flanked on both sides by a number of conserved residues, indicating that they provide useful molecular markers for evolutionary studies. Of these two CSIs, the first, consisting of a 1 aa indel (CSI ¶), is specifically present in the two Heliobacteriaceae species (fig. 1A). The absence of this indel in the BchL homologs from all other phyla of bacteria indicates that it is a specific characteristic of the Heliobacteriaceae BchL. The second CSI (CSI •), which is present in an adjoining region, consists of a 5 amino acid indel that is uniquely found in different Chlorobi and Chloroflexi homologs, but absent in all other bacteria. These CSIs could represent either inserts in the genes from these particular taxa or alternatively they could result from deletion(s) in the BchL homologs from other phototrophic lineages.
To gain insights in these regards, a multiple sequence alignment of diverse BchL and BchX homologs was created. The sequence region where these CSIs are present is sufficiently conserved between the BchL and BchX homologs so that their sequences can be reliably aligned. Figure 1B shows the sequence alignment of the BchX homologs to the corresponding region of the BchL proteins where these CSIs are found. From the sequence alignments of these two proteins, it is clear that the 1 aa CSI that is uniquely found in the BchL homologs of Heliobacteriaceae is a shared characteristic of all BchX homologs from different lineages. Because BchX and BchL homologs are derived by gene duplication in a common ancestor of all phototrophs (Burke et al. 1993; Xiong et al. 2000), the presence of this CSI in all BchX homologs as well as the BchL homolog from Heliobacteriaceae suggests that the BchL homologs from Heliobacteriaceae containing this CSI are primitive in comparison to those from other phototrophic lineages. The most parsimonious explanation to account for the absence of this CSI in the BchL homologs from other phototrophs is that a deletion occurred in the ancestral form of the protein before it spread to other lineages. However, other possibilities cannot be entirely excluded. In contrast to the CSI ¶, the CSI • that is specifically present in the BchL homologs of Chlorobi and Chloroflexi was absent in all of the BchX homologs, indicating that the absence of this indel was the ancestral character state of the BchX–BchL protein. Therefore, this CSI likely represents an insert in the BchL homologs of Chlorobi and Chloroflexi and its shared presence in these two phylogenetically distinct lineages could be due to LGTs. Other observations supporting this inference will be described later.
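The outgroup reasoning used in this section (the state shared by all BchX homologs is taken as ancestral for BchL) can be written down as a small decision rule. The sketch below is only an illustration of that logic under the parsimony assumption stated in the text; the function name, the boolean presence/absence encoding, and the lineage labels in the example are hypothetical.

```python
from typing import Dict


def polarize_indel(outgroup: Dict[str, bool], ingroup: Dict[str, bool]) -> Dict[str, str]:
    """Label each ingroup lineage's indel state as ancestral or derived.

    `True` means the indel-containing state is observed in that homolog.
    The state is polarized only when every outgroup homolog agrees;
    otherwise all ingroup lineages are reported as 'undetermined'.
    """
    states = set(outgroup.values())
    if len(states) != 1:
        return {name: "undetermined" for name in ingroup}
    ancestral = states.pop()

    labels = {}
    for name, present in ingroup.items():
        if present == ancestral:
            labels[name] = "ancestral"
        else:
            labels[name] = "derived insertion" if present else "derived deletion"
    return labels


if __name__ == "__main__":
    # 1 aa CSI: present in all BchX homologs and in Heliobacteriaceae BchL only.
    bchx_1aa = {"BchX Chlorobi": True, "BchX Proteobacteria": True}
    bchl_1aa = {"Heliobacteriaceae": True, "Cyanobacteria": False, "Chlorobi": False}
    print(polarize_indel(bchx_1aa, bchl_1aa))

    # 5 aa CSI: absent from all BchX homologs, present in Chlorobi/Chloroflexi BchL.
    bchx_5aa = {"BchX Chlorobi": False, "BchX Proteobacteria": False}
    bchl_5aa = {"Chlorobi": True, "Chloroflexi": True, "Cyanobacteria": False}
    print(polarize_indel(bchx_5aa, bchl_5aa))
```

With these inputs the first call labels the Heliobacteriaceae state as ancestral and the other BchL lineages as carrying a derived deletion, and the second labels the Chlorobi and Chloroflexi state as a derived insertion, which mirrors the interpretation given in the text.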
Conserved Indels in the NifH, BchX, and BchL Proteins Provide Evidence that BchX Homologs Originated Prior to the BchL Homologs

The BchL and BchX proteins are distantly related to the dinitrogenase reductase (NifH) protein of the nitrogenase complex (Burke et al. 1993; Xiong et al. 2000; Raymond et al. 2004). The nitrogenases are present in a number of bacterial phyla as well as in methanogenic archaea and they perform a similar function to the DPOR and chlorin reductase complexes by coupling ATP hydrolysis-driven electron transfer to enable substrate (nitrogen) reduction (Haselkorn 1986; Burke et al. 1993; Xiong et al. 2000; Raymond et al. 2004). The observed sequence, structural, and functional similarities between these protein complexes indicate that they have evolved from an ancestral protein complex (Burke et al. 1993; Xiong et al. 2000; Raymond et al. 2004; Sarma et al. 2008; Muraki et al. 2010). Hence, a multiple sequence alignment of the BchL, BchX, and NifH proteins was also created to determine if it contains any informative CSIs. The sequence alignments of these three proteins have led to identification of two CSIs that are of much interest. A partial sequence alignment of representative NifH, BchX, and BchL homologs from different phototrophic lineages showing these two CSIs (‚ and „) is presented in fig. 2. As seen from this sequence alignment, all of the NifH and BchX homologs commonly contained two CSIs (boxed) that are not found in any of the BchL homologs. Because the split between the nitrogenases (viz. NifH) and the Bchl biosynthesis proteins (viz. BchX and BchL) occurred before the gene duplication event that led to the formation of the DPOR (viz. BchL) and chlorin reductase (viz. BchX) complexes, the unique shared presence of these two CSIs by all NifH and BchX homologs, but not by any BchL homologs, provides strong evidence that the BchX homologs are ancestral in comparison to the BchL homologs.

FIG. 1. (A) Excerpts from the sequence alignment of BchL protein showing two conserved signature indels (CSIs, boxed) that are specific for particular lineages of phototrophic bacteria. The CSI ¶ is specific for the Heliobacteriaceae, whereas CSI • is commonly shared by different Chlorobi and Chloroflexi (FAP) homologs. Although sequence information is shown for only representative species, all available sequences from these groups behaved as shown here with regard to the presence or absence of these CSIs. The dashes (-) in these as well as other sequence alignments indicate identity with the amino acid on the top line. The numbers on the top indicate the position of the sequence in the species on the top line. (B) A sequence alignment of the BchX homologs from different phototrophic lineages for the same region as shown in (A) for the BchL protein sequences.

FIG. 2. Partial sequence alignments of the NifH, BchX, and BchL homologs showing two CSIs in different regions of these proteins that are commonly shared by the NifH and BchX homologs. The roman numerals in the names of the NifH sequences refer to different clusters of the NifH family of proteins (Raymond et al. 2004). The numbers below the group or phyla names indicate the presence or absence of these CSIs in all available NifH, BchX, and BchL homologs. The dashes (-) indicate identity with the amino acid on the top line. The abbreviations in the species names are as follows: Az. vin., Azotobacter vinelandii; Cb. pha., Chlorobium phaeobacteroides; Cb. tep., Chlorobium tepidum; Cf. aur., Chloroflexus aurantiacus; Cf. agg., Chloroflexus aggregans; Ch. tha., Chloroherpeton thalassium; Cl. ace., Clostridium acetobutylicum; De. haf., Desulfitobacterium hafniense; Ha. hal., Halorhodospira halophila; He. mob., Heliobacterium mobilis; He. mod., Heliobacterium modesticaldum; He. chl., He. chlorum; Me. bar., Methanosarcina barkeri; Me. ace., Methanosarcina acetivorans; Me. ext., Methylobacterium extorquens; No. pun., Nostoc punctiforme; Pmar9303, Prochlorococcus marinus MIT9303; Pr. vib., Prosthecochloris vibrioformis; Ro. cas., Roseiflexus castenholzii; Ro. RS-1, Ro. sp. RS-1; Rh. pal., Rhodopseudomonas palustris; Rh. rub., Rhodospirillum rubrum; Rh. sph., Rhodobacter sphaeroides; Ru. gel., Rubrivivax gelatinosus; Si. mel., Sinorhizobium meliloti; Syn6803, Synechocystis sp. PCC6803; SynRC307, Synechococcus sp. RC307; Tr. ery., Trichodesmium erythraeum.

Phylogenetic trees were also constructed based upon NifH, BchX, and BchL sequences from representative prokaryotic taxa, omitting the indels in these sequence alignments. These trees were made using both maximum-likelihood (ML) and neighbor-joining (NJ) methods and their results are presented in figure 3 and supplementary figure 1, Supplementary Material online. In both these trees, the NifH, BchX, and BchL homologs formed distinct clades that were strongly supported by the bootstrap and puzzling scores. The distinct clustering of homologs from these three families provides evidence that these proteins carry out distinct functions and no LGTs have occurred between them. In these trees, the clade consisting of the BchX proteins was more closely related to the NifH proteins than was the BchL family of proteins. This result provides further evidence that the BchX homologs are primitive in comparison to the BchL proteins. Similar results based upon phylogenetic analyses of these proteins have been obtained in earlier studies (Burke et al. 1993; Xiong and Bauer 2002).

Within the BchL clade, the branching of different phototrophs showed a number of interesting relationships.
The most surprising of these was that the BchL homologs from Cyanobacteria did not form a monophyletic grouping, but they were split into two distinct clades. One of these clades (referred to as Clade C) (Gupta 2009; Gupta and Mathews 2010), consisting of various marine unicellular cyanobacteria, grouped 100% of the time with the homologs from different proteobacteria and it was separated from all other cyanobacteria by a long branch. The remainder of the cyanobacteria branched with or in close proximity to Heliobacteriaceae, but neither the ML nor the NJ method supported a specific relationship between these two groups. In the ML tree (shown in fig. 3), the homolog from Heliobacteriaceae branched in the midst of other cyanobacteria with a long branch, whereas in the NJ tree it formed an outgroup of the clade consisting of the remainder of the cyanobacteria (supplementary fig. 1, Supplementary Material online). Both these trees additionally showed that the BchL homologs from Chlorobi and Chloroflexi were closely related and formed strongly supported clusters. A close relationship between these two groups was also reported in earlier studies (Raymond et al. 2002; Xiong and Bauer 2002) and its significance will be discussed later.

FIG. 3. Phylogenetic tree based on NifH, BchX, and BchL sequences. The tree shown is a ML distance tree and numbers on the nodes indicate percentage of puzzling score and bootstrap scores for the nodes in the ML and NJ trees. Only values >50% are shown. A NJ tree based upon these sequences is shown in supplementary figure 1, Supplementary Material online.
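The NJ trees discussed in this section were built from pairwise distances under the Kimura model (see Materials and Methods). As a rough illustration of what such a distance involves, the sketch below uses the commonly cited empirical form of Kimura's (1983) protein correction, d = -ln(1 - p - 0.2p^2), where p is the observed fraction of differing residues. That TREECON applies exactly this form, and the way gapped columns are skipped here, are assumptions made for the example; the function names are hypothetical.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues over columns where neither sequence is gapped."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def kimura_protein_distance(p: float) -> float:
    """Empirical Kimura (1983) correction for multiple substitutions.

    Undefined once 1 - p - 0.2 * p**2 is no longer positive (p above roughly 0.85).
    """
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0.0:
        raise ValueError("sequences too divergent for the Kimura correction")
    return -math.log(arg)


if __name__ == "__main__":
    p = p_distance("ACDE", "ACDF")
    print(p, kimura_protein_distance(p))
```

The correction grows faster than p itself, which is why highly divergent homologs (such as NifH versus BchL) end up separated by long branches in distance trees.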
Identification of Conserved Indels in the BchN and BchB Proteins Showing a Specific Relationship of the Proteobacterial Homologs to the Clade C Cyanobacteria and the Chlorobi Homologs to the Chloroflexi

In phylogenetic trees based upon different data sets of protein sequences, the sequenced cyanobacterial species/strains form a number of distinct clades (Swingley et al. 2008; Shi and Falkowski 2008; Gupta 2009; Blank and Sanchez-Baracaldo 2010; Gupta and Mathews 2010). One of the major clades observed in these trees, referred to as Clade C in our work (Gupta 2009; Gupta and Mathews 2010), is mainly comprised of the marine unicellular Prochlorococcus and Synechococcus species and strains. These bacteria are the dominant photosynthetic organisms in oceans and they also contain the smallest genomes of any photosynthetic organisms (Dufresne et al. 2003; Zhaxybayeva et al. 2009). The species/strains belonging to this clade are also clearly distinguished from all other cyanobacteria by large numbers of CSIs in widely distributed proteins (Gupta 2009) as well as by numerous signature proteins that are uniquely found in all of the species/strains from this clade of cyanobacteria (Gupta and Mathews 2010). In the present work, in the sequence alignments of BchN homologs, I have identified three CSIs (viz. ›, fi, and ) that distinguish the Clade C cyanobacteria from other cyanobacteria. Excerpts from the sequence alignment of BchN homologs where these CSIs are found are shown in figure 4. These signature indels include 4, 2, and 1 amino acid inserts in the BchN protein that are commonly shared by all Clade C cyanobacteria (fig. 4), but which are lacking in all other cyanobacteria. At the same time, this protein also contains an 8 amino acid conserved insert (CSI –) that is commonly shared by all other cyanobacteria, except the Clade C cyanobacteria (fig. 5A).
The mutually exclusive presence of these CSIs in either the Clade C cyanobacteria or all other cyanobacteria provides further evidence that these two groups/clades of cyanobacteria are distinct. Importantly, all three of the CSIs that are specific for the Clade C cyanobacteria (CSIs ›, fi, and ) are also present in all of the BchN homologs from different classes of Proteobacteria. The shared presence of these CSIs by all Clade C cyanobacteria and the proteobacterial homologs strongly suggests that the BchN gene has been laterally transferred between these two groups. Further, similar to the phylogenetic tree for BchL homologs (fig. 3), in a phylogenetic tree based upon BchN sequences (fig. 6), the Clade C cyanobacteria exhibited a strong and specific association with the proteobacterial homologs. Additionally, a specific grouping of the Clade C cyanobacteria with the proteobacteria is also observed in the phylogenetic tree based upon BchB sequences (supplementary fig. 2, Supplementary Material online).

As shown above, the CSI • in BchL homologs is uniquely present in all sequenced Chlorobi and Chloroflexi species (fig. 1). In the sequence alignments of BchN and BchB proteins, four other CSIs were also identified that are specifically present in species from these two phyla. Sequence information for these CSIs is presented in figures 4 and 5. The BchN protein contains three CSIs (marked ‹, fl, and ‡), all three consisting of 1 amino acid deletions, that are specific for Chlorobi and Chloroflexi (figs. 4A, 4C, and 5A). Similarly, the BchB protein also contains a 1 amino acid conserved deletion that is restricted to these two phototrophic lineages (CSI ” in fig. 5B). It is important to note that all four of these CSIs are uniquely present in all available Chlorobi and Chloroflexi (FAP) homologs, but they are not found in the homologs from any other phototrophic lineages.
Based upon our analysis of the CSI • in the BchL protein sequences (see previous section), where it was shown to be an insert in these lineages, it is likely that the genetic changes leading to these CSIs occurred in the BchN and BchB genes within these lineages. The unique shared presence of these CSIs by these two phylogenetically distinct lineages in all three subunits of the DPOR complex suggests that the genetic changes responsible for them initially occurred in these genes in one of these two lineages (or in an extinct lineage), followed by lateral transfer of these genes to the other group(s). In addition to the phylogenetic studies and the shared CSIs in the BchL, BchB, and BchN proteins, I have also examined pair-wise sequence identity/similarity for these protein sequences. The results of these analyses, which are presented in table 1, show that the Chlorobi homologs for all three proteins were most similar to those from Chloroflexi and the pair-wise identity/similarity values for them were at least 15% higher than those seen for any other phototrophic lineage. Both these lineages also contain the unique Bchl-containing light-harvesting complexes “chlorosomes,” which are not found in other lineages (Olson and Pierson 1987; Olson and Blankenship 2004; Bryant and Frigaard 2006; Hohmann-Marriott and Blankenship 2007). Another observation that stands out from table 1 is that for all three of these proteins the Clade C cyanobacterial homologs exhibited maximal sequence similarity (58–68% identity) to the proteobacterial homologs in comparison to those from other cyanobacteria or any other phototrophic lineage (32–36% identity). All of these observations strongly indicate that the genes for these proteins have been laterally transferred between Chloroflexi and Chlorobi on one hand and the Clade C cyanobacteria and proteobacteria on the other hand. 
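The pair-wise comparison summarized in table 1 can be illustrated with a small calculation on one pre-aligned pair of sequences. This sketch is not the EMBOSS procedure used in this work: it takes an existing alignment rather than computing one, uses the number of aligned columns (gaps included) as the denominator, and defines "similar" residues through a hand-written grouping (`SIMILAR_GROUPS`) that is purely an illustrative assumption in place of a substitution matrix.

```python
from typing import Tuple

# Illustrative residue groupings only; a real analysis would score similarity
# with a substitution matrix rather than fixed groups.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def is_similar(x: str, y: str) -> bool:
    """Identical residues, or two residues that fall in the same group."""
    return x == y or any(x in g and y in g for g in SIMILAR_GROUPS)


def identity_similarity(a: str, b: str) -> Tuple[float, float]:
    """Return (% identity, % similarity) for two aligned sequences.

    Columns gapped in both sequences are ignored; columns gapped in only one
    sequence count toward the alignment length but never as a match.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    cols = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not cols:
        raise ValueError("alignment contains no residues")
    identical = sum(x == y for x, y in cols)
    similar = sum(is_similar(x, y) for x, y in cols)
    return 100.0 * identical / len(cols), 100.0 * similar / len(cols)


if __name__ == "__main__":
    ident, sim = identity_similarity("MKVLAD", "MRVIA-")
    print(f"identity {ident:.1f}%, similarity {sim:.1f}%")
```

Applied to every pair of homologs, values of this kind are what allow a statement such as "Clade C homologs are most similar to the proteobacterial homologs" to be read directly off a table.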
Discussion

The results presented in this manuscript provide important insights into several aspects of the evolution of photosynthesis and its spread to other bacterial phyla. One important question is which of the two forms of photosynthesis, i.e., the oxygenic photosynthesis carried out by cyanobacteria or the anoxygenic photosynthesis carried out by other bacterial phyla, originated first (Blankenship 1992; Burke et al. 1993; Olson and Blankenship 2004; Mulkidjanian et al. 2006; Blankenship 2010; Hohmann-Marriott and Blankenship 2011). According to the Granick hypothesis (Granick 1965), in a given biochemical pathway the enzymes/proteins that carry out an earlier biochemical step have likely evolved earlier than those carrying out later steps. In photosynthesis, a key biochemical process/pathway is the synthesis of bacteriochlorophyll and chlorophyll (Beale 1999; Chew and Bryant 2007). In the pathway leading to the biosynthesis of Bchl/Chl, the enzyme complex protochlorophyllide oxidoreductase (BchL-N-B), responsible for the production of chlorin (a precursor to Chl), precedes the complex chlorin reductase (BchX–Y–Z) that reduces chlorin to bacteriochlorin, which is a direct precursor for Bchl (Burke et al. 1993; Beale 1999; Chew and Bryant 2007). Thus, based upon this hypothesis, the oxygenic photosynthesis based on Chl should have evolved prior to the anoxygenic photosynthesis requiring Bchl, and some models for the evolution of photosynthesis based upon this have been proposed (Mauzerall 1978; Olson and Pierson 1987; Olson and Blankenship 2004).

Gupta . doi:10.1093/molbev/mss145 MBE

FIG. 4. Excerpts from the sequence alignment for BchN homologs showing a number of CSIs that are specific for different groups of phototrophs. The dashes in the alignments show identity with the amino acid on the top line. All of these CSIs are highly specific for the indicated groups and the numbers below the group names indicate their presence or absence in the available sequences from these groups.

FIG. 5. Partial sequence alignments of (A) the BchN protein and (B) BchB protein showing some CSIs that are specific for different groups of phototrophs. Other details are same as in figures 1 and 4.

However, earlier phylogenetic studies based on NifH, BchX, and BchL proteins indicated that the BchX homologs originated prior to the BchL homologs, suggesting that anoxygenic photosynthesis requiring BchX homologs preceded the oxygenic photosynthesis that requires BchL (Burke et al. 1993; Xiong et al. 2000; Raymond et al. 2003). The phylogenetic studies based upon these protein sequences reported here also strongly support the results of earlier studies. However, because construction of phylogenetic trees and inferences derived from them are influenced by large numbers of variables and assumptions (Felsenstein 1988; Moreira and Philippe 2000), it is important to confirm this inference by other means. In this context, the two CSIs (‚ and „) identified in the present work in the sequence alignments of NifH, BchX, and BchL, which are uniquely shared by all NifH and BchX homologs but not found in any BchL homologs, are highly significant. Because of the earlier divergence of the NifH protein from the BchX and BchL proteins, the NifH sequences can be used to determine whether any conserved characteristic that is present in the BchX or BchL protein is ancestral or derived.
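The outgroup premise described above can be written down as a small decision rule. The following is a minimal sketch under stated assumptions, not software used in this work: the function name and the boolean encoding (True when a CSI is present in a sequence) are hypothetical, and the rule presumes that the outgroup really did diverge before the groups being compared.

```python
from typing import Dict, Iterable


def polarize_indel(outgroup: Iterable[bool],
                   ingroups: Dict[str, Iterable[bool]]) -> Dict[str, str]:
    """Label each ingroup's indel state as ancestral, derived, or ambiguous.

    outgroup: presence (True) or absence (False) of the indel in each
              outgroup sequence (e.g., NifH homologs).
    ingroups: the same encoding for each group under study.
    """
    outgroup_states = set(outgroup)
    if len(outgroup_states) != 1:
        # A mixed or empty outgroup cannot polarize the character.
        return {name: "ambiguous" for name in ingroups}

    ancestral_state = next(iter(outgroup_states))
    labels = {}
    for name, states in ingroups.items():
        observed = set(states)
        if len(observed) != 1:
            labels[name] = "ambiguous"
        elif observed == {ancestral_state}:
            labels[name] = "ancestral"
        else:
            labels[name] = "derived"
    return labels


# Pattern described in the text: the CSIs are present in all NifH and BchX
# homologs and absent from all BchL homologs (sequence counts are arbitrary).
labels = polarize_indel(
    outgroup=[True, True, True],
    ingroups={"BchX": [True, True], "BchL": [False, False, False]},
)
```

With this input the rule labels the BchX state as ancestral and the BchL state as derived, which is the inference drawn in the text.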
Based upon this simple premise, the unique shared presence of the CSIs ‚ and „ by all NifH and BchX homologs, but not by any of the BchL homologs, strongly indicates that the BchX homologs containing these CSIs are ancestral and that the genetic changes leading to deletion of sequences corresponding to these CSIs occurred in the ancestral BchL gene after its divergence from BchX by gene duplication. It should be emphasized that the interpretation of these shared CSIs is straightforward and it only assumes that these shared genetic characteristics (synapomorphies) have a common evolutionary origin, which is the most parsimonious explanation to account for them. Due to the presence of these CSIs in conserved regions and their presence in all NifH and BchX homologs, but none of the BchL homologs, it is difficult to explain these results by any other means except by inferring that the NifH and the BchX homologs shared a common ancestor exclusive of the BchL homologs. The earlier origin of the BchX homologs in comparison to the BchL homologs, as strongly suggested by these CSIs, provides strong and independent evidence that the anoxygenic photosynthesis supported by BchX homologs originated before the oxygenic photosynthesis requiring BchL homologs. The lack of support for the Granick hypothesis by these results can be explained (Burke et al. 1993) if the ancestral BchX–Y–Z enzyme complex carried out the functions of both the DPOR (BchL-N-B) and chlorin reductase (BchX–Y–Z) complexes, and these complexes became more specialized after the gene duplication event. However, other explanations to account for this anomaly are also possible (Xiong et al. 2000; Olson and Blankenship 2004).

Table 1. Pair-wise Sequence Identity/Similarity Values for the BchL, BchN, and BchB Homologs.

ChlB
        Pm9303  Syn9902 Amax    Sc6803  Gvio    Rpal    Rrub    Caur    Rcas    Paes    Ctep    Hmod
Pm9303  –       86.7    49.5    51.4    51.0    72.8    73.0    50.5    49.2    50.9    50.8    50.0
Syn9902 77.2    –       50.7    51.2    48.7    72.5    72.6    51.3    49.2    50.8    51.1    51.8
Amax    33.6    32.7    –       92.5    86.6    50.0    47.8    58.7    54.7    56.7    56.3    56.6
Sc6803  33.8    33.5    83.5    –       84.8    49.6    47.3    58.0    54.5    57.2    55.4    57.0
Gvio    33.9    33.0    74.6    73.4    –       50.7    48.9    57.1    55.8    56.0    56.4    56.7
Rpal    61.8    60.9    32.7    32.0    33.3    –       81.8    51.0    50.8    50.9    51.3    50.7
Rrub    60.1    59.9    31.6    30.6    32.9    71.2    –       50.6    51.2    51.2    52.1    49.6
Caur    34.8    34.2    39.7    39.5    39.8    37.3    36.5    –       76.3    72.6    72.6    55.8
Rcas    34.1    32.9    36.4    36.2    36.7    34.9    34.7    66.5    –       71.1    71.8    54.8
Paes    33.6    33.4    36.8    37.0    35.9    34.6    35.3    57.1    56.0    –       84.0    58.3
Ctep    32.7    33.3    35.6    36.0    36.2    35.8    36.4    58.7    56.7    73.0    –       59.3
Hmod    32.1    35.1    40.6    40.2    40.9    33.3    33.0    39.7    37.2    39.7    40.5    –

BchN
        Pm9303  Syn9902 Amax    Sc6803  Gvio    Rpal    Rrub    Caur    Rcas    Paes    Ctep    Hmod
Pm9303  –       84.8    51.5    50.6    52.7    74.9    69.3    52.9    56.7    52.5    52.7    54.1
Syn9902 76.5    –       51.0    52.3    50.4    74.0    71.7    53.9    56.7    51.9    54.9    55.6
Amax    33.5    32.8    –       91.3    83.2    47.9    47.4    54.3    53.2    53.9    53.9    56.7
Sc6803  32.2    33.5    85.2    –       82.3    50.0    49.3    54.4    56.8    52.8    53.4    55.9
Gvio    33.7    33.5    73.1    70.7    –       47.7    46.7    52.8    53.3    52.8    52.7    56.0
Rpal    61.8    61.2    32.9    33.7    32.0    –       79.3    49.8    53.5    51.7    51.2    51.3
Rrub    57.8    60.5    33.1    33.2    32.3    68.2    –       48.5    50.3    49.2    51.1    50.4
Caur    33.7    35.6    35.5    36.2    33.3    33.0    34.7    –       81.5    73.7    75.8    55.1
Rcas    37.5    38.0    35.7    38.3    35.0    36.1    35.7    71.1    –       77.7    78.4    56.6
Paes    34.0    34.1    35.2    36.2    35.4    33.3    33.9    60.8    64.8    –       90.5    55.7
Ctep    34.7    35.5    33.6    34.3    33.3    33.8    35.6    61.3    64.4    81.4    –       56.4
Hmod    34.9    36.8    41.0    39.2    39.5    34.8    36.5    39.4    40.1    38.4    40.1    –

BchB
        Pm9303  Syn9902 Amax    Sc6803  Gvio    Rpal    Rrub    Caur    Rcas    Paes    Ctep    Hmod
Pm9303  –       92.6    61.6    65.5    62.3    78.3    82.5    63.2    62.3    62.4    63.4    59.5
Syn9902 88.8    –       60.4    63.6    62.1    78.7    82.6    61.8    60.8    64.1    62.9    58.9
Amax    43.1    44.2    –       64.4    89.6    58.8    59.5    66.1    67.5    65.9    67.9    69.1
Sc6803  46.5    47.0    47.9    –       81.4    61.3    62.1    62.7    63.9    62.8    64.4    63.0
Gvio    46.4    46.0    81.2    73.6    –       59.9    61.0    70.2    69.1    68.5    68.6    70.0
Rpal    66.2    68.2    44.2    44.6    45.3    –       83.7    60.1    58.8    59.9    60.8    57.4
Rrub    71.4    71.9    45.8    48.3    47.1    72.4    –       61.8    60.8    61.9    62.7    58.9
Caur    46.4    46.6    50.2    47.5    52.1    46.2    47.5    –       91.6    89.5    87.3    67.0
Rcas    47.7    47.1    49.8    46.8    50.0    46.5    49.5    85.7    –       88.0    87.3    67.7
Paes    44.6    44.8    49.0    46.7    51.0    43.6    45.7    75.6    73.8    –       94.2    68.5
Ctep    45.2    46.3    51.0    47.9    50.2    44.8    46.5    71.7    73.2    85.9    –       67.8
Hmod    41.8    41.4    51.3    46.0    50.9    39.3    41.8    48.3    48.6    48.1    45.1    –

NOTE: The abbreviations for the species are: Pm9303, Pro. marinus MIT9303; Syn9902, Synechococcus sp. PCC9902; Amax, Arthrospira maxima; Sc6803, Synechocystis sp. PCC6803; Gvio, Gloeobacter violaceus; Rpal, Rhodopseud. palustris; Rrub, Rhodospirillum rubrum; Caur, Chloroflexus aurantiacus; Rcas, Roseiflexus castenholzii; Paes, Prostheco. aestuarii; Ctep, Chlorobium tepidum; Hmod, Heliobacterium modesticaldum.

Another important unresolved question concerning the evolution of photosynthesis is in which bacterial group or phylum this process first evolved. The results presented here again provide important insights in this regard. One of the CSIs in the BchL protein identified here (CSI ¶ in fig. 1) is specifically present in the BchL homologs from Heliobacteriaceae, but it is absent in all other BchL homologs.
Importantly, based upon the presence of this CSI in different BchX homologs it is possible to infer that the presence of this CSI represents the ancestral character state of the BchL–BchX protein and that the BchL homologs from Heliobacteriaceae are ancestral in comparison to those from other lineages. In view of the absence of this indel in all other BchL homologs and the distinct branching of the BchX and BchL homologs, the possibility of chance occurrence of this indel in the Heliobacteriaceae BchL homologs, or their acquisition of a gene containing this CSI by means of LGT from other sources, is considered unlikely, although neither can be entirely excluded. Because the Firmicutes phylum, of which Heliobacteriaceae are a part, represents the earliest branching phylum within the Bacteria (Gupta 2001, 2003, 2011; Ciccarelli et al. 2006), this suggests that photosynthesis evolved very early in the evolutionary history of life.

FIG. 6. Phylogenetic tree based on BchN sequences showing the relationships among different photosynthetic taxa. The tree shown is a ML distance tree, which was arbitrarily rooted using the H. modesticaldum sequence. The numbers on the nodes indicate percentage of puzzling quartets or bootstrap scores (>50%) supporting these nodes.

It should be noted that earlier phylogenetic studies based on BchL, BchN, and BchB proteins have led to the inference that Proteobacteria were the earliest photosynthetic lineage that evolved (Xiong et al. 2000; Xiong and Bauer 2002). However, the data set employed in these studies lacked any Clade C cyanobacteria to which the proteobacterial homologs are most closely related. Because of the highly divergent nature of the proteobacterial homologs and the lack of any close relatives to them (viz.
Clade C cyanobacteria) in the datasets that were employed, the deep branching of proteobacterial homologs in earlier studies was very likely a result of long branch length effect (Green and Gantt 2000).

FIG. 7. Partial sequence alignments of the BchB protein showing two CSIs that are present in the same position, which are specific for either species from the genus Roseiflexus or for all of the Chlorobi. A 2 amino acid insert in this position is also present in O. trichoides. Other details are the same as in figures 1 and 4.

Although photosynthesis-related genes are known to have undergone extensive LGTs, it has proven difficult to determine the directions of LGTs or how photosynthetic ability was acquired by various phyla (Raymond et al. 2002; Xiong and Bauer 2002; Raymond et al. 2003; Raymond 2009; Hohmann-Marriott and Blankenship 2011). In the present work, we have identified several CSIs in the BchL, BchB, and BchN protein sequences that are commonly shared by either Chlorobi and Chloroflexi, or by Clade C Cyanobacteria and Proteobacteria, providing further evidence that these genes have incurred LGTs. Importantly, based upon a number of observations made in this work and our earlier work, it is possible to infer for the above genes the directions of LGTs. In the protein BchB, which contains CSI ” that is commonly shared by various Chlorobi and Chloroflexi, two other CSIs have also been identified (fig. 7). One of these CSIs consisting of a 9 amino acid insert (») is uniquely found in the two Roseiflexus species, whereas the other CSI consisting of 1 amino acid insert (…) is specific for various Chlorobi homologs (fig. 7). The genetic changes responsible for these CSIs likely occurred in the common ancestors of these particular taxa.
Based upon these CSIs, for the BchB gene, if the gene transfer had taken place from Chlorobi to Chloroflexi, then it would be expected that the CSI … that is specific for various Chlorobi should also be found in the Chloroflexi homologs. However, the absence of the CSI … in the Chloroflexi homologs indicates that the gene transfer has not occurred in this direction, but that it has likely occurred from Chloroflexi (FAP) to Chlorobi, followed by the introduction of the genetic change leading to the CSI … in the common ancestor of Chlorobi. Furthermore, the absence of the large Roseiflexus-specific CSI » in the Chlorobi homologs indicates that this genus was not the source of LGT and suggests that the ancestral Chloroflexi from which this gene transfer occurred was either a Chloroflexus or some related filamentous anoxygenic phototroph (FAP) that lacked this indel. The presence of a 2 amino acid insert in this position in Oscillochloris also makes it less likely as the source of LGT. Based upon these observations and the fact that BchL-N-B proteins are part of the same enzyme complex (DPOR), it is likely that the genes for the BchN and BchL proteins, which also contain CSIs that are commonly shared by Chloroflexi and Chlorobi, were also laterally transferred from Chloroflexus or a related FAP to the Chlorobi.

Our results also provide compelling evidence that the homologs of the BchL-N-B proteins from Proteobacteria are closely related to those from Clade C cyanobacteria. This inference is based upon several CSIs that are uniquely shared by these two groups (viz. ›, fi, and ), by phylogenetic analyses based on these protein sequences, and the pair-wise sequence identity/similarity scores for these proteins. Of the different phyla of photosynthetic bacteria, Cyanobacteria are made up entirely of photosynthetic organisms (Castenholz 2001; Mulkidjanian et al. 2006; Gupta 2010; Blank and Sanchez-Baracaldo 2010) and their monophyletic nature is supported by different lines of evidence (Castenholz 2001; Wilmotte and Herdman 2001; Ciccarelli et al. 2006) including large numbers of CSIs and signature proteins that are uniquely present in all Cyanobacteria (Gupta 2009; Gupta and Mathews 2010). Further, our recent work on Cyanobacteria provides evidence that the Clade C is a derived clade and several other cyanobacterial species/strains, particularly those belonging to Clade A, constitute the deepest branching lineage within this phylum (Gupta 2009; Gupta and Mathews 2010). In contrast to Cyanobacteria, photosynthetic ability within Proteobacteria is sporadically distributed in a limited number of species belonging to the Alpha-, Beta-, and Gamma-classes of proteobacteria (Yurkov and Beatty 1998; Imhoff 2001; Gupta and Mok 2007; Gupta 2010). In view of these observations, it is more likely that the various CSIs (viz. ›, fi, and ) and other genetic changes in the BchL-N-B proteins that distinguish the Clade C cyanobacteria from other cyanobacteria initially occurred in a common ancestor of the Clade C cyanobacteria and then these genes were laterally acquired by Proteobacteria. The alternative possibility, that these genetic changes first occurred in a proteobacterial ancestor and were subsequently transferred to the Clade C cyanobacteria, would require numerous gene losses, gene transfers, and gene replacement events, and it is considered highly unlikely. The transfer of these genes from Clade C cyanobacteria to proteobacteria, both of which are major components of the marine microbial community, could have occurred in oceanic environments (Partensky et al. 1999; Kolber et al. 2001; Dufresne et al. 2003; Oda et al. 2008) and their further dissemination within the proteobacteria may have been facilitated by the gene transfer agents that are present in many alpha proteobacteria (Lang and Beatty 2007).
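The directional reasoning applied above to the BchB CSIs (a recipient is expected to inherit every CSI that the donor lineage already carried) can be illustrated with a short sketch. Everything named in it is hypothetical: the function, the plain-text CSI labels standing in for the symbols used in the figures, and the simplifying assumptions that a lineage-specific CSI predates the transfer and is never secondarily lost.

```python
from typing import Dict, List, Set


def compatible_donors(csis_by_taxon: Dict[str, Set[str]],
                      recipients: List[str]) -> List[str]:
    """Return taxa whose CSIs are all present in every recipient taxon.

    A taxon that carries a CSI missing from the recipients is treated as an
    unlikely donor, because the transferred gene should have carried it.
    """
    recipient_sets = [csis_by_taxon[name] for name in recipients]
    donors = []
    for taxon, csis in csis_by_taxon.items():
        if taxon in recipients:
            continue
        # Subset test: the recipient may have gained extra CSIs afterwards.
        if all(csis <= recipient_csis for recipient_csis in recipient_sets):
            donors.append(taxon)
    return sorted(donors)


# Hypothetical encoding of the BchB pattern described in the text.
bchb_csis = {
    "Chlorobi": {"shared_insert", "chlorobi_1aa_insert"},
    "Chloroflexus": {"shared_insert"},
    "Roseiflexus": {"shared_insert", "roseiflexus_9aa_insert"},
    "Oscillochloris": {"shared_insert", "oscillochloris_2aa_insert"},
}

donors_to_chlorobi = compatible_donors(bchb_csis, ["Chlorobi"])
donors_to_chloroflexus = compatible_donors(bchb_csis, ["Chloroflexus"])
```

Under these assumptions only Chloroflexus remains a compatible donor to Chlorobi, and no taxon in the toy data set is a compatible donor to Chloroflexus, mirroring the direction of transfer inferred in the text.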
Presently, no unique aspects of photosynthesis are known that are commonly shared by Proteobacteria and the Clade C Cyanobacteria. In view of the close similarities seen for the components of the DPOR complex between these two groups, studies aimed at identifying common and unique aspects of photosynthesis between them should be of much interest. It should also be mentioned that in contrast to the Cyanobacteria, which contain both reaction centers I and II, the Proteobacteria possess only the RC II and carry out anoxygenic photosynthesis. Therefore, if the genes for other photosynthesis-related proteins were also transferred from Clade C cyanobacteria to proteobacteria, then this gene transfer was likely accompanied or followed by loss of genes for many photosynthesis-related proteins.

The results of pair-wise sequence similarities on the BchL-N-B proteins (table 1) indicate that for all three of these proteins, the homologs from Heliobacteriaceae exhibited higher similarity to those from cyanobacteria (except Clade C) and Chloroflexi/Chlorobi than to those from Proteobacteria or Clade C cyanobacteria. Our earlier work based on many other CSIs in universally distributed proteins indicates that the phylum Chloroflexi branched after the Firmicutes but prior to Cyanobacteria (Gupta 2001, 2003; Ciccarelli et al. 2006). These observations suggest that either Chloroflexi or Cyanobacteria were the earliest recipients of these genes from Heliobacteriaceae.

FIG. 8. Structure of the A chain of the BchL protein from Rhodobacter sphaeroides showing the location of the two identified CSIs (¶ and •) in this protein. The structure of the BchL protein from R. sphaeroides was obtained from the Protein Data Bank (Sarma et al. 2008; Muraki et al. 2010) and the image was constructed using the PyMol program. The protein contains a bound MgADP and a [4Fe-4S] cluster that are shown in dull yellow and red colors, respectively. The positions of the two CSIs that are specific for Heliobacteriaceae and Chlorobi-Chloroflexi, respectively, in this structure are marked with arrows and they were inferred based upon sequence alignment (fig. 1).

It should be noted that the Heliobacteriaceae species, which our results indicate contain a primitive form of the BchL protein (i.e., DPOR complex), also possess a primitive photosynthetic reaction center (RC), where both antenna and RC complexes are part of a single protein (Trost and Blankenship 1989; Blankenship 1992; Vermaas 1994; Vassiliev et al. 2001; Heinnickel and Golbeck 2007; Sattley et al. 2008). Further, unlike other photosynthetic prokaryotes, no photoautotrophic growth has thus far been observed for any Heliobacteriaceae species (Gest and Favinger 1983; Bryant and Frigaard 2006; Madigan 2006; Sattley and Blankenship 2010). It should also be noted that photosynthetic ability or genes within the phylum Firmicutes have only been found within the Heliobacteriaceae family (Gest and Favinger 1983; Sattley and Blankenship 2010). It is possible that due to the primitive nature of the photosynthetic apparatus within the Firmicutes and its inability to support photoautotrophic growth, in an environment that is now predominantly oxygenic, photosynthesis-related genes have been lost from other extant Firmicutes. These observations raise the possibility that although the genes for some of the key photosynthesis proteins (viz. DPOR complex) and a primitive photosynthetic RC first evolved in the Heliobacteriaceae, functional photosynthetic ability was not developed until the later diverging phototrophic lineages such as Chloroflexi and/or Cyanobacteria. This possibility can also account for the geological and fossil evidence that the earliest phototrophic microbial communities, which existed as early as 3.4 Ga ago and used the Calvin cycle for CO2 fixation, were composed of filamentous anoxygenic bacteria (Dismukes et al. 2001; Tice and Lowe 2004, 2006).
In contrast, oxygenic photosynthesis attributable to Cyanobacteria is indicated to have evolved 2.2–2.6 Ga ago (Kazmierczak and Altermann 2002; Olson and Blankenship 2004; Olson 2006; Blank and Sanchez-Baracaldo 2010). Because Chloroflexi have filamentous morphology and they are capable of carrying out anoxygenic photosynthesis by a variety of mechanisms including the Calvin cycle (Hanada and Pierson 2006), they could account for the earliest phototrophic microbial communities (Tice and Lowe 2006; Olson 2006).

It should be acknowledged that the inferences drawn in this study concerning the origin of photosynthesis and its spread to other bacterial phyla are made almost solely on the basis of the proteins (viz. BchL, BchN, BchB, BchX, NifH) that were studied in this work. Although these proteins (all except NifH) are unique and central components of the photosynthesis pathway, the process of photosynthesis overall is very complex and it involves varied sets of genes in different lineages that have been acquired by different means including gene gains and losses and LGTs (Raymond et al. 2002; Xiong and Bauer 2002; Olson and Blankenship 2004; Mulkidjanian et al. 2006; Raymond 2009; Hohmann-Marriott and Blankenship 2011). Therefore, it is likely and may in fact be expected that not all components of this complex process will exhibit similar evolutionary histories. Nonetheless, unlike other photosynthesis components, the genes/proteins that were studied in this work are unique characteristics of all photosynthetic organisms and they play pivotal roles in the photosynthesis process. Hence, the evolutionary histories of these genes/proteins are of central importance in understanding the origin and spread of photosynthesis.

Lastly, this work has identified large numbers of CSIs in key photosynthesis proteins that are specific for different groups of photosynthetic prokaryotes.
Hence, it is of much interest to understand the functional significance of these evolutionarily conserved genetic changes. Recent work on several CSIs in other important proteins has shown that such CSIs, which are generally located in the surface loops of proteins (Akiva et al. 2008; Gupta 2010), are essential for the groups of species where they are found (Singh and Gupta 2009). Two of the important CSIs (¶ and •) in the BchL protein that were identified in the present work (fig. 1) are also located in the surface loops in the structure of this protein (fig. 8) (Sarma et al. 2008; Muraki et al. 2010). The surface loops in protein sequences play important roles in mediating protein–protein interactions (Akiva et al. 2008; Singh and Gupta 2009; Hormozdiari et al. 2009). Hence, it is likely that the identified CSIs in the BchL, BchN, and BchB proteins are also involved in mediating protein–protein interactions that are specific and essential for different groups of phototrophs. Therefore, further studies on understanding the functional significance of these CSIs could reveal novel aspects of these important proteins that are specific for different lineages of photosynthetic bacteria.

Supplementary Material

Supplementary figures 1 and 2 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by a research grant from the Natural Science and Engineering Research Council of Canada. I acknowledge the assistance of Sanjan George in Blast searches and in the creation of signature files.

References

Akiva E, Itzhaki Z, Margalit H. 2008. Built-in loops allow versatility in domain-domain interactions: lessons from self-interacting domains. Proc Natl Acad Sci U S A. 105:13292–13297.
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein databases search programs. Nucleic Acids Res. 25:3389–402.
Baldauf SL, Palmer JD. 1993. Animals and fungi are each other’s closest relatives: congruent evidence from multiple proteins. Proc Natl Acad Sci U S A. 90:11558–11562.
Beale SI. 1999. Enzymes of chlorophyll biosynthesis. Photosynth Res. 60:43–73.
Blank CE, Sanchez-Baracaldo P. 2010. Timing of morphological and ecological innovations in the cyanobacteria–a key to understanding the rise in atmospheric oxygen. Geobiology 8:1–23.
Blankenship RE. 1992. Origin and early evolution of photosynthesis. Photosynth Res. 33:91–111.
Blankenship RE. 1994. Protein structure, electron transfer and evolution of prokaryotic photosynthetic reaction centers. Antonie van Leeuwenhoek 65:311–329.
Blankenship RE. 2010. Early evolution of photosynthesis. Plant Physiol. 154:434–438.
Blankenship RE, Hartman H. 1998. The origin and evolution of oxygenic photosynthesis. Trends Biochem Sci. 23:94–97.
Bryant DA, Costas AM, Maresca JA, et al. (11 co-authors). 2007. Candidatus Chloracidobacterium thermophilum: an aerobic phototrophic Acidobacterium. Science 317:523–526.
Bryant DA, Frigaard NU. 2006. Prokaryotic photosynthesis and phototrophy illuminated. Trends Microbiol. 14:488–496.
Burke DH, Hearst JE, Sidow A. 1993. Early evolution of photosynthesis: clues from nitrogenase and chlorophyll iron proteins. Proc Natl Acad Sci U S A. 90:7134–7138.
Castenholz RW. 2001. Phylum BX. Cyanobacteria: oxygenic photosynthetic bacteria. In: Boone DR, Castenholz RW, editors. Bergey’s manual of systematic bacteriology. New York: Springer. p. 474–487.
Chew AG, Bryant DA. 2007. Chlorophyll biosynthesis in bacteria: the origins of structural and functional diversity. Annu Rev Microbiol. 61:113–129.
Choudhary M, Kaplan S. 2000. DNA sequence analysis of the photosynthesis region of Rhodobacter sphaeroides 2.4.1. Nucleic Acids Res. 28:862–867.
Ciccarelli FD, Doerks T, von Mering C, Creevey CJ, Snel B, Bork P. 2006. Toward automatic reconstruction of a highly resolved tree of life. Science 311:1283–1287.
Delwiche CF, Kuhsel M, Palmer JD. 1995. Phylogenetic analysis of tufA sequences indicates a cyanobacterial origin of all plastids. Mol Phylogenet Evol. 4:110–128.
Dismukes GC, Klimov VV, Baranov SV, Kozlov YN, DasGupta J, Tyryshkin A. 2001. The origin of atmospheric oxygen on Earth: the innovation of oxygenic photosynthesis. Proc Natl Acad Sci U S A. 98:2170–2175.
Dufresne A, Salanoubat M, Partensky F, et al. (21 co-authors). 2003. Genome sequence of the cyanobacterium Prochlorococcus marinus SS120, a nearly minimal oxyphototrophic genome. Proc Natl Acad Sci U S A. 100:10020–10025.
Felsenstein J. 1988. Phylogenies from molecular sequences: inference and reliability. Annu Rev Genet. 22:521–65.
Felsenstein J. 2004. Inferring phylogenies. Sunderland (MA): Sinauer Associates, Inc.
Gest H, Favinger J. 1983. Heliobacterium chlorum, an anoxygenic brownish-green photosynthetic bacterium containing a “new” form of bacteriochlorophyll. Arch Microbiol. 136:11–16.
Golbeck JH. 1993. Shared thematic elements in photochemical reaction centers. Proc Natl Acad Sci U S A. 90:1642–1646.
Granick S. 1965. Evolution of heme and chlorophyll. In: Bryson V, Vogel HJ, editors. Evolving genes and proteins. New York: Academic Press. p. 67–88.
Green BR, Gantt E. 2000. Is photosynthesis really derived from proteobacteria? Phycology 36:983–985.
Gupta RS. 1998. Protein phylogenies and signature sequences: a reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes. Microbiol Mol Biol Rev. 62:1435–1491.
Gupta RS. 2000. The natural evolutionary relationships among prokaryotes. Crit Rev Microbiol. 26:111–131.
Gupta RS. 2001. The branching order and phylogenetic placement of species from completed bacterial genomes, based on conserved indels found in various proteins. Int Microbiol. 4:187–202.
Gupta RS. 2003. Evolutionary relationships among photosynthetic bacteria. Photosynth Res. 76:173–183.
Gupta RS. 2009. Protein signatures (molecular synapomorphies) that are distinctive characteristics of the major cyanobacterial clades. Int J Syst Evol Microbiol. 59:2510–2526.
Gupta RS. 2010. Molecular signatures for the main phyla of photosynthetic bacteria and their subgroups. Photosynth Res. 104:357–372.
Gupta RS. 2011. Origin of diderm (Gram-negative) bacteria: antibiotic selection pressure rather than endosymbiosis likely led to the evolution of bacterial cells with two membranes. Antonie van Leeuwenhoek 100:171–182.
Gupta RS, Golding GB. 1993. Evolution of HSP70 gene and its implications regarding relationships between archaebacteria, eubacteria, and eukaryotes. J Mol Evol. 37:573–582.
Gupta RS, Mathews DW. 2010. Signature proteins for the major clades of Cyanobacteria. BMC Evol Biol. 10:24.
Gupta RS, Mok A. 2007. Phylogenomics and signature proteins for the alpha proteobacteria and its main groups. BMC Microbiol. 7:106.
Gupta RS, Mukhtar T, Singh B. 1999. Evolutionary relationships among photosynthetic prokaryotes (Heliobacterium chlorum, Chloroflexus aurantiacus, cyanobacteria, Chlorobium tepidum and proteobacteria): implications regarding the origin of photosynthesis. Mol Microbiol. 32:893–906.
Hanada S, Pierson BK. 2006. The Family Chloroflexaceae. In: Dworkin M, Falkow S, Rosenberg E, Schleifer KH, Stackebrandt E, editors. The Prokaryotes: a handbook on the biology of bacteria. New York: Springer. p. 81?–842.
Hartman H. 1998. Photosynthesis and the origin of life. Orig Life Evol Biosphere 28:515–521.
Haselkorn R. 1986. Organization of the genes for nitrogen fixation in photosynthetic bacteria and cyanobacteria. Annu Rev Microbiol. 40:525–547.
Heinnickel M, Golbeck JH. 2007. Heliobacterial photosynthesis. Photosynth Res. 92:35–53.
Hohmann-Marriott MF, Blankenship RE. 2007. Hypothesis on chlorosome biogenesis in green photosynthetic bacteria. FEBS Lett. 581:800–803.
Hohmann-Marriott MF, Blankenship RE. 2011. Evolution of photosynthesis. Annu Rev Plant Biol. 62:515–548.
Hormozdiari F, Salari R, Hsing M, Schonhuth A, Chan SK, Sahinalp SC, Cherkasov A. 2009. The effect of insertions and deletions on wirings in protein-protein interaction networks: a large-scale study. J Comput Biol. 16:159–167.
Imhoff JF. 2001. The anoxygenic phototrophic purple bacteria. In: Boone DR, Castenholz RW, editors. Bergey’s manual of systematic bacteriology. Berlin (Germany): Springer. p. 631–637.
Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ. 1998. Multiple sequence alignment with Clustal X. Trends Biochem Sci. 23:403–405.
Kazmierczak J, Altermann W. 2002. Neoarchean biomineralization by benthic cyanobacteria. Science 298:2351.
Kimura M. 1983. The neutral theory of molecular evolution. Cambridge: Cambridge University Press.
Kolber ZS, Plumley FG, Lang AS, et al. (10 co-authors). 2001. Contribution of aerobic photoheterotrophic bacteria to the carbon cycle in the ocean. Science 292:2492–2495.
Lang AS, Beatty JT. 2007. Importance of widespread gene transfer agent genes in alpha proteobacteria. Trends Microbiol. 15:54–62.
Madigan MT. 2006. The Family Heliobacteriaceae. In: Dworkin M, Falkow S, Rosenberg E, Schleifer KH, Stackebrandt E, editors. The Prokaryotes: a handbook on the biology of bacteria. New York: Springer. p. 9?1–964.
Margulis L. 1993. Symbiosis in cell evolution. New York: W.H. Freeman and Company.
Mauzerall D. 1978. Bacteriochlorophyll and photosynthesis evolution. In: Clayton RK, Sistrom WR, editors. The photosynthetic bacteria. New York: Plenum Press. p. 223–231.
Morden CW, Delwiche CF, Kuhsel M, Palmer JD. 1992. Gene phylogenies and the endosymbiotic origin of plastids. Biosystems 28:75–90.
Moreira D, Philippe H. 2000. Molecular phylogeny: pitfalls and progress. Int Microbiol. 3:9–16.
Mulkidjanian AY, Koonin EV, Makarova KS, et al. (12 co-authors). 2006. The cyanobacterial genome core and the origin of photosynthesis. Proc Natl Acad Sci U S A. 103:13126–13131.
Muraki N, Nomata J, Ebata K, Mizoguchi T, Shiba T, Tamiaki H, Kurisu G, Fujita Y. 2010. X-ray crystal structure of the light-independent protochlorophyllide reductase. Nature 465:110–114.
NCBI. 2011. NCBI Completed microbial genomes [Internet]. [cited 2011 Dec 5]. Available from: http://www.ncbi.nlm.nih.gov/genome/browse/
Nelson N, Ben Shem A. 2005. The structure of photosystem I and evolution of photosynthesis. Bioessays 27:914–922.
Nitschke W, Rutherford AW. 1991. Photosynthetic reaction centers: variation on a common structural theme? Trends Biochem Sci. 16:241–245.
Oda Y, Larimer FW, Chain PS, et al. (13 co-authors). 2008. Multiple genome sequences reveal adaptations of a phototrophic bacterium to sediment microenvironments. Proc Natl Acad Sci U S A. 105:18543–18548.
Olsen GJ, Woese CR, Overbeek R. 1994. The winds of (evolutionary) change: breathing new life into microbiology. J Bacteriol. 176:1–6.
Olson JM. 2006. Photosynthesis in the Archean era. Photosynth Res. 88:109–117.
Olson JM, Blankenship RE. 2004. Thinking about the evolution of photosynthesis. Photosynth Res. 80:373–386.
Olson JM, Pierson BK. 1987. Evolution of reaction centers in photosynthetic prokaryotes. Int Rev Cytol. 108:209–248.
Partensky F, Hess WR, Vaulot D. 1999. Prochlorococcus, a marine photosynthetic prokaryote of global significance. Microbiol Mol Biol Rev. 63:106–27.
Pierson BK. 1994. The emergence, diversification, and role of photosynthetic eubacteria. In: Benston S, editor. Early life on earth: nobel symposium no. 84. New York: Columbia University Press. p. 161–180.
Raymond J. 2008. Coloring in the tree of life. Trends Microbiol. 16:41–43.
Raymond J. 2009. The role of horizontal gene transfer in photosynthesis, oxygen production, and oxygen tolerance. Methods Mol Biol. 532:323–338.
Raymond J, Segre D. 2006. The effect of oxygen on biochemical networks and the evolution of complex life. Science 311:1764–1767.
Raymond J, Siefert JL, Staples CR, Blankenship RE. 2004. The natural history of nitrogen fixation. Mol Biol Evol. 21:541–554. Raymond J, Swingley WD. 2008. Phototroph genomics ten years on. Photosynth Res. 97:5–19. Raymond J, Zhaxybayeva O, Gogarten JP, Blankenship RE. 2003. Evolution of photosynthetic prokaryotes: a maximum-likelihood mapping approach. Phil Trans R Soc Lond [Biol]. 358:223–230. Raymond J, Zhaxybayeva O, Gogarten JP, Gerdes SY, Blankenship RE. 2002. Whole-genome analysis of photosynthetic prokaryotes. Science 298:1616–1620. Rice P, Longden I, Bleasby A. 2000. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16:276–277. Rivera MC, Lake JA. 1992. Evidence that eukaryotes and eocyte prokaryotes are immediate relatives. Science 257:74–76. Sadekar S, Raymond J, Blankenship RE. 2006. Conservation of distantly related membrane proteins: photosynthetic reaction centers share a common structural core. Mol Biol Evol. 23:2001–2007. Sarma R, Barney BM, Hamilton TL, Jones A, Seefeldt LC, Peters JW. 2008. Crystal structure of the L protein of Rhodobacter sphaeroides light-independent protochlorophyllide reductase with MgADP bound: a homologue of the nitrogenase Fe protein. Biochemistry 47:13004–13015. Sattley WM, Blankenship RE. 2010. Insights into heliobacterial photosynthesis and physiology from the genome of Heliobacterium modesticaldum. Photosynth Res. 104:113–122. Sattley WM, Madigan MT, Swingley WD, et al. (20 co-authors). 2008. The genome of Heliobacterium modesticaldum, a phototrophic representative of the Firmicutes containing the simplest photosynthetic apparatus. J Bacteriol. 190:4687–4696. Schmidt HA, Strimmer K, Vingron M, von Haeseler A. 2002. TREEPUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502–504. Schubert WD, Klukas O, Saenger W, Witt HT, Fromme P, Krauss N. 1998. 
A common ancestor for oxygenic and anoxygenic photosynthetic systems: a comparison based on the structural model of photosystem I. J Mol Biol. 280:297–314. Shi T, Falkowski PG. 2008. Genome evolution in cyanobacteria: the stable core and the variable shell. Proc Natl Acad Sci U S A. 105:2510–2515. Singh B, Gupta RS. 2009. Conserved inserts in the Hsp60 (GroEL) and Hsp70 (DnaK) proteins are essential for cellular growth. Mol Genet Genomics 281:361–373. Swingley WD, Blankenship RE, Raymond J. 2008. Integrating Markov clustering and molecular phylogenetics to reconstruct the cyanobacterial species tree from conserved protein families. Mol Biol Evol. 25:643–654. Tang KH, Barry K, Chertkov O, et al. (15 co-authors). 2011. Complete genome sequence of the filamentous anoxygenic phototrophic bacterium Chloroflexus aurantiacus. BMC Genomics 12:334. Tice MM, Lowe DR. 2004. Photosynthetic microbial mats in the 3,416-Myr-old ocean. Nature 431:549–552. Tice MM, Lowe DR. 2006. Hydrogen-based carbon fixation in the earliest known photosynthetic organisms. Geology 34:37–40. Trost JT, Blankenship RE. 1989. Isolation of a photoactive photosynthetic reaction center-core antenna complex from Heliobacillus mobilis. Biochemistry 28:9898–9904. Van de Peer Y, De Wachter R. 1994. TREECON for Windows: a software package for the construction and drawing of evolutionary trees for the Microsoft Windows environment. Comput Appl Biosci. 10:569–570. Vassiliev IR, Antonkine ML, Golbeck JH. 2001. Iron-sulfur clusters in type I reaction centers. Biochim Biophys Acta. 1507:139–160. Vermaas WFJ. 1994. Evolution of heliobacteria: implications for photosynthetic reaction center complexes. Photosynth Res. 41:285–294. Wilmotte A, Herdman M. 2001. Phylogenetic relationships among the cyanobacteria based on 16S rRNA sequences. In: Boone DR, Castenholz RW, editors. Bergey’s manual of systematic bacteriology. New York: Springer. p. 487–493. Xiong J, Bauer CE. 2002. 
Complex evolution of photosynthesis. Annu Rev Plant Biol. 53:503–521. Xiong J, Fischer WM, Inoue K, Nakahara M, Bauer CE. 2000. Molecular evidence for the early evolution of photosynthesis. Science 289:1724–1730. Xiong J, Inoue K, Bauer CE. 1998. Tracking molecular evolution of photosynthesis by characterization of a major photosynthesis gene cluster from Heliobacillus mobilis. Proc Natl Acad Sci U S A. 95:14851–14856. Yurkov VV, Beatty JT. 1998. Aerobic anoxygenic phototrophic bacteria. Microbiol Mol Biol Rev. 62:695–724. Zhaxybayeva O, Doolittle WF, Papke RT, Gogarten JP. 2009. Intertwined evolutionary histories of marine Synechococcus and Prochlorococcus marinus. Genome Biol Evol. 1:325–339. Zhaxybayeva O, Gogarten JP, Charlebois RL, Doolittle WF, Papke RT. 2006. Phylogenetic analyses of cyanobacterial genomes: quantification of horizontal gene transfer events. Genome Res. 16:1099–1108.