Insights into the evolutionary process of genome degradation
Jan O Andersson* and Siv GE Andersson†

Studies of noncoding and pseudogene sequence diversity, particularly in Rickettsia, have begun to reveal the basic principles of genome degradation in microorganisms. Increasingly, studies of genes and genomes suggest that there has been an extensive amount of horizontal gene transfer among microorganisms. As this inflow of genetic material does not seem generally to have resulted in genome size expansions, however, degenerative processes must be at the very least as widespread as horizontal gene transfer. The basic principles of gene degradation and elimination that are being explored in Rickettsia are likely to be of major importance for our understanding of how microbial genomes evolve.

Addresses
Department of Molecular Evolution, Uppsala University, Box 590, Biomedical Center, 751 24 Uppsala, Sweden
*e-mail: [email protected]
†e-mail: [email protected]

Current Opinion in Genetics & Development 1999, 9:664–671
0959-437X/99/$ (see front matter)
© 1999 Elsevier Science Ltd. All rights reserved.

Abbreviations
Ado-Met S-Adenosylmethionine
ORF open reading frame
SFG spotted fever group
TG typhus group

Introduction
A large number of obligate intracellular parasites and symbionts, including Rickettsia, Chlamydia and Buchnera, have genome sizes in the 1 Mb range or less [1••–3••,4]. These bacteria have almost certainly evolved from bacteria with larger genome sizes. Obligate intracellular parasitism has been found in a variety of bacterial phyla, suggesting that transitions to intracellular environments have occurred a number of times independently in evolution [5•]. Thus, there seems to be a correlation between intracellular lifestyles, small genome sizes and reductive evolutionary processes. We expect this similarity in history to have left ‘footprints’ in the genome sequences of modern intracellularly replicating bacteria.

Indeed, recent work has begun to reveal how genomes of intracellular parasites deteriorate. Fundamental to this progress has been the publication of complete genome sequence data from two genera of obligate intracellular parasites: Rickettsia and Chlamydia. The 1.1 Mb genome sequence of Rickettsia prowazekii, a member of the α-Proteobacteria and the causative agent of epidemic typhus, was published last year [1••]. The 1.0–1.1 Mb genome sequences of Chlamydia trachomatis and Chlamydia pneumoniae, the causative agents of trachoma and pneumonia, respectively, were also published during the past year [2••,3••]. A comparative analysis of the gene complements in R. prowazekii and C. trachomatis has provided examples of reductive convergent evolution associated with the evolution of metabolic parasitism in response to the intracellular habitat [6•]. For example, both organisms rely on their host cells for supply of nucleoside monophosphates and seem to have discarded all genes coding for enzymes involved in de novo purine and pyrimidine biosynthesis. Overall, the relative fraction of genes allocated to different functional categories is very similar in the genomes of R. prowazekii and C. trachomatis [6•].

The reductive evolutionary processes acting on genomes of intracellular bacteria are likely to also have shaped the structures of organellar genomes [5•]. As mitochondria and α-proteobacteria are thought to share a common ancestor, there is also a very close phylogenetic link between mitochondria and Rickettsia [1••,7•,8••]. Needless to say, this does not mean that mitochondria have evolved from an ancestral bacterium with a genome like that of modern Rickettsia. More likely, the rickettsial and the mitochondrial genomes have both been reduced in size independently since they diverged from a common ancestor some 2000 million years ago [5•,7•,8••,9•]. Indeed, comparative studies of protist mitochondrial genomes suggest that individual genes have been lost many times independently in different lineages and that the flux of genes from the mitochondrion to the host nucleus is an ongoing process [8••,10].

The chloroplast genomes of non-photosynthetic plants provide a particularly apt model system for studies of degenerative processes. For example, photosynthesis has been lost secondarily in Epifagus virginiana, a plant which lives on the roots of beech trees. Not surprisingly, most genes for photosynthesis and chlororespiration have been discarded from this genome. Chloroplast genomes from this group of plants contain a multitude of pseudogenes such as the photosynthesis gene rbcL, which has been mutationally destroyed in some lineages, whereas it has either been retained or completely eliminated in others.

Another nice model system for studies of reductive evolutionary processes is provided by the nucleomorph genomes, vestigial nuclear remnants of eukaryotic algae that have established secondary endosymbiotic relationships with marine protists. The nucleomorph genomes have been reduced in size to <1 Mb and contain densely packed, co-transcribed genes which are interrupted by very short but functional introns [13,14]. Comparative analyses of the residual genes in the nucleomorph genomes are likely to yield important information about the flux and elimination of genetic information within and among eukaryotic organisms.

Figure 1
A schematic view of gene degradation in Rickettsia. The tree shows the phylogenetic relationship of a subset of species from the TG and SFG Rickettsia. Thick boxes represent functional genes and thin boxes with a Ψ-sign pseudogenes. The genes are from left to right: polA, white; hicB, light gray; metK, black; f-orf, gray; dnaE, white. The flanking genes polA and dnaE are functional and present in all species. The pseudogene status of (a–e), the ancestral species, has been inferred from the occurrence of pseudogenes in (f–j), the modern Rickettsia species. We hypothesize that (a) in the common ancestor of the SFG and the TG Rickettsia, all three genes between polA and dnaE were functional and that the inactivation of the metK gene was triggered by the invention of a transport system for Ado-Met. Early in the branch leading to the TG Rickettsia, both hicB and the f-orf were (b) inactivated and (c) eliminated. The single internal termination codon in the metK gene in (f) R. prowazekii is indicative of a recent gene-inactivation event. The spacer region between metK and dnaE in (f–g) R. prowazekii and R. typhi shows weak sequence similarity to the complete ORF in (j) R. felis. In the SFG Rickettsia, the three genes were functional at the time when (d) R. felis diverged from the other two species. However, both the metK and the f-orf genes were inactivated prior to (e) the split between R. rickettsii and R. montana. A number of deletion/insertion mutations have accumulated in (h–j) the modern lineages of the SFG Rickettsia. The numbers of deletions (del), insertions (ins) and internal termination codons (stop) are indicated. (Data taken from [16••].)

Recent studies on the evolution of pseudogene sequences in Rickettsia have now also started to yield insights into the principles of degenerative processes in microorganisms. The R.
prowazekii genome sequence is unique in this sense: only 76% of the 1.1 Mb genome has a coding function and about a dozen pseudogenes were initially identified [1••]. The high fraction of noncoding DNA has been speculated to represent remnants of ancient genes that are currently in the process of being eliminated from the genome. Here, we review recent work on the analysis of pseudogene sequence variation in the Rickettsia genomes, with additional references to similar studies now being initiated in other species.

Degradation of the gene coding for S-Adenosylmethionine synthetase…
The first indication that genome degradation is an ongoing process in Rickettsia was obtained from the identification of an internal termination codon in the metK gene, which codes for S-Adenosylmethionine synthetase. This enzyme catalyzes the biosynthesis of S-Adenosylmethionine (Ado-Met), an essential co-factor in a variety of very important cellular processes (e.g. the methylation of DNA sequences). To determine whether the termination codon was an unusual but conserved feature of a functional gene or if it was the very first sign of gene inactivation, we examined seven additional Rickettsia species for sequence variation in this region [16••].

The genus Rickettsia can be divided into two major groups: the typhus group (TG), represented by the etiological agent of epidemic typhus, R. prowazekii, and the spotted fever group (SFG), represented by the etiologic agent of Rocky Mountain spotted fever, Rickettsia rickettsii (Figure 1). Only two TG lineages, R. typhi and R. prowazekii strain B, had metK genes that comprise complete open reading frames (ORFs). In contrast, the metK genes in all members of the SFG were found to be disrupted by insertion/deletion mutations and termination codons (Figure 1; [16••]). These presumably inactivated genes were found to be subjected to increased fixation rates for substitutions at sites that cause amino acid replacements. The evidence taken together suggests strongly that metK is a non-functional, neutrally evolving pseudogene in most lineages of Rickettsia [16••].

The inactivation of metK may have been induced by a relaxation of the functional constraints acting on this gene, for example by the combined utilization of cytosolic and internally produced Ado-Met. To be able to exploit cytosolic metabolites that are not freely exchangeable over the bacterial cell membrane, however, specific import systems must be invented. For example, bacteria and eukaryotes are normally impermeable to nucleotides because of the lack of appropriate transport systems but both Rickettsia and Chlamydia are able to exploit the cytosolic ATP with the help of a unique transport system for ATP and ADP [1••–3••], which has not yet been found in any other bacteria. By analogy, it may be speculated that a transport system for Ado-Met was invented prior to the divergence of the TG and the SFG, after which both of the diverging lineages would have been free to start accumulating nucleotide and frameshift mutations.

It is interesting to note that uptake of Ado-Met has previously been demonstrated in Leishmania as well as in mitochondria [17,18]. This might be a reflection of a much more general phenomenon: whenever a bacterium makes a change in its lifestyle and habitat, some genes will become nonessential and thereby act as targets for gene inactivation events. For example, Lactococcus lactis strains isolated from dairy products have been shown to be auxotrophs for several amino acids due to frameshifts and nonsense mutations in the corresponding biosynthetic genes, while strains from nondairy products are prototrophs for the same amino acids [19,20].

… and its downstream gene…
A second pseudogene was found in the region downstream of metK. This region contains a long open reading frame in Rickettsia felis, which has a nucleotide composition pattern that is characteristic of rickettsial genes but with no sequence similarities to genes in the public databases [16••]. Remnants of this gene were detected in five additional members of the SFG. In these species, between 6 and 12 frameshift events were required to recreate ORFs with the expected codon usage patterns (Figure 1; [16••]). However, as these pseudogenes do not show any sequence similarities to genes in the public databases, there are no clues as to what the ancestral function of this gene might have been, or why it is being eliminated from the genome. It may be that its inactivation was an indirect result of a promoter mutation upstream of metK which simultaneously inactivated both genes [16••].

These inactivated gene sequences serve as a wonderful dataset for studies of neutral sequence evolution in Rickettsia. A detailed examination of the patterns of changes has shown that deletions are far more common than insertions, and on average much larger in size (Figure 2). Whereas the insertions were only 1–2 bp in size, the deletions ranged in size from 1 bp up to >1000 bp [16••]. Thus, in the long term the deletions will tend to override the insertions, which means that once a rickettsial gene has become nonfunctional it will be eliminated from the genome solely by mutational events.

Figure 2
An illustration of the patterns of changes in pseudogenes in Rickettsia. The relative frequencies (%) have been plotted against the average sizes (bp) of insertion and deletion mutations in the metK and f-orf pseudogenes. The figure shows that deletion mutations predominate over insertion mutations, both with respect to occurrences and average sizes. (Axes: relative frequency (%) against size in bp, binned as 1, 2, 3–10, 11–25, 26–500 and 501–1500, for deletions and insertions. Data taken from [16••].)

… and many, many other rickettsial genes!
The genomic regions initially associated with putative pseudogenes in R. prowazekii have by now been examined systematically for sequence variation in several other Rickettsia species. The analysis has shown that seven of the disrupted genes in R. prowazekii are also defective in one or more of the other species (JO Andersson, SGE Andersson, unpublished data). Surprisingly, out of a total of 18 genes uniquely present in members of the SFG, as many as half were found to correspond to pseudogenes. As the unique genes as well as the reconstructed pseudogenes displayed the characteristic patterns in codon usage, they did not seem to have been acquired by horizontal transfer. Rather, the analysis suggested that these genes were present in the Rickettsia lineage long before the divergence of the two groups of Rickettsia, implying that their absence from the TG must be a result of recent gene losses. Indeed, the pseudogenes in the SFG were occasionally surrounded by flanking genes, the homologs of which were separated by long intergenic regions in the TG. It is interesting to note that sequence similarities were detected for some of these intergenic regions in the TG and the corresponding genes or pseudogenes in the SFG (JO Andersson, SGE Andersson, unpublished data). These data support the notion that some of the noncoding DNA in the R. prowazekii genome corresponds to genes that have been so extensively degraded that they are no longer recognizable as genes [1••]. A rough calculation based on ~6% of the R.
prowazekii genome for which sequence data is also available for four other Rickettsia species suggests that 200–300 genes may have been lost since the divergence of the TG and SFG (JO Andersson, SGE Andersson, unpublished data).

Plasmid pseudogenes in Buchnera and Borrelia burgdorferi
Bacterial pseudogenes have also been identified on naturally occurring plasmids in Buchnera sp. and Borrelia burgdorferi. Buchnera are obligate endosymbionts of aphids. The symbiotic relationship is mutual; it has not yet been possible to cultivate the bacteria on artificial media and the aphids are either sterilized or killed by treatment with antibiotics. The role of the endosymbionts is to supply the aphids with essential amino acids. To ensure that the amino acids are produced efficiently, several amino acid biosynthetic genes have been amplified on plasmids. Recently, it has been found that these tandem repeats sometimes contain pseudogene copies that have accumulated mutations in a seemingly neutral manner, possibly through changes in the exogenous amino acid supply [23,24].

Additional examples of plasmid pseudogenes have been found in B. burgdorferi, the etiological agent of Lyme disease. This parasite contains at least 17 different plasmids. The average coding content for the plasmids, however, is only 71% and putative gene functions could be assigned for only 16% of the plasmid genes. Even more surprising, a very large number of the putative genes were found to contain frameshift and/or internal termination codons. For example, the gene coding for recombinase/invertase was present as a full-length copy on one plasmid but as many as seven copies seemed to be in various stages of degradation on the other plasmids.
Although the pseudogene status of these sequences has yet to be verified by comparative sequence analysis or by expression studies, it seems likely that the genes with frameshift and/or internal stop codons have indeed been inactivated and are no longer under purifying selection.

The outflows and inflows of DNA sequences
The outflow of DNA sequences by gene inactivation events can, in principle, be compensated for by a corresponding inflow of DNA sequences via horizontal transfers. Indeed, horizontal transfer events in free-living bacteria such as Escherichia coli have been suggested to occur much more frequently than was previously thought [26•]. It should be recalled, however, that the host cell cytoplasm is a very isolated environment, with few opportunities for intracellularly growing parasites to mix and mingle with other bacteria during their reproductive phase. It is therefore questionable whether small populations of isolated obligate intracellular parasites are as prone to horizontal transfers as large populations of free-living bacteria.

Phylogenetic studies and comparative sequence analysis have provided a few examples of putative, ancient horizontal transfers in both Rickettsia and Chlamydia [1••,2••,26•]. For example, the valyl-tRNA synthetase and lysyl-tRNA synthetase in R. prowazekii show a close phylogenetic relationship with the corresponding synthetases in the archaea rather than with their homologs in bacteria ([1••]; B Canbäck, SGE Andersson, unpublished data) but a majority of genes display the expected phylogenetic relationships to bacteria (T Sicheritz, SGE Andersson, unpublished data).

One way of quantifying the relative frequencies of horizontal gene transfers is by estimating the fraction of recently introduced genes from their atypical codon usage patterns. Indeed, it was recently inferred from such an analysis that as much as 18% of the E. coli genome may be of recent foreign origin [27•].
In striking contrast to the heterogeneity in codon usage patterns within the E. coli genome, R. prowazekii genes are extremely homogeneous in their usage of codons, with few, if any, indications of recently introduced genes ([1••]; M Remm, SGE Andersson, unpublished data). An alternative way of ‘creating’ new DNA sequences is by internal gene duplication events but both the number and sizes of gene families are much lower in the R. prowazekii genome than in the genomes of its free-living relatives. Taken together, the suggestion is that the outflow of DNA sequences is not compensated for by either externally introduced DNA or by internal gene duplications in R. prowazekii. Thus, low rates of gene influx in combination with a mutation bias for deletions will cause a gradual shrinkage in genome sizes, as expected for obligate intracellular parasites.

Furthermore, in organisms with small population sizes, recurrent bottlenecks and low rates of recombination, even mutations that are slightly deleterious to the organism may become fixed in the population. This phenomenon, which is known as Muller’s ratchet [29,30], has been most extensively studied in the genus Buchnera [31,32•–34•]. Thus, genes may be inactivated and lost either because they are no longer needed or just by coincidence even though the inactivated genes may be slightly disadvantageous to the organism. In either case, the lost gene functions will be difficult or impossible to recover again in organisms with low rates of gene inflow from other individuals, strains or species.

Intracellular parasites in different stages of genome degradation
The evolutionary transition to the intracellular environment is likely to have occurred in a series of steps that successively eliminated most of the initial gene complement. We expect that these degenerative processes are relatively fast in the early stages and then gradually slow down as the genome decreases in size.
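The expectation that degradation is fast at first and then slows down can be illustrated with a minimal sketch. It assumes, purely for illustration, that every dispensable gene is inactivated at a fixed per-step rate and that every pseudogene is deleted at a fixed per-step rate. The function name, both rates and the starting count of 1000 dispensable genes are hypothetical and are not estimates from any genome discussed in this review.

```python
def simulate_gene_loss(dispensable_genes, inactivation_rate, elimination_rate, steps):
    """Track functional dispensable genes and pseudogenes over discrete time steps.

    All parameters are illustrative assumptions, not measured values:
    - inactivation_rate: fraction of functional dispensable genes inactivated per step
    - elimination_rate: fraction of pseudogenes deleted from the genome per step
    Returns a list of (functional, pseudogenes) tuples, one per step.
    """
    functional = float(dispensable_genes)
    pseudogenes = 0.0
    history = []
    for _ in range(steps):
        inactivated = inactivation_rate * functional   # genes turning into pseudogenes
        eliminated = elimination_rate * pseudogenes     # pseudogenes erased by deletions
        functional -= inactivated
        pseudogenes += inactivated - eliminated
        history.append((functional, pseudogenes))
    return history


if __name__ == "__main__":
    # Hypothetical scenario: inactivation outpaces elimination.
    history = simulate_gene_loss(1000, 0.05, 0.02, 200)
    for step in (0, 25, 50, 100, 199):
        functional, pseudogenes = history[step]
        print(f"step {step + 1:3d}: functional={functional:7.1f} pseudogenes={pseudogenes:6.1f}")
```

Because the number of genes lost per step is proportional to the number still present, the absolute loss is largest at the start and shrinks as the pool of dispensable genes is depleted. When inactivation is faster than elimination, pseudogenes first pile up and only later decline, a pattern that, under these assumptions, matches the two-stage picture of inactivation followed by elimination.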
Indeed, the obligate intracellular parasite Mycobacterium leprae has a genome with a size of 2.8 Mb . This is the largest genome known for an obligate intracellular parasite and its ancestral genome may have been even larger, possibly as large as the 4.4 Mb genome of its close relative Mycobacterium tuberculosis . The M. leprae genome seems to be in an early, rapid phase of degradation, as inferred from both its large genome size and from the observation that as much as 3.5% of the possible protein coding regions contain multiple frameshift and/or in-frame termination codons . The presence of pseudogenes and a large fraction of noncoding DNA in the R. prowazekii genome suggests that genes are currently being inactivated at a higher rate than they are being eliminated ([1••,16••]; JO Andersson, SGE Andersson, unpublished data). However, once an equilibrium has been reached, such that the rate of gene inactivation is significantly lower than the rate at which genes are being degraded, the coding content should increase up to a level of ~90%, as seen for a majority of the bacterial genomes sequenced so far. Indeed, C. trachomatis has a coding content of 90% and no identifiable pseudogenes [2••]. This might indicate that C. trachomatis has already reached the final stage of its adaptation to the host-cell environment or that the rate of degradation is much faster in Chlamydia than in Rickettsia. The finding that C. pneumoniae has 214 protein genes that are not present in C. trachomatis [3••] suggests that there is a significant rate of gene turnover also in the Chlamydia genomes. It is possible, however, that Chlamydia has a more efficient system for removing nonfunctional genes, which would make it more difficult to identify pseudogenes at any given time point. Indeed, it is interesting to note that both metK and spoT/relA, which are present as pseudogenes in the R. prowazekii genome, have already been completely eliminated from the C. 
trachomatis genome ([1••,2••,16••]; JO Andersson, SGE Andersson, unpublished data).

Genome sequences are only snapshots in evolutionary time and space!
It is argued increasingly that horizontal transfers occur at such a high rate that it may not be possible to reconstruct organismal relationships on the basis of individual gene sequences [37•,38–40,41•,42•]; but if horizontal transfers are indeed as common as suggested, the sizes of microbial genomes would grow indefinitely! As genomes apparently do not grow in such an uncontrolled fashion, either the frequencies of horizontal transfers have been overestimated or the transfers are compensated for by an equally frequent occurrence of degenerative processes.

For simplicity, it can be assumed that the size of a genome is the net result of the rate at which sequences are being acquired versus the rate at which sequences are being eliminated. These rates depend on the bias for different types of mutations (horizontal transfers, duplications and deletions) as well as on the strength of any selection on genome size. Similarly, the coding content of a genome is determined by the fixation rate for gene inactivation events and by how long a gene that is no longer under purifying selection remains in the genome as a pseudogene. Cleaning up pseudogene sequences solely by random mutations requires that deletions predominate over insertions, both in frequencies of occurrence and in average sizes. Indeed, it has been argued that a high rate and large average size of deletions in Drosophila compared to mammals may explain the lack of pseudogenes in Drosophila, as well as the differences in genome size between the two lineages [43,44,45••]. Likewise, the sizes and coding contents of microbial genomes probably reflect the rates and sizes of horizontal transfers as well as of internal duplication and deletion events.
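The requirement that deletions outweigh insertions in both frequency and size can be made concrete with a small calculation. The sketch below computes the expected net length change per insertion/deletion event from a size spectrum, and from that a rough number of events needed to erase a pseudogene. The function names and the example spectrum are hypothetical; the numbers are invented for illustration and are not the data of Figure 2 or of [16••].

```python
import math


def expected_net_change(spectrum):
    """Expected change in sequence length (bp) per indel event.

    spectrum: list of (kind, size_bp, relative_frequency) tuples, where kind
    is "del" for a deletion; any other value is treated as an insertion.
    Frequencies are normalized internally, so they need not sum to 1.
    """
    total = sum(freq for _, _, freq in spectrum)
    if total <= 0:
        raise ValueError("relative frequencies must sum to a positive value")
    net = 0.0
    for kind, size_bp, freq in spectrum:
        sign = -1 if kind == "del" else 1
        net += sign * size_bp * (freq / total)
    return net


def events_to_erase(length_bp, spectrum):
    """Rough number of indel events needed to remove length_bp of sequence.

    Returns None when the spectrum is not deletion-biased, i.e. when random
    indels alone would not shrink the sequence on average.
    """
    net = expected_net_change(spectrum)
    if net >= 0:
        return None
    return math.ceil(length_bp / -net)


# Invented spectrum: small insertions, deletions with a long tail.
HYPOTHETICAL_SPECTRUM = [
    ("del", 1, 0.30),
    ("del", 5, 0.25),
    ("del", 100, 0.15),
    ("ins", 1, 0.20),
    ("ins", 2, 0.10),
]

if __name__ == "__main__":
    print(expected_net_change(HYPOTHETICAL_SPECTRUM))
    print(events_to_erase(1200, HYPOTHETICAL_SPECTRUM))
```

In this toy spectrum the rare large deletions dominate the expected change, which is why both the frequency and the average size of deletions matter. A spectrum with no net deletion bias returns `None`, corresponding to the case in which pseudogenes would persist rather than being cleaned up by mutation alone.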
A rigorous phylogenetic study based on a set of 312 orthologous genes from six completely sequenced prokaryotic genomes has suggested that the transfer of genetic material occurs continuously . Furthermore, the complete genome sequence of the bacterium Thermotoga maritima has revealed that as much as 24% of the genes were most similar to archaeal genes . Many of these were clustered in the genome, which was taken as evidence for extensive lateral transfer from the Archaea to T. maritima . Finally, it has been estimated that as much as 18% of the E. coli genome may be of recent foreign origin [27•]. Indirect evidence for horizontal transfers in E. coli has also been obtained from the striking differences in genome sizes of natural isolates, sometimes by as much as 1 Mb [41•]. Indeed, a sequence analysis of the accessory DNA in the genomes of different strains of the E. coli reference collection has shown that the strain-specific genes are mostly genes of exogenous origin [42•]. Thus, lateral gene transfer seems to be an important mechanism for generating genomic variants, although the extent to which it occurs in individual species remains to be determined. This means that the basic principles of gene inactivation, degradation and elimination that we have started to glimpse in the Rickettsia genomes are processes that are probably far more general than what has been appreciated to date. Thus, it is important to recognize that the sequence of any individual genome is only a snapshot in evolutionary time and space. To really understand the dynamics of genomes, we need to understand the balance as well as the processes whereby new genes are being acquired and old genes are being removed. Such information can only be obtained through vigorous, comparative analyses of closely related strains and species. Here, the situation is encouraging: the genomes of several closely related strains and species are currently under investigation [3••,46••]. 
This kind of knowledge will be crucial for how new genome sequence data is interpreted in general and for evaluating hypotheses of horizontal gene transfer in particular. Indeed, we are convinced that as scientists begin to inspect genomes from a comparative, evolutionary perspective, many more examples of degenerative processes will be obtained from a large variety of different microorganisms.

Conclusions
The cytoplasm of a eukaryotic cell is an extreme growth environment. When a free-living bacterium changes lifestyle to become an obligate intracellular parasite or symbiont, the genomic consequences are enormous. For example, the ability to exploit host-cell metabolites will immediately lead to a reduced level of purifying selection on a large set of the genes involved in small molecule biosynthesis. These genes will disappear from the genome at a rate set by the balance between the insertion/deletion mutation bias and the strength of selection acting on the size of the genome. During this process the number of pseudogenes will gradually increase and the coding content decrease until a new steady state has been reached.

The very first evidence for reductive evolutionary processes acting on the genomes of obligate intracellular parasites was obtained from the metK pseudogene, which contains an internal termination codon in R. prowazekii and numerous short insertions and deletions in the SFG Rickettsia. Comparative analyses of several other pseudogenes have since confirmed that there is a continuous outflow of gene sequences from the Rickettsia genomes. In total, we have estimated that R. prowazekii may have lost ~200–300 genes since its divergence from the SFG.

The basic principles of gene deterioration upon shifts to intracellular environments may apply to changes of lifestyles in general. Indeed, it seems likely that long-term shifts to new growth habitats render subsets of genes nonessential and these will eventually be eliminated. How many pseudogenes can be detected at any given time-point is largely dependent upon the intrinsic insertion/deletion mutation bias. If insertions and deletions are rare compared to point mutations, non-functional genes may remain in the genome for a long period of time.

This process has profound effects on the way in which microbial genomes evolve. The loss of genetic information may, in principle, be equilibrated by a corresponding level of horizontally transferred genes that are more beneficial for growth in the new environment. However, the relative rates of gains and losses of genes may vary substantially in different microbial genomes, which could provide an explanation for the over ten-fold variation in genome sizes. Unfortunately, single genome sequences provide very few clues about the extent to which genes are being shuffled into and out of the genome. Resolution of these issues can only be obtained by comparative sequencing of closely related strains and species. Elegant experiments can then be designed to fully explore the delicate balance of genome shrinkage and expansion in different microbial lineages.

Acknowledgements
The authors’ work is supported by the National Science Research Council, the Knut and Alice Wallenberg Foundation and the Swedish Foundation for Strategic Research.

References and recommended reading
Papers of particular interest, published within the annual period of review, have been highlighted as:
• of special interest
•• of outstanding interest

1. •• Andersson SGE, Zomorodipour A, Andersson JO, Sicheritz-Ponten T, Alsmark UCM, Podowski RM, Naslund AK, Eriksson A-S, Winkler HH, Kurland CG: The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature 1998, 396:133-140.
The complete genome sequence of the obligate intracellular parasite Rickettsia prowazekii.
One of the most remarkable aspects of this genome is its high non-coding content (24%) and the presence of several pseudogenes. The non-coding DNA is speculated to represent remnants of genes that are in their final stages of elimination.

2. •• Stephens RS, Kalman S, Lammel C, Fan J, Marathe R, Aravind L, Mitchell W, Olinger L, Tatusov RL, Zhao Q et al.: Genome sequence of an obligate intracellular pathogen of humans: Chlamydia trachomatis. Science 1998, 282:754-759.
The complete genome sequence of the obligate intracellular parasite C. trachomatis. The genome lacks many genes for biosynthetic capabilities but encodes an intact glycolytic pathway. In addition, it contains genes coding for an ATP transport system, which enables C. trachomatis to exploit cytosolic ATP as a source of energy.

3. •• Kalman S, Mitchell W, Marathe R, Lammel C, Fan J, Hyman RW, Olinger L, Grimwood J, Davis RW, Stephens RS: Comparative genomes of Chlamydia pneumoniae and C. trachomatis. Nat Genet 1999, 21:385-389.
The first comparison of two closely related obligate intracellular parasites. The analysis shows that 214 protein coding sequences are uniquely present in the larger genome of C. pneumoniae. The unique genes are dispersed throughout the chromosome.

4. Charles H, Ishikawa H: Physical and genetical map of the genome of Buchnera, the primary endosymbiont of the pea aphid Acyrthosiphon pisum. J Mol Evol 1999, 48:142-150.

5. • Andersson SGE, Kurland CG: Reductive evolution of resident genomes. Trends Microbiol 1998, 6:263-278.
Genome evolution of intracellular bacteria resembles the evolution of organelles in many ways. This review discusses the evolutionary forces acting on genomes that replicate within the cytoplasm of a eukaryotic host cell. The effects of these reductive forces on genome sizes, architectures and nucleotide substitution rates are discussed.

6. • Zomorodipour A, Andersson SGE: Obligate intracellular parasites: Rickettsia prowazekii and Chlamydia trachomatis. FEBS Lett 1999, 452:11-15.
This review discusses a comparative analysis of the obligate intracellular parasites R. prowazekii and C. trachomatis. These organisms are not phylogenetically related and it is generally thought that they have adapted to the intracellular environment independently of each other. Both organisms have small genome sizes, few biosynthetic genes and similar fractions of genes allocated to the different functional categories; however, the identity of the genes within the functional categories differs. The most striking difference is that the C. trachomatis genome has a coding content of 89.5%, whereas the R. prowazekii genome has a coding content of only 75.4%.

7. • Sicheritz-Ponten T, Kurland CG, Andersson SGE: A phylogenetic analysis of the cytochrome b and cytochrome c oxidase I genes supports an origin of mitochondria from within the Rickettsiaceae. Biochim Biophys Acta 1998, 1365:545-551.
This is a phylogenetic study based on cytochrome c oxidase I and cytochrome b. The analysis reveals a close phylogenetic relationship between mitochondria and α-proteobacteria in general and between mitochondria and the group of bacteria to which R. prowazekii belongs in particular.

8. •• Gray MW, Burger G, Lang BF: Mitochondrial evolution. Science 1999, 283:1476-1481.
An interesting discussion of mitochondrial origin and evolution. Of special interest for the purpose of this review is that all sequenced mitochondrial genomes can be divided into two types: ‘the conserved’ and ‘the derived’. The implication is that there was a first rapid phase of degradation during which a majority of the initial genes were lost, resulting in mitochondrial genomes with similarities to the conserved type of mitochondrial genomes, such as those found in protists. In some lineages, a second phase of degradation occurred, which resulted in additional gene losses, accelerated mutation rates and non-standard genetic codes.
The mammalian mitochondrial genomes are examples of highly derived genomes.

9. • Gray MW: Rickettsia, typhus and the mitochondrial connection. Nature 1998, 396:109-110.
A ‘News and Views’ piece stressing the striking similarities between R. prowazekii and modern mitochondria. The loss of genetic information is most likely a result of convergent reductive evolution, as their common ancestor was almost certainly a free-living microorganism with a larger genome size.

10. Gray MW, Lang BF, Cedergren R, Golding GB, Lemieux C, Sankoff D, Turmel M, Brossard N, Delage E, Littlejohn TG et al.: Genome structure and gene content in protist mitochondrial DNAs. Nucleic Acids Res 1998, 26:865-878.

11. Wolfe KH, Morden CW, Palmer JD: Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci USA 1992, 89:10648-10652.

12. dePamphilis CW, Young ND, Wolfe AD: Evolution of plastid gene rps2 in a lineage of hemiparasitic and holoparasitic plants: many losses of photosynthesis and complex patterns of rate variation. Proc Natl Acad Sci USA 1997, 94:7367-7372.

13. Gilson PR, McFadden GI: The miniaturized nuclear genome of a eukaryotic endosymbiont contains genes that overlap, genes that are cotranscribed, and the smallest known spliceosomal introns. Proc Natl Acad Sci USA 1996, 93:7737-7742.

14. Gilson PR, McFadden GI: Good things in small packages: the tiny genomes of chlorarachniophyte endosymbionts. Bioessays 1997, 19:167-173.

15. Andersson JO, Andersson SGE: Genomic rearrangements during evolution of the obligate intracellular parasite Rickettsia prowazekii as inferred from an analysis of 52015 bp nucleotide sequence. Microbiology 1997, 143:2783-2795.

16. •• Andersson JO, Andersson SGE: Genome degradation is an ongoing process in Rickettsia. Mol Biol Evol 1999, 16:1178-1191.
The first detailed, comparative analysis of pseudogene sequence evolution in microorganisms.
The analysis shows that genes which have been inactivated by frameshift mutations and/or termination codons in the Rickettsia genomes have strongly elevated fixation rates for mutations at sites that cause amino acid replacements, which demonstrates that there is no purifying selection acting on the identified pseudogenes. The analysis also shows that deletions predominate over insertions in these neutrally evolving sequences, indicating that an inactivated gene will gradually accumulate substitutions and short deletions until it is no longer recognizable and/or until it is totally eliminated.

17. Avila J, Polegre MA: Uptake and metabolism of S-adenosyl-L-methionine by Leishmania mexicana and Leishmania braziliensis promastigotes. Mol Biochem Parasitol 1993, 58:123-134.

18. Horne DW, Holloway RS, Wagner C: Transport of S-adenosylmethionine in isolated rat liver mitochondria. Arch Biochem Biophys 1997, 343:201-206.

19. Godon JJ, Delorme C, Bardowski J, Chopin MC, Ehrlich SD, Renault P: Gene inactivation in Lactococcus lactis: branched-chain amino acid biosynthesis. J Bacteriol 1993, 175:4383-4390.

20. Delorme C, Godon JJ, Ehrlich SD, Renault P: Gene inactivation in Lactococcus lactis: histidine biosynthesis. J Bacteriol 1993, 175:4391-4399.

21. Baumann P, Baumann L, Lai C-Y, Rouhbakhsh D, Moran N, Clark MA: Genetics, physiology, and evolutionary relationships of the genus Buchnera: intracellular symbionts of aphids. Annu Rev Microbiol 1995, 49:55-94.

22. Lai CY, Baumann L, Baumann P: Amplification of trpEG: adaptation of Buchnera aphidicola to an endosymbiotic association with aphids. Proc Natl Acad Sci USA 1994, 91:3819-3823.

23. Lai CY, Baumann P, Moran N: The endosymbiont (Buchnera sp.) of the aphid Diuraphis noxia contains plasmids consisting of trpEG and tandem repeats of trpEG pseudogenes. Appl Environ Microbiol 1996, 62:332-339.

24. Baumann L, Clark MA, Rouhbakhsh D, Baumann P, Moran NA, Voegtlin DJ: Endosymbionts (Buchnera) of the aphid Uroleucon sonchi contain plasmids with trpEG and remnants of trpE pseudogenes. Curr Microbiol 1997, 35:18-21.

25. Fraser CM, Casjens S, Huang WM, Sutton GG, Clayton R, Lathigra R, White O, Ketchum KA, Dodson R, Hickey EK et al.: Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature 1997, 390:580-586.

26. • Wolf YI, Aravind L, Koonin EV: Rickettsiae and Chlamydiae: evidence of horizontal gene transfer and gene exchange. Trends Genet 1999, 15:173-175.
This analysis of the genomes of the intracellular parasites R. prowazekii and C. trachomatis shows that a total of 16 and 26 proteins, respectively, are most similar to their eukaryotic homologs. The genes coding for these proteins may have been obtained by horizontal transfer. It would be interesting to examine well sampled phylogenetic trees based on these proteins to infer when and from which organisms the putative transfers occurred.

27. • Lawrence JG, Ochman H: Molecular archaeology of the Escherichia coli genome. Proc Natl Acad Sci USA 1998, 95:9413-9417.
This paper discusses frequencies of horizontal transfers in the E. coli genome. The analysis utilizes parameters such as the G+C contents of the first and third codon positions, χ² values of codon usage biases and codon adaptation indices to distinguish between ‘native’ E. coli genes and genes which have been introduced recently from another genome with different base composition and/or codon usage patterns. It is concluded that ~18% of the current E. coli chromosome is of foreign origin and has been introduced recently.

28. Andersson SGE, Sharp PM: Codon usage and base composition in Rickettsia prowazekii. J Mol Evol 1996, 42:525-536.

29. Muller HJ: The relation of recombination to mutational advance. Mutat Res 1964, 1:2-9.

30. Felsenstein J: The evolutionary advantage of recombination. Genetics 1974, 78:737-756.

31. Moran NA: Accelerated evolution and Muller’s rachet in endosymbiotic bacteria. Proc Natl Acad Sci USA 1996, 93:2873-2878.

32. • Brynnel EU, Kurland CG, Moran NA, Andersson SGE: Evolutionary rates for tuf genes in endosymbionts of aphids. Mol Biol Evol 1998, 15:574-582.
This paper shows that both synonymous and non-synonymous substitution rates are higher in intracellularly replicating symbionts (Buchnera) than in free-living microorganisms (E. coli and S. typhimurium). The intrinsic mutation rates for the two lineages were estimated to be very similar, however, suggesting that the fixation rates for synonymous and non-synonymous mutations are significantly higher in the endosymbionts. The results are related to the absence of codon preferences in Buchnera and to the influence of Muller’s ratchet on small asexual populations.

33. • Lambert JD, Moran NA: Deleterious mutations destabilize ribosomal RNA in endosymbiotic bacteria. Proc Natl Acad Sci USA 1998, 95:4458-4462.
By examining the free energy of the 16S rRNA genes in a number of bacteria, it has been shown that endosymbiotic bacteria, such as Buchnera, have reduced rRNA stabilities compared to their free-living relatives. The results suggest that endosymbiotic bacteria may accumulate slightly deleterious mutations, probably as a result of their asexuality and small population sizes.

34. • Wernegreen JJ, Moran NA: Evidence for genetic drift in endosymbionts (Buchnera): analyses of protein-coding genes. Mol Biol Evol 1999, 16:83-97.
This paper demonstrates that there is either no selection or only very weak selection for codon bias in Buchnera. Furthermore, the authors show that many genes in Buchnera seem to have accumulated slightly deleterious mutations at sites that cause amino acid replacements, consistent with a decreased effectiveness of purifying selection at these sites. The extent to which the strong codon bias in E. coli and Salmonella typhimurium and the strong composition bias towards A+T nucleotides in Buchnera may have affected the results is unclear.

35. Smith DR, Richterich P, Rubenfield M, Rice PW, Butler C, Lee HM, Kirst S, Gundersen K, Abendschan K, Xu Q et al.: Multiplex sequencing of 1.5 Mb of the Mycobacterium leprae genome. Genome Res 1997, 7:802-819.

36. Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, Gordon SV, Eiglmeier K, Gas S, Barry CE III et al.: Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 1998, 393:537-544.

37. • Martin W: Mosaic bacterial chromosomes: a challenge en route to a tree of genomes. Bioessays 1999, 21:99-104.
A commentary on [27•] discussing the profound impact their results will have if the estimated rates of horizontal transfers in E. coli are correct and if they can be generalized to all bacterial genomes throughout evolutionary time. It is argued that bacterial genomes should be viewed as dynamic rather than static structures in which genes come and go in a continual manner.

38. Doolittle WF: Phylogenetic classification and the universal tree. Science 1999, 284:2124-2129.

39. Jain R, Rivera MC, Lake JA: Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci USA 1999, 96:3801-3806.

40. Nelson KE, Clayton RA, Gill SR, Gwinn ML, Dodson RJ, Haft DH, Hickey EK, Peterson JD, Nelson WC, Ketchum KA et al.: Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature 1999, 399:323-329.

41. • Bergthorsson U, Ochman H: Distribution of chromosome length variation in natural isolates of Escherichia coli. Mol Biol Evol 1998, 15:6-16.
A comparative, experimental study which supports the idea that the E. coli chromosome is a highly dynamic structure with high rates of genetic material inflow and outflow.
The genome length variations seen in natural isolates of E. coli seem to have been generated by multiple changes throughout the genome. It is argued that the major source of variation is related to horizontal transfer events.

42. • Hurtado A, Rodriguez-Valera F: Accessory DNA in the genomes of representatives of the Escherichia coli reference collection. J Bacteriol 1999, 181:2548-2554.
Fragments generated by random amplified polymorphic DNA which were not found in all strains of the E. coli reference collection were analysed. It is shown that most of this strain-specific DNA has base composition patterns and sequence similarities which are consistent with an exogenous origin.

43. Petrov DA, Lozovskaya ER, Hartl DL: High intrinsic mutation rate of DNA loss in Drosophila. Nature 1996, 384:346-349.

44. Petrov DA, Hartl DL: Trash DNA is what gets thrown away: high rate of DNA loss in Drosophila. Gene 1997, 205:279-289.

45. •• Petrov DA, Hartl DL: High rate of DNA loss in the Drosophila melanogaster and Drosophila virilis species groups. Mol Biol Evol 1998, 15:293-302.
Non-LTR transposable elements have been used to study patterns of spontaneous mutations in Drosophila. The most remarkable aspect of this paper is that deletions were found to be much larger and much more frequent than insertions. It is also shown that deletions in Drosophila are larger and more frequent than deletions in mammals. The authors have estimated that the half-life of a pseudogene is 14 million years in Drosophila as compared to 880 million years in mammals. The results may explain the rarity of pseudogenes in Drosophila, as well as the large differences in genome sizes in eukaryotes.

46. •• Alm RA, Ling LL, Moir DT, King BL, Brown ED, Doig PC, Smith DR, Noonan B, Guild BC, deJonge BL et al.: Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature 1999, 397:176-180.
This is the first comparison at the genomic level of two strains of H. pylori. The overall genomic organization, gene order and coding content of the two strains are quite similar. The analysis shows that 6–7% of the genes are uniquely present in each strain, almost half of which are clustered in the hypervariable region.