Syst. Biol. 50(4):479–496, 2001

Catalyzing Bacterial Speciation: Correlating Lateral Transfer with Genetic Headroom

JEFFREY G. LAWRENCE
Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA; E-mail: [email protected]

Abstract. Unlike crown eukaryotic species, microbial species are created by continual processes of gene loss and acquisition promoted by horizontal genetic transfer. The amounts of foreign DNA in bacterial genomes, and the rate at which this DNA is acquired, are consistent with gene transfer as the primary catalyst for microbial differentiation. However, the rate of successful gene transfer varies among bacterial lineages. The heterogeneity in foreign DNA content is directly correlated with the amount of genetic headroom intrinsic to a bacterial species. Genetic headroom reflects the amount of potentially dispensable information (reflected in codon usage bias and codon context bias) that can be transiently sacrificed to allow experimentation with functions introduced by gene transfer. In this way, genetic headroom offers a potential metric for assessing the propensity of a lineage to speciate. [Bacteria; codon usage bias; compositional bias; evolution; genome; horizontal transfer; speciation.]

The majority of free-living organisms in the biosphere are Bacteria and Archaea, which comprise the two domains of “prokaryotic” microbes. Among extant organisms, the majority of the genetic variation at loci encoding rRNA, ATPase, elongation factors, aminoacyl-tRNA synthetases, the protein export translocon, and other universally distributed and functionally constrained molecules is found within the prokaryotic domains (Gogarten et al., 1989; Iwabe et al., 1989; Woese et al., 1990; Brown et al., 1997; Gribaldo and Cammarano, 1998).
In addition, the bulk of physiological variation is found among these microbes as well, members of which thrive in virtually every environment examined, including superheated hydrothermal vents, Antarctic ice floes, salt-saturated alkaline pools, acid hot springs, distilled water reservoirs, and the upper atmosphere. Virtually any substance that can be oxidized, either organic or inorganic, can be exploited as an energy source by a microbe. Analyses of reassociation curves for DNA extracted from soil bacteria, which reflect the diversity of bacterial species in the sample (Torsvik et al., 1990), have led to conservative estimates that Bacteria and Archaea comprise more than 1,000,000,000 species within the biosphere (Dykhuizen, 1998) (note that the term “bacteria” is used to connote both Bacteria and Archaea). Over their substantial evolutionary history, microbial taxa have diversified and speciated to exploit an ever-increasing array of complex environments.

Speciation can be defined as the process by which one group of organisms propagates into two groups of organisms, each distinct from the other and, in many cases, from their common parental stock (King, 1993; Vrba, 1985). The creation of two daughter species from a single parental species is thought to be correlated with, although not necessarily caused by, the differential partitioning of parental genetic variation in the two descendant groups. This process allows for phenotypic distinction between incipient species and serves to define each group of organisms in terms of its own ecological capabilities. Yet the working criteria for defining and describing a species remain delightfully contentious issues, and many species concepts have been developed to categorize organisms into groups that reflect some relevant aspects of their biology. Almost uniformly, mechanisms driving the speciation process have been postulated in the context of diploid, obligatorily sexual eukaryotic species.
However, Bacterial and Archaeal cells are typically haploid, bear single chromosomes (although see Casjens [1998]), and reproduce by binary fission in life cycles that lack requisite sexual exchange. These features preclude the adoption of conventional speciation mechanisms, which rely on facets of eukaryotic biology that are absent or irrelevant among Bacterial and Archaeal taxa. More importantly, most proposed speciation mechanisms fail to include important aspects of Bacterial evolution that have little or no impact on the evolution of most eukaryotic species. Broad descriptions that attempt to address speciation processes both in Bacteria and in Eukarya are invariably too vague to be of use in defining speciation in either class of organisms.

Here I provide a framework for examining speciation in the Bacteria and Archaea, a process that differs fundamentally from speciation in Eukaryotes. These differences can be attributed to the marked contrast in reproductive mechanisms, the frequency and necessity of intraspecific recombination, the frequency and utility of interspecific recombination, mechanisms of gene expression, population size, genetic diversity, and the biochemical foundation of phenotypic differentiation between these groups of organisms. As detailed below, the nature of a Bacterial species provides a context for understanding Bacterial speciation.

THE PROBLEM WITH SPECIES CONCEPTS

The Biological Species Concept

The earliest descriptions of biological species relied on a “type specimen” concept, reflective of the Platonic idealized form (that is, all earthly items are projections of their perfect, heavenly forms), and thereby categorized organisms on the basis of their overall morphological similarity (Linnaeus, 1742). However, bacterial cells lack the rich sets of structural characteristics that allow partitioning of eukaryotic organisms into relevant groupings.
As a result, classification and taxonomy of prokaryotes lagged behind the description of eukaryotic organisms. These difficulties were exacerbated by complications in identifying synapomorphic characters, especially when using variable biotypic markers. The limitations of taxonomic classification in rigorously defining the most fundamental of groups, the species, were somewhat circumvented by the Biological Species Concept (BSC), which defined a species as a population of organisms that share a common gene pool (Mayr, 1942, 1954, 1963). This idea is rooted in the observation that intraspecific recombination is obligatory upon syngamy among sexually reproducing eukaryotes; in those organisms, interspecific recombination is almost completely absent, in that cross-species syngamy is rarely possible or fruitful.

The limitations of the BSC are evident when considering organisms that do not exhibit obligate genetic exchange. As a result, factors influencing bacterial speciation remained unexplored, partly because it was not clear how to define a microbial species. Yet extensive population genetic surveys (Hartl and Dykhuizen, 1984; Milkman and Stoltzfus, 1988; Milkman and Bridges, 1990, 1993; Dykhuizen and Green, 1991; Guttman and Dykhuizen, 1994a; Milkman, 1997; Selander et al., 1996) have demonstrated that intraspecific recombination among strains of the enteric bacterium Escherichia coli, or among strains of its sister species Salmonella enterica, occurs far more often than exchange between these lineages. Dykhuizen and Green (1991) proposed that microbial species could be effectively and empirically defined as groups within which gene trees would not be congruent (because of recombination) but between which gene trees would be congruent.
However, the prevalence of horizontal genetic transfer among Bacterial and Archaeal taxa (Syvanen and Kado, 1998; Doolittle, 1999a, 1999b), and even between Bacteria and Plants or between Bacteria and Fungi (Buchanan-Wollaston et al., 1987; Heinemann and Sprague, 1989; Figge et al., 1999), necessitates a more formal definition of “common gene pool” for prokaryotes, given that gene transfer across vast phylogenetic distances suggests all organisms share the same gene pool to some extent. The BSC could be revised to describe a species as a group of organisms that mutually exchange genes substantially more often with each other than with other organisms. However, the rates of intraspecific recombination are tremendously variable among bacterial taxa (Maynard Smith et al., 1993), rendering a simple threshold for discriminating between intraspecific and interspecific recombination impractical. Thus, additional criteria are required to distinguish microbial species and to make inferences regarding their origins.

The Ecological Species Concept

The Ecological Species Concept (ESC) is practical in describing a species as a group of organisms that exploit the same ecological niche (Van Valen, 1976). Regardless of common gene pools, mate recognition (Paterson, 1985), or evolutionary trajectory (Wiley, 1978), if two sets of organisms are ecologically identical (that is, if they are attempting to occupy the same ecological niche at the same time in the same place), stochastic processes will inevitably result in one group being displaced by the other. Unlike macroscopic eukaryotic taxa, microorganisms are typically not constrained by the physical or geographic barriers that can allow otherwise identical groups of organisms to develop ecological differences very slowly, or not at all. The ESC acts somewhat to move the task of discriminating between species to the task of discriminating among distinct, persistent ecological niches.
Yet parameters defining ecological niches, such as physiological capabilities and environmental tolerances, can be evaluated empirically in microbial taxa, or can be predicted from genomic sequence information (for example, the enteric bacterium E. coli grows best at 37°C, respires to numerous anaerobic electron acceptors, and degrades milk sugar, suiting it well for life in mammalian guts; in contrast, the related bacterium Serratia marcescens cannot degrade milk sugar and grows best at 26°C). Moreover, the process of speciation is readily applied to bacterial taxa when it can be defined as an organism’s adoption or invasion of a novel ecology.

A Unified Bacterial Species Concept

In practical ways, a bacterial species can be defined by using the modification of the BSC detailed above, invoking the ESC as the arbiter of natural selection. In this fashion, a bacterial species is defined as a group of organisms that exploit a common ecology and, as a result, exhibit effective rates of recombination that are greater among members of the group than with other organisms; that is, although interspecific recombination may occur, the resulting recombinants are, on average, less successful because the hybrids do not effectively exploit either parental environment (see below). Over time, the divergence of nucleotide sequences reduces the likelihood of homologous DNA exchange between lineages, effectively imposing premating genetic isolation. This species concept is sufficiently flexible to accommodate the huge variance in the tempo and mode of recombination among bacterial species. Moreover, it makes predictions regarding population structure and functional diversity that are testable in the laboratory environment.

Under this framework, mechanisms of speciation would entail substantial changes in ecology among subpopulations of a species; as a result, incipient species would form as the rates of intergroup recombination decreased. These speciation mechanisms are discussed below.
The fine-scale dynamics that mediate the propagation of incipient species, or their demise, are discussed elsewhere (Cohan, 2001).

FUNCTIONAL ECOLOGY AND METABOLIC DIVERSITY

Metabolism Is Ecology

Because the origin of a distinct bacterial lineage is contingent on the exploitation of a novel ecological niche, this process plays a key role in understanding bacterial speciation. Although a description of an organism’s ecology includes complex evaluations of environmental tolerances, food sources, habitat selection, and interactions with other species, microbial ecologies nonetheless can be described and predicted on a molecular and genetic level. That is, the presence of a gene, and the implementation of a gene product, can often be equated with ecological capabilities. This intimacy of microbial physiology with functional ecology reflects primarily the size of microorganisms.

Bacterial adaptation to the consumption of novel food sources typically entails the use of novel biochemistries. For example, consumption of lactose as a carbon source requires the action of a β-galactosidase; expression of this enzyme indicates an organism has a role for β-galactoside degradation in its lifestyle, whereas lack of this enzyme activity precludes exploitation of that resource. The mere redeployment of existing metabolic pathways, or subtle alteration of morphology, will not allow a bacterium to hydrolyze an unfamiliar glycosidic bond. Rather, the exploitation of novel environments often requires the use of novel gene products.

In contrast, consider differentiation among Darwin’s finches, where food selection can be predicted by the size and shape of the beak. Although these organisms may select morphologically different substances as food, all of them are using the same underlying physiology and biochemistry to utilize these substrates (carbohydrates, fats, and proteins).
So, although ecological adaptation among these finches may entail the consumption of different types of food, novel biochemistry is not involved. Rather, one would predict that differential regulation and implementation of developmental processes would result in the necessary alterations of skeletal, muscular, and digestive systems to accommodate the novel food sources.

As expected, the physiological differentiation between bacterial species can be dramatic, with some organisms exhibiting complex characteristics that may be completely absent from closely related organisms but shared with distantly related organisms. For example, the enteric bacterium E. coli is characterized by its ability to ferment lactose by using the LacZ β-galactosidase, a function found in many bacteria, but absent from the closely related species Salmonella enterica. Similarly, Salmonella can degrade citrate and propanediol, synthesize coenzyme B12 de novo, and reduce thiosulfate to H2S, all characteristics lacking in E. coli. The acquisition of such novel metabolic and physiological capabilities could catalyze the process of bacterial speciation.

Reinventing the Wheel

How did these novel capabilities arise? Classic models for the evolution of novel functions typically invoke gene duplication and divergence. For example, a β-galactosidase may evolve by duplication of the resident gene, placing one of the genes under relaxed selection for function. This duplicated gene would then be free to mutate and to adopt new functions. Yet this model is somewhat unsatisfying, in that purifying selection cannot prevent the immediate mutation or deletion of the duplicated gene. Moreover, loss of duplicated genes will be selectively advantageous if aberrant gene dosage is problematic, or if mutant forms conferred dominant negative phenotypes.
Models for transient maintenance of duplicate genes have been proposed (Stoltzfus, 1999), but such models seek to prolong the brief period available for the acquisition of a selectively advantageous function before one of the copies is inevitably lost. The persistence of multiple copies of the same gene, even if both are crippled by mutation (Stoltzfus, 1999), is intrinsically unstable, because mutational processes, intragenomic gene conversion, or intraspecific recombination will eventually restore one copy, leading to deletion of the second.

An additional conundrum appears if the deployment of novel capabilities is important for the invasion of novel ecologies, and hence speciation. In vitro studies suggest that change in enzyme specificity would entail multiple changes, even for subtle alterations in activity (Matsumura et al., 1999). If a new environment is competitive, then gradual refinement of novel biochemical characteristics by alteration of endogenous genetic material (by duplication and divergence) will provide insufficient selective advantages to promote niche invasion. The immense population sizes of microorganisms make even mildly deleterious alleles unsuitable for catalyzing niche invasion in competition with organisms already bearing refined, highly efficient processes for accomplishing similar tasks.

Inheriting the Wheel

Examination of the phylogenetic relationships among protein sequences shows that rampant paralogy (the constant reinvention of specific enzymes from related protein sequences by duplication and divergence) is not evident among bacteria. Rather, closely related groups of enzymes typically catalyze the same biochemical reactions, with the same substrates and products; enzymes with different substrate specificities form distinct groups.
This pattern has been evident since the early 1970s, when analysis of NAD+-binding dehydrogenases revealed clear segregation based on substrate specificity (e.g., malate dehydrogenase, lactate dehydrogenase) and demonstrated these enzymes were distinct from FAD-binding dehydrogenases (Rossmann et al., 1974). Analyses of additional protein families have yielded similar results. So, although exceptions have been noted (Wu et al., 1999), genes encoding lactate dehydrogenases have not arisen multiple times from parental malate dehydrogenase (or vice versa), even though a single mutation can enable this transition (Wilks et al., 1988; Golding and Dean, 1998). Yet the duplication and divergence model makes clear and contrary predictions regarding the assortment of enzymatic functions among members of a protein family, in which novel enzymatic functions (e.g., recognition of a new substrate) are expected to arise numerous times. According to this model, one would expect that lactate dehydrogenases would have evolved multiple times from within the clade of malate dehydrogenases, and vice versa.

Because analyses of protein families demonstrate that enzymatic novelties have arisen very few times, the distribution of these enzymes among extant organisms, including both Bacteria and Archaea, must reflect one of two processes. Either genes encoding all enzymes were present in the common ancestor of all known life (clearly a cumbersome and infeasible proposition), or genes have been mobilized among taxa after their origin. This second model suggests that rather than reinvent the wheel every time it is needed (by point mutation from transiently duplicated sequences present in the same cytoplasm), bacteria can acquire the wheel from other taxa. That is, horizontal genetic transfer can serve to distribute genes encoding specific metabolic functions among diverse bacterial genomes (Lawrence, 1999a; Lawrence and Roth, 1999).
Although adaptation by way of internal genome dynamics does occur, and has been documented in natural populations (Sokurenko et al., 1998) and in the laboratory (Rainey and Travisano, 1998; Papadopoulos et al., 1999), lateral gene transfer allows fully functional pathways to be acquired and used for efficient exploitation of novel environments. Below we explore the feasibility of invoking lateral gene transfer as the catalyst for Bacterial speciation.

IMPACT OF HORIZONTAL GENE TRANSFER

Although gene acquisition is a powerful mechanism for gaining new metabolic capabilities, it cannot be responsible for microbial diversification if it occurs only very rarely. Therefore, assessing the impact of horizontal gene transfer is tantamount to measuring its rate, which requires two sets of data. First, the amount of foreign DNA present in a genome must be assessed. Mere gene-counting cannot establish a rate of gene transfer, because bacterial genomes are littered with selfish elements (transposons, integrated bacteriophages, and the like) that are foreign but typically do not contribute functions that change the ecological character of a species (transposons bearing antibiotic resistance genes [Hall et al., 1999] and bacteriophages mediating virulence [Waldor, 1998] are notable exceptions; see Campbell [1981] and Levin and Bergstrom [2000] regarding the role of accessory elements in bacterial evolution). Thus, the time of introduction of foreign DNA must also be established. If a gene has persisted in a genome for a sufficiently long period without accumulating mutations that would abolish its encoded function, then we may infer that it has evolved under purifying selection.

Identifying Foreign DNA

Three general methods have been used to identify foreign genes in bacterial genomes (Ochman and Lawrence, 1996; Ochman et al., 2000). First, the recent acquisition of a gene will be reflected in its restricted distribution among sibling species.
Such data are readily collected (albeit tediously if done systematically), but they support only an inference that a gene may have been acquired. Second, lateral transfer will result in an atypically high degree of similarity between genes found in otherwise unrelated organisms; for example, two E. coli genes (f108 and f234) are 90% identical to genes encoding glutathione transporters in humans and mice. Although these classes of data can be quite conclusive, detecting foreign genes by this method clearly depends on the breadth and depth of the sequence database. Moreover, less striking similarities can result in spurious conclusions of horizontal gene transfer to explain apparent phylogenetic incongruities, where convergent evolution, mutational saturation, long branch attraction, or other processes may have produced the aberrant phylogenetic pattern.

Lastly, DNA sequences themselves often provide clues to their ancestry. That is, genes native to a bacterial genome accumulate mutations that reflect the directional mutation pressures intrinsic to that cytoplasm (Sueoka, 1962, 1988, 1992, 1993); the resulting mutational biases are reflected in the nucleotide composition, codon usage biases (Sharp and Li, 1987a, b; Mrazek and Karlin, 1999), and di- and trinucleotide frequencies (Karlin, 1998; Campbell et al., 1999) within coding regions. As a result, genes recently integrated into a bacterial genome will exhibit atypical compositional patterns, having evolved under a different suite of directional mutation pressures. Thus, foreign genes can be identified as those bearing atypical sequence features that cannot be readily explained by internal processes (such as an unusual amino acid composition of the encoded protein) and appear unusual only in their new genomic context.
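
The compositional approach described above can be illustrated with a minimal sketch. It assumes a single feature, G+C content at third codon positions, and a simple z-score cutoff; the function names, the cutoff, and the toy sequences are hypothetical and are not the procedure used in the studies cited here, which combine several compositional measures.

```python
from statistics import mean, pstdev


def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of a coding sequence."""
    thirds = cds.upper()[2::3]  # every third base, starting at codon position 3
    if not thirds:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def flag_atypical(genes: dict[str, str], z_cutoff: float = 2.0) -> list[str]:
    """Names of genes whose GC3 deviates from the genome-wide mean by more
    than z_cutoff standard deviations (an illustrative threshold)."""
    values = {name: gc3(seq) for name, seq in genes.items()}
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:  # all genes compositionally identical: nothing stands out
        return []
    return [name for name, v in values.items() if abs(v - mu) / sigma > z_cutoff]


if __name__ == "__main__":
    # Hypothetical toy genome: nine GC3-rich "native" genes, one AT3-rich gene.
    toy = {f"native{i}": "ATGGCC" * 10 for i in range(9)}
    toy["candidate"] = "ATAAAT" * 10
    print(flag_atypical(toy))
```

A gene flagged this way is only a candidate: as the text notes, atypical composition can also arise from internal causes such as an unusual amino acid composition, so phylogenetic or homology evidence is still useful for verification.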
Hence, encyclopedic phylogenetic comparisons and homology searches are not necessary for the identification of acquired genes, although they provide clear means for verifying inferences of gene ancestry. The genes encoding glutathione transporters in E. coli noted above were first identified as bearing unusual compositional patterns, and their foreign ancestry was verified by their unusual similarity to mammalian genes.

Analyses of subsets of genes from the closely related enteric bacteria E. coli and Salmonella enterica led to predictions that between 8% and 15% of their genomes was introduced by horizontal processes (Medigue et al., 1991; Whittam and Ake, 1992; Ochman and Lawrence, 1996; Lawrence and Ochman, 1997). These estimates were congruent with the amount of unique DNA predicted from alignments of their genetic maps (Riley and Anilonis, 1978). More recent analyses based on complete genome sequences yielded similar estimates for the E. coli genome (Lawrence and Ochman, 1998). Similarly high values have been inferred for the genomes of Aquifex aeolicus (Aravind et al., 1998) and Thermotoga maritima (Nelson et al., 1999), for each of which large fractions of the genome have been inferred to be derived from the Archaea, although alternative interpretations have been offered (Logsdon and Faguy, 1999).

Rate of Lateral Gene Transfer Is Substantial

The same sequence features that enable identification of horizontally transferred genes within a genome also allow an estimation of their time of arrival (Lawrence and Ochman, 1997). Immediately after transfer, acquired genes will naturally resemble the genes of their donor genome, reflecting that particular set of directional mutation pressures. Over time, acquired genes will accumulate mutations that reflect the directional mutation pressures of their recipient genome and will ameliorate to resemble native genes.
The degree to which amelioration has progressed provides an estimate of the time the acquired DNA has persisted in the new host genome. Because all bacterial genomes display certain properties along predictable and quantifiable continua, such as nucleotide composition at codon positions (Muto and Osawa, 1987), these measurements provide baselines against which ameliorating genes can be compared. Genes that conform to these relationships are not in the process of amelioration, being either long-term residents of a genome or having been acquired very recently. In contrast, foreign genes ameliorating to a new set of directional mutation pressures will deviate from these relationships until they converge on the patterns exhibited by their new host genome.

Lawrence and Ochman (1997, 1998) identified atypical genes in the E. coli genome and estimated that 18% of the protein-coding sequences were atypical and probably had been introduced by lateral genetic transfer. Quantification of the amelioration times of these genes established a rate of horizontal transfer of 16 kb per million years (My) (Lawrence and Ochman, 1998), implying that 1.6 Mb of the E. coli DNA had been acquired by lateral transfer processes since its divergence 100 My ago from its sister lineage, Salmonella enterica (Fig. 1). Although the bulk of these sequences have had only transient persistence times, >750 of the 4,288 protein-coding genes in the extant E. coli genome are readily identified as having been introduced over the past several hundred million years.

The identification of foreign genes by atypical compositional patterns is suited only for detecting relatively recent acquisitions, which are those most likely to mediate recent speciation events. After a sufficiently long time, acquired genes will be fully ameliorated to their recipient genome, having experienced the mutational biases of their new host genomes, and will not be identified as atypical, or foreign.
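
The figures quoted above can be checked with a few lines of arithmetic. This sketch assumes a constant transfer rate over the whole interval, and it applies the 18% figure (stated in the text for protein-coding sequence) directly to the gene count, which is a simplifying assumption; the function names are hypothetical.

```python
def acquired_dna_kb(rate_kb_per_my: float, elapsed_my: float) -> float:
    """Total DNA introduced at a constant rate over the elapsed time, in kb."""
    return rate_kb_per_my * elapsed_my


def expected_atypical_genes(total_genes: int, atypical_fraction: float) -> float:
    """Gene count implied if the atypical fraction applied evenly to all genes."""
    return total_genes * atypical_fraction


if __name__ == "__main__":
    # Values taken from the text: 16 kb/My over 100 My; 18% of 4,288 genes.
    print(acquired_dna_kb(16, 100))             # in kb; divide by 1000 for Mb
    print(expected_atypical_genes(4288, 0.18))  # compare with the ">750" in the text
```

Under these assumptions the rate and divergence time give 1,600 kb, matching the 1.6 Mb in the text, and 18% of 4,288 genes is consistent with the stated count of more than 750 acquired genes.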
Therefore, ancient lateral transfer events cannot be identified by this method. Reconstructing ancient lateral transfer events would require the application of traditional phylogenetic methods to detect incongruities in sets of gene trees. Because such inferences are readily confounded by convergent evolutionary processes, phylogenetic incongruency alone does not demand invocation of horizontal processes. Although the rates of lateral gene transfer may be substantial, and can serve to confound phylogenetic inference (Doolittle, 1999a, b), analyses of complete genome sequences suggest that rampant recent horizontal gene transfer has not completely obliterated phylogenetic signal from microbial genomes, where the bulk of chromosomal genes are phylogenetically distributed in a manner consistent with the rDNA phylogeny (Huynen and Bork, 1998; Snel et al., 1999).

FIGURE 1. Evolution of bacterial genomes by genomic flux. The old genes from the ancestral chromosome are lost, while the new genes are acquired by horizontal processes. Flux data are shown for the evolution of the E. coli genome, taken from Lawrence and Ochman (1998). Acquired DNA is depicted by solid arrows and arcs; ancestral DNA is depicted as open arrows and arcs.

Horizontal Events and Point Mutations Confer Different Classes of Changes

Point mutations will almost always produce modest, incremental changes in the performance of encoded functions. In contrast, horizontal processes can, in a single step, dramatically extend the cell’s repertoire of metabolic capabilities (Lawrence, 1997, 1999a; Lawrence and Roth, 1998, 1999). The rate of horizontal gene transfer calculated above, 16 kb/My, is comparable to the amount of variant information introduced into the E. coli genome by mutational processes (Drake et al., 1998). However, although mutational processes introduce a substantial number of changes into a bacterial genome, most of these changes are effectively neutral.
In contrast, the information introduced by lateral gene transfer may allow for immediate and effective exploitation of new resources. In this way, horizontal processes have most likely catalyzed the diversification of enteric bacteria such as E. coli. Among enteric bacteria, all functions that can be used to discriminate among closely related taxa can be attributed to genes gained by horizontal processes or lost by deletion. No functions that differentiate between these organisms can be attributed to gene products ancestral to these species that mediate different processes.

The horizontal transfer of a single gene into a naive genome will not be successful (that is, the gene will not persist over time) if its product does not confer a selectable function. In many cases, multiple genes are required for the implementation of a useful function, such as degradation of a compound for energy or the biosynthesis of a cofactor or other metabolite. Acquisition of novel functions is facilitated by the organization of bacterial genes into operons (clusters of cotranscribed genes, whose products often contribute to a single function), which offer highly promiscuous packages of genetic material that can, in horizontal transmission, confer complex metabolic capabilities to recipient taxa. In contrast to the changes conferred by point mutations, horizontal processes may deliver multiple functions simultaneously in the form of the bacterial operon (for example, both an enzyme required for degradation of a new food source and a high-affinity transporter allowing acquisition of that food source). Therefore, operons circumvent the necessity for the ineffectual intermediate stages implicit in the evolution of complex, novel capabilities by point mutational processes.

Although operons represent convenient parcels for the mobilization of functions among organisms, how did they arise?
The Selfish Operon Model proposes that lateral gene transfer has catalyzed the assembly of genes into operons by promoting facile transfer to naive genomes (Lawrence and Roth, 1996; Lawrence, 1997, 1999b, 2000). That is, the physical clustering of genes allows all information required for implementation of a selectable function to be cotransferred. The cluster improves the fitness of a gene by allowing the gene to exploit both vertical and horizontal inheritance. Because the organization of genes into operons does not immediately benefit the host, it may be considered a selfish property of the constituent genes. And because cotranscription of genes allows their efficient expression in foreign hosts from a promoter at the site of integration, the coalescence of clustered genes into operons can also be considered a selfish process.

As predicted by the Selfish Operon Model, operons in E. coli and Salmonella enterica typically encode nonessential metabolic functions; in many cases (cob, pdu, lac, phn, tct), operons have clearly been obtained by lateral gene transfer (Lawrence and Roth, 1996). In contrast, genes less likely to have been subject to lateral gene transfer (for example, essential genes found in all potential recipients) are rarely found in operons (notable exceptions to this, such as operons of ribosomal proteins, are discussed elsewhere [Lawrence and Roth, 1996]). Moreover, the inconsistency of operon organization across genes reflects their constant assembly and breakdown (Itoh et al., 1999). Hence, the organization of genes into operons reflects the history of gene transfer among bacteria and their role in catalyzing bacterial diversification.

Coupling DNA Acquisition and DNA Loss

Although the rate of horizontal transfer in E.
coli has been substantial, and potentially useful information has been delivered by way of selfish operons, bacterial genomes are clearly not growing ever larger (Bergthorsson and Ochman, 1995, 1998; Ochman and Bergthorsson, 1995, 1998). And far from enabling every possible biochemical function, the physiology of an individual bacterium reflects the synergism of a specific, definable subset of metabolic capabilities. What limits bacterial genome size?

A finite population of individuals cannot maintain an infinite amount of information free from mutation. If mutation rates (μ) are nonzero, individuals will accumulate mutations over time. If the effective population size (Ne) is finite, some mutant individuals will succeed in reproducing, and their progeny will not be eliminated from the population. Whereas intraspecific recombination (r) can recreate individuals free from deleterious mutations (Muller, 1932), limited recombination, as seen in many bacteria (Dykhuizen and Green, 1991; Maynard Smith et al., 1993; Guttman and Dykhuizen, 1994a, b), will allow the fixation of deleterious mutations, including those that eliminate potentially useful genes. These factors all influence the maximum amount of genomic information (G) that can be maintained (Lawrence and Roth, 1999). We can express this relationship as follows, where the genome size is expressed in terms of functions (f, g, h) of the recombination rate, effective population size, and mutation rate, respectively:

$$G \propto \frac{f(r)\,g(N_e)}{h(\mu)} \qquad (1)$$

The amount of genomic information that can be maintained under purifying selection must decrease as mutation rate increases, recombination rate decreases, or population size decreases. As a result, genome size cannot increase indefinitely. An integrated model of bacterial genome evolution would offset chronic gene acquisition by horizontal transfer with gene loss by deletion (Fig. 1).

Empirical evidence supports this limitation to genome size.
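The qualitative behavior of Equation 1 can be made concrete with a toy calculation. The sketch below is illustrative only: the paper does not specify the forms of f, g, and h, so the increasing functions chosen here (and the names `f`, `g`, `h`, and `relative_capacity`) are hypothetical placeholders, and the numerical inputs are arbitrary rather than measured values. It shows only the direction of each effect.

```python
import math

# Hypothetical placeholder forms; the source states only that such
# functions exist, not what they are.
def f(r):
    """Assumed to increase with the recombination rate r."""
    return 1.0 + r

def g(n_e):
    """Assumed to increase with the effective population size Ne."""
    return math.log10(n_e)

def h(mu):
    """Assumed to increase with the mutation rate mu."""
    return mu

def relative_capacity(r, n_e, mu):
    """A quantity proportional to G in Equation 1 (arbitrary units)."""
    return f(r) * g(n_e) / h(mu)

# Arbitrary illustrative inputs, not empirical estimates.
baseline = relative_capacity(r=1.0, n_e=1e8, mu=1.0)
higher_mutation = relative_capacity(r=1.0, n_e=1e8, mu=10.0)
less_recombination = relative_capacity(r=0.1, n_e=1e8, mu=1.0)
smaller_population = relative_capacity(r=1.0, n_e=1e6, mu=1.0)

for label, value in [
    ("baseline", baseline),
    ("higher mutation rate", higher_mutation),
    ("less recombination", less_recombination),
    ("smaller population", smaller_population),
]:
    print(f"{label}: {value:.3g}")
```

Only the ordering of the printed values is meaningful; the magnitudes depend entirely on the assumed placeholder forms.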
Despite high rates of horizontal genetic transfer into enteric bacterial species, the genomes of E. coli, Salmonella enterica, and related organisms are notably uniform in size (Bergthorsson and Ochman, 1995, 1998). For example, although bacterial genomes vary in size from ~500 to >10,000 kb, the genome sizes of natural variants of E. coli vary far less, measuring 4,968 ± 253 kb (Bergthorsson and Ochman, 1998), despite an influx of 16 kb/MY over the past 100 MY. Although absolute genome size cannot be equated to the amount of information maintained in a genome (because not all base pairs actually carry information; see below), these results demonstrate that gene gain is indeed offset by gene loss in E. coli.

Comparisons among enteric bacteria have revealed many cases in which genes have been lost from certain lineages while being maintained in others. Genes that confer selectable functions in one ecological context may fail to provide a benefit to the cell in another environmental context; such genes would be subject to loss by mutation and genetic drift. For example, the phoA gene, encoding alkaline phosphatase, has been lost from the Salmonella lineage but is maintained in the genomes of virtually all other enteric bacteria (DuBose and Hartl, 1990). In addition, genes may be lost because their functions interfere with the adoption of a novel ecological role. For example, the surface protease OmpT probably was lost from pathogenic Shigella because its function interferes with virulence (Nakata et al., 1993). Similarly, Shigella lost the cadA gene because its product, lysine decarboxylase, also diminishes virulence (Maurelli et al., 1998).

HETEROGENEITY IN HORIZONTAL GENE TRANSFER

Lateral Gene Transfer Varies Among Lineages

Can analyses of the E. coli genome be extended to form a general model of bacterial innovation and diversification catalyzed by horizontal genetic transfer?
Bacterial genomes vary dramatically in the proportions of foreign DNA they harbor (Fig. 2). Yet the amount of foreign DNA within a genome is not entirely predictable from the genome size. For example, among organisms with large genomes, the pathogen Mycobacterium tuberculosis harbors few foreign genes, the mammalian commensal E. coli contains many more acquired genes, and the soil bacterium Bacillus subtilis contains an intermediate amount. These results suggest that the rate of lateral gene transfer derived for the E. coli genome is not immediately applicable to other organisms.

The variation in the amount of acquired DNA evident in microbial genomes may be attributable to several sources. First, organisms may differ in their exposure to foreign DNA. For example, the intracellular lifestyle of Rickettsia or Mycoplasma may reduce the opportunity for DNA introduction into their cytoplasm. Second, methods of DNA integration into the chromosome may differ. Acquired DNA in E. coli is strongly associated with chromosomally located mobile genetic elements that probably mediated its integration (Lawrence and Ochman, 1998); this association is also evident in the Synechocystis, Archaeoglobus, and Helicobacter genomes (Ochman et al., 2000). The smaller amount of acquired DNA in some organisms may reflect a dearth of mechanisms for integrating foreign DNA once it has been introduced into the cytoplasm.

Beyond these mechanical constraints, natural selection is the final arbiter of gene exchange, and newly acquired genes must provide a beneficial function if they are to persist. Yet bacterial populations can maintain only a finite amount of information in their genomes (see above). Even if newly acquired genes provide a potentially beneficial function, they must confer a sufficiently strong selective advantage to allow displacement of (that is, the loss of purifying selection on) existing information in the cell. Simply stated, the acquisition of new information necessitates the loss of existing information. Therefore, the rate of gene acquisition should be inversely correlated with the quality of the information that acquired genes must confer to persist. If an acquired gene must provide a strong advantage to displace existing information, the rate of acquisition will be low, for few genes would make such a contribution to organismal fitness. Alternatively, if an acquired gene need provide only a modest benefit to displace existing information, the apparent rate of acquisition will be commensurately higher.

FIGURE 2. Horizontally transferred DNA present in bacterial genomes, after Ochman et al. (2000). Grey bars denote protein-coding sequences native to the bacterial genome (present for at least 100 MY); black bars denote atypical genes probably acquired recently by horizontal transfer. Atypical genes were identified by the methods of Lawrence and Ochman (1997, 1998).

Classes of Genomic Information

How does variation in information content lead to variation in the effective rate of horizontal transfer among genomes? Although one may infer that total genomic information should increase linearly with genome size, this is not always the case. Efficiency in operon organization aside, some information does scale linearly with gene number: signals that allow for a gene's appropriate expression (transcription initiation and termination signals, sequences that regulate promoter activity, and, within protein-coding regions, signals for translation initiation and termination) and the suite of requisite features that allow the gene product to perform its function (e.g., appropriate transmembrane domains, export signal sequences, ligand binding domains, catalytic centers, and other critical features), encoded in nonsynonymous sites.
Strong purifying selection maintains these classes of information, as reflected in their high selection coefficients (Fig. 3).

Additional classes of genomic information are not reflected directly in the presence or composition of gene products. Nonrandom codon usage demonstrates that some genomes maintain a substantial amount of information that influences the expression, not the composition, of protein products (Sharp and Li, 1986, 1987a; Sharp et al., 1995). Not simply the result of mutational biases, codon usage bias reflects the intervention of natural selection in preventing the accumulation of certain synonymous substitutions among highly expressed genes. The degree of codon usage bias is proportional to the expression level of a gene and is inversely correlated with its synonymous substitution rate (Sharp and Li, 1987b). For example, AUA (Ile), UUR (Leu), and AGR (Arg) codons are underrepresented among highly expressed genes in E. coli, because their cognate tRNAs are rare; similarly, the preference of NAC codons over NAU codons for tyrosine, histidine, asparagine, and aspartate reflects the differential binding of queuosine-bearing tRNAs to each pair of codons. In genes with codon usage bias, synonymous mutations introducing nonpreferred codons are counterselected, even though the protein product is not affected.

In addition, codon context bias (Borodovskii et al., 1988) reflects problematic juxtaposition of tRNAs within the ribosome, which is strongly avoided (Lawrence, unpubl. results). For example, although GAA glutamate codons are favored on average 4:1 over GAG among highly expressed genes in E. coli, GAA is favored by an extraordinary 53:1 when followed by guanosine (GAGG is highly avoided), yet barely favored, by 1.65:1, when followed by a cytosine (GAGC is modestly avoided) (Lawrence, unpubl. results).
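Both kinds of bias described above can be tallied directly from coding sequence. The following is a minimal sketch, not the author's actual procedure: `normalized_chi2` compares observed codon counts with counts expected from codon-position-specific nucleotide frequencies and divides by the number of codons (the kind of length-normalized χ² named in the Figure 5 caption), while `context_counts` groups two synonymous codons by the nucleotide that follows them, as in the GAA/GAG example. The function names, the use of each gene's own composition for the expectation, the inclusion of all 64 codons, and the toy sequence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def codons(seq):
    """Split an in-frame coding sequence into codons (trailing partial codon dropped)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def normalized_chi2(seq):
    """Chi-square of codon counts against position-specific nucleotide
    expectations, divided by the number of codons (assumed normalization)."""
    cs = codons(seq)
    n = len(cs)
    if n == 0:
        return 0.0
    observed = Counter(cs)
    # Nucleotide counts at codon positions 1, 2, and 3 of this sequence.
    pos_counts = [Counter(c[p] for c in cs) for p in range(3)]
    chi2 = 0.0
    for codon in map("".join, product(BASES, repeat=3)):
        expected = float(n)
        for p in range(3):
            expected *= pos_counts[p][codon[p]] / n
        if expected > 0:
            chi2 += (observed[codon] - expected) ** 2 / expected
    return chi2 / n

def context_counts(seq, codon_a, codon_b):
    """Count codon_a and codon_b, grouped by the nucleotide immediately 3' of the codon."""
    seq = seq.upper().replace("U", "T")
    counts = {base: Counter() for base in BASES}
    for i in range(0, len(seq) - 3, 3):
        codon, following = seq[i:i + 3], seq[i + 3]
        if codon in (codon_a, codon_b) and following in counts:
            counts[following][codon] += 1
    return counts

toy = "GAAGGGGAGCCCGAAGTT"  # made-up sequence, not real data
print(normalized_chi2(toy))
print(context_counts(toy, "GAA", "GAG"))
```

Applied to a set of highly expressed genes, the ratio of the two counts within each following-nucleotide group would correspond to context-dependent preferences of the kind quoted above; the sketch makes no claim to reproduce those figures.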
Because these classes of information do not affect the composition of gene products, only their expression, one may infer that the purifying selection maintaining this information is reflected in smaller selection coefficients (Fig. 3). That is, nonsynonymous substitutions are expected to impart a more dramatic average phenotype than are synonymous substitutions.

The information maintained in genomic codon usage and codon context biases is qualitatively different from the other classes of information discussed above. Rather than providing additional genes, and therefore additional metabolic capabilities, to the cell, the additional information reflected in codon biases refines the expression of existing genes. That is, genomes maintaining large amounts of codon usage bias have invested information in fine-tuning the efficient expression of a particular suite of genes; they did not invest this information in the maintenance of additional genes. Because the amount of genomic information that can be maintained is finite, this pattern demonstrates a trade-off in how genomic information is apportioned.

FIGURE 3. Types of information present in microbial genomes. Curves are plotted along arbitrary axes, the ordinate depicting a gradient of selection coefficients, the abscissa depicting the amount of information maintained. The areas under the arbitrary curves represent the total genomic information for each class of sites. Information representing "genetic headroom" is indicated with the gray bar.

Genetic Headroom

Given that genomes are limited in their information capacity, acquisition of additional information by way of laterally transferred sequences must be offset by information loss. A genome cannot maintain both the full complement of ancestral information and the information contained in the newly acquired genes. Which information is maintained and which information is discarded?
Clearly the information that maximally augments the fitness of the organism (that associated with the highest selection coefficients) will be maintained. Genetic headroom can be defined as information that bears very low selection coefficients, exhibited as codon usage bias, codon context bias, and other sites that do not contribute directly to metabolic capabilities, and that can be removed from purifying selection without altering the metabolic capabilities of the organism. Organisms with large genetic headroom can explore novel ecologies with impunity, because the information that is transiently sacrificed does not handicap the organism with respect to its metabolic capabilities. That is, maintenance of the additional metabolic capabilities encoded by newly acquired genes is offset by accumulation of mutations that affect codon usage bias and codon context bias (but not primary amino acid sequences), the class of information bearing the lowest selection coefficients (Fig. 3). If the niche is successful, ancestral genes may be discarded (e.g., the Shigella cadA and ompT genes) as the organism adapts to a new ecological role. Alternatively, if the new niche is not successful, the acquired genes will be discarded. In either case, ancestral physiology has been maintained during the exploratory phase (Fig. 4).

In organisms with little genetic headroom, experimentation with novel ecologies upon acquisition of additional information in novel genes cannot be offset by allowing accumulation of synonymous substitutions. Rather, the maintenance of additional information is offset by loss of protein-coding sequences or other classes of information bearing high selection coefficients (Fig. 3). Without genetic headroom, an organism may compromise ancestral physiology to pursue novel ecological routes (Fig. 4). If the lineage is not successful, it cannot return to its ancestral state. In this way, the magnitude of genetic headroom can promote microbial diversification by allowing purifying selection to be reapportioned between counterselecting synonymous substitutions in native genes and nonsynonymous substitutions in acquired genes.

FIGURE 4. Models for the reapportionment of genomic information upon acquisition of genes by horizontal transfer. On the left, organisms with high genetic headroom can offset information gain by transient accumulation of synonymous substitutions. On the right, organisms with little genetic headroom offset the gain of information with loss of ancestral coding sequences.

This model predicts that organisms with greater genetic headroom can maintain greater numbers of genes acquired by lateral gene transfer. Because codon usage bias can be assessed within individual genomes and does not require extensive comparisons with homologues for calculation of relative rates of evolution, quantitation of codon usage bias offers a cogent vehicle for measuring overall genomic information. Thus, the magnitude of codon usage bias may be an accurate predictor of the rate of effective horizontal transfer. As expected, the overall extent of codon usage bias, here quantitated as the average normalized χ² of codon usage, is a good predictor of the amount of horizontally transferred DNA in a microbial genome (Fig. 5); this correlation between average χ² of codon usage and percentage atypical DNA is independent of genome size. Organisms capable of maintaining laterally transferred genes presumably would be more prone to speciation, because acquired genes would promote the exploitation of novel environments. Thus, genetic headroom can be viewed as a predictor of the potential of a lineage to diversify and to proliferate.

LIMITATIONS ON HORIZONTAL GENE TRANSFER

Shuffling the Deck

Horizontal transfer mediates a combinatorial process whereby organisms can be assembled in a stepwise fashion from both existing functions and those acquired from other organisms.
Yet organisms are clearly constrained in the paths available for realistic diversification. For example, the ancestor of E. coli and Salmonella enterica was likely a commensal inhabitant of the differentiated, lower intestinal tract. E. coli evolved as a commensal of mammalian gut environments, optimized to grow at 37°C and able to degrade milk sugars. Salmonella still inhabits a variety of gastrointestinal tracts, including those of reptiles and birds, but has adopted a pathogenic lifestyle as well. Although horizontal gene transfer was instrumental in catalyzing the evolution of each of these lineages, the new ecological roles of these species were not dramatic leaps from the ancestral ecology. That is, a facultatively anaerobic, mammalian commensal did not evolve from a photosynthetic cyanobacterium. Similarly, we would not expect that the immediate descendants of an E. coli lineage would include carnivorous, social predators like the Myxobacteria. Therefore, although horizontal gene transfer can introduce functions capable of catalyzing bacterial speciation, the potential ecologies exploited by species-to-be must be reasonably accessible.

FIGURE 5. Correlation between the amount of foreign DNA in a bacterial genome and the amount of genetic headroom. Data points represent complete genome analyses for the organisms listed in Figure 2. Average codon usage bias was calculated from length (L)-normalized χ² values for each gene by using codon-position-specific nucleotide compositions as expected values. Atypical DNA was calculated as described (Lawrence and Ochman, 1997, 1998).

Inventing the Cards

Although horizontal genetic transfer can reshuffle metabolic capabilities among organisms, allowing for novel combinations of capabilities to define incipient species, gene transfer is unlikely to result directly in the creation of truly novel metabolic capabilities.
That is, although a current organism is likely to gain the ability to degrade β-galactosides by acquiring another organism's β-galactosidase (as did E. coli in obtaining the lacZYA operon), the enzyme must have evolved at some point by mutational processes. However, gene transfer can facilitate the evolution of novel functions by allowing gene duplication and divergence of function (the classic model for the evolution of novel genes) to occur in different cytoplasms. That is, the difficulties inherent in asking for a duplicated gene to be maintained until a fortuitous beneficial mutation occurs are circumvented if novel functionality evolves in separate cytoplasms; the genes may be reunited in the same cytoplasm by intraspecific or interspecific recombination after the evolution of distinct functions.

Role of Intraspecific Recombination

In contrast to lateral, interspecific gene transfer, intraspecific gene transfer is unlikely to catalyze speciation. When genes are mobilized among conspecific strains, homologous recombination facilitates their integration into the chromosome. This process has been demonstrated to reassort alleles in E. coli (Milkman and Stoltzfus, 1988; Milkman and Bridges, 1990, 1993; Dykhuizen and Green, 1991; Guttman and Dykhuizen, 1994a; Selander et al., 1996; Milkman, 1997) and other organisms. Novel functionality is not introduced directly, but intraspecific recombination will allow propagation of acquired genes throughout a species. As organisms diverge, homologous recombination becomes less efficient (Zawadzki et al., 1995; Majewski et al., 2000), as mismatch correction systems prevent the formation of heteroduplex strands (Vulic et al., 1997, 1999; Majewski and Cohan, 1998, 1999). Therefore, homologous recombination becomes less efficient at transferring traits across species boundaries.
This caveat does not diminish in any way the role of intraspecific recombination in distributing potentially beneficial alleles within species boundaries. On the contrary, it has been estimated that the rate of recombination among E. coli strains is on the order of the mutation rate (Guttman and Dykhuizen, 1994a, b) and is as much as 10 times the mutation rate in recombinogenic species such as Streptococcus pneumoniae (Feil et al., 2000); as a result, a single nucleotide may be 50 times more likely to change through recombination than through mutational processes (Guttman and Dykhuizen, 1994a; Guttman, 1997; Feil et al., 2000). Hence, intraspecific recombination is a formidable force behind periodic selection events that would allow beneficial genes, including those introduced by horizontal genetic transfer, to rise to high frequency in bacterial populations. The population dynamics mediating the dispersal of beneficial genes and, hence, speciation, are discussed elsewhere (Cohan, 2001).

PUSH AND PULL IN THE SPECIATION PROCESS

Pull Speciation

In standard models of speciation in eukaryotes (Vrba, 1985; King, 1993), two events are crucial for successful speciation to occur. First, gene flow must be reduced between the nascent species (reproductive isolation) to avoid coalescence of the lineages into a single species. Second, if they are to coexist, the organisms eventually must play markedly different ecological roles to avoid direct competition. Although many mechanisms have been proposed that allow for reproductive isolation, models for the evolution of phenotypic differentiation between the two diverging lineages share a common feature: natural variation found in a parental population is apportioned differently among daughter populations. For example, differential response to selection may allow for differential utilization along a resource gradient (Fig.
6, left); these differences then allow for simultaneous (for sympatric speciation) or eventual (for allopatric speciation) coexistence of the two taxa. Intrinsically, this separation of ecological roles is a gradual process, in which incipient species-to-be progressively alter their phenotypic character by selection for naturally arising variants present in the parental population. As the lineages are diversifying, the average phenotype in at least one daughter population is being slightly altered relative to that in the parental population. In this way, daughter populations are "pulled" away from each other and into new niches by the action of natural selection on variant traits. Because daughter populations are effectively enriched for variants found in the parental populations, incipient species initially exhibit characteristics that are effectively subsets of those seen in their parental populations. That is, parental populations always contain the seeds of the incipient species, and the vagaries of population dynamics dictate when species are created from the preexisting variation within a parental population.

FIGURE 6. Two models for speciation, pull and push (see text). The distribution of individuals is depicted along an arbitrary phenotypic axis.

Push Speciation

The prevalence of horizontal transfer in prokaryotes makes this view incongruent with bacterial speciation. Acquired genes have the potential to alter dramatically the metabolic capabilities of recipient organisms; cells can suddenly find themselves performing feats that were never within the grasp of the parent population (Fig. 6, right). In a way, daughter populations are "pushed" into new niches beyond the scope of their parental populations. If new functions mediate competitive exploitation of this environment, the new lineage will persist.
Additional acquired genes will reinforce the ecological distinctiveness of each lineage, and inevitable gene loss will further differentiate between nascent species and their parent populations. In time, much of the steady rain of horizontally transferred DNA into bacterial genomes will quickly evaporate, not having conferred useful functions. Eventually, however, some functions will be introduced, packaged in selfish operons, that allow the organism to invade a new niche successfully. The inevitable loss of ancestral functions provides further ecological differentiation between parent and daughter populations that serves to counterselect recombinants between these populations. In this way, we can view bacterial speciation as an inevitable process, which must occur in the face of widespread gene loss and acquisition. From a phylogenetic standpoint, dendrograms of descent are less accurately represented as phylogenetic trees than as intricate webs (Doolittle, 1999a, b), where the flow of genes among clades has promoted the diversification of prokaryotic life.

ACKNOWLEDGMENTS

I thank Ford Doolittle, Matt Kane, and an anonymous reviewer for helpful comments on the manuscript. This work was supported by grants from the Alfred P. Sloan Foundation and the David and Lucile Packard Foundation.

REFERENCES

ARAVIND, L., R. L. TATUSOV, Y. I. WOLF, D. R. WALKER, AND E. V. KOONIN. 1998. Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles. Trends Genet. 14:442–444.
BERGTHORSSON, U., AND H. OCHMAN. 1995. Heterogeneity of genome size among natural isolates of Escherichia coli. J. Bacteriol. 177:5784–5789.
BERGTHORSSON, U., AND H. OCHMAN. 1998. Distribution of chromosome length variation in natural isolates of Escherichia coli. Mol. Biol. Evol. 15:6–16.
BORODOVSKII, M. Y., V. A. SHEPELEV, AND A. A. ALEKSANDROV. 1988. Context-connected shift pattern of the frequencies of synonymous codons in Escherichia coli. Mol. Biol. 22:767–779 (in Russian).
BROWN, J. R., F. T. ROBB, R. WEISS, AND W. F. DOOLITTLE. 1997. Evidence for the early divergence of tryptophanyl- and tyrosyl-tRNA synthetases. J. Mol. Evol. 45:9–16.
BUCHANAN-WOLLASTON, V., J. E. PASSIATORE, AND F. CANON. 1987. The mob and oriT mobilization functions of a bacterial plasmid promote its transfer to plants. Nature 328:170–175.
CAMPBELL, A. 1981. Evolutionary significance of accessory DNA elements in bacteria. Annu. Rev. Microbiol. 35:55–83.
CAMPBELL, A., J. MRAZEK, AND S. KARLIN. 1999. Genome signature comparisons among prokaryote, plasmid, and mitochondrial DNA. Proc. Natl. Acad. Sci. USA 96:9184–9189.
CASJENS, S. 1998. The diverse and dynamic structure of bacterial genomes. Annu. Rev. Genet. 32:339–377.
COHAN, F. M. 2001. Bacterial species and speciation. Syst. Biol. 50: (this issue).
DOOLITTLE, W. F. 1999a. Lateral genomics. Trends Cell. Biol. 9:M5–8.
DOOLITTLE, W. F. 1999b. Phylogenetic classification and the universal tree. Science 284:2124–2129.
DRAKE, J. W., B. CHARLESWORTH, D. CHARLESWORTH, AND J. F. CROW. 1998. Rates of spontaneous mutation. Genetics 148:1667–1686.
DUBOSE, R. F., AND D. L. HARTL. 1990. The molecular evolution of alkaline phosphatase: Correlating variation among enteric bacteria to experimental manipulations of the protein. Mol. Biol. Evol. 7:547–577.
DYKHUIZEN, D. E. 1998. Santa Rosalia revisited: Why are there so many species of bacteria? Antonie van Leeuwenhoek 73:25–33.
DYKHUIZEN, D. E., AND L. GREEN. 1991. Recombination in Escherichia coli and the definition of biological species. J. Bacteriol. 173:7257–7268.
FEIL, E. J., J. M. SMITH, M. C. ENRIGHT, AND B. G. SPRATT. 2000. Estimating recombinational parameters in Streptococcus pneumoniae from multilocus sequence typing data. Genetics 154:1439–1450.
FIGGE, R. M., M. SCHUBERT, H. BRINKMANN, AND R. CERFF. 1999. Glyceraldehyde-3-phosphate dehydrogenase genes in eubacteria and eukaryotes: Evidence for intra- and inter-kingdom gene transfer. Mol.
Biol. Evol. 16:429–440.
GOGARTEN, J. P., H. KIBAK, P. DITTRICH, L. TAIZ, E. J. BOWMAN, B. J. BOWMAN, M. F. MANOLSON, R. J. POOLE, T. DATE, T. OSHIMA, ET AL. 1989. Evolution of the vacuolar H+-ATPase: Implications for the origin of eukaryotes. Proc. Natl. Acad. Sci. USA 86:6661–6665.
GOLDING, G. B., AND A. M. DEAN. 1998. The structural basis of molecular adaptation. Mol. Biol. Evol. 15:355–369.
GRIBALDO, S., AND P. CAMMARANO. 1998. The root of the universal tree of life inferred from anciently duplicated genes encoding components of the protein-targeting machinery. J. Mol. Evol. 47:508–516.
GUTTMAN, D. S. 1997. Recombination and clonality in natural populations of Escherichia coli. Trends Ecol. Evol. 12:16–22.
GUTTMAN, D. S., AND D. E. DYKHUIZEN. 1994a. Clonal divergence in Escherichia coli as a result of recombination, not mutation. Science 266:1380–1383.
GUTTMAN, D. S., AND D. E. DYKHUIZEN. 1994b. Detecting selective sweeps in naturally occurring Escherichia coli. Genetics 138:993–1003.
HALL, R. M., C. M. COLLIS, M. J. KIM, S. R. PARTRIDGE, G. D. RECCHIA, AND H. W. STOKES. 1999. Mobile gene cassettes and integrons in evolution. Ann. N.Y. Acad. Sci. 870:68–80.
HARTL, D. L., AND D. E. DYKHUIZEN. 1984. The population genetics of Escherichia coli. Annu. Rev. Genet. 18:31–68.
HEINEMANN, J. A., AND G. F. J. SPRAGUE. 1989. Bacterial conjugative plasmids mobilize DNA transfer between bacteria and yeast. Nature 340:205–209.
HUYNEN, M. A., AND P. BORK. 1998. Measuring genome evolution. Proc. Natl. Acad. Sci. USA 95:442–444.
ITOH, T., K. TAKEMOTO, H. MORI, AND T. GOJOBORI. 1999. Evolutionary instability of operon structures disclosed by sequence comparisons of complete microbial genomes. Mol. Biol. Evol. 16:332–346.
IWABE, N., K.-I. KUMA, M. HASEGAWA, S. OSAWA, AND T. MIYATA. 1989. Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc. Natl. Acad. Sci. USA 86:9355–9359.
KARLIN, S. 1998. Global dinucleotide signatures and analysis of genomic heterogeneity. Curr. Opin. Microbiol. 1:598–610.
KING, M. 1993. Species evolution. Cambridge Univ. Press, Cambridge.
LAWRENCE, J. G. 1997. Selfish operons and speciation by gene transfer. Trends Microbiol. 5:355–359.
LAWRENCE, J. G. 1999a. Gene transfer, speciation, and the evolution of bacterial genomes. Curr. Opin. Microbiol. 2:519–523.
LAWRENCE, J. G. 1999b. Selfish operons: The evolutionary impact of gene clustering in the prokaryotes and eukaryotes. Curr. Opin. Genet. Dev. 9:642–648.
LAWRENCE, J. G. 2000. Clustering of antibiotic resistance genes: Beyond the selfish operon. ASM News 66:281–286.
LAWRENCE, J. G., AND H. OCHMAN. 1997. Amelioration of bacterial genomes: Rates of change and exchange. J. Mol. Evol. 44:383–397.
LAWRENCE, J. G., AND H. OCHMAN. 1998. Molecular archaeology of the Escherichia coli genome. Proc. Natl. Acad. Sci. USA 95:9413–9417.
LAWRENCE, J. G., AND J. R. ROTH. 1996. Selfish operons: Horizontal transfer may drive the evolution of gene clusters. Genetics 143:1843–1860.
LAWRENCE, J. G., AND J. R. ROTH. 1998. Roles of horizontal transfer in bacterial evolution. Pages 208–225 in Horizontal gene transfer (M. Syvanen and C. I. Kado, eds.). Chapman and Hall, London.
LAWRENCE, J. G., AND J. R. ROTH. 1999. Genomic flux: Genome evolution by gene loss and acquisition. Pages 263–289 in Organization of the prokaryotic genome (R. L. Charlebois, ed.). ASM Press, Washington, D.C.
LEVIN, B. R., AND C. T. BERGSTROM. 2000. Bacteria are different: Observations, interpretations, speculations, and opinions about the mechanisms of adaptive evolution in prokaryotes. Proc. Natl. Acad. Sci. USA 97:6981–6985.
LINNEUS, C. 1742. Systema Naturale.
LOGSDON, J. M., AND D. M. FAGUY. 1999. Thermotoga heats up lateral gene transfer. Curr. Biol. 9:R747–R751.
MAJEWSKI, J., AND F. M. COHAN. 1998. The effect of mismatch repair and heteroduplex formation on sexual isolation in Bacillus. Genetics 148:13–18.
MAJEWSKI, J., AND F. M. COHAN. 1999. DNA sequence similarity requirements for interspecific recombination in Bacillus. Genetics 153:1525–1533.
MAJEWSKI, J., P. ZAWADZKI, P. PICKERILL, F. M. COHAN, AND C. G. DOWSON. 2000. Barriers to genetic exchange between bacterial species: Streptococcus pneumoniae transformation. J. Bacteriol. 182:1016–1023.
MATSUMURA, I., J. B. WALLINGFORD, N. K. SURANA, P. D. VIZE, AND A. D. ELLINGTON. 1999. Directed evolution of the surface chemistry of the reporter enzyme beta-glucuronidase. Nat. Biotechnol. 17:696–701.
MAURELLI, A. T., R. E. FERNÁNDEZ, C. A. BLOCH, C. K. RODE, AND A. FASANO. 1998. "Black holes" and bacterial pathogenicity: A large genomic deletion that enhances the virulence of Shigella spp. and enteroinvasive Escherichia coli. Proc. Natl. Acad. Sci. USA 95:3943–3948.
MAYNARD SMITH, J., N. H. SMITH, M. O'ROURKE, AND B. G. SPRATT. 1993. How clonal are bacteria? Proc. Natl. Acad. Sci. USA 90:4384–4388.
MAYR, E. 1942. Systematics and the origin of species. Columbia Univ. Press, New York.
MAYR, E. 1954. Change of genetic environment and evolution. Pages 156–180 in Evolution as a process (J. S. Huxley, A. C. Hardy, and E. B. Ford, eds.). Allen and Unwin, London.
MAYR, E. 1963. Animal species and evolution. Harvard Univ. Press, Cambridge, Massachusetts.
MEDIGUE, C., T. ROUXEL, P. VIGIER, A. HENAUT, AND A. DANCHIN. 1991. Evidence for horizontal gene transfer in Escherichia coli speciation. J. Mol. Biol. 222:851–856.
MILKMAN, R. 1997. Recombination and population structure in Escherichia coli. Genetics 146:745–750.
MILKMAN, R., AND M. M. BRIDGES. 1990. Molecular evolution of the E. coli chromosome. III. Clonal frames. Genetics 126:505–517.
MILKMAN, R., AND M. M. BRIDGES. 1993. Molecular evolution of the E. coli chromosome. IV. Sequence comparisons. Genetics 133:455–468.
MILKMAN, R., AND A. STOLTZFUS. 1988. Molecular evolution of the Escherichia coli chromosome. II. Clonal segments.
Genetics 120:359–366. MRAZEK , J., AND S. KARLIN. 1999. Detecting alien genes in bacterial genomes. Ann. N.Y. Acad. Sci. 870:314– 329. MULLER , H. 1932. Some genetic aspects of sex. Am. Nat. 66:118–138. MUTO , A., AND S. OSAWA. 1987. The guanine and cytosine content of genomic DNA and bacterial evolution. Proc. Natl. Acad. Sci. USA 84:166–169. NAKATA, N., T. TOBE, I. FUKUDA, T. SUZUKI, K. KOMATSU, M. YOSHIKAWA, AND C. SASAKAWA. 1993. The absence of a surface protease, OmpT, determines the intercellular spreading ability of Shigella: The relationship between the ompT and kcpA loci. Mol. Microbiol. 9:459–468. NELSON, K. E., R. A. CLAYTON, S. R. GILL, M. L. GWINN, R. J. DODSON, D. H. HAFT, E. K. HICKEY, J. D. PETERS ON, W. C. NELSON, K. A. KETCHUM , ET AL. 1999. Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature 399:323–329. OCHMAN, H., AND U. BERGTHORS SON. 1995. Genome evolution in enteric bacteria. Curr. Opin. Genet. Dev. 5:734–738. OCHMAN, H., AND U. BERGTHORSS ON. 1998. Rates and patterns of chromosome evolution in enteric bacteria. Curr. Opin. Microbiol. 1:580–583. OCHMAN, H., AND J. G. LAWRENCE. 1996. Phylogenetics and the amelioratio n of bacterial genomes. Pages 2627–2637 in Escherichia coli and Salmonella typhimurium: Cellular and molecular biology, 2nd edition (F. C. Neidhardt, R. Curtiss III, J. L. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, and H. E. Umbarger, eds.). American Society for Microbiology, Washington, D.C. OCHMAN, H., J. G. LAWRENCE, AND E. GROISMAN. 2000. Lateral gene transfer and the nature of bacterial innovation. Nature 405:299–304. 495 PAPADOPOULOS , D., D. SCHNEIDER , J. MEIER -EIS S , W. ARBER , R. E. LENSKI , AND M. BLOT . 1999. Genomic evolution during a 10,000-generation experiment with bacteria. Proc. Natl. Acad. Sci. USA 96:3807–3812. PATERSON, H. E. H. 1985. The recognition concept of species. 
Pages 21–29 in Species and speciation (E. S. Vrba, ed.). Transvaal Museum, Pretoria, South Africa. RAINEY, P. B., AND M. TRAVISANO . 1998. Adaptive radiation in a heterogeneous environment. Nature 394:69– 72. RILEY, M., AND A. ANILONIS . 1978. Evolution of the bacterial genome. Annu. Rev. Microbiol. 32:519–560. ROS SMAN, M. G., D. MORAS , AND K. W. OLSEN. 1974. Chemical and biological evolution of a nucleotidebinding protein. Nature 250:194–199. SELANDER , R. K., J. LI , AND K. NELSON. 1996. Evolutionary genetics of Salmonella enterica. Pages 2691–2707 in Escherichia coli and Salmonella typhimurium: Cellular and molecular biology, 2nd edition (F. C. Neidhardt, R. Curtiss III, J. L. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, and H. E. Umbarger, eds.). American Society for Microbiology, Washington, D.C. SHARP, P. M., M. AVEROF, A. T. LLOYD, G. MATAS SI , AND J. F. PEDEN. 1995. DNA sequence evolution: The sounds of silence. Philos. Trans. R. Soc. London B Biol. Sci. 349:241–247. SHARP, P. M., AND W.-H. LI . 1986. Codon usage in regulatory genes in Escherichia coli does not reect selection for ‘rare’ codons. Nucleic. Acids Res. 14:7737–7749. SHARP, P. M., AND W.-H. LI . 1987a. The codon adaptation index—a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15:1281–1295. SHARP, P. M., AND W.-H. LI . 1987b. The rate of synonymous substitution in enterobacterial genes is inversely related to codon usage bias. Mol. Biol. Evol. 4:222– 230. SNEL, B., P. BORK , AND M. HUYNEN. 1999. Genome phylogeny based on gene content. Nat. Genet. 21:108– 110. SOKURENKO , E. V., V. CHESNOKOVA, D. E. DYKHUIZEN, I. OFEK , X. R. WU, K. A. K ROGFELT, C. STR UVE, M. A. SCHEMBR I , AND D. L. HASTY. 1998. Pathogenic adaptation of Escherichia coli by natural variation of the FimH adhesin. Proc. Natl. Acad. Sci. USA 95:8922– 8926. STOLTZFUS , A. 1999. 
On the possibility of constructive neutral evolution. J. Mol. Evol. 49:169–181. SUEOKA, N. 1962. On the genetic basis of variation and heterogeneity in base composition. Proc. Natl. Acad. Sci. USA 48:582–592. SUEOKA, N. 1988. Directional mutation pressure and neutral molecular evolution. Proc. Natl. Acad. Sci. USA 85:2653–2657. SUEOKA, N. 1992. Directional mutation pressure, selective constraints, and genetic equilibria. J. Mol. Evol. 34:95–114. SUEOKA, N. 1993. Directional mutation pressure, mutator mutations, and dynamics of molecular evolution. J. Mol. Evol. 37:137–153. SYVANEN, M., AND C. I. KADO (eds.). 1998. Horizontal gene transfer. Chapman & Hall, London. TORSVIK , V., J. GOKS øYR , AND F. L. DAAE. 1990. High diversity of DNA in soil bacteria. Appl. Environ. Microbiol. 56:776–781. VAN VALEN, L. 1976. Ecological species, multispecies, and oaks. Taxon 25:223–239. 496 S YSTEMATIC BIOLOGY VRBA, E. S. (ed.). 1985. Species and Sspeciation. Transvaal Museum, Pretoria, South Africa. VULIC, M., F. DIONISIO , F. TAD DEI , AND M. RADMAN. 1997. Molecular keys to speciation: DNA polymorphism and the control of genetic exchange in Enterobacteria. Proc. Natl. Acad. Sci. USA 94:9763– 9767. VULIC, M., R. E. LENSKI , AND M. RADMAN. 1999. Mutation, recombination, and incipient speciation of bacteria in the laboratory. Proc. Natl. Acad. Sci. USA 96:7348–7351. WALDOR , M. K. 1998. Bacteriophage biology and bacterial virulence. Trends Microbiol. 6:295–297. WHITTAM , T. S., AND S. AKE. 1992. Genetic polymorphisms and recombination in natural populations of Escherichia coli. Pages 223–246 in Mechanisms of molecular evolution (N. Takahata and A. G. Clark, eds.). Japan Scientic Society Press, Tokyo. WILEY, E. O. 1978. The evolutionary species concept reconsidered. Syst. Zool. 27:17–26. VOL. 50 WILKS , H. M., K. W. HART , R. FEENEY, C. R. DUNN, H. MUIRHEAD , W. N. CHIA , D. A. BARSTOW, T. ATKINSO N, A. R. CLARKE, AND J. J. HOLBROOK. 1988. 
A specic, highly active malate dehydrogenase by redesign of a lactate dehydrogenase framework. Science 242:1541 –1544. WOESE, C. R., O. KAND LER , AND M. L. WHEELIS . 1990. Towards a natural system of organisms: Proposal for the domains archaea, bacteria, and eucarya. Proc. Natl. Acad. Sci. USA 87:4576–4579. WU, G., A. FISER , B. TER K UILE, S. ALI , AND M. MÜLLER . 1999. Convergent evolution of Trichomonas vaginalis lactate dehydrogenase from malate dehydrogenase. Proc. Natl. Acad. Sci. USA 96:6285–6290. ZAWADZKI, P., M. S. ROBERTS , AND F. M. COHAN. 1995. The log-linear relationship between sexual isolation and sequence divergence in Bacillus transformation is robust. Genetics 140:917–932. Received 14 April 2000; accepted 19 June 2000 Associate Editor: M. Kane