PART V Microbial Evolution and Diversity

This material cannot be copied, disseminated, or used in any way without the express written permission of the publisher. Copyright 2007 Sinauer Associates Inc.

The objectives of this chapter are to:
- Provide information on how bacteria are named and what is meant by a validly named species.
- Discuss the classification of Bacteria and Archaea and the recent move toward an evolutionarily based, phylogenetic classification.
- Describe the ways in which the Bacteria and Archaea are identified in the laboratory.

17 Taxonomy of Bacteria and Archaea

It’s just astounding to see how constant, how conserved, certain sequence motifs – proteins, genes – have been over enormous expanses of time. You can see sequence patterns that have persisted probably for over three billion years. That’s far longer than mountain ranges last, than continents retain their shape.
Carl Woese, 1997 (in Perry and Staley, Microbiology)

This part of the book discusses the variety of microorganisms that exist on Earth and what is known about their characteristics and evolution. Most of the material pertains to the Bacteria and Archaea because there is a special chapter dedicated to eukaryotic microorganisms. Therefore, this first chapter discusses how the Bacteria and Archaea are named and classified and is followed by several chapters (Chapters 18–22) that discuss the properties and diversity of the Bacteria and Archaea.

When scientists encounter a large number of related items, such as the chemical elements, plants, or animals, they characterize, name, and organize them into groups. Thousands of species of plants, animals, and bacteria have been named, and many more will be named in the future as more are discovered. Not even the most brilliant biologist knows all of the species.
Organizing the species into groups of similar types aids the scientist not only in remembering them but also in comparing them to their closest relatives, some of which the scientist would know very well. In addition, biologists are interested in evolution, because this is the process through which organisms became diverse. Unraveling the route of evolution leads to an understanding of how one species is related to another. As discussed in subsequent text of this chapter, evolutionary relationships are assessed by molecular phylogeny, the analysis of gene and protein sequences to determine the relatedness among organisms.

To date, approximately 5,000 bacterial and archaeal species have been named and, based on their characteristics, placed within the existing framework of other known species. The branch of bacteriology that is responsible for characterizing and naming organisms and organizing them into groups is called taxonomy or systematics. Taxonomy can be separated into three major areas of activity. One is nomenclature, which is the naming of bacteria. The second is classification, which entails the ordering of bacteria into groups based on common properties. In identification, the third area, an unknown bacterium, for example, from a clinical or soil sample, is characterized to determine its species. This chapter covers all three of these areas.

17.1 Nomenclature

Bacteriologists throughout the world have agreed on a set of rules for naming Bacteria and Archaea. These rules, called the “International Code for the Nomenclature of Bacteria” (1992), state what a scientist must do to describe a new species or other taxon (taxa, pl.), which is a unit of classification, such as a species, genus, or family.
Each bacterium is placed in a genus and given a species name in the same manner as are plants and animals. For example, humans are Homo sapiens (genus name first, followed by species), and a common intestinal bacterium is named Escherichia coli. This binomial system of names follows that proposed for plants and animals by the Swedish taxonomist Carl von Linné (Linnaeus; 1707–1778). According to the rules of bacterial nomenclature, the root for the name of a species or other taxon can be derived from any language, but it must be given a Latin ending so that the genus and species names agree in gender. For example, consider the species name Staphylococcus aureus. The first letter in the genus name is capitalized, the species name is lowercase, and they are both italicized to indicate that they are Latinized. When writing species names in longhand, as for a laboratory notebook, they should be underlined to denote that they are italicized. The genus name Staphylococcus is derived from the Greek Staphyl from staphyle, which means a “bunch of grapes,” and coccus, from the Greek, meaning “a berry.” The o (“oh”) between the two words is a joining vowel used to connect two Greek words together. The figurative meaning of the genus name is “a cluster of cocci,” which describes the overall morphology of members of the genus. The species name aureus is from the Latin and means “golden,” the pigmentation of members of this species. The -us ending of the genus and species names is the Latin masculine ending for a noun (Staphylococcus in this case) and its adjective (aureus). 
Successively higher taxonomic categories are family, order, class, phylum, and domain (Table 17.1).

TABLE 17.1 Hierarchical classification of the bacterium Spirochaeta plicatilis

Taxon     Name
Domain    Bacteria
Phylum    Spirochaetes (vernacular name: spirochetes)
Class     Spirochaetes
Order     Spirochaetales
Family    Spirochaetaceae
Genus     Spirochaeta
Species   plicatilis

The International Journal of Systematic and Evolutionary Microbiology (IJSEM) is a journal devoted to the taxonomy of bacteria that is published by the Society for General Microbiology. IJSEM publishes papers that describe and name new bacterial taxa and contains an updated listing of all new bacteria whose names have been validly published. Thus, although bacterial species may be described in other scientific journals, they are not considered validly published until they have been included on a validation list in IJSEM.

IJSEM also provides a forum to debate specific controversies in nomenclature by allowing a scientist to challenge the current nomenclature of an organism or group of organisms. Such challenges, if accepted by peer review, are then published as a question in the IJSEM. The question is then evaluated by the Judicial Commission of the International Union of Microbiological Societies, which subsequently publishes a ruling in the journal.

One typical example of a problem considered by the Judicial Commission was the question about Yersinia pestis, the causative agent of bubonic plague. Scientific evidence indicates that Y. pestis is really just a subspecies of Yersinia pseudotuberculosis, a species name that has precedence over Y. pestis because of its earlier publication. Because of the potential confusion and possible public health issues that could arise by renaming Y. pestis as Y. pseudotuberculosis subspecies pestis, the Judicial Commission ruled against renaming the bacterium despite the scientific justification for doing so.

SECTION HIGHLIGHTS
Nomenclature is concerned with naming organisms.
For Bacteria and Archaea, specific rules must be followed in order to name and describe new species. Organisms that are placed on the approved or validated lists are officially recognized species.

17.2 Classification

Classification is that part of taxonomy concerned with the grouping of bacteria into taxa based on common characteristics. The earliest classifications did not consider microorganisms. There were two kingdoms of life: Plants and Animals. In 1868, Ernst Haeckel, a German scientist, proposed a third kingdom specifically for microorganisms. Approximately a century later, in 1969, Robert Whittaker proposed a five-kingdom system of classification. His classification included Plants (Plantae), Animals (Animalia), Fungi, Protista, and Monera. In this system, the eukaryotic microorganisms were placed in the Protista kingdom and the fungi had their own special kingdom. The bacteria and archaea were placed, as prokaryotes, in the kingdom Monera. Organisms were separated from one another on the basis of nutrition and cell structure. Therefore, plants are photosynthetic eukaryotes, fungi are heterotrophs that use dissolved nutrients, and animals are heterotrophs that ingest their food.

The five-kingdom classification remained popular until recently. Then, in 1990, Carl Woese and colleagues proposed an entirely new classification, the Tree of Life (see Chapter 1). Unlike all previous classifications, this new classification uses a molecular phylogenetic approach. The Tree of Life is based on the sequence analysis of a common macromolecule that all organisms share, the RNA in the small subunit of the ribosome (see Chapter 4). This RNA was used to separate all organisms on Earth into three different domains, the Bacteria, the Archaea, and the Eukarya.
Viruses, which are not organisms because they are not cellular (see Chapter 1), cannot be classified by this system.

Classification systems can be either artificial or natural. Artificial systems of classification are based on expressed characteristics of the organisms, or the phenotype of the organism. In contrast, natural or phylogenetic systems are based on the purported evolution of the organism. Until recently, all bacterial classifications were artificial because there was no meaningful basis for determining their evolution. In contrast, plants and animals have a fairly extensive fossil record on which to base an evolutionary classification system. Although fossils of microorganisms do exist (see Chapter 1), the simple structures of microorganisms do not permit their assignment to a taxonomic group by morphological criteria. Furthermore, the stable morphology and ontogeny of plants and animals have been useful in developing natural systems of classification. This type of information is rarely available for microorganisms.

When considering bacterial classification, it is important to keep in mind that bacteria have been evolving on Earth for the past 3.5 to 4 billion years. Therefore, it should not seem surprising that two separate domains of prokaryotic organisms exist (the Bacteria and the Archaea) versus only one domain for eukaryotes, Eukarya. In addition, the Eukarya may have evolved more recently as the result of symbiotic events between different early prokaryotic forms of life (see Chapter 1). Because of the long period of evolution of bacteria and archaea, the various groups within these domains exhibit considerable diversity, particularly metabolic and physiological. In contrast, the metabolic diversity of the Eukarya is limited, especially with respect to energy generation. The vast diversity of metabolic types of prokaryotes is discussed more fully in Chapter 5 and in Chapters 18–22.
This chapter first discusses the traditional system of classification and then covers what is being done to make it phylogenetic.

Artificial versus Phylogenetic Classifications

Conventional artificial taxonomy uses phenotypic tests to determine differences between strains and species. These tests are typically weighted so that characteristics that are considered to be more important are given higher priority. For example, in traditional taxonomy, the Gram stain has been given more weight in determining the classification of an organism than whether the organism uses glucose as a carbon source. Therefore, all gram-positive strains would be ascribed to one family or genus, and, within that group, certain species or strains would use glucose and others would not. Most bacteriologists favor a phylogenetic system for the classification of bacteria, and with the advent of molecular phylogeny, this hope is now being realized.

The current accepted treatise that contains a complete listing of prokaryotic species and their classification is Bergey’s Manual of Systematic Bacteriology (2001–2008), published by Springer, and its more condensed edition, Bergey’s Manual of Determinative Bacteriology (1994). In addition to containing a complete classification of bacteria and archaea, the more comprehensive version of Bergey’s Manual contains a description of all known validly described bacterial species. Therefore, it is the “encyclopedia” of the bacteria and archaea that is widely used by bacteriologists (Box 17.1).

Bergey’s Manual of Systematic Bacteriology is now in its second edition. The first edition was based on an artificial classification because too little phylogenetic information was available. However, the second edition is based on the Tree of Life that uses a phylogenetic framework as discussed in subsequent text of this chapter.
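The weighting of characters in conventional artificial taxonomy can be illustrated with a short sketch. It assumes that "weight" simply means the order in which characters are used to split the strains (Gram stain first, glucose use second, as in the example in the text); the strain names, test results, and the function `classify` are hypothetical and are not taken from the chapter.

```python
# Hypothetical strains and test results, for illustration only.
STRAINS = {
    "strain_1": {"gram_positive": True, "uses_glucose": True},
    "strain_2": {"gram_positive": True, "uses_glucose": False},
    "strain_3": {"gram_positive": False, "uses_glucose": True},
    "strain_4": {"gram_positive": True, "uses_glucose": True},
}

# Characters listed from highest to lowest weight (an assumed ordering).
PRIORITY = ["gram_positive", "uses_glucose"]


def classify(strains, priority):
    """Split strains on the highest-weight character first, then recurse."""
    if not priority:
        return sorted(strains)
    first, rest = priority[0], priority[1:]
    branches = {}
    for name, traits in strains.items():
        # Strains with the same result for this character stay together.
        branches.setdefault(traits[first], {})[name] = traits
    return {value: classify(members, rest) for value, members in branches.items()}


tree = classify(STRAINS, PRIORITY)
print(tree[True])  # gram-positive branch, subdivided by glucose use
```

Reordering `PRIORITY` changes which character defines the top-level groups, which is the sense in which such a scheme is artificial: the grouping depends on the taxonomist's choice of weights and not on evolutionary relatedness.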
BOX 17.1 Milestones
Bergey’s Manual Trust

David Bergey was a professor of bacteriology at the University of Pennsylvania in the early 1900s. As a taxonomist he was a member of a committee of the Society of American Bacteriologists (SAB, now called the American Society for Microbiology), which was interested in formulating a classification of the bacteria that could be used for identification of species. In 1923, he and four others published the first edition of Bergey’s Manual of Determinative Bacteriology. This was followed by new editions every few years. Royalties collected by the publication activities of the committee were held in SAB. When David Bergey and his co-editor, Robert Breed, requested money from the account to be used for preparation of the fifth edition, the leadership in SAB refused. After a long and bitter fight, the SAB relented and turned the total proceeds over to Bergey, who promptly put the money into a nonprofit trust with a board of trustees to oversee the publication of manuals on bacterial systematics. The Trust, now named in his honor as Bergey’s Manual Trust, is responsible for the publication of Bergey’s Manual of Determinative Bacteriology, which is now in its ninth edition, as well as other taxonomic books such as Bergey’s Manual of Systematic Bacteriology. The Trust is headquartered at the University of Georgia and has a nine-member international board of trustees, as well as associate members from many countries.

Phenotypic Properties and Artificial Classifications

Phenotypic properties are those that are expressed by an organism, and they have always played a major role in microbial taxonomy. Indeed, early classifications had to be based entirely on phenotypic properties because they were the only properties that could be studied.
In classifications, it is important to select phenotypic characteristics that allow one to group organisms together and others that enable one to distinguish organisms from one another. Furthermore, it is important that the identifying characteristics selected be easy to determine.

Two examples of simple phenotypic characteristics that have been widely used in artificial classification schemes are the Gram stain and cell shape. It turns out that each of these has some utility in classifications. For example, the Gram stain tells something about the nature of the cell wall (see Chapter 4). Furthermore, the Gram stain happens to be important phylogenetically because two of the phylogenetic groups of Bacteria are gram-positive (i.e., Firmicutes and Actinobacteria) and the other 20 or more are gram-negative.

However, there are some drawbacks to phenotypic properties. For example, it is noteworthy that some species of gram-positive bacteria stain as gram-negative bacteria. Likewise, the mycoplasmas, which stain as gram-negative organisms, have been found to be members of the Firmicutes through 16S rRNA analyses (see Chapter 20). The reason the mycoplasmas stain as gram-negative is that they lack cell walls altogether. Therefore, they have apparently evolved from a group of gram-positive bacteria that lost their peptidoglycan wall during evolution. In contrast to the gram-positive bacteria, gram-negative bacteria and archaea fall into many different phylogenetic groups, including peptidoglycan-containing types and non–peptidoglycan-containing types that are bacteria as well as archaea. Therefore, gram-negative organisms are very diverse phylogenetically.

[Photo: David Bergey. Courtesy of the National Library of Medicine.]

At one time, some bacteriologists proposed that the simplest, and purportedly the most stable, cell shape (the sphere) must have been the shape of the earliest bacteria.
They then developed an evolutionary scheme based on this theory, in which all of the coccus-shaped bacteria were included in the same phylogenetic group. The validity of this classification has not been borne out by molecular phylogeny. For example, there are both gram-negative and gram-positive cocci. Some cocci are photosynthetic Proteobacteria, others are nonphotosynthetic, some are highly resistant to ultraviolet (UV) light (Deinococcus), some are Archaea, and others are Bacteria.

Nonetheless, cell shape is important for some groups. For example, the spirochetes phylum contains all of the helically shaped bacteria with periplasmic flagella (see Table 17.1 and Chapter 22). Because of their morphology, these bacteria were correctly classified with one another in the order Spirochaetales and now in the phylum Spirochaetes long before phylogenetic data confirmed the grouping. Apart from this group, overall cell shape has little meaning at higher taxonomic levels. Nonetheless, it can still be significant at the species, genus, and even family levels.

Other phenotypic properties have also proved useful in both artificial and phylogenetic classifications. For example, because of their unique ability to produce methane gas, the methanogenic microorganisms have always been classified together in artificial classifications. Likewise, from a phylogenetic standpoint, all the methanogenic organisms are members of the Euryarchaeota of the Archaea. Of course, phenotypic properties have a special significance not found in gene sequence analyses in that these features provide information about what the organism is capable of doing.
One cannot directly conclude from a sequence that an organism is or is not a methanogen, for example, unless that particular feature has been tested for and determined. Thus, phenotypic tests provide valuable information about the capabilities of the organism that may help explain its role in the environment in which it lives.

Numerical Taxonomy

When a large number of similar bacteria are being compared, computers are very useful in the analysis of the data. This aspect of taxonomy, which has been used in artificial classifications, is referred to as numerical taxonomy. Numerical taxonomy is most useful at the species and strain level, where phylogenetic relatedness has already been established by ribosomal RNA (rRNA) sequencing and DNA–DNA reassociation.

In numerical taxonomy all characteristics are given equal weight. Therefore, metabolism of a particular carbon source is considered to be as important as the Gram stain or the presence of a flagellum. In characterizing strains in this manner, a large number of characteristics are determined, and the similarity between strains is then compared by a similarity coefficient. Each strain is compared with every other strain. The similarity coefficient, $S_{AB}$, between two strains, A and B, is defined as follows:

$$S_{AB} = \frac{a}{a + b + c}$$

where a represents the number of properties shared in common by strains A and B (that is, positive for both); b represents the number of properties positive for A and negative for B; and c represents the number of properties positive for B and negative for A. Characteristics for which both strains A and B are negative are considered irrelevant because there would be many such features that would have no bearing on their similarity. For example, endospore formation is an uncommon characteristic for bacteria. It is not significant to incorporate this characteristic when comparing two species within a genus that do not produce endospores.
It would, however, be of value in comparing endospore-forming organisms to closely related organisms. It should be noted that the similarity coefficient can be used to relate not only phenotypic features of one organism to another, but also to relate the sequence similarity of macromolecules of different organisms.

In numerical taxonomy, it is best to have as many tests of phenotypic characters as possible. Typically at least 50 independent characters are used, and many strains are usually compared simultaneously. Ideally, each characteristic should represent a single and separate gene. The same gene should not be assessed more than once, and, therefore, overlapping phenotypic tests must be avoided. Typically, $S_{AB}$ values are greater than or equal to 70% within a species and greater than 50% within a genus. An example of a numerical analysis is shown in Box 17.2.

In artificial classifications, bacteria are grouped into a hierarchy based on phenotypic properties. For example, the chemoautotrophic bacteria that obtain energy from the oxidation of inorganic nitrogen compounds, such as ammonia and nitrite, would be classified in the group of nitrifying bacteria. This group would be regarded as an order. One subgroup, termed the family, would contain ammonia oxidizers and another would contain nitrite oxidizers. Within each of these groups, features such as cell shape would be further used to define differences among the various genera and species. However, this artificial classification does not take into account the evolutionary relatedness among the members of the nitrifying bacteria.

Phylogenetic Classification and Molecular Phylogeny

An important article published by Zuckerkandl and Pauling in 1962 suggested that the evolution of organisms might be recorded in the sequences of their macromolecules. Subsequent research in the late 1960s and 1970s has supported this concept, resulting in a revolution in bacterial taxonomy.
In particular, molecules such as rRNA and some proteins have changed at a very slow rate during evolution; therefore, their sequences provide important clues to the relatedness among the various bacterial taxa and their relatedness to plants and animals as well. The result is that a major breakthrough has occurred in the classification of prokaryotic organisms, which has rapidly become phylogenetic through the study of molecular phylogeny as discussed in subsequent text. The fact that the two domains Bacteria and Archaea exist was not at all appreciated until molecular phylogenetic studies were performed. Bacteriologists are now trying to determine when the split occurred between the Bacteria, Archaea, and Eukarya and what the nature was of their last common ancestor.

By the 1970s, data had accumulated indicating that a true phylogenetic classification of bacteria was possible. What made it possible was, first of all, acceptance of the evidence that some of the macromolecules of organisms were highly conserved, that is, changed very slowly during evolution, and that their sequences held the key to unlocking the relatedness of bacteria to one another and to plants and animals. Second, sequencing techniques were developed and improved so that it became easy to conduct sequence analyses of rRNA and other macromolecules. In this section, emphasis will be given to rRNA sequencing, in particular the RNA of the small subunit of the ribosome (see Chapter 4), 16S rRNA (or 18S rRNA of Eukarya), as it is the most common conserved molecule used to study the phylogeny of microorganisms (Figure 17.1).

BOX 17.2 Methods & Techniques
Determination of Similarity Coefficients

In this example, eight strains, A through H, are compared to one another by ten phenotypic tests. The results are shown in the first table:

Table 1 Results of phenotypic tests for eight strains, A–H

        Strains tested
Test    A   B   C   D   E   F   G   H
1       +   –   +   +   +   +   –   +
2       –   +   +   –   +   –   +   +
3       +   +   –   +   –   +   +   +
4       +   –   –   –   +   +   –   +
5       +   –   –   +   +   +   –   –
6       +   +   –   +   –   –   +   –
7       –   +   +   +   +   –   –   +
8       +   –   –   +   –   +   +   –
9       +   –   +   +   +   +   –   +
10      –   +   +   –   +   –   +   +

Similarity coefficients are then determined by comparing the results of the tests for each of the strains against one another, using the formula given in the text. The results are shown in Table 2.

Table 2 Similarity coefficients (× 100 to give percent similarity) for the eight strains

Strains   A     B     C     D     E     F     G     H
A         100
B         20    100
C         20    43    100
D         75    33    33    100
E         40    38    71    40    100
F         86    10    22    63    44    100
G         33    67    25    33    20    22    100
H         40    50    71    40    75    44    33    100

The information from this matrix is then used to group the strains into similar types, as shown in the following matrix:

Table 3 Similarity matrix of grouped strains

Strain    A     F     D     E     H     C     G     B
A         100
F         86    100
D         75    63    100
E         40    40    40    100
H         40    44    40    75    100
C         20    22    33    71    71    100
G         33    22    33    20    33    25    100
B         20    10    33    38    50    43    67    100

According to these tests, strains A, F, and D are very similar to one another and probably compose a single species. Likewise, E, C, and H are very similar and appear to be a separate species. Strains B and G may also be a different species, although more tests should probably be performed to substantiate this. As mentioned earlier in the chapter, phenotypic data such as this can provide an indication of relatedness at the species level, but if new species are being described, DNA–DNA reassociation tests should be performed.
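The similarity coefficient from the numerical taxonomy discussion can be checked against the data of Box 17.2 with a short sketch. It assumes that the flattened Table 1 is read column by column (one column of ten test results per strain); under that reading the computed values agree with the printed Table 2 for every pair except B–E, where the formula gives 33 and the printed value is 38. The grouping step uses single-linkage at the 70% species threshold quoted in the text, which is an assumption made here for illustration and not a method stated in the chapter. The names `similarity`, `percent`, and `species_groups` are hypothetical helpers.

```python
from itertools import combinations

# Box 17.2, Table 1: results of tests 1-10 for each strain, in test order.
RESULTS = {
    "A": "+-++++-++-",
    "B": "-++--++--+",
    "C": "++----+-++",
    "D": "+-+-+++++-",
    "E": "++-++-+-++",
    "F": "+-+++--++-",
    "G": "-++--+-+-+",
    "H": "++++--+-++",
}


def similarity(x, y):
    """S_AB = a / (a + b + c); tests negative for both strains are ignored."""
    a = sum(p == "+" and q == "+" for p, q in zip(x, y))  # positive in both
    b = sum(p == "+" and q == "-" for p, q in zip(x, y))  # positive in x only
    c = sum(p == "-" and q == "+" for p, q in zip(x, y))  # positive in y only
    return a / (a + b + c)  # undefined if neither strain has any positive test


def percent(value):
    """Percent similarity, rounding halves up (62.5 becomes 63)."""
    return int(value * 100 + 0.5)


# Percent similarity for every unordered pair of strains.
pct = {
    frozenset(pair): percent(similarity(RESULTS[pair[0]], RESULTS[pair[1]]))
    for pair in combinations(RESULTS, 2)
}


def species_groups(strains, pct, threshold=70):
    """Single-linkage grouping: join a strain to any group holding a close match."""
    groups = []
    for s in strains:
        linked = [g for g in groups
                  if any(pct[frozenset((s, t))] >= threshold for t in g)]
        merged = {s}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


groups = species_groups(RESULTS, pct)
print(sorted(sorted(g) for g in groups))
# [['A', 'D', 'F'], ['B'], ['C', 'E', 'H'], ['G']]
```

The two three-strain groups match the box's conclusion that A, F, and D form one probable species and E, C, and H another. Strains B and G share 67% similarity, just under the 70% threshold, which is consistent with the box's caution that more tests would be needed to decide their status.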
In actual practice, the 16S rRNA gene, or 16S rDNA, is sequenced because the polymerase chain reaction (PCR) procedures are simple, and both strands of the DNA can be used to confirm the actual sequence.

Several reasons justify the choice of rRNA as an evolutionary marker. First, the ribosome is found in all cellular organisms (see Chapter 4) and therefore allows one to compare all organisms. Second, the function of the ribosome as the structure responsible for protein synthesis holds true for all classes of life. Therefore, it is possible to compare the phylogeny of all organisms by analysis of a single structure with an important cellular function. The third advantage of using the ribosome in phylogeny is that ribosomes are highly conserved and therefore have changed very slowly over many millions of years. This is true because the ribosome is a very complex structure that carries out a specific function, protein synthesis (see Chapter 11). Keep in mind that the 16S rRNA is interacting in a three-dimensional structure with protein and other rRNA molecules as well as messenger RNA (mRNA). Therefore, a high rate of evolutionary change in ribosome structure has been selected against during evolution. Mutant organisms with dysfunctional ribosomes would be unable to compete with existing types and have therefore not survived. In this manner, evolution has selected against major changes in the ribosome. Nonetheless, incremental modifications have occurred over the billions of years of biological evolution and these differences are used to construct evolutionary trees.

Ribosomal RNAs are not the only macromolecules that have been considered in determining relatedness at higher taxonomic levels. Proteins such as cytochrome c and ribulose bisphosphate carboxylase are just two examples of other molecules that have been used. However, not all organisms synthesize these macromolecules.
Furthermore, not all cytochrome c–like molecules found in different organisms have the same physiological function. Because these other molecules are not universally distributed among all organisms, they cannot be used to compare distantly related taxa. However, the sequences of highly conserved proteins, such as cytochromes and nitrogenase, are very useful in constructing phylogenies of their origin and evolution within disparate microbial groups. It should be noted that because proteins are difficult and expensive to sequence, their sequences are normally deduced from their gene sequences. Within the biological world, ribosomes share many similarities, indicating the conservative nature of the structure. Prokaryotic ribosomes contain three types of RNA: 5S, 16S, and 23S. Both 5S and 16S rRNA have been used to determine relatedness among organisms. Because the 16S molecule is larger (with about 1,500 bases), it contains more information (see Figure 17.1) than the smaller 5S molecule with only about 120 bases (Figure 17.2). Less work has been done on the 23S molecule because it is longer (about 3,000 nucleotides) and therefore not as easy to study. Therefore, scientists interested in the classification and evolution of bacteria have concentrated on 16S rRNA. The method of evaluation that provides the most information is sequence determination, especially for the complementary DNA of 16S rRNA, that is, the double-stranded 16S rRNA gene or 16S rDNA. It has been found that some regions of these molecules are more highly conserved than others. The more highly conserved regions permit the comparison of distantly related organisms (Figure 17.3), and the more variable domains are used for comparing more closely related organisms. Figure 17.3 shows the “stem-loop” secondary structure of a model 16S rRNA that allows for comparison among all organisms. The stem and loop portions, designated by the blue color, contain the most highly variable regions. 
For example, the blue area designated with a red star has been expanded for three species (E. coli, a bacterium; Methanococcus vannielii, an archaeon; and Saccharomyces cerevisiae, a eukaryote) to illustrate the variation among these three species. The regions that are unique to a given taxon are termed signature sequences. They can be used for the design of specific hybridization probes for identification (see later section on Identification).

It appears that an analysis of 16S rDNA sequences provides important information on the evolution of prokaryotes. However, before concluding that the se-

[Figure 17.1 16S rRNA. Secondary structure of the 16S rRNA molecule from the small subunit of the ribosome from the bacterium Escherichia coli. Key: canonical base pair (AU, GC); GU base pair; GA base pair; non-canonical base pair.]
The bases are numbered from 1 at the 5’ end to 1,542 at the 3’ end. Every tenth nucleotide is marked with a tick mark, and every fiftieth nucleotide is numbered. Tertiary interactions with strong comparative data are connected by solid lines. From the Comparative RNA Web Site, www.rna.icmb.utexas.edu (courtesy of Robin Gutell). This material cannot be copied, disseminated, or used in any way without the express written permission of the publisher. Copyright 2007 Sinauer Associates Inc. E. coli 5′ 3′ 10 G UGCCUGGCGGCC 40 H. sapiens CAUG C C C C C G A A UCA G U C A 30 C C G A 50 GA CCA U C U G G A G A C A U G 60 20 G G C C C G G U C UA G GA C G 70 C A C GG A C C G U C A AGG C G G A 120 110 A U U G G A C G U A A 100 G G 80 C U G G U U A G C G C G C G C U 90 U C CG UC U U C G U A A U UC G C C C G C GA G C G GA C C U A A A G A CG U A C G C C G G A U A C AU C G C G C GUCU A CGGC G C U C G G A U G U C G UG C U G U G G C G C UU A A U G G — C A U A A G C G U G U G U G A U G C G C UA U U G G C G C G A A G 493 Taxonomy of Bacteria and Archaea Figure 17.2 5S rRNA Comparison of the secondary structures of 5S rRNA molecules from the bacterium Escherichia coli and the eukaryote Homo sapiens. Note how similar the overall secondary structure is between these two molecules even though they represent many millions of years of divergence from one another. 5′ 3′ Canonical base pair (AU, GC) GU base pair GA base pair Non-canonical base pair (UU, AC) (A) (B) A U AC A C U U U UG U A G A C A U U CA U G U A G G A G CGU U G A A G A C G C 450 A C G G G G C G A C G G A A A C G C G U A G U C UAA A U G A G GGUU GUA C C U U A U G U GG C U CCGG A G C AGAA G C G Identical in 98% or more of all organisms Conserved only in the Bacteria Conserved only in the Archaea Conserved only in the Eukarya Conserved within each domain, variable among domains Regions that vary structurally among domains E. 
coli A A CA A G C U U GU U G G A A A G C G U A A A U U C U A G U AG C U A GCA UGGGC C A C G U A C U C G U GA A C U U G M. vannielii A A UA A C U C U GU A UA A A CA C G U A G G A G A G A U G C A A U A A U G A CGGGUCUUGUA UUG S. cerevisiae 5' U U C A A A CCCGGGA CA U A GCAA UA A A U 3' G Figure 17.3 Conservation and variation in small subunit rRNA (A) This diagram shows conserved and variable regions of the small subunit rRNA (16S in prokaryotes or 18S in eukaryotes). Each dot and triangle represents a position that holds a nucleotide in 95% of all organisms sequenced, although the actual nucleotide present (A, U, C, or G) varies among species. (B) The starred region from (A) as it appears in a bacterium (Escherichia coli), an archaean (Methanococcus vannielii), and a eukaryote (Saccharomyces cerevisiae). This region includes important signature sequences for the Bacteria and Archaea. Figure by Jamie Cannone, courtesy of Robin Gutell; data from the Comparative RNA Web Site: www.rna.icmb.utexas.edu. This material cannot be copied, disseminated, or used in any way without the express written permission of the publisher. Copyright 2007 Sinauer Associates Inc. 494 Chapter Seventeen quence of bases in rDNA accurately reflects the phylogeny of organisms, it is important to find totally separate and independent evolutionary markers to confirm the classification. Some work has been performed with sequencing of ATP synthases, elongation factors, RNA polymerases, and other conserved macromolecules. The outcome of this research, which represents one of the most exciting areas of biology, is leading to the development of a complete phylogenetic classification of the Bacteria, Archaea, and other microorganisms. The 16S rDNA–based phylogeny will face its most stringent test as additional genomes are sequenced and compared. 
PHYLOGENETIC TREES Before we discuss the use of 16S rDNA sequences in arriving at the current phylogeny of Bacteria and Archaea, it is first necessary to consider phylogenetic trees. Like a family tree, a phylogenetic tree traces lines of descent. However, whereas a family tree traces the genealogy of a human family, a phylogenetic tree traces the lineages of a variety of different species. Thus, phylogenetic trees reflect the purported evolutionary relationships among a group of species, usually through the use of some molecular attribute they possess, such as the sequence of their rRNA. In this particular section we will discuss molecular phylogeny based on a comparison of the 16S rRNA sequences of organisms.

Phylogenetic trees have two features: branches and nodes (Figure 17.4). Each node represents an individual species. External nodes (usually drawn to the extreme right of the tree) represent living species, and internal nodes represent ancestors. A branch is a length that represents the distance between, or degree of separation of, the species (nodes).

Figure 17.4 Phylogenetic trees. Two different formats of phylogenetic trees used to show relatedness among genes or species. In both types of trees, internal nodes represent ancestor species, external nodes (here, A through E) represent extant, known species, and branch lengths represent the evolutionary distance or degree of relatedness between species.

Figure 17.5 Unrooted and rooted trees. Representations of the possible relatedness between three species: A, B, and C. (A) A single unrooted tree (shown in both formats; see Figure 17.4). (B) Three possible rooted trees (in one format).

Trees may be rooted or unrooted. Figure 17.5A shows an unrooted tree containing three different species: A, B, and C.
Unrooted trees typically compare one feature of a group of related organisms, such as the sequence of their 16S rRNA gene. There is only one shape to this particular tree with three species. In contrast, rooted trees provide more information. Typically, rooted trees use the same gene from a distantly related organism for the root. This allows one to compare the relatedness of the more closely related species (A, B, and C) to one another. A rooted tree containing three species has three possible shapes (Figure 17.5B). In the examples shown, A and B are more closely related to one another in the uppermost example, whereas A and C, and then B and C, are the more closely related pair in the two lower examples.

Alternatively, an additional gene can be used to root the tree. This approach relies on a gene that underwent a duplication event before the taxa being studied diverged from one another. As an example, a comparison of the elongation factors Tu and G (see Chapter 13) was used to root the 16S rRNA Tree of Life (see Figure 1.6).

For bacterial phylogeny, trees are constructed from information based on the sequence of subunits in macromolecules. As mentioned previously, rRNA, in particular the small subunit rRNA molecule, 16S rRNA, has been selected as the molecule of choice because of its conserved nature and length.

SEQUENCING 16S rDNA If one isolates a new bacterium and wishes to determine its phylogenetic position among the known bacteria, it is necessary to determine the sequence of its 16S rRNA gene. This can be accomplished in a number of ways. One of the most common ways is to use polymerase chain reaction (PCR) to amplify the 16S rDNA from genomic DNA from the bacterium (see Chapter 16).
The amplified 16S rDNA may be sequenced directly or ligated into a cloning vector and cloned into E. coli. This latter step allows a large quantity of 16S rDNA to be produced through growth prior to sequence analysis. The entire 16S rDNA can be sequenced using a standard set of oligonucleotide primers and standard sequencing techniques.

ALIGNMENT WITH KNOWN SEQUENCES The next step is to incorporate the determined linear DNA sequence into an alignment with the sequences of other known organisms. Two Internet databases are of special importance in this regard. One is called the Ribosome Database Project, located at Michigan State University. It contains the 16S rRNA sequences for those bacteria that have been sequenced. Sequences from this database can be retrieved electronically via the Internet (http://rdp.cme.msu.edu/). Another frequently used database for sequence analysis is the National Center for Biotechnology Information (NCBI) (www.ncbi.nlm.nih.gov/Genbank), which contains a comprehensive listing for 16S rRNA genes as well as other genes. By using a series of computer programs and careful manual examination, one can align and compare the sequence with those retrieved from the database.

PHYLOGENETIC ANALYSIS Having the sequence is only half of the story. It is next necessary to compare the sequence of the unknown bacterium to that of other bacteria from the database. The determination of the evolutionary relatedness among organisms can be accomplished by one of a number of phylogenetic methods. There are several types of analysis that can be used. Distance matrix methods are one type of approach. In distance matrix methods, the evolutionary distances, based on the number of nucleic acid or amino acid monomers that differ in a sequence, are determined among the strains being compared. A second approach is to use maximum parsimony methods.
In maximum parsimony, the goal is to find the simplest, or most parsimonious, phylogenetic tree that could explain the relatedness between different sequences. In both approaches, the sequences of nucleic acid subunits from different strains of bacteria, or from other genes, are compared.

Figure 17.6 illustrates the use of two different methods to create a tree from an aligned hypothetical sequence region of rRNA from four different strains (shown in Figure 17.6A). The first approach is a distance matrix method called the "unweighted pair group method with arithmetic mean," or UPGMA (see Figure 17.6B). This is one of the simplest analytical methods that can be used. Figure 17.6C shows an analysis of the same sequences by maximum parsimony.

In the UPGMA method, a distance matrix is set up to compare the differences in sequence, or "distance," $d$, between each of the four strains. There are four nucleotide differences between the sequence of organism a and the sequence of organism b; therefore, $d_{ab}$ is 4. Likewise, $d_{ac}$ is 5, $d_{bc}$ is 5, and so forth. In this instance, the shortest distance, 2, is between strains c and d. From these two strains, which show the closest relationship to one another, a simple tree is constructed that shows c and d connected by a node that is half the distance between the two, that is, 1 unit. This is expressed in the actual tree as a horizontal branch length of one unit from each of the organisms to a common ancestral node (see Figure 17.6B).

The next step is to construct another matrix in which c and d are considered as a single composite unit (cd) and compared with a and b. In this matrix, a differs from (cd) by $(d_{ac} + d_{ad})/2 = (5 + 6)/2 = 5.5$. Likewise, b differs from (cd) by $(d_{bc} + d_{bd})/2 = (5 + 4)/2 = 4.5$. The two most similar organisms in this matrix are therefore a and b ($d_{ab} = 4$), and the length of the branch joining them is $d_{ab}/2$, which is equal to 2 units. From this an intermediate second tree is formed.
To determine the connection between the composite branches cd and ab, the distance $d_{(cd)(ab)}$ is calculated as the average of the distances from (cd) to a and to b, that is, $(5.5 + 4.5)/2 = 5.0$. This value is then divided by 2, giving 2.5 units as the branch length from the common node to each of the two composites, cd and ab. Using this reasoning, a final tree (see Figure 17.6B) is produced showing the relationship among the four different strains. The UPGMA method is the simplest of the distance methods used. More sophisticated distance methods include transformed distance and neighbor-joining methods, which will not be discussed here.

Figure 17.6 Phylogenetic analysis. Phylogenetic analysis of four different strains (a, b, c, and d) using a hypothetical region of their 16S rRNA that contains nine bases (A). (B) The UPGMA method of determining a phylogenetic tree. (C) The maximum parsimony method (see text for details).

(A) 16S rRNA sequences. This table shows the sequence of a nine-base region of the 16S rRNAs of the four strains.

Organism              Site number
(strain)      1   2   3   4   5   6   7   8   9
a             G   C   G   G   A   C   A   A   A
b             G   A   C   G   C   C   A   A   G
c             G   A   A   A   U   C   U   A   A
d             G   A   A   A   G   C   U   A   G

(B) UPGMA method
1. Construct a distance matrix showing the relatedness (as distance, $d$) between strains to determine the most closely related strains, in this case c and d.

First matrix
        a                b                c
b       $d_{ab} = 4$
c       $d_{ac} = 5$     $d_{bc} = 5$
d       $d_{ad} = 6$     $d_{bd} = 4$     $d_{cd} = 2$

2. Diagram the relatedness between these strains (beginning tree: c and d joined at $d_{cd}/2 = 1$).
3. Construct a second matrix to assess the distance between the first paired group (cd) and the remaining strains (a and b).

Second matrix
        (cd)                  a
a       $d_{(cd)a} = 11/2$
b       $d_{(cd)b} = 9/2$     $d_{ab} = 4$

4. Since a and b are close to one another, determine and diagram their relatedness (second tree: a and b joined at $d_{ab}/2 = 2$).
5. Determine the relatedness between cd and a and b, and diagram this (final tree: cd and ab joined at $d_{(cd)(ab)}/2 = \frac{(11/2 + 9/2)/2}{2} = 2.5$).

As mentioned earlier, in maximum parsimony the goal is to identify the simplest tree that could explain the difference between two different sequences or species. This approach has its philosophical basis in Occam's razor, commonly used in the sciences, which states that the most likely solution to a problem is the simplest one. In this case, to explain the evolutionary difference between two species, one looks for the tree that has the fewest changes (mutational events) that could explain their differences. This is accomplished with a computer that, in theory, considers all possible trees and then identifies the simplest one (the one with the fewest assumed mutational events).

As for all phylogenetic analyses, the alignment of the sequences must be accurate. In maximum parsimony it is important to recognize sites in the sequences that are useful for a comparison between organisms. These are termed informative sites. These sites are then used to determine the most parsimonious tree. For example, Site 1 in the example given (see Figure 17.6C) is not informative because all the bases are identical. Site 2 is not informative either, because three of the strains have A and one has C, suggesting that a single mutational event has occurred. Site 3 is not informative, because Trees 1 and 2, which have two changes, are equally parsimonious, and Tree 3 differs from Tree 2 only in the inferred ancestor. Site 5 is not informative because all trees constructed from the information at this site require three mutations. In contrast, Site 4 is informative, because one tree (Tree 1) is more parsimonious than the other two.
Sites 6 and 8 are not informative because all bases are identical, but both Sites 7 and 9 are informative. Site 7 favors Tree 1, whereas Site 9 favors Tree 2. Thus, for this set of data, Tree 1 is favored two of three times, Tree 2 is favored one of three times, and Tree 3 is not favored at any time. Adding the changes at those three sites gives the following data: Tree 1 is the most parsimonious because a total of only four changes (1 + 1 + 2 = 4) would explain its phylogeny, whereas in Tree 2, five changes are required (2 + 2 + 1 = 5), and in Tree 3 six changes are required (2 + 2 + 2 = 6).

Figure 17.6 (continued) (C) Maximum parsimony method
1. Identify the informative sites in the sequences. Here, construction of the three possible trees for sites 3, 4, 5, 7, and 9 shows that only sites 4, 7, and 9 are informative, because one tree is more parsimonious at these sites than are the others. Tree 1 groups a with b and c with d, Tree 2 groups a with c and b with d, and Tree 3 groups a with d and b with c.
2. Determine the most parsimonious tree by analyzing each of the informative sites. Mutations (changes) in trees at informative sites: Tree 1, 1 + 1 + 2 = 4; Tree 2, 2 + 2 + 1 = 5; Tree 3, 2 + 2 + 2 = 6. Therefore, Tree 1 is the most parsimonious tree.
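The two worked examples can be checked with a short script. What follows is a minimal Python sketch under stated assumptions, not the procedure of any particular phylogenetics program: the sequences are transcribed from the table in Figure 17.6A, the names `distance`, `upgma`, `changes`, and `TREES` are hypothetical, and the parsimony count uses Fitch's set-intersection rule as a stand-in for the by-hand counting described in the text.

```python
from itertools import combinations

# Nine-base region for strains a-d, transcribed from the table in Figure 17.6A.
SEQS = {
    "a": "GCGGACAAA",
    "b": "GACGCCAAG",
    "c": "GAAAUCUAA",
    "d": "GAAAGCUAG",
}


def distance(s, t):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(s, t))


def upgma(seqs):
    """Return the merges as (group1, group2, node_height), in the order made."""
    groups = {name: [name] for name in seqs}  # group label -> member strains

    def average(p, q):
        # Mean distance over every pair of strains drawn from the two groups
        pairs = [(i, j) for i in groups[p] for j in groups[q]]
        return sum(distance(seqs[i], seqs[j]) for i, j in pairs) / len(pairs)

    merges = []
    while len(groups) > 1:
        p, q = min(combinations(sorted(groups), 2), key=lambda pq: average(*pq))
        merges.append((p, q, average(p, q) / 2))  # node sits at half the distance
        groups[p + q] = groups.pop(p) + groups.pop(q)
    return merges


# The three possible groupings of four strains (Trees 1-3 in the text).
TREES = {
    1: (("a", "b"), ("c", "d")),
    2: (("a", "c"), ("b", "d")),
    3: (("a", "d"), ("b", "c")),
}


def changes(site, tree):
    """Fewest base changes needed at one site (0-based) on a four-strain tree."""
    total = 0
    ancestors = []
    for x, y in tree:
        sx, sy = {SEQS[x][site]}, {SEQS[y][site]}
        if sx & sy:                      # pair shares a base: no change needed
            ancestors.append(sx & sy)
        else:                            # pair differs: one change, ancestor ambiguous
            ancestors.append(sx | sy)
            total += 1
    if not (ancestors[0] & ancestors[1]):  # the two ancestors cannot agree
        total += 1
    return total


# Changes required by Trees 1, 2, 3 at each site (sites numbered from 1).
per_site = {s + 1: [changes(s, t) for t in TREES.values()] for s in range(9)}
# A site is informative when the trees do not all need the same number of changes.
informative = [s for s, counts in per_site.items() if len(set(counts)) > 1]
totals = {n: sum(per_site[s][n - 1] for s in informative) for n in TREES}

print(upgma(SEQS))    # merges with node heights
print(informative)    # informative site numbers
print(totals)         # changes per tree, summed over informative sites
```

Under these assumptions the script joins c with d at 1 unit, a with b at 2 units, and the two pairs at 2.5 units, and it reports sites 4, 7, and 9 as informative with totals of 4, 5, and 6 changes for Trees 1, 2, and 3, in agreement with the hand calculation.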
Note that the two trees that were constructed from the distance matrix and maximum parsimony methods are identical in shape. Both trees indicate that two of the strains, a and b, are more closely related to one another than they are to c and d. Likewise, c and d are more closely related to one another than they are to a and b.

As you can imagine, some rather sophisticated computer programs have been developed to handle the immense amount of information inherent in longer sequences such as 16S rDNA, which contains about 1,500 bp. Moreover, when such large sets of data are being analyzed, it is often not possible to determine that a proposed tree is, in fact, the true tree. Indeed, all trees should be considered as hypotheses until additional information has been analyzed. For example, the inclusion of a sequence from a newly discovered, closely related organism may alter the shape of a tree.

To help support the reliability of a given tree, other techniques are used. For example, in "bootstrap" analyses, random portions of the sequence are selected by the computer, and the trees formed from them are compared with the proposed tree to assess the statistical significance of the proposed tree. In this statistical approach, some 100 or 1,000 different bootstrap comparisons might be made and provided as evidence that the proposed tree is indeed the most parsimonious one. Bootstrap analyses can be applied to all methods of phylogenetic analysis.

As mentioned previously, other analytical methods can be used to analyze sequence information and construct phylogenetic trees. A common one used by microbiologists is the maximum likelihood method, which involves selecting trees that have the greatest likelihood of accounting for the observed data.
This is accomplished by assigning a probability to the mutation of any one base to any other base at each possible sequence position. From this, all possible topological trees are constructed. By integrating the probabilities for each mutation over each tree, a degree of improbability for a tree is assessed. The least improbable tree is chosen as the "true" tree.

A final note on studying microbial phylogeny: It should be emphasized that many other genes besides 16S rDNA can be used for phylogenetic analysis. An appropriate gene must have the degree of conservation necessary for the analysis desired and must have a homolog in the other organisms of interest. Homologous genes are genes that share a common ancestry.

Speciation

The process by which new species evolve is termed speciation. As with plants and animals, bacteria evolve in habitats. However, unlike plants and animals, bacteria can evolve very quickly because of their rapid growth rates, high population sizes, and haploid genomes that allow for the rapid expression of favorable mutations through natural selection. Thus, a lineage of bacteria is determined in large part through vertical inheritance, the process by which the parental genotype is transferred to the progeny cells following DNA replication and asexual reproduction.

However, bacteria can also acquire genetic material from other organisms through their various genetic exchange mechanisms: conjugation, transformation, or transduction (see Chapter 15). This phenomenon is referred to as horizontal (or lateral) gene transfer (HGT) to distinguish it from vertical inheritance. As a result, prokaryotic organisms may undergo dramatic changes in their population structure in a relatively short period. For example, we know that multiple-drug resistance can be rapidly acquired by a bacterial species that is sensitive to antibiotics if it is exposed to antibiotics in the presence of other antibiotic-resistant bacteria.
Consider the situation of a bacterium that is a member of the normal microbiota of the intestinal tract and is exposed to an antibiotic to which it is sensitive. The bacterium may either perish or, if a gene that confers resistance is available in the environment and the bacterium has the capability, acquire the resistance gene through an HGT process and survive. This example of a strong selective pressure likely explains how it is possible for sensitive bacteria to quickly become resistant to an antibiotic. This scenario applies equally to other environmental pressures that confront bacteria, such as exposure to potentially toxic hydrocarbons that are used as an energy source by other species in the environment.

Many of the known examples of rapid genetic change occur through the acquisition of plasmids from related organisms. Thus, in the preceding examples, some plasmids are known that carry multiple antibiotic-resistance genes and others are known that carry hydrocarbon-degrading genes. In addition, we do know that genes can be acquired from distantly related organisms. For example, it has recently been reported that some members of the Proteobacteria and other phyla of the Bacteria contain a gene responsible for bacteriorhodopsin synthesis, which was previously known only from members of the Archaea. Thus, this example appears to represent the transfer of genetic material across two different domains as well as phyla within the Bacteria. Another example is the bacterium Agrobacterium tumefaciens, which naturally transfers genetic material to plants (Chapters 16 and 19).

From an analysis of genomes that have been sequenced, genes that have been derived from other organisms have been identified in some prokaryotic genomes (Figure 17.7). The transfers between phyla appear to be relatively rare events. In addition, it should be noted that each species has hundreds of core genes that can be used to compare its relatedness to other organisms.
If HGT occurred too extensively, it could confuse phylogenetic classifications based on vertical inheritance to such an extent that it would render them utterly useless. Fortunately, this does not appear to pose a major problem for 16S rRNA gene trees.

SECTION HIGHLIGHTS

Both phenotypic and genotypic properties have been used to describe microorganisms, and both are important in describing and naming new prokaryotic species. Artificial taxonomy entails the use of phenotypic tests, whereas a taxonomy based on evolutionary processes relies on molecular phylogeny. Molecular phylogeny uses 16S rRNA gene sequence and protein sequence analyses, which have become very important in the classification of Bacteria and Archaea. Several different methods, such as distance, parsimony, and maximum likelihood methods, are used to construct phylogenetic trees.
Figure 17.7 Horizontal gene transfer. Analyses of sequenced bacterial genomes indicate that a significant proportion of their genes can be traced to other phylogenetic groups, indicating the importance of horizontal gene transfer (HGT) in bacterial speciation. This diagram shows the proportion of DNA that was acquired by HGT (in red) in some microbial genomes. Horizontal axis: megabases of protein-coding DNA (0 to 6). Genomes shown, in the order listed: Pseudomonas aeruginosa, Escherichia coli, Mycobacterium tuberculosis, Bacillus halodurans, Vibrio cholerae, Bacillus subtilis, Synechocystis PCC6803, Deinococcus radiodurans, Xylella fastidiosa, Pasteurella multocida, Lactococcus lactis, Archaeoglobus fulgidus, Neisseria meningitidis Z2491, Neisseria meningitidis MC58, Halobacterium NRC-1, Thermotoga maritima, Mycobacterium leprae, Pyrococcus abyssi, Pyrococcus horikoshii, Methanobacterium thermautotrophicum, Aeropyrum pernix, Campylobacter jejuni, Haemophilus influenzae, Helicobacter pylori 26695, Aquifex aeolicus, Thermoplasma acidophilum, Methanococcus jannaschii, Treponema pallidum, Borrelia burgdorferi, Rickettsia prowazekii, Mycoplasma pneumoniae, Ureaplasma urealyticum, Buchnera aphidicola, and Mycoplasma genitalium. Courtesy of Jeffrey Lawrence.

17.3 Taxonomic Units

The basic taxonomic unit is the species, although as mentioned earlier, some species have subspecies categories as well. The categories above the species are (sequentially) genus, family, order, class, phylum, and domain (see Table 17.1). It should be noted that uncertainties exist in bacteriology about the meaning of the higher taxonomic categories such as kingdom, because the phylogenetic markers that have been used (primarily 16S rDNA sequences) often cannot definitively resolve the earliest branching points in the Tree of Life. Thus, although we know that each of these major branches is equivalent to the plant and animal "kingdoms," how the microbial "kingdoms" or phyla are related to one another is only poorly understood.
Each colony or culture of an organism represents an individual strain or clone in which all of the cells are descended from one single organism. In a somewhat different sense, a strain can also refer to a mutant of a species that has changed characteristics (for example, lacks a particular gene). The strain, however, is not considered a formal taxonomic unit, and Latin names are therefore not ascribed to strains; they have only informal designations, such as E. coli strain K12. There can also be varieties within species that exhibit differences. These are called biovars (i.e., biological varieties). For example, a serological variety such as E. coli O157:H7 is a pathogenic serovar that causes hemolytic uremic syndrome and can be lethal to children who become infected by eating contaminated food. Likewise, pathogenic varieties are termed pathovars, ecological types ecovars, and so forth. Now let us look at the individual taxa, beginning with the species, to see what features are typical at each taxonomic level.

The Species

The definition of a bacterial species differs from that of plants and animals. In mammals, the classical species is defined as a group of individuals (males and females) that exhibit evident morphological similarities and produce fertile progeny through sexual reproduction. Indeed, the production of progeny in many animals such as mammals requires sexual reproduction. Although gene exchange occurs in prokaryotic organisms, it is not essential for reproduction. Most bacterial reproduction is asexual and occurs by simple binary transverse fission or budding. In prokaryotic organisms, sexuality is uncommon and different from that of eukaryotes. Eukaryotes produce haploid gametes in meiosis (see Chapter 1).
During sexual reproduction, the haploid gametes (egg and sperm) from the male and female fuse to form the diploid zygote. In bacterial conjugation, DNA from one cell is transferred during replication to a receptor cell; however, only partial diploidy occurs. Genetic material can also be transferred by other mechanisms such as transformation and transduction (see Chapter 15). These transfers are not always restricted to members of the same species. A bacterial species comprises a group of organisms that share many phenotypic properties and a common evolutionary history and are therefore much more closely related to one another than to other species. This definition, which is very subjective, has been interpreted differently by bacteriologists in describing species. For example, at one extreme some taxonomists are called lumpers because they group (or “lump”) fairly diverse organisms into a single species or genus. An example of a lumper is F. Drouet, who has proposed reducing the number of cyanobacteria from 2,000 species to only 62! At the opposite pole are splitters. These are taxonomists who consider even the slightest differences sufficient for a new species. For example, many years ago it was proposed that the genus Salmonella be “split” into hundreds of different species, a separate species for each of the hundreds of different serotypes (or serovars) that are recognized based on specific cell-surface antigens of their lipopolysaccharides and flagella. However, the views of lumpers and splitters illustrated here are considered to be extreme and are not accepted by the majority of microbiologists. In fact, more recently, a less arbitrary, quantitative basis has been proposed to define a bacterial species. Agreement was reached by a group of prominent bacterial taxonomists to define a bacterial species based on genomic similarity between strains. 
Accordingly, a species is defined as follows: two strains of the same species must have a similar mole percent guanine plus cytosine content (mol % G + C) and must exhibit 70% or greater DNA–DNA reassociation. The procedures used to determine these features are described here.

MOLE PERCENT GUANINE PLUS CYTOSINE (MOL % G + C)

The mol % G + C refers to the proportion of guanine and cytosine to total bases (guanine, cytosine, adenine, and thymine) in the DNA. Recall that because G pairs with C by hydrogen bonds in the double-stranded DNA molecule, as A does with T, G and C occur in equal concentrations, as do A and T. The formula is given as:

$$\text{mol \% G + C} = \frac{\text{moles (G + C)}}{\text{moles (G + C + A + T)}} \times 100$$

Several methods can be used to determine the mol % G + C, sometimes also called the "GC ratio," of a bacterium. All of them require that the DNA be first isolated from a bacterium and purified. Thus, it is necessary to lyse the cells to release the cytoplasmic constituents including DNA, and the DNA must then be purified to remove proteins and other cellular material. Cell lysis is typically accomplished by treatment with lysozyme and detergents, and the DNA is precipitated with ethanol. When the DNA has been sufficiently purified, it can be analyzed chemically to determine the content of each of the bases. We will describe two of the procedures for determining the GC ratio of the purified DNA here.

A common procedure to determine GC ratios is by thermal denaturation. The principle behind this method is that the hydrogen bonds between the double strands can be broken by heating dissolved DNA. As the hydrogen bonds are broken and the two strands separate, the absorbance of the DNA increases. This procedure, called "melting the DNA," is conducted with a spectrophotometer set at 260 nm, a wavelength at which DNA absorbs strongly. The hydrogen bonding of the GC base pair is stronger than that of the AT pair in the double-stranded DNA molecule.
Therefore, a higher temperature is required to melt DNA that has a high content of GC pairs, that is, a high GC ratio. Figure 17.8 shows a graph of the melting of a double-stranded DNA molecule. This process is accomplished by gradually increasing the temperature of a solution of the DNA in an appropriate buffer (ionic strength is important). As the temperature is increased, the melting process begins and continues until the double-stranded DNA molecule is completely converted to the single-stranded form. The absorbance increases during this melting process. The midpoint temperature (Tm) is directly related to the GC ratio of the DNA. Thus, the GC ratio can be read from a chart showing the relationship between Tm and GC content (Figure 17.9).

[Figure 17.8 DNA melting curve. Plot of UV absorbance (at 260 nm) against temperature (80 to 100°C). For each DNA, the midpoint of the melting process occurs at a characteristic midpoint temperature, Tm, in this case 90°C.] Melting curve for a double-stranded molecule of DNA. As the temperature is increased during the experiment, the double-stranded DNA is converted to the single-stranded form and the UV absorbance of the solution increases. The midpoint temperature, Tm, can be calculated from the curve. This process is reversible if the temperature of the solution is slowly decreased to allow the single strands to reanneal. The Tm of this species, Escherichia coli, can be used to determine its mol % G + C content (see Figure 17.9).

Once the DNA has been melted, it will reanneal if the temperature is slowly lowered. Thus, the process shown in Figure 17.8 is reversible.
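The statement that Tm "can be calculated from the curve" can be illustrated with a minimal sketch. It assumes that Tm is taken as the temperature at which the absorbance is halfway between its lowest and highest values, found by linear interpolation between readings, and that absorbance only rises as the sample is heated. The function name and the readings are hypothetical and are not taken from the chapter.

```python
def melting_midpoint(temps_c, absorbances):
    """Estimate Tm as the temperature where A260 is halfway between its
    minimum and maximum, using linear interpolation between readings.
    Assumes the readings are ordered by increasing temperature."""
    low, high = min(absorbances), max(absorbances)
    half = (low + high) / 2.0
    points = list(zip(temps_c, absorbances))
    # Walk along consecutive pairs of readings until the half-maximal
    # absorbance falls between them.
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 <= half <= a1 and a1 != a0:
            return t0 + (half - a0) * (t1 - t0) / (a1 - a0)
    raise ValueError("absorbance never crosses the half-maximal value")


# Hypothetical readings (illustrative only, not measured data).
temps = [80, 85, 88, 89, 91, 95, 100]
a260 = [0.10, 0.11, 0.14, 0.18, 0.22, 0.28, 0.30]
tm = melting_midpoint(temps, a260)
print(f"Estimated Tm: {tm:.1f} degrees C")
```

In practice the GC ratio would then be read from a calibration such as the one in Figure 17.9, which depends on the buffer used.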
If the solution is cooled rapidly instead, hybrid formation does not recur, and the molecules of DNA are left in the single-stranded state.

[Figure 17.9 Tm and DNA base composition. Plot of mol % G + C (0 to 100) against Tm (60 to 110°C). Labeled points: Mycobacterium phlei, Pseudomonas, Serratia, E. coli, Bacillus subtilis, Cytophaga.] Graph showing the direct relationship between mol % G + C and midpoint temperature (Tm) of purified DNA in thermal denaturation experiments.

[Figure 17.10 DNA base composition range. Horizontal axis: mol % G + C (10 to 100).] Range of mol % G + C content among various groups of organisms. Note the broad range of GC ratios for bacteria, archaea, and the lower eukaryotes in comparison to plants and animals.

Another method is to use the "readout" from a genome sequence, which contains all of the genetic information of a species.

Figure 17.10 shows the range of GC ratios in various groups of organisms. On this basis alone, one can see that bacteria, which have GC ratios ranging from approximately 20 to greater than 70 mol %, are truly a very diverse group. In contrast, higher organisms such as animals have a very restricted GC ratio range.

The GC ratio provides only the relative amount of guanine and cytosine compared to total bases in the DNA of an organism and says nothing about the inherent characteristics of the organisms or what genes are present. Indeed, two very different organisms can have similar or even identical GC ratios. For example, the DNA of Streptococcus pneumoniae and humans have the same mol % G + C content.

DNA–DNA REASSOCIATION OR HYBRIDIZATION

Although the determination of GC ratio is useful in bacterial taxonomy, it does not tell us anything about the linear arrangement of the bases in the DNA. It is the arrangement of the DNA subunits that codes for specific genes and proteins and therefore determines the features of an organism.
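As a rough illustration of both the mol % G + C formula and the point that base composition says nothing about base order, the following sketch computes the GC ratio from a sequence string. The function name and the two short sequences are invented for the example; a real calculation would use a full genome sequence.

```python
def mol_percent_gc(sequence):
    """Return mol % G + C = 100 * (G + C) / (G + C + A + T).
    Only A, C, G, and T are counted; other symbols are ignored.
    One strand is enough: its complement contains the same number of
    G + C bases, so the double-stranded value is identical."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "GCAT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * (counts["G"] + counts["C"]) / total


# Two hypothetical sequences with the same base composition
# but a completely different linear order of bases.
seq_a = "GGCCGGCCAATTAATT"
seq_b = "ATGCATGCATGCATGC"
print(mol_percent_gc(seq_a), mol_percent_gc(seq_b))
```

Both sequences give the same GC ratio even though their base order differs, which is why the GC ratio alone cannot establish that two strains are related.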
DNA–DNA reassociation or hybridization is one method used to compare the linear order of bases in two different organisms (Box 17.3). It is important to recognize that the ≥ 70% level of reassociation used for the species definition does not indicate that the two DNAs are 70% homologous or identical. In DNA–DNA reassociation, the actual order of bases in the DNA is not determined, but rather the extent of reannealing between the DNAs of two different strains is assessed. Ideally one would like to know the actual sequence of genes of a species. Indeed, sequencing entire bacterial genomes has become quite common (see Chapter 16). It is worthwhile noting that one could consider the actual DNA base sequence of a strain to be the ultimate definition of a strain, analogous to the chemical formula for a compound. However, it is important to recognize that, unlike chemical compounds, bacterial strains and species are not static; they continue to evolve.

BOX 17.3 Methods & Techniques

DNA–DNA Reassociation

DNA–DNA reassociation can be performed by using a variety of different methods. In all approaches, it is necessary to begin with purified DNA from the two organisms that are being compared. The DNA is first cut into smaller segments (i.e., sheared) and then denatured by melting. DNA from the two different strains is mixed and allowed to cool together to allow reannealing to occur. This reannealing will occur both between DNA strands of the same species and between strands of the comparison species. The degree of reannealing depends on how similar the DNAs are to one another. If two strains are very similar, their DNAs will reanneal to a high degree. In contrast, if two strains are very different, then the extent of reannealing will be much less.

One way to perform DNA–DNA reassociation is to radiolabel the DNA by growing the bacterium with tritiated thymidine or 14C-labeled thymidine (other DNA bases or 32P labeling can be used as well). If the bacterium takes up this labeled substrate and incorporates it into DNA, then the DNA becomes labeled. Alternatively, the DNA can be purified from the bacterium and labeled enzymatically in the laboratory. After the DNA has been labeled and purified, it is sheared to an appropriate length by sonication. It is then ready for the hybridization experiments. First, single-stranded DNA is prepared. This is accomplished by heating the isolated DNA molecules to render them single-stranded and then cooling them rapidly to prevent reannealing.

Let's look first at the control assay for the DNA reassociation experiment. In this case, a small amount of sheared radiolabeled DNA is rendered single-stranded. This is then mixed with a much larger amount of unlabeled DNA obtained from the same bacterial strain. These are heated together and cooled slowly to allow the two single-stranded groups to reanneal to form hybrid double strands. Because the amount of labeled DNA relative to the unlabeled DNA is small, there is a very low probability that it will reanneal with other labeled strands. Most of the reassociations will occur between unlabeled strands, and most of the remainder will be between the labeled and unlabeled strands. The single-stranded fragments that did not reanneal are removed by enzyme digestion using S-1 endonuclease, which specifically degrades only single-stranded DNA, and the double-stranded fragments are collected on a membrane filter or in a column. The amount of radioactivity remaining on the filter or on the column, after washing to remove low-molecular-weight material, represents the amount of hybrid formation between the labeled and unlabeled DNA for this identical strain. This is the control reaction, and the amount of radiolabel (the extent of hybridization) is considered to be 100%.

To determine the extent of reassociation between the strain described and an unknown strain, similar experiments need to be performed. In this instance, unlabeled single-stranded DNA from the unknown strain is prepared and mixed with the known strain for which we have labeled the DNA. As indicated previously, those strains that show 70% or greater reassociation or hybrid formation with the labeled strain (determined by the amount of hybrid DNA that is radiolabeled compared with the same strain control of 100% shown in the figure) are considered to be the same species. Anything less is considered a different species.

The temperature and salt concentration at which the DNA–DNA reannealing occurs will influence the degree of reassociation between single strands. Scientists conducting DNA–DNA reassociation experiments typically use a reannealing temperature that is 25°C lower than the average midpoint temperature (Tm) of the DNAs being compared. This temperature is sufficiently high that only those sequences that are most complementary will reanneal. Therefore, this is considered a stringent condition for reannealing.

[Box 17.3 figure. Components: unlabeled DNA and radiolabeled DNA. Steps:
1 Mix radiolabeled single-stranded DNA (obtained by labeling double-stranded DNA, then shearing and denaturing it) with large amounts of unlabeled single-stranded DNA segments from (in this control experiment) the same strain.
2 Heat the solution, then slowly cool.
3 Reannealing occurs between complementary segments.
4 Treat the solution with S-1 endonuclease to digest any remaining single-stranded segments.
5 Collect the double-stranded segments on a membrane filter, and measure the amount of radiolabel; this reveals the degree of reassociation (similarity between strains).]
DNA–DNA reassociation. In this example, which is a control experiment (the radiolabeled sample is reannealed with unlabeled DNA from the same strain), the degree of reassociation is highest and treated as 100%. If a different strain is reannealed with the radiolabeled DNA, it will show a lower degree of reannealing (compared with the 100% attributed to the control), indicative of the similarity between the two strains being tested. Strains with reannealing values of 70% or greater are considered to be the same species.

Interestingly, the bacterial species definition appears to be much broader than that used for animals and plants when one considers DNA–DNA reassociation and other molecular criteria. For example, the bacterial species E. coli can be compared to its host mammalian species using a variety of molecular features, including the range in GC ratio, 16S rDNA sequence (versus 18S rDNA sequence), and DNA–DNA reassociation (Table 17.2). Thus, although there is essentially no variation in the range in GC ratio for the human species, it is about 4 mol % within the E. coli species. Indeed, there is less variation in GC ratio in the Primate order than there is in the species E. coli. Likewise, the 16S rDNA sequences of E. coli strains differ by more than 15 substitutions, whereas the difference between 18S rDNA of the mouse (order Rodentia) and the human is less than 16. Finally, DNA–DNA reassociation data, which are used to define the bacterial species at 70% or greater, indicate that humans are much more highly similar to one another in comparison with E. coli. Therefore, it is evident that the typical bacterial species is equivalent to a genus or family of mammals based on molecular divergence, indicating that a bacterial species is defined much differently from its eukaryotic counterparts.
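The arithmetic behind the reassociation criterion of Box 17.3 can be sketched as follows. This is only an illustration under stated assumptions: the function names and the radioactivity counts are hypothetical, the counts are assumed to have already been corrected for background, and only the reassociation part of the species definition is checked (the definition also calls for a similar mol % G + C).

```python
SPECIES_THRESHOLD = 70.0   # percent reassociation used in the species definition
STRINGENCY_OFFSET = 25.0   # reannealing is run this many degrees C below Tm


def percent_reassociation(test_counts, control_counts):
    """Label retained with the test strain, relative to the same-strain
    control, which is treated as 100%."""
    if control_counts <= 0:
        raise ValueError("control counts must be positive")
    return 100.0 * test_counts / control_counts


def same_species(test_counts, control_counts):
    """Apply the 70%-or-greater reassociation criterion."""
    return percent_reassociation(test_counts, control_counts) >= SPECIES_THRESHOLD


def reannealing_temperature(tm_c):
    """Stringent reannealing temperature for a given average Tm."""
    return tm_c - STRINGENCY_OFFSET


# Hypothetical counts per minute (illustrative only).
control = 12000
print(percent_reassociation(9000, control), same_species(9000, control))
print(percent_reassociation(4800, control), same_species(4800, control))
print(reannealing_temperature(90.0))
```

Expressing the result relative to the same-strain control, rather than as raw counts, is what makes values from different experiments comparable.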
This difference is further evidenced by the biological species definition for animals, which requires that within a species, mating between sexes produces fertile progeny.

TABLE 17.2 Comparison of E. coli and its host species (a)

Comparison                          Mole % G + C   16S or 18S rRNA substitutions   DNA/DNA reassociation
Among E. coli                       48–52          >15 bases                       >70%
Among Homo sapiens                  42             -                               -
Among all primates                  42             -                               -
Between H. sapiens and mouse        -              16 bases                        -
Between H. sapiens and chimpanzee   -              -                               98.6%
Between H. sapiens and lemurs       -              -                               >70%

(a) Adapted from J. T. Staley, ASM News, 1999.

The Genus

All species belong to a genus, the next higher taxonomic unit. When DNA–DNA reassociation is performed within a genus, some species of the genus may show little or no significant reassociation with other species. This does not indicate that they are unrelated to one another, only that this technique is too specific to identify outlying members of the same genus. Therefore, DNA–DNA hybridization has limited utility for determining whether a species is a member of a known bacterial genus.

The definition of the genus is based on one or more prominent phenotypic characteristics that permit it to be distinguished from its closest relatives. Oftentimes some striking physiological or morphological feature is present that permits the genus to be differentiated from closely related taxa. For example, the genus Nitrosomonas is a group of rod-shaped bacteria that grow as chemoautotrophs, gaining energy from the oxidation of ammonia. Other ammonia oxidizers with coccus-shaped and helical cells are placed in other genera. Of course, all strains of each of these genera need to be more closely related to one another phylogenetically than to strains of other genera in a phylogenetic classification.
Ideally then, the genus makes up a monophyletic lineage (i.e., one in which all are members of the same phylogenetic cluster or clade).

Odd bedfellows are sometimes found in phylogenetic trees; for example, photosynthetic and nonphotosynthetic members of some closely related groups have been reported. Of course, loss of a key gene or two may result in converting a formerly photosynthetic organism to one that is not photosynthetic. Consequently, although the plant and animal kingdoms are differentiated on the basis of whether or not they are photosynthetic, both features have been reported in two closely related bacterial genera. However, because phenotype is so important at the genus level, important features such as photosynthesis are sufficient to proclaim a separate genus, even though two groups are otherwise very closely related.

Higher Taxa

As mentioned previously, the taxonomy of prokaryotes is undergoing major changes. Although according to classical taxonomy each genus belongs to a family of similar genera, relatedness at the familial and higher level is often uncertain for bacteria. Bacteriologists have been reluctant to ascribe organisms to formal Latinized families and orders. However, as more becomes known about bacterial phylogeny, it is increasingly apparent that higher taxonomic levels do have meaning and can be distinguished from one another by comparing the sequences of certain macromolecules, as is reflected in the new edition of Bergey's Manual of Systematic Bacteriology.

SECTION HIGHLIGHTS
Bacteria and Archaea are classified in a hierarchical structure from the domain level to the phylum, class, order, family, genus, and finally species. Species are described based on both phenotypic and genotypic properties including GC ratio, 16S rRNA sequence, and DNA–DNA hybridization.

17.4 Major Groups of Archaea and Bacteria

Bacteriologists have begun to construct classifications using phylogenetic information from rRNA analyses.
As mentioned in Chapter 1, some prokaryotes are very different from others, a revelation that came through an analysis of rRNA (Box 17.4). Ribosomal RNA data allow the division of all organisms on Earth, prokaryotic and eukaryotic, into three domains: Bacteria, Archaea, and Eukarya (see Figure 1.6). These domains can also be distinguished from one another by phenotypic testing. For example, consider the cell envelope composition of the organisms. Peptidoglycan is found only in Bacteria, although two groups (the mycoplasmas and the Planctomycetales) lack it. Furthermore, the lipids of the Bacteria and Eukarya are ester-linked, whereas they are ether-linked in the Archaea (see Chapters 4 and 18).

BOX 17.4 Research Highlights

The Discovery of Archaea

Clearly, one of the most exciting developments in bacterial classification of the twentieth century was the discovery that there are two major groups called Domains, named the Bacteria and the Archaea. The appreciation of the difference between the Bacteria and the Archaea was the culmination of years of research by microbiologists throughout the world. However, the final piece of evidence that convinced microbiologists of this dichotomy was the discovery that these organisms had very different 16S rRNAs. This research was performed in Carl Woese's laboratory at the University of Illinois. At that time, rRNA sequencing was not done routinely in laboratories. Instead, 16S rRNA was purified and digested by ribonucleases that cut between specific nucleotide pairs. The oligonucleotide fragments produced were subjected to two-dimensional electrophoresis. The pattern of spots on a two-dimensional chromatogram represented the various rRNA oligonucleotides that were typical of each species. Studies from Woese's laboratory demonstrated that the patterns were very different for Bacteria and Archaea. Indeed, their studies of 18S rRNA from eukaryotic organisms indicated that the Archaea are as different from Bacteria as they are from eukaryotes. The seminal findings of this work have forever changed the way microbiologists view taxonomy and phylogeny.

[Photograph: Carl Woese. Courtesy of Jason Lindsey.]

At this time, 28 different phylogenetic groups, referred to here as phyla, are known. Included are 24 phyla of Bacteria and four phyla of Archaea. Each of these phyla has specific signature sequences in their ribosomes that are distinctive to them. The prokaryotic phyla are listed here along with a brief description of their major features. They are treated in more detail in subsequent chapters (Chapters 18 through 22), and the groups in each chapter are indicated below. It is noteworthy that many new phyla of the Bacteria, in particular, have been discovered in natural environments using clone library approaches, but have not yet been isolated in pure culture (Box 17.5). Thus, at least ten to 15 additional major prokaryotic groups are very poorly understood.

The second edition of Bergey's Manual of Systematic Bacteriology has been largely followed in organizing the bacterial groups treated in this book. The Archaea are described in Chapter 18 and the Bacteria in Chapters 19–22 (Table 17.3).

Domain: Archaea

The Archaea are divided into the following four phylogenetic groups or phyla.

CRENARCHAEOTA This phylum contains the most thermophilic organisms known. Some of these organisms grow at temperatures higher than the boiling point of water. Most rely on sulfur metabolism either as an energy source or as an electron sink. For example, some oxidize reduced sulfur compounds aerobically to produce sulfuric acid. Others reduce elemental sulfur and use it as an electron acceptor to form hydrogen sulfide. Some are iron and manganese reducers. Not all organisms are thermophilic.
They are also significant in deep-sea environments as well as in polar seas.

BOX 17.5 Research Highlights

Novel Phyla Discovered by Molecular Analyses of Natural Habitats

One of the major advances in exploring the diversity of microorganisms in natural environments has been the application of molecular approaches developed in Norman Pace's laboratory. In the most recent variation of this approach, DNA is extracted from the environment of interest. Then PCR is used to amplify genes of interest from the DNA. For phylogenetic (i.e., diversity) information, 16S rDNA primers for the Bacteria or Archaea or universal primers that amplify both groups are commonly used. The segments retrieved, typically about 500 bp, can then be sequenced to identify the phylogenetic groups. Although PCR approaches are not quantitative, the various 16S rDNA types retrieved from a natural sample can provide important information about the diversity of prokaryotes that occur in that environment. Using these approaches, it has recently been determined that more than 50 major phyla of Bacteria exist, yet isolates have been obtained of only 24.

[Box 17.5 figure: phylogenetic tree of bacterial phyla, rooted with the Archaea.] A phylogenetic tree of 16S rDNA sequences of Bacteria, based on pure cultures and clonal libraries from natural samples. Note the existence of many phyla (shown in outline rather than as solid black lines) that have not yet been cultivated. Courtesy of Phil Hugenholtz and ASM Publications (Hugenholtz, P., B. M. Goebel and N. R. Pace. 1998. J. Bacteriol. 180: 4765–4774).

TABLE 17.3 Overall organization for treatment of Bacteria and Archaea

Bacterial Group           This Book    Bergey's Manual of Systematic Bacteriology (a)
Archaea                   Chapter 18   Volume 1
Proteobacteria            Chapter 19   Volume 2
Gram-positive bacteria    Chapter 20   Volumes 3 and 5
Phototrophic bacteria     Chapter 21   Volumes 1, 2, and 3
Other bacterial phyla     Chapter 22   Volume 4

(a) For more complete treatment of organization of taxa see Bergey's Manual of Systematic Bacteriology (2nd ed.).

EURYARCHAEOTA The methanogens (methane producers) are noted for their ability to produce methane gas from simple carbon sources. Some use carbon dioxide and hydrogen gas, whereas others use methanol or acetic acid. These archaea are anaerobes, some of which grow at the lowest oxidation–reduction potentials of all prokaryotes. Some of these archaea fix carbon dioxide, but they use neither the Calvin cycle nor the reductive tricarboxylic acid (TCA) cycle. Some of the Euryarchaeota are hyperthermophilic.

Extreme halophiles make up another phenotypic subgroup. These extremely halophilic archaea grow only in saturated salt-brine solutions. They lyse when placed in distilled water. It should be noted that there is an overlap between the extreme halophiles and methanogens. Thus, some species of methanogens grow in high-salt environments.

NANOARCHAEOTA This recently discovered group of the Archaea comprises obligate parasites of other members of the Archaea.
They are among the smallest of organisms, hence their name.

KORARCHAEOTA These archaeal microorganisms have been found in hot springs, but no strains have yet been isolated in pure culture, so little is known about their phenotypic properties.

Domain: Bacteria

The Bacteria are divided into a number of phyla, which are described in the following text.

PROTEOBACTERIA The Proteobacteria comprise a very large and diverse group of organisms. All four of the major bacterial nutritional types are represented within this group. Some of these organisms are photosynthetic (treated in Chapter 21), whereas some are heterotrophic and others are chemolithotrophic. The chemolithotrophic bacteria include the nitrifiers, the thiobacilli, the filamentous sulfur oxidizers (Beggiatoa and related genera), and many species that grow as hydrogen autotrophs. Carbon dioxide fixation, when present, is via the Calvin cycle in all members of this phylum.

This phylogenetic group contains many of the well-known gram-negative heterotrophic bacteria such as Pseudomonas, the enteric bacteria including E. coli, Vibrio and luminescent bacteria, and the more morphologically unusual bacteria such as the prosthecate bacteria. In addition, many symbiotic genera such as Agrobacterium, Rickettsia, and Rhizobium are members of this group. The gram-negative bacterial sulfate reducers such as Desulfovibrio are also found in this group. Also included in this phylum are the exotic myxobacteria that form fruiting structures as well as unicellular, nongliding forms such as the bacterial predator Bdellovibrio. Finally, the mitochondria present in almost all eukaryotes evolved from this group of bacteria.

FIRMICUTES The bacteria in this group are all gram-positive, although the Mycoplasma group lacks a cell wall altogether and therefore stains as gram-negative. All other members of the group contain large amounts of peptidoglycan in their cell wall structure.
The Firmicutes are unicellular organisms that have a low mol % G + C content. Most are cocci or rods, and some produce endospores. Bacillus are aerobic or facultative spore formers, whereas Clostridium species are anaerobic fermenters. Some are sulfate reducers. One group, the heliobacteria, are photosynthetic, and members produce a unique form of bacteriochlorophyll, Bchl g.

ACTINOBACTERIA These gram-positive bacteria range in shape from unicellular organisms to branching, filamentous, mycelial organisms. Most are common soil organisms, some of which produce specialized dissemination stages called conidiospores, which enable them to survive during dry periods.

CHLOROFLEXI This group contains the genus Chloroflexus, a green gliding bacterium that is metabolically versatile. Members can grow as heterotrophs or photosynthetically. Carbon dioxide is not fixed by either the Calvin cycle or the reductive TCA cycle but by a special pathway known only for this group of bacteria.

PLANCTOMYCETES The Planctomycetes group of Bacteria are budding, unicellular, or filamentous bacteria. These bacteria lack peptidoglycan.

CHLOROBI The green sulfur bacteria are anoxygenic photosynthetic bacteria. Some are unicellular forms, and others produce networks of cells. None are motile by flagella or gliding motility. Some have gas vacuoles. They use the reductive TCA cycle rather than the Calvin cycle to fix carbon dioxide.

CYANOBACTERIA The cyanobacteria are the only bacteria that carry out oxygenic photosynthesis. This is a diverse group of bacteria ranging from unicellular to multicellular filamentous and colonial types. Some grow in association with higher plants and animals. All cyanobacteria use the Calvin cycle for carbon dioxide fixation. The chloroplast found in all eukaryotic photosynthetic organisms evolved from this group of bacteria.

CHLAMYDIAE The Chlamydiae are a group of obligately intracellular parasites and pathogens whose closest relatives are the Planctomycetes. They also lack peptidoglycan.

VERRUCOMICROBIA These bacteria are unusual in that some members have bacterial tubulin genes. Very few representatives of this phylum have been isolated in pure culture, although they comprise up to 3% of the microbiota from soils.

LENTISPHAERA This is a newly discovered phylum. Some isolates are marine, others are intestinal symbionts of mammals.

SPIROCHAETES The spirochetes are morphologically distinct from other bacteria. Their flexible cells are helical. All are motile due to a special flagellum-like structure, the axial filament, not found in other bacteria.

AQUIFICAE These bacteria are hydrogen autotrophs. This phylogenetic group contains the most thermophilic member of the Bacteria known and makes up one of the deepest branches of the Bacteria.

FIBROBACTERES These are anaerobic bacteria, some of which live in the gastrointestinal tracts of animals. Some are cellulose degraders.

THERMOTOGAE This is a fermentative genus that contains some of the most thermophilic members of the Bacteria known. The term "toga" refers to the outer extracellular material that surrounds the cells. They grow at temperatures from 55°C to 90°C. Their cell lipids are unusual.

THERMOMICROBIA The genus Thermomicrobium contains small, rod-shaped thermophiles that grow as heterotrophs in hot springs with an optimal temperature for growth of 70°C to 75°C. The cell wall contains very low amounts of diaminopimelic acid.

THERMODESULFOBACTERIA This is a group of thermophilic sulfur-reducing bacteria.

ACIDOBACTERIA These bacteria are commonly found in soils and sediments, but few strains have been cultivated. In addition to the aerobic genus Acidobacterium, this phylum contains homoacetogenic bacteria, Holophaga, and iron-reducing bacteria in the genus Geothrix.

FUSOBACTERIA These obligately anaerobic bacteria are commonly found in the oral cavities and intestinal tracts of animals.

DICTYOGLOMI Species of this group are thermophilic, obligately anaerobic fermentative bacteria.

DEINOCOCCUS–THERMUS This is a very small group of organisms currently represented by very few genera. The genus Deinococcus contains gram-positive bacteria. However, they differ from other gram-positive bacteria in showing strong resistance to gamma radiation and UV light. Thermus contains thermophilic, rod-shaped bacteria. Ornithine is the diamino acid in the cell walls of both Thermus and Deinococcus.

BACTEROIDETES This is a diverse group containing heterotrophic aerobes and anaerobes. Some are gliding heterotrophic bacteria that have a low DNA base composition (about 30 to 40 mol % G + C).

SECTION HIGHLIGHTS
Currently four phyla have been described in the domain Archaea and 24 have been described in the domain Bacteria.

17.5 Identification

The final area of taxonomy is identification. Bacteriologists are often confronted with determining to which species a newly isolated organism belongs. Clinical microbiologists need to know whether a specific pathogenic bacterium is present so they can properly diagnose a disease. Food microbiologists need to determine whether Salmonella, Listeria, or other potentially pathogenic bacteria are present in foods. Dairy microbiologists need to keep their important lactic acid bacteria in culture to produce a uniform variety of cheese.
Brewers and wine makers need to keep their cultures pure to inoculate the proper strains for quality control of their fermentations. Analysts at water treatment plants need to make sure that their chlorination treatment is effective in killing coliform bacteria in the treated water and distribution systems. Microbial ecologists need to identify bacteria that are responsible for important processes such as pesticide breakdown and nitrogen fixation.
The process of identification first assumes that the bacterium of interest is one that has already been described and named. This is the usual case for most clinical specimens or specimens from known fermentations. However, microbial ecologists often find that the organism they are interested in is new. It is estimated that less than 1% of prokaryotic species have been isolated, studied in the laboratory, and named. Therefore, it is not always possible to identify a bacterium that has been isolated from an environmental sample.
Phenotypic Tests
Phenotypic tests based on readily determined characteristics are often used to identify a species. Most are simple to perform and inexpensive. By contrast, the time and equipment required for genotypic tests such as DNA–DNA reassociation preclude their use in routine diagnosis. By performing a battery of some 10 to 20 simple phenotypic tests, it is often possible to determine the genus, and perhaps even the species, of a clinically important bacterium, although some taxa are much more difficult to identify than others.
Traditional methods for identification require growing the organism in question in pure culture and performing a number of phenotypic tests. For example, if one wishes to identify a rod-shaped bacterium, the first test would be a Gram stain. If the organism is determined to be a gram-negative rod, the next questions to ask include the following: Is it motile? Is it an obligate aerobe? Is it fermentative? Does it have catalase?
Can it grow using acetate as a sole carbon source? The answers to these questions will direct the investigator toward the next step in identification. Conversely, if the organism is a gram-positive rod, then a completely different set of tests would need to be performed, such as a test for endospore formation.
These tests take time to perform, and the appropriate tests for one genus of bacteria differ from those for others. Therefore, the results of one set of tests will determine which tests will need to be performed for further clarification of the taxon. Sometimes several weeks might be required to conduct all the tests needed to identify a strain. Rapid tests are extremely helpful, especially in the medical, food, and water-testing areas. Fortunately, standardized, routine tests can be performed for most clinically important bacteria that allow for their rapid identification. Several companies now produce commercial kits that assist in the identification of unknowns (see Figure 30.4).
An increasingly popular approach to identification of bacterial unknowns involves characterization of their fatty acids. The fatty acids are found in membrane lipids and are readily extracted from the cells and analyzed. Different species of Bacteria produce different types and ratios of fatty acids. The Archaea, of course, do not produce fatty acids, so this procedure is not of value for their identification. However, they do produce characteristic lipids that are useful taxonomically, especially for the halobacteria. The fatty acid analysis procedure involves hydrolyzing a small quantity of cell material (about 40 mg is all that is needed) and saponifying it in sodium hydroxide. The saponified material is then acidified with hydrochloric acid in methanol so that the fatty acids can be methylated to form methyl esters. The fatty acid methyl esters (FAME) are then extracted with an organic solvent and injected into a gas chromatograph.
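Because different species produce different types and ratios of fatty acids, the profile of an unknown can in principle be compared with reference profiles. The Python sketch below illustrates the idea only. It is not the scoring method of any commercial identification system: the reference library, the species names, the peak areas, and the choice of Euclidean distance as the measure of similarity are all assumptions made for this example.

```python
from math import sqrt

# Hypothetical reference library (invented values, not measured data):
# fatty acid label -> percentage of total peak area.
REFERENCE_PROFILES = {
    "species_A": {"12:0": 10.0, "16:0": 30.0, "16:1 w7c": 25.0, "18:1 w7c": 35.0},
    "species_B": {"10:0": 20.0, "16:0": 45.0, "18:1 w7c": 15.0, "19:0 cyclo w8c": 20.0},
}


def to_percent(peak_areas):
    """Convert raw peak areas to percentages of the total area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("profile has no peak area")
    return {acid: 100.0 * area / total for acid, area in peak_areas.items()}


def profile_distance(a, b):
    """Euclidean distance between two percentage profiles.

    A fatty acid missing from one profile is treated as 0%.
    """
    acids = set(a) | set(b)
    return sqrt(sum((a.get(x, 0.0) - b.get(x, 0.0)) ** 2 for x in acids))


def closest_reference(peak_areas, library=REFERENCE_PROFILES):
    """Return (name, distance) of the library entry nearest the unknown."""
    unknown = to_percent(peak_areas)
    distances = {
        name: profile_distance(unknown, to_percent(ref))
        for name, ref in library.items()
    }
    name = min(distances, key=distances.get)
    return name, distances[name]


# Usage: raw peak areas for an unknown isolate (invented numbers).
unknown_isolate = {"12:0": 9.0, "16:0": 31.0, "16:1 w7c": 24.0, "18:1 w7c": 36.0}
match, dist = closest_reference(unknown_isolate)
print(match, round(dist, 2))
```

A small distance only suggests similarity. Profiles are meaningfully comparable only when the organisms were grown under the same controlled conditions, and an unknown that is far from every reference may simply be absent from the library.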
The resulting chromatogram (Figure 17.11) can be used to identify the fatty acids that are indicative of a species. Commercial firms have developed databases of fatty acid profiles that can be used for the identification of species. The advantage of this procedure is that many samples can be analyzed quickly and without great effort. However, all organisms must be grown on the same medium and under controlled conditions of temperature and length of incubation.
Nucleic Acid Probes and Fluorescent Antisera
One exciting current area of research and commercial application involves the development of DNA or RNA “probes” that are specific for the signature sequences of rRNA or of some other appropriate gene, such as one encoding an enzyme that is characteristic of a species of interest. The probes are labeled in some manner so that the hybridization can be visualized. This is accomplished either by making the probes radioactive or by tagging them with a fluorescent dye or an enzyme that gives a colorimetric reaction. By the proper selection or design of probes, it is possible to identify an organism to a domain, genus, or species by demonstrating specific hybridization to the probe (see Chapter 25).
[Figure 17.11 chromatogram: x-axis, retention time (minutes); y-axis, quantity of fatty acid. Labeled peaks: solvent; 10:0; 12:0; 16:0 2OH; 16:1 omega 7 cis; 16:0; 18:1 omega 7 cis, omega 9 trans, omega 12 trans; 19:0 cyclo omega 8.]
Figure 17.11 Fatty acid analysis. Fatty acid methyl ester (FAME) chromatogram of an unknown species, showing chromatographic column retention times and peak heights.
Note: 10:0, 12:0, 16:0, and 19:0 indicate saturated fatty acids with 10, 12, 16, and 19 carbons; 16:1 and 18:1, monounsaturated 16-carbon and 18-carbon fatty acids; omega number, the position of the double bond relative to the omega end (that is, the hydrocarbon end, not the carboxyl end) of the fatty acid chain; cis and trans, the configuration of the double bond. For example, omega 7 cis indicates a cis double bond between the seventh and eighth carbons from the omega end of the fatty acid. Also, 2OH indicates a hydroxyl group at the second carbon from the carboxyl end; cyclo omega 8, a cyclo-carbon at the eighth position from the omega end. The 18:1 omega 7 cis, omega 9 trans, and omega 12 trans peak results from either one fatty acid or a mixture of fatty acids with double bonds at the three positions indicated (the chromatographic column does not separate these three fatty acids). Courtesy of MIDI (Microbial Identification, Incorporated, Delaware).
Potential applications for probe technology are considerable. A number of commercial firms are already marketing probes to identify pathogenic bacteria from clinical samples. A goal of this technology is to enable the rapid identification of organisms directly from clinical or environmental samples without actually growing them in culture first.
Fluorescent antiserum tests are also useful. For example, Legionella spp., the causative agents of Legionnaires’ disease, are very difficult to cultivate. However, good fluorescent antisera are available for the identification of these species directly from clinical samples or even from environmental samples (Figure 17.12).
Figure 17.12 Legionella fluorescence. Legionella species are difficult to grow, but can be identified to species by fluorescence microscopy using tagged and specifically labeled antisera. In this image the bacteria are yellow-green. ©Michael Abbey/Visuals Unlimited.
Another approach to the identification of species is multilocus sequence typing (MLST). In this approach, several core genes (typically seven or eight) of an organism are sequenced and used to compare the organism to known species. In the clinical setting, if the organism has been isolated from a patient, its MLST “type” can be compared with those in a large database to determine whether it belongs to a known pathogenic species. The MLST approach is rapidly gaining acceptance as a means of identification of pathogenic strains and species.
Culture Collections
Unlike plants and animals, many bacteria and archaea can be easily grown in pure culture, preserved by freeze-drying (lyophilization), handled in small test tubes and vials, and readily sent anywhere in the world. As lyophils, many of these organisms remain viable for 10 to 20 years or more and can be revived and studied by anyone anywhere. Cultures can also be frozen at –80°C in vials containing a suspension medium amended with 15% glycerol. These remain viable for many years. Thus, unlike plants, for which an herbarium is used to preserve the specimens collected of an original species, bacteria are preserved as type cultures that are clones of the original viable type strain of a species. The type strain of a species is the one on which the species definition has been based. Through culture collections, type strains are made available to professional microbiologists throughout the world. Many countries maintain national collections of microorganisms, such as the American Type Culture Collection (ATCC) in the United States (www.atcc.org).
If a microbiologist from India wishes to determine whether he or she has a new species, the original type strain can be obtained from a culture collection and used to conduct DNA–DNA reassociation assays and other tests to compare it with his or her isolates.
Because of the importance of biological materials to science and industry, culture collections have become biological resource centers. Thus, strains that have been patented are also deposited in culture collections so that they are accessible. Clones of genomes that have been sequenced are also deposited, as are viruses and other biological materials.
SECTION HIGHLIGHTS A number of tests are used to identify bacteria that have been isolated from clinical specimens and natural sources. Some tests are phenotypic, whereas others use molecular sequence information.
SUMMARY
• Bacterial taxonomy or systematics consists of three areas: nomenclature, classification, and identification. Nomenclature is the naming of an organism. An International Code for the Nomenclature of Bacteria has been published containing the rules for naming Bacteria and Archaea.
• Classification is the organization of Bacteria and Archaea into groups of similar species. Bacteria and Archaea are classified in increasing hierarchical rank: species, genus, family, order, class, phylum, and domain. Artificial classifications are not based on the evolution of organisms but on expressed features, that is, the phenotype, which includes properties such as cell shape and nutritional patterns. Phylogenetic classifications are based on the evolution of a group of organisms. Bacterial phylogeny is now based on the sequence information from the highly conserved macromolecule, 16S rRNA, as well as the sequences of other genes and proteins.
• Phylogenetic trees, with branches and nodes, can be constructed based on the sequence of macromolecules such as rRNA.
The length of the branch represents the inferred difference (number of changes) between organisms. External nodes represent extant species, whereas internal nodes represent ancestor species. Rooted trees are based on a comparison of related species and an out-group.
• Speciation is the process whereby organisms evolve into new species. Typically genetic material is transferred from the parental bacterium to the progeny through vertical inheritance. Horizontal gene transfer (HGT) also occurs, in which genetic material is transferred among bacteria that may or may not be closely related.
• A bacterial species is a group of similar strains that show at least 70% DNA–DNA hybridization. Organisms of the same species will have similar if not identical DNA mol % G + C content. However, organisms that have the same GC ratio are not necessarily similar. Based on molecular criteria, such as DNA–DNA reassociation, bacterial species are much more broadly defined than are plant and animal species.
• Identification is the process whereby unknown cultures can be compared to existing species to determine if they are sufficiently similar to be members of the same species.
• Type strains of all species must be deposited in at least two different culture collections, repositories where strains are preserved by lyophilization and deep freezing. The culture collections provide cultures to microbiologists worldwide so they can compare unidentified strains to the official type strains.
Find more at www.sinauer.com/microbial-life
REVIEW QUESTIONS
1. Is it important to name and classify bacteria?
2. What procedure(s) are necessary to identify a bacterial isolate as a species?
3. Differentiate between an artificial and a phylogenetic classification.
4. In what ways does the classification of Bacteria differ from that of eukaryotic organisms?
5. How do the Archaea differ from Bacteria? From eukaryotes?
6. How is DNA melted and reannealed, and why is this useful in bacterial taxonomy?
7. How would you go about identifying a bacterium that you isolated from a soil habitat?
8. Why is morphology of little use in bacterial classification? Is it of any use?
9. What is weighting, and should phenotypic features be weighted in a bacterial classification scheme?
10. Distinguish between lumpers and splitters.
11. If you were working in a clinical laboratory, outline the types of procedures you would use to identify isolates. Why do you recommend using the procedures you suggest?
12. Why is rRNA of use in bacterial classification?
13. Compare the information obtained from determining the DNA base composition (GC ratio) with that obtained by DNA reassociation experiments.
SUGGESTED READING
Boone, D., R. Castenholz and G. Garrity, eds. 2001. Bergey’s Manual of Systematic Bacteriology. 2nd ed., Vol. 1. New York: Springer-Verlag.
Brenner, D. J., N. R. Krieg, J. T. Staley and G. Garrity, eds. 2005. Bergey’s Manual of Systematic Bacteriology. 2nd ed., Vol. 2. New York: Springer-Verlag.
Gerhardt, P., ed. 1993. Methods for General and Molecular Microbiology. Washington, DC: ASM Press.
Graur, D. and W. H. Li. 2000. Fundamentals of Molecular Evolution. 2nd ed. Sunderland, MA: Sinauer Associates, Inc.
Hall, B. G. 2004. Phylogenetic Trees Made Easy. 2nd ed. Sunderland, MA: Sinauer Associates, Inc.
COMPUTER INTERNET RESOURCES
American Type Culture Collection: http://www.atcc.org/. This site has a listing of all the bacterial strains deposited in the American Type Culture Collection, as well as growth media and conditions.
Bergey’s Manual Trust. Headquarters at the University of Georgia. This website has information on the current classification of Bacteria and Archaea: http://www.bergeys.org/.
National Center for Biotechnology Information (NCBI): http://www.ncbi.nlm.nih.gov/. This center provides tools for comparing genes from different organisms, including BLAST (Basic Local Alignment Search Tool); GenBank, which contains a huge collection of gene sequences that can be used for comparative analyses; and a section on taxonomy.
Ribosome Database Project (RDP), Michigan State University: http://rdp.cme.msu.edu/. This site has information on more than 250,000 bacterial and archaeal 16S rRNA sequences. The database allows one to conduct phylogenetic analyses of unknown strains whose 16S rRNA sequence has been determined and to compare the sequence with those already reported.
Comparative RNA Web Site: www.rna.icmb.utexas.edu. A remarkable collection of RNA sequence information presented with secondary structure models, conservation diagrams, and more. Published as Cannone, J. J., et al. 2002. “The Comparative RNA Web Site: An online database of comparative sequence and structure information for ribosomal, intron, and other RNAs.” BioMed Central Bioinformatics 3: 2.