Prescott−Harley−Klein: Microbiology, Fifth Edition III. Microbial Metabolism 10. Metabolism: The Use of Energy in Biosynthesis © The McGraw−Hill Companies, 2002

CHAPTER 10
Metabolism: The Use of Energy in Biosynthesis

[Chapter-opening figure] The nitrogenase Fe protein’s subunits are arranged like a pair of butterfly wings. Nitrogenase consists of the Fe protein and the MoFe protein; it catalyzes the reduction of atmospheric nitrogen during nitrogen fixation.

Outline
10.1 Principles Governing Biosynthesis
10.2 The Photosynthetic Fixation of CO2
    The Carboxylation Phase
    The Reduction Phase
    The Regeneration Phase
10.3 Synthesis of Sugars and Polysaccharides
10.4 The Assimilation of Inorganic Phosphorus, Sulfur, and Nitrogen
    Phosphorus Assimilation
    Sulfur Assimilation
    Nitrogen Assimilation
    Nitrogen Fixation
10.5 The Synthesis of Amino Acids
10.6 Anaplerotic Reactions
10.7 The Synthesis of Purines, Pyrimidines, and Nucleotides
    Purine Biosynthesis
    Pyrimidine Biosynthesis
10.8 Lipid Synthesis
10.9 Peptidoglycan Synthesis
10.10 Patterns of Cell Wall Formation

Concepts
1. In anabolism or biosynthesis, cells use free energy to construct more complex molecules and structures from smaller, simpler precursors.
2. Biosynthetic pathways are organized to optimize efficiency by conserving biosynthetic raw materials and energy.
3. Autotrophs use ATP and NADPH from photosynthesis or from oxidation of inorganic molecules to reduce CO2 and incorporate it into organic material.
4. Catabolic and anabolic pathways may differ in enzymes, regulation, intracellular location, and use of cofactors and nucleoside diphosphate carriers. Although many enzymes of amphibolic pathways participate in both catabolism and anabolism, some pathway enzymes are involved only in one of the two processes.
5. Phosphorus, in the form of phosphate, can be directly assimilated, whereas inorganic sulfur and nitrogen compounds must often be reduced before incorporation into organic material.
6. The tricarboxylic acid (TCA) cycle acts as an amphibolic pathway and requires anaplerotic reactions to maintain adequate levels of cycle intermediates.
7. Most glycolytic enzymes participate in both the synthesis and catabolism of glucose. In contrast, fatty acids are synthesized from acetyl-CoA and malonyl-CoA by a pathway quite different from fatty acid β-oxidation.
8. Peptidoglycan synthesis is a complex, multistep process that is begun in the cytoplasm and completed at the cell wall after the peptidoglycan repeat unit has been transported across the plasma membrane.

Biological structures are almost always constructed in a hierarchical manner, with subassemblies acting as important intermediates en route from simple starting molecules to the end products of organelles, cells, and organisms.
    W. M. Becker and D. W. Deamer

As the last chapter makes clear, microorganisms can obtain energy in many ways. Much of this energy is used in biosynthesis or anabolism. During biosynthesis a microorganism begins with simple precursors, such as inorganic molecules and monomers, and constructs ever more complex molecules until new organelles and cells arise (figure 10.1). A microbial cell must manufacture many different kinds of molecules; however, it is possible to discuss the synthesis of only the most important types of cell constituents. This chapter begins with a general introduction to anabolism, then focuses on the synthesis of carbohydrates, amino acids, purines and pyrimidines, and lipids. It also describes the assimilation of CO2, phosphorus, sulfur, and nitrogen. The chapter ends with a section on the synthesis of peptidoglycan and bacterial cell walls. Protein and nucleic acid synthesis is so significant and complex that it is described separately in chapters 11 and 12.

Because anabolism is the creation of order and a cell is highly ordered and immensely complex, much energy is required for biosynthesis. This is readily apparent from estimates of the biosynthetic capacity of rapidly growing Escherichia coli (table 10.1). Although most ATP dedicated to biosynthesis is employed in protein synthesis, ATP is also used to make other cell constituents.

[Figure 10.1 levels of organization, with examples, from most to least complex: cells (bacteria, algae, fungi, protozoa); organelles (nuclei, mitochondria, ribosomes, flagella); supramolecular systems (membranes, enzyme complexes); macromolecules (nucleic acids, proteins, polysaccharides, lipids); monomers or building blocks (nucleotides, amino acids, sugars, fatty acids); inorganic molecules (CO2, NH3, H2O, PO4^3-).]
Figure 10.1 The Construction of Cells. The biosynthesis of procaryotic and eucaryotic cell constituents. Biosynthesis is organized in levels of ever greater complexity.

Free energy is required for biosynthesis in mature cells of constant size because cellular molecules are continuously being degraded and resynthesized, a process known as turnover. Cells are never the same from instant to instant. Despite the continuous turnover of cell constituents, metabolism is carefully regulated so that the rate of biosynthesis is approximately balanced by that of catabolism. In addition to the energy expended in the turnover of molecules, many nongrowing cells also use energy to synthesize enzymes and other substances for release into their surroundings.
Regulation of metabolism (chapters 8 and 12)

10.1 Principles Governing Biosynthesis

Biosynthetic metabolism seems to follow certain patterns or be shaped by a few general principles. Six of these are now briefly discussed.

1.
A microbial cell contains large quantities of proteins, nucleic acids, and polysaccharides, all of which are macromolecules, or very large molecules that are polymers of smaller units joined together. The construction of large, complex molecules from a few simple structural units or monomers saves much genetic storage capacity, biosynthetic raw material, and energy. A consideration of protein synthesis clarifies this. Proteins, whatever their size, shape, or function, are made of only 20 common amino acids joined by peptide bonds (see appendix I).

Table 10.1 Biosynthesis in Escherichia coli

Cell Constituent    Molecules per Cell(a)    Molecules Synthesized    Molecules of ATP Required
                                             per Second               per Second for Synthesis
DNA                 1(b)                     0.00083                  60,000
RNA                 15,000                   12.5                     75,000
Polysaccharides     39,000                   32.5                     65,000
Lipids              15,000,000               12,500.0                 87,000
Proteins            1,700,000                1,400.0                  2,120,000

(a) Estimates for a cell with a volume of 2.25 µm^3, a total weight of 1 × 10^-12 g, a dry weight of 2.5 × 10^-13 g, and a 20 minute cell division cycle.
(b) It should be noted that bacteria can contain multiple copies of their genomic DNA.
From Bioenergetics by Albert Lehninger. Copyright © 1971 by the Benjamin/Cummings Publishing Company. Reprinted by permission.

Different proteins simply have different amino acid sequences but not new and dissimilar amino acids. Suppose that proteins were composed of 40 different amino acids instead of 20. The cell would then need the enzymes to manufacture twice as many amino acids (or would have to obtain the extra amino acids in its diet). Genes would be required for the extra enzymes, and the cell would have to invest raw materials and energy in the synthesis of these additional genes, enzymes, and amino acids.
Clearly the use of a few monomers linked together by a single type of covalent bond makes the synthesis of macromolecules a highly efficient process. Almost all cell structures are built mainly of about 30 small precursors.

2. The cell often saves additional materials and energy by using many of the same enzymes for both catabolism and anabolism. For example, most glycolytic enzymes are involved in the synthesis and the degradation of glucose.

3. Although many enzymes in amphibolic pathways (see section 9.1) participate in both catabolic and anabolic activities, some steps are catalyzed by two different enzymes. One enzyme catalyzes the reaction in the catabolic direction, the other reverses this conversion (figure 10.2). Thus catabolic and anabolic pathways are never identical although many enzymes are shared. Use of separate enzymes for the two directions of a single step permits independent regulation of catabolism and anabolism. Although this has been discussed in more detail in sections 8.7 through 8.9, note that the regulation of anabolism is somewhat different from that of catabolism. Both types of pathways can be regulated by their end products as well as by the concentrations of ATP, ADP, AMP, and NAD+. Nevertheless, end product regulation generally assumes more importance in anabolic pathways.

4. To synthesize molecules efficiently, anabolic pathways must operate irreversibly in the direction of biosynthesis. Cells can achieve this by connecting some biosynthetic reactions to the breakdown of ATP and other nucleoside triphosphates. When these two processes are coupled, the free energy made available during nucleoside triphosphate breakdown drives the biosynthetic reaction to completion (see sections 8.3 and 8.4).

[Figure 10.2 diagram labels: anabolic branches from G to X, Y, and Z; amphibolic pathway through A, B, C, D, E, F, and G, with enzymes E1 and E2 between C and D.]
Figure 10.2 A Hypothetical Biosynthetic Pathway. The routes connecting G with X, Y, and Z are purely anabolic because they are used only for synthesis of the end products.
The pathway from A to G is amphibolic; that is, it has both catabolic and anabolic functions. Most reactions are used in both roles; however, the interconversion of C and D is catalyzed by two separate enzymes, E1 (catabolic) and E2 (anabolic).

5. In eucaryotic microorganisms biosynthetic pathways are frequently located in different cellular compartments from their corresponding catabolic pathways (Box 10.1). For example, fatty acid biosynthesis occurs in the cytoplasmic matrix, whereas fatty acid oxidation takes place within the mitochondrion. Compartmentation makes it easier for the pathways to operate simultaneously but independently.

6. Finally, anabolic and catabolic pathways often use different cofactors. Usually catabolic oxidations produce NADH, a substrate for electron transport. In contrast, when a reductant is needed during biosynthesis, NADPH rather than NADH normally serves as the donor. Fatty acid metabolism provides a second example. Fatty acyl-CoA molecules are oxidized to generate energy, whereas fatty acid synthesis involves acyl carrier protein thioesters (p. 220).

After macromolecules have been constructed from simpler precursors, they are assembled into larger, more complex structures such as supramolecular systems and organelles (figure 10.1). Macromolecules normally contain the necessary information to form spontaneously in a process known as self-assembly. For example, ribosomes are large assemblages of many proteins and ribonucleic acid molecules, yet they arise by the self-assembly of their components without the involvement of extra factors.

1. Define biosynthesis or anabolism and turnover.
2. List six principles by which biosynthetic pathways are organized.

Box 10.1 The Identification of Anabolic Pathways

There are three approaches to the study of pathway organization: (1) study of the pathway in vitro, (2) use of nutritional mutants, and (3) incubation of cells with precursors labeled radioisotopically. In vitro [Latin, in glass] studies employ cell-free extracts to search for enzymes and metabolic intermediates that might belong to a pathway. Although this direct approach was used to work out the organization of many catabolic pathways, progress in research on biosynthesis was slow until the other two techniques were developed in the early to middle 1940s.

Techniques using nutritional mutants were developed during Beadle and Tatum’s work on the genetics of the red bread mold Neurospora. This approach is best illustrated with a hypothetical example. Suppose that a pathway for the synthesis of end product Z is organized with E1, E2, and so on, representing pathway enzymes.

        E1        E2        E3
    A ------> B ------> C ------> Z

The prototroph (see section 11.6), which will grow in medium lacking Z, can be treated with mutagenic agents such as ultraviolet light, X rays, or chemical mutagens. Some resulting mutants will be auxotrophs that require the presence of Z for growth because one of their biosynthetic enzymes is now inactive. When E3 is inactive, the microorganism will grow only in the presence of Z, even though it can make C from the precursor A. When grown in the presence of a small amount of Z, intermediate C (the intermediate just before the blocked step) will accumulate in the medium. In this way a variety of mutants can be used to establish the identity of pathway intermediates.

The order of intermediates can be determined by cross-feeding experiments. If E2 has been inactivated by a mutation, the mutant will grow only when either C or Z is supplied. Because the medium in which the E3 mutant has been cultured contains intermediate C, it will support growth of the E2 mutant (other mutants would not produce enough C to support growth). If cross-feeding experiments are conducted with mutants of each step in the pathway, the steps can be placed in the correct order. Application of this technique quickly led to the elucidation of the pathways for the synthesis of tryptophan, folic acid, and other molecules.

Radioisotopes such as 14C are used in the third approach to studying pathway organization. Potential biosynthetic precursors are synthesized in the laboratory with specific atoms made radioactive. The microorganism then is incubated with culture medium containing the radioactive molecule, and the biosynthetic end product is isolated and analyzed. If the molecule truly is a precursor of the end product, the latter should be radioactive. The location of the radioactive atom will determine what part of the product is contributed by the radioactively labeled precursor. Precisely the same approach can be employed with nonradioactive atoms like 15N. This technique provided some of the first information about the nature of the purine biosynthetic pathway.

10.2 The Photosynthetic Fixation of CO2

Although most microorganisms can incorporate or fix CO2, at least in anaplerotic reactions (pp. 215–16), only autotrophs use CO2 as their sole or principal carbon source. The reduction and incorporation of CO2 require much energy. Usually autotrophs obtain energy by trapping light during photosynthesis, but some derive energy from the oxidation of reduced inorganic electron donors. Autotrophic CO2 fixation is crucial to life on earth because it provides the organic matter on which heterotrophs depend.
Photosynthetic light reactions and chemolithotrophy (pp. 193–201)

Microorganisms can fix CO2 or convert this inorganic molecule to organic carbon and assimilate it in three major ways. Almost all microbial autotrophs incorporate CO2 by a special metabolic pathway called by several names: the Calvin cycle, Calvin-Benson cycle, or reductive pentose phosphate cycle.
Although the Calvin cycle is found in photosynthetic eucaryotes and most photosynthetic procaryotes, it is absent in the Archaea, some obligately anaerobic bacteria, and some microaerophilic bacteria. These microorganisms usually employ one of two other pathways. A reductive tricarboxylic acid pathway (see figure 20.6a) is used by some archaea (Thermoproteus, Sulfolobus) and by bacteria such as Chlorobium and Desulfobacter. The acetyl-CoA pathway (see figure 20.6b) is found in methanogens, sulfate reducers, and bacteria that can form acetate from CO2 during fermentation (acetogens). Because of its importance, we will focus on the Calvin cycle here.
Alternate CO2 fixation pathways (pp. 454–55)

The Calvin cycle is found in the chloroplast stroma of eucaryotic microbial autotrophs. Cyanobacteria, some nitrifying bacteria, and thiobacilli possess carboxysomes. These are polyhedral inclusion bodies that contain the enzyme ribulose-1,5-bisphosphate carboxylase (see following section). They may be the site of CO2 fixation or may store the carboxylase and other proteins. Understanding the cycle is easiest if it is divided into three phases: carboxylation, reduction, and regeneration. An overview of the cycle is given in figure 10.4 and the details are presented in appendix II.

[Figure 10.3 structures: ribulose 1,5-bisphosphate plus CO2 and H2O, acted on by ribulose-1,5-bisphosphate carboxylase, giving two molecules of 3-phosphoglycerate (PGA).]
Figure 10.3 The Ribulose-1,5-Bisphosphate Carboxylase Reaction. This enzyme catalyzes the addition of carbon dioxide to ribulose 1,5-bisphosphate, forming an unstable intermediate, which then breaks down to two molecules of 3-phosphoglycerate.

The Carboxylation Phase

Carbon dioxide fixation is accomplished by the enzyme ribulose 1,5-bisphosphate carboxylase or ribulose bisphosphate carboxylase/oxygenase (rubisco) (figure 10.3), which catalyzes the addition of CO2 to ribulose 1,5-bisphosphate (RuBP), forming two molecules of 3-phosphoglycerate (PGA).

The Reduction Phase

After PGA is formed by carboxylation, it is reduced to glyceraldehyde 3-phosphate. The reduction, carried out by two enzymes, is essentially a reversal of a portion of the glycolytic pathway, although the glyceraldehyde 3-phosphate dehydrogenase differs from the glycolytic enzyme in using NADP+ rather than NAD+ (figure 10.4).

The Regeneration Phase

The third phase of the Calvin cycle regenerates RuBP and produces carbohydrates such as glyceraldehyde 3-phosphate, fructose, and glucose (figure 10.4). This portion of the cycle is similar to the pentose phosphate pathway and involves the transketolase and transaldolase reactions. The cycle is completed when phosphoribulokinase reforms RuBP.

To synthesize fructose 6-phosphate or glucose 6-phosphate from CO2, the cycle must operate six times to yield the desired hexose and reform the six RuBP molecules.

    6RuBP + 6CO2 → 12PGA → 6RuBP + fructose 6-P

[Figure 10.4 diagram labels. Carboxylation phase: ribulose 1,5-bisphosphate (RuBP) + CO2 to 3-phosphoglycerate. Reduction phase: phosphoglycerate kinase (ATP to ADP) forms 1,3-bisphosphoglycerate; glyceraldehyde 3-phosphate dehydrogenase (NADPH + H+ to NADP+ + Pi) forms glyceraldehyde 3-phosphate, shown alongside DHAP. Regeneration phase: ribulose 5-phosphate, fructose 1,6-bisphosphate, fructose 6-phosphate, erythrose 4-phosphate, ribose 5-phosphate and other intermediates; biosynthetic products.]
Figure 10.4 The Calvin Cycle. This is an overview of the cycle with only the carboxylation and reduction phases in detail.
Three ribulose 1,5-bisphosphates are carboxylated to give six 3-phosphoglycerates in the carboxylation phase. These are converted to six glyceraldehyde 3-phosphates, which can be converted to dihydroxyacetone phosphate (DHAP). Five of the six trioses (glyceraldehyde phosphate and dihydroxyacetone phosphate) are used to reform three ribulose 1,5-bisphosphates in the regeneration phase. The remaining triose is used in biosynthesis.

The incorporation of one CO2 into organic material requires three ATPs and two NADPHs. The formation of glucose from CO2 may be summarized by the following equation.

    6CO2 + 18ATP + 12NADPH + 12H+ + 12H2O → glucose + 18ADP + 18Pi + 12NADP+

ATP and NADPH are provided by photosynthetic light reactions or by oxidation of inorganic molecules in chemoautotrophs. Sugars formed in the Calvin cycle can then be used to synthesize other essential molecules.

Figure 10.6 Uridine Diphosphate Glucose. Glucose is in color.

[Figure 10.5 pathway labels: pyruvate to oxaloacetate (pyruvate carboxylase; ATP, CO2); oxaloacetate to phosphoenolpyruvate (phosphoenolpyruvate carboxykinase; GTP, CO2 released); phosphoenolpyruvate, 2-phosphoglycerate, 3-phosphoglycerate, 1,3-bisphosphoglycerate, glyceraldehyde 3-phosphate and dihydroxyacetone phosphate, fructose 1,6-bisphosphate; fructose 1,6-bisphosphate to fructose 6-phosphate (fructose bisphosphatase; H2O, Pi); glucose 6-phosphate to glucose (glucose 6-phosphatase; H2O, Pi). Glycolytic counterparts shown for comparison: pyruvate kinase, phosphofructokinase, and hexokinase.]
Figure 10.5 Gluconeogenesis. The gluconeogenic pathway used in many microorganisms.
The names of the four enzymes catalyzing reactions different from those found in glycolysis are in shaded boxes. Glycolytic steps are shown in blue for comparison.

10.3 Synthesis of Sugars and Polysaccharides

Many microorganisms cannot carry out photosynthesis and are heterotrophs that must synthesize sugars from reduced organic molecules rather than from CO2. The synthesis of glucose from noncarbohydrate precursors is called gluconeogenesis. Although the gluconeogenic pathway is not identical with the glycolytic pathway, they do share seven enzymes (figure 10.5). Three glycolytic steps are irreversible in the cell: (1) the conversion of phosphoenolpyruvate to pyruvate, (2) the formation of fructose 1,6-bisphosphate from fructose 6-phosphate, and (3) the phosphorylation of glucose. These must be bypassed when the pathway is operating biosynthetically. For example, the formation of fructose 1,6-bisphosphate by phosphofructokinase is reversed by the enzyme fructose bisphosphatase, which hydrolytically removes a phosphate from fructose bisphosphate. Usually at least two enzymes are involved in the conversion of pyruvate to phosphoenolpyruvate (the reversal of the pyruvate kinase step).

As can be seen in figure 10.5, the pathway synthesizes fructose as well as glucose. Once glucose and fructose have been formed, other common sugars can be manufactured. For example, mannose comes directly from fructose by a simple rearrangement.

    Fructose 6-phosphate ⇌ mannose 6-phosphate

Figure 10.7 Uridine Diphosphate Galactose and Glucuronate Synthesis. The synthesis of UDP-galactose and UDP-glucuronic acid from UDP-glucose. Structural changes are indicated by colored boxes.

Several sugars are synthesized while attached to a nucleoside diphosphate. The most important nucleoside diphosphate sugar is uridine diphosphate glucose (UDPG). Glucose is activated by attachment to the pyrophosphate of uridine diphosphate through a reaction with uridine triphosphate (figure 10.6).
The UDP portion of UDPG is recognized by enzymes and carries glucose around the cell for participation in enzyme reactions much like ADP bears phosphate in the form of ATP. UDP-galactose is synthesized from UDPG through a rearrangement of one hydroxyl group. A different enzyme catalyzes the synthesis of UDP-glucuronic acid through the oxidation of UDPG (figure 10.7).

Nucleoside diphosphate sugars also play a central role in the synthesis of polysaccharides such as starch and glycogen. Again, biosynthesis is not simply a direct reversal of catabolism. Glycogen and starch catabolism (see section 9.7) proceeds either by hydrolysis to form free sugars or by the addition of phosphate to these polymers with the production of glucose 1-phosphate. Nucleoside diphosphate sugars are not involved. In contrast, during the synthesis of glycogen and starch in bacteria and algae, adenosine diphosphate glucose is formed from glucose 1-phosphate and then donates glucose to the end of growing glycogen and starch chains.

    ATP + glucose 1-phosphate → ADP-glucose + PPi
    (Glucose)n + ADP-glucose → (glucose)n+1 + ADP

Figure 10.8 Phosphoadenosine 5′-phosphosulfate (PAPS). The sulfate group is in color.

Nucleoside diphosphate sugars also participate in the synthesis of complex molecules such as bacterial cell walls (pp. 221–23).

1. Briefly describe the three stages of the Calvin cycle.
2. What is gluconeogenesis and how does it usually occur? Describe the formation of mannose, galactose, starch, and glycogen. Why are nucleoside diphosphate sugars important?
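The two-step ADP-glucose route to glycogen and starch given in this section can be followed with a little bookkeeping. The sketch below is illustrative only: the `Pools` class and the `elongate` function are hypothetical names introduced here, and the model tracks nothing beyond the two reactions written in the text (one ATP consumed, one PPi and one ADP released per glucose added).

```python
from dataclasses import dataclass


@dataclass
class Pools:
    """Running totals of cofactors used and released (hypothetical bookkeeping)."""
    atp_used: int = 0
    ppi_released: int = 0
    adp_released: int = 0


def elongate(chain_length: int, glucose_1p: int, pools: Pools) -> int:
    """Add each available glucose 1-phosphate to the chain via ADP-glucose."""
    for _ in range(glucose_1p):
        # Activation: ATP + glucose 1-phosphate -> ADP-glucose + PPi
        pools.atp_used += 1
        pools.ppi_released += 1
        # Transfer: (glucose)n + ADP-glucose -> (glucose)n+1 + ADP
        chain_length += 1
        pools.adp_released += 1
    return chain_length


pools = Pools()
final_length = elongate(chain_length=10, glucose_1p=5, pools=pools)
print(final_length, pools)
```

Under these assumptions the cost scales linearly: each residue added to the polymer costs one ATP, which is why the synthetic route is not a simple reversal of the phosphate-dependent breakdown that yields glucose 1-phosphate.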
10.4 The Assimilation of Inorganic Phosphorus, Sulfur, and Nitrogen

Besides carbon and oxygen, microorganisms also require large quantities of phosphorus, sulfur, and nitrogen for biosynthesis. Each of these is assimilated, or incorporated into organic molecules, by different routes.
Microbial nutrition (chapter 5); Microbial participation in biogeochemical cycles (section 28.4)

Phosphorus Assimilation

Phosphorus is found in nucleic acids, proteins, phospholipids, ATP, and coenzymes like NADP. The most common phosphorus sources are inorganic phosphate and organic phosphate esters. Inorganic phosphate is incorporated through the formation of ATP in one of three ways: by (1) photophosphorylation (see pp. 196–99), (2) oxidative phosphorylation (see pp. 187–89), and (3) substrate-level phosphorylation. Glycolysis provides an example of the latter process. Phosphate is joined with glyceraldehyde 3-phosphate to give 1,3-bisphosphoglycerate, which is next used in ATP synthesis.

    Glyceraldehyde 3-P + Pi + NAD+ → 1,3-bisphosphoglycerate + NADH + H+
    1,3-bisphosphoglycerate + ADP → 3-phosphoglycerate + ATP

Microorganisms may obtain organic phosphates from their surroundings in dissolved or particulate form. Phosphatases very often hydrolyze organic phosphate esters to release inorganic phosphate. Gram-negative bacteria have phosphatases in the periplasmic space between their cell wall and the plasma membrane, which allows phosphate to be taken up immediately after release. On the other hand, protozoa can directly use organic phosphates after ingestion or hydrolyze them in lysosomes and incorporate the phosphate.

Sulfur Assimilation

Sulfur is needed for the synthesis of amino acids (cysteine and methionine) and several coenzymes (e.g., coenzyme A and biotin) and may be obtained from two sources. Many microorganisms use cysteine and methionine, obtained from either external sources or intracellular amino acid reserves.
In addition, sulfate can provide sulfur for biosynthesis. The sulfur atom in sulfate is more oxidized than it is in cysteine and other organic molecules; thus sulfate must be reduced before it can be assimilated. This process is known as assimilatory sulfate reduction to distinguish it from the dissimilatory sulfate reduction that takes place when sulfate acts as an electron acceptor during anaerobic respiration (see figure 28.21).
Anaerobic respiration (pp. 190–91)

Assimilatory sulfate reduction involves sulfate activation through the formation of phosphoadenosine 5′-phosphosulfate (figure 10.8), followed by reduction of the sulfate. The process is a complex one (figure 10.9) in which sulfate is first reduced to sulfite (SO3^2-), then to hydrogen sulfide. Cysteine can be synthesized from hydrogen sulfide in two ways. Fungi appear to combine hydrogen sulfide with serine to form cysteine (process 1), whereas many bacteria join hydrogen sulfide with O-acetylserine instead (process 2).

    (1) H2S + serine → cysteine + H2O
    (2) Serine + acetyl-CoA → O-acetylserine + CoA
        O-acetylserine + H2S → cysteine + acetate

Once formed, cysteine can be used in the synthesis of other sulfur-containing organic compounds.

[Figure 10.9 pathway labels: SO4^2- + ATP to adenosine 5′-phosphosulfate + PPi; adenosine 5′-phosphosulfate + ATP to phosphoadenosine 5′-phosphosulfate + ADP; phosphoadenosine 5′-phosphosulfate + NADPH + H+ to SO3^2- + phosphoadenosine 5′-phosphate + NADP+; SO3^2- + NADPH + H+ to H2S + NADP+; H2S to organic sulfur compounds (e.g., cysteine).]
Figure 10.9 The Sulfate Reduction Pathway.

Nitrogen Assimilation

Because nitrogen is a major component of proteins, nucleic acids, coenzymes, and many other cell constituents, the cell’s ability to assimilate inorganic nitrogen is exceptionally important. Although nitrogen gas is abundant in the atmosphere, few microorganisms can reduce the gas and use it as a nitrogen source. Most must incorporate either ammonia or nitrate.

Ammonia Incorporation

Ammonia nitrogen can be incorporated into organic material relatively easily and directly because it is more reduced than other forms of inorganic nitrogen. Some microorganisms form the amino acid alanine in a reductive amination reaction catalyzed by alanine dehydrogenase.

    Pyruvate + NH4+ + NADH (NADPH) + H+ ⇌ L-alanine + NAD+ (NADP+) + H2O

The major route for ammonia incorporation often is the formation of glutamate from α-ketoglutarate (a TCA cycle intermediate). Many bacteria and fungi employ glutamate dehydrogenase, at least when the ammonia concentration is high.

    α-ketoglutarate + NH4+ + NADPH (NADH) + H+ ⇌ glutamate + NADP+ (NAD+) + H2O

Different species vary in their ability to use NADPH and NADH as the reducing agent in glutamate synthesis.

[Figure 10.10 diagram labels: α-ketoglutarate + NH4+ converted to glutamate by GDH (NAD(P)H to NAD(P)+, H2O released); transaminases transfer the amino group from glutamate to an α-keto acid, giving an amino acid and regenerating α-ketoglutarate.]
Figure 10.10 The Ammonia Assimilation Pathway. Ammonia assimilation by use of glutamate dehydrogenase (GDH) and transaminases. Either NADP- or NAD-dependent glutamate dehydrogenases may be involved. This route is most active at high ammonia concentrations.

Once either alanine or glutamate has been synthesized, the newly formed α-amino group can be transferred to other carbon skeletons by transamination reactions (see section 9.9) to form different amino acids. Transaminases possess the coenzyme pyridoxal phosphate, which is responsible for the amino group transfer. Microorganisms have a number of transaminases, each of which catalyzes the formation of several amino acids using the same amino acid as an amino group donor. When glutamate dehydrogenase works in cooperation with transaminases, ammonia can be incorporated into a variety of amino acids (figure 10.10).
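The cooperation between glutamate dehydrogenase and a transaminase can be checked by adding the two reactions and cancelling what appears on both sides. The following is a minimal sketch, not a metabolic model: `net_reaction` is a hypothetical helper written for this example, the NADPH form of the dehydrogenase reaction is assumed, and the transaminase step is written generically (glutamate + α-keto acid → α-ketoglutarate + amino acid) as labeled in figure 10.10.

```python
from collections import defaultdict


def net_reaction(*reactions):
    """Sum stoichiometries (negative = consumed, positive = produced)
    and drop any species whose coefficients cancel to zero."""
    total = defaultdict(int)
    for reaction in reactions:
        for species, coeff in reaction.items():
            total[species] += coeff
    return {s: c for s, c in total.items() if c != 0}


# Glutamate dehydrogenase, NADPH-dependent form (assumed for this example)
gdh = {
    "alpha-ketoglutarate": -1, "NH4+": -1, "NADPH": -1, "H+": -1,
    "glutamate": +1, "NADP+": +1, "H2O": +1,
}

# Generic transaminase step using glutamate as the amino group donor
transaminase = {
    "glutamate": -1, "alpha-keto acid": -1,
    "alpha-ketoglutarate": +1, "amino acid": +1,
}

net = net_reaction(gdh, transaminase)
for species, coeff in sorted(net.items()):
    print(f"{coeff:+d} {species}")
```

Glutamate and α-ketoglutarate cancel, so in this sketch they behave as a recycled carrier pair: the net change is the reductive amination of an α-keto acid to the corresponding amino acid at the cost of one NADPH.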
A second route of ammonia incorporation involves two enzymes acting in sequence, glutamine synthetase and glutamate synthase (figure 10.11). Ammonia is used to synthesize glutamine from glutamate, then the amide nitrogen of glutamine is transferred to α-ketoglutarate to generate a new glutamate molecule. Because glutamate acts as an amino donor in transaminase reactions, ammonia may be used to synthesize all common amino acids when suitable transaminases are present (figure 10.12). Both ATP and a source of electrons, such as NADPH or reduced ferredoxin, are required. This route is present in Escherichia coli, Bacillus megaterium, and other bacteria. The two enzymes acting in sequence operate very effectively at low ammonia concentrations, unlike the glutamate dehydrogenase pathway. As we saw earlier, glutamine synthetase is tightly regulated by reversible covalent modification and allosteric effectors (see pp. 168–69).

[Figure 10.11 reactions. Glutamine synthetase reaction: glutamic acid + NH3 + ATP → glutamine + ADP + Pi. Glutamate synthase reaction: glutamine + α-ketoglutaric acid + NADPH + H+ (or reduced Fd) → two glutamic acids + NADP+ (or oxidized Fd).]
Figure 10.11 Glutamine Synthetase and Glutamate Synthase. The glutamine synthetase and glutamate synthase reactions involved in ammonia assimilation. Some glutamate synthases use NADPH as an electron source; others use reduced ferredoxin (Fd). The nitrogen being incorporated and transferred is shown in green.

[Figure 10.12 diagram labels: NH3 + glutamate + ATP to glutamine + ADP + Pi (glutamine synthetase); glutamine + α-ketoglutarate to two glutamates, with NADPH or Fd(red) oxidized to NADP+ or Fd(ox) (glutamate synthase); transaminases transfer the amino group from glutamate to an α-keto acid, giving an amino acid.]
Figure 10.12 Ammonia Incorporation Using Glutamine Synthetase and Glutamate Synthase. This route is effective at low ammonia concentrations.

Assimilatory Nitrate Reduction

The nitrogen in nitrate (NO3-) is much more oxidized than that in ammonia. Nitrate must first be reduced to ammonia before the nitrogen can be converted to an organic form. This reduction of nitrate is called assimilatory nitrate reduction, which is not the same as that occurring during anaerobic respiration and dissimilatory nitrate reduction (see sections 9.6 and 28.4). In assimilatory nitrate reduction, nitrate is incorporated into organic material and does not participate in energy generation. The process is widespread among bacteria, fungi, and algae.

Assimilatory nitrate reduction takes place in the cytoplasm in bacteria. The first step in nitrate assimilation is its reduction to nitrite by nitrate reductase, an enzyme that contains both FAD and molybdenum (figure 10.13). NADPH is the electron source.

    NO3- + NADPH + H+ → NO2- + NADP+ + H2O

Nitrite is next reduced to ammonia with a series of two electron additions catalyzed by nitrite reductase and possibly other enzymes. Hydroxylamine may be an intermediate. The ammonia is then incorporated into amino acids by the routes already described.

Nitrogen Fixation

The reduction of atmospheric gaseous nitrogen to ammonia is called nitrogen fixation. Because ammonia and nitrate levels often are low and only a few procaryotes can carry out nitrogen fixation (eucaryotic cells completely lack this ability), the rate of this process limits plant growth in many situations.
Nitrogen fixation occurs in (1) free-living bacteria (e.g., Azotobacter, Klebsiella, Clostridium, and Methanococcus), (2) bacteria living in symbiotic association with plants such as legumes (Rhizobium), and (3) cyanobacteria (Nostoc and Anabaena). The biological aspects of nitrogen fixation are discussed in chapter 30. The biochemistry of nitrogen fixation is the focus of this section. The biology of nitrogen-fixing microorganisms (pp. 492, 616, 675–78)

Figure 10.13 Assimilatory Nitrate Reduction. This sequence is thought to operate in bacteria that can reduce and assimilate nitrate nitrogen. See text for details. Sequence shown: NO3− → (nitrate reductase, with Mo5+ and FAD; 2e− from NADPH) NO2− → (nitrite reductase; 2e−) [NOH], nitroxyl → (2e−) NH2OH, hydroxylamine → (2e−) NH3.

The reduction of nitrogen to ammonia is catalyzed by the enzyme nitrogenase. Although the enzyme-bound intermediates in this process are still unknown, it is believed that nitrogen is reduced by two-electron additions in a way similar to that illustrated in figure 10.14. The reduction of molecular nitrogen to ammonia is quite exergonic, but the reaction has a high activation energy because molecular nitrogen is an unreactive gas with a triple bond between the two nitrogen atoms. Therefore nitrogen reduction is expensive and requires a large ATP expenditure. At least 8 electrons and 16 ATP molecules, 4 ATPs per pair of electrons, are required.

N2 + 8H+ + 8e− + 16ATP → 2NH3 + H2 + 16ADP + 16Pi

Figure 10.14 Nitrogen Reduction. A hypothetical sequence of nitrogen reduction by nitrogenase. Sequence shown: Enzyme + N2 → Enzyme•N≡N → (2e−, 2H+) Enzyme•HN=NH → (2e−, 2H+) Enzyme•H2N-NH2 → (2e−, 2H+) Enzyme + 2NH3.

The electrons come from ferredoxin that has been reduced in a variety of ways: by photosynthesis in cyanobacteria, respiratory processes in aerobic nitrogen fixers, or fermentations in anaerobic bacteria. For example, Clostridium pasteurianum (an anaerobic bacterium) reduces ferredoxin during pyruvate oxidation, whereas the aerobic Azotobacter uses electrons from NADPH to reduce ferredoxin.

Nitrogenase is a complex system consisting of two major protein components, a MoFe protein (MW 220,000) joined with one or two Fe proteins (MW 64,000). The MoFe protein contains 2 atoms of molybdenum and 28 to 32 atoms of iron; the Fe protein has 4 iron atoms (figure 10.15). Fe protein is first reduced by ferredoxin, then it binds ATP (figure 10.16). ATP binding changes the conformation of the Fe protein and lowers its reduction potential, enabling it to reduce the MoFe protein. ATP is hydrolyzed when this electron transfer occurs. Finally, reduced MoFe protein donates electrons to molecular nitrogen. Nitrogenase is quite sensitive to O2 and must be protected from O2 inactivation within the cell. In many cyanobacteria, this protection against oxygen is provided by a special structure called the heterocyst (see p. 473).

Figure 10.15 Structure of the Nitrogenase Fe Protein. The Fe protein’s two subunits are arranged like a pair of butterfly wings with the iron sulfur cluster between the wings and at the “head” of the butterfly. The iron sulfur cluster is very exposed, which helps account for nitrogenase’s sensitivity to oxygen. The oxygen can readily attack the exposed irons.

The reduction of N2 to NH3 occurs in three steps, each of which requires an electron pair (figures 10.14 and 10.16). Six electron transfers take place, and this requires a total of 12 ATPs per N2 reduced. The overall process actually requires at least 8 electrons and 16 ATPs because nitrogenase also reduces protons to H2. The H2 reacts with diimine (HN=NH) to form N2 and H2. This futile cycle produces some N2 even under favorable conditions and makes nitrogen fixation even more expensive. Symbiotic nitrogen-fixing bacteria can consume almost 20% of the ATP produced by the host plant.

Figure 10.16 Mechanism of Nitrogenase Action. The flow of two electrons from ferredoxin to nitrogen is outlined. This process is repeated three times in order to reduce N2 to two molecules of ammonia. The stoichiometry at the bottom includes proton reduction to H2. See the text for a more detailed explanation. Flow shown: reduced ferredoxin → (2e−) Fe protein; reduced Fe protein•4MgATP → (2e−) MoFe protein, leaving oxidized Fe protein•4MgADP + Pi; reduced MoFe protein → (2e−) nitrogen. Stoichiometry shown: N2 + 8H+ → 2NH3 + H2.

Nitrogenase can reduce a variety of molecules containing triple bonds (e.g., acetylene, cyanide, and azide).

HC≡CH + 2H+ + 2e− → H2C=CH2

The rate of reduction of acetylene to ethylene is even used to estimate nitrogenase activity.

Once molecular nitrogen has been reduced to ammonia, the ammonia can be incorporated into organic compounds. In the symbiotic nitrogen fixer Rhizobium, it appears that ammonia diffuses out of the bacterial cell and is assimilated in the surrounding legume cell. The primary route of ammonia assimilation seems to be the synthesis of glutamine by the glutamine synthetase–glutamate synthase system (figure 10.11). However, substances such as the purine derivatives allantoin and allantoic acid also are synthesized and used for the transport of nitrogen to other parts of the plant.

10.5 The Synthesis of Amino Acids

Microorganisms vary with respect to the type of nitrogen source they employ, but most can assimilate some form of inorganic nitrogen by the routes just described.
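As a rough check on the nitrate reduction and nitrogen fixation routes described in section 10.4, the electron and ATP bookkeeping can be sketched in a few lines of Python. The numbers used (two electrons per reduction step, 4 ATPs per electron pair, one extra electron pair diverted to H2) are taken from the text; the function and variable names are illustrative assumptions and do not come from the chapter.

```python
# Illustrative bookkeeping only; all names here are hypothetical.

ATP_PER_ELECTRON_PAIR = 4  # nitrogenase: 4 ATPs per pair of electrons (from the text)

# Two-electron steps of assimilatory nitrate reduction (figure 10.13).
NITRATE_STEPS = [
    ("NO3-", "NO2-"),     # nitrate reductase
    ("NO2-", "[NOH]"),    # nitrite reductase, nitroxyl intermediate
    ("[NOH]", "NH2OH"),   # hydroxylamine intermediate
    ("NH2OH", "NH3"),     # ammonia
]


def nitrate_to_ammonia_electrons():
    """Electrons needed to reduce one nitrate to ammonia, two per step."""
    return 2 * len(NITRATE_STEPS)


def nitrogenase_cost(include_h2=True):
    """Return (electrons, ATP) for reducing one N2 to two NH3.

    Three electron pairs reduce N2 itself; when include_h2 is True,
    one more pair is counted for the reduction of protons to H2.
    """
    pairs = 3
    if include_h2:
        pairs += 1
    return 2 * pairs, ATP_PER_ELECTRON_PAIR * pairs


if __name__ == "__main__":
    print("nitrate -> ammonia electrons:", nitrate_to_ammonia_electrons())
    print("N2 fixation (electrons, ATP):", nitrogenase_cost())
    print("without H2 formation:", nitrogenase_cost(include_h2=False))
```

Run as a script, this reproduces the figures quoted in the text: 8 electrons for nitrate reduction, 6 electrons and 12 ATPs for N2 reduction alone, and 8 electrons and 16 ATPs once proton reduction to H2 is included.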
Amino acid synthesis also requires construction of the proper carbon skeletons, and this is often a complex process involving many steps. Because of the need to conserve nitrogen, carbon, and energy, amino acid synthetic pathways are usually tightly regulated by allosteric and feedback mechanisms (see section 8.9). Although individual amino acid biosynthetic pathways are not described in detail, a survey of the general pattern of amino acid biosynthesis is worthwhile. Further details of amino acid biosynthesis may be found in introductory biochemistry textbooks.

The relationship of amino acid biosynthetic pathways to amphibolic routes is shown in figure 10.17. Amino acid skeletons are derived from acetyl-CoA and from intermediates of the TCA cycle, glycolysis, and the pentose phosphate pathway. To maximize efficiency and economy, the precursors for amino acid biosynthesis are provided by a few major amphibolic pathways. Sequences leading to individual amino acids branch off from these central routes. Alanine, aspartate, and glutamate are made by transamination directly from pyruvate, oxaloacetate, and α-ketoglutarate, respectively. Most biosynthetic pathways are more complex, and common intermediates often are used in the synthesis of families of related amino acids for the sake of further economy. For example, the amino acids lysine, threonine, isoleucine, and methionine are synthesized from oxaloacetate by such a branching anabolic route (figure 10.18). The biosynthetic pathways for the aromatic amino acids phenylalanine, tyrosine, and tryptophan also share many intermediates (figure 10.19).

Figure 10.17 The Organization of Anabolism. Biosynthetic products (in blue) are derived from intermediates of amphibolic pathways. Two major anaplerotic CO2 fixation reactions are shown in red.

Figure 10.18 A Branching Pathway of Amino Acid Synthesis. The pathways to methionine, threonine, isoleucine, and lysine. Although some arrows represent one step, most interconversions require the participation of several enzymes. Intermediates shown: oxaloacetate, aspartate, aspartate β-semialdehyde, homoserine.

Figure 10.19 Aromatic Amino Acid Synthesis. The synthesis of the aromatic amino acids phenylalanine, tyrosine, and tryptophan. Most arrows represent more than one enzyme reaction. Intermediates shown: phosphoenolpyruvate + erythrose-4-P, shikimate, chorismate, prephenate, anthranilate.

1. How do microorganisms assimilate sulfur and phosphorus?
2. Describe the roles of glutamate dehydrogenase, glutamine synthetase, glutamate synthase, and transaminases in ammonia assimilation. How is nitrate incorporated by assimilatory nitrate reduction?
3. What is nitrogen fixation? Briefly describe the structure and mechanism of action of nitrogenase.
4. Summarize in general terms the organization of amino acid biosynthesis.

10.6 Anaplerotic Reactions

Inspection of figure 10.17 will show that TCA cycle intermediates are used in the synthesis of pyrimidines and a wide variety of amino acids. In fact, the biosynthetic functions of this pathway are so essential that most of it must operate anaerobically to supply biosynthetic precursors, even though NADH is not required for electron transport and oxidative phosphorylation in the absence of oxygen. Thus there is a heavy demand upon the TCA cycle to supply carbon for biosynthesis, and cycle intermediates could be depleted if nothing were done to maintain their levels. However, microorganisms have reactions that replenish cycle intermediates so that the TCA cycle can continue to function when active biosynthesis is taking place. Reactions that replace cycle intermediates are called anaplerotic reactions [Greek anaplerotic, filling up].

Most microorganisms can replace TCA cycle intermediates by CO2 fixation, in which inorganic CO2 is converted to organic carbon and assimilated. It should be emphasized that anaplerotic reactions do not serve the same function as the CO2 fixation pathway that supplies the carbon required by autotrophs. In autotrophs CO2 fixation provides most or all of the carbon required for growth. Anaplerotic CO2 fixation reactions simply replace TCA cycle intermediates and maintain metabolic balance. Usually CO2 is added to an acceptor molecule, either pyruvate or phosphoenolpyruvate, to form the cycle intermediate oxaloacetate (figure 10.17). Some microorganisms (e.g., Arthrobacter globiformis, yeasts) use pyruvate carboxylase in this role.

Pyruvate + CO2 + ATP + H2O → oxaloacetate + ADP + Pi (cofactor: biotin)

This enzyme requires the cofactor biotin and uses ATP energy to join CO2 and pyruvate. Biotin is often the cofactor for enzymes catalyzing carboxylation reactions. Because of its importance, biotin is a required growth factor for many species. Other microorganisms, such as the bacteria Escherichia coli and Salmonella typhimurium, have the enzyme phosphoenolpyruvate carboxylase, which catalyzes the following reaction.

Phosphoenolpyruvate + CO2 → oxaloacetate + Pi

Some bacteria, algae, fungi, and protozoa can grow with acetate as the sole carbon source by using it to synthesize TCA cycle intermediates in the glyoxylate cycle (figure 10.20). This cycle is made possible by two unique enzymes, isocitrate lyase and malate synthase, that catalyze the following reactions.

Isocitrate → succinate + glyoxylate (isocitrate lyase)
Glyoxylate + acetyl-CoA → malate + CoA (malate synthase)

The glyoxylate cycle is actually a modified TCA cycle. The two decarboxylations of the latter pathway (the isocitrate dehydrogenase and α-ketoglutarate dehydrogenase steps) are bypassed, making possible the conversion of acetyl-CoA to form oxaloacetate without loss of acetyl-CoA carbon as CO2. In this fashion acetate and any molecules that give rise to it can contribute carbon to the cycle and support microbial growth. The TCA cycle (pp. 183–84)

Figure 10.20 The Glyoxylate Cycle. The reactions and enzymes unique to the cycle are shown in color. The tricarboxylic acid cycle enzymes that have been bypassed are at the bottom. Overall equation: 2 Acetyl-CoA + FAD + 2NAD+ + 3H2O → oxaloacetate + 2CoA + FADH2 + 2NADH + 2H+

1. Define an anaplerotic reaction and give an example.
2. How does the glyoxylate cycle convert acetyl-CoA to oxaloacetate, and what special enzymes are used?

10.7 The Synthesis of Purines, Pyrimidines, and Nucleotides

Purine and pyrimidine biosynthesis is critical for all cells because these molecules are used in the synthesis of ATP, several cofactors, ribonucleic acid (RNA), deoxyribonucleic acid (DNA), and other important cell components. Nearly all microorganisms can synthesize their own purines and pyrimidines as these are so crucial to cell function. DNA and RNA synthesis (pp. 235–39, 261–64)

Purines and pyrimidines are cyclic nitrogenous bases with several double bonds and pronounced aromatic properties. Purines consist of two joined rings, whereas pyrimidines have only one (figure 10.21 and figure 10.23). The purines adenine and guanine and the pyrimidines uracil, cytosine, and thymine are commonly found in microorganisms. A purine or pyrimidine base joined with a pentose sugar, either ribose or deoxyribose, is a nucleoside. A nucleotide is a nucleoside with one or more phosphate groups attached to the sugar.

Purine Biosynthesis

The biosynthetic pathway for purines is a complex, 11-step sequence (see appendix II) in which seven different molecules contribute parts to the final purine skeleton (figure 10.21). Because the pathway begins with ribose 5-phosphate and the purine skeleton is constructed on this sugar, the first purine product of the pathway is the nucleotide inosinic acid, not a free purine base. The cofactor folic acid is very important in purine biosynthesis. Folic acid derivatives contribute carbons two and eight to the purine skeleton.
In fact, the drug sulfonamide inhibits bacterial growth by blocking folic acid synthesis. This interferes with purine biosynthesis and other processes that require folic acid.

Figure 10.21 Purine Biosynthesis. The sources of purine skeleton nitrogen and carbon are indicated. The contribution of glycine is shaded. Sources labeled: CO2, glycine, formate groups from folic acid, the amino nitrogen of aspartate, and the amide nitrogen of glutamine.

Once inosinic acid has been formed, relatively short pathways synthesize adenosine monophosphate and guanosine monophosphate (figure 10.22) and produce nucleoside diphosphates and triphosphates by phosphate transfers from ATP. DNA contains deoxyribonucleotides (the ribose lacks a hydroxyl group on carbon two) instead of the ribonucleotides found in RNA. Deoxyribonucleotides arise from the reduction of nucleoside diphosphates or nucleoside triphosphates by two different routes. Some microorganisms reduce the triphosphates with a system requiring vitamin B12 as a cofactor. Others, such as E. coli, reduce the ribose in nucleoside diphosphates. Both systems employ a small sulfur-containing protein called thioredoxin as their reducing agent.

Figure 10.22 Synthesis of Adenosine Monophosphate and Guanosine Monophosphate. The shaded groups are the ones differing from those in inosinic acid. Intermediates shown: inosinic acid → adenylosuccinate → adenosine monophosphate; inosinic acid → xanthylic acid → guanosine monophosphate.

Pyrimidine Biosynthesis

Pyrimidine biosynthesis begins with aspartic acid and carbamoyl phosphate, a high-energy molecule synthesized from CO2 and ammonia (figure 10.23).
Aspartate carbamoyltransferase catalyzes the condensation of these two substrates to form carbamoylaspartate, which is then converted to the initial pyrimidine product, orotic acid. The regulation of aspartate carbamoyltransferase activity (pp. 166–67)

After synthesis of the pyrimidine skeleton, a nucleotide is produced by the addition of ribose 5-phosphate using the high-energy intermediate 5-phosphoribosyl 1-pyrophosphate. Thus construction of the pyrimidine ring is completed before ribose is added, in contrast with purine ring synthesis, which begins with ribose 5-phosphate. Decarboxylation of orotidine monophosphate yields uridine monophosphate and eventually uridine triphosphate and cytidine triphosphate.

The third common pyrimidine is thymine, a constituent of DNA. The ribose in pyrimidine nucleotides is reduced in the same way as it is in purine nucleotides. Then deoxyuridine monophosphate is methylated with a folic acid derivative to form deoxythymidine monophosphate (figure 10.24).

Figure 10.23 Pyrimidine Synthesis. PRPP stands for 5-phosphoribose 1-pyrophosphoric acid, which provides the ribose 5-phosphate chain. Sequence shown: HCO3− + glutamine + 2ATP + H2O → carbamoyl phosphate; carbamoyl phosphate + aspartic acid → carbamoylaspartate (Pi released) → dihydroorotic acid → orotic acid → (PRPP in, PPi out) orotidine 5′-monophosphate → (CO2 released) uridine 5′-monophosphate (UMP) → UDP → uridine triphosphate → (glutamine or NH3) cytidine triphosphate.

Figure 10.24 Deoxythymidine Monophosphate Synthesis. Deoxythymidine differs from deoxyuridine in having the shaded methyl group. Reaction shown: deoxyuridine monophosphate + N5,N10-methylenetetrahydrofolic acid → deoxythymidine monophosphate.

1. Define purine, pyrimidine, nucleoside, and nucleotide.
2. Outline the way in which purines and pyrimidines are synthesized. How is the deoxyribose component of deoxyribonucleotides made?

10.8 Lipid Synthesis

A variety of lipids are found in microorganisms, particularly in cell membranes. Most contain fatty acids or their derivatives. Fatty acids are monocarboxylic acids with long alkyl chains that usually have an even number of carbons (the average length is 18 carbons). Some may be unsaturated, that is, have one or more double bonds. Most microbial fatty acids are straight chained, but some are branched. Gram-negative bacteria often have cyclopropane fatty acids (fatty acids with one or more cyclopropane rings in their chains). Lipid structure and nomenclature (appendix I)

Fatty acid synthesis is catalyzed by the fatty acid synthetase complex with acetyl-CoA and malonyl-CoA as the substrates and NADPH as the reductant. Malonyl-CoA arises from the ATP-driven carboxylation of acetyl-CoA (figure 10.25). Synthesis takes place after acetate and malonate have been transferred from coenzyme A to the sulfhydryl group of the acyl carrier protein (ACP), a small protein that carries the growing fatty acid chain during synthesis. The synthetase adds two carbons at a time to the carboxyl end of the growing fatty acid chain in a two-stage process (figure 10.25). First, malonyl-ACP reacts with the fatty acyl-ACP to yield CO2 and a fatty acyl-ACP two carbons longer. The loss of CO2 drives this reaction to completion. Notice that ATP is used to add CO2 to acetyl-CoA, forming malonyl-CoA.
The same CO2 is lost when malonyl-ACP donates carbons to the chain. Thus carbon dioxide is essential to fatty acid synthesis but it is not permanently incorporated. Indeed, some microorganisms require CO2 for good growth, but they can do without it in the presence of a fatty acid like oleic acid (an 18-carbon unsaturated fatty acid). In the second stage of synthesis, the β-keto group arising from the initial condensation reaction is removed in a three-step process involving two reductions and a dehydration. The fatty acid is then ready for the addition of two more carbon atoms.

Figure 10.25 Fatty Acid Synthesis. The cycle is repeated until the proper chain length has been reached. Carbon dioxide carbon and the remainder of malonyl-CoA are shown in different colors. ACP stands for acyl carrier protein.

Unsaturated fatty acids are synthesized in two ways. Eucaryotes and aerobic bacteria like Bacillus megaterium employ an aerobic pathway using both NADPH and O2.

R-(CH2)9-C(=O)-SCoA + NADPH + H+ + O2 → R-CH=CH-(CH2)7-C(=O)-SCoA + NADP+ + 2H2O

A double bond is formed between carbons nine and ten, and O2 is reduced to water with electrons supplied by both the fatty acid and NADPH. Anaerobic bacteria and some aerobes create double bonds during fatty acid synthesis by dehydrating hydroxy fatty acids. Oxygen is not required for double bond synthesis by this pathway. The anaerobic pathway is present in a number of common gram-negative bacteria (e.g., Escherichia coli and Salmonella typhimurium), gram-positive bacteria (e.g., Lactobacillus plantarum and Clostridium pasteurianum), and cyanobacteria.

Eucaryotic microorganisms frequently store carbon and energy as triacylglycerol, glycerol esterified to three fatty acids.
Glycerol arises from the reduction of the glycolytic intermediate dihydroxyacetone phosphate to glycerol 3-phosphate, which is then esterified with two fatty acids to give phosphatidic acid (figure 10.26). Phosphate is hydrolyzed from phosphatidic acid giving a diacylglycerol, and the third fatty acid is attached to yield a triacylglycerol.

Phospholipids are major components of eucaryotic and most procaryotic cell membranes. Their synthesis also usually proceeds by way of phosphatidic acid. A special cytidine diphosphate (CDP) carrier plays a role similar to that of uridine and adenosine diphosphate carriers in carbohydrate biosynthesis. For example, bacteria synthesize phosphatidylethanolamine, a major cell membrane component, through the initial formation of CDP-diacylglycerol (figure 10.26). This CDP derivative then reacts with serine to form the phospholipid phosphatidylserine, and decarboxylation yields phosphatidylethanolamine. In this way a complex membrane lipid is constructed from the products of glycolysis, fatty acid biosynthesis, and amino acid biosynthesis.

Figure 10.26 Triacylglycerol and Phospholipid Synthesis. Sequence shown: dihydroxyacetone phosphate → (NADH + H+ → NAD+) glycerol 3-phosphate → (two fatty acyl-CoA) phosphatidic acid. Phosphatidic acid → (H2O in, Pi out) diacylglycerol → (R3COOH) triacylglycerol. Phosphatidic acid → (CTP) CDP-diacylglycerol → (serine in, CMP out) phosphatidylserine → (CO2 released) phosphatidylethanolamine.

1. What is a fatty acid? Describe in general terms how the fatty acid synthetase manufactures a fatty acid.
2. How are unsaturated fatty acids made?
3. Briefly describe the pathways for triacylglycerol and phospholipid synthesis. Of what importance are phosphatidic acid and CDP-diacylglycerol?

10.9 Peptidoglycan Synthesis

As discussed earlier, most bacterial cell walls contain a large, complex peptidoglycan molecule consisting of long polysaccharide chains made of alternating N-acetylmuramic acid (NAM) and N-acetylglucosamine (NAG) residues. Pentapeptide chains are attached to the NAM groups. The polysaccharide chains are connected through their pentapeptides or by interbridges (see figures 3.18 and 3.19). Peptidoglycan structure and function (p. 56)

Not surprisingly such an intricate structure requires an equally intricate biosynthetic process, especially because the synthetic reactions occur both inside and outside the cell membrane. Peptidoglycan synthesis is a multistep process that has been best studied in the gram-positive bacterium Staphylococcus aureus. Two carriers participate: uridine diphosphate (UDP) and bactoprenol (figure 10.27). Bactoprenol is a 55-carbon alcohol that attaches to NAM by a pyrophosphate group and moves peptidoglycan components through the hydrophobic membrane.

Figure 10.27 Bactoprenol Pyrophosphate. Bactoprenol pyrophosphate connected to N-acetylmuramic acid (NAM).

Figure 10.28 Peptidoglycan Synthesis. NAM is N-acetylmuramic acid and NAG is N-acetylglucosamine. The pentapeptide contains L-lysine in S. aureus peptidoglycan, and diaminopimelic acid (DAP) in E. coli. Inhibition by bacitracin, cycloserine, and vancomycin also is shown. The numbers correspond to six of the eight stages discussed in the text. Stage eight is depicted in figure 10.29.

The synthesis of peptidoglycan, outlined in figures 10.28 and 10.29, occurs in eight stages.

1. UDP derivatives of N-acetylmuramic acid and N-acetylglucosamine are synthesized in the cytoplasm.
2. Amino acids are sequentially added to UDP-NAM to form the pentapeptide chain (the two terminal D-alanines are added as a dipeptide). ATP energy is used to make the peptide bonds, but tRNA and ribosomes are not involved.
3. The NAM-pentapeptide is transferred from UDP to a bactoprenol phosphate at the membrane surface.
4. UDP-NAG adds NAG to the NAM-pentapeptide to form the peptidoglycan repeat unit. If a pentaglycine interbridge is required, the glycines are added using special glycyl-tRNA molecules, not ribosomes.
5. The completed NAM-NAG peptidoglycan repeat unit is transported across the membrane to its outer surface by the bactoprenol pyrophosphate carrier.
6. The peptidoglycan unit is attached to the growing end of a peptidoglycan chain to lengthen it by one repeat unit.
7. The bactoprenol carrier returns to the inside of the membrane. A phosphate is released during this process to give bactoprenol phosphate, which can now accept another NAM-pentapeptide.
8. Finally, peptide cross-links between the peptidoglycan chains are formed by transpeptidation (figure 10.29). In E. coli the free amino group of diaminopimelic acid attacks the subterminal D-alanine, releasing the terminal D-alanine residue. ATP is used to form the terminal peptide bond inside the membrane. No more ATP energy is required when transpeptidation takes place on the outside.
The same process occurs when an interbridge is involved; only the group reacting with the subterminal D-alanine differs.

Peptidoglycan synthesis is particularly vulnerable to disruption by antimicrobial agents. Inhibition of any stage of synthesis weakens the cell wall and can lead to osmotic lysis. Many antibiotics interfere with peptidoglycan synthesis. For example, penicillin inhibits the transpeptidation reaction (figure 10.29), and bacitracin blocks the dephosphorylation of bactoprenol pyrophosphate (figure 10.28). Antibiotic effects on cell wall synthesis (pp. 813–15, 817)

Figure 10.29 Transpeptidation. The transpeptidation reactions in the formation of the peptidoglycans of Escherichia coli and Staphylococcus aureus. Panels: E. coli transpeptidation (cross-link through DAP); S. aureus transpeptidation (cross-link through a (Gly)5 interbridge on L-Lys). Inhibition by penicillins is indicated.

10.10 Patterns of Cell Wall Formation

To grow and divide efficiently, a bacterial cell must add new peptidoglycan to its cell wall in a precise and well-regulated way while maintaining wall shape and integrity in the presence of high osmotic pressure. Because the cell wall peptidoglycan is essentially a single enormous network, the growing bacterium must be able to degrade it just enough to provide acceptor ends for the incorporation of new peptidoglycan units. It must also reorganize peptidoglycan structure when necessary. This limited peptidoglycan digestion is accomplished by enzymes known as autolysins, some of which attack the polysaccharide chains, while others hydrolyze the peptide cross-links. Autolysin inhibitors keep the activity of these enzymes under tight control.

Although the location and distribution of cell wall synthetic activity vary with species, there seem to be two general patterns (figure 10.30). Many gram-positive cocci (e.g., Enterococcus faecalis and Streptococcus pyogenes) have only one to a few zones of growth. The principal growth zone is usually at the site of septum formation, and new cell halves are synthesized back-to-back. The second pattern of synthesis occurs in the rod-shaped bacteria Escherichia coli, Salmonella, and Bacillus. Active peptidoglycan synthesis occurs at the site of septum formation just as before, but growth sites also are scattered along the cylindrical portion of the rod. Thus growth is distributed more diffusely in rod-shaped bacteria than in the streptococci. Synthesis must lengthen rod-shaped cells as well as divide them. Presumably this accounts for the differences in wall growth pattern. Control of cell division (pp. 285–86)

Figure 10.30 Wall Synthesis Patterns. Patterns of new cell wall synthesis in growing and dividing bacteria. (a) Streptococci and some other gram-positive cocci. (b) Synthesis in rod-shaped bacteria (Escherichia coli, Salmonella, Bacillus). The zones of growth are in blue-green. The actual situation is more complex than indicated because cells can begin to divide again before the first division is completed.

1. Outline in a diagram the steps involved in the synthesis of peptidoglycan and show their relationship to the plasma membrane. What are the roles of bactoprenol and UDP?
2. What is the function of autolysins in cell wall peptidoglycan synthesis? Describe the patterns of peptidoglycan synthesis seen in gram-positive cocci and in rod-shaped bacteria such as E. coli.
Summary

1. In biosynthesis or anabolism, cells use energy to construct complex molecules from smaller, simpler precursors.
2. Many important cell constituents are macromolecules, large polymers constructed of simple monomers.
3. Although many catabolic and anabolic pathways share enzymes for the sake of efficiency, some of their enzymes are separate and independently regulated.
4. Macromolecular components often undergo self-assembly to form the final molecule or complex.
5. Photosynthetic CO2 fixation is carried out by the Calvin cycle and may be divided into three phases: the carboxylation phase, the reduction phase, and the regeneration phase (figure 10.4). Three ATPs and two NADPHs are used during the incorporation of one CO2.
6. Gluconeogenesis is the synthesis of glucose and related sugars from nonglucose precursors.
7. Glucose, fructose, and mannose are gluconeogenic intermediates or made directly from them; galactose is synthesized with nucleoside diphosphate derivatives. Bacteria and algae synthesize glycogen and starch from adenosine diphosphate glucose.
8. Phosphorus is obtained from inorganic or organic phosphate.
9. Microorganisms can use cysteine, methionine, and inorganic sulfate as sulfur sources. Sulfate is reduced to sulfide during assimilatory sulfate reduction.
10. Ammonia nitrogen can be directly assimilated by the activity of transaminases and either glutamate dehydrogenase or the glutamine synthetase–glutamate synthase system (figures 10.10–10.12).
11. Nitrate is incorporated through assimilatory nitrate reduction catalyzed by the enzymes nitrate reductase and nitrite reductase.
12. Nitrogen fixation is catalyzed by the nitrogenase complex. Atmospheric molecular nitrogen is reduced to ammonia, which is then incorporated into amino acids (figures 10.14 and 10.16).
13. Amino acid biosynthetic pathways branch off from the central amphibolic pathways (figure 10.17).
14. Anaplerotic reactions replace TCA cycle intermediates to keep the cycle in balance while it supplies biosynthetic precursors. Many anaplerotic enzymes catalyze CO2 fixation reactions. The glyoxylate cycle is also anaplerotic.
15. Purines and pyrimidines are nitrogenous bases found in DNA, RNA, and other molecules. The purine skeleton is synthesized beginning with ribose 5-phosphate and initially produces inosinic acid. Pyrimidine biosynthesis starts with carbamoyl phosphate and aspartate, and ribose is added after the skeleton has been constructed.
16. Fatty acids are synthesized from acetyl-CoA, malonyl-CoA, and NADPH by the fatty acid synthetase system. During synthesis the intermediates are attached to the acyl carrier protein. Double bonds can be added in two different ways.
17. Triacylglycerols are made from fatty acids and glycerol phosphate. Phosphatidic acid is an important intermediate in this pathway.
18. Phospholipids like phosphatidylethanolamine can be synthesized from phosphatidic acid by forming CDP-diacylglycerol, then adding an amino acid.
19. Peptidoglycan synthesis is a complex process involving both UDP derivatives and the lipid carrier bactoprenol, which transports NAM-NAG-pentapeptide units across the cell membrane. Cross-links are formed by transpeptidation (figures 10.28 and 10.29).
20. Peptidoglycan synthesis occurs in discrete zones in the cell wall. Existing peptidoglycan is selectively degraded by autolysins so new material can be added.
Key Terms

acyl carrier protein (ACP) 220
adenine 217
anaplerotic reactions 216
assimilatory nitrate reduction 211
assimilatory sulfate reduction 210
autolysins 223
bactoprenol 221
Calvin cycle 207
carboxysomes 207
CO2 fixation 216
cytosine 217
dissimilatory sulfate reduction 210
fatty acid 218
fatty acid synthetase 218
gluconeogenesis 209
glutamate dehydrogenase 211
glutamate synthase 211
glutamine synthetase 211
glyoxylate cycle 216
guanine 217
macromolecule 205
monomers 205
nitrate reductase 212
nitrite reductase 212
nitrogenase 213
nitrogen fixation 212
nucleoside 217
nucleotide 217
phosphatase 210
phosphatidic acid 220
phosphoadenosine 5′-phosphosulfate 210
purine 216
pyrimidine 216
ribulose-1,5-bisphosphate carboxylase 208
self-assembly 207
thymine 217
transaminases 221
transpeptidation 223
triacylglycerol 220
turnover 205
uracil 217
uridine diphosphate glucose (UDPG) 209

Questions for Thought and Review

1. Discuss the relationship between catabolism and anabolism. How does anabolism depend on catabolism?
2. Suppose that a microorganism was growing on a medium that contained amino acids but no sugars. In general terms how would it synthesize the pentoses and hexoses it required?
3. Activated carriers participate in carbohydrate, lipid, and peptidoglycan synthesis. Briefly describe these carriers and their roles.
4. Which two enzymes discussed in the chapter appear to be specific to the Calvin cycle?
5. Why can phosphorus be directly incorporated into cell constituents whereas sulfur and nitrogen often cannot?
6. What is unusual about the synthesis of peptides that takes place during peptidoglycan construction?

Critical Thinking Questions

1.
In metabolism important intermediates are covalently attached to carriers, as if to mark these as important so the cell does not lose track of them. Think about a hotel placing your room key on a very large ring. List a few examples of these carriers and indicate whether they are involved primarily in anabolism or catabolism.
2. Intermediary carriers are in a limited supply; when they cannot be recycled because of a metabolic block, serious consequences ensue. Think of some examples of these consequences.

PART IV Microbial Molecular Biology and Genetics

Chapter 11 Genes: Structure, Replication, and Mutation
Chapter 12 Genes: Expression and Regulation
Chapter 13 Microbial Recombination and Plasmids

CHAPTER 11 Genes: Structure, Replication, and Mutation

This model illustrates double-stranded DNA. DNA is the genetic material for procaryotes and eucaryotes. Genetic information is contained in the sequence of base pairs that lie in the center of the helix.
Outline

11.1 DNA as Genetic Material 228
11.2 Nucleic Acid Structure 230
    DNA Structure 231
    RNA Structure 233
    The Organization of DNA in Cells 234
11.3 DNA Replication 235
    Patterns of DNA Synthesis 235
    Mechanism of DNA Replication 236
11.4 The Genetic Code 240
    Establishment of the Genetic Code 240
    Organization of the Code 240
11.5 Gene Structure 241
    Genes That Code for Proteins 242
    Genes That Code for tRNA and rRNA 244
11.6 Mutations and Their Chemical Basis 244
    Mutations and Mutagenesis 244
    Spontaneous Mutations 246
    Induced Mutations 246
    The Expression of Mutations 248
11.7 Detection and Isolation of Mutants 251
    Mutant Detection 251
    Mutant Selection 252
    Carcinogenicity Testing 253
11.8 DNA Repair 254
    Excision Repair 254
    Removal of Lesions 254
    Postreplication Repair 254
    Recombination Repair 255

Concepts

1. The two kinds of nucleic acid, deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), differ from one another in chemical composition and structure. In procaryotic and eucaryotic cells, DNA serves as the repository for genetic information.
2. DNA is associated with basic proteins in the cell. In eucaryotes these are special histone proteins, whereas in procaryotes nonhistone proteins are complexed with DNA.
3. The flow of genetic information usually proceeds from DNA through RNA to protein. A protein’s amino acid sequence reflects the nucleotide sequence of its mRNA. This messenger is a complementary copy of a portion of the DNA genome.
4. DNA replication is a very complex process involving a variety of proteins and a number of steps. It is designed to operate rapidly while minimizing errors and correcting those that arise when the DNA sequence is copied.
5. Genetic information is contained in the nucleotide sequence of DNA (and sometimes RNA). When a structural gene directs the synthesis of a polypeptide, each amino acid is specified by a triplet codon.
6. A gene is a nucleotide sequence that codes for a polypeptide, tRNA, or rRNA.
7. Most bacterial genes have at least four major parts, each with different functions: promoters, leaders, coding regions, and trailers.
8. Mutations are stable, heritable alterations in the gene sequence and usually, but not always, produce phenotypic changes. Nucleic acids are altered in several different ways, and these mutations may be either spontaneous or induced by chemical mutagens or radiation.
9. It is extremely important to keep the nucleotide sequence constant, and microorganisms have several repair mechanisms designed to detect alterations in the genetic material and restore it to its original state. Often more than one repair system can correct a particular type of mutation. Despite these efforts some alterations remain uncorrected and provide material and opportunity for evolutionary change.

But the most important qualification of bacteria for genetic studies is their extremely rapid rate of growth. . . . a single E. coli cell will grow overnight into a visible colony containing millions of cells, even under relatively poor growth conditions. Thus, genetic experiments on E. coli usually last one day, whereas experiments on corn, for example, take months. It is no wonder that we know so much more about the genetics of E. coli than about the genetics of corn, even though we have been studying corn much longer.
(R. F. Weaver and P. W. Hedrick)

The preceding chapters have introduced the essentials of microbial metabolism. We now turn to microbial genetics and molecular biology. This chapter reviews some of the most basic concepts of molecular genetics: how genetic information is stored and organized in the DNA molecule, the way in which DNA is replicated, the nature of the genetic code, gene structure, mutagenesis, and DNA repair. In addition, the use of microorganisms to identify potentially dangerous mutagenic agents in the fight against cancer is described. Much of this information will be familiar to those who have taken an introductory genetics course. Because of the importance of procaryotes, primary emphasis is placed on their genetics. Based on the foundation provided by this chapter, chapter 12 will focus on gene expression and its regulation. Chapter 13 contains information on plasmids and the nature of genetic recombination in microorganisms. These three chapters provide the background needed for understanding the material in Part Five: recombinant DNA technology (chapter 14) and microbial genomics (chapter 15).

Geneticists, including microbial geneticists, use a specialized vocabulary because of the complexities of their discipline. Some knowledge of basic terminology is necessary at the beginning of this survey of general principles. The experimental material of the microbial geneticist is the clone. A clone is a population of cells that are derived asexually from a parental cell and are genetically identical. Sometimes a clone is called a pure culture. The term genome refers to all the genes present in a cell or virus. Procaryotes normally have one set of genes. That is, they are haploid (1N). Eucaryotic microorganisms usually have two sets of genes, or are diploid (2N). The genotype of an organism is the specific set of genes it possesses. In contrast, the phenotype is the collection of characteristics that are observable by the investigator. Not all genes are expressed at the same time, and the environment profoundly influences phenotypic expression. Much genetics research has focused on the relationship between an organism’s genotype and phenotype, and gene expression will be the focus of chapter 12.

Although genetic analysis began with the rediscovery of the work of Gregor Mendel in the early part of the twentieth century, subsequent elegant experimentation involving both bacteria and bacteriophages actually elucidated the nature of genetic information, gene structure, the genetic code, and mutations. We will first review a few of these early experiments and then summarize the view of DNA, RNA, and protein relationships (sometimes called the Central Dogma) that has guided much of modern research.

11.1 DNA as Genetic Material

The early work of Fred Griffith in 1928 on the transfer of virulence in the pathogen Streptococcus pneumoniae (figure 11.1) set the stage for the research that first showed that DNA was the genetic material. Griffith found that if he boiled virulent bacteria and injected them into mice, the mice were not affected and no pneumococci could be recovered from the animals. When he injected a combination of killed virulent bacteria and a living nonvirulent strain, the mice died; moreover, he could recover living virulent bacteria from the dead mice. Griffith called this change of nonvirulent bacteria into virulent pathogens transformation.

Oswald T. Avery and his colleagues then set out to discover which constituent in the heat-killed virulent pneumococci was responsible for Griffith’s transformation. These investigators selectively destroyed constituents in purified extracts of virulent pneumococci, using enzymes that would hydrolyze DNA, RNA, or protein. They then exposed nonvirulent pneumococcal strains to the treated extracts. Transformation of the nonvirulent bacteria was blocked only if the DNA was destroyed, suggesting that DNA was carrying the information required for transformation (figure 11.2). The publication of these studies by O. T. Avery, C. M. MacLeod, and M. J. McCarty in 1944 provided the first evidence that Griffith’s transforming principle was DNA and therefore that DNA carried genetic information.
Figure 11.1 Griffith’s Transformation Experiments. (a) Mice died of pneumonia when injected with pathogenic strains of S pneumococci, which have a capsule and form smooth-looking colonies. (b) Mice survived when injected with a nonpathogenic strain of R pneumococci, which lacks a capsule and forms rough colonies. (c) Injection with heat-killed strains of S pneumococci had no effect. (d) Injection with a live R strain and a heat-killed S strain gave the mice pneumonia, and live S strain pneumococci could be isolated from the dead mice.

Figure 11.2 Experiments on the Transforming Principle. Summary of the experiments of Avery, MacLeod, and McCarty on the transforming principle. DNA alone changed R to S cells, and this effect was lost when the extract was treated with deoxyribonuclease. Thus DNA carried the genetic information required for the R to S conversion or transformation.
[Figure rows recovered from the diagram:
R cells + purified S cell polysaccharide: R colonies
R cells + purified S cell protein: R colonies
R cells + purified S cell RNA: R colonies
R cells + purified S cell DNA: S colonies
S cell extract + protease + R cells: S colonies
S cell extract + RNase + R cells: S colonies]

Some years later (1952), Alfred D. Hershey and Martha Chase performed several experiments that indicated that DNA was the genetic material in the T2 bacteriophage. Some luck was involved in their discovery, for the genetic material of many viruses is RNA and the researchers happened to select a DNA virus for their studies. Imagine the confusion if T2 had been an RNA virus!
The controversy surrounding the nature of genetic information might have lasted considerably longer than it did. Hershey and Chase made the virus DNA radioactive with 32P or labeled the viral protein coat with 35S. They mixed radioactive bacteriophage with E. coli and incubated the mixture for a few minutes. The suspension was then agitated violently in a Waring blender to shear off any adsorbed bacteriophage particles (figure 11.3). After centrifugation, radioactivity in the supernatant and the bacterial pellet was determined. They found that most radioactive protein was released into the supernatant, whereas 32P DNA remained within the bacteria. Since genetic material was injected and T2 progeny were produced, DNA must have been carrying the genetic information for T2. The biology of bacteriophages (chapter 17)

Figure 11.3 The Hershey-Chase Experiment. (a) When E. coli was infected with a T2 phage containing 35S protein, most of the radioactivity remained outside the host cell. (b) When a T2 phage containing 32P DNA was mixed with the host bacterium, the radioactive DNA was injected into the cell and phages were produced. Thus DNA was carrying the virus’s genetic information.

Subsequent studies on the genetics of viruses and bacteria were largely responsible for the rapid development of molecular genetics. Furthermore, much of the new recombinant DNA technology (see chapter 14) has arisen from recent progress in bacterial and viral genetics. Research in microbial genetics has had a profound impact on biology as a science and on the technology that affects everyday life.
Biologists have long recognized a relationship between DNA, RNA, and protein (figure 11.4), and this recognition has guided a vast amount of research over the past decades. DNA is precisely copied during its synthesis or replication. The expression of the information encoded in the base sequence of DNA begins with the synthesis of an RNA copy of the DNA sequence making up a gene. A gene is a DNA segment or sequence that codes for a polypeptide, an rRNA, or a tRNA. Although DNA has two complementary strands, only the template strand is copied at any particular point on DNA. If both strands of DNA were transcribed, two different mRNAs would result and cause genetic confusion. Thus the sequence corresponding to a gene is located only on one of the two complementary DNA strands. Different genes may be encoded on opposite strands. This process of DNA-directed RNA synthesis is called transcription because the DNA base sequence is being written into an RNA base sequence. The RNA that carries information from DNA and directs protein synthesis is messenger RNA (mRNA).

The last phase of gene expression is translation or protein synthesis. The genetic information in the form of an mRNA nucleotide sequence is translated and governs the synthesis of protein. Thus the amino acid sequence of a protein is a direct reflection of the base sequence in mRNA. In turn the mRNA nucleotide sequence is a complementary copy of a portion of the DNA genome.

Figure 11.4 Relationships between DNA, RNA, and Protein Synthesis. This conceptual framework is sometimes called the Central Dogma. [Diagram labels: DNA, RNA, Protein; Replication, Transcription, Translation.]

1. Define clone, genome, genotype, and phenotype.
2. Briefly summarize the experiments of Griffith; Avery, MacLeod, and McCarty; and Hershey and Chase. What did each show, and why were these experiments important to the development of microbial genetics?
3. Describe the general relationship between DNA, RNA, and protein.
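The flow of information described in this section, from a DNA template strand to a complementary mRNA and then to an amino acid sequence, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the template sequence is hypothetical, the function names are invented for the example, and the codon table is a small subset of the standard genetic code (the full code is covered in section 11.4), just large enough for this sequence.

```python
# Illustrative sketch of transcription and translation.
# The sequence below is hypothetical; CODON_TABLE is only a small
# subset of the standard genetic code, enough for this example.

# Base pairing used when RNA is copied from a DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(template_3_to_5):
    """Return the mRNA (read 5'->3') complementary to a template
    strand that is written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def translate(mrna):
    """Read triplet codons from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


template = "TACAAACCGTTTATT"   # hypothetical template strand, 3'->5'
mrna = transcribe(template)    # "AUGUUUGGCAAAUAA"
peptide = translate(mrna)      # ["Met", "Phe", "Gly", "Lys"]
print(mrna, "-".join(peptide))
```

Only the template strand is passed to `transcribe`, which mirrors the point made above that a gene is read from one of the two complementary strands. A codon missing from the reduced table would raise a `KeyError`, so a real application would need the complete table.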
11.2 Nucleic Acid Structure

The structure and synthesis of purine and pyrimidine nucleotides are introduced in chapter 10. These nucleotides can be combined to form nucleic acids of two kinds (figure 11.5a). Deoxyribonucleic acid (DNA) contains the 2′-deoxyribonucleosides (figure 11.5b) of adenine, guanine, cytosine, and thymine. Ribonucleic acid (RNA) is composed of the ribonucleosides of adenine, guanine, cytosine, and uracil (instead of thymine). In both DNA and RNA, nucleosides are joined by phosphate groups to form long polynucleotide chains (figure 11.5c). The differences in chemical composition between the chains reside in their sugar and pyrimidine bases: DNA has deoxyribose and thymine; RNA has ribose and uracil in place of thymine.

Figure 11.5 The Composition of Nucleic Acids. (a) A diagram showing the relationships of various nucleic acid components. Combination of a purine or pyrimidine base with ribose or deoxyribose gives a nucleoside (a ribonucleoside or deoxyribonucleoside). A nucleotide contains a nucleoside and one or more phosphoric acid molecules. Nucleic acids result when nucleotides are connected together in polynucleotide chains. (b) Examples of nucleosides: the purine nucleoside adenosine and the pyrimidine deoxynucleoside 2′-deoxycytidine. The carbons of nucleoside sugars are indicated by numbers with primes. (c) A segment of a polynucleotide chain showing two nucleosides, deoxyguanosine and thymidine, connected by a phosphodiester linkage between the 3′ and 5′-carbons of adjacent deoxyribose sugars.

DNA Structure

Deoxyribonucleic acids are very large molecules, usually composed of two polynucleotide chains coiled together to form a double helix 2.0 nm in diameter (figure 11.6). Each chain contains purine and pyrimidine deoxyribonucleosides joined by phosphodiester bridges (figure 11.5c). That is, two adjacent deoxyribose sugars are connected by a phosphoric acid molecule esterified to a 3′-hydroxyl of one sugar and a 5′-hydroxyl of the other. Purine and pyrimidine bases are attached to the 1′-carbon of the deoxyribose sugars and extend toward the middle of the cylinder formed by the two chains. They are stacked on top of each other in the center, one base pair every 0.34 nm. The purine adenine (A) is always paired with the pyrimidine thymine (T) by two hydrogen bonds. The purine guanine (G) pairs with cytosine (C) by three hydrogen bonds (figure 11.7). This AT and GC base pairing means that the two strands in a DNA double helix are complementary. That is, the bases in one strand match up with those of the other according to the base pairing rules. Because the sequences of bases in these strands encode genetic information, considerable effort has been devoted to determining the base sequences of DNA and RNA from many microorganisms (see pp. 345–47). Nucleic acid sequence comparison and microbial taxonomy (chapter 19)

The two polynucleotide strands fit together much like the pieces in a jigsaw puzzle because of complementary base pairing (Box 11.1). Inspection of figure 11.6a,b, depicting the B form of DNA (probably the most common form in cells), shows that the two strands are not positioned directly opposite one another in the helical cylinder.
Therefore, when the strands twist about one another, a wide major groove and narrower minor groove are formed by the backbone. Each base pair rotates 36° around the cylinder with respect to adjacent pairs so that there are 10 base pairs per turn of the helical spiral. Each turn of the helix has a vertical length of 3.4 nm. The helix is right-handed; that is, the chains turn counterclockwise as they approach a viewer looking down the longitudinal axis. The two backbones are antiparallel or run in opposite directions with respect to the orientation of their sugars. One end of each strand has an exposed 5′-hydroxyl group, often with phosphates attached, whereas the other end has a free 3′-hydroxyl group. If the end of a double helix is examined, the 5′ end of one strand and the 3′ end of the other are visible. In a given direction one strand is oriented 5′ to 3′ and the other, 3′ to 5′ (figure 11.6b).

Figure 11.6 The Structure of the DNA Double Helix. (a) A space-filling model of the B form of DNA with the base pairs, major groove, and minor groove shown. The backbone phosphate groups, shown in color, spiral around the outside of the helix. (b) A diagrammatic representation of the double helix. The backbone consists of deoxyribose sugars (S) joined by phosphates (P) in phosphodiester bridges. The arrows at the top and bottom of the chains point in the 5′ to 3′ direction. The ribbons represent the sugar phosphate backbones. (c) An end view of the double helix showing the outer backbone and the bases stacked in the center of the cylinder. In the top drawing the ribose ring oxygens are red. The nearest base pair, an AT base pair, is highlighted in white.

Figure 11.7 DNA Base Pairs. DNA complementary base pairing showing the hydrogen bonds (. . .) between adenine (A) and thymine (T) and between guanine (G) and cytosine (C).

Box 11.1 The Elucidation of DNA Structure

The basic chemical composition of nucleic acids was elucidated in the 1920s through the efforts of P. A. Levene. Despite his major contributions to nucleic acid chemistry, Levene mistakenly believed that DNA was a very small molecule, probably only four nucleotides long, composed of equal amounts of the four different nucleotides arranged in a fixed sequence. Partly because of his influence, biologists believed for many years that nucleic acids were too simple in structure to carry complex genetic information. They concluded that genetic information must be encoded in proteins because proteins were large molecules with complex amino acid sequences that could vary among different proteins.

As so often happens, further advances in our understanding of DNA structure awaited the development of significant new analytical techniques in chemistry. One development was the invention of paper chromatography by Archer Martin and Richard Synge between 1941 and 1944. By 1948 the chemist Erwin Chargaff had begun using paper chromatography to analyze the base composition of DNA from a number of species. He soon found that the base composition of DNA from genetic material did indeed vary among species just as he expected. Furthermore, the total amount of purines always equaled the total amount of pyrimidines; and the adenine/thymine and guanine/cytosine ratios were always 1. These findings, known as Chargaff’s rules, were a key to the understanding of DNA structure.

Another turning point in research on DNA structure was reached in 1951 when Rosalind Franklin arrived at King’s College, London, and joined Maurice Wilkins in his efforts to prepare highly oriented DNA fibers and study them by X-ray crystallography. By the winter of 1952–1953, Franklin had obtained an excellent X-ray diffraction photograph of DNA.

The same year that Franklin began work at King’s College, the American biologist James Watson went to Cambridge University and met Francis Crick. Although Crick was a physicist, he was very interested in the structure and function of DNA, and the two soon began to work on its structure. Their attempts were unsuccessful until Franklin’s data provided them with the necessary clues. Her photograph of fibrous DNA contained a crossing pattern of dark spots, which showed that the molecule was helical. The dark regions at the top and bottom of the photograph showed that the purine and pyrimidine bases were stacked on top of each other and separated by 0.34 nm. Franklin had already concluded that the phosphate groups lay to the outside of the cylinder. Finally, the X-ray data and her determination of the density of DNA indicated that the helix contained two strands, not three or more as some had proposed.

Without actually doing any experiments themselves, Watson and Crick constructed their model by combining Chargaff’s rules on base composition with Franklin’s X-ray data and their predictions about how genetic material should behave. By building models, they found that a smooth, two-stranded helix of constant diameter could be constructed only when an adenine hydrogen bonded with thymine and when a guanine bonded with cytosine in the center of the helix. They immediately realized that the double helical structure provided a mechanism by which genetic material might be replicated. The two parental strands could unwind and direct the synthesis of complementary strands, thus forming two new identical DNA molecules (figure 11.10). Watson, Crick, and Wilkins received the Nobel Prize in 1962 for their discoveries. Franklin could not be considered for the prize because she had died of cancer in 1958 at the age of thirty-seven.

RNA Structure

Besides differing chemically from DNA, ribonucleic acid is usually single stranded rather than double stranded like most DNA. An RNA strand can coil back on itself to form a hairpin-shaped structure with complementary base pairing and helical organization. Cells contain three different types of RNA (messenger RNA, ribosomal RNA, and transfer RNA) that differ from one another in function, site of synthesis in eucaryotic cells, and structure.

The Organization of DNA in Cells

Although DNA exists as a double helix in both procaryotic and eucaryotic cells, its organization differs in the two cell types (see table 4.2). DNA is organized in the form of a closed circle in almost all procaryotes (the chromosome of Borrelia is a linear DNA molecule). This circular double helix is further twisted into supercoiled DNA (figure 11.8) and is associated with basic proteins but not with the histones found complexed with almost all eucaryotic DNA. These histonelike proteins do appear to help organize bacterial DNA into a coiled chromatinlike structure. The structure of the bacterial nucleoid (p. 54)

Figure 11.8 DNA Forms.
(a) The DNA double helix of almost all bacteria is in the shape of a closed circle. (b) The circular DNA strands, already coiled in a double helix, are twisted a second time to produce supercoils.

Figure 11.9 Nucleosome Internal Organization and Function. (a) The nucleosome core particle is a histone octamer surrounded by the 146 base pair DNA helix (brown and turquoise). The octamer is a disk-shaped structure composed of two H2A-H2B dimers and two H3-H4 dimers. The eight histone proteins are colored differently: blue, H3; green, H4; yellow, H2A; and red, H2B. Histone proteins interact with the backbone of the DNA minor groove. The DNA double helix circles the histone octamer in a left-handed helical path. (b) An illustration of how a string of nucleosomes, each associated with a histone H1, might be organized to form a highly supercoiled chromatin fiber. The nucleosomes are drawn as cylinders.

DNA is much more highly organized in eucaryotic chromatin (see section 4.9) and is associated with a variety of proteins, the most prominent of which are histones. These are small, basic proteins rich in the amino acids lysine and/or arginine. There are five types of histones in almost all eucaryotic cells studied: H1, H2A, H2B, H3, and H4. Eight histone molecules (two each of H2A, H2B, H3, and H4) form an ellipsoid about 11 nm long and 6.5 to 7 nm in diameter (figure 11.9a). DNA coils around the surface of the ellipsoid approximately 1¾ turns or 166 base pairs before proceeding on to the next. This complex of histones plus DNA is called a nucleosome. Thus DNA gently isolated from chromatin looks like a string of beads. The stretch of DNA between the beads or nucleosomes, the linker region, varies in length from 14 to over 100 base pairs.
Histone H1 appears to associate with the linker regions to aid the folding of DNA into more complex chromatin structures (figure 11.9b). When folding reaches a maximum, the chromatin takes the shape of the visible chromosomes seen in eucaryotic cells during mitosis and meiosis (see figure 4.20).

1. What are nucleic acids? How do DNA and RNA differ in structure?
2. Describe in some detail the structure of the DNA double helix. What does it mean to say that the two strands are complementary and antiparallel?
3. What are histones and nucleosomes? Describe the way in which DNA is organized in the chromosomes of procaryotes and eucaryotes.

11.3 DNA Replication

The replication of DNA is an extraordinarily important and complex process, one upon which all life depends. We shall first discuss the overall pattern of DNA synthesis and then examine the mechanism of DNA replication in greater depth.

Patterns of DNA Synthesis

Watson and Crick published their description of DNA structure in April 1953. Almost exactly one month later, a second paper appeared in which they suggested how DNA might be replicated. They hypothesized that the two strands of the double helix unwind from one another and separate (figure 11.10). Free nucleotides now line up along the two parental strands through complementary base pairing: A with T, G with C (figure 11.7). When these nucleotides are linked together by one or more enzymes, two replicas result, each containing a parental DNA strand and a newly formed strand. Research in subsequent years has proved Watson and Crick’s hypothesis correct.

Replication patterns are somewhat different in procaryotes and eucaryotes. For example, when the circular DNA chromosome of E. coli is copied, replication begins at a single point, the origin. Synthesis occurs at the replication fork, the place at which the DNA helix is unwound and individual strands are replicated.
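The semiconservative pattern hypothesized by Watson and Crick can be illustrated with a minimal Python sketch. It models only the A-T and G-C pairing rules and the antiparallel orientation of the strands; the sequence ATGCGT and the function names are hypothetical, chosen for illustration, and no enzymes or replication forks are modeled.

```python
# Watson-Crick pairing rules for DNA: A with T, G with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(template):
    """Return the strand complementary to `template`.

    Both strands are written 5' to 3'. Because the two strands are
    antiparallel, the complement is assembled by reading the template
    in reverse.
    """
    return "".join(PAIR[base] for base in reversed(template))


def replicate(duplex):
    """Semiconservative replication: each parental strand directs one new strand."""
    strand_a, strand_b = duplex
    return [
        (strand_a, complement_strand(strand_a)),  # parental strand a + new strand
        (complement_strand(strand_b), strand_b),  # new strand + parental strand b
    ]


parent_top = "ATGCGT"  # hypothetical sequence, written 5' to 3'
parent = (parent_top, complement_strand(parent_top))
replicas = replicate(parent)

for top, bottom in replicas:
    print("5'-" + top + "-3'  paired with  5'-" + bottom + "-3'")
```

Each replica keeps one strand object from the parent and gains one freshly built strand, and both replicas carry the same sequence as the parental duplex.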
Two replication forks move outward from the origin until they have copied the whole replicon, that portion of the genome that contains an origin and is replicated as a unit. When the replication forks move around the circle, a structure shaped like the Greek letter theta (θ) is formed (figure 11.11). Finally, since the bacterial chromosome is a single replicon, the forks meet on the other side and two separate chromosomes are released.

Figure 11.10 Semiconservative DNA Replication. The replication fork of DNA showing the synthesis of two progeny strands. Newly synthesized strands are in maroon. Each copy contains one new and one old strand. This process is called semiconservative replication.

Figure 11.11 Bidirectional Replication. The replication of a circular bacterial genome. Two replication forks move around the DNA forming theta-shaped intermediates. Newly replicated DNA double helix is in red.

A different pattern of DNA replication occurs during E. coli conjugation (see section 13.4) and the reproduction of viruses, such as phage lambda (see section 17.5). In the rolling-circle mechanism (figure 11.12), one strand is nicked and the free 3′-hydroxyl end is extended by replication enzymes. As the 3′ end is lengthened while the growing point rolls around the circular template, the 5′ end of the strand is displaced and forms an ever-lengthening tail. The single-stranded tail may be converted to the double-stranded form by complementary strand synthesis. This mechanism is particularly useful to viruses (see p. 388) because it allows the rapid, continuous production of many genome copies from a single initiation event.

Figure 11.12 The Rolling-Circle Pattern of Replication. A single-stranded tail, often composed of more than one genome copy, is generated and can be converted to the double-stranded form by synthesis of a complementary strand. The “free end” of the rolling-circle strand is probably bound to the primosome.

Eucaryotic DNA is linear and much longer than procaryotic DNA; E. coli DNA is about 1,300 µm in length, whereas the 46 chromosomes in the human nucleus have a total length of 1.8 m (almost 1,400 times longer). Clearly many replication forks must copy eucaryotic DNA simultaneously so that the molecule can be duplicated in a relatively short period, and so many replicons are present that there is an origin about every 10 to 100 µm along the DNA. Replication forks move outward from these sites and eventually meet forks that have been copying the adjacent DNA stretch (figure 11.13). In this fashion a large molecule is copied quickly.

Figure 11.13 The Replication of Eucaryotic DNA. Replication is initiated every 10 to 100 µm and the replication forks travel away from the origin. Newly copied DNA is in red.

Mechanism of DNA Replication

Because DNA replication is so essential to organisms, a great deal of effort has been devoted to understanding its mechanism. The replication of E. coli DNA is probably best understood and is the focus of attention in this section. The process in eucaryotic cells is thought to be similar.

DNA replication is initiated at the oriC locus. The DnaA protein binds to oriC while hydrolyzing ATP. This leads to the initial unwinding of double-stranded DNA at the initiation site.
Further unwinding occurs through the activity of the DnaB protein, a helicase (see below).

E. coli has three different DNA polymerase enzymes, each of which catalyzes the synthesis of DNA in the 5′ to 3′ direction while reading the DNA template in the 3′ to 5′ direction (figures 11.14 and 11.15). The polymerases require deoxyribonucleoside triphosphates (dATP, dGTP, dCTP, and dTTP) as substrates and a DNA template to copy. Nucleotides are added to the 3′ end of the growing chain when the free 3′-hydroxyl group on the deoxyribose attacks the first or alpha phosphate group of the substrate to release pyrophosphate (figure 11.14). DNA polymerase III plays the major role in replication, although it is probably assisted by polymerase I. It is thought that polymerases I and II participate in the repair of damaged DNA (p. 254).

Figure 11.14 The DNA Polymerase Reaction and Its Mechanism. The overall reaction, catalyzed by DNA polymerase in the presence of a DNA template, is n[dATP, dGTP, dCTP, dTTP] → DNA + nPPi. The mechanism involves a nucleophilic attack by the hydroxyl of the 3′ terminal deoxyribose on the alpha phosphate group of the nucleotide substrate (in this example, adenosine attacks cytidine triphosphate).

During replication the DNA double helix must be unwound to generate separate single strands. Unwinding occurs very quickly; the fork may rotate as rapidly as 75 to 100 revolutions per second. Helicases are responsible for DNA unwinding. These enzymes use energy from ATP to unwind short stretches of helix just ahead of the replication fork. Once the strands have separated, they are kept single through specific binding with single-stranded DNA binding proteins (SSBs) as shown in figure 11.15. Rapid unwinding can lead to tension and formation of supercoils or supertwists in the helix, just as rapid separation of two strands of a rope can lead to knotting or coiling of the rope. The tension generated by unwinding is relieved, and the unwinding process is promoted, by enzymes known as topoisomerases. These enzymes change the structure of DNA by transiently breaking one or two strands in such a way that it remains unaltered as its shape is changed (e.g., a topoisomerase might tie or untie a knot in a DNA strand). DNA gyrase is an E. coli topoisomerase that removes the supertwists produced during replication (see figure 35.6).

After the double helix has been unwound, successful replication requires the solution of two problems. First, DNA polymerase only synthesizes a new copy of DNA while moving in the 5′ to 3′ direction. Inspection of figure 11.15 shows that synthesis of the leading strand copy is relatively simple because the new strand can be extended continuously at its 3′ end as the DNA unwinds. In contrast, the lagging strand cannot be extended in the same direction because this would require 3′ to 5′ synthesis, which is not possible. As a result, the lagging strand copy is synthesized discontinuously in the 5′ to 3′ direction as a series of fragments; then the fragments are joined to form a complete copy. The second problem arises because DNA polymerase cannot start a new copy from scratch, but must build on an already existing strand. In figure 11.15, the leading strand copy already exists; however, the lagging strand fragments must be synthesized without a DNA strand to build upon. In this case, a special RNA primer is first synthesized and then a DNA copy can be built on the primer.

The details of DNA replication are outlined in a diagram of the replication fork (figure 11.16). The replication process takes place in four stages.
Figure 11.15 Bacterial DNA Replication. A general diagram of the synthesis of DNA in E. coli at the replication fork. Bases and base pairs are represented by lines extending outward from the strands. The RNA primer is in gold. See text for details.

Figure 11.16 A Hypothetical Model for Activity at the Replication Fork. The overall process is pictured in five stages with only one cycle of replication shown for sake of clarity. In practice, all these enzymes are functioning simultaneously and more than one round of replication can occur simultaneously; for example, new primer RNA can be synthesized at the same time as DNA is being replicated. (1) DNA gyrase, helicases, and single-stranded DNA binding proteins (SSBs) unwind DNA to produce a single-stranded stretch. (2) The primosome synthesizes an RNA primer. (3) The replisome has two DNA polymerase III complexes. One polymerase continuously copies the leading strand. The lagging strand loops around the other polymerase so that both strands can be replicated simultaneously. When DNA polymerase III encounters a completed Okazaki fragment, it releases the lagging strand. (4) DNA polymerase I removes the RNA primer and fills in the gap with complementary DNA. (5) DNA ligase seals the nick and joins the two fragments.

1. Helicases unwind the helix with the aid of topoisomerases like the DNA gyrase (figure 11.16, step 1). It appears that the DnaB protein is the helicase most actively involved in replication, but the n′ protein also may participate in unwinding. The single strands are kept separate by the DNA binding proteins (SSBs).
2. DNA is probably replicated continuously by DNA polymerase III when the leading strand is copied. Lagging strand replication is discontinuous, and the fragments are synthesized in the 5′ to 3′ direction just as in leading strand synthesis. First, a special RNA polymerase called a primase synthesizes a short RNA primer, usually around 10 nucleotides long, complementary to the DNA (figure 11.16, step 2). It appears that the primase requires the assistance of several other proteins, and the complex of the primase with its accessory proteins is called the primosome. DNA polymerase III holoenzyme then synthesizes complementary DNA beginning at the 3′ end of the RNA primer. Both leading and lagging strand synthesis probably occur concurrently on a single multiprotein complex with two catalytic sites, the replisome. If this is the case, the lagging strand template must be looped around the complex (figure 11.16, step 3). The final fragments are around 1,000 to 2,000 nucleotides long in bacteria and approximately 100 nucleotides long in eucaryotic cells. They are called Okazaki fragments after their discoverer, Reiji Okazaki.
3. After most of the lagging strand has been duplicated by the formation of Okazaki fragments, DNA polymerase I or RNase H removes the RNA primer. Polymerase I synthesizes complementary DNA to fill the gap resulting from RNA deletion (figure 11.16, step 4).
The polymerase appears to remove one primer nucleotide at a time and replace it with the appropriate complementary deoxyribonucleotide. Polymerase III holoenzyme also may be able to fill in the gap.
4. Finally, the fragments are joined by the enzyme DNA ligase, which forms a phosphodiester bond between the 3′-hydroxyl of the growing strand and the 5′-phosphate of an Okazaki fragment (figure 11.16, step 5, and figure 11.17). Bacterial ligases use the pyrophosphate bond of NAD⁺ as an energy source; many other ligases employ ATP.

Figure 11.17 The DNA Ligase Reaction. DNA ligase, using NAD⁺ or ATP, joins adjacent nucleotides. The groups being altered are shaded in blue.

DNA polymerase III holoenzyme, the enzyme complex that synthesizes most of the DNA copy, is a very large entity containing DNA polymerase III and several other proteins. The γδ complex and β subunits of the holoenzyme bind it to the DNA template and primer. The α subunit carries out the actual polymerization reaction.

It appears that most or all of the replication proteins form a huge complex or replication factory, sometimes called a replisome, that is relatively stationary and probably bound to the plasma membrane. The DNA moves through this factory and is copied, emerging as two daughter chromosomes. In slowly growing bacteria there seem to be two factories located at or close to the center of the cell. Rapidly growing cells might have four or more factories.

DNA replication stops when the polymerase complex reaches a termination site on the DNA in E. coli. The Tus protein binds to these Ter sites and halts replication. In many procaryotes, replication stops randomly when the forks meet.

DNA replication is an extraordinarily complex process. At least 30 proteins are required to replicate the E. coli chromosome.
Presumably much of the complexity is necessary for accuracy in copying DNA. It would be very dangerous for any organism to make many errors during replication because a large number of mutations would certainly be lethal. In fact, E. coli makes errors with a frequency of only 10⁻⁹ or 10⁻¹⁰ per base pair replicated (or about 10⁻⁶ per gene per generation). Part of this precision results from the low error rate of the copying process itself. However, DNA polymerase III (and DNA polymerase I) also can proofread the newly formed DNA. As polymerase III moves along synthesizing a new DNA strand, it recognizes any errors resulting in improper base pairing and hydrolytically removes the wrong nucleotide through a special 3′ to 5′ exonuclease activity (which is found in the ε subunit). The enzyme then backs up and adds the proper nucleotide in its place. Polymerases delete errors by acting much like correcting typewriters. DNA repair (pp. 254–56)

Despite its complexity and accuracy, replication occurs very rapidly. In procaryotes replication rates approach 750 to 1,000 base pairs per second. Eucaryotic replication is much slower, about 50 to 100 base pairs per second. This is not surprising because eucaryotic replication also involves operations like unwinding the DNA from nucleosomes.

1. Define the following terms: replication, transcription, messenger RNA, translation, replicon, replication fork, primosome, and replisome.
2. Be familiar with the nature and functions of the following replication components and intermediates: DNA polymerases I and III, topoisomerase, DNA gyrase, helicase, single-stranded DNA binding protein, Okazaki fragment, DNA ligase, leading strand, and lagging strand.
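The figures quoted in this section can be combined into a rough back-of-the-envelope estimate, sketched below in Python. The calculation assumes that the quoted values are exact (0.34 nm per base pair, 10 base pairs per turn, an E. coli chromosome 1,300 µm long, forks moving at 750 to 1,000 base pairs per second) and that two forks copy the chromosome at a constant rate from a single origin. The variable names are hypothetical, and the result is an order-of-magnitude illustration rather than a measured value.

```python
# Values quoted in the text; treated as exact for this estimate.
RISE_PER_BP_NM = 0.34              # one base pair every 0.34 nm
BP_PER_TURN = 10                   # B-form DNA: 10 base pairs per helical turn
ECOLI_DNA_LENGTH_UM = 1300         # length of the E. coli chromosome
HUMAN_DNA_LENGTH_M = 1.8           # total length of the 46 human chromosomes
FORK_RATES_BP_PER_S = (750, 1000)  # procaryotic replication rate per fork
N_FORKS = 2                        # bidirectional replication from one origin

# Vertical length of one helical turn (the text gives 3.4 nm).
pitch_nm = RISE_PER_BP_NM * BP_PER_TURN

# Chromosome length converted from micrometers to nanometers, then to base pairs.
genome_bp = ECOLI_DNA_LENGTH_UM * 1000 / RISE_PER_BP_NM

# Minutes needed for two forks to copy the whole chromosome at each rate.
replication_minutes = {
    rate: genome_bp / (N_FORKS * rate) / 60 for rate in FORK_RATES_BP_PER_S
}

# How many times longer the human DNA is than the E. coli chromosome.
length_ratio = HUMAN_DNA_LENGTH_M * 1e6 / ECOLI_DNA_LENGTH_UM

print(f"helical pitch: {pitch_nm:.1f} nm")
print(f"estimated E. coli genome size: {genome_bp:,.0f} bp")
for rate, minutes in replication_minutes.items():
    print(f"at {rate} bp/s per fork: {minutes:.0f} min")
print(f"human / E. coli DNA length: {length_ratio:,.0f}x")
```

Under these assumptions the chromosome works out to roughly 3.8 million base pairs, copied in about 32 to 42 minutes, and the length ratio reproduces the "almost 1,400 times longer" comparison made earlier.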
11.4 The Genetic Code

The realization that DNA is the genetic material triggered efforts to understand how genetic instructions are stored and organized in the DNA molecule. Early studies on the nature of the genetic code showed that the DNA base sequence corresponds to the amino acid sequence of the polypeptide specified by the gene. That is, the nucleotide and amino acid sequences are colinear. It also became evident that many mutations are the result of changes of single amino acids in a polypeptide chain. However, the exact nature of the code was still unclear.

Establishment of the Genetic Code

Since only 20 amino acids normally are present in proteins, there must be at least 20 different code words in a linear, single strand of DNA. The code must be contained in some sequence of the four nucleotides commonly found in the linear DNA sequence. There are only 16 possible combinations (4²) of the four nucleotides if only nucleotide pairs are considered, not enough to code for all 20 amino acids. Therefore a code word, or codon, must involve at least nucleotide triplets even though this would give 64 possible combinations (4³), many more than the minimum of 20 needed to specify the common amino acids.

The actual codons were discovered in the early 1960s through the experiments carried out by Marshall Nirenberg, Heinrich Matthaei, Philip Leder, and Har Gobind Khorana. In 1968 Nirenberg and Khorana shared the Nobel prize with Robert W. Holley, the first person to sequence a nucleic acid (alanine tRNA).

Organization of the Code

The genetic code, presented in RNA form, is summarized in table 11.1. Note that there is code degeneracy. That is, there are up to six different codons for a given amino acid.
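The counting argument and the degeneracy of the code can be checked with a short Python sketch. It enumerates every pair and triplet of the four RNA bases and then tallies the standard genetic code. The compact 64-letter string of one-letter amino acid symbols (with '*' marking a stop codon, listed in U, C, A, G order) is a common way to write the standard code and is an addition here, not something taken from the text; the variable names are likewise hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code as one-letter amino acid symbols, one per codon,
# with codons enumerated in U, C, A, G order at each position.
# '*' marks a stop codon. (Assumed encoding; not part of the original text.)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# All possible two-base and three-base code words.
pairs = ["".join(p) for p in product(BASES, repeat=2)]
triplets = ["".join(p) for p in product(BASES, repeat=3)]

# Map each triplet to the amino acid (or stop signal) it specifies.
CODON_TABLE = dict(zip(triplets, AMINO_ACIDS))

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]

# Number of codons assigned to each amino acid (code degeneracy).
degeneracy = Counter(CODON_TABLE[c] for c in sense_codons)

print(len(pairs), "pairs;", len(triplets), "triplets")
print(len(sense_codons), "sense codons;", "stop codons:", stop_codons)
print("amino acids encoded:", len(degeneracy))
print("largest codon family:", max(degeneracy.values()))
```

Pairs give too few code words for 20 amino acids, whereas triplets give more than enough, which is why several codons can specify the same amino acid.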
Table 11.1 The Genetic Code

First Position                 Second Position                  Third Position
(5′ End)a        U           C           A           G           (3′ End)
U                UUU Phe     UCU Ser     UAU Tyr     UGU Cys     U
                 UUC Phe     UCC Ser     UAC Tyr     UGC Cys     C
                 UUA Leu     UCA Ser     UAA STOP    UGA STOP    A
                 UUG Leu     UCG Ser     UAG STOP    UGG Trp     G
C                CUU Leu     CCU Pro     CAU His     CGU Arg     U
                 CUC Leu     CCC Pro     CAC His     CGC Arg     C
                 CUA Leu     CCA Pro     CAA Gln     CGA Arg     A
                 CUG Leu     CCG Pro     CAG Gln     CGG Arg     G
A                AUU Ile     ACU Thr     AAU Asn     AGU Ser     U
                 AUC Ile     ACC Thr     AAC Asn     AGC Ser     C
                 AUA Ile     ACA Thr     AAA Lys     AGA Arg     A
                 AUG Met     ACG Thr     AAG Lys     AGG Arg     G
G                GUU Val     GCU Ala     GAU Asp     GGU Gly     U
                 GUC Val     GCC Ala     GAC Asp     GGC Gly     C
                 GUA Val     GCA Ala     GAA Glu     GGA Gly     A
                 GUG Val     GCG Ala     GAG Glu     GGG Gly     G

a The code is presented in the RNA form. Codons run in the 5′ to 3′ direction. See text for details.

Figure 11.18 Wobble and Coding. The use of wobble in coding for the amino acid glycine. (a) Inosine (I) is a wobble nucleoside that can base pair with uracil (U), cytosine (C), or adenine (A). Thus ICC base pairs with GGU, GGC, and GGA in the mRNA. (b) Because of the wobble produced by inosine, two tRNA anticodons can recognize the four glycine (Gly) codons. ICC recognizes GGU, GGC, and GGA; CCC recognizes GGG. Codons and anticodons are written in the 5′ to 3′ direction.

Only 61 codons, the sense codons, direct amino acid incorporation into protein. The remaining three codons (UGA, UAG, and UAA) are involved in the termination of translation and are called stop or nonsense codons. Despite the existence of 61 sense codons, there are not 61 different tRNAs, one for each codon.
The 5′ nucleotide in the anticodon can vary, but generally, if the nucleotides in the second and third anticodon positions complement the first two bases of the mRNA codon, an aminoacyl-tRNA with the proper amino acid will bind to the mRNA-ribosome complex. This pattern is evident on inspection of changes in the amino acid specified with variation in the third position (table 11.1). This somewhat loose base pairing is known as wobble and relieves cells of the need to synthesize so many tRNAs (figure 11.18). Wobble also decreases the effects of DNA mutations. The mechanism of protein synthesis and tRNA function (pp. 265–71)

1. Why must a codon contain at least three nucleotides?
2. Define the following: code degeneracy, sense codon, stop or nonsense codon, and wobble.

11.5 Gene Structure

The gene has been defined in several ways. Initially geneticists considered it to be the entity responsible for conferring traits on the organism and the entity that could undergo recombination. Recombination involves exchange of DNA from one source with that from another (see section 13.1) and is responsible for generating much of the genetic variability found in viruses and living organisms. Genes were typically named for some mutant or altered phenotype. With the discovery and characterization of DNA, the gene was defined more precisely as a linear sequence of nucleotides or codons (this term can be used for RNA as well as DNA) with a fixed start point and end point.

At first, it was thought that a gene contained information for the synthesis of one enzyme, the one gene–one enzyme hypothesis. This has been modified to the one gene–one polypeptide hypothesis because of the existence of enzymes and other proteins composed of two or more different polypeptide chains coded for by separate genes. The segment that codes for a single polypeptide is sometimes also called a cistron. More recent results show that even this description is oversimplified.
Not all genes are involved in protein synthesis; some code instead for rRNA and tRNA. Thus a gene might be defined as a polynucleotide sequence that codes for a polypeptide, tRNA, or rRNA. Some geneticists think of it as a segment of nucleic acid that is transcribed to give an RNA product. Most genes consist of discrete sequences of codons that are “read” only one way to produce a single product. That is, the code is not overlapping and there is a single starting point with one reading frame or way in which nucleotides are grouped into codons (figure 11.19). Chromosomes therefore usually consist of gene sequences that do not overlap one another (figure 11.20a). However, there are exceptions to the rule. Some viruses such as the phage φX174 do have overlapping genes (figure 11.20b), and parts of genes overlap in some bacterial genomes.

Procaryotic and viral gene structure differs greatly from that of eucaryotes. In bacterial and viral systems, the coding information within a cistron normally is continuous (some bacterial genes do contain introns); however, in eucaryotic organisms, many genes contain coding information (exons) interrupted periodically by noncoding sequences (introns). An interesting exception to this rule is eucaryotic histone genes, which lack introns. Because procaryotic and viral systems are the best characterized, the more detailed description of gene structure that follows will focus on E. coli genes. Exons and introns in eucaryotic genes (p. 263)

[Diagram: DNA, TACGGTATGACCT; mRNA, AUGCCAUACUGGU. Reading the mRNA from its first nucleotide gives the peptide Met-Pro-Tyr-Trp; reading from its second nucleotide gives Cys-His-Thr-Gly.]

Figure 11.19 Reading Frames and Their Importance.
The place at which DNA sequence reading begins determines the way nucleotides are grouped together in clusters of three (outlined with brackets), and this specifies the mRNA codons and the peptide product. In the example, a change in the reading frame by one nucleotide yields a quite different mRNA and final peptide. tyrB metA purD,H thiA,B,C G E D C E arg B P H A ilv Y arol C cysE argl pyrB purA D C B A ABC pyrA leu thr proA,B argF A H A B F bio C D proC purE 100/0 A* G φX174 aroA pyrD B K pyrC ilvH,J,K aroB cysG argG 75 Escherichia coli 25 A purB B trp C D cysB E C E F (a) metE serA lysA thyA argA C pyrG D cys H I J aroH aroD 50 pheA tyrA aroF purF aroC cysA,K purC cysA,K his metG G D C B H A F I E Genes That Code for Proteins Recall from the discussion of transcription that although DNA is double stranded, only one strand contains coded information and directs RNA synthesis. This strand is called the template strand, and the complementing strand is known as the nontemplate strand (figure 11.21). Because the mRNA is made from the 5′ to the 3′ D J (b) Figure 11.20 Chromosomal Organization in Bacteria and Viruses. (a) Simplified genetic map of E. coli. The E. coli map is divided into 100 minutes. (b) The map of phage X174 shows the overlap of gene B with A, K with A and C, and E with D. The solid regions are spaces lying between genes. Protein A* consists of the last part of protein A and arises from reinitiation of transcription within gene A. end, the polarity of the DNA template strand is therefore 3′ to 5′. Therefore the beginning of the gene is at the 3′ end of the template strand (also the 5′ end of the nontemplate strand). An RNA polymerase recognition/binding and regulatory site known as the promoter is located at the start of the gene. The mechanism of transcription (pp. 261–64) Prescott−Harley−Klein: Microbiology, Fifth Edition IV. Microbial Molecular Biology and Genetics 11. 
Genes: Structure, Replication, and Mutation © The McGraw−Hill Companies, 2002 11.5 243 Gene Structure RNA polymerase recognition site Nontemplate strand RNA polymerase binding site (Pribnow box) –35 Template strand –10 +1 3′ 5′ DNA 5′ 3′ Promoter Antileader G or A mRNA Coding region Antitrailer Terminator Shine-Dalgarno sequence AUG 3′ 5′ Leader Transcription start Direction of transcription Trailer Translation start (initiation codon) Figure 11.21 A Bacterial Structural Gene. The organization of a typical structural gene in bacteria. Leader and trailer sequences are included even though some genes lack one or both. Transcription begins at the ⫹1 position in DNA, and the first nucleotide incorporated into mRNA is usually GTP or ATP. Translation of the mRNA begins with the AUG initiation codon. Regulatory sites are not shown. – 35 region TTGACA RNA polymerase recognition site – 10 region Consensus sequences TATAAT RNA polymerase binding site G T T G T G T G G A AT 5′ -C C C C A GG C T T T A C A C T T T AT G C T T C C G G C T C G TAT 3′ - G G G G T C C G A A AT G T G A A ATA C G A A G G C C G A G C ATA P P P GA CAACACACCTT A +1 TGTGAGC- 3 ′ ACACTCG- 5 ′ Beginning of RNA chain Template strand Region unwound by RNA polymerase in open complex Figure 11.22 A Bacterial Promoter. The lactose operon promoter and its consensus sequences. The start point for RNA synthesis is labeled ⫹1. The region around ⫺35 is the site at which the RNA polymerase first attaches to the promoter. RNA polymerase binds and begins to unwind the DNA helix at the Pribnow box or RNA polymerase binding site, which is located in the ⫺10 region. Promoters are sequences of DNA that are usually upstream from the actual coding or transcribed region (figure 11.21)—that is, the promoter is located before or upstream from the coding region in relationship to the direction of transcription (the direction of transcription is referred to as downstream). 
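As a minimal sketch of the template-strand and reading-frame conventions described in this section, the following Python fragment transcribes the template strand shown in figure 11.19 and translates the resulting mRNA in two reading frames. The function names (transcribe, translate) and the compact encoding of the codon table are illustrative assumptions, not part of the text; the table itself is the standard genetic code, with stop codons written as "*" and amino acids in one-letter symbols.

```python
# Standard genetic code in compact form: the first, second, and third base
# each run through U, C, A, G. "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Pairing between a DNA template base and the RNA base it specifies.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the mRNA (5'->3') specified by a template strand written 3'->5'."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5)


def translate(mrna, frame=0):
    """Translate complete codons starting at offset `frame`; halt at a stop codon."""
    peptide = []
    for i in range(frame, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":
            break
        peptide.append(amino)
    return "".join(peptide)


template = "TACGGTATGACCT"        # template strand from figure 11.19, 3'->5'
mrna = transcribe(template)
print(mrna)                       # AUGCCAUACUGGA
print(translate(mrna, frame=0))   # MPYW (Met-Pro-Tyr-Trp)
print(translate(mrna, frame=1))   # CHTG (Cys-His-Thr-Gly)
```

Shifting the starting point by a single nucleotide changes every codon that follows, which is why a gene needs one fixed start point and one reading frame.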
Different genes have different promoters, and promoters also vary in sequence between bacteria. In E. coli the promoter has two important functions, and these relate to two specific segments within the promoter (figure 11.21 and figure 11.22). Although these two segments do vary slightly in sequence between bacterial strains and different genes, they are fairly constant and may be represented by consensus sequences. These are idealized sequences composed of the bases most often found at each position when the sequences from different bacteria are compared. The RNA polymerase recognition site, with a consensus sequence of 5′TTGACA3′ on the nontemplate strand in E. coli, is centered about 35 base pairs before (the −35 region) the transcriptional start point (labeled as +1) of RNA synthesis. This sequence seems to be the site of the initial association of the RNA polymerase with the DNA. The RNA polymerase binding site, also known as the Pribnow box, is centered at the −10 region and has a consensus sequence 5′TATAAT3′ in E. coli (a sequence that favors the localized unwinding of DNA). This is where the RNA polymerase begins to unwind the DNA for eventual transcription.

The initially transcribed portion of the gene is not necessarily coding material. Rather, a leader sequence may be synthesized first. The leader is usually a nontranslated sequence that is important in the initiation of translation and sometimes is involved in regulation of transcription. The leader (figure 11.21) in procaryotes generally contains a consensus sequence known as the Shine-Dalgarno sequence, 5′AGGA3′, the transcript of which complements a sequence on the 16S rRNA in the small subunit of the ribosome.
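The notion of a consensus sequence used above (the base most often found at each position when related sequences are compared) can be sketched in a few lines of Python. The six −10 region variants below are hypothetical sequences invented purely for illustration; they are not taken from real promoters, and the function name consensus is likewise an assumption.

```python
from collections import Counter


def consensus(aligned):
    """Return the most common base at each column of equal-length sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*aligned) yields one tuple of bases per alignment column.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*aligned))


# Hypothetical -10 region variants (illustration only, not real data).
minus10_variants = ["TATAAT", "TATGAT", "TACAAT", "TATAAT", "GATAAT", "TATATT"]
print(consensus(minus10_variants))   # TATAAT
```

No single variant needs to match the consensus exactly; the consensus simply records the majority base at each position.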
The binding of mRNA leader with 16S rRNA properly orients the mRNA on the ribosome. The leader also sometimes regulates transcription by attenuation (see section 12.4). Downstream and next to the leader is the most important part of the structural gene, the coding region. The coding region (figure 11.21) of genes that direct the synthesis of proteins typically begins with the template DNA sequence 3′TAC5′. This produces the RNA translation initiation codon 5′AUG3′, which codes for N-formylmethionine. This modified form of methionine is the first amino acid incorporated in most procaryotic proteins. The remainder of the gene coding region consists of a sequence of codons that specifies the sequence of amino acids for that particular protein. Transcription does not stop at the translation stop codon but rather at a terminator sequence (see section 12.1). The terminator often lies after a nontranslated trailer sequence located downstream from the coding region (figure 11.21). The trailer sequence, like the leader, is needed for the proper expression of the coding region of the gene. Besides the basic components described above—the promoter, leader, coding region, trailer, and terminator—many procaryotic genes have a variety of regulatory sites. These are locations where DNA-recognizing regulatory proteins bind to stimulate or prevent gene expression. Regulatory sites often are associated with promoter function, and some consider them to be parts of special promoters. Two such sites, the operator and the CAP binding site, are discussed in section 12.3. Certainly everything is not known about genes and their structure. With the ready availability of purified cloned genes and DNA sequencing technology, major discoveries continue to be made in this area. The operon and transcription regulation (pp. 
275–78)

Genes That Code for tRNA and rRNA

The DNA segments that code for tRNA and rRNA also are considered genes, although they give rise to structurally important RNA rather than protein. In E. coli the genes for tRNA are fairly typical, consisting of a promoter and transcribed leader and trailer sequences that are removed during the maturation process (figure 11.23a). The precise function of the leader is not clear; however, the trailer is required for termination. Genes coding for tRNA may code for more than a single tRNA molecule or type of tRNA (figure 11.23a). The segments coding for tRNAs are separated by short spacer sequences that are removed after transcription by special ribonucleases, at least one of which contains catalytic RNA. As mentioned in chapter 12 (see p. 266), mature tRNAs contain unusual nucleosides. Modified nucleosides such as inosine, ribothymidine, and pseudouridine almost always are formed after the tRNA has been synthesized. Special tRNA-modifying enzymes are responsible.

RNA splicing and ribozymes (pp. 264–65)

The genes for rRNA also are similar in organization to genes coding for proteins and have promoters, trailers, and terminators (figure 11.23b). Interestingly, all the rRNAs are transcribed as a single, large precursor molecule that is cut up by ribonucleases after transcription to yield the final rRNA products. E. coli pre-rRNA spacer and trailer regions even contain tRNA genes. Thus the synthesis of tRNA and rRNA involves posttranscriptional modification, a relatively rare process in procaryotes.

1. Define or describe the following: gene, template and nontemplate strands, promoter, consensus sequence, RNA polymerase recognition and binding sites, Pribnow box, leader, Shine-Dalgarno sequence, coding region, reading frame, trailer, and terminator.
2. How do the genes of procaryotes and eucaryotes usually differ from each other?
3. Briefly discuss the general organization of tRNA and rRNA genes.
How does their expression differ from that of structural genes with respect to posttranscriptional modification of the gene product?

Figure 11.23 tRNA and rRNA Genes. (a) A tRNA precursor from E. coli that contains two tRNA molecules (tRNA Ser and tRNA Thr). The spacer and extra nucleotides at both ends are removed during processing. (b) The E. coli ribosomal RNA gene codes for a large transcription product that is cleaved into three rRNAs and one to three tRNAs. The 16S, 23S, and 5S rRNA segments are represented by blue lines, and tRNA sequences are placed in brackets. The seven copies of this gene vary in the number and kind of tRNA sequences.

11.6 Mutations and Their Chemical Basis

Considerable information is embedded in the precise order of nucleotides in DNA. For life to exist with stability, it is essential that the nucleotide sequence of genes is not disturbed to any great extent. However, sequence changes do occur and often result in altered phenotypes. These changes are largely detrimental but are important in generating new variability and contribute to the process of evolution. Microbial mutation rates also can be increased, and these genetic changes have been put to many important uses in the laboratory and industry.

Mutations [Latin mutare, to change] were initially characterized as altered phenotypes or phenotypic expressions. Long before the existence of direct proof that a mutation is a stable, heritable change in the nucleotide sequence of DNA, geneticists predicted that several basic types of transmitted mutations could exist. They believed that mutations could arise from the alteration of single pairs of nucleotides and from the addition or deletion of one or two nucleotide pairs in the coding regions of a gene. Clearly, mutations may be characterized according to either the kind of genotypic change that has occurred or their phenotypic consequences. In this section the molecular basis of mutations and mutagenesis is first considered. Then the phenotypic effects of mutations, the detection of mutations, and the use of mutations in carcinogenicity testing are discussed.

Mutations and Mutagenesis

Mutations can alter the phenotype of a microorganism in several different ways. Morphological mutations change the microorganism’s colonial or cellular morphology. Lethal mutations, when expressed, result in the death of the microorganism. Since the microorganism must be able to grow in order to be isolated and studied, lethal mutations are recovered only if they are recessive in diploid organisms or conditional (see the following) in haploid organisms. Conditional mutations are those that are expressed only under certain environmental conditions. For example, a conditional lethal mutation in E. coli might not be expressed under permissive conditions such as low temperature but would be expressed under restrictive conditions such as high temperature. Thus the hypothetical mutant would grow normally at the permissive temperature but would die at high temperatures. Biochemical mutations are those causing a change in the biochemistry of the cell.
Since these mutations often inactivate a biosynthetic pathway, they frequently make a microorganism unable to grow on a medium lacking an adequate supply of the pathway’s end product. That is, the mutant cannot grow on minimal medium and requires nutrient supplements. Such mutants are called auxotrophs, whereas microbial strains that can grow on minimal medium are prototrophs. Analysis of auxotrophy has been quite important in microbial genetics due to the ease of auxotroph selection and the relative abundance of this mutational type.

Mutant detection and replica plating (pp. 251–52); Nutrient requirements and nutritional types (pp. 96–98)

A resistant mutant is a particular type of biochemical mutant that acquires resistance to some pathogen, chemical, or antibiotic. Such mutants also are easy to select for and very useful in microbial genetics.

Mechanisms of drug resistance (pp. 818–19)

Mutations occur in one of two ways. (1) Spontaneous mutations arise occasionally in all cells and develop in the absence of any added agent. (2) Induced mutations, on the other hand, are the result of exposure of the organism to some physical or chemical agent called a mutagen.

Although most geneticists believe that spontaneous mutations occur randomly in the absence of an external agent and are then selected, observations by some microbiologists have led to a new and controversial hypothesis. John Cairns and his collaborators have reported that a mutant E. coli strain, which is unable to use lactose as a carbon and energy source, regains the ability to do so more rapidly when lactose is added to the culture medium as the only carbon source. Lactose appears to induce mutations that allow E. coli to use the sugar again.
It has been claimed that these and similar observations on different mutations are examples of directed or adaptive mutation—that is, some bacteria seem able to choose which mutations occur so that they can better adapt to their surroundings. Many explanations have been offered to account for this phenomenon without depending on bacterial selection of particular mutations. One of the most interesting is the proposal that hypermutation can produce such results. Some starving bacteria might rapidly generate multiple mutations through activation of special mutator genes. This would produce many mutant bacterial cells. In such a random process, the rate of production of favorable mutants would increase, with many of these mutants surviving to be counted. There would appear to be directed or adaptive mutation because many of the unfavorable mutants would die. There is support for this hypothesis. Mutator genes have been discovered and do cause hypermutation under nutritional stress. Even if the directed mutation hypothesis is incorrect, it has stimulated much valuable research and led to the discovery of new phenomena. Some of these issues will be discussed in chapter 42 in the context of evolutionary biotechnology. Spontaneous Mutations Spontaneous mutations arise without exposure to external agents. This class of mutations may result from errors in DNA replication, or even from the action of transposons (see section 13.3). A few of the more prevalent mechanisms are described in the following paragraphs. Generally replication errors occur when the base of a template nucleotide takes on a rare tautomeric form. Tautomerism is the relationship between two structural isomers that are in chemical equilibrium and readily change into one another. Bases typically exist in the keto form. However, they can at times take on either an imino or enol form (figure 11.24a). 
These tautomeric shifts change the hydrogen-bonding characteristics of the bases, allowing purine for purine or pyrimidine for pyrimidine substitutions that can eventually lead to a stable alteration of the nucleotide sequence (figure 11.24b). Such substitutions are known as transition mutations and are relatively common, although most of them are repaired by various proofreading functions (see pp. 239 and 254). In transversion mutations, a purine is substituted for a pyrimidine, or a pyrimidine for a purine. These mutations are rarer due to the steric problems of pairing purines with purines and pyrimidines with pyrimidines. Spontaneous mutations also arise from frameshifts, usually caused by the deletion of DNA segments resulting in an altered codon reading frame. These mutations generally occur where there is a short stretch of the same nucleotide. In such a location, the pairing of template and new strand can be displaced by the distance of the repeated sequence leading to additions or deletions of bases in the new strand (figure 11.25). Spontaneous mutations originate from lesions in DNA as well as from replication errors. For example, it is possible for purine nucleotides to be depurinated—that is, to lose their base. This results in the formation of an apurinic site, which will not base pair normally and may cause a transition type mutation after the next round of replication. Cytosine can be deaminated to uracil, which is then removed to form an apyrimidinic site. Reactive forms of oxygen such as oxygen free radicals and peroxides are produced by aerobic metabolism (see p. 128). These may alter DNA bases and cause mutations. For example, guanine can be converted to 8-oxo-7,8-dihydrodeoxyguanine, which often pairs with adenine rather than cytosine during replication. Finally, spontaneous mutations can result from the insertion of DNA segments into genes. This results from the movement of insertion sequences and transposons (see pp. 
298–302), and usually inactivates the gene. Insertion mutations are very frequent in E. coli and many other bacteria.

Figure 11.24 Transition and Transversion Mutations. Errors in replication due to base tautomerization. (a) Normally AT and GC pairs are formed when keto groups participate in hydrogen bonds. In contrast, enol tautomers produce AC and GT base pairs. (b) Mutation as a consequence of tautomerization during DNA replication. The temporary enolization of guanine leads to the formation of an AT base pair in the mutant, and a GC to AT transition mutation occurs. The process requires two replication cycles. The mutation only occurs if the abnormal first-generation GT base pair is missed by repair mechanisms.

Figure 11.25 Additions and Deletions. A hypothetical mechanism for the generation of additions and deletions during replication. The direction of replication is indicated by the large arrow. In each case there is strand slippage resulting in the formation of a small loop that is stabilized by the hydrogen bonding in the repetitive sequence, the AT stretch in this example. DNA synthesis proceeds to the right in this figure. (a) If the new strand slips, an addition of one T results. (b) Slippage of the parental strand yields a deletion (in this case, a loss of two Ts).

Induced Mutations

Virtually any agent that directly damages DNA, alters its chemistry, or interferes with repair mechanisms (pp. 254–56) will induce mutations. Mutagens can be conveniently classified according to their mechanism of action. Four common modes of mutagen action are incorporation of base analogs, specific mispairing, intercalation, and bypass of replication.

Base analogs are structurally similar to normal nitrogenous bases and can be incorporated into the growing polynucleotide chain during replication. Once in place, these compounds typically exhibit base pairing properties different from the bases they replace and can eventually cause a stable mutation. A widely used base analog is 5-bromouracil (5-BU), an analog of thymine. It undergoes a tautomeric shift from the normal keto form to an enol much more frequently than does a normal base. The enol forms hydrogen bonds like cytosine and directs the incorporation of guanine rather than adenine (figure 11.26). The mechanism of action of other base analogs is similar to that of 5-bromouracil.

Figure 11.26 Mutagenesis by the Base Analog 5-Bromouracil. (a) Base pairing of the normal keto form of 5-BU with adenine is shown in the top illustration. The enol form of 5-BU (bottom illustration) base pairs with guanine rather than with adenine as might be expected for a thymine analog. (b) If the keto form of 5-BU is incorporated in place of thymine, its occasional tautomerization to the enol form (BUe) will produce an AT to GC transition mutation.

Specific mispairing is caused when a mutagen changes a base’s structure and therefore alters its base pairing characteristics. Some mutagens in this category are fairly selective; they preferentially react with some bases and produce a specific kind of DNA damage. An example of this type of mutagen is methyl-nitrosoguanidine, an alkylating agent that adds methyl groups to guanine, causing it to mispair with thymine (figure 11.27). A subsequent round of replication could then result in a GC-AT transition. DNA damage also stimulates error-prone repair mechanisms. Other examples of mutagens with this mode of action are the alkylating agent ethylmethanesulfonate and the hydroxylating agent hydroxylamine. Hydroxylamine hydroxylates the C-4 nitrogen of cytosine, causing it to base pair like thymine. There are many other DNA modifying agents that can cause mispairing.

Intercalating agents distort DNA to induce single nucleotide pair insertions and deletions. These mutagens are planar and insert themselves (intercalate) between the stacked bases of the helix.
This results in a mutation, possibly through the formation of a loop in DNA. Intercalating agents include acridines such as proflavin and acridine orange. Many mutagens, and indeed many carcinogens, directly damage bases so severely that hydrogen bonding between base pairs is impaired or prevented and the damaged DNA can no longer act as a template. For instance, UV radiation generates cyclobutane type dimers, usually thymine dimers, between adjacent pyrimidines (figure 11.28). Other examples are ionizing radiation and carcinogens such as aflatoxin B1 and other benzo(a)pyrene derivatives. Such damage to DNA would generally be lethal but may trigger a repair mechanism that restores much of the damaged genetic material, although with considerable error incorporation (pp. 254–56). Retention of proper base pairing is essential in the prevention of mutations. Often the damage can be repaired before a mutation is permanently established. If a complete DNA replication cycle takes place before the initial lesion is repaired, the mutation frequently becomes stable and inheritable. The Expression of Mutations The expression of a mutation will only be readily noticed if it produces a detectable, altered phenotype. A mutation from the most prevalent gene form, the wild type, to a mutant form is called a forward mutation. Later, a second mutation may make the mutant appear to be a wild-type organism again. Such a mutation is called a reversion mutation because the organism seems to have reverted back to its original phenotype. A true back mutation converts the mutant nucleotide sequence back to the wild-type sequence. The wild-type phenotype also can be regained by a second mutation in a different gene, a suppressor mutation, which overcomes the effect of the first mutation (table 11.2). If the second mutation is within the same gene, the change may be called a second site reversion or intragenic suppression. 
Thus, although revertant phenotypes appear to be wild types, the original DNA sequence may not be restored. In practice, a mutation is visibly expressed when a protein that is in some way responsible for the phenotype is altered sufficiently to produce a new phenotype.

Figure 11.27 Methyl-Nitrosoguanidine Mutagenesis. Mutagenesis by methyl-nitrosoguanidine (N-methyl-N′-nitro-N-nitrosoguanidine) due to the methylation of guanine. Guanine pairs normally with cytosine; O6-methylguanine sometimes pairs with thymine.

Figure 11.28 Thymine Dimer. Thymine dimers are formed by ultraviolet radiation. The enzyme photolyase cleaves the two colored bonds during photoreactivation.

However, mutations may occur and not alter the phenotype for a variety of reasons. Although very large deletion and insertion mutations exist, most mutations affect only one base pair in a given location and therefore are called point mutations. There are several types of point mutations (table 11.2).

One kind of point mutation that could not be detected until the advent of nucleic acid sequencing techniques is the silent mutation. If a mutation is an alteration of the nucleotide sequence of DNA, mutations can occur and have no visible effect because of code degeneracy. When there is more than one codon for a given amino acid, a single base substitution could result in the formation of a new codon for the same amino acid. For example, if the codon CGU were changed to CGC, it usually would still code for arginine even though a mutation had occurred. The expression of this mutation often would not be detected except at the level of the DNA or mRNA.
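The CGU to CGC example can be checked against the standard genetic code. The sketch below, written in Python for illustration, tries all nine single-base substitutions of a codon and reports those that leave the amino acid unchanged, labeling each as a transition or a transversion in the sense defined earlier in the chapter. The function names and the compact codon-table encoding are assumptions made for this example; stop codons are written as "*".

```python
# Standard genetic code in compact form (base order U, C, A, G at each position).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PURINES = {"A", "G"}


def substitution_kind(old, new):
    """Transition: purine<->purine or pyrimidine<->pyrimidine; else transversion."""
    return "transition" if (old in PURINES) == (new in PURINES) else "transversion"


def silent_substitutions(codon):
    """List (mutant codon, kind) for single-base changes that keep the amino acid."""
    original = CODON_TABLE[codon]
    hits = []
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == original:
                hits.append((mutant, substitution_kind(codon[pos], base)))
    return hits


print(silent_substitutions("CGU"))
# [('CGC', 'transition'), ('CGA', 'transversion'), ('CGG', 'transversion')]
print(silent_substitutions("AUG"))   # [] (methionine has a single codon)
```

For CGU every silent change falls in the third codon position, which is the pattern expected from code degeneracy.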
When there is no change in the protein or its concentration, there will be no change in the phenotype of the organism.

Table 11.2 Summary of Some Molecular Changes from Gene Mutations

Type of Mutation: Result and Example

Forward Mutations
  Single Nucleotide-Pair (Base-Pair) Substitutions
    At DNA Level
      Transition: Purine replaced by a different purine, or pyrimidine replaced by a different pyrimidine (e.g., AT → GC).
      Transversion: Purine replaced by a pyrimidine, or pyrimidine replaced by a purine (e.g., AT → CG).
    At Protein Level
      Silent mutation: Triplet codes for same amino acid: AGG → CGG, both code for Arg.
      Neutral mutation: Triplet codes for different but functionally equivalent amino acid: AAA (Lys) → AGA (Arg).
      Missense mutation: Triplet codes for a different amino acid.
      Nonsense mutation: Triplet codes for chain termination: CAG (Gln) → UAG (stop).
  Single Nucleotide-Pair Addition or Deletion (Frameshift Mutation): Any addition or deletion of base pairs that is not a multiple of three results in a frameshift in reading the DNA segments that code for proteins.
  Intragenic Addition or Deletion of Several to Many Nucleotide Pairs

Reverse Mutations
  True Reversion: AAA (Lys, wild type) → forward → GAA (Glu, mutant) → reverse → AAA (Lys, wild type).
  Equivalent Reversion: UCC (Ser, wild type) → forward → UGC (Cys, mutant) → reverse → AGC (Ser, wild type).
    CGC (Arg, basic, wild type) → forward → CCC (Pro, not basic, mutant) → reverse → CAC (His, basic, pseudo-wild type).

Suppressor Mutations
  Intragenic Suppressor Mutations: Frameshift of opposite sign at site within gene. Addition of X to the base sequence CATCATCATCATCATCAT (+) shifts the reading frame from the CAT codon to XCA followed by TCA codons. The subsequent deletion of a C base (−) shifts the reading frame back to CAT, giving CATXCATATCATCATCAT.
  Extragenic Suppressor Mutations
    Nonsense suppressors: Gene (e.g., for tyrosine tRNA) undergoes mutational event in its anticodon region that enables it to recognize and align with a mutant nonsense codon (e.g., UAG) to insert an amino acid (tyrosine) and permit completion of the translation.
    Physiological suppressors: A defect in one chemical pathway is circumvented by another mutation, for example, one that opens up another chemical pathway to the same product, or one that permits more efficient uptake of a compound produced in small quantities because of the original mutation.

From An Introduction to Genetic Analysis, 3rd edition by Suzuki, Griffiths, Miller and Lewontin. Copyright © 1986 by W. H. Freeman and Company. Used with permission.

A second type of point mutation is the missense mutation. This mutation involves a single base substitution in the DNA that changes a codon for one amino acid into a codon for another. For example, the codon GAG, which specifies glutamic acid, could be changed to GUG, which codes for valine. The expression of missense mutations can vary. Certainly the mutation is expressed at the level of protein structure. However, at the level of protein function, the effect may range from complete loss of activity to no change at all.

Mutations also occur in the regulatory sequences responsible for the control of gene expression and in other noncoding portions of structural genes. Constitutive lactose operon mutants in E. coli are excellent examples. These mutations map in the operator site and produce altered operator sequences that are not recognized by the repressor protein, and therefore the operon is continuously active in transcription. If a mutation renders the promoter sequence nonfunctional, the coding region of the structural gene will be completely normal, but a mutant phenotype will result due to the absence of a product. RNA polymerase rarely transcribes a gene correctly without a fully functional promoter.

The lac operon and gene regulation (pp. 275–78)

Mutations also occur in rRNA and tRNA genes and can alter the phenotype through disruption of protein synthesis. In fact, these mutants often are initially identified because of their slow growth. One type of suppressor mutation is a base substitution in the anticodon region of a tRNA that allows the insertion of the correct amino acid at a mutant codon (table 11.2).

1. Define or describe the following: mutation, conditional mutation, auxotroph and prototroph, spontaneous and induced mutations, mutagen, transition and transversion mutations, frameshift, apurinic site, base analog, specific mispairing, intercalating agent, thymine dimer, wild type, forward and reverse mutations, suppressor mutation, point mutation, silent mutation, missense and nonsense mutations, directed or adaptive mutation and hypermutation, and frameshift mutation.
2. Give four ways in which spontaneous mutations might arise.
3. How do the mutagens 5-bromouracil, methyl-nitrosoguanidine, proflavin, and UV radiation induce mutations?
4. Give examples of intragenic and extragenic suppressor mutations.

11.7 Detection and Isolation of Mutants

In order to study microbial mutants, one must be able to detect them readily, even when there are few, and then efficiently isolate them from the parent organism and other mutants. Fortunately this often is easy to do. This section describes some techniques used in mutant detection, selection, and isolation.
Mutant Detection

When collecting mutants of a particular organism, one must know the normal or wild-type characteristics so as to recognize an altered phenotype. A suitable detection system for the mutant phenotype under study also is needed. Since mutations are generally rare, about one per 10^7 to 10^11 cells, it is important to have a very sensitive detection system so that these events will not be missed. Geneticists often induce mutations to increase the probability of obtaining specific changes at high frequency (about one in 10^3 to 10^6); even so, mutations are rare.

Many proteins are still functional after the substitution of a single amino acid, but this depends on the type and location of the amino acid. For instance, replacement of a nonpolar amino acid in the protein’s interior with a polar amino acid probably will drastically alter the protein’s three-dimensional structure and therefore its function. Similarly the replacement of a critical amino acid at the active site of an enzyme will destroy its activity. However, the replacement of one polar amino acid with another at the protein surface may have little or no effect. Missense mutations may actually play a very important role in providing new variability to drive evolution because they often are not lethal and therefore remain in the gene pool.

Protein structure (appendix I)

Figure 11.29 Frameshift Mutation. A frameshift mutation resulting from the insertion of a GC base pair. The reading frameshift produces a different peptide after the addition.

Wild type
DNA (template strand): TACGGTATGACC
DNA (complementary strand): ATGCCATACTGG
mRNA: AUG CCA UAC UGG
Peptide: Met Pro Tyr Trp

Mutant strain (GC addition)
DNA (template strand): TACGGTCATGACC
DNA (complementary strand): ATGCCAGTACTGG
mRNA: AUG CCA GUA CUG G
Peptide: Met Pro Val Leu

A third type of point mutation causes the early termination of translation and therefore results in a shortened polypeptide. Such mutations are called nonsense mutations because they involve the conversion of a sense codon to a nonsense or stop codon.
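The reading-frame shift in figure 11.29 can be checked with a few lines of Python. This is a minimal sketch and not part of the original text: the codon table holds only the codons needed for these sequences, the function name `translate` is hypothetical, and the nonsense example (UAC changed to UAG) is an invented illustration of a premature stop codon.

```python
# Subset of the standard genetic code: only the codons used below.
CODON_TABLE = {
    "AUG": "Met", "CCA": "Pro", "UAC": "Tyr", "UGG": "Trp",
    "GUA": "Val", "CUG": "Leu",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def translate(mrna):
    """Read an mRNA three bases at a time, halting at a stop codon.

    Bases left over at the 3' end that do not fill a codon are ignored.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


wild_type = "AUGCCAUACUGG"    # mRNA from figure 11.29
frameshift = "AUGCCAGUACUGG"  # same mRNA with one G inserted after AUGCCA
nonsense = "AUGCCAUAGUGG"     # hypothetical: UAC (Tyr) changed to UAG (stop)

print(translate(wild_type))   # ['Met', 'Pro', 'Tyr', 'Trp']
print(translate(frameshift))  # ['Met', 'Pro', 'Val', 'Leu']
print(translate(nonsense))    # ['Met', 'Pro']
```

The single inserted base changes every codon downstream of the insertion, whereas the nonsense substitution leaves the upstream codons intact and simply ends the peptide early.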
Depending on the relative location of the mutation, the phenotypic expression may be more or less severely affected. Most proteins retain some function if they are shortened by only one or two amino acids; complete loss of normal function will almost certainly result if the mutation occurs closer to the middle of the gene.

The frameshift mutation is a fourth type of point mutation and was briefly mentioned earlier. Frameshift mutations arise from the insertion or deletion of one or two base pairs within the coding region of the gene. Since the code consists of a precise sequence of triplet codons, the addition or deletion of fewer than three base pairs will cause the reading frame to be shifted for all codons downstream (figure 11.19). Figure 11.29 shows the effect of a frameshift mutation on a short section of mRNA and the amino acid sequence it codes for. Frameshift mutations usually are very deleterious and yield mutant phenotypes resulting from the synthesis of nonfunctional proteins. The reading frameshift often eventually produces a nonsense or stop codon so that the peptide product is shorter as well as different in sequence. Of course if the frameshift occurred near the end of the gene, or if there were a second frameshift shortly downstream from the first that restored the reading frame, the phenotypic effect might not be as drastic. A second nearby frameshift that restores the proper reading frame is a good example of an intragenic suppressor mutation.

Detection systems in bacteria and other haploid organisms are straightforward because any new allele should be seen immediately, even if it is a recessive mutation. Sometimes detection of mutants is direct.
If albino mutants of a normally pigmented bacterium are being studied, detection simply requires visual observation of colony color. Other direct detection systems are more complex. For example, the replica plating technique is used to detect auxotrophic mutants. It distinguishes between mutants and the wild-type strain based on their ability to grow in the absence of a particular biosynthetic end product (figure 11.30). A lysine auxotroph, for instance, will grow on lysine-supplemented media but not on a medium lacking an adequate supply of lysine because it cannot synthesize this amino acid.

Once a detection method is established, mutants are collected. Since a specific mutation is a rare event, it is necessary to look at perhaps thousands to millions of colonies or clones. Using direct detection methods, this could become quite a task, even with microorganisms. Consider a search for the albino mutants mentioned previously. If the mutation rate were around one in a million, on the average a million or more organisms would have to be tested to find one albino mutant. This probably would require several thousand plates. The task of isolating auxotrophic mutants in this way would be even more taxing with the added labor of replica plating. This difficulty can be partly overcome by using mutagens to increase the mutation rate, thus reducing the number of colonies to be examined. However, it is more efficient to use a selection system employing some environmental factor to separate mutants from wild-type microorganisms.

[Figure 11.30, upper panel labels: handle; velvet surface (sterilized); master plate (complete medium). Steps: Treatment of E. coli cells with a mutagen, such as nitrosoguanidine. Inoculate a plate containing complete growth medium and incubate. Both wild-type and mutant survivors will form colonies.]

Mutant Selection

An effective selection technique uses incubation conditions under which the mutant will grow, because of properties given it by the mutation, whereas the wild type will not.
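The difference in workload between direct screening and selection can be made concrete with a rough calculation. The sketch below is illustrative only. The figure of 300 separate colonies per screening plate and the value of 2 million cells per selective plate are assumptions chosen to match the orders of magnitude mentioned in the text, and the function name `plates_needed` is hypothetical.

```python
import math


def plates_needed(cells_per_mutant, cells_per_plate):
    """Plates required to examine, on average, enough cells to include one mutant."""
    return math.ceil(cells_per_mutant / cells_per_plate)


cells_per_mutant = 1_000_000           # one mutant per million cells (albino example)
colonies_per_screen_plate = 300        # assumed: separate colonies inspected one by one
cells_per_selective_plate = 2_000_000  # assumed value for "several million" cells per dish

print(plates_needed(cells_per_mutant, colonies_per_screen_plate))  # 3334
print(plates_needed(cells_per_mutant, cells_per_selective_plate))  # 1
```

Under these assumptions direct screening needs several thousand plates, while a single selective plate can test the same number of cells because only the mutants form colonies.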
Selection methods often involve reversion mutations or the development of resistance to an environmental stress. For example, if the intent is to isolate revertants from a lysine auxotroph (Lys−), the approach is quite easy. A large population of lysine auxotrophs is plated on minimal medium lacking lysine, incubated, and examined for colony formation. Only cells that have mutated to restore the ability to manufacture lysine will grow on minimal medium (figure 11.31). Thus several million cells can be plated on a single petri dish, and many cells can be tested for mutations by scanning a few petri dishes for growth. This is because the auxotrophs will not grow on minimal medium and confuse the results; only the phenotypic revertants will form colonies. This method has proven very useful in determining the relative mutagenicity of many substances.

Resistance selection methods follow a similar approach. Often wild-type cells are not resistant to virus attack or antibiotic treatment, so it is possible to grow the bacterium in the presence of the agent and look for surviving organisms. Consider the example of a phage-sensitive wild-type bacterium. When the organism is cultured in medium lacking the virus and then plated out on selective medium containing phages, any colonies that form will be resistant to phage attack and very likely will be mutants in this regard. Resistance selection can be used together with virtually any environmental parameter; resistance to bacteriophages, antibiotics, or temperature are most commonly employed.

Substrate utilization mutations also are employed in bacterial selection. Many bacteria use only a few primary carbon sources. With such bacteria, it is possible to select mutants by a method similar to that employed in resistance selection. The culture is plated on medium containing an alternate carbon source. Any colonies that appear can use the substrate and are probably mutants.

Mutant detection and selection methods are used for purposes other than understanding more about the nature of genes or the biochemistry of a particular microorganism.

Figure 11.30 Replica Plating. The use of replica plating in isolating a lysine auxotroph. Mutants are generated by treating a culture with a mutagen. The culture containing wild type and auxotrophs is plated on complete medium. After the colonies have developed, a piece of sterile velveteen is pressed on the plate surface to pick up bacteria from each colony. Then the velvet is pressed to the surface of other plates and organisms are transferred to the same position as on the master plate. After determining the location of Lys− colonies growing on the replica with complete medium, the auxotrophs can be isolated and cultured.
[Figure 11.30, lower panel labels: replica plate (complete medium), all strains grow; replica plate (medium minus lysine), lysine auxotrophs do not grow; incubation; culture lysine auxotroph (Lys−).]

Figure 11.31 Mutant Selection. The production and direct selection of auxotroph revertants. In this example, lysine revertants will be selected after treatment of a lysine auxotroph culture because the agar contains minimal medium that will not support auxotroph growth.
[Figure 11.31 steps: Treatment of lysine auxotrophs (Lys−) with a mutagen such as nitrosoguanidine or UV radiation to produce revertants. Plate out mixture on minimal medium (which lacks lysine). Incubate. Only prototrophs able to synthesize lysine will grow.]

Figure 11.32 The Ames Test for Mutagenicity. See text for details.
[Figure 11.32 labels: culture of Salmonella histidine auxotrophs; complete medium plus a small amount of histidine; medium with test mutagen and a small amount of histidine; plate culture; incubate at 37°C; spontaneous revertants; revertants induced by the mutagen.]
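The comparison made in replica plating (figure 11.30) amounts to a set difference: colonies that grow on the complete-medium replica but are missing from the same position on the replica lacking lysine are the presumptive auxotrophs. The following is a minimal sketch of that bookkeeping. The colony positions are invented for illustration, and the function name `find_auxotrophs` is hypothetical.

```python
def find_auxotrophs(grows_on_complete, grows_on_minimal):
    """Return colony positions present on complete medium but absent without lysine."""
    return sorted(grows_on_complete - grows_on_minimal)


# Hypothetical colony positions (row, column), identical on every replica
# because the velvet transfers each colony to the same place.
complete_replica = {(1, 2), (2, 5), (3, 1), (4, 4)}
minus_lysine_replica = {(1, 2), (3, 1)}

print(find_auxotrophs(complete_replica, minus_lysine_replica))  # [(2, 5), (4, 4)]
```

The positions returned are the ones to pick from the complete-medium plate for isolation and culture.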
One very important role of mutant selection and detection techniques is in the study of carcinogens. The next section briefly describes one of the first and perhaps best known of the carcinogen testing systems.

Carcinogenicity Testing

An increased understanding of the mechanisms of mutation and cancer induction has stimulated efforts to identify environmental carcinogens so that they can be avoided. The observation that many carcinogenic agents also are mutagenic is the basis for detecting potential carcinogens by testing for mutagenicity while taking advantage of bacterial selection techniques and short generation times. The Ames test, developed by Bruce Ames in the 1970s, has been widely used to test for carcinogens. The Ames test is a mutational reversion assay employing several special strains of Salmonella typhimurium, each of which has a different mutation in the histidine biosynthesis operon. The bacteria also have mutational alterations of their cell walls that make them more permeable to test substances. To further increase assay sensitivity, the strains are defective in the ability to carry out repair of DNA and have plasmid genes that enhance error-prone DNA repair.

In the Ames test these special tester strains of Salmonella are plated with the substance being tested and the appearance of visible colonies followed (figure 11.32). To ensure that DNA replication can take place in the presence of the potential mutagen, the bacteria and test substance are mixed in dilute molten top agar to which a trace of histidine has been added. This molten mix is then poured on top of minimal agar plates and incubated for 2 to 3 days at 37°C. All of the histidine auxotrophs will grow for the first few hours in the presence of the test compound until the histidine is depleted. Once the histidine supply is exhausted, only revertants that have mutationally regained the ability to synthesize histidine will grow. The visible colonies need only be counted and compared to controls in order to estimate the relative mutagenicity of the compound: the more colonies, the greater the mutagenicity.

A mammalian liver extract is also often added to the molten top agar prior to plating. The extract converts potential carcinogens into electrophilic derivatives that will readily react with DNA. This process occurs naturally when foreign substances are metabolized in the liver. Since bacteria do not have this activation system, liver extract often is added to the test system to promote the transformations that occur in mammals. Many potential carcinogens, such as the aflatoxins (see pp. 967–68), are not actually carcinogenic until they are modified in the liver. The addition of extract shows which compounds have intrinsic mutagenicity and which need activation after uptake. Despite the use of liver extracts, only about half the potential animal carcinogens are detected by the Ames test.

1. Describe how replica plating is used to detect and isolate auxotrophic mutants.
2. Why are mutant selection techniques generally preferable to the direct detection and isolation of mutants?
3. Briefly discuss how reversion mutations, resistance to an environmental factor, and the ability to use a particular nutrient can be employed in mutant selection.
4. What is the Ames test and how is it carried out? What assumption concerning mutagenicity and carcinogenicity is it based upon?

11.8 DNA Repair

Since replication errors and a variety of mutagens can alter the nucleotide sequence, a microorganism must be able to repair changes in the sequence that might be fatal.
DNA is repaired by several different mechanisms besides proofreading by replication enzymes (DNA polymerases can remove an incorrect nucleotide immediately after its addition to the growing end of the chain). Repair in E. coli is best understood and is briefly described in this section.

DNA replication and proofreading (pp. 235–39)

Excision Repair

Excision repair is a general repair system that corrects damage that causes distortions in the double helix. A repair endonuclease or uvrABC endonuclease removes the damaged bases along with some bases on either side of the lesion (figure 11.33). The resulting single-stranded gap, about 12 nucleotides long, is filled by DNA polymerase I, and DNA ligase joins the fragments (p. 239). This system can remove thymine dimers (figure 11.28) and repair almost any other injury that produces a detectable distortion in DNA.

Besides this general excision repair system, specialized versions of the system excise specific sites on the DNA where the sugar phosphate backbone is intact but the bases have been removed to form apurinic or apyrimidinic sites (AP sites). Special endonucleases called AP endonucleases recognize these locations and nick the backbone at the site. Excision repair then commences, beginning with the excision of a short stretch of nucleotides.

Another type of excision repair employs DNA glycosylases. These enzymes remove damaged or unnatural bases yielding AP sites that are then repaired as above. Not all types of damaged bases are repaired in this way, but new glycosylases are being discovered and the process may be of more general importance than first thought.

Removal of Lesions

Thymine dimers and alkylated bases often are directly repaired. Photoreactivation is the repair of thymine dimers by splitting them apart into separate thymines with the help of visible light in a photochemical reaction catalyzed by the enzyme photolyase.
Because this repair mechanism does not remove and replace nucleotides, it is error free.

Sometimes damage caused by alkylation is repaired directly as well. Methyls and some other alkyl groups that have been added to the O-6 position of guanine can be removed with the help of an enzyme known as alkyltransferase or methylguanine methyltransferase. Thus damage to guanine from mutagens such as methyl-nitrosoguanidine (figure 11.27) can be repaired directly.

Postreplication Repair

Despite the accuracy of DNA polymerase action and continual proofreading, errors still are made during DNA replication. Remaining mismatched bases and other errors are usually detected and repaired by the mismatch repair system in E. coli. The mismatch correction enzyme scans the newly replicated DNA for mismatched pairs and removes a stretch of newly synthesized DNA around the mismatch. A DNA polymerase then replaces the excised nucleotides, and the resulting nick is sealed with a ligase. Postreplication repair is a type of excision repair.

Successful postreplication repair depends on the ability of enzymes to distinguish between old and newly replicated DNA strands. This distinction is possible because newly replicated DNA strands lack methyl groups on their bases, whereas older DNA has methyl groups on the bases of both strands. DNA methylation is catalyzed by DNA methyltransferases and results in three different products: N6-methyladenine, 5-methylcytosine, and N4-methylcytosine. After strand synthesis, the E. coli DNA adenine methyltransferase (DAM) methylates adenine bases in d(GATC) sequences to form N6-methyladenine. For a short time after the replication fork has passed, the new strand lacks methyl groups while the template strand is methylated. The repair system cuts out the mismatch from the unmethylated strand.

Figure 11.33 Excision Repair. Excision repair of a thymine dimer that has distorted the double helix. The repair endonuclease or uvrABC endonuclease is coded for by the uvrA, B, and C genes.
[Figure 11.33 steps: Repair enzyme bound to DNA. Enzyme detects distortion due to dimer. Endonuclease activity cuts damaged DNA strand, 8 nucleotides to 5′ side of the dimer and 4 or 5 nucleotides to 3′ side. Damaged segment diffuses away. DNA polymerase I fills gap and DNA ligase seals remaining nick.]

Recombination Repair

In recombination repair, damaged DNA for which there is no remaining template is restored. This situation arises if both bases of a pair are missing or damaged, or if there is a gap opposite a lesion. In this type of repair the recA protein cuts a piece of template DNA from a sister molecule and puts it into the gap or uses it to replace a damaged strand. Although bacteria are haploid, another copy of the damaged segment often is available because either it has recently been replicated or the cell is growing rapidly and has more than one copy of its chromosome. Once the template is in place, the remaining damage can be corrected by another repair system.

The recA protein also participates in a type of inducible repair known as SOS repair. In this instance the DNA damage is so great that synthesis stops completely, leaving many large gaps. RecA will bind to the gaps and initiate strand exchange. Simultaneously it takes on a proteolytic function that destroys the lexA repressor protein, which regulates the function of many genes involved in DNA repair and synthesis (figure 11.34). As a result many more copies of these enzymes are produced, accelerating the replication and repair processes. The system can quickly repair extensive damage caused by agents such as UV radiation, but it is error prone and does produce mutations.
However, it is certainly better to have a few mutations than no DNA replication at all.

1. Define the following: proofreading, excision repair, photoreactivation, methylguanine methyltransferase, mismatch repair, DNA methylation, recombination repair, recA protein, SOS repair, and lexA repressor.
2. Describe in general terms the mechanisms of the following repair processes: excision repair, recombination repair, and SOS repair.

Figure 11.34 The SOS Repair Process. In the absence of damage, repair genes are expressed in E. coli at low levels due to binding of the lexA repressor protein at their operators (O). When the recA protein binds to a damaged region (for example, a thymine dimer created by UV radiation), it destroys lexA and the repair genes are expressed more actively. The uvr genes code for the repair endonuclease or uvrABC endonuclease responsible for excision repair.
[Figure 11.34 labels: replicating chromosomes; lexA, recA, uvrA, and uvrB genes, each with an operator (O); to other regulated genes; UV radiation; T=T; bound recA protein has protease activity; cleavage of lexA repressor proteins by bound recA.]

Summary

1. The knowledge that DNA is the genetic material for cells came from studies on transformation by Griffith and Avery and from experiments on T2 phage reproduction by Hershey and Chase.
2. DNA differs in composition from RNA in having deoxyribose and thymine rather than ribose and uracil.
3. DNA is double stranded, with complementary AT and GC base pairing between the strands. The strands run antiparallel and are twisted into a right-handed double helix (figure 11.6).
4. RNA is normally single stranded, although it can coil upon itself and base pair to form hairpin structures.
5. In almost all procaryotes DNA exists as a closed circle that is twisted into supercoils and associated with histonelike proteins.
6. Eucaryotic DNA is associated with five types of histone proteins. Eight histones associate to form ellipsoidal octamers around which the DNA is coiled to produce the nucleosome (figure 11.9).
7. DNA synthesis is called replication. Transcription is the synthesis of an RNA copy of DNA and produces three types of RNA: messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA).
8. The synthesis of protein under the direction of mRNA is called translation.
9. Most circular procaryotic DNAs are copied by two replication forks moving around the circle to form a theta-shaped (θ) figure. Sometimes a rolling-circle mechanism is employed instead.
10. Eucaryotic DNA has many replicons and replication origins located every 10 to 100 µm along the DNA.
11. DNA polymerase enzymes catalyze the synthesis of DNA in the 5′ to 3′ direction while reading the DNA template in the 3′ to 5′ direction.
12. The double helix is unwound by helicases with the aid of topoisomerases like the DNA gyrase. DNA binding proteins keep the strands separate.
13. DNA polymerase III holoenzyme synthesizes a complementary DNA copy beginning with a short RNA primer made by a primase enzyme.
14. The leading strand is probably replicated continuously, whereas DNA synthesis on the lagging strand is discontinuous and forms Okazaki fragments (figures 11.15 and 11.16).
15. DNA polymerase I excises the RNA primer and fills in the resulting gap. DNA ligase then joins the fragments together.
16. Genetic information is carried in the form of 64 nucleotide triplets called codons (table 11.1); sense codons direct amino acid incorporation, and stop or nonsense codons terminate translation.
17. The code is degenerate; that is, there is more than one codon for most amino acids.
18. A gene may be defined as the nucleic acid sequence that codes for a polypeptide, tRNA, or rRNA.
19. The template strand of DNA carries genetic information and directs the synthesis of the RNA transcript.
20. RNA polymerase binds to the promoter region, which contains RNA polymerase recognition and RNA polymerase binding sites (figure 11.22).
21. The gene also contains a coding region and a terminator; it may have a leader and a trailer (figure 11.21). Regulatory segments such as operators may be present.
22. The genes for tRNA and rRNA often code for a precursor that is subsequently processed to yield several products.
23. A mutation is a stable, heritable change in the nucleotide sequence of the genetic material, usually DNA.
24. Mutations can be divided into many categories based on their effects on the phenotype; some major types are morphological, lethal, conditional, biochemical, and resistance mutations.
25. Spontaneous mutations can arise from replication errors (transitions, transversions, and frameshifts), from DNA lesions (apurinic sites, apyrimidinic sites, oxidations), and from insertions.
26. Induced mutations are caused by mutagens. Mutations may result from the incorporation of base analogs, specific mispairing due to alteration of a base, the presence of intercalating agents, and a bypass of replication because of severe damage. Starvation and environmental stresses may stimulate mutator genes and lead to hypermutation.
27. The mutant phenotype can be restored to wild type by either a true reverse mutation or a suppressor mutation (table 11.2).
28. There are four important types of point mutations: silent mutations, missense mutations, nonsense mutations, and frameshift mutations (table 11.2).
29. It is essential to have a sensitive and specific detection technique to isolate mutants; an example is replica plating (figure 11.30) for the detection of auxotrophs (a direct detection system).
30. One of the most effective isolation techniques is to adjust environmental conditions so that the mutant will grow while the wild-type organism does not.
31. Because many carcinogens are also mutagenic, one can test for mutagenicity with the Ames test (figure 11.32) and use the results as an indirect indication of carcinogenicity.
32. Mutations and DNA damage are repaired in several ways; for example: proofreading by replication enzymes, excision repair, removal of lesions (e.g., photoreactivation), postreplication repair (mismatch repair), and recombination repair.

Key Terms

Ames test 253
apurinic site 246
apyrimidinic site 246
auxotroph 245
back mutation 248
base analog 246
cistron 241
clone 228
code degeneracy 240
coding region 244
codon 240
complementary 231
conditional mutation 245
deoxyribonucleic acid (DNA) 230
directed or adaptive mutation 246
DNA gyrase 237
DNA ligase 239
DNA methylation 254
DNA polymerase 236
excision repair 254
forward mutation 248
frameshift 246
frameshift mutation 251
gene 241
genome 228
helicase 236
histone 234
hypermutation 246
intercalating agent 248
leader sequence 244
major groove 231
messenger RNA (mRNA) 230
minor groove 231
mismatch repair system 254
missense mutation 250
mutagen 246
mutation 244
nonsense mutation 251
nucleosome 235
Okazaki fragment 239
photoreactivation 254
point mutation 249
Pribnow box 244
primase 239
primosome 239
promoter 242
proofreading 254
prototroph 245
reading frame 241
recA protein 255
recombination repair 255
replica plating 252
replication 230
replication fork 235
replicon 235
reversion mutation 248
ribonucleic acid (RNA) 230
RNA polymerase binding site or Pribnow box 243
rolling-circle mechanism 236
sense codons 240
Shine-Dalgarno sequence 244
silent mutation 249
single-stranded DNA binding proteins (SSBs) 237
SOS repair 255
specific mispairing 246
stop or nonsense codons 241
suppressor mutation 248
template strand 242
terminator sequence 244
topoisomerase 237
trailer sequence 244
transcription 230
transformation 228
transition mutation 246
translation 230
transversion mutation 246
wild type 248
wobble 241

Questions for Thought and Review

1. How do replication patterns differ between procaryotes and eucaryotes? Describe the operation of replication forks in the generation of theta-shaped intermediates and in the rolling-circle mechanism.
2. Outline the steps involved in DNA synthesis at the replication fork. How do DNA polymerases correct their mistakes?
3. Currently a gene is described in several ways. Which definition do you prefer and why?
4. How could one use small deletion mutations to show that codons are triplet (i.e., that the nucleotide sequence is read three bases at a time rather than two or four)?
5. Sometimes a point mutation does not change the phenotype. List all the reasons you can why this is so.
6. Why might a mutation leading to an amino acid change at a protein’s surface not result in a phenotypic change while the substitution of an internal amino acid will?
7. Describe how you would isolate a mutant that required histidine for growth and was resistant to penicillin.
8. How would the following DNA alterations and replication errors be corrected (there may be more than one way): base addition errors by DNA polymerase III during replication, thymine dimers, AP sites, methylated guanines, and gaps produced during replication?

Critical Thinking Questions

1. Mutations are often considered harmful. Give an example of a mutation that would be beneficial to a microorganism. What gene would bear the mutation?
How would the mutation alter the gene’s role in the cell, and what conditions would select for this mutant allele?
2. Mistakes made during transcription affect the cell, but are not considered “mutations.” Why not?
3. Given what you know about the difference between procaryotic and eucaryotic cells, give two reasons why the Ames test detects only about half of potential carcinogens, even when liver extracts are used.
4. Suppose that you have isolated a microorganism from a soil sample. Describe how you would go about determining the nature of its genetic material.

Prescott−Harley−Klein: Microbiology, Fifth Edition IV. Microbial Molecular Biology and Genetics 12. Genes: Expression and Regulation © The McGraw−Hill Companies, 2002

CHAPTER 12
Genes: Expression and Regulation

Lactose operon activity is under the control of a repressor protein. The lac repressor (violet) and catabolite activator protein (blue) are bound to the lac operon. The repressor blocks transcription when bound to the operators (red).
Outline
12.1 DNA Transcription or RNA Synthesis 261
    Transcription in Procaryotes 261
    Transcription in Eucaryotes 263
12.2 Protein Synthesis 265
    Transfer RNA and Amino Acid Activation 266
    The Ribosome 267
    Initiation of Protein Synthesis 268
    Elongation of the Polypeptide Chain 270
    Termination of Protein Synthesis 270
    Protein Folding and Molecular Chaperones 272
    Protein Splicing 275
12.3 Regulation of mRNA Synthesis 275
    Induction and Repression 275
    Negative Control 276
    Positive Control 278
12.4 Attenuation 279
12.5 Global Regulatory Systems 281
    Catabolite Repression 281
    Regulation by Sigma Factors and Control of Sporulation 282
    Antisense RNA and the Control of Porin Proteins 282
12.6 Two-Component Phosphorelay Systems 283
12.7 Control of the Cell Cycle 285

Concepts
1. In transcription the RNA polymerase copies the appropriate sequence on the DNA template strand to produce a complementary RNA copy of the gene. Transcription differs in a number of ways between procaryotes and eucaryotes, even though the basic mechanism of RNA polymerase action is essentially the same.
2. Translation is the process by which the nucleotide sequence of mRNA is converted into the amino acid sequence of a polypeptide through the action of ribosomes, tRNAs, aminoacyl-tRNA synthetases, ATP and GTP energy, and a variety of protein factors. As in the case of DNA replication, this complex process is designed to minimize errors.
3. The long-term regulation of metabolism in bacteria is achieved through the control of transcription by such mechanisms as sigma factors, repressor proteins during induction and repression, and by the attenuation of many biosynthetic operons.
4. Procaryotes must be able to respond rapidly to changing environmental conditions and often control many operons simultaneously using global regulatory systems.
5. DNA replication and cell division are coordinated in such a way that the distribution of new DNA copies to each daughter cell is ensured.

The particular field which excites my interest is the division between the living and the non-living, as typified by, say, proteins, viruses, bacteria and the structure of chromosomes. The eventual goal, which is somewhat remote, is the description of these activities in terms of their structure, i.e., the spatial distribution of their constituent atoms, in so far as this may prove possible. This might be called the chemical physics of biology.
(Francis Crick)

Chapter 11 is concerned with the genetic material of microorganisms. Its focus is on the structure and replication of DNA, the nature of the genetic code and genes, and the way in which genes change by mutation. This provides background for understanding chapters 12 through 15. Chapter 12 is devoted to genetic expression and its regulation. An adequate knowledge of transcription and protein synthesis is essential to understanding molecular biology and microbial genetics. Thus the first two sections describe DNA transcription and protein synthesis in some detail. Because the role of these processes in the life of microorganisms depends on their proper regulation, the second half of the chapter is devoted to regulation. The control of mRNA synthesis and attenuation are discussed first. Then more complex levels of regulation are described in sections on global regulatory systems, two-component phosphorelay systems, and control of the bacterial cell cycle.

12.1 DNA Transcription or RNA Synthesis

As mentioned earlier, synthesis of RNA under the direction of DNA is called transcription. The RNA product has a sequence complementary to the DNA template directing its synthesis (table 12.1). Thymine is not normally found in mRNA and rRNA. Although adenine directs the incorporation of thymine during DNA replication, it usually codes for uracil during RNA synthesis. Transcription generates three kinds of RNA. Messenger RNA (mRNA) bears the message for protein synthesis. Transfer RNA (tRNA) carries amino acids during protein synthesis, and ribosomal RNA (rRNA) molecules are components of ribosomes. The structure and synthesis of procaryotic mRNA is described first.

Table 12.1 RNA Bases Coded for by DNA
DNA Base      Purine or Pyrimidine Incorporated into RNA
Adenine       Uracil
Guanine       Cytosine
Cytosine      Guanine
Thymine       Adenine

Transcription in Procaryotes

Procaryotic mRNA is a single-stranded RNA of variable length containing directions for the synthesis of one to many polypeptides. Messenger RNA molecules also have sequences that do not code for polypeptides (figure 12.1). There is a nontranslated leader sequence of 25 to 150 bases at the 5′ end preceding the initiation codon. In addition, polygenic mRNAs (those directing the synthesis of more than one polypeptide) have spacer regions separating the segments coding for individual polypeptides. The polypeptides encoded by a polygenic messenger usually function together in some way (e.g., as part of the same metabolic pathway). At the 3′ end, following the last termination codon, is a nontranslated trailer.

Figure 12.1 A Polygenic Bacterial Messenger RNA. [Diagram labels, 5′ to 3′: leader, gene 1, spacer, gene 2, trailer; initiation codon (usually AUG); termination codon (UAA, UAG, UGA).] See text for details.

Messenger RNA is synthesized under the direction of DNA by the enzyme RNA polymerase. An E. coli cell can have as many as 7,000 RNA polymerase molecules; only 2,000 to 5,000 polymerases may be active at any one time. The reaction is quite similar to that catalyzed by DNA polymerase. ATP, GTP, CTP, and UTP are used to produce an RNA copy of the DNA sequence.
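The base-pairing rules of table 12.1 can be illustrated with a short Python sketch. It is only a model of complementarity, not of the enzyme: the function name `transcribe` and the example template are assumptions made for illustration, and the template strand is assumed to be written 3′ to 5′ so that the RNA comes out in the usual 5′ to 3′ orientation.

```python
# Base-pairing rules from table 12.1: DNA template base -> RNA base.
DNA_TO_RNA = {"A": "U", "G": "C", "C": "G", "T": "A"}


def transcribe(template_3_to_5):
    """Return the RNA (written 5'->3') complementary to a DNA template
    strand that is supplied in the 3'->5' direction (an assumption of
    this sketch)."""
    template = template_3_to_5.upper()
    unknown = set(template) - set(DNA_TO_RNA)
    if unknown:
        raise ValueError("not a DNA base: " + ", ".join(sorted(unknown)))
    # Each template base specifies exactly one ribonucleotide.
    return "".join(DNA_TO_RNA[base] for base in template)


# Hypothetical template strand, written 3'->5'.
print(transcribe("TACGGCATT"))  # AUGCCGUAA
```

Note that thymine never appears in the output: wherever the template carries adenine, the transcript carries uracil.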
As mentioned earlier, these nucleotides contain ribose rather than deoxyribose.

$$ n\,[\mathrm{ATP},\ \mathrm{GTP},\ \mathrm{CTP},\ \mathrm{UTP}] \xrightarrow[\text{DNA template}]{\text{RNA polymerase}} \mathrm{RNA} + n\,\mathrm{PP_i} $$

Figure 12.2 mRNA Transcription From DNA. [Diagram labels: promoter, coding region, terminator, template strand, RNA polymerase, sigma factor, mRNA. Panels: binding of RNA polymerase to the promoter; open promoter complex; elongation of RNA; termination of synthesis.] The lower DNA strand directs mRNA synthesis, and the RNA polymerase moves from left to right. Transcription has been simplified for clarity and divided into three general phases: binding of RNA polymerase to the promoter with the aid of a sigma factor, synthesis of RNA by the polymerase under the direction of the template strand, and termination of the process coupled with release of the RNA product. Upon binding, the RNA polymerase unwinds DNA to form a transcription bubble or open complex so that it can copy the template strand. See text for details.

RNA synthesis, like DNA synthesis, proceeds in a 5′ to 3′ direction with new nucleotides being added to the 3′ end of the growing chain at a rate of about 40 nucleotides per second at 37°C (figure 12.2). The RNA polymerase opens or unwinds the double helix to form a transcription bubble, about 12 to 20 base pairs in length, and transcribes the template strand to produce an RNA transcript that is complementary and antiparallel to the DNA template. It should be noted that pyrophosphate is produced in both DNA and RNA polymerase reactions. Pyrophosphate is then removed by hydrolysis to orthophosphate in a reaction catalyzed by the pyrophosphatase enzyme. Removal of the pyrophosphate product makes DNA and RNA synthesis irreversible.
If the pyrophosphate level were too high, DNA and RNA would be degraded by a reversal of the polymerase reactions.

The RNA polymerase of E. coli is a very large molecule (about 480,000 daltons) containing four types of polypeptide chains: α, β, β′, and σ. The core enzyme is composed of four chains (α2, β, β′) and catalyzes RNA synthesis. The sigma factor (σ) has no catalytic activity but helps the core enzyme recognize the start of genes. Once RNA synthesis begins, the sigma factor dissociates from the core enzyme–DNA complex and is available to aid another core enzyme. There are several different sigma factors in E. coli; σ70 (about 70,000 molecular weight) is most often involved in transcription initiation. The precise functions of the α, β, and β′ polypeptides are not yet clear. The α subunit seems to be involved in the assembly of the core enzyme, recognition of promoters (see below), and interaction with some regulatory factors. The binding site for DNA is on β′, and the β subunit binds ribonucleotide substrates. Rifampin, an RNA polymerase inhibitor, binds to the β subunit.

The region to which RNA polymerase binds with the aid of the sigma factor is called the promoter. The procaryotic promoter sequence is not transcribed. A 6 base sequence (usually TTGACA), approximately 35 base pairs before the transcription starting point, is present in E. coli promoters. A TATAAT sequence or Pribnow box lies within the promoter about 10 base pairs before the starting point of transcription or around 16 to 18 base pairs from the first hexamer sequence. The RNA polymerase recognizes these sequences, binds to the promoter, and unwinds a short segment of DNA beginning around the Pribnow box. Transcription starts 6 or 7 base pairs away from the 3′ end of the promoter. The RNA polymerase remains at the promoter while it constructs a chain about 9 nucleotides long, then it begins to move down the template strand. The first base used in RNA synthesis is usually a purine, either ATP or GTP. Since these phosphates are not removed during transcription, the 5′ end of procaryotic mRNA has a triphosphate attached to the ribose.
Promoter structure and function (pp. 242–44)

Table 12.2 Eucaryotic RNA Polymerases
Enzyme                Location                     Product
RNA polymerase I      Nucleolus                    rRNA (5.8S, 18S, 28S)
RNA polymerase II     Chromatin, nuclear matrix    mRNA
RNA polymerase III    Chromatin, nuclear matrix    tRNA, 5S rRNA

Figure 12.3 Procaryotic Terminators. An example of a hairpin structure formed by an mRNA terminator sequence.

There also must be stop signals to mark the end of a gene or sequence of genes and stop transcription by the RNA polymerase. Procaryotic terminators often contain a sequence coding for an RNA stretch that can hydrogen bond to form a hairpin-shaped loop and stem structure (figure 12.3). This structure appears to cause the RNA polymerase to pause or stop transcribing DNA. There are two kinds of stop signals or terminators. The first type contains a stretch of about six uridine residues following the mRNA hairpin and causes the polymerase to stop transcription and release the mRNA without the aid of any accessory factors. The second kind of terminator lacks a poly-U region, and often the hairpin; it requires the aid of a special protein, the rho factor (ρ). It is thought that rho binds to mRNA and moves along the molecule until it reaches the RNA polymerase that has halted at a terminator.
The rho factor then causes the polymerase to dissociate from the mRNA, probably by unwinding the mRNA-DNA complex.

Transcription in Eucaryotes

Transcriptional processes in eucaryotic microorganisms (and in other eucaryotic cells) differ in several ways from procaryotic transcription. There are three major RNA polymerases, not one as in procaryotes. RNA polymerase II, associated with chromatin in the nuclear matrix, is responsible for mRNA synthesis. Polymerases I and III synthesize rRNA and tRNA, respectively (table 12.2). The eucaryotic RNA polymerase II is a large aggregate, at least 500,000 daltons in size, with about 10 or more subunits. It is inhibited by the octapeptide α-amanitin. Unlike the bacterial polymerase it requires extra transcription factors to recognize its promoters. The polymerase binds near the start point; the transcription factors bind to the rest of the promoter.

Eucaryotic promoters also differ from those in procaryotes. They have combinations of several elements. Three of the most common are the TATA box (located about 30 base pairs before the start point or upstream), the CAAT box (about 75 base pairs upstream), and the GC box (90 base pairs upstream). Recently it has been shown that the TATA-binding protein sharply bends the DNA on attachment. This makes the DNA more accessible to other initiation factors. A variety of general transcription factors, promoter specific factors, and promoter elements have been discovered in different eucaryotic cells. Each eucaryotic gene seems to be regulated differently, and more research will be required to understand the regulation of eucaryotic gene transcription.

Eucaryotic mRNA arises from posttranscriptional modification of large RNA precursors, about 5,000 to 50,000 nucleotides long, called heterogeneous nuclear RNA (hnRNA) molecules. These are the products of RNA polymerase II activity (figure 12.4).
After hnRNA synthesis, the precursor RNA is cleaved by an endonuclease to yield the proper 3′-OH group. The enzyme polyadenylate polymerase then catalyzes the addition of adenylic acid to the 3′ end of hnRNA to produce a poly-A sequence about 200 nucleotides long. The hnRNA finally is cleaved to generate the functional mRNA. Usually eucaryotic mRNA also differs in having a 5′ cap consisting of 7-methylguanosine attached to the 5′-hydroxyl by a triphosphate linkage (figure 12.5). The adjacent nucleotide also may be methylated.

Eucaryotic mRNAs have 5′ caps, unlike procaryotic mRNAs. Both types of cells can have mRNA with 3′ poly-A, but procaryotes have poly-A much less often and the tracts are shorter. In addition, eucaryotic mRNA normally is monogenic in contrast to procaryotic mRNA, which often contains transcripts of two or more genes.

The functions of poly-A and capping still are not completely clear. Poly-A protects mRNA from rapid enzymatic degradation. The poly-A tail must be shortened to about 10 nucleotides before mRNA can be degraded. Poly-A also seems to aid in mRNA translation. The 5′ cap on eucaryotic messengers may promote the initial binding of ribosomes to the messenger. The cap also may protect the messenger from enzymatic attack.

Many eucaryotic genes differ from procaryotic genes in being split or interrupted, which leads to another type of posttranscriptional processing. Split or interrupted genes have exons (expressed sequences), regions coding for RNA that end up in the final RNA product (e.g., mRNA). Exons are separated from one another by introns (intervening sequences), sequences coding for RNA that is missing from the final product (figure 12.4b). The initial RNA transcript has the intron sequences present in the interrupted gene. Genes coding for rRNA and tRNA may also be interrupted. Except for cyanobacteria and Archaea (see chapters 20 and 21), interrupted genes have not been found in procaryotes.
Introns are removed from the initial RNA transcript by a process called RNA splicing (figure 12.4b). The intron’s borders must be clearly marked for accurate removal, and this is the case. Exon-intron junctions have a GU sequence at the intron’s 5′ boundary and an AG sequence at its 3′ end. These two sequences define the splice junctions and are recognized by special RNA molecules. The nucleus contains several small nuclear RNA (snRNA) molecules, about 60 to 300 nucleotides long. These complex with proteins to form small nuclear ribonucleoprotein particles called snRNPs or snurps. Some of the snRNPs recognize splice junctions and ensure splicing accuracy. For example, U1-snRNP recognizes the 5′ splice junction, and U5-snRNP recognizes the 3′ junction. Splicing of pre-mRNA occurs in a large complex called a spliceosome that contains the pre-mRNA, at least five kinds of snRNPs, and non-snRNP splicing factors.

As just mentioned, a few rRNA genes also have introns. Some of these pre-rRNA molecules are self-splicing. The RNA actually catalyzes the splicing reaction and now is called a ribozyme (Box 12.1). Thomas Cech first discovered that pre-rRNA from the ciliate protozoan Tetrahymena thermophila is self-splicing. Sidney Altman then showed that ribonuclease P, which cleaves a fragment from one end of pre-tRNA, contains a piece of RNA that catalyzes the reaction. Several other self-splicing rRNA introns have since been discovered. Cech and Altman received the 1989 Nobel Prize in chemistry for these discoveries.

Although the focus has been on mRNA synthesis, it should be noted that both rRNAs and tRNAs also begin as parts of large precursors synthesized by RNA polymerases (table 12.2). The final rRNA and tRNA products result from posttranscriptional processing, as mentioned previously.
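The role of the GU and AG boundary sequences in splicing can be made concrete with a minimal Python sketch. It assumes the intron positions are already known and simply checks the boundaries before joining the exons; the function name `splice` and the short sequences are hypothetical, and nothing here models snRNPs or the spliceosome.

```python
def splice(pre_mrna, introns):
    """Remove introns from a pre-mRNA string.

    `introns` is a list of (start, end) index pairs (end exclusive).
    Each intron must begin with GU and end with AG, as described for
    exon-intron junctions; otherwise a ValueError is raised.
    """
    rna = pre_mrna.upper()
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        if start < cursor or end > len(rna) or end - start < 4:
            raise ValueError("bad intron coordinates: (%d, %d)" % (start, end))
        intron = rna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError("intron lacks GU...AG boundaries: " + intron)
        exons.append(rna[cursor:start])  # keep the exon before this intron
        cursor = end                     # skip over the intron
    exons.append(rna[cursor:])           # keep the final exon
    return "".join(exons)


# Hypothetical two-exon transcript built from labeled pieces.
EXON1, INTRON, EXON2 = "AUGGCC", "GUAAGUCCUUAG", "GAUUAA"
PRE_MRNA = EXON1 + INTRON + EXON2
INTRON_START = len(EXON1)
INTRON_END = INTRON_START + len(INTRON)

print(splice(PRE_MRNA, [(INTRON_START, INTRON_END)]))  # AUGGCCGAUUAA
```

Requiring the coordinates as input is a deliberate simplification: GU and AG dinucleotides are far too common for the two boundary sequences alone to locate an intron.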
1. Define the following terms: leader, trailer, spacer region, polygenic mRNA, RNA polymerase core enzyme, sigma factor, promoter, template strand, terminator, and rho factor.
2. Define or describe posttranscriptional modification, heterogeneous nuclear RNA, 3′ poly-A sequence, 5′ capping, split or interrupted genes, exon, intron, RNA splicing, snRNA, spliceosome, and ribozyme.

Figure 12.4 Eucaryotic mRNA Synthesis. [Diagram labels: (a) DNA, RNA polymerase II, hnRNA, endonuclease, polyadenylate polymerase, ATP, poly-A, fragments, mRNA; (b) exon 1, intron, exon 2, DNA, transcription, splicing.] (a) The production of eucaryotic messenger RNA. The addition of poly-A to the 3′ end of mRNA is included, but not the capping of the 5′ end. Poly-A sequence and introns are in color. (b) The splicing of interrupted genes to produce mRNA. Poly-A sequences and exons are in color. The excised intron is in the shape of a circle or lariat.

Figure 12.5 The 5′ Cap of Eucaryotic mRNA. [Structure labels: 7-methylguanosine, triphosphate linkage, base 1, base 2.]

Box 12.1 Catalytic RNA (Ribozymes)

Until recently biologists believed that all cellular reactions were catalyzed by proteins called enzymes (see section 8.6). The discovery during 1981–1984 by Thomas Cech and Sidney Altman that RNA also can sometimes catalyze reactions has transformed our way of thinking about topics as diverse as catalysis and the origin of life. It is now clear that some RNA molecules, called ribozymes, catalyze reactions that alter either their own structure or that of other RNAs.
This discovery has stimulated scientists to hypothesize that the early earth was an “RNA world” in which RNA acted as both the genetic material and a reaction catalyst. Experiments showing that introns from Tetrahymena thermophila can catalyze the formation of polycytidylic acid under certain circumstances have further encouraged such speculations. Some have suggested that RNA viruses are “living fossils” of the original RNA world.

The best-studied ribozyme activity is the self-splicing of RNA. This process is widespread and occurs in Tetrahymena pre-rRNA; the mitochondrial rRNA and mRNA of yeast and other fungi; chloroplast tRNA, rRNA, and mRNA; and in mRNA from some bacteriophages (e.g., the T4 phage of E. coli). The 413-nucleotide rRNA intron of T. thermophila provides a good example of the self-splicing reaction. The reaction occurs in three steps and requires the presence of guanosine (see Box figure). First, the 3′-OH group of guanosine attacks the intron’s 5′-phosphate group and cleaves the phosphodiester bond. Second, the new 3′-hydroxyl on the left exon attacks the 5′-phosphate of the right exon. This joins the two exons and releases the intron. Finally, the intron’s 3′-hydroxyl attacks the phosphate bond of the nucleotide 15 residues from its end. This releases a terminal fragment and cyclizes the intron. Self-splicing of this rRNA occurs about 10 billion times faster than spontaneous RNA hydrolysis. Just as with enzyme proteins, the RNA’s shape is essential to catalytic efficiency. The ribozyme even has Michaelis-Menten kinetics (pp. 162–63).

The discovery of ribozymes has many potentially important practical consequences. Ribozymes act as “molecular scissors” and will enable researchers to manipulate RNA easily in laboratory experiments. It also might be possible to protect hosts by specifically removing RNA from pathogenic viruses, bacteria, and fungi. For example, ribozymes are already being tested against the AIDS, herpes, and tobacco mosaic viruses.
Box figure: Ribozyme Action. [Diagram labels: pre-rRNA, guanosine, ligated exons, linear intron (intervening sequence), conformational change, circular intron.] The mechanism of Tetrahymena thermophila pre-rRNA self-splicing. See text for details.

12.2 Protein Synthesis

The final step in gene expression is protein synthesis or translation. The mRNA nucleotide sequence is translated into the amino acid sequence of a polypeptide chain in this step. Polypeptides are synthesized by the addition of amino acids to the end of the chain with the free α-carboxyl group (the C-terminal end). That is, the synthesis of polypeptides begins with the amino acid at the end of the chain with a free amino group (the N-terminal) and moves in the C-terminal direction. The ribosome is the site of protein synthesis. Protein synthesis is not only quite accurate but also very rapid. In E. coli synthesis occurs at a rate of at least 900 residues per minute; eucaryotic translation is slower, about 100 residues per minute.
Polypeptide and protein structure (appendix I)

Many bacteria grow so quickly that each mRNA must be used with great efficiency to synthesize proteins at a sufficiently rapid rate. Ribosomal subunits are free in the cytoplasm if protein is not being synthesized. They come together to form the complete ribosome only when translation occurs. Frequently bacterial mRNAs are simultaneously complexed with several ribosomes, each ribosome reading the mRNA message and synthesizing a polypeptide. At maximal rates of mRNA use, there may be a ribosome every 80 nucleotides along the messenger or as many as 20 ribosomes simultaneously reading an mRNA that codes for a 50,000 dalton polypeptide. A complex of mRNA with several ribosomes is called a polyribosome or polysome. Polysomes are present in both procaryotes and eucaryotes. Bacteria can further increase the efficiency of gene expression through coupled transcription and translation (figure 12.6). While RNA polymerase is synthesizing an mRNA, ribosomes can already be attached to the messenger and involved in polypeptide synthesis. Coupled transcription and translation is possible in procaryotes because a nuclear envelope does not separate the translation machinery from DNA as it does in eucaryotes (see figure 3.14).

Transfer RNA and Amino Acid Activation

The first stage of protein synthesis is amino acid activation, a process in which amino acids are attached to transfer RNA molecules. These RNA molecules are normally between 73 and 93 nucleotides in length and possess several characteristic structural features. The structure of tRNA becomes clearer when its chain is folded in such a way to maximize the number of normal base pairs, which results in a cloverleaf conformation of five arms or loops (figure 12.7). The acceptor or amino acid stem holds the activated amino acid on the 3′ end of the tRNA. The 3′ end of all tRNAs has the same -C-C-A sequence; the amino acid is attached to the terminal adenylic acid. At the other end of the cloverleaf is the anticodon arm, which contains the anticodon triplet complementary to the mRNA codon triplet. There are two other large arms: the D or DHU arm has the unusual pyrimidine nucleoside dihydrouridine; and the T or TΨC arm has ribothymidine (T) and pseudouridine (Ψ), both of which are unique to tRNA.
Finally, the cloverleaf has a variable arm whose length changes with the overall length of the tRNA; the other arms are fairly constant in size.

Figure 12.6 Coupled Transcription and Translation in Bacteria. mRNA is synthesized 5′ to 3′ by RNA polymerase while ribosomes are attaching to the newly formed 5′ end of mRNA and translating the message even before it is completed. Polypeptides are synthesized in the N-terminal to C-terminal direction.

Figure 12.7 tRNA Structure. The cloverleaf structure for tRNA in procaryotes and eucaryotes. Bases found in all tRNAs are in diamonds; purine and pyrimidine positions in all tRNAs are labeled Pu and Py respectively.

Figure 12.8 Transfer RNA Conformation. The three-dimensional structure of tRNA. The various regions are distinguished with different colors.

Transfer RNA molecules are folded into an L-shaped structure (figure 12.8). The amino acid is held on one end of the L, the anticodon is positioned on the opposite end, and the corner of the L is formed by the D and T loops. Because there must be at least one tRNA for each of the 20 amino acids incorporated into proteins, at least 20 different tRNA molecules are needed. Actually more tRNA species exist (see pp. 240–41).

Amino acids are activated for protein synthesis through a reaction catalyzed by aminoacyl-tRNA synthetases (figure 12.9).

\[ \text{Amino acid} + \text{tRNA} + \text{ATP} \xrightarrow{\ \mathrm{Mg^{2+}}\ } \text{aminoacyl-tRNA} + \text{AMP} + \text{PP}_i \]

Just as is true of DNA and RNA synthesis, the reaction is driven to completion when the pyrophosphate product is hydrolyzed to two orthophosphates. The amino acid is attached to the 3′-hydroxyl of the terminal adenylic acid on the tRNA by a high-energy bond (figure 12.10), and is readily transferred to the end of a growing peptide chain. This is why the amino acid is called activated.

Figure 12.9 An Aminoacyl-tRNA Synthetase. A model of E. coli glutamyl-tRNA synthetase complexed with its tRNA and ATP. The enzyme is in blue, the tRNA in red and yellow, and ATP in green.

Figure 12.10 Aminoacyl-tRNA. The 3′ end of an aminoacyl-tRNA. The activated amino acid is attached to the 3′-hydroxyl of adenylic acid by a high-energy bond (red).

There are at least 20 aminoacyl-tRNA synthetases, each specific for a single amino acid and for all the tRNAs (cognate tRNAs) to which each may be properly attached. This specificity is critical because once an incorrect amino acid is attached to a tRNA, it will be incorporated into a polypeptide in place of the correct amino acid. The protein synthetic machinery recognizes only the anticodon of the aminoacyl-tRNA and cannot tell whether the correct amino acid is attached. Some aminoacyl-tRNA synthetases will even proofread just like DNA polymerases do. If the wrong aminoacyl-tRNA is formed, aminoacyl-tRNA synthetases will hydrolyze the amino acid from the tRNA rather than release the incorrect product.

The Ribosome

The actual process of protein synthesis takes place on ribosomes that serve as workbenches, with mRNA acting as the blueprint. Procaryotic ribosomes have a sedimentation value of 70S and a mass of 2.8 million daltons. A rapidly growing E. coli cell may have as many as 15,000 to 20,000 ribosomes, about 15% of the cell mass.

Introduction to ribosomal function and the Svedberg unit (p. 52)

The procaryotic ribosome is an extraordinarily complex organelle made of a 30S and a 50S subunit (figure 12.11). Each subunit is constructed from one or two rRNA molecules and many polypeptides. The shape of ribosomal subunits and their association to form the 70S ribosome are depicted in figure 12.12. The region of the ribosome directly responsible for translation is called the translational domain (figure 12.12d). Both subunits contribute to this domain, located in the upper half of the small subunit and in the associated areas of the large subunit. For example, the peptidyl transferase (p. 270) is found on the central protuberance of the large subunit. The growing peptide chain emerges from the large subunit at the exit domain. This is located on the side of the subunit opposite the central protuberance in both procaryotes and eucaryotes.

Eucaryotic cytoplasmic ribosomes are 80S, with a mass of 4 million daltons, and are composed of two subunits, 40S and 60S. Many of these ribosomes are found free in the cytoplasmic matrix, whereas others are attached to membranes of the endoplasmic reticulum by their 60S subunit at a site next to the exit domain. The ribosomes of eucaryotic mitochondria and chloroplasts are smaller than cytoplasmic ribosomes and resemble the procaryotic organelle.

Ribosomal RNA is thought to have two roles. It obviously contributes to ribosome structure. The 16S rRNA of the 30S subunit also may aid in the initiation of protein synthesis in procaryotes. There is evidence that the 3′ end of the 16S rRNA complexes with an initiating signal site on the mRNA and helps position the mRNA on the ribosome. It also binds initiation factor 3 (p. 270) and the 3′ CCA end of aminoacyl-tRNA.
Because of the discovery of catalytic RNA, some have proposed that ribosomal RNA has a catalytic role in protein synthesis.

The use of 16S rRNA sequences in the study of phylogeny (pp. 433–35)

Figure 12.11 The 70S Ribosome. The structure of the procaryotic ribosome. The 70S ribosome (2.8 × 10^6 daltons) is made of a 30S subunit (0.9 × 10^6 daltons; 16S rRNA + 21 polypeptide chains) and a 50S subunit (1.8 × 10^6 daltons; 5S rRNA + 23S rRNA + 34 polypeptide chains).

Initiation of Protein Synthesis

Protein synthesis proper may be divided into three stages: initiation, elongation, and termination. In the initiation stage E. coli and most bacteria begin protein synthesis with a specially modified aminoacyl-tRNA, N-formylmethionyl-tRNAfMet (figure 12.13). Because the α-amino is blocked by a formyl group, this aminoacyl-tRNA can be used only for initiation. When methionine is to be added to a growing polypeptide chain, a normal methionyl-tRNAMet is employed. Eucaryotic protein synthesis (except in the mitochondrion and chloroplast) and archaeal protein synthesis begin with a special initiator methionyl-tRNAMet. Although most bacteria start protein synthesis with formylmethionine, the formyl group does not remain but is hydrolytically removed. In fact, one to three amino acids may be removed from the amino terminal end of the polypeptide after synthesis.

Figure 12.12 Two Views of the E. coli Ribosome. (a) The 30S subunit. (b) The 50S subunit. (c) The complete 70S ribosome. (d) Diagrammatic representation of ribosomal structure with the translational and exit domains shown. The locations of elongation factor and mRNA binding are shown. The growing peptide chain probably remains unfolded and extended until it leaves the large subunit.
Figure 12.13 Procaryotic Initiator tRNA. The initiator aminoacyl-tRNA, N-formylmethionyl-tRNAfMet, is used by bacteria. The formyl group is in color. Archaea use methionyl-tRNA for initiation.

Figure 12.14 shows the initiation process in procaryotes. The initiator N-formylmethionyl-tRNAfMet (fMet-tRNA) binds to the free 30S subunit first. Next mRNA attaches to the 30S subunit and is positioned properly through interactions with both the 3′ end of the 16S rRNA and the anticodon of fMet-tRNA. Messengers have a special initiator codon (AUG or sometimes GUG) that specifically binds with the fMet-tRNA anticodon (see section 12.2). Finally, the 50S subunit binds to the 30S subunit-mRNA forming an active ribosome-mRNA complex. The fMet-tRNA is positioned at the peptidyl or P site (see description of the elongation cycle). There is some uncertainty about the exact initiation sequence, and mRNA may bind before fMet-tRNA in procaryotes. Eucaryotic initiation appears to begin with the binding of a special initiator Met-tRNA to the small subunit, followed by attachment of the mRNA.

Figure 12.14 Initiation of Protein Synthesis. The initiation of protein synthesis in procaryotes. The following abbreviations are employed: IF-1, IF-2, and IF-3 stand for initiation factors 1, 2, and 3; initiator tRNA is N-formylmethionyl-tRNAfMet. The ribosomal locations of initiation factors are depicted for illustration purposes only. They do not represent the actual initiation factor binding sites. See text for further discussion.

In procaryotes three protein initiation factors are required (figure 12.14). Initiation factor 3 (IF-3) prevents 30S subunit binding to the 50S subunit and promotes the proper mRNA binding to the 30S subunit. IF-2, the second initiation factor, binds GTP and fMet-tRNA and directs the attachment of fMet-tRNA to the 30S subunit. GTP is hydrolyzed during association of the 50S and 30S subunits. The third initiation factor, IF-1, appears to be needed for release of IF-2 and GDP from the completed 70S ribosome. IF-1 also may aid in the binding of the 50S subunit to the 30S subunit. Eucaryotes require more initiation factors; otherwise the process is quite similar to that of procaryotes.

The initiation of protein synthesis is very elaborate. Apparently the complexity is necessary to ensure that the ribosome does not start synthesizing a polypeptide chain in the middle of a gene, which would be a disastrous error.

Elongation of the Polypeptide Chain

Every amino acid addition to a growing polypeptide chain is the result of an elongation cycle composed of three phases: aminoacyl-tRNA binding, the transpeptidation reaction, and translocation. The process is aided by special protein elongation factors (just as with the initiation of protein synthesis). In each turn of the cycle, an amino acid corresponding to the proper mRNA codon is added to the C-terminal end of the polypeptide chain. The procaryotic elongation cycle is described next.

The ribosome has three sites for binding tRNAs: (1) the peptidyl or donor site (the P site), (2) the aminoacyl or acceptor site (the A site), and (3) the exit site (the E site).
At the beginning of an elongation cycle, the peptidyl site is filled with either N-formylmethionyl-tRNAfMet or peptidyl-tRNA and the aminoacyl and exit sites are empty (figure 12.15). Messenger RNA is bound to the ribosome in such a way that the proper codon interacts with the P site tRNA (e.g., an AUG codon for fMet-tRNA). The next codon (green) is located within the A site and is ready to direct the binding of an aminoacyl-tRNA.

The first phase of the elongation cycle is the aminoacyl-tRNA binding phase. The aminoacyl-tRNA corresponding to the green codon is inserted into the A site. GTP and the elongation factor EF-Tu, which donates the aminoacyl-tRNA to the ribosome, are required for this insertion. When GTP is bound to EF-Tu, the protein is in its active state and delivers aminoacyl-tRNA to the A site. This is followed by GTP hydrolysis, and the EF-Tu·GDP complex leaves the ribosome. EF-Tu·GDP is converted to EF-Tu·GTP with the aid of a second elongation factor, EF-Ts. Subsequently another aminoacyl-tRNA binds to EF-Tu·GTP (figure 12.15).

Aminoacyl-tRNA binding to the A site initiates the second phase of the elongation cycle, the transpeptidation reaction (figure 12.15 and figure 12.16). This is catalyzed by the peptidyl transferase, located on the 50S subunit. The α-amino group of the A site amino acid nucleophilically attacks the α-carboxyl group of the C-terminal amino acid on the P site tRNA in this reaction (figure 12.16). The peptide chain grows by one amino acid and is transferred to the A site tRNA. No extra energy source is required for peptide bond formation because the bond linking an amino acid to tRNA is high in energy. Recent evidence strongly suggests that 23S rRNA contains the peptidyl transferase function. Almost all protein can be removed from the 50S subunit, leaving the 23S rRNA and protein fragments. The remaining complex still has peptidyl transferase activity.
The high-resolution structure of the large subunit has now been obtained by X-ray crystallography. There is no protein in the active site region. A specific adenine base seems to participate in catalyzing peptide bond formation. Thus the 23S rRNA appears to be the major component of the peptidyl transferase and contributes to both A and P site functions.

The final phase in the elongation cycle is translocation. Three things happen simultaneously: (1) the peptidyl-tRNA moves from the A site to the P site; (2) the ribosome moves one codon along mRNA so that a new codon is positioned in the A site; and (3) the empty tRNA leaves the P site. Instead of immediately being ejected from the ribosome, the empty tRNA moves from the P site to the E site and then leaves the ribosome. The intricate process requires the participation of the EF-G or translocase protein and GTP hydrolysis. The ribosome changes shape as it moves down the mRNA in the 5′ to 3′ direction.

Termination of Protein Synthesis

Protein synthesis stops when the ribosome reaches one of three special nonsense codons: UAA, UAG, and UGA (figure 12.17). Three release factors (RF-1, RF-2, and RF-3) aid the ribosome in recognizing these codons. After the ribosome has stopped, peptidyl transferase hydrolyzes the peptide free from its tRNA, and the empty tRNA is released. GTP hydrolysis seems to be required during this sequence, although it may not be needed for termination in procaryotes. Next the ribosome dissociates from its mRNA and separates into 30S and 50S subunits. IF-3 binds to the 30S subunit and prevents it from reassociating with the 50S subunit until the proper stage in initiation is reached. Thus ribosomal subunits associate during protein synthesis and separate afterward. The termination of eucaryotic protein synthesis is similar except that only one release factor appears to be active.

Protein synthesis is a very expensive process.
Three GTP molecules probably are used during the elongation cycle, and two ATP high-energy bonds are required for amino acid activation (ATP is converted to AMP rather than to ADP). Therefore five high-energy bonds are required to add one amino acid to a growing polypeptide chain. GTP also is used in initiation and termination of protein synthesis (figures 12.14 and 12.17). Presumably this large energy expenditure is required to ensure the fidelity of protein synthesis. Very few mistakes can be tolerated.

Although the mechanism of protein synthesis is similar in procaryotes and eucaryotes, procaryotic ribosomes differ substantially from those in eucaryotes. This explains the effectiveness of many important chemotherapeutic agents. Either the 30S or the 50S subunit may be affected. For example, streptomycin binding to the 30S ribosomal subunit inhibits protein synthesis and causes mRNA misreading. Erythromycin binds to the 50S subunit and inhibits peptide chain elongation.

The effect of antibiotics on protein synthesis (pp. 810–11, 817)

Figure 12.15 Elongation Cycle. The elongation cycle of protein synthesis. The ribosome possesses three sites, a peptidyl or donor site (P site), an aminoacyl or acceptor site (A site), and an exit site (E site). The arrow below the ribosome in translocation step shows the direction of mRNA movement. See text for details.
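The energy accounting given in the text for polypeptide elongation can be made explicit with a short calculation. The Python sketch below uses only the chapter's own per-residue estimates (three GTP per elongation cycle and two ATP high-energy bonds per amino acid activation). The function and constant names are illustrative assumptions, and the GTP spent in initiation and termination is deliberately left out because the text gives no count for it.

```python
# Per-residue figures as stated in the text; they are the chapter's estimates,
# not independently verified values.
GTP_PER_ELONGATION_CYCLE = 3   # "Three GTP molecules probably are used"
ATP_BONDS_PER_ACTIVATION = 2   # ATP -> AMP, counted as two high-energy bonds


def high_energy_bonds(n_residues: int) -> int:
    """Estimate the high-energy bonds spent adding n_residues amino acids.

    Hypothetical helper for illustration. It ignores the GTP used in
    initiation and termination, which the text mentions but does not quantify.
    """
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    per_residue = GTP_PER_ELONGATION_CYCLE + ATP_BONDS_PER_ACTIVATION
    return n_residues * per_residue


if __name__ == "__main__":
    print(high_energy_bonds(1))    # 5, matching the text's per-residue total
    print(high_energy_bonds(300))  # 1500 for a 300-residue polypeptide
```

Keeping the two contributions as separate constants makes it easy to see how the total would change if a different estimate of GTP use per cycle were adopted.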
Figure 12.16 Transpeptidation. The peptidyl transferase reaction. The peptide grows by one amino acid and is transferred to the A site.

Figure 12.17 Termination of Protein Synthesis in Procaryotes. Although three different nonsense codons can terminate chain elongation, UAA is most often used for this purpose. Three release factors (RF) assist the ribosome in recognizing nonsense codons and terminating translation. GTP hydrolysis is probably involved in termination.

Protein Folding and Molecular Chaperones

For many years it was believed that polypeptides would spontaneously fold into their final native shape, either as they were synthesized by ribosomes or shortly after completion of protein synthesis. Although the amino acid sequence of a polypeptide does determine its final conformation, it is now clear that special helper proteins aid the newly formed or nascent polypeptide in folding to its proper shape. These proteins, called molecular chaperones or chaperones, recognize only unfolded polypeptides or partly denatured proteins and do not bind to normal, functional proteins. Their role is essential because the cytoplasmic matrix is filled with nascent polypeptide chains and proteins.
Under such conditions it is quite likely that new polypeptide chains often will fold improperly and aggregate to form nonfunctional complexes. Molecular chaperones suppress incorrect folding and may reverse any incorrect folding that has already taken place. They are so important that chaperones are present in all cells, procaryotic and eucaryotic.

Figure 12.18 Chaperones and Polypeptide Folding. The involvement of bacterial chaperones in the proper folding of a newly synthesized polypeptide chain is depicted in this diagram. Three possible outcomes of a chaperone reaction cycle are shown. A native protein may result, the partially folded polypeptide may bind again to DnaK and DnaJ, or the polypeptide may be transferred to GroEL and GroES. See text for details.

Several chaperones and cooperating proteins aid proper protein folding in bacteria. The process has been most studied in Escherichia coli and involves at least four chaperones (DnaK, DnaJ, GroEL, and GroES) and the stress protein GrpE. After a sufficient length of nascent polypeptide extends from the ribosome, DnaJ binds to the unfolded chain (figure 12.18). DnaK, which is complexed with ATP, then attaches to the polypeptide. These two chaperones prevent the polypeptide from folding improperly as it is synthesized. The ATP is hydrolyzed to ADP after DnaK binding, and this increases the affinity of DnaK for the unfolded polypeptide. When the polypeptide has been synthesized, the GrpE protein binds to the chaperone-polypeptide complex and causes DnaK to release ADP. Then ATP binds to DnaK and both DnaK and DnaJ dissociate from the polypeptide.
The polypeptide has been folding during this sequence of events and may have reached its final native conformation. If it is still only partially folded, it can bind DnaJ and DnaK again and repeat the process. Often DnaK and DnaJ will transfer the polypeptide to the chaperones GroEL and GroES, where the final folding takes place. GroEL is a large, hollow barrel-shaped complex of 14 subunits arranged in two stacked rings (figure 12.19). GroES exists as a single ring of seven subunits and can bind to one or both ends of the GroEL cylinder. As with DnaK, ATP binding to GroEL and ATP hydrolysis change the chaperone’s affinity for the folding polypeptide and regulate polypeptide binding and release (polypeptide release is ATP-dependent). GroES binds to GroEL and assists in its binding and release of the refolding polypeptide.

Chaperones were first discovered because they dramatically increased in concentration when cells were exposed to high temperatures, metabolic poisons, and other stressful conditions. Thus many chaperones often are called heat-shock proteins or stress proteins. When an E. coli culture is switched from 30 to 42°C, the concentrations of some 20 different heat-shock proteins increase greatly within about 5 minutes. If the cells are exposed to a lethal temperature, the heat-shock proteins are still synthesized but most proteins are not. Thus chaperones protect the cell from thermal damage and other stresses as well as promote the proper folding of new polypeptides. For example, DnaK protects E. coli RNA polymerase from thermal inactivation in vitro. In addition, DnaK reactivates thermally inactivated RNA polymerase, especially if ATP, DnaJ, and GrpE are present. GroEL and GroES also protect intracellular proteins from aggregation. As one would expect, large quantities of chaperones are present in hyperthermophiles such as Pyrodictium occultum, an archaeon that will grow at temperatures as high as 110°C.
Pyrodictium has a chaperone similar to the GroEL of E. coli. The chaperone hydrolyzes ATP most rapidly at 100°C and makes up almost 3/4 of the cell’s soluble protein when P. occultum grows at 108°C.

Thermophilic and hyperthermophilic procaryotes (pp. 126–27)

Figure 12.19 The GroEL-GroES Chaperone Complex. (a) Top and side views of the GroEL-GroES complex. The trans GroEL ring is red; the cis GroEL ring is green; GroES is gold. (b) A top view of the GroEL complex. Several domains have been colored to distinguish them from one another. Note the large central chamber in which a protein can fold.

Chaperones have other functions as well. They are particularly important in the transport of proteins across membranes. For example, in E. coli the chaperone SecB binds to the partially unfolded forms of many proteins and keeps them in an export-competent state until they are translocated across the plasma membrane. DnaK, DnaJ, and GroEL/GroES also can aid in protein translocation across membranes. Proteins destined for the periplasm or outer membrane (see pp. 58–60) are synthesized with the proper amino-terminal signal sequence. The signal sequence is a short stretch of amino acids that helps direct the completed polypeptide to its final destination. Polypeptides associate with SecB and the chaperone then attaches to the membrane translocase. The polypeptides are transported through the membrane as ATP is hydrolyzed. When they enter the periplasm, the signal peptidase enzyme removes the signal sequence and the protein moves to its final location.

As already noted, the polypeptide folds into its final shape after synthesis, often with the aid of molecular chaperones.
This folding is possible because protein conformation is a direct function of the amino acid sequence (see appendix I). Recent research indicates that procaryotes and eucaryotes may differ with respect to the timing of protein folding. In terms of conformation, proteins are composed of compact, self-folding, structurally independent regions. These regions, normally around 100 to 300 amino acids in length, are called domains. Larger proteins such as immunoglobulins (see p. 734) may have two or more domains that are linked by less structured portions of the polypeptide chain. In eucaryotes, domains fold independently right after being synthesized by the ribosome. It appears that procaryotic polypeptides, in contrast, do not fold until after the complete chain has been synthesized. Only then do the individual domains fold. This difference in timing may account for the observation that chaperones seem to be more important in the folding of procaryotic proteins. Folding a whole polypeptide is more complex than folding one domain at a time and would require the aid of chaperones.

1. In which direction are polypeptides synthesized? What is a polyribosome and why is it useful?
2. Briefly describe the structure of transfer RNA and relate this to its function. How are amino acids activated for protein synthesis, and why is the specificity of the aminoacyl-tRNA synthetase reaction so important?
3. What are translational and exit domains? Contrast procaryotic and eucaryotic ribosomes in terms of structure. What roles does ribosomal RNA have?
4. Describe the nature and function of the following: fMet-tRNA, initiator codon, IF-3, IF-2, IF-1, elongation cycle, peptidyl and aminoacyl sites, EF-Tu, EF-Ts, transpeptidation reaction, peptidyl transferase, translocation, EF-G or translocase, nonsense codon, and release factors.
5. What are molecular chaperones and heat-shock proteins? Describe their functions.

Protein Splicing

A further level of complexity in the formation of proteins has been discovered. Some microbial proteins are spliced after translation. In protein splicing, a part of the polypeptide is removed before the polypeptide folds into its final shape. Self-splicing proteins begin as larger precursor proteins composed of one or more internal intervening sequences called inteins flanked by external sequences or exteins, the N-exteins and C-exteins (figure 12.20a). Inteins, which sometimes are over 500 residues in length, are removed in an autocatalytic process involving a branched intermediate (figure 12.20b). Thus far, 10 or more self-splicing proteins have been discovered. Some examples are an ATPase in the yeast Saccharomyces cerevisiae, the recA protein of Mycobacterium tuberculosis, and DNA polymerase in Pyrococcus. The presence of self-splicing proteins in all three domains may mean that they are quite widespread and prevalent.

12.3 Regulation of mRNA Synthesis

The control of metabolism by regulation of enzyme activity is a fine-tuning mechanism: it acts rapidly to adjust metabolic activity from moment to moment. Microorganisms also are able to control the expression of their genome, although over longer intervals. For example, the E. coli chromosome can code for about 2,000 to 4,000 peptide chains, yet many fewer proteins are present in E. coli growing with glucose as its energy source. Regulation of gene expression serves to conserve energy and raw material, to maintain balance between the amounts of various cell proteins, and to adapt to long-term environmental change.
Thus control of gene expression complements the regulation of enzyme activity.

Regulation of enzyme activity (pp. 165–69)

Figure 12.20 Protein Splicing. (a) A generalized illustration of intein structure. The amino acids that are commonly present at each end of the inteins are shown. Note that many are thiol- or hydroxyl-containing amino acids. (b) An overview of the proposed pattern or sequence of splicing. The precise mechanism is not yet known but presumably involves the hydroxyls or thiols located at each end of the intein.

Induction and Repression

The regulation of β-galactosidase synthesis has been intensively studied and serves as a primary example of how gene expression is controlled. This enzyme catalyzes the hydrolysis of the sugar lactose to glucose and galactose (figure 12.21). When E. coli grows with lactose as its carbon source, each cell contains about 3,000 β-galactosidase molecules, but has less than three molecules in the absence of lactose. The enzyme β-galactosidase is an inducible enzyme; that is, its level rises in the presence of a small molecule called an inducer (in this case the lactose derivative allolactose).

Figure 12.21 The β-Galactosidase Reaction. Lactose + H2O → galactose + glucose, catalyzed by β-galactosidase.

Figure 12.22 Gene Induction. The regulator gene, Reg., synthesizes an active repressor that binds to the operator, O, and blocks RNA polymerase binding to the promoter, P, unless the inducer inactivates it. In the presence of the inducer, the repressor protein is inactive and transcription occurs.

The genes for enzymes involved in the biosynthesis of amino acids and other substances often respond differently from genes coding for catabolic enzymes. An amino acid present in the surroundings may inhibit the formation of the enzymes responsible for its biosynthesis. This makes good sense because the microorganism will not need the biosynthetic enzymes for a particular substance if it is already available. Enzymes whose amount is reduced by the presence of an end product are repressible enzymes, and metabolites causing a decrease in the concentrations of repressible enzymes are corepressors. Generally, repressible enzymes are necessary for synthesis and always are present unless the end product of their pathway is available. Inducible enzymes, in contrast, are required only when their substrate is available; they are missing in the absence of the inducer.

Although variations in enzyme levels could be due to changes in the rates of enzyme degradation, most enzymes are relatively stable in growing bacteria. Induction and repression result principally from changes in the rate of transcription. When E. coli is growing in the absence of lactose, it often lacks mRNA molecules coding for the synthesis of β-galactosidase. In the presence of lactose, however, each cell has 35 to 50 β-galactosidase mRNA molecules. The synthesis of mRNA is dramatically influenced by the presence of lactose.

DNA transcription mechanism (pp. 261–64)

Figure 12.23 Gene Repression. The regulator gene, Reg., synthesizes an inactive repressor protein that must be activated by corepressor binding before it can bind to the operator, O, and block transcription.
In the absence of the corepressor, the repressor is inactive and transcription occurs.

Negative Control

A controlling factor can either inhibit or activate transcription. Although the responses to the presence of metabolites are different, both induction and repression are forms of negative control: mRNA synthesis proceeds more rapidly in the absence of the active controlling factor.

The rate of mRNA synthesis is controlled by special repressor proteins that are synthesized under the direction of regulator genes. The repressor binds to a specific site on DNA called the operator. The importance of regulator genes and repressors is demonstrated by mutationally inactivating a regulator gene to form a constitutive mutant. A constitutive mutant produces the enzymes in question whether or not they are needed. Thus inactivation of repressor proteins blocks the regulation of transcription.

Gene structure (pp. 241–44)

Repressors must exist in both active and inactive forms because transcription would never occur if they were always active. In inducible systems the regulator gene directs the synthesis of an active repressor. The inducer stimulates transcription by reversibly binding to the repressor and causing it to change to an inactive shape (figure 12.22). Just the opposite takes place in a system controlled by repression (figure 12.23). The repressor protein initially is an inactive form called an aporepressor and becomes an active repressor only when the corepressor binds to it. The corepressor inhibits transcription by activating the aporepressor.

Box 12.2 The Discovery of Gene Regulation

The ability of microorganisms to adapt to their environments by adjusting enzyme levels was first discovered by Emil Duclaux, a colleague of Louis Pasteur.
He found that the fungus Aspergillus niger would produce the enzyme that hydrolyzes sucrose (invertase) only when grown in the presence of sucrose. In 1900 F. Dienert found that yeast contained the enzymes for galactose metabolism only when grown with lactose or galactose and would lose these enzymes upon transfer to a glucose medium. Such a response made sense because the yeast cells would not need enzymes for galactose metabolism when using glucose as its carbon and energy source. Further examples of adaptation were discovered and by the 1930s H. Karström could divide enzymes into two classes: (1) adaptive enzymes that are formed only in the presence of their substrates, and (2) constitutive enzymes that are always present. It was originally thought that enzymes might be formed from inactive precursors and that the presence of the substrate simply shifted the equilibrium between precursor and enzyme toward enzyme formation. In 1942 Jacques Monod, working at the Pasteur Institute in Paris, began a study of adaptation in the bacterium E. coli. It was already known that the enzyme -galactosidase, which hydrolyzes the sugar lactose to glucose and galactose, was present only when E. coli was grown in the presence of lactose. Monod discovered that nonmetabolizable analogues of -galactosides, such as thiomethylgalactoside, also could induce enzyme production. This discovery made it possible to study induction in cells growing on carbon and energy sources other than lactose so that the growth rate and inducer concentration would not depend on the lactose supply. He next demonstrated that induction involved the synthesis of new enzyme, not just the conversion of already available precursor. Monod accomplished this by making E. coli proteins radioactive with 35S, then transferring the labeled bacteria to nonradioactive medium and adding inducer. The newly formed T -galactosidase was nonradioactive and must have been synthesized after addition of inducer. 
A study of the genetics of lactose induction in E. coli was begun by Joshua Lederberg a few years after Monod had started his work. Lederberg isolated not only mutants lacking -galactosidase but also a constitutive mutant in which synthesis of the enzyme proceeded in the absence of an inducer (LacI⫺). During bacterial conjugation (see section 13.4), genes from the donor bacterium enter the recipient to temporarily form an organism with two copies of those genes provided by the donor. When Arthur B. Pardee, François Jacob, and Monod transferred the gene for inducibility to a constitutive recipient not sensitive to inducers, the newly acquired gene made the recipient bacterium sensitive to inducer again. This functional gene was not a part of the recipient’s chromosome. Thus the special gene directed the synthesis of a cytoplasmic product that inhibited the formation of -galactosidase in the absence of the inducer. In 1961 Jacob and Monod named this special product the repressor and suggested that it was a protein. They further proposed that the repressor protein exerted its effects by binding to the operator, a special site next to the structural genes. They provided genetic evidence for their hypothesis. The name operon was given to the complex of the operator and the genes it controlled. Several years later in 1967, Walter Gilbert and Benno Müller-Hill managed to isolate the lac repressor and show that it was indeed a protein and did bind to a specific site in the lac operon. The existence of repression was discovered by Monod and G. Cohen-Bazire in 1953 when they found that the presence of the amino acid tryptophan would repress the synthesis of tryptophan synthetase, the final enzyme in the pathway for tryptophan biosynthesis. Subsequent research in many laboratories showed that induction and repression were operating by quite similar mechanisms, each involving repressor proteins that bound to operators on the genome. 
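The logic of negative control described before Box 12.2 can be summarized as a small on/off model. The following is only a minimal sketch: the class and field names are hypothetical, and real operons respond in a graded rather than an all-or-none way. It contrasts an inducible system (repressor made active, inactivated by the inducer), a repressible system (aporepressor made inactive, activated by the corepressor), and a constitutive mutant whose regulator gene is inactivated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NegativeControlOperon:
    """Toy on/off model of an operon under negative control (hypothetical names)."""

    inducible: bool                         # True: inducible system; False: repressible system
    regulator_gene_functional: bool = True  # False models a constitutive mutant

    def repressor_active(self, effector_present: bool) -> bool:
        if not self.regulator_gene_functional:
            return False  # no working repressor is made at all
        if self.inducible:
            # The repressor is made in its active form; the inducer inactivates it.
            return not effector_present
        # The aporepressor is made inactive; the corepressor activates it.
        return effector_present

    def transcribed(self, effector_present: bool) -> bool:
        # Negative control: mRNA is made only when no active repressor is on the operator.
        return not self.repressor_active(effector_present)


systems = {
    "inducible": NegativeControlOperon(inducible=True),
    "repressible": NegativeControlOperon(inducible=False),
    "constitutive mutant": NegativeControlOperon(inducible=True, regulator_gene_functional=False),
}
for name, operon in systems.items():
    for effector in (False, True):
        print(f"{name}: effector present={effector}, transcribed={operon.transcribed(effector)}")
```

In this sketch the "effector" is the inducer for the inducible system and the corepressor for the repressible one, so the same small molecule input gives opposite outcomes in the two systems.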
The synthesis of several proteins is often regulated by a single repressor. The structural genes, or genes coding for a polypeptide, are simply lined up together on the DNA, and a single mRNA carries all the messages. The sequence of bases coding for one or more polypeptides, together with the operator controlling its expression, is called an operon. This arrangement is of great advantage to the bacterium because coordinated control of the synthesis of several metabolically related enzymes (or other proteins) can be achieved.

The Lactose Operon

The best-studied negative control system is the lactose operon of E. coli. The lactose or lac operon contains three structural genes and is controlled by the lac repressor (figure 12.24). One gene codes for β-galactosidase; a second gene directs the synthesis of β-galactoside permease, the protein responsible for lactose uptake. The third gene codes for the enzyme β-galactoside transacetylase, whose function still is uncertain. The presence of the first two genes in the same operon ensures that the rates of lactose uptake and breakdown will vary together (Box 12.2).

Figure 12.24 Lactose Repressor Binding to DNA. The lac repressor-DNA complex is shown here. The repressor dimer binds to two stretches of DNA (blue) by specialized N-terminal headpiece subdomains that fit in the major groove.

The lac operon has three operators. The lac repressor protein finds an operator in a two-step process. First, the repressor binds to a DNA molecule, then rapidly slides along the DNA until it reaches an operator and stops. A portion of the repressor fits into the major groove of operator-site DNA by special N-terminal subdomains. The shape of the repressor protein is ideally suited for specific binding to the DNA double helix.

Figure 12.25 Repressor and CAP Bound to the lac Operon. The lac repressor is in violet, operators in red, promoter in green, and CAP (catabolite activator protein) in blue. RNA polymerase access to the promoter in the loop of DNA is hindered in this complex and transcription cannot begin.

How does the repressor inhibit transcription? The promoter to which RNA polymerase binds (see pp. 242–44) is located next to the operator. The repressor may bind simultaneously to more than one operator and bend the DNA segment that contains the promoter (figure 12.25). The bent promoter may not allow proper RNA polymerase binding or may not be able to initiate transcription after polymerase binding. Even if the polymerase is bound to the promoter, it is stored there and does not begin transcription until the repressor leaves the operator. A repressor does not affect the actual rate of transcription once it has begun.

Positive Control

The preceding section shows that operons can be under negative control, resulting in induction and repression. In contrast, some operons function only in the presence of a controlling factor; that is, they are under positive operon control. The lac operon is under positive control as well as negative control; that is, it is under dual control.

Lac operon function is regulated by the catabolite activator protein (CAP) or cyclic AMP receptor protein (CRP) and the small cyclic nucleotide 3′,5′-cyclic adenosine monophosphate (cAMP; figure 12.26), as well as by the lac repressor protein. The lac promoter contains a CAP site to which CAP must bind before RNA polymerase can attach to the promoter and begin transcription (figure 12.27). The catabolite activator protein is able to bind to the CAP site only when complexed with cAMP. Upon binding, CAP bends the DNA about 90° within two helical turns (figure 12.25 and figure 12.28).
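The dual control just described can be written as a two-input truth table. This is a simplified sketch under stated assumptions: the function and parameter names are hypothetical, each input is treated as simply present or absent, and graded expression levels are ignored. Transcription requires both that the inducer has inactivated the repressor and that the CAP-cAMP complex is bound at the CAP site.

```python
def lac_operon_transcribed(inducer_present: bool, camp_present: bool) -> bool:
    """Boolean sketch of lac operon dual control (illustrative, not quantitative)."""
    # Negative control: the repressor occupies the operator unless the inducer inactivates it.
    repressor_on_operator = not inducer_present
    # Positive control: CAP binds the CAP site only when complexed with cAMP.
    cap_on_cap_site = camp_present
    return cap_on_cap_site and not repressor_on_operator


for inducer in (False, True):
    for camp in (False, True):
        state = "on" if lac_operon_transcribed(inducer, camp) else "off"
        print(f"inducer={inducer!s:5} cAMP={camp!s:5} -> operon {state}")
```

Only one of the four input combinations turns the operon on, which is the sense in which the operon is under dual control.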
Interaction of CAP with RNA polymerase stimulates transcription. This positive control system makes lac operon activity dependent on the presence of cAMP as well as on that of lactose.

Figure 12.26 Cyclic Adenosine Monophosphate (cAMP). The phosphate group extends between the 3′ and 5′ hydroxyls of the ribose sugar. The enzyme adenyl cyclase forms cAMP from ATP.

Figure 12.27 Positive Control of the Lac Operon. When cyclic AMP is absent or present at a low level, the CAP protein remains inactive and does not bind to the promoter. In this situation RNA polymerase also does not bind to the promoter and transcribe the operon’s genes.

Figure 12.28 CAP Structure and DNA Binding. (a) The CAP dimer binding to DNA at the lac operon promoter. The recognition helices fit into two adjacent major grooves on the double helix. (b) A model of the E. coli CAP-DNA complex derived from crystal structure studies. The cAMP binding domain is in blue and the DNA binding domain, in purple. The cAMP molecules bound to CAP are in red. Note that the DNA is bent by 90° when complexed with CAP.

12.4 Attenuation

Bacteria can regulate transcription in other ways, as may be seen in the tryptophan operon of E. coli. The tryptophan operon contains structural genes for five enzymes in this amino acid’s biosynthetic pathway.
As might be expected, the operon is under the control of a repressor protein coded for by the trpR gene (trp stands for tryptophan), and excess tryptophan inhibits transcription of operon genes by acting as a corepressor and activating the repressor protein. Although the operon is regulated mainly by repression, the continuation of transcription also is controlled. That is, there are two decision points involved in transcriptional control, the initiation of transcription and the continuation of transcription past the attenuator region. A leader region lies between the operator and the first structural gene in the operon, the trpE gene, and is responsible for controlling the continuation of transcription after the RNA polymerase has bound to the promoter (figure 12.29a). The leader region contains an attenuator and a sequence that codes for the synthesis of a leader peptide. The attenuator is a rho-independent termination site (p. 263) with a short GC-rich segment followed by a sequence of eight U residues. The four stretches marked off in figure 12.29a have complementary base sequences and can base pair with each other to form hairpin loops. In the absence of a ribosome, mRNA segments one and two pair to form a hairpin, while segments three and four generate a second loop next to the poly(U) sequence (figure 12.29b). The hairpin formed by segments three and four plus the poly(U) sequence will terminate transcription. If segment one is prevented from base pairing with segment two, segment two is free to associate with segment three. As a result segment four remains single stranded (figure 12.29c) and cannot serve as a terminator for transcription. It is important to note that the sequence coding for the leader peptide contains two adjacent codons that code for the amino acid tryptophan. Thus the complete peptide can be made only when there is an adequate supply of tryptophan. 
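The alternative hairpins of the leader region can be sketched as a small pairing rule. This is an illustrative model only: the function names are hypothetical, and it assumes that segments pair in the order they are transcribed and that a segment covered by a ribosome cannot pair. Termination occurs whenever the hairpin between segments three and four forms.

```python
from __future__ import annotations


def leader_hairpins(segment1_blocked: bool, segment2_blocked: bool) -> list[tuple[int, int]]:
    """Return the hairpins formed by leader segments 1 to 4 (toy model)."""
    free = {1: not segment1_blocked, 2: not segment2_blocked, 3: True, 4: True}
    pairs = []
    # Try the pairings in transcription order: 1 with 2, then 2 with 3, then 3 with 4.
    for a, b in ((1, 2), (2, 3), (3, 4)):
        if free[a] and free[b]:
            pairs.append((a, b))
            free[a] = free[b] = False  # a paired segment is no longer available
    return pairs


def transcription_terminates(pairs: list[tuple[int, int]]) -> bool:
    # The 3:4 hairpin followed by the poly(U) stretch is the terminator.
    return (3, 4) in pairs


cases = {
    "no ribosome on the leader": (False, False),
    "segment one blocked": (True, False),
    "segments one and two blocked": (True, True),
}
for label, (s1, s2) in cases.items():
    hairpins = leader_hairpins(s1, s2)
    print(f"{label}: hairpins={hairpins}, terminates={transcription_terminates(hairpins)}")
```

Only the case in which segment one alone is blocked leaves segment four single stranded, so it is the only case in which transcription continues into the structural genes.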
Since the leader peptide has not been detected, it must be degraded immediately after synthesis.

Ribosome behavior during translation of the mRNA regulates RNA polymerase activity as it transcribes the leader region. This is possible because translation and transcription are tightly coupled. When the active repressor is absent, RNA polymerase binds to the promoter and moves down the leader synthesizing mRNA. If there is no translation of the mRNA after the RNA polymerase has begun copying the leader region, segments three and four form a hairpin loop, and transcription terminates before the polymerase reaches the trpE gene (figure 12.30a). When tryptophan is present, there is sufficient tryptophanyl-tRNA for protein synthesis. Therefore the ribosome will synthesize the leader peptide and continue moving along the mRNA until it reaches a UGA stop codon (see section 12.2) lying between segments one and two. The ribosome halts at this codon and projects into segment two far enough to prevent it from pairing properly with segment three (figure 12.30b). Segments three and four form a hairpin loop, and the RNA polymerase terminates at the attenuator just as if no translation had taken place. If tryptophan is lacking, the ribosome will stop at the two adjacent tryptophan codons in the leader peptide sequence and prevent segment one from base pairing with segment two, because the tryptophan codons are located within segment one (figures 12.29a and 12.30c). If this happens while the RNA polymerase is still transcribing the leader region, segments two and three associate before segment four has been synthesized. Therefore segment four will remain single stranded and the terminator hairpin will not form. Consequently, when tryptophan is absent, the RNA polymerase continues on and transcribes tryptophan operon genes. Control of the continuation of transcription by a specific aminoacyl-tRNA is called attenuation.

Figure 12.29 The Tryptophan Operon Leader. (a) Organization and base pairing of the tryptophan operon leader region. The promoter and operator are to the left of the segment diagrammed, and the first structural gene (trpE) begins to the right of the attenuator. (b and c) The stretches of DNA marked off as 1 through 4 can base pair with each other to form hairpin loops: segment 2 with 1, and segment 3 with 2 or 4. See text for details.

Figure 12.30 Attenuation Control. The control of tryptophan operon function by attenuation. (a) No translation of mRNA: segments 3 and 4 form the rho-independent termination site and RNA polymerase dissociates. (b) Tryptophan available to form trp-tRNA: a ribosome synthesizes the leader peptide and stops at the termination codon; the rho-independent termination site forms and RNA polymerase dissociates. (c) Tryptophan absent: the ribosome stops at the trp codons in region 1 and forces 2 to base pair with 3. Region 4 remains single stranded, the termination site is not formed, and RNA polymerase continues. See text for details.

Attenuation’s usefulness is apparent. If the bacterium is deficient in an amino acid other than tryptophan, protein synthesis will slow and tryptophanyl-tRNA will accumulate. Transcription of the tryptophan operon will be inhibited by attenuation. When the bacterium begins to synthesize protein rapidly, tryptophan may be scarce and the concentration of tryptophanyl-tRNA may be low.
This would reduce attenuation activity and stimulate operon transcription, resulting in larger quantities of the tryptophan biosynthetic enzymes. Acting together, repression and attenuation can coordinate the rate of synthesis of amino acid biosynthetic enzymes with the availability of amino acid end products and with the overall rate of protein synthesis. When tryptophan is present at high concentrations, any RNA polymerases not blocked by the activated repressor protein probably will not get past the attenuator sequence. Repression decreases transcription about seventyfold and attenuation slows it another eight- to tenfold; when both mechanisms operate together, transcription can be slowed about 600-fold (70 × 8 to 10 gives roughly 560 to 700).

Attenuation seems important in the regulation of several amino acid biosynthetic pathways. At least five other operons have leader peptide sequences that resemble the tryptophan system in organization. For example, the leader peptide sequence of the histidine operon codes for seven histidines in a row and is followed by an attenuator that is a terminator sequence.

12.5 Global Regulatory Systems

Thus far, we have been considering the function of isolated operons. However, bacteria must respond rapidly to a wide variety of changing environmental conditions and be able to cope with such things as nutrient deprivation, desiccation, and major temperature fluctuations. They also have to compete successfully with other organisms for scarce nutrients and use these nutrients efficiently. These challenges require a regulatory system that can rapidly control many operons at the same time. Such regulatory systems that affect many genes and pathways simultaneously are called global regulatory systems.

There are many examples of these multigene global systems. Catabolite repression in enteric bacteria and sporulation in Bacillus subtilis will be discussed shortly. Two other previously discussed global systems are the SOS response (see p. 255) and the production of heat-shock proteins (p. 273).

Although it is usually possible to regulate all the genes of a metabolic pathway in a single operon, there are good reasons for more complex global systems. Some processes involve too many genes to be accommodated in a single operon. For example, the machinery required for protein synthesis is composed of 150 or more gene products, and coordination requires a regulatory network that controls many separate operons. Sometimes two levels of regulation are required because individual operons must be controlled independently and also cooperate with other operons. Regulation of sugar catabolism in E. coli is a good example. E. coli uses glucose when it is available; in such a case, operons for other catabolic pathways are repressed. If glucose is unavailable and another nutrient is present, the appropriate operon is activated.

Global regulation can be accomplished by several different mechanisms. A protein repressor or activator may affect several operons simultaneously. A sigma factor may cause RNA polymerase to recognize and transcribe an array of different operons with similar promoters. Sometimes a nonprotein regulator such as the nucleotide guanosine tetraphosphate controls several operons.

Global regulatory systems are so complex that a specialized nomenclature is used to describe the various kinds. Perhaps the most basic type is the regulon. A regulon is a collection of genes or operons that is controlled by a common regulatory protein. Usually the operons are associated with a single pathway or function (e.g., the production of heat-shock proteins or the catabolism of glycerol). A somewhat more complex situation is seen with a modulon. This is an operon network under the control of a common global regulatory protein, but whose constituent operons also are controlled separately by their own regulators. A good example of a modulon is catabolite repression.
The most complex global systems are referred to as stimulons. A stimulon is a regulatory system in which all operons respond together in a coordinated way to an environmental stimulus. It may contain several regulons and modulons, and some of these may not share regulatory proteins. The genes involved in a response to phosphate limitation are scattered among several regulons and are part of one stimulon.

We will now briefly consider three examples of global regulation. First we will discuss catabolite repression and the use of positive operon control. Then an introduction to regulation by sigma factors and the induction of sporulation will follow. Finally, the regulation of porin protein synthesis by antisense RNA will be described.

Catabolite Repression

If E. coli grows in a medium containing both glucose and lactose, it uses glucose preferentially until the sugar is exhausted. Then after a short lag, growth resumes with lactose as the carbon source (figure 12.31). This biphasic growth pattern or response is called diauxic growth. The cause of diauxic growth or diauxie is complex and not completely understood, but catabolite repression or the glucose effect probably plays a part. The enzymes for glucose catabolism are constitutive and unaffected by CAP activity. When the bacterium is given glucose, the cAMP level drops, resulting in deactivation of the catabolite activator protein and inhibition of lac operon expression. The decrease in cAMP may be due to the effect of the phosphoenolpyruvate:phosphotransferase system (PTS) on the activity of adenyl cyclase, the enzyme that synthesizes cAMP. Enzyme III of the PTS donates a phosphate to glucose during its transport; therefore, it enters the cell as glucose 6-phosphate. The phosphorylated form of enzyme III also activates adenyl cyclase. If glucose is being rapidly transported by PTS, the amount of phosphorylated enzyme III is low and the adenyl cyclase is less active, so the cAMP level drops. At least one other mechanism is involved in diauxic growth. When the PTS is actively transporting glucose into the cell, nonphosphorylated enzyme III is more prevalent. Nonphosphorylated enzyme III binds to the lactose permease and allosterically inhibits it, thus blocking lactose uptake. The phosphoenolpyruvate:phosphotransferase system (pp. 103–4)

Figure 12.31 Diauxic Growth. The diauxic growth curve of E. coli grown with a mixture of glucose and lactose (bacterial density plotted against time in hours). Glucose is first used, then lactose. A short lag in growth is present while the bacteria synthesize the enzymes needed for lactose use.

Whatever the precise mechanism, such control is of considerable advantage to the bacterium. It will use the most easily catabolized sugar (glucose) first rather than synthesize the enzymes necessary for another carbon and energy source. These control mechanisms are present in a variety of bacteria and metabolic pathways.

Regulation by Sigma Factors and Control of Sporulation

Although the RNA polymerase core enzyme can transcribe any gene to produce a messenger RNA copy, it needs the assistance of a sigma factor to bind the promoter and initiate transcription (see p. 262). This provides an excellent means of regulating gene expression. When a complex process requires a radical change in transcription, or the synthesis of several gene products in a precisely timed sequence, it may be regulated by a series of sigma factors. Each sigma factor enables the RNA polymerase core enzyme to recognize a specific set of promoters and transcribe only those genes. Substitution of the sigma factor immediately changes gene expression. Bacterial viruses often use sigma factors to control mRNA synthesis during their life cycle (see chapter 17). This regulatory mechanism also is common among both gram-negative and gram-positive bacteria. For example, Escherichia coli synthesizes several sigma factors. Under normal conditions the sigma factor σ70 directs RNA polymerase activity. (The superscript letter or number indicates the function or size of the sigma factor; 70 stands for 70,000 Da.) When flagella and chemotactic proteins are needed, E. coli produces σF (σ28). If the temperature rises too high, σH (σ32) appears and stimulates the formation of around 17 heat-shock proteins to protect the cell from thermal destruction. As would be expected, the promoters recognized by each sigma factor differ characteristically in sequence at the −10 and −35 positions (see pp. 242–44).

One of the best-studied examples of gene regulation by sigma factors is the control of sporulation in the gram-positive Bacillus subtilis. When B. subtilis is deprived of nutrients, it will form endospores in a complex developmental process lasting about 8 hours. The bacterial endospore (pp. 68–71)

Normally the B. subtilis RNA polymerase uses sigma factor σA (σ43) to recognize genes. Environmental signals such as nutrient deprivation stimulate a kinase (Kin A or Kin B) to catalyze the phosphorylation of the Spo0F protein (figure 12.32). Spo0F transfers the phosphate to Spo0B, which in turn phosphorylates Spo0A. Phosphorylated Spo0A has several effects. It binds to a promoter and represses the expression of the abrB gene. abrB codes for a protein that inhibits many genes not needed during growth with excess nutrients (e.g., at least three sporulation genes). Phosphorylated Spo0A also activates the production of two sigma factors: an active σF and an inactive pro-σE. Sporulation begins when σF partially replaces σA in the forespore. The RNA polymerase then transcribes sporulation genes as well as vegetative genes. One of these early sporulation genes codes for another sigma factor, σG, that causes RNA polymerase to transcribe late sporulation genes in the forespore. At this point the pro-σE protein is activated by cleavage to yield σE, which then stimulates the transcription of the gene for pro-σK. Pro-σK is activated by a protease to yield σK and trigger the transcription of late genes in the mother cell. In summary, sporulation is regulated by two cascades of sigma factors, one in the forespore and the other in the mother cell. Each cascade influences the other through a series of signals so that the whole complex developmental process is properly coordinated.

Figure 12.32 Initiation of Sporulation in Bacillus subtilis. A simplified diagram of the initial steps in triggering sporulation. The activation of kinases A and B begins a two-component phosphorelay system that activates the transcription regulator Spo0A. See text for more detail.

Antisense RNA and the Control of Porin Proteins

Microbiologists have known for many years that gene expression can be controlled by both regulatory proteins (e.g., repressor proteins and CAP) and aminoacyl-tRNA (attenuation). More recently it has been discovered that the activity of some genes is controlled by a special type of small regulatory RNA molecule. The regulatory RNA, called antisense RNA, has a base sequence complementary to a segment of another RNA molecule and specifically binds to the target RNA. Antisense RNA binding can block DNA replication, mRNA synthesis, or translation.
The genes coding for these RNAs are sometimes called antisense genes. This mode of regulation appears to be widespread among viruses and bacteria. Examples are the regulation of plasmid replication and Tn10 transposition, osmoregulation of porin protein expression, regulation of phage reproduction, and the autoregulation of cAMP-receptor protein synthesis. Antisense RNA regulation has not yet been demonstrated in eucaryotic cells, although there is evidence that it may exist. It is possible that antisense RNAs bind with some eucaryotic mRNAs and stimulate their degradation. Plasmids and transposons (pp. 294–302)

The regulation of E. coli outer membrane porin proteins provides an example of control by antisense RNA. The outer membrane contains channels made of porin proteins (see p. 60). The two most important porins in E. coli are the OmpF and OmpC proteins. OmpC pores are slightly smaller and are made when the bacterium grows at high osmotic pressures. It is the dominant porin in E. coli from the intestinal tract. This makes sense because the smaller pores would exclude many of the toxic molecules present in the intestine. The larger OmpF pores are favored when E. coli grows in a dilute environment, and they allow solutes to diffuse into the cell more readily.

The ompF and ompC genes are partly regulated by a special OmpR protein that represses the ompF gene and activates ompC. In addition, the micF gene produces a 174-nucleotide-long antisense micF RNA that blocks ompF action (mic stands for mRNA-interfering complementary RNA). The micF RNA is complementary to ompF at the translation initiation site. It complexes with ompF mRNA and represses translation. The micF gene is activated by conditions such as high osmotic pressure or the presence of some toxic materials that favor ompC expression. This helps ensure that OmpF protein is not produced at the same time as OmpC protein.
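The idea that an antisense RNA pairs with a complementary stretch of its target can be illustrated with a short sequence check. This is a sketch only: both sequences below are invented for the example and are not the real micF or ompF sequences, the function names are hypothetical, and the check demands perfect Watson-Crick pairing, which is a simplification.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the RNA strand that would pair antiparallel with `rna`."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def antisense_site(mrna: str, antisense: str) -> int:
    """Return the 0-based start of the mRNA stretch the antisense RNA pairs with, or -1."""
    return mrna.find(reverse_complement(antisense))


# Hypothetical mRNA fragment; its AUG start codon begins at index 5.
mrna = "GGAUCAUGAAGCGCAAU"
start_codon = mrna.find("AUG")

# Hypothetical antisense RNA designed against positions 3 to 11 of the mRNA.
antisense = reverse_complement(mrna[3:12])

site = antisense_site(mrna, antisense)
covers_start = site != -1 and site <= start_codon < site + len(antisense)
print("antisense RNA:", antisense)
print("pairs at mRNA index:", site)
print("covers the translation initiation codon:", covers_start)
```

When the paired stretch includes the initiation codon, as in this example, the duplex would be expected to interfere with translation initiation, which is the situation the text describes for micF RNA and ompF mRNA.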
The fact that antisense RNA can bind specifically to mRNA and block its activity has great practical implications. Antisense RNA is already a valuable research tool. Suppose one desires to study the action of a particular gene. An antisense RNA that will bind to the gene’s mRNA can be constructed and introduced into the cell, thus blocking gene expression. Changes in the cell are then observed. It also is possible to use the same approach with short strands of antisense DNA that bind to mRNA. Antisense RNA and DNA may well be effective against a variety of cancers and infectious diseases. Promising preliminary results have been obtained using antisense oligonucleotides directed against Trypanosoma brucei brucei (the cause of African sleeping sickness), herpesviruses, HIV, tumor viruses such as the RSV and polyoma viruses, ovarian cancer, cytomegalovirus infections, Crohn’s disease, and chronic myelogenous leukemia. Although much further research is needed to determine the medical potential of these molecules, they may prove invaluable in the treatment of many diseases.

1. What are induction and repression and why are they useful? Define inducer, corepressor, repressor protein, aporepressor, regulator gene, negative control, constitutive mutant, operator, structural gene, and operon. Describe how the lac operon is regulated.
2. Define positive control, dual control, and catabolite activator protein. How is the lac operon controlled positively?
3. Define attenuation and describe how it works in terms of a labeled diagram, such as that provided in figure 12.30. What are the functions of the leader region and the attenuator in attenuation?
4. What are global regulatory systems and why are they necessary? Briefly describe regulons, modulons, and stimulons.
5. What is diauxic growth and how does it result from catabolite repression?
6. Briefly describe how sigma factors can be used to control gene expression. Describe the regulation of sporulation by sigma factors.
7. What is antisense RNA? How does it regulate gene expression?

12.6 Two-Component Phosphorelay Systems

A two-component phosphorelay system is a signal transduction system that uses the transfer of phosphoryl groups to control gene transcription and protein activity. It has two major components: a sensor kinase and a response regulator. There are many phosphorelay systems; two good examples are the systems that control sporulation and chemotaxis.

In the sporulation regulation system, Kin A is a sensor kinase. It serves as a transmitter that phosphorylates itself (autophosphorylation) on a special histidine residue in response to environmental signals. Spo0F acts as a receiver and catalyzes the transfer of the phosphoryl group from Kin A to a special aspartic acid residue on its surface; Spo0F then donates the phosphoryl group to a histidine on Spo0B. Spo0A is a response regulator. It has a receiver domain aspartate and picks up the phosphoryl group from Spo0B to become an active transcription regulator. Control of sporulation by sigma factors (p. 282)

Figure 12.33 The Mechanism of Chemotaxis in Escherichia coli. The chemotaxis system is designed to control counterclockwise (ccw) and clockwise (cw) flagellar rotation so that E. coli moves up an attractant gradient by a sequence of runs and tumbles. See the text for a description of the process.

Chemotaxis is controlled by a well-studied phosphorelay regulatory system.
As we have seen previously, procaryotes sense various chemicals in their environment when these substances bind to chemoreceptors called methyl-accepting chemotaxis proteins (MCPs). The MCPs can influence flagellar rotation in such a way that the organisms swim toward attractants and away from repellents. This response is regulated by a complex system in which the CheA protein serves as a sensor kinase and the CheY protein is the response regulator.

Chemotaxis (pp. 66–68)

The MCP chemoreceptors are buried in the plasma membrane with major parts exposed on both sides. The periplasmic side of each MCP has a binding site for one or more attractant molecules and may also bind repellents. Although attractants often bind directly to the MCP, in some cases they may attach to special periplasmic binding proteins, which then interact with the MCPs. The cytoplasmic side of an MCP interacts with two proteins (figure 12.33). The CheW protein binds to MCPs and helps attach the CheA protein. The full complex is composed of an MCP dimer, two CheW monomers, and a CheA dimer. When the MCP is not bound to an attractant, it stimulates CheA to phosphorylate itself using ATP, a process called autophosphorylation. CheA autophosphorylation is inhibited when the attractant is bound to its MCP.

Phosphorylated CheA can donate its phosphate to one of two receptor proteins, CheY or CheB. If CheY is phosphorylated by CheA, it changes to an active conformation, moves to the flagellum, and interacts with the switch protein (FliM) at its base (see figure 3.36). This causes the flagellum to rotate clockwise. Thus a decrease in attractant level promotes clockwise rotation and tumbling. The phosphate is removed from CheY in about 10 seconds in a process aided by the protein CheZ. The short lifetime of phosphorylated CheY means that the bacterium is very responsive to changes in attractant concentration. It will not be stuck in the tumble mode for too long a time when the attractant level changes.
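The signal flow just described can be traced as a chain of on/off steps. The following Python sketch is only an illustration: it treats each protein as either phosphorylated or not, whereas the real system works with graded concentrations, and the function name `flagellar_response` and its variable names are hypothetical rather than taken from the text.

```python
def flagellar_response(attractant_bound: bool) -> str:
    """Trace the CheA -> CheY relay for one MCP complex (binary simplification)."""
    # An MCP without bound attractant stimulates CheA autophosphorylation;
    # bound attractant inhibits it.
    chea_phosphorylated = not attractant_bound

    # Phosphorylated CheA donates its phosphate to CheY.
    chey_phosphorylated = chea_phosphorylated

    # Phosphorylated CheY interacts with the switch protein (FliM) and
    # causes clockwise rotation; otherwise rotation stays counterclockwise.
    if chey_phosphorylated:
        return "clockwise (tumble)"
    return "counterclockwise (run)"


if __name__ == "__main__":
    for bound in (True, False):
        print(f"attractant bound = {bound}: {flagellar_response(bound)}")
```

The sketch leaves out CheZ, CheB, and methylation, so it shows only the immediate response and not adaptation.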
When no attractants or repellents are present, the system maintains intermediate concentrations of CheA phosphate and CheY phosphate. This produces a normal run-tumble swimming pattern.

The E. coli cell must ignore past stimulus responses so that it can compare the most recent attractant or repellent concentration with the immediately previous one and respond to any changes. This means that it must be able to adapt to a concentration change in order to detect still further changes. That is, it must have a short-term memory with a retention time of only seconds. This adaptation is accomplished by methylation of the MCP receptors. The cytoplasmic portion or domain of MCP molecules usually has about four or five methylation sites containing special glutamic acid residues. Methyls can be added to these glutamic acid carboxyl groups using S-adenosylmethionine as the methylating agent. The reaction is catalyzed by the CheR protein and occurs at a fairly steady rate regardless of the attractant level. Methyl groups are hydrolytically removed from MCPs by the phosphorylated CheB protein, a methylesterase. These enzymes are part of a feedback circuit that stops motor responses a short time after they have commenced.

The attractant-MCP complex is a good substrate for CheR and a poor substrate for CheB. When an attractant binds to the MCP, the levels of both CheY phosphate and CheB phosphate drop because the autophosphorylation of CheA is inhibited. This not only causes counterclockwise rotation and a run, but also lowers methylesterase activity so that MCP methylation increases. Increased methylation changes the conformation of the MCP so that it again supports an intermediate level of CheA autophosphorylation.
CheY phosphate and CheB phosphate return to intermediate levels and restore the normal run-tumble behavior. Removal of the attractant causes the overmethylated MCP to stimulate CheA autophosphorylation and the levels of CheY phosphate and CheB phosphate increase. This induces tumbling and simultaneously promotes MCP demethylation so that the system returns to an intermediate level of CheA autophosphorylation.

The chemotactic response is a very complex one involving many different proteins and two forms of covalent protein regulation (see section 8.9). The actual response arises from a combination of (1) the control of CheA phosphorylation by attractant and repellent levels; (2) the clockwise rotation promoted by phosphorylated CheY; and (3) a feedback regulatory circuit involving CheR, phosphorylated CheB, and variations in MCP methylation.

1. What is a two-component phosphorelay system?
2. Explain in a general way how bacteria are attracted to substances like nutrients while being repelled by toxic materials.
3. Describe the molecular mechanism by which molecules attract E. coli.

12.7 Control of the Cell Cycle

Although much progress has been made in understanding the control of microbial enzyme activity and pathway function, much less is known about the regulation of more complex events such as bacterial sporulation and cell division. This section briefly describes the regulation of bacterial cell division. Attention is focused primarily on E. coli because it has been intensively studied.

The complete sequence of events extending from the formation of a new cell through the next division is called the cell cycle. A young E. coli cell growing at a constant rate will double in length without changing in diameter, then divide into two cells of equal size by transverse fission.
Because each daughter cell receives at least one copy of the genetic material, DNA replication and cell division must be tightly coordinated. In fact, if DNA synthesis is inhibited by a drug or a gene mutation, cell division is also blocked and the affected cells continue to elongate, forming long filaments. Termination of DNA replication also seems connected in some way with cell division. Although the growth rate of E. coli at 37°C may vary considerably, division usually takes place about 20 minutes after replication has finished. During this final interval the genetic material must be distributed between the daughter cells. The newly formed DNA copies are attached to adjacent sites on the plasma membrane at or close to the center of the cell, probably at their replication factories (figure 12.34a). It is not yet clear how the two copies are separated.

Figure 12.34 DNA Replication in Bacteria. (a) In slowly growing bacteria, the chromosome is replicated once before division. In this simplified illustration, the replicating chromosomal DNA is spooled through a membrane-bound replication factory, the factory then divides into two replication foci, and eventually the duplicated chromosomes separate and move to opposite ends of the cell. O is the origin of replication and T is the termination region. (b) DNA replication in rapidly growing bacteria is more complex. A new round of DNA replication is initiated before the original cell divides so that the DNA in daughter cells is already partially replicated. The colored circles at the ends of the DNA loops are replication origins; the black circles along the sides are replication forks. Newly synthesized DNA is in color, with the red representing the most recently synthesized DNA. Membrane attachments are not shown for sake of simplicity.

ParA, ParB, MukB and other proteins are involved in DNA partitioning.
The ParA and ParB proteins are localized at the poles of late predivisional cells and may be part of a mitotic-like apparatus for bacterial division. There is some evidence for active separation of chromosomes by a force-generating mechanism. Possibly the chromosomes move apart because they are pushed by the replication factories and pulled by some sort of “mitotic” apparatus. DNA movement also may result from membrane growth and cell wall synthesis, but membrane growth is too slow to account for all the movement. After the chromosomes have been separated, a cross wall or septum forms between them.

Patterns of cell wall formation (p. 223)

Figure 12.35 Control of the Cell Cycle in E. coli. A 60-minute interval between divisions has been assumed for purposes of simplicity (the actual time between cell divisions may be shorter). E. coli requires about 40 minutes to replicate its DNA and 20 minutes after termination of replication to prepare for division. The position of events on the time line is approximate and meant to show the general pattern of occurrences. (Diagram labels: initiation mass reached; initiation of DNA replication; DNA replication and partition; partitioned DNA copies; threshold length reached; initiation of division process; division proteins and septum precursors; septation; division. Time axis: 0 to 60 minutes.)

Current evidence suggests that two sequences of events, operating in parallel but independently, control division and the cell cycle (figure 12.35). Like eucaryotic cells, bacteria must reach a specific threshold size or initiation mass to trigger DNA replication. E. coli also has to reach a threshold length before it can partition its chromosomes and divide into two cells.
Thus there seem to be two separate controls for the cell cycle, one sensitive to cell mass and the other responding to cell length. DNA replication takes about 40 minutes to complete. Some of the E. coli cell cycle control mechanisms are becoming clearer, although much remains to be learned. The initiation of DNA replication requires binding of many copies of the DnaA protein to oriC, the replication origin site (see pp. 235–39). Active DnaA protein has bound ATP, and the interconversion between DnaA-ATP and DnaA-ADP may help regulate initiation. Other factors also appear to participate in initiating DNA replication. After DNA replication is under way, another round does not immediately begin, partly because the parental DNA strand is methylated right after replication. The methylated replication origin binds to specific areas on the plasma membrane and is inactive. The initiation of septation is equally complex and tightly regulated. Both termination of DNA replication and the attainment of threshold length are required to trigger septation and cell division. This is at least partly due to the inhibition of septation by the proximity of chromosomes. The presence of DNA damage inhibits septation as well. A cell will complete chromosome replication, repair any DNA damage, and partition the chromosomes into opposite ends before it forms a septum and divides. There probably are one or more regulatory proteins that interact with various division proteins to promote septum formation and division. An adequate supply of peptidoglycan chains or precursors also must be available at the proper time. The FtsZ protein is a division protein essential in initiating septation. This protein is scattered throughout the cell between divisions. At the onset of septation, it forms a Z ring at the septation site, and the ring then becomes smaller. 
Because the FtsZ protein hydrolyzes GTP, the ring is a contractile structure that uses GTP energy; it also determines the placement of the septum. Several other proteins also are required for cell division. For example, PBP3, or penicillin-binding protein 3, catalyzes peptidoglycan transglycosylation and peptidoglycan transpeptidation to help create the new cell wall (see pp. 221–23). Clearly regulation of the bacterial cell cycle is complex and involves several interacting regulatory mechanisms.

The relationship of DNA synthesis to the cell cycle varies with the growth rate. If E. coli is growing with a doubling time of about 60 minutes, DNA replication does not take place during the last 20 minutes; that is, replication is a discontinuous process when the doubling time is 60 minutes or longer. When the culture is growing with a doubling time of less than 60 minutes, a second round of replication begins while the first round is still under way (figure 12.34b). The daughter cells may actually receive DNA with two or more replication forks, and replication is continuous because the cells are always copying their DNA.

Two decades of research have provided a fairly adequate overall picture of the cell cycle in E. coli. Several cell division genes have been identified. Yet it still is not known precisely how the cycle is controlled. Future work should improve our understanding of this important process.

1. What is a cell cycle? Briefly describe how the cycle in E. coli is regulated and how cycle timing results.
2. How are the two DNA copies separated and apportioned between the two daughter cells?

Summary

1. Procaryotic mRNA has nontranslated leader and trailer sequences at its ends. Spacer regions exist between genes when mRNA is polygenic.
2. RNA is synthesized by RNA polymerase that copies the sequence of the DNA template strand (figure 12.2).
3. The sigma factor helps the procaryotic RNA polymerase bind to the promoter region at the start of a gene.
4. A terminator marks the end of a gene. A rho factor is needed for RNA polymerase release from some terminators.
5. RNA polymerase II synthesizes heterogeneous nuclear RNA, which then undergoes posttranscriptional modification by RNA cleavage and addition of a 3′ poly-A sequence and a 5′ cap to generate eucaryotic mRNA (figure 12.5).
6. Many eucaryotic genes are split or interrupted genes that have exons and introns. Exons are joined by RNA splicing. Splicing involves small nuclear RNA molecules, spliceosomes, and sometimes ribozymes.
7. In translation, ribosomes attach to mRNA and synthesize a polypeptide beginning at the N-terminal end. A polysome or polyribosome is a complex of mRNA with several ribosomes.
8. Amino acids are activated for protein synthesis by attachment to the 3′ end of transfer RNAs. Activation requires ATP, and the reaction is catalyzed by aminoacyl-tRNA synthetases.
9. Ribosomes are large, complex organelles composed of rRNAs and many polypeptides. Amino acids are added to a growing peptide chain at the translational domain.
10. Protein synthesis begins with the binding of fMet-tRNA (procaryotes) or an initiator methionyl-tRNAMet (eucaryotes) to an initiator codon on mRNA and to the two ribosomal subunits. This involves the participation of protein initiation factors (figure 12.14).
11. In the elongation cycle the proper aminoacyl-tRNA binds to the A site with the aid of EF-Tu and GTP (figure 12.15). Then the transpeptidation reaction is catalyzed by peptidyl transferase. Finally, during translocation, the peptidyl-tRNA moves to the P site and the ribosome travels along the mRNA one codon. Translocation requires GTP and EF-G or translocase. The empty tRNA leaves the ribosome by way of the exit site.
12. Protein synthesis stops when a nonsense codon is reached. Procaryotes require three release factors for codon recognition and ribosome dissociation from the mRNA.
13. Molecular chaperones help proteins fold properly, protect cells against environmental stresses, and transport proteins across membranes.
14. Procaryotic proteins may not fold until completely synthesized, whereas eucaryotic protein domains fold as they leave the ribosome. Some proteins are self-splicing and excise portions of themselves before folding into their final shape.
15. β-Galactosidase is an inducible enzyme whose concentration rises in the presence of its inducer.
16. Many biosynthetic enzymes are repressible enzymes whose levels are reduced in the presence of end products called corepressors.
17. Induction and repression result from regulation of the rate of transcription by repressor proteins coded for by regulator genes. This is an example of negative control. A regulator gene mutation can lead to a constitutive mutant, which continuously produces a metabolite.
18. The repressor inhibits transcription by binding to an operator and interfering with the binding of RNA polymerase to its promoter (figure 12.22).
19. In inducible systems the newly synthesized repressor protein is active, and inducer binding inactivates it. In contrast, an inactive repressor or aporepressor is synthesized in a repressible system and is activated by the corepressor (figure 12.23).
20. Often one repressor regulates the synthesis of several enzymes because they are part of a single operon, a DNA sequence coding for one or more polypeptides and the operator controlling its expression.
21. Positive operon control of the lac operon is due to the catabolite activator protein, which is activated by cAMP (figure 12.27).
22. In the tryptophan operon a leader region lies between the operator and the first structural gene (figure 12.29). It codes for the synthesis of a leader peptide and contains an attenuator, a rho-independent termination site.
23. The synthesis of the leader peptide by a ribosome while RNA polymerase is transcribing the leader region regulates transcription; therefore the tryptophan operon is expressed only when there is insufficient tryptophan available. This mechanism of transcription control is called attenuation (figure 12.30).
24. Global regulatory systems can control many operons simultaneously and help procaryotes respond rapidly to a wide variety of environmental challenges.
25. Catabolite repression probably contributes to diauxic growth when E. coli is cultured in the presence of both glucose and lactose.
26. The transcription of genes can be regulated by altering the promoters to which RNA polymerase binds by changing the available sigma factors. A good example is the control of sporulation.
27. Small antisense RNA molecules regulate the expression of some genes. They can affect DNA replication, RNA transcription, or translation. For example, they help control porin protein levels.
28. Two-component phosphorelay systems are signal transduction systems that use phosphoryl group transfers in regulation. They have a sensor kinase and a response regulator.
29. Sporulation and chemotaxis regulatory systems are two-component phosphorelay systems.
30. The complete sequence of events extending from the formation of a new cell through the next division is called the cell cycle.
31. The end of DNA replication is tightly linked to cell division, so division in E. coli usually takes place about 20 minutes after replication is finished (figure 12.34). Special division and regulatory proteins are involved.
32. In very rapidly dividing bacterial cells, a new round of DNA replication begins before the cells divide.
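The timing behind summary points 31 and 32 comes down to simple arithmetic, which the Python sketch below works through. It assumes, as a simplification, that replication always takes about 40 minutes and that division follows about 20 minutes after replication ends, whatever the doubling time. The function name `initiation_timing` and its parameter names are illustrative and do not come from the text.

```python
import math


def initiation_timing(doubling_time, replication_time=40.0, post_replication_time=20.0):
    """Locate the start of the replication round that finishes before a given division.

    Returns (generations_back, cell_age_at_initiation):
      generations_back       -- 0 if the round starts in the dividing cell itself,
                                1 if it starts in its mother, 2 in its grandmother, ...
      cell_age_at_initiation -- minutes since the birth of the cell in which
                                the round starts.
    All times are in minutes. Constant replication and post-replication
    periods are an assumption made for illustration.
    """
    # Replication must begin this long before the division it serves.
    lead = replication_time + post_replication_time

    # Number of whole generations the start is pushed back into ancestor cells.
    generations_back = max(0, math.ceil(lead / doubling_time) - 1)

    # Age of that ancestor (or of the dividing cell itself) at initiation.
    age = (generations_back + 1) * doubling_time - lead
    return generations_back, age


if __name__ == "__main__":
    for tau in (120, 60, 40, 30, 20):
        back, age = initiation_timing(tau)
        print(f"doubling time {tau:>3} min: starts {back} generation(s) back, "
              f"at cell age {age:.0f} min")
```

Under these assumptions, a cell with a 60-minute doubling time starts replication at birth and finishes it 20 minutes before dividing. With a 30-minute doubling time the round must start at the birth of the mother cell, so each daughter inherits DNA that is already partly replicated, as in figure 12.34b.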
Key Terms

amino acid activation 266
aminoacyl or acceptor site (A site) 270
aminoacyl-tRNA synthetases 267
anticodon triplet 266
antisense RNA 283
aporepressor 276
attenuation 281
attenuator 279
catabolite activator protein (CAP) 278
catabolite repression 281
cell cycle 285
constitutive mutant 276
core enzyme 262
corepressor 276
3′,5′-cyclic adenosine monophosphate (cAMP) 278
cyclic AMP receptor protein (CRP) 278
diauxic growth 281
domains 274
elongation cycle 270
elongation factors 270
exit site (E site) 270
exon 263
exteins 275
global regulatory systems 281
heat-shock proteins 273
heterogeneous nuclear RNA (hnRNA) 263
inducer 275
inducible enzyme 275
initiation factors 270
initiator codon 269
inteins 275
intron 263
leader region 279
leader sequence 261
methyl-accepting chemotaxis protein (MCP) 284
molecular chaperones 272
negative control 276
nonsense codons 270
operator 276
operon 277
peptidyl or donor site (P site) 270
peptidyl transferase 270
polyribosome 266
positive operon control 278
posttranscriptional modification 263
Pribnow box 262
promoter 262
protein splicing 275
regulon 281
release factors 270
repressible enzyme 276
repressor proteins 276
rho factor 263
ribosomal RNA (rRNA) 261
ribozyme 264
RNA polymerase 261
RNA splicing 264
sigma factor 262
small nuclear RNA (snRNA) 264
spliceosome 264
split or interrupted genes 263
structural gene 277
terminator 263
transfer RNA (tRNA) 261
translocation 270
transpeptidation reaction 270
two-component phosphorelay system 283

Questions for Thought and Review

1. Describe how RNA polymerase transcribes procaryotic DNA. How does the polymerase know where to begin and end transcription?
2. How do eucaryotic RNA polymerases and promoters differ from those in procaryotes?
In what ways does eucaryotic mRNA differ from procaryotic mRNA with respect to synthesis and structure? How does eucaryotic synthesis of rRNA and tRNA resemble that of mRNA? How does it differ?
3. Draw diagrams summarizing the sequence of events in the three stages of protein synthesis (initiation, elongation, and termination) and accounting for the energy requirements of translation.
4. Describe in some detail the organization of the regulatory systems responsible for induction and repression, and the mechanism of their operation.
5. How is E. coli able to use glucose exclusively when presented with a mixture of glucose and lactose?
6. Of what practical importance is attenuation in coordinating the synthesis of amino acids and proteins? Describe how attenuation activity would vary when protein synthesis suddenly rapidly accelerated, then later suddenly decelerated.
7. How does the timing of DNA replication seem to differ between slow-growing and fast-growing cells? Be able to account for the fact that bacterial cells may contain more than a single copy of DNA.

Critical Thinking Questions

1. Attenuation affects anabolic pathways, whereas repression can affect either anabolic or catabolic pathways. Provide an explanation for this.
2. Many people say that RNA was the first of the information molecules (RNA, DNA, protein) to occur during evolution. Given the information in this chapter, what evidence is there to support this hypothesis?
3. Compare and contrast RNA and DNA synthesis.

Prescott−Harley−Klein: Microbiology, Fifth Edition IV. Microbial Molecular Biology and Genetics 13. Microbial Recombination and Plasmids © The McGraw−Hill Companies, 2002

CHAPTER 13
Microbial Recombination and Plasmids

The scanning electron micrograph shows Streptococcus pneumoniae, the bacterium first used to study transformation and obtain evidence that DNA is the genetic material of organisms.

Outline

13.1 Bacterial Recombination: General Principles 292
13.2 Bacterial Plasmids 294
    Fertility Factors 295
    Resistance Factors 297
    Col Plasmids 297
    Other Types of Plasmids 297
13.3 Transposable Elements 298
13.4 Bacterial Conjugation 302
    F+ × F− Mating 302
    Hfr Conjugation 303
    F′ Conjugation 303
13.5 DNA Transformation 305
13.6 Transduction 307
    Generalized Transduction 308
    Specialized Transduction 309
13.7 Mapping the Genome 312
13.8 Recombination and Genome Mapping in Viruses 314

Concepts

1.
Recombination is a one-way process in procaryotes: a piece of genetic material (the exogenote) is donated to the chromosome of a recipient cell (the endogenote) and integrated into it.
2. The actual transfer of genetic material between bacteria usually takes place in one of three ways: direct transfer between two bacteria temporarily in physical contact (conjugation), transfer of a naked DNA fragment (transformation), or transport of bacterial DNA by bacteriophages (transduction).
3. Plasmids and transposable elements can move genetic material between bacterial chromosomes and within chromosomes to cause rapid changes in genomes and drastically alter phenotypes.
4. The bacterial chromosome can be mapped with great precision, using Hfr conjugation in combination with transformational and transductional mapping techniques.
5. Recombination of virus genomes occurs when two viruses with homologous chromosomes infect a host cell at the same time.

Deep in the cavern of the infant’s breast
The father’s nature lurks, and lives anew.
(Horace, Odes)

Chapter 12 introduces the fundamentals of molecular genetics: the way genetic information is organized and stored, the nature of mutations and techniques for their isolation and study, and DNA repair. This chapter focuses on genetic recombination in microorganisms, with primary emphasis placed on recombination in bacteria and viruses. The chapter begins with a general overview of bacterial recombination and an introduction to both bacterial plasmids and transposable elements. Next, the three types of bacterial gene transfer (conjugation, transformation, and transduction) are discussed.
Because an understanding of the techniques used to locate genes on chromosomes depends on knowledge of recombination mechanisms, mapping the bacterial genome is discussed after the introduction to recombination. The chapter ends with a description of recombination in viruses and a brief discussion of viral chromosome mapping.

In a general sense, recombination is the process in which one or more nucleic acid molecules are rearranged or combined to produce a new nucleotide sequence. Usually genetic material from two parents is combined to produce a recombinant chromosome with a new, different genotype. Recombination results in a new arrangement of genes or parts of genes and normally is accompanied by a phenotypic change.

Most eucaryotes exhibit a complete sexual life cycle, including meiosis, a process of extreme importance in generating new combinations of alleles (alternate forms of a particular gene) through recombination. These chromosome exchanges during meiosis result from crossing-over between homologous chromosomes, chromosomes containing identical sequences of genes (figure 13.1). Until about 1945 the primary focus in genetic analysis was on the recombination of genes in plants and animals. The early work on recombination in higher eucaryotes laid the foundations of classical genetics, but it was the development of bacterial and phage genetics between about 1945 and 1965 that really stimulated a rapid advance in our understanding of molecular genetics.

Meiosis (pp. 87–88)

Figure 13.1 Crossing-Over. An example of recombination through crossing-over between homologous eucaryotic chromosomes. The Nn gene pair is exchanged. This process usually occurs during meiosis.

13.1 Bacterial Recombination: General Principles

Microorganisms carry out several types of recombination. General recombination, the most common form, usually involves a reciprocal exchange between a pair of homologous DNA sequences. It can occur anywhere on the chromosome, and it results from DNA strand breakage and reunion leading to crossing-over (figure 13.2). General recombination is carried out by the products of rec genes such as the recA protein so important for DNA repair (see pp. 254–55). In bacterial transformation a nonreciprocal form of general recombination takes place (figure 13.3). A piece of genetic material is inserted into the chromosome through the incorporation of a single strand to form a stretch of heteroduplex DNA.

A second type of recombination, one particularly important in the integration of virus genomes into bacterial chromosomes, is site-specific recombination. The genetic material is not homologous with the chromosome it joins, and generally the enzymes responsible for this event are specific for the particular virus and its host. A third kind of recombination, which may be considered a type of site-specific recombination, is called replicative recombination. It accompanies the replication of genetic material and does not depend on sequence homology. It is used by some genetic elements that move about the chromosome.

DNA replication (pp. 235–39)

Figure 13.2 The Holliday Model for Reciprocal General Recombination. Steps labeled in the figure: strand nicking; strand exchange; ligating nicked strands after exchange; branch migration to create more hybrid DNA; rotation of two of the arms to give the chi form (equivalent structures); endonuclease cuts in the branch region to produce recombinants; filling in gaps and ligating nicks. Source: From H. Potter and D. Dressler, Proceedings of the National Academy of Sciences, 73:3000, 1976; after R. Holliday, Genetics, 78:273, 1974; and previous publications cited therein.

Figure 13.3 Nonreciprocal General Recombination. The Fox model for nonreciprocal general recombination. This mechanism has been proposed for the recombination occurring during transformation in some bacteria. Steps labeled in the figure: association of homologous donor and host segments; strand separation and pairing; endonuclease nick at the arrow on the donor strand; endonuclease nicks the host strand; gaps in the strand filled and ligated to give heteroduplex DNA.

Although sexual reproduction with the formation of a zygote and subsequent meiosis is not present in bacteria, recombination can take place in several ways following horizontal gene transfer. In this process genes are transferred from one independent, mature organism to another. Horizontal gene transfer is quite different from the transmission of genes from parents to offspring (vertical gene transfer). In general, a piece of donor DNA, the exogenote, must enter the recipient cell and become a stable part of the recipient cell’s genome, the endogenote. Two kinds of DNA can move between bacteria. If a DNA fragment is the exchange vehicle, then the exogenote must get into the recipient cell and become incorporated into the endogenote as a replacement piece (or as an “extra” piece) without being destroyed by the host. During replacement of host genetic material, the recipient cell becomes temporarily diploid for a portion of the genome and is called a merozygote (figure 13.4). Sometimes the DNA exists in a form that cannot be degraded by the recipient cell’s endonucleases. In this case the DNA does not need to be integrated into the host genome but must only enter the recipient to confer its genetic information on the cell. Most linear DNA fragments are not stably maintained unless they have been integrated into the bacterial genome. Resistant DNA, such as that in plasmids (see following), usually is circular and has sequences that allow it to maintain itself independent of the host chromosome.

Recombination in bacteria is a one-way gene transfer from donor to recipient. Recombination in eucaryotes is reciprocal; that is, all of the DNA is conserved in the gametes that eventually arise from meiosis and recombination. Movement of DNA from a donor bacterium to the recipient can take place in three ways: direct transfer between two bacteria temporarily in physical contact (conjugation), transfer of a naked DNA fragment (transformation), and transport of bacterial DNA by bacteriophages (transduction). Whatever the mode of transfer, the exogenote has only four possible fates in the recipient (figure 13.4). First, when the exogenote has a sequence homologous to that of the endogenote, integration may occur; that is, it may pair with the recipient DNA and be incorporated to yield a recombinant genome. Second, the foreign DNA sometimes persists outside the endogenote and replicates to produce a clone of partially diploid cells. Third, the exogenote may survive, but not replicate, so that only one cell is a partial diploid. Finally, host cell nucleases may degrade the exogenote, a process called host restriction.

Figure 13.4 The Production and Fate of Merozygotes. See text for discussion. Labels in the figure: exogenote entering by conjugation, transformation, or transduction; endogenote; merozygote; integrated exogenote; partial diploid clone; partial diploid cell; host restriction.

1. Define the following terms: recombination, crossing-over, general recombination, site-specific recombination, replicative recombination, exogenote, endogenote, horizontal gene transfer, merozygote, and host restriction.
2. Distinguish among the three forms of recombination mentioned in this section.
3. What four fates can DNA have after entering a bacterium?

13.2 Bacterial Plasmids

Conjugation, the transfer of DNA between bacteria involving direct contact, depends on the presence of an “extra” piece of circular DNA known as a plasmid. Plasmids play many important roles in the lives of bacteria. They also have proved invaluable to microbiologists and molecular geneticists in constructing and transferring new genetic combinations and in cloning genes (see chapter 14). In this section the different types of bacterial plasmids are discussed.

Plasmids are small double-stranded DNA molecules, usually circular, that can exist independently of host chromosomes and are present in many bacteria (they are also present in some yeasts and other fungi). They have their own replication origins and are autonomously replicating and stably inherited. A replicon is a DNA molecule or sequence that has a replication origin and is capable of being replicated. Plasmids and bacterial chromosomes are separate replicons. Plasmids have relatively few genes, generally fewer than 30. Their genetic information is not essential to the host, and bacteria that lack them usually function normally. Single-copy plasmids produce only one copy per host cell. Multicopy plasmids may be present at concentrations of 40 or more per cell. Characteristically, plasmids can be eliminated from host cells in a process known as curing.
Curing may occur spontaneously or be induced by treatments that inhibit plasmid replication while not affecting host cell reproduction. The inhibited plasmids are slowly diluted out of the growing bacterial population. Some commonly used curing treatments are acridine mutagens, UV and ionizing radiation, thymine starvation, and growth above optimal temperatures.

Plasmids may be classified in terms of their mode of existence and spread. An episome is a plasmid that can exist either with or without being integrated into the host’s chromosome. Some plasmids, conjugative plasmids, have genes for pili and can transfer copies of themselves to other bacteria during conjugation. A brief summary of the types of plasmids and their properties is given in table 13.1.

Table 13.1 Major Types of Plasmids
Type | Representatives | Approximate Size (kbp) | Copy Number (Copies/Chromosome) | Hosts | Phenotypic Features (a)
Fertility Factor (b) | F factor | 95–100 | 1–3 | E. coli, Salmonella, Citrobacter | Sex pilus, conjugation
R Plasmids | RP4 | 54 | 1–3 | Pseudomonas and many other gram-negative bacteria | Sex pilus, conjugation, resistance to Ap, Km, Nm, Tc
 | R1 | 80 | 1–3 | Gram-negative bacteria | Resistance to Ap, Km, Su, Cm, Sm
 | R6 | 98 | 1–3 | E. coli, Proteus mirabilis | Su, Sm, Cm, Tc, Km, Nm
 | R100 | 90 | 1–3 | E. coli, Shigella, Salmonella, Proteus | Cm, Sm, Su, Tc, Hg
 | pSH6 | 21 | | Staphylococcus aureus | Gm, Tm, Km
 | pSJ23a | 36 | | S. aureus | Pn, Asa, Hg, Gm, Km, Nm, Em, etc.
 | pAD2 | 25 | | Enterococcus faecalis | Em, Km, Sm
Col Plasmids | ColE1 | 9 | 10–30 | E. coli | Colicin E1 production
 | ColE2 | | 10–15 | Shigella | Colicin E2
 | CloDF13 | | | Enterobacter cloacae | Cloacin DF13
Virulence Plasmids | Ent (P307) | 83 | | E. coli | Enterotoxin production
 | K88 plasmid | | | E. coli | Adherence antigens
 | ColV-K30 | 2 | | E. coli | Siderophore for iron uptake; resistance to immune mechanisms
 | pZA10 | 56 | | S. aureus | Enterotoxin B
 | Ti | 200 | | Agrobacterium tumefaciens | Tumor induction
Metabolic Plasmids | CAM | 230 | | Pseudomonas | Camphor degradation
 | SAL | 56 | | Pseudomonas | Salicylate degradation
 | TOL | 75 | | Pseudomonas putida | Toluene degradation
 | pJP4 | | | Pseudomonas | 2,4-dichlorophenoxyacetic acid degradation
 | | | | E. coli, Klebsiella, Salmonella | Lactose degradation
 | | | | Providencia | Urease
 | sym | | | Rhizobium | Nitrogen fixation and symbiosis
(a) Abbreviations used for resistance to antibiotics and metals: Ap, ampicillin; Asa, arsenate; Cm, chloramphenicol; Em, erythromycin; Gm, gentamycin; Hg, mercury; Km, kanamycin; Nm, neomycin; Pn, penicillin; Sm, streptomycin; Su, sulfonamides; Tc, tetracycline.
(b) Many R plasmids, metabolic plasmids, and others are also conjugative.

Fertility Factors

A plasmid called the fertility or F factor plays a major role in conjugation in E. coli and was the first to be described (figure 13.5). The F factor is about 100 kilobases long and bears genes responsible for cell attachment and plasmid transfer between specific bacterial strains during conjugation. Most of the information required for plasmid transfer is located in the tra operon, which contains at least 28 genes. Many of these direct the formation of sex pili that attach the F+ cell (the donor cell containing an F plasmid) to an F− cell (figure 13.6). Other gene products aid DNA transfer.

Sex pili (p. 63)

The F factor also has several segments called insertion sequences (p. 298) that assist plasmid integration into the host cell chromosome. Thus the F factor is an episome that can exist outside the bacterial chromosome or be integrated into it (figure 13.7).

Figure 13.5 The F plasmid. A map showing the size and general organization of the F plasmid. The plasmid contains several transposable elements. IS2 and IS3 are insertion sequences; γδ is also called transposon Tn1000. The tra genes code for proteins needed in pilus synthesis and conjugation. The rep genes code for proteins involved in DNA replication. OriV is the initiation site for circular DNA replication and oriT, the site for initiation of rolling circle replication and gene transfer during conjugation.

Figure 13.6 Bacterial Conjugation. An electron micrograph of two E. coli cells in an early stage of conjugation. The F+ cell to the right is covered with small pili or fimbriae, and a sex pilus connects the two cells.

Figure 13.7 F Plasmid Integration. The reversible integration of an F plasmid or factor into a host bacterial chromosome. The process begins with association between plasmid and bacterial insertion sequences. The O arrowhead (blue-green) indicates the site at which the oriented transfer of chromosome to the recipient cell begins. A, B, 1, and 2 represent genetic markers.

Resistance Factors

Plasmids often confer antibiotic resistance on the bacteria that contain them. R factors or plasmids typically have genes that code for enzymes capable of destroying or modifying antibiotics. They are not usually integrated into the host chromosome. Genes coding for resistance to antibiotics such as ampicillin, chloramphenicol, and kanamycin have been found in plasmids. Some R plasmids have only a single resistance gene, whereas others can have as many as eight. Often the resistance genes are within a transposon (p. 298), and thus it is possible for bacterial strains to rapidly develop multiple resistance plasmids.

R factors and antibiotic resistance (p. 819)

Because many R factors also are conjugative plasmids, they can spread throughout a population, although not as rapidly as the F factor. Often, nonconjugative R factors also move between bacteria during plasmid-promoted conjugation. Thus a whole population can become resistant to antibiotics. The fact that some of these plasmids are readily transferred between species further promotes the spread of resistance. When the host consumes large quantities of antibiotics, E. coli and other bacteria with R factors are selected for and become more prevalent. The R factors can then be transferred to more pathogenic genera such as Salmonella or Shigella, causing even greater public health problems (see section 35.7).

Col Plasmids

Bacteria also harbor plasmids with genes that may give them a competitive advantage in the microbial world. Bacteriocins are bacterial proteins that destroy other bacteria. They usually act only against closely related strains. Bacteriocins often kill cells by forming channels in the plasma membrane, thus increasing its permeability. They also may degrade DNA and RNA or attack peptidoglycan and weaken the cell wall. Col plasmids contain genes for the synthesis of bacteriocins known as colicins, which are directed against E. coli. Similar plasmids carry genes for bacteriocins against other species. For example, Col plasmids produce cloacins that kill Enterobacter species. Clearly the host is unaffected by the bacteriocin it produces. Some Col plasmids are conjugative and also can carry resistance genes.

Bacteriocins and host defenses (p. 712)

Other Types of Plasmids

Several other important types of plasmids have been discovered. Some plasmids, called virulence plasmids, make their hosts more pathogenic because the bacterium is better able to resist host defense or to produce toxins. For example, enterotoxigenic strains of E. coli cause traveler’s diarrhea because of a plasmid that codes for an enterotoxin (Box 13.1). Metabolic plasmids carry genes for enzymes that degrade substances such as aromatic compounds (toluene), pesticides (2,4-dichlorophenoxyacetic acid), and sugars (lactose). Metabolic plasmids even carry the genes required for some strains of Rhizobium to induce legume nodulation and carry out nitrogen fixation.

Box 13.1 Virulence Plasmids and Disease

It is becoming increasingly evident that many bacteria are pathogenic because of their plasmids. These plasmids can carry genes for toxins, render the bacterium better able to establish itself in the host, or aid in resistance to host defenses.

E. coli provides the best-studied example of virulence plasmids. Several strains of E. coli cause diarrhea. The enterotoxigenic strains responsible for traveler’s diarrhea can produce two toxins: a heat-labile toxin (LT), which is a large protein very similar in structure and mechanism of action to cholera toxin (see chapter 32), and a heat-stable toxin (ST), a low molecular weight polypeptide. Both toxin genes are plasmid borne, and sometimes they are even carried by the same plasmid. The ST toxin gene is located on a transposon. Enterotoxigenic strains of E. coli also must be able to colonize the epithelium of the small intestine to cause diarrhea. This is made possible by the presence of special adhesive fimbriae encoded by genes on another plasmid.

A second type of pathogenic E. coli invades the intestinal epithelium and causes a form of diarrhea very similar to the dysentery resulting from a Shigella infection. This E. coli strain and Shigella contain virulence plasmids that code for special cell wall antigens and other factors enabling them to enter and destroy epithelial cells.

Some E. coli strains can invade the blood and organs of a host, causing a generalized infection. These pathogens often have ColV plasmids and produce colicin V. The ColV plasmid carries genes for two virulence determinants. One product increases bacterial resistance to host defense mechanisms involving complement (see sections 31.7 and 32.3). The other plasmid gene directs the synthesis of a hydroxamate that enables E. coli to accumulate iron more efficiently from its surroundings (see section 5.6). Since iron is not readily available in the animal host, but is essential for bacterial growth, this is an important factor in pathogenicity.

Several other pathogens carry virulence plasmids. Some Staphylococcus aureus strains produce an exfoliative toxin that is plasmid borne. The toxin causes the skin to loosen and often peel off in sheets, leading to the disease staphylococcal scalded skin syndrome (see section 39.3). Other plasmid-borne toxins are the tetanus toxin of Clostridium tetani and the anthrax toxin of Bacillus anthracis.

1. Give the major distinguishing features of a plasmid. What is an episome? A conjugative plasmid?
2. Describe each of the following plasmids and their importance: F factor, R factor, Col plasmid, virulence plasmid, and metabolic plasmid.
Figure 13.8 Insertion Sequences and Transposons. The structure of insertion sequences (a), composite transposons (b), and target sites (c). IR stands for inverted repeat. In (c), the highlighted five-base target site is duplicated during Tn3 transposition to form flanking direct repeats. The remainder of Tn3 lies between the inverted repeats.
(a) Insertion sequence: IR, transposase gene, IR.
(b) A composite transposon: insertion sequence (bounded by IRs), other genes, insertion sequence (bounded by IRs).
(c) A target site for the Tn3 transposon.
Target site before insertion:
…TTTATTTTCCGAATTCCAAGCGCAGCC…
…AAATAAAAGGCTTAAGGTTCGCGTCGG…
After insertion (flanking direct repeat, portion of left inverted repeat of Tn3, remainder of Tn3, portion of right inverted repeat of Tn3, flanking direct repeat):
…TTTATTTTCCGAATTCGGGGTCTGACGCTCAG-………………-CTGAGCGTCAGACCCCAATTCCAAGCGCAGCC…
…AAATAAAAGGCTTAAGCCCCAGACTGCGAGTC-………………-GACTCGCAGTCTGGGGTTAAGGTTCGCGTCGG…

13.3 Transposable Elements

The chromosomes of bacteria, viruses, and eucaryotic cells contain pieces of DNA that move around the genome. Such movement is called transposition. DNA segments that carry the genes required for this process and consequently move about chromosomes are transposable elements or transposons. Unlike other processes that reorganize DNA, transposition does not require extensive areas of homology between the transposon and its destination site. Transposons behave somewhat like lysogenic prophages (see pp. 390–95) except that they originate in one chromosomal location and can move to a different location in the same chromosome. Transposable elements differ from phages in lacking a virus life cycle and from plasmids in being unable to reproduce autonomously and to exist apart from the chromosome. They were first discovered in the 1940s by Barbara McClintock during her studies on maize genetics (a discovery that won her the Nobel prize in 1983).
They have been most intensely studied in bacteria.

The simplest transposable elements are insertion sequences or IS elements (figure 13.8a). An IS element is a short sequence of DNA (around 750 to 1,600 base pairs [bp] in length) containing only the genes for those enzymes required for its transposition and bounded at both ends by identical or very similar sequences of nucleotides in reversed orientation known as inverted repeats (figure 13.8c). Inverted repeats are usually about 15 to 25 base pairs long and vary among IS elements so that each type of IS has its own characteristic inverted repeats. Between the inverted repeats is a gene that codes for an enzyme called transposase (and sometimes a gene for another essential protein). This enzyme is required for transposition and accurately recognizes the ends of the IS. Each type of element is named by giving it the prefix IS followed by a number. In E. coli several copies of different IS elements have been observed; some of their properties are given in table 13.2.

Table 13.2 The Properties of Selected Insertion Sequences
Insertion Sequence | Length (bp) | Inverted Repeat (Length in bp) | Target Site (Length in bp) | Number of Copies on E. coli Chromosome
IS1 | 768 | 23 | 9 or 8 | 6–10
IS2 | 1,327 | 41 | 5 | 4–13 (1) (a)
IS3 | 1,400 | 38 | 3–4 | 5–6 (2)
IS4 | 1,428 | 18 | 11 or 12 | 1–2
IS5 | 1,195 | 16 | 4 | 10–11
(a) The value in parentheses indicates the number of IS elements on the F factor.

Transposable elements also can contain genes other than those required for transposition (for example, antibiotic resistance or toxin genes). These elements often are called composite transposons or elements. Complete agreement about the nomenclature of transposable elements has not yet been reached. Sometimes transposable elements are called transposons when they have extra genes, and insertion sequences when they lack these. Composite transposons often consist of a central region containing the extra genes, flanked on both sides by IS elements that are identical or very similar in sequence (figure 13.8b). Many composite transposons are simpler in organization. They are bounded by short inverted repeats, and the coding region contains both transposition genes and the extra genes. It is believed that composite transposons are formed when two IS elements associate with a central segment containing one or more genes. This association could arise if an IS element replicates and moves only a gene or two down the chromosome. Composite transposon names begin with the prefix Tn. Some properties of selected composites are given in table 13.3.

Table 13.3 The Properties of Selected Composite Transposons
Transposon | Length (bp) | Terminal Repeat Length | Terminal Module | Genetic Markers (a)
Tn3 | 4,957 | 38 | | Ap
Tn501 | 8,200 | 38 | | Hg
Tn951 | 16,500 | Unknown | | Lactose utilization
Tn5 | 5,700 | | IS50 | Km
Tn9 | 2,500 | | IS1 | Cm
Tn10 | 9,300 | | IS10 | Tc
Tn903 | 3,100 | | IS903 | Km
Tn1681 | 2,061 | | IS1 | Heat-stable enterotoxin
Tn2901 | 11,000 | | IS1 | Arginine biosynthesis
(a) Abbreviations for antibiotics and metals same as in table 13.1.

The process of transposition in procaryotes involves a series of events, including self-replication and recombinational processes. Typically in bacteria, the original transposon remains at the parental site on the chromosome, while a replicated copy inserts at the target DNA (figure 13.8c). This is called replicative transposition. Target sites are specific sequences about five to nine base pairs long.
When a transposon inserts at a target site, the target sequence is duplicated so that short, direct-sequence repeats flank the transposon’s terminal inverted repeats (figure 13.9). This can be seen in figure 13.8c, where a copy of the five base pair target sequence appears at each end of the transposon in the same orientation.

The transposition of the Tn3 transposon is a well-studied example of replicative transposition. Its mechanism is outlined in figure 13.10. In the first stage the plasmid containing the Tn3 transposon fuses with the target plasmid to form a cointegrate molecule (figure 13.10, steps 1 to 4). This process requires the Tn3 transposase enzyme coded for by the tnpA gene (figure 13.11). Note that the cointegrate has two copies of the Tn3 transposon. In the second stage the cointegrate is resolved to yield two plasmids, each with a copy of the transposon (figure 13.10, steps 5 and 6). Resolution involves a crossover at the two res sites and is catalyzed by a resolvase enzyme coded for by the tnpR gene (figure 13.11).

Transposable elements produce a variety of important effects. They can insert within a gene to cause a mutation or stimulate DNA rearrangement, leading to deletions of genetic material. If a transposon insertion produces an obvious phenotypic change, the gene can be tracked by following this altered phenotype. One can fragment the genome and isolate the mutated fragment, thereby partially purifying the gene. Thus transposons may be used to purify genes and study their functions. Because some transposons carry stop codons or termination sequences, they may block translation or transcription. Other elements carry promoters and thus activate genes near the point of insertion. Eucaryotic genes as well as procaryotic genes can be turned on and off by transposon movement. Transposons also are located within plasmids and participate in such processes as plasmid fusion and the insertion of F plasmids into the E. coli chromosome (figure 13.7).
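The target-site duplication described for Tn3 can be traced on the sequences given in figure 13.8c. The following Python sketch is only an illustration under stated assumptions: the function names (`reverse_complement`, `insert_transposon`) are hypothetical, the host and inverted-repeat sequences are copied from the figure, and the interior of Tn3, which the figure elides, is stood in for by a run of N placeholders. The string slicing models the net result of a staggered cut followed by gap filling, not the enzymatic steps themselves.

```python
# Minimal sketch: target-site duplication on transposon insertion.
# Sequences are taken from figure 13.8c; the Tn3 interior is elided in
# the figure, so a placeholder of N's (an assumption) stands in for it.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def insert_transposon(host, target, transposon):
    """Insert `transposon` at the first occurrence of `target` in `host`.

    The target sequence ends up on both sides of the inserted element,
    which is the net effect of a staggered cut plus gap filling.
    """
    start = host.find(target)
    if start == -1:
        raise ValueError("target site not found in host sequence")
    end = start + len(target)
    # The left flank keeps the target; the right flank starts with a second copy.
    return host[:end] + transposon + host[start:]


HOST = "TTTATTTTCCGAATTCCAAGCGCAGCC"   # top strand before insertion
TARGET = "AATTC"                       # five-base target site
LEFT_IR = "GGGGTCTGACGCTCAG"           # portion of left inverted repeat
RIGHT_IR = "CTGAGCGTCAGACCCC"          # portion of right inverted repeat
TN3 = LEFT_IR + "N" * 10 + RIGHT_IR    # placeholder interior (assumption)

product = insert_transposon(HOST, TARGET, TN3)

# The two ends of the element are inverted repeats of one another.
assert reverse_complement(LEFT_IR) == RIGHT_IR

# The target site now flanks the element as direct repeats.
tn_start = product.find(TN3)
left_flank = product[:tn_start]
right_flank = product[tn_start + len(TN3):]
assert left_flank.endswith(TARGET) and right_flank.startswith(TARGET)

print(product)
```

Reading the printed sequence from left to right gives the same arrangement as the lower half of figure 13.8c: host DNA, target site, transposon bounded by its inverted repeats, a second copy of the target site, and the rest of the host DNA.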
In the previous discussion of plasmids, it was noted that an R plasmid can carry genes for resistance to several drugs. Many transposons carry antibiotic resistance genes and play a major role in generating these plasmids. Consequently the existence of these elements causes serious problems in the treatment of disease. Since plasmids can contain several different target sites, transposons can move between them; thus plasmids act as both the source and the target for transposons with resistance genes. In fact, multiple drug resistance plasmids probably often arise from transposon accumulation on a single plasmid (figure 13.11). Because transposons also move between plasmids and primary chromosomes, drug resistance genes can exchange between plasmids and chromosomes, resulting in the further spread of antibiotic resistance.

Figure 13.9 Generation of Direct Repeats in Host DNA Flanking a Transposon. (a) The arrows indicate where the two strands of host DNA will be cut in a staggered fashion, 9 base pairs apart. (b) After cutting. (c) The transposon (pink) has been ligated to one strand of host DNA at each end, leaving two 9 base gaps. (d) After the gaps are filled in, there are 9 base pair repeats of host DNA (purple boxes) at each end of the transposon.

Figure 13.10 Tn3 Transposition Mechanism. Stage 1: Fusion of the plasmids (tnpA gene). Step 1: The two plasmids are nicked to form the free ends labeled a–h. Step 2: Ends a and f are joined, as are g and d. This leaves b, c, e, and h free. Step 3: Two of these remaining free ends (b and c) serve as primers for DNA replication, which is shown in a blowup of the replicating region. Step 4: Replication continues until end b reaches e and end c reaches h. These ends are ligated to complete the cointegrate. Notice that the whole transposon has been replicated. The paired res sites are shown for the first time here, even though one res site existed in the previous steps. Stage 2: Resolution of the cointegrate (tnpR gene). Steps 5 and 6: A crossover occurs between the two res sites in the two copies of the transposon, leaving two independent plasmids, each bearing a copy of the transposon.

Figure 13.11 The Structure of R Plasmids and Transposons. The R1 plasmid carries resistance genes for five antibiotics: chloramphenicol (Cm), streptomycin (Sm), sulfonamide (Su), ampicillin (Ap), and kanamycin (Km). These are contained in the Tn3 and Tn4 transposons. The resistance transfer factor (RTF) codes for the proteins necessary for plasmid replication and transfer. The structure of Tn3 is shown in more detail: 38 base pair inverted repeats at each end, bounded by target site flanking repeats, and the genes tnpA (transposase, MW 120 K), tnpR (resolvase, MW 21 K), and bla (β-lactamase, MW 20 K). The arrows indicate the direction of gene transcription.
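The two-stage Tn3 mechanism outlined in figure 13.10 can be mimicked with a toy model in which each circular plasmid is a list of gene labels. This is a sketch, not a mechanistic simulation: the backbone labels (`ori_donor`, `Km`, `ori_target`, `Cm`), the function names, and the simplified gene order within the transposon are assumptions made for illustration. The point it shows is that the cointegrate holds two transposon copies and that a crossover between the two res sites yields two plasmids, each with one copy.

```python
# Simplified Tn3 gene content; the order is an assumption for illustration.
TN3 = ["tnpA", "res", "tnpR", "bla"]


def form_cointegrate(donor_backbone, target_backbone, transposon):
    """Stage 1 (transposase, tnpA): fuse the plasmids.

    Replication during fusion duplicates the transposon, so one copy
    sits at each junction between the two backbones.
    """
    return donor_backbone + transposon + target_backbone + transposon


def resolve(cointegrate):
    """Stage 2 (resolvase, tnpR): crossover between the two res sites.

    Each list is read as a circle, so a gene order that wraps around the
    end of the list is still contiguous on the plasmid.
    """
    res_sites = [i for i, gene in enumerate(cointegrate) if gene == "res"]
    if len(res_sites) != 2:
        raise ValueError("a cointegrate must contain exactly two res sites")
    first, second = res_sites
    circle_a = cointegrate[first:second]
    circle_b = cointegrate[second:] + cointegrate[:first]
    return circle_a, circle_b


donor = ["ori_donor", "Km"]    # hypothetical donor plasmid backbone
target = ["ori_target", "Cm"]  # hypothetical target plasmid backbone

cointegrate = form_cointegrate(donor, target, TN3)
plasmid_a, plasmid_b = resolve(cointegrate)
print(cointegrate.count("res"))  # 2
print(plasmid_a)
print(plasmid_b)
```

In this model the donor keeps its transposon and the target gains one, which is what distinguishes replicative transposition from a simple cut-and-paste move.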
Some transposons bear transfer genes and can move between bacteria through the process of conjugation, as discussed in the next section. A well-studied example of such a conjugative transposon is Tn916 from Enterococcus faecalis. Although Tn916 cannot replicate autonomously, it will transfer itself from E. faecalis to a variety of recipients and integrate into their chromosomes. Because it carries a gene for tetracycline resistance, this conjugative transposon also spreads drug resistance.

Transposable elements are widespread in nature. They are present in eucaryotes, bacteria, and the Archaea. For example, transposable elements have been found in yeast, maize, Drosophila, and humans. Clearly, transposable elements play an extremely important role in the generation and transfer of new gene combinations.

1. Define the following terms: transposition, transposable element or transposon, insertion sequence, transposase, conjugative transposon, and composite transposon. Be able to distinguish between an insertion sequence and a composite transposon.
2. How does transposition usually occur in bacteria, and what happens to the target site? What is replicative transposition?
3. Give several important effects transposable elements have on bacteria.

13.4 Bacterial Conjugation

The initial evidence for bacterial conjugation, the transfer of genetic information by direct cell to cell contact, came from an elegant experiment performed by Joshua Lederberg and Edward L. Tatum in 1946. They mixed two auxotrophic strains, incubated the culture for several hours in nutrient medium, and then plated it on minimal medium.
To reduce the chance that their results were due to simple reversion, they used double and triple auxotrophs on the assumption that two or three reversions would not often occur simultaneously. For example, one strain required biotin (Bio⁻), phenylalanine (Phe⁻), and cysteine (Cys⁻) for growth, and another needed threonine (Thr⁻), leucine (Leu⁻), and thiamine (Thi⁻). Recombinant prototrophic colonies appeared on the minimal medium after incubation (figure 13.12). Thus the chromosomes of the two auxotrophs were able to associate and undergo recombination.

Lederberg and Tatum did not directly prove that physical contact of the cells was necessary for gene transfer. This evidence was provided by Bernard Davis (1950), who constructed a U tube consisting of two pieces of curved glass tubing fused at the base to form a U shape with a fritted glass filter between the halves. The filter allows the passage of media but not bacteria. The U tube was filled with nutrient medium and each side inoculated with a different auxotrophic strain of E. coli (figure 13.13). During incubation, the medium was pumped back and forth through the filter to ensure medium exchange between the halves. After a 4 hour incubation, the bacteria were plated on minimal medium. Davis discovered that when the two auxotrophic strains were separated from each other by the fine filter, gene transfer could not take place. Therefore direct contact was required for the recombination that Lederberg and Tatum had observed.

Figure 13.12 Evidence for Bacterial Conjugation. Lederberg and Tatum’s demonstration of genetic recombination using triple auxotrophs. See text for details.

F⁺ × F⁻ Mating

In 1952 William Hayes demonstrated that the gene transfer observed by Lederberg and Tatum was polar. That is, there were definite donor (F⁺) and recipient (F⁻) strains, and gene transfer was nonreciprocal.
He also found that in F⁺ × F⁻ mating the progeny were only rarely changed with regard to auxotrophy (that is, bacterial genes were not often transferred), but F⁻ strains frequently became F⁺. These results are readily explained in terms of the F factor previously described (figure 13.5). The F⁺ strain contains an extrachromosomal F factor carrying the genes for pilus formation and plasmid transfer. During F⁺ × F⁻ mating or conjugation, the F factor replicates by the rolling-circle mechanism, and a copy moves to the recipient (figure 13.14a). The entering strand is copied to produce double-stranded DNA. Because bacterial chromosome genes are rarely transferred with the independent F factor, the recombination frequency is low.

It is still not completely clear how the plasmid moves between bacteria. The sex pilus or F pilus joins the donor and recipient and may contract to draw them together. The channel for DNA transfer could be either the hollow F pilus or a special conjugation bridge formed upon contact.

The rolling-circle mechanism of DNA replication (p. 236)

Although most research on plasmids and conjugation has been done using E. coli and other gram-negative bacteria, self-transmissible plasmids are present in gram-positive bacterial genera such as Bacillus, Streptococcus, Enterococcus, Staphylococcus, and Streptomyces. Much less is known about these systems. It appears that fewer transfer genes are involved, possibly because a sex pilus does not seem to be required for plasmid transfer. For example, Enterococcus faecalis recipient cells release short peptide chemical signals that activate transfer genes in donor cells containing the proper plasmid.
Donor and recipient cells directly adhere to one another through special plasmid-encoded proteins released by the activated donor cell. Plasmid transfer then occurs.

Figure 13.13 The U-Tube Experiment. The U-tube experiment used to show that genetic recombination by conjugation requires direct physical contact between bacteria. See text for details.

Hfr Conjugation

Because certain donor strains transfer bacterial genes with great efficiency and do not usually change recipient bacteria to donors, a second type of conjugation must exist. The F factor is an episome and can integrate into the bacterial chromosome at several different locations by recombination between homologous insertion sequences present on both the plasmid and host chromosomes. When integrated, the F plasmid’s tra operon is still functional; the plasmid can direct the synthesis of pili, carry out rolling-circle replication, and transfer genetic material to an F⁻ recipient cell. Such a donor is called an Hfr strain (for high frequency of recombination) because it exhibits a very high efficiency of chromosomal gene transfer in comparison with F⁺ cells. DNA transfer begins when the integrated F factor is nicked at its site of transfer origin (figure 13.14b). As it is replicated, the chromosome moves through the pilus or conjugation bridge connecting the donor and recipient. Because only part of the F factor is transferred at the start (the initial break is within the F plasmid), the F⁻ recipient does not become F⁺ unless the whole chromosome is transferred. Transfer of the entire chromosome takes about 100 minutes in E. coli, and the connection usually breaks before this process is finished. Thus a complete F factor usually is not transferred, and the recipient remains F⁻. As mentioned earlier, when an Hfr strain participates in conjugation, bacterial genes are frequently transferred to the recipient.
Gene transfer can be in either a clockwise or counterclockwise direction around the circular chromosome, depending on the orientation of the integrated F factor. After the replicated donor chromosome enters the recipient cell, it may be degraded or incorporated into the F⁻ genome by recombination.

Figure 13.14 The Mechanism of Bacterial Conjugation. (a) F⁺ × F⁻ mating: a pilus connects the cells, and the F factor is replicated and transferred. (b) Hfr × F⁻ mating (the integrated F factor is shown in red): a pilus connects the cells, donor DNA is replicated by the rolling-circle method and transferred, the connection breaks and ends transfer, and a fragment of donor DNA is incorporated into the recipient chromosome.

F′ Conjugation

Because the F plasmid is an episome, it can leave the bacterial chromosome. Sometimes during this process the plasmid makes an error in excision and picks up a portion of the chromosomal material to form an F′ plasmid (figure 13.15a). It is not unusual to observe the inclusion of one or more genes in excised F plasmids. The F′ cell retains all of its genes, although some of them are on the plasmid, and still mates only with an F⁻ recipient. F′ × F⁻ conjugation is virtually identical with F⁺ × F⁻ mating. Once again, the plasmid is transferred, but usually bacterial genes on the chromosome are not (figure 13.15b). Bacterial genes on the F′ plasmid are transferred with it and need not be incorporated into the recipient chromosome to be expressed.
The recipient becomes F′ and is a partially diploid merozygote since it has two sets of the genes carried by the plasmid. In this way specific bacterial genes may spread rapidly throughout a bacterial population. Such transfer of bacterial genes is often called sexduction.

F′ conjugation is very important to the microbial geneticist. A partial diploid’s behavior shows whether the allele carried by an F′ plasmid is dominant or recessive to the chromosomal gene. The formation of F′ plasmids also is useful in mapping the chromosome since if two genes are picked up by an F factor they must be neighbors.

1. What is bacterial conjugation and how was it discovered?
2. Distinguish between F⁺, Hfr, and F⁻ strains of E. coli with respect to their physical nature and role in conjugation.
3. Describe in some detail how F⁺ × F⁻ and Hfr conjugation processes proceed, and distinguish between the two in terms of mechanism and the final results.
4. What is F′ conjugation and why is it so useful to the microbial geneticist? How does the F′ plasmid differ from a regular F plasmid? What is sexduction?

13.5 DNA Transformation

The second way in which DNA can move between bacteria is through transformation, discovered by Fred Griffith in 1928. Transformation is the uptake by a cell of a naked DNA molecule or fragment from the medium and the incorporation of this molecule into the recipient chromosome in a heritable form. In natural transformation the DNA comes from a donor bacterium. The process is random, and any portion of a genome may be transferred between bacteria.

The discovery of transformation (pp. 228–29)

When bacteria lyse, they release considerable amounts of DNA into the surrounding environment. These fragments may be relatively large and contain several genes.
If a fragment contacts a competent cell, one able to take up DNA and be transformed, it can be bound to the cell and taken inside (figure 13.16a). The transformation frequency of very competent cells is around 10⁻³ for most genera when an excess of DNA is used. That is, about one cell in every thousand will take up and integrate the gene. Competency is a complex phenomenon and is dependent on several conditions. Bacteria need to be in a certain stage of growth; for example, S. pneumoniae becomes competent during the exponential phase when the population reaches about 10⁷ to 10⁸ cells per ml. When a population becomes competent, bacteria such as pneumococci secrete a small protein called the competence factor that stimulates the production of 8 to 10 new proteins required for transformation. Natural transformation has been discovered so far only in certain gram-positive and gram-negative genera: Streptococcus, Bacillus, Thermoactinomyces, Haemophilus, Neisseria, Moraxella, Acinetobacter, Azotobacter, and Pseudomonas. Other genera also may be capable of transformation. Gene transfer by this process occurs in soil and marine environments and may be an important route of genetic exchange in nature.

Figure 13.15 F′ Conjugation. (a) Due to an error in excision, the A gene of an Hfr cell is picked up by the F factor. (b) The A gene is then transferred to a recipient during conjugation.

Figure 13.16 Bacterial Transformation. Transformation with (a) DNA fragments and (b) plasmids. Transformation with a plasmid often is induced artificially in the laboratory. The transforming DNA is in red and integration is at a homologous region of the genome. See text for details.

Figure 13.17 The Mechanism of Transformation. (1) A long double-stranded DNA molecule binds to the surface with the aid of a DNA-binding protein and is nicked by a nuclease. (2) One strand is degraded by the nuclease. (3) The undegraded strand associates with a competence-specific protein. (4) The single strand enters the cell and is integrated into the host chromosome in place of the homologous region of the host DNA.

The mechanism of transformation has been intensively studied in S. pneumoniae (figure 13.17). A competent cell binds a double-stranded DNA fragment if the fragment is moderately large; the process is random, and donor fragments compete with each other. The DNA then is cleaved by endonucleases to double-stranded fragments about 5 to 15 kilobases in size. DNA uptake requires energy expenditure. One strand is hydrolyzed by an envelope-associated exonuclease during uptake; the other strand associates with small proteins and moves through the plasma membrane. The single-stranded fragment can then align with a homologous region of the genome and be integrated, probably by a mechanism similar to that depicted in figure 13.3.

Transformation in Haemophilus influenzae, a gram-negative bacterium, differs from that in S. pneumoniae in several respects. Haemophilus does not produce a competence factor to stimulate the development of competence, and it takes up DNA from only closely related species (S. pneumoniae is less particular about the source of its DNA). Double-stranded DNA, complexed with proteins, is taken in by membrane vesicles.
The specificity of Haemophilus transformation is due to a special 11 base pair sequence (5′AAGTGCGGTCA3′) that is repeated over 1,400 times in H. influenzae DNA. DNA must have this sequence to be bound by a competent cell.

Artificial transformation is carried out in the laboratory by a variety of techniques, including treatment of the cells with calcium chloride, which renders their membranes more permeable to DNA. This approach succeeds even with species that are not naturally competent, such as E. coli. Relatively high concentrations of DNA, higher than would normally be present in nature, are used to increase transformation frequency. When linear DNA fragments are to be used in transformation, E. coli usually is rendered deficient in one or more exonuclease activities to protect the transforming fragments. It is even easier to transform bacteria with plasmid DNA since plasmids are not as easily degraded as linear fragments and can replicate within the host (figure 13.16b). This is a common method for introducing recombinant DNA into bacterial cells (see sections 14.5 and 14.7). DNA from any source can be introduced into bacteria by splicing it into a plasmid before transformation.

1. Define transformation and competence.
2. Describe how transformation occurs in S. pneumoniae. How does the process differ in H. influenzae?
3. Discuss two ways in which artificial transformation can be used to place functional genes within bacterial cells.

13.6 Transduction

Bacterial viruses or bacteriophages participate in the third mode of bacterial gene transfer. These viruses have relatively simple structures in which virus genetic material is enclosed within an outer coat, composed mainly or solely of protein. The coat protects the genome and transmits it between host cells. The morphology and life cycle of bacteriophages are not discussed in detail until chapter 17. Nevertheless, it is necessary to briefly describe the life cycle here as background for a consideration of the bacteriophage’s role in gene transfer.

The lytic cycle (pp. 382–88); Lysogeny (pp. 390–95)

Figure 13.18 Lytic versus Lysogenic Infection by Phage Lambda. (a) Lytic infection. (b) Lysogenic infection. Viral and prophage DNA are in red.

After infecting the host cell, a bacteriophage (phage for short) often takes control and forces the host to make many copies of the virus. Eventually the host bacterium bursts or lyses and releases new phages. This reproductive cycle is called a lytic cycle because it ends in lysis of the host. The cycle has four phases (figure 13.18a). First, the virus particle attaches to a specific receptor site on the bacterial surface. The genetic material, which is often double-stranded DNA, then enters the cell. Second, after adsorption and penetration, the virus chromosome forces the bacterium to make virus nucleic acids and proteins. The third stage begins after the synthesis of virus components. Phages are assembled from these components. The assembly process may be complex, but in all cases phage nucleic acid is packed within the virus’s protein coat. Finally, the mature viruses are released by cell lysis.

Bacterial viruses that reproduce using a lytic cycle often are called virulent bacteriophages because they destroy the host cell. Many DNA phages, such as the lambda phage (see p. 391), are also capable of a different relationship with their host (figure 13.18b).
After adsorption and penetration, the viral genome does not take control of its host and destroy it while producing new phages. Instead the genome remains within the host cell and is reproduced along with the bacterial chromosome. A clone of infected cells arises and may grow for long periods while appearing perfectly normal. Each of these infected bacteria can produce phages and lyse under appropriate environmental conditions. This relationship between the phage and its host is called lysogeny. Bacteria that can produce phage particles under some conditions are said to be lysogens or lysogenic, and phages able to establish this relationship are temperate phages. The latent form of the virus genome that remains within the host without destroying it is called the prophage. The prophage usually is integrated into the bacterial genome (figure 13.18b). Sometimes phage reproduction is triggered in a lysogenized culture by exposure to UV radiation or other factors. The lysogens are then destroyed and new phages released. This phenomenon is called induction.

Figure 13.19 Generalized Transduction by Bacteriophages. After host DNA is destroyed and virus DNA and coat proteins are synthesized, a transducing particle is assembled and released by cell lysis. In the newly infected cell the donor DNA may be integrated into the recipient chromosome (stable gene transfer), survive without integration (abortive transduction), or be degraded (unsuccessful gene transfer). See text for details.

Transduction is the transfer of bacterial genes by viruses. Bacterial genes are incorporated into a phage capsid because of errors made during the virus life cycle.
The virus containing these genes then injects them into another bacterium, completing the transfer. Transduction may be the most common mechanism for gene exchange and recombination in bacteria. There are two very different kinds of transduction: generalized and specialized.

Generalized Transduction

Generalized transduction (figure 13.19) occurs during the lytic cycle of virulent and temperate phages and can transfer any part of the bacterial genome. During the assembly stage, when the viral chromosomes are packaged into protein capsids, random fragments of the partially degraded bacterial chromosome also may be packaged by mistake. Because the capsid can contain only a limited quantity of DNA, the viral DNA is left behind. The quantity of bacterial DNA carried depends primarily on the size of the capsid. The P22 phage of Salmonella typhimurium usually carries about 1% of the bacterial genome; the P1 phage of E. coli and a variety of gram-negative bacteria carries about 2.0 to 2.5% of the genome. The resulting virus particle often injects the DNA into another bacterial cell but does not initiate a lytic cycle. This phage is known as a generalized transducing particle or phage and is simply a carrier of genetic information from the original bacterium to another cell. As in transformation, once the DNA has been injected, it must be incorporated into the recipient cell’s chromosome to preserve the transferred genes. The DNA remains double stranded during transfer, and both strands are integrated into the endogenote’s genome. About 70 to 90% of the transferred DNA is not integrated but often is able to survive and express itself. Abortive transductants are bacteria that contain this nonintegrated, transduced DNA and are partial diploids.
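The percentages quoted above translate into absolute amounts of DNA once a genome size is supplied. The following is a rough back-of-the-envelope sketch, not a measurement: the genome size of about 4.6 million base pairs for E. coli is an assumption brought in for illustration (the text gives only percentages), and the function names are hypothetical.

```python
def transduced_dna_bp(genome_bp: int, percent_carried: float) -> float:
    """Base pairs of host DNA in one transducing particle."""
    return genome_bp * percent_carried / 100


def integrated_percent_range(not_integrated_low: float, not_integrated_high: float):
    """Percent of transferred DNA that is integrated, from the non-integrated range."""
    return 100 - not_integrated_high, 100 - not_integrated_low


# Assumption: approximate E. coli genome size, not stated in the text.
HOST_GENOME_BP = 4_600_000

# P1 carries about 2.0 to 2.5% of the genome (from the text).
p1_low = transduced_dna_bp(HOST_GENOME_BP, 2.0)
p1_high = transduced_dna_bp(HOST_GENOME_BP, 2.5)
print(f"P1 carries roughly {p1_low / 1000:.0f} to {p1_high / 1000:.0f} kb of host DNA")

# About 70 to 90% of transferred DNA is not integrated (from the text).
low, high = integrated_percent_range(70, 90)
print(f"About {low} to {high}% of transferred DNA is integrated")
```

Under that assumed genome size, a P1 transducing particle would hold on the order of 100 kilobases of host DNA, which is why generalized transduction can move many neighboring genes at once.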
Generalized transduction was discovered in 1951 by Joshua Lederberg and Norton Zinder during an attempt to show that conjugation, discovered several years earlier in E. coli, could occur in other bacterial species. Lederberg and Zinder were repeating the earlier experiments with Salmonella typhimurium. They found that incubation of a mixture of two multiply auxotrophic strains yielded prototrophs at the level of about one in 10⁵. This seemed like good evidence for bacterial recombination, and indeed it was, but their initial conclusion that the transfer resulted from conjugation was not borne out. When these investigators performed the U-tube experiment (figure 13.13) with Salmonella, they still recovered prototrophs. The filter in the U tube had small enough pores to block the movement of bacteria between the two sides but allowed the phage P22 to pass. Lederberg and Zinder had intended to confirm that conjugation was present in another bacterial species and had instead discovered a completely new mechanism of bacterial gene transfer. The seemingly routine piece of research led to surprising and important results. A scientist must always keep an open mind about results and be prepared for the unexpected.

Specialized Transduction

In specialized or restricted transduction, the transducing particle carries only specific portions of the bacterial genome. Specialized transduction is made possible by an error in the lysogenic life cycle. When a prophage is induced to leave the host chromosome, excision is sometimes carried out improperly. The resulting phage genome contains portions of the bacterial chromosome (about 5 to 10% of the bacterial DNA) next to the integration site, much like the situation with F′ plasmids (figure 13.20). A transducing phage genome usually is defective and lacks some part of its attachment site.
The transducing particle will inject bacterial genes into another bacterium, even though the defective phage cannot reproduce without assistance. The bacterial genes may become stably incorporated under the proper circumstances.

The best-studied example of specialized transduction is the lambda phage. The lambda genome inserts into the host chromosome at specific locations known as attachment or att sites (figure 13.21; see also figures 17.16 and 17.20). The phage att sites and bacterial att sites are similar and can complex with each other, although they are not identical. The att site for lambda is next to the gal and bio genes on the E. coli chromosome; consequently, specialized transducing lambda phages most often carry these bacterial genes. The lysate, or product of cell lysis, resulting from the induction of lysogenized E. coli contains normal phage and a few defective transducing particles. These particles are called lambda dgal because they carry the galactose utilization genes (figure 13.21). Because these lysates contain only a few transducing particles, they often are called low-frequency transduction lysates (LFT lysates). Whereas the normal phage has a complete att site, defective transducing particles have a nonfunctional hybrid integration site that is part bacterial and part phage in origin. Integration of the defective phage chromosome does not readily take place. Transducing phages also may have lost some genes essential for reproduction. Stable transductants can arise from recombination between the phage and the bacterial chromosome because of crossovers on both sides of the gal site (figure 13.21).

The lysogenic cycle of lambda phage (pp. 391–95)

Defective lambda phages carrying the gal gene can integrate if there is a normal lambda phage in the same cell. The normal phage will integrate, yielding two bacterial/phage hybrid att sites where the defective lambda dgal phage can insert (figure 13.21).
It also supplies the genes missing in the defective phage. The normal phage in this instance is termed the helper phage because it aids integration and reproduction of the defective phage. These transductants are unstable because the prophages can be induced to excise by agents such as UV radiation. Excision, however, produces a lysate containing a fairly equal mixture of defective lambda dgal phage and normal helper phage. Because it is very effective in transduction, the lysate is called a high-frequency transduction lysate (HFT lysate). Reinfection of bacteria with this mixture will result in the generation of considerably more transductants. LFT lysates and those produced by generalized transduction have one transducing particle in 10⁵ or 10⁶ phages; HFT lysates contain transducing particles with a frequency of about 0.1 to 0.5.

1. Briefly describe the lytic and lysogenic viral reproductive cycles. Define lysogeny, lysogen, temperate phage, prophage, and transduction.
2. Describe generalized transduction, how it occurs, and the way in which it was discovered. What is an abortive transductant?
3. What is specialized or restricted transduction and how does it come about? Be able to distinguish between LFT and HFT lysates and describe how they are formed.

Figure 13.20 Specialized Transduction by a Temperate Bacteriophage. Recombination can produce two types of transductants. See text for details.

Figure 13.21 The Mechanism of Transduction for Phage Lambda and E. coli. Integrated lambda phage lies next to the gal genes. When it excises normally (top left), the new phage is complete and contains no bacterial genes. Rarely excision occurs asymmetrically (top right), and the gal genes are then picked up and some phage genes are lost. The result is a defective lambda phage that carries bacterial genes and can transfer them to a new recipient. See text for details.

Figure 13.22 The Interrupted Mating Experiment. An interrupted mating experiment on Hfr × F⁻ conjugation. (a) The linear transfer of genes is stopped by breaking the conjugation bridge to study the sequence of gene entry into the recipient cell. (b) An example of the results obtained by an interrupted mating experiment, plotted as frequency of Hfr genetic characters among recombinants (%) against time (minutes). The gene order is lac-tsx-gal-trp.

13.7 Mapping the Genome

Finding the location of genes in any organism’s genome is a very complex task. This section surveys approaches to mapping the bacterial genome, using E. coli as an example.
All three modes of gene transfer and recombination have been used in mapping. Gene structure and the nature of mutations (pp. 241–53)

Hfr conjugation is frequently used to map the relative location of bacterial genes. This technique rests on the observation that during conjugation the linear chromosome moves from donor to recipient at a constant rate. In an interrupted mating experiment the conjugation bridge is broken and Hfr × F⁻ mating is stopped at various intervals after the start of conjugation by mixing the culture vigorously in a blender (figure 13.22a). The order and timing of gene transfer can be determined because they are a direct reflection of the order of genes on the bacterial chromosome (figure 13.22b). For example, extrapolation of the curves in figure 13.22b back to the x-axis will give the time at which each gene just began to enter the recipient. The result is a circular chromosome map with distances expressed in terms of the minutes elapsed until a gene is transferred. This technique can fairly precisely locate genes 3 minutes or more apart. The heights of the plateaus in figure 13.22b are lower for genes that are more distant from the F factor (the origin of transfer) because there is an ever-greater chance that the conjugation bridge will spontaneously break as transfer continues.

Because of the relatively large size of the E. coli genome, it is not possible to generate a map from one Hfr strain. Therefore several Hfr strains with the F plasmid integrated at different locations must be used and their maps superimposed on one another. The overall map is adjusted to 100 minutes, although complete transfer may require somewhat more than 100 minutes. In a sense, minutes are an indication of map distance and not strictly a measure of time. Zero time is set at the threonine (thr) locus.
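The extrapolation step used in interrupted mating can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample readings are hypothetical (they are not taken from figure 13.22b), and a straight-line least-squares fit through the rising part of each curve is one reasonable way to extrapolate back to the x-axis, not necessarily the procedure used to build published maps.

```python
def entry_time(times, freqs):
    """Return the x-intercept of the least-squares line through the points.

    times: minutes after mixing at which mating was interrupted
    freqs: percent of recombinants carrying the Hfr marker at those times
    Only points from the rising part of the curve should be supplied.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_f = sum(freqs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times, freqs))
    slope = sxy / sxx
    intercept = mean_f - slope * mean_t
    # The line crosses frequency = 0 at t = -intercept / slope.
    return -intercept / slope


def gene_order(curves):
    """Sort gene names by their extrapolated entry times (earliest first)."""
    return sorted(curves, key=lambda gene: entry_time(*curves[gene]))


# Hypothetical readings (time in minutes, frequency in percent), invented
# for illustration only.
CURVES = {
    "gal": ([20, 25, 30], [4, 14, 24]),
    "lac": ([10, 12, 15], [8, 16, 28]),
    "trp": ([27, 30, 35], [2, 5, 10]),
    "tsx": ([13, 15, 18], [6, 12, 21]),
}

if __name__ == "__main__":
    for gene in gene_order(CURVES):
        print(gene, round(entry_time(*CURVES[gene]), 1))
```

With these invented readings the genes sort into the order lac, tsx, gal, trp, and the differences between entry times are the map distances in minutes.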
Figure 13.23 E. coli Genetic Map. A circular genetic map of E. coli K12 with the location of selected genes. The inner circle shows the origin and direction of transfer of several Hfr strains. The map is divided into 100 minutes, the time required to transfer the chromosome from an Hfr cell to F⁻ at 37°C.

Gene linkage, or the proximity of two genes on a chromosome, also can be determined from transformation by measuring the frequency with which two or more genes simultaneously transform a recipient cell. Consider the case for cotransformation by two genes. In theory, a bacterium could simultaneously receive two genes, each carried on a separate DNA fragment. However, it is much more likely that the genes reside on the same fragment. If two genes are closely linked on the chromosome, then they should be able to cotransform. The closer the genes are together, the more often they will be carried on the same fragment and the higher will be the frequency of cotransformation. If genes are spaced a great distance apart, they will be carried on separate DNA fragments and the frequency of double transformants will equal the product of the individual transformation frequencies.

Generalized transduction can be used to obtain linkage information in much the same way as transformation.
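The cotransformation argument above can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the numerical frequencies are hypothetical. It compares an observed frequency of double transformants with the product of the two single-gene frequencies, which is the value expected when the genes travel on separate fragments.

```python
import math


def expected_if_unlinked(freq_a, freq_b):
    """Double-transformant frequency expected when the two genes are on
    separate DNA fragments: the product of the individual frequencies."""
    return freq_a * freq_b


def linkage_ratio(observed_double, freq_a, freq_b):
    """Ratio of observed to expected double transformants.

    A ratio near 1 is what the unlinked case predicts; a ratio far above 1
    suggests the genes are often carried on the same fragment.
    """
    return observed_double / expected_if_unlinked(freq_a, freq_b)


if __name__ == "__main__":
    # Hypothetical frequencies, chosen only to show the arithmetic.
    freq_a, freq_b = 1e-3, 2e-3
    print(expected_if_unlinked(freq_a, freq_b))
    print(linkage_ratio(5e-4, freq_a, freq_b))
    print(math.isclose(linkage_ratio(2e-6, freq_a, freq_b), 1.0))
```

The same comparison applies to cotransduction frequencies. How large a ratio counts as evidence of linkage is a judgment call that this sketch does not settle.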
Linkages usually are expressed as cotransduction frequencies, using the argument that the closer two genes are to each other, the more likely they both will reside on the DNA fragment incorporated into a single phage capsid. The E. coli phage P1 is often used in such mapping because it can randomly transduce up to 1 to 2% of the genome.

Specialized transduction is used to find which phage attachment site is close to a specific gene. The relative locations of specific phage att sites are known from conjugational mapping, and the genes linked to each att site can be determined by means of specialized transduction. These data allow precise placement of genes on the chromosome.

A simplified genetic map of E. coli K12 is given in figure 13.23. Because conjugation data are not high resolution and cannot be used to position genes that are very close together, the whole map is developed using several mapping techniques. Usually, interrupted mating data are combined with those from cotransduction and cotransformation studies. Data from recombination studies also are used. Normally a new genetic marker in the E. coli genome is located within a relatively small region of the genome (10 to 15 minutes long) using a series of Hfr strains with F factor integration sites scattered throughout the genome. Once the genetic marker has been located with respect to several genes in the same region, its position relative to nearby neighbors is more accurately determined using transformation and transduction studies. Recent maps of the E. coli chromosome give the locations of more than a thousand genes. Remember that the genetic map only depicts physical reality in a relative sense. A map unit in one region of the genome may not be the same physical distance as a unit in another part.
Genetic maps provide useful information in addition to the order of the genes. For example, there is considerable clustering of genes in E. coli K12 (figure 13.23). In the regions around 2, 17, and 27 minutes, there are many genes, whereas relatively few genetic markers are found in the 33 minute region. The areas apparently lacking genes may well have undiscovered genes, but perhaps their function is not primarily that of coding genetic information. One hypothesis is that the 33 minute region is involved in attachment of the E. coli chromosome to the plasma membrane during replication and cell division. It is interesting that this region is almost exactly opposite the origin of replication for the chromosome (oriC).

Of course a great deal could be learned by comparing a microorganism’s genetic map with the actual nucleotide sequence of its genome. It is now possible to fairly rapidly sequence procaryotic genomes; genome sequencing and analysis will be discussed later in some detail. Microbial genomics (chapter 15)

13.8 Recombination and Genome Mapping in Viruses

Bacteriophage genomes also undergo recombination, although the process is different from that in bacteria. Because phages themselves reproduce within cells and cannot recombine directly, crossing-over must occur inside a host cell. In principle, a virus recombination experiment is easy to carry out. If bacteria are mixed with enough phages, at least two virions will infect each cell on the average and genetic recombination should be observed. Phage progeny in the resulting lysate can be checked for alternate combinations of the initial parental genotypes. Bacteriophage lytic cycle (pp. 382–88)

Alfred Hershey initially demonstrated recombination in the phage T2, using two strains with differing phenotypes. Two of the parental strains in Hershey’s crosses were h⁺r⁺ and hr (figure 13.24a).
The gene h influences host range; when gene h changes, T2 infects different strains of E. coli. Phages with the r⁺ gene have wild type plaque morphology, whereas T2 with the r genotype has a rapid lysis phenotype and produces larger than normal plaques with sharp edges (figures 13.24b and 13.24c). In one experiment Hershey infected E. coli with large quantities of the h⁺r⁺ and hr T2 strains (figure 13.24a). He then plated out the lysates with a mixture of two different host strains and was able to detect significant numbers of h⁺r and hr⁺ recombinants, as well as parental type plaques. As long as there are detectable phenotypes and methods for carrying out the crosses, it is possible to map phage genes in this way. Plaque formation and morphology (p. 364)

Phage genomes are so small that often it is convenient to map them without determining recombination frequencies. Some techniques actually generate physical maps, which often are most useful in genetic engineering. Several of these methods require manipulation of the DNA with subsequent examination in the electron microscope. For example, one can directly compare wild-type and mutant viral chromosomes. In heteroduplex mapping the two types of chromosomes are denatured, mixed, and allowed to rejoin or anneal. When joined, the homologous regions of the different DNA molecules form a regular double helix. In locations where the bases do not pair due to the presence of a mutation such as a deletion or insertion, bubbles will be visible in the electron microscope.

Several other direct techniques are used to map viral genomes or parts of them. Restriction endonucleases (see section 14.1) are employed together with electrophoresis to analyze DNA fragments and locate deletions and other mutations that affect electrophoretic mobility. Phage genomes also can be directly sequenced to locate particular mutations and analyze the changes that have taken place.
It should be noted that many of these physical mapping techniques also have been employed in the analysis of relatively small portions of bacterial genomes. Furthermore, these methods are useful to the genetic engineer who is concerned with direct manipulation of DNA.

1. Describe how the bacterial genome can be mapped using Hfr conjugation, transformation, generalized transduction, and specialized transduction. Include both a description of each technique and any assumptions underlying its use.
2. Why is it necessary to use several different techniques in genome mapping? How is this done in practice?
3. How does recombination in viruses differ from that in bacteria? How did Hershey first demonstrate virus recombination?
4. Describe heteroduplex mapping.

[Figure 13.24 labels: T2 hr and T2 h⁺r⁺ parental phages; crossing-over between two different chromosomes; plaque types hr, h⁺r⁺, hr⁺, and h⁺r.]

Figure 13.24 Genetic Recombination in Bacteriophages. (a) A summary of a genetic recombination experiment with the hr and h⁺r⁺ strains of the T2 phage. The hr chromosome is shown in color. (b) The types of plaques produced by this experiment on a lawn of E. coli. (c) A close-up of the four plaque types. See text for details.

Summary
1. In recombination, genetic material from two different chromosomes is combined to form a new, hybrid chromosome. There are three types of recombination: general recombination, site-specific recombination, and replicative recombination.
2.
Bacterial recombination is a one-way process in which the exogenote is transferred from the donor to a recipient and integrated into the endogenote (figure 13.4).
3. Plasmids are small, circular, autonomously replicating DNA molecules that can exist independent of the host chromosome. Their genes are not required for host survival.
4. Episomes are plasmids that can be reversibly integrated with the host chromosome.
5. Many important types of plasmids have been discovered: F factors, R factors, Col plasmids, virulence plasmids, and metabolic plasmids.
6. Transposons or transposable elements are DNA segments that move about the genome in a process known as transposition.
7. There are two types of transposable elements: insertion sequences and composite transposons.
8. Transposable elements cause mutations, block translation and transcription, turn genes on and off, aid F plasmid insertion, and carry antibiotic resistance genes.
9. Conjugation is the transfer of genes between bacteria that depends upon direct cell-cell contact mediated by the F pilus.
10. In F⁺ × F⁻ mating the F factor remains independent of the chromosome and a copy is transferred to the F⁻ recipient; donor genes are not usually transferred (figure 13.14).
11. Hfr strains transfer bacterial genes to recipients because the F factor is integrated into the host chromosome. A complete copy of the F factor is not often transferred (figure 13.14).
12. When the F factor leaves an Hfr chromosome, it occasionally picks up some bacterial genes to become an F′ plasmid, which readily transfers these genes to other bacteria (figure 13.15).
13. Transformation is the uptake of a naked DNA molecule by a competent cell and its incorporation into the genome (figure 13.16).
14. Bacterial viruses or bacteriophages can reproduce and destroy the host cell (lytic cycle) or become a latent prophage that remains within the host (lysogenic cycle) (figure 13.18).
15. Transduction is the transfer of bacterial genes by viruses.
16. In generalized transduction any host DNA fragment can be packaged in a virus capsid and transferred to a recipient (figure 13.19).
17. Temperate phages carry out specialized transduction by incorporating bacterial genes during prophage induction and then donating those genes to another bacterium (figure 13.20).
18. The bacterial genome can be mapped by following the order of gene transfer during Hfr conjugation (figure 13.22); transformational and transductional mapping techniques also may be used.
19. When two viruses simultaneously enter a host cell, their chromosomes can undergo recombination.
20. Virus genomes are mapped by recombination and heteroduplex mapping techniques.

Key Terms
abortive transductants; bacteriocin; competent; composite transposons; conjugation; conjugative plasmid; conjugative transposon; crossing-over; curing; endogenote; episome; exogenote; F factor; F′ plasmid; generalized transducing particle; generalized transduction; general recombination; helper phage; heteroduplex DNA; Hfr strains; high-frequency transduction lysate (HFT lysate); horizontal gene transfer; host restriction; insertion sequence; interrupted mating experiment; low-frequency transduction lysate (LFT lysate); lysogenic; lysogens; lysogeny; merozygote; metabolic plasmids; plasmid; prophage; recombination; replicative recombination; replicon; restricted transduction; R factors; sex pilus; site-specific recombination; specialized transduction; temperate phage; transduction; transformation; transposable element; transposase; transposition; transposon; virulence plasmids

Questions for Thought and Review
1.
How does recombination in procaryotes differ from that in most eucaryotes?
2. Distinguish between plasmids, transposons, and temperate phages.
3. How might one demonstrate the presence of a plasmid in a host cell?
4. What effect would you expect the existence of transposable elements and plasmids to have on the rate of microbial evolution? Give your reasoning.
5. How do multiple drug resistant plasmids often arise?
6. Suppose that you carried out a U-tube experiment with two auxotrophs and discovered that recombination was not blocked by the filter but was stopped by treatment with deoxyribonuclease. What gene transfer process is responsible? Why would it be best to use double or triple auxotrophs in this experiment?
7. List the similarities and differences between conjugation, transformation, and transduction.
8. How might one tell whether a recombination process was mediated by generalized or specialized transduction?
9. Why doesn’t a cell lyse after successful transduction with a temperate phage?
10. Describe how you would precisely locate the recA gene and show that it was between 58 and 58.5 minutes on the E. coli chromosome.

Critical Thinking Questions
1. Diagram a double crossover event and a single crossover event. Which is more infrequent and why? Suggest experiments in which you would use one or the other event and what types of genetic markers you would employ. What kind of recognition features and catalytic capabilities would the recombination machinery need to possess?
2. Suppose that transduction took place when a U-tube experiment was conducted. How would you confirm that something like a virus was passed through the filter and transduced the recipient?
3. What would be the evolutionary advantage of having a period of natural “competence” in a bacterial life cycle? What would be possible disadvantages?

Prescott−Harley−Klein: Microbiology, Fifth Edition V. DNA Technology and Genomics 14. Recombinant DNA Technology © The McGraw−Hill Companies, 2002

PART V
DNA Technology and Genomics
Chapter 14 Recombinant DNA Technology
Chapter 15 Microbial Genomics

CHAPTER 14
Recombinant DNA Technology

Four streaks of E. coli glow different colors because they contain different cloned luciferase genes.

Outline
14.1 Historical Perspectives
14.2 Synthetic DNA
14.3 The Polymerase Chain Reaction
14.4 Preparation of Recombinant DNA
Isolating and Cloning Fragments
Gene Probes
Isolating and Purifying Cloned DNA
14.5 Cloning Vectors
Plasmids
Phage Vectors
Cosmids
Artificial Chromosomes
14.6 Inserting Genes into Eucaryotic Cells
14.7 Expression of Foreign Genes in Bacteria
14.8 Applications of Genetic Engineering
Medical Applications
Industrial Applications
Agricultural Applications
14.9 Social Impact of Recombinant DNA Technology

Concepts
1. Genetic engineering makes use of recombinant DNA technology to fuse genes with vectors and then clone them in host cells. In this way large quantities of isolated genes and their products can be synthesized.
2. The production of recombinant DNA molecules depends on the ability of restriction endonucleases to cleave DNA at specific sites.
3. Plasmids, bacteriophages and other viruses, and cosmids are used as vectors. They can replicate within a host cell while carrying foreign DNA and possess phenotypic traits that allow them to be detected.
4. Genetic engineering is already making substantial contributions to biological research, medicine, industry, and agriculture. Future benefits are probably much greater.
5. Genetic engineering also is accompanied by potential problems in such areas as safety, the ethics of its use with human subjects, environmental impact, and biological warfare.

The recombinant DNA breakthrough has provided us with a new and powerful approach to the questions that have intrigued and plagued man for centuries.
(Paul Berg)

Chapters 12 and 13 introduce the essentials of microbial genetics. This chapter focuses on the practical applications of microbial genetics and the technology arising from it. Although human beings have been altering the genetic makeup of organisms for centuries by selective breeding, only recently has the direct manipulation of DNA been possible. The deliberate modification of an organism’s genetic information by directly changing its nucleic acid genome is called genetic engineering and is accomplished by a collection of methods known as recombinant DNA technology. First, the DNA responsible for a particular phenotype is identified and isolated. Once purified, the gene or genes are fused with other pieces of DNA to form recombinant DNA molecules. These are propagated (gene cloning) by insertion into an organism that need not even be in the same kingdom as the original gene donor.

Recombinant DNA technology opens up totally new areas of research and applied biology. Thus it is an essential part of biotechnology, which is now experiencing a stage of exceptionally rapid growth and development.
Although the term has several definitions, in this text biotechnology refers to those processes in which living organisms are manipulated, particularly at the molecular genetic level, to form useful products. The promise for medicine, agriculture, and industry is great; yet the potential risks of this technology are not completely known and may be considerable. Biotechnology and industrial microbiology (chapter 42)

Recombinant DNA technology is very much the result of several key discoveries in microbial genetics. The first section briefly reviews some landmarks in the development of recombinant technology (table 14.1).

Table 14.1 Some Milestones in Biotechnology and Recombinant DNA Technology
Years covered: 1958, 1970, 1972, 1973, 1975, 1976, 1977, 1978, 1979, 1981, 1982, 1983, 1985, 1987, 1988, 1989, 1990, 1991, 1994, 1995, 1996, 1997, 1998
Milestones, in chronological order:
DNA polymerase purified
A complete gene synthesized in vitro
Discovery of the first sequence-specific restriction endonuclease and the enzyme reverse transcriptase
First recombinant DNA molecules generated
Use of plasmid vectors for gene cloning
Southern blot technique for detecting specific DNA sequences
First prenatal diagnosis using a gene-specific probe
Methods for rapid DNA sequencing
Discovery of “split genes” and somatostatin synthesized using recombinant DNA
Human genomic library constructed
Insulin synthesized using recombinant DNA
First human viral antigen (hepatitis B) cloned
Foot-and-mouth disease viral antigen cloned
First monoclonal antibody-based diagnostic kit approved for use
Commercial production by E. coli of genetically engineered human insulin
Isolation, cloning, and characterization of a human cancer gene
Transfer of gene for rat growth hormone into fertilized mouse eggs
Engineered Ti plasmids used to transform plants
Tobacco plants made resistant to the herbicide glyphosate through insertion of a cloned gene from Salmonella
Development of the polymerase chain reaction technique
Insertion of a functional gene into a fertilized mouse egg cures the shiverer mutation disease of mice, a normally fatal genetic disease
The first successful production of a genetically engineered staple crop (soybeans)
Development of the gene gun
First field test of a genetically engineered virus (a baculovirus that kills cabbage looper caterpillars)
Production of the first fertile corn transformed with a foreign gene (a gene for resistance to the herbicide bialaphos)
Development of transgenic pigs and goats capable of manufacturing proteins such as human hemoglobin
First test of gene therapy on human cancer patients
The Flavr Savr tomato introduced, the first genetically engineered whole food approved for sale
Fully human monoclonal antibodies produced in genetically engineered mice
Haemophilus influenzae genome sequenced
Methanococcus jannaschii and Saccharomyces cerevisiae genomes sequenced
Human clinical trials of antisense drugs and DNA vaccines begun
E. coli genome sequenced
First cloned mammal (the sheep Dolly)

14.1 Historical Perspectives

Recombinant DNA is DNA with a new sequence formed by joining fragments from two or more different sources. One of the first breakthroughs leading to recombinant DNA (rDNA) technology was the discovery in the late 1960s by Werner Arber and Hamilton Smith of microbial enzymes that make cuts in double-stranded DNA. These enzymes recognize and cleave specific sequences about 4 to 8 base pairs long and are known as restriction enzymes or restriction endonucleases (figure 14.1).
They normally protect the host cell by destroying phage DNA after its entrance. Cells protect their own DNA from restriction enzymes by methylating nucleotides in the sites that these enzymes recognize. Incoming foreign DNA is not methylated at the same sites and often is cleaved by host restriction enzymes. There are three general types of restriction enzymes. Types I and III cleave DNA away from recognition sites. Type II restriction endonucleases cleave DNA at specific recognition sites. The type II enzymes can be used to prepare DNA fragments containing specific genes or portions of genes. For example, the restriction enzyme EcoRI, isolated by Herbert Boyer in 1969 from E. coli, cleaves the DNA between G and A in the base sequence GAATTC (figure 14.2). Note that in the double-stranded condition, the base sequence GAATTC will base pair with the same sequence running in the opposite direction. EcoRI therefore cleaves both DNA strands between the G and the A. When the two DNA fragments separate, they often contain single-stranded complementary ends, known as sticky ends or cohesive ends. There are hundreds of restriction enzymes that recognize many different specific sequences (table 14.2). Each restriction enzyme name begins with three letters, indicating the bacterium producing it. For example, EcoRI is obtained from E. coli, whereas BamHI comes from Bacillus amyloliquefaciens H, and SalI from Streptomyces albus. Restriction and phages (p. 386)

Figure 14.1 Restriction Endonuclease Binding to DNA. The structure of BamHI binding to DNA viewed down the DNA axis. The enzyme’s two subunits lie on each side of the DNA double helix. The α-helices are in green, the β conformations in purple to red, and DNA is in orange.

Figure 14.2 Restriction Endonuclease Action. The cleavage catalyzed by the restriction endonuclease EcoRI. The enzyme makes staggered cuts on the two DNA strands to form sticky ends.

In 1970 Howard Temin and David Baltimore independently discovered the enzyme reverse transcriptase that retroviruses use to produce DNA copies of their RNA genome. This enzyme can be used to construct a DNA copy, called complementary DNA (cDNA), of any RNA (figure 14.3). Thus genes or major portions of genes can be synthesized from mRNA. Reverse transcriptase and retroviruses (p. 407)

[Figure 14.3 labels: mRNA with poly-A tail; oligo-dT primer; reverse transcriptase and free nucleotides; cDNA with a hairpin loop; RNase H or alkali; DNA polymerase I and free nucleotides; S1 nuclease; double-stranded cDNA.]

Figure 14.3 The Synthesis of Double-Stranded cDNA from mRNA. This figure briefly summarizes one procedure for synthesis of cDNA. Reverse transcriptase synthesizes a DNA copy of the mRNA using an oligo-dT primer. The RNA of the resulting RNA-DNA hybrid is degraded to produce single-stranded DNA with a hairpin loop at one end. DNA polymerase I then converts this to a double-stranded form, and S1 nuclease nicks the DNA at the points indicated by the two arrows on the left to generate the final DNA copy.
Recombinant DNA Technology © The McGraw−Hill Companies, 2002 Recombinant DNA Technology Table 14.2 Some Type II Restriction Endonucleases and Their Recognition Sequences Microbial Source Recognition Sequencea End Producedb AluI Arthrobacter luteus 5´—A—G—C—T—3´ 3´— T —C—G—A—5´ BamHI Bacillus amyloliquefaciens H 5´—G—G—A—T—C —C—3´ 3´—C — C— T—A—G—G—5´ EcoRI Escherichia coli 5´—G—A—A—T— T—C—3´ 3´— C— T— T—A—A—G—5´ HaeIII Haemophilus aegyptius 5´—G—G—C —C—3´ 3´—C — C—G—G—5´ HindIII Haemophilus influenzae b 5´—A—A—G—C—T— T—3´ 3´— T — T—C—G—A—A—5´ NotI Nocardia otitidis-caviarum 5´—G—C —G—G—C — C —G—C —3´ 3´— C—G—C—C—G— G—C—G—5´ PstI Providencia stuartii 5´— C — T—G—C —A—G—3´ 3´—G—A—C—G—T—C—5´ SalI Streptomyces albus 5´—G— T—C—G—A—C—3´ 3´—C—A —G—C— T—G—5´ ⬎ — — Enzyme — — ⬎ C—T—3´ G—A—5´ — — ⬎ ⬎ — — ⬎ ⬎ — — — — G—A—T—C —C—3´ G—5´ A—A—T— T—C—3´ G—5´ ⬎ — — — — ⬎ C —C—3 ´ G—G—5´ — — ⬎ ⬎ — — — — ⬎ — —⬎ ⬎ ⬎ — — — — ⬎ ⬎ — — — — A—G—C—T— T—3´ A—5´ G—G—C — C —G—C —3 ´ C—G—5´ G—3 ´ A—C—G—T—C—5´ T—C—G—A—C—3´ G—5´ a The arrows indicate the sites of cleavage on each strand. b Only the end of the right-hand fragment is shown. The next advance came in 1972, when David Jackson, Robert Symons, and Paul Berg reported that they had successfully generated recombinant DNA molecules. They allowed the sticky ends of fragments to anneal—that is, to base pair with one another—and then covalently joined the fragments with the enzyme DNA ligase. Within a year, plasmid vectors, or carriers of foreign DNA fragments during gene cloning, had been developed and combined with foreign DNA (figure 14.4). The first such recombinant plasmid capable of being replicated within a bacterial host was the pSC101 plasmid constructed by Stanley Cohen and Herbert Boyer in 1973 (SC in the plasmid name stands for Stanley Cohen). In 1975 Edwin M. Southern published a procedure for detecting specific DNA fragments so that a particular gene could be isolated from a complex DNA mixture. 
The Southern blotting technique depends on the specificity of base complementarity in nucleic acids (figure 14.5). DNA fragments are first separated by size with agarose gel electrophoresis. The fragments are then denatured (rendered single stranded) and transferred to a nitrocellulose filter or nylon membrane so that each fragment is firmly bound to the filter at the same position as on the gel. Originally the transfer occurred when buffer flowed through the gel and the membrane as shown in figure 14.5. The negatively charged DNA fragments also can be electrophoresed from the gel onto the blotting membrane. The filter is bathed with a solution containing a radioactive probe, a piece of labeled nucleic acid that hybridizes with complementary DNA fragments and is used to locate them. Those fragments complementary to the probe become radioactive and are readily detected by autoradiography. In this technique a sheet of photographic film is placed over the filter for several hours and then developed. The film is exposed and becomes dark everywhere a radioactive fragment is located because the energy released by the isotope causes the formation of dark-silver grains. Using Southern blotting, one can detect and isolate fragments with any desired sequence from a complex mixture.

More recently, nonradioactive probes to detect specific DNAs have been developed. In one approach the DNA probe is linked to an enzyme such as horseradish peroxidase. After the enzyme-DNA probe has bound to a DNA fragment on the filter, luminol, a substrate that emits light when acted on by the peroxidase, is added. The chemiluminescent probe is detected by exposing the filter to photographic film for about 20 minutes. A second technique makes use of the vitamin biotin. A biotin-DNA probe is detected by incubating the filter with either the protein avidin or a similar bacterial protein, streptavidin. The protein specifically attaches to biotin, and is visualized with a special reagent containing biotin complexed with the enzyme alkaline phosphatase (streptavidin also can be directly attached to the enzyme). The bands with the probe appear blue. These nonradioactive techniques are more rapid and safer than using radioisotopes. On the other hand, they may be less sensitive than radioactively labeled probes.

Figure 14.4 Recombinant Plasmid Construction. The general procedure used in constructing recombinant plasmid vectors is shown here.

By the late 1970s techniques for easily sequencing DNA, synthesizing oligonucleotides, and expressing eucaryotic genes in procaryotes had also been developed. These techniques were then used to solve practical problems (table 14.1). The following sections describe how the previously discussed techniques and others are used in genetic engineering.

1. Define or describe restriction enzyme, sticky end, cDNA, vector, Southern blotting, probe, and autoradiography.

14.2 Synthetic DNA

Oligonucleotides [Greek oligo, few or scant] are short pieces of DNA or RNA between about 2 and 20 or 30 nucleotides long. The ability to synthesize DNA oligonucleotides of known sequence is extremely useful. For example, DNA probes can be synthesized and DNA fragments can be prepared for use in molecular techniques such as PCR (p. 326). DNA structure (pp.
231–33)

DNA oligonucleotides are synthesized by a stepwise process in which single nucleotides are added to the end of the growing chain (figure 14.6). The 3′ end of the chain is attached to a solid support such as a silica gel particle. A DNA synthesizer or “gene machine” carries out the solid-phase synthesis. A specially activated nucleotide derivative is added to the 5′ end of the chain in a series of steps. At the end of an addition cycle, the growing chain is separated from the reaction mixture by filtration or centrifugation. The process is then repeated to attach another nucleotide. It takes about 40 minutes to add a nucleotide to the chain, and chains as large as 50 to 100 nucleotides can be synthesized.

Advances in DNA synthetic techniques have accelerated progress in the study of protein function. One of the most effective ways of studying the relationship of protein structure to function is by altering a specific part of the protein and observing functional changes. In the past this has been accomplished either by chemically modifying individual amino acids or by inducing mutations in the gene coding for the protein under study. There are problems with these two approaches. Chemical modification of a protein is not always specific; several amino acids may be altered, not just the one desired. It is not always possible to produce the proper mutation in the desired gene location. Recently these difficulties have been overcome with a technique called site-directed mutagenesis.

In site-directed mutagenesis an oligonucleotide of about 20 residues that contains the desired sequence change is synthesized. The altered oligonucleotide with its artificially mutated sequence is now allowed to bind to a single-stranded copy of the complete gene (figure 14.7). DNA polymerase is added to the gene-primer complex. The polymerase extends the primer and replicates the remainder of the target gene to produce a new gene copy with the desired mutation.
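The logic of designing the altered oligonucleotide can be sketched in Python. This is a simplified illustration under stated assumptions, not a primer-design tool: the gene sequence is made up, the helper names (`design_mutagenic_oligo`, `count_mismatches`, `extend_primer`) are hypothetical, and the polymerase step is idealized as copying the rest of the template perfectly around the annealed oligonucleotide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_mutagenic_oligo(new_strand: str, position: int, new_base: str, flank: int = 10):
    """Build an oligonucleotide carrying a single base change.

    new_strand: wild-type sequence (5'->3') of the strand to be synthesized
    position:   0-based index of the base to change
    Returns (oligo, start), where start is the oligo's offset in new_strand.
    """
    new_strand, new_base = new_strand.upper(), new_base.upper()
    if new_strand[position] == new_base:
        raise ValueError("new_base matches the wild-type base")
    start = max(0, position - flank)
    end = min(len(new_strand), position + flank + 1)
    offset = position - start
    window = new_strand[start:end]
    return window[:offset] + new_base + window[offset + 1:], start


def count_mismatches(oligo: str, start: int, template: str) -> int:
    """Count unpaired positions when the oligo anneals to the single-stranded template."""
    region_end = len(template) - start
    region = template[region_end - len(oligo):region_end]
    return sum(a != b for a, b in zip(reverse_complement(oligo), region))


def extend_primer(oligo: str, start: int, new_strand: str) -> str:
    """Idealized product: the new strand with the oligo's sequence built in."""
    new_strand = new_strand.upper()
    return new_strand[:start] + oligo + new_strand[start + len(oligo):]


wild_type = "ATGGCTAGCAAGGGTGAAGAGCTGTTCACCGGTGTTGTC"  # made-up 39-base gene
template = reverse_complement(wild_type)                # single-stranded copy
oligo, start = design_mutagenic_oligo(wild_type, 18, "C")
mutant = extend_primer(oligo, start, wild_type)
print(len(oligo), count_mismatches(oligo, start, template))  # 21 1
```

With ten matching bases on each side of the change, the single mismatch in the middle of the 21-base oligonucleotide still leaves enough pairing for it to anneal, which is the property the technique relies on.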
If the gene is attached to a single-stranded DNA bacteriophage (see pp. 372–74 and 388) such as the M13 phage, it can be introduced into a host bacterium and cloned using the techniques to be described shortly. This will yield large quantities of the mutant protein for study of its function.

Figure 14.5 The Southern Blotting Technique. The insert illustrates how buffer flow transfers DNA bands to the nitrocellulose filter. The fragments also can be transferred by electrophoresis. See text for further details.
[Steps shown: DNA molecule; cleavage by restriction enzymes; agarose gel electrophoresis; fragments separated by size and denatured; transfer of fragments to a nitrocellulose filter; hybridization with radioactive DNA probes, washing, and autoradiography; autoradiograph showing hybrid DNA fragments.]

Figure 14.6 The Synthesis of a DNA Oligonucleotide. During each cycle, the DNA synthesizer adds an activated nucleotide (A, T, G, or C) to the growing end of the chain. At the end of the process, the oligonucleotide is removed from its support.
Figure 14.7 Site-Directed Mutagenesis. A synthetic oligonucleotide is used to add a specific mutation to a gene. See text for details.

14.3 The Polymerase Chain Reaction

Between 1983 and 1985 Kary Mullis developed a new technique that made it possible to synthesize large quantities of a DNA fragment without cloning it. Although this chapter emphasizes recombinant DNA technology, the polymerase chain reaction or PCR technique will be introduced here because of its great practical importance and impact on biotechnology.

Figure 14.8 outlines how the PCR technique works. Suppose that one wishes to make large quantities of a particular DNA sequence. The first step is to synthesize fragments with sequences identical to those flanking the targeted sequence. This is readily accomplished with a DNA synthesizer machine as previously described. These synthetic oligonucleotides are usually about 20 nucleotides long and serve as primers for DNA synthesis. The reaction mix contains the target DNA, a very large excess of the desired primers, a thermostable DNA polymerase, and the four deoxyribonucleoside triphosphates.

The PCR cycle itself takes place in three steps. First, the target DNA containing the sequence to be amplified is heat denatured to separate its complementary strands (step 1). Normally the target DNA is between 100 and 5,000 base pairs in length. Next, the temperature is lowered so that the primers can hydrogen bond or anneal to the DNA on both sides of the target sequence (step 2). Because the primers are present in excess, the targeted DNA strands normally anneal to the primers rather than to each other. Finally, DNA polymerase extends the primers and synthesizes copies of the target DNA sequence using the deoxyribonucleoside triphosphates (step 3).
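Because every cycle copies both strands of each target molecule, the number of double-stranded copies ideally doubles with each cycle. The Python sketch below works through that arithmetic. The `efficiency` parameter is an assumption added for illustration (1.0 means perfect doubling, and real reactions fall short of it), and the function names are hypothetical.

```python
import math


def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after `cycles` rounds of denature, anneal, extend.

    Each cycle multiplies the copy number by (1 + efficiency).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies: float, target_copies: float, efficiency: float = 1.0) -> int:
    """Smallest number of cycles that reaches at least `target_copies`."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    fold = target_copies / initial_copies
    cycles = math.log(fold) / math.log(1.0 + efficiency)
    # The small tolerance guards against floating-point noise at exact powers.
    return max(0, math.ceil(cycles - 1e-9))


print(pcr_copies(1, 20))            # 1048576.0   (about one million)
print(pcr_copies(1, 30))            # 1073741824.0 (about one billion)
print(cycles_needed(1, 1_000_000))  # 20
```

Lowering `efficiency` shows how quickly the yield departs from the theoretical figure: at 0.8 per cycle, 20 cycles give roughly 1.3 × 10⁵ copies instead of about 10⁶.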
Only polymerases able to function at the high temperatures employed in the PCR technique can be used. Two popular enzymes are the Taq polymerase from the thermophilic bacterium Thermus aquaticus and the Vent polymerase from Thermococcus litoralis. At the end of one cycle, the targeted sequences on both strands have been copied. When the three-step cycle is repeated (figure 14.8), the four strands from the first cycle are copied to produce eight fragments. The third cycle yields 16 products. Theoretically, 20 cycles will produce about one million copies of the target DNA sequence; 30 cycles yield around one billion copies. Pieces ranging in size from less than 100 base pairs to several thousand base pairs in length can be amplified, and only 10 to 100 picomoles of primer are required. The concentration of target DNA can be as low as 10⁻²⁰ to 10⁻¹⁵ M (or 1 to 10⁵ DNA copies per 100 µl). The whole reaction mixture is often 100 µl or less in volume. DNA sequencing (p. 345); DNA replication (pp. 235–39)

The polymerase chain reaction technique has now been automated and is carried out by a specially designed machine (figure 14.9). Currently a PCR machine can carry out 25 cycles and amplify DNA 10⁵ times in as little as 57 minutes. During a typical cycle the DNA is denatured at 94°C for 15 seconds, then the primers are annealed and extended (steps 2 and 3) at 68°C for 60 seconds.

Figure 14.8 The Polymerase Chain Reaction (PCR). In four cycles the targeted sequence has been amplified many times. See text for details.

Figure 14.9 A Modern PCR Machine. PCR machines are now fully automated and microprocessor controlled. They can process up to 96 samples at a time.

PCR technology is improving continually. For example, RNA now can be efficiently used in PCR procedures. The Tth DNA polymerase, a recombinant Thermus thermophilus DNA polymerase, will transcribe RNA to DNA and then amplify the DNA. Cellular RNAs and RNA viruses may be studied even when the RNA is present in very small amounts (as few as 100 copies can be transcribed and amplified). PCR also can quantitate DNA products without the use of isotopes. This allows one to find the initial amount of target DNA in less than an hour using automated equipment. Quantitative PCR is quite valuable in virology and gene expression studies. As mentioned earlier, the target DNA to be amplified is normally less than about 5,000 base pairs in length. A “long PCR” technique has been developed that will amplify sequences up to 42 kilobases long. It depends on the use of error-correcting polymerases because Taq polymerase is error-prone.

The PCR technique has already proven exceptionally valuable in many areas of molecular biology, medicine, and biotechnology. It can be used to amplify very small quantities of a specific DNA and provide sufficient material for accurately sequencing the fragment or cloning it by standard techniques. PCR-based diagnostic tests for AIDS, Lyme disease, chlamydia, tuberculosis, hepatitis, the human papilloma virus, and other infectious agents and diseases are being developed. The tests are rapid, sensitive, and specific. PCR is particularly valuable in the detection of genetic diseases such as sickle cell anemia, phenylketonuria, and muscular dystrophy. The technique is already having an impact on forensic science where it is being used in criminal cases as a part of DNA fingerprinting technology. It is possible to exclude or incriminate suspects using extremely small samples of biological material discovered at the crime scene.

14.4 Preparation of Recombinant DNA

There are three ways to obtain adequate quantities of a DNA fragment. One can extract all the DNA from an organism, cleave the DNA into fragments, isolate the fragment of interest, and finally clone it. Alternatively, all of the fragments can be cloned by means of a suitable vector, and each clone (the population of identical molecules with a single ancestral molecule) can be tested for the desired gene. One also can directly synthesize the desired DNA fragment as described earlier, and then clone it.

Isolating and Cloning Fragments

Long, linear DNA molecules are fragile and easily sheared into fragments by passing the DNA suspension through a syringe needle several times. DNA also can be cut with restriction enzymes. The resulting fragments are either separated by electrophoresis or first inserted into vectors and cloned. Agarose or polyacrylamide gels usually are used to separate DNA fragments electrophoretically. In electrophoresis, charged molecules are placed in an electrical field and allowed to migrate toward the positive and negative poles. The molecules separate because they move at different rates due to their differences in charge and size. In practice, the fragment mixture is usually placed in wells molded within a sheet of gel (figure 14.10). The gel concentration varies with the size of DNA fragments to be separated. Usually 1 to 3% agarose gels or 3 to 20% polyacrylamide gels are used. When an electrical field is generated in the gel, the fragments move through the pores of the gel toward an electrode (negatively charged DNA fragments migrate toward the positive electrode or anode). Each fragment’s migration rate is inversely proportional to the log of its molecular weight, and the fragments are separated into size classes (figure 14.10b). A simple DNA molecule might yield only a few bands. If the original DNA was very large and complex, staining of the gel would reveal a smear representing an almost continuous gradient in fragment size.
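The relationship between migration and the logarithm of fragment size is what makes a size ladder useful: bands of known size calibrate the gel, and an unknown band is sized by interpolation. The Python sketch below shows one common way to do this, assuming that migration distance is approximately a linear function of log₁₀(size) over the range of the ladder. The ladder distances are hypothetical values invented for the example, not measurements, and the function names are illustrative.

```python
import math


def fit_standard_curve(ladder):
    """Least-squares fit of distance = slope * log10(size_bp) + intercept.

    ladder: list of (size_bp, distance) pairs for bands of known size.
    """
    xs = [math.log10(size) for size, _ in ladder]
    ys = [distance for _, distance in ladder]
    n = len(ladder)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_size(distance: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate a fragment's size in base pairs."""
    return 10 ** ((distance - intercept) / slope)


# Hypothetical ladder: (size in bp, migration distance in mm). Invented values.
ladder = [(8000, 22.0), (4000, 28.0), (2000, 34.0), (1000, 40.0)]
slope, intercept = fit_standard_curve(ladder)

# A band that migrated 31 mm lies between the 4,000 and 2,000 bp markers.
print(round(estimate_size(31.0, slope, intercept)))  # 2828
```

The negative slope reflects the fact that smaller fragments travel farther. Estimates are only as good as the linear assumption, so bands outside the range of the ladder should be treated with caution.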
The band or section containing the desired fragment is located with the Southern blotting technique (figure 14.5) and removed. Since a band may represent a mixture of several fragments, it is electrophoresed on a gel with a different concentration to separate similarly sized fragments. The location of the pure fragment is determined by Southern blotting, and it is extracted from the gel.

Once fragments have been isolated, they are ligated with an appropriate vector, such as a plasmid (see section 13.2), to form a recombinant molecule that can reproduce in a host cell. One of the easiest and most popular approaches is to cut the plasmid and donor DNA with the same restriction enzyme so that identical sticky ends are formed (figure 14.11). After a fragment has annealed with the plasmid through complementary base pairing, the breaks are joined by DNA ligase.

Figure 14.10 Gel Electrophoresis of DNA. (a) A diagram of a vertical gel apparatus showing its parts. (b) The 1 kilobase ladder is an electrophoretic gel containing a series of DNA fragments of known size. The numbers indicate number of base pairs each fragment contains. The smallest fragments have moved the farthest. The gel on the right shows many of the fragments that arise when lambda phage DNA is digested with the HindIII restriction enzyme.
[Band sizes shown, in base pairs. 1 kb ladder: 12,216; 11,198; 10,180; 9,162; 8,144; 7,126; 6,108; 5,090; 4,072; 3,054; 2,036; 1,636; 1,018; 517 and 506; 396; 344; 298. λ-HindIII fragments: 23,130; 9,416; 6,557; 4,361; 2,322; 2,027.]

Figure 14.11 Recombinant Plasmid Construction and Cloning. The construction and cloning of a recombinant plasmid vector using an antibiotic resistance gene to select for the presence of the plasmid. The scale of the sticky ends of the fragments and plasmid has been enlarged to illustrate complementary base pairing. (a) The electron micrograph shows a plasmid that has been cut by a restriction enzyme and a donor DNA fragment. (b) The micrograph shows a recombinant plasmid. See text for details.

A second method for creating recombinant molecules can be used with fragments and vectors lacking sticky ends. After cutting the plasmid and donor DNA, one can add poly(dA) to the 3′ ends of the plasmid DNA, using the enzyme terminal transferase (figure 14.12). Similarly, poly(dT) is added to the 3′ ends of the fragments. The ends will now base pair with each other and are joined by DNA ligase to form a recombinant plasmid. Some enzymes (e.g., the AluI restriction enzyme from Arthrobacter luteus) cut DNA at the same position on both strands to form a blunt end. Fragments and vectors with blunt ends may be joined by T4 DNA ligase (blunt-end ligation).

Figure 14.12 Terminal Transferase and the Construction of Recombinant Plasmids. The poly(dA-dT) tailing technique can be used to construct sticky ends on DNA and generate recombinant molecules.

Figure 14.13 Cloning Cellular DNA Fragments. The preparation of a recombinant clone from previously isolated DNA fragments.

The rDNA molecules are cloned by inserting them into bacteria, using transformation or phage injection (see section 13.5).
Each strain reproduces to yield a population containing a single type of recombinant molecule. The overall process is outlined in figure 14.13. The same cloning techniques can be used with DNA fragments prepared using a DNA synthesizer machine.

Although selected fragments can be isolated and cloned as just described, it often is preferable to fragment the whole genome and clone all the fragments by using a vector. Then the desired clone can be identified. To be sure that the complete genome is represented in this collection of clones, called a genomic library, more than a thousand transformed bacterial strains must be maintained (the larger the genome, the more clones are needed). Libraries of cloned genes also can be generated using phage lambda as a vector and stored as phage lysates.

It is necessary to identify which clone in the library contains the desired gene. If the gene is expressed in the bacterium, it may be possible to assay each clone for a specific protein. However, a nucleic acid probe is normally employed in identification. The bacteria are replica plated on nitrocellulose paper and lysed in place with sodium hydroxide (figure 14.14). This yields a pattern of membrane-bound, denatured DNA corresponding to the colony pattern on the agar plate. The membrane is treated with a radioactive probe as in the original Southern blotting method. The radioactive spots identify colonies on the master plate that contain the desired DNA fragment. This approach also is used to analyze a library of cloned lambda phage (figure 14.15).

Figure 14.14 Cloning with Plasmid Vectors. The use of plasmid vectors to clone a mixture of DNA fragments. The desired fragment is identified with a cDNA probe. See text for details.

Figure 14.15 The Use of Lambda Phage as a Vector. (a) The preparation of a genomic library. Each colony on the bacterial lawn is a recombinant clone carrying a different DNA fragment. (b) Detection and cloning of the desired recombinant phage.

Table 14.3 Some Recombinant DNA Cloning Vectors

Type | Vector | Restriction Sequences Present | Features
Plasmid (E. coli) | pBR322 | BamHI, EcoRI, HaeIII, HindIII, PstI, SalI, XorII | Carries genes for tetracycline and ampicillin resistance
Plasmid (yeast–E. coli hybrid) | pYe(CEN3)41 | BamHI, BglII, EcoRI, HindIII, PstI, SalI | Multiplies in E. coli or yeast cells
Cosmid (artificially constructed E. coli plasmid carrying lambda cos site) | pJC720 | HindIII | Can be packaged in lambda phage particles for efficient introduction into bacteria; replicates as a plasmid; useful for cloning large DNA inserts
YAC (yeast artificial chromosome) | pYAC | SmaI, BamHI | Carries gene for ampicillin resistance; multiplies in Saccharomyces cerevisiae
BAC (bacterial artificial chromosome) | pBAC108L | HindIII, BamHI, NotI, SmaI, and others | Modified F plasmid that can carry 100–300 kb fragments; has a cosN site and a chloramphenicol resistance marker
Virus | Charon phage | EcoRI, HindIII, BamHI, SstI | Constructed using restriction enzymes and a ligase, having foreign DNA as central portion, with lambda DNA at each end; carries β-galactosidase gene; packaged into lambda phage particles; useful for cloning large DNA inserts
Virus | Lambda 1059 | BamHI | Will carry large DNA fragments (8–21 kb); recombinant can grow on E. coli lysogenic for P2 phage, whereas vector cannot
Virus | M13 | EcoRI | Single-stranded DNA virus; useful in studies employing single-stranded DNA insert and in producing DNA fragments for sequencing
Plasmid | Ti | SmaI, HpaI | Maize plasmid

Adapted from G. D. Elseth and K. D. Baumgartner, Genetics, 1984 Benjamin/Cummings Publishing, Menlo Park, CA. Reprinted by permission of the author.

Gene Probes

Success in isolating the desired recombinant clones depends on the availability of a suitable probe. Gene-specific probes are obtained in several ways. Frequently they are constructed with cDNA clones. If the gene of interest is expressed in a specific tissue or cell type, its mRNA is often relatively abundant. For example, reticulocyte mRNA may be enriched in globin mRNA, and pancreatic cells in insulin mRNA. Although mRNA is not available in sufficient quantity to serve as a probe, the desired mRNA species can be converted into cDNA by reverse transcription (figure 14.3).
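The base-pairing logic behind cDNA synthesis can be sketched in a few lines of Python. This is a simplified model that covers only the sequence relationships, not the hairpin and S1 nuclease steps of the laboratory procedure. The mRNA sequence is made up, and the function names and the `primer_length` parameter are hypothetical.

```python
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")  # RNA base -> pairing DNA base
DNA_TO_DNA = str.maketrans("ACGT", "TGCA")


def first_strand_cdna(mrna: str, primer_length: int = 4) -> str:
    """Return the first-strand cDNA (5'->3') copied from `mrna` (5'->3').

    Synthesis is primed by oligo-dT annealed to the poly-A tail, so the
    cDNA begins with a run of T residues.
    """
    mrna = mrna.upper()
    if not mrna.endswith("A" * primer_length):
        raise ValueError("no poly-A tail for the oligo-dT primer to anneal to")
    return mrna.translate(RNA_TO_DNA)[::-1]


def second_strand(cdna: str) -> str:
    """Return the DNA strand complementary to the first-strand cDNA (5'->3')."""
    return cdna.upper().translate(DNA_TO_DNA)[::-1]


mrna = "AUGGCCUAAAAAAA"  # made-up message with a short poly-A tail
cdna = first_strand_cdna(mrna)
print(cdna)                 # TTTTTTTAGGCCAT
print(second_strand(cdna))  # ATGGCCTAAAAAAA
```

The second strand carries the same sequence as the mRNA with T in place of U, which is why a double-stranded cDNA clone can stand in for the gene's coding sequence when used as a probe.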
The cDNA copies are purified, spliced into appropriate vectors, and cloned to provide adequate amounts of the required probe.

Probes also can be generated if the gene codes for a protein of known amino acid sequence. Oligonucleotides, about 20 nucleotides or longer, that code for a characteristic amino acid sequence are synthesized. These often are satisfactory probes since they will specifically bind to the gene segment coding for the desired protein.

Sometimes previously cloned genes or portions of genes may be used as probes. This approach is effective when there is a reasonable amount of similarity between the nucleotide sequences of the two genes. Probes also can be generated by the polymerase chain reaction.

After construction, the probe is labeled to aid detection. Often ³²P is added to both DNA strands so that the radioactive strands can be located with autoradiography. Nonradioactively labeled probes may also be used.

Isolating and Purifying Cloned DNA

After the desired clone of recombinant bacteria or phages has been located with a probe, it can be picked from the master plate and propagated. The recombinant plasmid or phage DNA is then extracted and further purified when necessary. The DNA fragment is cut out of the plasmid or phage genome by means of restriction enzymes and separated from the remaining DNA by electrophoresis.

Clearly, DNA fragments can be isolated, purified, and cloned in several ways. Regardless of the exact approach, a key to successful cloning is choosing the right vector. The next section considers types of cloning vectors and their uses.

1. How are oligonucleotides synthesized? What is site-directed mutagenesis?
2. Briefly describe the polymerase chain reaction technique. What is its importance?
3. What is electrophoresis and how does it work?
4. Describe three ways in which a fragment can be covalently attached to vector DNA.
5. Outline in detail two different ways to isolate and clone a specific gene. What is a genomic library?
6. How are gene-specific probes obtained?

14.5 Cloning Vectors

There are four major types of vectors: plasmids, bacteriophages and other viruses, cosmids, and artificial chromosomes (table 14.3). Each type has its own advantages. Plasmids are the easiest to work with; rDNA phages and other viruses are more conveniently stored for long periods; larger pieces of DNA can be cloned with cosmids and artificial chromosomes. Besides these major types, there also are vectors designed for a specific function. For example, shuttle vectors are used in transferring genes between very different organisms and usually contain one replication origin for each host. The pYe type plasmids reproduce in both yeast and E. coli and can be used to clone yeast genes in E. coli.

Figure 14.16 The pBR322 Plasmid. A map of the E. coli plasmid pBR322. The map is marked off in units of 1 × 10⁵ daltons (outer circle) and 0.1 kilobases (inner circle). The locations of some restriction enzyme sites are indicated. The plasmid has resistance genes for ampicillin (Apʳ) and tetracycline (Tetʳ).

All vectors share several common characteristics. They are typically small, well-characterized molecules of DNA. They contain at least one replication origin and can be replicated within the appropriate host, even when they contain “foreign” DNA.
Finally, they code for a phenotypic trait that can be used to detect their presence; frequently it is also possible to distinguish parental from recombinant vectors.

Plasmids

Plasmids were the first cloning vectors. They are easy to isolate and purify, and they can be reintroduced into a bacterium by transformation. Plasmids often bear antibiotic resistance genes, which are used to select their bacterial hosts. A recombinant plasmid containing foreign DNA often is called a chimera, after the Greek mythological monster that had the head of a lion, the tail of a dragon, and the body of a goat. One of the most widely used plasmids is pBR322. The biology of plasmids (pp. 294–97)

Plasmid pBR322 has resistance genes for both ampicillin and tetracycline and many restriction sites (figure 14.16). Several of these restriction sites occur only once on the plasmid and are located within an antibiotic resistance gene. This arrangement aids detection of recombinant plasmids after transformation (figure 14.17). For example, if foreign DNA is inserted into the ampicillin resistance gene, the plasmid will no longer confer resistance to ampicillin. Thus tetracycline-resistant transformants that lack ampicillin resistance contain a chimeric plasmid.

Figure 14.17 Detection of Recombinant Plasmids. The use of antibiotic resistance genes to detect the presence of recombinant plasmids. Because foreign DNA has been inserted into the ampicillin resistance gene, the recombinant host is only resistant to tetracycline. The restriction enzymes indicated at the top of the figure cleave only one site on the plasmid. (Final panel: after transformation, the transformed cell is resistant to tetracycline and sensitive to ampicillin.)

Phage Vectors

Both single- and double-stranded phage vectors have been employed in recombinant DNA technology.
For example, lambda phage derivatives are very useful for cloning and can carry fragments up to about 45 kilobases in length. The genes for lysogeny and integration often are nonfunctional and may be deleted to make room for the foreign DNA. The modified phage genome also contains restriction sequences in areas that will not disrupt replication. After insertion of the foreign DNA into the modified lambda vector chromosome, the recombinant phage genome is packaged into viral capsids and can be used to infect host E. coli cells (figure 14.15). These vectors are often used to generate genomic libraries. E. coli also can be directly transformed with recombinant lambda DNA and produce phages. However, this approach is less efficient than the use of complete phage particles. The process is sometimes called transfection. Phages other than lambda also are used as vectors. For example, fragments as large as 95 kilobases can be carried by the P1 bacteriophage. The biology of bacteriophages (chapter 17)

Cosmids

Cosmids are plasmids that contain lambda phage cos sites and can be packaged into phage capsids. The lambda genome contains a recognition sequence called a cos site (or cohesive end) at each end. When the genome is to be packaged in a capsid, it is cleaved at one cos site and the linear DNA is inserted into the capsid until the second cos site has entered. Thus any DNA inserted between the cos sites is packaged. Cosmids typically contain several restriction sites and antibiotic resistance genes. They are packaged in lambda capsids for efficient injection into bacteria, but they also can exist as plasmids within a bacterial host. As much as 50 kilobases of DNA can be carried in this way.

Artificial Chromosomes

Artificial chromosomes are popular because they carry large amounts of genetic material. The yeast artificial chromosome (YAC) is one of the most widely used. YACs are stretches of DNA that contain all the elements required to propagate a chromosome in yeast: a replication origin, the centromere required to segregate chromatids into daughter cells, and two telomeres to mark the ends of the chromosome. They will also have restriction enzyme sites and genetic markers so that they can be traced and selected. Cleavage of a YAC with the proper restriction enzyme such as SmaI will open it up and allow the insertion of a piece of foreign DNA between the centromere and a telomere. In this way YACs containing DNA fragments between 100 and 2,000 kilobases in size can be placed in Saccharomyces cerevisiae cells and will be replicated along with the true chromosomes.

The bacterial artificial chromosome (BAC) is an increasingly popular alternative cloning vector. BACs are cloning vectors based on the E. coli F-factor plasmid. They contain appropriate restriction enzyme sites and a marker such as chloramphenicol resistance. The modified plasmid is cleaved at a restriction site, and a foreign DNA fragment up to 300 kilobases in length is attached using DNA ligase. The BAC is reproduced in E. coli after insertion by electroporation. This vector is easy to reproduce and manipulate and does not undergo recombination as readily as YACs (which means that the insert is less likely to be rearranged). Because they can carry such large fragments of DNA, artificial chromosomes have been particularly useful in genome sequencing.

1. Define shuttle vector, chimera, cosmid, transfection, yeast artificial chromosome, and bacterial artificial chromosome.
2. Describe how each of the major types of vectors is used in genetic engineering. Give an advantage of each.
3. How can the presence of two antibiotic resistance genes be used to detect recombinant plasmids?

14.6 Inserting Genes into Eucaryotic Cells

Because of its practical importance, much effort has been devoted to the development of techniques for inserting genes into eucaryotic cells.
Some of these techniques have also been used successfully to transform bacterial cells. The most direct approach is the use of microinjection. Genetic material directly injected into animal cells such as fertilized eggs is sometimes stably incorporated into the host genome to produce a transgenic animal, one that has gained new genetic information from the acquisition of foreign DNA. Another effective technique that works with mammalian cells and plant cell protoplasts is electroporation. If cells are mixed with a DNA preparation and then briefly exposed to pulses of high voltage (from about 250 to 4,000 V/cm for mammalian cells), the cells take up DNA through temporary holes in the plasma membrane. Some of these cells will be transformed.

One of the most effective techniques is to shoot microprojectiles coated with DNA into plant and animal cells. The gene gun, first developed at Cornell University, operates somewhat like a shotgun. A blast of compressed gas shoots a spray of DNA-coated metallic microprojectiles into the cells. The device has been used to transform corn and produce fertile corn plants bearing foreign genes. Other guns use either electrical discharges or high-pressure gas to propel the DNA-coated projectiles. These guns are sometimes called biolistic devices, a name derived from biological and ballistic. They have been used to transform microorganisms (yeast, the mold Aspergillus, and the alga Chlamydomonas), mammalian cells, and a variety of plant cells (corn, cotton, tobacco, onion, and poplar).

Other techniques also are available. Plants can be transformed with Agrobacterium vectors as will be described shortly (p. 340). Viruses increasingly are used to insert desired genes into eucaryotic cells. For example, genes may be placed in a retrovirus (see p. 407), which then infects the target cell and integrates a DNA copy of its RNA genome into the host chromosome. Adenoviruses also can transfer genes to animal cells. Recombinant baculoviruses will infect insect cells and promote the production of many proteins.

14.7 Expression of Foreign Genes in Bacteria

After a suitable cloning vector has been constructed, rDNA enters the host cell, and a population of recombinant microorganisms develops. Most often the host is an E. coli strain that lacks restriction enzymes and is recA− to reduce the chances that the rDNA will undergo recombination with the host chromosome. Bacillus subtilis and the yeast Saccharomyces cerevisiae also may serve as hosts. Plasmid vectors enter E. coli cells by calcium chloride-induced transformation. Electroporation at 3 to 24 kV/cm is effective with both gram-positive and gram-negative bacteria.

A cloned gene is not always expressed in the host cell without further modification of the recombinant vector. To be transcribed, the recombinant gene must have a promoter that is recognized by the host RNA polymerase. Translation of its mRNA depends on the presence of leader sequences and mRNA modifications that allow proper ribosome binding. These are quite different in eucaryotes and procaryotes, and a procaryotic leader must be provided to synthesize eucaryotic proteins in a bacterium. Finally, introns in eucaryotic genes must be removed because the procaryotic host will not excise them after transcription of mRNA; a eucaryotic protein is not functional without intron removal prior to translation.

The problems of expressing recombinant genes in host cells are largely overcome with the help of special cloning vectors called expression vectors. These vectors are often derivatives of plasmid pBR322 and contain the necessary transcription and translation start signals.
They also have useful restriction sites next to these sequences so that foreign DNA can be inserted with relative ease. Some expression vectors contain portions of the lac operon and can effectively regulate the expression of the cloned genes in the same manner as the operon.

Figure 14.18 Cloning the Somatostatin Gene. An overview of the procedure used to synthesize a recombinant plasmid containing the somatostatin gene. See text for details. (Figure key: M = methionine; S = stop codon.)

Somatostatin, the 14-residue hypothalamic polypeptide hormone that helps regulate human growth, provides an example of useful cloning and protein production. The gene for somatostatin was initially synthesized by chemical methods. Besides the 42 bases coding for somatostatin, the polynucleotide contained a codon for methionine at the 5′ end (the N-terminal end of the peptide) and two stop codons at the opposite end. To aid insertion into the plasmid vector, the 5′ ends of the synthetic gene were extended to form single-stranded sticky ends complementary to those formed by the EcoRI and BamHI restriction enzymes. A modified pBR322 plasmid was cut with both EcoRI and BamHI to remove a part of the plasmid DNA. The synthetic gene was then spliced into the vector by taking advantage of its cohesive ends (figure 14.18). Finally, a fragment containing the initial part of the lac operon (including the promoter, operator, ribosome binding site, and much of the β-galactosidase gene) was inserted next to the somatostatin gene. The plasmid now contained the somatostatin gene fused in the proper orientation to the remaining portion of the β-galactosidase gene. After introduction of this chimeric plasmid into E. coli, the somatostatin gene was transcribed with the β-galactosidase gene fragment to generate an mRNA having both messages.
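The layout of the synthetic gene described above (a methionine codon, 42 coding bases for the 14 residues, and two stop codons) can be illustrated with a minimal Python sketch. The residue order is the one drawn for somatostatin in figure 14.19. The particular codon picked for each amino acid and the choice of TGA and TAG as the two stops are assumptions made for illustration, not the codons used in the original work, and the EcoRI and BamHI sticky-end extensions are left out.

```python
# One arbitrarily chosen codon per amino acid (an illustrative assumption;
# any synonymous codon from the standard genetic code would serve).
CODON_CHOICE = {
    "Ala": "GCT", "Gly": "GGT", "Cys": "TGT", "Lys": "AAG", "Asn": "AAC",
    "Phe": "TTC", "Trp": "TGG", "Thr": "ACT", "Ser": "TCT", "Met": "ATG",
}

# Somatostatin residues, N-terminus to C-terminus, as drawn in figure 14.19.
SOMATOSTATIN = ["Ala", "Gly", "Cys", "Lys", "Asn", "Phe", "Phe",
                "Trp", "Lys", "Thr", "Phe", "Thr", "Ser", "Cys"]

# Two stop codons follow the coding bases; which two is an assumption.
STOP_CODONS = ["TGA", "TAG"]


def reverse_translate(residues):
    """Return one possible DNA coding sequence for a list of residues."""
    return "".join(CODON_CHOICE[r] for r in residues)


def build_synthetic_gene(residues):
    """Met codon + coding bases + two stop codons (coding strand, 5' to 3')."""
    return CODON_CHOICE["Met"] + reverse_translate(residues) + "".join(STOP_CODONS)


coding = reverse_translate(SOMATOSTATIN)
gene = build_synthetic_gene(SOMATOSTATIN)
print(len(coding))  # 42 bases for 14 residues
print(len(gene))    # 51 bases once the Met codon and two stops are added
print(gene)
```

Because several codons can specify the same amino acid, many different 42-base sequences would encode the same hormone; the sketch simply fixes one choice per residue.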
Translation formed a protein consisting of the total hormone polypeptide attached to the β-galactosidase fragment by a methionine residue. Cyanogen bromide cleaves peptide bonds at methionine residues. Treatment of the fusion protein with cyanogen bromide broke the peptide chain at the methionine and released the hormone (figure 14.19). Once free, the polypeptide was able to fold properly and become active. Since production of the fusion protein was under the control of the lac operon, it could be easily regulated.

Figure 14.19 The Synthesis of Somatostatin by Recombinant E. coli. Cyanogen bromide cleavage at the methionine residue releases active hormone from the β-galactosidase fragment. The gene and associated sequences are shaded in color. Stop codons, the special methionine codon, and restriction enzyme sites are enclosed in boxes. (Diagram summary: the pBR322 derivative carries the lac control region and β-gal fragment, the Met codon, the somatostatin gene, and the stop codons between the EcoRI and BamHI sites. In vivo gene expression gives β-Gal–Met–Ala–Gly–Cys–Lys–Asn–Phe–Phe–Trp–Lys–Thr–Phe–Thr–Ser–Cys; cyanogen bromide cleavage yields β-Gal fragments plus active somatostatin, in which the two Cys residues are joined by a disulfide bond.)

Many proteins have been produced since the synthesis of somatostatin. A similar approach was used to manufacture human insulin. Human growth hormone and some interferons also have been synthesized by rDNA techniques. The human growth hormone gene was too long to synthesize by chemical procedures and was prepared from mRNA as cDNA.

As mentioned earlier, introns in eucaryotic genes are not removed by bacteria and will render the final protein nonfunctional. The easiest solution is to prepare cDNA from processed mRNA that lacks introns and directly reflects the correct amino acid sequence of the protein product. In this instance it is particularly important to fuse the gene with an expression vector since a promoter and other essential sequences will be missing in the cDNA. Eucaryotic mRNA processing (pp. 263–64)

If the mRNA is scarce, it may not be easy to obtain enough for cDNA synthesis. Often the sequence of the protein coded for by the gene is used to deduce the best DNA sequence for the specific polypeptide segment (a logical procedure called reverse translation). Then the DNA probe is synthesized and used to locate and isolate the desired mRNA after gel electrophoresis. Finally, the isolated mRNA is used to make cDNA.

1. What is a transgenic animal? Describe how electroporation and gene guns are used to insert foreign genes into eucaryotic cells. What other approaches may be used? How are bacteria transformed?
2. How can one prevent rDNA from undergoing recombination in a bacterial host cell?
3. List several reasons why a cloned gene might not be expressed in a host cell. What is an expression vector?
4. Briefly outline the procedure for somatostatin production.
5. Identify a way to eliminate eucaryotic introns during the synthesis of rDNA.

14.8 Applications of Genetic Engineering

Genetic engineering and biotechnology will continue to contribute in the future to medicine, industry, and agriculture, as well as to basic research (Box 14.1). In this section some practical applications are briefly discussed.

Medical Applications

Certainly the production of medically useful proteins such as somatostatin, insulin, human growth hormone, and interferon is of great practical importance (table 14.4). This is particularly true of substances that previously only could be obtained from human tissues.
For example, in the past, human growth hormone for treatment of pituitary dwarfism was extracted from pituitaries obtained during autopsies and was available only in limited amounts. Interleukin-2 (a protein that helps regulate the immune response) and blood-clotting factor VIII have recently been cloned, and undoubtedly other important peptides and proteins will be produced in the future. A particularly interesting development is the use of transgenic corn and soybean plants to produce monoclonal antibodies (see p. 743) for medical uses. It also is possible to use genetically engineered plants to produce oral vaccines. Genetically engineered mice now can produce fully human monoclonal antibodies. Synthetic vaccines (for instance, vaccines for malaria and rabies) are also being developed with recombinant techniques. A recombinant hepatitis B vaccine is already commercially available.

Table 14.4 Some Human Peptides and Proteins Synthesized by Genetic Engineering

Peptide or Protein                      Potential Use
α1-antitrypsin                          Treatment of emphysema
α-, β-, and γ-interferons               As antiviral, antitumor, and anti-inflammatory agents
Blood-clotting factor VIII              Treatment of hemophilia
Calcitonin                              Treatment of osteomalacia
Epidermal growth factor                 Treatment of wounds
Erythropoietin                          Treatment of anemia
Growth hormone                          Growth promotion
Insulin                                 Treatment of diabetes
Interleukins-1, 2, and 3                Treatment of immune disorders and tumors
Macrophage colony stimulating factor    Cancer treatment
Relaxin                                 Aid to childbirth
Serum albumin                           Plasma supplement
Somatostatin                            Treatment of acromegaly
Streptokinase                           Anticoagulant
Tissue plasminogen activator            Anticoagulant
Tumor necrosis factor                   Cancer treatment

Box 14.1 Gene Expression and Kittyboo Colors

Genetic engineering techniques can yield unusual approaches to long-standing problems. To understand how the regulation of gene expression influences complex processes, the activity of more than one gene must be followed simultaneously. The use of genetic engineering techniques with luciferase genes has made this much easier. A Jamaican beetle named the kittyboo has two light organs on its head and one on its abdomen. Recently four luciferase genes have been isolated from the beetle. Each luciferase enzyme produces a different colored light when it acts on the substrate luciferin. Keith V. Wood has cloned the genes and inserted them into E. coli. When the bacteria are exposed to luciferin, they glow (see Box figure). These luciferase genes can be used to study gene regulation. Suppose that one wishes to follow the activity of the liver gene that codes for serum albumin. The albumin gene can be replaced with a luciferase gene, and the modified cell incubated with luciferin. Whenever the albumin gene is activated, the newly inserted luciferase gene will function and the cell will glow. Measurement of light intensity is easy, rapid, and sensitive. By substituting a different luciferase gene for another liver gene, one can simultaneously follow the activity and coordination of two different liver genes. It is only necessary to measure light intensity at the two wavelengths characteristic of the luciferase genes.

Box figure: E. coli with Luciferase Genes. These four streaks of E. coli glow different colors because they contain four different luciferase genes cloned from the Jamaican click beetle or kittyboo, Pyrophorus plagiophthalamus.

Other medical uses of genetic engineering are being investigated. Probes are now being used in the diagnosis of infectious disease. An individual could be screened for mutant genes with probes and hybridization techniques (even before birth when used together with amniocentesis).
A type of genetic surgery called somatic cell gene therapy may be possible for afflicted individuals. For example, cells of an individual with a genetic disease could be removed, cultured, and transformed with cloned DNA containing a normal copy of the defective gene or genes. These cells could then be reintroduced into the individual; if they became established, the expression of the normal genes might cure the patient. An immune deficiency disease patient lacking the enzyme adenosine deaminase that destroys toxic metabolic by-products has been treated in this way. Some of the patient’s lymphocytes (see p. 705) were removed, given the adenosine deaminase gene with the use of a modified retrovirus, and returned to the patient’s body. It may be possible to use a defective retrovirus (see section 18.2) or another virus to directly insert the proper genes into host cells, perhaps even specifically targeted organs or tissues.

Fusion toxins provide a third example of potentially important genetic engineering applications. The first to be developed are recombinant proteins in which the enzymatic and membrane translocation domains of the diphtheria toxin (see pp. 797–98) are combined with proteins that attach specifically to target cell surface receptors. Clinical trials are being carried out on a fusion toxin that contains interleukin-2. IL-2 binds to the cell, and the diphtheria toxin enzymatic domain enters and blocks protein synthesis. Encouraging results have been obtained in the treatment of leukemia, lymphomas, and rheumatoid arthritis. The use of probes and plasmid fingerprinting in clinical microbiology (pp. 843–41)

It now appears that livestock will also be important in medically oriented biotechnology through the use of an approach sometimes called molecular pharming.
Pig embryos injected with human hemoglobin genes develop into transgenic pigs that synthesize human hemoglobin. Current plans are to purify the hemoglobin and use it as a blood substitute. A pig could yield 20 units of blood substitute a year. Somewhat similar techniques have produced transgenic goats whose milk contains up to 3 grams of human tissue plasminogen activator per liter. Tissue plasminogen activator (TPA) dissolves blood clots and is used to treat cardiac patients. Recombinant DNA techniques are playing an increasingly important role in research on the molecular basis of disease. The transfer from research to practical application often is slow because it is not easy to treat a disease effectively unless its molecular mechanism is known. Thus genetic engineering may aid in the fight against a disease by providing new information about its nature, as well as by aiding in diagnosis and therapy. Industrial Applications Industrial applications of recombinant DNA technology include manufacturing protein products by using bacteria, fungi, and cultured mammalian cells as factories; strain improvement for existing bioprocesses; and the development of new strains for additional bioprocesses. As mentioned earlier, the pharmaceutical industry is already producing several medically important polypeptides using this technology. In addition, there is interest in making expensive, industrially important enzymes with recombinant bacteria. Bacteria that metabolize petroleum and other toxic materials have been developed. These bacteria can be constructed by assembling the necessary catabolic genes on a single plasmid and then transforming the appropriate organism. There also are many potential applications in the chemical and food industries. 
Food microbiology (chapter 41); Industrial microbiology and biotechnology (chapter 42)

Agricultural Applications

It is also possible to bypass the traditional methods of selective breeding and directly transfer desirable traits to agriculturally important animals and plants. Potential exists for increasing growth rates and overall protein yields of farm animals. The growth hormone gene has already been transferred from rats to mice, and both in vitro fertilization and embryo implantation methods are fairly well developed. Recently, recombinant bovine growth hormone has been used to increase milk production by at least 10%. Perhaps farm animals’ disease resistance and tolerance to environmental extremes also can be improved.

Cloned genes can be inserted into plant as well as animal cells. Presently a popular way to insert genes into plants is with a recombinant Ti plasmid (tumor-inducing plasmid) obtained from the bacterium Agrobacterium tumefaciens (Box 14.2; see also section 30.4). It also is possible to donate genes by forming plant cell protoplasts, making them permeable to DNA, and then adding the desired rDNA. The advent of the gene gun (p. 335) will greatly aid the production of transgenic plants.

Much effort has been devoted to the transfer of the nitrogen-fixing abilities of bacteria associated with legumes to other crop plants. The genes largely responsible for the process have been cloned and transferred to the genome of plant cells; however, the recipient cells have not been able to fix nitrogen. If successful, the potential benefit for crop plants such as corn is great. Nevertheless, there is some concern that new nitrogen-fixing varieties might spread indiscriminately like weeds or disturb the soil nitrogen cycle (see sections 28.3 and 30.4). Attempts at making plants resistant to environmental stresses have been more successful.
For example, the genes for detoxification of glyphosate herbicides were isolated from Salmonella, cloned, and introduced into tobacco cells using the Ti plasmid. Plants regenerated from the recombinant cells were resistant to the herbicide. Herbicide-resistant varieties of cotton and fertile, transgenic corn also have been developed. This is of considerable importance because many crop plants suffer stress when treated with herbicides. Resistant crop plants would not be stressed by the chemicals being used to control weeds, and yields would presumably be much greater.

U.S. farmers grow substantial amounts of genetically modified (GM) crops. About a third of the corn, half of the soybeans, and a significant fraction of cotton crops are genetically modified. Cotton and corn are resistant to herbicides and insects. Soybeans have herbicide resistance and lowered saturated fat content. Other examples of genetically engineered commercial crops are canola, potato, squash, and tomato.

Many new agricultural applications are being explored. A good example is an altered strain of Pseudomonas syringae that protects against plant frost damage because it cannot produce the protein that induces ice-crystal formation. Much effort is being devoted to defending plants against pests without the use of chemical pesticides. A strain of Pseudomonas fluorescens carrying the gene for the Bacillus thuringiensis toxin (see pp. 1020–21) is under testing. This toxin destroys many insect pests such as the cabbage looper and the European corn borer. A variety of corn with the B. thuringiensis toxin gene has been developed. There is considerable interest in insect-killing viruses and particularly in the baculoviruses. A scorpion toxin gene has been inserted into the Autographa californica multicapsid nuclear polyhedrosis virus (AcMNPV). The engineered AcMNPV kills cabbage looper more rapidly than the normal virus and reduces crop damage significantly.
Finally, virus-resistant strains of soybeans, potatoes, squash, rice, and other plants are under development.

1. List several important present or future applications of genetic engineering in medicine, industry, and agriculture.
2. What is the Ti plasmid and why is it so important?

Box 14.2 Plant Tumors and Nature’s Genetic Engineer

A plasmid from the plant pathogenic bacterium Agrobacterium tumefaciens is responsible for much success in the genetic engineering of plants. Infection of normal cells by the bacterium transforms them into tumor cells, and the crown gall disease develops in dicotyledonous plants such as grapes and ornamental plants. Normally, the gall or tumor is located near the junction of the plant’s root and stem. The tumor forms because of the insertion of genes into the plant cell genome, and only strains of A. tumefaciens possessing a large conjugative plasmid called the Ti plasmid are pathogenic (see Figure 30.20). The Ti plasmid carries genes for virulence and the synthesis of substances involved in the regulation of plant growth. The genes that induce tumor formation reside between two 23 base pair direct-repeat sequences. This region is known as T-DNA and is very similar to a transposon. T-DNA contains genes for the synthesis of plant growth hormones (an auxin and a cytokinin) and an amino acid derivative called opine that serves as a nutrient source for the invading bacteria. In diseased plant cells, T-DNA is inserted into the chromosomes at various sites and is stably maintained in the cell nucleus.
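The statement that T-DNA is the region lying between two direct-repeat sequences can be made concrete with a small Python sketch. This is a toy illustration only: the function name is hypothetical, the border and plasmid strings are invented placeholders rather than real Ti plasmid data, and a short 8-base border stands in for the direct repeat described above.

```python
def region_between_direct_repeats(sequence, border):
    """Return the stretch lying between the first two copies of `border`.

    Both copies are searched for in the same orientation, which is what
    makes the repeat "direct". Returns None if fewer than two copies exist.
    """
    first = sequence.find(border)
    if first == -1:
        return None
    start = first + len(border)
    second = sequence.find(border, start)
    if second == -1:
        return None
    return sequence[start:second]


# Invented placeholder sequences (assumptions for illustration only).
BORDER = "TGGCAGGA"
toy_plasmid = "CCGTA" + BORDER + "ATGAAACCCGGGTTT" + BORDER + "TTAGC"

t_dna = region_between_direct_repeats(toy_plasmid, BORDER)
print(t_dna)  # ATGAAACCCGGGTTT
```

Anything placed between the two borders in the toy string, whatever its content, is returned as the bounded region, which mirrors the way the repeats rather than the enclosed genes define the T-DNA.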
When the molecular nature of crown gall disease was recognized, it became clear that the Ti plasmid and its T-DNA had great potential as a vector for the insertion of rDNA into plant chromosomes. In one early experiment the yeast alcohol dehydrogenase gene was added to the T-DNA region of the Ti plasmid. Subsequent infection of cultured plant cells resulted in the transfer of the yeast gene. Since then, many modifications of the Ti plasmid have been made to improve its characteristics as a vector. Usually one or more antibiotic resistance genes are added, and the nonessential T-DNA, including the tumor-inducing genes, is deleted. Those genes required for the actual infection of the plant cell by the plasmid are left. T-DNA also has been inserted into the E. coli pBR322 plasmid and other plasmids to produce cloning vectors that can move between bacteria and plants (see Box figure). The gene or genes of interest are spliced into the T-DNA region between the direct repeats. Then the plasmid is returned to A. tumefaciens, plant culture cells are infected with the bacterium, and transformants are selected by screening for antibiotic resistance (or another trait in the T-DNA). Finally, whole plants are regenerated from the transformed cells. In this way several plants have been made herbicide resistant (p. 339).

Unfortunately A. tumefaciens does not produce crown gall disease in monocotyledonous plants such as corn, wheat, and other grains; and it has been used only to modify plants such as potato, tomato, celery, lettuce, and alfalfa. However, there is evidence that T-DNA is transferred to monocotyledonous plants and expressed, although it does not produce tumors. This discovery plus the creation of new procedures for inserting DNA into plant cells may well lead to the use of rDNA techniques with many important crop plants.
Box figure: The Use of the Ti Plasmid Vector to Produce Transgenic Plants. (a) The formation of a cloning vector and its use in transformation. See text for details. (Panel labels: firefly luciferase gene and promoter; recombinant plasmid; transform Agrobacterium and infect tobacco plant; plant genome carrying T-DNA and the luciferase gene.) (b) A tobacco plant, Nicotiana tabacum, that has been made bioluminescent by transfection with a special Ti plasmid vector containing the firefly luciferase gene. When the plant is watered with a solution of luciferin, the substrate for the luciferase enzyme, it glows. The picture was made by exposing the transgenic plant to Ektachrome film for 24 hours.

14.9 Social Impact of Recombinant DNA Technology

Despite the positive social impact of rDNA technology, dangers may be associated with rDNA work and gene cloning. The potential to alter an organism genetically raises serious scientific and philosophical questions, many of which have not yet been adequately addressed. In this section some of the debate is briefly reviewed.

The initial concern raised by the scientific community was that recombinant E. coli and other genetically engineered microorganisms (GEMs) carrying dangerous genes might escape and cause widespread infections. Because of these worries, the federal government established guidelines to limit and regulate the locations and types of potentially dangerous experiments. Physical containment was to be practiced; that is, rDNA research was to be carried out in specially designed laboratories with extra safety precautions. In addition, rDNA researchers were also to practice biological containment. Only weakened microbial hosts unable to survive in a natural environment and nonconjugating bacterial strains were to be used to avoid the spread of the vector or rDNA itself.
Further experiments suggested that the dangers were not as extreme as initially conceived. Yet little is known about what would happen if recombinant organisms escaped. For example, could dangerous genes such as oncogenes (see section 18.5) move from a weakened strain to a hardy one that would then spread? Will there be increased risk when extremely large quantities of recombinant microorganisms are grown in industrial fermenters? Some people worry because the use of rDNA techniques has become so widespread and the guidelines on their safe use have been relaxed to a considerable extent. Biomedical rDNA research has been regulated by the Recombinant DNA Advisory Committee (RAC) of the National Institutes of Health. More recently, the Food and Drug Administration (FDA) has assumed principal responsibility for oversight of gene therapy research. The Environmental Protection Agency and state governments have jurisdiction over agriculturally related field experiments. Industrial rDNA research can proceed without such close regulation, and this has also caused some anxiety about potential safety hazards. Aside from these concerns over safety, the use of recombinant technology on human beings raises ethical and moral questions. These problems are not extreme in cases of somatic cell gene therapy to cure serious disease. There is much greater concern about attempts to correct defects or bestow traits considered to be desirable or cosmetic by introducing genes into human eggs and embryos. In 1983 two Nobel Prize winners, several other prominent scientists, and about 40 religious leaders signed a statement to Congress requesting that such alterations of human eggs or sperm not be attempted. There is a temptation to “improve” ourselves or reengineer the human body. Some might wish to try to change their children’s intelligence, size, or physical attractiveness through genetic modifications. 
Others argue that the good arising from such interventions more than justifies the risks of misuse and that there is nothing inherently wrong in modifying the human gene pool to reduce the incidence of genetic disease. This discussion extends to other organisms as well. Some argue that we have no right to create new life-forms and further disrupt the genetic diversity that human activities have already severely reduced. United States courts have decided that researchers and companies can patent “nonnaturally occurring” living organisms, whether microbial, plant, or animal. One of the major current efforts in biotechnology, the human genome project, has determined the sequences of all human chromosomes. Success in this project and further technical advances in biotechnology will make genetic screening very effective. Physicians will be able to detect critical flaws in DNA long before the resulting genetic disease becomes manifest. In some cases this may lead to immediate treatment. But often nothing can be done about the damage and all sorts of dilemmas arise. Should people be told about the situation? Do they even wish to know about defects that cannot be corrected? What happens if their employers and insurance companies learn of the problem? Employees may lose their jobs and insurance. The issue of privacy is crucial in a practical sense, especially if genetic data banks are established. There may be increasing pressure on parents to abort fetuses with genetic defects in order not to stress the medical and social welfare systems. Clearly modern biomedical technology holds both threat and promise. Society has not yet faced its implications despite rapid progress in the development of genetic screening technology. Another area of considerable controversy involves the use of recombinant organisms in agriculture.
Many ecologists are worried that the release of unusual, recombinant organisms without careful prior risk assessment may severely disrupt the ecosystem. They point to the many examples of ecosystem disruption from the introduction of foreign organisms. A major concern is the exchange of genes between transgenic plants and viruses or other plants. Viral nucleic acids inserted into plants to make them virus resistant might combine with the genome of an invading virus to make it even more virulent. Weeds might acquire resistance to viruses, pests, and herbicides from transgenic crop plants. Other potential problems trouble critics. For example, insect-killing viruses might destroy many more insect species than expected. Toxin genes inserted into one virus might move to other viruses. Genetically modified foods might trigger damaging allergic responses in consumers. Potential risks have been vigorously discussed because several recombinant organisms are already on the market. Thus far, no obvious ecological or health effects have been observed. Nevertheless, there have been calls for stricter regulation of GM foods. In Europe, GM crops are not commonly produced by farmers or purchased by consumers. Some large U.S. food producers have responded to public concern and quit using GM crops. As with any technology, the potential for abuse exists. A case in point is the use of genetic engineering in biological warfare and terrorism. Although there are international agreements that limit the research in this area to defense against other biological weapons, the knowledge obtained in such research can easily be used in offensive biological warfare. Effective vaccines constructed using rDNA technology can protect the attacker’s troops and civilian population. 
Because it is relatively easy and inexpensive to prepare bacteria capable of producing massive quantities of toxins or to develop particularly virulent strains of viral and bacterial pathogens, even small countries and terrorist organizations might acquire biological weapons. Many scientists and nonscientists also are concerned about increases in the military rDNA research carried out by the major powers. The emerging threat of bioterrorism (p. 863)

Recombinant DNA technology has greatly enhanced our knowledge of genes and how they function, and it promises to improve our lives in many ways. Yet, as this brief discussion shows, problems and concerns remain to be resolved. Past scientific advances have sometimes led to unanticipated and unfortunate consequences, such as environmental pollution and nuclear weapons. With prudence and forethought we may be able to avoid past mistakes in the use of this new technology.

1. Describe four major areas of concern about the application of genetic engineering. In each case give both the arguments for and against the use of genetic engineering.

Summary

1. Genetic engineering became possible after the discovery of restriction enzymes and reverse transcriptase, and the development of both the Southern blotting technique and other essential methods in nucleic acid chemistry (table 14.1).
2. Oligonucleotides of any desired sequence can be synthesized by a DNA synthesizer machine. This has made possible site-directed mutagenesis.
3. The polymerase chain reaction allows small amounts of DNA to be increased in concentration thousands of times (figure 14.8).
4. Agarose gel electrophoresis is used to separate DNA fragments according to size differences.
5. Fragments can be isolated and identified, then joined with plasmids or phage genomes and cloned; one also can first clone all fragments and subsequently locate the desired clone (figures 14.11, 14.14, and 14.15).
6. Three techniques that can be used to join DNA fragments are the creation of similar sticky ends with a single restriction enzyme, the addition of poly(dA) and poly(dT) to create sticky ends, and blunt-end ligation with T4 DNA ligase.
7. Probes for the detection of recombinant clones are made in several ways and usually labeled with ³²P. Nonradioactive probes are also used.
8. Several types of vectors may be used, each with different advantages: plasmids, phages and other viruses, cosmids, artificial chromosomes, and shuttle vectors (table 14.3).
9. Genes can be inserted into eucaryotic cells by techniques such as microinjection, electroporation, and the use of a gene gun.
10. The recombinant vector often must be modified by the addition of promoters, leaders, and other elements. Eucaryotic gene introns also must be removed. An expression vector has the necessary features to express any recombinant gene it carries.
11. Many useful products, such as the hormone somatostatin, have been synthesized using recombinant DNA technology (figure 14.19).
12. Recombinant DNA technology will provide many benefits in medicine, industry, and agriculture.
13. Despite the great promise of genetic engineering, it also brings with it potential problems in areas of safety, human experimentation, potential ecological disruption, and biological warfare.
Key Terms

autoradiography 322
bacterial artificial chromosome (BAC) 335
biotechnology 320
chimera 334
complementary DNA (cDNA) 321
cosmid 335
electrophoresis 327
electroporation 335
expression vector 336
gene gun 335
genetic engineering 320
library 330
oligonucleotides 323
polymerase chain reaction (PCR technique) 326
probe 322
recombinant DNA technology 320
restriction enzymes 320
site-directed mutagenesis 323
Southern blotting technique 322
Ti plasmid 339
transfection 335
transgenic animal 335
vectors 322
yeast artificial chromosome (YAC) 335

Questions for Thought and Review

1. Could the Southern blotting technique be applied to RNA and proteins? How might this be done?
2. Why could a band on an electrophoresis gel still contain more than one kind of DNA fragment?
3. What advantage might there be in creating a genomic library first rather than directly isolating the desired DNA fragment?
4. In what areas do you think genetic engineering will have the greatest positive impact in the future? Why?
5. What do you consider to be the greatest potential dangers of genetic engineering? Are there ethical problems with any of its potential applications?

Critical Thinking Questions

1. Initial attempts to perform PCR were carried out using the DNA polymerase from E. coli. What was the major difficulty?
2. Why can’t PCR be done with RNA polymerase?
3. Write a one-page explanation for a reader with a high school education that explains gene therapy. Assume the reader has a son or daughter with a life-threatening disease for which the only treatment course is gene therapy.
4. Suppose that one inserted a simple plasmid (one containing an antibiotic resistance gene and a separate restriction site) carrying a human interferon gene into E. coli, but none of the transformed bacteria produced interferon. Give as many plausible reasons as possible for this result.
CHAPTER 15

Microbial Genomics

A DNA chip can be used to follow gene expression by a microorganism’s complete genome. This chip contains probes for the more than 4,200 known open reading frames in the E. coli genome.

Outline

15.1 Introduction
15.2 Determining DNA Sequences
15.3 Whole-Genome Shotgun Sequencing
15.4 Bioinformatics
15.5 General Characteristics of Microbial Genomes
15.6 Functional Genomics
     Genome Annotation
     Evaluation of RNA-Level Gene Expression
     Evaluation of Protein-Level Gene Expression
15.7 The Future of Genomics

Concepts

1. Genomics is the study of the molecular organization of genomes, their information content, and the gene products they encode. It may be divided into structural genomics, functional genomics, and comparative genomics.
2. Individual pieces of DNA can be sequenced using the Sanger method. The easiest way to analyze microbial genomes is by whole-genome shotgun sequencing in which randomly produced fragments are sequenced individually and then aligned by computers to give the complete genome.
3. Because of the mass of data to be analyzed, the use of sophisticated programs on high-speed computers is essential to genomics.
4. Many bacterial genomes have already been sequenced and compared. The results are telling us much about such subjects as genome structure, microbial physiology, microbial phylogeny, and how pathogens cause disease. They will undoubtedly help in preparing new vaccines and drugs for the treatment of infectious disease.
5. Genome function can be analyzed by annotation, the use of DNA chips to study mRNA synthesis, and the study of the organism’s protein content (proteome) and its changes.
“A prerequisite to understanding the complete biology of an organism is the determination of its entire genome sequence.” (J. Craig Venter, et al.)

Chapter 13 provided a brief introduction to microbial recombination and plasmids, including the use of conjugation and other techniques in mapping the chromosome. Chapter 14 described the development and impact of recombinant DNA technology. This chapter will carry these themes further with the discussion of the current revolution in genome sequencing. We will begin with a general overview of the topic, followed by an introduction to the DNA sequencing technique. Next, the whole-genome shotgun sequencing method will be briefly described. This is followed by a comparison of selected microbial genomes and a discussion of what has been learned from them. After we have considered genome structure, we will turn to genome function and the array of transcripts and proteins produced by genomes. The focus will be on annotation, DNA chips, and the use of two-dimensional electrophoresis to study the proteome. The chapter concludes with a brief consideration of future challenges and opportunities in genomics.

15.1 Introduction

Genomics is the study of the molecular organization of genomes, their information content, and the gene products they encode. It is a broad discipline, which may be divided into at least three general areas. Structural genomics is the study of the physical nature of genomes. Its primary goal is to determine and analyze the DNA sequence of the genome. Functional genomics is concerned with the way in which the genome functions. That is, it examines the transcripts produced by the genome and the array of proteins they encode. The third area of study is comparative genomics, in which genomes from different organisms are compared to look for significant differences and similarities. This helps identify important, conserved portions of the genome and discern patterns in function and regulation. The data also provide much information about microbial evolution, particularly with respect to phenomena such as horizontal gene transfer. It should be emphasized at the beginning that whole-genome sequence information provides an entirely new starting point for biological research. In the future, microbiologists will not have to spend as much time cloning genes because they will be able to generate new questions and hypotheses from computer analyses of genome data. Then they can test their hypotheses in the laboratory.

15.2 Determining DNA Sequences

The most widely used sequencing technique is that developed by Frederick Sanger in 1975. This approach uses dideoxynucleoside triphosphates (ddNTPs) in DNA synthesis. These molecules resemble normal nucleotides except that they lack a 3′-hydroxyl group (figure 15.1). They are added to the growing end of the chain, but terminate the synthesis catalyzed by DNA polymerase because more nucleotides cannot be attached to further extend the chain. In the manual sequencing method, a single strand of the DNA to be sequenced is mixed with a primer, DNA polymerase I, four deoxynucleoside triphosphate substrates (one of which is radiolabeled), and a small amount of one of the dideoxynucleotides. DNA synthesis begins with the primer and terminates when a ddNTP is incorporated in place of a regular deoxynucleoside triphosphate. The result is a series of fragments of varying lengths. Four reactions are run, each with a different ddNTP. The mix with ddATP produces fragments with an A terminus; the mix with ddCTP produces fragments with a C terminus, and so forth (figure 15.2). The radioactive fragments are removed from the DNA template and electrophoresed on a polyacrylamide gel to separate them from one another based on size. Four lanes are electrophoresed, one for each reaction mix, and the gel is autoradiographed (see p. 322). A DNA sequence is read directly from the gel, beginning with the smallest fragment or fastest-moving band and moving to the largest fragment or slowest band (figure 15.2a). Up to 800 residues can be read from a single gel. In automated systems dideoxynucleotides that have been labeled with fluorescent dyes are used (each ddNTP is labeled with a dye of a different color). The products from the four reactions are mixed and electrophoresed together. Because each ddNTP fluoresces with a different color, a detector can scan the gel and rapidly determine the sequence from the order of colors in the bands (figure 15.2b,c). Recently, fully automated capillary electrophoresis sequencers have been developed. These are much faster and allow up to 96 samples to be sequenced simultaneously; it is possible to generate over 350 kilobases of sequences a day. Current systems can sequence strands of DNA around 700 bases long in about 4 hours.

Figure 15.1 Dideoxyadenosine triphosphate (ddATP). Note the lack of a hydroxyl group on the 3′ carbon, which prevents further chain elongation by DNA polymerase.

Figure 15.2 The Sanger Method for DNA Sequencing. (a) A sequencing gel with four separate lanes. The sequence begins, reading from the bottom, CAAAAAACGGACCGGGTGTAC. (b) An example of sequencing by use of fluorescent dideoxynucleoside triphosphates. See text for details. (c) Part of an automated DNA sequencing run. Bases 493 to 499 were used as the example in (b).

15.3 Whole-Genome Shotgun Sequencing

Although several virus genomes have been sequenced, in the past it has not been possible to sequence the genomes of bacteria. Prior to 1995, whole-genome approaches to sequencing were not possible because available computational power was insufficient for assembling a genome from thousands of DNA fragments. J. Craig Venter, Hamilton Smith, and their collaborators initially sequenced the genomes of two free-living bacteria, Haemophilus influenzae and Mycoplasma genitalium. The genome of H. influenzae, the first to be sequenced, contains about 1,743 genes in 1,830,137 base pairs and is much larger than a virus genome. Venter and Smith developed an approach called whole-genome shotgun sequencing. The process is fairly complex when considered in detail, and there are many procedures to ensure the accuracy of the results, but the following summary gives a general idea of the approach originally employed by The Institute for Genomic Research (TIGR). For simplicity, this approach may be broken into four stages: library construction, random sequencing, fragment alignment and gap closure, and editing.

1. Library construction. The large bacterial chromosomes were randomly broken into fairly small fragments, about the size of a gene or less, using ultrasonic waves; the fragments were then purified (figure 15.3). These fragments were attached to plasmid vectors (see pp. 334–35), and plasmids with a single insert were isolated. Special E. coli strains lacking restriction enzymes were transformed with the plasmids to produce a library of the plasmid clones.
2. Random sequencing. After the clones were prepared and the DNA purified, thousands of bacterial DNA fragments were sequenced with automated sequencers, employing special dye-labeled primers.
Thousands of templates were used, normally with universal primers that recognized the plasmid DNA sequences just next to the bacterial DNA insert. The nature of the process is such that almost all stretches of genome are sequenced several times, and this increases the accuracy of the final results.
3. Fragment alignment and gap closure. Using special computer programs, the sequenced DNA fragments were clustered and assembled into longer stretches of sequence by comparing nucleotide sequence overlaps between fragments. Two fragments were joined together to form a larger stretch of DNA if the sequences at their ends overlapped and matched (i.e., were the same). This overlap comparison process resulted in a set of larger contiguous nucleotide sequences or contigs. Finally, the contigs were aligned in the proper order to form the completed genome sequence. If gaps existed between two contigs, sometimes fragment samples with their ends in the two adjacent contigs were available. These fragments could be analyzed and the gaps filled in with their sequences. When this approach was not possible, a variety of other techniques were used to align contigs and fill in gaps. For example, phage libraries containing large bacterial DNA fragments were constructed (see pp. 330–332, 335). The large fragments in these libraries overlapped the previously sequenced contigs. These fragments were then combined with oligonucleotide probes that matched the ends of the contigs to be aligned. If the probes bound to a library fragment, it could be used to prepare a stretch of DNA that represented the gap region. Overlaps in the sequence of this new fragment with two contigs would allow them to be placed side-by-side and fill in the gap between them.
4. Editing. The sequence was then carefully proofread in order to resolve any ambiguities in the sequence. Also the sequence was checked for unwanted frameshift mutations and corrected if necessary.

The approach worked so well that it took less than 4 months to sequence the M. genitalium genome (about 500,000 base pairs in size). The shotgun technique also has been used successfully by Celera Genomics in the Human Genome Project and to sequence the Drosophila genome.

Figure 15.3 Whole-Genome Shotgun Sequencing. This general overview shows how the Haemophilus influenzae genome was sequenced. See text for details. (Steps shown: sonication of the H. influenzae chromosome; agarose gel electrophoresis of the fragments with DNA size markers; purification of fragments of about 2 kb from the gel; clonal library preparation; sequencing of the clonal inserts, particularly the end sequences; construction of sequence contigs, alignment using overlaps, and filling in of gaps.)

Once the genome sequence has been established, the process of annotation begins. The goal of annotation is to determine the location of specific genes in the genome map. Every open reading frame (ORF), that is, a reading frame sequence (see p. 241) not interrupted by a stop codon, that is larger than 100 codons is considered to be a potential protein coding sequence. Computer programs are used to compare the sequence of the predicted ORF against large databases containing nucleotide and amino acid sequences of known enzymes and other proteins. If a bacterial sequence matches one in the database, it is assumed to code for the same protein. Although this comparison process is not without errors, it can provide tentative function assignments for about 40 to 50% of the presumed coding regions. It also gives some information about transposable elements, operons, repeat sequences, the presence of various metabolic pathways, and other genome features.
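The ORF criterion just described can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name find_orfs is hypothetical, only the three reading frames of one strand are scanned (a real annotation program would also scan the complementary strand), and an ORF is taken, as in the text, to be any run of codons not interrupted by a stop codon.

```python
# Illustrative sketch only; find_orfs is a hypothetical helper, not a
# function from any annotation package.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=100):
    """Return (frame, start, end, n_codons) for every stop-free run of
    at least min_codons codons in the three forward reading frames.
    start and end are 0-based positions; end is exclusive."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        run_start = frame          # where the current stop-free run began
        pos = frame
        while pos + 3 <= len(dna):
            if dna[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - run_start) // 3
                if n_codons >= min_codons:
                    orfs.append((frame, run_start, pos, n_codons))
                run_start = pos + 3   # next run starts after the stop codon
            pos += 3
        # A run may also reach the end of the sequence without a stop codon.
        n_codons = (pos - run_start) // 3
        if n_codons >= min_codons:
            orfs.append((frame, run_start, pos, n_codons))
    return orfs

if __name__ == "__main__":
    # Made-up toy sequence; a small threshold is used so that it reports hits.
    toy = "ATG" + "GCT" * 5 + "TAA" + "GC"
    for orf in find_orfs(toy, min_codons=4):
        print(orf)
```

With the default threshold of 100 codons, short chance runs between stop codons are ignored; this is the reason the text sets a minimum length before an ORF is treated as a potential coding sequence.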
The results of genome sequencing and annotation for Mycoplasma genitalium and Haemophilus influenzae are shown in figures 15.5 and 15.6. Often the results of annotation are expressed in a diagram that summarizes the known metabolism and physiology of the organism. An example of this is given in figure 15.7.

1. Define genomics. What are the three general areas into which it can be divided?
2. Describe the Sanger method for DNA sequencing.
3. Outline the whole-genome shotgun sequencing method. What is annotation and how is it carried out?

15.4 Bioinformatics

DNA sequencing techniques have developed so rapidly that an enormous amount of data has already accumulated and genomes are being sequenced at an ever-increasing pace. The only way to organize and analyze all these data is through the use of computers, and this has led to the development of a new interdisciplinary field that combines biology, mathematics, and computer science. Bioinformatics is the field concerned with the management and analysis of biological data using computers. In the context of genomics, it focuses on DNA and protein sequences. The annotation process just described is one aspect of bioinformatics. DNA sequence data are stored in large databases. One of the largest genome databases is the International Nucleic Acid Sequence Data Library, often referred to as GenBank. Databases can be searched with special computer programs to find homologous sequences, DNA sequences that are similar to the one being studied. Protein coding regions also can be translated into amino acid sequences and then compared. These sequence comparisons can suggest functions of the newly discovered genes and proteins. The gene under study often will have a function similar to that of genes with homologous DNA or amino acid sequences.
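Translating a coding region and comparing sequences can be sketched in a few lines of Python. This is a simplified illustration, not how database search programs actually work: the helper names translate and percent_identity are hypothetical, the standard genetic code is assumed, and the comparison is position by position between sequences of equal length, whereas real searches use alignment methods that allow for gaps. The two DNA strings are made-up examples.

```python
from itertools import product

# Standard genetic code, written in the usual T, C, A, G order for each
# codon position; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a coding region, codon by codon, until a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

def percent_identity(seq_a, seq_b):
    """Percentage of positions that match in two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

if __name__ == "__main__":
    gene_1 = "ATGGCTAAATAA"   # made-up example sequences
    gene_2 = "ATGGCCAAGTGA"
    print(percent_identity(gene_1, gene_2))                        # 75.0
    print(translate(gene_1), translate(gene_2))                    # MAK MAK
    print(percent_identity(translate(gene_1), translate(gene_2)))  # 100.0
```

In this toy case the two DNA sequences differ at three of twelve positions yet encode the same amino acids, which illustrates why comparing translated sequences can reveal similarities that are less obvious at the DNA level.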
Table 15.1 Examples of Complete Published Microbial Genomes

Genome                                  Domain(a)   Size (Mb)   %G+C
Aquifex aeolicus                        B           1.50        43
Archaeoglobus fulgidus                  A           2.18        48
Bacillus subtilis                       B           4.20        43
Borrelia burgdorferi                    B           1.44        28
Campylobacter jejuni                    B           1.64        31
Chlamydia pneumoniae                    B           1.23        40
Chlamydia trachomatis                   B           1.05        41
Deinococcus radiodurans                 B           3.28        67
Escherichia coli                        B           4.60        50
Haemophilus influenzae Rd               B           1.83        39
Helicobacter pylori                     B           1.66        39
Methanobacterium thermoautotrophicum    A           1.75        49
Methanococcus jannaschii                A           1.66        31
Mycobacterium tuberculosis              B           4.40        65
Mycoplasma genitalium                   B           0.58        31
Mycoplasma pneumoniae                   B           0.81        40
Neisseria meningitidis                  B           2.27        51
Pseudomonas aeruginosa                  B           6.3         67
Pyrococcus horikoshii                   A           1.80        42
Rickettsia prowazekii                   B           1.10        29
Saccharomyces cerevisiae                E           13          38
Synechocystis sp.                       B           3.57        47
Thermotoga maritima                     B           1.80        46
Treponema pallidum                      B           1.14        52
Vibrio cholerae                         B           4.0         48

(a) The following abbreviations are used: A, Archaea; B, Bacteria; E, Eucarya.

15.5 General Characteristics of Microbial Genomes

The development of shotgun sequencing and other genome sequencing techniques has led to the characterization of many procaryotic genomes in a very short time. Many genome sequences of procaryotes have been completed and published, and some of these are given in table 15.1. These procaryotes represent great phylogenetic diversity (figure 15.4). At least 100 more procaryotes, many of them major human pathogens, are being sequenced at present. Comparison of the genomes from different procaryotes will contribute significantly to the understanding of procaryotic evolution and help deduce which genes are responsible for various cellular processes. Genome sequences will aid in our understanding of genetic regulation and genome organization. In some cases, such information will also aid in the search for human genes by the Human Genome Project because of the similarities between procaryotic and human biochemistry.
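The %G+C column of table 15.1 reports base composition, which is simple to compute once a sequence is available. The sketch below is illustrative only; the function name `gc_percent` is hypothetical, and it assumes that characters other than A, C, G, and T (for example, N for an unresolved base) should be ignored.

```python
def gc_percent(dna):
    """Return the percentage of G and C among the unambiguous bases of a DNA string."""
    bases = [base for base in dna.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T")
    gc_count = sum(base in "GC" for base in bases)
    return 100.0 * gc_count / len(bases)


if __name__ == "__main__":
    print(gc_percent("ATGC"))    # half of the bases are G or C
    print(gc_percent("GGCCNN"))  # ambiguous N characters are skipped
```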
Currently published genome sequences already have provided new and important insights into genome organization and function. Mycoplasma genitalium grows in human genital and respiratory tracts and has a genome of only 580 kilobases in size, one of the smallest genomes of any free-living organism (figure 15.5). Thus the sequence data are of great interest because they help establish the minimal set of genes needed for a free-living existence. There appear to be approximately 517 genes (480 protein-encoding genes and 37 genes for RNA species). About 90 proteins are involved in translation, and only around 29 proteins for DNA replication. Interestingly, 140 genes, or 29% of those in the genome, code for membrane proteins, and up to 4.5% of the genes seem to be involved in evasion of host immune responses. Only 5 genes have regulatory functions. Even in this smallest genome, 22% of the genes do not match any known protein sequence. Comparison with the M. pneumoniae genome and studies of gene inactivation by transposon insertion suggest that about 108 to 121 M. genitalium genes may not be essential for survival. Thus the minimum gene set required for laboratory growth conditions seems to be approximately 265 to 350 genes; about 100 of these have unknown functions.

Figure 15.4 Phylogenetic Relationships of Some Procaryotes with Sequenced Genomes. These procaryotes are discussed in the text. Methanococcus jannaschii is in the domain Archaea; the rest are members of the domain Bacteria. Bacteria shown: Aquifex aeolicus, Deinococcus radiodurans, Mycobacterium tuberculosis, Mycoplasma genitalium, Chlamydia trachomatis, Treponema pallidum, Rickettsia prowazekii, Haemophilus influenzae, and Escherichia coli. Genomes from a broad diversity of procaryotes have been sequenced and compared. Source: The Ribosomal Database Project.

Figure 15.5 Map of the Mycoplasma genitalium genome. The predicted coding regions are shown with the direction of transcription indicated by arrows. The genes are color coded by their functional role. The rRNA operon, tRNA genes, and adhesin protein operons (MgPa) are indicated. Reprinted with permission from Fraser, C. M., et al. Copyright 1995. The minimal gene complement of Mycoplasma genitalium. Science 270:397–403. Figure 1, page 398 and The Institute for Genomic Research.

Haemophilus influenzae has a much larger genome, 1.8 megabases and 1,743 genes (figure 15.6). More than 40% of the genes have unknown functions. It has already been found that the bacterium lacks three Krebs cycle genes and thus a functional cycle. It does devote many more genes (64 genes) to regulatory functions than does M. genitalium. Haemophilus influenzae is a species capable of transformation (see pp. 305–7). The process must be very important to this bacterium because it contains 1,465 copies of the recognition sequence used in DNA uptake during transformation. Methanococcus jannaschii, a member of the Archaea, also has been sequenced. Only 44% of its 1,738 genes match those of other organisms, an indication of how different this archaeon is from bacteria and eucaryotes. Despite this profound difference, many of its genes for DNA replication, transcription, and translation are similar to eucaryotic genes and quite different from bacterial genes. However, the metabolism of M. jannaschii is more similar to that of bacteria than to eucaryotic metabolism. More recently the sequence of the 4.6 megabase Escherichia coli K12 genome has been published. About 5 to 6% of the genes code for proteins involved in cell and membrane structure; 12 to 14% for transport proteins; 10% for the enzymes of energy and central intermediary metabolic pathways; 4% for regulatory genes; and 8% for replication, transcription, and translation proteins. 
The genome contains about 4,288 predicted genes, almost 2,500 of which do not resemble known genes. The large number of unknown genes in Escherichia coli, Haemophilus influenzae, and other procaryotes has great significance. It shows how little we know about microbial biology. Clearly there is much more to learn about the genetics, physiology, and metabolism of procaryotes, even of those that have been intensively studied.

Figure 15.6 Map of the Haemophilus influenzae genome. The predicted coding regions in the outer concentric circle are indicated with colors representing their functional roles. The outer perimeter shows the NotI, RsrII, and SmaI restriction sites. The inner concentric circle shows regions of high G + C content (red and blue) and high A + T content (black and green). The third circle shows the coverage by clones (blue). The fourth circle shows the locations of rRNA operons (green), tRNAs (black), and the mu-like prophage (blue). The fifth circle shows simple tandem repeats and the probable origin of replication (outward pointing green arrows). The red lines are potential termination sequences. Reprinted with permission from Fleischmann, R. D., et al. 1995. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269:496–512. Figure 1, page 507 and The Institute for Genomic Research.

Comparison of these and other genome sequences shows large differences. Not surprisingly, E. coli is most similar to H. influenzae, which is also a member of the γ-proteobacteria (1,130 similar genes). It differs more from the cyanobacterium Synechocystis sp. PCC6803 (675 similarities) and Mycoplasma genitalium (468 similarities). These four bacteria have only 111 proteins in common. Escherichia coli is even more unlike the archaeon M. jannaschii (231 similar genes) and the eucaryotic yeast Saccharomyces cerevisiae (254 similar genes). Only 16 proteins, mostly translation proteins such as ribosomal proteins and aminoacyl-tRNA synthetases, are essentially the same in all six organisms. There have been many gene losses and changes during the course of evolution.

Many of the genomes already sequenced belong to procaryotes that are either major human pathogens or of particular biological interest. Some recent sequences have yielded interesting discoveries and will be briefly discussed as examples of the kind of important information that can be obtained from genomics. Because of their practical importance, our focus will be primarily on human pathogens. As we will see, the results often pose more new questions than they answer old ones and open up many new areas of research.

The deinococci are soil bacteria of great interest because of their ability to survive a dose of radiation thousands of times greater than the amount needed to kill humans. They survive by stitching together their splintered chromosomes after radiation exposure. The Deinococcus radiodurans genome consists of two circular chromosomes of different size (2.6 Mb and 0.4 Mb), a megaplasmid (177,466 bp), and a regular plasmid (45,704 bp). One would think that this genome should have some quite different DNA repair genes. However, despite its remarkable resistance to radiation, Deinococcus radiodurans has the same array of DNA repair mechanisms as other bacteria. It differs in simply having more of them. 
For example, most organisms have one MutT gene, which is involved in disposing of oxidized nucleotides; Deinococcus has 20 MutT-like genes. The genome also possesses many repeat sequences, which may be important in the repair process. It should be emphasized that many of the bacterium's genes have unknown functions, and some of these genes may aid in its unusually great resistance to radiation.

The deinococci (p. 468)

Rickettsia prowazekii is a member of the α-proteobacteria that is an obligate intracellular parasite of lice and humans. It is the causative agent of typhus fever and killed millions during and following the First and Second World Wars. Many microbiologists think that mitochondria may have arisen when a member of the α-proteobacteria established an endosymbiotic relationship with the ancestral eucaryotic cell (see pp. 424–25). The sequenced genome of R. prowazekii is consistent with this hypothesis. Its protein-encoding genes show similarities to mitochondrial genes. Glycolysis is absent, but genes for the TCA cycle and electron transport are present, and ATP synthesis is similar to that in mitochondria. Both Rickettsia and the mitochondrion lack many genes for the biosynthesis of amino acids and nucleosides, in contrast with the situation in free-living α-proteobacteria. Thus aerobic respiration in eucaryotes may have arisen from an ancestor of Rickettsia.

Rickettsia biology and clinical aspects (pp. 488–90, 909–10)

Chlamydiae are nonmotile, coccoid, gram-negative bacteria that reproduce only within cytoplasmic vesicles of eucaryotic cells by a unique life cycle. Chlamydia trachomatis infects humans and causes the sexually transmitted disease nongonococcal urethritis, probably the most commonly transmitted sexual disease in the United States. It also is the leading cause of preventable blindness in the world. The sequencing of its genome has revealed several surprises. The bacterium's life cycle is so unusual (see pp. 477–78) that one would expect its genome to be somewhat atypical. This has turned out not to be the case; the genome is similar to that of many other bacteria. Microbiologists have called Chlamydia an "energy parasite" and believed that it obtained all its ATP from the host cell. The genome results show that Chlamydia has the genes to make at least some ATP on its own, although it also has genes for ATP transporters. Another surprise is the presence of enzymes for the synthesis of peptidoglycan. Chlamydial cell walls lack peptidoglycan, and microbiologists have been unable to explain why the antibiotic penicillin, which disrupts peptidoglycan synthesis, is able to inhibit chlamydial growth. The presence of peptidoglycan biosynthetic enzymes helps account for the penicillin effect, but no one knows the purpose of peptidoglycan synthesis in this bacterium. Another major surprise is the absence of the FtsZ gene, which is thought to be required by all bacteria and archaea for septum formation during cell division (see p. 286). The absence of this supposedly essential gene makes one wonder how Chlamydia divides. It may be that some of the genes with unknown functions play a major role in cell division. Perhaps Chlamydia employs a mechanism of cell division different from that of other procaryotes. Finally, the genome contains at least 20 genes that have been obtained from eucaryotic host cells (most bacteria have no more than 3 or 4 such genes). Some of these genes are plantlike; originally Chlamydia may have infected a plantlike host and then moved to animals.

Chlamydia (pp. 477–78; section 39.3)

One of the most difficult human pathogens to study has been the causative agent of syphilis, Treponema pallidum. This is because it has not been possible to grow Treponema outside the human body. We know little about its metabolism or the way it avoids host defenses, and no vaccine for Treponema has yet been developed. 
Naturally the sequencing of the Treponema pallidum genome has generated considerable excitement and hope. It turns out that Treponema is metabolically crippled. It can use carbohydrates as an energy source, but lacks the TCA cycle and oxidative phosphorylation (figure 15.7). Treponema also lacks many biosynthetic pathways (e.g., for enzyme cofactors, fatty acids, nucleotides, and some electron transport proteins) and must rely on molecules supplied by its host. In fact, about 5% of its genes code for transport proteins. Given the lack of several critical pathways, it is not surprising that the pathogen has not been cultured successfully. The genes for surface proteins are of particular interest. Treponema has a family of surface protein genes characterized by many repetitive sequences. Some have speculated that these genes might undergo recombination in order to generate new surface proteins and allow the organism to avoid attack by the immune system, but this is not certain. However, it may be possible to develop a vaccine for syphilis using some of the newly discovered surface proteins. We also may be able to identify strains of Treponema using these surface proteins, which would be of great importance in syphilis epidemiology. The genome results have not provided much of a clue about how Treponema causes syphilis. About 40% of the genes have unknown functions. Possibly some of them are responsible for avoiding host defenses and for the production of toxins and other virulence factors. Treponema and syphilis (pp. 479–81, 923–24) For centuries, tuberculosis has been one of the major scourges of humankind and still kills about 3 million people annually. Furthermore, because of the spread of AIDS and noncompliance in drug treatment, Mycobacterium tuberculosis is increasing in frequency once again and is becoming ever more drug resistant. 
Anything that can be learned from genome studies could be of great importance in the fight to control the renewed spread of tuberculosis. The Mycobacterium tuberculosis genome is one of the largest yet found (4.40 Mb), exceeded only by E. coli (4.60 Mb genome) and Pseudomonas aeruginosa (6.26 Mb), and contains around 4,000 genes. Only about 40% of the genes have been given precise functions and 16% of its genes resemble no known proteins; presumably they are responsible for specific mycobacterial functions. More than 250 genes are devoted to lipid metabolism (E. coli has only about 50 such genes), and M. tuberculosis may obtain much of its energy by degrading host lipids. There are a surprisingly large number of regulatory elements in the genome. This may mean that the infection process is much more complex and sophisticated than previously thought. Two families of novel glycine-rich proteins with unknown functions are present and represent about 10% of the genome. They may be a source of antigenic variation and involved in defense against the host immune system. One major medical problem has been the lack of a good vaccine. A large number of proteins that are either secreted by the bacterium or on the bacterial surface have been identified from the genome sequence. It is hoped that some of these proteins can be used to develop new, effective vaccines. This is particularly important in view of the spread of multiply drug-resistant M. tuberculosis.

Mycobacterium tuberculosis (pp. 543–44, 906–8)

Figure 15.7 Metabolic Pathways and Transport Systems of Treponema pallidum. This depicts T. pallidum metabolism as deduced from genome annotation. Note the limited biosynthetic capabilities and extensive array of transporters. Although glycolysis is present, the TCA cycle and respiratory electron transport are lacking. Question marks indicate where uncertainties exist or expected activities have not been found. (Labeled regions of the diagram: energy production, murein synthesis, cell wall synthesis, and protein secretion.)

It is tempting to think that closely related and superficially similar bacteria must have similar genomes. Although the genome of the leprosy bacillus, Mycobacterium leprae, has only been about 90% sequenced, it is already clear that this assumption can be mistaken. The whole M. leprae genome is a third smaller than that of M. tuberculosis. About half the genome seems to be devoid of functional genes; it consists of junk DNA that represents over 1,000 degraded, nonfunctional genes. In total, M. leprae seems to have lost as many as 2,000 genes during its career as an intracellular parasite. It even lacks some of the enzymes required for energy production and DNA replication. 
This might explain why the bacterium has such a long doubling time, about two weeks in mice. One hope from the genomics study is that critical surface proteins can be discovered and used to develop a sensitive test for early detection of leprosy. This would allow immediate treatment of the disease before nerve damage occurs.

Leprosy (pp. 916–17)

Table 15.2 Estimated Number of Genes Involved in Various Cell Functions (a)
Gene Function: Approximate total number of genes (b); Cellular processes (c); Cell envelope components; Transport and binding proteins; DNA metabolism; Transcription; Protein synthesis; Regulatory functions; Energy metabolism (d); Central intermediary metabolism (e); Amino acid biosynthesis; Fatty acid and phospholipid metabolism; Purines, pyrimidines, nucleosides, and nucleotides; Biosynthesis of cofactors and prosthetic groups
Genomes: Pyrococcus abyssi; Chlamydia trachomatis; Methanococcus jannaschii; Rickettsia prowazekii; Mycobacterium tuberculosis; Treponema pallidum; Mycoplasma genitalium; Bacillus subtilis; Escherichia coli K12
757 77 53 59 51 25 97 22 54 6 7 11 21 15
523 27 36 18 39 23 87 6 48 6 9 11 12 17
847 43 42 57 53 23 100 15 61 12 13 25 15 31
2,095 65 50 87 57 26 90 77 211 57 72 78 48 84
477 6 29 33 29 13 90 5 33 7 0 8 19 4
2,232 123 86 223 80 45 105 163 230 61 97 53 68 79
2,933 179 146 304 97 38 121 159 351 64 89 67 75 97
1,271 26 25 56 53 21 117 18 158 18 64 9 37 49
1,345 44 25 67 33 19 99 19 116 25 51 8 40 31
(a) Data adapted from TIGR (The Institute for Genomic Research) databases.
(b) The number of genes with known or hypothetical functions.
(c) Genes involved in cell division, chemotaxis and motility, detoxification, transformation, toxin production and resistance, pathogenesis, adaptations to atypical conditions, etc.
(d) Genes involved in amino acid and sugar catabolism, polysaccharide degradation and biosynthesis, electron transport and oxidative phosphorylation, fermentation, glycolysis/gluconeogenesis, pentose phosphate pathway, Entner-Doudoroff, pyruvate dehydrogenase, TCA cycle, photosynthesis, chemoautotrophy, etc.
(e) Amino sugars, phosphorus compounds, polyamine biosynthesis, sulfur metabolism, nitrogen fixation, nitrogen metabolism, etc.

Analysis and comparison of the genomes already sequenced have disclosed some general patterns in genome organization. Although protein sequences are usually conserved (i.e., about 70% of proteins contain ancient conserved regions), genome organization is quite variable in the Bacteria and Archaea. Sometimes two genes can fuse to form a new gene that has a combination of the functions possessed by the two separate genes. Less often, a gene can split or undergo fission; this seems to be more prevalent in thermophilic procaryotes. There also appears to be considerable horizontal gene transfer, particularly of housekeeping or operational genes. Informational genes, primarily those essential to transcription and translation, are not transferred as often. Perhaps, as James Lake has proposed, genes whose protein products are parts of large, complex systems and interact with many other molecules are not often transferred successfully. About 18% of the genes in E. coli seem to have been acquired by horizontal transfer after its divergence from Salmonella. There also is gene transfer between domains. 
The bacterium Aquifex aeolicus probably received about 16% of its genes from the Archaea, and 24% of the genes in Thermotoga maritima are similar to archaeal sequences. Some microbiologists have proposed that new species are created by the acquisition of genes that allow exploitation of a new ecological niche. For example, E. coli may have acquired the lactose operon and thus become able to metabolize the milk sugar lactose. This capacity would aid in colonization of the mammalian colon. Existence of extensive gene transfer between species and domains may require reevaluation of the bacterial taxonomic schemes that are based only on rRNA sequences (see sections 19.6 and 19.7). The comparison of many more genome sequences may clarify these phylogenetic relationships.

Horizontal gene transfer (section 13.1)

1. What sorts of general insights have been provided by the analysis of the genomes of M. genitalium, H. influenzae, M. jannaschii, and E. coli?
2. The genomes of D. radiodurans, R. prowazekii, C. trachomatis, T. pallidum, M. tuberculosis, and M. leprae have been briefly discussed. Give one or two surprises or interesting insights that have arisen from each genome sequence.
3. Discuss what has been learned about horizontal gene transfer from genome comparisons.

15.6 Functional Genomics

Clearly, determination of genome sequences is only the start of genome research. It will take years to learn how the genome actually functions in a cell or organism (if that is completely possible) and to apply this knowledge in practical ways such as the conquest of disease and increased crop production. Sometimes the study of genome function and the practical application of this knowledge is referred to as postgenomics because it builds upon genome sequencing data. Functional genomics is a major postgenomics discipline. As mentioned earlier, functional genomics is concerned with learning how the genome operates. 
We will consider a few of the many approaches used to study genome function. First we will discuss annotation, which has already been introduced in the context of genome sequencing. Then techniques for the study of RNA- and protein-level expression will be described.

Genome Annotation

After sequencing, annotation can be used to tentatively identify many genes, and this allows analysis of the kinds of genes and functions present in the microorganism (figure 15.7). Table 15.2 summarizes some of the data for several important procaryotic genomes, seven bacterial and two archaeal (Methanococcus and Pyrococcus). Even with these few examples, patterns can be seen. First, genes responsible for essential informational functions (DNA metabolism, transcription, and protein synthesis) do not vary in number as much as other genes. There seems to be a minimum number of these essential genes necessary for life. Second, complex free-living bacteria such as E. coli and B. subtilis have many more operational or housekeeping genes than do most of the parasitic forms, which depend on the host for a variety of nutrients. Generally, larger genomes show more metabolic diversity. Parasitic bacteria derive many nutrients from their hosts and can shed genes for unnecessary pathways; thus they have smaller genomes.

Evaluation of RNA-Level Gene Expression

One of the best ways to evaluate gene expression is through the use of DNA microarrays (DNA chips). These are solid supports, typically of glass or silicon and about the size of a microscope slide, that have DNA attached in highly organized arrays. The chips can be constructed in several ways. 
In one approach, a programmable robotic machine delivers hundreds to thousands of microscopic droplets of DNA samples to specific positions on a chip using tiny pins to apply the solution (see figure 42.26, p. 1020). The spots are then dried and treated in order to bind the DNA tightly to the surface. Any DNA fragment can be attached in this way; often cDNA (see p. 321) about 500 to 5,000 bases long is used. A second procedure involves the synthesis of oligonucleotides directly on the chip in the following way (figure 15.8):

1. Coat the glass support with light-sensitive protecting groups that prevent random nucleoside attachment.
2. Cover the surface with a mask that has holes corresponding to the sites for attachment of the desired nucleosides.
3. Shine laser light through the mask holes to remove the exposed protecting groups.
4. Bathe the chip in a solution containing the first nucleoside to be attached. The nucleoside will chemically couple to the light-activated sites. Each nucleoside has a light-removable protecting group to prevent addition of another nucleoside until the appropriate time.
5. Repeat steps 2 through 4 with a new mask each time to add nucleosides until all sequences on the chip have been completed.

Figure 15.8 Construction of a DNA Chip with Attached Oligonucleotide Sequences. Only two cycles of synthesis are shown. See text for description of the steps.

This procedure can be used to construct any sequence. The commercial chip contains oligonucleotide probes that are 25 bases long. It is about 1.3 cm on a side and can have over 200,000 addressable positions (figure 15.9). The probes are often expressed sequence tags. An expressed sequence tag (EST) is a partial gene sequence unique to the gene in question that can be used to identify and position the gene during genomic analysis. It is derived from cDNA molecules. 
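The mask-directed synthesis cycle described above can be modeled with a short simulation. This is a sketch under stated assumptions, not a description of any manufacturer's process: the function name `synthesize_chip` is hypothetical, and the schedule (for each probe position, one mask and one coupling bath per nucleoside that is needed somewhere on the chip) is chosen only for illustration.

```python
def synthesize_chip(targets):
    """Simulate mask-directed synthesis of the probe sequences in `targets`.

    Returns (built, cycles): the sequences assembled at each site and the
    number of mask/irradiation/coupling cycles that were needed.
    """
    built = ["" for _ in targets]
    cycles = 0
    longest = max((len(t) for t in targets), default=0)
    for position in range(longest):
        for base in "ACGT":
            # Step 2: the mask has holes only over sites that need this base next.
            mask = [i for i, t in enumerate(targets)
                    if len(t) > position and t[position] == base]
            if not mask:
                continue  # no site needs this base at this position
            # Steps 3 and 4: light removes the protecting groups at the exposed
            # sites, and the nucleoside in the bath couples only there.
            for site in mask:
                built[site] += base
            cycles += 1
    return built, cycles


if __name__ == "__main__":
    probes = ["AC", "CA", "AA"]
    built, cycles = synthesize_chip(probes)
    print(built, cycles)
```

Under this assumed schedule the number of cycles depends on probe length rather than on the number of sites, so a chip of 25-base probes would need at most 4 × 25 = 100 cycles no matter how many addressable positions it carries.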
There are now chips that have probes for every expressed gene or open reading frame in the genome of E. coli (about 4,200 open reading frames) and the yeast Saccharomyces cerevisiae (approximately 6,100 open reading frames).

The nucleic acids to be analyzed, often called the targets, are isolated and labeled with fluorescent reporter groups. Nucleic acid targets may be mRNA or cDNA produced from mRNA by reverse transcription (see figure 14.3). The chip is incubated with the target mixture long enough to ensure proper binding to probes with complementary sequences. The unbound target is washed off and the chip is then scanned with laser beams. Fluorescence at an address indicates that the probe is bound to that particular sequence. Analysis of the hybridization pattern shows which genes are being expressed. Target samples from two experiments can be labeled with different fluorescent groups and compared using the same chip. Figure 15.9 and the chapter opening figure provide examples of fluorescently labeled DNA microarrays.

Figure 15.9 The GeneChip Expression Probe Array. The DNA chip manufactured by Affymetrix, Inc. contains probes designed to represent thousands or tens of thousands of genes.

DNA chip results allow one to observe the characteristic expression of whole sets of genes during differentiation or in response to environmental changes. In some cases, many genes change expression in response to a single change in conditions. Patterns of gene expression can be detected and functions can be tentatively assigned based on expression. If an unknown gene is expressed under the same conditions as genes of known function, it is coregulated and quite likely shares the same general function. DNA chips also can be used to study regulatory genes directly by perturbing a regulatory gene and observing the effect on genome activity. Of course, only mRNAs that are currently expressed can be detected. If a gene is transiently expressed, its activity may be missed by a DNA chip analysis.

Evaluation of Protein-Level Gene Expression

Genome function can be studied at the translation level as well as the transcription level. The entire collection of proteins that an organism produces is called its proteome. Thus proteomics is the study of the proteome or the array of proteins an organism can produce. It is an essential discipline because proteomics provides information about genome function that mRNA studies cannot. There is not always a direct correlation between mRNA and protein levels because of the posttranslational modification of proteins and protein turnover. Measurement of mRNA levels can show the dynamics of gene expression and tell what might occur in the cell, whereas proteomics discovers what is actually happening.

Although new techniques in proteomics are currently being developed, we will focus briefly only on the traditional approach. A mixture of proteins is separated using two-dimensional electrophoresis. The first dimension makes use of isoelectric focusing, in which proteins move electrophoretically through a pH gradient (e.g., pH 3 to 10 or 4 to 7). The protein mixture is applied to a strip with an immobilized pH gradient and electrophoresed. Each protein moves along the strip until the pH on the strip equals its isoelectric point. At this point, the protein's net charge is zero and the protein stops moving. Thus the technique separates the proteins based on their content of ionizable amino acids. 
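The statement that a protein stops moving where its net charge is zero can be made concrete with a rough calculation. The sketch below estimates net charge from the Henderson-Hasselbalch relationship and finds the isoelectric point by bisection. The pKa values are approximate and illustrative (published sets differ, and the pKa of a side chain in a folded protein depends on its surroundings), and the names `net_charge` and `isoelectric_point` are hypothetical.

```python
# Approximate, illustrative pKa values; real values vary with protein context.
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}            # groups charged +1 when protonated
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}   # groups charged -1 when deprotonated
N_TERMINUS_PKA = 8.6
C_TERMINUS_PKA = 3.6


def net_charge(sequence, ph):
    """Estimate the net charge of a polypeptide (one-letter codes) at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))    # amino terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))   # carboxyl terminus
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=1e-4):
    """Find the pH at which the estimated net charge is zero.

    Net charge falls steadily as pH rises, so bisection converges.
    """
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(sequence, middle) > 0:
            low = middle   # still positively charged: the pI lies at higher pH
        else:
            high = middle
    return (low + high) / 2.0


if __name__ == "__main__":
    for peptide in ("DDDD", "GGGG", "KKKK"):
        print(peptide, round(isoelectric_point(peptide), 2))
```

On a pH gradient strip, the acidic peptide in this example would focus near the low-pH end and the basic peptide near the high-pH end, which is the sense in which the first dimension separates proteins by their content of ionizable amino acids.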
The second dimension is SDS polyacrylamide gel electrophoresis (SDS-PAGE). SDS (sodium dodecyl sulfate) is an anionic detergent that denatures proteins and coats the polypeptides with a negative charge. After the first electrophoretic run has been completed, the pH gradient strip is soaked in SDS buffer and then placed at the edge of an SDS-PAGE gel sheet. Then a voltage is applied at right angles to the strip at the edge of the sheet. Under these circumstances, polypeptides migrate through the polyacrylamide gel at a rate inversely related to their masses. That is, the smallest polypeptide will move the farthest in a particular length of time. This two-dimensional technique is very effective at separating proteins and can resolve thousands of proteins in a mixture (figure 15.10). If radiolabeled substrates are used, newly synthesized proteins can be distinguished and their rates of synthesis determined (see Gel electrophoresis, pp. 327–28).

Two-dimensional electrophoresis is even more powerful when coupled with mass spectrometry. The unknown protein spot is cut from the gel and cleaved into fragments by trypsin digestion. Then the fragments are analyzed by a mass spectrometer and the masses of the fragments are plotted. This mass fingerprint can be used to estimate the probable amino acid composition of each fragment and tentatively identify the protein. When the two techniques are employed together, the proteome and its changes can be studied very effectively.

Proteomics has been used to study the physiology of E. coli. Some areas of research have been the effect of phosphate limitation, proteome changes under anaerobic conditions, heat-shock protein production, and the response to the toxicant 2,4-dinitrophenol. One particularly useful approach in studying genome function is to inactivate a specific gene and then look for changes in protein expression.
Because changes in the whole proteome are followed, gene inactivation can tell much about gene function and the large-scale effects of gene activity. A gene-protein database for E. coli has been established and provides information about the conditions under which each protein is expressed and where it is located in the cell.

The preceding discussion of functional genomics has emphasized areas of investigation with a record of success and a bright future. However, it should be noted that many problems remain to be solved, and there may be limits to how much genomics can tell us about the living cell for a variety of reasons. For example, sequence information does not specify the nature and timing of gene regulation. Regulation of protein activity in living cells is extraordinarily complex and involves regulatory networks, which we do not yet understand completely. Functional assignments from annotation and other approaches sometimes may be inadequate because the function of a gene product often depends on its cellular context. Cells are extremely complex structural entities permeated by various physical compartments in which many processes are restricted to surfaces of membranes and macromolecular complexes (see p. 165). Thus localization of proteins also affects function, and genomics cannot account for this. These and other problems should be kept in mind when thinking about future progress in genomics.

1. What general lessons about genome function have been learned from the annotation results?
2. How are DNA microarrays or chips constructed and used to analyze gene expression? What sorts of things can be learned by this approach?
3. Describe two-dimensional electrophoresis and how it is used in the study of proteomics. What kinds of studies can be carried out with this technique?

15.7 The Future of Genomics

Although much has been accomplished in the past few years, the field of genomics is just beginning to mature.
There are challenges ahead and many ways in which genomics can advance our knowledge of microorganisms and their practical uses. A few of these challenges and opportunities are outlined here.

1. We need to develop new methods for the large-scale analysis of genes and proteins so that more organisms can be studied.
2. All the new information about DNA and protein sequences, variations in mRNA and protein levels, and protein interactions must be integrated in order to understand genome organization and the workings of a living cell. One goal would be to have sufficient knowledge to model a cell on a computer and make predictions about how it would respond to environmental changes.
3. Genomics can be used to provide insights into pathogenicity and suggest treatments for infectious disease. Possible virulence genes can be identified, and the expression of genes during infection can be studied. Host responses to pathogens can be examined. More sensitive diagnostic tests, new antibiotics, and different vaccines may come from genomic studies of pathogens.
4. The field of pharmacogenomics should produce many new drugs to treat disease. Databases of human gene sequences can be searched both for proteins that might have therapeutic value and for new drug targets. Genomics also can be used to study variations in drug-metabolizing enzymes and individual responses to medication.
5. The nature of horizontal gene transfer and the process of microbial evolution can be studied by comparing a wide variety of genomes. Comparative genomics will aid in the study of microbial biodiversity.
6. The industrial applications are numerous. For example, genomics can be used to identify novel enzymes with industrial potential, enhance the bioremediation of hazardous wastes, and improve techniques for the microbial production of methane and other fuels.
7. Genomics will profoundly impact agriculture. It can be used to find new biopesticides and to improve sustainable agricultural practices through enhancements in processes such as nitrogen fixation.

It is clear that the possibilities are great and genomics will profoundly impact many areas of microbiology. Advances in our understanding of microorganisms also will aid in the genomic study of more complex eucaryotic organisms.

Figure 15.10 Two-Dimensional Electrophoresis of Proteins. The SWISS-2D PAGE map of E. coli K12 proteins. The first dimension pH gradient ran from pH 3 to 10. The second dimension comprised an 8 to 18% acrylamide gel for separation based on molecular weight. Identified proteins are indicated by red crosses. [Gel image; axes: pH (4.5 to 8.0) and Mw in Kd (10 to 200).]

Summary

1. Genomics is the study of the molecular organization of genomes, their information content, and the gene products they encode. It may be divided into three broad areas: structural genomics, functional genomics, and comparative genomics.
2. DNA fragments are normally sequenced using dideoxynucleotides and the Sanger technique (figure 15.2).
3. Most often microbial genomes are sequenced using the whole-genome shotgun technique of Venter, Smith, and collaborators. Four stages are involved: library construction, sequencing of randomly produced fragments, fragment alignment and gap closure, and editing the final sequence (figure 15.3).
4. After the sequence has been determined, it is annotated. That is, computer analysis is used to identify genes and their functions by comparing them with gene sequences in databases.
5. Analysis of vast amounts of genome data requires sophisticated computers and programs; these analytical procedures are a part of the discipline of bioinformatics.
6. Many microbial genomes have already been sequenced (table 15.1) and about 100 more procaryotic genomes currently are being sequenced.
7. The genome of Mycoplasma genitalium is one of the smallest of any free-living organism. Analysis of this genome and others indicates that only about 265 to 350 genes are required for growth in the laboratory.
8. Haemophilus influenzae lacks a complete set of Krebs cycle genes and has 1,465 copies of the recognition sequence used in DNA uptake during transformation.
9. The archaeon Methanococcus jannaschii is quite different genetically from bacteria and eucaryotes; only around 44% of its genes match those of the bacteria and eucaryotes it was compared against. Its informational genes (replication, transcription, and translation) are more similar to eucaryotic genes, whereas its metabolic genes resemble those of bacteria more closely.
10. Even in the case of E. coli, perhaps the best-studied bacterium, about 58% of the predicted genes do not resemble known genes.
11. The genome sequence of Rickettsia prowazekii is very similar to that of mitochondria. Aerobic respiration in eucaryotes may have arisen from an ancestor of Rickettsia.
12. The genome of Chlamydia trachomatis has provided many surprises. For example, it appears able to make at least some ATP and peptidoglycan, despite the fact that it seems to obtain most ATP from the host and does not have a cell wall with peptidoglycan. The presence of plantlike genes indicates that it might have infected plantlike hosts before moving to animals.
13. Treponema pallidum, the causative agent of syphilis, has lost many of its metabolic genes, which may explain why it hasn’t been cultivated outside a host.
14. Mycobacterium tuberculosis contains more than 250 genes for lipid metabolism and may obtain much of its energy from host lipids. Surface and secretory proteins have been identified and may help vaccine development.
15. There has been a great deal of horizontal gene transfer between genomes in both Bacteria and Archaea. This is particularly the case for housekeeping or operational genes.
16. Annotation of genomes can be used to identify many genes and their functions. There seem to be patterns in gene distribution. For example, parasitic forms tend to lose genes and obtain nutrients from their hosts.
17. DNA microarrays (DNA chips) can be used to follow gene expression and mRNA production (figure 15.9).
18. The entire collection of proteins that an organism can produce is the proteome, and its study is called proteomics. The proteome often is analyzed by two-dimensional electrophoresis followed in some cases by mass spectrometry. Proteomic experiments sometimes provide more evidence about gene expression than the use of DNA chips.
19. Despite great success in sequencing genomes, many problems still need to be resolved before the data can be interpreted adequately and applied to the understanding of organisms.
20. In the future genomics will positively impact many areas of microbiology.

Key Terms

annotation
bioinformatics
comparative genomics
DNA microarrays (DNA chips)
expressed sequence tag (EST)
functional genomics
genomics
open reading frame (ORF)
proteome
proteomics
structural genomics
whole-genome shotgun sequencing

Questions for Thought and Review

1. What impact might genome comparisons have on the current phylogenetic schemes for Bacteria and Archaea that are discussed in chapter 19?
2. How would you use genomics data to develop new vaccines and antimicrobial drugs?
3. Why are proteomic studies necessary when one can use DNA chips to follow mRNA synthesis?
4. Discuss the importance of bioinformatics for genomics and the information it can supply.
5. Contrast informational and housekeeping or operational genes with respect to function and variation in quantity between genomes. How do free-living and parasitic microorganisms differ with respect to these genes?
6. Discuss some of the more important problems for postgenomic studies of microorganisms. What areas of microbiology do you think will be most positively impacted by genomics?

Critical Thinking Questions

1. Propose an experiment that can be done easily with a microchip that would have required years before this new technology.
2. What are the pitfalls of searches for homologous genes and proteins?

Additional Reading

General
Brown, T. A. 1999. Genomes. New York: John Wiley. Brown, K. 2000. The human genome business today. Sci. Am. 283(1):50–55. Charlebois, R. L., editor 1999. Organization of the prokaryotic genome. Washington, D.C.: ASM Press. Dougherty, B. A. 2000. DNA sequencing and genomics. In Encyclopedia of microbiology, 2d ed., vol. 2, J. Lederberg, editor-in-chief, 106–16. San Diego: Academic Press. Downs, D. M., and Escalante-Semerena, J. C. 2000. Impact of genomics and genetics on the elucidation of bacterial metabolism. Methods 20(1):47–54. Ezzell, C. 2000. Beyond the human genome. Sci. Am. 283(1):64–69. Field, D.; Hood, D.; and Moxon, R. 1999. Contribution of genomics to bacterial pathogenesis. Curr. Opin. Genet. Dev. 9(6):700–703. Haseltine, W. A. 1997. Discovering genes for new medicines. Sci. Am. 276(3):92–7. Lander, E. S., and Weinberg, R. A. 2000. Genomics: Journey to the center of biology. Science 287:1777–82. Strauss, E. J., and Falkow, S. 1997. Microbial pathogenesis: Genomics and beyond. Science 276:707–12.

15.4 Bioinformatics
Ashburner, M., and Goodman, N. 1997.
Informatics—genome and genetic databases. Curr. Opin. Genet. Dev. 7:750–56. Howard, K. 2000. The bioinformatics gold rush. Sci. Am. 283(1):58–63. Patterson, M., and Handel, M., editors. 1998. Trends guide to bioinformatics. New York: Elsevier Science, Ltd. Rashidi, H. H., and Buehler, L. K. 2000. Bioinformatics basics: Applications in the biological sciences and medicine. Boca Raton, Fla.: CRC Press. 15.5 General Characteristics of Microbial Genomes Andersson, S. G. E., et al. 1998. The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature 396:133–40. Blattner, F. R., et al. 1997. The complete genome sequence of Escherichia coli K-12. Science 277:1453–62. Bult, C. J., et al. 1996. Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273:1058–1107. Cole, S. T., et al. 1998. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393:537–44. Fleischmann, R. D., et al. 1995. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269:496–512. Fraser, C. M., et al. 1995. The minimal gene complement of Mycoplasma genitalium. Science 270:397–403. Fraser, C. M., et al. 1998. Complete genome sequence of Treponema pallidum, the syphilis spirochete. Science 281:375–88. Galperin, M. Y., and Koonin, E. V. 1998. Sources of systematic error in functional annotation of genomes: Domain rearrangement, nonorthologous gene displacement and operon disruption. In Silico Biology 1:55–67. Gaasterland, T. 1999. Archaeal genomics. Curr. Opin. Microbiol. 2(5):542–47. Gogarten, J. P., and Olendzenski, L. 1999. Orthologs, paralogs and genome comparisons. Curr. Opin. Genet. Dev. 9:630–36. Heidelberg, J. F., et al. 2000. DNA sequence of both chromosomes of the cholera pathogen Vibrio cholerae. Nature 406:477–83. Jain, R.; Rivera, M. C.; and Lake, J. A. 1999. Horizontal gene transfer among genomes: The complexity hypothesis. Proc. Natl. Acad. Sci. 96:3801–6. Koonin, E. 
V., and Galperin, M. Y. 1997. Prokaryotic genomes: The emerging paradigm of genome-based microbiology. Curr. Opin. Genet. Dev. 7:757–63. Lawrence, J. G., and Ochman, H. 1998. Molecular archaeology of the Escherichia coli genome. Proc. Natl. Acad. Sci. 95:9413–17. Makarova, K. S.; Aravind, L.; Wolf, Y. I.; Tatusov, R. L.; Minton, K. W.; Koonin, E. V.; and Daly, M. J. 2001. Genome of the extremely radiation-resistant bacterium Deinococcus radiodurans viewed from the perspective of comparative genomics. Microbiol. Mol. Biol. Rev. 65(1):44–79. Ochman, H.; Lawrence, J. G.; and Groisman, E. A. 2000. Lateral gene transfer and the nature of bacterial innovation. Nature 408:299–304. Pollack, J. D. 1997. Mycoplasma genes: A case for reflective annotation. Trends Microbiol. 5(10):413–19. Riley, M., and Serres, M. H. 2000. Interim report on genomics of Escherichia coli. Annu. Rev. Microbiol. 54:341–411. Snel, B.; Bork, P.; and Huynen, M. 2000. Genome evolution: Gene fusion versus gene fission. Trends Genet. 16(1):9–11. Stephens, R. S., et al. 1998. Genome sequence of an obligate intracellular pathogen of humans: Chlamydia trachomatis. Science 282:754–59. Travis, J. 2000. Pass the genes, please. Science News 158:60–61. White, O., et al. 1999. Genome sequence of the radioresistant bacterium Deinococcus radiodurans R1. Science 286:1571–77. 15.6 Functional Genomics Blackstock, W., and Mann, M., editors. 2000. Proteomics: A trends guide. New York: Elsevier Science, Ltd. Brenner, S. 2000. The end of the beginning. Science 287:2173–74. Eisenberg, D.; Marcotte, E. M.; Xenarios, I.; and Yeates, T. O. 2000. Protein function in the post-genomic era. Nature 405:823–826. Ferea, T. L., and Brown, P. O. 1999. Observing the living genome. Curr. Opin. Genet. Dev. 9:715–22. Galperin, M. Y., and Koonin, E. V. 1999. Functional genomics and enzyme evolution. Genetica 106(1–2):159–70. Gingeras, T. R., and Rosenow, C. 2000. Studying microbial genomes with high-density oligonucleotide arrays. 
ASM News 66(8):463–69. Hamadeh, H., and Afshari, C. A. 2000. Gene chips and functional genomics. American Scientist 88:508–15. Huang, S. 2000. The practical problems of postgenomic biology. Nature Biotechnol. 18:471–72. Lockhart, D. J., and Winzeler, E. A. 2000. Genomics, gene expression and DNA arrays. Nature 405:827–36. Phimister, B., editor. 1999. The chipping forecast. Nature Genet. 21(1) supplement. Pandey, A., and Mann, M. 2000. Proteomics to study genes and genomes. Nature 405:837–46. Rastan, S., and Beeley, L. J. 1997. Functional genomics: Going forwards from the databases. Curr. Opin. Genet. Dev. 7:777–83. Schena, M., editor. 1999. DNA microarrays: A practical approach. New York: Oxford University Press.

PART VI
The Viruses

Chapter 16 The Viruses: Introduction and General Characteristics
Chapter 17 The Viruses: Bacteriophages
Chapter 18 The Viruses: Viruses of Eucaryotes

CHAPTER 16
The Viruses: Introduction and General Characteristics

The simian virus 40 (SV-40) capsid shown here differs from most icosahedral capsids in containing only pentameric capsomers (pp. 370–72). SV-40 is a small double-stranded DNA polyomavirus with 72 capsomers. It may cause a central nervous system disease in rhesus monkeys and can produce tumors in hamsters. SV-40 was first discovered in cultures of monkey kidney cells during preparation of the poliovirus vaccine.

Outline

16.1 Early Development of Virology
16.2 General Properties of Viruses
16.3 The Cultivation of Viruses
16.4 Virus Purification and Assays
     Virus Purification
     Virus Assays
16.5 The Structure of Viruses
     Virion Size
     General Structural Properties
     Helical Capsids
     Icosahedral Capsids
     Nucleic Acids
     Viral Envelopes and Enzymes
     Viruses with Capsids of Complex Symmetry
16.6 Principles of Virus Taxonomy

Concepts

1. Viruses are simple, acellular entities consisting of one or more molecules of either DNA or RNA enclosed in a coat of protein (and sometimes, in addition, substances such as lipids and carbohydrates). They can reproduce only within living cells and are obligately intracellular parasites.
2. Viruses are cultured by inoculating living hosts or cell cultures with a virion preparation. Purification depends mainly on their large size relative to cell components, high protein content, and great stability. The virus concentration may be determined from the virion count or from the number of infectious units.
3. All viruses have a nucleocapsid composed of a nucleic acid surrounded by a protein capsid that may be icosahedral, helical, or complex in structure. Capsids are constructed of protomers that self-assemble through noncovalent bonds. A membranous envelope often lies outside the nucleocapsid.
4. More variety is found in the genomes of viruses than in those of procaryotes and eucaryotes; they may be either single-stranded or double-stranded DNA or RNA. The nucleic acid strands can be linear, closed circle, or able to assume either shape.
5. Viruses are classified on the basis of their nucleic acid’s characteristics, capsid symmetry, the presence or absence of an envelope, their host, the diseases caused by animal and plant viruses, and other properties.

Great fleas have little fleas upon their backs to bite ’em
And little fleas have lesser fleas, and so on ad infinitum
Augustus De Morgan

Chapters 16, 17, and 18 focus on the viruses. These are infectious agents with fairly simple, acellular organization. They possess only one type of nucleic acid, either DNA or RNA, and only reproduce within living cells. Clearly viruses are quite different from procaryotic and eucaryotic microorganisms, and are studied by virologists.

Despite their simplicity in comparison with cellular organisms, viruses are extremely important and deserving of close attention. The study of viruses has contributed significantly to the discipline of molecular biology. Many human viral diseases are already known and more are discovered or arise every year, as demonstrated by the recent appearance of AIDS. The whole field of genetic engineering is based in large part upon discoveries in virology. Thus it is easy to understand why virology (the study of viruses) is such a significant part of microbiology.

This chapter focuses on the broader aspects of virology: its development as a scientific discipline, the general properties and structure of viruses, the ways in which viruses are cultured and studied, and viral taxonomy. Chapter 17 is concerned with the bacteriophages, and chapter 18 is devoted to the viruses of eucaryotes.

Viruses have had enormous impact on humans and other organisms, yet very little was known about their nature until fairly recently. A brief history of their discovery and recognition as uniquely different infectious agents can help clarify their nature.
16.1 Early Development of Virology

Although the ancients did not understand the nature of their illnesses, they were acquainted with diseases, such as rabies, that are now known to be viral in origin. In fact, there is some evidence that the great epidemics of A.D. 165 to 180 and A.D. 251 to 266, which severely weakened the Roman Empire and aided its decline, may have been caused by measles and smallpox viruses. Smallpox had an equally profound impact on the New World. Hernán Cortés’s conquest of the Aztec Empire in Mexico was made possible by an epidemic that ravaged Mexico City. The virus was probably brought to Mexico in 1520 by the relief expedition sent to join Cortés. Before the smallpox epidemic subsided, it had killed the Aztec King Cuitlahuac (the nephew and son-in-law of the slain emperor, Montezuma II) and possibly one-third of the population. Since the Spaniards were not similarly afflicted, it appeared that God’s wrath was reserved for the Native Americans, and this disaster was viewed as divine support for the Spanish conquest (Box 16.1).

The first progress in preventing viral diseases came years before the discovery of viruses. Early in the eighteenth century, Lady Wortley Montagu, wife of the English ambassador to Turkey, observed that Turkish women inoculated their children against smallpox. The children came down with a mild case and subsequently were immune. Lady Montagu tried to educate the English public about the procedure but without great success. Later in the century an English country doctor, Edward Jenner, stimulated by a girl’s claim that she could not catch smallpox because she had had cowpox, began inoculating humans with material from cowpox lesions. He published the results of 23 successful vaccinations in 1798. Although Jenner did not understand the nature of smallpox, he did manage to successfully protect his patients from the dread disease through exposure to the cowpox virus.
Until well into the nineteenth century, harmful agents were often grouped together and sometimes called viruses [Latin virus, poison or venom]. Even Louis Pasteur used the term virus for any living infectious disease agent. The development in 1884 of the porcelain bacterial filter by Charles Chamberland, one of Pasteur’s collaborators and inventor of the autoclave, made possible the discovery of what are now called viruses. Tobacco mosaic disease was the first to be studied with Chamberland’s filter. In 1892 Dimitri Ivanowski published studies showing that leaf extracts from infected plants would induce tobacco mosaic disease even after filtration to remove bacteria. He attributed this to the presence of a toxin. Martinus W. Beijerinck, working independently of Ivanowski, published the results of extensive studies on tobacco mosaic disease in 1898 and 1900. Because the filtered sap of diseased plants was still infectious, he proposed that the disease was caused by an entity different from bacteria, a filterable virus. He observed that the virus would multiply only in living plant cells, but could survive for long periods in a dried state. At the same time Friedrich Loeffler and Paul Frosch in Germany found that the hoof-and-mouth disease of cattle was also caused by a filterable virus rather than by a toxin. In 1900 Walter Reed began his study of the yellow fever disease whose incidence had been increasing in Cuba. Reed showed that this human disease was due to a filterable virus that was transmitted by mosquitoes. Mosquito control shortly reduced the severity of the yellow fever problem. Thus by the beginning of the twentieth century, it had been established that filterable viruses were different from bacteria and could cause diseases in plants, livestock, and humans.

Shortly after the turn of the century, Vilhelm Ellermann and Oluf Bang in Copenhagen reported that leukemia could be transmitted between chickens by cell-free filtrates and was probably caused by a virus. Three years later in 1911, Peyton Rous from the Rockefeller Institute in New York City reported that a virus was responsible for a malignant muscle tumor in chickens. These studies established that at least some malignancies were caused by viruses.

It was soon discovered that bacteria themselves also could be attacked by viruses. The first published observation suggesting that this might be the case was made in 1915 by Frederick W. Twort. Twort isolated bacterial viruses that could attack and destroy micrococci and intestinal bacilli. Although he speculated that his preparations might contain viruses, Twort did not follow up on these observations. It remained for Felix d’Herelle to establish decisively the existence of bacterial viruses. D’Herelle isolated bacterial viruses from patients with dysentery, probably caused by Shigella dysenteriae. He noted that when a virus suspension was spread on a layer of bacteria growing on agar, clear circular areas containing viruses and lysed cells developed. A count of these clear zones allowed d’Herelle to estimate the number of viruses present (plaque assay, p. 368). D’Herelle demonstrated that these viruses could reproduce only in live bacteria; therefore he named them bacteriophages because they could eat holes in bacterial “lawns.”

The chemical nature of viruses was established when Wendell M. Stanley announced in 1935 that he had crystallized the tobacco mosaic virus (TMV) and found it to be largely or completely protein. A short time later Frederick C. Bawden and Norman W. Pirie managed to separate the TMV particles into protein and nucleic acid. Thus by the late 1930s it was becoming clear that viruses were complexes of nucleic acids and proteins able to reproduce only in living cells.

1. Describe the major technical advances and discoveries important in the early development of virology.
2. Give the contribution to virology made by each scientist mentioned in this section.

Box 16.1 Disease and the Early Colonization of America

Although the case is somewhat speculative, there is considerable evidence that disease, and particularly smallpox, played a major role in reducing Indian resistance to the European colonization of North America. It has been estimated that Indian populations in Mexico declined about 90% within 100 years of initial contact with the Spanish. Smallpox and other diseases were a major factor in this decline, and there is no reason to suppose that North America was any different. As many as 10 to 12 million Indians may have lived north of the Rio Grande before contact with Europeans. In New England alone, there may have been over 72,000 in A.D. 1600; yet only around 8,600 remained in New England by A.D. 1674, and the decline continued in subsequent years.

Such an incredible catastrophe can be accounted for by consideration of the situation at the time of European contact with the Native Americans. The Europeans, having already suffered major epidemics in the preceding centuries, were relatively immune to the diseases they carried. On the other hand, the Native Americans had never been exposed to diseases like smallpox and were decimated by epidemics. In the sixteenth century, before any permanent English colonies had been established, many contacts were made by missionaries and explorers who undoubtedly brought disease with them and infected the natives. Indeed, the English noted at the end of the century that Indian populations had declined greatly but attributed it to armed conflict rather than to disease.

Establishment of colonies simply provided further opportunities for infection and outbreak of epidemics. For example, the Huron Indians decreased from a minimum of 32,000 people to 10,000 in 10 years. Between the time of initial English colonization and 1674, the Narraganset Indians declined from around 5,000 warriors to 1,000, and the Massachusetts Indians, from 3,000 to 300. Similar stories can be seen in other parts of the colonies. Some colonists interpreted these plagues as a sign of God’s punishment of Indian resistance: the “Lord put an end to this quarrel by smiting them with smallpox. . . . Thus did the Lord allay their quarrelsome spirit and make room for the following part of his army.”

It seems clear that epidemics of European diseases like smallpox decimated Native American populations and prepared the way for colonization of the North American continent. Many American cities (for example, Boston, Philadelphia, and Plymouth) grew upon sites of previous Indian villages.

16.2 General Properties of Viruses

Viruses are a unique group of infectious agents whose distinctiveness resides in their simple, acellular organization and pattern of reproduction. A complete virus particle or virion consists of one or more molecules of DNA or RNA enclosed in a coat of protein, and sometimes also in other layers. These additional layers may be very complex and contain carbohydrates, lipids, and additional proteins. Viruses can exist in two phases: extracellular and intracellular. Virions, the extracellular phase, possess few if any enzymes and cannot reproduce independent of living cells. In the intracellular phase, viruses exist primarily as replicating nucleic acids that induce host metabolism to synthesize virion components; eventually complete virus particles or virions are released.
The Viruses: Introduction and General Characteristics © The McGraw−Hill Companies, 2002 The Viruses: Introduction and General Characteristics In summary, viruses differ from living cells in at least three ways: (1) their simple, acellular organization; (2) the presence of either DNA or RNA, but not both, in almost all virions (human cytomegalovirus has a DNA genome and four mRNAs); and (3) their inability to reproduce independent of cells and carry out cell division as procaryotes and eucaryotes do. Although bacteria such as chlamydia and rickettsia (see sections 21.5 and 22.1) are obligately intracellular parasites like viruses, they do not meet the first two criteria. Air sac Amniotic cavity Chorioallantoic membrane inoculation Chorioallantoic membrane Shell 16.3 The Cultivation of Viruses Because they are unable to reproduce independent of living cells, viruses cannot be cultured in the same way as bacteria and eucaryotic microorganisms. For many years researchers have cultivated animal viruses by inoculating suitable host animals or embryonated eggs—fertilized chicken eggs incubated about 6 to 8 days after laying (figure 16.1). To prepare the egg for virus cultivation, the shell surface is first disinfected with iodine and penetrated with a small sterile drill. After inoculation, the drill hole is sealed with gelatin and the egg incubated. Viruses may be able to reproduce only in certain parts of the embryo; consequently they must be injected into the proper region. For example, the myxoma virus grows well on the chorioallantoic membrane, whereas the mumps virus prefers the allantoic cavity. The infection may produce a local tissue lesion known as a pock, whose appearance often is characteristic of the virus. More recently animal viruses have been grown in tissue (cell) culture on monolayers of animal cells. 
This technique is made possible by the development of growth media for animal cells and by the advent of antibiotics that can prevent bacterial and fungal contamination. A layer of animal cells in a specially prepared petri dish is covered with a virus inoculum, and the viruses are allowed time to settle and attach to the cells. The cells are then covered with a thin layer of agar to limit virion spread so that only adjacent cells are infected by newly produced virions. As a result localized areas of cellular destruction and lysis called plaques often are formed (figure 16.2) and may be detected if stained with dyes, such as neutral red or trypan blue, that can distinguish living from dead cells. Viral growth does not always result in the lysis of cells to form a plaque. Animal viruses, in particular, can cause microscopic or macroscopic degenerative changes or abnormalities in host cells and in tissues called cytopathic effects (figure 16.3). Cytopathic effects may be lethal, but plaque formation from cell lysis does not always occur.

Bacterial viruses or bacteriophages (phages for short) are cultivated in either broth or agar cultures of young, actively growing bacterial cells. So many host cells are destroyed that turbid bacterial cultures may clear rapidly because of cell lysis. Agar cultures are prepared by mixing the bacteriophage sample with cool, liquid agar and a suitable bacterial culture. The mixture is quickly poured into a petri dish containing a bottom layer of sterile agar. After hardening, bacteria in the layer of top agar grow and reproduce, forming a continuous, opaque layer or “lawn.” Wherever a virion comes to rest in the top agar, the virus infects an adjacent cell and reproduces. Eventually, bacterial lysis generates a plaque or clearing in the lawn (figure 16.4). As can be seen in figure 16.4, plaque appearance often is characteristic of the phage being cultivated.

Plant viruses are cultivated in a variety of ways. Plant tissue cultures, cultures of separated cells, or cultures of protoplasts (see section 3.3) may be used. Viruses also can be grown in whole plants. Leaves are mechanically inoculated when rubbed with a mixture of viruses and an abrasive such as carborundum. When the cell walls are broken by the abrasive, the viruses directly contact the plasma membrane and infect the exposed host cells. (The role of the abrasive is frequently filled by insects that suck or crush plant leaves and thus transmit viruses.) A localized necrotic lesion often develops due to the rapid death of cells in the infected area (figure 16.5). Even when lesions do not occur, the infected plant may show symptoms such as changes in pigmentation or leaf shape. Some plant viruses can be transmitted only if a diseased part is grafted onto a healthy plant.

Figure 16.1 Virus Cultivation Using an Embryonated Egg. Two sites that are often used to grow animal viruses are the chorioallantoic membrane and the allantoic cavity. The diagram shows a 9 day chicken embryo. (Labels: allantoic cavity, albumin, allantoic cavity inoculation, yolk sac.)

Figure 16.2 Virus Plaques. Poliovirus plaques in a monkey kidney cell culture.

Figure 16.3 The Cytopathic Effects of Viruses. (a) Normal mammalian cells in tissue culture. (b) Appearance of tissue culture cells 18 hours after infection with adenovirus. Transmission electron microscope photomicrographs (×11,000).

Figure 16.4 Phage Plaques. Plaques produced on a lawn of E. coli by some of the T coliphages (panels labeled T1, T2, T3, and T4). Note the large differences in plaque appearance. The photographs are about 1/3 full size.

1. What is a virus particle or virion, and how is it different from living organisms?
2. Discuss the ways in which viruses may be cultivated.
Define the terms pock, plaque, cytopathic effect, bacteriophage, and necrotic lesion.

Figure 16.5 Necrotic Lesions on Plant Leaves. (a) Tobacco mosaic virus on Nicotiana glutinosa. (b) Tobacco mosaic virus infection of an orchid showing leaf color changes.

Figure 16.6 The Use of Differential Centrifugation to Purify a Virus. At the beginning the centrifuge tube contains homogenate and icosahedral viruses (in green). First, the viruses and heavier cell organelles are removed from smaller molecules. After resuspension, the mixture is centrifuged just fast enough to sediment cell organelles while leaving the smaller virus particles in suspension; the purified viruses are then collected. This process can be repeated several times to further purify the virions. (Steps labeled in the figure: 100,000 × g for 1–3 hours; decant and resuspend; 8,000–10,000 × g for 10–20 min; 100,000 × g for 1–3 hours; decant supernate.)

16.4 Virus Purification and Assays

Virologists must be able to purify viruses and accurately determine their concentrations in order to study virus structure, reproduction, and other aspects of their biology. These methods are so important that the growth of virology as a modern discipline depended on their development.

Virus Purification

Purification makes use of several virus properties. Virions are very large relative to proteins, are often more stable than normal cell components, and have surface proteins. Because of these characteristics, many techniques useful for the isolation of proteins and organelles can be employed in virus isolation. Four of the most widely used approaches are (1) differential and density gradient centrifugation, (2) precipitation of viruses, (3) denaturation of contaminants, and (4) enzymatic digestion of cell constituents.

1. Host cells in later stages of infection that contain mature virions are used as the source of material. Infected cells are first disrupted in a buffer to produce an aqueous suspension or homogenate consisting of cell components and viruses. Viruses can then be isolated by differential centrifugation, the centrifugation of a suspension at various speeds to separate particles of different sizes (figure 16.6). Usually the homogenate is first centrifuged at high speed to sediment viruses and other large cellular particles, and the supernatant, which contains the homogenate’s soluble molecules, is discarded. The pellet is next resuspended and centrifuged at a low speed to remove substances heavier than viruses. Higher speed centrifugation then sediments the viruses. This process may be repeated to purify the virus particles further.

Viruses also can be purified based on their size and density by use of gradient centrifugation (figure 16.7). A sucrose solution is poured into a centrifuge tube so that its concentration smoothly and linearly increases between the top and the bottom of the tube. The virus preparation, often after purification by differential centrifugation, is layered on top of the gradient and centrifuged. As shown in figure 16.7a, the particles settle under centrifugal force until they come to rest at the level where the gradient’s density equals theirs (isopycnic gradient centrifugation). Viruses can be separated from other particles only slightly different in density. Gradients also can separate viruses based on differences in their sedimentation rate (rate zonal gradient centrifugation). When this is done, particles are separated on the basis of both size and density; usually the largest virus will move most rapidly down the gradient. Figure 16.7b shows that viruses differ from one another and cell components with respect to either density (grams per milliliter) or sedimentation coefficient (S).
Thus these two types of gradient centrifugation are very effective in virus purification.

2. Viruses, like many proteins, can be purified through precipitation with concentrated ammonium sulfate. Initially, sufficient ammonium sulfate is added to raise its concentration to a level just below that which will precipitate the virus. After any precipitated contaminants are removed, more ammonium sulfate is added and the precipitated viruses are collected by centrifugation. Viruses sensitive to ammonium sulfate often are purified by precipitation with polyethylene glycol.

3. Viruses frequently are less easily denatured than many normal cell constituents. Contaminants may be denatured and precipitated with heat or a change in pH to purify viruses. Because some viruses also tolerate treatment with organic solvents like butanol and chloroform, solvent treatment can be used to both denature protein contaminants and extract any lipids in the preparation. The solvent is thoroughly mixed with the virus preparation, then allowed to stand and separate into organic and aqueous layers. The unaltered virus remains suspended in the aqueous phase while lipids dissolve in the organic phase. Substances denatured by organic solvents collect at the interface between the aqueous and organic phases.

Denaturation of enzymes (pp. 163–64)

4. Cellular proteins and nucleic acids can be removed from many virus preparations through enzymatic degradation because viruses usually are more resistant to attack by nucleases and proteases than are free nucleic acids and proteins. For example, ribonuclease and trypsin often degrade cellular ribonucleic acids and proteins while leaving virions unaltered.

Figure 16.7 Gradient Centrifugation. (a) A linear sucrose gradient (20% to 50% sucrose) is prepared, 1, and the particle mixture is layered on top, 2 and 3. Centrifugation, 4, separates the particles on the basis of their density and sedimentation coefficient (the arrows in the centrifuge tubes indicate the direction of centrifugal force), 5. In isopycnic gradient centrifugation, the bottom of the gradient is denser than any particle, and each particle comes to rest at a point in the gradient equal to its density. Rate zonal centrifugation separates particles based on their sedimentation coefficient, a function of both size and density, because the bottom of the gradient is less dense than the densest particles and centrifugation is carried out for a shorter time so that particles do not come to rest. The largest, most dense particles travel fastest. (b) The densities and sedimentation coefficients of representative viruses (shown in color) and other biological substances. (Panel b plots density in g/ml against sedimentation coefficient in S; labeled items include RNA, DNA, starch granules, ribosomes, proteins, φX174, polio, T3, T2 phage, TMV, chloroplasts, nuclei, mitochondria, and rough and smooth endoplasmic reticulum.)

Virus Assays

The quantity of viruses in a sample can be determined either by counting particle numbers or by measurement of the infectious unit concentration. Although most normal virions are probably potentially infective, many will not infect host cells because they do not contact the proper surface site. Thus the total particle count may be from 2 to 1 million times the infectious unit number depending on the nature of the virion and the experimental conditions. Despite this, both approaches are of value.

Virus particles can be counted directly with the electron microscope.
In one procedure the virus sample is mixed with a known concentration of small latex beads and sprayed on a coated specimen grid. The beads and virions are counted; the virus concentration is calculated from these counts and from the bead concentration (figure 16.8). This technique often works well with concentrated preparations of viruses of known morphology. Viruses can be concentrated by centrifugation before counting if the preparation is too dilute. However, if the beads and viruses are not evenly distributed (as sometimes happens), the final count will be inaccurate.

Figure 16.8 Tobacco Mosaic Virus. A tobacco mosaic virus preparation viewed in the transmission electron microscope. Latex beads 264 nm in diameter (white spheres) have been added.

The most popular indirect method of counting virus particles is the hemagglutination assay. Many viruses can bind to the surface of red blood cells (see figure 33.10). If the ratio of viruses to cells is large enough, virus particles will join the red blood cells together, forming a network that settles out of suspension or agglutinates. In practice, red blood cells are mixed with a series of virus preparation dilutions and each mixture is examined. The hemagglutination titer is the highest dilution of virus (or the reciprocal of the dilution) that still causes hemagglutination. This assay is an accurate, rapid method for determining the relative quantity of viruses such as the influenza virus. If the actual number of viruses needed to cause hemagglutination is determined by another technique, the assay can be used to ascertain the number of virus particles present in a sample.
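The counting arithmetic used in this section's assays can be sketched in a few lines of Python. The functions below cover the latex-bead particle count, the hemagglutination titer, and the plaque count used in the infectivity assays of this section (0.10 ml of a 10⁻⁶ dilution yielding 75 plaques). This is only an illustrative sketch: the function names, the bead and virion counts, and the agglutination results are hypothetical and are not taken from the text.

```python
"""Counting arithmetic for virus assays (illustrative sketch, hypothetical names)."""


def particles_per_ml_from_beads(virions_counted, beads_counted, beads_per_ml):
    """Estimate virions per ml from an electron-microscope count.

    Virions and latex beads are counted in the same fields, so the
    virion-to-bead ratio scales the known bead concentration.
    """
    if beads_counted <= 0:
        raise ValueError("at least one bead must be counted")
    return virions_counted / beads_counted * beads_per_ml


def hemagglutination_titer(agglutinated):
    """Return the reciprocal of the highest dilution that still agglutinates.

    `agglutinated` maps a reciprocal dilution (2, 4, 8, ...) to True/False.
    Returns 0 if no dilution caused agglutination.
    """
    positive = [reciprocal for reciprocal, ok in agglutinated.items() if ok]
    return max(positive) if positive else 0


def pfu_per_ml(plaques, dilution, inoculum_ml):
    """Plaque-forming units per ml of the undiluted sample.

    `dilution` is the fraction plated (for example 1e-6); dividing by it
    is the same as multiplying by the dilution factor (1e6).
    """
    if dilution <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution and inoculum volume must be positive")
    return plaques / inoculum_ml / dilution


if __name__ == "__main__":
    # Hypothetical bead count: 150 virions and 50 beads, beads at 1e9 per ml.
    print(f"{particles_per_ml_from_beads(150, 50, 1e9):.2e} particles/ml")
    # Hypothetical twofold dilution series; agglutination stops after 1/64.
    series = {2: True, 4: True, 8: True, 16: True, 32: True, 64: True, 128: False}
    print("HA titer:", hemagglutination_titer(series))
    # Worked example from the text: 75 plaques, 0.10 ml of a 1e-6 dilution.
    print(f"{pfu_per_ml(75, 1e-6, 0.10):.2e} PFU/ml")
```

The plaque calculation gives infectious units rather than total particles, so its result is usually lower than a bead-based particle count of the same preparation, as the section notes.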
A variety of assays analyze virus numbers in terms of infectivity, and many of these are based on the same techniques used for virus cultivation. For example, in the plaque assay several dilutions of bacterial or animal viruses are plated out with appropriate host cells. When the number of viruses plated out is much smaller than the number of host cells available for infection and when the viruses are distributed evenly, each plaque in a layer of bacterial or animal cells is assumed to have arisen from the reproduction of a single virus particle. Therefore a count of the plaques produced at a particular dilution will give the number of infectious virions or plaque-forming units (PFU), and the concentration of infectious units in the original sample can be easily calculated. Suppose that 0.10 ml of a 10⁻⁶ dilution of the virus preparation yields 75 plaques. The original concentration of plaque-forming units is

PFU/ml = (75 PFU/0.10 ml)(10⁶) = 7.5 × 10⁸.

Viruses producing different plaque morphology types on the same plate may be counted separately. Although the number of PFU does not equal the number of virus particles, their ratios are proportional: a preparation with twice as many viruses will have twice the plaque-forming units.

The same approach employed in the plaque assay may be used with embryos and plants. Chicken embryos can be inoculated with a diluted preparation or plant leaves rubbed with a mixture of diluted virus and abrasive. The number of pocks on embryonic membranes or necrotic lesions on leaves is multiplied by the dilution factor and divided by the inoculum volume to obtain the concentration of infectious units.

When biological effects are not readily quantified in these ways, the amount of virus required to cause disease or death can be determined by the endpoint method. Organisms or cell cultures are inoculated with serial dilutions of a virus suspension. The results are used to find the endpoint dilution at which 50% of the host cells or organisms are destroyed (figure 16.9). The lethal dose (LD50) is the dilution that contains a dose large enough to destroy 50% of the host cells or organisms. In a similar sense, the infectious dose (ID50) is the dose which, when given to a number of test systems or hosts, causes an infection of 50% of the systems or hosts under the conditions employed.

Figure 16.9 A Hypothetical Dose-Response Curve. The LD50 is indicated by the dashed line. (Axes: % deaths versus dilution, 10⁻⁵ to 10⁻⁸.)

1. Give the four major approaches by which viruses may be purified, and describe how each works. Distinguish between differential and density gradient centrifugation in terms of how they are carried out.
2. How can one find the virus concentration, both directly and indirectly, by particle counts and measurement of infectious unit concentration? Define plaque-forming units, lethal dose, and infectious dose.

16.5 The Structure of Viruses

Virus morphology has been intensely studied over the past decades because of the importance of viruses and the realization that virus structure was simple enough to be understood. Progress has come from the use of several different techniques: electron microscopy, X-ray diffraction, biochemical analysis, and immunology. Although our knowledge is incomplete due to the large number of different viruses, the general nature of virus structure is becoming clear.

Figure 16.10 The Size and Morphology of Selected Viruses. Panels: (a) vaccinia virus, (b) paramyxovirus (mumps), (c) herpesvirus, (d) orf virus, (e) rhabdovirus, (f) T-even coliphage, (g) flexuous-tailed phage, (h) adenovirus, (i) influenza virus, (j) polyomavirus, (k) picornavirus, (l) φX174 phage, (m) tubulovirus. The viruses are drawn to scale.
A 1 µm line is provided at the bottom of the figure.

Virion Size

Virions range in size from about 10 to 300 or 400 nm in diameter (figure 16.10). The smallest viruses are a little larger than ribosomes, whereas the poxviruses, like vaccinia, are about the same size as the smallest bacteria and can be seen in the light microscope. Most viruses, however, are too small to be visible in the light microscope and must be viewed with the scanning and transmission electron microscopes (see section 2.4).

General Structural Properties

All virions, even if they possess other constituents, are constructed around a nucleocapsid core (indeed, some viruses consist only of a nucleocapsid). The nucleocapsid is composed of a nucleic acid, either DNA or RNA, held within a protein coat called the capsid, which protects viral genetic material and aids in its transfer between host cells. There are four general morphological types of capsids and virion structure.

1. Some capsids are icosahedral in shape. An icosahedron is a regular polyhedron with 20 equilateral triangular faces and 12 vertices (figure 16.10h,j–l). These capsids appear spherical when viewed at low power in the electron microscope.
2. Other capsids are helical and shaped like hollow protein cylinders, which may be either rigid or flexible (figure 16.10m).
3. Many viruses have an envelope, an outer membranous layer surrounding the nucleocapsid. Enveloped viruses have a roughly spherical but somewhat variable shape even though their nucleocapsid can be either icosahedral or helical (figure 16.10b,c,i).
4. Complex viruses have capsid symmetry that is neither purely icosahedral nor helical (figure 16.10a,d,f,g). They may possess tails and other structures (e.g., many bacteriophages) or have complex, multilayered walls surrounding the nucleic acid (e.g., poxviruses such as vaccinia).
Both helical and icosahedral capsids are large macromolecular structures constructed from many copies of one or a few types of protein subunits or protomers. Probably the most important advantage of this design strategy is that the information stored in viral genetic material is used with maximum efficiency. For example, the tobacco mosaic virus (TMV) capsid contains a single type of small subunit possessing 158 amino acids. Only about 474 nucleotides out of 6,000 in the virus RNA are required to code for coat protein amino acids. Unless the same protein is used many times in capsid construction, a large nucleic acid, such as the TMV RNA, cannot be enclosed in a protein coat without using much or all of the available genetic material to code for capsid proteins. If the TMV capsid were composed of six different protomers of the same size as the TMV subunit, about 2,900 of the 6,000 nucleotides would be required for its construction, and much less genetic material would be available for other purposes.

The genetic code and translation (pp. 240–41)

Once formed and exposed to the proper conditions, protomers usually interact specifically with each other and spontaneously associate to form the capsid. Because the capsid is constructed without any outside aid, the process is called self-assembly (see p. 65). Some more complex viruses possess genes for special factors that are not incorporated into the virion but are required for its assembly.

Helical Capsids

Helical capsids are shaped much like hollow tubes with protein walls. The tobacco mosaic virus provides a well-studied example of helical capsid structure (figure 16.11).
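As a brief aside, the coding-economy argument made above for the TMV coat protein is simple codon arithmetic, sketched here in Python. The function names are hypothetical, and the sketch assumes three nucleotides per amino acid with no allowance for start or stop codons or untranslated regions.

```python
"""Coding cost of a capsid built from repeated subunits (illustrative sketch)."""

NUCLEOTIDES_PER_CODON = 3  # one amino acid is specified by three nucleotides


def coat_coding_nucleotides(amino_acids_per_protomer, protomer_types=1):
    """Nucleotides needed to encode the distinct protomer types of a capsid.

    Assumes every protomer type has the same length and ignores start/stop
    codons, so the result is a lower bound used only for comparison.
    """
    return amino_acids_per_protomer * NUCLEOTIDES_PER_CODON * protomer_types


def fraction_of_genome(coding_nucleotides, genome_nucleotides):
    """Share of the genome consumed by the coat-protein coding sequence."""
    return coding_nucleotides / genome_nucleotides


if __name__ == "__main__":
    genome = 6000  # approximate TMV RNA length given in the text
    one_type = coat_coding_nucleotides(158)                    # 158 * 3 = 474
    six_types = coat_coding_nucleotides(158, protomer_types=6)  # 474 * 6 = 2844
    print(one_type, f"{fraction_of_genome(one_type, genome):.1%}")
    print(six_types, f"{fraction_of_genome(six_types, genome):.1%}")
```

With one protomer type the coat protein uses 474 nucleotides, under a tenth of the genome; six types of the same size would need 2,844, which the text rounds to about 2,900, or nearly half of the 6,000 nucleotides.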
A single type of protomer associates together in a helical or spiral arrangement to produce a long, rigid tube, 15 to 18 nm in diameter by 300 nm long. The RNA genetic material is wound in a spiral and positioned toward the inside of the capsid where it lies within a groove formed by the protein subunits. Not all helical capsids are as rigid as the TMV capsid. Influenza virus RNAs are enclosed in thin, flexible helical capsids folded within an envelope (figures 16.10i and 16.12a,b).

The size of a helical capsid is influenced by both its protomers and the nucleic acid enclosed within the capsid. The diameter of the capsid is a function of the size, shape, and interactions of the protomers. The nucleic acid determines helical capsid length because the capsid does not seem to extend much beyond the end of the DNA or RNA.

Icosahedral Capsids

The icosahedron is one of nature’s favorite shapes (the helix is probably most popular). Viruses employ the icosahedral shape because it is the most efficient way to enclose a space. A few genes, sometimes only one, can code for proteins that self-assemble to form the capsid. In this way a small number of linear genes can specify a large three-dimensional structure. Certain requirements must be met to construct an icosahedron. Hexagons pack together in planes and cannot enclose a space, and therefore pentagons must also be used.

When icosahedral viruses are negatively stained and viewed in the transmission electron microscope, a complex icosahedral capsid structure is revealed (figure 16.12). The capsids are constructed from ring- or knob-shaped units called capsomers, each usually made of five or six protomers. Pentamers (pentons) have five subunits; hexamers (hexons) possess six. Pentamers are at the vertices of the icosahedron, whereas hexamers form its edges and triangular faces (figure 16.13).
The icosahedron in figure 16.13 is constructed of 42 capsomers; larger icosahedra are made if more hexamers are used to form the edges and faces (adenoviruses have a capsid with 252 capsomers as shown in figure 16.12g,h). In many plant and bacterial RNA viruses, both the pentamers and hexamers of a capsid are constructed with only one type of subunit, whereas adenovirus pentamers are composed of different proteins than are adenovirus hexamers.

Figure 16.11 Tobacco Mosaic Virus Structure. (a) An electron micrograph of the negatively stained helical capsid (×400,000). (b) Illustration of TMV structure. Note that the nucleocapsid is composed of a helical array of protomers with the RNA spiraling on the inside. (c) A model of TMV.

Transmission electron microscopy and negative staining (pp. 30–33)

Protomers join to form capsomers through noncovalent bonding. The bonds between proteins within pentamers and hexamers are stronger than those between separate capsomers. Empty capsids can even dissociate into separate capsomers.

Recently it has been discovered that there is more than one way to build an icosahedral capsid. Although most icosahedral capsids appear to contain both pentamers and hexamers, simian virus 40 (SV-40), a small double-stranded DNA polyomavirus, has only pentamers (figure 16.14a). The virus is constructed of 72 cylindrical pentamers with hollow centers. Five flexible arms extend from the edge of each pentamer (figure 16.14b). Twelve pentamers occupy the icosahedron’s vertices and associate with five neighbors, just as they do when hexamers also are present. Each of the 60 nonvertex pentamers associates with its six adjacent neighbors as shown in figure 16.14c. An arm extends toward the adjacent vertex pentamer (pentamer 1) and twists around one of its arms. Three more arms interact in the same way with arms of other nonvertex pentamers (pentamers 3 to 5). The fifth arm binds directly to an adjacent nonvertex pentamer (pentamer 6) but does not attach to one of its arms. An arm does not extend from the central pentamer to pentamer 2; other arms hold pentamer 2 in place. Thus an icosahedral capsid is assembled without hexamers by using flexible arms as ropes to tie the pentamers together.

Figure 16.12 Examples of Icosahedral Capsids. (a) Canine parvovirus model, 12 capsomers, with the four parts of each capsid polypeptide given different colors. (b) and (c) Poliovirus model, 32 capsomers, with the four capsid proteins in different colors. The capsid surface is depicted in (b) and a cross section in (c). (d) Clusters of the human papilloma virus, 72 capsomers (×80,000). (e) Simian virus 40 (SV-40), 72 capsomers (×340,000). (f) Computer-simulated image of the polyomavirus, 72 capsomers, that causes a rare demyelinating disease of the central nervous system. (g) Adenovirus, 252 capsomers (×171,000). (h) Computer-simulated model of adenovirus.

Nucleic Acids

Viruses are exceptionally flexible with respect to the nature of their genetic material. They employ all four possible nucleic acid types: single-stranded DNA, double-stranded DNA, single-stranded RNA, and double-stranded RNA. All four types are found in animal viruses. Plant viruses most often have single-stranded RNA genomes. Although phages may have single-stranded DNA or single-stranded RNA, bacterial viruses usually contain double-stranded DNA.

Figure 16.13 The Structure of an Icosahedral Capsid.
Pentons are located at the 12 vertices. Hexons form the edges and faces of the icosahedron. This capsid contains 42 capsomers; all protomers are identical.

Figure 16.14 An Icosahedral Capsid Constructed of Pentamers. (a) The simian virus 40 capsid. The 12 pentamers at the icosahedron vertices are in white. The nonvertex pentamers are shown with each polypeptide chain in a different color. (b) A pentamer with extended arms. (c) A schematic diagram of the surface structure depicted in part c. The body of each pentamer is represented by a five-petaled flower design. Each arm is shown as a line or a line and cylinder (α-helix) with the same color as the rest of its protomer. The outer protomers are numbered clockwise beginning with the one at the vertex.

Table 16.1 Types of Viral Nucleic Acids
(Nucleic acid type, then nucleic acid structure: virus examples)

DNA, single-stranded
  Linear single strand: Parvoviruses
  Circular single strand: φX174, M13, fd phages

DNA, double-stranded
  Linear double strand: Herpesviruses (herpes simplex viruses, cytomegalovirus, Epstein-Barr virus), adenoviruses, T coliphages, lambda phage, and other bacteriophages
  Linear double strand with single chain breaks: T5 coliphage
  Double strand with cross-linked ends: Vaccinia, smallpox
  Closed circular double strand: Polyomaviruses (SV-40), papillomaviruses, PM2 phage, cauliflower mosaic

RNA, single-stranded
  Linear, single stranded, positive strand: Picornaviruses (polio, rhinoviruses), togaviruses, RNA bacteriophages, TMV, and most plant viruses
  Linear, single stranded, negative strand: Rhabdoviruses (rabies), paramyxoviruses (mumps, measles)
  Linear, single stranded, segmented, positive strand: Brome mosaic virus (individual segments in separate virions)
  Linear, single stranded, segmented, diploid (two identical single strands), positive strand: Retroviruses (Rous sarcoma virus, human immunodeficiency virus)
  Linear, single stranded, segmented, negative strand: Paramyxoviruses, orthomyxoviruses (influenza)

RNA, double-stranded
  Linear, double stranded, segmented: Reoviruses, wound-tumor virus of plants, cytoplasmic polyhedrosis virus of insects, phage φ6, many mycoviruses

Modified from S. E. Luria, et al., General Virology, 3d edition, 1983. John Wiley & Sons, Inc., New York, NY.

Table 16.1 summarizes many variations seen in viral nucleic acids. The size of viral genetic material also varies greatly. The smallest genomes (those of the MS2 and Qβ viruses) are around 1 × 10^6 daltons, just large enough to code for three to four proteins. MS2, Qβ, and some other viruses even save space by using overlapping genes (see section 11.5). At the other extreme, T-even bacteriophages, herpesvirus, and vaccinia virus have genomes of 1.0 to 1.6 × 10^8 daltons and may be able to direct the synthesis of over 100 proteins. In the following paragraphs the nature of each nucleic acid type is briefly summarized.
Nucleic acid structure (pp. 230–35)

Tiny DNA viruses like φX174 and M13 bacteriophages or the parvoviruses possess single-stranded DNA (ssDNA) genomes (table 16.1). Some of these viruses have linear pieces of DNA, whereas others use a single, closed circle of DNA for their genome (figure 16.15).

Figure 16.15 Circular Phage DNA. The closed circular DNA of the phage PM2 (×93,000). Note both the relaxed and highly twisted or supercoiled forms.

Most DNA viruses use double-stranded DNA (dsDNA) as their genetic material. Linear dsDNA, variously modified, is found in many viruses; others have circular dsDNA. The lambda phage has linear dsDNA with cohesive ends (single-stranded complementary segments 12 nucleotides long) that enable it to cyclize when they base pair with each other (figure 16.16).

Figure 16.16 Circularization of Lambda DNA. The linear DNA of the lambda phage can be reversibly circularized. This is made possible by cohesive ends (in color) that have complementary sequences and can base pair with each other. Sequence labels in the figure: GGGCGGCGACCT and CCCGCCGCTGGA.

Besides the normal nucleotides found in DNA, many virus DNAs contain unusual bases. For example, the T-even phages of E. coli (see chapter 17) have 5-hydroxymethylcytosine (see figure 17.7) instead of cytosine. Glucose is usually attached to the hydroxymethyl group.

Most RNA viruses employ single-stranded RNA (ssRNA) as their genetic material. The RNA base sequence may be identical with that of viral mRNA, in which case the RNA strand is called the plus strand or positive strand (viral mRNA is defined as plus or positive). However, the viral RNA genome may instead be complementary to viral mRNA, and then it is called a minus or negative strand. Polio, tobacco mosaic, brome mosaic, and Rous sarcoma viruses are all positive strand RNA viruses; rabies, mumps, measles, and influenza viruses are examples of negative strand RNA viruses.

Many of these RNA genomes are segmented genomes; that is, they are divided into separate parts. It is believed that each fragment or segment codes for one protein. Usually all segments are enclosed in the same capsid, even though some virus genomes may be composed of as many as 10 to 12 segments. However, it is not necessary that all segments be located in the same virion for successful reproduction. The brome mosaic virus genome is composed of four segments distributed among three different virus particles. All three of the largest segments are required for infectivity.
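The base pairing that lets lambda DNA circularize can be checked directly on the two 12-nucleotide sequences printed with figure 16.16. The short sketch below is only an illustration: the function names are hypothetical, and it assumes the two labels are drawn as aligned, antiparallel single-stranded ends, so that each base sits opposite its partner.

```python
# Watson-Crick pairing rules for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence (same left-to-right order)."""
    return "".join(PAIRS[base] for base in seq)


def can_base_pair(end_a: str, end_b: str) -> bool:
    """True if end_b, written antiparallel beneath end_a, pairs with it at every position."""
    return len(end_a) == len(end_b) and complement(end_a) == end_b


# The two sequence labels from figure 16.16. Treating them as the two
# cohesive ends, aligned position by position, is an assumption of this sketch.
LEFT_END = "GGGCGGCGACCT"
RIGHT_END = "CCCGCCGCTGGA"

if __name__ == "__main__":
    print(can_base_pair(LEFT_END, RIGHT_END))
```

Under that assumption every one of the 12 positions forms a G-C or A-T pair, which is what allows the linear molecule to close into a circle.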
Despite this complex and seemingly inefficient arrangement, the different brome mosaic virions manage to successfully infect the same host.

Plus strand viral RNA often resembles mRNA in more than the equivalence of its nucleotide sequence. Just as eucaryotic mRNA usually has a 5′ cap of 7-methylguanosine, many plant and animal viral RNA genomes are capped. In addition, most or all plus strand RNA animal viruses also have a poly-A stretch at the 3′ end of their genome, and thus closely resemble eucaryotic mRNA with respect to the structure of both ends. In fact, plus strand RNAs can direct protein synthesis immediately after entering the cell. Strangely enough, a number of single-stranded plant viral RNAs have 3′ ends that resemble eucaryotic transfer RNA, and the genomes of tobacco mosaic virus will actually accept amino acids. Capping is not seen in the RNA bacteriophages.
Eucaryotic mRNA structure and function (pp. 263–64)

A few viruses have double-stranded RNA (dsRNA) genomes. All appear to be segmented; some, such as the reoviruses, have 10 to 12 segments. These dsRNA viruses are known to infect animals, plants, fungi, and even one bacterial species.

Viral Envelopes and Enzymes

Many animal viruses, some plant viruses, and at least one bacterial virus are bounded by an outer membranous layer called an envelope (figure 16.17). Animal virus envelopes usually arise from host cell nuclear or plasma membranes; their lipids and carbohydrates are normal host constituents. In contrast, envelope proteins are coded for by virus genes and may even project from the envelope surface as spikes or peplomers (figure 16.17a,b,f). These spikes may be involved in virus attachment to the host cell surface. Since they differ among viruses, they also can be used to identify some viruses. Because the envelope is a flexible, membranous structure, enveloped viruses frequently have a somewhat variable shape and are called pleomorphic.
However, the envelopes of viruses like the bullet-shaped rabies virus are firmly attached to the underlying nucleocapsid and endow the virion with a constant, characteristic shape (figure 16.17c). In some viruses the envelope is disrupted by solvents like ether to such an extent that lipid-mediated activities are blocked or envelope proteins are denatured and rendered inactive. The virus is then said to be “ether sensitive.”

Influenza virus (figure 16.17a,b) is a well-studied example of an enveloped virus. Spikes project about 10 nm from the surface at 7 to 8 nm intervals. Some spikes possess the enzyme neuraminidase, which may aid the virus in penetrating mucous layers of the respiratory epithelium to reach host cells. Other spikes have hemagglutinin proteins, so named because they can bind the virions to red blood cell membranes and cause hemagglutination (see figure 33.10). Hemagglutinins participate in virion attachment to host cells. Proteins, like the spike proteins that are exposed on the outer envelope surface, are generally glycoproteins; that is, the proteins have carbohydrate attached to them. A nonglycosylated protein, the M or matrix protein, is found on the inner surface of the envelope and helps stabilize it.

Figure 16.17 Examples of Enveloped Viruses. (a) Human influenza virus. Note the flexibility of the envelope and the spikes projecting from its surface (×282,000). (b) Diagram of the influenza virion. (c) Rhabdovirus particles (×250,000). This is the vesicular stomatitis virus, a relative of the rabies virus, which is similar in appearance. (d) Human immunodeficiency viruses (×33,000). (e) Herpesviruses (×100,000). (f) Computer image of the Semliki Forest virus, a virus that occasionally causes encephalitis in humans. Labels in (b): hemagglutinin spike, neuraminidase spike, matrix protein, lipid bilayer, polymerase, ribonucleoprotein.

Although it was originally thought that virions had only structural capsid proteins and lacked enzymes, this has proven not to be the case. In some instances, enzymes are associated with the envelope or capsid (e.g., influenza neuraminidase). Most viral enzymes are probably located within the capsid. Many of these are involved in nucleic acid replication. For example, the influenza virus uses RNA as its genetic material and carries an RNA-dependent RNA polymerase that acts both as a replicase and as an RNA transcriptase that synthesizes mRNA under the direction of its RNA genome. The polymerase is associated with ribonucleoprotein (figure 16.17b). Although viruses lack true metabolism and cannot reproduce independently of living cells, they may carry one or more enzymes essential to the completion of their life cycles.
Nucleic acid replication and transcription (sections 11.3 and 12.1); Animal virus reproduction (pp. 399–410)

Viruses with Capsids of Complex Symmetry

Although most viruses have either icosahedral or helical capsids, many viruses do not fit into either category. The poxviruses and large bacteriophages are two important examples.

The poxviruses are the largest of the animal viruses (about 400 × 240 × 200 nm in size) and can even be seen with a phase-contrast microscope or in stained preparations. They possess an exceptionally complex internal structure with an ovoid- to brick-shaped exterior. The double-stranded DNA is associated with proteins and contained in the nucleoid, a central structure shaped like a biconcave disk and surrounded by a membrane (figure 16.18). Two elliptical or lateral bodies lie between the nucleoid and its outer envelope, a membrane and a thick layer covered by an array of tubules or fibers.

Figure 16.18 Vaccinia Virus Morphology. (a) Diagram of vaccinia structure. (b) Micrograph of the virion clearly showing the nucleoid (×200,000). (c) Vaccinia surface structure. An electron micrograph of four virions showing the thick array of surface fibers (×150,000). Labels in (a): elliptical or lateral body, outer envelope, nucleoid; dimensions shown: 240 nm and 400 nm.

Some large bacteriophages are even more elaborate than the poxviruses. The T2, T4, and T6 phages that infect E. coli have been intensely studied. Their head resembles an icosahedron elongated by one or two rows of hexamers in the middle (figure 16.19) and contains the DNA genome. The tail is composed of a collar joining it to the head, a central hollow tube, a sheath surrounding the tube, and a complex baseplate. The sheath is made of 144 copies of the gp18 protein arranged in 24 rings, each containing six copies. In T-even phages, the baseplate is hexagonal and has a pin and a jointed tail fiber at each corner. The tail fibers are responsible for virus attachment to the proper site on the bacterial surface (see section 17.2).

There is considerable variation in structure among the large bacteriophages, even those infecting a single host. In contrast with the T-even phages, many coliphages have true icosahedral heads. T1, T5, and lambda phages have sheathless tails that lack a baseplate and terminate in rudimentary tail fibers. Coliphages T3 and T7 have short, noncontractile tails without tail fibers. Clearly these viruses can complete their reproductive cycles using a variety of tail structures.

Complex bacterial viruses with both heads and tails are said to have binal symmetry because they possess a combination of icosahedral (the head) and helical (the tail) symmetry.

Figure 16.19 T-Even Coliphages. (a) The structure of the T4 bacteriophage. (b) The micrograph shows the phage before injection of its DNA. Labels in (a): head, collar, core or tube (hollow), helical sheath, tail fibers, hexagonal baseplate, tail pins.

1. Define the following terms: nucleocapsid, capsid, icosahedral capsid, helical capsid, complex virus, protomer, self-assembly, capsomer, pentamer or penton, and hexamer or hexon. How do pentamers and hexamers associate to form a complete icosahedron; what determines helical capsid length and diameter?
2. All four nucleic acid forms can serve as virus genomes. Describe each, the types of virion possessing it, and any distinctive physical characteristics the nucleic acid can have. What are the following: plus strand, minus strand, and segmented genome?
3. What is an envelope? What are spikes or peplomers? Why are some enveloped viruses pleomorphic? Give two functions spikes might serve in the virus life cycle, and the proteins that the influenza virus uses in these processes.
4. What is a complex virus? Binal symmetry? Briefly describe the structure of poxviruses and T-even bacteriophages.

16.6 Principles of Virus Taxonomy

The classification of viruses is in a much less satisfactory state than that of either bacteria or eucaryotic microorganisms. In part, this is due to a lack of knowledge of their origin and evolutionary history (Box 16.2). Usually viruses are separated into several large groups based on their host preferences: animal viruses, plant viruses, bacterial viruses (bacteriophages), and so forth. In the past virologists working with these groups were unable to agree on a uniform system of classification and nomenclature.
Beginning with its 1971 report, the International Committee on Taxonomy of Viruses has developed a uniform classification system and now divides viruses into three orders, 56 families, 9 subfamilies, 233 genera, and 1,550 virus species. The committee places greatest weight on a few properties to define families: nucleic acid type, nucleic acid strandedness, the sense (positive or negative) of ssRNA genomes, presence or absence of an envelope, and the host. Virus family names end in -viridae; subfamily names, in -virinae; and genus (and species) names, in -virus. For example, the poxviruses are in the family Poxviridae; the subfamily Chordopoxvirinae contains poxviruses of vertebrates. Within the subfamily are several genera that are distinguished on the basis of immunologic characteristics and host specificity. The genus Orthopoxvirus contains several species, among them variola major (the cause of smallpox), vaccinia, and cowpox.

Viruses are divided into different taxonomic groups based on characteristics that are related to the type of host used, virion structure and composition, mode of reproduction, and the nature of any diseases caused. Some of the more important characteristics are:

1. Nature of the host: animal, plant, bacterial, insect, fungal
2. Nucleic acid characteristics: DNA or RNA, single or double stranded, molecular weight, segmentation and number of pieces of nucleic acid (RNA viruses), the sense of the strand in ssRNA viruses
3. Capsid symmetry: icosahedral, helical, binal
4. Presence of an envelope and ether sensitivity
5. Diameter of the virion or nucleocapsid
6. Number of capsomers in icosahedral viruses
7. Immunologic properties
8. Gene number and genomic map
9. Intracellular location of viral replication
10. The presence or absence of a DNA intermediate (ssRNA viruses), and the presence of reverse transcriptase
11. Type of virus release
12. Disease caused and/or special clinical features, method of transmission

Table 16.2 illustrates the use of some of these properties to describe a few common virus groups. Virus classification is further discussed when bacterial, animal, and plant viruses are considered more specifically, and a fairly complete summary of virus taxonomy is presented in appendix V.

1. List some characteristics used in classifying viruses. Which seem to be the most important?
2. What are the endings for virus families, subfamilies, and genera or species?

Box 16.2 The Origin of Viruses

The origin and subsequent evolution of viruses are shrouded in mystery, in part because of the lack of a fossil record. However, recent advances in the understanding of virus structure and reproduction have made possible more informed speculation on virus origins. At present there are two major hypotheses entertained by virologists.

It has been proposed that at least some of the more complex enveloped viruses, such as the poxviruses and herpesviruses, arose from small cells, probably procaryotic, that parasitized larger, more complex cells. These parasitic cells would become ever simpler and more dependent on their hosts, much like multicellular parasites have done, in a process known as retrograde evolution. There are several problems with this hypothesis. Viruses are radically different from procaryotes, and it is difficult to envision the mechanisms by which such a transformation might have occurred or the selective pressures leading to it. In addition, one would expect to find some forms intermediate between procaryotes and at least the more complex enveloped viruses, but such forms have not been detected.
The second hypothesis is that viruses represent cellular nucleic acids that have become partially independent of the cell. Possibly a few mutations could convert nucleic acids, which are only synthesized at specific times, into infectious nucleic acids whose replication could not be controlled. This conjecture is supported by the observation that the nucleic acids of retroviruses (see section 18.2) and a number of other virions do contain sequences quite similar to those of normal cells, plasmids, and transposons (see chapter 13). The small, infectious RNAs called viroids (see section 18.9) have base sequences complementary to transposons, the regions around the boundary of mRNA introns (see section 12.1), and portions of host DNA. This has led to speculation that they have arisen from introns or transposons.

It is possible that viruses have arisen by way of both mechanisms. Because viruses differ so greatly from one another, it seems likely that they have originated independently many times during the course of evolution. Probably many viruses have evolved from other viruses just as cellular organisms have arisen from specific predecessors. The question of virus origins is complex and quite speculative; future progress in understanding virus structure and reproduction may clarify this question.

Table 16.2 Some Common Virus Groups and Their Characteristics

Virus Group         Symmetry(a)  Envelope  Host(c)  Size of Capsid (nm)(b)

RNA, single stranded
Picornaviridae      I            –         A        22–30
Togaviridae         I            +         A        40–70(e)
Retroviridae        I?           +         A        100(e)
Orthomyxoviridae    H            +         A        9(h), 80–120(e)
Paramyxoviridae     H            +         A        18(h), 125–250(e)
Coronaviridae       H            +         A        14–16(h), 80–160(e)
Rhabdoviridae       H            +         A        18(h), 70–80 × 130–240 (bullet shaped)
Bromoviridae        I,B          –         P        26–35; 18–26 × 30–85
Tobamovirus         H            –         P        18 × 300
Leviviridae [Qβ]    I            –         B        26–27

RNA, double stranded
Reoviridae          I            –         A,P      70–80
Cystoviridae        I            +         B        100(e)

DNA, single stranded
Parvoviridae        I            –         A        20–25
Geminiviridae       I            –         P        18 × 30 (paired particles)
Microviridae        I            –         B        25–35
Inoviridae          H            –         B        6 × 900–1,900

DNA, double stranded
Polyomaviridae      I            –         A        40
Papillomaviridae    I            –         A        55
Adenoviridae        I            –         A        60–90
Iridoviridae        I            +         A        130–180
Herpesviridae       I            +         A        100, 180–200(e)
Poxviridae          C            +         A        200–260 × 250–290(e)
Baculoviridae       H            +         A        40 × 300(e)
Hepadnaviridae      C            +         A        28 (core), 42(e)
Caulimoviridae      I,B          –         P        50; 30 × 60–900
Corticoviridae      I            –         B        60
Myoviridae          Bi           –         B        80 × 110, 110(d)

Number of Capsomers: the values for this column survive only as an unaligned sequence, so they are listed by group rather than by row: 32, 32, 32 (RNA, single stranded); 92 (RNA, double stranded); 12 (DNA, single stranded); 72, 72, 252, 162, 42 (DNA, double stranded).

(a) Types of symmetry: I, icosahedral; H, helical; C, complex; Bi, binal; B, bacilliform.
(b) Diameter of helical capsid (h); diameter of enveloped virion (e).
(c) Host range: A, animal; P, plant; B, bacterium.
(d) The first number is the head diameter; the second number, the tail length.

Summary

1. Europeans were first protected from a viral disease when Edward Jenner developed a smallpox vaccine in 1798.
2. Chamberland’s invention of a porcelain filter that could remove bacteria from virus samples enabled microbiologists to show that viruses were different from bacteria.
3. In the late 1930s Stanley, Bawden, and Pirie crystallized the tobacco mosaic virus and demonstrated that it was composed only of protein and nucleic acid.
4. A virion is composed of either DNA or RNA enclosed in a coat of protein (and sometimes other substances as well). It cannot reproduce independently of living cells.
5. Viruses are cultivated using tissue cultures, embryonated eggs, bacterial cultures, and other living hosts.
6. Sites of animal viral infection may be characterized by cytopathic effects such as pocks and plaques. Phages produce plaques in bacterial lawns. Plant viruses can cause localized necrotic lesions in plant tissues.
7. Viruses can be purified by techniques such as differential and gradient centrifugation, precipitation, and denaturation or digestion of contaminants.
8. Virus particles can be counted directly with the transmission electron microscope or indirectly by the hemagglutination assay.
9. Infectivity assays can be used to estimate virus numbers in terms of plaque-forming units, lethal dose (LD50), or infectious dose (ID50).
10. All virions have a nucleocapsid composed of a nucleic acid, either DNA or RNA, held within a protein capsid made of one or more types of protein subunits called protomers.
11. There are four types of viral morphology: naked icosahedral, naked helical, enveloped icosahedral and helical, and complex.
12. Helical capsids resemble long hollow protein tubes and may be either rigid or quite flexible. The nucleic acid is coiled in a spiral on the inside of the cylinder (figure 16.11b).
13. Icosahedral capsids are usually constructed from two types of capsomers: pentamers (pentons) at the vertices and hexamers (hexons) on the edges and faces of the icosahedron (figure 16.13).
14. Viral nucleic acids can be either single stranded or double stranded, DNA or RNA. Most DNA viruses have double-stranded DNA genomes that may be linear or closed circles (table 16.1).
15. RNA viruses usually have ssRNA that may be either plus (positive) or minus (negative) when compared with mRNA (positive). Many RNA genomes are segmented.
16. Viruses can have a membranous envelope surrounding their nucleocapsid. The envelope lipids usually come from the host cell; in contrast, many envelope proteins are viral and may project from the envelope surface as spikes or peplomers.
17. Although viruses lack true metabolism, some contain a few enzymes necessary for their reproduction.
18. Complex viruses (e.g., poxviruses and large phages) have complicated morphology not characterized by icosahedral and helical symmetry. Large phages often have binal symmetry: their heads are icosahedral and their tails, helical (figure 16.19a).
19. Currently viruses are classified with a taxonomic system placing primary emphasis on the host, type and strandedness of viral nucleic acids, and on the presence or absence of an envelope.

Key Terms

bacteriophage; binal symmetry; capsid; capsomers; complex viruses; cytopathic effects; differential centrifugation; envelope; gradient centrifugation; helical; hemagglutination assay; hexamers (hexons); icosahedral; infectious dose (ID50); lethal dose (LD50); minus strand or negative strand; necrotic lesion; nucleocapsid; pentamers (pentons); phage; plaque; plaque assay; plaque-forming units (PFU); plus strand or positive strand; protomers; segmented genome; spike or peplomer; virion; virologist; virology; virus

Questions for Thought and Review

1. In what ways do viruses resemble living organisms?
2. Why might virology have developed much more slowly without the use of Chamberland’s filter?
3. What advantage would an RNA virus gain by having its genome resemble eucaryotic mRNA?
4. A number of characteristics useful in virus taxonomy are listed in section 16.6. Can you think of any other properties that might be of considerable importance in future studies on virus taxonomy?

Critical Thinking Questions

1. Many classification schemes are used to identify bacteria.
These start with Gram staining, progress to morphology/arrangement characteristics, and include a battery of metabolic tests. Build an analogous scheme that could be used to identify viruses. You might start by considering the host, or you might start with viruses found in a particular environment, such as a marine filtrate.
2. Consider the different perspectives on the origin of viruses in Box 16.2. Discuss whether you think viruses evolved before the first procaryote, or whether they have coevolved, and are perhaps still coevolving with their hosts.
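The naming convention from section 16.6 (family names ending in viridae, subfamily names in virinae, genus and species names in virus) can be expressed as a simple suffix check. The sketch below is only an illustration of that rule; the function name taxon_rank is hypothetical, and the check says nothing about whether a name is an officially recognized taxon.

```python
def taxon_rank(name: str) -> str:
    """Guess the taxonomic rank of a virus name from its ending.

    Follows the suffix rule stated in section 16.6; any name with a
    different ending is reported as "unknown".
    """
    lowered = name.lower()
    if lowered.endswith("viridae"):
        return "family"
    if lowered.endswith("virinae"):
        return "subfamily"
    if lowered.endswith("virus"):
        return "genus or species"
    return "unknown"


if __name__ == "__main__":
    # Example names taken from the poxvirus discussion in section 16.6.
    for example in ("Poxviridae", "Chordopoxvirinae", "Orthopoxvirus", "vaccinia"):
        print(example, "->", taxon_rank(example))
```

Common species names such as vaccinia or cowpox do not carry the suffix, so a check like this can only sort formal names and would need a lookup table for everything else.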