Review TRENDS in Microbiology Vol.9 No.11 November 2001 547

Evolutionary genomics of pathogenic bacteria

J. Ross Fitzgerald and James M. Musser*

Laboratory of Human Bacterial Pathogenesis, Rocky Mountain Laboratories, National Institute of Allergy and Infectious Diseases, National Institutes of Health, 903 South 4th St, Hamilton, MT 59840, USA. *e-mail: [email protected]

Complete genome sequences are now available for multiple strains of several bacterial pathogens and comparative analysis of these sequences is providing important insights into the evolution of bacterial virulence. Recently, DNA microarray analysis of many strains of several pathogenic species has contributed to our understanding of bacterial diversity, evolution and pathogenesis. Comparative genomics has shown that pathogens such as Escherichia coli, Helicobacter pylori and Staphylococcus aureus contain extensive variation in gene content whereas Mycobacterium tuberculosis nucleotide divergence is very limited. Overall, these approaches are proving to be a powerful means of exploring bacterial diversity, and are providing an important framework for the analysis of the evolution of pathogenesis and the development of novel antimicrobial agents.

Comparative genomics is a rapidly advancing discipline that is currently being energized by the availability of genome sequences for multiple strains of pathogenic bacterial species and by the advent of DNA microarray technology. Here, we review very recent findings from comparative genomic studies of selected important human pathogens, and discuss the implications of these studies for our understanding of bacterial evolution and the development of improved disease therapeutics.

Escherichia coli

The non-pathogenic Escherichia coli strain K12 has frequently been used as a model system to develop our understanding of bacterial metabolism and growth. Although most strains of E.
coli are non-pathogenic and usually exist as commensals, enterohemorrhagic E. coli O157:H7 (EHEC) causes hemorrhagic colitis, which can be associated with the often fatal hemolytic uremic syndrome, and enteropathogenic E. coli (EPEC) is an important cause of diarrhea in the developing world. A comparison between the genome sequences of E. coli K12 (Ref. 1) and E. coli O157:H7 (Ref. 2) identified many new genes that are likely to play an important role in the virulence of this food-borne pathogen. Perna et al. (Ref. 2) showed that although 4.1 Mb of sequence is shared between these strains, the intra-species diversity is extensive: E. coli O157:H7 contains 1.34 Mb of strain-specific sequence (1387 genes) compared with 0.53 Mb (528 genes) of unique sequence in the K12 strain MG1655. Most of these genes are localized in strain-specific ‘islands’ in which the base composition is atypical, suggesting that these sequences were obtained from bacterial donor species with a different base composition by relatively recent horizontal transfer events.

The O157:H7-specific islands encode many known and candidate virulence factors. Several genes encoding toxins, including a macrophage toxin, an RTX-toxin-like exoprotein and transport system, and two urease gene clusters were identified in the larger islands, which are >15 kb in size. Two fatty-acid biosynthesis systems, an adhesin gene and the locus of enterocyte effacement, which has been shown to be involved in virulence (Ref. 3), were also identified. In addition, putative fimbrial and non-fimbrial adhesin genes were present in the smaller chromosomal islands. Molecular analysis of the proteins encoded by these genes should allow additional mechanisms of disease pathogenesis to be elucidated.

The identification of many horizontally transferred regions raises questions regarding the evolution of contemporary clones of E. coli. To investigate the evolution of pathogenic E.
coli, Reid and colleagues analyzed the sequence of seven genes in 20 pathogenic strains and the non-pathogenic strain K12 and constructed a phylogenetic tree (Ref. 4). The results provide evidence that E. coli O157:H7 diverged from a common ancestor of E. coli K12 as much as 4.5 million years ago (Ref. 4). The data also indicate that in some E. coli clones, virulence is a recently evolved trait resulting from the horizontal transfer of virulence genes. Moreover, distinct evolutionary lineages of E. coli appear to have acquired some of the same virulence factors independently.

DNA microarray analysis has also been used to study the evolutionary genomics of E. coli (Ref. 5). Ochman and Jones (Ref. 5) analyzed the distribution of the 4290 open reading frames (ORFs) of the sequenced E. coli K12 MG1655 strain in five strains of known evolutionary relationships. The amount of unique DNA present in each strain was predicted based on genome size and the variation in gene content, and sequence analysis determined the relative age of every gene in the MG1655 chromosome. They found that 3782 ORFs were common to all strains of E. coli examined. The present distribution of all 4290 genes in MG1655 could be accounted for by a total of 67 molecular events, including 37 insertions and 30 deletions (Fig. 1). This precise prediction of the molecular events contributing to the evolution of E. coli clones demonstrates the power of whole-genome DNA microarrays to examine the diversity and evolution of bacterial species on a scale never before possible.

[Figure 1: linear chromosome maps of E. coli strains MG1655, W3110, ECOR 21, ECOR 40 and ECOR 37 on a 1.0–4.0 Mb scale, with genome sizes listed on the right.]

Fig. 1. Distribution of deletions among strains of Escherichia coli. Chromosomes are represented linearly and drawn to equal length to show the locations of missing open reading frames (ORFs) relative to their map positions in MG1655. Actual chromosome sizes, as determined by pulsed field gel electrophoresis, are shown on the right. Black lines portray chromosomal positions where the corresponding MG1655 ORFs are lacking in a laboratory (W3110) or natural (ECOR) isolate of E. coli. The relative thickness of these lines denotes the amount of missing DNA. Blue bands show the positions and sizes of regions in the MG1655 chromosome deduced to be horizontally acquired, based on sequence features, and the orange bands represent the positions and sizes of known prophage in the MG1655 chromosome. Upper-case letters designate major phylogenetic subgroups within E. coli. Roman numerals indicate ancestral lineages of E. coli. Modified with permission from Ref. 5.

0966-842X/01/$ – see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0966-842X(01)02228-4

Mycobacterium tuberculosis

Mycobacterium tuberculosis continues to be the leading bacterial killer in the world, with 3–4 million human deaths annually as a result of tuberculosis (TB). The M. tuberculosis H37Rv genome sequence was published in 1998 (Ref. 6). Although originally cultured from a patient with TB, this strain has been passaged in vitro for approximately 100 years. Importantly, since the publication of the genome sequence of strain H37Rv, a recent clinical isolate has been sequenced (http://www.tigr.org), although not yet described in a publication. Comparison of these completed genomes should provide a framework to begin elucidating the mechanisms involved in virulence. In addition, the Mycobacterium leprae genome sequence has been completed (Ref. 7) and the genomes of strains of Mycobacterium bovis, Mycobacterium avium and Mycobacterium smegmatis are being sequenced. These sequence data will provide a wealth of important information for the analysis of evolution and virulence of the M. tuberculosis complex.

The members of the M. tuberculosis complex are extremely closely related at the nucleotide level (Ref. 8).
Sreevatsan and colleagues performed comparative sequencing of 26 structural genes in >800 isolates of members of the M. tuberculosis complex. They showed that the level of synonymous (silent) nucleotide variation in structural genes of M. tuberculosis complex members is greatly restricted and considerably less than that present in other pathogens. The data indicate that M. tuberculosis has undergone a recent evolutionary bottleneck at the time of speciation, which is predicted to have occurred somewhere in the order of 15 000–20 000 years ago. Accordingly, M. tuberculosis must have disseminated worldwide very recently in evolutionary time. In another study, Musser et al. (Ref. 9) sequenced 24 genes encoding targets of the human immune system in a sample of 16 isolates recovered from different global sources. Surprisingly, 19 of the 24 genes lacked nucleotide sequence diversity. Only six polymorphic sites were identified in the other five genes, confirming that the variation in M. tuberculosis structural genes is negligible, even in genes encoding proteins that interact with the host immune system.

The original vaccine against TB (M. bovis bacille Calmette–Guérin; BCG) was developed by Calmette and Guérin by passaging a strain of M. bovis 230 times between 1908 and 1921. However, because of the requirement for continuous passage of derivative strains, by the time lyophilized seed stocks were made in the 1960s, different daughter strains had undergone up to 1000 additional passages. This has resulted in many phenotypically different strains, which are thought to vary considerably in their ability to induce protection against TB. In spite of this, the BCG vaccine is still the world’s most widely used vaccine.

To investigate the genetic basis for the variation in the efficacy of M. bovis-derived BCG strains, Behr et al. (Ref. 10) used a whole-genome DNA microarray to analyze 12 such strains used for vaccination in different global locations.
A total of 16 deleted regions were identified among the strains examined. Behr and colleagues were able to construct a historical genealogy of these daughter strains that predicted their relationship to each other and identified when the deletions took place. One of these regions (RD1) was absent in all BCG strains tested but present in all other M. tuberculosis complex members, and might be responsible for the original attenuation described by Calmette (Ref. 11). This study could have important implications for the design of a more effective vaccine for protection against TB.

Very recently, a whole-genome microarray approach has also been used for epidemiological analysis of 19 M. tuberculosis isolates (Ref. 12). Twenty-five deleted chromosomal regions were identified among these isolates, and an inverse correlation was observed between the percentage of the genome deleted from each clone and the percentage of patients infected with that clone who had pulmonary cavitations. The authors interpreted these data to indicate that accumulation of deletions in M. tuberculosis results in diminished virulence. Chromosomal deletions in M. tuberculosis complex members appear to occur as the result of homologous recombination between copies of IS6110 that flank chromosomal regions in the direct orientation. Taken together, the data indicate that single-nucleotide polymorphisms are very rare and deletion events are an important source of genome variation in the M. tuberculosis complex. Moreover, horizontal gene transfer has been very limited within the M. tuberculosis complex.

Helicobacter pylori

One-half of the world’s population is predicted to be infected by Helicobacter pylori, although <10% of these individuals ever present with clinical disease. H.
pylori is the cause of superficial and chronic gastritis, duodenal ulcers and many gastric ulcers, and has also been strongly linked to gastric cancer and mucosa-associated lymphoid tissue lymphoma (Ref. 13). H. pylori was the first bacterial species for which genome sequences were completed for two strains (strains J99 and 26695) (Refs 14,15). This has proved to be an important resource for analysis of H. pylori diversity and provides a useful framework to study pathogenesis and disease specificity. Many different molecular-typing techniques have demonstrated the remarkable genetic diversity of this important gastric pathogen (Refs 16,17) and a high degree of sequence variation is observed between J99 and 26695 (Ref. 15). Each genome contains strain-specific genes (89 genes in J99, and 117 genes in 26695), and most of these genes are located at similar chromosomal locations in both strains. The so-called plasticity zone contains more than half of the strain-specific genes, including many of the numerous restriction and modification genes typical of H. pylori. Although not all are functional (Ref. 18), the large numbers of restriction-modification systems in H. pylori strains could represent a mechanism to promote homologous recombination of short DNA fragments. Comparison of the two genomes shows that some genes for DNA mismatch repair are missing, which might contribute to the increased nucleotide divergence. It has been speculated that the isolated ecological niche inhabited by H. pylori might have promoted the development of mechanisms for DNA uptake and exchange which, in turn, have resulted in extensive intra-species diversity.

A whole-genome DNA microarray specific for both sequenced H. pylori strains has been used to analyze chromosomal diversity in H. pylori. Salama and colleagues determined the chromosomal gene content of 15 H. pylori isolates, including strains J99 and 26695 (Ref. 19).
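Once each array gene has been scored as present or absent in each isolate, a gene-content comparison of this kind reduces to a set operation. The following is a minimal Python sketch under stated assumptions: the strain names, gene names and the `classify_genes` helper are hypothetical and are not taken from Ref. 19, and the harder step of calling presence or absence from raw hybridization signals is assumed to have been done already.

```python
from typing import Dict, List, Set, Tuple


def classify_genes(
    presence: Dict[str, Set[str]], array_genes: List[str]
) -> Tuple[List[str], List[str]]:
    """Split array genes into those present in every strain (core)
    and those missing from at least one strain (variable)."""
    core: List[str] = []
    variable: List[str] = []
    for gene in array_genes:
        if all(gene in genes for genes in presence.values()):
            core.append(gene)
        else:
            variable.append(gene)
    return core, variable


# Hypothetical toy data: five array genes scored in three strains.
array_genes = ["g1", "g2", "g3", "g4", "g5"]
presence = {
    "strainA": {"g1", "g2", "g3", "g4"},
    "strainB": {"g1", "g2", "g3", "g5"},
    "strainC": {"g1", "g2", "g3"},
}

core, variable = classify_genes(presence, array_genes)
core_pct = 100.0 * len(core) / len(array_genes)

print("core genes:", core)
print("variable genes:", variable)
print(f"core fraction: {core_pct:.1f}%")
```

Genes in the first list form the complement shared by all strains examined; genes in the second list are the candidates for strain-specific functions.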
Of the 1643 genes represented on the array, 1281 were common to all strains whereas 22% of genes were strain-specific and could encode proteins involved in niche adaptation. Many of the strain-specific genes were present in either the 40 kb pathogenicity island or the plasticity zone. However, two-thirds of all other strain-specific genes were distributed elsewhere in the chromosome in smaller tracts containing between one and eight genes. This indicates that H. pylori uses multiple mechanisms of gene acquisition and loss, which have contributed to its evolution. Importantly, the study identified new candidate virulence genes that could play a role in disease pathogenesis.

A similar DNA microarray was used to identify differences in gene content between two H. pylori isolates from gastric and duodenal ulcers that differed in virulence in a gerbil model of gastritis (Ref. 20). Israel and co-workers identified several strain-specific differences including the presence of the cag pathogenicity island in one strain. Subsequent disruption of the cag locus resulted in reduced gastric inflammation. This study demonstrates the utility of whole-genome microarrays for identifying differences in gene content between strains of different pathogenic potential.

Staphylococcus aureus

Staphylococcus aureus causes a wide variety of infections including several that are life threatening. Moreover, this organism is one of the leading causes of nosocomial infection. Treatment of infections caused by S. aureus is complicated by the resistance of many strains to the antibiotic of choice, methicillin. The genome sequences of a methicillin-resistant strain (N315) and a clonally related vancomycin-resistant strain (Mu50) have recently been published (Ref. 21). In addition, four S. aureus genome sequencing projects are nearing completion. Two of the strains being sequenced are methicillin-resistant strains, COL (http://www.tigr.org) and 252 (http://www.sanger.ac.uk).
The others are a laboratory strain, 8325 (http://www.genome.ou.edu), and a community-acquired infection strain, 476 (http://www.sanger.ac.uk). Several population genetic studies of S. aureus have used multilocus enzyme electrophoresis and multilocus sequencing to analyze genetic diversity within the species and have provided insights into host- and disease specificity (Refs 22–24). However, these studies have indexed variation at only a limited number of chromosomal loci.

Recently, we constructed a whole-genome DNA microarray specific for S. aureus strain COL, and analyzed the chromosomal gene content of 36 S. aureus strains representing the most abundant clonal lineages within the species (Ref. 25). Strains from well-defined human clinical conditions were selected for analysis, including toxic shock syndrome (TSS) isolates and methicillin-resistant strains, and strains isolated from bovine and ovine intramammary infections. We found that 78% of genes were common to all strains examined, identifying a gene complement likely to be important for general cell maintenance and growth. Conversely, 22% of genes were strain-specific and might play a role in adaptation to specialized niches, or other contingency functions. Eighteen large chromosomal regions of difference (RD) of between 3 and 50 kb in size were identified, and these were variably present among the strains examined (Fig. 2). Several of these RDs contained extensive variation in gene content and size, suggesting that they are large chromosomal loci with elevated recombination activity. Many RDs contained genes encoding gene-mobility proteins such as integrases or transposases, and genes that encode virulence determinants or proteins mediating antibiotic resistance. The data indicate that horizontal gene transfer and recombination have played a fundamental role in the evolution of pathogenic S. aureus.

The study also provided important insight into the evolution of methicillin-resistant strains. The data indicated that the mec element encoding the methicillin-resistance phenotype was acquired multiple times by horizontal gene transfer into different S. aureus clones. This finding rules out the single progenitor theory of the origin of methicillin-resistant S. aureus (Ref. 26). In addition, the study demonstrated that, although female urogenital TSS isolates are related, they do not share a very recent ancestor. This finding provides convincing evidence that the TSS epidemic of the 1970s and 1980s was caused by a change in the host (use of a new superabsorbent tampon) rather than by rapid global dissemination of a hypervirulent clone. Overall, this comparative genomics study provided important insights into genetic variation in natural populations of S. aureus and demonstrated the power of DNA microarray technology to address issues of evolution and pathogenesis definitively.

[Figure 2: presence/absence grid of RD1–RD18 against strain COL and the 36 S. aureus strains analyzed.]

Fig. 2. The presence or absence of large chromosomal regions of difference (RDs) in 36 Staphylococcus aureus strains. A square symbol denotes the presence of an RD and an empty space its absence. Hatched squares indicate presence of RDs in methicillin-resistant S. aureus (MRSA) strains. Red signifies strains of the predominant clone associated with female urogenital toxic shock syndrome. Reproduced from Ref. 25.

Group A streptococci

Group A streptococci (GAS) are responsible for a wide range of human infections, including pharyngitis, skin infections, sepsis, osteomyelitis, TSS and necrotizing fasciitis (Ref. 27).
The complete genome sequence for a GAS M1 serotype strain has recently been published (Ref. 28), and other M-type strains are currently being sequenced at the Sanger Centre (http://www.sanger.ac.uk) and the Laboratory of Human Bacterial Pathogenesis, Rocky Mountain Laboratories (Hamilton, MT, USA). Comparison of these genomes is already providing the basis for an improved understanding of GAS pathogenesis. Reid and colleagues (Ref. 29) recently examined four GAS genomes and identified 11 genes present in all strains examined encoding previously uncharacterized extracellular putative virulence factors. Sequence analysis of the 11 genes in 37 diverse strains found that recombination has contributed substantially to chromosomal diversity, and western blot analysis with sera from infected patients confirmed that these proteins were antigenic. Moreover, transcription of many genes was influenced by the covR and mga trans-acting regulatory loci. Currently, these proteins are being examined for their role in virulence, and their potential use in vaccines. In addition, a GAS genome-specific microarray is being used to explore variation within and between M protein serotypes of GAS. Preliminary data indicate that horizontal transfer by phage transduction has been a major source of intra-species diversity.

Chlamydia trachomatis

Chlamydia trachomatis and Chlamydia pneumoniae are obligate intracellular human bacterial pathogens with markedly different tissue tropism and disease manifestation. The genetic basis for these differences is unknown. The organisms grow only within a specialized vacuole in the post-Golgi exocytic vesicular compartment of eukaryotic cells. They undergo a distinct developmental cycle that alternates between an extracellular elementary body (EB) and an intracellular replicating cell, termed the reticulate body (RB). C. trachomatis causes trachoma, an eye infection that leads to blindness, and sexually transmitted diseases such as pelvic inflammatory disease.
Comparison of the two sequenced C. trachomatis genomes (MoPn, a mouse pathogen, and serovar D, a human pathogen) (Refs 30,31) reveals that the genomes are extremely similar outside of a region referred to as the plasticity zone. There are several differences at the plasticity zone of these different host-specific strains that could influence chlamydial pathogenesis (Ref. 30). C. trachomatis strain MoPn has a plasticity zone of ~51 kb that contains the guaAB and adenosine deaminase (add) genes as a single operon. At the same position in the ~23 kb plasticity zone in serovar D, a tryptophan biosynthesis cluster appears to have replaced the guaAB and add genes. These genes might encode proteins involved in scavenging nucleotides; this potential difference could influence the host range of tissues each strain is capable of infecting.

Another potentially very significant difference between the plasticity zones in these two strains is the presence of a 9675-nucleotide gene in strain MoPn encoding a putative toxin similar to a predicted E. coli O157:H7 toxin. The amino termini of the derived proteins show homology with the amino termini of large clostridial toxins (LCT), which have been shown to interfere with eukaryotic cell chemistry. Accordingly, this could be an important virulence determinant involved in promoting acute high-level infection. The serovar D strain contains the toxin gene but with multiple frameshift mutations. A recent paper by Belland et al. (Ref. 32) demonstrated that both strains produced a functional toxin but the MoPn LCT-like toxin had a higher cytotoxic activity. Also located in the plasticity zone are the phospholipase D-endonuclease (PLD) genes, with four paralogs in serovar D and five in MoPn. The role of the protein products of these genes is unknown. Overall, the plasticity zone is the site of most differences between these two strains, indicating that it could be a ‘hot spot’ for recombination.
The genetic variation at this chromosomal location provides important clues as to strain differences in host specificity and pathogenesis.

Chlamydia pneumoniae

Chlamydia pneumoniae causes pneumonia and bronchitis and is frequently associated with complex chronic diseases such as atherosclerosis, asthma and multiple sclerosis. The three C. pneumoniae genomes (AR39, CWL029 and J138) that have been sequenced (Refs 30,33,34) are very similar in gene content and order, with up to 99.9% identity. There are only small differences offering targets for strain differentiation. The genome sequences of strains AR39 and CWL029 were compared by Read et al. (Ref. 30). Only 296 single-nucleotide polymorphisms and 21 single-base frameshift mutations were identified. Many of the mutations occurred in intergenic regions and only 161 of 1165 derived proteins are not identical. Interestingly, strain AR39 contains a 4524-nucleotide circular bacteriophage that is not found in the other strains and which is the first C. pneumoniae bacteriophage to be described. This discovery raises interesting questions regarding horizontal gene transfer between obligate intracellular pathogens and the possibility of a role for the phage in pathogenesis.

Shirai et al. (Ref. 33) determined the genome sequence for C. pneumoniae J138 from Japan and compared it to strain CWL029. They observed a high level of structural and functional conservation between the two unrelated isolates. Only three chromosomal segments of between 27 and 84 nucleotides are unique to the J138 genome whereas five segments of between 89 and 1649 nucleotides are unique to the CWL029 genome. The striking similarity observed among the C. pneumoniae strains sequenced to date could reflect the intracellular niche they inhabit, and indicates recent evolution from a common ancestor.
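Counting single-nucleotide differences between two closely related genomes, as in the AR39 and CWL029 comparison above, can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned, with gaps written as "-"; the function name `compare_aligned` and the toy sequences are hypothetical, and real whole-genome comparisons rely on dedicated alignment software rather than this position-by-position loop.

```python
from typing import List, Tuple


def compare_aligned(seq_a: str, seq_b: str) -> Tuple[List[Tuple[int, str, str]], float]:
    """Return substitutions (1-based position, base in A, base in B)
    and percent identity over the ungapped aligned columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    snps: List[Tuple[int, str, str]] = []
    compared = 0
    for pos, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()), start=1):
        if a == "-" or b == "-":
            continue  # a gap marks an insertion/deletion, not a substitution
        compared += 1
        if a != b:
            snps.append((pos, a, b))

    identity = 100.0 * (compared - len(snps)) / compared if compared else 0.0
    return snps, identity


# Hypothetical toy alignment: one substitution and one single-base gap.
strain_1 = "ATGGCTAGCT-AA"
strain_2 = "ATGGCTAGTTGAA"

snps, identity = compare_aligned(strain_1, strain_2)
print("substitutions:", snps)
print(f"identity over ungapped columns: {identity:.2f}%")
```

Gapped columns are skipped here so that insertions and deletions, such as the single-base frameshifts mentioned above, are tallied separately from substitutions.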
Streptococcus pneumoniae

Streptococcus pneumoniae causes pneumonia, bacteremia, meningitis and otitis media, and is responsible for the deaths of >3 million children every year. Recently, the complete genome sequences for a virulent, capsular type 4 isolate of S. pneumoniae (TIGR4) (Ref. 35) and the avirulent R6 strain (Ref. 36) were determined. Sequence analysis of the TIGR4 strain has already provided important clues into S. pneumoniae pathogenesis (Ref. 37) and comparative analysis of the two genomes should result in improved understanding of S. pneumoniae virulence. DNA microarray hybridizations were carried out to compare the TIGR4 strain with the R6 non-capsulated laboratory strain and strain D39, a serotype 2 capsulated isolate (Ref. 35). Nine chromosomal regions were missing in strains R6 and D39 compared with TIGR4. Moreover, six of these nine regions contained an atypical GC content, indicating that they were horizontally acquired. Within these regions were many genes encoding proteins that are surface-exposed and/or related to pathogenesis, including the capsule biosynthesis locus, a gene cluster encoding a cell-wall surface anchor protein, and genes encoding a putative macrolide efflux protein, a V-type ATPase and an IgA1 protease. Clearly, these genetic differences could contribute to strain-specific features of pathogenesis or antigenicity.

Implications of comparative genomics for diagnostics and therapeutics

The ability to determine differences in gene content between strains of a bacterial species has important consequences for pathogen identification and disease prevention and therapy. Comparative genomic approaches are identifying novel targets for improved diagnostic and therapeutic procedures. For example, in the DNA microarray study of BCG strains by Behr et al. (Ref. 10), many ORFs were identified that were present in M. tuberculosis but absent in M. bovis. The products of these ORFs are useful targets for diagnostic discrimination between individuals infected with M.
tuberculosis and those who have been vaccinated with the M. bovis BCG vaccine. This is currently not possible using the traditional tuberculin skin test. Musser et al. (Ref. 9) showed that there is negligible diversity in human immune-system protein targets worldwide. This indicates that potential drug targets could also be largely conserved and provides hope that a broadly effective therapy could be identified to control a disease which causes several million deaths globally each year.

The recent comparative genome analysis of E. coli O157:H7 and MG1655 identified many strain-specific and candidate virulence factor genes. These findings could provide the basis for sensitive diagnostic methods to identify virulent strains of E. coli O157:H7 in contaminated food products. In addition, DNA microarray studies of S. aureus and H. pylori have revealed the gene complement common to all strains examined, including virulence factor genes present in all strains. Some of these gene products could be targets for therapeutics effective against all strains of a species.

The genome sequences for both a serogroup A and a serogroup B strain of Neisseria meningitidis have been completed (Refs 38,39). A whole-genome approach was used successfully to identify vaccine candidates for prevention of N. meningitidis serogroup B infection (Ref. 40). After genome sequence analysis, 350 proteins were expressed in E. coli and used to immunize mice. Comparative sequence analysis identified proteins that were conserved in all strains, and those that elicited a bactericidal antibody response were selected as potential vaccine candidates. This first study of its kind demonstrates the benefits of a genome-scale approach to vaccine development and a similar approach could be used to design novel vaccines against other bacterial pathogens.

Concluding comments and future directions

Overall, comparative genomics has demonstrated that bacterial pathogens such as E. coli, H. pylori and S. aureus contain extensive genome diversity, and multiple molecular mechanisms of horizontal gene transfer and recombination have contributed to the variation in different species. The many strain-specific genes identified could allow colonization of specialized host and environmental niches and could help explain the versatility of pathogens, such as S. aureus, that cause many different disease types in multiple host species. By contrast, M. tuberculosis complex members have very limited nucleotide divergence and variation in gene content has occurred through deletion and movement of insertion elements.

This article discusses recent findings from only a few selected human pathogens. However, it is clear that comparative genomic approaches are extremely powerful tools for understanding the evolution of microbial pathogens. In particular, whole-genome DNA microarrays allow rapid analysis of gene content of large numbers of strains and provide an excellent framework for the analysis of pathogenesis, and host- and disease specificity. In the near future such methods will be applied to all major human pathogens. This will undoubtedly lead to increased understanding of bacterial evolution and pathogenesis and, importantly, should lead to improved diagnostics and the development of novel therapeutic strategies.

Questions for future research
• What is the extent of genome diversity within pathogenic bacterial species?
• How has lateral gene transfer contributed to the evolution of pathogenic bacteria?
• How does the genetic variation between strains result in differences in host-specificity, tissue tropism and virulence?
• How can comparative genomics be used to develop better therapeutics and vaccines?

Acknowledgements
We thank S. Reid and M. Chaussee for critical review of the manuscript.

References
1 Blattner, F.R. et al. (1997) The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1474
2 Perna, N.T. et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409, 529–533
3 Perna, N.T. et al. (1998) Molecular evolution of a pathogenicity island from enterohemorrhagic Escherichia coli O157:H7. Infect. Immun. 66, 3810–3817
4 Reid, S.D. et al. (2000) Parallel evolution of virulence in pathogenic Escherichia coli. Nature 406, 64–67
5 Ochman, H. and Jones, I.B. (2000) Evolutionary dynamics of full genome content in Escherichia coli. EMBO J. 19, 6637–6643
6 Cole, S.T. et al. (1998) Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393, 537–544
7 Cole, S.T. et al. (2001) Massive gene decay in the leprosy bacillus. Nature 409, 1007–1011
8 Sreevatsan, S. et al. (1997) Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proc. Natl. Acad. Sci. U. S. A. 94, 9869–9874
9 Musser, J.M. et al. (2000) Negligible genetic diversity of Mycobacterium tuberculosis host immune system protein targets: evidence of limited selective pressure. Genetics 155, 7–16
10 Behr, M.A. et al. (1999) Comparative genomics of BCG vaccines by whole-genome DNA microarray. Science 284, 1520–1523
11 Domenech, P. et al. (2001) Mycobacterium tuberculosis in the post-genomic age. Curr. Opin. Microbiol. 4, 28–34
12 Kato-Maeda, M. et al. (2001) Comparing genomes within the species Mycobacterium tuberculosis. Genome Res. 11, 547–554
13 Alm, R.A. and Trust, T.J. (1999) Analysis of the genetic diversity of Helicobacter pylori: the tale of two genomes. J. Mol. Med. 77, 834–846
14 Tomb, J.F. et al. (1997) The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388, 539–547
15 Alm, R.A. et al. (1999) Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature 397, 176–180
16 Marshall, D.G. et al. (1998) Helicobacter pylori – a conundrum of genetic diversity. Microbiology 144, 2925–2939
17 Go, M.F. et al. (1996) Population genetic analysis of Helicobacter pylori by multilocus enzyme electrophoresis: extensive allelic diversity and recombinational population structure. J. Bacteriol. 178, 3934–3938
18 Lin, L.F. et al. (2001) Comparative genomics of the restriction-modification systems in Helicobacter pylori. Proc. Natl. Acad. Sci. U. S. A. 98, 2740–2745
19 Salama, N. et al. (2000) A whole-genome microarray reveals genetic diversity among Helicobacter pylori strains. Proc. Natl. Acad. Sci. U. S. A. 97, 14668–14673
20 Israel, D.A. et al. (2001) Helicobacter pylori strain-specific differences in genetic content, identified by microarray, influence host inflammatory responses. J. Clin. Invest. 107, 611–620
21 Kuroda, M. et al. (2001) Whole genome sequencing of meticillin-resistant Staphylococcus aureus. Lancet 357, 1225–1240
22 Musser, J.M. et al. (1990) A single clone of Staphylococcus aureus causes the majority of cases of toxic shock syndrome. Proc. Natl. Acad. Sci. U. S. A. 87, 225–229
23 Musser, J.M. and Selander, R.K. (1990) Genetic analysis of natural populations of Staphylococcus aureus. In Molecular Biology of the Staphylococci (Novick, R.P., ed.), pp. 59–67, Wiley-VCH
24 Enright, M.C. et al. (2000) Multilocus sequence typing for characterization of methicillin-resistant and methicillin-susceptible clones of Staphylococcus aureus. J. Clin. Microbiol. 38, 1008–1015
25 Fitzgerald, J.R. et al. (2001) Evolutionary genomics of Staphylococcus aureus: insights into the origin of methicillin-resistant strains and the toxic shock syndrome epidemic. Proc. Natl. Acad. Sci. U. S. A. 98, 8821–8826
26 Kreiswirth, B. et al. (1993) Evidence for a clonal origin of methicillin resistance in Staphylococcus aureus. Science 259, 227–230
27 Musser, J.M. and Krause, R.M. (1998) The revival of group A streptococcal diseases, with a commentary on staphylococcal toxic shock syndrome. In Emerging Infections (Krause, R.M., ed.), pp. 185–218, Academic Press
28 Ferretti, J.J. et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc. Natl. Acad. Sci. U. S. A. 98, 4658–4663
29 Reid, S.D. et al. (2001) Multilocus analysis of extracellular putative virulence proteins made by group A Streptococcus: population genetics, human serologic response, and gene transcription. Proc. Natl. Acad. Sci. U. S. A. 98, 7552–7557
30 Read, T.D. et al. (2000) Genome sequences of Chlamydia trachomatis MoPn and Chlamydia pneumoniae AR39. Nucleic Acids Res. 28, 1397–1406
31 Stephens, R.S. et al. (1998) Genome sequence of an obligate intracellular pathogen of humans: Chlamydia trachomatis. Science 282, 754–759
32 Belland, R.J. et al. Chlamydia trachomatis cytotoxicity associated with complete and partial cytotoxin genes. Proc. Natl. Acad. Sci. U. S. A. (in press)
33 Shirai, M. et al. (2000) Comparison of whole genome sequences of Chlamydia pneumoniae J138 from Japan and CWL029 from USA. Nucleic Acids Res. 28, 2311–2314
34 Kalman, S. et al. (1999) Comparative genomes of Chlamydia pneumoniae and C. trachomatis. Nat. Genet. 21, 385–389
35 Tettelin, H. et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293, 498–506
36 Hoskins, J. et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J. Bacteriol. 19, 5709–5717
37 Musser, J.M. and Kaplan, S.L. Pneumococcal research transformed. New Engl. J. Med. (in press)
38 Parkhill, J. et al. (2000) Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491. Nature 404, 502–506
39 Tettelin, H. et al. (2000) Complete genome sequence of Neisseria meningitidis serogroup B strain MC58. Science 287, 1809–1815
40 Pizza, M. et al.
(2000) Identification of vaccine candidates against serogroup B meningococcus by whole-genome sequencing. Science 287, 1816–1820

Desiccation tolerance: a simple process?

Malcolm Potts

Water is essential for life, and thus the removal of water from a cell is a severe, often lethal stress. This is not a remarkable observation but it is one that is often taken for granted. Desiccation-tolerant cells implement structural, physiological and molecular mechanisms to survive severe water deficit. These mechanisms, and the components and pathways which facilitate them, are poorly understood. Here, recent developments are considered to illustrate the importance of desiccation, longevity and cell stasis in basic microbiology, and the relevance of the topic to the metabolic engineering of sensitive cells, including those of humans.

Malcolm Potts
Virginia Tech Center for Genomics, W. Campus Drive, Virginia Tech, Blacksburg, VA 24061, USA. e-mail: [email protected]

0966-842X/01/$ – see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0966-842X(01)02231-4

Paracelsus (c. 1500) was perhaps the first to engage in the study of desiccation phenomena [1] but it is Antoni van Leeuwenhoek who tends to be remembered for his revival of dried ‘animalcules’ (rotifers) upon rehydration. The tercentenary of his first published observations will be celebrated in 2002. Over the past 300 years, the phenomenon of desiccation tolerance has received comparatively little attention. David Keilin [2] first introduced the term anabiosis (also known as cryptobiosis, hidden life [3]) to describe the unusual state of biological organization where cells cease metabolism but remain viable in a state of ‘suspended animation’. In this account, desiccation tolerance (also referred to as anhydrobiosis [4]) is considered in the context of a state of suspended metabolism (stasis) induced by the removal of cell water.

The salient features of desiccation tolerance are few: a complete arrest of cellular metabolism, followed by time spent in a state of suspended animation and then subsequent recovery of metabolic functions. Dry, desiccate, rehydrate; a simple process? In fact, few organisms can withstand these complex phase changes. To understand how they do so one must deal with controversial issues surrounding cell age, longevity, the structural and biochemical properties of anhydrous cytoplasm, and metabolic stasis. An earlier review provided a critical appraisal of this subject as well as an introduction to some of the relevant biophysical principles [5].

Dry-down phase

A cell that is sensitive to water deficit becomes so at some point(s) during the phase(s) of drying, desiccation or rehydration. The timing of the onset of this sensitivity, and the reason(s) behind its acquisition, remain cryptic. It is unclear whether all sensitive cells show a uniform response (same timing, cause and effect) or whether the dysfunction differs from cell type to cell type. In this regard, it is worth considering a new concept: the viable-but-nonculturable (VBNC) phenotype. Cells become VBNC upon exposure to different stresses [6]. Some features of this phenomenon are reminiscent of aspects of desiccation tolerance. For example, strains of Listeria monocytogenes isolated from biofilms differ in their capacity to enter the VBNC state, which has been attributed to differences in their extracellular polysaccharides [7], and recovery of Aeromonas hydrophila from the VBNC state is enhanced by the presence of H2O2-degrading agents, including catalase [8]. There are specific modifications of the cell wall when Enterococcus faecalis cells become