Review
TRENDS in Microbiology Vol.9 No.11 November 2001
547
Evolutionary genomics of
pathogenic bacteria
J. Ross Fitzgerald and James M. Musser
Complete genome sequences are now available for multiple strains of several
bacterial pathogens and comparative analysis of these sequences is providing
important insights into the evolution of bacterial virulence. Recently, DNA
microarray analysis of many strains of several pathogenic species has contributed
to our understanding of bacterial diversity, evolution and pathogenesis.
Comparative genomics has shown that pathogens such as Escherichia coli,
Helicobacter pylori and Staphylococcus aureus contain extensive variation in gene
content whereas Mycobacterium tuberculosis nucleotide divergence is very
limited. Overall, these approaches are proving to be a powerful means of exploring
bacterial diversity, and are providing an important framework for the analysis of
the evolution of pathogenesis and the development of novel antimicrobial agents.
Comparative genomics is a rapidly advancing
discipline that is currently being energized by the
availability of genome sequences for multiple strains
of pathogenic bacterial species and by the advent of
DNA microarray technology. Here, we review very
recent findings from comparative genomic studies of
selected important human pathogens, and discuss the
implications of these studies for our understanding of
bacterial evolution and the development of improved
disease therapeutics.
J. Ross Fitzgerald
James M. Musser*
Laboratory of Human Bacterial Pathogenesis, Rocky Mountain Laboratories, National Institute of Allergy and Infectious Diseases, National Institutes of Health, 903 South 4th St, Hamilton, MT 59840, USA.
*e-mail: [email protected]
Escherichia coli
The non-pathogenic Escherichia coli strain K12 has
frequently been used as a model system to develop our
understanding of bacterial metabolism and growth.
Although most strains of E. coli are non-pathogenic
and usually exist as commensals, enterohemorrhagic
E. coli O157:H7 (EHEC) causes hemorrhagic colitis,
which can be associated with the often fatal hemolytic
uremic syndrome, and enteropathogenic E. coli
(EPEC) is an important cause of diarrhea in the
developing world.
A comparison between the genome sequences of
E. coli K12 (Ref. 1) and E. coli O157:H7 (Ref. 2) identified
many new genes that are likely to play an important
role in the virulence of this food-borne pathogen. Perna
et al. (Ref. 2) showed that although 4.1 Mb of sequence is
shared between these strains, the intra-species
diversity is extensive: E. coli O157:H7 contains 1.34 Mb
of strain-specific sequence (1387 genes) compared
with 0.53 Mb (528 genes) of unique sequence in the
K12 strain MG1655. Most of these genes are localized
in strain-specific ‘islands’ in which the base composition
is atypical, suggesting that these sequences were
obtained from bacterial donor species with a different
base composition by relatively recent horizontal
transfer events. The O157:H7-specific islands encode
many known and candidate virulence factors. Several
genes encoding toxins, including a macrophage toxin,
an RTX-toxin-like exoprotein and transport system,
and two urease gene clusters were identified in the
larger islands, which are >15 kb in size. Two fatty-acid
biosynthesis systems, an adhesin gene and the locus
of enterocyte effacement, which has been shown to
be involved in virulence3, were also identified. In
addition, putative fimbrial and non-fimbrial adhesin
genes were present in the smaller chromosomal
islands. Molecular analysis of the proteins encoded by
these genes should allow additional mechanisms of
disease pathogenesis to be elucidated.
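The link between atypical base composition and horizontally acquired islands can be illustrated with a sliding-window G+C scan. The sketch below is a minimal illustration and not the method used by Perna and colleagues: the function names, the window size and the two-standard-deviation cutoff are all assumptions chosen for the example.

```python
from statistics import mean, pstdev

def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA string."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def atypical_windows(genome, window=1000, step=1000, z_cutoff=2.0):
    """Return (start, gc) for windows whose G+C content deviates from the
    mean over all windows by more than z_cutoff standard deviations."""
    starts = range(0, len(genome) - window + 1, step)
    values = [(s, gc_fraction(genome[s:s + window])) for s in starts]
    if not values:
        return []  # genome shorter than one window
    gcs = [gc for _, gc in values]
    mu, sigma = mean(gcs), pstdev(gcs)
    if sigma == 0:
        return []  # perfectly uniform composition, nothing stands out
    return [(s, gc) for s, gc in values if abs(gc - mu) / sigma > z_cutoff]

# Toy sequence (invented): nine 50% G+C windows followed by one A+T-only window.
toy_genome = "GCATGCAT" * 9 + "ATATATAT"
print(atypical_windows(toy_genome, window=8, step=8))
```

A real analysis would also consider codon usage and dinucleotide frequencies; window-level G+C alone is only a first-pass indicator.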
The identification of many horizontally
transferred regions raises questions regarding the
evolution of contemporary clones of E. coli. To
investigate the evolution of pathogenic E. coli, Reid
and colleagues analyzed the sequence of seven genes
in 20 pathogenic strains and the non-pathogenic
strain K12 and constructed a phylogenetic tree (Ref. 4). The
results provide evidence that E. coli O157:H7 diverged
from a common ancestor of E. coli K12 as much as
4.5 million years ago (Ref. 4). The data also indicate that in
some E. coli clones, virulence is a recently evolved
trait resulting from the horizontal transfer of
virulence genes. Moreover, distinct evolutionary
lineages of E. coli appear to have acquired some of the
same virulence factors independently.
DNA microarray analysis has also been used to
study the evolutionary genomics of E. coli5. Ochman
and Jones5 analyzed the distribution of the 4290 open
reading frames (ORFs) of the sequenced E. coli K12
MG1655 strain in five strains of known evolutionary
relationships. The amount of unique DNA present in
each strain was predicted based on genome size and
the variation in gene content, and sequence analysis
determined the relative age of every gene in the
MG1655 chromosome. They found that 3782 ORFs
were common to all strains of E. coli examined. The
present distribution of all 4290 genes in MG1655
could be accounted for by a total of 67 molecular
events, including 37 insertions and 30 deletions
(Fig. 1). This precise prediction of the molecular
events contributing to the evolution of E. coli clones
demonstrates the power of whole-genome DNA
microarrays to examine the diversity and evolution of
bacterial species on a scale never before possible.
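Once each strain has a list of ORFs scored as present on the array, the gene-content comparison described above reduces to set operations. The following sketch assumes a hypothetical dictionary mapping strain names to sets of detected ORF identifiers; the toy identifiers are invented for illustration and are not data from Ochman and Jones. The last two lines simply check the figures quoted in the text.

```python
def core_and_variable(presence):
    """Split genes into those present in every strain (core) and the rest.

    presence: dict mapping strain name -> set of ORF identifiers detected.
    Assumes at least one strain is supplied.
    """
    strains = list(presence.values())
    all_genes = set().union(*strains)
    core = set.intersection(*strains)
    return core, all_genes - core

def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)

# Toy input (hypothetical ORF names).
presence = {
    "MG1655": {"orfA", "orfB", "orfC", "orfD"},
    "W3110": {"orfA", "orfB", "orfC", "orfD"},
    "ECOR21": {"orfA", "orfB", "orfC"},
    "ECOR37": {"orfA", "orfB"},
}
core, variable = core_and_variable(presence)
print(sorted(core), sorted(variable))

# Figures quoted in the text: 3782 of 4290 MG1655 ORFs in all strains,
# and 37 insertions plus 30 deletions.
print(percent(3782, 4290))
print(37 + 30)
```

Inferring which differences are insertions and which are deletions additionally requires the strain phylogeny, which this sketch does not model.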
0966-842X/01/$ – see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0966-842X(01)02228-4
Fig. 1. Distribution of deletions among strains of Escherichia coli.
Chromosomes are represented linearly and drawn to equal length to
show the locations of missing open reading frames (ORFs) relative to
their map positions in MG1655. Actual chromosome sizes, as
determined by pulsed field gel electrophoresis, are shown on the right.
Black lines portray chromosomal positions where the corresponding
MG1655 ORFs are lacking in a laboratory (W3110) or natural (ECOR)
isolate of E. coli. The relative thickness of these lines denotes the
amount of missing DNA. Blue bands show the positions and sizes of
regions in the MG1655 chromosome deduced to be horizontally
acquired, based on sequence features, and the orange bands represent
the positions and sizes of known prophage in the MG1655
chromosome. Upper-case letters designate major phylogenetic
subgroups within E. coli. Roman numerals indicate ancestral lineages
of E. coli. Modified with permission from Ref. 5.
(Strains shown: MG1655, W3110, ECOR 21, ECOR 40 and ECOR 37; horizontal axis: map position in MG1655, 1.0 to 4.0 Mb; genome sizes shown range from 4.5 to 5.5 Mb.)
Mycobacterium tuberculosis
Mycobacterium tuberculosis continues to be the
leading bacterial killer in the world, with 3–4 million
human deaths annually as a result of tuberculosis
(TB). The M. tuberculosis H37Rv genome sequence
was published in 1998 (Ref. 6). Although originally cultured
from a patient with TB, this strain has been passaged
in vitro for approximately 100 years. Importantly,
since the publication of the genome sequence of strain
H37Rv, a recent clinical isolate has been sequenced
(http://www.tigr.org), although not yet described in a
publication. Comparison of these completed genomes
should provide a framework to begin elucidating
the mechanisms involved in virulence. In addition,
the Mycobacterium leprae genome sequence has
been completed7 and the genomes of strains of
Mycobacterium bovis, Mycobacterium avium and
Mycobacterium smegmatis are being sequenced.
These sequence data will provide a wealth of
important information for the analysis of evolution
and virulence of the M. tuberculosis complex.
The members of the M. tuberculosis complex are
extremely closely related at the nucleotide level8.
Sreevatsan and colleagues performed comparative
sequencing of 26 structural genes in >800 isolates
of members of the M. tuberculosis complex. They
showed that the level of synonymous (silent)
nucleotide variation in structural genes of
M. tuberculosis complex members is greatly restricted
and considerably less than that present in other
pathogens. The data indicate that M. tuberculosis has
undergone a recent evolutionary bottleneck at the
time of speciation, which is predicted to have occurred
somewhere in the order of 15 000–20 000 years ago.
Accordingly, M. tuberculosis must have disseminated
worldwide very recently in evolutionary time. In
another study, Musser et al.9 sequenced 24 genes
encoding targets of the human immune system in a
sample of 16 isolates recovered from different global
sources. Surprisingly, 19 of the 24 genes lacked
nucleotide sequence diversity. Only six polymorphic
sites were identified in the other five genes,
confirming that the variation in M. tuberculosis
structural genes is negligible, even in genes encoding
proteins that interact with the host immune system.
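Surveys of this kind rest on counting variable positions in aligned gene sequences and asking whether each change is silent. The sketch below is one minimal way to do that at the codon level, assuming in-frame, gap-free sequences of equal length; the function name and the toy sequences are invented for illustration and are not data from the studies cited. It counts variable codons rather than individual nucleotide sites, which is a simplification.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def polymorphic_codons(aligned):
    """Classify variable codon positions in aligned coding sequences.

    Returns (synonymous, nonsynonymous) lists of zero-based codon indices.
    A codon position counts as synonymous when every observed codon
    encodes the same amino acid.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned) or length % 3:
        raise ValueError("sequences must be equal length and a multiple of 3")
    syn, nonsyn = [], []
    for i in range(0, length, 3):
        codons = {s[i:i + 3].upper() for s in aligned}
        if len(codons) == 1:
            continue  # invariant codon
        amino_acids = {CODON_TABLE[c] for c in codons}
        (syn if len(amino_acids) == 1 else nonsyn).append(i // 3)
    return syn, nonsyn

# Toy alignment (invented): codon index 1 varies silently (CTT/CTC, both Leu),
# codon index 3 changes the amino acid (GAT Asp versus GGT Gly).
isolates = ["ATGCTTAAAGAT", "ATGCTCAAAGAT", "ATGCTTAAAGGT"]
print(polymorphic_codons(isolates))
```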
The original vaccine against TB (M. bovis bacille
Calmette–Guérin; BCG) was developed by Calmette
and Guérin by passaging a strain of M. bovis 230 times
between 1908 and 1921. However, because of the
requirement for continuous passage of derivative
strains, by the time lyophilized seed stocks were made
in the 1960s, different daughter strains had undergone
up to 1000 additional passages. This has resulted in
many phenotypically different strains, which are
thought to vary considerably in their ability to induce
protection against TB. In spite of this, the BCG vaccine
is still the world’s most widely used vaccine. To
investigate the genetic basis for the variation in the
efficacy of M. bovis-derived BCG strains, Behr et al.10
used a whole-genome DNA microarray to analyze
12 such strains used for vaccination in different global
locations. A total of 16 deleted regions were identified
among the strains examined. Behr and colleagues
were able to construct a historical genealogy of these
daughter strains that predicted their relationship to
each other and identified when the deletions took place.
One of these regions (RD1) was absent in all BCG
strains tested but present in all other M. tuberculosis
complex members, and might be responsible for the
original attenuation described by Calmette11. This
study could have important implications for the design
of a more effective vaccine for protection against TB.
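The reasoning that singles out a region such as RD1 (missing from every BCG strain, present in every other complex member) can be written as a set comparison. This is only an illustrative sketch: the function name, the strain labels and all region labels other than RD1 are hypothetical, and the presence calls are invented.

```python
def regions_lost_in_all(vaccine_strains, reference_strains):
    """Return regions missing from every vaccine strain but present in
    every reference strain.

    Both arguments map strain name -> set of regions detected as present.
    """
    present_in_any_vaccine = set().union(*vaccine_strains.values())
    present_in_all_refs = set.intersection(*reference_strains.values())
    return present_in_all_refs - present_in_any_vaccine

# Hypothetical presence calls; only the label RD1 comes from the text.
bcg = {
    "BCG-a": {"regionX", "regionY"},
    "BCG-b": {"regionX"},
}
others = {
    "M. bovis": {"RD1", "regionX", "regionY"},
    "M. tuberculosis": {"RD1", "regionX", "regionY", "regionZ"},
}
print(regions_lost_in_all(bcg, others))
```

Regions lost in only some daughter strains (regionY in this toy input) are the ones that carry genealogical information, because they mark deletions that occurred after the strains diverged.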
Very recently, a whole-genome microarray
approach has also been used for epidemiological
analysis of 19 M. tuberculosis isolates12. Twenty-five
deleted chromosomal regions were identified among
these isolates, and an inverse correlation was
observed between the percentage of the genome
deleted from each clone and the percentage of
patients infected with that clone who had pulmonary
cavitations. The authors interpreted these data
to indicate that accumulation of deletions in
M. tuberculosis results in diminished virulence.
Chromosomal deletions in M. tuberculosis complex
members appear to occur as the result of homologous
recombination between copies of IS6110 that flank
chromosomal regions in the direct orientation.
Taken together, the data indicate that single-nucleotide polymorphisms are very rare and deletion
events are an important source of genome variation
in the M. tuberculosis complex. Moreover, horizontal
gene transfer has been very limited within the
M. tuberculosis complex.
Helicobacter pylori
One-half of the world’s population is predicted to be
infected by Helicobacter pylori, although <10% of
these individuals ever present with clinical disease.
H. pylori is the cause of superficial and chronic
gastritis, duodenal ulcers and many gastric ulcers,
and has also been strongly linked to gastric cancer
and mucosa-associated lymphoid tissue lymphoma13.
H. pylori was the first bacterial species for which
genome sequences were completed for two strains
(strains J99 and 26695)14,15. This has proved to be an
important resource for analysis of H. pylori diversity
and provides a useful framework to study pathogenesis
and disease specificity. Many different molecular-typing techniques have demonstrated the remarkable
genetic diversity of this important gastric pathogen16,17
and a high degree of sequence variation is observed
between J99 and 26695 (Ref. 15). Each genome
contains strain-specific genes (89 genes in J99, and
117 genes in 26695), and most of these genes are located
at similar chromosomal locations in both strains. The
so-called plasticity zone contains more than half of the
strain-specific genes, including many of the numerous
restriction and modification genes typical of H. pylori.
Although not all functional18, the presence of large
numbers of restriction-modification systems in
H. pylori strains could represent a mechanism to
promote homologous recombination of short DNA
fragments. Comparison of the two genomes indicates
that some genes for DNA mismatch repair are missing,
indicating that this might be a contributing factor
to increased nucleotide divergence. It has been
speculated that the isolated ecological niche inhabited
by H. pylori might have promoted the development of
mechanisms for DNA uptake and exchange which, in
turn, have resulted in extensive intra-species diversity.
A whole-genome DNA microarray specific for both
sequenced H. pylori strains has been used to analyze
chromosomal diversity in H. pylori. Salama and
colleagues determined the chromosomal gene content
of 15 H. pylori isolates, including strains J99 and
26695 (Ref. 19). Of the 1643 genes represented on
the array, 1281 were common to all strains whereas
22% of genes were strain-specific and could encode
proteins involved in niche adaptation. Many of the
strain-specific genes were present in either the 40 kb
pathogenicity island or the plasticity zone. However,
two-thirds of all other strain-specific genes were
distributed elsewhere in the chromosome in smaller
tracts containing between one and eight genes. This
indicates that H. pylori uses multiple mechanisms of
gene acquisition and loss, which have contributed to
its evolution. Importantly, the study identified new
candidate virulence genes that could play a role in
disease pathogenesis.
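The observation that many strain-specific genes lie in short tracts of one to eight genes depends on grouping adjacent variable genes along the chromosome. A minimal sketch of that grouping step follows; the function name, gene identifiers and calls are hypothetical, and the chromosome is treated as linear for simplicity. The final line checks the percentage quoted in the text.

```python
from itertools import groupby

def strain_specific_tracts(gene_order, strain_specific):
    """Group consecutive strain-specific genes into tracts.

    gene_order: gene identifiers in chromosomal order.
    strain_specific: set of identifiers absent from at least one strain.
    Returns a list of tracts, each a list of adjacent gene identifiers.
    """
    tracts = []
    for is_specific, run in groupby(gene_order, key=lambda g: g in strain_specific):
        if is_specific:
            tracts.append(list(run))
    return tracts

# Hypothetical gene order and calls, for illustration only.
order = ["g1", "g2", "g3", "g4", "g5", "g6", "g7"]
variable = {"g2", "g3", "g6"}
tracts = strain_specific_tracts(order, variable)
print(tracts)
print([len(t) for t in tracts])

# Figures quoted in the text: 1281 of 1643 arrayed genes were common to all strains.
print(round(100 * (1643 - 1281) / 1643))
```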
A similar DNA microarray was used to identify
differences in gene content between two H. pylori
isolates from gastric and duodenal ulcers that
differed in virulence in a gerbil model of gastritis20.
Israel and co-workers identified several strain-specific differences including the presence of the cag
pathogenicity island in one strain. Subsequent
disruption of the cag locus resulted in reduced gastric
inflammation. This study demonstrates the utility of
whole-genome microarrays for identifying differences
in gene content between strains of different
pathogenic potential.
Staphylococcus aureus
Staphylococcus aureus causes a wide variety of
infections including several that are life threatening.
Moreover, this organism is one of the leading causes
of nosocomial infection. Treatment of infections
caused by S. aureus is complicated by the resistance
of many strains to the antibiotic of choice,
methicillin. The genome sequences of a methicillin-resistant strain (N315) and a clonally related
vancomycin-resistant strain (Mu50) have recently
been published21. In addition, four S. aureus genome
sequencing projects are nearing completion. Two
of the strains being sequenced are methicillin-resistant strains, COL (http://www.tigr.org) and
252 (http://www.sanger.ac.uk). The others are a
laboratory strain, 8325 (http://www.genome.ou.edu),
and a community-acquired infection strain,
476 (http://www.sanger.ac.uk).
Several population genetic studies of S. aureus
have used multilocus enzyme electrophoresis and
multilocus sequencing to analyze genetic diversity
within the species and have provided insights into
host- and disease specificity22–24. However, these
studies have indexed variation at only a limited
number of chromosomal loci. Recently, we constructed
a whole-genome DNA microarray specific for S. aureus
strain COL, and analyzed the chromosomal gene
content of 36 S. aureus strains representing the most
abundant clonal lineages within the species25. Strains
from well-defined human clinical conditions were
selected for analysis, including toxic shock syndrome
(TSS) isolates and methicillin-resistant strains, and
strains isolated from bovine and ovine intramammary
infections. We found that 78% of genes were common to
all strains examined, identifying a gene complement
likely to be important for general cell maintenance and
growth. Conversely, 22% of genes were strain-specific
and might play a role in adaptation to specialized
niches, or other contingency functions. Eighteen large
chromosomal regions of difference (RD) of between
3 and 50 kb in size were identified, and these were
variably present among the strains examined (Fig. 2).
Several of these RDs contained extensive variation in
gene content and size, suggesting that they are large
chromosomal loci with elevated recombination activity.
Many RDs contained genes encoding gene-mobility
proteins such as integrases or transposases, and genes
that encode virulence determinants or proteins
mediating antibiotic resistance. The data indicate
that horizontal gene transfer and recombination
have played a fundamental role in the evolution of
pathogenic S. aureus.
The study also provided important insight into
the evolution of methicillin-resistant strains. The
data indicated that the mec element encoding the
methicillin-resistance phenotype was acquired
multiple times by horizontal gene transfer into
different S. aureus clones. This finding rules out the
single progenitor theory of the origin of methicillin-resistant
S. aureus (Ref. 26). In addition, the study
demonstrated that, although female urogenital TSS
isolates are related, they do not share a very recent
ancestor. This finding provides convincing evidence
that the TSS epidemic of the 1970s and 1980s was
caused by a change in the host (use of a new superabsorbent
tampon) rather than by rapid global
dissemination of a hypervirulent clone. Overall, this
comparative genomics study provided important
insights into genetic variation in natural populations
of S. aureus and demonstrated the power of DNA
microarray technology to address issues of evolution
and pathogenesis definitively.
Fig. 2. The presence or absence of large chromosomal regions of difference (RDs) in
36 Staphylococcus aureus strains. A square symbol denotes the presence of an RD and an empty space
its absence. Hatched squares indicate presence of RDs in methicillin-resistant S. aureus (MRSA)
strains. Red signifies strains of the predominant clone associated with female urogenital toxic shock
syndrome. Reproduced from Ref. 25.
(Rows: strain COL, strain RF122 and the MSA-designated isolates; columns: RD1 to RD18.)
Group A streptococci
Group A streptococci (GAS) are responsible for a wide
range of human infections, including pharyngitis,
skin infections, sepsis, osteomyelitis, TSS and
necrotizing fasciitis27. The complete genome
sequence for a GAS M1 serotype strain has recently
been published28, and other M-type strains are
currently being sequenced at the Sanger Centre
(http://www.sanger.ac.uk) and the Laboratory of
Human Bacterial Pathogenesis, Rocky Mountain
Laboratories (Hamilton, MT, USA). Comparison of
these genomes is already providing the basis for an
improved understanding of GAS pathogenesis.
Reid and colleagues29 recently examined four GAS
genomes and identified 11 genes present in all strains
examined encoding previously uncharacterized
extracellular putative virulence factors. Sequence
analysis of the 11 genes in 37 diverse strains found
that recombination has contributed substantially to
chromosomal diversity, and western blot analysis
with sera from infected patients confirmed that these
proteins were antigenic. Moreover, transcription of
many genes was influenced by the covR and mga
trans-acting regulatory loci. Currently, these proteins
are being examined for their role in virulence, and
their potential use in vaccines. In addition, a GAS
genome-specific microarray is being used to explore
variation within and between M protein serotypes of
GAS. Preliminary data indicate that horizontal
transfer by phage transduction has been a major
source of intra-species diversity.
Chlamydia trachomatis
Chlamydia trachomatis and Chlamydia pneumoniae
are obligate intracellular human bacterial pathogens
with markedly different tissue tropism and disease
manifestation. The genetic basis for these differences
is unknown. The organisms grow only within a
specialized vacuole in the post-Golgi exocytic
vesicular compartment of eukaryotic cells. They
undergo a distinct developmental cycle that
alternates between an extracellular elementary body
(EB) and an intracellular replicating cell, termed the
reticulate body (RB).
C. trachomatis causes trachoma, an eye infection
that leads to blindness, and sexually transmitted
diseases such as pelvic inflammatory disease.
Comparison of the two sequenced C. trachomatis
genomes (MoPn, a mouse pathogen and serovar D, a
human pathogen)30,31 reveals that the genomes are
extremely similar outside of a region referred to as the
plasticity zone. There are several differences at the
plasticity zone of these different host-specific strains
that could influence chlamydial pathogenesis30.
C. trachomatis strain MoPn has a plasticity zone
of ~51 kb that contains the guaAB and adenosine
deaminase (add) genes as a single operon. At the same
position in the ~23 kb plasticity zone in serovar D, a
tryptophan biosynthesis cluster appears to have
replaced the guaAB and add genes. These genes might
encode proteins involved in scavenging nucleotides;
this potential difference could influence the host range
of tissues each strain is capable of infecting.
Another potentially very significant difference
between the plasticity zones in these two strains is the
presence of a 9675-nucleotide gene in strain MoPn
encoding a putative toxin similar to a predicted E. coli
O157:H7 toxin. The amino termini of the derived
proteins contain homology with the amino termini
of large clostridial toxins (LCT), which have been
shown to interfere with eukaryotic cell chemistry.
Accordingly, this could be an important virulence
determinant involved in promoting acute high-level
infection. The serovar D strain contains the toxin gene
but with multiple frameshift mutations. A recent
paper by Belland et al.32 demonstrated that both
strains produced a functional toxin but the MoPn
LCT-like toxin had a higher cytotoxic activity. Also
located in the plasticity zone are the phospholipase
D-endonuclease (PLD) genes, with four paralogs in
serovar D and five in MoPn. The role of the protein
products of these genes is unknown. Overall, the
plasticity zone is the site of most differences between
these two strains, indicating that it could be a ‘hot
spot’ for recombination. The genetic variation at this
chromosomal location provides important clues as to
strain differences in host specificity and pathogenesis.
Chlamydia pneumoniae
Chlamydia pneumoniae causes pneumonia and
bronchitis and is frequently associated with complex
chronic diseases such as atherosclerosis, asthma
and multiple sclerosis. The three C. pneumoniae
genomes (AR39, CWL029 and J138) that have been
sequenced30,33,34 are very similar in gene content and
order, with up to 99.9% identity. There are only small
differences offering targets for strain differentiation.
The genome sequences of strains AR39 and CWL029
were compared by Read et al. (Ref. 30). Only 296 single-nucleotide polymorphisms and 21 single-base
frameshift mutations were identified. Many of the
mutations occurred in intergenic regions and only
161 of 1165 derived proteins are not identical.
Interestingly, strain AR39 contains a 4524-nucleotide
circular bacteriophage that is not found in the other
strains and which is the first C. pneumoniae
bacteriophage to be described. This discovery raises
interesting questions regarding horizontal gene
transfer between obligate intracellular pathogens and
the possibility of a role for the phage in pathogenesis.
Shirai et al.33 determined the genome sequence for
C. pneumoniae J138 from Japan and compared it
to strain CWL029. They observed a high level of
structural and functional conservation between the
two unrelated isolates. Only three chromosomal
segments of between 27 and 84 nucleotides are unique
to the J138 genome whereas five segments of between
89 and 1649 nucleotides are unique to the CWL029
genome. The striking similarity observed among the
C. pneumoniae strains sequenced to date could reflect
the intracellular niche they inhabit, and indicates
recent evolution from a common ancestor.
Streptococcus pneumoniae
Streptococcus pneumoniae causes pneumonia,
bacteremia, meningitis and otitis media, and is
responsible for the deaths of >3 million children every
year. Recently, the complete genome sequences for a
virulent, capsular type 4 isolate of S. pneumoniae
(TIGR4) (Ref. 35) and the avirulent R6 strain (Ref. 36) were
determined. Sequence analysis of the TIGR4 strain
has already provided important clues into
S. pneumoniae pathogenesis37 and comparative
analysis of the two genomes should result in
improved understanding of S. pneumoniae virulence.
DNA microarray hybridizations were carried
out to compare the TIGR4 strain with the R6
non-capsulated laboratory strain and strain D39, a
serotype 2 capsulated isolate (Ref. 35). Nine chromosomal
regions were missing in strains R6 and D39 compared
with TIGR4. Moreover, six of these nine regions
contained an atypical GC content, indicating that
they were horizontally acquired. Within these regions
were many genes encoding proteins that are surface-exposed and/or related to pathogenesis, including the
capsule biosynthesis locus, a gene cluster encoding a
cell-wall surface anchor protein, and genes encoding a
putative macrolide efflux protein, a V-type ATPase
and an IgA1 protease. Clearly, these genetic
differences could contribute to strain-specific features
of pathogenesis or antigenicity.
Implications of comparative genomics for diagnostics
and therapeutics
The ability to determine differences in gene content
between strains of a bacterial species has important
consequences for pathogen identification and disease
prevention and therapy. Comparative genomic
approaches are identifying novel targets for improved
diagnostic and therapeutic procedures. For example,
in the DNA microarray study of BCG strains by
Behr et al.10, many ORFs were identified that were
present in M. tuberculosis but absent in M. bovis.
The products of these ORFs are useful targets for
diagnostic discrimination between individuals
infected with M. tuberculosis and those who have
been vaccinated with the M. bovis BCG vaccine.
This is currently not possible using the traditional
tuberculin skin test. Musser et al.9 showed that there
is negligible diversity in human immune-system
protein targets worldwide. This indicates that
potential drug targets could also be largely conserved
and provides hope that a broadly effective therapy
could be identified to control a disease which causes
several million deaths globally each year.
Questions for future research
• What is the extent of genome diversity within pathogenic bacterial species?
• How has lateral gene transfer contributed to the evolution of pathogenic bacteria?
• How does the genetic variation between strains result in differences in host-specificity, tissue tropism and virulence?
• How can comparative genomics be used to develop better therapeutics and vaccines?
Acknowledgements
We thank S. Reid and M. Chaussee for critical review of the manuscript.
The recent comparative genome analysis of E. coli
O157:H7 and MG1655 identified many strain-specific
and candidate virulence factor genes. These findings
could provide the basis for sensitive diagnostic
methods to identify virulent strains of E. coli
O157:H7 in contaminated food products. In addition,
DNA microarray studies of S. aureus and H. pylori
have revealed the gene complement common to all
strains examined, including virulence factor genes
present in all strains. Some of these gene products
could be targets for therapeutics effective against all
strains of a species.
The genome sequences for both a serogroup A and
a serogroup B strain of Neisseria meningitidis have
been completed38,39. A whole-genome approach was
used successfully to identify vaccine candidates for
prevention of N. meningitidis serogroup B infection40.
After genome sequence analysis, 350 proteins were
expressed in E. coli and used to immunize mice.
Comparative sequence analysis identified proteins
that were conserved in all strains and those that
elicited a bactericidal antibody response were
References
1 Blattner, F.R. et al. (1997) The complete genome
sequence of Escherichia coli K-12. Science
277, 1453–1474
2 Perna, N.T. et al. (2001) Genome sequence of
enterohaemorrhagic Escherichia coli O157:H7.
Nature 409, 529–533
3 Perna, N.T. et al. (1998) Molecular evolution of a
pathogenicity island from enterohemorrhagic
Escherichia coli O157:H7. Infect. Immun.
66, 3810–3817
4 Reid, S.D. et al. (2000) Parallel evolution of
virulence in pathogenic Escherichia coli. Nature
406, 64–67
5 Ochman, H. and Jones, I.B. (2000) Evolutionary
dynamics of full genome content in Escherichia
coli. EMBO J. 19, 6637–6643
6 Cole, S.T. et al. (1998) Deciphering the biology of
Mycobacterium tuberculosis from the complete
genome sequence. Nature 393, 537–544
7 Cole, S.T. et al. (2001) Massive gene decay in the
leprosy bacillus. Nature 409, 1007–1011
8 Sreevatsan, S. et al. (1997) Restricted structural
gene polymorphism in the Mycobacterium
tuberculosis complex indicates evolutionarily
recent global dissemination. Proc. Natl. Acad. Sci.
U. S. A. 94, 9869–9874
selected as potential vaccine candidates. This first
study of its kind demonstrates the benefits of a
genome-scale approach to vaccine development and
a similar approach could be used to design novel
vaccines against other bacterial pathogens.
Concluding comments and future directions
Overall, comparative genomics has demonstrated
that bacterial pathogens such as E. coli, H. pylori and
S. aureus contain extensive genome diversity, and
multiple molecular mechanisms of horizontal gene
transfer and recombination have contributed to the
variation in different species. The many strain-specific
genes identified could allow colonization of specialized
host and environmental niches and could help explain
the versatility of pathogens, such as S. aureus, that
cause many different disease types in multiple host
species. By contrast, M. tuberculosis complex
members have very limited nucleotide divergence and
variation in gene content has occurred through
deletion and movement of insertion elements.
This article discusses recent findings from only a
few selected human pathogens. However, it is clear
that comparative genomic approaches are extremely
powerful tools for understanding the evolution of
microbial pathogens. In particular, whole-genome
DNA microarrays allow rapid analysis of gene content
of large numbers of strains and provide an excellent
framework for the analysis of pathogenesis, and
host- and disease specificity. In the near future
such methods will be applied to all major human
pathogens. This will undoubtedly lead to increased
understanding of bacterial evolution and pathogenesis
and, importantly, should lead to improved diagnostics
and the development of novel therapeutic strategies.
9 Musser, J.M. et al. (2000) Negligible genetic
diversity of Mycobacterium tuberculosis host
immune system protein targets: evidence of
limited selective pressure. Genetics 155, 7–16
10 Behr, M.A. et al. (1999) Comparative genomics of
BCG vaccines by whole-genome DNA microarray.
Science 284, 1520–1523
11 Domenech, P. et al. (2001) Mycobacterium
tuberculosis in the post-genomic age. Curr. Opin.
Microbiol. 4, 28–34
12 Kato-Maeda, M. et al. (2001) Comparing genomes
within the species Mycobacterium tuberculosis.
Genome Res. 11, 547–554
13 Alm, R.A. and Trust, T.J. (1999) Analysis of the
genetic diversity of Helicobacter pylori: the tale of
two genomes. J. Mol. Med. 77, 834–846
14 Tomb, J.F. et al. (1997) The complete genome
sequence of the gastric pathogen Helicobacter
pylori. Nature 388, 539–547
15 Alm, R.A. et al. (1999) Genomic-sequence
comparison of two unrelated isolates of the
human gastric pathogen Helicobacter pylori.
Nature 397, 176–180
16 Marshall, D.G. et al. (1998) Helicobacter pylori – a
conundrum of genetic diversity. Microbiology
144, 2925–2939
17 Go, M.F. et al. (1996) Population genetic analysis
of Helicobacter pylori by multilocus enzyme
electrophoresis: extensive allelic diversity and
recombinational population structure.
J. Bacteriol. 178, 3934–3938
18 Lin, L.F. et al. (2001) Comparative genomics of the
restriction-modification systems in Helicobacter
pylori. Proc. Natl. Acad. Sci. U. S. A.
98, 2740–2745
19 Salama, N. et al. (2000) A whole-genome
microarray reveals genetic diversity among
Helicobacter pylori strains. Proc. Natl. Acad. Sci.
U. S. A. 97, 14668–14673
20 Israel, D.A. et al. (2001) Helicobacter pylori
strain-specific differences in genetic content,
identified by microarray, influence host
inflammatory responses. J. Clin. Invest.
107, 611–620
21 Kuroda, M. et al. (2001) Whole genome
sequencing of meticillin-resistant Staphylococcus
aureus. Lancet 357, 1225–1240
22 Musser, J.M. et al. (1990) A single clone of
Staphylococcus aureus causes the majority of
cases of toxic shock syndrome. Proc. Natl. Acad.
Sci. U. S. A. 87, 225–229
23 Musser, J.M. and Selander, R.K. (1990) Genetic
analysis of natural populations of Staphylococcus
aureus. In Molecular Biology of the Staphylococci
(Novick, R.P., ed.), pp. 59–67, Wiley-VCH
24 Enright, M.C. et al. (2000) Multilocus sequence
typing for characterization of methicillin-resistant
and methicillin-susceptible clones of
Staphylococcus aureus. J. Clin. Microbiol.
38, 1008–1015
25 Fitzgerald, J.R. et al. (2001) Evolutionary
genomics of Staphylococcus aureus: insights into
the origin of methicillin-resistant strains and the
toxic shock syndrome epidemic. Proc. Natl. Acad.
Sci. U. S. A. 98, 8821–8826
26 Kreiswirth, B. et al. (1993) Evidence for a clonal
origin of methicillin resistance in Staphylococcus
aureus. Science 259, 227–230
27 Musser, J.M. and Krause, R.M. (1998) The revival
of group A streptococcal diseases, with a
commentary on staphylococcal toxic shock
syndrome. In Emerging Infections (Krause, R.M.,
ed.), pp. 185–218, Academic Press
28 Ferretti, J.J. et al. (2001) Complete genome
sequence of an M1 strain of Streptococcus pyogenes.
Proc. Natl. Acad. Sci. U. S. A. 98, 4658–4663
29 Reid, S.D. et al. (2001) Multilocus analysis of
extracellular putative virulence proteins made by
group A Streptococcus: population genetics,
human serologic response, and gene transcription.
Proc. Natl. Acad. Sci. U. S. A. 98, 7552–7557
30 Read, T.D. et al. (2000) Genome sequences of
Chlamydia trachomatis MoPn and Chlamydia
pneumoniae AR39. Nucleic Acids Res. 28, 1397–1406
31 Stephens, R.S. et al. (1998) Genome sequence of
an obligate intracellular pathogen of humans:
Chlamydia trachomatis. Science 282, 754–759
32 Belland, R.J. et al. Chlamydia trachomatis
cytotoxicity associated with complete and partial
cytotoxin genes. Proc. Natl. Acad. Sci. U. S. A.
(in press)
33 Shirai, M. et al. (2000) Comparison of whole
genome sequences of Chlamydia pneumoniae
J138 from Japan and CWL029 from USA. Nucleic
Acids Res. 28, 2311–2314
34 Kalman, S. et al. (1999) Comparative genomes of
Chlamydia pneumoniae and C. trachomatis. Nat.
Genet. 21, 385–389
35 Tettelin, H. et al. (2001) Complete genome
sequence of a virulent isolate of Streptococcus
pneumoniae. Science 293, 498–506
36 Hoskins, J. et al. (2001) Genome of the bacterium
Streptococcus pneumoniae strain R6. J. Bacteriol.
183, 5709–5717
37 Musser, J.M. and Kaplan, S.L. Pneumococcal
research transformed. New Engl. J. Med.
(in press)
38 Parkhill, J. et al. (2000) Complete DNA sequence
of a serogroup A strain of Neisseria meningitidis
Z2491. Nature 404, 502–506
39 Tettelin, H. et al. (2000) Complete genome
sequence of Neisseria meningitidis serogroup B
strain MC58. Science 287, 1809–1815
40 Pizza, M. et al. (2000) Identification of vaccine
candidates against serogroup B meningococcus by
whole-genome sequencing. Science
287, 1816–1820
Desiccation tolerance: a simple process?
Malcolm Potts
Water is essential for life, and thus the removal of water from a cell is a severe,
often lethal stress. This is not a remarkable observation but it is one that is
often taken for granted. Desiccation-tolerant cells implement structural,
physiological and molecular mechanisms to survive severe water deficit. These
mechanisms, and the components and pathways which facilitate them, are
poorly understood. Here, recent developments are considered to illustrate the
importance of desiccation, longevity and cell stasis in basic microbiology, and
the relevance of the topic to the metabolic engineering of sensitive cells,
including those of humans.
Malcolm Potts
Virginia Tech Center for Genomics, W. Campus Drive,
Virginia Tech, Blacksburg, VA 24061, USA.
e-mail: [email protected]
Paracelsus (c. 1500) was perhaps the first to engage in
the study of desiccation phenomena [1] but it is Antoni
van Leeuwenhoek who tends to be remembered for
his revival of dried ‘animalcules’ (rotifers) upon
rehydration. The tercentenary of his first published
observations will be celebrated in 2002. Over the past
300 years, the phenomenon of desiccation tolerance
has received comparatively little attention. David
Keilin [2] first introduced the term anabiosis (also known
as cryptobiosis, hidden life [3]) to describe the unusual
state of biological organization where cells cease
metabolism but remain viable in a state of ‘suspended
animation’. In this account, desiccation tolerance (also
referred to as anhydrobiosis [4]) is considered in the
context of a state of suspended metabolism (stasis)
induced by the removal of cell water.
The salient features of desiccation tolerance are
few: a complete arrest of cellular metabolism,
followed by time spent in a state of suspended
animation and then subsequent recovery of metabolic
functions. Dry, desiccate, rehydrate; a simple process?
In fact, few organisms can withstand these complex
phase changes. To understand how they do so one
must deal with controversial issues surrounding cell
age, longevity, the structural and biochemical
properties of anhydrous cytoplasm, and metabolic
stasis. An earlier review provided a critical appraisal
of this subject as well as an introduction to some of the
relevant biophysical principles [5].
Dry-down phase
A cell that is sensitive to water deficit becomes so
at some point(s) during the phase(s) of drying,
desiccation or rehydration. The timing of the onset
of this sensitivity, and the reason(s) behind its
acquisition, remain cryptic. It is unclear whether all
sensitive cells show a uniform response (same timing,
cause and effect) or whether the dysfunction differs
from cell type to cell type. In this regard, it is worth
considering a new concept: the viable-but-nonculturable (VBNC) phenotype. Cells become VBNC
upon exposure to different stresses [6]. Some features
of this phenomenon are reminiscent of aspects of
desiccation tolerance. For example, strains of Listeria
monocytogenes isolated from biofilms differ in their
capacity to enter the VBNC state, which has been
attributed to differences in their extracellular
polysaccharides [7], and recovery of Aeromonas
hydrophila from the VBNC state is enhanced by
the presence of H2O2-degrading agents, including
catalase [8]. There are specific modifications of the
cell wall when Enterococcus faecalis cells become
0966-842X/01/$ – see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0966-842X(01)02231-4