Download Recombination and clonal groupings within Helicobacter pylori from

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Quantitative trait locus wikipedia , lookup

Non-coding DNA wikipedia , lookup

Genomic library wikipedia , lookup

Transposable element wikipedia , lookup

Human genetic variation wikipedia , lookup

No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup

Point mutation wikipedia , lookup

Gene therapy wikipedia , lookup

Vectors in gene therapy wikipedia , lookup

Copy-number variation wikipedia , lookup

Human genome wikipedia , lookup

Genomic imprinting wikipedia , lookup

Nutriepigenomics wikipedia , lookup

Public health genomics wikipedia , lookup

Ridge (biology) wikipedia , lookup

Biology and consumer behaviour wikipedia , lookup

Gene nomenclature wikipedia , lookup

Genomics wikipedia , lookup

Genetic engineering wikipedia , lookup

Minimal genome wikipedia , lookup

History of genetic engineering wikipedia , lookup

Epigenetics of human development wikipedia , lookup

Cre-Lox recombination wikipedia , lookup

Gene desert wikipedia , lookup

Therapeutic gene modulation wikipedia , lookup

Metagenomics wikipedia , lookup

Gene expression programming wikipedia , lookup

Gene wikipedia , lookup

Helitron (biology) wikipedia , lookup

RNA-Seq wikipedia , lookup

Genome (book) wikipedia , lookup

Genome editing wikipedia , lookup

Gene expression profiling wikipedia , lookup

Genome evolution wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

Designer baby wikipedia , lookup

Pathogenomics wikipedia , lookup

Microevolution wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Transcript
Molecular Microbiology (1999) 32(3), 459–470
MicroGenomics
Recombination and clonal groupings within
Helicobacter pylori from different geographical regions
Mark Achtman,1* Takeshi Azuma,2 Douglas E. Berg,3
Yoshiyuki Ito,2 Giovanna Morelli,1 Zhi-Jun Pan,3
Sebastian Suerbaum,4 Stuart A. Thompson,5 Arie
van der Ende6 and Leen-Jan van Doorn7
1
Max-Planck Institut für molekulare Genetik, Ihnestraße
73, 14195 Berlin, Germany.
2
Second Department of Internal Medicine, Fukui Medical
University, Fukui 910-1193, Japan.
3
Departments of Molecular Microbiology and Genetics,
Washington University School of Medicine, St Louis,
MO 63110, USA.
4
Department of Medical Microbiology, Ruhr-Universität
Bochum, 44780 Bochum, Germany.
5
Vanderbilt University School of Medicine, Nashville,
TN 37232, USA.
6
Department of Medical Microbiology, Academic Medical
Center, 1105 AZ Amsterdam, The Netherlands.
7
Delft Diagnostic Laboratory, 2625 AD Delft, The
Netherlands.
Summary
A collection of 20 strains of Helicobacter pylori from
several regions of the world was studied to better
understand the population genetic structure and diversity of this species. Sequences of fragments from
seven housekeeping genes (atpA , efp , mutY , ppa ,
trpC , ureI , yphC ) and two virulence-associated genes
(cagA , vacA ) showed high levels of synonymous sequence variation (mean percentage K s of 10–27%) and
lower levels of non-synonymous variation (mean percentage K a of 0.2–5.6%). Cluster analysis of pairwise
differences between alleles revealed the existence of
two weakly clonal groupings, which included half of
the strains investigated. All six strains isolated from
Japanese and coastal Chinese were assigned to the
‘Asian’ clonal grouping, probably reflecting descent
from a distinct common ancestor. The clonal groupings
were not totally uniform; recombination, as measured
Received 1 December, 1998; revised 8 February, 1999; accepted
11 February, 1999. Authors’ names are in alphabetical order. *For
correspondence. E-mail [email protected];
Tel. (þ49) 30 8413 1262; Fax (þ49) 30 8413 1385.
䊚 1999 Blackwell Science Ltd
by the homoplasy test and compatibility matrices,
was extremely common within all genes tested, except
cagA . The fact that clonal descent could still be discerned despite such frequent recombination possibly
reflects founder effects and geographical separation
and/or selection for particular alleles of these genes.
Introduction
Helicobacter pylori is a Gram-negative bacterium that
infects the gastric mucosa of more than half of all humans.
It is a major cause of peptic ulcers and an early risk factor
for gastric cancer (Blaser, 1997). H. pylori is also genetically more diverse than most bacterial species (Go et al .,
1996). The DNA fingerprint pattern and sequences of various genes are almost always different between independent pairs of isolates (Majewski and Goodwin, 1988;
Akopyanz et al ., 1992a,b; Kansau et al ., 1996), and a comparison of the genomes of two strains has shown that 7%
of the genes are specific to each strain (Alm et al ., 1999).
However, bacteria isolated from the members of one family
can be indistinguishable, or show only limited variation,
which implies intrafamilial transmission (Bamford et al .,
1993; van der Ende et al ., 1996; Suerbaum et al ., 1998;
Kersulyte et al ., 1999).
The limited data available to date on the population
genetics of this organism have led to somewhat contradictory interpretations. For example, multilocus enzyme electrophoresis of six loci indicated no linkage disequilibrium
between the different loci (Go et al ., 1996), and restriction
fragment length polymorphism (RFLP) analysis indicated
that its population genetic structure is panmictic (Salaun
et al ., 1998). Similarly, analyses of sequences from fragments of the vacA , flaA and flaB genes indicated ‘free
recombination’ among isolates from Canada and Germany
(Suerbaum et al ., 1998), i.e. no clonal associations were
detected, even among pairs of polymorphic sites within
single genes. In agreement, multiple recombinant types
were found among strains isolated from one individual
(Kersulyte et al ., 1999).
The virulence-associated vacA gene encodes a vacuolating cytotoxin (Cover, 1996; Atherton, 1998). Particular
sequence motifs in the signal sequence (‘s’) and middle
(‘m’) regions of the vacA gene differ between H. pylori
460 M. Achtman et al.
isolated from East Asians and Europeans (Atherton et al .,
1995; Ito et al ., 1998; Pan et al ., 1998; van Doorn et al .,
1998a), suggesting that recombination is rare between
bacteria from different continents or that particular alleles
are selected for in certain populations. However, a different
segment of the vacA gene was found to have recombined
freely in bacteria isolated from Canada and South Africa
(Suerbaum et al ., 1998).
The virulence-associated cagA gene encodes an immunodominant protein of unknown function (Covacci et al .,
1993; Tummuru et al ., 1993; Kuipers et al ., 1995). cagA
is located at one end of a 40 kb pathogenicity island (Censini et al ., 1996; Akopyants et al ., 1998) that is lacking
from many putatively avirulent strains from the USA and
Europe and is altered in other strains (Jenks et al ., 1998).
Similarly to vacA , the population of cagA alleles differs
between strains isolated from East Asians and strains
from other ethnic groups (Miehlke et al ., 1996; Pan et al .,
1997; van der Ende et al ., 1998; van Doorn et al ., 1999).
Thus, recombination might be frequent among strains isolated from individual ethnic groups but rare between strains
from geographically separated ethnic groups. However, the
same Chinese and Dutch strains that had exhibited geographical partitioning of cagA sequences did not show ethnic associations for sequences of the glmM housekeeping
gene (van der Ende et al ., 1998). And the same polymorphisms were found in the vacA , flaA and flaB genes from South
Africa and Germany or Canada (Suerbaum et al ., 1998).
Recently, sequences from multiple housekeeping genes
(multilocus sequence typing, MLST) have been used to
define phylogenetic relationships within the weakly clonal
species Neisseria meningitidis (Maiden et al ., 1998) and
Streptococcus pneumoniae (Enright and Spratt, 1998).
Most members of each clonal grouping within these species
are uniform for multiple housekeeping genes; recombinant
alleles are the exception rather than the rule. It seemed
possible, especially considering the geographical distributions of cagA and vacA described above, that MLST would
also allow the recognition of distinct clonal groupings within
H. pylori . The conclusion that H. pylori is freely recombining
was based on vacA , flaA and flaB sequences from strains
isolated in Canada, Germany and South Africa; it seemed
possible that analysis of a more global collection might
reveal geographical associations that were lacking in the
earlier sample. Furthermore, alleles of cytoplasmic housekeeping genes are more likely to be selectively neutral,
and therefore more uniform, than is the case for genes
for which there might be selection for diversity as a result
of association with virulence (vacA ) or motility (flaA , flaB ).
In this report, we analyse the sequence variability of
both housekeeping and virulence-associated genes from
H. pylori strains isolated in several continents and show
that weakly clonal groupings can be identified in H. pylori
despite a rich history of interstrain recombination.
Results
A global strain collection
We combined strains of H. pylori that have been used for
various laboratory studies with additional strains from
diverse geographical locations to assemble a collection of
19 strains that may represent much of the global population
Table 1. Strains of H. pylori whose sequences were analysed.
Hybridization
Strain
Country
R29
97-42
88-39
88-28
F32
HP1
#12
J166
5596
CC28
96-228
96-232
N6
SS1
NCTC11637
NCTC11638
25
BO265
26695
5-1
China
Hong Kong
Thailand
Thailand
Japan
Peru
Guatemala
USA
Gambia
S. Africa
Alaska
Alaska
France
Australia
Australia
Australia
Holland
Germany
UK
Lithuania
Clonal
group
Asia
Asia
Asia
Asia
Asia
Asia
Clone2
Clone2
Clone2
Clone2
cagA
þ
þ
þ
þ
þ
–
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
vacA
s1c
s1c
s1c
s1c
s1c
s1c
s1b
s1b
s1b
s1a/s1b
s1a
s1a
s1a
s2
s1a
s1a
s1a
s1b
s1a
s1a
m2a
m2a
m2a
m2a
m1
m1
m1
m1
m1
m1
m1
m1
m2a
m2a
m1
m1
m1
m1
m1
m1
Source a
Reference
AvdE
SAT
SAT
SAT
TA
DEB
DEB
SAT
DEB
SS
DEB
DEB
DEB
DEB
DEB
DEB
AvdE
SS
DEB
DEB
van der Ende et al . (1998)
Xu et al . (1997)
Suerbaum et al . (1998)
Ferrero et al . (1992)
Lee et al . (1997)
van der Ende et al . (1998)
Suerbaum et al . (1998)
Tomb et al . (1997)
Kersulyte et al . (1999)
Strain NCTC11638 has also been referred to as 8D3, J166 as 95-58 and HP1 as Peru2-11.
a. Source, first letters of author’s name from whom the strain or its DNA may be obtained.
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470
Recombination and clonal groupings of H. pylori 461
Table 2. Gene fragments that were sequenced, ordered by position in the genome of strain 26695.
Gene
Number
ureI
mutY
efp
cagA
cagA N
cagA C
ppa
yphC
vacA
atpA
trpC
HP0071
HP0142
HP0177
HP547
HP0620
HP0834
HP0887
HP1134
HP1279
Position
(in gene)
74 750 (1)
154 138 (409)
183 801 (76)
579 972
580 938
665 768
886 070
938 619
1197 139
1354 997
(52)
(1018)
(22)
(4)
(205)
(512)
(664)
Length
(bp)
Gene product
or inferred function
585
420
410
Urease accessory protein
A/G-specific adenine glycosylase
Elongation factor EF-P
Cytotoxin-associated gene A
N-terminal cagA fragment
Central cagA fragment
Inorganic pyrophosphatase
GTPase
Vacuolating cytotoxin
ATP synthase, F1␣
Anthranilate isomerase
413
243
396
510
444
627
456
Accession
AJ239701-19
AJ239645-63
AJ239607-25
AJ239683-700
AJ239720-37
AJ239626-44
AJ239664-82
AJ239550-68
AJ239588-606
AJ239569-87
Longer sequences for yphC (1071 bp) and cagA N (up to 464 bp) were trimmed to the lengths shown such that all sequences evaluated were from
intragenic regions and did not contain any alignment gaps. Position (in gene), TIGR position of the N-terminal end of the fragment sequenced
(position relative to start of gene, which was defined as 1).
of this species (Table 1). Corresponding sequences from
strain 26695, whose entire genome has been sequenced
(Tomb et al ., 1997), were also included in the analysis.
[The genome sequence of strain J99 (Alm et al ., 1999)
was not included because it was published after completion
of the analyses presented here.] The strains were tested
by hybridization with probes specific for cagA and for the
s (signal peptide) and m (internal) region alleles of vacA
(Atherton et al ., 1995; van Doorn et al ., 1998a,b). All strains
hybridized with the cagA probe except HP1, in which
much of the cag pathogenicity island including part of
cagA is deleted (D. Kersulyte and D. E. Berg, unpublished).
All strains except one (CC28) yielded unique hybridization
patterns with probes for the s and m regions of vacA . The
results (Table 1) indicated that the vacA s1c allele was
present exclusively in the six strains isolated from Asians,
in agreement with published data (van Doorn et al ., 1998a),
whereas the other vacA alleles were not associated exclusively with any geographical or ethnic source.
Sequences
Sequences were obtained from fragments of seven housekeeping genes scattered around the chromosome, from a
fragment of vacA and from both N-terminal (termed cagA N)
and internal (cagA C) fragments of cagA (see Experimental procedures ) (Table 2). Together with the published
sequence from strain 26695, this resulted in a total of 20
sequences per gene fragment. One other strain (HP1)
also lacked much of the cagA gene, as indicated above.
Accordingly, our analyses were restricted to 19 sequences
for cagA N and cagA C.
Sequence variation
After alignment, none of the sequences contained gaps or
insertions. The average GC content of the entire H. pylori
genome is 39% (Tomb et al ., 1997), but large regions can
contain as little as 33% or as much as 43%. Most of the
gene fragments analysed here possessed GC contents of
⬇39%, but three possessed GC contents of ⬇45% (atpA ,
efp , ureI ; Table 3).
Only a few amino acid changes (non-synonymous sequence variation) were found in most gene fragments (mean
percentage K a 0.2–3.1%), but the mean percentage K a
of both cagA N and cagA C was 5.5%. All sequences
showed high levels of synonymous sequence variation,
with mean percentage K s values of 12–27%; individual
Table 3. Properties of gene fragments sequenced.
Gene
%GC
Mean percentage K s (range)
Mean percentage K a (range)
Ratio K s / K a
Mean percentage compatibility
H ratio
atpA
cagA N
cagA C
efp
mutY
ppa
trpC
ureI
vacA
yphC
46
37
38
46
39
41
39
44
43
38
12.1 ⫾ 3.2 (5.5–19.5)
12.4 ⫾ 6.2 (0–25.8)
24.1 ⫾ 18.6 (0–65.5)
14.6 ⫾ 5.7 (0–31.8)
22.3 ⫾ 7.3 (6.6–41.4)
10.5 ⫾ 4.0 (0–23.0)
26.6 ⫾ 8.6 (6.1–43.6)
11.6 ⫾ 4.8 (3.8–26.1)
17.8 ⫾ 6.1 (3.2–36.4)
17.5 ⫾ 5.5 (5.5–31.3)
0.2 ⫾ 0.2
5.6 ⫾ 3.8
5.3 ⫾ 4.2
0.3 ⫾ 0.3
1.9 ⫾ 0.8
0.5 ⫾ 0.3
3.1 ⫾ 1.2
0.7 ⫾ 0.4
2.6 ⫾ 1.5
1.8 ⫾ 0.8
49.6
2.2
4.5
48.7
11.9
22.9
8.6
16.6
6.9
9.7
48
62
77
50
54
54
55
45
52
42
0.76
0.17
0.15
0.60
0.66
0.68
0.46
0.79
0.69
0.67
(0–0.6)
(0.6–12.4)
(0–13.0)
(0–1.3)
(0.3–3.9)
(0–1.6)
(0.8–5.4)
(0–1.9)
(0–7.5)
(0–4.2)
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470
462 M. Achtman et al.
pairs of sequences differed at as many as 66% of synonymous sites. Because non-synonymous variation was frequent within cagA , the K s / K a ratio was considerably lower
for both fragments from this gene (2-5) than from the other
genes (9-50). All sequences were unique, except that
identical efp sequences were found in strains 96-232
and 92-228 (both from Alaska), and identical ppa sequences were found in strains 96-228 (Alaska) and J166
(USA). These results indicate, in agreement with previous
analyses (Garner and Cover, 1995; Kansau et al., 1996;
Suerbaum et al ., 1998; van der Ende et al ., 1998; van
Doorn et al ., 1998a, b), that identical sequences are rare
among H. pylori.
Frequency of recombination
The sets of sequences were tested by the homoplasy test
(Maynard Smith and Smith, 1998), which measures the
frequency of the same nucleotide changes in different
branches of a maximum parsimony phylogenetic tree (homoplasies). Homoplasies are caused by horizontal DNA
transfer between unrelated strains (recombination) or by
independent mutations. The homoplasy test calculates
the H ratio, the frequency of observed synonymous homoplasies relative to the frequencies expected for the observed
sequence variation under clonal descent or free recombination. If all descent is clonal (no interstrain recombination,
H ratio ¼ 0.0), all homoplasies arise by the coincidental
accumulation of mutations at the same sites in independent
strains. In contrast, under free recombination (H ratio ¼
1.0), mutations need only arise once and are then acquired
by sequences in other genomes via recombination. Sequences from the flaA , flaB and vacA genes from strains isolated in Germany or Canada had yielded H ratios between
0.8 and 0.9 (Suerbaum et al ., 1998).
With the current collection of 20 strains, the vacA gene
fragment yielded an H ratio of 0.69, only slightly lower
than observed previously, indicating frequent recombination. The H ratio for six of the housekeeping genes was
between 0.60 and 0.79 (geometric mean of 0.69), indicating similarly high levels of recombination for these genes
(Table 3). The seventh housekeeping gene, trpC , yielded
an intermediate H ratio of 0.46. In contrast to these results,
the two cagA gene fragments yielded H ratios of 0.15–0.17,
among the lowest ratios yet observed in any species (Suerbaum et al ., 1998). Thus, most genes in H. pylori exhibit
free recombination, while recombinants involving at least
some alleles of the cagA gene are rare.
The sequences were also analysed by compatibility
matrices (Jakobsen and Easteal, 1996) (Fig. 1). This test
scores whether pairs of informative sites are compatible
with a maximum parsimony tree (clonal descent), as indicated in Fig. 1 by a white box, or incompatible (recombination), as indicated by a black box. The matrices are largely
black, implying repeated recombination events, except for
the cagA C gene fragment in which large regions are white,
compatible with clonal descent. A numerical measure of
the degree of clonal descent is the average frequency with
which pairs of sites are compatible, called the mean percentage compatibility. The mean percentage compatibilities of
the matrices were between 42% and 62% (Table 3), except
for cagA C, which was 77%. These results indicate that
interstrain recombination was frequent within all gene sets
except cagA C.
The data from both analyses concur that there has been
extensive interstrain recombination in the H. pylori chromosome, with the notable exception of the cagA C gene
fragment, and intermediate or contradictory values for the
trpC and cagA N gene fragments.
Clonal descent
MLST clonal groupings in N. meningitidis and S. pneumoniae have been defined as sets of bacteria sharing a significant number of identical alleles (Enright and Spratt, 1998;
Maiden et al ., 1998). Identical alleles are so rare in H. pylori
that this criterion does not apply to this species. Instead, we
examined the average sequence homology for the eight
gene fragments for which sequences had been obtained
from all 20 strains (thus excluding cagA ). Normalized distance matrices of the numbers of nucleotide differences
between each of the pairs of sequences were calculated
for each gene and averaged to yield a mean distance matrix.
An UPGMA (unweighted pair group mean average) cluster
analysis of this matrix showed that the six strains isolated
from East Asians clustered together, including strain HP1,
which was from an ethnic Japanese living in Peru (Fig. 2).
We refer to this group of strains as the ‘Asian’ clone. Similarly, three other strains (#12, 5596 and CC28; isolated in
Guatemala, The Gambia and South Africa respectively)
also clustered together and separately from the other
strains. These bacteria are referred to as ‘clone2’ (Fig.
2). Very similar results were obtained using a mean distance matrix from pairwise distances that had not been
normalized.
The sequences from each of the 10 gene fragments
were also analysed individually by bootstrap tests of neighbour-joining trees and by maximum likelihood cluster analysis. The results showed that, for each gene fragment
except ppa and atpA , two to six strains of the Asian clone
clustered together with bootstrap values of at least 47%
(Table 4). Furthermore, for each of the six Asian strains,
four to eight of the 10 gene fragments clustered with those
from other strains of the Asian clone. Similarly, except for
ureI and ppa , all gene fragments usually clustered together
for the strains of clone2 (Table 5). These clusters also
included strain J166 for four of the gene fragments
(Table 5). This group of four strains was also uniform for
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470
Recombination and clonal groupings of H. pylori 463
Asian clone
clone2
Fig. 1. Compatibility matrices of 10 gene
fragments from 20 strains of H. pylori . These
matrices depict the informative sites within the
sequence sets as white boxes when pairs of
sites are compatible with a parsimony tree and
as black boxes for pairs of sites that are
incompatible. Black regions indicate runs of
incompatible sites that probably arose by
repeated recombination, whereas white regions
(in particular in the cagA C gene fragment)
indicate that those runs of sites are compatible
with clonal descent by the sequential
accumulation of mutations.
0.70
0.65
0.60
0.55
0.50
0.45
0.40
Normalized Linkage Distance
0.35
0.30
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470
0.25
#12
5596
CC28
J166
NCTC11638
NCTC11637
SS1
5-1
96-228
96-232
26695
25
BO265
N6
F32
R29
88-28
97-42
HP1
88-39
Fig. 2. An UPGMA dendrogram of the mean
normalized pairwise differences between 20
alleles for eight gene fragments (excluding cagAN
and cagA C). The number of nucleotide
differences between each pair of alleles was
normalized to a maximal value of 1.0 before
averaging over the eight gene fragments. The
resulting matrix was used for a cluster analysis
using the unweighted pair–group mean average
clustering method. Note that strain J166 is not in
clone2 according to these results, although many
gene fragments from J166 did cluster with clone2
in dendrograms of the individual genes (Table 5).
464 M. Achtman et al.
Table 4. Gene fragments whose sequences cluster significantly within six strains isolated from East Asians.
Gene
ureI
mutY
efp
cagA N
cagA C
ppa
yphC
vacA
atpA
trpC
Position (kb)
75
154
184
580
581
666
886
939
1197
1355
Strains
F32
R29
88-28
97-42
HP1
88-39
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
ND
þ
þ
þ
þ
þ
ND
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
No. of strains
Bootstrap (%)
3
65
5
95
2
63
5
99
5
100
5
64
6
47
0
þ
þ
þ
þ
þ
0
5
98
No. of genes
6
7
7
8
4
4
8/10
Sequences from Asian strains that clustered in neighbour-joining dendrograms with a percentage bootstrap value of at least 47% are indicated by a
‘þ’, and other sequences are indicated by an empty space. ND, not tested because a sequence was not obtained.
the vacA s1b allele, which was otherwise only found in one
other strain (BO265).
The clonal associations seen here would probably not
have been considered significant if only single housekeeping genes had been analysed. The dendrograms of the
individual genes were quite different (see Fig. 3 for examples), and strains that clustered significantly for individual
genes (e.g. the two unlabelled clusters for mutY in Fig. 3)
were often unrelated for other genes. The dendrogram of
the cagA gene fragments differed from the dendrograms
for other loci in that three distinct branches were observed,
one of which included the five cagAþ Asian strains and a
second contained only strain J166, which was otherwise
often associated with clone2 (Fig. 3).
Discussion
We studied the sequences of fragments of seven housekeeping and two virulence-associated genes from 20
strains of H. pylori isolated from diverse geographical
regions in order to assess its population genetic structure
and evolutionary history. Two weakly clonal groupings
were found, superimposed on a pattern of free recombination, an outcome that provides a basis for further population genetic analyses.
Clonality within panmictic species
H. pylori does not possess a strong clonal structure (Go
et al ., 1996; Salaun et al ., 1998), as is also the case with
species such as Neisseria gonorrhoeae (O’Rourke and
Spratt, 1994) and Bacillus subtilis (Istock et al ., 1992),
whose population genetic structure has been called panmictic (Maynard Smith et al ., 1993). In panmictic species,
recombination is sufficiently frequent and unifying selection sufficiently rare that clonal groupings are not expected
to survive except over relatively short periods, such as in
cases of transmission within families (Suerbaum et al .,
1998) or between sexual contacts (O’Rourke et al ., 1995).
Furthermore, numerous distinct alleles are generated as a
result of repeated genetic exchange between divergent
lineages. Because most alleles from independent strains
of H. pylori are unique, it initially seemed unlikely that
multilocus sequence typing (Maiden et al ., 1998) would
reveal any clonal groupings within H. pylori . However,
the results presented here show that widespread clonal
groupings can be discerned even within a largely panmictic species such as H. pylori . We note that a widespread clone has also been found within the supposedly
panmictic N. gonorrhoeae (Gutjahr et al ., 1997) and
suggest that, with time, more clonal groupings will be
Table 5. Clone2, a group of three or four strains in which some gene fragments cluster significantly.
Gene
ureI
mutY
efp
cagA N
cagA C
ppa
yphC
vacA
atpA
trpC
Position (kb)
75
154
184
580
581
666
886
939
1197
1355
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
þ
6
8
6
4
4
58
2
90
2
68
3
72
4
66
4
74
8/10
Strain
#12
5596
CC28
J166
No. of strains
Bootstrap (%)
0
þ
þ
þ
þ
þ
2
55
0
3
96
No. of genes
Details are as in Table 4.
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470
Recombination and clonal groupings of H. pylori 465
J166
cagA
100
100
96-232
Asia
26695
F32
R29
25
96-228
NCTC11637
5-1
B0265
NCTC11638
#12
5596 SS1
88-39
88-28
100
CC28
N6
97-42
NCTC11637
NCTC11638
58
25
58 26695
clone2
5596
clone2
88-39 B0265
J166
5596
CC28
5-1
25
#12
#12
88-39
HP1
97-42
CC28
N6
66
Asia
95
J166
26695
96-228
SS1
B0265
F32
R29
96-232
88-28
96-228
5-1
mutY
SS1
88-28
HP1
F32
96-232
R29
NCTC11638
97-42
N6 NCTC11637 75
Asia
25
atpA
0.10
Fig. 3. Maximum likelihood unrooted dendrograms for gene fragments of cagA (cagAN appended to cagA C), mutY and atpA . Bacterial strains
are indicated by small italic letters. Bootstrap values over 50% from neighbour-joining trees based on Jukes–Cantor distance matrices are
indicated by bold lettering within the dotted boxes. Asian strain clusters are indicated by the designation Asia. This group of strains is also
boxed for the atpA locus, although the bootstrap value was not significant. The horizontal line at the bottom is a scale line for genetic
distance.
recognized within other largely panmictic bacterial
species.
Both the Asian clone and clone2 are currently represented by only a few isolates, and experiments with additional bacteria from additional geographical regions and
those that lack the cag pathogenicity island are needed
to better define their variability and other properties. However, the existence of these weakly clonal groupings seems
inescapable, because sequence polymorphisms in multiple
housekeeping genes scattered around the chromosome
yielded relatively consistent results. These results also
demonstrate the power of multilocus sequence analysis
for the recognition of clonal groupings, even within panmictic species.
Clonal groupings versus selection for particular
alleles
The cagA gene is in a pathogenicity island that is associated with virulence. Large sequence differences distinguish
cagA gene fragments from Asian strains versus others (van
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470
der Ende et al ., 1998) (Fig. 3). The synonymous sequence
diversity (K s ) of cagA is not unusual in comparison with
other chromosomal genes within H. pylori (Table 3), and
the age of all these genes may be similar. Furthermore,
the cagA sequences from the Asian clone are not particularly uniform (Fig. 3), arguing against a recent selective
sweep of the cag pathogenicity island, such as has been
invoked to explain the relative lack of sequence polymorphism within the gapA gene of Escherichia coli (Guttman and
Dykhuizen, 1994). The alleles of five of the seven housekeeping gene fragments clustered together significantly
within at least two of the six Asian strains (Fig. 2, Table
4). It therefore seems likely that the different cagA or
vacA populations in Asia and other continents (Ito et al .,
1998; van der Ende et al ., 1998) reflect an old clonal expansion. Similar conclusions have been drawn for Mycobacterium tuberculosis (Van Soolingen et al ., 1995), in which the
insertion sites for IS6110 were conserved in a considerable proportion of strains isolated from East Asians. Geographical partitioning has also been documented for type
b Haemophilus influenzae (Musser et al ., 1990), for which
466 M. Achtman et al.
unusual electrophoretic types were commonly isolated in
certain countries to which recent human migration has
been rare.
We have combined the data on cagA C presented
here with that published elsewhere (van der Ende et al .,
1998) and with several unpublished cagA C sequences
(data not shown). The three distinct branches in the
cagA dendrogram (Fig. 3) remain unchanged with these
75 sequences. Analysis of the larger set of sequences
has also indicated that distinct branches of the cagA
phylogenetic dendrogram can largely be accounted for by
clustered amino acid polymorphisms that are uniform
within each individual branch (data not shown). Specific
amino acid polymorphisms might reflect different, specific
interactions between the CagA protein and physiologically
or genetically distinct human hosts, and selection for such
polymorphisms may have helped to maintain clonal
groupings despite frequent recombination. Elucidation of
such interactions will depend on identifying the function
of the CagA protein in human colonization and disease.
The data presented here also indicate that particular
allelic variants in the signal peptide of the vacA gene
are associated with each of the two clonal groupings.
The different m regions of VacA seem to correlate with
different target cell specificities of the toxin (Paggliaccia
et al ., 1998).
Age of H. pylori
How old is H. pylori and how old is the Asian clone? The
mean K s values of the various genes in Table 3 and similar
values for vacA , flaA and flaB (Suerbaum et al ., 1998) are
higher than for most housekeeping genes in Drosophila
melanogaster, E. coli and N. meningitidis. This observation suggests that H. pylori has a long history, possibly of
millions of years. Sequence variation within the Asian
clone was less than within the entire species, but sufficiently
high that the Asian clone of H. pylori may have accompanied Homo sapiens when they colonized Asia some 40 000
or more years ago. Alternatively, a strain variant of H. pylori
that is particularly well adapted to East Asians might have
spread through that population more recently, displacing
bacteria that were formerly present. Apparently, subsequent recombination has not been sufficiently frequent to
obscure the evidence for clonal descent. The age of clone2
may be similar to that of the Asian clone because the
sequence variability of the two sets of strains was similar
(Fig. 2).
Free recombination versus clonality
The data presented here provide evidence for both extensive recombination and clonality. One possible explanation
for such mutually incompatible models is that their relative
importance for H. pylori differs with the geographical region.
For example, lower H ratios were observed with strains
from South Africans (including CC28, which belongs to
clone2) than with strains isolated in Canada and Germany
(Suerbaum et al ., 1998). Not all of H. pylori is necessarily
clonal: remnants of clonal descent were not detected
among the 10 strains that were not assigned to the Asian
clone or clone2. Possibly, the clonal members differ in
the frequency with which genes transferred from other
strains of H. pylori are fixed. Such differences are known
in other species, such as N. meningitidis, in which seven
distinct clonal groupings of hypervirulent bacteria exist
(Maiden et al ., 1998) even though recombination is frequent (Morelli et al ., 1997) and many other isolates are
panmictic (Maynard Smith et al ., 1993). Finally, the spread
of H. pylori from one human host to the next may be sufficiently slow that insufficient time has elapsed for recombination to remove all traces of clonal descent.
Frequencies of recombination
We used both the homoplasy test and compatibility
matrices to ensure that our conclusions on recombination
did not reflect the limitations of a single method. The results
with both methods were consistent with two exceptions,
namely cagA N, which is clonal according to the H ratio,
but seems from compatibility matrices to recombine frequently, and trpC , which has undergone less recombination than the other housekeeping genes based on its H
ratio, but not according to compatibility matrices. Possibly,
these discrepancies can be accounted for by the high K a
values for both exceptional genes. Compatibility matrices
evaluate all informative sequence variation and would be
affected by the higher level of non-synonymous variation
in these gene fragments, whereas the homoplasy test
only uses synonymous, informative polymorphisms. Both
methods agreed that cagA C was highly clonal and that
recombinants were very rare. Thus, different genes (or
gene fragments) within one chromosome of a bacterial
species may be associated with different degrees of
recombination.
In other species, elevated levels of polymorphism have
been attributed to linkage to gene clusters that are imported
from other species (gnd linked to rfb ) (Bisercic et al ., 1991)
or to diversifying selection (Smith et al ., 1995; Boyd et
al ., 1997). However, the genes we chose for sequencing
encode cyptoplasmic enzymes, and they are not linked
to genes encoding outer membrane or secreted proteins
that might be under selection. We have also argued above
that the evidence does not support a selective sweep having
caused the clonal descent associated with the cagA gene.
Thus, the mechanisms responsible for different degrees of
recombination associated with individual genes in H. pylori
remain to be elucidated.
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470
Recombination and clonal groupings of H. pylori 467
Summary
Two clonal groupings, the Asian clone and clone2, were
detected in a collection of H. pylori from several parts
of the world. These clonal groupings are widespread and
have probably existed for a long time. Recombination
seems to have occurred on a global scale for most genes
studied, but has not totally disrupted the relationships with
the clonal groupings. The results presented here should
stimulate further extensive analyses of the geographical
specialization and evolution of this species.
Experimental procedures
Bacterial strains
Nineteen strains were chosen from diverse ethnic groups and
countries to represent the geographical diversity of H. pylori
(Table 1). The collection deliberately included many strains
Table 6. Oligonucleotide primers.
that had already been used in previous analyses. Because
of our intention of examining sequence variation of the cagA
gene, we only included strains thought to be Cagþ. The 19
strains were purified by single-colony isolation before further
analysis. These strains, or their DNAs, are available from the
individual sources shown in Table 1.
DNA, genes, PCR products and sequences
DNA was purified from cultures of H. pylori by the CTAB
method (Ausubel et al ., 1994) or by other standard methods
appropriate for PCR amplification of DNA fragments and sent
in dried form to each of the laboratories of the consortium.
We chose seven housekeeping genes for sequencing (Table
2), using the criteria that these genes be both widely separated
on the chromosome of the strain (26695) whose genome has
been sequenced (Tomb et al ., 1997) and not adjacent to genes
encoding putative outer membrane, secreted or hypothetical
proteins that might be under selective pressure. Based on the
sequences in the genome of strain 26695, amplification and
Name
(orientation)
Purpose a
Sequence
atpA
AtpA1(þ)
AtpA6(¹)
AtpA7(þ)
AtpA4(¹)
A
A
S
S
GCTTAAATGGTGTGATGTCG
AATGGGCAAGGGCGAATAAG
CGCTTTGGGTGAGCCTATTG
TGCCCGTCTGTAATAGAAATG
cagAN
cagP1(þ)
cagF2(þ)
cagB1(¹)
A,S
A,S
A,S
CCATTTTAAGCAACTCCATAAACC
AAGATACCGATAGGTATGAA
TCTGCCAAACAATCTTTTGCAG
cagA C
cagA5(þ)
cagA2(¹)
A,S
A,S
GGCAATGGTGGTCCTGGAGCTAGGC
GGAAATCTTTAATCTCAGTTCGG
efp
F01(þ)
R02(¹)
F02(þ)
R01(¹)
A
A
S
S
GGCAATTGGGATGAGCGAGCTC
CTTCACCTTTTCAAGATACTC
GGGCTTGAAAATTGAATTGGGCGG
GTATTGACTTTAATGATCTCACCC
mutY
mutY101(þ)
mutY102(¹)
A,S
A,S
AGCGAAGTGATGAGCCAACAAAC
AAAGGGCAAATCGCACATTTGGG
ppa
ppa1(þ)
ppa2(¹)
ppa3(¹)
A,S
A,S
A,S
GTGAGCCATGACGCTGATTCTTTGT
GCCTTGATAGGCTTTTATCGCTTTCT
GCCATTTCACACCAACACCCAAT
trpC
HptrpC-1(þ)
HptrpC-6(þ)
HptrpC-5(¹):
HptrpC-9(þ)
HptrpC-7(¹):
HptrpC-8(¹):
A
A,S
A
S
S
S
CAAGCTCCTAGAAGTCTCTG
TAGAATGCAAAAAAGCATCGCCCTC
CCCAGCTAGCATGAAAGG
CGCTTGCTCAA(AG)CTCCAATACGAC
TAAGCCCGCACACTTTATTTTCGCC
GTCGTATTGGCG(CT)TTGAGCAAGCG
vacA
OLHPVacA-3(þ)
OLHPVacA-4(¹)
A,S
A,S
ACAACCGTGATCATTCCAGC
ATACGCTCCCACGTATTGC
ureI
Hp71S1(þ)
Hp71S2(þ)
Hp71S3(þ)
Hp71AS1(¹)
A,S
A,S
A,S
A,S
CAATAAAGTGAGCTTGGCGCAACT
GTTATTCGTAAGGTGCGTTTGTTG
GGCAATGCTAGGACTTGT
TCCCTTAGATTGCCAACTAAACGC
yphC
F1(þ)
F3(þ)
R4(¹)
R5(¹)
A,S
A,S
A,S
A,S
CACTATTACCACGCCTATTTTTTTGAC
CTTATGCGTTTTCTTCTTTTGG
AAGCAGCTGGTTGTGATCACGGGGGC
TTTCTARGCTTTCTAAAATATC
Gene
a. A, amplification; S, sequencing.
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470
468 M. Achtman et al.
sequencing primers were designed such that complete doublestranded sequences could be obtained with a single sequencing run in each direction. For several gene fragments,
sequence diversity prevented amplification from all strains
with a single pair of primers. Additional primers were designed
from the available sequences and, where necessary, after
cloning and sequencing PCR amplicons that had been amplified inefficiently. The complete set of primers used for the
seven housekeeping genes is shown in Table 6. Two gene
fragments of cagA (cagA N, which corresponds to the N-terminus, and cagA C, which is slightly C-terminal of cagA N) and a
vacA gene fragment were also sequenced to allow comparison
with previous results (van der Ende et al ., 1998; van Doorn et
al ., 1999). The primers used to amplify these genes are also
included in Table 6 for convenience, although they have been
published previously (Pan et al ., 1997; Suerbaum et al ., 1998;
van der Ende et al ., 1998; van Doorn et al ., 1999). Automated
sequencing of PCR products from both strands was performed
using standard protocols, and sequences were deposited in
GenBank under the designations listed in Table 2. vacA s
and m typing was performed by hybridization as described
previously (van Doorn et al ., 1998b).
Phylogenetic and other analyses
Sequence alignments in GCG MSF format were converted
to MEGA format using the PSFIND program and evaluated with
the homoplasy test (Maynard Smith and Smith, 1998), using
the HOMOPLASY program (Suerbaum et al ., 1998). PSFIND and
HOMOPLASY are both available by anonymous ftp at ftp:/ /
novell-del-valle.rz-berlin.mpg.de/software/. Homoplasy ratios
are the means of five repetitions of the homoplasy test. Mean
K a and K s distances, using Jukes–Cantor corrections (Jukes
and Cantor, 1969), were calculated using DNASP 2.2 (Rozas
and Rozas, 1997). Compatibility matrices and mean percentage compatibility values were calculated using the program
RETICULATE (Jakobsen and Easteal, 1996). Pairwise distance
matrices based on numbers of different nucleotides (p-distance) were calculated using the MEGA program (Kumar et al .,
1993) and normalized to a maximum of 1.0 for each matrix.
These matrices were averaged (Excel, Microsoft) and used
to generate an UPGMA tree (Statistica, Statsoft). Bootstrap
tests (Felsenstein, 1985) (500 repetitions) of neighbour-joining
trees based on distances with Jukes–Cantor corrections were
performed using MEGA . Maximum-likelihood trees and graphical
outputs were calculated using ARB (http:/ /www.mikro.biologie.
tu-muenchen.de). Percentage GC was calculated using the
program CODONW (http:/ /molbiol.ox.ac.uk/cu/).
Acknowledgements
We gratefully acknowledge the expert assistance of Monique
Feller, Susanne Friedrich, Ricardo Sanna, Kerstin Zurth and
grant support from the Deutsche Forschungsgemeinschaft
(Su 133/3-1, Ac 36/10-1) and the US Public Health Service
(NIH grants AI38166, DK48029 and HG00820). We thank
the following for H. pylori strains studied in this work: A. J.
Parkinson (96-228, 96-232), R. H. Gilman (HP1), A. Lee
(SS1), C. Gareia and O. Torres (#12) and E. Kunstmann
(BO265, CC28).
References
Akopyanz, N., Bukanov, N.O., Westblom, T.U., and Berg,
D.E. (1992a) PCR-based RFLP analysis of DNA sequence
diversity in the gastric pathogen Helicobacter pylori.
Nucleic Acids Res 11: 6221–6225.
Akopyanz, N., Bukanov, N.O., Westblom, T.U., Kresovich,
S., and Berg, D.E. (1992b) DNA diversity among clinical
isolates of Helicobacter pylori detected by PCR-based
RAPD fingerprinting. Nucleic Acids Res 20: 5137–5142.
Akopyants, N.S., Clifton, S.W., Kersulyte, D., Crabtree, J.E.,
Youree, B.E., Reece, C.A., et al . (1998) Analyses of the
cag pathogenicity island of Helicobacter pylori. Mol Microbiol 28: 37–53.
Alm, R.A., Ling, L.-S.L., Moir, D.T., King, B.L., Brown, E.D.,
Doig, P.C., et al . (1999) Genomic-sequence comparison
of two unrelated isolates of the human gastric pathogen
Helicobacter pylori. Nature 397: 176–180.
Atherton, J.C. (1998) H. pylori virulence factors. Br Med Bull
54: 105–120.
Atherton, J.C., Cao, P., Peek, R.M.J., Tummuru, M.K., Blaser,
M.J., and Cover, T.L. (1995) Mosaicism in vacuolating cytotoxin alleles of Helicobacter pylori . Association of specific
vacA types with cytotoxin production and peptic ulceration.
J Biol Chem 270: 17771–17777.
Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A., et al . (1994) Current Protocols in
Molecular Biology . New York: John Wiley & Sons.
Bamford, K.B., Bickley, J., Collins, J.S., Johnston, B.T., Potts,
S., Boston, V., et al . (1993) Helicobacter pylori : comparison of DNA fingerprints provides evidence for intrafamilial
infection. Gut 34: 1348–1350.
Bisercic, M., Feutrier, J.Y., and Reeves, P.R. (1991) Nucleotide sequences of the gnd genes from nine natural isolates
of Escherichia coli : evidence of intragenic recombination
as a contributing factor in the evolution of the polymorphic
gnd locus. J Bacteriol 173: 3894–3900.
Blaser, M.J. (1997) Ecology of Helicobacter pylori in the human
stomach. J Clin Invest 100: 759–762.
Boyd, E.F., Li, J., Ochman, H., and Selander, R.K. (1997)
Comparative genetics of the inv-spa invasion gene complex
of Salmonella enterica. J Bacteriol 179: 1985–1991.
Censini, S., Lange, C., Xiang, Z., Crabtree, J.E., Ghiara, P.,
Borodovsky, M., et al . (1996) cag , a pathogenicity island
of Helicobacter pylori , encodes type-I specific and diseaseassociated virulence factors. Proc Natl Acad Sci USA 93:
14648–14653.
Covacci, A., Censini, S., Bugnoli, M., Petracca, R., Burroni,
D., Macchia, G., et al . (1993) Molecular characterization
of the 128-kDa immunodominant antigen of Helicobacter
pylori associated with cytotoxicity and duodenal ulcer.
Proc Natl Acad Sci USA 90: 5791–5795.
Cover, T.L. (1996) The vacuolating cytotoxin of Helicobacter
pylori. Mol Microbiol 20: 241–246.
van Doorn, L.J., Figueiredo, C., Sanna, R., Pena, S., Midolo,
P., Ng, E.K.W., et al . (1998a) Expanding allelic diversity of
Helicobacter pylori vacA . J Clin Microbiol 36: 2597–2603.
van Doorn, L.J., Figueiredo, C., Rossau, R., Jannes, G., Van
Asbroeck, M., Sousa, J.C., et al . (1998b) Typing of Helicobacter pylori vacA gene and detection of cagA gene by PCR
and reverse hybridization. J Clin Microbiol 36: 1271–1276.
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470
Recombination and clonal groupings of H. pylori 469
van Doorn, L.J., Figueiredo, C., Sanna, R., Blaser, M.J., and
Quint, W.G.V. (1999) Distinct variants of Helicobacter pylori
cagA are associated with vacA subtypes. J Clin Microbiol
(in press).
van der Ende, A., Pan, Z.J., Bart, A., Van der Hulst, R.W.,
Feller, M., Xiao, S.D., et al . (1998) cagA -positive Helicobacter pylori populations in China and The Netherlands
are distinct. Infect Immun 66: 1822–1826.
van der Ende, A., Rauws, E.A.J., Feller, M., Mulder, C.J.J.,
Tytgat, G.N.J., and Dankert, J. (1996) Heterogeneous
Helicobacter pylori isolates from members of a family with
a history of peptic ulcer disease. Gastroenterology 111:
638–647.
Enright, M.C., and Spratt, B.G. (1998) A multilocus sequence
typing scheme for Streptococcus pneumoniae : identification of clones associated with serious invasive disease.
Microbiology 144: 3049–3060.
Felsenstein, J. (1985) Confidence limits on phylogenies: an
approach using the bootstrap. Evolution 39: 783–791.
Ferrero, R.L., Cussac, V., Courcoux, P., and Labigne, A.
(1992) Construction of isogenic urease-negative mutants
of Helicobacter pylori by allelic exchange. J Bacteriol 174:
4212–4217.
Garner, J.A., and Cover, T.L. (1995) Analysis of genetic diversity in cytotoxin-producing and non-cytotoxin-producing
Helicobacter pylori strains. J Infect Dis 172: 290–293.
Go, M.F., Kapur, V., Graham, D.Y., and Musser, J.M. (1996)
Population genetic analysis of Helicobacter pylori by multilocus enzyme electrophoresis: extensive allelic diversity
and recombinational population structure. J Bacteriol 178:
3934–3938.
Gutjahr, T.S., O’Rourke, M., Ison, C.A., and Spratt, B.G.
(1997) Arginine-, hypoxanthine-, uracil-requiring isolates of
Neisseria gonorrhoeae are a clonal lineage within a nonclonal population. Microbiology 143: 633–640.
Guttman, D.S., and Dykhuizen, D.E. (1994) Detecting selective sweeps in naturally occurring Escherichia coli. Genetics
138: 993–1003.
Istock, C.A., Duncan, K.E., Ferguson, N., and Zhou, X. (1992)
Sexuality in a natural population of bacteria – Bacillus subtilis challenges the clonal paradigm. Mol Ecol 1: 93–103.
Ito, Y., Azuma, T., Ito, S., Suto, H., Miyaji, H., Yamazaki, Y.,
et al . (1998) Full-length sequence analysis of the vacA
gene from cytotoxic and noncytotoxic Helicobacter pylori.
J Infect Dis 178: 1391–1398.
Jakobsen, I.B., and Easteal, S. (1996) A program for calculating and displaying compatibility matrices as an aid in
determining reticulate evolution in molecular sequences.
CABIOS 12: 291–295.
Jenks, P.J., Mégraud, F., and Labigne, A. (1998) Clinical outcome after infection with Helicobacter pylori does not
appear to be reliably predicted by the presence of any of
the genes of the cag pathogenicity island. Gut 43: 752–758.
Jukes, T.H., and Cantor, C.R. (1969) Evolution of protein
molecules. In Mammalian Protein Metabolism . Munro,
H.N. (ed.). New York: Academic Press, pp. 21–132.
Kansau, I., Raymond, J., Bingen, E., Courcoux, P., Kalach,
N., Bergeret, M., et al . (1996) Genotyping of Helicobacter
pylori isolates by sequencing of PCR products and comparison with the RAPD technique. Res Microbiol 147:
661–669.
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470
Kersulyte, D., Chalkauskas, H., and Berg, D.E. (1999) Emergence of recombinant strains of Helicobacter pylori during
human infection. Mol Microbiol 31: 31–43.
Kuipers, E.J., Perez-Perez, G.I., Meuwissen, S.G., and Blaser,
M.J. (1995) Helicobacter pylori and atrophic gastritis: importance of the cagA status. J Natl Cancer Inst 87: 1777–1780.
Kumar, S., Tamura, K., and Nei, M. (1993) MEGA: Molecular
Evolutionary Genetics Analysis , Version 1.01 . University
Park, PA: The Pennsylvania State University.
Lee, A., O’Rourke, J., De Ungria, M.C., Robertson, B., Daskalopolous, G., and Dixon, M.F. (1997) A standardized mouse
model of Helicobacter pylori infection: introducing the Sydney strain. Gastroenterology 112: 1386–1397.
Maiden, M.C.J., Bygraves, J.A., Feil, E., Morelli, G., Russell,
J.E., Urwin, R., et al . (1998) Multilocus sequence typing:
a portable approach to the identification of clones within
populations of pathogenic microorganisms. Proc Natl Acad
Sci USA 95: 3140–3145.
Majewski, S.I., and Goodwin, C.S. (1988) Restriction endonuclease analysis of the genome of Campylobacter pylori
with a rapid extraction method: evidence for considerable
genomic variation. J Infect Dis 157: 465–471.
Maynard Smith, J., and Smith, N.H. (1998) Detecting recombination from gene trees. Mol Biol Evol 15: 590–599.
Maynard Smith, J., Smith, N.H., O’Rourke, M., and Spratt,
B.G. (1993) How clonal are bacteria? Proc Natl Acad Sci
USA 90: 4384–4388.
Miehlke, S., Kibler, K., Kim, J.G., Figura, N., Small, S.M.,
Graham, D.Y., et al . (1996) Allelic variation in the cagA
gene of Helicobacter pylori obtained from Korea compared
to the United States. Am J Gastroenterol 91: 1322–1325.
Morelli, G., Malorny, B., Müller, K., Seiler, A., Wang, J., del
Valle, J., et al . (1997) Clonal descent and microevolution
of Neisseria meningitidis during 30 years of epidemic
spread. Mol Microbiol 25: 1047–1064.
Musser, J.M., Kroll, J.S., Granoff, D.M., Moxon, E.R., Brodeur,
B.R., Campos, J., et al . (1990) Global genetic structure
and molecular epidemiology of encapsulated Haemophilus
influenzae. Rev Infect Dis 12: 75–111.
O’Rourke, M., and Spratt, B.G. (1994) Further evidence for
the non-clonal population structure of Neisseria gonorrhoeae : extensive genetic diversity within isolates of the
same electrophoretic type. J Gen Microbiol 140: 1285–1290.
O’Rourke, M., Ison, C.A., Renton, A.M., and Spratt, B.G.
(1995) Opa-typing: a high-resolution tool for studying the
epidemiology of gonorrhoea. Mol Microbiol 17: 865–875.
Paggliaccia, C., de Bernard, M., Lupetti, P., Ji, X., Burroni, D.,
Cover, T.L., et al . (1998) The m2 form of the Helicobacter
pylori cytotoxin has cell type-specific vacuolating activity.
Proc Natl Acad Sci USA 95: 10212–10217.
Pan, Z.J., Van der Hulst, R.W., Feller, M., Xiao, S.D., Tytgat,
G.N., Dankert, J., et al . (1997) Equally high prevalences of
infection with cagA -positive Helicobacter pylori in Chinese
patients with peptic ulcer disease and those with chronic
gastritis-associated dyspepsia. J Clin Microbiol 35: 1344–
1347.
Pan, Z.-J., Berg, D.E., van der Hulst, R.W.M., Su, W.-W.,
Raudonikiene, A., Xiao, S.-D., et al . (1998) Prevalence of
vacuolating cytotoxin production and distribution of distinct
vacA alleles in Helicobacter pylori from China. J Infect Dis
178: 220–226.
470 M. Achtman et al.
Rozas, J., and Rozas, R. (1997) DnaSP, Version 2.0: a novel
software package for extensive molecular population genetics analyses. CABIOS 13: 307–311.
Salaun, L., Audibert, C., Le Lay, G., Burucoa, C., Fauchere,
J.L., and Picard, B. (1998) Panmictic structure of Helicobacter pylori demonstrated by the comparative study of six
genetic markers. FEMS Microbiol Lett 161: 231–239.
Smith, N.H., Maynard Smith, J., and Spratt, B.G. (1995) Sequence evolution of the porB gene of Neisseria gonorrhoeae
and Neisseria meningitidis : evidence of positive Darwinian
selection. Mol Biol Evol 12: 363–370.
Van Soolingen, D., Qian, L., De Haas, P.E.W., Douglas, J.T.,
Traore, H., Portaels, F., et al . (1995) Predominance of
a single genotype of Mycobacterium tuberculosis in countries of East Asia. J Clin Microbiol 33: 3234–3238.
Suerbaum, S., Maynard Smith, J., Bapumia, K., Morelli, G.,
Smith, N.H., Kunstmann, E., et al . (1998) Free recombination within Helicobacter pylori. Proc Natl Acad Sci USA 95:
12619–12624.
Tomb, J.F., White, O., Kerlavage, A.R., Clayton, R.A., Sutton,
G.G., Fleischmann, R.D., et al . (1997) The complete genome sequence of the gastric pathogen Helicobacter pylori.
Nature 388: 539–547.
Tummuru, M.K., Cover, T.L., and Blaser, M.J. (1993) Cloning
and expression of a high-molecular-mass major antigen of
Helicobacter pylori : evidence of linkage to cytotoxin production. Infect Immun 61: 1799–1809.
Xu, Q., Peek, Jr, R.M., Miller, G.G., and Blaser, M.J. (1997)
The Helicobacter pylori genome is modified at CATG by
the product of hpyIM. J Bacteriol 179: 6807–6815.
䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470