* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Recombination and clonal groupings within Helicobacter pylori from
Quantitative trait locus wikipedia , lookup
Non-coding DNA wikipedia , lookup
Genomic library wikipedia , lookup
Transposable element wikipedia , lookup
Human genetic variation wikipedia , lookup
No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup
Point mutation wikipedia , lookup
Gene therapy wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Copy-number variation wikipedia , lookup
Human genome wikipedia , lookup
Genomic imprinting wikipedia , lookup
Nutriepigenomics wikipedia , lookup
Public health genomics wikipedia , lookup
Ridge (biology) wikipedia , lookup
Biology and consumer behaviour wikipedia , lookup
Gene nomenclature wikipedia , lookup
Genetic engineering wikipedia , lookup
Minimal genome wikipedia , lookup
History of genetic engineering wikipedia , lookup
Epigenetics of human development wikipedia , lookup
Cre-Lox recombination wikipedia , lookup
Gene desert wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Metagenomics wikipedia , lookup
Gene expression programming wikipedia , lookup
Helitron (biology) wikipedia , lookup
Genome (book) wikipedia , lookup
Genome editing wikipedia , lookup
Gene expression profiling wikipedia , lookup
Genome evolution wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Designer baby wikipedia , lookup
Pathogenomics wikipedia , lookup
Molecular Microbiology (1999) 32(3), 459–470 MicroGenomics Recombination and clonal groupings within Helicobacter pylori from different geographical regions Mark Achtman,1* Takeshi Azuma,2 Douglas E. Berg,3 Yoshiyuki Ito,2 Giovanna Morelli,1 Zhi-Jun Pan,3 Sebastian Suerbaum,4 Stuart A. Thompson,5 Arie van der Ende6 and Leen-Jan van Doorn7 1 Max-Planck Institut für molekulare Genetik, Ihnestraße 73, 14195 Berlin, Germany. 2 Second Department of Internal Medicine, Fukui Medical University, Fukui 910-1193, Japan. 3 Departments of Molecular Microbiology and Genetics, Washington University School of Medicine, St Louis, MO 63110, USA. 4 Department of Medical Microbiology, Ruhr-Universität Bochum, 44780 Bochum, Germany. 5 Vanderbilt University School of Medicine, Nashville, TN 37232, USA. 6 Department of Medical Microbiology, Academic Medical Center, 1105 AZ Amsterdam, The Netherlands. 7 Delft Diagnostic Laboratory, 2625 AD Delft, The Netherlands. Summary A collection of 20 strains of Helicobacter pylori from several regions of the world was studied to better understand the population genetic structure and diversity of this species. Sequences of fragments from seven housekeeping genes (atpA , efp , mutY , ppa , trpC , ureI , yphC ) and two virulence-associated genes (cagA , vacA ) showed high levels of synonymous sequence variation (mean percentage K s of 10–27%) and lower levels of non-synonymous variation (mean percentage K a of 0.2–5.6%). Cluster analysis of pairwise differences between alleles revealed the existence of two weakly clonal groupings, which included half of the strains investigated. All six strains isolated from Japanese and coastal Chinese were assigned to the ‘Asian’ clonal grouping, probably reflecting descent from a distinct common ancestor. The clonal groupings were not totally uniform; recombination, as measured Received 1 December, 1998; revised 8 February, 1999; accepted 11 February, 1999. Authors’ names are in alphabetical order. *For correspondence. E-mail [email protected]; Tel. (þ49) 30 8413 1262; Fax (þ49) 30 8413 1385. 䊚 1999 Blackwell Science Ltd by the homoplasy test and compatibility matrices, was extremely common within all genes tested, except cagA . The fact that clonal descent could still be discerned despite such frequent recombination possibly reflects founder effects and geographical separation and/or selection for particular alleles of these genes. Introduction Helicobacter pylori is a Gram-negative bacterium that infects the gastric mucosa of more than half of all humans. It is a major cause of peptic ulcers and an early risk factor for gastric cancer (Blaser, 1997). H. pylori is also genetically more diverse than most bacterial species (Go et al ., 1996). The DNA fingerprint pattern and sequences of various genes are almost always different between independent pairs of isolates (Majewski and Goodwin, 1988; Akopyanz et al ., 1992a,b; Kansau et al ., 1996), and a comparison of the genomes of two strains has shown that 7% of the genes are specific to each strain (Alm et al ., 1999). However, bacteria isolated from the members of one family can be indistinguishable, or show only limited variation, which implies intrafamilial transmission (Bamford et al ., 1993; van der Ende et al ., 1996; Suerbaum et al ., 1998; Kersulyte et al ., 1999). The limited data available to date on the population genetics of this organism have led to somewhat contradictory interpretations. For example, multilocus enzyme electrophoresis of six loci indicated no linkage disequilibrium between the different loci (Go et al ., 1996), and restriction fragment length polymorphism (RFLP) analysis indicated that its population genetic structure is panmictic (Salaun et al ., 1998). Similarly, analyses of sequences from fragments of the vacA , flaA and flaB genes indicated ‘free recombination’ among isolates from Canada and Germany (Suerbaum et al ., 1998), i.e. no clonal associations were detected, even among pairs of polymorphic sites within single genes. In agreement, multiple recombinant types were found among strains isolated from one individual (Kersulyte et al ., 1999). The virulence-associated vacA gene encodes a vacuolating cytotoxin (Cover, 1996; Atherton, 1998). Particular sequence motifs in the signal sequence (‘s’) and middle (‘m’) regions of the vacA gene differ between H. pylori 460 M. Achtman et al. isolated from East Asians and Europeans (Atherton et al ., 1995; Ito et al ., 1998; Pan et al ., 1998; van Doorn et al ., 1998a), suggesting that recombination is rare between bacteria from different continents or that particular alleles are selected for in certain populations. However, a different segment of the vacA gene was found to have recombined freely in bacteria isolated from Canada and South Africa (Suerbaum et al ., 1998). The virulence-associated cagA gene encodes an immunodominant protein of unknown function (Covacci et al ., 1993; Tummuru et al ., 1993; Kuipers et al ., 1995). cagA is located at one end of a 40 kb pathogenicity island (Censini et al ., 1996; Akopyants et al ., 1998) that is lacking from many putatively avirulent strains from the USA and Europe and is altered in other strains (Jenks et al ., 1998). Similarly to vacA , the population of cagA alleles differs between strains isolated from East Asians and strains from other ethnic groups (Miehlke et al ., 1996; Pan et al ., 1997; van der Ende et al ., 1998; van Doorn et al ., 1999). Thus, recombination might be frequent among strains isolated from individual ethnic groups but rare between strains from geographically separated ethnic groups. However, the same Chinese and Dutch strains that had exhibited geographical partitioning of cagA sequences did not show ethnic associations for sequences of the glmM housekeeping gene (van der Ende et al ., 1998). And the same polymorphisms were found in the vacA , flaA and flaB genes from South Africa and Germany or Canada (Suerbaum et al ., 1998). Recently, sequences from multiple housekeeping genes (multilocus sequence typing, MLST) have been used to define phylogenetic relationships within the weakly clonal species Neisseria meningitidis (Maiden et al ., 1998) and Streptococcus pneumoniae (Enright and Spratt, 1998). Most members of each clonal grouping within these species are uniform for multiple housekeeping genes; recombinant alleles are the exception rather than the rule. It seemed possible, especially considering the geographical distributions of cagA and vacA described above, that MLST would also allow the recognition of distinct clonal groupings within H. pylori . The conclusion that H. pylori is freely recombining was based on vacA , flaA and flaB sequences from strains isolated in Canada, Germany and South Africa; it seemed possible that analysis of a more global collection might reveal geographical associations that were lacking in the earlier sample. Furthermore, alleles of cytoplasmic housekeeping genes are more likely to be selectively neutral, and therefore more uniform, than is the case for genes for which there might be selection for diversity as a result of association with virulence (vacA ) or motility (flaA , flaB ). In this report, we analyse the sequence variability of both housekeeping and virulence-associated genes from H. pylori strains isolated in several continents and show that weakly clonal groupings can be identified in H. pylori despite a rich history of interstrain recombination. Results A global strain collection We combined strains of H. pylori that have been used for various laboratory studies with additional strains from diverse geographical locations to assemble a collection of 19 strains that may represent much of the global population Table 1. Strains of H. pylori whose sequences were analysed. Hybridization Strain Country R29 97-42 88-39 88-28 F32 HP1 #12 J166 5596 CC28 96-228 96-232 N6 SS1 NCTC11637 NCTC11638 25 BO265 26695 5-1 China Hong Kong Thailand Thailand Japan Peru Guatemala USA Gambia S. Africa Alaska Alaska France Australia Australia Australia Holland Germany UK Lithuania Clonal group Asia Asia Asia Asia Asia Asia Clone2 Clone2 Clone2 Clone2 cagA þ þ þ þ þ – þ þ þ þ þ þ þ þ þ þ þ þ þ þ vacA s1c s1c s1c s1c s1c s1c s1b s1b s1b s1a/s1b s1a s1a s1a s2 s1a s1a s1a s1b s1a s1a m2a m2a m2a m2a m1 m1 m1 m1 m1 m1 m1 m1 m2a m2a m1 m1 m1 m1 m1 m1 Source a Reference AvdE SAT SAT SAT TA DEB DEB SAT DEB SS DEB DEB DEB DEB DEB DEB AvdE SS DEB DEB van der Ende et al . (1998) Xu et al . (1997) Suerbaum et al . (1998) Ferrero et al . (1992) Lee et al . (1997) van der Ende et al . (1998) Suerbaum et al . (1998) Tomb et al . (1997) Kersulyte et al . (1999) Strain NCTC11638 has also been referred to as 8D3, J166 as 95-58 and HP1 as Peru2-11. a. Source, first letters of author’s name from whom the strain or its DNA may be obtained. 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470 Recombination and clonal groupings of H. pylori 461 Table 2. Gene fragments that were sequenced, ordered by position in the genome of strain 26695. Gene Number ureI mutY efp cagA cagA N cagA C ppa yphC vacA atpA trpC HP0071 HP0142 HP0177 HP547 HP0620 HP0834 HP0887 HP1134 HP1279 Position (in gene) 74 750 (1) 154 138 (409) 183 801 (76) 579 972 580 938 665 768 886 070 938 619 1197 139 1354 997 (52) (1018) (22) (4) (205) (512) (664) Length (bp) Gene product or inferred function 585 420 410 Urease accessory protein A/G-specific adenine glycosylase Elongation factor EF-P Cytotoxin-associated gene A N-terminal cagA fragment Central cagA fragment Inorganic pyrophosphatase GTPase Vacuolating cytotoxin ATP synthase, F1␣ Anthranilate isomerase 413 243 396 510 444 627 456 Accession AJ239701-19 AJ239645-63 AJ239607-25 AJ239683-700 AJ239720-37 AJ239626-44 AJ239664-82 AJ239550-68 AJ239588-606 AJ239569-87 Longer sequences for yphC (1071 bp) and cagA N (up to 464 bp) were trimmed to the lengths shown such that all sequences evaluated were from intragenic regions and did not contain any alignment gaps. Position (in gene), TIGR position of the N-terminal end of the fragment sequenced (position relative to start of gene, which was defined as 1). of this species (Table 1). Corresponding sequences from strain 26695, whose entire genome has been sequenced (Tomb et al ., 1997), were also included in the analysis. [The genome sequence of strain J99 (Alm et al ., 1999) was not included because it was published after completion of the analyses presented here.] The strains were tested by hybridization with probes specific for cagA and for the s (signal peptide) and m (internal) region alleles of vacA (Atherton et al ., 1995; van Doorn et al ., 1998a,b). All strains hybridized with the cagA probe except HP1, in which much of the cag pathogenicity island including part of cagA is deleted (D. Kersulyte and D. E. Berg, unpublished). All strains except one (CC28) yielded unique hybridization patterns with probes for the s and m regions of vacA . The results (Table 1) indicated that the vacA s1c allele was present exclusively in the six strains isolated from Asians, in agreement with published data (van Doorn et al ., 1998a), whereas the other vacA alleles were not associated exclusively with any geographical or ethnic source. Sequences Sequences were obtained from fragments of seven housekeeping genes scattered around the chromosome, from a fragment of vacA and from both N-terminal (termed cagA N) and internal (cagA C) fragments of cagA (see Experimental procedures ) (Table 2). Together with the published sequence from strain 26695, this resulted in a total of 20 sequences per gene fragment. One other strain (HP1) also lacked much of the cagA gene, as indicated above. Accordingly, our analyses were restricted to 19 sequences for cagA N and cagA C. Sequence variation After alignment, none of the sequences contained gaps or insertions. The average GC content of the entire H. pylori genome is 39% (Tomb et al ., 1997), but large regions can contain as little as 33% or as much as 43%. Most of the gene fragments analysed here possessed GC contents of ⬇39%, but three possessed GC contents of ⬇45% (atpA , efp , ureI ; Table 3). Only a few amino acid changes (non-synonymous sequence variation) were found in most gene fragments (mean percentage K a 0.2–3.1%), but the mean percentage K a of both cagA N and cagA C was 5.5%. All sequences showed high levels of synonymous sequence variation, with mean percentage K s values of 12–27%; individual Table 3. Properties of gene fragments sequenced. Gene %GC Mean percentage K s (range) Mean percentage K a (range) Ratio K s / K a Mean percentage compatibility H ratio atpA cagA N cagA C efp mutY ppa trpC ureI vacA yphC 46 37 38 46 39 41 39 44 43 38 12.1 ⫾ 3.2 (5.5–19.5) 12.4 ⫾ 6.2 (0–25.8) 24.1 ⫾ 18.6 (0–65.5) 14.6 ⫾ 5.7 (0–31.8) 22.3 ⫾ 7.3 (6.6–41.4) 10.5 ⫾ 4.0 (0–23.0) 26.6 ⫾ 8.6 (6.1–43.6) 11.6 ⫾ 4.8 (3.8–26.1) 17.8 ⫾ 6.1 (3.2–36.4) 17.5 ⫾ 5.5 (5.5–31.3) 0.2 ⫾ 0.2 5.6 ⫾ 3.8 5.3 ⫾ 4.2 0.3 ⫾ 0.3 1.9 ⫾ 0.8 0.5 ⫾ 0.3 3.1 ⫾ 1.2 0.7 ⫾ 0.4 2.6 ⫾ 1.5 1.8 ⫾ 0.8 49.6 2.2 4.5 48.7 11.9 22.9 8.6 16.6 6.9 9.7 48 62 77 50 54 54 55 45 52 42 0.76 0.17 0.15 0.60 0.66 0.68 0.46 0.79 0.69 0.67 (0–0.6) (0.6–12.4) (0–13.0) (0–1.3) (0.3–3.9) (0–1.6) (0.8–5.4) (0–1.9) (0–7.5) (0–4.2) 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470 462 M. Achtman et al. pairs of sequences differed at as many as 66% of synonymous sites. Because non-synonymous variation was frequent within cagA , the K s / K a ratio was considerably lower for both fragments from this gene (2-5) than from the other genes (9-50). All sequences were unique, except that identical efp sequences were found in strains 96-232 and 92-228 (both from Alaska), and identical ppa sequences were found in strains 96-228 (Alaska) and J166 (USA). These results indicate, in agreement with previous analyses (Garner and Cover, 1995; Kansau et al., 1996; Suerbaum et al ., 1998; van der Ende et al ., 1998; van Doorn et al ., 1998a, b), that identical sequences are rare among H. pylori. Frequency of recombination The sets of sequences were tested by the homoplasy test (Maynard Smith and Smith, 1998), which measures the frequency of the same nucleotide changes in different branches of a maximum parsimony phylogenetic tree (homoplasies). Homoplasies are caused by horizontal DNA transfer between unrelated strains (recombination) or by independent mutations. The homoplasy test calculates the H ratio, the frequency of observed synonymous homoplasies relative to the frequencies expected for the observed sequence variation under clonal descent or free recombination. If all descent is clonal (no interstrain recombination, H ratio ¼ 0.0), all homoplasies arise by the coincidental accumulation of mutations at the same sites in independent strains. In contrast, under free recombination (H ratio ¼ 1.0), mutations need only arise once and are then acquired by sequences in other genomes via recombination. Sequences from the flaA , flaB and vacA genes from strains isolated in Germany or Canada had yielded H ratios between 0.8 and 0.9 (Suerbaum et al ., 1998). With the current collection of 20 strains, the vacA gene fragment yielded an H ratio of 0.69, only slightly lower than observed previously, indicating frequent recombination. The H ratio for six of the housekeeping genes was between 0.60 and 0.79 (geometric mean of 0.69), indicating similarly high levels of recombination for these genes (Table 3). The seventh housekeeping gene, trpC , yielded an intermediate H ratio of 0.46. In contrast to these results, the two cagA gene fragments yielded H ratios of 0.15–0.17, among the lowest ratios yet observed in any species (Suerbaum et al ., 1998). Thus, most genes in H. pylori exhibit free recombination, while recombinants involving at least some alleles of the cagA gene are rare. The sequences were also analysed by compatibility matrices (Jakobsen and Easteal, 1996) (Fig. 1). This test scores whether pairs of informative sites are compatible with a maximum parsimony tree (clonal descent), as indicated in Fig. 1 by a white box, or incompatible (recombination), as indicated by a black box. The matrices are largely black, implying repeated recombination events, except for the cagA C gene fragment in which large regions are white, compatible with clonal descent. A numerical measure of the degree of clonal descent is the average frequency with which pairs of sites are compatible, called the mean percentage compatibility. The mean percentage compatibilities of the matrices were between 42% and 62% (Table 3), except for cagA C, which was 77%. These results indicate that interstrain recombination was frequent within all gene sets except cagA C. The data from both analyses concur that there has been extensive interstrain recombination in the H. pylori chromosome, with the notable exception of the cagA C gene fragment, and intermediate or contradictory values for the trpC and cagA N gene fragments. Clonal descent MLST clonal groupings in N. meningitidis and S. pneumoniae have been defined as sets of bacteria sharing a significant number of identical alleles (Enright and Spratt, 1998; Maiden et al ., 1998). Identical alleles are so rare in H. pylori that this criterion does not apply to this species. Instead, we examined the average sequence homology for the eight gene fragments for which sequences had been obtained from all 20 strains (thus excluding cagA ). Normalized distance matrices of the numbers of nucleotide differences between each of the pairs of sequences were calculated for each gene and averaged to yield a mean distance matrix. An UPGMA (unweighted pair group mean average) cluster analysis of this matrix showed that the six strains isolated from East Asians clustered together, including strain HP1, which was from an ethnic Japanese living in Peru (Fig. 2). We refer to this group of strains as the ‘Asian’ clone. Similarly, three other strains (#12, 5596 and CC28; isolated in Guatemala, The Gambia and South Africa respectively) also clustered together and separately from the other strains. These bacteria are referred to as ‘clone2’ (Fig. 2). Very similar results were obtained using a mean distance matrix from pairwise distances that had not been normalized. The sequences from each of the 10 gene fragments were also analysed individually by bootstrap tests of neighbour-joining trees and by maximum likelihood cluster analysis. The results showed that, for each gene fragment except ppa and atpA , two to six strains of the Asian clone clustered together with bootstrap values of at least 47% (Table 4). Furthermore, for each of the six Asian strains, four to eight of the 10 gene fragments clustered with those from other strains of the Asian clone. Similarly, except for ureI and ppa , all gene fragments usually clustered together for the strains of clone2 (Table 5). These clusters also included strain J166 for four of the gene fragments (Table 5). This group of four strains was also uniform for 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470 Recombination and clonal groupings of H. pylori 463 Asian clone clone2 Fig. 1. Compatibility matrices of 10 gene fragments from 20 strains of H. pylori . These matrices depict the informative sites within the sequence sets as white boxes when pairs of sites are compatible with a parsimony tree and as black boxes for pairs of sites that are incompatible. Black regions indicate runs of incompatible sites that probably arose by repeated recombination, whereas white regions (in particular in the cagA C gene fragment) indicate that those runs of sites are compatible with clonal descent by the sequential accumulation of mutations. 0.70 0.65 0.60 0.55 0.50 0.45 0.40 Normalized Linkage Distance 0.35 0.30 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470 0.25 #12 5596 CC28 J166 NCTC11638 NCTC11637 SS1 5-1 96-228 96-232 26695 25 BO265 N6 F32 R29 88-28 97-42 HP1 88-39 Fig. 2. An UPGMA dendrogram of the mean normalized pairwise differences between 20 alleles for eight gene fragments (excluding cagAN and cagA C). The number of nucleotide differences between each pair of alleles was normalized to a maximal value of 1.0 before averaging over the eight gene fragments. The resulting matrix was used for a cluster analysis using the unweighted pair–group mean average clustering method. Note that strain J166 is not in clone2 according to these results, although many gene fragments from J166 did cluster with clone2 in dendrograms of the individual genes (Table 5). 464 M. Achtman et al. Table 4. Gene fragments whose sequences cluster significantly within six strains isolated from East Asians. Gene ureI mutY efp cagA N cagA C ppa yphC vacA atpA trpC Position (kb) 75 154 184 580 581 666 886 939 1197 1355 Strains F32 R29 88-28 97-42 HP1 88-39 þ þ þ þ þ þ þ þ þ þ þ þ þ þ ND þ þ þ þ þ ND þ þ þ þ þ þ þ þ þ þ þ þ No. of strains Bootstrap (%) 3 65 5 95 2 63 5 99 5 100 5 64 6 47 0 þ þ þ þ þ 0 5 98 No. of genes 6 7 7 8 4 4 8/10 Sequences from Asian strains that clustered in neighbour-joining dendrograms with a percentage bootstrap value of at least 47% are indicated by a ‘þ’, and other sequences are indicated by an empty space. ND, not tested because a sequence was not obtained. the vacA s1b allele, which was otherwise only found in one other strain (BO265). The clonal associations seen here would probably not have been considered significant if only single housekeeping genes had been analysed. The dendrograms of the individual genes were quite different (see Fig. 3 for examples), and strains that clustered significantly for individual genes (e.g. the two unlabelled clusters for mutY in Fig. 3) were often unrelated for other genes. The dendrogram of the cagA gene fragments differed from the dendrograms for other loci in that three distinct branches were observed, one of which included the five cagAþ Asian strains and a second contained only strain J166, which was otherwise often associated with clone2 (Fig. 3). Discussion We studied the sequences of fragments of seven housekeeping and two virulence-associated genes from 20 strains of H. pylori isolated from diverse geographical regions in order to assess its population genetic structure and evolutionary history. Two weakly clonal groupings were found, superimposed on a pattern of free recombination, an outcome that provides a basis for further population genetic analyses. Clonality within panmictic species H. pylori does not possess a strong clonal structure (Go et al ., 1996; Salaun et al ., 1998), as is also the case with species such as Neisseria gonorrhoeae (O’Rourke and Spratt, 1994) and Bacillus subtilis (Istock et al ., 1992), whose population genetic structure has been called panmictic (Maynard Smith et al ., 1993). In panmictic species, recombination is sufficiently frequent and unifying selection sufficiently rare that clonal groupings are not expected to survive except over relatively short periods, such as in cases of transmission within families (Suerbaum et al ., 1998) or between sexual contacts (O’Rourke et al ., 1995). Furthermore, numerous distinct alleles are generated as a result of repeated genetic exchange between divergent lineages. Because most alleles from independent strains of H. pylori are unique, it initially seemed unlikely that multilocus sequence typing (Maiden et al ., 1998) would reveal any clonal groupings within H. pylori . However, the results presented here show that widespread clonal groupings can be discerned even within a largely panmictic species such as H. pylori . We note that a widespread clone has also been found within the supposedly panmictic N. gonorrhoeae (Gutjahr et al ., 1997) and suggest that, with time, more clonal groupings will be Table 5. Clone2, a group of three or four strains in which some gene fragments cluster significantly. Gene ureI mutY efp cagA N cagA C ppa yphC vacA atpA trpC Position (kb) 75 154 184 580 581 666 886 939 1197 1355 þ þ þ þ þ þ þ þ þ þ þ þ þ þ þ þ þ þ þ 6 8 6 4 4 58 2 90 2 68 3 72 4 66 4 74 8/10 Strain #12 5596 CC28 J166 No. of strains Bootstrap (%) 0 þ þ þ þ þ 2 55 0 3 96 No. of genes Details are as in Table 4. 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470 Recombination and clonal groupings of H. pylori 465 J166 cagA 100 100 96-232 Asia 26695 F32 R29 25 96-228 NCTC11637 5-1 B0265 NCTC11638 #12 5596 SS1 88-39 88-28 100 CC28 N6 97-42 NCTC11637 NCTC11638 58 25 58 26695 clone2 5596 clone2 88-39 B0265 J166 5596 CC28 5-1 25 #12 #12 88-39 HP1 97-42 CC28 N6 66 Asia 95 J166 26695 96-228 SS1 B0265 F32 R29 96-232 88-28 96-228 5-1 mutY SS1 88-28 HP1 F32 96-232 R29 NCTC11638 97-42 N6 NCTC11637 75 Asia 25 atpA 0.10 Fig. 3. Maximum likelihood unrooted dendrograms for gene fragments of cagA (cagAN appended to cagA C), mutY and atpA . Bacterial strains are indicated by small italic letters. Bootstrap values over 50% from neighbour-joining trees based on Jukes–Cantor distance matrices are indicated by bold lettering within the dotted boxes. Asian strain clusters are indicated by the designation Asia. This group of strains is also boxed for the atpA locus, although the bootstrap value was not significant. The horizontal line at the bottom is a scale line for genetic distance. recognized within other largely panmictic bacterial species. Both the Asian clone and clone2 are currently represented by only a few isolates, and experiments with additional bacteria from additional geographical regions and those that lack the cag pathogenicity island are needed to better define their variability and other properties. However, the existence of these weakly clonal groupings seems inescapable, because sequence polymorphisms in multiple housekeeping genes scattered around the chromosome yielded relatively consistent results. These results also demonstrate the power of multilocus sequence analysis for the recognition of clonal groupings, even within panmictic species. Clonal groupings versus selection for particular alleles The cagA gene is in a pathogenicity island that is associated with virulence. Large sequence differences distinguish cagA gene fragments from Asian strains versus others (van 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470 der Ende et al ., 1998) (Fig. 3). The synonymous sequence diversity (K s ) of cagA is not unusual in comparison with other chromosomal genes within H. pylori (Table 3), and the age of all these genes may be similar. Furthermore, the cagA sequences from the Asian clone are not particularly uniform (Fig. 3), arguing against a recent selective sweep of the cag pathogenicity island, such as has been invoked to explain the relative lack of sequence polymorphism within the gapA gene of Escherichia coli (Guttman and Dykhuizen, 1994). The alleles of five of the seven housekeeping gene fragments clustered together significantly within at least two of the six Asian strains (Fig. 2, Table 4). It therefore seems likely that the different cagA or vacA populations in Asia and other continents (Ito et al ., 1998; van der Ende et al ., 1998) reflect an old clonal expansion. Similar conclusions have been drawn for Mycobacterium tuberculosis (Van Soolingen et al ., 1995), in which the insertion sites for IS6110 were conserved in a considerable proportion of strains isolated from East Asians. Geographical partitioning has also been documented for type b Haemophilus influenzae (Musser et al ., 1990), for which 466 M. Achtman et al. unusual electrophoretic types were commonly isolated in certain countries to which recent human migration has been rare. We have combined the data on cagA C presented here with that published elsewhere (van der Ende et al ., 1998) and with several unpublished cagA C sequences (data not shown). The three distinct branches in the cagA dendrogram (Fig. 3) remain unchanged with these 75 sequences. Analysis of the larger set of sequences has also indicated that distinct branches of the cagA phylogenetic dendrogram can largely be accounted for by clustered amino acid polymorphisms that are uniform within each individual branch (data not shown). Specific amino acid polymorphisms might reflect different, specific interactions between the CagA protein and physiologically or genetically distinct human hosts, and selection for such polymorphisms may have helped to maintain clonal groupings despite frequent recombination. Elucidation of such interactions will depend on identifying the function of the CagA protein in human colonization and disease. The data presented here also indicate that particular allelic variants in the signal peptide of the vacA gene are associated with each of the two clonal groupings. The different m regions of VacA seem to correlate with different target cell specificities of the toxin (Paggliaccia et al ., 1998). Age of H. pylori How old is H. pylori and how old is the Asian clone? The mean K s values of the various genes in Table 3 and similar values for vacA , flaA and flaB (Suerbaum et al ., 1998) are higher than for most housekeeping genes in Drosophila melanogaster, E. coli and N. meningitidis. This observation suggests that H. pylori has a long history, possibly of millions of years. Sequence variation within the Asian clone was less than within the entire species, but sufficiently high that the Asian clone of H. pylori may have accompanied Homo sapiens when they colonized Asia some 40 000 or more years ago. Alternatively, a strain variant of H. pylori that is particularly well adapted to East Asians might have spread through that population more recently, displacing bacteria that were formerly present. Apparently, subsequent recombination has not been sufficiently frequent to obscure the evidence for clonal descent. The age of clone2 may be similar to that of the Asian clone because the sequence variability of the two sets of strains was similar (Fig. 2). Free recombination versus clonality The data presented here provide evidence for both extensive recombination and clonality. One possible explanation for such mutually incompatible models is that their relative importance for H. pylori differs with the geographical region. For example, lower H ratios were observed with strains from South Africans (including CC28, which belongs to clone2) than with strains isolated in Canada and Germany (Suerbaum et al ., 1998). Not all of H. pylori is necessarily clonal: remnants of clonal descent were not detected among the 10 strains that were not assigned to the Asian clone or clone2. Possibly, the clonal members differ in the frequency with which genes transferred from other strains of H. pylori are fixed. Such differences are known in other species, such as N. meningitidis, in which seven distinct clonal groupings of hypervirulent bacteria exist (Maiden et al ., 1998) even though recombination is frequent (Morelli et al ., 1997) and many other isolates are panmictic (Maynard Smith et al ., 1993). Finally, the spread of H. pylori from one human host to the next may be sufficiently slow that insufficient time has elapsed for recombination to remove all traces of clonal descent. Frequencies of recombination We used both the homoplasy test and compatibility matrices to ensure that our conclusions on recombination did not reflect the limitations of a single method. The results with both methods were consistent with two exceptions, namely cagA N, which is clonal according to the H ratio, but seems from compatibility matrices to recombine frequently, and trpC , which has undergone less recombination than the other housekeeping genes based on its H ratio, but not according to compatibility matrices. Possibly, these discrepancies can be accounted for by the high K a values for both exceptional genes. Compatibility matrices evaluate all informative sequence variation and would be affected by the higher level of non-synonymous variation in these gene fragments, whereas the homoplasy test only uses synonymous, informative polymorphisms. Both methods agreed that cagA C was highly clonal and that recombinants were very rare. Thus, different genes (or gene fragments) within one chromosome of a bacterial species may be associated with different degrees of recombination. In other species, elevated levels of polymorphism have been attributed to linkage to gene clusters that are imported from other species (gnd linked to rfb ) (Bisercic et al ., 1991) or to diversifying selection (Smith et al ., 1995; Boyd et al ., 1997). However, the genes we chose for sequencing encode cyptoplasmic enzymes, and they are not linked to genes encoding outer membrane or secreted proteins that might be under selection. We have also argued above that the evidence does not support a selective sweep having caused the clonal descent associated with the cagA gene. Thus, the mechanisms responsible for different degrees of recombination associated with individual genes in H. pylori remain to be elucidated. 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470 Recombination and clonal groupings of H. pylori 467 Summary Two clonal groupings, the Asian clone and clone2, were detected in a collection of H. pylori from several parts of the world. These clonal groupings are widespread and have probably existed for a long time. Recombination seems to have occurred on a global scale for most genes studied, but has not totally disrupted the relationships with the clonal groupings. The results presented here should stimulate further extensive analyses of the geographical specialization and evolution of this species. Experimental procedures Bacterial strains Nineteen strains were chosen from diverse ethnic groups and countries to represent the geographical diversity of H. pylori (Table 1). The collection deliberately included many strains Table 6. Oligonucleotide primers. that had already been used in previous analyses. Because of our intention of examining sequence variation of the cagA gene, we only included strains thought to be Cagþ. The 19 strains were purified by single-colony isolation before further analysis. These strains, or their DNAs, are available from the individual sources shown in Table 1. DNA, genes, PCR products and sequences DNA was purified from cultures of H. pylori by the CTAB method (Ausubel et al ., 1994) or by other standard methods appropriate for PCR amplification of DNA fragments and sent in dried form to each of the laboratories of the consortium. We chose seven housekeeping genes for sequencing (Table 2), using the criteria that these genes be both widely separated on the chromosome of the strain (26695) whose genome has been sequenced (Tomb et al ., 1997) and not adjacent to genes encoding putative outer membrane, secreted or hypothetical proteins that might be under selective pressure. Based on the sequences in the genome of strain 26695, amplification and Name (orientation) Purpose a Sequence atpA AtpA1(þ) AtpA6(¹) AtpA7(þ) AtpA4(¹) A A S S GCTTAAATGGTGTGATGTCG AATGGGCAAGGGCGAATAAG CGCTTTGGGTGAGCCTATTG TGCCCGTCTGTAATAGAAATG cagAN cagP1(þ) cagF2(þ) cagB1(¹) A,S A,S A,S CCATTTTAAGCAACTCCATAAACC AAGATACCGATAGGTATGAA TCTGCCAAACAATCTTTTGCAG cagA C cagA5(þ) cagA2(¹) A,S A,S GGCAATGGTGGTCCTGGAGCTAGGC GGAAATCTTTAATCTCAGTTCGG efp F01(þ) R02(¹) F02(þ) R01(¹) A A S S GGCAATTGGGATGAGCGAGCTC CTTCACCTTTTCAAGATACTC GGGCTTGAAAATTGAATTGGGCGG GTATTGACTTTAATGATCTCACCC mutY mutY101(þ) mutY102(¹) A,S A,S AGCGAAGTGATGAGCCAACAAAC AAAGGGCAAATCGCACATTTGGG ppa ppa1(þ) ppa2(¹) ppa3(¹) A,S A,S A,S GTGAGCCATGACGCTGATTCTTTGT GCCTTGATAGGCTTTTATCGCTTTCT GCCATTTCACACCAACACCCAAT trpC HptrpC-1(þ) HptrpC-6(þ) HptrpC-5(¹): HptrpC-9(þ) HptrpC-7(¹): HptrpC-8(¹): A A,S A S S S CAAGCTCCTAGAAGTCTCTG TAGAATGCAAAAAAGCATCGCCCTC CCCAGCTAGCATGAAAGG CGCTTGCTCAA(AG)CTCCAATACGAC TAAGCCCGCACACTTTATTTTCGCC GTCGTATTGGCG(CT)TTGAGCAAGCG vacA OLHPVacA-3(þ) OLHPVacA-4(¹) A,S A,S ACAACCGTGATCATTCCAGC ATACGCTCCCACGTATTGC ureI Hp71S1(þ) Hp71S2(þ) Hp71S3(þ) Hp71AS1(¹) A,S A,S A,S A,S CAATAAAGTGAGCTTGGCGCAACT GTTATTCGTAAGGTGCGTTTGTTG GGCAATGCTAGGACTTGT TCCCTTAGATTGCCAACTAAACGC yphC F1(þ) F3(þ) R4(¹) R5(¹) A,S A,S A,S A,S CACTATTACCACGCCTATTTTTTTGAC CTTATGCGTTTTCTTCTTTTGG AAGCAGCTGGTTGTGATCACGGGGGC TTTCTARGCTTTCTAAAATATC Gene a. A, amplification; S, sequencing. 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470 468 M. Achtman et al. sequencing primers were designed such that complete doublestranded sequences could be obtained with a single sequencing run in each direction. For several gene fragments, sequence diversity prevented amplification from all strains with a single pair of primers. Additional primers were designed from the available sequences and, where necessary, after cloning and sequencing PCR amplicons that had been amplified inefficiently. The complete set of primers used for the seven housekeeping genes is shown in Table 6. Two gene fragments of cagA (cagA N, which corresponds to the N-terminus, and cagA C, which is slightly C-terminal of cagA N) and a vacA gene fragment were also sequenced to allow comparison with previous results (van der Ende et al ., 1998; van Doorn et al ., 1999). The primers used to amplify these genes are also included in Table 6 for convenience, although they have been published previously (Pan et al ., 1997; Suerbaum et al ., 1998; van der Ende et al ., 1998; van Doorn et al ., 1999). Automated sequencing of PCR products from both strands was performed using standard protocols, and sequences were deposited in GenBank under the designations listed in Table 2. vacA s and m typing was performed by hybridization as described previously (van Doorn et al ., 1998b). Phylogenetic and other analyses Sequence alignments in GCG MSF format were converted to MEGA format using the PSFIND program and evaluated with the homoplasy test (Maynard Smith and Smith, 1998), using the HOMOPLASY program (Suerbaum et al ., 1998). PSFIND and HOMOPLASY are both available by anonymous ftp at ftp:/ / novell-del-valle.rz-berlin.mpg.de/software/. Homoplasy ratios are the means of five repetitions of the homoplasy test. Mean K a and K s distances, using Jukes–Cantor corrections (Jukes and Cantor, 1969), were calculated using DNASP 2.2 (Rozas and Rozas, 1997). Compatibility matrices and mean percentage compatibility values were calculated using the program RETICULATE (Jakobsen and Easteal, 1996). Pairwise distance matrices based on numbers of different nucleotides (p-distance) were calculated using the MEGA program (Kumar et al ., 1993) and normalized to a maximum of 1.0 for each matrix. These matrices were averaged (Excel, Microsoft) and used to generate an UPGMA tree (Statistica, Statsoft). Bootstrap tests (Felsenstein, 1985) (500 repetitions) of neighbour-joining trees based on distances with Jukes–Cantor corrections were performed using MEGA . Maximum-likelihood trees and graphical outputs were calculated using ARB (http:/ /www.mikro.biologie. tu-muenchen.de). Percentage GC was calculated using the program CODONW (http:/ /molbiol.ox.ac.uk/cu/). Acknowledgements We gratefully acknowledge the expert assistance of Monique Feller, Susanne Friedrich, Ricardo Sanna, Kerstin Zurth and grant support from the Deutsche Forschungsgemeinschaft (Su 133/3-1, Ac 36/10-1) and the US Public Health Service (NIH grants AI38166, DK48029 and HG00820). We thank the following for H. pylori strains studied in this work: A. J. Parkinson (96-228, 96-232), R. H. Gilman (HP1), A. Lee (SS1), C. Gareia and O. Torres (#12) and E. Kunstmann (BO265, CC28). References Akopyanz, N., Bukanov, N.O., Westblom, T.U., and Berg, D.E. (1992a) PCR-based RFLP analysis of DNA sequence diversity in the gastric pathogen Helicobacter pylori. Nucleic Acids Res 11: 6221–6225. Akopyanz, N., Bukanov, N.O., Westblom, T.U., Kresovich, S., and Berg, D.E. (1992b) DNA diversity among clinical isolates of Helicobacter pylori detected by PCR-based RAPD fingerprinting. Nucleic Acids Res 20: 5137–5142. Akopyants, N.S., Clifton, S.W., Kersulyte, D., Crabtree, J.E., Youree, B.E., Reece, C.A., et al . (1998) Analyses of the cag pathogenicity island of Helicobacter pylori. Mol Microbiol 28: 37–53. Alm, R.A., Ling, L.-S.L., Moir, D.T., King, B.L., Brown, E.D., Doig, P.C., et al . (1999) Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature 397: 176–180. Atherton, J.C. (1998) H. pylori virulence factors. Br Med Bull 54: 105–120. Atherton, J.C., Cao, P., Peek, R.M.J., Tummuru, M.K., Blaser, M.J., and Cover, T.L. (1995) Mosaicism in vacuolating cytotoxin alleles of Helicobacter pylori . Association of specific vacA types with cytotoxin production and peptic ulceration. J Biol Chem 270: 17771–17777. Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A., et al . (1994) Current Protocols in Molecular Biology . New York: John Wiley & Sons. Bamford, K.B., Bickley, J., Collins, J.S., Johnston, B.T., Potts, S., Boston, V., et al . (1993) Helicobacter pylori : comparison of DNA fingerprints provides evidence for intrafamilial infection. Gut 34: 1348–1350. Bisercic, M., Feutrier, J.Y., and Reeves, P.R. (1991) Nucleotide sequences of the gnd genes from nine natural isolates of Escherichia coli : evidence of intragenic recombination as a contributing factor in the evolution of the polymorphic gnd locus. J Bacteriol 173: 3894–3900. Blaser, M.J. (1997) Ecology of Helicobacter pylori in the human stomach. J Clin Invest 100: 759–762. Boyd, E.F., Li, J., Ochman, H., and Selander, R.K. (1997) Comparative genetics of the inv-spa invasion gene complex of Salmonella enterica. J Bacteriol 179: 1985–1991. Censini, S., Lange, C., Xiang, Z., Crabtree, J.E., Ghiara, P., Borodovsky, M., et al . (1996) cag , a pathogenicity island of Helicobacter pylori , encodes type-I specific and diseaseassociated virulence factors. Proc Natl Acad Sci USA 93: 14648–14653. Covacci, A., Censini, S., Bugnoli, M., Petracca, R., Burroni, D., Macchia, G., et al . (1993) Molecular characterization of the 128-kDa immunodominant antigen of Helicobacter pylori associated with cytotoxicity and duodenal ulcer. Proc Natl Acad Sci USA 90: 5791–5795. Cover, T.L. (1996) The vacuolating cytotoxin of Helicobacter pylori. Mol Microbiol 20: 241–246. van Doorn, L.J., Figueiredo, C., Sanna, R., Pena, S., Midolo, P., Ng, E.K.W., et al . (1998a) Expanding allelic diversity of Helicobacter pylori vacA . J Clin Microbiol 36: 2597–2603. van Doorn, L.J., Figueiredo, C., Rossau, R., Jannes, G., Van Asbroeck, M., Sousa, J.C., et al . (1998b) Typing of Helicobacter pylori vacA gene and detection of cagA gene by PCR and reverse hybridization. J Clin Microbiol 36: 1271–1276. 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470 Recombination and clonal groupings of H. pylori 469 van Doorn, L.J., Figueiredo, C., Sanna, R., Blaser, M.J., and Quint, W.G.V. (1999) Distinct variants of Helicobacter pylori cagA are associated with vacA subtypes. J Clin Microbiol (in press). van der Ende, A., Pan, Z.J., Bart, A., Van der Hulst, R.W., Feller, M., Xiao, S.D., et al . (1998) cagA -positive Helicobacter pylori populations in China and The Netherlands are distinct. Infect Immun 66: 1822–1826. van der Ende, A., Rauws, E.A.J., Feller, M., Mulder, C.J.J., Tytgat, G.N.J., and Dankert, J. (1996) Heterogeneous Helicobacter pylori isolates from members of a family with a history of peptic ulcer disease. Gastroenterology 111: 638–647. Enright, M.C., and Spratt, B.G. (1998) A multilocus sequence typing scheme for Streptococcus pneumoniae : identification of clones associated with serious invasive disease. Microbiology 144: 3049–3060. Felsenstein, J. (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39: 783–791. Ferrero, R.L., Cussac, V., Courcoux, P., and Labigne, A. (1992) Construction of isogenic urease-negative mutants of Helicobacter pylori by allelic exchange. J Bacteriol 174: 4212–4217. Garner, J.A., and Cover, T.L. (1995) Analysis of genetic diversity in cytotoxin-producing and non-cytotoxin-producing Helicobacter pylori strains. J Infect Dis 172: 290–293. Go, M.F., Kapur, V., Graham, D.Y., and Musser, J.M. (1996) Population genetic analysis of Helicobacter pylori by multilocus enzyme electrophoresis: extensive allelic diversity and recombinational population structure. J Bacteriol 178: 3934–3938. Gutjahr, T.S., O’Rourke, M., Ison, C.A., and Spratt, B.G. (1997) Arginine-, hypoxanthine-, uracil-requiring isolates of Neisseria gonorrhoeae are a clonal lineage within a nonclonal population. Microbiology 143: 633–640. Guttman, D.S., and Dykhuizen, D.E. (1994) Detecting selective sweeps in naturally occurring Escherichia coli. Genetics 138: 993–1003. Istock, C.A., Duncan, K.E., Ferguson, N., and Zhou, X. (1992) Sexuality in a natural population of bacteria – Bacillus subtilis challenges the clonal paradigm. Mol Ecol 1: 93–103. Ito, Y., Azuma, T., Ito, S., Suto, H., Miyaji, H., Yamazaki, Y., et al . (1998) Full-length sequence analysis of the vacA gene from cytotoxic and noncytotoxic Helicobacter pylori. J Infect Dis 178: 1391–1398. Jakobsen, I.B., and Easteal, S. (1996) A program for calculating and displaying compatibility matrices as an aid in determining reticulate evolution in molecular sequences. CABIOS 12: 291–295. Jenks, P.J., Mégraud, F., and Labigne, A. (1998) Clinical outcome after infection with Helicobacter pylori does not appear to be reliably predicted by the presence of any of the genes of the cag pathogenicity island. Gut 43: 752–758. Jukes, T.H., and Cantor, C.R. (1969) Evolution of protein molecules. In Mammalian Protein Metabolism . Munro, H.N. (ed.). New York: Academic Press, pp. 21–132. Kansau, I., Raymond, J., Bingen, E., Courcoux, P., Kalach, N., Bergeret, M., et al . (1996) Genotyping of Helicobacter pylori isolates by sequencing of PCR products and comparison with the RAPD technique. Res Microbiol 147: 661–669. 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470 Kersulyte, D., Chalkauskas, H., and Berg, D.E. (1999) Emergence of recombinant strains of Helicobacter pylori during human infection. Mol Microbiol 31: 31–43. Kuipers, E.J., Perez-Perez, G.I., Meuwissen, S.G., and Blaser, M.J. (1995) Helicobacter pylori and atrophic gastritis: importance of the cagA status. J Natl Cancer Inst 87: 1777–1780. Kumar, S., Tamura, K., and Nei, M. (1993) MEGA: Molecular Evolutionary Genetics Analysis , Version 1.01 . University Park, PA: The Pennsylvania State University. Lee, A., O’Rourke, J., De Ungria, M.C., Robertson, B., Daskalopolous, G., and Dixon, M.F. (1997) A standardized mouse model of Helicobacter pylori infection: introducing the Sydney strain. Gastroenterology 112: 1386–1397. Maiden, M.C.J., Bygraves, J.A., Feil, E., Morelli, G., Russell, J.E., Urwin, R., et al . (1998) Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc Natl Acad Sci USA 95: 3140–3145. Majewski, S.I., and Goodwin, C.S. (1988) Restriction endonuclease analysis of the genome of Campylobacter pylori with a rapid extraction method: evidence for considerable genomic variation. J Infect Dis 157: 465–471. Maynard Smith, J., and Smith, N.H. (1998) Detecting recombination from gene trees. Mol Biol Evol 15: 590–599. Maynard Smith, J., Smith, N.H., O’Rourke, M., and Spratt, B.G. (1993) How clonal are bacteria? Proc Natl Acad Sci USA 90: 4384–4388. Miehlke, S., Kibler, K., Kim, J.G., Figura, N., Small, S.M., Graham, D.Y., et al . (1996) Allelic variation in the cagA gene of Helicobacter pylori obtained from Korea compared to the United States. Am J Gastroenterol 91: 1322–1325. Morelli, G., Malorny, B., Müller, K., Seiler, A., Wang, J., del Valle, J., et al . (1997) Clonal descent and microevolution of Neisseria meningitidis during 30 years of epidemic spread. Mol Microbiol 25: 1047–1064. Musser, J.M., Kroll, J.S., Granoff, D.M., Moxon, E.R., Brodeur, B.R., Campos, J., et al . (1990) Global genetic structure and molecular epidemiology of encapsulated Haemophilus influenzae. Rev Infect Dis 12: 75–111. O’Rourke, M., and Spratt, B.G. (1994) Further evidence for the non-clonal population structure of Neisseria gonorrhoeae : extensive genetic diversity within isolates of the same electrophoretic type. J Gen Microbiol 140: 1285–1290. O’Rourke, M., Ison, C.A., Renton, A.M., and Spratt, B.G. (1995) Opa-typing: a high-resolution tool for studying the epidemiology of gonorrhoea. Mol Microbiol 17: 865–875. Paggliaccia, C., de Bernard, M., Lupetti, P., Ji, X., Burroni, D., Cover, T.L., et al . (1998) The m2 form of the Helicobacter pylori cytotoxin has cell type-specific vacuolating activity. Proc Natl Acad Sci USA 95: 10212–10217. Pan, Z.J., Van der Hulst, R.W., Feller, M., Xiao, S.D., Tytgat, G.N., Dankert, J., et al . (1997) Equally high prevalences of infection with cagA -positive Helicobacter pylori in Chinese patients with peptic ulcer disease and those with chronic gastritis-associated dyspepsia. J Clin Microbiol 35: 1344– 1347. Pan, Z.-J., Berg, D.E., van der Hulst, R.W.M., Su, W.-W., Raudonikiene, A., Xiao, S.-D., et al . (1998) Prevalence of vacuolating cytotoxin production and distribution of distinct vacA alleles in Helicobacter pylori from China. J Infect Dis 178: 220–226. 470 M. Achtman et al. Rozas, J., and Rozas, R. (1997) DnaSP, Version 2.0: a novel software package for extensive molecular population genetics analyses. CABIOS 13: 307–311. Salaun, L., Audibert, C., Le Lay, G., Burucoa, C., Fauchere, J.L., and Picard, B. (1998) Panmictic structure of Helicobacter pylori demonstrated by the comparative study of six genetic markers. FEMS Microbiol Lett 161: 231–239. Smith, N.H., Maynard Smith, J., and Spratt, B.G. (1995) Sequence evolution of the porB gene of Neisseria gonorrhoeae and Neisseria meningitidis : evidence of positive Darwinian selection. Mol Biol Evol 12: 363–370. Van Soolingen, D., Qian, L., De Haas, P.E.W., Douglas, J.T., Traore, H., Portaels, F., et al . (1995) Predominance of a single genotype of Mycobacterium tuberculosis in countries of East Asia. J Clin Microbiol 33: 3234–3238. Suerbaum, S., Maynard Smith, J., Bapumia, K., Morelli, G., Smith, N.H., Kunstmann, E., et al . (1998) Free recombination within Helicobacter pylori. Proc Natl Acad Sci USA 95: 12619–12624. Tomb, J.F., White, O., Kerlavage, A.R., Clayton, R.A., Sutton, G.G., Fleischmann, R.D., et al . (1997) The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388: 539–547. Tummuru, M.K., Cover, T.L., and Blaser, M.J. (1993) Cloning and expression of a high-molecular-mass major antigen of Helicobacter pylori : evidence of linkage to cytotoxin production. Infect Immun 61: 1799–1809. Xu, Q., Peek, Jr, R.M., Miller, G.G., and Blaser, M.J. (1997) The Helicobacter pylori genome is modified at CATG by the product of hpyIM. J Bacteriol 179: 6807–6815. 䊚 1999 Blackwell Science Ltd, Molecular Microbiology, 32, 459–470