* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download genes, pseudogenes, deletions, insertion elements and DNA islands
Biology and consumer behaviour wikipedia , lookup
Ridge (biology) wikipedia , lookup
Epigenetics in learning and memory wikipedia , lookup
Genomic imprinting wikipedia , lookup
DNA barcoding wikipedia , lookup
Cancer epigenetics wikipedia , lookup
DNA vaccination wikipedia , lookup
Gene nomenclature wikipedia , lookup
Epigenetics of neurodegenerative diseases wikipedia , lookup
Cell-free fetal DNA wikipedia , lookup
Genetic code wikipedia , lookup
Molecular cloning wikipedia , lookup
Polycomb Group Proteins and Cancer wikipedia , lookup
Extrachromosomal DNA wikipedia , lookup
Minimal genome wikipedia , lookup
Transposable element wikipedia , lookup
Primary transcript wikipedia , lookup
Human genome wikipedia , lookup
Gene desert wikipedia , lookup
Deoxyribozyme wikipedia , lookup
Gene expression programming wikipedia , lookup
Genetic engineering wikipedia , lookup
Cre-Lox recombination wikipedia , lookup
No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup
Metagenomics wikipedia , lookup
Nutriepigenomics wikipedia , lookup
Genome (book) wikipedia , lookup
Microsatellite wikipedia , lookup
Genome evolution wikipedia , lookup
Non-coding DNA wikipedia , lookup
Gene expression profiling wikipedia , lookup
Epigenetics of human development wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Pathogenomics wikipedia , lookup
Genome editing wikipedia , lookup
History of genetic engineering wikipedia , lookup
Point mutation wikipedia , lookup
Designer baby wikipedia , lookup
Helitron (biology) wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Molecular Microbiology (1999) 33(3), 635±650 The opcA and CopcB regions in Neisseria: genes, pseudogenes, deletions, insertion elements and DNA islands Peixuan Zhu, Giovanna Morelli and Mark Achtman* Max-Planck Institut fuÈr molekulare Genetik, Ihnestraûe 73, D-14195 Berlin, Germany. Summary Previous data have indicated that the opc gene encoding an immunogenic invasin is speci®c to Neisseria meningitidis (Nm) and is lacking in Neisseria gonorrhoeae (Ng). The data presented here show that Nm and Ng both contain two paralogous opclike genes, opcA, corresponding to the former opc gene, and CopcB, a pseudogene. The predicted OpcA and OpcB proteins possess transmembrane regions with conserved non-polar faces but differ extensively in four of the ®ve surface-exposed loops. Gonococcal OpcA was expressed weakly under in vitro conditions, and it is unknown whether these bacteria can express this protein at high levels. Analysis of the sequences ¯anking opcA and CopcB revealed a framework of conserved housekeeping genes interspersed with DNA islands. These regions also contained several pseudogenes, deletions and IS elements, attesting to considerable genome plasticity. Both opcA and CopcB are located on DNA islands that have probably been imported from unrelated bacteria. A third island encodes the dcmD/ dcrD R /M genes in Ng versus a small open reading frame in most strains of Nm. Rare strains of Nm were identi®ed in which the R /M island has been imported. DNA islands in Nm and Ng seem to have been acquired by recombination via conserved ¯anking housekeeping genes rather than by insertion of mobile genetic elements. Introduction The physical chromosome map order of the Gramnegative bacterial pathogens Neisseria gonorrhoeae (Ng) and Neisseria meningitidis (Nm) is similar (Dempsey et al., 1995), and their housekeeping genes are 98% Received 23 March, 1999; revised 12 May, 1999; accepted 20 May, 1999. *For correspondence. E-mail [email protected]; Tel. (49) 30 8413 1262; Fax (49) 30 8413 1385. Q 1999 Blackwell Science Ltd homologous (Zhou and Spratt, 1992). Both species colonize only humans, but Ng is specialized for the mucosa of the urogenital tract and causes gonorrhoea and pelvic in¯ammatory disease, whereas Nm is specialized for the mucosa of the nasopharynx and causes meningitis and septicaemia. Strains of Nm contain genes within the cps region whose products result in the synthesis of capsular polysaccharide (Petering et al., 1996), whereas different genes are found at the corresponding location in Ng. Several other genes are also thought to be speci®c to Nm versus Ng: mip (Sampson and Gotschlich, 1992); frp encoding a homologue to RTX toxins (Thompson and Sparling, 1993; Thompson et al., 1993); gpxA encoding a glutathione peroxidase (Moore and Sparling, 1995); and the subject of this report, the opc gene (Olyhoek et al., 1991). opc encodes a cell surface-exposed, outer membrane protein with 10 transmembrane strands and ®ve surfaceexposed loops (Merker et al., 1997). The Opc protein (formerly 5C) mediates adhesion to and invasion of endothelial and epithelial cells (Virji et al., 1992; 1993; De Vries et al., 1996; 1998); it interacts indirectly via vitronectin with b-integrins on endothelial cells (Virji et al., 1994) and directly with heparan sulphate proteoglycans on epithelial cells (De Vries et al., 1998). Opc expression is hypervariable owing to different levels of transcription, depending on the length of a polycytidine stretch in the promoter region (Sarkari et al., 1994). Opc is also highly immunogenic in humans and stimulates bactericidal antibodies (Rosenqvist et al., 1993; 1995; Haneberg et al., 1998). The opc gene is present in many strains of Nm (Seiler et al., 1996) and is highly conserved, with only minor sequence variation. However, Ng and some Nm, including the clonal groupings of serogroups B and C called the A4 cluster and the ET-37 complex, did not hybridize with an opc probe. The degree of genetic variation between different strains from one species or from closely related species has only rarely been investigated. Closely related bacteria can differ in their possession of mobile genetic elements, some of which have been designated pathogenicity islands (Hacker et al., 1997). A comparison of the complete genome sequences from two strains of Helicobacter pylori has shown that 7% of the genes were only present 636 P. Zhu, G. Morelli and M. Achtman in one or the other strain (Alm et al., 1999). Similarly, a subtraction library between Nm and Ng yielded clones from at least nine regions scattered around the chromosome (Tinsley and Nassif, 1996). It seemed likely that the opc gene might represent another example of such intraspecies and interspecies genetic variability, although none of the clones isolated by Tinsley and Nassif (1996) contained opc sequences. In order to resolve this question, we sequenced the region surrounding opc from representative strains of Nm and Ng that differ in their hybridization with an opc probe. Furthermore, a second region surrounding a paralogous opc-like gene was also sequenced. These regions contained opc-like genes plus a mixture of housekeeping genes, insertion elements, pseudogenes, deletions and DNA islands, which are described here. The results demonstrate high levels of genetic variability outside of housekeeping genes associated with import by horizontal genetic exchange. Results The opcA and CopcB regions The < 12 kb region between glyA and purK, two housekeeping genes that ¯ank opc, was sequenced from polymerase chain reaction (PCR) products ampli®ed from chromosomal DNA of Nm serogroup A, subgroup IV-1 strain Z2491, Nm serogroup C, ET-37 complex strain FAM18, Ng strain FA1090 and Ng strain MS11 (Fig. 1). The genomes of two of these strains are currently being sequenced (Z2491, http://www.sanger.ac.uk/ Projects/N_meningitidis/; FA1090, http://dna1.chem.ou. Fig. 1. Genetic organization of the opcA region in Ng and Nm. ORFs are indicated by grey boxes with gene symbols whose transcriptional direction is indicated by short arrows. The primary maps indicate the homologous genes, while the small additional maps indicate the positions of genes that are unique to each species. Top: Ng strain FA1090 (GenBank accession number AJ242839). Nucleotide differences in Ng strain MS11 (GenBank AJ242840) are indicated by single letters under the map, except for a box around the GGC codon, which is deleted in strain MS11. Middle: Nm strain Z2491 (GenBank AJ242841). Bottom: Nm strain FAM18 (GenBank AJ242842). A deletion in FAM18 is indicated by a dotted line. CE16, a Correia element in CIS1106A2. C10 , position of the polycytidine stretch in Nm that was arbitrarily set to 10 cytidines. Symbols for the target site and inverted repeats ¯anking IS1106A3, stop codons and frameshifts are indicated at the bottom left. Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 opcA and opcB regions 637 Fig. 2. The genetic organization of the CopcB region in Ng FA1090 (top; Genbank AJ242836), Nm Z2491 (middle; Genbank AJ242837) and Nm FAM18 (bottom; Genbank AJ242838). Symbols are as in Fig. 1. The DNA inserted in Nm at the end of the CopcB gene is indicated on a separate small map, and deletions are indicated by dotted lines. edu/gono.html), and the two other strains have been used extensively for genetic analyses (Aho et al., 1991; Bihlmaier et al., 1991). The opc gene in this region will be referred to henceforth as opcA because a second opc-like gene, CopcB, was found by searching the two genome projects for homologies with the opcA sequence. The < 4 kb region from aroG to comEA containing CopcB was also sequenced from chromosomal PCR products from strains Z2491, FA1090 and FAM18 (Fig. 2). The map locations of the opcA and CopcB regions were determined by hybridizing opcA and CopcB probes from each species to Southern blots of chromosomal DNA that had been digested with rare-cutting restriction enzymes whose sites have been mapped (Dempsey et al., 1991; 1995). Each region is at approximately the same position in each species, and the two regions are almost diametrically opposed on the chromosome (see Experimental procedures). Both the opcA and CopcB regions contain a conserved framework of open reading frames (ORFs), most of which have strong (61±85%) protein homology to known genes from other bacterial species. These framework genes were named glyA (serine hydroxymethyltransferase, homologous to SWISSPROT P00477); dedA (Escherichia coli protein of unknown function, P05948); abcZ (ABC Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 transporter ATP-binding protein, P43672); trpE (anthranilate synthase component I, P20580); purK (phosphoribosylaminoimidazole carboxylase, P12045); aroG (phospho-2-dehydro-3-deoxyheptaneonate aldolase, P00886) and comEA (a Bacillus subtilis protein involved in DNA transformability, P39694). The sizes of these framework ORFs were the same in Nm, Ng and the other species, except for DedA (150 versus 219 amino acids). The regions also contained ORFs with no known homologies, arbitrarily named orfA, orfD, orfX, orfY and orfZ. The sizes of the opcA and CopcB regions varied as a result of intergenic regions that contain different genes (orfA, orfE, orfX, orfY, dcmD, dcrD ) in each species as well as the variable presence of indels (insertions or deletions) or IS (insertion) elements related to IS1106 (CIS1106A2, IS1106A3) and IS1016 (IS1016C2). Several of the ORFs were pseudogenes, as indicated by the pre®x C, resulting from frameshifts, stop codons and/or insertion of repetitive DNA (CE16, CE17) related to the socalled Correia elements (Correia et al., 1988). [There seem to be errors in the GenBank dcmD sequence from Ng strain MS11 (M86915) in which positions 9165±9282 are frameshifted, resulting in an incorrect stretch of 40 amino acids (positions 216±255 in SWISSPROT P31033). The sequences described here were obtained 638 P. Zhu, G. Morelli and M. Achtman by direct sequencing of both strands from PCR products, and these frameshifts were not present in either of the Ng strains tested.] The opcA regions of the two Ng strains are almost identical (20 polymorphic sites over 11 730 bp; 99.8% identical) (Fig. 1). In Nm FAM18, the opcA region has a 3.9 kb deletion between the glyA gene and the dedA gene that has eliminated the opcA gene and ¯anking DNA (Fig. 1). A deletion of approximately the same size is present in all 42 Nm strains that were shown not to hybridize with an opcA probe (Seiler et al., 1996) (PCR analysis, data not shown). Excluding the deletion, the opcA region was very similar (96.8% identical over 8170 bp) in the two Nm strains sequenced here. The homology between Ng FA1090 and Nm Z2491 in the framework of the opcA region was 93%, somewhat less than the homologies within either species. The homology between the Nm strains in the framework of the CopcB region was 99%; between Ng strain FA1090 and either of the Nm strains, the homology was 97% for Fig. 3. Aligned amino acid sequences of the OpcA and OpcB proteins from Nm and Ng. Frameshifts (bent arrows, frame indicated underneath) and stop codons (yellow Xs) in OpcB were replaced by appropriate amino acids, and the sequences were aligned manually in order to avoid disrupting the transmembrane regions previously elucidated in the two-dimensional model of Nm OpcA (Merker et al., 1997). Tm1 to Tm10 indicate the 10 transmembrane strands, and loop 1 to loop 5 indicate the surface-exposed loops. Non-polar amino acids on the non-polar face of transmembrane strands are shaded in dark grey. Identical amino acids at other positions are coloured red (all four OpcA and OpcB proteins from strains FA1090 and Z2491), blue (three proteins) or green (two proteins). Dots indicate single amino acid gaps introduced for the purposes of alignment. Sequence polymorphisms (grey) in OpcA within Nm (13 sequence variants; Seiler et al., 1996) or strain MS11 (Ng) are indicated in the lowest two lines of each part of the ®gure where dashes indicate unaltered amino acids. Underlining within the FA1090 OpcA protein indicates the sequences of ®ve synthetic peptides used for immunization. Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 opcA and opcB regions 639 the aroG gene through to the beginning of opcB and 90% from the end of the opcB gene to comEA. Thus, large portions of the two regions were strongly homologous within each species and between both species. The Opc protein family The opcA region of Ng strains FA1090 and MS11 contains an opcA-like gene (Fig. 1). Furthermore, an opcA-like gene is present in most Ng strains because we ampli®ed a product of the same size from 26 Ng strains using oligonucleotides O570 immediately upstream of opcA and O574 within opcA. Sequence variability of the opcA gene was very low in different strains of Nm (Seiler et al., 1996). The opcA genes from Ng strains FA1090 and MS11 are also almost identical; they only differ by three nucleotides (two synonymous and one non-synonymous change) plus an indel of one codon (Fig. 1). In contrast, opcA from Ng FA1090 was only 59% identical to opcA from Nm Z2491, suf®ciently different that it had not been detected previously by hybridization (Seiler et al., 1996). CopcB contains multiple frameshifts and stop codons in both Nm and Ng. CopcB is 99% identical in the two Nm strains, but the CopcB gene was considerably shorter in Nm (476 bp) than in Ng (753 bp) (Fig. 2), and CopcB from Ng FA1090 is only 71% identical to CopcB from Nm Z2491 for those parts of the gene that are present in both species. Finally, the DNA homology between opcA and CopcB is 55% within Ng FA1090 and 45% within Nm Z2491. A multiple sequence alignment of the Opc family of proteins was constructed manually in which the integrity of the transmembrane strands was maximized, the frameshifts in CopcB were compensated by adding or deleting nucleotides, and stop codons in CopcB were replaced with suitable amino acids (Fig. 3). The known sequence polymorphism of OpcA in Nm and Ng is also indicated in Fig. 3. As expected for outer membrane proteins (Struyve et al., 1991), the aligned OpcA and OpcB sequences all contain a signal peptide, a C-terminal phenylalanine and transmembrane regions containing non-polar amino acids on one face. OpcA from both Ng and Nm contains 10 transmembrane strands and ®ve surface-exposed loops, but loops 2 and 5 are shorter in Ng. Part of loop 2 was deleted in Ng OpcB, and Nm OpcB is deleted from the middle of loop 1 through to the end of loop 3. Fairly strong homologies were obvious within the signal peptide, the transmembrane regions and parts of loops 1 and 5 for all the sequences. However, the only longer stretch of identity between the ®ve surface-exposed loops is the sequence LKEKHK in loop 1 of OpcA from Nm and Ng. The periplasmic turns between the transmembrane regions showed minor sequence variability. In particular, Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 Fig. 4. Western blot analysis with murine sera against loops 1 (A) and 2 (B) of Ng OpcA. Cell envelope preparations were from Ng strains FA1090, MS11 and freshly isolated clinical isolates (Zwxyz) that had been passaged no more than twice in vitro. Controls include cell envelopes from E. coli strains expressing cloned opcA (plasmid pBE651) or a vector (pMA5-8pL) plasmid control as well as puri®ed OpcA from Nm and cell envelopes from an Opc Nm strain (Z5104). the Ng OpcA sequence from MS11 possesses a deletion of one amino acid in the turn between transmembrane strands Tm6 and Tm7, while the Nm OpcA sequence is one amino acid shorter between Tm2 and Tm3 than is Ng OpcA (Fig. 3). Absence of strong expression of Ng opcA in vitro Murine sera against loops 1 and 2 of Ng OpcA were used in immunoblots to determine whether OpcA was detectable in cell envelopes from Ng strains FA1090, MS11 and 12 fresh clinical isolates (Fig. 4). Cell envelopes from E. coli expressing recombinant Ng OpcA were used as a positive control. As might be expected from the homologies between OpcA loop 1 from Nm and Ng, sera to Ng loop 1 reacted with puri®ed OpcA protein from Nm and with Nm cell envelopes containing OpcA (Fig. 4A), whereas sera to loop 2 did not (data not shown). Sera to loops 1 and 2 both reacted with recombinant Ng OpcA, but no reaction was detected with any of the 14 cell envelopes prepared from Ng after aerobic (Fig. 4) or anaerobic growth. Transcription of the Ng opcA gene was tested by Northern blot analysis of total RNA from FA1090 using a digoxygenin (DIG)-labelled opcA probe prepared by PCR ampli®cation with oligonucleotides O570/O574 (Fig. 5). A weak band of 0.9 kb, the size of the opcA coding region, was detected (Fig. 5A). After prolonged exposure, additional extremely weak bands of 1.5 and 2.6 kb were also 640 P. Zhu, G. Morelli and M. Achtman Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 opcA and opcB regions 641 detected (not shown). Primer extension with oligonucleotides located within opcA and the upstream ORF, orfY, was performed to map the start point(s) of transcription (Fig. 5C and D). The results show the possible existence of two weak transcription start sites, P1, immediately after the ATG start codon of opcA, and P2, upstream of orfY. P1 bears no obvious relationship to the promoter upstream of opcA in Nm (Sarkari et al., 1994), but a second ATG start codon is located 48 bp downstream of P1 that might allow translation of a shortened opcA gene. Neither P1 nor P2 are preceded by sequences that resemble 10 or 35 consensus sequences, and there are no ribosomal binding sites after P1 or P2. Reverse transcription (RT)±PCR was performed in order to elucidate whether opcA is transcribed alone or is co-transcribed with orfY. RT±PCR was performed on total RNA using primers within opcA and orfY (Fig. 5E). Speci®c products were obtained with primers O570/ O574 (Fig. 5B, lane 7) and O571/O574 (lane 8), spanning much of the opcA gene. Thus, the Ng opcA gene is transcribed. In order to test whether this transcript could be attributed to P2, RT±PCR was performed with O1178/ O956 (Fig. 5B, lane 10), which spans the region including orfY and the beginning of opcA. A transcript was detected, but this probably does not include all of opcA because no product was obtained using primers O1178/O574 (Fig. 5B, lane 9). The existence of yet another transcript beginning upstream of orfY (and 58 of P2) is indicated by the observation that RT±PCR products were obtained with primers O595/O1135 that span P2. This transcript probably terminates before opcA because no products were obtained with O595/O956 (Fig. 5B, lane 11). Thus, these results con®rm that the opcA gene is transcribed in Ng, even if only weakly, but do not clarify the location of the promoter used for transcription. IS elements in the opcA and CopcB regions IS1106 is an IS element that is inserted next to the Nm porA gene (Knight et al., 1992). In Nm strain Z2491, the region upstream of opcA contains a pseudo-IS element, CIS1106A2, into which IS1106A3, a related IS element, has inserted in the inverted orientation. Sequence comparisons between strains in which IS1106A3 is lacking revealed that the insertion of IS1106A3 resulted in a duplication of the insertion target site, CAGACCG (data not shown). CIS1106A2 was probably a pseudo-IS element even before insertion of IS1106A3 because the internal transposase gene, orf1A, contains two frameshifts and one stop codon as well as a 105 bp Correia element, CE16. We assigned unique designations to both of the IS elements upstream of opcA because their transposase genes, orf1A and orf1B, are only 94% identical. Orf1B from IS1106A3 consists of 335 amino acids versus 288 for ORF1 from IS1106, and those two ORFs are only 79% identical. The three IS elements also differ in their organization: the 18 bp inverted repeats (AGA CCT TTG CAA AAT TCC) present at the ends of IS1106A3 differ in sequence from the 35±36 bp inverted repeats ¯anking IS1106. CIS1106A2 does not possess any inverted repeats. A different IS element, IS1016C2, is present in the CopcB region of Nm strain Z2491 in the intergenic region between orfE and orfD (Fig. 2). IS1016C2 is 97% homologous to IS1016N (Hilse et al., 1996; Mahillon and Chandler, 1998) and is ¯anked by the same 19 bp inverted repeats. OrfE (60 amino acids) may also be associated with IS elements because it is 43% identical to a 64amino-acid transposase (D64001) from Synechocystis sp. (Kaneko et al., 1996). Insertions, deletions and pseudogenes The homology between Ng and Nm stops immediately upstream of opcA. In Ng strains MS11 and FA1090, a 1401 bp segment has been inserted between glyA and opcA that contains two small ORFs (orfX, 216 bp; orfY, 270 bp), separated by an intergenic region (Fig. 1). This insertion is at the same location as the IS elements present in Nm strain Z2491, whereas both opcA and its upstream region have been deleted in Nm strain FAM18. FAM18 also has a second deletion of 892 bp near CopcB that fuses orfE and orf1C of IS1016C2 (Fig. 2). If these genes are functional in strain Z2491, they are pseudogenes in FAM18. Fig. 5. Transcription of opcA in Ng. A. Northern blot analysis. Track 1: RNA size markers whose molecular weights (kb) are indicated at the left. Track 2: hybridization of total RNA from strain FA1090 with a DIG-labelled Ng opcA probe ampli®ed using oligonucleotides O570 and O574. B. RT±PCR analysis using total RNA from FA1090 with oligonucleotides from the opcA and orfY regions. Molecular weights of size markers are indicated at the left and right. Tracks 1±6, no reverse transcriptase step before PCR ampli®cation; tracks 7±10, including the reverse transcriptase step. Oligonucleotides used for reverse transcription, O574 (tracks 1±3 and 7±9); O956 (4 ±5 and 10±11); O1135 (6 and 12). The oligonucleotides used for PCR ampli®cation and their locations are indicated in (E). C and D. Transcriptional start sites in the opcA and orfY region of FA1090. PE, products of primer extension reaction. Sequence reactions (C, T, A and G) were performed using the same oligonucleotides as for the corresponding primer extension experiments. Total RNA extracted from FA1090 was extended using reverse transcriptase with radiolabelled oligonucleotides O956 (C) or O1135 (D). E. Schematic representation showing the locations of the P1 and P2 transcriptional start sites and the results of the RT±PCR experiments. Small bent arrows indicate the locations of the oligonucleotides used. RT±PCR products that were obtained are indicated by horizontal black lines, while products that were not obtained are indicated by boxes with dashed lines. Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 642 P. Zhu, G. Morelli and M. Achtman Fig. 6. Sequence comparisons of the left (top) and right (bottom) borders of the opcA island in three strains. Sequences belonging to the glyA and opcA gene are boxed. Polymorphic positions are indicated by grey nucleotides. DNA uptake sequences are black; nucleotides in the DUS sequence de®ned by (Goodman and Scocca (1988) are bold, while nucleotides differing from that sequence are in normal type. A direct repeat (DR) is indicated by a horizontal arrow, and gaps are indicated by dots. CcomEA at the right end of the CopcB region is probably a pseudogene in Ng because it possesses a Correia element (CE17, 152 bp) that results in a frameshift. The CcomEA ORF in Ng terminates three amino acids beyond the comEA ORF in Nm, but the sequences of the predicted proteins are completely different. CorfD is probably also a pseudogene in Ng because it possesses two frameshifts (relative to the Nm sequences) that result in two stop codons (Fig. 2). Import of the opcA island As described above, the orthologous homology between opcA of Nm and Ng is only 55%, whereas much higher levels of homology are found between the framework ORFs and other pairs of genes in Ng and Nm (see Discussion). Although this very low sequence homology might have resulted from diversifying selection, such selection should have led to considerable sequence diversity within each of these species. However, the sequence diversity that is actually found in opcA in each species is extremely low, and the opcA genes seem to be located on a DNA island that might have been imported into Nm and/or Ng from an unrelated species. This explanation is supported by the presence of unrelated sequences to the left of opcA in Nm (IS elements) and Ng (orfX, orfY ). Furthermore, the %GC of orfX (47%) and orfY (43%) is lower than the 51% typical of Nm and Ng. We therefore examined the sequences of these putative islands for other features that are compatible with import. The DNA uptake sequence (DUS) GCCGTCTGAA or sequences that differ from the DUS by 1 bp are present as inverted repeats at each end of the opcA islands in Nm and Ng (Fig. 6). DUS can facilitate the uptake of DNA segments from an unrelated species (Kroll et al., 1998). After uptake, recombination could occur via homology in conserved ¯anking housekeeping genes, in which case the islands should be ¯anked by sequences that are more diverse between Nm and Ng than are the framework sequences that were inherited from their last common ancestor. The stop codon of the glyA gene marks the left border of the opcA islands, and no remnants of recombination were detected near glyA. However, the aligned 189 bp to the right of opcA contains gaps as well as 59 polymorphic sites and is only 59% identical between Nm Z2491 and Ng FA1090 (Fig. 6), whereas the subsequent 4.5 kb region through to the end of trpE is 92% identical between the two strains and does not contain any other stretch of low homology. Thus, the region between glyA and dedA does seem to correspond to a DNA island that has been imported into Nm or Ng or both. The sequences at the ends of these islands show no motifs typical of pathogenicity islands (Hacker et al., 1997) or mobile genetic elements that could account for their import, except that the uptake sequences near both ends of the island form part of a 14 bp direct repeat (Fig. 6, DR). However, DNA islands within species that are naturally transformable and recombine frequently may differ from those of other bacteria. A second island within the opcA region The region between trpE and purK contains alternative, unrelated sequences in Nm and Ng. In both strains of Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 opcA and opcB regions 643 Nm, this region contains orfA, whereas that region in Ng contains the dcmD/dcrD restriction±modi®cation genes (Stein et al., 1995) (Fig. 1). Thus, this region corresponds to a small island containing different genes in the two species. The %GC of these genes is also unusually low (dcmD, 47%; dcrD, 44%; orfA, 36%). We have searched for direct evidence for import of this island by examining this region in multiple strains of Nm and Ng. All 21 strains of Ng yielded PCR products of the same size with primers in the ¯anking trpE and purK genes. The island was sequenced from nine of these bacteria in addition to strains FA1090 and MS11. Five alleles were found whose sequences were 99% identical, and only one amino acid polymorphism was detected (N-274 to D-274 in the DcmD methylase). A representative collection of 107 strains of Nm has been characterized for sequence variability at multiple gene loci by multilocus sequence typing (Maiden et al., 1998). Strains with identical combinations of alleles at six conservative loci are assigned to a common sequence type (ST); a total of 49 STs were recognized within the 107 strains. The region between trpE and purK was ampli®ed by PCR from all 107 strains. Altogether, 105 of the 107 strains possessed an island of the same size as in strains Z2491 and FAM18. The 546 bp region including orfA was sequenced from eight of these strains in addition to Z2491 and FAM18; three alleles of orfA that differed by three synonymous polymorphic sites and two frameshifts were found among the 10 sequences. These results indicate that the island contains orfA in most strains of Nm. Two of the 107 Nm strains (STs 25 and 26) yielded a PCR product of the same size as did strains of Ng. The sequences of that PCR product were identical from both Nm strains. They contained dcmD/dcrD genes that were 98% homologous to the genes from Ng, bordered by a region of somewhat lower homology in purK (95% over 150 bp) (Fig. 8). In both Ng and STs 25 and 26, the dcmD/dcrD island contains a sequence (GTCGTCTGAA) that differs from a DUS by one nucleotide. These data indicate that ST25 and ST26 are descended from a common ancestor that had imported this island from an unrelated species via recombination in the ¯anking genes. The DcmD protein (also called S.Ngo IV) methylates DNA containing GCCGGC (Stein et al., 1995) and renders it resistant to digestion by the DcrD isoschizomer NaeI. Chromosomal DNAs from Ng strains MS11 and FA1090 were resistant to digestion by NaeI, as were DNAs from Nm STs 25 and 26 (Fig. 7). These results indicate that the dcmD gene is functional and methylates chromosomal DNA. In contrast, eight of the 10 Nm strains possessing orfA were digested by NaeI, although DNAs from FAM18 and NZ10 (A4 cluster) were not (Fig. 7). Thus, a DcmD methylase is rare in Nm, and most Nm do not methylate GCCGGC. The dcrD gene in ST25 contains a Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 Fig. 7. Digestion of chromosomal DNA with NaeI. Tracks 1 and 16 contain molecular weight markers (1 kb ladder; Gibco BRL). Tracks 2±15 contain chromosomal DNA from strains (ST or clonal group): NG E28 (Nm ST26), NG G40 (Nm ST25), MS11 (Ng), FA1090 (Ng), B40 (Nm subgroup I), Z3524 (Nm subgroup III), BZ 10 (Nm A4 cluster), 44/76 (Nm ET-5 complex), BZ 198 (Nm lineage 3), NG 4/88 (Nm ST30), 297-0 (Nm ST49), BZ 147 (Nm ST48), Z2491 (Nm subgroup IV-1) and FAM18 (Nm ET-37 complex). frameshift mutation and may encode an inactive restriction enzyme (Fig. 8). CopcB may be on a third DNA island The homology between CopcB in Nm and Ng is low (71%), and DNA encoding orfE and IS1016C2 is inserted to the right of CopcB in Nm. These results resemble those for the opcA island, and we therefore also searched for uptake sequences and low homology in ¯anking DNA. The DNA to the left of CopcB is 97% identical between Nm and Ng and does not possess any blocks of nonhomology. However, alignments in which the orfE ± IS1016C2 segment to the right of CopcB had been removed showed gaps and numerous polymorphisms over 199 bp to the right of the CopcB gene (Fig. 9, 38% identity). This stretch of DNA also contained inverted DUS (Fig. 9). The 796 bp to the right of that stretch were 95% homologous between Nm and Ng and 99% homologous within Nm. These data are similar to those from the other two islands and suggest that CopcB is on a third island that has been imported from an unrelated species into either Nm or Ng or both. Discussion Before this analysis, opc was perceived to be a phasevariable gene present in some strains of Nm and characterized by extremely low levels of sequence polymorphism (Sarkari et al., 1994; Seiler et al., 1996). The results presented here alter this perception drastically. A sequence variant of opc, now opcA, is present (but poorly expressed) in Ng as well as in Nm. Furthermore, both Ng and Nm possess a second paralogous opcA-like 644 P. Zhu, G. Morelli and M. Achtman Fig. 8. Features and ¯anking DNA of the island between trpE and purK in the opcA region. The islands and those parts of the trpE and purK genes that were sequenced in ST25 (Nm strain NG G40) are shown, as is the percentage identity between Ng FA1090 and the Nm ST25 strain or between the Nm ST25 strain and Nm Z2491. The locations of a DNA uptake sequence and of a 1 frameshift deletion within dcrD of the ST25 strain are also indicated. pseudogene, CopcB. opcA and CopcB show large sequence differences that seem to re¯ect their location in DNA islands that have been imported from unrelated species. Import of three DNA islands in the opcA and CopcB regions The evidence for three DNA islands in the opcA and CopcB regions is similar. Each contains genes that are characterized by unusually low homology or are unrelated when Nm and Ng are compared. Each island contains DNA uptake sequences and one border with low homology between Nm and Ng. For the island between trpE and purK, rare strains of Nm have been identi®ed in which DNA import had occurred in a common ancestor. The opcA gene was not detected in 12 commensal neisseriae by PCR with various combinations of degenerate primers that ampli®ed these genes from Nm and Ng Fig. 9. Sequence comparisons of the right border of the CopcB island in three strains. The alignments were made after removing the sequences from orfE to IS1016C2 whose location is indicated by an arrow. Other details are as in Fig. 6. Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 opcA and opcB regions 645 (data not shown). PCR analysis of the glyA ±abcZ region in Neisseria lactamica ampli®ed a product that is 6 kb shorter than that from Nm Z2491 (data not shown), as if the DNA immediately surrounding opcA is lacking in these bacteria. These observations suggest that Nm and Ng may have evolved from an ancestor that did not possess any opcA island. The opcA islands may have been imported only recently, because only 2.8% of the opcA gene was polymorphic among 43 diverse strains of Nm (Seiler et al., 1996). (Ng is so homogeneous that the minimal sequence diversity of two opcA sequences is not necessarily unusual.) However, it remains unclear whether these islands were imported once and then replaced by a second import event or whether they have been imported independently into each species. The DNA sequences of the argF, fbp, recA, adk, argJ, glnA and mtg housekeeping genes are 94 ±99% identical between Nm and Ng (Zhou and Spratt, 1992; Feil et al., 1996; Zhou et al., 1997). A housekeeping gene, aroE, that was concluded to be exceptionally diverse (Zhou et al., 1997) was 88±93% identical between Nm and Ng. Similarly, within the opcA region, DNA stretches containing housekeeping genes were 93% identical between Nm and Ng, DNA to the left of CopcB was 97% identical and DNA to the right of CopcB was 90% identical. Many genes encoding outer membrane or secreted proteins and that might be under diversifying selection show similarly high levels of homology between Nm and Ng: porA versus CporA, 91% (Feavers and Maiden, 1998); class 3 porB versus PI.B, 90% (Murakami et al., 1989); rmpG versus rmpM, 96% (Klugman et al., 1989); tbpA, 96% (Legrain et al., 1993) and iga, 90%. As the low orthologous homology of 55% between opcA of Nm and Ng is associated with their import from an unrelated species, possibly other orthologous outer membrane proteins with low homologies have also been imported. These include the class 2 porB versus the PI.B gene, 69% (Murakami et al., 1989); tbpA from the ET-37 complex versus Ng, 75% (Legrain et al., 1993) and tbpB, 77% (Legrain et al., 1993). Import of islands containing foreign genes is realistic for naturally transformable bacteria with a rich history of recombination such as Nm and Ng. Recent data (B. Linz, M. Schenker and M. Achtman, in preparation) has shown a high frequency of recombination during local epidemics and that, on average, the imported DNA is several kilobases long. opcA and CopcB probably re¯ect an old duplication event. Their homologies in either Nm (45%) or Ng (55%) are lower than the homology for either opcA (59%) or CopcB (71%) between the species, suggesting that the duplication predates the subsequent sequence evolution that has occurred in each of these genes. It seems unlikely that an imported pseudogene could have been ®xed because of the lack of positive selection. Therefore, if an Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 island containing CopcB was imported into Nm and Ng, similarly to CporA in Ng (Feavers and Maiden, 1998), the imported island probably contained a functional gene that was subsequently inactivated by mutations. The opcA region is lacking in bacteria of the ET-37 complex, such as FAM18, the A4 cluster and other unrelated isolates (Seiler et al., 1996). These bacteria might be descendants of an ancestral strain that had not yet imported the opcA region or represent the results of a deletion event. Comparison of the sequences ¯anking the opcA island between FAM18 and Z2491 (Fig. 6) revealed no obvious end-points for an insertion event and no clustered heterogeneity indicating recombination with unrelated DNA. In contrast, the presence of IS1106-like elements near the left end of the opcA island in Nm is compatible with the possibility of deletion by illegitimate recombination, even if there are no obvious repetitive sequences directly at the ends of the deletion. Once a deletion had arisen, it could be transferred by homologous recombination to other strains of Nm. Transcriptional regulation of opcA Although Nm OpcA is highly immunogenic in humans (Rosenqvist et al., 1993; 1995; Haneberg et al., 1998), Nm can escape bactericidal killing by phase variation of opcA transcription resulting from slipped-strand mispairing within a polycytidine stretch in the promoter region (Sarkari et al., 1994). Nm freshly isolated from the nasopharynx can express very large amounts of OpcA protein (Achtman et al., 1991), but yield variants expressing very low levels at frequencies of 1% upon subsequent cultivation in vitro (Sarkari et al., 1994). OpcA protein is apparently not expressed strongly by Ng in vitro. Immunoblots with Ng within two passages of isolation did not reveal any expression of OpcA protein. Furthermore, a convincing potential opcA promoter region was not identi®ed in Ng, and no homopolymeric or heteropolymeric nucleotide stretches that might vary by slippedstrand mispairing precede the opcA structural gene. Only a low level of mRNA expression was observed by Northern blots and by RT±PCR, even from Ng grown under iron deprivation or anaerobically. Intermediate expression of OpcA in Nm (Opc ) does not suf®ce for detectable adhesion and invasion and, by extrapolation, the low levels of transcription that were observed with Ng are unlikely to promote adhesion and invasion. However, protein expression can be triggered by contact with eukaryotic cells (Pettersson et al., 1996; Taha et al., 1998) or by conditions that are unique within an animal host (Camilli and Mekalanos, 1995; Hensel et al., 1995; Valdivia and Falkow, 1996), and it is conceivable that Ng OpcA is expressed ef®ciently during urogenital infections. 646 P. Zhu, G. Morelli and M. Achtman CopcB IS elements Most of the opcB gene is inactivated by a combination of frameshifts and stop codons, similarly to the CporA gene in Ng (Feavers and Maiden, 1998) and to a variety of genes in H. pylori (Alm et al., 1999). The ®rst of these frameshift mutations is in a homopolymeric adenine stretch (A-6 in Ng and A-8 in Nm) encoding the second and third amino acids of the OpcA signal peptide in both species, whereas the other frameshift mutations are at different positions in Nm and Ng. The homopolymeric stretch in the signal peptide is reminiscent of homopolymeric stretches in other neisserial proteins (Jonsson et al., 1991; Jennings et al., 1995; Yang and Gotschlich, 1996) that result in phase-variable expression of the PilC, LgtA, LgtB and LgtD proteins. Possibly, OpcB was originally expressed as a phase-variable protein and ®rst became a pseudogene after Nm and Ng acquired opcA as a result of selection against expression of both genes in one cell. Over 500 different IS elements belonging to more than 17 families have been described (Mahillon and Chandler, 1998). For several of these, the mechanism of transposition has been investigated in great detail (Mahillon and Chandler, 1998), and chromosomal mobility has been analysed under natural conditions (Boyd and Hartl, 1997). Three families of IS elements have been described in Nm (IS1016, Hilse et al., 1996; IS1106, Knight et al., 1992; and IS1301, Hilse et al., 1996), and the IS elements described here are variants of two of these families with somewhat unusual features. The opcA region in many strains of Nm contains IS1106A3 inserted in inverted orientation into CIS1106A2. Often, integration of one IS element into a second results in a tandem duplication leading to increased transcription because of partial promoter sequences in each of the inverted repeats (Mahillon and Chandler, 1998). We have only been able to ®nd one other example in the literature (Trieu-Cuot and Courvalin, 1984) in which one IS element has inserted into the transposase gene of a second. In that case, the transposition was such that the composite IS element expressed two copies of two ORFs and transposed more ef®ciently than a single IS element. This is unlikely to be the case for IS1106A3 because the transposase of CIS1106A2 is apparently a pseudogene and is also disrupted by the insertion. IS1106A3 contains a copy of the target sequence used for integration into CIS1106A2, raising the amusing possibility that it might also be a target for further insertion of a related IS element. A further unusual feature is that the inverted repeats at the ends of IS1106A3 are totally different from those of IS1106 itself and are lacking in IS1106A2. OrfE in the opcB region of strain Z2491 is signi®cantly homologous to a transposase in Synechocystis but lacks terminal inverted repeats. IS elements lacking inverted repeats have been described, and orfE might represent the ®rst example of such a novel IS element in the neisseriae. We note that the spread of sel®sh DNA need not depend on transposition in transformable bacteria; after having entered the global gene pool, these elements can spread by horizontal genetic exchange via transformation, as suggested for a 250 bp element near opcA (Seiler et al., 1996) that is lacking in strain Z2491. Genetic diversity of intergenic regions Much interest related to genetic diversity has focused on pathogenicity islands (Hacker et al., 1997) and DNA acquisitions that are related to virulence. However, DNA mobility is not purely related to virulence, and the data presented here add to the growing evidence that intergenic regions can readily acquire additional genetic information during bacterial evolution (Lawrence and Roth, 1996; Lawrence and Ochman, 1997; Nelson et al., 1997; Bergthorsson and Ochman, 1998). Comparative sequencing of larger regions of DNA from multiple isolates of a single species is just beginning (Milkman and Bridges, 1993; Alm et al., 1999), and it is currently unclear what degree of sequence variation will be discovered when more genes are investigated that are not under positive selection pressure. Deletions or frameshifts have not been detected in any of seven fragments from housekeeping genes tested by MLST analysis (http:/mlst.zoo.ox.ac.uk) of over 200 strains of Nm, and an internal stop codon has only been observed once (unpublished data). In contrast, the sequences near the opcA and CopcB loci revealed several examples of insertion/deletion, genetic replacement and transposition over a cumulative region of less than 20 kb. Numerous pseudogenes were also detected during this analysis, including an IS transposase, the CopcB gene, CcomEA and CorfD, similarly to recent observations with H. pylori (Alm et al., 1999). Together with other data (S. R. Klee, C. R. Tinsley, B. Kusecek et al., submitted), these results show that the neisseriae are characterized by both conserved housekeeping genes and highly dynamic intergenic regions. Concluding remarks The results presented here have revealed the existence of a family of outer membrane proteins with 10 transmembrane strands and ®ve surface-exposed loops, the Opc family. The pathogenic neisseriae contain both orthologous and paralogous opc loci. Until now, only pseudogene Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 opcA and opcB regions 647 variants of opcB have been discerned, but functional copies may exist in other species and yet other variants of the opc family remain to be discovered. The opc-like loci in Nm and Ng are located in imported DNA islands. Importing such islands can contribute greatly to the diversity of these species and their abilities to evolve under selective pressures. Other than for virulence genes, only a few comparative sequence analyses of large regions have been performed with natural isolates of a single species. The well-characterized collections of epidemic Nm are ideally suited for comparative analyses because numerous changes have occurred in recent decades and changes can be dated (Morelli et al., 1997). The signi®cance of the opcA gene for infection by Ng is unresolved. Experiments are needed to determine whether this gene is expressed in the human body and under in vivo conditions. The sequence data presented here will provide the necessary background for such experiments. Experimental procedures Bacterial strains The opcA region was sequenced from strains Ng FA1090 (Connell et al., 1988), Ng MS11 (Swanson, 1973), Nm Z2491 (serogroup A, subgroup IV-1) (Crowe et al., 1989; Sarkari et al., 1994) and Nm FAM18 (serogroup C, ET-37 complex) (Aho and Cannon, 1988). Additional strains of Nm were chosen among those that have been tested for the possession of opc in Southern blots (Seiler et al., 1996) or that have been tested by multilocus sequence typing (MLST) (Maiden et al., 1998). Additional strains of Ng were fresh clinical isolates obtained from Drs Cathy Ison and Ian Feavers. Strains from 12 Neisseria species used to test whether opcA is present in commensal species were as follows (ATCC no./species): 23970/N. lactamica ; 19696/N. mucosa ; 13120/N. ¯avescens ; 49275/N. sub¯ava ; 14221/N. ¯ava ; 10555/N. per¯ava ; 14686/N. denitri®cans ; 43768/N. polysaccharea ; 25295/N. elongata ; 14687/N. canis ; 33078/N. ovis ; 29256/N. sicca. Two E. coli strains, WK6lcl and K12DHIDtrp (Bernard et al., 1979), were used for cloning and expression of the Ng opcA gene. DNA manipulations Fragments (15 kb) were cloned in l DashII, and recombinant phages containing opc inserts were detected after hybridization with an opc probe. Sequences from the ends of the inserts were used to design primers for long-range PCR. Long-range PCR products were then used to determine the entire sequence of the inserts by a combination of direct sequencing and cloning by inverse PCR. Subsequently, primers (available on request) were designed that enabled direct sequencing from both strands of PCR products made from chromosomal DNA. The techniques for isolating chromosomal DNA (Sarkari et al., 1994) and pulsed-®eld gel electrophoresis (PFGE; Morelli Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 et al., 1997) have been described. DNA probes were labelled with DIG-11-dUTP during PCR ampli®cation (PCR DIG labelling mix, Boehringer Mannheim) and used for Southern hybridization and detection on Hybond N nylon (Amersham) membranes, as described by the manufacturer. Automated sequencing was performed using puri®ed products ampli®ed from the bacterial chromosome and an Applied Biosystems model 377 (Perkin-Elmer) with dRhodamine-labelled dye terminators. Mapping of opcA and CopcB Chromosomal DNAs of Ng FA1090 and Nm Z2491 were digested with rare-cutting restriction endonucleases (FA1090: NheI, SpeI, Bgl II and Pac I; Z2491: NheI, SpeI, SgfI, Pac I, Bgl II and PmeI), separated by PFGE and tested for Southern hybridization with DIG-labelled opcA, opcB and orfA probes prepared from the corresponding strains. The map locations were determined by correlating the fragments that hybridized (FA1090, opcA : fragments N5, P1 and S2; opcB : B5, N8, P5 and S1; Z2491, opcA : S1 and N9; opcB : N14, S13, G2, B3 and M9) with the published maps of FA1090 (Dempsey et al., 1991) and Z2491 (Dempsey et al., 1995). RNA analysis Bacteria in the exponential phase of growth (6 h of growth on supplemented GC agar plates, 378C, 95% humidity) were lysed with guanidine, diluted with an ethanol solution, and the total RNA was isolated after binding to glass®bre as described by the manufacturer (RNAqueous, Ambion). Formaldehyde gel electrophoresis and transfer to Hybond N nylon membranes was performed using standard techniques (Sambrook et al., 1989), and Northern hybridization was performed using DIG-labelled PCR probes ampli®ed using oligonucleotides O570 (ACTGCTACTTCTCTGAGTAG) and O574 (TTAATGTCGGCTGAGACACC). For RT±PCR, RNA samples were incubated with RNase-free DNase I (Ambion) to remove contaminating genomic DNA and heated at 758C for 5 min to inactivate the DNase I. For RT±PCR reactions, 1 mg of RNA plus 10 pmol of primer were denatured at 658C for 10 min and chilled. Then 4 ml of 5 ´ reverse transcriptase buffer, 2 ml of 100 mM DTT, 2 ml of 10 mM dNTPs, 50 units of reverse transcriptase (Expand reverse transcriptase kit, Boehringer Mannheim) and 20 units of RNase inhibitor (Ambion) were added in a total volume of 20 ml, and the reaction mixture was incubated at 428C for 1 h. Reaction product (5 ml) was then used as the template cDNA in a PCR reaction. Primer extension analysis was carried out as described (Sarkari et al., 1994) using oligonucleotides O956 (TGTCTGT ACGGACGGTATATTC) and O1135 (GGCGTCTCCGATACGACGTA), except that [g-32P]-ATP was used. Expression of Ng OpcA in E. coli and immunological methods Ng opcA was cloned downstream of the l pL promoter to allow overexpression in E. coli. To this end, the coding region of the opcA gene from Ng FA1090, from 6 bp before the ATG 648 P. Zhu, G. Morelli and M. Achtman start codon to 125 bp after the TAA stop codon, was ampli®ed using oligonucleotides O765 (CGGAATTCCTAGGAGAT AAAAATGAAAAAAGCAC, Avr II site underlined) and O762 (CACGTGAAGCTTGCGGTATTTTAACCGATT, Hin dIII site underlined). The PCR fragment was digested with Avr II and Hin dIII, ligated with StyI/Hin dIII-digested plasmid pMA5-8pL (Stanssens et al., 1989) and transformed into strain E. coli WK6lcl. The resulting plasmid, pBE651, contains the predicted insert fragment as con®rmed by sequence analysis. pBE651 was transformed into E. coli K12DHIDtrp, which contains a chromosomal copy of the temperaturesensitive cI857 repressor gene for the induction of OpcA expression. Bacteria growing exponentially at 288C were shifted to 378C to induce OpcA expression. After 2 h further incubation, the cells were harvested and cell envelopes were prepared as described (Merker et al., 1997). Expression was lethal when it was induced by a temperature shift to 428C or after incubation for more than 2 h at 378C, similarly to Nm OpcA (Merker et al., 1997). Five 17- to 20-mer peptides containing sequences from the central portions of loops 1±5 of OpcA protein from Ng FA1090 (sequences underlined in Fig. 3), containing an additional Cterminal cysteine, were synthesized using Cys-Rink Resin (0.15 mM Tentagel R RAM, 0.2 mM g 1, Rapp Polymere) by FastMoc chemistry with an automated peptide synthesizer (model ABI433A, Applied Biosystems), cleaved and puri®ed by reversed-phase high-performance liquid chromatography (HPLC) as described (Thiesen et al., 1997). The structures were con®rmed by MALDI mass spectrometry (Re¯ex II; Bruker-Franzen) as being within 0.6 atomic units of the predicted sizes. The peptides were coupled to thyroglobulin via the cysteine residue using MBS (m-maleimidobenzoyl-Nhydroxysuccimide ester) as described (Van Regenmortel et al., 1988). Balb/c mice were immunized in the caudal region on days 0, 14 and 28 with 5 mg of peptide conjugate mixed with 125 ml of RIBI adjuvant (Immunochem Research), and sera were taken on day 32. Peptides and peptide conjugates were coated on microtitre plates as described (Thiesen et al., 1997), and enzyme-linked immunosorbent assays (ELISAs) (Wang et al., 1992) were performed using serial doubling dilutions of each serum. After diluting the immune sera 25 000-fold, they yielded an OD of $ 1.0 in ELISAs with both the peptides and the peptide conjugates. SDS±PAGE and immunoblots were performed as described (Achtman et al., 1988). Immunoblot analysis with E. coli cell envelopes containing recombinant OpcA only showed speci®c reactions with the murine antisera against loops 1 and 2 (Fig. 4); the other sera gave weaker reactions with OpcA protein and additional cross-reactions with other proteins (data not shown). Acknowledgements We gratefully acknowledge Thomas Schnibbe for theoretical and practical help in preparing the peptide conjugates and the MS analyses; Martin Schenker for help with computer analyses; Kerstin Zurth for technical assistance with immunizations; and Barica Kusecek for technical assistance with peptide synthesis. We are very grateful to Cathy Ison and Ian Feavers for supplying fresh clinical isolates of Ng. References Achtman, M., Neibert, M., Crowe, B.A., Strittmatter, W., Kusecek, B., Weyse, E., et al. (1988) Puri®cation and characterization of eight class 5 outer membrane protein variants from a clone of Neisseria meningitidis serogroup A. J Exp Med 168: 507±525. Achtman, M., Wall, R.A., Bopp, M., Kusecek, B., Morelli, G., Saken, E., et al. (1991) Variation in class 5 protein expression by serogroup A meningococci during a meningitis epidemic. J Infect Dis 164: 375±382. Aho, E.L., and Cannon, J.G. (1988) Characterization of a silent pilin gene locus from Neisseria meningitidis strain FAM18. Microb Pathog 5: 391±398. Aho, E.L., Dempsey, J.A.F., Hobbs, M.M., Klapper, D.G., and Cannon, J.G. (1991) Characterization of the opa (class 5) gene family of Neisseria meningitidis. Mol Microbiol 5: 1429±1437. Alm, R.A., Ling, L.-S.L., Moir, D.T., King, B.L., Brown, E.D., Doig, P.C., et al. (1999) Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature 397: 176±180. Bergthorsson, U., and Ochman, H. (1998) Distribution of chromosome length variation in natural isolates of Escherichia coli. Mol Biol Evol 15: 6±16. Bernard, H.-U., Remaut, E., Hersh®eld, M.V., Das, H.K., Helinski, D.R., Yanofsky, C., et al. (1979) Construction of plasmid cloning vehicles that promote gene expression from the bacteriophage lambda pL promoter. Gene 5: 59±76. Bihlmaier, A., RoÈmling, U., Meyer, T.F., TuÈmmler, B., and Gibbs, C.P. (1991) Physical and genetic map of the Neisseria gonorrhoeae strain MS11-N198 chromosome. Mol Microbiol 5: 2529±2539. Boyd, E.F., and Hartl, D.L. (1997) Nonrandom location of IS1 elements in the genomes of natural isolates of Escherichia coli. Mol Biol Evol 14: 725±732. Camilli, A., and Mekalanos, J.J. (1995) Use of recombinase gene fusions to identify Vibrio cholerae genes induced during infection. Mol Microbiol 18: 671±683. Connell, T.D., Black, W.J., Kawula, T.H., Barritt, D.S., Dempsey, J.A., Kverneland, Jr, K., et al. (1988) Recombination among protein II genes of Neisseria gonorrhoeae generates new coding sequences and increases structural variability in the protein II family. Mol Microbiol 2: 227± 236. Correia, F.F., Inouye, S., and Inouye, M. (1988) A family of small repeated elements with some transposon-like properties in the genome of Neisseria gonorrhoeae. J Biol Chem 263: 12194±12198. Crowe, B.A., Wall, R.A., Kusecek, B., Neumann, B., Olyhoek, T., Abdillahi, H., et al. (1989) Clonal and variable properties of Neisseria meningitidis isolated from cases and carriers during and after an epidemic in the Gambia, West Africa. J Infect Dis 159: 686±700. De Vries, F.P., van der Ende, A., van Putten, J.P.M., and Dankert, J. (1996) Invasion of primary nasopharyngeal epithelial cells by Neisseria meningitidis is controlled by phase variation of multiple surface antigens. Infect Immun 64: 2998±3006. De Vries, F.P., Cole, R., Frosch, M., and Putten, V.J.P.M. (1998) Neisseria meningitidis producing the Opc adhesin Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 opcA and opcB regions 649 binds epithelial cell proteoglycan receptors. Mol Microbiol 6: 1203±1212. Dempsey, J.A.F., Litaker, W., Madhure, A., Snodgrass, T.L., and Cannon, J.G. (1991) Physical map of the chromosome of Neisseria gonorrhoeae FA1090 with locations of genetic markers, including opa and pil genes. J Bacteriol 173: 5476±5486. Dempsey, J.A.F., Wallace, A.B., and Cannon, J.G. (1995) The physical map of the chromosome of a serogroup A strain of Neisseria meningitidis shows complex rearrangements relative to the chromosomes of two mapped strains of the closely related species N. gonorrhoeae. J Bacteriol 177: 6390±6400. Feavers, I.M., and Maiden, M.C.J. (1998) A gonococcal porA pseudogene: implications for understanding the evolution and pathogenicity of Neisseria gonorrhoeae. Mol Microbiol 30: 647±656. Feil, E., Zhou, J.J., Smith, J.M., and Spratt, B.G. (1996) A comparison of the nucleotide sequences of the adk and recA genes of pathogenic and commensal Neisseria species: evidence for extensive interspecies recombination within adk. J Mol Evol 43: 631±640. Goodman, S.D., and Scocca, J.J. (1988) Identi®cation and arrangement of the DNA sequence recognized in speci®c transformation of Neisseria gonorrhoeae. Proc Natl Acad Sci USA 85: 6982±6986. Hacker, J., Blum-Oehler, G., MuÈhldorfer, I., and TschaÈpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089±1097. Haneberg, B., Dalseg, R., Oftung, F., Wedege, E., Hùiby, E.A., Haugen, I.L., et al. (1998) Towards a nasal vaccine against meningococcal disease, and prospects for its use as a mucosal adjuvant. Dev Biol Stand 92: 127±133. Hensel, M., Shea, J.E., Gleeson, C., Jones, M.D., Dalton, E., and Holden, D.W. (1995) Simultaneous identi®cation of bacterial virulence genes by negative selection. Science 269: 400±403. Hilse, R., Hammerschmidt, S., Bautsch, W., and Frosch, M. (1996) Site-speci®c insertion of IS1301 and distribution in Neisseria meningitidis strains. J Bacteriol 178: 2527± 2532. Jennings, M.P., Hood, D., Peak, I., Virji, M., Moxon, E.R., Hood, D.W., et al. (1995) Molecular analysis of a locus for the biosynthesis and phase-variable expression of the lacto-N-neotetraose terminal lipopolysaccharide structure in Neisseria meningitidis. Mol Microbiol 18: 729±740. Jonsson, A.-B., Nyberg, G., and Normark, S. (1991) Phase variation of gonococcal pili by frameshift mutation in pilC, a novel gene for pilus assembly. EMBO J 10: 477±488. Kaneko, T., Sato, S., Kotani, H., Tanaka, A., Asamizu, E., Nakamura, Y., et al. (1996) Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions (supplement). DNA Res 3: 185±209. Klugman, K.P., Gotschlich, E.C., and Blake, M.S. (1989) Sequence of the structural gene (rmpM) for the class 4 outer membrane protein of Neisseria meningitidis, homology of the protein to gonococcal protein III and Escherichia Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650 coli OmpA, and construction of meningococcal strains that lack class 4 protein. Infect Immun 57: 2066±2071. Knight, A.I., Ni, H., Cartwright, K.A.V., and McFadden, J.J. (1992) Identi®cation and characterization of a novel insertion sequence, IS1106, downstream of the porA gene in B15 Neisseria meningitidis. Mol Microbiol 6: 1565±1573. Kroll, J.S., Wilks, K.E., Farrant, J.L., and Langford, P.R. (1998) Natural genetic exchange between Haemophilus and Neisseria: intergeneric transfer of chromosomal genes between major human pathogens. Proc Natl Acad Sci USA 95: 12381±12385. Lawrence, J.G., and Ochman, H. (1997) Amelioration of bacterial genomes: rates of change and exchange. J Mol Evol 44: 383±397. Lawrence, J.G., and Roth, J.R. (1996) Sel®sh operons: horizontal transfer may drive the evolution of gene clusters. Genetics 143: 1843±1860. Legrain, M., Mazarin, V., Irwin, S.W., Bouchon, B., QuentinMillet, M.-J., Jacobs, E., et al. (1993) Cloning and characterization of Neisseria meningitidis genes encoding the transferrin-binding proteins Tbp1 and Tbp2. Gene 130: 73±80. Mahillon, J., and Chandler, M. (1998) Insertion sequences. Microbiol Mol Biol Rev 62: 725±744. Maiden, M.C.J., Bygraves, J.A., Feil, E., Morelli, G., Russell, J.E., Urwin, R., et al. (1998) Multilocus sequence typing: a portable approach to the identi®cation of clones within populations of pathogenic microorganisms. Proc Natl Acad Sci USA 95: 3140±3145. Merker, P., Tommassen, J., Virji, M., Sesardic, D., and Achtman, M. (1997) Two-dimensional structure of the Opc invasin from Neisseria meningitidis. Mol Microbiol 23: 281± 293. Milkman, R., and Bridges, M.M. (1993) Molecular evolution of the Escherichia coli chromosome. IV. Sequence comparisons. Genetics 133: 455±468. Moore, T.D.E., and Sparling, P.F. (1995) Isolation and identi®cation of a glutathione peroxidase homolog gene, gpxA, present in Neisseria meningitidis but absent in Neisseria gonorrhoeae. Infect Immun 63: 1603±1607. Morelli, G., Malorny, B., MuÈller, K., Seiler, A., Wang, J., del Valle, J., et al. (1997) Clonal descent and microevolution of Neisseria meningitidis during 30 years of epidemic spread. Mol Microbiol 25: 1047±1064. Murakami, K., Gotschlich, E.C., and Seiff, M.E. (1989) Cloning and characterization of the structural gene for the class 2 protein of Neisseria meningitidis. Infect Immun 57: 2318±2323. Nelson, K., Wang, F.S., Boyd, E.F., and Selander, R.K. (1997) Size and sequence polymorphism in the isocitrate dehydrogenase kinase/phosphatase gene (aceK ) and ¯anking regions in Salmonella enterica and Escherichia coli. Genetics 147: 1509±1520. Olyhoek, A.J.M., Sarkari, J., Bopp, M., Morelli, G., and Achtman, M. (1991) Cloning and expression in Escherichia coli of opc, the gene for an unusual class 5 outer membrane protein from Neisseria meningitidis. Microb Pathog 11: 249±257. Petering, H., Hammerschmidt, S., Frosch, M., van Putten, J.P.M., Ison, C.A., and Robertson, B.D. (1996) Genes associated with the meningococcal capsule complex are 650 P. Zhu, G. Morelli and M. Achtman also found in Neisseria gonorrhoeae. J Bacteriol 178: 3342±3345. Pettersson, J., Nordfelth, R., Dubinina, E., Bergman, T., Gustafsson, M., Magnusson, K.E., et al. (1996) Modulation of virulence factor expression by pathogen target cell contact. Science 273: 1231±1233. Rosenqvist, E., Hùiby, E.A., Wedege, E., Kusecek, B., and Achtman, M. (1993) The 5C protein of Neisseria meningitidis is highly immunogenic in humans and induces bactericidal antibodies. J Infect Dis 167: 1065±1073. Rosenqvist, E., Hùiby, E.A., Wedege, E., Bryn, K., Kolberg, J., Klem, A.R., et al. (1995) Human antibody responses to meningococcal outer membrane antigens after three doses of the Norwegian group B meningococcal vaccine. Infect Immun 63: 4642±4652. Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989) Molecular Cloning: a Laboratory Manual, 2nd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press. Sampson, B.A., and Gotschlich, E.C. (1992) Neisseria meningitidis encodes an FK506-inhibitable rotamase. Proc Natl Acad Sci USA 89: 1164±1168. Sarkari, J., Pandit, N., Moxon, E.R., and Achtman, M. (1994) Variable expression of the Opc outer membrane protein in Neisseria meningitidis is caused by size variation of a promoter containing poly-cytidine. Mol Microbiol 13: 207±217. Seiler, A., Reinhardt, R., Sarkari, J., Caugant, D.A., and Achtman, M. (1996) Allelic polymorphism and site-speci®c recombination in the opc locus of Neisseria meningitidis. Mol Microbiol 19: 841±856. Stanssens, P., Opsomer, C., McKeown, Y.M., Zabeau, M., and Fritz, H.-J. (1989) Ef®cient oligonucleotide-directed construction of mutations in expression vectors by the gapped duplex DNA method using alternating selectable markers. Nucleic Acids Res 17: 4441±4454. Stein, D.C., Gunn, J.S., Radlinska, M., and Piekarowicz, A. (1995) Restriction and modi®cation systems of Neisseria gonorrhoeae. Gene 157: 19±22. StruyveÂ, M., Moons, M., and Tommassen, J. (1991) Carboxyterminal phenylalanine is essential for the correct assembly of a bacterial outer membrane protein. J Mol Biol 218: 141±148. Swanson, J. (1973) Studies on gonococcus infection IV. Pili: their role in attachment of gonococci to tissue culture cells. J Exp Med 137: 571±589. Taha, M.-K., Morand, P.C., Pereira, Y., EugeÁne, E., Giorgini, D., Larribe, M., et al. (1998) Pilus-mediated adhesion of Neisseria meningitidis: the essential role of cell contactdependent transcriptional upregulation of the PilC1 protein. Mol Microbiol 28: 1153±1163. Thiesen, B., Greenwood, B., Brieske, N., and Achtman, M. (1997) Persistence of antibodies to meningococcal IgA1 protease versus decay of antibodies to group A polysaccharide and Opc protein. Vaccine 15: 209±219. Thompson, S.A., and Sparling, P.F. (1993) The RTX cytotoxin-related FrpA protein of Neisseria meningitidis is secreted extracellularly by meningococci and by HlyBD Escherichia coli. Infect Immun 61: 2906±2911. Thompson, S.A., Wang, L.L., and Sparling, P.F. (1993) Cloning and nucleotide sequence of frpC, a second gene from Neisseria meningitidis encoding a protein similar to RTX cytotoxins. Mol Microbiol 9: 85±96. Tinsley, C.R., and Nassif, X. (1996) Analysis of the genetic differences between Neisseria meningitidis and Neisseria gonorrhoeae : two closely related bacteria expressing two different pathogenicities. Proc Natl Acad Sci USA 93: 11109±11114. Trieu-Cuot, P., and Courvalin, P. (1984) Nucleotide sequence of the transposable element IS15. Gene 30: 113± 120. Valdivia, R.H., and Falkow, S. (1996) Bacterial genetics by ¯ow cytometry: rapid isolation of Salmonella typhimurium acid-inducible promoters by differential ¯uorescence induction. Mol Microbiol 22: 367±378. Van Regenmortel, M.H.V., Briand, J.P., Muller, S., and PlaueÁ, S. (1988) Synthetic Polypeptides as Antigens. Amsterdam: Elsevier Science. Virji, M., Makepeace, K., Ferguson, D.J.P., Achtman, M., Sarkari, J., and Moxon, E.R. (1992) Expression of the Opc protein correlates with invasion of epithelial and endothelial cells by Neisseria meningitidis. Mol Microbiol 6: 2785±2795. Virji, M., Makepeace, K., Ferguson, D.J.P., Achtman, M., and Moxon, E.R. (1993) Meningococcal Opa and Opc proteins: their role in colonization and invasion of human epithelial and endothelial cells. Mol Microbiol 10: 499±510. Virji, M., Makepeace, K., and Moxon, E.R. (1994) Distinct mechanisms of interactions of Opc-expressing meningococci at apical and basolateral surfaces of human endothelial cells; The role of integrins in apical interactions. Mol Microbiol 14: 173±184. Wang, J.-F., Caugant, D.A., Li, X., Hu, X., Poolman, J.T., Crowe, B.A., et al. (1992) Clonal and antigenic analysis of serogroup A Neisseria meningitidis with particular reference to epidemiological features of epidemic meningitis in China. Infect Immun 60: 5267±5282. Yang, Q.-L., and Gotschlich, E.C. (1996) Variation of gonococcal lipooligosaccharide structure is due to alterations in poly-G tracts in lgt genes encoding glycosyl transferases. J Exp Med 183: 323±327. Zhou, J., and Spratt, B.G. (1992) Sequence diversity within the argF, fbp and recA genes of natural isolates of Neisseria meningitidis: interspecies recombination within the argF gene. Mol Microbiol 6: 2135±2146. Zhou, J.J., Bowler, L.D., and Spratt, B.G. (1997) Interspecies recombination, and phylogenetic distortions, within the glutamine synthetase and shikimate dehydrogenase genes of Neisseria meningitidis and commensal Neisseria species. Mol Microbiol 23: 799±812. Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650