Download genes, pseudogenes, deletions, insertion elements and DNA islands

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Biology and consumer behaviour wikipedia , lookup

Ridge (biology) wikipedia , lookup

Epigenetics in learning and memory wikipedia , lookup

Genomic imprinting wikipedia , lookup

DNA barcoding wikipedia , lookup

Cancer epigenetics wikipedia , lookup

DNA vaccination wikipedia , lookup

Gene nomenclature wikipedia , lookup

Epigenetics of neurodegenerative diseases wikipedia , lookup

Cell-free fetal DNA wikipedia , lookup

Genetic code wikipedia , lookup

Molecular cloning wikipedia , lookup

Polycomb Group Proteins and Cancer wikipedia , lookup

Extrachromosomal DNA wikipedia , lookup

Minimal genome wikipedia , lookup

Transposable element wikipedia , lookup

Primary transcript wikipedia , lookup

Human genome wikipedia , lookup

Gene desert wikipedia , lookup

Deoxyribozyme wikipedia , lookup

Gene expression programming wikipedia , lookup

NEDD9 wikipedia , lookup

Genetic engineering wikipedia , lookup

Cre-Lox recombination wikipedia , lookup

No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup

Metagenomics wikipedia , lookup

Genomics wikipedia , lookup

Nutriepigenomics wikipedia , lookup

Genome (book) wikipedia , lookup

RNA-Seq wikipedia , lookup

Microsatellite wikipedia , lookup

Genome evolution wikipedia , lookup

Non-coding DNA wikipedia , lookup

Gene expression profiling wikipedia , lookup

Epigenetics of human development wikipedia , lookup

Vectors in gene therapy wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

Pathogenomics wikipedia , lookup

Genome editing wikipedia , lookup

History of genetic engineering wikipedia , lookup

Gene wikipedia , lookup

Point mutation wikipedia , lookup

Designer baby wikipedia , lookup

Helitron (biology) wikipedia , lookup

Therapeutic gene modulation wikipedia , lookup

Microevolution wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Transcript
Molecular Microbiology (1999) 33(3), 635±650
The opcA and CopcB regions in Neisseria: genes,
pseudogenes, deletions, insertion elements and
DNA islands
Peixuan Zhu, Giovanna Morelli and Mark Achtman*
Max-Planck Institut fuÈr molekulare Genetik, Ihnestraûe
73, D-14195 Berlin, Germany.
Summary
Previous data have indicated that the opc gene
encoding an immunogenic invasin is speci®c to Neisseria meningitidis (Nm) and is lacking in Neisseria
gonorrhoeae (Ng). The data presented here show
that Nm and Ng both contain two paralogous opclike genes, opcA, corresponding to the former opc
gene, and CopcB, a pseudogene. The predicted
OpcA and OpcB proteins possess transmembrane
regions with conserved non-polar faces but differ
extensively in four of the ®ve surface-exposed
loops. Gonococcal OpcA was expressed weakly
under in vitro conditions, and it is unknown whether
these bacteria can express this protein at high levels.
Analysis of the sequences ¯anking opcA and CopcB
revealed a framework of conserved housekeeping
genes interspersed with DNA islands. These regions
also contained several pseudogenes, deletions and
IS elements, attesting to considerable genome plasticity. Both opcA and CopcB are located on DNA
islands that have probably been imported from unrelated bacteria. A third island encodes the dcmD/
dcrD R /M genes in Ng versus a small open reading
frame in most strains of Nm. Rare strains of Nm
were identi®ed in which the R /M island has been
imported. DNA islands in Nm and Ng seem to have
been acquired by recombination via conserved ¯anking housekeeping genes rather than by insertion of
mobile genetic elements.
Introduction
The physical chromosome map order of the Gramnegative bacterial pathogens Neisseria gonorrhoeae
(Ng) and Neisseria meningitidis (Nm) is similar (Dempsey
et al., 1995), and their housekeeping genes are 98%
Received 23 March, 1999; revised 12 May, 1999; accepted 20
May, 1999. *For correspondence. E-mail [email protected]; Tel. (‡49) 30 8413 1262; Fax (‡49) 30 8413 1385.
Q 1999 Blackwell Science Ltd
homologous (Zhou and Spratt, 1992). Both species colonize only humans, but Ng is specialized for the mucosa
of the urogenital tract and causes gonorrhoea and pelvic
in¯ammatory disease, whereas Nm is specialized for the
mucosa of the nasopharynx and causes meningitis and
septicaemia. Strains of Nm contain genes within the cps
region whose products result in the synthesis of capsular
polysaccharide (Petering et al., 1996), whereas different
genes are found at the corresponding location in Ng.
Several other genes are also thought to be speci®c to
Nm versus Ng: mip (Sampson and Gotschlich, 1992); frp
encoding a homologue to RTX toxins (Thompson and
Sparling, 1993; Thompson et al., 1993); gpxA encoding
a glutathione peroxidase (Moore and Sparling, 1995);
and the subject of this report, the opc gene (Olyhoek et
al., 1991).
opc encodes a cell surface-exposed, outer membrane
protein with 10 transmembrane strands and ®ve surfaceexposed loops (Merker et al., 1997). The Opc protein
(formerly 5C) mediates adhesion to and invasion of
endothelial and epithelial cells (Virji et al., 1992; 1993;
De Vries et al., 1996; 1998); it interacts indirectly via vitronectin with b-integrins on endothelial cells (Virji et al.,
1994) and directly with heparan sulphate proteoglycans
on epithelial cells (De Vries et al., 1998). Opc expression
is hypervariable owing to different levels of transcription,
depending on the length of a polycytidine stretch in the
promoter region (Sarkari et al., 1994). Opc is also highly
immunogenic in humans and stimulates bactericidal antibodies (Rosenqvist et al., 1993; 1995; Haneberg et al.,
1998). The opc gene is present in many strains of Nm (Seiler et al., 1996) and is highly conserved, with only minor
sequence variation. However, Ng and some Nm, including
the clonal groupings of serogroups B and C called the A4
cluster and the ET-37 complex, did not hybridize with an
opc probe.
The degree of genetic variation between different
strains from one species or from closely related species
has only rarely been investigated. Closely related bacteria
can differ in their possession of mobile genetic elements,
some of which have been designated pathogenicity
islands (Hacker et al., 1997). A comparison of the complete genome sequences from two strains of Helicobacter
pylori has shown that 7% of the genes were only present
636
P. Zhu, G. Morelli and M. Achtman
in one or the other strain (Alm et al., 1999). Similarly, a
subtraction library between Nm and Ng yielded clones
from at least nine regions scattered around the chromosome (Tinsley and Nassif, 1996). It seemed likely that
the opc gene might represent another example of such
intraspecies and interspecies genetic variability, although
none of the clones isolated by Tinsley and Nassif (1996)
contained opc sequences. In order to resolve this question, we sequenced the region surrounding opc from
representative strains of Nm and Ng that differ in their
hybridization with an opc probe. Furthermore, a second
region surrounding a paralogous opc-like gene was also
sequenced. These regions contained opc-like genes plus
a mixture of housekeeping genes, insertion elements,
pseudogenes, deletions and DNA islands, which are
described here. The results demonstrate high levels of
genetic variability outside of housekeeping genes associated with import by horizontal genetic exchange.
Results
The opcA and CopcB regions
The < 12 kb region between glyA and purK, two housekeeping genes that ¯ank opc, was sequenced from polymerase chain reaction (PCR) products ampli®ed from
chromosomal DNA of Nm serogroup A, subgroup IV-1
strain Z2491, Nm serogroup C, ET-37 complex strain
FAM18, Ng strain FA1090 and Ng strain MS11 (Fig. 1).
The genomes of two of these strains are currently
being sequenced (Z2491, http://www.sanger.ac.uk/
Projects/N_meningitidis/; FA1090, http://dna1.chem.ou.
Fig. 1. Genetic organization of the opcA region in Ng and Nm. ORFs are indicated by grey boxes with gene symbols whose transcriptional
direction is indicated by short arrows. The primary maps indicate the homologous genes, while the small additional maps indicate the
positions of genes that are unique to each species. Top: Ng strain FA1090 (GenBank accession number AJ242839). Nucleotide differences in
Ng strain MS11 (GenBank AJ242840) are indicated by single letters under the map, except for a box around the GGC codon, which is deleted
in strain MS11. Middle: Nm strain Z2491 (GenBank AJ242841). Bottom: Nm strain FAM18 (GenBank AJ242842). A deletion in FAM18 is
indicated by a dotted line. CE16, a Correia element in CIS1106A2. C10 , position of the polycytidine stretch in Nm that was arbitrarily set to 10
cytidines. Symbols for the target site and inverted repeats ¯anking IS1106A3, stop codons and frameshifts are indicated at the bottom left.
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
opcA and opcB regions 637
Fig. 2. The genetic organization of the CopcB region in Ng FA1090 (top; Genbank AJ242836), Nm Z2491 (middle; Genbank AJ242837) and
Nm FAM18 (bottom; Genbank AJ242838). Symbols are as in Fig. 1. The DNA inserted in Nm at the end of the CopcB gene is indicated on a
separate small map, and deletions are indicated by dotted lines.
edu/gono.html), and the two other strains have been
used extensively for genetic analyses (Aho et al., 1991;
Bihlmaier et al., 1991). The opc gene in this region will
be referred to henceforth as opcA because a second
opc-like gene, CopcB, was found by searching the two
genome projects for homologies with the opcA sequence.
The < 4 kb region from aroG to comEA containing
CopcB was also sequenced from chromosomal PCR products from strains Z2491, FA1090 and FAM18 (Fig. 2).
The map locations of the opcA and CopcB regions were
determined by hybridizing opcA and CopcB probes from
each species to Southern blots of chromosomal DNA
that had been digested with rare-cutting restriction
enzymes whose sites have been mapped (Dempsey et
al., 1991; 1995). Each region is at approximately the
same position in each species, and the two regions are
almost diametrically opposed on the chromosome (see
Experimental procedures).
Both the opcA and CopcB regions contain a conserved
framework of open reading frames (ORFs), most of which
have strong (61±85%) protein homology to known genes
from other bacterial species. These framework genes
were named glyA (serine hydroxymethyltransferase,
homologous to SWISSPROT P00477); dedA (Escherichia
coli protein of unknown function, P05948); abcZ (ABC
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
transporter ATP-binding protein, P43672); trpE (anthranilate synthase component I, P20580); purK (phosphoribosylaminoimidazole
carboxylase,
P12045);
aroG
(phospho-2-dehydro-3-deoxyheptaneonate
aldolase,
P00886) and comEA (a Bacillus subtilis protein involved
in DNA transformability, P39694). The sizes of these
framework ORFs were the same in Nm, Ng and the other
species, except for DedA (150 versus 219 amino acids).
The regions also contained ORFs with no known homologies, arbitrarily named orfA, orfD, orfX, orfY and orfZ.
The sizes of the opcA and CopcB regions varied as a
result of intergenic regions that contain different genes
(orfA, orfE, orfX, orfY, dcmD, dcrD ) in each species as
well as the variable presence of indels (insertions or deletions) or IS (insertion) elements related to IS1106
(CIS1106A2, IS1106A3) and IS1016 (IS1016C2). Several
of the ORFs were pseudogenes, as indicated by the pre®x
C, resulting from frameshifts, stop codons and/or insertion of repetitive DNA (CE16, CE17) related to the socalled Correia elements (Correia et al., 1988). [There
seem to be errors in the GenBank dcmD sequence from
Ng strain MS11 (M86915) in which positions 9165±9282
are frameshifted, resulting in an incorrect stretch of 40
amino acids (positions 216±255 in SWISSPROT
P31033). The sequences described here were obtained
638
P. Zhu, G. Morelli and M. Achtman
by direct sequencing of both strands from PCR products,
and these frameshifts were not present in either of the
Ng strains tested.]
The opcA regions of the two Ng strains are almost identical (20 polymorphic sites over 11 730 bp; 99.8% identical)
(Fig. 1). In Nm FAM18, the opcA region has a 3.9 kb deletion between the glyA gene and the dedA gene that has
eliminated the opcA gene and ¯anking DNA (Fig. 1). A
deletion of approximately the same size is present in all
42 Nm strains that were shown not to hybridize with an
opcA probe (Seiler et al., 1996) (PCR analysis, data not
shown). Excluding the deletion, the opcA region was
very similar (96.8% identical over 8170 bp) in the two
Nm strains sequenced here. The homology between Ng
FA1090 and Nm Z2491 in the framework of the opcA
region was 93%, somewhat less than the homologies
within either species.
The homology between the Nm strains in the framework
of the CopcB region was 99%; between Ng strain FA1090
and either of the Nm strains, the homology was 97% for
Fig. 3. Aligned amino acid sequences of the OpcA and OpcB proteins from Nm and Ng. Frameshifts (bent arrows, frame indicated underneath)
and stop codons (yellow Xs) in OpcB were replaced by appropriate amino acids, and the sequences were aligned manually in order to avoid
disrupting the transmembrane regions previously elucidated in the two-dimensional model of Nm OpcA (Merker et al., 1997). Tm1 to Tm10
indicate the 10 transmembrane strands, and loop 1 to loop 5 indicate the surface-exposed loops. Non-polar amino acids on the non-polar face
of transmembrane strands are shaded in dark grey. Identical amino acids at other positions are coloured red (all four OpcA and OpcB proteins
from strains FA1090 and Z2491), blue (three proteins) or green (two proteins). Dots indicate single amino acid gaps introduced for the
purposes of alignment. Sequence polymorphisms (grey) in OpcA within Nm (13 sequence variants; Seiler et al., 1996) or strain MS11 (Ng) are
indicated in the lowest two lines of each part of the ®gure where dashes indicate unaltered amino acids. Underlining within the FA1090 OpcA
protein indicates the sequences of ®ve synthetic peptides used for immunization.
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
opcA and opcB regions 639
the aroG gene through to the beginning of opcB and 90%
from the end of the opcB gene to comEA. Thus, large portions of the two regions were strongly homologous within
each species and between both species.
The Opc protein family
The opcA region of Ng strains FA1090 and MS11 contains
an opcA-like gene (Fig. 1). Furthermore, an opcA-like
gene is present in most Ng strains because we ampli®ed
a product of the same size from 26 Ng strains using oligonucleotides O570 immediately upstream of opcA and
O574 within opcA.
Sequence variability of the opcA gene was very low in
different strains of Nm (Seiler et al., 1996). The opcA
genes from Ng strains FA1090 and MS11 are also almost
identical; they only differ by three nucleotides (two synonymous and one non-synonymous change) plus an indel
of one codon (Fig. 1). In contrast, opcA from Ng
FA1090 was only 59% identical to opcA from Nm Z2491,
suf®ciently different that it had not been detected previously by hybridization (Seiler et al., 1996).
CopcB contains multiple frameshifts and stop codons in
both Nm and Ng. CopcB is 99% identical in the two Nm
strains, but the CopcB gene was considerably shorter in
Nm (476 bp) than in Ng (753 bp) (Fig. 2), and CopcB
from Ng FA1090 is only 71% identical to CopcB from
Nm Z2491 for those parts of the gene that are present in
both species. Finally, the DNA homology between opcA
and CopcB is 55% within Ng FA1090 and 45% within
Nm Z2491.
A multiple sequence alignment of the Opc family of proteins was constructed manually in which the integrity of the
transmembrane strands was maximized, the frameshifts
in CopcB were compensated by adding or deleting
nucleotides, and stop codons in CopcB were replaced
with suitable amino acids (Fig. 3). The known sequence
polymorphism of OpcA in Nm and Ng is also indicated in
Fig. 3. As expected for outer membrane proteins (StruyveÂ
et al., 1991), the aligned OpcA and OpcB sequences all
contain a signal peptide, a C-terminal phenylalanine and
transmembrane regions containing non-polar amino
acids on one face. OpcA from both Ng and Nm contains
10 transmembrane strands and ®ve surface-exposed
loops, but loops 2 and 5 are shorter in Ng. Part of loop 2
was deleted in Ng OpcB, and Nm OpcB is deleted from
the middle of loop 1 through to the end of loop 3. Fairly
strong homologies were obvious within the signal peptide,
the transmembrane regions and parts of loops 1 and 5 for
all the sequences. However, the only longer stretch of
identity between the ®ve surface-exposed loops is the
sequence LKEKHK in loop 1 of OpcA from Nm and Ng.
The periplasmic turns between the transmembrane
regions showed minor sequence variability. In particular,
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
Fig. 4. Western blot analysis with murine sera against loops 1 (A)
and 2 (B) of Ng OpcA. Cell envelope preparations were from Ng
strains FA1090, MS11 and freshly isolated clinical isolates (Zwxyz)
that had been passaged no more than twice in vitro. Controls
include cell envelopes from E. coli strains expressing cloned opcA
(plasmid pBE651) or a vector (pMA5-8pL) plasmid control as well
as puri®ed OpcA from Nm and cell envelopes from an Opc‡‡ Nm
strain (Z5104).
the Ng OpcA sequence from MS11 possesses a deletion
of one amino acid in the turn between transmembrane
strands Tm6 and Tm7, while the Nm OpcA sequence is
one amino acid shorter between Tm2 and Tm3 than is
Ng OpcA (Fig. 3).
Absence of strong expression of Ng opcA in vitro
Murine sera against loops 1 and 2 of Ng OpcA were used
in immunoblots to determine whether OpcA was detectable in cell envelopes from Ng strains FA1090, MS11
and 12 fresh clinical isolates (Fig. 4). Cell envelopes
from E. coli expressing recombinant Ng OpcA were used
as a positive control. As might be expected from the homologies between OpcA loop 1 from Nm and Ng, sera to Ng
loop 1 reacted with puri®ed OpcA protein from Nm and
with Nm cell envelopes containing OpcA (Fig. 4A),
whereas sera to loop 2 did not (data not shown). Sera to
loops 1 and 2 both reacted with recombinant Ng OpcA,
but no reaction was detected with any of the 14 cell envelopes prepared from Ng after aerobic (Fig. 4) or anaerobic
growth.
Transcription of the Ng opcA gene was tested by Northern blot analysis of total RNA from FA1090 using a digoxygenin (DIG)-labelled opcA probe prepared by PCR
ampli®cation with oligonucleotides O570/O574 (Fig. 5).
A weak band of 0.9 kb, the size of the opcA coding region,
was detected (Fig. 5A). After prolonged exposure, additional extremely weak bands of 1.5 and 2.6 kb were also
640
P. Zhu, G. Morelli and M. Achtman
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
opcA and opcB regions 641
detected (not shown). Primer extension with oligonucleotides located within opcA and the upstream ORF, orfY,
was performed to map the start point(s) of transcription
(Fig. 5C and D). The results show the possible existence
of two weak transcription start sites, P1, immediately after
the ATG start codon of opcA, and P2, upstream of orfY.
P1 bears no obvious relationship to the promoter upstream
of opcA in Nm (Sarkari et al., 1994), but a second ATG
start codon is located 48 bp downstream of P1 that might
allow translation of a shortened opcA gene. Neither P1
nor P2 are preceded by sequences that resemble 10 or
35 consensus sequences, and there are no ribosomal
binding sites after P1 or P2.
Reverse transcription (RT)±PCR was performed in
order to elucidate whether opcA is transcribed alone or
is co-transcribed with orfY. RT±PCR was performed on
total RNA using primers within opcA and orfY (Fig. 5E).
Speci®c products were obtained with primers O570/
O574 (Fig. 5B, lane 7) and O571/O574 (lane 8), spanning
much of the opcA gene. Thus, the Ng opcA gene is transcribed. In order to test whether this transcript could be
attributed to P2, RT±PCR was performed with O1178/
O956 (Fig. 5B, lane 10), which spans the region including
orfY and the beginning of opcA. A transcript was detected,
but this probably does not include all of opcA because no
product was obtained using primers O1178/O574
(Fig. 5B, lane 9). The existence of yet another transcript
beginning upstream of orfY (and 58 of P2) is indicated by
the observation that RT±PCR products were obtained
with primers O595/O1135 that span P2. This transcript
probably terminates before opcA because no products
were obtained with O595/O956 (Fig. 5B, lane 11). Thus,
these results con®rm that the opcA gene is transcribed
in Ng, even if only weakly, but do not clarify the location
of the promoter used for transcription.
IS elements in the opcA and CopcB regions
IS1106 is an IS element that is inserted next to the Nm
porA gene (Knight et al., 1992). In Nm strain Z2491, the
region upstream of opcA contains a pseudo-IS element,
CIS1106A2, into which IS1106A3, a related IS element,
has inserted in the inverted orientation. Sequence comparisons between strains in which IS1106A3 is lacking
revealed that the insertion of IS1106A3 resulted in a duplication of the insertion target site, CAGACCG (data not
shown). CIS1106A2 was probably a pseudo-IS element
even before insertion of IS1106A3 because the internal
transposase gene, orf1A, contains two frameshifts and
one stop codon as well as a 105 bp Correia element,
CE16. We assigned unique designations to both of the IS
elements upstream of opcA because their transposase
genes, orf1A and orf1B, are only 94% identical. Orf1B
from IS1106A3 consists of 335 amino acids versus 288 for
ORF1 from IS1106, and those two ORFs are only 79% identical. The three IS elements also differ in their organization:
the 18 bp inverted repeats (AGA CCT TTG CAA AAT TCC)
present at the ends of IS1106A3 differ in sequence from the
35±36 bp inverted repeats ¯anking IS1106. CIS1106A2
does not possess any inverted repeats.
A different IS element, IS1016C2, is present in the
CopcB region of Nm strain Z2491 in the intergenic
region between orfE and orfD (Fig. 2). IS1016C2 is
97% homologous to IS1016N (Hilse et al., 1996; Mahillon
and Chandler, 1998) and is ¯anked by the same 19 bp
inverted repeats. OrfE (60 amino acids) may also be associated with IS elements because it is 43% identical to a 64amino-acid transposase (D64001) from Synechocystis sp.
(Kaneko et al., 1996).
Insertions, deletions and pseudogenes
The homology between Ng and Nm stops immediately
upstream of opcA. In Ng strains MS11 and FA1090, a
1401 bp segment has been inserted between glyA and
opcA that contains two small ORFs (orfX, 216 bp; orfY,
270 bp), separated by an intergenic region (Fig. 1). This
insertion is at the same location as the IS elements present in Nm strain Z2491, whereas both opcA and its
upstream region have been deleted in Nm strain FAM18.
FAM18 also has a second deletion of 892 bp near
CopcB that fuses orfE and orf1C of IS1016C2 (Fig. 2).
If these genes are functional in strain Z2491, they are
pseudogenes in FAM18.
Fig. 5. Transcription of opcA in Ng.
A. Northern blot analysis. Track 1: RNA size markers whose molecular weights (kb) are indicated at the left. Track 2: hybridization of total
RNA from strain FA1090 with a DIG-labelled Ng opcA probe ampli®ed using oligonucleotides O570 and O574.
B. RT±PCR analysis using total RNA from FA1090 with oligonucleotides from the opcA and orfY regions. Molecular weights of size markers
are indicated at the left and right. Tracks 1±6, no reverse transcriptase step before PCR ampli®cation; tracks 7±10, including the reverse
transcriptase step. Oligonucleotides used for reverse transcription, O574 (tracks 1±3 and 7±9); O956 (4 ±5 and 10±11); O1135 (6 and 12).
The oligonucleotides used for PCR ampli®cation and their locations are indicated in (E).
C and D. Transcriptional start sites in the opcA and orfY region of FA1090. PE, products of primer extension reaction. Sequence reactions
(C, T, A and G) were performed using the same oligonucleotides as for the corresponding primer extension experiments. Total RNA extracted
from FA1090 was extended using reverse transcriptase with radiolabelled oligonucleotides O956 (C) or O1135 (D).
E. Schematic representation showing the locations of the P1 and P2 transcriptional start sites and the results of the RT±PCR experiments.
Small bent arrows indicate the locations of the oligonucleotides used. RT±PCR products that were obtained are indicated by horizontal black
lines, while products that were not obtained are indicated by boxes with dashed lines.
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
642
P. Zhu, G. Morelli and M. Achtman
Fig. 6. Sequence comparisons of the left (top) and right (bottom) borders of the opcA island in three strains. Sequences belonging to the
glyA and opcA gene are boxed. Polymorphic positions are indicated by grey nucleotides. DNA uptake sequences are black; nucleotides in the
DUS sequence de®ned by (Goodman and Scocca (1988) are bold, while nucleotides differing from that sequence are in normal type. A direct
repeat (DR) is indicated by a horizontal arrow, and gaps are indicated by dots.
CcomEA at the right end of the CopcB region is probably a pseudogene in Ng because it possesses a Correia
element (CE17, 152 bp) that results in a frameshift. The
CcomEA ORF in Ng terminates three amino acids beyond
the comEA ORF in Nm, but the sequences of the predicted
proteins are completely different. CorfD is probably also a
pseudogene in Ng because it possesses two frameshifts
(relative to the Nm sequences) that result in two stop
codons (Fig. 2).
Import of the opcA island
As described above, the orthologous homology between
opcA of Nm and Ng is only 55%, whereas much higher
levels of homology are found between the framework
ORFs and other pairs of genes in Ng and Nm (see Discussion). Although this very low sequence homology might
have resulted from diversifying selection, such selection
should have led to considerable sequence diversity within
each of these species. However, the sequence diversity
that is actually found in opcA in each species is extremely
low, and the opcA genes seem to be located on a DNA
island that might have been imported into Nm and/or Ng
from an unrelated species. This explanation is supported
by the presence of unrelated sequences to the left of
opcA in Nm (IS elements) and Ng (orfX, orfY ). Furthermore, the %GC of orfX (47%) and orfY (43%) is lower
than the 51% typical of Nm and Ng. We therefore
examined the sequences of these putative islands for
other features that are compatible with import.
The DNA uptake sequence (DUS) GCCGTCTGAA or
sequences that differ from the DUS by 1 bp are present
as inverted repeats at each end of the opcA islands in
Nm and Ng (Fig. 6). DUS can facilitate the uptake of
DNA segments from an unrelated species (Kroll et al.,
1998). After uptake, recombination could occur via homology in conserved ¯anking housekeeping genes, in which
case the islands should be ¯anked by sequences that
are more diverse between Nm and Ng than are the framework sequences that were inherited from their last common ancestor. The stop codon of the glyA gene marks
the left border of the opcA islands, and no remnants of
recombination were detected near glyA. However, the
aligned 189 bp to the right of opcA contains gaps as well
as 59 polymorphic sites and is only 59% identical between
Nm Z2491 and Ng FA1090 (Fig. 6), whereas the subsequent 4.5 kb region through to the end of trpE is 92% identical between the two strains and does not contain any
other stretch of low homology. Thus, the region between
glyA and dedA does seem to correspond to a DNA island
that has been imported into Nm or Ng or both. The sequences at the ends of these islands show no motifs typical
of pathogenicity islands (Hacker et al., 1997) or mobile
genetic elements that could account for their import,
except that the uptake sequences near both ends of the
island form part of a 14 bp direct repeat (Fig. 6, DR). However, DNA islands within species that are naturally transformable and recombine frequently may differ from those
of other bacteria.
A second island within the opcA region
The region between trpE and purK contains alternative,
unrelated sequences in Nm and Ng. In both strains of
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
opcA and opcB regions 643
Nm, this region contains orfA, whereas that region in Ng
contains the dcmD/dcrD restriction±modi®cation genes
(Stein et al., 1995) (Fig. 1). Thus, this region corresponds
to a small island containing different genes in the two species. The %GC of these genes is also unusually low
(dcmD, 47%; dcrD, 44%; orfA, 36%).
We have searched for direct evidence for import of this
island by examining this region in multiple strains of Nm
and Ng. All 21 strains of Ng yielded PCR products of the
same size with primers in the ¯anking trpE and purK
genes. The island was sequenced from nine of these bacteria in addition to strains FA1090 and MS11. Five alleles
were found whose sequences were 99% identical, and
only one amino acid polymorphism was detected (N-274
to D-274 in the DcmD methylase).
A representative collection of 107 strains of Nm has
been characterized for sequence variability at multiple
gene loci by multilocus sequence typing (Maiden et al.,
1998). Strains with identical combinations of alleles at six
conservative loci are assigned to a common sequence
type (ST); a total of 49 STs were recognized within the
107 strains. The region between trpE and purK was ampli®ed by PCR from all 107 strains. Altogether, 105 of the 107
strains possessed an island of the same size as in strains
Z2491 and FAM18. The 546 bp region including orfA was
sequenced from eight of these strains in addition to Z2491
and FAM18; three alleles of orfA that differed by three
synonymous polymorphic sites and two frameshifts were
found among the 10 sequences. These results indicate
that the island contains orfA in most strains of Nm.
Two of the 107 Nm strains (STs 25 and 26) yielded a
PCR product of the same size as did strains of Ng. The
sequences of that PCR product were identical from both
Nm strains. They contained dcmD/dcrD genes that were
98% homologous to the genes from Ng, bordered by a
region of somewhat lower homology in purK (95% over
150 bp) (Fig. 8). In both Ng and STs 25 and 26, the
dcmD/dcrD island contains a sequence (GTCGTCTGAA)
that differs from a DUS by one nucleotide. These data indicate that ST25 and ST26 are descended from a common
ancestor that had imported this island from an unrelated
species via recombination in the ¯anking genes.
The DcmD protein (also called S.Ngo IV) methylates
DNA containing GCCGGC (Stein et al., 1995) and renders
it resistant to digestion by the DcrD isoschizomer NaeI.
Chromosomal DNAs from Ng strains MS11 and FA1090
were resistant to digestion by NaeI, as were DNAs from
Nm STs 25 and 26 (Fig. 7). These results indicate that
the dcmD gene is functional and methylates chromosomal
DNA. In contrast, eight of the 10 Nm strains possessing
orfA were digested by NaeI, although DNAs from
FAM18 and NZ10 (A4 cluster) were not (Fig. 7). Thus, a
DcmD methylase is rare in Nm, and most Nm do not
methylate GCCGGC. The dcrD gene in ST25 contains a
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
Fig. 7. Digestion of chromosomal DNA with NaeI. Tracks 1 and 16
contain molecular weight markers (1 kb ladder; Gibco BRL). Tracks
2±15 contain chromosomal DNA from strains (ST or clonal group):
NG E28 (Nm ST26), NG G40 (Nm ST25), MS11 (Ng), FA1090
(Ng), B40 (Nm subgroup I), Z3524 (Nm subgroup III), BZ 10 (Nm
A4 cluster), 44/76 (Nm ET-5 complex), BZ 198 (Nm lineage 3), NG
4/88 (Nm ST30), 297-0 (Nm ST49), BZ 147 (Nm ST48), Z2491
(Nm subgroup IV-1) and FAM18 (Nm ET-37 complex).
frameshift mutation and may encode an inactive restriction
enzyme (Fig. 8).
CopcB may be on a third DNA island
The homology between CopcB in Nm and Ng is low
(71%), and DNA encoding orfE and IS1016C2 is inserted
to the right of CopcB in Nm. These results resemble those
for the opcA island, and we therefore also searched for
uptake sequences and low homology in ¯anking DNA.
The DNA to the left of CopcB is 97% identical between
Nm and Ng and does not possess any blocks of nonhomology. However, alignments in which the orfE ±
IS1016C2 segment to the right of CopcB had been
removed showed gaps and numerous polymorphisms
over 199 bp to the right of the CopcB gene (Fig. 9, 38%
identity). This stretch of DNA also contained inverted
DUS (Fig. 9). The 796 bp to the right of that stretch were
95% homologous between Nm and Ng and 99% homologous within Nm. These data are similar to those from the
other two islands and suggest that CopcB is on a third
island that has been imported from an unrelated species
into either Nm or Ng or both.
Discussion
Before this analysis, opc was perceived to be a phasevariable gene present in some strains of Nm and characterized by extremely low levels of sequence polymorphism
(Sarkari et al., 1994; Seiler et al., 1996). The results presented here alter this perception drastically. A sequence
variant of opc, now opcA, is present (but poorly
expressed) in Ng as well as in Nm. Furthermore, both
Ng and Nm possess a second paralogous opcA-like
644
P. Zhu, G. Morelli and M. Achtman
Fig. 8. Features and ¯anking DNA of the island between trpE and purK in the opcA region. The islands and those parts of the trpE and purK
genes that were sequenced in ST25 (Nm strain NG G40) are shown, as is the percentage identity between Ng FA1090 and the Nm ST25
strain or between the Nm ST25 strain and Nm Z2491. The locations of a DNA uptake sequence and of a ‡ 1 frameshift deletion within dcrD
of the ST25 strain are also indicated.
pseudogene, CopcB. opcA and CopcB show large sequence differences that seem to re¯ect their location in DNA
islands that have been imported from unrelated species.
Import of three DNA islands in the opcA and CopcB
regions
The evidence for three DNA islands in the opcA and
CopcB regions is similar. Each contains genes that are
characterized by unusually low homology or are unrelated
when Nm and Ng are compared. Each island contains
DNA uptake sequences and one border with low homology between Nm and Ng. For the island between trpE
and purK, rare strains of Nm have been identi®ed in
which DNA import had occurred in a common ancestor.
The opcA gene was not detected in 12 commensal neisseriae by PCR with various combinations of degenerate
primers that ampli®ed these genes from Nm and Ng
Fig. 9. Sequence comparisons of the right
border of the CopcB island in three strains.
The alignments were made after removing the
sequences from orfE to IS1016C2 whose
location is indicated by an arrow. Other details
are as in Fig. 6.
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
opcA and opcB regions 645
(data not shown). PCR analysis of the glyA ±abcZ region
in Neisseria lactamica ampli®ed a product that is 6 kb
shorter than that from Nm Z2491 (data not shown), as if
the DNA immediately surrounding opcA is lacking in
these bacteria. These observations suggest that Nm and
Ng may have evolved from an ancestor that did not possess any opcA island. The opcA islands may have been
imported only recently, because only 2.8% of the opcA
gene was polymorphic among 43 diverse strains of Nm
(Seiler et al., 1996). (Ng is so homogeneous that the minimal sequence diversity of two opcA sequences is not
necessarily unusual.) However, it remains unclear
whether these islands were imported once and then
replaced by a second import event or whether they have
been imported independently into each species.
The DNA sequences of the argF, fbp, recA, adk, argJ,
glnA and mtg housekeeping genes are 94 ±99% identical
between Nm and Ng (Zhou and Spratt, 1992; Feil et al.,
1996; Zhou et al., 1997). A housekeeping gene, aroE,
that was concluded to be exceptionally diverse (Zhou et
al., 1997) was 88±93% identical between Nm and Ng.
Similarly, within the opcA region, DNA stretches containing housekeeping genes were 93% identical between
Nm and Ng, DNA to the left of CopcB was 97% identical
and DNA to the right of CopcB was 90% identical. Many
genes encoding outer membrane or secreted proteins
and that might be under diversifying selection show similarly high levels of homology between Nm and Ng: porA
versus CporA, 91% (Feavers and Maiden, 1998); class
3 porB versus PI.B, 90% (Murakami et al., 1989); rmpG
versus rmpM, 96% (Klugman et al., 1989); tbpA, 96%
(Legrain et al., 1993) and iga, 90%. As the low orthologous
homology of 55% between opcA of Nm and Ng is associated with their import from an unrelated species, possibly other orthologous outer membrane proteins with low
homologies have also been imported. These include the
class 2 porB versus the PI.B gene, 69% (Murakami et
al., 1989); tbpA from the ET-37 complex versus Ng, 75%
(Legrain et al., 1993) and tbpB, 77% (Legrain et al.,
1993). Import of islands containing foreign genes is realistic for naturally transformable bacteria with a rich history
of recombination such as Nm and Ng. Recent data (B.
Linz, M. Schenker and M. Achtman, in preparation) has
shown a high frequency of recombination during local epidemics and that, on average, the imported DNA is several
kilobases long.
opcA and CopcB probably re¯ect an old duplication
event. Their homologies in either Nm (45%) or Ng (55%)
are lower than the homology for either opcA (59%) or
CopcB (71%) between the species, suggesting that the
duplication predates the subsequent sequence evolution
that has occurred in each of these genes. It seems unlikely
that an imported pseudogene could have been ®xed
because of the lack of positive selection. Therefore, if an
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
island containing CopcB was imported into Nm and Ng,
similarly to CporA in Ng (Feavers and Maiden, 1998),
the imported island probably contained a functional gene
that was subsequently inactivated by mutations.
The opcA region is lacking in bacteria of the ET-37 complex, such as FAM18, the A4 cluster and other unrelated
isolates (Seiler et al., 1996). These bacteria might be descendants of an ancestral strain that had not yet imported
the opcA region or represent the results of a deletion
event. Comparison of the sequences ¯anking the opcA
island between FAM18 and Z2491 (Fig. 6) revealed no
obvious end-points for an insertion event and no clustered
heterogeneity indicating recombination with unrelated
DNA. In contrast, the presence of IS1106-like elements
near the left end of the opcA island in Nm is compatible
with the possibility of deletion by illegitimate recombination, even if there are no obvious repetitive sequences
directly at the ends of the deletion. Once a deletion had
arisen, it could be transferred by homologous recombination to other strains of Nm.
Transcriptional regulation of opcA
Although Nm OpcA is highly immunogenic in humans
(Rosenqvist et al., 1993; 1995; Haneberg et al., 1998),
Nm can escape bactericidal killing by phase variation of
opcA transcription resulting from slipped-strand mispairing within a polycytidine stretch in the promoter region
(Sarkari et al., 1994). Nm freshly isolated from the nasopharynx can express very large amounts of OpcA protein
(Achtman et al., 1991), but yield variants expressing very
low levels at frequencies of 1% upon subsequent cultivation in vitro (Sarkari et al., 1994).
OpcA protein is apparently not expressed strongly by Ng
in vitro. Immunoblots with Ng within two passages of isolation did not reveal any expression of OpcA protein.
Furthermore, a convincing potential opcA promoter region
was not identi®ed in Ng, and no homopolymeric or heteropolymeric nucleotide stretches that might vary by slippedstrand mispairing precede the opcA structural gene. Only
a low level of mRNA expression was observed by Northern blots and by RT±PCR, even from Ng grown under
iron deprivation or anaerobically.
Intermediate expression of OpcA in Nm (Opc‡ ) does
not suf®ce for detectable adhesion and invasion and, by
extrapolation, the low levels of transcription that were
observed with Ng are unlikely to promote adhesion and
invasion. However, protein expression can be triggered
by contact with eukaryotic cells (Pettersson et al., 1996;
Taha et al., 1998) or by conditions that are unique within
an animal host (Camilli and Mekalanos, 1995; Hensel et
al., 1995; Valdivia and Falkow, 1996), and it is conceivable
that Ng OpcA is expressed ef®ciently during urogenital
infections.
646
P. Zhu, G. Morelli and M. Achtman
CopcB
IS elements
Most of the opcB gene is inactivated by a combination of
frameshifts and stop codons, similarly to the CporA
gene in Ng (Feavers and Maiden, 1998) and to a variety
of genes in H. pylori (Alm et al., 1999). The ®rst of these
frameshift mutations is in a homopolymeric adenine
stretch (A-6 in Ng and A-8 in Nm) encoding the second
and third amino acids of the OpcA signal peptide in both
species, whereas the other frameshift mutations are at different positions in Nm and Ng. The homopolymeric stretch
in the signal peptide is reminiscent of homopolymeric
stretches in other neisserial proteins (Jonsson et al.,
1991; Jennings et al., 1995; Yang and Gotschlich, 1996)
that result in phase-variable expression of the PilC,
LgtA, LgtB and LgtD proteins. Possibly, OpcB was originally expressed as a phase-variable protein and ®rst
became a pseudogene after Nm and Ng acquired opcA
as a result of selection against expression of both genes
in one cell.
Over 500 different IS elements belonging to more than 17
families have been described (Mahillon and Chandler,
1998). For several of these, the mechanism of transposition has been investigated in great detail (Mahillon and
Chandler, 1998), and chromosomal mobility has been
analysed under natural conditions (Boyd and Hartl,
1997). Three families of IS elements have been described
in Nm (IS1016, Hilse et al., 1996; IS1106, Knight et al.,
1992; and IS1301, Hilse et al., 1996), and the IS elements
described here are variants of two of these families with
somewhat unusual features.
The opcA region in many strains of Nm contains
IS1106A3 inserted in inverted orientation into
CIS1106A2. Often, integration of one IS element into a
second results in a tandem duplication leading to
increased transcription because of partial promoter sequences in each of the inverted repeats (Mahillon and
Chandler, 1998). We have only been able to ®nd one
other example in the literature (Trieu-Cuot and Courvalin,
1984) in which one IS element has inserted into the transposase gene of a second. In that case, the transposition
was such that the composite IS element expressed two
copies of two ORFs and transposed more ef®ciently than
a single IS element. This is unlikely to be the case for
IS1106A3 because the transposase of CIS1106A2 is
apparently a pseudogene and is also disrupted by the
insertion. IS1106A3 contains a copy of the target sequence used for integration into CIS1106A2, raising the
amusing possibility that it might also be a target for further
insertion of a related IS element. A further unusual feature
is that the inverted repeats at the ends of IS1106A3 are
totally different from those of IS1106 itself and are lacking
in IS1106A2.
OrfE in the opcB region of strain Z2491 is signi®cantly
homologous to a transposase in Synechocystis but lacks
terminal inverted repeats. IS elements lacking inverted
repeats have been described, and orfE might represent
the ®rst example of such a novel IS element in the neisseriae. We note that the spread of sel®sh DNA need not
depend on transposition in transformable bacteria; after
having entered the global gene pool, these elements can
spread by horizontal genetic exchange via transformation,
as suggested for a 250 bp element near opcA (Seiler et al.,
1996) that is lacking in strain Z2491.
Genetic diversity of intergenic regions
Much interest related to genetic diversity has focused on
pathogenicity islands (Hacker et al., 1997) and DNA acquisitions that are related to virulence. However, DNA mobility is not purely related to virulence, and the data
presented here add to the growing evidence that intergenic
regions can readily acquire additional genetic information
during bacterial evolution (Lawrence and Roth, 1996;
Lawrence and Ochman, 1997; Nelson et al., 1997; Bergthorsson and Ochman, 1998). Comparative sequencing of
larger regions of DNA from multiple isolates of a single
species is just beginning (Milkman and Bridges, 1993;
Alm et al., 1999), and it is currently unclear what degree
of sequence variation will be discovered when more
genes are investigated that are not under positive selection pressure.
Deletions or frameshifts have not been detected in any
of seven fragments from housekeeping genes tested by
MLST analysis (http:/mlst.zoo.ox.ac.uk) of over 200
strains of Nm, and an internal stop codon has only been
observed once (unpublished data). In contrast, the sequences near the opcA and CopcB loci revealed several
examples of insertion/deletion, genetic replacement and
transposition over a cumulative region of less than 20 kb.
Numerous pseudogenes were also detected during this
analysis, including an IS transposase, the CopcB gene,
CcomEA and CorfD, similarly to recent observations
with H. pylori (Alm et al., 1999). Together with other data
(S. R. Klee, C. R. Tinsley, B. Kusecek et al., submitted),
these results show that the neisseriae are characterized
by both conserved housekeeping genes and highly
dynamic intergenic regions.
Concluding remarks
The results presented here have revealed the existence of
a family of outer membrane proteins with 10 transmembrane strands and ®ve surface-exposed loops, the Opc
family. The pathogenic neisseriae contain both orthologous and paralogous opc loci. Until now, only pseudogene
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
opcA and opcB regions 647
variants of opcB have been discerned, but functional
copies may exist in other species and yet other variants
of the opc family remain to be discovered.
The opc-like loci in Nm and Ng are located in imported
DNA islands. Importing such islands can contribute greatly
to the diversity of these species and their abilities to evolve
under selective pressures. Other than for virulence genes,
only a few comparative sequence analyses of large
regions have been performed with natural isolates of a
single species. The well-characterized collections of epidemic Nm are ideally suited for comparative analyses
because numerous changes have occurred in recent decades and changes can be dated (Morelli et al., 1997).
The signi®cance of the opcA gene for infection by Ng is
unresolved. Experiments are needed to determine
whether this gene is expressed in the human body and
under in vivo conditions. The sequence data presented
here will provide the necessary background for such
experiments.
Experimental procedures
Bacterial strains
The opcA region was sequenced from strains Ng FA1090
(Connell et al., 1988), Ng MS11 (Swanson, 1973), Nm
Z2491 (serogroup A, subgroup IV-1) (Crowe et al., 1989; Sarkari et al., 1994) and Nm FAM18 (serogroup C, ET-37 complex) (Aho and Cannon, 1988). Additional strains of Nm
were chosen among those that have been tested for the possession of opc in Southern blots (Seiler et al., 1996) or that
have been tested by multilocus sequence typing (MLST)
(Maiden et al., 1998). Additional strains of Ng were fresh clinical isolates obtained from Drs Cathy Ison and Ian Feavers.
Strains from 12 Neisseria species used to test whether
opcA is present in commensal species were as follows
(ATCC no./species): 23970/N. lactamica ; 19696/N. mucosa ;
13120/N. ¯avescens ; 49275/N. sub¯ava ; 14221/N. ¯ava ;
10555/N. per¯ava ; 14686/N. denitri®cans ; 43768/N. polysaccharea ; 25295/N. elongata ; 14687/N. canis ; 33078/N.
ovis ; 29256/N. sicca. Two E. coli strains, WK6lcl‡ and
K12DHIDtrp (Bernard et al., 1979), were used for cloning
and expression of the Ng opcA gene.
DNA manipulations
Fragments (15 kb) were cloned in l DashII, and recombinant
phages containing opc inserts were detected after hybridization with an opc probe. Sequences from the ends of the
inserts were used to design primers for long-range PCR.
Long-range PCR products were then used to determine the
entire sequence of the inserts by a combination of direct sequencing and cloning by inverse PCR. Subsequently, primers
(available on request) were designed that enabled direct
sequencing from both strands of PCR products made from
chromosomal DNA.
The techniques for isolating chromosomal DNA (Sarkari et
al., 1994) and pulsed-®eld gel electrophoresis (PFGE; Morelli
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
et al., 1997) have been described. DNA probes were labelled
with DIG-11-dUTP during PCR ampli®cation (PCR DIG
labelling mix, Boehringer Mannheim) and used for Southern
hybridization and detection on Hybond N‡ nylon (Amersham) membranes, as described by the manufacturer.
Automated sequencing was performed using puri®ed products ampli®ed from the bacterial chromosome and an
Applied Biosystems model 377 (Perkin-Elmer) with dRhodamine-labelled dye terminators.
Mapping of opcA and CopcB
Chromosomal DNAs of Ng FA1090 and Nm Z2491 were
digested with rare-cutting restriction endonucleases
(FA1090: NheI, SpeI, Bgl II and Pac I; Z2491: NheI, SpeI,
SgfI, Pac I, Bgl II and PmeI), separated by PFGE and tested
for Southern hybridization with DIG-labelled opcA, opcB
and orfA probes prepared from the corresponding strains.
The map locations were determined by correlating the fragments that hybridized (FA1090, opcA : fragments N5, P1
and S2; opcB : B5, N8, P5 and S1; Z2491, opcA : S1 and
N9; opcB : N14, S13, G2, B3 and M9) with the published
maps of FA1090 (Dempsey et al., 1991) and Z2491 (Dempsey et al., 1995).
RNA analysis
Bacteria in the exponential phase of growth (6 h of growth on
supplemented GC agar plates, 378C, 95% humidity) were
lysed with guanidine, diluted with an ethanol solution, and
the total RNA was isolated after binding to glass®bre as
described by the manufacturer (RNAqueous, Ambion). Formaldehyde gel electrophoresis and transfer to Hybond N‡
nylon membranes was performed using standard techniques
(Sambrook et al., 1989), and Northern hybridization was performed using DIG-labelled PCR probes ampli®ed using oligonucleotides O570 (ACTGCTACTTCTCTGAGTAG) and O574
(TTAATGTCGGCTGAGACACC). For RT±PCR, RNA samples were incubated with RNase-free DNase I (Ambion) to
remove contaminating genomic DNA and heated at 758C for
5 min to inactivate the DNase I. For RT±PCR reactions,
1 mg of RNA plus 10 pmol of primer were denatured at 658C
for 10 min and chilled. Then 4 ml of 5 ´ reverse transcriptase
buffer, 2 ml of 100 mM DTT, 2 ml of 10 mM dNTPs, 50 units
of reverse transcriptase (Expand reverse transcriptase kit,
Boehringer Mannheim) and 20 units of RNase inhibitor
(Ambion) were added in a total volume of 20 ml, and the reaction mixture was incubated at 428C for 1 h. Reaction product
(5 ml) was then used as the template cDNA in a PCR reaction.
Primer extension analysis was carried out as described
(Sarkari et al., 1994) using oligonucleotides O956 (TGTCTGT
ACGGACGGTATATTC) and O1135 (GGCGTCTCCGATACGACGTA), except that [g-32P]-ATP was used.
Expression of Ng OpcA in E. coli and immunological
methods
Ng opcA was cloned downstream of the l pL promoter to
allow overexpression in E. coli. To this end, the coding region
of the opcA gene from Ng FA1090, from 6 bp before the ATG
648
P. Zhu, G. Morelli and M. Achtman
start codon to 125 bp after the TAA stop codon, was
ampli®ed using oligonucleotides O765 (CGGAATTCCTAGGAGAT AAAAATGAAAAAAGCAC, Avr II site underlined)
and O762 (CACGTGAAGCTTGCGGTATTTTAACCGATT,
Hin dIII site underlined). The PCR fragment was digested
with Avr II and Hin dIII, ligated with StyI/Hin dIII-digested plasmid pMA5-8pL (Stanssens et al., 1989) and transformed into
strain E. coli WK6lcl‡. The resulting plasmid, pBE651, contains the predicted insert fragment as con®rmed by sequence
analysis. pBE651 was transformed into E. coli K12DHIDtrp,
which contains a chromosomal copy of the temperaturesensitive cI857 repressor gene for the induction of OpcA
expression. Bacteria growing exponentially at 288C were
shifted to 378C to induce OpcA expression. After 2 h further
incubation, the cells were harvested and cell envelopes
were prepared as described (Merker et al., 1997). Expression
was lethal when it was induced by a temperature shift to 428C
or after incubation for more than 2 h at 378C, similarly to Nm
OpcA (Merker et al., 1997).
Five 17- to 20-mer peptides containing sequences from the
central portions of loops 1±5 of OpcA protein from Ng FA1090
(sequences underlined in Fig. 3), containing an additional Cterminal cysteine, were synthesized using Cys-Rink Resin
(0.15 mM Tentagel R RAM, 0.2 mM g 1, Rapp Polymere) by
FastMoc chemistry with an automated peptide synthesizer
(model ABI433A, Applied Biosystems), cleaved and puri®ed
by reversed-phase high-performance liquid chromatography
(HPLC) as described (Thiesen et al., 1997). The structures
were con®rmed by MALDI mass spectrometry (Re¯ex II;
Bruker-Franzen) as being within 0.6 atomic units of the predicted sizes. The peptides were coupled to thyroglobulin via
the cysteine residue using MBS (m-maleimidobenzoyl-Nhydroxysuccimide ester) as described (Van Regenmortel et
al., 1988). Balb/c mice were immunized in the caudal region
on days 0, 14 and 28 with 5 mg of peptide conjugate mixed
with 125 ml of RIBI adjuvant (Immunochem Research), and
sera were taken on day 32. Peptides and peptide conjugates
were coated on microtitre plates as described (Thiesen et al.,
1997), and enzyme-linked immunosorbent assays (ELISAs)
(Wang et al., 1992) were performed using serial doubling
dilutions of each serum. After diluting the immune sera
25 000-fold, they yielded an OD of $ 1.0 in ELISAs with
both the peptides and the peptide conjugates. SDS±PAGE
and immunoblots were performed as described (Achtman
et al., 1988). Immunoblot analysis with E. coli cell envelopes
containing recombinant OpcA only showed speci®c reactions with the murine antisera against loops 1 and 2
(Fig. 4); the other sera gave weaker reactions with OpcA
protein and additional cross-reactions with other proteins
(data not shown).
Acknowledgements
We gratefully acknowledge Thomas Schnibbe for theoretical and practical help in preparing the peptide conjugates
and the MS analyses; Martin Schenker for help with computer analyses; Kerstin Zurth for technical assistance with
immunizations; and Barica Kusecek for technical assistance with peptide synthesis. We are very grateful to
Cathy Ison and Ian Feavers for supplying fresh clinical isolates of Ng.
References
Achtman, M., Neibert, M., Crowe, B.A., Strittmatter, W.,
Kusecek, B., Weyse, E., et al. (1988) Puri®cation and characterization of eight class 5 outer membrane protein variants from a clone of Neisseria meningitidis serogroup A.
J Exp Med 168: 507±525.
Achtman, M., Wall, R.A., Bopp, M., Kusecek, B., Morelli, G.,
Saken, E., et al. (1991) Variation in class 5 protein expression by serogroup A meningococci during a meningitis epidemic. J Infect Dis 164: 375±382.
Aho, E.L., and Cannon, J.G. (1988) Characterization of a
silent pilin gene locus from Neisseria meningitidis strain
FAM18. Microb Pathog 5: 391±398.
Aho, E.L., Dempsey, J.A.F., Hobbs, M.M., Klapper, D.G., and
Cannon, J.G. (1991) Characterization of the opa (class 5)
gene family of Neisseria meningitidis. Mol Microbiol 5:
1429±1437.
Alm, R.A., Ling, L.-S.L., Moir, D.T., King, B.L., Brown, E.D.,
Doig, P.C., et al. (1999) Genomic-sequence comparison
of two unrelated isolates of the human gastric pathogen
Helicobacter pylori. Nature 397: 176±180.
Bergthorsson, U., and Ochman, H. (1998) Distribution of
chromosome length variation in natural isolates of Escherichia coli. Mol Biol Evol 15: 6±16.
Bernard, H.-U., Remaut, E., Hersh®eld, M.V., Das, H.K.,
Helinski, D.R., Yanofsky, C., et al. (1979) Construction of
plasmid cloning vehicles that promote gene expression
from the bacteriophage lambda pL promoter. Gene 5:
59±76.
Bihlmaier, A., RoÈmling, U., Meyer, T.F., TuÈmmler, B., and
Gibbs, C.P. (1991) Physical and genetic map of the Neisseria gonorrhoeae strain MS11-N198 chromosome. Mol
Microbiol 5: 2529±2539.
Boyd, E.F., and Hartl, D.L. (1997) Nonrandom location of IS1
elements in the genomes of natural isolates of Escherichia
coli. Mol Biol Evol 14: 725±732.
Camilli, A., and Mekalanos, J.J. (1995) Use of recombinase
gene fusions to identify Vibrio cholerae genes induced during infection. Mol Microbiol 18: 671±683.
Connell, T.D., Black, W.J., Kawula, T.H., Barritt, D.S.,
Dempsey, J.A., Kverneland, Jr, K., et al. (1988) Recombination among protein II genes of Neisseria gonorrhoeae
generates new coding sequences and increases structural
variability in the protein II family. Mol Microbiol 2: 227±
236.
Correia, F.F., Inouye, S., and Inouye, M. (1988) A family of
small repeated elements with some transposon-like properties in the genome of Neisseria gonorrhoeae. J Biol
Chem 263: 12194±12198.
Crowe, B.A., Wall, R.A., Kusecek, B., Neumann, B., Olyhoek, T., Abdillahi, H., et al. (1989) Clonal and variable
properties of Neisseria meningitidis isolated from cases
and carriers during and after an epidemic in the Gambia,
West Africa. J Infect Dis 159: 686±700.
De Vries, F.P., van der Ende, A., van Putten, J.P.M., and
Dankert, J. (1996) Invasion of primary nasopharyngeal
epithelial cells by Neisseria meningitidis is controlled by
phase variation of multiple surface antigens. Infect
Immun 64: 2998±3006.
De Vries, F.P., Cole, R., Frosch, M., and Putten, V.J.P.M.
(1998) Neisseria meningitidis producing the Opc adhesin
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
opcA and opcB regions 649
binds epithelial cell proteoglycan receptors. Mol Microbiol
6: 1203±1212.
Dempsey, J.A.F., Litaker, W., Madhure, A., Snodgrass, T.L.,
and Cannon, J.G. (1991) Physical map of the chromosome
of Neisseria gonorrhoeae FA1090 with locations of genetic
markers, including opa and pil genes. J Bacteriol 173:
5476±5486.
Dempsey, J.A.F., Wallace, A.B., and Cannon, J.G. (1995)
The physical map of the chromosome of a serogroup A
strain of Neisseria meningitidis shows complex rearrangements relative to the chromosomes of two mapped strains
of the closely related species N. gonorrhoeae. J Bacteriol
177: 6390±6400.
Feavers, I.M., and Maiden, M.C.J. (1998) A gonococcal porA
pseudogene: implications for understanding the evolution
and pathogenicity of Neisseria gonorrhoeae. Mol Microbiol
30: 647±656.
Feil, E., Zhou, J.J., Smith, J.M., and Spratt, B.G. (1996) A
comparison of the nucleotide sequences of the adk and
recA genes of pathogenic and commensal Neisseria species: evidence for extensive interspecies recombination
within adk. J Mol Evol 43: 631±640.
Goodman, S.D., and Scocca, J.J. (1988) Identi®cation and
arrangement of the DNA sequence recognized in speci®c
transformation of Neisseria gonorrhoeae. Proc Natl Acad
Sci USA 85: 6982±6986.
Hacker, J., Blum-Oehler, G., MuÈhldorfer, I., and TschaÈpe, H.
(1997) Pathogenicity islands of virulent bacteria: structure,
function and impact on microbial evolution. Mol Microbiol
23: 1089±1097.
Haneberg, B., Dalseg, R., Oftung, F., Wedege, E., Hùiby,
E.A., Haugen, I.L., et al. (1998) Towards a nasal vaccine
against meningococcal disease, and prospects for its use
as a mucosal adjuvant. Dev Biol Stand 92: 127±133.
Hensel, M., Shea, J.E., Gleeson, C., Jones, M.D., Dalton, E.,
and Holden, D.W. (1995) Simultaneous identi®cation of
bacterial virulence genes by negative selection. Science
269: 400±403.
Hilse, R., Hammerschmidt, S., Bautsch, W., and Frosch, M.
(1996) Site-speci®c insertion of IS1301 and distribution in
Neisseria meningitidis strains. J Bacteriol 178: 2527±
2532.
Jennings, M.P., Hood, D., Peak, I., Virji, M., Moxon, E.R.,
Hood, D.W., et al. (1995) Molecular analysis of a locus
for the biosynthesis and phase-variable expression of
the lacto-N-neotetraose terminal lipopolysaccharide
structure in Neisseria meningitidis. Mol Microbiol 18:
729±740.
Jonsson, A.-B., Nyberg, G., and Normark, S. (1991) Phase
variation of gonococcal pili by frameshift mutation in pilC,
a novel gene for pilus assembly. EMBO J 10: 477±488.
Kaneko, T., Sato, S., Kotani, H., Tanaka, A., Asamizu, E.,
Nakamura, Y., et al. (1996) Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp.
strain PCC6803. II. Sequence determination of the entire
genome and assignment of potential protein-coding
regions (supplement). DNA Res 3: 185±209.
Klugman, K.P., Gotschlich, E.C., and Blake, M.S. (1989)
Sequence of the structural gene (rmpM) for the class 4
outer membrane protein of Neisseria meningitidis, homology of the protein to gonococcal protein III and Escherichia
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650
coli OmpA, and construction of meningococcal strains that
lack class 4 protein. Infect Immun 57: 2066±2071.
Knight, A.I., Ni, H., Cartwright, K.A.V., and McFadden, J.J.
(1992) Identi®cation and characterization of a novel insertion sequence, IS1106, downstream of the porA gene in
B15 Neisseria meningitidis. Mol Microbiol 6: 1565±1573.
Kroll, J.S., Wilks, K.E., Farrant, J.L., and Langford, P.R.
(1998) Natural genetic exchange between Haemophilus
and Neisseria: intergeneric transfer of chromosomal
genes between major human pathogens. Proc Natl Acad
Sci USA 95: 12381±12385.
Lawrence, J.G., and Ochman, H. (1997) Amelioration of bacterial genomes: rates of change and exchange. J Mol Evol
44: 383±397.
Lawrence, J.G., and Roth, J.R. (1996) Sel®sh operons: horizontal transfer may drive the evolution of gene clusters.
Genetics 143: 1843±1860.
Legrain, M., Mazarin, V., Irwin, S.W., Bouchon, B., QuentinMillet, M.-J., Jacobs, E., et al. (1993) Cloning and characterization of Neisseria meningitidis genes encoding the
transferrin-binding proteins Tbp1 and Tbp2. Gene 130:
73±80.
Mahillon, J., and Chandler, M. (1998) Insertion sequences.
Microbiol Mol Biol Rev 62: 725±744.
Maiden, M.C.J., Bygraves, J.A., Feil, E., Morelli, G., Russell,
J.E., Urwin, R., et al. (1998) Multilocus sequence typing: a
portable approach to the identi®cation of clones within
populations of pathogenic microorganisms. Proc Natl
Acad Sci USA 95: 3140±3145.
Merker, P., Tommassen, J., Virji, M., Sesardic, D., and Achtman, M. (1997) Two-dimensional structure of the Opc invasin from Neisseria meningitidis. Mol Microbiol 23: 281±
293.
Milkman, R., and Bridges, M.M. (1993) Molecular evolution of
the Escherichia coli chromosome. IV. Sequence comparisons. Genetics 133: 455±468.
Moore, T.D.E., and Sparling, P.F. (1995) Isolation and identi®cation of a glutathione peroxidase homolog gene,
gpxA, present in Neisseria meningitidis but absent in Neisseria gonorrhoeae. Infect Immun 63: 1603±1607.
Morelli, G., Malorny, B., MuÈller, K., Seiler, A., Wang, J., del
Valle, J., et al. (1997) Clonal descent and microevolution
of Neisseria meningitidis during 30 years of epidemic
spread. Mol Microbiol 25: 1047±1064.
Murakami, K., Gotschlich, E.C., and Seiff, M.E. (1989) Cloning and characterization of the structural gene for the class
2 protein of Neisseria meningitidis. Infect Immun 57:
2318±2323.
Nelson, K., Wang, F.S., Boyd, E.F., and Selander, R.K.
(1997) Size and sequence polymorphism in the isocitrate
dehydrogenase kinase/phosphatase gene (aceK ) and
¯anking regions in Salmonella enterica and Escherichia
coli. Genetics 147: 1509±1520.
Olyhoek, A.J.M., Sarkari, J., Bopp, M., Morelli, G., and Achtman, M. (1991) Cloning and expression in Escherichia coli
of opc, the gene for an unusual class 5 outer membrane
protein from Neisseria meningitidis. Microb Pathog 11:
249±257.
Petering, H., Hammerschmidt, S., Frosch, M., van Putten,
J.P.M., Ison, C.A., and Robertson, B.D. (1996) Genes
associated with the meningococcal capsule complex are
650
P. Zhu, G. Morelli and M. Achtman
also found in Neisseria gonorrhoeae. J Bacteriol 178:
3342±3345.
Pettersson, J., Nordfelth, R., Dubinina, E., Bergman, T., Gustafsson, M., Magnusson, K.E., et al. (1996) Modulation of
virulence factor expression by pathogen target cell contact.
Science 273: 1231±1233.
Rosenqvist, E., Hùiby, E.A., Wedege, E., Kusecek, B., and
Achtman, M. (1993) The 5C protein of Neisseria meningitidis is highly immunogenic in humans and induces bactericidal antibodies. J Infect Dis 167: 1065±1073.
Rosenqvist, E., Hùiby, E.A., Wedege, E., Bryn, K., Kolberg,
J., Klem, A.R., et al. (1995) Human antibody responses
to meningococcal outer membrane antigens after three
doses of the Norwegian group B meningococcal vaccine.
Infect Immun 63: 4642±4652.
Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989) Molecular Cloning: a Laboratory Manual, 2nd edn. Cold Spring
Harbor, NY: Cold Spring Harbor Laboratory Press.
Sampson, B.A., and Gotschlich, E.C. (1992) Neisseria
meningitidis encodes an FK506-inhibitable rotamase.
Proc Natl Acad Sci USA 89: 1164±1168.
Sarkari, J., Pandit, N., Moxon, E.R., and Achtman, M. (1994)
Variable expression of the Opc outer membrane protein in
Neisseria meningitidis is caused by size variation of a promoter containing poly-cytidine. Mol Microbiol 13: 207±217.
Seiler, A., Reinhardt, R., Sarkari, J., Caugant, D.A., and
Achtman, M. (1996) Allelic polymorphism and site-speci®c
recombination in the opc locus of Neisseria meningitidis.
Mol Microbiol 19: 841±856.
Stanssens, P., Opsomer, C., McKeown, Y.M., Zabeau, M.,
and Fritz, H.-J. (1989) Ef®cient oligonucleotide-directed
construction of mutations in expression vectors by the
gapped duplex DNA method using alternating selectable
markers. Nucleic Acids Res 17: 4441±4454.
Stein, D.C., Gunn, J.S., Radlinska, M., and Piekarowicz, A.
(1995) Restriction and modi®cation systems of Neisseria
gonorrhoeae. Gene 157: 19±22.
StruyveÂ, M., Moons, M., and Tommassen, J. (1991) Carboxyterminal phenylalanine is essential for the correct assembly of a bacterial outer membrane protein. J Mol Biol
218: 141±148.
Swanson, J. (1973) Studies on gonococcus infection IV. Pili:
their role in attachment of gonococci to tissue culture cells.
J Exp Med 137: 571±589.
Taha, M.-K., Morand, P.C., Pereira, Y., EugeÁne, E., Giorgini,
D., Larribe, M., et al. (1998) Pilus-mediated adhesion of
Neisseria meningitidis: the essential role of cell contactdependent transcriptional upregulation of the PilC1 protein.
Mol Microbiol 28: 1153±1163.
Thiesen, B., Greenwood, B., Brieske, N., and Achtman, M.
(1997) Persistence of antibodies to meningococcal IgA1
protease versus decay of antibodies to group A polysaccharide and Opc protein. Vaccine 15: 209±219.
Thompson, S.A., and Sparling, P.F. (1993) The RTX
cytotoxin-related FrpA protein of Neisseria meningitidis is
secreted extracellularly by meningococci and by HlyBD‡
Escherichia coli. Infect Immun 61: 2906±2911.
Thompson, S.A., Wang, L.L., and Sparling, P.F. (1993) Cloning and nucleotide sequence of frpC, a second gene from
Neisseria meningitidis encoding a protein similar to RTX
cytotoxins. Mol Microbiol 9: 85±96.
Tinsley, C.R., and Nassif, X. (1996) Analysis of the genetic
differences between Neisseria meningitidis and Neisseria
gonorrhoeae : two closely related bacteria expressing two
different pathogenicities. Proc Natl Acad Sci USA 93:
11109±11114.
Trieu-Cuot, P., and Courvalin, P. (1984) Nucleotide sequence of the transposable element IS15. Gene 30: 113±
120.
Valdivia, R.H., and Falkow, S. (1996) Bacterial genetics by
¯ow cytometry: rapid isolation of Salmonella typhimurium
acid-inducible promoters by differential ¯uorescence induction. Mol Microbiol 22: 367±378.
Van Regenmortel, M.H.V., Briand, J.P., Muller, S., and
PlaueÁ, S. (1988) Synthetic Polypeptides as Antigens.
Amsterdam: Elsevier Science.
Virji, M., Makepeace, K., Ferguson, D.J.P., Achtman, M.,
Sarkari, J., and Moxon, E.R. (1992) Expression of the
Opc protein correlates with invasion of epithelial and
endothelial cells by Neisseria meningitidis. Mol Microbiol
6: 2785±2795.
Virji, M., Makepeace, K., Ferguson, D.J.P., Achtman, M., and
Moxon, E.R. (1993) Meningococcal Opa and Opc proteins:
their role in colonization and invasion of human epithelial
and endothelial cells. Mol Microbiol 10: 499±510.
Virji, M., Makepeace, K., and Moxon, E.R. (1994) Distinct
mechanisms of interactions of Opc-expressing meningococci at apical and basolateral surfaces of human endothelial cells; The role of integrins in apical interactions. Mol
Microbiol 14: 173±184.
Wang, J.-F., Caugant, D.A., Li, X., Hu, X., Poolman, J.T.,
Crowe, B.A., et al. (1992) Clonal and antigenic analysis
of serogroup A Neisseria meningitidis with particular reference to epidemiological features of epidemic meningitis in
China. Infect Immun 60: 5267±5282.
Yang, Q.-L., and Gotschlich, E.C. (1996) Variation of gonococcal lipooligosaccharide structure is due to alterations
in poly-G tracts in lgt genes encoding glycosyl transferases. J Exp Med 183: 323±327.
Zhou, J., and Spratt, B.G. (1992) Sequence diversity within
the argF, fbp and recA genes of natural isolates of Neisseria meningitidis: interspecies recombination within the
argF gene. Mol Microbiol 6: 2135±2146.
Zhou, J.J., Bowler, L.D., and Spratt, B.G. (1997) Interspecies
recombination, and phylogenetic distortions, within the glutamine synthetase and shikimate dehydrogenase genes of
Neisseria meningitidis and commensal Neisseria species.
Mol Microbiol 23: 799±812.
Q 1999 Blackwell Science Ltd, Molecular Microbiology, 33, 635±650