Volume 13 Number 15 1985
Nucleic Acids Research
The nucleotide sequence of the tnpA gene completes the sequence of the Pseudomonas transposon Tn501
Nigel L.Brown, Joseph N.Winnie, David Fritzinger* and R.David Pridmore+
Department of Biochemistry, University of Bristol, University Walk, Bristol BS8 1TD, UK
Received 21 June 1985; Accepted 22 July 1985
ABSTRACT
The nucleotide sequence of the gene (tnpA) which codes for the
transposase of transposon Tn501 has been determined. It contains an open
reading frame for a polypeptide of Mr = 111,500, which terminates within
the inverted repeat sequence of the transposon. The reading frame would be
transcribed in the same direction as the mercury-resistance genes and the
tnpR gene. The amino acid sequence predicted from this reading frame shows
32% identity with that of the transposase of the related transposon Tn3.
The C-terminal regions of these two polypeptides show slightly greater
homology than the N-terminal regions when conservative amino acid
substitutions are considered. With this sequence determination, the
nucleotide sequence of Tn501 is fully defined. The main features of the
sequence are briefly presented.
INTRODUCTION
Tn501 (1), from Pseudomonas aeruginosa, is a member of the "Tn3
family" of transposons (2), in that it has inverted terminal repeats of 38
base pairs, which are partially homologous to those of Tn3; and it is
flanked by five base-pair direct repeats generated from the recipient
replicon during the transposition process (3). Several transposons,
including Tn21, Tn1721 and Tn2603, are known to have transposition
functions sufficiently related to those of Tn501 that complementation of
mutants in the transposition genes can occur (4,5), and models for the
evolutionary relationship between these transposons have been proposed (6-9).
Several other transposons have been identified which appear to be closely
related to Tn21 (9).
The major mechanism of transposition of the Tn3-related transposons
involves two steps, each requiring a transposon-encoded gene product (10,11).
The first step requires the product of the tnpA gene together with
host-coded replication functions, and involves the formation of a cointegrate
molecule between the donor replicon containing the transposon, and the
recipient replicon. The cointegrate molecule contains two copies of the
transposon in direct repeat. The second event is mediated by the product of
the tnpR gene, and is an intramolecular recombination between the two copies
of the transposon in the cointegrate. These two steps require cis-acting
components in addition to the gene products; the formation of the cointegrate
is absolutely dependent on the presence of the terminal inverted repeats of
the transposon, and the resolution of the cointegrate by resolvase is by
recombination between the internal resolution (res) sites in each copy of the
transposon. Current models of replicative transposition (see 2) suggest that
the transposase catalyses three steps in the formation of the cointegrate:
site-specific cleavage in one strand at each terminus of the transposon, a
less specific double-stranded staggered cleavage in the recipient replicon,
and the ligation of the transposon termini with the recipient replicon.
The determination of the nucleotide sequence of the tnpA gene and the
concomitant prediction of the amino acid sequence of the transposase is an
essential part of elucidating the molecular mechanism of transposition.
In this paper we present the DNA sequence of nucleotides 5301 - 8355 of
Tn501, which includes the coding sequence for the tnpA gene. We compare the
predicted amino acid sequence of the gene product with that of the Tn3
transposase as a preliminary step in locating those regions of the
transposase involved in recognition of DNA and in catalysis. This study
completes the DNA sequence of Tn501, and the main features of the sequence
of the transposon are briefly presented. Tn501 is finding use as a mutagen
to locate and define genes in cyanobacteria (12), and it is one of the
simplest members of a widely-dispersed class of mercury-resistance
transposons (7,9) which are associated with multiple resistance to
antimicrobial agents. The information summarised here will be of use to those
working on the molecular biology of cyanobacteria, and to those interested in
the evolution of transposons and of antimicrobial resistance determinants.
© IRL Press Limited, Oxford, England.
METHODS
DNA sequence analysis was carried out by the random cloning of fragments
of pJOE114 (a SalGI-EcoRI deletion of pBR322 carrying Tn501; 13,14) in
M13mp7, and related bacteriophage (15,16), followed by chain-termination
sequence analysis (17). The nucleotide sequence of the whole transposon was
determined on both strands. When persistent problems were encountered in the
sequence analysis due to secondary structure these were resolved by using
formamide gels or deoxyinosine substitution as described elsewhere (18).
Programs for the storage and analysis of DNA sequence data were those
described by Staden (19-21) running on a PDP 11/45 or VAX 11/750 computer.
The program used for the prediction of protein secondary structure was that
described by McLachlan (22).
RESULTS
The DNA sequence of nucleotides 5301 to 8355 of transposon Tn501,
containing the tnpA gene and its flanking sequences, is shown in Fig. 1.
Nucleotides 5301-5398 have been presented in a previous publication
describing the tnpR gene of Tn501 (23), and the inverted repeat sequence at
nucleotides 8318-8355 has also been described earlier (3). The nucleotide
sequence has a G+C content of 66%, and contains one major open reading frame
which would code for a polypeptide of 988 amino acids, Mr = 111,500.
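The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the interval 5356-8319 is the tnpA entry in Table 1, the function names are hypothetical, and whether that interval counts the termination codon is not stated in the text, so the codon count is compared with the quoted 988 without resolving that point.

```python
def codon_count(first_base, last_base):
    """Number of whole codons in an inclusive nucleotide interval."""
    length = last_base - first_base + 1
    if length % 3 != 0:
        raise ValueError("interval is not a whole number of codons")
    return length // 3


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


# tnpA coordinates as listed in Table 1 of the paper.
codons = codon_count(5356, 8319)

# Mean mass per residue implied by the quoted Mr and residue count.
mean_residue_mass = 111_500 / 988

print(codons)
print(round(mean_residue_mass, 1))
print(gc_fraction("GGCATGCC"))  # toy string, not Tn501 sequence
```

The same gc_fraction helper could be applied to the deposited sequence to check the quoted 66% G+C content.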
We have presented data (23) showing that the termination codon for the
tnpR gene of Tn501 was that at positions 5350-5352 (Fig. 1). There are only
three nucleotides between the third base of this codon and the first base of
the suggested initiation codon of the tnpA gene. We have no formal evidence
that we have correctly identified the initiation codon of the tnpA gene, but
the predicted amino acid sequence of the gene product aligns well with the
amino acid sequence determined for the transposase from transposon Tn3
(24,25), as shown in Fig. 2. The prospective Shine-Dalgarno sequence is the
GGA at positions 5344-5346, which is within the coding sequence of the tnpR
gene. The nearest alternative initiation codon is a GTG at position 5512,
which has no apparent Shine-Dalgarno sequence. The codon usage of the
proposed tnpA gene is similar to that described for other genes in Tn501
(14,23,26,27), there being a preference for codons containing G or C at the
third position.
There are several other open reading frames in the sequence presented
here, the largest being on the complementary strand, starting at nucleotide
5922 and extending beyond the sequence in Fig. 1 to nucleotide 4660. These
reading frames do not have the same predicted codon usage as other Tn501
genes, and are probably not functional. The relatively high GC content of
the DNA gives a lower statistical probability of there being termination
codon sequences present, thus giving rise to longer open reading frames.
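The statistical argument about GC content and termination codons can be made concrete with a simple model. The sketch below assumes independent bases with P(G) = P(C) and P(A) = P(T), which is a simplification introduced here and not a model taken from the paper; the function names are hypothetical.

```python
def stop_codon_probability(gc):
    """Chance that a random codon is TAA, TAG or TGA.

    Assumes independent bases with P(G) = P(C) = gc / 2 and
    P(A) = P(T) = (1 - gc) / 2 (a simplifying assumption).
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must lie between 0 and 1")
    g = gc / 2.0
    a = t = (1.0 - gc) / 2.0
    return t * a * a + t * a * g + t * g * a


def expected_codons_to_stop(gc):
    """Mean number of codons read up to and including the first stop."""
    p = stop_codon_probability(gc)
    return float("inf") if p == 0.0 else 1.0 / p


for gc in (0.50, 0.66):
    print(gc, stop_codon_probability(gc), expected_codons_to_stop(gc))
```

Under this model a random reading frame in 66% GC DNA runs roughly twice as long before meeting a stop codon as one in 50% GC DNA, which is the direction of the effect described above.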
There is a number of short inverted and directly repeated sequences
within the DNA sequence presented here. This may be a mere statistical
consequence of the high GC content of the DNA, although the inverted
repeats may affect the rate of synthesis or the stability of mRNA in this
region. However, there is one short-range inverted repeat that occurs at a
noteworthy position, and may have a specific biological function. This lies
between nucleotides 8310 and 8338, and consists of a 7bp inverted repeat
separated by 15bp containing the termination codon of the tnpA gene and the
inside end of the terminal inverted repeat. The corresponding region of
transposon Tn3 (24) also has the potential to form a stem-loop structure in
approximately the same place, and this also contains the tnpA gene
terminator and the inside end of the terminal repeat. In Tn3 the stem would
be six nucleotides, and the loop would be 14 nucleotides. These sequences may
form a stem-loop at the end of the tnpA transcript and may be involved in
transcription termination (28). Alternatively, or additionally, these
stem-loop structures may be capable of attenuation of transcripts initiating
in the vector replicon which would otherwise transcribe the non-coding
strand of the tnpA gene. There is no such stem-loop structure which could
attenuate transcription at the other end of the transposon in either Tn501
or Tn3.
Figure 1. DNA sequence of the tnpA gene of Tn501 showing the predicted
primary sequence of the gene product. The predicted amino acid sequences of
both the C-terminus of the tnpR gene product (nucleotides 5301-5349) and
the complete tnpA gene product (nucleotides 5356-8319) are shown. Amino
acid sequences are in the single letter code, with asterisks marking the
termination codons. The postulated Shine-Dalgarno sequence (5343-5345) and
the potential stem-loop structure discussed in the text (8310-8338) are
underlined, and the 38bp terminal inverted repeat sequence is also marked.
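A search for short inverted repeats of the kind described above (a stem of fixed length flanking a loop) can be sketched as follows. This is a minimal illustration that requires perfect complementarity in the stem; it is not the method used by the authors, the function names are hypothetical, and the test string is a toy sequence rather than Tn501 DNA.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_stem_loops(seq, stem_len, min_loop, max_loop):
    """Return (start, loop_len) for perfect inverted repeats.

    start is the 0-based index of the left arm; the right arm is the
    reverse complement of the left arm and begins loop_len bases after
    the left arm ends.
    """
    hits = []
    n = len(seq)
    for i in range(n - 2 * stem_len - min_loop + 1):
        target = reverse_complement(seq[i:i + stem_len])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop
            if j + stem_len > n:
                break
            if seq[j:j + stem_len] == target:
                hits.append((i, loop))
    return hits


# Toy example: a 7-base stem around a 5-base loop.
toy = "CC" + "GGGAAAC" + "TTTTT" + "GTTTCCC" + "CC"
print(find_stem_loops(toy, stem_len=7, min_loop=3, max_loop=20))

# The element described in the text spans 7 + 15 + 7 bases,
# which matches the quoted interval 8310-8338.
print(7 + 15 + 7, 8338 - 8310 + 1)
```

Calling find_stem_loops with stem_len=7 and a loop range that includes 15 on the deposited sequence would be one way to look for the element near position 8310, assuming the repeat is perfect.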
There is 32% amino acid identity between the transposases of Tn501
and Tn3 when the sequences are aligned by parsimony procedures. There is a
number of conservative amino acid substitutions between the sequences, but
these have not been scored in the alignment. These conservative
substitutions are scored, however, in the DIAGON plot (Fig. 2b), which shows
that the homology between the C-terminal regions is slightly greater than
that between the N-terminal regions. Certain oligopeptide sequences are
strongly conserved between the transposases of Tn501 and Tn3 (Fig. 2a).
The significance of these is not yet clear. The more hydrophobic regions of
the two polypeptides are similarly distributed along their respective primary
structures, which is expected if the polypeptides adopt similar tertiary
structures.
Figure 2. Alignment of the predicted gene products of the tnpA
genes of Tn501 and Tn3 (24,25) by (a) parsimonious alignment of identical
amino acid residues; and (b) dot matrix analysis (DIAGON; 20) in which
conservative amino acid substitutions are scored. In (a) the upper sequence
is that from Tn501. Amino acid identities are marked with a colon, and
the hyphens are padding characters to help alignment. In (b) a window of 15
residues was used for the DIAGON alignment, and a match was scored if the
homology was greater than that expected to occur at a frequency of
10~^ for proteins of the same amino acid compositions.
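The two comparisons in Fig. 2 can be illustrated with a small sketch: percent identity over an alignment padded with hyphens, and a windowed dot-matrix scan. Both functions are simplified stand-ins written for illustration. In particular, the windowed scan counts identical residues only, whereas DIAGON scores conservative substitutions and applies a probability threshold, and the choice of denominator for percent identity (columns where both sequences have a residue) is an assumption, since the paper does not state its convention.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns with no padding character."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


def window_matches(seq_a, seq_b, window, min_identical):
    """Dot-matrix hits: (i, j) where the two windows share enough residues."""
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            same = sum(1 for k in range(window) if seq_a[i + k] == seq_b[j + k])
            if same >= min_identical:
                hits.append((i, j))
    return hits


# Toy data, not transposase sequence.
print(percent_identity("ACDE-F", "ACDQGF"))
print(window_matches("AKCDEFG", "WWCDEWW", window=3, min_identical=3))
```

With window=15, as in the figure legend, and a suitable threshold, the hits from window_matches plotted as points would give a rough dot matrix; diagonal runs indicate regions of similarity.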
The data presented in this paper, together with the DNA sequences of
the mercury resistance genes (14,26,27) and the res-tnpR region (23),
complete the DNA sequence of transposon Tn501. The full sequence has been
lodged with the EMBL Nucleotide Sequence Library via the Cambridge
Nucleotide Sequence Data Library Service (29). The physico-genetic map of
Tn501 (i.e. the gene map derived from the DNA sequence) is shown in Fig. 3,
and the positions of gene boundaries and other features are given in
Table 1. Data which are derived from the full sequence are only presented in
the text if they are concerned with the transposition functions. As Tn501 is
used for transposon mutagenesis (12), we give the location of some of the
restriction endonuclease cleavage sites which occur less than five times in
Tn501 (Table 2).
DISCUSSION
Tn501 transposase
The determination of the DNA sequence of the Tn501 tnpA gene
presented here has allowed the prediction of the primary structure of a
second transposase of the Tn3 family, the other being that of Tn3 itself
(24,25). In the absence of further biological data it is difficult to assess
the significance of the homologies and differences between the two
transposases. The transposases from Tn501 and Tn3 have completely analogous
functions in transposition of their parental transposons, but they will not
complement one another, as determined by using TnpA- mutants (4). Thus, each
transposase must contain sites required for the specific recognition of the
ends of the transposon, which differ in detail between the Tn501 and Tn3
enzymes; and catalytic sites for the cleavage-ligation reaction, which may
be very similar in both.
We have tried to identify regions of the transposase that are involved
in DNA recognition by looking for primary sequences which could give rise to
the helix-turn-helix tertiary structure common to several DNA-binding
proteins (30). This structural motif contains an 'invariant' glycine and
hydrophobic residues which occur four amino acids before and six amino acids
after the glycine. The method of predicting secondary structure that was
used (22) coupled with examination of the sequence for homology to these
conserved residues failed to identify a strong candidate for such a DNA
binding domain in both transposases. All methods of predicting secondary
structure have weaknesses, and a helix-turn-helix structure may be present,
but not have been detected. The best candidate for such a sequence is that
around Gly-854 of the Tn501 transposase. An alternative explanation is
that transposase, which specifically recognises the long-range inverted
repeats, may not contain the same type of DNA binding motif found in proteins
that recognise short-range symmetrical sequences. Recent studies (31) have
shown that the binding of Tn3 transposase to the inverted repeats of the
transposon is ATP-dependent. This further argues against the DNA binding site
being homologous to those of the ATP-independent DNA binding proteins. There
is no sequence with obvious homology to known ATP-binding sites (30).

Figure 3. Physico-genetic map of Tn501 showing the relative locations of the
known genes and major open reading frames. The terminal inverted repeats and
the major promoter (39) are marked. The gene designated merP is that
described as merC in (26), and has been renamed because the merC gene
originally identified in plasmid R100 by genetic criteria (40) corresponds to
a reading frame that is not present in Tn501 (41). The reading frames urf-1
and urf-2 have not been ascribed a function. The exact positions of gene
boundaries and other features are given in Table 1, as are references to the
sequence data. The transposon is 8355 nucleotide pairs in length.

Table 1. Feature table for the Tn501 sequence based on the EMBL
Nucleotide Sequence Library Format. CDS - coding sequence; INVREP -
inverted repeat; MSG - messenger RNA. (C) refers to a coding sequence
or mRNA on the complementary strand.

Key     From  To        Description          Reference
INVREP  1     38        terminal repeat      (3)
CDS     548   117 (C)   merR gene            (39)
MSG     576   ? (C)     merR mRNA            (39)
MSG     591   ?         mer mRNA             (39)
CDS     620   967       merT gene            (26)
CDS     983   1255      merP gene            (26)
CDS     1330  3012      merA gene            (14)
CDS     3033  3395      merD gene            (27)
CDS     3395  3628      urf-1 (merE gene?)   (27)
CDS     3628  4668      urf-2                (27)
SITE    4603  4729      res                  (34,35)
CDS     4792  5349      tnpR gene            (23)
CDS     5356  8319      tnpA gene            (This paper)
INVREP  8318  8355      terminal repeat      (3)

Table 2. Restriction endonuclease cleavage sites occurring
less than five times in Tn501 DNA.

Enzymes: AatII, AccI, AflII, AflIII, AsuII, AvaI, AvaII, BclI, BstXI,
DraII, DraIII, EcoRI, EspI, GdiI, HgiEII, HgiHIII, HindIII, NaeI, NarI,
NdeI, NheI, NotI, NruI, PvuII, SalGI, SphI, StuI, XhoII.

First base of recognition site (values in the order extracted; the
assignment of positions to enzymes is not recoverable from this transcript):
1288 1885 4763 158 603 695 5884 7473 4543 3404 2220 7216 3949 4980
2114 4231 5066 5125 6860 13 33 2353 4953 8338 2064 6692 284 6705
850 2064 2368 2220 2949 4980 136 68 1525 6525 7924 7743 5655 6055
6916 7347 7980 1301 2341 1524 3273 4352 4852 1885 2498 6705 1319 6237

The enzymes from (42) which do not cleave Tn501 are:
AhaIII, ApaI, AvaIII, AvrII, BamHI, BglII, BstEII, ClaI, EcoRV, HpaI,
KpnI, MluI, NcoI, PvuI, PstI, Erul, Rspl, RsrII, SacI, SacII, SauI,
ScaI, SfiI, SmaI, SnaI, SpeI, SspI, Tth111I, XbaI, XhoI and XmnI.
(Isoschizomers are not included in this table.)
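A table of rarely cutting enzymes such as Table 2 can be produced by scanning the sequence for each recognition site and recording the 1-based position of the first base. The sketch below is an assumption-laden illustration: it handles only unambiguous palindromic sites (so one strand is enough), uses three well-known recognition sequences, and runs on a toy string, not on Tn501. The function and variable names are hypothetical.

```python
# Recognition sequences for three common enzymes (all palindromic).
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def site_positions(seq, site):
    """1-based positions of the first base of every occurrence of site."""
    seq = seq.upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(site, start + 1)  # allow overlapping matches
    return positions


def rare_cutters(seq, sites, max_sites=4):
    """Enzymes that cut at least once and at most max_sites times."""
    result = {}
    for name, site in sites.items():
        positions = site_positions(seq, site)
        if 0 < len(positions) <= max_sites:
            result[name] = positions
    return result


toy = "CCGAATTCAAGGGAATTCTTAAGCTTGG"
print(rare_cutters(toy, SITES))
```

Here max_sites=4 mirrors the "less than five times" criterion of Table 2; enzymes with degenerate recognition sequences (for example AccI) would need pattern matching rather than a plain string search.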
Complementation studies (4,5) have shown that, although the cleavage
of the transposon terminus is precise, some transposases can recognise the
terminal inverted repeats of closely-related transposons. For example,
Tn501 and Tn1721 TnpA- mutants can be complemented by a functional tnpA
gene from Tn21, whereas a Tn21 TnpA- mutant cannot be complemented by Tn501
or Tn1721. None of these can complement or be complemented by Tn3. The
specificity of the Tn501 transposase for the inverted repeats of Tn501 must
be such that it fails to utilise during transposition a sequence within Tn501
that is identical to a Tn21 inverted repeat (7). If this were used at any
great frequency, then, after only a few transposition events, nucleotides
1-80 of Tn501, containing the left terminal inverted repeat, would be lost
(7).
The hierarchy of complementation between Tn501, Tn1721, Tn21 and
related transposons, makes the Tn3-related transposases powerful systems for
the study of DNA-protein interaction. The availability of the primary
sequences of two transposases of this family, and the gene sequences from
which they are derived, will help in the design of experiments to identify
those regions of the Tn3-related transposases required for specificity and
for catalysis.
The DNA sequence of Tn501 and expression of the transposition genes
The DNA sequence of Tn501 is now completely defined, and examination
of the full sequence has revealed several features relevant to the detailed
biology of Tn501. Some of these features have been discussed previously
(3,7,14,23,26,27). This discussion is limited to those features which may
affect the expression of the transposition functions. Some of the sequences
discussed below (between positions 4231 and 5398) were presented by Diver et
al (23).
In Tn3 and some other Tn3-like elements, such as gamma-delta, the
transposition genes, tnpR and tnpA, are divergently transcribed, and the
transcription of both genes is repressed by the tnpR gene product binding at
the res site lying between the genes (33). In Tn501 and the closely-related
transposons Tn1721 and Tn21, the order is res-tnpR-tnpA. These genes are
transcribed in the same order from the same strand as, and distal to, the
inducible mercury resistance genes. In Tn1721 there are three sequences in
the res site that will bind resolvase. These have been identified by
footprinting the protein-DNA complexes (34) and they are highly conserved in
Tn501 (positions 4603-4730; given as 373-502 in ref. 23). The res sites of
Tn501 and Tn1721 can participate in reciprocal recombination in a hybrid,
catalysed by the tnpR gene product (35). Tn1721 contains a promoter
immediately in front of the tnpR gene (34). This promoter is in a
position to be regulated by the tnpR gene product binding at the res site,
and in Tn1721 there is a reduction of 30% in expression of a gene expressed
from this promoter when tnpR is supplied in trans (36). The proposed -35 and
-10 sequences of this promoter are conserved in Tn501 (at positions
4686-4691 and 4708-4713, respectively; 458-463 and 480-485 in ref. 23). We
therefore assume that binding of resolvase to DNA may regulate the
transcription of the tnpR gene in Tn501.
There is no sequence readily identifiable as being homologous to the
consensus sequences of the -35 and -10 regions of E. coli promoters (37)
between the tnpR promoter and the presumptive start of the tnpA gene. In
Tn1721 the tnpA gene promoter is very weak, and could only be detected in
the transposase-mediated transposition reaction (36). The high degree of
homology between Tn1721 and Tn501 in this region implies that the tnpA
promoter of Tn501 would also be very weak. However, Kitts et al (38) showed
that transposition of Tn501 in the absence of mercuric salts occurred at a
low frequency and that the products were cointegrates, whereas in the
presence of mercuric salts transposition occurred at a higher frequency and
the products were resolved. This was taken as evidence that the tnpA gene
has its own promoter, but that the tnpR gene does not, and is induced by
mercury, presumably by read-through transcription from the mer structural
genes. It is difficult to reconcile the data of Kitts et al with the presence
of a sequence in Tn501 which has identical -35 and -10 sequences and shows
15/17 identity in the spacer region to the known promoter of the tnpR gene
in Tn1721. We have found no features in Tn501 which are not present in
Tn1721 which may explain this, and we are investigating the transcription of
the transposition genes in more detail.
ACKNOWLEDGEMENTS
We thank Dr H. Muirhead for making protein structure prediction
programs available to us, Dr H.C. Watson for providing computer facilities,
Drs M.J. Bishop and G.G. Kneale for their help in using the Cambridge
Nucleotide Sequence Data Library Service, K. Weston for instructing JNW in
the ever-evolving DNA sequencing methods, and Drs P.M. Bennett, S.E. Halford
and P.A. Lund for helpful discussion. This work was supported by grants from
the MRC to NLB, who is a Royal Society EPA Cephalosporin Fund Senior Research
Fellow.
*Present address: Public Health Research Institute of the City of New
York Inc., 455 First Avenue, New York, N.Y. 10016, USA.
+Present address: Biozentrum der Universität Basel, Klingelbergstrasse
70, CH-4056 Basel, Switzerland.
REFERENCES
1. Stanisich, V.A., Bennett, P.M. and Richmond, M.H. (1977) J. Bacteriol.
129, 1227-1233.
2. Kleckner, N. (1981) Ann. Rev. Genet. 15, 341-404.
3. Brown, N.L., Choi, C.-L., Grinsted, J., Richmond, M.H. and Whitehead,
P.R. (1980) Nucleic Acids Res. 8, 1933-1945.
4. Grinsted, J., de la Cruz, F., Altenbuchner, J. and Schmitt, R. (1982)
Plasmid 8, 276-286.
5. Tanaka, M., Yamamoto, T., and Sawai, T. (1983) Molec. Gen. Genet. 191,
442-450.
6. Altenbuchner, J., Choi, C.-L., Grinsted, J., Schmitt, R. and Richmond,
M.H. (1981) Genet. Res. Camb. 37, 285-189.
7. Grinsted, J. and Brown, N.L. (1984) Molec. Gen. Genet. 197, 497-502.
8. Schmitt, F. and Klopfer-Kaul, I. (1984) Molec. Gen. Genet. 197, 109-119.
9. Tanaka, M., Yamamoto, T., and Sawai, T. (1983) J. Bacteriol. 153,
1432-1438.
10. Kitts, P.A., Lamond, A. and Sherratt, D.J. (1982) Nature 295, 626-628.
11. Arthur, A. and Sherratt, D.J. (1979) Molec. Gen. Genet. 175, 267-274.
12. Bullerjahn, J. reported in Haselkorn, R. (1985) Plant Molec. Biol.
Reporter 3, 24-32.
13. Schöffl, F., Arnold, W., Pühler, A., Altenbuchner J. and Schmitt, R.
(1981) Molec. Gen. Genet. 181, 87-94.
14. Brown, N.L., Ford, S.J., Pridmore, R.D. and Fritzinger, D.C. (1983)
Biochemistry 22, 4089-4095.
15. Messing, J., Crea, R. and Seeburg, P.H. (1981) Nucleic Acids Res. 9,
309-321.
16. Vieira, J. and Messing, J. (1982) Gene 19, 269-276.
17. Sanger, F., Coulson, A.R., Barrell, B.G., Smith, A.J.H. and Roe, B.A.
(1980) J. Molec. Biol. 143, 161-178.
18. Brown, N.L. (1984) Methods in Microbiology 17, 259-313.
19. Staden, R. (1980) Nucleic Acids Res. 8, 3673-3694.
20. Staden, R. (1982) Nucleic Acids Res. 10, 2951-2961.
21. Staden, R. (1984) Nucleic Acids Res. 12, 521-538.
22. McLachlan, A.D. (1977) Int. J. Quantum Chem. 12, Suppl. 1, 371-385.
23. Diver, W.P., Grinsted, J., Fritzinger, D.C., Brown, N.L., Altenbuchner,
J., Rogowsky, P. and Schmitt, R. (1983) Molec. Gen. Genet. 191, 189-193.
24. Heffron, F., McCarthy, B.J., Ohtsubo, H. and Ohtsubo, E. (1979) Cell 18,
1153-1163.
25. Fennewald, M.A., Gerrard, S.P., Chou, J., Casadaban, M.J. and Cozzarelli,
N.R. (1981) J. Biol. Chem. 256, 4687-4690.
26. Misra, T.K., Brown, N.L., Fritzinger, D.C., Pridmore, R.D., Barnes, W.M.,
Haberstroh, L. and Silver, S. (1984) Proc. Natl Acad. Sci. USA 81,
5975-5979.
27. Brown, N.L., Misra, T.K., Winnie, J.N., Schmidt, A., Lien, C., Sieff, M.
and Silver, S. (1985) In preparation.
28. von Hippel, P.H., Bear, D.G., Morgan, W.D. and McSwiggen, J.A. (1984)
Ann. Rev. Biochem. 53, 389-446.
29. Kneale, G.G. and Kennard, O. (1984) Biochem. Soc. Trans. 12, 1011-1015.
30. Pabo, C.O. and Sauer, R.T. (1984) Ann. Rev. Biochem. 53, 293-321.
31. Wishart, W.L., Broach, J.R. and Ohtsubo (1985) Nature 314, 556-558.
32. Walker, J.E., Saraste, M., Runswick, M.J. and Gay, N.J. (1982) EMBO J. 1,
945-951.
33. Reed, R., Shibuya, G.I. and Steitz, J.A. (1982) Nature 300, 381-383.
34. Rogowsky, P. and Schmitt, R. (1984) Molec. Gen. Genet. 193, 162-166.
35. Rogowsky, P., Halford, S.E. and Schmitt, R. (1985) EMBO J. In the Press.
36. Altenbuchner, J. and Schmitt, R. (1983) Molec. Gen. Genet. 190, 300-308.
37. Hawley, D.K. and McClure, W.R. (1983) Nucleic Acids Res. 11, 2237-2255.
38. Kitts, P., Symington, L., Burke, M., Reed, R., and Sherratt, D. (1982)
Proc. Natl Acad. Sci. USA 79, 46-50.
39. Lund, P.A., Ford, S.J. and Brown, N.L. (1985) Submitted to J. Gen.
Microbiol.
40. NiBhriain, N., Silver, S. and Foster, T.J. (1983) J. Bacteriol. 155,
690-703.
41. Misra, T.K., Brown, N.L., Haberstroh, L., Schmidt, A., Goddette, D. and
Silver, S. (1985) Gene 34, 253-262.
42. Roberts, R.J. (1985) Nucleic Acids Res. 13 suppl., r165-r200.