* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download The nucleotide sequence of the tnpA gene completes the sequence
Epigenomics wikipedia , lookup
Copy-number variation wikipedia , lookup
Pathogenomics wikipedia , lookup
Expanded genetic code wikipedia , lookup
Epigenetics of diabetes Type 2 wikipedia , lookup
Epigenetics of human development wikipedia , lookup
Frameshift mutation wikipedia , lookup
Genetic engineering wikipedia , lookup
Zinc finger nuclease wikipedia , lookup
Primary transcript wikipedia , lookup
Genome (book) wikipedia , lookup
No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup
Human genome wikipedia , lookup
Deoxyribozyme wikipedia , lookup
Non-coding DNA wikipedia , lookup
Cre-Lox recombination wikipedia , lookup
Gene therapy wikipedia , lookup
Genome evolution wikipedia , lookup
Gene nomenclature wikipedia , lookup
Gene expression programming wikipedia , lookup
Nutriepigenomics wikipedia , lookup
History of genetic engineering wikipedia , lookup
Metagenomics wikipedia , lookup
Gene expression profiling wikipedia , lookup
Genetic code wikipedia , lookup
Gene desert wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Microsatellite wikipedia , lookup
Nucleic acid analogue wikipedia , lookup
Transposable element wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Point mutation wikipedia , lookup
Genome editing wikipedia , lookup
Microevolution wikipedia , lookup
Designer baby wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Volume 13 Number 15 1985 Nucleic Acids Research The nndeotide seqnence of tbe tnpA gene completes tbe sequence of the Pscudomonas transposon Tn50i Nigel L.Brown, Joseph N.Winnie, David Fritzinger* and R.David Pridroore + Department of Biochemistry, University of Bristol, University Walk, Bristol BS8 1TD, UK Received 21 June 1985; Accepted 22 July 1985 ABSTRACT The nucleotide sequence of the gene (tnpA) which codes for the transposase of transposon Tn501 has been determined. It contains an open reading frame for a polypeptide of M -111,500, which terminates within the inverted repeat sequence of the transposon. The reading frame would be transcribed in the same direction as the mercury-resistance genes and the tnpR gene. The amino acid sequence predicted from this reading frame shows 32X identity with that of the transposase of the related transposon Tn3_. The C-terminal regions of these two polypeptides show slightly greater homology than the N-terminal regions when conservative amino acid substitutions are considered. With this sequence determination, the nucleotide sequence of Tn501 is fully defined. The main features of the sequence are briefly presented. INTRODUCTION Tn501 (1), from Paeudomonas aerugjposa. is a member of the "Tn3 family" of transposons (2), in that it has inverted terminal repeats of 38 base pairs, which are partially homologous to those of Tn3; and it is flanked by five base-pair direct repeats generated from the recipient replicon durino the transposition process (3). Several transposons, including Tn21,, Tnl721 and Tn2603, are known to have transposition functions sufficiently related to those of Tn501 that complementation of mutants in the transposition genes can occur (4,5), and models for the evolutionary relationship between these transposons have been proposed (6-9). Several other transposons have been identified which appear to be closely related to Tn21. (9). The major mechanism of transposition of the TnS-related transposons involves two steps, each requiring a transposon-encoded gene product (10,11). The first step requires the product of the tnpA gene together with host-coded replication functions, and involves the formation of a cointegrate molecule between the donor replicon containing the transposon, and the recipient replicon. The cointegrate molecule contains two copies of the © IR L Press Limited, Oxford, England. 5657 Nucleic Acids Research transposon in the direct tnpR gene, and copies of the cis-acting repeat. The second event is mediated by the product of is an intramolecular transposon components in in the addition These between two two require on the presence of the of inverted repeats of the transposon, and the resolution of the cointegrate by transposition resolvase (res) sites in each is copy by recombination between the internal resolution of the transposon. Current models of replicative transposition (see 2) suggest that the transposase catalyses the the steps to the gene products; the formation the cointegrate is absolutely dependent terminal recombination cointegrate. formation of the in the steps in cointegrate: site-specific cleavage in one strand at each terminus of the transposon, cleavage three recipient a less specific double-stranded staggered replicon, and the ligation termini with the recipient replicon. The of determination sequence of the tnpA gene and the concomitant prediction the of transposon the nucleotide of the amino acid sequence of the transposase is an essential part of elucidating the molecular mechanism of transposition. In this paper we present Tn501. which the sequence the predicted amino acid preliminary as involved in recognition of DNA DNA sequence tnpA step in locating and in those members of (7,9) a which regions catalysis. presented. Tn501 of This the study is finding use as a mutagen locate and define genes in cyanobacteria (12), and transposons compare of Tn501, and the main features of the sequence of the transposon are briefly simplest gene. We of the gene product with that of the Tn3_ transposase completes the a sequence transposase to of nucleotides 5301 - 8355 of includes the coding sequence for the widely-dispersed class are with associated it of is one of the mercury-resistance multiple resistance to antimicrobial agents. The information summarised here will be of use to those working on the molecular biology of cyanobacteria, and to those interested in the evolution of transposons and of antimicrobial resistance determinants. METHODS DNA fragments Tn501; sequence of analysis pJ0E114 DNA 13,14) in M13mp7, and chain-termination sequence whole was transposon was carried (a SalGI-EcoRI related analysis determined out by the deletion bacteriophage random of (15,16), of carrying followed by (17). The nucleotide sequence of the on encountered in the sequence analysis due both to strands. When persistent problems were secondary structure these were resolved by using formamide gels or deozyinosine 5658 cloning pBR322 substitution as Nucleic Acids Research described elsewhere (18). Programs for the storage and analysis of DNA sequence data described by Staden (19-21) running on a PDP 11/45 or VAX The program used were those 11/750 computer. for the prediction of protein secondary structure was that described by McLachlan (22). RESULTS The DNA sequence of nucleotldes 5301 to containing the Nucleotides tnpA gene 5301-5398 and have its been presented in describing the tnpR gene of Tn501 (23), and at 8355 of transposon Tn501. flanking sequences, is shown in Fig. 1. the a previous inverted publication repeat sequence nucleotides 8318-8355 has also been described earlier (3). The nucleotide sequence has a G+C content of 66Z, and contains one major open reading frame which would code for a polypeptide of 988 amino acids, M f « 111,500. We have presented data (23) showing that the termination codon for the tnpR gene of only Tn501_ was that at positions 5350-5352 (Fig. 1 ) . There are three nucleotides between the third base of this codon base of the suggested evidence that we have correctly identified of (24,25), the GGA as shown at first the gene product aligns well with the amino acid sequence determined for the transposase is the the initiation codon of the tnpA gene, but the predicted amino acid sequence Tn3_ and initiation codon of the tnpA gene. We have no formal from transposon in Fig. 2. The prospective Shine-Dalgarno sequence positions 5344-5346, of the tnpR gene. The nearest which is within the coding sequence alternative initiation codon is a GTG at position 5512, which has no apparent Shine-Dalgarno sequence. The codon usage of the proposed tnpA gene is similar to that described for other genes in Tn501 (14,23,26,27), there being a preference for codons containing at the third sequence position. presented here, the largest starting at nucleotide 5922 as other Tn501 relatively high GC or G genes, and content being on the complementary strand, and extending beyond the sequence in Fig. 1 to nucleotide 4660. These reading frames usage C There are several other open reading frames in the do not are have the same predicted codon probably not functional. The of the DNA gives a lower statistical probability of there being termination codon sequences present, thus giving rise to longer open reading frames. There is a number of short inverted and directly repeated sequences within the DNA sequence presented consequence high of the GC here. This content may be a mere statistical of the DNA, although the inverted 5659 Nucleic Acids Research F tppR I * I M H I S I E AACATTJ 5310 T L . Q I T L 1 T D 5320 TACCACTAIX ' l U L U J H I 5330 5340 5430 3440 TTCAAOCAITCCGACI 5450 5460 5550 5560 OUTCAOCTCJITC 5570 JCGCACCTACCTTCJU GCACGCCCAGC!4ACT( 5660 5690 5670 5790 5800 D * tapA gens P t E L I M XACTCA CACATGCCG 5350 5360 S A 5930 l 5480 5490 5500 5510 5580 5390 3600 3610 5620 5630 5700 5710 5720 5730 3950 5940 C T 5830 A L P S Q 5420 1 ii 111 u. 5540 5650 5640 5760 E 5410 5660 'IP AACGCCTTGCrCCTCGC 5770 5780 5690 5660 5900 S B B T 5990 3980 5960 L CAACTC1IU, CTt 5520 5530 5870 5660 L 3400 UCCTCACCCAACTGCCOCAS 5740 5750 3 I T U L CCGCTCJIOCCA1nCGCATCGCCCCAAGen^rkrr.kTt TT(lAACCTCAACGCCGGCACC jeena IOC 5920 E 5390 S 5910 T 5380 5470 ICCCATCt. 1 UAUOOU FlUibUj 1 l«ATlj 3820 3830 5640 5810 L 5370 CGQCA 6020 6010 6000 5970 G CATCCTTCACt•M1T>X4CCCCCTC ikT.kTk 6040 60X 6050 m rCACCTCI 6060 6070 6080 6190 6200 6310 6320 n I L A B Q « 1 L L I B B iTnCAI ID: AAULTUfCCX 6100 6120 6090 6110 kkknkrt:rccor 6160 6130 6170 6180 cttr CTCTTCJICCcaXCCAiGCAC A A C U T ZMKACCAC1ITCrir 6280 6270 6290 6300 6400 6410 6420 6430 Q H T 6150 6140 KAACCtrrUTTUTCAOTOCTGCATCTCCACCACCCCATC 6220 6230 6240 6250 6210 6440 6430 6560 6570 XAGCTCCTCCCCCGCCCGC40GCCITC 6460 6470 6480 XACCTGCAC 6490 6300 6580 6600 6610 6720 6730 caaCOCTCCCCCCCTACnaSXCGCIT:TCenXACGTA C1CAAC1ITC 6510 A D II L CCCOCACAACt 6630 S E L 6520 65» 6640 6650 6730 A I D I 6770 II P 6550 T 6670 1 i V 6780 • S 6590 JTC4A 6620 A 6660 S C 6760 6670 P A C M A L B A L P L SCO! CTQOCCt 6540 6260 ITCAACGACAACCTGCCCCTCTACTCCAAUTCCCCCianKTGCiaUCGCCAACCA 6330 6340 6350 6360 6370 6380 IfXT ECCiGQ JACca^ATCCCGCCiTCCAGaxarnaTCCccrccciCGAGTTCACcaGJUX 6390 E C C rrccrc D Q T E B L 6790 L E TO ca B q F B D F 6800 Q L 6700 6690 6680 3 C D t 0 D L P V A B 6820 6810 L 6710 L Q L E A E I F A A L 6840 6830 A T L A I 6740 I B E Q 6630 D II 6660 E L P D A SACCTCCCOGATCC I L T E S C L I I T P L D A A T P D B A Q A L I P Q T S 0 L L P B I I I T E L L r ATYXTr^T f r/ l rl' 1 • J J T^ftA^A""A^^ M ^ 1 ' J ^* l ' J V J ^^-^ J T ^rtT r T r^ y r J '^ f a : '^ 3 T-*'rXA r r fl'r 1 / T 1 A f 7T r A f f T T t / T'> 1 'r1 l^rirrAkP-1TrirrVrAfL' "r1 • 6990 7000 7010 7020 70X 7040 7050 7060 7070 7080 7090 7100 H D T D D V T C F S B B F T B L I D C A B A I D B T L L L S A I L C D A I II L C '^Tf7iiA1V l 1i f W?E J TU^ifi,!,!^ Ill i / j m m j 'I'll H T P ^ m ^ i V]TiJtfT?<iVi*11i'tf?*yT*A*^M^7'iJH '*• I IV'Mfl !'• I'' ' / \ ' l v r * T y ' i HilTI^I^TTrtHI '\\ \'JTi 7110 7120 71M 7140 7150 7160 7170 7180 7190 7200 7210 7220 L T I H A E S S P C L T T A I L S W L q A V B I B D E T T S A A L A E L V I I B Q fi-mKTm^TmrmjCTnr-n-jTn-rj-.Ti'n^TTirrr-r^i^i ii.iiii... T /- f ic^v- r r ? ^- < ri T rYVirrimirirrTii i i m u j i i IIJJ.I •fY-iry-n-ftirT-ifi~i 7230 7240 7250 7260 7270 7280 7290 7300 7310 7320 7330 7340 T H ' * .r...*...*-H. 7330 E P C B 7470 7360 L F U .. C .- D G I T 7480 B T T S 7380 1 s 3 D | 7370 3 D 0 7490 7390 T A P 7500 C q 7400 F S 7510 T B 7520 B F B A C C 7410 V 1 » T I C 7420 C 7 5 X t E S T C B 7 4 X B D S 75*0 T 7550 t B 7440 T T L P C T 7430 D 7560 C L L 7570 C S 7460 T B E 7580 S D L B I E E B T T D T A C F T D B V F A L H B I . L C F B F A P B I B D L C E T 7390 76O0 7610 7620 7630 I L T t P q C T Q T r P T L I P L A C T 7710 7720 7730 7740 7750 7640 I 7650 C C T L I 7760 7770 7660 I I H T I 7780 7670 7680 A B V D 0 7790 7800 7690 I 7700 L » L A S S 7810 7820 I i q C T T T A S L H L t l L G S T P B Q I I C L A T A L I E L C B I E I T L F I iTrii^inyiirrjruimnuTCiTinuiriiCTmrmrTiirmm .T.u<jj<-nj3ii.lijjiiTnrnrcinrTmrynrr.iTrnjirnr»/ijii.lii »T 7830 7840 7830 7860 7870 7880 7890 7900 7910 7920 7930 7940 L D V L Q 3 7950 5660 T 7960 E L I 7970 B I T B 7980 A C I 7990 • I C E A I » S L A B A T F F » I L C E I B D Nucleic Acids Research I S f t q g i T I i S C L l L V T i J I V L W K T T I L t l i T Q C L T E l C 9070 S0B0 8090 S100 8110 8120 8130 81*0 8130 8160 8170 8180 P T D C E L L Q F L S r L G U E B I D L T C D T V U B Q S B I L E D C I P I P L 6190 I 8200 M P C I 8210 P BIO « 8220 lirr«nj<l 8230 82*0 8230 8260 S27O 8280 8290 8300 fap. 8320 Figure 1. DNA sequence of the tnpA gene of Tn501 showing the predicted primary sequence of the gene product. The predicted amino acid sequences of both the C-terminus of the tnpR gene product (nucleotides 5301-5349) and the complete tnpA gene product (nucleotides 5336-8319) are shown. Amino acid sequences are in the single letter code, with asterisks marking the termination codons. The postulated Shine-Dalgarno sequence (5343-5345) and the potential stem-loop structure discussed in the text (8310-8338) are underlined, and the 38bp terminal inverted repeat sequence is also marked. repeats may affect the rate of synthesis or the stability of mRNA in this region. However, there is one short-range inverted repeat noteworthy between nucleotides 8310 and 8338, and inside end of the terminal inverted for repeat the stem-loop contains structure the tnpA in gene approximately terminator the and terminal repeat. In Tn3_ the stem would same of the end of may be the tnpA transcript and and to form and this also of the inverted six nucleotides, and the loop be structures at involved in transcription (28). Alternatively, or additionally, these stem-loop structures capable of the replicon which would tnpA gene gene potential place, the inside end be tnpA would be 14 nucleotides. These sequences may form a stem-loop termination at a of the transposon. The corresponding region of transposon Tn3_ (24) also has the a occurs function. This lies consists of a 7bp inverted repeat separated by 15bp containing the termination codon the that position, and may have a specific biological gene. There attenuation of transcripts initiating in the vector otherwise is no transcribe the such non-coding stem-loop structure which strand could of the attenuate transcription at the other end of the transposon in either Tn501 or Tn3. There and Tn3 is bat a is when 32Z amino the sequences are aligned by parsimony procedures. There number of conservative amino a d d substitutions between the sequences, these have not substitutions are been scored, that the homology between that acid identity between the transposases of Tn501 between strongly the conserved The significance of scored in the alignment. the N-terminal C-terminal regions conservative regions. Certain is slightly greater than oligopeptlde sequences are between the transposases of Ta501 these These however, in the DIAGON plot (Fig. 2b), which show and Tn3 (Fig. 2a). is not yet clear. The more hydrophobic regions of 5661 Nucleic Acids Research (a) 50 . . . . 100 kTQLClXIIlUIAljbllJCkUjU^lljniii^l-CPASVTlTCClinitKlJlAQ&JZTrLQLJU^CL : 30 : : : : i : 1 i t: : 11:111 : i : 0JJICCVliTi7rn.TIMETl^GVlHrTASOJin8I)ITTL«Ta)raXTaEHtiLI!qHTqTUriW . . . . 100 . . 150 . . . . 200 . . . n r«njnr»tTi mi <r»Ti»i/-<:migmiiTrr»mTPi-mourn m i n rireirruiTui nTipirpjcnmnmm m r p Tn . 350 f?r|ti I niPjremT 250 . . . . 300 I i m m i r n i r ] i r e i i n »cnrari>m>mii^rrTPrprriTi n v n FVTiTTTmn me m e n i 300 . . 350 400 . . . 450 . . WTTHTITI o y r p n I m n D . i p i t t p • •« T n ppnuiwr »nri>iim»T.gTn>«un» gTrrrei J J i m i f n CTnil ggfjim i i : it i nn : II : : : i : : i : :i : it : ™A-RL-U P I / ^ <**TK m'.CTHTiqi'i PPVU.TTI'J J smimr^n 550 500 650 600 t'l : i :: : i i :: i : ii : i i . : :t DZFTHASEASUtVOTiTOISinXIEiaiaZPLIISIITPALTIHIIJIVriLA]^^ 600 . . . . 650 . i i :: . i . : . n . : . t n . :: 700 : I : i : 700 800 .TVPQCVOJTPTLSPUCCnjniHVUawnDILIUSSIIQt; 750 imqsnmivijguDQarrAcsuLG 900 850 . . . 900 . . . 9 5 0 950 PT I f y i . ^ W [ U W I Ml TT^TTWTQ<PP1 FTHJ HlW W H J P i t i n : t: i i t i n CII/lJLSPLCHCHllOlUBTSFTUIlV-nCHLfiRJIlSEiBfVA 1000 (b) TQ301 Figure 2 Alignment of the predicted gene products of the. tnpA genes of Tn501 and Tn3 (24,25) by (a) parsimonious alignment of idantical aoino acid residues; and (b) dot matrix analysis (DIAGON; 20) in which conservative amino acid substitutions are scored. In (a) the upper sequence 5662 Nucleic Acids Research is that from Tn501. Amino acid identities are marked with a colon, and the hyphens are padding characters to help alignment. In (b) a window of 15 residues was used for the DIAGON alignment, and a match was scored if the homology was greater than that expected to occur at a frequency of 10~^ for proteins of the same amino acid compositions. the two polypeptides are similarly distributed along their respective primary structures, which is expected if the polypeptides adopt similar tertiary structures. The the data mercury (23), presented in this paper, together with the DNA sequences of resistance complete the genes DNA (14,26,27) sequence sequence has been lodged with the Cambridge Nucleotide physico-genetic map Sequence of Tn501 of HffiL Library the gene TnSOl is location of some of the Tn501. Service map region The full (29). derived The from the DNA gene boundaries and other Data which are derived from the full sequence are only presented in the text if they are concerned functions. As res-tnpR Sequence Library via the sequence) is shown in Fig. 3, and the positions of features are given in Table 1. the transposon Nucleotide Data (i.e. and used with the transposition for transposon mutagenesis (12), we give the restriction endonuclease cleavage sites which occur less than five times In Tn501 (Table 2 ) . DISCUSSION Tn501 transposase The determination of the DNA sequence presented here has allowed the prediction of the second transposase of the Tn3 family, of the the the significance transposases. The of not (4). transposases from complement one another, as Thus, each transposase gene of a being that of Tn3 it is difficult to the homologies and differences between the two Tn501 analogous functions in transposition of will tnpA structure other itself (24,25). In the absence of further biological data assess TnSOl primary must and Tn3 have completely their parental transposons, but they determined by using TnpA~ mutants contain sites required for the specific recognition of the ends of the transposon, which differ in detail between the Tn501 and Tn3 enzymes; and catalytic sites for the cleavage-ligation reaction, which may be very similar in both. We have tried to identify regions of in DNA to recognition the transposase by looking for primary sequences which helix-turn-helix tertiary that are involved could give rise structure common to several DNA-blnding 5663 Nucleic Acids Research ,. / / / / / / / / / Tn501 Figure 3. Physico-genetic nap of Tn501 showing the relative locations of the known genes and major open reading frames. The terminal inverted repeats and the major promoter (39) are marked. The gene designated merP is that described as merC in (26), and has been renamed because the merC gene originally identified in plasmid R100 by genetic criteria (40) corresponds to a reading frame that is not present in Tn501 (41). The reading frames urf-1 and urf-2 have not been ascribed a function. The exact positions of gene boundaries and other features are given in Table 1, as are references to the sequence data. The transposon is 8355 nucleotide pairs in length. proteins (30). This structural motif hydrophobic residues glycine. The method of predicting coupled with examination occurs an 'invariant' secondary structure that of the sequence for hoaology to residues failed to identify which contains glycine and four amino acids before and six amino acids after the was used (22) these conserved a strong candidate for such a DNA binding domain in both transpoaases. All methods of predicting secondary structure have weaknesses, and a helix-turn-helix structure may be present, but not have been detected. around Gly-854 of The best candidate for such a sequence is that the Tn501 transposase. An alternative explanation Table 1. Feature table for the Tn501 sequence based on the EKBL Nucleotide Sequence Library Format. CDS - coding sequence; INVREP inverted repeat; MSG - nessenger RNA. (C) refers to a coding sequence or mRNA on the complementary strand. 5664 ley From INVREP CDS MSG MSG CDS CDS CDS CDS CDS CDS SITE CDS CDS INVREP 1 548 576 591 620 983 1330 3033 3395 3628 4603 4792 5356 8318 To 38 117 (C) ? (C) ? 967 1255 3012 3395 3628 4668 4729 5349 8319 8355 Description terminal repeat merR gene merR mRNA mer mRNA merT gene merP gene merA gene merD gene urf-1 (merE gene?) urf-2 res tnpR gene tnpA gene terminal repeat Reference (3) (39) (39) (39) (26) (26) (14) (27) (27) (27) (34,35) (23) (This paper) (3) is Nucleic Acids Research Table 2. Restriction endonuclease cleavage sites occurring less than five times in Tn501 DNA. First base of recognition site Enzyme Aatll AccI Aflll AflHI AsuII Aval Avail Bell BstXI Drall Dralll EcoRI Espl Gdil HglEII HgiHIII Hlndlll Hael Narl Ndel Nhel NotI Nrul PvuII SalGI SphI StuI XhoII 1288 1885 4763 158 603 695 5884 7473 4543 3404 2220 7216 3949 4980 2114 4231 5066 5125 6860 13 33 2353 4953 8338 2064 6692 284 6705 850 2064 2368 2220 2949 4980 136 68 1525 6525 7924 7743 5655 6055 6916 7347 7980 1301 2341 1524 3273 4352 4852 1885 2498 6705 1319 6237 The enzymes from (42) which do not cleave Tn501 are: Ahalll, Anal, Ayalll, Avrll, BamHI. Bglll. BstEII, Clal, EcoRV, Hpal. Kpnl, Hlul, Ncol, Pvul, Pstl. Erul. Rspl, RsrII. S a d . SacII. Saul, Seal. Sfil. Smal, Snal. Spel, Sspl, Tthllll, Ibal. Xhol and Xmnl. (Isoschizomers are not included in this table.) that transposase, which specifically recognises the long-range inverted repeats, may not contain the same type of DNA binding motif found In proteins that recognise short-range shown that the binding of symmetrical Tn3 sequences. transposase to the Recent studies (31) have inverted repeats of the transposon Is ATP-dependent. This further argues against the DNA binding site being homologous to those of the ATP-Independent DNA binding proteins. There 1 B no sequence with obvious homology to known ATP-bindlng sites (30). Complementation studies (4,5) have shown of the transposon terminus is precise, some terminal Tn501 Inverted and Tnl721 repeats TnpA~ of closely-related mutants can be that, although the cleavage transposases can transposona. complemented by recognise the For a example, functional 5665 Nucleic Acids Research tnpA gene Tn21 cannot from Tn501. be complement or transposase be or TngJ,, whereas Tn501 or by inverted for the it must identical a were by complemented transposition to Tnl721 complemented Tn501 fail to Tn21^ inverted 1-80 of The of of these can specificity of the repeats utilire a then, Tn501 a is sequence repeat (7). If this used at any great frequency, events, nucleotides Tn3. mutant None Tnl721. after the such that Tn501 during within Tn501 that is Tn21. inverted only containing TnpA~ a few repeat transposition left terminal inverted repeat and would be lost (7). The hierarchy of complementation between Tn501. Tnl721, related transposons, makes the for the study Tn3-related sequences of two transposases of this regions of powerful and systems The availability of the primary family, which they are derived, will help in the those transposases of DNA-protein interaction. Tn21 and the gene sequences from design of experiments to identify the TnS-related transposases required for specificity and for catalysis. The DHA sequence of Tn501 and expression of the transposition genes The DNA sequence of Tn501 is now completely defined, and examination of the full sequence has revealed several biology of Tn501. (3,7,14,23,26,27). affect the Some of these features have relevant to the detailed been discussed previously This discussion is limited to those features expression which may of the transposition functions. Some of the sequences discussed below (between positions 4231 and 5398) were presented by Diver et al (23). In Tn3 and some other transposition genes, tnpR TnS-like and transcription of both genes is product at the transposition transcribed resistance site lying genes are in site by between the that The and will same gene Tn21_. In bind resolvase. as binding (positions 4603-4730; of Tn5Ol>and Tnl721 catalysed promoter 5666 by the immediately can These in tnpR in gene product the they are in the closely-related are three sequences in the have hybrid, Tn501 to, the induclble mercury occurs there and been identified by highly conserved in given as 373-502 in ref. 23). The participate Tn501 of the tnpR gene res-tnpR-tnpA. footprinting the protein-DNA complexes (34) and they are Tn501 gamma-delta, the genes (33). In distal order Tnl721 the the order from the same strand as, and genes. such are divergently transcribed, and the repressed res transposons Tnl721 res elements, tnpA. reciprocal (35). Tnl721 res. sites recombination contains a front of the tnpR gene (34). This promoter is in a Nucleic Acids Research position to be regulated site, and expressed in Tnl721 from this by the there tnpR is gene sequences binding at the res a reduction of 30% in expression of a gene promoter when tnpR is proposed -35 and -10 product are supplied ±n^ trans (36). The of this promoter are conserved in Tn501 (at positions 4686-4691 and 4708-4713, respectively; 458-463 ref. 23). We therefore assume that binding of resolvase and to 480-485 in DNA may regulate the transcription of the tnpR gene in Tn501. There is no sequence readily identifiable as being homologous to the consensus sequences of regions of the E^_ coli presumptive gene. In Tnl721 the tnpA gene detected in Tn501 (37) between promoter and promoter is very tnpA between promoter of weak, and could Tnl721 and Tn501_ in the presence the tnpA that Tn501 in the However, absence gene of has transcription by reconcile the which no mercuric its of own the read-through salts transposition structural transposition from region the genes mer (38) is genes. and al at a higher evidence that induced It to we is which the known promoter are by mercury, difficult to with the presence of a sequence in features in Tn501 which which may explain this, et of mercuric salts occurred has identical -35 and -10 sequences and spacer Kitts promoter, but that the tnpR gene does not, and data of litts et al identity in the found tnpA only be cointegrates, whereas and the products were resolved. This was taken as presumably Tn501 and -10 the this region implies that would also be very weak. (38) showed that transposition of frequency -35 start occurred at a low frequency and that the products were in the the the transposase-mediated transposition reaction (36). The high degree of homology the promoters tnpR are in not shows 15/17 Tnl721. We have present in Tnl721 investigating the transcription of the transposition genes in more detail. ACKNOWLEDGEMENTS We thank Dr H. Muirhead for making programs available to us, Dr H.C. Watson for Drs M.J. Bishop and G.G. protein structure prediction providing Kneale for their help in Nucleotide Sequence Data Library Service, K. Weston computer facilities, using for the Cambridge instructing JNW in the ever-evolving DNA sequencing methods, and Drs P.M. Bennett, and P.A. Lund for helpful discussion. This work was supported S.E. Halford by grants from the MRC to NLB, who is a Royal Society EPA Cephalosporin Fund Senior Research Fellow. 5667 Nucleic Acids Research Present address: Public Health Research Institute of the City of New York Inc., 455 First Avenue, New York, N.Y. 10016, USA. + Present address: Biozentnnn der Universitat Basel, Illngelbergstrasse 70, CH-4056 Basel, Switzerland. RKKKRENCES 1. Stanisich, V.A., Bennett, P.M. and Richmond, M.H. (1977) J. Bacteriol. 129, 1227-1233. 2. Kleckner, N. (1981) Ann. Rev. Genet. 15, 341-404. 3. Brown, N.L., Choi, C.-L., Grlnsted, J., Richmond, M.H. and Whitehead, P.R. (1980) Nucleic Acids Res. 8, 1933-1945. 4. Grinsted, J., de la Cruz, F., Altenbuchner, J. and Schmitt, R. (1982) Plasmid 8, 276-286. 5. Tanaka, M., Yamamoto, T., and Sawai, T. (1983) Molec. Gen. Genet. 191, 442-450. 6. Altenbuchner, J., Choi, C.-L., Grinsted, J., Schmitt, R. and Richmond, M.H. (1981) Genet. Res. Camb. 37, 285-189. 7. Grinsted, J. and Brown, N.L. (1984) Molec. Gen. Genet. 197, 497-502. 8. Schmitt, F. and Klopfer-Kaul, I. (1984) Molec. Gen. Genet. 197, 109-119. 9. Tanaka, M., Yamamoto, T., and Sawai, T. (1983) J. Bacteriol. 153, 1432-1438. 10. Kitts, P.A., Lamond, A. and Sherratt, D.J. (1982) Nature 295, 626-628. 11. Arthur, A. and Sherratt, D.J. (1979) Molec. Gen. Genet. 175, 267-274. 12. Bullerjahn, J. reported in Haselkorn, R. (1985) Plant Molec. Biol. Reporter 3, 24-32. 13. SchSffl, F., Arnold, W., Punier, A., Altenbuchner J. and Schmitt, R. (1981) Molec. Gen. Genet. 181, 87-94. 14. Brown, N.L., Ford, S.J. Pridmore, R.D. and Fritzinger, D.C. (1983) Biochemistry 22, 4089-4095. 15. Messing, J., Crea, R. and Seeburg, P.H. (1981) Nucleic Acids Res. 9, 309-321. 16. Vieira, J. and Messing, J. (1982) Gene 19, 269-276. 17. Sanger, F., Coulson, A.R., Barrell, B.G., Smith, A.J.H. and Roe, B.A. (1980) J. Molec. Biol 143, 161-178. 18. Brown, N.L. (1984) Methods in Microbiology 17, 259-313. 19. Staden, R. (1980) Nucleic Acids Res 8, 3873-3694. 20. Staden, R. (1982) Nucleic Acids Res 10, 2951-2961. 21. Staden, R. (1984) Nucleic Acids Res 12, 521-538. 22. McClachlan, A.D. (1977) Int. J. Quantum Chem. 12, Suppl. 1, 371-385. 23. Diver, W.P., Grinsted, J., Fritzinger, D . C , Brown, N.L., Altenbuchner, J., Rogowsky, P. and Schmitt, R. (1983) Molec. Gen. Genet. 191, 189-193. 24. Heffron, F., McCarthy, B.J., Ohtsubo, H. and Ohtsubo, E. (1979) Cell 18, 1153-1163 25. Fennewald, M.A., Gerrard, S.P., Chou, J., Casadaban, M.J. and Cozzarelli, N.R. (1981) J. Biol. Chem. 256, 4687-4690. 26. Misra, T.K., Brown, N.L., Fritzinger, D . C , Pridmore, R.D., Barnes, W.M., Haberstroh, L. and Silver, S. (1984) Proc. Natl Acad. Sci. USA 81, 5975-5979. 27. Brown, N.L., Misra, T.K., Winnie, J.N., Schmidt, A., Lien, C , Sieff, M. and Silver, S. (1985) In preparation. 28. von Hippel, P.H., Bear, D.G., Morgan, W.D. and McSwiggen, J.A. (1984) Ann. Rev. Biocheffl. 53, 389-446. 29. Kneale, G.G. and Kennard, 0. (1984) Biochem. Soc. Trans. 12, 1011-1015. 30. Pabo, C O . and Sauer, R.T. (1984) Ann. Rev. Biochem. 53, 293-321. 5668 Nucleic Acids Research 31. Wishart, W.L., Broach, J.R. and Ohtsubo (1985) Nature 314, 556-558. 32. Walker, J.E., Saraste, M., Runswick, M.J. and Gay, N.J. (1982) EMBO J. 1, 945-951 33. Reed, R., Shlbuya, G.I. and Steitz, J.A. (1982) Nature 300, 381-383. 34. Rogowaky, P. and Schmitt, R. (1984) Molec. Gen. Genet, 193, 162-166. 35. Rogowaky, P., Halford, S.E. and Schndtt, R. (1985) EMBO J. In the Press. 36. Altenbuchner, J. and Schmitt, R. (1983) Molec. Gen. Genet. 190,300-308. 37. Hawley, D.K. and McClure, W.R. (1983) Nucleic Acida Res. 11, 2237-2255. 38. Kitta, P., Symington, L., Burke, M., Reed, R., and Sherratt, D. (1982) Proc. Natl Acad. Sci. USA 79, 46-50. 39. Lund, P.A., Ford, S.J. and Brown, N.L. (1985) Submitted to J. Gen. Microbiol. 40. NiBhriain, N., Silver, S. and Foater, T.J. (1983) J. Bacteriol. 155, 690-703. 41. Misra, T.K., Brown, N.L., Haberstroh, L., Schmidt, A., Goddette, D. and Silver, S. (1985) Gene 34, 253-262. 42. Roberts, R.J. (1985) Nucleic Acids Res. 13 auppl., rl65-r200. 5669