* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Nucleotide sequence of the thioredoxin gene from
Real-time polymerase chain reaction wikipedia , lookup
Bisulfite sequencing wikipedia , lookup
Gene therapy of the human retina wikipedia , lookup
Zinc finger nuclease wikipedia , lookup
Biochemistry wikipedia , lookup
Genomic library wikipedia , lookup
Gene therapy wikipedia , lookup
Molecular ecology wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Multilocus sequence typing wikipedia , lookup
Gene desert wikipedia , lookup
Nucleic acid analogue wikipedia , lookup
Amino acid synthesis wikipedia , lookup
Ancestral sequence reconstruction wikipedia , lookup
Gene regulatory network wikipedia , lookup
Expression vector wikipedia , lookup
Promoter (genetics) wikipedia , lookup
Gene expression wikipedia , lookup
Gene nomenclature wikipedia , lookup
Biosynthesis wikipedia , lookup
Protein structure prediction wikipedia , lookup
Two-hybrid screening wikipedia , lookup
Genetic code wikipedia , lookup
Silencer (genetics) wikipedia , lookup
Homology modeling wikipedia , lookup
Point mutation wikipedia , lookup
Bioscience Reports 4, 917-923 (1984) Printed in Great Britain Nucleotide 917 sequence of the t h i o r e d o x i n gene from Escherlchia coli o~ ,, 3an-Olov HOOG 1, Hedvig VON BAHR-LINDSTROM 1, Staffan 3OSEPHSONI~ Betty 3. WALLACE~, Sidney R. KUSHNER 3, Hans 3ORNVALLI~ and Arne HOLMGREN 1 iDepartment of Chemistry I, Karolinska Institutet, S-I04 01 Stockholm; 2Department of Chemistry, KabiGen AB, S-I12 87 Stockholm, Sweden; and 3Department of Molecular and Population Genetics, University of Georgia, Athens, GA 30602, U.S.A. (Received 27 September 1984) The nucleotide sequence of the thioredoxin gene from c o l i was determined. The structural gene was i d e n t i f i e d on a cloned 3-kb PvuII Iragment by hybridization with a synthetic oligodeoxyribonucleotide corresponding to a part of the amino acid sequence of thioredoxin. Restriction-enzyme fragments were used as templates in the dideoxy sequence method, directly and after subcloning into M13mpS. A segment of 450 nucleotides was determined using both strands7 alternatively~ w i t h o u t e x t e n s i v e o v e r l a p s . The sequence c o n t a i n s the t h i o r e d o x i n coding region~ a potential ribosome-binding site~ and a putative promotor region. The p r e d i c t e d a m i n o acid sequence differs by two i n v e r s i o n s from the p r e v i o u s l y given t h i o r e d o x i n sequence. The revised sequence is presented and the results further show that thioredoxins from E. c o l i B and K12 are identical. Escherichia Thioredoxin from Escherichia coli has suggested functions as a general protein disulphide reductase (1) and as a hydrogen donor for ribonucleotide reductase (1) and is also an essential subunit of phage T7 DNA polymerase (2). T h i o r e d o x i n from E. c o l i consists of 108 amino acid% and a primary structure for this protein has been deduced (3). The t e r t i a r y structure has also been determined (4) as well as the relationship to g l u t a r e d o x i n (5-7) and i n t r a c e l l u l a r localizations (8). Mapping experiments have located the gene for thioredoxin~ t r x A , at 84 min on the E. c o l i chromosome (9). Recently~ a segment of the E. c o l i chromosome containing the thioredoxin gene has been ]igated into the ptasmid pBR325 (10). The resulting plasmid pBHK8 transformed into E. coli K12 ( s t r a i n : 5K3981) o v e r p r o d u c e s thioredoxin 150- to 200-fold ( I 0 ) . 01984 The Biochemical Society 918 HDOG ET AL. Knowledge of the nucleotide sequence of the thioredoxin gene will e n a b l e s t u d i e s using site-directed mutagenesis to define structurefunction relationships of thioredoxin. In this study9 we report determination of the nucleotide sequence of the thioredoxin gene from E. c o l i 9 independent of similar studies elsewhere (11) 9 we analyze the protein~ and compare the amino acid prediction from DNA with the primary structure of thioredoxin. i' M a t e r i a l s and Methods Preparation of DNA The plasmid pBHK8 was isolated from E. coli SK3981 with the alkaline lysis method (10912). Digestions of plasmid DNA by restriction endonucleases were performed using the conditions suggested by the manufacturer (Boehringer). Restriction endonuclease fragments w e r e s e p a r a t e d by a g a r o s e - g e l e l e c t r o p h o r e s i s and i s o l a t e d by electro-elution in dialysis bags (12). Synthesis of oligodeoxyribonucleotides O l i g o d e o x y r i b o n u c i e o t i d e s w e r e s y n t h e s i z e d by the solid-phase phosphoamidite method using an automatic synthesizer (13914). The concentrations of t h e f o u r 5 ' - O - d i m e t h o x y t r i t y l - 2 ' - d e o x y nucleoside-3'-O-methyl phosphodimethylamidite reagents were measured (1#) prior to making the mixtures so as to get equal ratios between reagents. Initially 9 a mixed l#-mer was synthesized corresponding to the known amino acid sequence (3). When part of the DNA structure was known 9 a 15-mer was synthesized corresponding to the nucleotide sequence. Hybridization with synthetic oligodeoxyribonucleotides Oligodeoxyribonucleotides (0.01 pg/pl) were labelled at the 5' end by transfer from [y-32P]ATP (Amersham; 3000 Ci/mmol; 1:2 molar ratio [y-g2P]ATP:oligodeoxyribonucleotides) by using T# polynucleotide kinase (New England Biolabs; 2 U/pl) in 50-ram Tris/HCl pH 7.59 10-ram MgCI29 10-ram dithiothreitol for #5 rain at 37~ Excess reagents were removed by separation on Sephadex G-50 (0.25 x 15 cm) in 10-ram Tris/HCl, l-raM EDTA9 pH 7.5. Radioactively labelled oligodeoxyribonucieotides (3 x 108 c . p . m . / p g ) w e r e used as probes for hybridization analysis by the Southern blot (15916) and dot-blot (17) techniques. Nucleotide sequence and protein analysis S a u 3 A fragments were cloned into the BamHI site of Ml3mpg (18). Transfection of E. c o l i 3M 103 was performed as described (19). Sequence analysis by the dideoxy method (20921) utilizing singlestranded M13-templates (22) and restriction-enzyme fragments after heat-induced strand separation (23) was carried out with either an Mi3-specific universal primer (Amersham) or with the thioredoxinspecific oligodeoxyribonucleotides as primers. The labelled nucleotide was [~-aSS]dATP (Amersham~ 600 Ci/mmol) and the reaction mixture was s e p a r a t e d by electrophoresis in a 0.2-ram-thick 6% urea-polyacrylamide gel. NUCLEOTIDE SEQUENCE OF E. COLI THIOREDOXIN GENE A 919 B 3 kb Fig. I. Identification of the fragment from pBHK8 containing the thioredoxin gene. Lane A: Ethidium bromide stain after agarose-gel electrophoresis of a PvuII digest of pBHK8, showing the 3-kb and 2.6-kb fragments (I0). Lane B: Autoradiograph of lane A after transfer to nitrocellulose by the method of Southern (15) and hybridization with the 32p_ labelled mixed 14-mer oligodeoxyribonucleotide (Table i). SK3981 was prepared as described (10) a n a l y s i s was carried out with the DABITC Thioredoxin from E. coli and manual s e q u e n c e method (24925). Results The plasmid pBHK8 was cleaved with z~zuII and the digest was separated by agarose-gel electrophoresis (Fig. 1). A 3-kb fragment containing the thioredoxin gene was identified (Fig. 1) after transfer to nitrocellulose by the method of Southern and hybridization with the mixed 14-mer synthetic probe (Table 1). The 3-kb fragment was isolated from agarose gel and used for sequence analysis directly and after subcloning into Ml3mp8. Table I. Synthetic oligodeoxyribonucleotides used as hybridization probes and primers in dldeoxy sequence analysis 28 Amino acid sequence - Trp 32 - Ala - Glu - Trp - Cys - C 14-mer 3" A C C C G T C T ~ A C C A C 5" A 52 57 - Lys - Leu - T h r - Va] - A l a - Lys Amino a c i d sequence 15-mer 5" ACTGAC CGTTGCAAA 3" 920 Hb6G 50 I A I i 100 100 I i 200 ET AL. l 300 -I 400 C3 ED Fig. 2. Summary of nucleotide sequence data. Line A represents the complete structure determined, with amino acid sequence numbering above the line and nucleotide sequence numbering below. Line B represents the two Sau3A fragments sequenced in the MI3 system. L i n e s C and D show the regions sequenced directly from the 3-kb fragment using as primer, respectively, the mixed 14-met and the 15-mer oligodeoxyribonucleotides (Table I). The 3-kb f r a g m e n t was digested with Sau3A, and the mixture o b t a i n e d was cloned i n t o M l 3 m p g . A # g - n u c l e o t i d e fragment9 i d e n t i f i e d as c o m p l e m e n t a r y to the mixed l#-mer probe by the dot-blot hybridization technique, was sequenced (Fig. 2). Based on the sequence of another Sau3A fragment (Fig. 2) from the thioredoxin gene~ a 15-mer oligodeoxyribonucleotide was synthesized (Table 1) for use as a primer in the sequence analysis. The isolated 3-kb fragment was used as template in direct dideoxy sequence analysis after heat-induced strand separation. The mixed l#-mer oligodeoxyribonucleotide was utilized as primer to determine the nucleotide sequence of the coding strand on the 5' side of the r e g i o n c o m p l e m e n t a r y to the primer (Fig. 2) (26). The 15-mer oligodeoxyribonucleotide was used in the same way to determine the sequence of the non-coding strand on the 3' side of the oligodeoxyr i b o n u c l e o t i d e (Fig. 2). The resulting nucleotide sequence of the thioredoxin gene is shown in Fig. 3. The N-terminal sequence of thioredoxin from E. eo2i strain SK39gl was d e t e r m i n e d w i t h t h e D A B I T C method and found to be Ser-Asp-Lys-lle-~ which is identical to that of E. co2i strain B (3). Discussion The n u c l e o t i d e sequence of the thioredoxin gene from E. coli K12 is given in Fig. 3 t o g e t h e r with the p r e d i c t e d amino acid sequence. The predicted protein structure was f o u n d to d i f f e r from the p r e v i o u s l y r e p o r t e d amino a c i d sequence (3) at two places. The d i f f e r e n c e s c o n c e r n positions t6-17 and 71-72, which have been quoted as Leu-Val and Ile-Gly~ r e s p e c t i v e l y (Fig. 5 in ref. 3). However, t h e s e s t r u c t u r e s were mis-quotations of original protein data~ giving Val-Leu (27) and Gly-Ile (Fig. # in ref. 3)~ and the orders are now v e r i f i e d and established by the present n u c l e o t i d e sequence. The e n t i r e gene was not analyzed on both strands. This was considered u n n e c e s s a r y since the n u c l e o t i d e sequence a g r e e s with the amino acid sequence. NUCLEOTIDE SEQUENCE OF E. COLI THIOREDOXIN GENE TCA GCT TAA CTA TTG CTT TAC GAA AGC GTA TTT AGT TGG TTA ATG CCT G T ~ G A G T T ~ TCC GGT GAA ATA AAG TCA ACC TTA CAC CAA CAA CGA AAC CAA CAC GCC AGG TAT ATG AGC GAT AAA ATT ATT Ser Asp Lys Ile Ile Leu Thr Asp Asp Ser 10 TTT GAC ACG GAT GTA CTC AAA GCG Val Leu Lys CTT ATT CAC CTG ACT GAC GAC AGT His 1 Phe Asp Thr Asp 921 GAC GGG GCG ATC CTC GTC GAT Ala Asp Gly Ala TTC TGG Ile Leu Val Asp P h e Trp 20 GCA GAG TGG TGC GGT CCG TGC AAA ATG ATC GCC CCG ATT CTG GAT GAA ATC AIa GIu Trp s Giy Pro Cys Lys Met Iie A1a Pro lie Leu Asp Giu Ile 30 40 GCT GAC GAA TAT CAG GGC AAA CTG ACC GTT A1a GCA AAA CTG AAC ATC GAT CAA Asp G1u Tyr Gln G1y Lys Leu Thr Val Ala Lys Leu Ash 50 GO AAC CCT GGC ACT GCG CCG AAA TAT GGC ATC Asn Pro Gly Thr Ala lie Asp Gin Pro tys Tyr Gly CG~ GGT ATC CCG ACT CTG CTG lie Arg Gly lie Pro Thr Leu Leu 70 CTG TTC AAA AAC GGT GAA GTG GCG GCA ACC AAA GTG GGT GCA CTG TCT AAA Leu Phe Lys Ash Gly Glu Val Ala Ala Thr Lys Val Gly Ala Leu 80 Ser Lys 9O GGT CAG TTG AAA GAG TTC CTC GAC GCT AAC CTG GCG Gly Leu Ala Gin Leu Lys Glu 100 Phe Leu Asp Ala Ash TAA GGG AAT 108 Fig. 3. Nucleotide sequence of the thioredoxin gene and its corresponding amino acid sequence. The nucleotide sequence of 450 nucleotides comprising the non-coding strand of the thioredoxin gene is s h o w n in the 5'§ direction. The proposed ribosome-binding site is boxed and the -35 and -I0 sites of the proposed promotor are indicated by lines below the nucleotide sequence. The E . c o l i KI2 strain SK3981 containing the plasmid pBHK8 produces thioredoxin in amounts great er than 300 mg per litre of culture medium (10), which is 150-200 times more than wild-type E. coli B. A normal processing of the thioredoxin N-terminus by the o v e r p r o d u c i n g strain is indicated by the N-terminal serine residue, which is identical to that of the wild-type protein. This contrasts with N - t e r m i n i d e t e r m i n e d for glutaredoxin, a small redox-protein belonging to the same superfamily as thioredoxin (5). The glutaredoxin from the overproducing E . c o l i K12 strain C[0-17 has an N-terminal methionine residue, followed by a glutamine residue (5), compared to the protein from E. coli B which starts with glutamic acid/glutamine (28). Altogether, the nucleotide sequence determination and the presently determined N-terminal amino acid sequence compared with the previously known amino acid sequence shows that there is no strain variation for thioredoxin from E. coli B and KI2 strain SK3981. 922 HOC)G ET AL. The thioredoxin coding region is initiated by an AUG triplet. It is preceded by a potential ribosome-binding site (G-G-A-G-U-U) showing f i v e out of six residues identical to the Shine-Dalgarno sequence (G-G-A-G-G-U) (29). The coding region is terminated by an ochre (U-A-A) codon. The best putative promotor region is 70 base pairs upstream of the translation-initiation codon. It comprises a Pribnow box and a -35 region (Fig. 3); both showing four out of six residues i d e n t i c a l to the respective consensus sequences (30) in E. c o l t . Therefore, it seems as if the transcription of E. c o l t thioredoxin is governed by its own promotor region, which is consistent with the fact that thioredoxin is an abundant protein (104 copies/cell (31)). The results also show that thioredoxin is not synthesized with a signal sequence, although it is localized in the periphery of E. co.Zi cells (~). During the identification and nucleotide sequence determination of a plasmid containing the rho gene from E. colt K12, the presence of an unidentified, expressed polypeptide of 12 000-14 000 Da was observed (32). The sequence of the first 60 nucleotides of the 390 preceding the coding part of the rho gene (32) is identical to the last residues of the coding part of the thioredoxin gene (Fig. 3). This identifies the unknown polypeptide as thioredoxin and furthermore locates the gene for thioredoxin upstream of the rho gene in E. eoli K12. Acknowledgements We thank Drs. Lars Hellman~ Erik Holmgren, and Lars Wieslander for v a l u a b l e discussions and Eva Lagerholm for synthesis of the oligodeoxyribonucleotides. This work was s u p p o r t e d by grants from the Swedish Cancer S o c i e t y , the Swedish Medical Research Council, Magn. Bergvall's Foundation, and the National Institute of Health (6m27997). References i. Holmgren A (1981) Trends Biochem. Sci. 6, 26-29. 2. Mark DF & Richardson CC (1976) Proc. Natl. Acad. Sci. U.S.A. 73, 780-784. 3. Holmgren A (1968) Eur. J. Biochem. 6, 475-484. 4. Holmgren A, S~derberg B-O, Eklund H & BrHnd6n C-I (1975) Proc. Natl. Acad. Sci. U.S.A. 72, 2305-2309. 5. H~Bg J-O, JSrnvall H, Holmgren A, Carlquist M & Persson M (1983) Eur. J. Biochem. 136, 223-232. 6. Eklund H, Cambillau C, SjSberg B-M, Holmgren A, JSrnvall H, H~Sg J-O & Br~nd~n C-I (1984) EMBO J. 3, 1443-1449. 7. Klintrot I-M, H88g J-O, J~rnvall H, Holmgren A & Luthman M (1984) Eur. J. Biochem., in press. 8. Nygren H, Rozell B, Holmgren A & Hansson H-A (1981) FEBS Lett. 133, 145-150. 9. Mark DF, Chase JW & Richardson CC (1977) Molec. Gen. Genet. 155, 145-152. I0. Lunn CA, Kathju S, Wallace BJ, Kushner SR & Pigiet V (1984) J. Biol. Chem., in press. NUCLEOTIDE SEQUENCE OF E. COLI THIOREDOXIN GENE 923 II. Wallace BJ & Kushner SR (1984) Gene, submitted for publication. 12. Maniatis T, Fritsch EF & Sambrook J (1982) Molecular Cloning, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York. 13. Chow F, Kempe T & Palm G (1981) Nucleic Acids Res. 9, 2807-2817. 14. Josephson S, Lagerholm E & Palm G (1984) Acta Chem. Scand., in press. 15. Southern EM (1975) J. Mol. Biol. 98, 503-517. 16. Wallace RB, Shaffer J, Murphy RF, Bonner J, Hirose T & Itakura K (1979) Nucleic Acids Res. 6, 3543-3557. 17. Hu N & Messing J (1982) Gene 17, 271-277. 18. Messing J & Vieira J (1982) Gene 19, 269-276. 19. Messing J (1983) Methods Enzymol. I01, 20-78. 20. Sanger F, Nicklen S & Coulson AR (1977) Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467. 21. Sanger F, Coulson AR, Barrell BG, Smith AJH & Roe BA (1980) J. Mol. Biol. 143, 161-178. 22. Schreir PH & Cortese R (1979) J. Mol. Biol. 129, 169-172. 23. Wallace RB, Johnson MJ, Suggs SV, Miyoshi K, Bhatt R & Itakura K (1981) Gene 16, 21-26. 24. Chang JY, Brauer D & Wittmann-Liebold B (1978) FEBS Lett. 93, 205-214. 25. von Bahr-Lindstr~m H, Hempel J & J~rnvall H (1982) J. Protein Chem. I, 257-261. 26. Edlund T, Ny T, R~nby M, Hed6n L-O, Palm G, Holmgren E & Josephson S (1983) Proc. Natl. Acad. Sci. U.S.A. 80, 349-352. 27. Holmgren A (1968) Eur. J. Biochem. 5, 359-365. 28. Holmgren A (1979) J. Biol. Chem. 254, 3664-3671. 29. Shine J & Dalgarno L (1974) Proc. Natl. Acad. Sci. U.S.A. 71, 1342-1346. 30. Rosenberg M & Court D (1979) Ann. Rev. Genet. 13, 319-353. 31. Holmgren A, Ohlsson I & Grankvist M-L (1978) J. Biol. Chem. 253, 430-436. 32. Pinkham JL & Platt T (1983) Nucleic Acids Res. II, 3531-3545.