Nucleic Acids Research, Volume 11, Number 19, 1983

Genes for cytochrome c oxidase subunit I, URF2, and three tRNAs in Drosophila mitochondrial DNA

Douglas O. Clary and David R. Wolstenholme

Department of Biology, University of Utah, Salt Lake City, UT 84112, USA

Received 27 July 1983; Revised and Accepted 12 September 1983

ABSTRACT

Genes for URF2, tRNAtrp, tRNAcys, tRNAtyr and cytochrome c oxidase subunit I (COI) have been identified within a sequenced segment of the Drosophila yakuba mtDNA molecule. The five genes are arranged in the order given. Transcription of the tRNAcys and tRNAtyr genes is in the same direction as replication, while transcription of the URF2, tRNAtrp and COI genes is in the opposite direction. A similar arrangement of these genes is found in mammalian mtDNA except that in the latter, the tRNAala and tRNAasn genes are located between the tRNAtrp and tRNAcys genes. Also, a sequence found between the tRNAasn and tRNAcys genes in mammalian mtDNA, which is associated with the initiation of second strand DNA synthesis, is not found in this region of the D. yakuba mtDNA molecule. As the D. yakuba COI gene lacks a standard translation initiation codon, we consider the possibility that the quadruplet ATAA may serve this function. As in other D. yakuba mitochondrial polypeptide genes, AGA codons in the URF2 and COI genes do not correspond in position to arginine-specifying codons in the equivalent genes of mouse and yeast mtDNAs, but do most frequently correspond to serine-specifying codons.

INTRODUCTION

We have recently reported the sequences of various segments of the mitochondrial DNA (mtDNA) molecule of the fly, Drosophila yakuba, within which we have identified the genes for the two ribosomal RNAs, nine tRNAs, and seven polypeptides (cytochrome c oxidase (CO) subunits II and III, ATPase subunit 6, Unidentified Reading Frames (URF) 1, 2, 3 and A6L) (1-3).
While these genes are similar to the genes found within a number of mammalian mtDNAs (4-6), the order in which they are arranged differs considerably between the D. yakuba and mammalian mtDNA molecules (1-3). A difference in the genetic code of Drosophila and mammalian mtDNA is also indicated by our data. In mammalian mtDNA the triplet AGA (arginine in the standard genetic code) is used only as a rare termination codon. In contrast, in D. yakuba AGA is used to specify an amino acid in the seven polypeptide genes we have identified. However, these AGA codons never correspond in position to codons which specify arginine in the equivalent polypeptide genes which have been sequenced from mtDNAs of mouse (4), yeast (7-10) and a plant (Zea mays; 11). We have presented arguments that rather than specifying arginine, AGA may specify serine in the Drosophila mitochondrial genetic code.

In this paper we report a new sequence of D. yakuba mtDNA which contains the genes for URF2, tRNAtrp, tRNAcys, tRNAtyr and COI. Further interesting differences between D. yakuba and mammalian mtDNAs are noted, including the absence in the D. yakuba molecule of a sequence associated with initiation of second strand DNA synthesis in mammalian molecules, and the lack of a standard translation initiation codon in the D. yakuba COI gene. We suggest that initiation may be achieved in response to the four nucleotide codon ATAA.

© IRL Press Limited, Oxford, England.

MATERIALS AND METHODS

Experimental details regarding isolation of mtDNA from Drosophila yakuba (stock 2371.6, Ivory Coast), preparation and identification of pBR322 and pBR325 clones of D. yakuba mtDNAs, restriction enzyme digestion, electrophoresis, cloning of fragments into M13mp8 or M13mp9 DNA and purification of M13 DNAs are given or referenced in (1). The larger (2.8 kb) HindIII - ClaI fragment of the pBR322-cloned HindIII-B fragment of D. yakuba mtDNA (Fig. 1) was cloned separately into M13mp8 and M13mp9.
From replicative forms (RF) of the two hybrid M13 DNA molecules, viral DNAs containing partial deletions of the 2.8 kb HindIII - ClaI mtDNA fragment were prepared as previously described (3). In order to confirm the orientation of the HindIII-F fragment, the 1.0 kb BglII - EcoRI subfragment was cleaved from the pBR325-cloned EcoRI-A fragment, cloned into M13mp8, and partial deletions were prepared. All DNA sequences were obtained from M13mp8- or M13mp9-cloned fragments by the extension-dideoxyribonucleotide termination procedure (12) using [α-32P]dATP (800 Ci/mM; New England Nuclear) as described (1). The sequencing strategies used are given in Fig. 1. Computer analyses of DNA sequences were carried out using the programs referenced in (3).

RESULTS AND DISCUSSION

Sequence organization. The entire sequence of a continuous 2789 nucleotide section of the D. yakuba mtDNA molecule, determined using the strategies summarized in Fig. 1, is shown in Fig. 2. Within this sequence are two open reading frames which, from comparisons of nucleotide and amino acid sequences to the corresponding sequences of previously identified genes of mouse mtDNA (4; Table 1 and Fig. 3), have been identified as the genes for cytochrome c oxidase subunit I (COI) and URF2. The sequence also contains three regions each of which can fold into the characteristic secondary structure of a tRNA, with anticodons indicating them to be the genes for tRNAtrp, tRNAcys, and tRNAtyr (Fig. 4). The five genes occur in the order URF2, tRNAtrp, tRNAcys, tRNAtyr, and COI. The 5' end of the URF2 gene is located adjacent to the tRNAf-met gene as shown previously (1), and the 3' end of the COI gene is next to the previously described (3) tRNAleu(UUR) gene (Fig. 2).

Figure 1. A map of the D. yakuba mtDNA molecule showing the relative locations of the A+T-rich region (cross hatched), the two rRNA genes (dotted), the origin (O) and direction (R) of replication, and EcoRI and HindIII sites and fragments (A-E and A-F respectively) (see (1) for details and references). The bar under the map indicates the segment sequenced. This segment is expanded below and the restriction sites and strategy employed to obtain the entire nucleotide sequence are shown. The origin of each sequence is as follows: a and b, the two ends of the larger (2.8 kb) HindIII - ClaI subfragment of the pBR322-cloned HindIII-B fragment. c and d, the two different orientations of the smaller (0.4 kb) HindIII - BglII subfragment of the HindIII-B fragment. e, the larger (4.4 kb) HindIII - BglII subfragment of the HindIII-B fragment. f, the HindIII-F fragment. g, the HindIII-C fragment. h, a DNaseI - BglII deletion (cloned in M13mp8) of the 1.0 kb BglII - EcoRI subfragment of the pBR325-cloned EcoRI-A fragment. Unlabeled sequences with arrows pointing to the left are DNaseI - HindIII deletions (cloned in M13mp8) of the 2.8 kb HindIII - ClaI subfragment. Unlabeled sequences with arrows pointing to the right are DNaseI - ClaI deletions (cloned in M13mp9) of the 2.8 kb HindIII - ClaI subfragment. The small vertical terminal arrows on the lower bar indicate the extent of the sequence shown in Fig. 2, and the solid arrow head shows the 5'-3' direction of this sequence.
6861 Nucleic Acids Research URF2 f tRNAf-met I F Y TTTCTTTTT N S S K I L F T T M I I I G T I L T V ATTTTTTATAATTCATCAAAAATTTTATTTACCACAATTATAATTATTGGAACATTAATTACAGT 75 T S N S W L G A W M G L E I N L L S F I P L L a D TACATCTAATTCTTGGTTAGGAGCTTGAATAGGTTTAGAAATTAATTTGTTATCTTTTATCCCCCTATTAAGAGA 150 N N N L M S T E AS L KY FL T Q A LAST V L L TAATAATAATTTAATATCTACAGAAGCTTCTTTAAAATATTTTTTAACCCAAGCTTTGGCATCAACTGTTTTATT 225 F S S L I L M A L N N N L N E N I E F S T S M I I ATTTTCTTCAATTTTACTTATATTGGCAAATAATTTAAATAATGAAATTAATGAATCTTTTACATCAATAATTAT S M A L L L K G A s A P F H F W F P N M M E G L 300 T TATATCGGCCTTATTATTAAAAAGAGGAGCCGCTCCTTTTCATTTTTGATTTCCTAATATAATAGAAGGATTAAC 375 W M N A L M L M T W Q K I A P L M L IS Y L N I K ATGAATAAATGCTTTGATATTAATAACTTGACAAAAAATTGCTCCATTAATATTAATTTCTTATTTAAATATTAA 450 N L L I L S V S L I V I I G A I G G N L Q S T L R AAATTTATTATTAATTAGTGTAATTTTATCAGTTATTATTGGAGCAATTGGAGGTTTAAACCAAACTTCACTCCG 525 K L M A F S S I N H L G W M L s S L M I s E S I W AAAATTAATAGCATTTTCTTCTATTAATCATTTAGGATGAATATTAAGATCTTTAATGATTAGAGAATCAATTTG 600 I L Y F I T F M F N I F K L F H ATTAATTTATTTTATTTTTTATTCATTCTTATCTTTTGTATTAACATTTATATTTAATATTTTTAAATTATTTCA 675 Q L F S W F V N s K I L K F S L F M N F L S L TTTAAATCAATTATTTTCTTGATTTGTAAACAGAAAAATTTTAAAATTTTCATTATTTATAAATTTTTTATCTTT 750 G G L P P F L G F L P K W L V I Q Q L T M C N Q Y AGGTGGATTACCTCCATTTTTAGGATTTTTACCAAAATGATTAGTAATTCAACAATTAACAATATGTAATCAATA 825 L Y F S F S L V F L N L F L L T M M T S M L I T L F F Y R I L C Y S A F TTTTTTATTAACATTAATAATAATATCAACTTTAATTACATTATTTTTTTATTTACGAATTTGTTACTCAGCTTT 900 M L N Y F E N N W I M E M N M N S N N T N L Y L I TATATTAAATTATTTCGAAAATAACTGAATCATGGAAATAAATATAAATAGTAATAATACTAATTTATATTTAAT 975 F T M F S I G F F L L I S F F L F L * M tRNAtrpNA TATAACTTTTTTTTCAATTTTCGGATTAJTTTTAATTTCTTTATTTTTTTTTATACTTTAAGGCTTTAAGTTAAC 1050 TAAACTAATAGCCTTCAAAGCTGTAAATAAAGGGTATTCCTT AGTCTT TAAAAATTTACTCCTTCAAAATT 1125 4P tRNACYs GCAGTTTGATATCATTATTGACTATAAGACC AGATTTAATTTAT 
rGATTAAGAAGAATAATTCTTATAAATAGA 1200 * _ tRNAtyr COI- ^ S R Q ? W L F S T N H K TTTACAATCTATCGCCTAAACTTCAGCCACTTAATCATAATCGCGACAATGGTTATTTTCTACAAATCATAAAG 1275 I G T L Y F I F G A W A G M V G T S L s I L I R ATATTGGAACTTTATATTTCATTTTTGGAGCTTGAGCCGGAATAGTAGGAACATCTTTAAGAATTTTAATTCGAG 1350 D E L G H P G A L I G D D Q I Y N V I V T A H A F CAGAATTAGGTCATCCAGGAGCATTAATTGGAGATGATCAAATTTATAATGTAATTGTTACTGCACATGCTTTTA 1425 A I M I F F M V M P I M I G G F G N W L V P L M L G TTATAATTTTTTTTATAGTAATACCTATTATAATTGGGGGGTTTGGAAATTGATTAGTGCCTTTAATATTAGGAG 1500 A P D M A F P R M N N Ms F W L L P P A L S L L L CTCCTGACATAGCATTCCCACGAATAAATAATATAAGATTTTGATTACTACCTCCTGCTCTTTCTTTATTATTAG 1575 V s a M V E N G A G T G W T V Y P P L S S G I A H TAAGAAGAATAGTTGAAAACGGAGCTGGTACAGGTTGAACTGTTTACCCTCCTTTATCTTCAGGTATCGCTCATG 1650 G G A S V D L A I F S L H L A G I S S I L G A V N GTGGAGCTTCTGTAGATTTAGCTATTTTTTCTCTTCATTTAGCTGGAATTTCTTCAATTTTAGGAGCTGTAAATT 6862 1725 Nucleic Acids Research F I T T V I N M R S T G I T L D R M P L F V W S V TTATTACGACTGTAATTAATATACGATCAACTGGAATTACATTAGACCGAATACCTTTATTTGTATGATCAGTAG 1800 V I T A L L L L L S L P V L A G A I T M L L T D R TTATTACTGCTTTATTACTTTTACTATCTTTACCAGTTCTTGCCGGAGCTATTACTATATTATTAACAGACCGAA 1875 N L N T S F F D P A G G G D P I L Y Q H L F W F F ATTTAAATACTTCTTTTTTTGA1'CCAGCTGGAGGAGGAGATCCTATTTTGTACCAACATTTATTTTGATTTTTTG 1950 G H P E V Y I L I L P G F G M I S H I I s Q E S G GTCACCCTGAAGTTTATATTTTAATTTTACCGGGATTTGGAATAATTTCTCATATTATTAGACAAGAATCTGGTA 2025 K K E T F G S L G M I Y A M L A I G L L G F I V W AAAAGGAAACTTTCGGTTCTTTAGGAATAATCTATGCTATACTTGCTATTGGATTATTAGGATTTATTGTTTGAG 2100 A H H M F T V G M D V D T R A Y F T S A T M I I A CTCATCATATATTTACAGTTGGAATAGACGTTGATACACGAGCTTATTTTACTTCTGCTACTATAATTATTGCGG 2175 V P T G I K I F s W L A T L H G T Q L S Y S P A I TTCCTACAGGAATTAAAATTTTTAGATGATTAGCTACTTTACATGGAACTCAACTTTCTTATTCTCCAGCTATTT 2250 L W A L G F V F L F T V G G L T G V V L A N S S V 
TATGAGCTTTAGGATTTGTTTTTTTATTCACAGTAGGAGGATTAACAGGAGTTGTATTAGCTAATTCATCAGTTG 2325
D I I L H D T Y Y V V A H F H Y V L S M G A V F A
ATATTATTTTACATGATACTTATTATGTAGTAGCTCATTTCCACTACGTTTTATCAATAGGAGCTGTATTTGCTA 2400
I M A G F I H W Y P L F T G L T L N N K W L K S Q
TTATAGCAGGTTTTATTCACTGATACCCATTATTTACTGGATTGACATTAAATAATAAATGGTTAAAAAGTCAAT 2475
F I I M F I G V N L T F F P Q H F L G L A G M P R
TTATTATTATGTTTATTGGAGTAAATTTAACATTTTTCCCCCAACATTTTTTAGGATTAGCAGGAATACCTCGAC 2550
R Y S D Y P D A Y T T W N V V S T I G S T I S L L
GTTATTCAGATTACCCTGATGCTTACACTACATGAAATGTTGTGTCTACTATTGGGTCAACTATTTCATTATTAG 2625
G I L F F F Y I I W E S L V S Q R Q V I Y P I Q L
GAATTTTATTTTTTTTCTATATTATTTGAGAAAGTTTAGTGTCTCAACGACAAGTAATTTATCCAATTCAATTAA 2700
N S S I E W Y Q N T P P A E H s Y S E L P L L T N *
ATTCATCTATTGAATGATATCAAAATACACCCCCAGCTGAACATAGATATTCTGAATTACCACTTTTAACAAATT 2775
tRNAleu -0 AATm c TAATAT

Figure 2. Nucleotide sequence of the segment of the D. yakuba mtDNA molecule identified in Fig. 1. From considerations of nucleotide and predicted amino acid sequence homologies to mouse mtDNA (4), this sequence contains the URF2 and cytochrome c oxidase subunit I (COI) genes. The boxed nucleotide sequences fold into the characteristic cloverleaf structures of tRNAs with anticodons (underlined) indicating them to be the genes for tRNAtrp, tRNAcys and tRNAtyr (Fig. 4). The nucleotide sequence shown is the (5'-3') sense strand of the URF2, COI and tRNAtrp genes, and the arrows indicate the direction of transcription of each gene. Asterisks indicate partial or complete termination codons. In the amino acid sequences a lower case letter s indicates the tentative assignment of serine to an AGA codon (see text). The question mark at the beginning of the COI sequence indicates uncertainty as to which nucleotides constitute the translation initiation sequence (see text). The sequence containing the amino-terminal 169 nucleotides of the URF2 gene and the entire tRNAf-met gene are published in (1).
The entire tRNAleu(UUR) gene is given in (3).

[Figure 3 alignment. Only the D. yakuba rows are legible in this copy; the mouse and yeast comparison rows are not reproduced.]

URF2, D. yakuba:
IFYNSSKILF TTIMIIGTLI TVTSNSWLGA WMGLEINLLS FIPLLsDNNN LMSTEASLKY 60
FLTQALASTV LLFSSILLML ANNLNNEINE SFTSMIIMS- -ALLLKsGAA PFHFWFPNMM EGLTWMNALM 130
LMTWQKIAPL M-LIS-Y-LN IKNLLLISVI LSVIIGAIGG LNQTSLRKLM AFSSINHLGW MLsSLMIsES 200
IWLIYFIFYS FLSFVLTFMF NIFKLFHLNQ LFSWFVNsKI LKFSLFMNFL SLGGLPPFLG FLPKWLVIQQ 270
LTMCNQYFLL TLMMMSTLIT LFFYLRICYS AFMLNYFENN WIMEMNMNSN NTNLYLIMTF FSIFGLFLIS 340
LFFFML*

COI, D. yakuba:
?-SRQWL-FST NHKDIGTLYF IFGAWAGMVG TSLsILIRAE LGHPGALIGD DQIYNVIVTA 60
HAFIMIFFMV MPIMIGGFGN WLVPLMLGAP DMAFPRMNNM sFWLLPPALS LLLVssMVEN GAGTGWTVYP 130
PLSSGIAHGG ASVDLAIFSL HLAGISSILG AVNFITTVIN MRSTGITLDR MPLFVWSVVI TALLLLLSLP 200
VLAGAITMLL TDRNLNTSFF DPAGGGDPIL YQHLFWFFGH PEVYILILPG FGMISHIIsQ ESGKKETFGS 270
LGMIYAMLAI GLLGFIVWAH HMFTVGMDVD TRAYFTSATM IIAVPTGIKI FsWLATLHGT QLSYSPAILW 340
ALGFVFLFTV GGLTGVVLAN SSVDIILHDT YYVVAHFHYV LSMGAVFAIM AGFIHWYPLF TGLTLNNKWL 410
KSQFIIMFIG VNLTFFPQHF LGLAGMPRRY SDYPDAYTTW NVVSTIGSTI SLLGILFFFY IIWESLVSQR 480
QVIYPIQLNS SIEWYQNTPP AEHsYSELPL LTN-*

Figure 3. Comparisons of the amino acid sequences of URF2 and cytochrome c oxidase subunit I (COI) predicted from the nucleotide sequences of the respective genes of mtDNA of D. yakuba, with the corresponding amino acid sequences of mouse (4) and yeast (13). A dot indicates an amino acid which is conserved relative to D. yakuba. A dash indicates an amino acid which is absent. An asterisk indicates a partial or complete termination codon. Wide vertical solid arrows indicate D. yakuba amino acids specified by AGA which are represented by a lower case letter s to indicate the tentative assignment of serine (see text). The question mark at the beginning of the D. yakuba COI sequence indicates uncertainty regarding translation initiation (see text).

Table 1. Nucleotide and amino acid sequence comparisons of the URF2 and cytochrome c oxidase subunit I (COI) genes of D. yakuba with the corresponding genes of mouse and yeast.

Gene   Number of        % Nucleotide          % Silent nucleotide    % Amino acid sequence homology (b)
       nucleotides (a)  sequence homology     substitutions (a,b)    D. yakuba/     D. yakuba/
       (D. yakuba)      (a,b)                 (D. yakuba/mouse)      mouse          yeast
                        (D. yakuba/mouse)
URF2   1023             47.0                  30.2                   34.7           __
COI    1537             71.6                  55.7                   74.7           59.2

(a) Excluding nucleotides concerned with termination.
(b) These values include all deduced insertion/deletions (see Fig. 3). The data for mouse and yeast mtDNAs are taken from references (4,13).
The tRNAcys and tRNAtyr genes are transcribed in the same direction as that in which replication proceeds around the molecule, while the URF2, COI and tRNAtrp genes are transcribed in the opposite direction. The arrangement of these five genes relative to each other and to adjacent genes within the D. yakuba mtDNA molecule is similar to what is found in mouse and other mammalian mtDNAs (4-6,14) except that in mammalian mtDNAs, the genes for tRNAala and tRNAasn separate the tRNAtrp and tRNAcys genes, and the tRNAser(UCN) gene is next to the 3' end of the COI gene. The relative locations of the genes for URF2, tRNAtrp, tRNAcys, tRNAtyr and COI and all of the other D. yakuba mitochondrial genes which have been determined to date are shown in Fig. 5.

Alignment of the nucleotide and amino acid sequences (Fig. 3) of the URF2 genes of D. yakuba and mouse indicates that insertion-deletions involving six internal codons have occurred between these two genes resulting in a D. yakuba URF2 which is 12 nucleotides shorter. The last codon of both the D. yakuba and the mouse URF2 genes is separated by a single T from the 5' terminal nucleotide of the tRNAtrp gene. This observation provides further support for the view that polyadenylation of some gene transcripts (15), following their excision from multicistronic primary transcripts, generates the UAA termination codon (3), as first suggested for transcripts of some mammalian mitochondrial genes (5,16).

Replication of the two strands of the mtDNA molecules of both mammals and Drosophila is highly asymmetrical (17-24). In mtDNAs of cultured cells of mouse, human, and hamster, synthesis of one strand (H strand) proceeds for two-thirds of the genome length before synthesis of the second strand (L strand) begins (17,19,22).
It has been clearly demonstrated that initiation of L strand synthesis in all mouse L-cell and most human KB cell mtDNA molecules occurs within a specific sequence of 32 (mouse) or 31 (human) nucleotides which can fold into a hairpin loop with a perfect twelve base pair stem, and is located between the 5' end of the tRNAasn gene and the 3' end of the tRNAcys gene (4,25,26). This 31-32 nucleotide sequence is highly conserved in other mammalian mtDNAs (6,14). However, such a sequence is not found in the corresponding region of D. yakuba mtDNA molecules (Fig. 2). As noted above, a tRNAasn gene does not occur in the region between the URF2 and COI genes, and the 3' end of the tRNAtrp gene overlaps by eight nucleotides the 5' end of the tRNAcys gene (Fig. 2). The tRNAcys and tRNAtyr genes are separated by 14 apparently non-coding nucleotides, but this sequence does not have the potential to form a hairpin loop.

Figure 4. The genes for tRNAtrp, tRNAcys and tRNAtyr of D. yakuba mtDNA shown in the presumed characteristic secondary structures of the corresponding tRNAs.

These observations are consistent with the results of our previous electron microscope studies on replicating forms of Drosophila mtDNA molecules (20,21) which failed to provide evidence for a preferred site of initiation of second strand synthesis that would map in the region between the URF2 and COI genes. However, evidence for a highly preferred site of initiation of second strand synthesis was obtained for D.
melanogaster mtDNA molecules (20) which, from consideration of mapping and sequencing data (1,27), lies close to the boundary of the tRNAile gene and the A+T-rich region of these mtDNA molecules (Fig. 5).

Figure 5. The arrangement of genes determined to date in the circular mtDNA molecule of D. yakuba, from the results of various studies (for references see 1,3) and the present data. Each tRNA gene is identified by the one letter amino acid code. Arrows within and outside the molecule indicate the direction of transcription of each gene. Wavy lines indicate uncertain gene termini. O and R indicate the origin and direction of replication, respectively.

The COI gene lacks a standard initiation codon. Homology of both nucleotide and predicted amino acid sequences of the COI genes of D. yakuba and mouse is higher than has been found for any other mitochondrial gene compared between these species (Table 1; Fig. 3; 3). However, the D. yakuba COI gene does not appear to have a standard translation initiation codon. Each of the three other identified polypeptide genes (COII, COIII, and ATPase6) of D. yakuba mtDNA which we have sequenced has an ATG initiation codon, and the initiation codon for all four of the URFs (1, 2, 3 and A6L) for which complete or partial sequences have been obtained appears to be ATT (1,3). These codons as well as ATA and ATC are recognized as initiation codons in mammalian mtDNAs (4,5). Also we have identified in D. yakuba mtDNA a putative tRNAf-met gene with a 5' CAT anticodon (1) which presumably could recognize ATN triplets as initiation codons as in mammalian mitochondria (4,6). The initiation codon of the COIII gene of Aspergillus nidulans is unique among mitochondrial genes in that it appears to be GUG (28), which is also the initiation codon of the bacteriophage MS2 A-protein gene (29) and of some Escherichia coli genes (30). The first sense codon in the D.
yakuba COI gene is TCG which would be expected to specify serine. This codon is two codons downstream from the mouse COI initiation codon and one codon downstream from the yeast COI initiation codon (ATG in both cases; Fig. 3). TCG is only used one other time as an internal codon in the D. yakuba mitochondrial genes we have sequenced (Table 2), and it is not clear how it might be recognized as an initiation codon by the D. yakuba tRNAf-met gene. There is not an obvious candidate for an initiation codon among the in-frame codons which lie downstream from the TCG codon (Fig. 2); beginning with the fourth sense codon (TGA) this region is highly conserved in the D. yakuba, mouse and yeast genes (Fig. 3). Also, the possibility that initiation occurs upstream in the region overlapping the sense strand of the tRNAtyr gene is unlikely due to the presence of a TAA codon immediately 5' to the TCG codon. As the D. yakuba COI gene is highly conserved relative to the mouse COI gene (Table 1, Fig. 3) it seems extremely unlikely that this D. yakuba mitochondrial COI gene is nonfunctional.

Because of these difficulties of interpretation, we have considered the possibility that initiation of translation of the COI gene involves an unusual initiation sequence. The four nucleotides immediately 5' to the first sense codon of the COI gene are ATAA. If a +1 ribosomal frameshift occurred in this region, it would allow the triplet ATA to function as an initiation codon for the open reading frame of the COI gene. Ribosomal frameshifting has frequently been invoked as a mechanism to correct frameshift mutations (see 31,32 for references), in one case in the mitochondrial COII gene of yeast (33). More important to the present discussion is evidence of ribosomal frameshifting in some wild type viruses which is essential to the production of some proteins, and which may have a regulatory function (34,35). Alternatively there may be a tRNA in D.
yakuba mitochondria with an anticodon loop with unusual properties that permit it to read the complete four letter ATAA sequence as a single codon. Suppressor tRNAs have been described from bacteria and yeast which correct frameshift mutations by reading four nucleotides as a codon (36-38). Such tRNAs have been found which have an extra nucleotide in their anticodon loop (36,38; Bossi, L., unpublished). A tRNAthr gene with an eight rather than a seven nucleotide anticodon loop has been found in yeast mtDNA (39). The only potential tRNAf-met gene we have sequenced from D. yakuba mtDNA has a normal anticodon loop of seven nucleotides (1). However, 5' to the CAT anticodon is a T. This differs from the situation found in mammalian mtDNA where a C is located in this position (4-6). The T residue seen in D. yakuba mtDNA may be functional in allowing the tRNAf-met to read the four letter sequence ATAA as a single codon and thus permit a continuation of reading in the correct frame. This suggestion is consistent with evidence from studies with the bacteriophage Qβ coat cistron, that an A following an AUA (as well as an AUG) initiation codon might interact with a complementary U in the anticodon loop to increase the efficiency of codon/anticodon interaction (40). Also, it has been shown that in Salmonella typhimurium interaction of an A adjacent to the 3' side of an amber codon with a U located 5' to the anticodon of a suppressor tRNA increases the efficiency of translation of the codon (41). It is interesting to note that the sequence ATAA precedes the initiation codon (ATG) of the D. yakuba COII gene (3) and is separated by one sense codon (AAA) from the putative GUG initiation codon of the A. nidulans COIII gene (28).

Transfer RNA genes. The tRNAtrp, tRNAcys and tRNAtyr genes of D. yakuba mtDNA are 57%, 67% and 60% homologous, respectively, to the corresponding tRNA genes of mouse mtDNA (4).
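The homology percentages quoted for the tRNA genes can be read as percent identity over an alignment. The text does not spell out the calculation, so the following is only a minimal sketch under stated assumptions: identical columns are counted, and gap columns stay in the denominator (in line with the Table 1 footnote that the values include deduced insertion/deletions). The function name `percent_identity` and the two short aligned strings are hypothetical and are not taken from the paper.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Assumption: gap columns count as mismatches and remain in the
    denominator; the paper does not state its exact convention.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical six-column alignment with one gap and one substitution.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```

Leaving gaps out of the denominator instead would raise the figure, so any comparison with published values depends on which convention was used.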
The major structural characteristics of these three tRNA genes are similar to those of the nine other D. yakuba and D. melanogaster mt-tRNA genes we have described previously (1-3). All these tRNA genes lack a number of the nucleotides which are constant in prokaryotic and non-organelle eukaryotic tRNAs (42,43). However, the common occurrence among other Drosophila mt-tRNA genes of the constant Pu26, T33 and Pu37 nucleotides (numbering system in 42) is maintained in each of the tRNAtrp, tRNAcys and tRNAtyr genes. The Py11-Pu24 nucleotide pair, which is conserved in all other Drosophila tRNA genes except tRNAf-met where it is a G-C pair, is also a G-C pair in tRNAtrp.

Codon usage and the genetic code. The frequencies of codons in the URF2 and COI genes are shown in Table 2. An exceptionally high frequency of codons ending in A or T is found in both the URF2 gene (93.3%) and the COI gene (91.6%). These values are similar to the mean frequency (94.4%) of codons ending in A or T in the COII, COIII, ATPase6 and URFA6L genes of D. yakuba mtDNA reported previously (3). Among the completed sequences of six polypeptide genes of D. yakuba mtDNA (3; and Fig. 2) and the partial URF1 sequence (1,2) only 5 expected sense codons are not used at all, and each of these contains at most a single A or T. The G+C content of the URF2 gene (18.6%) is similar to that of the URFA6L gene (17.2%), both of which are considerably lower than the G+C content of the COI gene (30.1%) and of the other three identified D. yakuba polypeptide genes (range 24.2% to 28.8%) sequenced to date (3 and the present data). In comparisons of nucleotide sequences of corresponding genes of D. yakuba and mouse, the frequency of nucleotide substitutions which would not result in a corresponding amino acid substitution (silent substitutions, Table 1) is considerably higher for the more homologous COI genes than for the URF2 genes, which is consistent with our previous observations (3).
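The two summary statistics used in this section, the share of codons ending in A or T and the G+C content, are simple counts over a coding sequence. The sketch below shows one way to compute them; the function name `codon_stats` is hypothetical, and the 36-nucleotide input is a short in-frame stretch of the COI gene as it appears in Fig. 2 of this copy, used only for illustration (it is far too short to reproduce the whole-gene percentages quoted above).

```python
def codon_stats(coding_seq: str) -> dict:
    """Count codons, percent of codons ending in A or T, and percent G+C.

    Assumption: the sequence starts in frame; trailing bases that do not
    fill a codon are ignored for the codon count but kept for G+C.
    """
    seq = coding_seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence shorter than one codon")
    at_ending = sum(1 for codon in codons if codon[2] in "AT")
    gc_bases = sum(1 for base in seq if base in "GC")
    return {
        "codons": len(codons),
        "pct_at_ending": 100.0 * at_ending / len(codons),
        "pct_gc": 100.0 * gc_bases / len(seq),
    }


if __name__ == "__main__":
    # In-frame COI fragment copied from Fig. 2 (illustrative input only).
    fragment = "GCTTGAGCCGGAATAGTAGGAACATCTTTAAGAATT"
    for name, value in codon_stats(fragment).items():
        print(name, round(value, 1))
```

Run over a complete gene, with termination nucleotides excluded as in Table 1, the same counts would give figures comparable to those in the text.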
In comparisons to the nucleotide sequences of mouse genes, TGA codons of D. yakuba mtDNA are conserved in the URF2 gene (five out of nine), which has a nucleotide sequence homology to the mouse URF2 gene of only 47.0% (Table 1), as well as in the more homologous (71.6%; Table 1) COI gene (13 out of 13). The finding of a tRNA gene with an anticodon (5' TCA; Figs. 2 and 4) expected to be able to decode TGA and TGG adds to earlier evidence that in D. yakuba mtDNA as in mammalian (44) and fungal mtDNAs (8,45,46), TGA specifies tryptophan (1,3).

Table 2. Codon usage in the URF2 and COI genes of D. yakuba mtDNA.

URF2 COI Othersa URF2 Phe-TTT TTC Leu-TTA TTG 34 3 57 4 32 7 54 2 58 3 94 1 Ser-TCT TCC TCA TCG 15 0 Leu-CTT CTC CTA CTG 2 1 1 0 7 0 2 0 10 1 4 0 Pro-CCT CCC CCA CCG Ile-ATT 32 2 25 2 47 2 22 68 1 23 8 15 0 16 19 GTA GTG 3 0 4 0 3 1 25 0 Tyr-TAT TAC TER-TAA TAG 9 1 0 0 13 6 1 0 24 7 3 0 His-CAT CAC 14 3 11 0 26 CAG 3 0 7 0 Asn-AAT AAC Lys-AAA AAG 27 3 10 0 17 1 5 43 3 9 1 Asp-GAT GAC Glu-GAA GAG 1 0 8 0 11 ATC Met-ATA ATG Val-GTT GTC Gln-CAA 1 1 4 9 0 3 19 0 13 3 21 0 COI Othersa 1 19 0 11 1 19 1 27 0 3 1 3 0 13 2 9 1 21 0 13 1 Thr-ACT ACC ACA ACG 6 2 11 0 18 0 15 1 25 1 26 0 Ala-GCT GCC 7 2 4 0 27 2 6 1 24 2 11 0 TGG 2 0 9 1 0 0 13 2 5 1 27 0 Arg-CGT CGC CGA CGG 0 0 2 0 1 0 9 0 2 0 12 0 Ser-AGT 2 0 5 0 2 0 7 0 6 1 12 0 3 0 10 0 9 0 35 3 12 0 26 4 GCA GCG Cys-TGT TGC Trp-TGA AGC (Ser)-AGA AGG Gly-GGT GGC GGA GGG 13

(a) These values are the sums of the codon frequencies in the COII, COIII, ATPase6 and URFA6L genes of D. yakuba mtDNA (3). AGA is tentatively shown as serine (see text). AGG has not been found in Drosophila mtDNA, and among mammalian mtDNAs has been found only as the termination codon of human URF6 (5). TGA is interpreted as tryptophan (see text). TAG has been interpreted as the termination codon for the D. melanogaster COIII gene (2). As in mammalian mtDNA, ATA is assumed to specify methionine (4,5).
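The codon assignments discussed here (TGA read as tryptophan, ATA as methionine, and AGA tentatively as serine) can be illustrated by translating the same stretch of sequence under two codon tables. What follows is a minimal sketch, not the authors' software: the names `STANDARD_CODE`, `DROSOPHILA_MT_CODE` and `translate` are hypothetical, the modified table contains only the three reassignments named in the text, and the lower case `s` follows the notation of Figs. 2 and 3 for the tentative serine. The input is a 36-nucleotide in-frame COI fragment copied from Fig. 2 of this copy.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
STANDARD_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), STANDARD_AMINO_ACIDS)
}

# Only the reassignments named in the text; AGA -> "s" marks the tentative
# serine assignment, as in the figures.
DROSOPHILA_MT_CODE = dict(STANDARD_CODE)
DROSOPHILA_MT_CODE.update({"TGA": "W", "ATA": "M", "AGA": "s"})


def translate(coding_seq: str, code: dict) -> str:
    """Translate an in-frame DNA sequence codon by codon with the given table."""
    seq = coding_seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(code[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    fragment = "GCTTGAGCCGGAATAGTAGGAACATCTTTAAGAATT"
    print("standard code:      ", translate(fragment, STANDARD_CODE))
    print("Drosophila mt code: ", translate(fragment, DROSOPHILA_MT_CODE))
```

Under the standard table the fragment is interrupted by a stop at the TGA codon, whereas the modified table gives an uninterrupted reading that agrees with the residues printed above this stretch in Fig. 2.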
Among metazoan mtDNAs sequenced so far, the occurrence of AGA as an amino acid-specifying codon is unique to Drosophila mtDNA (1,3), where it has been found in every polypeptide gene for which sequences have been obtained. In comparisons of the sequences of D. yakuba genes to the corresponding genes of mouse, yeast and corn, we found that D. yakuba AGA codons never correspond in position to codons specifying arginine in the other species (3). Further, in comparisons to mouse mtDNA it was noted that while AGAs do correspond in position to codons specifying nine different amino acids, the most frequent were those specifying serine. From these observations and considerations of the characteristic features of mitochondrial genetic codes, we have argued that AGA may specify serine rather than arginine in the Drosophila mitochondrial genetic code.

AGA occurs five times in the URF2 gene and seven times in the COI gene. As in previous D. yakuba/mouse gene comparisons, none of the twelve AGA codons correspond in position to arginine-specifying codons in URF2 or COI genes of mouse and yeast (Fig. 2). In the D. yakuba URF2 gene the five AGA codons correspond in position to codons specifying five different amino acids (isoleucine, leucine, alanine, asparagine, and threonine) in the mouse URF2 gene (Fig. 3). However, the seven AGA codons in the D. yakuba COI gene (which has a higher homology to mouse and yeast COI genes (Table 1) than has been found to date for any other D. yakuba/mouse or D. yakuba/yeast mitochondrial gene comparison) show a high degree of correspondence to serine-specifying codons in both the mouse COI gene and the yeast COI gene (five in each case; Fig. 3). This latter observation adds further support to the arguments (3) that AGA may specify serine rather than arginine in the Drosophila mitochondrial genetic code.

ACKNOWLEDGEMENTS

We wish to thank John F.
Atkins for valuable discussions, and Felix Machuca for excellent technical assistance. This work was supported by the National Institutes of Health Grants GM 18375 and RR 07092.

REFERENCES

1. Clary,D.O., Goddard,J.M., Martin,S.C., Fauron,C.M.-R. and Wolstenholme,D.R. (1982) Nucl. Acids Res. 10, 6619-6637.
2. Clary,D.O., Wahleithner,J.A. and Wolstenholme,D.R. (1983) Nucl. Acids Res. 11, 2411-2425.
3. Clary,D.O. and Wolstenholme,D.R. (1983) Nucl. Acids Res. 11, 4211-4227.
4. Bibb,M.J., Van Etten,R.A., Wright,C.T., Walberg,M.W. and Clayton,D.A. (1981) Cell 26, 167-180.
5. Anderson,S., Bankier,A.T., Barrell,B.G., DeBruijn,M.H.L., Coulson,A.R., Drouin,J., Eperon,I.C., Nierlich,D.P., Roe,B.A., Sanger,F., Schreier,P.H., Smith,A.J.H., Staden,R. and Young,I.G. (1981) Nature 290, 457-465.
6. Anderson,S., DeBruijn,M.H.L., Coulson,A.R., Eperon,I.C., Sanger,F. and Young,I.G. (1982) J. Mol. Biol. 156, 683-717.
7. Coruzzi,G. and Tzagoloff,A. (1979) J. Biol. Chem. 254, 9324-9330.
8. Fox,T.D. (1979) Proc. Natl. Acad. Sci. USA 76, 6534-6538.
9. Thalenfeld,B.E. and Tzagoloff,A. (1980) J. Biol. Chem. 255, 6173-6180.
10. Macino,G. and Tzagoloff,A. (1980) Cell 20, 507-517.
11. Fox,T.D. and Leaver,C.J. (1981) Cell 26, 315-323.
12. Sanger,F., Nicklen,S. and Coulson,A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
13. Bonitz,S.G., Coruzzi,G., Thalenfeld,B.E., Tzagoloff,A. and Macino,G. (1980) J. Biol. Chem. 255, 11927-11941.
14. Taira,M., Yoshida,E., Kobayashi,M., Yaginuma,K. and Koike,K. (1983) Nucl. Acids Res. 11, 1635-1643.
15. Mertens,S.H. and Pardue,M.L. (1981) J. Mol. Biol. 153, 1-21.
16. Ojala,D., Montoya,J. and Attardi,G. (1981) Nature 290, 470-474.
17. Robberson,D.L., Kasamatsu,H. and Vinograd,J. (1972) Proc. Natl. Acad. Sci. USA 69, 737-741.
18. Wolstenholme,D.R., Koike,K. and Cochran-Fouts,P. (1973) J. Cell Biol. 56, 230-245.
19. Kasamatsu,H. and Vinograd,J. (1974) Ann. Rev. Biochem. 43, 695-719.
20. Goddard,J.M. and Wolstenholme,D.R. (1978) Proc. Natl. Acad. Sci. USA 75, 3886-3890.
21. Goddard,J.M. and Wolstenholme,D.R. (1980) Nucl. Acids Res. 8, 741-757.
22. Nass,M.M.K. (1980) J. Mol. Biol. 140, 257-281.
23. Clayton,D.A. (1982) Cell 28, 693-705.
24. Wolstenholme,D.R., Goddard,J.M. and Fauron,C.M.-R. (1983) in Developments in Molecular Virology, Vol. 2, Molecular Events in the Replication of Viral and Cellular Genomes, Becker,Y. Ed., Martinus Nijhoff B.V., The Hague, in press.
25. Martens,P.A. and Clayton,D.A. (1979) J. Mol. Biol. 135, 327-351.
26. Tapper,D.P. and Clayton,D.A. (1981) J. Biol. Chem. 256, 5109-5115.
27. Fauron,C.M.-R. and Wolstenholme,D.R. (1980) Nucl. Acids Res. 8, 2439-2452.
28. Netzker,R., Kochel,H.G., Basak,N. and Kuntzel,H. (1982) Nucl. Acids Res. 10, 4783-4794.
29. Fiers,W., Contreras,R., Duerinck,F., Haegeman,G., Merregaert,J., Min Jou,W., Raeymaekers,A., Volckaert,G., Ysebaert,M., Van de Kerckhove,J., Nolf,F. and Van Montagu,M. (1975) Nature 256, 273-278.
30. Gay,N.J. and Walker,J.E. (1981) Nucl. Acids Res. 9, 3919-3926.
31. Atkins,J.F., Nichols,B.P. and Thompson,S. (1983) The EMBO Journal 2, in press.
32. Atkins,J.F. (1980) in tRNA: Biological Aspects, Abelson,J., Schimmel,P. and Soll,D. Eds., pp. 439-449, Cold Spring Harbor Laboratory, New York.
33. Fox,T.D. and Weiss-Brummer,B. (1980) Nature 288, 60-63.
34. Beremand,M.N. and Blumenthal,T. (1979) Cell 18, 257-266.
35. Kastelein,R.A., Remaut,E., Fiers,W. and Van Duin,J. (1982) Nature 295, 35-41.
36. Riddle,D.L. and Carbon,J. (1973) Nature New Biology 242, 230-234.
37. Bossi,L. and Roth,J.R. (1981) Cell 25, 489-496.
38. Cummins,C.M., Donahue,T.F. and Culbertson,M.R. (1982) Proc. Natl. Acad. Sci. USA 79, 3565-3569.
39. Tzagoloff,A., Macino,G., Nobrega,M.P. and Li,M. (1979) in Proc. 8th Ann. ICN-UCLA Symp. Molecular and Cellular Biology; Extrachromosomal DNA, Cummings,D.J., Dawid,I.B., Borst,P., Weissman,S.M. and Fox,C.F. Eds., pp. 339-355, Academic Press, New York.
40. Taniguchi,T. and Weissmann,C. (1978) J. Mol. Biol. 118, 533-565.
41. Bossi,L. and Roth,J.R. (1980) Nature 286, 123-127.
42. Gauss,D.H. and Sprinzl,M. (1983) Nucl. Acids Res. 11, r1-r53.
43. Gauss,D.H. and Sprinzl,M. (1983) Nucl. Acids Res. 11, r55-r103.
44. Barrell,B.G., Bankier,A.T., DeBruijn,M.H.L., Chen,E., Coulson,A.R., Drouin,J., Eperon,I.C., Nierlich,D.P., Roe,B.A., Sanger,F., Schreier,P.H., Smith,A.J.H., Staden,R. and Young,I.G. (1980) Proc. Natl. Acad. Sci. USA 77, 3164-3166.
45. Bonitz,S.G., Berlani,R., Coruzzi,G., Li,M., Macino,G., Nobrega,F.G., Nobrega,M.P., Thalenfeld,B.E. and Tzagoloff,A. (1980) Proc. Natl. Acad. Sci. USA 77, 3167-3170.
46. Heckman,J.E., Sarnoff,J., Alzner-DeWeerd,B., Yin,S. and RajBhandary,U.L. (1980) Proc. Natl. Acad. Sci. USA 77, 3159-3163.