Volume 5 Number 11 November 1978 Nucleic Acids Research

Analysis of the 3'-terminal nucleotide sequence of vesicular stomatitis virus N protein mRNA

Duncan J. McGeoch and Nancy T. Turnbull

MRC Virology Unit, Institute of Virology, Church Street, Glasgow, G11 5JR, UK

Received 8 September 1978

ABSTRACT

The sequence of 205 nucleotides adjacent to the poly(A) tract at the 3'-terminus of the mRNA encoding the N polypeptide of vesicular stomatitis virus has been determined by copying with reverse transcriptase and using 2',3'-dideoxynucleoside triphosphates as specific chain terminators. The method appears highly suitable for sequence determination in any purified mRNA. An examination of the sequence did not locate without ambiguity the limit of polypeptide coding RNA. The hexanucleotide AAUAAA, previously found in all poly(A)-containing eukaryote mRNAs, is not present, although the sequence immediately adjacent to the 3'-terminal poly(A) has a high content of A+U.

INTRODUCTION

The genome of vesicular stomatitis virus (VSV) (1) consists of a single-stranded RNA molecule. In the infected cell, a virus-specified RNA-dependent RNA polymerase transcribes five species of mRNA from this genome (2-6). Such mRNA synthesis can also be obtained in vitro with disrupted virus preparations, which contain the RNA polymerase, and the mRNAs made in vitro are indistinguishable from those produced in vivo (3,7-10). The most abundant of these mRNAs is that encoding the nucleocapsid polypeptide N (11). As part of a study of the fine structure of the VSV genome, we describe here an analysis of the sequence of 205 nucleotides adjacent to the 3'-terminal poly(A) tract of N mRNA.

It has recently become possible to determine sequences adjoining poly(A) in mRNA species by reverse transcribing the RNA into a complementary DNA copy, which can then be sequenced either by a "plus-and-minus" method (12) or by partial, base-specific chemical degradation (13).
All variations of this general approach require the use of a "phased" primer for reverse transcription, that is, an oligonucleotide of general formula (dT)n-dN or (dT)n-dN1-dN2, which will anneal specifically to the mRNA at the junction of the poly(A) tract and heteropolymeric RNA (14). We have used this general approach of synthesizing a DNA copy, but instead of the methods mentioned above have adapted the dideoxynucleoside triphosphate chain termination system of Sanger, Nicklen and Coulson (15), which was developed using DNA polymerase I to copy single-stranded DNA from a unique starting site.

© Information Retrieval Limited, 1 Falconberg Court, London W1V 5FG, England

MATERIALS AND METHODS

(1) Materials

p(dT)10, p(dT)8-dC and 2',3'-dideoxynucleoside triphosphates were purchased from PL Biochemicals Inc. 32P-labelled dNTPs and NTPs were from the Radiochemical Centre, Amersham. AMV reverse transcriptase was the gift of J. W. Beard. Rat liver RNase inhibitor was a gift of G. D. Searle Co.

(2) Production of mRNAs of VSV

The virus strain used in this work was the Indiana serotype of VSV employed by Pringle (16). Polyadenylated mRNAs were synthesized in vitro by detergent-disrupted virions, as follows (10,17,18). Stocks of VSV Indiana were propagated and purified by standard methods (19). The virion RNA-polymerase reaction mix, usually 5 ml, contained 100 mM Tris-HCl pH 8.0, 100 mM NaCl, 5 mM DTT, 5 mM MgCl2, 2 mM ATP, 1 mM GTP, 1 mM CTP, 0.2 mM (α-32P)-UTP (0.1 Ci/mmol), 0.025 mM S-adenosyl-L-methionine, 0.05% Triton N101, 2 units/ml rat liver RNase inhibitor and VSV, 300 µg protein/ml. Incubation was for 3-5 h at 31°. 50-80% of the (32P)-UTP had then been converted to an acid-precipitable form. After addition of sodium dodecyl sulphate to 0.5% and EDTA to 10 mM, the RNA was recovered by two extractions with phenol/chloroform (1:1) followed by precipitation with ethanol.
Poly(A)-containing RNA, comprising 70-80% of total labelled RNA, was then selected by chromatography on oligo(dT)-cellulose.

(3) Purification of N mRNA

mRNAs were fractionated by electrophoresis through polyacrylamide gels. Gels were cast as slabs 22 cm long by 15 cm by 1.5 mm or 3 mm, and contained 2.6% acrylamide, 0.13% N,N'-methylene-bis-acrylamide, 6 M urea, 90 mM Tris-borate pH 8.3, 2.5 mM EDTA, 0.2% ammonium persulphate and 0.1% TEMED. Electrode tanks contained 90 mM Tris-borate pH 8.3, 2.5 mM EDTA, 0.05% sodium dodecyl sulphate. RNA was dissolved at 400 µg/ml in 90% dimethyl sulphoxide and heated for 10 min at 45°. One half volume of 0.02% xylene cyanol, 0.02% bromophenol blue, 7 M urea, 5 mM Tris-borate pH 8.3, 0.1 mM EDTA was added and the solution layered on to the gel (0.7 µg RNA/mm of gel surface). Electrophoresis was at 150 V for 16 h at room temperature, with recirculation of the tank buffer. The (32P)-RNA bands were then located by autoradiography. RNA was extracted from appropriate gel slices by homogenizing the gel with two volumes of 500 mM NaCl, 50 mM Tris-HCl pH 7.5, 1 mM EDTA, plus 0.5 volumes buffer-saturated phenol and 50 µg carrier tRNA. The aqueous phase was recovered by centrifuging at 10,000 g for 10 min, and the gel/phenol mixture re-extracted with buffer as before. The pooled aqueous phases were dialysed against three changes of the same buffer. The RNA was then recovered by chromatography on oligo(dT)-cellulose and ethanol precipitation.

(4) Nucleotides adjacent to the poly(A) tail of N mRNA

The sequence immediately adjacent to the poly(A) tract was investigated first using p(dT)10 as a primer for limited reverse transcription in the absence of TTP (14). Three reaction mixes were set up, each containing, in 10 µl, 50 mM Tris-HCl pH 8.3, 50 mM KCl, 10 mM DTT, 5 mM MgCl2, 0.5 µg mRNA, 0.005 mM p(dT)10, dATP, dGTP and dCTP.
In each mix one of the dNTPs was α-32P-labelled (200 Ci/mmol, 0.002 mM) and the two unlabelled dNTPs were at 0.05 mM. 3 units of reverse transcriptase were added and the mixtures incubated for 30 min at 37° (14, 20). Reaction was then terminated by addition of 0.02% xylene cyanol, 0.02% bromophenol blue, 10 M urea, 5 mM Tris-borate pH 8.3, 0.1 mM EDTA. The oligonucleotides synthesized were fractionated on a 16% acrylamide gel (as described in section 6 of these Materials and Methods) and located by autoradiography. Labelled oligonucleotides were recovered by soaking the gel slice in 1 ml of 100 mM NaCl, 10 mM Tris-HCl, 1 mM EDTA, 0.1% sodium dodecyl sulphate overnight. The solution was then filtered through a 50-µl DEAE-cellulose column. After washing with water, the oligonucleotide was eluted with 1 M triethylamine carbonate pH 10 and recovered by several cycles of freeze drying. Samples were digested to 3'-dNMPs with micrococcal nuclease and spleen phosphodiesterase (21). 3'-dNMPs were separated by chromatography on PEI-cellulose thin layers (22) and detected by autoradiography.

(5) Purification of p(dT)8-dC

Early sequencing experiments showed that the p(dT)8-dC preparation used as a phasing primer was heterogeneous. Further purification was as follows: 5 A260 units of p(dT)8-dC were dissolved in 400 µl 0.01% xylene cyanol, 0.01% bromophenol blue, 6 M urea, containing 2 × 10⁴ dpm of (5'-32P)-p(dT)8-dC (prepared by labelling the dephosphorylated compound with (γ-32P)-ATP and polynucleotide kinase; specific activity >100 Ci/mmol). The mixture was loaded into a 30-cm slot in a 42 cm long by 38 cm x 1.5 mm slab gel of 12% acrylamide (see section 6 of these Materials and Methods) and subjected to electrophoresis at 30 W until the bromophenol blue marker was 8 cm from the end of the gel. The gel was then autoradiographed, with pre-flashed film and intensifying screen, at -70° (23).
The main labelled band was cut out and the oligonucleotide eluted with 100 mM KCl, 20 mM Tris-HCl pH 8.0. The eluate was run through a 0.5-ml column of DEAE-cellulose, which was then washed with 100 mM KCl, 20 mM Tris-HCl pH 8.0. The oligonucleotide was eluted with 1.0 M KCl, 20 mM Tris-HCl pH 8.0, and quantitated by UV absorbance. The primer was used in this form, with the KCl contributing to the final KCl level in the reverse transcription reactions (section 6).

(6) Nucleotide sequence determination using chain terminators

The principle of this method is identical to that described by Sanger, Nicklen and Coulson (15). The conditions used for reverse transcription are based on conditions devised to optimise the yield of full length reverse transcripts of VSV mRNAs (to be published). The final protocol adopted is described below.

Four separate reactions were set up, each containing one of the four 2',3'-dideoxynucleoside triphosphate (ddNTP) chain terminators, as specified below. Each reaction mix also contained, in 5 µl, 50 mM Tris-HCl pH 8.3, 140 mM KCl, 7 mM MgCl2, 10 mM DTT, 0.04 mM each of dCTP, dGTP and TTP, 0.002 mM (α-32P)-dATP (100-350 Ci/mmol), 0.002 mM purified p(dT)8-dC, 0.25 µg mRNA, reverse transcriptase (160 units/ml) and rat liver RNase inhibitor (2 units/ml). The reaction mix was assembled at 0°, except for the polymerase and the RNase inhibitor. The reaction was then started by addition of these latter components in 1 µl. Incubation was for 10 min at 42°. 2.5 µl of "chase mix" (see below) was then added, and incubation continued for 20 min at 42°. "Chase mix" contained 50 mM Tris-HCl pH 8.3, 140 mM KCl, 7 mM MgCl2, 10 mM DTT and dATP, dCTP, dGTP and TTP at 1.5 mM each. The reaction was then terminated by addition of 12.5 µl of formamide containing 0.02% xylene cyanol, 0.02% bromophenol blue.
ddNTPs were included to the following levels: (a) for sequence up to 20 nucleotides from initiation of reverse transcription, 10 x the concentration of the corresponding dNTP; (b) for 10-100 nucleotides, 1 x the dNTP concentration; (c) for more than 100 nucleotides, 0.5 x or 0.25 x the dNTP concentration.

The products of the reactions were fractionated by electrophoresis in polyacrylamide slab gels containing 7 M urea (13,15,24). Most experiments used 1.5 mm thick gels. Later experiments used 0.35 mm thick gels (24). Acrylamide: N,N'-methylene-bis-acrylamide ratio was 30:1 in all cases. Gels and electrode tanks contained 50 mM Tris-borate pH 8.3, 1.5 mM EDTA. Other conditions are specified in Table 1. For the 0.35 mm gels the final reaction mixes were concentrated twofold by phenol extracting, precipitating with ethanol and dissolving in 80% formamide containing 4% Ficoll, 0.02% xylene cyanol and 0.02% bromophenol blue. After electrophoresis the (32P)-nucleotides were detected by autoradiography using pre-flashed films and intensifying screens (23).

Table 1. Conditions for polyacrylamide sequencing gels

Acrylamide    Gel size                         Slot   Sample    Power     Volts       Time      Sequence
conc. (w/v)                                    size   volume                                    read
16%           42 cm long x 38 cm x 1.5 mm      9 mm   5-10 µl   30 W      600-1000    6-30 h    1-150
10%           42 cm long x 38 cm x 1.5 mm      9 mm   5-10 µl   40 W      600-1000    12-24 h   50-200
8%            42 cm long x 20 cm x 0.35 mm     6 mm   1 µl      30 W      1200-1700   4-8 h     50-250
6%            42 cm long x 20 cm x 0.35 mm     6 mm   1 µl      30-50 W   1200-2000   4-8 h     100-250

RESULTS

(1) Preparation of N mRNA

The synthesis of VSV mRNAs in vitro is well characterized; our results are in agreement with published work (8-11). Figure 1 shows a gel fractionation of poly(A)-containing RNA. The separated bands were identified by comparison with published data (5,6) and by chain length estimates from gel mobility. The work in this paper concerns the major RNA species, the mRNA for the virus nucleocapsid protein, N.
This mRNA was extracted from preparative gels and the purity and integrity of the preparations evaluated by two criteria: (a) when a sample was subjected to electrophoresis on a second gel, it ran as a discrete band comigrating with N mRNA of the mixture (Figure 1); (b) reverse transcription of the isolated mRNA yielded a single major discrete product (data not shown). In addition, of course, the sequencing results themselves represent the most compelling and relevant assay of purity. Using these methods of synthesis and isolation, preparations of around 50 µg of the purified mRNA were made.

Figure 1. Gel electrophoresis of VSV mRNAs. (32P)-mRNAs were fractionated by electrophoresis through a 2.6% acrylamide slab and detected by autoradiography. Track 1: total poly(A)-RNAs. The G, N and NS plus M species are indicated. The small amounts of the largest mRNA, L (which runs halfway between G and the top), are not visible on this exposure. The top of the gel is indicated by an arrow. Track 2: purified N mRNA.

(2) Nucleotide sequence immediately adjacent to poly(A)

As the first part of our sequencing strategy we determined the sequence immediately adjacent to the 3'-terminal poly(A) tract of the mRNA, using a method based on that of Cheng et al. (14). N mRNA was incubated with reverse transcriptase in the presence of dATP, dCTP, dGTP and p(dT)10. The lack of TTP forces reverse transcription from the p(dT)10 primer to start with the first nucleotide adjacent to the poly(A) tract and to terminate before addition of a TMP residue is required. The short products thus obtained were fractionated by gel electrophoresis. As shown in Figure 2, (α-32P)-dATP and (α-32P)-dCTP both gave a heavily labelled product which ran in a position corresponding to a chain two nucleotides longer than marker (5'-32P)-p(dT)10. (α-32P)-dGTP did not yield any comparable labelled product (the faint band obtained with (α-32P)-dGTP at this position is a contaminant of the (α-32P)-dGTP preparation).

Figure 2. Limited reverse transcripts of N mRNA. Transcripts synthesized in the absence of TTP and with p(dT)10 primer were fractionated by gel electrophoresis. Only a portion of the gel slab is shown. Tracks 1, 2, 3: (α-32P)-dATP, -dCTP and -dGTP labels respectively. Track 4: (5'-32P)-p(dT)10; the faint band is marked with an asterisk. X and B mark the positions of xylene cyanol and bromophenol blue markers, respectively.

A nearest-neighbour analysis was performed on the dATP and dCTP labelled bands. On digestion to 3'-dNMPs, the (α-32P)-dATP label was transferred to dCMP and the (α-32P)-dCTP label was transferred to TMP. Thus, the p(dT)10 primer has been elongated to p(dT)10-dC-dA and we expect that the next residue to be added should be T. Controls demonstrated that the appearance of the labelled bands required both mRNA and primer (data not shown). This result is in agreement with work of Banerjee, Moyer and Rhodes (25) on total VSV mRNAs, which tentatively identified the residue adjacent to the poly(A) in the mRNA chains as G.

In Figure 2, small amounts of longer products are visible. For both dATP and dCTP labels, the 32P in these comprises less than 10% of the total 32P in chains longer than p(dT)10. A minor band running just ahead of the major band is also visible in both the dATP and dCTP labelled tracks. Such minor products could arise from length heterogeneity of the priming oligo(dT) or from false terminal addition by the transcriptase (26) or from low level annealing of the primer to other sites. They were not further studied.

We conclude from this section that p(dT)n-dC or p(dT)n-dC-dA should be suitable phasing primers for more extensive reverse transcription.
This was borne out by the results described in the next section.

(3) Nucleotide sequence by use of chain terminators

Sanger, Nicklen and Coulson (15) have described a method of DNA sequence determination using DNA polymerase I to copy single-stranded DNA starting from a defined DNA fragment as primer and using ddNTPs as specific chain terminators, thus generating sets of copied chains with one end common and the other base-specific, which can be fractionated on a length basis by gel electrophoresis to yield the sequence. The method reported here consists of an adaptation of their system to reverse transcription of mRNA by AMV reverse transcriptase using a "phasing" oligonucleotide primer, in this case p(dT)8-dC.

Initial experiments demonstrated that, whereas with the DNA polymerase I system a 100-fold excess of ddNTP over the corresponding dNTP was required to give the necessary amount of chain termination (15), with reverse transcriptase equimolar ddNTP and dNTP were suitable. Thus, AMV reverse transcriptase is less discriminating in this respect than DNA polymerase I.

We labelled the reverse transcription products with (α-32P)-dATP. Variations in (α-32P)-dATP concentration and incubation time were examined; also the effect of differing conditions of chase with unlabelled dATP. We found that relatively efficient chain elongation to several hundred nucleotides could be achieved with 0.002 mM dATP, and that increases in dATP concentration did not give worthwhile further increase in specific labelling, but resulted in higher backgrounds. The low dATP concentration did, however, result in some accumulation of incomplete chains, especially above about 150 nucleotides long, but these were sufficiently well removed by an unlabelled dATP chase.
When we examined different incubation times, we found that after 10-15 min of incubation, most of the specific synthesis was complete, and further incubation gave higher backgrounds and stronger artefact bands.

Figures 3, 4 and 5 illustrate the results obtained. The sequence is derived by comparing the mobilities of the products resulting from use of each ddNTP, as indicated beside each figure (15). The sequence obtained is described here as the complementary DNA strand. Residues are numbered starting with C of the primer p(dT)8-dC as number 1.

In the presence or absence of ddNTPs, faint bands were always found in the positions expected for residues 2-7. In Figure 3 this is particularly visible at position 3 (all tracks). These products are thought to result from addition of (α-32P)-dATP to primer by the reverse transcriptase (26). Since these bands were of the same order of intensity as the expected specific bands, they obscured the sequence in this region (Figure 3, tracks 4-7). This was resolved by using conditions with 10 times the standard ddNTP levels. The strong termination produced then allowed clear reading of the sequence (Figure 3, tracks 8-11). The first specific ddNTP-produced band is in the ddTTP track at position 3. This agrees with the result obtained by nearest-neighbour analysis giving -C-A-T as the starting sequence (Results, section 2).

Having resolved the above difficulty, unambiguous sequence results were obtained to residue 205 using standard conditions of copying and various times of gel run on 6%-16% gels (Table 1), as illustrated by Figures 3, 4 and 5. The DNA sequence so derived is presented in Figure 6. This sequence was obtained from many gel runs which provided overlapping readings with each portion of the sequence in at least three experiments. Further sequence data was obtained to, approximately, residue 250. However, this contained several ambiguities and is not presented here.
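The way a sequence is read off the four ddNTP tracks can be illustrated with a small idealised simulation. This is only a sketch: the function names are hypothetical, and it assumes that every residue gives exactly one band in exactly one track, which ignores the primer-addition artefacts at residues 2-7 described above and the fact that residue 1 belongs to the primer itself.

```python
# Idealised chain-terminator model: each ddNTP track holds one band for every
# position at which that base occurs in the cDNA copy.

def simulate_tracks(cdna):
    """Return {base: chain lengths terminated by the matching ddNTP}."""
    tracks = {base: [] for base in "ACGT"}
    for length, base in enumerate(cdna, start=1):
        tracks[base].append(length)
    return tracks


def read_ladder(tracks):
    """Rebuild the sequence by taking the bands in order of chain length."""
    bands = sorted(
        (length, base) for base, lengths in tracks.items() for length in lengths
    )
    return "".join(base for _, base in bands)


# First ten residues of the cDNA as given in Figure 6 (primer C is residue 1).
cdna_start = "CATATGTAGC"
tracks = simulate_tracks(cdna_start)
recovered = read_ladder(tracks)
```

In this toy model the shortest band in the ddTTP track is at residue 3, matching the first specific band reported for Figure 3.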
Figure 3. Complementary DNA sequence, nucleotides 3-38. ddNTP-inhibited reverse transcripts were fractionated on a 16% gel. Track 1: (5'-32P)-p(dT)8-dC; faint at this exposure; position of band arrowed. Tracks 2 and 3: no ddNTP present, without and with chase, respectively. Tracks 4-7: standard conditions with ddGTP, ddATP, ddTTP and ddCTP respectively. Tracks 8-11: as 4-7 but with ddNTPs at high concentration. X and B mark the positions of the dye markers.

Figure 4. Complementary DNA sequence, 20-100. Electrophoresis was on an 8% gel. Track 1: no ddNTP. Tracks 2-5: ddGTP, ddCTP, ddTTP and ddATP, respectively. X marks the xylene cyanol dye.

Figure 5. Complementary DNA sequence, 90-205. Electrophoresis was on a 6% gel. Tracks 1-4: ddATP, ddTTP, ddCTP and ddGTP respectively. Left panel: electrophoresis for 3 h at 30 W. Right panel: 4 h, 30 W. Only the lower 22 cm of gel is shown in each case.

Figures 3 and 4 also illustrate the lengths of transcript obtained in the absence of ddNTP. Track 3 of Figure 3 demonstrates that the "chase" reaction is at least partially effective in removing short products (compare track 2), and track 1 of Figure 4 shows that there are few detectable termination products below about 150 residues. Above this length, however, the amounts of prematurely terminated chains present increase. A close examination of such bands shows that they mostly consist of chains terminated before addition of an A residue is required.
However, comparison with the uninhibited reaction presents the worst case: in the inhibited reaction tracks these unwanted termination products were proportionately less intense, and in all cases much weaker than the ddNTP-generated bands.

Figure 6. Complementary DNA sequence, 1-205

5' TTT
  1  CATATGTAGC ATAATATATA ATAGGTGATC TGAGAATTAT   40
 41  AGGGTCATTT GTCAAATTCT GACTTAGCAT ACTTGCCAAT   80
 81  TGTCTTCTCT CTTAGGCCTT GCAGTGACAT GACTGCTCGT  120
121  TTCGCATACT GCATCATATC AGGAGTCGGT TTTCTGTTTT  160
161  GATCTTCAAA CCATCCGAGC CATTCGACCA CATCTCTGCC  200
201  TTGTG 3'

DISCUSSION

(1) The technique

The method described here gave clean, unambiguous results up to residue 205, with further, tentative sequence information to about residue 250. We consider that sequence data to about 300 nucleotides are potentially attainable. In our view this method represents the best currently available approach to sequence studies on RNA by reverse transcription. The important reservation must be made that, as presented here, there is no confirmatory evidence available, such as sequencing of the complementary strand. Thus, while we emphasize that the sequence obtained is the result of a number of experiments, the system is subject to the same limitations as other rapid sequencing techniques (12, 13, 15) and we cannot exclude a low error frequency.

(2) The N mRNA and polypeptide

No data are available on the sequence or amino acid composition of the N polypeptide. Recent estimates of the molecular weight of N by electrophoresis in polyacrylamide gels have yielded values in the range 45,300 to 54,000 (6, 27, 28). Assuming a mean molecular weight of 115 for the constituent amino acids, the polypeptide thus contains 394-470 amino acids and so requires a minimum of 1182-1410 nucleotides in its mRNA.
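The arithmetic behind these ranges can be checked in a few lines. The sketch below uses only the figures quoted in the text; the variable names are illustrative, and rounding to the nearest whole residue is an assumption about how the quoted values were obtained.

```python
# Figures taken from the text: published molecular weight range for N and the
# assumed mean molecular weight per amino acid residue.
MW_RANGE = (45_300, 54_000)
MEAN_RESIDUE_MW = 115
NUCLEOTIDES_PER_CODON = 3


def residues_from_mw(molecular_weight):
    """Estimate chain length in amino acids (rounded to the nearest residue)."""
    return round(molecular_weight / MEAN_RESIDUE_MW)


aa_range = tuple(residues_from_mw(mw) for mw in MW_RANGE)
nt_range = tuple(NUCLEOTIDES_PER_CODON * aa for aa in aa_range)
```

Under that rounding assumption, `aa_range` and `nt_range` reproduce the 394-470 amino acids and 1182-1410 coding nucleotides given above.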
Published estimates of the chain length of the mRNA, excluding poly(A), range from 1115 to 1466 nucleotides (5, 29, 30). We consider that the most accurate available estimate is 1322 nucleotides, obtained by mobility on gel electrophoresis of the full-length reverse transcript of the mRNA with ΦX174 DNA restriction nuclease fragments as standards (D. McGeoch, in preparation). This allows an estimate for the maximum length of non-coding RNA of 140 nucleotides (excluding poly(A)). The 5'-non-coding region is 12 nucleotides (31), so we estimate the 3'-non-coding RNA as 128 nucleotides. This estimate suffers from a number of possible sources of error, and is presented to suggest that the 3'-non-coding region can be expected to be less than 200 nucleotides.

(3) Possible translation frames

The complement of the nucleotide sequence obtained is presented as the mRNA strand in Figure 7. This was examined in an attempt to determine the limit of polypeptide coding RNA. Figure 7 shows the distribution of translation terminating codons in the three possible reading frames (designated 1, 2 and 3). Frame 1 contains 5 terminators, with the most distant from poly(A) at 165-167. Frame 2 contains 2 terminators, with the most poly(A)-distant at 135-137. Frame 3 contains 1 terminator at 45-47 (frame 3 contains an additional UGA if the first A of the poly(A) tract is considered). Thus, if coding sequence termination is nearer the 3'-terminus than nucleotide 128, as argued above, then frame 3 is the reading frame. However, from the uncertainties of the argument, it is still possible for either of the other two phases to be the reading frame.

A consideration of other features of the sequence has also led us to support frame 3 as being the most likely candidate for reading frame. First, with frame 3 as the reading frame, the 3'-non-coding region comprises residues 1-44.
The nucleotide composition of this region is sharply differentiated from the rest of the sequence - it is much higher in U and lower in G - arguing a difference in function. Second, as mentioned above, the frame contains no terminator codons for at least 158 nucleotides. On a random basis we expect around 3 such codons in phase over this length of sequence. We also examined amino acid composition and codon usage in the three frames, but concluded that this was not helpful for our present purpose. It is clear that this question can only be definitely resolved with more sequence data. (If frame 3 is the correct reading frame, with translation terminating at nucleotides 45-47, then the sequence data predicts the C-terminal amino acid sequence of N to be Gln-Gly-Arg-Asp-Val-Val-Glu-Trp-Leu-Gly-Trp-Phe-Glu-Asp-Gln-Asn-Arg-Lys-Pro-Thr-Pro-Asp-Met-Met-Gln-Tyr-Ala-Lys-Arg-Ala-Val-Met-Ser-Leu-Gln-Gly-Leu-Arg-Glu-Lys-Thr-Ileu-Gly-Lys-Tyr-Ala-Lys-Ser-Glu-Phe-Asp-Lys-COOH.)

Figure 7. 3'-terminal sequence of N mRNA. The sequence is presented as the complement of that shown in Figure 6. Translation termination codons are boxed, with the reading frame underneath. The repeated sequences UAU and UAUUAU are underlined by solid and dashed lines respectively.

5'
205  CACAA                                        201
200  GGCAGAGAUG UGGUCGAAUG GCUCGGAUGG UUUGAAGAUC  161
160  AAAACAGAAA ACCGACUCCU GAUAUGAUGC AGUAUGCGAA  121
120  ACGAGCAGUC AUGUCACUGC AAGGCCUAAG AGAGAAGACA   81
 80  AUUGGCAAGU AUGCUAAGUC AGAAUUUGAC AAAUGACCCU   41
 40  AUAAUUCUCA GAUCACCUAU UAUAUAUUAU GCUACAUAUG    1  poly A

(4) The immediate 3'-terminal sequence

The sequence determined does not contain the hexanucleotide AAUAAA, which has been found near the 3'-termini of all polyadenylated eukaryote mRNAs sequenced, and near the 3'-terminus of encephalomyocarditis virus genome RNA (that is, the coding strand) (32, 33).
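The frame analysis of section (3) and the absence of AAUAAA can be re-derived from the Figure 6 sequence with a short script. What follows is a sketch under stated assumptions, not the authors' procedure: the helper names are hypothetical, the standard genetic code is assumed, residues are written in one-letter code, and the rule that maps a codon to frame 1, 2 or 3 was chosen so that the terminator counts match those quoted in the text. Codon coordinates follow the sequence as transcribed here and have not been reconciled with every coordinate quoted in the text.

```python
# cDNA residues 1-205 (5'->3'), transcribed from Figure 6 in blocks of ten.
CDNA = (
    "CATATGTAGC" "ATAATATATA" "ATAGGTGATC" "TGAGAATTAT"
    "AGGGTCATTT" "GTCAAATTCT" "GACTTAGCAT" "ACTTGCCAAT"
    "TGTCTTCTCT" "CTTAGGCCTT" "GCAGTGACAT" "GACTGCTCGT"
    "TTCGCATACT" "GCATCATATC" "AGGAGTCGGT" "TTTCTGTTTT"
    "GATCTTCAAA" "CCATCCGAGC" "CATTCGACCA" "CATCTCTGCC"
    "TTGTG"
)

# Standard genetic code in one-letter symbols; "*" marks a terminator.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def to_mrna(cdna):
    """Reverse-complement the cDNA to give the mRNA strand, written 5'->3'."""
    pairs = {"A": "U", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(cdna))


def stops_by_frame(mrna):
    """Group terminator codons by frame; coordinates count from the poly(A) end."""
    n = len(mrna)
    frames = {1: [], 2: [], 3: []}
    for i in range(n - 2):
        if CODON_TABLE[mrna[i:i + 3]] == "*":
            low = n - i - 2  # 3'-most residue of the codon
            # Assumed labelling rule: frame = low mod 3, with 0 written as 3.
            frames[low % 3 or 3].append((low, low + 2))
    return frames


def translate_to_stop(mrna, start):
    """Translate from a 0-based offset until the first terminator."""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


mrna = to_mrna(CDNA)
frames = stops_by_frame(mrna)
stop_counts = {frame: len(hits) for frame, hits in frames.items()}
frame3_start = (len(mrna) - 47) % 3  # offset that puts residues 45-47 in phase
c_terminus = translate_to_stop(mrna, frame3_start)
has_aauaaa = "AAUAAA" in mrna
```

With these assumptions the scan finds 5, 2 and 1 terminators in frames 1, 2 and 3, places the single frame 3 terminator at 45-47, translates frame 3 to a 52-residue peptide ending in Ser-Glu-Phe-Asp-Lys, and finds no AAUAAA in the 205 nucleotides.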
This rather suggests that any role of the sequence concerns eukaryote mRNA metabolism per se rather than as a signal in translation.

As noted above, the 3'-terminal 40 nucleotides differ in composition from the rest of the sequence. The 3'-terminal 23 nucleotides are particularly extreme, containing 11 U residues and 8 A's, with repeating sequences - 6 copies of UAU and 2 copies of UAUUAU (at positions 11-16 and 18-23). This region, as the genome RNA strand, may contain signals to the virus RNA-dependent RNA polymerase for termination of transcription and start of poly(A) synthesis. This should become more clear when 3'-terminal structures of the other VSV mRNAs are determined.

ACKNOWLEDGEMENTS

We thank our colleagues for their assistance: Dr C. R. Pringle for supplying the virus strain, Dr J. F. Szilagyi for discussion and for initial mRNA samples, and Professor J. H. Subak-Sharpe for support and critical analysis of the data and text.

REFERENCES

1. Abbreviations: VSV, vesicular stomatitis virus; N mRNA, messenger RNA encoding N polypeptide; ddNTP, 2',3'-dideoxynucleoside triphosphate.
2. Wagner, R.R. (1975) in Comprehensive Virology, Fraenkel-Conrat, H. and Wagner, R.R., Eds., Vol. 4, pp. 1-80. Plenum Press, New York.
3. Baltimore, D., Huang, A.S. and Stampfer, M. (1970). Proc. Nat. Acad. Sci. U.S.A. 66, 572-576.
4. Both, G.W., Moyer, S.A. and Banerjee, A.K. (1975). J. Virol. 15, 1012-1019.
5. Rose, J.K. and Knipe, D. (1975). J. Virol. 15, 994-1003.
6. Knipe, D., Rose, J.K. and Lodish, H.F. (1975). J. Virol. 15, 1004-1011.
7. Szilagyi, J.F. and Uryvayev, L. (1973). J. Virol. 11, 279-286.
8. Both, G.W., Moyer, S.A. and Banerjee, A.K. (1975). Proc. Nat. Acad. Sci. U.S.A. 72, 274-278.
9. Moyer, S.A., Grubman, M.J., Ehrenfeld, E. and Banerjee, A.K. (1975). Virology 62, 463-473.
10. Preston, C.M. and Szilagyi, J.F. (1977). J. Virol. 21, 1002-1009.
11. Villarreal, L.P., Breindl, M. and Holland, J.J. (1976). Biochemistry 15, 1663-1667.
12. Brownlee, G.G. and Cartwright, E.M. (1977). J. Mol. Biol. 114, 93-118.
13. Maxam, A.M. and Gilbert, W. (1977). Proc. Nat. Acad. Sci. U.S.A. 74, 560-564.
14. Cheng, C.C., Brownlee, G.G., Carey, N.H., Doel, M.T., Gillam, S. and Smith, M. (1976). J. Mol. Biol. 107, 527-547.
15. Sanger, F., Nicklen, S. and Coulson, A.R. (1977). Proc. Nat. Acad. Sci. U.S.A. 74, 5463-5467.
16. Pringle, C.R. (1970). J. Virol. 5, 559-567.
17. Szilagyi, J.F. and Pringle, C.R. (1975). J. Virol. 16, 927-936.
18. Rhodes, D.P., Moyer, S.A. and Banerjee, A.K. (1974). Cell 3, 327-333.
19. Emerson, S.U. and Yu, Y.-H. (1975). J. Virol. 15, 1348-1356.
20. Proudfoot, N.J. (1976). J. Mol. Biol. 107, 491-525.
21. McGeoch, D.J., Crawford, L.V. and Follett, E.A.C. (1970). J. Gen. Virol. 6, 33-40.
22. Mirzabekov, A.D. and Griffin, B.E. (1972). J. Mol. Biol. 72, 633-643.
23. Laskey, R.A. and Mills, A.D. (1977). FEBS Letters 82, 314-316.
24. Sanger, F. and Coulson, A.R. (1978). FEBS Letters 87, 107-110.
25. Banerjee, A.K., Moyer, S.A. and Rhodes, D.P. (1974). Virology 61, 547-558.
26. Marcus, S.L. and Sarkar, N.H. (1978). Virology 84, 247-259.
27. Wunner, W.H. and Pringle, C.R. (1972). J. Gen. Virol. 16, 1-10.
28. Obijeski, J.F., Marchenko, A.T., Bishop, D.H.L., Cann, B.W. and Murphy, F.A. (1974). J. Gen. Virol. 22, 21-33.
29. Freeman, G.J., Rose, J.K., Clinton, G.M. and Huang, A.S. (1977). J. Virol. 21, 1094-1104.
30. Rhodes, D.P., Abraham, G., Colonno, R.J., Jelinek, W. and Banerjee, A.K. (1977). J. Virol. 21, 1105-1112.
31. Rose, J.K. (1977). Proc. Nat. Acad. Sci. U.S.A. 74, 3672-3676.
32. Proudfoot, N.J. and Brownlee, G.G. (1976). Nature 263, 211-214.
33. Merregaert, J., Van Emmelo, J., Devos, R., Porter, A., Fellner, P. and Fiers, W. (1978). Eur. J. Biochem. 82, 55-63.