Volume 5 Number 11 November 1978
Nucleic Acids Research
Analysis of the 3'-terminal nucleotide sequence of vesicular stomatitis virus N protein mRNA
Duncan J. McGeoch and Nancy T. Turnbull
MRC Virology Unit, Institute of Virology, Church Street, Glasgow, G11 5JR, UK
Received 8 September 1978
ABSTRACT
The sequence of 205 nucleotides adjacent to the poly(A)
tract at the 3'-terminus of the mRNA encoding the N polypeptide
of vesicular stomatitis virus has been determined by copying
with reverse transcriptase and using 2',3'-dideoxynucleoside
triphosphates as specific chain terminators. The method appears
highly suitable for sequence determination in any purified mRNA.
An examination of the sequence did not locate without ambiguity
the limit of polypeptide coding RNA. The hexanucleotide AAUAAA,
previously found in all poly(A)-containing eukaryote mRNAs, is
not present, although the sequence immediately adjacent to the
3'-terminal poly(A) has a high content of A+U.
INTRODUCTION
The genome of vesicular stomatitis virus (VSV) (1) consists
of a single-stranded RNA molecule.
In the infected cell, a
virus-specified RNA-dependent RNA polymerase transcribes five
species of mRNA from this genome (2-6).
Such mRNA synthesis
can also be obtained in vitro with disrupted virus preparations,
which contain the RNA polymerase, and the mRNAs made in vitro
are indistinguishable from those produced in vivo (3,7-10).
The
most abundant of these mRNAs is that encoding the nucleocapsid
polypeptide N (11). As part of a study of the fine structure
of the VSV genome, we describe here an analysis of the sequence
of 205 nucleotides adjacent to the 3'-terminal poly(A) tract
of N mRNA.
It has recently become possible to determine sequences
adjoining poly(A) in mRNA species by reverse transcribing the
RNA into a complementary DNA copy, which can then be sequenced
either by a "plus-and-minus" method (12) or by partial, base-specific chemical degradation (13).
All variations of this
© Information Retrieval Limited 1 Falconberg Court London W1V5FG England
general approach require the use of a "phased" primer for
reverse transcription, that is, an oligonucleotide of general
formula (dT)n-dN or (dT)n-dN1-dN2, which will anneal specifically
to the mRNA at the junction of the poly(A) tract and
heteropolymeric RNA (14). We have used this general approach
of synthesizing a DNA copy, but instead of the methods
mentioned above have adapted the dideoxynucleoside triphosphate
chain termination system of Sanger, Nicklen and Coulson (15),
which was developed using DNA polymerase I to copy
single-stranded DNA from a unique starting site.
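The relation between a phased primer and the poly(A) junction can be illustrated with a short sketch. It assumes simple Watson-Crick pairing and treats the trailing run of A residues as the poly(A) tract; the function name `phased_primer` and its parameters are hypothetical, and the example sequence is the last ten residues of the N mRNA as given in Figure 7.

```python
# DNA complement of each RNA base (Watson-Crick pairing only, an assumption).
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def phased_primer(mrna: str, n_t: int = 8, n_phase: int = 1) -> str:
    """Return a 5'->3' primer (dT)n-dN... for an mRNA written 5'->3'.

    The trailing run of A residues is treated as the poly(A) tract; the
    phasing residues are the complement of the last `n_phase` residues
    of the heteropolymeric RNA.
    """
    body = mrna.rstrip("A")  # heteropolymeric RNA with the poly(A) removed
    if n_phase < 1 or len(body) < n_phase:
        raise ValueError("not enough heteropolymeric sequence to phase on")
    junction = body[-n_phase:]
    # The primer is antiparallel to the mRNA, so the junction is read 3'->5'.
    phase = "".join(RNA_TO_DNA_COMPLEMENT[b] for b in reversed(junction))
    return "T" * n_t + phase


# Last ten residues of N mRNA (Figure 7) plus an arbitrary 20-residue poly(A).
example = "GCUACAUAUG" + "A" * 20
print(phased_primer(example))         # (dT)8-dC
print(phased_primer(example, 10, 2))  # (dT)10-dC-dA
```

With one phasing residue the sketch returns (dT)8-dC, the primer used for sequencing; with two it returns (dT)10-dC-dA, the product of the limited reverse transcription described in the Results.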
MATERIALS AND METHODS
(1) Materials
p(dT)10, p(dT)8-dC and 2',3'-dideoxynucleoside triphosphates
were purchased from PL Biochemicals Inc.
32P-labelled dNTPs
and NTPs were from the Radiochemical Centre, Amersham. AMV
reverse transcriptase was the gift of J. W. Beard. Rat liver
RNase inhibitor was a gift of G. D. Searle Co.
(2) Production of mRNAs of VSV
The virus strain used in this work was the Indiana serotype of VSV employed by Pringle (16).
Polyadenylated mRNAs were synthesized in vitro by
detergent-disrupted virions, as follows (10,17,18). Stocks of
VSV Indiana were propagated and purified by standard methods
(19).
The virion RNA-polymerase reaction mix, usually 5 ml,
contained 100 mM Tris-HCl pH 8.0, 100 mM NaCl, 5 mM DTT, 5 mM
MgCl2, 2 mM ATP, 1 mM GTP, 1 mM CTP, 0.2 mM (α-32P)-UTP (0.1
Ci/mmol), 0.025 mM S-adenosyl-L-methionine, 0.05% Triton N101,
2 units/ml rat liver RNase inhibitor and VSV, 300 µg protein/ml.
Incubation was for 3-5 h at 31°. 50-80% of the (32P)-UTP had
then been converted to an acid-precipitable form. After addition
of sodium dodecyl sulphate to 0.5% and EDTA to 10 mM, the RNA was
recovered by two extractions with phenol/chloroform (1:1)
followed by precipitation with ethanol. Poly(A)-containing RNA,
comprising 70-80% of total labelled RNA, was then selected by
chromatography on oligo(dT)-cellulose.
(3) Purification of N mRNA
mRNAs were fractionated by electrophoresis through polyacrylamide gels.
Gels were cast as slabs 22 cm long by 15 cm
by 1.5 mm or 3 mm, and contained 2.6% acrylamide, 0.13% N,N'-methylene-bis-acrylamide, 6 M urea, 90 mM Tris-borate pH 8.3,
2.5 mM EDTA, 0.2% ammonium persulphate and 0.1% TEMED.
Electrode tanks contained 90 mM Tris-borate pH 8.3, 2.5 mM EDTA,
0.05% sodium dodecyl sulphate.
RNA was dissolved at 400 µg/ml in 90% dimethyl sulphoxide
and heated for 10 min at 45°.
One half volume of 0.02% xylene
cyanol, 0.02% bromophenol blue, 7 M urea, 5 mM Tris-borate pH
8.3, 0.1 mM EDTA was added and the solution layered on to the
gel (0.7 µg RNA/mm² of gel surface).
Electrophoresis was at 150 V for 16 h at room temperature, with
recirculation of the tank buffer.
The (32P)-RNA bands were then located by autoradiography.
RNA was extracted from appropriate gel slices by homogenizing the gel with two volumes of 500 mM NaCl, 50 mM Tris-HCl pH 7.5, 1 mM EDTA, plus 0.5 volumes buffer-saturated
phenol and 50 µg carrier tRNA.
The aqueous phase was recovered
by centrifuging at 10,000 g for 10 min, and the gel/phenol
mixture re-extracted with buffer as before.
The pooled aqueous
phases were dialysed against three changes of the same buffer.
The RNA was then recovered by chromatography on oligo(dT)-cellulose and ethanol precipitation.
(4) Nucleotides adjacent to the poly(A) tail of N mRNA
The sequence immediately adjacent to the poly(A) tract
was investigated first using p(dT)10 as a primer for limited
reverse transcription in the absence of TTP (14).
Three reaction mixes were set up, each containing, in 10 µl, 50 mM
Tris-HCl pH 8.3, 50 mM KCl, 10 mM DTT, 5 mM MgCl2, 0.5 µg mRNA,
0.005 mM p(dT)10, dATP, dGTP and dCTP.
In each mix one of the dNTPs was α-32P-labelled
(200 Ci/mmol, 0.002 mM) and the two
unlabelled dNTPs were at 0.05 mM.
3 units of reverse transcriptase were added and the mixtures
incubated for 30 min at 37° (14, 20).
Reaction was then terminated by addition of 0.02%
xylene cyanol, 0.02% bromophenol blue, 10 M urea, 5 mM Tris-borate pH 8.3, 0.1 mM EDTA.
The oligonucleotides synthesized
were fractionated on a 16% acrylamide gel (as described in section
6 of these Materials and Methods) and located by autoradiography.
Labelled oligonucleotides were recovered by soaking the gel
slice in 1 ml of 100 mM NaCl, 10 mM Tris-HCl, 1 mM EDTA, 0.1%
sodium dodecyl sulphate overnight. The solution was then
filtered through a 50-µl DEAE-cellulose column. After washing
with water, the oligonucleotide was eluted with 1 M triethylamine carbonate pH 10 and recovered by several cycles
of freeze drying. Samples were digested to 3'-dNMPs with
micrococcal nuclease and spleen phosphodiesterase (21). 3'-dNMPs were separated by chromatography on PEI-cellulose thin
layers (22) and detected by autoradiography.
(5) Purification of p(dT)8-dC
Early sequencing experiments showed that the p(dT)8-dC
preparation used as a phasing primer was heterogeneous.
Further purification was as follows: 5 A260 units of p(dT)8-dC
were dissolved in 400 µl 0.01% xylene cyanol, 0.01% bromophenol blue, 6 M urea, containing 2 x 10^4 dpm of (5'-32P)-p(dT)8-dC
(prepared by labelling the dephosphorylated compound with
(γ-32P)-ATP and polynucleotide kinase; specific activity >100
Ci/mmol). The mixture was loaded into a 30-cm slot in a 42 cm
long by 38 cm x 1.5 mm slab gel of 12% acrylamide (see section
6 of these Materials and Methods) and subjected to electrophoresis at 30 W until the bromophenol blue marker was 8 cm
from the end of the gel. The gel was then autoradiographed,
with pre-flashed film and intensifying screen, at -70° (23).
The main labelled band was cut out and the oligonucleotide
eluted with 100 mM KCl, 20 mM Tris-HCl pH 8.0. The eluate
was run through a 0.5-ml column of DEAE-cellulose, which was
then washed with 100 mM KCl, 20 mM Tris-HCl pH 8.0. The
oligonucleotide was eluted with 1.0 M KCl, 20 mM Tris-HCl
pH 8.0, and quantitated by UV absorbance. The primer was used
in this form, with the KCl contributing to the final KCl level
in the reverse transcription reactions (section 6).
(6) Nucleotide sequence determination using chain terminators
The principle of this method is identical to that
described by Sanger, Nicklen and Coulson (15). The conditions
used for reverse transcription are based on conditions devised
to optimise the yield of full length reverse transcripts of
VSV mRNAs (to be published). The final protocol adopted is
described below.
Four separate reactions were set up, each containing one
of the four 2',3'-dideoxynucleoside triphosphate (ddNTP)
chain terminators, as specified below.
Each reaction mix also
contained, in 5 µl, 50 mM Tris-HCl pH 8.3, 140 mM KCl, 7 mM
MgCl2, 10 mM DTT, 0.04 mM each of dCTP, dGTP and TTP, 0.002 mM
(α-32P)-dATP (100-350 Ci/mmol), 0.002 mM purified p(dT)8-dC,
0.25 µg mRNA, reverse transcriptase (160 units/ml) and rat
liver RNase inhibitor (2 units/ml).
The reaction mix was
assembled at 0°, except for the polymerase and the RNase
inhibitor.
The reaction was then started by addition of these
latter components in 1 µl. Incubation was for 10 min at 42°.
2.5 µl of "chase mix" (see below) was then added, and incubation continued for 20 min at 42°.
"Chase mix" contained
50 mM Tris-HCl pH 8.3, 140 mM KCl, 7 mM MgCl2, 10 mM DTT and
dATP, dCTP, dGTP and TTP at 1.5 mM each.
The reaction was
then terminated by addition of 12.5 µl of formamide containing
0.02% xylene cyanol, 0.02% bromophenol blue.
ddNTPs were included to the following levels: (a) for
sequence up to 20 nucleotides from initiation of reverse
transcription, 10 x the concentration of the corresponding
dNTP; (b) for 10-100 nucleotides, 1 x the dNTP concentration;
(c) for more than 100 nucleotides, 0.5 x or 0.25 x the dNTP
concentration.
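The ddNTP levels above are given only as multiples of the corresponding dNTP. As a worked illustration, the sketch below converts them to absolute concentrations using the dNTP concentrations stated for the reaction mix (0.04 mM dCTP, dGTP and TTP; 0.002 mM dATP). The names `DNTP_MM`, `TERMINATOR` and `ddntp_mm` are hypothetical and serve only this example.

```python
# dNTP concentrations (mM) in the sequencing reaction mix described above.
DNTP_MM = {"dATP": 0.002, "dCTP": 0.04, "dGTP": 0.04, "TTP": 0.04}

# Chain terminator paired with each dNTP.
TERMINATOR = {"dATP": "ddATP", "dCTP": "ddCTP", "dGTP": "ddGTP", "TTP": "ddTTP"}

# ddNTP:dNTP ratios quoted for each reading range.
RATIOS = {
    "up to 20 nucleotides": [10.0],
    "10-100 nucleotides": [1.0],
    "more than 100 nucleotides": [0.5, 0.25],
}


def ddntp_mm(dntp: str, ratio: float) -> float:
    """Concentration (mM) of the terminator matching `dntp` at a given ratio."""
    return DNTP_MM[dntp] * ratio


for reading_range, ratios in RATIOS.items():
    for ratio in ratios:
        levels = ", ".join(
            f"{TERMINATOR[d]} {ddntp_mm(d, ratio):g} mM" for d in DNTP_MM
        )
        print(f"{reading_range} ({ratio:g} x): {levels}")
```

Because dATP is present at one twentieth the concentration of the other dNTPs, the ddATP concentration at any given ratio is correspondingly lower than that of the other three terminators.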
The products of the reactions were fractionated by
electrophoresis in polyacrylamide slab gels containing 7 M urea
(13,15,24).
Most experiments used 1.5 mm thick gels.
Later
experiments used 0.35 mm thick gels (24). Acrylamide:N,N'-methylene-bis-acrylamide ratio was 30:1 in all cases.
Gels
and electrode tanks contained 50 mM Tris-borate pH 8.3, 1.5 mM
EDTA.
Other conditions are specified in Table 1.
For the
0.35 mm gels the final reaction mixes were concentrated two-fold by phenol extracting, precipitating with ethanol and
dissolving in 80% formamide containing 4% Ficoll, 0.02% xylene
cyanol and 0.02% bromophenol blue. After electrophoresis the
(32P)-nucleotides were detected by autoradiography using pre-flashed films and intensifying screens (23).
Table 1. Conditions for polyacrylamide sequencing gels
Acrylamide   Gel size                        Slot   Sample    Power     Volts       Time      Sequence
conc. (w/v)                                  size   volume                                    read
16%          42 cm long x 38 cm x 1.5 mm     9 mm   5-10 µl   30 W      600-1000    6-30 h    1-150
10%          42 cm long x 38 cm x 1.5 mm     9 mm   5-10 µl   40 W      600-1000    12-24 h   50-200
8%           42 cm long x 20 cm x 0.35 mm    6 mm   1 µl      30 W      1200-1700   4-8 h     50-250
6%           42 cm long x 20 cm x 0.35 mm    6 mm   1 µl      30-50 W   1200-2000   4-8 h     100-250
RESULTS
(1) Preparation of N mRNA
The synthesis of VSV mRNAs in vitro is well characterized;
our results are in agreement with published work (8-11).
Figure 1 shows a gel fractionation of poly(A)-containing RNA.
The separated bands were identified by comparison with published data (5,6) and by chain length estimates from gel mobility.
The work in this paper concerns the major RNA species, the mRNA
for the virus nucleocapsid protein, N.
This mRNA was extracted
from preparative gels and the purity and integrity of the
preparations evaluated by two criteria:- (a) when a sample
was subjected to electrophoresis on a second gel, it ran as a
discrete band comigrating with N mRNA of the mixture (Figure 1);
(b) reverse transcription of the isolated mRNA yielded a single
major discrete product (data not shown).
In addition, of
course, the sequencing results themselves represent the most
compelling and relevant assay of purity.
Using these methods
of synthesis and isolation, preparations of around 50 µg of the
purified mRNA were made.
(2) Nucleotide sequence immediately adjacent to poly(A)
As the first part of our sequencing strategy we determined
the sequence immediately adjacent to the 3'-terminal poly(A)
tract of the mRNA, using a method based on that of Cheng et al.
(14).
N mRNA was incubated with reverse transcriptase in the
presence of dATP, dCTP, dGTP and p(dT)10.
The lack of TTP
forces reverse transcription from the p(dT)10 primer to start
with the first nucleotide adjacent to the poly(A) tract and to
terminate before addition of a TMP residue is required.
The
short products thus obtained were fractionated by gel electrophoresis.
As shown in Figure 2, (α-32P)-dATP and (α-32P)-dCTP
both gave a heavily labelled product which ran in a position
corresponding to a chain two nucleotides longer than marker
(5'-32P)-p(dT)10.
(α-32P)-dGTP did not yield any comparable
labelled product (the faint band obtained with (α-32P)-dGTP at
this position is a contaminant of the (α-32P)-dGTP preparation).
Figure 1 Gel electrophoresis of VSV mRNAs. (32P)-mRNAs were fractionated by electrophoresis through a 2.6%
acrylamide slab and detected by autoradiography. Track 1:
total poly(A)-RNAs. The G, N and NS plus M species are indicated. The small amounts of the largest mRNA, L (which runs
halfway between G and the top) are not visible on this
exposure. The top of the gel is indicated by an arrow. Track
2: purified N mRNA.
Figure 2 Limited reverse transcripts of N mRNA.
Transcripts synthesized in the absence of TTP and with p(dT)10
primer were fractionated by gel electrophoresis. Only a
portion of the gel slab is shown. Tracks 1, 2, 3: (α-32P)-dATP, -dCTP and -dGTP labels respectively. Track 4: (5'-32P)-p(dT)10; the faint band is marked with an asterisk. X and B
mark the positions of xylene cyanol and bromophenol blue
markers, respectively.
A nearest-neighbour analysis was performed on the dATP and dCTP
labelled bands.
On digestion to 3'-dNMPs, the (α-32P)-dATP
label was transferred to dCMP and the (α-32P)-dCTP label was
transferred to TMP.
Thus, the p(dT)10 primer has been elongated
to p(dT)10-dC-dA and we expect that the next residue to
be added should be T.
Controls demonstrated that the appearance
of the labelled bands required both mRNA and primer (data
not shown).
This result is in agreement with work of Banerjee,
Moyer and Rhodes (25) on total VSV mRNAs, which tentatively
identified the residue adjacent to the poly(A) in the mRNA
chains as G.
In Figure 2, small amounts of longer products
are visible.
For both dATP and dCTP labels, the 32P in these
comprises less than 10% of the total 32P in chains longer than
p(dT)10.
A minor band running just ahead of the major band
is also visible in both the dATP and dCTP labelled tracks.
Such minor products could arise from length heterogeneity of
the priming oligo(dT) or from false terminal addition by the
transcriptase (26) or from low level annealing of the primer to
other sites.
They were not further studied.
We conclude
from this section that p(dT)n-dC or p(dT)n-dC-dA should be
suitable phasing primers for more extensive reverse transcription.
This was borne out by the results described in the next
section.
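The logic of the nearest-neighbour analysis in this section can be sketched as follows. Digestion to 3'-dNMPs transfers the α-phosphate of each incorporated nucleotide to the residue on its 5' side, so the labelled 3'-dNMP identifies the neighbour that precedes the labelled nucleotide in the chain. The function name `nearest_neighbours` is hypothetical, and the chain used is the p(dT)10-dC-dA product inferred in the text.

```python
from collections import Counter


def nearest_neighbours(chain: str, primer_len: int, labelled: str) -> Counter:
    """Count the 3'-dNMPs that receive label from one (alpha-32P)-dNTP.

    `chain` is the elongated product written 5'->3'; only residues added
    after the first `primer_len` residues were incorporated from dNTPs.
    """
    recipients = Counter()
    for i in range(primer_len, len(chain)):
        if chain[i] == labelled:
            # The alpha-phosphate ends up on the 3' position of the
            # residue immediately 5' to the incorporated nucleotide.
            recipients[chain[i - 1]] += 1
    return recipients


product = "T" * 10 + "CA"  # p(dT)10-dC-dA
for label in "ACG":
    print(f"(alpha-32P)-d{label}TP ->", dict(nearest_neighbours(product, 10, label)))
```

For this chain the dATP label is recovered in dCMP, the dCTP label in TMP, and no label is transferred from dGTP, which matches the observations reported above.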
(3) Nucleotide sequence by use of chain terminators
Sanger, Nicklen and Coulson (15) have described a method
of DNA sequence determination using DNA polymerase I to copy
single-stranded DNA starting from a defined DNA fragment as
primer and using ddNTPs as specific chain terminators, thus
generating sets of copied chains with one end common and the
other base-specific, which can be fractionated on a length
basis by gel electrophoresis to yield the sequence.
The
method reported here consists of an adaptation of their system
to reverse transcription of mRNA by AMV reverse transcriptase
using a "phasing" oligonucleotide primer, in this case
p(dT)8-dC.
Initial experiments demonstrated that, whereas with
the DNA polymerase I system, a 100-fold excess of ddNTP over
the corresponding dNTP was required to give the necessary amount
of chain termination (15), with reverse transcriptase equimolar
ddNTP and dNTP were suitable.
Thus, AMV reverse transcriptase
is less discriminating in this respect than DNA polymerase I.
We labelled the reverse transcription products with (α-32P)-dATP.
Variations in (α-32P)-dATP concentration and incubation
time were examined; also the effect of differing conditions of
chase with unlabelled dATP.
We found that relatively efficient
chain elongation to several hundred nucleotides could be
achieved with 0.002 mM dATP, and that increases in dATP
concentration did not give worthwhile further increase in
specific labelling, but resulted in higher backgrounds.
The low
dATP concentration did, however, result in some accumulation of
incomplete chains, especially above about 150 nucleotides long,
but these were sufficiently well removed by an unlabelled dATP
chase.
When we examined different incubation times, we found
that after 10-15 min of incubation, most of the specific
synthesis was complete, and further incubation gave higher
backgrounds and stronger artefact bands.
Figures 3, 4 and 5 illustrate the results obtained.
The
sequence is derived by comparing the mobilities of the products
resulting from use of each ddNTP, as indicated beside each figure (15).
The sequence obtained is described here as the
complementary DNA strand.
Residues are numbered starting
with C of the primer p(dT)8-dC as number 1.
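How a sequence is read from the four ddNTP tracks can be shown with a small simulation, offered as a sketch rather than a description of the actual gels. It uses residues 1-20 of Figure 6 and makes one assumption beyond the text: a terminated chain gives a band only if it contains at least one dAMP incorporated from (α-32P)-dATP, the terminal ddNMP itself being unlabelled. The names `terminator_lanes` and `read_gel` are hypothetical.

```python
CDNA = "CATATGTAGCATAATATATA"  # residues 1-20 of Figure 6
PRIMER_LEN = 1  # residue 1 is the dC of the primer


def terminator_lanes(cdna: str, primer_len: int = PRIMER_LEN) -> dict:
    """Map each ddNTP track to the residue numbers expected to give a band."""
    lanes = {base: [] for base in "GATC"}
    for pos in range(primer_len + 1, len(cdna) + 1):
        incorporated = cdna[primer_len:pos - 1]  # dNMPs added before the terminator
        if "A" in incorporated:  # assumed: only chains with labelled dAMP are seen
            lanes[cdna[pos - 1]].append(pos)
    return lanes


def read_gel(lanes: dict) -> list:
    """Order all bands by chain length, shortest first, as read up the gel."""
    return sorted((pos, base) for base, hits in lanes.items() for pos in hits)


lanes = terminator_lanes(CDNA)
bands = read_gel(lanes)
print("first band:", bands[0])
print("sequence read:", "".join(base for _, base in bands))
```

Under this assumption the first band falls in the ddTTP track at position 3, and reading the bands in order of increasing length reproduces the sequence from residue 3 onwards.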
In the presence or absence of ddNTPs, faint bands were
always found in the positions expected for residues 2-7.
In
Figure 3 this is particularly visible at position 3 (all
tracks).
These products are thought to result from addition
of (α-32P)-dATP to primer by the reverse transcriptase (26).
Since these bands were of the same order of intensity as the
expected specific bands, they obscured the sequence in this
region (Figure 3, tracks 4-7).
This was resolved by using
conditions with 10 times the standard ddNTP levels.
The strong
termination produced then allowed clear reading of the
sequence (Figure 3, tracks 8-11).
The first specific ddNTP-
produced band is in the ddTTP track at position 3.
This
agrees with the result obtained by nearest-neighbour analysis
giving -C-A-T as the starting sequence
(Results, section 2 ) .
Having resolved the above difficulty unambiguous sequence
results were obtained to residue 2O5 using standard conditions
of copying and various times of gel run on 6%-16% gels (Table
1),
as illustrated by Figures 3, 4 and 5.
derived is presented in Figure 6.
The DNA sequence so
This sequence was obtained
from many gel runs which provided overlapping readings with
each portion of the sequence in at least three experiments.
Further sequence data was obtained to, approximately, residue
2 50.
However, this contained several ambiguities and is not
presented here.
Figure 3 Complementary DNA sequence, nucleotides
3-38. ddNTP-inhibited reverse transcripts were fractionated
on a 16% gel. Track 1: (5'-32P)-p(dT)8-dC; faint at this exposure;
position of band arrowed. Tracks 2 and 3: no ddNTP present,
without and with chase, respectively. Tracks 4-7: standard
conditions with ddGTP, ddATP, ddTTP and ddCTP respectively.
Tracks 8-11: as 4-7 but with ddNTPs at high concentration. X
and B mark the positions of the dye markers.
Figure 4 Complementary DNA sequence, 20-100.
Electrophoresis was on an 8% gel. Track 1: no ddNTP. Tracks
2-5: ddGTP, ddCTP, ddTTP and ddATP, respectively. X marks the
xylene cyanol dye.
Figure 5 Complementary DNA sequence, 90-205.
Electrophoresis was on a 6% gel. Tracks 1-4: ddATP, ddTTP,
ddCTP and ddGTP respectively. Left panel: electrophoresis
for 3 h at 30 W. Right panel: 4 h, 30 W. Only the lower
22 cm of gel is shown in each case.
Figures 3 and 4 also illustrate the lengths of transcript
obtained in the absence of ddNTP.
Track 3 of Figure 3
demonstrates that the "chase" reaction is at least partially
effective in removing short products (compare track 2), and
track 1 of Figure 4 shows that there are few detectable
termination products below about 150 residues.
Above this
length, however, the amounts of prematurely terminated chains
present increase.
A close examination of such bands shows
that they mostly consist of chains terminated before addition
of an A residue is required.
However, comparison with the
uninhibited reaction presents the worst case: in the inhibited
reaction tracks these unwanted termination products were
proportionately less intense, and in all cases much weaker than
the ddNTP generated bands.
5'-TTT
  1  CATATGTAGC ATAATATATA ATAGGTGATC TGAGAATTAT   40
 41  AGGGTCATTT GTCAAATTCT GACTTAGCAT ACTTGCCAAT   80
 81  TGTCTTCTCT CTTAGGCCTT GCAGTGACAT GACTGCTCGT  120
121  TTCGCATACT GCATCATATC AGGAGTCGGT TTTCTGTTTT  160
161  GATCTTCAAA CCATCCGAGC CATTCGACCA CATCTCTGCC  200
201  TTGTG-3'                                      205
Figure 6 Complementary DNA sequence, 1-205
DISCUSSION
(1) The technique
The method described here gave clean, unambiguous
results up to residue 205, with further, tentative sequence
information to about residue 250. We consider that sequence
data to about 300 nucleotides are potentially attainable. In
our view this method represents the best currently available
approach to sequence studies on RNA by reverse transcription.
The important reservation must be made that, as presented here,
there is no confirmatory evidence available, such as
sequencing of the complementary strand. Thus, while we
emphasize that the sequence obtained is the result of a number
of experiments, the system is subject to the same limitations
as other rapid sequencing techniques (12, 13, 15) and we
cannot exclude a low error frequency.
(2) The N mRNA and polypeptide
No data are available on the sequence or amino acid
composition of the N polypeptide. Recent estimates of the
molecular weight of N by electrophoresis in polyacrylamide
gels have yielded values in the range 45,300 to 54,000 (6, 27,
28). Assuming a mean molecular weight of 115 for the
constituent amino acids, the polypeptide thus contains 394-470
amino acids and so requires a minimum of 1182-1410 nucleotides
in its mRNA.
Published estimates of the chain length of the
mRNA, excluding poly(A), range from 1115 to 1466 nucleotides
(5, 29, 30). We consider that the most accurate available
estimate is 1322 nucleotides, obtained by mobility on gel
electrophoresis of the full-length reverse transcript of the
mRNA with φX174 DNA restriction nuclease fragments as
standards (D. McGeoch, in preparation).
This allows an
estimate for the maximum length of non-coding RNA of 140
nucleotides (excluding poly (A)).
The 5'-non-coding region
is 12 nucleotides (31), so we estimate the 3'-non-coding RNA
as 128 nucleotides.
This estimate suffers from a number of
possible sources of error, and is presented to suggest that
the 3'-non-coding region can be expected to be less than 200
nucleotides.
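The arithmetic behind these estimates can be set out explicitly. The sketch below uses only the figures quoted in this section (molecular weights of 45,300 and 54,000, a mean residue weight of 115, a chain length of 1322 and a 5'-non-coding region of 12); the variable and function names are hypothetical.

```python
MEAN_RESIDUE_MW = 115      # assumed mean molecular weight per amino acid
MRNA_LENGTH = 1322         # preferred chain length estimate, excluding poly(A)
FIVE_PRIME_NONCODING = 12  # 5'-non-coding region, from reference 31


def residues(polypeptide_mw: float) -> int:
    """Number of amino acids for a given polypeptide molecular weight."""
    return round(polypeptide_mw / MEAN_RESIDUE_MW)


def coding_nucleotides(polypeptide_mw: float) -> int:
    """Minimum coding length, three nucleotides per amino acid."""
    return 3 * residues(polypeptide_mw)


low_mw, high_mw = 45_300, 54_000
print(residues(low_mw), residues(high_mw))                      # 394 470
print(coding_nucleotides(low_mw), coding_nucleotides(high_mw))  # 1182 1410

# The maximum non-coding length follows from the smallest coding estimate.
max_noncoding = MRNA_LENGTH - coding_nucleotides(low_mw)
three_prime_noncoding = max_noncoding - FIVE_PRIME_NONCODING
print(max_noncoding, three_prime_noncoding)                     # 140 128
```

The 128-nucleotide figure therefore depends on taking the lowest molecular weight estimate together with the 1322-nucleotide chain length, which is why it is best read as an upper bound.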
(3) Possible translation frames
The complement of the nucleotide sequence obtained is
presented as the mRNA strand in Figure 7.
This was examined
in an attempt to determine the limit of polypeptide coding
RNA.
Figure 7 shows the distribution of translation
terminating codons in the three possible reading frames
(designated 1, 2 and 3). Frame 1 contains 5 terminators, with
the most distant from poly(A) at 165-167.
Frame 2 contains
2 terminators, with the most poly(A)-distant at 135-137.
Frame 3 contains 1 terminator at 45-47 (frame 3 contains an
additional UGA if the first A of the poly(A) tract is
considered).
Thus, if coding sequence termination is nearer
the 3'-terminus than nucleotide 128, as argued above, then
frame 3 is the reading frame.
However, from the uncertainties
of the argument, it is still possible for either of the other
two phases to be the reading frame.
A consideration of other
features of the sequence has also led us to support frame 3
as being the most likely candidate for reading frame.
First,
with frame 3 as the reading frame, the 3'-non-coding region
comprises residues 1-44.
The nucleotide composition of this
region is sharply differentiated from the rest of the sequence:
it is much higher in U and lower in G, arguing a difference
in function.
Second, as mentioned above, the frame contains no
terminator codons for at least 158 nucleotides.
5'-CACAA                                         205-201
200  GGCAGAGAUG UGGUCGAAUG GCUCGGAUGG UUUGAAGAUC  161
160  AAAACAGAAA ACCGACUCCU GAUAUGAUGC AGUAUGCGAA  121
120  ACGAGCAGUC AUGUCACUGC AAGGCCUAAG AGAGAAGACA   81
 80  AUUGGCAAGU AUGCUAAGUC AGAAUUUGAC AAAUGACCCU   41
 40  AUAAUUCUCA GAUCACCUAU UAUAUAUUAU GCUACAUAUG    1
     -poly(A)-3'
Figure 7 3'-terminal sequence of N mRNA. The
sequence is presented as the complement of that shown in
Figure 6. Translation termination codons are boxed, with the
reading frame underneath. The repeated sequences UAU and
UAUUAU are underlined by solid and dashed lines respectively.
On a random
basis we expect around 3 such codons in phase over this length
of sequence.
We also examined amino acid composition and
codon usage in the three frames, but concluded that this was
not helpful for our present purpose.
It is clear that this
question can only be definitely resolved with more sequence
data.
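The survey of terminators in the three frames can be reproduced from the Figure 6 sequence with a short sketch. It assumes that the mRNA is the exact complement of the cDNA with the same residue numbering, and that frames are labelled by the lower residue number of each codon modulo 3 (remainder 0 being frame 3); this labelling is inferred from the statement that the frame 3 terminator lies at 45-47 and is not given in the text. The function names are hypothetical.

```python
CDNA = (  # Figure 6, residues 1-205, written 5'->3'
    "CATATGTAGC" "ATAATATATA" "ATAGGTGATC" "TGAGAATTAT"
    "AGGGTCATTT" "GTCAAATTCT" "GACTTAGCAT" "ACTTGCCAAT"
    "TGTCTTCTCT" "CTTAGGCCTT" "GCAGTGACAT" "GACTGCTCGT"
    "TTCGCATACT" "GCATCATATC" "AGGAGTCGGT" "TTTCTGTTTT"
    "GATCTTCAAA" "CCATCCGAGC" "CATTCGACCA" "CATCTCTGCC"
    "TTGTG"
)
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}
STOPS = {"UAA", "UAG", "UGA"}


def mrna_5_to_3(cdna: str) -> str:
    """mRNA strand written 5'->3', i.e. from residue 205 down to residue 1."""
    return "".join(COMPLEMENT[b] for b in reversed(cdna))


def find_terminators(cdna: str) -> dict:
    """Group terminator codons, as (low, high) residue numbers, by frame."""
    mrna = mrna_5_to_3(cdna)
    n = len(mrna)
    frames = {1: [], 2: [], 3: []}
    for i in range(n - 2):
        if mrna[i:i + 3] in STOPS:
            high = n - i  # residue number of the codon's 5' nucleotide
            low = high - 2
            frames[low % 3 or 3].append((low, high))
    return frames


def expected_random_stops(n_nucleotides: int) -> float:
    """Expected terminators in one frame if all 64 codons were equally likely."""
    return (n_nucleotides / 3) * (3 / 64)


for frame, hits in find_terminators(CDNA).items():
    print(f"frame {frame}: {len(hits)} terminator(s) {hits}")
print(f"expected in 158 nucleotides: {expected_random_stops(158):.2f}")
```

The counts per frame (5, 2 and 1) and the frame 3 terminator at 45-47 agree with the text. With the Figure 6 numbering as transcribed here, the scan places the most poly(A)-distant terminators of frames 1 and 2 at 166-168 and 134-136, one residue away from the positions quoted above. The random expectation for 158 nucleotides comes to about 2.5 codons, in line with the "around 3" stated.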
(If frame 3 is the correct reading frame, with
translation terminating at nucleotides 45-47, then the sequence
data predict the C-terminal amino acid sequence of N to be
Gln-Gly-Arg-Asp-Val-Val-Glu-Trp-Leu-Gly-Trp-Phe-Glu-Asp-Gln-Asn-Arg-Lys-Pro-Thr-Pro-Asp-Met-Met-Gln-Tyr-Ala-Lys-Arg-Ala-Val-Met-Ser-Leu-Gln-Gly-Leu-Arg-Glu-Lys-Thr-Ileu-Gly-Lys-Tyr-Ala-Lys-Ser-Glu-Phe-Asp-Lys-COOH).
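The predicted C-terminal peptide can be checked by translating frame 3 of the Figure 6 complement with the standard genetic code. This is a minimal sketch: the function name `translate_frame3` is hypothetical, frame 3 is taken to be the frame whose codons end on residue numbers divisible by 3 (consistent with a terminator at 45-47), and isoleucine is written "Ileu" only to match the abbreviation used above.

```python
CDNA = (  # Figure 6, residues 1-205, written 5'->3'
    "CATATGTAGC" "ATAATATATA" "ATAGGTGATC" "TGAGAATTAT"
    "AGGGTCATTT" "GTCAAATTCT" "GACTTAGCAT" "ACTTGCCAAT"
    "TGTCTTCTCT" "CTTAGGCCTT" "GCAGTGACAT" "GACTGCTCGT"
    "TTCGCATACT" "GCATCATATC" "AGGAGTCGGT" "TTTCTGTTTT"
    "GATCTTCAAA" "CCATCCGAGC" "CATTCGACCA" "CATCTCTGCC"
    "TTGTG"
)
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}

# Standard genetic code, codons ordered with bases U, C, A, G; "*" is a stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ileu",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate_frame3(cdna: str) -> str:
    """Translate frame 3 of the mRNA up to the first terminator (one-letter code)."""
    mrna = "".join(COMPLEMENT[b] for b in reversed(cdna))  # 5'->3'
    n = len(mrna)
    start = (n - 2) % 3  # first index whose codon ends on a residue divisible by 3
    peptide = []
    for i in range(start, n - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":
            break
        peptide.append(amino)
    return "".join(peptide)


peptide = translate_frame3(CDNA)
print(len(peptide), "residues")
print("-".join(THREE_LETTER[a] for a in peptide) + "-COOH")
```

The translation yields 52 residues, from Gln (residues 201-203) to Lys (residues 48-50), and reproduces the peptide given in the text.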
(4) The immediate 3'-terminal sequence
The sequence determined does not contain the hexanucleotide
AAUAAA, which has been found near the 3'-termini of
all polyadenylated eukaryote mRNAs sequenced, and near the
3'-terminus of encephalomyocarditis virus genome RNA (that is,
the coding strand) (32, 33). This rather suggests that any
role of the sequence concerns eukaryote mRNA metabolism per se
rather than as a signal in translation.
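The absence of AAUAAA can be confirmed by a direct search of the mRNA strand derived from Figure 6. The sketch below is illustrative: `find_motif` is a hypothetical helper that reports the residue number of the 5' nucleotide of each occurrence, and the same pass gives the A+U content of the 40 residues next to the poly(A).

```python
CDNA = (  # Figure 6, residues 1-205, written 5'->3'
    "CATATGTAGC" "ATAATATATA" "ATAGGTGATC" "TGAGAATTAT"
    "AGGGTCATTT" "GTCAAATTCT" "GACTTAGCAT" "ACTTGCCAAT"
    "TGTCTTCTCT" "CTTAGGCCTT" "GCAGTGACAT" "GACTGCTCGT"
    "TTCGCATACT" "GCATCATATC" "AGGAGTCGGT" "TTTCTGTTTT"
    "GATCTTCAAA" "CCATCCGAGC" "CATTCGACCA" "CATCTCTGCC"
    "TTGTG"
)
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}

# mRNA strand written 5'->3'; its last character is residue 1.
MRNA = "".join(COMPLEMENT[b] for b in reversed(CDNA))


def find_motif(mrna: str, motif: str) -> list:
    """Residue numbers of the 5' nucleotide of every (overlapping) occurrence."""
    n = len(mrna)
    return [n - i for i in range(n - len(motif) + 1) if mrna.startswith(motif, i)]


def au_fraction(seq: str) -> float:
    """Fraction of residues that are A or U."""
    return (seq.count("A") + seq.count("U")) / len(seq)


print("AAUAAA at:", find_motif(MRNA, "AAUAAA"))
print("A+U in residues 1-40:", au_fraction(MRNA[-40:]))
```

No occurrence of AAUAAA is found in residues 1-205, while 30 of the 40 residues adjacent to the poly(A) are A or U.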
As noted above, the
3'-terminal 40 nucleotides differ in composition from the rest
of the sequence.
The 3'-terminal 23 nucleotides are
particularly extreme, containing 11 U residues and 8 A's, with
repeating sequences - 6 copies of UAU and 2 copies of UAUUAU
(at positions 11-16 and 18-23).
This region, as the genome
RNA strand, may contain signals to the virus RNA-dependent
RNA polymerase for termination of transcription and start of
poly(A) synthesis.
This should become more clear when 3'-
terminal structures of the other VSV mRNAs are determined.
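The composition and repeat counts quoted for the 3'-terminal region can be tallied from the Figure 6 complement. In this sketch the helper names are hypothetical, and repeats are counted with overlaps allowed, which appears to be the convention needed to obtain 6 copies of UAU.

```python
from collections import Counter

CDNA = (  # Figure 6, residues 1-205, written 5'->3'
    "CATATGTAGC" "ATAATATATA" "ATAGGTGATC" "TGAGAATTAT"
    "AGGGTCATTT" "GTCAAATTCT" "GACTTAGCAT" "ACTTGCCAAT"
    "TGTCTTCTCT" "CTTAGGCCTT" "GCAGTGACAT" "GACTGCTCGT"
    "TTCGCATACT" "GCATCATATC" "AGGAGTCGGT" "TTTCTGTTTT"
    "GATCTTCAAA" "CCATCCGAGC" "CATTCGACCA" "CATCTCTGCC"
    "TTGTG"
)
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}
MRNA = "".join(COMPLEMENT[b] for b in reversed(CDNA))  # 5'->3', ends at residue 1


def occurrences(seq: str, motif: str) -> list:
    """(low, high) residue numbers of each occurrence, overlaps included.

    The last character of `seq` is taken to be residue 1.
    """
    n = len(seq)
    hits = []
    for i in range(n - len(motif) + 1):
        if seq.startswith(motif, i):
            high = n - i
            hits.append((high - len(motif) + 1, high))
    return hits


def fraction(seq: str, base: str) -> float:
    return seq.count(base) / len(seq)


tail = MRNA[-23:]  # residues 23 to 1
print(Counter(tail))
print("UAU copies:", len(occurrences(tail, "UAU")))
print("UAUUAU at:", occurrences(tail, "UAUUAU"))

non_coding, rest = MRNA[-44:], MRNA[:-44]  # residues 1-44 versus 45-205
for base in "UG":
    print(base, round(fraction(non_coding, base), 3), round(fraction(rest, base), 3))
```

The tally gives 11 U and 8 A in the terminal 23 residues, 6 overlapping copies of UAU, and UAUUAU at 18-23 and 11-16, as stated; residues 1-44 are also richer in U and poorer in G than residues 45-205.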
ACKNOWLEDGEMENTS
We thank our colleagues for their assistance: Dr C. R.
Pringle for supplying the virus strain, Dr J. F. Szilagyi for
discussion and for initial mRNA samples, and Professor J. H.
Subak-Sharpe for support and critical analysis of the data and
text.
REFERENCES
Abbreviations: VSV, vesicular stomatitis virus; N mRNA, messenger RNA encoding N polypeptide; ddNTP, 2',3'-dideoxynucleoside triphosphate.
1. Wagner, R.R. (1975) in Comprehensive Virology, Fraenkel-Conrat, H. and Wagner, R.R., Eds., Vol. 4, pp. 1-80. Plenum Press, New York.
   Baltimore, D., Huang, A.S. and Stampfer, M. (1970). Proc. Nat. Acad. Sci. U.S.A. 66, 572-576.
   Both, G.W., Moyer, S.A. and Banerjee, A.K. (1975). J. Virol. 15, 1012-1019.
5. Rose, J.K. and Knipe, D. (1975). J. Virol. 15, 994-1003.
6. Knipe, D., Rose, J.K. and Lodish, H.F. (1975). J. Virol. 15, 1004-1011.
7. Szilagyi, J.F. and Uryvayev, L. (1973). J. Virol. 11, 279-286.
8. Both, G.W., Moyer, S.A. and Banerjee, A.K. (1975). Proc. Nat. Acad. Sci. U.S.A. 72, 274-278.
9. Moyer, S.A., Grubman, M.J., Ehrenfeld, E. and Banerjee, A.K. (1975). Virology 62, 463-473.
10. Preston, C.M. and Szilagyi, J.F. (1977). J. Virol. 21, 1002-1009.
11. Villarreal, L.P., Breindl, M. and Holland, J.J. (1976). Biochemistry 15, 1663-1667.
12. Brownlee, G.G. and Cartwright, E.M. (1977). J. Mol. Biol. 114, 93-118.
13. Maxam, A.M. and Gilbert, W. (1977). Proc. Nat. Acad. Sci. U.S.A. 74, 560-564.
14. Cheng, C.C., Brownlee, G.G., Carey, N.H., Doel, M.T., Gillam, S. and Smith, M. (1976). J. Mol. Biol. 107, 527-547.
15. Sanger, F., Nicklen, S. and Coulson, A.R. (1977). Proc. Nat. Acad. Sci. U.S.A. 74, 5463-5467.
16. Pringle, C.R. (1970). J. Virol. 5, 559-567.
17. Szilagyi, J.F. and Pringle, C.R. (1975). J. Virol. 16, 927-936.
18. Rhodes, D.P., Moyer, S.A. and Banerjee, A.K. (1974). Cell 3, 327-333.
19. Emerson, S.U. and Yu, Y.-H. (1975). J. Virol. 15, 1348-1356.
20. Proudfoot, N.J. (1976). J. Mol. Biol. 107, 491-525.
21. McGeoch, D.J., Crawford, L.V. and Follett, E.A.C. (1970). J. Gen. Virol. 6, 33-40.
22. Mirzabekov, A.D. and Griffin, B.E. (1972). J. Mol. Biol. 72, 633-643.
23. Laskey, R.A. and Mills, A.D. (1977). FEBS Letters 82, 314-316.
24. Sanger, F. and Coulson, A.R. (1978). FEBS Letters 87, 107-110.
25. Banerjee, A.K., Moyer, S.A. and Rhodes, D.P. (1974). Virology 61, 547-558.
26. Marcus, S.L. and Sarkar, N.H. (1978). Virology 84, 247-259.
27. Wunner, W.H. and Pringle, C.R. (1972). J. Gen. Virol. 16, 1-10.
28. Obijeski, J.F., Marchenko, A.T., Bishop, D.H.L., Cann, B.W. and Murphy, F.A. (1974). J. Gen. Virol. 22, 21-33.
29. Freeman, G.J., Rose, J.K., Clinton, G.M. and Huang, A.S. (1977). J. Virol. 21, 1094-1104.
30. Rhodes, D.P., Abraham, G., Colonno, R.J., Jelinek, W. and Banerjee, A.K. (1977). J. Virol. 21, 1105-1112.
31. Rose, J.K. (1977). Proc. Nat. Acad. Sci. U.S.A. 74, 3672-3676.
32. Proudfoot, N.J. and Brownlee, G.G. (1976). Nature 263, 211-214.
33. Merregaert, J., Van Emmelo, J., Devos, R., Porter, A., Fellner, P. and Fiers, W. (1978). Eur. J. Biochem. 82, 55-63.