Volume 12 Number 16 1984
Nucleic Acids Research
Nucleotide sequence and genome organization of foot-and-mouth disease virus
S.Forss, K.Strebel, E.Beck and H.Schaller
Microbiology, University of Heidelberg, Im Neuenheimer Feld 230, D-6900 Heidelberg, FRG
Received 21 May 1984; Revised and Accepted 24 July 1984
ABSTRACT
A continuous 7802 nucleotide sequence spanning the 94 % of foot and mouth disease virus RNA between the 5'-proximal poly(C) tract and the 3'-terminal poly(A) was obtained from cloned cDNA, and the total size of the RNA genome was corrected to 8450 nucleotides. A long open reading frame was identified within this sequence starting about 1300 bases from the 5' end of the RNA genome and extending to a termination codon 92 bases from its polyadenylated 3' end. The protein sequence of 2332 amino acids deduced from this coding sequence was correlated with the 260 K FMDV polyprotein. Its processing sites and twelve mature viral proteins were inferred from protein data, available for some proteins, a predicted cleavage specificity of an FMDV encoded protease for Glu/Gly(Thr, Ser) linkages, and homologies to related proteins from poliovirus. In addition, a short unlinked reading frame of 92 codons has been identified by sequence homology to the polyprotein initiation signal and by in vitro translation studies.
© IRL Press Limited, Oxford, England.
INTRODUCTION
Foot and mouth disease viruses (FMDV) or aphthoviruses are picornaviruses which are the causative agent of an aggressive and economically important disease of cloven-footed farm animals. Their virion contains a single-stranded RNA genome of about 8 kb with a small protein (VPg) covalently attached to its 5' end, an internal poly(C) tract, and a poly(A) sequence at the 3' end. This RNA is of positive polarity and can act directly as a messenger RNA. Protein synthesis involves post-translational cleavage of a 260 K polyprotein which is encoded between a single major translation initiation site next to the poly(C) tract and the 3' end of the RNA. This mode of protein synthesis is common to all picornaviruses (1) and makes them interesting model systems for studying the mechanisms of translation initiation and of protein maturation by specific proteolytic cleavages (2).
To obtain more information about the FMDV genome, its control signals and its gene products we have cloned cDNA copies of the viral RNA from strain O1K and determined their nucleotide sequence. Using a set of overlapping clones we obtained a continuous sequence of 7915 nucleotides representing the 3' proximal long L segment of the FMDV RNA which is thought to contain all its coding information. In a previous publication we had already determined the initiation sites for polyprotein synthesis and a long open reading frame encoding the structural proteins of FMDV (3). The present work extends this reading frame by the coding sequence for the non-structural proteins, predicting a single translation product of 2332 amino acids which could be correlated in many parts with known protein data from FMDV induced proteins. In addition, we report and discuss a sequence of 713 bases preceding the polyprotein gene to the poly(C) tract.
MATERIALS AND METHODS
Enzymes
Restriction endonucleases were isolated and purified by standard procedures or purchased from Boehringer Mannheim or Biolabs. Avian Myeloblastosis Virus reverse transcriptase was a gift from W. Keller, Heidelberg. Nuclease S1 from Aspergillus oryzae, calf intestinal phosphatase, and T4 polynucleotide kinase were from Boehringer Mannheim, and terminal deoxynucleotidyl transferase was from BRL. [γ-32P]-ATP was prepared according to (4).
FMDV RNA
FMDV O1K RNA, isolated after 7 and 64 passages in BHK cells of the same field-isolate of the virus (Kaufbeuren 1966-67), was provided by K. Strohmaier and W. Keller. The virus was plaque-purified at the beginning of the passages and again after passage 16.
FMDV cDNA clones
The clones FMDV-2735 and -2615 were constructed essentially as described by (3) using single-stranded restriction fragments (mapping at pos. 3662-3752 and 3792-3882, cf. Figure 2) from the existing cDNA clone FMDV-715 as primers for cDNA synthesis. The double-stranded cDNA was provided with dC-tails and annealed to dG-tailed pBR322, linearized at the PstI site. Construction of clones FMDV-3214a and -3214c was carried out using a modified procedure that avoids second strand synthesis and G/C tailing (cf. Figure 1): cDNA synthesis was primed with an AvaI/HindIII restriction fragment (pos. 692-743) from cDNA clone FMDV-2735 which had been isolated from a 6 % polyacrylamide strand separation gel (8). Reverse transcription was followed by RNase treatment (panc. RNase 20 µg/ml, 30 min, 37°C). Oligo-dA tails of 70-80 nucleotides were added to the 3'-ends of the cDNA using terminal deoxynucleotidyl transferase. This oligo-dA tailed cDNA was annealed to the DNA fragment complementary to the primer fragment. By this a partially double stranded molecule was obtained having a single stranded oligo-dA tail at its 3'-side and a double stranded portion at its 5'-side reconstituting the original HindIII site of FMDV pos. 743. This DNA molecule was then annealed to the vector pUC9 (6), that had been cleaved by PstI, dT-tailed, HindIII cleaved and purified by agarose gel electrophoresis. The DNA mixture was ligated and transformed into competent E. coli C600 cells. Clones were screened for FMDV inserts by colony hybridization with [32P]-labelled FMDV RNA as described elsewhere (7). Clones FMDV-2735 and -2615 were derived from RNA isolated from low passaged FMDV (7 passages), all other clones were derived from a virus-isolate after 64 passages in BHK cells.
Nucleotide sequence analysis
Restriction fragments were endlabelled and chemically degraded by the base-specific cleavage methods according to (8). 5' endlabelling was used throughout except for the 3' end of the genome where 3' endlabelling was employed at a HpaII site located 68 nucleotides in front of the poly(A) tail. Thin sequencing gels (0.04 cm), either 40 or 100 cm long, were dried prior to printing, according to (9) to improve resolution of the bands in the sequencing ladders.
Computer analyses
The derived nucleotide sequences were entered into a data base, where the information was stored and processed using the computer programs of (10). The alignment program from Krüger and Osterburg (unpublished) was based on the algorithm from (11). The homology comparison in Figure 4b was based on such computer-derived alignments scoring "similar amino acids" as one third of identical amino acids.
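The scoring rule used for Figure 4b can be illustrated with a minimal sketch. It assumes that an alignment has already been computed and is given as two equal-length strings in one-letter code with "-" for gaps. The grouping of "similar" amino acids below is a hypothetical choice made for illustration only; the text does not state which residues were treated as similar, and the function name is likewise an assumption.

```python
# Hypothetical similarity groups (one-letter codes). The paper does not
# list the groups it used; these are illustrative assumptions only.
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"),
    set("DE"), set("NQ"), set("ST"), set("AG"),
]


def homology_percent(a: str, b: str) -> float:
    """Score an aligned pair of sequences.

    Identical residues count 1, "similar" residues count 1/3,
    all other pairs (including gap positions) count 0.
    The result is expressed as a percentage of the alignment length.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        return 0.0
    score = 0.0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # gap positions contribute nothing
        if x == y:
            score += 1.0
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            score += 1.0 / 3.0
    return 100.0 * score / len(a)
```

With this weighting, a segment in which every aligned pair is similar but none is identical scores one third of a fully identical segment.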
RESULTS AND DISCUSSION
Cloning of FMDV cDNA
A set of cDNA clones from FMDV strain O1K (FMDV-144, -512, -703, -715,
-1034, -1448) has been described which cover the two thirds of the FMDV
genome from the VP3 coding region to the 3' end (5, cf. Figure 2). To obtain
cloned cDNA copies upstream of the VP3 gene single-stranded restriction fragments from the existing cDNA clones (indicated by I and II in Figure 2) were
used in two steps (7, 12) to prime cDNA synthesis close to the missing parts
of the FMDV genome (see Methods and Figure 1). As a result the cloned part
of the FMDV genome was extended into the 3'-end of the poly(C) tract.
Fig.1: Strategy for cloning 5'-terminal sequences of FMDV O1K. The minus strand of an AvaI/HindIII fragment (pos. 692-743, hatched box) from cDNA clone pFMDV2735 was used to specifically prime cDNA synthesis on FMDV RNA (step I). The RNA template was hydrolyzed using panc. RNase, oligo-dA tails were added to the 3'-ends of the cDNA and the cDNA was annealed to the plus strand of the AvaI/HindIII fragment (empty box), complementary to the primer fragment (step II). The partially double stranded and dA-tailed cDNA was annealed to pUC9 vector DNA (6) that had been PstI cleaved, dT-tailed and HindIII cleaved in this order (step III) and then transferred into competent E. coli C600.

Nucleotide sequence analysis
In Figure 2 exact map positions of the cloned cDNA inserts used for nucleotide sequence analysis and the sequencing strategy are depicted. The methods of (8) were followed for 5' [32P]-endlabelling and subsequent partial chemical degradation of restriction fragments. In order to obtain unambiguous sequence data most regions were sequenced in several independent runs, and the entire sequence was analyzed on both DNA strands (Figure 2). Using overlaps of at least a hundred nucleotides between subclones, a continuous sequence was generated over the 7915 cloned nucleotides starting with 11 C residues from the poly(C) tract and ending in 102 A residues as determined in the clones pFMDV-3214c and pFMDV-512, respectively (Figure 3). All cleavage sites predicted in this sequence for the restriction enzymes used during the sequence analysis were also observed experimentally, except for a ClaI site at position 5026. This site overlaps two MboI sites which are known to be targets for adenosine methylation in E. coli (13).
Minor heterologies in the sequence were observed in overlapping regions from different clones (Figure 3, 14). Such point mutations are known to occur with high probability in populations of FMDV RNA (15) and in other viral RNAs because of the intrinsically imprecise mechanism of RNA replication (16). This variation was particularly obvious when cDNA from moderately and highly passaged virus (7 and 64 passages in BHK cells, respectively) was compared. No substitutions were found in 550 bases overlapping the clones FMDV-2735 and -2615 from the same low passaged virus (14).

Fig.2: Physical map of the FMDV genome, FMDV cDNA clones and strategy for the nucleotide sequence analysis. The top part shows the physical and the gene map of the FMDV RNA. Positions of the restriction sites of endonucleases used for 5' end-labeling are indicated on the second line. Below that, the restriction fragments used to prime cDNA synthesis (I and II) are illustrated as open arrows. Horizontal arrows show direction and extent of individual sequencing runs, grouped according to the clones they originate from. Dashed lines represent portions of clones or sequencing runs from which no information was obtained. The analysis of the clones FMDV-1034 and FMDV-144 has already been reported (26) and only the extent of the sequenced areas is indicated here. The restriction enzymes are represented by the letters: A, AvaI; B, BamHI; C, ClaI; E, HaeIII; F, HinfI; H, HhaI; I, HphI; N, HindIII; P, HpaII; R, EcoRI; T, TaqI; U, PvuII; X, XhoI; Xb, XbaI.

The nucleotide sequence
The FMDV genome is divided by the poly(C) tract into two parts of different size and function. The small (S) segment to the 5' side (approx. 400 nucleotides (17)) is probably involved only in initiation of viral RNA replication, and the large (L) segment 3' of the poly(C) contains all the protein coding information (18). The sequence of 7802 nonhomopolymeric nucleotides shown in Figure 3 represents the complete primary structure of the L segment. Assuming an additional 150 bases for poly(C) and 400 for the S segment this corresponds to 94 % of the FMDV genome, indicating that the size of the FMDV genome is about 8500 nucleotides, i.e. 500 nucleotides longer than estimated from sizing gels by (17).
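The bookkeeping behind these figures can be retraced with a few lines of arithmetic. The following is only a sketch: the lengths are taken from the text, but how the poly(A) tail enters the genome estimate is an assumption made here (it is counted at its cloned length of 102 residues), and the variable names are illustrative.

```python
# Lengths stated in the text
NONHOMOPOLYMERIC = 7802   # L-segment nucleotides shown in Figure 3
CLONED_C = 11             # C residues cloned from the poly(C) tract
CLONED_A = 102            # A residues cloned from the poly(A) tail
POLY_C_ESTIMATE = 150     # assumed length of the complete poly(C) tract
S_SEGMENT_ESTIMATE = 400  # assumed length of the S segment

# Total length of the continuous cloned sequence
cloned_total = NONHOMOPOLYMERIC + CLONED_C + CLONED_A

# Assumption: poly(A) is counted at its cloned length in the genome estimate
genome_estimate = (NONHOMOPOLYMERIC + CLONED_A
                   + POLY_C_ESTIMATE + S_SEGMENT_ESTIMATE)

cloned_fraction = cloned_total / genome_estimate

print(cloned_total, genome_estimate, round(100 * cloned_fraction, 1))
```

Under this assumption the cloned sequence adds up to the 7915 nucleotides quoted above, and the estimate falls between the 8450 nucleotides given in the abstract and the rounded value of about 8500.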
This size correction is also supported by the sizing and the partial sequence analysis of cDNA copies that cover the as yet uncloned 5'-terminal part of the FMDV genome, i.e. the S segment and the poly(C) tract (7). So far, all attempts to clone this missing part of the viral RNA have been unsuccessful. These difficulties seem to be related to the internal poly(C) tract, which, although readily copied into cDNA (7), is probably highly unstable in E. coli, as shown for other (G·C)-homopolymers longer than 30 basepairs (19). In accordance with this notion we have only been able to clone 11 C residues from the 3' proximal part of the poly(C) tract which still contains interspersed non-C nucleotides. The sequence obtained ...GCT(C)11AAG... was confirmed by direct sequencing of the cDNA extending into the poly(C) tract (7). It is different, but shows similarities to the sequence ..T(C)2AUUCCAAG... determined for the 3' end of a T1 oligonucleotide containing the poly(C) tract from FMDV O-V1 and A61 (17).
Coding regions and translation initiation sites
From the kinetics of the appearance of FMDV specific gene products it has been predicted that FMDV RNA contains a single long open translational reading frame for a 260 K polyprotein, which in turn is a precursor for all gene products. Our sequence reveals such a reading frame of 7035 nucleotides (pos. 766 to 7800) with a first possible initiation codon for the polyprotein at position 805 (see Figure 3). The coding region is preceded by 724 nucleotides of known sequence and by some additional 550 nucleotides of yet uncloned RNA comprising the poly(C) tract and the S fragment. It is followed by several stop codons in all three reading frames leaving 92 nucleotides untranslated in front of the poly(A) tail. Usually the 5' proximal AUG is used to initiate translation in an eucaryotic mRNA, which suggests that the ribosomes first recognize the capped 5' end of the RNA and then traverse downstream until an AUG is encountered (20). Clearly this model is not applicable to FMDV since translation of the major primary gene product starts at position 805 and to a lesser extent also at position 889 (3), which are the ninth and tenth AUGs downstream from the poly(C) (see Figure 3). These two start sites differ from other AUG codons in the sequence in that they are preceded at a short distance by a stretch of 11 pyrimidines, which are interrupted by no more than one purine residue and which also show significant base complementarity to the 3'-end of the 18S ribosomal RNA from eucaryotes (3). These pyrimidine runs are also present at the translational start sites of poliovirus and EMCV (21, 22). We therefore speculate that both features may be of importance for the recognition by ribosomes or initiation factors of an uncapped mRNA with the long untranslated leader sequence. Sequence complementary to the 18S rRNA terminus has also been noted for less pyrimidine rich sequences in other eucaryotic mRNAs (23).

[Fig. 3: nucleotide and deduced amino acid sequence listing of the L segment; the listing is not legible in this transcript.]

The sequence of 1.3 kb preceding the polyprotein gene seems to be exceptionally long for a leader segment of a small, and otherwise compactly organized viral genome and suggests that additional short coding sequences may exist in this segment. At present the most likely candidate for such an unlinked FMDV gene is a sequence of 92 translatable codons that follows the first AUG after the poly(C) tract (pos. 209 in Fig. 3). The evidence for this hypothesis is twofold. Firstly, a polypeptide of 10 K (tentatively named P10) has recently been identified among the products of in vitro translation of FMDV RNA by immunoprecipitation with an antiserum directed against the P10 amino acid sequence (Strebel et al., in prep.). Secondly, the translational start of the presumed P10 gene (pos. 209) is structurally very similar to other sites controlling translation initiation at internal positions of picornaviral RNAs in that it is also preceded at a short distance by the pyrimidine rich sequence noted above.
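The structural feature described in this section (an AUG preceded at a short distance by a run of 11 pyrimidines interrupted by no more than one purine) lends itself to a simple sequence scan. The sketch below is one possible way to search for it, not the procedure used in this work. The text does not quantify "short distance", so the `max_gap` default of 25 nucleotides is an assumed, hypothetical value, and the toy sequence is invented for illustration and is not FMDV sequence.

```python
def pyrimidine_rich_upstream(seq, aug_pos, run_len=11, max_purines=1, max_gap=25):
    """Return True if a window of run_len bases containing at most
    max_purines purines (A or G) ends no more than max_gap bases
    upstream of the AUG at index aug_pos (0-based)."""
    start = max(0, aug_pos - max_gap - run_len)
    # Windows must end at or before the AUG itself
    for i in range(start, aug_pos - run_len + 1):
        window = seq[i:i + run_len]
        if sum(base in "AG" for base in window) <= max_purines:
            return True
    return False


def candidate_starts(seq, **kwargs):
    """0-based positions of all AUG/ATG triplets preceded by a pyrimidine-rich run."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA notation
    return [i for i in range(len(seq) - 2)
            if seq[i:i + 3] == "ATG" and pyrimidine_rich_upstream(seq, i, **kwargs)]


# Invented toy sequence: the first ATG follows a pyrimidine run with one
# purine, the second ATG lies in a purine-rich context.
TOY = "GAGAGAGAGA" + "CTTTCCTACTT" + "GAGA" + "ATG" + "GA" * 20 + "ATG" + "GG"

print(candidate_starts(TOY))
```

In the toy sequence only the first of the two ATG triplets is reported, since the second has no pyrimidine-rich window within the allowed distance.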
The deduced protein map
The long open reading frame encodes a polypeptide with a maximal size of 2332 amino acids and a calculated molecular weight of 258.9 K, in excellent agreement with the 260 K determined experimentally for the FMDV polyprotein (1). Its deduced amino acid sequence could be correlated in the P1 (P88) segment with all known sequence data from the structural proteins of FMDV O1K (3, 26, 27, amino acids underlined in Figure 3). Much less information was available for the non-structural proteins encoded in the P2 (P52) and the P3 (P100) precursors where only the VPg genes had been exactly mapped (24) and only approximate positions had been allocated for P34 and P56a (1) and for P12 and P20b (A. King, pers. comm.). A more complete protein map was established (14), using size estimations from SDS polyacrylamide gel electrophoresis and sequence homologies to poliovirus polypeptides for which the order was known (21). The exact coding limits of the individual functional proteins were predicted from the cleavage specificity for Glu(Gln)/Gly(Ser,Thr) linkages of the FMDV protease (see below) and also using very recent data from Grubman and collaborators, who determined amino acid sequences at the N-termini of the polypeptides synthesized in vitro from FMDV A12 RNA and corresponding to P52, P12, P34, P14, P20b (pers. comm.) and P56 (25). The order and limits of FMDV proteins thus obtained (Figure 4) were recently confirmed by immunoprecipitation of polypeptides of the predicted size from FMDV infected BHK cells, using antisera against bacterially synthesized polypeptides that correspond to the respective segments of the polyprotein (Strebel et al., in prep.).
The biochemical properties of the FMDV proteins predicted from the amino acid sequences in Figure 3 are summarized in Table I and a schematic protein map is shown in Figure 4a. Our sequence-derived molecular weights often differ from earlier determinations, and correlate better with recent size estimations of FMDV proteins synthesized in infected cells (3, Strebel et al., in prep.).
The polyprotein sequence starts with two closely related "leader" proteins, L and L' ("P20a" and "P16") which differ by 28 amino acids at their N-termini and have molecular weights of 24.4 K and 21.2 K, respectively. The primary product following the leader protein L/L' in the polyprotein is P1 (MW 79.3 K), formerly called "P88", the precursor of the capsid proteins which are arranged in the order VP4-VP2-VP3-VP1 (1). P2 (formerly "P52"), the precursor from the middle part of the polyprotein, has a calculated molecular weight of 54.5 K. It contains two stable proteins, 2b ("P12") and 2c ("P34"), with unknown functions. The N-terminal limit of P2 was originally set next to the carboxy-terminus of VP1 (cf. Figure 3) which had been accurately identified for FMDV O1K by (26). However, as shown for FMDV A12, the proteins P2 and P12 start 16 amino acids downstream from the C-terminus of VP1 (Grubman pers. comm., cf. Figure 3). The carboxy-terminal precursor P3 (formerly "P100", calculated molecular weight 100.8 K) comprises the sequence between amino acids 1426 and 2332. It is processed into six proteins: 3a ("P14") of 17.4 K, three VPgs (in tandem) of 2.3 K each, a candidate for a protease, 3c ("P20b"), of 23.0 K, and an RNA polymerase, 3d ("P56"), of 52.7 K.

Fig.3: The nucleotide sequence and encoded information of the L segment of FMDV RNA strain O1K (U residues are shown as T). The amino acid sequence corresponds to the total polyprotein of 2332 amino acids. A translated amino acid sequence is also shown for a hypothetical small polypeptide encoded upstream from the polyprotein. The numbers to the right represent amino acid positions in the polyprotein. Predicted limits of the primary precursors are indicated to the left, and those of the stable viral polypeptides to the right. Dashed lines represent borders where the supporting data are weak. Underlined amino acids have been determined experimentally for this virus strain (26,27). The nucleotide positions to the left use the BamHI site at position 3000 as a reference (5). Nucleotide 92 (indicated by an arrowhead) is the first nucleotide downstream from the poly(C), and is put forward here as nucleotide 1 in an improved numbering system. AUG codons in the 5' region are underlined, a palindromic sequence at the 3' end is underlined by arrows. Nucleotide heterologies are indicated by dots.

[Figure 4: schematic maps of the FMDV (2332 aa) and poliovirus (2207 aa) polyproteins; the drawing is not legible in this transcript.]
Fig.4: Comparison of the FMDV and poliovirus genomes.
a) Schematic map of gene organization and protein processing. Open arrows indicate cleavages probably executed by cellular proteases, while filled arrows represent processing by the viral protease. The dashed arrows illustrate morphogenetic cleavages occurring during virus particle maturation. The FMDV polypeptides are termed as in Figure 3; the number of amino acids is given for each protein below the lines.
b) Sequence homology between corresponding parts of the two polyproteins. The degree of homology in different segments (see Methods) is: hatched 25-40 %, crosshatched 40-65 %, filled in >65 %. Related gene products are connected by vertical lines and identified using the new general nomenclature for picornaviruses (according to the third Meeting of the European Study Group of Picornaviruses, Urbino, Italy, Sept. 5-10, 1983).

Table I: Predicted map positions and biochemical properties of FMDV polypeptides as deduced from the nucleotide sequence.

Polypeptide   Map           Map         No. of        Molecular   Net      Function
              coordinate 1) position    amino acids   weight      charge
Polyprotein                 805-7800    2332          258946.8    +37      precursor
P20a          L             805-1455     217           24367.4     -5      ?
P16           L'            889-1455     189           21243.1     -7      ?
P88           P1            1456-3615    720           79305.6    +13      precursor
VP4           1a            1456-1662     69            7362.3     -3      capsid protein
VP2           1b            1663-2316    218           24410.4     +4      capsid protein
VP3           1c            2317-2976    220           23746.4     -2      capsid protein
VP1           1d            2977-3615    213           23840.5    +14      capsid protein
P52           P2            3616-5079    488           54501.1     +8      precursor
(P12)         2b            3664-4125    154           16255.1     +2      ?
P34           2c            4126-5079    318           35892.9     +8      ?
P100          P3            5080-7800    907          100844.7    +21      precursor
(P14)         3a            5080-5538    153           17355.2     -5      ?
VPg-1         3b-1          5539-5607     23            2604.5     +3      genome-linked protein
VPg-2         3b-2          5608-5679     24            2622.4     +4      genome-linked protein
VPg-3         3b-3          5680-5751     24            2579.3     +3      genome-linked protein
P20b          3c            5752-6390    213           23012.4     +9      protease
P56           3d            6391-7800    470           52760.9     +7      RNA polymerase

1) Nomenclature suggested for the picornaviral polypeptides at the 3rd Meeting of the European Study Group of Picornaviruses, Urbino, Italy, Sept. 5-10, 1983.
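The map positions and amino acid counts in Table I can be cross-checked against each other, since every coding segment must span a whole number of codons. The following minimal sketch does this with the coordinates as read from Table I; the dictionary labels and helper names are illustrative choices, and the reference start of the polyprotein at position 805 is taken from the text.

```python
# (first nucleotide, last nucleotide, number of amino acids) as read from Table I
TABLE_I = {
    "Polyprotein": (805, 7800, 2332),
    "L (P20a)": (805, 1455, 217),
    "L' (P16)": (889, 1455, 189),
    "P1 (P88)": (1456, 3615, 720),
    "1a (VP4)": (1456, 1662, 69),
    "1b (VP2)": (1663, 2316, 218),
    "1c (VP3)": (2317, 2976, 220),
    "1d (VP1)": (2977, 3615, 213),
    "P2 (P52)": (3616, 5079, 488),
    "2b (P12)": (3664, 4125, 154),
    "2c (P34)": (4126, 5079, 318),
    "P3 (P100)": (5080, 7800, 907),
    "3a (P14)": (5080, 5538, 153),
    "3b-1 (VPg-1)": (5539, 5607, 23),
    "3b-2 (VPg-2)": (5608, 5679, 24),
    "3b-3 (VPg-3)": (5680, 5751, 24),
    "3c (P20b)": (5752, 6390, 213),
    "3d (P56)": (6391, 7800, 470),
}

POLYPROTEIN_START = 805  # first possible initiation codon (see text)


def codon_count(start, end):
    """Number of codons in the inclusive nucleotide range start..end."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}-{end} is not a multiple of three")
    return length // 3


def first_residue(start, orf_start=POLYPROTEIN_START):
    """1-based polyprotein residue number encoded at nucleotide `start`."""
    return (start - orf_start) // 3 + 1


# Rows whose coordinates disagree with the stated number of amino acids
mismatches = [name for name, (s, e, n) in TABLE_I.items() if codon_count(s, e) != n]
print(mismatches)
```

The same helpers reproduce two statements of the text: the open reading frame from position 766 to 7800 is 7035 nucleotides long, and P3, starting at nucleotide 5080, begins with amino acid 1426 of the polyprotein.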
Pfotease cleavage sites
The present map ot the FMDV genome (Figures 2 and 4) predicts that at least
12 sites in the protein sequence need to be cleaved to give rise to mature
viral
gene
pioducts.
Seven
out of
these
sites
show
similar
amino
acid
sequences
VP2/VP3
Pro-Ser-Lys-Glu/Giy-Ile-Phe-Pro
VP3/VP1
Ala-Arg-Ata-Glu/Thr-Thr-Ser-Ala
P14/VPg-1
Pro-Gln-Ala-Glu/Gly-Pro-Tyr-Ala
VPg-1/VPg-2
Pro-Gln-Gln-Glu/Gly-Pro-Tyr-Ala
VPg-2/VPg-3
VaI-Val-Lys-Glu/Gly-Pro-Tyr-Glu
VPg-3/P20b
Ite-Val-Thr-Glu/Ser-GIy-Ala-Pro
P20b/P56
Pro-His-His-GIu/GIy-Leu-Ile-Val
suggesting that they are cleaved by a single (viral) protease, recognizing the
consensus sequence Glu/Gly (Ser, Thr).
This specificity is similar to the one
displayed by the poliovirus protease which cleaves between Gin and Gly residues at eight out ot eleven processing sites in the polio potyprotein ( 2 1 , 28,
Fig. 4a). The sequences around the remaining five cleavage sites in FMDV
    P20a/VP4       Ser-Gln/Asn-Gly-Ser-Gly/Asn-Thr-Gly-Ser
    VP4/VP2        Ala-Leu-Leu-Ala/Asp-Lys-Asn-Thr
    VP1/P52        Lys-Gln-Thr-Leu/Asn-Phe-Asp-Leu
    P12/P34        Ala-Glu-Lys-Gln/Leu-Lys-Ala-Arg
    P34/P100       Ile-Phe-Lys-Gln/Ile-Ser-Ile-Pro
show little sequence homology to each other and are thought to be recognized
by cellular proteases. However, recent experiments from this laboratory (unpublished data) suggest that the L polypeptide may also be involved in the processing of at least the L/P1 junction.
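The residues flanking each cleaved bond in the two lists above can be tallied mechanically. The following Python sketch is only an illustration of that bookkeeping, not the procedure used for the paper; the function names are hypothetical. The P20a/VP4 entry is left out because it is written with two "/" marks, so its cleaved bond cannot be read off unambiguously here.

```python
from collections import Counter

# Junctions as listed in the text; "/" marks the cleaved bond.
VIRAL_PROTEASE_SITES = {
    "VP2/VP3": "Pro-Ser-Lys-Glu/Gly-Ile-Phe-Pro",
    "VP3/VP1": "Ala-Arg-Ala-Glu/Thr-Thr-Ser-Ala",
    "P14/VPg-1": "Pro-Gln-Ala-Glu/Gly-Pro-Tyr-Ala",
    "VPg-1/VPg-2": "Pro-Gln-Gln-Glu/Gly-Pro-Tyr-Ala",
    "VPg-2/VPg-3": "Val-Val-Lys-Glu/Gly-Pro-Tyr-Glu",
    "VPg-3/P20b": "Ile-Val-Thr-Glu/Ser-Gly-Ala-Pro",
    "P20b/P56": "Pro-His-His-Glu/Gly-Leu-Ile-Val",
}

# Remaining junctions, without P20a/VP4 (see the note above).
OTHER_SITES = {
    "VP4/VP2": "Ala-Leu-Leu-Ala/Asp-Lys-Asn-Thr",
    "VP1/P52": "Lys-Gln-Thr-Leu/Asn-Phe-Asp-Leu",
    "P12/P34": "Ala-Glu-Lys-Gln/Leu-Lys-Ala-Arg",
    "P34/P100": "Ile-Phe-Lys-Gln/Ile-Ser-Ile-Pro",
}


def flanking_pair(site):
    """Return the residues immediately before and after the cleaved bond."""
    left, right = site.split("/")
    return left.split("-")[-1], right.split("-")[0]


def tally(sites):
    """Count how often each flanking residue pair occurs in a site table."""
    return Counter(flanking_pair(seq) for seq in sites.values())


viral = tally(VIRAL_PROTEASE_SITES)
other = tally(OTHER_SITES)

for pair, n in viral.most_common():
    print("viral protease candidate:", "/".join(pair), n)
for pair, n in other.most_common():
    print("other junction:", "/".join(pair), n)
```

In the first table every pair has Glu before the cleaved bond, followed by Gly, Ser or Thr; the four remaining junctions tallied here each give a different pair.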
In FMDV there are 14 Glu-Gly, 3 Glu-Ser, and 9 Glu-Thr dipeptides in the
polyprotein sequence which may be substrates for the FMDV protease. Of these
only five, one, and one respectively are utilized, according to the protein map
(Figures 3 and 4a).
Therefore, not simply the primary sequence but also a certain conformation of
the amino acid sequence must be recognized by the viral enzyme.
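The argument rests on simple counting: how many candidate dipeptides occur in the polyprotein, and how many of them are actually cleaved. The Python sketch below illustrates both steps under stated assumptions. The toy sequence and the function names are hypothetical, and the full polyprotein sequence (Figure 3) is not reproduced here, so the reported totals are entered directly.

```python
# Candidate dipeptides in one-letter code:
# Glu-Gly = "EG", Glu-Ser = "ES", Glu-Thr = "ET".
CANDIDATES = ("EG", "ES", "ET")


def count_dipeptides(protein, dipeptides=CANDIDATES):
    """Count each dipeptide in a one-letter protein sequence."""
    return {
        dp: sum(1 for i in range(len(protein) - 1) if protein[i:i + 2] == dp)
        for dp in dipeptides
    }


# Totals reported in the text for the FMDV polyprotein, and the number
# of each that is used as a cleavage site according to the protein map.
PRESENT = {"EG": 14, "ES": 3, "ET": 9}
USED = {"EG": 5, "ES": 1, "ET": 1}


def utilization(present, used):
    """Fraction of the available dipeptides that serve as cleavage sites."""
    return {dp: used[dp] / present[dp] for dp in present}


# Hypothetical toy sequence, only to show the counting step.
toy_counts = count_dipeptides("MEGAESTETKEGL")
print(toy_counts)

for dp, frac in utilization(PRESENT, USED).items():
    print(dp, f"{USED[dp]}/{PRESENT[dp]}", f"{frac:.2f}")
print("total", sum(USED.values()), "of", sum(PRESENT.values()))
```

With the reported numbers, 7 of the 26 candidate dipeptides are used, which is why the primary sequence alone cannot account for the specificity.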
Homology to poliovirus
Gene organization and processing mechanisms predicted for FMDV from the
nucleotide sequence in Figure 3 are summarized in Figure 4 and compared to
those from poliovirus, a member of the enterovirus family.
As outlined in Figure 4a, the overall organization is well conserved between
the two genomes. This map is also similar to the approximate gene map
established for EMCV (29), another well studied virus belonging to a third
genus of Picornaviridae. In an overall comparison FMDV differs from poliovirus
by an increase in size of its genome by about 1000 nucleotides, most of which
are added in the 5' proximal part of the genome.
Within the polyprotein region the two genomes differ drastically only by the
addition of the L gene and two extra VPg genes in FMDV, and by the addition of
a third extended gene (2a or 2b) in the central part of the polio genome.
Other corresponding genes often differ in size. Thus, the capsid proteins are
larger in poliovirus, while FMDV has expanded its nonstructural proteins in
the P3 segment (Figure 4).
The alignment of the two genomes was refined using sequence homologies between
functionally corresponding segments. Using a dot matrix program, significant
homology was detected on the amino acid level throughout most of the coding
region, and to a lesser extent also on the nucleotide level, indicating common
structural features of general functional importance (14).
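The program and parameters used for the dot matrix comparison are not described here (see 14). As a minimal sketch of the general technique only, the Python example below marks every pair of window positions at which two sequences share at least a given number of identical residues; the window size, threshold, toy sequences and function names are all assumptions for illustration.

```python
def dot_matrix(seq_a, seq_b, window=3, min_matches=3):
    """Return (i, j) pairs whose windows share >= min_matches identities.

    i and j are 0-based start positions of a window in seq_a and seq_b.
    """
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if matches >= min_matches:
                dots.append((i, j))
    return dots


def render(seq_a, seq_b, dots):
    """Draw the matrix as text: rows follow seq_a, columns follow seq_b."""
    grid = [["." for _ in seq_b] for _ in seq_a]
    for i, j in dots:
        grid[i][j] = "*"
    return "\n".join("".join(row) for row in grid)


# Hypothetical toy sequences sharing the segment "CDEFG".
a = "ACDEFGHIK"
b = "WWCDEFGYY"
dots = dot_matrix(a, b)
print(dots)
print(render(a, b, dots))
```

A homologous segment shows up as a run of dots along one diagonal; lowering `min_matches` below `window` tolerates substitutions at the cost of more background dots.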
As shown in Figure 4, these sequence homologies are very high in certain parts
of two non-structural proteins, the polymerase (3d) and protein 2c (x/P34).
Lower, but still high, homology was detected between the protease genes and
the capsid proteins VP2 and VP3, indicating that the functional specificity of
these proteins was less stringently conserved during picornaviral evolution.
The evolutionary divergence is most pronounced in proteins 1d, 2a/2b and 3a.
Protein 1d (VP1) is the capsid protein most exposed at the surface of the
virion and, as a consequence of the pressure of the host's immune system, its
sequence is highly variable also between different aphthoviruses, giving rise
to seven serotypes and many more subtypes.
In contrast, proteins 2a/2b and 3a, although variable between FMDV and
poliovirus, are highly conserved between two FMDV serotypes as exemplified by
a comparison of FMDV O1K and
(unpublished results).
Therefore, we conclude that these latter proteins play an important role in
the FMDV life cycle but did coevolve with their targets, which are most likely
FMDV-specified molecules.
Following the same argument, we predict that the more generally conserved
picornaviral proteins (2c and 3d) depend in their functions on the interaction
with host factors of conserved structure. In this context it is of note that
sequences common to FMDV and poliovirus in proteins 2c and 3d are also present
in Cowpea Mosaic Virus, a plant virus which has thus been correlated with
picornaviruses (30).
The comparison of the two picornaviral genomes in Figure 4b also indicates
that these differ in size predominantly in regions with low sequence homology.
Sometimes extended blocks of nucleotides are also found inserted into well
conserved genes like in gene 1b (VP2).
Together with the addition of complete genes, like the L gene or the extra VPg
genes in FMDV, these data indicate that evolution of picornaviral genomes
involved the insertion or deletion of RNA segments of several hundred
nucleotides.
CONCLUSIONS
We report the nucleotide sequence of the complete coding part of an
aphthovirus genome. This sequence is useful in several ways for further
studies of the viral life cycle. It has already allowed exact predictions of
the genome organisation of these viruses and of the amino acid sequences of
their gene products. This information facilitates the identification and
characterization of these proteins, e.g. by antisera elicited against
synthetic peptides.
In addition, cDNA segments from specific parts of the genome can be expressed
in E. coli and used as specific antigens free of any other viral protein, or
be used as substrate for processing enzymes. Finally, FMDV genes and genomic
signals can be transferred as well-defined cDNA copies into animal cells and
their functions analyzed. Such studies should also answer questions as to the
function of the 1300-nucleotide-long leader sequence preceding the polyprotein
gene in the FMDV RNA.
During preparation of this manuscript, the complete coding sequence for the
FMDV polyprotein has been reported for strain A10 (31). While there are
extensive serotype related sequence variations in the structural proteins
(3, 12), the sequence differs in the non-structural genes from the O1K
sequence by 331 nucleotides (7.9%) and 44 predicted amino acids (3.1%),
provided that we neglect three T (U) residues (positions 5978/79, 6013/14,
6016/17) appearing in duplicate in the published A10 sequence.
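Difference figures of this kind come from counting mismatched positions between two aligned sequences and dividing by the aligned length. The Python sketch below shows that calculation on hypothetical toy sequences; the function names are illustrative. The last helper only back-calculates the approximate length implied by a count and a rounded percentage, so its results are rough estimates and not values reported in the paper.

```python
def count_differences(seq_a, seq_b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def percent_difference(seq_a, seq_b):
    """Mismatches as a percentage of the aligned length."""
    return 100.0 * count_differences(seq_a, seq_b) / len(seq_a)


def implied_length(differences, percent):
    """Approximate compared length implied by a count and a rounded percentage."""
    return differences / (percent / 100.0)


# Hypothetical toy RNA segments differing at two of ten positions.
toy_a = "AUGGCUAACC"
toy_b = "AUGGCAAACU"
print(count_differences(toy_a, toy_b), percent_difference(toy_a, toy_b))

# Rough lengths implied by the figures quoted above (estimates only).
print(round(implied_length(331, 7.9)), "nucleotides")
print(round(implied_length(44, 3.1)), "amino acids")
```

Because the percentages are rounded to one decimal place, the implied lengths are uncertain by a few percent.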
ACKNOWLEDGEMENTS
We are indebted to Joern Wolters and Oliver Steinau for the computer analyses,
and to Gudrun Feil for technical assistance. We also thank Dr. W. Keller and
Dr. K. Strohmaier for gifts of FMDV RNA. We acknowledge Dr. A. King and
Dr. M. Grubman for the communication of unpublished data.
This work was supported by research grants from Biogen S.A. and the Deutsche
Forschungsgemeinschaft (Forschergruppe Genexpression).
REFERENCES
1 Sangar, D.V. (1979) J. gen. Virol. 45, 1-13
2 Perez-Bercoff, R. (ed.) (1979) The Molecular Biology of the Picornaviruses. Plenum, New York
3 Beck, E., Forss, S., Strebel, K., Cattaneo, R. and Feil, G. (1983) Nucleic Acids Res. 11, 7873-7885
4 Johnson, R.A. and Walseth, T.F. (1979) Adv. cyclic Nucleotide Res. 10, 135-137
5 Kupper, H., Keller, W., Kurz, C., Forss, S., Schaller, H., Franze, R., Marquardt, O., Zaslavsky, V., and Hofschneider, P.-H. (1981) Nature 289, 555-559
6 Vieira, J. and Messing, J. (1982) Gene 19, 259-268
7 Strebel, K. (1982) Diplomarbeit, University of Heidelberg
8 Maxam, A.M. and Gilbert, W. (1980) Methods in Enzymol. 65, 499-560
9 Garoff, H. and Ansorge, W. (1981) Analyt. Biochem. 115, 454-457
10 Osterburg, G., Glatting, K.-H., and Sommer, R. (1982) Nucleic Acids Res. 10, 207-216
11 Zucker, M. and Stiegler, P. (1981) Nucleic Acids Res. 9, 133-148
12 Beck, E., Feil, G., and Strohmaier, K. (1983a) EMBO Journal 2, 555-559
13 Hattman, S., Brooks, J.E., and Masurekar, M. (1978) J. Mol. Biol. 126, 367-380
14 Forss, S. (1983) Ph.D. thesis, University of Heidelberg
15 Domingo, E., Davila, M., and Ortin, J. (1980) Gene 11, 333-346
16 Holland, J., Spindler, K., Horodyski, F., Grabau, E., Nichol, S., and VandePol, S. (1982) Science 215, 1577-1585
17 Harris, T.J.R., Robson, K.J.H., and Brown, F. (1980) J. gen. Virol. 50, 403-418
18 Sangar, D.V., Black, D.N., Rowlands, D.J., Harris, T.J.R., and Brown, F. (1980) J. Virol. 33, 59-68
19 Peacock, S.L., McIver, C.M., and Monahan, J.J. (1981) Biochim. Biophys. Acta 655, 243-250
20 Kozak, M. (1983) Microbiological Reviews 47, 1-45
21 Kitamura, N., Semler, B.L., Rothberg, P.G., Larsen, G.R., Adler, C.G., Dorner, A.J., Emini, E.A., Hanecak, R., Lee, J.J., VanderWerf, S., Anderson, C.W., and Wimmer, E. (1981) Nature 291, 547-553
22 Palmenberg, A.C., Kirby, E.M., Janda, M.R., Drake, N.L., Duke, G.M., Potratz, K.F., and Collett, M.S. (1984) Nucleic Acids Res. 12, 2969-2996
23 Hagenbüchle, O., Sauter, M., Steitz, J.A., and Maus, R.G. (1978) Cell 13, 551-563
24 Forss, S. and Schaller, H. (1982) Nucleic Acids Res. 10, 6441-6450
25 Robertson, B.H., Morgan, D.O., Moore, D.M., Grubman, M.J., Card, J., Fischer, T., Weddell, G., Dowbenko, D., and Yansura, D. (1983) Virology 126, 614-623
26 Kurz, C., Forss, S., Kupper, H., Strohmaier, K., and Schaller, H. (1981) Nucleic Acids Res. 9, 1919-1931
27 Strohmaier, K., Wittmann-Liebold, B., and Geissler, A.W. (1978) Biochem. Biophys. Res. Comm. 85, 1640-1645
28 Hanecak, R., Semler, B.L., Anderson, C.W., and Wimmer, E. (1982) Proc. Natl. Acad. Sci. U.S.A. 79, 3973-3977
29 Palmenberg, A.C. (1982) J. Virol. 44, 900-906
30 Franssen, H., Leunissen, J., Goldbach, R., Lomonossoff, G., and Zimmern, D., EMBO Journal, in press.
31 Carroll, A.R., Rowlands, D.J., and Clarke, B.E. (1984) Nucleic Acids Res. 12, 2461-2470