Download The novel genome organization of the insect picorna

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Interactome wikipedia , lookup

Ancestral sequence reconstruction wikipedia , lookup

Expression vector wikipedia , lookup

Polyadenylation wikipedia , lookup

RNA silencing wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Biochemistry wikipedia , lookup

Endogenous retrovirus wikipedia , lookup

Biosynthesis wikipedia , lookup

RNA wikipedia , lookup

Vectors in gene therapy wikipedia , lookup

Silencer (genetics) wikipedia , lookup

Nucleic acid analogue wikipedia , lookup

Western blot wikipedia , lookup

Protein wikipedia , lookup

Protein–protein interaction wikipedia , lookup

Epitranscriptome wikipedia , lookup

Protein structure prediction wikipedia , lookup

SR protein wikipedia , lookup

RNA-Seq wikipedia , lookup

Gene expression wikipedia , lookup

Two-hybrid screening wikipedia , lookup

Genetic code wikipedia , lookup

Proteolysis wikipedia , lookup

Plant virus wikipedia , lookup

Transcript
Journal
of General Virology (1998), 79, 191–203. Printed in Great Britain
...................................................................................................................................................................................................................................................................................
The novel genome organization of the insect picorna-like virus
Drosophila C virus suggests this virus belongs to a previously
undescribed virus family
Karyn N. Johnson1, 2 and Peter D. Christian2
1
2
Botany and Zoology Department, Australian National University, Canberra, Australia
CSIRO, Division of Entomology, GPO Box 1700, Canberra 2601, Australia
The complete nucleotide sequence of the genomic
RNA from the insect picorna-like virus Drosophila C
virus (DCV) was determined. The DCV sequence
predicts a genome organization different to that of
other RNA virus families whose sequences are
known. The single-stranded positive-sense genomic
RNA is 9264 nucleotides in length and contains two
large open reading frames (ORFs) which are separated by 191 nucleotides. The 5« ORF contains
regions of similarities with the RNA-dependent RNA
polymerase, helicase and protease domains of
viruses from the picornavirus, comovirus and sequi-
Introduction
The insect virus Drosophila C virus (DCV) has historically
been classified as an ungrouped member of the virus family
Picornaviridae (Murphy et al., 1995). Both DCV and a
serologically related insect virus, cricket paralysis virus (CrPV),
have a number of picornavirus-like characteristics including
their size, sedimentary coefficient, buoyant density, the number
and size of their structural proteins and polyadenylated
genomic RNA with a small protein (VPg) attached to the 5«
end (King & Moore, 1988 ; for review see Scotti et al., 1981).
The gene expression and replication strategies of both
DCV and CrPV have been extensively studied (for review see
Moore et al., 1985) using a combination of in vitro and in vivo
translational studies and comparisons have generally been
made between these viruses and the picornavirus paradigm.
Many good reviews of the events involved in picornaviral
translation and replication are available (see Ansardi et al.,
Author for correspondence : Peter Christian.
Fax ­61 6 246 4173. e-mail peterc!ento.csiro.au
The GenBank accession number of the sequence reported in this paper
is AF014388.
virus families. The 3« ORF encodes the capsid
proteins as confirmed by N-terminal sequence analysis of these proteins. The capsid protein coding
region is unusual in two ways : firstly the cistron
appears to lack an initiating methionine and secondly no subgenomic RNA is produced, suggesting
that the proteins may be translated through internal
initiation of translation from the genomic length
RNA. The finding of this novel genome organization
for DCV shows that this virus is not a member of the
Picornaviridae as previously thought, but belongs
to a distinct and hitherto unrecognized virus family.
1996 ; Carrasco, 1994 ; Stanway, 1990). There are some
similarities between the translation strategies of DCV and
CrPV and the picornaviruses. The translation products observed in DCV}CrPV infected Drosophila cells include several
large precursor proteins which can be chased into an array of
smaller proteins (Moore et al., 1980, 1981). Furthermore, in
vitro translation studies were used to show that at least some
of the proteolytic cleavages of the these precursor proteins are
performed by virally encoded products (Reavy & Moore,
1983 a). Pactamycin mapping of the proteins seen during
translation of CrPV in Drosophila cells indicated that the capsid
proteins were situated at the 5« end of the ORF that encodes
them (Reavy & Moore, 1983 b) ; the authors concluded that
this demonstrated that, like the picornaviruses, the capsid
proteins of CrPV are encoded at the 5« end of the genome.
Whilst there are similarities between DCV}CrPV and the
picornaviruses, there are also some fundamental differences.
One of the most intriguing is that replication of both DCV and
CrPV in vivo leads to the capsid proteins being produced in
supramolar excess in comparison to the non-structural proteins
(Moore et al., 1980, 1981). This would seem to contradict the
assumption that translation of these viruses involves production of a single polyprotein from which all the viral proteins
0001-5109 # 1998 SGM
BJB
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
K. N. Johnson and P. D. Christian
originate. Moreover, during in vitro translation of either DCV
or CrPV RNA in rabbit reticulocyte lysate, two of the capsid
proteins are synthesized in reduced quantities, and neither the
third capsid protein nor its precursor (i.e. VP0) can be detected
(Reavy & Moore, 1981 a, b, 1983 a). Interestingly, when the
rabbit reticulocyte lysate was supplemented with a Drosophila
cell lysate, CrPV RNA was able to direct the synthesis of VP0
(Reavy & Moore, 1983 c).
Although the biophysical properties of DCV and CrPV
particles indicate they may belong to the picornavirus family,
the available molecular data are equivocal. The sequence from
the 1600 nucleotides at the 3« end of the CrPV genome has
been published (King et al., 1987). It was initially speculated
that this region encoded the RNA-dependent RNA polymerase
(RdRp) (King et al., 1987) in a fashion similar to the
picornaviruses. However, on the basis of sequence similarities
with the coat proteins of other viruses Koonin & Gorbalenya
(1992) suggested that the region encodes the structural
proteins of CrPV. No further evidence has been generated to
determine which of these postulates is correct. However,
preliminary sequence data from the genome of DCV indicate
that DCV and CrPV share some similarity at the level of
nucleotide sequence (P. D. Christian, K. N. Johnson & F.
Weyts, unpublished data).
In order to gain a better understanding of the genomic
organization and replication of Drosophila C virus we examined
the genomic sequence of a DCV isolate. We report here on the
genome organization of the insect virus DCV, which not only
is different from that of the picornaviruses but is novel among
all documented RNA viruses.
Methods
+ Insects. The Boambee stock of Drosophila melanogaster was established from flies caught at Coffs Harbour (Australia). Flies were
maintained routinely at 25 °C on a maize meal media (18±8 %, w}v, yeast ;
10 %, w}v, maize meal ; 10 %, v}v, treacle ; 1 %, w}v, agar ; 0±26 %, w}v,
methyl p-hydroxybenzoate). A virus free Drosophila melanogaster population (referred to as BVF) was established essentially as described by
Brun & Plus (1980).
+ DCV isolates. The DCV isolate EB was originally isolated from an
Australian Drosophila melanogaster population (Christian, 1992). DCV
isolate C was isolated from a French D. melanogaster stock (Jousset et al.,
1972), M from a Moroccan population (Plus et al., 1975) and CYG from
an Australian population (Christian, 1987).
+ Virus growth and purification. The EB and CYG isolates were
passaged through virus free D. melanogaster by injection of a virus
suspension into the haemocoel of 400–500 flies (see Christian, 1992). The
M and C isolates were propagated in Drosophila melanogaster Line 2 cells
(DL2). All virus preparations were purified essentially as described by
Scotti (1976).
+ RNA preparation. Viral RNA was extracted using a method based
on the acid phenol–guanidinium thiocyanate method (Chomczynski &
Sacchi, 1987) as previously described (Johnson & Christian, 1996).
+ Synthesis of cDNA and cloning. Purified EB viral RNA was used
to synthesize cDNA using AMV reverse transcriptase (Promega) as
BJC
recommended by the supplier. First strand synthesis was initiated at the
3« terminus of the RNA with an oligo(dT) primer. Further cDNA
synthesis was initiated with oligonucleotide primers designed according
to sequences toward the end of the preceding clone. In each case, second
strand synthesis was performed using RNase H and E. coli polymerase I
(Promega). The cDNA was blunt-ended with T4 polymerase, size
selected on polyacrylamide gels and ligated into the EcoRV site of
Bluescript KS(­) (Stratagene). The ligation mixtures were used to
transform TG1 cells and later DH10β cells. Three overlapping clones
were obtained in this way which covered most of the DCV genome apart
from the very 5« end. To obtain a clone representing the 5« end of the
viral genome 5« RACE was performed using a system obtained from
Gibco}BRL as per the instructions. Protease treatment of RNA
(400 µg}ml proteinase K, 37 °C, 60 min) prior to this procedure ensured
that the VPg was removed to avoid interference in cloning the 5« end.
Fragments amplified during PCR were purified using a Qiaquick
purification column (Qiagen) and ligated into T-tailed pGEM vector
(Promega).
cDNA was also synthesized from the RNA of EB, C, CYG and M
isolates using the DCV specific oligonucleotide DCV-7 (see Fig. 1) and
AMV reverse transcriptase (Promega) as per the manufacturer’s instructions with the exception that incubation was at 60 °C for 15 min. PCR
fragments were amplified from these cDNAs using the DCV-7 and DCV8 oligonucleotides (see Fig. 1).
+ Nucleotide sequencing. cDNA clones were prepared for sequencing using exonuclease III digestion following the supplier’s
instructions (Promega). Clones and subclones were sequenced by the
dideoxy chain termination method (Sanger et al., 1977) using both dye
terminator and dye primer sequencing kits (ABI Prism}Perkin Elmer).
PCR products amplified using the DCV-7 and DCV-8 oligonucleotides
from the DCV-7 primed cDNAs were sequenced directly using the DCV7 and DCV-8 oligos and a dye terminator kit.
+ Protein sequencing. A 5 µl sample of DCV EB (2¬10""
particles}ml) was spotted onto PVDF membrane. N-terminal sequence
was determined using a Applied Biosystems Procise gas phase sequencer.
+ Sequence analysis. Nucleotide and amino acid sequence data
were assembled and analysed using the University of Wisconsin Genetics
Computer Group (GCG) program (Devereux et al., 1984). The sequence
data of other viruses were retrieved and analysed using BLAST (Altschul
et al., 1990) from composite non-redundant nucleotide and protein
databases which are maintained by the Australian National Genomic
Information Service (ANGIS).
+ Northern blot analysis. The general procedure of Sambrook et al.
(1989) was used for Northern blot hybridizations. The filters were probed
with $#P-labelled PCR products amplified from DCV EB cDNA with the
oligonucleotides DCV-1 and DCV-2 (see Fig. 1). RNA for analysis was
extracted from DCV infected and uninfected DL2 cells 4–27 h postinfection.
Results
cDNA clones and sequence analysis of DCV RNA
The sequence of the genomic RNA from DCV isolate EB
was constructed by compiling sequences from a series of four
overlapping cDNA clones (see Fig. 1). Both strands of each of
the cDNA clones were completely sequenced. Sequences in
some areas were confirmed by sequence from multiple clones
or PCR products. The 5« end of the viral genome was cloned
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
The Drosophila C virus genome
by 5« RACE ; the 5« terminal nucleotides were determined by
comparison of the sequence of five clones.
The DCV genome was found to be 9264 bases in length
excluding the 3« poly(A) tail and contains a majority of A}U
nucleotides (A-30 %, U-34 %, G-20 %, C-16 %).
Coding and non-coding regions of DCV genomic RNA
A computer assisted analysis of the nucleotide sequence of
DCV showed that the genomic RNA contains two large open
reading frames (ORFs), nucleotides 775–6075 (ORF-1) and
6420–8969 (ORF-2). One more ORF (greater than 10 kDa)
was identified, nucleotides 2225–2602, which is contained
within the sequence encoding ORF-1 but in the ­1 frame.
ORF-1 and ORF-2 account for 86 % of the DCV genome ; the
other 14 % consists of non-coding or untranslated regions.
These include the 798 nucleotide 5« UTR, a 191 nucleotide
intergenic region (see below) which separates ORF-1 from
ORF-2, and a 295 nucleotide 3« UTR [excluding the poly(A)
tail].
There are seven AUG initiation codons within the first 150
nucleotides of ORF-1. It is unlikely that translation is initiated
at the first AUG (nucleotides 775–777) as the sequence context
surrounding this codon (CACAUGA) is not optimal ; in
particular the pyrimidine C at the ®3 position is disfavourable
for initiation (Cavener & Ray, 1991). It is predicted that the
second AUG codon (799–801) is recognized as the translation
initiation site as the surrounding nucleotides (AAAAUGG)
concur with the most common initiation sequence found in
non-vertebrates (ANNAUGG) (Cavener & Ray, 1991). If the
assignment of this second AUG as the initiation codon is
correct, the coding capacity of ORF-1 is 1759 amino acids
forming a polyprotein with a calculated molecular mass of
202 kDa.
Sequence of capsid proteins and the organization of
ORF-2
The size of the ORFs identified for the DCV genome
indicated that the capsid proteins must be derived from
processing of the proteins encoded by one of the two major
ORFs. To allow identification of the capsid coding regions the
N-terminal sequence of three of the capsid proteins was
determined by sequencing the proteins contained in whole
capsids. Two strong signals and a third weaker signal could be
identified in each of the six sequencing cycles. The sequences
of two of the proteins were confirmed by N-terminal sequence
from two of the capsid proteins which were separated by
SDS–PAGE (data not shown) ; therefore the remaining sequence was assumed to be from a third capsid protein.
By comparing the N-terminal sequence of the capsid
proteins to the deduced amino acid sequences of ORF-1 and 2,
it was found that two of the capsid proteins were encoded by
sequences in ORF-2 (SKPTVQ and VMGEDL). The third Nterminal sequence (ANFQTN) was found to be situated
upstream of the first methionine of ORF-2 (see Fig. 1). This
sequence is encoded in the same frame ; thus if there were
another initiation codon which was recognized 5« of this
sequence then all three capsid proteins would be produced as
a single polyprotein. A methionine initiation codon was found
36 nucleotides 5« of the alanine codon (nucleotides 6267–
6269), identified as the N-terminal amino acid residue of this
capsid protein ; however, an in-frame stop codon lies 6
nucleotides upstream from this alanine codon.
It was necessary to establish whether the sequence around
the termination codon was authentic or an artefact of the
cloning and}or sequencing procedure. PCR was used to
amplify a 342 bp DNA fragment (nucleotides 6051–6393)
which included the intergenic region, the termination codon
and region encoding the capsid protein N terminus. This
region was amplified from cDNA synthesized from EB RNA
using three different reverse transcriptases and cDNA synthesized from RNA extracted from three further DCV isolates (M,
C and CYG). Direct sequencing of the PCR fragments amplified
from this region indicated that the nucleotide sequence
preceding the start of the capsid protein coding region was
invariant between the four DCV isolates and was the same for
cDNA synthesized with each of the reverse transcriptases.
To simplify the analysis we made the assumption that
translation of the capsid polyprotein starts at the alanine
residue. Thus the ORF includes nucleotides 6267–8972,
representing an amino acid coding capacity of 901 residues.
This polyprotein has a calculated molecular mass of 100 kDa.
Proteolytic cleavage of this polyprotein at sites immediately
preceding the N-terminal protein sequences would produce
three proteins of predicted molecular mass of 37±7, 33±3 and
29±2 kDa.
DCV does not produce a subgenomic RNA
Northern blot analysis of the RNA produced during DCV
infection of DL2 cells confirmed that no subgenomic RNA of
the correct size to encode the capsid polyprotein was
synthesized (see Fig. 2). No hybridization was seen to
uninfected DL2 extracts (data not shown).
Similarities to other viruses
(i) Capsid proteins
The amino acid sequence of each of the capsid proteins was
used individually to search the non-redundant protein database
for similar sequences. The DCV 37±7 kDa capsid protein was
most similar to the VP2 protein of hepatitis A virus (HAV)
(Najarian et al., 1985), with 21 % identity. The DCV 29±2 kDa
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
BJD
K. N. Johnson and P. D. Christian
Fig. 1. For legend see p. 197.
BJE
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
The Drosophila C virus genome
Fig. 1. For legend see p. 197.
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
BJF
K. N. Johnson and P. D. Christian
Fig. 1. For legend see facing page.
BJG
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
The Drosophila C virus genome
distinct regions of similarity were detected which correspond
to the previously identified conserved motifs of the helicase,
protease and RdRp (Koonin & Dolja, 1993) found in viruses
from the picorna-like superfamily.
Fig. 2. Northern blot analysis of DCV RNAs accumulated in infected DL2
cells (27 h post-infection) containing sequences from the 3« region of the
genome. The genomic length RNA (G) and expected sized of a capsid
protein subgenomic RNA (C) are indicated. No hybridization was seen to
uninfected DL2 extracts (not shown).
capsid protein sequence was found to be similar to the
translation of the 3« end of the partial CrPV sequence (King et
al., 1987), showing 47 % identity and 73 % similarity in their
amino acid sequences in this region. The DCV 33±3 kDa capsid
protein is closely related to a portion of the CrPV virus
sequence (amino acids 1–197) with 74 % amino acid similarity.
This DCV capsid protein shares some sequence similarity with
the VP3 of the hepatovirus, enterovirus and cardiovirus genera
of the Picornaviridae and to one of the capsid proteins of each
of the sequiviruses RTSV (rice tungro spherical virus) (CP-2)
and PYFV (parsnip yellow fleck virus) (26K).
RdRp domain. The C-terminal region of the polyprotein
encoded by ORF-1 contains all of the conserved motifs
identified as common to the RdRp domain of replicases of
positive-strand RNA viruses (Koonin & Dolja, 1993). Furthermore, the DCV sequence shows strong agreement with the
eight consensus motifs for the RNA-dependent RNA polymerase found in Supergroup 1 of Koonin (1993), with only one
exception : motif five contains one change, a cysteine (amino
acid 1569) where all other members of this group encode a
serine or a threonine. The nucleotide sequence across this
region was the same from four independent clones.
An alignment of the conserved region of the RdRp domain
from the replicase of viruses representing the picornavirus
family (HAV), and the sequivirus (PYFV), waikavirus (RTSV),
comovirus (cowpea mosaic virus – CPMV) and nepovirus
(tomato black ring virus – TBRV) genera and including DCV is
shown in Fig. 3. On the basis of distances calculated by
CLUSTALW (Table 1) DCV RdRp is most like that of RTSV ;
however, the sequence most similar to RTSV is PYFV.
Amongst this group of viruses the most distant relative DCV
is HAV. The DCV RdRp shares more sequence similarity with
each of the plant viruses RdRp than HAV does.
The nucleotide binding (helicase) domain. The helicase domain
is found around 400 amino acids from the N terminus of the
replicase protein. An alignment spanning the helicase domain
motifs identified by Koonin (1993) is shown in Fig. 4. Of the
three conserved motifs (A, B, C) identified by Koonin, the
putative DCV helicase fully matches only B. In the A motif
DCV differs at amino acid 442, which is an arginine, whereas
the consensus indicates that this residue is usually a aspartic or
glutamic acid. In DCV, motif C deviates somewhat from the
consensus sequence but the motif is still recognizable. The
distances generated from this helicase alignment (Table 1)
indicate that in this region of the replicase, DCV is the least
closely related virus to each of the other viruses.
Protease domain. A region (amino acids 987–1174) of the DCV
(ii) Conserved protein motifs in ORF-1
The non-redundant protein database was searched for
sequences with similarity to those in ORF-1. The DCV
polyprotein has similarities to the replicase of viruses from the
Picornaviridae, Sequiviridae and Comoviridae families. Three
replicase protein was similar in sequence to picornaviral 3C
proteases and the putative protease domain of the comovirus
and sequivirus families. The alignment in Fig. 5 allowed us to
identify candidates in DCV for the conserved histidine and
cysteine residues of the catalytic triad. The overall similarity
between the protease domains is low (data not shown).
Fig. 1. Ribonucleotide sequence of the DCV genome. The deduced amino acid sequence for the two large ORFs is shown
under the nucleotide sequence. Termination codons are indicated by an asterisk. The amino acids identified by N-terminal
sequencing of the capsid proteins are underlined and in bold. The nucleotide sequences underlined are complementary (DCV2, DCV-7) or analogous (DCV-1, DCV-8) to the dideoxy oligonucleotides synthesized for use in cDNA and PCR amplifications.
The cDNA clones used to produce this sequence cover the following ranges : nucleotides 1–746, pGEMT EB846 ; 678–3309,
pBS EB606 ; 3136–5047, pBS EB201 ; 4852–9264, pBS EB50.
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
BJH
K. N. Johnson and P. D. Christian
Fig. 3. Alignment of the putative RdRp domain of the DCV replicase with those of other viruses. The viruses and references are
as follows : HAV (hepatitis A virus), Najarian et al. (1985) ; TBRV (tomato black ring virus), Greif et al. (1988) ; CPMV (cowpea
mosaic virus), Lomonossoff & Shanks (1983) ; RTSV (rice tungro spherical virus), Shen et al. (1993) ; PYFV (parsnip yellow
fleck virus), Turnbull-Ross et al. (1992). Residues identical in at least four viruses are shaded. The motifs designated by Koonin
(1993) are labelled I–VIII.
Table 1. Pairwise distance comparisons of amino acid
sequences of the RNA-dependent RNA polymerase
domain on the upper triangle and for the helicase domain
on the lower triangle
1993). Fig. 6 outlines the genomic organization of representative viruses from these groups as compared to DCV.
Distance values were calculated using CLUSTALW for the alignments
shown in Figs 3 and 4.
Analysis of the sequence of the DCV genome indicates that
the genomic organization and replication strategy of this virus
differs from other documented RNA viruses. DCV was
previously considered to have a genome organization and
replication strategy similar to that of the picornaviruses. That
is, the DCV genome encoded a single polyprotein which was
post-translationally cleaved to liberate all structural and nonstructural viral proteins. There are some similarities between
the picornaviruses and DCV, such as the existence of a long 5«
UTR and some domains of amino acid sequence similarity ;
however, there are also some fundamental differences.
The most significant of the differences is the fact that DCV
appears to have two distinct ORFs and the initiation of
translation from ORF-2 is seemingly unusual. The capsid
proteins are encoded downstream of the replicase proteins in a
different reading frame which is separated from the replicase
reading frame by 191 nucleotides. There are three conceivable
DCV
RTSV
PYFV
CPMV
TBRV
HAV
DCV
RTSV
PYFV
CPMV
TBRV
HAV
–
0±787
0±745
0±754
0±781
0±759
0±701
–
0±596
0±688
0±745
0±723
0±722
0±556
–
0±703
0±723
0±693
0±727
0±685
0±651
–
0±606
0±669
0±721
0±701
0±699
0±579
–
0±652
0±764
0±757
0±749
0±763
0±758
–
Conservation of gene order. The organization of the replicase
domains identified in this analysis follows a similar order as
that of the picorna-like superfamily (see Koonin & Dolja,
BJI
Discussion
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
The Drosophila C virus genome
Fig. 4. Alignment of the putative helicase domain of the DCV replicase with those of other viruses. Viruses and references are
as Fig. 3. Residues identical in at least four viruses are shaded. The motifs designated by Koonin (1993) are labelled A, B and
C.
Fig. 5. Alignment of the putative protease domain of the DCV replicase with those of other viruses. Viruses and references are
as Fig. 3. Residues identical in at least three viruses are shaded. The (putative) catalytic residues identified in Koonin (1993)
are shown by asterisks.
ways that translation of the downstream ORF (ORF-2) could
proceed : (1) the stop codons could be suppressed allowing
readthrough translation, (2) a ribosomal frameshift may occur
between the two ORFs, (3) a subgenomic RNA may be
produced during replication which includes ORF-2 or (4)
translation of ORF-2 is initiated independently. The first and
second possibilities are unlikely as there are five or more stop
codons in each of the reading frames and the product of ORF2 is produced in excess of that of ORF-1. The third hypothesis
represents a strategy commonly used by RNA viruses, but
contradicts previous reports (Reavy et al., 1983) and the results
of Northern analysis presented here which indicate that no
subgenomic RNA is produced during DCV replication. These
observations indicate that translation of ORF-2 is probably
independently, and possibly internally, initiated – although
direct supporting evidence for this is lacking.
There is evidence that internal initiation is utilized by the
picornaviruses to achieve translation of the single polyprotein
(see Belsham & Sonenberg, 1996) and indeed the long 5« UTR
of DCV indicates that a similar strategy may be used to achieve
translation of ORF-1. Where DCV is unusual is that if
translation of ORF-2 is initiated independently of ORF-1 from
the genomic RNA, the DCV RNA is acting as a bicistronic
message. Infectious genetically engineered picornaviruses have
been made in which the capsid and non-structural proteins
were separated by the 5« UTR of another picornavirus,
therefore resulting in an artificially constructed bicistronic
virus (for example Pelletier & Sonenberg, 1988). Recently, a
tobamovirus has been shown to have a naturally occurring
IRES (internal ribosome entry site) which functions in vitro to
produce the 17 kDa capsid protein (Ivanov et al., 1997) ; it is
not clear at this time whether this IRES functions in vivo.
The second way that the translation of ORF-2 is unusual is
that it appears to lack an initiating methionine codon. The Nterminal sequence of the 37±7 kDa capsid protein begins with
an alanine. According to the conceptual translation of DCV
genomic sequence this alanine lies two amino acid residues
downstream of an Ochre stop codon. There is a single
methionine which is situated 12 amino acids upstream of the
alanine residue, but it is separated from the alanine by the stop
codon. For this methionine to be the initiator of the capsid
protein the stop codon must be read through and then the
protein post-translationally modified for the alanine to become
the N-terminal residue. Alternatively, this alanine codon may
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
BJJ
K. N. Johnson and P. D. Christian
Fig. 6. Comparison of the genome arrangements of some viruses from the picorna-like superfamily with that of DCV. The
approximate position of domains identified by sequence similarity are denoted by : +, RdRp ; U, helicase domains ; E,
protease domains. CP denotes capsid proteins, AAA denotes the poly(A) tail, , indicate the VPg (? are shown when the
presence of a VPg has not been proven). Solid lines indicate where cleavages are known to occur in the polyprotein and
dashed lines indicate the uncertainty of the position of the C terminus of the RTSV and PYFV capsid proteins. The position of
the VPg coding region is shown by a shaded box in HAV, TBRV and CPMV.
act as the initiating codon ; we are unaware, however, of other
circumstances in which an alanine codon serves the role of a
translation initiation codon. Obviously, further experimentation is essential to unravel the events involved in the
initiation of translation of this unusual ORF and experiments
are currently under way to determine whether ORF-2
translation is initiated via an IRES.
Analysis of the proteins predicted from the two major
ORFs indicated some regions of similarity to a number of other
viruses from the Picornaviridae, Sequiviridae and Comoviridae. In
general, the level of sequence similarity is low (about 20–30 %
identity) ; however, it is considered that these levels are
significant when considering RNA viruses due to their high
mutation rate (see Candresse et al., 1990 ; Contamine et al.,
1989 ; Goldbach & Haan, 1994 ; Goldbach & Wellink, 1988 ;
Koonin & Dolja, 1993). Although some caution should be
exercised when inferring higher order virus taxonomy on the
basis of sequence similarities alone (see Zanotto et al., 1996),
there seems to be substantial evidence from a range of
characteristics that several families of RNA viruses of plants
and animals, including the Picornaviridae, Sequiviridae, Comoviridae and Caliciviridae, are linked in a higher order grouping
or ‘ superfamily ’. Clearly, from the similarities found between
DCV and members of the picorna-like superfamily, it is a
member of this superfamily.
CAA
Alignments of the regions common to the representatives
of these virus families allow functions to be suggested for some
(but not all) of the protein domains of ORF-1. ORF-1 likely
encodes a helicase, 3C-like protease and RdRp. These genes are
organized in a similar order to viruses of the picorna-like
lineage of superfamily 1 described by Koonin (1993). By
analogy, it is expected that this ORF directs translation of a
single polyprotein which is later cleaved into functional protein
units. The calculated molecular mass of the polyprotein
encoded by ORF-1 is 202 kDa. When Drosophila cells treated
with the protease inhibitor iodoacetamide were infected with
DCV, the largest translation product identified was a protein of
with an estimated size of 212 kDa (Moore et al., 1981). This
212 kDa protein was not found in untreated infected Drosophila
cells, indicating that it probably acts as a precursor polyprotein
(Moore et al., 1981). It is therefore possible that the previously
observed 212 kDa protein is the translation product of ORF-1,
and further that this protein acts as a precursor for the replicase
proteins.
The DCV capsid proteins were first analysed by Jousset et
al. (1977), who found that purified virions contained three
major proteins in roughly equimolar amounts, of size 31, 30
and 28 kDa. In addition, two minor proteins of 37 and 8±5 kDa
were also detected. These proteins were named VP1, VP2,
VP3, VP0 and VP4, respectively, because of the apparent
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
The Drosophila C virus genome
similarities with the picornaviruses. It has been demonstrated
that the DCV VP0 is the precursor for VP3 and VP4 (Moore
et al., 1981) and there is some evidence that the capsid proteins
are processed from a distinct precursor of about 100 kDa in
size (King et al., 1984 ; Moore & Pullin, 1983).
Comparison of the N-terminal sequence of the DCV particle
proteins with the conceptual translation of ORF-2 suggests
that the ORF encodes a polyprotein which is cleaved into at
least three capsid proteins. The calculated sizes of the capsid
proteins predicted in this study match well with three of the
proteins previously described for DCV (Jousset et al., 1977),
namely VP0 (37±7 kDa), VP1 (33±3 kDa) and VP2 (29±2 kDa).
Therefore, it is likely that the N-terminal sequence of either
VP4 or VP3 (depending on which of these is found at the C
terminus of VP0) was not identified during sequencing. It is
possible that this protein was N-terminally blocked or that it
was not in high enough concentration to be identified.
DCV has been shown to be similar to viruses from several
families at the level of amino acid sequence. The genomic
organization of DCV, however, fundamentally differs from
each of these virus families. The Picornaviridae and Sequiviridae
encode a single polyprotein which contains the capsid proteins
toward the N-terminal regions and replicase proteins in the Cterminal regions of the polyprotein [as the exception RTSV
also encodes two small ORFs of unknown function which may
be translated from subgenomic RNAs (see Shen et al., 1993 ;
Thole & Hull, 1996)]. The Comoviridae have two genomic
RNAs, one encoding the capsid protein(s) and the other
encoding the replicase proteins. It should be noted that DCV
is similar to the Caliciviridae inasmuch as the calicivirus capsid
protein is encoded downstream of the replicase proteins.
However, the single calicivirus capsid protein is either
translated as part of a large polyprotein or from a subgenomic
RNA (Murphy et al., 1995) and furthermore, when the amino
acid sequences of feline calicivirus (Carter et al., 1992) were
included in analyses of the replicase domains, it was shown to
be less related to DCV than the other viruses described (K. N.
Johnson & P. D. Christian, unpublished observation).
Originally, it was suggested that DCV and CrPV encoded
the capsid proteins at the 5« end of the genome ; this hypothesis
was based on the sequence of a limited portion of the CrPV
genome (King et al., 1987) and pactamycin mapping of CrPV
proteins (Reavy & Moore, 1983 b). The DCV sequence shows
that the capsid proteins are encoded in the 3« region of the
DCV genome. Given the high level of sequence similarity
between the DCV capsid and the 3« end of the CrPV genome
(greater than 70 % amino acid similarity), it is reasonable to
assume that the capsid proteins are encoded in the 3« region of
the CrPV genome. This conclusion is consistent with the
reassessment by Koonin and Gorbalenya (1992) of the CrPV 3«
end sequence (King et al., 1987).
The genome organization indicated by the DCV sequence
seems to be incompatible with the pactamycin studies on CrPV
by Reavy & Moore (1983 b) ; however, re-evaluation of the
pactamycin mapping data shows that they actually support the
genome organization identified in this report. The CrPV capsid
proteins mapped to the 5« end of their ORF, but several other
proteins (denoted S, P and Q) also mapped to this region
(Reavy & Moore, 1983 b) ; the authors suggested that these
may be ‘ cleavage products synthesized during capsid protein
production ’. We believe that it is more likely that proteins S, P
and Q are actually functional proteins encoded by ORF-1 ;
hence these proteins have similar pactamycin ratios to the
capsid proteins as they are a proportional distance from the N
terminus of the ORF-1 encoded polyprotein as the capsid
proteins are from the N terminus of the ORF-2 encoded
polyprotein. In support of this hypothesis, proteins S, P and Q
were produced in much reduced quantities as compared to the
capsid proteins (Reavy & Moore, 1983 b), whereas we would
expect them to be in equimolar amounts if produced from the
same polyprotein. The results of the pactamycin study are
consistent with the 3« positioning of the capsid proteins of
CrPV if, like DCV, CrPV is found to contain two major ORFs
which are independently translated. We have recently sequenced 4000 nucleotides from the 3« end of the CrPV genome
and found that the capsid proteins are encoded toward the 3«
end of the genome and the RdRp is encoded upstream of the
capsid proteins. The ORFs encoding these two proteins are
separated by an untranslated sequence of similar length and
sharing 76 % identity with the DCV sequence (T. N. Hanzlik,
K. N. Johnson & P. D. Christian, unpublished observation).
Although much experimentation is still needed to confirm
the expression and replication strategies used by DCV, the
information provided from the sequence of the genomic RNA
indicates that the genome of this virus is organized in a way
that is fundamentally different from the genomes of the RNA
viruses described thus far. Comparisons of the amino acid
sequences from several proteins of viruses from the families
Sequiviridae, Comoviridae and Picornaviridae show that DCV
shares some sequence similarities with them, but is less similar
to any of them than viruses from other genera within that
family. These findings lead to the conclusion that DCV, and
probably CrPV, should be considered members of a distinct
and hitherto unrecognized virus family.
The authors wish to thank Jillian McGovern of the Biomolecular
Resource Facility of the Australian National University for sequencing of
the DCV peptides. Terry Hanzlik, Peter Waterhouse and Karl Gordon are
gratefully acknowledged for discussions and}or critical comments on the
manuscript.
References
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J.
(1990). Basic local alignment search tool. Journal of Molecular Biology
215, 403–410.
Ansardi, D. C., Porter, D. C., Anderson, M. J. & Morrow, C. D. (1996).
Poliovirus assembly and encapsidation of genomic RNA. Advances in
Virus Research 46, 1–68.
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
CAB
K. N. Johnson and P. D. Christian
Belsham, G. J. & Sonenberg, N. (1996). RNA–protein interactions in
regulation of picornavirus RNA translation. Microbiological Reviews 60,
499–511.
Brun, G. & Plus, N. (1980). The viruses of Drosophila. In The Genetics and
Biology of Drosophila, pp. 625–702. Edited by M. Ashburner & T. F. R.
Wright. New York : Academic Press.
Candresse, T., Morch, M. D. & Dunez, J. (1990). Multiple alignment
and hierarchical clustering of conserved amino acid sequences in the
replication-associated proteins of plant RNA viruses. Research in Virology
141, 315–329.
Carrasco, L. (1994). Picornavirus inhibitors. Pharmacology and Therapeutics 64, 215–290.
Carter, M. J., Milton, I. D., Meanger, J., Bennett, M., Gaskell, R. M. &
Turner, P. C. (1992). The complete nucleotide sequence of a feline
calicivirus. Virology 190, 443–448.
Cavener, D. R. & Ray, S. C. (1991). Eukaryotic start and stop translation
sites. Nucleic Acids Research 19, 3185.
Chomczynski, P. & Sacchi, N. (1987). Single-step method of RNA
isolation by acid guanidinium thiocyanate–phenol–chloroform extraction. Analytical Biochemistry 162, 156–159.
Christian, P. D. (1987). Studies on Drosophila C and A viruses in Australian
populations of Drosophila melanogaster. PhD thesis, Australian National
University.
Christian, P. D. (1992). A simple vacuum dot-blot hybridization assay
for the detection of Drosophila A and C viruses in single Drosophila.
Journal of Virological Methods 38, 153–165.
Contamine, D., Petitjean, A.-M. & Ashburner, M. (1989). Genetic
resistance to virus infection : the molecular cloning of a Drosophila gene
that restricts infection by the rhabdovirus sigma. Genetics 123, 525–533.
Devereux, J., Haeberli, P. & Smithies, O. (1984). A comprehensive set
of sequence analysis programs for the Vax. Nucleic Acids Research 12,
387–395.
Goldbach, R. & Haan, P. (1994). RNA viral supergroups and the
evolution of RNA viruses. In The Evolutionary Biology of Viruses, pp.
105–119. Edited by S. S. Morse. New York : Raven Press.
Goldbach, R. & Wellink, J. (1988). Evolution of plus-strand RNA
viruses. Intervirology 29, 260–267.
Greif, C., Hemmer, O. & Fritsch, C. (1988). Nucleotide sequence of
tomato black ring virus. Journal of General Virology 69, 1517–1529.
Ivanov, P. A., Karpova, O. V., Skulachev, M. V., Tomashevskaya, O. L.,
Rodionova, N. P., Dorokhov, Y. L. & Atabekov, J. G. (1997). A
tobamovirus genome that contains an internal ribosome entry site
functional in vitro. Virology 232, 32–43.
Johnson, K. N. & Christian, P. D. (1996). A molecular taxonomy for
cricket paralysis virus including two new isolates from Australian
populations of Drosophila (Diptera : Drosophilidae). Archives of Virology
141, 1509–1522.
Jousset, F. X., Plus, N., Croizier, G. & Thomas, M. (1972). Existence
chez Drosophila de deux groupes de picornavirus de proprie! te! s se! rologiques et biologiques diffe! rentes. Comptes Rendus de l’Academie des Sciences
(Paris) 275, 3043–3046.
Jousset, F., Bergoin, M. & Revet, B. (1977). Characterization of the
Drosophila C virus. Journal of General Virology 34, 269–285.
King, L. A. & Moore, N. F. (1988). Evidence for the presence of a
genome-linked protein in two insect picornaviruses, cricket paralysis and
Drosophila C viruses. FEMS Microbiology Letters 50, 41–44.
King, L. A., Pullin, J. S. K. & Moore, N. F. (1984). Characterisation of a
picornavirus isolated from a tumorous blood cell line of Drosophila
melanogaster. Microbios Letters 26, 121–127.
CAC
King, L. A., Pullin, J. S. K., Stanway, G., Almond, J. W. & Moore, N. F.
(1987). Cloning of the genome of cricket paralysis virus : sequence of the
3« end. Virus Research 6, 331–344.
Koonin, E. V. & Dolja, V. V. (1993). Evolution and taxonomy of
positive-strand RNA viruses : Implications of comparative analysis of
amino acid sequences. Critical Reviews in Biochemistry and Molecular
Biology 28, 375–430.
Koonin, E. V. & Gorbalenya, A. E. (1992). An insect picornavirus may
have genome organization similar to that of caliciviruses. FEBS Letters
297, 81–86.
Lomonossoff, G. P. & Shanks, M. (1983). The nucleotide sequence of
cowpea mosaic virus B RNA. EMBO Journal 2, 2253–2258.
Moore, N. F. & Pullin, J. S. K. (1983). Heat shock used in combination
with amino acid analogues and protease inhibitors to demonstrate the
processing of proteins of an insect picornavirus (Drosophila C virus) in
Drosophila melanogaster cells. Annales de Virologie 134, 285–292.
Moore, N. F., Kearns, A. & Pullin, J. S. K. (1980). Characterization of
cricket paralysis virus-induced polypeptides. Journal of Virology 33, 1–9.
Moore, N. F., Reavy, B., Pullin, J. S. K. & Plus, N. (1981). The
polypeptides induced in Drosophila cells by Drosophila C virus (strain
Ouarzazate). Virology 112, 411–416.
Moore, N. F., Reavy, B. & King, L. A. (1985). General characteristics,
gene organization and expression of small RNA viruses of insects. Journal
of General Virology 66, 647–659.
Murphy, F. A., Fauquet, C. M., Bishop, D. H. L., Ghabrial, S. A., Jarvis,
A. W., Martelli, G. P., Mayo, M. A. & Summers, M. D. (editors) (1995).
Virus Taxonomy : Sixth Report of the International Committee on Taxonomy
of Viruses. New York & Vienna : Springer-Verlag.
Najarian, R., Caput, D., Gee, W., Potter, S. J., Renard, A., Merryweather, J., Nest, G. V. & Dina, D. (1985). Primary structure and gene
organization of human hepatitis A virus. Proceedings of the National
Academy of Sciences, USA 82, 2627–2631.
Pelletier, J. & Sonenberg, N. (1988). Internal initiation of translation of
eukaryotic mRNA directed by a sequence derived from poliovirus RNA.
Nature 334, 320–325.
Plus, N., Croizier, G., Jousset, F. X. & David, J. (1975). Picornaviruses
of laboratory and wild Drosophila melanogaster : geographical distribution
and serotypic composition. Annales de Microbiologie 126A, 107–117.
Reavy, B. & Moore, N. F. (1981 a). Cell-free translation of cricket
paralysis virus RNA : analysis of the synthesis and processing of virusspecified proteins. Journal of General Virology 55, 429–438.
Reavy, B. & Moore, N. F. (1981 b). In vitro translation of cricket paralysis
virus RNA. Archives of Virology 67, 175–180.
Reavy, B. & Moore, N. F. (1983 a). Cell-free translation of Drosophila C
virus RNA : identification of a virus protease activity involved in capsid
protein synthesis and further studies on in vitro processing of cricket
paralysis virus specified proteins. Archives of Virology 76, 101–115.
Reavy, B. & Moore, N. F. (1983 b). The gene organization of a small
RNA-containing insect virus : comparison with that of mammalian
picornaviruses. Virology 131, 551–554.
Reavy, B. & Moore, N. F. (1983 c). An inhibitor-resistant protease
specified by an insect picornavirus, and the role of cellular proteases in the
rapid processing of capsid protein precursors. Journal of General Virology
64, 1831–1833.
Reavy, B., Crump, W. A. L. & Moore, N. F. (1983). Characterization of
cricket paralysis virus and Drosophila C virus-induced RNA species
synthesized in infected Drosophila melanogaster cells. Journal of Invertebrate
Pathology 41, 397–400.
Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989). Molecular Cloning : A
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
The Drosophila C virus genome
Laboratory Manual, 2nd edn. Cold Spring Harbor, NY : Cold Spring
Harbor Laboratory.
Sanger, F., Nicklen, S. & Coulson, A. R. (1977). DNA sequencing with
chain-terminating inhibitors. Proceedings of the National Academy of
Sciences, USA 74, 5463–5467.
Scotti, P. D. (1976). Cricket paralysis virus replicates in cultured
Drosophila cells. Intervirology 6, 333–342.
Scotti, P. D., Longworth, J. F., Plus, N., Croizier, G. & Reinganum, C.
(1981). The biology and ecology of strains of an insect small RNA virus
complex. Advances in Virus Research 26, 117–143.
Shen, P., Kaniewska, M., Smith, C. & Beachy, R. N. (1993). Nucleotide
sequence and genomic organization of rice tungro spherical virus.
Virology 193, 621–630.
Stanway, G. (1990). Structure, function and evolution of picornaviruses.
Journal of General Virology 71, 2482–2501.
Thole, V. & Hull, R. (1996). Rice tungro spherical virus : nucleotide
sequence of the 3« genomic half and studies on the two small 3« open
reading frames. Virus Genes 13, 239–246.
Turnbull-Ross, A. D., Reavy, B., Mayo, M. A. & Murant, A. F. (1992).
The nucleotide sequence of parsnip yellow fleck virus : a plant picornalike virus. Journal of General Virology 73, 3203–3211.
Zanotto, P. M. D., Gibbs, M. J., Gould, E. A. & Holmes, E. C. (1996). A
reevaluation of the higher taxonomy of viruses based on RNA
polymerases. Journal of Virology 70, 6083–6096.
Received 25 July 1997 ; Accepted 9 September 1997
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Mon, 19 Jun 2017 02:33:26
CAD