Journal of General Virology (1998), 79, 191–203. Printed in Great Britain

The novel genome organization of the insect picorna-like virus Drosophila C virus suggests this virus belongs to a previously undescribed virus family

Karyn N. Johnson1,2 and Peter D. Christian2

1 Botany and Zoology Department, Australian National University, Canberra, Australia
2 CSIRO, Division of Entomology, GPO Box 1700, Canberra 2601, Australia

Author for correspondence: Peter Christian. Fax 61 6 246 4173. e-mail peterc@ento.csiro.au
The GenBank accession number of the sequence reported in this paper is AF014388.

The complete nucleotide sequence of the genomic RNA from the insect picorna-like virus Drosophila C virus (DCV) was determined. The DCV sequence predicts a genome organization different to that of other RNA virus families whose sequences are known. The single-stranded positive-sense genomic RNA is 9264 nucleotides in length and contains two large open reading frames (ORFs) which are separated by 191 nucleotides. The 5′ ORF contains regions of similarities with the RNA-dependent RNA polymerase, helicase and protease domains of viruses from the picornavirus, comovirus and sequivirus families. The 3′ ORF encodes the capsid proteins as confirmed by N-terminal sequence analysis of these proteins. The capsid protein coding region is unusual in two ways: firstly the cistron appears to lack an initiating methionine and secondly no subgenomic RNA is produced, suggesting that the proteins may be translated through internal initiation of translation from the genomic length RNA. The finding of this novel genome organization for DCV shows that this virus is not a member of the Picornaviridae as previously thought, but belongs to a distinct and hitherto unrecognized virus family.

Introduction

The insect virus Drosophila C virus (DCV) has historically been classified as an ungrouped member of the virus family Picornaviridae (Murphy et al., 1995). Both DCV and a serologically related insect virus, cricket paralysis virus (CrPV), have a number of picornavirus-like characteristics including their size, sedimentation coefficient, buoyant density, the number and size of their structural proteins and polyadenylated genomic RNA with a small protein (VPg) attached to the 5′ end (King & Moore, 1988; for review see Scotti et al., 1981).

The gene expression and replication strategies of both DCV and CrPV have been extensively studied (for review see Moore et al., 1985) using a combination of in vitro and in vivo translational studies, and comparisons have generally been made between these viruses and the picornavirus paradigm. Many good reviews of the events involved in picornaviral translation and replication are available (see Ansardi et al., 1996; Carrasco, 1994; Stanway, 1990). There are some similarities between the translation strategies of DCV and CrPV and the picornaviruses. The translation products observed in DCV/CrPV infected Drosophila cells include several large precursor proteins which can be chased into an array of smaller proteins (Moore et al., 1980, 1981). Furthermore, in vitro translation studies were used to show that at least some of the proteolytic cleavages of these precursor proteins are performed by virally encoded products (Reavy & Moore, 1983a).
Pactamycin mapping of the proteins seen during translation of CrPV in Drosophila cells indicated that the capsid proteins were situated at the 5′ end of the ORF that encodes them (Reavy & Moore, 1983b); the authors concluded that this demonstrated that, like the picornaviruses, the capsid proteins of CrPV are encoded at the 5′ end of the genome. Whilst there are similarities between DCV/CrPV and the picornaviruses, there are also some fundamental differences. One of the most intriguing is that replication of both DCV and CrPV in vivo leads to the capsid proteins being produced in supramolar excess in comparison to the non-structural proteins (Moore et al., 1980, 1981). This would seem to contradict the assumption that translation of these viruses involves production of a single polyprotein from which all the viral proteins originate. Moreover, during in vitro translation of either DCV or CrPV RNA in rabbit reticulocyte lysate, two of the capsid proteins are synthesized in reduced quantities, and neither the third capsid protein nor its precursor (i.e. VP0) can be detected (Reavy & Moore, 1981a, b, 1983a). Interestingly, when the rabbit reticulocyte lysate was supplemented with a Drosophila cell lysate, CrPV RNA was able to direct the synthesis of VP0 (Reavy & Moore, 1983c).

0001-5109 © 1998 SGM. Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Mon, 19 Jun 2017 02:33:26

Although the biophysical properties of DCV and CrPV particles indicate they may belong to the picornavirus family, the available molecular data are equivocal. The sequence from the 1600 nucleotides at the 3′ end of the CrPV genome has been published (King et al., 1987). It was initially speculated that this region encoded the RNA-dependent RNA polymerase (RdRp) (King et al., 1987) in a fashion similar to the picornaviruses.
However, on the basis of sequence similarities with the coat proteins of other viruses, Koonin & Gorbalenya (1992) suggested that the region encodes the structural proteins of CrPV. No further evidence has been generated to determine which of these postulates is correct. However, preliminary sequence data from the genome of DCV indicate that DCV and CrPV share some similarity at the level of nucleotide sequence (P. D. Christian, K. N. Johnson & F. Weyts, unpublished data).

In order to gain a better understanding of the genomic organization and replication of Drosophila C virus we examined the genomic sequence of a DCV isolate. We report here on the genome organization of the insect virus DCV, which not only is different from that of the picornaviruses but is novel among all documented RNA viruses.

Methods

Insects. The Boambee stock of Drosophila melanogaster was established from flies caught at Coffs Harbour (Australia). Flies were maintained routinely at 25 °C on a maize meal medium (18.8 %, w/v, yeast; 10 %, w/v, maize meal; 10 %, v/v, treacle; 1 %, w/v, agar; 0.26 %, w/v, methyl p-hydroxybenzoate). A virus-free Drosophila melanogaster population (referred to as BVF) was established essentially as described by Brun & Plus (1980).

DCV isolates. The DCV isolate EB was originally isolated from an Australian Drosophila melanogaster population (Christian, 1992). DCV isolate C was isolated from a French D. melanogaster stock (Jousset et al., 1972), M from a Moroccan population (Plus et al., 1975) and CYG from an Australian population (Christian, 1987).

Virus growth and purification. The EB and CYG isolates were passaged through virus-free D. melanogaster by injection of a virus suspension into the haemocoel of 400–500 flies (see Christian, 1992). The M and C isolates were propagated in Drosophila melanogaster Line 2 cells (DL2). All virus preparations were purified essentially as described by Scotti (1976).

RNA preparation.
Viral RNA was extracted using a method based on the acid phenol–guanidinium thiocyanate method (Chomczynski & Sacchi, 1987) as previously described (Johnson & Christian, 1996).

Synthesis of cDNA and cloning. Purified EB viral RNA was used to synthesize cDNA using AMV reverse transcriptase (Promega) as recommended by the supplier. First strand synthesis was initiated at the 3′ terminus of the RNA with an oligo(dT) primer. Further cDNA synthesis was initiated with oligonucleotide primers designed according to sequences toward the end of the preceding clone. In each case, second strand synthesis was performed using RNase H and E. coli polymerase I (Promega). The cDNA was blunt-ended with T4 polymerase, size selected on polyacrylamide gels and ligated into the EcoRV site of Bluescript KS() (Stratagene). The ligation mixtures were used to transform TG1 cells and later DH10β cells. Three overlapping clones were obtained in this way which covered most of the DCV genome apart from the very 5′ end. To obtain a clone representing the 5′ end of the viral genome, 5′ RACE was performed using a system obtained from Gibco/BRL as per the instructions. Protease treatment of RNA (400 µg/ml proteinase K, 37 °C, 60 min) prior to this procedure ensured that the VPg was removed to avoid interference in cloning the 5′ end. Fragments amplified during PCR were purified using a Qiaquick purification column (Qiagen) and ligated into T-tailed pGEM vector (Promega). cDNA was also synthesized from the RNA of EB, C, CYG and M isolates using the DCV specific oligonucleotide DCV-7 (see Fig. 1) and AMV reverse transcriptase (Promega) as per the manufacturer's instructions with the exception that incubation was at 60 °C for 15 min. PCR fragments were amplified from these cDNAs using the DCV-7 and DCV-8 oligonucleotides (see Fig. 1).

Nucleotide sequencing. cDNA clones were prepared for sequencing using exonuclease III digestion following the supplier's instructions (Promega).
Clones and subclones were sequenced by the dideoxy chain termination method (Sanger et al., 1977) using both dye terminator and dye primer sequencing kits (ABI Prism/Perkin Elmer). PCR products amplified using the DCV-7 and DCV-8 oligonucleotides from the DCV-7 primed cDNAs were sequenced directly using the DCV-7 and DCV-8 oligos and a dye terminator kit.

Protein sequencing. A 5 µl sample of DCV EB (2 × 10¹¹ particles/ml) was spotted onto PVDF membrane. N-terminal sequence was determined using an Applied Biosystems Procise gas phase sequencer.

Sequence analysis. Nucleotide and amino acid sequence data were assembled and analysed using the University of Wisconsin Genetics Computer Group (GCG) program (Devereux et al., 1984). The sequence data of other viruses were retrieved and analysed using BLAST (Altschul et al., 1990) from composite non-redundant nucleotide and protein databases which are maintained by the Australian National Genomic Information Service (ANGIS).

Northern blot analysis. The general procedure of Sambrook et al. (1989) was used for Northern blot hybridizations. The filters were probed with ³²P-labelled PCR products amplified from DCV EB cDNA with the oligonucleotides DCV-1 and DCV-2 (see Fig. 1). RNA for analysis was extracted from DCV infected and uninfected DL2 cells 4–27 h post-infection.

Results

cDNA clones and sequence analysis of DCV RNA

The sequence of the genomic RNA from DCV isolate EB was constructed by compiling sequences from a series of four overlapping cDNA clones (see Fig. 1). Both strands of each of the cDNA clones were completely sequenced. Sequences in some areas were confirmed by sequence from multiple clones or PCR products. The 5′ end of the viral genome was cloned by 5′ RACE; the 5′ terminal nucleotides were determined by comparison of the sequence of five clones.
The DCV genome was found to be 9264 bases in length excluding the 3′ poly(A) tail and contains a majority of A/U nucleotides (A, 30 %; U, 34 %; G, 20 %; C, 16 %).

Coding and non-coding regions of DCV genomic RNA

A computer assisted analysis of the nucleotide sequence of DCV showed that the genomic RNA contains two large open reading frames (ORFs), nucleotides 775–6075 (ORF-1) and 6420–8969 (ORF-2). One more ORF (greater than 10 kDa) was identified, nucleotides 2225–2602, which is contained within the sequence encoding ORF-1 but in the +1 frame. ORF-1 and ORF-2 account for 86 % of the DCV genome; the other 14 % consists of non-coding or untranslated regions. These include the 798 nucleotide 5′ UTR, a 191 nucleotide intergenic region (see below) which separates ORF-1 from ORF-2, and a 295 nucleotide 3′ UTR [excluding the poly(A) tail].

There are seven AUG initiation codons within the first 150 nucleotides of ORF-1. It is unlikely that translation is initiated at the first AUG (nucleotides 775–777) as the sequence context surrounding this codon (CACAUGA) is not optimal; in particular the pyrimidine C at the −3 position is unfavourable for initiation (Cavener & Ray, 1991). It is predicted that the second AUG codon (799–801) is recognized as the translation initiation site as the surrounding nucleotides (AAAAUGG) concur with the most common initiation sequence found in non-vertebrates (ANNAUGG) (Cavener & Ray, 1991). If the assignment of this second AUG as the initiation codon is correct, the coding capacity of ORF-1 is 1759 amino acids, forming a polyprotein with a calculated molecular mass of 202 kDa.

Sequence of capsid proteins and the organization of ORF-2

The size of the ORFs identified for the DCV genome indicated that the capsid proteins must be derived from processing of the proteins encoded by one of the two major ORFs.
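The coordinates quoted above for the coding and non-coding regions can be cross-checked with a few lines of arithmetic. The following Python sketch is only an illustration and is not part of the original analysis: the helper names are hypothetical, positions are taken as 1-based and inclusive, ORF-1 is assumed to start at the second AUG (799), and ORF-2 is assumed to run from the N-terminal alanine codon (6267) to nucleotide 8972 with the termination codon included in that range (the ORF-2 range is the one reported later in the Results).

```python
# Illustrative cross-check of genome coordinates quoted in the text.
# Positions are 1-based and inclusive; all helper names are hypothetical.

GENOME_LENGTH = 9264  # nucleotides, excluding the 3' poly(A) tail


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in the inclusive range start..end."""
    return end - start + 1


def relative_frame(start: int, reference_start: int) -> int:
    """Frame offset (0, 1 or 2) of a codon start relative to a reference codon start."""
    return (start - reference_start) % 3


def matches_consensus(context: str, consensus: str = "ANNAUGG") -> bool:
    """True if `context` fits `consensus`; N in the consensus matches any base."""
    return len(context) == len(consensus) and all(
        c == "N" or c == b for b, c in zip(context, consensus)
    )


# ORF-1, assuming initiation at the second AUG (799), as predicted in the text.
orf1_nt = span_length(799, 6075)
orf1_residues = orf1_nt // 3

# ORF-2, assuming translation starts at the alanine codon (6267) and that the
# quoted end point (8972) includes the termination codon.
orf2_nt = span_length(6267, 8972)
orf2_residues = orf2_nt // 3 - 1  # subtract the termination codon

utr5_nt = span_length(1, 798)
intergenic_nt = span_length(6076, 6266)
utr3_nt = span_length(8970, GENOME_LENGTH)

coding_percent = round(100 * (orf1_nt + orf2_nt) / GENOME_LENGTH)

if __name__ == "__main__":
    print("ORF-1:", orf1_nt, "nt,", orf1_residues, "residues")
    print("ORF-2:", orf2_nt, "nt,", orf2_residues, "residues")
    print("5' UTR / intergenic / 3' UTR:", utr5_nt, intergenic_nt, utr3_nt)
    print("coding percent:", coding_percent)
    print("ORF-2 frame relative to ORF-1:", relative_frame(6267, 799))
    print("internal ORF (2225) frame relative to ORF-1:", relative_frame(2225, 799))
    print("first AUG context fits ANNAUGG:", matches_consensus("CACAUGA"))
    print("second AUG context fits ANNAUGG:", matches_consensus("AAAAUGG"))
```

Under these assumptions the arithmetic agrees with the reported figures: 1759 and 901 residues, UTRs of 798 and 295 nucleotides, a 191 nucleotide intergenic region, about 86 % coding sequence, and an ORF-2 that lies in a different frame from ORF-1.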
To allow identification of the capsid coding regions the N-terminal sequence of three of the capsid proteins was determined by sequencing the proteins contained in whole capsids. Two strong signals and a third weaker signal could be identified in each of the six sequencing cycles. The sequences of two of the proteins were confirmed by N-terminal sequence from two of the capsid proteins which were separated by SDS–PAGE (data not shown); therefore the remaining sequence was assumed to be from a third capsid protein. By comparing the N-terminal sequence of the capsid proteins to the deduced amino acid sequences of ORF-1 and ORF-2, it was found that two of the capsid proteins were encoded by sequences in ORF-2 (SKPTVQ and VMGEDL). The third N-terminal sequence (ANFQTN) was found to be situated upstream of the first methionine of ORF-2 (see Fig. 1). This sequence is encoded in the same frame; thus if there were another initiation codon which was recognized 5′ of this sequence then all three capsid proteins would be produced as a single polyprotein. A methionine initiation codon was found 36 nucleotides 5′ of the alanine codon (nucleotides 6267–6269), identified as the N-terminal amino acid residue of this capsid protein; however, an in-frame stop codon lies 6 nucleotides upstream from this alanine codon. It was necessary to establish whether the sequence around the termination codon was authentic or an artefact of the cloning and/or sequencing procedure. PCR was used to amplify a 342 bp DNA fragment (nucleotides 6051–6393) which included the intergenic region, the termination codon and region encoding the capsid protein N terminus. This region was amplified from cDNA synthesized from EB RNA using three different reverse transcriptases and cDNA synthesized from RNA extracted from three further DCV isolates (M, C and CYG).
Direct sequencing of the PCR fragments amplified from this region indicated that the nucleotide sequence preceding the start of the capsid protein coding region was invariant between the four DCV isolates and was the same for cDNA synthesized with each of the reverse transcriptases. To simplify the analysis we made the assumption that translation of the capsid polyprotein starts at the alanine residue. Thus the ORF includes nucleotides 6267–8972, representing an amino acid coding capacity of 901 residues. This polyprotein has a calculated molecular mass of 100 kDa. Proteolytic cleavage of this polyprotein at sites immediately preceding the N-terminal protein sequences would produce three proteins of predicted molecular mass of 37.7, 33.3 and 29.2 kDa.

DCV does not produce a subgenomic RNA

Northern blot analysis of the RNA produced during DCV infection of DL2 cells confirmed that no subgenomic RNA of the correct size to encode the capsid polyprotein was synthesized (see Fig. 2). No hybridization was seen to uninfected DL2 extracts (data not shown).

Similarities to other viruses

(i) Capsid proteins. The amino acid sequence of each of the capsid proteins was used individually to search the non-redundant protein database for similar sequences. The DCV 37.7 kDa capsid protein was most similar to the VP2 protein of hepatitis A virus (HAV) (Najarian et al., 1985), with 21 % identity. The DCV 29.2 kDa capsid protein sequence was found to be similar to the translation of the 3′ end of the partial CrPV sequence (King et al., 1987), showing 47 % identity and 73 % similarity in their amino acid sequences in this region. The DCV 33.3 kDa capsid protein is closely related to a portion of the CrPV sequence (amino acids 1–197) with 74 % amino acid similarity. This DCV capsid protein shares some sequence similarity with the VP3 of the hepatovirus, enterovirus and cardiovirus genera of the Picornaviridae and with one of the capsid proteins of each of the sequiviruses RTSV (rice tungro spherical virus) (CP-2) and PYFV (parsnip yellow fleck virus) (26K).

(ii) Conserved protein motifs in ORF-1. The non-redundant protein database was searched for sequences with similarity to those in ORF-1. The DCV polyprotein has similarities to the replicase of viruses from the Picornaviridae, Sequiviridae and Comoviridae families. Three distinct regions of similarity were detected which correspond to the previously identified conserved motifs of the helicase, protease and RdRp (Koonin & Dolja, 1993) found in viruses from the picorna-like superfamily.

RdRp domain. The C-terminal region of the polyprotein encoded by ORF-1 contains all of the conserved motifs identified as common to the RdRp domain of replicases of positive-strand RNA viruses (Koonin & Dolja, 1993). Furthermore, the DCV sequence shows strong agreement with the eight consensus motifs for the RNA-dependent RNA polymerase found in Supergroup 1 of Koonin (1993), with only one exception: motif five contains one change, a cysteine (amino acid 1569) where all other members of this group encode a serine or a threonine. The nucleotide sequence across this region was the same from four independent clones. An alignment of the conserved region of the RdRp domain from the replicase of viruses representing the picornavirus family (HAV), and the sequivirus (PYFV), waikavirus (RTSV), comovirus (cowpea mosaic virus, CPMV) and nepovirus (tomato black ring virus, TBRV) genera and including DCV is shown in Fig. 3. On the basis of distances calculated by CLUSTALW (Table 1) DCV RdRp is most like that of RTSV; however, the sequence most similar to RTSV is PYFV. Amongst this group of viruses the most distant relative of DCV is HAV. The DCV RdRp shares more sequence similarity with the RdRp of each of the plant viruses than HAV does.

The nucleotide binding (helicase) domain. The helicase domain is found around 400 amino acids from the N terminus of the replicase protein. An alignment spanning the helicase domain motifs identified by Koonin (1993) is shown in Fig. 4. Of the three conserved motifs (A, B, C) identified by Koonin, the putative DCV helicase fully matches only B. In the A motif DCV differs at amino acid 442, which is an arginine, whereas the consensus indicates that this residue is usually an aspartic or glutamic acid. In DCV, motif C deviates somewhat from the consensus sequence but the motif is still recognizable. The distances generated from this helicase alignment (Table 1) indicate that in this region of the replicase, DCV is the least closely related virus to each of the other viruses.

Protease domain. A region (amino acids 987–1174) of the DCV replicase protein was similar in sequence to picornaviral 3C proteases and the putative protease domain of the comovirus and sequivirus families. The alignment in Fig. 5 allowed us to identify candidates in DCV for the conserved histidine and cysteine residues of the catalytic triad. The overall similarity between the protease domains is low (data not shown).

Conservation of gene order. The organization of the replicase domains identified in this analysis follows an order similar to that of the picorna-like superfamily (see Koonin & Dolja, 1993). Fig. 6 outlines the genomic organization of representative viruses from these groups as compared to DCV.

Fig. 1. Ribonucleotide sequence of the DCV genome. The deduced amino acid sequence for the two large ORFs is shown under the nucleotide sequence. Termination codons are indicated by an asterisk. The amino acids identified by N-terminal sequencing of the capsid proteins are underlined and in bold. The nucleotide sequences underlined are complementary (DCV-2, DCV-7) or analogous (DCV-1, DCV-8) to the dideoxy oligonucleotides synthesized for use in cDNA and PCR amplifications. The cDNA clones used to produce this sequence cover the following ranges: nucleotides 1–746, pGEMT EB846; 678–3309, pBS EB606; 3136–5047, pBS EB201; 4852–9264, pBS EB50.

Fig. 2. Northern blot analysis of DCV RNAs accumulated in infected DL2 cells (27 h post-infection) containing sequences from the 3′ region of the genome. The genomic length RNA (G) and expected size of a capsid protein subgenomic RNA (C) are indicated. No hybridization was seen to uninfected DL2 extracts (not shown).

Fig. 3. Alignment of the putative RdRp domain of the DCV replicase with those of other viruses. The viruses and references are as follows: HAV (hepatitis A virus), Najarian et al. (1985); TBRV (tomato black ring virus), Greif et al. (1988); CPMV (cowpea mosaic virus), Lomonossoff & Shanks (1983); RTSV (rice tungro spherical virus), Shen et al. (1993); PYFV (parsnip yellow fleck virus), Turnbull-Ross et al. (1992). Residues identical in at least four viruses are shaded. The motifs designated by Koonin (1993) are labelled I–VIII.

Fig. 4. Alignment of the putative helicase domain of the DCV replicase with those of other viruses. Viruses and references are as in Fig. 3. Residues identical in at least four viruses are shaded. The motifs designated by Koonin (1993) are labelled A, B and C.

Fig. 5. Alignment of the putative protease domain of the DCV replicase with those of other viruses. Viruses and references are as in Fig. 3. Residues identical in at least three viruses are shaded. The (putative) catalytic residues identified in Koonin (1993) are shown by asterisks.

Table 1. Pairwise distance comparisons of amino acid sequences of the RNA-dependent RNA polymerase domain on the upper triangle and for the helicase domain on the lower triangle. Distance values were calculated using CLUSTALW for the alignments shown in Figs 3 and 4.

        DCV     RTSV    PYFV    CPMV    TBRV    HAV
DCV     –       0.701   0.722   0.727   0.721   0.764
RTSV    0.787   –       0.556   0.685   0.701   0.757
PYFV    0.745   0.596   –       0.651   0.699   0.749
CPMV    0.754   0.688   0.703   –       0.579   0.763
TBRV    0.781   0.745   0.723   0.606   –       0.758
HAV     0.759   0.723   0.693   0.669   0.652   –

Discussion

Analysis of the sequence of the DCV genome indicates that the genomic organization and replication strategy of this virus differ from those of other documented RNA viruses. DCV was previously considered to have a genome organization and replication strategy similar to that of the picornaviruses. That is, the DCV genome was thought to encode a single polyprotein which is post-translationally cleaved to liberate all structural and non-structural viral proteins. There are some similarities between the picornaviruses and DCV, such as the existence of a long 5′ UTR and some domains of amino acid sequence similarity; however, there are also some fundamental differences. The most significant of the differences is the fact that DCV appears to have two distinct ORFs and the initiation of translation from ORF-2 is seemingly unusual. The capsid proteins are encoded downstream of the replicase proteins in a different reading frame which is separated from the replicase reading frame by 191 nucleotides. There are four conceivable ways that translation of the downstream ORF (ORF-2) could proceed: (1) the stop codons could be suppressed allowing readthrough translation, (2) a ribosomal frameshift may occur between the two ORFs, (3) a subgenomic RNA may be produced during replication which includes ORF-2 or (4) translation of ORF-2 is initiated independently. The first and second possibilities are unlikely as there are five or more stop codons in each of the reading frames and the product of ORF-2 is produced in excess of that of ORF-1. The third hypothesis represents a strategy commonly used by RNA viruses, but contradicts previous reports (Reavy et al., 1983) and the results of Northern analysis presented here which indicate that no subgenomic RNA is produced during DCV replication. These observations indicate that translation of ORF-2 is probably independently, and possibly internally, initiated, although direct supporting evidence for this is lacking. There is evidence that internal initiation is utilized by the picornaviruses to achieve translation of the single polyprotein (see Belsham & Sonenberg, 1996) and indeed the long 5′ UTR of DCV indicates that a similar strategy may be used to achieve translation of ORF-1. Where DCV is unusual is that if translation of ORF-2 is initiated independently of ORF-1 from the genomic RNA, the DCV RNA is acting as a bicistronic message. Infectious genetically engineered picornaviruses have been made in which the capsid and non-structural proteins were separated by the 5′ UTR of another picornavirus, therefore resulting in an artificially constructed bicistronic virus (for example Pelletier & Sonenberg, 1988).
Recently, a tobamovirus has been shown to have a naturally occurring IRES (internal ribosome entry site) which functions in vitro to produce the 17 kDa capsid protein (Ivanov et al., 1997); it is not clear at this time whether this IRES functions in vivo.

Fig. 6. Comparison of the genome arrangements of some viruses from the picorna-like superfamily with that of DCV. The approximate position of domains identified by sequence similarity are denoted by: +, RdRp; U, helicase domains; E, protease domains. CP denotes capsid proteins, AAA denotes the poly(A) tail, , indicate the VPg (? are shown when the presence of a VPg has not been proven). Solid lines indicate where cleavages are known to occur in the polyprotein and dashed lines indicate the uncertainty of the position of the C terminus of the RTSV and PYFV capsid proteins. The position of the VPg coding region is shown by a shaded box in HAV, TBRV and CPMV.

The second way that the translation of ORF-2 is unusual is that it appears to lack an initiating methionine codon. The N-terminal sequence of the 37.7 kDa capsid protein begins with an alanine. According to the conceptual translation of DCV genomic sequence this alanine lies two amino acid residues downstream of an ochre stop codon. There is a single methionine which is situated 12 amino acids upstream of the alanine residue, but it is separated from the alanine by the stop codon. For this methionine to be the initiator of the capsid protein the stop codon must be read through and then the protein post-translationally modified for the alanine to become the N-terminal residue. Alternatively, this alanine codon may act as the initiating codon; we are unaware, however, of other circumstances in which an alanine codon serves the role of a translation initiation codon.
Obviously, further experimentation is essential to unravel the events involved in the initiation of translation of this unusual ORF and experiments are currently under way to determine whether ORF-2 translation is initiated via an IRES.

Analysis of the proteins predicted from the two major ORFs indicated some regions of similarity to a number of other viruses from the Picornaviridae, Sequiviridae and Comoviridae. In general, the level of sequence similarity is low (about 20–30 % identity); however, these levels are considered significant for RNA viruses because of their high mutation rate (see Candresse et al., 1990; Contamine et al., 1989; Goldbach & Haan, 1994; Goldbach & Wellink, 1988; Koonin & Dolja, 1993). Although some caution should be exercised when inferring higher order virus taxonomy on the basis of sequence similarities alone (see Zanotto et al., 1996), there seems to be substantial evidence from a range of characteristics that several families of RNA viruses of plants and animals, including the Picornaviridae, Sequiviridae, Comoviridae and Caliciviridae, are linked in a higher order grouping or 'superfamily'. The similarities found between DCV and members of the picorna-like superfamily indicate that DCV is a member of this superfamily.

Alignments of the regions common to the representatives of these virus families allow functions to be suggested for some (but not all) of the protein domains of ORF-1. ORF-1 likely encodes a helicase, 3C-like protease and RdRp. These genes are organized in a similar order to viruses of the picorna-like lineage of superfamily 1 described by Koonin (1993). By analogy, it is expected that this ORF directs translation of a single polyprotein which is later cleaved into functional protein units. The calculated molecular mass of the polyprotein encoded by ORF-1 is 202 kDa.
When Drosophila cells treated with the protease inhibitor iodoacetamide were infected with DCV, the largest translation product identified was a protein with an estimated size of 212 kDa (Moore et al., 1981). This 212 kDa protein was not found in untreated infected Drosophila cells, indicating that it probably acts as a precursor polyprotein (Moore et al., 1981). It is therefore possible that the previously observed 212 kDa protein is the translation product of ORF-1, and further that this protein acts as a precursor for the replicase proteins.

The DCV capsid proteins were first analysed by Jousset et al. (1977), who found that purified virions contained three major proteins in roughly equimolar amounts, of size 31, 30 and 28 kDa. In addition, two minor proteins of 37 and 8.5 kDa were also detected. These proteins were named VP1, VP2, VP3, VP0 and VP4, respectively, because of the apparent similarities with the picornaviruses. It has been demonstrated that the DCV VP0 is the precursor for VP3 and VP4 (Moore et al., 1981) and there is some evidence that the capsid proteins are processed from a distinct precursor of about 100 kDa in size (King et al., 1984; Moore & Pullin, 1983). Comparison of the N-terminal sequence of the DCV particle proteins with the conceptual translation of ORF-2 suggests that the ORF encodes a polyprotein which is cleaved into at least three capsid proteins. The calculated sizes of the capsid proteins predicted in this study match well with three of the proteins previously described for DCV (Jousset et al., 1977), namely VP0 (37.7 kDa), VP1 (33.3 kDa) and VP2 (29.2 kDa). Therefore, it is likely that the N-terminal sequence of either VP4 or VP3 (depending on which of these is found at the C terminus of VP0) was not identified during sequencing.
It is possible that this protein was N-terminally blocked or that it was not in high enough concentration to be identified.

DCV has been shown to be similar to viruses from several families at the level of amino acid sequence. The genomic organization of DCV, however, differs fundamentally from that of each of these virus families. The Picornaviridae and Sequiviridae encode a single polyprotein which contains the capsid proteins toward the N-terminal regions and replicase proteins in the C-terminal regions of the polyprotein [the exception is RTSV, which also encodes two small ORFs of unknown function which may be translated from subgenomic RNAs (see Shen et al., 1993; Thole & Hull, 1996)]. The Comoviridae have two genomic RNAs, one encoding the capsid protein(s) and the other encoding the replicase proteins. It should be noted that DCV is similar to the Caliciviridae inasmuch as the calicivirus capsid protein is encoded downstream of the replicase proteins. However, the single calicivirus capsid protein is translated either as part of a large polyprotein or from a subgenomic RNA (Murphy et al., 1995). Furthermore, when the amino acid sequences of feline calicivirus (Carter et al., 1992) were included in analyses of the replicase domains, this virus was shown to be less related to DCV than the other viruses described (K. N. Johnson & P. D. Christian, unpublished observation).

Originally, it was suggested that DCV and CrPV encoded the capsid proteins at the 5′ end of the genome; this hypothesis was based on the sequence of a limited portion of the CrPV genome (King et al., 1987) and pactamycin mapping of CrPV proteins (Reavy & Moore, 1983 b). The DCV sequence shows that the capsid proteins are encoded in the 3′ region of the DCV genome. Given the high level of sequence similarity between the DCV capsid and the 3′ end of the CrPV genome (greater than 70% amino acid similarity), it is reasonable to assume that the capsid proteins are encoded in the 3′ region of the CrPV genome.
This conclusion is consistent with the reassessment by Koonin & Gorbalenya (1992) of the CrPV 3′ end sequence (King et al., 1987). The genome organization indicated by the DCV sequence seems to be incompatible with the pactamycin studies on CrPV by Reavy & Moore (1983 b); however, re-evaluation of the pactamycin mapping data shows that they actually support the genome organization identified in this report. The CrPV capsid proteins mapped to the 5′ end of their ORF, but several other proteins (denoted S, P and Q) also mapped to this region (Reavy & Moore, 1983 b); the authors suggested that these may be ‘cleavage products synthesized during capsid protein production’. We believe that it is more likely that proteins S, P and Q are actually functional proteins encoded by ORF-1. These proteins would then have pactamycin ratios similar to those of the capsid proteins because they lie at a proportionally similar distance from the N terminus of the ORF-1-encoded polyprotein as the capsid proteins do from the N terminus of the ORF-2-encoded polyprotein. In support of this hypothesis, proteins S, P and Q were produced in much reduced quantities as compared to the capsid proteins (Reavy & Moore, 1983 b), whereas we would expect them to be in equimolar amounts if produced from the same polyprotein. The results of the pactamycin study are consistent with the 3′ positioning of the capsid proteins of CrPV if, like DCV, CrPV is found to contain two major ORFs which are independently translated. We have recently sequenced 4000 nucleotides from the 3′ end of the CrPV genome and found that the capsid proteins are encoded toward the 3′ end of the genome and the RdRp is encoded upstream of the capsid proteins. The ORFs encoding these two proteins are separated by an untranslated sequence of similar length to, and sharing 76% identity with, the DCV sequence (T. N. Hanzlik, K. N. Johnson & P. D. Christian, unpublished observation).
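The pactamycin argument above can be made concrete with a short Python sketch. It assumes, as a simplification of the reasoning in the text, that a protein's pactamycin ratio depends only on the fractional distance of that protein from the N terminus of the polyprotein in which it is translated. The function `relative_position` and every coordinate below are hypothetical placeholders, not measured values for DCV or CrPV.

```python
def relative_position(start_aa: int, end_aa: int, polyprotein_length: int) -> float:
    """Fractional distance of a protein's midpoint from the polyprotein N terminus."""
    if not 1 <= start_aa <= end_aa <= polyprotein_length:
        raise ValueError("coordinates must satisfy 1 <= start <= end <= length")
    midpoint = (start_aa + end_aa) / 2
    return midpoint / polyprotein_length


if __name__ == "__main__":
    # Hypothetical placeholder coordinates, chosen only for illustration.
    # A protein within a long, ORF-1-like polyprotein:
    replicase_like = relative_position(300, 500, 1800)
    # A protein within a shorter, ORF-2-like polyprotein:
    capsid_like = relative_position(150, 250, 900)
    print(f"{replicase_like:.3f} {capsid_like:.3f}")
```

Under this simplification the two placeholder proteins occupy the same fractional position in their respective polyproteins, so they would be assigned to the same region if the data were interpreted as coming from a single ORF, even though they are encoded in different parts of the genome.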
Although much experimentation is still needed to confirm the expression and replication strategies used by DCV, the information provided from the sequence of the genomic RNA indicates that the genome of this virus is organized in a way that is fundamentally different from the genomes of the RNA viruses described thus far. Comparisons of the amino acid sequences from several proteins of viruses from the families Sequiviridae, Comoviridae and Picornaviridae show that DCV shares some sequence similarities with them, but is less similar to any of them than are viruses from different genera within the same family. These findings lead to the conclusion that DCV, and probably CrPV, should be considered members of a distinct and hitherto unrecognized virus family.

The authors wish to thank Jillian McGovern of the Biomolecular Resource Facility of the Australian National University for sequencing of the DCV peptides. Terry Hanzlik, Peter Waterhouse and Karl Gordon are gratefully acknowledged for discussions and/or critical comments on the manuscript.

References

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). Basic local alignment search tool. Journal of Molecular Biology 215, 403–410.
Ansardi, D. C., Porter, D. C., Anderson, M. J. & Morrow, C. D. (1996). Poliovirus assembly and encapsidation of genomic RNA. Advances in Virus Research 46, 1–68.
Belsham, G. J. & Sonenberg, N. (1996). RNA–protein interactions in regulation of picornavirus RNA translation. Microbiological Reviews 60, 499–511.
Brun, G. & Plus, N. (1980). The viruses of Drosophila. In The Genetics and Biology of Drosophila, pp. 625–702. Edited by M. Ashburner & T. F. R. Wright. New York: Academic Press.
Candresse, T., Morch, M. D. & Dunez, J. (1990).
Multiple alignment and hierarchical clustering of conserved amino acid sequences in the replication-associated proteins of plant RNA viruses. Research in Virology 141, 315–329.
Carrasco, L. (1994). Picornavirus inhibitors. Pharmacology and Therapeutics 64, 215–290.
Carter, M. J., Milton, I. D., Meanger, J., Bennett, M., Gaskell, R. M. & Turner, P. C. (1992). The complete nucleotide sequence of a feline calicivirus. Virology 190, 443–448.
Cavener, D. R. & Ray, S. C. (1991). Eukaryotic start and stop translation sites. Nucleic Acids Research 19, 3185.
Chomczynski, P. & Sacchi, N. (1987). Single-step method of RNA isolation by acid guanidinium thiocyanate–phenol–chloroform extraction. Analytical Biochemistry 162, 156–159.
Christian, P. D. (1987). Studies on Drosophila C and A viruses in Australian populations of Drosophila melanogaster. PhD thesis, Australian National University.
Christian, P. D. (1992). A simple vacuum dot-blot hybridization assay for the detection of Drosophila A and C viruses in single Drosophila. Journal of Virological Methods 38, 153–165.
Contamine, D., Petitjean, A.-M. & Ashburner, M. (1989). Genetic resistance to virus infection: the molecular cloning of a Drosophila gene that restricts infection by the rhabdovirus sigma. Genetics 123, 525–533.
Devereux, J., Haeberli, P. & Smithies, O. (1984). A comprehensive set of sequence analysis programs for the Vax. Nucleic Acids Research 12, 387–395.
Goldbach, R. & Haan, P. (1994). RNA viral supergroups and the evolution of RNA viruses. In The Evolutionary Biology of Viruses, pp. 105–119. Edited by S. S. Morse. New York: Raven Press.
Goldbach, R. & Wellink, J. (1988). Evolution of plus-strand RNA viruses. Intervirology 29, 260–267.
Greif, C., Hemmer, O. & Fritsch, C. (1988). Nucleotide sequence of tomato black ring virus. Journal of General Virology 69, 1517–1529.
Ivanov, P. A., Karpova, O. V., Skulachev, M. V., Tomashevskaya, O. L., Rodionova, N. P., Dorokhov, Y. L. & Atabekov, J. G. (1997).
A tobamovirus genome that contains an internal ribosome entry site functional in vitro. Virology 232, 32–43.
Johnson, K. N. & Christian, P. D. (1996). A molecular taxonomy for cricket paralysis virus including two new isolates from Australian populations of Drosophila (Diptera: Drosophilidae). Archives of Virology 141, 1509–1522.
Jousset, F. X., Plus, N., Croizier, G. & Thomas, M. (1972). Existence chez Drosophila de deux groupes de picornavirus de propriétés sérologiques et biologiques différentes. Comptes Rendus de l’Academie des Sciences (Paris) 275, 3043–3046.
Jousset, F., Bergoin, M. & Revet, B. (1977). Characterization of the Drosophila C virus. Journal of General Virology 34, 269–285.
King, L. A. & Moore, N. F. (1988). Evidence for the presence of a genome-linked protein in two insect picornaviruses, cricket paralysis and Drosophila C viruses. FEMS Microbiology Letters 50, 41–44.
King, L. A., Pullin, J. S. K. & Moore, N. F. (1984). Characterisation of a picornavirus isolated from a tumorous blood cell line of Drosophila melanogaster. Microbios Letters 26, 121–127.
King, L. A., Pullin, J. S. K., Stanway, G., Almond, J. W. & Moore, N. F. (1987). Cloning of the genome of cricket paralysis virus: sequence of the 3′ end. Virus Research 6, 331–344.
Koonin, E. V. & Dolja, V. V. (1993). Evolution and taxonomy of positive-strand RNA viruses: Implications of comparative analysis of amino acid sequences. Critical Reviews in Biochemistry and Molecular Biology 28, 375–430.
Koonin, E. V. & Gorbalenya, A. E. (1992). An insect picornavirus may have genome organization similar to that of caliciviruses. FEBS Letters 297, 81–86.
Lomonossoff, G. P. & Shanks, M. (1983). The nucleotide sequence of cowpea mosaic virus B RNA. EMBO Journal 2, 2253–2258.
Moore, N. F. & Pullin, J. S. K. (1983).
Heat shock used in combination with amino acid analogues and protease inhibitors to demonstrate the processing of proteins of an insect picornavirus (Drosophila C virus) in Drosophila melanogaster cells. Annales de Virologie 134, 285–292.
Moore, N. F., Kearns, A. & Pullin, J. S. K. (1980). Characterization of cricket paralysis virus-induced polypeptides. Journal of Virology 33, 1–9.
Moore, N. F., Reavy, B., Pullin, J. S. K. & Plus, N. (1981). The polypeptides induced in Drosophila cells by Drosophila C virus (strain Ouarzazate). Virology 112, 411–416.
Moore, N. F., Reavy, B. & King, L. A. (1985). General characteristics, gene organization and expression of small RNA viruses of insects. Journal of General Virology 66, 647–659.
Murphy, F. A., Fauquet, C. M., Bishop, D. H. L., Ghabrial, S. A., Jarvis, A. W., Martelli, G. P., Mayo, M. A. & Summers, M. D. (editors) (1995). Virus Taxonomy: Sixth Report of the International Committee on Taxonomy of Viruses. New York & Vienna: Springer-Verlag.
Najarian, R., Caput, D., Gee, W., Potter, S. J., Renard, A., Merryweather, J., Nest, G. V. & Dina, D. (1985). Primary structure and gene organization of human hepatitis A virus. Proceedings of the National Academy of Sciences, USA 82, 2627–2631.
Pelletier, J. & Sonenberg, N. (1988). Internal initiation of translation of eukaryotic mRNA directed by a sequence derived from poliovirus RNA. Nature 334, 320–325.
Plus, N., Croizier, G., Jousset, F. X. & David, J. (1975). Picornaviruses of laboratory and wild Drosophila melanogaster: geographical distribution and serotypic composition. Annales de Microbiologie 126A, 107–117.
Reavy, B. & Moore, N. F. (1981 a). Cell-free translation of cricket paralysis virus RNA: analysis of the synthesis and processing of virus-specified proteins. Journal of General Virology 55, 429–438.
Reavy, B. & Moore, N. F. (1981 b). In vitro translation of cricket paralysis virus RNA. Archives of Virology 67, 175–180.
Reavy, B. & Moore, N. F. (1983 a).
Cell-free translation of Drosophila C virus RNA: identification of a virus protease activity involved in capsid protein synthesis and further studies on in vitro processing of cricket paralysis virus specified proteins. Archives of Virology 76, 101–115.
Reavy, B. & Moore, N. F. (1983 b). The gene organization of a small RNA-containing insect virus: comparison with that of mammalian picornaviruses. Virology 131, 551–554.
Reavy, B. & Moore, N. F. (1983 c). An inhibitor-resistant protease specified by an insect picornavirus, and the role of cellular proteases in the rapid processing of capsid protein precursors. Journal of General Virology 64, 1831–1833.
Reavy, B., Crump, W. A. L. & Moore, N. F. (1983). Characterization of cricket paralysis virus and Drosophila C virus-induced RNA species synthesized in infected Drosophila melanogaster cells. Journal of Invertebrate Pathology 41, 397–400.
Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual, 2nd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.
Sanger, F., Nicklen, S. & Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, USA 74, 5463–5467.
Scotti, P. D. (1976). Cricket paralysis virus replicates in cultured Drosophila cells. Intervirology 6, 333–342.
Scotti, P. D., Longworth, J. F., Plus, N., Croizier, G. & Reinganum, C. (1981). The biology and ecology of strains of an insect small RNA virus complex. Advances in Virus Research 26, 117–143.
Shen, P., Kaniewska, M., Smith, C. & Beachy, R. N. (1993). Nucleotide sequence and genomic organization of rice tungro spherical virus. Virology 193, 621–630.
Stanway, G. (1990). Structure, function and evolution of picornaviruses. Journal of General Virology 71, 2482–2501.
Thole, V. & Hull, R. (1996).
Rice tungro spherical virus: nucleotide sequence of the 3′ genomic half and studies on the two small 3′ open reading frames. Virus Genes 13, 239–246.
Turnbull-Ross, A. D., Reavy, B., Mayo, M. A. & Murant, A. F. (1992). The nucleotide sequence of parsnip yellow fleck virus: a plant picorna-like virus. Journal of General Virology 73, 3203–3211.
Zanotto, P. M. D., Gibbs, M. J., Gould, E. A. & Holmes, E. C. (1996). A reevaluation of the higher taxonomy of viruses based on RNA polymerases. Journal of Virology 70, 6083–6096.

Received 25 July 1997; Accepted 9 September 1997