Journal of General Virology (1991), 72, 1487-1495. Printed in Great Britain

The complete nucleotide sequence of cucumber green mottle mosaic virus (SH strain) genomic RNA

M. Ugaki,1 M. Tomiyama,1 T. Kakutani,1 S. Hidaka,1† T. Kiguchi,2 R. Nagata,3 T. Sato,2 F. Motoyoshi1* and M. Nishiguchi4

1 National Institute of Agrobiological Resources, 2-1-2 Kan-nondai, Tsukuba, Ibaraki 305, 2 Hokkaido Central Agricultural Experiment Station, Naganuma, Yu-ubari, Hokkaido 069-13, 3 Miyazaki Prefectural Agricultural Experiment Station, Sadowara, Miyazaki-gun, Miyazaki 880-02 and 4 Kyushu National Agricultural Experiment Station, 2421 Suya, Nishigoushi, Kikuchi-gun, Kumamoto 861-11, Japan

The complete nucleotide sequence of the genomic RNA of cucumber green mottle mosaic virus watermelon strain SH (CGMMV-SH) was determined using cloned cDNA. This sequence is 6421 nucleotides long containing at least four open reading frames, which correspond to 186K, 129K, 29K and 17.3K proteins. The 17.3K protein is the coat protein. Sequence analysis shows that CGMMV-SH is very closely related to another watermelon strain, CGMMV-W, although three amino acid substitutions in the 29K protein were found between these strains. The sequence was also compared to those of other tobamoviruses, tobacco mosaic virus (TMV) vulgare, TMV-L (a tomato strain) and tobacco mild green mosaic virus reported by other groups. It shows 55 to 56% identity with these viruses. The size and location of the open reading frames are very similar to those of TMV but the 129K and 186K proteins are composed of 1142 and 1646 amino acids, being larger than those of TMV by 27 and 31 amino acids, respectively. The deduced amino acid sequences of these proteins are highly homologous to those of TMV, especially in the readthrough downstream region of the 186K protein.
† Present address: Tohoku National Agricultural Experiment Station, Akahira 4, Shimokuriyagawa, Morioka, Iwate 020-01, Japan.
‡ Present address: Research Institute for Bioresources, Okayama University, Chuo 2-20-1, Kurashiki 710, Japan.
0001-0044 © 1991 SGM
Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sun, 18 Jun 2017 08:20:41

Introduction

Cucumber green mottle mosaic virus (CGMMV) is a member of the tobamovirus group. CGMMV causes diseases in the Cucurbitaceae and differs in host range from tobacco mosaic virus (TMV), whose main hosts are members of the Solanaceae. Strains of CGMMV were first described as cucumber virus 3 (CV3) and cucumber virus 4 (CV4) by Ainsworth (1935). In Japan, three different strains have been described: 'cucumber strain' (Inoue et al., 1967), 'watermelon strain' (Komuro et al., 1968) and 'Yodo strain' (Kitani et al., 1970). Among them, the watermelon strain of CGMMV is the most serious disease agent, causing severe disease symptoms in infected watermelon plants, particularly the deterioration of fruit pulp which causes considerable economic losses to watermelon growers (Komuro et al., 1971).

Tochihara & Komuro (1974) demonstrated that the watermelon strain has a closer relationship to an Indian bottlegourd isolate, CGMMV-C (Vasudeva et al., 1949; Vasudeva & Nariani, 1952), than to the cucumber strain, the Yodo strain or two British isolates, E1 and E2. Francki et al. (1986) pointed out that the CGMMV watermelon strain is taxonomically different from the cucumber strain (Inoue et al., 1967), as shown by the lack of molecular hybridization between these two strains, and proposed a new virus name, 'kyuri green mottle mosaic virus', for the cucumber strain.

The sequence of 1071 nucleotides from the 3' end of the genomic RNA of a watermelon isolate (CGMMV-W) of the watermelon strain was determined by Meshi et al. (1983b) by analysing the cDNA of the genomic RNA. The amino acid sequence of the coat protein of the same isolate was determined by Nozu & Tsugita (1986) by the conventional protein sequencing method. Furthermore, Saito et al. (1988) sequenced the region covering the 30K protein gene and compared it with the known 30K protein of other tobamoviruses.

A new type of mosaic disease was found in greenhouse-grown muskmelon crops in 1971, and the agent of this disease was serologically identical with the watermelon strain (Furuki & Komuro, 1973). Necrotic lesions surrounded by a water-soaked area are characteristic of muskmelon fruit infected with the virus, and severely affected the economic value of the products (I. Furuki & T. Ohsawa, personal communication). However, the disease of muskmelon caused by CGMMV can be controlled by cross-protection, if the seedlings of muskmelon are previously inoculated with an attenuated CGMMV strain (Motoyoshi & Nishiguchi, 1988).

We have determined the complete nucleotide sequence of the genomic RNA of CGMMV-SH, a muskmelon isolate of the CGMMV watermelon strain derived from a diseased muskmelon, by preparing cDNA clones covering almost the full length of the RNA and sequencing them with an automated fluorescent DNA sequencer, and also, partially, by directly sequencing the genomic RNA with synthetic oligonucleotides. We also discuss the differences at the level of nucleotide sequence between this isolate and other tobamoviruses.

Methods

Virus purification and RNA isolation. CGMMV-SH is an isolate from a muskmelon leaf that was supplied by Dr I. Furuki, Shizuoka Agricultural Experiment Station. The virus sample for this study was obtained after five passages through local lesions in the leaves of Gomphrena globosa and propagated in muskmelon plants (Cucumis melo L. cv. Earl's Favorite Natsukei-4).
Plants inoculated with the virus were grown in the greenhouse. About 2 weeks after inoculation, the leaves were harvested and stored at -60 °C. Virus was purified according to the method reported by Nozu et al. (1971). Viral RNA was isolated from the purified virus preparation using a phenol-SDS method (Gierer & Schramm, 1956; Fraenkel-Conrat et al., 1957). The RNA was resuspended in H2O and kept at -70 °C until use.

Polyadenylation of virus RNA. Virus RNA was polyadenylated at the 3' end using poly(A) polymerase by the method of Meshi et al. (1982) with a slight modification. Two hundred µl of the reaction mixture containing 50 mM-Tris-HCl pH 8.0, 0.2 M-NaCl, 10 mM-MgCl2, 0.4 mM-EDTA, 1 mM-DTT, 30 µg RNA, 8.1 mM-[α-32P]ATP (Amersham, 50 Ci/mol) and poly(A) polymerase (Bethesda Research Laboratories, 20 units) was incubated at 37 °C for 13 min. The resulting poly(A)-tailed RNA was passed through a Sephadex G-50 column and collected by ethanol precipitation.

Preparation of double-stranded cDNA to polyadenylated CGMMV RNA, and construction and transformation of hybrid plasmids. Using the cDNA synthesis system (Amersham) based on the method of Gubler & Hoffman (1983), cDNA was synthesized according to the manufacturer's protocol. Larger sized cDNA was recovered by fractionation through a Sephadex G-50 column. Addition of oligo(dC) to the double-stranded cDNA at the 3' end was carried out using terminal deoxynucleotidyl transferase (Takara Shuzo Company). The tailed cDNA (about 100 ng) was annealed with 660 ng of the vector, oligo(dG)-tailed pUC9 (Pharmacia). The hybrid plasmid DNA was used to transform competent cells of Escherichia coli JM101 prepared by the modified RbCl/CaCl2-mediated method (Hanahan, 1985). White colonies on the plates of selection medium were picked and used for colony hybridization.

Preparation of probes and colony hybridization.
Partially degraded CGMMV-SH RNA or the Ω fragment of TMV-OM RNA was used as a probe after labelling at the 5' end with [γ-32P]ATP by the method described by Meshi et al. (1982). The Ω fragment of TMV-OM RNA was prepared as described by Mandeles (1968). Colony hybridization was performed at 40 °C overnight using a probe of approximately 10^5 c.p.m. per filter (nitrocellulose filter BA85, Schleicher & Schüll) as described by Ohshima (1981). For the SH RNA probe, the filter was washed once with 4 × SSC, twice with 2 × SSC for 15 min each and then treated with RNase A at 20 µg/ml for 30 min followed by washing with 2 × SSC twice for 15 min. When the TMV-OM RNA Ω fragment was used as a probe, the filter was washed only twice with 4 × SSC.

cDNA sequencing. Sequencing of cDNA was performed by the dideoxynucleotide method (Sanger et al., 1977). The cDNA was cloned into the pBluescript vector (Stratagene) and was used to make a series of deletion clones as reported by Henikoff (1984). The deletion clones were sequenced using an automated fluorescent DNA sequencer (370A, Applied Biosystems) as described by Ugaki et al. (1988). The DNA was sequenced in both directions.

RNA sequencing. The sequence of the coat protein-encoding and 3' non-coding region of SH RNA was further confirmed by direct sequencing of SH RNA using reverse transcriptase (Seikagaku Kogyo) as described by Meshi et al. (1983a). The primer was a 5'-labelled cDNA fragment which had been digested with the appropriate restriction enzymes. The 5'-terminal region of the genome RNA was determined as follows: (i) SH-RNA was annealed to a synthetic primer (GGCGTCACGTTGGTTGTTGATT, positions 81 to 102) which had been labelled at the 5' end with [γ-32P]ATP, and the primer was extended with reverse transcriptase, and (ii) the base at the 5' end was identified by the methods described by Moss (1977) and Konarska (1984). The 3' end of SH-RNA was determined as described by Miura et al.
(1974) after labelling of the RNA with T4 RNA ligase (England et al., 1980).

Sequence analysis. Sequence data were analysed using DNASIS (Hitachi).

Results and Discussion

Cloning

For cDNA synthesis, virus RNA was extracted from purified virus particles and subjected to agarose gel electrophoresis. Two main RNA bands were detected (data not shown). The larger one corresponded to full-length genomic RNA. The smaller RNA band was presumed to be derived from short virus particles as shown by Okada et al. (1980) and Fukuda et al. (1981). Some of the cDNA clones are shown in Fig. 1. All of these were selected using fragmented SH-RNA probes. The largest cDNA clone, pSH-K-3, covers almost the whole length of the genomic RNA, lacking only 33 nucleotides at the 5' end, as shown by sequencing using a cDNA primer. This and two other clones (pSH-G-6 and pSH-J-44) were found to hybridize with the probe of TMV-OM RNA Ω fragments. Clones pSH-G-6, pSH-J-44 and pSH-P-11 were thought to be derived from fragments which had lost large 3' portions of SH-RNA prior to the addition of poly(A).

Fig. 1. cDNA clones obtained from genome RNA of CGMMV-SH: pSH-A-6, pSH-G-6, pSH-J-44, pSH-K-3, pSH-O-9 and pSH-P-11.

Fig. 2. Comparison of ORFs of three tobamoviruses deduced from the nucleotide sequences. Figures in parentheses represent the number of amino acid residues. Values shown: CGMMV-SH, (1142) and (1646), 6421 nt; TMV vulgare, (1115) and (1615), 6395 nt; TMV-L, (1115) and (1615), 6384 nt. Data are from Goelet et al. (1982) (TMV vulgare), Ohno et al. (1984) (TMV-L) and Solis & Garcia-Arenal (1990) (TMGMV).

Genome organization

Sequencing was performed using a series of deletion clones from pSH-K-3 in both directions. The region not covered by the clone was directly sequenced from SH-RNA as described above.
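The primer used for direct sequencing of the 5'-terminal region is stated to lie at positions 81 to 102. As a consistency check, the following sketch compares it with nucleotides 61 to 120 of the genome. The 60-nucleotide string is an assumption: it was transcribed by hand from the opening rows of the sequence figure (cDNA sense, T for U), not taken from a database record, and the function names are illustrative only.

```python
# Nucleotides 61-120 of CGMMV-SH as read from the sequence figure
# (assumption: hand-transcribed, cDNA sense with T in place of U).
REGION_START = 61
REGION = (
    "ATGGCAAACA" "TTAATGAACA" "AATCAACAAC"
    "CAACGTGACG" "CCGCGGCTAG" "CGGGAGAAAC"
)

# Synthetic primer quoted in the Methods (positions 81 to 102).
PRIMER = "GGCGTCACGTTGGTTGTTGATT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def slice_by_position(region, region_start, first, last):
    """Extract the inclusive, 1-based genome interval first..last."""
    return region[first - region_start:last - region_start + 1]


target = slice_by_position(REGION, REGION_START, 81, 102)
print("genome 81-102:   ", target)
print("primer matches:  ", reverse_complement(target) == PRIMER)
```

Under that transcription the primer is the exact reverse complement of positions 81 to 102, as expected for an oligonucleotide that anneals to the positive-strand RNA and is extended towards the 5' end.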
The sequences of SH-RNA and the cDNA clone were identical for both the coat protein gene and the 3' non-coding region. The genome organization of CGMMV-SH was compared to three other sequenced tobamoviruses, TMV vulgare (Goelet et al., 1982), TMV-L (Ohno et al., 1984) and tobacco mild green mosaic virus (TMGMV, TMV-U2) (Solis & Garcia-Arenal, 1990) (Fig. 2). The genome of CGMMV-SH is 6421 nucleotides long. At least four open reading frames (ORFs) were found in the positive strand, encoding putative proteins of 186K, 129K, 29K and 17.3K. The proteins encoded correspond to the 180K, 130K, 30K and 17.5K proteins of TMV vulgare or TMV-L. Each ORF of the CGMMV-SH genome is slightly larger than the corresponding ORF of the other tobamoviruses in the number of nucleotides, except for the ORF encoding the 30K protein of TMV vulgare and TMV-L. Some ORF-like structures, the largest 324 nucleotides long, were found in the negative strand (data not shown). They appear, however, to be too short to encode proteins, although we have not yet tested this by in vitro or in vivo translation assays.

The complete nucleotide sequence of CGMMV-SH is shown in Fig. 3. The first AUG initiation codon was found at residues 61 to 63, which starts an ORF encoding a protein composed of 1142 amino acids (129K). This ORF terminates at an amber codon, UAG, positioned at residues 3490 to 3492. The ORF encoding the readthrough protein composed of 1646 amino acids (186K) terminates at residues 5002 to 5004. The terminal 14 bases of the ORF encoding the 186K protein overlap the ORF encoding the 29K protein, which terminates with an amber codon. The coat protein gene initiates 25 bases upstream from the terminal nucleotide of the amber codon for the 29K protein. The number of nucleotides encoding the 29K protein was exactly the same as that of another CGMMV isolate (CGMMV-W) (Meshi et al., 1983b; Saito et al., 1988).
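The coordinates quoted above can be turned into codon counts with a few lines of arithmetic. The sketch below is only a consistency check under stated assumptions: it treats the 14-base overlap as including the 186K stop codon, and the variable names are ours. Counting from the AUG gives one codon more than the residue numbers in the text (1143 and 1647 against 1142 and 1646), which would be consistent with residue counts that omit the initiating methionine; that reading is an assumption, not something the text states.

```python
def codon_count(first_nt, last_nt):
    """Number of codons in the inclusive, 1-based nucleotide interval."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError("interval is not a whole number of codons")
    return length // 3


# Coordinates quoted in the text (1-based genome positions).
AUG_FIRST_NT = 61            # first AUG at residues 61 to 63
AMBER_129K_FIRST_NT = 3490   # UAG at residues 3490 to 3492
STOP_186K = (5002, 5004)     # stop codon of the readthrough ORF
OVERLAP_186K_29K = 14        # assumption: overlap counted up to and including the stop

leader_length = AUG_FIRST_NT - 1
codons_129k = codon_count(AUG_FIRST_NT, AMBER_129K_FIRST_NT - 1)
codons_186k = codon_count(AUG_FIRST_NT, STOP_186K[0] - 1)
readthrough_codons = codons_186k - codons_129k
start_29k = STOP_186K[1] - OVERLAP_186K_29K + 1
frame_offset_29k = (start_29k - AUG_FIRST_NT) % 3

print("5' non-coding nucleotides:", leader_length)         # 60
print("129K ORF sense codons:", codons_129k)               # 1143
print("186K ORF sense codons:", codons_186k)               # 1647
print("readthrough-specific codons:", readthrough_codons)  # 504
print("inferred first nt of 29K ORF:", start_29k)          # 4991
print("29K frame offset vs 186K:", frame_offset_29k)       # 1
```

The non-zero frame offset simply reflects that the 29K ORF is read in a different frame from the 186K ORF it overlaps.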
The coat protein gene of our CGMMV isolate was also found to be composed of the same number of nucleotides as that found in CGMMV-W (Meshi et al., 1983b). In total, 27 nucleotide substitutions (six in the 186K protein, 14 in the 29K protein and seven in the coat protein) were found between CGMMV-SH and CGMMV-W, when the sequenced region (1878 nucleotides from the 3' end) of CGMMV-W was compared (Fig. 4). They include 25 transitions (15 between T and C, 10 between A and G) and two transversions (one between T and A, one between A and C). Twenty-five are in the third codon position and only two are in the first. Three resulted in amino acid substitutions, all of which were found in the 29K protein. The nucleotide substitutions at 5375 (CCT-TCT), 5675 (GGT-AGT) and 5758 (GAA-GAC) correspond to P in CGMMV-SH for S in CGMMV-W (at position 128), G for S (228) and E for D (255), respectively. The number of T residues in a cluster (6401 to 6404) is one nucleotide less in CGMMV-SH than that in the corresponding T cluster of CGMMV-W.

CGMMV-SH was isolated originally in Shizuoka Prefecture, which is more than 200 km distant from Chiba Prefecture where CGMMV-W was isolated from watermelon. The number of nucleotide substitutions between these two strains was, however, very small, suggesting that CGMMV-SH is a variant of the CGMMV watermelon strain. It is likely that the three amino acid substitutions found in the 29K protein are responsible for adaptation of these strains to different host species. The 30K protein of tobamoviruses is known to be necessary for the movement of virus from cell to cell (Deom et al., 1987; Meshi et al., 1987). However, it remains to be determined whether CGMMV-SH moves from cell to cell more efficiently in muskmelon plants than does CGMMV-W.
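The three amino acid-changing substitutions can be checked against the stated codon positions and residue numbers. The sketch below is illustrative and rests on two assumptions of ours: the 29K ORF is taken to start at nucleotide 4991 (inferred from the 14-base overlap with the 186K ORF), and residue numbers are taken to exclude the initiating methionine. Only the six codons needed here are included in the lookup table.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")

# Assumption: first nucleotide of the 29K ORF, inferred from the 14-base overlap.
ORF_29K_START = 4991

# Subset of the standard genetic code covering the codons quoted in the text.
CODON_TO_AA = {"CCT": "P", "TCT": "S", "GGT": "G", "AGT": "S", "GAA": "E", "GAC": "D"}

# (genome position of the changed base, codon in CGMMV-SH, codon in CGMMV-W)
CHANGES = [(5375, "CCT", "TCT"), (5675, "GGT", "AGT"), (5758, "GAA", "GAC")]


def substitution_class(a, b):
    """Classify a base change as a transition or a transversion."""
    if a == b:
        raise ValueError("bases are identical")
    pair = {a, b}
    same_ring_type = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_ring_type else "transversion"


def codon_and_position(nt, orf_start):
    """Return (1-based codon number, position 1-3 within that codon)."""
    offset = nt - orf_start
    return offset // 3 + 1, offset % 3 + 1


def describe(nt, sh_codon, w_codon):
    """Summarize one single-base difference between the two isolates."""
    (idx,) = [i for i in range(3) if sh_codon[i] != w_codon[i]]
    codon_no, pos = codon_and_position(nt, ORF_29K_START)
    return {
        "nt": nt,
        "kind": substitution_class(sh_codon[idx], w_codon[idx]),
        "codon_position": pos,
        "position_from_codons": idx + 1,
        "residue": codon_no - 1,  # assumption: numbering omits the initiator Met
        "sh_aa": CODON_TO_AA[sh_codon],
        "w_aa": CODON_TO_AA[w_codon],
    }


results = [describe(*change) for change in CHANGES]
for r in results:
    print(r)
```

With these assumptions the computed residue numbers are 128, 228 and 255, two of the changes fall in the first codon position, and the change at 5758 is the A to C transversion, all in agreement with the text.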
Fig. 3. The complete nucleotide sequence of the CGMMV-SH genome RNA and the deduced amino acid sequences of ORFs. The numbering refers to the nucleotide sequence. The amino acid sequences are given in the one-letter code. The nucleotide sequence within 1878 nucleotides from the 3' end determined by Meshi et al. (1983b) and Saito et al. (1988), and the deduced amino acid sequences of the CGMMV-W genome RNA are shown only in the positions where they are different from those of CGMMV-SH. The wavy line marks a T cluster where the number of Ts is three in CGMMV-W.
The identity of the entire nucleotide sequences is approximately 55 to 56% between CGMMV-SH and TMV vulgare, between CGMMV-SH and TMV-L or between CGMMV-SH and TMGMV.

Fig. 4. Alignment of the nucleotide sequence of the 5' non-coding regions of tobamoviruses (CGMMV-SH, TMV vulgare, TMV-L, PMMV-S and TMV-U2). Common nucleotides among them are boxed. The initiation codon of translation is underlined. A base indicated by the symbol * is not determined. The wavy line shows a ribosome-binding site (Tyc et al., 1984). Gaps indicated by slots have been introduced to obtain maximal alignment. The sequences are from Richards et al. (1978) (TMV vulgare), Ohno et al. (1984) (TMV-L), Avila-Rincon et al. (1989) (PMMV-S) and Solis & Garcia-Arenal (1990) (TMGMV).

5' and 3' non-coding regions

The 5' non-coding region of the ordinary TMV (such as vulgare) is referred to as Ω and is characteristically free of G residues (Mandeles, 1968). The 5' non-coding regions of several tobamoviruses are compared in Fig. 4. Deletion of the 5'-terminal eight nucleotides (GUAUUUUU) of TMV-L caused complete loss of infectivity; the rest of the region seemed not to be crucial, although a large deletion had considerable effects (Takamatsu et al., 1991). The role of this sequence of eight nucleotides is, however, not known. This region is well conserved among the five tobamoviruses. The ribosome-binding site is thought to be AUU (Tyc et al., 1984), which is present in all five viruses. The three nucleotides (ACA) upstream and the three (GCA) downstream of the initiation codon are also well conserved.
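Several of these statements can be checked directly on the CGMMV-SH leader. The sketch below is a hedged illustration: the 60-nucleotide leader and the six nucleotides that follow it are an assumption, transcribed by hand from the opening row of the sequence figure (cDNA sense, so AUU and AUG appear as ATT and ATG), and the variable names are ours.

```python
# Assumption: 5' leader (positions 1-60) and positions 61-66, hand-transcribed
# from the sequence figure in cDNA sense (T in place of U).
LEADER = (
    "GTTTTAATTT" "TTATAATTAA" "ACAAACAACA"
    "ACAACAACAA" "CAAACAATTT" "TAAAACAACA"
)
START_CONTEXT = "ATGGCA"  # initiation codon plus the next three nucleotides

leader_length = len(LEADER)
g_positions = [i + 1 for i, base in enumerate(LEADER) if base == "G"]
has_auu = "ATT" in LEADER            # proposed ribosome-binding site AUU
upstream_triplet = LEADER[-3:]       # three nucleotides before the AUG
downstream_triplet = START_CONTEXT[3:]

print("leader length:", leader_length)
print("positions of G in leader:", g_positions)
print("AUU present:", has_auu)
print("upstream / downstream of AUG:", upstream_triplet, downstream_triplet)
```

On this transcription the only G in the leader is the 5'-terminal nucleotide, an AUU triplet is present, and the initiation codon is flanked by ACA and GCA.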
The 5' non-coding region of CGMMV-SH has 60 nucleotides and is 67 to 68% similar to that of TMV vulgare, TMV-L and a Spanish isolate of pepper mild mottle virus (PMMV-S; Avila-Rincon et al., 1989) or TMGMV. The first nucleotide of the CGMMV-SH genome was found to be G, as are those of the other three (TMV vulgare, TMV-L and TMGMV). Although the nature of the cap was not determined, we assume it to be m7Gppp. The 3' non-coding region of the CGMMV-SH genome is composed of 176 nucleotides. The 3'-terminal sequence of CGMMV-SH fits the pseudoknot structure model which was proposed for tobamovirus genomes by Van Belkum et al. (1985) (data not shown).

Fig. 5. Homology plot comparison of 180K and 30K proteins between CGMMV-SH and TMV vulgare. Amino acid sequences were analysed using a DNASIS program based on the theory of Needleman & Wunsch (1970). Each dot marks where at least six of 10 amino acids are identical between the two viruses.

Fig. 6. Comparison of partial amino acid sequences of the 186K protein of CGMMV-SH with those of the 180K protein of TMV vulgare (Goelet et al., 1982). These sequences contain motifs indicated by Habili & Symons (1989). (a) The region contains nucleic acid helicase motifs (motifs I to VI; CGMMV-SH residues 860 to 1110); (b) the region contains RNA polymerase motifs (motifs I to IV; CGMMV-SH residues 1415 to 1560). Upper and lower lines are the amino acid sequences of CGMMV-SH and TMV vulgare, respectively. Identical amino acids are written in white letters. Motifs are overlined. Roman letters represent numbers of motifs used by Habili & Symons (1989). Gaps have been introduced to obtain the closest match.

Open reading frames

No CGMMV-encoded proteins except for the coat protein have yet been identified in vivo (Okada, 1986). However, by analogy with other tobamoviruses, at least four ORFs can be identified on the genome sequence.

The putative 129K and 186K proteins correspond to the 130K and 180K proteins of TMV vulgare and TMV-L. The former protein of CGMMV-SH has 1142 amino acids, which is 27 more than those of TMV vulgare and TMV-L. The 186K readthrough product of CGMMV-SH is longer by 31 amino acids than those of TMV vulgare and TMV-L. A homology plot of the amino acids of the 180K (130K) protein is shown in Fig. 5. Amino acid sequence identity of the CGMMV-SH 186K protein is approximately 48% with the corresponding 180K protein of TMV vulgare. The N-terminal one-third, the C-terminal one-third and the remainder of the CGMMV-SH 129K protein have identities of 48%, 47% and 33% with those of TMV vulgare, respectively.
These values are the same for TMV-L except for that of the C-terminal one-third of the 130K protein (49% identity). The middle one-third of the CGMMV-SH 129K protein is thus less similar in amino acid sequence than the other portions. The readthrough part of the CGMMV-SH 186K protein has 58% identity with the corresponding regions of TMV vulgare and TMV-L, which is higher than that of any other region of the protein. Rozanov et al. (1990) indicated that the N-terminal portions of large putative NTPases of 'Sindbis-like' plant viruses, including tobamoviruses, might be methyltransferases. It is of interest that the 130K protein of TMV-U1 (vulgare) was found to have guanylyltransferase-like activity (Dunigan & Zaitlin, 1990). Although the capping mechanism of plant virus RNA is not known, it is possible that motifs for capping-related enzyme activities are located in the N-terminal one-third of the tobamovirus 130K protein. Habili & Symons (1989) showed that positive-strand viruses could be grouped based on amino acid sequence motifs of nucleic acid helicases and RNA polymerases, and proposed a new luteovirus supergroup. The amino acid sequences of the presumed motifs are compared in Fig. 6. These motifs in the 186K protein of CGMMV-SH are characteristic of tobamoviruses. The amino acid sequence identities in the helicase motifs region (a) were 56% and 53% with TMV-L and TMV vulgare, and those in the polymerase motifs region (b) were 70% and 67%, respectively. A homology plot of amino acids of the 30K protein between CGMMV-SH and TMV vulgare is also shown in Fig. 5. Saito et al. (1988) showed local high homology in the 30K protein among tobamoviruses and a tobravirus.
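The dot criterion given for the homology plots (a dot wherever at least six of 10 amino acids are identical) and the percentage identities quoted for aligned regions can be illustrated with a minimal Python sketch. This is not the DNASIS program used for Fig. 5. The function names, the toy sequences and the choice to count identity only over columns without gaps are assumptions made for illustration.

```python
def dot_plot(seq_a, seq_b, window=10, min_matches=6):
    """Return the (i, j) window start positions that would receive a dot.

    A dot is recorded when at least `min_matches` of the `window` residues
    starting at i in seq_a and at j in seq_b are identical, position by position.
    """
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if matches >= min_matches:
                dots.add((i, j))
    return dots


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over columns where neither row has a gap.

    Assumption: gapped columns are excluded from the denominator; the paper
    does not state which denominator was used.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        identical += a == b
    return 100.0 * identical / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from CGMMV-SH or TMV.
    toy_a = "MSTLKQWERTYIPASDFGHK"
    toy_b = "MSTLKQWARTYIPGSDFGHR"
    print(sorted(dot_plot(toy_a, toy_b)))
    print(percent_identity(toy_a, toy_b))
```

The nested loop is quadratic in sequence length, which is adequate for proteins of about 1600 residues but slow in pure Python; a production tool would use a faster scan.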
The 30K protein is thought to have at least two functions, binding to single-stranded nucleic acid (Citovsky et al., 1990) and interacting with plasmodesmata of host cells (Tomenius et al., 1987). A closely similar region shown in Fig. 5 may be important in the binding of the 30K protein to single-stranded nucleic acid, as indicated by Citovsky et al. (1990).

The authors are very grateful to Drs I. Furuki and T. Ohsawa for providing them with the sample of CGMMV-SH and the muskmelon seeds.

References
AINSWORTH, G. C. (1935). Mosaic disease of cucumber. Annals of Applied Biology 22, 55-67.
AVILA-RINCON, M. J., FERRERO, M. L., ALONSO, E., GARCIA-LUQUE, I. & DIAZ-RUIZ, J. R. (1989). Nucleotide sequences of 5' and 3' non-coding regions of pepper mild mottle virus strain S RNA. Journal of General Virology 70, 3025-3031.
CITOVSKY, V., KNORR, D., SCHUSTER, G. & ZAMBRYSKI, P. (1990). The p30 movement protein of tobacco mosaic virus is a single-strand nucleic acid binding protein. Cell 60, 637-647.
DEOM, C. M., OLIVER, M. J. & BEACHY, R. N. (1987). The 30-kilodalton gene product of tobacco mosaic virus potentiates virus movement. Science 237, 389-394.
DUNIGAN, D. D. & ZAITLIN, M. (1990). Capping of tobacco mosaic virus RNA - analysis of viral-coded guanylyltransferase-like activity. Journal of Biological Chemistry 265, 7779-7786.
ENGLAND, T. E., BRUCE, A. G. & UHLENBECK, O. C. (1980). Specific labeling of 3' termini of RNA with T4 RNA ligase. Methods in Enzymology 65, 65-74.
FRAENKEL-CONRAT, H., SINGER, B. & WILLIAMS, R. C. (1957). Infectivity of viral nucleic acid. Biochimica et biophysica acta 25, 87-96.
FRANCKI, R. I. B., HU, J. & PALUKAITIS, P. (1986). Taxonomy of cucurbit-infecting tobamoviruses as determined by serological and molecular hybridization analyses. Intervirology 26, 156-163.
FUKUDA, M., MESHI, T., OKADA, Y., OTSUKI, Y. & TAKEBE, I. (1981). Correlation between particle multiplicity and location on virion RNA of the assembly initiation site for viruses of the tobacco mosaic virus group. Proceedings of the National Academy of Sciences, U.S.A. 78, 4231-4235.
FURUKI, I. & KOMURO, Y. (1973). A watermelon strain of CGMMV in greenhouse-grown melon. Annals of the Phytopathological Society of Japan 39, 218-219 (in Japanese).
GIERER, A. & SCHRAMM, G. (1956). Infectivity of ribonucleic acid from tobacco mosaic virus. Nature, London 177, 702-703.
GOELET, P., LOMONOSSOFF, G. P., BUTLER, P. J. G., AKAM, M. E., GAIT, M. J. & KARN, J. (1982). Nucleotide sequence of tobacco mosaic virus RNA. Proceedings of the National Academy of Sciences, U.S.A. 79, 5818-5822.
GUBLER, U. & HOFFMAN, B. J. (1983). A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.
HABILI, N. & SYMONS, R. H. (1989). Evolutionary relationship between luteoviruses and other RNA plant viruses based on sequence motifs in their putative RNA polymerases and nucleic acid helicases. Nucleic Acids Research 17, 9543-9555.
HANAHAN, D. (1985). Techniques for transformation of E. coli. In DNA Cloning, vol. 1, pp. 109-135. Edited by D. M. Glover. Oxford: IRL Press.
HENIKOFF, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28, 351-359.
INOUE, T., INOUE, N., ASATANI, M. & MITSUHATA, K. (1967). Studies on cucumber green mottle mosaic virus in Japan. Nogaku Kenkyu 51, 175-186 (in Japanese).
KITANI, K., KISO, A. & SHIGEMATSU, Y. (1970). Studies on a new virus disease of cucumber (Cucumis sativus L. var. F1 Kurume-Otiai-H type) discovered in Yodo. Proceedings of the Association for Plant Protection of Shikoku 5, 59-66 (in Japanese).
KOMURO, Y., TOCHIHARA, H., FUKATSU, R., NAGAI, Y. & YONEYAMA, S. (1968). Cucumber green mottle mosaic virus on watermelon in Chiba and Ibaraki Prefectures. Annals of the Phytopathological Society of Japan 34, 377 (in Japanese).
KOMURO, Y., TOCHIHARA, H., FUKATSU, R., NAGAI, Y. & YONEYAMA, S. (1971). Cucumber green mottle mosaic virus in watermelon and its bearing on deterioration of watermelon fruit known as 'Konnyaku' disease. Annals of the Phytopathological Society of Japan 37, 34-42 (in Japanese).
KONARSKA, M. M., PADGETT, R. A. & SHARP, P. A. (1984). Recognition of cap structure in splicing in vitro of mRNA precursors. Cell 38, 731-736.
MANDELES, S. (1968). Localization of unique sequence in tobacco mosaic virus ribonucleic acid. Journal of Biological Chemistry 243, 3671-3674.
MESHI, T., TAKAMATSU, N., OHNO, T. & OKADA, Y. (1982). Molecular cloning of the complementary DNA copies of the common and cowpea strains of tobacco mosaic virus RNA. Virology 118, 64-75.
MESHI, T., ISHIKAWA, M., TAKAMATSU, N., OHNO, T. & OKADA, Y. (1983a). The 5'-terminal sequence of TMV-RNA: question on the polymorphism found in vulgare strain. FEBS Letters 162, 282-285.
MESHI, T., KIYAMA, R., OHNO, T. & OKADA, Y. (1983b). Nucleotide sequence of the coat protein cistron and the 3' non-coding region of cucumber green mottle mosaic virus (watermelon strain) RNA. Virology 127, 54-64.
MESHI, T., WATANABE, Y., SAITO, T., SUGIMOTO, A., MAEDA, T. & OKADA, Y. (1987). Function of the 30 kd protein of tobacco mosaic virus: involvement in cell-to-cell movement and dispensability for replication. EMBO Journal 6, 2557-2563.
MIURA, K., WATANABE, K. & SUGIURA, M. (1974). 5'-Terminal nucleotide sequences of the double-stranded RNA of silkworm cytoplasmic polyhedrosis virus. Journal of Molecular Biology 86, 31-48.
MOSS, B. (1977). Utilization of the guanylyltransferase and methyltransferases of vaccinia virus to modify and identify the 5'-terminals of heterologous RNA species. Biochemical and Biophysical Research Communications 74, 374-383.
MOTOYOSHI, F. & NISHIGUCHI, M. (1988). Control of virus diseases by attenuated virus strains, comparison between attenuated strains of cucumber green mottle mosaic virus and tobacco mosaic virus. Gamma Field Symposia, Institute of Radiation Breeding, National Institute of Agrobiological Resources 27, 91-109.
NEEDLEMAN, S. B. & WUNSCH, C. D. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48, 443-453.
NOZU, Y. & TSUGITA, A. (1986). The amino acid sequence of cucumber green mottle mosaic virus (watermelon strain) protein. Plant Science 44, 47-51.
NOZU, Y., TOCHIHARA, H., KOMURO, Y. & OKADA, Y. (1971). Chemical and immunological characterization of cucumber green mottle mosaic virus (watermelon strain) protein. Virology 45, 577-585.
OHNO, T., AOYAGI, M., YAMANASHI, Y., SAITO, H., IKAWA, S., MESHI, T. & OKADA, Y. (1984). Nucleotide sequence of the tobacco mosaic virus (tomato strain) genome and comparison with the common strain genome. Journal of Biochemistry 96, 1915-1923.
OHSHIMA, Y. (1981). Colony hybridization. In Genetic Manipulation, pp. 303-307. Edited by K. Matsubara & K. Yano (in Japanese). Tokyo: Kyoritsu Publishing Company.
OKADA, Y. (1986). Cucumber green mottle mosaic virus. In The Plant Viruses, vol. 2, pp. 267-281. Edited by M. H. V. Van Regenmortel & H. Fraenkel-Conrat. New York: Plenum Press.
OKADA, Y., FUKUDA, M., TAKEBE, I. & OTSUKI, Y. (1980). Initiation site for assembly of several strains of TMV and its relation to occurrence of the short particles in infected plants. BioSystems 12, 257-264.
RICHARDS, K. E., GUILLEY, H., JONARD, G. & HIRTH, L. (1978). Nucleotide sequence of the 5' extremity of tobacco mosaic virus RNA. European Journal of Biochemistry 84, 513-519.
ROZANOV, M. N., KOONIN, E. V. & GORBALENYA, A. E. (1990). N-terminal domains of large putative NTPases of 'Sindbis-like' plant viruses share amino acid motifs and may be RNA methyltransferases. Abstracts of VIIIth International Congress of Virology, Berlin, p. 377.
SAITO, T., IMAI, Y., MESHI, T. & OKADA, Y. (1988). Interviral homologies of the 30-kD proteins of tobamoviruses. Virology 167, 653-656.
SANGER, F., NICKLEN, S. & COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, U.S.A. 74, 5463-5467.
SOLIS, I. & GARCIA-ARENAL, F. (1990). The complete nucleotide sequence of the genomic RNA of the tobamovirus tobacco mild green mosaic virus. Virology 177, 553-558.
TAKAMATSU, N., WATANABE, Y., IWASAKI, T., SHIBA, T., MESHI, T. & OKADA, Y. (1991). Deletion analysis of the 5' untranslated leader sequence of tobacco mosaic virus. Journal of Virology 65, 1619-1622.
TOMENIUS, K., CLAPHAM, D. & MESHI, T. (1987). Localization by immunogold cytochemistry of the virus-coded 30 K protein in plasmodesmata of leaves infected with tobacco mosaic virus. Virology 160, 363-371.
TOCHIHARA, H. & KOMURO, Y. (1974). Infectivity test and serological relationships among various isolates of cucumber green mottle mosaic virus. Annals of the Phytopathological Society of Japan 40, 52-58 (in Japanese).
TYC, K., KONARSKA, M., GROSS, H. J. & FILIPOWICZ, W. (1984). Multiple ribosome binding to the 5'-terminal leader sequence of tobacco mosaic virus RNA, assembly of an 80S ribosome mRNA complex at the AUU codon. European Journal of Biochemistry 140, 503-511.
UGAKI, M., KAKUTANI, T., TOMIYAMA, M. & MOTOYOSHI, F. (1988). An efficient DNA sequencing protocol using a phage/plasmid chimeric vector and an automated fluorescent DNA sequencer. Bulletin of the National Institute of Agrobiological Resources 4, 277-294.
VAN BELKUM, A., ABRAHAMS, J. P., PLEIJ, C. W. A. & BOSCH, L. (1985). Five pseudoknots are present at the 204 nucleotides long 3' noncoding region of tobacco mosaic virus RNA. Nucleic Acids Research 13, 7673-7686.
VASUDEVA, R. S. & NARIANI, T. K. (1952). Host range of bottlegourd mosaic virus and its inactivation by plant extracts. Phytopathology 42, 149-152.
VASUDEVA, R. S., RAYCHAUDHURI, S. P. & SINGH, J. (1949). A new strain of Cucumis virus. Indian Phytopathology 2, 180-185.

(Received 29 November 1990; Accepted 25 March 1991)