J. gen. Virol. (1967), 1, 101-108
Printed in Great Britain

Nearest Neighbour Base Sequence Analysis of the Deoxyribonucleic Acids of a Further Three Mammalian Viruses: Simian Virus 40, Human Papilloma Virus and Adenovirus Type 2

By J. M. MORRISON AND H. M. KEIR
Institute of Biochemistry
AND H. SUBAK-SHARPE AND L. V. CRAWFORD
Medical Research Council Unit for Experimental Virus Research, Institute of Virology, University of Glasgow, Glasgow, Scotland

(Accepted 17 September 1966)

SUMMARY

The nearest neighbour frequency analyses of the DNAs of Simian virus 40, human papilloma virus and adenovirus type 2 are reported. The two small oncogenic viruses have DNA closely resembling that of the host cells, which confirms and extends the previous findings for such viruses. The DNA of adenovirus type 2 shows only limited resemblance to that of the host cells. The experimental findings are discussed in the context of previous analyses of the DNAs of polyoma virus, Shope papilloma virus, herpes simplex virus, pseudorabies virus, equine rhinopneumonitis virus and vaccinia virus.

INTRODUCTION

Josse, Kaiser & Kornberg (1961)* and Swartz, Trautner & Kornberg (1962) initiated and developed the technique of frequency analysis of doublets (nearest neighbour base sequences) in DNA. Basically, the technique involves isolation of polydeoxyribonucleotide synthesized enzymically on a supplied DNA template, from four reaction mixtures each containing the four deoxyribonucleoside 5'-triphosphates. Only one of these triphosphates (a different one for each reaction) is labelled with 32P in the α-phosphate position. The 32P-labelled phosphate therefore enters the reaction attached at the 5'-position of the deoxyribose moiety in the labelled nucleotide.
The synthesized polydeoxyribonucleotide is then degraded to deoxyribonucleoside 3'-monophosphates by the consecutive action of micrococcal nuclease (EC 3.1.4.7) and spleen phosphodiesterase (EC 3.1.4.1), and the four nucleotides from each reaction are separated by electrophoresis on paper and their 32P contents measured. The 32P label is thus transferred from the 5'-position of the ingoing nucleotide to the 3'-position of the nearest neighbour deoxyribonucleoside in the newly synthesized polydeoxyribonucleotide. For example, if [α-32P]deoxyadenosine 5'-triphosphate were employed, the 32P in the four 3'-monophosphates isolated would describe the frequency distribution of the four doublets ApA, CpA, GpA and TpA. In this way, for any given DNA template, the frequency of occurrence of the nearest neighbour nucleotide to each base in turn is determined. The complete analysis is given as the frequencies of occurrence of all sixteen doublets.

* In keeping with the original paper of Josse et al. (1961), dinucleotide sequences derived from nearest neighbour frequency analysis are denoted by ApC [deoxyadenylyl-(3'-5')-deoxycytidine], GpT [deoxyguanylyl-(3'-5')-deoxythymidine], etc.; the molar proportions of adenine, thymine, guanine and cytosine are denoted by 'a', 't', 'g' and 'c', and are expressed as a fraction of 1.000. % (G + C) is the percentage of guanine plus cytosine in the DNA.

Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Wed, 09 Aug 2017 23:30:02

Josse et al. (1961) and Swartz et al. (1962) investigated DNAs from many different sources, and showed that the DNAs of different organisms have highly characteristic and significantly non-random doublet frequencies.
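What "non-random doublet frequencies" means can be illustrated on a known base sequence. The Python sketch below is only an illustration, not the enzymic procedure of the paper: it counts the sixteen doublets on both strands of a sequence, expresses them in parts per thousand, and compares them with the random expectation given by the product of the base frequencies. The function names and the toy sequence are hypothetical, and the input is assumed to contain only A, C, G and T.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def both_strands(seq):
    """Return the given strand and its reverse complement, both read 5'->3'."""
    seq = seq.upper()
    return [seq, seq.translate(_COMPLEMENT)[::-1]]


def doublet_frequencies(seq):
    """Observed frequency of each of the 16 doublets, in parts per thousand."""
    counts = Counter()
    for strand in both_strands(seq):
        counts.update(strand[i:i + 2] for i in range(len(strand) - 1))
    total = sum(counts.values())
    return {x + y: 1000 * counts[x + y] / total
            for x, y in product(BASES, repeat=2)}


def random_expectation(seq):
    """Doublet frequencies expected if neighbours were independent
    (product of the two base frequencies), in parts per thousand."""
    counts = Counter("".join(both_strands(seq)))
    total = sum(counts.values())
    return {x + y: 1000 * counts[x] * counts[y] / total ** 2
            for x, y in product(BASES, repeat=2)}


def neighbours_reported_by(freqs, labelled_base):
    """Doublets XpN measured when nucleotide N carries the 32P label,
    e.g. ApA, CpA, GpA and TpA for labelled deoxyadenosine triphosphate."""
    return {d: f for d, f in freqs.items() if d[1] == labelled_base}


if __name__ == "__main__":
    toy = "ATGCATTAGGCTAACCTTGA"  # hypothetical sequence, for illustration only
    observed = doublet_frequencies(toy)
    expected = random_expectation(toy)
    for doublet in sorted(observed):
        print(f"{doublet[0]}p{doublet[1]}  observed {observed[doublet]:6.1f}"
              f"  expected {expected[doublet]:6.1f}")
```

Because both strands are counted, each doublet and its reverse complement (for example ApC and GpT) come out equal, which is why such pairs are usually tabulated side by side.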
Their data showed a striking anomaly in the occurrence of the CpG doublet, there being close correspondence to random expectation of its frequency in the DNAs of bacteria and bacteriophages, a substantial deficiency (two-thirds of random expectation is present) in echinoderms, and an extreme deficiency (less than one-third of expectation) in vertebrates.

Assuming that the major part of an organism's DNA is concerned with the specification of polypeptides, then the extreme infrequency of the CpG doublet in the vertebrates must reflect rarity of use of this doublet in programming protein synthesis. Translation of nucleic acids into proteins is mediated by codon-specific species of transfer-RNA molecules, and it seems reasonable to assume that the population of transfer-RNA species in the cells of an organism will be optimally adapted (as a consequence of natural selection) to the translation requirements of the DNA of that organism. These considerations imply that there should be a severe shortage in mammalian cells of those transfer-RNA species that recognize CpG-containing codons.

The DNAs of viruses which use the pre-existing translation apparatus of the host cells would have to be adapted to be translated by the transfer-RNA population of the host cells. The doublet pattern of such virus DNA should therefore resemble that of the host DNA. Only viruses that modify the codon recognition of the cell's transfer-RNA population would escape this restriction, and the doublet pattern of the nucleic acid of such viruses would be independent of that of the host cells. To test this hypothesis, we have analysed the DNAs of six mammalian viruses and their host cells (Subak-Sharpe et al. 1966a).
It was found that the doublet pattern of the DNA of the two small oncogenic viruses tested (polyoma and Shope papilloma) did indeed closely resemble that of mammalian cell DNA, whereas the DNAs of four large viruses (herpes simplex, pseudorabies, equine rhinopneumonitis and vaccinia) gave different patterns. These latter patterns conformed more closely to random expectations although differing from each other.

The present communication deals with the analysis of a further three DNA viruses which were studied to ascertain whether the conclusions tentatively drawn from the earlier investigation were more generally applicable. Three of the DNAs previously tested were included to serve as controls (vaccinia and equine rhinopneumonitis DNAs; and BHK21/C13 cell DNA).

METHODS

Preparation of DNA. DNAs from equine rhinopneumonitis virus, vaccinia virus and BHK21/C13 (hamster) cells were prepared as previously described (Subak-Sharpe et al. 1966a). The DNAs of human papilloma virus and Simian virus 40 (sv40) were prepared according to Crawford & Crawford (1963) and Crawford & Black (1964). Adenovirus type 2 was grown in monolayer cultures of HeLa cells and extracted from the infected cells by three cycles of freezing and thawing followed by a 3 hr treatment at 37° with 0.25 % sodium deoxycholate. The resulting virus suspension was purified by density gradient centrifugation and the virus DNA extracted by the method of Green & Piña (1964) and freed from protein by sedimentation through CsCl (density = 1.45 g./ml.). All DNAs were checked for purity by centrifugation to equilibrium in CsCl in the Model E Spinco Ultracentrifuge (Table 1).

Nearest neighbour frequency analysis. The technical details and procedure adopted were as presented by Subak-Sharpe et al.
(1966a), which contained only slight modifications of the procedure described by Josse et al. (1961) and Swartz et al. (1962).

Table 1. Experimentally obtained values from nearest neighbour frequency analyses of the DNAs of BHK21/C13 cells and five mammalian viruses

DNA                     BHK21/C13   sv40    Human       Adenovirus   Equine rhino-   Vaccinia
                                            papilloma   type 2       pneumonitis
ApT                        82        74        79          48            48            124
TpA                        73        68        72          44            50            111
ApA                        98       105        91          64            58            106
TpT                       108       116        96          68            57            112
GpT                        60        58        57          59            63             53
ApC                        52        48        54          56            58             49
TpG                        79        77        73          71            67             57
CpA                        68        72        69          66            58             52
GpA                        62        54        57          55            60             65
TpC                        57        50        55          55            57             61
ApG                        69        73        65          62            62             55
CpT                        68        62        64          64            62             53
GpG                        44        49        50          72            79             28
CpC                        40        44        45          72            73             26
GpC                        35        44        48          82            77             22
CpG                         8         6        24          62            72             28
(G + C) % from
frequency analysis         38.2      39.0      41.4        53.2          54.4           32.5
(G + C) % from buoyant
density determination      42        41        41          57            55             35

Each DNA was analysed at least twice; the average values are given. The values of the doublet frequencies are expressed in parts per thousand.

RESULTS

The nearest neighbour frequencies of the six DNAs analysed are presented in Table 1. In five of the six analyses, the percentage (G + C) of the synthesized DNAs was slightly lower than would be expected from published values or from the buoyant density determinations on the template DNA. This was also found in our previous studies (Subak-Sharpe et al. 1966a) and is at present under investigation. It must introduce a small error into the frequency patterns, but we consider that the doublet patterns of the DNAs examined are sufficiently precise for our present purposes.

A series of 'shortage histograms' (Fig. 1) has been drawn to render easier individual comparisons of doublet frequencies in the viral DNAs and the host cell DNA. The 'shortage histogram' indicates the extent (expressed as a percentage) to which any doublet frequency in the host DNA falls short of that in the viral DNA.
It has been devised to focus attention on those doublets in the viral DNA which, if included in codons, might give rise to difficulties at the level of translation. Excess of a doublet in the host DNA relative to the virus DNA is of little interest here as it is the virus DNA which parasitizes the host cell. We have arbitrarily assigned four levels of shortage which are illustrated in Fig. 1, but do not regard shortages of less than two-thirds to constitute a serious problem for translation. The DNAs of sv40 and human papilloma virus are unlikely to encounter difficulties at the level of the cell's translation apparatus, whereas the DNA of adenovirus type 2 may well do so. The results for equine rhinopneumonitis virus and vaccinia virus confirm our previous findings (Subak-Sharpe et al. 1966a).

[Figure 1: shortage histograms for the sixteen doublets; panels: sv40, Human papilloma, Adenovirus type 2, Equine rhinopneumonitis, Vaccinia.]

Shortage histogram = 100 - (doublets per 1000 present in host × 100) / (doublets per 1000 present in virus)

Fig. 1. Doublet frequency patterns of mammalian virus DNAs expressed as shortage histograms relative to the frequency pattern of human spleen cell DNA. The latter, which is shown on the figure above the shortage histogram, is used as a reference (data of Swartz et al. 1962), since all mammalian DNAs investigated so far have essentially the same doublet frequency pattern. The four shading levels denote shortages of 1-49 %; 50-66 %; 67-89 %; > 90 %.

In the case of every DNA, the frequency of each doublet deviates characteristically from random expectation.
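The shortage measure defined with Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the band boundaries are applied to unrounded percentages, and the host reference used here is the BHK21/C13 CpG value from Table 1, whereas Fig. 1 itself uses human spleen DNA (Swartz et al. 1962), whose values are not given in this paper. Negative results, which indicate an excess in the host, are set to zero because the text treats them as of little interest.

```python
def shortage(host_per_1000, virus_per_1000):
    """Percentage by which the host doublet frequency falls short of the
    viral one: 100 - (host x 100 / virus). An excess in the host gives 0."""
    value = 100.0 - 100.0 * host_per_1000 / virus_per_1000
    return max(value, 0.0)


def shortage_level(percent):
    """Assign one of the four shortage bands named in the Fig. 1 legend.
    Applying the bands to unrounded values is an assumption of this sketch."""
    if percent <= 0:
        return "none"
    if percent < 50:
        return "1-49 %"
    if percent < 67:
        return "50-66 %"
    if percent < 90:
        return "67-89 %"
    return "90 % or more"


# CpG frequencies (parts per thousand) taken from Table 1.
HOST_CPG = 8  # BHK21/C13, used here as a stand-in host reference
VIRUS_CPG = {
    "sv40": 6,
    "Human papilloma": 24,
    "Adenovirus type 2": 62,
    "Equine rhinopneumonitis": 72,
    "Vaccinia": 28,
}

for name, freq in VIRUS_CPG.items():
    s = shortage(HOST_CPG, freq)
    print(f"{name}: {s:.1f} % ({shortage_level(s)})")
```

Since the reference DNA differs from the one used for Fig. 1, the bands printed by this sketch need not coincide with those drawn in the figure.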
To facilitate comparison of these DNAs of widely differing (G + C) content, the observed doublet frequency values have been normalized to the values they would have if the DNA contained 50 % (G + C) (Subak-Sharpe et al. 1966a). The normalized values are listed in Table 2.

Table 2. Normalized nearest neighbour frequencies of the DNAs of BHK21/C13 cells and of five mammalian viruses

DNA      BHK21/C13   sv40     Human       Adenovirus   Equine rhino-   Vaccinia
                              papilloma   type 2       pneumonitis
ApT         54        50         58          55            58             68
TpA         48        46         53          50            60             61
ApA         68        73         68          76            71             60
TpT         67        75         69          74            67             60
GpT         59        57         57          58            61             58
ApC         59        54         58          58            61             58
TpG         78        76         73          69            65             62
CpA         77        81         74          68            61             62
GpA         65        55         58          56            59             73
TpC         62        54         57          54            58             71
ApG         72        74         66          63            61             62
CpT         73        67         67          63            64             61
GpG         69        73         69          63            63             63
CpC         75        80         69          64            65             65
GpC         60        72         70          73            65             52
CpG         14        10         35          55            61             66
a         0.3011    0.2989     0.2895      0.2293        0.2258         0.3336
t         0.3168    0.3109     0.2959      0.2390        0.2304         0.3414
g         0.1994    0.2049     0.2124      0.2673        0.2792         0.1672
c         0.1828    0.1853     0.2021      0.2644        0.2645         0.1579
a/t        0.95      0.96       0.98        0.96          0.98           0.98
g/c        1.09      1.11       1.05        1.01          1.06           1.06

The values shown in Table 1 have been normalized to correspond with DNA containing 50 % (G + C). Normalizing entails dividing the observed doublet frequency, as listed in Table 1, by the product of the frequencies of the bases that make up the particular doublet, and multiplying by 0.0625 (the random frequency expected for every doublet in DNA of 50 % (G + C) content) (Subak-Sharpe et al. 1966). For example, for the DNA of BHK21/C13 cells, the normalized frequency of the ApT doublet is [82/(0.3011 × 0.3168)] × 0.0625, i.e. approximately 54 parts per thousand.

The direction and extent of deviation from random expectation for all sixteen doublets are depicted in Fig. 2 to allow ready comparison of the patterns of deviation found in the different DNAs. The overall doublet patterns of the DNA of sv40, and to a lesser extent of human papilloma virus, strikingly resemble that of the host DNA.
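The normalization described with Table 2, and the deviation from the random expectation of 62.5 parts per thousand plotted in Fig. 2, can be reproduced for the BHK21/C13 column with the short Python sketch below. The observed frequencies and base fractions are copied from Tables 1 and 2; the variable and function names are hypothetical, and small differences from the published figures are possible because the tabulated inputs are already rounded.

```python
RANDOM_50 = 0.0625  # random frequency of any doublet in DNA of 50 % (G + C)

# BHK21/C13 base fractions (Table 2) and observed doublet frequencies in
# parts per thousand (Table 1). "AT" stands for ApT, and so on.
BASE_FRACTIONS = {"A": 0.3011, "T": 0.3168, "G": 0.1994, "C": 0.1828}
OBSERVED = {
    "AT": 82, "TA": 73, "AA": 98, "TT": 108,
    "GT": 60, "AC": 52, "TG": 79, "CA": 68,
    "GA": 62, "TC": 57, "AG": 69, "CT": 68,
    "GG": 44, "CC": 40, "GC": 35, "CG": 8,
}


def normalize(observed_per_1000, first, second, fractions):
    """Observed frequency divided by the product of the two base
    frequencies, multiplied by 0.0625 (result in parts per thousand)."""
    return observed_per_1000 / (fractions[first] * fractions[second]) * RANDOM_50


normalized = {d: normalize(v, d[0], d[1], BASE_FRACTIONS)
              for d, v in OBSERVED.items()}

# Deviation from the random expectation of 62.5 parts per thousand.
deviation = {d: n - 1000 * RANDOM_50 for d, n in normalized.items()}

for d in OBSERVED:
    print(f"{d[0]}p{d[1]}  normalized {normalized[d]:5.1f}"
          f"  deviation {deviation[d]:+6.1f}")
```

For the ApT doublet this evaluates the worked example in the text, and for CpG it gives a strongly negative deviation, in line with the rarity of that doublet in mammalian DNA.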
Thus, in this respect, these two small oncogenic viruses are similar to the two originally investigated (polyoma virus and Shope papilloma virus). In contrast, the pattern of deviation from random expectation shown by the DNA of adenovirus type 2 differs more markedly from that of the host DNA, approaching the more nearly random patterns of the DNAs of the previously studied large viruses, namely vaccinia, equine rhinopneumonitis, herpes simplex and pseudorabies.

[Figure 2: histograms of deviation from random expectation for the sixteen doublets; panels: BHK21/C13 (38 %), sv40 (39 %), Human papilloma (41 %), Adenovirus type 2 (53 %), Vaccinia (32 %), Equine rhinopneumonitis (54 %).]

Fig. 2. Doublet frequency pattern of DNAs normalized to 50 % (G + C) content and expressed in terms of deviation in parts per thousand from random expectation. The random expectation for each doublet is 62.5 parts per thousand. The (G + C) contents given for each DNA are those calculated from the nearest neighbour analyses. The deviation from the expectation (of 62.5) was calculated by use of the normalized experimental values, which are given in Table 2.

DISCUSSION

It is now clear that four viruses (polyoma, sv40, human papilloma and Shope papilloma) have DNAs whose doublet patterns closely resemble that of the DNA of their mammalian host cells. These viruses also have in common that they are small, with information in their DNAs sufficient to specify only in the order of ten polypeptides, that they are oncogenic, and that they contain DNA that is supercoiled and has a (G + C) content of 41 to 48 %.
Previously, we have suggested that small viruses (a) may have to utilize the pre-existing translation apparatus of the host cells, and (b) may have evolved from stretches of the DNA of ancestral host cells. The fact that the four such viruses investigated to date are oncogenic may or may not be significant. At this stage it is not justifiable to infer, in addition to a correlation between doublet pattern and smallness of the nucleic acid, also a correlation between doublet pattern and oncogenic potential. To clarify the situation, the doublet patterns of the DNAs of oncogenic and non-oncogenic members of the adenovirus group will have to be determined. A beginning has been made with the DNA of the non-oncogenic adenovirus type 2 and it is noteworthy that its doublet pattern exhibits only limited resemblance to that of the host DNA, showing much greater resemblance to the more nearly random patterns of the DNAs of the large viruses. It is therefore important to investigate the highly oncogenic adenoviruses types 12, 18 and 31, and the weakly oncogenic adenoviruses types 3, 7, 11, 14, 16 and 21, and compare their doublet patterns to those of non-oncogenic adenoviruses (2, 5, etc.). This is now being done.

At this stage, a case could be made for an empirical relationship between the degree of resemblance of the DNA of viruses to that of the host cells and the size of the viral genome. (Molecular weights of the DNAs are approximately as follows: polyoma virus and sv40, 3 × 10^6; Shope and human papilloma viruses, 5 × 10^6; adenovirus type 2, 23 × 10^6; herpes simplex virus, 70 × 10^6; vaccinia virus, 160 × 10^6.) However, this empirical approach seems to us to be unprofitable.
We prefer, and have presented, the hypothesis that large animal viruses with a doublet pattern which shows no resemblance to that of the host cell DNA might (a) modify the translation apparatus of the host cells by introduction or modification of transfer-RNA species, and (b) take their origin in an evolutionary sense from the DNA of organisms not closely related to the host cells. In the case of herpes simplex virus there already is evidence which strongly suggests that this virus specifies new arginyl transfer-RNA (Subak-Sharpe, Shepherd & Hay, 1966).

With regard to the peculiar rarity of the CpG doublet in vertebrate DNA, it is noteworthy that all the 5-methylcytosine which is found in mammalian DNA is present in the sequence CpG, and apparently that all cytosine in this sequence is methylated (Doskočil & Šormová, 1965); however, very little, if any, cytosine in the DNA of polyoma virus grown in mammalian cells is methylated (Winocour, Kaye & Stollar, 1965). The implications of neither observation are understood. Rarity of CpG, taken together with the invariable enzymic methylation of only this doublet, indicates perhaps that its function along the genetic message in mammalian DNA is unusual. Here, C followed by G might not be used within codons, but only with C in one codon and G in the next (XXC: GXX: XXX). If transcription and translation are topographically connected (Stent, 1965) then methylation in mammalian DNA may affect the fidelity of translation from DNA via RNA into protein. This may be unnecessary and therefore not apply to invading viral DNA. Alternatively, viral DNA may be spatially remote from the cell's methylating enzymes, and coating of the DNA and virus assembly may proceed too promptly for methylation to take place.

We are grateful to Professor J. N. Davidson, F.R.S., and Professor M. G. P.
Stoker for their interest and support. The investigation was aided by a grant from the British Empire Cancer Campaign. We also acknowledge with thanks the skilled technical assistance of Miss Helen Moss, Mrs M. Scott and Mr P. Ferry. The Escherichia coli strain cells from which the DNA polymerase was prepared were a generous gift from Dr R. Elsworth and colleagues, M.R.E. Porton, England.

REFERENCES

CRAWFORD, L. V. & BLACK, P. H. (1964). The nucleic acid of Simian virus 40. Virology 24, 388.
CRAWFORD, L. V. & CRAWFORD, E. M. (1963). A comparative study of polyoma and papilloma viruses. Virology 21, 258.
DOSKOČIL, J. & ŠORMOVÁ, Z. (1965). The methylated bases in deoxyribonucleic acids. I. Sequences of deoxy-5-methyl-cytidylic acid in bacterial DNA. Colln Czech. chem. Commun. 30, 38.
GREEN, M. & PIÑA, M. (1964). Biochemical studies on adenovirus multiplication. VI. Properties of highly purified tumorigenic human adenoviruses and their DNAs. Proc. natn. Acad. Sci. U.S.A. 51, 1251.
JOSSE, J., KAISER, A. D. & KORNBERG, A. (1961). Enzymatic synthesis of deoxyribonucleic acid. VIII. Frequencies of nearest neighbor base sequences in deoxyribonucleic acid. J. biol. Chem. 236, 864.
STENT, G. S. (1965). Genetic transcription. Proc. R. Soc. B, 164, 181.
SUBAK-SHARPE, H., BÜRK, R. R., CRAWFORD, L. V., MORRISON, J. M., HAY, J. & KEIR, H. M. (1966a). An approach to evolutionary relationships of mammalian DNA viruses through analysis of the pattern of nearest neighbor base sequences. Cold Spring Harb. Symp. quant. Biol. 31 (in the Press).
SUBAK-SHARPE, H., SHEPHERD, W. M. & HAY, J. (1966). Studies on s-RNA coded by herpes virus. Cold Spring Harb. Symp. quant. Biol. 31 (in the Press).
SWARTZ, M. N., TRAUTNER, T. A. & KORNBERG, A. (1962). Enzymatic synthesis of deoxyribonucleic acid. XI. Further studies on nearest neighbor base sequences in deoxyribonucleic acids. J. biol. Chem. 237, 1961.
WINOCOUR, E., KAYE, A. M. & STOLLAR, V. (1965).
Synthesis and transmethylation of DNA in polyoma-infected cultures. Virology 27, 156.

(Received 6 September 1966)