J. gen. Virol. (1969), 5, 379-389
Printed in Great Britain
Correlations between the Amino Acid and Nucleotide Composition of Plant Virus
Particles: Evidence that Plants use the same Genetic Code as Bacteria
By A. J. GIBBS
Department of Microbiology, John Curtin School of Medical Research,
Australian National University, Canberra, Australia
AND
G. A. MACINTYRE
Division of Mathematical Statistics,
Commonwealth Scientific and Industrial Research Organization,
Canberra, Australia
(Accepted 10 June 1969)
SUMMARY
Correlations between the amino acid and nucleotide compositions of the
particles of 41 plant viruses suggest that plant viruses, and hence presumably
plants, use the same genetic code as bacteria, and that the gene for the protein
in the particles of each of these viruses has a nucleotide composition similar
to that of the whole nucleic acid molecule of the virus.
INTRODUCTION
It is now widely accepted that when a protein is synthesized in a cell the sequence
of amino acids in the protein is determined by the sequence of bases in a messenger
ribonucleic acid; each amino acid is specified by a sequence of three nucleotides called
a codon. Each amino acid is specified by one or more codons. Experiments to determine directly which codons specify each amino acid have been done in various ways
using bacterial cell extracts and viruses (summarized by Sadgopal, 1968). This assignment of codons we will call the bacterial genetic code.
Direct experiments, similar to those with bacteria, to see whether plants use the
same code have not been satisfactory for various technical reasons, but three types
of indirect evidence suggest that plants do use the same code. All come from work with
viruses, and assume that the viruses use the host's mechanisms for translating the
information in their nucleic acids into proteins, and hence use the same code. The
most convincing evidence is from studies of artificially induced mutants of tobacco
mosaic virus (TMV) (evidence summarized by Fraenkel-Conrat, 1968). The amino
acid changes induced in the protein of the particles of this virus, when the virus
nucleic acid has been treated with nitrous acid, are mostly consistent with the bacterial
genetic code; nitrous acid deaminates adenine and cytosine to form hypoxanthine
and uracil respectively. However, even if we accept that nitrous acid only deaminates
the bases in TMV nucleic acid (and there is no evidence for this), these studies do not
exclude the possibility that the plant code is partially or wholly a mirror image of the
bacterial code (i.e. the roles of adenine and cytosine, and of guanine and uracil are
reversed), nor do they explain why several of the TMV mutants have amino acid
changes that are not compatible with the bacterial genetic code.
The other evidence that suggests that plants use the bacterial genetic code is much
less convincing; there are reports (Clark et al. 1965; van Ravenswaay-Claasen et al.
1967) that some plant virus nucleic acids, when put into Escherichia coli cell-free
protein-producing systems (Nirenberg & Matthaei, 1961), induce the formation of
proteins like those produced in the host plant. These experiments have not yet been
confirmed, and similar experiments with several other plant virus nucleic acids have
failed. There are also a few plant viruses that multiply in their insect vectors, but it is
not known whether all the nucleic acid of these viruses is translated into proteins in
both plant and insect; different parts of the nucleic acid of these viruses may be
specially adapted to each type of host. Finally, claims have been made that a bacteriophage will grow in plants (Sander, 1964; Schwartz et al. 1965), and a plant virus
in animal cells (Atherton, 1968), but neither these nor similar experiments with bacterial and animal cells and viruses have been confirmed yet.
In this paper we describe an alternative, though still indirect, way of checking
whether the plant and bacterial genetic codes are similar. The nucleic acids and
proteins of the particles of many different plant viruses have now been analysed
chemically, so that one can test whether the amino acid composition of the proteins
of these viruses is what one would expect from the nucleotide composition of their
nucleic acids if the code used in the translation mechanism in plants is similar to that
used by bacteria. To test this idea we must assume that the gene for the protein of the
particles of each plant virus has a composition similar to that of the whole nucleic
acid molecule of the virus, and that plants use the several codons associated with an
amino acid equally frequently. What evidence there is suggests that the first assumption may be correct, for when the nucleic acids from different components of a multicomponent plant virus, or from defined fragments of a plant virus nucleic acid, have
been analysed, most have been found to have an almost identical base composition
(evidence summarized by Gibbs, 1969). There is some evidence to suggest that the
second assumption may be partly incorrect (Streisinger et al. 1966; Inouye et al. 1967),
and codons ending in guanine or cytosine may be used infrequently in bacteria; therefore in one test we compared the full bacterial code with an 'A/U-rich' code, omitting,
where possible, all codons ending in guanine or cytosine.
We have tested our idea in two ways. We computed correlation coefficients between
the nucleotide and the amino acid compositions of the particles of 41 plant viruses,
to see whether the correlations obtained are related to the codon assignments for
amino acids in the bacterial genetic code. We also used the bacterial genetic code to
predict the amino acid composition of the particles of these viruses from their nucleotide composition, and vice versa, and then tested whether the predicted and observed
compositions are correlated.
Sources of information
The viruses, and the sources of the results of analyses that we used, are given in
Table 1. Virus names and cryptograms are from Martyn (1968). Three-letter abbreviations are used in the tabulated results for the names of amino acids (Dayhoff & Eck,
1968). Not all the viruses had been analysed for cysteine and tryptophan, therefore
these were omitted from the analyses. The average nucleotide composition of the
codons for each amino acid (Table 2) was calculated from the bacterial codon assignments tabulated by Sadgopal (1968), assuming, except where stated, that the several
codons for each amino acid were used equally frequently. For one test the average
nucleotide composition of codons for each amino acid was calculated from, where
possible, only those codons in which adenine or uracil is the third base of the triplet.
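The averaging described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: it assumes that the codon assignments tabulated by Sadgopal (1968) coincide with the standard genetic code, and the names `AMINO_ACIDS`, `THREE_LETTER`, `codons_by_amino_acid`, `mean_composition` and `FULL_CODE` are introduced here for illustration only. Aspartic acid and asparagine are pooled as asx, glutamic acid and glutamine as glx, and cysteine and tryptophan are omitted, as in the tables.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in the order UUU, UUC, UUA, UUG, UCU, ...
# ('*' marks a terminator codon). Assumed here to match the bacterial code of the paper.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# One-letter to three-letter names; Asp/Asn and Glu/Gln are pooled as asx and glx,
# and Cys, Trp and the terminators are left out.
THREE_LETTER = {
    "A": "ala", "R": "arg", "D": "asx", "N": "asx", "E": "glx", "Q": "glx",
    "G": "gly", "H": "his", "I": "ile", "L": "leu", "K": "lys", "M": "met",
    "F": "phe", "P": "pro", "S": "ser", "T": "thr", "Y": "tyr", "V": "val",
}


def codons_by_amino_acid():
    """Group the codons of the genetic code by three-letter amino acid name."""
    groups = {}
    for bases, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS):
        name = THREE_LETTER.get(amino_acid)
        if name is not None:
            groups.setdefault(name, []).append("".join(bases))
    return groups


def mean_composition(codons):
    """Fraction of G, A, C and U over all positions of the given codons."""
    pooled = "".join(codons)
    return {base: pooled.count(base) / len(pooled) for base in "GACU"}


# Every codon of an amino acid is counted once, i.e. equal codon usage is assumed.
FULL_CODE = {name: mean_composition(codons)
             for name, codons in codons_by_amino_acid().items()}

print({base: round(value, 2) for base, value in FULL_CODE["arg"].items()})
# {'G': 0.44, 'A': 0.22, 'C': 0.28, 'U': 0.06}
```

For arginine this gives the average composition G 0.44:A 0.22:C 0.28:U 0.06 quoted in the Results.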
Table 1. The viruses, and the sources of information
Alfalfa mosaic virus; R/1:1.3/18:U/U:S/Ap. (top and bottom components). Kelley & Kaesberg
(1962); Rauws, Jaspers & Veldstra (1964).
Bean pod mottle virus; R/1:2.4/35:S/S:S/Cl; cowpea mosaic virus group. Semancik & Bancroft
(1964); Semancik (1966).
Bean southern mosaic virus; R/1:1.4/21:S/S:S/Ap. (cowpea, Mexican and type strains). Ghabrial,
Shepherd & Grogan (1967).
Broad bean mottle virus; R/1:1.1/22:S/S:S/*; brome mosaic virus group. Aronson & Bancroft (1962);
Yamazaki & Kaesberg (1963).
Brome mosaic virus; R/1:1/22:S/S:S/*; brome mosaic virus group. Bockstahler & Kaesberg (1965);
Stubbs & Kaesberg (1964).
Carnation mottle virus; R/1:*/*:S/S:S/*. R. MacLeod (personal communication).
Carnation ring spot virus; R/1:1.4/20:S/S:S/*. Kalmakoff & Tremaine (1967).
Clover (white) mosaic virus; R/1:*/5:E/E:S/(Ap); potato virus X group. Fry, Grogan & Lyttleton
(1960); Miki & Knight (1967).
Crotalaria (Sunn Hemp) mosaic virus; R/1:*/*:E/E:S/*; tobacco mosaic virus group. Rees & Short
(1965); R. H. Symons (personal communication).
Cucumber green mottle mosaic virus; R/1:*/*:E/E:S/*; tobacco mosaic virus group. Knight (1952);
Markham & Smith (1950); van Regenmortel (1967b).
Cowpea chlorotic mottle virus; R/1:1.1/24:S/S:S/*; brome mosaic virus group. Bancroft et al. (1968).
Cucumber mosaic virus; R/1:1/18:S/S:S/Ap. Francki et al. (1966); Kaper, Diener & Scott (1965);
van Regenmortel (1967a).
Cucumber (wild) mosaic virus; R/1:2.4/35:S/S:S/Cl; turnip yellow mosaic group. Symons et al.
(1963).
Echtes Ackerbohnenmosaik virus; R/1:*/*:S/S:S/*; cowpea mosaic virus group. Gibbs, Giussani-Belli & Smith (1968); Wittmann & Paul (1961).
Pea enation mosaic virus; R/1:1.3/27:S/S:S/Ap. Shepherd, Wakeman & Ghabrial (1968).
Pelargonium leaf curl virus; R/1:1.5/17:S/S:S/*. Dorner & Knight (1953); de Fremery & Knight
(1955).
Potato virus X; R/1:*/6:E/E:S/(Fu); potato virus X group. Dorner & Knight (1953); Miki & Knight
(1968); Reichmann (1964).
Satellite virus; R/1:0.4/20:S/S:S/Fu. Reichmann (1964).
Sowbane mosaic virus; R/1:1.3/17:S/S:S/Di. Kado (1967).
Squash mosaic virus; R/1:2.4/35:S/S:S/Cl; cowpea mosaic virus group. Mazzone, Incardona &
Kaesberg (1962).
Tobacco mosaic virus; R/1:2/5:S/S:S/*; tobacco mosaic virus group. (GA, J14D1, masked, ribgrass,
type, YA, and Y-TAMV strains) Dorner & Knight (1953); Knight (1952); Markham & Smith
(1950); Tsugita (1962); Wang & Knight (1967).
Tobacco rattle virus; R/1:2.3/5:E/E:S/Ne; Netu virus group. Offord & Harris (1965); Semancik &
Kajiyama (1967).
Tobacco ringspot virus; R/1:1.8/42:S/S:S/Ne; Nepo virus group. Randles & Francki (1965); Stace-Smith, Reichmann & Wright (1965).
Tomato ringspot virus; R/1:*/41:S/S:S/Ne; Nepo virus group. Tremaine & Stace-Smith (1968).
Turnip crinkle virus; R/1:2/25:S/S:S/Cl. Symons et al. (1963); R. MacLeod (personal communication).
Turnip yellow mosaic virus; R/1:1.9/37:S/S:S/Cl; turnip yellow mosaic virus group (cauliflower,
Denmark, Honesty, Rademacher, Rothamsted and type strains). Symons et al. (1963).
Table 2. Correlations between the nucleotide and amino acid composition of the
particles of 41 plant viruses, and their agreement with the bacterial code

         Composition of plant virus particles     Bacterial genetic code.
         -------------------------------------    Mean nucleotide composition†
         Mean %     Correlations* of the amino     of codons for each              Agreement‡
Amino    of         acid composition with          amino acid                      ---------------------
acid     particles  nucleotides                                                    Measure of   Relative
                     G      A      C      U         G     A     C     U            agreement    value
ala       9.3       0.20   0.34  -0.23  -0.06      0.42  0.08  0.42  0.08          -0.35        0.86
arg       5.2       0.67   0.27  -0.57   0.30      0.44  0.22  0.28  0.06           0.18        0.39
asx       9.5       0.28   0.43  -0.46   0.36      0.17  0.50  0.17  0.17           0.44        0.74
glx       8.3       0.37   0.40  -0.46   0.24      0.33  0.50  0.17  0.00           0.40        0.84
gly       6.1       0.28   0.03  -0.27   0.31      0.75  0.08  0.08  0.08           0.47        0.97
his       1.2      -0.41  -0.52   0.51  -0.19      0.00  0.33  0.50  0.17           0.68        0.12
ile       5.2      -0.70  -0.29   0.60  -0.34      0.00  0.44  0.11  0.44          -0.08        0.56
leu       8.5      -0.30  -0.58   0.38   0.07      0.11  0.11  0.27  0.50           0.63        0.74
lys       4.0       0.12   0.06  -0.07   0.10      0.17  0.83  0.00  0.00           0.17        0.75
met       1.6      -0.22  -0.08   0.23  -0.19      0.33  0.33  0.00  0.33          -0.95        0.12
phe       4.0       0.24   0.37  -0.52   0.60      0.00  0.00  0.17  0.83           0.41        0.75
pro       6.3      -0.79  -0.61   0.87  -0.57      0.08  0.08  0.75  0.08           0.99        1.00
ser       9.0      -0.25  -0.12   0.09   0.17      0.17  0.17  0.33  0.33           0.95        0.39
thr       9.3      -0.29   0.05   0.30  -0.46      0.08  0.42  0.42  0.08           0.93        0.86
tyr       2.6       0.56   0.04  -0.41   0.30      0.00  0.33  0.17  0.50          -0.11        0.14
val       7.7       0.35   0.03  -0.14  -0.10      0.42  0.08  0.08  0.42           0.46        0.72

Grouped amino acids§ (mean % of particles)
G1 28.8; A1 29.4; C1 20.4; U1 15.4; G2 14.3; A2 25.6; C2 30.9; U2 27.0; GA3 13.9; CU3 17.3; GACU3 61.4

[The column alignment of the remaining entries for the grouped amino acids is lost in this copy; the values are listed in the order in which they occur.]
Correlations* with nucleotides: 0.71, -0.12, -0.15, -0.11, 0.45, 0.49, -0.56, -0.33, 0.22, 0.27, 0.62, 0.19, -0.43, -0.25, 0.05, 0.33, -0.13, -0.31, -0.07, 0.26, -0.77, -0.03, 0.29, 0.03, -0.38, -0.55, 0.52, 0.28, -0.20, -0.42, 0.38, 0.03, -0.07, 0.33, 0.37, 0.43, -0.50, 0.06, 0.30, 0.45, -0.08, -0.24, 0.24, -0.25
Mean nucleotide composition† of codons, measure of agreement‡ and relative value: 0.50, 0.17, 0.17, 0.15, 0.51, 0.17, 0.17, 0.17, 0.19, 0.07, 0.17, 0.50, 0.17, 0.10, 0.16, 0.50, 0.17, 0.17, 0.36, 0.33, 0.17, 0.17, 0.50, 0.21, 0.18, 0.19, 0.50, 0.17, 0.19, 0.23, 0.17, 0.17, 0.17, 0.54, 0.16, 0.14, 0.17, 0.50, 0.25, 0.37, 0.46, 0.88, 0.85, 0.97, 0.54, 0.12, 0.92, 0.30, -0.14, 0.22, 0.93, 0.95, 0.66, 0.60, 0.48, 0.84, 1.00, 0.87, 0.22, 0.45, 0.30, 0.16, 0.32, 0.22, 0.83, 0.89

* Correlation coefficients; levels of significance P 0.01 = 0.39; P 0.001 = 0.49.
† Average codon composition calculated from Sadgopal (1968).
‡ For explanation, see text.
§ Amino acids grouped according to bacterial genetic code by the first, second and third base in their codons.
Group G1 contains ala, asx/2, glx/2, gly, val. A1: arg/3, asx/2, ile, lys, met, ser/3, thr. C1: arg 2/3, glx/2, his, pro,
leu 2/3. U1: leu/3, phe, ser 2/3, tyr. G2: arg, gly, ser/3. A2: asx, glx, his, lys, tyr. C2: ala, pro, ser 2/3, thr. U2: ile, leu,
met, phe, val. GA3 contains those amino acids whose codons end in either G or A, that is glx, lys, and met. CU3
contains those whose codons end in either C or U, that is asx, his, phe, and tyr. GACU3 contains those whose
codons end in either G, A, C or U and include all the remaining nine amino acids.
RESULTS
Correlations
Correlation coefficients were calculated between the amounts of each of the four
nucleotides and sixteen amino acids in the particles of the 4I viruses, and also between
each of the nucleotides and the amino acids grouped in various ways. The amino
acids were grouped according to which base occurred first in the codons assigned to
the amino acids in the bacterial genetic code, which occurred second, and which third.
Most of the correlations between the nucleotide and amino acid composition of
these virus particles (Table 2) are closely related to the codon assignments in the
bacterial genetic code. For example, arginine is closely positively correlated with
guanine, and negatively or less correlated with the other nucleotides. This agrees well
with the bacterial genetic code, in which the arginine codons have an average composition of G 0.44:A 0.22:C 0.28:U 0.06. The correlations for nine of the 16 amino acids
agree with the bacterial code, six closely. With the grouped amino acids there is good
agreement between the bacterial code and the correlation coefficients for those
grouped according to the first nucleotide in the codon, but less agreement with those
grouped according to the second and third nucleotides in the codon.
To show the extent of the agreement more clearly, a measure of agreement (correlation coefficient) was calculated for each amino acid (or group of amino acids) between
the average nucleotide composition of the corresponding bacterial codons on the one
hand, and the correlation coefficients between the amino acid and nucleotide composition of the plant viruses on the other. These measures of agreement (Table 2, penultimate column) were positive for all except four of the ungrouped amino acids and one
of the grouped amino acids.
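The measure of agreement defined above is an ordinary correlation coefficient over four pairs of values, one pair per nucleotide. A minimal sketch, assuming the arginine, isoleucine and proline entries as read from Table 2; the function names `pearson` and `measure_of_agreement` are hypothetical and chosen here for illustration:

```python
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def measure_of_agreement(virus_correlations, codon_composition):
    """Correlate an amino acid's correlations with G, A, C, U in the viruses
    against the mean G, A, C, U composition of its codons."""
    return pearson(virus_correlations, codon_composition)


# Values in the order G, A, C, U, as read from Table 2.
arg = measure_of_agreement((0.67, 0.27, -0.57, 0.30), (0.44, 0.22, 0.28, 0.06))
ile = measure_of_agreement((-0.70, -0.29, 0.60, -0.34), (0.00, 0.44, 0.11, 0.44))
pro = measure_of_agreement((-0.79, -0.61, 0.87, -0.57), (0.08, 0.08, 0.75, 0.08))
print(round(arg, 2), round(ile, 2), round(pro, 2))
```

With only four points per amino acid, the coefficient is a descriptive index rather than a significance test, which is presumably why the authors go on to weight it by a relative value.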
Obviously the possibility of getting any agreement will be greater with those amino
acids that form the bulk of the virus protein than with those that are present in small
amounts, and will also be greater with those amino acids whose bacterial codons have
greatly different amounts of the four nucleotides than with those that have similar
amounts of the four nucleotides. Therefore we calculated the relative value (Table 2,
last column) that can be attached to each measure of agreement; the relative value is
arbitrarily defined for each amino acid (or group of amino acids) as the standard
deviation of the mean amounts of individual nucleotides in the codons for that amino
acid multiplied by the mean percentage of that amino acid in the virus particles, and
is expressed relative to the 'value' for proline (to C2 for the grouped amino acids).
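The relative value can be sketched directly from this definition. The example below is an illustration under stated assumptions: it takes the rounded codon compositions and mean percentages as read from Table 2, and uses the sample standard deviation from Python's `statistics` module (the choice between sample and population standard deviation cancels in the ratio to proline). The names `weight` and `relative_value` are hypothetical.

```python
from statistics import stdev

# (mean % of particles, mean codon composition G, A, C, U), as read from Table 2.
TABLE_2 = {
    "pro": (6.3, (0.08, 0.08, 0.75, 0.08)),
    "ser": (9.0, (0.17, 0.17, 0.33, 0.33)),
    "val": (7.7, (0.42, 0.08, 0.08, 0.42)),
    "his": (1.2, (0.00, 0.33, 0.50, 0.17)),
}


def weight(mean_percent, codon_composition):
    """Spread of the codon composition multiplied by the abundance of the amino acid."""
    return stdev(codon_composition) * mean_percent


def relative_value(amino_acid, reference="pro"):
    """Weight of an amino acid expressed relative to that of the reference (proline)."""
    return weight(*TABLE_2[amino_acid]) / weight(*TABLE_2[reference])


for name in TABLE_2:
    print(name, round(relative_value(name), 2))
```

An abundant amino acid with a strongly skewed codon composition (proline) therefore carries much more weight than a scarce one (histidine) or one whose codons use the four nucleotides more evenly (serine).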
Five negative measures of agreement were obtained; these were for alanine, isoleucine and three other amino acids with such small relative values, that they may be
disregarded (Table 2). There is no obvious explanation for the result obtained with
alanine, but that for isoleucine may reflect the fact that there are correlations between
the amounts of the four nucleotides (Table 3) and between the amounts of the different amino acids (Table 4) in the virus particles. The isoleucine content of the viruses
is correlated with their cytosine content, yet it would be expected from the bacterial
code to be correlated with their adenine and uracil content (Table 2). Isoleucine,
which constitutes on average 5.2 % of the virus proteins, is positively correlated with
both proline and threonine, which together constitute 15.6 % of the proteins (Table 4),
and which are both positively correlated with cytosine (Table 2). Thus it is not surprising that isoleucine is also positively correlated with cytosine.
The correlations between the nucleotides in the nucleic acids of these viruses
(Table 3) presumably reflect the fact that there are only four nucleotides and their
amounts were expressed for these calculations as percentages. Thus cytosine, which
has the largest range (from 16.2 % to 42.1 % of the nucleic acid), is inevitably negatively correlated with the other three nucleotides. The amino acids were also expressed
as percentages, but, as there are sixteen of them, the correlations between them
(Table 4) are not just the result of a mathematical constraint, and perhaps reflect
differences between the viruses in their use of the different amino acids in their subunits, all of which are similar globular protein molecules. Amino acids may be grouped
in various ways according to the properties of their side chains (summarized by Zimmerman, Eliezer & Simha, 1968). Thus the negative correlations between methionine
and phenylalanine are perhaps because these amino acids are of similar size and
polarity and hence serve similar functions in the protein. Other correlations may
reflect the ionization properties of the amino acids rather than their size; thus the
basic amino acids arginine and histidine are negatively correlated, whereas arginine
is positively correlated with glutamic acid (and amide).
Table 3. Correlations* between the amounts† of different nucleotides in
the nucleic acids of 41 plant viruses

   G
  0.34     A
 -0.79   -0.73     C
  0.40    0.30   -0.73     U

* Correlation coefficients; significance levels as in Table 2.
† Amounts expressed as moles per cent.
Table 4. Correlations* between the amounts† of different amino acids in
the particle proteins of 41 plant viruses
ala
arg
0"41
asp glu
• --o"41 -0'44
-0'65
o'43 --o'4I
.
0"48
-0"42
gly
• -o"47
his
.
fie
o'65
0"40
o'44
leu
-o'4z
lys
-
0"50
- 0-46
0"44
o-52
o'44
-o'4~ -0"55
met - 0"44
phe
pro
--o'41
ser
tier
tyr
val
* Correlation coefficients; significance levels as in Table 2. For simplicity only statistically significant correlations are listed.
† Moles %.
Predicted compositions
The amino acid composition of each virus was predicted indirectly from its observed
nucleotide composition ($G_{obs}$, $A_{obs}$, $C_{obs}$, $U_{obs}$), using the average bacterial codon
composition ($G_{cdn}$, $A_{cdn}$, $C_{cdn}$, $U_{cdn}$) (Table 2), to calculate for each amino acid in
turn, the sum
$$G_{obs} \cdot G_{cdn} + A_{obs} \cdot A_{cdn} + C_{obs} \cdot C_{cdn} + U_{obs} \cdot U_{cdn}.$$
The nucleotide composition of each virus was predicted from its amino acid composition in a similar way
$$G_{pred} = \sum X \cdot G_{cdn}; \quad A_{pred} = \sum X \cdot A_{cdn}; \quad \text{etc.},$$
where X is the amount of each amino acid (ala ... val) in the virus protein.
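Both predictions are weighted sums, and can be sketched as follows. This is a minimal illustration only: the three codon compositions are those read from Table 2, the nucleotide composition of the example virus is hypothetical (it is not one of the viruses in Table 1), and the function names are introduced here for illustration.

```python
BASES = ("G", "A", "C", "U")

# Mean codon composition (G, A, C, U) for three amino acids, as read from Table 2.
CODON_COMPOSITION = {
    "arg": (0.44, 0.22, 0.28, 0.06),
    "pro": (0.08, 0.08, 0.75, 0.08),
    "phe": (0.00, 0.00, 0.17, 0.83),
}


def amino_acid_score(nucleotides_obs, codon_composition):
    """G_obs*G_cdn + A_obs*A_cdn + C_obs*C_cdn + U_obs*U_cdn for one amino acid."""
    return sum(nucleotides_obs[base] * fraction
               for base, fraction in zip(BASES, codon_composition))


def predict_nucleotides(amino_acids, codon_table):
    """G_pred = sum of X*G_cdn over the amino acids, and likewise for A, C and U."""
    predicted = dict.fromkeys(BASES, 0.0)
    for name, amount in amino_acids.items():
        for base, fraction in zip(BASES, codon_table[name]):
            predicted[base] += amount * fraction
    return predicted


# Hypothetical nucleotide composition (moles per cent), for illustration only.
virus_nucleotides = {"G": 20.0, "A": 25.0, "C": 35.0, "U": 20.0}
scores = {name: amino_acid_score(virus_nucleotides, composition)
          for name, composition in CODON_COMPOSITION.items()}
print(scores)

# A protein made only of proline would be predicted to need a C-rich gene.
print(predict_nucleotides({"pro": 100.0}, CODON_COMPOSITION))
```

The score for an amino acid is not itself a predicted percentage; it is a quantity that should rise and fall across viruses with the observed amount of that amino acid if the code is correct, which is what the correlations in Table 5 test.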
Correlation coefficients calculated to compare the predicted and observed nucleotide
and amino acid compositions (Table 5) show that all four nucleotides and seven of
the 16 amino acids were successfully predicted. It is surprising that the predicted
nucleotide composition of the genes for the protein in the particles is close to the
observed composition of the whole nucleic acids and suggests that the nucleic acid
molecules of these viruses are relatively homogeneous in composition, and in this
respect differ from those of the larger bacteriophages (Hogness & Simmons, 1964).
Table 5. Correlations* between observed and predicted† (a) nucleotide composition and
(b) amino acid composition of the particles of 41 plant viruses using full bacterial code‡

(a)
Guanine    0.71
Adenine    0.31
Cytosine   0.74
Uracil     0.33

(b)
ala   0.19      lys   0.03
arg   0.22      met   0.22
asx   0.44      phe   0.47
glx   0.42      pro   0.87
gly   0.28      ser   0.22
his   0.47      thr   0.44
ile   0.24      tyr   0.01
leu   0.61      val   0.18

* Correlation coefficients; levels of significance P 0.05 = 0.30; P 0.01 = 0.39; P 0.001 = 0.49.
† Method for predicting composition using bacterial genetic code described in text.
‡ All codons in the bacterial code were used to calculate the average base composition of the
codons of each amino acid.
The predicted average guanine and cytosine contents of the viruses are slightly
greater than the observed amounts, and the adenine and uracil contents less; the
predicted average base ratio was G 24.4:A 26.2:C 25.8:U 23.4, whereas the observed
ratio was G 23.4:A 26.3:C 24.4:U 25.8. The regression coefficients for the relation
between the predicted and observed amounts of the bases also varied; G 1.17:
A 0.79:C 3.40:U 0.67. These results suggest that if all possible codons are used by
plants equally frequently, then the gene for the protein in the particle of these viruses
has a higher guanine and cytosine content than other parts of their nucleic acids.
Alternatively if that gene has the same composition as the whole nucleic acid molecule
of the virus, these results suggest that the whole nucleic acid uses more of the 'adenine-uracil' rich codons than the 'guanine-cytosine' rich codons for each amino acid.
The latter possibility would agree with the findings of Streisinger and his colleagues
(Streisinger et al. 1966; Inouye et al. 1967), who determined the precise composition
of six codons of bacteriophage T4. All the codons they found had adenine or uracil
as third base in the codon, suggesting that codons ending with guanine or cytosine
are rare or absent. To test whether our data would show that plant viruses have a
similarly restricted code we calculated the average nucleotide composition of the
codons of each amino acid, omitting where possible all ending with guanine or
cytosine. Then we used the 'A/U-rich' codon composition to predict, as before, the
amino acid composition of the viruses from their nucleic acid composition, and vice
versa, and calculated correlation coefficients between the predicted and observed
compositions (Table 6). The 'A/U-rich' code did not predict the nucleotide composition of the viruses as well as the complete code; the average 'guanine + cytosine'
content of the viruses when predicted by the restricted code was 34.2 % but when
predicted by the full code was 50.2 %, compared with the observed average content
of 47.8 %. However the prediction of amino acids was better with the 'A/U-rich' code
than with the full code; the correlation coefficients for nine of the amino acids increased, only four decreased.
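The restricted 'A/U-rich' code can be sketched as a filter on the third base of each codon. As before, this is an illustration rather than the authors' calculation: it assumes the standard genetic code stands in for the bacterial code tabulated by Sadgopal (1968), reads "where possible" as "keep all codons of an amino acid if none ends in adenine or uracil" (which affects only methionine here), and uses names (`au_rich`, `AU_RICH_CODE`) that are not in the paper.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the order UUU, UUC, UUA, UUG, UCU, ... ('*' = terminator).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "A": "ala", "R": "arg", "D": "asx", "N": "asx", "E": "glx", "Q": "glx",
    "G": "gly", "H": "his", "I": "ile", "L": "leu", "K": "lys", "M": "met",
    "F": "phe", "P": "pro", "S": "ser", "T": "thr", "Y": "tyr", "V": "val",
}


def codons_by_amino_acid():
    """Group codons by three-letter name (asx and glx pooled; Cys and Trp omitted)."""
    groups = {}
    for bases, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS):
        name = THREE_LETTER.get(amino_acid)
        if name is not None:
            groups.setdefault(name, []).append("".join(bases))
    return groups


def au_rich(codons):
    """Keep only codons ending in A or U; if there are none, keep them all."""
    kept = [codon for codon in codons if codon[2] in "AU"]
    return kept or codons


def mean_composition(codons):
    """Fraction of G, A, C and U over all positions of the given codons."""
    pooled = "".join(codons)
    return {base: pooled.count(base) / len(pooled) for base in "GACU"}


AU_RICH_CODE = {name: mean_composition(au_rich(codons))
                for name, codons in codons_by_amino_acid().items()}

print(au_rich(codons_by_amino_acid()["arg"]))  # ['CGU', 'CGA', 'AGA']
print({base: round(value, 2) for base, value in AU_RICH_CODE["arg"].items()})
```

Dropping the codons that end in guanine or cytosine lowers the guanine + cytosine content implied for almost every amino acid, which is consistent with the lower predicted 'guanine + cytosine' content reported for the restricted code.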
Table 6. Correlations* between observed and predicted† (a) nucleotide composition and
(b) amino acid composition of the particles of 41 plant viruses, using 'A/U-rich' code‡

(a)
Guanine    0.69
Adenine    0.29
Cytosine   0.76
Uracil     0.29

(b)
ala   0.19      lys   0.06
arg   0.47      met   0.22
asx   0.49      phe   0.60
glx   0.35      pro   0.88
gly   0.29      ser   0.33
his   0.42      thr   0.45
ile   0.39      tyr   0.25
leu   0.30      val   0.11

* Correlation coefficients; levels of significance P 0.05 = 0.30; P 0.01 = 0.39; P 0.001 = 0.49.
† Method for predicting composition using bacterial genetic code is described in text.
‡ The average base composition of the bacterial codons of each amino acid was calculated omitting,
where possible, all codons with guanine or cytosine as third nucleotide in the codon.
DISCUSSION
The results given above indirectly indicate that plants use a genetic code broadly
similar to that used by bacteria, and exclude the possibility that the genetic code used
by plants is a mirror-image of that used by bacteria. All the codons may not be used
equally frequently by plants, and it is possible that for some amino acids those with
adenine or uracil as the third base in the codon are used more frequently than those
with guanine or cytosine in that position.
It is interesting that the correlations we have found agree closely with results
obtained by Sueoka (1961) with bacteria. Sueoka analysed the amino acid composition
of various bacteria with nucleic acids of different nucleotide composition, and found
that the guanine + cytosine content of the bacteria was positively correlated with their
content of alanine, arginine, glycine and proline, negatively correlated with aspartic
acid, isoleucine, lysine, glutamic acid, phenylalanine and tyrosine, and uncorrelated
with histidine, leucine, methionine, threonine, serine and valine. These correlations,
obtained before the bacterial genetic code was elucidated, agree closely with the code,
and are similar to those we have found with plant viruses.
The correlations we have obtained suggest that our initial assumption, that the
gene for the protein of the particles of each virus has a composition similar to that
of the whole nucleic acid molecule of the virus, may be largely correct. If true, what
does this imply? It may be that most of the nucleic acid of each virus consists of the
gene for the protein in the particle. For satellite virus this is correct, for its nucleic
acid is only large enough to code for two proteins of the size of the protein in its
particles. However, all the other viruses used had much larger nucleic acid molecules
and it might imply that each nucleic acid molecule contained several genes for this
one protein. This is very unlikely, as the virus would then probably have several
slightly different proteins in its particles, and this seems to be not so. Thus each virus
nucleic acid molecule probably contains only one particle protein gene. Therefore
our initial assumption, if true, suggests that the different genes of each virus are of
similar composition. The viruses may 'use' the so-called 'degeneracy' of the code so
that genes of similar composition can code for quite different proteins, though an
alternative and more likely explanation may be that the different proteins produced
by viruses are of similar amino acid composition yet of different function. This seems
to be so with bacteria, for Sueoka's results were obtained using the amino acid composition of unfractionated mixtures of proteins from each bacterium. Perhaps the
characteristic nucleotide composition of the nucleic acid of every organism or virus
is correlated with a characteristic average amino acid composition of all its proteins.
REFERENCES
ARONSON, A. I. & BANCROFT, J. B. (1962). Density heterogeneity in purified preparations of broad
bean mottle virus. Virology 18, 570.
ATHERTON, J. G. (1968). Formation of tobacco mosaic virus in an animal cell culture. Arch. ges.
Virusforsch. 24, 406.
BANCROFT, J. B., HIEBERT, E., REES, M. W. & MARKHAM, R. (1968). Properties of cowpea chlorotic
mottle virus, its protein and nucleic acid. Virology 34, 224.
BOCKSTAHLER, L. E. & KAESBERG, P. (1965). Isolation and properties of RNA from bromegrass
mosaic virus. J. molec. Biol. 13, 127.
CLARK, J. M., CHANG, A. Y., SPIEGELMAN, S. & REICHMANN, M. E. (1965). The in vitro translation
of a monocistronic message. Proc. natn. Acad. Sci. U.S.A. 54, 1193.
DAYHOFF, M. O. & ECK, R. V. (1968). Atlas of protein sequence and structure. 1967-68. National
Biomedical Research Foundation, Maryland.
DE FREMERY, D. & KNIGHT, C. A. (1955). A chemical comparison of three strains of tomato bushy
stunt virus. J. biol. Chem. 214, 559.
DORNER, R. W. & KNIGHT, C. A. (1953). The preparation and properties of some plant virus nucleic
acids. J. biol. Chem. 205, 959.
FRAENKEL-CONRAT, H. (1968). Molecular Basis of Virology, p. 134. Ed. by H. Fraenkel-Conrat. New
York: Reinhold Book Corp.
FRANCKI, R. I. B., RANDLES, J. W., CHAMBERS, T. C. & WILSON, S. B. (1966). Some properties of
purified cucumber mosaic virus (Q strain). Virology 28, 729.
FRY, P. R., GROGAN, R. G. & LYTTLETON, J. W. (1960). Physical and chemical properties of clover
mosaic virus. Phytopathology 50, 175.
GHABRIAL, S. A., SHEPHERD, R. J. & GROGAN, R. G. (1967). Chemical properties of three strains of
southern bean mosaic virus. Virology 33, 17.
GIBBS, A. J. (1969). Plant virus classification. Adv. Virus Res. 14, 263.
GIBBS, A. J., GIUSSANI-BELLI, G. & SMITH, H. G. (1968). Broad-bean stain and true broad-bean
mosaic viruses. Ann. appl. Biol. 61, 99.
HOGNESS, D. S. & SIMMONS, J. R. (1964). Breakage of λdg DNA: Chemical and genetic characterization of each isolated half molecule. J. molec. Biol. 9, 411.
INOUYE, M., AKABOSHI, E., TSUGITA, A., STREISINGER, G. & OKADA, Y. (1967). A frame shift mutation
resulting in the deletion of two base pairs in the lysozyme gene of bacteriophage T4. J. molec.
Biol. 30, 39.
KADO, C. I. (1967). Biological and biochemical characterization of sowbane mosaic virus. Virology
31, 217.
KALMAKOFF, J. & TREMAINE, J. H. (1967). Some physical and chemical properties of carnation ringspot virus. Virology 33, 10.
KAPER, J. M., DIENER, T. O. & SCOTT, H. A. (1965). Some physical and chemical properties of
cucumber mosaic virus (strain Y) and of its isolated ribonucleic acid. Virology 27, 54.
KELLEY, J. J. & KAESBERG, P. (1962). Biophysical and biochemical properties of top component and
bottom component of alfalfa mosaic virus. Biochim. biophys. Acta 61, 865.
KNIGHT, C. A. (I952). The nucleic acids of some strains of tobacco mosaic virus. J. biol. Chem. 197,
241.
MARKHAM,R. & SMITH, J. D. (t95o). Chromatographic studies on nucleic acids. Biochem. J. 46, 513.
MARTYN,B. (I968). Plant virus names. PhytopathologicalPapers (No. 9). Commonwealth Mycological
Inst., Kew, England.
MAZZONE, H. M., INCARDONA,N. L. & KAESBER6,P. (1962). Biochemical and biophysical properties
of squash mosaic virus and related macromolecules. Biochim. biophys. Acta 55, 164.
MIK1, T. & KN16HT, C. A. (1967). Some chemical studies on a strain of white clover mosaic virus.
Virology 31, 55.
MIKI, T. & KNIGHT, C. A. (I968). The protein subunit of potato virus X. Virology 36, 168.
NIRENBERG, M. W. & MATTHAEI, J. H. (1961). The dependence of cell-free protein synthesis in E. coli
upon the naturally occurring poly-ribonucleotides. Proc. natn. Acad. Sci. U.S.A. 47, 1588.
OFFORD, R. E. & HARRIS, J. I. (1965). The protein subunit of tobacco rattle virus. Proc. Federation
European Biochem. Soc. 2, 216.
RANDLES, J. W. & FRANCKI, R. I. B. (1965). Some properties of a tobacco ringspot virus isolate from
South Australia. Aust. J. biol. Sci. 18, 979.
RAUWS, A. G., JASPERS, E. M. J. & VELDSTRA, H. (1964). The base composition of ribonucleic acids
from alfalfa mosaic virus components. Virology 23, 283.
REES, M. W. & SHORT, M. N. (1965). Variations in the composition of two strains of tobacco mosaic
virus in relation to their host. Virology 26, 596.
REICHMANN, M. E. (1964). The satellite tobacco necrosis virus; a single protein and its genetic code.
Proc. natn. Acad. Sci. U.S.A. 52, 1009.
SADGOPAL, A. (1968). The genetic code after the excitement. Adv. Genet. 14, 325.
SANDER, E. (1964). Evidence on the synthesis of a DNA phage in leaves of tobacco plants. Virology
24, 545.
SCHWARTZ, J. H., EISENSTADT, J. M., BRAWERMAN, G. & ZINDER, N. D. (1965). Biosynthesis of the
coat protein of coliphage f2 by extracts of Euglena gracilis. Proc. natn. Acad. Sci. U.S.A.
53, 195.
SEMANCIK, J. S. (1966). Studies on electrophoretic heterogeneity in isometric plant viruses. Virology
30, 698.
SEMANCIK, J. S. & BANCROFT, J. B. (1964). Further characterization of the nucleoprotein components of bean pod mottle virus. Virology 22, 33.
SEMANCIK, J. S. & KAJIYAMA, M. R. (1967). Properties and relationships among RNA species from
tobacco rattle virus. Virology 33, 523.
SHEPHERD, R. J., WAKEMAN, R. M. & GHABRIAL, S. A. (1968). Preparation and properties of the
protein and nucleic acid components of pea enation mosaic virus. Virology 35, 255.
STACE-SMITH, R., REICHMANN, M. E. & WRIGHT, N. S. (1965). Purification and properties of tobacco
ringspot virus and two RNA-deficient components. Virology 25, 487.
STREISINGER, G., OKADA, Y., EMRICH, J., NEWTON, J., TSUGITA, A., TERZAGHI, E. & INOUYE, M.
(1966). Frame shift mutations and genetic code. Cold Spring Harb. Symp. quant. Biol. 31, 77.
STUBBS, J. D. & KAESBERG, P. (1964). A protein subunit of bromegrass mosaic virus. J. molec. Biol.
8, 314.
SUEOKA, N. (1961). Composition correlation between deoxy-ribonucleic acid and protein. Cold Spring
Harb. Symp. quant. Biol. 26, 35.
SYMONS, R. H., REES, M. W., SHORT, M. N. & MARKHAM, R. (1963). Relationships between the
ribonucleic acid and protein of some plant viruses. J. molec. Biol. 6, 1.
TREMAINE, J. H. & STACE-SMITH, R. (1968). Chemical composition and biophysical properties of
tomato ringspot virus. Virology 35, 102.
TSUGITA, A. (1962). The proteins of mutants of TMV: classification of spontaneous and chemically
evoked strains. J. molec. Biol. 5, 293.
VAN RAVENSWAAY-CLAASEN, J. J., VAN LEEUWEN, A. B. J., DUIJTS, G. A. H. & BOSCH, L. (1967).
In vitro translation of alfalfa mosaic virus RNA. J. molec. Biol. 23, 535.
VAN REGENMORTEL, M. H. V. (1967a). Biochemical and biophysical properties of cucumber mosaic
virus. Virology 31, 391.
VAN REGENMORTEL, M. H. V. (1967b). Serological studies on naturally occurring strains and chemically induced mutants of tobacco mosaic virus. Virology 31, 467.
WANG, A. L. & KNIGHT, C. A. (1967). Analysis of protein components of tomato strains of tobacco
mosaic virus. Virology 31, 101.
WITTMANN, H. G. & PAUL, H. L. (1961). Vergleich der Aminosäuren-zusammensetzung der Proteine
des Echten Ackerbohnenmosaik-virus, des Broad Bean Mottle Virus und des Tabakmosaik
Virus. Phytopath. Z. 41, 74.
YAMAZAKI, H. & KAESBERG, P. (1963). Isolation and characterization of a protein subunit of broad
bean mottle virus. J. molec. Biol. 6, 465.
ZIMMERMAN, J. M., ELIEZER, N. & SIMHA, R. (1968). The characterization of amino acid sequences
in proteins by statistical methods. J. theor. Biol. 21, 170.
(Received 10 March 1969)