Bioscience Reports 3, 225-232 (1983)
Printed in Great Britain
Amino acid sequence restriction in relation to proteolysis
Hans JÖRNVALL and Bengt PERSSON
Department of Chemistry I, Karolinska Institutet,
S-104 01 Stockholm 60, Sweden
(Received 28 January 1983)
Distributions of amino acid residues in proteins show that proline is
overrepresented in sequence positions following two basic residues
({Lys/Arg}-{Lys/Arg}), i.e. at sites similar to those susceptible to
proteolytic cleavages of hormonal pro-forms. Conformational
correlations further show that {Lys/Arg}-{Lys/Arg}-Pro sequences are
often (8/11) not adjacent to elements of secondary structure, whereas
the opposite applies to {Lys/Arg}-{Lys/Arg}-nonPro sequences (82/103
adjacent to elements of secondary structure). These distribution
patterns from proteins in general also seem applicable in individual
protein groups, as demonstrated for some dehydrogenases. It appears
possible that {Lys/Arg}-{Lys/Arg}-nonPro constitutes a restricted
sequence in proteins, and that proline, in addition to elements of
secondary structure, contributes a means of avoiding unacceptable
proteolytic processings of proteins in general.
Apart from containing the information for conformation and function,
the primary structures of proteins have short sequences serving as
'signals' for special properties. These signals are often sites for
modifications and are common between different proteins. Examples
are: polypeptide glycosylations (one type affecting Asn in
Asn-X-Thr/Ser (1)); phosphorylations (e.g. on Ser in different
structures (2,3)); hormonal pro-form cleavages (often, but apparently
not invariably (4), after dibasic structures like Lys-Arg-X (5,6));
cleavages (after small residues (7)) of signal peptides in pre-forms
of secreted proteins; or N-terminal acetylations (of small residues in
special structures (8)). However, all these sequence types are also
common elsewhere in proteins, and for correct modifications,
additional factors must therefore restrict the sequence-directed
specificities.
One such restriction factor is caused by conformation. For example,
sequences which actually get glycosylated differ from those otherwise
identical that do not, by being in different secondary structures
(often in reverse turns (9,10)). Similarly, non-cleaved bonds in
pro-form sequences appear stabilized in secondary structure, whereas
otherwise identical but cleaved bonds are in regions without
secondary structure (11). Furthermore, protease cleavages are often
limited to domain borders or other accessible parts (cf. 12-14).
© 1983 The Biochemical Society
Another restriction might simply be selection against potentially
sensitive sequences at places where modification should not be
directed. Thus, the Asn-X-Ser/Thr glycosylation signal is
underrepresented in proteins in general (15), possibly indicating a
tendency to avoid unfavourable glycosylation, and therefore reflecting
a 'restricted sequence' in proteins.
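As an illustration of how such an under-representation could be examined, the following is a minimal sketch and not the procedure of reference 15. It counts Asn-X-Ser/Thr tripeptides in a one-letter-code sequence and compares the count with the number expected if residues were placed independently at the sequence's own composition. The function names and the example sequence are hypothetical, and X is taken as any residue, as in the signal written above.

```python
import re

def count_sequons(seq):
    """Count Asn-X-Ser/Thr tripeptides, including overlapping ones."""
    # A lookahead is used so that overlapping matches are all counted.
    return len(re.findall(r"(?=N[A-Z][ST])", seq.upper()))

def expected_sequons(seq):
    """Expected count if residues were independent (assumed null model)."""
    seq = seq.upper()
    n = len(seq)
    if n < 3:
        return 0.0
    f_asn = seq.count("N") / n
    f_ser_thr = (seq.count("S") + seq.count("T")) / n
    # There are n - 2 tripeptide windows in a sequence of length n.
    return (n - 2) * f_asn * f_ser_thr

example = "ANGTNASNNS"  # hypothetical sequence, for illustration only
print(count_sequons(example), expected_sequons(example))
```

An observed count clearly below the expected one, summed over many proteins, would correspond to the kind of under-representation described in the text.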
It appears possible that one further type of restricted sequence
could apply to proteolytic signals, where sequence signals can be
limited by the nature of residues subsequent to potentially sensitive
cleavage sites.
Peptide bonds involving proline or with proline in an
adjacent position reduce sensitivities to many proteases, and proline
may protect some structural proteins against degradation (16).
In the present study, the possible importance of proline in
proteolysis was tested by comparisons of all structures of the type
occurring in pro-form cleavages, i.e. at the arrow in
{Lys/Arg}-{Lys/Arg}↓X. Evidence was obtained that Pro, together
with elements of secondary structure, is important to protect
potentially sensitive sites. NonPro bonds at such sites appear to
form a type of restricted sequence in proteins, indicating a general
sequence restriction in relation to proteolysis.
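The comparison described above starts from locating every dibasic site in a sequence and noting the residue that follows it. A minimal sketch of that step is given here; the function name and the example sequence are hypothetical, and the sketch makes no claim about how the original tabulation was carried out.

```python
BASIC = "KR"  # one-letter codes for Lys and Arg

def dibasic_sites(seq):
    """Return (position, tripeptide, followed_by_pro) for each
    {Lys/Arg}-{Lys/Arg}-X site; positions are 1-based and refer to
    the first basic residue."""
    seq = seq.upper()
    sites = []
    for i in range(len(seq) - 2):
        if seq[i] in BASIC and seq[i + 1] in BASIC:
            sites.append((i + 1, seq[i:i + 3], seq[i + 2] == "P"))
    return sites

example = "MAKKPLVRRSGKRP"  # hypothetical sequence, for illustration only
for position, tripeptide, has_pro in dibasic_sites(example):
    print(position, tripeptide, "Pro" if has_pro else "nonPro")
```

Each site found in this way can then be matched against the known conformation of the protein to decide whether it lies in or near an element of secondary structure.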
Table 1. Occurrence of proline after dibasic sequences in proteins
versus occurrence of proline in general at any position
Proline residues after the dibasic structures were taken from
reference 18, proline in general from reference 17. In both cases,
basic proteins disturb the usefulness of the values for general
interpretations, since such proteins frequently have many basic
residues together. For example, without correction for this, Arg-Arg
is followed by one further Arg in 36% of known sequences (18).
Therefore, occurrences below are recalculated to show the
distribution of proline after subtraction of lysine and arginine from
the calculations.

Structure                  Occurrence of Pro in this position in
                           proteins in general (after subtraction
                           of Lys, Arg; see above) (%)
Residue after Lys-Arg      10.1
Residue after Lys-Lys       6.8
Residue after Arg-Arg       8.8
Residue after Arg-Lys       3.3
Residue at any position     5.9
Materials and Methods
Most amino acid sequences and conformations were taken from
references 17-19, and the remaining ones from references 20-24, as
indicated.
Results
Proline occurrence after dibasic structures in general
The occurrence in proteins of proline after dibasic structures (18)
versus that at any position (17) is shown in Table 1. The values
obtained suggest that proline is over-represented after three of these
four dibasic sequences in proteins in general (especially after
Lys-Arg). Since Lys-Arg is a signal for the proteases cleaving
pro-form structures, proline may serve a protective function towards
unfavourable proteolysis. This appears common enough to be visible
already on distributions in general.
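The kind of recalculation behind Table 1 can be sketched as follows. This is an assumed reading of "subtraction of lysine and arginine", namely that Lys and Arg are left out of the denominator when the percentage of Pro after a dibasic pair is computed; the function names are hypothetical. The second part only divides the values transcribed from Table 1 by the any-position value.

```python
from collections import Counter

def pro_percent_after(seqs, dibasic):
    """Percentage of Pro among residues following `dibasic` (e.g. "KR"),
    with Lys and Arg excluded from the denominator (assumed correction)."""
    counts = Counter()
    for seq in seqs:
        for i in range(len(seq) - 2):
            if seq[i:i + 2] == dibasic:
                counts[seq[i + 2]] += 1
    non_basic = sum(n for aa, n in counts.items() if aa not in "KR")
    return 100.0 * counts["P"] / non_basic if non_basic else None

# Values transcribed from Table 1 (% Pro after each dibasic structure).
TABLE1 = {"Lys-Arg": 10.1, "Lys-Lys": 6.8, "Arg-Arg": 8.8, "Arg-Lys": 3.3}
ANY_POSITION = 5.9  # % Pro at any position, from Table 1

def enrichment(table=TABLE1, background=ANY_POSITION):
    """Ratio of Pro occurrence after each dibasic pair to the background."""
    return {name: round(value / background, 2) for name, value in table.items()}

print(enrichment())
```

With the Table 1 values the ratios are about 1.7 (Lys-Arg), 1.2 (Lys-Lys), 1.5 (Arg-Arg) and 0.6 (Arg-Lys), in line with the over-representation after three of the four dibasic sequences.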
Proline occurrence related to known conformations
Dibasic structures with known conformations, both with and without
a subsequent Pro, are listed in Tables 2 and 3, respectively. Many of
the sequences are surface-positioned, which is natural since a dibasic
structure can generally not be accepted internally in a stable protein
conformation.

Table 2. Properties of sequences of the type {Lys/Arg}-{Lys/Arg}-Pro
in proteins of known conformation
Sequence and structure data are from references 17-19 except for
crystallin, which is from reference 20. Numbers list positions of
the first residue shown (in one-letter codes). Secondary
structures are listed as α, β, or neither, and indicate the type
of structure known to cover the sequence listed. In the case of
cytochrome c, the listing is α although covering a proline residue,
since the structure determined is from a homologue lacking the
proline. The two listed β-structures are also atypical, being at
ends of H-bondings. Surface (+) or non-surface (-) positions are
given for those entries that lack α or β structures in the regions
listed.
Columns: Protein; Sequence; Secondary structure (α, β); Neither
(Surface, Nonsurface)
Proteins: Alcohol dehydrogenase; Cytochrome c; Immunoglobulin,
γ1-chain; Immunoglobulin, %-chain; Phospholipase A2; Prothrombin;
Elastase; Triosephosphate isomerase; ~-crystallin, B-chain
Position:  18   247  94   90   12   55   16   429  218  2    11
Sequence:  KKP  KKP  KKP  KKP  KKP  KRP  KRP  KRP  RKP  RKP  RRP
Secondary structure entries: (α), (β), (β)
[The row-by-row assignment of entries to proteins is not recoverable
from the source layout.]

[Table 3: dibasic structures of known conformation without a
subsequent Pro; table body not recoverable.]
Of more interest, however, is the fact that the majority of the
{Lys/Arg}-{Lys/Arg}-Pro sequences (Table 2) are in regions lacking
secondary structure. Thus, of a total of 11 {Lys/Arg}-{Lys/Arg}-Pro in
structures of known conformations, 8 are in regions without close
association to elements of secondary structure (Table 4). In contrast,
for {Lys/Arg}-{Lys/Arg}-X (X ≠ Pro) sequences, the opposite applies:
many are close to elements of secondary structure (82 of 103, Table 4).
Naturally, frequent borderline cases exist, and several structures are
difficult to judge. Nevertheless, the shifts in general occurrence of
nonPro residues, as shown in Table 4, are substantial. It is therefore
possible that increased stabilization of proteins against cleavages after
dibasic structures can be obtained not only by conformation, but also
by a subsequent proline residue. The proline contribution appears both
common and general enough to be visible in whole protein properties
(Tables 1-4).
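The size of the shift summarized in Table 4 can be checked with a few lines of arithmetic. The sketch below transcribes the Table 4 counts, computes the fraction of each sequence type lying in or close to α or β structure, and, as an addition that is not part of the original analysis, applies a one-sided Fisher exact test to the 2x2 table obtained by merging the two "neither" columns. The function names are hypothetical, and the borderline cases mentioned above mean that any such probability should be read with caution.

```python
from math import comb

# Counts transcribed from Table 4:
# (alpha or beta, neither/surface, neither/non-surface)
TABLE4 = {
    "nonPro": (82, 16, 5),
    "Pro": (3, 8, 0),
}

def fraction_structured(row):
    """Fraction of sequences in or close to alpha or beta structure."""
    return row[0] / sum(row)

def fisher_one_sided(table):
    """Probability, with row and column totals fixed, of finding as few
    or fewer Pro-type sequences near secondary structure as observed."""
    a, b = table["nonPro"][0], sum(table["nonPro"][1:])
    c, d = table["Pro"][0], sum(table["Pro"][1:])
    n_total = a + b + c + d
    n_structured = a + c
    n_pro = c + d
    # Hypergeometric tail: sum over 0..c structured cases in the Pro row.
    tail = sum(
        comb(n_structured, k) * comb(n_total - n_structured, n_pro - k)
        for k in range(c + 1)
    )
    return tail / comb(n_total, n_pro)

for name, row in TABLE4.items():
    print(name, round(fraction_structured(row), 3))
print("one-sided p:", fisher_one_sided(TABLE4))
```

The two fractions reproduce the 82/103 and 3/11 proportions quoted in the text; the test is only one possible way of expressing how unlikely such a difference would be by chance.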
Dehydrogenases - a single protein family
In order to test the results from average distributions of different
proteins, dehydrogenases with known conformations were similarly
analysed, as an example of single protein families. As shown in Table
5, dibasic structures not protected by a subsequent proline residue are
preferentially found in dehydrogenase regions of ordered structure.
Conclusions from proteins in general are therefore noticeable even in
individual protein families, and restricted sequences may complement
conformations in providing stable proteins.
Table 4. Summary of correlations between sequences and conformations
Values denote the number of sequences of each type (counting +/- as -)
in Tables 2 and 3. Some of the surface/non-surface distinctions are
questionable.

                          Conformation
                                     Neither
Sequence                  α or β     Surface (+)   Non-surface (-)
{K/R}-{K/R}-X             82         16            5
{K/R}-{K/R}-P              3          8            0
Table 5. Conformations around Lys-Lys sequences in dehydrogenases
with known tertiary structures
Data from references 21-24. As shown, two Lys-Lys-Pro structures
are accessible without protection from elements of secondary
structure, whereas four structures not protected by Pro are inside
long α-helices, close to elements of secondary structure, or
partly shielded.

Alcohol dehydrogenase
  Lys-Lys-Pro (positions 18-20)     irregular structure; superficial
  Lys-Lys-Pro (positions 247-249)   irregular structure; between end
                                    of β and start of α
  Lys-Lys-Phe (positions 338-340)   after α; partly shielded
                                    in inter-domain cleft
Lactate dehydrogenase
  Lys-Lys-Ser (positions 317-319)   inside α-helix (covering 309-324)
Glyceraldehyde-3-phosphate dehydrogenase
  Lys-Lys-Val (positions 114-116)   at start of β (covering 115-118
                                    (23) or 115-120 (24))
  Lys-Lys-Val (positions 256-258)   inside α-helix (covering 251-265)
Discussion
'Restricted' sequence
The data in Tables 1-5 suggest a role of proline residues in
regulation of proteolysis. Proline is over-represented at potentially
sensitive sites in regions not stabilized by secondary structure.
Consequently, dibasic structures not stabilized by either secondary
structure or subsequent proline appear to form a type of 'restricted
sequence' in proteins not destined for proteolysis. Thus, both sequence
signals and sequence restrictions seem to apply to proteolysis, in a
similar mode as earlier demonstrated for glycosylation. The pro-form
cleavage may of course also be regulated by additional factors, and
sequence restrictions may also apply to further types of proteolysis
(especially, perhaps, to the similar trypsin-type of specificity).
In conclusion, restricted sequences appear applicable to two protein
modifications, cleavages (this work) and glycosylations (15), and may
indicate a general principle of protein regulation. Independent of
possible extensions to other modification signals or of additions of
other proteolysis signals, the present data suggest the presence of a
new type of restricted sequence, demonstrating the importance of
protection against proteolysis.
Acknowledgements
Structural studies facilitating this work were supported by grants
from the Swedish Medical Research Council (project 13X-3532), the
Swedish Cancer Society (project 1806), and the Magn. Bergvall's
Foundation.
References
1. Neuberger A, Gottschalk A, Marshall RD & Spiro RG (1972) in
'Glycoproteins' (Gottschalk A, ed), 2nd ed, pp 464-470,
Elsevier, Amsterdam.
2. Feramisco JR, Glass DB & Krebs EG (1980) J. Biol. Chem. 255,
4240-4245.
3. Mercier J-C, Grosclaude F & Ribadeau-Dumas B (1971) Eur. J.
Biochem. 23, 41-51.
4. Ekman R, Håkanson R & Jörnvall H (1981) FEBS Lett. 132,
265-268.
5. Steiner DF (1976) in 'Peptide Hormones' (Parsons JA, ed),
University Park Press, Baltimore, pp 49-64.
6. Pradayrol L, Jörnvall H, Mutt V & Ribet A (1980) FEBS Lett.
109, 55-58.
7. Austen BM (1979) FEBS Lett. 103, 308-313.
8. Jörnvall H (1975) J. Theor. Biol. 55, 1-12.
9. Aubert J-P, Biserte G & Loucheux-Lefebvre M-H (1976) Arch.
Biochem. Biophys. 175, 410-418.
10. Beeley JG (1977) Biochem. Biophys. Res. Commun. 76,
1051-1055.
11. Geisow MJ & Smyth DG (1980) in 'The Enzymology of Post-Translational Modification of Proteins', vol 1 (Freedman RB &
Hawkins HC, eds), Academic Press, London, pp 259-287.
12. Porter RR (1959) Biochem. J. 73, 119-126.
13. Walsh KA, Ericsson LH, Parmelee DC & Titani K (1981) Ann. Rev.
Biochem. 50, 261-284.
14. Jörnvall H & Philipson L (1980) Eur. J. Biochem. 104, 237-247.
15. Hunt LT & Dayhoff MO (1970) Biochem. Biophys. Res. Commun.
39, 757-765.
16. Jörnvall H, Pettersson U & Philipson L (1974) Eur. J. Biochem.
48, 179-192.
17. Dayhoff MO, ed (1972) 'Atlas of Protein Sequence and
Structure', and (1973) Suppl. 1, (1976) Suppl. 2, and (1978)
Suppl. 3, National Biomedical Research Foundation, Silver
Springs, Maryland.
18. Dayhoff MO, ed (1978) in 'Protein Segment Dictionary'
National Biomedical Research Foundation, Silver Springs, Md.
19. Feldmann RJ, ed (1976) in 'Atlas of Macromolecular Structure
on Microfiche' (AMSOM), Tracor Jitco, Inc., Rockville, Md.
20. Blundell T, Lindley P, Miller L, Moss D, Slingsby C, Tickle I,
Turnell B & Wistow G (1981) Nature (London) 289, 771-777.
21. Eklund H, Nordström B, Zeppezauer E, Söderlund G, Ohlsson I,
Boiwe T, Söderberg B-O, Tapia O, Brändén C-I & Åkeson Å (1976)
J. Mol. Biol. 102, 27-59.
22. Holbrook JJ, Liljas A, Steindel SJ & Rossmann MG (1975) in
'The Enzymes' (Boyer PD, ed), vol 11, pp 191-292, Academic
Press.
23. Harris JI & Waters M (1976) in 'The Enzymes' (Boyer PD, ed),
vol 13, pp 1-49, Academic Press.
24. Harris JI & Walker JE (1977) in 'Pyridine Nucleotide-Dependent
Dehydrogenases' (Sund H, ed), pp 43-61, Walter de Gruyter,
Berlin.