198
BIOCHEMICAL SOCIETY TRANSACTIONS
the length of the polypeptide. In the second (CS2), the
glycosaminoglycans occur as clusters separated by gaps.
The current view of the aggregating cartilage proteoglycan
is thus of a multidomain structure. Because the interaction
with hyaluronate is fundamental to the role these molecules
play in cartilage, a good deal of attention has been focused
on the molecular structure of this domain. The primary
structure deduced from cDNA clones [11] and from amino
acid sequencing [13] shows that it has two structural motifs.
The first of these is an Ig fold [14] and the second is a tandem
repeat of a sequence of about 100 residues, the so-called
proteoglycan tandem repeat (PTR). The second globular
domain G2 also contains the PTR, which exists as a pair of
loops defined by disulphide bonds.
The primary structure of link protein from rat chondrosarcoma [15, 16], chicken [17], pig and human [18] also
shows this structural arrangement. G1 and link protein bind
to hyaluronate and to each other, but G2 apparently shows
none of these properties [19]. The precise function of G2
remains unclear.
A number of other proteoglycans have been studied in
detail and the nature of their interactions with components of
cell surfaces has been elucidated. Heparan sulphate proteoglycans, for instance, are involved in interactions with matrix
macromolecules such as collagen, fibronectin and N-CAM
(see [20] for a review).
The topics covered in this Colloquium indicate the extent
to which understanding of the nature and function of proteoglycans has progressed in recent years. The structures of the
human and rat proteoglycan genes have been defined [7a]
and it has been found that the exons correspond closely to
the distinct domains in the protein structure. Electron
microscopy and, in particular, the technique of rotary
shadowing have contributed a large amount to an understanding of domain structure [9, 9a]. It is now possible to
propose models to describe the nature of the interactions
between proteoglycan, hyaluronate and link protein [21].
Similarly, knowledge of the structural organization of non-aggregating proteoglycans, such as the heparan sulphate
proteoglycans, has progressed [22].
While a great deal of work has been done on the nature
and role of proteoglycans in pathological situations, there is
yet much to be learned about the molecular nature of conditions such as osteoarthritis, which lead to erosion of articular
cartilage [23]. It is likely that, over the next few years, work
will be carried out to determine the three-dimensional structure of the HABR and link protein and it will be then
possible to refine models describing their interaction with
hyaluronate. This in turn will lead to a greater understanding
of the nature of these molecules and of the tissues in which
they are found.
1. Kennedy, J. F. (1979) Proteoglycans - Biological and Chemical
Aspects in Human Life, pp. 29-43, Elsevier, Amsterdam
2. Hascall, V. C. & Sajdera, S. W. (1970) J. Biol. Chem. 245,
4920-4930
3. Kuettner, K. E. & Kimura, J. H. (1985) J. Cell. Biochem. 27,
327-336
4. Heinegard, D. & Axelsson, I. (1977) J. Biol. Chem. 252,
1971-1979
5. Hascall, V. C. (1977) J. Supramol. Struct. 7, 101-120
6. Hardingham, T. E. & Muir, H. (1974) Biochim. Biophys. Acta
279, 401-405
7. Hascall, V. C. & Heinegard, D. (1974) J. Biol. Chem. 249,
4241-4249
7a. Doege, K., Sasaki, M. & Yamada, Y. (1990) Biochem. Soc.
Trans. 18, 200-202
8. Christner, J. E., Brown, M. L. & Dziewiatowski, D. D. (1978)
Anal. Biochem. 90, 23-32
9. Buckwalter, J. A., Rosenberg, L. C. & Tang, L.-H. (1984) J.
Biol. Chem. 259, 5361-5363
9a. Morgelin, M., Paulsson, M. & Engel, J. (1990) Biochem. Soc.
Trans. 18, 204-207
10. Paulsson, M., Morgelin, M., Wiedemann, H., Beardmore-Gray,
M., Dunham, D., Hardingham, T., Heinegard, D., Timpl, R. &
Engel, J. (1987) Biochem. J. 245, 763-772
11. Doege, K., Sasaki, M., Horigan, E., Hassell, J. R. & Yamada, Y.
(1987) J. Biol. Chem. 262, 17757-17767
12. Oldberg, A., Antonsson, P. & Heinegard, D. (1987) J. Biol.
Chem. 243, 255-259
13. Neame, P. J., Christner, J. E. & Baker, J. R. (1987) J. Biol.
Chem. 262, 17768-17778
14. Bonnet, F., Perin, J.-P., Lorenzo, F., Jolles, J. & Jolles, P. (1986)
Biochim. Biophys. Acta 873, 152-155
15. Doege, K., Hassell, J. R., Caterson, B. & Yamada, Y. (1986)
Proc. Natl. Acad. Sci. U.S.A. 83, 3761-3765
16. Neame, P. J., Christner, J. E. & Baker, J. R. (1986) J. Biol.
Chem. 261, 3519-3535
17. Deak, F., Kiss, I., Sparks, K. J., Argraves, W. S., Hampikian, G.
& Goetinck, P. F. (1986) Proc. Natl. Acad. Sci. U.S.A. 83,
3766-3770
18. Dudhia, J. & Hardingham, T. E. (1989) J. Mol. Biol. 206,
737-753
19. Fosang, A. J. & Hardingham, T. E. (1989) Biochem. J. 261,
801-809
20. Gallagher, J. T., Lyon, M. & Steward, W. P. (1986) Biochem. J.
236, 313-325
21. Neame, P. J. (1990) Biochem. Soc. Trans. 18, 202-204
22. Gallagher, J. T., Turnbull, J. E. & Lyon, M. (1990) Biochem.
Soc. Trans. 18, 207-209
23. Cashin, P. J. (1990) Biochem. Soc. Trans. 18, 212-214
Received 17 October 1989
Domain structure and sequence similarities in cartilage proteoglycan
JAYESH DUDHIA, AMANDA J. FOSANG and
TIMOTHY E. HARDINGHAM
Kennedy Institute of Rheumatology, Hammersmith, London
W6 7DW, U.K.
The most abundant proteoglycan in cartilage is a high
molecular mass aggregating species bearing chondroitin sulphate and keratan sulphate side-chains on a large protein
core (225 kDa). Rotary-shadowing techniques [1] and DNA
sequences [2] have shown this proteoglycan to be a multidomain structure that consists of three globular (G1, G2 and
G3) and two extended regions (Fig. 1).
Abbreviations used: PTR, proteoglycan tandem repeat; Ig fold,
immunoglobulin variable region fold.
The aggregation properties are due to interactions involving the G1 globular domain lying at the N-terminus. This disulphide-bonded G1 domain is composed of two structural
motifs, an Ig fold and a tandem repeat, that have also been
identified in link protein from its DNA sequences from pig
and human cartilage [3], rat chondrosarcoma [4] and chick
sternal cartilage [5]. These two structural motifs together
constitute the whole of link protein (39 kDa) as a disulphide-bonded looped structure. One motif (loop A, Fig. 1) at the
N-terminal of G1 and link protein is a 90 residue sequence
that shows homology with an Ig variable region fold (Ig fold)
and, although the homology is not high, secondary structure
predictions [3] show significant identity with the Ig fold in the
number and position of β-sheet sequences. The second motif
is a tandem repeat structure that contains two homologous
loops (B and B’; Fig. 1), each of 99 amino acid residues
lying adjacent to the Ig fold and towards the C-terminal. This
proteoglycan tandem repeat (PTR) is also found in a second
globular domain (G2) in the proteoglycan protein core,
where it is separated by a short extended segment from G1.
The sequences of the PTR B loops of proteoglycan and link
protein, taken together for all four species determined, show
a similarity of about 48%, while that of the B’ loops is 35%.
For link protein alone, comparison among different species
shows the amino acid sequence similarity between these two
loops to be almost 60%.

Fig. 1. Schematic representation of cartilage proteoglycan and link protein structure
[Figure labels: proteoglycan (G1, G2, CS1, CS2, G3); link protein; keratan sulphate; chondroitin sulphate; oligosaccharides]
The domain structures of cartilage proteoglycan and link protein are shown
including: Ig fold; PTR; CS1 and CS2, chondroitin sulphate attachment region
sequences. Disulphide bonds are marked with dashed lines. The polypeptide chains
are drawn approximately to scale.
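
The loop-to-loop similarity figures quoted above can be illustrated with a minimal sketch. It assumes that "similarity" means the percentage of identical residues at aligned, non-gap positions; the text does not state how the values were calculated, so this is only one plausible reading. The function name and the two short aligned fragments are hypothetical and are not taken from any PTR sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned, non-gap columns in which both residues are identical.

    Both arguments are rows of a pairwise alignment in one-letter code,
    with '-' marking a gap. This is an assumed definition of "similarity".
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # columns containing a gap are not scored
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Hypothetical toy fragments, invented for illustration only.
loop_b = "CYAGW-LADGSVRY"
loop_b_prime = "CDAGWRLSDGT-RY"
print(percent_identity(loop_b, loop_b_prime))
```

Scoring conservative substitutions as matches, or counting gap columns in the denominator, would give different percentages, which is why the definition is stated as an assumption here.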
The G1 domain has been shown to bind specifically to
hyaluronate [6]. Evidence from immunochemical data using
antibodies raised specifically to G1 and G2 domains [7]
clearly suggests that, while G1 can interact with link protein
and hyaluronate, G2 does not possess this functional
property. The G2 domain also does not appear to bind to
collagen or other matrix proteins and its precise functional
role in proteoglycan organization remains obscure.
The third globular domain (G3) is located at the C-terminal of proteoglycan and has a sequence quite different
from G1 and G2 domains. G3 contains ten cysteine residues
and exhibits 90% similarity between chicken [8], rat [2],
bovine [9], and human and pig [10] sequences. Although
highly conserved, its function also remains unclear. A part of
G3 containing six cysteine residues has sequence similarity
with vertebrate hepatic lectins specific for the terminal
galactosyl or N-acetylglucosaminyl residues. A C-terminal
G3-like domain found in a proteoglycan from human fibroblasts [11] is related in sequence to the G3 domain of the
chondrocyte proteoglycan. It contains the lectin-like portion
and ten cysteine residues whose spacing is completely conserved between these two proteoglycans expressed by different cell types. The human chondrocyte proteoglycan G3
domain has less sequence similarity (68%) to the human
fibroblast G3 domain than to the G3 domains of chondrocytes from other species (>90%).
Adjoining the G2 and G3 domains is an extended region
that contains two parts both rich in hydroxy amino acids and
containing many Ser-Gly sequences, but with different
sequence patterns. The available partial sequences for a
50 kDa region (region CS2, Fig. 1) of human, rat and bovine
proteoglycan share 65% common sequence,
although within it a 150 residue portion low in Ser-Gly
sequences immediately adjacent to G3 is less than 50% conserved. The total number of Ser-Gly dipeptides varies
between species, being 24 in rat, 35 in bovine and 28 in
human, of which 18 are maintained in common positions in
all three species. The Ser-Gly dipeptides are found as a series
of ten amino acid repeats, and could serve as substitution
points for bearing chondroitin sulphate chains. This region is
less well conserved than the globular G3 domain, which suggests that the precise pattern and number of Ser-Gly dipeptides is not critical to its function in bearing large
numbers of chondroitin sulphate chains.
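
The Ser-Gly counts above (per species, and those held in common positions) amount to a simple tally over aligned sequences, sketched below. The function names and the three short sequences are hypothetical, invented for illustration; they are not the rat, bovine or human CS2 sequences. The sketch also assumes the sequences have already been aligned to equal length, so that a shared index means a shared position.

```python
def ser_gly_positions(seq: str) -> list[int]:
    """0-based start indices of every Ser-Gly ("SG") dipeptide in a one-letter sequence."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "SG"]


def conserved_ser_gly(aligned: dict[str, str]) -> set[int]:
    """Alignment positions at which every sequence carries a Ser-Gly dipeptide."""
    position_sets = [set(ser_gly_positions(s)) for s in aligned.values()]
    if not position_sets:
        return set()
    return set.intersection(*position_sets)


# Hypothetical, pre-aligned toy fragments (not real proteoglycan sequences).
toy = {
    "rat": "EGSGLPSGAD",
    "bovine": "SGSGLPSGEE",
    "human": "EGSGLPAGSG",
}

counts = {species: len(ser_gly_positions(seq)) for species, seq in toy.items()}
print(counts)                   # Ser-Gly dipeptides per species
print(conserved_ser_gly(toy))   # positions shared by all three
```

A dipeptide interrupted by an alignment gap would not be counted by this sketch; handling gaps would require mapping alignment columns back to residue numbers.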
A general feature of the Ser-Gly dipeptide-containing
repeats is the presence of adjacent acidic amino acid residues. This arrangement has been postulated [12] to stimulate
glycosylation in vivo in synthetic Ser-Gly-containing peptides. The two extended regions of the core protein thus provide large numbers of substitution sites for glycosaminoglycan attachment and the density of chondroitin sulphate
chains found on each extension is reflected by the respective
numbers of the Ser-Gly-containing repeats. Hence, the
extended sequence (CS2) adjacent to G3 contains chondroitin sulphate chains that are arranged in clusters, while in
the CS1 regions they are evenly distributed. The chondroitin
sulphate attachment region thus appears to have arisen from
the amplification of genes for two different Ser-Gly-rich
sequences.
The multi-domain structure of proteoglycan and the conservation of domain structure between species, and their
relationship with domains of other proteins, imply that the
gene for this protein has arisen from exon movement from a
number of gene families. The PTR loops (B and B’) share
significant sequence homology with a human lymphocyte
homing receptor Hermes (cell adhesion molecule) [13, 14],
while the lectin-like region in G3 is related to a sequence
found in a mouse lymphocyte homing receptor [15]. The
human Hermes molecule has only one copy of the PTR motif
located at the N-terminal and its similarity to the B loops of
link protein and proteoglycan is approximately 3% and to
the B’ loops about 25%. The PTRs of these three proteins
may thus share a common ancestral gene. The presence of an
Ig fold at the N-terminal also classes the proteoglycan (and
link protein) as members of the immunoglobulin superfamily
[16], in which there are many examples of proteins involved
in recognition and adhesion. Among the different gene
sequences that make up the proteoglycan protein core, there
are thus strong relationships with several prominent families
of cell surface proteins, each of which provides elements of the
globular protein structures that flank the heavily glycosylated chondroitin sulphate attachment region. The proteoglycan thus provides a further example of the close relationships that exist between proteins at the cell surface and
secreted components of the extracellular matrix.
1. Paulsson, M., Morgelin, M., Wiedemann, H., Beardmore-Gray,
M., Dunham, D., Hardingham, T. E., Heinegard, D., Timpl, R.
& Engel, J. (1987) Biochem. J. 245, 763-772
2. Doege, K., Sasaki, M., Horigan, E., Hassell, J. R. & Yamada, Y.
(1987) Proc. Natl. Acad. Sci. U.S.A. 83, 3766-3770
3. Perkins, S. J., Nealis, A. S., Dudhia, J. & Hardingham, T. E.
(1989) J. Mol. Biol. 206, 737-753
4. Rhodes, C., Doege, K., Sasaki, M. & Yamada, Y. (1988) J. Biol.
Chem. 263, 6063-6067
5. Deak, F., Kiss, I., Sparks, K. J., Argraves, W. S., Hampikian, G.
& Goetinck, P. F. (1986) Proc. Natl. Acad. Sci. U.S.A. 83,
3766-3770
6. Hardingham, T. E., Beardmore-Gray, M., Dunham, D. &
Ratcliffe, A. (1986) Ciba Found. Symp. 124, 30-46
7. Fosang, A. J. & Hardingham, T. E. (1989) Biochem. J. 261,
801-809
8. Sai, S., Tanaka, T., Kosher, R. & Tanzer, M. (1986) Proc. Natl.
Acad. Sci. U.S.A. 83, 5081-5085
9. Oldberg, A., Antonsson, P. & Heinegard, D. (1987) Biochem. J.
243, 255-259
10. Dudhia, J. & Hardingham, T. E. (1989) Trans. Orthop. Res.
Soc., U.S.A., 35th Meeting, 14, 8
11. Krusius, T. & Ruoslahti, E. (1987) J. Biol. Chem. 262,
13120-13125
12. Bourdon, M. A., Krusius, T., Campbell, S., Schwartz, N. B. &
Ruoslahti, E. (1987) Proc. Natl. Acad. Sci. U.S.A. 84,
3194-3198
13. Stemenkovic, I., Amiot, M., Pesando, J. M. & Seed, B. (1989)
Cell (Cambridge, Mass.) 56, 1057-1062
14. Goldstein, L. A., Zhou, D. F. H., Picker, J. L., Minty, C. N.,
Bargatze, R. F., Ding, J. F. & Butcher, E. C. (1989) Cell
(Cambridge, Mass.) 56, 1063-1072
15. Lasky, L. A., Singer, M. S., Yednock, T. A., Dowbenko, D.,
Fennie, C., Rodriguez, H., Nguyen, T., Stachel, S. & Rosen,
S. D. (1989) Cell (Cambridge, Mass.) 56, 1045-1055
16. Williams, A. F. & Barclay, A. N. (1988) Annu. Rev. Immunol.
6, 381-405
Received 17 October 1989
Rat and human cartilage proteoglycan (aggrecan) gene structure
KURT DOEGE,* MAKOTO SASAKI† and
YOSHI YAMADA‡
*Shriner's Hospital for Crippled Children, 3101 SW Sam
Jackson Park Rd, Portland, OR 97201, U.S.A.; †Orthopedic
Surgery, National Beppu Hospital, Oita, Japan and
‡Laboratory of Developmental Biology and Anomalies,
National Institute of Dental Research, National Institutes of
Health, Bethesda, MD 20892, U.S.A.
The large, aggregating chondroitin sulphate proteoglycan of
cartilage, or aggrecan, is the predominant proteoglycan in
cartilage. It has a monomeric relative molecular mass (Mr) of approx.
2.4 × 10⁶, is highly substituted with chondroitin sulphate and
keratan sulphate glycosaminoglycans, as well as N- and O-linked oligosaccharides and has the distinctive property of
assembling into very large aggregates with hyaluronic acid.
These aggregates form the water-retaining 'ground substance' of cartilage matrix and are thought to be single
unbranched strands of hyaluronic acid to which many aggrecan molecules are tightly bound through a high-affinity
domain at the N-terminus of the proteoglycan. A small (Mr
43 000) glycoprotein called link protein is related both structurally and functionally to aggrecan and serves to stabilize
the hyaluronic acid-aggrecan complex.
The complete primary structure of the protein core of
aggrecan from rat chondrosarcoma has recently been
deduced from the sequence of corresponding cDNA clones
[1], and several structural features were noted. The 2124-residue core protein is composed of several distinct structural domains including a signal sequence (19 amino acids);
two N-terminal globular domains G1 and G2 (334 and 200
amino acids, respectively), each with extensive sequence
similarity to link protein, and separated from each other by a
rod-like domain Ig (150 amino acids); a proline-rich domain
(100 amino acids), which is likely to be the keratan sulphate-attachment domain; a large domain of 1157 amino acids, in
which the sequence Ser-Gly recurs over 100 times as part of
several repeating patterns, providing attachment sites for
chondroitin sulphate; and, lastly, a third globular domain at
the C-terminus (G3, 230 amino acids), which possesses
strong sequence similarity to a family of lectin-like proteins
in one of its two disulphide-containing portions, and has
been shown to interact specifically with carbohydrate [2].
The C-terminal portion of the aggrecan sequence has
been determined for a number of vertebrate species [3-5],
but only the rat sequence is known in full. The potential role
of this gene in degenerative joint conditions and heritable
skeletal defects has led us to isolate and sequence the complete human aggrecan cDNA, and to isolate corresponding
genomic clones for both the rat and human species. The
structure of the rat gene has been mapped as to exon size and
location, while the human gene has been partially characterized by sequencing and restriction-fragment hybridization
patterns.
Results
Fig. 1 summarizes the comparison between the human
and rat aggrecan sequences deduced from cDNA clones.
The human cDNA clones were obtained by screening a
human chondrocyte λgt11 library kindly provided by Dr T.
Kimura [6], using the rat aggrecan cDNA as probe. At the
same time, we screened a human genomic library (partial
Sau3A in E M B U, provided by Dr F. Gonzalez, N.I.H.) and
obtained several overlapping clones which represented the
entire protein coding region of the gene. It was thus possible
Fig. 1. Comparison of the cDNA-deduced primary structures
for rat and human aggrecan
The sequences are illustrated with the N-termini to the left
and the various domains bracketed. SP, signal peptide;
HABR, hyaluronic acid-binding region; KS, keratan sulphate
domain; CS, chondroitin sulphate domain; G3, globular
domain 3. The number of amino acids in each domain is in
parentheses, and corresponding domains are bounded by the
dashed lines. Open boxes indicate the insertions of human-specific repeated sequences (sizes in amino acids shown
above) and the filled box is an exon which is skipped in the
human G3 domain.
1990