Domain structure and sequence similarities in cartilage proteoglycan

198 BIOCHEMICAL SOCIETY TRANSACTIONS

the length of the polypeptide. In the second (CS2), the glycosaminoglycans occur as clusters separated by gaps. The current view of the aggregating cartilage proteoglycan is thus of a multidomain structure.

Because the interaction with hyaluronate is fundamental to the role these molecules play in cartilage, a good deal of attention has been focused on the molecular structure of this domain. The primary structure deduced from cDNA clones [11] and from amino acid sequencing [13] shows that it has two structural motifs. The first of these is an Ig fold [14] and the second is a tandem repeat of a sequence of about 100 residues, the so-called proteoglycan tandem repeat (PTR). The second globular domain G2 also contains the PTR, which exists as a pair of loops defined by disulphide bonds. The primary structure of link protein from rat chondrosarcoma [15, 16], chicken [17], pig and human [18] also shows this structural arrangement. G1 and link protein bind to hyaluronate and to each other, but G2 apparently shows none of these properties [19]. The precise function of G2 remains unclear.

A number of other proteoglycans have been studied in detail and the nature of their interactions with components of cell surfaces has been elucidated. Heparan sulphate proteoglycans, for instance, are involved in interactions with matrix macromolecules such as collagen, fibronectin and N-CAM (see [20] for a review).

The topics covered in this Colloquium indicate the extent to which understanding the nature and function of proteoglycans has progressed in recent years. The structures of the human and rat proteoglycan genes have been defined [7a] and it has been found that the exons correspond closely to the distinct domains in the protein structure. Electron microscopy and, in particular, the technique of rotary shadowing have contributed a large amount to an understanding of domain structure [9, 9a].
It is now possible to propose models to describe the nature of the interactions between proteoglycan, hyaluronate and link protein [21]. Similarly, knowledge of the structural organization of non-aggregating proteoglycans, such as the heparan sulphate proteoglycans, has progressed [22]. While a great deal of work has been done on the nature and role of proteoglycans in pathological situations, there is yet much to be learned about the molecular nature of conditions such as osteoarthritis, which lead to erosion of articular cartilage [23].

It is likely that, over the next few years, work will be carried out to determine the three-dimensional structure of the HABR and link protein and it will then be possible to refine models describing their interaction with hyaluronate. This in turn will lead to a greater understanding of the nature of these molecules and of the tissues in which they are found.

1. Kennedy, J. F. (1979) Proteoglycans - Biological and Chemical Aspects in Human Life, pp. 29-43, Elsevier, Amsterdam
2. Hascall, V. C. & Sajdera, S. W. (1970) J. Biol. Chem. 245, 4920-4930
3. Kuettner, K. E. & Kimura, J. H. (1985) J. Cell. Biochem. 27, 327-336
4. Heinegard, D. & Axelsson, I. (1977) J. Biol. Chem. 252, 1971-1979
5. Hascall, V. C. (1977) J. Supramol. Struct. 7, 101-120
6. Hardingham, T. E. & Muir, H. (1974) Biochim. Biophys. Acta 279, 401-405
7. Hascall, V. C. & Heinegard, D. (1974) J. Biol. Chem. 249, 4241-4249
7a. Doege, K., Sasaki, M. & Yamada, Y. (1990) Biochem. Soc. Trans. 18, 200-202
8. Christner, J. E., Brown, M. L. & Dziewiatowski, D. D. (1978) Anal. Biochem. 90, 23-32
9. Buckwalter, J. A., Rosenberg, L. C. & Tang, L.-H. (1984) J. Biol. Chem. 259, 5361-5363
9a. Morgelin, M., Paulsson, M. & Engel, J. (1990) Biochem. Soc. Trans. 18, 204-207
10. Paulsson, M., Morgelin, M., Wiedemann, H., Beardmore-Gray, M., Dunham, D., Hardingham, T., Heinegard, D., Timpl, R. & Engel, J. (1987) Biochem. J. 245, 763-772
11. Doege, K., Sasaki, M., Horigan, E., Hassell, J. R. & Yamada, Y. (1987) J. Biol. Chem. 262, 17757-17767
12. Oldberg, A., Antonsson, P. & Heinegard, D. (1987) J. Biol. Chem. 243, 255-259
13. Neame, P. J., Christner, J. E. & Baker, J. R. (1987) J. Biol. Chem. 262, 17768-17778
14. Bonnet, F., Perin, J.-P., Lorenzo, F., Jolles, J. & Jolles, P. (1986) Biochim. Biophys. Acta 873, 152-155
15. Doege, K., Hassell, J. R., Caterson, B. & Yamada, Y. (1986) Proc. Natl. Acad. Sci. U.S.A. 83, 3761-3765
16. Neame, P. J., Christner, J. E. & Baker, J. R. (1986) J. Biol. Chem. 261, 3519-3535
17. Deak, F., Kiss, I., Sparks, K. J., Argraves, W. S., Hampikian, G. & Goetinck, P. F. (1986) Proc. Natl. Acad. Sci. U.S.A. 83, 3766-3770
18. Dudhia, J. & Hardingham, T. E. (1989) J. Mol. Biol. 206, 737-753
19. Fosang, A. J. & Hardingham, T. E. (1989) Biochem. J. 261, 801-809
20. Gallagher, J. T., Lyon, M. & Steward, W. P. (1986) Biochem. J. 236, 313-325
21. Neame, P. J. (1990) Biochem. Soc. Trans. 18, 202-204
22. Gallagher, J. T., Turnbull, J. E. & Lyon, M. (1990) Biochem. Soc. Trans. 18, 207-209
23. Cashin, P. J. (1990) Biochem. Soc. Trans. 18, 212-214

Received 17 October 1989

Domain structure and sequence similarities in cartilage proteoglycan

JAYESH DUDHIA, AMANDA J. FOSANG and TIMOTHY E. HARDINGHAM
Kennedy Institute of Rheumatology, Hammersmith, London W6 7DW, U.K.

Abbreviations used: PTR, proteoglycan tandem repeat; Ig fold, immunoglobulin variable region fold.

The most abundant proteoglycan in cartilage is a high molecular mass aggregating species bearing chondroitin sulphate and keratan sulphate side-chains on a large protein core (225 kDa). Rotary-shadowing techniques [1] and DNA sequences [2] have shown this proteoglycan to be a multidomain structure that consists of three globular (G1, G2 and G3) and two extended regions (Fig. 1).

The aggregation properties are due to interactions involving the G1 globular domain lying at the N-terminus.
This disulphide-bonded G1 domain is composed of two structural motifs, an Ig fold and a tandem repeat, that have also been identified in link protein from its DNA sequences from pig and human cartilage [3], rat chondrosarcoma [4] and chick sternal cartilage [5]. These two structural motifs together constitute the whole of link protein (39 kDa) as a disulphide-bonded looped structure. One motif (loop A, Fig. 1) at the N-terminal of G1 and link protein is a 90 residue sequence that shows homology with an Ig variable region fold (Ig fold) and, although the homology is not high, secondary structure predictions [3] show significant identity with the Ig fold in the number and position of β-sheet sequences.

Fig. 1. Schematic representation of cartilage proteoglycan and link protein structure
The domain structures of cartilage proteoglycan and link protein are shown including: Ig fold; PTR; CS1 and CS2, chondroitin sulphate attachment region sequences. Disulphide bonds are marked with dashed lines. The polypeptide chains are drawn approximately to scale. (Features labelled in the figure: proteoglycan domains G1, G2, CS1, CS2 and G3; link protein; keratan sulphate; chondroitin sulphate; oligosaccharides.)

The second motif is a tandem repeat structure that contains two homologous loops (B and B’; Fig. 1), each of 99 amino acid residues lying adjacent to the Ig fold and towards the C-terminal. This proteoglycan tandem repeat (PTR) is also found in a second globular domain (G2) in the proteoglycan protein core, where it is separated by a short extended segment from G1. The sequences of the PTR B loops of proteoglycan and link protein, taken together for all four species determined, show a similarity of about 48%, while that of the B’ loops is 35%. For link protein alone, comparison among different species shows the amino acid sequence similarity between these two loops to be almost 60%.
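
The text does not state how the similarity percentages quoted above were calculated. As a minimal sketch of one common convention, the Python function below scores two pre-aligned sequences as the percentage of identical residues over the columns in which neither sequence has a gap. The function name `percent_identity`, the gap symbol and the short peptide strings are illustrative assumptions only; they are not sequences or methods taken from this work.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of identical residues between two pre-aligned sequences.

    Assumption (not from the paper): columns where either sequence has a
    gap are ignored, and the denominator is the number of remaining columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Made-up ten-residue peptides (one-letter code), differing at two positions.
print(percent_identity("ACDEFGHIKL", "ACDQFGHIKV"))  # 80.0
```

Other conventions exist, for example counting conservative substitutions as matches or dividing by the length of the shorter sequence, and they would give different percentages for the same alignment.
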
The G1 domain has been shown to bind specifically to hyaluronate [6]. Evidence from immunochemical data using antibodies raised specifically to G1 and G2 domains [7] clearly suggests that, while G1 can interact with link protein and hyaluronate, G2 does not possess this functional property. The G2 domain also does not appear to bind to collagen or other matrix proteins and its precise functional role in proteoglycan organization remains obscure.

The third globular domain (G3) is located at the C-terminal of proteoglycan and has a sequence quite different from G1 and G2 domains. G3 contains ten cysteine residues and exhibits 90% similarity between chicken [8], rat [2], bovine [9], and human and pig [10] sequences. Although highly conserved, its function also remains unclear. A part of G3 containing six cysteine residues has sequence similarity with vertebrate hepatic lectins specific for the terminal galactosyl or N-acetylglucosaminyl residues. A C-terminal G3-like domain found in a proteoglycan from human fibroblasts [11] is related in sequence to the G3 domain of the chondrocyte proteoglycan. It contains the lectin-like portion and ten cysteine residues whose spacing is completely conserved between these two proteoglycans expressed by different cell types. The human chondrocyte proteoglycan G3 domain has less sequence similarity (68%) to the human fibroblast G3 domain than to the G3 domains of chondrocytes from other species (>90%).

Adjoining the G2 and G3 domains is an extended region that contains two parts, both rich in hydroxy amino acids and containing many Ser-Gly sequences, but with different sequence patterns. From the available partial sequences for a 50 kDa region (region CS2, Fig. 1) for human, rat and bovine proteoglycan, this region contains 65% common sequence, although within it a 150 residue portion low in Ser-Gly sequences immediately adjacent to G3 is less than 50% conserved.
The total number of Ser-Gly dipeptides varies between species, being 24 in rat, 35 in bovine and 28 in human, of which 18 are maintained in common positions in all three species. The Ser-Gly dipeptides are found as a series of ten amino acid repeats, and could serve as substitution points for bearing chondroitin sulphate chains. This region is less well conserved than the globular G3 domain, which suggests that the precise pattern and number of Ser-Gly dipeptides are not critical to its function in bearing large numbers of chondroitin sulphate chains. A general feature of the Ser-Gly dipeptide-containing repeats is the presence of adjacent acidic amino acid residues. This arrangement has been postulated [12] to stimulate glycosylation in vivo in synthetic Ser-Gly-containing peptides.

The two extended regions of the core protein thus provide large numbers of substitution sites for glycosaminoglycan attachment and the density of chondroitin sulphate chains found on each extension is reflected by the respective numbers of the Ser-Gly-containing repeats. Hence, the extended sequence (CS2) adjacent to G3 contains chondroitin sulphate chains that are arranged in clusters, while in the CS1 region they are evenly distributed. The chondroitin sulphate attachment region thus appears to have arisen from the amplification of genes for two different Ser-Gly-rich sequences.

The multi-domain structure of proteoglycan, the conservation of domain structure between species, and the relationship of its domains with those of other proteins imply that the gene for this protein has arisen from exon movement from a number of gene families. The PTR loops (B and B’) share significant sequence homology with a human lymphocyte homing receptor Hermes (cell adhesion molecule) [13, 14], while the lectin-like region in G3 is related to a sequence found in a mouse lymphocyte homing receptor [15].
The human Hermes molecule has only one copy of the PTR motif located at the N-terminal and its similarity to the B loops of link protein and proteoglycan is approximately 3% and to the B’ loops about 25%. The PTRs of these three proteins may thus share a common ancestral gene. The presence of an Ig fold at the N-terminal also classes the proteoglycan (and link protein) as members of the immunoglobulin superfamily [16], in which there are many examples of proteins involved in recognition and adhesion.

Among the different gene sequences that make up the proteoglycan protein core, there are thus strong relationships with several prominent families of cell surface proteins, each of which provides elements of the globular protein structures that flank the heavily glycosylated chondroitin sulphate attachment region. The proteoglycan thus provides a further example of the close relationships that exist between proteins at the cell surface and secreted components of the extracellular matrix.

1. Paulsson, M., Morgelin, M., Wiedemann, H., Beardmore-Gray, M., Dunham, D., Hardingham, T. E., Heinegard, D., Timpl, R. & Engel, J. (1987) Biochem. J. 245, 763-772
2. Doege, K., Sasaki, M., Horigan, E., Hassell, J. R. & Yamada, Y. (1987) Proc. Natl. Acad. Sci. U.S.A. 83, 3766-3770
3. Perkins, S. J., Nealis, A. S., Dudhia, J. & Hardingham, T. E. (1989) J. Mol. Biol. 206, 737-753
4. Rhodes, C., Doege, K., Sasaki, M. & Yamada, Y. (1988) J. Biol. Chem. 263, 6063-6067
5. Deak, F., Kiss, I., Sparks, K. J., Argraves, W. S., Hampikian, G. & Goetinck, P. F. (1986) Proc. Natl. Acad. Sci. U.S.A. 83, 3766-3770
6. Hardingham, T. E., Beardmore-Gray, M., Dunham, D. & Ratcliffe, A. (1986) Ciba Found. Symp. 124, 30-46
7. Fosang, A. J. & Hardingham, T. E. (1989) Biochem. J. 261, 801-809
8. Sai, S., Tanaka, T., Kosher, R. & Tanzer, M. (1986) Proc. Natl. Acad. Sci. U.S.A. 83, 5081-5085
9. Oldberg, A., Antonsson, P. & Heinegard, D. (1987) Biochem. J. 243, 255-259
10. Dudhia, J. & Hardingham, T. E. (1989) Trans. Orthop. Res. Soc., U.S.A., 35th Meeting, 14, 8
11. Krusius, T. & Ruoslahti, E. (1987) J. Biol. Chem. 262, 13120-13125
12. Bourdon, M. A., Krusius, T., Campbell, S., Schwartz, N. B. & Ruoslahti, E. (1987) Proc. Natl. Acad. Sci. U.S.A. 84, 3194-3198
13. Stemenkovic, I., Amiot, M., Pesando, J. M. & Seed, B. (1989) Cell (Cambridge, Mass.) 56, 1057-1062
14. Goldstein, L. A., Zhou, D. F. H., Picker, J. L., Minty, C. N., Bargatze, R. F., Ding, J. F. & Butcher, E. C. (1989) Cell (Cambridge, Mass.) 56, 1063-1072
15. Lasky, L. A., Singer, M. S., Yednock, T. A., Dowbenko, D., Fennie, C., Rodriguez, H., Nguyen, T., Stachel, S. & Rosen, S. D. (1989) Cell (Cambridge, Mass.) 56, 1045-1055
16. Williams, A. F. & Barclay, A. N. (1988) Annu. Rev. Immunol. 6, 381-405

Received 17 October 1989

Rat and human cartilage proteoglycan (aggrecan) gene structure

KURT DOEGE,* MAKOTO SASAKI† and YOSHI YAMADA‡
*Shriner's Hospital for Crippled Children, 3101 SW Sam Jackson Park Rd, Portland, OR 97201, U.S.A.; †Orthopedic Surgery, National Beppu Hospital, Oita, Japan and ‡Laboratory of Developmental Biology and Anomalies, National Institute of Dental Research, National Institutes of Health, Bethesda, MD 20892, U.S.A.

The large, aggregating chondroitin sulphate proteoglycan of cartilage, or aggrecan, is the predominant proteoglycan in cartilage. It has a monomeric relative molecular mass (Mr) of approx. 2.4 × 10^6, is highly substituted with chondroitin sulphate and keratan sulphate glycosaminoglycans, as well as N- and O-linked oligosaccharides, and has the distinctive property of assembling into very large aggregates with hyaluronic acid. These aggregates form the water-retaining 'ground substance' of cartilage matrix and are thought to be single unbranched strands of hyaluronic acid to which many aggrecan molecules are tightly bound through a high-affinity domain at the N-terminus of the proteoglycan.
A small (Mr 43 000) glycoprotein called link protein is related both structurally and functionally to aggrecan and serves to stabilize the hyaluronic acid-aggrecan complex.

The complete primary structure of the protein core of aggrecan from rat chondrosarcoma has recently been deduced from the sequence of corresponding cDNA clones [1], and several structural features were noted. The 2124-residue core protein is composed of several distinct structural domains including a signal sequence (19 amino acids); two N-terminal globular domains G1 and G2 (334 and 200 amino acids, respectively), each with extensive sequence similarity to link protein, and separated from each other by a rod-like domain Ig (150 amino acids); a proline-rich domain (100 amino acids), which is likely to be the keratan sulphate-attachment domain; a large domain of 1157 amino acids, in which the sequence Ser-Gly recurs over 100 times as part of several repeating patterns, providing attachment sites for chondroitin sulphate; and, lastly, a third globular domain at the C-terminus (G3, 230 amino acids), which possesses strong sequence similarity to a family of lectin-like proteins in one of its two disulphide-containing portions, and has been shown to interact specifically with carbohydrate [2].

The C-terminal portion of the aggrecan sequence has been determined for a number of vertebrate species [3-5], but only the rat sequence is known in full. The potential role of this gene in degenerative joint conditions and heritable skeletal defects has led us to isolate and sequence the complete human aggrecan cDNA, and to isolate corresponding genomic clones for both the rat and human species. The structure of the rat gene has been mapped as to exon size and location, while the human gene has been partially characterized by sequencing and restriction-fragment hybridization patterns.

Results

Fig. 1 summarizes the comparison between the human and rat aggrecan sequences deduced from cDNA clones.
The human cDNA clones were obtained by screening a human chondrocyte λgt11 library kindly provided by Dr T. Kimura [6], using the rat aggrecan cDNA as probe. At the same time, we screened a human genomic library (partial Sau3A in an EMBL vector, provided by Dr F. Gonzalez, N.I.H.) and obtained several overlapping clones which represented the entire protein coding region of the gene. It was thus possible

Fig. 1. Comparison of the cDNA-deduced primary structures for rat and human aggrecan
The sequences are illustrated with the N-termini to the left and the various domains bracketed. SP, signal peptide; HABR, hyaluronic acid-binding region; KS, keratan sulphate domain; CS, chondroitin sulphate domain; G3, globular domain 3. The number of amino acids in each domain is in parentheses, and corresponding domains are bounded by the dashed lines. Open boxes indicate the insertions of human-specific repeated sequences (sizes in amino acids shown above) and the filled box is an exon which is skipped in the human G3 domain.
