Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Glycobiology vol. 13 no. 5 pp. 377±386, 2003 DOI: 10.1093/glycob/cwg042 Molecular modeling of glycosyltransferases involved in the biosynthesis of blood group A, blood group B, Forssman, and iGb3 antigens and their interaction with substrates Helena Heissigerov a2,3, Christelle Breton2, Jitka 3 Moravcova , and Anne Imberty1,2 2 Centre de Recherches sur les Macromolecules Vegetales, CNRS (affiliated with Universite Joseph Fourier), 601 rue de la Chimie, BP 53, 38041 Grenoble cedex 9, France; and 3Institute of Chemical Technology, Technicka 5, 166 28 Prague, Czech Republic Received on November 18, 2002; revised on December 9, 2002; accepted on December 17, 2002 A terminal a1-3 linked Gal or GalNAc sugar residue is the common structure found in several oligosaccharide antigens, such as blood groups A and B, the xeno-antigen, the Forssman antigen, and the isogloboside 3 (iGb3) glycolipid. The enzymes involved in the addition of this residue display strong amino acid sequence similarities, suggesting a common fold. From a recently solved crystal structure of the bovine a3-galactosyltransferase complexed with UDP, homology modeling methods were used to build the four other enzymes of this family in their locked conformation. Nucleotide-sugars, the Mn2 ion, and oligosaccharide acceptors were docked in the models. Nine different amino acid regions are involved in the substrate binding sites. After geometry optimization of the complexes and analysis of the predicted structures, the basis of the specificities can be rationalized. In the nucleotide-sugar binding site, the specificity between Gal or GalNAc transferase activity is due to the relative size of two clue amino acids. In the acceptor site, the presence of up to three tryptophan residues define the complexity of the oligosaccharide that can be specifically recognized. The modeling study helps in rationalizing the crystallographic data obtained in this family and provides insights on the basis of substrate and donor recognition. blood transfusion and organ allo- and xenotransplantation. A and B blood group oligosaccharides are located on cell surfaces (erythrocytes and/or vascular endothelium) of various mammals (Oriol, 1987) whereas the xeno-antigen, the Forssman antigen, and the isogloboside 3 (iGb3) glycolipids are not normally produced by human cells. The terminal structures of these oligosaccharides are represented in Scheme 1. The enzymes that are involved in the biosynthesis of those Gal(NAc)a1-3Gal(NAc)-containing antigens display strong amino acid sequence similarities, suggesting a common fold, and all belong to the same family, GT6, according to the classification in the CAZY database (http:// afmb.cnrs-mrs.fr/~cazy/CAZY/index.html). The a3-galactosyltransferase (a3GalT), responsible for the biosynthesis of a-Gal epitope (or xeno-antigen), is found in many mammalian species, including most primates and New World monkeys, but not in humans and their closest relatives because of the mutational inactivation of the gene (Galili and Swanson, 1991). Species lacking the a3GalT do not have the aGal epitope on glycoconjugates, and about 1% of their circulating antibodies are directed against this antigen, causing the hyperacute rejection in xenotransplantation. Because several crystal structures of a3GalT have been solved recently (Boix et al., 2001, 2002; Gastinel et al., 2001), structurally and mechanistically this enzyme is a model for the several other related retaining glycosyltransferases of varying donor and acceptor substrate specificity, such as Forssman glycolipid synthase (Forss-S) (Haslam and Baenziger, 1996), isogloboside 3 synthase (iGb3-S) Key words: blood group antigens/galactosyltransferase/ glycosyltransferase/molecular modeling Introduction Carbohydrates in the form of glycoproteins and glycolipids play crucial roles in various signaling and molecular recognition processes, affect the stability and structure of proteins, and are epitopes recognized by the immune system (Varki, 1993). A core a-galactosyl(1-3)galactose moiety is the common structure of several oligosaccharide antigens present on mammalian cell surfaces. These antigens have been well characterized because of their importance for 1 To whom correspondence should be addressed; e-mail: [email protected] Scheme 1. Terminal structures of the AB blood group/antigens, the xeno-antigen, the Forssman antigen, and the iGb3 antigen. Glycobiology vol. 13 no. 5 # Oxford University Press 2003; all rights reserved. 377 H. Heissigerova et al. Table I. Specificity and reaction products of the glycosyltransferases related to blood group A and B synthases Enzyme Donor Acceptora Product a3GalT UDP-Gal bGal1,4bGlcNAc-R aGal1,3bGal1,4bGlcNAc-R GTB UDP-Gal aFuc1,2bGal1,4bGlcNAc-R aGal1,3[aFuc1,2]bGal1,4bGlcNAc-R iGb3-S UDP-Gal bGal1,4bGlcb-ceramide aGal1,3bGal1,4bGlc1-ceramide GTA UDP-GalNAc aFuc1,2bGal1,4bGlcNAc-R aGalNAc1,3[aFuc1,2]bGal1,4bGlcNAc-R Forss-S UDP-GalNAc bGalNAc1,3aGal-R aGalNAc1,3bGalNAc1,3aGal-R a The monosaccharides that carry the acceptor hydroxyl are indicated in boldface type. (Keusch et al., 2000), and the histo-blood group A and B glycosyltransferases (GTA and GTB) (Yamamoto et al., 1990). Enzymes belonging to this a3-Gal(NAc)T family share common features. They all use a UDP-nucleotide sugar as donor, retain the configuration of the Gal (or GalNAc) transferred, and their activity is strictly dependent upon the presence of a divalent cation (generally Mn2 ). Nevertheless, they differ by their fine specificity: a3GalT, blood group B transferase, and iGb3-S only use UDP-Gal as donor, whereas blood group A transferase and Forss-S use UDP-GalNAc. In addition, the mouse AB glycosyltransferase derived from a cis-AB gene can transfer sugar from both donors (Yamamoto et al., 2001). The amino acid basis for Gal versus GalNAc specificity has been well defined for the blood group A and B transferases, which differ by only four amino acids (Yamamoto and McNeill, 1996). On the other hand, very little is known about the basis for the acceptor specificity: N-acetyllactosamine and lactose are the acceptor for a3GalT and iGb3-S, respectively. Blood group enzymes require substitution by a L-fucose on position 2 of the acceptor galactose, whereas Forss-S requires a N-acetyl group at the same location. Details for donor and acceptor specificities are listed in Table I. Like the majority of glycosyltransferases (Paulson and Colley, 1989), a3Gal(NAc)T enzymes are type II membrane proteins with a short N-terminal cytoplasmic tail, singletransmembrane domain, stem, and C-terminal catalytic region. Despite their importance, structural information about glycosyltransferases is rare, and only 12 different glycosyltransferases have been crystallized until now (see È nligil and Rini, 2000; Breton et al., 2002). Although review; U they share little or no sequence similarity, these 12 glycosyltransferases appear to adopt only two different folds, which have been named BGT fold and SpsA fold for reference to the first structure solved in each case. Among the known structures, only five are retaining glycosyltransferases, namely, bovine a3GalT (Boix et al., 2001, 2002; Gastinel et al., 2001), human blood group A and B synthase (Patenaude et al., 2002), a bacterial a4-galactosyltransferase (LgtC) (Persson et al., 2001), and rabbit glycogenin that initiates the biosynthesis of glycogen (Gibbons et al., 2002). All of these retaining glycosyltransferases adopt the SpsA fold. Analysis and comparison of glycosyltransferase crystal structures in the goal of elucidating the basis of substrate recognition and catalysis has been complicated by the occurrence of large movement of one or two loops involved 378 in substrate binding. First evidenced in inverting glycosylÈ nligil et al., 2000) transferases for b2-GlcNAc transferase (U and then b4-Gal transferase (Ramakrishnan and Qasba, 2001), this conformational change has also been demonstrated to occur in retaining glycosyltransferases (Boix et al., 2001). In a3GalT, opening of the acceptor site is dependent on a donor substrate-induced conformational change (Boix et al., 2002). Among retaining glycosyltransferases, only a3GalT (Boix et al., 2001, 2002) and LgtC (Persson et al., 2001) have been crystallized in this substrate buried state, also called Form II. From the several crystal structures that have presently been obtained in Form I (open) and Form II (closed), it can now be inferred that not only the presence of UDP or UDP-sugar is necessary for obtaining the ``locked'' conformation but also that this substrate should be added to the crystallization medium. Crystals obtained in the absence of substrate and then soaked with UDP do bind the substrate in the active site, but the ordering of loops does not occur in the solid state. Owing to the difficulties in obtaining cocrystals of glycosyltransferases with their substrates, molecular modeling is an alternative to approaching the role of flexible loops in substrate binding. In the present article, we propose to use the recent highresolution structure of a3GalT (Boix et al., 2001) as a template to model the other related enzymes with a3Gal(NAc) transferase activity listed in Table I. Docking of nucleotidesugars on one hand and of oligosaccharide acceptor on the other will lead to the comparison of the architecture of the binding sites. Comparison with the recently solved blood group A and B synthases (Patenaude et al., 2002) validates our modeling method and allows for comparing the two conformational forms of the enzymes. In addition, the different sequence motifs involved in substrate binding have been defined, and their precise role has been investigated. Results Description of the overall structure of the models Figure 1 displays the sequence alignments of the enzymes of interest for the study. This alignment is the basis of the homology building, and only the catalytic regions are shown. This glycosyltransferase family displays a high level of sequence identity. The percentage of identity varies from 44% to 55% (with the exception of GTA and GTB, which display 98% identity). There are very few insertions or deletions in the regions corresponding to secondary structures. Molecular modeling of a3Gal(NAc)transferases Fig. 1. Amino acid sequence alignment of bovine a3GalT and related enzymes of the a3Gal(NAc)T family. Secondary structure elements of a3GalT are indicated above the sequences and numbered as in Boix et al. (2001). Conserved amino acids have a black background and preserved ones a gray background. Regions involved in ligands binding are boxed. The stars indicate the four amino acids that are different between GTA and GTB sequences. Loop regions display more variability, but their sizes are comparable, thus facilitating the modeling study. Because all glycosyltransferases belong to the SpsA fold superfamily, the enzymes modeled here adopt a globular shape. Figure 2A displays the overall structure of the GTA model. One face is responsible for binding the nucleotidesugar and the acceptor. Because the crystal structure of a3GalT complexed with UDP was taken as the reference molecule (Boix et al., 2001), all models are in their locked Form II conformation, with the nucleotide-sugar almost completely buried under the C-terminal region. The acceptor binding site appears as a deep crevice adjacent to the nucleotide-binding site. These two binding sites are made up with residues belonging to nine peptide regions, identified as ligand binding regions (LBRs) and labeled from A to I in Figures 1 and 2. Because there is no significant differences in loop size, the overall structure of the four proteins modeled here does not present large variation, and only one has been displayed. The nucleotide-sugar binding site: predicted interactions between enzymes and UDP-Gal/GalNAc The UDP-sugar binding domain is located between the central b-sheet (b2±b5) and two long a-helices (a3 and a4) and is capped by the C-terminal region. Figure 3A gives insights into the binding site pockets of four models. The model for binding UDP-Gal by iGb3-S is very similar to the prediction for a3GalT and is not displayed. The Mn2 binding site is made up by the 225 DVD227 (numbering relative to a3GalT), the very conserved LBR-C region. The uridine moiety is bound through the conserved 134 FA(IV)(GK)(KR)Y139 motif in LBR-A, whereas the ribose ring interacts with amino acids of LBR-A, -B, and -C. The C-terminal region makes direct contacts with the phosphate atoms through Lys359 (LBR-H) and Arg365 (LBR-I), both positions absolutely conserved among the enzymes studied here. The Gal/GalNAc moiety interacts with different regions: LBR-B, -C, -E, and -F. Details of the hydrogen bonds involving the sugar moiety have been listed in Table II. Depending on the model, five to seven hydrogen bonds are established between the sugar moiety and the proteins (see Figure 4 for GTA/GalNAc). Four of them are conserved in the whole family: the first aspartate amino acid of the DXD motif (LBR-C) receives a hydrogen bond from O3 hydroxyl group. An acidic amino acid from region LBR-F (Asp316 in a3GalT) receives hydrogen bonds from both O-4 and O-6, this latest hydroxyl receiving a hydrogen bond from the conserved Ser amino acid of LBR-B. The Gal/GalNAc specificity is ensured by LBR-E, which contains the 277 FYYX(GA)(GA)282 motif. From our modeling, it appears that the possibility to accept an N-acetyl group in the binding site is controlled by a pair of amino acids facing each other (His280 and Ala282 in a3GalT). The steric hindrance created by Met266/Ala268 in GTB and by His280/Ala282 in a3GalT (His253 in iGb3-S) is clearly displayed in Figure 3A. The enzymes that can use GalNAc have a glycine at one of these two positions (Leu266/Gly268 in GTA and Gly261/Ala263 in Forss-S), therefore allowing for an opening of this subsite. These particular two residues correspond to two of the four amino acids differentiating the human blood group A and B glycosyltransferases. The acceptor binding site: interactions between enzyme and oligosaccharide acceptor Acceptors of various size (disaccharides and trisaccharides) have been docked into protein models. The molecules that have been used in the modeling studies are those listed in 379 H. Heissigerova et al. Fig. 2. (A) Graphical representation of the model of human GTA in interaction with Mn2 , UDP-GalNAc, and trisaccharide acceptor. The Connolly protein surface is represented with different colors for the different peptide regions involved in substrates binding. These loops are labeled according to Figure 1. (B) Same representation for the crystal structure of GTA in interaction with Mn2 , UDP, and H-disaccharide analog (Patenaude et al., 2002). (C) Stereoscopic representation of the binding site of the GTA model. (D) Same representation for GTA crystal structure. (E) Ribbon representation of the peptide chain in the GTA model, together with space fill representation of the amino acids involved in capping the nucleotide binding site. Table I, although the noncarbohydrate part (ceramide) has not been taken into account in the modeling. When compared to the donor nucleotide sugar, it can be stated that the acceptor oligosaccharides are more exposed to the solvent and establish a limited number of contacts with the protein surface. Nevertheless, four regions seem to play an important role in binding the acceptor and defining the specificity, namely, LBR-D, -F, -G, and -H. These regions 380 have been color coded in Figure 3B where the proteins are represented by their accessible surfaces. Minor contacts are also established with residues of LBR-E (Tyr278 in a3GT). Details of hydrogen bonds and hydrophobic contacts are listed in Tables III and IV. In all enzymes of this family, the primary acceptor is a Gal or GalNAc residue. This can be correlated to a conserved structural feature: LBR-F is almost invariant and Molecular modeling of a3Gal(NAc)transferases A S185 D302 S185 D302 E303 E303 G268 A268 M266 L266 GTA GTB D316 S180 E297 S199 E317 E298 G261 A263 H280 A282 α3GalT Forss-S B W287 W300 F236 H220 H233 H223 K346 W329 G235 L329 K332 A343 W222 E315 Q328 brings a Trp residue in contact with the galactose acceptor (Trp314 in a3GalT). As previously observed in several carbohydrate-binding proteins (Vyas, 1991; Rini, 1995), the aromatic rings stack against the CH of the galactose rings and ensure specificity of binding for a galactose at this position. For two of the enzymes, a3GalT and iGb3-S, a terminal galactose is required for acceptor, and no substitution is tolerated at the C-2 position of this residue. Examination of the models indicated that region LBR-H, and particularly the presence of an additional bulky Trp (Trp356 in a3GalT), is responsible for the fine specificity. The Trp356 in a3GalT makes a barrier for any substitution on the Gal ring, whereas in the other enzymes the smaller Ala or Thr residue together with basic residues 323 RK (Forss-S) or 328 QL (GTA and GTB) form a pocket accommodating either Fuc or NHAc group on C2 of the Gal ring. The size of the first amino acid of region LBR-H (Trp356 in a3GalT) seems to distinguish whether the first monosaccharide of acceptor could be branched or not. Tryptophan residues are also involved in defining the acceptor specificity at longer range, that is, with the reducing sugar of the di- or trisaccharide. Again, a3GalT, I316 iGb3-S GTA W314 W295 Y231 H228 Q247 K340 K359 W250 W356 T337 G230 K324 R323 W249 H342 I343 Forss-S α3GalT Fig. 3. (A) Models of interaction between different glycosyltransferases and the nucleotide-sugars. The protein is represented by its Connolly surface that has been sliced to display the interior of the binding site. Color coding represents the electrostatic potential (from red for positively charged area to blue for negatively charged area). The Mn2 cation is represented by a magenta sphere and the nucleotide sugar by sticks. (B) Models of interaction between different glycosyltransferases and the acceptor oligosaccharide. The Connolly surface of the protein that has been color coded according to Figure 2A. Fig. 4. Stereoscopic representation of the hydrogen bond network between GalNAc and surrounding amino acids in the model of the interaction between human GTA and UDP-GalNAc. Table II. Analysis of the models: hydrogen bonds between the Gal or GalNAc of the nucleotide sugars and the amino acids of the different glycosyltransferases GTA Forss-S O (N-acetyl) Gly267(N) Ala263(N) O (N-acetyl) Gly268(N) Ligand atom a3GalT O-2 (Gal) His280(NE2) O-3 Asp225(OD1) O-3 O-4 GTB His253(NE2) Asp211(OD1) Asp302(OD2) Asp316(OD1) Asp302(OD1) Asp211(OD1) Asp211(OD2) Asp206(OD2) Asp289(OD2) Asp302(OD2) Glu297(OE2) Asp302(OD1) Glu297(OE1) Ser199(OG) Ser185(OG) O6(Acceptor) O-6 O-6 Asp198(OD1) Asp211(OD2) Asp316(OD2) O-4 O-6 iGb3-S Asp289(OD2) Gln296NE1) Asp289(OD1) Ser172(OG) Trp295(O) Ser185(OG) Ser180(OG) 381 H. Heissigerova et al. which uses lactose or N-acetyllactosamine as acceptor, displays a Trp residue stacking with the Glc (or GlcNAc) ring at the reducing end, thus limiting the possibilities of substitution or branching for this residue. This selection of the flat, ribbonlike conformationÐwhich can be adopted by equatorial±equatorial linked carbohydrate (such as cellulose and chitin, but also lactose and N-acetlyllactosamine) by pavement of Trp residuesÐhas previously been observed in polysaccharide-binding protein module. Blood group A and B transferases, which use either aFuc1-2bGal13GlcNAc (type 1) or aFuc1-2bGal1-4GlcNAc (type 2) acceptors, present a PG/S motif in the LBR-D region instead of the AW motif characteristic of a3GalT. Role of the nine ligand binding regions The present dissection of the acceptor binding site allows researchers to clarify the role of each binding regions in cation, donor, and acceptor binding. The amino acids that are predicted to directly interact with the substrate have been labeled in Figure 5. The specific role of each of the nine sequence motifs can be summarized as follows. The most highly conserved regions, LBR-A and LBR-C, interact with the uracil and ribose rings and ensure part of the coordination of the Mn2 cation. Recognition of the sugar moiety of the donor involves loops LBR-B, -C, -E, and -F. A special role is devoted to LBR-E, which contains a variable sequence responsible for the specificity toward Gal and GalNAc. The well-conserved region LBR-F mostly ensures specificity for galacto configuration of the acceptor terminal residue. The variable LBR-H region defines the substitution that can be accepted on this terminal sugar of the acceptor. This loop, together with the C-terminal peptide LBR-I, undergoes the conformational change that locks the nucleotide sugar in the binding site. Those two regions also make contacts with the pyrophosphate group. The variable LBR-D region plays a role in the specificity for the reducing part of the acceptor. Finally, LBR-G only interacts with Fuc or NHAc moiety of the acceptor in GTA, GTB, and Forss-S. Discussion Comparison with the crystal structure of bovine a3GalT The modeling study started from the crystal structure of the complex a3GalT/UDP (Boix et al., 2001). Further attempts to crystallize a3GalT in the presence of UDP-Gal yielded to the cleavage of the substrate in the active site (Boix et al., 2002). In one structure the cleaved galactose residue is still present in the site. After cleavage, the sugar undergoes Table III. Analysis of the models: hydrogen bonds between the acceptor sugar and the amino acids of the different glycosyltransferases Ligand atom a3GalT GTB iGb3-S GTA Forss-S O40 Gln247(NE2) His233(NE2) His220(NE2) His233(NE2) His228(NE2) 0 00 00 00 00 O5 O3 (GlcNAc) O3 (GlcNAc) O3 (Glc) O3 (GlcNAc) O60 Tyr278(OH) Tyr264(OH) Tyr251(OH) Tyr264(OH) Tyr259(OH) O60 Glu317(OE1) Glu303(OE1) Glu290(OE1) Glu303(OE1) Glu298(OE1) O200 Trp250(NE1) His223(NE2) Table IV. Analysis of the models: amino acids of the different glycosyltransferases that are involved in van der Waals interaction with the acceptor sugar a3GalT GTB iGb3-S GTA Forss-S His233 His228 Gln247 His233 His220 Trp249 Ser235 Trp222 Trp250 Phe236 His223 Phe236 Tyr231 Tyr278 Tyr264 Tyr251 Tyr264 Tyr259 Trp314 Trp300 Trp287 Trp300 Trp295 Glu317 Glu303 Glu290 Glu303 Glu298 Arg323 Leu329 Trp356 Leu329 Lys324 Lys340 Trp329 Lys359 Lys346 Lys332 Lys346 Tyr361 His348 Tyr334 His348 382 Molecular modeling of a3Gal(NAc)transferases Fig. 5. Alignments of the nine sequence motifs that are predicted to form the binding site. Residues labeling are as follows: plus sign, interacting with Mn2 ; open squares, with uridine and ribose; filled diamonds, with pyrophosphate; open circles, with Gal or GalNAc of donor; and filled squares, with oligosaccharidic acceptor. The amino acids that play a crucial role in specificity have been boxed. rotation and translation movements of small amplitude that allow it to maximize the number of hydrogen bonds in the site. Our docking of N-acetyllactosamine in the acceptor site was also done from the a3GalT/UDP complex. Because the crystal structure of the enzyme in complex with LacNAc has been recently published (Boix et al., 2002), this gives a direct comparison between the model and the crystal structure and therefore a validation of our modeling approach. Indeed, the location of the acceptor disaccharides is correctly predicted, especially all hydrophobic interaction, with a particular role of the tryptophan residues. Small variations exist between the predicted and observed hydrogen bond networks, mainly due to the orientation of the hydroxyl group at O-60 . Comparison with the crystal structures of blood group A and B transferases Crystal structures of both blood group A and B transferases have been very recently elucidated in the native state and as complexes with UDP and acceptor analog (Patenaude et al., 2002). Superimposition of the protein backbone (Asp83± Arg176 and Cys196±Pro345) of the model structure on the Ê and 0.87 A Ê for crystal structure yielded a rms of 0.94 A GTA and GTB enzymes, respectively, thus confirming the quality of the homology modeling. In the crystal structures complexed with UDP and H-type substrate, two peptide regions adjacent to the active site are disordered: one loop (177±195) and the C-terminus region (346±354) therefore resulting in the open conformation (Form I) of the binding site cleft. On the opposite, the modeled structures correspond to the Form II conformation, which has been proposed to be induced by the binding of the nucleotide sugar (Persson et al., 2001; Boix et al., 2002; Ramakrishnan et al., 2002). The roles of regions LBR-B, LBR-H, and LBR-I in binding the ligands could then be inferred from the model. Figure 2 displays a comparison of Form I (crystal) and Form II (model) of GTA. The differences are maximum for the nucleotide-sugar binding site (Figures 2C and 2D): this substrate is completely buried in Form II. Side chains of amino acids from the C-terminal region (Lys346, His348, and Arg252) stack together and join amino acids from the basis of the long a4 helix (Trp181 and Gln182) to form a lid over the pyrophosphate moiety of the nucleotide sugar (Figure 2E). These five amino acids show a high degree of conservation among the family of glycosyltransferases studied. Such closing of the lid by hydrophobic contacts is also observed in LgtC structure (Persson et al., 2001) where Pro248 in the C-terminal region interacts with the His78±Ile79 cluster. When ordering these two regions in our model of GTA, additional contacts are established between the protein and the nucleotide-sugar: Lys346 (LBR-H) and Arg252 (LBR-I) of the C-terminal domain make salt bridges to the phosphate groups, whereas Val184 (LBR-B) interacts with the uracil ring. The GalNAc moiety of the nucleotide-sugar also presents additional contact with the protein. The orientation of Gal/GalNAc proposed from the present modeling study varies slightly from the one that has been deduced (also from modeling) but starting from the Form I crystals structure, where very few contacts were predicted to occur. In our model, GalNAc is stabilized by a larger number of hydrogen bonds (Figure 4), including Ser185, which was disordered in the crystal. Also, the N-acetyl moiety interacted with Gly268, one of the two crucial amino acids in term of AB specificity (see later discussion), therefore explaining why GalNAc is favored over Gal in the larger site of GTA. In the acceptor site, the differences between Forms I and II are less drastic (Figure 2C and D). Nevertheless, ordering of bulky Lys346 reduces the space available for the fucose residue. In the present model, this results not only in a different orientation of the fucose but also for the Gal moiety. This latter residue, presents a slightly different orientation than in the GTA or GTB crystals, resulting in a better stacking with Trp300, as previously observed in the crystal structure of a3GalT/lactose complex (Boix et al., 2002). Comparison with biochemical data on GTA and GTB The differences in amino acids between GTA and GTB are limited to four residues: Arg176Gly, Gly235Ser, Leu266Met, and Gly268Ala (Yamamoto et al., 1990). Among these four differences, three are located in regions that we identify as LBRs. As discussed, Leu266Met and Gly268Ala substitutions (region LBR-E) have a crucial effect on the shape of the nucleotide binding pocket and directly affect the specificity for the nucleotide sugar. Indeed, mutagenesis studies demonstrated that when only one of the two positions is substituted by its analog counterpart in the other enzyme, the resulting enzyme displays both A and B activity (Yamamoto et al., 1996; Seto et al., 1999). The mouse glycosyltransferase that naturally displays this dual specificity also has only one difference (Met to Gly) with human BGT (Yamamoto et al., 2001). Our modeling study rationalizes the concerted action of the two amino acids involved. As shown in Figure 3A, they form together a bottleneck that size controls the possibility to accept an Nacetyl group in a subsite. 383 H. Heissigerova et al. The substitution Gly235Ser is located in the acceptor site (region LBR-D) and may be involved in small differences in acceptor recognition that have been defined by chemical mapping of the acceptors (see later discussion). Arg176Gly mutation does not affect the A-specificity but gives an 11fold increase in kcat (Seto et al., 1997). This amino acid is not involved in the substrate binding but is located at the surface of the enzyme. Analysis of the surface indicates that Ê from LBR-A, which closes the site it is located about 8 A above the nucleotide sugar. Depending on the orientation of the Arg176 side chain, it can participate to a basic cluster with K123 and K124 of LBR-A, two basic amino acids of A and B transferases at the surface of the binding site. Modifying this basic cluster could affect the turnover of the nucleotide-sugar and the catalysis products. Progress in the understanding of the catalytic mechanism is needed to explain fully the role of these residues. Chemical mapping studies with modified synthetic acceptors demonstrated a special role for O-4 of the galactose residue (Lowary and Hindsgaul, 1993, 1994). Indeed, in the present models, this hydroxyl group is involved in a strong hydrogen bond with His233 for both GTA and GTB enzymes. Mapping studies on the fucose residue indicated some differences in specificity because only the B enzyme requires the methyl group at C6-Fuc (Mukherjee et al., 2000). In our model, this methyl group interacts with Leu329 in a region (LBR-G) where there is no difference between A and B transferases. Nevertheless, in our prediction, the methyl group of the fucose is also spatially close to the region LBR-D, where GTA and GTB differs by a Gly235Ser substitution. Conclusion The elucidation of the catalytic mechanism of glycosyltransferases remains one of the most challenging problems in structural glycobiology (Ly et al., 2002). Particularly, the catalytic event that allows transfer of a monosaccharide with retention of configuration has not yet been elucidated: the double displacement reaction via formation of a glycosylenzyme intermediate that has been proposed by analogy with glycosylhydrolases (Gastinel et al., 2001) has been revised recently and is now almost abandoned (Boix et al., 2002; Ly et al., 2002), although no clear alternative mechanism could be proposed. In this context, the aim of the present study was to clarify the role of conformational changes as well as the basis of substrate and acceptor binding in one family of glycosyltransferases and to rationalize their specificities. The sequence motifs playing a role in ligand binding have been identified: the sugar donor specificity (UDP-Gal versus UDP-GalNAc) is due to the relative size of two crucial amino acids, whereas in the acceptor site, tryptophan residues play a key role in defining the fine specificity. Because glycosyltransferases are widely used for oligosaccharide synthesis in biotechnological approaches, such a modeling study could provide some structural basis for rational engineering of specificities and design of new transferase activities. The present article has been limited to enzymes belonging to one family of homologous enzymes; it may nevertheless be extended to other glycosyltransferases. Among the 384 crystal structures that have been determined recently, all of the enzymes for which Mn2 is required for activity share the same fold, despite a lack of sequence similarities. In such cases, fold recognition methods are a powerful tool for identifying family of enzymes that can share the same fold (Breton et al., 2002). Combination of fold recognition study and molecular modeling may help in predicting the acceptor specificities of putative glycosyltransferase sequences from newly determined genomes. This approach may be helpful in the emerging area of glycogenomics. Materials and methods Multiple alignment of amino acid sequences was performed with the ClustalX program (Thompson et al., 1994) using the following sequences: bovine a3GalT (GenBank accession number J04989), canine Forss-S (U66140), human GTA (J05175) and GTB (AF134414), and rat iGb3-S (AF248543). The homology modeling COMPOSER program (Blundell et al., 1988) of the Sybyl software (SYBYL, St. Louis, MO) was used to build the different enzyme models. Conserved a-helices and b-strands (structurally conserved regions in COMPOSER) were built from the highest-resolution structure of a3GalT (Boix et al., 2001; code 1K4V in the Protein Data Bank; Berman et al., 2000). Loops were then modeled by using the most similar fragments in a library containing all crystal structures of glycosyltransferases available. Resulting models were then screened using the PROCHECK program (Laskowski et al., 1993), and backbone conformations lying outside the allowed regions of Ramachandran map were further optimized. Hydrogen was added on all atoms, and partial atomic charges were derived using the Pullman procedure. UDP-Gal/UDP-GalNAc was docked into the binding site in a conformation and location similar to what has been observed for UDP-Gal complexed with a3GalT (Boix et al., 2001) and for UDP-2F-Gal complexed with LgtC (Persson et al., 2001). A Mn2 ion was located between the pyrophosphate group and aspartate groups of the DXD motif of protein. Atom types and energy parameters available for carbohydrates (Imberty et al., 1999) were used together with parameters developed for the sugar±pyrophosphate linkage (Petrova et al., 1999). For disaccharide acceptors, the conformation at the glycosidic linkage was selected according to crystal structures, when available (PeÂrez et al., 2000), and to previously calculated energy maps (Imberty et al., 1995). In all cases, the monosaccharides on the nonreducing side have been located in the binding site as observed for lactose and LacNAc acceptor in complex with a3GalT (Boix et al., 2002). Four models were therefore generated: GTA/UDP-GalNAc/aFuc1-2bGal1-4bGlcNAc/Mn2 , GTB/UDP-Gal/ aFuc1-2bGal1-4bGlcNAc/Mn2 , Forss-S/UDP-GalNAc/ bGalNAc1-3aGal/Mn2 , and iGb3-S/UDP-Gal/bGal14bGlc/Mn2 . The complex a3GalT/UDP-Gal/bGal14bGlcNAc/Mn2 was also generated for comparison. In all complexes, several cycles of energy minimization were performed to optimize the geometry of all ligands and also of protein side chains in the binding site and its vicinity. Energy calculations were performed using the TRIPOS Molecular modeling of a3Gal(NAc)transferases force field (Clark et al., 1989) in the Sybyl package with addition of energy parameters developed for carbohydrates (Imberty et al., 1999). Validation of the models was performed by superimposition with the crystal structure of GTA and GTB enzymes complexed with UDP and disaccharidic acceptor analog (code 1LZI and 1LZJ; Patenaude et al., 2002). Acknowledgments H.H. is supported by a grant from the French Ministere des Affaires Etrangeres. This research was partially supported by the Assocation pour la Recherche contre le Cancer. Abbreviations GT, glycosyltransferase; GTA, blood group A transferase; GTB, blood group B transferase; Forss-S, Forssman synthase; iGb3, isogloboside 3; iGb3-S, isogloboside 3 synthase; LBR, ligand binding region; LgtC, a bacterial a4-galactosyltransferase. References Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. (2000) The protein data bank. Nucleic Acids Res., 28, 235±242. Blundell, T., Carney, D., Gardner, S., Hayes, F., Howlin, B., Hubbard, T., Overington, J., Singh, D.A., Sibanda, B.L., and Sutcliffe, M. (1988) 18th Sir Hans Krebs lecture. Knowledge-based protein modelling and design. Eur. J. Biochem., 172, 513±520. Boix, E., Swaminathan, G.J., Zhang, Y., Natesh, R., Brew, K., and Acharya, K.R. (2001) Structure of UDP complex of UDP-galactose: b-galactoside-a-1,3-galactosyltransferase at 1.53A resolution reveals a conformational change in the catalytically important C-terminus. J. Biol. Chem., 276, 48608±48614. Boix, E., Zhang, Y., Swaminathan, G.J., Brew, K., and Acharya, K.R. (2002) Structural basis of ordered binding of donor and acceptor substrates to the retaining glycosyltransferase: alpha-1,3 galactosyltransferase. J. Biol. Chem., 277, 28310±28318. Breton, C., Heissigerova, H., Jeanneau, C., Moravcova, J., and Imberty, A. (2002) Comparative aspects of glycosyltransferases. In Drickamer, K. and Dell, A. (eds.), Glycogenomics: the impact of genomics and informatics in glycobiology. Portland Press, London, pp. 23±32. Clark, M., Cramer, R.D.I., and van den Opdenbosch, N. (1989) Validation of the general purpose Tripos 5.2 force field. J. Comput. Chem., 10, 982±1012. Galili, U. and Swanson, K. (1991) Gene sequences suggest inactivation of alpha-1,3-galactosyltransferase in catarrhines after the divergence of apes from monkeys. Proc. Natl. Acad. Sci. USA, 88, 7401±7404. Gastinel, L.N., Bignon, C., Misra, A.K., Hindsgaul, O., Shaper, J.H., and Joziasse, D.H. (2001) Bovine a1,3-galactosyltransferase catalytic domain structure and its relationship with ABO histo-blood group and glycosphingolipid glycosyltransferases. EMBO J., 20, 638±649. Gibbons, B.J., Roach, P.J., and Hurley, T.D. (2002) Crystal structure of the autocatalytic initiator of glycogen biosynthesis, glycogenin. J. Mol. Biol., 319, 463±477. Haslam, D.B. and Baenziger, J.U. (1996) Expression cloning of Forssman glycolipid synthetase: a novel member of the histo-blood group ABO gene family. Proc. Natl. Acad. Sci. USA, 93, 10697±106702. Imberty, A., Mikros, E., Koca, J., Mollicone, R., Oriol, R., and Perez, S. (1995) Computer simulation of histo-blood group oligosaccharides. Energy maps of all constituting disaccharides and potential energy surfaces of 14 ABH and Lewis carbohydrate antigens. Glycoconj. J., 12, 331±349. Imberty, A., Bettler, E., Karababa, M., Mazeau, K., Petrova, P., and Perez, S. (1999) Building sugars: The sweet part of structural biology. In Vijayan, M., Yathindra, N., and Kolaskar, A.S. (eds.), Perspectives in structural biology. Indian Academy of Sciences and Universities Press, Hyderabad, pp. 392±409. Keusch, J.J., Manzella, S.M., Nyame, K.A., Cummings, R.D., and Baenziger, J.U. (2000) Expression cloning of a new member of the ABO blood group glycosyltransferases, iGb3 synthase, that directs the synthesis of isoglobo-glycosphingolipids. J. Biol. Chem., 275, 25308±25314. Laskowski, R.A., MacArthur, M.W., Moss, D.S., and Thornton, J.M. (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr., 26, 283±291. Lowary, T.L. and Hindsgaul, O. (1993) Recognition of synthetic deoxy and deoxyfluoro analogs of the acceptor alpha-L-Fuc p-(1 ! 2)-betaD-Gal p-OR by the blood-group A and B gene-specified glycosyltransferases. Carbohydr. Res., 249, 163±195. Lowary, T.L. and Hindsgaul, O. (1994) Recognition of synthetic O-methyl, epimeric, and amino analogues of the acceptor alpha-LFuc p-(1 ! 2)-beta-D-Gal p-OR by the blood-group A and B genespecified glycosyltransferases. Carbohydr. Res., 251, 33±67. Ly, H.D., Lougheed, B., Wakarchuk, W.W., and Withers, S.G. (2002) Mechanistic studies of a retaining alpha-galactosyltransferase from Neisseria meningitidis. Biochemistry, 41, 5075±5085. Mukherjee, A., Palcic, M.M., and Hindsgaul, O. (2000) Synthesis and enzymatic evaluation of modified acceptors of recombinant blood group A and B. Carbohydr. Res., 326, 1±21. Oriol, R. (1987) Tissular expression of ABH and Lewis antigens in humans and animals: expected value of different animal models in the study of ABO-incompatible organ transplants. Transplant. Proc., 19, 4416±4420. Patenaude, S.I., Seto, N.O., Borisova, S.N., Szpacenko, A., Marcus, S.L., Palcic, M.M., and Evans, S.V. (2002) The structural basis for specificity in human ABO(H) blood group biosynthesis. Nature Struct. Biol., 9, 685±690. Paulson, J.C. and Colley, K.J. (1989) Glycosyltransferases. Structure, localization, and control of cell type-specific glycosylation. J. Biol. Chem., 264, 17615±17618. Perez, S., Gautier, C., and Imberty, A. (2000) Oligosaccharide conformations by diffraction methods. In Ernst, B., Hart, G., and Sinay, P. (eds.), Oligosaccharides in chemistry and biology: a comprehensive handbook. Wiley/VCH, Weinheim, pp. 969±1001. Persson, K., Ly, H.D., Dieckelmann, M., Wakarchuk, W.W., Withers, S.G., and Strynadka, N.C.J. (2001) Crystal structure of the retaining galactosyltransferase LgtC from Neisseria meningitidis in complex with donor and acceptor sugar analogs. Nature Struct. Biol., 8, 166±175. Petrova, P., Koca, J., and Imberty, A. (1999) Potential energy hypersurfaces of nucleotide-sugars: Ab initio calculations, force-field parametrization, and exploration of the flexibility. J. Am. Chem. Soc., 121, 5535±5547. Ramakrishnan, B. and Qasba, P.K. (2001) Crystal structure of lactose synthase reveals a large conformational change in its catalytic component, the beta1,4-galactosyltransferase-I. J. Mol. Biol., 310, 205±218. Ramakrishnan, B., Balaji, P.V., and Qasba, P.K. (2002) Crystal structure of beta-1,4-galactosyltransferase complex with UDP-Gal reveals an oligosaccharide binding site. J. Mol. Biol., 318, 491±502. Rini, J.M. (1995) Lectin structure. Annu. Rev. Biophys. Biomol. Struct., 24, 551±577. Seto, N.O., Palcic, M.M., Compston, C.A., Li, H., Bundle, D.R., and Narang, S.A. (1997) Sequential interchange of four amino acids from blood group B to blood group A glycosyltransferase boosts catalytic activity and progressively modifies substrate recognition in human recombinant enzymes. J. Biol. Chem., 272, 14133±14138. Seto, N.O., Compston, C.A., Evans, S.V., Bundle, D.R., Narang, S.A., and Palcic, M.M. (1999) Donor substrate specificity of recombinant human blood group A, B and hybrid A/B glycosyltransferases expressed in Escherichia coli. Eur. J. Biochem., 259, 770±775. Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., and Higgins, D.G. (1994) The CLUSTAL_X windows interface: flexible 385 H. Heissigerova et al. strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res., 22, 4876±4882. È nligil, U.M. and Rini, J.M. (2000) Glycosyltransferase structure and U mechanism. Curr. Opin. Struct. Biol., 10, 510±517. È nligil, U.M., Zhou, S., Yuwaraj, S., Sarkar, M., Schachter, H., and U Rini, J.M. (2000) X-ray crystal structure of rabbit N-acetylglucosaminyltransferase I: catalytic mechanism and a new protein superfamily. EMBO J., 19, 5269±5280. Varki, A. (1993) Biological roles of oligosaccharides: all of the theories are correct. Glycobiology, 3, 97±130. Vyas, N. (1991) Atomic features of protein±carbohydrate interactions. Curr. Opin. Struct. Biol., 1, 732±740. 386 Yamamoto, F. and McNeill, P.D. (1996) Amino acid residue at codon 268 determines both activity and nucleotide-sugar donor substrate specificity of human histo-blood group A and B transferases. In vitro mutagenesis study. J. Biol. Chem., 271, 10515±10520. Yamamoto, F., Clausen, H., White, T., Marken, J., and Hakomori, S. (1990) Molecular genetic basis of the histo-blood group ABO system. Nature, 345, 229±233. Yamamoto, M., Lin, X.H., Kominato, Y., Hata, Y., Noda, R., Saitou, N., and Yamamoto, F. (2001) Murine equivalent of the human histoblood group ABO gene is a cis-AB gene and encodes a glycosyltransferase with both A and B transferase activity. J. Biol. Chem., 276, 13701±13708.