{PDOC00000}
{BEGIN}
**********************************
*** PROSITE documentation file ***
**********************************

Release 20.58 of 15-Dec-2009.

PROSITE is developed by the Swiss Institute of Bioinformatics (SIB) under the responsibility of Amos Bairoch and Nicolas Hulo.

This release was prepared by: Nicolas Hulo, Virginie Bulliard, Petra Langendijk-Genevaux and Christian Sigrist with the help of Edouard de Castro, Lorenzo Cerutti, Corinne Lachaize and Amos Bairoch.

See: http://www.expasy.org/prosite/
Email: [email protected]

Acknowledgements:

- To all those mentioned in this document who have reviewed the entry(ies) for which they are listed as experts. With specific thanks to Rein Aasland, Mark Boguski, Peer Bork, Josh Cherry, Andre Chollet, Frank Kolakowski, David Landsman, Bernard Henrissat, Eugene Koonin, Steve Henikoff, Manuel Peitsch and Jonathan Reizer.
- Jim Apostolopoulos is the author of the PDOC00699 entry.
- Brigitte Boeckmann is the author of the PDOC00691, PDOC00703, PDOC00829, PDOC00796, PDOC00798, PDOC00799, PDOC00906, PDOC00907, PDOC00908, PDOC00912, PDOC00913, PDOC00924, PDOC00928, PDOC00929, PDOC00955, PDOC00961, PDOC00966, PDOC00988 and PDOC50020 entries.
- Jean-Louis Boulay is the author of the PDOC01051, PDOC01050, PDOC01052, PDOC01053 and PDOC01054 entries.
- Ryszard Brzezinski is the author of the PDOC60000 entry.
- Elisabeth Coudert is the author of the PDOC00373 entry.
- Kirill Degtyarenko is the author of the PDOC60001 entry.
- Christian Doerig is the author of the PDOC01049 entry.
- Kay Hofmann is the author of the PDOC50003, PDOC50006, PDOC50007 and PDOC50017 entries.
- Chantal Hulo is the author of the PDOC00987 entry.
- Karine Michoud is the author of the PDOC01044 and PDOC01042 entries.
- Yuri Panchin is the author of the PDOC51013 entry.
- S.
Ramakumar is the author of the PDOC51052, PDOC60004, PDOC60010, PDOC60011, PDOC60015, PDOC60016, PDOC60018, PDOC60020, PDOC60021, PDOC60022, PDOC60023, PDOC60024, PDOC60025, PDOC60026, PDOC60027, PDOC60028, PDOC60029 and PDOC60030 entries. - Keith Robison is the author of the PDOC00830 and PDOC00861 entries. ----------------------------------------------------------------------PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by nonprofit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. ----------------------------------------------------------------------+-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00001} {PS00001; ASN_GLYCOSYLATION} {BEGIN} ************************ * N-glycosylation site * ************************ It has been known for a long time [1] that potential N-glycosylation sites are specific to the consensus sequence Asn-Xaa-Ser/Thr. It must be noted that the presence of the consensus tripeptide is not sufficient to conclude that an asparagine residue is glycosylated, due to the fact that the folding of the protein plays an important role in the regulation of N-glycosylation [2]. 
It has been shown [3] that the presence of proline between Asn and Ser/Thr will inhibit N-glycosylation; this has been confirmed by a recent [4] statistical analysis of glycosylation sites, which also shows that about 50% of the sites that have a proline C-terminal to Ser/Thr are not glycosylated. It must also be noted that there are a few reported cases of glycosylation sites with the pattern Asn-Xaa-Cys; an experimentally demonstrated occurrence of such a non-standard site is found in the plasma protein C [5]. -Consensus pattern: N-{P}-[ST]-{P} [N is the glycosylation site] -Last update: May 1991 / Text revised. [ 1] Marshall R.D. "Glycoproteins." Annu. Rev. Biochem. 41:673-702(1972). PubMed=4563441; DOI=10.1146/annurev.bi.41.070172.003325 [ 2] Pless D.D., Lennarz W.J. "Enzymatic conversion of proteins to glycoproteins." Proc. Natl. Acad. Sci. U.S.A. 74:134-138(1977). PubMed=264667 [ 3] Bause E. "Structural requirements of N-glycosylation of proteins. Studies with proline peptides as conformational probes." Biochem. J. 209:331-336(1983). PubMed=6847620 [ 4] Gavel Y., von Heijne G. "Sequence differences between glycosylated and non-glycosylated Asn-X-Thr/Ser acceptor sites: implications for protein engineering." Protein Eng. 3:433-442(1990). PubMed=2349213 [ 5] Miletich J.P., Broze G.J. Jr. "Beta protein C is not glycosylated at asparagine 329. The rate of translation may influence the frequency of usage at asparagine-X-cysteine sites." J. Biol. Chem. 265:11397-11404(1990). PubMed=1694179 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+
{END}
{PDOC00004}
{PS00004; CAMP_PHOSPHO_SITE}
{BEGIN}
****************************************************************
* cAMP- and cGMP-dependent protein kinase phosphorylation site *
****************************************************************

There have been a number of studies on the specificity of cAMP- and cGMP-dependent protein kinases [1,2,3]. Both types of kinases appear to share a preference for the phosphorylation of serine or threonine residues found close to at least two consecutive N-terminal basic residues. It is important to note that there are quite a number of exceptions to this rule.

-Consensus pattern: [RK](2)-x-[ST]
                    [S or T is the phosphorylation site]
-Last update: June 1988 / First entry.

[ 1] Fremisco J.R., Glass D.B., Krebs E.G. J. Biol. Chem. 255:4240-4245(1980).
[ 2] Glass D.B., Smith S.B. "Phosphorylation by cyclic GMP-dependent protein kinase of a synthetic peptide corresponding to the autophosphorylation site in the enzyme." J. Biol. Chem. 258:14797-14803(1983). PubMed=6317673
[ 3] Glass D.B., el-Maghrabi M.R., Pilkis S.J. "Synthetic peptides corresponding to the site phosphorylated in 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase as substrates of cyclic nucleotide-dependent protein kinases." J. Biol. Chem. 261:2987-2993(1986). PubMed=3005275
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00005}
{PS00005; PKC_PHOSPHO_SITE}
{BEGIN}
*****************************************
* Protein kinase C phosphorylation site *
*****************************************

In vivo, protein kinase C exhibits a preference for the phosphorylation of serine or threonine residues found close to a C-terminal basic residue [1,2]. The presence of additional basic residues at the N- or C-terminal of the target amino acid enhances the Vmax and Km of the phosphorylation reaction.

-Consensus pattern: [ST]-x-[RK]
                    [S or T is the phosphorylation site]
-Last update: June 1988 / First entry.

[ 1] Woodget J.R., Gould K.L., Hunter T. Eur. J. Biochem. 161:177-184(1986).
[ 2] Kishimoto A., Nishiyama K., Nakanishi H., Uratsuji Y., Nomura H., Takeyama Y., Nishizuka Y. "Studies on the phosphorylation of myelin basic protein by protein kinase C and adenosine 3':5'-monophosphate-dependent protein kinase." J. Biol. Chem. 260:12492-12499(1985). PubMed=2413024
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00006}
{PS00006; CK2_PHOSPHO_SITE}
{BEGIN}
*****************************************
* Casein kinase II phosphorylation site *
*****************************************

Casein kinase II (CK-2) is a protein serine/threonine kinase whose activity is independent of cyclic nucleotides and calcium. CK-2 phosphorylates many different proteins.
The substrate specificity [1] of this enzyme can be summarized as follows: (1) Under comparable conditions Ser is favored over Thr. (2) An acidic residue (either Asp or Glu) must be present three residues from the C-terminal of the phosphate acceptor site. (3) Additional acidic residues in positions +1, +2, +4, and +5 increase the phosphorylation rate. Most physiological substrates have at least one acidic residue in these positions. (4) Asp is preferred to Glu as the provider of acidic determinants. (5) A basic residue at the N-terminal of the acceptor site decreases the phosphorylation rate, while an acidic one will increase it. -Consensus pattern: [ST]-x(2)-[DE] [S or T is the phosphorylation site] -Note: This pattern is found in most of the known physiological substrates. -Last update: May 1991 / Text revised. [ 1] Pinna L.A. "Casein kinase 2: an 'eminence grise' in cellular regulation?" Biochim. Biophys. Acta 1054:267-284(1990). PubMed=2207178 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00007} {PS00007; TYR_PHOSPHO_SITE} {BEGIN} **************************************** * Tyrosine kinase phosphorylation site * **************************************** Substrates of tyrosine protein kinases are generally characterized by a lysine or an arginine seven residues to the N-terminal side of the phosphorylated tyrosine. An acidic residue (Asp or Glu) is often found at either three or four residues to the N-terminal side of the tyrosine [1,2,3]. 
There are a number of exceptions to this rule such as the tyrosine phosphorylation sites of enolase and lipocortin II. -Consensus pattern: [RK]-x(2)-[DE]-x(3)-Y or [RK]-x(3)-[DE]-x(2)-Y [Y is the phosphorylation site] -Last update: June 1988 / First entry. [ 1] Patschinsky T., Hunter T., Esch F.S., Cooper J.A., Sefton B.M. "Analysis of the sequence of amino acids surrounding sites of tyrosine phosphorylation." Proc. Natl. Acad. Sci. U.S.A. 79:973-977(1982). PubMed=6280176 [ 2] Hunter T. "Synthetic peptide substrates for a tyrosine protein kinase." J. Biol. Chem. 257:4843-4848(1982). PubMed=6279650 [ 3] Cooper J.A., Esch F.S., Taylor S.S., Hunter T. "Phosphorylation sites in enolase and lactate dehydrogenase utilized by tyrosine protein kinases in vivo and in vitro." J. Biol. Chem. 259:7835-7841(1984). PubMed=6330085 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00008} {PS00008; MYRISTYL} {BEGIN} ************************* * N-myristoylation site * ************************* An appreciable number of eukaryotic proteins are acylated by the covalent addition of myristate (a C14-saturated fatty acid) to their N-terminal residue via an amide linkage [1,2]. The sequence specificity of the enzyme responsible for this modification, myristoyl CoA:protein N-myristoyl transferase (NMT), has been derived from the sequence of known N-myristoylated proteins and from studies using synthetic peptides. It seems to be the following: - The N-terminal residue must be glycine. 
- In position 2, uncharged residues are allowed. Charged residues, proline and large hydrophobic residues are not allowed.
- In positions 3 and 4, most, if not all, residues are allowed.
- In position 5, small uncharged residues are allowed (Ala, Ser, Thr, Cys, Asn and Gly). Serine is favored.
- In position 6, proline is not allowed.

-Consensus pattern: G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}
                    [G is the N-myristoylation site]
-Note: We deliberately include as potential myristoylated glycine residues, those which are internal to a sequence. It could well be that the sequence under study represents a viral polyprotein precursor and that subsequent proteolytic processing could expose an internal glycine as the N-terminal of a mature protein.
-Last update: October 1989 / Pattern and text revised.

[ 1] Towler D.A., Gordon J.I., Adams S.P., Glaser L. "The biology and enzymology of eukaryotic protein acylation." Annu. Rev. Biochem. 57:69-99(1988). PubMed=3052287; DOI=10.1146/annurev.bi.57.070188.000441
[ 2] Grand R.J.A. "Acylation of viral and eukaryotic proteins." Biochem. J. 258:625-638(1989). PubMed=2658970
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00009}
{PS00009; AMIDATION}
{BEGIN}
******************
* Amidation site *
******************

The precursor of hormones and other active peptides which are C-terminally amidated is always directly followed [1,2] by a glycine residue which provides the amide group, and most often by at least two consecutive basic residues (Arg or Lys) which generally function as an active peptide precursor cleavage site. Although all amino acids can be amidated, neutral hydrophobic residues such as Val or Phe are good substrates, while charged residues such as Asp or Arg are much less reactive. C-terminal amidation has not yet been shown to occur in unicellular organisms or in plants.

-Consensus pattern: x-G-[RK]-[RK]
                    [x is the amidation site]
-Last update: June 1988 / First entry.

[ 1] Kreil G. "Occurrence, detection, and biosynthesis of carboxy-terminal amides." Methods Enzymol. 106:218-223(1984). PubMed=6548541
[ 2] Bradbury A.F., Smyth D.G. "Biosynthesis of the C-terminal amide in peptide hormones." Biosci. Rep. 7:907-916(1987). PubMed=3331120
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+ {END} {PDOC00010} {PS00010; ASX_HYDROXYL} {BEGIN} *************************************************** * Aspartic acid and asparagine hydroxylation site * *************************************************** Post-translational hydroxylation of aspartic acid or asparagine [1] to form erythro-beta-hydroxyaspartic acid or erythro-beta-hydroxyasparagine has been identified in a number of proteins with domains homologous to epidermal growth factor (EGF). Examples of such proteins are the blood coagulation protein factors VII, IX and X, proteins C, S, and Z, the LDL receptor, thrombomodulin, etc. Based on sequence comparisons of the EGF-homology region that contains hydroxylated Asp or Asn, a consensus sequence has been identified that seems to be required by the hydroxylase(s). -Consensus pattern: C-x-[DN]-x(4)-[FY]-x-C-x-C [D or N is the hydroxylation site] -Note: This consensus pattern is located in the N-terminal of EGF-like domains, while our EGF-like cysteine pattern signature (see the relevant entry <PDOC00021>) is located in the C-terminal. -Last update: January 1989 / First entry. [ 1] Stenflo J., Ohlin A.-K., Owen W.G., Schneider W.J. "beta-Hydroxyaspartic acid or beta-hydroxyasparagine in bovine low density lipoprotein receptor and in bovine thrombomodulin." J. Biol. Chem. 263:21-24(1988). PubMed=2826439 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+
{END}
{PDOC00011}
{PS00011; GLA_1}
{PS50998; GLA_2}
{BEGIN}
**********************************************************************
* Gamma-carboxyglutamic acid-rich (Gla) domain signature and profile *
**********************************************************************

The vitamin K-dependent blood coagulation factor IX as well as several extracellular regulatory proteins require vitamin K for the post-translational synthesis of gamma-carboxyglutamic acid, an amino acid clustered in the N-terminal Gla domain of these proteins [1,2]. The Gla domain is a membrane binding motif which, in the presence of calcium ions, interacts with phospholipid membranes that include phosphatidylserine.

The 3D structure of the Gla domain has been solved (see for example <PDB:1CFH>) [3,4]. Calcium ions induce conformational changes in the Gla domain and are necessary for the Gla domain to fold properly. A common structural feature of functional Gla domains is the clustering of N-terminal hydrophobic residues into a hydrophobic patch that mediates interaction with the cell surface membrane [4].

Proteins known to contain a Gla domain are listed below:

- A number of plasma proteins involved in blood coagulation. These proteins are prothrombin, coagulation factors VII, IX and X, proteins C, S, and Z.
- Two proteins that occur in calcified tissues: osteocalcin (also known as bone-Gla protein, BGP), and matrix Gla-protein (MGP).
- Proline-rich Gla proteins 1 and 2 [5].
- Cone snail venom peptides: conantokin-G and -T, and conotoxin GS [6].

The pattern we developed starts with the conserved Gla-x(3)-Gla-x-Cys motif found in the middle of the domain, which seems to be important for substrate recognition by the carboxylase [7], and ends with the last conserved position of the domain (an aromatic residue). We also developed a profile that covers the whole Gla domain.
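
As an illustration only, the core Gla-x(3)-Gla-x-Cys motif mentioned above can be searched for in a sequence by translating the pattern notation into a regular expression. The following Python sketch is not part of PROSITE. It assumes the usual reading of the notation (`x` for any residue, `[..]` for allowed residues, `{..}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` for sequence ends), it writes Gla as E on the assumption that the scanned sequence is the unmodified precursor, and the function names and the toy sequence are hypothetical.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern string into a Python regex string.

    Assumed notation: elements separated by '-', 'x' = any residue,
    [ABC] = one of, {ABC} = none of, (n) / (n,m) = repeat counts,
    '<' / '>' = start / end of the sequence.
    """
    tokens = []
    for element in pattern.strip().rstrip(".").split("-"):
        anchored_start = element.startswith("<")
        anchored_end = element.endswith(">")
        element = element.strip("<>")
        # Separate the residue specification from an optional repeat count.
        match = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"cannot parse pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            token = "."
        elif core.startswith("[") and core.endswith("]"):
            token = core
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"
        elif len(core) == 1 and core.isalpha():
            token = core
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        tokens.append(("^" if anchored_start else "") + token
                      + ("$" if anchored_end else ""))
    return "".join(tokens)


def find_motif(pattern: str, sequence: str):
    """Return (1-based start, matched text) for every match, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Toy sequence invented for this example; it is not a real Gla domain.
print(prosite_to_regex("E-x(3)-E-x-C"))
print(find_motif("E-x(3)-E-x-C", "AAEKLAEECAA"))
```

A match produced this way only flags a candidate site; it does not show that the glutamate residues are actually carboxylated.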
-Consensus pattern: E-x(2)-[ERK]-E-x-C-x(6)-[EDR]-x(10,11)-[FYA]-[YW] [The 2 E's are the carboxylation site] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 1. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: All glutamic residues present in the domain are potential carboxylation sites; in coagulation proteins, all are modified to Gla, while in BGP and MGP some are not. -Expert(s) to contact by email: Price P.A.; [email protected] -Last update: June 2004 / Pattern and text revised; profile added. [ 1] Friedman P.A., Przysiecki C.T. "Vitamin K-dependent carboxylation." Int. J. Biochem. 19:1-7(1987). PubMed=3106112 [ 2] Vermeer C. "Gamma-carboxyglutamate-containing proteins and the vitamin K-dependent carboxylase." Biochem. J. 266:625-636(1990). PubMed=2183788 [ 3] Freedman S.J., Furie B.C., Furie B., Baleja J.D. "Structure of the metal-free gamma-carboxyglutamic acid-rich membrane binding region of factor IX by two-dimensional NMR spectroscopy." J. Biol. Chem. 270:7980-7987(1995). PubMed=7713897 [ 4] Freedman S.J., Blostein M.D., Baleja J.D., Jacobs M., Furie B.C., Furie B. "Identification of the phospholipid binding site in the vitamin K-dependent blood coagulation protein factor IX." J. Biol. Chem. 271:16227-16236(1996). PubMed=8663165 [ 5] Kulman J.D., Harris J.E., Haldeman B.A., Davie E.W. "Primary structure and tissue distribution of two novel proline-rich gamma-carboxyglutamic acid proteins." Proc. Natl. Acad. Sci. U.S.A. 94:9058-9062(1997). PubMed=9256434 [ 6] Haack J.A., Rivier J.E., Parks T.N., Mena E.E., Cruz L.J., Olivera B.M. "Conantokin-T. A gamma-carboxyglutamate containing peptide with N-methyl-d-aspartate antagonist activity." J. Biol. Chem. 265:6025-6029(1990). PubMed=2180939 [ 7] Price P.A., Fraser J.D., Metz-Virca G. 
"Molecular cloning of matrix Gla protein: implications for substrate recognition by the vitamin K-dependent gamma-carboxylase." Proc. Natl. Acad. Sci. U.S.A. 84:8335-8339(1987). PubMed=3317405 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00012} {PS00012; PHOSPHOPANTETHEINE} {PS50075; ACP_DOMAIN} {BEGIN} ************************************** * Phosphopantetheine attachment site * ************************************** Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups [1]. Phosphopantetheine is attached to a serine residue in these proteins [2]. ACP proteins or domains have been found in various enzyme systems which are listed below (references are only provided for recently determined sequences). - Fatty acid synthetase (FAS), which catalyzes the formation of longchain fatty acids from acetyl-CoA, malonyl-CoA and NADPH. Bacterial and plant chloroplast FAS are composed of eight separate subunits which correspond to the different enzymatic activities; ACP is one of these polypeptides. Fungal FAS consists of two multifunctional proteins, FAS1 and FAS2; the ACP domain is located in the N-terminal section of FAS2. Vertebrate FAS consists of a single multifunctional enzyme; the ACP domain is located between the beta-ketoacyl reductase domain and the C-terminal thioesterase domain [3]. 
- Polyketide antibiotics synthase enzyme systems. Polyketides are secondary metabolites produced from simple fatty acids, by microorganisms and plants. ACP is one of the polypeptidic components involved in the biosynthesis of Streptomyces polyketide antibiotics actinorhodin, curamycin, granaticin, monensin, oxytetracycline and tetracenomycin C.
- Bacillus subtilis putative polyketide synthases pksK, pksL and pksM which respectively contain three, five and one ACP domains.
- The multifunctional 6-methylsalicylic acid synthase (MSAS) from Penicillium patulum. This is a multifunctional enzyme involved in the biosynthesis of a polyketide antibiotic and which contains an ACP domain in the C-terminal extremity.
- Multifunctional mycocerosic acid synthase (gene mas) from Mycobacterium bovis.
- Gramicidin S synthetase I (gene grsA) from Bacillus brevis. This enzyme catalyzes the first step in the biosynthesis of the cyclic antibiotic gramicidin S.
- Tyrocidine synthetase I (gene tycA) from Bacillus brevis. The reaction carried out by tycA is identical to that catalyzed by grsA.
- Gramicidin S synthetase II (gene grsB) from Bacillus brevis. This enzyme is a multifunctional protein that activates and polymerizes proline, valine, ornithine and leucine. GrsB contains four ACP domains.
- Erythronolide synthase proteins 1, 2 and 3 from Saccharopolyspora erythraea, which are involved in the biosynthesis of the polyketide antibiotic erythromycin. Each of these proteins contains two ACP domains.
- Conidial green pigment synthase from Aspergillus nidulans.
- ACV synthetase from various fungi. This enzyme catalyzes the first step in the biosynthesis of penicillin and cephalosporin. It contains three ACP domains.
- Enterobactin synthetase component F (gene entF) from Escherichia coli. This enzyme is involved in the ATP-dependent activation of serine during enterobactin (enterochelin) biosynthesis.
- Cyclic peptide antibiotic surfactin synthase subunits 1, 2 and 3 from Bacillus subtilis.
Subunits 1 and 2 contain three related domains while subunit 3 only contains a single domain.
- HC-toxin synthetase (gene HTS1) from Cochliobolus carbonum. This enzyme synthesizes HC-toxin, a cyclic tetrapeptide. HTS1 contains four ACP domains.
- Fungal mitochondrial ACP, which is part of the respiratory chain NADH dehydrogenase (complex I).
- Rhizobium nodulation protein nodF, which probably acts as an ACP in the synthesis of the nodulation Nod factor fatty acyl chain.

The sequence around the phosphopantetheine attachment site is conserved in all these proteins and can be used as a signature pattern. A profile was also developed that spans the complete ACP-like domain.

-Consensus pattern: [DEQGSTALMKRH]-[LIVMFYSTAC]-[GNQ]-[LIVMFYAG]-[DNEKHS]-S-[LIVMST]-{PCFY}-[STAGCPQLIVMF]-[LIVMATN]-[DENQGTAKRHLM]-[LIVMWSTA]-[LIVGSTACR]-{LPIY}-{VY}-[LIVMFA]
                    [S is the pantetheine attachment site]
-Sequences known to belong to this class detected by the pattern: ALL, except C.paradoxa ACP.
-Other sequence(s) detected in Swiss-Prot: 115.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Concise Encyclopedia Biochemistry, Second Edition, Walter de Gruyter, Berlin New-York (1988).
[ 2] Pugh E.L., Wakil S.J. J. Biol. Chem. 240:4727-4733(1965).
[ 3] Witkowski A., Rangan V.S., Randhawa Z.I., Amy C.M., Smith S. "Structural organization of the multifunctional animal fatty-acid synthase." Eur. J. Biochem. 198:571-579(1991). PubMed=2050137
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement.
For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00013} {PS51257; PROKAR_LIPOPROTEIN} {BEGIN} ****************************************************************** * Prokaryotic membrane lipoprotein lipid attachment site profile * ****************************************************************** In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognizes a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached [1]. Some of the proteins known to undergo such processing currently include (for recent listings see [1,2,3]): - Major outer membrane lipoprotein (murein-lipoproteins) (gene lpp). - Escherichia coli lipoprotein-28 (gene nlpA). - Escherichia coli lipoprotein-34 (gene nlpB). - Escherichia coli lipoprotein nlpC. - Escherichia coli lipoprotein nlpD. - Escherichia coli osmotically inducible lipoprotein B (gene osmB). - Escherichia coli osmotically inducible lipoprotein E (gene osmE). - Escherichia coli peptidoglycan-associated lipoprotein (gene pal). - Escherichia coli rare lipoproteins A and B (genes rplA and rplB). - Escherichia coli copper homeostasis protein cutF (or nlpE). - Escherichia coli plasmids traT proteins. - Escherichia coli Col plasmids lysis proteins. - A number of Bacillus beta-lactamases. - Bacillus subtilis periplasmic oligopeptide-binding protein (gene oppA). - Borrelia burgdorferi outer surface proteins A and B (genes ospA and ospB). - Borrelia hermsii variable major protein 21 (gene vmp21) and 7 (gene vmp7). - Chlamydia trachomatis outer membrane protein 3 (gene omp3). - Fibrobacter succinogenes endoglucanase cel-3. - Haemophilus influenzae proteins Pal and Pcp. 
- Klebsiella pullulanase (gene pulA).
- Klebsiella pullulanase secretion protein pulS.
- Mycoplasma hyorhinis protein p37.
- Mycoplasma hyorhinis variant surface antigens A, B, and C (genes vlpABC).
- Neisseria outer membrane protein H.8.
- Pseudomonas aeruginosa lipopeptide (gene lppL).
- Pseudomonas solanacearum endoglucanase egl.
- Rhodopseudomonas viridis reaction center cytochrome subunit (gene cytC).
- Rickettsia 17 Kd antigen.
- Shigella flexneri invasion plasmid proteins mxiJ and mxiM.
- Streptococcus pneumoniae oligopeptide transport protein A (gene amiA).
- Treponema pallidum 34 Kd antigen.
- Treponema pallidum membrane protein A (gene tmpA).
- Vibrio harveyi chitobiase (gene chb).
- Yersinia virulence plasmid protein yscJ.
- Halocyanin from Natronobacterium pharaonis [4], a membrane-associated copper-binding protein. This is the first archaebacterial protein known to be modified in such a fashion.

From the precursor sequences of all these proteins, we derived a profile that starts at the beginning of the sequence and ends after the post-translationally modified cysteine.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: some 100 prokaryotic proteins. Some of them are not membrane lipoproteins, but at least half of them could be.
-Note: This profile replaces an obsolete rule. All the information in the rule has been encoded in the profile format.
-Last update: October 2006 / Text revised; profiles added; rule deleted.

[ 1] Hayashi S., Wu H.C. "Lipoproteins in bacteria." J. Bioenerg. Biomembr. 22:451-471(1990). PubMed=2202727
[ 2] Klein P., Somorjai R.L., Lau P.C.K. "Distinctive properties of signal sequences from bacterial lipoproteins." Protein Eng. 2:15-20(1988). PubMed=3253732
[ 3] von Heijne G. Protein Eng. 2:531-534(1989).
[ 4] Mattar S., Scharf B., Kent S.B.H., Rodewald K., Oesterhelt D., Engelhard M.
"The primary structure of halocyanin, an archaeal blue copper protein, predicts a lipid anchor for membrane fixation." J. Biol. Chem. 269:14939-14945(1994). PubMed=8195126 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00014} {PS00014; ER_TARGET} {BEGIN} ******************************************** * Endoplasmic reticulum targeting sequence * ******************************************** Proteins that permanently reside in the lumen of the endoplasmic reticulum (ER) seem to be distinguished from newly synthesized secretory proteins by the presence of the C-terminal sequence Lys-Asp-Glu-Leu (KDEL) [1,2]. While KDEL is the preferred signal in many species, variants of that signal are used by different species. This situation is described in the following table. Signal Species ---------------------------------------------------------------KDEL Vertebrates, Drosophila, Caenorhabditis elegans, plants HDEL Saccharomyces cerevisiae, Kluyveromyces lactis, plants DDEL Kluyveromyces lactis ADEL Schizosaccharomyces pombe (fission yeast) SDEL Plasmodium falciparum The signal is usually very strictly conserved in major ER proteins but some minor ER proteins have divergent sequences (probably because efficient retention of these proteins is not crucial to the cell). Proteins bearing the KDEL-type signal are not simply held in the ER, but are selectively retrieved from a post-ER compartment by a receptor and returned to their normal location. 
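The signal table above can be read as a simple lookup on the last four residues of a protein. The following is a minimal Python sketch under that assumption; the names `ER_RETENTION_SIGNALS` and `er_retention_signal` are hypothetical and are not part of PROSITE. It only recognizes the exact tetrapeptides listed in the table, so the divergent signals of minor ER proteins mentioned above would be missed.

```python
from typing import List, Optional, Tuple

# C-terminal tetrapeptides and the species reported to use them, transcribed
# from the table above. The dictionary name is illustrative only.
ER_RETENTION_SIGNALS = {
    "KDEL": ["Vertebrates", "Drosophila", "Caenorhabditis elegans", "plants"],
    "HDEL": ["Saccharomyces cerevisiae", "Kluyveromyces lactis", "plants"],
    "DDEL": ["Kluyveromyces lactis"],
    "ADEL": ["Schizosaccharomyces pombe"],
    "SDEL": ["Plasmodium falciparum"],
}


def er_retention_signal(sequence: str) -> Optional[Tuple[str, List[str]]]:
    """Return (signal, species) if the sequence ends with a listed signal.

    Only the last four residues are inspected, because the signal is
    C-terminal; the same tetrapeptide elsewhere in the chain is ignored.
    """
    tail = sequence.strip().upper()[-4:]
    species = ER_RETENTION_SIGNALS.get(tail)
    if species is None:
        return None
    return tail, species
```

A call such as `er_retention_signal("MSTAAKDEL")` returns the matched signal together with the species list, while a sequence that carries KDEL internally but not at its C-terminus returns `None`.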
The currently known ER luminal proteins are listed below.

- Protein disulfide-isomerase (PDI) (also known as the beta-subunit of prolyl 4-hydroxylase, as a component of oligosaccharyl transferase, as glutathione-insulin transhydrogenase and as a thyroid hormone binding protein).
- ERp60, ERp72, and P5, three minor isoforms of PDI.
- Trypanosoma brucei bloodstream-specific protein 2, a probable PDI.
- hsp70 related protein GRP78 (also known as the immunoglobulin heavy chain binding protein (BiP), and as KAR2, in fungi).
- hsp90 related protein 'endoplasmin' (also known as GRP94, Erp99 or Hsp108).
- Calreticulin, a calcium-binding protein (also known as calregulin, CRP55, or HACBP).
- ERC-55, a calcium-binding protein.
- Reticulocalbin, a calcium-binding protein.
- Hsp47, a heat-shock protein that binds strongly to collagen and could act as a chaperone in the collagen biosynthetic pathway.
- A receptor for a plant hormone, auxin.
- Thiol proteases from rice bean (SH-EP) and kidney bean (EP-C1).
- Esterases from mammalian liver and from nematodes.
- Alpha-2-macroglobulin receptor-associated protein (RAP).
- Yeast peptidyl-prolyl cis-trans isomerase D (CYPD).
- Yeast protein KRE5, a protein required for (1->6)-beta-D-glucan synthesis.
- Yeast protein SEC20, required for the transport of proteins from the endoplasmic reticulum to the Golgi apparatus.
- Yeast protein SCJ1, involved in protein sorting.

-Consensus pattern: [KRHQSA]-[DENQ]-E-L>
-Sequences known to belong to this class detected by the pattern: ALL, except for liver esterases which have H-[TVI]-E-L.
-Other sequence(s) detected in Swiss-Prot: 24 proteins which are clearly not located in the ER (because they are of bacterial or viral origin, for example) and a protein which can be considered as a valid candidate: human 80KH protein.
-Last update: November 1997 / Text revised.

[ 1] Munro S., Pelham H.R.B.
     "A C-terminal signal prevents secretion of luminal ER proteins."
     Cell 48:899-907(1987).
     PubMed=3545499
[ 2] Pelham H.R.B.
     "The retention signal for soluble proteins of the endoplasmic reticulum."
     Trends Biochem. Sci. 15:483-486(1990).
     PubMed=2077689
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00015}
{PS50079; NLS_BP}
{BEGIN}
*************************************************
* Bipartite nuclear localization signal profile *
*************************************************

The uptake of protein by the nucleus is extremely selective and nuclear proteins must therefore contain within their final structure a signal that specifies selective accumulation in the nucleus [1,2]. Studies on some nuclear proteins, such as the large T antigen of SV40, have indicated which part of the sequence is required for nuclear translocation. The known nuclear targeting sequences are generally basic, but there seems to be no clear common denominator between all the known sequences. Although some consensus sequence patterns have been proposed (see for example [3]), the current best strategy to detect a nuclear targeting sequence is based [4] on the following definition of what is called a 'bipartite nuclear localization signal':

(1) Two adjacent basic amino acids (Arg or Lys).
(2) A spacer region of any 10 residues.
(3) At least three basic residues (Arg or Lys) in the five positions after the spacer region.

The profile we developed covers the entire bipartite nuclear localization signal.

-Sequences known to belong to this class detected by the profile: 56% of known nuclear proteins according to [4].
-Other sequence(s) detected in Swiss-Prot: about 4.2% of non-nuclear proteins according to [4].
-Note: This profile replaces an obsolete rule. All the information in the rule has been encoded in the profile format.
-Last update: October 2006 / Text revised; profiles added; rule deleted.

[ 1] Dingwall C., Laskey R.A.
     "Protein import into the cell nucleus."
     Annu. Rev. Cell Biol. 2:367-390(1986).
     PubMed=3548772; DOI=10.1146/annurev.cb.02.110186.002055
[ 2] Garcia-Bustos J.F., Heitman J., Hall M.N.
     Biochim. Biophys. Acta 1071:83-101(1991).
[ 3] Gomez-Marquez J., Segade F.
     FEBS Lett. 226:217-219(1988).
[ 4] Dingwall C., Laskey R.A.
     "Nuclear targeting sequences -- a consensus?"
     Trends Biochem. Sci. 16:478-481(1991).
     PubMed=1664152
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00016}
{PS00016; RGD}
{BEGIN}
****************************
* Cell attachment sequence *
****************************

The sequence Arg-Gly-Asp, found in fibronectin, is crucial for its interaction with its cell surface receptor, an integrin [1,2]. What has been called the 'RGD' tripeptide is also found in the sequences of a number of other proteins, where it has been shown to play a role in cell adhesion.
These proteins are: some forms of collagens, fibrinogen, vitronectin, von Willebrand factor (VWF), snake disintegrins, and slime mold discoidins. The 'RGD' tripeptide is also found in other proteins where it may also, but not always, serve the same purpose.

-Consensus pattern: R-G-D
-Last update: December 1991 / Text revised.

[ 1] Ruoslahti E., Pierschbacher M.D.
     "Arg-Gly-Asp: a versatile cell recognition signal."
     Cell 44:517-518(1986).
     PubMed=2418980
[ 2] d'Souza S.E., Ginsberg M.H., Plow E.F.
     Trends Biochem. Sci. 16:246-250(1991).
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00017}
{PS00017; ATP_GTP_A}
{BEGIN}
*****************************************
* ATP/GTP-binding site motif A (P-loop) *
*****************************************

From sequence comparisons and crystallographic data analysis it has been shown [1,2,3,4,5,6] that an appreciable proportion of proteins that bind ATP or GTP share a number of more or less conserved sequence motifs. The best conserved of these motifs is a glycine-rich region, which typically forms a flexible loop between a beta-strand and an alpha-helix. This loop interacts with one of the phosphate groups of the nucleotide. This sequence motif is generally referred to as the 'A' consensus sequence [1] or the 'P-loop' [5].

There are numerous ATP- or GTP-binding proteins in which the P-loop is found.
We list below a number of protein families for which the relevance of the presence of such a motif has been noted:

- ATP synthase alpha and beta subunits (see <PDOC00137>).
- Myosin heavy chains.
- Kinesin heavy chains and kinesin-like proteins (see <PDOC00343>).
- Dynamins and dynamin-like proteins (see <PDOC00362>).
- Guanylate kinase (see <PDOC00670>).
- Thymidine kinase (see <PDOC00524>).
- Thymidylate kinase (see <PDOC01034>).
- Shikimate kinase (see <PDOC00868>).
- Nitrogenase iron protein family (nifH/chlL) (see <PDOC00580>).
- ATP-binding proteins involved in 'active transport' (ABC transporters) [7] (see <PDOC00185>).
- DNA and RNA helicases [8,9,10].
- GTP-binding elongation factors (EF-Tu, EF-1alpha, EF-G, EF-2, etc.).
- Ras family of GTP-binding proteins (Ras, Rho, Rab, Ral, Ypt1, SEC4, etc.).
- Nuclear protein ran (see <PDOC00859>).
- ADP-ribosylation factors family (see <PDOC00781>).
- Bacterial dnaA protein (see <PDOC00771>).
- Bacterial recA protein (see <PDOC00131>).
- Bacterial recF protein (see <PDOC00539>).
- Guanine nucleotide-binding proteins alpha subunits (Gi, Gs, Gt, G0, etc.).
- DNA mismatch repair proteins mutS family (see <PDOC00388>).
- Bacterial type II secretion system protein E (see <PDOC00567>).

Not all ATP- or GTP-binding proteins are picked up by this motif. A number of proteins escape detection because the structure of their ATP-binding site is completely different from that of the P-loop. Examples of such proteins are the E1-E2 ATPases or the glycolytic kinases. In other ATP- or GTP-binding proteins the flexible loop exists in a slightly different form; this is the case for tubulins or protein kinases. A special mention must be reserved for adenylate kinase, in which there is a single deviation from the P-loop pattern: in the last position Gly is found instead of Ser or Thr.

-Consensus pattern: [AG]-x(4)-G-K-[ST]
-Sequences known to belong to this class detected by the pattern: a majority.
-Other sequence(s) detected in Swiss-Prot: in addition to the proteins listed above, the 'A' motif is also found in a number of other proteins. Most of these proteins probably bind a nucleotide, but others are definitely not ATP- or GTP-binding (as for example chymotrypsin, or human ferritin light chain).
-Expert(s) to contact by email: Koonin E.V.; [email protected]
-Last update: July 1999 / Text revised.

[ 1] Walker J.E., Saraste M., Runswick M.J., Gay N.J.
     "Distantly related sequences in the alpha- and beta-subunits of ATP synthase, myosin, kinases and other ATP-requiring enzymes and a common nucleotide binding fold."
     EMBO J. 1:945-951(1982).
     PubMed=6329717
[ 2] Moller W., Amons R.
     "Phosphate-binding sequences in nucleotide-binding proteins."
     FEBS Lett. 186:1-7(1985).
     PubMed=2989003
[ 3] Fry D.C., Kuby S.A., Mildvan A.S.
     "ATP-binding site of adenylate kinase: mechanistic implications of its homology with ras-encoded p21, F1-ATPase, and other nucleotide-binding proteins."
     Proc. Natl. Acad. Sci. U.S.A. 83:907-911(1986).
     PubMed=2869483
[ 4] Dever T.E., Glynias M.J., Merrick W.C.
     "GTP-binding domain: three consensus sequence elements with distinct spacing."
     Proc. Natl. Acad. Sci. U.S.A. 84:1814-1818(1987).
     PubMed=3104905
[ 5] Saraste M., Sibbald P.R., Wittinghofer A.
     "The P-loop -- a common motif in ATP- and GTP-binding proteins."
     Trends Biochem. Sci. 15:430-434(1990).
     PubMed=2126155
[ 6] Koonin E.V.
     "A superfamily of ATPases with diverse functions containing either classical or deviant ATP-binding motif."
     J. Mol. Biol. 229:1165-1174(1993).
     PubMed=8445645
[ 7] Higgins C.F., Hyde S.C., Mimmack M.M., Gileadi U., Gill D.R., Gallagher M.P.
     "Binding protein-dependent transport systems."
     J. Bioenerg. Biomembr. 22:571-592(1990).
     PubMed=2229036
[ 8] Hodgman T.C.
     "A new superfamily of replicative proteins."
     Nature 333:22-23(1988) and Nature 333:578-578(1988) (Errata).
     PubMed=3362205; DOI=10.1038/333022b0
[ 9] Linder P., Lasko P.F., Ashburner M., Leroy P., Nielsen P.J., Nishi K., Schnier J., Slonimski P.P.
     "Birth of the D-E-A-D box."
     Nature 337:121-122(1989).
     PubMed=2563148; DOI=10.1038/337121a0
[10] Gorbalenya A.E., Koonin E.V., Donchenko A.P., Blinov V.M.
     Nucleic Acids Res. 17:4713-4730(1989).
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00018}
{PS00018; EF_HAND_1}
{PS50222; EF_HAND_2}
{BEGIN}
********************************************************
* EF-hand calcium-binding domain signature and profile *
********************************************************

Many calcium-binding proteins belong to the same evolutionary family and share a type of calcium-binding domain known as the EF-hand [1 to 5]. This type of domain consists of a twelve-residue loop flanked on both sides by a twelve-residue alpha-helical domain (see <PDB:1CLL>). In an EF-hand loop the calcium ion is coordinated in a pentagonal bipyramidal configuration. The six residues involved in the binding are in positions 1, 3, 5, 7, 9 and 12; these residues are denoted by X, Y, Z, -Y, -X and -Z. The invariant Glu or Asp at position 12 provides two oxygens for liganding Ca (bidentate ligand).

The basic structural/functional unit of EF-hand proteins is usually a pair of EF-hand motifs that together form a stable four-helix bundle domain. The pairing of EF-hands enables cooperativity in the binding of Ca2+ ions.
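The position labelling described above (loop positions 1, 3, 5, 7, 9 and 12 denoted X, Y, Z, -Y, -X and -Z) can be sketched in a few lines of Python. This is only an illustration: the function name `label_ef_hand_loop` is hypothetical, and the example loop `DKDGDGTITTKE` is assumed here to correspond to the first calcium-binding loop of vertebrate calmodulin; it is not taken from this entry.

```python
# Loop positions (1-based) of the calcium-liganding residues and their
# conventional labels, as given in the text above.
EF_HAND_LIGAND_POSITIONS = {1: "X", 3: "Y", 5: "Z", 7: "-Y", 9: "-X", 12: "-Z"}


def label_ef_hand_loop(loop: str) -> dict:
    """Map each coordination label to the residue found at that loop position."""
    loop = loop.strip().upper()
    if len(loop) != 12:
        raise ValueError("an EF-hand loop is expected to be 12 residues long")
    return {
        label: loop[position - 1]
        for position, label in EF_HAND_LIGAND_POSITIONS.items()
    }


# Illustrative loop (assumed: first EF-hand loop of vertebrate calmodulin).
example = label_ef_hand_loop("DKDGDGTITTKE")
```

In the example the -Z position (residue 12) is Glu, in line with the invariant Glu or Asp noted above. The atypical thirteen-residue site of the S-100/ICaBP family is deliberately rejected by the length check.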
We list below the proteins which are known to contain EF-hand regions. For each type of protein we have indicated in parentheses the total number of EF-hand regions known or supposed to exist. This number does not include regions which clearly have lost their calcium-binding properties, or the atypical low-affinity site (which spans thirteen residues) found in the S-100/ICaBP family of proteins [6].

- Aequorin and Renilla luciferin binding protein (LBP) (Ca=3).
- Alpha actinin (Ca=2).
- Calbindin (Ca=4).
- Calcineurin B subunit (protein phosphatase 2B regulatory subunit) (Ca=4).
- Calcium-binding protein from Streptomyces erythraeus (Ca=3?).
- Calcium-binding protein from Schistosoma mansoni (Ca=2?).
- Calcium-binding proteins TCBP-23 and TCBP-25 from Tetrahymena thermophila (Ca=4?).
- Calcium-dependent protein kinases (CDPK) from plants (Ca=4).
- Calcium vector protein from amphioxus (Ca=2).
- Calcyphosin (thyroid protein p24) (Ca=4?).
- Calmodulin (Ca=4, except in yeast where Ca=3).
- Calpain small and large chains (Ca=2).
- Calretinin (Ca=6).
- Calcyclin (prolactin receptor associated protein) (Ca=2).
- Caltractin (centrin) (Ca=2 or 4).
- Cell Division Control protein 31 (gene CDC31) from yeast (Ca=2?).
- Diacylglycerol kinase (EC 2.7.1.107) (DGK) (Ca=2).
- FAD-dependent glycerol-3-phosphate dehydrogenase (EC 1.1.99.5) from mammals (Ca=1).
- Fimbrin (plastin) (Ca=2).
- Flagellar calcium-binding protein (1f8) from Trypanosoma cruzi (Ca=1 or 2).
- Guanylate cyclase activating protein (GCAP) (Ca=3).
- Inositol phospholipid-specific phospholipase C isozymes gamma-1 and delta-1 (Ca=2) [10].
- Intestinal calcium-binding protein (ICaBPs) (Ca=2).
- MIF related proteins 8 (MRP-8 or CFAG) and 14 (MRP-14) (Ca=2).
- Myosin regulatory light chains (Ca=1).
- Oncomodulin (Ca=2).
- Osteonectin (basement membrane protein BM-40) (SPARC) and proteins that contain an 'osteonectin' domain (QR1, matrix glycoprotein SC1) (see the entry <PDOC00535>) (Ca=1).
- Parvalbumins alpha and beta (Ca=2).
- Placental calcium-binding protein (18a2) (nerve growth factor induced protein 42a) (p9k) (Ca=2).
- Recoverins (visinin, hippocalcin, neurocalcin, S-modulin) (Ca=2 to 3).
- Reticulocalbin (Ca=4).
- S-100 protein, alpha and beta chains (Ca=2).
- Sarcoplasmic calcium-binding protein (SCPs) (Ca=2 to 3).
- Sea urchin proteins Spec 1 (Ca=4), Spec 2 (Ca=4?), Lps-1 (Ca=8).
- Serine/threonine specific protein phosphatase rdgc (EC 3.1.3.16) from Drosophila (Ca=2).
- Sorcin V19 from hamster (Ca=2).
- Spectrin alpha chain (Ca=2).
- Squidulin (optic lobe calcium-binding protein) from squid (Ca=4).
- Troponins C; from skeletal muscle (Ca=4), from cardiac muscle (Ca=3), from arthropods and molluscs (Ca=2).

There have been a number of attempts [7,8] to develop patterns that pick up EF-hand regions, but these studies were made a few years ago when not so many different families of calcium-binding proteins were known. We therefore developed a new pattern which takes into account all published sequences. This pattern includes the complete EF-hand loop as well as the first residue which follows the loop and which seems to always be hydrophobic. We also developed a profile that covers the loop and the two alpha helices.

-Consensus pattern: D-{W}-[DNS]-{ILVFYW}-[DENSTG]-[DNQGHRK]-{GP}-[LIVMC]-[DENQSTAGC]-x(2)-[DE]-[LIVMFYW]
-Sequences known to belong to this class detected by the pattern: ALL, except for a few sequences.
-Other sequence(s) detected in Swiss-Prot: proteins which are probably not calcium-binding and a few proteins for which we have reason to believe that they bind calcium: a number of endoglucanases and a xylanase from the cellulosome complex of Clostridium [9].
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: Positions 1 (X), 3 (Y) and 12 (-Z) are the most conserved.
-Note: The 6th residue in an EF-hand loop is, in most cases, a Gly, but the number of exceptions to this 'rule' has gradually increased and we felt that the pattern should include all the different residues which have been shown to exist in this position in functional Ca-binding sites.
-Note: The pattern will, in some cases, miss one of the EF-hand regions in some proteins with multiple EF-hand domains.
-Expert(s) to contact by email: Cox J.A.; [email protected]
                                Kretsinger R.H.; [email protected]
-Last update: April 2006 / Pattern revised.

[ 1] Kawasaki H., Kretsinger R.H.
     "Calcium-binding proteins 1: EF-hands."
     Protein Prof. 2:305-490(1995).
     PubMed=7553064
[ 2] Kretsinger R.H.
     "Calcium coordination and the calmodulin fold: divergent versus convergent evolution."
     Cold Spring Harb. Symp. Quant. Biol. 52:499-510(1987).
     PubMed=3454274
[ 3] Moncrief N.D., Kretsinger R.H., Goodman M.
     "Evolution of EF-hand calcium-modulated proteins. I. Relationships based on amino acid sequences."
     J. Mol. Evol. 30:522-562(1990).
     PubMed=2115931
[ 4] Nakayama S., Moncrief N.D., Kretsinger R.H.
     "Evolution of EF-hand calcium-modulated proteins. II. Domains of several subfamilies have diverse evolutionary histories."
     J. Mol. Evol. 34:416-448(1992).
     PubMed=1602495
[ 5] Heizmann C.W., Hunziker W.
     "Intracellular calcium-binding proteins: more sites than insights."
     Trends Biochem. Sci. 16:98-103(1991).
     PubMed=2058003
[ 6] Kligman D., Hilt D.C.
     "The S100 protein family."
     Trends Biochem. Sci. 13:437-443(1988).
     PubMed=3075365
[ 7] Strynadka N.C.J., James M.N.
     "Crystal structures of the helix-loop-helix calcium-binding proteins."
     Annu. Rev. Biochem. 58:951-998(1989).
     PubMed=2673026; DOI=10.1146/annurev.bi.58.070189.004511
[ 8] Haiech J., Sallantin J.
     "Computer search of calcium binding sites in a gene data bank: use of learning techniques to build an expert system."
     Biochimie 67:555-560(1985).
     PubMed=3839696
[ 9] Chauvaux S., Beguin P., Aubert J.-P., Bhat K.M., Gow L.A., Wood T.M., Bairoch A.
     "Calcium-binding affinity and calcium-enhanced activity of Clostridium thermocellum endoglucanase D."
     Biochem. J. 265:261-265(1990).
     PubMed=2302168
[10] Bairoch A., Cox J.A.
     "EF-hand motifs in inositol phospholipid-specific phospholipase C."
     FEBS Lett. 269:454-456(1990).
     PubMed=2401372
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00019}
{PS00019; ACTININ_1}
{PS00020; ACTININ_2}
{BEGIN}
************************************************
* Actinin-type actin-binding domain signatures *
************************************************

Alpha-actinin is an F-actin cross-linking protein which is thought to anchor actin to a variety of intracellular structures [1]. The actin-binding domain of alpha-actinin seems to reside in the first 250 residues of the protein. A similar actin-binding domain has been found in the N-terminal region of many different actin-binding proteins [2,3]:

- In the beta chain of spectrin (or fodrin).
- In dystrophin, the protein defective in Duchenne muscular dystrophy (DMD) and which may play a role in anchoring the cytoskeleton to the plasma membrane.
- In the slime mold gelation factor (or ABP-120).
- In actin-binding protein ABP-280 (or filamin), a protein that links actin filaments to membrane glycoproteins.
- In fimbrin (or plastin), an actin-bundling protein. Fimbrin differs from the above proteins in that it contains two tandem copies of the actin-binding domain and that these copies are located in the C-terminal part of the protein.
We selected two conserved regions as signature patterns for this type of domain. The first of these regions is located at the beginning of the domain, while the second one is located in the central section and has been shown to be essential for the binding of actin.

-Consensus pattern: [EQ]-{LNYH}-x-[ATV]-[FY]-{LDAM}-{T}-W-{PG}-N
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 32.
-Consensus pattern: [LIVM]-x-[SGNL]-[LIVMN]-[DAGHENRS]-[SAGPNVT]-x-[DNEAG]-[LIVM]-x-[DEAGQ]-x(4)-[LIVM]-x-[LM]-[SAG]-[LIVM]-[LIVMT]-[WS]-x(0,1)-[LIVM](2)
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Patterns revised.

[ 1] Schleicher M., Andre E., Hartmann H., Noegel A.A.
     "Actin-binding proteins are conserved from slime molds to man."
     Dev. Genet. 9:521-530(1988).
     PubMed=3243032
[ 2] Matsudaira P.
     "Modular organization of actin crosslinking proteins."
     Trends Biochem. Sci. 16:87-92(1991).
     PubMed=2058002
[ 3] Dubreuil R.R.
     "Structure and evolution of the actin crosslinking proteins."
     BioEssays 13:219-226(1991).
     PubMed=1892474
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00020}
{PS00021; KRINGLE_1}
{PS50070; KRINGLE_2}
{BEGIN}
****************************************
* Kringle domain signature and profile *
****************************************

Kringles [1,2,3] are triple-looped, disulfide cross-linked domains found in a varying number of copies in some serine proteases and plasma proteins. The kringle domain has been found in the following proteins:

- Apolipoprotein A (38 copies).
- Blood coagulation factor XII (Hageman factor) (1 copy).
- Hepatocyte growth factor (HGF) (4 copies).
- Hepatocyte growth factor like protein (4 copies) [4].
- Hepatocyte growth factor activator [1] (1 copy) [5].
- Plasminogen (5 copies).
- Thrombin (2 copies).
- Tissue plasminogen activator (TPA) (2 copies).
- Urokinase-type plasminogen activator (1 copy).

The schematic representation of the structure of a typical kringle domain is shown below:

 +---------------------------------------+
 |                                       |
xCxxxxxxxxxxxCxxxxxxxxxxCxxxxxCxxxxxxCxxxCx
             |          |     |      |
             +----------|-----+      |
                        +------------+

'C': conserved cysteine involved in a disulfide bond.

Kringle domains are thought to play a role in binding mediators, such as membranes, other proteins or phospholipids, and in the regulation of proteolytic activity. As a signature pattern for this type of domain, we selected a conserved sequence that contains two of the cysteines involved in disulfide bonds.

-Consensus pattern: [FY]-C-[RH]-[NS]-x(7,8)-[WY]-C
                    [The 2 C's are involved in a disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 5.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Expert(s) to contact by email: Ikeo K.; [email protected]
-Last update: May 2004 / Text revised.

[ 1] Castellino F.J., Beals J.M.
     "The genetic relationships between the kringle domains of human plasminogen, prothrombin, tissue plasminogen activator, urokinase, and coagulation factor XII."
     J. Mol. Evol. 26:358-369(1987).
     PubMed=3131537
[ 2] Patthy L.
     "Evolution of the proteases of blood coagulation and fibrinolysis by assembly from modules."
     Cell 41:657-663(1985).
     PubMed=3891096
[ 3] Ikeo K., Takahashi K., Gojobori T.
     "Evolutionary origin of numerous kringles in human and simian apolipoprotein(a)."
     FEBS Lett. 287:146-148(1991).
     PubMed=1879523
[ 4] Friezner Degen S.J., Stuart L.A., Han S., Jamison C.S.
     Biochemistry 30:9781-9791(1991).
[ 5] Miyazawa K., Shimomura T., Kitamura A., Kondo J., Morimoto Y., Kitamura N.
     "Molecular cloning and sequence analysis of the cDNA for a human serine protease reponsible for activation of hepatocyte growth factor. Structural similarity of the protease precursor to blood coagulation factor XII."
     J. Biol. Chem. 268:10024-10028(1993).
     PubMed=7683665
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00021}
{PS00022; EGF_1}
{PS01186; EGF_2}
{PS50026; EGF_3}
{BEGIN}
******************************************
* EGF-like domain signatures and profile *
******************************************

A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [1 to 6] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins.
EGF is a polypeptide of about 50 amino acids with three internal disulfide bridges. It first binds with high affinity to specific cell-surface receptors and then induces their dimerization, which is essential for activating the tyrosine kinase in the receptor cytoplasmic domain, initiating a signal transduction that results in DNA synthesis and cell proliferation.

A common feature of all EGF-like domains is that they are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF-like domain includes six cysteine residues which have been shown to be involved in disulfide bonds. The structure of several EGF-like domains has been solved. The fold consists of a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet (see <PDB:1EGF>).

Subdomains between the conserved cysteines strongly vary in length as shown in the following schematic representation of the EGF-like domain:

     +-------------------+                  +-------------------------+
     |                   |                  |                         |
x(4)-C-x(0,48)-C-x(3,12)-C-x(1,70)-C-x(1,6)-C-x(2)-G-a-x(0,21)-G-x(2)-C-x
               |                   | ************************************
               +-------------------+

'C': conserved cysteine involved in a disulfide bond.
'G': often conserved glycine
'a': often conserved aromatic amino acid
'*': position of both patterns.
'x': any residue

Some proteins known to contain one or more copies of an EGF-like domain are listed below.

- Adipocyte differentiation inhibitor (gene PREF-1) from mouse (6 copies).
- Agrin, a basal lamina protein that causes the aggregation of acetylcholine receptors on cultured muscle fibers (4 copies).
- Amphiregulin, a growth factor (1 copy).
- Betacellulin, a growth factor (1 copy).
- Blastula proteins BP10 and Span from sea urchin which are thought to be involved in pattern formation (1 copy).
- BM86, a glycoprotein antigen of cattle tick (7 copies).
- Bone morphogenic protein 1 (BMP-1), a protein which induces cartilage and bone formation and which expresses metalloendopeptidase activity (1-2 copies). Homologous proteins are found in sea urchin - suBMP (1 copy) - and in Drosophila - the dorsal-ventral patterning protein tolloid (2 copies).
- Caenorhabditis elegans developmental proteins lin-12 (13 copies) and glp-1 (10 copies).
- Caenorhabditis elegans apx-1 protein, a patterning protein (4.5 copies).
- Calcium-dependent serine proteinase (CASP) which degrades the extracellular matrix proteins type I and IV collagen and fibronectin (1 copy).
- Cartilage matrix protein CMP (1 copy).
- Cartilage oligomeric matrix protein COMP (4 copies).
- Cell surface antigen 114/A10 (3 copies).
- Cell surface glycoprotein complex transmembrane subunit ASGP-2 from rat (2 copies).
- Coagulation associated proteins C, Z (2 copies) and S (4 copies).
- Coagulation factors VII, IX, X and XII (2 copies).
- Complement C1r components (1 copy).
- Complement C1s components (1 copy).
- Complement-activating component of Ra-reactive factor (RARF) (1 copy).
- Complement components C6, C7, C8 alpha and beta chains, and C9 (1 copy).
- Crumbs, an epithelial development protein from Drosophila (29 copies).
- Epidermal growth factor precursor (7-9 copies).
- Exogastrula-inducing peptides A, C, D and X from sea urchin (1 copy).
- Fat protein, a Drosophila cadherin-related tumor suppressor (5 copies).
- Fetal antigen 1, a probable neuroendocrine differentiation protein, which is derived from the delta-like protein (DLK) (6 copies).
- Fibrillin 1 (47 copies) and fibrillin 2 (14 copies).
- Fibropellins IA (21 copies), IB (13 copies), IC (8 copies), II (4 copies) and III (8 copies) from the apical lamina - a component of the extracellular matrix - of sea urchin.
- Fibulin-1 and -2, two extracellular matrix proteins (9-11 copies).
- Giant-lens protein (protein Argos), which regulates cell determination and axon guidance in the Drosophila eye (1 copy).
- Growth factor-related proteins from various poxviruses (1 copy).
- Gurken protein, a Drosophila developmental protein (1 copy).
- Heparin-binding EGF-like growth factor (HB-EGF), transforming growth factor alpha (TGF-alpha), growth factors Lin-3 and Spitz (1 copy); the precursors are membrane proteins, the mature forms are extracellular.
- Hepatocyte growth factor (HGF) activator (EC 3.4.21.-) (2 copies).
- LDL and VLDL receptors, which bind and transport low-density lipoproteins and very low-density lipoproteins (3 copies).
- LDL receptor-related protein (LRP), which may act as a receptor for endocytosis of extracellular ligands (22 copies).
- Leucocyte antigen CD97 (3 copies), cell surface glycoprotein EMR1 (6 copies) and cell surface glycoprotein F4/80 (7 copies).
- Limulus clotting factor C, which is involved in hemostasis and host defense mechanisms in the Japanese horseshoe crab (1 copy).
- Meprin A alpha subunit, a mammalian membrane-bound endopeptidase (1 copy).
- Milk fat globule-EGF factor 8 (MFG-E8) from mouse (2 copies).
- Neuregulin GGF-I and GGF-II, two human glial growth factors (1 copy).
- Neurexins from mammals (3 copies).
- Neurogenic proteins Notch, Xotch and the human homolog Tan-1 (36 copies), Delta (9 copies) and the similar differentiation proteins Lag-2 from Caenorhabditis elegans (2 copies), Serrate (14 copies) and Slit (7 copies) from Drosophila.
- Nidogen (also called entactin), a basement membrane protein from chordates (2-6 copies).
- Ookinete surface proteins (24 Kd, 25 Kd, 28 Kd) from Plasmodium (4 copies).
- Pancreatic secretory granule membrane major glycoprotein GP2 (1 copy).
- Perforin, which lyses non-specifically a variety of target cells (1 copy).
- Proteoglycans aggrecan (1 copy), versican (2 copies), perlecan (at least 2 copies), brevican (1 copy) and chondroitin sulfate proteoglycan (gene PG-M) (2 copies).
- Prostaglandin G/H synthase 1 and 2 (EC 1.14.99.1) (1 copy), which is found in the endoplasmic reticulum.
- Reelin, an extracellular matrix protein that plays a role in layering of neurons in the cerebral cortex and cerebellum of mammals (8 copies).
- S1-5, a human extracellular protein whose ultimate activity is probably modulated by the environment (5 copies).
- Schwannoma-derived growth factor (SDGF), an autocrine growth factor as well as a mitogen for different target cells (1 copy).
- Selectins. Cell adhesion proteins such as ELAM-1 (E-selectin), GMP-140 (P-selectin), or the lymph-node homing receptor (L-selectin) (1 copy).
- Serine/threonine-protein kinase homolog (gene Pro25) from Arabidopsis thaliana, which may be involved in assembly or regulation of light-harvesting chlorophyll A/B protein (2 copies).
- Sperm-egg fusion proteins PH-30 alpha and beta from guinea pig (1 copy).
- Stromal cell derived protein-1 (SCP-1) from mouse (6 copies).
- TDGF-1, human teratocarcinoma-derived growth factor 1 (1 copy).
- Tenascin (or neuronectin), an extracellular matrix protein from mammals (14.5 copies), chicken (TEN-A) (13.5 copies) and the related proteins human tenascin-X (18 copies) and tenascin-like proteins TEN-A and TEN-M from Drosophila (8 copies).
- Thrombomodulin (fetomodulin), which together with thrombin activates protein C (6 copies).
- Thrombospondin 1, 2 (3 copies), 3 and 4 (4 copies), adhesive glycoproteins that mediate cell-to-cell and cell-to-matrix interactions.
- Thyroid peroxidase 1 and 2 (EC 1.11.1.8) from human (1 copy).
- Transforming growth factor beta-1 binding protein (TGF-B1-BP) (16 or 18 copies).
- Tyrosine-protein kinase receptors Tek and Tie (EC 2.7.1.112) (3 copies).
- Urokinase-type plasminogen activator (EC 3.4.21.73) (UPA) and tissue plasminogen activator (EC 3.4.21.68) (TPA) (1 copy).
- Uromodulin (Tamm-Horsfall urinary glycoprotein) (THP) (3 copies).
- Vitamin K-dependent anticoagulants protein C (2 copies) and protein S (4 copies) and the similar protein Z, a single-chain plasma glycoprotein of unknown function (2 copies).
- 63 Kd sperm flagellar membrane protein from sea urchin (3 copies).
- 93 Kd protein (gene nel) from chicken (5 copies).
- Hypothetical 337.6 Kd protein T20G5.3 from Caenorhabditis elegans (44 copies).

The region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains. We created two patterns for this domain, each including one of these C-terminal conserved glycine residues. The profile we developed covers the whole domain.

-Consensus pattern: C-x-C-x(2)-{V}-x(2)-G-{C}-x-C
                    [The 3 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL, but not those that have very long or very short regions between the last 3 conserved cysteines of their EGF-like domain(s).
-Other sequence(s) detected in Swiss-Prot: 87 proteins, of which 27 can be considered as possible candidates.

-Consensus pattern: C-x-C-x(2)-[GP]-[FYW]-x(4,8)-C
                    [The 3 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL, but not those that have very long or very short regions between the last 3 conserved cysteines of their EGF-like domain(s).
-Other sequence(s) detected in Swiss-Prot: 83 proteins, of which 49 can be considered as possible candidates.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: The beta chain of the integrin family of proteins contains 2 cysteine-rich repeats which were said to be dissimilar to the EGF pattern [7].
-Note: Laminin EGF-like repeats (see <PDOC00961>) are longer than the average EGF module and contain a further disulfide bond C-terminal of the EGF-like region. Perlecan and agrin contain both EGF-like domains and laminin-type EGF-like domains.
-Note: The patterns do not detect all of the repeats of proteins with multiple EGF-like repeats.
-Note: See <PDOC00913> for an entry describing specifically the subset of EGF-like domains that bind calcium.
-Last update: April 2006 / Pattern revised.

[ 1] Davis C.G.
     "The many faces of epidermal growth factor repeats."
     New Biol. 2:410-419(1990).
     PubMed=2288911
[ 2] Blomquist M.C., Hunt L.T., Barker W.C.
     "Vaccinia virus 19-kilodalton protein: relationship to several mammalian proteins, including two growth factors."
     Proc. Natl. Acad. Sci. U.S.A. 81:7363-7367(1984).
     PubMed=6334307
[ 3] Barker W.C., Johnson G.C., Hunt L.T., George D.G.
     Protein Nucl. Acid Enz. 29:54-68(1986).
[ 4] Doolittle R.F., Feng D.F., Johnson M.S.
     "Computer-based characterization of epidermal growth factor precursor."
     Nature 307:558-560(1984).
     PubMed=6607417
[ 5] Appella E., Weber I.T., Blasi F.
     "Structure and function of epidermal growth factor-like regions in proteins."
     FEBS Lett. 231:1-4(1988).
     PubMed=3282918
[ 6] Campbell I.D., Bork P.
     Curr. Opin. Struct. Biol. 3:385-392(1993).
[ 7] Tamkun J.W., DeSimone D.W., Fonda D., Patel R.S., Buck C., Horwitz A.F., Hynes R.O.
     "Structure of integrin, a glycoprotein involved in the transmembrane linkage between fibronectin and actin."
     Cell 46:271-282(1986).
     PubMed=3487386
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00022}
{PS00023; FN2_1}
{PS51092; FN2_2}
{BEGIN}
*********************************************************************
* Fibronectin type-II collagen-binding domain signature and profile *
*********************************************************************

Fibronectin is a plasma protein that binds cell surfaces and various compounds including collagen, fibrin, heparin, DNA, and actin. The major part of the sequence of fibronectin consists of the repetition of three types of domains, which are called type I, II, and III [1]. Type II domain (FN2) is approximately 40 residues long, contains four conserved cysteines involved in disulfide bonds and is part of the collagen-binding region of fibronectin [2]. In fibronectin the minimal collagen-binding region is formed by one FN1 and two FN2 domains. This suggests that the collagen-binding site spans multiple modules.

A schematic representation of the position of the invariant residues and the topology of the disulfide bonds in FN2 domain is shown below.

                +----------------------+
                |                      |
xxCxxPFx#xxxxxxxCxxxxxxxxWCxxxxx#xxx#x#Cxx
  |                       |
  +-----------------------+

'C': conserved cysteine involved in a disulfide bond.
'#': large hydrophobic residue.

The 3D-structure of the FN2 domain has been determined (see <PDB:2FN2>) [3]. The structure consists of two double-stranded anti-parallel beta-sheets, oriented approximately perpendicular to each other, and two irregular loops, one separating the two beta-sheets and the other between the two strands of the second beta-sheet. The minimal collagen-binding region (FN1-FN2-FN2) adopts a hairpin structure where the conserved aromatic residues of FN2 form a hydrophobic pocket which is thought to provide a binding site for nonpolar residues in collagen [4].

Some proteins that contain an FN2 domain are listed below:

- Blood coagulation factor XII (Hageman factor) (1 copy).
- Bovine seminal plasma proteins PDC-109 (BSP-A1/A2) and BSP-A3 [5] (twice).
- Cation-independent mannose-6-phosphate receptor (which is also the insulin-like growth factor II receptor) [6] (1 copy).
- Mannose receptor of macrophages [7] (1 copy).
- 180 Kd secretory phospholipase A2 receptor (1 copy) [8].
- DEC-205 receptor (1 copy) [9].
- 72 Kd and 92 Kd type IV collagenases (EC 3.4.24.24) (MMP-2 and MMP-9) [10] (3 copies). Both metalloproteinases are strongly expressed in malignant tumors and have been implicated in metastasis. Both degrade collagen IV, thus facilitating penetration of the basement membranes by tumor cells.
- Hepatocyte growth factor activator [11] (1 copy).

Our consensus pattern spans the domain between the first and the last conserved cysteine. We also developed a profile that covers the whole domain.

-Consensus pattern: C-x(2)-P-F-x-[FYWIV]-x(7)-C-x(8,10)-W-C-x(4)-[DNSR]-[FYW]-x(3,5)-[FYW]-x-[FYWI]-C
                    [The 4 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Last update: March 2005 / Text revised; profile added.

[ 1] Skorstengaard K., Jensen M.S., Sahl P., Petersen T.E., Magnusson S.
     "Complete primary structure of bovine plasma fibronectin."
     Eur. J. Biochem. 161:441-453(1986).
     PubMed=3780752
[ 2] Forastieri H., Ingham K.C.
     "Interaction of gelatin with a fluorescein-labeled 42-kDa chymotryptic fragment of fibronectin."
     J. Biol. Chem. 260:10546-10550(1985).
     PubMed=3928622
[ 3] Pickford A.R., Potts J.R., Bright J.R., Phan I., Campbell I.D.
     "Solution structure of a type 2 module from fibronectin: implications for the structure and function of the gelatin-binding domain."
     Structure 5:359-370(1997).
     PubMed=9083105
[ 4] Pickford A.R., Smith S.P., Staunton D., Boyd J., Campbell I.D.
"The hairpin structure of the (6)F1(1)F2(2)F2 fragment from human fibronectin enhances gelatin binding." EMBO J. 20:1519-1529(2001). PubMed=11285216; DOI=10.1093/emboj/20.7.1519 [ 5] Seidah N.G., Manjunath P., Rochemont J., Sairam M.R., Chretien M. "Complete amino acid sequence of BSP-A3 from bovine seminal plasma. Homology to PDC-109 and to the collagen-binding domain of fibronectin." Biochem. J. 243:195-203(1987). PubMed=3606570 [ 6] Kornfeld S. "Structure and function of the mannose 6-phosphate/insulinlike growth factor II receptors." Annu. Rev. Biochem. 61:307-330(1992). PubMed=1323236; DOI=10.1146/annurev.bi.61.070192.001515 [ 7] Taylor M.E., Conary J.T., Lennartz M.R., Stahl P.D., Drickamer K. "Primary structure of the mannose receptor contains multiple motifs resembling carbohydrate-recognition domains." J. Biol. Chem. 265:12156-12162(1990). PubMed=2373685 [ 8] Lambeau G., Ancian P., Barhanin J., Lazdunski M. "Cloning and expression of a membrane receptor for secretory phospholipases A2." J. Biol. Chem. 269:1575-1578(1994). PubMed=8294398 [ 9] Jiang W., Swiggard W.J., Heufler C., Peng M., Mirza A., Steinman R.M., Nussenzweig M.C. "The receptor DEC-205 expressed by dendritic cells and thymic epithelial cells is involved in antigen processing." Nature 375:151-155(1995). PubMed=7753172; DOI=10.1038/375151a0 [10] Collier I.E., Wilhelm S.M., Eisen A.Z., Marmer B.L., Grant G.A., Seltzer J.L., Kronberger A., He C., Bauer E.A., Goldberg G.I. J. Biol. Chem. 263:6579-6587(1988). [11] Miyazawa K., Shimomura T., Kitamura A., Kondo J., Morimoto Y., Kitamura N. J. Biol. Chem. 268:10024-10028(1993). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. 
For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00023}
{PS00024; HEMOPEXIN}
{BEGIN}
******************************
* Hemopexin domain signature *
******************************

Hemopexin is a serum glycoprotein that binds heme and transports it to the liver for breakdown and iron recovery, after which the free hemopexin returns to the circulation. Structurally hemopexin consists of two similar halves of approximately two hundred amino acid residues connected by a histidine-rich hinge region. Each half is itself formed by the repetition of a basic unit of some 35 to 45 residues.

Hemopexin-like domains have been found [1,2] in two other types of proteins:

- In vitronectin, a cell adhesion and spreading factor found in plasma and tissues. Vitronectin, like hemopexin, has two hemopexin-like domains.
- In most members of the matrix metalloproteinases family (matrixins) (see <PDOC00129>): MMP-1, MMP-2, MMP-3, MMP-8, MMP-9, MMP-10, MMP-11, MMP-12, MMP-13, MMP-14, MMP-15, MMP-16, MMP-17, MMP-18, MMP-19, MMP-20, MMP-24, and MMP-25. These zinc endoproteases have a single hemopexin-like domain in their C-terminal section.

It is suggested that the hemopexin domain facilitates binding to a variety of molecules and proteins. The signature pattern for this type of domain has been derived from the best conserved region which is located at the beginning of the second repeat.

-Consensus pattern: [LIFAT]-{IL}-x(2)-W-x(2,3)-[PE]-x-{VF}-[LIVMFY]-[DENQS]-[STA]-[AV]-[LIVMFY]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 11.
-Last update: April 2006 / Pattern revised.

[ 1] Hunt L.T., Barker W.C., Chen H.R.
     Protein Seq. Data Anal. 1:21-26(1987).
[ 2] Stanley K.K.
     "Homology with hemopexin suggests a possible scavenging function for S-protein/vitronectin."
FEBS Lett. 199:249-253(1986). PubMed=2422056 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00024} {PS00025; P_TREFOIL_1} {PS51448; P_TREFOIL_2} {BEGIN} *************************************************** * P-type ('Trefoil') domain signature and profile * *************************************************** A cysteine-rich domain of approximately forty five amino-acid residues has been found in some extracellular eukaryotic proteins [1,2,3,4,5]. This domain is known as either the 'P', 'trefoil' or 'TFF' domain. It contains six cysteines that are linked by three disulfide bonds in a 1-5, 2-4, and 3-6 configuration. This leads to a characteristic three leafed structure ('trefoil'). The P-type domain is clearly composed of three looplike regions. The central core of the domain consists of a short two-stranded antiparallel beta-sheet, which is capped by an irregular loop and forms a central hairpin (loop 3). The beta-sheet is preceded by a short alpha-helix, with majority of the remainder of the domain contained in two loops, which lie on either side of the central hairpin (see <PDB:1E9T>) [6]. Proteins known to contain this domain are: - Protein pS2 (TFF1), a protein secreted by the stomach mucosa, whose gene is induced by estrogen. The exact function of pS2 is not known. It is a protein of about 65 residues and it contains a copy of the 'P' domain. 
- Spasmolytic polypeptide (SP) (TFF2), a protein of about 115 residues that inhibits gastrointestinal motility and gastric acid secretion. SP could be a growth factor. It contains two tandem copies of the 'P' domain.
- Intestinal trefoil factor (ITF) (TFF3), an intestinal protein of about 60 residues which may have a role in promoting cell migration. It contains a copy of the 'P' domain.
- Xenopus stomach proteins xP1 (one 'P' domain) and xP4 (four 'P' domains).
- Xenopus integumentary mucins A.1 (FIM-A.1 or preprospasmolysin) and C.1 (FIM-C.1). These proteins could be involved in defense against microbial infections by protecting the epithelia from the external environment. They are large proteins (400 residues for A.1; more than 660 residues for C.1 whose sequence is only partially known) that contain multiple copies of the 'P' domain interspersed with tandem repeats of threonine-rich, O-glycosylated regions.
- Xenopus skin protein xp2 (or APEG), a protein that contains two 'P' domains and exists in two alternatively spliced forms that differ by the inclusion of an N-terminal region of 320 residues that consists of 33 tandem repeats of a G-[GE]-[AP](2,4)-A-E motif.
- Zona pellucida sperm-binding protein B (ZP-B) (also known as ZP-X in rabbit and ZP-3 alpha in pig). This protein is a receptor-like glycoprotein whose extracellular region contains a 'P' domain followed by a ZP domain (see <PDOC00577>).
- Intestinal sucrase-isomaltase (EC 3.2.1.48 / EC 3.2.1.10), a vertebrate membrane-bound, multifunctional enzyme complex which hydrolyzes sucrose, maltose and isomaltose (see <PDOC00120>).
- Lysosomal alpha-glucosidase (EC 3.2.1.20) (acid maltase), a vertebrate extracellular glycosidase (see <PDOC00120>).

Structurally the P-type domain can be represented as shown below.
  +-------------------------+
  |         +--------------+|
  |         |              ||
xxCxxxxxx+xxCG#xxxxxxxCxxxxCC#xxxxxxxxWC#xxxxxxxx
         *************|*******         |
                      |                |
                      +----------------+

'C': conserved cysteine involved in a disulfide bond.
'#': large hydrophobic residue.
'+': positively charged residue.
'*': position of the pattern.

-Consensus pattern: [KRH]-x(2)-C-x-[FYPSTV]-x(3,4)-[ST]-x(3)-C-x(4)-C-C-[FYWH]
                    [The 4 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Expert(s) to contact by email: Hoffmann W.; [email protected]
-Last update: May 2009 / Text revised; profile added.

[ 1] Hoffmann W., Hauser F.
     "The P-domain or trefoil motif: a role in renewal and pathology of mucous epithelia?"
     Trends Biochem. Sci. 18:239-243(1993).
     PubMed=8267796
[ 2] Otto B., Wright N.
     "Trefoil peptides. Coming up clover."
     Curr. Biol. 4:835-838(1994).
     PubMed=7820556
[ 3] Bork P.
     "A trefoil domain in the major rabbit zona pellucida protein."
     Protein Sci. 2:669-670(1993).
     PubMed=8518738
[ 4] Wright N.A., Hoffmann W., Otto W.R., Rio M.-C., Thim L.
     "Rolling in the clover: trefoil factor family (TFF)-domain peptides, cell migration and cancer."
     FEBS Lett. 408:121-123(1997).
     PubMed=9187350
[ 5] Sommer P., Blin N., Goett P.
     "Tracing the evolutionary origin of the TFF-domain, an ancient motif at mucous surfaces."
     Gene 236:133-136(1999).
     PubMed=10433974
[ 6] Lemercinier X., Muskett F.W., Cheeseman B., McIntosh P.B., Thim L., Carr M.D.
     "High-resolution solution structure of human intestinal trefoil factor and functional insights from detailed structural comparisons with the other members of the trefoil family of mammalian cell motility factors."
     Biochemistry 40:9552-9559(2001).
     PubMed=11583154
+-----------------------------------------------------------------------+
PROSITE is copyright.
It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00025}
{PS00026; CHIT_BIND_I_1}
{PS50941; CHIT_BIND_I_2}
{BEGIN}
******************************************************
* Chitin-binding type-1 domain signature and profile *
******************************************************

Many plants respond to pathogenic attack by producing defense proteins that are capable of reversible binding to chitin, an N-acetylglucosamine polysaccharide present in the cell wall of fungi and the exoskeleton of insects. Most of these chitin-binding proteins include a common structural motif of 30 to 43 residues organized around a conserved four-disulfide core, known as the chitin-binding domain type-1 [1]. The topological arrangement of the four disulfide bonds is shown in the following figure:

                +-------------+
           +----|------+      |
           |    |      |      |
xxCgxxxxxxxCxxxxCCsxxgxCgxxxxxCxxxCxxxxC
  |        ******|*************   |    |
  |              |                +----+
  +--------------+

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

The structures of several chitin-binding type-1 domains have been solved (see for example <PDB:1HEV>) [2]. The chitin-binding site is localized in a beta-hairpin loop formed by the second disulfide bridge. Conserved serine and aromatic residues associated with the hairpin-loop are essential for the chitin-binding activity [3]. The chitin-binding domain type-1 displays some structural similarities with the chitin-binding domain type-2 (see <PDOC50940>).

Some of the proteins containing a chitin-binding domain type-1 are listed below:

- A number of non-leguminous plant lectins.
  The best characterized of these lectins are the three highly homologous wheat germ agglutinins (WGA-1, 2 and 3). WGA is an N-acetylglucosamine/N-acetylneuraminic acid binding lectin which structurally consists of a fourfold repetition of the 43 amino acid domain. The same type of structure is found in a barley root-specific lectin as well as a rice lectin.
- Plant endochitinases (EC 3.2.1.14) from class IA (see <PDOC00620>). Endochitinases are enzymes that catalyze the hydrolysis of the beta-1,4 linkages of N-acetyl glucosamine polymers of chitin. Plant chitinases function as a defense against chitin-containing fungal pathogens. Class IA chitinases generally contain one copy of the chitin-binding domain at their N-terminal extremity. An exception is agglutinin/chitinase [4] from the stinging nettle Urtica dioica which contains two copies of the domain.
- Hevein, a wound-induced protein found in the latex of rubber trees.
- Win1 and win2, two wound-induced proteins from potato.
- Kluyveromyces lactis killer toxin alpha subunit [5]. The toxin encoded by the linear plasmid pGKL1 is composed of three subunits: alpha, beta, and gamma. The gamma subunit harbors toxin activity and inhibits growth of sensitive yeast strains in the G1 phase of the cell cycle; the alpha subunit, which is proteolytically processed from a larger precursor that also contains the beta subunit, is a chitinase (see <PDOC00839>).

The profile we developed covers the whole domain.

-Consensus pattern: C-x(4,5)-C-C-S-x(2)-G-x-C-G-x(3,4)-[FYW]-C
                    [The 5 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: Hevein is a strong allergen which is implicated in the allergy to natural rubber latex (NRL).
NRL allergy can be associated with hypersensitivity to some plant-derived foods (latex–fruit syndrome). An increasing number of plant sources, such as avocado, banana, chestnut, kiwi, peach, tomato, potato and bell pepper, have been associated with this syndrome. Several papers [6,7] have shown that allergen cross-reactivity is due to IgE antibodies that recognize structurally similar epitopes on different proteins that are closely related. One of these protein families is the plant defence class I chitinases, which contain a type-1 chitin-binding domain.
-Last update: December 2004 / Pattern and text revised.

[ 1] Wright H.T., Sandrasegaram G., Wright C.S.
     "Evolution of a family of N-acetylglucosamine binding proteins containing the disulfide-rich domain of wheat germ agglutinin."
     J. Mol. Evol. 33:283-294(1991).
     PubMed=1757999
[ 2] Andersen N.H., Cao B., Rodriguez-Romero A., Arreguin B.
     "Hevein: NMR assignment and assessment of solution-state folding for the agglutinin-toxin motif."
     Biochemistry 32:1407-1422(1993).
     PubMed=8431421
[ 3] Asensio J.L., Canada F.J., Siebert H.C., Laynez J., Poveda A., Nieto P.M., Soedjanaamadja U.M., Gabius H.J., Jimenez-Barbero J.
     "Structural basis for chitin recognition by defense proteins: GlcNAc residues are bound in a multivalent fashion by extended binding sites in hevein domains."
     Chem. Biol. 7:529-543(2000).
     PubMed=10903932
[ 4] Lerner D.R., Raikhel N.V.
     "The gene for stinging nettle lectin (Urtica dioica agglutinin) encodes both a lectin and a chitinase."
     J. Biol. Chem. 267:11085-11091(1992).
     PubMed=1375935
[ 5] Butler A.R., O'Donnell R.W., Martin V.J., Gooday G.W., Stark M.J.R.
     "Kluyveromyces lactis toxin has an essential chitinase activity."
     Eur. J. Biochem. 199:483-488(1991).
     PubMed=2070799
[ 6] Sowka S., Hsieh L.S., Krebitz M., Akasawa A., Martin B.M., Starrett D., Peterbauer C.K., Scheiner O., Breiteneder H.
"Identification and cloning of prs a 1, a 32-kDa endochitinase and major allergen of avocado, and its expression in the yeast Pichia pastoris." J. Biol. Chem. 273:28091-28097(1998). PubMed=9774427 [ 7] Wagner S., Breiteneder H. "The latex-fruit syndrome." Biochem. Soc. Trans. 30:935-940(2002). PubMed=12440950; +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00026} {PS51390; WAP} {BEGIN} ************************************************* * WAP-type 'four-disulfide core' domain profile * ************************************************* The 'four-disulfide core' or WAP domain comprises 8 cysteine residues involved in disulfide bonds in a conserved arrangement [1]. One or more of these domains occur in whey acidic protein (WAP), antileukoproteinase, elastase-inhibitor proteins and other structurally related proteins which are listed below. - Whey acidic protein (WAP). WAP is a major component of milk whey whose function might be that of a protease inhibitor. WAP consists of two 'four-disulfide core' domains in most mammals. - Antileukoproteinase 1 (HUSI), a mucous fluid serine proteinase inhibitor. HUSI consists of two 'four-disulfide core' domains. - Elafin, an elastase-specific inhibitor from human skin [2,3]. - Sodium/potassium ATPase inhibitors SPAI-1, -2, and -3 from pig [4]. - Chelonianin, a protease inhibitor from the eggs of red sea turtle. 
  This inhibitor consists of two domains: an N-terminal domain which inhibits trypsin and belongs to the BPTI/Kunitz family of inhibitors, and a C-terminal domain which inhibits subtilisin and is a 'four-disulfide core domain'.
- Extracellular peptidase inhibitor (WDNM1 protein), involved in the metastatic potential of adenocarcinomas in rats.
- Caltrin-like protein 2 from guinea pig, which inhibits calcium transport into spermatozoa.
- Kallmann syndrome protein (Anosmin-1 or KALIG-1) [5,6]. This secreted protein may be an adhesion-like molecule with anti-protease activity. It contains a 'four-disulfide core domain' in its N-terminal part.
- Whey acidic protein (WAP) from the tammar wallaby, which consists of three 'four-disulfide core' domains [7].
- Waprins from snake venom, such as omwaprin from Oxyuranus microlepidotus [8] which has antibacterial activity against Gram-positive bacteria.

The following schematic representation shows the position of the conserved cysteines that form the 'four-disulfide core' WAP domain (see <PDB:2REL>).

                  +---------------------+
                  |    +-----------+    |
                  |    |           |    |
xxxxxxxCPxxxxxxxxxCxxxxCxxxxxCxxxxxCCxxxCxxxCxxxx
       |                     |      |       |
       |                     +--------------+
       |                            |
       +----------------------------+
<------------------50-residues------------------>

'C': conserved cysteine involved in a disulfide bond.

We developed a profile that covers the whole structure of the WAP-type 'four-disulfide core' domain.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Expert(s) to contact by email: Claverie J.-M.; [email protected]
-Last update: July 2008 / Pattern removed, profile added and text revised.

[ 1] Hennighausen L.G., Sippel A.E.
     "Mouse whey acidic protein is a novel member of the family of 'four-disulfide core' proteins."
     Nucleic Acids Res. 10:2677-2684(1982).
     PubMed=6896234
[ 2] Wiedow O., Schroeder J.-M., Gregory H., Young J.A., Christophers E.
     "Elafin: an elastase-specific inhibitor of human skin.
Purification, characterization, and complete amino acid sequence." J. Biol. Chem. 265:14791-14795(1990). PubMed=2394696 [ 3] Francart C., Dauchez M., Alix A.J., Lippens G. "Solution structure of R-elafin, a specific inhibitor of elastase." J. Mol. Biol. 268:666-677(1997). PubMed=9171290; DOI=10.1006/jmbi.1997.0983 [ 4] Araki K., Kuwada M., Ito O., Kuroki J., Tachibana S. "Four disulfide bonds' allocation of Na+, K(+)-ATPase inhibitor (SPAI)." Biochem. Biophys. Res. Commun. 172:42-46(1990). PubMed=2171523 [ 5] Legouis R., Hardelin J.-P., Levilliers J., Claverie J.-M., Compain S., Wunderle V., Millasseau P., Le Paslier D., Cohen D., Caterina D. Bougueleret L., Delemarre-Van de Waal H., Lutfalla G., Weissenbach J., Petit C. "The candidate gene for the X-linked Kallmann syndrome encodes a protein related to adhesion molecules." Cell 67:423-435(1991). PubMed=1913827 [ 6] Hu Y., Sun Z., Eaton J.T., Bouloux P.M., Perkins S.J. "Extended and flexible domain solution structure of the extracellular matrix protein anosmin-1 by X-ray scattering, analytical ultracentrifugation and constrained modelling." J. Mol. Biol. 350:553-570(2005). PubMed=15949815; DOI=10.1016/j.jmb.2005.04.031 [ 7] Simpson K.J., Ranganathan S., Fisher J.A., Janssens P.A., Shaw D.C., Nicholas K.R. "The gene for a novel member of the whey acidic protein family encodes three four-disulfide core domains and is asynchronously expressed during lactation." J. Biol. Chem. 275:23074-23081(2000). PubMed=10801834; DOI=10.1074/jbc.M002161200 [ 8] Nair D.G., Fry B.G., Alewood P., Kumar P.P., Kini R.M. "Antimicrobial activity of omwaprin, a new member of the waprin family of snake venom proteins." Biochem. J. 402:93-104(2007). PubMed=17044815; DOI=10.1042/BJ20060318 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). 
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00027}
{PS00027; HOMEOBOX_1}
{PS50071; HOMEOBOX_2}
{BEGIN}
*******************************************
* 'Homeobox' domain signature and profile *
*******************************************

The 'homeobox' is a protein domain of 60 amino acids [1 to 5,E1] first identified in a number of Drosophila homeotic and segmentation proteins. It has since been found to be extremely well conserved in many other animals, including vertebrates. This domain binds DNA through a helix-turn-helix type of structure. Some of the proteins which contain a homeobox domain play an important role in development. Most of these proteins are known to be sequence-specific DNA-binding transcription factors. The homeobox domain has also been found to be very similar to a region of the yeast mating type proteins. These are sequence-specific DNA-binding proteins that act as master switches in yeast differentiation by controlling gene expression in a cell type-specific fashion.

A schematic representation of the homeobox domain is shown below. The helix-turn-helix region is shown by the symbols 'H' (for helix), and 't' (for turn).

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxHHHHHHHHtttHHHHHHHHHxxxxxxxxxx
|        |         |         |         |         |         |
1        10        20        30        40        50        60

The pattern we developed to detect homeobox sequences is 24 residues long and spans positions 34 to 57 of the homeobox domain.

-Consensus pattern: [LIVMFYG]-[ASLVR]-x(2)-[LIVMSTACN]-x-[LIVM]-{Y}-x(2)-{L}-[LIV]-[RKNQESTAIY]-[LIVFSTNKH]-W-[FYVC]-x-[NDQTAH]-x(5)-[RKNAIMW]
-Sequences known to belong to this class detected by the pattern: ALL, except for 10 sequences.
-Other sequence(s) detected in Swiss-Prot: 9. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: Proteins which contain a homeobox domain can be classified, on the basis of their sequence characteristics, into various subfamilies. We have developed specific patterns for conserved elements of the antennapedia, engrailed and paired families. -Expert(s) to contact by email: Buerglin T.R.; [email protected] -Last update: April 2006 / Pattern revised. [ 1] Gehring W.J. (In) Guidebook to the homebox genes, Duboule D., Ed., pp1-10, Oxford University Press, Oxford, (1994). [ 2] Buerglin T.R. (In) Guidebook to the homebox genes, Duboule D., Ed., pp25-72, Oxford University Press, Oxford, (1994). [ 3] Gehring W.J. Trends Biochem. Sci. 17:277-280(1992). [ 4] Gehring W.J., Hiromi Y. "Homeotic genes and the homeobox." Annu. Rev. Genet. 20:147-173(1986). PubMed=2880555; DOI=10.1146/annurev.ge.20.120186.001051 [ 5] Schofield P.N. Trends Neurosci. 10:3-6(1987). [E1] http://www.biosci.ki.se/groups/tbu/homeo.html +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+
{END}
{PDOC00028}
{PS00028; ZINC_FINGER_C2H2_1}
{PS50157; ZINC_FINGER_C2H2_2}
{BEGIN}
******************************************************
* Zinc finger C2H2-type domain signature and profile *
******************************************************

'Zinc finger' domains [1-5] are nucleic acid-binding protein structures first identified in the Xenopus transcription factor TFIIIA. These domains have since been found in numerous nucleic acid-binding proteins. A zinc finger domain is composed of 25 to 30 amino-acid residues. There are two cysteine or histidine residues at both extremities of the domain, which are involved in the tetrahedral coordination of a zinc atom. It has been proposed that such a domain interacts with about five nucleotides. A schematic representation of a zinc finger domain is shown below:

[Schematic lost in extraction: a loop of residues (x) closed at its base by two cysteines (C) and two histidines (H) that coordinate a central zinc atom (Zn).]

Many classes of zinc fingers are characterized according to the number and positions of the histidine and cysteine residues involved in the zinc atom coordination. In the first class to be characterized, called C2H2, the first pair of zinc-coordinating residues are cysteines, while the second pair are histidines. A number of experimental reports have demonstrated the zinc-dependent DNA or RNA binding property of some members of this class.

Some of the proteins known to include C2H2-type zinc fingers are listed below. We have indicated, between brackets, the number of zinc finger regions found in each of these proteins; a '+' symbol indicates that only partial sequence data is available and that additional finger domains may be present.

- Saccharomyces cerevisiae: ACE2 (3), ADR1 (2), AZF1 (4), FZF1 (5), MIG1 (2), MSN2 (2), MSN4 (2), RGM1 (2), RIM1 (3), RME1 (3), SFP1 (2), SSL1 (1), STP1 (3), SWI5 (3), VAC1 (1) and ZMS1 (2).
- Emericella nidulans: brlA (2), creA (2).
- Drosophila: AEF-1 (4), Cf2 (7), ci-D (5), Disconnected (2), Escargot (5), Glass (5), Hunchback (6), Kruppel (5), Kruppel-H (4+), Odd-skipped (4), Odd-paired (4), Pep (3), Snail (5), Spalt-major (7), Serendipity locus beta (6), delta (7), h-1 (8), Suppressor of hairy wing su(Hw) (12), Suppressor of variegation suvar(3)7 (5), Teashirt (3) and Tramtrack (2).
- Xenopus: transcription factor TFIIIA (9), p43 from RNP particle (9), Xfin (37 !!), Xsna (5), gastrula XlcGF5.1 to XlcGF71.1 (from 4+ to 11+), Oocyte XlcOF2 to XlcOF22 (from 7 to 12).
- Mammalian: basonuclin (6), BCL-6/LAZ-3 (6), erythroid krueppel-like transcription factor (3), transcription factors Sp1 (3), Sp2 (3), Sp3 (3) and Sp4 (3), transcriptional repressor YY1 (4), Wilms' tumor protein (4), EGR1/Krox24 (3), EGR2/Krox20 (3), EGR3/Pilot (3), EGR4/AT133 (4), Evi-1 (10), GLI1 (5), GLI2 (4+), GLI3 (3+), HIV-EP1/ZNF40 (4), HIV-EP2 (2), KR1 (9+), KR2 (9), KR3 (15+), KR4 (14+), KR5 (11+), HF.12 (6+), REX-1 (4), ZfX (13), ZfY (13), Zfp-35 (18), ZNF7 (15), ZNF8 (7), ZNF35 (10), ZNF42/MZF-1 (13), ZNF43 (22), ZNF46/Kup (2), ZNF76 (7), ZNF91 (36), ZNF133 (3).

In addition to the conserved zinc ligand residues it has been shown [6] that a number of other positions are also important for the structural integrity of the C2H2 zinc fingers. The best conserved position is found four residues after the second cysteine; it is generally an aromatic or aliphatic residue. A profile was also developed that spans the whole domain.

-Consensus pattern: C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H [The 2 C's and the 2 H's are zinc ligands]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 42.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: 2.
-Note: In proteins that include many copies of the C2H2 zinc finger domain, incomplete or degenerate copies of the domain are frequently found.
The former are generally found at the extremity of the zinc finger region(s); the latter have typically lost one or more of the zinc-coordinating residues or are interrupted by insertions or deletions. Our pattern does not detect any of these finger domains. -Expert(s) to contact by email: Becker K.G.; [email protected] -Last update: May 2004 / Text revised. [ 1] Klug A., Rhodes D. Trends Biochem. Sci. 12:464-469(1987). [ 2] Evans R.M., Hollenberg S.M. "Zinc fingers: gilt by association." Cell 52:1-3(1988). PubMed=3125980 [ 3] Payre F., Vincent A. "Finger proteins and DNA-specific recognition: distinct patterns of conserved amino acids suggest different evolutionary modes." FEBS Lett. 234:245-250(1988). PubMed=3292287 [ 4] Miller J., McLachlan A.D., Klug A. "Repetitive zinc-binding domains in the protein transcription factor IIIA from Xenopus oocytes." EMBO J. 4:1609-1614(1985). PubMed=4040853 [ 5] Berg J.M. "Proposed structure for the zinc-binding domains from transcription factor IIIA and related proteins." Proc. Natl. Acad. Sci. U.S.A. 85:99-102(1988). PubMed=3124104 [ 6] Rosenfeld R., Margalit H. "Zinc fingers: conserved properties that can distinguish between spurious and actual DNA-binding motifs." J. Biomol. Struct. Dyn. 11:557-570(1993). PubMed=8129873 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+
{END}
{PDOC00029}
{PS00029; LEUCINE_ZIPPER}
{BEGIN}
**************************
* Leucine zipper pattern *
**************************

A structure, referred to as the 'leucine zipper' [1,2], has been proposed to explain how some eukaryotic gene regulatory proteins work. The leucine zipper consists of a periodic repetition of leucine residues at every seventh position over a distance covering eight helical turns. The segments containing these periodic arrays of leucine residues seem to exist in an alpha-helical conformation. The leucine side chains extending from one alpha-helix interact with those from a similar alpha-helix of a second polypeptide, facilitating dimerization; the structure formed by cooperation of these two regions forms a coiled coil [3].

The leucine zipper pattern is present in many gene regulatory proteins, such as:

- The CCATT-box and enhancer binding protein (C/EBP).
- The cAMP response element (CRE) binding proteins (CREB, CRE-BP1, ATFs).
- The Jun/AP1 family of transcription factors.
- The yeast general control protein GCN4.
- The fos oncogene, and the fos-related proteins fra-1 and fos B.
- The C-myc, L-myc and N-myc oncogenes.
- The octamer-binding transcription factor 2 (Oct-2/OTF-2).

-Consensus pattern: L-x(6)-L-x(6)-L-x(6)-L
-Sequences known to belong to this class detected by the pattern: All those mentioned in the original paper, with the exception of L-myc which has a Met instead of the second Leu.
-Other sequence(s) detected in Swiss-Prot: some 600 other sequences from every category of protein families.
-Note: As this is far from being a specific pattern, you should be cautious in citing the presence of such a pattern in a protein if it has not been shown to be a nuclear DNA-binding protein.
-Last update: December 1992 / Text revised.

[ 1] Landschulz W.H., Johnson P.F., McKnight S.L.
"The leucine zipper: a hypothetical structure common to a new class of DNA binding proteins." Science 240:1759-1764(1988). PubMed=3289117 [ 2] Busch S.J., Sassone-Corsi P. "Dimers, leucine zippers and DNA-binding domains." Trends Genet. 6:36-40(1990). PubMed=2186528 [ 3] O'Shea E.K., Rutkowski R., Kim P.S. Science 243:538-542(1989). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00030} {PS50102; RRM} {BEGIN} ******************************************** * Eukaryotic RNA recognition motif profile * ******************************************** Many eukaryotic proteins that are known or supposed to bind singlestranded RNA contain one or more copies of a putative RNA-binding domain of about 90 amino acids [1,2]. This domain is known as the RNA recognition motif (RRM). This region has been found in the following proteins: ** Heterogeneous nuclear ribonucleoproteins ** - hnRNP A1 (helix destabilizing protein) (twice). - hnRNP A2/B1 (twice). - hnRNP C (C1/C2) (once). - hnRNP E (UP2) (at least once). - hnRNP G (once). ** Small nuclear ribonucleoproteins ** - U1 snRNP 70 Kd (once). - U1 snRNP A (once). - U2 snRNP B'' (once). ** Pre-RNA and mRNA associated proteins ** - Protein synthesis initiation factor 4B (eIF-4B) [3], a protein essential for the binding of mRNA to ribosomes (once). - Nucleolin (4 times). - Yeast single-stranded nucleic acid-binding protein (gene SSB1) (once). - Yeast protein NSR1 (twice). 
NSR1 is involved in pre-rRNA processing; it specifically binds nuclear localization sequences.
- Poly(A) binding protein (PABP) (4 times).

** Others **

- Drosophila sex determination protein Sex-lethal (Sxl) (twice).
- Drosophila sex determination protein Transformer-2 (Tra-2) (once).
- Drosophila 'elav' protein (3 times), which is probably involved in the RNA metabolism of neurons.
- Human paraneoplastic encephalomyelitis antigen HuD (3 times) [4], which is highly similar to elav and which may play a role in neuron-specific RNA processing.
- Drosophila 'bicoid' protein (once) [5], a segment-polarity homeobox protein that may also bind to specific mRNAs.
- La antigen (once), a protein which may play a role in the transcription of RNA polymerase III.
- The 60 Kd Ro protein (once), a putative RNP complex protein.
- A maize protein induced by abscisic acid in response to water stress, which seems to be an RNA-binding protein.
- Three tobacco proteins, located in the chloroplast [6], which may be involved in splicing and/or processing of chloroplast RNAs (twice).
- X16 [7], a mammalian protein which may be involved in RNA processing in relation with cellular proliferation and/or maturation.
- Insulin-induced growth response protein Cl-4 from rat (twice).
- Nucleolysins TIA-1 and TIAR (3 times) [8], which possess nucleolytic activity against cytotoxic lymphocyte target cells and may be involved in apoptosis.
- Yeast RNA15 protein, which plays a role in mRNA stability and/or poly-(A) tail length [9].

Inside the RRM there are two regions which are highly conserved. The first one is a hydrophobic segment of six residues (which is called the RNP-2 motif), the second one is an octapeptide motif (which is called RNP-1 or RNPCS). The position of both motifs in the domain is shown in the following schematic representation:

xxxxxxx######xxxxxxxxxxxxxxxxxxxxxxxxxxxxx########xxxxxxxxxxxxxxxxxxxxxxxxx
       RNP-2                              RNP-1

We have developed a profile that spans the RRM domain.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: August 2004 / Text revised; pattern deleted.

[ 1] Bandziulis R.J., Swanson M.S., Dreyfuss G. "RNA-binding proteins as developmental regulators." Genes Dev. 3:431-437(1989). PubMed=2470643
[ 2] Dreyfuss G., Swanson M.S., Pinol-Roma S. "Heterogeneous nuclear ribonucleoprotein particles and the pathway of mRNA formation." Trends Biochem. Sci. 13:86-91(1988). PubMed=3072706
[ 3] Milburn S.C., Hershey J.W.B., Davies M.V., Kelleher K., Kaufman R.J. "Cloning and expression of eukaryotic initiation factor 4B cDNA: sequence determination identifies a common RNA recognition motif." EMBO J. 9:2783-2790(1990). PubMed=2390971
[ 4] Szabo A., Dalmau J., Manley G., Rosenfeld M., Wong E., Henson J., Posner J.B., Furneaux H.M. "HuD, a paraneoplastic encephalomyelitis antigen, contains RNA-binding domains and is homologous to Elav and Sex-lethal." Cell 67:325-333(1991). PubMed=1655278
[ 5] Rebagliati M. "An RNA recognition motif in the bicoid protein." Cell 58:231-232(1989). PubMed=2752425
[ 6] Li Y.Q., Sugiura M. "Three distinct ribonucleoproteins from tobacco chloroplasts: each contains a unique amino terminal acidic domain and two ribonucleoprotein consensus motifs." EMBO J. 9:3059-3066(1990). PubMed=1698606
[ 7] Ayane M., Preuss U., Koehler G., Nielsen P.J. "A differentially expressed murine RNA encoding a protein with similarities to two types of nucleic acid binding motifs." Nucleic Acids Res. 19:1273-1278(1991). PubMed=2030943
[ 8] Kawakami A., Tian Q., Duan X., Streuli M., Schlossman S.F., Anderson P. "Identification and functional characterization of a TIA-1-related nucleolysin." Proc. Natl. Acad. Sci. U.S.A. 89:8681-8685(1992). PubMed=1326761
[ 9] Minvielle-Sebastia L., Winsor B., Bonneaud N., Lacroute F. Mol. Cell. Biol. 11:3075-3087(1991).
+-----------------------------------------------------------------------+
PROSITE is copyright.
It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00031} {PS00031; NUCLEAR_REC_DBD_1} {PS51030; NUCLEAR_REC_DBD_2} {BEGIN} ********************************************************************** * Nuclear hormone receptors DNA-binding domain signature and profile * ********************************************************************** Nuclear hormone receptors are ligand-activated transcription factors that regulate gene expression by interacting with specific DNA sequences upstream of their target genes. In vertebrates, these proteins regulate diverse biological processes such as pattern formation, cellular differentiation and homeostasis [1 to 6]. Classical nuclear hormone receptors contain two conserved regions, the hormone binding domain and a DNA-binding domain (DBD) that is composed of two C4-type zinc fingers. The DBD is responsible for targeting the receptors to their hormone response elements (HRE). It binds as a dimer with each monomer recognizing a six base pair sequence of DNA. The vast majority of targets contain the same 5'-AGGTCA-3' consensus sequence [7]. In some cases a less conserved C-terminal extension of the core DBD confers the DNA selectivity [8]. The two zinc fingers fold to form a single structural domain (see <PDB:1HCQ>) [9,10]. The structure consists of two helices perpendicular to each other. A zinc ion, coordinated by four conserved cysteines, holds the base of a loop at the N terminus of each helix. The helix of each monomer makes sequence specific contacts in the major groove of the DNA. 
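The half-site recognition described above can be illustrated with a minimal sketch that scans a DNA string for the 5'-AGGTCA-3' consensus on both strands. The function names and the test sequence are hypothetical and chosen for illustration only; real response elements vary in spacing, orientation and sequence, none of which is modelled here.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def find_half_sites(dna, half_site="AGGTCA"):
    """List (position, strand) for each exact half-site occurrence.

    Positions are 0-based offsets on the given (forward) strand.
    A '-' strand hit means the reverse complement of the half-site
    was found at that offset.
    """
    dna = dna.upper()
    rc_site = reverse_complement(half_site)
    width = len(half_site)
    hits = []
    for i in range(len(dna) - width + 1):
        window = dna[i:i + width]
        if window == half_site:
            hits.append((i, "+"))
        if window == rc_site:
            hits.append((i, "-"))
    return hits


# Hypothetical test sequence: two half-sites in opposite orientation
# separated by three arbitrary bases.
print(find_half_sites("ccAGGTCAnnnTGACCTgg"))
```

Two hits in opposite orientation, as in the test sequence, are what one would look for when asking whether a site could accommodate a dimer; the sketch only reports positions and leaves that interpretation to the reader.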
Proteins known to contain a nuclear hormone receptor DNA-binding domain are listed below:

- Androgen receptor (AR).
- Estrogen receptor (ER).
- Glucocorticoid receptor (GR).
- Mineralocorticoid receptor (MR).
- Progesterone receptor (PR).
- Retinoic acid receptors (RARs and RXRs).
- Thyroid hormone receptors (TR) alpha and beta.
- The avian erythroblastosis virus oncogene v-erbA, derived from a cellular thyroid hormone receptor.
- Vitamin D3 receptor (VDR).
- Insects ecdysone receptor (EcR).
- COUP transcription factor (also known as ear-3), and its Drosophila homolog seven-up (svp).
- Hepatocyte nuclear factor 4 (HNF-4), which binds to DNA sites required for the transcription of the genes for alpha-1-antitrypsin, apolipoprotein CIII and transthyretin.
- Ad4BP, a protein that binds to the Ad4 site found in the promoter region of steroidogenic P450 genes.
- Apolipoprotein AI regulatory protein-1 (ARP-1), required for the transcription of apolipoprotein AI.
- Peroxisome proliferator activated receptors (PPAR), transcription factors specifically activated by peroxisome proliferators. They control the peroxisomal beta-oxidation pathway of fatty acids by activating the gene for acyl-CoA oxidase.
- Drosophila protein knirps (kni), a zygotic gap protein required for abdominal segmentation of the Drosophila embryo.
- Drosophila protein ultraspiracle (usp) (or chorion factor 1), which binds to the promoter region of s15 chorion gene.
- Human estrogen receptor related genes 1 and 2 (err1 and err2).
- Human erbA related gene 2 (ear-2).
- Mammalian NGFI-B (NAK1, nur/77, N10).
- Mammalian NOT/nurR1/RNR-1.
- Drosophila protein embryonic gonad (egon).
- Drosophila knirps-related protein (knrl).
- Drosophila protein tailless (tll).
- Drosophila 20-oh-ecdysone regulated protein E75.
- Insects Hr3.
- Insects Hr38.
- Caenorhabditis elegans cnr-8, cnr-14, and odr-7.
- Caenorhabditis elegans hypothetical proteins B0280.8, EO2H1.7 and K06A1.4.
As a signature pattern for this family of proteins, we took the most conserved residues, the first 27, of the DNA-binding domain. We also developed a profile that spans the whole domain.

-Consensus pattern: C-x(2)-C-x(1,2)-[DENAVSPHKQT]-x(5,6)-[HNY]-[FY]-x(4)-C-x(2)-C-x(2)-F(2)-x-R [The 4 C's are zinc ligands]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Gronemeyer H., Laudet V. Protein Prof. 2:1173-1308(1995).
[ 2] Evans R.M. "The steroid and thyroid hormone receptor superfamily." Science 240:889-895(1988). PubMed=3283939
[ 3] Gehring U. Trends Biochem. Sci. 12:399-402(1987).
[ 4] Beato M. "Gene regulation by steroid hormones." Cell 56:335-344(1989). PubMed=2644044
[ 5] Segraves W.A. "Something old, some things new: the steroid receptor superfamily in Drosophila." Cell 67:225-228(1991). PubMed=1913821
[ 6] Laudet V., Haenni C., Coll J., Catzeflis F., Stehelin D. "Evolution of the nuclear receptor gene superfamily." EMBO J. 11:1003-1013(1992). PubMed=1312460
[ 7] Stunnenberg H.G. "Mechanisms of transactivation by retinoic acid receptors." BioEssays 15:309-315(1993). PubMed=8393666
[ 8] Zhao Q., Khorasanizadeh S., Miyoshi Y., Lazar M.A., Rastinejad F. "Structural elements of an orphan nuclear receptor-DNA complex." Mol. Cell 1:849-861(1998). PubMed=9660968
[ 9] Schwabe J.W.R., Neuhaus D., Rhodes D. "Solution structure of the DNA-binding domain of the oestrogen receptor." Nature 348:458-461(1990). PubMed=2247153; DOI=10.1038/348458a0
[10] Schwabe J.W.R., Chapman L., Finch J.T., Rhodes D. Cell 75:567-578(1993).
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB).
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00032}
{PS00032; ANTENNAPEDIA}
{BEGIN}
**************************************************
* 'Homeobox' antennapedia-type protein signature *
**************************************************

The homeotic Hox proteins are sequence-specific transcription factors. They are part of a developmental regulatory system that provides cells with specific positional identities on the anterior-posterior (A-P) axis [1]. The Hox proteins contain a 'homeobox' domain. In Drosophila and other insects, there are eight different Hox genes that are encoded in two gene complexes, ANT-C and BX-C. In vertebrates there are 38 genes organized in four complexes. In six of the eight Drosophila Hox genes the homeobox domain is highly similar and a conserved hexapeptide is found five to sixteen amino acids upstream of the homeobox domain. The six Drosophila proteins that belong to this group are antennapedia (Antp), abdominal-A (abd-A), deformed (Dfd), proboscipedia (pb), sex combs reduced (scr) and ultrabithorax (ubx) and are collectively known as the 'antennapedia' subfamily.

In vertebrates the corresponding Hox genes are known [2] as Hox-A2, A3, A4, A5, A6, A7, Hox-B1, B2, B3, B4, B5, B6, B7, B8, Hox-C4, C5, C6, C8, Hox-D1, D3, D4 and D8. Caenorhabditis elegans lin-39 and mab-5 are also members of the 'antennapedia' subfamily.

As a signature pattern for this subfamily of homeobox proteins, we have used the conserved hexapeptide.

-Consensus pattern: [LIVMFE]-[FY]-P-W-M-[KRQTA]
-Sequences known to belong to this class detected by the pattern: ALL, except for 6 sequences.
-Other sequence(s) detected in Swiss-Prot: 3. -Note: Arg and Lys are most frequently found in the last position of the hexapeptide; other amino acids are found in only a few cases. -Last update: June 1994 / Text revised. [ 1] McGinnis W., Krumlauf R. "Homeobox genes and axial patterning." Cell 68:283-302(1992). PubMed=1346368 [ 2] Scott M.P. "Vertebrate homeobox gene nomenclature." Cell 71:551-553(1992). PubMed=1358459 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00033} {PS00033; ENGRAILED} {BEGIN} *********************************************** * 'Homeobox' engrailed-type protein signature * *********************************************** Most proteins which contain a 'homeobox' domain can be classified [1,2], on the basis of their sequence characteristics, in three subfamilies: engrailed, antennapedia and paired. Proteins currently known to belong to the engrailed subfamily are: - Drosophila segmentation polarity protein engrailed (en) which specifies the body segmentation pattern and is required for the development of the central nervous system. - Drosophila invected protein (inv). - Silk moth proteins engrailed and invected, which may be involved in the compartmentalization of the silk gland. - Honeybee E30 and E60. - Grasshopper (Schistocerca americana) G-En. - Mammalian and birds En-1 and En-2. - Zebrafish Eng-1, -2 and -3. - Sea urchin (Tripneusteas gratilla) SU-HB-en. - Leech (Helobdella triserialis) Ht-En. - Caenorhabditis elegans ceh-16. 
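The consensus patterns quoted in these entries, such as the antennapedia hexapeptide [LIVMFE]-[FY]-P-W-M-[KRQTA], use the PROSITE pattern syntax: elements separated by '-', 'x' for any residue, [..] for allowed residues, {..} for excluded residues, and (n) or (n,m) for repeat counts. A minimal sketch of how such a pattern could be translated into a Python regular expression follows. The function name prosite_to_regex and the test peptide are hypothetical, and the sketch deliberately ignores the '<' and '>' terminus anchors, which do not occur in the patterns shown here.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regex string.

    Handles single residues, x, [ABC], {ABC} and repeat counts such as
    x(2) or x(2,4). Terminus anchors ('<', '>') are not handled.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Split an element into its core and an optional repeat count.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"    # excluded residues
        # [ABC] and single letters are already valid regex fragments.
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


hexapeptide = prosite_to_regex("[LIVMFE]-[FY]-P-W-M-[KRQTA]")
print(hexapeptide)

# Hypothetical peptide used only to exercise the regex.
hit = re.search(hexapeptide, "AAAIYPWMKAAA")
print(hit.start(), hit.group())
```

A regex match only says that the residues fit the pattern; as the notes in several entries point out, short patterns can also match unrelated sequences.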
Engrailed homeobox proteins are characterized by the presence of a conserved region of some 20 amino-acid residues located at the C-terminal of the 'homeobox' domain. As a signature pattern for this subfamily of proteins, we have used a stretch of eight perfectly conserved residues in this region.

-Consensus pattern: L-M-A-[EQ]-G-L-Y-N
-Sequences known to belong to this class detected by the pattern: ALL, except for ceh-16.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: July 1999 / Pattern and text revised.

[ 1] Scott M.P., Tamkun J.W., Hartzell G.W. III "The structure and function of the homeodomain." Biochim. Biophys. Acta 989:25-48(1989). PubMed=2568852
[ 2] Gehring W.J. "Homeo boxes in the study of development." Science 236:1245-1252(1987). PubMed=2884726
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00034}
{PS00034; PAIRED_1}
{PS51057; PAIRED_2}
{BEGIN}
***************************************
* Paired domain signature and profile *
***************************************

The paired domain is a ~126 amino acid DNA-binding domain, which is found in eukaryotic transcription regulatory proteins involved in embryogenesis. The domain was originally described as the 'paired box' in the Drosophila protein paired (prd) [1,2]. The paired domain is generally located in the N-terminal part. An octapeptide [3] and/or a homeodomain (see <PDOC00027>) can occur C-terminal to the paired domain, as well as a Pro-Ser-Thr-rich C-terminus.
Paired domain proteins can function as transcription repressors or activators. The paired domain contains three subdomains, which show functional differences in DNA-binding. The crystal structures of prd and Pax proteins show that the DNA-bound paired domain is bipartite, consisting of an N-terminal subdomain (PAI or NTD) and a C-terminal subdomain (RED or CTD), connected by a linker (see <PDB:1K78>). PAI and RED each form a three-helical fold, with the most C-terminal helices comprising a helix-turn-helix (HTH) motif that binds the DNA major groove. In addition, the PAI subdomain encompasses an N-terminal beta-turn and beta-hairpin, also named 'wing', participating in DNA-binding. The linker can bind into the DNA minor groove. Different Pax proteins and their alternatively spliced isoforms use different (sub)domains for DNA-binding to mediate the specificity of sequence recognition [4,5]. Some proteins known to contain a paired domain: - Drosophila paired (prd), a segmentation pair-rule class protein. - Drosophila gooseberry proximal (gsb-p) and gooseberry distal (gsb-d), segmentation polarity class proteins. - Drosophila Pox-meso and Pox-neuro proteins. The Pax proteins: - Mammalian protein Pax1, which may play a role in the formation of segmented structures in the embryo. In mouse, mutations in Pax1 produce the undulated phenotype, characterized by vertebral malformations along the entire rostro-caudal axis. - Mammalian protein Pax2, a probable transcription factor that may have a role in kidney cell differentiation. - Mammalian protein Pax3. Pax3 is expressed during early neurogenesis. In Man, defects in Pax3 are the cause of Waardenburg's syndrome (WS), an autosomal dominant combination of deafness and pigmentary disturbance. - Mammalian protein Pax5, also known as B-cell specific transcription factor (BSAP). Pax5 is involved in the regulation of the CD19 gene. 
It plays an important role in B-cell differentiation as well as neural development and spermatogenesis. - Mammalian protein Pax6 (oculorhombin). Pax6 is a transcription factor with important functions in eye and nasal development. In Man, defects in Pax6 are the cause of aniridia type II (AN2), an autosomal dominant disorder characterized by complete or partial absence of the iris. - Mammalian protein Pax8, required in thyroid development. - Mammalian protein Pax9. In man, defects in Pax9 cause oligodontia. - Zebrafish proteins Pax[Zf-a] and Pax[Zf-b]. We use the region spanning positions 34 to 50 of the paired domain as a signature pattern. This conserved region spans the DNA-binding HTH located in the N-terminal subdomain. We also developed a profile that covers the entire paired domain, including the PAI and RED subdomains and which allows a more sensitive detection. -Consensus pattern: R-P-C-x(11)-C-V-S -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: January 2005 / Text revised; profile added. [ 1] Bopp D., Burri M., Baumgartner S., Frigerio G., Noll M. "Conservation of a large protein domain in the segmentation gene paired and in functionally related genes of Drosophila." Cell 47:1033-1040(1986). PubMed=2877747 [ 2] Baumgartner S., Bopp D., Burri M., Noll M. "Structure of two genes at the gooseberry locus related to the paired gene and their spatial expression during Drosophila embryogenesis." Genes Dev. 1:1247-1267(1987). PubMed=3123319 [ 3] Eberhard D., Jimenez G., Heavey B., Busslinger M. "Transcriptional repression by Pax5 (BSAP) through interaction with corepressors of the Groucho family." EMBO J. 19:2292-2303(2000). PubMed=10811620; DOI=10.1093/emboj/19.10.2292 [ 4] Underhill D.A. "Genetic and biochemical diversity in the Pax gene family." Biochem. 
Cell Biol. 78:629-638(2000). PubMed=11103953 [ 5] Apuzzo S., Abdelhakim A., Fortin A.S., Gros P. "Cross-talk between the paired domain and the homeodomain of Pax3: DNA binding by each domain causes a structural change in the other domain, supporting interdependence for DNA Binding." J. Biol. Chem. 279:33601-33612(2004). PubMed=15148315; DOI=10.1074/jbc.M402949200 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00035} {PS00035; POU_1} {PS00465; POU_2} {PS51179; POU_3} {BEGIN} ***************************************************** * POU-specific (POUs) domain signatures and profile * ***************************************************** The POU (pronounced 'pow') domain [1 to 7] is a highly charged 155-162 amino acid region of sequence similarity which has been identified in the three mammalian transcription factors Pit-1, Oct-1, and Oct-2 and in the product of the nematode gene unc-86. The POU domain is a bipartite DNA binding protein module that binds selectively to the DNA octamer motif ATGCAAAT and a subset of derivatives. It consists of two subdomains, a C-terminal homeodomain (POUh) (see <PDOC00027>) and an N-terminal 75- to 82-residue POU-specific (POUs) region separated by a short non-conserved linker. The POU-specific region or 'box' can be subdivided further into two highly conserved regions, A and B, separated by a less highly conserved segment. 
The POUs domain is always found in association with a POUh domain, and both are required for high affinity and sequence-specific DNA binding. The POUs domain consists of four alpha helices packed to enclose an extensive hydrophobic core (see <PDB:1POU>). The POUs domain contains an unusual HTH structure, which differs from the canonical HTH motif in the length of the first alpha helix and the turn. The region of hypervariability located between subdomains A and B lies within the sequence corresponding to the C-terminal end of helix 2 and the linker between helices 2 and 3. In the model of the POUs-DNA complex, the C-terminus of helix 2 and the turn of the HTH motif project away from the DNA such that sequence variability in this region can be accommodated without adversely affecting DNA binding [8]. Some proteins currently known to contain a POUs domain are listed below: - Oct-1 (or OTF-1, NF-A1) (gene POU2F1), a transcription factor for small nuclear RNA and histone H2B genes. - Oct-2 (or OTF-2, NF-A2) (gene POU2F2), a transcription factor that specifically binds to the immunoglobulin promoters octamer motif and activates these genes. - Oct-3 (or Oct-4, NF-A3) (gene POU5F1), a transcription factor that also binds to the octamer motif. - Oct-6 (or OTF-6, SCIP) (gene POU3F1), an octamer-binding transcription factor thought to be involved in early embryogenesis and neurogenesis. - Oct-7 (or N-Oct 3, OTF-7, Brn-2) (gene POU3F2), a nervous-system specific octamer-binding transcription factor. - Oct-11 (or OTF-11) (gene POU2F3), an octamer-binding transcription factor. - Pit-1 (or GHF-1) (gene POU1F1), a transcription factor that activates growth hormone and prolactin genes. - Brn-1 (or OTF-8) (gene POU3F3). - Brn-3A (or RDC-1) (gene POU4F1), a probable transcription factor that may play a role in neuronal tissue differentiation. 
- Brn-3B (gene POU4F2), a probable transcription factor that may play a role in determining or maintaining the identities of a small subset of visual system neurons. - Brn-3C (gene POU4F3). - Brn-4 (or OTF-9) (gene POU3F4), a probable transcription factor which exerts its primary action widely during early neural development and in a very limited set of neurons in the mature brain. - Mpou (or Brn-5, Emb) (gene POU6F1), a transcription factor that binds preferentially to a variant of the octamer motif. - Skn, that activates cytokeratin 10 (k10) gene expression. - Sprm-1, a transcription factor that binds preferentially to the octamer motif and that may exert a regulatory function in meiotic events that are required for terminal differentiation of male germ cells. - Unc-86, a Caenorhabditis elegans transcription factor involved in cell lineage and differentiation. - Cf1-a, a Drosophila neuron-specific transcription factor necessary for the expression of the dopa decarboxylase gene (Ddc). - I-POU, a Drosophila protein that forms a stable heterodimeric complex with Cf1-a and inhibits its action. - Drosophila protein nubbin/twain (PDM-1 or DPou-19). - Drosophila protein didymous (PDM-2 or DPou-28) that may play multiple roles during development. - Bombyx mori silk gland factor 3 (SGF-3). - Xenopus proteins Pou1, Pou2, and Pou3. - Zebrafish proteins Pou1, Pou2, Pou[C], ZP-12, ZP-23, ZP-47 and ZP-50. - Caenorhabditis elegans protein ceh-6. - Caenorhabditis elegans protein ceh-18. We have derived two signature patterns for the 'POU' domain. The first one spans positions 15 to 27 of the domain, the second positions 42 to 55. We have also developed a profile which covers the entire POUs domain. -Consensus pattern: [RKQ]-R-[LIM]-x-[LF]-G-[LIVMFY]-x-Q-x-[DNQ]-V-G -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. 
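The consensus patterns in this file use the PROSITE pattern syntax: elements separated by '-', 'x' for any residue, '[...]' for allowed residues, '{...}' for forbidden residues and '(n)' or '(n,m)' for repeat counts. The following is a minimal sketch of how such a pattern could be translated into a Python regular expression. The function name prosite_to_regex and the test peptide are hypothetical and only illustrative; the sketch assumes well-formed patterns and does not handle the '<' and '>' anchors.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern into a Python regular expression.

    Hypothetical helper for illustration; '<' and '>' anchors are not handled.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Separate the residue specification from an optional repeat count,
        # e.g. "x(11)" -> ("x", "11", None), "x(0,1)" -> ("x", "0", "1").
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)

        if core.lower() == "x":
            piece = "."                      # any residue
        elif core.startswith("{"):
            piece = "[^" + core[1:-1] + "]"  # forbidden residues
        else:
            piece = core                     # "[ABC]" or a single letter

        if low is not None and high is not None:
            piece += "{" + low + "," + high + "}"
        elif low is not None:
            piece += "{" + low + "}"
        parts.append(piece)
    return "".join(parts)


# First POU signature pattern quoted in this entry.
POU_1 = "[RKQ]-R-[LIM]-x-[LF]-G-[LIVMFY]-x-Q-x-[DNQ]-V-G"
pou_1_regex = prosite_to_regex(POU_1)

# Synthetic peptide made up for this example (not a database sequence).
hit = re.search(pou_1_regex, "AAKRIKLGFTQADVGAA")
print(pou_1_regex, hit.span() if hit else None)
```

A regular expression only reproduces the pattern matching step; the profiles mentioned in the entries are position-specific scoring tables and cannot be expressed this way.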
-Consensus pattern: S-Q-[STK]-[TA]-I-[SC]-R-[FH]-[ET]-x-[LSQ]-x(0,1)-[LIR]-[ST] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: January 2006 / Text revised; profile added. [ 1] Robertson M. "Homoeo boxes, POU proteins and the limits to promiscuity." Nature 336:522-524(1988). PubMed=2904652; DOI=10.1038/336522a0 [ 2] Sturm R.A., Herr W. "The POU domain is a bipartite DNA-binding structure." Nature 336:601-604(1988). PubMed=2904656; DOI=10.1038/336601a0 [ 3] Herr W., Sturm R.A., Clerc R.G., Corcoran L.M., Baltimore D., Sharp P.A., Ingraham H.A., Rosenfeld M.G., Finney M., Ruvkun G., Horvitz H.R. "The POU domain: a large conserved region in the mammalian pit-1, oct-1, oct-2, and Caenorhabditis elegans unc-86 gene products." Genes Dev. 2:1513-1516(1988). PubMed=3215510 [ 4] Levine M., Hoey T. "Homeobox proteins as sequence-specific transcription factors." Cell 55:537-540(1988). PubMed=2902929 [ 5] Rosenfeld M.G. "POU-domain transcription factors: pou-er-ful developmental regulators." Genes Dev. 5:897-907(1991). PubMed=2044958 [ 6] Schoeler H.R. Trends Genet. 7:323-329(1991). [ 7] Verrijzer C.P., Van der Vliet P.C. "POU domain transcription factors." Biochim. Biophys. Acta 1173:1-21(1993). PubMed=8485147 [ 8] Assa-Munt N., Mortishire-Smith R.J., Aurora R., Herr W., Wright P.E. "The solution structure of the Oct-1 POU-specific domain reveals a striking similarity to the bacteriophage lambda repressor DNA-binding domain." Cell 73:193-205(1993). PubMed=8462099 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. 
Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00036} {PS00036; BZIP_BASIC} {PS50217; BZIP} {BEGIN} ************************************************************ * Basic-leucine zipper (bZIP) domain signature and profile * ************************************************************ The bZIP superfamily [1,2] of eukaryotic DNA-binding transcription factors groups together proteins that contain a basic region mediating sequence-specific DNA-binding followed by a leucine zipper (see <PDOC00029>) required for dimerization. bZIP domains usually bind a palindromic 6 nucleotide site, but the specificity can be altered by interaction with accessory factors [3]. Several bZIP structures have been solved (see for example <PDB:1AN2>) [4]. The basic region and the leucine zipper form a contiguous alpha helix in which the four hydrophobic residues of the leucine zipper are oriented on one side. This conformation allows dimerization in parallel and it bends the helices so that the newly functional dimer forms a flexible fork where the basic domains, at the N-terminal open end, can then interact with DNA. The two leucine zippers are therefore oriented perpendicular to the DNA [4,5]. This family is quite large and we only list here some representative members. - Transcription factor AP-1, which binds selectively to enhancer elements in the cis control regions of SV40 and metallothionein IIA. AP-1, also known as c-jun, is the cellular homolog of the avian sarcoma virus 17 (ASV17) oncogene v-jun. - Jun-B and jun-D, probable transcription factors which are highly similar to jun/AP-1. - The fos protein, a proto-oncogene that forms a non-covalent dimer with c-jun. - The fos-related proteins fra-1, and fos B. 
- Mammalian cAMP response element (CRE) binding proteins CREB, CREM, ATF-1, ATF-3, ATF-4, ATF-5, ATF-6 and LRF-1. - Maize Opaque 2, a trans-acting transcriptional activator involved in the regulation of the production of zein proteins during endosperm development. - Arabidopsis G-box binding factors GBF1 to GBF4, Parsley CPRF-1 to CPRF-3, Tobacco TAF-1 and wheat EMBP-1. All these proteins bind the G-box promoter elements of many plant genes. - Drosophila protein Giant, which represses the expression of both the kruppel and knirps segmentation gap genes. - Drosophila Box B binding factor 2 (BBF-2), a transcriptional activator that binds to fat body-specific enhancers of alcohol dehydrogenase and yolk protein genes. - Drosophila segmentation protein cap'n'collar (gene cnc), which is involved in head morphogenesis. - Caenorhabditis elegans skn-1, a developmental protein involved in the fate of ventral blastomeres in the early embryo. - Yeast GCN4 transcription factor, a component of the general control system that regulates the expression of amino acid-synthesizing enzymes in response to amino acid starvation, and the related Neurospora crassa cpc-1 protein. - Neurospora crassa cys-3 which turns on the expression of structural genes which encode sulfur-catabolic enzymes. - Yeast MET28, a transcriptional activator of sulfur amino acids metabolism. - Yeast PDR4 (or YAP1), a transcriptional activator of the genes for some oxygen detoxification enzymes. - Epstein-Barr virus trans-activator protein BZLF1. The pattern we developed is directed against the basic region. We also developed a profile that covers the whole domain. -Consensus pattern: [KR]-x(1,3)-[RKSAQ]-N-{VL}-x-[SAQ](2)-{L}-[RKTAENQ]-x-R-{S}-[RK] -Sequences known to belong to this class detected by the pattern: the large majority. -Other sequence(s) detected in Swiss-Prot: 18. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. 
-Last update: April 2006 / Pattern revised. [ 1] Hurst H.C. Protein Prof. 2:105-168(1995). [ 2] Ellenberger T. Curr. Opin. Struct. Biol. 4:12-21(1994). [ 3] Baranger A.M. "Accessory factor-bZIP-DNA interactions." Curr. Opin. Chem. Biol. 2:18-23(1998). PubMed=9667910 [ 4] Ferre-D'amare A.R., Prendergast G.C., Ziff E.B., Burley S.K. Nature 363:38-45(1993). [ 5] Ellenberger T.E., Brandl C.J., Struhl K., Harrison S.C. "The GCN4 basic region leucine zipper binds DNA as a dimer of uninterrupted alpha helices: crystal structure of the protein-DNA complex." Cell 71:1223-1237(1992). PubMed=1473154 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00037} {PS50090; MYB_LIKE} {PS51294; HTH_MYB} {BEGIN} ******************************************** * Myb-type HTH DNA-binding domain profiles * ******************************************** The myb family can be classified into three groups: the myb-type HTH domain, which binds DNA, the SANT domain, which is a protein-protein interaction module (see <PDOC51293>) and the myb-like domain that can be involved in either of these functions. The myb-type HTH domain is a DNA-binding, helix-turn-helix (HTH) domain of ~55 amino acids, typically occurring in a tandem repeat in eukaryotic transcription factors. The domain is named after the retroviral oncogene v-myb, and its cellular counterpart c-myb, which encode nuclear DNA-binding proteins that specifically recognize the sequence YAAC(G/T)G [1,2]. 
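The recognition sequence YAAC(G/T)G is written with the IUPAC nucleotide code, in which Y stands for a pyrimidine (C or T), and with '(G/T)' marking an alternative at one position. As a rough sketch, such a consensus could be expanded into a regular expression and searched on both DNA strands as follows. The function names and the test sequences are hypothetical, chosen only for illustration, and the sketch assumes that alternatives in parentheses are plain single bases.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus: str) -> str:
    """Expand a consensus such as 'YAAC(G/T)G' into a regular expression."""
    out = []
    i = 0
    while i < len(consensus):
        if consensus[i] == "(":
            # "(G/T)" style alternative: assumed to hold single plain bases.
            end = consensus.index(")", i)
            out.append("[" + "".join(consensus[i + 1:end].split("/")) + "]")
            i = end + 1
        else:
            out.append(IUPAC[consensus[i]])
            i += 1
    return "".join(out)


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str, consensus: str):
    """Return (start, strand) for every site; start is 0-based on seq."""
    # A lookahead is used so that overlapping sites are also reported.
    regex = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    sites = [(m.start(), "+") for m in regex.finditer(seq)]
    for m in regex.finditer(reverse_complement(seq)):
        # Map the position on the reverse strand back to forward coordinates.
        start = len(seq) - m.start() - len(m.group(1))
        sites.append((start, "-"))
    return sorted(sites)


# Made-up test sequences, not taken from any genome.
print(consensus_to_regex("YAAC(G/T)G"))
print(find_sites("GGTAACGGTT", "YAAC(G/T)G"))
print(find_sites("AACCGTTAAA", "YAAC(G/T)G"))
```

Exact consensus matching of this kind ignores binding affinity and flanking context, so it should be read as a first-pass filter only.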
Myb proteins contain three tandem repeats of 51 to 53 amino acids, termed R1, R2 and R3. This repeat region is involved in DNA-binding and R2 and R3 bind directly to the DNA major groove. The major part of the first repeat is missing in retroviral v-Myb sequences and in plant myb-related (R2R3) proteins [3]. A single myb-type HTH DNA-binding domain occurs in TRF1 and TRF2. The 3D-structure of the myb-type HTH domain forms three alpha-helices (see <PDB:1H88; C>) [4]. The second and third helices connected via a turn comprise the helix-turn-helix motif. Helix 3 is termed the recognition helix as it binds the DNA major groove, like in other HTHs. Some proteins known to contain a myb-type HTH domain: - Fruit fly myb protein [2]. - Vertebrate myb-like proteins A-myb and B-myb. - Maize anthocyanin regulatory C1 protein, a trans-acting factor which controls the expression of genes involved in anthocyanin biosynthesis. - Maize P protein [5], a trans-acting factor which regulates the biosynthetic pathway of a flavonoid-derived pigment in certain floral tissues. - Arabidopsis thaliana protein GL1/GLABROUS1 [6], required for the initiation of differentiation of leaf hair cells (trichomes). - Maize and barley myb-related proteins Zm1, Zm38 and Hv1, Hv33 [7]. - Yeast BAS1 [8], a transcriptional activator for the HIS4 gene. - Yeast REB1 [9], which recognizes sites within both the enhancer and the promoter of rRNA transcription, as well as upstream of many genes transcribed by RNA polymerase II. - Fission yeast cdc5, a possible transcription factor whose activity is required for cell cycle progression and growth during G2. - Fission yeast myb1, which regulates telomere length and function. - Baker's yeast pre-mRNA-splicing factor CEF1. - Vertebrate telomeric repeat-binding factors 1 and 2 (TRF1/2), which bind to telomeric DNA and are involved in telomere length regulation. 
We have developed a profile, which has been manually adapted to specifically detect the DNA-binding myb-type HTH domain. A second general profile was developed for detection of the myb-like domain with a high sensitivity. A third profile was developed for the SANT domain (see <PDOC51293>). -Sequences known to belong to this class detected by the first profile: ALL. -Other sequence(s) detected in Swiss-Prot: 2. -Sequences known to belong to this class detected by the second profile: ALL, except 25. -Other sequence(s) detected in Swiss-Prot: 2. -Note: The profiles are in competition with one another and with the profile of the SANT domain (see <PDOC51293>). -Last update: February 2007 / Profile and text revised; profile added; patterns removed. [ 1] Biedenkapp H., Borgmeyer U., Sippel A.E., Klempnauer K.-H. "Viral myb oncogene encodes a sequence-specific DNA-binding activity." Nature 335:835-837(1988). PubMed=3185713; DOI=10.1038/335835a0 [ 2] Peters C.W.B., Sippel A.E., Vingron M., Klempnauer K.-H. "Drosophila and vertebrate myb proteins share two conserved regions, one of which functions as a DNA-binding domain." EMBO J. 6:3085-3090(1987). PubMed=3121304 [ 3] Stracke R., Werber M., Weisshaar B. "The R2R3-MYB gene family in Arabidopsis thaliana." Curr. Opin. Plant. Biol. 4:447-456(2001). PubMed=11597504 [ 4] Tahirov T.H., Sato K., Ichikawa-Iwata E., Sasaki M., Inoue-Bungo T., Shiina M., Kimura K., Takata S., Fujikawa A., Morii H., Kumasaka T., Yamamoto M., Ishii S., Ogata K. "Mechanism of c-Myb-C/EBP beta cooperation from separated sites on a promoter." Cell 108:57-70(2002). PubMed=11792321 [ 5] Grotewold E., Athma P., Peterson T. "Alternatively spliced products of the maize P gene encode proteins with homology to the DNA-binding domain of myb-like transcription factors." Proc. Natl. Acad. Sci. U.S.A. 88:4587-4591(1991). PubMed=2052542 [ 6] Oppenheimer D.G., Herman P.L., Sivakumaran S., Esch J., Marks M.D. 
"A myb gene required for leaf trichome differentiation in Arabidopsis is expressed in stipules." Cell 67:483-493(1991). PubMed=1934056 [ 7] Marocco A., Wissenbach M., Becker D., Paz-Ares J., Saedler H., Salamini F., Rohde W. "Multiple genes are transcribed in Hordeum vulgare and Zea mays that carry the DNA binding domain of the myb oncoproteins." Mol. Gen. Genet. 216:183-187(1989). PubMed=2664447 [ 8] Tice-Baldwin K., Fink G.R., Arndt K.T. "BAS1 has a Myb motif and activates HIS4 transcription only in combination with BAS2." Science 246:931-935(1989). PubMed=2683089 [ 9] Ju Q.D., Morrow B.E., Warner J.R. "REB1, a yeast DNA-binding protein with many targets, is essential for growth and bears some resemblance to the oncogene myb." Mol. Cell. Biol. 10:5226-5234(1990). PubMed=2204808 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00038} {PS50888; HLH} {BEGIN} *********************************************** * Myc-type, 'helix-loop-helix' domain profile * *********************************************** A number of eukaryotic proteins, which probably are sequence-specific DNA-binding proteins that act as transcription factors, share a conserved domain of 40 to 50 amino acid residues. It has been proposed [1] that this domain is formed of two amphipathic helices joined by a variable length linker region that could form a loop. This 'helix-loop-helix' (HLH) domain mediates protein dimerization and has been found in the proteins listed below [2,3]. 
Most of these proteins have an extra basic region of about 15 amino acid residues that is adjacent to the HLH domain and specifically binds to DNA. They are referred to as basic helix-loop-helix proteins (bHLH), and are classified in two groups: class A (ubiquitous) and class B (tissue-specific). Members of the bHLH family bind variations on the core sequence 'CANNTG', also referred to as the E-box motif. The homo- or heterodimerization mediated by the HLH domain is independent of, but necessary for DNA binding, as two basic regions are required for DNA binding activity. The HLH proteins lacking the basic domain (Emc, Id) function as negative regulators since they form heterodimers, but fail to bind DNA. The hairy-related proteins (hairy, E(spl), deadpan) also repress transcription although they can bind DNA. The proteins of this subfamily act together with co-repressor proteins, like groucho, through their C-terminal motif WRPW. - The myc family of cellular oncogenes [4], which is currently known to contain four members: c-myc, N-myc, L-myc, and B-myc. The myc genes are thought to play a role in cellular differentiation and proliferation. - Proteins involved in myogenesis (the induction of muscle cells). In mammals MyoD1 (Myf-3), myogenin (Myf-4), Myf-5, and Myf-6 (Mrf4 or herculin), in birds CMD1 (QMF-1), in Xenopus MyoD and MF25, in Caenorhabditis elegans CeMyoD, and in Drosophila nautilus (nau). - Vertebrate proteins that bind specific DNA sequences ('E boxes') in various immunoglobulin chains enhancers: E2A or ITF-1 (E12/pan-2 and E47/pan-1), ITF-2 (tcf4), TFE3, and TFEB. - Vertebrate neurogenic differentiation factor 1 that acts as differentiation factor during neurogenesis. - Vertebrate MAX protein, a transcription regulator that forms a sequence-specific DNA-binding protein complex with myc or mad. - Vertebrate Max Interacting Protein 1 (MXI1 protein) which acts as a transcriptional repressor and may antagonize myc transcriptional activity by competing for max. 
- Proteins of the bHLH/PAS superfamily which are transcriptional activators. In mammals, AH receptor nuclear translocator (ARNT), single-minded homologs (SIM1 and SIM2), hypoxia-inducible factor 1 alpha (HIF1A), AH receptor (AHR), neuronal pas domain proteins (NPAS1 and NPAS2), endothelial pas domain protein 1 (EPAS1), mouse ARNT2, and human BMAL1. In Drosophila, single-minded (SIM), AH receptor nuclear translocator (ARNT), trachealess protein (TRH), and similar protein (SIMA). - Mammalian transcription factors HES, which repress transcription by acting on two types of DNA sequences, the E box and the N box. - Mammalian MAD protein (max dimerizer) which acts as transcriptional repressor and may antagonize myc transcriptional activity by competing for max. - Mammalian Upstream Stimulatory Factor 1 and 2 (USF1 and USF2), which bind to a symmetrical DNA sequence that is found in a variety of viral and cellular promoters. - Human lyl-1 protein, which is involved, by chromosomal translocation, in T-cell leukemia. - Human transcription factor AP-4. - Mouse helix-loop-helix proteins MATH-1 and MATH-2 which activate E box-dependent transcription in collaboration with E47. - Mammalian stem cell protein (SCL) (also known as tal1), a protein which may play an important role in hemopoietic differentiation. SCL is involved, by chromosomal translocation, in stem-cell leukemia. - Mammalian proteins Id1 to Id4 [5]. Id (inhibitor of DNA binding) proteins lack a basic DNA-binding domain but are able to form heterodimers with other HLH proteins, thereby inhibiting binding to DNA. - Drosophila extra-macrochaetae (emc) protein, which participates in sensory organ patterning by antagonizing the neurogenic activity of the achaete-scute complex. Emc is the homolog of mammalian Id proteins. 
- Human Sterol Regulatory Element Binding Protein 1 (SREBP1), a transcriptional activator that binds to the sterol regulatory element 1 (SRE-1) found in the flanking region of the LDLR gene and in other genes. - Drosophila achaete-scute (AS-C) complex proteins T3 (l'sc), T4 (scute), T5 (achaete) and T8 (asense). The AS-C proteins are involved in the determination of the neuronal precursors in the peripheral nervous system and the central nervous system. - Mammalian homologs of achaete-scute proteins, the MASH-1 and MASH-2 proteins. - Drosophila atonal protein (ato) which is involved in neurogenesis. - Drosophila daughterless (da) protein, which is essential for neurogenesis and sex-determination. - Drosophila deadpan (dpn), a hairy-like protein involved in the functional differentiation of neurons. - Drosophila delilah (dei) protein, which plays an important role in the differentiation of epidermal cells into muscle. - Drosophila hairy (h) protein, a transcriptional repressor which regulates the embryonic segmentation and adult bristle patterning. - Drosophila enhancer of split proteins E(spl), that are hairy-like proteins active during neurogenesis. They also act as transcriptional repressors. - Drosophila twist (twi) protein, which is involved in the establishment of germ layers in embryos. - Maize anthocyanin regulatory proteins R-S and LC. - Yeast centromere-binding protein 1 (CPF1 or CBF1). This protein is involved in chromosomal segregation. It binds to a highly conserved DNA sequence, found in centromeres and in several promoters. - Yeast INO2 and INO4 proteins. - Yeast phosphate system positive regulatory protein PHO4 which interacts with the upstream activating sequence of several acid phosphatase genes. - Yeast serine-rich protein TYE7 that is required for ty-mediated ADH2 expression. - Neurospora crassa nuc-1, a protein that activates the transcription of structural genes for phosphorus acquisition. 
- Fission yeast protein esc1 which is involved in the sexual differentiation process. The schematic representation of the helix-loop-helix domain is shown here: xxxxxxxxxxxxxxxxxxxxxxxx--------------------xxxxxxxxxxxxxxxxxxxxxxx Amphipathic helix 1 Loop Amphipathic helix 2 The profile we developed covers the helix-loop-helix dimerization domain and the basic region. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: August 2003 / Pattern removed. [ 1] Murre C., McCaw P.S., Baltimore D. "A new DNA binding and dimerization motif in immunoglobulin enhancer binding, daughterless, MyoD, and myc proteins." Cell 56:777-783(1989). PubMed=2493990 [ 2] Garrel J., Campuzano S. BioEssays 13:493-498(1991). [ 3] Kato G.J., Dang C.V. "Function of the c-Myc oncoprotein." FASEB J. 6:3065-3072(1992). PubMed=1521738 [ 4] Krause M., Fire A., Harrison S.W., Priess J., Weintraub H. "CeMyoD accumulation defines the body wall muscle cell fate during C. elegans embryogenesis." Cell 63:907-919(1990). PubMed=2175254 [ 5] Riechmann V., van Cruechten I., Sablitzky F. "The expression pattern of Id4, a novel dominant negative helix-loop-helix protein, is distinct from Id1, Id2 and Id3." Nucleic Acids Res. 22:749-755(1994). PubMed=8139914 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00039} {PS00039; DEAD_ATP_HELICASE} {PS00690; DEAH_ATP_HELICASE} {BEGIN} ***************************************************************** * DEAD and DEAH box families ATP-dependent helicases signatures * ***************************************************************** A number of eukaryotic and prokaryotic proteins have been characterized [1,2,3] on the basis of their structural similarity. They all seem to be involved in ATP-dependent, nucleic-acid unwinding. Proteins currently known to belong to this family are: - Initiation factor eIF-4A. Found in eukaryotes, this protein is a subunit of a high molecular weight complex involved in 5'cap recognition and the binding of mRNA to ribosomes. It is an ATP-dependent RNA-helicase. - PRP5 and PRP28. These yeast proteins are involved in various ATP-requiring steps of the pre-mRNA splicing process. - Pl10, a mouse protein expressed specifically during spermatogenesis. - An3, a Xenopus putative RNA helicase, closely related to Pl10. - SPP81/DED1 and DBP1, two yeast proteins probably involved in pre-mRNA splicing and related to Pl10. - Caenorhabditis elegans helicase glh-1. - MSS116, a yeast protein required for mitochondrial splicing. - SPB4, a yeast protein involved in the maturation of 25S ribosomal RNA. - p68, a human nuclear antigen. p68 has ATPase and DNA-helicase activities in vitro. It is involved in cell growth and division. - Rm62 (p62), a Drosophila putative RNA helicase related to p68. - DBP2, a yeast protein related to p68. - DHH1, a yeast protein. - DRS1, a yeast protein involved in ribosome assembly. - MAK5, a yeast protein involved in maintenance of dsRNA killer plasmid. - ROK1, a yeast protein. - ste13, a fission yeast protein. - Vasa, a Drosophila protein important for oocyte formation and specification of embryonic posterior structures. - Me31B, a Drosophila maternally expressed protein of unknown function. 
- dbpA, an Escherichia coli putative RNA helicase. - deaD, an Escherichia coli putative RNA helicase which can suppress a mutation in the rpsB gene for ribosomal protein S2. - rhlB, an Escherichia coli putative RNA helicase. - rhlE, an Escherichia coli putative RNA helicase. - srmB, an Escherichia coli protein that shows RNA-dependent ATPase activity. It probably interacts with 23S ribosomal RNA. - Caenorhabditis elegans hypothetical proteins T26G10.1, ZK512.2 and ZK686.2. - Yeast hypothetical protein YHR065c. - Yeast hypothetical protein YHR169w. - Fission yeast hypothetical protein SpAC31A2.07c. - Bacillus subtilis hypothetical protein yxiN. All these proteins share a number of conserved sequence motifs. Some of them are specific to this family while others are shared by other ATP-binding proteins or by proteins belonging to the helicases 'superfamily' [4,E1]. One of these motifs, called the 'D-E-A-D-box', represents a special version of the B motif of ATP-binding proteins. Some other proteins belong to a subfamily which have His instead of the second Asp and are thus said to be 'D-E-A-H-box' proteins [3,5,6,E1]. Proteins currently known to belong to this subfamily are: - PRP2, PRP16, PRP22 and PRP43. These yeast proteins are all involved in various ATP-requiring steps of the pre-mRNA splicing process. - Fission yeast prh1, which may be involved in pre-mRNA splicing. - Male-less (mle), a Drosophila protein required in males, for dosage compensation of X chromosome linked genes. - RAD3 from yeast. RAD3 is a DNA helicase involved in excision repair of DNA damaged by UV light, bulky adducts or cross-linking agents. Fission yeast rad15 (rhp3) and mammalian DNA excision repair protein XPD (ERCC-2) are the homologs of RAD3. - Yeast CHL1 (or CTF1), which is important for chromosome transmission and normal cell cycle progression in G(2)/M. - Yeast TPS1. - Yeast hypothetical protein YKL078w. - Caenorhabditis elegans hypothetical proteins C06E1.10 and K03H1.2. 
- Poxviruses' early transcription factor 70 Kd subunit which acts with RNA polymerase to initiate transcription from early gene promoters. - I8, a putative vaccinia virus helicase. - hrpA, an Escherichia coli putative RNA helicase. We have developed signature patterns for both subfamilies. -Consensus pattern: [LIVMF](2)-D-E-A-D-[RKEN]-x-[LIVMFYGSTN] -Sequences known to belong to this class detected by the pattern: ALL, except for YHR169w. -Other sequence(s) detected in Swiss-Prot: 14. -Consensus pattern: [GSAH]-x-[LIVMF](3)-D-E-[ALIV]-H-[NECR] -Sequences known to belong to this class detected by the pattern: ALL, except for hrpA. -Other sequence(s) detected in Swiss-Prot: 6. -Note: Proteins belonging to this family also contain a copy of the ATP/GTP-binding motif 'A' (P-loop) (see the relevant entry <PDOC00017>). -Expert(s) to contact by email: Linder P.; [email protected] -Last update: July 1999 / Text revised. [ 1] Schmid S.R., Linder P. "D-E-A-D protein family of putative RNA helicases." Mol. Microbiol. 6:283-291(1992). PubMed=1552844 [ 2] Linder P., Lasko P.F., Ashburner M., Leroy P., Nielsen P.J., Nishi K., Schnier J., Slonimski P.P. "Birth of the D-E-A-D box." Nature 337:121-122(1989). PubMed=2563148; DOI=10.1038/337121a0 [ 3] Wassarman D.A., Steitz J.A. "RNA splicing. Alive with DEAD proteins." Nature 349:463-464(1991). PubMed=1825133; DOI=10.1038/349463a0 [ 4] Hodgman T.C. "A new superfamily of replicative proteins." Nature 333:22-23(1988) and Nature 333:578-578(1988) (Errata). PubMed=3362205; DOI=10.1038/333022b0 [ 5] Harosh I., Deschavanne P. "The RAD3 gene is a member of the DEAH family RNA helicase-like protein." Nucleic Acids Res. 19:6331-6331(1991). PubMed=1956796 [ 6] Koonin E.V., Senkevich T.G. "Vaccinia virus encodes four putative DNA and/or RNA helicases distantly related to each other." J. Gen. Virol. 73:989-993(1992). 
PubMed=1321883 [E1] http://medweb2.unige.ch/~linder/RNA_helicases.html +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00040} {PS00041; HTH_ARAC_FAMILY_1} {PS01124; HTH_ARAC_FAMILY_2} {BEGIN} ******************************************************************** * Bacterial regulatory proteins, araC family signature and profile * ******************************************************************** The many bacterial transcription regulation proteins which bind DNA through a 'helix-turn-helix' motif can be classified into subfamilies on the basis of sequence similarities. One of these subfamilies groups together the following proteins [1,2,3]: - aarP, a transcriptional activator of the 2'-N-acetyltransferase gene in Providencia stuartii. - ada, an Escherichia coli and Salmonella typhimurium bifunctional protein that repairs alkylated guanine in DNA by transferring the alkyl group at the O(6) position to a cysteine residue in the enzyme. The methylated protein acts a positive regulator of its own synthesis and of the alkA, alkB and aidB genes. - adaA, a Bacillus subtilis bifunctional protein that acts both as a transcriptional activator of the ada operon and as a methylphosphotriesterDNA alkyltransferase. - adiY, an Escherichia coli protein of unknown function. - aggR, the transcriptional activator of aggregative adherence fimbria I expression in enteroaggregative Escherichia coli. 
- appY, a protein which acts as a transcriptional activator of acid phosphatase and other proteins during the deceleration phase of growth and acts as a repressor for other proteins that are synthesized in exponential growth or in the stationary phase.
- araC, the arabinose operon regulatory protein, which activates the transcription of the araBAD genes.
- cafR, the Yersinia pestis F1 operon positive regulatory protein.
- celD, the Escherichia coli cel operon repressor.
- cfaD, a protein which is required for the expression of the CFA/I adhesin of enterotoxigenic Escherichia coli.
- csvR, a transcriptional activator of fimbrial genes in enterotoxigenic Escherichia coli.
- envY, the porin thermoregulatory protein, which is involved in the control of the temperature-dependent expression of several Escherichia coli envelope proteins such as ompF, ompC, and lamB.
- exsA, an activator of exoenzyme S synthesis in Pseudomonas aeruginosa.
- fapR, the positive activator for the expression of the 987P operon coding for the fimbrial protein in enterotoxigenic Escherichia coli.
- hrpB, a positive regulator of pathogenicity genes in Burkholderia solanacearum.
- invF, the Salmonella typhimurium invasion operon regulator.
- marA, which may be a transcriptional activator of genes involved in the multiple antibiotic resistance (mar) phenotype.
- melR, the melibiose operon regulatory protein, which activates the transcription of the melAB genes.
- mixE, a Shigella flexneri protein necessary for secretion of ipa invasins.
- mmsR, the transcriptional activator for the mmsAB operon in Pseudomonas aeruginosa.
- msmR, the multiple sugar metabolism operon transcriptional activator in Streptococcus mutans.
- pchR, a Pseudomonas aeruginosa activator for pyochelin and ferripyochelin receptor.
- perA, a transcriptional activator of the eaeA gene for intimin in enteropathogenic Escherichia coli.
- pocR, a Salmonella typhimurium regulator of the cobalamin biosynthesis operon.
- pqrA, from Proteus vulgaris.
- rafR, the regulator of the raffinose operon in Pediococcus pentosaceus.
- ramA, from Klebsiella pneumoniae.
- rhaR, the Escherichia coli and Salmonella typhimurium L-rhamnose operon transcriptional activator.
- rhaS, an Escherichia coli and Salmonella typhimurium positive activator of genes required for rhamnose utilization.
- rns, a protein which is required for the expression of the cs1 and cs2 adhesins of enterotoxigenic Escherichia coli.
- rob, a protein which binds to the right arm of the replication origin oriC of the Escherichia coli chromosome.
- soxS, a protein that, with the soxR protein, controls a superoxide response regulon in Escherichia coli.
- tetD, a protein from transposon Tn10.
- tcpN or toxT, the Vibrio cholerae transcriptional activator of the tcp operon involved in pilus biosynthesis and transport.
- thcR, a probable regulator of the thc operon for the degradation of the thiocarbamate herbicide EPTC in Rhodococcus sp. strain NI86/21.
- ureR, the transcriptional activator of the plasmid-encoded urease operon in Enterobacteriaceae.
- virF and lcrF, the Yersinia virulence regulon transcriptional activator.
- virF, the Shigella transcriptional factor of invasion related antigens ipaBCD.
- xylR, the Escherichia coli xylose operon regulator.
- xylS, the transcriptional activator of the Pseudomonas putida TOL plasmid (pWWO, pWW53 and pDK1) meta operon (xylDLEGF genes).
- yfeG, an Escherichia coli hypothetical protein.
- yhiW, an Escherichia coli hypothetical protein.
- yhiX, an Escherichia coli hypothetical protein.
- yidL, an Escherichia coli hypothetical protein.
- yijO, an Escherichia coli hypothetical protein.
- yuxC, a Bacillus subtilis hypothetical protein.
- yzbC, a Bacillus subtilis hypothetical protein.

Except for celD, all of these proteins seem to be positive transcriptional factors. Their sizes range from 107 (soxS) to 529 (yzbC) residues.
The helix-turn-helix motif is located in the third quarter of most of the sequences; the N-terminal and central regions of these proteins are presumed to interact with effector molecules and may be involved in dimerization.

The minimal DNA-binding domain, which spans roughly 100 residues and comprises the HTH motif, contains another region with similarity to the classical HTH domain. However, it contains an insertion of one residue in the turn-region.

A signature pattern was derived from the region that follows the first HTH domain and that includes the totality of the putative second HTH domain. A more sensitive detection of members of the araC family is available through the use of a profile which spans the minimal DNA-binding region of 100 residues.

-Consensus pattern: [KRQ]-[LIVMA]-x(2)-[GSTALIV]-{FYWPGDN}-x(2)-[LIVMSA]-x(4,9)-[LIVMF]-x-{PLH}-[LIVMSTA]-[GSTACIL]-{GPK}-{F}-x-[GANQRF]-[LIVMFY]-x(4,5)-[LFY]-x(3)-[FYIVA]-{FYWHCM}-{PGVI}-x(2)-[GSADENQKR]-x-[NSTAPKL]-[PARL]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 50.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Expert(s) to contact by email: Ramos J.L.; [email protected] Gallegos M.-T.; [email protected]
-Last update: April 2006 / Pattern revised.

[ 1] Gallegos M.-T., Michan C., Ramos J.L. "The XylS/AraC family of regulators." Nucleic Acids Res. 21:807-810(1993). PubMed=8451183
[ 2] Henikoff S., Wallace J.C., Brown J.P. "Finding protein similarities with nucleotide sequence databases." Methods Enzymol. 183:111-132(1990). PubMed=2314271
[ 3] Gallegos M.T., Schleif R., Bairoch A., Hofmann K., Ramos J.L. "Arac/XylS family of transcriptional regulators." Microbiol. Mol. Biol. Rev. 61:393-410(1997). PubMed=9409145

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB).
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00041}
{PS00042; HTH_CRP_1}
{PS51063; HTH_CRP_2}
{BEGIN}
*********************************************
* Crp-type HTH domain signature and profile *
*********************************************

The crp-type HTH domain is a DNA-binding, winged helix-turn-helix (wHTH) domain of about 70-75 amino acids present in transcription regulators of the crp-fnr family, involved in the control of virulence factors, enzymes of aromatic ring degradation, nitrogen fixation, photosynthesis, and various types of respiration. The crp-fnr family is named after the first members identified in E. coli: the well characterized cyclic AMP receptor protein CRP or CAP (catabolite activator protein) and the fumarate and nitrate reductase regulator Fnr. crp-type HTH domain proteins occur in most bacteria and in chloroplasts of red algae. The DNA-binding HTH domain is located in the C-terminal part; the N-terminal part of the proteins of the crp-fnr family contains a nucleotide-binding domain (see <PDOC00691>) and a dimerization/linker helix occurs in between. The crp-fnr regulators predominantly act as transcription activators, but can also be important repressors, and respond to diverse intracellular and exogenous signals, such as cAMP, anoxia, redox state, oxidative and nitrosative stress, carbon monoxide, nitric oxide or temperature [1,2].

The structure of the crp-type DNA-binding domain (see <PDB:1LB2>) shows that the helices (H) forming the helix-turn-helix motif (H2-H3) are flanked by two beta-hairpin (B) wings, in the topology H1-B1-B2-H2-H3-B3-B4.
Helix 3 is termed the recognition helix, as in most wHTHs it binds the DNA major groove [3,4,5].

Some proteins known to contain a Crp-type HTH domain:

- Escherichia coli crp (also known as cAMP receptor), a protein that complexes with cAMP and regulates the transcription of several catabolite-sensitive operons.
- Escherichia coli fnr, a protein that activates genes for proteins involved in a variety of anaerobic electron transport systems.
- Rhizobium leguminosarum fnrN, a transcription regulator of nitrogen fixation.
- Rhodobacter sphaeroides fnrL, a transcription activator of genes for heme biosynthesis, bacteriochlorophyll synthesis and the light-harvesting complex LHII.
- Rhizobiaceae fixK, a protein that regulates nitrogen fixation genes, both positively and negatively.
- Lactobacillus casei fnr-like protein flp, a putative regulatory protein linked to the trpDCFBA operon.
- Cyanobacteria ntcA, a regulator of the expression of genes subject to nitrogen control.
- Xanthomonas campestris clp, a protein involved in the regulation of phytopathogenicity. Clp controls the production of extracellular enzymes, xanthan gum and pigment, either positively or negatively.

The 'helix-turn-helix' DNA-binding motif of these proteins is located in the C-terminal part of the sequence. The pattern we use to detect these proteins starts two residues before the HTH motif and ends two residues before the end of helix 3. We also developed a profile that covers the entire wHTH, including helix 1 and strand 4, and which allows a more sensitive detection.

-Consensus pattern: [LIVM]-[STAG]-[RHNWM]-x(2)-[LIM]-[GA]-x-[LIVMFYAS]-[LIVSC]-[GA]-x-[STACN]-x(2)-[MST]-x(1,2)-[GSTN]-R-x-[LIVMF]-x(2)-[LIVMF]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 1.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Last update: April 2006 / Pattern revised.
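
The consensus patterns in this file follow the PROSITE pattern syntax: elements are separated by '-', 'x' stands for any residue, '[..]' lists the residues allowed at a position, '{..}' lists the residues forbidden at a position, and '(n)' or '(n,m)' gives a repeat count or range. As a rough illustration only, the Python sketch below translates such a pattern into a regular expression and scans a sequence with it. The names prosite_to_regex, find_matches, CRP_PATTERN and SYNTHETIC_PEPTIDE are hypothetical and introduced here for illustration. The peptide is synthetic, built by hand to satisfy the Crp-type pattern (PS00042); it is not a real protein sequence. The anchors '<' and '>' are not handled, and this is not the scanner that PROSITE itself uses.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern into an equivalent Python regex string."""
    regex_parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Split an element such as 'x(1,2)' into its core and repeat range.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"unparsable element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            part = "."                      # any residue
        elif core.startswith("[") and core.endswith("]"):
            part = core                     # allowed residues
        elif core.startswith("{") and core.endswith("}"):
            part = "[^" + core[1:-1] + "]"  # forbidden residues
        elif len(core) == 1 and core.isalpha() and core.isupper():
            part = core                     # one fixed residue
        else:
            raise ValueError(f"unsupported element: {element!r}")
        if low is not None:
            part += "{" + low + ("," + high if high else "") + "}"
        regex_parts.append(part)
    return "".join(regex_parts)


def find_matches(pattern: str, sequence: str):
    """Return (start, end, matched text) tuples with 1-based inclusive positions.

    Uses re.finditer, so overlapping matches are not reported.
    """
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.end(), m.group()) for m in regex.finditer(sequence)]


# Crp-type pattern (PS00042), with a '-' between every element.
CRP_PATTERN = (
    "[LIVM]-[STAG]-[RHNWM]-x(2)-[LIM]-[GA]-x-[LIVMFYAS]-[LIVSC]-[GA]-x-"
    "[STACN]-x(2)-[MST]-x(1,2)-[GSTN]-R-x-[LIVMF]-x(2)-[LIVMF]"
)

# Hypothetical, hand-built peptide (NOT a real protein) that satisfies the pattern.
SYNTHETIC_PEPTIDE = "LSRAALGALLGASAATAGRALAAL"

if __name__ == "__main__":
    print(prosite_to_regex(CRP_PATTERN))
    print(find_matches(CRP_PATTERN, "MK" + SYNTHETIC_PEPTIDE + "KK"))
```

A regular expression is a convenient target because each PROSITE element maps onto one regex token. Profiles such as PS51063 are weight matrices and cannot be expressed this way.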
[ 1] Irvine A.S., Guest J.R. "Lactobacillus casei contains a member of the CRP-FNR family." Nucleic Acids Res. 21:753-753(1993). PubMed=8441692
[ 2] Koerner H., Sofia H.J., Zumft W.G. FEMS Microbiol. Rev. 27:559-592(2003).
[ 3] Busby S., Ebright R.H. "Transcription activation by catabolite activator protein (CAP)." J. Mol. Biol. 293:199-213(1999). PubMed=10550204; DOI=10.1006/jmbi.1999.3161
[ 4] Lanzilotta W.N., Schuller D.J., Thorsteinsson M.V., Kerby R.L., Roberts G.P., Poulos T.L. "Structure of the CO sensing transcription activator CooA." Nat. Struct. Biol. 7:876-880(2000). PubMed=11017196; DOI=10.1038/82820
[ 5] Huffman J.L., Brennan R.G. "Prokaryotic transcription regulators: more than just the helix-turn-helix motif." Curr. Opin. Struct. Biol. 12:98-106(2002). PubMed=11839496

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00042}
{PS50949; HTH_GNTR}
{BEGIN}
********************************
* GntR-type HTH domain profile *
********************************

The gntR-type HTH domain is a DNA-binding, winged helix-turn-helix (wHTH) domain of about 60-70 residues present in transcriptional regulators of the gntR family. This family of bacterial regulators is named after Bacillus subtilis gntR, a repressor of the gluconate operon [1,2]. Six subfamilies have been described for the gntR family: fadR, hutC, plmA, mocR, ytrA, and araR, which regulate various biological processes and important bacterial metabolic pathways.
The DNA-binding gntR-type HTH domain usually occurs in the N-terminal part. The C-terminal part can contain a subfamily-specific effector-binding domain and/or an oligomerization domain. The fadR-like regulators, representing the largest subfamily, are involved in the regulation of oxidized substrates related to metabolic pathways or metabolism of amino acids. HutC-like proteins are involved in conjugative plasmid transfer in several Streptomyces species. PlmA is a cyanobacterial regulator of plasmid maintenance. The mocR subfamily encompasses proteins homologous to class I aminotransferase proteins, which bind pyridoxal phosphate as a cofactor. Most of the ytrA-like proteins take part in operons involved in ATP-binding cassette (ABC) transport systems. AraR is an autoregulatory protein with a C-terminal domain that binds a carbohydrate effector, similar to that present in regulators of the lacI/galR family (see <PDOC00366>) [3,4].

The crystal structures of fadR show that the N-terminal, DNA-binding domain contains a small beta-sheet (B) core and three alpha-helices (H) with a topology H1-B1-H2-H3-B2-B3 (see <PDB:1H9T>). Helices 2 and 3, connected via a tight turn, comprise the helix-turn-helix motif. The antiparallel beta-strands 2 and 3 together with B1 form a small beta-sheet, which is called the wing. Helix 3 is termed the recognition helix, as in most wHTHs it binds the DNA major groove. Here, only the N-terminal tip of the recognition helix makes specific DNA contacts and the wing makes unusual sequence-specific contacts to the minor groove. Like other HTH proteins, most gntR-type regulators bind as homodimers to 2-fold symmetric DNA sequences in which each monomer recognizes half of the site [5,6].

Some proteins known to contain a gntR-type HTH domain:

- Bacillus subtilis gntR, a repressor of the gnt operon, which is responsible for gluconate metabolism. In the absence of gluconate, gntR binds to the promoter of the operon.
The expression of the operon is induced in the presence of gluconate.
- Escherichia coli fadR, a transcriptional regulator of fatty acid metabolism. In the absence of the acyl-CoA effector, fadR binds specific operator sites, represses the expression of genes involved in fatty acid degradation and import, and activates biosynthetic genes. Binding of acyl-CoA induces conformational changes that abolish DNA binding, which derepresses the catabolic genes and deactivates the anabolic genes.
- Escherichia coli phdR, a transcriptional repressor of the pyruvate dehydrogenase complex.
- Klebsiella aerogenes and Pseudomonas putida hutC, a transcriptional repressor of the histidine utilization (hut) operon.
- Streptomyces lividans korA, a regulator that controls plasmid transfer.
- Rhizobium meliloti mocR, a probable regulator of rhizopine catabolism.
- Bacillus subtilis ytrA, a repressor of the acetoin utilization gene cluster.
- Anabaena sp. strain PCC 7120 plmA, a regulator involved in plasmid maintenance [4].
- Bacillus subtilis araR, a transcriptional repressor of the arabinose operon.

The profile we developed covers the entire gntR-type HTH domain, from the well-conserved part of helix 1 to the end of the wing.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Expert(s) to contact by email: Rigali S.; [email protected]
-Last update: February 2004 / Text revised.

[ 1] Buck D., Guest J.R. "Overexpression and site-directed mutagenesis of the succinyl-CoA synthetase of Escherichia coli and nucleotide sequence of a gene (g30) that is adjacent to the suc operon." Biochem. J. 260:737-747(1989). PubMed=2548486
[ 2] Haydon D.J., Guest J.R. "A new family of bacterial regulatory proteins." FEMS Microbiol. Lett. 63:291-295(1991). PubMed=2060763
[ 3] Rigali S., Derouaux A., Giannotta F., Dusart J. "Subdivision of the helix-turn-helix GntR family of bacterial regulators in the FadR, HutC, MocR, and YtrA subfamilies." J. Biol. Chem. 277:12507-12515(2002). PubMed=11756427; DOI=10.1074/jbc.M110968200
[ 4] Lee M.H., Scherer M., Rigali S., Golden J.W. "PlmA, a new member of the GntR family, has plasmid maintenance functions in Anabaena sp. strain PCC 7120." J. Bacteriol. 185:4315-4325(2003). PubMed=12867439
[ 5] Van Aalten D.M.F., DiRusso C.C., Knudsen J. EMBO J. 20:2041-2050(2001).
[ 6] Xu Y., Heath R.J., Li Z., Rock C.O., White S.W. "The FadR.DNA complex. Transcriptional control of fatty acid metabolism in Escherichia coli." J. Biol. Chem. 276:17373-17379(2001). PubMed=11279025; DOI=10.1074/jbc.M100195200

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00043}
{PS50931; HTH_LYSR}
{BEGIN}
********************************
* LysR-type HTH domain profile *
********************************

The lysR-type HTH domain is a DNA-binding, winged helix-turn-helix (wHTH) domain of about 60 residues present in lysR-type transcriptional regulators (LTTR), one of the most common regulator families in prokaryotes. The family is named after the Escherichia coli regulator lysR [1]. LysR proteins are present in diverse bacterial genera, archaea and algal chloroplasts. All LTTRs contain the DNA-binding lysR-type HTH domain, usually in the N-terminal part. Most LTTRs require a small compound that acts as co-inducer.
The C-terminal part of lysR proteins can contain a regulatory domain with two subdomains involved in (1) co-inducer recognition/response and (2) DNA binding and response. LTTRs activate the transcription of operons and regulons involved in very diverse functions, such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, regulation of virulence factors, nodulation for nitrogen-fixing bacteria, oxidative stress response or aromatic compound catabolism. Most LTTRs act as transcriptional activators of the target genes and also as repressors of their own expression. Typical LTTRs bind to a sequence of about 50-60 bp, which contains two distinct sites: (1) a recognition-binding site (RBS) centered near -65 of the target transcription start site, with an inverted repeat motif including the T-N(11)-A motif, and (2) an activation-binding site (ABS) which overlaps the -35 region of the transcription start site of the regulated gene. LysR proteins are mainly cytoplasmic, but some seem to be membrane-bound [2].

The crystal structure of the lysR DNA-binding domain of CbnR shows three alpha helices and two anti-parallel beta strands (see <PDB:1IXC>), with the helix-turn-helix motif comprising the second and third helices and the strands being called the wing. Most LTTRs are likely tetramers [3].

Some proteins known to contain a lysR domain:

- Proteus vulgaris blaA, a transcriptional regulator of beta-lactamase.
- Pseudomonas putida catR, a regulator of catechol catabolism for benzoate degradation.
- Escherichia coli cynR, a regulator for detoxification of cyanate.
- Klebsiella aerogenes cysB, a regulator of cysteine biosynthesis.
- Vibrio cholerae irgB, an iron-dependent regulator of virulence factors.
- Escherichia coli lysR, a transcriptional regulator of lysine biosynthesis.
- Escherichia coli nhaR, a regulator of a sodium/proton (Na+/H+) antiporter.
- Rhizobium meliloti nodD and syrM, regulators of nodulation genes involved in nitrogen fixation symbiosis.
- Salmonella typhimurium oxyR, a regulator of intracellular hydrogen peroxide and oxidative stress response.
- Ralstonia solanacearum phcA, a regulator of virulence factors.

The profile we developed covers the entire lysR-type HTH domain.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Expert(s) to contact by email: Schell M.; [email protected]
-Last update: October 2003 / Pattern removed, profile added and text revised.

[ 1] Henikoff S., Haughn G.W., Calvo J.M., Wallace J.C. "A large family of bacterial activator proteins." Proc. Natl. Acad. Sci. U.S.A. 85:6602-6606(1988). PubMed=3413113
[ 2] Schell M.A. "Molecular biology of the LysR family of transcriptional regulators." Annu. Rev. Microbiol. 47:597-626(1993). PubMed=8257110; DOI=10.1146/annurev.mi.47.100193.003121
[ 3] Muraoka S., Okumura R., Ogawa N., Nonaka T., Miyashita K., Senda T. "Crystal structure of a full-length LysR-type transcriptional regulator, CbnR: unusual combination of two subunit forms and molecular bases for causing and changing DNA bend." J. Mol. Biol. 328:555-566(2003). PubMed=12706716

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00044}
{PS00045; HISTONE_LIKE}
{BEGIN}
*********************************************************
* Bacterial histone-like DNA-binding proteins signature *
*********************************************************

Bacteria synthesize a set of small, usually basic proteins of about 90 residues that bind DNA and are known as histone-like proteins [1,2]. The exact function of these proteins is not yet clear, but they are capable of wrapping DNA and stabilizing it against denaturation under extreme environmental conditions. The sequence of a number of different types of these proteins is known:

- The HU proteins, which, in Escherichia coli, are a dimer of closely related alpha and beta chains and, in other bacteria, can be a dimer of identical chains. HU-type proteins have been found in a variety of eubacteria, cyanobacteria and archaebacteria, and are also encoded in the chloroplast genome of some algae [3].
- The integration host factor (IHF), a dimer of closely related chains which seem to function in genetic recombination as well as in translational and transcriptional control [4] in enterobacteria.
- The bacteriophage SPO1 transcription factor 1 (TF1), which selectively binds to and inhibits the transcription of hydroxymethyluracil-containing DNA, such as SPO1 DNA, by RNA polymerase in vitro.
- The African swine fever virus protein A104R (or LMW5-AR) [5].

As a signature pattern for this family of proteins, we use a twenty-residue sequence which includes three perfectly conserved positions. According to the tertiary structure of one of these proteins [6], this pattern spans exactly the first half of the flexible DNA-binding arm.

-Consensus pattern: [GSK]-F-x(2)-[LIVMF]-x(4)-[RKEQA]-x(2)-[RST]-x(1,2)-[GA]-x-[KN]-P-x-[TN]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Drlica K., Rouviere-Yaniv J. "Histonelike proteins of bacteria." Microbiol. Rev. 51:301-319(1987). PubMed=3118156
[ 2] Pettijohn D.E. "Histone-like proteins and bacterial chromosome structure." J. Biol. Chem. 263:12793-12796(1988). PubMed=3047111
[ 3] Wang S.L., Liu X.-Q. "The plastid genome of Cryptomonas phi encodes an hsp70-like protein, a histone-like protein, and an acyl carrier protein." Proc. Natl. Acad. Sci. U.S.A. 88:10783-10787(1991). PubMed=1961745
[ 4] Friedman D.I. "Integration host factor: a protein for all reasons." Cell 55:545-554(1988). PubMed=2972385
[ 5] Neilan J.G., Lu Z., Kutish G.F., Sussman M.D., Roberts P.C., Yozawa T., Rock D.L. "An African swine fever virus gene with similarity to bacterial DNA binding proteins, bacterial integration host factors, and the Bacillus phage SPO1 transcription factor, TF1." Nucleic Acids Res. 21:1496-1496(1993). PubMed=8464748
[ 6] Tanaka I., Appelt K., Dijk J., White S.W., Wilson K.S. "3-A resolution structure of a protein with histone-like properties in prokaryotes." Nature 310:376-381(1984). PubMed=6540370

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00045}
{PS00046; HISTONE_H2A}
{BEGIN}
*************************
* Histone H2A signature *
*************************

Histone H2A is one of the four histones, along with H2B, H3 and H4, which form the eukaryotic nucleosome core.
Using alignments of histone H2A sequences [1,2,E1] we selected, as a signature pattern, a conserved region in the N-terminal part of H2A. This region is conserved both in classical S-phase regulated H2A's and in variant histone H2A's which are synthesized throughout the cell cycle.

-Consensus pattern: [AC]-G-L-x-F-P-V
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 2.

-Last update: November 1995 / Pattern and text revised.

[ 1] Wells D.E., Brown D. "Histone and histone gene compilation and alignment update." Nucleic Acids Res. 19:2173-2188(1991). PubMed=2041803
[ 2] Thatcher T.H., Gorovsky M.A. "Phylogenetic analysis of the core histones H2A, H2B, H3, and H4." Nucleic Acids Res. 22:174-179(1994). PubMed=8121801
[E1] http://research.nhgri.nih.gov/histones/

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00046}
{PS00047; HISTONE_H4}
{BEGIN}
************************
* Histone H4 signature *
************************

Histone H4 is one of the four histones, along with H2A, H2B and H3, which form the eukaryotic nucleosome core. Along with H3, it plays a central role in nucleosome formation. The sequence of histone H4 has remained almost invariant in more than 2 billion years of evolution [1,E1]. The region we use as a signature pattern is a pentapeptide found in positions 14 to 18 of all H4 sequences.
It contains a lysine residue which is often acetylated [2] and a histidine residue which is implicated in DNA-binding [3].

-Consensus pattern: G-A-K-R-H
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 3.

-Last update: November 1995 / Text revised.

[ 1] Thatcher T.H., Gorovsky M.A. "Phylogenetic analysis of the core histones H2A, H2B, H3, and H4." Nucleic Acids Res. 22:174-179(1994). PubMed=8121801
[ 2] Doenecke D., Gallwitz D. "Acetylation of histones in nucleosomes." Mol. Cell. Biochem. 44:113-128(1982). PubMed=6808351
[ 3] Ebralidse K.K., Grachev S.A., Mirzabekov A.D. "A highly basic histone H4 domain bound to the sharply bent region of nucleosomal DNA." Nature 331:365-367(1988). PubMed=3340182; DOI=10.1038/331365a0
[E1] http://research.nhgri.nih.gov/histones/

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00047}
{PS00048; PROTAMINE_P1}
{BEGIN}
**************************
* Protamine P1 signature *
**************************

Protamines are small, highly basic proteins that substitute for histones in sperm chromatin during the haploid phase of spermatogenesis. They pack sperm DNA into a highly condensed, stable and inactive complex. There are two different types of mammalian protamine, called P1 and P2. P1 has been found in all species studied, while P2 is sometimes absent. There seems to be a single type of avian protamine whose sequence is closely related to that of mammalian P1 [1].
As a signature for this family of proteins, we selected a conserved region at the N-terminal extremity of the sequence.

-Consensus pattern: [AV]-R-[NFY]-R-x(2,3)-[ST]-{S}-S-{NS}-S
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 4.

-Last update: December 2004 / Pattern and text revised.

[ 1] Oliva R., Goren R., Dixon G.H. "Quail (Coturnix japonica) protamine, full-length cDNA sequence, and the function and evolution of vertebrate protamines." J. Biol. Chem. 264:17627-17630(1989). PubMed=2808336

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00048}
{PS00049; RIBOSOMAL_L14}
{BEGIN}
***********************************
* Ribosomal protein L14 signature *
***********************************

Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [1], groups:

- Eubacterial L14.
- Algal and plant chloroplast L14.
- Cyanelle L14.
- Archaebacterial L14.
- Yeast L17A.
- Mammalian L23.
- Caenorhabditis elegans L23 (B0336.10).
- Higher eukaryotes mitochondrial L14.
- Yeast mitochondrial Yml38 (gene MRPL38).

L14 is a protein of 119 to 137 amino-acid residues. As a signature pattern, we selected a conserved region located in the C-terminal half of these proteins.
-Consensus pattern: [GA]-[LIV](3)-x(9,10)-[DNS]-G-x(4)-[FY]-x(2)-[NT]-x(2)-V-[LIV] -Sequences known to belong to this class detected by the pattern: ALL, except for pine L14 and for Acanthamoeba mitochondrial L14. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: November 1997 / Pattern and text revised. [ 1] Otaka E., Hashimoto T., Mizuta K., Suzuki K. Protein Seq. Data Anal. 5:301-313(1993). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00049} {PS00050; RIBOSOMAL_L23} {BEGIN} *********************************** * Ribosomal protein L23 signature * *********************************** Ribosomal protein L23 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L23 is known to bind a specific region on the 23S rRNA; in yeast, the corresponding protein binds to a homologous site on the 26S rRNA [1]. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [2,3,4], groups: - Eubacterial L23. Algal and plant chloroplast L23. Archaebacterial L23. Mammalian L23A. Caenorhabditis elegans L23A (F55D10.2). Fungi L25. Yeast mitochondrial YmL41 (gene MRPL41 or MRP20). As a signature pattern, we selected a small conserved region in the C-terminal section of these proteins, which is probably involved in rRNA-binding [2]. -Consensus pattern: [RK](2)-[AM]-[IVFYT]-[IV]-[RKT]-L-[STANEQK]-x(7)-[LIVMFT] -Sequences known to belong to this class detected by the pattern: ALL, except for yeast mitochondrial YmL41.
-Other sequence(s) detected in Swiss-Prot: 1. -Last update: July 1999 / Pattern and text revised. [ 1] El Baradi T.T.A.L., Raue H.A., van de Regt C.H.F., Verbree E.C., Planta R.J. EMBO J. 4:2101-2107(1985). [ 2] Raue H.A., Otaka E., Suzuki K. "Structural comparison of 26S rRNA-binding ribosomal protein L25 from two different yeast strains and the equivalent proteins from three eubacteria and two chloroplasts." J. Mol. Evol. 28:418-426(1989). PubMed=2501503 [ 3] Fearon K., Mason T.L. "Structure and function of MRP20 and MRP49, the nuclear genes for two proteins of the 54 S subunit of the yeast mitochondrial ribosome." J. Biol. Chem. 267:5162-5170(1992). PubMed=1544898 [ 4] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00050} {PS00051; RIBOSOMAL_L39E} {BEGIN} ************************************ * Ribosomal protein L39e signature * ************************************ A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of: - Mammalian L39 [1]. Plants L39. Yeast L46 [2]. Archaebacterial L39e [3]. These proteins are very basic. About 50 residues long, they are the smallest proteins of eukaryotic-type ribosomes. As a signature pattern, we selected a conserved region in the C-terminal section of these proteins.
-Consensus pattern: [KRM]-[PTKS]-x(3)-[LIVMFG]-x(2)-[NHS]-x(3)-R-[DNHY]-W-R-[RS] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2004 / Pattern and text revised. [ 1] Lin A., McNally J., Wool I.G. "The primary structure of rat liver ribosomal protein L39." J. Biol. Chem. 259:487-490(1984). PubMed=6706949 [ 2] Leer R.J., van Raamsdonk-Duin M.M.C., Kraakman P., Mager W.H., Planta R.J. "The genes for yeast ribosomal proteins S24 and L46 are adjacent and divergently transcribed." Nucleic Acids Res. 13:701-709(1985). PubMed=4000930 [ 3] Ramirez C., Louie K.A., Matheson A.T. "A small basic ribosomal protein in Sulfolobus solfataricus equivalent to L46 in yeast: structure of the protein and its gene." FEBS Lett. 250:416-418(1989). PubMed=2502431 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00051} {PS00052; RIBOSOMAL_S7} {BEGIN} ********************************** * Ribosomal protein S7 signature * ********************************** Ribosomal protein S7 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S7 is known to bind directly to part of the 3'end of 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [1,2,3], groups: - Eubacterial S7. Algal and plant chloroplast S7. Cyanelle S7. Archaebacterial S7. Plant mitochondrial S7. Mammalian S5. Plant S5. Caenorhabditis elegans S5 (T05E11.1).
As a signature pattern, we selected the best conserved region located in the N-terminal section of these proteins. -Consensus pattern: [DENSK]-x-[LIVMDET]-x(3)-[LIVMFTA](2)-x(6)-G-K-[KR]-x(5)-[LIVMF]-[LIVMFC]-x(2)-[STAC] -Sequences known to belong to this class detected by the pattern: ALL, except for Thermococcus celer S7 and Acanthamoeba castellanii mitochondrial S7. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: July 1999 / Pattern and text revised. [ 1] Klussmann S., Franke P., Bergmann U., Kostka S., Wittmann-Liebold B. "N-terminal modification and amino-acid sequence of the ribosomal protein HmaS7 from Haloarcula marismortui and homology studies to other ribosomal proteins." Biol. Chem. Hoppe-Seyler 374:305-312(1993). PubMed=8338632 [ 2] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993). [ 3] Ignatovich O., Cooper M., Kulesza H.M., Beggs J.D. "Cloning and characterisation of the gene encoding the ribosomal protein S5 (also known as rp14, S2, YS8) of Saccharomyces cerevisiae." Nucleic Acids Res. 23:4616-4619(1995). PubMed=8524651 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00052} {PS00053; RIBOSOMAL_S8} {BEGIN} ********************************** * Ribosomal protein S8 signature * ********************************** Ribosomal protein S8 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S8 is known to bind directly to 16S ribosomal RNA.
It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [1], groups: - Eubacterial S8. Algal and plant chloroplast S8. Cyanelle S8. Archaebacterial S8. Marchantia polymorpha mitochondrial S8. Mammalian S15A. Plant S15A. Yeast S22 (S24). As a signature pattern, we selected the best conserved region located in the C-terminal section of these proteins. -Consensus pattern: [GE]-x(2)-[LIV](2)-[STY]-[ST]-{A}-x-G-[LIVM](2)-x(4)-[AG]-[KRHAYIL] -Sequences known to belong to this class detected by the pattern: ALL, except for some mitochondrial S8. -Other sequence(s) detected in Swiss-Prot: 1. -Last update: April 2006 / Pattern revised. [ 1] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00053} {PS00054; RIBOSOMAL_S11} {BEGIN} *********************************** * Ribosomal protein S11 signature * *********************************** Ribosomal protein S11 [1] plays an essential role in selecting the correct tRNA in protein biosynthesis. It is located on the large lobe of the small ribosomal subunit. S11 belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups [2]: - Eubacterial S11. Algal and plant chloroplast S11. Cyanelle S11. Archaebacterial S11. Marchantia polymorpha and Prototheca wickerhamii mitochondrial S11. Acanthamoeba castellanii mitochondrial S11. Neurospora crassa S14 (crp-2). Yeast S14 (RP59 or CRY1).
Mammalian, Drosophila, Trypanosoma, and plant S14. Caenorhabditis elegans S14 (F37C12.9). We selected one of the best conserved regions in these proteins as a signature pattern. -Consensus pattern: [LIVMFR]-x-[GSTACQI]-[LIVMF]-x(1,2)-[GSTALVM]-x(0,1)-[GSN]-[LIVMFY]-x-[LIVM]-x(4)-[DEN]-x-[TS]-[PS]-x-[PA]-[STCHF]-[DN] -Sequences known to belong to this class detected by the pattern: ALL, except for some mitochondrial S11. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2004 / Pattern and text revised. [ 1] Kimura T., Nishikawa M., Fujisawa J. "Uncleaved env gp160 of human immunodeficiency virus type 1 is degraded within the Golgi apparatus but not lysosomes in COS-1 cells." FEBS Lett. 390:15-20(1996). PubMed=8706820 [ 2] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00054} {PS00055; RIBOSOMAL_S12} {BEGIN} *********************************** * Ribosomal protein S12 signature * *********************************** Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S12 is known to be involved in the translation initiation step. It is a very basic protein of 120 to 150 amino-acid residues. S12 belongs to a family of ribosomal proteins which, on the basis of sequence similarities [1], groups: - Eubacterial S12. Archaebacterial S12. Algal and plant chloroplast S12. Cyanelle S12. Protozoa and plant mitochondrial S12. Yeast S28.
Drosophila mitochondrial protein tko (Technical KnockOut). Mammalian S23. As a signature pattern, we selected the best conserved regions in these proteins, located in the center of each sequence. -Consensus pattern: [RK]-x-P-N-S-[AR]-x-R -Sequences known to belong to this class detected by the pattern: ALL, except for some mitochondrial S12 and Micrococcus luteus S12. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: November 1995 / Text revised. [ 1] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00055} {PS00056; RIBOSOMAL_S17} {BEGIN} *********************************** * Ribosomal protein S17 signature * *********************************** Ribosomal protein S17 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S17 is known to bind specifically to the 5' end of 16S ribosomal RNA and is thought to be involved in the recognition of termination codons. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [1,2,3], groups: - Eubacterial S17. Plant chloroplast S17 (nuclear encoded). Red algal chloroplast S17. Cyanelle S17. Archaebacterial S17. Mammalian and plant cytoplasmic S11. Yeast S18a and S18b (RP41; YS12). As a signature pattern, we selected the best conserved regions located in the C-terminal section of these proteins.
-Consensus pattern: G-D-x-[LIV]-x-[LIVA]-x-[QEK]-x-[RK]-P-[LIV]-S -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: November 1997 / Pattern and text revised. [ 1] Gantt J.S., Thompson M.D. "Plant cytosolic ribosomal protein S11 and chloroplast ribosomal protein CS17. Their primary structures and evolutionary relationships." J. Biol. Chem. 265:2763-2767(1990). PubMed=2406240 [ 2] Herfurth E., Hirano H., Wittmann-Liebold B. "The amino-acid sequences of the Bacillus stearothermophilus ribosomal proteins S17 and S21 and their comparison to homologous proteins of other ribosomes." Biol. Chem. Hoppe-Seyler 372:955-961(1991). PubMed=1772592 [ 3] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00056} {PS00057; RIBOSOMAL_S18} {BEGIN} *********************************** * Ribosomal protein S18 signature * *********************************** Ribosomal protein S18 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S18 has been implicated in aminoacyl-tRNA binding [1]. It appears to be situated at the tRNA A-site of the ribosome. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [2], groups: - Eubacterial S18. - Algal and plant chloroplast S18. - Cyanelle S18.
As a signature pattern, we selected a conserved region in the central section of the protein. This region contains two basic residues which may be involved in RNA-binding. -Consensus pattern: [IVRLP]-[DYN]-[YLF]-x(2,3)-[LIVMTPFS]-x(2)-[LIVM]-x(2)-[FYTS]-[LIVMT]-[STNQG]-[DERPN]-x(1,2)-[GYAH]-[KCR]-[LIVM]-x(3)-[RHG]-[LIVMASR] -Sequences known to belong to this class detected by the pattern: ALL, except for Euglena gracilis S18. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: April 2006 / Pattern revised. [ 1] McDougall J., Choli T., Kruft V., Kapp U., Wittmann-Liebold B. "The complete amino acid sequence of ribosomal protein S18 from the moderate thermophile Bacillus stearothermophilus." FEBS Lett. 245:253-260(1989). PubMed=2647521 [ 2] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00057} {PS00058; DNA_MISMATCH_REPAIR_1} {BEGIN} ************************************************************* * DNA mismatch repair proteins mutL / hexB / PMS1 signature * ************************************************************* Mismatch repair contributes to the overall fidelity of DNA replication [1]. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The sequences of some proteins involved in mismatch repair in different organisms have been found to be evolutionarily related.
These proteins are: - Escherichia coli and Salmonella typhimurium mutL protein [2]. MutL is required for dam-dependent methyl-directed DNA repair. - Streptococcus pneumoniae hexB protein [3]. The Hex system is nick directed. - Yeast proteins PMS1 and MLH1 [4]. - Human protein MLH1 [5] which is involved in a form of familial hereditary nonpolyposis colon cancer (HNPCC). As a signature pattern for this class of mismatch repair proteins we selected a perfectly conserved heptapeptide which is located in the N-terminal section of these proteins. -Consensus pattern: G-F-R-G-E-[AG]-L -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2004 / Pattern and text revised. [ 1] Modrich P. "DNA mismatch correction." Annu. Rev. Biochem. 56:435-466(1987). PubMed=3304141; DOI=10.1146/annurev.bi.56.070187.002251 [ 2] Mankovich J.A., McIntyre C.A., Walker G.C. "Nucleotide sequence of the Salmonella typhimurium mutL gene required for mismatch repair: homology of MutL to HexB of Streptococcus pneumoniae and to PMS1 of the yeast Saccharomyces cerevisiae." J. Bacteriol. 171:5325-5331(1989). PubMed=2676972 [ 3] Prudhomme M., Martin B., Mejean V., Claverys J.-P. "Nucleotide sequence of the Streptococcus pneumoniae hexB mismatch repair gene: homology of HexB to MutL of Salmonella typhimurium and to PMS1 of Saccharomyces cerevisiae." J. Bacteriol. 171:5332-5338(1989). PubMed=2676973 [ 4] Prolla T.A., Christie D.M., Liskay R.M. "Dual requirement in yeast DNA mismatch repair for MLH1 and PMS1, two homologs of the bacterial mutL gene." Mol. Cell. Biol. 14:407-415(1994). PubMed=8264608 [ 5] Bronner C.E., Baker S.M., Morrison P.T., Warren G., Smith L.G., Lescoe M.K., Kane M., Earabino C., Lipford J., Lindblom A. "Mutation in the DNA mismatch repair gene homologue hMLH1 is associated with hereditary non-polyposis colon cancer." Nature 368:258-261(1994). 
PubMed=8145827; DOI=10.1038/368258a0 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00058} {PS00059; ADH_ZINC} {PS01162; QOR_ZETA_CRYSTAL} {BEGIN} ***************************************************** * Zinc-containing alcohol dehydrogenases signatures * ***************************************************** Alcohol dehydrogenase (EC 1.1.1.1) (ADH) catalyzes the reversible oxidation of ethanol to acetaldehyde with the concomitant reduction of NAD [1]. Currently three structurally and catalytically different types of alcohol dehydrogenases are known: - Zinc-containing 'long-chain' alcohol dehydrogenases. - Insect-type, or 'short-chain' alcohol dehydrogenases. - Iron-containing alcohol dehydrogenases. Zinc-containing ADH's [2,3] are dimeric or tetrameric enzymes that bind two atoms of zinc per subunit. One of the zinc atoms is essential for catalytic activity while the other is not. Both zinc atoms are coordinated by either cysteine or histidine residues; the catalytic zinc is coordinated by two cysteines and one histidine. Zinc-containing ADH's are found in bacteria, mammals, plants, and in fungi. In most species there is more than one isozyme (for example, humans have at least six isozymes, yeast has three, etc.). A number of other zinc-dependent dehydrogenases are closely related to zinc ADH [4], these are: - Xylitol dehydrogenase (EC 1.1.1.9) (D-xylulose reductase). - Sorbitol dehydrogenase (EC 1.1.1.14).
- Aryl-alcohol dehydrogenase (EC 1.1.1.90) (benzyl alcohol dehydrogenase). - L-threonine 3-dehydrogenase (EC 1.1.1.103). - Cinnamyl-alcohol dehydrogenase (EC 1.1.1.195) (CAD) [5]. CAD is a plant enzyme involved in the biosynthesis of lignin. - Galactitol-1-phosphate 5-dehydrogenase (EC 1.1.1.251). - Escherichia coli L-idonate 5-dehydrogenase (EC 1.1.1.264). - Pseudomonas putida 5-exo-alcohol dehydrogenase (EC 1.1.1.-) [6]. - Escherichia coli starvation sensing protein rspB. - Escherichia coli hypothetical protein yjgB. - Escherichia coli hypothetical protein yjgV. - Escherichia coli hypothetical protein yjjN. - Yeast hypothetical protein YAL060w (FUN49). - Yeast hypothetical protein YAL061w (FUN50). - Yeast hypothetical protein YCR105w. The pattern that we developed to detect this class of enzymes is based on a conserved region that includes a histidine residue which is the second ligand of the catalytic zinc atom. This family also includes NADP-dependent quinone oxidoreductase (EC 1.6.5.5), an enzyme found in bacteria (gene qor), in yeast and in mammals where, in some species such as rodents, it has been recruited as an eye lens protein and is known as zeta-crystallin [7]. The sequence of quinone oxidoreductase is distantly related to that of other zinc-containing alcohol dehydrogenases and it lacks the zinc-ligand residues. The torpedo fish and mammalian synaptic vesicle membrane protein vat-1 is related to qor. We have developed a specific pattern for this subfamily. -Consensus pattern: G-H-E-x-{EL}-G-{AP}-x(4)-[GA]-x(2)-[IVSAC] [H is a zinc ligand] -Sequences known to belong to this class detected by the pattern: ALL, except for quinone oxidoreductases. -Other sequence(s) detected in Swiss-Prot: 10. -Consensus pattern: [GSDN]-[DEQHKM]-x(2)-L-x(3)-[SAG](2)-G(2)-x-G-x(4)-Q-x(2)-[KRS] -Sequences known to belong to this class detected by the pattern: ALL quinone oxidoreductases. -Other sequence(s) detected in Swiss-Prot: NONE.
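The consensus patterns above use the PROSITE notation: elements are separated by '-', 'x' stands for any residue, '[...]' lists the accepted residues, '{...}' lists the excluded residues, and '(n)' or '(n,m)' gives a repeat count. The following is a minimal sketch, not part of PROSITE itself, of how such a pattern could be translated into a regular expression for scanning a sequence. The function name prosite_to_regex and the test peptide are hypothetical and chosen only for illustration; the sketch assumes the elements described here and ignores other features of the notation such as the '<' and '>' terminal anchors.

```python
import re

# One pattern element: a residue, 'x', an accepted set [..] or an excluded
# set {..}, optionally followed by a repeat count (n) or (n,m).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper for illustration only; terminal anchors ('<', '>')
    are not supported and raise ValueError.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # excluded residues
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


# ADH_ZINC consensus pattern as given in this entry.
ADH_ZINC = "G-H-E-x-{EL}-G-{AP}-x(4)-[GA]-x(2)-[IVSAC]"
adh_zinc_regex = prosite_to_regex(ADH_ZINC)

# Synthetic peptide constructed for this example; it is not a real protein.
toy_sequence = "MKGHEAAGVVSEVGKKILT"
hit = re.search(adh_zinc_regex, toy_sequence)
print(adh_zinc_regex)
print(hit.start() if hit else None)
```

A regular expression reports one match per starting position at most, so this sketch only indicates whether and where a pattern occurs; it does not reproduce the scanning options of dedicated PROSITE tools.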
-Expert(s) to contact by email: Joernvall H.; [email protected] Persson B.; [email protected] -Last update: April 2006 / Patterns revised. [ 1] Branden C.-I., Joernvall H., Eklund H., Furugren B. (In) The Enzymes (3rd edition) 11:104-190(1975). [ 2] Joernvall H., Persson B., Jeffery J. Eur. J. Biochem. 167:195-201(1987). [ 3] Sun H.-W., Plapp B.V. "Progressive sequence alignment and molecular evolution of the Zn-containing alcohol dehydrogenase family." J. Mol. Evol. 34:522-535(1992). PubMed=1593644 [ 4] Persson B., Hallborn J., Walfridsson M., Hahn-Hagerdal B., Keranen S., Penttila M., Jornvall H. "Dual relationships of xylitol and alcohol dehydrogenases in families of two protein types." FEBS Lett. 324:9-14(1993). PubMed=8504864 [ 5] Knight M.E., Halpin C., Schuch W. "Identification and characterisation of cDNA clones encoding cinnamyl alcohol dehydrogenase from tobacco." Plant Mol. Biol. 19:793-801(1992). PubMed=1643282 [ 6] Koga H., Aramaki H., Yamaguchi E., Takeuchi K., Horiuchi T., Gunsalus I.C. "camR, a negative regulator locus of the cytochrome P-450cam hydroxylase operon." J. Bacteriol. 166:1089-1095(1986). PubMed=3011733 [ 7] Joernvall H., Persson B., Du Bois G., Lavers G.C., Chen J.H., Gonzalez P., Rao P.V., Zigler J.S. Jr. FEBS Lett. 322:240-244(1993). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00059} {PS00913; ADH_IRON_1} {PS00060; ADH_IRON_2} {BEGIN} ***************************************************** * Iron-containing alcohol dehydrogenases signatures * ***************************************************** Alcohol dehydrogenase (EC 1.1.1.1) (ADH) catalyzes the reversible oxidation of ethanol to acetaldehyde with the concomitant reduction of NAD [1]. Currently three structurally and catalytically different types of alcohol dehydrogenases are known: - Zinc-containing 'long-chain' alcohol dehydrogenases. - Insect-type, or 'short-chain' alcohol dehydrogenases. - Iron-containing alcohol dehydrogenases. Iron-containing ADH's have been found in yeast (gene ADH4) [2], as well as in Zymomonas mobilis (gene adhB) [3]. These two iron-containing ADH's are closely related to the following enzymes: - Escherichia coli propanediol oxidoreductase (EC 1.1.1.77) (gene fucO) [4], an enzyme involved in the metabolism of fucose and which also seems to contain ferrous ion(s). - Clostridium acetobutylicum NADPH- and NADH-dependent butanol dehydrogenases (EC 1.1.1.-) (genes adh1, bdhA and bdhB) [5], enzymes which have activity using butanol and ethanol as substrates. - Escherichia coli adhE [6], an iron-dependent enzyme which harbors three different activities: alcohol dehydrogenase, acetaldehyde dehydrogenase (acetylating) (EC 1.2.1.10) and pyruvate-formate-lyase deactivase. - Bacterial glycerol dehydrogenase (EC 1.1.1.6) (gene gldA or dhaD) [7]. - Clostridium kluyveri NAD-dependent 4-hydroxybutyrate dehydrogenase (4hbd) (EC 1.1.1.61). - Citrobacter freundii and Klebsiella pneumoniae 1,3-propanediol dehydrogenase (EC 1.1.1.202) (gene dhaT). - Bacillus methanolicus NAD-dependent methanol dehydrogenase (EC 1.1.1.244) [8]. - Escherichia coli and Salmonella typhimurium ethanolamine utilization protein eutG. - Escherichia coli hypothetical protein yiaY.
- Escherichia coli hypothetical protein ybdH. - Escherichia coli hypothetical protein yqhD. - Methanococcus jannaschii hypothetical protein MJ0712. The patterns that we developed to detect this class of enzymes are based on two conserved regions. -Consensus pattern: [STALIV]-[LIVF]-x-[DE]-x(6,7)-P-x(4)-[ALIV]-x-[GST]-x(2)-D-[TAIVM]-[LIVMF]-x(4)-E -Sequences known to belong to this class detected by the pattern: ALL, except for a few. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: [GSW]-x-[LIVTSACD]-[GH]-x(2)-[GSAE]-[GSHYQ]-x-[LIVTP]-[GAST]-[GAS]-x(3)-[LIVMT]-x-[HNS]-[GA]-x-[GTAC] -Sequences known to belong to this class detected by the pattern: ALL, except for a few. -Other sequence(s) detected in Swiss-Prot: 1. -Last update: July 1998 / Patterns and text revised. [ 1] Branden C.-I., Joernvall H., Eklund H., Furugren B. (In) The Enzymes (3rd edition) 11:104-190(1975). [ 2] Conway T., Sewell G.W., Osman Y.A., Ingram L.O. "Cloning and sequencing of the alcohol dehydrogenase II gene from Zymomonas mobilis." J. Bacteriol. 169:2591-2597(1987). PubMed=3584063 [ 3] Williamson V.M., Paquin C.E. "Homology of Saccharomyces cerevisiae ADH4 to an iron-activated alcohol dehydrogenase from Zymomonas mobilis." Mol. Gen. Genet. 209:374-381(1987). PubMed=2823079 [ 4] Conway T., Ingram L.O. "Similarity of Escherichia coli propanediol oxidoreductase (fucO product) and an unusual alcohol dehydrogenase from Zymomonas mobilis and Saccharomyces cerevisiae." J. Bacteriol. 171:3754-3759(1989). PubMed=2661535 [ 5] Walter K.A., Bennett G.N., Papoutsakis E.T. "Molecular characterization of two Clostridium acetobutylicum ATCC 824 butanol dehydrogenase isozyme genes." J. Bacteriol. 174:7149-7158(1992). PubMed=1385386 [ 6] Kessler D., Leibrecht I., Knappe J. "Pyruvate-formate-lyase-deactivase and acetyl-CoA reductase activities of Escherichia coli reside on a polymeric protein particle encoded by adhE." FEBS Lett. 281:59-63(1991). PubMed=2015910 [ 7] Truniger V., Boos W.
"Mapping and cloning of gldA, the structural gene of the Escherichia coli glycerol dehydrogenase." J. Bacteriol. 176:1796-1800(1994). PubMed=8132480 [ 8] de Vries G.E., Arfman N., Terpstra P., Dijkhuizen L. J. Bacteriol. 174:5346-5353(1992). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00060} {PS00061; ADH_SHORT} {BEGIN} ********************************************************** * Short-chain dehydrogenases/reductases family signature * ********************************************************** The short-chain dehydrogenases/reductases family (SDR) [1] is a very large family of enzymes, most of which are known to be NAD- or NADP-dependent oxidoreductases. As the first member of this family to be characterized was Drosophila alcohol dehydrogenase, this family used to be called [2,3,4] 'insect-type', or 'short-chain' alcohol dehydrogenases. Most members of this family are proteins of about 250 to 300 amino acid residues. The proteins currently known to belong to this family are listed below. - Alcohol dehydrogenase (EC 1.1.1.1) from insects such as Drosophila. - Acetoin dehydrogenase (EC 1.1.1.5) from Klebsiella terrigena (gene budC). - D-beta-hydroxybutyrate dehydrogenase (BDH) (EC 1.1.1.30) from mammals. - Acetoacetyl-CoA reductase (EC 1.1.1.36) from various bacterial species (gene phbB or phaB). - Glucose 1-dehydrogenase (EC 1.1.1.47) from Bacillus. - 3-beta-hydroxysteroid dehydrogenase (EC 1.1.1.51) from Comamonas testosteroni.
- 20-beta-hydroxysteroid dehydrogenase (EC 1.1.1.53) from Streptomyces hydrogenans. - Ribitol 2-dehydrogenase (EC 1.1.1.56) (RDH) from Klebsiella aerogenes. - Estradiol 17-beta-dehydrogenase (EC 1.1.1.62) from human. - Gluconate 5-dehydrogenase (EC 1.1.1.69) from Gluconobacter oxydans (gene gno). - 3-oxoacyl-[acyl-carrier protein] reductase (EC 1.1.1.100) from Escherichia coli (gene fabG) and from plants. - Retinol dehydrogenase (EC 1.1.1.105) from mammals. - 2-deoxy-d-gluconate 3-dehydrogenase (EC 1.1.1.125) from Escherichia coli and Erwinia chrysanthemi (gene kduD). - Sorbitol-6-phosphate 2-dehydrogenase (EC 1.1.1.140) from Escherichia coli (gene gutD) and from Klebsiella pneumoniae (gene sorD). - 15-hydroxyprostaglandin dehydrogenase (NAD+) (EC 1.1.1.141) from human. - Corticosteroid 11-beta-dehydrogenase (EC 1.1.1.146) (11-DH) from mammals. - 7-alpha-hydroxysteroid dehydrogenase (EC 1.1.1.159) from Escherichia coli (gene hdhA), Eubacterium strain VPI 12708 (gene baiA) and from Clostridium sordellii. - NADPH-dependent carbonyl reductase (EC 1.1.1.184) from mammals. - Tropinone reductase-I (EC 1.1.1.206) and -II (EC 1.1.1.236) from plants. - N-acylmannosamine 1-dehydrogenase (EC 1.1.1.233) from Flavobacterium strain 141-8. - D-arabinitol 2-dehydrogenase (ribulose forming) (EC 1.1.1.250) from fungi. - Tetrahydroxynaphthalene reductase (EC 1.1.1.252) from Magnaporthe grisea. - Pteridine reductase 1 (EC 1.5.1.33) (gene PTR1) from Leishmania. - 2,5-dichloro-2,5-cyclohexadiene-1,4-diol dehydrogenase (EC 1.1.-.-) from Pseudomonas paucimobilis. - Cis-1,2-dihydroxy-3,4-cyclohexadiene-1-carboxylate dehydrogenase (EC 1.3.1. -) from Acinetobacter calcoaceticus (gene benD) and Pseudomonas putida (gene xylL). - Biphenyl-2,3-dihydro-2,3-diol dehydrogenase (EC 1.3.1.-) (gene bphB) from various Pseudomonaceae. - Cis-toluene dihydrodiol dehydrogenase (EC 1.3.1.-) from Pseudomonas putida (gene todD). 
- Cis-benzene glycol dehydrogenase (EC 1.3.1.19) from Pseudomonas putida (gene bnzE).
- 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase (EC 1.3.1.28) from Escherichia coli (gene entA) and Bacillus subtilis (gene dhbA).
- Dihydropteridine reductase (EC 1.5.1.34) (HDHPR) from mammals.
- Lignin degradation enzyme ligD from Pseudomonas paucimobilis.
- Agropine synthesis reductase from Agrobacterium plasmids (gene mas1).
- Versicolorin reductase from Aspergillus parasiticus (gene VER1).
- Putative keto-acyl reductases from Streptomyces polyketide biosynthesis operons.
- A trifunctional hydratase-dehydrogenase-epimerase from the peroxisomal beta-oxidation system of Candida tropicalis. This protein contains two tandemly repeated 'short-chain dehydrogenase-type' domains in its N-terminal extremity.
- Nodulation protein nodG from species of Azospirillum and Rhizobium which is probably involved in the modification of the nodulation Nod factor fatty acyl chain.
- Nitrogen fixation protein fixR from Bradyrhizobium japonicum.
- Bacillus subtilis protein dltE which is involved in the biosynthesis of D-alanyl-lipoteichoic acid.
- Human follicular variant translocation protein 1 (FVT1).
- Mouse adipocyte protein p27.
- Mouse protein Ke 6.
- Maize sex determination protein TASSELSEED 2.
- Sarcophaga peregrina 25 Kd development specific protein.
- Drosophila fat body protein P6.
- A Listeria monocytogenes hypothetical protein encoded in the internalins gene region.
- Escherichia coli hypothetical protein yciK.
- Escherichia coli hypothetical protein ydfG.
- Escherichia coli hypothetical protein yjgI.
- Escherichia coli hypothetical protein yjgU.
- Escherichia coli hypothetical protein yohF.
- Bacillus subtilis hypothetical protein yoxD.
- Bacillus subtilis hypothetical protein ywfD.
- Bacillus subtilis hypothetical protein ywfH.
- Yeast hypothetical protein YIL124w.
- Yeast hypothetical protein YIR035c.
- Yeast hypothetical protein YIR036c.
- Yeast hypothetical protein YKL055c.
- Fission yeast hypothetical protein SpAC23D3.11.

We use as a signature pattern for this family of proteins one of the best conserved regions, which includes two perfectly conserved residues, a tyrosine and a lysine. The tyrosine residue participates in the catalytic mechanism.

-Consensus pattern: [LIVSPADNK]-x(9)-{P}-x(2)-Y-[PSTAGNCV]-[STAGNQCIVM]-[STAGC]-K-{PC}-[SAGFYR]-[LIVMSTAGD]-x-{K}-[LIVMFYW]-{D}-x-{YR}-[LIVMFYWGAPTHQ]-[GSACQRHM]
                    [Y is an active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for 18 sequences.
-Other sequence(s) detected in Swiss-Prot: 35.
-Expert(s) to contact by email:
           Joernvall H.; [email protected]
           Persson B.; [email protected]
-Last update: April 2006 / Pattern revised.

[ 1] Joernvall H., Persson B., Krook M., Atrian S., Gonzalez-Duarte R., Jeffery J., Ghosh D.
     Biochemistry 34:6003-6013(1995).
[ 2] Villarroya A., Juan E., Egestad B., Joernvall H.
     "The primary structure of alcohol dehydrogenase from Drosophila lebanonensis. Extensive variation within insect 'short-chain' alcohol dehydrogenase lacking zinc."
     Eur. J. Biochem. 180:191-197(1989).
     PubMed=2707261
[ 3] Persson B., Krook M., Joernvall H.
     "Characteristics of short-chain alcohol dehydrogenases and related enzymes."
     Eur. J. Biochem. 200:537-543(1991).
     PubMed=1889416
[ 4] Neidle E.L., Hartnett C., Ornston L.N., Bairoch A., Rekik M., Harayama S.
     "cis-diol dehydrogenases encoded by the TOL pWW0 plasmid xylL gene and the Acinetobacter calcoaceticus chromosomal benD gene are members of the short-chain alcohol dehydrogenase superfamily."
     Eur. J. Biochem. 204:113-120(1992).
     PubMed=1740120
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement.
For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00061}
{PS00798; ALDOKETO_REDUCTASE_1}
{PS00062; ALDOKETO_REDUCTASE_2}
{PS00063; ALDOKETO_REDUCTASE_3}
{BEGIN}
*****************************************
* Aldo/keto reductase family signatures *
*****************************************

The aldo-keto reductase family [1,2] groups together a number of structurally and functionally related NADPH-dependent oxidoreductases as well as some other proteins. The proteins known to belong to this family are:

- Aldehyde reductase (EC 1.1.1.2).
- Aldose reductase (EC 1.1.1.21).
- 3-alpha-hydroxysteroid dehydrogenase (EC 1.1.1.50), which terminates androgen action by converting 5-alpha-dihydrotestosterone to 3-alpha-androstanediol.
- Prostaglandin F synthase (EC 1.1.1.188), which catalyzes the reduction of prostaglandins H2 and D2 to F2-alpha.
- D-sorbitol-6-phosphate dehydrogenase (EC 1.1.1.200) from apple.
- Morphine 6-dehydrogenase (EC 1.1.1.218) from Pseudomonas putida plasmid pMDH7.2 (gene morA).
- Chlordecone reductase (EC 1.1.1.225), which reduces the pesticide chlordecone (kepone) to the corresponding alcohol.
- 2,5-diketo-D-gluconic acid reductase (EC 1.1.1.274), which catalyzes the reduction of 2,5-diketogluconic acid to 2-keto-L-gulonic acid, a key intermediate in the production of ascorbic acid.
- NAD(P)H-dependent xylose reductase (EC 1.1.1.-) from the yeast Pichia stipitis. This enzyme reduces xylose to xylitol.
- Trans-1,2-dihydrobenzene-1,2-diol dehydrogenase (EC 1.3.1.20).
- 3-oxo-5-beta-steroid 4-dehydrogenase (EC 1.3.99.6), which catalyzes the reduction of delta(4)-3-oxosteroids.
- A soybean reductase, which co-acts with chalcone synthase in the formation of 4,2',4'-trihydroxychalcone.
- Frog eye lens rho crystallin.
- Yeast GCY protein, whose function is not known.
- Leishmania major P110/11E protein. P110/11E is a developmentally regulated protein whose abundance is markedly elevated in promastigotes compared with amastigotes. Its exact function is not yet known.
- Escherichia coli hypothetical protein yafB.
- Escherichia coli hypothetical protein yghE.
- Yeast hypothetical protein YBR149w.
- Yeast hypothetical protein YHR104w.
- Yeast hypothetical protein YJR096w.

These proteins all have about 300 amino acid residues. We derived 3 consensus patterns specific to this family of proteins. The first one is located in the N-terminal section of these proteins. The second pattern is located in the central section. The third pattern, located in the C-terminal section, is centered on a lysine residue whose chemical modification, in aldose and aldehyde reductases, affects the catalytic efficiency.

-Consensus pattern: G-[FY]-R-[HSAL]-[LIVMF]-D-[STAGCL]-[AS]-x(5)-[EQ]-x(2)-[LIVMCA]-[GS]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Consensus pattern: [LIVMFY]-x(8)-{L}-[KREQ]-{K}-[LIVM]-G-[LIVM]-[SC]-N-[FY]
-Sequences known to belong to this class detected by the pattern: ALL, except for morA.
-Other sequence(s) detected in Swiss-Prot: 5.
-Consensus pattern: [LIVM]-[PAIV]-[KR]-[ST]-{EPQG}-{RFI}-x(2)-R-{SVAF}-x-[GSTAEQK]-[NSL]-x-{LVRI}-[LIVMFA]
                    [K may be the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for yafB.
-Other sequence(s) detected in Swiss-Prot: 41.
-Last update: April 2006 / Patterns revised.

[ 1] Bohren K.M., Bullock B., Wermuth B., Gabbay K.H.
     "The aldo-keto reductase superfamily. cDNAs and deduced amino acid sequences of human aldehyde and aldose reductases."
     J. Biol. Chem. 264:9547-9551(1989).
     PubMed=2498333
[ 2] Bruce N.C., Willey D.L., Coulson A.F.M., Jeffery J.
     "Bacterial morphine dehydrogenase further defines a distinct superfamily of oxidoreductases with diverse functional activities."
     Biochem. J.
     299:805-811(1994).
     PubMed=8192670
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00062}
{PS00064; L_LDH}
{BEGIN}
***************************************
* L-lactate dehydrogenase active site *
***************************************

L-lactate dehydrogenase (EC 1.1.1.27) (LDH) [1] catalyzes the reversible NAD-dependent interconversion of pyruvate and L-lactate. In vertebrate muscles and in lactic acid bacteria it represents the final step in anaerobic glycolysis. This tetrameric enzyme is present in prokaryotic and eukaryotic organisms. In vertebrates there are three isozymes of LDH: the M form (LDH-A), found predominantly in muscle tissues; the H form (LDH-B), found in heart muscle; and the X form (LDH-C), found only in the spermatozoa of mammals and birds. In bird and crocodilian eye lenses, LDH-B serves as a structural protein and is known as epsilon-crystallin [2].

L-2-hydroxyisocaproate dehydrogenase (EC 1.1.1.-) (L-hicDH) [3] catalyzes the reversible and stereospecific interconversion between 2-ketocarboxylic acids and L-2-hydroxy-carboxylic acids. L-hicDH is evolutionarily related to LDHs.

As a signature for LDHs we have selected a region that includes a conserved histidine which is essential to the catalytic mechanism.

-Consensus pattern: [LIVMA]-G-[EQ]-H-G-[DN]-[ST]
                    [H is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 1.
-Last update: November 1995 / Text revised.
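
The consensus patterns in this file use the PROSITE pattern notation: elements are separated by '-', 'x' stands for any residue, '[...]' lists accepted residues, '{...}' lists excluded residues, and a suffix '(n)' or '(n,m)' repeats the preceding element. As a rough illustration only (not an official PROSITE tool), the sketch below translates such a pattern into a Python regular expression and scans a sequence with it. The function name prosite_to_regex and the toy sequence are assumptions made for this example; the anchors '<' and '>' of the full notation are deliberately not handled.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern (e.g. 'D-H-[YF]-L-G-K-[EQK]') into a regex string.

    Hypothetical helper for illustration; supports x, [..], {..}, single
    residues, and the repeat suffixes (n) and (n,m). Anchors are rejected.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Split the element into its core and an optional repeat suffix.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"empty pattern element in {pattern!r}")
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            token = "."                        # any residue
        elif core.startswith("[") and core.endswith("]"):
            token = core                       # accepted residues
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"    # excluded residues
        elif len(core) == 1 and core.isalpha() and core.isupper():
            token = core                       # one fixed residue
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)

# The LDH active-site pattern given above.
ldh_regex = prosite_to_regex("[LIVMA]-G-[EQ]-H-G-[DN]-[ST]")

# Made-up toy sequence, used only to show the scan; it is not a real LDH.
toy_sequence = "MKAGEHGDSLV"
hit = re.search(ldh_regex, toy_sequence)
if hit:
    # The histidine is the 4th element of the pattern, so offset 3 in the match.
    print("match at", hit.start(), "active-site H at", hit.start() + 3)
```

A regex scan of this kind only reports whether the motif occurs; it says nothing about the false-positive and false-negative counts that each entry lists under "Sequences known to belong to this class" and "Other sequence(s) detected in Swiss-Prot".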
[ 1] Abad-Zapatero C., Griffith J.P., Sussman J.L., Rossmann M.G.
     J. Mol. Biol. 198:445-467(1987).
[ 2] Hendriks W., Mulders J.W.M., Bibby M.A., Slingsby C., Bloemendal H., de Jong W.W.
     "Duck lens epsilon-crystallin and lactate dehydrogenase B4 are identical: a single-copy gene product with two distinct functions."
     Proc. Natl. Acad. Sci. U.S.A. 85:7114-7118(1988).
     PubMed=3174623
[ 3] Lerch H.-P., Frank R., Collins J.
     "Cloning, sequencing and expression of the L-2-hydroxyisocaproate dehydrogenase-encoding gene of Lactobacillus confusus in Escherichia coli."
     Gene 83:263-270(1989).
     PubMed=2684788
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00063}
{PS00065; D_2_HYDROXYACID_DH_1}
{PS00670; D_2_HYDROXYACID_DH_2}
{PS00671; D_2_HYDROXYACID_DH_3}
{BEGIN}
*************************************************************
* D-isomer specific 2-hydroxyacid dehydrogenases signatures *
*************************************************************

A number of NAD-dependent 2-hydroxyacid dehydrogenases which seem to be specific for the D-isomer of their substrate have been shown [1,2,3,4] to be functionally and structurally related. These enzymes are listed below.

- D-lactate dehydrogenase (EC 1.1.1.28), a bacterial enzyme which catalyzes the oxidation of D-lactate to pyruvate.
- D-glycerate dehydrogenase (EC 1.1.1.29) (NADH-dependent hydroxypyruvate reductase), a plant leaf peroxisomal enzyme that catalyzes the reduction of hydroxypyruvate to glycerate.
  This reaction is part of the glycolate pathway of photorespiration.
- D-glycerate dehydrogenase from the bacteria Hyphomicrobium methylovorum and Methylobacterium extorquens.
- 3-phosphoglycerate dehydrogenase (EC 1.1.1.95), a bacterial enzyme that catalyzes the oxidation of D-3-phosphoglycerate to 3-phosphohydroxypyruvate. This reaction is the first committed step in the 'phosphorylated' pathway of serine biosynthesis.
- Erythronate-4-phosphate dehydrogenase (EC 1.1.1.-) (gene pdxB), a bacterial enzyme involved in the biosynthesis of pyridoxine (vitamin B6).
- D-2-hydroxyisocaproate dehydrogenase (EC 1.1.1.-) (D-hicDH), a bacterial enzyme that catalyzes the reversible and stereospecific interconversion between 2-ketocarboxylic acids and D-2-hydroxy-carboxylic acids.
- Formate dehydrogenase (EC 1.2.1.2) (FDH) from the bacterium Pseudomonas sp. 101 and various fungi [5].
- Vancomycin resistance protein vanH from Enterococcus faecium; this protein is a D-specific alpha-keto acid dehydrogenase involved in the formation of a peptidoglycan which does not terminate in D-alanine, thus preventing vancomycin binding.
- Escherichia coli hypothetical protein ycdW.
- Escherichia coli hypothetical protein yiaE.
- Haemophilus influenzae hypothetical protein HI1556.
- Yeast hypothetical protein YER081w.
- Yeast hypothetical protein YIL074w.

All these enzymes have similar enzymatic activities and are structurally related. We have selected three of the most conserved regions of these proteins to develop patterns. The first pattern is based on a glycine-rich region located in the central section of these enzymes; this region probably corresponds to the NAD-binding domain. The two other patterns contain a number of conserved charged residues, some of which may play a role in the catalytic mechanism.
-Consensus pattern: [LIVMA]-[AG]-[IVT]-[LIVMFY]-[AG]-x-G-[NHKRQGSAC]-[LIV]-G-x(13,14)-[LIVMFT]-{A}-x-[FYWCTH]-[DNSTK]
-Sequences known to belong to this class detected by the pattern: ALL, except for 5 sequences.
-Other sequence(s) detected in Swiss-Prot: 5.
-Consensus pattern: [LIVMFYWA]-[LIVFYWC]-x(2)-[SAC]-[DNQHR]-[IVFA]-[LIVF]-x-[LIVF]-[HNI]-x-P-x(4)-[STN]-x(2)-[LIVMF]-x-[GSDN]
-Sequences known to belong to this class detected by the pattern: ALL, except for 5 sequences.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Consensus pattern: [LMFATCYV]-[KPQNHAR]-x-[GSTDNK]-x-[LIVMFYWRC]-[LIVMFYW](2)-N-x-[STAGC]-R-[GP]-x-[LIVH]-[LIVMCT]-[DNVE]
-Sequences known to belong to this class detected by the pattern: ALL, except for 2 sequences.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: Escherichia coli D-lactate dehydrogenase (gene dld) does not belong to this family; it is a membrane-bound FAD flavoenzyme.
-Last update: April 2006 / Pattern revised.

[ 1] Grant G.A.
     "A new family of 2-hydroxyacid dehydrogenases."
     Biochem. Biophys. Res. Commun. 165:1371-1374(1989).
     PubMed=2692566
[ 2] Kochhar S., Hunziker P.E., Leong-Morgenthaler P., Hottinger H.
     "Evolutionary relationship of NAD(+)-dependent D-lactate dehydrogenase: comparison of primary structure of 2-hydroxy acid dehydrogenases."
     Biochem. Biophys. Res. Commun. 184:60-66(1992).
     PubMed=1567457
[ 3] Taguchi H., Ohta T.
     "D-lactate dehydrogenase is a member of the D-isomer-specific 2-hydroxyacid dehydrogenase family. Cloning, sequencing, and expression in Escherichia coli of the D-lactate dehydrogenase gene of Lactobacillus plantarum."
     J. Biol. Chem. 266:12588-12594(1991).
     PubMed=1840590
[ 4] Goldberg J.D., Yoshida T., Brick P.
     "Crystal structure of a NAD-dependent D-glycerate dehydrogenase at 2.4 A resolution."
     J. Mol. Biol. 236:1123-1140(1994).
     PubMed=8120891
[ 5] Popov V.O., Lamzin V.S.
     "NAD(+)-dependent formate dehydrogenase."
     Biochem. J. 301:625-643(1994).
     PubMed=8053888
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00064}
{PS00066; HMG_COA_REDUCTASE_1}
{PS00318; HMG_COA_REDUCTASE_2}
{PS01192; HMG_COA_REDUCTASE_3}
{PS50065; HMG_COA_REDUCTASE_4}
{BEGIN}
*********************************************************************
* Hydroxymethylglutaryl-coenzyme A reductase signatures and profile *
*********************************************************************

Hydroxymethylglutaryl-coenzyme A reductase (EC 1.1.1.34) (HMG-CoA reductase) [1,2] catalyzes the NADP-dependent synthesis of mevalonate from 3-hydroxy-3-methylglutaryl-CoA. In vertebrates, HMG-CoA reductase is the rate-limiting enzyme in cholesterol biosynthesis. In plants, mevalonate is the precursor of all isoprenoid compounds.

HMG-CoA reductase is a membrane-bound enzyme. Structurally, it consists of 3 domains: an N-terminal region that contains a variable number of transmembrane segments (7 in mammals, insects and fungi; 2 in plants), a linker region and a C-terminal catalytic domain of approximately 400 amino-acid residues.

In archaebacteria [3] HMG-CoA reductase, which is involved in the biosynthesis of the isoprenoid side chains of lipids, seems to be cytoplasmic and to lack the N-terminal hydrophobic domain.

Some bacteria, such as Pseudomonas mevalonii, can use mevalonate as the sole carbon source. These bacteria use an NAD-dependent HMG-CoA reductase (EC 1.1.1.88) to deacetylate mevalonate into 3-hydroxy-3-methylglutaryl-CoA [3].
The Pseudomonas enzyme is structurally related to the catalytic domain of NADP-dependent HMG-CoA reductases.

We selected three conserved regions as signature patterns for HMG-CoA reductases. The first is located in the center of the catalytic domain, the second is a glycine-rich region located in the C-terminal section of the same catalytic domain, and the third is also located in the C-terminal section and contains a histidine residue that seems [4] to be implicated in the catalytic mechanism as a general base.

-Consensus pattern: [RKH]-x-{Y}-{I}-x-{I}-{L}-D-x-M-G-x-N-x-[LIVMA]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 4.
-Consensus pattern: [LIVM]-G-x-[LIVM]-G-G-[AG]-T
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 6.
-Consensus pattern: A-[LIVM]-x-[STAN]-x(2)-[LI]-x-[KRNQ]-[GSA]-H-[LM]-x-[FYLH]
                    [H is an active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for archaebacterial HMG-CoA reductases.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Caelles C., Ferrer A., Balcells L., Hegardt F.G., Boronat A.
     "Isolation and structural characterization of a cDNA encoding Arabidopsis thaliana 3-hydroxy-3-methylglutaryl coenzyme A reductase."
     Plant Mol. Biol. 13:627-638(1989).
     PubMed=2491679
[ 2] Basson M.E., Thorsness M., Finer-Moore J., Stroud R.M., Rine J.
     "Structural and functional conservation between yeast and human 3-hydroxy-3-methylglutaryl coenzyme A reductases, the rate-limiting enzyme of sterol biosynthesis."
     Mol. Cell. Biol. 8:3797-3808(1988).
     PubMed=3065625
[ 3] Lam W.L., Doolittle W.F.
"Mevinolin-resistant mutations identify a promoter and the gene for a eukaryote-like 3-hydroxy-3-methylglutaryl-coenzyme A reductase in the archaebacterium Haloferax volcanii." J. Biol. Chem. 267:5829-5834(1992). PubMed=1556098 [ 4] Beach M.J., Rodwell V.W. "Cloning, sequencing, and overexpression of mvaA, which encodes Pseudomonas mevalonii 3-hydroxy-3-methylglutaryl coenzyme A reductase." J. Bacteriol. 171:2994-3001(1989). PubMed=2656635 [ 5] Darnay B.G., Wang Y., Rodwell V.W. "Identification of the catalytically important histidine of 3-hydroxy-3-methylglutaryl-coenzyme A reductase." J. Biol. Chem. 267:15064-15070(1992). PubMed=1634543 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00065} {PS00067; 3HCDH} {BEGIN} ********************************************* * 3-hydroxyacyl-CoA dehydrogenase signature * ********************************************* 3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35) (HCDH) [1] is an enzyme involved in fatty acid metabolism, it catalyzes the reduction of 3-hydroxyacylCoA to 3-oxoacyl-CoA. Most eukaryotic cells have 2 fatty-acid beta-oxidation systems, one located in mitochondria and the other in peroxisomes. In peroxisomes 3-hydroxyacyl-CoA dehydrogenase forms, with enoyl-CoA hydratase (ECH) and 3,2-trans-enoyl-CoA isomerase (ECI) a multifunctional enzyme where the Nterminal domain bears the hydratase/isomerase activities and the Cterminal domain the dehydrogenase activity. 
There are two mitochondrial enzymes: one which is monofunctional and the other which is, like its peroxisomal counterpart, multifunctional.

In Escherichia coli (gene fadB) and Pseudomonas fragi (gene faoA) HCDH is part of a multifunctional enzyme which also contains an ECH/ECI domain as well as a 3-hydroxybutyryl-CoA epimerase domain [2].

The other proteins structurally related to HCDH are:

- Bacterial 3-hydroxybutyryl-CoA dehydrogenase (EC 1.1.1.157), which interconverts 3-hydroxybutanoyl-CoA and acetoacetyl-CoA [3].
- Eye lens protein lambda-crystallin [4], which is specific to lagomorphs (such as rabbit).

There are two major regions of similarity in the sequences of proteins of the HCDH family: the first one, located in the N-terminal section, corresponds to the NAD-binding site; the second one is located in the center of the sequence. We have chosen to derive a signature pattern from this central region.

-Consensus pattern: [DNES]-x(2)-[GA]-F-[LIVMFYA]-x-[NT]-R-x(3)-[PA]-[LIVMFY]-[LIVMFYST]-x(5,6)-[LIVMFYCT]-[LIVMFYEAH]-x(2)-[GVE]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Birktoft J.J., Holden H.M., Hamlin R., Xuong N.-H., Banaszak L.J.
     Proc. Natl. Acad. Sci. U.S.A. 84:8262-8266(1987).
[ 2] Nakahigashi K., Inokuchi H.
     "Nucleotide sequence of the fadA and fadB genes from Escherichia coli."
     Nucleic Acids Res. 18:4937-4937(1990).
     PubMed=2204034
[ 3] Mullany P., Clayton C.L., Pallen M.J., Slone R., al-Saleh A., Tabaqchali S.
     "Genes encoding homologues of three consecutive enzymes in the butyrate/butanol-producing pathway of Clostridium acetobutylicum are clustered on the Clostridium difficile chromosome."
     FEMS Microbiol. Lett. 124:61-67(1994).
     PubMed=8001771
[ 4] Mulders J.W.M., Hendriks W., Blankesteijn W.M., Bloemendal H., de Jong W.W.
     "Lambda-crystallin, a major rabbit lens protein, is related to hydroxyacyl-coenzyme A dehydrogenases."
     J.
     Biol. Chem. 263:15462-15466(1988).
     PubMed=3170592
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00066}
{PS00068; MDH}
{BEGIN}
**********************************************
* Malate dehydrogenase active site signature *
**********************************************

Malate dehydrogenase (EC 1.1.1.37) (MDH) [1,2] catalyzes the interconversion of malate and oxaloacetate utilizing the NAD/NADH cofactor system. The enzyme participates in the citric acid cycle and exists in all aerobic organisms.

While prokaryotic organisms contain a single form of MDH, in eukaryotic cells there are two isozymes: one which is located in the mitochondrial matrix and the other in the cytoplasm. Fungi and plants also harbor a glyoxysomal form which functions in the glyoxylate pathway. In plant chloroplasts there is an additional NADP-dependent form of MDH (EC 1.1.1.82) which is essential for both the universal C3 photosynthesis (Calvin) cycle and the more specialized C4 cycle.

As a signature pattern for this enzyme we have chosen a region that includes two residues involved in the catalytic mechanism [3]: an aspartic acid which is involved in a proton relay mechanism, and an arginine which binds the substrate.

-Consensus pattern: [LIVM]-T-[TRKMN]-L-D-x(2)-R-[STA]-x(3)-[LIVMFY]
                    [D and R are the active site residues]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: MDHs from archaebacteria do not belong to the above family; they are evolutionarily related to lactate dehydrogenases [4].
-Last update: November 1995 / Text revised.

[ 1] McAlister-Henn L.
     Trends Biochem. Sci. 13:178-181(1988).
[ 2] Gietl C.
     "Malate dehydrogenase isoenzymes: cellular locations and role in the flow of metabolites between the cytoplasm and cell organelles."
     Biochim. Biophys. Acta 1100:217-234(1992).
     PubMed=1610875
[ 3] Birktoft J.J., Rhodes G., Banaszak L.J.
     "Refined crystal structure of cytoplasmic malate dehydrogenase at 2.5-A resolution."
     Biochemistry 28:6065-6081(1989).
     PubMed=2775751
[ 4] Cendrin F., Chroboczek J., Zaccai G., Eisenberg H., Mevarech M.
     "Cloning, sequencing, and expression in Escherichia coli of the gene coding for malate dehydrogenase of the extremely halophilic archaebacterium Haloarcula marismortui."
     Biochemistry 32:4308-4313(1993).
     PubMed=8476859
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00067}
{PS00069; G6P_DEHYDROGENASE}
{BEGIN}
*************************************************
* Glucose-6-phosphate dehydrogenase active site *
*************************************************

Glucose-6-phosphate dehydrogenase (EC 1.1.1.49) (G6PD) [1] catalyzes the first step in the pentose pathway, the oxidation of glucose-6-phosphate to gluconolactone 6-phosphate. A lysine residue has been identified as a reactive nucleophile associated with the activity of the enzyme.
The sequence around this lysine is totally conserved from bacterial to mammalian G6PDs and can be used as a signature pattern.

-Consensus pattern: D-H-[YF]-L-G-K-[EQK]
                    [K is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Jeffery J., Persson B., Wood I., Bergman T., Jeffery R., Joernvall H.
     "Glucose-6-phosphate dehydrogenase. Structure-function relationships and the Pichia jadinii enzyme structure."
     Eur. J. Biochem. 212:41-49(1993).
     PubMed=8444164
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00068}
{PS00687; ALDEHYDE_DEHYDR_GLU}
{PS00070; ALDEHYDE_DEHYDR_CYS}
{BEGIN}
****************************************
* Aldehyde dehydrogenases active sites *
****************************************

Aldehyde dehydrogenases (EC 1.2.1.3 and EC 1.2.1.5) are enzymes which oxidize a wide variety of aliphatic and aromatic aldehydes. In mammals at least four different forms of the enzyme are known [1]: class-1 (or Ald C), a tetrameric cytosolic enzyme; class-2 (or Ald M), a tetrameric mitochondrial enzyme; class-3 (or Ald D), a dimeric cytosolic enzyme; and class IV, a microsomal enzyme. Aldehyde dehydrogenases have also been sequenced from fungal and bacterial species. A number of enzymes are known to be evolutionarily related to aldehyde dehydrogenases; these enzymes are listed below.
- Plant and bacterial betaine-aldehyde dehydrogenase (EC 1.2.1.8) [2], an enzyme that catalyzes the last step in the biosynthesis of betaine.
- Plant and bacterial NADP-dependent glyceraldehyde-3-phosphate dehydrogenase (EC 1.2.1.9).
- Escherichia coli succinate-semialdehyde dehydrogenase (NADP+) (EC 1.2.1.16) (gene gabD) [3], which oxidizes succinate semialdehyde to succinate.
- Escherichia coli lactaldehyde dehydrogenase (EC 1.2.1.22) (gene ald) [4].
- Mammalian succinate semialdehyde dehydrogenase (NAD+) (EC 1.2.1.24).
- Escherichia coli phenylacetaldehyde dehydrogenase (EC 1.2.1.39).
- Escherichia coli 5-carboxymethyl-2-hydroxymuconate semialdehyde dehydrogenase (gene hpcC).
- Pseudomonas putida 2-hydroxymuconic semialdehyde dehydrogenase [5] (genes dmpC and xylG), an enzyme in the meta-cleavage pathway for the degradation of phenols, cresols and catechol.
- Bacterial and mammalian methylmalonate-semialdehyde dehydrogenase (MMSDH) (EC 1.2.1.27) [6], an enzyme involved in the distal pathway of valine catabolism.
- Yeast delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.5.1.12) [7] (gene PUT2), which converts proline to glutamate.
- Bacterial multifunctional putA protein, which contains a delta-1-pyrroline-5-carboxylate dehydrogenase domain.
- 26G, a garden pea protein of unknown function which is induced by dehydration of shoots [8].
- Mammalian formyltetrahydrofolate dehydrogenase (EC 1.5.1.6) [9]. This is a cytosolic enzyme responsible for the NADP-dependent decarboxylative reduction of 10-formyltetrahydrofolate into tetrahydrofolate. It is a protein of about 900 amino acids which consists of three domains; the C-terminal domain (480 residues) is structurally and functionally related to aldehyde dehydrogenases.
- Yeast hypothetical protein YBR006w.
- Yeast hypothetical protein YER073w.
- Yeast hypothetical protein YHR039c.
- Caenorhabditis elegans hypothetical protein F01F1.6.

A glutamic acid and a cysteine residue have been implicated in the catalytic activity of mammalian aldehyde dehydrogenase. These residues are conserved in all the enzymes of this family. We have derived two patterns for this family, one for each of the active site residues.

-Consensus pattern: [LIVMFGA]-E-[LIMSTAC]-[GS]-G-[KNLM]-[SADN]-[TAPFV]
                    [E is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for 13 sequences.
-Other sequence(s) detected in Swiss-Prot: 44.
-Consensus pattern: [FYLVA]-x-{GVEP}-{DILV}-G-[QE]-{LPYG}-C-[LIVMGSTANC]-[AGCN]-{HE}-[GSTADNEKR]
                    [C is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for 13 sequences.
-Other sequence(s) detected in Swiss-Prot: 90.
-Note: Omega-crystallins are minor structural components of squid and octopus eye lenses. They are evolutionarily related to aldehyde dehydrogenases but have lost their catalytic activity. These patterns will not detect them.
-Expert(s) to contact by email:
           Joernvall H.; [email protected]
           Persson B.; [email protected]
-Last update: April 2006 / Pattern revised.

[ 1] Hempel J., Harper K., Lindahl R.
     "Inducible (class 3) aldehyde dehydrogenase from rat hepatocellular carcinoma and 2,3,7,8-tetrachlorodibenzo-p-dioxin-treated liver: distant relationship to the class 1 and 2 enzymes from mammalian liver cytosol/mitochondria."
     Biochemistry 28:1160-1167(1989).
     PubMed=2713359
[ 2] Weretilnyk E.A., Hanson A.D.
     "Molecular cloning of a plant betaine-aldehyde dehydrogenase, an enzyme implicated in adaptation to salinity and drought."
     Proc. Natl. Acad. Sci. U.S.A. 87:2745-2749(1990).
     PubMed=2320587
[ 3] Niegemann E., Schulz A., Bartsch K.
     "Molecular organization of the Escherichia coli gab cluster: nucleotide sequence of the structural genes gabD and gabP and expression of the GABA permease gene."
     Arch. Microbiol. 160:454-460(1993).
     PubMed=8297211
[ 4] Hidalgo E., Chen Y.-M., Lin E.C.C., Aguilar J.
"Molecular cloning and DNA sequencing of the Escherichia coli K-12 ald gene encoding aldehyde dehydrogenase." J. Bacteriol. 173:6118-6123(1991). PubMed=1917845 [ 5] Nordlund I., Shingler V. "Nucleotide sequences of the meta-cleavage pathway enzymes 2-hydroxymuconic semialdehyde dehydrogenase and 2-hydroxymuconic semialdehyde hydrolase from Pseudomonas CF600." Biochim. Biophys. Acta 1049:227-230(1990). PubMed=2194577 [ 6] Steele M.I., Lorenz D., Hatter K., Park A., Sokatch J.R. "Characterization of the mmsAB operon of Pseudomonas aeruginosa PAO encoding methylmalonate-semialdehyde dehydrogenase and 3-hydroxyisobutyrate dehydrogenase." J. Biol. Chem. 267:13585-13592(1992). PubMed=1339433 [ 7] Krzywicki K.A., Brandriss M.C. "Primary structure of the nuclear PUT2 gene involved in the mitochondrial pathway for proline utilization in Saccharomyces cerevisiae." Mol. Cell. Biol. 4:2837-2842(1984). PubMed=6098824 [ 8] Guerrero F.D., Jones J.T., Mullet J.E. "Turgor-responsive gene transcription and RNA levels increase rapidly when pea shoots are wilted. Sequence and expression of three inducible genes." Plant Mol. Biol. 15:11-26(1990). PubMed=1715781 [ 9] Cook R.J., Lloyd R.S., Wagner C. "Isolation and characterization of cDNA clones for rat liver 10-formyltetrahydrofolate dehydrogenase." J. Biol. Chem. 266:4965-4973(1991). PubMed=1848231 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+
{END}
{PDOC00069}
{PS00071; GAPDH}
{BEGIN}
********************************************************
* Glyceraldehyde 3-phosphate dehydrogenase active site *
********************************************************

Glyceraldehyde 3-phosphate dehydrogenase (EC 1.2.1.12) (GAPDH) [1] is a tetrameric NAD-binding enzyme common to both the glycolytic and gluconeogenic pathways. A cysteine in the middle of the molecule is involved in forming a covalent phosphoglycerol thioester intermediate. The sequence around this cysteine is totally conserved in eubacterial and eukaryotic GAPDHs and is also present, albeit in a variant form, in the otherwise highly divergent archaebacterial GAPDH [2].

Escherichia coli D-erythrose 4-phosphate dehydrogenase (E4PDH) (gene epd or gapB) is an enzyme highly related to GAPDH [3].

-Consensus pattern: [ASV]-S-C-[NT]-T-{S}-x-[LIM]
 [C is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for Bacillus megaterium GAPDH which has Pro instead of Ser in position 2 of the pattern.
-Other sequence(s) detected in Swiss-Prot: 5.
-Last update: December 2004 / Pattern and text revised.

[ 1] Harris J.I., Waters M. (In) The Enzymes (3rd edition) 13:1-50(1976).
[ 2] Fabry S., Lang J., Niermann T., Vingron M., Hensel R. "Nucleotide sequence of the glyceraldehyde-3-phosphate dehydrogenase gene from the mesophilic methanogenic archaebacteria Methanobacterium bryantii and Methanobacterium formicicum. Comparison with the respective gene structure of the closely related extreme thermophile Methanothermus fervidus." Eur. J. Biochem. 179:405-413(1989). PubMed=2492940
[ 3] Zhao G., Pease A.J., Bharani N., Winkler M.E. "Biochemical characterization of gapB-encoded erythrose 4-phosphate dehydrogenase of Escherichia coli K-12 and its possible role in pyridoxal 5'-phosphate biosynthesis." J. Bacteriol. 177:2804-2812(1995).
PubMed=7751290
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00070}
{PS00072; ACYL_COA_DH_1}
{PS00073; ACYL_COA_DH_2}
{BEGIN}
**************************************
* Acyl-CoA dehydrogenases signatures *
**************************************

Acyl-CoA dehydrogenases [1,2,3] are enzymes that catalyze the alpha,beta-dehydrogenation of acyl-CoA esters and transfer electrons to ETF, the electron transfer protein. Acyl-CoA dehydrogenases are FAD flavoproteins. This family currently includes:

- Five eukaryotic isozymes that catalyze the first step of the beta-oxidation cycles for fatty acids with various chain lengths. These are short (SCAD) (EC 1.3.99.2), medium (MCAD) (EC 1.3.99.3), long (LCAD) (EC 1.3.99.13), very-long (VLCAD) and short/branched (SBCAD) chain acyl-CoA dehydrogenases. These enzymes are located in the mitochondrion. They are all homotetrameric proteins of about 400 amino acid residues except VLCAD, which is a dimer and which contains, in its mature form, about 600 residues.
- Glutaryl-CoA dehydrogenase (EC 1.3.99.7) (GCDH), which is involved in the catabolism of lysine, hydroxylysine and tryptophan.
- Isovaleryl-CoA dehydrogenase (EC 1.3.99.10) (IVD), involved in the catabolism of leucine.
- Acyl-CoA dehydrogenases acdA and mmgC from Bacillus subtilis.
- Butyryl-CoA dehydrogenase (EC 1.3.99.2) from Clostridium acetobutylicum.
- Escherichia coli protein caiA [4].
- Escherichia coli protein aidB.

We have selected two conserved regions as signature patterns. The first is located in the center of these enzymes, the second in the C-terminal section.

-Consensus pattern: [GAC]-[LIVM]-[ST]-E-x(2)-[GSAN]-G-[ST]-D-x(2)-[GSA]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: [QDE]-x-{P}-G-[GS]-x-G-[LIVMFY]-x(2)-[DEN]-x(4)-[KR]-x(3)-[DEN]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Last update: December 2004 / Pattern and text revised.

[ 1] Tanaka K., Ikeda Y., Matsubara Y., Hyman D.B. "Molecular basis of isovaleric acidemia and medium-chain acyl-CoA dehydrogenase deficiency." Enzyme 38:91-107(1987). PubMed=3326738
[ 2] Matsubara Y., Indo Y., Naito E., Ozasa H., Glassberg R., Vockley J., Ikeda Y., Kraus J., Tanaka K. "Molecular cloning and nucleotide sequence of cDNAs encoding the precursors of rat long chain acyl-coenzyme A, short chain acyl-coenzyme A, and isovaleryl-coenzyme A dehydrogenases. Sequence homology of four enzymes of the acyl-CoA dehydrogenase family." J. Biol. Chem. 264:16321-16331(1989). PubMed=2777793
[ 3] Aoyama T., Ueno I., Kamijo T., Hashimoto T. "Rat very-long-chain acyl-CoA dehydrogenase, a novel mitochondrial acyl-CoA dehydrogenase gene product, is a rate-limiting enzyme in long-chain fatty acid beta-oxidation system. cDNA and deduced amino acid sequence and distinct specificities of the cDNA-expressed protein." J. Biol. Chem. 269:19088-19094(1994). PubMed=8034667
[ 4] Eichler K., Bourgis F., Buchet A., Kleber H.-P., Mandrand-Berthelot M.-A. "Molecular characterization of the cai operon necessary for carnitine metabolism in Escherichia coli." Mol. Microbiol. 13:775-786(1994). PubMed=7815937
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB).
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00071}
{PS00074; GLFV_DEHYDROGENASE}
{BEGIN}
****************************************************
* Glu / Leu / Phe / Val dehydrogenases active site *
****************************************************

- Glutamate dehydrogenases (EC 1.4.1.2, EC 1.4.1.3, and EC 1.4.1.4) (GluDH) are enzymes that catalyze the NAD- or NADP-dependent reversible deamination of glutamate into alpha-ketoglutarate [1,2]. GluDH isozymes are generally involved with either ammonia assimilation or glutamate catabolism.
- Leucine dehydrogenase (EC 1.4.1.9) (LeuDH) is a NAD-dependent enzyme that catalyzes the reversible deamination of leucine and several other aliphatic amino acids to their keto analogues [3].
- Phenylalanine dehydrogenase (EC 1.4.1.20) (PheDH) is a NAD-dependent enzyme that catalyzes the reversible deamination of L-phenylalanine into phenylpyruvate [4].
- Valine dehydrogenase (EC 1.4.1.8) (ValDH) is a NADP-dependent enzyme that catalyzes the reversible deamination of L-valine into 3-methyl-2-oxobutanoate [5].

These dehydrogenases are structurally and functionally related. A conserved lysine residue located in a glycine-rich region has been implicated in the catalytic mechanism. The conservation of the region around this residue allows the derivation of a signature pattern for this type of enzyme.

-Consensus pattern: [LIV]-x(2)-G-G-[SAG]-K-x-[GV]-x(3)-[DNST]-[PL]
 [K is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: All known sequences from this family have Pro in the last position of the pattern with the exception of yeast GluDH, which has Leu.
-Last update: November 1997 / Pattern and text revised.

[ 1] Britton K.L., Baker P.J., Rice D.W., Stillman T.J. "Structural relationship between the hexameric and tetrameric family of glutamate dehydrogenases." Eur. J. Biochem. 209:851-859(1992). PubMed=1358610
[ 2] Benachenhou-Lahfa N., Forterre P., Labedan B. J. Mol. Evol. 36:335-346(1993).
[ 3] Nagata S., Tanizawa K., Esaki N., Sakamoto Y., Ohshima T., Tanaka H., Soda K. "Gene cloning and sequence determination of leucine dehydrogenase from Bacillus stearothermophilus and structural comparison with other NAD(P)+-dependent dehydrogenases." Biochemistry 27:9056-9062(1988). PubMed=3069133
[ 4] Takada H., Yoshimura T., Ohshima T., Esaki N., Soda K. "Thermostable phenylalanine dehydrogenase of Thermoactinomyces intermedius: cloning, expression, and sequencing of its gene." J. Biochem. 109:371-376(1991). PubMed=1880121
[ 5] Tang L., Hutchinson C.R. "Sequence, transcriptional, and functional analyses of the valine (branched-chain amino acid) dehydrogenase gene of Streptomyces coelicolor." J. Bacteriol. 175:4176-4185(1993). PubMed=8320231
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00072}
{PS00075; DHFR_1}
{PS51330; DHFR_2}
{BEGIN}
***************************************************************
* Dihydrofolate reductase (DHFR) domain signature and profile *
***************************************************************

Dihydrofolate reductases (DHFRs) (EC 1.5.1.3) [1] are ubiquitous enzymes which catalyze the NADPH-linked reduction of 7,8-dihydrofolate to 5,6,7,8-tetrahydrofolate. DHFRs are also capable of catalyzing the NADPH-linked reduction of folate to 7,8-dihydrofolate, but at a lesser rate, which varies among species. They can be inhibited by a number of antagonists such as trimethoprim and methotrexate, which are used as antibacterial or anticancer agents.

Thymidylate synthase (TS) (see <PDOC00086>) and DHFR catalyze sequential reactions in the thymidylate cycle, which supplies cells with their sole de novo source of 2'-deoxythymidylate (dTMP) for DNA synthesis. TS catalyzes a reductive methylation of 2'-deoxyuridylate (dUMP) to form dTMP in which the cofactor for the reaction, 5,10-methylenetetrahydrofolate, is converted to dihydrofolate (FH(2)). DHFR then reduces FH(2) to tetrahydrofolate (FH(4)) in a reaction requiring NADPH. In sources as diverse as bacteriophage, prokaryotes, fungi, mammalian viruses, and vertebrates, TS and DHFR are distinct monofunctional enzymes. Protozoa and at least some plants are unusual in having a joined bifunctional polypeptide that catalyzes both reactions [2,3].

An eight-stranded beta sheet consisting of seven parallel strands and a carboxy-terminal antiparallel strand composes the core of the DHFR domain. The beta-sheet core is flanked by alpha-helices (see <PDB:1DRH>) [2-6].

We have derived a signature pattern from a region in the N-terminal part of the DHFR domain, which includes a conserved Pro-Trp dipeptide; the tryptophan has been shown [7] to be involved in the binding of substrate by the enzyme.
We have also developed a profile, which covers the entire DHFR domain.

-Consensus pattern: [LVAGC]-[LIF]-G-x(4)-[LIVMF]-P-W-x(4,5)-[DE]-x(3)-[FYIV]-x(3)-[STIQ]
-Sequences known to belong to this class detected by the pattern: ALL, except for type II bacterial, plasmid-encoded, dihydrofolate reductases which do not belong to the same class of enzymes.
-Other sequence(s) detected in Swiss-Prot: 1.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Last update: September 2007 / Text revised; profile added.

[ 1] Harpers' Review of Biochemistry, Lange, Los Altos (1985).
[ 2] Knighton D.R., Kan C.-C., Howland E., Janson C.A., Hostomska Z., Welsh K.M., Matthews D.A. "Structure of and kinetic channelling in bifunctional dihydrofolate reductase-thymidylate synthase." Nat. Struct. Biol. 1:186-194(1994). PubMed=7656037
[ 3] Yuvaniyama J., Chitnumsub P., Kamchonwongpaisan S., Vanichtanankul J., Sirawaraporn W., Taylor P., Walkinshaw M.D., Yuthavong Y. "Insights into antifolate resistance from malarial DHFR-TS structures." Nat. Struct. Biol. 10:357-365(2003). PubMed=12704428; DOI=10.1038/nsb921
[ 4] Davies J.F. II, Delcamp T.J., Prendergast N.J., Ashford V.A., Freisheim J.H., Kraut J. "Crystal structures of recombinant human dihydrofolate reductase complexed with folate and 5-deazafolate." Biochemistry 29:9467-9479(1990). PubMed=2248959
[ 5] McTigue M.A., Davies J.F. II, Kaufman B.T., Kraut J. "Crystal structure of chicken liver dihydrofolate reductase complexed with NADP+ and biopterin." Biochemistry 31:7264-7273(1992). PubMed=1510919
[ 6] Reyes V.M., Sawaya M.R., Brown K.A., Kraut J. "Isomorphous crystal structures of Escherichia coli dihydrofolate reductase complexed with folate, 5-deazafolate, and 5,10-dideazatetrahydrofolate: mechanistic implications." Biochemistry 34:2710-2723(1995). PubMed=7873554
[ 7] Bolin J.T., Filman D.J., Matthews D.A., Hamlin R.C., Kraut J. "Dihydrofolate reductase.
The stereochemistry of inhibitor selectivity." J. Biol. Chem. 257:13650-13662(1982). PubMed=3880743
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00073}
{PS00076; PYRIDINE_REDOX_1}
{BEGIN}
**********************************************************************
* Pyridine nucleotide-disulphide oxidoreductases class-I active site *
**********************************************************************

The pyridine nucleotide-disulphide oxidoreductases are FAD flavoproteins which contain a pair of redox-active cysteines involved in the transfer of reducing equivalents from the FAD cofactor to the substrate. On the basis of sequence and structural similarities [1] these enzymes can be classified into two categories. The first category groups together the following enzymes [2 to 6]:

- Glutathione reductase (EC 1.8.1.7) (GR).
- Higher eukaryotes thioredoxin reductase (EC 1.8.1.9).
- Trypanothione reductase (EC 1.8.1.12).
- Lipoamide dehydrogenase (EC 1.8.1.4), the E3 component of alpha-ketoacid dehydrogenase complexes.
- Mercuric reductase (EC 1.16.1.1).

The sequence around the two cysteines involved in the redox-active disulfide bond is conserved and can be used as a signature pattern.

-Consensus pattern: G-G-x-C-[LIVA]-x(2)-G-C-[LIVM]-P
 [The 2 C's form the active site disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: In positions 6 and 7 of the pattern all known sequences have Asn-(Val/Ile) with the exception of GR from plant chloroplasts and from cyanobacteria, which have Ile-Arg [7].
-Last update: May 2004 / Text revised.

[ 1] Kuriyan J., Krishna T.S.R., Wong L., Guenther B., Pahler A., Williams C.H. Jr., Model P. Nature 352:172-174(1991).
[ 2] Rice D.W., Schulz G.E., Guest J.R. "Structural relationship between glutathione reductase and lipoamide dehydrogenase." J. Mol. Biol. 174:483-496(1984). PubMed=6546954
[ 3] Brown N.L. Trends Biochem. Sci. 10:400-402(1985).
[ 4] Carothers D.J., Pons G., Patel M.S. "Dihydrolipoamide dehydrogenase: functional similarities and divergent evolution of the pyridine nucleotide-disulfide oxidoreductases." Arch. Biochem. Biophys. 268:409-425(1989). PubMed=2643922
[ 5] Walsh C.T., Bradley M., Nadeau K. "Molecular studies on trypanothione reductase, a target for antiparasitic drugs." Trends Biochem. Sci. 16:305-309(1991). PubMed=1957352
[ 6] Gasdaska P.Y., Gasdaska J.R., Cochran S., Powis G. "Cloning and sequencing of a human thioredoxin reductase." FEBS Lett. 373:5-9(1995). PubMed=7589432
[ 7] Creissen G., Edwards E.A., Enard C., Wellburn A., Mullineaux P. "Molecular characterization of glutathione reductase cDNAs from pea (Pisum sativum L.)." Plant J. 2:129-131(1992). PubMed=1303792
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00074}
{PS00077; COX1_CUB}
{PS50855; COX1}
{BEGIN}
********************************************************
* Cytochrome c oxidase subunit I signature and profile *
********************************************************

Cytochrome c oxidases (EC 1.9.3.1) [1] are oligomeric integral membrane protein complexes that catalyze the terminal step in the respiratory chain: they transfer electrons from cytochrome c or a quinol to oxygen. Some terminal oxidases generate a transmembrane proton gradient across the plasma membrane (prokaryotes) or the mitochondrial inner membrane (eukaryotes).

The enzyme complex consists of 3-4 subunits (prokaryotes) to up to 13 polypeptides (mammals), of which only the catalytic subunit (equivalent to mammalian subunit 1 (CO I)) is found in all heme-copper respiratory oxidases. The presence of a bimetallic center, formed by a high-spin heme (heme a3) and copper B, as well as a low-spin heme (heme a), both ligated to six conserved histidine residues near the outer side of four transmembrane spans within CO I, is common to all family members [2-4].

In contrast to eukaryotes, the respiratory chain of prokaryotes branches to multiple terminal oxidases. The enzyme complexes vary in heme and copper composition, substrate type and substrate affinity. The different respiratory oxidases allow the cells to customize their respiratory systems according to a variety of environmental growth conditions [1].

The crystal structure of the whole enzyme complex has been solved [5]. Subunit I contains 12 transmembrane helical segments and binds heme a and the heme a3-copper B binuclear centre where molecular oxygen is reduced to water (see <PDB:1OCZ; A>).

Recently, a component of an anaerobic respiratory chain has also been found to contain the copper B binding signature of this family: nitric oxide reductase (NOR), which exists in denitrifying species of Archaea and Eubacteria.

Enzymes that belong to this family are:

- Mitochondrial-type cytochrome c oxidase (EC 1.9.3.1) which uses cytochrome c as electron donor. The electrons are transferred via copper A (Cu(A)) and heme a to the bimetallic center of CO I that is formed by a pentacoordinated heme a and copper B (Cu(B)). Subunit 1 contains 12 transmembrane regions. Cu(B) is said to be ligated to three of the conserved histidine residues within the transmembrane segments 6 and 7.
- Quinol oxidase from prokaryotes that transfers electrons from a quinol to the binuclear center of polypeptide I. This category of enzymes includes Escherichia coli cytochrome O terminal oxidase complex which is a component of the aerobic respiratory chain that predominates when cells are grown at high aeration.
- FixN, the catalytic subunit of a cytochrome c oxidase expressed in nitrogen-fixing bacteroids living in root nodules. The high affinity for oxygen allows oxidative phosphorylation under low oxygen concentrations. A similar enzyme has been found in other purple bacteria.
- Nitric oxide reductase (EC 1.7.99.7) from Pseudomonas stutzeri. NOR reduces nitric oxide to nitrous oxide. It is a heterodimer of norC and the catalytic subunit norB. The latter contains the 6 invariant histidine residues and 12 transmembrane segments [6].

As a signature pattern we used the copper-binding region. We also developed a profile that covers the whole subunit I.

-Consensus pattern: [YWG]-[LIVFYWTA](2)-[VGS]-H-[LNP]-x-V-x(44,47)-H-H
 [The 3 H's are copper B ligands]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: Cytochrome bd complexes do not belong to this family.
-Last update: June 2004 / Text revised; profile added.

[ 1] Garcia-Horsman J.A., Barquera B., Rumbley J., Ma J., Gennis R.B. J. Bacteriol. 176:5587-5600(1994).
[ 2] Castresana J., Luebben M., Saraste M., Higgins D.G. "Evolution of cytochrome oxidase, an enzyme older than atmospheric oxygen." EMBO J. 13:2516-2525(1994). PubMed=8013452
[ 3] Capaldi R.A., Malatesta F., Darley-Usmar V.M. "Structure of cytochrome c oxidase." Biochim. Biophys. Acta 726:135-148(1983). PubMed=6307356
[ 4] Holm L., Saraste M., Wikstrom M. "Structural models of the redox centres in cytochrome oxidase." EMBO J. 6:2819-2823(1987). PubMed=2824194
[ 5] Yoshikawa S., Shinzawa-Itoh K., Nakashima R., Yaono R., Yamashita E., Inoue N., Yao M., Fei M.J., Libeu C.P., Mizushima T., Yamaguchi H., Tomizaki T., Tsukihara T. "Redox-coupled crystal structural changes in bovine heart cytochrome c oxidase." Science 280:1723-1729(1998). PubMed=9624044
[ 6] Saraste M., Castresana J. "Cytochrome oxidase evolved by tinkering with denitrification enzymes." FEBS Lett. 341:1-4(1994). PubMed=8137905
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00075}
{PS00078; COX2}
{PS50999; COX2_TM}
{PS50857; COX2_CUA}
{BEGIN}
**********************************************************
* Cytochrome c oxidase subunit II signature and profiles *
**********************************************************

Cytochrome c oxidase (EC 1.9.3.1) [1,2] is an oligomeric enzymatic complex which is a component of the respiratory chain and is involved in the transfer of electrons from cytochrome c to oxygen.
In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The enzyme complex consists of 3-4 subunits (prokaryotes) to up to 13 polypeptides (mammals).

Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It contains two adjacent transmembrane regions in its N-terminus and the major part of the protein is exposed to the periplasmic or to the mitochondrial intermembrane space, respectively. CO II provides the substrate-binding site and contains a copper center called Cu(A), located in the extramembrane domain (see <PDB:1OCZ; B>), probably the primary acceptor in cytochrome c oxidase. An exception is the corresponding subunit of the cbb3-type oxidase, which lacks the copper A redox-center. Several bacterial CO II have a C-terminal extension that contains a covalently bound heme c.

It has been shown [3,4] that nitrous oxide reductase (EC 1.7.99.6) (gene nosZ) of Pseudomonas has sequence similarity in its C-terminus to CO II. This enzyme is part of the bacterial respiratory system which is activated under anaerobic conditions in the presence of nitrate or nitrous oxide. NosZ is a periplasmic homodimer that contains a dinuclear copper center, probably located in a 3-dimensional fold similar to the cupredoxin-like fold that has been suggested for the copper-binding site of CO II [3].

The dinuclear purple copper center is formed by 2 histidines and 2 cysteines [5]. We used this region as a signature pattern. The conserved valine and the conserved methionine are said to be involved in stabilizing the copper-binding fold by interacting with each other. We also developed two profiles, one directed against the transmembrane region and one against the copper center.
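
As an illustration only, the pattern given for this entry, V-x-H-x(33,40)-C-x(3)-C-x(3)-H-x(2)-M, can be read as a regular expression. The sketch below is not part of PROSITE and is not the scanning method used by the database; the helper name prosite_to_regex and the toy sequence are assumptions made up for this example. It handles only the syntax elements that appear in the patterns of these entries (single residues, x, [..] allowed sets, {..} excluded sets, and (n) or (n,m) repeat counts) and ignores the '<' and '>' anchors.

```python
import re

# One pattern element: a residue, x, an allowed set [..] or an excluded set {..},
# optionally followed by a repeat count (n) or a range (n,m).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex string (hypothetical helper)."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = m.groups()
        if core == "x":
            regex = "."                       # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"   # any residue except those listed
        else:
            regex = core                      # single residue or [..] set
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)


COX2_PATTERN = "V-x-H-x(33,40)-C-x(3)-C-x(3)-H-x(2)-M"
COX2_REGEX = re.compile(prosite_to_regex(COX2_PATTERN))

# Toy sequence invented for the example (not a real CO II sequence):
# the motif starts at index 2 and uses a 35-residue spacer.
toy = "AA" + "VGH" + "A" * 35 + "C" + "AAA" + "C" + "AAA" + "H" + "AA" + "M" + "AA"
match = COX2_REGEX.search(toy)
print(COX2_REGEX.pattern)
print(match.span() if match else None)
```

The two histidines and two cysteines in the regex correspond to the copper ligands named in the text; the variable-length x(33,40) spacer is what makes a regex (rather than a fixed-width window) a convenient way to express this pattern.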

-Consensus pattern: V-x-H-x(33,40)-C-x(3)-C-x(3)-H-x(2)-M
 [The 2 C's and the 2 H's are copper ligands]
-Sequences known to belong to this class detected by the pattern: ALL, except for Paramecium primaurelia as well as in some plants where the pattern ends with Thr; an RNA editing event at this position could change this Thr to Met.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the first profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the second profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: Cytochrome cbb(3) subunit 2 does not belong to this family.
-Last update: June 2004 / Text revised; profiles added.

[ 1] Capaldi R.A., Malatesta F., Darley-Usmar V.M. "Structure of cytochrome c oxidase." Biochim. Biophys. Acta 726:135-148(1983). PubMed=6307356
[ 2] Garcia-Horsman J.A., Barquera B., Rumbley J., Ma J., Gennis R.B. J. Bacteriol. 176:5587-5600(1994).
[ 3] van der Oost J., Lappalainen P., Musacchio A., Warne A., Lemieux L., Rumbley J., Gennis R.B., Aasa R., Pascher T., Malmstrom B.G., Saraste M. EMBO J. 11:3209-3217(1992).
[ 4] Zumft W.G., Dreusch A., Lochelt S., Cuypers H., Friedrich B., Schneider B. "Derived amino acid sequences of the nosZ gene (respiratory N2O reductase) from Alcaligenes eutrophus, Pseudomonas aeruginosa and Pseudomonas stutzeri reveal potential copper-binding residues. Implications for the CuA site of N2O reductase and cytochrome-c oxidase." Eur. J. Biochem. 208:31-40(1992). PubMed=1324835
[ 5] Saraste M. Unpublished observations (1994).
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement.
For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00076}
{PS00079; MULTICOPPER_OXIDASE1}
{PS00080; MULTICOPPER_OXIDASE2}
{BEGIN}
***********************************
* Multicopper oxidases signatures *
***********************************

Multicopper oxidases [1,2] are enzymes that possess three spectroscopically different copper centers. These centers are called: type 1 (or blue), type 2 (or normal) and type 3 (or coupled binuclear). The enzymes that belong to this family are:

- Laccase (EC 1.10.3.2) (urushiol oxidase), an enzyme found in fungi and plants, which oxidizes many different types of phenols and diamines.
- L-ascorbate oxidase (EC 1.10.3.3), a higher plant enzyme.
- Ceruloplasmin (EC 1.16.3.1) (ferroxidase), a protein found in the serum of mammals and birds, which oxidizes a great variety of inorganic and organic substances. Structurally ceruloplasmin exhibits internal sequence homology, and seems to have evolved from the triplication of a copper-binding domain similar to that found in laccase and ascorbate oxidase.

In addition to the above enzymes there are a number of proteins which, on the basis of sequence similarities, can be said to belong to this family. These proteins are:

- Copper resistance protein A (copA) from a plasmid in Pseudomonas syringae. This protein seems to be involved in the resistance of the microbial host to copper.
- Blood coagulation factor V (Fa V).
- Blood coagulation factor VIII (Fa VIII) [E1].
- Yeast FET3 [3], which is required for ferrous iron uptake.
- Yeast hypothetical protein YFL041w and SpAC1F7.08, the fission yeast homolog.

Factors V and VIII act as cofactors in blood coagulation and are structurally similar [4]. Their sequence consists of a triplicated A domain, a B domain and a duplicated C domain, in the following order: A-A-B-A-C-C.
The A-type domain is related to the multicopper oxidases.

We have developed two signature patterns for these proteins. Both patterns are derived from the same region, which in ascorbate oxidase, laccase, in the third domain of ceruloplasmin, and in copA, contains five residues that are known to be involved in the binding of copper centers. The first pattern does not make any assumption on the presence of copper-binding residues and thus can detect domains that have lost the ability to bind copper (such as those in Fa V and Fa VIII), while the second pattern is specific to copper-binding domains.

-Consensus pattern: G-x-[FYW]-x-[LIVMFYW]-x-[CST]-x-{PR}-{K}-x(2)-{S}-x-{LFH}-G-[LM]-x(3)-[LIVMFYW]
-Sequences known to belong to this class detected by the pattern: ALL, except for Emericella nidulans laccase.
-Other sequence(s) detected in Swiss-Prot: 33 other proteins and Thiobacillus ferrooxidans rusticyanin, which is also a copper-binding protein but which belongs to the type-1 copper proteins family (see <PDOC00174>).

-Consensus pattern: H-C-H-x(3)-H-x(3)-[AG]-[LM]
 [The first 2 H's are copper type 3 binding residues]
 [The C, the third H, and L or M are copper type 1 ligands]
-Sequences known to belong to this class detected by the pattern: only domains that bind copper.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Last update: April 2006 / Pattern revised.

[ 1] Messerschmidt A., Huber R. "The blue oxidases, ascorbate oxidase, laccase and ceruloplasmin. Modelling and structural relationships." Eur. J. Biochem. 187:341-352(1990). PubMed=2404764
[ 2] Ouzounis C., Sander C. "A structure-derived sequence pattern for the detection of type I copper binding domains in distantly related proteins." FEBS Lett. 279:73-78(1991). PubMed=1995346
[ 3] Askwith C., Eide D., Van Ho A., Bernard P.S., Li L., Davis-Kaplan S., Sipe D.M., Kaplan J. "The FET3 gene of S. cerevisiae encodes a multicopper oxidase required for ferrous iron uptake." Cell 76:403-410(1994).
PubMed=8293473 [ 4] Mann K.G., Jenny R.J., Krishnaswamy S. "Cofactor proteins in the assembly and expression of blood clotting enzyme complexes." Annu. Rev. Biochem. 57:915-956(1988). PubMed=3052293; DOI=10.1146/annurev.bi.57.070188.004411 [E1] http://europium.csc.mrc.ac.uk/WebPages/Main/main.htm +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00077} {PS00711; LIPOXYGENASE_1} {PS00081; LIPOXYGENASE_2} {PS51393; LIPOXYGENASE_3} {BEGIN} ********************************************************************* * Lipoxygenase iron-binding catalytic domain signatures and profile * ********************************************************************* Lipoxygenases (EC 1.13.11.-) are a class of iron-containing dioxygenases that catalyze the hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. The primary products are hydroperoxy fatty acids, which usually are rapidly reduced to hydroxy derivatives. Lipoxygenases are common in plants where they may be involved in a number of diverse aspects of plant physiology including growth and development, pest resistance, and senescence or responses to wounding [1]. In mammals a number of lipoxygenase isozymes are involved in the metabolism of prostaglandins and leukotrienes [2]. Lipoxygenases are also common in primitive animals such as coral [3] and occur in some bacteria [4,5]. 
The N-terminal part of the eukaryotic lipoxygenases contains a PLAT domain (see <PDOC50095>) that may be involved in membrane-binding or substrate acquisition, while the iron-binding catalytic domain forms the C-terminal part. The 3D structure of the catalytic domain is mainly alpha-helical with an iron in the active site (see <PDB:1F8N>). The center of the domain consists of two long helices, which contain four of the iron-binding residues (at least three of which are histidines). A fifth residue that coordinates the non-heme catalytic iron is the carboxylate of the C-terminal isoleucine. The mammalian catalytic domain has a length of ~550-600 residues, which is shorter than in the plant lipoxygenases and forms a more compact structure as the additional 100-150 amino acids in plant enzymes form extra loops [3,6,7]. Some proteins known to contain a lipoxygenase iron-binding catalytic domain: - Plant lipoxygenases (EC 1.13.11.12). Plants express a variety of cytosolic isozymes as well as what seems [8] to be a chloroplast isozyme. - Mammalian arachidonate 5-lipoxygenase (EC 1.13.11.34). - Mammalian arachidonate 12-lipoxygenase (EC 1.13.11.31). - Mammalian erythroid cell-specific 15-lipoxygenase (EC 1.13.11.33). - Coral (Plexaura homomalla) allene oxide synthase-lipoxygenase protein, a bifunctional enzyme including both a peroxidase and arachidonate 8-lipoxygenase (EC 1.13.11.40). - Pseudomonas aeruginosa oleic acid lipoxygenase and arachidonate 15-lipoxygenase (EC 1.13.11.33). Six histidines are strongly conserved in lipoxygenase sequences; five of them are found clustered in a stretch of 40 amino acids. This region contains two of the three iron-ligands; the other histidines have been shown [9] to be important for the activity of lipoxygenases. As signatures for this family of enzymes we have selected two patterns in the region of the histidine cluster. 
The first pattern contains the first three conserved histidines and the second pattern includes the fourth and the fifth. We also developed a profile that covers the entire lipoxygenase iron-binding catalytic domain. -Consensus pattern: [HQ]-[EQ]-x(3)-H-x-[LMA]-[NEQHRCS]-[GSTA]-H-[LIVMSTAC](2)-x-E [The second and third H's bind iron] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: [LIVMACST]-H-P-[LIVM]-x-[KRQV]-[LIVMF](2)-x-[AP]-H -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: July 2008 / Text revised; profile added. [ 1] Vick B.A., Zimmerman D.C. (In) Biochemistry of plants: A comprehensive treatise, Stumpf P.K., Ed., Vol. 9, pp.53-90, Academic Press, New-York, (1987). [ 2] Needleman P., Turk J., Jakschik B.A., Morrison A.R., Lefkowith J.B. "Arachidonic acid metabolism." Annu. Rev. Biochem. 55:69-102(1986). PubMed=3017195; DOI=10.1146/annurev.bi.55.070186.000441 [ 3] Oldham M.L., Brash A.R., Newcomer M.E. "Insights from the X-ray crystal structure of coral 8R-lipoxygenase: calcium activation via a C2-like domain and a structural basis of product chirality." J. Biol. Chem. 280:39545-39552(2005). PubMed=16162493; DOI=10.1074/jbc.M506675200 [ 4] Busquets M., Deroncele V., Vidal-Mas J., Rodriguez E., Guerrero A., Manresa A. "Isolation and characterization of a lipoxygenase from Pseudomonas 42A2 responsible for the biotransformation of oleic acid into ( S )( E )-10-hydroxy-8-octadecenoic acid." Antonie Van Leeuwenhoek 85:129-139(2004). PubMed=15028873; DOI=10.1023/B:ANTO.0000020152.15440.65 [ 5] Zheng Y., Boeglin W.E., Schneider C., Brash A.R. "A 49-kDa mini-lipoxygenase from Anabaena sp. PCC 7120 retains catalytically complete functionality." J. Biol. Chem. 
283:5138-5147(2008). PubMed=18070874; DOI=10.1074/jbc.M705780200 [ 6] Boyington J.C., Gaffney B.J., Amzel L.M. "The three-dimensional structure of an arachidonic acid 15-lipoxygenase." Science 260:1482-1486(1993). PubMed=8502991 [ 7] Gillmor S.A., Villasenor A., Fletterick R., Sigal E., Browner M.F. "The structure of mammalian 15-lipoxygenase reveals similarity to the lipases and the determinants of substrate specificity." Nat. Struct. Biol. 4:1003-1009(1997). PubMed=9406550 [ 8] Peng Y.L., Shirano Y., Ohta H., Hibino T., Tanaka K., Shibata D. "A novel lipoxygenase from rice. Primary structure and specific expression upon incompatible infection with rice blast fungus." J. Biol. Chem. 269:3755-3761(1994). PubMed=7508918 [ 9] Steczko J., Donoho G.P., Clemens J.C., Dixon J.E., Axelrod B. "Conserved histidine residues in soybean lipoxygenase: functional consequences of their replacement." Biochemistry 31:4053-4057(1992). PubMed=1567851 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00078} {PS00082; EXTRADIOL_DIOXYGENAS} {BEGIN} ************************************************** * Extradiol ring-cleavage dioxygenases signature * ************************************************** Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates. Cleavage of aromatic rings is one of the most important functions of dioxygenases. 
The substrates of ring-cleavage dioxygenases can be classified into two groups according to the mode of scission of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and another adjacent nonhydroxylated carbon. Extradiol dioxygenases are usually homomultimeric, bind one atom of ferrous ion per subunit and have a subunit size of about 33 Kd. It has been shown [1,2] that the known extradiol dioxygenases are evolutionarily related. The enzymes that belong to this family are: - Catechol 2,3-dioxygenase (EC 1.13.11.2) (metapyrocatechase) (genes nahH, xylE, dmpB, mcpII, and pheB). - 3-methylcatechol 2,3-dioxygenase (EC 1.13.11.-) (gene todE). - Biphenyl-2,3-diol 1,2-dioxygenase (EC 1.13.11.39) (DHBD) (gene bphC). It should be noted that in Rhodococcus globerulus, three different isozymes of DHBD have been found (genes bphC1 to bphC3). bphC1 is a classical extradiol dioxygenase, but bphC2 and bphC3 are smaller proteins (189 residues). - 1,2-dihydroxynaphthalene dioxygenase (EC 1.13.11.-) (gene nahC). - 2,2',3-trihydroxybiphenyl dioxygenase (EC 1.13.11.-) (gene dbfB). As a signature pattern for these enzymes we selected a region that includes four conserved residues. Among them is a glutamate which has been shown [3], in bphC, to be implicated in the binding of the ferrous iron atom. -Consensus pattern: [GNTIV]-x-H-x(5,7)-[LIVMF]-Y-x(2)-[DENTA]-P-x-[GP]-x(2,3)-E [E is an iron ligand] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 1. -Expert(s) to contact by email: Harayama S.; [email protected] -Last update: November 1995 / Pattern and text revised. [ 1] Harayama S., Rekik M. "Bacterial aromatic ring-cleavage enzymes are classified into two different gene families." J. Biol. Chem. 264:15328-15333(1989). PubMed=2670937 [ 2] Asturias J.A., Eltis L.D., Prucha M., Timmis K.N. 
"Analysis of three 2,3-dihydroxybiphenyl 1,2-dioxygenases found in Rhodococcus globerulus P6. Identification of a new family of extradiol dioxygenases." J. Biol. Chem. 269:7807-7815(1994). PubMed=8126007 [ 3] Han S., Eltis L.D., Timmis K.N., Muchmore S.W., Bolin J.T. "Crystal structure of the biphenyl-cleaving extradiol dioxygenase from a PCB-degrading pseudomonad." Science 270:976-980(1995). PubMed=7481800 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00079} {PS00083; INTRADIOL_DIOXYGENAS} {BEGIN} ************************************************** * Intradiol ring-cleavage dioxygenases signature * ************************************************** Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates. Cleavage of aromatic rings is one of the most important functions of dioxygenases. The substrates of ring-cleavage dioxygenases can be classified into two groups according to the mode of scission of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and another adjacent nonhydroxylated carbon [1]. Intradiol dioxygenases require a nonheme ferric ion as a cofactor. The enzymes that belong to this family are: - Protocatechuate 3,4-dioxygenase (EC 1.13.11.3) (3,4-PCD), an oligomeric enzyme complex which consists of 12 copies each of an alpha and a beta subunit. Both subunits are evolutionarily related. 
- Catechol 1,2-dioxygenase (EC 1.13.11.1) (gene catA or clcA). - Chlorocatechol 1,2-dioxygenase (EC 1.13.11.1) (gene tfdC). As a signature pattern for these enzymes we selected a region that includes a tyrosine residue which, in 3,4-PCD, has been shown [2] to be implicated in the binding of the ferric iron atom. -Consensus pattern: [LIVMF]-x-G-x-[LIVM]-x(4)-[GS]-x(2)-[LIVMA]-x(4)-[LIVM]-[DE]-[LIVMFYC]-x(6)-G-x-[FY] [Y is an iron ligand] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Harayama S.; [email protected] -Last update: December 2004 / Pattern and text revised. [ 1] Harayama S., Rekik M. "Bacterial aromatic ring-cleavage enzymes are classified into two different gene families." J. Biol. Chem. 264:15328-15333(1989). PubMed=2670937 [ 2] Ohlendorf D.H., Lipscomb J.D., Weber P.C. "Structure and assembly of protocatechuate 3,4-dioxygenase." Nature 336:403-405(1988). PubMed=3194022; DOI=10.1038/336403a0 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00080} {PS00084; CU2_MONOOXYGENASE_1} {PS00085; CU2_MONOOXYGENASE_2} {BEGIN} ***************************************************************** * Copper type II, ascorbate-dependent monooxygenases signatures * ***************************************************************** Copper type II, ascorbate-dependent monooxygenases [1] are a class of enzymes that require copper as a cofactor and use ascorbate as an electron donor. The enzymes which belong to this category are: - Dopamine-beta-monooxygenase (EC 1.14.17.1) (DBH) [2], which catalyzes the conversion of dopamine to the neurotransmitter norepinephrine. - Peptidyl-glycine alpha-amidating monooxygenase (EC 1.14.17.3) (PAM), which catalyzes the conversion of the carboxy-terminal glycine of many active peptides and hormones to an amide group. There are a few regions of sequence similarity between these two enzymes; two of these regions contain clusters of conserved histidine residues which are most probably involved in binding copper. We selected these two regions as signature patterns. -Consensus pattern: H-H-M-x(2)-F-x-C [The 2 H's are copper ligands] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: H-x-F-x(4)-H-T-H-x(2)-G [The 3 H's are copper ligands] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: May 2004 / Text revised. [ 1] Southan C., Kruse L.I. "Sequence similarity between dopamine beta-hydroxylase and peptide alpha-amidating enzyme: evidence for a conserved catalytic domain." FEBS Lett. 255:116-120(1989). PubMed=2792366 [ 2] Stewart L.C., Klinman J.P. "Dopamine beta-hydroxylase of adrenal chromaffin granules: structure and function." Annu. Rev. Biochem. 57:551-592(1988). 
PubMed=3052283; DOI=10.1146/annurev.bi.57.070188.003003 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00081} {PS00086; CYTOCHROME_P450} {BEGIN} ******************************************************* * Cytochrome P450 cysteine heme-iron ligand signature * ******************************************************* Cytochrome P450's [1,2,3,E1] are a group of enzymes involved in the oxidative metabolism of a large number of natural compounds (such as steroids, fatty acids, prostaglandins, leukotrienes, etc) as well as drugs, carcinogens and mutagens. Based on sequence similarities, P450's have been classified into about forty different families [4,5]. P450's are proteins of 400 to 530 amino acids; the only exception is Bacillus BM-3 (CYP102) which is a protein of 1048 residues that contains an N-terminal P450 domain followed by a reductase domain. P450's are heme proteins. A conserved cysteine residue in the C-terminal part of P450's is involved in binding the heme iron in the fifth coordination site. From a region around this residue, we developed a ten residue signature specific to P450's. -Consensus pattern: [FW]-[SGNH]-x-[GD]-{F}-[RKHPT]-{P}-C-[LIVMFAP]-[GAD] [C is the heme iron ligand] -Sequences known to belong to this class detected by the pattern: ALL, except for P450 IIB10 from mouse, which has Lys in the first position of the pattern. -Other sequence(s) detected in Swiss-Prot: 9. 
-Note: The term 'cytochrome' P450, while commonly used, is incorrect as P450 are not electron-transfer proteins; the appropriate name is P450 'heme-thiolate proteins'. -Expert(s) to contact by email: Degtyarenko K.N.; [email protected] -Last update: December 2004 / Pattern and text revised. [ 1] Nebert D.W., Gonzalez F.J. "P450 genes: structure, evolution, and regulation." Annu. Rev. Biochem. 56:945-993(1987). PubMed=3304150; DOI=10.1146/annurev.bi.56.070187.004501 [ 2] Coon M.J., Ding X.X., Pernecky S.J., Vaz A.D. "Cytochrome P450: progress and predictions." FASEB J. 6:669-673(1992). PubMed=1537454 [ 3] Guengerich F.P. "Reactions and significance of cytochrome P-450 enzymes." J. Biol. Chem. 266:10019-10022(1991). PubMed=2037557 [ 4] Nelson D.R., Kamataki T., Waxman D.J., Guengerich F.P., Estabrook R.W., Feyereisen R., Gonzalez F.J., Coon M.J., Gunsalus I.C., Gotoh O. "The P450 superfamily: update on new sequences, gene mapping, accession numbers, early trivial names of enzymes, and nomenclature." DNA Cell Biol. 12:1-51(1993). PubMed=7678494 [ 5] Degtyarenko K.N., Archakov A.I. "Molecular evolution of P450 superfamily and P450-containing monooxygenase systems." FEBS Lett. 332:1-8(1993). PubMed=8405421 [E1] http://www.icgeb.trieste.it/p450/ +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00082} {PS00087; SOD_CU_ZN_1} {PS00332; SOD_CU_ZN_2} {BEGIN} *********************************************** * Copper/Zinc superoxide dismutase signatures * *********************************************** Copper/Zinc superoxide dismutase (EC 1.15.1.1) (SODC) [1] is one of the three forms of an enzyme that catalyzes the dismutation of superoxide radicals. SODC binds one atom each of zinc and copper. Various forms of SODC are known: a cytoplasmic form in eukaryotes, an additional chloroplast form in plants, an extracellular form in some eukaryotes, and a periplasmic form in prokaryotes. The metal binding sites are conserved in all the known SODC sequences [2]. We derived two signature patterns for this family of enzymes: the first one contains two histidine residues that bind the copper atom; the second one is located in the C-terminal section of SODC and contains a cysteine which is involved in a disulfide bond. -Consensus pattern: [GA]-[IMFAT]-H-[LIVF]-H-{S}-x-[GP]-[SDG]-x-[STAGDE] [The 2 H's are copper ligands] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 5. -Consensus pattern: G-[GNHD]-[SGA]-[GR]-x-R-x-[SGAWRV]-C-x(2)-[IV] [C is involved in a disulfide bond] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: These patterns will not detect proteins related to SODC, but which have lost their catalytic activity, such as Vaccinia virus protein A45. -Last update: April 2006 / Patterns revised. [ 1] Bannister J.V., Bannister W.H., Rotilio G. "Aspects of the structure, function, and applications of superoxide dismutase." CRC Crit. Rev. Biochem. 22:111-180(1987). PubMed=3315461 [ 2] Smith M.W., Doolittle R.F. "A comparison of evolutionary rates of the two major kinds of superoxide dismutase." J. Mol. Evol. 34:175-184(1992). 
PubMed=1556751 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00083} {PS00088; SOD_MN} {BEGIN} ****************************************************** * Manganese and iron superoxide dismutases signature * ****************************************************** Manganese superoxide dismutase (EC 1.15.1.1) (SODM) [1] is one of the three forms of an enzyme that catalyzes the dismutation of superoxide radicals. The four ligands of the manganese atom are conserved in all the known SODM sequences. These metal ligands are also conserved in the related iron form of superoxide dismutases [2,3]. We selected, as a signature, a short conserved region which includes two of the four ligands: an aspartate and a histidine. -Consensus pattern: D-x-[WF]-E-H-[STA]-[FY](2) [D and H are manganese/iron ligands] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2004 / Pattern and text revised. [ 1] Bannister J.V., Bannister W.H., Rotilio G. "Aspects of the structure, function, and applications of superoxide dismutase." CRC Crit. Rev. Biochem. 22:111-180(1987). PubMed=3315461 [ 2] Parker M.W., Blake C.C.F. "Iron- and manganese-containing superoxide dismutases can be distinguished by analysis of their primary structures." FEBS Lett. 229:377-382(1988). PubMed=3345848 [ 3] Smith M.W., Doolittle R.F. "A comparison of evolutionary rates of the two major kinds of superoxide dismutase." J. Mol. Evol. 
34:175-184(1992). PubMed=1556751 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00084} {PS00089; RIBORED_LARGE} {BEGIN} **************************************************** * Ribonucleotide reductase large subunit signature * **************************************************** Ribonucleotide reductase (EC 1.17.4.1) [1,2] catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. Ribonucleotide reductase is an oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues). There are regions of similarities in the sequence of the large chain from prokaryotes, eukaryotes and viruses. We have selected one of these regions as a signature pattern. -Consensus pattern: W-x(2)-[LIVF]-x(6,7)-G-[LIVM]-[FYRA]-[NH]-x(3)-[STAQLIVM]-[ASC]-x(2)-[PA] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 7. -Last update: December 2001 / Pattern and text revised. [ 1] Nillson O., Lundqvist T., Hahne S., Sjoberg B.-M. Biochem. Soc. Trans. 16:91-94(1988). [ 2] Reichard P. "From RNA to DNA, why so many ribonucleotide reductases?" Science 260:1773-1777(1993). PubMed=8511586 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). 
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00085} {PS00699; NITROGENASE_1_1} {PS00090; NITROGENASE_1_2} {BEGIN} *************************************************************** * Nitrogenases component 1 alpha and beta subunits signatures * *************************************************************** Nitrogenase (EC 1.18.6.1) [1] is the enzyme system responsible for biological nitrogen fixation. Nitrogenase is an oligomeric complex which consists of two components: component 2 is a homodimer of an iron-sulfur protein, while component 1, which contains the active site for the reduction of nitrogen to ammonia, exists in three different forms: - A molybdenum-iron containing protein (MoFe). The MoFe protein is a heterotetramer consisting of two pairs of alpha (nifD) and beta (nifK) subunits. - A vanadium-iron containing protein (VFe). The VFe protein is a hexamer of two pairs each of alpha (vnfD), beta (vnfK), and delta (vnfG) subunits. - The third form of component 1 seems to only contain iron. Like the vanadium form it is a hexamer composed of alpha (anfD), beta (anfK), and delta (anfG) subunits. The alpha and beta chains of the three types of component 1 are evolutionarily related and they are also related to proteins nifE and nifN, which are most probably involved in the iron-molybdenum cofactor biosynthesis [2]. We selected as signature patterns for this family of proteins two stretches of residues which are located in the N-terminal section and which each contain a conserved cysteine thought to be one of the ligands for the metal-sulfur clusters. 
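As an illustration only (it is not part of the PROSITE release), the pattern syntax used in the consensus lines of this entry can be translated into an ordinary regular expression. The sketch below assumes the usual reading of the notation: `x` is any residue, `[..]` lists allowed residues, `{..}` lists forbidden residues, and `(n)` or `(n,m)` is a repeat count. The function names `prosite_to_regex` and `find_matches` and the two toy peptides are hypothetical, and the N- and C-terminal anchors (`<`, `>`) are deliberately not handled.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Separate the residue specification from an optional repeat
        # count such as (2) or (5,7).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"empty pattern element in {pattern!r}")
        spec, low, high = m.groups()
        if spec == "x":
            token = "."                        # any residue
        elif spec.startswith("{") and spec.endswith("}"):
            token = "[^" + spec[1:-1] + "]"    # any residue except those listed
        else:
            token = spec                       # single residue or [..] class
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)

def find_matches(sequence: str, pattern: str):
    """Return (1-based start, matched segment) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]

# First consensus pattern of this entry (the final C may be an iron-sulfur ligand).
NIF_PATTERN = "[LIVMFYH]-[LIVMFST]-H-[AG]-[AGSP]-[LIVMNQA]-[AG]-C"

# Toy peptides invented for this example, not real nitrogenase sequences.
print(find_matches("MKLIHAGAACST", NIF_PATTERN))  # conserved His present
print(find_matches("MKLIQAGAACST", NIF_PATTERN))  # His replaced by Gln
```

The second toy peptide mirrors the exception noted for this entry: a sequence carrying Gln in place of the conserved His is not detected by the first pattern.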
-Consensus pattern: [LIVMFYH]-[LIVMFST]-H-[AG]-[AGSP]-[LIVMNQA]-[AG]-C [C may be an iron-sulfur ligand] -Sequences known to belong to this class detected by the pattern: ALL, except for Anabaena PCC 7120 and Methanococcus thermolithotrophicus nifK which have Gln instead of the conserved His. -Other sequence(s) detected in Swiss-Prot: 1. -Consensus pattern: [STANQ]-[ET]-C-x(5)-G-D-[DN]-[LIVMT]-x-[STAGR]-[LIVMFYST] [C may be an iron-sulfur ligand] -Sequences known to belong to this class detected by the pattern: ALL, except for nifN proteins. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: May 2004 / Text revised. [ 1] Pau R.N. "Nitrogenases without molybdenum." Trends Biochem. Sci. 14:183-186(1989). PubMed=2672439 [ 2] Aguilar O.M., Taormino J., Thony B., Ramseier T., Hennecke H., Szalay A.A. "The nifEN genes participating in FeMo cofactor biosynthesis and genes encoding dinitrogenase are part of the same operon in Bradyrhizobium species." Mol. Gen. Genet. 224:413-420(1990). PubMed=2266945 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00086} {PS00091; THYMIDYLATE_SYNTHASE} {BEGIN} ************************************ * Thymidylate synthase active site * ************************************ Thymidylate synthase (EC 2.1.1.45) [1,2] catalyzes the reductive methylation of dUMP to dTMP with concomitant conversion of 5,10-methylenetetrahydrofolate to dihydrofolate. 
Thymidylate synthase plays an essential role in DNA synthesis and is an important target for certain chemotherapeutic drugs. Thymidylate synthase is an enzyme of about 30 to 35 Kd in most species except in protozoa and plants where it exists as a bifunctional enzyme that includes a dihydrofolate reductase domain. A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6-dihydro-dUMP intermediate). The sequence around the active site of this enzyme is conserved from phages to vertebrates. -Consensus pattern: R-x(2)-[LIVMT]-x(2,3)-[FWY]-[QNYDI]-x(8,13)-[LVESI]-x-P-C-[HAVMLC]-x(3)-[QMTLHD]-[FYWL]-x(0,1)-[LV] [C is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2004 / Pattern and text revised. [ 1] Benkovic S.J. "On the mechanism of action of folate- and biopterin-requiring enzymes." Annu. Rev. Biochem. 49:227-251(1980). PubMed=6996564; DOI=10.1146/annurev.bi.49.070180.001303 [ 2] Ross P., O'Gara F., Condon S. "Cloning and characterization of the thymidylate synthase gene from Lactococcus lactis subsp. lactis." Appl. Environ. Microbiol. 56:2156-2163(1990). PubMed=2117882 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00087} {PS00092; N6_MTASE} {BEGIN} ************************************************* * N-6 Adenine-specific DNA methylases signature * ************************************************* N-6 adenine-specific DNA methylases (EC 2.1.1.72) (A-Mtase) are enzymes that specifically methylate the amino group at the C-6 position of adenines in DNA. Such enzymes are found in the three existing types of bacterial restriction-modification systems (in type I system the A-Mtase is the product of the hsdM gene, and in type III it is the product of the mod gene). All of these enzymes recognize a specific sequence in DNA and methylate an adenine in that sequence. It has been shown [1,2,3,4] that A-Mtases contain a conserved motif Asp/Asn-Pro-Pro-Tyr/Phe in their N-terminal section; this conserved region could be involved in substrate binding or in the catalytic activity. We have derived a pattern from that motif. -Consensus pattern: [LIVMAC]-[LIVFYWA]-{DYP}-[DN]-P-P-[FYW] -Sequences known to belong to this class detected by the pattern: ALL, except for m.HhaII where the second Pro is replaced by Gln and in m.HindIII where that same Pro is replaced by Tyr. -Other sequence(s) detected in Swiss-Prot: 33 different proteins that are most probably not A-Mtases, and three hypothetical Escherichia coli proteins that could be A-Mtases. -Note: N-4 cytosine-specific DNA methylases, which are probably enzymatically related to A-Mtases, also include a conserved Pro-Pro dipeptide but the residues around them are sufficiently different to allow the derivation of a pattern specific to these enzymes. -Expert(s) to contact by email: Roberts R.J.; [email protected] Bickle T.; [email protected] -Last update: December 2004 / Pattern and text revised. [ 1] Loenen W.A.M., Daniel A.S., Braymer H.D., Murray N.E. "Organization and sequence of the hsd genes of Escherichia coli K-12." J. Mol. Biol. 198:159-170(1987). 
     PubMed=3323532
[ 2] Narva K.E., Van Etten J.L., Slatko B.E., Benner J.S.
     "The amino acid sequence of the eukaryotic DNA [N6-adenine]methyltransferase, M.CviBIII, has regions of similarity with the prokaryotic isoschizomer M.TaqI and other DNA [N6-adenine] methyltransferases."
     Gene 74:253-259(1988).
     PubMed=3248728
[ 3] Lauster R.
     "Evolution of type II DNA methyltransferases. A gene duplication model."
     J. Mol. Biol. 206:313-321(1989).
     PubMed=2541254
[ 4] Timinskas A., Butkus V., Janulaitis A.
     "Sequence motifs characteristic for DNA [cytosine-N4] and DNA [adenine-N6] methyltransferases. Classification of all DNA methyltransferases."
     Gene 157:3-11(1995).
     PubMed=7607512
{END}
{PDOC00088}
{PS00093; N4_MTASE}
{BEGIN}
**************************************************
* N-4 cytosine-specific DNA methylases signature *
**************************************************

N-4 cytosine-specific DNA methylases (EC 2.1.1.113) [1,2,3] are enzymes that specifically methylate the amino group at the C-4 position of cytosines in DNA. Such enzymes are found as components of type II restriction-modification systems in prokaryotes. They recognize a specific sequence in DNA and methylate a cytosine in that sequence. By this action they protect DNA from cleavage by type II restriction enzymes that recognize the same sequence.

Type II N-4 Mtases seem to be structurally and enzymatically related to N-6 adenine-specific DNA methylases.
Like the N-6 Mtases, they contain a conserved Pro-Pro-Tyr/Phe region, but the N- and C-terminal contexts of this region are sufficiently different to derive a consensus pattern specific to this type of enzyme.

-Consensus pattern: [LIVMF]-T-S-P-P-[FY]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 5.

-Expert(s) to contact by email:
           Roberts R.J.; [email protected]
           Bickle T.; [email protected]

-Last update: November 1997 / Text revised.

[ 1] Tao T., Walter J., Brennan K.J., Cotterman M.M., Blumenthal R.M.
     "Sequence, internal homology and high-level expression of the gene for a DNA-(cytosine N4)-methyltransferase, M.Pvu II."
     Nucleic Acids Res. 17:4161-4175(1989).
     PubMed=2662138
[ 2] Klimasauskas S., Timinskas A., Menkevicius S., Butkiene D., Butkus V., Janulaitis A.
     "Sequence motifs characteristic of DNA[cytosine-N4]methyltransferases: similarity to adenine and cytosine-C5 DNA-methylases."
     Nucleic Acids Res. 17:9823-9832(1989).
     PubMed=2690010
[ 3] Timinskas A., Butkus V., Janulaitis A.
     "Sequence motifs characteristic for DNA [cytosine-N4] and DNA [adenine-N6] methyltransferases. Classification of all DNA methyltransferases."
     Gene 157:3-11(1995).
     PubMed=7607512
{END}
{PDOC00089}
{PS00094; C5_MTASE_1}
{PS00095; C5_MTASE_2}
{BEGIN}
***************************************************
* C-5 cytosine-specific DNA methylases signatures *
***************************************************

C-5 cytosine-specific DNA methylases (EC 2.1.1.37) (C5 Mtase) are enzymes that specifically methylate the C-5 carbon of cytosines in DNA [1,2,3]. Such enzymes are found in the proteins described below.

 - As a component of type II restriction-modification systems in prokaryotes and some bacteriophages. Such enzymes recognize a specific DNA sequence where they methylate a cytosine. In doing so, they protect DNA from cleavage by type II restriction enzymes that recognize the same sequence. The sequences of a large number of type II C-5 Mtases are known.
 - In vertebrates, there are a number of C-5 Mtases that methylate CpG dinucleotides. The sequence of the mammalian enzyme is known.

C-5 Mtases share a number of short conserved regions. We selected two of them. The first is centered around a conserved Pro-Cys dipeptide in which the cysteine has been shown [4] to be involved in the catalytic mechanism; it appears to form a covalent intermediate with the C6 position of cytosine. The second region is located at the C-terminal extremity in type-II enzymes.

-Consensus pattern: [DENKS]-x-[FLIV]-x(2)-[GSTC]-x-P-C-x-{V}-[FYWLIM]-S
                    [C is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for M.MthtI.
-Other sequence(s) detected in Swiss-Prot: 3.

-Consensus pattern: [RKQGTF]-x(2)-G-N-[SA]-[LIVF]-x-[VIP]-x-[LVMT]-x(3)-[LIVM]-x(3)-[LIVM]
-Sequences known to belong to this class detected by the pattern: ALL, except for M.AluI, M.HgaI 1 and 2, and M.HpaII.
-Other sequence(s) detected in Swiss-Prot: 2.

-Note: In the first position of the second pattern, most known Mtases have Arg or Lys.
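
-Illustration (not part of the PROSITE entry): the consensus patterns above are written in the PROSITE pattern syntax, where elements are separated by '-', 'x' stands for any residue, [..] lists the accepted residues, {..} lists the excluded residues, and (n) or (n,m) gives a repeat count. The Python sketch below shows one possible way to translate that syntax into a regular expression and scan a sequence with it. The helper name prosite_to_regex and the test sequence are hypothetical, and the sketch assumes patterns without anchors ('<', '>') or other rarely used constructs.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a Python regular expression.

    Hypothetical helper: handles x, [..], {..} and (n) / (n,m) only.
    """
    parts = []
    for element in pattern.strip().split("-"):
        # Separate the residue specification from an optional repeat count.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                       # x = any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # {..} = any residue except these
        # [..] and single residues are already valid regex fragments.
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(core + repeat)
    return "".join(parts)


# First C-5 Mtase pattern from this entry.
C5_MTASE_1 = "[DENKS]-x-[FLIV]-x(2)-[GSTC]-x-P-C-x-{V}-[FYWLIM]-S"
C5_REGEX = prosite_to_regex(C5_MTASE_1)

# Made-up sequence for illustration only; it is not a real methylase.
toy_sequence = "MKDAFAAGAPCAAFSLL"
hit = re.search(C5_REGEX, toy_sequence)
if hit:
    # Report the 1-based start position and the matched segment.
    print(hit.start() + 1, hit.group())
```

A hit only shows that the motif is present in the sequence. As the "Other sequence(s) detected in Swiss-Prot" lines indicate, some patterns also match proteins that do not belong to the family.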

-Expert(s) to contact by email:
           Roberts R.J.; [email protected]
           Bickle T.; [email protected]
           Mugasimangalam R.; [email protected]

-Last update: December 2004 / Pattern and text revised.

[ 1] Posfai J., Bhagwat A.S., Roberts R.J.
     "Sequence motifs specific for cytosine methyltransferases."
     Gene 74:261-265(1988).
     PubMed=3248729
[ 2] Kumar S., Cheng X., Klimasauskas S., Mi S., Posfai J., Roberts R.J., Wilson G.G.
     "The DNA (cytosine-5) methyltransferases."
     Nucleic Acids Res. 22:1-10(1994).
     PubMed=8127644
[ 3] Lauster R., Trautner T.A., Noyer-Weidner M.
     "Cytosine-specific type II DNA methyltransferases. A conserved enzyme core with variable target-recognizing domains."
     J. Mol. Biol. 206:305-312(1989).
     PubMed=2716049
[ 4] Chen L., MacMillan A.M., Chang W., Ezaz-Nikpay K., Lane W.S., Verdine G.L.
     "Direct identification of the active-site nucleophile in a DNA (cytosine-5)-methyltransferase."
     Biochemistry 30:11018-11025(1991).
     PubMed=1932026
{END}
{PDOC00090}
{PS00096; SHMT}
{BEGIN}
***********************************************************************
* Serine hydroxymethyltransferase pyridoxal-phosphate attachment site *
***********************************************************************

Serine hydroxymethyltransferase (EC 2.1.2.1) (SHMT) [1] catalyzes the transfer of the hydroxymethyl group of serine to tetrahydrofolate to form 5,10-methylenetetrahydrofolate and glycine.
In vertebrates, it exists in a cytoplasmic and a mitochondrial form, whereas only one form is found in prokaryotes.

Serine hydroxymethyltransferase is a pyridoxal-phosphate containing enzyme. The pyridoxal-P group is attached to a lysine residue around which the sequence is highly conserved in all forms of the enzyme.

-Consensus pattern: [DEQHY]-[LIVMFYA]-x-[GSTMVA]-[GSTAV]-[ST]-[STVM]-[HQ]-K-[STG]-[LFMI]-x-[GAS]-[PGAC]-[RQ]-[GSARH]-[GA]
                    [K is the pyridoxal-P attachment site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Usha R., Savithri H.S., Rao N.A.
     "The primary structure of sheep liver cytosolic serine hydroxymethyltransferase and an analysis of the evolutionary relationships among serine hydroxymethyltransferases."
     Biochim. Biophys. Acta 1204:75-83(1994).
     PubMed=8305478
{END}
{PDOC00091}
{PS00097; CARBAMOYLTRANSFERASE}
{BEGIN}
***********************************************************
* Aspartate and ornithine carbamoyltransferases signature *
***********************************************************

Aspartate carbamoyltransferase (EC 2.1.3.2) (ATCase) catalyzes the conversion of aspartate and carbamoyl phosphate to carbamoylaspartate, the second step in the de novo biosynthesis of pyrimidine nucleotides [1].
In prokaryotes ATCase consists of two subunits: a catalytic chain (gene pyrB) and a regulatory chain (gene pyrI), while in eukaryotes it is a domain in a multifunctional enzyme (called URA2 in yeast, rudimentary in Drosophila, and CAD in mammals [2]) that also catalyzes other steps of the biosynthesis of pyrimidines.

Ornithine carbamoyltransferase (EC 2.1.3.3) (OTCase) catalyzes the conversion of ornithine and carbamoyl phosphate to citrulline. In mammals this enzyme participates in the urea cycle [3] and is located in the mitochondrial matrix. In prokaryotes and eukaryotic microorganisms it is involved in the biosynthesis of arginine. In some bacterial species it is also involved in the degradation of arginine [4] (the arginine deiminase pathway).

It has been shown [5] that these two enzymes are evolutionarily related. The predicted secondary structures of both enzymes are similar and there are some regions of sequence similarity. One of these regions includes three residues which have been shown, by crystallographic studies [6], to be implicated in binding the phosphoryl group of carbamoyl phosphate. We have selected this region as a signature for these enzymes.

-Consensus pattern: F-x-[EK]-x-S-[GT]-R-T
                    [S, R, and the 2nd T bind carbamoyl phosphate]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: The residue in position 3 of the pattern makes it possible to distinguish between an ATCase (Glu) and an OTCase (Lys).

-Last update: October 1993 / Text revised.

[ 1] Lerner C.G., Switzer R.L.
     "Cloning and structure of the Bacillus subtilis aspartate transcarbamylase gene (pyrB)."
     J. Biol. Chem. 261:11156-11165(1986).
     PubMed=3015959
[ 2] Davidson J.N., Chen K.C., Jamison R.S., Musmanno L.A., Kern C.B.
     "The evolutionary history of the first three enzymes in pyrimidine biosynthesis."
     BioEssays 15:157-164(1993).
     PubMed=8098212
[ 3] Takiguchi M., Matsubasa T., Amaya Y., Mori M.
     "Evolutionary aspects of urea cycle enzyme genes."
     BioEssays 10:163-166(1989).
     PubMed=2662961
[ 4] Baur H., Stalon V., Falmagne P., Luethi E., Haas D.
     "Primary and quaternary structure of the catabolic ornithine carbamoyltransferase from Pseudomonas aeruginosa. Extensive sequence homology with the anabolic ornithine carbamoyltransferases of Escherichia coli."
     Eur. J. Biochem. 166:111-117(1987).
     PubMed=3109911
[ 5] Houghton J.E., Bencini D.A., O'Donovan G.A., Wild J.R.
     "Protein differentiation: a comparison of aspartate transcarbamoylase and ornithine transcarbamoylase from Escherichia coli K-12."
     Proc. Natl. Acad. Sci. U.S.A. 81:4864-4868(1984).
     PubMed=6379651
[ 6] Ke H.-M., Honzatko R.B., Lipscomb W.N.
     "Structure of unligated aspartate carbamoyltransferase of Escherichia coli at 2.6-A resolution."
     Proc. Natl. Acad. Sci. U.S.A. 81:4037-4040(1984).
     PubMed=6377306
{END}
{PDOC00092}
{PS00098; THIOLASE_1}
{PS00737; THIOLASE_2}
{PS00099; THIOLASE_3}
{BEGIN}
************************
* Thiolases signatures *
************************

Two different types of thiolase [1,2,3] are found both in eukaryotes and in prokaryotes: acetoacetyl-CoA thiolase (EC 2.3.1.9) and 3-ketoacyl-CoA thiolase (EC 2.3.1.16). 3-ketoacyl-CoA thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and is involved in degradative pathways such as fatty acid beta-oxidation.
Acetoacetyl-CoA thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and is involved in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis.

In eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion and the other in peroxisomes.

There are two conserved cysteine residues important for thiolase activity. The first, located in the N-terminal section of the enzymes, is involved in the formation of an acyl-enzyme intermediate; the second, located at the C-terminal extremity, is the active site base involved in deprotonation in the condensation reaction.

Mammalian nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein which seems to exist in two different forms: a 14 kDa protein (SCP-2) and a larger 58 kDa protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to SCP-2, while the N-terminal portion is evolutionarily related to thiolases [4].

We developed three signature patterns for this family of proteins, two of which are based on the regions around the biologically important cysteines. The third is based on a highly conserved region in the C-terminal part of these proteins.

-Consensus pattern: [LIVM]-[NST]-{T}-x-C-[SAGLI]-[ST]-[SAG]-[LIVMFYNS]-x-[STAG]-[LIVM]-x(6)-[LIVM]
                    [C is involved in formation of acyl-enzyme intermediate]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 5.

-Consensus pattern: N-x(2)-G(2)-x-[LIVM]-[SA]-x-G-H-P-x-[GAS]-x-[ST]-G
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: [AG]-[LIVMA]-[STAGCLIVM]-[STAG]-[LIVMA]-C-{Q}-[AG]-x-[AG]-x-[AG]-x-[SAG]
                    [C is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for nsL-TP.
-Other sequence(s) detected in Swiss-Prot: 8.
-Last update: April 2006 / Patterns revised.

[ 1] Peoples O.P., Sinskey A.J.
     "Poly-beta-hydroxybutyrate biosynthesis in Alcaligenes eutrophus H16. Characterization of the genes encoding beta-ketothiolase and acetoacetyl-CoA reductase."
     J. Biol. Chem. 264:15293-15297(1989).
     PubMed=2670935
[ 2] Yang S.-Y., Yang X.-Y.H., Healy-Louie G., Schulz H., Elzinga M.
     "Nucleotide sequence of the fadA gene. Primary structure of 3-ketoacyl-coenzyme A thiolase from Escherichia coli and the structural organization of the fadAB operon."
     J. Biol. Chem. 265:10424-10429(1990).
     PubMed=2191949
[ 3] Igual J.C., Gonzalez-Bosch C., Dopazo J., Perez-Ortin J.E.
     "Phylogenetic analysis of the thiolase family. Implications for the evolutionary origin of peroxisomes."
     J. Mol. Evol. 35:147-155(1992).
     PubMed=1354266
[ 4] Baker M.E., Billheimer J.T., Strauss J.F. III
     "Similarity between the amino-terminal portion of mammalian 58-kD sterol carrier protein (SCPx) and Escherichia coli acetyl-CoA acyltransferase: evidence for a gene fusion in SCPx."
     DNA Cell Biol. 10:695-698(1991).
     PubMed=1755959
{END}
{PDOC00093}
{PS00100; CAT}
{BEGIN}
*************************************************
* Chloramphenicol acetyltransferase active site *
*************************************************

Chloramphenicol O-acetyltransferase (CAT) (EC 2.3.1.28) [1] catalyzes the acetyl-CoA dependent acetylation of chloramphenicol (Cm), an antibiotic which inhibits prokaryotic peptidyltransferase activity. Acetylation of Cm by CAT inactivates the antibiotic. A histidine residue, located in the C-terminal section of the enzyme, plays a central role in its catalytic mechanism. We derived a signature pattern from the region surrounding this active site residue.

-Consensus pattern: Q-[LIV]-H-H-[SA]-x(2)-D-G-[FY]-H
                    [The second H is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: There is a second family of CAT [2], evolutionarily unrelated to the main family described above. These CAT belong to the bacterial hexapeptide-repeat containing-transferases family (see <PDOC00094>).

-Last update: November 1997 / Text revised.

[ 1] Shaw W.V., Leslie A.G.W.
     "Chloramphenicol acetyltransferase."
     Annu. Rev. Biophys. Biophys. Chem. 20:363-386(1991).
     PubMed=1867721
[ 2] Parent R., Roy P.H.
     "The chloramphenicol acetyltransferase gene of Tn2424: a new breed of cat."
     J. Bacteriol. 174:2891-2897(1992).
     PubMed=1314803
{END}
{PDOC00094}
{PS00101; HEXAPEP_TRANSFERASES}
{BEGIN}
********************************************************
* Hexapeptide-repeat containing-transferases signature *
********************************************************

On the basis of sequence similarity, a number of transferases have been proposed [1,2,3,4] to belong to a single family. These proteins are:

 - Serine O-acetyltransferase (EC 2.3.1.30) (SAT) (gene cysE), an enzyme involved in cysteine biosynthesis.
 - Azotobacter chroococcum nitrogen fixation protein nifP. NifP is most probably a SAT involved in the optimization of nitrogenase activity.
 - Escherichia coli thiogalactoside acetyltransferase (EC 2.3.1.18) (gene lacA), an enzyme involved in the biosynthesis of lactose.
 - UDP-N-acetylglucosamine acyltransferase (EC 2.3.1.129) (gene lpxA), an enzyme involved in the biosynthesis of lipid A, a phosphorylated glycolipid that anchors the lipopolysaccharide to the outer membrane of the cell.
 - UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase (EC 2.3.1.-) (gene lpxD or firA), which is also involved in the biosynthesis of lipid A.
 - Chloramphenicol O-acetyltransferase (CAT) (EC 2.3.1.28) from Agrobacterium tumefaciens, Bacillus sphaericus, Escherichia coli plasmid IncFII NR79, Pseudomonas aeruginosa, Staphylococcus aureus plasmid pIP630. These CAT are not evolutionarily related to the main family of CAT (see <PDOC00093>).
 - Rhizobium nodulation protein nodL. NodL is an acetyltransferase involved in the O-acetylation of Nod factors.
 - Bacterial maltose O-acetyltransferase (EC 2.3.1.79).
 - Bacterial tetrahydrodipicolinate N-succinyltransferase (EC 2.3.1.117) (gene dapD), which catalyzes the fourth step in the biosynthesis of diaminopimelate and lysine from aspartate semialdehyde.
 - Bacterial N-acetylglucosamine-1-phosphate uridyltransferase (EC 2.7.7.23) (gene glmU or gcaD or tms), an enzyme involved in peptidoglycan and lipopolysaccharide biosynthesis.
 - Staphylococcus aureus protein capG, which is involved in biosynthesis of type 1 capsular polysaccharide.
 - Yeast hypothetical protein YJL218w, which is highly similar to Escherichia coli lacA.
 - Fission yeast hypothetical protein SpAC18B11.09c.
 - Methanococcus jannaschii hypothetical protein MJ1064.

These proteins have been shown [3,4] to contain a repeat structure composed of tandem repeats of a [LIV]-G-x(4) hexapeptide which, in the tertiary structure of lpxA [5], has been shown to form a left-handed parallel beta helix. Our signature pattern is based on a fourfold repeat of this hexapeptide.

-Consensus pattern: [LIV]-[GAED]-x(2)-[STAV]-x-[LIV]-x(3)-[LIVAC]-x-[LIV]-[GAED]-x(2)-[STAVR]-x-[LIV]-[GAED]-x(2)-[STAV]-x-[LIV]-x(3)-[LIV]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 2.

-Expert(s) to contact by email:
           Roy P.H.; [email protected]

-Last update: July 1998 / Text revised.

[ 1] Downie J.A.
     "The nodL gene from Rhizobium leguminosarum is homologous to the acetyl transferases encoded by lacA and cysE."
     Mol. Microbiol. 3:1649-1651(1989).
     PubMed=2615659
[ 2] Parent R., Roy P.H.
     "The chloramphenicol acetyltransferase gene of Tn2424: a new breed of cat."
     J. Bacteriol. 174:2891-2897(1992).
     PubMed=1314803
[ 3] Vaara M.
     "Eight bacterial proteins, including UDP-N-acetylglucosamine acyltransferase (LpxA) and three other transferases of Escherichia coli, consist of a six-residue periodicity theme."
     FEMS Microbiol. Lett. 76:249-254(1992).
     PubMed=1427014
[ 4] Vuorio R., Haerkonen T., Tolvanen M., Vaara M.
     "The novel hexapeptide motif found in the acyltransferases LpxA and LpxD of lipid A biosynthesis is conserved in various bacteria."
     FEBS Lett. 337:289-292(1994).
     PubMed=8293817
[ 5] Raetz C.R.H., Roderick S.L.
     "A left-handed parallel beta helix in the structure of UDP-N-acetylglucosamine acyltransferase."
     Science 270:997-1000(1995).
     PubMed=7481807
{END}
{PDOC00095}
{PS00102; PHOSPHORYLASE}
{BEGIN}
*****************************************************
* Phosphorylase pyridoxal-phosphate attachment site *
*****************************************************

Phosphorylases (EC 2.4.1.1) [1] are important allosteric enzymes in carbohydrate metabolism. They catalyze the formation of glucose 1-phosphate from polyglucose such as glycogen, starch or maltodextrin. Enzymes from different sources differ in their regulatory mechanisms and their natural substrates. However, all known phosphorylases share catalytic and structural properties. They are pyridoxal-phosphate dependent enzymes; the pyridoxal-P group is attached to a lysine residue around which the sequence is highly conserved and can be used as a signature pattern to detect this class of enzymes.

-Consensus pattern: E-A-[SC]-G-x-[GS]-x-M-K-x(2)-[LM]-N
                    [K is the pyridoxal-P attachment site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: November 1997 / Pattern and text revised.

[ 1] Fukui T., Shimomura S., Nakano K.
     "Potato and rabbit muscle phosphorylases: comparative studies on the structure, function and regulation of regulatory and nonregulatory enzymes."
     Mol. Cell. Biochem. 42:129-144(1982).
     PubMed=7062910
{END}
{PDOC00096}
{PS00103; PUR_PYR_PR_TRANSFER}
{BEGIN}
***********************************************************
* Purine/pyrimidine phosphoribosyl transferases signature *
***********************************************************

Phosphoribosyltransferases (PRT) are enzymes that catalyze the synthesis of beta-n-5'-monophosphates from phosphoribosylpyrophosphate (PRPP) and an enzyme-specific amine. A number of PRTs are involved in the biosynthesis of purine, pyrimidine, and pyridine nucleotides, or in the salvage of purines and pyrimidines. These enzymes are:

 - Adenine phosphoribosyltransferase (EC 2.4.2.7) (APRT), which is involved in purine salvage.
 - Hypoxanthine-guanine or hypoxanthine phosphoribosyltransferase (EC 2.4.2.8) (HGPRT or HPRT), which are involved in purine salvage.
 - Orotate phosphoribosyltransferase (EC 2.4.2.10) (OPRT), which is involved in pyrimidine biosynthesis.
 - Amido phosphoribosyltransferase (EC 2.4.2.14), which is involved in purine biosynthesis.
 - Xanthine-guanine phosphoribosyltransferase (EC 2.4.2.22) (XGPRT), which is involved in purine salvage.

In the sequence of all these enzymes there is a small conserved region which may be involved in the enzymatic activity and/or be part of the PRPP binding site [1].

-Consensus pattern: [LIVMFYWCTA]-[LIVM]-[LIVMA]-[LIVMFC]-[DE]-D-[LIVMS]-[LIVM]-[STAVD]-[STAR]-[GAC]-x-[STAR]
-Sequences known to belong to this class detected by the pattern: ALL, except for Bacillus subtilis xanthine phosphoribosyltransferase.
-Other sequence(s) detected in Swiss-Prot: bacterial phosphoribosyl pyrophosphate synthetases and 7 other proteins.

-Note: In position 11 of the pattern most of these enzymes have Gly.

-Last update: November 1997 / Pattern and text revised.

[ 1] Hershey H.V., Taylor M.W.
     "Nucleotide sequence and deduced amino acid sequence of Escherichia coli adenine phosphoribosyltransferase and comparison with other analogous enzymes."
     Gene 43:287-293(1986).
     PubMed=3527873
{END}
{PDOC00097}
{PS00104; EPSP_SYNTHASE_1}
{PS00885; EPSP_SYNTHASE_2}
{BEGIN}
****************************
* EPSP synthase signatures *
****************************

EPSP synthase (3-phosphoshikimate 1-carboxyvinyltransferase) (EC 2.5.1.19) catalyzes the sixth step in the biosynthesis from chorismate of the aromatic amino acids (the shikimate pathway) in bacteria (gene aroA), plants and fungi (where it is part of a multifunctional enzyme which catalyzes five consecutive steps in this pathway) [1].

EPSP synthase has been extensively studied as it is the target of the potent herbicide glyphosate, which inhibits the enzyme.

The sequence of EPSP synthase from various biological sources shows that the structure of the enzyme has been well conserved throughout evolution. We selected two conserved regions as signature patterns. The first pattern corresponds to a region that is part of the active site and which is also important for the resistance to glyphosate [2]. The second pattern is located in the C-terminal part of the protein and contains a conserved lysine which seems to be important for the activity of the enzyme.

-Consensus pattern: [LIVF]-{LV}-x-[GANQK]-[NLG]-[SA]-[GA]-[TAI]-[STAGV]-{N}-R-x-[LIVMFYAT]-x-[GSTAP]
-Sequences known to belong to this class detected by the pattern: ALL, except for Mycobacterium tuberculosis aroA.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: [KR]-x-[KH]-E-[CSTVI]-[DNE]-R-[LIVMY]-x-[GSTAVLD]-[LIVMCTF]-x(3)-[LIVMFA]-x(2)-[LIVMFCGANY]-G
-Sequences known to belong to this class detected by the pattern: ALL, except for Lactococcus lactis and Staphylococcus aureus aroA.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Stallings W.C., Abdel-Meguid S.S., Lim L.W., Shieh H.-S., Dayringer H.E., Leimgruber N.K., Stegeman R.A., Anderson K.S., Sikorski J.A., Padgette S.R., Kishore G.M.
     "Structure and topological symmetry of the glyphosate target 5-enolpyruvylshikimate-3-phosphate synthase: a distinctive protein fold."
     Proc. Natl. Acad. Sci. U.S.A. 88:5046-5050(1991).
     PubMed=11607190
[ 2] Padgette S.R., Re D.B., Gaser C.S., Eicholtz D.A., Frazier R.B., Hironaka C.M., Levine E.B., Shah D.M., Fraley R.T., Kishore G.M.
     "Site-directed mutagenesis of a conserved region of the 5-enolpyruvylshikimate-3-phosphate synthase active site."
     J. Biol. Chem. 266:22364-22369(1991).
     PubMed=1939260
{END}
{PDOC00098}
{PS00105; AA_TRANSFER_CLASS_1}
{BEGIN}
*****************************************************************
* Aminotransferases class-I pyridoxal-phosphate attachment site *
*****************************************************************

Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped [1,2] into subfamilies. One of these, called class-I, currently consists of the following enzymes:

 - Aspartate aminotransferase (AAT) (EC 2.6.1.1). AAT catalyzes the reversible transfer of the amino group from L-aspartate to 2-oxoglutarate to form oxaloacetate and L-glutamate. In eukaryotes, there are two AAT isozymes: one is located in the mitochondrial matrix, the second is cytoplasmic. In prokaryotes, only one form of AAT is found (gene aspC).
 - Tyrosine aminotransferase (EC 2.6.1.5), which catalyzes the first step in tyrosine catabolism by reversibly transferring its amino group to 2-oxoglutarate to form 4-hydroxyphenylpyruvate and L-glutamate.
 - Aromatic aminotransferase (EC 2.6.1.57) involved in the synthesis of Phe, Tyr, Asp and Leu (gene tyrB).
 - 1-aminocyclopropane-1-carboxylate synthase (EC 4.4.1.14) (ACC synthase) from plants. ACC synthase catalyzes the first step in ethylene biosynthesis.
 - Pseudomonas denitrificans cobC, which is involved in cobalamin biosynthesis.
 - Yeast hypothetical protein YJL060w.

The sequence around the pyridoxal-phosphate attachment site of this class of enzyme is sufficiently conserved to allow the creation of a specific pattern.

-Consensus pattern: [GS]-[LIVMFYTAC]-[GSTA]-K-x(2)-[GSALVN]-[LIVMFA]-x-[GNAR]-{V}-R-[LIVMA]-[GA]
                    [K is the pyridoxal-P attachment site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 1.
-Last update: April 2006 / Pattern revised.

[ 1] Bairoch A.
     Unpublished observations (1992).
[ 2] Sung M.H., Tanizawa K., Tanaka H., Kuramitsu S., Kagamiyama H., Hirotsu K., Okamoto A., Higuchi T., Soda K.
     "Thermostable aspartate aminotransferase from a thermophilic Bacillus species. Gene cloning, sequence determination, and preliminary x-ray characterization."
     J. Biol. Chem. 266:2567-2572(1991).
     PubMed=1990006;
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00099}
{PS00106; GALACTOKINASE}
{BEGIN}
***************************
* Galactokinase signature *
***************************

Galactokinase (EC 2.7.1.6) [1] catalyzes the first reaction of galactose metabolism, the conversion of galactose to galactose 1-phosphate.

There are three well conserved regions in the sequence of eukaryotic and prokaryotic galactokinase. As a signature pattern we have selected the best conserved of these regions, which is located in the N-terminal section of galactokinase.

In yeast the GAL3 protein [2] is required for rapid induction of the galactose system.
The exact function of GAL3 is not known, but it may be involved in the production of a true inducer or coinducer molecule. The sequence of GAL3 is closely related to that of galactokinases.

-Consensus pattern: G-R-x-N-[LIV]-I-G-[DE]-H-x-D-Y
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: See also the section describing the pattern for the GHMP kinases ATP-binding domain <PDOC00545>.
-Last update: July 1999 / Pattern and text revised.

[ 1] Debouck C., Riccio A., Schumperli D., McKenney K., Jeffers J., Hughes C., Rosenberg M., Heusterspreute M., Brunel F., Davison J.
     "Structure of the galactokinase gene of Escherichia coli, the last (?) gene of the gal operon."
     Nucleic Acids Res. 13:1841-1853(1985).
     PubMed=3158881
[ 2] Bajwa W., Torchia T.E., Hopper J.E.
     "Yeast regulatory gene GAL3: carbon regulation; UASGal elements in common with GAL1, GAL2, GAL7, GAL10, GAL80, and MEL1; encoded protein strikingly similar to yeast and Escherichia coli galactokinases."
     Mol. Cell. Biol. 8:3439-3447(1988).
     PubMed=3062381
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00100}
{PS00107; PROTEIN_KINASE_ATP}
{PS00108; PROTEIN_KINASE_ST}
{PS00109; PROTEIN_KINASE_TYR}
{PS50011; PROTEIN_KINASE_DOM}
{BEGIN}
******************************************
* Protein kinases signatures and profile *
******************************************

Eukaryotic protein kinases [1 to 5] are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common to both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the catalytic domain of protein kinases. We have selected two of these regions to build signature patterns. The first region, which is located in the N-terminal extremity of the catalytic domain, is a glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP binding. The second region, which is located in the central part of the catalytic domain, contains a conserved aspartic acid residue which is important for the catalytic activity of the enzyme [6]; we have derived two signature patterns for that region: one specific for serine/threonine kinases and the other for tyrosine kinases.

We also developed a profile which is based on the alignment in [1] and covers the entire catalytic domain.

-Consensus pattern: [LIV]-G-{P}-G-{P}-[FYWMGSTNH]-[SGA]-{PW}-[LIVCAT]-{PD}-x-[GSTACLIVMFY]-x(5,18)-[LIVMFYWCSTAR]-[AIVP]-[LIVMFAGCKR]-K
                    [K binds ATP]
-Sequences known to belong to this class detected by the pattern: the majority of known protein kinases but it fails to find a number of them, especially viral kinases which are quite divergent in this region and are completely missed by this pattern.
-Other sequence(s) detected in Swiss-Prot: 42.
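
The consensus pattern notation used in these entries can be translated mechanically into a regular expression. The following is a minimal sketch in Python, assuming the usual reading of the notation as it appears in this file: `x` is any residue, `[...]` lists accepted residues, `{...}` lists excluded residues, and `(n)` or `(n,m)` is a repeat count. The function name `prosite_to_regex` is hypothetical and is not part of any PROSITE tool; the `<` and `>` terminus anchors are not handled.

```python
import re

# One pattern element: a bracketed set, a braced exclusion set, or a single
# residue letter / 'x', optionally followed by a repeat count "(n)" or "(n,m)".
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex string.

    Hypothetical helper for illustration only. Hyphens, whitespace and a
    trailing period are ignored, so patterns with or without separators
    are accepted.
    """
    compact = re.sub(r"[\s\-.]", "", pattern)
    parts = []
    pos = 0
    for match in _ELEMENT.finditer(compact):
        if match.start() != pos:
            raise ValueError(f"unexpected text at offset {pos}: {compact[pos:]!r}")
        pos = match.end()
        element, low, high = match.groups()
        if element == "x":
            core = "."                        # any residue
        elif element.startswith("{"):
            core = "[^" + element[1:-1] + "]"  # excluded residues
        else:
            core = element                    # literal residue or accepted set
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    if pos != len(compact):
        raise ValueError(f"unexpected text at offset {pos}: {compact[pos:]!r}")
    return "".join(parts)


ATP_BINDING_PATTERN = (
    "[LIV]-G-{P}-G-{P}-[FYWMGSTNH]-[SGA]-{PW}-[LIVCAT]-{PD}-x-[GSTACLIVMFY]-"
    "x(5,18)-[LIVMFYWCSTAR]-[AIVP]-[LIVMFAGCKR]-K"
)
atp_binding_regex = re.compile(prosite_to_regex(ATP_BINDING_PATTERN))
```

A call such as `atp_binding_regex.search(sequence)` then reports the first stretch of an upper-case one-letter protein sequence that fits the ATP-binding signature. A plain regular expression search reports non-overlapping hits only, which may differ from how a dedicated pattern scanner counts matches.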

-Consensus pattern: [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-K-x(2)-N-[LIVMFYCT](3)
                    [D is an active site residue]
-Sequences known to belong to this class detected by the pattern: Most serine/threonine specific protein kinases with 10 exceptions (half of them viral kinases) and also Epstein-Barr virus BGLF4 and Drosophila ninaC which have respectively Ser and Arg instead of the conserved Lys and which are therefore detected by the tyrosine kinase specific pattern described below.
-Other sequence(s) detected in Swiss-Prot: 1.

-Consensus pattern: [LIVMFYC]-{A}-[HY]-x-D-[LIVMFY]-[RSTAC]-{D}-{PF}-N-[LIVMFYC](3)
                    [D is an active site residue]
-Sequences known to belong to this class detected by the pattern: ALL tyrosine specific protein kinases with the exception of human ERBB3 and mouse blk. This pattern will also detect most bacterial aminoglycoside phosphotransferases [8,9] and herpesviruses ganciclovir kinases [10], which are proteins structurally and evolutionarily related to protein kinases.
-Other sequence(s) detected in Swiss-Prot: 17.

-Sequences known to belong to this class detected by the profile: ALL, except for three viral kinases. This profile also detects receptor guanylate cyclases (see <PDOC00430>) and 2-5A-dependent ribonucleases. Sequence similarities between these two families and the eukaryotic protein kinase family have been noticed before. It also detects Arabidopsis thaliana kinase-like protein TMKL1 which seems to have lost its catalytic activity.
-Other sequence(s) detected in Swiss-Prot: 4.

-Note: If a protein analyzed includes the two protein kinase signatures, the probability of it being a protein kinase is close to 100%.
-Note: Eukaryotic-type protein kinases have also been found in prokaryotes such as Myxococcus xanthus [11] and Yersinia pseudotuberculosis.
-Note: The patterns shown above have been updated since their publication in [7].
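
The first note above can be illustrated with a short check that looks for the ATP-binding signature together with one of the two active-site signatures. This is a sketch only: the regular expressions are hand-translated from the three consensus patterns of this entry, the function name `kinase_signature_report` is hypothetical, and reading "the two signatures" as "ATP-binding plus either active-site pattern" is an assumption rather than something the entry states.

```python
import re

# Hand-translated from the three consensus patterns of this entry
# (x -> '.', {..} -> negated set, (n) / (n,m) -> repeat counts).
ATP_BINDING = re.compile(
    r"[LIV]G[^P]G[^P][FYWMGSTNH][SGA][^PW][LIVCAT][^PD].[GSTACLIVMFY]"
    r".{5,18}[LIVMFYWCSTAR][AIVP][LIVMFAGCKR]K"
)
SER_THR_ACTIVE_SITE = re.compile(r"[LIVMFYC].[HY].D[LIVMFY]K.{2}N[LIVMFYCT]{3}")
TYR_ACTIVE_SITE = re.compile(
    r"[LIVMFYC][^A][HY].D[LIVMFY][RSTAC][^D][^PF]N[LIVMFYC]{3}"
)


def kinase_signature_report(sequence: str) -> dict:
    """Report which kinase signatures occur in a one-letter protein sequence.

    'both' is True when the ATP-binding signature and at least one of the
    active-site signatures are present (an assumed reading of the note).
    """
    seq = sequence.upper()
    atp = ATP_BINDING.search(seq) is not None
    ser_thr = SER_THR_ACTIVE_SITE.search(seq) is not None
    tyr = TYR_ACTIVE_SITE.search(seq) is not None
    return {
        "atp_binding": atp,
        "ser_thr": ser_thr,
        "tyr": tyr,
        "both": atp and (ser_thr or tyr),
    }


# Illustrative fragment built to satisfy the ATP-binding and Ser/Thr
# patterns; it is not a database entry.
example = "LGTGSFGRVMLVKHKETGNHYAMK" + "AAAA" + "LIYRDLKPENLLI"
report = kinase_signature_report(example)
```

A positive result from such a check is only as reliable as the patterns themselves; as the entry states, divergent kinases (notably viral ones) can be missed, and each pattern also hits a number of unrelated Swiss-Prot sequences.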
-Expert(s) to contact by email:
           Hunter T.; [email protected]
           Quinn A.M.; [email protected]
-Last update: April 2006 / Pattern revised.

[ 1] Hanks S.K., Hunter T.
     "Protein kinases 6. The eukaryotic protein kinase superfamily: kinase (catalytic) domain structure and classification."
     FASEB J. 9:576-596(1995).
     PubMed=7768349
[ 2] Hunter T.
     "Protein kinase classification."
     Methods Enzymol. 200:3-37(1991).
     PubMed=1835513
[ 3] Hanks S.K., Quinn A.M.
     "Protein kinase catalytic domain sequence database: identification of conserved features of primary structure and classification of family members."
     Methods Enzymol. 200:38-62(1991).
     PubMed=1956325
[ 4] Hanks S.K.
     Curr. Opin. Struct. Biol. 1:369-383(1991).
[ 5] Hanks S.K., Quinn A.M., Hunter T.
     "The protein kinase family: conserved features and deduced phylogeny of the catalytic domains."
     Science 241:42-52(1988).
     PubMed=3291115
[ 6] Knighton D.R., Zheng J.H., Ten Eyck L.F., Ashford V.A., Xuong N.-H., Taylor S.S., Sowadski J.M.
     "Crystal structure of the catalytic subunit of cyclic adenosine monophosphate-dependent protein kinase."
     Science 253:407-414(1991).
     PubMed=1862342
[ 7] Bairoch A., Claverie J.-M.
     "Sequence patterns in protein kinases."
     Nature 331:22-22(1988).
     PubMed=3340146; DOI=10.1038/331022a0
[ 8] Benner S.
     Nature 329:21-21(1987).
[ 9] Kirby R.
     "Evolutionary origin of aminoglycoside phosphotransferase resistance genes."
     J. Mol. Evol. 30:489-492(1990).
     PubMed=2165531
[10] Littler E., Stuart A.D., Chee M.S.
     Nature 358:160-162(1992).
[11] Munoz-Dorado J., Inouye S., Inouye M.
     Cell 67:995-1006(1991).
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement.
For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00101}
{PS00110; PYRUVATE_KINASE}
{BEGIN}
*****************************************
* Pyruvate kinase active site signature *
*****************************************

Pyruvate kinase (EC 2.7.1.40) (PK) [1] catalyzes the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with the concomitant phosphorylation of ADP to ATP. PK requires both magnesium and potassium ions for its activity. PK is found in all living organisms. In vertebrates there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart, and brain), and M2 (early fetal tissues). In Escherichia coli there are two isozymes: PK-I (gene pykF) and PK-II (gene pykA). All PK isozymes seem to be tetramers of identical subunits of about 500 amino acid residues.

As a signature pattern for PK we selected a conserved region that includes a lysine residue which seems to be the acid/base catalyst responsible for the interconversion of pyruvate and enolpyruvate, and a glutamic acid residue implicated in the binding of the magnesium ion.

-Consensus pattern: [LIVAC]-x-[LIVM](2)-[SAPCV]-K-[LIV]-E-[NKRST]-x-[DEQHS]-[GSTA]-[LIVM]
                    [K is the active site residue]
                    [E is a magnesium ligand]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 2.
-Last update: July 1999 / Pattern and text revised.

[ 1] Muirhead H.
     "Isoenzymes of pyruvate kinase."
     Biochem. Soc. Trans. 18:193-196(1990).
     PubMed=2379684
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified.
Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00102}
{PS00111; PGLYCERATE_KINASE}
{BEGIN}
*************************************
* Phosphoglycerate kinase signature *
*************************************

Phosphoglycerate kinase (EC 2.7.2.3) (PGK) [1] catalyzes the second step in the second phase of glycolysis, the reversible conversion of 1,3-diphosphoglycerate to 3-phosphoglycerate with generation of one molecule of ATP. PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. It is a two-domain protein; each domain is composed of six repeats of an alpha/beta structural motif.

As a signature pattern for PGK's, we selected a conserved region in the N-terminal region.

-Consensus pattern: [KRHGTCVN]-[VT]-[LIVMF]-[LIVMC]-R-x-D-x-N-[SACV]-P
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: July 1999 / Pattern and text revised.

[ 1] Watson H.C., Littlechild J.A.
     "Isoenzymes of phosphoglycerate kinase: evolutionary conservation of the structure of this glycolytic enzyme."
     Biochem. Soc. Trans. 18:187-190(1990).
     PubMed=2379683
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00103}
{PS00112; GUANIDO_KINASE}
{BEGIN}
***********************************************
* ATP:guanido phosphotransferases active site *
***********************************************

ATP:guanido phosphotransferases are a family of structurally and functionally related enzymes [1,2] that reversibly catalyze the transfer of phosphate between ATP and various phosphogens. The enzymes that belong to this family are:

 - Creatine kinase (EC 2.7.3.2) (CK) [3,4], which plays an important role in energy metabolism of vertebrates. It catalyzes the reversible transfer of high energy phosphate from ATP to creatine, generating phosphocreatine and ADP. There are at least four different, but very closely related, forms of CK. Two of the CK isozymes are cytosolic: the M (muscle) and B (brain) forms while the two others are mitochondrial. In sea urchin there is a flagellar isozyme, which consists of the triplication of a CK-domain.
 - Glycocyamine kinase (EC 2.7.3.1) (guanidoacetate kinase), an enzyme that catalyzes the transfer of phosphate from ATP to guanidoacetate.
 - Arginine kinase (EC 2.7.3.3), an enzyme that catalyzes the transfer of phosphate from ATP to arginine.
 - Taurocyamine kinase (EC 2.7.3.4), an annelid-specific enzyme that catalyzes the transfer of phosphate from ATP to taurocyamine.
 - Lombricine kinase (EC 2.7.3.5), an annelid-specific enzyme that catalyzes the transfer of phosphate from ATP to lombricine.
 - Smc74 [1], a cercaria-specific enzyme from Schistosoma mansoni. This enzyme consists of two CK-related duplicated domains. The substrate(s) specificity of Smc74 is not yet known.

A cysteine residue is implicated in the catalytic activity of these enzymes. The region around this active site residue is highly conserved and can be used as a signature pattern.

-Consensus pattern: C-P-x(0,1)-[ST]-N-[ILV]-G-T
                    [C is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Stein L.D., Harn D.A., David J.R.
     "A cloned ATP:guanidino kinase in the trematode Schistosoma mansoni has a novel duplicated structure."
     J. Biol. Chem. 265:6582-6588(1990).
     PubMed=2324092
[ 2] Strong S.J., Ellington W.R.
     "Isolation and sequence analysis of the gene for arginine kinase from the chelicerate arthropod, Limulus polyphemus: insights into catalytically important residues."
     Biochim. Biophys. Acta 1246:197-200(1995).
     PubMed=7819288
[ 3] Bessman S.-P., Carpenter C.L.
     "The creatine-creatine phosphate energy shuttle."
     Annu. Rev. Biochem. 54:831-862(1985).
     PubMed=3896131; DOI=10.1146/annurev.bi.54.070185.004151
[ 4] Haas R.C., Strauss A.W.
     "Separate nuclear genes encode sarcomere-specific and ubiquitous human mitochondrial creatine kinase isoenzymes."
     J. Biol. Chem. 265:6921-6927(1990).
     PubMed=2324105
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00104}
{PS00113; ADENYLATE_KINASE}
{BEGIN}
******************************
* Adenylate kinase signature *
******************************

Adenylate kinase (EC 2.7.4.3) (AK) [1] is a small monomeric enzyme that catalyzes the reversible transfer of MgATP to AMP (MgATP + AMP = MgADP + ADP).
In mammals there are three different isozymes:

 - AK1 (or myokinase), which is cytosolic.
 - AK2, which is located in the outer compartment of mitochondria.
 - AK3 (or GTP:AMP phosphotransferase), which is located in the mitochondrial matrix and which uses MgGTP instead of MgATP.

The sequence of AK has also been obtained from different bacterial species and from plants and fungi.

Two other enzymes have been found to be evolutionarily related to AK. These are:

 - Yeast uridylate kinase (EC 2.7.4.-) (UK) (gene URA6) [2] which catalyzes the transfer of a phosphate group from ATP to UMP to form UDP and ADP.
 - Slime mold UMP-CMP kinase (EC 2.7.4.14) [3] which catalyzes the transfer of a phosphate group from ATP to either CMP or UMP to form CDP or UDP and ADP.

Several regions of AK family enzymes are well conserved, including the ATP-binding domains. We have selected the most conserved of all regions as a signature for this type of enzyme. This region includes an aspartic acid residue that is part of the catalytic cleft of the enzyme and that is involved in a salt bridge. It also includes an arginine residue whose modification leads to inactivation of the enzyme.

-Consensus pattern: [LIVMFYWCA]-[LIVMFYW](2)-D-G-[FYI]-P-R-x(3)-[NQ]
                    [The R is an active site residue]
                    [The D is involved in a salt bridge]
-Sequences known to belong to this class detected by the pattern: ALL, except for Schistosoma mansoni (blood fluke) and Yersinia enterocolitica AK.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: Archaebacterial AK do not belong to this family [4].
-Last update: May 2004 / Text revised.

[ 1] Schulz G.E.
     "Structural and functional relationships in the adenylate kinase family."
     Cold Spring Harb. Symp. Quant. Biol. 52:429-439(1987).
     PubMed=2841070
[ 2] Liljelund P., Sanni A., Friesen J.D., Lacroute F.
     "Primary structure of the S. cerevisiae gene encoding uridine monophosphokinase."
     Biochem. Biophys. Res. Commun. 165:464-473(1989).
     PubMed=2556145
[ 3] Wiesmueller L., Noegel A.A., Barzu O., Gerisch G., Schleicher M.
     J. Biol. Chem. 265:6339-6345(1990).
[ 4] Kath T.H., Schmid R., Schaefer G.
     "Identification, cloning, and expression of the gene for adenylate kinase from the thermoacidophilic archaebacterium Sulfolobus acidocaldarius."
     Arch. Biochem. Biophys. 307:405-410(1993).
     PubMed=8274029
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00105}
{PS00114; PRPP_SYNTHETASE}
{BEGIN}
*****************************************************
* Phosphoribosyl pyrophosphate synthetase signature *
*****************************************************

Phosphoribosyl pyrophosphate synthetase (EC 2.7.6.1) (PRPP synthetase) catalyzes the formation of PRPP from ATP and ribose 5-phosphate. PRPP is then used in various biosynthetic pathways, as for example in the formation of purines, pyrimidines, histidine and tryptophan. PRPP synthetase requires inorganic phosphate and magnesium ions for its stability and activity. In mammals, three isozymes of PRPP synthetase are found; in yeast there are at least four isozymes.

As a signature pattern for this enzyme, we selected a very conserved region that has been suggested to be involved in binding divalent cations [1]. This region contains two conserved aspartic acid residues as well as a histidine, which are all potential ligands for a cation such as magnesium.

-Consensus pattern: D-[LIM]-H-[SANDT]-x-[QS]-[IMSTAVF]-[QMLPH]-[GA]-[FY]-F-x(2)-P-[LIVMFCT]-D
                    [The 2 D's and the H are magnesium ligands]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Bower S.G., Harlow K.W., Switzer R.L., Hove-Jensen B.
     "Characterization of the Escherichia coli prsA1-encoded mutant phosphoribosylpyrophosphate synthetase identifies a divalent cation-nucleotide binding site."
     J. Biol. Chem. 264:10287-10291(1989).
     PubMed=2542328
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00106}
{PS00115; RNA_POL_II_REPEAT}
{BEGIN}
****************************************************
* Eukaryotic RNA polymerase II heptapeptide repeat *
****************************************************

RNA polymerase II (EC 2.7.7.6) [1,2] is one of the three forms of RNA polymerase that exist in eukaryotic nuclei. The C-terminal region of the largest subunit of this oligomeric enzyme consists of the tandem repeat of a conserved heptapeptide [3]. The number of repeats varies according to the species (for example: 17 in Plasmodium, 26 in yeast, 44 in Drosophila, and 52 in mammals). The region containing these repeats is essential to the function of polymerase II. This repeated heptapeptide (called CT7n or CTD) is rich in hydroxyl groups.
It probably projects out of the globular catalytic domain and may interact with the acidic activator domains of transcriptional regulatory proteins. It is also known to bind by intercalation to DNA.

RNA polymerase II is activated by phosphorylation. The serine and threonine residues in the CT7n repeats are the target of such phosphorylation.

-Consensus pattern: Y-[ST]-P-[ST]-S-P-[STANK]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: The consensus for the heptapeptide repeat is Y-S-P-T-S-P-S, but we have allowed variants in positions 2, 4, and 7 of the pattern so as to detect some of the imperfect repeats.
-Note: Protozoan parasites such as Trypanosoma and Crithidia do not have a CT7n domain.
-Last update: December 1991 / Text revised.

[ 1] Woychik N.A., Young R.A.
     "RNA polymerase II: subunit structure and function."
     Trends Biochem. Sci. 15:347-351(1990).
     PubMed=1700503
[ 2] Young R.A.
     "RNA polymerase II."
     Annu. Rev. Biochem. 60:689-715(1991).
     PubMed=1883205; DOI=10.1146/annurev.bi.60.070191.003353
[ 3] Corden J.L.
     "Tails of RNA polymerase II."
     Trends Biochem. Sci. 15:383-387(1990).
     PubMed=2251729
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00107}
{PS00116; DNA_POLYMERASE_B}
{BEGIN}
*************************************
* DNA polymerase family B signature *
*************************************

Replicative DNA polymerases (EC 2.7.7.7) are the key enzymes catalyzing the accurate replication of DNA. They require either a small RNA molecule or a protein as a primer for the de novo synthesis of a DNA chain. On the basis of sequence similarity, a number of DNA polymerases have been grouped [1 to 7] under the designation of DNA polymerase family B. These are:

 - Higher eukaryotes polymerases alpha.
 - Higher eukaryotes polymerases delta.
 - Yeast polymerase I/alpha (gene POL1), polymerase II/epsilon (gene POL2), polymerase III/delta (gene POL3) and polymerase REV3.
 - Escherichia coli polymerase II (gene dinA or polB).
 - Archaebacterial polymerases.
 - Polymerases of viruses from the herpesviridae family.
 - Polymerases from Adenoviruses.
 - Polymerases from Baculoviruses.
 - Polymerases from Chlorella viruses.
 - Polymerases from Poxviruses.
 - Bacteriophage T4 polymerase.
 - Podoviridae bacteriophages Phi-29, M2 and PZA polymerase.
 - Tectiviridae bacteriophage PRD1 polymerase.
 - Polymerases encoded on mitochondrial linear DNA plasmids in various fungi and plants (Kluyveromyces lactis pGKL1 and pGKL2, Agaricus bitorquis pEM, Ascobolus immersus pAI2, Claviceps purpurea pCLK1, Neurospora Kalilo and Maranhar, maize S-1, etc).

Six regions of similarity (numbered from I to VI) are found in all or a subset of the above polymerases. The most conserved region (I) includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known. However, it has been suggested [3] that it may be involved in binding a magnesium ion. We selected this conserved region as a signature for this family of DNA polymerases.

-Consensus pattern: [YA]-[GLIVMSTAC]-D-T-D-[SG]-[LIVMFTC]-{LA}-[LIVMSTAC]
-Sequences known to belong to this class detected by the pattern: ALL, except for yeast polymerase II/epsilon, Agaricus bitorquis pEM and Sulfolobus solfataricus polymerase II.
-Other sequence(s) detected in Swiss-Prot: 9.
-Last update: December 2004 / Pattern and text revised.

[ 1] Jung G.H., Leavitt M.C., Hsieh J.-C., Ito J.
     "Bacteriophage PRD1 DNA polymerase: evolution of DNA polymerases."
     Proc. Natl. Acad. Sci. U.S.A. 84:8287-8291(1987).
     PubMed=3479792
[ 2] Bernad A., Zaballos A., Salas M., Blanco L.
     "Structural and functional relationships between prokaryotic and eukaryotic DNA polymerases."
     EMBO J. 6:4219-4225(1987).
     PubMed=3127204
[ 3] Argos P.
     "A sequence motif in many polymerases."
     Nucleic Acids Res. 16:9909-9916(1988).
     PubMed=2461550
[ 4] Wang T.S.-F., Wong S.W., Korn D.
     "Human DNA polymerase alpha: predicted functional domains and relationships with viral DNA polymerases."
     FASEB J. 3:14-21(1989).
     PubMed=2642867
[ 5] Delarue M., Poch O., Tordo N., Moras D., Argos P.
     "An attempt to unify the structure of polymerases."
     Protein Eng. 3:461-467(1990).
     PubMed=2196557
[ 6] Ito J., Braithwaite D.K.
     "Compilation and alignment of DNA polymerase sequences."
     Nucleic Acids Res. 19:4045-4057(1991).
     PubMed=1870963
[ 7] Braithwaite D.K., Ito J.
     "Compilation, alignment, and phylogenetic relationships of DNA polymerases."
     Nucleic Acids Res. 21:787-802(1993).
     PubMed=8451181
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00108}
{PS00117; GAL_P_UDP_TRANSF_I}
{PS01163; GAL_P_UDP_TRANSF_II}
{BEGIN}
*******************************************************
* Galactose-1-phosphate uridyl transferase signatures *
*******************************************************

Galactose-1-phosphate uridyl transferase (EC 2.7.7.12) (galT) catalyzes the transfer of an uridyldiphosphate group on galactose (or glucose) 1-phosphate. During the reaction, the uridyl moiety links to a histidine residue. In the Escherichia coli enzyme, it has been shown [1] that two histidine residues separated by a single proline residue are essential for enzyme activity. The first one is a ligand to a zinc ion and the second acts as a nucleophile.

On the basis of sequence similarities, two apparently unrelated families seem to exist. Class-I enzymes are found in eukaryotes as well as some bacteria such as Escherichia coli or Streptomyces lividans, while class-II enzymes have been found so far only in some Gram-positive bacteria such as Bacillus subtilis or Lactobacillus helveticus [2].

We developed signature patterns for both families. For class-I enzymes the signature is based on the active site residues. For class-II enzymes we chose a region which also includes two conserved histidines.

-Consensus pattern: F-E-N-[RK]-G-x(3)-G-x(4)-H-P-H-x-Q
                    [The first H binds zinc and the second H is an active site residue]
-Sequences known to belong to this class detected by the pattern: ALL class-I enzymes.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: D-L-P-I-[VS]-G-G-[ST]-[LIVM](2)-[STAV]-H-[DEN]-H-[FY]-Q-[GAT]-G
-Sequences known to belong to this class detected by the pattern: ALL class-II enzymes.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: Class-I enzymes are structurally related to the HIT family of proteins (see <PDOC00694>).
-Last update: December 2004 / Pattern and text revised.

[ 1] Reichardt J.K.V., Berg P.
     "Conservation of short patches of amino acid sequence amongst proteins with a common function but evolutionarily distinct origins: implications for cloning genes and for structure-function analysis."
     Nucleic Acids Res. 16:9017-9026(1988).
     PubMed=2845364
[ 2] Mollet B., Pilloud N.
     "Galactose utilization in Lactobacillus helveticus: isolation and characterization of the galactokinase (galK) and galactose-1-phosphate uridyl transferase (galT) genes."
     J. Bacteriol. 173:4464-4473(1991).
     PubMed=2066342
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00109}
{PS00118; PA2_HIS}
{PS00119; PA2_ASP}
{BEGIN}
********************************************
* Phospholipase A2 active sites signatures *
********************************************

Phospholipase A2 (EC 3.1.1.4) (PA2) [1,2] is an enzyme which releases fatty acids from the second carbon group of glycerol. PA2's are small and rigid proteins of 120 amino-acid residues that have four to seven disulfide bonds. PA2 binds a calcium ion which is required for activity. The side chains of two conserved residues, a histidine and an aspartic acid, participate in a 'catalytic network'.

Many PA2's have been sequenced from snakes, lizards, bees and mammals. In the latter, there are at least four forms: pancreatic, membrane-associated as well as two less characterized forms. The venom of most snakes contains multiple forms of PA2.
Some of them are presynaptic neurotoxins which inhibit neuromuscular transmission by blocking acetylcholine release from the nerve termini.

We derived two different signature patterns for PA2's. The first is centered on the active site histidine and contains three cysteines involved in disulfide bonds. The second is centered on the active site aspartic acid and also contains three cysteines involved in disulfide bonds.

-Consensus pattern: C-C-{P}-x-H-{LGY}-x-C
                    [H is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL known functional PA2's. However, this pattern will not detect some snake toxins homologous with PA2 but which have lost their catalytic activity as well as otoconin-22, a Xenopus protein from the aragonitic otoconia which is also unlikely to be enzymatically active.
-Other sequence(s) detected in Swiss-Prot: 15.

-Consensus pattern: [LIVMA]-C-{LIVMFYWPCST}-C-D-{GS}-{G}-{N}-x-{QS}-C
                    [D is the active site residue]
-Sequences known to belong to this class detected by the pattern: the majority of functional and non-functional PA2's. Undetected sequences are bee PA2, gila monster PA2's, PA2 PL-X from habu and PA2 PA-5 from mulga.
-Other sequence(s) detected in Swiss-Prot: 13.
-Last update: April 2006 / Pattern revised.

[ 1] Davidson F.F., Dennis E.A.
     "Evolutionary relationships and implications for the regulation of phospholipase A2 from snake venom to human secreted forms."
     J. Mol. Evol. 31:228-238(1990).
     PubMed=2120459
[ 2] Gomez F., Vandermeers A., Vandermeers-Piret M.-C., Herzog R., Rathe J., Stievenart M., Winand J., Christophe J.
     "Purification and characterization of five variants of phospholipase A2 and complete primary structure of the main phospholipase A2 variant in Heloderma suspectum (Gila monster) venom."
     Eur. J. Biochem. 186:23-33(1989).
     PubMed=2480893
+-----------------------------------------------------------------------+
PROSITE is copyright.
It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00110} {PS00120; LIPASE_SER} {BEGIN} ******************************* * Lipases, serine active site * ******************************* Triglyceride lipases (EC 3.1.1.3) [1] are lipolytic enzymes that hydrolyze the ester bond of triglycerides. Lipases are widely distributed in animals, plants and prokaryotes. In higher vertebrates there are at least three tissue-specific isozymes: pancreatic, hepatic, and gastric/lingual. These three types of lipases are closely related to each other as well as to lipoprotein lipase (EC 3.1.1.34) [2], which hydrolyzes triglycerides of chylomicrons and very low density lipoproteins (VLDL). The most conserved region in all these proteins is centered around a serine residue which has been shown [3] to participate, with a histidine and an aspartic acid residue, in a charge relay system. Such a region is also present in lipases of prokaryotic origin and in lecithin-cholesterol acyltransferase (EC 2.3.1.43) (LCAT) [4], which catalyzes fatty acid transfer between phosphatidylcholine and cholesterol. We have built a pattern from that region. -Consensus pattern: [LIV]-{KG}-[LIVFY]-[LIVMST]-G-[HYWV]-S-{YAG}-G-[GSTAC] [S is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 38. -Note: Drosophila vitellogenins are also related to lipases [5], but they have lost their active site serine. -Last update: December 2004 / Pattern and text revised. [ 1] Chapus C., Rovery M., Sarda L., Verger R.
"Minireview on pancreatic lipase and colipase." Biochimie 70:1223-1234(1988). PubMed=3147715 [ 2] Persson B., Bengtsson-Olivecrona G., Enerback S., Olivecrona T., Jornvall H. "Structural features of lipoprotein lipase. Lipase family relationships, binding interactions, non-equivalence of lipase cofactors, vitellogenin similarities and functional subdivision of lipoprotein lipase." Eur. J. Biochem. 179:39-45(1989). PubMed=2917565 [ 3] Blow D. "Enzymology. More of the catalytic triad." Nature 343:694-695(1990). PubMed=2304545; DOI=10.1038/343694a0 [ 4] McLean J., Fielding C., Drayna D., Dieplinger H., Baer B., Kohr W., Henzel W., Lawn R. "Cloning and expression of human lecithin-cholesterol acyltransferase cDNA." Proc. Natl. Acad. Sci. U.S.A. 83:2335-2339(1986). PubMed=3458198 [ 5] Baker M.E. "Is vitellogenin an ancestor of apolipoprotein B-100 of human low-density lipoprotein and human lipoprotein lipase?" Biochem. J. 255:1057-1060(1988). PubMed=3145737 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00111} {PS00121; COLIPASE_1} {PS51342; COLIPASE_2} {BEGIN} ***************************************** * Colipase family signature and profile * ***************************************** Colipase [1,2,3] is a pancreatic protein that functions as a cofactor for lipase, with which it forms a stoichiometric complex. It also binds to the bile-salt covered triacylglycerol interface thus allowing the enzyme to anchor itself to the water-lipid interface.
As shown in the following schematic representation, colipase is a small protein of approximately 100 amino-acid residues with five conserved disulfide bonds. +--------+ +--|--+ | +----------+ | | | | | ***** | xxxxxxxxCxxCxCCxxxxxCxxxxCxxxxxCxCxxCxxxxxxxxCxxxx | | | | +-----------------+ +-----------+ 'C': conserved cysteine involved in a disulfide bond. '*': position of the pattern. As a signature pattern for this family, we chose a region which includes two of the cysteines involved in disulfide bonds, as well as three tyrosine residues which seem to be involved in the interfacial binding. We also developed a profile that covers the whole colipase. -Consensus pattern: Y-x(2)-Y-Y-x-C-x-C [The 2 C's are involved in disulfide bonds] [The 3 Y's are involved in interfacial binding] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2007 / Text revised; profile added. [ 1] Erlanson-Albertsson C. "Pancreatic colipase. Structural and physiological aspects." Biochim. Biophys. Acta 1125:1-7(1992). PubMed=1567900 [ 2] Chapus C., Rovery M., Sarda L., Verger R. "Minireview on pancreatic lipase and colipase." Biochimie 70:1223-1234(1988). PubMed=3147715 [ 3] van Tilbeurgh H., Sarda L., Verger R., Cambillau C. "Structure of the pancreatic lipase-procolipase complex." Nature 359:159-162(1992). PubMed=1522902; DOI=10.1038/359159a0 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. 
For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00112} {PS00122; CARBOXYLESTERASE_B_1} {PS00941; CARBOXYLESTERASE_B_2} {BEGIN} *************************************** * Carboxylesterases type-B signatures * *************************************** Higher eukaryotes have many distinct esterases. Among the different types are those which act on carboxylic esters (EC 3.1.1.-). Carboxyl-esterases have been classified into three categories (A, B and C) on the basis of differential patterns of inhibition by organophosphates. The sequence of a number of type-B carboxylesterases indicates [1,2,3] that the majority are evolutionary related. This family currently consists of the following proteins: - Acetylcholinesterase (EC 3.1.1.7) (AChE) from vertebrates and from Drosophila. - Mammalian cholinesterase II (butyryl cholinesterase) (EC 3.1.1.8). Acetylcholinesterase and cholinesterase II are closely related enzymes that hydrolyze choline esters [4]. - Mammalian liver microsomal carboxylesterases (EC 3.1.1.1). - Drosophila esterase 6, produced in the anterior ejaculatory duct of the male insect reproductive system where it plays an important role in its reproductive biology. - Drosophila esterase P. - Culex pipiens (mosquito) esterases B1 and B2. - Myzus persicae (peach-potato aphid) esterases E4 and FE4. - Mammalian bile-salt-activated lipase (BAL) [5], a multifunctional lipase which catalyzes fat and vitamin absorption. It is activated by bile salts in infant intestine where it helps to digest milk fats. - Insect juvenile hormone esterase (JH esterase) (EC 3.1.1.59). - Lipases (EC 3.1.1.3) from the fungi Geotrichum candidum and Candida rugosa. - Caenorhabditis gut esterase (gene ges-1). 
- Duck acyl-[acyl-carrier protein] hydrolase, medium chain (EC 3.1.2.14), an enzyme that may be associated with peroxisome proliferation and may play a role in the production of 3-hydroxy fatty acid diester pheromones. - Membrane enclosed crystal proteins from slime mold. These proteins are, most probably esterases; the vesicles where they are found have therefore been termed esterosomes. So far two bacterial proteins have been found to belong to this family: - Phenmedipham hydrolase (phenylcarbamate hydrolase), an Arthrobacter oxidans plasmid-encoded enzyme (gene pcd) that degrades the phenylcarbamate herbicides phenmedipham and desmedipham by hydrolyzing their central carbamate linkages. - Para-nitrobenzyl esterase from Bacillus subtilis (gene pnbA). The following proteins, while having lost their catalytic activity, contain a domain evolutionary related to that of carboxylesterases type-B: - Thyroglobulin (TG), a glycoprotein specific to the thyroid gland, which is the precursor of the iodinated thyroid hormones thyroxine (T4) and triiodo thyronine (T3). - Drosophila protein neurotactin (gene nrt) which may mediate or modulate cell adhesion between embryonic cells during development. - Drosophila protein glutactin (gene glt), whose function is not known. As is the case for lipases and serine proteases, the catalytic apparatus of esterases involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. The sequence around the active site serine is well conserved and can be used as a signature pattern. As a second signature pattern, we selected a conserved region located in the N-terminal section and which contains a cysteine involved in a disulfide bond. -Consensus pattern: F-[GR]-G-x(4)-[LIVM]-x-[LIV]-x-G-x-S-[STAG]-G [S is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL members of this family with a catalytic activity. -Other sequence(s) detected in Swiss-Prot: NONE. 
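The consensus patterns in these entries use the PROSITE pattern notation: elements are separated by '-', 'x' stands for any residue, '[...]' lists accepted residues, '{...}' lists excluded residues, and a suffix such as '(4)' or '(1,2)' gives a repeat count or range. The following is a minimal sketch, not part of PROSITE itself, of how such a pattern could be translated into a regular expression and used to scan a sequence. The function name prosite_to_regex and the test sequence are hypothetical and chosen only for illustration; the sketch does not handle the '<' and '>' terminal anchors.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE consensus pattern into a Python regex string.

    Illustrative helper (hypothetical name); '<' and '>' anchors are not handled.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Separate the residue part from an optional repeat suffix, e.g. x(4) or x(1,2).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        residue, repeat = m.group(1), m.group(2)
        if residue == "x":
            token = "."                          # any residue
        elif residue.startswith("{") and residue.endswith("}"):
            token = "[^" + residue[1:-1] + "]"   # excluded residues
        else:
            token = residue                      # single letter or [ABC] class
        if repeat:
            token += "{" + repeat + "}"          # (n) or (n,m) repeat
        parts.append(token)
    return "".join(parts)


# First carboxylesterase type-B pattern, as given in the entry above.
regex = prosite_to_regex("F-[GR]-G-x(4)-[LIVM]-x-[LIV]-x-G-x-S-[STAG]-G")

# Synthetic sequence built to satisfy the pattern; it is NOT a real protein.
synthetic = "MKFGGAAAALALAGASSGQQ"
hit = re.search(regex, synthetic)
if hit:
    print(hit.start(), hit.group())
```

A regular-expression scan of this kind only reports where a pattern occurs; it says nothing about the false-positive counts listed under "Other sequence(s) detected in Swiss-Prot", which depend on the database release that was scanned.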
-Consensus pattern: [EDA]-[DG]-C-L-[YTF]-[LIVT]-[DNS]-[LIV]-[LIVFYW]-x-[PQR] [C is involved in a disulfide bond] -Sequences known to belong to this class detected by the pattern: ALL, except for mosquito and peach-potato aphid esterases and juvenile hormone esterases. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: Human esterase-D, also a type-B carboxylesterase, does not seem to be evolutionarily related. -Expert(s) to contact by email: Sussman J.; [email protected] -Last update: April 2006 / Pattern revised. [ 1] Myers M., Richmond R.C., Oakeshott J.G. "On the origins of esterases." Mol. Biol. Evol. 5:113-119(1988). PubMed=3163407 [ 2] Krejci E., Duval N., Chatonnet A., Vincens P., Massoulie J. "Cholinesterase-like domains in enzymes and structural proteins: functional and evolutionary relationships and identification of a catalytically essential aspartic acid." Proc. Natl. Acad. Sci. U.S.A. 88:6647-6651(1991). PubMed=1862088 [ 3] Cygler M., Schrag J.D., Sussman J.L., Harel M., Silman I., Gentry M.K., Doctor B.P. "Relationship between sequence conservation and three-dimensional structure in a large family of esterases, lipases, and related proteins." Protein Sci. 2:366-382(1993). PubMed=8453375 [ 4] Lockridge O. "Structure of human serum cholinesterase." BioEssays 9:125-128(1988). PubMed=3067729 [ 5] Wang C.-S., Hartsuck J.A. "Bile salt-activated lipase. A multiple function lipolytic enzyme." Biochim. Biophys. Acta 1166:1-19(1993). PubMed=8431483 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+ {END} {PDOC00113} {PS00123; ALKALINE_PHOSPHATASE} {BEGIN} ************************************ * Alkaline phosphatase active site * ************************************ Alkaline phosphatase (EC 3.1.3.1) (ALP) [1] is a zinc and magnesium-containing metalloenzyme which hydrolyzes phosphate esters, optimally at high pH. It is found in nearly all living organisms, with the exception of some plants. In Escherichia coli, ALP (gene phoA) is found in the periplasmic space. In yeast it (gene PHO8) is found in lysosome-like vacuoles and in mammals, it is a glycoprotein attached to the membrane by a GPI-anchor. In mammals, four different isozymes are currently known [2]. Three of them are tissue-specific: the placental, placental-like (germ cell) and intestinal isozymes. The fourth form is tissue non-specific and was previously known as the liver/bone/kidney isozyme. Streptomyces species involved in the synthesis of streptomycin (SM), an antibiotic, express a phosphatase (EC 3.1.3.39) (gene strK) which is highly related to ALP. It specifically cleaves both streptomycin-6-phosphate and, more slowly, streptomycin-3"-phosphate. A serine is involved in the catalytic activity of ALP. The region around the active site serine is relatively well conserved and can be used as a signature pattern. -Consensus pattern: [IV]-x-D-S-[GAS]-[GASC]-[GAST]-[GA]-T [S is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 3. -Last update: June 1994 / Text revised. [ 1] Trowsdale J., Martin D., Bicknell D., Campbell I. "Alkaline phosphatases." Biochem. Soc. Trans. 18:178-180(1990). PubMed=2379681 [ 2] Manes T., Glade K., Ziomek C.A., Millan J.L. "Genomic structure and comparison of mouse tissue-specific alkaline phosphatase genes." Genomics 8:541-554(1990). PubMed=2286375 [ 3] Mansouri K., Piepersberg W.
"Genetics of streptomycin production in Streptomyces griseus: nucleotide sequence of five genes, strFGHIK, including a phosphatase gene." Mol. Gen. Genet. 228:459-469(1991). PubMed=1654502 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00114} {PS00124; FBPASE} {BEGIN} ******************************************* * Fructose-1-6-bisphosphatase active site * ******************************************* Fructose-1,6-bisphosphatase (EC 3.1.3.11) (FBPase) [1], a regulatory enzyme in gluconeogenesis, catalyzes the hydrolysis of fructose 1,6-bisphosphate to fructose 6-phosphate. It is involved in many different metabolic pathways and found in most organisms. Sedoheptulose-1,7-bisphosphatase (EC 3.1.3.37) (SBPase) [2] is an enzyme found in plant chloroplasts and in photosynthetic bacteria that catalyzes the hydrolysis of sedoheptulose 1,7-bisphosphate to sedoheptulose 7-phosphate, a step in Calvin's reductive pentose phosphate cycle. It is functionally and structurally related to FBPase. In mammalian FBPase, a lysine residue has been shown to be involved in the catalytic mechanism [3]. The region around this residue is highly conserved and can be used as a signature pattern for FBPase and SBPase. It must be noted that, in some bacterial FBPase sequences, the active site lysine is replaced by an arginine.
-Consensus pattern: [AG]-[RK]-[LI]-x(1,2)-[LIV]-[FY]-E-x(2)-P-[LIVM]-[GSA] [K/R is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2001 / Pattern and text revised. [ 1] Benkovic S.J., DeMaine M.M. "Mechanism of action of fructose 1,6-bisphosphatase." Adv. Enzymol. 53:45-82(1982). PubMed=6277165 [ 2] Raines C.A., Lloyd J.C., Willingham N.M., Potts S., Dyer T.A. "cDNA and gene sequences of wheat chloroplast sedoheptulose-1,7-bisphosphatase reveal homology with fructose-1,6-bisphosphatases." Eur. J. Biochem. 205:1053-1059(1992). PubMed=1374332 [ 3] Ke H.M., Thorpe C.M., Seaton B., Lipscomb W.N., Marcus F. "Structure refinement of fructose-1,6-bisphosphatase and its fructose 2,6-bisphosphate complex at 2.8 A resolution." J. Mol. Biol. 212:513-539(1990). PubMed=2157849 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00115} {PS00125; SER_THR_PHOSPHATASE} {BEGIN} ************************************************************ * Serine/threonine specific protein phosphatases signature * ************************************************************ Serine/threonine specific protein phosphatases (EC 3.1.3.16) (PP) [1,2,3] are enzymes that catalyze the removal of a phosphate group attached to a serine or a threonine residue. They are very important in controlling intracellular events in eukaryotic cells.
In mammalian tissues four different types of PP have been identified and are known as PP1, PP2A, PP2B and PP2C. Except for PP2C, these enzymes are evolutionarily related. - Protein phosphatase-1 (PP1) is an enzyme of broad specificity. It is inhibited by two thermostable proteins, inhibitor-1 and -2. In mammals, there are two closely related isoforms of PP-1: PP-1alpha and PP-1beta, produced by alternative splicing of the same gene. In Emericella nidulans, PP-1 (gene bimG) plays an important role in mitosis control by reversing the action of the nimA kinase. In yeast, PP-1 (gene SIT4) is involved in dephosphorylating the large subunit of RNA polymerase II. - Protein phosphatase-2A (PP2A) is also an enzyme of broad specificity. PP2A is a trimeric enzyme that consists of a core composed of a catalytic subunit associated with a 65 Kd regulatory subunit and a third variable subunit. In mammals, there are two closely related isoforms of the catalytic subunit of PP2A: PP2A-alpha and PP2A-beta, encoded by separate genes. - Protein phosphatase-2B (PP2B or calcineurin), a calcium-dependent enzyme whose activity is stimulated by calmodulin. It is composed of two subunits: the catalytic A-subunit and the calcium-binding B-subunit. The specificity of PP2B is restricted. In addition to the above-mentioned enzymes, some additional serine/threonine specific protein phosphatases have been characterized and are listed below. - Mammalian phosphatase-X (PP-X), and Drosophila phosphatase-V (PP-V) which are closely related but yet distinct from PP2A. - Yeast phosphatase PPH3, which is similar to PP2A, but with different enzymatic properties. - Drosophila phosphatase-Y (PP-Y), and yeast phosphatases Z1 and Z2 (genes PPZ1 and PPZ2) which are closely related but yet distinct from PP1. - Drosophila retinal degeneration protein C (gene rdgC), a calcium-binding phosphatase required to prevent light-induced retinal degeneration.
- Phages Lambda and Phi-80 ORF-221 which have been shown to have phosphatase activity and are related to mammalian PP's. The best conserved region in these proteins is a highly conserved pentapeptide that can be used as a signature pattern. -Consensus pattern: [LIVMN]-[KR]-G-N-H-E -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 2. -Last update: December 2001 / Pattern and text revised. [ 1] Cohen P. "The structure and regulation of protein phosphatases." Annu. Rev. Biochem. 58:453-508(1989). PubMed=2549856; DOI=10.1146/annurev.bi.58.070189.002321 [ 2] Cohen P., Cohen P.T.W. "Protein phosphatases come of age." J. Biol. Chem. 264:21435-21438(1989). PubMed=2557326 [ 3] Cohen P.T.W., Brewis N.D., Hughes V., Mann D.J. "Protein serine/threonine phosphatases; an expanding family." FEBS Lett. 268:355-359(1990). PubMed=2166691 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00116} {PS00126; PDEASE_I} {BEGIN} ******************************************************* * 3'5'-cyclic nucleotide phosphodiesterases signature * ******************************************************* 3'5'-cyclic nucleotide phosphodiesterases (EC 3.1.4.17) (PDEases) catalyze the hydrolysis of cAMP or cGMP to the corresponding nucleoside 5' monophosphates [1]. There are at least seven different subfamilies of PDEases [2,E1]:
- Type 1, calmodulin/calcium-dependent PDEases.
- Type 2, cGMP-stimulated PDEases.
- Type 3, cGMP-inhibited PDEases.
- Type 4, cAMP-specific PDEases.
- Type 5, cGMP-specific PDEases.
- Type 6, rhodopsin-sensitive cGMP-specific PDEases.
- Type 7, High affinity cAMP-specific PDEases.
All of these forms seem to share a conserved domain of about 270 residues. We have derived a signature pattern from a stretch of 12 residues that contains two conserved histidines. -Consensus pattern: H-D-[LIVMFY]-x-H-x-[AG]-x(2)-[NQ]-x-[LIVMFY] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: Slime mold extracellular PDEase and yeast low-affinity PDEase (gene PDE1) do not show any similarity with the above enzymes and belong to another class of PDEases (see <PDOC00530>). -Last update: July 1998 / Pattern and text revised. [ 1] Charbonneau H., Beier N., Walsh K.A., Beavo J.A. "Identification of a conserved domain among cyclic nucleotide phosphodiesterases from diverse species." Proc. Natl. Acad. Sci. U.S.A. 83:9308-9312(1986). PubMed=3025833 [ 2] Beavo J.A., Reifsnyder D.H. "Primary sequence of cyclic nucleotide phosphodiesterase isozymes and the design of selective inhibitors." Trends Pharmacol. Sci. 11:150-155(1990). PubMed=2159198 [E1] http://depts.washington.edu/pde/ +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+ {END} {PDOC00117} {PS00523; SULFATASE_1} {PS00149; SULFATASE_2} {BEGIN} ************************* * Sulfatases signatures * ************************* Sulfatases (EC 3.1.6.-) are enzymes that hydrolyze various sulfate esters. The sequences of different types of sulfatases are available. These enzymes are: - Arylsulfatase A (EC 3.1.6.8) (ASA), a lysosomal enzyme which hydrolyzes cerebroside sulfate. - Arylsulfatase B (EC 3.1.6.12) (ASB), a lysosomal enzyme which hydrolyzes the sulfate ester group from N-acetylgalactosamine 4-sulfate residues of dermatan sulfate. - Arylsulfatase D (ASD). - Arylsulfatase E (ASE). - Steryl-sulfatase (EC 3.1.6.2) (STS) (arylsulfatase C), a membrane bound microsomal enzyme which hydrolyzes 3-beta-hydroxy steroid sulfates. - Iduronate 2-sulfatase precursor (EC 3.1.6.13) (IDS), a lysosomal enzyme that hydrolyzes the 2-sulfate groups from non-reducing-terminal iduronic acid residues in dermatan sulfate and heparan sulfate. - N-acetylgalactosamine-6-sulfatase (EC 3.1.6.4), an enzyme that hydrolyzes the 6-sulfate groups of the N-acetyl-D-galactosamine 6-sulfate units of chondroitin sulfate and the D-galactose 6-sulfate units of keratan sulfate. - Choline sulfatase (EC 3.1.6.6) (gene betC), a bacterial enzyme that converts choline-O-sulfate to choline. - Glucosamine-6-sulfatase (EC 3.1.6.14) (G6S), a lysosomal enzyme that hydrolyzes the N-acetyl-D-glucosamine 6-sulfate units of heparan sulfate and keratan sulfate. - N-sulphoglucosamine sulphohydrolase (EC 3.10.1.1) (sulphamidase), the lysosomal enzyme that catalyzes the hydrolysis of N-sulfo-D-glucosamine into glucosamine and sulfate. - Sea urchin embryo arylsulfatase (EC 3.1.6.1). - Green alga arylsulfatase (EC 3.1.6.1), an enzyme which plays an important role in the mineralization of sulfates.
- Arylsulfatase (EC 3.1.6.1) from Escherichia coli (gene aslA), Klebsiella aerogenes (gene atsA) and Pseudomonas aeruginosa (gene atsA). - Escherichia coli hypothetical protein yidJ. It has been shown that all these sulfatases are structurally related [1,2,3]. As signature patterns for that family of enzymes we have selected the two best conserved regions. Both regions are located in the N-terminal section of these enzymes. The first region contains a conserved arginine which could be implicated in the catalytic mechanism; it is located four residues after a position that, in eukaryotic sulfatases, is a conserved cysteine which has been shown [4] to be modified to 2-amino-3-oxopropionic acid. In prokaryotes, this cysteine is replaced by a serine. -Consensus pattern: [SAPG]-[LIVMST]-[CS]-[STACG]-P-[STA]-R-x(2)-[LIVMFW](2)-[TAR]-G [R is a putative active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: G-[YV]-x-[ST]-x(2)-[IVAS]-G-K-x(0,1)-[FYWMK]-[HL] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2004 / Pattern and text revised. [ 1] Peters C., Schmidt B., Rommerskirch W., Rupp K., Zuhlsdorf M., Vingron M., Meyer H.E., Pohlmann R., von Figura K. "Phylogenetic conservation of arylsulfatases. cDNA cloning and expression of human arylsulfatase B." J. Biol. Chem. 265:3374-3381(1990). PubMed=2303452 [ 2] Wilson P.J., Morris C.P., Anson D.S., Occhiodoro T., Bielicki J., Clements P.R., Hopwood J.J. "Hunter syndrome: isolation of an iduronate-2-sulfatase cDNA clone and analysis of patient DNA." Proc. Natl. Acad. Sci. U.S.A. 87:8531-8535(1990). PubMed=2122463 [ 3] de Hostos E.L., Schilling J., Grossman A.R. Mol. Gen. Genet. 218:229-239(1989). [ 4] Selmer T., Hallmann A., Schmidt B., Sumper M., von Figura K.
"The evolutionary conservation of a novel protein modification, the conversion of cysteine to serinesemialdehyde in arylsulfatase from Volvox carteri." Eur. J. Biochem. 238:341-345(1996). PubMed=8681943 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00118} {PS00127; RNASE_PANCREATIC} {BEGIN} ******************************************** * Pancreatic ribonuclease family signature * ******************************************** Pancreatic ribonucleases (EC 3.1.27.5) are pyrimidine-specific endonucleases present in high quantity in the pancreas of a number of mammalian taxa and of a few reptiles [1,2]. As shown in the following schematic representation of the sequence of pancreatic RNases there are four conserved disulfide bonds and three amino acid residues involved in the catalytic activity. +---------------------------+ | +------------------|------+ | | | | xxxxx#xxxxxxCxxxxxxC#xxxxxxxCxxCxxxCxxxxxCxxxxxCxxxxxxCxxx#xxx | **** | | | | +---+ | +----------------------------+ 'C': conserved cysteine involved in a disulfide bond. '#': active site residue. '*': position of the pattern. A number of other proteins belong to the pancreatic RNase family and these are listed below. - Bovine seminal vesicle and bovine brain ribonucleases. - The kidney non-secretory ribonucleases (also known as eosinophil-derived neurotoxin (EDN) [3]). - Liver-type ribonucleases [4]. - Angiogenin, which induces vascularization of normal and malignant tissues.
It abolishes protein synthesis by specifically hydrolyzing cellular tRNAs. - Eosinophil cationic protein (ECP) [5], a cytotoxin and helminthotoxin with ribonuclease activity. - Frog liver ribonuclease and frog sialic acid-binding lectin [6]. The signature pattern we developed for these proteins includes five conserved residues: a cysteine involved in a disulfide bond, a lysine involved in the catalytic activity and three other residues important for substrate binding. -Consensus pattern: C-K-x(2)-N-T-F [C is involved in a disulfide bond] [K is an active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 4. -Last update: October 1993 / Text revised. [ 1] Beintema J.J., Schuller C., Irie M., Carsana A. "Molecular evolution of the ribonuclease superfamily." Prog. Biophys. Mol. Biol. 51:165-192(1988). PubMed=3074337 [ 2] Beintema J.J., van der Laan J.M. "Comparison of the structure of turtle pancreatic ribonuclease with those of mammalian ribonucleases." FEBS Lett. 194:338-342(1986). PubMed=3940901 [ 3] Rosenberg H.F., Tenen D.G., Ackerman S.J. "Molecular cloning of the human eosinophil-derived neurotoxin: a member of the ribonuclease gene family." Proc. Natl. Acad. Sci. U.S.A. 86:4460-4464(1989). PubMed=2734298 [ 4] Hofsteenge J., Matthies R., Stone S.R. "Primary structure of a ribonuclease from porcine liver, a new member of the ribonuclease superfamily." Biochemistry 28:9806-9813(1989). PubMed=2611266 [ 5] Rosenberg H.F., Ackerman S.J., Tenen D.G. "Human eosinophil cationic protein. Molecular cloning of a cytotoxin and helminthotoxin with ribonuclease activity." J. Exp. Med. 170:163-176(1989). PubMed=2473157 [ 6] Lewis M.T., Hunt L.T., Barker W.C. "Striking sequence similarity among sialic acid-binding lectin, pancreatic ribonucleases, and angiogenin: possible structural and functional relationships." Protein Seq. Data Anal. 2:101-105(1989). 
PubMed=2710786 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00119} {PS00128; LACTALBUMIN_LYSOZYME_1} {PS51348; LACTALBUMIN_LYSOZYME_2} {BEGIN} *************************************************************** * Alpha-lactalbumin / lysozyme C family signature and profile * *************************************************************** Alpha-lactalbumin [1], a milk protein, is the regulatory subunit of lactose synthetase. In the mammary gland, alpha-lactalbumin changes the substrate specificity of galactosyltransferase from N-acetylglucosamine to glucose. Lysozymes (EC 3.2.1.17) [2] act as bacteriolytic enzymes by hydrolyzing the beta(1->4) bonds between N-acetylglucosamine and N-acetylmuramic acid in the peptidoglycan of prokaryotic cell walls. There are at least five different classes of lysozymes [3,4]: C (chicken type), G (goose type), phage-type (T4), fungi (Chalaropsis), and bacterial (Bacillus subtilis) but there are few similarities in the sequences of the different types of lysozymes. Alpha-lactalbumin and lysozyme C are evolutionary related [5]. Around 35 to 40% of the residues are conserved in both proteins as well as the positions of the four disulfide bonds (see the schematic representation). The pattern for this family of proteins includes three cysteines involved in two of these disulfide bonds (the first cysteine is linked to the third one). 
                             +-------+
                             |     **|*******
xxCxxxxxxxxxxCxxxxxxxxxxxxxxxCxxxxxCxCxxxxxxCxxxxxxxxxCxxxCxx
  |          |                     +--------+         |   |
  |          +----------------------------------------+   |
  +-------------------------------------------------------+
'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.
We also developed a profile that covers the entire alpha-lactalbumin / lysozyme C. -Consensus pattern: C-x(3)-C-x(2)-[LMF]-x(3)-[DEN]-[LI]-x(5)-C [The 3 C's are involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: These proteins belong to family 22 in the classification of glycosyl hydrolases [6,E1]. -Last update: December 2007 / Text revised; profile added. [ 1] Hall L., Campbell P.N. "Alpha-lactalbumin and related proteins: a versatile gene family with an interesting parentage." Essays Biochem. 22:1-26(1986). PubMed=3104032 [ 2] Concise Encyclopedia Biochemistry, Second Edition, Walter de Gruyter, Berlin New-York (1988). [ 3] Weaver L.H., Grutter M.G., Remington S.J., Gray T.M., Isaacs N.W., Matthews B.W. J. Mol. Evol. 21:97-111(1985). [ 4] Kamei K., Hara S., Ikenaka T., Murao S. "Amino acid sequence of a lysozyme (B-enzyme) from Bacillus subtilis YT-25." J. Biochem. 104:832-836(1988). PubMed=3148618 [ 5] Nitta K., Sugai S. "The evolution of lysozyme and alpha-lactalbumin." Eur. J. Biochem. 182:111-118(1989). PubMed=2731545 [ 6] Henrissat B. "A classification of glycosyl hydrolases based on amino acid sequence similarities." Biochem. J. 280:309-316(1991). PubMed=1747104 [E1] http://www.expasy.org/cgi-bin/lists?glycosid.txt +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB).
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00120} {PS00129; GLYCOSYL_HYDROL_F31_1} {PS00707; GLYCOSYL_HYDROL_F31_2} {BEGIN} ******************************************** * Glycosyl hydrolases family 31 signatures * ******************************************** It has been shown [1,2,3,E1] that the following glycosyl hydrolases can be, on the basis of sequence similarities, classified into a single family: - Lysosomal alpha-glucosidase (EC 3.2.1.20) (acid maltase) is a vertebrate glycosidase active at low pH, which hydrolyzes alpha(1->4) and alpha(1->6) linkages in glycogen, maltose, and isomaltose. - Alpha-glucosidase (EC 3.2.1.20) from the yeast Candida tsukubaensis. - Alpha-glucosidase (EC 3.2.1.20) (gene malA) from the archaebacterium Sulfolobus solfataricus. - Intestinal sucrase-isomaltase (EC 3.2.1.48 / EC 3.2.1.10) is a vertebrate membrane-bound, multifunctional enzyme complex which hydrolyzes sucrose, maltose and isomaltose. The sucrase and isomaltase domains of the enzyme are homologous (41% amino acid identity) and have most probably evolved by duplication. - Glucoamylase 1 (EC 3.2.1.3) (glucan 1,4-alpha-glucosidase) from various fungal species. - Yeast hypothetical protein YBR229c. - Fission yeast hypothetical protein SpAC30D11.01c. An aspartic acid has been implicated [4] in the catalytic activity of sucrase, isomaltase, and lysosomal alpha-glucosidase. The region around this active residue is highly conserved and can be used as a signature pattern. We have used a second region, which contains two conserved cysteines, as an additional signature pattern.
-Consensus pattern: [GFY]-[LIVMF]-W-x-D-M-[NSA]-E [D is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL, except for SpAC30D11.01c. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: G-[AVP]-[DT]-[LIVMTAS]-[CG]-G-[FY]-x(3)-[STP]-x(3)-L-[CL]-x-R-W-x(2)-[LVMI]-[GSA]-[SA]-[FY]-x-P-[FY]-x-R-[DNA] -Sequences known to belong to this class detected by the pattern: ALL, except for YBR229c which lacks the two cysteines, rat sucrase-isomaltase and malA. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Henrissat B.; [email protected] -Last update: April 2006 / Pattern revised. [ 1] Henrissat B. "A classification of glycosyl hydrolases based on amino acid sequence similarities." Biochem. J. 280:309-316(1991). PubMed=1747104 [ 2] Kinsella B.T., Hogan S., Larkin A., Cantwell B.A. "Primary structure and processing of the Candida tsukubaensis alpha-glucosidase. Homology with the rabbit intestinal sucrase-isomaltase complex and human lysosomal alpha-glucosidase." Eur. J. Biochem. 202:657-664(1991). PubMed=1761061 [ 3] Naim H.Y., Niermann T., Kleinhans U., Hollenberg C.P., Strasser A.W.M. "Striking structural and functional similarities suggest that intestinal sucrase-isomaltase, human lysosomal alpha-glucosidase and Schwanniomyces occidentalis glucoamylase are derived from a common ancestral gene." FEBS Lett. 294:109-112(1991). PubMed=1743281 [ 4] Hermans M.M.P., Kroos M.A., van Beeumen J., Oostra B.A., Reuser A.J. "Human lysosomal alpha-glucosidase. Characterization of the catalytic site." J. Biol. Chem. 266:13507-13512(1991). PubMed=1856189 [E1] http://www.expasy.org/cgi-bin/lists?glycosid.txt +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified.
Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00121} {PS00130; U_DNA_GLYCOSYLASE} {BEGIN} ************************************ * Uracil-DNA glycosylase signature * ************************************ Uracil-DNA glycosylase (EC 3.2.2.-) (UNG) [1] is a DNA repair enzyme that excises uracil residues from DNA by cleaving the N-glycosylic bond. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. The sequence of uracil-DNA glycosylase is extremely well conserved [2] in bacteria and eukaryotes as well as in herpes viruses. More distantly related uracil-DNA glycosylases are also found in poxviruses [3]. In eukaryotic cells, UNG activity is found in both the nucleus and the mitochondria. Human UNG1 protein is transported to both the mitochondria and the nucleus [4]. The N-terminal 77 amino acids of UNG1 seem to be required for mitochondrial localization [4], but the presence of a mitochondrial transit peptide has not been directly demonstrated. As a signature for this type of enzyme, we selected the most N-terminal conserved region. This region contains an aspartic acid residue which has been proposed, based on X-ray structures [5,6], to act as a general base in the catalytic mechanism. -Consensus pattern: [KR]-[LIVA]-[LIVC]-[LIVM]-x-G-[QI]-D-P-Y [D is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: In humans, two additional sequences of UNG have been reported [7,8]. These isozymes are not evolutionarily related to other known UNG. One of them is a glyceraldehyde 3-phosphate dehydrogenase [8] and the other is related to cyclins [9].
Data available on three proteins proposed to be human uracil-DNA glycosylases is discussed in [10]. -Expert(s) to contact by email: Aasland R.; [email protected] -Last update: December 2004 / Pattern and text revised. [ 1] Sancar A., Sancar G.B. "DNA repair enzymes." Annu. Rev. Biochem. 57:29-67(1988). PubMed=3052275 [ 2] Olsen L.C., Aasland R., Wittwer C.U., Krokan H.E., Helland D.E. "Molecular cloning of human uracil-DNA glycosylase, a highly conserved DNA repair enzyme." EMBO J. 8:3121-3125(1989). PubMed=2555154 [ 3] Upton C., Stuart D.T., McFadden G. "Identification of a poxvirus gene encoding a uracil DNA glycosylase." Proc. Natl. Acad. Sci. U.S.A. 90:4518-4522(1993). PubMed=8389453 [ 4] Slupphaug G., Markussen F.-H., Olsen L.C., Aasland R., Aarsaether N., Bakke O., Krokan H.E., Helland D.E. "Nuclear and mitochondrial forms of human uracil-DNA glycosylase are encoded by the same gene." Nucleic Acids Res. 21:2579-2584(1993). PubMed=8332455 [ 5] Savva R., McAuley-Hecht K., Brown T., Pearl L. "The structural basis of specific base-excision repair by uracil-DNA glycosylase." Nature 373:487-493(1995). PubMed=7845459; DOI=10.1038/373487a0 [ 6] Mol C.D., Arvai A.S., Slupphaug G., Kavli B., Alseth I., Krokan H.E., Tainer J.A. "Crystal structure and mutational analysis of human uracil-DNA glycosylase: structural basis for specificity and catalysis." Cell 80:869-878(1995). PubMed=7697717 [ 7] Mueller S.J., Caradonna S. "Isolation and characterization of a human cDNA encoding uracil-DNA glycosylase." Biochim. Biophys. Acta 1088:197-207(1991). PubMed=2001396 [ 8] Meyer-Siegler K., Mauro D.J., Seal G., Wurzer J., Deriel J.K., Sirover M.A. Proc. Natl. Acad. Sci. U.S.A. 88:8460-8464(1991). [ 9] Mueller S.J., Caradonna S. "Cell cycle regulation of a human cyclin-like gene encoding uracil-DNA glycosylase." J. Biol. Chem. 268:1310-1319(1993). PubMed=8419333 [10] Barnes D.E., Lindahl T., Sedgwick B. Curr. Opin. Cell Biol. 5:424-433(1993).
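The consensus patterns in these entries are written in the PROSITE pattern notation: elements are separated by '-', 'x' stands for any residue, '[...]' lists the accepted residues, '{...}' lists the excluded residues, and '(n)' or '(n,m)' gives a repeat count. The following is only a minimal sketch of how such a pattern could be turned into a Python regular expression and applied to a sequence. The function names and the test fragment are hypothetical and are not part of PROSITE, and the '<' and '>' terminus anchors of the full notation are deliberately not handled here.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression.

    Illustrative helper only: supports x, [..], {..}, single residues and
    (n) / (n,m) repeats; terminus anchors are not supported.
    """
    pieces = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Split an element into its core and an optional (n) or (n,m) repeat.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"empty pattern element in {pattern!r}")
        core, low, high = match.groups()
        if core == "x":
            piece = "."                       # any residue
        elif core.startswith("[") and core.endswith("]"):
            piece = core                      # accepted residues
        elif core.startswith("{") and core.endswith("}"):
            piece = "[^" + core[1:-1] + "]"   # excluded residues
        elif len(core) == 1 and core.isupper():
            piece = core                      # one fixed residue
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
        if low is not None:
            piece += "{" + low + ("," + high if high else "") + "}"
        pieces.append(piece)
    return "".join(pieces)

def find_signature(pattern, sequence):
    """Return (start, end) spans of non-overlapping matches (0-based, end exclusive)."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.span() for m in regex.finditer(sequence)]

# Uracil-DNA glycosylase signature as quoted in this entry.
UNG_PATTERN = "[KR]-[LIVA]-[LIVC]-[LIVM]-x-G-[QI]-D-P-Y"
# Made-up fragment for illustration, not a database sequence.
fragment = "AAAKVVILGQDPYHGG"

print(prosite_to_regex(UNG_PATTERN))
print(find_signature(UNG_PATTERN, fragment))
```

Because finditer reports non-overlapping matches only, a scan that must report overlapping hits would need a different search loop; that choice is left open in this sketch.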
+-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00122} {PS00131; CARBOXYPEPT_SER_SER} {PS00560; CARBOXYPEPT_SER_HIS} {BEGIN} ****************************************** * Serine carboxypeptidases, active sites * ****************************************** All known carboxypeptidases are either metallo carboxypeptidases or serine carboxypeptidases (EC 3.4.16.5 and EC 3.4.16.6). The catalytic activity of the serine carboxypeptidases, like that of the trypsin family serine proteases, is provided by a charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, which is itself hydrogen-bonded to a serine [1]. Proteins known to be serine carboxypeptidases are: - Barley and wheat serine carboxypeptidases I, II, and III [2]. - Yeast carboxypeptidase Y (YSCY) (gene PRC1), a vacuolar protease involved in degrading small peptides. - Yeast KEX1 protease, involved in killer toxin and alpha-factor precursor processing. - Fission yeast sxa2, a probable carboxypeptidase involved in degrading or processing mating pheromones [3]. - Penicillium janthinellum carboxypeptidase S1 [4]. - Aspergillus niger carboxypeptidase pepF. - Aspergillus saitoi carboxypeptidase cpdS. - Vertebrate protective protein / cathepsin A [5], a lysosomal protein which is not only a carboxypeptidase but also essential for the activity of both beta-galactosidase and neuraminidase. - Mosquito vitellogenic carboxypeptidase (VCP) [6]. - Naegleria fowleri virulence-related protein Nf314 [7].
- Yeast hypothetical protein YBR139w. - Caenorhabditis elegans hypothetical proteins C08H9.1, F13D12.6, F32A5.3, F41C3.5 and K10B2.2. This family also includes: - Sorghum (S)-hydroxymandelonitrile lyase (EC 4.1.2.11) (hydroxynitrile lyase) (HNL) [8], an enzyme involved in plant cyanogenesis. The sequences surrounding the active site serine and histidine residues are highly conserved in all these serine carboxypeptidases. -Consensus pattern: [LIVM]-x-[GSTA]-E-S-Y-[AG]-[GS] [S is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL, except for HNL. -Other sequence(s) detected in Swiss-Prot: 3. -Consensus pattern: [LIVF]-x(2)-[LIVSTA]-x-[IVPST]-x-[GSDNQL]-[SAGV]-[SG]-H-x-[IVAQ]-P-x(3)-[PSA] [H is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: These proteins belong to family S10 in the classification of peptidases [9,E1]. -Last update: February 2003 / Patterns and text revised. [ 1] Liao D.I., Remington S.J. "Structure of wheat serine carboxypeptidase II at 3.5-A resolution. A new class of serine proteinase." J. Biol. Chem. 265:6528-6531(1990). PubMed=2324088 [ 2] Sorensen S.B., Svendsen I., Breddam K. "Primary structure of carboxypeptidase III from malted barley." Carlsberg Res. Commun. 54:193-202(1989). PubMed=2639682 [ 3] Imai Y., Yamamoto M. "Schizosaccharomyces pombe sxa1+ and sxa2+ encode putative proteases involved in the mating response." Mol. Cell. Biol. 12:1827-1834(1992). PubMed=1549128 [ 4] Svendsen I., Hofmann T., Endrizzi J., Remington S.J., Breddam K. "The primary structure of carboxypeptidase S1 from Penicillium janthinellum." FEBS Lett. 333:39-43(1993). PubMed=8224168 [ 5] Galjart N.J., Morreau H., Willemsen R., Gillemans N., Bonten E.J., d'Azzo A. "Human lysosomal protective protein has cathepsin A-like activity distinct from its protective function." J. Biol. Chem. 266:14754-14762(1991).
PubMed=1907282 [ 6] Cho W.L., Deitsch K.W., Raikhel A.S. "An extraovarian protein accumulated in mosquito oocytes is a carboxypeptidase activated in embryos." Proc. Natl. Acad. Sci. U.S.A. 88:10821-10824(1991). PubMed=1961751 [ 7] Hu W.N., Kopachik W., Band R.N. "Cloning and characterization of transcripts showing virulence-related gene expression in Naegleria fowleri." Infect. Immun. 60:2418-2424(1992). PubMed=1587609 [ 8] Wajant H., Mundry K.W., Pfizenmaier K. "Molecular cloning of hydroxynitrile lyase from Sorghum bicolor (L.). Homologies to serine carboxypeptidases." Plant Mol. Biol. 26:735-746(1994). PubMed=7948927 [ 9] Rawlings N.D., Barrett A.J. "Families of serine peptidases." Methods Enzymol. 244:19-61(1994). PubMed=7845208 [E1] http://www.expasy.org/cgi-bin/lists?peptidas.txt +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00123} {PS00132; CARBOXYPEPT_ZN_1} {PS00133; CARBOXYPEPT_ZN_2} {BEGIN} *********************************************************** * Zinc carboxypeptidases, zinc-binding regions signatures * *********************************************************** There are a number of different types of zinc-dependent carboxypeptidases (EC 3.4.17.-) [1,2]. All these enzymes seem to be structurally and functionally related. The enzymes that belong to this family are listed below. - Carboxypeptidase A1 (EC 3.4.17.1), a pancreatic digestive enzyme that can remove all C-terminal amino acids with the exception of Arg, Lys and Pro.
- Carboxypeptidase A2 (EC 3.4.17.15), a pancreatic digestive enzyme with a specificity similar to that of carboxypeptidase A1, but with a preference for bulkier C-terminal residues. - Carboxypeptidase B (EC 3.4.17.2), also a pancreatic digestive enzyme, but one that preferentially removes C-terminal Arg and Lys. - Carboxypeptidase N (EC 3.4.17.3) (also known as arginine carboxypeptidase), a plasma enzyme which protects the body from potent vasoactive and inflammatory peptides containing C-terminal Arg or Lys (such as kinins or anaphylatoxins) which are released into the circulation. - Carboxypeptidase H (EC 3.4.17.10) (also known as enkephalin convertase or carboxypeptidase E), an enzyme located in secretory granules of pancreatic islets, adrenal gland, pituitary and brain. This enzyme removes residual C-terminal Arg or Lys remaining after initial endoprotease cleavage during prohormone processing. - Carboxypeptidase M (EC 3.4.17.12), a membrane-bound Arg and Lys specific enzyme. It is ideally situated to act on peptide hormones at local tissue sites where it could control their activity before or after interaction with specific plasma membrane receptors. - Mast cell carboxypeptidase (EC 3.4.17.1), an enzyme with a specificity similar to that of carboxypeptidase A, but found in the secretory granules of mast cells. - Streptomyces griseus carboxypeptidase (Cpase SG) (EC 3.4.17.-) [3], which combines the specificities of mammalian carboxypeptidases A and B. - Thermoactinomyces vulgaris carboxypeptidase T (EC 3.4.17.18) (CPT) [4], which also combines the specificities of carboxypeptidases A and B. - AEBP1 [5], a transcriptional repressor active in preadipocytes. AEBP1 seems to regulate transcription by cleavage of other transcriptional proteins. - Yeast hypothetical protein YHR132c. All of these enzymes bind an atom of zinc.
Three conserved residues are implicated in the binding of the zinc atom: two histidines and a glutamic acid. We have derived two signature patterns which contain these three zinc ligands. -Consensus pattern: [PK]-x-[LIVMFY]-x-[LIVMFY]-x(2)-{E}-x-H-[STAG]-x-E-x-[LIVM]-[STAG]-{L}-x(5)-[LIVMFYTA] [H and E are zinc ligands] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: Bacillus sphaericus endopeptidase I which hydrolyses the gamma-D-Glu-(L)meso-diaminopimelic acid bond of spore cortex peptidoglycan [6] and which is possibly distantly related to zinc carboxypeptidases. -Consensus pattern: H-[STAG]-{ADNV}-{VGFI}-{YAR}-[LIVME]-{SDEP}-x-[LIVMFYW]-P-[FYW] [H is a zinc ligand] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 52. -Note: If a protein includes both signatures, the probability of it being a eukaryotic zinc carboxypeptidase is 100%. -Note: These proteins belong to families M14A/M14B in the classification of peptidases [7,E1]. -Last update: April 2006 / Pattern revised. [ 1] Tan F., Chan S.J., Steiner D.F., Schilling J.W., Skidgel R.A. "Molecular cloning and sequencing of the cDNA for human membrane-bound carboxypeptidase M. Comparison with carboxypeptidases A, B, H, and N." J. Biol. Chem. 264:13165-13170(1989). PubMed=2753907 [ 2] Reynolds D.S., Stevens R.L., Gurley D.S., Lane W.S., Austen K.F., Serafin W.E. "Isolation and molecular cloning of mast cell carboxypeptidase A. A novel member of the carboxypeptidase gene family." J. Biol. Chem. 264:20094-20099(1989). PubMed=2584208 [ 3] Narahashi Y. "The amino acid sequence of zinc-carboxypeptidase from Streptomyces griseus." J. Biochem. 107:879-886(1990). PubMed=2118139 [ 4] Teplyakov A., Polyakov K., Obmolova G., Strokopytov B., Kuranova I., Osterman A., Grishin N., Smulevitch S., Zagnitko O., Galperina O.
"Crystal structure of carboxypeptidase T from Thermoactinomyces vulgaris." Eur. J. Biochem. 208:281-288(1992). PubMed=1521526 [ 5] He G.-P., Muise A., Li A.W., Ro H.-S. "A eukaryotic transcriptional repressor with carboxypeptidase activity." Nature 378:92-96(1995). PubMed=7477299; DOI=10.1038/378092a0 [ 6] Hourdou M.-L., Guinand M., Vacheron M.J., Michel G., Denoroy L., Duez C.M., Englebert S., Joris B., Weber G., Ghuysen J.-M. "Characterization of the sporulation-related gamma-D-glutamyl-(L)meso-diaminopimelic-acid-hydrolysing peptidase I of Bacillus sphaericus NCTC 9602 as a member of the metallo(zinc) carboxypeptidase A family. Modular design of the protein." Biochem. J. 292:563-570(1993). PubMed=8503890 [ 7] Rawlings N.D., Barrett A.J. "Evolutionary families of metallopeptidases." Methods Enzymol. 248:183-228(1995). PubMed=7674922 [E1] http://www.expasy.org/cgi-bin/lists?peptidas.txt +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00124} {PS00134; TRYPSIN_HIS} {PS00135; TRYPSIN_SER} {PS50240; TRYPSIN_DOM} {BEGIN} ************************************************************ * Serine proteases, trypsin family, signatures and profile * ************************************************************ The catalytic activity of the serine proteases from the trypsin family is provided by a charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, which itself is hydrogen-bonded to a serine.
The sequences in the vicinity of the active site serine and histidine residues are well conserved in this family of proteases [1]. A partial list of proteases known to belong to the trypsin family is shown below. - Acrosin. - Blood coagulation factors VII, IX, X, XI and XII, thrombin, plasminogen, and protein C. - Cathepsin G. - Chymotrypsins. - Complement components C1r, C1s, C2, and complement factors B, D and I. - Complement-activating component of RA-reactive factor. - Cytotoxic cell proteases (granzymes A to H). - Duodenase I. - Elastases 1, 2, 3A, 3B (protease E), leukocyte (medullasin). - Enterokinase (EC 3.4.21.9) (enteropeptidase). - Hepatocyte growth factor activator. - Hepsin. - Glandular (tissue) kallikreins (including EGF-binding protein types A, B, and C, NGF-gamma chain, gamma-renin, prostate specific antigen (PSA) and tonin). - Plasma kallikrein. - Mast cell proteases (MCP) 1 (chymase) to 8. - Myeloblastin (proteinase 3) (Wegener's autoantigen). - Plasminogen activators (urokinase-type, and tissue-type). - Trypsins I, II, III, and IV. - Tryptases. - Snake venom proteases such as ancrod, batroxobin, cerastobin, flavoxobin, and protein C activator. - Collagenase from common cattle grub and collagenolytic protease from Atlantic sand fiddler crab. - Apolipoprotein(a). - Blood fluke cercarial protease. - Drosophila trypsin-like proteases: alpha, easter, snake-locus. - Drosophila protease stubble (gene sb). - Major mite fecal allergen Der p III. All the above proteins belong to family S1 in the classification of peptidases [2,E1] and originate from eukaryotic species. It should be noted that bacterial proteases that belong to family S2A are similar enough in the regions of the active site residues that they can be picked up by the same patterns. These proteases are listed below. - Achromobacter lyticus protease I. - Lysobacter alpha-lytic protease. - Streptogrisin A and B (Streptomyces proteases A and B). - Streptomyces griseus glutamyl endopeptidase II.
- Streptomyces fradiae proteases 1 and 2. We also developed a profile specific for the S1 family that spans the complete domain. In addition to proteases from the S1 family, this profile also detects proteins that have lost active site residues and which are therefore no longer catalytically active. Examples of such proteins are haptoglobin and protein Z. -Consensus pattern: [LIVM]-[ST]-A-[STAG]-H-C [H is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL, except for complement components C1r and C1s, pig plasminogen, bovine protein C, rodent urokinase, ancrod, gyroxin and two insect trypsins. -Other sequence(s) detected in Swiss-Prot: 18. -Consensus pattern: [DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]-[LIVMFYSTANQH] [S is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL, except for 18 different proteases which have lost the first conserved glycine. -Other sequence(s) detected in Swiss-Prot: 8. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: If a protein includes both the serine and the histidine active site signatures, the probability of it being a trypsin family serine protease is 100%. -Last update: May 2002 / Text revised. [ 1] Brenner S. "The molecular evolution of genes and proteins: a tale of two serines." Nature 334:528-530(1988). PubMed=3136396; DOI=10.1038/334528a0 [ 2] Rawlings N.D., Barrett A.J. "Families of serine peptidases." Methods Enzymol. 244:19-61(1994). PubMed=7845208 [E1] http://www.expasy.org/cgi-bin/lists?peptidas.txt +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement.
For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00125} {PS00136; SUBTILASE_ASP} {PS00137; SUBTILASE_HIS} {PS00138; SUBTILASE_SER} {BEGIN} **************************************************** * Serine proteases, subtilase family, active sites * **************************************************** Subtilases [1,2] are an extensive family of serine proteases whose catalytic activity is provided by a charge relay system similar to that of the trypsin family of serine proteases but which evolved by independent convergent evolution. The sequences around the residues involved in the catalytic triad (aspartic acid, serine and histidine) are completely different from those of the analogous residues in the trypsin serine proteases and can be used as signatures specific to that category of proteases. The subtilase family currently includes the following proteases: - Subtilisins (EC 3.4.21.62), alkaline proteases from various Bacillus species that have been the target of numerous studies in the past thirty years. - Alkaline elastase YaB from Bacillus sp. (gene ale). - Alkaline serine exoprotease A from Vibrio alginolyticus (gene proA). - Aqualysin I from Thermus aquaticus (gene pstI). - AspA from Aeromonas salmonicida. - Bacillopeptidase F (esterase) from Bacillus subtilis (gene bpf). - C5A peptidase from Streptococcus pyogenes (gene scpA). - Cell envelope-located proteases PI, PII, and PIII from Lactococcus lactis. - Extracellular serine protease from Serratia marcescens. - Extracellular protease from Xanthomonas campestris. - Intracellular serine protease (ISP) from various Bacillus. - Minor extracellular serine protease epr from Bacillus subtilis (gene epr). - Minor extracellular serine protease vpr from Bacillus subtilis (gene vpr).
- Nisin leader peptide processing protease nisP from Lactococcus lactis. - Serotype-specific antigen 1 from Pasteurella haemolytica (gene ssa1). - Thermitase (EC 3.4.21.66) from Thermoactinomyces vulgaris. - Calcium-dependent protease from Anabaena variabilis (gene prcA). - Halolysin from halophilic bacteria sp. 172p1 (gene hly). - Alkaline extracellular protease (AEP) from Yarrowia lipolytica (gene xpr2). - Alkaline proteinase from Cephalosporium acremonium (gene alp). - Cerevisin (EC 3.4.21.48) (vacuolar protease B) from yeast (gene PRB1). - Cuticle-degrading protease (pr1) from Metarhizium anisopliae. - KEX-1 protease from Kluyveromyces lactis. - Kexin (EC 3.4.21.61) from yeast (gene KEX-2). - Oryzin (EC 3.4.21.63) (alkaline proteinase) from Aspergillus (gene alp). - Proteinase K (EC 3.4.21.64) from Tritirachium album (gene proK). - Proteinase R from Tritirachium album (gene proR). - Proteinase T from Tritirachium album (gene proT). - Subtilisin-like protease III from yeast (gene YSP3). - Thermomycolin (EC 3.4.21.65) from Malbranchea sulfurea. - Furin (EC 3.4.21.75), neuroendocrine convertases 1 to 3 (NEC-1 to 3) and PACE4 protease from mammals, other vertebrates, and invertebrates. These proteases are involved in the processing of hormone precursors at sites comprised of pairs of basic amino acid residues [3]. - Tripeptidyl-peptidase II (EC 3.4.14.10) (tripeptidyl aminopeptidase) from human. - Prestalk-specific proteins tagB and tagC from slime mold [4]. Both proteins consist of two domains: an N-terminal subtilase catalytic domain and a C-terminal ABC transporter domain (see <PDOC00185>). -Consensus pattern: [STAIV]-{ERDL}-[LIVMF]-[LIVM]-D-[DSTA]-G-[LIVMFC]-x(2,3)-[DNH] [D is the active site residue] -Sequences known to belong to this class detected by the pattern: the majority of subtilases with a few exceptions. -Other sequence(s) detected in Swiss-Prot: 55.
-Consensus pattern: H-G-[STM]-x-[VIC]-[STAGC]-[GS]-x-[LIVMA]-[STAGCLV]-[SAGM] [H is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL, except for aspA and ssa1 which both seem to lack the histidine active site. -Other sequence(s) detected in Swiss-Prot: adenylate cyclase type VIII. -Consensus pattern: G-T-S-x-[SA]-x-P-x-{L}-[STAVC]-[AG] [S is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL, except for nisP, tagC and S.marcescens extracellular serine protease. -Other sequence(s) detected in Swiss-Prot: 7. -Note: If a protein includes at least two of the three active site signatures, the probability of it being a serine protease from the subtilase family is 100%. -Note: These proteins belong to family S8 in the classification of peptidases [5,E1]. -Expert(s) to contact by email: Brannigan J.; [email protected] Siezen R.J.; [email protected] -Last update: April 2006 / Pattern revised. [ 1] Siezen R.J., de Vos W.M., Leunissen J.A.M., Dijkstra B.W. "Homology modelling and protein engineering strategy of subtilases, the family of subtilisin-like serine proteinases." Protein Eng. 4:719-737(1991). PubMed=1798697 [ 2] Siezen R.J. (In) Proceeding subtilisin symposium, Hamburg, (1992). [ 3] Barr P.J. Cell 66:1-3(1991). [ 4] Shaulsky G., Kuspa A., Loomis W.F. "A multidrug resistance transporter/serine protease gene is required for prestalk specialization in Dictyostelium." Genes Dev. 9:1111-1122(1995). PubMed=7744252 [ 5] Rawlings N.D., Barrett A.J. "Families of serine peptidases." Methods Enzymol. 244:19-61(1994). PubMed=7845208 [E1] http://www.expasy.org/cgi-bin/lists?peptidas.txt +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified.
Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00126} {PS00139; THIOL_PROTEASE_CYS} {PS00639; THIOL_PROTEASE_HIS} {PS00640; THIOL_PROTEASE_ASN} {BEGIN} ****************************************************** * Eukaryotic thiol (cysteine) proteases active sites * ****************************************************** Eukaryotic thiol proteases (EC 3.4.22.-) [1] are a family of proteolytic enzymes which contain an active site cysteine. Catalysis proceeds through a thioester intermediate and is facilitated by a nearby histidine side chain; an asparagine completes the essential catalytic triad. The proteases which are currently known to belong to this family are listed below (references are only provided for recently determined sequences). - Vertebrate lysosomal cathepsins B (EC 3.4.22.1), H (EC 3.4.22.16), L (EC 3.4.22.15), and S (EC 3.4.22.27) [2]. - Vertebrate lysosomal dipeptidyl peptidase I (EC 3.4.14.1) (also known as cathepsin C) [2]. - Vertebrate calpains (EC 3.4.22.52) (EC 3.4.22.53). Calpains are intracellular calcium-activated thiol proteases that contain both an N-terminal catalytic domain and a C-terminal calcium-binding domain. - Mammalian cathepsin K, which seems involved in osteoclastic bone resorption [3]. - Human cathepsin O [4]. - Bleomycin hydrolase. An enzyme that catalyzes the inactivation of the antitumor drug BLM (a glycopeptide).
 - Plant enzymes: barley aleurain (EC 3.4.22.16), EP-B1/B4; kidney bean EP-C1, rice bean SH-EP; kiwi fruit actinidin (EC 3.4.22.14); papaya latex papain (EC 3.4.22.2), chymopapain (EC 3.4.22.6), caricain (EC 3.4.22.30), and proteinase IV (EC 3.4.22.25); pea turgor-responsive protein 15A; pineapple stem bromelain (EC 3.4.22.32); rape COT44; rice oryzain alpha, beta, and gamma; tomato low-temperature induced, Arabidopsis thaliana A494, RD19A and RD21A.
 - House-dust mite allergens DerP1 and EurM1.
 - Cathepsin B-like proteinases from the worms Caenorhabditis elegans (genes gcp-1, cpr-3, cpr-4, cpr-5 and cpr-6), Schistosoma mansoni (antigen SM31) and japonicum (antigen SJ31), Haemonchus contortus (genes AC-1 and AC-2), and Ostertagia ostertagi (CP-1 and CP-3).
 - Slime mold cysteine proteinases CP1 and CP2.
 - Cruzipain from Trypanosoma cruzi and brucei.
 - Trophozoite cysteine proteinase (TCP) from various Plasmodium species.
 - Proteases from Leishmania mexicana, Theileria annulata and Theileria parva.
 - Baculoviruses cathepsin-like enzyme (v-cath).
 - Drosophila small optic lobes protein (gene sol), a neuronal protein that contains a calpain-like domain.
 - Yeast thiol protease BLH1/YCP1/LAP3.
 - Caenorhabditis elegans hypothetical protein C06G4.2, a calpain-like protein.
Two bacterial peptidases are also part of this family:
 - Aminopeptidase C from Lactococcus lactis (gene pepC) [5].
 - Thiol protease tpr from Porphyromonas gingivalis.
Three other proteins are structurally related to this family, but may have lost their proteolytic activity.
 - Soybean oil body protein P34. This protein has its active site cysteine replaced by a glycine.
 - Rat testin, a Sertoli cell secretory protein highly similar to cathepsin L but with the active site cysteine replaced by a serine. Rat testin should not be confused with mouse testin which is a LIM-domain protein (see <PDOC00382>).
 - Plasmodium falciparum serine-repeat protein (SERA), the major blood stage antigen.
   This protein of 111 Kd possesses a C-terminal thiol-protease-like domain [6], but the active site cysteine is replaced by a serine.
The sequences around the three active site residues are well conserved and can be used as signature patterns.
-Consensus pattern: Q-{V}-x-{DE}-[GE]-{F}-C-[YW]-{DN}-x-[STAGC]-[STAGCV]
                    [C is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for P34, testins, SERA antigen, and Theileria annulata protease.
-Other sequence(s) detected in Swiss-Prot: 6.
-Note: The residue in position 4 of the pattern is almost always cysteine; the only exceptions are calpains (Leu), bleomycin hydrolase (Ser) and yeast YCP1 (Ser).
-Note: The residue in position 5 of the pattern is always Gly except in papaya protease IV where it is Glu.
-Consensus pattern: [LIVMGSTAN]-{IEVK}-H-[GSACE]-[LIVM]-{GPSI}-[LIVMAT](2)-G-{SLAG}-[GSADNH]
                    [H is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for calpains, P34 and tpr.
-Other sequence(s) detected in Swiss-Prot: 146.
-Consensus pattern: [FYCH]-[WI]-[LIVT]-x-[KRQAG]-N-[ST]-W-x(3)-[FYW]-G-x(2)-G-[LFYW]-[LIVMFYG]-x-[LIVMF]
                    [N is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for calpains, bromelain, yeast BLH1, tomato low-temperature induced protease, cathepsin O, pepC and tpr.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: These proteins belong to family C1 (papain-type) and C2 (calpains) in the classification of peptidases [7,E1].
-Expert(s) to contact by email:
           Turk B.; [email protected]
-Last update: April 2006 / Patterns revised.
[ 1] Dufour E. "Sequence homologies, hydrophobic profiles and secondary structures of cathepsins B, H and L: comparison with papain and actinidin." Biochimie 70:1335-1342(1988). PubMed=3148320
[ 2] Kirschke H., Barrett A.J., Rawlings N.D. Protein Prof. 2:1587-1643(1995).
[ 3] Shi G.-P., Chapman H.A., Bhairi S.M., DeLeeuw C., Reddy V.Y., Weiss S.J. "Molecular cloning of human cathepsin O, a novel endoproteinase and homologue of rabbit OC2." FEBS Lett. 357:129-134(1995). PubMed=7805878
[ 4] Velasco G., Ferrando A.A., Puente X.S., Sanchez L.M., Lopez-Otin C. "Human cathepsin O. Molecular cloning from a breast carcinoma, production of the active enzyme in Escherichia coli, and expression analysis in human tissues." J. Biol. Chem. 269:27136-27142(1994). PubMed=7929457
[ 5] Chapot-Chartier M.P., Nardi M., Chopin M.C., Chopin A., Gripon J.C. Appl. Environ. Microbiol. 59:330-333(1993).
[ 6] Higgins D.G., McConnell D.J., Sharp P.M. "Malarial proteinase?" Nature 340:604-604(1989). PubMed=2671749; DOI=10.1038/340604a0
[ 7] Rawlings N.D., Barrett A.J. "Families of cysteine peptidases." Methods Enzymol. 244:461-486(1994). PubMed=7845226
[E1] http://www.expasy.org/cgi-bin/lists?peptidas.txt
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00127}
{PS00140; UCH_1}
{BEGIN}
************************************************************************
* Ubiquitin carboxyl-terminal hydrolases family 1 cysteine active site *
************************************************************************
Ubiquitin carboxyl-terminal hydrolases (EC 3.4.19.12) (UCH) (deubiquitinating enzymes) [1,2] are thiol proteases that recognize and hydrolyze the peptide bond at the C-terminal glycine of ubiquitin.
These enzymes are involved in the processing of poly-ubiquitin precursors as well as that of ubiquitinated proteins. There are two distinct families of UCH. The first class consists of enzymes of about 25 Kd and is currently represented by:
 - Mammalian isozymes L1, L3 and L5.
 - Yeast YUH1.
 - Drosophila Uch.
One of the active site residues of class-I UCH [3] is a cysteine. We derived a signature pattern from the region around that residue.
-Consensus pattern: Q-x(3)-N-[SA]-C-G-x(3)-[LIVM](2)-H-[SA]-[LIVM]-[SA]
                    [C is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: These proteins belong to family C12 in the classification of peptidases [4,E1].
-Last update: December 2001 / Text revised.
[ 1] Jentsch S., Seufert W., Hauser H.-P. "Genetic analysis of the ubiquitin system." Biochim. Biophys. Acta 1089:127-139(1991). PubMed=1647207
[ 2] D'andrea A., Pellman D. Crit. Rev. Biochem. Mol. Biol. 33:337-352(1998).
[ 3] Johnston S.C., Larsen C.N., Cook W.J., Wilkinson K.D., Hill C.P. "Crystal structure of a deubiquitinating enzyme (human UCH-L3) at 1.8 A resolution." EMBO J. 16:3787-3796(1997). PubMed=9233788; DOI=10.1093/emboj/16.13.3787
[ 4] Rawlings N.D., Barrett A.J. "Families of cysteine peptidases." Methods Enzymol. 244:461-486(1994). PubMed=7845226
[E1] http://www.expasy.org/cgi-bin/lists?peptidas.txt
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00128}
{PS00141; ASP_PROTEASE}
{PS50175; ASP_PROT_RETROV}
{BEGIN}
*****************************************************************
* Eukaryotic and viral aspartyl proteases signature and profile *
*****************************************************************
Aspartyl proteases, also known as acid proteases, (EC 3.4.23.-) are a widely distributed family of proteolytic enzymes [1,2,3] known to exist in vertebrates, fungi, plants, retroviruses and some plant viruses. Aspartate proteases of eukaryotes are monomeric enzymes which consist of two domains. Each domain contains an active site centered on a catalytic aspartyl residue. The two domains most probably evolved from the duplication of an ancestral gene encoding a primordial domain. Currently known eukaryotic aspartyl proteases are:
 - Vertebrate gastric pepsins A and C (also known as gastricsin).
 - Vertebrate chymosin (rennin), involved in digestion and used for making cheese.
 - Vertebrate lysosomal cathepsins D (EC 3.4.23.5) and E (EC 3.4.23.34).
 - Mammalian renin (EC 3.4.23.15) whose function is to generate angiotensin I from angiotensinogen in the plasma.
 - Fungal proteases such as aspergillopepsin A (EC 3.4.23.18), candidapepsin (EC 3.4.23.24), mucoropepsin (EC 3.4.23.23) (mucor rennin), endothiapepsin (EC 3.4.23.22), polyporopepsin (EC 3.4.23.29), and rhizopuspepsin (EC 3.4.23.21).
 - Yeast saccharopepsin (EC 3.4.23.25) (proteinase A) (gene PEP4). PEP4 is implicated in posttranslational regulation of vacuolar hydrolases.
 - Yeast barrierpepsin (EC 3.4.23.35) (gene BAR1); a protease that cleaves alpha-factor and thus acts as an antagonist of the mating pheromone.
 - Fission yeast sxa1 which is involved in degrading or processing the mating pheromones.
Most retroviruses and some plant viruses, such as badnaviruses, encode an aspartyl protease which is a homodimer of a chain of about 95 to 125 amino acids.
In most retroviruses, the protease is encoded as a segment of a polyprotein which is cleaved during the maturation process of the virus. It is generally part of the pol polyprotein and, more rarely, of the gag polyprotein. Conservation of the sequence around the two aspartates of eukaryotic aspartyl proteases and around the single active site of the viral proteases allows us to develop a single signature pattern for both groups of protease. A profile was developed to specifically detect viral aspartyl proteases, which are missed by the pattern.
-Consensus pattern: [LIVMFGAC]-[LIVMTADN]-[LIVFSA]-D-[ST]-G-[STAV]-[STAPDENQ]-{GQ}-[LIVMFSTNC]-{EGK}-[LIVMFGTA]
                    [D is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 37.
-Sequences known to belong to this class detected by the profile: ALL viral-type proteases.
-Other sequence(s) detected in Swiss-Prot: 3.
-Note: These proteins belong to families A1 and A2 in the classification of peptidases [4,E1].
-Last update: December 2004 / Pattern and text revised.
[ 1] Foltmann B. "Gastric proteinases--structure, function, evolution and mechanism of action." Essays Biochem. 17:52-84(1981). PubMed=6795036
[ 2] Davies D.R. "The structure and function of the aspartic proteinases." Annu. Rev. Biophys. Biophys. Chem. 19:189-215(1990). PubMed=2194475
[ 3] Rao J.K.M., Erickson J.W., Wlodawer A. "Structural and evolutionary relationships between retroviral and eucaryotic aspartic proteinases." Biochemistry 30:4663-4671(1991). PubMed=1851433
[ 4] Rawlings N.D., Barrett A.J. "Families of aspartic peptidases, and those of unknown catalytic mechanism." Methods Enzymol. 248:105-120(1995). PubMed=7674916
[E1] http://www.expasy.org/cgi-bin/lists?peptidas.txt
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB).
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00129}
{PS00142; ZINC_PROTEASE}
{BEGIN}
*****************************************************************
* Neutral zinc metallopeptidases, zinc-binding region signature *
*****************************************************************
The majority of zinc-dependent metallopeptidases (with the notable exception of the carboxypeptidases) share a common pattern of primary structure [1,2,3] in the part of their sequence involved in the binding of zinc, and can be grouped together as a superfamily, known as the metzincins, on the basis of this sequence similarity. They can be classified into a number of distinct families [4,E1] which are listed below along with the proteases which are currently known to belong to these families.
Family M1
 - Bacterial aminopeptidase N (EC 3.4.11.2) (gene pepN).
 - Mammalian aminopeptidase N (EC 3.4.11.2).
 - Mammalian glutamyl aminopeptidase (EC 3.4.11.7) (aminopeptidase A). It may play a role in regulating growth and differentiation of early B-lineage cells.
 - Yeast aminopeptidase yscII (gene APE2).
 - Yeast alanine/arginine aminopeptidase (gene AAP1).
 - Yeast hypothetical protein YIL137c.
 - Leukotriene A-4 hydrolase (EC 3.3.2.6). This enzyme is responsible for the hydrolysis of an epoxide moiety of LTA-4 to form LTB-4; it has been shown that it binds zinc and is capable of peptidase activity.
Family M2
 - Angiotensin-converting enzyme (EC 3.4.15.1) (dipeptidyl carboxypeptidase I) (ACE) the enzyme responsible for hydrolyzing angiotensin I to angiotensin II.
   There are two forms of ACE: a testis-specific isozyme and a somatic isozyme which has two active centers.
Family M3
 - Thimet oligopeptidase (EC 3.4.24.15), a mammalian enzyme involved in the cytoplasmic degradation of small peptides.
 - Neurolysin (EC 3.4.24.16) (also known as mitochondrial oligopeptidase M or microsomal endopeptidase).
 - Mitochondrial intermediate peptidase precursor (EC 3.4.24.59) (MIP). It is involved in the second stage of processing of some proteins imported in the mitochondrion.
 - Yeast saccharolysin (EC 3.4.24.37) (proteinase yscD).
 - Escherichia coli and related bacteria dipeptidyl carboxypeptidase (EC 3.4.15.5) (gene dcp).
 - Escherichia coli and related bacteria oligopeptidase A (EC 3.4.24.70) (gene opdA or prlC).
 - Yeast hypothetical protein YKL134c.
Family M4
 - Thermostable thermolysins (EC 3.4.24.27), and related thermolabile neutral proteases (bacillolysins) (EC 3.4.24.28) from various species of Bacillus.
 - Pseudolysin (EC 3.4.24.26) from Pseudomonas aeruginosa (gene lasB).
 - Extracellular elastase from Staphylococcus epidermidis.
 - Extracellular protease prt1 from Erwinia carotovora.
 - Extracellular minor protease smp from Serratia marcescens.
 - Vibriolysin (EC 3.4.24.25) from various species of Vibrio.
 - Protease prtA from Listeria monocytogenes.
 - Extracellular proteinase proA from Legionella pneumophila.
Family M5
 - Mycolysin (EC 3.4.24.31) from Streptomyces cacaoi.
Family M6
 - Immune inhibitor A from Bacillus thuringiensis (gene ina). Ina degrades two classes of insect antibacterial proteins, attacins and cecropins.
Family M7
 - Streptomyces extracellular small neutral proteases.
Family M8
 - Leishmanolysin (EC 3.4.24.36) (surface glycoprotein gp63), a cell surface protease from various species of Leishmania.
Family M9
 - Microbial collagenase (EC 3.4.24.3) from Clostridium perfringens and Vibrio alginolyticus.
Family M10A
 - Serralysin (EC 3.4.24.40), an extracellular metalloprotease from Serratia.
 - Alkaline metalloproteinase from Pseudomonas aeruginosa (gene aprA).
 - Secreted proteases A, B, C and G from Erwinia chrysanthemi.
 - Yeast hypothetical protein YIL108w.
Family M10B
 - Mammalian extracellular matrix metalloproteinases (known as matrixins) [5]: MMP-1 (EC 3.4.24.7) (interstitial collagenase), MMP-2 (EC 3.4.24.24) (72 Kd gelatinase), MMP-9 (EC 3.4.24.35) (92 Kd gelatinase), MMP-7 (EC 3.4.24.23) (matrilysin), MMP-8 (EC 3.4.24.34) (neutrophil collagenase), MMP-3 (EC 3.4.24.17) (stromelysin-1), MMP-10 (EC 3.4.24.22) (stromelysin-2), MMP-11 (stromelysin-3), and MMP-12 (EC 3.4.24.65) (macrophage metalloelastase).
 - Sea urchin hatching enzyme (envelysin) (EC 3.4.24.12). A protease that allows the embryo to digest the protective envelope derived from the egg extracellular matrix.
 - Soybean metalloendoproteinase 1.
Family M11
 - Chlamydomonas reinhardtii gamete lytic enzyme (GLE).
Family M12A
 - Astacin (EC 3.4.24.21), a crayfish endoprotease.
 - Meprin A (EC 3.4.24.18), a mammalian kidney and intestinal brush border metalloendopeptidase.
 - Bone morphogenic protein 1 (BMP-1), a protein which induces cartilage and bone formation and which expresses metalloendopeptidase activity. The Drosophila homolog of BMP-1 is the dorsal-ventral patterning protein tolloid.
 - Blastula protease 10 (BP10) from Paracentrotus lividus and the related protein SpAN from Strongylocentrotus purpuratus.
 - Caenorhabditis elegans protein toh-2.
 - Caenorhabditis elegans hypothetical protein F42A10.8.
 - Choriolysins L and H (EC 3.4.24.67) (also known as embryonic hatching proteins LCE and HCE) from the fish Oryzias latipes. These proteases participate in the breakdown of the egg envelope, which is derived from the egg extracellular matrix, at the time of hatching.
Family M12B
 - Snake venom metalloproteinases [6]. This subfamily mostly groups proteases that act in hemorrhage.
   Examples are: adamalysin II (EC 3.4.24.46), atrolysin C/D (EC 3.4.24.42), atrolysin E (EC 3.4.24.44), fibrolase (EC 3.4.24.72), trimerelysin I (EC 3.4.24.52) and II (EC 3.4.24.53).
 - Mouse cell surface antigen MS2.
Family M13
 - Mammalian neprilysin (EC 3.4.24.11) (neutral endopeptidase) (NEP).
 - Endothelin-converting enzyme 1 (EC 3.4.24.71) (ECE-1), which processes the precursor of endothelin to release the active peptide.
 - Kell blood group glycoprotein, a major antigenic protein of erythrocytes. The Kell protein is very probably a zinc endopeptidase.
 - Peptidase O from Lactococcus lactis (gene pepO).
Family M27
 - Clostridial neurotoxins, including tetanus toxin (TeTx) and the various botulinum toxins (BoNT). These toxins are zinc proteases that block neurotransmitter release by proteolytic cleavage of synaptic proteins such as synaptobrevins, syntaxin and SNAP-25 [7,8].
Family M30
 - Staphylococcus hyicus neutral metalloprotease.
Family M32
 - Thermostable carboxypeptidase 1 (EC 3.4.17.19) (carboxypeptidase Taq), an enzyme from Thermus aquaticus which is most active at high temperature.
Family M34
 - Lethal factor (LF) from Bacillus anthracis, one of the three proteins composing the anthrax toxin.
Family M35
 - Deuterolysin (EC 3.4.24.39) from Penicillium citrinum and related proteases from various species of Aspergillus.
Family M36
 - Extracellular elastinolytic metalloproteinases from Aspergillus.
From the tertiary structure of thermolysin, the position of the residues acting as zinc ligands and those involved in the catalytic activity are known. Two of the zinc ligands are histidines which are very close together in the sequence; C-terminal to the first histidine is a glutamic acid residue which acts as a nucleophile and promotes the attack of a water molecule on the carbonyl carbon of the substrate. A signature pattern which includes the two histidines and the glutamic acid residue is sufficient to detect this superfamily of proteins.
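The arrangement described above (a zinc-ligand histidine, the catalytic glutamate immediately C-terminal to it, and the second zinc-ligand histidine three residues further on) can be located in a protein sequence with a few lines of code. The following Python sketch is an illustration only and is not part of PROSITE: it searches for the bare H-E-x-x-H core rather than the full signature pattern of this entry, so it is expected to report more hits than the pattern does. The function name find_hexxh and the test fragment are hypothetical.

```python
import re

# Simplified core of the zinc-binding region: His, Glu, any two residues, His.
# A lookahead is used so that overlapping occurrences are all reported.
HEXXH_CORE = re.compile(r"(?=(HE..H))")


def find_hexxh(sequence):
    """Return 1-based positions of the two His and the Glu for every H-E-x-x-H hit."""
    hits = []
    for match in HEXXH_CORE.finditer(sequence.upper()):
        start = match.start(1)  # 0-based index of the first histidine
        hits.append({
            "zinc_his_1": start + 1,     # first zinc ligand
            "catalytic_glu": start + 2,  # residue right after the first His
            "zinc_his_2": start + 5,     # second zinc ligand
        })
    return hits


if __name__ == "__main__":
    # Made-up fragment used only to exercise the function.
    for hit in find_hexxh("AAVVAHELTHAVG"):
        print(hit)
```

A hit from this simplified search is only a candidate; the consensus pattern of this entry places additional constraints on the residues around the motif.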
-Consensus pattern: [GSTALIVN]-{PCHR}-{KND}-H-E-[LIVMFYW]-{DEHRKP}-H-{EKPC}-[LIVMFYWGSPQ]
                    [The 2 H's are zinc ligands]
                    [E is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for members of families M5, M7 and M11.
-Other sequence(s) detected in Swiss-Prot: 77; including Neurospora crassa conidiation-specific protein 13 which could be a zinc-protease.
-Last update: April 2006 / Pattern revised.
[ 1] Jongeneel C.V., Bouvier J., Bairoch A. "A unique signature identifies a family of zinc-dependent metallopeptidases." FEBS Lett. 242:211-214(1989). PubMed=2914602
[ 2] Murphy G.J.P., Murphy G., Reynolds J.J. "The origin of matrix metalloproteinases and their familial relationships." FEBS Lett. 289:4-7(1991). PubMed=1894005
[ 3] Bode W., Grams F., Reinemer P., Gomis-Rueth F.-X., Baumann U., McKay D.B., Stoecker W. Zoology 99:237-246(1996).
[ 4] Rawlings N.D., Barrett A.J. "Evolutionary families of metallopeptidases." Methods Enzymol. 248:183-228(1995). PubMed=7674922
[ 5] Woessner J.F. Jr. "Matrix metalloproteinases and their inhibitors in connective tissue remodeling." FASEB J. 5:2145-2154(1991). PubMed=1850705
[ 6] Hite L.A., Fox J.W., Bjarnason J.B. "A new family of proteinases is defined by several snake venom metalloproteinases." Biol. Chem. Hoppe-Seyler 373:381-385(1992). PubMed=1515064
[ 7] Montecucco C., Schiavo G. "Tetanus and botulism neurotoxins: a new group of zinc proteases." Trends Biochem. Sci. 18:324-327(1993). PubMed=7901925
[ 8] Niemann H., Blasi J., Jahn R. "Clostridial neurotoxins: new tools for dissecting exocytosis." Trends Cell Biol. 4:179-185(1994). PubMed=14731646
[E1] http://www.expasy.org/cgi-bin/lists?peptidas.txt
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified.
Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00130}
{PS00143; INSULINASE}
{BEGIN}
****************************************************
* Insulinase family, zinc-binding region signature *
****************************************************
A number of proteases dependent on divalent cations for their activity have been shown [1,2] to belong to one family, on the basis of sequence similarity. These enzymes are listed below.
 - Insulinase (EC 3.4.24.56) (also known as insulysin or insulin-degrading enzyme or IDE), a cytoplasmic enzyme which seems to be involved in the cellular processing of insulin, glucagon and other small polypeptides.
 - Escherichia coli protease III (EC 3.4.24.55) (pitrilysin) (gene ptr), a periplasmic enzyme that degrades small peptides.
 - Mitochondrial processing peptidase (EC 3.4.24.64) (MPP). This enzyme removes the transit peptide from the precursor form of proteins imported from the cytoplasm across the mitochondrial inner membrane. It is composed of two nonidentical homologous subunits termed alpha and beta. The beta subunit seems to be catalytically active while the alpha subunit has probably lost its activity.
 - Nardilysin (EC 3.4.24.61) (N-arginine dibasic convertase or NRD convertase). This mammalian enzyme cleaves peptide substrates on the N-terminus of Arg residues in dibasic stretches.
 - Klebsiella pneumoniae protein pqqF. This protein is required for the biosynthesis of the coenzyme pyrrolo-quinoline-quinone (PQQ). It is thought to be a protease that cleaves peptide bonds in a small peptide (gene pqqA) thus providing the glutamate and tyrosine residues necessary for the synthesis of PQQ.
 - Yeast protein AXL1, which is involved in axial budding [3].
 - Eimeria bovis sporozoite developmental protein.
 - Escherichia coli hypothetical protein yddC and HI1368, the corresponding Haemophilus influenzae protein.
 - Bacillus subtilis hypothetical protein ymxG.
 - Caenorhabditis elegans hypothetical proteins C28F5.4 and F56D2.1.
It should be noted that in addition to the above enzymes, this family also includes the core proteins I and II of the mitochondrial bc1 complex (also called cytochrome c reductase or complex III), but the situation as to the activity or lack of activity of these subunits is quite complex:
 - In mammals and yeast, core proteins I and II lack enzymatic activity.
 - In Neurospora crassa and in potato core protein I is equivalent to the beta subunit of MPP.
 - In Euglena gracilis, core protein I seems to be active, while subunit II is inactive.
These proteins do not share many regions of sequence similarity; the most noticeable is in the N-terminal section. This region includes a conserved histidine followed, two residues later, by a glutamate and another histidine. In pitrilysin, it has been shown [4] that this H-x-x-E-H motif is involved in enzyme activity; the two histidines bind zinc and the glutamate is necessary for catalytic activity. Non-active members of this family have lost from one to three of these active site residues. We developed a signature pattern that detects active members of this family as well as some inactive members.
-Consensus pattern: G-x(8,9)-G-x-[STA]-H-[LIVMFY]-[LIVMC]-[DERN]-[HRKL]-[LMFAT]-x-[LFSTH]-x-[GSTAN]-[GST]
                    [The 2 H's are zinc ligands]
                    [E is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL active members as well as all MPP alpha subunits and core II subunits. Does not detect inactive core I subunits.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: These proteins belong to family M16 in the classification of peptidases [5,E1].
-Last update: May 2004 / Text revised.
[ 1] Rawlings N.D., Barrett A.J. "Homologues of insulinase, a new superfamily of metalloendopeptidases."
     Biochem. J. 275:389-391(1991). PubMed=2025223
[ 2] Braun H.-P., Schmitz U.K. "Are the 'core' proteins of the mitochondrial bc1 complex evolutionary relics of a processing protease?" Trends Biochem. Sci. 20:171-175(1995). PubMed=7610476
[ 3] Becker A.B., Roth R.A. "An unusual active site identified in a family of zinc metalloendopeptidases." Proc. Natl. Acad. Sci. U.S.A. 89:3835-3839(1992). PubMed=1570301
[ 4] Fujita A., Oka C., Arikawa Y., Katagai T., Tonouchi A., Kuhara S., Misumi Y. "A yeast gene necessary for bud-site selection encodes a protein similar to insulin-degrading enzymes." Nature 372:567-570(1994). PubMed=7990931; DOI=10.1038/372567a0
[ 5] Rawlings N.D., Barrett A.J. "Evolutionary families of metallopeptidases." Methods Enzymol. 248:183-228(1995). PubMed=7674922
[E1] http://www.expasy.org/cgi-bin/lists?peptidas.txt
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00131}
{PS00321; RECA_1}
{PS50162; RECA_2}
{PS50163; RECA_3}
{BEGIN}
**************************************
* recA family signature and profiles *
**************************************
The bacterial recA protein [1,2,3,E1] is essential for homologous recombination and recombinational repair of DNA damage. RecA has many activities: it filaments, it binds to single- and double-stranded DNA, it binds and hydrolyzes ATP, it is also a recombinase and, finally, it interacts with lexA causing its activation and leading to its autocatalytic cleavage.
RecA is a protein of about 350 amino-acid residues. Its sequence is very well conserved [3,4,5,E1] among eubacterial species. It is also found in the chloroplast of plants [6]. The recA protein is closely related to:
 - Eukaryotic RAD51 protein. Promotes homologous pairing and strand exchange on chromatin.
 - Eukaryotic DMC1 protein. Participates in meiotic recombination.
 - Prokaryotic radA protein. Involved in DNA repair and in homologous recombination.
 - Bacteriophage uvsX gene product. Important in genetic recombination, DNA repair, and replication.
As a signature pattern specific for the bacterial and chloroplastic recA protein, we selected the best conserved region, a nonapeptide located in the middle of the sequence and which is part of the monomer-monomer interface in a recA filament. We also developed two profiles. The first one covers the ATP binding domain in the N-terminal part of the recA protein. The second one spans the whole monomer-monomer interface. These two profiles also pick up the recA-like proteins.
-Consensus pattern: A-L-[KR]-[IF]-[FY]-[STA]-[STAD]-[LIVMQ]-R
-Sequences known to belong to this class detected by the pattern: ALL recA.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL recA and recA-like eukaryotic proteins.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Expert(s) to contact by email:
           Roca A.I.; [email protected]
           Eisen J.A.; [email protected]
-Last update: December 2001 / Text revised; profile added.
[ 1] Smith K.C., Wang T.-C. "recA-dependent DNA repair processes." BioEssays 10:12-16(1989). PubMed=2653307
[ 2] Lloyd A.T., Sharp P.M. "Evolution of the recA gene and the molecular phylogeny of bacteria." J. Mol. Evol. 37:399-407(1993). PubMed=8308907
[ 3] Roca A.I., Cox M.M. Prog. Nucleic Acids Res. Mol. Biol. 56:129-223(1997).
[ 4] Karlin S., Weinstock G.M., Brendel V. "Bacterial classifications derived from recA protein sequence comparisons." J. Bacteriol.
     177:6881-6893(1995). PubMed=7592482
[ 5] Eisen J.A. "The RecA protein as a model molecule for molecular systematic studies of bacteria: comparison of trees of RecAs and 16S rRNAs from the same species." J. Mol. Evol. 41:1105-1123(1995). PubMed=8587109
[ 6] Cerutti H.D., Osman M., Grandoni P., Jagendorf A.T. "A homolog of Escherichia coli RecA protein in plastids of higher plants." Proc. Natl. Acad. Sci. U.S.A. 89:8068-8072(1992). PubMed=1518831
[E1] http://www.tigr.org/~jeisen/RecA/RecA.html
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00132}
{PS00144; ASN_GLN_ASE_1}
{PS00917; ASN_GLN_ASE_2}
{BEGIN}
******************************************************
* Asparaginase / glutaminase active sites signatures *
******************************************************
Asparaginase (EC 3.5.1.1), glutaminase (EC 3.5.1.2) and glutaminase-asparaginase (EC 3.5.1.38) are aminohydrolases that catalyze the hydrolysis of asparagine (or glutamine) to aspartate (or glutamate) and ammonia [1]. Two conserved threonine residues have been shown [2,3] to play a catalytic role. One of them is located in the N-terminal extremity while the second is located at the end of the first third of the sequence. We used both conserved regions as signature patterns.
-Consensus pattern: [LIVM]-x-{L}-T-G(2)-T-[IV]-[AGS]
                    [The second T is an active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 10.
-Consensus pattern: [GA]-x-[LIVM]-x(2)-H-G-T-D-T-[LIVM]
                    [The first T is an active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: Plant asparaginases and mammalian glutaminases do not belong to this family and are thus not detected by the above pattern.
-Expert(s) to contact by email:
           Gribskov M.; [email protected]
-Last update: April 2006 / Pattern revised.
[ 1] Tanaka S., Robinson E.A., Appella E., Miller M., Ammon H.L., Roberts J., Weber I.T., Wlodawer A. "Structures of amidohydrolases. Amino acid sequence of a glutaminase-asparaginase from Acinetobacter glutaminasificans and preliminary crystallographic data for an asparaginase from Erwinia chrysanthemi." J. Biol. Chem. 263:8583-8591(1988). PubMed=3379033
[ 2] Harms E., Wehner A., Aung H.P., Rohm K.H. "A catalytic role for threonine-12 of E. coli asparaginase II as established by site-directed mutagenesis." FEBS Lett. 285:55-58(1991). PubMed=1906013
[ 3] Miller M.M., Rao J.K.M., Wlodawer A., Gribskov M.R. "A left-handed crossover involved in amidohydrolase catalysis. Crystal structure of Erwinia chrysanthemi L-asparaginase with bound L-aspartate." FEBS Lett. 328:275-279(1993). PubMed=8348975
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+ {END} {PDOC00133} {PS01120; UREASE_1} {PS00145; UREASE_2} {PS51368; UREASE_3} {BEGIN} **************************************** * Urease domain signatures and profile * **************************************** Urease (EC 3.5.1.5) is a nickel-binding enzyme that catalyzes the hydrolysis of urea to carbon dioxide and ammonia [1]. Historically, it was the first enzyme to be crystallized (in 1926). It is mainly found in plant seeds, microorganisms and invertebrates. In plants, urease is a hexamer of identical chains. In bacteria [2], it consists of either two or three different subunits (alpha, beta and gamma). Urease binds two nickel ions per subunit; four histidines, an aspartate and a carbamated-lysine serve as ligands to these metals; an additional histidine is involved in the catalytic mechanism [3]. The urease domain forms an (alpha beta)(8) barrel structure (see <PDB:2KAU>) with structural similarity to other metal-dependent hydrolases, such as adenosine and AMP deaminase (see <PDOC00419>) and phosphotriesterase (see <PDOC01026>). As signatures for this enzyme, we selected a region that contains two histidines that bind one of the nickel ions and the region of the active site histidine. We also developed a profile that covers the whole urease domain. -Consensus pattern: T-[AY]-[GA]-[GATR]-[LIVMF]-D-x-H-[LIVM]-H-x(3)-[PA] [The 2 H's bind nickel] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: [LIVM](2)-[CT]-H-[HNG]-L-x(3)-[LIVM]-x(2)-D-[LIVM]-x-F-[AS] [H is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: April 2008 / Text revised; profile added. 
[ 1] Takishima K., Suga T., Mamiya G. "The structure of jack bean urease. The complete amino acid sequence, limited proteolysis and reactive cysteine residues." Eur. J. Biochem. 175:151-165(1988). PubMed=3402446 [ 2] Mobley H.L.T., Hausinger R.P. "Microbial ureases: significance, regulation, and molecular characterization." Microbiol. Rev. 53:85-108(1989). PubMed=2651866 [ 3] Jabri E., Carr M.B., Hausinger R.P., Karplus P.A. "The crystal structure of urease from Klebsiella aerogenes." Science 268:998-1004(1995). PubMed=7754395 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00134} {PS00146; BETA_LACTAMASE_A} {PS00336; BETA_LACTAMASE_C} {PS00337; BETA_LACTAMASE_D} {BEGIN} ****************************************************** * Beta-lactamases classes -A, -C, and -D active site * ****************************************************** Beta-lactamases (EC 3.5.2.6) [1,2] are enzymes which catalyze the hydrolysis of an amide bond in the beta-lactam ring of antibiotics belonging to the penicillin/cephalosporin family. Four kinds of beta-lactamase have been identified [3]. Class-B enzymes are zinc-containing proteins whilst class-A, -C and -D enzymes are serine hydrolases. The three classes of serine beta-lactamases are evolutionarily related and belong to a superfamily [4] that also includes DD-peptidases and a variety of other penicillin-binding proteins (PBP's). All these proteins contain a Ser-x-x-Lys motif, where the serine is the active site residue. 
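As an illustration of how short the shared Ser-x-x-Lys motif is, the sketch below lists every serine in a sequence that is followed three residues later by a lysine. It uses only Python's standard `re` module. The helper name `find_sxxk` and the toy sequence are hypothetical, chosen for demonstration and not taken from any Swiss-Prot entry. A zero-width lookahead is used so that overlapping occurrences are all reported.

```python
import re


def find_sxxk(sequence: str) -> list:
    """Return the 1-based positions of every serine that starts a Ser-x-x-Lys motif."""
    # The lookahead consumes no characters, so overlapping motifs are all found.
    return [m.start() + 1 for m in re.finditer(r"(?=S..K)", sequence)]


toy_sequence = "MSTFKASSAKKL"  # hypothetical peptide, for illustration only
print(find_sxxk(toy_sequence))  # [2, 7, 8]
```

Even this 12-residue toy peptide yields three candidate serines, which shows how unspecific a four-residue motif is when used on its own.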
Although clearly homologous, the sequences of the three classes of serine beta-lactamases exhibit a large degree of variability and only a small number of residues are conserved in addition to the catalytic serine. Since a pattern detecting all serine beta-lactamases would also pick up many unrelated sequences, we decided to provide specific patterns, centered on the active site serine, for each of the three classes. -Consensus pattern: [FY]-x-[LIVMFY]-{E}-S-[TV]-x-K-x(3)-{T}-[AGLM]-{D}-{KA}-[LC] [S is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL class-A beta-lactamases. -Other sequence(s) detected in Swiss-Prot: 7. -Consensus pattern: [FY]-E-[LIVM]-G-S-[LIVMG]-[SA]-K [The first S is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL class-C beta-lactamases. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: [PA]-x-S-[ST]-F-K-[LIV]-[PALV]-x-[STA]-[LI] [S is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL class-D beta-lactamases. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Brannigan J.; [email protected] -Last update: April 2006 / Pattern revised. [ 1] Ambler R.P. "The structure of beta-lactamases." Philos. Trans. R. Soc. Lond., B, Biol. Sci. 289:321-331(1980). PubMed=6109327 [ 2] Pastor N., Pinero D., Valdes A.M., Soberon X. "Molecular evolution of class A beta-lactamases: phylogeny and patterns of sequence conservation." Mol. Microbiol. 4:1957-1965(1990). PubMed=2082152 [ 3] Bush K. "Characterization of beta-lactamases." Antimicrob. Agents Chemother. 33:259-263(1989). PubMed=2658779 [ 4] Joris B., Ghuysen J.-M., Dive G., Renard A., Dideberg O., Charlier P., Frere J.M., Kelly J.A., Boyington J.C., Moews P.C. "The active-site-serine penicillin-recognizing enzymes as members of the Streptomyces R61 DD-peptidase family." Biochem. J. 250:313-324(1988). 
PubMed=3128280 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00135} {PS01053; ARGINASE_1} {PS51409; ARGINASE_2} {BEGIN} ***************************************** * Arginase family signature and profile * ***************************************** Arginase family proteins are ureohydrolases with important roles in arginine/agmatine metabolism, the urea cycle, histidine degradation, and other pathways. The family includes arginase and evolutionarily related [1] enzymes of about 300 amino acids that typically contain two manganese ions in the active site. Some proteins that belong to the arginase family are listed below: - Arginase (EC 3.5.3.1), a ubiquitous enzyme which catalyzes the degradation of arginine to ornithine and urea [2]. Two isoenzymes are found in mammals. Arginase-1 catalyzes the final cytosolic step of the urea cycle in liver, but it is also found in non-hepatic tissues. Arginase-2 is a mitochondrial enzyme that functions in arginine homeostasis in non-hepatic tissues. Deficiency of arginase can lead to diseases related to the accumulation of arginine or ammonia. - Agmatinase (EC 3.5.3.11) (agmatine ureohydrolase), a prokaryotic enzyme (gene speB) that catalyzes the hydrolysis of agmatine into putrescine and urea. - Formiminoglutamase (EC 3.5.3.8) (formiminoglutamate hydrolase), a prokaryotic enzyme (gene hutG) that hydrolyzes N-formimino-glutamate into glutamate and formamide. 
- Proclavaminate amidinohydrolase (EC 3.5.3.22) from Streptomyces clavuligerus (gene pah), an enzyme involved in antibiotic clavulanic acid biosynthesis. - Guanidinobutyrase (EC 3.5.3.7) from Arthrobacter sp. (gene gbh), an enzyme that hydrolyzes guanidinobutanoate into aminobutanoate and urea and that requires one zinc ion instead of manganese. - Hypothetical proteins from methanogenic archaebacteria. Known 3-D structures of such enzymes show trimeric or hexameric structures [3-6]. Each monomer forms a conserved alpha/beta fold with a central parallel beta-sheet flanked on both sides by several alpha-helices (see <PDB:1RLA; B>). Three conserved regions that contain charged residues which are involved in the binding of the two manganese ions in the active site are located in loop segments of the central beta-sheet [3-6]. We have used one of these regions for a signature pattern and we have also developed a profile that covers the entire arginase structure. -Consensus pattern: [ST]-[LIVMFY]-D-[LIVM]-D-x(3)-[PAQ]-x(3)-P-[GSA]-x(7)-G [The 2 D's bind manganese] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Ouzounis C.; [email protected] -Last update: November 2008 / Text revised; profile added; patterns deleted. [ 1] Ouzounis C.A., Kyrpides N.C. "On the evolution of arginases and related enzymes." J. Mol. Evol. 39:101-104(1994). PubMed=8064866 [ 2] Jenkinson C.P., Grody W.W., Cederbaum S.D. "Comparative properties of arginases." Comp. Biochem. Physiol. 114B:107-132(1996). PubMed=8759304 [ 3] Kanyo Z.F., Scolnick L.R., Ash D.E., Christianson D.W. "Structure of a unique binuclear manganese cluster in arginase." Nature 383:554-557(1996). PubMed=8849731 
[ 4] Elkins J.M., Clifton I.J., Hernandez H., Doan L.X., Robinson C.V., Schofield C.J., Hewitson K.S. "Oligomeric structure of proclavaminic acid amidino hydrolase: evolution of a hydrolytic enzyme in clavulanic acid biosynthesis." Biochem. J. 366:423-434(2002). PubMed=12020346; DOI=10.1042/BJ20020125 [ 5] Ahn H.J., Kim K.H., Lee J., Ha J.Y., Lee H.H., Kim D., Yoon H.J., Kwon A.R., Suh S.W. "Crystal structure of agmatinase reveals structural conservation and inhibition mechanism of the ureohydrolase superfamily." J. Biol. Chem. 279:50505-50513(2004). PubMed=15355972; DOI=10.1074/jbc.M409246200 [ 6] Dowling D.P., Di Costanzo L., Gennadios H.A., Christianson D.W. "Evolution of the arginase fold and functional diversity." Cell. Mol. Life Sci. 65:2039-2055(2008). PubMed=18360740; DOI=10.1007/s00018-008-7554-z +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00136} {PS00150; ACYLPHOSPHATASE_1} {PS00151; ACYLPHOSPHATASE_2} {PS51160; ACYLPHOSPHATASE_3} {BEGIN} ****************************************************** * Acylphosphatase-like domain signatures and profile * ****************************************************** Acylphosphatase (EC 3.6.1.7) [1,2] catalyzes the hydrolysis of various acyl phosphate carboxyl-phosphate bonds such as carbamyl phosphate, succinyl phosphate, 1,3-diphosphoglycerate, etc. The physiological role of this enzyme is not yet clear. Acylphosphatase is a small protein of around 100 amino-acid residues. 
Two different isoenzymes are expressed in mammalian tissues: muscle type (MT) acylphosphatase is prevalently found in the skeletal muscle and heart, whereas the organ common type (CT) acylphosphatase is expressed in erythrocytes, brain and testis. While acylphosphatases have been so far only characterized in vertebrates, there are a number of bacterial and archaebacterial hypothetical proteins that are highly similar to that enzyme and that probably possess the same activity. These proteins are: - Escherichia coli probable acylphosphatase yccX. - Bacillus subtilis probable acylphosphatase yflL. - Archaeoglobus fulgidus probable acylphosphatase AF0818. An acylphosphatase-like domain is also found in the N-terminus of prokaryotic hydrogenase maturation protein hypF [3,4]. The acylphosphatase-like domain forms a compact, pear-shaped structure (see <PDB:2ACY>). It has a globular alpha/beta fold, consisting of a beta sheet with five antiparallel strands and two alpha helices packed parallel on the same side of the sheet, forming an alpha/beta sandwich protein. The acylphosphatase-like domain is stabilized by intramolecular contacts of the two antiparallel amphipathic alpha-helices, which pack their hydrophobic residues against the inner face of the beta-sheet, leaving no core cavities in the protein structure [4,5]. As signature patterns, we selected two conserved regions. The first is located in the N-terminal section, while the second is found in the central part of the protein sequence. We also developed a profile that covers the entire acylphosphatase-like domain. -Consensus pattern: [LIV]-x-G-x-V-Q-[GH]-V-x-[FM]-R -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: G-[FYW]-[AVC]-[KRQAM]-N-x(3)-G-x-V-x(5)-G -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. 
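The two consensus patterns above can be checked against a sequence together. The sketch below is an illustration only: the regular expressions are hand-translated from the two patterns, the assignment of the names ACYLPHOSPHATASE_1 and ACYLPHOSPHATASE_2 to the first and second pattern is an assumption, and the function name `scan_signatures` and the toy sequence are hypothetical.

```python
import re

# Hand-translated regexes for the two consensus patterns of this entry.
# The name-to-pattern assignment is an assumption made for this sketch.
SIGNATURES = {
    "ACYLPHOSPHATASE_1": re.compile(r"[LIV].G.VQ[GH]V.[FM]R"),
    "ACYLPHOSPHATASE_2": re.compile(r"G[FYW][AVC][KRQAM]N.{3}G.V.{5}G"),
}


def scan_signatures(sequence: str) -> dict:
    """Map each signature name to the 1-based start of its first hit, or None."""
    hits = {}
    for name, regex in SIGNATURES.items():
        match = regex.search(sequence)
        hits[name] = match.start() + 1 if match else None
    return hits


# Hypothetical toy sequence built to contain one copy of each signature.
toy = "AAVKGKVQGVFFRAAAAGYVKNAAAGAVAAAAAGAA"
print(scan_signatures(toy))
# {'ACYLPHOSPHATASE_1': 3, 'ACYLPHOSPHATASE_2': 18}
```

Requiring both hits, with the first upstream of the second, is one simple way to apply the observation that the first region lies in the N-terminal section and the second in the central part of the sequence.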
-Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: April 2006 / Pattern revised. [ 1] Stefani M., Ramponi G. "Acylphosphate phosphohydrolases." Life Chem. Rep. 12:271-301(1995). [ 2] Stefani M., Taddei N., Ramponi G. "Insights into acylphosphatase structure and catalytic mechanism." Cell. Mol. Life Sci. 53:141-151(1997). PubMed=9118002 [ 3] Wolf I., Buhrke T., Dernedde J., Pohlmann A., Friedrich B. "Duplication of hyp genes involved in maturation of [NiFe] hydrogenases in Alcaligenes eutrophus H16." Arch. Microbiol. 170:451-459(1998). PubMed=9799289 [ 4] Rosano C., Zuccotti S., Bucciantini M., Stefani M., Ramponi G., Bolognesi M. "Crystal structure and anion binding in the prokaryotic hydrogenase maturation factor HypF acylphosphatase-like domain." J. Mol. Biol. 321:785-796(2002). PubMed=12206761 [ 5] Thunnissen M.M.G.M., Taddei N., Liguri G., Ramponi G., Nordlund P. "Crystal structure of common type acylphosphatase from bovine testis." Structure 5:69-79(1997). PubMed=9016712 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00137} {PS00152; ATPASE_ALPHA_BETA} {BEGIN} ************************************************** * ATP synthase alpha and beta subunits signature * ************************************************** ATP synthase (proton-translocating ATPase) (EC 3.6.3.14) [1,2] is a component of the cytoplasmic membrane of eubacteria, the inner membrane of mitochondria, and the thylakoid membrane of chloroplasts. The ATPase complex is composed of an oligomeric transmembrane sector, called CF(0), and a catalytic core, called coupling factor CF(1). The former acts as a proton channel; the latter is composed of five subunits, alpha, beta, gamma, delta and epsilon. The sequences of subunits alpha and beta are related and both contain a nucleotide-binding site for ATP and ADP. The beta chain has catalytic activity, while the alpha chain is a regulatory subunit. Vacuolar ATPases [3] (V-ATPases) are responsible for acidifying a variety of intracellular compartments in eukaryotic cells. Like F-ATPases, they are oligomeric complexes of a transmembrane and a catalytic sector. The sequence of the largest subunit of the catalytic sector (70 Kd) is related to that of F-ATPase beta subunit, while a 60 Kd subunit, from the same sector, is related to the F-ATPases alpha subunit [4]. Archaebacterial membrane-associated ATPases are composed of three subunits. The alpha chain is related to F-ATPases beta chain and the beta chain is related to F-ATPases alpha chain [4]. A protein highly similar to F-ATPase beta subunits is found [5] in some bacterial apparatus involved in a specialized protein export pathway that proceeds without signal peptide cleavage. This protein is known as fliI in Bacillus and Salmonella, Spa47 (mxiB) in Shigella flexneri, HrpB6 in Xanthomonas campestris and yscN in Yersinia virulence plasmids. 
In order to detect these ATPase subunits, we took a segment of ten amino-acid residues, containing two conserved serines, as a signature pattern. The first serine seems to be important for catalysis - in the ATPase alpha chain at least - as its mutagenesis causes catalytic impairment. -Consensus pattern: P-[SAP]-[LIV]-[DNH]-{LKGN}-{F}-{S}-S-{DCPH}-S [The first S may be an active site residue] -Sequences known to belong to this class detected by the pattern: ALL, except for the archaebacterium Sulfolobus acidocaldarius ATPase alpha chain where the first Ser is replaced by Gly. -Other sequence(s) detected in Swiss-Prot: 45. -Note: F-ATPase alpha and beta subunits, V-ATPase 70 Kd subunit and the archaebacterial ATPase alpha subunit also contain a copy of the ATP-binding motifs A and B (see <PDOC00017>). -Last update: April 2006 / Pattern revised. [ 1] Futai M., Noumi T., Maeda M. "ATP synthase (H+-ATPase): results by combined biochemical and molecular biological approaches." Annu. Rev. Biochem. 58:111-136(1989). PubMed=2528322; DOI=10.1146/annurev.bi.58.070189.000551 [ 2] Senior A.E. "ATP synthesis by oxidative phosphorylation." Physiol. Rev. 68:177-231(1988). PubMed=2892214 [ 3] Nelson N. "Structure, molecular genetics, and evolution of vacuolar H+-ATPases." J. Bioenerg. Biomembr. 21:553-571(1989). PubMed=2531737 [ 4] Gogarten J.P., Kibak H., Dittrich P., Taiz L., Bowman E.J., Bowman B.J., Manolson M.F., Poole R.J., Date T., Oshima T. "Evolution of the vacuolar H+-ATPase: implications for the origin of eukaryotes." Proc. Natl. Acad. Sci. U.S.A. 86:6661-6665(1989). PubMed=2528146 [ 5] Dreyfus G., Williams A.W., Kawagishi I., Macnab R.M. "Genetic and biochemical analysis of Salmonella typhimurium FliI, a flagellar protein related to the catalytic subunit of the F0F1 ATPase and to virulence proteins of mammalian and plant pathogens." J. Bacteriol. 175:3131-3138(1993). 
PubMed=8491729 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00138} {PS00153; ATPASE_GAMMA} {BEGIN} **************************************** * ATP synthase gamma subunit signature * **************************************** ATP synthase (proton-translocating ATPase) (EC 3.6.3.14) [1,2] is a component of the cytoplasmic membrane of eubacteria, the inner membrane of mitochondria, and the thylakoid membrane of chloroplasts. The ATPase complex is composed of an oligomeric transmembrane sector, called CF(0), and a catalytic core, called coupling factor CF(1). The former acts as a proton channel; the latter is composed of five subunits, alpha, beta, gamma, delta and epsilon. Subunit gamma is believed to be important in regulating ATPase activity and the flow of protons through the CF(0) complex. The best conserved region of the gamma subunit [3] is its C-terminus which seems to be essential for assembly and catalysis. As a signature pattern to detect ATPase gamma subunits, we used a 14 residue conserved segment where the last amino acid is found one to three residues from the C-terminal extremity. -Consensus pattern: [IV]-T-x-E-x(2)-[DE]-x(3)-G-A-x-[SAKR] -Sequences known to belong to this class detected by the pattern: ALL, except for pea chloroplast gamma and two Bacillus species gamma. -Other sequence(s) detected in Swiss-Prot: 1. -Last update: November 1995 / Pattern and text revised. [ 1] Futai M., Noumi T., Maeda M. 
"ATP synthase (H+-ATPase): results by combined biochemical and molecular biological approaches." Annu. Rev. Biochem. 58:111-136(1989). PubMed=2528322; DOI=10.1146/annurev.bi.58.070189.000551 [ 2] Senior A.E. "ATP synthesis by oxidative phosphorylation." Physiol. Rev. 68:177-231(1988). PubMed=2892214 [ 3] Miki J., Maeda M., Mukohata Y., Futai M. "The gamma-subunit of ATP synthase from spinach chloroplasts. Primary structure deduced from the cloned cDNA sequence." FEBS Lett. 232:221-226(1988). PubMed=2896606 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00139} {PS00154; ATPASE_E1_E2} {BEGIN} *************************************** * P-type ATPases phosphorylation site * *************************************** P-type ATPases (also known as E1-E2) are cation transport ATPases which form an aspartyl phosphate intermediate in the course of ATP hydrolysis. ATPases which belong to this family are listed below [1,2,3]. - Fungal and plant plasma membrane (H+) ATPases (EC 3.6.3.6). - Vertebrate (Na+, K+) ATPases (sodium pump) (EC 3.6.3.9). - Gastric (K+, H+) ATPases (proton pump) (EC 3.6.3.10). - Calcium (Ca++) ATPases (calcium pump) (EC 3.6.3.8) from the sarcoplasmic reticulum (SR), the endoplasmic reticulum (ER) and the plasma membrane. - Copper (Cu++) ATPases (copper pump) (EC 3.6.3.4) which are involved in two human genetic disorders: Menkes syndrome and Wilson disease. - Bacterial cadmium efflux (Cd++) ATPases (EC 3.6.3.3). 
- Bacterial magnesium (Mg++) ATPases (EC 3.6.3.2). - Bacterial potassium (K+) ATPases (EC 3.6.3.12). - Bacterial zinc (Zn++) ATPases (EC 3.6.3.5). - Fungal ENA sodium ATPases (EC 3.6.3.7). - fixI, a probable cation ATPase from Rhizobiaceae, involved in nitrogen fixation. The region around the phosphorylated aspartate residue is perfectly conserved in all these ATPases and can be used as a signature pattern. -Consensus pattern: D-K-T-G-T-[LIVM]-[TI] [D is phosphorylated] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 1. -Last update: November 2002 / Text revised. -Expert(s) to contact by email: Axelsen K.B.; [email protected] [ 1] Fagan M.J., Saier M.H. Jr. "P-type ATPases of eukaryotes and bacteria: sequence analyses and construction of phylogenetic trees." J. Mol. Evol. 38:57-99(1994). PubMed=8151716 [ 2] Palmgren M.G., Axelsen K.B. "Evolution of P-type ATPases." Biochim. Biophys. Acta 1365:37-45(1998). PubMed=9693719 [ 3] Axelsen K.B., Palmgren M.G. "Evolution of substrate specificities in the P-type ATPase superfamily." J. Mol. Evol. 46:84-101(1998). PubMed=9419228 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00140} {PS00155; CUTINASE_1} {PS00931; CUTINASE_2} {BEGIN} ************************************ * Cutinase active sites signatures * ************************************ Cutinase [1] is an extracellular fungal enzyme that catalyzes the hydrolysis of cutin, an insoluble lipid-polyester that forms the structure of plant cuticle. Cutinase allows pathogenic fungi to penetrate through the host plant cuticular barrier during the initial stage of fungal infection. Cutinase is a serine esterase which contains the classical catalytic triad (Asp, Ser, and His) found in the serine hydrolases [2]. Two cutinase-like proteins (MtCY39.35 and MtCY339.08c) have been found in the genome of the bacterium Mycobacterium tuberculosis. The sequences around the catalytic residues are well conserved in the known fungal cutinases and can be used as signature patterns. -Consensus pattern: P-x-[STA]-x-[LIV]-[IVT]-x-[GS]-G-Y-S-[QL]-G [S is an active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: C-x(3)-D-x-[IV]-C-x-G-[GST]-x(2)-[LIVM]-x(2,3)-H [D and H are active site residues] -Sequences known to belong to this class detected by the pattern: ALL, except for MtCY339.08c. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: November 1997 / Patterns and text revised. [ 1] Ettinger W.F., Thukral S.K., Kolattukudy P.E. Biochemistry 26:7883-7892(1987). [ 2] Martinez C., De Geus P., Lauwereys M., Matthyssens G., Cambillau C. "Fusarium solani cutinase is a lipolytic enzyme with a catalytic serine accessible to solvent." Nature 356:615-618(1992). PubMed=1560844; DOI=10.1038/356615a0 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). 
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00141} {PS00156; OMPDECASE} {BEGIN} **************************************************** * Orotidine 5'-phosphate decarboxylase active site * **************************************************** Orotidine 5'-phosphate decarboxylase (EC 4.1.1.23) (OMPdecase) [1,2] catalyzes the last step in the de novo biosynthesis of pyrimidines, the decarboxylation of OMP into UMP. In higher eukaryotes OMPdecase is part, with orotate phosphoribosyltransferase, of a bifunctional enzyme, while the prokaryotic and fungal OMPdecases are monofunctional proteins. Some parts of the sequence of OMPdecase are well conserved across species. The best conserved region is located in the N-terminal half of OMPdecases and is centered around a lysine residue which is essential for the catalytic function of the enzyme. We have used this region as a signature pattern. -Consensus pattern: [LIVMFTAR]-[LIVMF]-x-D-x-K-x(2)-D-[IV]-[ADGP]-x-T-[CLIVMNTA] [K is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: January 2002 / Pattern and text revised. [ 1] Jacquet M., Guilbaud R., Garreau H. "Sequence analysis of the DdPYR5-6 gene coding for UMP synthase in Dictyostelium discoideum and comparison with orotate phosphoribosyl transferases and OMP decarboxylases." Mol. Gen. Genet. 211:441-445(1988). PubMed=2835631 [ 2] Kimsey H.H., Kaiser D. "The orotidine-5'-monophosphate decarboxylase gene of Myxococcus xanthus. Comparison to the OMP decarboxylase gene family." J. Biol. Chem. 267:819-824(1992). 
PubMed=1730672 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00142} {PS00157; RUBISCO_LARGE} {BEGIN} ************************************************************* * Ribulose bisphosphate carboxylase large chain active site * ************************************************************* Ribulose bisphosphate carboxylase (EC 4.1.1.39) (RuBisCO) [1,2] catalyzes the initial step in Calvin's reductive pentose phosphate cycle in plants as well as purple and green bacteria. It consists of a large catalytic unit and a small subunit of undetermined function. In plants, the large subunit is coded by the chloroplastic genome while the small subunit is encoded in the nuclear genome. Molecular activation of RuBisCO by CO2 involves the formation of a carbamate with the epsilon-amino group of a conserved lysine residue. This carbamate is stabilized by a magnesium ion. One of the ligands of the magnesium ion is an aspartic acid residue close to the active site lysine [3]. We developed a pattern which includes both the active site residue and the metal ligand, and which is specific to RuBisCO large chains. -Consensus pattern: G-x-[DN]-F-x-K-x-D-E [K is the active site residue] [The second D is a magnesium ligand] -Sequences known to belong to this class detected by the pattern: ALL, except for Cheilopleuria biscuspis RuBisCO. -Other sequence(s) detected in Swiss-Prot: 1. -Last update: November 1995 / Pattern and text revised. [ 1] Miziorko H.M., Lorimer G.H. 
"Ribulose-1,5-bisphosphate carboxylase-oxygenase." Annu. Rev. Biochem. 52:507-535(1983). PubMed=6351728; DOI=10.1146/annurev.bi.52.070183.002451 [ 2] Akazawa T., Takabe T., Kobayashi H. Trends Biochem. Sci. 9:380-383(1984). [ 3] Andersson I., Knight S., Schneider G., Lindqvist Y., Lundqvist T., Branden C.-I., Lorimer G.H. Nature 337:229-234(1989). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00143} {PS00158; ALDOLASE_CLASS_I} {BEGIN} ****************************************************** * Fructose-bisphosphate aldolase class-I active site * ****************************************************** Fructose-bisphosphate aldolase (EC 4.1.2.13) [1,2] is a glycolytic enzyme that catalyzes the reversible aldol cleavage or condensation of fructose-1,6bisphosphate into dihydroxyacetone-phosphate and glyceraldehyde 3phosphate. There are two classes of fructose-bisphosphate aldolases with different catalytic mechanisms. Class-I aldolases [3], mainly found in higher eukaryotes, are homotetrameric enzymes which form a Schiff-base intermediate between the C-2 carbonyl group of the substrate (dihydroxyacetone phosphate) and the epsilon-amino group of a lysine residue. In vertebrates, three forms of this enzyme muscle, aldolase B in liver and aldolase C in brain. are found: aldolase A in The sequence around the lysine involved in the Schiff-base is highly conserved and can be used as a signature for this class of enzyme. 

-Consensus pattern: [LIVM]-x-[LIVMFYW]-E-G-x-[LSI]-L-K-[PA]-[SN]
                    [K is involved in Schiff-base formation]
-Sequences known to belong to this class detected by the pattern: ALL, except for Staphylococcus carnosus aldolase.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Perham R.N. "The fructose-1,6-bisphosphate aldolases: same reaction, different enzymes." Biochem. Soc. Trans. 18:185-187(1990). PubMed=2199259
[ 2] Marsh J.J., Lebherz H.G. "Fructose-bisphosphate aldolases: an evolutionary history." Trends Biochem. Sci. 17:110-113(1992). PubMed=1412694
[ 3] Freemont P.S., Dunbar B., Fothergill-Gilmore L.A. "The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase." Biochem. J. 249:779-788(1988). PubMed=3355497
{END}
{PDOC00144}
{PS00159; ALDOLASE_KDPG_KHG_1}
{PS00160; ALDOLASE_KDPG_KHG_2}
{BEGIN}
*************************************************
* KDPG and KHG aldolases active site signatures *
*************************************************

4-hydroxy-2-oxoglutarate aldolase (EC 4.1.3.16) (KHG-aldolase) catalyzes the interconversion of 4-hydroxy-2-oxoglutarate into pyruvate and glyoxylate. Phospho-2-dehydro-3-deoxygluconate aldolase (EC 4.1.2.14) (KDPG-aldolase) catalyzes the interconversion of 6-phospho-2-dehydro-3-deoxy-D-gluconate into pyruvate and glyceraldehyde 3-phosphate. These two enzymes are structurally and functionally related [1].
They are both homotrimeric proteins of approximately 220 amino-acid residues. They are class I aldolases whose catalytic mechanism involves the formation of a Schiff-base intermediate between the substrate and the epsilon-amino group of a lysine residue. In both enzymes, an arginine is required for catalytic activity.

We developed two signature patterns for these enzymes. The first one contains the active site arginine and the second, the lysine involved in the Schiff-base formation.

-Consensus pattern: G-[LIVM]-x(3)-E-[LIV]-T-[LF]-R
                    [R is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for Bacillus subtilis KDPG-aldolase which has Thr instead of Arg in the active site.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: G-x(3)-[LIVMF]-K-[LF]-F-P-[SA]-x(3)-G
                    [K is involved in Schiff-base formation]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: November 1997 / Patterns and text revised.

[ 1] Vlahos C.J., Dekker E.E. "The complete amino acid sequence and identification of the active-site arginine peptide of Escherichia coli 2-keto-4-hydroxyglutarate aldolase." J. Biol. Chem. 263:11683-11691(1988). PubMed=3136164
{END}
{PDOC00145}
{PS00161; ISOCITRATE_LYASE}
{BEGIN}
******************************
* Isocitrate lyase signature *
******************************

Isocitrate lyase (EC 4.1.3.1) [1,2] is an enzyme that catalyzes the conversion of isocitrate to succinate and glyoxylate. This is the first step in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle in bacteria, fungi and plants.

A cysteine, a histidine and a glutamate or aspartate have been found to be important for the enzyme's catalytic activity. Only one cysteine residue is conserved between the sequences of the fungal, plant and bacterial enzymes; it is located in the middle of a conserved hexapeptide that can be used as a signature pattern for this type of enzyme.

ICL is evolutionarily related to two other types of enzymes:

- Carboxyphosphonoenolpyruvate phosphonomutase (EC 2.7.8.23) (CPEP mutase). It forms a carbon-phosphorus bond in a rearrangement leading from carboxyphosphonoenolpyruvate (CPEP) to phosphinopyruvate.
- Phosphoenolpyruvate phosphomutase (EC 5.4.2.9) (PEP mutase) [3]. It forms a carbon-phosphorus bond by converting phosphoenolpyruvate (PEP) to phosphonopyruvate.

-Consensus pattern: K-[KR]-C-G-H-[LMQR]
                    [C may be an active site residue]
-Sequences known to belong to this class detected by the pattern: All ICLs and CPEP mutases, but not PEP mutases.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: May 2004 / Text revised.

[ 1] Beeching J.R. "High sequence conservation between isocitrate lyase from Escherichia coli and Ricinus communis." Protein Seq. Data Anal. 2:463-466(1989). PubMed=2696959
[ 2] Atomi H., Ueda M., Hikida M., Hishida T., Teranishi Y., Tanaka A. "Peroxisomal isocitrate lyase of the n-alkane-assimilating yeast Candida tropicalis: gene analysis and characterization." J. Biochem. 107:262-266(1990). PubMed=2361956
[ 3] Huang K., Li Z., Jia Y., Dunaway-Mariano D., Herzberg O. "Helix swapping between two alpha/beta barrels: crystal structure of phosphoenolpyruvate mutase with bound Mg(2+)-oxalate." Structure 7:539-548(1999). PubMed=10378273
{END}
{PDOC00146}
{PS00162; ALPHA_CA_1}
{PS51144; ALPHA_CA_2}
{BEGIN}
***************************************************
* Alpha-carbonic anhydrases signature and profile *
***************************************************

Carbonic anhydrases (EC 4.2.1.1) (CA) [1,2,3,4] are zinc metalloenzymes which catalyze the reversible hydration of carbon dioxide, a reaction underlying many diverse physiological processes in animals, plants, archaebacteria, and eubacteria. Currently there are five evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and evolved independently. The alpha-CAs are found predominantly in animals but also in bacteria and green algae [5,6,7].

To date 15 alpha-CA or alpha-CA-like proteins have been identified in mammals. These can be divided into five broad subgroups: the cytosolic CAs (CA-I, CA-II, CA-III, CA-VII and CA-XIII), mitochondrial CAs (CA-VA and CA-VB), secreted CAs (CA-VI), membrane-associated (CA-IV, CA-IX, CA-XII and CA-XIV) and those without CA activity, the CA-related proteins (CA-RP VIII, X and XI) [6]. In the alga Chlamydomonas reinhardtii, two CA isozymes have been sequenced [8]. They are periplasmic glycoproteins evolutionarily related to mammalian CAs.
Some bacteria, such as Neisseria gonorrhoeae [9], also have an alpha-type CA.

The dominating secondary structure is a 10-stranded, twisted beta-sheet, which divides the molecules into two halves (see <PDB:1RAZ>). Except for two pairs of parallel strands, the beta sheet is antiparallel. A few relatively short helices are located on the surface of the molecule [10]. Alpha-CAs contain a single zinc atom bound to three conserved histidine residues. The catalytically active group is the zinc-bound water which ionizes to a hydroxide group. In the mechanism of catalysis, nucleophilic attack of CO2 by a zinc-bound hydroxide ion is followed by displacement of the resulting zinc-bound bicarbonate ion by water; subsequent deprotonation regenerates the nucleophilic zinc-bound hydroxide ion [5,11].

Protein D8 from Vaccinia and other poxviruses is related to CAs but has lost two of the zinc-binding histidines as well as many otherwise conserved residues. This is also true of the N-terminal extracellular domain of some receptor-type tyrosine-protein phosphatases (see <PDOC00323>).

We derived a signature pattern for the alpha-CAs which includes one of the zinc-binding histidines. We also developed a profile that covers the entire alpha-CA catalytic domain.

-Consensus pattern: S-E-[HN]-x-[LIVM]-x(4)-[FYH]-x(2)-E-[LIVMGA]-H-[LIVMFA](2)
                    [The second H is a zinc ligand]
-Sequences known to belong to this class detected by the pattern: ALL active CAs.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: Most prokaryotic CAs as well as plant chloroplast CAs belong to another, evolutionarily distinct family of proteins, the beta-family (see <PDOC00586>).
-Last update: August 2005 / Text revised; profile added.

[ 1] Deutsch H.F. "Carbonic anhydrases." Int. J. Biochem. 19:101-113(1987). PubMed=3106115
[ 2] Fernley R.T. "Non-cytoplasmic carbonic anhydrases." Trends Biochem.
Sci. 13:356-359(1988). PubMed=3149805 [ 3] Tashian R.E. "The carbonic anhydrases: widening perspectives on their evolution, expression and function." BioEssays 10:186-192(1989). PubMed=2500929 [ 4] Edwards Y. "Structure and expression of mammalian carbonic anhydrases." Biochem. Soc. Trans. 18:171-175(1990). PubMed=2116334 [ 5] Hewett-Emmett D., Tashian R.E. "Functional diversity, conservation, and convergence in the evolution of the alpha-, beta-, and gamma-carbonic anhydrase gene families." Mol. Phylogenet. Evol. 5:50-77(1996). PubMed=8673298 [ 6] Leggat W., Dixon R., Saleh S., Yellowlees D. "A novel carbonic anhydrase from the giant clam Tridacna gigas contains two carbonic anhydrase domains." FEBS J. 272:3297-3305(2005). PubMed=15978036; DOI=10.1111/j.1742-4658.2005.04742.x [ 7] Premkumar L., Greenblatt H.M., Bageshwar U.K., Savchenko T., Gokhman I., Sussman J.L., Zamir A. "Three-dimensional structure of a halotolerant algal carbonic anhydrase predicts halotolerance of a mammalian homolog." Proc. Natl. Acad. Sci. U.S.A. 102:7493-7498(2005). PubMed=15894606; DOI=10.1073/pnas.0502829102 [ 8] Fujiwara S., Fukuzawa H., Tachiki A., Miyachi S. "Structure and differential expression of two genes encoding carbonic anhydrase in Chlamydomonas reinhardtii." Proc. Natl. Acad. Sci. U.S.A. 87:9779-9783(1990). PubMed=2124702 [ 9] Huang S., Xue Y., Sauer-Eriksson E., Chirica L., Lindskog S., Jonsson B.H. "Crystal structure of carbonic anhydrase from Neisseria gonorrhoeae and its complex with the inhibitor acetazolamide." J. Mol. Biol. 283:301-310(1998). PubMed=9761692 [10] Lindskog S. "Structure and mechanism of carbonic anhydrase." Pharmacol. Ther. 74:1-20(1997). PubMed=9336012 [11] Whittington D.A., Waheed A., Ulmasov B., Shah G.N., Grubb J.H., Sly W.S., Christianson D.W. "Crystal structure of the dimeric extracellular domain of human carbonic anhydrase XII, a bitopic membrane protein overexpressed in certain cancer tumor cells." Proc. Natl. Acad. Sci. U.S.A. 
98:9545-9550(2001). PubMed=11493685; DOI=10.1073/pnas.161301298
{END}
{PDOC00147}
{PS00163; FUMARATE_LYASES}
{BEGIN}
*****************************
* Fumarate lyases signature *
*****************************

A number of enzymes, belonging to the lyase class, for which fumarate is a substrate have been shown [1,2] to share a short conserved sequence around a methionine which is probably involved in the catalytic activity of this type of enzyme. These enzymes are:

- Fumarase (EC 4.2.1.2) (fumarate hydratase), which catalyzes the reversible hydration of fumarate to L-malate. There seem to be 2 classes of fumarases: class I are thermolabile dimeric enzymes (as for example: Escherichia coli fumA and fumB); class II enzymes are thermostable and tetrameric and are found in prokaryotes (as for example: Escherichia coli fumC) as well as in eukaryotes. The sequences of the two classes of fumarases are not closely related.
- Aspartate ammonia-lyase (EC 4.3.1.1) (aspartase), which catalyzes the reversible conversion of aspartate to fumarate and ammonia. This reaction is analogous to that catalyzed by fumarase, except that ammonia rather than water is involved in the trans-elimination reaction.
- Arginosuccinase (EC 4.3.2.1) (argininosuccinate lyase), which catalyzes the formation of arginine and fumarate from argininosuccinate, the last step in the biosynthesis of arginine.
- Adenylosuccinase (EC 4.3.2.2) (adenylosuccinate lyase) [3], which catalyzes the eighth step in the de novo biosynthesis of purines, the formation of 5'-phosphoribosyl-5-amino-4-imidazolecarboxamide and fumarate from 1-(5-phosphoribosyl)-4-(N-succino-carboxamide). That enzyme can also catalyze the formation of fumarate and AMP from adenylosuccinate.
- Pseudomonas putida 3-carboxy-cis,cis-muconate cycloisomerase (EC 5.5.1.2) (3-carboxymuconate lactonizing enzyme) (gene pcaB) [4], an enzyme involved in aromatic acids catabolism.

-Consensus pattern: G-S-x(2)-M-x-{RS}-K-x-N
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 8.
-Last update: December 2004 / Pattern and text revised.

[ 1] Woods S.A., Schwartzbach S.D., Guest J.R. "Two biochemically distinct classes of fumarase in Escherichia coli." Biochim. Biophys. Acta 954:14-26(1988). PubMed=3282546
[ 2] Woods S.A., Miles J.S., Guest J.R. FEMS Microbiol. Lett. 51:181-186(1988).
[ 3] Zalkin H., Dixon J.E. "De novo purine nucleotide biosynthesis." Prog. Nucleic Acid Res. Mol. Biol. 42:259-287(1992). PubMed=1574589
[ 4] Williams S.E., Woolridge E.M., Ransom S.C., Landro J.A., Babbitt P.C., Kozarich J.W. "3-Carboxy-cis,cis-muconate lactonizing enzyme from Pseudomonas putida is homologous to the class II fumarase family: a new reaction in the evolution of a mechanistic motif." Biochemistry 31:9768-9776(1992). PubMed=1390752
{END}
{PDOC00148}
{PS00164; ENOLASE}
{BEGIN}
*********************
* Enolase signature *
*********************

Enolase (EC 4.2.1.11) is a glycolytic enzyme that catalyzes the dehydration of 2-phospho-D-glycerate to phosphoenolpyruvate [1]. It is a dimeric enzyme that requires magnesium both for catalysis and stabilizing the dimer. Enolase is probably found in all organisms that metabolize sugars. In vertebrates, there are three different tissue-specific isozymes: alpha present in most tissues, beta in muscles and gamma found only in nervous tissues.

Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown [2] to be evolutionarily related to enolase.

As a signature pattern for enolase, we selected the best conserved region; it is located in the C-terminal third of the sequence.

-Consensus pattern: [LIVTMS]-[LIVP]-[LIV]-[KQ]-x-[ND]-Q-[INV]-[GA]-[ST]-[LIVM]-[STL]-[DERKAQG]-[STA]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Lebioda L., Stec B., Brewer J.M. "The structure of yeast enolase at 2.25-A resolution. An 8-fold beta + alpha-barrel with a novel beta beta alpha alpha (beta alpha)6 topology." J. Biol. Chem. 264:3685-3693(1989). PubMed=2645275
[ 2] Wistow G., Piatigorsky J. "Recruitment of enzymes as lens structural proteins." Science 236:1554-1556(1987). PubMed=3589669
{END}
{PDOC00149}
{PS00165; DEHYDRATASE_SER_THR}
{BEGIN}
*********************************************************************
* Serine/threonine dehydratases pyridoxal-phosphate attachment site *
*********************************************************************

Serine and threonine dehydratases [1,2] are functionally and structurally related pyridoxal-phosphate dependent enzymes:

- L-serine dehydratase (EC 4.3.1.17) and D-serine dehydratase (EC 4.3.1.18) catalyze the dehydration of L-serine (respectively D-serine) into ammonia and pyruvate.
- Threonine dehydratase (EC 4.3.1.19) (TDH) catalyzes the dehydration of threonine into alpha-ketobutyrate and ammonia. In Escherichia coli and other microorganisms, two classes of TDH are known to exist. One is involved in the biosynthesis of isoleucine, the other in hydroxyamino acid catabolism.

Threonine synthase (EC 4.2.3.1) is also a pyridoxal-phosphate enzyme; it catalyzes the transformation of homoserine-phosphate into threonine. It has been shown [3] that threonine synthase is distantly related to the serine/threonine dehydratases.

In all these enzymes, the pyridoxal-phosphate group is attached to a lysine residue. The sequence around this residue is sufficiently conserved to allow the derivation of a pattern specific to serine/threonine dehydratases and threonine synthases.

-Consensus pattern: [DESH]-x(4,5)-[STVG]-{EVKD}-[AS]-[FYI]-K-[DLIFSA]-[RLVMF]-[GA]-[LIVMGA]
                    [The K is the pyridoxal-P attachment site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 17.
-Note: Some bacterial L-serine dehydratases - such as those from Escherichia coli - are iron-sulfur proteins [4] and do not belong to this family.
-Last update: December 2004 / Pattern and text revised.

[ 1] Ogawa H., Gomi T., Konishi K., Date T., Nakashima H., Nose K., Matsuda Y., Peraino C., Pitot H.C., Fujioka M. "Human liver serine dehydratase. cDNA cloning and sequence homology with hydroxyamino acid dehydratases from other sources." J. Biol. Chem. 264:15818-15823(1989). PubMed=2674117
[ 2] Datta P., Goss T.J., Omnaas J.R., Patil R.V. "Covalent structure of biodegradative threonine dehydratase of Escherichia coli: homology with other dehydratases." Proc. Natl. Acad. Sci. U.S.A. 84:393-397(1987). PubMed=3540965
[ 3] Parsot C. "Evolution of biosynthetic pathways: a common ancestor for threonine synthase, threonine dehydratase and D-serine dehydratase." EMBO J. 5:3013-3019(1986). PubMed=3098560
[ 4] Grabowski R., Hofmeister A.E.M., Buckel W. "Bacterial L-serine dehydratases: a new family of enzymes containing iron-sulfur clusters." Trends Biochem. Sci. 18:297-300(1993). PubMed=8236444
{END}
{PDOC00150}
{PS00166; ENOYL_COA_HYDRATASE}
{BEGIN}
*******************************************
* Enoyl-CoA hydratase/isomerase signature *
*******************************************

Enoyl-CoA hydratase (EC 4.2.1.17) (ECH) [1] and D3,D2-enoyl-CoA isomerase (EC 5.3.3.8) (ECI) [2] are two enzymes involved in fatty acid metabolism. ECH catalyzes the hydration of 2-trans-enoyl-CoA into 3-hydroxyacyl-CoA and ECI shifts the 3- double bond of the intermediates of unsaturated fatty acid oxidation to the 2-trans position.

Most eukaryotic cells have two fatty-acid beta-oxidation systems, one located in mitochondria and the other in peroxisomes.
In mitochondria, ECH and ECI are separate yet structurally related monofunctional enzymes. Peroxisomes contain a trifunctional enzyme [3] consisting of an N-terminal domain that bears both ECH and ECI activity, and a C-terminal domain responsible for 3-hydroxyacyl-CoA dehydrogenase (HCDH) activity.

In Escherichia coli (gene fadB) and Pseudomonas fragi (gene faoA), ECH and ECI are also part of a multifunctional enzyme which contains both a HCDH and a 3-hydroxybutyryl-CoA epimerase domain [4].

A number of other proteins have been found to be evolutionarily related to the ECH/ECI enzymes or domains:

- 3-hydroxybutyryl-CoA dehydratase (EC 4.2.1.55) (crotonase), a bacterial enzyme involved in the butyrate/butanol-producing pathway.
- Naphthoate synthase (EC 4.1.3.36) (DHNA synthetase) (gene menB) [5], a bacterial enzyme involved in the biosynthesis of menaquinone (vitamin K2). DHNA synthetase converts O-succinyl-benzoyl-CoA (OSB-CoA) to 1,4-dihydroxy-2-naphthoic acid (DHNA).
- 4-chlorobenzoate dehalogenase (EC 3.8.1.6) [6], a Pseudomonas enzyme which catalyzes the conversion of 4-chlorobenzoate-CoA to 4-hydroxybenzoate-CoA.
- A Rhodobacter capsulatus protein of unknown function (ORF257) [7].
- Bacillus subtilis putative polyketide biosynthesis proteins pksH and pksI.
- Escherichia coli carnitine racemase (gene caiD) [8].
- Escherichia coli hypothetical protein ygfG.
- Yeast hypothetical protein YDR036c.

As a signature pattern for these enzymes, we selected a conserved region rich in glycine and hydrophobic residues.

-Consensus pattern: [LIVM]-[STAG]-x-[LIVM]-[DENQRHSTA]-G-x(3)-[AG](3)-x(4)-[LIVMST]-x-[CSTA]-[DQHP]-[LIVMFYA]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 5.
-Expert(s) to contact by email: Hofmann K.; [email protected]
-Last update: December 2004 / Pattern and text revised.

[ 1] Minami-Ishii N., Taketani S., Osumi T., Hashimoto T. Eur. J. Biochem. 185:73-78(1989).
[ 2] Mueller-Newen G., Stoffel W.
Biol. Chem. Hoppe-Seyler 372:613-624(1991).
[ 3] Palosaari P.M., Hiltunen J.K. "Peroxisomal bifunctional protein from rat liver is a trifunctional enzyme possessing 2-enoyl-CoA hydratase, 3-hydroxyacyl-CoA dehydrogenase, and delta 3, delta 2-enoyl-CoA isomerase activities." J. Biol. Chem. 265:2446-2449(1990). PubMed=2303409
[ 4] Nakahigashi K., Inokuchi H. "Nucleotide sequence of the fadA and fadB genes from Escherichia coli." Nucleic Acids Res. 18:4937-4937(1990). PubMed=2204034
[ 5] Driscoll J.R., Taber H.W. "Sequence organization and regulation of the Bacillus subtilis menBE operon." J. Bacteriol. 174:5063-5071(1992). PubMed=1629163
[ 6] Babbitt P.C., Kenyon G.L., Martin B.M., Charest H., Slyvestre M., Scholten J.D., Chang K.-H., Liang P.-H., Dunaway-Mariano D. "Ancestry of the 4-chlorobenzoate dehalogenase: analysis of amino acid sequence identities among families of acyl:adenyl ligases, enoyl-CoA hydratases/isomerases, and acyl-CoA thioesterases." Biochemistry 31:5594-5604(1992). PubMed=1351742
[ 7] Beckman D.L., Kranz R.G. "A bacterial homolog to the mitochondrial enoyl-CoA hydratase." Gene 107:171-172(1991). PubMed=1743516
[ 8] Eichler K., Bourgis F., Buchet A., Kleber H.-P., Mandrand-Berthelot M.-A. "Molecular characterization of the cai operon necessary for carnitine metabolism in Escherichia coli." Mol. Microbiol. 13:775-786(1994). PubMed=7815937
{END}
{PDOC00151}
{PS00167; TRP_SYNTHASE_ALPHA}
{BEGIN}
*********************************************
* Tryptophan synthase alpha chain signature *
*********************************************

Tryptophan synthase (EC 4.2.1.20) catalyzes the last step in the biosynthesis of tryptophan: the conversion of indoleglycerol phosphate and serine to tryptophan and glyceraldehyde 3-phosphate [1,2]. It has two functional domains: one for the aldol cleavage of indoleglycerol phosphate to indole and glyceraldehyde 3-phosphate and the other for the synthesis of tryptophan from indole and serine. In bacteria and plants [3], each domain is found on a separate subunit (alpha and beta chains), while in fungi the two domains are fused together on a single multifunctional protein.

As a signature pattern for the alpha chain, we selected a conserved region that contains three conserved acidic residues. The first and the third acidic residues are believed to serve as proton donors/acceptors in the enzyme's catalytic mechanism.

-Consensus pattern: [LIVM]-E-[LIVM]-[GQ]-x(2)-[FYCHTWP]-[STPK]-[DEKY]-[PA]-[LIVMYGK]-[SGALIMY]-[DE]-[GN]
                    [The first E and the second D/E are active site residues]
-Sequences known to belong to this class detected by the pattern: ALL, except for the Sulfolobus solfataricus enzyme which is highly divergent.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Crawford I.P. "Evolution of a biosynthetic pathway: the tryptophan paradigm." Annu. Rev. Microbiol. 43:567-600(1989). PubMed=2679363; DOI=10.1146/annurev.mi.43.100189.003031
[ 2] Hyde C.C., Miles E.W. "The tryptophan synthase multienzyme complex: exploring structure-function relationships with X-ray crystallography and mutagenesis." Biotechnology (N.Y.) 8:27-32(1990). PubMed=1366510
[ 3] Berlyn M.B., Last R.L., Fink G.R. "A gene encoding the tryptophan synthase beta subunit of Arabidopsis thaliana." Proc. Natl. Acad. Sci. U.S.A. 86:4604-4608(1989). PubMed=2734310
{END}
{PDOC00152}
{PS00168; TRP_SYNTHASE_BETA}
{BEGIN}
**********************************************************************
* Tryptophan synthase beta chain pyridoxal-phosphate attachment site *
**********************************************************************

Tryptophan synthase (EC 4.2.1.20) catalyzes the last step in the biosynthesis of tryptophan: the conversion of indoleglycerol phosphate and serine to tryptophan and glyceraldehyde 3-phosphate [1,2]. It has two functional domains: one for the aldol cleavage of indoleglycerol phosphate to indole and glyceraldehyde 3-phosphate and the other for the synthesis of tryptophan from indole and serine. In bacteria and plants [3], each domain is found on a separate subunit (alpha and beta chains), while in fungi the two domains are fused together on a single multifunctional protein.

The beta chain of the enzyme requires pyridoxal-phosphate as a cofactor. The pyridoxal-phosphate group is attached to a lysine residue. The region around this lysine residue also contains two histidine residues which are part of the pyridoxal-phosphate binding site. The signature pattern for the tryptophan synthase beta chain is derived from that conserved region.

-Consensus pattern: [LIVMYAHQ]-x-[HPYNVF]-x-G-[STA]-H-K-x-N-x(2)-[LIVM]-x-[QEH]
                    [K is the pyridoxal-P attachment site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Crawford I.P. "Evolution of a biosynthetic pathway: the tryptophan paradigm." Annu. Rev. Microbiol. 43:567-600(1989). PubMed=2679363; DOI=10.1146/annurev.mi.43.100189.003031
[ 2] Hyde C.C., Miles E.W. "The tryptophan synthase multienzyme complex: exploring structure-function relationships with X-ray crystallography and mutagenesis." Biotechnology (N.Y.) 8:27-32(1990). PubMed=1366510
[ 3] Berlyn M.B., Last R.L., Fink G.R. "A gene encoding the tryptophan synthase beta subunit of Arabidopsis thaliana." Proc. Natl. Acad. Sci. U.S.A. 86:4604-4608(1989). PubMed=2734310
{END}
{PDOC00153}
{PS00169; D_ALA_DEHYDRATASE}
{BEGIN}
*****************************************************
* Delta-aminolevulinic acid dehydratase active site *
*****************************************************

Delta-aminolevulinic acid dehydratase (EC 4.2.1.24) (ALAD) [1] catalyzes the second step in the biosynthesis of heme, the condensation of two molecules of 5-aminolevulinate to form porphobilinogen. The enzyme is an oligomer composed of eight identical subunits. Each of the subunits binds an atom of zinc or of magnesium (in plants).
A lysine has been implicated in the catalytic mechanism [2]. The sequence of the region in the vicinity of the active site residue is conserved in ALAD from various prokaryotic and eukaryotic species.

-Consensus pattern: G-x-D-x-[LIVM](2)-[IV]-K-P-[GSA]-x(2)-Y
                    [K is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: November 1995 / Pattern and text revised.

[ 1] Li J.-M., Russell C.S., Cosloy S.D. "The structure of the Escherichia coli hemB gene." Gene 75:177-184(1989). PubMed=2656410
[ 2] Gibbs P.N.B., Jordan P.M. "Identification of lysine at the active site of human 5-aminolaevulinate dehydratase." Biochem. J. 236:447-451(1986). PubMed=3092810
{END}
{PDOC00154}
{PS00170; CSA_PPIASE_1}
{PS50072; CSA_PPIASE_2}
{BEGIN}
****************************************************************************
* Cyclophilin-type peptidyl-prolyl cis-trans isomerase signature & profile *
****************************************************************************

Cyclophilin [1] is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA). It exhibits a peptidyl-prolyl cis-trans isomerase activity (EC 5.2.1.8) (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalyzing the cis-trans isomerization of proline imidic peptide bonds in oligopeptides [2].
It is probable that CSA mediates some of its effects via an inhibitory action on PPIase. Cyclophilin is a cytosolic protein which belongs to a family [3,4,5] that also includes the following isozymes: - Cyclophilin B (or S-cyclophilin), a PPIase which is retained in an endoplasmic reticulum compartment. - Cyclophilin C, a cytoplasmic PPIase. - Mitochondrial matrix cyclophilin (cyp3). - A PPIase which seems specific for the folding of rhodopsin and is an integral membrane protein anchored by a C-terminal transmembrane region. This protein was first characterized in Drosophila (gene ninaA). - Bacterial periplasmic PPIase (gene ppiA). - Bacterial cytosolic PPIase (gene ppiB). - Natural-killer cell cyclophilin-related protein. This large protein (about 160 Kd) is a component of a putative tumor-recognition complex involved in the function of NK cells. It contains a cyclophilin-type PPIase domain. - Mammalian nucleoporin Nup358 [6], a nuclear pore complex protein of 358 Kd that contains a C-terminal cyclophilin-type PPIase domain. - Yeast hypothetical protein YJR032w. - Fission yeast hypothetical protein SpAC21E11.05c. - Caenorhabditis elegans hypothetical protein T27D1.1. The sequences of the different forms of cyclophilin-type PPIases are well conserved. As a signature pattern, we selected a conserved region in the central part of these enzymes. -Consensus pattern: [FY]-x(2)-[STCNLVA]-x-[FV]-H-[RH]-[LIVMNS]-[LIVM]-x(2)-F-[LIVM]-x-Q-[AGFT]-G -Sequences known to belong to this class detected by the pattern: ALL, except for 7 sequences. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: FKBP's, a family of proteins that bind the immunosuppressive drug FK506, are also PPIases, but their sequence is not at all related to that of cyclophilin (see <PDOC00426>). -Last update: December 2004 / Pattern and text revised.
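As an illustration of how a consensus pattern written in this notation can be applied to a sequence, the following is a minimal Python sketch, not part of PROSITE itself. It assumes only the basic elements of the notation seen in these entries: 'x' for any residue, [..] for allowed residues, {..} for excluded residues, and (n) or (n,m) for repetition. The function name prosite_to_regex and the toy peptide are hypothetical and chosen for illustration; anchors ('<' and '>') are not handled.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex.

    Illustrative helper (hypothetical name). Handles x, [..], {..},
    and the repetition suffixes (n) and (n,m) only.
    """
    regex_parts = []
    for element in pattern.strip().split("-"):
        # Split an element such as "x(2)" or "[LIVM](2)" into its core
        # and an optional repetition suffix.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()

        if core == "x":
            core_regex = "."                      # any residue
        elif core.startswith("{") and core.endswith("}"):
            core_regex = "[^" + core[1:-1] + "]"  # excluded residues
        else:
            core_regex = core                     # single residue or [..]

        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{%s}" % low
        else:
            repeat = "{%s,%s}" % (low, high)

        regex_parts.append(core_regex + repeat)
    return "".join(regex_parts)


# Pattern of the delta-aminolevulinic acid dehydratase entry (PS00169).
ALAD_PATTERN = "G-x-D-x-[LIVM](2)-[IV]-K-P-[GSA]-x(2)-Y"

# Toy peptide invented for this example; it is not a real ALAD sequence.
toy_sequence = "MKT" + "GADALLIKPGAAY" + "QRS"

alad_regex = prosite_to_regex(ALAD_PATTERN)
hit = re.search(alad_regex, toy_sequence)
if hit is not None:
    print("regex:", alad_regex)
    print("match at residues %d-%d: %s" % (hit.start() + 1, hit.end(), hit.group()))
```

Translating to a regular expression keeps the sketch short and relies only on the standard library. A production scanner would also need to deal with anchors, overlapping matches and ambiguous residue codes, none of which are attempted here.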
[ 1] Stamnes M.A., Rutherford S.L., Zuker C.S. "Cyclophilins: a new family of proteins involved in intracellular folding." Trends Cell Biol. 2:272-276(1992). PubMed=14731520 [ 2] Fischer G., Schmid F.X. "The mechanism of protein folding. Implications of in vitro refolding models for de novo protein folding and translocation in the cell." Biochemistry 29:2205-2212(1990). PubMed=2186809 [ 3] Trandinh C.C., Pao G.M., Saier M.H. Jr. "Structural and evolutionary relationships among the immunophilins: two ubiquitous families of peptidyl-prolyl cis-trans isomerases." FASEB J. 6:3410-3420(1992). PubMed=1464374 [ 4] Galat A. "Peptidylproline cis-trans-isomerases: immunophilins." Eur. J. Biochem. 216:689-707(1993). PubMed=8404888 [ 5] Hacker J., Fischer G. "Immunophilins: structure-function relationship and possible role in microbial pathogenicity." Mol. Microbiol. 10:445-456(1993). PubMed=7526121 [ 6] Wu J., Matunis M.J., Kraemer D., Blobel G., Coutavas E. "Nup358, a cytoplasmically exposed nucleoporin with peptide repeats, Ran-GTP binding sites, zinc fingers, a cyclophilin A homologous domain, and a leucine-rich region." J. Biol. Chem. 270:14209-14213(1995). PubMed=7775481 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00155} {PS00171; TIM_1} {PS51440; TIM_2} {BEGIN} **************************************************************** * Triosephosphate isomerase (TIM) family signature and profile * **************************************************************** Triosephosphate isomerase (EC 5.3.1.1) (TIM) [1] is the glycolytic enzyme that catalyzes the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. TIM plays an important role in several metabolic pathways and is essential for efficient energy production. It is present in eukaryotes as well as in prokaryotes. TIM is a dimer of identical subunits, each of which is made up of about 250 amino-acid residues. A glutamic acid and a histidine residue are involved in the catalytic mechanism [2,3]. The tertiary structure of TIM has eight beta/alpha motifs folded into a barrel structure (see <PDB:1NEY>). The TIM barrel fold occurs ubiquitously and is found in numerous other enzymes that can be involved in energy metabolism, macromolecule metabolism, or small molecule metabolism [4]. The sequence around the active site residue is strongly conserved in all known TIM's and can be used as a signature pattern for this type of enzyme. We also developed a profile that covers the entire TIM structure. -Consensus pattern: [AVG]-[YLV]-E-P-[LIVMEPKST]-[WYEAS]-[SAL]-[IV]-[GN]-[TEKDVS]-[GKNAD] [E is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: March 2009 / Text revised; profile added. [ 1] Lolis E., Alber T., Davenport R.C., Rose D., Hartman F.C., Petsko G.A. "Structure of yeast triosephosphate isomerase at 1.9-A resolution." Biochemistry 29:6609-6618(1990). PubMed=2204417 [ 2] Knowles J.R.
"Enzyme catalysis: not different, just better." Nature 350:121-124(1991). PubMed=2005961; DOI=10.1038/350121a0 [ 3] Jogl G., Rozovsky S., McDermott A.E., Tong L. "Optimal alignment for enzymatic proton transfer: structure of the Michaelis complex of triosephosphate isomerase at 1.2-A resolution." Proc. Natl. Acad. Sci. U.S.A. 100:50-55(2003). PubMed=12509510; DOI=10.1073/pnas.0233793100 [ 4] Nagano N., Orengo C.A., Thornton J.M. "One fold with many functions: the evolutionary relationships between TIM barrel families based on their sequences, structures and functions." J. Mol. Biol. 321:741-765(2002). PubMed=12206759 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00156} {PS51415; XYLOSE_ISOMERASE} {BEGIN} *********************************** * Xylose isomerase family profile * *********************************** Xylose isomerase (EC 5.3.1.5) [1] is an enzyme found in microorganisms which catalyzes the interconversion of an aldo sugar D-xylose to a keto sugar D-xylulose. It can also isomerize D-ribose to D-ribulose and Dglucose to D-fructose. Xylose isomerase seems to require magnesium for its activity, while cobalt is necessary to stabilize the tetrameric structure of the enzyme. Xylose isomerase also exists in plants [2] where it is manganesedependent. The enzyme has also been found in anaerobic fungi [3]. A number of residues are conserved in all known xylose isomerases. 
A histidine in the N-terminal section of the enzyme has been shown [4] to be involved in the catalytic mechanism of the enzyme. Two glutamate residues, a histidine and four aspartate residues are the metal-binding sites that bind two ions of magnesium, cobalt, or manganese [5-7]. Three-dimensional structures of xylose isomerases show that each subunit contains a common alpha/beta-barrel fold (see <PDB:2GLK; A>) [7] similar to that of other divalent metal-dependent TIM barrel enzymes, such as rhamnose isomerase [8] and endonuclease 4 (see <PDOC00599>) [1,5,6]. The smaller C-terminal part forms an extended helical fold that seems to be implicated in multimerization. We have developed a profile that covers the entire xylose isomerase structure. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Jenkins J.; [email protected] -Last update: February 2009 / Text revised; profile added; patterns deleted. [ 1] Dauter Z., Dauter M., Hemker J., Witzel H., Wilson K.S. "Crystallisation and preliminary analysis of glucose isomerase from Streptomyces albus." FEBS Lett. 247:1-8(1989). PubMed=2651156 [ 2] Kristo P.A., Saarelainen R., Fagerstrom R., Aho S., Korhola M. "Protein purification, and cloning and characterization of the cDNA and gene for xylose isomerase of barley." Eur. J. Biochem. 237:240-246(1996). PubMed=8620879 [ 3] Harhangi H.R., Akhmanova A.S., Emmens R., van der Drift C., de Laat W.T., van Dijken J.P., Jetten M.S., Pronk J.T., Op den Camp H.J. "Xylose metabolism in the anaerobic fungus Piromyces sp. strain E2 follows the bacterial pathway." Arch. Microbiol. 180:134-141(2003). PubMed=12811467; DOI=10.1007/s00203-003-0565-0 [ 4] Vangrysperre W., Ampe C., Kersters-Hilderson H., Tempst P. "Single active-site histidine in D-xylose isomerase from Streptomyces violaceoruber. Identification by chemical derivatization and peptide mapping." Biochem. J. 263:195-199(1989).
PubMed=2604694 [ 5] Henrick K., Collyer C.A., Blow D.M. "Structures of D-xylose isomerase from Arthrobacter strain B3728 containing the inhibitors xylitol and D-sorbitol at 2.5 A and 2.3 A resolution, respectively." J. Mol. Biol. 208:129-157(1989). PubMed=2769749 [ 6] Chang C., Park B.C., Lee D.S., Suh S.W. "Crystal structures of thermostable xylose isomerases from Thermus caldophilus and Thermus thermophilus: possible structural determinants of thermostability." J. Mol. Biol. 288:623-634(1999). PubMed=10329168; DOI=10.1006/jmbi.1999.2696 [ 7] Katz A.K., Li X., Carrell H.L., Hanson B.L., Langan P., Coates L., Schoenborn B.P., Glusker J.P., Bunick G.J. "Locating active-site hydrogen atoms in D-xylose isomerase: time-of-flight neutron diffraction." Proc. Natl. Acad. Sci. U.S.A. 103:8342-8347(2006). PubMed=16707576; DOI=10.1073/pnas.0602598103 [ 8] Korndoerfer I.P., Fessner W.D., Matthews B.W. "The structure of rhamnose isomerase from Escherichia coli and its relation with xylose isomerase illustrates a change between inter and intra-subunit complementation during evolution." J. Mol. Biol. 300:917-933(2000). PubMed=10891278; DOI=10.1006/jmbi.2000.3896 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00157} {PS00765; P_GLUCOSE_ISOMERASE_1} {PS00174; P_GLUCOSE_ISOMERASE_2} {PS51463; P_GLUCOSE_ISOMERASE_3} {BEGIN} ********************************************************************* * Glucose-6-phosphate isomerase (GPI) family signatures and profile * ********************************************************************* Glucose-6-phosphate isomerase (GPI) (EC 5.3.1.9) or phosphoglucose isomerase (PGI) [1,2] is a dimeric enzyme that catalyzes the reversible isomerization of glucose-6-phosphate and fructose-6-phosphate. PGI is involved in different pathways: in most higher organisms it is involved in glycolysis; in mammals it is involved in gluconeogenesis; in plants in carbohydrate biosynthesis; in some bacteria it provides a gateway for fructose into the Entner-Doudoroff pathway. Besides its role as a glycolytic enzyme, mammalian PGI can function as a tumor-secreted cytokine and an angiogenic factor (AMF) that stimulates endothelial cell motility. Mammalian PGI is also neuroleukin [3], a neurotrophic factor which supports the survival of various types of neurons. The sequence of PGI is conserved among diverse species ranging from bacteria to mammals and structures form a similar fold (see <PDB:1IAT>) [4,5], comprised of two subdomains that each form an alpha-beta-alpha sandwich, with the active site located in the cleft between the subdomains and on the dimer interface. A glutamate and a lysine residue as well as a histidine from the other protomer in the dimer are implicated in the catalytic mechanism. The structure resembles that of the SIS domain (see <PDOC51464>). As signature patterns for this enzyme we selected two conserved regions: the first is located in the central section of PGI, while the second is located in its C-terminal section. We also developed a profile that covers the entire PGI.
-Consensus pattern: [DENSA]-x-[LIVM]-[GP]-G-R-[FY]-[ST]-[LIVMFSTAP]-x-[GSTA]-[PSTACM]-[LIVMSA]-[GSAN] -Sequences known to belong to this class detected by the pattern: ALL, except for PCC 6803 PGI. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: [GSA]-x-[LIVMCAYQS]-[LIVMFYWN]-x(4)-[FY]-[DNTH]-Q-x-[GA]-[IV]-[EQST]-x(2)-K [K is the active site residue] -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: September 2009 / Text revised; profile added. [ 1] Achari A., Marshall S.E., Muirhead H., Palmieri R.H., Noltmann E.A. "Glucose-6-phosphate isomerase." Philos. Trans. R. Soc. Lond., B, Biol. Sci. 293:145-157(1981). PubMed=6115414 [ 2] Smith M.W., Doolittle R.F. "Anomalous phylogeny involving the enzyme glucose-6-phosphate isomerase." J. Mol. Evol. 34:544-545(1992). PubMed=1593646 [ 3] Faik P., Walker J.I.H., Redmill A.A.M., Morgan M.J. "Mouse glucose-6-phosphate isomerase and neuroleukin have identical 3' sequences." Nature 332:455-457(1988). PubMed=3352745; DOI=10.1038/332455a0 [ 4] Read J., Pearce J., Li X., Muirhead H., Chirgwin J., Davies C. "The crystal structure of human phosphoglucose isomerase at 1.6 A resolution: implications for catalytic mechanism, cytokine activity and haemolytic anaemia." J. Mol. Biol. 309:447-463(2001). PubMed=11371164; DOI=10.1006/jmbi.2001.4680 [ 5] Yamamoto H., Miwa H., Kunishima N. "Crystal structure of glucose-6-phosphate isomerase from Thermus thermophilus HB8 showing a snapshot of active dimeric state." J. Mol. Biol. 382:747-762(2008). PubMed=18675274; DOI=10.1016/j.jmb.2008.07.041 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement.
For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00158} {PS00175; PG_MUTASE} {BEGIN} ************************************************************* * Phosphoglycerate mutase family phosphohistidine signature * ************************************************************* Phosphoglycerate mutase (EC 5.4.2.1) (PGAM) and bisphosphoglycerate mutase (EC 5.4.2.4) (BPGM) are structurally related enzymes which catalyze reactions involving the transfer of phospho groups between the three carbon atoms of phosphoglycerate [1,2]. Both enzymes can catalyze three different reactions, although in different proportions: - The isomerization of 2-phosphoglycerate (2-PGA) to 3-phosphoglycerate (3-PGA) with 2,3-diphosphoglycerate (2,3-DPG) as the primer of the reaction. - The synthesis of 2,3-DPG from 1,3-DPG with 3-PGA as a primer. - The degradation of 2,3-DPG to 3-PGA (phosphatase EC 3.1.3.13 activity). In mammals, PGAM is a dimeric protein. There are two isoforms of PGAM: the M (muscle) and B (brain) forms. In yeast, PGAM is a tetrameric protein. BPGM is a dimeric protein and is found mainly in erythrocytes where it plays a major role in regulating hemoglobin oxygen affinity as a consequence of controlling 2,3-DPG concentration. The catalytic mechanism of both PGAM and BPGM involves the formation of a phosphohistidine intermediate [3]. The bifunctional enzyme 6-phosphofructo-2-kinase / fructose-2,6-bisphosphatase (EC 2.7.1.105 and EC 3.1.3.46) (PF2K) [4] catalyzes both the synthesis and the degradation of fructose-2,6-bisphosphate. PF2K is an important enzyme in the regulation of hepatic carbohydrate metabolism. Like PGAM/BPGM, the fructose-2,6-bisphosphatase reaction involves a phosphohistidine intermediate and the phosphatase domain of PF2K is structurally related to PGAM/BPGM.
The bacterial enzyme alpha-ribazole-5'-phosphate phosphatase (gene cobC) which is involved in cobalamin biosynthesis also belongs to this family [5]. We built a signature pattern around the phosphohistidine residue. -Consensus pattern: [LIVM]-x-R-H-G-[EQ]-x-{Y}-x-N [H is the phosphohistidine residue] -Sequences known to belong to this class detected by the pattern: ALL, except for Haemophilus influenzae PGAM. -Other sequence(s) detected in Swiss-Prot: 2. -Note: Some organisms harbor a form of PGAM independent of 2,3-DPG, this enzyme is not related to the family described above [6]. -Last update: December 2004 / Pattern and text revised. [ 1] Le Boulch P., Joulin V., Garel M.-C., Rosa J., Cohen-Solal M. Biochem. Biophys. Res. Commun. 156:874-881(1988). [ 2] White M.F., Fothergill-Gilmore L.A. "Sequence of the gene encoding phosphoglycerate mutase from Saccharomyces cerevisiae." FEBS Lett. 229:383-387(1988). PubMed=2831102 [ 3] Rose Z.B. Methods Enzymol. 87:43-51(1982). [ 4] Bazan J.F., Fletterick R.J., Pilkis S.J. "Evolution of a bifunctional enzyme: 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase." Proc. Natl. Acad. Sci. U.S.A. 86:9642-9646(1989). PubMed=2557623 [ 5] O'Toole G.A., Trzebiatowski J.R., Escalante-Semerena J.C. J. Biol. Chem. 269:26503-26511(1994). [ 6] Grana X., de Lecea L., el-Maghrabi M.R., Urena J.M., Caellas C., Carreras J., Puigdomenech P., Pilkis S.J., Climent F. "Cloning and sequencing of a cDNA encoding 2,3-bisphosphoglycerate-independent phosphoglycerate mutase from maize. Possible relationship to the alkaline phosphatase family." J. Biol. Chem. 267:12797-12803(1992). PubMed=1535626 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. 
For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00159} {PS00176; TOPOISOMERASE_I_EUK} {BEGIN} ********************************************** * Eukaryotic DNA topoisomerase I active site * ********************************************** DNA topoisomerase I (EC 5.99.1.2) [1,2,3,4] is one of the two types of enzyme that catalyze the interconversion of topological DNA isomers. Type I topoisomerases act by catalyzing the transient breakage of DNA, one strand at a time, and the subsequent rejoining of the strands. When a eukaryotic type 1 topoisomerase breaks a DNA backbone bond, it simultaneously forms a proteinDNA link where the hydroxyl group of a tyrosine residue is joined to a 3'phosphate on DNA, at one end of the enzyme-severed DNA strand. In eukaryotes and poxvirus topoisomerases I, there are a number of conserved residues in the region around the active site tyrosine. -Consensus pattern: [DEN]-x(6)-[GS]-[IT]-S-K-x(2)-Y-[LIVM]-x(3)-[LIVM] [Y is the active site tyrosine] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 2. -Last update: December 2001 / Text revised. [ 1] Sternglanz R. "DNA topoisomerases." Curr. Opin. Cell Biol. 1:533-535(1989). PubMed=2560656 [ 2] Sharma A., Mondragon A. "DNA topoisomerases." Curr. Opin. Struct. Biol. 5:39-47(1995). PubMed=7773745 [ 3] Lynn R.M., Bjornsti M.-A., Caron P.R., Wang J.C. "Peptide sequencing and site-directed mutagenesis identify tyrosine-727 as the active site tyrosine of Saccharomyces cerevisiae DNA topoisomerase I." Proc. Natl. Acad. Sci. U.S.A. 86:3559-3563(1989). PubMed=2542938 [ 4] Roca J. "The mechanisms of DNA topoisomerases." Trends Biochem. Sci. 20:156-160(1995). PubMed=7770916 +-----------------------------------------------------------------------+ PROSITE is copyright. 
It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00160} {PS00177; TOPOISOMERASE_II} {BEGIN} ********************************** * DNA topoisomerase II signature * ********************************** DNA topoisomerase II (EC 5.99.1.3) [1,2,3,4] is one of the two types of enzyme that catalyze the interconversion of topological DNA isomers. Type II topoisomerases are ATP-dependent and act by passing a DNA segment through a transient double-strand break. Topoisomerase II is found in phages, archaebacteria, prokaryotes, eukaryotes, and in African Swine Fever virus (ASF). In bacteriophage T4 topoisomerase II consists of three subunits (the product of genes 39, 52 and 60). In prokaryotes and in archaebacteria the enzyme, known as DNA gyrase, consists of two subunits (genes gyrA and gyrB [E1]). In some bacteria, a second type II topoisomerase has been identified; it is known as topoisomerase IV and is required for chromosome segregation, it also consists of two subunits (genes parC and parE). In eukaryotes, type II topoisomerase is a homodimer. There are many regions of sequence homology between the different subtypes of topoisomerase II. 
The relation between the different subunits is shown in the following representation:

<----------------About-1400-residues----------------------->

[----------Protein 39-*-----][----Protein 52----]             Phage T4
[----------gyrB-------*-----][--------gyrA-----------------]  Prokaryote II
                                                              Archaea
[----------parE-------*-----][--------parC-----------------]  Prokaryote IV
[---------------------*------------------------------------]  Eukaryote and ASF

'*': Position of the pattern.

As a signature pattern for this family of proteins, we have selected a region that contains a highly conserved pentapeptide. The pattern is located in gyrB, in parE, and in protein 39 of phage T4 topoisomerase. -Consensus pattern: [LIVMA]-{R}-E-G-[DN]-S-A-{F}-[STAG] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 4. -Last update: December 2004 / Pattern and text revised. [ 1] Sternglanz R. "DNA topoisomerases." Curr. Opin. Cell Biol. 1:533-535(1989). PubMed=2560656 [ 2] Bjornsti M.-A. Curr. Opin. Struct. Biol. 1:99-103(1991). [ 3] Sharma A., Mondragon A. "DNA topoisomerases." Curr. Opin. Struct. Biol. 5:39-47(1995). PubMed=7773745 [ 4] Roca J. "The mechanisms of DNA topoisomerases." Trends Biochem. Sci. 20:156-160(1995). PubMed=7770916 [E1] http://seasquirt.mbio.co.jp/icb/background/background.php +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+ {END} {PDOC00161} {PS00178; AA_TRNA_LIGASE_I} {BEGIN} ******************************************************** * Aminoacyl-transfer RNA synthetases class-I signature * ******************************************************** Aminoacyl-tRNA synthetases (EC 6.1.1.-) [1] are a group of enzymes which activate amino acids and transfer them to specific tRNA molecules as the first step in protein biosynthesis. In prokaryotic organisms there are at least twenty different types of aminoacyl-tRNA synthetases, one for each different amino acid. In eukaryotes there are generally two aminoacyl-tRNA synthetases for each different amino acid: one cytosolic form and a mitochondrial form. While all these enzymes have a common function, they are widely diverse in terms of subunit size and of quaternary structure. A few years ago it was found [2] that several aminoacyl-tRNA synthetases share a region of similarity in their N-terminal section, in particular the consensus tetrapeptide His-Ile-Gly-His ('HIGH') is very well conserved. The 'HIGH' region has been shown [3] to be part of the adenylate binding site. The 'HIGH' signature has been found in the aminoacyl-tRNA synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, and valine. These aminoacyl-tRNA synthetases are referred to as class-I synthetases [4,5,6] and seem to share the same tertiary structure based on a Rossmann fold. -Consensus pattern: P-x(0,2)-[GSTAN]-[DENQGAPK]-x-[LIVMFP]-[HT]-[LIVMYAC]-G-[HNTG]-[LIVMFYSTAGPC] -Sequences known to belong to this class detected by the pattern: ALL, except for Cys-tRNA ligases and some other sequences. -Other sequence(s) detected in Swiss-Prot: 57. -Note: In position 8 of the pattern His is present in all tRNA-synthetases of class-I except in some bacterial tryptophanyl-tRNA synthetases which have a Thr in that position.
-Last update: November 1997 / Pattern and text revised. [ 1] Schimmel P. "Aminoacyl tRNA synthetases: general scheme of structure-function relationships in the polypeptides and recognition of transfer RNAs." Annu. Rev. Biochem. 56:125-158(1987). PubMed=3304131; DOI=10.1146/annurev.bi.56.070187.001013 [ 2] Webster T., Tsai H., Kula M., Mackie G.A., Schimmel P. "Specific sequence homology and three-dimensional structure of an aminoacyl transfer RNA synthetase." Science 226:1315-1317(1984). PubMed=6390679 [ 3] Brick P., Bhat T.N., Blow D.M. "Structure of tyrosyl-tRNA synthetase refined at 2.3 A resolution. Interaction of the enzyme with the tyrosyl adenylate intermediate." J. Mol. Biol. 208:83-98(1989). PubMed=2504923 [ 4] Delarue M., Moras D. "The aminoacyl-tRNA synthetase family: modules at work." BioEssays 15:675-687(1993). PubMed=8274143 [ 5] Schimmel P. "Classes of aminoacyl-tRNA synthetases and the establishment of the genetic code." Trends Biochem. Sci. 16:1-3(1991). PubMed=2053131 [ 6] Nagel G.M., Doolittle R.F. "Evolution and relatedness in two aminoacyl-tRNA synthetase families." Proc. Natl. Acad. Sci. U.S.A. 88:8121-8125(1991). PubMed=1896459 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+ {END} {PDOC00162} {PS00180; GLNA_1} {PS00181; GLNA_ATP} {PS00182; GLNA_ADENYLATION} {BEGIN} *********************************** * Glutamine synthetase signatures * *********************************** Glutamine synthetase (EC 6.3.1.2) (GS) [1] plays an essential role in the metabolism of nitrogen by catalyzing the condensation of glutamate and ammonia to form glutamine. There seem to be three different classes of GS [2,3,4]: - Class I enzymes (GSI) are specific to prokaryotes, and are oligomers of 12 identical subunits. The activity of GSI-type enzyme is controlled by the adenylation of a tyrosine residue. The adenylated enzyme is inactive. - Class II enzymes (GSII) are found in eukaryotes and in bacteria belonging to the Rhizobiaceae, Frankiaceae, and Streptomycetaceae families (these bacteria have also a class-I GS). GSII are octamer of identical subunits. Plants have two or more isozymes of GSII, one of the isozymes is translocated into the chloroplast. - Class III enzymes (GSIII) has, currently, only been found in Bacteroides fragilis and in butyrivibrio fibrisolvens. It is a hexamer of identical chains. It is much larger (about 700 amino acids) than the GSI (450 to 470 amino acids) or GSII (350 to 420 amino acids) enzymes. While the three classes of GS's are clearly structurally related, the sequence similarities are not so extensive. As signature patterns we selected three conserved regions. The first pattern is based on a conserved tetrapeptide in the N-terminal section of the enzyme, the second one is based on a glycinerich region which is thought to be involved in ATP-binding. The third pattern is specific to class I glutamine synthetases and includes the tyrosine residue which is reversibly adenylated. 
-Consensus pattern: [FYWL]-D-G-S-S-x(6,8)-[DENQSTAK]-[SA]-[DE]-x(2)-[LIVMFY] -Sequences known to belong to this class detected by the pattern: ALL, except for GSIII and Rhizobium leguminosarum GSI. -Other sequence(s) detected in Swiss-Prot: 4. -Consensus pattern: K-P-[LIVMFYA]-x(3,5)-[NPAT]-[GA]-[GSTAN]-[GA]-x-H-x(3)-S -Sequences known to belong to this class detected by the pattern: ALL, except for C.elegans and mouse GSI. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: K-[LIVM]-x(5)-[LIVMA]-D-[RK]-[DN]-[LI]-Y [Y is the site of adenylation] -Sequences known to belong to this class detected by the pattern: ALL class-I GS, except for Clostridium acetobutylicum. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Tateno Y.; [email protected] -Last update: December 2004 / Pattern and text revised. [ 1] Eisenberg D., Almassy R.J., Janson C.A., Chapman M.S., Suh S.W., Cascio D., Smith W.W. "Some evolutionary relationships of the primary biological catalysts glutamine synthetase and RuBisCO." Cold Spring Harb. Symp. Quant. Biol. 52:483-490(1987). PubMed=2900091 [ 2] Kumada Y., Benson D.R., Hillemann D., Hosted T.J., Rochefort D.A., Thompson C.J., Wohlleben W., Tateno Y. "Evolution of the glutamine synthetase gene, one of the oldest existing and functioning genes." Proc. Natl. Acad. Sci. U.S.A. 90:3009-3013(1993). PubMed=8096645 [ 3] Shatters R.G., Kahn M.L. "Glutamine synthetase II in Rhizobium: reexamination of the proposed horizontal transfer of DNA from eukaryotes to prokaryotes." J. Mol. Evol. 29:422-428(1989). PubMed=2575672 [ 4] Brown J.R., Masuchi Y., Robb F.T., Doolittle W.F. "Evolutionary relationships of bacterial and archaeal glutamine synthetase genes." J. Mol. Evol. 38:566-576(1994). PubMed=7916055 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB).
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00163} {PS00183; UBIQUITIN_CONJUGAT_1} {PS50127; UBIQUITIN_CONJUGAT_2} {BEGIN} ******************************************************* * Ubiquitin-conjugating enzymes signature and profile * ******************************************************* Ubiquitin-conjugating enzymes (EC 6.3.2.19) (UBC or E2 enzymes) [1,2,3] catalyze the covalent attachment of ubiquitin to target proteins. An activated ubiquitin moiety is transferred from a ubiquitin-activating enzyme (E1) to E2 which later ligates ubiquitin directly to substrate proteins with or without the assistance of 'N-end' recognizing proteins (E3). In most species there are many forms of UBC (at least 9 in yeast) which are implicated in diverse cellular functions. A cysteine residue is required for ubiquitin-thiolester formation. There is a single conserved cysteine in UBC's and the region around that residue is conserved in the sequence of known UBC isozymes. We have used that region as a signature pattern. We also developed a profile that spans the complete catalytic domain. -Consensus pattern: [FYWLSP]-H-[PC]-[NHL]-[LIV]-x(3,4)-G-x-[LIVP]-C-[LIV]-x(1,2)-[LIVR] [C is the active site residue] -Sequences known to belong to this class detected by the pattern: ALL, except for yeast UBC6 (DOA2). -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Jentsch S.; [email protected] -Last update: April 2006 / Pattern revised. [ 1] Jentsch S., Seufert W., Sommer T., Reins H.-A.
"Ubiquitin-conjugating enzymes: novel regulators of eukaryotic cells." Trends Biochem. Sci. 15:195-198(1990). PubMed=2193438 [ 2] Jentsch S., Seufert W., Hauser H.-P. "Genetic analysis of the ubiquitin system." Biochim. Biophys. Acta 1089:127-139(1991). PubMed=1647207 [ 3] Hershko A. "The ubiquitin pathway for protein degradation." Trends Biochem. Sci. 16:265-268(1991). PubMed=1656558 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00164} {PS00184; GARS} {BEGIN} ************************************************** * Phosphoribosylglycinamide synthetase signature * ************************************************** Phosphoribosylglycinamide synthetase (EC 6.3.4.13) (GARS) (phosphoribosylamine glycine ligase) [1] catalyzes the second step in the de novo biosynthesis of purine, the ATP-dependent addition of 5-phosphoribosylamine glycine to form 5'phosphoribosylglycinamide. to In bacteria GARS is a monofunctional enzyme (encoded by the purD gene), in yeast it is part, with phosphoribosylformylglycinamidine cyclo-ligase (AIRS) of a bifunctional enzyme (encoded by the ADE5,7 gene), in higher eukaryotes it is part, with AIRS and with phosphoribosylglycinamide formyltransferase (GART) of a trifunctional enzyme (GARS-AIRS-GART). The sequence of GARS is well conserved. selected a highly conserved octapeptide. As a signature pattern we -Consensus pattern: R-[LF]-G-D-P-E-x-[EQIM] -Sequences known to belong to this class detected by the pattern: ALL, with a few exceptions. 
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2001 / Pattern and text revised.
[ 1] Aiba A., Mizobuchi K.
     "Nucleotide sequence analysis of genes purH and purD involved in the de novo purine nucleotide biosynthesis of Escherichia coli."
     J. Biol. Chem. 264:21239-21246(1989).
     PubMed=2687276
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00165}
{PS00185; IPNS_1}
{PS00186; IPNS_2}
{BEGIN}
*****************************************
* Isopenicillin N synthetase signatures *
*****************************************

Isopenicillin N synthetase (EC 1.21.3.1) (IPNS) [1,2] is a key enzyme in the biosynthesis of penicillin and cephalosporin. In the presence of oxygen, iron and ascorbate, it removes four hydrogen atoms from L-(alpha-aminoadipyl)-L-cysteinyl-D-valine to form the azetidinone and thiazolidine rings of isopenicillin.

IPNS is an enzyme of about 330 amino-acid residues. Two cysteines are conserved in fungal and bacterial IPNS sequences; these may be involved in iron-binding and/or substrate-binding.

Cephalosporium acremonium DAOCS/DACS [3] is a bifunctional enzyme involved in cephalosporin biosynthesis. The DAOCS domain, which is structurally related to IPNS, catalyzes the step from penicillin N to deacetoxy-cephalosporin C, which is used as a substrate by DACS to form deacetylcephalosporin C. Streptomyces clavuligerus possesses a monofunctional DAOCS enzyme (gene cefE) [4] also related to IPNS.

We derived two signature patterns for these enzymes, centered around the conserved cysteine residues.

-Consensus pattern: [RK]-x-[STA]-x(2)-S-x-C-Y-[SL]
                    [C may be an active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for Streptomyces clavuligerus DAOCS which has Ser in the first position of the pattern.
-Other sequence(s) detected in Swiss-Prot: 6.
-Consensus pattern: [LIVM](2)-x-C-G-[STA]-x(2)-[STAG]-x(2)-T-x-[DNG]
                    [C may be an active site residue]
-Sequences known to belong to this class detected by the pattern: ALL, except for Nocardia lactamdurans IPNS.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: May 2004 / Text revised.
[ 1] Martin J.F.
     Trends Biotechnol. 5:306-308(1987).
[ 2] Chen G., Shiffman D., Mevarech M., Aharonowitz Y.
     Trends Biotechnol. 8:105-111(1990).
[ 3] Samson S.M., Dotzlaf J.E., Slisz M.L., Becker G.W., van Frank R.M., Veal L.E., Yeh W.K., Miller J.R., Queener S.W., Ingolia T.D.
     Bio/Technology 5:1207-1214(1987).
[ 4] Kovacevic S., Weigel B.J., Tobin M.B., Ingolia T.D., Miller J.R.
     "Cloning, characterization, and expression in Escherichia coli of the Streptomyces clavuligerus gene encoding deacetoxycephalosporin C synthetase."
     J. Bacteriol. 171:754-760(1989).
     PubMed=2644235
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00166}
{PS00187; TPP_ENZYMES}
{BEGIN}
********************************************
* Thiamine pyrophosphate enzymes signature *
********************************************

A number of enzymes require thiamine pyrophosphate (TPP) (vitamin B1) as a cofactor. It has been shown [1] that some of these enzymes are structurally related. These related TPP enzymes are:

 - Pyruvate oxidase (POX) (EC 1.2.3.3)
   Reaction catalyzed: pyruvate + orthophosphate + O(2) + H(2)O = acetyl phosphate + CO(2) + H(2)O(2).
 - Pyruvate decarboxylase (PDC) (EC 4.1.1.1)
   Reaction catalyzed: pyruvate = acetaldehyde + CO(2).
 - Indolepyruvate decarboxylase (EC 4.1.1.74) [2]
   Reaction catalyzed: indole-3-pyruvate = indole-3-acetaldehyde + CO(2).
 - Acetolactate synthase (ALS) (EC 2.2.1.6)
   Reaction catalyzed: 2 pyruvate = acetolactate + CO(2).
 - Benzoylformate decarboxylase (BFD) (EC 4.1.1.7) [3]
   Reaction catalyzed: benzoylformate = benzaldehyde + CO(2).

As a signature pattern for these enzymes we have selected a conserved region which is located in their C-terminal section.

-Consensus pattern: [LIVMF]-[GSA]-x(5)-P-x(4)-[LIVMFYW]-x-[LIVMF]-x-G-D-[GSA]-[GSAC]
-Sequences known to belong to this class detected by the pattern: ALL, except for 13 sequences.
-Other sequence(s) detected in Swiss-Prot: a hypothetical protein in the puf photosynthesis operon of Rhodobacter capsulatus; this protein could be a TPP enzyme.
-Note: Other TPP enzymes such as the E1 component of pyruvate dehydrogenase complex, 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase, and transketolase do not seem to be related to the above enzymes.
-Last update: November 1995 / Pattern and text revised.
[ 1] Green J.B.A.
     "Pyruvate decarboxylase is like acetolactate synthase (ILV2) and not like the pyruvate dehydrogenase E1 subunit."
     FEBS Lett. 246:1-5(1989).
     PubMed=2651151
[ 2] Koga J., Adachi T., Hidaka H.
"Molecular cloning of the gene for indolepyruvate decarboxylase from Enterobacter cloacae." Mol. Gen. Genet. 226:10-16(1991). PubMed=2034209 [ 3] Tsou A.Y., Ransom S.C., Gerlt J.A., Buechter D.D., Babbitt P.C., Kenyon G.L. "Mandelate pathway of Pseudomonas putida: sequence relationships involving mandelate racemase, (S)-mandelate dehydrogenase, and benzoylformate decarboxylase and expression of benzoylformate decarboxylase in Escherichia coli." Biochemistry 29:9856-9862(1990). PubMed=2271624 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00167} {PS00188; BIOTIN} {BEGIN} ******************************************** * Biotin-requiring enzymes attachment site * ******************************************** Biotin, which plays a catalytic role in some carboxyl transfer reactions, is covalently attached, via an amide bond, to a lysine residue in enzymes requiring this coenzyme [1,2,3,4]. Such enzymes are: - Pyruvate carboxylase (EC 6.4.1.1). - Acetyl-CoA carboxylase (EC 6.4.1.2). - Propionyl-CoA carboxylase (EC 6.4.1.3). - Methylcrotonyl-CoA carboxylase (EC 6.4.1.4). - Geranoyl-CoA carboxylase (EC 6.4.1.5). - Urea carboxylase (EC 6.3.4.6). - Oxaloacetate decarboxylase (EC 4.1.1.3). - Methylmalonyl-CoA decarboxylase (EC 4.1.1.41). - Glutaconyl-CoA decarboxylase (EC 4.1.1.70). - Methylmalonyl-CoA carboxyl-transferase (EC 2.1.3.1) (transcarboxylase). 
Sequence data reveal that the region around the biocytin (biotin-lysine) residue is well conserved and can be used as a signature pattern.

-Consensus pattern: [GDN]-[DEQTR]-x-[LIVMFY]-x(2)-[LIVM]-x-[AIV]-M-K-[LVMAT]-x(3)-[LIVM]-x-[SAV]
                    [K is the biotin attachment site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: The domain around the biotin-binding lysine residue is evolutionarily related to that around the lipoyl-binding lysine residue of 2-oxo acid dehydrogenase acyltransferases (see <PDOC00168>).
-Last update: December 2001 / Pattern and text revised.
[ 1] Knowles J.R.
     "The mechanism of biotin-dependent enzymes."
     Annu. Rev. Biochem. 58:195-221(1989).
     PubMed=2673009; DOI=10.1146/annurev.bi.58.070189.001211
[ 2] Samols D., Thornton C.G., Murtif V.L., Kumar G.K., Haase F.C., Wood H.G.
     "Evolutionary conservation among biotin enzymes."
     J. Biol. Chem. 263:6461-6464(1988).
     PubMed=2896195
[ 3] Goss N.H., Wood H.G.
     "Formation of N epsilon-(biotinyl)lysine in biotin enzymes."
     Methods Enzymol. 107:261-278(1984).
     PubMed=6438443
[ 4] Shenoy B.C., Xie Y., Park V.L., Kumar G.K., Beegen H., Wood H.G., Samols D.
     "The importance of methionine residues for the catalysis of the biotin enzyme, transcarboxylase. Analysis by site-directed mutagenesis."
     J. Biol. Chem. 267:18407-18412(1992).
     PubMed=1526981
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00168}
{PS00189; LIPOYL}
{BEGIN}
***************************************************************************
* 2-oxo acid dehydrogenases acyltransferase component lipoyl binding site *
***************************************************************************

The 2-oxo acid dehydrogenase multienzyme complexes [1,2] from bacterial and eukaryotic sources catalyze the oxidative decarboxylation of 2-oxo acids to the corresponding acyl-CoA. The three members of this family of multienzyme complexes are:

 - Pyruvate dehydrogenase complex (PDC).
 - 2-oxoglutarate dehydrogenase complex (OGDC).
 - Branched-chain 2-oxo acid dehydrogenase complex (BCOADC).

These three complexes share a common architecture: they are composed of multiple copies of three component enzymes - E1, E2 and E3. E1 is a thiamine pyrophosphate-dependent 2-oxo acid dehydrogenase, E2 a dihydrolipamide acyltransferase, and E3 an FAD-containing dihydrolipamide dehydrogenase.

E2 acyltransferases have an essential cofactor, lipoic acid, which is covalently bound via an amide linkage to a lysine group. The E2 components of OGDC and BCOADC bind a single lipoyl group, while those of PDC bind either one (in yeast and in Bacillus), two (in mammals), or three (in Azotobacter and in Escherichia coli) lipoyl groups [3].

In addition to the E2 components of the three enzymatic complexes described above, a lipoic acid cofactor is also found in the following proteins:

 - H-protein of the glycine cleavage system (GCS) [4]. GCS is a multienzyme complex of four protein components, which catalyzes the degradation of glycine. H protein shuttles the methylamine group of glycine from the P protein to the T protein. H-protein from either prokaryotes or eukaryotes binds a single lipoic group.
 - Mammalian and yeast pyruvate dehydrogenase complexes differ from that of other sources, in that they contain, in small amounts, a protein of unknown function - designated protein X or component X. Its sequence is closely related to that of E2 subunits and seems to bind a lipoic group [5].
 - Fast migrating protein (FMP) (gene acoC) from Alcaligenes eutrophus [6]. This protein is most probably a dihydrolipamide acyltransferase involved in acetoin metabolism.

We developed a signature pattern which allows the detection of the lipoyl-binding site.

-Consensus pattern: [GDN]-x(2)-[LIVF]-x(3)-{VH}-{M}-[LIVMFCA]-x(2)-[LIVMFA]-{LDFY}-{KPE}-x-K-[GSTAIVW]-[STAIVQDN]-x(2)-[LIVMFS]-x(5)-[GCN]-x-[LIVMFY]
                    [K is the lipoyl-binding site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 6.
-Note: The domain around the lipoyl-binding lysine residue is evolutionarily related to that around the biotin-binding lysine residue of biotin requiring enzymes (see <PDOC00167>).
-Last update: April 2006 / Pattern revised.
[ 1] Yeaman S.J.
     "The 2-oxo acid dehydrogenase complexes: recent advances."
     Biochem. J. 257:625-632(1989).
     PubMed=2649080
[ 2] Yeaman S.J.
     Trends Biochem. Sci. 11:293-296(1986).
[ 3] Russel G.C., Guest J.R.
     Biochim. Biophys. Acta 1076:225-232(1991).
[ 4] Fujiwara K., Okamura-Ikeda K., Motokawa Y.
     "Chicken liver H-protein, a component of the glycine cleavage system. Amino acid sequence and identification of the N epsilon-lipoyllysine residue."
     J. Biol. Chem. 261:8836-8841(1986).
     PubMed=3522581
[ 5] Behal R.H., Browning K.S., Hall T.B., Reed L.J.
     "Cloning and nucleotide sequence of the gene for protein X from Saccharomyces cerevisiae."
     Proc. Natl. Acad. Sci. U.S.A. 86:8732-8736(1989).
     PubMed=2682658
[ 6] Priefert H., Hein S., Kruger N., Zeh K., Schmidt B., Steinbuechel A.
     "Identification and molecular characterization of the Alcaligenes eutrophus H16 aco operon genes involved in acetoin catabolism."
     J. Bacteriol.
     173:4056-4071(1991).
     PubMed=2061286
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00169}
{PS51007; CYTC}
{PS51008; MULTIHEME_CYTC}
{PS51009; CYTCII}
{PS51010; CYTF}
{BEGIN}
******************************************
* C-type cytochrome superfamily profiles *
******************************************

In proteins belonging to the c-type cytochrome family [1], the heme group is covalently attached by thioether bonds to two conserved cysteine residues located in the cytochrome c center. Cytochromes c typically function in electron transfer, but c-type cytochrome centers are also found in the active sites of many enzymes, and in eukaryotic cells cytochrome c also has a role in apoptosis [2]. The known structures of c-type cytochromes have six different classes of fold. Of these, four are unique to c-type cytochromes [3]. The different folds are detailed in the example list below.

The consensus sequence for the cytochrome c center is Cys-x-x-Cys-His, where the histidine residue is one of the two axial ligands of the heme iron [4]. This arrangement is shared by all proteins known to belong to the cytochrome c family, which presently includes:

Monoheme proteins:

 - Cytochrome c, an electron carrier protein located in the mitochondrial intermembrane space. Cytochrome c is a globular protein with an all alpha-helix fold (see <PDB:1HRC>).
 - Cytochrome c1.
   This is the heme-containing component of the cytochrome b-c1 complex, which accepts electrons from Rieske protein and transfers electrons to cytochrome c in the mitochondrial respiratory chain.
 - Bacterial class II cytochromes c (c' and c556). Cytochrome c' is a high-spin protein and is the most widely occurring bacterial c-type cytochrome. Cytochrome c556 is a low-spin cytochrome. Both have a C-terminal c-type cytochrome center. Class II cytochromes are composed of four alpha helices (see <PDB:1CPR>).
 - Cytochromes c2, c5 and c6.
 - Bacterial cytochromes c550 to c553 and c555.
 - Chloroplast and cyanobacteria cytochrome f. It translocates protons across the thylakoid membrane and transfers electrons from photosystem II to photosystem I. Structurally, cytochrome f is unique in the cytochrome c family as it is an all beta-strand fold (see <PDB:1CTM>).
 - Bacteria cytochrome c oxidase, mono-heme subunit, FixO.

Multiheme proteins (prokaryotes). They are frequently associated with electron-transport processes within the nitrogen and sulphur cycles:

 - Cytochrome c nitrite reductase. Each monomer contains five heme groups clustered in a pseudo-two fold structure (see <PDB:1QDB>).
 - Cytochrome c3. It participates in sulfate respiration coupled with phosphorylation by transferring electrons from the enzyme dehydrogenase to ferredoxin. It binds 4 heme groups per subunit.
 - Cytochrome c4. It binds 2 heme groups per subunit.
 - Cytochromes cc3/Hmc (High-molecular-weight cytochrome c). It binds 16 heme groups per subunit.
 - Purple bacteria photosynthetic reaction center. It binds four heme groups.
 - HAO (hydroxylamine oxidoreductase). It catalyzes the oxidation of hydroxylamine to nitrite. The electrons released in the reaction are partitioned to ammonium monooxygenase and to the respiratory chain. It binds eight heme groups per subunit.
 - Cytochrome c554. It is the immediate acceptor of electrons from HAO. It binds four heme groups per subunit.
 - Flavocytochrome fumarate.
   It catalyzes unidirectional fumarate reduction using artificial electron donors such as methyl viologen. It binds four heme groups per subunit.

To recognize c-type cytochrome family proteins we have developed 4 profiles. The first one recognizes all mono-heme cytochrome c proteins (except class II and f-type cytochromes). The second one recognizes cytochromes c that bind more than one heme group. The third one recognizes class II cytochromes c, and the fourth one is directed against the cytochrome f family.

-Sequences known to belong to this class detected by the first profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the second profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the third profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the fourth profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: These profiles replace a pattern whose specificity was inadequate.
-Last update: August 2004 / Pattern removed, profiles added and text revised.
[ 1] Mathews F.S.
     "The structure, function and evolution of cytochromes."
     Prog. Biophys. Mol. Biol. 45:1-56(1985).
     PubMed=3881803
[ 2] Martinou J.-C., Desagher S., Antonsson B.
     "Cytochrome c release from mitochondria: all or nothing."
     Nat. Cell Biol. 2:E41-E43(2000).
     PubMed=10707095; DOI=10.1038/35004069
[ 3] Allen J.W., Daltrop O., Stevens J.M., Ferguson S.J.
     "C-type cytochromes: diverse structures and biogenesis systems pose evolutionary problems."
     Philos. Trans. R. Soc. Lond., B, Biol. Sci. 358:255-266(2003).
     PubMed=12594933; DOI=10.1098/rstb.2002.1192
[ 4] Barker P.D., Ferguson S.J.
     "Still a puzzle: why is haem covalently attached in c-type cytochromes?"
     Structure 7:R281-R290(1999).
     PubMed=10647174
+-----------------------------------------------------------------------+
PROSITE is copyright.
It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00170}
{PS00191; CYTOCHROME_B5_1}
{PS50255; CYTOCHROME_B5_2}
{BEGIN}
*******************************************************************
* Cytochrome b5 family, heme-binding domain signature and profile *
*******************************************************************

Cytochrome b5 is a membrane-bound hemoprotein which acts as an electron carrier for several membrane-bound oxygenases [1]. There are two homologous forms of b5, one found in microsomes and one found in the outer membrane of mitochondria. Two conserved histidine residues serve as axial ligands for the heme group. The structure of a number of oxidoreductases consists of the juxtaposition of a heme-binding domain homologous to that of b5 and either a flavodehydrogenase or a molybdopterin domain. These enzymes are:

 - Lactate dehydrogenase (EC 1.1.2.3) [2], an enzyme that consists of a flavodehydrogenase domain and a heme-binding domain called cytochrome b2.
 - Nitrate reductase (EC 1.7.1.-), a key enzyme involved in the first step of nitrate assimilation in plants, fungi and bacteria [3,4]. It consists of a molybdopterin domain (see <PDOC00484>), a heme-binding domain called cytochrome b557, as well as a cytochrome reductase domain.
 - Sulfite oxidase (EC 1.8.3.1) [5], which catalyzes the terminal reaction in the oxidative degradation of sulfur-containing amino acids. It also consists of a molybdopterin domain and a heme-binding domain.
 - Yeast acyl-CoA desaturase 1 (EC 1.14.19.1) (gene OLE1). This enzyme contains a C-terminal heme-binding domain.
This family of proteins also includes:

 - TU-36B, a Drosophila muscle protein of unknown function [6].
 - Fission yeast hypothetical protein SpAC1F12.10c.
 - Yeast hypothetical protein YMR073c.
 - Yeast hypothetical protein YMR272c.

We used a segment which includes the first of the two histidine heme ligands, as a signature pattern for the heme-binding domain of cytochrome b5 family.

-Consensus pattern: [FY]-[LIVMK]-{I}-{Q}-H-P-[GA]-G
                    [H is a heme axial ligand]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 7.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Expert(s) to contact by email: Rouze P.; [email protected]
-Last update: December 2004 / Pattern and text revised.
[ 1] Ozols J.
     "Structure of cytochrome b5 and its topology in the microsomal membrane."
     Biochim. Biophys. Acta 997:121-130(1989).
     PubMed=2752049
[ 2] Guiard B.
     "Structure, expression and regulation of a nuclear gene encoding a mitochondrial protein: the yeast L(+)-lactate cytochrome c oxidoreductase (cytochrome b2)."
     EMBO J. 4:3265-3272(1985).
     PubMed=3004948
[ 3] Calza R., Huttner E., Vincentz M., Rouze P., Galangau F., Vaucheret H., Cherel I., Meyer C., Kronenberger J., Caboche M.
     Mol. Gen. Genet. 209:552-562(1987).
[ 4] Crawford N.M., Smith M., Bellissimo D., Davis R.W.
     "Sequence and nitrate regulation of the Arabidopsis thaliana mRNA encoding nitrate reductase, a metalloflavoprotein with three functional domains."
     Proc. Natl. Acad. Sci. U.S.A. 85:5006-5010(1988).
     PubMed=3393528
[ 5] Guiard B., Lederer F.
     "Amino acid sequence of the 'b5-like' heme-binding domain from chicken sulfite oxidase."
     Eur. J. Biochem. 100:441-453(1979).
     PubMed=510290
[ 6] Levin R.J., Boychuk P.L., Croniger C.M., Kazzaz J.A., Rozek C.E.
     "Structure and expression of a muscle specific gene which is adjacent to the Drosophila myosin heavy-chain gene and can encode a cytochrome b related protein."
     Nucleic Acids Res. 17:6349-6367(1989).
     PubMed=2549511
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00171}
{PS51002; CYTB_NTER}
{PS51003; CYTB_CTER}
{BEGIN}
****************************
* Cytochrome b/b6 profiles *
****************************

In the mitochondrion of eukaryotes and in aerobic prokaryotes, cytochrome b is a component of respiratory chain complex III (EC 1.10.2.2) - also known as the bc1 complex or ubiquinol-cytochrome c reductase. This complex is the middle component of the mitochondrial respiratory chain, coupling the transfer of electrons from ubihydroquinone to cytochrome c with the generation of a proton gradient across the mitochondrial membrane. Every bc1 complex contains three common subunits with active redox centers (cytochrome b, cytochrome c1, and the "Rieske" [2Fe-2S] protein (ISP) (see <PDOC00177>)). The mitochondrial system contains additional subunits not present in the bacterial complexes. In plant chloroplasts and cyanobacteria, there is an analogous protein, cytochrome b6, a component of the plastoquinone-plastocyanin reductase (EC 1.10.99.1), also known as the b6f complex.

Cytochrome b/b6 [1,2] is an integral membrane protein of approximately 400 amino acid residues that has 8 transmembrane segments and four horizontal helices on the intermembrane side (see <PDB:1BE3; C>). The two hemes, bL and bH, are in the center of a four alpha-helical bundle formed by helices 1 to 4 [3].
In plants and cyanobacteria, cytochrome b6 consists of two subunits encoded by the petB and petD genes. The sequence of petB is colinear with the N-terminal part of mitochondrial cytochrome b, while petD corresponds to the C-terminal part.

Cytochrome b/b6 non-covalently binds two heme groups, known as b562 and b566. Four conserved histidine residues are postulated to be the ligands of the iron atoms of these two heme groups.

Apart from regions around some of the histidine heme ligands, there are a few conserved regions in the sequence of b/b6. The best conserved of these regions includes an invariant P-E-W triplet which lies in the loop that separates the fifth and sixth transmembrane segments. It seems to be important for electron transfer at the ubiquinone redox site - called Qz or Qo (where o stands for outside) - located on the outer side of the membrane.

A schematic representation of the structure of cytochrome b/b6 is shown below.

           +---Fe-b562----+
           | +---Fe-b566--|-+
           | |            | |
xxxxxxxxxxxHxHxxxxxxxxxxxxHxHxxxxxxxxxxPEWxxxxxxxxxxxxxxxxxx
<------------------Cytochrome-b---------------------------->
<----Cytochrome-b6-petB---------><--Cytochrome-b6-petD----->

We developed two profiles for cytochrome b/b6: one spans the N-terminal region and also recognizes the petB subunit of the plant b6 complex; the other is directed against the C-terminal region and also recognizes the plant petD subunit.

-Sequences known to belong to this class detected by the first profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the second profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: These profiles replace two patterns whose sensitivity was inadequate.
-Last update: June 2004 / Patterns removed, profile added and text revised.
[ 1] Howell N.
     "Evolutionary conservation of protein regions in the protonmotive cytochrome b and their possible roles in redox catalysis."
     J. Mol. Evol. 29:157-169(1989).
     PubMed=2509716
[ 2] Esposti M.D., De Vries S., Crimi M., Ghelli A., Patarnello T., Meyer A.
     "Mitochondrial cytochrome b: evolution and structure of the protein."
     Biochim. Biophys. Acta 1143:243-271(1993).
     PubMed=8329437
[ 3] Iwata S., Lee J.W., Okada K., Lee J.K., Iwata M., Rasmussen B., Link T.A., Ramaswamy S., Jap B.K.
     "Complete structure of the 11-subunit bovine mitochondrial cytochrome bc1 complex."
     Science 281:64-71(1998).
     PubMed=9651245
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00172}
{PS00194; THIOREDOXIN_1}
{PS51352; THIOREDOXIN_2}
{BEGIN}
***************************************************************
* Thioredoxin family active site signature and domain profile *
***************************************************************

Thioredoxins [1 to 4] are small proteins of approximately one hundred amino-acid residues which participate in various redox reactions via the reversible oxidation of an active center disulfide bond. They exist in either a reduced form or an oxidized form where the two cysteine residues are linked in an intramolecular disulfide bond. Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulfide bond is well conserved. Bacteriophage T4 also encodes a thioredoxin but its primary structure is not homologous to bacterial, plant and vertebrate thioredoxins.
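
Conserved regions such as the one around the redox-active disulfide bond are written throughout this file as consensus patterns. As an illustration only, the Python sketch below shows one way such a pattern could be translated into a regular expression and applied to a sequence. The function names prosite_to_regex and find_matches and the test sequence are hypothetical, introduced for this example and not part of PROSITE; the sketch assumes only the subset of the pattern syntax used in these entries (x, [..], {..} and (n) or (n,m) repeats), and the pattern shown is the GARS signature from <PDOC00164>.

```python
import re

# One pattern element, optionally followed by a repeat such as (2) or (3,5).
_ELEMENT = re.compile(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style consensus pattern into a Python regex string.

    Hypothetical helper: handles x, [ABC], {ABC}, single residues and
    (n) / (n,m) repeats only. Anchors (< and >) are not supported.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element.strip())
        if match is None:
            raise ValueError("empty pattern element in %r" % pattern)
        core, low, high = match.groups()
        if core == "x":
            atom = "."                       # any residue
        elif core.startswith("[") and core.endswith("]"):
            atom = core                      # any one of the listed residues
        elif core.startswith("{") and core.endswith("}"):
            atom = "[^" + core[1:-1] + "]"   # any residue except those listed
        elif len(core) == 1 and core.isalpha():
            atom = core                      # one specific residue
        else:
            raise ValueError("unsupported pattern element %r" % element)
        if low and high:
            atom += f"{{{low},{high}}}"      # x(3,5) becomes .{3,5}
        elif low:
            atom += f"{{{low}}}"             # x(2) becomes .{2}
        parts.append(atom)
    return "".join(parts)


def find_matches(pattern, sequence):
    """Return (start, matched_text) for non-overlapping matches in sequence."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# Made-up toy sequence containing one occurrence of the GARS octapeptide.
print(prosite_to_regex("R-[LF]-G-D-P-E-x-[EQIM]"))
print(find_matches("R-[LF]-G-D-P-E-x-[EQIM]", "MKTRLGDPEAQLL"))
```

Elements that are not separated by hyphens are rejected with a ValueError rather than guessed at, and re.finditer reports non-overlapping matches only, so a real scanning tool may report hits differently.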

A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin; most of them are protein disulfide isomerases (PDI). PDI (EC 5.3.4.1) [5,6,7] is an endoplasmic reticulum enzyme that catalyzes the rearrangement of disulfide bonds in various proteins. The various forms of PDI which are currently known are:

 - PDI major isozyme; a multifunctional protein that also functions as the beta subunit of prolyl 4-hydroxylase (EC 1.14.11.2), as a component of oligosaccharyl transferase (EC 2.4.1.119), as thyroxine deiodinase (EC 3.8.1.4), as glutathione-insulin transhydrogenase (EC 1.8.4.2) and as a thyroid hormone-binding protein.
 - ERp60 (ER-60; 58 Kd microsomal protein). ERp60 was originally thought to be a phosphoinositide-specific phospholipase C isozyme and later to be a protease.
 - ERp72.
 - P5.

All PDI contain two or three (ERp72) copies of the thioredoxin domain.

Bacterial proteins that act as thiol:disulfide interchange proteins, allowing disulfide bond formation in some periplasmic proteins, also contain a thioredoxin domain. These proteins are:

 - Escherichia coli dsbA (or prfA) and its orthologs in Vibrio cholerae (tcpG) and Haemophilus influenzae (por).
 - Escherichia coli dsbC (or xpRA) and its orthologs in Erwinia chrysanthemi and Haemophilus influenzae.
 - Escherichia coli dsbD (or dipZ) and its Haemophilus influenzae ortholog.
 - Escherichia coli dsbE (or ccmG) and orthologs in Haemophilus influenzae, Rhodobacter capsulatus (helX), Rhizobiaceae (cycY and tlpA).

The pattern we developed is directed against the two cysteines that form the redox-active bond. We also developed a profile that covers the whole domain.

-Consensus pattern: [LIVMF]-[LIVMSTA]-x-[LIVMFYC]-[FYWSTHE]-x(2)-[FYWGTN]-C-[GATPLVE]-[PHYWSTA]-C-{I}-x-{A}-x(3)-[LIVMFYWT]
                    [The 2 C's form the redox-active bond]
-Sequences known to belong to this class detected by the pattern: ALL, except for Haemophilus influenzae dsbC, Escherichia coli dsbG and two others.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2007 / Profile added and text revised.

[ 1] Holmgren A.
     "Thioredoxin."
     Annu. Rev. Biochem. 54:237-271(1985).
     PubMed=3896121; DOI=10.1146/annurev.bi.54.070185.001321
[ 2] Gleason F.K., Holmgren A.
     "Thioredoxin and related proteins in procaryotes."
     FEMS Microbiol. Rev. 4:271-297(1988).
     PubMed=3152490
[ 3] Holmgren A.
     "Thioredoxin and glutaredoxin systems."
     J. Biol. Chem. 264:13963-13966(1989).
     PubMed=2668278
[ 4] Eklund H., Gleason F.K., Holmgren A.
     "Structural and functional relations among thioredoxins of different species."
     Proteins 11:13-28(1991).
     PubMed=1961698
[ 5] Freedman R.B., Hawkins H.C., Murant S.J., Reid L.
     "Protein disulphide-isomerase: a homologue of thioredoxin implicated in the biosynthesis of secretory proteins."
     Biochem. Soc. Trans. 16:96-99(1988).
     PubMed=3371540
[ 6] Kivirikko K.I., Myllyla R., Pihlajaniemi T.
     "Protein hydroxylation: prolyl 4-hydroxylase, an enzyme with four cosubstrates and a multifunctional subunit."
     FASEB J. 3:1609-1617(1989).
     PubMed=2537773
[ 7] Freedman R.B., Hirst T.R., Tuite M.F.
     "Protein disulphide isomerase: building bridges in protein folding."
     Trends Biochem. Sci. 19:331-336(1994).
     PubMed=7940678
{END}
{PDOC00173}
{PS00195; GLUTAREDOXIN_1}
{PS51354; GLUTAREDOXIN_2}
{BEGIN}
*********************************************************
* Glutaredoxin active site signature and domain profile *
*********************************************************

Glutaredoxin [1,2,3], also known as thioltransferase, is a small protein of approximately one hundred amino-acid residues. It functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin, which functions in a similar way, glutaredoxin possesses an active center disulfide bond. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulfide bond.

Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed [4] that vaccinia protein O2L is most probably a glutaredoxin. Finally, it must be noted that phage T4 thioredoxin also seems to be evolutionarily related.

The pattern is directed against the two cysteines of the redox-active bond. We also developed a profile that covers the whole glutaredoxin domain.

-Consensus pattern: [LIVMD]-[FYSA]-x(4)-C-[PV]-[FYWH]-C-x(2)-[TAV]-x(2,3)-[LIV]
                    [The 2 C's form the redox-active bond]
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: In position 5 of the pattern, all glutaredoxin sequences have Pro while T4 thioredoxin has Val.
-Last update: December 2007 / Text revised and profile added.

[ 1] Gleason F.K., Holmgren A.
     "Thioredoxin and related proteins in procaryotes."
     FEMS Microbiol. Rev. 4:271-297(1988).
     PubMed=3152490
[ 2] Holmgren A.
     "Thioredoxin and glutaredoxin: small multi-functional redox proteins with active-site disulphide bonds."
     Biochem. Soc. Trans. 16:95-96(1988).
     PubMed=3286320
[ 3] Holmgren A.
     "Thioredoxin and glutaredoxin systems."
     J. Biol. Chem. 264:13963-13966(1989).
     PubMed=2668278
[ 4] Johnson G.P., Goebel S.J., Perkus M.E., Davis S.W., Winslow J.P., Paoletti E.
     "Vaccinia virus encodes a protein with similarity to glutaredoxins."
     Virology 181:378-381(1991).
     PubMed=1994586
{END}
{PDOC00174}
{PS00196; COPPER_BLUE}
{BEGIN}
*******************************************
* Type-1 copper (blue) proteins signature *
*******************************************

Blue or 'type-1' single copper proteins are small proteins which bind a copper atom and which are characterized by an intense electronic absorption band near 600 nm [1,2]. The best known members of this class of proteins are the plant chloroplastic plastocyanins, which exchange electrons with cytochrome c6, and the distantly related bacterial azurins, which exchange electrons with cytochrome c551. This family of proteins also includes all the proteins listed below (references are only provided for recently determined sequences).

 - Amicyanin from bacteria such as Methylobacterium extorquens or Thiobacillus versutus that can grow on methylamine. Amicyanin appears to be an electron acceptor for methylamine dehydrogenase.
 - Auracyanins A and B from Chloroflexus aurantiacus [3].
   These proteins can donate electrons to cytochrome c-554.
 - Blue copper protein from Alcaligenes faecalis.
 - Cupredoxin (CPC) from cucumber peelings [4].
 - Cusacyanin (basic blue protein; plantacyanin, CBP) from cucumber.
 - Halocyanin from Natronobacterium pharaonis [5], a membrane-associated copper-binding protein.
 - Pseudoazurin from Pseudomonas.
 - Rusticyanin from Thiobacillus ferrooxidans. Rusticyanin is an electron carrier from cytochrome c-552 to the a-type oxidase [6].
 - Stellacyanin from the Japanese lacquer tree.
 - Umecyanin from horseradish roots.
 - Allergen Ra3 from ragweed. This pollen protein is evolutionarily related to the above proteins, but seems to have lost the ability to bind copper.

Although there is an appreciable amount of divergence in the sequence of all these proteins, the copper ligand sites are conserved and we have developed a pattern which includes two of the ligands: a cysteine and a histidine.

-Consensus pattern: [GA]-x(0,2)-[YSA]-x(0,1)-[VFY]-{SEDT}-C-x(1,2)-[PG]-x(0,1)-H-x(2,4)-[MQ]
                    [C and H are copper ligands]
-Sequences known to belong to this class detected by the pattern: ALL, except for allergen Ra3 and three other sequences.
-Other sequence(s) detected in Swiss-Prot: 32.
-Note: In position 5 of the pattern, only the Alcaligenes protein has a Val; all other proteins have either Phe or Tyr. In position 9, only CPC has Gly; all others have Pro.
-Last update: December 2004 / Pattern and text revised.

[ 1] Garret T.P.J., Clingeleffer D.J., Guss J.M., Rogers S.J., Freeman H.C.
     J. Biol. Chem. 259:2822-2825(1984).
[ 2] Ryden L.G., Hunt L.T.
     "Evolution of protein complexity: the blue copper-containing oxidases and related proteins."
     J. Mol. Evol. 36:41-66(1993).
     PubMed=8433378
[ 3] McManus J.D., Brune D.C., Han J., Sanders-Loehr J., Meyer T.E., Cusanovich M.A., Tollin G., Blankenship R.E.
     "Isolation, characterization, and amino acid sequences of auracyanins, blue copper proteins from the green photosynthetic bacterium Chloroflexus aurantiacus."
     J. Biol. Chem. 267:6531-6540(1992).
     PubMed=1313011
[ 4] Mann K., Schafer W., Thoenes U., Messerschmidt A., Mehrabian Z., Nalbandyan R.
     "The amino acid sequence of a type I copper protein with an unusual serine- and hydroxyproline-rich C-terminal domain isolated from cucumber peelings."
     FEBS Lett. 314:220-223(1992).
     PubMed=1468551
[ 5] Mattar S., Scharf B., Kent S.B.H., Rodewald K., Oesterhelt D., Engelhard M.
     "The primary structure of halocyanin, an archaeal blue copper protein, predicts a lipid anchor for membrane fixation."
     J. Biol. Chem. 269:14939-14945(1994).
     PubMed=8195126
[ 6] Yano T., Fukumori Y., Yamanaka T.
     "The amino acid sequence of rusticyanin isolated from Thiobacillus ferrooxidans."
     FEBS Lett. 288:159-162(1991).
     PubMed=1879547
{END}
{PDOC00175}
{PS00197; 2FE2S_FER_1}
{PS51085; 2FE2S_FER_2}
{BEGIN}
***************************************************************************
* 2Fe-2S ferredoxin-type iron-sulfur binding domain signature and profile *
***************************************************************************

Ferredoxins are small, acidic, electron transfer proteins that are ubiquitous in biological redox systems. They contain a 4Fe-4S, a 3Fe-4S, or a 2Fe-2S cluster.
Among them, ferredoxins with one 2Fe-2S cluster per molecule are present in plants, animals, and bacteria, and form a distinct 2Fe-2S ferredoxin family [1,2]. They are proteins of around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This conserved region is also found as a domain in various metabolic enzymes.

Several structures of the 2Fe-2S ferredoxin domain have been determined (see for example <PDB:4FXC>) [3]. The domain is classified as a beta-grasp, which is characterized by a beta-sheet composed of four beta-strands and one alpha-helix flanking the sheet [4]. The two Fe atoms are coordinated tetrahedrally by the two inorganic S atoms and four cysteinyl S atoms.

Some proteins that contain a 2Fe-2S ferredoxin-type domain are listed below:

 - Ferredoxin from photosynthetic organisms; namely plants and algae where it is located in the chloroplast or cyanelle; and cyanobacteria.
 - Ferredoxin from archaebacteria of the Halobacterium genus.
 - Ferredoxin IV (gene pftA) and V (gene fdxD) from Rhodobacter capsulatus.
 - Ferredoxin in the toluene degradation operon (gene xylT) and naphthalene degradation operon (gene nahT) of Pseudomonas putida.
 - Hypothetical Escherichia coli protein yfaE.
 - The N-terminal domain of the bifunctional ferredoxin/ferredoxin reductase electron transfer component of the benzoate 1,2-dioxygenase complex (gene benC) from Acinetobacter calcoaceticus, the toluene 4-monooxygenase complex (gene tmoF), the toluate 1,2-dioxygenase system (gene xylZ), and the xylene monooxygenase system (gene xylA) from Pseudomonas.
 - The N-terminal domain of phenol hydroxylase protein p5 (gene dmpP) from Pseudomonas putida.
 - The N-terminal domain of methane monooxygenase component C (gene mmoC) from Methylococcus capsulatus.
 - The C-terminal domain of the vanillate degradation pathway protein vanB in a Pseudomonas species.
 - The N-terminal domain of bacterial fumarate reductase iron-sulfur protein (gene frdB).
 - The N-terminal domain of CDP-6-deoxy-3,4-glucoseen reductase (gene ascD) from Yersinia pseudotuberculosis.
 - The central domain of eukaryotic succinate dehydrogenase (ubiquinone) iron-sulfur protein.
 - The N-terminal domain of eukaryotic xanthine dehydrogenase.
 - The N-terminal domain of eukaryotic aldehyde oxidase.

Three of the four conserved cysteines are clustered together in the same region of the protein. Our signature pattern spans that iron-sulfur binding region. We also developed a profile that covers the whole domain.

-Consensus pattern: C-{C}-{C}-[GA]-{C}-C-[GAST]-{CPDEKRHFYW}-C
                    [The 3 C's are 2Fe-2S ligands]
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: Ferredoxins from the adrenodoxin subfamily are slightly divergent and are not picked up by our pattern (but they are recognized by the profile). We have thus developed a second pattern specific for this subfamily (see <PDOC00642>).
-Last update: March 2005 / Text revised; profile added.

[ 1] Meyer J.
     Trends Ecol. Evol. 3:222-226(1988).
[ 2] Harayama S., Polissi A., Rekik M.
     "Divergent evolution of chloroplast-type ferredoxins."
     FEBS Lett. 285:85-88(1991).
     PubMed=2065785
[ 3] Fukuyama K., Ueki N., Nakamura H., Tsukihara T., Matsubara H.
     "Tertiary structure of [2Fe-2S] ferredoxin from Spirulina platensis refined at 2.5 A resolution: structural comparisons of plant-type ferredoxins and an electrostatic potential analysis."
     J. Biochem. 117:1017-1023(1995).
     PubMed=8586613
[ 4] Overington J.P.
     Curr. Opin. Struct. Biol. 2:394-401(1992).
{END}
{PDOC00176}
{PS00198; 4FE4S_FER_1}
{PS51379; 4FE4S_FER_2}
{BEGIN}
***************************************************************************
* 4Fe-4S ferredoxin-type iron-sulfur binding domain signature and profile *
***************************************************************************

Ferredoxins [1] are a group of iron-sulfur proteins which mediate electron transfer in a wide variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending upon the physiological nature of the iron-sulfur cluster(s). One of these subgroups is the 4Fe-4S ferredoxins, which are found in bacteria and which are thus often referred to as 'bacterial-type' ferredoxins. The structure of these proteins [2] consists of the duplication of a domain of twenty-six amino acid residues; each of these domains contains four cysteine residues that bind to a 4Fe-4S center.

Several structures of the 4Fe-4S ferredoxin domain have been determined (see for example <PDB:1FDN>) [3]. The clusters consist of two interleaved 4Fe- and 4S-tetrahedra forming a cubane-like structure, in such a way that the four iron and four sulfur atoms occupy the eight corners of a distorted cube. Each 4Fe-4S cluster is attached to the polypeptide chain by four covalent Fe-S bonds involving cysteine residues.

A number of proteins have been found [4] that include one or more 4Fe-4S binding domains similar to those of bacterial-type ferredoxins. These proteins are listed below:

 - The iron-sulfur proteins of the succinate dehydrogenase and the fumarate reductase complexes (EC 1.3.99.1).
   These enzyme complexes, which are components of the tricarboxylic acid cycle, each contain three subunits: a flavoprotein, an iron-sulfur protein, and a b-type cytochrome. The iron-sulfur proteins contain three different iron-sulfur centers: a 2Fe-2S, a 3Fe-4S and a 4Fe-4S.
 - Escherichia coli anaerobic glycerol-3-phosphate dehydrogenase (EC 1.1.99.5). This enzyme is composed of three subunits: A, B, and C. The C subunit seems to be an iron-sulfur protein with two ferredoxin-like domains in the N-terminal part of the protein.
 - Escherichia coli anaerobic dimethyl sulfoxide reductase. The B subunit of this enzyme (gene dmsB) is an iron-sulfur protein with four 4Fe-4S ferredoxin-like domains.
 - Escherichia coli formate hydrogenlyase. Two of the subunits of this oligomeric complex (genes hycB and hycF) seem to be iron-sulfur proteins that each contain two 4Fe-4S ferredoxin-like domains.
 - Methanobacterium formicicum formate dehydrogenase (EC 1.2.1.2). This enzyme is used by the archaebacteria to grow on formate. The beta chain of this dimeric enzyme probably binds two 4Fe-4S centers.
 - Escherichia coli formate dehydrogenases N and O (EC 1.2.1.2). The beta chains of these two enzymes (genes fdnH and fdoH) are iron-sulfur proteins with four 4Fe-4S ferredoxin-like domains.
 - Desulfovibrio periplasmic [Fe] hydrogenase (EC 1.18.99.1). The large chain of this dimeric enzyme binds three 4Fe-4S centers, two of which are located in the ferredoxin-like N-terminal region of the protein.
 - Methanobacterium thermoautotrophicum methyl viologen-reducing hydrogenase subunit mvhB, which contains six tandemly repeated ferredoxin-like domains and which probably binds twelve 4Fe-4S centers.
 - Salmonella typhimurium anaerobic sulfite reductase (EC 1.8.1.-) [5]. Two of the subunits of this enzyme (genes asrA and asrC) seem to both bind two 4Fe-4S centers.
 - A ferredoxin-like protein (gene fixX) from the nitrogen-fixation genes locus of various Rhizobium species, and one from the nif region of Azotobacter species.
 - The 9 Kd polypeptide of chloroplast photosystem I [6] (gene psaC). This protein contains two low potential 4Fe-4S centers, referred to as the A and B centers.
 - The chloroplast frxB protein which is predicted to carry two 4Fe-4S centers.
 - A ferredoxin from a primitive eukaryote, the enteric amoeba Entamoeba histolytica.
 - Escherichia coli hypothetical protein yjjW, a protein with an N-terminal region belonging to the radical activating enzymes family (see <PDOC00834>) and two potential 4Fe-4S centers.

The pattern of cysteine residues in the iron-sulfur region is sufficient to detect this class of 4Fe-4S binding proteins. The profile we developed covers the whole domain.

-Consensus pattern: C-x-{P}-C-x(2)-C-{CP}-x(2)-C-[PEG]
                    [The 4 C's are 4Fe-4S ligands]
-Sequences known to belong to this class detected by the pattern: known 4Fe-4S sequences, with very few exceptions.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: In some bacterial ferredoxins, one of the two duplicated domains has lost one or more of the four conserved cysteines. The consequence of such variations is that these domains have either lost their iron-sulfur binding property or bind to a 3Fe-4S center instead of a 4Fe-4S center.
-Note: In most proteins belonging to this group, the last residue of this pattern is a Pro; the only exceptions are the Rhizobium ferredoxin-like proteins which have Gly, and two Desulfovibrio ferredoxins which have Glu. It must also be noted that the three non 4Fe-4S-binding proteins which are picked up by the pattern have Gly in this position of the pattern.
-Last update: April 2008 / Text revised; profile added.

[ 1] Meyer J.
     "The evolution of ferredoxins."
     Trends Ecol. Evol.
     3:222-226(1988).
[ 2] Otaka E., Ooi T.
     "Examination of protein sequence homologies: IV. Twenty-seven bacterial ferredoxins."
     J. Mol. Evol. 26:257-267(1987).
     PubMed=3129571
[ 3] Duee E.D., Fanchon E., Vicat J., Sieker L.C., Meyer J., Moulis J.M.
     "Refined crystal structure of the 2[4Fe-4S] ferredoxin from Clostridium acidurici at 1.84 A resolution."
     J. Mol. Biol. 243:683-695(1994).
     PubMed=7966291
[ 4] Beinert H.
     "Recent developments in the field of iron-sulfur proteins."
     FASEB J. 4:2483-2491(1990).
     PubMed=2185975
[ 5] Huang C.J., Barrett E.L.
     "Sequence analysis and expression of the Salmonella typhimurium asr operon encoding production of hydrogen sulfide from sulfite."
     J. Bacteriol. 173:1544-1553(1991).
     PubMed=1704886
[ 6] Knaff D.B.
     "The photosystem I reaction centre."
     Trends Biochem. Sci. 13:460-461(1988).
{END}
{PDOC00177}
{PS51296; RIESKE}
{BEGIN}
**********************************************
* Rieske [2Fe-2S] iron-sulfur domain profile *
**********************************************

There are multiple types of iron-sulfur clusters which are grouped into three main categories based on their atomic content: [2Fe-2S], [3Fe-4S], [4Fe-4S] (see <PDOC00176>), and other hybrid or mixed metal types. Two general types of [2Fe-2S] clusters are known and they differ in their coordinating residues. The ferredoxin-type [2Fe-2S] clusters are coordinated to the protein by four cysteine residues (see <PDOC00175>).
The Rieske-type [2Fe-2S] cluster is coordinated to its protein by two cysteine residues and two histidine residues [1,2].

The structure of several Rieske domains has been solved (see for example <PDB:1RIE>) [3]. It contains three layers of antiparallel beta sheets forming two beta sandwiches. Both beta sandwiches share the central sheet 2. The metal-binding site is at the top of the beta sandwich formed by the sheets 2 and 3. The Fe1 iron of the Rieske cluster is coordinated by two cysteines while the other iron Fe2 is coordinated by two histidines. Two inorganic sulfide ions bridge the two iron ions forming a flat, rhombic cluster.

Rieske-type iron-sulfur clusters are common to electron transfer chains of mitochondria and chloroplasts and to non-heme iron oxygenase systems:

 - The Rieske protein of the ubiquinol-cytochrome c reductase (EC 1.10.2.2) (also known as the bc1 complex or complex III), a complex of the electron transport chains of mitochondria and of some aerobic prokaryotes; it catalyzes the oxidoreduction of ubiquinol and cytochrome c.
 - The Rieske protein of chloroplastic plastoquinone-plastocyanin reductase (EC 1.10.99.1) (also known as the b6f complex). It is functionally similar to the bc1 complex and catalyzes the oxidoreduction of plastoquinol and cytochrome f.
 - Bacterial naphthalene 1,2-dioxygenase subunit alpha, a component of the naphthalene dioxygenase (NDO) multicomponent enzyme system which catalyzes the incorporation of both atoms of molecular oxygen into naphthalene to form cis-naphthalene dihydrodiol.
 - Bacterial 3-phenylpropionate dioxygenase ferredoxin subunit.
 - Bacterial toluene monooxygenase.
 - Bacterial biphenyl dioxygenase.

The profile we developed covers the whole Rieske domain.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: March 2007 / Text revised; profiles added; patterns deleted.
-Note: The Rieske profile is in competition with a profile of a related domain, i.e. nirD Rieske-like domain (see <PDOC51300>).

[ 1] Ferraro D.J., Gakhar L., Ramaswamy S.
     "Rieske business: structure-function of Rieske non-heme oxygenases."
     Biochem. Biophys. Res. Commun. 338:175-190(2005).
     PubMed=16168954; DOI=10.1016/j.bbrc.2005.08.222
[ 2] Schneider D., Schmidt C.L.
     "Multiple Rieske proteins in prokaryotes: where and why?"
     Biochim. Biophys. Acta 1710:1-12(2005).
     PubMed=16271700; DOI=10.1016/j.bbabio.2005.09.003
[ 3] Iwata S., Saynovits M., Link T.A., Michel H.
     "Structure of a water soluble fragment of the 'Rieske' iron-sulfur protein of the bovine heart mitochondrial cytochrome bc1 complex determined by MAD phasing at 1.5 A resolution."
     Structure 4:567-579(1996).
     PubMed=8736555
{END}
{PDOC00178}
{PS00201; FLAVODOXIN}
{BEGIN}
************************
* Flavodoxin signature *
************************

Flavodoxins [1,E1] are electron-transfer proteins that function in various electron transport systems. Flavodoxins bind one FMN molecule, which serves as a redox-active prosthetic group. Flavodoxins are functionally interchangeable with ferredoxins. They have been isolated from prokaryotes, cyanobacteria, and some eukaryotic algae.

The signature pattern for these proteins is derived from a conserved region in their N-terminal section; this region is involved in the binding of the FMN phosphate group.

-Consensus pattern: [LIV]-[LIVFY]-[FY]-x-[ST]-{V}-x-[AGC]-x-T-{P}-x(2)-A-{L}-x-[LIV]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 6.
-Last update: April 2006 / Pattern revised.

[ 1] Wakabayashi S., Kimura T., Fukuyama K., Matsubara H., Rogers L.J.
     "The amino acid sequence of a flavodoxin from the eukaryotic red alga Chondrus crispus."
     Biochem. J. 263:981-984(1989).
     PubMed=2597140
[E1] http://www.icgeb.trieste.it/p450/flavodoxins.html
{END}
{PDOC00179}
{PS00202; RUBREDOXIN}
{BEGIN}
************************
* Rubredoxin signature *
************************

Rubredoxins [1] are small prokaryotic electron-transfer proteins. They contain an iron atom which is ligated by four cysteine residues. Rubredoxins are, in some cases, functionally interchangeable with ferredoxins.

As a pattern for these proteins we have selected a conserved region that includes two of the cysteine residues that bind the iron atom.

-Consensus pattern: [LIVM]-x-{G}-{R}-W-x-C-P-x-C-[AGD]
                    [The 2 C's bind the iron atom]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 2.
-Note: In Pseudomonas oleovorans rubredoxin 2 (gene alkG) [2], this pattern is found twice because alkG has two rubredoxin domains.
-Note: Rubrerythrin [3], a protein with inorganic pyrophosphatase activity from Desulfovibrio vulgaris, possesses a C-terminal rubredoxin-like domain, but this domain is too divergent to be detected by the above pattern.
-Last update: December 2004 / Pattern and text revised.

[ 1] Berg J.M., Holm R.H.
     (In) Iron-sulfur proteins, Spiro T.G., Ed., pp1-66, Wiley, New-York, (1982).
[ 2] Kok M., Oldenhuis R., van der Linden M.P.G., Meulenberg C.H.C., Kingma J., Witholt B.
     "The Pseudomonas oleovorans alkBAC operon encodes two structurally related rubredoxins and an aldehyde dehydrogenase."
     J. Biol. Chem. 264:5442-5451(1989).
     PubMed=2647719
[ 3] van Beeumen J.J., van Driessche G., Liu M.-Y., Le Gall J.
     "The primary structure of rubrerythrin, a protein with inorganic pyrophosphatase activity from Desulfovibrio vulgaris. Comparison with hemerythrin and rubredoxin."
     J. Biol. Chem. 266:20645-20653(1991).
     PubMed=1657933
{END}
{PDOC00180}
{PS00203; METALLOTHIONEIN_VRT}
{BEGIN}
*****************************************
* Vertebrate metallothioneins signature *
*****************************************

Metallothioneins (MT) [1,2,3] are small proteins which bind heavy metals such as zinc, copper, cadmium, nickel, etc., through clusters of thiolate bonds. MT's occur throughout the animal kingdom and are also found in higher plants, fungi and some prokaryotes. On the basis of structural relationships MT's have been subdivided into three classes.
Class I includes mammalian MT's as well as MT's from crustaceans and molluscs with clearly related primary structure. Class II groups together MT's from various species such as sea urchins, fungi, insects and cyanobacteria which display no or only very distant correspondence to class I MT's. Class III MT's are atypical polypeptides containing gamma-glutamylcysteinyl units.

Vertebrate class I MT's are proteins of 60 to 68 amino acid residues; 20 of these residues are cysteines that bind to 7 bivalent metal ions. As a signature pattern we chose a region that spans 19 residues and which contains seven of the metal-binding cysteines; this region is located in the N-terminal section of class-I MT's.

-Consensus pattern: C-x-C-[GSTAP]-x(2)-C-x-C-x(2)-C-x-C-x(2)-C-x-K
                    [The 7 C's are involved in metal binding]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: This signature pattern is not meant to detect invertebrate class-I MT's whose sequence is highly divergent from that of vertebrates.
-Expert(s) to contact by email: Binz P.-A.; [email protected]
-Last update: May 2004 / Text revised.

[ 1] Hamer D.H.
     "Metallothionein."
     Annu. Rev. Biochem. 55:913-951(1986).
     PubMed=3527054; DOI=10.1146/annurev.bi.55.070186.004405
[ 2] Kagi J.H.R., Schaffer A.
     "Biochemistry of metallothionein."
     Biochemistry 27:8509-8515(1988).
     PubMed=3064814
[ 3] Binz P.-A.
     Thesis, 1996, University of Zurich.
{END}
{PDOC00181}
{PS00540; FERRITIN_1}
{PS00204; FERRITIN_2}
{BEGIN}
********************************************
* Ferritin iron-binding regions signatures *
********************************************

Ferritin [1,2] is one of the major non-heme iron storage proteins. It consists of a mineral core of hydrated ferric oxide, and a multi-subunit protein shell which encloses the former and ensures its solubility in an aqueous environment.

In animals the protein is mainly cytoplasmic and there are generally two or more genes that encode closely related subunits (in mammals there are two subunits which are known as H(eavy) and L(ight)). In plants ferritin is found in the chloroplast [3].

There are a number of well conserved regions in the sequence of ferritins. We have selected two of these regions to develop signature patterns. The first pattern is located in the central part of the sequence of ferritin and it contains three conserved glutamates which are thought to be involved in the binding of iron. The second pattern is located in the C-terminal section; it corresponds to a region which forms a hydrophilic channel through which small molecules and ions can gain access to the central cavity of the molecule. This pattern also includes conserved acidic residues which are potential metal binding sites.

-Consensus pattern: E-x-[KR]-E-x(2)-E-[KR]-[LF]-[LIVMA]-x(2)-Q-N-x-R-x-G-R
                    [The 3 E's may be iron ligands]
-Sequences known to belong to this class detected by the pattern: ALL, except for ferritin 1 from Schistosoma mansoni.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Consensus pattern: D-x(2)-[LIVMF]-[STACQV]-[DH]-[FYMI]-[LIV]-[EN]-x(2)-[FYC]-L-x(6)-[LIVMQ]-[KNER]
                    [The second D and the E may be iron ligands]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.
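
The two ferritin consensus patterns use the PROSITE pattern syntax, which maps closely onto regular expressions. The following is a minimal Python sketch of that mapping and is not part of PROSITE: the function name prosite_to_regex and the test peptide are hypothetical, the patterns are written with every element separated by a hyphen, and only the syntax elements that occur in these patterns are handled (single residues, x, [..], {..} and repeat counts). Anchors such as < and > are not supported.

```python
import re

# One pattern element: a residue, x, [allowed] or {forbidden}, optionally
# followed by a repeat count such as (2) or (2,4).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern string into a Python regular expression."""
    parts = []
    for element in pattern.replace(" ", "").rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: %r" % element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # x = any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # {..} = any residue except these
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


FERRITIN_1 = "E-x-[KR]-E-x(2)-E-[KR]-[LF]-[LIVMA]-x(2)-Q-N-x-R-x-G-R"
FERRITIN_2 = ("D-x(2)-[LIVMF]-[STACQV]-[DH]-[FYMI]-[LIV]-[EN]-x(2)-[FYC]-L-x(6)-"
              "[LIVMQ]-[KNER]")

if __name__ == "__main__":
    regex = prosite_to_regex(FERRITIN_1)
    print(regex)  # E.[KR]E.{2}E[KR][LF][LIVMA].{2}QN.R.GR
    # Made-up peptide used only to exercise the regex; not a real ferritin.
    hit = re.search(regex, "MSEAKEAAEKLLAAQNARAGRT")
    print(hit.start(), hit.group())
```

A pattern element that is not hyphen-separated, such as a fused "GR", is rejected with a ValueError rather than guessed at, which makes extraction damage in a pattern line easy to spot.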
[ 1] Crichton R.R., Charloteaux-Wauters M. "Iron transport and storage." Eur. J. Biochem. 164:485-506(1987). PubMed=3032619 [ 2] Theil E.C. "Ferritin: structure, gene regulation, and cellular function in animals, plants, and microorganisms." Annu. Rev. Biochem. 56:289-315(1987). PubMed=3304136; DOI=10.1146/annurev.bi.56.070187.001445 [ 3] Ragland M., Briat J.-F., Gagnon J., Laulhere J.-P., Massenet O., Theil E.C. "Evidence for conservation of ferritin sequences among plants and animals and for a transit peptide in soybean." J. Biol. Chem. 265:18339-18344(1990). PubMed=2211706 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00182} {PS00205; TRANSFERRIN_LIKE_1} {PS00206; TRANSFERRIN_LIKE_2} {PS00207; TRANSFERRIN_LIKE_3} {PS51408; TRANSFERRIN_LIKE_4} {BEGIN} ************************************************** * Transferrin-like domain signatures and profile * ************************************************** The transferrin family is a group of glycosylated proteins found in both vertebrates and invertebrates. Included in this group are molecules known to bind iron, including serotransferrin, ovotransferrin, lactotransferrin, and melanotransferrin (MTF). Additional members of this family include inhibitor of carbonic anhydrase (ICA; mammals), major yolk protein (sea urchins), saxiphilin (frog), pacifastin (crayfish), and TTF-1 (algae). Most family members contain two transferrin-like domains of around 340 amino acids, the result of an ancient duplication event [1]. 
Each of the duplicated domains can be further divided into two subdomains that form a cleft inside of which the iron atom is bound in iron-transporting transferrin (see <PDB:1LFH>) [2]. The iron-coordinating residues consist of an aspartic acid, two tyrosines and a histidine, as well as an arginine that coordinates a requisite anion. In addition to iron and anion liganding residues, the transferrin-like domain contains conserved cysteine residues involved in disulfide bond formation. Some proteins known to contain a transferrin-like domain are listed below: - Mammalian blood serotransferrin (siderophilin). It functions to deliver iron to cells via a receptor-mediated endocytic process as well to remove toxic free iron from the blood and to provide an anti-bacterial, low-iron environment. - Mammalian milk lactotransferrin (lactoferrin). It has antimicrobial activity and contributes to innate immunity by limiting the availability of iron to pathogenic organisms. In addition, lactoferrin appears to be a serine protease of the peptidase S60 family [E1] with an active site that may consist of a Ser-Lys catalytic dyad. Lactoferrin cleaves the putative Haemophilus influenzae colonization factors IgA1 protease and Hap adhesin at homologous arginine-rich sequences [3]. - Vertebrate egg white ovotransferrin (conalbumin). Its major function is thought to be keeping the iron concentration low in bodily fluids to prevent invading bacteria from acquiring iron. - Mammalian membrane-associated melanotransferrin. It was first identified in human skin cells but now is known to be expressed across a broad range of tissue types, and is of unknown function. It has only a single functional iron binding site located in its N-terminal domain. - Porcine inhibitor of carbonic anhydrase (ICA). It specifically binds and inhibits carbonic anhydrase 2 with nanomolar affinity but does not bind iron with high affinity [4]. 
- Bull frog saxiphilin, a plasma protein that binds saxitoxin (STX), a causative agent of paralytic shellfish poisoning. STX binds to the C-terminal transferrin-like domain of saxiphilin. The N-terminal transferrin-like domain includes an insert that represents two tandem thyroglobulin domains. Unlike transferrins, saxiphilin does not bind iron [5]. - Sea urchin toposome or major yolk protein (MYP), a modified calcium-binding, iron-less transferrin essential for cell adhesion and development. The protein lacks most of the five iron-binding amino acids D, Y, R, Y, and H present at specific positions in iron-transporting transferrins, which is consistent with the Ca(2+)-binding function of the toposome in cell adhesion rather than transport. The toposome polypeptide contains an insertion of some 280 amino acids in the second transferrin-like domain [6]. - Crayfish pacifastin, an iron-binding serine proteinase inhibitor. This protein is a heterodimer consisting of one proteinase inhibitory light chain and one heavy chain related to transferrins. The pacifastin heavy chain contains three transferrin-like domains, two of which seem to be active for iron binding [7]. - Green alga Dunaliella salina TTF-1. The membrane-associated TTF-1 is distinctly different in encompassing three, rather than two, transferrin-like domains [8]. We have developed three different signature patterns for iron-binding transferrin-like domains. Each of them is centered on one of the iron-binding residues: the two tyrosines and the histidine, respectively. We also developed a profile which covers the entire transferrin-like domain. -Consensus pattern: Y-x(0,1)-[VAS]-V-[IVAC]-[IVA]-[IVA]-[RKH]-[RKS]-[GDENSA] [Y is an iron ligand] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 3.
-Consensus pattern: [YI]-x-G-A-[FLI]-[KRHNQS]-C-L-x(3,4)-G-[DENQ]-V-[GAT]-[FYW] [Y is an iron ligand] [C is involved in a disulfide bond] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: [DENQK]-[YF]-x-[LY]-L-C-x-[DN]-x(5,8)-[LIV]-x(4,5)-C-x(2)-A-x(4)-[HQR]-x-[LIVMFYW]-[LIVM] [H is an iron ligand] [The 2 C's are linked by a disulfide bond] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: February 2009 / Text revised. [ 1] Lambert L.A., Perri H., Meehan T.J. "Evolution of duplications in the transferrin family of proteins." Comp. Biochem. Physiol. 140B:11-25(2005). PubMed=15621505; DOI=10.1016/j.cbpc.2004.09.012 [ 2] Anderson B.F., Baker H.M., Norris G.E., Rice D.W., Baker E.N. "Structure of human lactoferrin: crystallographic structure analysis and refinement at 2.8 A resolution." J. Mol. Biol. 209:711-734(1989). PubMed=2585506 [ 3] Hendrixson D.R., Qiu J., Shewry S.C., Fink D.L., Petty S., Baker E.N., Plaut A.G., St. Geme J.W. III "Human milk lactoferrin is a serine protease that cleaves Haemophilus surface proteins at arginine-rich sites." Mol. Microbiol. 47:607-617(2003). PubMed=12535064 [ 4] Wuebbens M.W., Roush E.D., Decastro C.M., Fierke C.A. "Cloning, sequencing, and recombinant expression of the porcine inhibitor of carbonic anhydrase: a novel member of the transferrin family." Biochemistry 36:4327-4336(1997). PubMed=9100029; DOI=10.1021/bi9627424 [ 5] Krishnan G., Morabito M.A., Moczydlowski E. "Expression and characterization of Flag-epitope- and hexahistidine-tagged derivatives of saxiphilin for use in detection and assay of saxitoxin." Toxicon 39:291-301(2001).
PubMed=10978747 [ 6] Noll H., Alcedo J., Daube M., Frei E., Schiltz E., Hunt J., Humphries T., Matranga V., Hochstrasser M., Aebersold R., Lee H., Noll M. "The toposome, essential for sea urchin cell adhesion and development, is a modified iron-less calcium-binding transferrin." Dev. Biol. 310:54-70(2007). PubMed=17707791; DOI=10.1016/j.ydbio.2007.07.016 [ 7] Liang Z., Sottrup-Jensen L., Aspan A., Hall M., Soederhaell K. "Pacifastin, a novel 155-kDa heterodimeric proteinase inhibitor containing a unique transferrin chain." Proc. Natl. Acad. Sci. U.S.A. 94:6682-6687(1997). PubMed=9192625 [ 8] Fisher M., Gokhman I., Pick U., Zamir A. "A structurally novel transferrin-like protein accumulates in the plasma membrane of the unicellular green alga Dunaliella salina grown in high salinities." J. Biol. Chem. 272:1565-1570(1997). PubMed=8999829 [E1] http://merops.sanger.ac.uk/cgi-bin/make_frame_file?id=S60 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00183} {PS00208; PLANT_GLOBIN} {BEGIN} ******************************* * Plant hemoglobins signature * ******************************* Leghemoglobins [1] are hemoproteins present in the root nodules of leguminous plants. Leghemoglobins are structurally and functionally related to hemoglobin and myoglobin. By providing oxygen to the bacteroids, they are essential for symbiotic nitrogen fixation. 
Structurally related hemoglobins are found in nonsymbiotic plants where they may not function as an oxygen storage or transport proteins, but might act as an oxygen sensors [2]. We have developed a signature pattern that exclusively picks up the sequence of plants hemoglobins. It is centered on an histidine that acts as the heme iron distal ligand. -Consensus pattern: [SN]-P-x-[LV]-x(2)-H-A-x(3)-F [H is an heme iron ligand] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 1. -Last update: December 2001 / Pattern and text revised. [ 1] Powell R., Gannon F. "The leghaemoglobins." BioEssays 9:117-121(1988). PubMed=2906540 [ 2] Arredondo-Peter R., Hargrove M.S., Moran J.F., Sarath G., Klucas R.V. Plant Physiol. 118:1121-1126(1998). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00184} {PS00209; HEMOCYANIN_1} {PS00210; HEMOCYANIN_2} {BEGIN} ************************************************** * Arthropod hemocyanins / insect LSPs signatures * ************************************************** Hemocyanins are copper-containing oxygen carriers occurring freely dissolved in the hemolymph of many molluscs and arthropods [1]. Arthropod hemocyanins consist of hexamer or multi-hexamers with subunits of about 75 Kd. Each of these subunits binds two copper ions. Larval storage proteins (LSP) [2] are proteins from the hemolymph of insects, which may serve as a store of amino acids for synthesis of adult proteins. 
There are two classes of LSP's: arylphorins, which are rich in aromatic amino acids, and methionine-rich LSP's. LSP's form hexameric complexes. LSP's are structurally related to arthropod hemocyanins. In the lepidopteran Trichoplusia ni a protein has been found [3] which is associated with larval metamorphosis. This protein, which is called acidic juvenile hormone-suppressible protein 1 (AJSP-1), is also structurally related to arthropod hemocyanins. As signature patterns for these proteins we selected two conserved regions. The first of these regions is located in the N-terminal section of these proteins and includes a conserved histidine residue which, in hemocyanins, binds a copper atom. The second pattern is located in the central part of the protein. -Consensus pattern: Y-[FYW]-x-E-D-[LIVM]-x(2)-N-x(6)-H-x(3)-P [H is a copper ligand in hemocyanins] -Sequences known to belong to this class detected by the pattern: ALL, except most LSPs. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: T-x(2)-R-D-P-x-[FY]-[FYW] -Sequences known to belong to this class detected by the pattern: ALL, except one LSP. -Other sequence(s) detected in Swiss-Prot: 2. -Note: See also the pattern for the tyrosinase copper B binding site <PDOC00398>; this pattern will also pick up all arthropod and mollusc hemocyanins. -Last update: November 1997 / Patterns and text revised. [ 1] Linzen B. "Blue blood: structure and evolution of hemocyanin." Naturwissenschaften 76:206-211(1989). PubMed=2664531 [ 2] Willott E., Wang X.-Y., Wells M.A. "cDNA and gene sequence of Manduca sexta arylphorin, an aromatic amino acid-rich larval serum protein. Homology to arthropod hemocyanins." J. Biol. Chem. 264:19052-19059(1989). PubMed=2808410 [ 3] Jones G., Brown N., Manczak M., Hiremath S., Kafatos F.C. "Molecular cloning, regulation, and complete sequence of a hemocyanin-related, juvenile hormone-suppressible protein from insect hemolymph." J. Biol. Chem. 265:8596-8602(1990).
PubMed=2341396 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00185} {PS00211; ABC_TRANSPORTER_1} {PS50893; ABC_TRANSPORTER_2} {BEGIN} ********************************************************************* * ATP-binding cassette, ABC transporter-type, signature and profile * ********************************************************************* ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs [1]. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function.
In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [2,3]. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A (see <PDOC00017>), and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site [4,5,6]. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms (see <PDB:1B0U>). ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. 
In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis [7,8,9,10,11,12]. Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used [E1]. All different types of transporters with a functional attribution are listed below (references are only provided for recently characterized proteins). In prokaryotes: Active import transport system components: - Carbohydrate uptake transporter. Cobalt uptake transporter (cbiO). Ferric iron uptake transporter. Hydrophobic amino acid uptake transporter. Iron Chelate uptake transporter. Manganese/Zinc/Iron chelate uptake transporter. Molybdate uptake transporter. Nitrate/Nitrite/Cyanate uptake transporter. Peptide/Opine/Nickel uptake transporter. Phosphate uptake transporter. Phosphonate uptake transporter. Polyamine/Opine/Phosphonate uptake transporter. Quaternary amine uptake transporter. Sulfate uptake transporter. - Taurine uptake tranporter (tauB). - Thiamin uptake transporter (thiamin/thiamin pyrophosphate) (thiQ/yabJ). - Vitamine B12 uptake tranporter (btuD). Active export transport system components: - Capsular polysaccharide exporter (kpsT). - Drug exporter-1: daunorubicin/doxorubicin (drrA); oleandomycin (oleC4). - Drug resistance ATPase-1. - Drug/siderophore exporter-3. - Glucan exporter: Beta-(1,2)-glucan export (chvA/ndvA). - Lipid A exporter (msbA). - Lantibiotic exporter: hemolysin/bacteriocin (cylB). - Lipooligosaccharide exporter (nodulation protein nodI from Rhizobium). - Lipopolysaccharide exporter (rbfA). - Micrococin B17 exporter (mcbF). - Micrococin J25 exporter (mcjD). - Peptide-2 exporter: competence factor (comA/comB). - Peptide-3 exporter: modified cyclic peptide (syrD. - Protein-1 exporter: hemolysin (hlyB). - Protein-2 exporter: colicin V(cvaB). - S-layer protein exporter (rsaD/sapD). - Techoic Acid Exporter (tagH). 
In eukaryotes: - ALDP, a peroxisomal protein involved in X-linked adrenoleukodystrophy. - Antigen peptide transporters 1 (TAP1, PSF1, RING4, HAM-1, mtp1) and 2 (TAP2, PSF2, RING11, HAM-2, mtp2), which are involved in the transport of antigens from the cytoplasm to a membrane-bound compartment for association with MHC class I molecules. - Cystic fibrosis transmembrane conductance regulator (CFTR), which is most probably involved in the transport of chloride ions. - Drosophila proteins white (w) and brown (bw), which are involved in the import of ommatidium screening pigments. - Fungal elongation factor 3 (EF-3). - Multidrug transporters (Mdr1) (P-glycoprotein), a family of closely related proteins which extrude a wide variety of drugs out of the cell. - 70 Kd peroxisomal membrane protein (PMP70). - Sulfonylurea receptor, a putative subunit of the B-cell ATP-sensitive potassium channel. As a signature pattern for this class of proteins, we use a conserved region which is located between the 'A' and the 'B' motifs of the ATP-binding site. The profile we developed is directed against the conserved ABC module by covering the region between beta strand 1 and alpha helix 9, including not only the conserved motifs but also structural elements found N and C terminal to them. Our profile also recognizes the UvrA family which is evolutionarily related to the ABC transporter family. -Consensus pattern: [LIVMFYC]-[SA]-[SAPGLVFYKQH]-G-[DENQMW]-[KRQASPCLIMFW]-[KRNQSTAVM]-[KRACLVM]-[LIVMFYPAN]-{PHY}-[LIVMFW]-[SAGCLIVP]-{FYWHP}-{KRHP}-[LIVMFYWSTA] -Sequences known to belong to this class detected by the pattern: ALL, except for 25 sequences. -Other sequence(s) detected in Swiss-Prot: 53. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: The ATP-binding region is duplicated in araG, mdl, msrA, rbsA, tlrC, uvrA, yejF, Mdr's, CFTR, pmd1 and in EF-3.
In some of those proteins, the above pattern only detects one of the two copies of the domain. -Last update: November 2003 / Text revised. [ 1] Holland I.B., Cole S.P.C., Kuchler K., Higgins C.F. (In) ABC proteins from bacteria to man, Academic Press, San Diego, (2003). [ 2] Holland I.B., Blight M.A. J. Mol. Biol. 293:381-399(1999). [ 3] Saurin W., Hofnung M., Dassa E. "Getting in or out: early segregation between importers and exporters in the evolution of ATP-binding cassette (ABC) transporters." J. Mol. Evol. 48:22-41(1999). PubMed=9873074 [ 4] Higgins C.F. "ABC transporters: physiology, structure and mechanism--an overview." Res. Microbiol. 152:205-210(2001). PubMed=11421269 [ 5] Higgins C.F. "ABC transporters: from microorganisms to man." Annu. Rev. Cell Biol. 8:67-113(1992). PubMed=1282354; DOI=10.1146/annurev.cb.08.110192.000435 [ 6] Schneider E., Hunke S. "ATP-binding-cassette (ABC) transport systems: functional and structural aspects of the ATP-hydrolyzing subunits/domains." FEMS Microbiol. Rev. 22:1-20(1998). PubMed=9640644 [ 7] Kerr I.D. "Structure and association of ATP-binding cassette transporter nucleotide-binding domains." Biochim. Biophys. Acta 1561:47-64(2002). PubMed=11988180 [ 8] Karpowich N., Martsinkevich O., Millen L., Yuan Y.R., Dai P.L., MacVey K., Thomas P.J., Hunt J.F. "Crystal structures of the MJ1267 ATP binding cassette reveal an induced-fit effect at the ATPase active site of an ABC transporter." Structure 9:571-586(2001). PubMed=11470432 [ 9] Yuan Y.R., Blecker S., Martsinkevich O., Millen L., Thomas P.J., Hunt J.F. "The crystal structure of the MJ0796 ATP-binding cassette. Implications for the structural consequences of ATP hydrolysis in the active site of an ABC transporter." J. Biol. Chem. 276:32313-32321(2001). PubMed=11402022; DOI=10.1074/jbc.M100758200 [10] Hung L.W., Wang I.X., Nikaido K., Liu P.Q., Ames G.F., Kim S.H. Nature 396:703-707(1998).
[11] Diederichs K., Diez J., Greller G., Muller C., Breed J., Schnell C., Vonrhein C., Boos W., Welte W. EMBO J. 19:5951-5961(2000). [12] Gaudet R., Wiley D.C. EMBO J. 20:4964-4972(2001). [E1] http://www.tcdb.org/tcdb/index.php?tc=3.A.1 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00186} {PS00212; ALBUMIN_1} {PS51438; ALBUMIN_2} {BEGIN} **************************************** * Albumin domain signature and profile * **************************************** The following serum transport proteins are known to be evolutionarily related [1,2,3]: - Albumin (ALB), the main protein of plasma. It binds water, cations such as Ca++, Na+, K+, fatty acids, hormones, bilirubin and drugs. Its main function is the regulation of the colloidal osmotic pressure of blood. - Alpha-fetoprotein (AFP) (alpha-fetoglobulin). AFP is a fetal plasma protein which also binds various cations, fatty acids and bilirubin. - Vitamin D-binding protein (VDB), also known as group-specific component or Gc-globulin. VDB binds to vitamin D and its metabolites as well as fatty acids. - Afamin (or alpha-albumin), a protein whose biochemical role is not yet characterized. Structurally, these proteins consist of two to seven homologous domains of about 190 amino acids. Each domain, consisting of 10 alpha-helices, is formed by two smaller subdomains and contains five or six internal disulfide bonds as shown in the following schematic representation [4].

                   +---+          +----+                        +-----+
                   |   |          |    |                        |     |
xxCxxxxxxxxxxxxxxxxCCxxCxxxxCxxxxxCCxxxCxxxxxxxxxCxxxxxxxxxxxxxxCCxxxxCxxxx
  |                 |       |      |             |            ***|********
  +-----------------+       +------+             +---------------+

'C': conserved cysteine involved in a disulfide bond. '*': position of the pattern. The signature pattern we derived is based on three conserved cysteines at the end of the domain. We built it in such a way that it can detect all 3 repeats in albumin and human afamin, the first two in AFP and the first one in VDB and rat afamin. We also developed a profile, which covers the entire albumin domain. -Consensus pattern: [FY]-x(6)-C-C-x(2)-{C}-x(4)-C-[LFY]-x(6)-[LIVMFYW] [The 3 C's are involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: mammalian CD63 antigen. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: March 2009 / Text revised; profile added. [ 1] Haefliger D.N., Moskaitis J.E., Schoenberg D.R., Wahli W. "Amphibian albumins as members of the albumin, alpha-fetoprotein, vitamin D-binding protein multigene family." J. Mol. Evol. 29:344-354(1989). PubMed=2481749 [ 2] Schoentgen F., Metz-Boutigue M.-H., Jolles J., Constans J., Jolles P. "Complete amino acid sequence of human vitamin D-binding protein (group-specific component): evidence of a three-fold internal homology as in serum albumin and alpha-fetoprotein." Biochim. Biophys. Acta 871:189-198(1986). PubMed=2423133 [ 3] Lichenstein H.S., Lyons D.E., Wurfel M.M., Johnson D.A., McGinley M.D., Leidli J.C., Trollinger D.B., Mayer J.P., Wright S.D., Zukowski M.M. "Afamin is a new member of the albumin, alpha-fetoprotein, and vitamin D-binding protein gene family." J. Biol. Chem. 269:18149-18154(1994). PubMed=7517938 [ 4] He X.M., Carter D.C. "Atomic structure and chemistry of human serum albumin." Nature 358:209-215(1992).
PubMed=1630489; DOI=10.1038/358209a0 [ 5] Verboven C., Rabijns A., De Maeyer M., Van Baelen H., Bouillon R., De Ranter C. "A structural basis for the unique binding features of the human vitamin D-binding protein." Nat. Struct. Biol. 9:131-136(2002). PubMed=11799400; DOI=10.1038/nsb754 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00187} {PS00213; LIPOCALIN} {BEGIN} *********************** * Lipocalin signature * *********************** Proteins which transport small hydrophobic molecules such as steroids, bilins, retinoids, and lipids share limited regions of sequence homology and a common tertiary structure architecture [1 to 5,E1]. This is an eight stranded antiparallel beta-barrel with a repeated + 1 topology enclosing a internal ligand binding site [1,3]. The name 'lipocalin' has been proposed [5] for this protein family. Proteins known to belong to this family are listed below (references are only provided for recently determined sequences). - Alpha-1-microglobulin (protein HC), which seems to bind porphyrin. - Alpha-1-acid glycoprotein (orosomucoid), which can bind a remarkable array of natural and synthetic compounds [6]. - Aphrodisin which, in hamsters, functions as an aphrodisiac pheromone. - Apolipoprotein D, which probably binds heme-related compounds. - Beta-lactoglobulin, a milk protein whose physiological function appears to bind retinol. - Complement component C8 gamma chain, which seems to bind retinol [7]. 
- Crustacyanin [8], a protein from lobster carapace, which binds astaxanthin, a carotenoid. - Epididymal-retinoic acid binding protein (E-RABP) [9] involved in sperm maturation. - Insectacyanin, a moth bilin-binding protein, and a related butterfly bilinbinding protein (BBP). - Late Lactation protein (LALP), a milk protein from tammar wallaby [10]. - Neutrophil gelatinase-associated lipocalin (NGAL) (p25) (SV-40 induced 24p3 protein) [11]. - Odorant-binding protein (OBP), which binds odorants. - Plasma retinol-binding proteins (PRBP). - Human pregnancy-associated endometrial alpha-2 globulin. - Probasin (PB), a rat prostatic protein. - Prostaglandin D synthase (EC 5.3.99.2) (GSH-independent PGD synthetase), a lipocalin with enzymatic activity [12]. - Purpurin, a retinal protein which binds retinol and heparin. - Quiescence specific protein p20K from chicken (embryo CH21 protein). - Rodent urinary proteins (alpha-2-microglobulin), which may bind pheromones. - VNSP 1 and 2, putative pheromone transport proteins from mouse vomeronasal organ [13]. - Von Ebner's gland protein (VEGP) [14] (also called tear lipocalin), a mammalian protein which may be involved in taste recognition. - A frog olfactory protein, which may transport odorants. - A protein found in the cerebrospinal fluid of the toad Bufo Marinus with a supposed function similar to transthyretin in transport across the blood brain barrier [15]. - Lizard's epididymal secretory protein IV (LESP IV), which could transport small hydrophobic molecules into the epididymal fluid during sperm maturation [16]. - Prokaryotic outer-membrane protein blc [17]. The sequences of most members of the family, the core or kernal lipocalins, are characterized by three short conserved stretches of residues [3,18]. Others, the outlier lipocalin group, share only one or two of these [3,18]. 
A signature pattern was built around the first, common to all outlier and kernal lipocalins, which occurs near the start of the first beta-strand. -Consensus pattern: [DENG]-{A}-[DENQGSTARK]-x(0,2)-[DENQARK]-[LIVFY]-{CP}-G-{C}-W-[FYWLRH]-{D}-[LIVMTA] -Sequences known to belong to this class detected by the pattern: ALL, except for rodent alpha-1-acid glycoproteins, kangaroo beta-lactoglobulin, VEGP and LESP IV. -Other sequence(s) detected in Swiss-Prot: 82. -Note: It is suggested, on the basis of similarities of structure, function, and sequence, that this family forms an overall superfamily, called the calycins, with the avidin/streptavidin <PDOC00499> and the cytosolic fatty-acid binding proteins <PDOC00188> families [3,19]. -Expert(s) to contact by email: Flower D.R.; [email protected] -Last update: December 2004 / Pattern and text revised. [ 1] Cowan S.W., Newcomer M.E., Jones T.A. "Crystallographic refinement of human serum retinol binding protein at 2A resolution." Proteins 8:44-61(1990). PubMed=2217163 [ 2] Igaraishi M., Nagata A., Toh H., Urade H., Hayaishi N. Proc. Natl. Acad. Sci. U.S.A. 89:5376-5380(1992). [ 3] Flower D.R., North A.C.T., Attwood T.K. "Structure and sequence relationships in the lipocalins and related proteins." Protein Sci. 2:753-761(1993). PubMed=7684291 [ 4] Godovac-Zimmermann J. Trends Biochem. Sci. 13:64-66(1988). [ 5] Pervaiz S., Brew K. "Homology and structure-function correlations between alpha 1-acid glycoprotein and serum retinol-binding protein and its relatives." FASEB J. 1:209-214(1987). PubMed=3622999 [ 6] Kremer J.M.H., Wilting J., Janssen L.H. "Drug binding to human alpha-1-acid glycoprotein in health and disease." Pharmacol. Rev. 40:1-47(1988). PubMed=3064105 [ 7] Haefliger J.-A., Peitsch M.C., Jenne D.E., Tschopp J. "Structural and functional characterization of complement C8 gamma, a member of the lipocalin protein family." Mol. Immunol. 28:123-131(1991).
PubMed=1707134 [ 8] Keen J.N., Caceres I., Eliopoulos E.E., Zagalsky P.F., Findlay J.B.C. "Complete sequence and model for the A2 subunit of the carotenoid pigment complex, crustacyanin." Eur. J. Biochem. 197:407-417(1991). PubMed=2026162 [ 9] Newcomer M.E. "Structure of the epididymal retinoic acid binding protein at 2.1 A resolution." Structure 1:7-18(1993). PubMed=8069623 [10] Collet C., Joseph R. Biochim. Biophys. Acta 1167:219-222(1993). [11] Kjeldsen L., Johnsen A.H., Sengelov H., Borregaard N. J. Biol. Chem. 268:10425-10432(1993). [12] Peitsch M.C., Boguski M.S. Trends Biochem. Sci. 16:363-363(1991). [13] Miyawaki A., Matsushita Y.R., Ryo Y., Mikoshiba T. EMBO J. 13:5835-5842(1994). [14] Kock K., Ahlers C., Schmale H. Eur. J. Biochem. 221:905-916(1994). [15] Achen M.G., Harms P.J., Thomas T., Richardson S.J., Wettenhall R.E.H., Schreiber G. J. Biol. Chem. 267:23170-23174(1992). [16] Morel L., Dufarre J.-P., Depeiges A. J. Biol. Chem. 268:10274-10281(1993). [17] Bishop R.E., Penfold S.S., Frost L.S., Holtje J.V., Weiner J.H. J. Biol. Chem. 270:23097-23103(1995). [18] Flower D.R., North A.C.T., Attwood T.K. Biochem. Biophys. Res. Commun. 180:69-74(1991). [19] Flower D.R. FEBS Lett. 333:99-102(1993). [E1] http://www.jenner.ac.uk/Lipocalin/frontpage.htm +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+
{END}
{PDOC00188}
{PS00214; FABP}
{BEGIN}
***************************************************
* Cytosolic fatty-acid binding proteins signature *
***************************************************

A number of low molecular weight proteins which bind fatty acids and other organic anions are present in the cytosol [1,2]. Most of them are structurally related and have probably diverged from a common ancestor. The structure is a ten-stranded antiparallel beta-barrel, albeit with a wide discontinuity between the fourth and fifth strands, with a repeated +1 topology enclosing an internal ligand binding site [2,7]. Proteins known to belong to this family include:

- Six tissue-specific types of fatty acid binding proteins (FABPs) found in liver, intestine, heart, epidermal, adipocyte, brain/retina. Heart FABP is also known as mammary-derived growth inhibitor (MDGI), a protein that reversibly inhibits proliferation of mammary carcinoma cells. Epidermal FABP is also known as psoriasis-associated FABP [3].
- Insect muscle fatty acid-binding proteins.
- Testis lipid binding protein (TLBP).
- Cellular retinol-binding proteins I and II (CRBP).
- Cellular retinoic acid-binding protein (CRABP).
- Gastrotropin, an ileal protein which stimulates gastric acid and pepsinogen secretion. It seems that gastrotropin binds to bile salts and bilirubins.
- Fatty acid binding proteins MFB1 and MFB2 from the midgut of the insect Manduca sexta [4].

In addition to the above cytosolic proteins, this family also includes:

- Myelin P2 protein, which may be a lipid transport protein in Schwann cells. P2 is associated with the lipid bilayer of myelin.
- Schistosoma mansoni protein Sm14 [5] which seems to be involved in the transport of fatty acids.
- Ascaris suum p18, a secreted protein that may play a role in sequestering potentially toxic fatty acids and their peroxidation products or that may be involved in the maintenance of the impermeable lipid layer of the eggshell.
- Hypothetical fatty acid-binding proteins F40F4.2, F40F4.3, F40F4.4 and ZK742.5 from Caenorhabditis elegans.

We use as a signature pattern for these proteins a segment from the N-terminal extremity.

-Consensus pattern: [GSAIVK]-{FE}-[FYW]-x-[LIVMF]-x(2)-{K}-x-[NHG]-[FY]-[DE]-x-[LIVMFY]-[LIVM]-{N}-{G}-[LIVMAKR]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 22.
-Note: It is suggested, on the basis of similarities of structure, function, and sequence, that this family forms an overall superfamily, called the calycins, with the lipocalin <PDOC00187> and avidin/streptavidin <PDOC00499> families [6,7].
-Expert(s) to contact by email: Flower D.R.; [email protected]
-Last update: April 2006 / Pattern revised.

[ 1] Bernier I., Jolles P. "A survey on cytosolic non-enzymic proteins involved in the metabolism of lipophilic compounds: from organic anion binders to new protein families." Biochimie 69:1127-1152(1987). PubMed=3129018
[ 2] Veerkamp J.H., Peeters R.A., Maatman R.G.H.J. "Structural and functional features of different types of cytoplasmic fatty acid-binding proteins." Biochim. Biophys. Acta 1081:1-24(1991). PubMed=8068722
[ 3] Siegenthaler G., Hotz R., Chatellard-Gruaz D., Didierjean L., Hellman U., Saurat J.-H. "Purification and characterization of the human epidermal fatty acid-binding protein: localization during epidermal cell differentiation in vivo and in vitro." Biochem. J. 302:363-371(1994). PubMed=8092987
[ 4] Smith A.F., Tsuchida K., Hanneman E., Suzuki T.C., Wells M.A. "Isolation, characterization, and cDNA sequence of two fatty acid-binding proteins from the midgut of Manduca sexta larvae." J. Biol. Chem. 267:380-384(1992).
PubMed=1730603 [ 5] Moser D., Tendler M., Griffiths G., Klinkert M.-Q. "A 14-kDa Schistosoma mansoni polypeptide is homologous to a gene family of fatty acid binding proteins." J. Biol. Chem. 266:8447-8454(1991). PubMed=2022660 [ 6] Flower D.R., North A.C.T., Attwood T.K. "Structure and sequence relationships in the lipocalins and related proteins." Protein Sci. 2:753-761(1993). PubMed=7684291 [ 7] Flower D.R. "Structural relationship of streptavidin to the calycin protein superfamily." FEBS Lett. 333:99-102(1993). PubMed=8224179 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00189} {PS50920; SOLCAR} {BEGIN} ****************************************** * Solute carrier (Solcar) repeat profile * ****************************************** Different types of substrate carrier proteins involved in energy transfer are found in the inner mitochondrial membrane [1 to 5]. These are: - The ADP,ATP carrier protein (AAC) (ADP/ATP translocase) which exports ATP into the cytosol and imports ADP into the mitochondrial matrix. The sequence of AAC has been obtained from various mammalian, plant and fungal species. - The 2-oxoglutarate/malate carrier protein (OGCP), which exports 2-oxoglutarate into the cytosol and imports malate or other dicarboxylic acids into the mitochondrial matrix. This protein plays an important role in several metabolic processes such as the malate/aspartate and the oxoglutarate/isocitrate shuttles. 
- The phosphate carrier protein, which transports phosphate groups from the cytosol into the mitochondrial matrix.
- The brown fat uncoupling protein (UCP) which dissipates oxidative energy into heat by transporting protons from the cytosol into the mitochondrial matrix.
- The tricarboxylate transport protein (or citrate transport protein) which is involved in citrate-H+/malate exchange. It is important for the bioenergetics of hepatic cells as it provides a carbon source for fatty acid and sterol biosyntheses, and NAD for the glycolytic pathway.
- The Graves' disease carrier protein (GDC), a protein of unknown function recognized by IgG in patients with active Graves' disease.
- Yeast mitochondrial proteins MRS3 and MRS4. The exact function of these proteins is not known. They suppress a mitochondrial splice defect in the first intron of the COB gene and may act as carriers, exerting their suppressor activity by modulating solute concentrations in the mitochondrion.
- Yeast mitochondrial FAD carrier protein (gene FLX1).
- Yeast protein ACR1 [6], which seems essential for acetyl-CoA synthetase activity.
- Yeast protein PET8.
- Yeast protein PMT.
- Yeast protein RIM2.
- Yeast protein YHM1/SHM1.
- Yeast protein YMC1.
- Yeast protein YMC2.
- Yeast hypothetical proteins YBR291c, YEL006w, YER053c, YFR045w, YHR002w, and YIL006w.
- Caenorhabditis elegans hypothetical protein K11H3.3.

Two other proteins have been found to belong to this family, yet are not localized in the mitochondrial inner membrane:

- Maize amyloplast Brittle-1 protein. This protein, found in the endosperm of kernels, could play a role in amyloplast membrane transport.
- Candida boidinii peroxisomal membrane protein PMP47 [7]. PMP47 is an integral membrane protein of the peroxisome and it may play a role as a transporter.

These proteins all seem to be evolutionarily related. Structurally, they consist of three tandem repeats of a domain of approximately one hundred residues.
Each of these domains contains two transmembrane regions. The profile we developed covers the entire solute carrier (Solcar) repeat. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: August 2003 / Pattern removed and profile added. [ 1] Klingenberg M. "Mechanism and evolution of the uncoupling protein of brown adipose tissue." Trends Biochem. Sci. 15:108-112(1990). PubMed=2158156 [ 2] Walker J.E. Curr. Opin. Struct. Biol. 2:519-526(1992). [ 3] Kuan J., Saier M.H. Jr. CRC Crit. Rev. Biochem. 28:209-233(1993). [ 4] Kuan J., Saier M.H. Jr. "Expansion of the mitochondrial carrier family." Res. Microbiol. 144:671-672(1993). PubMed=8140286 [ 5] Nelson D.R., Lawson J.E., Klingenberg M., Douglas M.G. "Site-directed mutagenesis of the yeast mitochondrial ADP/ATP translocator. Six arginines and one lysine are essential." J. Mol. Biol. 230:1159-1170(1993). PubMed=8487299 [ 6] Palmieri F. "Mitochondrial carrier proteins." FEBS Lett. 346:48-54(1994). PubMed=8206158 [ 7] Jank B., Habermann B., Schweyen R.J., Link T.A. "PMP47, a peroxisomal homologue of mitochondrial solute carrier proteins." Trends Biochem. Sci. 18:427-428(1993). PubMed=8291088 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+
{END}
{PDOC00190}
{PS00216; SUGAR_TRANSPORT_1}
{PS00217; SUGAR_TRANSPORT_2}
{BEGIN}
***************************************
* Sugar transport proteins signatures *
***************************************

In mammalian cells the uptake of glucose is mediated by a family of closely related transport proteins which are called the glucose transporters [1,2,3]. At least seven of these transporters are currently known to exist (in Human they are encoded by the GLUT1 to GLUT7 genes). These integral membrane proteins are predicted to comprise twelve membrane spanning domains. The glucose transporters show sequence similarities [4,5] with a number of other sugar or metabolite transport proteins listed below (references are only provided for recently determined sequences).

- Escherichia coli arabinose-proton symport (araE).
- Escherichia coli galactose-proton symport (galP).
- Escherichia coli and Klebsiella pneumoniae citrate-proton symport (also known as citrate utilization determinant) (gene cit).
- Escherichia coli alpha-ketoglutarate permease (gene kgtP).
- Escherichia coli proline/betaine transporter (gene proP) [6].
- Escherichia coli xylose-proton symport (xylE).
- Zymomonas mobilis glucose facilitated diffusion protein (gene glf).
- Yeast high and low affinity glucose transport proteins (genes SNF3, HXT1 to HXT14).
- Yeast galactose transporter (gene GAL2).
- Yeast maltose permeases (genes MAL3T and MAL6T).
- Yeast myo-inositol transporters (genes ITR1 and ITR2).
- Yeast carboxylic acid transporter protein homolog JEN1.
- Yeast inorganic phosphate transporter (gene PHO84).
- Kluyveromyces lactis lactose permease (gene LAC12).
- Neurospora crassa quinate transporter (gene Qa-y), and Emericella nidulans quinate permease (gene qutD).
- Chlorella hexose carrier (gene HUP1).
- Arabidopsis thaliana glucose transporter (gene STP1).
- Spinach sucrose transporter.
- Leishmania donovani transporters D1 and D2.
- Leishmania enriettii probable transport protein (LTP).
- Yeast hypothetical proteins YBR241c, YCR98c and YFL040w.
- Caenorhabditis elegans hypothetical protein ZK637.1.
- Escherichia coli hypothetical proteins yabE, ydjE and yhjE.
- Haemophilus influenzae hypothetical proteins HI0281 and HI0418.
- Bacillus subtilis hypothetical proteins yxbC and yxdF.

It has been suggested [4] that these transport proteins have evolved from the duplication of an ancestral protein with six transmembrane regions; this hypothesis is based on the conservation of two G-R-[KR] motifs. The first one is located between the second and third transmembrane domains and the second one between transmembrane domains 8 and 9.

We have developed two patterns to detect this family of proteins. The first pattern is based on the G-R-[KR] motif; but because this motif is too short to be specific to this family of proteins, we have derived a pattern from a larger region centered on the second copy of this motif. The second pattern is based on a number of conserved residues which are located at the end of the fourth transmembrane segment and in the short loop region between the fourth and fifth segments.

-Consensus pattern: [LIVMSTAG]-[LIVMFSAG]-{SH}-{RDE}-[LIVMSA]-[DE]-{TD}-[LIVMFYWA]-G-R-[RK]-x(4,6)-[GSTA]
-Sequences known to belong to this class detected by the pattern: the majority of transporters with 23 exceptions.
-Other sequence(s) detected in Swiss-Prot: 53.

-Consensus pattern: [LIVMF]-x-G-[LIVMFA]-{V}-x-G-{KP}-x(7)-[LIFY]-x(2)-[EQ]-x(6)-[RK]
-Sequences known to belong to this class detected by the pattern: the majority of transporters with 20 exceptions.
-Other sequence(s) detected in Swiss-Prot: 67.
-Last update: April 2006 / Patterns revised.

[ 1] Silverman M. "Structure and function of hexose transporters." Annu. Rev. Biochem. 60:757-794(1991). PubMed=1883208; DOI=10.1146/annurev.bi.60.070191.003545
[ 2] Gould G.W., Bell G.I.
"Facilitative glucose transporters: an expanding family." Trends Biochem. Sci. 15:18-23(1990). PubMed=2180146 [ 3] Baldwin S.A. "Mammalian passive glucose transporters: members of an ubiquitous family of active and passive transport proteins." Biochim. Biophys. Acta 1154:17-49(1993). PubMed=8507645 [ 4] Maiden M.C.J., Davis E.O., Baldwin S.A., Moore D.C.M., Henderson P.J.F. "Mammalian and bacterial sugar transport proteins are homologous." Nature 325:641-643(1987). PubMed=3543693; DOI=10.1038/325641a0 [ 5] Henderson P.J.F. Curr. Opin. Struct. Biol. 1:590-601(1991). [ 6] Culham D.E., Lasby B., Marangoni A.G., Milner J.L., Steer B.A., van Nues R.W., Wood J.M. "Isolation and sequencing of Escherichia coli gene proP reveals unusual structural features of the osmoregulatory proline/betaine transporter, ProP." J. Mol. Biol. 229:268-276(1993). PubMed=8421314 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00191} {PS00218; AMINO_ACID_PERMEASE_1} {BEGIN} ********************************** * Amino acid permeases signature * ********************************** Amino acid permeases are integral membrane proteins involved in the transport of amino acids into the cell. A number of such proteins have been found to be evolutionary related [1,2,3]. These proteins are: - Yeast general amino acid permeases (genes GAP1, AGP2 and AGP3). - Yeast basic amino acid permease (gene ALP1). - Yeast Leu/Val/Ile permease (gene BAP2). - Yeast arginine permease (gene CAN1). 
- Yeast dicarboxylic amino acid permease (gene DIP5).
- Yeast asparagine/glutamine permease (gene AGP1).
- Yeast glutamine permease (gene GNP1).
- Yeast histidine permease (gene HIP1).
- Yeast lysine permease (gene LYP1).
- Yeast proline permease (gene PUT4).
- Yeast valine and tyrosine permease (gene VAL1/TAT1).
- Yeast tryptophan permease (gene TAT2/SCM2).
- Yeast choline transport protein (gene HNM1/CTR1).
- Yeast GABA permease (gene UGA4).
- Yeast hypothetical protein YKL174c.
- Fission yeast protein isp5.
- Fission yeast hypothetical protein SpAC8A4.11.
- Fission yeast hypothetical protein SpAC11D3.08c.
- Emericella nidulans proline transport protein (gene prnB).
- Trichoderma harzianum amino acid permease INDA1.
- Salmonella typhimurium L-asparagine permease (gene ansP).
- Escherichia coli aromatic amino acid transport protein (gene aroP).
- Escherichia coli D-serine/D-alanine/glycine transporter (gene cycA).
- Escherichia coli GABA permease (gene gabP).
- Escherichia coli lysine-specific permease (gene lysP).
- Escherichia coli phenylalanine-specific permease (gene pheP).
- Salmonella typhimurium proline-specific permease (gene proY).
- Escherichia coli and Klebsiella pneumoniae hypothetical protein yeeF.
- Escherichia coli and Salmonella typhimurium hypothetical protein yifK.
- Bacillus subtilis permeases rocC and rocE, which probably transport arginine or ornithine.

These proteins seem to contain up to 12 transmembrane segments. As a signature for this family of proteins we selected the best conserved region, which is located in the second transmembrane segment.

-Consensus pattern: [STAGC]-G-[PAG]-x(2,3)-[LIVMFYWA](2)-x-[LIVMFYW]-x-[LIVMFWSTAGC](2)-[STAGC]-x(3)-[LIVMFYWT]-x-[LIVMST]-x(3)-[LIVMCTA]-[GA]-E-x(5)-[PSAL]
-Sequences known to belong to this class detected by the pattern: ALL, except for yeeF.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: July 1999 / Pattern and text revised.

[ 1] Weber E., Chevallier M.R., Jund R.
"Evolutionary relationship and secondary structure predictions in four transport proteins of Saccharomyces cerevisiae." J. Mol. Evol. 27:341-350(1988). PubMed=3146645
[ 2] Vandenbol M., Jauniaux J.-C., Grenson M. "Nucleotide sequence of the Saccharomyces cerevisiae PUT4 proline-permease-encoding gene: similarities between CAN1, HIP1 and PUT4 permeases." Gene 83:153-159(1989). PubMed=2687114
[ 3] Reizer J., Finley K., Kakuda D., McLeod C.L., Reizer A., Saier M.H. Jr. "Mammalian integral membrane receptors are homologous to facilitators and antiporters of yeast, fungi, and eubacteria." Protein Sci. 2:20-30(1993). PubMed=8382989
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00192}
{PS00219; ANION_EXCHANGER_1}
{PS00220; ANION_EXCHANGER_2}
{BEGIN}
**************************************
* Anion exchangers family signatures *
**************************************

Anion exchange is a cellular transport function which contributes to the regulation of cell pH and volume. Anion exchangers are a family of functionally related proteins that contribute to these properties by maintaining the intracellular level of the two principal anions: chloride and HCO3-. The best characterized anion exchanger is the band 3 protein [1], which is an erythrocyte anion exchange membrane glycoprotein.
Band 3 is a protein of about 900 amino acids which consists of a cytoplasmic N-terminal domain of about 400 residues and a hydrophobic C-terminal section of about 500 residues that contains at least ten transmembrane regions. The cytoplasmic domain provides binding sites for cytoskeletal proteins, while the integral membrane domain is responsible for anion transport.

Band 3 protein is specific to erythroid cells; at least two other proteins [2], structurally and functionally related to band 3, are found in nonerythroid tissues:

- AE2 (or B3 related protein; B3RP), a protein of 1200 residues, which seems to be present in a variety of cell types including lymphoid, kidney, and choroid plexus.
- AE3, a protein of 1200 residues, which is specific to neurons.

Structurally, AE2 and AE3 are very similar to band 3, the main difference being an extension of some 300 residues of the N-terminal domain in AE2 and AE3.

We developed two signature patterns for these proteins. The first pattern is based on a conserved stretch of sequence that contains four clustered positively charged residues and which is located at the C-terminal extremity of the cytoplasmic domain, just before the first transmembrane segment from the integral domain. The second pattern is based on the perfectly conserved sequence of the fifth transmembrane segment; this segment contains a lysine, which is the covalent binding site for the isothiocyanate group of DIDS, an inhibitor of anion exchange.

-Consensus pattern: F-G-G-[LIVM](2)-[KR]-D-[LIVM]-[RK]-R-R-Y
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: [FI]-L-I-S-L-I-F-I-Y-E-T-F-x-K-L [K is important for anion exchange]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: May 2004 / Text revised.

[ 1] Jay D., Cantley L. "Structural aspects of the red cell anion exchange protein." Annu. Rev.
Biochem. 55:511-538(1986). PubMed=3527050; DOI=10.1146/annurev.bi.55.070186.002455
[ 2] Reithmeier R.A.F. Curr. Opin. Struct. Biol. 3:515-523(1993).
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00193}
{PS00221; MIP}
{BEGIN}
************************
* MIP family signature *
************************

The sequences of a number of different proteins, all of which seem to be transmembrane channel proteins, have been found to be highly related [1 to 4]. These proteins are listed below.

- Mammalian major intrinsic protein (MIP). MIP is the major component of lens fiber gap junctions. Gap junctions mediate direct exchange of ions and small molecules from one cell to another.
- Mammalian aquaporins [5]. These proteins form water-specific channels that provide the plasma membranes of red cells and kidney proximal and collecting tubules with high permeability to water, thereby permitting water to move in the direction of an osmotic gradient.
- Soybean nodulin-26, a major component of the peribacteroid membrane induced during nodulation in legume roots after Rhizobium infection.
- Plant tonoplast intrinsic proteins (TIP). There are various isoforms of TIP: alpha (seed), gamma, Rt (root), and Wsi (water-stress induced). These proteins may allow the diffusion of water, amino acids and/or peptides from the tonoplast interior to the cytoplasm.
- Bacterial glycerol facilitator protein (gene glpF), which facilitates the movement of glycerol across the cytoplasmic membrane.
- Salmonella typhimurium propanediol diffusion facilitator (gene pduF).
- Yeast FPS1, a glycerol uptake/efflux facilitator protein.
- Drosophila neurogenic protein 'big brain' (bib). This protein may mediate intercellular communication; it may function by allowing the transport of certain molecule(s) and thereby sending a signal for an exodermal cell to become an epidermoblast instead of a neuroblast.
- Yeast hypothetical protein YFL054c.
- A hypothetical protein from the pepX region of Lactococcus lactis.

The MIP family proteins seem to contain six transmembrane segments. Computer analysis shows that these proteins probably arose by a tandem, intragenic duplication event from an ancestral protein that contained three transmembrane segments. As a signature pattern we selected a well conserved region which is located in a probable cytoplasmic loop between the second and third transmembrane regions.

-Consensus pattern: [HNQA]-{D}-N-P-[STA]-[LIVMF]-[ST]-[LIVMF]-[GSTAFY]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 5.
-Last update: December 2004 / Pattern and text revised.

[ 1] Reizer J., Reizer A., Saier M.H. Jr. CRC Crit. Rev. Biochem. 28:235-257(1993).
[ 2] Baker M.E., Saier M.H. Jr. "A common ancestor for bovine lens fiber major intrinsic protein, soybean nodulin-26 protein, and E. coli glycerol facilitator." Cell 60:185-186(1990). PubMed=2404610
[ 3] Pao G.M., Wu L.-F., Johnson K.D., Hofte H., Chrispeels M.J., Sweet G., Sandal N.N., Saier M.H. Jr. "Evolution of the MIP family of integral membrane transport proteins." Mol. Microbiol. 5:33-37(1991). PubMed=2014003
[ 4] Wistow G.J., Pisano M.M., Chepelinsky A.B. "Tandem sequence repeats in transmembrane channel proteins." Trends Biochem. Sci. 16:170-171(1991). PubMed=1715617
[ 5] Chrispeels M.J., Agre P. "Aquaporins: water channel proteins of plant and animal cells." Trends Biochem. Sci. 19:421-425(1994).
PubMed=7529436
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00194}
{PS00222; IGFBP_N_1}
{PS51323; IGFBP_N_2}
{BEGIN}
**********************************************************************************************
* Insulin-like growth factor binding protein (IGFBP) N-terminal domain signature and profile *
**********************************************************************************************

The insulin-like growth factors (IGF-I and IGF-II) bind to specific binding proteins in extracellular fluids with high affinity [1,2,3]. These IGF-binding proteins (IGFBP) prolong the half-life of the IGFs and have been shown to either inhibit or stimulate the growth promoting effects of the IGFs on cells in culture. They seem to alter the interaction of IGFs with their cell surface receptors.

The IGFBP family comprises six proteins (IGFBP-1 to -6) that bind to IGFs with high affinity. The precursor forms of all six IGFBPs have secretory signal peptides. All IGFBPs share a common domain organization and also a high degree of similarity in their primary protein structure. The highest conservation is found in the N- and C-terminal cysteine-rich regions. Twelve conserved cysteines (ten in IGFBP-6) are found in the N-terminal domain, and six are found in the C-terminal domain. Both the N- and C-terminal domains participate in binding to IGFs, although the specific roles of each of these domains in IGF binding have not been decisively established.
In general, the strongest binding to IGFs is shown by amino-terminal fragments, which, however, bind to IGF with 10- to 1000-fold lower affinity than full length IGFBPs. The central weakly conserved part (L domain) contains most of the cleavage sites for specific proteases [4,5].

The N-terminal domain is ~80 residues in length and has an L-like structure (see <PDB:1WQJ>). It can be divided into two subdomains that are connected by a short stretch of amino acids. The two subdomains are perpendicular to each other, creating the "L" shape for the whole N-terminal domain. The core of the first subdomain presents a novel fold stabilized by a short two-stranded beta sheet and four disulfide bridges forming a disulfide bond ladder-like structure. The beta sheet and disulfide bridges are all in one plane, making the structure appear flat from one side like a "palm" of a hand. The palm is extended with a "thumb" segment in various IGFBPs. The thumb segment consists of the very N-terminal residues and contains a consensus XhhyC motif, where h is a hydrophobic amino acid and y is positively charged. The second subdomain adopts a globular fold whose scaffold is secured by an inside packing of two cysteine bridges stabilized by a three-stranded beta sheet [4,5].

The following growth-factor inducible proteins are structurally related to IGFBPs and could function as growth-factor binding proteins [6,7]:

- Mouse protein cyr61 and its probable chicken homolog, protein CEF-10.
- Human connective tissue growth factor (CTGF) and its mouse homolog, protein FISP-12.
- Vertebrate protein NOV.

As a signature pattern we have used a conserved cysteine-rich region located in the N-terminal IGFBP domain. We also developed a profile that covers the entire IGFBP N-terminal domain.

-Consensus pattern: [GP]-C-[GSET]-[CE]-[CA]-x(2)-C-[ALP]-x(6)-C
-Sequences known to belong to this class detected by the pattern: ALL, except for IGFBP-6's.
-Other sequence(s) detected in Swiss-Prot: NONE.
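The consensus pattern above is written in the PROSITE pattern syntax: elements are separated by '-', 'x' stands for any residue, square brackets list the residues accepted at a position, curly brackets list the residues excluded, and a number in parentheses gives a repeat count or range. As a minimal sketch, assuming only this subset of the syntax, such a pattern can be translated into a regular expression and run against a sequence. The names prosite_to_regex and find_matches are illustrative assumptions and not part of PROSITE, and the demo sequence is synthetic, built only to satisfy the pattern.

```python
import re

# One PROSITE pattern element: an optional '<' anchor, a core (x, a single
# residue, an [allowed] set or an {excluded} set), an optional repeat such
# as (2) or (4,6), and an optional '>' anchor.
ELEMENT = re.compile(
    r"(?P<start><)?"
    r"(?P<core>x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})"
    r"(?:\((?P<lo>\d+)(?:,(?P<hi>\d+))?\))?"
    r"(?P<end>>)?"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pieces = []
    for element in pattern.strip().rstrip(".").split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unrecognised pattern element: {element!r}")
        core = m["core"]
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"      # excluded residues
        repeat = ""
        if m["lo"] is not None:
            repeat = "{" + m["lo"] + ("," + m["hi"] if m["hi"] else "") + "}"
        prefix = "^" if m["start"] else ""
        suffix = "$" if m["end"] else ""
        pieces.append(prefix + core + repeat + suffix)
    return "".join(pieces)


def find_matches(pattern, sequence):
    """Return (start, matched_text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# Pattern copied from the entry above.
IGFBP_N_1 = "[GP]-C-[GSET]-[CE]-[CA]-x(2)-C-[ALP]-x(6)-C"

# Synthetic sequence (an assumption, not a real protein): a core that
# satisfies each pattern position, flanked by arbitrary residues.
DEMO_CORE = "GCGCC" + "TT" + "C" + "A" + "S" * 6 + "C"
DEMO_SEQUENCE = "MK" + DEMO_CORE + "LL"

if __name__ == "__main__":
    print(prosite_to_regex(IGFBP_N_1))
    print(find_matches(IGFBP_N_1, DEMO_SEQUENCE))
```

Because re.finditer reports only non-overlapping matches, a dedicated pattern scanner may report more hits than this sketch on sequences where matches overlap.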
-Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Landale E.C.; [email protected] -Last update: July 2007 / Text revised; profile added. [ 1] Rechler M.M. "Insulin-like growth factor binding proteins." Vitam. Horm. 47:1-114(1993). PubMed=7680510 [ 2] Shimasaki S., Ling N. "Identification and molecular characterization of insulin-like growth factor binding proteins (IGFBP-1, -2, -3, -4, -5 and -6)." Prog. Growth Factor Res. 3:243-266(1991). PubMed=1725860 [ 3] Clemmons D.R. Trends Endocrinol. Metab. 1:412-417(1990). [ 4] Kalus W., Zweckstetter M., Renner C., Sanchez Y., Georgescu J., Grol M., Demuth D., Schumacher R., Dony C., Lang K., Holak T.A. "Structure of the IGF-binding domain of the insulin-like growth factor-binding protein-5 (IGFBP-5): implications for IGF and IGF-I receptor interactions." EMBO J. 17:6558-6572(1998). PubMed=9822601; DOI=10.1093/emboj/17.22.6558 [ 5] Siwanowicz I., Popowicz G.M., Wisniewska M., Huber R., Kuenkele K.P., Lang K., Engh R.A., Holak T.A. "Structural basis for the regulation of insulin-like growth factors by IGF binding proteins." Structure 13:155-167(2005). PubMed=15642270; DOI=10.1016/j.str.2004.11.009 [ 6] Bradham D.M., Igarashi A., Potter R.L., Grotendorst G.R. "Connective tissue growth factor: a cysteine-rich mitogen secreted by human vascular endothelial cells is related to the SRC-induced immediate early gene product CEF-10." J. Cell Biol. 114:1285-1294(1991). PubMed=1654338 [ 7] Joliot V., Martinerie C., Dambrine G., Plassiart G., Brisac M., Crochet J., Perbal B. "Proviral rearrangements and overexpression of a new cellular gene (nov) in myeloblastosis-associated virus type 1-induced nephroblastomas." Mol. Cell. Biol. 12:10-21(1992). PubMed=1309586 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). 
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00195}
{PS00223; ANNEXIN}
{BEGIN}
**************************************
* Annexins repeated domain signature *
**************************************

Annexins [1 to 6] are a group of calcium-binding proteins that associate reversibly with membranes. They bind to phospholipid bilayers in the presence of micromolar free calcium concentration. The binding is specific for calcium and for acidic phospholipids. Annexins have been claimed to be involved in cytoskeletal interactions, phospholipase inhibition, intracellular signalling, anticoagulation, and membrane fusion.

Each of these proteins consists of an N-terminal domain of variable length followed by four or eight copies of a conserved segment of sixty-one residues. The repeat (sometimes known as an 'endonexin fold') consists of five alpha-helices that are wound into a right-handed superhelix [7].

The proteins known to belong to the annexin family are listed below:

 - Annexin I (Lipocortin 1) (Calpactin 2) (p35) (Chromobindin 9).
 - Annexin II (Lipocortin 2) (Calpactin 1) (Protein I) (p36) (Chromobindin 8).
 - Annexin III (Lipocortin 3) (PAP-III).
 - Annexin IV (Lipocortin 4) (Endonexin I) (Protein II) (Chromobindin 4).
 - Annexin V (Lipocortin 5) (Endonexin 2) (VAC-alpha) (Anchorin CII) (PAP-I).
 - Annexin VI (Lipocortin 6) (Protein III) (Chromobindin 20) (p68) (p70). This is the only known annexin that contains 8 (instead of 4) repeats.
 - Annexin VII (Synexin).
 - Annexin VIII (Vascular anticoagulant-beta) (VAC-beta).
 - Annexin IX from Drosophila.
 - Annexin X from Drosophila.
 - Annexin XI (Calcyclin-associated annexin) (CAP-50).
 - Annexin XII from Hydra vulgaris.
 - Annexin XIII (Intestine-specific annexin) (ISA).

The signature pattern for this domain spans positions 9 to 61 of the repeat and includes the only perfectly conserved residue (an arginine in position 22).

-Consensus pattern: [TG]-[STV]-x(8)-[LIVMF]-x(2)-R-x(3)-[DEQNH]-x(2)-{S}-x(4)-[IFY]-x(7)-[LIVMF]-x(3)-[LIVMF]-x(5)-{I}-x(5)-[LIVMFA]-x(2)-[LIVMF]
-Sequences known to belong to this class detected by the pattern: ALL. But the pattern will miss some of the repeats of annexin IX, X, XI, and XII.
-Other sequence(s) detected in Swiss-Prot: 4.
-Note: A sequence similar to the annexin domain has been found in the N-terminal of alpha-giardins of Giardia lamblia [8].
-Last update: December 2004 / Pattern and text revised.

[ 1] Raynal P., Pollard H.B. "Annexins: the problem of assessing the biological role for a gene family of multifunctional calcium- and phospholipid-binding proteins." Biochim. Biophys. Acta 1197:63-93(1994). PubMed=8155692
[ 2] Barton G.J., Newman R.H., Freemont P.S., Crumpton M.J. "Amino acid sequence analysis of the annexin super-gene family of proteins." Eur. J. Biochem. 198:749-760(1991). PubMed=1646719
[ 3] Burgoyne R.D., Geisow M.J. "The annexin family of calcium-binding proteins. Review article." Cell Calcium 10:1-10(1989). PubMed=2659190
[ 4] Haigler H.T., Fitch J.M., Jones J.M., Schlaepfer D.D. "Two lipocortin-like proteins, endonexin II and anchorin CII, may be alternate splices of the same gene." Trends Biochem. Sci. 14:48-50(1989). PubMed=2539661
[ 5] Klee C.B. "Ca2+-dependent phospholipid- (and membrane-) binding proteins." Biochemistry 27:6645-6653(1988). PubMed=2973805
[ 6] Smith P.D., Moss S.E. "Structural evolution of the annexin supergene family." Trends Genet. 10:241-246(1994). PubMed=8091504
[ 7] Huber R., Roemisch J., Paques E.-P. "The crystal and molecular structure of human annexin V, an anticoagulant protein that binds to calcium and membranes." EMBO J. 9:3867-3874(1990).
PubMed=2147412
[ 8] Fiedler K., Simons K. "Annexin homologues in Giardia lamblia." Trends Biochem. Sci. 20:177-178(1995). PubMed=7610478
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00196}
{PS00224; CLATHRIN_LIGHT_CHN_1}
{PS00581; CLATHRIN_LIGHT_CHN_2}
{BEGIN}
************************************
* Clathrin light chains signatures *
************************************

Clathrin [1,2] is the major coat-forming protein that encloses vesicles such as coated pits and forms cell surface patches involved in membrane traffic within eukaryotic cells. The clathrin coats (called triskelions) are composed of three heavy chains (180 Kd) and three light chains (23 to 27 Kd).

The clathrin light chains [3], which may help to properly orient the assembly and disassembly of the clathrin coats, bind non-covalently to the heavy chain; they also bind calcium and interact with the hsc70 uncoating ATPase.

 - In higher eukaryotes two genes code for distinct but related light chains: LC(a) and LC(b). Each of the two genes can yield, by tissue-specific alternative splicing, two separate forms which differ by the insertion of a sequence of respectively thirty or eighteen residues. There is, in the N-terminal part of the clathrin light chains, a domain of twenty-one amino acid residues which is perfectly conserved in LC(a) and LC(b).
 - In yeast there is a single light chain (gene CLC1) whose sequence is only distantly related to that of higher eukaryotes.
We developed two signature patterns for clathrin light chains. The first pattern is a heptapeptide from the center of the conserved N-terminal region of eukaryotic light chains; the second pattern is derived from a positively charged region located in the C-terminal extremity of all known clathrin light chains.

-Consensus pattern: F-L-A-[QH]-[QE]-E-S
-Sequences known to belong to this class detected by the pattern: ALL higher eukaryotes light chains.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: [KR]-[DS]-x-[SE]-[KR]-[LIVMF]-[KR]-x-[LIVM]-[LIVMY]-[LIVM]-x-L-[KA]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Patterns and text revised.

[ 1] Keen J.H. "Clathrin and associated assembly and disassembly proteins." Annu. Rev. Biochem. 59:415-438(1990). PubMed=1973890
[ 2] Brodsky F.M. "Living with clathrin: its role in intracellular membrane traffic." Science 242:1396-1402(1988). PubMed=2904698
[ 3] Brodsky F.M., Hill B.L., Acton S.L., Nathke I., Wong D.H., Ponnambalam S., Parham P. "Clathrin light chains: arrays of protein motifs that regulate coated-vesicle dynamics." Trends Biochem. Sci. 16:208-213(1991). PubMed=1909824
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00197}
{PS50915; CRYSTALLIN_BETA_GAMMA}
{BEGIN}
********************************************************
* Crystallins beta and gamma 'Greek key' motif profile *
********************************************************

Crystallins are the dominant structural components of the eye lens. Among the different types of crystallins, the beta and gamma crystallins form a family of related proteins [1,2]. Structurally, beta and gamma crystallins are composed of two similar domains which, in turn, are each composed of two similar motifs, with the two domains connected by a short connecting peptide. Each motif, which is about forty amino acid residues long, is folded in a distinctive 'Greek key' motif, composed of four antiparallel beta-strands: a, b, c, and d (see <PDB:1A45>).

Apart from the different types of beta and gamma crystallins, this family also includes the following proteins:

 - Two related proteins from the sporulating bacterium Myxococcus xanthus: protein S, a calcium-binding protein that forms a major part of the spore coat, and a close homolog of protein S.
 - Spherulin 3a from the slime mold Physarum polycephalum. Spherulin 3a is a development-specific protein synthesized in response to various kinds of stress leading to encystment and dormancy. The sequence of Spherulin 3a consists of two 'Greek key' motifs [3].
 - Epidermis differentiation-specific protein (EDSP or ep37) of the amphibian Cynops pyrrhogaster.
 - Mammalian absent in melanoma 1 protein (AIM1). It contains 12 'Greek key' motifs.

Beta/gamma 'Greek key' motifs may be further classified as A- or B-type. Vertebrate members of the beta/gamma superfamily conform to an ABAB motif pattern. B-type motifs are the most highly conserved and form most of the contacts between domains. In protein S, the order of motifs is reversed to BABA, suggesting a separate history of duplication events [4].

The profile we developed for this family of proteins covers the entire 'Greek key' motif.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Expert(s) to contact by email: Wistow G.; [email protected]
-Last update: July 2003 / Pattern removed, profile added and text revised.

[ 1] Lubsen N.H., Aarts H.J.M., Schoenmakers J.G. "The evolution of lenticular proteins: the beta- and gamma-crystallin super gene family." Prog. Biophys. Mol. Biol. 51:47-76(1988). PubMed=3064189
[ 2] Wistow G.J., Piatigorsky J. "Lens crystallins: the evolution and expression of proteins for a highly specialized tissue." Annu. Rev. Biochem. 57:479-504(1988). PubMed=3052280; DOI=10.1146/annurev.bi.57.070188.002403
[ 3] Wistow G. "Evolution of a protein superfamily: relationships between vertebrate lens crystallins and microorganism dormancy proteins." J. Mol. Evol. 30:140-145(1990). PubMed=2107329
[ 4] Ray M.E., Wistow G., Su Y.A., Meltzer P.S., Trent J.M. "AIM1, a novel non-lens member of the betagamma-crystallin superfamily, is associated with the control of tumorigenicity in human malignant melanoma." Proc. Natl. Acad. Sci. U.S.A. 94:3229-3234(1997). PubMed=9096375
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00198}
{PS00226; IF}
{BEGIN}
************************************
* Intermediate filaments signature *
************************************

Intermediate filaments (IF) [1,2,3] are proteins which are primordial components of the cytoskeleton and the nuclear envelope. They generally form filamentous structures 8 to 14 nm wide. IF proteins are members of a very large multigene family of proteins which has been subdivided into five major subgroups:

 - Type I: Acidic cytokeratins.
 - Type II: Basic cytokeratins.
 - Type III: Vimentin, desmin, glial fibrillary acidic protein (GFAP), peripherin, and plasticin.
 - Type IV: Neurofilaments L, H and M, alpha-internexin and nestin.
 - Type V: Nuclear lamins A, B1, B2 and C.

All IF proteins are structurally similar in that they consist of: a central rod domain comprising some 300 to 350 residues which is arranged in coiled-coil alpha-helices, with at least two short characteristic interruptions; an N-terminal non-helical domain (head) of variable length; and a C-terminal domain (tail) which is also non-helical, and which shows extreme length variation between different IF proteins.

While IF proteins are evolutionarily and structurally related, they have limited sequence homologies except in several regions of the rod domain. We use, as a sequence pattern for this class of proteins, a conserved region at the C-terminal extremity of the rod domain.

-Consensus pattern: [IV]-{K}-[TACI]-Y-[RKH]-{E}-[LM]-L-[DE]
-Sequences known to belong to this class detected by the pattern: ALL, except for Drosophila lamin DM0 and filensin.
-Other sequence(s) detected in Swiss-Prot: 5.
-Note: In the third position of the pattern, Ala is found in type IV and V IF proteins, Thr is found in IF proteins of type I, II, III, and VI, Cys in IF from snails, and Ile in IF from worms. In the first position of the pattern Val is found in type VI, Ile is found in all other types.
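The consensus pattern above is written in PROSITE pattern syntax: square brackets list the residues allowed at a position, curly brackets list the residues excluded at a position, x stands for any residue, and a number in parentheses gives a repeat count or range. As a rough illustration only, the following Python sketch translates such a pattern into a regular expression and scans a sequence with it. The function name prosite_to_regex and the test peptide are hypothetical, introduced for this example; they are not part of PROSITE or of any PROSITE tool, and the sketch assumes that pattern elements are separated by hyphens and ignores rarer syntax such as anchors placed inside brackets.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a hyphen-separated PROSITE-style pattern into a Python regex.

    Hypothetical helper for illustration; not an official PROSITE routine.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        # '<' anchors the pattern to the N-terminus, '>' to the C-terminus.
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Repetition such as x(2) or x(2,3) becomes {2} or {2,3}.
        repeat = ""
        m = re.fullmatch(r"(.+?)\((\d+)(?:,(\d+))?\)", element)
        if m:
            element = m.group(1)
            upper = "," + m.group(3) if m.group(3) else ""
            repeat = "{" + m.group(2) + upper + "}"
        if element == "x":
            core = "."                          # any residue
        elif element.startswith("[") and element.endswith("]"):
            core = element                      # allowed residues
        elif element.startswith("{") and element.endswith("}"):
            core = "[^" + element[1:-1] + "]"   # excluded residues
        else:
            core = re.escape(element)           # a single fixed residue
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


if_pattern = "[IV]-{K}-[TACI]-Y-[RKH]-{E}-[LM]-L-[DE]"
if_regex = prosite_to_regex(if_pattern)

# Hypothetical peptide written by hand to satisfy the pattern;
# it is not taken from a sequence database.
peptide = "GSITTYRKLLEGE"
match = re.search(if_regex, peptide)

print(if_regex)                                  # [IV][^K][TACI]Y[RKH][^E][LM]L[DE]
print(match.group(0) if match else "no match")   # ITTYRKLLE
```

Replacing the second residue of the matched segment with Lys (for example "IKTYRKLLE") gives no match, because {K} excludes lysine at that position.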
-Last update: December 2004 / Pattern and text revised. [ 1] Quinlan R., Hutchison C., Lane B. Protein Prof. 2:801-952(1995). [ 2] Steiner P.M., Roop D.R. Annu. Rev. Biochem. 57:593-625(1988). [ 3] Stewart M. "Intermediate filaments: structure, assembly and molecular interactions." Curr. Opin. Cell Biol. 2:91-100(1990). PubMed=2183847 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00199} {PS00227; TUBULIN} {BEGIN} ***************************************************** * Tubulin subunits alpha, beta, and gamma signature * ***************************************************** Tubulins [1,2], the major constituent of microtubules are dimeric proteins which consist of two closely related subunits (alpha and beta). Tubulin binds two molecules of GTP at two different sites (N and E). At the E (Exchangeable) site, GTP is hydrolyzed during incorporation into the microtubule. Near the E site is an invariant region rich in glycines which is found in both chains and which is now [3] said to control the access of the nucleotide to its binding site. We developed a signature pattern from this region. With the exception of the simple eukaryotes, most species express a variety of closely related alpha and beta isotypes. In most species there is a third member of the tubulin family: gamma tubulin. 
Gamma tubulin is found at microtubule organizing centers (MTOC) such as the spindle poles or the centrosome, suggesting that it is involved in the minus-end nucleation of microtubule assembly [4].

-Consensus pattern: [SAG]-G-G-T-G-[SA]-G
-Sequences known to belong to this class detected by the pattern: ALL, except for maize tubulin beta-2 which has Leu in the first position of the pattern.
-Other sequence(s) detected in Swiss-Prot: 13.
-Note: The first residue in the pattern is Gly in all alpha and beta tubulins, and is Ala or Ser in gamma-tubulin.
-Note: This pattern is almost identical to the GTP-binding site of the bacterial protein ftsZ (see <PDOC00873>) whose role in prokaryotes is probably similar to that of tubulins.
-Last update: November 1995 / Text revised.

[ 1] Cleveland D.W., Sullivan K.F. "Molecular biology and genetics of tubulin." Annu. Rev. Biochem. 54:331-365(1985). PubMed=3896122; DOI=10.1146/annurev.bi.54.070185.001555
[ 2] Joshi H.C., Cleveland D.W. "Diversity among tubulin subunits: toward what functional end?" Cell Motil. Cytoskeleton 16:159-163(1990). PubMed=2194680
[ 3] Hesse J., Thierauf M., Ponstingl H. "Tubulin sequence region beta 155-174 is involved in binding exchangeable guanosine triphosphate." J. Biol. Chem. 262:15472-15475(1987). PubMed=3680207
[ 4] Joshi H.C. "Gamma-tubulin: the hub of cellular microtubule assemblies." BioEssays 15:637-643(1993). PubMed=8274140
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00200}
{PS00228; TUBULIN_B_AUTOREG}
{BEGIN}
*******************************************
* Tubulin-beta mRNA autoregulation signal *
*******************************************

The stability of beta-tubulin mRNAs is autoregulated by their own translation product [1]. Unpolymerized tubulin subunits bind directly (or activate a factor(s) which binds co-translationally) to the nascent N-terminus of beta-tubulin. This binding is transduced through the adjacent ribosomes to activate an RNAse that degrades the polysome-bound mRNA.

The recognition element has been shown to be the first four amino acids of beta-tubulin: Met-Arg-Glu-Ile. Mutations to this sequence abolish the autoregulation effect (except for the replacement of Glu by Asp); transposition of this sequence to an internal region of a polypeptide also suppresses the autoregulatory effect.

-Consensus pattern: <M-R-[DE]-[IL]
-Sequences known to belong to this class detected by the pattern: ALL, except for soybean beta-2 tubulin which has Ser in the last position of the pattern.
-Other sequence(s) detected in Swiss-Prot: a number of alpha-tubulins, as well as 26 other proteins.
-Last update: May 1991 / Pattern and text revised.

[ 1] Cleveland D.W. "Autoregulated instability of tubulin mRNAs: a novel eukaryotic regulatory mechanism." Trends Biochem. Sci. 13:339-343(1988). PubMed=3072712
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00201}
{PS00229; TAU_MAP}
{BEGIN}
*********************************************************
* Tau and MAP proteins tubulin-binding domain signature *
*********************************************************

Microtubules consist of tubulins as well as a group of additional proteins collectively known as the Microtubule Associated Proteins (MAP). MAP's have been classified into two classes: high molecular weight MAP's and Tau protein. These proteins promote microtubule assembly and stabilize microtubules.

The C-terminal region of a subset of these proteins contains three or four tandem repeats of a conserved domain of about thirty amino acid residues which is implicated in tubulin-binding and which seems to have a stiffening effect on microtubules. The proteins currently known to contain such repeats are:

 - Tau [1], from neurones.
 - MAP2 [2], a neuronal member of the high molecular weight MAP's.
 - MAP4 [3], a non-neuronal member of the high molecular weight MAP's. MAP4 is also expressed in some neurons.

The pattern we developed to detect this repeated region spans the last thirteen residues of the repeated region.

-Consensus pattern: G-S-x(2)-N-x(2)-H-x-[PA]-[AG]-G(2)
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: The first repeat of MAP2 is not picked up by the pattern because it has Tyr instead of His in position 8, and Lys instead of Gly in position 11.
-Expert(s) to contact by email: Matus A.; [email protected]
-Last update: November 1997 / Text revised.

[ 1] Kosik K.S., Orecchio L.D., Bakalis S., Neve R.L. "Developmentally regulated expression of specific tau sequences." Neuron 2:1389-1397(1989). PubMed=2560640
[ 2] Matus A. "Stiff microtubules and neuronal morphology." Trends Neurosci. 17:19-22(1994). PubMed=7511844
[ 3] Chapin S.J., Bulinski J.C. "Non-neuronal 210 x 10(3) Mr microtubule-associated protein (MAP4) contains a domain homologous to the microtubule-binding domains of neuronal MAP2 and tau." J. Cell Sci. 98:27-36(1991). PubMed=1905296
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00202}
{PS00230; MAP1B_NEURAXIN}
{BEGIN}
*********************************************************
* Neuraxin and MAP1B proteins repeated region signature *
*********************************************************

MAP1B [1] and neuraxin [2] are neuronal microtubule-binding proteins. Both proteins contain a region that consists of 12 tandem repeats of a 17-residue motif. The pattern we developed to detect this repeated region corresponds to the N-terminal ten residues of the repeated region.

-Consensus pattern: [STAGDN]-Y-x-Y-E-{AV}-{L}-[DE]-[KR]-[STAGCI]
-Sequences known to belong to this class detected by the pattern: ALL; this pattern detects 8 out of the 12 copies of the repeated region.
-Other sequence(s) detected in Swiss-Prot: 6; but in all cases the pattern is only found once.
-Expert(s) to contact by email: Matus A.; [email protected]
-Last update: December 2004 / Pattern and text revised.

[ 1] Noble M., Lewis S.A., Cowan N.J. "The microtubule binding domain of microtubule-associated protein MAP1B contains a repeated sequence motif unrelated to that of MAP2 and tau." J. Cell Biol. 109:3367-3376(1989).
PubMed=2480963
[ 2] Rienitz A., Grenningloh G., Hermans-Borgmeyer I., Kirsch J., Littauer U.Z., Prior P., Gundelfinger E.D., Schmitt B., Betz H. "Neuraxin, a novel putative structural protein of the rat central nervous system that is immunologically related to microtubule-associated protein 5." EMBO J. 8:2879-2888(1989). PubMed=2555150
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00203}
{PS00231; F_ACTIN_CAPPING_BETA}
{BEGIN}
**************************************************
* F-actin capping protein beta subunit signature *
**************************************************

The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end), thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin, this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta.

The beta subunit is a protein of about 280 amino acid residues whose sequence is well conserved in eukaryotic species [1]. As a signature pattern we chose a conserved hexapeptide in the N-terminal section of the beta subunit.

-Consensus pattern: C-[DE]-[YF]-N-R-D
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Amatruda J.F., Cannon J.F., Tatchell K., Hug C., Cooper J.A. "Disruption of the actin cytoskeleton in yeast capping protein mutants." Nature 344:352-354(1990). PubMed=2179733
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00204}
{PS00319; A4_EXTRA}
{PS00320; A4_INTRA}
{BEGIN}
*****************************************
* Amyloidogenic glycoprotein signatures *
*****************************************

Amyloidogenic glycoprotein (A4 protein or APP) is an integral, glycosylated membrane brain protein [1,2]. APP is associated with Alzheimer's disease (AD). This association stems from the fact that a small peptide (of 43 residues), called the amyloid beta protein, which is part of the sequence of A4, is the major constituent of amyloid deposits in AD and in Down's syndrome. As shown in the schematic representation below, the amyloid beta protein both precedes and forms part of the unique transmembrane region of A4.

+----------------------------------------xxxxxxx-------------+
|             Extracellular              XXXXXXX Cytoplasmic |
+------------------------------------BBBBBBBBxxx-------------+

'X': Transmembrane region.
'B': Position of the amyloid beta protein in A4.

The exact function of A4 protein is not yet known, but it has been suggested that it mediates cell-cell interactions. The sequence of A4 from mammalian species is well conserved and is also similar to that of other proteins:

 - Drosophila APPL (gene vnd) [3].
 - Mammalian protein APLP1 [4].
 - Mammalian protein APLP2 (APPH) (YWK-II) (CDEI-binding protein) [5].
We have derived two patterns specific to these proteins, the first one is a perfectly conserved octapeptide located in the beginning of the extracellular domain; the second is a conserved octapeptide located at the C-terminal end of the cytoplasmic domain. -Consensus pattern: G-[VT]-[EK]-[FY]-V-C-C-P -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Consensus pattern: G-Y-E-N-P-T-Y-[KR] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2004 / Pattern and text revised. [ 1] Dyrks T., Weidemann A., Multhaup G., Salbaum J.M., Lemaire H.-G., Kang J., Muller-Hill B., Masters C.L., Beyreuther K. "Identification, transmembrane orientation and biogenesis of the amyloid A4 precursor of Alzheimer's disease." EMBO J. 7:949-957(1988). PubMed=2900137 [ 2] Ashall F., Goate A.M. "Role of the beta-amyloid precursor protein in Alzheimer's disease." Trends Biochem. Sci. 19:42-46(1994). PubMed=8140621 [ 3] Rosen D.R., Martin-Morris L., Luo L.Q., White K. "A Drosophila gene encoding a protein resembling the human beta-amyloid protein precursor." Proc. Natl. Acad. Sci. U.S.A. 86:2478-2482(1989). PubMed=2494667 [ 4] Wasco W., Bupp K., Magendantz M., Gusella J.F., Tanzi R.E., Solomon F. "Identification of a mouse brain cDNA that encodes a protein related to the Alzheimer disease-associated amyloid beta protein precursor." Proc. Natl. Acad. Sci. U.S.A. 89:10758-10762(1992). PubMed=1279693 [ 5] Sprecher C.A., Grant F.J., Grimm G., O'Hara P.J., Norris F., Norris K., Foster D.C. "Molecular cloning of the cDNA for a human amyloid precursor protein homolog: evidence for a multigene family." Biochemistry 32:4481-4486(1993). PubMed=8485127 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). 
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00205}
{PS00232; CADHERIN_1}
{PS50268; CADHERIN_2}
{BEGIN}
*****************************************
* Cadherin domain signature and profile *
*****************************************

Cadherins [1,2] are a family of animal glycoproteins responsible for calcium-dependent cell-cell adhesion. Cadherins preferentially interact with themselves in a homophilic manner in connecting cells, thus acting as both receptor and ligand. A wide number of tissue-specific forms of cadherins are known, for example:

 - Epithelial (E-cadherin) (CDH1).
 - Neural (N-cadherin) (CDH2).
 - Placental (P-cadherin) (CDH3).
 - Retinal (R-cadherin) (CDH4).
 - Vascular endothelial (VE-cadherin) (CDH5).
 - Kidney (K-cadherin) (CDH6).
 - Cadherin-8 (CDH8).
 - Cadherin-9 (CDH9).
 - Osteoblast (OB-cadherin) (CDH11).
 - Brain (BR-cadherin) (CDH12).
 - T-cadherin (truncated cadherin) (CDH13).
 - Muscle (M-cadherin) (CDH15).
 - Kidney (Ksp-cadherin) (CDH16).
 - Liver-intestine (LI-cadherin) (CDH17).

Structurally, cadherins are built of the following domains: a signal sequence, followed by a propeptide of about 130 residues, then an extracellular domain of around 600 residues, then a transmembrane region, and finally a C-terminal cytoplasmic domain of about 150 residues. The extracellular domain can be subdivided into five parts: there are four repeats of about 110 residues followed by a region that contains four conserved cysteines. It is suggested that the calcium-binding region of cadherins is located in the extracellular repeats.

Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins:

 - Desmoglein 1 (desmosomal glycoprotein I).
 - Desmoglein 2.
 - Desmoglein 3 (Pemphigus vulgaris antigen).

Other proteins that include cadherin domains are:

 - Drosophila fat protein [3], a huge protein of over 5000 amino acids that contains 34 cadherin-like repeats in its extracellular domain. Homologs of fat are found in mammals.
 - Protocadherins (6 copies).
 - Proto-oncogene tyrosine-protein kinase receptor ret (1 copy).

The signature pattern we have developed for the repeated domain is located in its C-terminal extremity, which is its best conserved region. The pattern includes two conserved aspartic acid residues as well as two asparagines; these residues could be implicated in the binding of calcium. We have also developed a profile that spans the complete domain.

-Consensus pattern: [LIV]-x-[LIV]-x-D-x-N-D-[NH]-x-P
                    [The 2 D's and the N are involved in calcium binding]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: 1.
-Note: This pattern is found in the first, second, and fourth copies of the repeated domain. In the third copy there is a deletion of one residue after the second conserved Asp.
-Last update: May 2004 / Text revised.

[ 1] Takeichi M. "Cadherins: a molecular family important in selective cell-cell adhesion." Annu. Rev. Biochem. 59:237-252(1990). PubMed=2197976; DOI=10.1146/annurev.bi.59.070190.001321
[ 2] Takeichi M. Trends Genet. 3:213-217(1987).
[ 3] Mahoney P.A., Weber U., Onofrechuk P., Biessmann H., Bryant P.J., Goodman C.S. "The fat tumor suppressor gene in Drosophila encodes a novel member of the cadherin gene superfamily." Cell 67:853-868(1991).
     PubMed=1959133
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00206}
{PS00233; CHIT_BIND_RR_1}
{PS51155; CHIT_BIND_RR_2}
{BEGIN}
********************************************************
* Chitin-binding type R&R domain signature and profile *
********************************************************

Insect cuticle is composed of proteins and chitin. The cuticular proteins seem to be specific to the type of cuticle (flexible or stiff) that occurs at particular stages of insect development. The proteins found in the flexible cuticle of larvae and pupae of different insects share a conserved C-terminal section [1]; such a region is also found in the soft endocuticle of adult insects [2] as well as in other cuticular proteins, including those of arachnids [3]. This conserved motif of 35-36 amino acids is known as the R&R consensus since it was first recognized by Rebers and Riddiford. N-terminal to the consensus is a region of hydrophilic amino acids. The two regions together have been called the extended R&R consensus and form a chitin-binding domain of about 70 amino acids [4,5]. The R&R chitin-binding domain has been proposed to constitute antiparallel beta-pleated sheets [6].

Some proteins known to contain an R&R chitin-binding domain are listed below:

 - Locust cuticle proteins 7 (LM-7), 8 (LM-8), 19 (LM-19) and endocuticle structural glycoprotein ABD-4.
 - Hyalophora cecropia cuticle proteins 12 and 66.
 - Drosophila larval cuticle proteins I, II, III and IV (LCP1 to LCP4).
 - Drosophila pupal cuticle protein (PCP).
 - Drosophila pupal cuticle proteins EDG-78E and EDG-84E.
 - Manduca sexta cuticle protein LCP-14.
 - Tenebrio molitor cuticle proteins ACP-20, A1A, A2B and A3A.
 - Araneus diadematus (spider) cuticle proteins ACP 11.9, ACP 12.4, ACP 12.6, ACP 15.5 and ACP 15.7.

We have developed both a pattern and a profile for the R&R chitin-binding domain. The pattern covers the R&R consensus, whereas the profile covers the entire R&R chitin-binding domain.

-Consensus pattern: G-x(7)-[DEN]-G-x(6)-[FY]-x-A-[DNG]-x(2,3)-G-[FY]-x-[APV]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Last update: April 2006 / Pattern revised.

[ 1] Rebers J.E., Riddiford L.M.
     "Structure and expression of a Manduca sexta larval cuticle gene homologous to Drosophila cuticle genes."
     J. Mol. Biol. 203:411-423(1988).
     PubMed=2462055
[ 2] Talbo G., Hoejrup P., Rahbek-Nielsen H., Andersen S.O., Roepstorff P.
     "Determination of the covalent structure of an N- and C-terminally blocked glycoprotein from endocuticle of Locusta migratoria. Combined use of plasma desorption mass spectrometry and Edman degradation to study post-translationally modified proteins."
     Eur. J. Biochem. 195:495-504(1991).
     PubMed=1997327
[ 3] Norup T., Berg T., Stenholm H., Andersen S.O., Hoejrup P.
     "Purification and characterization of five cuticular proteins from the spider Araneus diadematus."
     Insect Biochem. Mol. Biol. 26:907-915(1996).
     PubMed=9014336
[ 4] Rebers J.E., Willis J.H.
     "A conserved domain in arthropod cuticular proteins binds chitin."
     Insect Biochem. Mol. Biol. 31:1083-1093(2001).
     PubMed=11520687
[ 5] Togawa T., Nakato H., Izumi S.
     "Analysis of the chitin recognition mechanism of cuticle proteins from the soft cuticle of the silkworm, Bombyx mori."
     Insect Biochem. Mol. Biol. 34:1059-1067(2004).
     PubMed=15475300; DOI=10.1016/j.ibmb.2004.06.008
[ 6] Hamodrakas S.J., Willis J.H., Iconomidou V.A.
     "A structural model of the chitin-binding domain of cuticle proteins."
     Insect Biochem. Mol. Biol. 32:1577-1583(2002).
     PubMed=12530225
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00207}
{PS00234; GAS_VESICLE_A_1}
{PS00669; GAS_VESICLE_A_2}
{BEGIN}
****************************************
* Gas vesicles protein GVPa signatures *
****************************************

Gas vesicles are small, hollow, gas-filled protein structures found in several cyanobacterial and archaebacterial microorganisms [1]. They allow the positioning of the bacteria at the favorable depth for growth. Gas vesicles are hollow cylindrical tubes, closed by a hollow, conical cap at each end. Both the conical end caps and central cylinder are made up of 4-5 nm wide ribs that run at right angles to the long axis of the structure. Gas vesicles seem to be constituted of two different protein components: GVPa and GVPc. GVPa, a small protein of about 70 amino acid residues, is the main constituent of gas vesicles and forms the essential core of the structure. The sequence of GVPa is extremely well conserved. GvpJ and gvpM, two proteins encoded in the cluster of genes required for gas vesicle synthesis in the archaebacteria Halobacterium halobium and Haloferax mediterranei, have been found [2] to be evolutionarily related to GVPa.
The exact function of these two proteins is not known, although they could be important for determining the shape of gas vesicles. The N-terminal domain of Aphanizomenon flos-aquae protein gvpA/J is also related to GVPa.

We developed two signature patterns for this family of proteins. The first pattern is located in the N-terminal section while the second is in the C-terminal section.

-Consensus pattern: [LIVM]-x-[DE]-[LIVMFYT]-[LIVM]-[DE]-x-[LIVM](2)-[DKR](2)-G-x-[LIVMA]-[LIVM]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: R-[LIVA](3)-A-[GS]-[LIVMFY]-x-[TK]-x(3)-[YFI]-[AG]
-Sequences known to belong to this class detected by the pattern: ALL, except for Aphanizomenon flos-aquae gvpA/J.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Last update: December 2004 / Patterns and text revised.

[ 1] Walsby A.E., Hayes P.K.
     "Gas vesicle proteins."
     Biochem. J. 264:313-322(1989).
     PubMed=2513809
[ 2] Jones J.G., Young D.C., DasSarma S.
     "Structure and organization of the gas vesicle gene cluster on the Halobacterium halobium plasmid pNRC100."
     Gene 102:117-122(1991).
     PubMed=1864501
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00208}
{PS00235; GAS_VESICLE_C}
{BEGIN}
*******************************************************
* Gas vesicles protein GVPc repeated domain signature *
*******************************************************

Gas vesicles are small, hollow, gas-filled protein structures found in several cyanobacterial and archaebacterial microorganisms [1]. They allow the positioning of the bacteria at the favorable depth for growth. Gas vesicles are hollow cylindrical tubes, closed by a hollow, conical cap at each end. Both the conical end caps and central cylinder are made up of 4-5 nm wide ribs that run at right angles to the long axis of the structure. Gas vesicles seem to be constituted of two different protein components: GVPa and GVPc.

GVPc is a minor constituent of gas vesicles and seems to be located on the outer surface. Structurally, cyanobacterial GVPc consists of four or five tandem repeats of a 33 residue sequence flanked by sequences of 18 and 10 residues at the N- and C-termini, respectively.

We derived a signature pattern for the repeated domain. This signature spans positions 11 to 33 of that domain.

-Consensus pattern: F-L-x(2)-T-x(3)-R-x(3)-A-x(2)-Q-x(3)-L-x(2)-F
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: This pattern is not designed to detect archaebacterial GVPc [2] which is composed of 8 tandem repeats of a sequence very distantly (if at all) related to that of cyanobacterial GVPc.
-Last update: June 1992 / Text revised.

[ 1] Walsby A.E., Hayes P.K.
     "Gas vesicle proteins."
     Biochem. J. 264:313-322(1989).
     PubMed=2513809
[ 2] Jones J.G., Young D.C., DasSarma S.
     "Structure and organization of the gas vesicle gene cluster on the Halobacterium halobium plasmid pNRC100."
     Gene 102:117-122(1991).
     PubMed=1864501
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00209}
{PS00236; NEUROTR_ION_CHANNEL}
{BEGIN}
*************************************************
* Neurotransmitter-gated ion-channels signature *
*************************************************

Neurotransmitter-gated ion-channels [1,2,3,4] provide the molecular basis for rapid signal transmission at chemical synapses. They are postsynaptic oligomeric transmembrane complexes that transiently form an ionic channel upon the binding of a specific neurotransmitter. Presently, the sequences of subunits from five types of neurotransmitter-gated receptors are known:

 - The nicotinic acetylcholine receptor (AchR), an excitatory cation channel. In the motor endplates of vertebrates, it is composed of four different subunits (alpha, beta, gamma and delta or epsilon) with a molar stoichiometry of 2:1:1:1. In neurones, the AchR receptor is composed of two different types of subunits: alpha and non-alpha (also called beta). Nicotinic AchRs are also found in invertebrates.
 - The glycine receptor, an inhibitory chloride ion channel. The glycine receptor is a pentamer composed of two different subunits (alpha and beta).
 - The gamma-aminobutyric-acid (GABA) receptor, which is also an inhibitory chloride ion channel.
   The quaternary structure of the GABA receptor is complex; at least four classes of subunits are known to exist (alpha, beta, gamma, and delta) and there are many variants in each class (for example: six variants of the alpha class have already been sequenced).
 - The serotonin 5HT3 receptor. Serotonin is a biogenic hormone that functions as a neurotransmitter, a hormone and a mitogen. There are seven major groups of serotonin receptors; six of these groups (5HT1, 5HT2, and 5HT4 to 5HT7) transduce extracellular signals by activating G proteins, while 5HT3 is a ligand-gated cation-specific ion channel which, when activated, causes fast, depolarizing responses in neurons.
 - The glutamate receptor, an excitatory cation channel. Glutamate is the main excitatory neurotransmitter in the brain. At least three different types of glutamate receptors have been described and are named according to their selective agonists (kainate, N-methyl-D-aspartate (NMDA) and quisqualate).

All known sequences of subunits from neurotransmitter-gated ion-channels are structurally related. They are composed of a large extracellular glycosylated N-terminal ligand-binding domain, followed by three hydrophobic transmembrane regions which form the ionic channel, followed by an intracellular region of variable length. A fourth hydrophobic region is found at the C-terminal of the sequence.

The sequences of subunits from the AchR, GABA, 5HT3, and Gly receptors are clearly evolutionarily related and share many regions of sequence similarity. These sequence similarities are either absent or very weak in the Glu receptors.

In the N-terminal extracellular domain of AchR/GABA/5HT3/Gly receptors, there are two conserved cysteine residues, which, in AchR, have been shown to form a disulfide bond essential to the tertiary structure of the receptor. A number of amino acids between the two disulfide-bonded cysteines are also conserved. We have therefore used this region as a signature pattern for this subclass of proteins.

-Consensus pattern: C-x-[LIVMFQ]-x-[LIVMF]-x(2)-[FY]-P-x-D-x(3)-C
                    [The 2 C's are linked by a disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL the ion-gated receptors except for glutamate receptors.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: In most AchR subunits and in GABA beta subunits, the residue N-terminal to the second cysteine is an N-glycosylated asparagine.
-Last update: May 2004 / Text revised.

[ 1] Stroud R.M., McCarthy M.P., Shuster M.
     "Nicotinic acetylcholine receptor superfamily of ligand-gated ion channels."
     Biochemistry 29:11009-11023(1990).
     PubMed=1703009
[ 2] Betz H.
     "Ligand-gated ion channels in the brain: the amino acid receptor superfamily."
     Neuron 5:383-392(1990).
     PubMed=1698394
[ 3] Dingledine R., Myers S.J., Nicholas R.A.
     FASEB J. 4:2632-2645(1990).
[ 4] Barnard E.A.
     "Receptor classes and the transmitter-gated ion channels."
     Trends Biochem. Sci. 17:368-374(1992).
     PubMed=1360717
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00210}
{PS00237; G_PROTEIN_RECEP_F1_1}
{PS50262; G_PROTEIN_RECEP_F1_2}
{BEGIN}
**************************************************************
* G-protein coupled receptors family 1 signature and profile *
**************************************************************

G-protein coupled receptors [1 to 4,E1] (also called R7G) are an extensive group of receptors for hormones, neurotransmitters, odorants and light, which transduce extracellular signals by interaction with guanine nucleotide-binding (G) proteins. The receptors that are currently known to belong to this family are listed below.

 - 5-hydroxytryptamine (serotonin) 1A to 1F, 2A to 2C, 4, 5A, 5B, 6 and 7 [5].
 - Acetylcholine, muscarinic-type, M1 to M5.
 - Adenosine A1, A2A, A2B and A3 [6].
 - Adrenergic alpha-1A to -1C; alpha-2A to -2D; beta-1 to -3 [7].
 - Angiotensin II types I and II.
 - Bombesin subtypes 3 and 4.
 - Bradykinin B1 and B2.
 - C3a and C5a anaphylatoxin.
 - Cannabinoid CB1 and CB2.
 - Chemokines C-C CC-CKR-1 to CC-CKR-8.
 - Chemokines C-X-C CXC-CKR-1 to CXC-CKR-4.
 - Cholecystokinin-A and cholecystokinin-B/gastrin.
 - Dopamine D1 to D5 [8].
 - Endothelin ET-a and ET-b [9].
 - fMet-Leu-Phe (fMLP) (N-formyl peptide).
 - Follicle stimulating hormone (FSH-R) [10].
 - Galanin.
 - Gastrin-releasing peptide (GRP-R).
 - Gonadotropin-releasing hormone (GNRH-R).
 - Histamine H1 and H2 (gastric receptor I).
 - Lutropin-choriogonadotropic hormone (LSH-R) [10].
 - Melanocortin MC1R to MC5R.
 - Melatonin.
 - Neuromedin B (NMB-R).
 - Neuromedin K (NK-3R).
 - Neuropeptide Y types 1 to 6.
 - Neurotensin (NT-R).
 - Octopamine (tyramine), from insects.
 - Odorants [11].
 - Opioids delta-, kappa- and mu-types [12].
 - Oxytocin (OT-R).
 - Platelet activating factor (PAF-R).
 - Prostacyclin.
 - Prostaglandin D2.
 - Prostaglandin E2, EP1 to EP4 subtypes.
 - Prostaglandin F2.
 - Purinoreceptors (ATP) [13].
 - Somatostatin types 1 to 5.
 - Substance-K (NK-2R).
 - Substance-P (NK-1R).
 - Thrombin.
 - Thromboxane A2.
 - Thyrotropin (TSH-R) [10].
 - Thyrotropin releasing factor (TRH-R).
 - Vasopressin V1a, V1b and V2.
 - Visual pigments (opsins and rhodopsin) [14].
 - Proto-oncogene mas.
 - A number of orphan receptors (whose ligand is not known) from mammals and birds.
 - Caenorhabditis elegans putative receptors C06G4.5, C38C10.1, C43C3.2, T27D1.3 and ZC84.4.
 - Three putative receptors encoded in the genome of cytomegalovirus: US27, US28, and UL33.
 - ECRF3, a putative receptor encoded in the genome of herpesvirus saimiri.

The structure of all these receptors is thought to be identical. They have seven hydrophobic regions, each of which most probably spans the membrane. The N-terminus is located on the extracellular side of the membrane and is often glycosylated, while the C-terminus is cytoplasmic and generally phosphorylated. Three extracellular loops alternate with three intracellular loops to link the seven transmembrane regions. Most, but not all, of these receptors lack a signal peptide. The most conserved parts of these proteins are the transmembrane regions and the first two cytoplasmic loops. A conserved acidic-Arg-aromatic triplet is present in the N-terminal extremity of the second cytoplasmic loop [15] and could be implicated in the interaction with G proteins.

To detect this widespread family of proteins we have developed a pattern that contains the conserved triplet and that also spans the major part of the third transmembrane helix. We have also developed a profile that spans the seven transmembrane regions.

-Consensus pattern: [GSTALIVMFYWC]-[GSTANCPDE]-{EDPKRH}-x-{PQ}-[LIVMNQGA]-{RK}-{RK}-[LIVMFT]-[GSTANC]-[LIVMFYWSTAC]-[DENH]-R-[FYWCSH]-{PE}-x-[LIVM]
-Sequences known to belong to this class detected by the pattern: the majority of receptors. About 5% are not detected.
-Other sequence(s) detected in Swiss-Prot: 64.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: 1.

-Expert(s) to contact by email:
           Attwood T.K.; [email protected]
           Kolakowski L.F. Jr.; [email protected]

-Last update: April 2006 / Pattern revised.

[ 1] Strosberg A.D.
     "Structure/function relationship of proteins belonging to the family of receptors coupled to GTP-binding proteins."
     Eur. J. Biochem. 196:1-10(1991).
     PubMed=1848179
[ 2] Kerlavage A.R.
     Curr. Opin. Struct. Biol. 1:394-401(1991).
[ 3] Probst W.C., Snyder L.A., Schuster D.I., Brosius J., Sealfon S.C.
     "Sequence alignment of the G-protein coupled receptor superfamily."
     DNA Cell Biol. 11:1-20(1992).
     PubMed=1310857
[ 4] Savarese T.M., Fraser C.M.
     "In vitro mutagenesis and the search for structure-function relationships among G protein-coupled receptors."
     Biochem. J. 283:1-19(1992).
     PubMed=1314560
[ 5] Branchek T.
     "More serotonin receptors?"
     Curr. Biol. 3:315-317(1993).
     PubMed=15335760
[ 6] Stiles G.L.
     "Adenosine receptors."
     J. Biol. Chem. 267:6451-6454(1992).
     PubMed=1551861
[ 7] Friell T., Kobilka B.K., Lefkowitz R.J., Caron M.G.
     Trends Neurosci. 11:321-324(1988).
[ 8] Stevens C.F.
     "New recruit to the magnificent seven."
     Curr. Biol. 1:20-22(1991).
     PubMed=15336196
[ 9] Sakurai T., Yanagisawa M., Masaki T.
     "Molecular characterization of endothelin receptors."
     Trends Pharmacol. Sci. 13:103-108(1992).
     PubMed=1315462
[10] Salesse R., Remy J.J., Levin J.M., Jallal B., Garnier J.
     Biochimie 73:109-120(1991).
[11] Lancet D., Ben-Arie N.
     Curr. Biol. 3:668-674(1993).
[12] Uhl G.R., Childers S., Pasternak G.
     Trends Neurosci. 17:89-93(1994).
[13] Barnard E.A., Burnstock G., Webb T.E.
     Trends Pharmacol. Sci. 15:67-70(1994).
[14] Applebury M.L., Hargrave P.A.
     Vision Res. 26:1881-1895(1986).
[15] Attwood T.K., Eliopoulos E.E., Findlay J.B.C.
     Gene 98:153-159(1991).
[E1] http://www.gpcr.org/7tm/
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB).
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00211}
{PS00238; OPSIN}
{BEGIN}
*************************************************
* Visual pigments (opsins) retinal binding site *
*************************************************

Visual pigments [1,2] are the light-absorbing molecules that mediate vision. They consist of an apoprotein, opsin, covalently linked to the chromophore cis-retinal. Vision is effected through the absorption of a photon by cis-retinal, which is isomerized to trans-retinal. This isomerization leads to a change of conformation of the protein. Opsins are integral membrane proteins with seven transmembrane regions that belong to family 1 of G-protein coupled receptors (see <PDOC00210>).

In vertebrates four different pigments are generally found. Rod cells, which mediate vision in dim light, contain the pigment rhodopsin. Cone cells, which function in bright light, are responsible for color vision and contain three or more color pigments (for example, in mammals: red, blue and green).

In Drosophila, the eye is composed of 800 facets or ommatidia. Each ommatidium contains eight photoreceptor cells (R1-R8): the R1 to R6 cells are outer cells, R7 and R8 inner cells. Each of the three types of cells (R1-R6, R7 and R8) expresses a specific opsin.

Proteins evolutionarily related to opsins include:

 - Squid retinochrome, also known as retinal photoisomerase, which converts various isomers of retinal into 11-cis retinal.
 - Mammalian opsin 3 (Encephalopsin) that may play a role in encephalic photoreception.
 - Mammalian opsin 4 (Melanopsin) that may mediate regulation of circadian rhythms and acute suppression of pineal melatonin.
 - Mammalian retinal pigment epithelium (RPE) RGR [3], a protein that may also act in retinal isomerization.

The attachment site for retinal in the above proteins is a conserved lysine residue in the middle of the seventh transmembrane helix. The pattern we developed includes this residue.

-Consensus pattern: [LIVMFWAC]-[PSGAC]-x-{G}-x-[SAC]-K-[STALIMR]-[GSACPNV]-[STACP]-x(2)-[DENF]-[AP]-x(2)-[IY]
                    [K is the retinal binding site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 2.

-Last update: December 2004 / Pattern and text revised.

[ 1] Applebury M.L., Hargrave P.A.
     "Molecular biology of the visual pigments."
     Vision Res. 26:1881-1895(1986).
     PubMed=3303660
[ 2] Fryxell K.J., Meyerowitz E.M.
     "The evolution of rhodopsins and neurotransmitter receptors."
     J. Mol. Evol. 33:367-378(1991).
     PubMed=1663559
[ 3] Shen D., Jiang M., Hao W., Tao L., Salazar M., Fong H.K.W.
     "A human opsin-related gene that encodes a retinaldehyde-binding protein."
     Biochemistry 33:13117-13125(1994).
     PubMed=7947717
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00212}
{PS00239; RECEPTOR_TYR_KIN_II}
{BEGIN}
***********************************************
* Receptor tyrosine kinase class II signature *
***********************************************

A number of growth factors stimulate mitogenesis by interacting with a family of cell surface receptors which possess an intrinsic, ligand-sensitive, protein tyrosine kinase activity [1]. These receptor tyrosine kinases (RTK) all share the same topology: an extracellular ligand-binding domain, a single transmembrane region and a cytoplasmic kinase domain. However, they can be classified into at least five groups. The prototype for class II RTK's is the insulin receptor, a heterotetramer of two alpha and two beta chains linked by disulfide bonds. The alpha and beta chains are cleavage products of a precursor molecule. The alpha chain contains the ligand binding site; the beta chain traverses the membrane and contains the tyrosine protein kinase domain. The receptors currently known to belong to class II are:

 - Insulin receptor from vertebrates.
 - Insulin growth factor I receptor from mammals.
 - Insulin receptor-related receptor (IRR), which is most probably a receptor for a peptide belonging to the insulin family.
 - Insects insulin-like receptors.
 - Molluscan insulin-related peptide(s) receptor (MIP-R).
 - Insulin-like peptide receptor from Branchiostoma lanceolatum.
 - The Drosophila developmental protein sevenless, a putative receptor for positional information required for the formation of the R7 photoreceptor cells.
 - The trk family of receptors (NTRK1, NTRK2 and NTRK3), which are high affinity receptors for nerve growth factor and related neurotrophic factors (BDNF and NT-3).

And the following uncharacterized receptors:

 - ROS.
 - LTK (TYK1).
 - EDDR1 (cak, TRKE, RTK6).
 - NTRK3 (Tyro10, TKT).
 - A sponge putative receptor tyrosine kinase.
While only the insulin and the insulin growth factor I receptors are known to exist in the tetrameric conformation specific to class II RTK's, all the above proteins share extensive homologies in their kinase domain, especially around the putative site of autophosphorylation. Hence, we developed a signature pattern for this class of RTK's, which includes the tyrosine residue, itself probably autophosphorylated.

-Consensus pattern: [DN]-[LIV]-Y-x(3)-Y-Y-R
                    [The second Y is the autophosphorylation site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Last update: November 1997 / Text revised.

[ 1] Yarden Y., Ullrich A.
     "Growth factor receptor tyrosine kinases."
     Annu. Rev. Biochem. 57:443-478(1988).
     PubMed=3052279; DOI=10.1146/annurev.bi.57.070188.002303
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00213}
{PS00240; RECEPTOR_TYR_KIN_III}
{BEGIN}
************************************************
* Receptor tyrosine kinase class III signature *
************************************************

A number of growth factors stimulate mitogenesis by interacting with a family of cell surface receptors which possess an intrinsic, ligand-sensitive, protein tyrosine kinase activity [1]. These receptor tyrosine kinases (RTK) all share the same topology: an extracellular ligand-binding domain, a single transmembrane region and a cytoplasmic kinase domain.
However, they can be classified into at least five groups. The class III RTK's are characterized by the presence of five to seven immunoglobulin-like domains [2] in their extracellular section. Their kinase domain differs from that of other RTK's by the insertion of a stretch of 70 to 100 hydrophilic residues in the middle of this domain. The receptors currently known to belong to class III are:

 - Platelet-derived growth factor receptor (PDGF-R). PDGF-R exists as a homo- or heterodimer of two related chains: alpha and beta [3].
 - Macrophage colony stimulating factor receptor (CSF-1-R) (also known as the fms oncogene).
 - Stem cell factor (mast cell growth factor) receptor (also known as the kit oncogene).
 - Vascular endothelial growth factor (VEGF) receptors Flt-1 and Flk-1/KDR [4].
 - Fl cytokine receptor Flk-2/Flt-3 [5].
 - The putative receptor Flt-4 [6].

We developed a signature pattern for this class of RTK's which is based on a conserved region in the kinase domain.

-Consensus pattern: G-x-H-x-N-[LIVM]-V-N-L-L-G-A-C-T
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Last update: December 2001 / Text revised.

[ 1] Yarden Y., Ullrich A.
     "Growth factor receptor tyrosine kinases."
     Annu. Rev. Biochem. 57:443-478(1988).
     PubMed=3052279; DOI=10.1146/annurev.bi.57.070188.002303
[ 2] Hunkapiller T., Hood L.
     "Diversity of the immunoglobulin gene superfamily."
     Adv. Immunol. 44:1-63(1989).
     PubMed=2646860
[ 3] Lee K.-H., Bowen-Pope D.F., Reed R.R.
     "Isolation and characterization of the alpha platelet-derived growth factor receptor from rat olfactory epithelium."
     Mol. Cell. Biol. 10:2237-2246(1990).
     PubMed=2157969
[ 4] Terman B.I., Dougher-Vermazen M., Carrion M.E., Dimitrov D., Armellino D.C., Gospodarowicz D., Bohlen P.
     "Identification of the KDR tyrosine kinase as a receptor for vascular endothelial cell growth factor."
     Biochem. Biophys. Res. Commun. 187:1579-1586(1992).
     PubMed=1417831
[ 5] Lyman S.D., James L., Vanden Bos T., de Vries P., Brasel K., Gliniak B., Hollingsworth L.T., Picha K.S., McKenna H.J., Splett R.R.
     "Molecular cloning of a ligand for the flt3/flk-2 tyrosine kinase receptor: a proliferative factor for primitive hematopoietic cells."
     Cell 75:1157-1167(1993).
     PubMed=7505204
[ 6] Galland F., Karamysheva A., Pebusque M.J., Borg J.P., Rottapel R., Dubreuil P., Rosnet O., Birnbaum D.
     "The FLT4 gene encodes a transmembrane tyrosine kinase related to the vascular endothelial growth factor receptor."
     Oncogene 8:1233-1240(1993).
     PubMed=8386825
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00215}
{PS00242; INTEGRIN_ALPHA}
{BEGIN}
***********************************
* Integrins alpha chain signature *
***********************************

Integrins [1,2] are a large family of cell surface receptors that mediate cell to cell as well as cell to matrix adhesion. Some integrins recognize the R-G-D sequence in their extracellular matrix protein ligand. Structurally, integrins consist of a dimer of an alpha and a beta chain. Each subunit has a large N-terminal extracellular domain followed by a transmembrane domain and a short C-terminal cytoplasmic region. Some alpha subunits are cleaved posttranslationally to produce a heavy and a light chain linked by a disulfide bond.
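The R-G-D motif mentioned above is written in the same hyphen-separated notation as the consensus patterns in this file. The following is a minimal sketch, not an official PROSITE tool, of how such a pattern could be translated into a Python regular expression and used to scan a sequence. The function names and the test sequence are hypothetical, and only the elements seen in this file are handled (single residues, x, [..], {..} and repeat counts); the anchors < and > are not supported.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern such as 'R-G-D' into a regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Split an element into its core and an optional (n) or (n,m) repeat.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"empty pattern element in {pattern!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"      # any residue except those listed
        # A single letter or a [..] set is already valid regex syntax.
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)


def find_motif(pattern: str, sequence: str):
    """Return (1-based start, matched text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


# Hypothetical toy sequence, for illustration only.
print(find_motif("R-G-D", "MKTAYRGDSPLLRGDQ"))
print(prosite_to_regex("[DN]-[LIV]-Y-x(3)-Y-Y-R"))
```

Note that `finditer` reports non-overlapping matches only; a scan that must also report overlapping hits would need a lookahead-based search instead.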
The sequences of a number of alpha chains have been obtained and are listed below:

 - The alpha-1 chain (VLA-1) (CD49a) which, with the beta-1 chain, acts as a receptor for laminin and collagen.
 - The alpha-2 chain (VLA-2) (CD49b) which, with the beta-1 chain, acts as a receptor that binds collagen.
 - The alpha-3 chain (VLA-3) (Galactoprotein B3).
 - The alpha-4 chain (VLA-4) (CD49d) which, with the beta-1 chain, interacts with vascular cell adhesion protein 1 (VCAM-1).
 - The alpha-5 chain (VLA-5) (CD49e) which, with the beta-1 chain, forms a receptor specific to fibronectin.
 - The alpha-6 chain (VLA-6) which, with the beta-1 chain, forms a platelet laminin receptor.
 - The alpha-7 chain which, with the beta-1 chain, forms a skeletal myoblast laminin receptor.
 - The alpha-8 chain which, with the beta-1 chain, plays a possible role in cell-cell interactions during axon-growth and fasciculation.
 - The alpha-L chain (LFA-1) (CD11a) which, with the beta-2 chain, interacts with intercellular adhesion molecule 1 (ICAM-1).
 - The alpha-M chain (MAC-1) (CD11b) which, with the beta-2 chain, forms the receptor for the iC3b fragment of the third complement component.
 - The alpha-X chain (p150,95) (CD11c) which, with the beta-2 chain, probably forms a receptor for the iC3b fragment of the third complement component.
 - The alpha-V chain (CD51) which, with the beta-3 chain, forms a receptor that binds vitronectin.
 - The alpha-IIB chain (CD41) (also known as platelet glycoprotein IIb) which, with the beta-3 chain, forms a receptor which binds VWF, fibrinogen, fibronectin, and vitronectin.
 - The Drosophila position-specific antigen 2 alpha chain (PS2).
 - Caenorhabditis elegans hypothetical proteins F54F2.1 and F54G8.3.

All these integrin alpha chains share a conserved sequence which is found at the beginning of the cytoplasmic domain, just after the end of the transmembrane region. This motif is probably involved in heterodimer association and may lock the heterodimer into a low affinity conformation in the absence of activating signals. We have used this conserved region as a signature pattern.

-Consensus pattern: [FYWS]-[RK]-x-G-F-F-x-R
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 3.

-Note: In position 7 of the pattern all vertebrate integrins have Lys, while Drosophila PS2 has Asn.
-Last update: December 2001 / Text revised.

[ 1] Hynes R.O.
     "Integrins: a family of cell surface receptors."
     Cell 48:549-554(1987).
     PubMed=3028640
[ 2] Albelda S.M., Buck C.A.
     "Integrins and other cell adhesion molecules."
     FASEB J. 4:2868-2880(1990).
     PubMed=2199285
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00216}
{PS00243; INTEGRIN_BETA}
{BEGIN}
*******************************************************
* Integrins beta chain cysteine-rich domain signature *
*******************************************************

Integrins [1,2] are a large family of cell surface receptors that mediate cell to cell as well as cell to matrix adhesion. Some integrins recognize the R-G-D sequence in their extracellular matrix protein ligand. Structurally, integrins consist of a dimer of an alpha and a beta chain. Each subunit has a large N-terminal extracellular domain followed by a transmembrane domain and a short C-terminal cytoplasmic region.
Some receptors share a common beta chain while having different alpha chains. The sequence of a number of different beta chains has been determined and are listed below: - Integrin beta-1, which associates with alpha-1 to form a laminin receptor, with alpha-2 to form a collagen receptor, with alpha-4 to interact with VCAM-1, with alpha-5 to form a fibronectin receptor, and with alpha-8. - Integrin beta-2, which associates with alpha-L (LFA-1) to interact with ICAM-1, and with alpha-M (MAC-1) or alpha-X (p150,95) to form the receptor for the iC3b fragment of the third complement component. - Integrin beta-3, which associates with alpha-IIB to form a receptor for fibrinogen, fibronectin, vitronectin and VWF, and with alpha-V to form a vitronectin receptor. - Integrin beta-4, which associates with alpha-6. - Integrin beta-5, which associates with alpha-V. - Integrin beta-6 [3]. - Integrin beta-7 [4]. - Integrin beta-8, which associates with alpha-V [5]. - The Drosophila myospheroid protein, a probable integrin beta chain. All the integrin beta chains contain four repeats of a forty amino acid region in the C-terminal extremity of their extracellular domain. Each of the repeats contains eight cysteines. We have developed a pattern from a section of the repeated region that includes five of these conserved cysteines. -Consensus pattern: C-x-[GNQ]-x(1,3)-G-x-C-x-C-x(2)-C-x-C [The 5 C's may be involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: The pattern will not pick up the first of the four repeats, the spacing of the cysteine residues being different in that repeat. -Last update: May 2004 / Text revised. [ 1] Hynes R.O. "Integrins: a family of cell surface receptors." Cell 48:549-554(1987). PubMed=3028640 [ 2] Albelda S.M., Buck C.A. "Integrins and other cell adhesion molecules." FASEB J. 4:2868-2880(1990). 
PubMed=2199285 [ 3] Sheppard D., Rozzo C., Starr L., Quaranta V., Erle D.J., Pytela R. "Complete amino acid sequence of a novel integrin beta subunit (beta 6) identified in epithelial cells using the polymerase chain reaction." J. Biol. Chem. 265:11502-11507(1990). PubMed=2365683 [ 4] Erle D.J., Rueegg C., Sheppard D., Pytela R. "Complete amino acid sequence of an integrin beta subunit (beta 7) identified in leukocytes." J. Biol. Chem. 266:11009-11016(1991). PubMed=2040616 [ 5] Moyle M., Napier M.A., McLean J.W. "Cloning and expression of a divergent integrin subunit beta 8." J. Biol. Chem. 266:19650-19658(1991). PubMed=1918072 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00217} {PS00244; REACTION_CENTER} {BEGIN} ***************************************************** * Photosynthetic reaction center proteins signature * ***************************************************** In the photosynthetic reaction center of purple bacteria, two homologous integral membrane proteins, L(ight) and M(edium), are known to be essential to the light-mediated water-splitting process. In the photosystem II of eukaryotic chloroplasts two related proteins are involved: the D1 (psbA) and D2 proteins (psbD). These four types of protein probably evolved from a common ancestor [see 1,2 for recent reviews]. We developed a signature pattern which include two conserved histidine residues. 
In L and M chains, the first histidine is a ligand of the magnesium ion of the special pair bacteriochlorophyll, the second is a ligand of a ferrous non-heme iron atom. In photosystem II these two histidines are thought to play a similar role.

-Consensus pattern: [NQH]-x(4)-P-x-H-x(2)-[SAG]-x(11)-[SAGC]-x-H-[SAG](2)
                    [The first H is a magnesium ligand]
                    [The second H is an iron ligand]
-Sequences known to belong to this class detected by the pattern: ALL, except for broad bean psbA which has Gln instead of the second His.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Expert(s) to contact by email:
           Edelman M.; [email protected]
           Hirschberg J.; [email protected]
-Last update: December 2001 / Pattern and text revised.

[ 1] Michel H., Deisenhofer J. Biochemistry 27:1-7(1988).
[ 2] Barber J. Trends Biochem. Sci. 12:321-326(1987).

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00218}
{PS00245; PHYTOCHROME_1}
{PS50046; PHYTOCHROME_2}
{BEGIN}
*******************************************
* Phytochrome chromophore attachment site *
*******************************************

Phytochrome [1,2,3] is a plant protein that acts as a regulatory photoreceptor and which mediates red-light effects on a wide variety of physiological and molecular responses. Phytochrome can undergo a reversible photochemical conversion between a biologically inactive red light-absorbing form and the active far-red light-absorbing form.
Phytochrome is a dimer of identical 124 Kd subunits, each of which contains a covalently attached linear tetrapyrrole chromophore. The chromophore is attached to a cysteine which is located in a highly conserved region that can be used as a signature pattern.

Synechocystis strain PCC 6803 hypothetical protein slr0473 contains a domain similar to that of plant phytochromes and also seems to bind a chromophore.

-Consensus pattern: [RGS]-[GSA]-[PV]-H-x-C-H-x(2)-Y
                    [C is the chromophore attachment site]
-Sequences known to belong to this class detected by the pattern: ALL, except for slr0473.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Expert(s) to contact by email:
           Partis M.D.; [email protected]
-Last update: November 1997 / Pattern and text revised; profile.

[ 1] Silverthorne J., Tobin E.M. BioEssays 7:18-23(1987).
[ 2] Quail P.H. "Phytochrome: a light-activated molecular switch that regulates plant gene expression." Annu. Rev. Genet. 25:389-409(1991). PubMed=1812812; DOI=10.1146/annurev.ge.25.120191.002133
[ 3] Quail P.H. "The phytochromes: a biochemical mechanism of signaling in sight?" BioEssays 19:571-579(1997). PubMed=9230690

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00219}
{PS00246; WNT1}
{BEGIN}
**************************
* Wnt-1 family signature *
**************************

Wnt-1 (previously known as int-1) [1] is a proto-oncogene induced by the integration of the mouse mammary tumor virus. It is thought to play a role in intercellular communication and seems to be a signalling molecule important in the development of the central nervous system (CNS). The sequence of wnt-1 is highly conserved in mammals, fish, and amphibians. Wnt-1 is a member of a large family of related proteins [2,3,4] that are all thought to be developmental regulators. These proteins are known as wnt-2 (also known as irp), wnt-3 up to wnt-15. At least four members of this family are present in Drosophila; one of them, wingless (wg), is implicated in segmentation polarity.

All these proteins share the following features characteristic of secretory proteins: a signal peptide, several potential N-glycosylation sites and 22 conserved cysteines that are probably involved in disulfide bonds. The Wnt proteins seem to adhere to the plasma membrane of the secreting cells and are therefore likely to signal over only a few cell diameters.

Signal transduction by the Wnt family of ligands is mediated by the binding to the extracellular domain fz of Frizzled receptors. It can lead to either the activation of dishevelled proteins, inhibition of GSK-3 kinase, nuclear accumulation of beta-catenin and activation of Wnt target genes or be coupled to the inositol signaling pathway and PKC activation, depending on the type of Frizzled receptor [5,6].

We selected a highly conserved region including three cysteines as a signature for this family of proteins.

-Consensus pattern: C-[KR]-C-H-G-[LIVMT]-S-G-x-C
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Nusse R.
"The int genes in mammary tumorigenesis and in normal development." Trends Genet. 4:291-295(1988). PubMed=3076290 [ 2] Nusse R., Varmus H.E. "Wnt genes." Cell 69:1073-1087(1992). PubMed=1617723 [ 3] McMahon A.P. Trends Genet. 8:1-5(1992). [ 4] Moon R.T. "In pursuit of the functions of the Wnt family of developmental regulators: insights from Xenopus laevis." BioEssays 15:91-97(1993). PubMed=8471061 [ 5] Dale T.C. "Signal transduction by the Wnt family of ligands." Biochem. J. 329:209-223(1998). PubMed=9425102 [ 6] Seidensticker M.J., Behrens J. "Biochemical interactions in the wnt pathway." Biochim. Biophys. Acta 1495:168-182(2000). PubMed=10656974 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00220} {PS00247; HBGF_FGF} {BEGIN} ***************************** * HBGF/FGF family signature * ***************************** Heparin-binding growth factors I and II (HBGF) [1,2] (also known as acidic and basic fibroblast growth factors (FGF) are structurally related mitogens which stimulate growth or differentiation of a wide variety of cells of mesodermal or neuroectodermal origin. These two proteins belong to a family of growth factors and oncogenes which is currently known [3,4] to include: - FGF-3 (int-2), induced by the integration of mouse mammary tumor virus (MMTV). - FGF-4 (hst-1; KS3), a transforming protein independently isolated from a human stomach tumor (hst-1) and from Kaposi's sarcoma (KS3). - FGF-5, an oncogene expressed in neonatal brain. 
- FGF-6 (hst-2), a transforming protein that exhibits strong mitogenic and angiogenic properties.
- FGF-7 or keratinocyte growth factor (KGF), a paracrine effector of normal epithelial cell proliferation.
- FGF-8 or androgen-induced growth factor (AIGF).
- FGF-9 or glia-activating factor (GAF), a heparin-binding growth factor that may have a role in glial cell growth and differentiation during development.
- FGF-11 (FHF-3), FGF-12 (FHF-1), FGF-13 (FHF-2) and FGF-14 (FHF-4) [5], which seem to be involved in nervous system development and function.
- FGF-15, which may play an important role in regulating cell division and patterning within specific regions of the embryonic brain, spinal cord and sensory organs.
- FGF-16.
- FGF-17.
- FGF-18, which stimulates hepatic and intestinal proliferation.
- FGF-19.
- An FGF homolog of unknown function from Autographa californica nuclear polyhedrosis virus [6].

From the sequences of these related proteins, we have derived a signature pattern which includes one of the two conserved cysteine residues.

-Consensus pattern: G-x-[LIM]-x-[STAGP]-x(6,7)-[DENA]-C-x-[FLM]-x-[EQ]-x(6)-Y
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Burgess W.H., Maciag T. "The heparin-binding (fibroblast) growth factor family of proteins." Annu. Rev. Biochem. 58:575-606(1989). PubMed=2549857; DOI=10.1146/annurev.bi.58.070189.003043
[ 2] Thomas K.A. "Transforming potential of fibroblast growth factor genes." Trends Biochem. Sci. 13:327-328(1988). PubMed=3072709
[ 3] Benharroch D., Birnbaum D. "Biology of the fibroblast growth factor gene family." Isr. J. Med. Sci. 26:212-219(1990). PubMed=1693362
[ 4] Miyamoto M., Naruo K.-I., Seko C., Matsumoto S., Kondo T., Kurokawa T. "Molecular cloning of a novel cytokine cDNA encoding the ninth member of the fibroblast growth factor family, which has a unique secretion property." Mol.
Cell. Biol. 13:4251-4259(1993). PubMed=8321227
[ 5] Smallwood P.M., Munoz-Sanjuan I., Tong P., Macke J.P., Hendry S.H., Gilbert D.J., Copeland N.G., Jenkins N.A., Nathans J. "Fibroblast growth factor (FGF) homologous factors: new members of the FGF family implicated in nervous system development." Proc. Natl. Acad. Sci. U.S.A. 93:9850-9857(1996). PubMed=8790420
[ 6] Ayres M.D., Howard S.C., Kuzio J., Lopez-Ferber M., Possee R.D. "The complete DNA sequence of Autographa californica nuclear polyhedrosis virus." Virology 202:586-605(1994). PubMed=8030224

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00221}
{PS00248; NGF_1}
{PS50270; NGF_2}
{BEGIN}
****************************************************
* Nerve growth factor family signature and profile *
****************************************************

Nerve growth factor (NGF or beta-NGF) is a vertebrate protein that stimulates division and differentiation of sympathetic and embryonic sensory neurons [1,2]. NGF is a protein of about 120 residues that is cleaved from a larger precursor molecule. It contains six cysteines all involved in intrachain disulfide bonds. A schematic representation of the structure of NGF is shown below:

                            +------------------------+
                            |                        |
                            |    *******             |
xxxxxxCxxxxxxxxxxxxxxxxxxxxxCxxxxCxxxxxCxxxxxxxxxxxxxCxCxxxx
      |                          |     |               |
      +--------------------------|-----+               |
                                 +---------------------+

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

Some proteins have been found [3] to be structurally and functionally related to NGF, these are:

- Brain-derived neurotrophic factor (BDNF), a protein that promotes the survival of neuronal populations all located either in the central nervous system or directly connected to it.
- Neurotrophin-3 (NT-3), a protein that seems to promote the survival of visceral and proprioceptive sensory neurons.
- Neurotrophins-4/5 (NT-4/5), which elicit neurite outgrowth from explanted dorsal root ganglia and could play a role in oogenesis and/or early embryogenesis.
- Neurotrophin-6.
- Neurotrophin-7 from zebrafish.

The pattern we have developed for the NGF family spans the central region of these proteins and includes two of the six cysteines involved in disulfide bonds.

-Consensus pattern: [GSRE]-C-[KRL]-G-[LIVT]-[DE]-x(3)-[YW]-x-S-x-C
                    [The 2 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL, except for two
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Levi-Montalcini R. Science 237:1154-1162(1987).
[ 2] Bradshaw R.A., Blundell T.L., Lapatto R., McDonald N.Q., Murray-Rust J. "Nerve growth factor revisited." Trends Biochem. Sci. 18:48-52(1993). PubMed=8488558
[ 3] Lo D.C. "NGF takes shape." Curr. Biol. 2:67-69(1992). PubMed=15335999

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00222}
{PS00249; PDGF_1}
{PS50278; PDGF_2}
{BEGIN}
**********************************************************************
* Platelet-derived growth factor (PDGF) family signature and profile *
**********************************************************************

Platelet-derived growth factor (PDGF) [1,2] is a potent mitogen for cells of mesenchymal origin, including smooth muscle cells and glial cells. PDGF consists of two different but closely related chains (A and B chains) which assemble to form disulfide-linked homo- or heterodimers (A-A, B-B, and A-B). Alternate splicing of the A chain transcript can give rise to two different forms that differ only in their C-terminal extremity. The transforming protein of simian sarcoma virus (SSV), encoded by the v-sis oncogene, is derived from the B chain of PDGF.

PDGF is structurally related to a number of other growth factors which also form disulfide-linked homo- or heterodimers:

- Vascular endothelial growth factor (VEGF), also known as vascular permeability factor (VPF) [3], a growth factor active in angiogenesis and endothelial cell growth. The genome of the orf poxvirus encodes a homolog of VEGF [4].
- Vascular endothelial growth factor B (VEGF-B), also active in angiogenesis and endothelial cell growth [5].
- Vascular endothelial growth factor C (VEGF-C), also active in angiogenesis and endothelial cell growth [6].
- Placenta growth factor (PlGF) [7], which is also active in angiogenesis.

As a signature pattern for this family of growth factors, we selected a region that includes four of the eight cysteines conserved in the sequences of these proteins. In PDGF, these cysteines are known to be involved in intra- and inter-chain disulfide bonds [8]. We also developed a profile that spans the eight conserved cysteines.
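
The consensus patterns quoted throughout these entries follow the PROSITE pattern syntax: elements are separated by '-', 'x' stands for any residue, [..] lists the accepted residues, {..} lists the excluded residues, and (n) or (n,m) gives a repeat count. As a rough illustration only, the sketch below translates that subset of the syntax into a Python regular expression and applies it to the PDGF_1 pattern of this entry. The function name prosite_to_regex and the test sequence are made up for the example and are not part of PROSITE; the '<' and '>' terminus anchors are not handled.

```python
import re

# One pattern element, optionally followed by a repeat such as (3) or (1,3).
_ELEMENT = re.compile(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex string.

    Hypothetical helper for illustration; supports x, [..], {..},
    single residues and (n) / (n,m) repeats only.
    """
    pieces = []
    for element in pattern.strip().split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"empty or malformed element in {pattern!r}")
        core, low, high = match.groups()
        if core == "x":                      # any residue
            piece = "."
        elif core.startswith("[") and core.endswith("]"):
            piece = core                     # accepted residues
        elif core.startswith("{") and core.endswith("}"):
            piece = "[^" + core[1:-1] + "]"  # excluded residues
        elif len(core) == 1 and core.isupper():
            piece = core                     # one fixed residue
        else:
            raise ValueError(f"unsupported element {element!r}")
        if low is not None:                  # repeat count
            piece += "{" + low + ("," + high if high else "") + "}"
        pieces.append(piece)
    return "".join(pieces)


# PDGF_1 pattern as given in this entry.
PDGF_1 = "P-[PSR]-C-V-x(3)-R-C-[GSTA]-G-C-C"
pdgf_regex = prosite_to_regex(PDGF_1)

# Made-up test sequence, not a real protein.
hit = re.search(pdgf_regex, "AAPPCVEVKRCTGCCNN")
print(pdgf_regex)      # P[PSR]CV.{3}RC[GSTA]GCC
print(hit.group(0))    # PPCVEVKRCTGCC
```

A plain regular expression search reports only the first non-overlapping hit from each start position, which is enough for a quick check but is not a substitute for a dedicated PROSITE scanning tool.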
-Consensus pattern: P-[PSR]-C-V-x(3)-R-C-[GSTA]-G-C-C [The 4 C's are involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2004 / Pattern and text revised. [ 1] Hannink M., Donoghue D.J. "Structure and function of platelet-derived growth factor (PDGF) and related proteins." Biochim. Biophys. Acta 989:1-10(1989). PubMed=2546599 [ 2] Heldin C.-H. "Structural and functional studies on platelet-derived growth factor." EMBO J. 11:4251-4259(1992). PubMed=1425569 [ 3] Ferrara N., Houck K.A., Jakeman L.B., Winer J., Leung D.W. "The vascular endothelial growth factor family of polypeptides." J. Cell. Biochem. 47:211-218(1991). PubMed=1791185 [ 4] Lyttle D.J., Fraser K.M., Fleming S.B., Mercer A.A., Robinson A.J. "Homologs of vascular endothelial growth factor are encoded by the poxvirus orf virus." J. Virol. 68:84-92(1994). PubMed=8254780 [ 5] Olofsson B., Pajusola K., Kaipainen A., von Euler G., Joukov V., Saksela O., Orpana A., Pettersson R.F., Alitalo K., Eriksson U. Proc. Natl. Acad. Sci. U.S.A. 93:2567-2581(1996). [ 6] Joukov V., Pajusola K., Kaipainen A., Chilov D., Lahtinen I., Kukk E., Saksela O., Kalkkinen N., Alitalo K. "A novel vascular endothelial growth factor, VEGF-C, is a ligand for the Flt4 (VEGFR-3) and KDR (VEGFR-2) receptor tyrosine kinases." EMBO J. 15:290-298(1996). PubMed=8617204 [ 7] Maglione D., Guerriero V., Viglietto G., Ferraro M.G., Aprelikova O., Alitalo K., Del Vecchio S., Lei K.J., Chou J.Y., Persico M.G. "Two alternative mRNAs coding for the angiogenic factor, placenta growth factor (PlGF), are transcribed from a single gene of chromosome 14." Oncogene 8:925-931(1993). PubMed=7681160 [ 8] Oefner C., D'Arcy A., Winkler F.K., Eggimann B., Hosang M. "Crystal structure of human platelet-derived growth factor BB." 
EMBO J. 11:3921-3926(1992). PubMed=1396586

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00223}
{PS00250; TGF_BETA_1}
{PS51362; TGF_BETA_2}
{BEGIN}
*****************************************
* TGF-beta family signature and profile *
*****************************************

Transforming growth factor-beta (TGF-beta) [1] is a multifunctional peptide that controls proliferation, differentiation and other functions in many cell types. TGF-beta-1 is a peptide of 112 amino acid residues derived by proteolytic cleavage from the C-terminal of a precursor protein.

A number of proteins are known to be related to TGF-beta-1 [1,2,3]. They are listed below.

- Two other forms of TGF-beta have been found, they are known as TGF-beta-2 and TGF-beta-3.
- Mullerian inhibitory substance (MIS), produced by the testis, which is responsible for the regression of the Mullerian ducts in the male embryo.
- Inhibins, which inhibit the secretion of follitropin by the pituitary gland, and activins which have the reverse action. Inhibins are heterodimers of an alpha chain and a beta-A or a beta-B chain; activins are either homodimers of beta-A chains or heterodimers of a beta-A and a beta-B chain. All three chains are related to TGF-beta.
- Bone morphogenetic proteins [4] BMP-2, BMP-3 (osteogenin), BMP-3B (GDF-10), BMP-4 (BMP-2B), BMP-5, BMP-6 (VGR-1), BMP-7 (OP-1) and BMP-8 (OP-2) which induce cartilage and bone formation and which are probably involved in the control of the production of skeletal structures during development.
- Embryonic growth factor GDF-1, which may mediate cell differentiation events during embryonic development.
- Growth/development factor GDF-5 [5], a protein whose gene, when mutated in mice, is the cause of brachypodism, a defect which alters the length and numbers of bones in the limbs.
- Growth/development factor GDF-3, GDF-6, GDF-7, GDF-8 (myostatin) and GDF-9.
- Mouse protein nodal, which seems essential for mesoderm formation.
- Chicken dorsalin-1 (dsl-1) which may regulate cell differentiation within the neural tube.
- Xenopus vegetal hemisphere protein Vg1, which seems to induce the overlying animal pole cells to form mesodermal tissue.
- Drosophila decapentaplegic protein (DPP-C), which participates in the establishment of dorsal-ventral specification.
- Drosophila protein screw (scw) which also participates in the establishment of dorsal-ventral specification.
- Drosophila protein 60A.
- Caenorhabditis elegans larval development regulatory growth factor daf-7.
- Mammalian endometrial bleeding-associated factor (EBAF) (Lefty).
- Mammalian glial cell line-derived neurotrophic factor (GDNF), a distantly related member of this family which acts as neurotrophic factor for dopaminergic neurons of the substantia nigra.

Proteins from the TGF-beta family are only active as homo- or heterodimers; the two chains being linked by a single disulfide bond. From X-ray studies of TGF-beta-2 [6], it is known that all the other cysteines are involved in intrachain disulfide bonds. As shown in the following schematic representation, there are four disulfide bonds in the TGF-betas and in inhibin beta chains, while the other members of this family lack the first bond.

                                               interchain
                                                    |
         +------------------------------------------|+
         |                *******                   ||
xxxcxxxxxCcxxxxxxxxxxxxxxxxxxCxxCxxxxxxxxxxxxxxxxxxxCCxxxxxxxxxxxxxxxxxxxCxCx
   |      |                  |  |                                        | |
   +------+                  +--|----------------------------------------+ |
                                +------------------------------------------+

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

As a pattern to detect these proteins, we use a region which includes two of the conserved cysteines. We also developed a profile that covers all the conserved cysteines.

-Consensus pattern: [LIVM]-x(2)-P-x(2)-[FY]-x(4)-C-x-G-x-C
                    [The 2 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL, except for GDNF and neurturin.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: January 2008 / Text revised; profile added.

[ 1] Roberts A.B., Sporn M.B. (In) Peptide growth factors and their receptors, Handbook of Experimental Pharmacology, Vol. 95, pp419-475, Springer Verlag, Heidelberg, (1990).
[ 2] Burt D.W. Biochem. Biophys. Res. Commun. 184:590-595(1992).
[ 3] Burt D.W., Law A.S. "Evolution of the transforming growth factor-beta superfamily." Prog. Growth Factor Res. 5:99-118(1994). PubMed=8199356
[ 4] Kingsley D.M. "What do BMPs do in mammals? Clues from the mouse short-ear mutation." Trends Genet. 10:16-21(1994). PubMed=8146910
[ 5] Storm E.E., Huynh T.V., Copeland N.G., Jenkins N.A., Kingsley D.M., Lee S.-J. "Limb alterations in brachypodism mice due to mutations in a new member of the TGF beta-superfamily." Nature 368:639-643(1994). PubMed=8145850; DOI=10.1038/368639a0
[ 6] Daopin S., Piez K.A., Ogawa Y., Davies D.R. "Crystal structure of transforming growth factor-beta 2: an unusual fold for the superfamily." Science 257:369-373(1992). PubMed=1631557

+-----------------------------------------------------------------------+
PROSITE is copyright.
It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00224}
{PS00251; TNF_1}
{PS50049; TNF_2}
{BEGIN}
************************************
* TNF family signature and profile *
************************************

The following cytokines can be grouped into a family on the basis of sequence, functional, and structural similarities [1,2,3]:

- Tumor Necrosis Factor (TNF) (also known as cachectin or TNF-alpha) [4,5] is a cytokine which has a wide variety of functions. It can cause cytolysis of certain tumor cell lines; it is involved in the induction of cachexia; it is a potent pyrogen, causing fever by direct action or by stimulation of interleukin-1 secretion; finally, it can stimulate cell proliferation and induce cell differentiation under certain conditions.
- Lymphotoxin-alpha (LT-alpha) and lymphotoxin-beta (LT-beta), two related cytokines produced by lymphocytes and which are cytotoxic for a wide range of tumor cells in vitro and in vivo [6].
- T cell antigen gp39 (CD40L), a cytokine which seems to be important in B-cell development and activation.
- CD27L, a cytokine which plays a role in T-cell activation. It induces the proliferation of costimulated T cells and enhances the generation of cytolytic T cells.
- CD30L, a cytokine which induces proliferation of T cells.
- FASL, a cytokine involved in cell death [7].
- 4-1BBL, an inducible T cell surface molecule that contributes to T-cell stimulation.
- OX40L, a cytokine that co-stimulates T cell proliferation and cytokine production [8].
- TNF-related apoptosis inducing ligand (TRAIL), a cytokine that induces apoptosis [9].

TNF-alpha is synthesized as a type II membrane protein which then undergoes post-translational cleavage liberating the extracellular domain. CD27L, CD30L, CD40L, FASL, LT-beta, 4-1BBL and TRAIL also appear to be type II membrane proteins. LT-alpha is a secreted protein. All these cytokines seem to form homotrimeric (or heterotrimeric in the case of LT-alpha/beta) complexes that are recognized by their specific receptors.

As a signature for this family of proteins, we have selected the most conserved region. This region is located in a beta-strand in the central section of these proteins.

-Consensus pattern: [LV]-x-[LIVM]-{V}-x-{L}-G-[LIVMF]-Y-[LIVMFY](2)-x(2)-[QEKHL]-[LIVMGT]-x-[LIVMFY]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 4.
-Sequences known to belong to this class detected by the profile: ALL, except for OX40L.
-Other sequence(s) detected in Swiss-Prot: 2.
-Last update: December 2004 / Pattern and text revised.

[ 1] Peitsch M.C., Jongeneel C.V. "A 3-D model for the CD40 ligand predicts that it is a compact trimer similar to the tumor necrosis factors." Int. Immunol. 5:233-238(1993). PubMed=8095800
[ 2] Farrah T., Smith C.A. "Emerging cytokine family." Nature 358:26-26(1992). PubMed=1377364; DOI=10.1038/358026b0
[ 3] Bazan J.F. "Emerging families of cytokines and receptors." Curr. Biol. 3:603-606(1993). PubMed=15335677
[ 4] Beutler B., Cerami A. "The history, properties, and biological effects of cachectin." Biochemistry 27:7575-7582(1988). PubMed=3061461
[ 5] Vilcek J., Lee T.H. "Tumor necrosis factor. New insights into the molecular mechanisms of its multiple actions." J. Biol. Chem. 266:7313-7316(1991). PubMed=1850405
[ 6] Browning J.L., Ngam-ek A., Lawton P., DeMarinis J., Tizard R., Chow E.P., Hession C., O'Brine-Greco B., Foley S.F., Ware C.F.
"Lymphotoxin beta, a novel member of the TNF family that forms a heteromeric complex with lymphotoxin on the cell surface." Cell 72:847-856(1993). PubMed=7916655 [ 7] Suda T., Takahashi T., Golstein P., Nagata S. "Molecular cloning and expression of the Fas ligand, a novel member of the tumor necrosis factor family." Cell 75:1169-1178(1993). PubMed=7505205 [ 8] Baum P.R., Gayle R.B. III, Ramsdell F., Srinivasan S., Sorensen R.A., Watson M.L., Seldin M.F., Baker E., Sutherland G.R., Clifford K.N. "Molecular characterization of murine and human OX40/OX40 ligand systems: identification of a human OX40 ligand as the HTLV-1regulated protein gp34." EMBO J. 13:3992-4001(1994). PubMed=8076595 [ 9] Wiley S.R., Schooley K., Smolak P.J., Din W.S., Huang C.-P., Nicholl J.K., Sutherland G.R., Smith T.D., Rauch C., Smith C.A. "Identification and characterization of a new member of the TNF family that induces apoptosis." Immunity 3:673-682(1995). PubMed=8777713 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00225} {PS00252; INTERFERON_A_B_D} {BEGIN} ***************************************************** * Interferon alpha, beta and delta family signature * ***************************************************** Interferons [1] are proteins which produce antiviral and antiproliferative responses in cells. On the basis of their sequence interferons are classified into five groups: alpha, alpha-II (or omega), beta, delta (or trophoblast) [2] and gamma. 
Except for gamma-interferon, the sequences of all the others are related. Once the signal peptide has been removed, these interferons are mature proteins of 160 to 170 residues. A disulfide bond is one of the best conserved structural features of the proteins belonging to this family. The signature pattern for this family of proteins includes one of the cysteines involved in this disulfide bond. -Consensus pattern: [FYH]-[FY]-x-[GNRCDS]-[LIVM]-x(2)-[FYL]-L-x(7)-[CY]-[AT]-W [The second C is involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: Except for mouse interferon-beta, all proteins from this family have a cysteine in position 10 of the pattern. -Expert(s) to contact by email: Rubinstein M.; [email protected] -Last update: December 2004 / Pattern and text revised. [ 1] Interferons and other regulated cytokines. (In) de Maeyer E., de Maeyer-Guignard J., Eds., Wiley and Sons, New-York, (1988). [ 2] Roberts R.M., Cross J.C., Leaman D.W. "Interferons as hormones of pregnancy." Endocrinol. Rev. 13:432-452(1992). PubMed=1385108; +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00226} {PS00253; INTERLEUKIN_1} {BEGIN} *************************** * Interleukin-1 signature * *************************** Interleukin-1 (IL-1) [1,2,3] is a member of a family of cellular mediators known as cytokines. IL-1 has many biological activities. 
Among other functions, it is a fever-producing factor (pyrogen), induces prostaglandin synthesis, is involved in T-lymphocyte activation and proliferation as well as in B-lymphocyte activation and proliferation via interleukin-2. There are two different forms of IL-1: IL-1 alpha and IL-1 beta, whose sequences are about 25% identical. IL-1 alpha and beta bind to the same receptor. Both forms of IL-1 are synthesized as precursor proteins of about 270 residues which are then post-translationally processed by the cleavage of an N-terminal sequence of approximately 115 residues. The interleukin-1 receptor antagonist (IL-1ra) is a protein structurally related to IL-1's but whose biological function is not yet known. As a signature pattern for these cytokines, we selected a conserved region in the C-terminal section. -Consensus pattern: [FC]-x-S-[ASLV]-x(2)-P-x(2)-[FYLIV]-[LI]-[SCA]-T-x(7)-[LIVM] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: November 1997 / Text revised. [ 1] Dinarello C.A. "Biology of interleukin 1." FASEB J. 2:108-115(1988). PubMed=3277884 [ 2] Mizel S.B. "The interleukins." FASEB J. 3:2379-2388(1989). PubMed=2676681 [ 3] Hughes A.L. "Evolution of the interleukin-1 gene family in mammals." J. Mol. Evol. 39:6-12(1994). PubMed=8064874 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00227} {PS00254; INTERLEUKIN_6} {BEGIN} ************************************************ * Interleukin-6 / G-CSF / MGF family signature * ************************************************ The following cytokines can be grouped into a single family on the basis of sequence similarities. - Interleukin-6 (IL-6) [1] (also known as B-cell stimulatory factor 2 (BSF-2) or interferon beta-2) is a cytokine that has a wide variety of biological functions. It plays an essential role in the final differentiation of B-cells into IG-secreting cells, as well as inducing myeloma/plasmacytoma growth, nerve cell differentiation and, in hepatocytes, acute phase reactants. - Granulocyte colony-stimulating factor (G-CSF) [2] belongs to the cytokine family whose members regulate hematopoietic cell proliferation and differentiation. G-CSF exclusively stimulates the colony formation of granulocytes. - Myelomonocytic growth factor (MGF) [3] is an avian hematopoietic growth factor that stimulates the proliferation and colony formation of normal and transformed avian cells of the myeloid lineage. These cytokines are glycoproteins of about 170 to 180 amino acid residues that contain four conserved cysteine residues involved in two disulfide bonds [4], as shown in the following schematic representation.

           +--+     +---+
           |  |     |   |
xxxxxxxxxxxCxxCxxxxxCxxxCxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
                    **********

'C': conserved cysteine involved in a disulfide bond. '*': position of the pattern. The pattern developed for this family of proteins includes the last two cysteines as well as other conserved residues. -Consensus pattern: C-x(9)-C-x(6)-G-L-x(2)-[FY]-x(3)-L [The 2 C's are involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. 
-Note: It has been said [5] that this family can be extended by the adjunction of LIF and OSM (see the relevant entry <PDOC00509>) but, while all these cytokines seem to be structurally related, the sequence similarity is not high enough to allow the use of only one single consensus pattern. -Expert(s) to contact by email: Rose-John S.; [email protected] -Last update: May 2004 / Text revised. [ 1] Kishimoto T., Hirano T. "A new interleukin with pleiotropic activities." BioEssays 9:11-15(1988). PubMed=3063260 [ 2] Metcalf D. "The granulocyte-macrophage colony-stimulating factors." Science 229:16-22(1985). PubMed=2990035 [ 3] Leutz A., Damm K., Sterneck E., Kowenz E., Ness S., Frank R., Gausepohl H., Pan Y.-C.E., Smart J., Hayman M., Graf T. "Molecular cloning of the chicken myelomonocytic growth factor (cMGF) reveals relationship to interleukin 6 and granulocyte colony stimulating factor." EMBO J. 8:175-181(1989). PubMed=2785450 [ 4] Clogston C.L., Boone T.C., Crandall B.C., Mendiaz E.A., Lu H.S. "Disulfide structures of human interleukin-6 are similar to those of human granulocyte colony stimulating factor." Arch. Biochem. Biophys. 272:144-151(1989). PubMed=2472117 [ 5] Rose T.M., Bruce A.G. "Oncostatin M is a member of a cytokine family that includes leukemia-inhibitory factor, granulocyte colony-stimulating factor, and interleukin 6." Proc. Natl. Acad. Sci. U.S.A. 88:8641-8645(1991). PubMed=1717982 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00228} {PS00255; INTERLEUKIN_7_9} {BEGIN} ********************************** * Interleukin-7 and -9 signature * ********************************** Interleukin-7 (IL-7) [1] is a cytokine that serves as a growth factor for early lymphoid cells of both B- and T-cell lineages. Interleukin-9 (IL-9) [2] is a cytokine that supports IL-2 independent and IL-4 independent growth of helper T-cells. Interleukin-7 and -9 seem to be evolutionarily related [3]. As a signature pattern, we selected a conserved region located in the C-terminal section of the protein. -Consensus pattern: N-x-[LAP]-[SCT]-F-L-K-x-L-L -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Boulay J.-L.; [email protected] -Last update: July 1998 / Pattern and text revised. [ 1] Henney C.S. "Interleukin 7: effects on early events in lymphopoiesis." Immunol. Today. 10:170-173(1989). PubMed=2663018 [ 2] Renauld J.C., Goethals A., Houssiau F., Merz H., Van Roost E., Van Snick J. "Human P40/IL-9. Expression in activated CD4+ T cells, genomic organization, and comparison with the mouse gene." J. Immunol. 144:4235-4241(1990). PubMed=1971295 [ 3] Boulay J.-L., Paul W.E. "Hematopoietin sub-family classification based on size, gene organization and sequence homology." Curr. Biol. 3:573-581(1993). PubMed=15335670 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00229} {PS00256; AKH} {BEGIN} ***************************************** * Adipokinetic hormone family signature * ***************************************** Adipokinetic hormones (AKH) [1,2] are small active peptides produced by some insect species. They bring on the release of diglycerides from the fat body and then stimulate the flight muscles to use them as an energy source. Other types of active peptides structurally related to AKH are: - Hypertrehalosaemic factors (HTF), which are neuropeptides that elevate the level of trehalose in the hemolymph of some insects. - Red pigment concentrating hormone (RPCH), a peptide that stimulates pigment concentration in prawn and crab erythrophores. These peptides are eight to ten amino acid residues long. The signature pattern to detect them is based on the sequence of the first eight residues, which are common to all these peptides. -Consensus pattern: Q-[LV]-[NT]-[FY]-[ST]-x(2)-W -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 3. -Last update: November 1997 / Text revised. [ 1] Schaffer M.H., Noyes B.E., Slaughter C.A., Thorne G.C., Gaskell S.J. "The fruitfly Drosophila melanogaster contains a novel charged adipokinetic-hormone-family peptide." Biochem. J. 269:315-320(1990). PubMed=2117437 [ 2] Gade G., Hilbich C., Beyreuther K., Rinehart K.L. Jr. "Sequence analyses of two neuropeptides of the AKH/RPCH-family from the lubber grasshopper, Romalea microptera." Peptides 9:681-688(1988). PubMed=3226948 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. 
For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00230} {PS00257; BOMBESIN} {BEGIN} ******************************************* * Bombesin-like peptides family signature * ******************************************* Bombesin-like peptides comprise a large family of peptides which were initially isolated from amphibian skin, where they stimulate smooth muscle contraction. They were later found to be widely distributed in mammalian neural and endocrine cells. The amphibian peptides which belong to this family are currently classified into three subfamilies [1,2]: - The Bombesin group, which includes bombesin and alytesin. - The Ranatensin group, which includes ranatensins, litorin, and Rohdei litorin. - The Phyllolitorin group, which includes Leu(8)- and Phe(8)-phyllolitorins. In mammals and birds two categories of bombesin-like peptides are known [3,4]: - Gastrin-releasing peptide (GRP), which stimulates the release of gastrin as well as other gastrointestinal hormones. - Neuromedin B (NMB), a neuropeptide whose function is not yet clear. Bombesin-like peptides, like many other active peptides, are synthesized as larger protein precursors that are enzymatically converted to their mature forms. The final peptides are eight to fourteen residues long. As a signature pattern, we have chosen the last seven residues in the C-terminal, which are conserved and are essential for the biological activity. -Consensus pattern: W-A-x-G-[SH]-[LF]-M -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: In positions 5 to 6 of the pattern, bombesins and GRP have His-Leu, NMB and ranatensins have His-Phe, and phyllolitorins have Ser-(Leu or Phe). -Last update: January 1989 / First entry. 
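The consensus patterns in these entries are written in the PROSITE pattern syntax: elements are separated by '-', 'x' stands for any residue, '[...]' lists the residues accepted at a position, '{...}' lists the residues excluded, and '(n)' or '(n,m)' repeats the preceding element. The following is a minimal sketch, not part of PROSITE itself, of how such a pattern could be translated into a Python regular expression. The function name prosite_to_regex and the test peptide are hypothetical, and the '<' and '>' terminus anchors are deliberately not handled.

```python
import re

# One pattern element: x, a single residue, [allowed] or {excluded},
# optionally followed by a repeat count (n) or (n,m).
_ELEMENT = re.compile(
    r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern into a Python regex string.

    Illustrative helper (hypothetical name); terminus anchors
    '<' and '>' are not supported and raise ValueError.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            atom = "."                       # any residue
        elif core.startswith("{"):
            atom = "[^" + core[1:-1] + "]"   # excluded residues
        else:
            atom = core                      # literal residue or [class]
        if low is not None:
            atom += "{" + low + ("," + high if high else "") + "}"
        parts.append(atom)
    return "".join(parts)


# Bombesin-like peptides signature from the entry above.
bombesin = prosite_to_regex("W-A-x-G-[SH]-[LF]-M")
print(bombesin)  # WA.G[SH][LF]M

# Made-up peptide used only to exercise the regex.
print(bool(re.search(bombesin, "AAAWAVGHLM")))
```

A scan built this way reports only where the regular expression matches; it says nothing about the false-positive counts quoted in the entries, which refer to Swiss-Prot at the time of each update.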
[ 1] Erspamer V., Falconieri Erspamer G., Mazzanti G., Endean R. Comp. Biochem. Physiol. 77C:99-108(1984). [ 2] Erspamer V., Melchiorri P., Falconieri Erspamer G., Montecucchi P.C., de Castiglione R. Peptides 6 Suppl. 3:7-12(1985). [ 3] Spindel E.R. Trends Neurosci. 9:130-133(1986). [ 4] Krane I.M., Naylor S.L., Helin-Davis D., Chin W.W., Spindel E.R. "Molecular cloning of cDNAs encoding the human bombesin-like peptide neuromedin B. Chromosomal localization and comparison to cDNAs encoding its amphibian homolog ranatensin." J. Biol. Chem. 263:13317-13323(1988). PubMed=2458345 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00231} {PS00258; CALCITONIN} {BEGIN} ********************************************* * Calcitonin / CGRP / IAPP family signature * ********************************************* Calcitonin [1] is a 32 amino acid polypeptide hormone that causes a rapid but short-lived drop in the level of calcium and phosphate in the blood, by promoting the incorporation of these ions in the bones. Alternative splicing of the gene coding for calcitonin produces a distantly related peptide of 37 amino acids, called calcitonin gene-related peptide (CGRP). CGRP induces vasodilatation in a variety of vessels, including the coronary, cerebral and systemic vasculature. Its abundance in the CNS also points toward a neurotransmitter or neuromodulator role. 
Islet amyloid polypeptide (IAPP) [2] (also known as diabetes-associated peptide (DAP), or amylin) is a peptide of 37 amino acids that selectively inhibits insulin-stimulated glucose utilization and glycogen deposition in muscle, while not affecting adipocyte glucose metabolism. Structurally, IAPP is closely related to CGRP. Two conserved cysteines in the N-terminal of these peptides are known to be involved in a disulfide bond. The C-terminal residue of all three peptides is amidated.

 ****************
xCxxxxxCxxxxxxxxxxxxxxxxxxxxxxxxxxxx-NH(2)
 |     |                             Amide group
 +-----+

'C': conserved cysteine involved in a disulfide bond. '*': position of the pattern. The pattern we developed for this family of peptides includes the region of the disulfide bond. -Consensus pattern: C-[SAGDN]-[STN]-x(0,1)-[SA]-T-C-[VMA]-x(3)-[LYF]-x(3)-[LYF] [The 2 C's are linked by a disulfide bond] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: May 2004 / Text revised. [ 1] Breimer L.H., McIntyre I., Zaidi M. "Peptides from the calcitonin genes: molecular genetics, structure and function." Biochem. J. 255:377-390(1988). PubMed=3060108 [ 2] Nishi M., Sanke T., Nagamatsu S., Bell G.I., Steiner D.F. "Islet amyloid polypeptide. A new beta cell secretory product related to islet amyloid deposits." J. Biol. Chem. 265:4173-4176(1990). PubMed=2407732 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00232} {PS00259; GASTRIN} {BEGIN} ********************************************** * Gastrin / cholecystokinin family signature * ********************************************** Gastrin and cholecystokinin (CCK) [1,2] are structurally and functionally related peptide hormones that function as hormonal regulators of various digestive processes and feeding behaviors. They are known to induce gastric secretion, stimulate pancreatic secretion, increase blood circulation and water secretion in the stomach and intestine, and stimulate smooth muscle contraction. Originally found in the gut, these hormones have since been shown to be present in various parts of the nervous system. Like many other active peptides they are synthesized as larger protein precursors that are enzymatically converted to their mature forms. They are found in several molecular forms due to tissue-specific post-translational processing. A number of other peptides are known to belong to the same family: - Caerulein [3], an amphibian skin peptide, with a biological activity similar to that of CCK or gastrin. There are different types of caerulein precursors [4] in which a single or up to four copies of the peptide are present. - Leukosulfakinin I and II (LSK) [5,6] are peptides, isolated from cockroach, that change the frequency and amplitude of contractions of the hindgut. - Drosulfakinins I and II [7] are putative CCK-homologs from Drosophila. Those two peptides are part of a precursor sequence that was isolated using a probe based on the sequence of CCK and LSK. - A chicken antrum peptide [8] which is a potent stimulus of avian gastric acid but not of pancreatic secretion. - Cionin [9], a neuropeptide from the protochordate Ciona intestinalis. The biological activity of gastrin and CCK is associated with the last five C-terminal residues. One or two positions downstream, there is a conserved sulfated tyrosine residue. 
The signature pattern developed for this family of peptides includes the biologically active C-terminal sequence as well as the sulfated tyrosine. -Consensus pattern: Y-x(0,1)-[GD]-[WH]-M-[DR]-F [Y is sulfated] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: The residues in positions 4 and 6 of the pattern are respectively Trp and Asp in vertebrate peptides, and His and Arg in insect peptides. -Last update: April 1990 / Text revised. [ 1] Concise Encyclopedia Biochemistry, Second Edition, Walter de Gruyter, Berlin New-York (1988). [ 2] Cholecystokinin. Ann. N.Y. Acad. Sci. 448(1985). [ 3] Erspamer V., Falconieri Erspamer G., Mazzanti G., Endean R. Comp. Biochem. Physiol. 77C:99-108(1984). [ 4] Richter K., Egger R., Kreil G. "Sequence of preprocaerulein cDNAs cloned from skin of Xenopus laevis. A small family of precursors containing one, three, or four copies of the final product." J. Biol. Chem. 261:3676-3680(1986). PubMed=3753978 [ 5] Nachman R.J., Holman G.M., Haddon W.F., Ling N. "Leucosulfakinin, a sulfated insect neuropeptide with homology to gastrin and cholecystokinin." Science 234:71-73(1986). PubMed=3749893 [ 6] Nachman R.J., Holman G.M., Cook B.J., Haddon W.F., Ling N. "Leucosulfakinin-II, a blocked sulfated insect neuropeptide with homology to cholecystokinin and gastrin." Biochem. Biophys. Res. Commun. 140:357-364(1986). PubMed=3778455 [ 7] Nichols R., Schneuwly S.A., Dixon J.E. "Identification and characterization of a Drosophila homologue to the vertebrate neuropeptide cholecystokinin." J. Biol. Chem. 263:12167-12170(1988). PubMed=2842322 [ 8] Dimaline R., Young J., Gregory H. "Isolation from chicken antrum, and primary amino acid sequence of a novel 36-residue peptide of the gastrin/CCK family." FEBS Lett. 205:318-322(1986). PubMed=3743781 [ 9] Johnsen A.H., Rehfeld J.F. 
"Cionin: a disulfotyrosyl hybrid of cholecystokinin and gastrin from the neural ganglion of the protochordate Ciona intestinalis." J. Biol. Chem. 265:3054-3058(1990). PubMed=2303439 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00233} {PS00260; GLUCAGON} {BEGIN} **************************************************** * Glucagon / GIP / secretin / VIP family signature * **************************************************** A number of polypeptidic hormones, mainly expressed in the intestine or the pancreas, belong to a group of structurally related peptides [1,2]. Members of this family are: - Glucagon, which promotes hydrolysis of glycogen and lipids, and raises the blood sugar level. - Glucagon-like peptide 1 (GLP-1), a peptide of unknown function processed from the same precursor protein as that of glucagon. - Glucagon-like peptide 2 (GLP-2), a peptide of unknown function also processed from the glucagon precursor protein but which, in contrast to GLP-1, is only found in mammals. - Gastric inhibitory polypeptide (GIP), which is a potent stimulator of insulin secretion and a relatively poor inhibitor of gastric acid secretion. - Secretin, which stimulates formation of NaHCO(3)-rich pancreatic juice and secretion of NaHCO(3)-rich bile as well as inhibiting HCl production by the stomach. 
- Vasoactive intestinal peptide (VIP), which causes vasodilatation, lowers arterial blood pressure, stimulates myocardial contractility, increases glycogenolysis and relaxes some smooth muscles. - Peptide PHI-27, a vasodilator peptide which is coded by the same precursor protein as that of VIP. - Growth hormone-releasing factor (GRF) (also known as somatoliberin), which is released by the hypothalamus and acts on the adenohypophyse to stimulate the secretion of growth hormone. - Pituitary adenylate cyclase activating polypeptide (PACAP) [3]. - Helospectin (exendin-1), helodermin (exendin-2), exendin-3, and exendin-4 from the venom of gila monsters. The exendins are peptides with a VIP/secretin biological activity [4]. - A peptide produced by the X-cells of the islets of ratfish pancreas [5]. As a pattern for this family of peptides (which are from 30 to 45 amino acid residues long), we used the more or less conserved first ten positions of the N-terminal as well as a conserved hydrophobic residue in position 23. -Consensus pattern: [YH]-[STAIVGD]-[DEQ]-[AGF]-[LIVMSTE]-[FYLR]-x-[DENSTAK]-[DENSTA]-[LIVMFYG]-x(8)-{K}-[KREQL]-[KRDENQL]-[LVFYWG]-[LIVQ] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: 1. -Last update: December 2004 / Pattern and text revised. [ 1] Mutt V. "Vasoactive intestinal polypeptide and related peptides. Isolation and chemistry." Ann. N.Y. Acad. Sci. 527:1-19(1988). PubMed=3133967 [ 2] Bataille D., Blache P., Mercier F., Jarrousse C., Kervran A., Dufour M., Mangeat P., Dubrasquet M., Mallat A., Lotersztajn S., Pavoine C., Pecker F. Ann. N.Y. Acad. Sci. 527:169-185(1988). [ 3] Miyata A., Arimura A., Dahl R.R., Minamino N., Uehara A., Jiang L., Culler M.D., Coy D.H. "Isolation of a novel 38 residue-hypothalamic polypeptide which stimulates adenylate cyclase in pituitary cells." Biochem. Biophys. Res. Commun. 164:567-574(1989). 
PubMed=2803320 [ 4] Eng J., Kleinman W.A., Singh L., Singh G., Raufman J.-P. "Isolation and characterization of exendin-4, an exendin-3 analogue, from Heloderma suspectum venom. Further evidence for an exendin receptor on dispersed acini from guinea pig pancreas." J. Biol. Chem. 267:7402-7405(1992). PubMed=1313797 [ 5] Conlon J.M., Dafgard E., Falkmer S., Thim L. "A glucagon-like peptide, structurally related to mammalian oxyntomodulin, from the pancreas of a holocephalan fish, Hydrolagus colliei." Biochem. J. 245:851-855(1987). PubMed=3311036 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00234} {PS00261; GLYCO_HORMONE_BETA_1} {PS00689; GLYCO_HORMONE_BETA_2} {BEGIN} *********************************************** * Glycoprotein hormones beta chain signatures * *********************************************** Glycoprotein hormones [1,2] (or gonadotropins) are a family of proteins which include the mammalian hormones follitropin (FSH), lutropin (LSH), thyrotropin (TSH) and chorionic gonadotropin (CG), as well as at least two forms of fish gonadotropins. All these hormones consist of two glycosylated chains (alpha and beta). In mammalian gonadotropins, the alpha chain is identical in the four types of hormones but the beta chains, while homologous, are different. The beta chains are proteins of about 100 to 140 amino acid residues which contain twelve conserved cysteines all involved in disulfide bonds [3], as shown in the following schematic representation. 

   +----------------------+
   |         +------------|-----------------------------+
   |       +-|------------|--------+                    |
   |       | |  ****      |        |      ***************
xxxCxxxxxxxCxCxxCxCxxxxxxxCxxxxxxxxCxxxxxxxCxCxCxxCxxxxxCxxxxxxxxxxx
                | |                        | | |  |
                | |                        | | +--+
                +-|------------------------+ |
                  +--------------------------+

'C': conserved cysteine involved in a disulfide bond. '*': position of the patterns. We have developed two patterns for these hormones. The first one, located in the N-terminal section, is a region which has been said to be involved in the association between the two chains of the hormones. The second pattern consists of a cluster of five conserved cysteines in the C-terminal section. -Consensus pattern: C-[STAGM]-G-[HFYL]-C-x-[ST] [The 2 C's are involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL, except for rat beta-FSH which has Glu in position 2 of the pattern. -Other sequence(s) detected in Swiss-Prot: 2. -Consensus pattern: [PA]-V-A-x(2)-C-x-C-x(2)-C-x(4)-[STDAI]-[DEY]-C-x(6,8)-[PGSTAVMI]-x(2)-C [The 5 C's are involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL, except for 5 sequences. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Lapthorn A.; [email protected] -Last update: April 2006 / Pattern revised. [ 1] Pierce J.G., Parsons T.F. "Glycoprotein hormones: structure and function." Annu. Rev. Biochem. 50:465-495(1981). PubMed=6267989 [ 2] Stockell Hartree A., Renwick A.G.C. Biochem. J. 287:665-679(1992). [ 3] Lapthorn A.J., Harris D.C., Littlejohn A., Lustbader J.W., Canfield R.E., Machin K.J., Morgan F.J., Isaacs N.W. "Crystal structure of human chorionic gonadotropin." Nature 369:455-461(1994). PubMed=8202136; DOI=10.1038/369455a0 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). 
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00235} {PS00262; INSULIN} {BEGIN} **************************** * Insulin family signature * **************************** The insulin family of proteins [1] groups a number of active peptides which are evolutionarily related. This family currently consists of: - Insulin. - Relaxin. - Insulin-like growth factors I and II (IGFs or somatomedins) [2]. - Mammalian Leydig cell-specific insulin-like peptide (Ley-I-L) (gene INSL3) [3]. - Mammalian early placenta insulin-like peptide (ELIP) (gene INSL4) [4]. - Insulin-like peptide 5 (gene INSL5). - Insect prothoracicotropic hormone (PTTH) (bombyxin) [5]. - Locust insulin-related peptide (LIRP) [6]. - Molluscan insulin-related peptides 1 to 5 (MIP) [7]. - Caenorhabditis elegans insulin-like peptides [8]. Structurally, all these peptides consist of two polypeptide chains (A and B) linked by two disulfide bonds.

B chain   xxxxxxCxxxxxxxxxxxxCxxxxxxxxx
                |            |
A chain   xxxxxCCxxxCxxxxxxxxCx
               ***************
               |    |
               +----+

'C': conserved cysteine involved in a disulfide bond. '*': position of the pattern. As shown in the schematic representation above, they all share a conserved arrangement of four cysteines in their A chain. The first of these cysteines is linked by a disulfide bond to the third one and the second and fourth cysteines are linked by interchain disulfide bonds to cysteines in the B chain. As a pattern for this family of proteins, we have used the region which includes the four conserved cysteines in the A chain. 
-Consensus pattern: C-C-{P}-{P}-x-C-[STDNEKPI]-x(3)-[LIVMFS]-x(3)-C [The 4 C's are involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL, except for what was thought to be a sponge insulin [9], but which originates from an unidentified rodent and which contains sequencing errors and lacks the first cysteine of the A chain. -Other sequence(s) detected in Swiss-Prot: 1. -Last update: December 2004 / Pattern and text revised. [ 1] Blundell T.L., Humbel R.E. "Hormone families: pancreatic hormones and homologous growth factors." Nature 287:781-787(1980). PubMed=6107857 [ 2] Humbel R.E. "Insulin-like growth factors I and II." Eur. J. Biochem. 190:445-462(1990). PubMed=2197088 [ 3] Adham I.M., Burkhardt E., Benahmed M., Engel W. "Cloning of a cDNA for a novel insulin-like peptide of the testicular Leydig cells." J. Biol. Chem. 268:26668-26672(1993). PubMed=8253799 [ 4] Chassin D., Laurent A., Janneau J.-L., Berger R., Bellet D. "Cloning of a new member of the insulin gene superfamily (INSL4) expressed in human placenta." Genomics 29:465-470(1995). PubMed=8666396 [ 5] Nagasawa H., Kataoka H., Isogai A., Tamura S., Suzuki A., Mizoguchi A., Fujiwara Y., Suzuki A., Takahashi S.Y., Ishizaki H. Proc. Natl. Acad. Sci. U.S.A. 83:5480-5483(1986). [ 6] Lagueux M., Lwoff L., Meister M., Goltzene F., Hoffmann J.A. "cDNAs from neurosecretory cells of brains of Locusta migratoria (Insecta, Orthoptera) encoding a novel member of the superfamily of insulins." Eur. J. Biochem. 187:249-254(1990). PubMed=1688797 [ 7] Smit A.B., Geraerts W.P.M., Meester I., van Heerikhuizen H., Joosse J. "Characterization of a cDNA clone encoding molluscan insulin-related peptide II of Lymnaea stagnalis." Eur. J. Biochem. 199:699-703(1991). PubMed=1868853 [ 8] Duret L., Guex N., Peitsch M.C., Bairoch A. "New insulin-like proteins with atypical disulfide bond pattern characterized in Caenorhabditis elegans by comparative sequence analysis and homology modeling." 
     Genome Res. 8:348-353(1998).
     PubMed=9548970
[ 9] Robitzki A., Schroder H.C., Ugarkovic D., Pfeifer K., Uhlenbruck G., Muller W.E.G.
     "Demonstration of an endocrine signaling circuit for insulin in the sponge Geodia cydonium."
     EMBO J. 8:2905-2909(1989).
     PubMed=2531072
{END}
{PDOC00236}
{PS00263; NATRIURETIC_PEPTIDE}
{BEGIN}
**********************************
* Natriuretic peptides signature *
**********************************

Atrial natriuretic peptides (ANP) [1,2,3] are vertebrate hormones important in the overall control of cardiovascular homeostasis and of sodium and water balance in general. There are various ANP peptides which vary in length but which have a common central core; they are all processed from a single precursor. There is a disulfide bond which is important for the expression of the biological activity.

The ANP protein family includes four additional structurally related peptides which elicit a pharmacological spectrum similar to ANP:

 - Brain natriuretic peptide (BNP).
 - C-type natriuretic peptide (CNP).
 - Ventricular natriuretic peptide (VNP) [4].
 - Green mamba natriuretic peptide (DNP) [5].

The signature developed for the ANP family includes the two cysteines involved in the disulfide bond and two conserved glycines which may be important for the conformation of the peptides.
-Consensus pattern: C-F-G-x(3)-[DEA]-[RH]-I-x(3)-S-x(2)-G-C
                    [The 2 C's are linked by a disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Inagami T.
     "Atrial natriuretic factor."
     J. Biol. Chem. 264:3043-3046(1989).
     PubMed=2536732
[ 2] Sagnella G.A., McGregor G.A.
     Trends Biochem. Sci. 11:299-302(1986).
[ 3] Rosenzweig A., Seidman C.E.
     "Atrial natriuretic factor and related peptide hormones."
     Annu. Rev. Biochem. 60:229-255(1991).
     PubMed=1652921; DOI=10.1146/annurev.bi.60.070191.001305
[ 4] Takei Y., Takahashi A., Watanabe T.X., Nakajima K., Sakakibara S.
     "A novel natriuretic peptide isolated from eel cardiac ventricles."
     FEBS Lett. 282:317-320(1991).
     PubMed=1828035
[ 5] Schweitz H., Vigne P., Moinier D., Frelin C., Lazdunski M.
     "A new member of the natriuretic peptide family is present in the venom of the green mamba (Dendroaspis angusticeps)."
     J. Biol. Chem. 267:13928-13932(1992).
     PubMed=1352773
{END}
{PDOC00237}
{PS00264; NEUROHYPOPHYS_HORM}
{BEGIN}
***************************************
* Neurohypophysial hormones signature *
***************************************

Oxytocin (or ocytocin) and vasopressin [1] are small (nine amino acid residues), structurally and functionally related neurohypophysial peptide hormones. Oxytocin causes contraction of the smooth muscle of the uterus and of the mammary gland while vasopressin has a direct antidiuretic action on the kidney and also causes vasoconstriction of the peripheral vessels. Like the majority of active peptides, both hormones are synthesized as larger protein precursors that are enzymatically converted to their mature forms.

Peptides belonging to this family are also found in birds, fish, reptiles and amphibians (mesotocin, isotocin, valitocin, glumitocin, aspargtocin, vasotocin, seritocin, asvatocin, phasvatocin), in worms (annetocin), octopi (cephalotocin), locust (locupressin or neuropeptide F1/F2) and in molluscs (conopressins G and S) [2].

The pattern developed to detect this category of peptides spans their entire sequence and includes four invariant amino acid residues.

-Consensus pattern: C-[LIFY]-[LIFYV]-x-N-[CS]-P-x-G
                    [The 2 C's are linked by a disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Acher R., Chauvet J.
     "Structure, processing and evolution of the neurohypophysial hormone-neurophysin precursors."
     Biochimie 70:1197-1207(1988).
     PubMed=3147712
[ 2] Chauvet J., Michel G., Ouedraogo Y., Chou J., Chait B.T., Acher R.
     "A new neurohypophysial peptide, seritocin ([Ser5,Ile8]-oxytocin), identified in a dryness-resistant African toad, Bufo regularis."
     Int. J. Pept. Protein Res. 45:482-487(1995).
     PubMed=7591488
{END}
{PDOC00238}
{PS00265; PANCREATIC_HORMONE_1}
{PS50276; PANCREATIC_HORMONE_2}
{BEGIN}
***************************************************
* Pancreatic hormone family signature and profile *
***************************************************

Pancreatic hormone (PP) [1] is a peptide synthesized in pancreatic islets of Langerhans, which acts as a regulator of pancreatic and gastrointestinal functions.

A number of other active peptides are homologous to pancreatic hormone:

 - Neuropeptide Y (NPY) [2], one of the most abundant peptides in the mammalian nervous system. NPY is implicated in the control of feeding and the secretion of the gonadotrophin-releasing hormone.
 - Peptide YY (PYY) [3]. PYY is a gut peptide that inhibits exocrine pancreatic secretion, has a vasoconstrictory action and inhibits jejunal and colonic motility.
 - Various NPY and PYY-like polypeptides from fish and amphibians [4,5].
 - Neuropeptide F (NPF) from invertebrates such as worms and snails [6].
 - Skin peptide Tyr-Tyr (SPYY) from the frog Phyllomedusa bicolor. SPYY shows a broad spectrum of antibacterial and antifungal activity.

All these peptides are 36 to 39 amino acids long. Like most active peptides, their C-terminus is amidated and they are synthesized as larger protein precursors. The signature for these peptides is based on the last 17 C-terminal residues, where three positions are completely conserved. A profile was also developed that spans the whole peptide.

-Consensus pattern: [FY]-x(3)-[LIVM]-x(2)-Y-x(3)-[LIVMFY]-x-R-x-R-[YF]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 1.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
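
As an illustration only (this sketch is not part of PROSITE), the pattern syntax used in this entry can be translated mechanically into a regular expression. The Python below assumes the usual PROSITE conventions: elements separated by '-', 'x' for any residue, [..] for allowed residues, {..} for forbidden residues, (n) or (n,m) for repeats, and '<' or '>' for the sequence ends. The function name prosite_to_regex is hypothetical, and the two test strings are made up to match or miss the pattern; they are not real peptides.

```python
import re

# One element: optional '<', then x, a residue, [allowed] or {forbidden},
# an optional (n) or (n,m) repeat, and optional '>'.
_ELEMENT = re.compile(
    r"^(<?)(x|[A-Z]|\[[A-Z>]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)$"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern string into a Python regular expression."""
    pieces = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.match(element)
        if match is None:
            raise ValueError("unrecognised pattern element: %r" % element)
        start, core, low, high, end = match.groups()
        if core == "x":
            regex = "."                      # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"  # any residue except these
        elif core.startswith("[") and ">" in core:
            # e.g. [G>]: one of the listed residues or the end of the sequence
            regex = "(?:[" + core[1:-1].replace(">", "") + "]|$)"
        else:
            regex = core                     # single residue or [allowed] set
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        pieces.append(("^" if start else "") + regex + ("$" if end else ""))
    return "".join(pieces)


PANCREATIC_HORMONE_1 = "[FY]-x(3)-[LIVM]-x(2)-Y-x(3)-[LIVMFY]-x-R-x-R-[YF]"
regex = prosite_to_regex(PANCREATIC_HORMONE_1)

# Made-up sequences for illustration only (not real peptides).
toy_hit = "GGYAAALAAYAAAIARARY"
toy_miss = "GGYAAALAAYAAAIAKARY"   # Arg replaced by Lys at a conserved position

print(regex)
print(re.search(regex, toy_hit) is not None)
print(re.search(regex, toy_miss) is not None)
```

The translation is purely syntactic: it says nothing about how PROSITE itself scans Swiss-Prot, and it does not cover profiles such as PS50276, which are weight matrices rather than patterns.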
-Last update: December 2001 / Text revised; profile added.

[ 1] Blundell T.L., Humbel R.E.
     "Hormone families: pancreatic hormones and homologous growth factors."
     Nature 287:781-787(1980).
     PubMed=6107857
[ 2] Allen J., Novotny J., Martin J., Heinrich G.
     "Molecular structure of mammalian neuropeptide Y: analysis by molecular cloning and computer-aided comparison with crystal structure of avian homologue."
     Proc. Natl. Acad. Sci. U.S.A. 84:2532-2536(1987).
     PubMed=3031687
[ 3] Tatemoto K.
     "Isolation and characterization of peptide YY (PYY), a candidate gut hormone that inhibits pancreatic exocrine secretion."
     Proc. Natl. Acad. Sci. U.S.A. 79:2514-2518(1982).
     PubMed=6953409
[ 4] Jensen J., Conlon J.M.
     "Characterization of peptides related to neuropeptide tyrosine and peptide tyrosine-tyrosine from the brain and gastrointestinal tract of teleost fish."
     Eur. J. Biochem. 210:405-410(1992).
     PubMed=1459125
[ 5] Conlon J.M., Chartrel N., Vaudry H.
     "Primary structure of frog PYY: implications for the molecular evolution of the pancreatic polypeptide family."
     Peptides 13:145-149(1992).
     PubMed=1620652
[ 6] Curry W.J., Shaw C., Johnston C.F., Thim L., Buchanan K.D.
     Comp. Biochem. Physiol. 101C:269-274(1992).
{END}
{PDOC00239}
{PS00266; SOMATOTROPIN_1}
{PS00338; SOMATOTROPIN_2}
{BEGIN}
***********************************************************
* Somatotropin, prolactin and related hormones signatures *
***********************************************************

Somatotropin (growth hormone, GH), which plays an important role in growth control; choriomammotropin (lactogen), a placental analog of GH; and prolactin, which acts primarily on the mammary gland by promoting lactation, have been shown [1] to be homologous. This family of proteins also includes other hormones listed below (references are only provided for recently determined sequences).

 - Rodent placental lactogens I and II.
 - Bovine and sheep lactogen.
 - Mouse proliferin I, II, III and proliferin related protein [2].
 - Bovine placental prolactin-related proteins I, II, and III [3].
 - Rat placental prolactin-like proteins A and B.
 - Human growth hormone variants 1 (GH-V1) and 2 (GH-V2).
 - Somatolactin (SL) from various fish [4].

The schematic representation of the structure of proteins belonging to this family is shown below.

<----------------180-to-210-residues------------->
             ***                          *****
xxxxxxxxxxxxxCxxxxxxxxxxxxxxxxxxxxxxxxxxxxCxxxCxxC
             |                            |   |  |
             +----------------------------+   +--+

'C': conserved cysteine involved in a disulfide bond.
'*': position of the patterns.

We developed two signature patterns for this family of proteins; both patterns include cysteines involved in conserved disulfide bonds.

-Consensus pattern: C-x-[STN]-x(2)-[LIVMFYS]-x-[LIVMSTA]-P-x(5)-[TALIV]-x(7)-[LIVMFY]-x(6)-[LIVMFY]-x(2)-[STACV]-W
                    [The C is involved in a disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL, except for 8 sequences.
-Other sequence(s) detected in Swiss-Prot: 4.
-Consensus pattern: C-[LIVMFY]-{PT}-x-D-[LIVMFYSTA]-x-{S}-{RK}-{A}-x-[LIVMFY]-x(2)-[LIVMFYT]-x(2)-C
                    [The 2 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL, except for 7 sequences.
-Other sequence(s) detected in Swiss-Prot: 16.
-Last update: April 2006 / Pattern revised.

[ 1] Wallis M.
     "Episodic evolution of protein hormones in mammals."
     J. Mol. Evol. 53:10-18(2001).
     PubMed=11683318
[ 2] Connor A.M., Waterhouse P., Khokha R., Denhardt D.T.
     "Characterization of a mouse mitogen-regulated protein/proliferin gene and its promoter: a member of the growth hormone/prolactin gene superfamily."
     Biochim. Biophys. Acta 1009:75-82(1989).
     PubMed=2790033
[ 3] Kessler M.A., Milosavljevic M., Zieler C.G., Schuler L.A.
     "A subfamily of bovine prolactin-related transcripts distinct from placental lactogen in the fetal placenta."
     Biochemistry 28:5154-5161(1989).
     PubMed=2765528
[ 4] Rand-Weaver M., Noso T., Muramoto K., Kawauchi H.
     Biochemistry 30:1509-1515(1991).
{END}
{PDOC00240}
{PS00267; TACHYKININ}
{BEGIN}
*******************************
* Tachykinin family signature *
*******************************

Tachykinins [1,2,3] are a group of biologically active peptides which excite neurons, evoke behavioral responses, are potent vasodilators and contract (directly or indirectly) many smooth muscles.
Peptides known to belong to the tachykinin family are listed below:

 - Substance P from mammals, birds and fish.
 - Neurokinin A (substance K or neuromedin L) from mammals, birds and fish.
 - Neurokinin B (neuromedin K) from mammals and frogs.
 - Kassinin from frogs.
 - Hylambatin from frogs.
 - Phyllomedusin from a frog.
 - Physalaemin from a frog.
 - Ranamargarin from a Chinese frog.
 - Uperolein from frogs.
 - Ranatachykinins A to D from frogs [4].
 - Scyliorhinins from dogfish.
 - Carassin from goldfish [5].
 - Eledoisin from octopus.

Tachykinins, like most other active peptides, are synthesized as larger protein precursors that are enzymatically converted to their mature forms. Tachykinins are from ten to twelve residues long. We use, as a signature pattern, the last five residues of the C-terminus, which are conserved and are essential to the biological activity.

-Consensus pattern: F-[IVFY]-G-[LM]-M-[G>]
                    [See the note]
-Sequences known to belong to this class detected by the pattern: ALL, except for ranatachykinin D from Rana catesbeiana which has Ala-Pro instead of Gly-Leu/Met.
-Other sequence(s) detected in Swiss-Prot: 10.

-Note: If the sequence is processed, the peptide ends with a C-terminal amidated Met while in a precursor sequence it is always followed by a Gly which subsequently provides the amide group.
-Note: Locust myotropic peptides locustatachykinin I and II [6] are distantly related to the tachykinin family but their C-terminal sequence is different (Val-Arg instead of Leu/Met-Met). Thus, they are not detected by the above pattern.

-Last update: November 1995 / Text revised.

[ 1] Maggio J.E.
     "Tachykinins."
     Annu. Rev. Neurosci. 11:13-28(1988).
     PubMed=3284438; DOI=10.1146/annurev.ne.11.030188.000305
[ 2] Helke C.J., Krause J.E., Mantyh P.W., Couture R., Bannon M.J.
     "Diversity in mammalian tachykinin peptidergic neurons: multiple peptides, receptors, and regulatory mechanisms."
     FASEB J. 4:1606-1615(1990).
     PubMed=1969374
[ 3] Avanov A.Y.
     Mol. Biol.
     (Mosk) 26:5-24(1992).
[ 4] Kozawa H., Hino J., Minamino N., Kangawa K., Matsuo H.
     "Isolation of four novel tachykinins from frog (Rana catesbeiana) brain and intestine."
     Biochem. Biophys. Res. Commun. 177:588-595(1991).
     PubMed=2043143
[ 5] Conlon J.M., O'Harte F., Peter R.E., Kah O.
     "Carassin: a tachykinin that is structurally related to neuropeptide-gamma from the brain of the goldfish."
     J. Neurochem. 56:1432-1436(1991).
     PubMed=2002352
[ 6] Schoofs L., Holman G.M., Hayes T.K., Nachman R.J., De Loof A.
     "Locustatachykinin I and II, two novel insect neuropeptides with homology to peptides of the vertebrate tachykinin family."
     FEBS Lett. 261:397-401(1990).
     PubMed=2311766
{END}
{PDOC00241}
{PS00268; CECROPIN}
{BEGIN}
*****************************
* Cecropin family signature *
*****************************

Cecropins [1,2,3] are potent antibacterial proteins that constitute a main part of the cell-free immunity of insects. Cecropins are small proteins of about 35 amino acid residues active against both Gram-positive and Gram-negative bacteria. They seem to exert a lytic action on bacterial membranes. Cecropins isolated from insects other than Cecropia have been given various names: bactericidin, lepidopteran, sarcotoxin, etc. All of these peptides are structurally related. Cecropin P1, an intestinal antibacterial peptide from pig, also belongs to this family.
As a signature pattern for this family of active peptides, we selected a conserved region in the N-terminal section of cecropins.

-Consensus pattern: W-x(0,2)-[KDN]-{Q}-{L}-K-[KRE]-[LI]-E-[RKN]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 12.
-Last update: April 2006 / Pattern revised.

[ 1] Boman H.G., Hultmark D.
     "Cell-free immunity in insects."
     Annu. Rev. Microbiol. 41:103-126(1987).
     PubMed=3318666; DOI=10.1146/annurev.mi.41.100187.000535
[ 2] Boman H.G.
     "Antibacterial peptides: key components needed in immunity."
     Cell 65:205-207(1991).
     PubMed=2015623
[ 3] Boman H.G., Faye I., Gudmundsson G.H., Lee J.-Y., Lidholm D.A.
     "Cell-free immunity in Cecropia. A model system for antibacterial proteins."
     Eur. J. Biochem. 201:23-31(1991).
     PubMed=1915368
{END}
{PDOC00242}
{PS00269; DEFENSIN}
{BEGIN}
*********************************
* Mammalian defensins signature *
*********************************

Defensins [1 to 5], also known as alpha-defensins, are a family of structurally related cysteine-rich peptides active against many Gram-negative and Gram-positive bacteria, fungi, and enveloped viruses. Some defensins are also called corticostatins (CS) because they inhibit corticotropin-stimulated corticosteroid production. Defensins kill cells by forming voltage-regulated multimeric channels in the susceptible cell's membrane.
They play a significant role in innate immunity to infection and neoplasia. The peptides known to belong to this family are listed below.

 - Rabbit defensins and corticostatins: CS-I (NP-3A), CS-II (NP-3B), CS-III (MCP-1), CS-IV (MCP-2), NP-4, and NP-5.
 - Guinea-pig neutrophil defensin (GPNP).
 - Human neutrophil defensins 1 to 4 and intestinal defensins 5 and 6.
 - Mouse small bowel cryptdins 1 to 5.
 - Rat NP-1 to NP-4.

All these peptides range in length from 29 to 35 amino acids. There are seven invariant residues, including six cysteines all involved in intrachain disulfide bonds. A schematic representation of peptides from the defensin family is shown below.

  +----------------------------+
  |****************************|
xxCxCxxxxxCxxxxxxxGxCxxxxxxxxxCCxx
    |     |         |         |
    +-----|---------+         |
          +-------------------+

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

Our pattern is based on the conserved residues.

-Consensus pattern: C-x-C-x(3,5)-C-x(7)-G-x-C-x(9)-C-C
                    [The 6 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL, except for mouse cryptdin 4.
-Other sequence(s) detected in Swiss-Prot: 1.
-Last update: May 2004 / Text revised.

[ 1] Lehrer R.I., Ganz T., Selsted M.E.
     ASM News 56:315-318(1990).
[ 2] Lehrer R.I., Ganz T., Selsted M.E.
     "Defensins: endogenous antibiotic peptides of animal cells."
     Cell 64:229-230(1991).
     PubMed=1988144
[ 3] Kagan B.L., Ganz T., Lehrer R.I.
     "Defensins: a family of antimicrobial and cytotoxic peptides."
     Toxicology 87:131-149(1994).
     PubMed=7512758
[ 4] Lehrer R.I., Lichtenstein A.K., Ganz T.
     "Defensins: antimicrobial and cytotoxic peptides of mammalian cells."
     Annu. Rev. Immunol. 11:105-128(1993).
     PubMed=8476558; DOI=10.1146/annurev.iy.11.040193.000541
[ 5] White S.H., Wimley W.C., Selsted M.E.
     "Structure, function, and membrane integration of defensins."
     Curr. Opin. Struct. Biol. 5:521-527(1995).
     PubMed=8528769
{END}
{PDOC00243}
{PS00270; ENDOTHELIN}
{BEGIN}
*******************************
* Endothelin family signature *
*******************************

Endothelins (ET's) are a family of potent mammalian vasoconstrictor peptides [1,2,3]. Sarafotoxins (SRTX) and bibrotoxin (BTX) are cardiotoxins from the venom of snakes of the Atractaspis family, structurally and functionally [4,5] similar to endothelin. The peptides currently known to belong to the ET/SRTX family are:

 - Endothelin 1 (ET-1).
 - Endothelin 2 (ET-2) which is also known in mouse as Vasoactive Intestinal Contractor (VIC).
 - Endothelin 3 (ET-3).
 - Sarafotoxins SRTX-A, -B, -C and -D from Atractaspis engaddensis.
 - Bibrotoxin (BTX) from Atractaspis bibroni.

As shown in the following schematic representation, these peptides, which are 21 residues long, contain two intramolecular disulfide bonds.

+-------------+
|             |
CxCxxxxxxxCxxxCxxxxxx
  |       |
  +-------+

'C': conserved cysteine involved in a disulfide bond.

As a signature pattern for this family of proteins, we have taken the conserved residues in the disulfide loops region.

-Consensus pattern: C-x-C-x(4)-D-x(2)-C-x(2)-[FY]-C
                    [The 4 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: In addition to endothelin, the precursors of ETs contain an endothelin-like domain which is also detected by this pattern.
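
Because a single precursor can contain more than one region matching this pattern, a scan has to report every hit rather than only the first. The following Python sketch is an illustration under stated assumptions, not the method used by PROSITE: the regular expression is a hand translation of the consensus pattern above, the function name find_hits is hypothetical, and the "precursor" is a made-up string with two matching regions, not a real sequence.

```python
import re

# Hand-translated form of C-x-C-x(4)-D-x(2)-C-x(2)-[FY]-C, wrapped in a
# lookahead so that overlapping or adjacent hits are all reported.
ENDOTHELIN = re.compile(r"(?=(C.C.{4}D.{2}C.{2}[FY]C))")


def find_hits(sequence):
    """Return (start, end, matched_text) for every hit, using 1-based positions."""
    hits = []
    for match in ENDOTHELIN.finditer(sequence):
        text = match.group(1)
        start = match.start() + 1
        hits.append((start, start + len(text) - 1, text))
    return hits


# Made-up "precursor" carrying two pattern-matching regions (illustration only).
toy_precursor = "MM" + "CACAAAADAACAAFC" + "GGGG" + "CACAAAADAACAAYC" + "KK"

for hit in find_hits(toy_precursor):
    print(hit)
```

The lookahead matters only when hits can overlap; for well-separated regions a plain finditer over the bare pattern would return the same positions.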
-Note: The precursor of sarafotoxins is a large protein that contains 12 copies of various isoforms of the toxin [6].

-Last update: May 2004 / Text revised.

[ 1] Yanagisawa M., Masaki T.
     "Molecular biology and biochemistry of the endothelins."
     Trends Pharmacol. Sci. 10:374-378(1989).
     PubMed=2690429
[ 2] Simonson M.S., Dunn M.J.
     "Cellular signaling by peptides of the endothelin gene family."
     FASEB J. 4:2989-3000(1990).
     PubMed=2168326
[ 3] Rubanyi G.M., Parker Botelho L.H.
     "Endothelins."
     FASEB J. 5:2713-2720(1991).
     PubMed=1916094
[ 4] Kloog Y., Sokolovsky M.
     "Similarities in mode and sites of action of sarafotoxins and endothelins."
     Trends Pharmacol. Sci. 10:212-214(1989).
     PubMed=2549664
[ 5] Sokolovsky M.
     "Endothelins and sarafotoxins: physiological regulation, receptor subtypes and transmembrane signaling."
     Trends Biochem. Sci. 16:261-264(1991).
     PubMed=1656557
[ 6] Ducancel F., Matre V., Dupont C., Lajeunesse E., Wollberg Z., Bdolah A., Kochva E., Boulain J.C.C., Menez A.
     "Cloning and sequence analysis of cDNAs encoding precursors of sarafotoxins. Evidence for an unusual 'rosary-type' organization."
     J. Biol. Chem. 268:3052-3055(1993).
     PubMed=8428983
{END}
{PDOC00244}
{PS00271; THIONIN}
{BEGIN}
****************************
* Plant thionins signature *
****************************

Thionins are small, basic, plant proteins generally toxic to animal cells [1].
They seem to exert their toxic effect at the level of the cell membrane but their exact function is not known. They consist of a polypeptide chain of forty-five to fifty amino acids with three to four internal disulfide bonds. They are found in seeds but also in the cell wall of leaves [2]. Thionins are processed from larger precursor proteins [3]. Crambin [4], a hydrophobic plant seed protein, also belongs to this family.

The pattern we developed to detect this family of proteins includes three of the six cysteine residues involved in disulfide bonds.

  +-----------------------------------+
  |+----------------------------+     |
  ||                            |     |
xxCCxxxxxxxxxxxCxxxxxxxxxCxxxCxxCxxxxxCxxxxxxxx
  **************         |
               |         |
               +---------+

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

-Consensus pattern: C-C-x(5)-R-x(2)-[FY]-x(2)-C
                    [The 3 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: The proteins from the gamma-thionin family are not related to the above proteins and are described in a separate section.

-Last update: May 2004 / Text revised.

[ 1] Vernon L.P., Evett G.E., Zeikus R.D., Gray W.R.
     "A toxic thionin from Pyrularia pubera: purification, properties, and amino acid sequence."
     Arch. Biochem. Biophys. 238:18-29(1985).
     PubMed=3985614
[ 2] Bohlmann H., Clausen S., Behnke S., Giese H., Hiller C., Reimann-Phillip U., Schrader G., Barkholt V., Apel K.
     EMBO J. 7:1559-1565(1988).
[ 3] Bohlmann H., Apel K.
     Mol. Gen. Genet. 207:446-454(1987).
[ 4] Teeter M.M., Mazer J.A., L'Italien J.J.
     "Primary structure of the hydrophobic plant protein crambin."
     Biochemistry 20:5437-5443(1981).
     PubMed=6895315
{END}
{PDOC00245}
{PS00272; SNAKE_TOXIN}
{BEGIN}
**************************
* Snake toxins signature *
**************************

Snake toxins belong to a family of proteins [1,2,3] which groups short and long neurotoxins, cytotoxins and short toxins, as well as other miscellaneous venom peptides. Most of these toxins act by binding to the nicotinic acetylcholine receptors in the postsynaptic membrane of skeletal muscles and preventing the binding of acetylcholine, thereby blocking the excitation of muscles.

Snake toxins are proteins that consist of sixty to seventy-five amino acids. Among the invariant residues are eight cysteines all involved in disulfide bonds. A signature pattern was developed [4] which includes four of these cysteines as well as a conserved proline thought to be important for the maintenance of the tertiary structure. The second cysteine in the pattern is linked to the third one by a disulfide bond.

-Consensus pattern: G-C-x(1,3)-C-P-x(8,10)-C-C-x(2)-[PDEN]
                    [The 4 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: Most of the snake toxins are detected except for fasciatoxin, which is an atypical short neurotoxin, and eight toxins which have a very low activity.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: May 2004 / Text revised.

[ 1] Dufton M.J.
     "Classification of elapid snake neurotoxins and cytotoxins according to chain length: evolutionary implications."
     J. Mol. Evol. 20:128-134(1984).
     PubMed=6433031
[ 2] Endo T., Tamiya N.
     (In) Snake toxins, Harvey A.L., Ed., pp165-222, Pergamon Press, New York, (1991).
[ 3] Mebs D., Claus I.
     (In) Snake toxins, Harvey A.L., Ed., pp425-447, Pergamon Press, New York, (1991).
[ 4] Jonassen I., Collins J.F., Higgins D.G.
     Protein Sci. 4:1587-1595(1995).
{END}
{PDOC00246}
{PS00273; ENTEROTOXIN_H_STABLE}
{BEGIN}
**************************************
* Heat-stable enterotoxins signature *
**************************************

Prokaryotic heat-stable enterotoxins [1] are responsible for acute diarrhea. The active toxin is a short peptide of around twenty residues which contains six cysteines involved in three disulfide bonds, as shown in the following schematic representation:

      +-------+
   +--|----+  |
   |  |    |  |
xxCCxxCCxxxCxxCxx
  |    |
  +----+

'C': conserved cysteine involved in a disulfide bond.

We have taken the pattern of cysteines, along with three conserved residues, as a signature pattern for this group of proteins.

-Consensus pattern: C-C-x(2)-C-C-x-P-A-C-x-G-C
                    [The 6 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: May 2004 / Text revised.

[ 1] Shimonishi Y., Hidaka Y., Koizumi M., Hane M., Aimoto S., Takeda T., Miwatani T., Takeda Y.
     "Mode of disulfide bond formation of a heat-stable enterotoxin (STh) produced by a human strain of enterotoxigenic Escherichia coli."
     FEBS Lett. 215:165-170(1987).
PubMed=3552731; +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00247} {PS00274; AEROLYSIN} {BEGIN} *********************************** * Aerolysin type toxins signature * *********************************** Aerolysin [1] is a cytolytic toxin exported by the Gram-negative Aeromonas bacteria. The mature toxin binds to eukaryotic cells and aggregates to form holes (approximately 3 nm in diameter) leading to the destruction of the membrane permeability barrier and osmotic lysis. Staphylococcus aureus also exports a cytotoxin, alpha-toxin [2], whose biological activity is similar to that of aerolysin. The sequences of the two toxins are not similar, except for a rather well conserved stretch of ten residues. -Consensus pattern: [KT]-x(2)-N-W-x(2)-T-[DN]-T -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: Pseudomonas aeruginosa cytotoxin [3] has been said to contain a region whose sequence is similar to that of the conserved domain of aerolysin/alpha-toxin, but the similarity is very weak and this toxin is not picked up by the above pattern. -Last update: November 1997 / Pattern and text revised. [ 1] Howard S.P., Garland W.J., Green M.J., Buckley J.T. "Nucleotide sequence of the gene for the hole-forming toxin aerolysin of Aeromonas hydrophila." J. Bacteriol. 169:2869-2871(1987). PubMed=3584074 [ 2] Gray G.S., Kehoe M. "Primary sequence of the alpha-toxin gene from Staphylococcus aureus wood 46."
Infect. Immun. 46:615-618(1984). PubMed=6500704 [ 3] Hayashi T., Kamio Y., Hishinuma F., Usami Y., Titani K., Terawaki Y. "Pseudomonas aeruginosa cytotoxin: the nucleotide sequence of the gene and the mechanism of activation of the protoxin." Mol. Microbiol. 3:861-868(1989). PubMed=2507866 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00248} {PS00275; SHIGA_RICIN} {BEGIN} ******************************************************************* * Shiga/ricin ribosomal inactivating toxins active site signature * ******************************************************************* A number of bacterial and plant toxins act by inhibiting protein synthesis in eukaryotic cells. The toxins of the Shiga and ricin family inactivate 60S ribosomal subunits by an N-glycosidic cleavage which releases a specific adenine base from the sugar-phosphate backbone of 28S rRNA [1,2,3]. The toxins which are known to function in this manner are: - Shiga toxin from Shigella dysenteriae [4]. This toxin is composed of one copy of an enzymatically active A subunit and five copies of a B subunit responsible for binding the toxin complex to specific receptors on the target cell surface. - Shiga-like toxins (SLT) are a group of Escherichia coli toxins very similar in their structure and properties to Shiga toxin. The sequence of two types of these toxins, SLT-1 [5] and SLT-2 [6], is known. - Ricin, a potent toxin from castor bean seeds. 
Ricin consists of two glycosylated chains linked by a disulfide bond. The A chain is enzymatically active. The B chain is a lectin with a binding preference for galactosides. Both chains are encoded by a single polypeptide precursor. Ricin is classified as a type-II ribosome-inactivating protein (RIP); other members of this family are agglutinin, also from castor bean, and abrin from the seeds of the bean Abrus precatorius [7]. - Single chain ribosome-inactivating proteins (type-I RIP) from plants. Examples of such proteins are: barley protein synthesis inhibitors I and II, mongolian snake-gourd trichosanthin, sponge gourd luffin-A and -B, garden four-o'clock MAP, common pokeberry PAP-S and soapwort saporin-6 [7]. All these toxins are structurally related. A conserved glutamic acid residue has been implicated [8] in the catalytic mechanism; it is located near a conserved arginine which also plays a role in catalysis [9]. The signature we developed for these proteins includes these catalytic residues. -Consensus pattern: [LIVMA]-x-[LIVMSTA](2)-x-E-[SAGV]-[STAL]-R-[FY]-[RKNQST]-x-[LIVM]-[EQS]-x(2)-[LIVMF] [E and R are active site residues] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 2004 / Pattern and text revised. [ 1] Endo Y., Tsurugi K., Yutsudo T., Takeda Y., Ogasawara T., Igarashi K. "Site of action of a Vero toxin (VT2) from Escherichia coli O157:H7 and of Shiga toxin on eukaryotic ribosomes. RNA N-glycosidase activity of the toxins." Eur. J. Biochem. 171:45-50(1988). PubMed=3276522 [ 2] May M.J., Hartley M.R., Roberts L.M., Krieg P.A., Osborn R.W., Lord J.M. "Ribosome inactivation by ricin A chain: a sensitive method to assess the activity of wild-type and mutant polypeptides." EMBO J. 8:301-308(1989). PubMed=2714255 [ 3] Funatsu G., Islam M.R., Minami Y., Sung-Sil K., Kimura M. "Conserved amino acid residues in ribosome-inactivating proteins from plants."
Biochimie 73:1157-1161(1991). PubMed=1742358 [ 4] Strockbine N.A., Jackson M.P., Sung L.M., Holmes R.K., O'Brien A.D. "Cloning and sequencing of the genes for Shiga toxin from Shigella dysenteriae type 1." J. Bacteriol. 170:1116-1122(1988). PubMed=2830229 [ 5] Calderwood S.B., Auclair F., Donohue-Rolfe A., Keusch G.T., Mekalanos J.J. "Nucleotide sequence of the Shiga-like toxin genes of Escherichia coli." Proc. Natl. Acad. Sci. U.S.A. 84:4364-4368(1987). PubMed=3299365 [ 6] Jackson M.P., Neill R.J., O'Brien A.D., Holmes R.K., Newland J.W. FEMS Microbiol. Lett. 44:109-114(1987). [ 7] Barbieri L., Battelli M.G., Stirpe F. "Ribosome-inactivating proteins from plants." Biochim. Biophys. Acta 1154:237-282(1993). PubMed=8280743 [ 8] Hovde C.J., Calderwood S.B., Mekalanos J.J., Collier R.J. "Evidence that glutamic acid 167 is an active-site residue of Shiga-like toxin I." Proc. Natl. Acad. Sci. U.S.A. 85:2568-2572(1988). PubMed=3357883 [ 9] Monzingo A.F., Collins E.J., Ernst S.R., Irvin J.D., Robertus J.D. "The 2.5 A structure of pokeweed antiviral protein." J. Mol. Biol. 233:705-715(1993). PubMed=8411176 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00249} {PS00276; CHANNEL_COLICIN} {BEGIN} ************************************** * Channel forming colicins signature * ************************************** Colicins are plasmid-encoded polypeptide toxins produced by and active against Escherichia coli and closely related bacteria. 
The channel-forming colicins are transmembrane proteins that depolarize the cytoplasmic membrane, leading to dissipation of cellular energy [1,2]. Colicins A, B, E1, Ia, Ib, and N belong to that group. The N-terminal part of these colicins is involved in their uptake; the central part is important for binding to outer membrane receptors and the C-terminal part is the channel-forming region. As a signature for this type of colicins, we have selected one of the most conserved regions of the channel-forming domain. -Consensus pattern: T-x(2)-W-x-P-[LIVMFY](3)-x(2)-E -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: November 1990 / Text revised. [ 1] Pattus F., Massotte D., Wilmsen H.U., Lakey J., Tsernoglou D., Tucker A., Parker M.W. "Colicins: prokaryotic killer-pores." Experientia 46:180-192(1990). PubMed=1689257 [ 2] Cramer W.A., Cohen F.S., Merrill A.R., Song H.Y. "Structure and dynamics of the colicin E1 channel." Mol. Microbiol. 4:519-526(1990). PubMed=1693745 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
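The consensus patterns in this file, such as the colicin pattern T-x(2)-W-x-P-[LIVMFY](3)-x(2)-E, can be tried out with an ordinary regular expression engine. The Python sketch below is an illustration only and is not part of PROSITE. The function name prosite_to_regex and the example sequence are hypothetical, and the translation assumes just the syntax elements that occur in this file: single residues, x, [..] alternatives, {..} exclusions, and (n) or (n,m) repeats. The '<' and '>' anchors are not handled.

```python
import re

# One pattern element: a core (x, a residue, [..] or {..}) plus an optional
# repeat such as (3) or (2,3).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper written for this illustration; it covers only the
    subset of the pattern syntax described in the text above.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":                      # x: any residue
            atom = "."
        elif core.startswith("{"):           # {AB}: any residue except A or B
            atom = "[^" + core[1:-1] + "]"
        else:                                # A or [AB]: same meaning in a regex
            atom = core
        if low is None:                      # no repeat given
            repeat = ""
        elif high is None:                   # (n): exactly n times
            repeat = "{" + low + "}"
        else:                                # (n,m): between n and m times
            repeat = "{" + low + "," + high + "}"
        parts.append(atom + repeat)
    return "".join(parts)


if __name__ == "__main__":
    colicin = prosite_to_regex("T-x(2)-W-x-P-[LIVMFY](3)-x(2)-E")
    print(colicin)
    # Made-up sequence, used only to show how a hit is located.
    hit = re.search(colicin, "MKTAAWGPLIVQQEK")
    print(hit.group(0), hit.span())
```

A regular expression reports non-overlapping hits only, so a dedicated scanning tool remains the better choice when every occurrence of a pattern in a sequence must be listed.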
+-----------------------------------------------------------------------+ {END} {PDOC00250} {PS00277; STAPH_STREP_TOXIN_1} {PS00278; STAPH_STREP_TOXIN_2} {BEGIN} ****************************************************************************** * Staphylococcal enterotoxins / Streptococcal pyrogenic exotoxins signatures * ****************************************************************************** Staphylococcal enterotoxins and streptococcal pyrogenic exotoxins constitute a family of biologically and structurally related toxins produced by Staphylococcus aureus and Streptococcus pyogenes [1,2]. These toxins share the ability to bind to the major histocompatibility complex proteins of their hosts. The toxins that belong to this family are: - Staphylococcal enterotoxins are the cause of staphylococcal food poisoning syndrome. The sequence of six different enterotoxins is known: SEA, SEB, SEC1, SEC3, SED, and SEE. - Streptococcal pyrogenic exotoxins are the causative agents of the symptoms associated with scarlet fever. The sequence of two pyrogenic exotoxins is known: SPEA, and SPEC. - Staphylococcus aureus toxic shock syndrome toxin-1 (TSST-1). While the enterotoxins and the pyrogenic exotoxins are closely related, TSST-1 seems to be only distantly related to the other toxins. We developed two patterns for this family of toxins. The first one is based on a well conserved region of enterotoxins and pyrogenic exotoxins, but which does not pick up TSST-1; the second pattern is derived from a more diffuse region of similarity common to all these toxins. -Consensus pattern: Y-G(2)-[LIV]-T-{I}-{N}-x(2)-N -Sequences known to belong to this class detected by the pattern: ALL, except for TSST-1. -Other sequence(s) detected in Swiss-Prot: 2. -Consensus pattern: K-x(2)-[LIVF]-x(4)-[LIVF]-D-x(3)-R-x(2)-L-x(5)-[LIV]-Y -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: April 2006 / Pattern revised.
[ 1] Iandolo J.J. "Genetic analysis of extracellular toxins of Staphylococcus aureus." Annu. Rev. Microbiol. 43:375-402(1989). PubMed=2679358; DOI=10.1146/annurev.mi.43.100189.002111 [ 2] Marrack P., Kappler J. "The staphylococcal enterotoxins and their relatives." Science 248:705-711(1990). PubMed=2185544 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00251} {PS00279; MACPF_1} {PS51412; MACPF_2} {BEGIN} ************************************************************************* * Membrane attack complex/perforin (MACPF) domain signature and profile * ************************************************************************* The membrane attack complex/perforin (MACPF) domain is conserved in bacteria, fungi, mammals and plants. It was originally identified and named as being common to five complement components (C6, C7, C8-alpha, C8-beta, and C9) and perforin. These molecules perform critical functions in innate and adaptive immunity. The MAC family proteins and perforin are known to participate in lytic pore formation. In response to pathogen infection, a sequential and highly specific interaction between the constituent elements occurs to form transmembrane channels which are known as the membrane-attack complex (MAC). Only a few other MACPF proteins have been characterized and several are thought to form pores for invasion or protection [1,2,3]. 
Examples are proteins from malarial parasites [4], the cytolytic toxins from sea anemones [5], and proteins that provide plant immunity [1,6]. Functionally uncharacterized MACPF proteins are also evident in pathogenic bacteria such as Chlamydia spp. [7] and Photorhabdus luminescens [2]. The MACPF domain is commonly found to be associated with other N- and C-terminal domains, such as TSP1 (see <PDOC50092>), LDLRA (see <PDOC00929>), EGF-like (see <PDOC00021>), Sushi/CCP/SCR (see <PDOC50923>), FIMAC or C2 (see <PDOC00380>). They probably control or target MACPF function [2,8]. The MACPF domain oligomerizes, undergoes conformational change, and is required for lytic activity. The MACPF domain consists of a central kinked four-stranded antiparallel beta sheet surrounded by alpha helices and beta strands, forming two structural segments. Overall, the MACPF domain has a thin L-shaped appearance (see <PDB:2QQH; A>). MACPF domains exhibit limited sequence similarity but contain a signature [YW]-G-[TS]-H-[FY]-x(6)-G-G motif [2,3,8]. Some proteins known to contain a MACPF domain are listed below: - Vertebrate complement proteins C6 to C9. Complement factors C6 to C9 assemble to form a scaffold, the membrane attack complex (MAC), that permits C9 polymerization into pores that lyse Gram-negative pathogens [3,8]. - Vertebrate perforin. It is delivered by natural killer cells and cytotoxic T lymphocytes and forms oligomeric pores (12 to 18 monomers) in the plasma membrane of either virus-infected or transformed cells. - Arabidopsis thaliana constitutively activated cell death 1 (CAD1) protein. It is likely to act as a mediator that recognizes plant signals for pathogen infection [6]. - Arabidopsis thaliana necrotic spotted lesions 1 (NSL1) protein [1]. - Venomous sea anemone Phyllodiscus semoni toxins PsTX-60A and PsTX-60B [5]. - Venomous sea anemone Actineria villosa toxin AvTX-60A [5]. - Plasmodium sporozoite microneme protein essential for cell traversal 2 (SPECT2). 
It is essential for the membrane-wounding activity of the sporozoite and is involved in its traversal of the sinusoidal cell layer prior to hepatocyte-infection [4]. - Photorhabdus luminescens Plu-MACPF. Although nonlytic, it was shown to bind to cell membranes [2]. - Chlamydial putative uncharacterized protein CT153 [7]. We developed both a pattern and a profile for the MACPF domain. Whereas the profile covers the entire MACPF domain, the pattern is based on the conserved signature of the MACPF domain. -Consensus pattern: Y-x(6)-[FY]-G-T-H-[FY] -Sequences known to belong to this class detected by the pattern: ALL, except for rabbit C8-beta. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: November 2008 / Text revised; profile added. [ 1] Noutoshi Y., Kuromori T., Wada T., Hirayama T., Kamiya A., Imura Y., Yasuda M., Nakashita H., Shirasu K., Shinozaki K. "Loss of Necrotic Spotted Lesions 1 associates with cell death and defense responses in Arabidopsis thaliana." Plant Mol. Biol. 62:29-42(2006). PubMed=16900325; DOI=10.1007/s11103-006-9001-6 [ 2] Rosado C.J., Buckle A.M., Law R.H.P., Butcher R.E., Kan W.-T., Bird C.H., Ung K., Browne K.A., Baran K., Bashtannyk-Puhalovich T.A., Faux N.G., Wong W., Porter C.J., Pike R.N., Ellisdon A.M., Pearce M.C., Bottomley S.P., Emsley J., Smith A.I., Rossjohn J., Hartland E.L., Voskoboinik I., Trapani J.A., Bird P.I., Dunstone M.A., Whisstock J.C. "A common fold mediates vertebrate defense and bacterial attack." Science 317:1548-1551(2007). PubMed=17717151; DOI=10.1126/science.1144706 [ 3] Slade D.J., Lovelace L.L., Chruszcz M., Minor W., Lebioda L., Sodetz J.M. "Crystal structure of the MACPF domain of human complement protein C8 alpha in complex with the C8 gamma subunit." J. Mol. Biol. 379:331-342(2008). PubMed=18440555; DOI=10.1016/j.jmb.2008.03.061 [ 4] Ishino T., Chinzei Y., Yuda M. 
"A Plasmodium sporozoite protein with a membrane attack complex domain is required for breaching the liver sinusoidal cell layer prior to hepatocyte infection." Cell. Microbiol. 7:199-208(2005). PubMed=15659064; DOI=10.1111/j.1462-5822.2004.00447.x [ 5] Satoh H., Oshiro N., Iwanaga S., Namikoshi M., Nagai H. "Characterization of PsTX-60B, a new membrane-attack complex/perforin (MACPF) family toxin, from the venomous sea anemone Phyllodiscus semoni." Toxicon 49:1208-1210(2007). PubMed=17368498; DOI=10.1016/j.toxicon.2007.01.006 [ 6] Morita-Yamamuro C., Tsutsui T., Sato M., Yoshioka H., Tamaoki M., Ogawa D., Matsuura H., Yoshihara T., Ikeda A., Uyeda I., Yamaguchi J. "The Arabidopsis gene CAD1 controls programmed cell death in the plant immune system and encodes a protein containing a MACPF domain." Plant Cell Physiol. 46:902-912(2005). PubMed=15799997; DOI=10.1093/pcp/pci095 [ 7] Ponting C.P. "Chlamydial homologues of the MACPF (MAC/perforin) domain." Curr. Biol. 9:R911-R913(1999). PubMed=10608922 [ 8] Hadders M.A., Beringer D.X., Gros P. "Structure of C8alpha-MACPF reveals mechanism of membrane attack in complement immune defense." Science 317:1552-1554(2007). PubMed=17872444; DOI=10.1126/science.1147103 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+ {END} {PDOC00252} {PS00280; BPTI_KUNITZ_1} {PS50279; BPTI_KUNITZ_2} {BEGIN} ********************************************************************** * Pancreatic trypsin inhibitor (Kunitz) family signature and profile * ********************************************************************** The pancreatic trypsin inhibitor (Kunitz) family [1,2,3] is one of the numerous families of serine proteinase inhibitors. The basic structure of such a type of inhibitor is shown in the following schematic representation:

          +-----------------------+
          |  +--------+           |
          |  |      **|*******    |
        xxCxxC#xxxCxxxCxxxxxxCxxxxCxx
                  |          |
                  +----------+
          <------50 residues------>

'C': conserved cysteine involved in a disulfide bond. '#': active site residue. '*': position of the pattern. In addition to the prototype sequence for this type of inhibitor - the bovine pancreatic trypsin inhibitor (BPTI) (also known as basic protease inhibitor (BPI)) - this family also includes many other members which are listed below (references are only provided for recently determined sequences): - Mammalian inter-alpha-trypsin inhibitors (ITI). ITI's contain two inhibitory domains. - Tissue factor pathway inhibitor precursor (TFPI) (previously known as lipoprotein-associated coagulation inhibitor (LACI)), which inhibits factor X (Xa) directly and, in a Xa-dependent way, inhibits VIIa / Tissue factor activity. TFPI contains three inhibitory domains. - TFPI-2 [4] (also known as placental protein 5), a protein that contains two inhibitory domains. - Bovine colostrum, serum and spleen trypsin inhibitors. - Trypstatin, a rat mast cell inhibitor of trypsin. - A number of venom basic protease inhibitors (including dendrotoxins) from snakes. - Isoinhibitor K from garden snail. - Protease inhibitor from the hemocytes of horseshoe crab. - Basic protease inhibitor from red sea turtle. - Sea anemone protease inhibitor 5 II.
- Chymotrypsin inhibitors SCI-I,- II, and -III from silk moth. - Trypsin inhibitors A and B from the hemolymph of the tobacco hornworm. - Trypsin inhibitor from the hemolymph of the flesh fly [5]. - Acrosin inhibitor from the male accessory gland of Drosophila. - A domain found in one of the alternatively spliced forms of Alzheimer's amyloid beta-protein (APP) (also known as protease nexin II) as well as the closely related amyloid-like protein 2 (or APPH). - A domain at the C-terminal extremity of the alpha(3) chain of type VI collagen. - A domain at the C-terminal extremity of the alpha(1) chain of type VII collagen. We developed a pattern which will only pick up sequences belonging to this family of inhibitors. It spans a region starting after the third cysteine and ending with the fifth one. We also developed a profile that spans the complete domain. -Consensus pattern: F-x(2)-{I}-G-C-x(6)-[FY]-x(5)-C [The 2 C's are involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL, except for trypsin inhibitor IV from the sea anemone Radianthus macrodactylus which has Asp instead of Phe/Tyr. -Other sequence(s) detected in Swiss-Prot: NONE. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Expert(s) to contact by email: Ikeo K.; [email protected] -Last update: April 2006 / Pattern revised. [ 1] Laskowski M. Jr., Kato I. "Protein inhibitors of proteinases." Annu. Rev. Biochem. 49:593-626(1980). PubMed=6996568; DOI=10.1146/annurev.bi.49.070180.003113 [ 2] Salier J.-P. "Inter-alpha-trypsin inhibitor: emergence of a family within the Kunitz-type protease inhibitor superfamily." Trends Biochem. Sci. 15:435-439(1990). PubMed=1703675 [ 3] Ikeo K., Takahashi K., Gojobori T. "Evolutionary origin of a Kunitz-type trypsin inhibitor domain inserted in the amyloid beta precursor protein of Alzheimer's disease." J. Mol. Evol. 34:536-543(1992). 
PubMed=1593645 [ 4] Sprecher C.A., Kisiel W., Mathewes S., Foster D.C. "Molecular cloning, expression, and partial characterization of a second human tissue-factor-pathway inhibitor." Proc. Natl. Acad. Sci. U.S.A. 91:3353-3357(1994). PubMed=8159751 [ 5] Papayannopoulos I.A., Biemann K. "Amino acid sequence of a protease inhibitor isolated from Sarcophaga bullata determined by mass spectrometry." Protein Sci. 1:278-288(1992). PubMed=1304909 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00253} {PS00281; BOWMAN_BIRK} {BEGIN} *********************************************************** * Bowman-Birk serine protease inhibitors family signature * *********************************************************** The Bowman-Birk inhibitor family [1] is one of the numerous families of serine proteinase inhibitors. As it can be seen in the schematic representation, they have a duplicated structure and generally possess two distinct inhibitory sites:

          +------------------------------------------------+
          |      +-----+       +-------+    +-----+        |
          |      |     |       |       |    |     |        |
        xxCCxxCxxCxx#xxCxxCxxxxCxxxCxxxCxxxxCxx#xxCxxCxxCxxCxx
           |  |           |********|****             |  |
           |  |           |        |                 |  |
           +--|-----------+        +-----------------+  |
              +-----------------------------------------+
          <-----------------70 residues-------------------->

'C': conserved cysteine involved in a disulfide bond. '#': active site residue. '*': position of the pattern. These inhibitors are found in the seeds of all leguminous plants as well as in cereal grains.
In cereals they exist in two forms, one of which is a duplication of the basic structure shown above [2]. The pattern we developed to pick up sequences belonging to this family of inhibitors is in the central part of the domain and includes four cysteines. -Consensus pattern: C-x(5,6)-[DENQKRHSTA]-C-[PASTDH]-[PASTDK]-[ASTDV]-C-[NDEKS]-[DEKRHSTA]-C [The 4 C's are involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: This pattern can be found twice in some duplicated cereal inhibitors. -Last update: May 2004 / Text revised. [ 1] Laskowski M. Jr., Kato I. "Protein inhibitors of proteinases." Annu. Rev. Biochem. 49:593-626(1980). PubMed=6996568; DOI=10.1146/annurev.bi.49.070180.003113 [ 2] Tashiro M., Hashino K., Shiozaki M., Ibuki F., Maki Z. "The complete amino acid sequence of rice bran trypsin inhibitor." J. Biochem. 102:297-306(1987). PubMed=3667571 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00254} {PS00282; KAZAL_1} {PS51465; KAZAL_2} {BEGIN} ***************************************************************** * Kazal serine protease inhibitors family signature and profile * ***************************************************************** Canonical serine proteinase inhibitors are distributed in a wide range of organisms from all kingdoms of life and play a crucial role in various physiological mechanisms [1].
They interact from the canonical proteinase-inhibitor binding loop, where P1 residue has a predominant role (the residue at the P1 position contributing the carbonyl portion to the reactive-site peptide bond). These so-called canonical inhibitors bind to their cognate enzymes in the same manner as a good substrate, but are cleaved extremely slowly. Kazal-type inhibitors represent the most studied canonical proteinase inhibitors. Kazal inhibitors are extremely variable at their reactive sites. However, some regularity prevails such as the presence of lysine at position P1 indicating strong inhibition of trypsin [2]. The Kazal inhibitor has six cysteine residues engaged in disulfide bonds arranged as shown in the following schematic representation:

                       +------------------+
                       |                  |
                       *******************|***
        xxxxxxxxCxxxxxxCx#xxxxxCxxxxxxxxxxCxxCxxxxxxxxxxxxxxxxxC
                |              |             |                 |
                |              +-------------|-----------------+
                +----------------------------+

'C': conserved cysteine involved in a disulfide bond. '#': active site residue. '*': position of the pattern. The structure of classical Kazal domains consists of a central alpha helix, which is inserted between two beta-strands and a third that is toward the C-terminus (see for example <PDB:1OVO>)[3]. The reactive site P1 and the conformation of the reactive site loop is structurally highly conserved, similar to the canonical conformation of small serine proteinase inhibitors. The proteins known to belong to this family are: - Pancreatic secretory trypsin inhibitor (PSTI), whose physiological function is to prevent the trypsin-catalyzed premature activation of zymogens within the pancreas. - Mammalian seminal acrosin inhibitors. - Canidae and felidae submandibular gland double-headed protease inhibitors, which contain two Kazal-type domains, the first one inhibits trypsin and the second one elastase. - A mouse prostatic secretory glycoprotein, induced by androgens, and which exhibits anti-trypsin activity.
- Avian ovomucoids, which consist of three Kazal-type domains. - Chicken ovoinhibitor, which consists of seven Kazal-type domains. - Bdellin B-3, a leech trypsin inhibitor. - LDTI [4], a leech tryptase inhibitor. - An eel peptide, which is probably a pancreatic serine proteinase inhibitor. - An elastase inhibitor from a sea anemone. - Rhodniin, a thrombin inhibitor from the insect Rhodnius prolixus [5]. This protein consists of two Kazal-type domains. - Pig intestinal peptide PEC-60 [6]. This protein, while highly similar to other members of the Kazal family, does not seem to act as a protease inhibitor. Its exact biological function is not yet established, but it is known to inhibit the glucose-induced insulin secretion from perfused pancreas and to play a role in the immune system. The pattern we developed to pick up Kazal-type inhibitors spans a region beginning with the second cysteine and ending with the fifth one. We also developed a profile that covers the entire Kazal domain. -Consensus pattern: C-x(4)-{C}-x(2)-C-x-{A}-x(4)-Y-x(3)-C-x(2,3)-C [The 4 C's are involved in disulfide bonds] -Sequences known to belong to this class detected by the pattern: ALL, except for the sea anemone inhibitor which has six residues between the last two Cys of the pattern. -Other sequence(s) detected in Swiss-Prot: 3. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Note: This pattern will fail to detect the first of the three Kazal domains in some of the ovomucoids and the second domain of rhodniin. -Last update: September 2009 / Text revised; profile added. [ 1] Laskowski M. Jr., Kato I. "Protein inhibitors of proteinases." Annu. Rev. Biochem. 49:593-626(1980). PubMed=6996568; DOI=10.1146/annurev.bi.49.070180.003113 [ 2] Laskowski M., Qasim M.A. "What can the structures of enzyme-inhibitor complexes tell us about the structures of enzyme substrate complexes?" Biochim. Biophys. Acta 1477:324-337(2000). 
PubMed=10708867 [ 3] Papamokos E., Weber E., Bode W., Huber R., Empie M.W., Kato I., Laskowski M. Jr. "Crystallographic refinement of Japanese quail ovomucoid, a Kazal-type inhibitor, and model building studies of complexes with serine proteases." J. Mol. Biol. 158:515-537(1982). PubMed=6752426 [ 4] Sommerhoff C.P., Sollner C., Mentele R., Piechottka G.P., Auerswald E.A., Fritz H. "A Kazal-type inhibitor of human mast cell tryptase: isolation from the medical leech Hirudo medicinalis, characterization, and sequence analysis." Biol. Chem. Hoppe-Seyler 375:685-694(1994). PubMed=7888081 [ 5] Friedrich T., Kroger B., Bialojan S., Lemaire H.G., Hoffken H.W., Reuschenbach P., Otte M., Dodt J. "A Kazal-type inhibitor with thrombin specificity from Rhodnius prolixus." J. Biol. Chem. 268:16216-16222(1993). PubMed=8344906 [ 6] Liepinsh E., Berndt K.D., Sillard R., Mutt V., Otting G. "Solution structure and dynamics of PEC-60, a protein of the Kazal type inhibitor family, determined by nuclear magnetic resonance spectroscopy." J. Mol. Biol. 239:137-153(1994). PubMed=8196042 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
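The Kazal entry notes that the sea anemone inhibitor escapes the pattern because it has six residues between the last two cysteines, where the pattern C-x(4)-{C}-x(2)-C-x-{A}-x(4)-Y-x(3)-C-x(2,3)-C allows only two or three. The Python sketch below illustrates that effect under stated assumptions: the regular expression is a hand translation of the pattern, and the sequences are synthetic strings built from glycine filler, not real inhibitor sequences. The names KAZAL_REGEX, synthetic_kazal_core and is_detected are hypothetical.

```python
import re

# Hand translation of C-x(4)-{C}-x(2)-C-x-{A}-x(4)-Y-x(3)-C-x(2,3)-C:
#   x(n)   -> .{n}     any residue, n times
#   {C}    -> [^C]     any residue except Cys
#   x(2,3) -> .{2,3}   two or three residues of any kind
KAZAL_REGEX = re.compile(r"C.{4}[^C].{2}C.[^A].{4}Y.{3}C.{2,3}C")


def synthetic_kazal_core(spacer: int) -> str:
    """Build a made-up sequence that satisfies every pattern position and
    places `spacer` residues between the last two cysteines."""
    return (
        "C" + "G" * 7      # C-x(4)-{C}-x(2)
        + "C" + "G" * 6    # C-x-{A}-x(4)
        + "Y" + "G" * 3    # Y-x(3)
        + "C" + "G" * spacer + "C"
    )


def is_detected(sequence: str) -> bool:
    """Return True when the translated pattern occurs in the sequence."""
    return KAZAL_REGEX.search(sequence) is not None


if __name__ == "__main__":
    for spacer in (2, 3, 6):
        print(spacer, is_detected(synthetic_kazal_core(spacer)))
```

Spacers of two and three residues fall inside the x(2,3) range, while a spacer of six does not, which mirrors the exception listed for the pattern. The profile is not affected by this limit, since it is not a fixed-spacing pattern.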
+-----------------------------------------------------------------------+
{END}
{PDOC00255}
{PS00283; SOYBEAN_KUNITZ}
{BEGIN}
***************************************************************************
* Soybean trypsin inhibitor (Kunitz) protease inhibitors family signature *
***************************************************************************

The soybean trypsin inhibitor (Kunitz) family [1] is one of the numerous families of proteinase inhibitors. It comprises plant proteins which have inhibitory activity against serine proteinases from the trypsin and subtilisin families, thiol proteinases and aspartic proteinases, as well as some proteins that are probably involved in seed storage. This family is currently known to group the following proteins:

- Trypsin inhibitors A, B, C, KTI1, and KTI2 from soybean.
- Trypsin inhibitor DE3 from coral beans (Erythrina sp.).
- Trypsin inhibitor DE5 from sandal bead tree.
- Trypsin inhibitors 1A (WTI-1A), 1B (WTI-1B), and 2 (WTI-2) from goa bean.
- Trypsin inhibitor from Acacia confusa.
- Trypsin inhibitor from silk tree.
- Chymotrypsin inhibitor 3 (WCI-3) from goa bean.
- Cathepsin D inhibitors PDI and NDI from potato [2], which inhibit both cathepsin D (aspartic proteinase) and trypsin.
- Alpha-amylase/subtilisin inhibitors from barley and wheat.
- Albumin-1 (WBA-1) from goa bean seeds [3].
- Miraculin from Richadella dulcifica [4], a sweet taste protein.
- Sporamin from sweet potato [5], the major tuberous root protein.
- Thiol proteinase inhibitor PCPI 8.3 (P340) from potato tuber [6].
- Wound responsive protein gwin3 from poplar tree [7].
- 21 kDa seed protein from cocoa [8].

All these proteins contain from 170 to 200 amino acid residues and one or two intrachain disulfide bonds. The best conserved region is found in their N-terminal section and is used as a signature pattern.
-Consensus pattern: [LIVM]-x-D-{EK}-[EDNTY]-[DG]-[RKHDENQ]-x-[LIVM]-x-{E}-{Q}-x(2)-Y-x-[LIVM]
-Sequences known to belong to this class detected by the pattern: ALL, except for 2 sequences.
-Other sequence(s) detected in Swiss-Prot: 20.
-Last update: April 2006 / Pattern revised.

[ 1] Laskowski M. Jr., Kato I. "Protein inhibitors of proteinases." Annu. Rev. Biochem. 49:593-626(1980). PubMed=6996568; DOI=10.1146/annurev.bi.49.070180.003113
[ 2] Ritonja A., Krizaj I., Mesko P., Kopitar M., Lucovnik P., Strukelj B., Pungercar J., Buttle D.J., Barrett A.J., Turk V. "The amino acid sequence of a novel inhibitor of cathepsin D from potato." FEBS Lett. 267:13-15(1990). PubMed=2365079
[ 3] Kortt A.A., Strike P.M., De Jersey J. "Amino acid sequence of a crystalline seed albumin (winged bean albumin-1) from Psophocarpus tetragonolobus (L.) DC. Sequence similarity with Kunitz-type seed inhibitors and 7S storage globulins." Eur. J. Biochem. 181:403-408(1989). PubMed=2653830
[ 4] Theerasilp S., Hitotsuya H., Nakajo S., Nakaya K., Nakamura Y., Kurihara Y. "Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin." J. Biol. Chem. 264:6655-6659(1989). PubMed=2708331
[ 5] Hattori T., Yoshida N., Nakamura K. "Structural relationship among the members of a multigene family coding for the sweet potato tuberous root storage protein." Plant Mol. Biol. 13:563-572(1989). PubMed=2491673
[ 6] Krizaj I., Drobnic-Kosorok M., Brzin J., Jerala R., Turk V. "The primary structure of inhibitor of cysteine proteinases from potato." FEBS Lett. 333:15-20(1993). PubMed=8224155
[ 7] Bradshaw H.D. Jr., Hollick J.B., Parsons T.J., Clarke H.R.G., Gordon M.P. "Systemically wound-responsive genes in poplar trees encode proteins similar to sweet potato sporamins and legume Kunitz trypsin inhibitors." Plant Mol. Biol. 14:51-59(1990). PubMed=2101311
[ 8] Tai H., McHenry L., Fritz P.J., Furtek D.B. "Nucleic acid sequence of a 21 kDa cocoa seed protein with homology to the soybean trypsin inhibitor (Kunitz) family of protease inhibitors." Plant Mol. Biol. 16:913-915(1991). PubMed=1859871
{END}
{PDOC00256}
{PS00284; SERPIN}
{BEGIN}
*********************
* Serpins signature *
*********************

Serpins (SERine Proteinase INhibitors) [1,2,3,4] are a group of structurally related proteins. They are high molecular weight (400 to 500 amino acids), extracellular, irreversible serine protease inhibitors with a well defined structural-functional characteristic: a reactive region that acts as a 'bait' for an appropriate serine protease. This region is found in the C-terminal part of these proteins. Proteins which are known to belong to the serpin family are listed below (references are only provided for recently determined sequences):

- Alpha-1 protease inhibitor (alpha-1-antitrypsin, contrapsin).
- Alpha-1-antichymotrypsin.
- Antithrombin III.
- Alpha-2-antiplasmin.
- Heparin cofactor II.
- Complement C1 inhibitor.
- Plasminogen activator inhibitors 1 (PAI-1) and 2 (PAI-2).
- Glia derived nexin (GDN) (Protease nexin I).
- Protein C inhibitor.
- Rat hepatocytes SPI-1, SPI-2 and SPI-3 inhibitors.
- Human squamous cell carcinoma antigen (SCCA) which may act in the modulation of the host immune response against tumor cells.
- A lepidopteran protease inhibitor.
- Leukocyte elastase inhibitor which, in contrast to other serpins, is an intracellular protein.
- Neuroserpin [5], a neuronal inhibitor of plasminogen activators and plasmin.
- Cowpox virus crmA [6], an inhibitor of the thiol protease interleukin-1B converting enzyme (ICE). CrmA is the only serpin known to inhibit a non-serine proteinase.
- Probable protease inhibitors from some orthopoxviruses, which may be involved in the regulation of the blood clotting cascade and/or of the complement cascade in the mammalian host.

On the basis of strong sequence similarities, a number of proteins with no known inhibitory activity are said to belong to this family:

- Birds ovalbumin and the related genes X and Y proteins.
- Angiotensinogen; the precursor of the angiotensin active peptide.
- Barley protein Z; the major endosperm albumin.
- Corticosteroid binding globulin (CBG).
- Thyroxine-binding globulin (TBG).
- Sheep uterine milk protein (UTMP) and pig uteroferrin-associated protein (UFAP).
- Hsp47, an endoplasmic reticulum heat-shock protein that binds strongly to collagen and could act as a chaperone in the collagen biosynthetic pathway [7].
- Maspin, which seems to function as a tumor suppressor [8].
- Pigment epithelium-derived factor precursor (PEDF), a protein with a strong neurotrophic activity [9].
- Ep45, an estrogen-regulated protein from Xenopus [10].

We developed a signature pattern for this family of proteins, centered on a well conserved Pro-Phe sequence which is found ten to fifteen residues on the C-terminal side of the reactive bond.

-Consensus pattern: [LIVMFY]-{G}-[LIVMFYAC]-[DNQ]-[RKHQS]-[PST]-F-[LIVMFY]-[LIVMFYC]-x-[LIVMFAH]
-Sequences known to belong to this class detected by the pattern: ALL, except for 7 sequences.
-Other sequence(s) detected in Swiss-Prot: 27.
-Note: In position 6 of the pattern, Pro is found in most serpins.
-Last update: December 2004 / Pattern and text revised.

[ 1] Carrell R., Travis J. Trends Biochem. Sci. 10:20-24(1985).
[ 2] Carrell R.W., Pemberton P.A., Boswell D.R. "The serpins: evolution and adaptation in a family of protease inhibitors." Cold Spring Harb. Symp. Quant. Biol. 52:527-535(1987). PubMed=3502621
[ 3] Huber R., Carrell R.W. "Implications of the three-dimensional structure of alpha 1-antitrypsin for structure and function of serpins." Biochemistry 28:8951-8966(1989). PubMed=2690952
[ 4] Remold-O'Donnell E. FEBS Lett. 315:105-108(1993).
[ 5] Osterwalder T., Contartese J., Stoeckli E.T., Kuhn T.B., Sonderegger P. "Neuroserpin, an axonally secreted serine protease inhibitor." EMBO J. 15:2944-2953(1996). PubMed=8670795
[ 6] Komiyama T., Ray C.A., Pickup D.J., Howard A.D., Thornberry N.A., Peterson E.P., Salvesen G. "Inhibition of interleukin-1 beta converting enzyme by the cowpox virus serpin CrmA. An example of cross-class inhibition." J. Biol. Chem. 269:19331-19337(1994). PubMed=8034697
[ 7] Clarke E.P., Sanwal B.D. "Cloning of a human collagen-binding protein, and its homology with rat gp46, chick hsp47 and mouse J6 proteins." Biochim. Biophys. Acta 1129:246-248(1992). PubMed=1309665
[ 8] Zou Z., Anisowicz A., Hendrix M.J., Thor A., Neveu M., Sheng S., Rafidi K., Seftor E., Sager R. "Maspin, a serpin with tumor-suppressing activity in human mammary epithelial cells." Science 263:526-529(1994). PubMed=8290962
[ 9] Steele F.R., Chader G.J., Johnson L.V., Tombran-Tink J. "Pigment epithelium-derived factor: neurotrophic activity and identification as a member of the serine protease inhibitor gene family." Proc. Natl. Acad. Sci. U.S.A. 90:1526-1530(1993). PubMed=8434014
[10] Holland L.J., Suksang C., Wall A.A., Roberts L.R., Moser D.R., Bhattacharya A. J. Biol. Chem. 267:7053-7059(1992).
{END}
{PDOC00257}
{PS00285; POTATO_INHIBITOR}
{BEGIN}
***************************************
* Potato inhibitor I family signature *
***************************************

The potato inhibitor I family is one of the numerous families of serine proteinase inhibitors. Members of this protein family are found in plants: in the seeds of barley or beans [1,2,3], and in potato or tomato leaves, where they accumulate in response to mechanical damage [4,5]. An inhibitor belonging to this family is also found in leech [6]. It is interesting to note that, currently, this is the only proteinase inhibitor family to be found in both the plant and animal kingdoms.

Structurally these inhibitors are small (60 to 90 residues) and, in contrast with other families of protease inhibitors, they lack disulfide bonds. They have a single inhibitory site. The consensus pattern we developed includes three out of the four residues conserved in all members of this family and is located in the N-terminal half.

-Consensus pattern: [FYW]-P-[EQH]-[LIV](2)-G-x(2)-[STAGV]-x(2)-A
-Sequences known to belong to this class detected by the pattern: ALL, except for barley subtilisin-chymotrypsin inhibitor-2b which has Glu instead of Gly, and a trypsin inhibitor from the cucurbitaceae Momordica charantia [7], which is said to belong to the potato inhibitor I family but which shows only a very weak similarity with the other members of this family.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: June 1994 / Text revised.

[ 1] Svendsen I., Hejgaard J., Chavan J.K. Carlsberg Res. Commun. 49:493-502(1984).
[ 2] Svendsen I., Boisen S., Hejgaard J. Carlsberg Res. Commun. 47:45-53(1982).
[ 3] Nozawa H., Yamagata H., Aizono Y., Yoshikawa M., Iwasaki T. "The complete amino acid sequence of a subtilisin inhibitor from adzuki beans (Vigna angularis)." J. Biochem. 106:1003-1008(1989). PubMed=2628417
[ 4] Cleveland T.E., Thornburg R.W., Ryan C.A. Plant Mol. Biol. 8:199-207(1987).
[ 5] Lee J.S., Brown W.E., Graham J.S., Pearce G., Fox E.A., Dreher T.W., Ahern K.G., Pearson G.D., Ryan C.A. "Molecular characterization and phylogenetic studies of a wound-inducible proteinase inhibitor I gene in Lycopersicon species." Proc. Natl. Acad. Sci. U.S.A. 83:7277-7281(1986). PubMed=3463966
[ 6] Seemuller U., Eulitz M., Fritz H., Strobl A. "Structure of the elastase-cathepsin G inhibitor of the leech Hirudo medicinalis." Hoppe-Seyler's Z. Physiol. Chem. 361:1841-1846(1980). PubMed=6906312
[ 7] Zeng F.-Y., Qian R.-Q., Wang Y. FEBS Lett. 234:35-38(1988).
{END}
{PDOC00258}
{PS00286; SQUASH_INHIBITOR}
{BEGIN}
*********************************************************
* Squash family of serine protease inhibitors signature *
*********************************************************

The squash family of serine protease inhibitors [1] is one of the numerous families of serine proteinase inhibitors. The proteins belonging to this family are found in the seeds of cucurbitaceae plants (cucumber, squash, bitter gourd, etc.).
The basic structure of such a type of inhibitor is shown in the following schematic representation:

  +----------------+
  |                |
  *****************|**
xxCx#xxxxCxxxxxCxxxCxCxxxxxCx
         |     |     |     |
         |     +-----|-----+
         +-----------+
<--------30 residues-------->

'C': conserved cysteine involved in a disulfide bond.
'#': active site residue.
'*': position of the pattern.

The pattern we have used to detect this family of proteins spans the major part of the sequence and includes five of the six cysteines involved in disulfide bonds.

-Consensus pattern: C-P-x(5)-C-x(2)-[DN]-x-D-C-x(3)-C-x-C [The 5 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: May 2004 / Text revised.

[ 1] Otlewski J. "The squash inhibitor family of serine proteinases." Biol. Chem. Hoppe-Seyler 371:23-28(1990). PubMed=2205236
{END}
{PDOC00259}
{PS00287; CYSTATIN}
{BEGIN}
*******************************************
* Cysteine proteases inhibitors signature *
*******************************************

Inhibitors of cysteine proteases [1,2,3], which are found in the tissues and body fluids of animals, in the larva of the worm Onchocerca volvulus [4], as well as in plants, can be grouped into three distinct but related families:

- Type 1 cystatins (or stefins), molecules of about 100 amino acid residues with neither disulfide bonds nor carbohydrate groups.
- Type 2 cystatins, molecules of about 115 amino acid residues which contain one or two disulfide loops near their C-terminus.
- Kininogens, which are multifunctional plasma glycoproteins. They are the precursor of the active peptide bradykinin and play a role in blood coagulation by helping to position optimally prekallikrein and factor XI next to factor XII. They are also inhibitors of cysteine proteases. Structurally, kininogens are made of three contiguous type-2 cystatin domains, followed by an additional domain (of variable length) which contains the sequence of bradykinin. The first of the three cystatin domains seems to have lost its inhibitory activity.

In all these inhibitors, there is a conserved region of five residues which has been proposed to be important for the binding to the cysteine proteases. Our pattern starts one residue before this conserved region.

-Consensus pattern: [GSTEQKRV]-Q-[LIVT]-[VAF]-[SAGQ]-G-{DG}-[LIVMNK]-{TK}-x-[LIVMFY]-{S}-[LIVMFYA]-[DENQKRHSIV]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 11.
-Note: This pattern is always found twice in kininogens.
-Note: Members of the fetuin family (see <PDOC00966>) contain two copies of a cystatin-like domain.
-Expert(s) to contact by email: Turk B.; [email protected]
-Last update: April 2006 / Pattern revised.

[ 1] Barrett A.J. Trends Biochem. Sci. 12:193-196(1987).
[ 2] Rawlings N.D., Barrett A.J. "Evolution of proteins of the cystatin superfamily." J. Mol. Evol. 30:60-71(1990). PubMed=2107324
[ 3] Turk V., Bode W. "The cystatins: protein inhibitors of cysteine proteinases." FEBS Lett. 285:213-219(1991). PubMed=1855589
[ 4] Lustigman S., Brotman B., Huima T., Prince A.M. "Characterization of an Onchocerca volvulus cDNA clone encoding a genus specific antigen present in infective larvae and adult worms." Mol. Biochem. Parasitol. 45:65-75(1991).
PubMed=2052041
{END}
{PDOC00260}
{PS00288; TIMP}
{BEGIN}
*****************************************************
* Tissue inhibitors of metalloproteinases signature *
*****************************************************

Tissue inhibitors of metalloproteinases (TIMP) are a family of proteins [1,2,3] that can form complexes with extracellular matrix metalloproteinases (such as collagenases) and irreversibly inactivate them. TIMP's are proteins of about 200 amino acid residues, 12 of which are cysteines involved in disulfide bonds [4]. The basic structure of such a type of inhibitor is shown in the following schematic representation:

  +-----------------------------+         +--------------+
**|**                           |         |              |
CxCxCxxxxxxxxxxxxxxxxxCxxxxxxxxxCxxxxxxxCxCxCxCxCxxxxxCxxCxxx
|   |                 |                 |   | | |     |
|   +-----------------|-----------------+   +-+ +-----+
+---------------------+

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

As a signature pattern for TIMP's, we chose the N-terminal extremity of these proteins, which includes three conserved cysteines.

-Consensus pattern: C-x-C-x-P-x-H-P-Q-x(2)-[FIV]-C [The 3 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: May 2004 / Text revised.

[ 1] Stetler-Stevenson W.G., Krutzsch H.C., Liotta L.A. J. Biol. Chem. 264:17374-17378(1989).
[ 2] Woessner J.F. Jr. "Matrix metalloproteinases and their inhibitors in connective tissue remodeling." FASEB J. 5:2145-2154(1991). PubMed=1850705
[ 3] Pavloff N., Staskus P.W., Kishnani N.S., Hawkes S.P. "A new inhibitor of metalloproteinases from chicken: ChIMP-3. A third member of the TIMP family." J. Biol. Chem. 267:17321-17326(1992). PubMed=1512267
[ 4] Williamson R.A., Marston F.A.O., Angal S., Koklitis P., Panico M., Morris H.R., Carne A.F., Smith B.J., Harris T.J.R., Freedman R.B. "Disulphide bond assignment in human tissue inhibitor of metalloproteinases (TIMP)." Biochem. J. 268:267-274(1990). PubMed=2163605
{END}
{PDOC00261}
{PS00289; PENTAXIN}
{BEGIN}
*****************************
* Pentaxin family signature *
*****************************

Pentaxins (or pentraxins) [1,2] are a family of proteins which show, under electron microscopy, a discoid arrangement of five noncovalently bound subunits. Proteins known to belong to this family are:

- C-reactive protein (CRP), a protein which, in mammals, is expressed during acute phase response to tissue injury or inflammation. CRP displays several functions associated with host defense: it promotes agglutination, bacterial capsular swelling, phagocytosis and complement fixation through its calcium-dependent binding to phosphorylcholine. CRPs have also been sequenced in an invertebrate, the Atlantic horseshoe crab, where they are a normal constituent of the hemolymph.
- Serum Amyloid P-component (SAP), a precursor of amyloid component P which is found in basement membrane and is associated with amyloid deposits.
- Hamster female protein (FP), a plasma protein whose concentration is altered by sex steroids and stimuli that elicit an acute phase response.

A number of proteins, whose function is not yet clear, contain a C-terminal pentaxin-like domain. These proteins are:

- Human PTX3 (or TSG-14). PTX3 is a cytokine-induced protein.
- Guinea pig apexin [3], a sperm acrosomal protein. Apexin seems to be the ortholog of human neuronal pentraxin II (gene NPTX2) [4].
- Rat neuronal pentaxin I [5].

The sequences of the different members of this family are quite conserved. As a signature, we selected a six residue pattern which includes a cysteine known to be involved in a disulfide bridge in CRPs and SAP.

-Consensus pattern: H-x-C-x-[ST]-W-x-[ST] [The C is involved in a disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: May 2004 / Text revised.

[ 1] Pepys M.B., Baltz M.L. "Acute phase proteins with special reference to C-reactive protein and related proteins (pentaxins) and serum amyloid A protein." Adv. Immunol. 34:141-212(1983). PubMed=6356809
[ 2] Gewurz H., Zhang X.H., Lint T.F. "Structure and function of the pentraxins." Curr. Opin. Immunol. 7:54-64(1995). PubMed=7772283
[ 3] Reid M.S., Blobel C.P. "Apexin, an acrosomal pentaxin." J. Biol. Chem. 269:32615-32620(1994). PubMed=7798266
[ 4] Hsu Y.-C., Perin M.S. "Human neuronal pentraxin II (NPTX2): conservation, genomic structure, and chromosomal localization." Genomics 28:220-227(1995). PubMed=8530029
[ 5] Schlimgen A.K., Helms J.A., Vogel H., Perin M.S. "Neuronal pentraxin, a secreted protein with homology to acute phase proteins of the immune system." Neuron 14:519-526(1995).
PubMed=7695898
{END}
{PDOC00262}
{PS00290; IG_MHC}
{BEGIN}
***************************************************************************
* Immunoglobulins and major histocompatibility complex proteins signature *
***************************************************************************

The basic structure of immunoglobulin (Ig) [1] molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains: alpha, delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and three (in alpha, delta and gamma) or four (in epsilon and mu) constant domains (CH1 to CH4).

The major histocompatibility complex (MHC) molecules are made of two chains. In class I [2] the alpha chain is composed of three extracellular domains, a transmembrane region and a cytoplasmic tail. The beta chain (beta-2-microglobulin) is composed of a single extracellular domain. In class II [3], both the alpha and the beta chains are composed of two extracellular domains, a transmembrane region and a cytoplasmic tail.

It is known [4,5] that the Ig constant chain domains and a single extracellular domain in each type of MHC chains are related. These homologous domains are approximately one hundred amino acids long and include a conserved intradomain disulfide bond.
We developed a small pattern around the C-terminal cysteine involved in this disulfide bond which can be used to detect this category of Ig related proteins.

-Consensus pattern: [FY]-{L}-C-{PGAD}-[VA]-{LC}-H [The C is involved in a disulfide bond]
-Sequences known to belong to this class detected by the pattern:
 Ig heavy chains type Alpha C region  : All, in CH2 and CH3.
 Ig heavy chains type Delta C region  : All, in CH3.
 Ig heavy chains type Epsilon C region: All, in CH1, CH3 and CH4.
 Ig heavy chains type Gamma C region  : All, in CH3 and also CH1 in some cases.
 Ig heavy chains type Mu C region     : All, in CH2, CH3 and CH4.
 Ig light chains type Kappa C region  : In all CL except rabbit and Xenopus.
 Ig light chains type Lambda C region : In all CL except rabbit.
 MHC class I alpha chains             : All, in alpha-3 domains, including in the cytomegalovirus MHC-1 homologous protein [6].
 Beta-2-microglobulin                 : All.
 MHC class II alpha chains            : All, in alpha-2 domains.
 MHC class II beta chains             : All, in beta-2 domains.
-Other sequence(s) detected in Swiss-Prot: 89.
-Last update: April 2006 / Pattern revised.

[ 1] Gough N. Trends Biochem. Sci. 6:203-205(1981).
[ 2] Klein J., Figueroa F. Immunol. Today 7:41-44(1986).
[ 3] Figueroa F., Klein J. Immunol. Today 7:78-81(1986).
[ 4] Orr H.T., Lancet D., Robb R.J., Lopez de Castro J.A., Strominger J.L. "The heavy chain of human histocompatibility antigen HLA-B7 contains an immunoglobulin-like region." Nature 282:266-270(1979). PubMed=388231
[ 5] Cushley W., Owen M.J. Immunol. Today 4:88-92(1983).
[ 6] Beck S., Barrell B.G. "Human cytomegalovirus encodes a glycoprotein homologous to MHC class-I antigens." Nature 331:269-272(1988). PubMed=2827039; DOI=10.1038/331269a0
{END}
{PDOC00263}
{PS00291; PRION_1}
{PS00706; PRION_2}
{BEGIN}
****************************
* Prion protein signatures *
****************************

Prion protein (PrP) [1,2,3] is a small glycoprotein found in high quantity in the brains of humans or animals infected with a number of degenerative neurological diseases such as Kuru, Creutzfeldt-Jakob disease (CJD), scrapie or bovine spongiform encephalopathy (BSE). PrP is encoded in the host genome and expressed both in normal and infected cells. It has a tendency to aggregate, yielding polymers called rods.

Structurally, PrP is a protein consisting of a signal peptide, followed by an N-terminal domain that contains tandem repeats of a short motif (PHGGGWGQ in mammals, PHNPGY in chicken), itself followed by a highly conserved domain of about 140 residues that contains a disulfide bond. Finally comes a C-terminal hydrophobic domain, post-translationally removed when PrP is attached to the extracellular side of the cell membrane by a GPI-anchor. The structure of PrP is shown in the following schematic representation:

+---+----------------+-******-------------------****-----+-----+
|Sig| Tandem repeats |                    C        C    S|     |
+---+----------------+--------------------|--------|----|+-----+
                                          +--------+    |
                                                       GPI

'C': conserved cysteine involved in a disulfide bond.
'*': position of the patterns.

As signature patterns for PrP, we selected a perfectly conserved alanine- and glycine-rich region of 16 residues as well as a region centered on the second cysteine involved in the disulfide bond.

-Consensus pattern: A-G-A-A-A-A-G-A-V-V-G-G-L-G-G-Y
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
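The consensus patterns in these entries use the PROSITE pattern notation. As a rough sketch, and assuming the usual conventions of that notation ('x' for any residue, '[...]' for allowed residues, '{...}' for forbidden residues, '(n)' or '(n,m)' for repetition, '<' and '>' for the N- and C-terminal ends), a pattern can be translated into a Python regular expression as shown below. The function name prosite_to_regex and the test peptide are made up for this illustration; the peptide is a synthetic string, not a real PrP sequence, and rarer constructs (such as '>' inside square brackets) are not handled.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex (illustrative sketch)."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # '<' anchors the pattern at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # '>' anchors the pattern at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repetition count such as (2) or (2,3).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."                   # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # any residue except those listed
        # '[...]' groups and single letters are already valid regex atoms.
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


# First PrP pattern from the entry above, applied to a synthetic peptide.
regex = prosite_to_regex("A-G-A-A-A-A-G-A-V-V-G-G-L-G-G-Y")
peptide = "MK" + "AGAAAAGAVVGGLGGY" + "ML"   # made-up sequence for illustration
hit = re.search(regex, peptide)
print(regex, hit.start() if hit else None)
```

A plain regular expression reports non-overlapping matches only, so a dedicated scanning tool remains the safer choice when every occurrence of a pattern has to be counted.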
-Consensus pattern: E-x-[ED]-x-K-[LIVM](2)-x-[KR]-[LIVM](2)-x-[QE]-M-C-x(2)-Q-Y [C is involved in a disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: November 1997 / Text revised.

[ 1] Stahl N., Prusiner S.B. "Prions and prion proteins." FASEB J. 5:2799-2807(1991). PubMed=1916104
[ 2] Brunori M., Chiara Silvestrini M.C., Pocchiari M. "The scrapie agent and the prion hypothesis." Trends Biochem. Sci. 13:309-313(1988). PubMed=2908696
[ 3] Prusiner S.B. "Scrapie prions." Annu. Rev. Microbiol. 43:345-374(1989). PubMed=2572197; DOI=10.1146/annurev.mi.43.100189.002021
{END}
{PDOC00264}
{PS00292; CYCLINS}
{BEGIN}
*********************
* Cyclins signature *
*********************

Cyclins [1,2,3] are eukaryotic proteins which play an active role in controlling nuclear cell division cycles. Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins:

- G2/M cyclins, essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase).
- G1/S cyclins, essential for the control of the cell cycle at the G1/S (start) transition.

In most species, there are multiple forms of G1 and G2 cyclins.
For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E. A cyclin homolog has also been found in herpesvirus saimiri [4].

The best conserved region is in the central part of the cyclins' sequences, known as the 'cyclin-box', from which we have derived a 32 residue pattern.

-Consensus pattern: R-x(2)-[LIVMSA]-x(2)-[FYWS]-[LIVM]-x(8)-[LIVMFC]-x(4)-[LIVMFYA]-x(2)-[STAGC]-[LIVMFYQ]-x-[LIVMFYC]-[LIVMFY]-D-[RKH]-[LIVMFYW]
-Sequences known to belong to this class detected by the pattern: ALL, except for G1/S cyclins C from human and Drosophila, puc1 and mcs2 from fission yeast and CLG1, PCL1 (HCS26) and PCL2 (CLN4) from budding yeast.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: November 1995 / Pattern and text revised.

[ 1] Nurse P.
     "Universal control mechanism regulating onset of M-phase."
     Nature 344:503-508(1990).
     PubMed=2138713
[ 2] Norbury C., Nurse P.
     "Cyclins and cell cycle control."
     Curr. Biol. 1:23-24(1991).
     PubMed=15336197
[ 3] Lew D.J., Reed S.I.
     "A proliferation of cyclins."
     Trends Cell Biol. 2:77-81(1992).
     PubMed=14731948
[ 4] Nicholas J., Cameron K.R., Honess R.W.
     "Herpesvirus saimiri encodes homologues of G protein-coupled receptors and cyclins."
     Nature 355:362-365(1992).
     PubMed=1309943; DOI=10.1038/355362a0

{END}
{PDOC00265}
{PS01251; PCNA_1}
{PS00293; PCNA_2}
{BEGIN}
*************************************************
* Proliferating cell nuclear antigen signatures *
*************************************************

Proliferating cell nuclear antigen (PCNA) [1,2] is a protein involved in DNA replication by acting as a cofactor for DNA polymerase delta, the polymerase responsible for leading strand DNA replication. A similar protein exists in yeast (gene POL30) [3] and is associated with polymerase III, the yeast analog of polymerase delta.

In baculoviruses the ETL protein has been shown [4] to be highly related to PCNA and is probably associated with the viral encoded DNA polymerase. A homolog of PCNA is also found in archaebacteria.

As signatures for this family of proteins, we selected two conserved regions located in the N-terminal section. The second one has been proposed to bind DNA.

-Consensus pattern: [GSTA]-[LIVMF]-x-[LIVMAS]-x-[GSAVI]-[LIVM]-[DS]-x-[NSAED]-[HKRNS]-[VIT]-x-[LMYF]-[VIGAL]-x-[LIVMF]-x-[LIVM]-x(4)-F
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: [RKA]-C-[DE]-[RH]-x(3)-[LIVMF]-x(3)-[LIVM]-x-[SGAN]-[LIVMF]-x-K-[LIVMF](2)
-Sequences known to belong to this class detected by the pattern: ALL, except for archaebacterial PCNA homologs.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Bravo R., Frank R., Blundell P.A., Mcdonald-Bravo H.
     "Cyclin/PCNA is the auxiliary protein of DNA polymerase-delta."
     Nature 326:515-517(1987).
     PubMed=2882423; DOI=10.1038/326515a0
[ 2] Suzuka I., Hata S., Matsuoka M., Kosugi S., Hashimoto J.
     "Highly conserved structure of proliferating cell nuclear antigen (DNA polymerase delta auxiliary protein) gene in plants."
     Eur. J. Biochem. 195:571-575(1991).
     PubMed=1671766
[ 3] Bauer G.A., Burgers P.M.J.
     "Molecular cloning, structure and expression of the yeast proliferating cell nuclear antigen gene."
     Nucleic Acids Res. 18:261-265(1990).
     PubMed=1970160
[ 4] O'Reilly D.R., Crawford A.M., Miller L.K.
     Nature 337:606-606(1989).

{END}
{PDOC00266}
{PS00294; PRENYLATION}
{BEGIN}
****************************************
* Prenyl group binding site (CAAX box) *
****************************************

A number of eukaryotic proteins are post-translationally modified by the attachment of either a farnesyl or a geranyl-geranyl group to a cysteine residue [1,2,3,4]. The modification occurs on cysteine residues that are three residues away from the C-terminal extremity; the two residues that separate this cysteine from the C-terminal residue are generally aliphatic. This Cys-Ali-Ali-X pattern is generally known as the CAAX box. Proteins known or strongly presumed to be the target of this modification are listed below.

- Ras proteins, and ras-like proteins such as Rho, Rab, Rac, Ral, and Rap.
- Nuclear lamins A and B.
- Some G protein alpha subunits.
- G protein gamma subunits (see <PDOC01002>).
- 2',3'-cyclic nucleotide 3'-phosphodiesterase (EC 3.1.4.37).
- Rhodopsin-sensitive cGMP 3',5'-cyclic nucleotide phosphodiesterase alpha and beta chains (EC 3.1.4.17).
- Rhodopsin kinase (EC 2.7.11.14).
- Some dnaJ-like proteins (such as yeast MAS5/YDJ1).
- A number of fungal mating factors (such as M-factor or rhodotorucine A).
-Consensus pattern: C-{DENQ}-[LIVM]-x>
                    [C is the prenylation site]
-Last update: November 1997 / Text revised.

[ 1] Glomset J.A., Gelb M.H., Farnsworth C.C.
     "Prenyl proteins in eukaryotic cells: a new type of membrane anchor."
     Trends Biochem. Sci. 15:139-142(1990).
     PubMed=2187294
[ 2] Lowy D.R., Willumsen B.M.
     "Protein modification: new clue to Ras lipid glue."
     Nature 341:384-385(1989).
     PubMed=2677741
[ 3] Imagee A.I.
     Biochem. Soc. Trans. 17:875-876(1989).
[ 4] Powers S.
     "Protein prenylation: a modification that sticks."
     Curr. Biol. 1:114-116(1991).
     PubMed=15336183

{END}
{PDOC00267}
{PS00295; ARRESTINS}
{BEGIN}
***********************
* Arrestins signature *
***********************

Arrestin (or S-antigen) [1] is a protein that interacts with light-activated phosphorylated rhodopsin, thereby inhibiting or 'arresting' its ability to interact with transducin. In mammals, arrestin is associated with autoimmune uveitis. Arrestin belongs to a family of closely related proteins including:

- Beta-arrestin-1 and -2, proteins that regulate the function of beta-adrenergic receptors. They bind to the phosphorylated form of the latter, thereby causing a significant impairment of their capacity to activate G(S) proteins.
- Cone photoreceptors C-arrestin (arrestin-X) [2], which could bind to phosphorylated red/green opsins.
- Phosrestins I and II from Drosophila and related insects.
  These proteins undergo light-induced phosphorylation and play an important role in photoreceptor transduction.

Sequence comparison of proteins from the arrestin family shows a high level of conservation. As a signature pattern, we selected a region located in the N-terminal section that contains many charged and hydrophobic residues.

-Consensus pattern: [FY]-R-Y-G-x-[DE](2)-x-[DE]-[LIVM](2)-G-[LIVM]-x-F-x-[RK]-[DEQ]-[LIVM]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Expert(s) to contact by email: Kolakowski L.F. Jr.; [email protected]
-Last update: November 1997 / Pattern and text revised.

[ 1] Wilson C.J., Applebury M.L.
     "Arresting G-protein coupled receptor activity."
     Curr. Biol. 3:683-686(1993).
     PubMed=15335861
[ 2] Craft C.M., Whitmore D.H.
     "The arrestin superfamily: cone arrestins are a fourth family."
     FEBS Lett. 362:247-255(1995).
     PubMed=7720881

{END}
{PDOC00268}
{PS00296; CHAPERONINS_CPN60}
{BEGIN}
*******************************
* Chaperonins cpn60 signature *
*******************************

Chaperonins [1,2] are proteins involved in the folding of proteins or the assembly of oligomeric protein complexes. Their role seems to be to assist other polypeptides to maintain or assume conformations which permit their correct assembly into oligomeric structures. They are found in abundance in prokaryotes, chloroplasts and mitochondria.
Chaperonins form oligomeric complexes and are composed of two different types of subunits: a 60 Kd protein, known as cpn60 (groEL in bacteria) and a 10 Kd protein, known as cpn10 (groES in bacteria).

The cpn60 protein shows weak ATPase activity and is a highly conserved protein of about 550 to 580 amino acid residues which has been described by different names in different species:

- Escherichia coli groEL protein, which is essential for the growth of the bacteria and the assembly of several bacteriophages.
- Cyanobacterial groEL analogues.
- Mycobacterium tuberculosis and leprae 65 Kd antigen, Coxiella burnetii heat shock protein B (gene htpB), Rickettsia tsutsugamushi major antigen 58, and Chlamydial 57 Kd hypersensitivity antigen (gene hypB).
- Chloroplast RuBisCO subunit binding-protein alpha and beta chains, which bind ribulose bisphosphate carboxylase small and large subunits and are implicated in the assembly of the enzyme oligomer.
- Mammalian mitochondrial matrix protein P1 (mitonin or P60).
- Yeast HSP60 protein, a mitochondrial assembly factor.

As a signature pattern for these proteins, we have chosen a rather well conserved region of twelve residues, located in the last third of the cpn60 sequence.

-Consensus pattern: A-[AS]-{L}-[DEQ]-E-{A}-{Q}-{R}-x-G(2)-[GA]
-Sequences known to belong to this class detected by the pattern: ALL, except for 5 sequences.
-Other sequence(s) detected in Swiss-Prot: 4.
-Expert(s) to contact by email: Georgopoulos C.; [email protected]
-Last update: April 2006 / Pattern revised.

[ 1] Ellis R.J., van der Vies S.M.
     "Molecular chaperones."
     Annu. Rev. Biochem. 60:321-347(1991).
     PubMed=1679318; DOI=10.1146/annurev.bi.60.070191.001541
[ 2] Zeilsta-Ryalls J., Fayet O., Georgopoulos C.
     Annu. Rev. Microbiol. 45:301-325(1991).

{END}
{PDOC00269}
{PS00297; HSP70_1}
{PS00329; HSP70_2}
{PS01036; HSP70_3}
{BEGIN}
***********************************************
* Heat shock hsp70 proteins family signatures *
***********************************************

Prokaryotic and eukaryotic organisms respond to heat shock or other environmental stress by the induction of the synthesis of proteins collectively known as heat-shock proteins (hsp) [1]. Amongst them is a family of proteins with an average molecular weight of 70 Kd, known as the hsp70 proteins [2,3,4]. In most species, there are many proteins that belong to the hsp70 family. Some of them are expressed under unstressed conditions. Hsp70 proteins can be found in different cellular compartments (nuclear, cytosolic, mitochondrial, endoplasmic reticulum, etc.). Some of the hsp70 family proteins are listed below:

- In Escherichia coli and other bacteria, the main hsp70 protein is known as the dnaK protein. A second protein, hscA, has been recently discovered. dnaK is also found in the chloroplast genome of red algae.
- In yeast, at least ten hsp70 proteins are known to exist: SSA1 to SSA4, SSB1, SSB2, SSC1, SSD1 (KAR2), SSE1 (MSI3) and SSE2.
- In Drosophila, there are at least eight different hsp70 proteins: HSP70, HSP68, and HSC-1 to HSC-6.
- In mammals, there are at least eight different proteins: HSPA1 to HSPA6, HSC70, and GRP78 (also known as the immunoglobulin heavy chain binding protein (BiP)).
- In the sugar beet yellow virus (SBYV), a hsp70 homolog has been shown [5] to exist.
- In archaebacteria, hsp70 proteins are also present [6].
All proteins belonging to the hsp70 family bind ATP. A variety of functions has been postulated for hsp70 proteins. It now appears [7] that some hsp70 proteins play an important role in the transport of proteins across membranes. They also seem to be involved in protein folding and in the assembly/disassembly of protein complexes [8].

We have derived three signature patterns for the hsp70 family of proteins: the first is centered on a conserved pentapeptide found in the N-terminal section of these proteins; the two others are based on conserved regions located in the central part of the sequence.

-Consensus pattern: [IV]-D-L-G-T-[ST]-x-[SC]
-Sequences known to belong to this class detected by the pattern: ALL, except for 16 sequences.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: [LIVMF]-[LIVMFY]-[DN]-[LIVMFS]-G-[GSH]-[GS]-[AST]-x(3)-[ST]-[LIVM]-[LIVMFC]
-Sequences known to belong to this class detected by the pattern: ALL, except for 7 sequences.
-Other sequence(s) detected in Swiss-Prot: 1.

-Consensus pattern: [LIVMY]-x-[LIVMF]-x-G-G-x-[ST]-{LS}-[LIVM]-P-x-[LIVM]-x-[DEQKRSTA]
-Sequences known to belong to this class detected by the pattern: ALL, except for 4 sequences.
-Other sequence(s) detected in Swiss-Prot: 6.
-Last update: December 2004 / Pattern and text revised.

[ 1] Lindquist S., Craig E.A.
     "The heat-shock proteins."
     Annu. Rev. Genet. 22:631-677(1988).
     PubMed=2853609; DOI=10.1146/annurev.ge.22.120188.003215
[ 2] Pelham H.R.B.
     "Speculations on the functions of the major heat shock and glucose-regulated proteins."
     Cell 46:959-961(1986).
     PubMed=2944601
[ 3] Pelham H.
     "Heat-shock proteins. Coming in from the cold."
     Nature 332:776-777(1988).
     PubMed=3282176; DOI=10.1038/332776a0
[ 4] Craig E.A.
     "Essential roles of 70kDa heat inducible proteins."
     BioEssays 11:48-52(1989).
     PubMed=2686623
[ 5] Agranovsky A.A., Boyko V.P., Karasev A.V., Koonin E.V., Dolja V.V.
     "Putative 65 kDa protein of beet yellows closterovirus is a homologue of HSP70 heat shock proteins."
     J. Mol. Biol. 217:603-610(1991).
     PubMed=2005613
[ 6] Gupta R.S., Singh B.
     "Cloning of the HSP70 gene from Halobacterium marismortui: relatedness of archaebacterial HSP70 to its eubacterial homologs and a model for the evolution of the HSP70 gene."
     J. Bacteriol. 174:4594-4605(1992).
     PubMed=1624448
[ 7] Deshaies R.J., Koch B.D., Schekman R.
     "The role of stress proteins in membrane biogenesis."
     Trends Biochem. Sci. 13:384-388(1988).
     PubMed=3072700
[ 8] Craig E.A., Gross C.A.
     "Is hsp70 the cellular thermometer?"
     Trends Biochem. Sci. 16:135-140(1991).
     PubMed=1877088

{END}
{PDOC00270}
{PS00298; HSP90}
{BEGIN}
**********************************************
* Heat shock hsp90 proteins family signature *
**********************************************

Prokaryotic and eukaryotic organisms respond to heat shock or other environmental stress by the induction of the synthesis of proteins collectively known as heat-shock proteins (hsp) [1]. Amongst them is a family of proteins, with an average molecular weight of 90 Kd, known as the hsp90 proteins. Proteins known to belong to this family are:

- Escherichia coli and other bacteria heat shock protein c62.5 (gene htpG).
- Vertebrate hsp 90-alpha (hsp 86) and hsp 90-beta (hsp 84).
- Drosophila hsp 82 (hsp 83).
- Trypanosoma cruzi hsp 85.
- Plants Hsp82 or Hsp83.
- Yeast and other fungi HSC82 and HSP82.
- The endoplasmic reticulum protein 'endoplasmin' (also known as Erp99 in mouse, GRP94 in hamster, and hsp 108 in chicken).

The exact function of hsp90 proteins is not yet known. In higher eukaryotes, hsp90 has been found associated with steroid hormone receptors, with tyrosine kinase oncogene products of several retroviruses, with eIF2alpha kinase, and with actin and tubulin. Hsp90 are probable chaperonins that possess ATPase activity [2,3].

As a signature pattern for the hsp90 family of proteins, we have selected a highly conserved region found in the N-terminal part of these proteins.

-Consensus pattern: Y-x-[NQHD]-[KHR]-[DE]-[IVA]-F-[LM]-R-[ED]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Lindquist S., Craig E.A.
     "The heat-shock proteins."
     Annu. Rev. Genet. 22:631-677(1988).
     PubMed=2853609; DOI=10.1146/annurev.ge.22.120188.003215
[ 2] Nadeau K., Das A., Walsh C.T.
     "Hsp90 chaperonins possess ATPase activity and bind heat shock transcription factors and peptidyl prolyl isomerases."
     J. Biol. Chem. 268:1479-1487(1993).
     PubMed=8419347
[ 3] Jakob U., Buchner J.
     "Assisting spontaneity: the role of Hsp90 and small Hsps as molecular chaperones."
     Trends Biochem. Sci. 19:205-211(1994).
     PubMed=7914036

{END}
{PDOC00271}
{PS00299; UBIQUITIN_1}
{PS50053; UBIQUITIN_2}
{BEGIN}
******************************************
* Ubiquitin domain signature and profile *
******************************************

Ubiquitin [1,2,3] is a protein of seventy six amino acid residues, found in all eukaryotic cells and whose sequence is extremely well conserved from protozoan to vertebrates. It plays a key role in a variety of cellular processes, such as ATP-dependent selective degradation of cellular proteins, maintenance of chromatin structure, regulation of gene expression, stress response and ribosome biogenesis.

In most species, there are many genes coding for ubiquitin. However they can be classified into two classes. The first class produces polyubiquitin molecules consisting of exact head to tail repeats of ubiquitin. The number of repeats is variable (up to twelve in a Xenopus gene). In the majority of polyubiquitin precursors, there is a final amino-acid after the last repeat. The second class of genes produces precursor proteins consisting of a single copy of ubiquitin fused to a C-terminal extension protein (CEP). There are two types of CEP proteins and both seem to be ribosomal proteins.

Ubiquitin is a globular protein, the last four C-terminal residues (Leu-Arg-Gly-Gly) extending from the compact structure to form a 'tail', important for its function. The latter is mediated by the covalent conjugation of ubiquitin to target proteins, by an isopeptide linkage between the C-terminal glycine and the epsilon amino group of lysine residues in the target proteins.

There are a number of proteins which are evolutionarily related to ubiquitin:

- Ubiquitin-like proteins from baculoviruses as well as in some strains of bovine viral diarrhea viruses (BVDV). These proteins are highly similar to their eukaryotic counterparts.
- Mammalian protein GDX [4].
  GDX is composed of two domains, an N-terminal ubiquitin-like domain of 74 residues and a C-terminal domain of 83 residues with some similarity with the thyroglobulin hormonogenic site.
- Mammalian protein FAU [5]. FAU is a fusion protein which consists of an N-terminal ubiquitin-like protein of 74 residues fused to ribosomal protein S30.
- Mouse protein NEDD-8 [6], a ubiquitin-like protein of 81 residues.
- Human protein BAT3, a large fusion protein of 1132 residues that contains an N-terminal ubiquitin-like domain.
- Caenorhabditis elegans protein ubl-1 [7]. Ubl-1 is a fusion protein which consists of an N-terminal ubiquitin-like protein of 70 residues fused to ribosomal protein S27A.
- Yeast DNA repair protein RAD23 [8]. RAD23 contains an N-terminal domain that seems to be distantly, yet significantly, related to ubiquitin.
- Mammalian RAD23-related proteins RAD23A and RAD23B.
- Mammalian BCL-2 binding athanogene-1 (BAG-1). BAG-1 is a protein of 274 residues that contains a central ubiquitin-like domain.
- Human spliceosome associated protein 114 (SAP 114 or SF3A120).
- Yeast protein DSK2, a protein involved in spindle pole body duplication and which contains an N-terminal ubiquitin-like domain.
- Human protein CKAP1/TFCB, Schizosaccharomyces pombe protein alp11 and Caenorhabditis elegans hypothetical protein F53F4.3. These proteins contain an N-terminal ubiquitin domain and a C-terminal CAP-Gly domain (see <PDOC00660>).
- Schizosaccharomyces pombe hypothetical protein SpAC26A3.16. This protein contains an N-terminal ubiquitin domain.
- Yeast protein SMT3.
- Human ubiquitin-like proteins SMT3A and SMT3B.
- Human ubiquitin-like protein SMT3C (also known as PIC1; Ubl1, Sumo-1; Gmp-1 or Sentrin). This protein is involved in targeting ranGAP1 to the nuclear pore complex protein ranBP2.
- SMT3-like proteins in plants and Caenorhabditis elegans.
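
The consensus patterns quoted in these entries use the PROSITE pattern syntax: elements are separated by '-', 'x' stands for any residue, '[...]' lists the accepted residues, '{...}' lists the excluded ones, '(n)' or '(n,m)' gives a repeat count, and '<' or '>' anchors the pattern to the N- or C-terminus. As a rough illustration of how such a pattern can be applied to a sequence, the sketch below translates it into a Python regular expression. The names prosite_to_regex and find_matches are hypothetical, the toy sequence is constructed only to satisfy the pattern, and rarer syntax (such as '>' inside square brackets) is not handled; this is not the scanning code used by PROSITE.

```python
import re

# One pattern element: x, a residue, [accepted] or {excluded}, plus optional (n) or (n,m)
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern string into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")
    prefix = suffix = ""
    if pattern.startswith("<"):      # N-terminal anchor
        prefix, pattern = "^", pattern[1:]
    if pattern.endswith(">"):        # C-terminal anchor
        suffix, pattern = "$", pattern[:-1]
    parts = []
    for element in pattern.split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = m.groups()
        if core == "x":
            core = "."                           # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"       # excluded residues
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return prefix + "".join(parts) + suffix


def find_matches(pattern: str, sequence: str):
    """Return (start, end) 0-based half-open spans of non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.span() for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    # Ubiquitin pattern of this entry, written with a hyphen between every element.
    ubiquitin = ("K-x(2)-[LIVM]-x-[DESAK]-x(3)-[LIVM]-[PAQ]-x(3)-Q-x-[LIVM]-"
                 "[LIVMC]-[LIVMFY]-x-G-x(4)-[DE]")
    toy = "MAA" + "KAKIQDKEGIPPDQQRLIFAGKQLED" + "GG"  # made-up test sequence
    print(prosite_to_regex(ubiquitin))
    print(find_matches(ubiquitin, toy))
```

Because re.finditer reports non-overlapping hits, overlapping occurrences of a pattern would need a lookahead-based search instead; that refinement is left out to keep the sketch short.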
To identify ubiquitin and related proteins we have developed a pattern based on conserved positions in the central section of the sequence. A profile was also developed that spans the complete length of the ubiquitin domain.

-Consensus pattern: K-x(2)-[LIVM]-x-[DESAK]-x(3)-[LIVM]-[PAQ]-x(3)-Q-x-[LIVM]-[LIVMC]-[LIVMFY]-x-G-x(4)-[DE]
-Sequences known to belong to this class detected by the pattern: ALL, except for the RAD23 and SMT3 subfamilies, BAG-1 and SAP 114.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: 1.
-Last update: December 2004 / Pattern and text revised.

[ 1] Jentsch S., Seufert W., Hauser H.-P.
     "Genetic analysis of the ubiquitin system."
     Biochim. Biophys. Acta 1089:127-139(1991).
     PubMed=1647207
[ 2] Monia B.P., Ecker D.J., Croke S.T.
     Bio/Technology 8:209-215(1990).
[ 3] Finley D., Varshavsky A.
     Trends Biochem. Sci. 10:343-347(1985).
[ 4] Filippi M., Tribioli C., Toniolo D.
     "Linkage and sequence conservation of the X-linked genes DXS253E (P3) and DXS254E (GdX) in mouse and man."
     Genomics 7:453-457(1990).
     PubMed=1973144
[ 5] Olvera J., Wool I.G.
     "The carboxyl extension of a ubiquitin-like protein is rat ribosomal protein S30."
     J. Biol. Chem. 268:17967-17974(1993).
     PubMed=8394356
[ 6] Kumar S., Yoshida Y., Noda M.
     "Cloning of a cDNA which encodes a novel ubiquitin-like protein."
     Biochem. Biophys. Res. Commun. 195:393-399(1993).
     PubMed=8395831
[ 7] Jones D., Candido E.P.
     "Novel ubiquitin-like ribosomal protein fusion genes from the nematodes Caenorhabditis elegans and Caenorhabditis briggsae."
     J. Biol. Chem. 268:19545-19551(1993).
     PubMed=7690036
[ 8] Melnick L., Sherman F.
     "The gene clusters ARC and COR on chromosomes 5 and 10, respectively, of Saccharomyces cerevisiae share a common ancestry."
     J. Mol. Biol. 233:372-388(1993).
     PubMed=8411151

{END}
{PDOC00272}
{PS00300; SRP54}
{BEGIN}
****************************************************
* SRP54-type proteins GTP-binding domain signature *
****************************************************

The signal recognition particle (SRP) is an oligomeric complex that mediates targeting and insertion of the signal sequence of exported proteins into the membrane of the endoplasmic reticulum. SRP consists of a 7S RNA and six protein subunits. One of these subunits, the 54 Kd protein (SRP54), is a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. The N-terminal 300 residues of SRP54 include the GTP-binding site (G-domain) and are evolutionarily related to similar domains in other proteins which are listed below [1].

- Escherichia coli and Bacillus subtilis ffh protein (P48), a protein which seems to be the prokaryotic counterpart of SRP54. Ffh is associated with a 4.5S RNA in the prokaryotic SRP complex.
- Signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane. The G-domain is located at the C-terminal extremity of the protein.
- Bacterial ftsY protein, a protein which is believed to play a similar role to that of the docking protein in eukaryotes. The G-domain is located at the C-terminal extremity of the protein.
- The pilA protein from Neisseria gonorrhoeae which seems to be the homolog of ftsY.
- A protein from the archaebacteria Sulfolobus solfataricus. This protein is also believed to be a docking protein. The G-domain is also at the C-terminus.
- Bacterial flagellar biosynthesis protein flhF.

The best conserved regions in those domains are the sequence motifs that are part of the GTP-binding site, but as those regions are not specific to these proteins, we did not use them as a signature pattern. Instead, we selected a conserved region located at the C-terminal end of the domain.

-Consensus pattern: P-[LIVM]-x-[FYL]-[LIVMAT]-[GS]-{Q}-[GS]-[EQ]-x-{K}-x(2)-[LIVMF]
-Sequences known to belong to this class detected by the pattern: ALL, except for flhF.
-Other sequence(s) detected in Swiss-Prot: 9.
-Last update: December 2004 / Pattern and text revised.

[ 1] Althoff S., Selinger D., Wise J.A.
     "Molecular evolution of SRP cycle components: functional implications."
     Nucleic Acids Res. 22:1933-1947(1994).
     PubMed=7518075

{END}
{PDOC00273}
{PS00301; EFACTOR_GTP}
{BEGIN}
********************************************
* GTP-binding elongation factors signature *
********************************************

Elongation factors [1,2] are proteins catalyzing the elongation of peptide chains in protein biosynthesis.
In both prokaryotes and eukaryotes, there are three distinct types of elongation factors, as described in the following table:

-------------------------------------------------------------------------
Eukaryotes   Prokaryotes  Function
-------------------------------------------------------------------------
EF-1alpha    EF-Tu        Binds GTP and an aminoacyl-tRNA; delivers the
                          latter to the A site of ribosomes.
EF-1beta     EF-Ts        Interacts with EF-1a/EF-Tu to displace GDP and
                          thus allows the regeneration of GTP-EF-1a.
EF-2         EF-G         Binds GTP and peptidyl-tRNA and translocates the
                          latter from the A site to the P site.
-------------------------------------------------------------------------

The GTP-binding elongation factor family also includes the following proteins:

- Eukaryotic peptide chain release factor GTP-binding subunits [3]. These proteins interact with release factors that bind to ribosomes that have encountered a stop codon at their decoding site and help them to induce release of the nascent polypeptide. The yeast protein was known as SUP2 (and also as SUP35, SUF12 or GST1) and the human homolog as GST1-Hs.
- Prokaryotic peptide chain release factor 3 (RF-3) (gene prfC). RF-3 is a class-II RF, a GTP-binding protein that interacts with class I RFs (see <PDOC00607>) and enhances their activity [4].
- Prokaryotic GTP-binding protein lepA and its homolog in yeast (gene GUF1) and in Caenorhabditis elegans (ZK1236.1).
- Yeast HBS1 [5].
- Rat statin S1 [6], a protein of unknown function which is highly similar to EF-1alpha.
- Prokaryotic selenocysteine-specific elongation factor selB [7], which seems to replace EF-Tu for the insertion of selenocysteine directed by the UGA codon.
- The tetracycline resistance proteins tetM/tetO [8,9] from various bacteria such as Campylobacter jejuni, Enterococcus faecalis, Streptococcus mutans and Ureaplasma urealyticum. Tetracycline binds to the prokaryotic ribosomal 30S subunit and inhibits binding of aminoacyl-tRNAs.
  These proteins abolish the inhibitory effect of tetracycline on protein synthesis.
- Rhizobium nodulation protein nodQ [10].
- Escherichia coli hypothetical protein yihK [11].

In EF-1-alpha, a specific region has been shown [12] to be involved in a conformational change mediated by the hydrolysis of GTP to GDP. This region is conserved in both EF-1alpha/EF-Tu as well as EF-2/EF-G and thus seems typical for GTP-dependent proteins which bind non-initiator tRNAs to the ribosome. The pattern we developed for this family of proteins includes that conserved region.

-Consensus pattern: D-[KRSTGANQFYW]-x(3)-E-[KRAQ]-x-[RKQD]-[GC]-[IVMK]-[ST]-[IV]-x(2)-[GSTACKRNQ]
-Sequences known to belong to this class detected by the pattern: ALL, except for 11 sequences.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: November 1997 / Text revised.

[ 1] Concise Encyclopedia Biochemistry, Second Edition, Walter de Gruyter, Berlin New-York (1988).
[ 2] Moldave K.
     Annu. Rev. Biochem. 54:1109-1149(1985).
[ 3] Stansfield I., Jones K.M., Kushnirov V.V., Dagkesamanskaya A.R., Poznyakovski A.I., Paushkin S.V., Nierras C.R., Cox B.S., Ter-Avanesyan M.D., Tuite M.F.
     "The products of the SUP45 (eRF1) and SUP35 genes interact to mediate translation termination in Saccharomyces cerevisiae."
     EMBO J. 14:4365-4373(1995).
     PubMed=7556078
[ 4] Grentzmann G., Brechemier-Baey D., Heurgue-Hamard V., Buckingham R.H.
     "Function of polypeptide chain release factor RF-3 in Escherichia coli. RF-3 action in termination is predominantly at UGA-containing stop signals."
     J. Biol. Chem. 270:10595-10600(1995).
     PubMed=7737996
[ 5] Nelson R.J., Ziegelhoffer T., Nicolet C., Werner-Washburne M., Craig E.A.
     "The translation machinery and 70 kd heat shock protein cooperate in protein synthesis."
     Cell 71:97-105(1992).
     PubMed=1394434
[ 6] Ann D.K., Moutsatsos I.K., Nakamura T., Lin H.H., Mao P.-L., Lee M.J., Chin S., Liem R.K.H., Wang E.
     "Isolation and characterization of the rat chromosomal gene for a polypeptide (pS1) antigenically related to statin."
     J. Biol. Chem. 266:10429-10437(1991).
     PubMed=1709933
[ 7] Forchammer K., Leinfeldr W., Bock A.
     Nature 342:453-456(1989).
[ 8] Manavathu E.K., Hiratsuka K., Taylor D.E.
     "Nucleotide sequence analysis and expression of a tetracycline-resistance gene from Campylobacter jejuni."
     Gene 62:17-26(1988).
     PubMed=2836268
[ 9] LeBlanc D.J., Lee L.N., Titmas B.M., Smith C.J., Tenover F.C.
     "Nucleotide sequence analysis of tetracycline resistance gene tetO from Streptococcus mutans DL5."
     J. Bacteriol. 170:3618-3626(1988).
     PubMed=2841293
[10] Cervantes E., Sharma S.B., Maillet F., Vasse J., Truchet G., Rosenberg C.
     Mol. Microbiol. 3:745-755(1989).
[11] Plunkett G. III, Burland V.D., Daniels D.L., Blattner F.R.
     Nucleic Acids Res. 21:3391-3398(1993).
[12] Moller W., Schipper A., Amons R.
     Biochimie 69:983-989(1987).

{END}
{PDOC00274}
{PS00302; IF5A_HYPUSINE}
{BEGIN}
******************************************************
* Eukaryotic initiation factor 5A hypusine signature *
******************************************************

Eukaryotic initiation factor 5A (eIF-5A) (formerly known as eIF-4D) [1,2] is a small protein whose precise role in the initiation of protein synthesis is not known. It appears to promote the formation of the first peptide bond. eIF-5A seems to be the only eukaryotic protein to contain a hypusine residue.
Hypusine is derived from lysine by the post-translational addition of a butylamino group (from spermidine) to the epsilon-amino group of lysine. The hypusine group is essential to the function of eIF-5A. A hypusine-containing protein has been found in archaebacteria such as Sulfolobus acidocaldarius or Methanococcus jannaschii; this protein is highly similar to eIF-5A and could play a similar role in protein biosynthesis. The signature we developed for eIF-5A is centered around the hypusine residue. -Consensus pattern: [PT]-G-K-H-G-x-A-K [The first K is modified to hypusine] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: November 1997 / Pattern and text revised. [ 1] Park M.H., Wolff E.C., Folk J.E. "Hypusine: its post-translational formation in eukaryotic initiation factor 5A and its potential role in cellular regulation." Biofactors 4:95-104(1993). PubMed=8347280 [ 2] Schnier J., Schwelberger H.G., Smit-McBride Z., Kang H.A., Hershey J.W. "Translation initiation factor 5A and its hypusine modification are essential for cell viability in the yeast Saccharomyces cerevisiae." Mol. Cell. Biol. 11:3105-3114(1991). PubMed=1903841 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+
{END}
{PDOC00275}
{PS00303; S100_CABP}
{BEGIN}
******************************************************
* S-100/ICaBP type calcium binding protein signature *
******************************************************

S-100 are small dimeric acidic calcium and zinc-binding proteins [1] abundant in the brain. They have two different types of calcium-binding sites: a low affinity one with a special structure and a 'normal' EF-hand type high affinity site. The vitamin-D dependent intestinal calcium-binding proteins (ICaBP or calbindin 9 Kd) also belong to this family of proteins, but they do not form dimers. In the past years the sequences of many new members of this family have been determined (for reviews see [2,3,4]); in most cases the function of these proteins is not yet known, although it is becoming clear that they are involved in cell growth and differentiation, cell cycle regulation and metabolic control. These proteins are:

- Calcyclin (Prolactin receptor associated protein (PRA); clatropin; 2a9; 5B10; S100A6).
- Calpactin I light chain (p10; p11; 42c; S100A10).
- Calgranulin A (cystic fibrosis antigen (CFAg); MIF related protein 8 (MRP8); p8; S100A8).
- Calgranulin B (MIF related protein 14 (MRP-14); p14; S100A9).
- Calgranulin C.
- Calgizzarin (S100C).
- Placental calcium-binding protein (CAPL) (18a2; peL98; 42a; p9K; MTS1; metastatin; S100A4).
- Protein S-100D (S100A5).
- Protein S-100E (S100A3).
- Protein S-100L (CAN19; S100A2).
- Placental protein S-100P (S100E).
- Psoriasin (S100A7).
- Chemotactic cytokine CP-10 [5].
- Protein MRP-126 [6].
- Trichohyalin [7]. This is a large intermediate filament-associated protein that associates with keratin intermediate filaments (KIF); it contains a S100 type domain in its N-terminal extremity.

A number of these proteins are known to bind calcium while others are not (p10 for example).
Our EF-hand detecting pattern (see <PDOC00018>) will fail to pick up those proteins which have lost their calcium-binding properties. We developed a pattern which unambiguously picks up proteins belonging to this family. This pattern spans the region of the EF-hand high affinity site but makes no assumptions on the calcium-binding properties of this site.

-Consensus pattern: [LIVMFYW](2)-x(2)-[LKQ]-D-x(3)-[DN]-x(3)-[DNSG]-[FY]-x-[ES]-[FYVC]-x(2)-[LIVMFS]-[LIVMF]
-Sequences known to belong to this class detected by the pattern: ALL, except for 5 sequences.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Expert(s) to contact by email:
 Cox J.A.; [email protected]
 Kretsinger R.H.; [email protected]
-Last update: April 2006 / Pattern revised.

[ 1] Baudier J. (In) Calcium and Calcium Binding proteins, Gerday C., Bollis L., Giller R., Eds., pp102-113, Springer Verlag, Berlin, (1988).
[ 2] Moncrief N.D., Kretsinger R.H., Goodman M. J. Mol. Evol. 30:522-562(1990).
[ 3] Kligman D., Hilt D.C. "The S100 protein family." Trends Biochem. Sci. 13:437-443(1988). PubMed=3075365
[ 4] Schaefer B.W., Wicki R., Engelkamp D., Mattei M.-G., Heizmann C.W. Genomics 25:638-643(1995).
[ 5] Lackmann M., Cornish C.J., Simpson R.J., Moritz R.L., Geczy C.L. "Purification and structural analysis of a murine chemotactic cytokine (CP-10) with sequence homology to S100 proteins." J. Biol. Chem. 267:7499-7504(1992). PubMed=1559987
[ 6] Nakano T., Graf T. "Identification of genes differentially expressed in two types of v-myb-transformed avian myelomonocytic cells." Oncogene 7:527-534(1992). PubMed=1549365
[ 7] Lee S.-C., Kim I.-G., Marekov L.N., O'Keefe E.J., Parry D.A.D., Steinert P.M. "The structure of human trichohyalin. Potential multiple roles as a functional EF-hand-like calcium-binding protein, a cornified cell envelope precursor, and an intermediate filament-associated (cross-linking) protein." J. Biol. Chem. 268:12164-12176(1993).
PubMed=7685034
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00276}
{PS00304; SASP_1}
{PS00684; SASP_2}
{BEGIN}
*******************************************************************
* Small, acid-soluble spore proteins, alpha/beta type, signatures *
*******************************************************************

Small, acid-soluble spore proteins (SASP or ASSP) [1,2] are proteins found in the spores of bacteria of the genera Bacillus, Thermoactinomyces, and Clostridium. SASP are bound to spore DNA. They are double-stranded DNA-binding proteins that cause DNA to change to an A-like conformation. They protect the DNA backbone from chemical and enzymatic cleavage and are thus involved in the dormant spore's high resistance to UV light. SASP are degraded in the first minutes of spore germination and provide amino acids for both new protein synthesis and metabolism.

There are two distinct families of SASP: the alpha/beta type and the gamma-type. Alpha/beta SASP are small proteins of about sixty to seventy amino acid residues. They are generally coded by a multigene family. Two regions of alpha/beta SASP are particularly well conserved: the first region is located in the N-terminal half and contains the site which is cleaved by a SASP-specific protease that acts during germination; the second region is located in the C-terminal section and is probably involved in DNA-binding. We selected both regions as signature patterns for these proteins.
-Consensus pattern: K-x-E-[LIV]-A-x-[DE]-[LIVMF]-G-[LIVMF] [The cleavage site is between the first E and I/V]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 2.

-Consensus pattern: [KRC]-[SAQ]-x-G-x-[VF]-G-[GA]-x-[LIVM]-x-[KR]-[KRC]-[LIVM](2)
-Sequences known to belong to this class detected by the pattern: ALL, except for Bacillus sspF.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Setlow P. "Small, acid-soluble spore proteins of Bacillus species: structure, synthesis, genetics, function, and degradation." Annu. Rev. Microbiol. 42:319-338(1988). PubMed=3059997; DOI=10.1146/annurev.mi.42.100188.001535
[ 2] Setlow P. "I will survive: protecting and repairing spore DNA." J. Bacteriol. 174:2737-2741(1992). PubMed=1569005
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00277}
{PS00306; CASEIN_ALPHA_BETA}
{BEGIN}
********************************
* Caseins alpha/beta signature *
********************************

Caseins [1] are the major protein constituent of milk. Caseins can be classified into two families; the first consists of the kappa-caseins, and the second groups the alpha-s1, alpha-s2, and beta-caseins. The alpha/beta caseins are a rapidly diverging family of proteins. However, two regions are conserved: a cluster of phosphorylated serine residues and the signal sequence.
The signature pattern we selected for this family of proteins is based on the last eight residues of the signal sequence.

-Consensus pattern: C-L-[LV]-A-x-A-[LVF]-A
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 5.
-Note: Alpha-s2 casein is known as epsilon-casein in mouse, gamma-casein in rat and casein-A in guinea pig. Alpha-s1 casein is known as alpha-casein in rat and rabbit and as casein-B in guinea pig.
-Last update: December 1992 / Text revised.

[ 1] Holt C., Sawyer L. "Primary and predicted secondary structures of the caseins in relation to their biological functions." Protein Eng. 2:251-259(1988). PubMed=3074304
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00278}
{PS00307; LECTIN_LEGUME_BETA}
{PS00308; LECTIN_LEGUME_ALPHA}
{BEGIN}
*****************************
* Legume lectins signatures *
*****************************

Leguminous plants synthesize sugar-binding proteins which are called legume lectins [1,2]. These lectins are generally found in the seeds. The exact function of legume lectins is not known but they may be involved in the attachment of nitrogen-fixing bacteria to legumes and in the protection against pathogens.

Legume lectins bind calcium and manganese (or other transition metals). Legume lectins are synthesized as precursor proteins of about 230 to 260 amino acid residues.
Some legume lectins are proteolytically processed to produce two chains: beta (which corresponds to the N-terminal) and alpha (C-terminal). The lectin concanavalin A (conA) from jack bean is exceptional in that the two chains are transposed and ligated (by formation of a new peptide bond). The N-terminus of mature conA thus corresponds to that of the alpha chain and the C-terminus to the beta chain.

We have developed two signature patterns specific to legume lectins: the first is located in the C-terminal section of the beta chain and contains a conserved aspartic acid residue important for the binding of calcium and manganese; the second one is located in the N-terminal of the alpha chain.

-Consensus pattern: [LIV]-[STAG]-V-[DEQV]-[FLI]-D-[ST] [D binds manganese and calcium]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 46.

-Consensus pattern: [LIV]-{LA}-[EDQ]-[FYWKR]-V-{VF}-[LIVF]-G-[LF]-[ST]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 8.
-Last update: December 2004 / Pattern and text revised.

[ 1] Sharon N., Lis H. "Legume lectins--a large family of homologous proteins." FASEB J. 4:3198-3208(1990). PubMed=2227211
[ 2] Lis H., Sharon N. Annu. Rev. Biochem. 55:33-37(1986).
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00279}
{PS51304; GALECTIN}
{BEGIN}
********************************************************
* Galactoside-binding lectin (galectin) domain profile *
********************************************************

Galectins (also known as galaptins or S-lectin) are a family of proteins defined by having at least one characteristic carbohydrate recognition domain (CRD) with an affinity for beta-galactosides and sharing certain sequence elements. Members of the galectin family are found in mammals, birds, amphibians, fish, nematodes, sponges, and some fungi. Galectins are known to carry out intra- and extracellular functions through glycoconjugate-mediated recognition. From the cytosol they may be secreted by non-classical pathways, but they may also be targeted to the nucleus or specific sub-cytosolic sites. Within the same peptide chain some galectins have a CRD with only a few additional amino acids, whereas others have two CRDs joined by a link peptide, and one (galectin-3) has one CRD joined to a different type of domain [13].

The galectin carbohydrate recognition domain (CRD) is a beta-sandwich of about 135 amino acids (see <PDB:1HLC>). The two sheets are slightly bent with 6 strands forming the concave side and 5 strands forming the convex side. The concave side forms a groove in which carbohydrate is bound, and which is long enough to hold approximately a linear tetrasaccharide [1-5].

A number of proteins are known to belong to this family:

- Galectin-3 (also known as MAC-2 antigen; CBP-35 or IgE-binding protein), a 35 Kd lectin which binds immunoglobulin E and which is composed of two domains: an N-terminal domain that consists of tandem repeats of a glycine/proline-rich sequence and a C-terminal galectin domain.
- Galectin-4 [6], which is composed of two galectin domains.
- Galectin-5.
- Galectin-7 [7], a keratinocyte protein which could be involved in cell-cell and/or cell-matrix interactions necessary for normal growth control. - Galectin-8 [8], which is composed of two galectin domains. - Galectin-9 [9], which is composed of two galectin domains. - Human eosinophil lysophospholipase (EC 3.1.1.5) [5] (Charcot-Leyden crystal protein), a protein that may have both an enzymatic and a lectin activities. It forms hexagonal bipyramidal crystals in tissues and secretions from sites of eosinophil-associated inflammation. - Caenorhabditis elegans 32 Kd lactose-binding lectin [10]. This lectin is composed of two galectin domains. - Caenorhabditis elegans lec-7 and lec-8. The profile we developed covers the entire galectin domain. -Sequences known to belong to this class detected by the profile: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: March 2007 / Pattern removed, profile added and text revised. [ 1] Leffler H. "Introduction to galectins."; Trends Glycosci. Glycotechnol. 9:9-19(1997). [ 2] Leffler H., Carlsson S., Hedlund M., Qian Y., Poirier F. "Introduction to galectins." Glycoconj. J. 19:433-440(2004). PubMed=14758066; DOI=10.1023/B:GLYC.0000014072.34840.04 [ 3] Ban M., Yoon H.-J., Demirkan E., Utsumi S., Mikami B., Yagi F. "Structural basis of a fungal galectin from Agrocybe cylindracea for recognizing sialoconjugate." J. Mol. Biol. 351:695-706(2005). PubMed=16051274; DOI=10.1016/j.jmb.2005.06.045 [ 4] Lobsanov Y.D., Gitt M.A., Leffler H., Barondes S.H., Rini J.M. "X-ray crystal structure of the human dimeric S-Lac lectin, L-14-II, in complex with lactose at 2.9-A resolution." J. Biol. Chem. 268:27034-27038(1993). PubMed=8262940 [ 5] Leonidas D.D., Elbert B.L., Zhou Z., Leffler H., Ackerman S.J., Acharya K.R. "Crystal structure of human Charcot-Leyden crystal protein, an eosinophil lysophospholipase, identifies it as a new member of the carbohydrate-binding family of galectins." Structure 3:1379-1393(1995). 
PubMed=8747464 [ 6] Oda Y., Herrmann J., Gitt M.A., Turck C.W., Burlingame A.L., Barondes S.H., Leffler H. "Soluble lactose-binding lectin from rat intestine with two different carbohydrate-binding domains in the same peptide chain." J. Biol. Chem. 268:5929-5939(1993). PubMed=8449956 [ 7] Madsen P., Rasmussen H.H., Flint T., Gromov P., Kruse T.A., Honore B., Vorum H., Celis J.E. "Cloning, expression, and chromosome mapping of human galectin-7." J. Biol. Chem. 270:5823-5829(1995). PubMed=7534301 [ 8] Hadari Y.R., Paz K., Dekel R., Mestrovic T., Accili D., Zick Y. "Galectin-8. A new rat lectin, related to galectin-4." J. Biol. Chem. 270:3447-3453(1995). PubMed=7852431 [ 9] Wada J., Kanwar Y.S. "Identification and characterization of galectin-9, a novel beta-galactoside-binding mammalian lectin." J. Biol. Chem. 272:6078-6086(1997). PubMed=9038233 [10] Hirabayashi J., Satoh M., Kasai K.-I. "Evidence that Caenorhabditis elegans 32-kDa beta-galactosidebinding protein is homologous to vertebrate beta-galactoside-binding lectins. cDNA cloning and deduced amino acid sequence." J. Biol. Chem. 267:15485-15490(1992). PubMed=1639789 [11] Abbott W.M., Feizi T. J. Biol. Chem. 266:5552-5557(1991). +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. 
+-----------------------------------------------------------------------+
{END}
{PDOC00280}
{PS00310; LAMP_1}
{PS00311; LAMP_2}
{PS51407; LAMP_3}
{BEGIN}
***************************************************************************
* Lysosome-associated membrane glycoprotein family signatures and profile *
***************************************************************************

Lysosome-associated membrane glycoproteins (lamp) [1] are integral membrane proteins, specific to lysosomes, and whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region followed by a very short cytoplasmic tail. In each of the duplicated domains, there are two conserved disulfide bonds. This structure is schematically represented in the figure below.

 +-----+            +-----+         +-----+            +-----+
 |     |            |     |         |     |            |     |
xCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxxxCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxx
<--------------------------><Hinge><--------------------------><TM><C>

In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken lamp-1 is known as LEP100.

The macrophage protein CD68 (or macrosialin) [2] is a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge; a single lamp-like domain; a transmembrane region and a short cytoplasmic tail.

Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region [3]. In a lamp-family protein from nematodes [4] only the part C-terminal to the hinge is conserved.

We developed two signature patterns for this family of proteins. The first one is centered on the first conserved cysteine of the duplicated domains.
The second corresponds to a region that includes the extremity of the second domain, the totality of the transmembrane region and the cytoplasmic tail. We also developed a profile that covers lamp entirely.

-Consensus pattern: [STA]-C-[LIVM]-[LIVMFYW]-A-x-[LIVMFYW]-x(3)-[LIVMFYW]-x(3)-Y [C is involved in a disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL, except for CD68s.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: C-x(2)-D-x(3,4)-[LIVM](2)-P-[LIVM]-x-[LIVM]-G-x(2)-[LIVM]-x-G-[LIVM](2)-x-[LIVM](4)-A-[FY]-x-[LIVM]-x(2)-[KR]-[RH]-x(1,2)-[STAG](2)-Y-[EQ] [C is involved in a disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: The first pattern will fail to detect the second copy of the domain in lamp-2 and the first copy of chicken LEP100.
-Last update: November 2008 / Text revised; profile added.

[ 1] Fukuda M. "Lysosomal membrane glycoproteins. Structure, biosynthesis, and intracellular trafficking." J. Biol. Chem. 266:21327-21330(1991). PubMed=1939168
[ 2] Holness C.L., da Silva R.P., Fawcett J., Gordon S., Simmons D.L. "Macrosialin, a mouse macrophage-restricted glycoprotein, is a member of the lamp/lgp family." J. Biol. Chem. 268:9661-9666(1993). PubMed=8486654
[ 3] de Saint-Vis B., Vincent J., Vandenabeele S., Vanbervliet B., Pin J.J., Ait-Yahia S., Patel S., Mattei M.G., Banchereau J., Zurawski S., Davoust J., Caux C., Lebecque S. "A novel lysosome-associated membrane glycoprotein, DC-LAMP, induced upon DC maturation, is transiently expressed in MHC class II compartment." Immunity 9:325-336(1998). PubMed=9768752
[ 4] Kostich M., Fire A., Fambrough D.M. "Identification and molecular-genetic characterization of a LAMP/CD68-like protein from Caenorhabditis elegans." J. Cell Sci. 113:2595-2606(2000).
PubMed=10862717 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00281} {PS00312; GLYCOPHORIN_A} {BEGIN} *************************** * Glycophorin A signature * *************************** Glycophorin A is the major sialoglycoprotein of erythrocyte membrane [1]. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. In human there are two closely related forms: glycophorin A which carries the blood group M/N antigen, and glycophorin B which carries the blood group S/s antigen. The best conserved region of glycophorin A is the transmembrane domain and we have derived a consensus pattern from that region. -Consensus pattern: I-I-x-[GAC]-V-M-A-G-[LIVM](2) -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: December 1991 / Pattern and text revised. [ 1] Murayama J.-I., Utsumi H., Hamada A. "Amino acid sequence of monkey erythrocyte glycophorin MK. Its amino acid sequence has a striking homology with that of human glycophorin A." Biochim. Biophys. Acta 999:273-280(1989). PubMed=2605264 +-----------------------------------------------------------------------+ PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). 
There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm. +-----------------------------------------------------------------------+ {END} {PDOC00282} {PS00313; SVP_I} {BEGIN} *********************************************** * Seminal vesicle protein I repeats signature * *********************************************** Seminal vesicle protein I (SVP-1) [1] is one of the four major secretory proteins secreted by guinea-pig seminal vesicle epithelium. It is a clotting protein that serves as the substrate in the formation of the copulatory plug. Covalent clotting of this protein is catalyzed by a transglutaminase and involves the formation of gamma-glutamyl-epsilon-lysine crosslinks. SVP-1 sequence contains eight repeats of a twenty four amino acid residue domain. There are seven invariant residues in these repeats, three of them (two lysines and one glutamine) probably participate in the cross-links. The pattern we have developed comprises positions 1 to 19 of the domain and includes the three cross-linking residues. This pattern is also present twice [2] in the N-terminal region of the precursor of human skin elafin, an inhibitor of elastase as well as in the precursor of pig sodium/potassium atpase inhibitor SPAI-2. -Consensus pattern: [IVM]-x-G-Q-D-x-V-K-x(5)-[KN]-G-x(3)-[STLV] [Q and K are involved in covalent cross-links] -Sequences known to belong to this class detected by the pattern: ALL. -Other sequence(s) detected in Swiss-Prot: NONE. -Last update: November 1997 / Text revised. [ 1] Moore J.T., Hagstrom J., McCormick D.J., Harvey S., Madden B., Holicky E., Stanford D.R., Wieben E.D. "The major clotting protein from guinea pig seminal vesicle contains eight repeats of a 24-amino acid domain." Proc. Natl. 
Acad. Sci. U.S.A. 84:6712-6714(1987). PubMed=3477802
[ 2] Bairoch A. Unpublished observations (1993).
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00283}
{PS00314; ICE_NUCLEATION}
{BEGIN}
****************************************************
* Bacterial ice-nucleation proteins octamer repeat *
****************************************************

Some Gram-negative bacteria express proteins that enable them to promote the nucleation of ice at relatively high temperatures (above -5 degrees Celsius) [1,2,3]. These proteins are localized at the surface of the outer membrane of the bacteria and can cause frost injury to many plant species.

The primary structure of these ice-nucleation proteins is highly repetitive. A central repetitive domain represents about 80% of the total sequence. This domain is mainly formed by the repetition of a conserved region of forty eight residues (48-mer). The 48-mers are themselves composed of three blocks of 16 residues (16-mer). The first eight residues of each of these 16-mers are identical. It has been proposed that the repetitive domain may be directly responsible for aligning water molecules in the seed crystal.

Schematic structure of a 48-mer region:

 [.........48.residues.repeated.domain..........]
/              / |              | \              \
AGYGSTxTagxxssli AGYGSTxTagxxsxlt AGYGSTxTaqxxsxlt
[16.residues...] [16.residues...] [16.residues...]

-Consensus pattern: A-G-Y-G-S-T-x-T
-Sequences known to belong to this class detected by the pattern: ALL.
 This octamer sequence is found more than forty times in each of the known ice-nucleation proteins.
-Other sequence(s) detected in Swiss-Prot: Paramecium primaurelia 168G surface protein (contains only one copy of the repeat).
-Last update: June 1994 / Text revised.

[ 1] Wolber P., Warren G. "Bacterial ice-nucleation proteins." Trends Biochem. Sci. 14:179-182(1989). PubMed=2672438
[ 2] Wolber P.K. Adv. Microb. Physiol. 34:205-237(1992).
[ 3] Gurian-Sherman D., Lindow S.E. FASEB J. 7:1338-1343(1993).
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00284}
{PS00305; 11S_SEED_STORAGE}
{BEGIN}
**********************************************
* 11-S plant seed storage proteins signature *
**********************************************

Plant seed storage proteins, whose principal function appears to be the major nitrogen source for the developing plant, can be classified, on the basis of their structure, into different families. 11-S are non-glycosylated proteins which form hexameric structures [1,2]. Each of the subunits in the hexamer is itself composed of an acidic and a basic chain derived from a single precursor and linked by a disulfide bond. This structure is shown in the following representation.
           +-------------------------+
           |                         |
xxxxxxxxxxxCxxxxxxxxxxxxxxxxxxxxxxNGxCxxxxxxxxxxxxxxxxxxxxxxx
                                  *********
<------Acidic-subunit-------------><-----Basic-subunit------>
<-----------------About-480-to-500-residues----------------->

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

Proteins that belong to the 11-S family are: pea and broad bean legumins, rape cruciferin, rice glutelins, cotton beta-globulins, soybean glycinins, pumpkin 11-S globulin, oat globulin, sunflower helianthinin G3, etc.

As a signature pattern for this family of proteins we used the region that includes the conserved cleavage site between the acidic and basic subunits (Asn-Gly) and a proximal cysteine residue which is involved in the interchain disulfide bond.

-Consensus pattern: N-G-x-[DE](2)-x-[LIVMF]-C-[ST]-x(11,12)-[PAG]-D [C is involved in a disulfide bond]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: June 1994 / Pattern and text revised.

[ 1] Hayashi M., Mori H., Nishimura M., Akazawa T., Hara-Nishimura I. "Nucleotide sequence of cloned cDNA coding for pumpkin 11-S globulin beta subunit." Eur. J. Biochem. 172:627-632(1988). PubMed=2450746
[ 2] Shotwell M.A., Afonso C., Davies E., Chesnut R.S., Larkins B.A. Plant Physiol. 87:698-704(1988).
+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of Bioinformatics (SIB). There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme send an email to [email protected] or see: http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00285}
{PS00315; DEHYDRIN_1}
{PS00823; DEHYDRIN_2}
{BEGIN}
************************
* Dehydrins signatures *
************************

A number of proteins are produced by plants that experience water-stress.
Water-stress takes place when the water available to a plant falls below a
critical level. The plant hormone abscisic acid (ABA) appears to modulate
the response of plants to water-stress. Proteins that are expressed during
water-stress are called dehydrins [1,2] or LEA group 2 proteins [3]. The
proteins that belong to this family are listed below.

 - Arabidopsis thaliana XERO 1, XERO 2 (LTI30), RAB18, ERD10 (LTI45),
   ERD14 and COR47.
 - Barley dehydrins B8, B9, B17, and B18.
 - Cotton LEA protein D-11.
 - Craterostigma plantagineum desiccation-related proteins A and B.
 - Maize dehydrin M3 (RAB-17).
 - Pea dehydrins DHN1, DHN2, and DHN3.
 - Radish LEA protein.
 - Rice proteins RAB 16B, 16C, 16D, RAB21, and RAB25.
 - Tomato TAS14.
 - Wheat dehydrin RAB 15 and cold-shock proteins cor410, cs66 and cs120.

Dehydrins share a number of structural features. One of the most notable
features is the presence, in their central region, of a continuous run of
five to nine serines followed by a cluster of charged residues. Such a
region has been found in all known dehydrins so far with the exception of
pea dehydrins. A second conserved feature is the presence of two copies of
a lysine-rich octapeptide; the first copy is located just after the cluster
of charged residues that follows the poly-serine region and the second copy
is found at the C-terminal extremity. We have derived signature patterns
for both regions.

-Consensus pattern: S(4)-[SD]-[DE]-x-[DE]-[GVE]-x(1,7)-[GE]-x(0,2)-[KR](4)
-Sequences known to belong to this class detected by the pattern: ALL,
 except for pea dehydrins, Arabidopsis COR47 and XERO2 and wheat cold-shock
 proteins.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: [KR]-[LIM]-K-[DE]-K-[LIM]-P-G
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Close T.J., Kortt A.A., Chandler P.M.
     "A cDNA-based comparison of dehydration-induced proteins (dehydrins)
     in barley and corn."
     Plant Mol. Biol. 13:95-108(1989).
     PubMed=2562763

[ 2] Robertson M., Chandler P.M.
     Plant Mol. Biol. 19:1031-1044(1992).

[ 3] Dure L. III, Crouch M., Harada J., Ho T.-H. D., Mundy J.,
     Quatrano R., Thomas T., Sung Z.R.
     Plant Mol. Biol. 12:475-486(1989).

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00286}
{PS00316; THAUMATIN_1}
{PS51367; THAUMATIN_2}
{BEGIN}
******************************************
* Thaumatin family signature and profile *
******************************************

Thaumatin [1] is an intensely sweet-tasting protein (100 000 times sweeter
than sucrose on a molar basis) found in berries from Thaumatococcus
daniellii, an African bush. The protein consists of about 200 residues and
contains 8 disulfide bonds. Several stress-induced proteins of plants have
been found to be related to thaumatins. Some of these proteins are listed
below.

 - A maize alpha-amylase/trypsin inhibitor.
 - Two tobacco pathogenesis-related proteins: PR-R major and minor forms,
   which are induced after infection with viruses.
 - Salt-induced protein NP24 from tomato.
 - Osmotin, a salt-induced protein from tobacco.
 - Osmotin-like proteins OSML13, OSML15 and OSML81 from potato [2].
 - P21, a leaf protein from soybean.
 - PWIR2, a leaf protein from wheat.
 - Zeamatin, a maize antifungal protein [3].

This family is also referred to as pathogenesis-related group 5 (PR5), as
many thaumatin-like proteins accumulate in plants in response to infection
by a pathogen and possess antifungal activity.

As a signature pattern, we selected a conserved region that includes three
cysteine residues known to be involved in disulfide bonds.

[Schematic: the conserved cysteines ('C') and the disulfide bonds linking
them along the sequence, with the position of the pattern marked by '*'.]

We also developed a profile that covers the whole structure of
thaumatin/osmotin/PR5a [4].

-Consensus pattern: G-x-[GF]-x-C-x-T-[GA]-D-C-x(1,2)-[GQ]-x(2,3)-C
                    [The 3 C's are involved in disulfide bonds]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: February 2008 / Text revised; profile added.

[ 1] Edens L., Heslinga L., Klok R., Ledeboer A.M., Maat J., Toonen M.Y.,
     Visser C., Verrips C.T.
     "Cloning of cDNA encoding the sweet-tasting plant protein thaumatin
     and its expression in Escherichia coli."
     Gene 18:1-12(1982).
     PubMed=7049841

[ 2] Zhu B., Chen T.H.H., Li P.H.
     "Activation of two osmotin-like protein genes by abiotic stimuli and
     fungal pathogen in transgenic potato plants."
     Plant Physiol. 108:929-937(1995).
     PubMed=7630973

[ 3] Malehorn D.E., Borgmeyer J.R., Smith C.E., Shah D.M.
     "Characterization and expression of an antifungal zeamatin-like
     protein (Zlp) gene from Zea mays."
     Plant Physiol. 106:1471-1481(1994).
     PubMed=7846159

[ 4] Koiwa H., Kato H., Nakatsu T., Oda J., Yamada Y., Sato F.
     "Crystal structure of tobacco PR-5d protein at 1.8 A resolution
     reveals a conserved acidic cleft structure in antifungal
     thaumatin-like proteins."
     J. Mol. Biol. 286:1137-1145(1999).
     PubMed=10047487; DOI=10.1006/jmbi.1998.2540

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00287}
{PS00322; HISTONE_H3_1}
{PS00959; HISTONE_H3_2}
{BEGIN}
*************************
* Histone H3 signatures *
*************************

Histone H3 is one of the four histones, along with H2A, H2B and H4, which
form the eukaryotic nucleosome core. It is a highly conserved protein of
135 amino acid residues [1,2,E1].

The following proteins have been found to contain a C-terminal H3-like
domain:

 - Mammalian centromeric protein CENP-A [3]. Could act as a core histone
   necessary for the assembly of centromeres.
 - Yeast chromatin-associated protein CSE4 [4].
 - Caenorhabditis elegans chromosome III encodes two highly related
   proteins (F54C8.2 and F58A4.3) whose C-terminal section is
   evolutionarily related to the last 100 residues of H3. The function of
   these proteins is not yet known.

We developed two signature patterns. The first one corresponds to a
perfectly conserved heptapeptide in the N-terminal part of H3. The second
one is derived from a conserved region in the central section of H3.
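
The consensus patterns in these entries follow the PROSITE pattern
notation: elements are separated by '-', 'x' stands for any residue,
'[...]' lists the residues allowed at a position, '{...}' lists residues
that are excluded, and '(n)' or '(n,m)' gives a repeat count. As a rough
illustration only (it is not part of PROSITE), the Python sketch below
translates such a pattern into a regular expression and applies the first
histone H3 pattern of this entry, K-A-P-R-K-[QH]-[LI], to a short test
string. The function name prosite_to_regex and the test fragment are
assumptions made for this example, and the '<' and '>' anchors of the
notation are deliberately not handled.

```python
import re

# One pattern element: a residue, 'x', a [set] or an {excluded set},
# optionally followed by a repeat count "(n)" or "(n,m)".
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper for illustration; anchors ('<', '>') are not supported.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        token, low, high = m.groups()
        if token == "x":
            regex = "."                       # any residue
        elif token.startswith("{"):
            regex = "[^" + token[1:-1] + "]"  # any residue except those listed
        else:
            regex = token                     # literal residue or [set]
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)


# First histone H3 pattern from this entry.
h3_regex = prosite_to_regex("K-A-P-R-K-[QH]-[LI]")

# Short test string chosen for illustration (an assumption of this example).
fragment = "ARTKQTARKSTGGKAPRKQLATK"
match = re.search(h3_regex, fragment)
print(match.start(), match.group())  # prints: 13 KAPRKQL
```

The reported position is a 0-based offset into the test string. Elements
with ranges are handled in the same way, so x(1,7) becomes ".{1,7}" and
{K} becomes "[^K]".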

-Consensus pattern: K-A-P-R-K-[QH]-[LI]
-Sequences known to belong to this class detected by the pattern: ALL,
 except for the H3-like proteins and some protozoan H3.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: P-F-x-[RA]-L-[VA]-[KRQ]-[DEG]-[IV]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Wells D.E., Brown D.
     "Histone and histone gene compilation and alignment update."
     Nucleic Acids Res. 19:2173-2188(1991).
     PubMed=2041803

[ 2] Thatcher T.H., Gorovsky M.A.
     "Phylogenetic analysis of the core histones H2A, H2B, H3, and H4."
     Nucleic Acids Res. 22:174-179(1994).
     PubMed=8121801

[ 3] Sullivan K.F., Hechenberger M., Masri K.
     "Human CENP-A contains a histone H3 related histone fold domain that
     is required for targeting to the centromere."
     J. Cell Biol. 127:581-592(1994).
     PubMed=7962047

[ 4] Stoler S., Keith K.C., Curnick K.E., Fitzgerald-Hayes M.
     "A mutation in CSE4, an essential gene encoding a novel
     chromatin-associated protein in yeast, causes chromosome
     nondisjunction and cell cycle arrest at mitosis."
     Genes Dev. 9:573-586(1995).
     PubMed=7698647

[E1] http://research.nhgri.nih.gov/histones/

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00288}
{PS00323; RIBOSOMAL_S19}
{BEGIN}
***********************************
* Ribosomal protein S19 signature *
***********************************

Ribosomal protein S19 is one of the proteins from the small ribosomal
subunit. In Escherichia coli, S19 is known to form a complex with S13 that
binds strongly to 16S ribosomal RNA. S19 belongs to a family of ribosomal
proteins which, on the basis of sequence similarities [1,2], groups:

 - Eubacterial S19.
 - Algal and plant chloroplast S19.
 - Cyanelle S19.
 - Archaebacterial S19.
 - Plant mitochondrial S19.
 - Eukaryotic S15 ('rig' protein).

S19 is a protein of 88 to 144 amino-acid residues. Our signature pattern is
based on the few conserved positions located in the C-terminal section of
these proteins.

-Consensus pattern: [STDNQ]-G-[KRNQMHSI]-x(6)-[LIVM]-x(4)-[LIVMC]-[GSD]-
                    x(2)-[LFI]-[GAS]-[DE]-[FYM]-x(2)-[ST]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Kitagawa M., Takasawa S., Kikuchi N., Itoh T., Teraoka H.,
     Yamamoto H., Okamoto H.
     "rig encodes ribosomal protein S15. The primary structure of
     mammalian ribosomal protein S15."
     FEBS Lett. 283:210-214(1991).
     PubMed=2044758

[ 2] Otaka E., Hashimoto T., Mizuta K.
     Protein Seq. Data Anal. 5:285-300(1993).

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00289}
{PS00324; ASPARTOKINASE}
{BEGIN}
***************************
* Aspartokinase signature *
***************************

Aspartokinase (EC 2.7.2.4) (AK) [1] catalyzes the phosphorylation of
aspartate. The product of this reaction can then be used in the
biosynthesis of lysine or in the pathway leading to homoserine, which
participates in the biosynthesis of threonine, isoleucine and methionine.

In Escherichia coli, there are three different isozymes which differ in
their sensitivity to repression and inhibition by Lys, Met and Thr. AK1
(gene thrA) and AK2 (gene metL) are bifunctional enzymes which both consist
of an N-terminal AK domain and a C-terminal homoserine dehydrogenase
domain. AK1 is involved in threonine biosynthesis and AK2 in that of
methionine. The third isozyme, AK3 (gene lysC), is monofunctional and
involved in lysine synthesis.

In yeast, there is a single isozyme of AK (gene HOM3).

As a signature pattern for AK, we selected a conserved region located in
the N-terminal extremity.

-Consensus pattern: [LIVM]-x-K-[FY]-G-G-[ST]-[SC]-[LIVM]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 1.
-Last update: November 1995 / Pattern and text revised.

[ 1] Rafalski J.A., Falco S.C.
     "Structure of the yeast HOM3 gene which encodes aspartokinase."
     J. Biol. Chem. 263:2146-2151(1988).
     PubMed=2892836

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00290}
{PS00326; TROPOMYOSIN}
{BEGIN}
**************************
* Tropomyosins signature *
**************************

Tropomyosins [1,2] are a family of closely related proteins present in
muscle and non-muscle cells. In striated muscle, tropomyosin mediates the
interactions between the troponin complex and actin so as to regulate
muscle contraction. The role of tropomyosin in smooth muscle and non-muscle
tissues is not clear.

Tropomyosin is an alpha-helical protein that forms a coiled-coil dimer.
Muscle isoforms of tropomyosin are characterized by having 284 amino acid
residues and a highly conserved N-terminal region, whereas non-muscle forms
are generally smaller and are heterogeneous in their N-terminal region.

The signature pattern for tropomyosins is based on a very conserved region
in the C-terminal section of tropomyosins, which is present in both muscle
and non-muscle forms.

-Consensus pattern: L-K-[EAD]-A-E-x-R-A-[ET]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: December 2004 / Pattern and text revised.

[ 1] Smilie L.B.
     Trends Biochem. Sci. 4:151-155(1979).

[ 2] McLeod A.R.
     BioEssays 6:208-212(1986).

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00291}
{PS00950; BACTERIAL_OPSIN_1}
{PS00327; BACTERIAL_OPSIN_RET}
{BEGIN}
***********************************
* Bacterial rhodopsins signatures *
***********************************

Bacterial rhodopsins [1,2,3] are a family of retinal-containing proteins
found in extremely halophilic bacteria which provide light-dependent ion
transport and sensory functions for these organisms. Bacterial rhodopsins
are integral membrane proteins with seven transmembrane regions. The
retinal chromophore is covalently linked, via a Schiff's base, to the
epsilon-amino group of a conserved lysine residue in the middle of the last
transmembrane helix (called helix G).

There are at least three types of bacterial rhodopsins:

 - Bacteriorhodopsin (bop), and archaerhodopsins 1 and 2, light-driven
   proton pumps.
 - Halorhodopsin (hop), a light-driven chloride pump.
 - Sensory rhodopsin (sop), which mediates both photoattractant (in the
   red) and photophobic (in the near UV) responses.

We developed two patterns which allow the specific detection of bacterial
rhodopsins. The first pattern corresponds to the third transmembrane region
(called helix C) and includes an arginine residue which seems involved in
the release of a proton from the Schiff's base to the extracellular medium.
The second pattern includes the retinal binding lysine.

-Consensus pattern: R-Y-x-[DT]-W-x-[LIVMF]-[ST]-[TV]-P-[LIVM]-[LIVMNQ]-
                    [LIVM]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Consensus pattern: [FYIV]-{ND}-[FYVG]-[LIVM]-D-[LIVMF]-x-[STA]-K-x-{K}-
                    [FY]
                    [K is the retinal binding site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 4.
-Last update: December 2004 / Patterns and text revised.

[ 1] Oesterhelt D., Tittor J.
     Trends Biochem. Sci. 14:57-61(1989).

[ 2] Soppa J., Duschl J., Oesterhelt D.
     "Bacterioopsin, haloopsin, and sensory opsin I of the halobacterial
     isolate Halobacterium sp. strain SG1: three new members of a growing
     family."
     J. Bacteriol. 175:2720-2726(1993).
     PubMed=8478333

[ 3] Kuan G., Saier M.H. Jr.
     "Phylogenetic relationships among bacteriorhodopsins."
     Res. Microbiol. 145:273-285(1994).
     PubMed=7997641

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00292}
{PS00328; HCP}
{BEGIN}
*************************
* HCP repeats signature *
*************************

The histidine-rich calcium-binding protein (HCP) of sarcoplasmic reticulum
[1] may play a role in the regulation of calcium sequestration or release
in the SR of skeletal and cardiac muscle. This protein is very acidic (31%
of Asp and Glu) and rich in histidine (13%).

The sequence of HCP contains 10 tandem repeats of a domain of 26 to 29
amino acid residues. This domain starts with an invariant hexapeptide
(HRHRGH), followed by a stretch of acidic residues. The end of the domain
consists of an almost invariant nonapeptide (STESDRHQA). The highly acidic
central cores of each repeat are likely to constitute the calcium-binding
sites of HCP.

The pattern we have developed comprises the beginning of the domain and
part of the central acidic stretch.

-Consensus pattern: H-R-H-R-G-H-x(2)-[DE](7)
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 1990 / First entry.

[ 1] Hofmann S.L., Goldstein J.L., Orth K., Moomaw C.R., Slaughter C.A.,
     Brown M.S.
     "Molecular cloning of a histidine-rich Ca2+-binding protein of
     sarcoplasmic reticulum that contains highly conserved repeated
     elements."
     J. Biol. Chem. 264:18083-18090(1989).
     PubMed=2808365

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00293}
{PS00330; HEMOLYSIN_CALCIUM}
{BEGIN}
***************************************************
* Hemolysin-type calcium-binding region signature *
***************************************************

Gram-negative bacteria produce a number of proteins which are secreted into
the growth medium by a mechanism that does not require a cleaved N-terminal
signal sequence. These proteins, while having different functions, seem [1]
to share two properties: they bind calcium and they contain a variable
number of tandem repeats consisting of a nine amino acid motif rich in
glycine, aspartic acid and asparagine. It has been shown [2] that such a
domain is involved in the binding of calcium ions in a parallel beta roll
structure. The proteins which are currently known to belong to this
category are:

 - Hemolysins from various species of bacteria. Bacterial hemolysins are
   exotoxins that attack blood cell membranes and cause cell rupture. The
   hemolysins which are known to contain such a domain are those from:
   E. coli (gene hlyA), A. pleuropneumoniae (gene appA),
   A. actinomycetemcomitans and P. haemolytica (leukotoxin) (gene lktA).
 - Cyclolysin from Bordetella pertussis (gene cyaA). A multifunctional
   protein which is both an adenylate cyclase and a hemolysin.
 - Extracellular zinc proteases: serralysin (EC 3.4.24.40) from Serratia,
   prtB and prtC from Erwinia chrysanthemi and aprA from Pseudomonas
   aeruginosa.
 - Nodulation protein nodO from Rhizobium leguminosarum.

We derived a signature pattern from conserved positions in the sequence of
the calcium-binding domain.

-Consensus pattern: D-x-[LI]-x(4)-G-x-D-x-[LI]-x-G-G-x(3)-D
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Note: This pattern is found once in nodO and the extracellular proteases
 but up to 5 times in some hemolysin/cyclolysins.

-Last update: October 1993 / Text revised.

[ 1] Economou A., Hamilton W.D.O., Johnston A.W., Downie J.A.
     "The Rhizobium nodulation gene nodO encodes a Ca2(+)-binding protein
     that is exported without N-terminal cleavage and is homologous to
     haemolysin and related proteins."
     EMBO J. 9:349-354(1990).
     PubMed=2303029

[ 2] Baumann U., Wu S., Flaherty K.M., McKay D.B.
     "Three-dimensional structure of the alkaline protease of Pseudomonas
     aeruginosa: a two-domain protein with a calcium binding parallel beta
     roll motif."
     EMBO J. 12:3357-3364(1993).
     PubMed=8253063

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00294}
{PS00331; MALIC_ENZYMES}
{BEGIN}
***************************
* Malic enzymes signature *
***************************

Malic enzymes, or malate oxidoreductases, catalyze the oxidative
decarboxylation of malate into pyruvate, a reaction important for a wide
range of metabolic pathways. There are three related forms of malic enzyme
[1,2,3]:

 - NAD-dependent malic enzyme (EC 1.1.1.38), which uses preferentially NAD
   and has the ability to decarboxylate oxaloacetate (OAA). It is found in
   bacteria and insects.
 - NAD-dependent malic enzyme (EC 1.1.1.39), which uses preferentially NAD
   and is unable to decarboxylate OAA. It is found in the mitochondrial
   matrix of plants and is a heterodimer of highly related subunits.
 - NADP-dependent malic enzyme (EC 1.1.1.40), which has a preference for
   NADP and has the ability to decarboxylate OAA. This form has been found
   in fungi, animals and plants. In mammals, there are two isozymes: one
   mitochondrial and the other cytosolic. Plants also have two isozymes:
   chloroplastic and cytosolic.

There are two other proteins which are closely structurally related to
malic enzymes:

 - Escherichia coli protein sfcA, whose function is not yet known but which
   could be an NAD or NADP-dependent malic enzyme.
 - Yeast hypothetical protein YKL029c, a probable malic enzyme.

There are three well conserved regions in the enzyme sequences. Two of them
seem to be involved in binding NAD or NADP. The significance of the third
one, located in the central part of the enzymes, is not yet known. We
selected this region as a signature pattern for these enzymes.

-Consensus pattern: [FM]-x-[DV]-D-x(2)-[GS]-T-[GSA]-x-[IV]-x-[LIVMAT]-
                    [GAST]-[GASTC]-[LIVMFA]-[LIVMFY]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: April 2006 / Pattern revised.

[ 1] Artus N.N., Edwards G.E.
     FEBS Lett. 182:225-233(1985).

[ 2] Loeber G., Infante A.A., Maurer-Fogy I., Krystek E., Dworkin M.B.
     "Human NAD(+)-dependent mitochondrial malic enzyme. cDNA cloning,
     primary structure, and expression in Escherichia coli."
     J. Biol. Chem. 266:3016-3021(1991).
     PubMed=1993674

[ 3] Long J.J., Wang J.-L., Berry J.O.
     "Cloning and analysis of the C4 photosynthetic NAD-dependent malic
     enzyme of amaranth mitochondria."
     J. Biol. Chem. 269:2827-2833(1994).
     PubMed=8300616

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00295}
{PS00697; DNA_LIGASE_A1}
{PS00333; DNA_LIGASE_A2}
{PS50160; DNA_LIGASE_A3}
{BEGIN}
***************************************************
* ATP-dependent DNA ligase signatures and profile *
***************************************************

DNA ligase (polydeoxyribonucleotide synthase) is the enzyme that joins two
DNA fragments by catalyzing the formation of an internucleotide ester bond
between phosphate and deoxyribose. It is active during DNA replication, DNA
repair and DNA recombination. There are two forms of DNA ligase: one
requires ATP (EC 6.5.1.1), the other NAD (EC 6.5.1.2).

Eukaryotic, archaebacterial, virus and phage DNA ligases are ATP-dependent.
During the first step of the joining reaction, the ligase interacts with
ATP to form a covalent enzyme-adenylate intermediate. A conserved lysine
residue is the site of adenylation [1,2].

Apart from the active site region, the only conserved region common to all
ATP-dependent DNA ligases is found [3] in the C-terminal section and
contains a conserved glutamate as well as four positions with conserved
basic residues.

We developed signature patterns for both conserved regions.

-Consensus pattern: [EDQH]-{K}-K-{VEDI}-[DN]-G-{GLYN}-R-[GACIVM]
                    [K is the active site residue]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: 33.

-Consensus pattern: E-G-[LIVMA]-[LIVM]-[LIVMA]-[KR]-x(5,8)-[YW]-[QNEKTI]-
                    x(2,6)-[KRH]-x(3,5)-K-[LIVMFY]-K
-Sequences known to belong to this class detected by the pattern: ALL,
 except for archaebacterial DNA ligases.
-Other sequence(s) detected in Swiss-Prot: NONE.

-Sequences known to belong to this class detected by the profile: ALL.
-Other sequence(s) detected in Swiss-Prot: 1.
-Last update: April 2006 / Patterns revised.

[ 1] Tomkinson A.E., Totty N.F., Ginsburg M., Lindahl T.
     "Location of the active site for enzyme-adenylate formation in DNA
     ligases."
     Proc. Natl. Acad. Sci. U.S.A. 88:400-404(1991).
     PubMed=1988940

[ 2] Lindahl T., Barnes D.E.
     "Mammalian DNA ligases."
     Annu. Rev. Biochem. 61:251-281(1992).
     PubMed=1497311; DOI=10.1146/annurev.bi.61.070192.001343

[ 3] Kletzin A.
     "Molecular characterisation of a DNA ligase gene of the extremely
     thermophilic archaeon Desulfurolobus ambivalens shows close
     phylogenetic relationship to eukaryotic ligases."
     Nucleic Acids Res. 20:5389-5396(1992).
     PubMed=1437556

+-----------------------------------------------------------------------+
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to [email protected] or see:
http://www.expasy.org/prosite/prosite_license.htm.
+-----------------------------------------------------------------------+
{END}
{PDOC00296}
{PS00335; PARATHYROID}
{BEGIN}
****************************************
* Parathyroid hormone family signature *
****************************************

Parathyroid hormone (PTH) is a polypeptide hormone that elevates calcium
level by dissolving the salts in bone and preventing their renal excretion.
PTH is a protein of about 80 amino acid residues, but its biological
activity seems to be contained within the first 34 residues.

PTH is structurally related to a protein called 'parathyroid
hormone-related protein' (PTH-rP) [1] which seems to play a physiological
role in lactation, possibly as a hormone for the mobilization and/or
transfer of calcium to the milk. PTH-rP is a protein of 141 amino acids. As
for PTH, the first 34 residues are sufficient to mediate the biological
activity of PTH-rP.

PTH and PTH-rP bind to the same G-protein coupled receptor.

The signature pattern we selected for these proteins is derived from
conserved residues in the N-terminal extremity of PTH and PTH-rP, spanning
residues 2 to 12.

-Consensus pattern: V-S-E-x-Q-x(2)-H-x(2)-G
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Last update: June 1992 / Text revised.

[ 1] Martin T.J., Allan E.H., Caple I.W., Care A.D., Danks J.A.,
     Diefenbach-Jagger H., Ebeling P.R., Gillespie M.T., Hammonds G.,
     Heath J.A., Hudson P.J., Kemp B.E., Kubota M., Kukreja S.C.,
     Moseley J.M., Ng K.W., Raisz L.G., Rodda C.P., Simmons H.A.,
     Suva L.J., Wettenhall R.E.H., Wood W.I.
     "Parathyroid hormone-related protein: isolation, molecular cloning,
     and mechanism of action."
     Recent Prog. Horm. Res. 45:467-502(1989).
     PubMed=2682846

+-----------------------------------------