Field Guide to Protein Folds

This appendix is based on the contribution of Nicholas Furnham, London School of Hygiene & Tropical Medicine, UK

TABLE OF CONTENTS

1. 14-3-3 Protein Domain
2. ATP-binding Cassette (ABC) Domain
3. Ankyrin Repeat Domain
4. Armadillo Domain
5. BAR Domain
6. β-propeller Fold
7. BIR Domain
8. BRCT Domain
9. Bromodomain
10. BTB/POZ Domain
11. C2 Domain
12. Calmodulin
13. Caspase Recruitment Domain (CARD)
14. Centromeric-A-targeting Domain (CATD)
15. Complement Control Protein (CCP) Domain
16. Calponin Homology (CH) Domain
17. Cold Shock Domain (CSD)
18. Complement C1r/C1s, Uegf, Bmp1 (CUB) Domain
19. Cyclin-box Domain
20. Death Domain
21. Dbl Homology (DH) Domain
22. DEXD/H Domain
23. EF-hand Domain
24. Epidermal Growth Factor (EGF)-like Domain
25. FERM Domain
26. Fibronectin Type III (FNIII) Domain
27. Formin Homology (FH) Domain
28. Greek Key Motif
29. HAMP Domain
30. HEAT Repeat Domain
31. Helix-turn-helix DNA-binding Motif
32. Immunoglobulin (Ig) Domain
33. Jelly Roll Fold
34. Kelch Motif/Domain
35. K Homology (KH) Domain
36. LD Motif
37. LIM Domain
38. Leucine-rich Repeats (LRR) Domain
39. NAD(H)/NAD(P)-binding Domain
40. Oligonucleotide/Oligosaccharide-binding Fold (OB) Domain
41. PAS Domain
42. Polo-box Domain (PBD)
43. PDZ Domain
44. Pleckstrin Homology (PH) Domain
45. Phosphotyrosine-binding (PTB) Domain
46. RNA Recognition Motif (RRM) Domain
47. S1 Domain
48. Src Homology 2 (SH2) Domain
49. Src Homology 3 (SH3) Domain
50. Spectrin-like Repeats
51. Ubiquitin Fold
52. von Willebrand Factor Type A Domain
53. WD40 Repeat Domain
54. Winged-helix Domain (WHD)
55. Zinc Finger Domain
56. Zinc Ribbon Domain

1. 14-3-3 PROTEIN DOMAIN
CATH: 1.20.190.20 SCOP: a.118.7.1 InterPro: IPR000308 Pfam: PF00244

The 14-3-3 protein domain is an abundant type of adaptor protein that recognizes and specifically interacts with phosphorylated proteins in eukaryotes.
The name of the 14-3-3 proteins refers to their properties in chromatographic fractionation. Most species contain more than one isoform of this protein. These are not products of alternative splicing but are encoded by separate genes and differ in short sections of the sequence. All seven isoforms form dimers, which have a common horseshoe-like structure. Each domain consists of nine α helices, with the five C-terminal helices forming a cuplike structure and the remainder involved in forming the dimer interface. The dimeric structure is stabilized by various salt bridges. Formation of homo- and heterodimers is considered to be one of the factors affecting the specificity of the protein for its different protein targets, though not all isoforms are able to form heterodimers.

2. ATP-BINDING CASSETTE (ABC) DOMAIN
CATH: 3.20.50.300 SCOP: c.37.1.12 InterPro: IPR017871 Pfam: PF00005

The ATP-binding cassette (ABC) domain is one of two domains that form the ABC transporters, which are found in all kingdoms of life and comprise one of the largest protein families. It binds and hydrolyzes ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. Its sequence is highly conserved, displaying a typical Walker A phosphate-binding loop and a Walker B magnesium-binding site found in one arm of the L-shaped structure that the domain adopts. This arm, formed mainly of β strands, also contains the important residues for ATP hydrolysis and/or binding (located in the P-loop). The ATP-binding pocket is found at the end of the arm. The hinge between the two arms contains both a histidine loop motif and a Q-motif, making contact with the γ phosphate of the ATP molecule. The other arm, mostly an α-helical subdomain, contains the signature motif (LSGGQ) and is in direct contact with the transmembrane domain of the transporter.
3. ANKYRIN REPEAT DOMAIN
CATH: 1.25.40.20 SCOP: d.211.1.1 InterPro: IPR002110 Pfam: PF00023

The ankyrin repeat domain is a 33-residue repeating unit consisting of a simple helix-turn-helix with the helices arranged in an anti-parallel fashion. The turn projects out from the helices at a 90° angle to facilitate the formation of hairpin-like β sheets with neighboring loops. Thus, the molecule has an L-shaped structure, with the helices as the vertical arm and the N- and C-terminal stretches as the base. The domain, found across all kingdoms of life and particularly prevalent in eukaryotes, often occurs as repeats of between four and six units. Larger arrays also occur (the largest is predicted to contain 34 repeats), and the repeats can stack to form a variety of tertiary structures with an intrinsic tendency to form compact, concave structures.

4. ARMADILLO DOMAIN
CATH: 1.25.10.10 SCOP: a.118.1.1 InterPro: IPR000225 Pfam: PF00514

The armadillo (ARM motif) domain has multiple copies of a 42-residue repeat, consisting of three α helices: a short helix of two turns, followed by two longer helices of three to four turns each. The two longer helices pack together in an anti-parallel manner similar to the helical packing in the HEAT repeat domains. Proteins that contain the motif often have many tandem repeated copies. Multiple copies of the repeat form a right-handed superhelix of α helices, with each repeat rotated 30° with respect to the preceding repeat, to form an α-solenoid structure. This features a positively charged groove, which is presumed to interact with the acidic surfaces of the known interaction partners.

5. BAR DOMAIN
CATH: 1.20.1270.60 SCOP: a.238.1.1 InterPro: IPR004148 Pfam: PF03114

The Bin-amphiphysin-Rvs (BAR) domain comprises three long α helices coiled together. These coils are often associated as dimers to form a functional banana-shaped six-helix bundle.
The coiled coil trimer in the BAR domain of amphiphysin has approximately 210 amino acids, with positively charged residues both at the end of the coiled coil that is not involved in dimerization and along the curved surface that forms the banana-shaped quaternary structure. These positively charged residues are thought to facilitate the interaction with the phospholipids of the membranes. BAR-domain proteins, which include the sorting nexins and amphiphysin, function in diverse cellular processes such as endocytosis and actin reorganization.

6. β-PROPELLER FOLD
CATH: 2.105, 2.110, 2.115, 2.120, 2.130, 2.140 SCOP: b.66, b.67, b.68, b.69, b.70 InterPro: IPR010620, IPR001680 Pfam: PF00400, PF06739

The β-propeller fold is found in many different structures from a variety of organisms across all kingdoms of life. It adopts a highly symmetrical structure formed of between four and eight repeats, or blades, arranged toroidally around a central axis. Each repeat is formed of a four-stranded twisted anti-parallel β sheet. The repeats are arranged in ring-like fashion around a central tunnel. The ring is closed by what has been termed a ‘molecular Velcro,’ with both termini forming one of the four-stranded anti-parallel β sheets either as a 1 + 3 or a 2 + 2 combination of strands from the N- and C-terminal ends. There are cases where closure is achieved by each end forming a separate sheet/blade, stabilized by hydrophobic interactions. For a further example of a seven-bladed β-propeller, see the entry for WD40 Repeat Domain.

7. BIR DOMAIN
CATH: 1.10.1170.10 SCOP: g.52.1.1 InterPro: IPR001370 Pfam: PF00653

The baculovirus inhibitor of apoptosis protein repeat (BIR) domain, also referred to as inhibitor of apoptosis (IAP) domain, is an approximately 70 amino acid zinc-binding domain.
Occurring either as a single domain or as two or three tandem repeats, BIR domains consist of a mainly alpha orthogonal bundle comprising four or five short α helices, a three-stranded β sheet, and a zinc atom. The zinc is packed into a highly hydrophobic environment created by a number of residues in the vicinity of the zinc pocket and is coordinated by three cysteine residues and one histidine residue, which are highly conserved among all BIR domains. The principal function of BIRs is in mediating protein–protein interactions, and the surface of the domain has a number of hydrophobic regions that would facilitate such interactions with a number of other proteins.

8. BRCT DOMAIN
CATH: 3.40.50.10190 SCOP: c.15.1 InterPro: IPR001357 Pfam: PF00533

The BRCT (breast cancer susceptibility protein C-terminal) domain comprises a Rossmann fold with a central four-stranded β sheet flanked by a single α helix on one side and two α helices on the opposite side. The domain is an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein-binding domain. BRCT repeats are defined by conserved clusters of hydrophobic residues that occupy the core of the repeat structure and by glycine residues that facilitate a tight turn between α1 and β2. There is considerable diversity in the multidomain architectures in which the BRCT domains are found. They can exist as single isolated domains, as multiple tandem BRCT repeats, and in association with other functional domains. They can also be found in multiple but isolated copies, where an unstructured linker region separates two distinct single BRCT domains.

9. BROMODOMAIN
CATH: 1.20.920.10 SCOP: a.29.2.1 InterPro: IPR001487 Pfam: PF00439

The bromodomain is central to epigenetic control of gene transcription through its role in recognizing acetylated lysine residues on histones.
The structure adopts a conserved left-handed bundle of four α helices, with two interhelical loops of variable length and sequence, one between the first and second helices and one between the third and fourth. These constitute a hydrophobic pocket that both stabilizes the structure and interacts with the acetyl-lysine. The N and C termini are located close to each other, indicating the modular nature of the domain and its involvement in protein–protein interactions, and allowing multiple bromodomains to be placed sequentially in a chromosomal protein. Though sequence conservation between bromodomains is generally low, the residues (two tyrosines and an asparagine) involved in acetyl-lysine recognition are highly conserved. The acetyl-lysine forms a specific hydrogen bond between the oxygen of the acetyl carbonyl group and the side chain amide nitrogen of the conserved asparagine. A network of water-mediated hydrogen bonds with backbone carbonyl groups at the base of the binding cleft also contributes to acetyl-lysine binding.

10. BTB/POZ DOMAIN
CATH: 3.30.710.10 SCOP: d.42.1.1 InterPro: IPR000210 Pfam: PF00651

The BTB domain (Broad-Complex, Tramtrack and Bric a Brac) is found at the N terminus in 5–10% of zinc finger proteins, in poxvirus proteins involved in dimerization (hence its other name, POZ domain, for poxvirus and zinc finger), and in proteins that have a Kelch motif. These domains are about 120 amino acids long. The fold is based on a cluster of six α helices flanked by a β sheet. As a dimer, the N terminus of each chain is associated with the main body of the other chain, generating one of the β sheets between the first β strand of one monomer and the fifth β strand of the other. This, along with the sixth α helix, forms an extended concave surface on the underside of the protein dimer and is implicated in ligand binding.
11. C2 DOMAIN
CATH: 2.60.40.150 SCOP: b.7.1.1, b.7.1.2 InterPro: IPR000008 Pfam: PF00168

The calcium-binding C2 domain comprises approximately 130 residues that form an anti-parallel β sandwich of two β sheets, each containing four strands. The calcium-binding site is located between the loops that connect the second and third β strands and the sixth and seventh β strands. Though forming the same tertiary structure, two distinct topologies exist, differing in their β-strand connectivity. Occurring in single and multiple copies, C2 domains have been found in a wide range of eukaryotic signaling proteins and are involved in a wide range of functions including signal transduction, vesicular transport, GTPase regulation, lipid modification, and protein phosphorylation. The common mechanism by which the C2 domain acts is that calcium binding changes the electrostatic potential and so enhances phospholipid binding, suggesting that the C2 domain functions as an electrostatic switch.

12. CALMODULIN
CATH: 1.10.238.10 SCOP: a.39.1.5 InterPro: IPR011992 Pfam: PF13202

Calmodulin is a small dumbbell-shaped protein composed of two globular domains connected by a flexible linker. It acts as an intermediary protein, sensing calcium levels and relaying signals to various calcium-sensitive enzymes, ion channels, and other proteins. The globular domain of calmodulin is a particular type of EF-hand (see entry for EF-hand Domain) that binds two calcium ions. The linker between the two domains can be highly flexible, permitting calmodulin to interact with a range of target protein partners.

13. CASPASE RECRUITMENT DOMAIN (CARD)
CATH: 1.10.533.10 SCOP: a.77.1.3 InterPro: IPR001315 Pfam: PF00619

The caspase recruitment domain (CARD) has about 94 residues, which form six anti-parallel amphipathic α helices that pack together to form a hydrophobic core. This domain resembles the death domain (see Death Domain).
Helices 2–5 form a four-helix bundle, with the two other helices crossing on top of helices 4 and 5. It is the orientation of the latter two helices that distinguishes the CARD from the death domain, which shares a similar six-helix bundle. One side of the domain has predominantly basic residues, while the other side has predominantly acidic residues; these contribute to the protein–protein interactions that define the CARD's function in the regulation of caspase activation and apoptosis. In addition, a number of CARD proteins have been shown to play a role in regulating inflammation in response to bacterial and viral pathogens as well as to a variety of endogenous stress signals.

14. CENTROMERIC-A-TARGETING DOMAIN (CATD)
CATH: 1.10.20.10 SCOP: a.22.1.1 InterPro: IPR000164 Pfam: PF00125

The centromeric-A-targeting domain (CATD) comprises the first loop and the second α helix of the histone fold domain of CENP-A, the protein that replaces histone H3 in centromeric nucleosomes. It confers a unique structural rigidity to the nucleosomes into which it assembles. CATD is confined to the structured core of the nucleosome, indicating that many of the essential features of CENP-A are within the rigid core of the nucleosome. This supports the model that centromere identity is maintained by a unique nucleosome structure that serves to distinguish the centromere from the rest of the chromosome.

15. COMPLEMENT CONTROL PROTEIN (CCP) DOMAIN
CATH: 2.10.70.10 SCOP: g.18.1.1 InterPro: IPR000436 Pfam: PF00084

Complement control protein (CCP), also called the Sushi domain or short consensus repeat (SCR) domain, is found in a number of complement and adhesion proteins. Approximately 60 amino acids in length, it comprises mainly β strands based on a β sandwich arrangement, with one face formed of three strands and the opposing face of two strands, and with the regions between comprising well-defined turns and less well-defined loops.
CCP domains often occur in tandem arrays linked by short sequences, with up to 30 domains occurring together. Experimental evidence suggests that the stability of a CCP domain is influenced by the other CCP domains with which it is associated. There are a number of conserved residues primarily involved in maintaining the structural fold, including four invariant cysteine residues involved in intramolecular disulfide bonds, a highly conserved tryptophan, and a conserved glycine.

16. CALPONIN HOMOLOGY (CH) DOMAIN
CATH: 1.10.418.10 SCOP: a.40.1.1 InterPro: IPR001715 Pfam: PF00307

The calponin homology (CH) domain has an all α-helical architecture dominated by four α helices, each comprising 11–18 residues, connected by relatively long loops. A further three shorter and less well-ordered α helices are also present. Often found in pairs, CH domains form the F-actin-binding region of proteins belonging to a large superfamily of cytoskeletal proteins responsible for the localization and cross-linking of filamentous actin. The domain is also present in signaling proteins involved in modulation of actin filaments, often as a single domain, and in these it may not directly interact with actin. As a dimer, the last helix of the first CH domain and the first helix of the second domain bind F-actin. Phosphatidylinositol 4,5-bisphosphate modulates the function of actin-binding proteins through the CH domain, interacting with the second helix of the second CH domain, and thus may regulate by a competitive mechanism.

17. COLD SHOCK DOMAIN (CSD)
CATH: 2.40.50.140 SCOP: b.40.4.5 InterPro: IPR002059 Pfam: PF00313

The cold shock domain (CSD) is a small (approximately 70 residues), ancient nucleic acid-binding domain found in all kingdoms of life. Its structure consists of a nearly closed anti-parallel β barrel formed of a three-stranded β sheet crossing at 90° over a β ladder. The first strands of the sheet and ladder are twisted to form the barrel.
The connecting loops are generally short except for the loop that joins the sheet to the ladder, which is relatively long. Two consensus RNA-binding motifs are found on adjacent β strands of the β sheet. They contain aromatic residues that enable base stacking with single-stranded DNA. In bacteria this domain is found by itself, but eukaryotic counterparts contain additional domains at the N and C termini and are referred to as Y-box binding proteins. In plants the CSD is found with additional domains at the C terminus.

18. COMPLEMENT C1r/C1s, Uegf, Bmp1 (CUB) DOMAIN
CATH: 2.60.120.290 SCOP: b.23.1.1 InterPro: IPR000859 Pfam: PF00431

The complement C1r/C1s, Uegf, Bmp1 (CUB) domain is an approximately 110 residue domain often occurring as repeat arrays in a range of extracellular and plasma membrane-associated proteins. The domain consists of two four- to five-stranded β sheets. Four conserved cysteine residues form two disulfide bridges at opposite edges of the same face of the β sandwich. CUB domains are associated with a wide variety of functions, from complement activation to developmental patterning, tissue repair, and cell signaling. Many of the proteins are proteases; although the role of the CUB domain has yet to be fully established, CUB domains have been shown to be involved in oligomerization as well as in the recognition of substrates and protein–protein interaction partners.

19. CYCLIN-BOX DOMAIN
CATH: 1.10.472.10 SCOP: a.74.1.1 InterPro: IPR006671, IPR013763 Pfam: CL0065

The cyclin-box is an approximately 100-residue domain found in all cyclin and cyclin-like domains that acts as a generalized adaptor motif to recognize diverse proteins and DNAs that are involved in cell cycle and transcriptional regulation. It consists of five helices with a central helix (helix 3) surrounded by the other four helices.
Often the cyclin-box region is duplicated to form a paired N- and C-terminal set of repeats, although there is little sequence similarity between the two. In addition, the repeats may have embellishments to this core such as extra helices or loop regions. No single residue is completely conserved among all the different domains; however, some positions are more conserved than others, for example, an alanine that seems to be important for helical packing.

20. DEATH DOMAIN
CATH: 1.10.533.10 SCOP: a.77.1.2 InterPro: IPR000488 Pfam: PF00531

The death domain is one of the largest classes of protein interaction modules and it plays a pivotal role in the apoptosis, inflammation, necrosis, and immune cell signaling pathways. It comprises four subfamilies: the death domain, the death effector domain, the caspase recruitment domain, and the pyrin domain. All share a common six-helical structural fold, although individual subfamilies have distinct structural embellishments and conserved sequence characteristics that are unique to the subfamily, for example, in length and/or direction of helices. In addition, sequence similarity is low among subfamily members, resulting in entirely different surface features that may be responsible for specificity in protein–protein interactions. The domains mostly occur in combination with domains outside of the death domain superfamily or with other subfamily domains, although they can be found as the only motif in the protein.

21. Dbl HOMOLOGY (DH) DOMAIN
CATH: 1.20.900.10 SCOP: a.87.1.1 InterPro: IPR000219 Pfam: PF00621

The Dbl homology (DH) or RhoGEF domain is approximately 150 residues in length and consists of an elongated all α-helical bundle composed of nine α helices and four 3₁₀ helices. The DH domain contains three highly conserved blocks of sequence, which form three long helices that pack together to form the core of the domain.
A U-shaped arrangement of the helices, with a block of three shorter helices stacking against the three long helices, forms a structural scaffold against which the pleckstrin homology (PH) domain packs; the PH domain invariably follows the DH domain in sequence. The presence of the PH domain is not absolutely required for catalysis of nucleotide exchange, but does appear to greatly increase catalytic efficiency in many cases.

22. DEXD/H DOMAIN
CATH: 3.40.50.300 SCOP: c.37.1.16 InterPro: IPR011545 Pfam: PF00270

DEXD/H is a motif, written in single-letter amino acid code, found in a highly conserved helicase domain. These proteins belong to a large grouping of proteins that can be subclassified into five superfamilies, one of which contains the DEXD/H-box, and can be further arranged into subgroups. The helicase domain with the DEXD/H motif is a parallel α/β structure sharing the same topology as the RecA-like domain, consisting of five parallel β strands surrounded by five α helices and extended by a further two parallel β strands and two α helices. A number of other motifs, such as the Q-motif, which is unique to the DEXD/H proteins, are found within this domain. These motifs are located at the strand–loop or helix–loop transitions. In addition to the domain that contains the DEXD/H motif, all solved structures of DEXD/H-box helicases contain a second domain with motifs that may coordinate ATPase and unwinding activities. Flanking sequences that are variable in length and composition are thought to provide additional interactions with substrates/co-factors or confer additional activities.

23. EF-HAND DOMAIN
CATH: 1.10.238.10, 1.10.238.110 SCOP: a.39.1.10, a.39.1.4, a.39.1.5, a.39.1.6, a.39.1.7, a.39.1.8, a.39.1.9 InterPro: IPR002048 Pfam: PF00036

The EF-hand is a helix-loop-helix structural motif that binds calcium or occasionally magnesium.
It is characterized by a 12-residue sequence corresponding to the loop, which coordinates the metal ion in a pentagonal bipyramidal configuration. The six residues involved in coordinating calcium are denoted X, Y, Z, −Y, −X, and −Z. The last residue is an invariant Glu or Asp providing two oxygen atoms for coordinating the calcium ion. The two helices are oriented like the spread thumb and forefinger of the human hand, giving rise to the name. A number of these structural motifs appear together within a protein, for example, three in parvalbumin, and two in calmodulin and troponin C. Each motif within a protein can have a different binding affinity for the metal ion.

24. EPIDERMAL GROWTH FACTOR (EGF)-LIKE DOMAIN
CATH: 2.10.25.10 SCOP: g.3.11.1 InterPro: IPR006209 Pfam: PF00008

The epidermal growth factor (EGF)-like domain is made up of about 30–40 residues and is often found in the extracellular domain of membrane-bound proteins or in proteins that are known to be secreted. The structure consists of a two-stranded β sheet followed by a loop to a short C-terminal two-stranded sheet and is held together by six cysteine residues that form three disulfide bonds. Proteins can contain more than one copy of the domain.

25. FERM DOMAIN
CATH: 3.10.20.90, 1.20.80.10, 2.30.29.30 SCOP: a.11.2.1, b.55.1.5, d.15.1.4 InterPro: IPR000299 Pfam: PF09379, PF00373, PF09380

The FERM domain (F for 4.1 protein, Ezrin, Radixin, Moesin) is an N-terminal domain made up of three subdomains featuring a ubiquitin-like fold, a four-helix bundle, and a phosphotyrosine-binding-like domain. These subdomains are organized by intermediate interdomain interactions to form characteristic grooves and clefts that together form a compact clover-shaped structure.
One groove, created by the fourth β strand of the ubiquitin-like fold and the first α helix of the phosphotyrosine-binding-like domain, is positively charged, while a second groove, found between the four-helix bundle and the phosphotyrosine-binding-like domain, is negatively charged. This produces a pronounced polarization around the FERM domain. The two grooves have been shown to be the points of interaction between the domain and specific membrane-bound proteins.

26. FIBRONECTIN TYPE III (FNIII) DOMAIN
CATH: 2.60.40.10 SCOP: b.1.2.1 InterPro: IPR003961 Pfam: PF00041

The fibronectin type III (FNIII) domain is one of the three different kinds of structural units found in fibronectin, a multifunctional protein of the extracellular matrix and serum. It is characterized by a consensus sequence of approximately 90 residues that forms a fold similar to that of an immunoglobulin domain, consisting of seven β strands that form a sandwich of two anti-parallel β sheets, one containing three strands and the other four. The superfamily of sequences believed to contain FNIII repeats represents 45 different families that are widely distributed in animal species, but also found more sporadically in yeast, plant, and bacterial proteins.

27. FORMIN HOMOLOGY (FH) DOMAIN
CATH: SCOP: a.207.1.1 InterPro: IPR015425 Pfam: PF02181

Formin proteins are a family of highly conserved eukaryotic proteins implicated in a range of actin-based processes. The defining feature of formins is a highly conserved approximately 400 residue domain, the formin homology 2 (FH2) domain. It forms an almost entirely α-helical dimeric structure consisting of a number of somewhat arbitrarily defined subdomains. The N terminus forms a ‘lasso,’ a region that encircles the C-terminal helix of the dimer-related subunit, followed by a linker region, a globular subunit and coiled coil region, and a terminal helix.
FH2 domains dimerize to promote self-association of formin proteins. Two other formin homology domains (FH1 and FH3) are also characteristic of the formin protein. The proline-rich FH1 domain is involved in interacting with a wide variety of other proteins and the less well-conserved FH3 domain is important for determining intracellular localization of formin family proteins.

28. GREEK KEY MOTIF
CATH: SCOP: InterPro: Pfam:

The Greek key motif is a very common structural unit in proteins. It is defined as four β strands with a ‘+3,–1,–1’ topology. These motifs share little or no sequence similarity or common function and are found in a wide range of proteins of either all β or α + β classes. Despite the similarities in topology, they have different three-dimensional structures depending on the hydrogen-bonding pattern within the motif and thus can be subclassified into three distinct classes.

29. HAMP DOMAIN
CATH: 1.10.287.130 SCOP: a.30.2.1 InterPro: IPR003660 Pfam: PF00672

The HAMP domain is present in histidine kinases, adenyl cyclases, methyl-accepting proteins, and phosphatases, which gives rise to its name. It comprises approximately 50 residues forming two long α helices that span membranes in prokaryotes, fungi, plants, and protists, and it functions to connect extracellular sensory domains with intracellular signaling domains. It has a heptad repeat, which is a hallmark of a coiled coil structure. The two helices are connected by a loop of approximately 13 residues which has been observed to be tightly packed into the groove between the two helices in experimentally resolved structures. The ability to form a coiled coil has been proposed as part of the mechanism of signal transduction in which the domain alternates between two parallel helices and a canonical coiled coil.
30. HEAT REPEAT DOMAIN
CATH: 1.25.10.10 SCOP: a.118.1.2 InterPro: IPR000357 Pfam: PF02985

The HEAT repeat (named after four cytoplasmic proteins it is found in: Huntingtin, elongation factor 3 (EF3), protein phosphatase 2A (PP2A), and the yeast PI3-kinase TOR1) is related at the superfamily level to the ARM/armadillo repeat domain. It consists of repeats 37–47 amino acids in length formed of two anti-parallel α helices and two turns arranged about a common axis with conserved asparagine and arginine residues at positions 19 and 25. These repeats are linked by flexible inter-unit loops. HEAT repeats occur in series consisting of 3–36 units to form rod-like helical structures that can act as protein–protein interaction surfaces.

31. HELIX-TURN-HELIX DNA-BINDING MOTIF
CATH: 1.10.10.60 SCOP: a.4.1.1 InterPro: IPR000047 Pfam:

The helix-turn-helix motif is one of the principal structural motifs capable of binding DNA. As the name suggests, the HTH motif is made up of helices 1 and 2 in the 3-helix structure shown. It functions as a DNA recognition and binding motif, binding in the major groove of the DNA duplex, with the second helix contributing most to the recognition of the correct DNA strand and termed the recognition helix. The first helix stabilizes the interaction with the DNA through hydrogen bonds and van der Waals interactions and is always in the same relative orientation to the recognition helix. The helix-turn-helix motif can be found in various combinations with other secondary structural elements and in multiple copies within the same domain.

32. IMMUNOGLOBULIN (Ig) DOMAIN
CATH: 2.60.40.10 SCOP: b.1.1.1, b.1.1.4 InterPro: IPR013151 Pfam: PF00047

The immunoglobulin domain is one of the most populous protein families in the human genome, with 765 members identified. It is found in many eukaryotes as well as bacteria, probably through horizontal gene transfer.
These domains contain about 70–110 amino acids and are subcategorized according to size and function. They have seven to nine anti-parallel β strands forming a barrel-like shape, although due to the lack of hydrogen bonds around the barrel, they are in effect two distinct β-pleated sheets and form a β sandwich. This is often termed a simple Greek key, which is shared with a number of other domains. Interactions between hydrophobic amino acids in the interior of the sandwich and highly conserved disulfide bonds formed between cysteine residues in the second and sixth strands stabilize the Ig fold.

33. JELLY ROLL FOLD
CATH: 2.60.120.10 SCOP: b.82.1.1, b.82.1.10, b.82.1.11, b.82.1.12, b.82.1.15, b.82.1.16, b.82.1.18, b.82.1.19, b.82.1.2, b.82.1.20, b.82.1.22, b.82.1.23, b.82.1.24, b.82.1.3, b.82.1.5, b.82.1.6, b.82.1.7, b.82.1.8, b.82.1.9, b.82.2.13, b.82.3.1, b.82.3.2, b.82.3.3 InterPro: IPR014710 Pfam:

The term ‘jelly roll fold’ was first coined to describe a more complicated version of the Greek key topology (see Greek key entry). The same topology has been described as a wedge shape, β barrel, a β sandwich, and an eight-stranded β barrel with a β roll topology. The jelly roll motif can be considered to be a single long β hairpin coiled in a helical manner to form two four-stranded anti-parallel β sheets. Various structural embellishments ranging from extensive regions of coil to additional sheets and helices are permissible. The topology is well conserved even in cases of little sequence similarity. The fold exists in many functional contexts including glucose 6-phosphate isomerase, germin (a metal-binding protein with oxalate oxidase and superoxide dismutase activities), auxin-binding protein, seed storage protein 7S, and acireductone dioxygenase, among others.
34 KELCH MOTIF/DOMAIN
CATH: 2.130.10.80 SCOP: b.68.11.1 InterPro: IPR006652 Pfam: PF01344
The Kelch motif is a short repeat motif of approximately 50 residues comprising a four-stranded anti-parallel β sheet; it is usually repeated six or seven times to form a propeller-like structure. Sequence identity is relatively low between individual repeats, ranging from 11 to 50%. A key set of conserved residues in Kelch repeats distinguishes them from the large group of WD repeat proteins that also form β propellers. The Kelch domain may be associated with other domains at both the N and C termini or can be found by itself.

35 K HOMOLOGY (KH) DOMAIN
CATH: 3.30.1370.10, 3.30.300.20, 3.30.1140.32 SCOP: d.51.1.1, d.52.3.1, i.1.1.1 InterPro: IPR004088 (Type I), IPR004044 (Type II) Pfam: PF00013 (Type I), PF07650 (Type II)
The K homology domain is an approximately 75 residue conserved sequence present in an assortment of nucleic acid-binding proteins. Though the KH motif is conserved, structural studies have revealed that there are actually two different versions, named type I (left) and type II (right), which have two different folds. Type I KH domains have a β sheet composed of three β strands (ordered as β1, β2, β3) and abutted by three α helices, with β1 and β2 parallel to each other and β3 anti-parallel to both. In Type II KH domains the β1 and β3 strands are adjacent to each other and the β2 strand is adjacent and anti-parallel to the β1 strand. A main variable loop region is different in the two types of domain, occurring between β3 and β2 in Type I and between β2 and β1 in Type II. KH domains are often found in multiple copies, with some evidence that the relative orientations of tandem repeats are quite different between Type I and Type II.
The KH domains have been hypothesized to have evolved from a common ancestor through N- and C-terminal extensions, or by extension, displacement, and deletion from one of the existing topologies.

36 LD MOTIF
CATH: SCOP: InterPro: (IPR001904) Pfam:
The LD motif is a short leucine-rich sequence with the general consensus LDXLLXXL, where X can be any residue. It was first identified in paxillin, where the motif is repeated five times and the conserved leucine and aspartate residues are at the beginning of the sequence (except in the third repeat, where LD is substituted with VE), giving rise to the name. LD motifs are highly conserved throughout the paxillin superfamily, in members such as leupaxin, Hic-5, and PaxB, as well as across a diverse set of species. The fragments of LD motifs whose structures have been solved form a predominantly α-helical structure.

37 LIM DOMAIN
CATH: 2.10.110.10 SCOP: g.39.1.3 InterPro: IPR001781 Pfam: PF00412
LIM domains, first discovered in the proteins Lin11, Isl-1, and Mec-3, are composed of approximately 55 residues, 8 of which (mostly cysteine and histidine) are highly conserved and located at defined intervals. This conservation indicates that the LIM domain binds metal cofactors, and it has been shown to bind two zinc ions. In fact the LIM domain consists of two zinc fingers (see entry for Zinc Finger Domain), each of which comprises two orthogonally packed anti-parallel β hairpins. Rubredoxin-type zinc knuckles connect the strands of the first and third β hairpins, while the second and fourth β hairpins are connected by tight turns containing a moderately conserved glycine. The second of the zinc fingers is terminated by an α helix. The secondary structure and tertiary fold are established by the conserved tetrahedral zinc coordination. LIM domains function as a modular protein-binding interface, mediating protein–protein interactions.
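The LD motif consensus quoted in entry 36, LDXLLXXL, maps directly onto a regular expression. The following is a minimal sketch only: the function name and the test peptide are hypothetical, and a match to the consensus does not by itself show that a sequence is a functional LD motif.

```python
import re

# Consensus from the LD motif entry: L-D-X-L-L-X-X-L, where X is any residue.
# The lookahead lets overlapping candidates be reported as well.
LD_CONSENSUS = re.compile(r"(?=(LD[A-Z]LL[A-Z]{2}L))")


def find_ld_motifs(sequence):
    """Return (0-based start, matched text) for each consensus hit."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in LD_CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical peptide, used for illustration only.
    print(find_ld_motifs("MSLDALLADLESTT"))
```

A variant such as the VE substitution noted for the third paxillin repeat would not match this strict pattern; relaxing the first two positions (for example `[LV][DE]`) is one possible adjustment, at the cost of more false positives.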
38 LEUCINE-RICH REPEATS (LRR) DOMAIN
CATH: 3.80.10.10 SCOP: c.10.1.1, c.10.2.1, c.10.2.2, c.10.2.3, c.10.2.4, c.10.2.6, c.10.2.7, c.10.2.8, c.10.3.1 InterPro: IPR001611 Pfam: PF00560
Leucine-rich repeats (LRRs) comprise a motif 20–30 residues in length with a highly conserved segment consisting of an 11-residue stretch LXXLXLXX(N/T/S/C)XL or a 12-residue stretch LXXLXLXX(C/S)XXL, in which L is valine, leucine, or isoleucine. Typically, each repeat unit has a β strand–turn–α helix structure, although the helical portion is quite variable and the α helix may be replaced by a 3₁₀-helix, a polyproline II (pII) helix, or a β turn. The number of repeats ranges from 2 to 45, adopting a characteristic arc or horseshoe shape with a parallel β sheet on the concave (inner) face. The concave face and the adjacent loops are the most common protein interaction surfaces on LRR proteins. LRRs occur in organisms from viruses to eukaryotes, and appear to provide a structural framework for the formation of protein–protein interactions.

39 NAD(H)/NAD(P)-BINDING DOMAIN
CATH: 3.40.50.720 SCOP: c.2.1 InterPro: IPR016040 Pfam: CL0063
The nicotinamide adenine dinucleotide (NAD)-binding domain is an ancient protein domain superfamily that is found in all kingdoms of life. It consists of a Rossmann-like fold with a three-layered α helix–β sheet–α helix arrangement, where the six β strands are parallel. The strands are in the order 6-5-4-1-2-3, with a long loop between strands 3 and 4 that creates a natural cavity for binding the adenine ring of the NAD cofactor. An extensive network of hydrogen bonds and van der Waals interactions gives rise to a consensus sequence associated with NAD binding, GXGXXG, with the first two glycine residues involved in binding the NAD, while the third is involved in protein packing.
As well as NAD, the domain can accommodate NADP, either through a conformational change in the loop that connects the second strand to the second helix or through a mutation of a conserved aspartate to an asparagine.

40 OLIGONUCLEOTIDE/OLIGOSACCHARIDE-BINDING FOLD (OB) DOMAIN
CATH: 2.40.50.140 SCOP: b.40.4.1, b.40.4.3 InterPro: IPR004365 Pfam: CL0021
The oligonucleotide/oligosaccharide-binding fold domain is a 70–105 residue structural motif whose variants share little sequence similarity. The variability in length results from dramatic differences in the size of variable loops found between well-conserved elements of secondary structure. Like the Ig domain, the structure contains a Greek key motif consisting of two three-stranded anti-parallel β sheets, where one strand is shared between both sheets. The β sheets pack orthogonally, forming a flattened β barrel. Frequently a helix is found between strands 3 and 4 that packs against one end of the barrel. Two glycines, or other small residues, contribute to the OB fold, one in the first half of the first strand and one at the beginning of strand 4. OB-fold domains tend to use a common ligand-binding interface centered on strands 2 and 3 and the loops between strands 1 and 3, strand 3 and the helix, the helix and strand 4, and strand 4 and strand 5. This defines a cleft that runs perpendicular to the β-barrel axis and is where the majority of nucleic acid-binding partners bind.

41 PAS DOMAIN
CATH: 3.30.450.20 SCOP: d.110.3.1, d.110.3.2 InterPro: IPR013767 Pfam: PF00989
The PAS domain, which takes its name from three proteins (the period circadian protein, aryl hydrocarbon receptor nuclear translocator protein, and single-minded protein), is a sensor domain involved in signal transduction that is found in a wide range of organisms. It is formed of a structurally conserved α/β fold with little conservation between sequences.
The structure consists of a central six-stranded β sheet with the N- and C-terminal β strands at the center. The domain can be divided into four segments: the first is an N-terminal helical lariat; the second consists of the first three strands of the central β sheet core, interleaved with a hairpin turn and two short α helices; the third comprises a helical connector running diagonally across the β sheet; and the fourth consists of the last three strands of the β sheet, which are connected by a final hairpin turn.

42 POLO-BOX DOMAIN (PBD)
CATH: 3.30.1120.30 SCOP: d.223.1.1, d.223.1.2 InterPro: IPR000959 Pfam: PF00659
The Polo-box domain (PBD) contains at its core a continuous six-stranded anti-parallel β sheet and an α helix. It occurs as a linked homodimer whose two halves are related by 2-fold symmetry and together form a 12-stranded β sandwich flanked by three α-helical segments. The domain regulates the kinase activity of the protein in which it is found; this protein is located in the centrosomes, kinetochores, and central spindle structures during mitosis and promotes mitosis and cytokinesis by phosphorylating a range of substrates. The phospho-peptide binds in a cleft located between the two PBDs, making a short anti-parallel β sheet that stabilizes the interaction. Histidine and lysine residues, which are among the few residues highly conserved between Polo-box domains, are involved in interacting with the phospho-peptide.

43 PDZ DOMAIN
CATH: 2.30.42.10 SCOP: b.36.1.1, b.36.1.2, b.36.1.3, b.36.1.4, b.36.1.6 InterPro: IPR001478 Pfam: PF00595
The PDZ domain, taking its name from three proteins in which it was first observed (post synaptic density protein, Drosophila disc large tumor suppressor, and zonula occludens-1 protein), is found in all kingdoms of life.
It comprises six β strands and two α helices that fold to form a six-stranded β sandwich. PDZ domains are modular protein interaction domains that contribute to protein targeting and complex assembly. The C-terminal peptides of the target protein bind as an anti-parallel β strand in a groove between the second strand and the second helix, in essence extending the β sheet. A conserved set of residues (GLGF), found in the loop between the first and second strand, is important for stabilizing the C-terminal carboxylate group. The N and C termini are located on the opposite side to the binding site, a feature shared with other protein interaction domains such as SH2.

44 PLECKSTRIN HOMOLOGY (PH) DOMAIN
CATH: 2.30.29.30 SCOP: b.55.1.1 InterPro: IPR001849 Pfam: PF00169
The pleckstrin homology (PH) domain is approximately 120 residues in length and commonly found as a constituent of signaling proteins as well as proteins of the cytoskeleton. Its basic structure consists of a seven-stranded anti-parallel β sheet that has a strong bend, resulting in a conformation that is referred to as an orthogonal sandwich or up-and-down β barrel. In addition there is an amphipathic α helix that blocks one end of the bent sheet. The loops connecting the β strands differ greatly in length, providing the source of the domain’s specificity to a range of phosphoinositides, which differ by being phosphorylated at different sites within the inositol ring. The only conserved residue among PH domains is a single tryptophan located within the α helix that serves to nucleate the core of the domain.

45 PHOSPHOTYROSINE-BINDING (PTB) DOMAIN
CATH: 2.30.29.30 SCOP: b.55.1.2 InterPro: IPR013625 Pfam: PF08416
The phosphotyrosine-binding domain, as the name suggests, binds phosphotyrosine.
It is structurally similar to the pleckstrin homology domain (see entry for Pleckstrin Homology Domain) and consists of a β sandwich containing two nearly orthogonal, anti-parallel β sheets and three α helices. One β sheet is made up of the first four strands, while the second is made up of the remaining strands plus parts of the first and second strands. The phospho-peptide-binding site is formed by the fifth strand, the C-terminal α helix, and the loop connecting the first strand and the second α helix. The N terminus of the phospho-peptide adopts an extended conformation, forming an additional strand to the β sheet.

46 RNA RECOGNITION MOTIF (RRM) DOMAIN
CATH: 3.30.70.330 SCOP: d.58.7.1, d.58.7.3 InterPro: IPR000504 Pfam: PF00076
The RNA recognition motif is one of the most abundant eukaryotic protein domains and is commonly found in all kingdoms of life. The RRM domain consists of an αβ sandwich with a β1-α1-β2-β3-α2-β4 topology, comprising one four-stranded anti-parallel β sheet and two α helices packed against the β sheet. Four highly conserved residues contributing to RNA binding are located in the central strands of the β sheet, with other conserved residues making up a consensus of approximately 90 residues forming the hydrophobic core. A common archetype of RRM–RNA interaction is defined by two nucleotides stacking against two aromatic rings located on the middle strands of the β sheet. A third aromatic ring interacts with sugar rings of the RNA, and a positively charged side chain forms a salt bridge with the phosphate between the two nucleotides. Combinations of two or more RRM domains allow for continuous recognition of a long nucleotide sequence. Recent studies have shown that as well as RNA recognition the RRM domain is also involved in protein–protein interactions.
47 S1 DOMAIN
CATH: 2.40.50.140 SCOP: b.40.4.16 InterPro: IPR003029 Pfam: PF00575
The S1 domain is an approximately 70–80 residue domain discovered in ribosomal S1 protein and found in a range of RNA-binding proteins, especially those associated with the initiation of translation and turnover of mRNA. The S1 domain consists of a five-stranded anti-parallel β barrel, with the β strands arranged in a Greek key topology and a β bulge in the first strand permitting the formation of the barrel. Some of the connecting loops contain very small sections of α helix. The termini are orientated such that arrays of S1 domains can be arranged in a consecutive fashion. Residues on the surface of the domain are not strictly conserved, reflecting the varied specificity of RNA binding of the S1 domain. Structural similarity to cold shock domain family proteins, at least one other ribosomal protein, and domains of several aminoacyl-tRNA synthetases indicates that they all diverged from an ancient nucleic acid-binding domain.

48 Src HOMOLOGY 2 (SH2) DOMAIN
CATH: 3.30.505.10 SCOP: d.93.1.1 InterPro: IPR000980 Pfam: PF00017
The Src homology 2 (SH2) domain consists of approximately 100 residues with two α helices and seven β strands; five of the β strands form a central β sheet flanked by the two α helices, with the remaining two β strands at the N and C termini. Unlike the SH3 and PDZ domains, SH2 domains function specifically in protein tyrosine kinase pathways, because their binding depends on tyrosine phosphorylation (pTyr). Two regions mediate this specificity: the first is the phosphorylated tyrosine residue-binding site; the second is a region that interacts with ligand residues C-terminal to the pTyr. Most of the binding interactions occur in the loop between β strands two and three. In addition, further interactions occur in a deep hydrophobic binding pocket that interacts with a pTyr plus three residues.
They are found in a wide variety of protein contexts and are frequently found as repeats in a single protein sequence.

49 Src HOMOLOGY 3 (SH3) DOMAIN
CATH: 2.30.30.40 SCOP: b.34.2.1 InterPro: IPR001452 Pfam: PF00018
The Src homology 3 (SH3) domain is approximately half the size of the SH2 domain, consisting of around 50 residues. The structure is composed of five or six β strands arranged as two tightly packed anti-parallel β sheets, which are arranged in a barrel-like structure. The linker regions between the strands may contain short α helices. The domain contains a relatively flat hydrophobic ligand-binding pocket consisting of three shallow grooves defined by conserved aromatic residues, in which the protein ligand adopts an extended left-handed helical arrangement. It recognizes proline-rich sequences, in particular those containing a PXXP motif. Recent studies have shown that the specificity and cellular function of SH3 domains are far more diverse than previously appreciated. Like the SH2 domain they are found in the context of other domains and may mediate many diverse processes such as increasing local concentration of proteins, altering their subcellular location, and mediating the assembly of large multiprotein complexes.

50 SPECTRIN-LIKE REPEATS
CATH: 1.20.58.60 SCOP: a.7.1.1 InterPro: IPR002017 Pfam: PF00435
Cytoskeletal proteins of the spectrin family have an elongated structure composed of repeating units termed the spectrin-like repeat. This unit comprises a triple helical structure, with three long helices separated by a loop between the first and second helix and a turn between the second and last helix. There is little sequence similarity between repeats, although some residues are more highly conserved and correspond to a set of residues between the a and d heptad positions in the helical bundle.
The repeats are defined by a characteristic tryptophan residue in the first helix and a leucine two residues from the carboxyl end of the third helix. The second helix is interrupted by proline in some sequences.

51 UBIQUITIN FOLD
CATH: 3.10.20 SCOP: k.45.1.1 InterPro: IPR000626 Pfam: PF00240
The ubiquitin domain has an overall topology of an α/β roll consisting of five β strands and two α helices in the order β1–β2–α1–β3–β4–α2–β5. Some members of the superfamily have additional decorations to the basic fold, including a small helix or an additional strand before the first principal helix, or replacement of the second helix with a β strand. In addition, the length of the connecting loops can vary considerably, which might modulate the interaction with other domains or proteins. Ubiquitin is a small domain of just 76 residues, with poor sequence conservation between superfamily members except for seven lysine residues, which are important for linking to the target protein or another ubiquitin molecule to form a ubiquitin chain that attaches itself to a target protein. The length and branching pattern of the ubiquitin chain alter the fate of the target protein.

52 VON WILLEBRAND FACTOR TYPE A DOMAIN
CATH: 3.40.50.410 SCOP: c.62.1.1, c.62.1.2, c.62.1.4 InterPro: IPR002035 Pfam: PF00092
von Willebrand factor is a large multimeric protein that has a central role in hemostasis and thrombosis. It comprises a mosaic of many types of domains termed type A to D, with type A found in a number of other proteins such as complement factor B, the integrins, and collagen types VI, VII, XII, and XIV among others. All von Willebrand factor type A domains share a common fold of a central β sheet with one anti-parallel edge strand, flanked on two sides by amphipathic helices that lie against the β sheet face, forming a globular domain. Often an Mg²⁺ ion is bound to the carboxy-terminal end of the β sheet.
In von Willebrand factor domains with a bound metal ion, the metal is coordinated by residues in loops that constitute what is termed a metal ion-dependent adhesion site (MIDAS) and is involved in allosteric movement of the C-terminal α7 helix from the ligand-binding face at the MIDAS to the opposite end of the domain, which contacts other domains.

53 WD40 REPEAT DOMAIN
CATH: 2.130.10.10 SCOP: b.69.4.1 InterPro: IPR001680 Pfam: PF00400
The WD40 repeat is an approximately 40-residue tract characterized by a glycine-histidine (GH) dipeptide 11–24 residues from the N terminus and a tryptophan-aspartic acid (WD) dipeptide at the C terminus, which gives rise to its name. The WD domains contain 4–16 WD repeats, which circularize to form a β propeller (see entry for β-Propeller Fold). Each repeat contains four β strands, although the repeat is not equivalent to a single propeller blade. In fact each propeller blade contains the first three strands of one repeat and the last strand of the previous repeat. This sharing of strands stabilizes the overall structure. The evolutionary pressure to conserve the sequence is apparently to form the propeller structure, as it provides a stable platform for several protein–protein interactions.

54 WINGED-HELIX DOMAIN (WHD)
CATH: 1.10.10.10 SCOP: a.4.5 InterPro: IPR011991 Pfam: CL0123*
The winged-helix domain is an elaboration of the common helix-turn-helix domain, a common denominator in basal and specific transcription factors found in all kingdoms of life. It comprises a three-helix core in the form of a right-handed helical bundle with a partly open configuration. This core is embellished with a C-terminal β strand hairpin unit that packs against the shallow cleft of the partially open core.
Two extensions between the last helix and the first of the C-terminal β strands and between this β strand and the last β strand form the two ‘wings.’ The wings can play an important part in the interaction with DNA, although this is not considered to be the major interaction site, which is on the third helix. As well as interacting with DNA, the domain can form protein–protein interactions.
* This represents the general helix-turn-helix clan, of which the winged-helix domain is an example.

55 ZINC FINGER DOMAIN
CATH: 3.30.160.60 SCOP: g.37.1.1 InterPro: IPR007087 Pfam: PF00096
Note: References relate to the ‘classic’ zinc finger motif (C2H2).
Several different zinc finger motifs have been characterized, and vary with regard to structure, as well as binding modes and affinities. This entry relates to the classic (C2H2) motif, which is the most common DNA-binding motif found in eukaryotic transcription factors. It contains two conserved cysteine and two conserved histidine residues that coordinate the zinc ion. Generally zinc fingers contain few secondary structural elements, with C2H2 zinc fingers consisting of just two short β strands followed by an α helix. Individual zinc finger domains typically occur as tandem repeats with two, three, or more fingers comprising the DNA-binding domain of the protein; they can bind in the major groove of DNA and are typically spaced at 3-base-pair intervals. The α helix contains the recognition site for sequence-specific contacts with the DNA.

56 ZINC RIBBON DOMAIN
CATH: 2.20.25.10 SCOP: g.41.3 InterPro: IPR013137 Pfam: CL0167
Unlike previously characterized zinc-containing DNA/RNA-binding modules, which contain an α helix, the zinc ribbon domain comprises a three-stranded β sheet.
The domain is similar to the other zinc-containing DNA/RNA-binding modules (see Zinc Finger Domain) inasmuch as it forms a small globular domain stabilized by the coordination of a zinc ion, with a well-defined secondary structure, nonpolar interior, and a charged surface. The three β strands form an anti-parallel β sheet with the connecting β turns containing a cysteine-2X-cysteine motif, the cysteines of which are involved in the zinc ion coordination. Homologous proteins containing the domain have been found in all kingdoms of life. In Eukaryota the domain is found in the N-terminal region of transcription initiation factor TFIIB, which binds to the TATA-binding protein-promoter complex and facilitates the recruitment of RNA polymerase II.
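The cysteine-2X-cysteine motif mentioned for the zinc ribbon can be located in a sequence with a short pattern search. This is a minimal sketch: the function name and the example peptide are hypothetical, and because a C-X-X-C pair is very common in proteins, hits should be read only as candidate zinc-coordinating turns.

```python
import re

# Cysteine, any two residues, cysteine (C-X-X-C), as described in the entry.
# The lookahead reports overlapping pairs such as CAACAAC as two hits.
CXXC = re.compile(r"(?=(C[A-Z]{2}C))")


def find_cxxc(sequence):
    """Return (0-based start, matched text) for each C-X-X-C pair."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CXXC.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical peptide with two C-X-X-C pairs, for illustration only.
    print(find_cxxc("MKCPECGSTNAAACRVCGAK"))
```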