Protein structure
Anne Mølgaard, Center for Biological Sequence Analysis

“Could the search for ultimate truth really have revealed so hideous and visceral-looking an object?”
Max Perutz, 1964, on protein structure
John Kendrew, 1959, with myoglobin model

Holdings of the Protein Data Bank (PDB)
             X-ray     NMR   theoretical   total
Sep. 2001    13116    2451       338       15905
Feb. 2005    25350    4383         0       29733

Methods for structure determination
• X-ray crystallography
• Nuclear Magnetic Resonance (NMR)
• Modeling techniques

Modeling
• Only applicable to ~50% of sequences
• Fast
• Accuracy poor for low sequence identity
• There is still need for experimental structure determination!

Structural Genomics Consortium (SGC)
• The SGC deposited its 275th structure into the Protein Data Bank in August 2006
• Currently operating at a pace of 170 structures per year
• At a cost of US$125,000 per structure
• Scientific highlights include:
  – several (> 1!!) novel structures of protein kinases
  – completing the structural descriptions of the human 14-3-3, adenylate kinase and cytosolic sulfotransferase protein families
  – human chromatin modifying enzymes; human inositol phosphate signaling
  – and a significant number of structures from human parasites
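The PDB holdings table and the SGC throughput figures invite a quick back-of-the-envelope check. The sketch below uses only the numbers quoted on the slides to compute each method's share of the PDB and the yearly SGC spend implied by 170 structures at US$125,000 each. The variable and function names are illustrative and do not come from the original slides.

```python
# Figures copied from the slides; all names here are illustrative.
PDB_HOLDINGS = {
    "Sep. 2001": {"X-ray": 13116, "NMR": 2451, "theoretical": 338},
    "Feb. 2005": {"X-ray": 25350, "NMR": 4383, "theoretical": 0},
}

SGC_STRUCTURES_PER_YEAR = 170
SGC_COST_PER_STRUCTURE_USD = 125_000


def method_shares(counts):
    """Return each method's fraction of the total holdings."""
    total = sum(counts.values())
    return {method: n / total for method, n in counts.items()}


def sgc_annual_cost():
    """Implied yearly spend: structures per year times cost per structure."""
    return SGC_STRUCTURES_PER_YEAR * SGC_COST_PER_STRUCTURE_USD


if __name__ == "__main__":
    for date, counts in PDB_HOLDINGS.items():
        shares = method_shares(counts)
        summary = ", ".join(f"{m}: {s:.1%}" for m, s in shares.items())
        print(f"{date} (total {sum(counts.values())}): {summary}")
    print(f"Implied SGC spend: US${sgc_annual_cost():,} per year")
```

With the slide figures, the per-method counts add up to the table's totals (15905 and 29733), and the implied SGC spend comes to US$21,250,000 per year.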
Amino acids
http://www.ch.cam.ac.uk/magnus/molecules/amino/

Amino acids
Livingstone & Barton, CABIOS, 9, 745-756, 1993
A – Ala   C – Cys   D – Asp   E – Glu   F – Phe
G – Gly   H – His   I – Ile   K – Lys   L – Leu
M – Met   N – Asn   P – Pro   Q – Gln   R – Arg
S – Ser   T – Thr   V – Val   W – Trp   Y – Tyr

Levels of protein structure
• Primary
• Secondary
• Tertiary
• Quaternary

Primary structure
MKTAALAPLFFLPSALATTVYLA
GDSTMAKNGGGSGTNGWGEYL
ASYLSATVVNDAVAGRSAR…(etc)

Ramachandran plot
Labelled regions: left-handed α-helix, β-sheet, α-helix

Hydrophobic core
• Hydrophobic side chains go into the core of the molecule, but the main chain is highly polar
• The polar groups (C=O and NH) are neutralized through formation of H-bonds

Secondary structure
• α-helix: C=O(n)…HN(n+4)
• β-sheet (anti-parallel)

… and all the rest
• 3₁₀ helices (C=O(n)…HN(n+3)), π-helices (C=O(n)…HN(n+5))
• β-turns and loops (in old textbooks sometimes referred to as random coil)

The α-helix has a dipole moment
N-terminus (+), C-terminus (−)

Two types of β-sheet
• anti-parallel
• parallel

Tertiary structure (domains, modules)
• Rhamnogalacturonan lyase (1nkg)
• Rhamnogalacturonan acetylesterase (1k7c)

Quaternary structure
• B. caldolyticus UPRTase (1i5e)
• B. subtilis PRPP synthase (1dkr)

Classification schemes
• SCOP – Manual classification (A. Murzin)
• CATH – Semi-manual classification (C. Orengo)
• FSSP – Automatic classification (L. Holm)

Levels in SCOP
Class                                 # Folds   # Superfamilies   # Families
All alpha proteins                       202          342             550
All beta proteins                        141          280             529
Alpha and beta proteins (a/b)            130          213             593
Alpha and beta proteins (a+b)            260          386             650
Multi-domain proteins                     40           40              55
Membrane and cell surface proteins        42           82              91
Small proteins                            72          104             162
Total                                    887         1447            2630
http://scop.berkeley.edu/count.html#scop-1.67

Major classes in SCOP
Classes
– All alpha proteins
– Alpha and beta proteins (a/b)
– Alpha and beta proteins (a+b)
– Multi-domain proteins
– Membrane and cell surface proteins
– Small proteins
Examples
• All α: Hemoglobin (1bab)
• All β: Immunoglobulin (8fab)
• α/β: Triose phosphate isomerase (1hti)
• α+β: Lysozyme (1jsf)

Folds*
• Proteins which have >~50% of their secondary structure elements arranged in the same order in the protein chain and in three dimensions are classified as having the same fold
• No evolutionary relation between proteins
*confusingly also called fold classes

Superfamilies
Proteins which are (remotely) evolutionarily related
– Sequence similarity low
– Share function
– Share special structural features
Relationships between members of a superfamily may not be readily recognizable from the sequence alone

Families
Proteins whose evolutionary relationship is readily recognizable from the sequence (>~25% sequence identity)
Families are further subdivided into Proteins
Proteins are divided into Species
– The same protein may be found in several species

Links
• PDB (protein structure database) – www.rcsb.org/pdb/
• SCOP (protein classification database) – scop.berkeley.edu
• CATH (protein classification database) – www.biochem.ucl.ac.uk/bsm/cath
• FSSP (protein classification database) – www.ebi.ac.uk/dali/fssp/fssp.html

Why are protein structures so interesting?
• They provide a detailed picture of interesting biological features, such as active site, substrate specificity, allosteric regulation etc.
• They aid in rational drug design and protein engineering
• They can elucidate evolutionary relationships undetectable by sequence comparisons

Inferring biological features from the structure
1deo: topological switchpoint (figure labels: NH2, Asp, His, COOH, Ser)

Inferring biological features from the structure
Active site of triose phosphate isomerase (1ag1)
(Verlinde et al. (1991) Eur. J. Biochem. 198, 53)

Engineering thermostability in serpins
• Overpacking
• Buried polar groups
• Cavities
Im, Ryu & Yu (2004) Engineering thermostability in serine protease inhibitors. PEDS, 17, 325-331.

Evolution...
Structure is conserved longer than both sequence and function
• Rhamnogalacturonan acetylesterase (A. aculeatus) (1k7c)
• Platelet activating factor acetylhydrolase (Bos taurus) (1wab)
• Serine esterase (S. scabies) (1esc)
Figure panels: Rhamnogalacturonan acetylesterase, Serine esterase, Platelet activating factor acetylhydrolase
Mølgaard, Kauppinen & Larsen (2000) Structure, 8, 373-383.

"We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest…
…It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material."
J.D. Watson & F.H.C. Crick (1953) Nature, 171, 737.
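As a small illustration of the one-letter amino acid code listed on the Livingstone & Barton slide, the sketch below turns that list into a lookup table and translates the first residues of the primary-structure example into three-letter codes. It covers only the 20 standard residues shown on the slide, and `to_three_letter` is a hypothetical helper name, not something taken from the original slides.

```python
# One-letter to three-letter codes, as listed on the amino acid slide.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def to_three_letter(sequence):
    """Translate a one-letter sequence into hyphen-joined three-letter codes.

    Raises ValueError for any character outside the 20 standard residues.
    """
    codes = []
    for position, residue in enumerate(sequence.upper(), start=1):
        if residue not in ONE_TO_THREE:
            raise ValueError(f"unknown residue {residue!r} at position {position}")
        codes.append(ONE_TO_THREE[residue])
    return "-".join(codes)


if __name__ == "__main__":
    # First ten residues of the primary-structure example on the slides.
    print(to_three_letter("MKTAALAPLF"))
    # prints: Met-Lys-Thr-Ala-Ala-Leu-Ala-Pro-Leu-Phe
```

Raising an error on unknown characters, rather than skipping them, is a deliberate choice in this sketch: a silently dropped residue would shift every later position in the sequence.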