Proper structural fold of protein molecule is essential to execute its precise functional mission
Md Abu Reza, PhD
Associate Professor, Dept of Genetic Eng & Biotech, University of Rajshahi
Bioinformatics Workshop-1
Date: 24th March, 2012
Venue: Dept of Statistics, RU
Higher Education Quality Enhancement Project

Molecular Organization of a cell

Protein: The Master Molecule
• Proteins control all biological systems in a cell.
• They either constitute structure or perform a distinct biological function in a physiological system.
• Although many proteins perform their functions independently, the vast majority interact with other proteins for proper biological activity.
• To perform its function effectively, a protein needs a proper structure. Without it, a protein is useless or may even cause the system to malfunction.
• Conformation and functional-group chemistry control function.
• Proteins are made up of 20 different types of amino-acid monomers.
• Proteins define what an organism is, what it looks like, how it behaves, etc. (they are responsible for most of the phenotype).

Protein Function

Function of Proteins

Protein Function is Related to Structure

What are Proteins?
• Proteins are biochemical compounds consisting of one or more polypeptides, typically folded into a globular or fibrous form in a biologically functional way.
• A polypeptide is a single linear polymer chain of amino acids bonded together by peptide bonds.
• The 20 natural amino acids join in different permutations and combinations and in different lengths.
• Once linked in the protein chain, an individual amino acid is called a residue, and the linked series of carbon, nitrogen, and oxygen atoms is known as the main chain or protein backbone.

Amino Acids
Figure: lysine with the carbon atoms in the side-chain labeled; amino terminal and carboxy terminal marked.

How are peptide bonds formed?
• Here both amino acids have a single hydrogen as the R group, which corresponds to glycine (the R group of alanine is a methyl group, CH3).
• The carboxylic acid end of the first amino acid is oriented toward the amino group of the second amino acid.
• The -OH group and an -H are removed to form water (condensation reaction).
• The bond forms between the terminal carbon of the first amino acid and the nitrogen of the second amino acid.
• The backbone of the molecule has the sequence N-C-C-N-C-C.
• Polypeptides maintain this sequence no matter how long the chain.
• The R groups project from the backbone.
• As amino acids are added during translation, the polypeptide folds up into its specific shape.

Colour codes used for atoms
Element          Color name
Carbon           light grey
Oxygen           red
Hydrogen         white
Nitrogen         light blue
Sulfur           yellow
Phosphorus       orange
Chlorine         green
Bromine, Zinc    brown
Sodium           blue
Iron             orange
Magnesium        dark green
Calcium          dark grey
Unknown          deep pink

Stereochemistry
The CORN Law

Structure of the 20 naturally occurring Amino Acids

Amino Acid Properties
• The 20 amino acids can be divided into several groups based on their properties.
• Important factors are charge, hydrophilicity or hydrophobicity, size, and functional groups.
• Water-soluble proteins tend to have their hydrophobic residues (Leu, Ile, Val, Phe, and Trp) buried in the middle of the protein, whereas hydrophilic side-chains are exposed to the aqueous solvent.
Livingstone & Barton, CABIOS, 9, 745-756, 1993

Group I: Nonpolar amino acids
Group I amino acids are alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, and tryptophan. The R groups of these amino acids are either aliphatic or aromatic, which makes them hydrophobic ("water fearing"). In aqueous solutions, globular proteins fold into a three-dimensional shape that buries these hydrophobic side chains in the protein interior.

Group II: Polar, uncharged amino acids
Group II amino acids are glycine, serine, cysteine, threonine, tyrosine, asparagine, and glutamine.
The side chains in this group possess a spectrum of functional groups. However, most have at least one atom (nitrogen, oxygen, or sulfur) with electron pairs available for hydrogen bonding to water and other molecules. Polar amino acids are hydrophilic.

Group III: Acidic amino acids
The two amino acids in this group are aspartic acid and glutamic acid. Each has a carboxylic acid on its side chain that gives it acidic (proton-donating) properties. In an aqueous solution at physiological pH, all three functional groups on these amino acids ionize, giving an overall charge of −1. In their ionic forms, the amino acids are called aspartate and glutamate.

Group IV: Basic amino acids
The three amino acids in this group are arginine, histidine, and lysine. Each side chain is basic (i.e., can accept a proton). Lysine and arginine both carry an overall charge of +1 at physiological pH. The guanidino group in arginine's side chain is the most basic of all R groups (a fact reflected in its pKa value of 12.5). As mentioned above for aspartate and glutamate, the side chains of arginine and lysine also form ionic bonds.
(Figure: chemical structures of the Group IV amino acids.)

Why Proteins Need Structure!
Diverse functions related to structure:
• Structural components of cells
• Motor proteins
• Enzymes
• Antibodies
• Hormones
• Hemoglobin/myoglobin
• Transport proteins in blood

Protein structure: bonding
Interactions (forces) governing protein structure
• Covalent interactions
  o Peptide bond
  o Disulfide bond
• Non-covalent interactions
  o Hydrogen bond
  o Ionic bond (electrostatic interactions)
  o Salt bridge
  o Van der Waals interactions
  o Hydrophobic force

Disulfide bond
• Covalent bond between the sulfur atoms of two cysteine amino acids
• Very strong interaction
From: Elliott, WH. Elliott, DC. (1997) Biochemistry and Molecular Biology. Oxford: Oxford University Press. p32

Levels of Protein Structure

Hierarchical nature of protein structure
Primary structure (amino acid sequence)
↓
Secondary structure (α-helix, β-sheet)
↓
Tertiary structure (three-dimensional structure formed by assembly of secondary structures)
↓
Quaternary structure (structure formed by more than one polypeptide chain)

Primary protein structure
• Figure: primary structure of insulin
• The linear sequence of amino acids forms the primary structure.
• The sequence is essential for proper physiological function.
Bettelheim & March (1990) Introduction to Organic & Biochemistry (International Edition) Philadelphia: Saunders College Publishing, p299

Sickle cell anemia

Sickle-Cell Disease

Secondary structure = local folding of residues into regular patterns

Secondary protein structure
Peptide chains fold into secondary structures:
• α-helix
• β-pleated sheet
• Random coil

Peptide Bonds are Planar
• For a pair of amino acids linked by a peptide bond, six atoms lie in the same plane: the α-carbon atom and CO group of the first amino acid and the NH group and α-carbon atom of the second amino acid.
• The C-N distance in a peptide bond is typically 1.32 Å.
• Two configurations are possible for a planar peptide bond. In the trans configuration, the two α-carbon atoms are on opposite sides of the peptide bond. In the cis configuration, these groups are on the same side of the peptide bond.
• Almost all peptide bonds are trans.
Figure: the peptide bond is planar.

Torsion Angle
In contrast with the peptide bond, the bonds between the amino group and the α-carbon atom and between the α-carbon atom and the carbonyl group are pure single bonds. The two adjacent rigid peptide units may rotate about these bonds, taking on various orientations. This freedom of rotation about two bonds of each amino acid allows proteins to fold in many different ways.
The rotations about these bonds can be specified by torsion angles.
• The angle of rotation about the bond between the nitrogen and the α-carbon atoms is called phi (φ).
• The angle of rotation about the bond between the α-carbon and the carbonyl carbon atoms is called psi (ψ).
• A clockwise rotation about either bond, as viewed from the nitrogen atom toward the α-carbon atom or from the carbonyl group toward the α-carbon atom, corresponds to a positive value.
The φ and ψ angles determine the path of the polypeptide chain.
Figure: the peptide bond is planar.

Ramachandran plot: shows φ and ψ angles for secondary structures
The φ and ψ angles, each a measure of rotation about a bond, usually lie between -180° and +180°.

φ and ψ angles for secondary structures
Secondary structure conformation
Conformation               Phi (φ)    Psi (ψ)
α-helix                    -57        -47
Left-handed α-helix        +57        +47
π-helix                    -57        -80
3-10 helix                 -49        -56
β-sheet, parallel          -119       +113
β-sheet, anti-parallel     -119       +135

Residue Conformational Preference
Conformation    Preferred residues
α-helix         A, L, M, Q, K, R, E
Strand          V, I, Y, C, W, F, T
Turn            G, N, P, S, D

Alpha Helix
• In the α-helix, the carbonyl oxygen of residue "i" forms a hydrogen bond with the amide of residue "i+4".
• Although each hydrogen bond is relatively weak in isolation, the sum of the hydrogen bonds in a helix makes it quite stable.
• The propensity of a peptide for forming an α-helix also depends on its sequence.

α-helix
• Shape maintained by hydrogen bonds between C=O and N-H groups in the backbone
• R groups directed outward from the coil
From: Elliott, WH. Elliott, DC. (1997) Biochemistry and Molecular Biology. Oxford: Oxford University Press. p28

α-Helix
• Each hydrogen bond closes a loop of 13 atoms.
• There are 3.6 amino acids per turn of the helix.
• Helices observed in proteins range from four to over forty residues long, but a typical helix contains about ten amino acids (about three turns).
• The α-helix is also called the 3.6_13 helix (3.6 residues per turn, 13-atom hydrogen-bonded loop), compared with the π-helix (4.4_16) and the 3_10 helix.
• Proline is the α-helix breaker.
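The table of φ and ψ values above can be turned into a small lookup that reports which reference conformation a given (φ, ψ) pair lies closest to. The following is only a minimal sketch: the reference angles are copied from the slide table as given, while the function names and the use of a plain Euclidean distance on wrapped angle differences are assumptions made for illustration, not a method described in the workshop.

```python
import math

# Reference (phi, psi) pairs in degrees, copied from the slide table above.
REFERENCE_ANGLES = {
    "alpha-helix": (-57.0, -47.0),
    "left-handed alpha-helix": (57.0, 47.0),
    "pi-helix": (-57.0, -80.0),
    "3-10 helix": (-49.0, -56.0),
    "beta-sheet, parallel": (-119.0, 113.0),
    "beta-sheet, anti-parallel": (-119.0, 135.0),
}


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees.

    Torsion angles wrap around at +/-180, so 179 and -179 are 2 degrees apart.
    """
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_conformation(phi, psi):
    """Return the name of the reference conformation closest to (phi, psi).

    Hypothetical helper: distance is the Euclidean norm of the wrapped
    phi and psi differences, which is an illustrative choice only.
    """
    def distance(name):
        ref_phi, ref_psi = REFERENCE_ANGLES[name]
        return math.hypot(angular_difference(phi, ref_phi),
                          angular_difference(psi, ref_psi))

    return min(REFERENCE_ANGLES, key=distance)


if __name__ == "__main__":
    print(nearest_conformation(-60.0, -45.0))
    print(nearest_conformation(-120.0, 115.0))
```

A real assignment of secondary structure would rely on hydrogen-bond patterns over several consecutive residues rather than on a single (φ, ψ) pair, so this lookup is best read as a way to explore the table, not as a classifier.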
Propensities for forming α-helical structure
Different amino-acid sequences have different propensities for forming α-helical structure. Methionine, alanine, leucine, uncharged glutamate, and lysine ("MALEK" in the amino-acid 1-letter codes) all have especially high helix-forming propensities, whereas proline and glycine have poor helix-forming propensities. Proline either breaks or kinks a helix, both because it cannot donate an amide hydrogen bond (having no amide hydrogen) and because its side-chain interferes sterically with the backbone of the preceding turn. Inside a helix, this forces a bend of about 30° in the helix axis.

Examples of α-Helical Proteins
• α-helical coiled-coil proteins form a superhelix.
• Found in myosin, tropomyosin (muscle), fibrin (blood clots), and keratin (hair).
• Fingernails and wool are also α-helical proteins; silk is β.

β-sheet (β-pleated sheet)
• A polypeptide chain in a β-sheet, called a β-strand, is almost fully extended rather than tightly coiled as in the α-helix.
• The distance between adjacent amino acids along a β-strand is approximately 3.5 Å, in contrast to a distance of 1.5 Å along an α-helix.
• A β-sheet is formed by linking two or more β-strands lying next to one another through hydrogen bonds.
• All residues in a β-sheet have nearly the same φ and ψ angles.
• Hydrogen bonds can form only between adjacent polypeptide chains.
• R groups are directed above and below the backbone.

Parallel or Anti-parallel β-Sheet
• The adjacent polypeptide chains in a β-sheet can be either parallel or anti-parallel (having the same or opposite amino-to-carboxyl orientations, respectively).
Figure labels: H bonds between 2 same aa; H bonds between different aa.

Examples of β-sheet Proteins
• Fatty acid binding protein: β-barrel structure
• Antibodies: more β-sheets
• OmpX: E. coli porin

Tertiary Structure: 3D structure of a polypeptide chain
Quaternary Structure: polypeptide chains assemble into multisubunit structures
Figures: tetramer of hemoglobin; cell-surface receptor CD4.

QUATERNARY STRUCTURE
Deoxyhaemoglobin

β-Turns and Loops
• β-turns allow the protein backbone to make abrupt turns.
• Again, the propensity of a peptide for forming β-turns depends on its sequence.
• In these reverse turns, the CO group of residue i of a polypeptide is hydrogen bonded to the NH group of residue i + 3.
• In other cases, more elaborate structures are responsible for chain reversals. These structures are called loops or sometimes Ω loops (omega loops) to suggest their overall shape.

Why not here

Random coil
• Not really a random structure, just nonrepeating
• 'Random' coil has a fixed structure within a given protein.
• Commonly called the 'connecting loop region'
• Structure determined by bonding of side chains (i.e. not necessarily hydrogen bonds)
From: Elliott, WH. Elliott, DC. (1997) Biochemistry and Molecular Biology. Oxford: Oxford University Press. p27

Tertiary protein structure
• Secondary structures fold and pack together to form the tertiary structure.
• Usually a globular shape, but can be fibrous
• Tertiary structure is stabilized by bonds between R groups (i.e. side-chains).

Tertiary structure = global folding of a protein chain

Tertiary structures are quite varied

Quaternary structures

Each Protein has a unique structure
Amino acid sequence
NLKTEWPELVGKSVEE
AKKVILQDKPEAQIIVL
PVGTIVTMEYRIDRVR
LFVDKLDNIAEVPRVG
Folding!

Protein Folding
Folding is a highly cooperative process (all or none).

Protein Folding by Chaperones
• Chaperone proteins provide a site where misfolded proteins can fold correctly.
Folding by stabilization of intermediates

Central Dogma
DNA → (transcription) → pre-mRNA (hnRNA) → (splicing, processing and maturation) → mRNA → (translation) → protein

Chaperonins
Chaperonins assist in protein folding. They segregate protein folding from "bad influences" in the cell.

Classes of proteins
Functional definition:
• Enzymes: accelerate biochemical reactions
• Structural: form biological structures
• Transport: carry biochemically important substances
• Defense: protect the body from foreign invaders
Structural definition:
• Globular: complex folds, irregularly shaped tertiary structures
• Fibrous: extended, simple folds; generally structural proteins
Cellular localization definition:
• Membrane: in direct physical contact with a membrane; generally water insoluble
• Soluble: water soluble; can be anywhere in the cell

Components of Tertiary Structure
• Fold: used differently in different contexts; most broadly, a reproducible and recognizable 3-dimensional arrangement
• Domain: a compact and self-folding component of the protein that usually represents a discrete structural and functional unit
• Motif (aka supersecondary structure): a recognizable subcomponent of the fold; several motifs usually comprise a domain
As in all fields, these terms are not used strictly, which makes capturing data that conforms to them all the more difficult.

Protein Structure Computational Goals
• Compare all known structures to each other
• Compute distances between protein structures
• Classify and organize all structures in a biologically meaningful way
• Discover conserved substructural domains
• Discover conserved substructural motifs
• Find common folding patterns and structural/functional motifs
• Discover relationship between structure and function.
• Study interactions between proteins and other proteins, ligands and DNA (Protein Docking)
• Use known structures and folds to infer structure from sequence (Protein Threading)
• Use known structural motifs to infer function from structure
• Many more…

Structural Classification of Proteins (SCOP)
http://scop.berkeley.edu/
• Class
  o Similar secondary structure content
  o All α, all β, alternating α/β etc.
• Fold (Architecture)
  o Major structural similarity
  o SSEs in similar arrangement
• Superfamily (Topology)
  o Probable common ancestry
  o HMM family membership
• Family
  o Clear evolutionary relationship
  o Pairwise sequence similarity

Classes of Protein Structures
• Mainly α
• Mainly β
• α/β alternating
  o Parallel β-sheets, β-α-β units
• α+β
  o Anti-parallel β-sheets, segregated α and β regions
  o Helices mostly on one side of sheet
• Others
  o Multi-domain, membrane and cell surface, small proteins, peptides and fragments, designed proteins

Folds / Architectures
• Mainly α
  o Bundle
  o Non-bundle
• Mainly β
  o Single sheet
  o Roll
  o Barrel
  o Clam
  o Sandwich
  o Prism
  o 4/6/7/8 Propeller
  o Solenoid
• α/β and α+β
  o Closed: Barrel; Roll, ...
  o Open: Sandwich; Clam, ...

The TIM Barrel Fold

A Conceptual Problem ... Fold versus Topology
Another example: Globin vs. Colicin

PDB: Protein Data Bank
http://www.rcsb.org/pdb/
• Protein Data Bank
  o Multiple structure viewers
  o Sequence & structure comparison tools
  o Derived data: SCOP, CATH, Pfam, GO terms
  o Education on protein structure
  o Download structures and entire database

Web services for domain identification
Program         Web access
DIAL            http://www.ncbs.res.in/~faculty/mini/ddbase/dial.html
DomainParser    http://compbio.ornl.gov/structure/domainparser
DOMAK           http://www.compbio.dundee.ac.uk/Software/Domak/domak.html
PDP             http://123d.ncifcrf.gov/pdp.html

Protein structure prediction has remained elusive for over half a century
"Can we predict a protein structure from its amino acid sequence?"

Protein Misfolding Diseases
Table 6-4: Misfolded proteins and resulting disorders
• Prions: molecules resembling ion channels, causing serious illnesses in animals and humans
• Causes protein fibrillation; Alzheimer's disease
• Cause BSE ("mad cow disease") in cattle

MOLECULAR BIOLOGY OF PRION DISEASE
Figure: a normal prion (left), compared to an aberrant, disease-causing prion (right).
Cellular processing of PrP. (1) PrP can be internalized before degradation by the proteasome or lysosomal proteases. (2) In PrPsc, processing results in limited proteolysis. Limited degradation produces PrPsc fragments, which accumulate over time and may have a role in cell death. These fragments lead to propagation of the PrPsc infection in adjacent cells.
A) Normal PrP can refold into PrPsc in the extracellular space.
B) Fragments of PrPsc may remain within the cell or may be externalized by transport vesicles or by cellular rupture upon death.
C) Intracellular PrPsc could interact with PrP during intracellular processing, resulting in conversion of PrP to PrPsc in the infected cell.
D) Intracellular PrP may spontaneously change conformation to PrPsc.

Possible routes of propagation of ingested prions.
After oral uptake, prions may penetrate the intestinal mucosa through M cells and reach Peyer's patches as well as the enteric nervous system. Depending on the host, prions may replicate and accumulate in the spleen and lymph nodes. Myeloid dendritic cells are thought to mediate transport within the lymphoreticular system. From the lymphoreticular system, and likely from other sites, prions proceed along the peripheral nervous system to finally reach the brain, either directly via the vagus nerve or via the spinal cord, with involvement of the sympathetic nervous system.

PRIONS CONT.
• Sheep with scrapie
• Kuru and Creutzfeldt-Jakob disease in humans

How To Determine Protein Structure?

Protein Structure Prediction
• Traditional experimental methods: X-ray or NMR to solve structures. These generate a few structures per day worldwide and cannot keep pace with new protein sequences.
• Strong demand for structure prediction: more than 30,000 human genes; 10,000 genomes will be sequenced in the next 10 years.
• Unsolved problem after efforts of two decades.

Protein structure and functions are intimately related
• Proteins interact with each other.
• The structure of a protein influences its function by determining the other molecules with which it can interact and the consequences of those interactions.

Experimental methods available to detect protein structure and interactions vary in their level of resolution. These observations can be classified into four levels: (a) atomic scale, (b) binary interactions, (c) complex interactions, and (d) cellular scale.

Atomic-scale methods:
• Show the precise structural relationships between interacting atoms and residues
• The highest resolution methods, e.g., X-ray crystallography and NMR
• Not yet applied to study protein interactions in a high-throughput manner.
Binary-interaction methods:
• Methods to detect interactions between pairs of proteins
• Do not reveal the precise chemical nature of the interactions but simply report that such interactions take place
• The major high-throughput technology: the yeast two-hybrid system

Complex-interaction methods:
• Methods to detect interactions between multiple proteins that form complexes
• Do not reveal the precise chemical nature of the interactions but simply report that such interactions take place
• The major high-throughput technology: systematic affinity purification followed by mass spectrometry

Cellular-scale methods:
• Methods to determine where proteins are localized (e.g., immunofluorescence)
• It may be possible to determine the function of a protein directly from its localization.

Principles of protein-protein interaction analysis
• These small-scale analysis methods are also useful in proteomics because the large-scale methods tend to produce a significant number of false positives.
• They include (a) genetic methods, (b) bioinformatic methods, (c) affinity-based biochemical methods, and (d) physical methods.

Genetic methods
• Classical genetics can be used to investigate protein interactions by combining different mutations in the same cell or organism and observing the resulting phenotype.
• Suppressor mutation: a secondary mutation that can correct the phenotype of a primary mutation.

Suppressor mutation

Synthetic lethal effect

Bioinformatic methods
(A) The domain fusion method (or Rosetta stone method): The sequence of protein X (a single-domain protein from genome 1) is used as a similarity search query on genome 2. This identifies any single-domain proteins related to protein X and also any multidomain proteins, which we can define as protein X-Y. As part of the same protein, domains X and Y are likely to be functionally related.
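The first step of the domain fusion idea described above can be sketched on toy data. In practice the method relies on a sequence similarity search across genomes; the sketch below skips that step and assumes, purely for illustration, that every protein already carries a tuple of domain labels. The function name, the data layout, and the example genomes are all hypothetical.

```python
def find_domain_fusions(genome1, genome2):
    """Infer functional links between domains from fused proteins.

    Each genome is a dict mapping a protein name to a tuple of domain
    labels (an assumed, pre-computed annotation). For every single-domain
    protein X in genome1, look for multidomain proteins in genome2 that
    contain X, and link X to each other domain Y found in the same protein.
    """
    # Domains that occur as stand-alone, single-domain proteins in genome 1.
    single_domains = {
        domains[0] for domains in genome1.values() if len(domains) == 1
    }

    links = set()
    for domains in genome2.values():
        if len(domains) < 2:
            continue  # only fused (multidomain) proteins are informative
        for x in domains:
            if x not in single_domains:
                continue
            for y in domains:
                if y != x:
                    links.add((x, y))
    return links


if __name__ == "__main__":
    # Toy genomes: labels stand in for the result of a similarity search.
    genome_1 = {"protein_X": ("X",), "protein_Y": ("Y",)}
    genome_2 = {"protein_XY": ("X", "Y"), "protein_Z": ("Z",)}
    print(sorted(find_domain_fusions(genome_1, genome_2)))
```

Each returned pair reads as "the first domain is predicted to be functionally linked to the second"; whether the corresponding proteins physically interact would still need experimental confirmation.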
The domain fusion method (or Rosetta stone method)
• The sequence of domain Y can then be used to identify single-domain orthologs in genome 1. Thus, gene Y, formerly an orphan with no known function, becomes annotated due to its association with gene X. The two proteins are also likely to interact.
• The sequence of protein X-Y may also identify further domain fusions, such as protein Y-Z. This links three proteins into a functional group and possibly identifies an interacting complex.

Bioinformatic methods
(B) The phylogenetic profile: It describes the pattern of presence or absence of a particular protein across a set of organisms whose genomes have been sequenced. If two proteins have the same phylogenetic profile (that is, the same pattern of presence or absence) in all surveyed genomes, it is inferred that the two proteins have a functional link. A protein's phylogenetic profile is a nearly unique characterization of its pattern of distribution among genomes. Hence any two proteins having identical or similar phylogenetic profiles are likely to be engaged in a common pathway or complex.

Sequence to Structure to Function
>132L:_ LYSOZYME (E.C.3.2.1.17)
KVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNTQATNRNTDGSTDYGILQINSRWWCNDGRTPG
SRNLCNIPCSALLSSDITASVNCAKKIVSDGNGMNAWVAWRNRCKGTDVQAWIRGCRL
Cell wall degrading enzyme

Correlation Between Structure & Function
• Homologous proteins
  o Conserved sequence, similar structure and function
  o Example: cytochrome c
• Similar function, different sequences
  o Conserved and variable regions
  o Example: dehydrogenases, kinases
• Similar structure, different function
  o Example: thioredoxin

Why must we predict structures?
Limitations of current techniques:
• Proteins are often too large for molecular modeling techniques.
• Some proteins are difficult to crystallize (X-ray); throughput is slow.
• NMR results are difficult to obtain and rely on modeling.
• Far more sequences have been elucidated than structures.
• 3D structures are better conserved than sequences during evolution.

Predicting 3D structures from Sequence?
Levinthal's paradox
• A protein with 100 amino acids has 3^100 possible structures.
• At 10^-13 seconds to sample each structure, it would take 1.6 × 10^27 years to go through every structure.
• Models improve these odds.
• Models are based on structure stability and X-ray crystallography.

Structure prediction methods
• Ab initio: determines structure without reference to existing protein structures.
• Comparative/homology modeling: determines structure based on sequence similarity.
• Fold recognition/threading: relies on the limited number of folds; determines structure similarities independently of sequence similarity.

Structure Prediction Process
http://www.bmm.icnet.uk/people/rob/CCP11BBS/

Protein Structure-function paradigm
• Origins in the lock and key model for enzymatic activity.
• Claims that the rigid 3D structure of a protein determines its function.
• Active areas of protein structure, for example active sites on enzymes, are highly conserved; other regions are more variable.
• Conserved motifs are responsible for conserved functionality.
• Forms the basis of proteomic studies and many other branches.
• Homology is claimed to be responsible for the correlation.
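The Levinthal estimate quoted earlier (100 residues, three conformations per residue, 10^-13 seconds per conformation) is plain arithmetic and can be checked directly. The sketch below is only a sanity check of that figure; the function name and the choice of a 365.25-day year are assumptions made here, not part of the slides.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed Julian year


def levinthal_years(n_residues=100, states_per_residue=3,
                    seconds_per_state=1e-13):
    """Years needed to sample every conformation exhaustively.

    The number of conformations is states_per_residue ** n_residues;
    each one is assumed to take seconds_per_state to sample.
    """
    conformations = states_per_residue ** n_residues  # exact integer
    total_seconds = conformations * seconds_per_state
    return total_seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{levinthal_years():.2e} years")
```

With the default values the result is about 1.6 × 10^27 years, in line with the number on the slide and far beyond the age of the universe, which is why folding cannot proceed by exhaustive search.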
Structure Similarity
• Refers to how well (or poorly) the 3D folded structures of proteins can be aligned
• Is expected to reflect functional similarities (interaction with other molecules)
• Figure: proteins in the TIM barrel fold family
• 2000: ~20,000 structures in PDB, ~4,000 different folds (1:5 ratio)
• Three possible reasons:
  o evolution
  o physical constraints (e.g., few ways to maximize hydrophobic interactions)
  o limits in the techniques used for structure determination
• Given a new structure, the probability is high that it is similar to an existing one.

Why Compare Protein Folded Structures?
Diagram: Sequence → Structure → Function, annotated with sequence similarity and structure similarity.
• Low sequence similarity may yield very similar structures.
• Sometimes high sequence similarity yields different structures.
• Structure comparison is expected to provide more pertinent information about functional (dis-)similarity among proteins, especially with non-evolutionary relationships or non-detectable evolutionary relationships.

Extensions of Paradigm
Diagram: the structure-function paradigm linked to
• Allosteric interactions
• Proteins as biomachines
• Enzyme catalysis
• Proteomics
• 3D structure analysis
• Assisted protein folding
• De novo proteins
• Protein self organization
• Protein misfolding and diseases
• Biotechnology
• Biomedicine
• Protein engineering

Paradigms in structure-function theory
• Orthologues possess similar function.
• Enzyme homologues are enzymes.
• Regulatory domain homologues are not enzymes.
• Equivalent cellular functions are mediated in different species by homologues.
• Coding regions mutate at a slower rate than non-coding regions.
• Domain homologues are localised in sequence and 3D structure and possess the same order of secondary structures.
• Disulphide bridges are invariant among homologues.
• Convergent evolution of sequences does not occur.
• Domains possess single conformations.

Function Assessment
• Statistical analysis is hard to apply to functionality assessment.
• Function prediction by homology is thus qualitative, requiring expert knowledge and careful study.
• Assigning experimental knowledge from one homologue to an uncharacterized sequence is the basis of function prediction.
• Works best in the case of orthologues; can be misleading for paralogues.
• Orthologue identification is the most powerful tool in molecular function prediction.
• Paralogues can also have overlapping functionality, especially in eukaryotes.

Fold-function Correlation
• Common folds are found in unrelated protein families.
• Folds accommodating many families are called "superfolds", e.g. the TIM barrel.
• Folds in combination define overall function.
• Function is better assessed as a whole of parts.

Exceptions to the rule: natively unfolded proteins
• A class of proteins with inherently unstable structure that are nevertheless functional, e.g. regulatory proteins.
• Unfolded in the physiological state; may fold during the functional cycle.
• Lack of fixed structure allows binding to multiple targets.
• The target induces folding in the protein, e.g. protein-DNA or protein-RNA interactions.
• Unfolded proteins are easier to transfer across membranes.

Ground rules for Structure Prediction
• Don't always believe what programs tell you: they're often misleading & sometimes wrong!
• Don't always believe what databases tell you: they're often misleading & sometimes wrong!
• Don't always believe what others tell you: they're often misleading & sometimes wrong!
• In short, don't be a naive user.
• When computers are applied to biology, it is vital to understand the difference between mathematical & biological significance.
• Computers don't do biology, they do sums quickly!

Implication of Protein Structure and Function

Structure-Based Drug Design
Structure-based rational drug design is still a major method for drug discovery.
Figure: HIV protease inhibitor.

CD4 on Mini Scaffold
Rational engineering of a mini-protein that reproduces the core of the CD4 site interacting with the HIV-1 envelope glycoprotein
Vita, C. et al. Proc. Natl. Acad. Sci. USA (1999)

HIV
• Envelope
• Viral Core

The Envelope
• Bi-layer lipid outer coat
• Layer of matrix protein p17
• ~72 copies of a complex HIV protein, called spikes, project through the surface of the virus particle (Gelderblom et al., Virology 1987)
• Spike protein
  o Cap (3 gp120 molecules)
  o Stem (3 gp41 molecules)

The Viral Core
• Bullet-shaped core or capsid made of viral protein p24
• The capsid surrounds 2 single strands of HIV RNA, each of which has a copy of the 9 viral genes
• 'gag', 'pol' and 'env' code for structural proteins to make new virus particles
• 'env' codes for gp160, which is broken by a viral enzyme to form gp120 and gp41 (Janeway et al. 1999)
• 'rev', 'vif', 'vpr', 'nef', 'tat', and 'vpu': infection and replication
• REVERSE TRANSCRIPTASE, INTEGRASE and PROTEASE
Gelderblom et al. 1987

OUR IMMUNE SYSTEM
Lymphocyte
• T-Cell
  o Helper: recognizes antigen and releases cytokines, which signal B-cells to produce antibody; helps differentiation of B-cells
  o Suppressor: after the battle, stops antibody formation and slows down the activity of B-cells and other T-cells
  o Memory: memorizes the antigen and helps in a quick response on the next attack
  o Cytotoxic T-cell: recognizes and directly kills infected cells
• B-Cell
  o Plasma cells: produce antibody, i.e. make enough receptor molecules in soluble form, which bind to the microorganism
  o Memory cells: same as memory T cells; both work together
• LGL or Natural Killer Cell
  o Large granular lymphocyte
  o Function known fully; kills tumor cells and virus-infected cells

T-Cell
Four basic kinds of T cell: T helper
Secretion of chemical messengers (cytokines), which in turn stimulate more T helper cells.
So the T cells must have a particular receptor molecule to receive this message.
• This receptor molecule is referred to as a CD, or Cluster of Differentiation (around 130 CDs have been identified so far)
• CD4 is one of these receptors. It is the main target of HIV, which anchors to the T-cell through it, gains entry to the cell and replicates there

gp41
Fusogenic domain mediates fusion

Fight Against AIDS
• Reverse transcriptase, integrase and protease are the enzymes targeted in the design of anti-HIV drugs
• 9 of 15 FDA-approved drugs target reverse transcriptase, e.g. Zidovudine, Nevirapine, Delavirdine
• These are big molecules and have severe side effects, mainly on the kidney
• Most of the time they are not very effective: the viral genome is able to undergo numerous mutations in its critical areas
• Co-receptor (CCR5/CXCR4) blocking is possible but has low efficiency
• Another way to prevent HIV attack may be to block the viral glycoprotein from coming into contact with the CD4 receptor of the T-cell
• Fool the virus: design a CD4 mimic using a mini scaffold

A group from France used a scorpion toxin as a scaffold and designed a chimeric protein that mimics CD4 activity.

Interaction between CD4 and gp120
• The whole of CD4 does not bind to gp120; only one domain binds
• D1 is the most important domain of CD4 for binding to gp120
• D1 has a CDR2-like loop, which is the main part of the D1 domain that interacts with gp120
• Kwong et al. solved the structure of the CD4-gp120 complex
• The solved structure showed that Phe at position 43 and Arg at position 59 are important for binding of CD4 to gp120

Designing CD4 mimic using mini scaffold

Designing of CD4M
• Solvent-exposed amino acid residues of the CDR2-like loop of CD4 were transferred to a charybdotoxin scaffold
• The designed chimeric miniprotein was 33 amino acid residues long
(Figure labels: D1 domain of CD4; charybdotoxin scaffold; solvent-exposed residues)

Implications of designing a CD4 mimic
• The designed mimic can be used as an antiviral agent
• In complex with viral coat proteins, the CD4 mimic can be used to formulate a vaccine against AIDS
• The designed CD4 mimic can be used for developing broad-spectrum neutralizing antibodies

Fight Against AIDS

Homozygous Δ32 deletion in the HIV co-receptor CCR5 confers resistance to HIV infection
• Samson, M. et al. Resistance to HIV-1 infection in Caucasian individuals bearing mutant alleles of the CCR5 chemokine receptor gene. Nature 382, 722–725 (1996)
• Liu, R. et al. Homozygous defect in HIV-1 coreceptor accounts for resistance of some multiply-exposed individuals to HIV-1 infection. Cell 86, 367–377 (1996)
• CCR5-Δ32 (also written CCR5-D32 or CCR5 delta 32) is a genetic variant of CCR5
• This allele is found in 5-14% of Europeans but is rare in Africans and Asians
• It has been hypothesized that this allele was favored by natural selection during the Black Death (1347), one of the worst epidemics in history, in which 1/3 of the population of Europe died

• The authors created a CCR5-mutant T-cell and used these cells in vitro and in an in vivo mouse model to show that the mutation confers complete resistance to HIV
• They used an engineered zinc finger nuclease to target human CCR5 efficiently, generating a double-strand break at a predetermined site in the CCR5 coding region, the same as in the CCR5-Δ32 genotype

BIOINFORMATICS took the leading role for this development

Bioinformatics Bottlenecks in Bangladesh
• Lack of facilities
• Lack of coordinated research
• Improper course curriculum in Statistics
• Improper course curriculum in Biology

THANK YOU
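Worked note on the CCR5-Δ32 figures: resistance to HIV infection is described above for homozygous carriers, so it can help to see roughly how many homozygotes the quoted 5-14% would imply. The sketch below is only illustrative. It assumes (this is not stated in the slides) that 5-14% is the allele frequency and that the population is in Hardy-Weinberg equilibrium; the function name hardy_weinberg is hypothetical.

```python
def hardy_weinberg(q):
    """Expected genotype fractions at a two-allele locus under Hardy-Weinberg equilibrium.

    q is the frequency of the variant allele (assumed here to be CCR5-Δ32).
    Returns (wild-type homozygotes, heterozygotes, variant homozygotes).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    p = 1.0 - q  # frequency of the wild-type allele
    return p * p, 2.0 * p * q, q * q


if __name__ == "__main__":
    # Assumption: the 5-14% range is read as an allele frequency.
    for q in (0.05, 0.14):
        wt, het, hom = hardy_weinberg(q)
        print(f"q = {q:.2f}: heterozygous {het:.2%}, homozygous Δ32 {hom:.2%}")
```

Under these assumptions, only about 0.25% to 2% of such a population would be homozygous for the deletion, while roughly 9.5% to 24% would be heterozygous carriers.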