Bioinformatics master course
DNA/Protein structure-function analysis and prediction
Lecture 13: Protein Function
Centre for Integrative Bioinformatics VU (IBIVU)
Faculty of Sciences / Faculty of Earth & Life Sciences

Sequence-Structure-Function
[Diagram: Sequence to Structure (threading; ab initio folding: impossible but for the smallest structures); Sequence to Function (BLAST); Structure to Function (function prediction from structure: very difficult)]

Experimental
• Structural genomics
• Functional genomics
• Protein-protein interaction
• Metabolic pathways
• Expression data

Protein function categories
• Catalysis (enzymes)
• Binding
– transport (active/passive)
– Protein-DNA/RNA binding (e.g. histones, transcription factors)
– Protein-protein interactions (e.g. antibody-lysozyme), experimentally determined by yeast two-hybrid (Y2H) or bacterial two-hybrid (B2H) screening
– Protein-fatty acid binding (e.g. apolipoproteins)
– Protein – small molecules (drug interaction, structure decoding)
• Structural component (e.g. crystallins)
• Regulation
• Signalling
• Transcription regulation
• Immune system
• Motor proteins (actin/myosin)

Catalytic properties of enzymes
$E + S \overset{K_m}{\rightleftharpoons} ES \overset{k_{cat}}{\longrightarrow} E + P$
E = enzyme
S = substrate
ES = enzyme-substrate complex (transition state)
P = product
Km = Michaelis constant
kcat = catalytic rate constant (turnover number)
kcat/Km = specificity constant (useful for comparison)

Michaelis-Menten equation:
$V = \frac{V_{max} \times [S]}{K_m + [S]}$
[Figure: reaction rate V (moles/s) plotted against substrate concentration [S]; the curve saturates at Vmax and reaches Vmax/2 at [S] = Km]

Protein interaction domains
http://pawsonlab.mshri.on.ca/html/domains.html

Energy difference upon binding
Examples of protein interactions (and functional importance) include:
• Protein – protein (pathway analysis)
• Protein – small molecules (drug interaction, structure decoding)
• Protein – peptides, DNA/RNA (function analysis)
The change in Gibbs free energy of the protein-ligand binding interaction can be monitored and is expressed as:
$\Delta G = \Delta H - T \Delta S$
(H = enthalpy, S = entropy, T = temperature)

Protein function
• Many proteins combine functions
• Some immunoglobulin structures are thought to have more than 100 different functions (and active/binding sites)
• Alternative splicing can generate (partially) alternative structures

Protein function
Protein-protein interaction
Active site / binding cleft
Shape complementarity

Protein function evolution
Chymotrypsin

How to infer function
• Experiment
• Deduction from sequence
– Multiple sequence alignment – conservation patterns
– Homology searching
• Deduction from structure
– Threading
– Structure-structure comparison
– Homology modelling

Cholesterol biosynthesis primarily occurs in eukaryotic cells. It is necessary for membrane synthesis, and cholesterol is a precursor for steroid hormone production as well as for vitamin D. While the pathway had previously been assumed to be localized in the cytosol and ER, more recent evidence suggests that many of the enzymes in the pathway exist largely, if not exclusively, in the peroxisome (the enzymes listed in blue in the pathway to the left are thought to be at least partly peroxisomal). Patients with peroxisome biogenesis disorders (PBDs) have a variable deficiency in cholesterol biosynthesis.

Mevalonate plays a role in epithelial cancers: it can inhibit EGFR

Epidermal Growth Factor as a Clinical Target in Cancer
Introduction: A malignant tumour is the product of uncontrolled cell proliferation. Cell growth is controlled by a delicate balance between growth-promoting and growth-inhibiting factors. In normal tissue the production and activity of these factors result in differentiated cells growing in a controlled and regulated manner that maintains the normal integrity and functioning of the organ. The malignant cell has evaded this control; the natural balance is disturbed (via a variety of mechanisms) and unregulated, aberrant cell growth occurs.
A key driver for growth is the epidermal growth factor (EGF), and the receptor for EGF (the EGFR) has been implicated in the development and progression of a number of human solid tumours including those of the lung, breast, prostate, colon, ovary, head and neck.

Energy housekeeping: Adenosine diphosphate (ADP) – Adenosine triphosphate (ATP)

Metabolic networks
Glycolysis and Gluconeogenesis
KEGG database (Japan)

Gene Ontology (GO)
• Not a genome sequence database
• Developing three structured, controlled vocabularies (ontologies) to describe gene products in a species-independent manner, in terms of:
– biological process
– cellular component
– molecular function

The GO ontology

Gene Ontology Members
• FlyBase - database for the fruit fly Drosophila melanogaster
• Berkeley Drosophila Genome Project (BDGP) - Drosophila informatics; GO database & software, Sequence Ontology development
• Saccharomyces Genome Database (SGD) - database for the budding yeast Saccharomyces cerevisiae
• Mouse Genome Database (MGD) & Gene Expression Database (GXD) - databases for the mouse Mus musculus
• The Arabidopsis Information Resource (TAIR) - database for the brassica family plant Arabidopsis thaliana
• WormBase - database for the nematode Caenorhabditis elegans
• EBI GOA project - annotation of UniProt (Swiss-Prot/TrEMBL/PIR) and InterPro databases
• Rat Genome Database (RGD) - database for the rat Rattus norvegicus
• DictyBase - informatics resource for the slime mold Dictyostelium discoideum
• GeneDB S. pombe - database for the fission yeast Schizosaccharomyces pombe (part of the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute)
• GeneDB for protozoa - databases for Plasmodium falciparum, Leishmania major, Trypanosoma brucei, and several other protozoan parasites (part of the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute)
• Genome Knowledge Base (GK) - a collaboration between Cold Spring Harbor Laboratory and EBI
• TIGR - The Institute for Genomic Research
• Gramene - A Comparative Mapping Resource for Monocots
• Compugen (with its Internet Research Engine)
• The Zebrafish Information Network (ZFIN) - reference datasets and information on Danio rerio
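
To make the idea of three controlled vocabularies concrete, here is a minimal Python sketch of how GO-style terms and annotations could be represented. It is an illustration only, not the GO consortium's software: the names Term, ancestors and annotations_by_namespace are hypothetical, the term table is a tiny hand-written subset (the identifiers are given from memory and should be checked against a current GO release), and the sketch assumes that terms are linked to more general terms by parent relations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """One vocabulary term (hypothetical container, not a GO library class)."""
    term_id: str
    name: str
    namespace: str          # which of the three ontologies the term belongs to
    parents: tuple = ()     # ids of more general terms (assumed parent links)


# Tiny hand-written subset: the three root terms plus two molecular functions.
# Identifiers are assumptions to verify against a current GO release.
TERMS = {
    t.term_id: t
    for t in [
        Term("GO:0008150", "biological_process", "biological_process"),
        Term("GO:0005575", "cellular_component", "cellular_component"),
        Term("GO:0003674", "molecular_function", "molecular_function"),
        Term("GO:0003824", "catalytic activity", "molecular_function",
             ("GO:0003674",)),
        Term("GO:0005488", "binding", "molecular_function",
             ("GO:0003674",)),
    ]
}


def ancestors(term_id, terms=TERMS):
    """Return the ids of all more general terms reachable via parent links."""
    seen = set()
    stack = list(terms[term_id].parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(terms[current].parents)
    return seen


def annotations_by_namespace(term_ids, terms=TERMS):
    """Group a gene product's terms (and their ancestors) by ontology."""
    grouped = {
        "biological_process": set(),
        "cellular_component": set(),
        "molecular_function": set(),
    }
    for term_id in term_ids:
        for tid in {term_id} | ancestors(term_id, terms):
            grouped[terms[tid].namespace].add(terms[tid].name)
    return grouped


# A hypothetical enzyme annotated with both catalysis and binding.
example = annotations_by_namespace(["GO:0003824", "GO:0005488"])
print(sorted(example["molecular_function"]))
```

Because the terms carry no organism information, the same annotation scheme applies to a gene product from any of the member databases listed above, which is what "species-independent" means in practice.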