Biological networks
Bing Zhang
Department of Biomedical Informatics, Vanderbilt University
[email protected]
BCHM352, Spring 2013

Protein-protein interaction (PPI)
Definition
- Physical association of two or more protein molecules
Examples
- Receptor-ligand interactions
- Kinase-substrate interactions
- Transcription factor-co-activator interactions
- Multiprotein complexes, e.g. multimeric enzymes

Significance of protein interaction
- Most proteins mediate their function through interacting with other proteins
  - To form molecular machines
  - To participate in various regulatory processes
- Distortions of protein interactions can cause diseases
[Figure: RNA polymerase II, 12 subunits. Cramer et al. Science 292:1863, 2001]

Yeast two-hybrid
Method
- Bait strain: a protein of interest, the bait (B), fused to a DNA-binding domain (DBD)
- Prey strains: ORFs fused to a transcriptional activation domain (AD)
- Mate the bait strain to the prey strains and plate the diploid cells on selective media (e.g. without histidine)
- If bait and prey interact in the diploid cell, they reconstitute a transcription factor, which activates a reporter gene whose expression allows the diploid cell to grow on selective media
- Pick colonies, isolate DNA, and sequence to identify the ORF interacting with the bait
Pros
- High-throughput
- Can detect transient interactions
Cons
- False positives
- Non-physiological (done in the yeast nucleus)
- Can't detect multiprotein complexes
Uetz P. Curr Opin Chem Biol. 6:57, 2002

Tandem affinity purification
Method
- TAP tag: Protein A, calmodulin binding domain, TEV protease cleavage site
- The bait protein gene is fused with the DNA sequences encoding the TAP tag
- The tagged bait is expressed in cells and forms native complexes
- Complexes are purified by the TAP method
- Components of each complex are identified through gel separation followed by MS/MS
Pros
- High-throughput
- Physiological setting
- Can detect large stable protein complexes
Cons
- High false positives
- Can't detect transient interactions
- Can't detect interactions not present under the given condition
- Tagging may disturb complex formation
- Binary interaction relationship is not clear
Chepelev et al. Biotechnol & Biotechnol 22:1, 2008

Protein-protein interaction identification
Experimental
- Yeast two-hybrid
- Tandem affinity purification
Computational
- Gene fusion
- Conservation of gene neighborhood
- Phylogenetic profiling
- Coevolution
- Ortholog interaction
- Domain interaction
Valencia et al. Curr. Opin. Struct. Biol, 12:368, 2002

PPI data in the public domain
- Database of Interacting Proteins (DIP): http://dip.doe-mbi.ucla.edu/
- The Molecular INTeraction database (MINT): http://mint.bio.uniroma2.it/mint/
- The Biomolecular Object Network Databank (BOND): http://bond.unleashedinformatics.com/
- The General Repository for Interaction Datasets (BioGRID): http://www.thebiogrid.org/
- Human Protein Reference Database (HPRD): http://www.hprd.org
- Online Predicted Human Interaction Database (OPHID): http://ophid.utoronto.ca
- iRef: http://wodaklab.org/iRefWeb
- The International Molecular Exchange Consortium (IMEX): http://www.imexconsortium.org

HPRD
[Figure: HPRD]

Graph representation of networks
- Graph: a set of objects called nodes or vertices, connected by links called edges. In mathematics and computer science, a graph is the basic object of study in graph theory.
[Figure: RNA polymerase II drawn as a graph, with node and edge labeled. Cramer et al. Science 292:1863, 2001]

Protein interaction networks
- Saccharomyces cerevisiae: Jeong et al. Nature, 411:41, 2001
- Drosophila melanogaster: Giot et al. Science, 302:1727, 2003
- Caenorhabditis elegans: Li et al. Science, 303:540, 2004
- Homo sapiens: Rual et al. Nature, 437:1173, 2005

Biological networks

Network                                  Nodes                      Edges
Physical interaction networks
  Protein-protein interaction network    Proteins                   Physical interaction, undirected
  Signaling network                      Proteins                   Modification, directed
  Gene regulatory network                TFs/miRNAs, target genes   Physical interaction, directed
  Metabolic network                      Metabolites                Metabolic reaction, directed
Functional association networks
  Co-expression network                  Genes/proteins             Co-expression, undirected
  Genetic network                        Genes                      Genetic interaction, undirected

Degree, path, shortest path
- Degree: the number of edges adjacent to a node. In a directed network, in degree counts incoming edges and out degree counts outgoing edges.
- Path: a sequence of nodes such that each node has an edge to the next node in the sequence.
- Shortest path: a path between two nodes such that the sum of the distances of its constituent edges is minimized.
Examples
- YDL176W: degree 3
- Fhl1: out degree 4, in degree 0

Properties of complex networks
- Scale-free
- Small world
- Modular
- Hierarchical

Obama vs Lady Gaga: who is more influential?
[Figure: Twitter following (out degree) versus Twitter followers (in degree) for accounts including Obama, Gaga, and Eminem; the per-account values cannot be reliably recovered from the extracted text]

Role of hubs in biological networks
Based on data from the model organisms S. cerevisiae and C. elegans, hubs tend to:
- Correspond to essential genes
- Be older and have evolved more slowly
- Be more abundant
- Have a larger diversity of phenotypic outcomes resulting from their deletion
Vidal et al. Cell, 144:986, 2011

Connectivity vs protein lethality
- Node colors: red, lethal; green, non-lethal; orange, slow growth; yellow, unknown
- Pearson's correlation coefficient r = 0.75 indicates a positive correlation between lethality and connectivity
Jeong et al. Nature, 411:41, 2001

Modularity
A module is a group of physically or functionally linked molecules (nodes) that work together to achieve a relatively distinct function.
Examples
- Transcriptional module: a set of coregulated genes
- Protein complex: an assembly of proteins that builds up some cellular machinery; it commonly spans a dense sub-network of proteins in a protein interaction network
- Signaling pathway: a chain of interacting proteins propagating a signal in the cell
[Figure: protein interaction modules. Palla et al. Nature, 435:841, 2005]
[Figure: gene co-expression modules. Shi et al. BMC Syst Biol, 4:74, 2010]

Network distance vs functional similarity
- Proteins that lie closer to one another in a protein interaction network are more likely to have similar functions and to be involved in similar biological processes.
[Figure: GO semantic similarity. Hu et al. Nature Rev Cancer, 7:23, 2007]

Network-based prediction: protein function, protein expression, disease association
- Direct neighborhood method (local): direct interaction partners of a protein are likely to share the same function, expression status, and disease association.
- Diffusion-based method (global): proteins located in close network proximity (through direct or indirect interaction) are more likely to share the same function, expression status, and disease association.
- Module-based method: proteins in the same network module are more likely to share the same function, expression status, and disease association.
Sharan et al. Mol Syst Biol, 3:88, 2007
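
The direct neighborhood method can be illustrated with a small sketch. It is not taken from the slides or the cited paper: the function name `predict_by_neighbors`, the toy proteins P1 to P5, and their annotations are all hypothetical, and a simple majority vote is assumed as the way to combine the partners' labels.

```python
from collections import Counter


def predict_by_neighbors(edges, labels, protein):
    """Predict a label for `protein` by majority vote among its
    annotated direct interaction partners (undirected edges)."""
    neighbors = set()
    for a, b in edges:
        if a == protein:
            neighbors.add(b)
        elif b == protein:
            neighbors.add(a)
    neighbors.discard(protein)  # ignore self-interactions

    # Count the labels of annotated partners; sorting makes tie-breaking
    # deterministic (the label seen first in sorted partner order wins).
    votes = Counter(labels[n] for n in sorted(neighbors) if n in labels)
    if not votes:
        return None  # no annotated partner, so no prediction
    return votes.most_common(1)[0][0]


# Hypothetical toy network and annotations (illustration only).
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
labels = {"P2": "kinase signaling", "P3": "kinase signaling", "P4": "transport"}

prediction = predict_by_neighbors(edges, labels, "P1")
print(prediction)
```

Here P1 has three annotated partners, two of which share one label, so that label is predicted. A protein whose partners are all unannotated gets no prediction, which is one reason the diffusion-based and module-based methods look beyond direct partners.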
Protein identification in shotgun proteomics
- Protein digestion
- LC-MS/MS
- Database search
- Protein assembly

Protein assembly and classification (background)
[Figure: panels a and b. Zhang et al. J Proteome Res 6:3549, 2007]

Network-assisted protein identification: motivation
- Current protein assembly pipelines treat proteins as individual entities.
- Biologically interesting proteins may be eliminated due to insufficient experimental evidence.
- Most biological functions arise from interactions among proteins.
- Can we use protein interaction network information to improve protein identification?
- Hypothesis: an eliminated protein is more likely to be present in the original sample if it belongs to a module in which other protein components are confidently identified.

Module-based prediction of protein expression
- Class definition of proteins: positive, negative, unknown
- Network mapping
- Module identification
- Statistical evaluation
Li et al. Mol Syst Biol, 5:303, 2009

Application: Breast cancer data set (normal vs tumor)
- Rescued proteins
  - Normal: 139 (23%)
  - Tumor: 95 (8%)
- Rescued cancer-related proteins
  - Ctnnb1
  - Top1
  - …
- Cancer-specific sub-networks
  - Wnt signaling pathway
  - Cell adhesion
  - Apoptosis
  - …

Network-based disease gene prioritization
- For a specific disease, candidate genes can be ranked based on their proximity to known disease genes.
Kohler et al. Am J Hum Genet. 82:949, 2008

Network visualization tools
- Cytoscape: http://www.cytoscape.org
Gehlenborg et al. Nature Methods, 7:S56, 2010
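
To make the idea of ranking candidate genes by proximity to known disease genes concrete, here is a minimal sketch. It assumes the simplest proximity measure, the mean shortest-path length (in number of edges) from a candidate to the known disease genes, which is not necessarily the measure used in the cited work. The gene names, the toy network, and the function names are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (a, b) interactions."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shortest_path_lengths(adj, source):
    """Breadth-first search: number of edges on the shortest path
    from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def rank_candidates(edges, known_genes, candidates):
    """Rank candidates by mean shortest-path length to the known
    disease genes (smaller is closer); unreachable means infinity."""
    if not known_genes:
        raise ValueError("at least one known disease gene is required")
    adj = build_adjacency(edges)
    dists = [shortest_path_lengths(adj, g) for g in known_genes]
    scores = {
        c: sum(d.get(c, float("inf")) for d in dists) / len(dists)
        for c in candidates
    }
    # Ties are broken alphabetically so the ranking is deterministic.
    ranking = sorted(candidates, key=lambda c: (scores[c], c))
    return ranking, scores


# Hypothetical toy network: D1 and D2 are known disease genes.
edges = [("D1", "A"), ("A", "B"), ("B", "C"), ("D2", "B"), ("X", "Y")]
ranking, scores = rank_candidates(edges, ["D1", "D2"], ["A", "B", "C", "X"])
print(ranking)
print(scores)
```

Candidates A and B are each on average 1.5 edges from the two known genes, C is 2.5 edges away, and X is not connected to either, so it ranks last. Shortest-path distance looks at a single route between two genes; measures that spread a score across the whole network also take alternative routes into account.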