Biological networks
Bing Zhang
Department of Biomedical Informatics
Vanderbilt University
Protein-protein interaction (PPI)

Definition
  Physical association of two or more protein molecules
Examples
  Receptor-ligand interactions
  Kinase-substrate interactions
  Transcription factor-co-activator interactions
  Multiprotein complex, e.g. multimeric enzymes
BCHM352, Spring 2013
Significance of protein interaction

Most proteins mediate their function through interacting with other proteins
  To form molecular machines
  To participate in various regulatory processes
Distortions of protein interactions can cause diseases

Figure: RNA polymerase II, 12 subunits (Cramer et al. Science 292:1863, 2001)
Yeast two-hybrid

Method
  Bait strain: a protein of interest, bait (B), fused to a DNA-binding domain (DBD)
  Prey strains: ORFs fused to a transcriptional activation domain (AD)
  Mate the bait strain to prey strains and plate diploid cells on selective media (e.g. without histidine)
  If bait and prey interact in the diploid cell, they reconstitute a transcription factor, which activates a reporter gene whose expression allows the diploid cell to grow on selective media
  Pick colonies, isolate DNA, and sequence to identify the ORF interacting with the bait
Pros
  High-throughput
  Can detect transient interactions
Cons
  False positives
  Non-physiological (done in the yeast nucleus)
  Can’t detect multiprotein complexes

Uetz P. Curr Opin Chem Biol. 6:57, 2002
Tandem affinity purification

Method
  TAP tag: Protein A, Calmodulin binding domain, TEV protease cleavage site
  Bait protein gene is fused with the DNA sequences encoding TAP tag
  Tagged bait is expressed in cells and forms native complexes
  Complexes purified by TAP method
  Components of each complex are identified through gel separation followed by MS/MS
Pros
  High-throughput
  Physiological setting
  Can detect large stable protein complexes
Cons
  High false positives
  Can’t detect transient interactions
  Can’t detect interactions not present under the given condition
  Tagging may disturb complex formation
  Binary interaction relationship is not clear

Chepelev et al. Biotechnol & Biotechnol 22:1, 2008
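The TAP limitation that the binary interaction relationship is not clear can be made concrete. A purification reports only which proteins co-purify with the bait, so pairwise edges have to be inferred. Two commonly used conventions are the spoke model (bait paired with each prey) and the matrix model (every pair within the complex). The sketch below is illustrative only; the protein names and function names are hypothetical and do not come from the slides.

```python
from itertools import combinations

def spoke_model(bait, preys):
    """Pair the bait with every co-purified prey."""
    return {frozenset((bait, p)) for p in preys if p != bait}

def matrix_model(bait, preys):
    """Pair every member of the purified complex with every other member."""
    members = sorted({bait, *preys})
    return {frozenset(pair) for pair in combinations(members, 2)}

# Hypothetical purification: bait "A" pulls down "B", "C", and "D".
spoke_edges = spoke_model("A", ["B", "C", "D"])
matrix_edges = matrix_model("A", ["B", "C", "D"])
print(len(spoke_edges), len(matrix_edges))
```

The spoke model yields fewer inferred edges and the matrix model more; neither tells us which pairs are in direct physical contact.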
Protein-protein interaction identification

Experimental
  Yeast two-hybrid
  Tandem affinity purification
Computational
  Gene fusion
  Conservation of gene neighborhood
  Phylogenetic profiling
  Coevolution
  Ortholog interaction
  Domain interaction

Valencia et al. Curr. Opin. Struct. Biol, 12:368, 2002
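Of the computational approaches listed, phylogenetic profiling is easy to illustrate: genes that are present or absent in the same set of genomes are predicted to be functionally linked. The following is a minimal sketch with made-up profiles; the gene names, the five-genome profiles, and the Hamming-distance threshold are all assumptions for illustration.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) of each gene across five genomes.
profiles = {
    "geneA": (1, 0, 1, 1, 0),
    "geneB": (1, 0, 1, 1, 0),
    "geneC": (0, 1, 0, 0, 1),
    "geneD": (1, 0, 1, 0, 0),
}

def hamming(p, q):
    """Number of genomes in which the two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))

def predict_links(profiles, max_distance=0):
    """Return gene pairs whose profiles differ in at most max_distance genomes."""
    links = []
    for g1, g2 in combinations(sorted(profiles), 2):
        if hamming(profiles[g1], profiles[g2]) <= max_distance:
            links.append((g1, g2))
    return links

predicted = predict_links(profiles)
print(predicted)
```

Real analyses use many more genomes and a statistical score rather than a fixed threshold.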
PPI data in the public domain

Database of Interacting Proteins (DIP)
http://dip.doe-mbi.ucla.edu/

The Molecular INTeraction database (MINT)
http://mint.bio.uniroma2.it/mint/

The Biomolecular Object Network Databank (BOND)
http://bond.unleashedinformatics.com/

The General Repository for Interaction Datasets (BioGRID)
http://www.thebiogrid.org/

Human Protein Reference Database (HPRD)
http://www.hprd.org

Online Predicted Human Interaction Database (OPHID)
http://ophid.utoronto.ca

iRef
http://wodaklab.org/iRefWeb

The International Molecular Exchange Consortium (IMEX)
http://www.imexconsortium.org
HPRD
Graph representation of networks

Graph: a set of objects called nodes or vertices connected by links called edges. In mathematics and computer science, a graph is the basic object of study in graph theory.

Figure: RNA polymerase II shown as a graph, with a node and an edge labeled (Cramer et al. Science 292:1863, 2001)
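One common way to hold such a graph in code is an adjacency list, mapping each node to the set of its neighbors. This is a minimal sketch for an undirected network; the protein labels P1 to P4 and the function name are hypothetical.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected adjacency list from (node, node) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)

# Hypothetical interactions among four proteins.
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
graph = build_graph(edges)
print(sorted(graph["P3"]))
```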
Protein interaction networks

Saccharomyces cerevisiae: Jeong et al. Nature, 411:41, 2001
Drosophila melanogaster: Giot et al. Science, 302:1727, 2003
Caenorhabditis elegans: Li et al. Science, 303:540, 2004
Homo sapiens: Rual et al. Nature, 437:1173, 2005
Biological networks

Networks                               Nodes                      Edges
Physical interaction networks
  Protein-protein interaction network  Proteins                   Physical interaction, undirected
  Signaling network                    Proteins                   Modification, directed
  Gene regulatory network              TFs/miRNAs, target genes   Physical interaction, directed
  Metabolic network                    Metabolites                Metabolic reaction, directed
Functional association networks
  Co-expression network                Genes/proteins             Co-expression, undirected
  Genetic network                      Genes                      Genetic interaction, undirected
Degree, path, shortest path

Degree: the number of edges adjacent to a node.

Path: a sequence of nodes such that from each of its nodes there is
an edge to the next node in the sequence.

Shortest path: a path between two nodes such that the sum of the
distances of its constituent edges is minimized.
Figure examples: YDL176W (degree: 3); Fhl1 (out degree: 4, in degree: 0)
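The three definitions above translate directly into code. The sketch below uses a small directed network and treats every edge as having distance 1, so breadth-first search finds a shortest path. The node names (TF1, g1 to g4) are hypothetical and are not the proteins from the slide.

```python
from collections import deque

# Hypothetical directed network: each key points to the nodes it regulates.
directed = {"TF1": ["g1", "g2", "g3", "g4"], "g1": ["g2"], "g2": [], "g3": [], "g4": []}

def out_degree(graph, node):
    """Number of edges leaving the node."""
    return len(graph.get(node, []))

def in_degree(graph, node):
    """Number of edges arriving at the node."""
    return sum(node in targets for targets in graph.values())

def shortest_path(graph, source, target):
    """Breadth-first search; returns the node sequence, or None if unreachable."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

print(out_degree(directed, "TF1"), in_degree(directed, "TF1"))
print(shortest_path(directed, "TF1", "g2"))
```

For weighted edges, breadth-first search would need to be replaced by an algorithm such as Dijkstra's.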
Properties of complex networks
Scale-free
Small world
Modular
Hierarchical
Obama vs Lady Gaga: who is more influential?
Figure: Twitter following (out degree) and Twitter followers (in degree) for Obama, Gaga, and Eminem
Role of hubs in biological networks

Based on data from the model organisms S. cerevisiae and C. elegans, hubs tend to:
  Correspond to essential genes
  Be older and have evolved more slowly
  Be more abundant
  Show a larger diversity of phenotypic outcomes when deleted

Vidal et al. Cell, 144:986, 2011
Connectivity vs protein lethality

Node colors: red, lethal; green, non-lethal; orange, slow growth; yellow, unknown

Pearson's correlation coefficient r = 0.75 demonstrates a positive correlation between lethality and connectivity

Jeong et al, Nature, 411:41, 2001
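For reference, this is how a Pearson correlation coefficient between connectivity and lethality could be computed. The numbers below are invented for illustration and are not the data of Jeong et al.; the variable names are likewise assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented example: node degree and the fraction of proteins
# with that degree whose deletion is lethal.
degrees = [1, 2, 3, 4, 5]
fraction_lethal = [0.10, 0.15, 0.30, 0.25, 0.45]
r = pearson_r(degrees, fraction_lethal)
print(round(r, 3))
```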
Modularity

A module is a group of physically or functionally linked molecules (nodes) that work together to achieve a relatively distinct function.

Examples
  Transcriptional module: a set of co-regulated genes
  Protein complex: an assembly of proteins that builds up some cellular machinery; it commonly spans a dense sub-network of proteins in a protein interaction network
  Signaling pathway: a chain of interacting proteins propagating a signal in the cell

Figures: protein interaction modules (Palla et al, Nature, 435:841, 2005); gene co-expression modules (Shi et al, BMC Syst Biol, 4:74, 2010)
Network distance vs functional similarity

Proteins that lie closer to one another in a protein interaction network are more likely to have similar functions and to be involved in similar biological processes.

Figure: GO semantic similarity vs network distance (Hu et al. Nature Rev Cancer, 7:23, 2007)
Network-based prediction: protein function, protein expression, disease association

Direct neighborhood method (local)
  Direct interaction partners of a protein are likely to share the same function, expression status, and disease association.
Diffusion-based method (global)
  Proteins located in close network proximity (through direct or indirect interaction) are more likely to share the same function, expression status, and disease association.
Module-based method
  Proteins in the same network module are more likely to share the same function, expression status, and disease association.

Sharan et al. Mol Syst Biol, 3:88, 2007
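The direct neighborhood method can be sketched as a majority vote: an unannotated protein receives the function most common among its annotated interaction partners. This is one simple variant, assumed here for illustration; the network, protein names, and annotations are hypothetical.

```python
from collections import Counter

# Hypothetical undirected network and known annotations.
network = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1"},
    "P3": {"P1"},
    "P4": {"P1", "P5"},
    "P5": {"P4"},
}
known_function = {"P2": "kinase signaling", "P3": "kinase signaling", "P4": "transport"}

def predict_by_neighbors(network, annotations, protein):
    """Return the most common annotation among direct partners, or None."""
    votes = Counter(annotations[n] for n in network[protein] if n in annotations)
    if not votes:
        return None
    return votes.most_common(1)[0][0]

print(predict_by_neighbors(network, known_function, "P1"))
print(predict_by_neighbors(network, known_function, "P5"))
```

Because only direct partners vote, a protein with no annotated neighbors gets no prediction, which is the gap that global and module-based methods aim to fill.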
Protein identification in shotgun proteomics
Protein digestion
LC-MS/MS
Database search
Protein assembly
Protein assembly and classification
Background
Figure, panels a and b: Zhang, et al. J Proteome Res 6:3549, 2007
Network-assisted protein identification: motivation

Current protein assembly pipelines treat proteins as individual entities. Biologically interesting proteins may be eliminated due to insufficient experimental evidence.

Most biological functions arise from interactions among proteins. Can we use protein interaction network information to improve protein identification?

Hypothesis: an eliminated protein is more likely to be present in the original sample if it belongs to a module in which other protein components are confidently identified.
Module-based prediction of protein expression

Class definition of proteins
  Positive
  Negative
  Unknown
Network mapping
Module identification
Statistical evaluation

Li et al. Mol Syst Biol, 5:303, 2009
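The slide does not say which statistic is used in the evaluation step. One common choice, assumed here purely for illustration, is a hypergeometric test asking whether a module contains more positive (confidently identified) proteins than expected by chance. The counts in the example are hypothetical.

```python
from math import comb

def hypergeom_tail(population, positives, module_size, positives_in_module):
    """P(X >= positives_in_module) when module_size proteins are drawn
    without replacement from a population containing `positives` positives."""
    upper = min(positives, module_size)
    total = comb(population, module_size)
    return sum(
        comb(positives, k) * comb(population - positives, module_size - k)
        for k in range(positives_in_module, upper + 1)
    ) / total

# Hypothetical counts: 100 proteins in the network, 20 confidently identified,
# and a module of 5 proteins of which 4 are confidently identified.
p_value = hypergeom_tail(population=100, positives=20, module_size=5, positives_in_module=4)
print(p_value)
```

A small p-value would support rescuing the remaining, initially eliminated member of that module.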
Application: breast cancer data set (normal vs tumor)

Rescued proteins
  Normal: 139 (23%)
  Tumor: 95 (8%)
Rescued cancer-related proteins
  Ctnnb1
  Top1
  …
Cancer specific sub-networks
  Wnt signaling pathway
  Cell adhesion
  Apoptosis
  …
Network-based disease gene prioritization
For a specific disease, candidate genes can be ranked based on their proximity to known disease genes.

Kohler et al. Am J Hum Genet. 82:949, 2008
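Proximity to known disease genes can be scored with a random walk with restart, a diffusion-style measure. The sketch below is a plain-Python illustration under stated assumptions: the toy network, gene names, restart probability of 0.3, and convergence tolerance are all hypothetical, and every node is assumed to have at least one neighbor.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities for a walker that starts at the seed
    nodes and, at each step, returns to them with probability `restart`."""
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            # Spread the remaining probability evenly over the neighbors of n.
            share = (1.0 - restart) * p[n] / len(adjacency[n])
            for m in adjacency[n]:
                nxt[m] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Hypothetical network: D1 and D2 are known disease genes, C1 and C2 are candidates.
network = {
    "D1": {"C1", "X"},
    "C1": {"D1", "D2"},
    "D2": {"C1"},
    "X": {"D1", "C2"},
    "C2": {"X"},
}
scores = random_walk_with_restart(network, seeds={"D1", "D2"})
ranked = sorted(["C1", "C2"], key=scores.get, reverse=True)
print(ranked)
```

Here C1 interacts with both known disease genes and therefore ranks above C2, which is two steps away from the nearest one.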
Network visualization tools

Cytoscape
  http://www.cytoscape.org

Gehlenborg et al. Nature Methods, 7:S56, 2010