Biological networks:
Global network properties
Bing Zhang
Department of Biomedical Informatics
Vanderbilt University
[email protected]
Lectures on biological networks
Global network properties (11/9)
Motifs and modules (11/18)
Network construction (11/20)
Network-based applications (11/30)
BMIF310, Fall 2009
Cell as a system
Metabolic network
Signaling network
Transcriptional regulatory network
Gene co-expression network
Protein interaction network
Genome-wide protein interaction networks
Saccharomyces cerevisiae: Uetz et al. Nature, 403:623, 2000; Ito et al. PNAS, 97:1143, 2000
Drosophila melanogaster: Giot et al. Science, 302:1727, 2003
Caenorhabditis elegans: Li et al. Science, 303:540, 2004
Homo sapiens: Rual et al. Nature, 437:1173, 2005; Stelzl et al. Cell, 122:957, 2005
Graph
 
Graph: a graph is a set of objects called nodes or vertices
connected by links called edges. In mathematics and computer
science, a graph is the basic object of study in graph theory.
Figure: RNA polymerase II, with a node and an edge labeled (Cramer et al. Science 292:1863, 2001)
Undirected graph vs directed graph
Protein interaction network (Krogan et al. Nature 440:637, 2006)
  Nodes: proteins
  Edges: physical interactions
  Undirected
Metabolic network (Ravasz et al. Science 297:1551, 2002)
  Nodes: metabolites
  Edges: enzymes
  Directed: substrate -> product
Transcriptional regulatory network (Lee et al. Science 298:799, 2002)
  Nodes: transcription factors and genes
  Edges: transcriptional regulation
  Directed: TF -> target gene
  Example: Fhl1 -> RPL2B
Unweighted graph vs weighted graph
Protein interaction network (Krogan et al. Nature 440:637, 2006)
  Nodes: proteins
  Edges: physical interactions
  Unweighted
Gene co-expression network
  Nodes: genes
  Edges: co-expression level
  Weighted
  Correlation filtering (>0.8) turns the weighted network into an unweighted one
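The correlation filtering step can be sketched in a few lines. This is a minimal illustration, assuming that co-expression is measured with Pearson's correlation and that the filter keeps pairs with signed r > 0.8; the gene names and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression):
    """Weighted network: every gene pair mapped to its correlation."""
    return {
        (a, b): pearson(expression[a], expression[b])
        for a, b in combinations(sorted(expression), 2)
    }


def filter_edges(weighted, threshold=0.8):
    """Unweighted network: keep only pairs whose weight exceeds the threshold."""
    return {pair for pair, r in weighted.items() if r > threshold}


# Hypothetical expression profiles over four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}

weighted = coexpression_edges(expression)
unweighted = filter_edges(weighted)
print(unweighted)
```

Only the geneA/geneB pair survives the filter here; whether to threshold on r or on |r| is a modeling choice that the slide does not specify.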
Graph representation
Adjacency matrix
Adjacency list
Space tradeoff
Operation tradeoff
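The two representations can be compared on a toy example. The sketch below builds both from the same edge list, assuming an undirected, unweighted graph; the node names are hypothetical.

```python
def adjacency_matrix(nodes, edges):
    """n x n matrix of 0/1 entries; entry [i][j] is 1 if i and j are linked."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1  # undirected: keep the matrix symmetric
    return matrix


def adjacency_list(nodes, edges):
    """Map from each node to the list of its neighbors."""
    neighbors = {node: [] for node in nodes}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    return neighbors


nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("C", "D")]

matrix = adjacency_matrix(nodes, edges)
neighbor_lists = adjacency_list(nodes, edges)

for row in matrix:
    print(row)
print(neighbor_lists)
```

The matrix always stores n x n entries and answers "are u and v linked?" with a single lookup. The list stores two entries per edge, which is far smaller for sparse networks, but testing one specific link means scanning a neighbor list.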
Node degree
 
Degree: the number of edges incident to a vertex.
YDL176W (undirected): degree 3
dTMP (directed): in-degree 3, out-degree 2
Fhl1 (directed): in-degree 0, out-degree 4
RPL2B (directed): in-degree 3, out-degree 0
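Degree, in-degree, and out-degree can all be counted directly from an edge list. The sketch below is illustrative only: the edge Fhl1 -> RPL2B comes from the slide, while every other partner name (`partner_1`, `target_1`, `tf_1`, and so on) is a hypothetical placeholder chosen so that the counts match the slide's numbers.

```python
from collections import Counter


def degrees(edges):
    """Degree of each node in an undirected graph."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def in_out_degrees(directed_edges):
    """In-degree and out-degree of each node in a directed graph."""
    in_degree, out_degree = Counter(), Counter()
    for source, target in directed_edges:
        out_degree[source] += 1
        in_degree[target] += 1
    return in_degree, out_degree


# Undirected example: YDL176W with three hypothetical interaction partners.
interactions = [
    ("YDL176W", "partner_1"),
    ("YDL176W", "partner_2"),
    ("YDL176W", "partner_3"),
]

# Directed example: TF -> target gene. Only Fhl1 -> RPL2B is from the slide.
regulation = [
    ("Fhl1", "RPL2B"),
    ("Fhl1", "target_1"),
    ("Fhl1", "target_2"),
    ("Fhl1", "target_3"),
    ("tf_1", "RPL2B"),
    ("tf_2", "RPL2B"),
]

degree = degrees(interactions)
in_degree, out_degree = in_out_degrees(regulation)

print(degree["YDL176W"])
print(in_degree["Fhl1"], out_degree["Fhl1"])
print(in_degree["RPL2B"], out_degree["RPL2B"])
```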
Degree distribution
Node       Degree
Vid30      10
Fyv10      5
YMR135C    4
Vid24      3
Vid28      3
YDL176W    3
Ald6       1
YDR255C    1
Sod1       1
YMR093W    1
Hta2       1
Vma2       1
RPL23A     1
YCL039W    1
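The degree distribution P(k) is the fraction of nodes that have degree k. As a small worked example, the sketch below computes it for the node and degree values listed above, using exact fractions.

```python
from collections import Counter
from fractions import Fraction

# Node degrees as listed on the slide.
degree = {
    "Vid30": 10, "Fyv10": 5, "YMR135C": 4, "Vid24": 3, "Vid28": 3,
    "YDL176W": 3, "Ald6": 1, "YDR255C": 1, "Sod1": 1, "YMR093W": 1,
    "Hta2": 1, "Vma2": 1, "RPL23A": 1, "YCL039W": 1,
}


def degree_distribution(degree):
    """Return {k: P(k)}, the fraction of nodes with each degree k."""
    counts = Counter(degree.values())
    n = len(degree)
    return {k: Fraction(count, n) for k, count in sorted(counts.items())}


distribution = degree_distribution(degree)
for k, p in distribution.items():
    print(f"P({k}) = {p}")
```

Eight of the fourteen nodes have degree 1, while a single node (Vid30) has degree 10.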
Random network: Erdős-Rényi model
 
Model
  G(n,p): nodes are connected randomly to each other with probability p.
  Example: p = 1/6; n = 10; <k> = 1.5
Characteristic
  Degree distribution: binomial, B(n-1,p)
  Figure: P(K=k) versus K for B(n-1,p), with <k> marked
Binomial distribution
$$P(K_i = k) = \binom{n-1}{k} p^{k} (1-p)^{n-1-k} \approx \binom{n}{k} p^{k} (1-p)^{n-k}$$
Average degree
$$\langle k \rangle = (n-1)p \approx np$$
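The binomial degree distribution and the G(n,p) construction can be checked numerically. The sketch below uses the slide's example values n = 10 and p = 1/6; the function names and the random seed are illustrative choices.

```python
import random
from itertools import combinations
from math import comb


def degree_pmf(k, n, p):
    """P(K = k) for one node in G(n, p): binomial B(n-1, p)."""
    return comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


def sample_gnp(n, p, rng):
    """Draw one G(n, p) graph: each of the n(n-1)/2 pairs is linked with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


n, p = 10, 1 / 6
pmf = [degree_pmf(k, n, p) for k in range(n)]
mean_degree = sum(k * q for k, q in enumerate(pmf))

edges = sample_gnp(n, p, random.Random(0))

print(f"sum of P(K=k) = {sum(pmf):.6f}")
print(f"<k> = {mean_degree:.6f}")
print(f"sampled graph has {len(edges)} edges")
```

The computed mean agrees with (n-1)p = 1.5. The edge count of a sampled graph varies from draw to draw around its expectation of p n(n-1)/2 = 7.5.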
Web documents network
Nodes: WWW documents
Edges: URL links
Data: 800 million documents
Network construction: collects all URLs found in a document and follows them recursively
Albert et al. Nature 401:130, 1999
Degree distribution of the web documents network
What was expected?
  〈k〉 ~ 6
  P(k=500) ~ 10^-99
  N_WWW ~ 10^9 ⇒ N(k=500) ~ 10^-90
What was found
  P(k=500) ~ 10^-6
  N(k=500) ~ 10^3
Kleinberg et al. Proceedings of ICCC, 1999
Scale-free network
The straight line on the log-log plot of the degree distribution is the signature of a power law: $P(k) \sim k^{-\gamma}$
Scale-free network: a network whose degree distribution follows a power law.
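A power law of the form P(k) ~ k^(-gamma) becomes a straight line with slope -gamma when log P(k) is plotted against log k. The sketch below recovers the slope with an ordinary least-squares fit on synthetic, noise-free data; the exponent 2.5 and the degree values are hypothetical, and on real data a least-squares fit of this kind is only a rough diagnostic.

```python
from math import log10


def loglog_slope(ks, pk):
    """Least-squares slope of log10 P(k) against log10 k."""
    xs = [log10(k) for k in ks]
    ys = [log10(p) for p in pk]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic (unnormalized) power-law values with a hypothetical exponent.
gamma = 2.5
ks = [1, 2, 4, 8, 16, 32]
pk = [k ** -gamma for k in ks]

slope = loglog_slope(ks, pk)
print(f"estimated exponent: {-slope:.3f}")
```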
Random network vs scale-free network
 
Random network
  130 nodes, 215 edges
  Homogeneous: most nodes have approximately the same number of links
  Five red nodes with the highest number of links reach 27% of the nodes
Scale-free network
  130 nodes, 215 edges
  Heterogeneous: the majority of the nodes have one or two links but a few nodes have a large number of links
  Five red nodes with the highest degrees (hubs) reach 60% of the nodes
Albert et al., Nature, 406:378, 2000
Origin of scale-free networks
Networks are the result of a growth process
New nodes prefer to connect to nodes that already have many links, i.e. hubs (preferential attachment)
Examples
  Social network
  Citation network
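Growth with preferential attachment can be simulated in a few lines. This is a minimal sketch in the spirit of the Barabási-Albert model, not a reference implementation: the seed graph (a complete graph on m+1 nodes), the parameter values, and the function name are all assumptions made for illustration.

```python
import random
from collections import Counter


def preferential_attachment(n, m, rng):
    """Grow a graph to n nodes; each new node links to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    # Seed: a small complete graph on m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]
    # Every node appears in `stubs` once per link it has, so a uniform draw
    # from `stubs` picks a node with probability proportional to its degree.
    stubs = [node for edge in edges for node in edge]
    for new_node in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for target in sorted(targets):
            edges.append((new_node, target))
            stubs.extend((new_node, target))
    return edges


n, m = 200, 2
edges = preferential_attachment(n, m, random.Random(1))

degree = Counter(node for edge in edges for node in edge)
print("nodes:", len(degree), "edges:", len(edges))
print("five highest degrees:", sorted(degree.values(), reverse=True)[:5])
```

Nodes that join early tend to accumulate links and become hubs, while most late arrivals keep a degree close to m.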
Degree distribution of metabolic networks
Figure panels: A. fulgidus, E. coli, C. elegans, 43 organisms
Jeong et al, Nature, 407:651, 2000
Scale-free biological networks
Metabolic network, C. elegans (Jeong et al, Nature, 407:651, 2000)
Protein interaction network, H. sapiens (Stelzl et al. Cell, 122:957, 2005)
Gene co-expression network, S. cerevisiae (Noort et al, EMBO Reports, 5:280, 2004)
Evolutionary origin of scale-free networks
 
Networks are the result of a growth process
New nodes prefer to connect to nodes that already have many links (preferential attachment)
Growth and preferential attachment have a common origin in protein interaction networks that is probably rooted in gene duplication
Highly connected proteins are more likely to have a link to a duplicated protein if a randomly selected protein is duplicated
Barabasi and Oltvai, Nature Rev Genet, 5:101, 2004
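The link between gene duplication and preferential attachment can be made concrete with a small enumeration. The sketch assumes a simplified duplication step: one protein is chosen uniformly at random and its copy inherits all of its links. Under that assumption, a protein gains a link exactly when one of its neighbors is duplicated, so its chance of gaining a link is proportional to its degree. The toy network and its node names are hypothetical.

```python
from fractions import Fraction


def link_gain_probability(neighbors):
    """Probability that each protein gains a link in one duplication step."""
    n = len(neighbors)
    gain = {node: Fraction(0) for node in neighbors}
    for duplicated in neighbors:              # each protein is equally likely to be copied
        for partner in neighbors[duplicated]:  # the copy inherits every link of the original
            gain[partner] += Fraction(1, n)
    return gain


# Hypothetical undirected interaction network (symmetric neighbor sets).
neighbors = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"c"},
}

gain = link_gain_probability(neighbors)
for node in sorted(neighbors):
    print(node, "degree", len(neighbors[node]), "gain probability", gain[node])
```

Each probability equals degree / N, which is the preferential attachment rule.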
Connectivity vs protein age
 
Proteins in baker's yeast are divided into four groups: group 1, 872 proteins; group 2, 665 proteins; group 3, 2079 proteins; group 4, 2678 proteins
Solid symbols: whole interaction database; open symbols: high-confidence interactions
Older proteins (group 4) have significantly more interactions
Eisenberg and Levanon, Phys Rev Lett, 91:138701, 2003
Scale-free topology and network robustness
 
Robust against random damage
 
Yet fragile against selective damage
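The contrast between random and selective damage can be illustrated on a toy hub-and-spoke network by measuring the size of the largest connected component after a node is removed. The network, the node names, and the choice of largest-component size as the damage measure are assumptions made for this sketch.

```python
def build_neighbors(edges):
    """Undirected neighbor sets from an edge list."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def largest_component(neighbors, removed=()):
    """Size of the largest connected component once `removed` nodes are deleted."""
    unseen = set(neighbors) - set(removed)
    best = 0
    while unseen:
        stack = [unseen.pop()]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in neighbors[node]:
                if nxt in unseen:
                    unseen.remove(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


# Hypothetical network: one hub with six spokes, plus one spoke-spoke link.
edges = [("hub", f"s{i}") for i in range(1, 7)] + [("s1", "s2")]
network = build_neighbors(edges)

intact = largest_component(network)
after_spoke_loss = largest_component(network, removed=["s3"])
after_hub_loss = largest_component(network, removed=["hub"])

print(intact, after_spoke_loss, after_hub_loss)
```

Losing a low-degree node leaves the rest of the network connected, whereas losing the hub breaks it into small fragments.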
Connectivity vs protein lethality
 
Red, lethal; green, non-lethal; orange, slow growth; yellow, unknown
Pearson's correlation coefficient r = 0.75 demonstrates a positive correlation between lethality and connectivity
Jeong et al, Nature, 411:41, 2001
Path and shortest path
Path: a sequence of nodes such that from each of its nodes there is an edge to the next node in the sequence.
Shortest path: a path between two nodes such that the sum of the distances of its constituent edges is minimized.
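One common way to find a shortest path when edges carry non-negative distances is Dijkstra's algorithm. The slide does not name an algorithm, so this is an illustrative choice; the graph and its edge distances are hypothetical.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a graph given as {node: {neighbor: distance}}.
    Returns (total distance, path), or (inf, []) if target is unreachable."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == target:
            return dist, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor not in settled:
                heapq.heappush(queue, (dist + weight, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical undirected weighted graph; "E" is an isolated node.
graph = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2},
    "C": {"A": 5, "B": 2, "D": 1},
    "D": {"C": 1},
    "E": {},
}

distance, path = shortest_path(graph, "A", "D")
print(distance, path)
```

The direct edge A-C has distance 5, so the route through B (1 + 2) is shorter. In an unweighted network every edge has distance 1 and a breadth-first search gives the same result.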
Small world network
Stanley Milgram's small world experiment (social network)
  Map labels: Wichita, Omaha, Boston
  "If you do not know the target person on a personal basis, do not try to contact him directly. Instead, mail this folder to a personal acquaintance who is more likely than you to know the target person."
  Average path length between two people: six degrees of separation
Small world network: a graph in which most nodes can be reached from every other by a small number of steps.
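The average path length of an unweighted network can be computed with a breadth-first search from every node. The sketch below assumes an undirected graph and averages over reachable pairs only; the two toy acquaintance networks are hypothetical.

```python
from collections import deque


def bfs_distances(neighbors, source):
    """Number of steps from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def average_path_length(neighbors):
    """Mean shortest-path length over all ordered pairs of distinct, connected nodes."""
    total = 0
    pairs = 0
    for source in neighbors:
        for node, steps in bfs_distances(neighbors, source).items():
            if node != source:
                total += steps
                pairs += 1
    return total / pairs


# A chain of four people, and the same chain with one extra acquaintance (a-d).
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
ring = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}

chain_length = average_path_length(chain)
ring_length = average_path_length(ring)
print(f"{chain_length:.3f} {ring_length:.3f}")
```

A single additional link shortens the average path, which is the effect that shortcuts have in small world networks.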
Metabolic networks are small world networks
Figure: the histogram of the path lengths in the E. coli metabolic network
Figure: the average path lengths for metabolic networks of 43 organisms with different complexity
Biological interpretation: efficiency in transfer of biological information
Jeong et al, Nature, 407:651, 2000
Summary
 
 
Graph representation of biological networks
  Node, edge
  Directed/undirected, weighted/unweighted
  Degree, degree distribution
  Path, shortest path, average path length
Properties of biological networks
  Scale-free
    Degree distribution follows the power law
    Growth and preferential attachment
    Hubs and robustness
  Small world
    Most nodes can be reached from every other by a small number of steps
    Efficiency
Key references
 
Barabasi and Oltvai, Network biology: Understanding the cell's functional organization. Nature Rev Genet, 5:101, 2004