
Applications of Random Networks
Complex Networks
CSYS/MATH 303, Spring, 2011

Prof. Peter Dodds
Department of Mathematics & Statistics
Center for Complex Systems
Vermont Advanced Computing Center
University of Vermont

Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

Outline
Analysis of real networks
How to build revisited
Motifs
References

More on building random networks
- Problem: How much of a real network's structure is non-random?
- Key elephant in the room: the degree distribution P_k.
- First observe departure of P_k from a Poisson distribution.
- Next: measure the departure of a real network with a degree frequency N_k from a random network with the same degree frequency.
- Degree frequency N_k = observed frequency of degrees for a real network.
- What we now need to do: Create an ensemble of random networks with degree frequency N_k and then compare.
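
The first comparison above (observed degree frequency against a Poisson distribution with the same mean degree) can be sketched as follows. This is a minimal illustration, not part of the lecture: the function names `degree_frequency` and `poisson_expected_counts` and the toy star network are assumptions made for the example.

```python
import math
from collections import Counter

def degree_frequency(edges, num_nodes):
    """Return N_k as a dict: degree k -> number of nodes with that degree."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(Counter(degree))

def poisson_expected_counts(mean_degree, num_nodes, k_max):
    """Expected number of nodes with degree 0..k_max if P_k were Poisson."""
    return {
        k: num_nodes * math.exp(-mean_degree) * mean_degree ** k / math.factorial(k)
        for k in range(k_max + 1)
    }

# Toy data (hypothetical): a star with one hub (node 0) and five leaves.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)]
num_nodes = 6

N_k = degree_frequency(edges, num_nodes)
mean_degree = 2 * len(edges) / num_nodes
expected = poisson_expected_counts(mean_degree, num_nodes, max(N_k))

# Print observed and Poisson-expected node counts side by side.
for k in range(max(N_k) + 1):
    print(k, N_k.get(k, 0), round(expected[k], 3))
```

A large gap between the two columns only says that the network is not Poisson-like; the ensemble with fixed N_k is still needed to judge structure beyond the degrees.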

Building random networks: Stubs
Phase 1:
- Idea: start with a soup of unconnected nodes with stubs (half-edges).
- Randomly select stubs (not nodes!) and connect them.
- Must have an even number of stubs.
- Initially allow self- and repeat connections.
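
Phase 1 can be sketched in a few lines. The code below is an illustrative assumption rather than the course's implementation: the function name `stub_matching` and the example degree sequence are hypothetical. It shuffles the list of stubs and joins consecutive stubs, which pairs stubs (not nodes) uniformly at random.

```python
import random

def stub_matching(degrees, rng=None):
    """Phase 1: pair up stubs uniformly at random.

    degrees[i] is the number of stubs on node i. Self-loops and repeated
    edges are allowed at this stage. Returns a list of (u, v) edges.
    """
    if rng is None:
        rng = random.Random()
    if sum(degrees) % 2 != 0:
        raise ValueError("total number of stubs must be even")
    # One list entry per stub, labelled by the node that owns it.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled list are joined into edges.
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

degrees = [3, 2, 2, 1]                      # hypothetical sequence, 8 stubs
edges = stub_matching(degrees, random.Random(0))
print(edges)
```

Any self-loops or repeated edges in the result are left in place here; they are dealt with in the rewiring phase.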

Building random networks: First rewiring
Phase 2:
- Now find any (A) self-loops and (B) repeat edges and randomly rewire them.
  [Figure: (A) a self-loop; (B) a repeat edge.]
- Being careful: we can't change the degree of any node, so we can't simply move links around.
- Simplest solution: randomly rewire two edges at a time.

General random rewiring algorithm
[Figure: before rewiring, edge e1 joins nodes i1 and i2 and edge e2 joins nodes i3 and i4; after rewiring, edge e'1 joins i1 and i3 and edge e'2 joins i2 and i4.]
- Randomly choose two edges. (Or choose problem edge and a random edge.)
- Check to make sure edges are disjoint.
- Rewire one end of each edge.
- Node degrees do not change.
- Works if e1 is a self-loop or repeated edge.
- Same as finding on/off/on/off 4-cycles and rotating them.
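
A single rewiring step might look like the sketch below. The names `rewire_once` and `degree_sequence`, the optional `index1` argument for targeting a known problem edge, and the toy edge list are all assumptions made for illustration. The sketch only checks that the two edges are disjoint; it does not check whether the new edges duplicate existing ones.

```python
import random

def degree_sequence(edges, num_nodes):
    """Degree of every node; a self-loop contributes 2 to its node."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree

def rewire_once(edges, rng, index1=None):
    """Attempt one degree-preserving rewiring on an edge list, in place.

    Picks e1 = (i1, i2) and e2 = (i3, i4). If they share no node, they are
    replaced by (i1, i3) and (i2, i4). Returns True if a rewiring was made.
    index1 optionally fixes e1 to a known problem edge.
    """
    a = rng.randrange(len(edges)) if index1 is None else index1
    b = rng.randrange(len(edges))
    if a == b:
        return False
    i1, i2 = edges[a]
    i3, i4 = edges[b]
    if {i1, i2} & {i3, i4}:
        return False                 # edges are not disjoint: reject
    if rng.random() < 0.5:
        i3, i4 = i4, i3              # pick which end of e2 is swapped
    edges[a] = (i1, i3)
    edges[b] = (i2, i4)
    return True

# Toy multigraph (hypothetical) with a self-loop at node 0.
edges = [(0, 0), (1, 2), (2, 3), (3, 1)]
before = degree_sequence(edges, 4)
rng = random.Random(1)
while not rewire_once(edges, rng, index1=0):
    pass                             # retry until the self-loop is rewired
after = degree_sequence(edges, 4)
print(edges, before, after)
```

Because each node keeps exactly the stubs it started with, the degree sequence before and after is identical, and the self-loop is replaced by two ordinary edges.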

Sampling random networks
Phase 2:
- Use rewiring algorithm to remove all self-loops and repeat edges.
Phase 3:
- Randomize network wiring by applying rewiring algorithm liberally.
- Rule of thumb: # rewirings ≈ 10 × # edges [1].
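
Phase 3 with the rule of thumb can be sketched as below, assuming Phase 2 has already produced a simple undirected graph. The function names, the safety cap on attempts, and the choice to count only successful rewirings towards the 10 × (number of edges) target are assumptions of this sketch, not statements about the method in [1].

```python
import random

def norm(u, v):
    """Store an undirected edge with its smaller endpoint first."""
    return (u, v) if u <= v else (v, u)

def randomize_simple_graph(edges, rng, rewirings_per_edge=10):
    """Shuffle a simple undirected graph while keeping every node's degree.

    Swaps that would create a self-loop or a repeated edge are rejected,
    so the graph stays simple throughout.
    """
    edges = [norm(u, v) for u, v in edges]
    present = set(edges)
    target = rewirings_per_edge * len(edges)
    done = attempts = 0
    max_attempts = 100 * target          # hypothetical safety cap
    while done < target and attempts < max_attempts:
        attempts += 1
        a, b = rng.sample(range(len(edges)), 2)
        i1, i2 = edges[a]
        i3, i4 = edges[b]
        if rng.random() < 0.5:
            i3, i4 = i4, i3
        if len({i1, i2, i3, i4}) < 4:
            continue                      # edges share a node: reject
        new1, new2 = norm(i1, i3), norm(i2, i4)
        if new1 in present or new2 in present:
            continue                      # would create a repeated edge
        present -= {edges[a], edges[b]}
        present |= {new1, new2}
        edges[a], edges[b] = new1, new2
        done += 1
    return edges

# Toy input (hypothetical): a ring of 8 nodes, every node has degree 2.
ring = [(i, (i + 1) % 8) for i in range(8)]
shuffled = randomize_simple_graph(ring, random.Random(0))
print(shuffled)
```

Calling the function repeatedly with different random seeds yields the ensemble of random networks that the real network is compared against.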

Random sampling
- Problem with only joining up stubs is failure to randomly sample from all possible networks.
- Example from Milo et al. (2003) [1]:

[Figure, from Milo et al. [1], Fig. 2: Uniformity tests of the three algorithms on a toy network. Panels (a) and (b) depict the two types of topologies of the 91 random networks studied, one of them like (a) (1 configuration) and 90 like (b) (90 configurations). Panel (c) shows the frequency with which each configuration is sampled by the three algorithms (switching, matching, go-with-the-winners; axis label: % frequency of occurrence). 100 000 graphs were generated with each algorithm, and the figure shows the fraction of graphs of each type generated. If sampling were uniform, each should appear with probability 1/91, which is indicated by the dotted lines. The go-with-the-winners and switching algorithms sample uniformly within sampling error, passing both the Kolmogorov–Smirnoff and Lillie Gaussian tests. The matching algorithm under-samples the unique configuration.]

Excerpt from Milo et al. [1], as shown on the slide:
"[…] network. The network consists of an out-hub with ten outgoing edges, an in-hub with ten incoming edges, and ten nodes with one incoming edge and one outgoing edge each. Given this degree sequence, there are just two distinct network topologies with no multiple edges, as shown in Fig. 2a and 2b. There is only a single way to form the network in 2a, but there are 90 different ways to form […]. We generated 100 000 random networks using each of the 3 methods described here and the results are summarized in Fig. 2c. As the figure shows, the matching algorithm introduces a bias, undersampling the configuration of Fig. 2a. This is a result of the dynamics of the algorithm, which favors the creation of edges between hubs. The switching and go-with-the-winners algorithms on the other hand sample the configurations uniformly, generating each graph an equal number of times within the measurement error on our calculations. The go-with-the-winners algorithm truly samples the ensemble uniformly but is far less efficient than the two other methods. The results given here indicate that the switching algorithm produces essentially identical results while being a good deal faster. The matching algorithm is faster still but samples in a measurably biased way.
Now consider the study of network motifs. We are interested in knowing when particular subgraphs […] appear significantly more or less often in a real-world network than would be expected on the basis of chance […] we can answer this question by comparing motif […] to random graphs. Some results for the case of the "feed-forward loop" motif [16, 17] are given in Table I […] case the densities of motifs in the real-world networks are many standard deviations away from random […] suggests that any of the present algorithms is a[…] for generating suitable random graphs to act a[…] model, although the go-with-the-winners and switching algorithms, while slower, are clearly more satisfactory theoretically. The matching algorithm was measurably nonuniform for our toy example above, but seems […] better results on the real-world problem.
Overall, our results appear to argue in favor of using the switching method, with the go-with-the-winners method finding limited use as a check on the accuracy of sampling. Accuracy checks are also supplied by […]cal estimates for subgraph numbers [11]."
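
The counts in the toy example (91 configurations, one of one type and 90 of the other) can be checked by brute-force enumeration. The sketch below is not taken from Milo et al.; the node labelling (0 for the out-hub, 1 for the in-hub, 2 to 11 for the remaining nodes) and the function name `enumerate_digraphs` are assumptions. It also assumes that the unique configuration is the one with no direct edge from the out-hub to the in-hub.

```python
from itertools import combinations

def enumerate_digraphs(out_deg, in_deg):
    """Yield every simple directed graph (no self-loops, no repeated edges)
    with the given out- and in-degree sequences, as a frozenset of edges."""
    n = len(out_deg)
    remaining_in = list(in_deg)

    def place(node, edges):
        if node == n:
            if all(r == 0 for r in remaining_in):
                yield frozenset(edges)
            return
        # Targets must differ from the source and still have in-capacity.
        candidates = [v for v in range(n) if v != node and remaining_in[v] > 0]
        for targets in combinations(candidates, out_deg[node]):
            for v in targets:
                remaining_in[v] -= 1
            yield from place(node + 1, edges + [(node, v) for v in targets])
            for v in targets:
                remaining_in[v] += 1      # undo before trying the next choice

    yield from place(0, [])

# Node 0: out-hub, node 1: in-hub, nodes 2..11: one edge in, one edge out.
out_deg = [10, 0] + [1] * 10
in_deg = [0, 10] + [1] * 10

graphs = list(enumerate_digraphs(out_deg, in_deg))
with_hub_edge = sum((0, 1) in g for g in graphs)
print(len(graphs), len(graphs) - with_hub_edge, with_hub_edge)  # prints: 91 1 90
```

Under uniform sampling each labelled configuration should therefore appear with probability 1/91, which is the dotted-line level described in the figure caption.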

Sampling random networks
- What if we have P_k instead of N_k?
- Must now create nodes before start of the construction algorithm.
- Generate N nodes by sampling from degree distribution P_k.
- Easy to do exactly numerically since k is discrete.
- Note: not all P_k will always give nodes that can be wired together.
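
Drawing N degrees from a discrete P_k can be sketched as follows. The function name `sample_degree_sequence`, the example distribution, and the strategy of redrawing the whole sequence until the stub total is even are assumptions of this sketch; other fixes (for example, redrawing a single node) are possible.

```python
import random

def sample_degree_sequence(pk, num_nodes, rng, max_tries=1000):
    """Draw num_nodes degrees from P_k, redrawing until the stub total is even.

    pk maps each degree k to its probability P_k.
    """
    ks = list(pk)
    weights = [pk[k] for k in ks]
    for _ in range(max_tries):
        degrees = rng.choices(ks, weights=weights, k=num_nodes)
        if sum(degrees) % 2 == 0:
            return degrees
    raise RuntimeError("no even-sum degree sequence found")

pk = {1: 0.5, 2: 0.3, 3: 0.2}            # hypothetical degree distribution
degrees = sample_degree_sequence(pk, 50, random.Random(0))
print(sum(degrees), degrees[:10])
```

An even total only guarantees that the stubs can be paired when self-loops and repeated edges are allowed; whether a simple network exists for the drawn sequence is a separate question.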

Network motifs
- Idea of motifs [2] introduced by Shen-Orr, Alon et al. in 2002.
- Looked at gene expression within full context of transcriptional regulation networks.
- Specific example of Escherichia coli.
- Directed network with 577 interactions (edges) and 424 operons (nodes).
- Used network randomization to produce ensemble of alternate networks with same degree frequency N_k.
- Looked for certain subnetworks (motifs) that appeared more or less often than expected.
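
The comparison described above can be sketched for one motif, the feedforward loop. Everything in the block is an illustrative assumption: the function names, the toy directed network, the ensemble size, and the use of a simple target-swapping randomizer that makes 10 × (number of edges) swap attempts. It is not the detection algorithm of [2].

```python
import random
from collections import Counter
from statistics import mean, pstdev

def count_feedforward_loops(edges):
    """Count ordered triples (x, y, z) with edges x->y, y->z and x->z."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
    count = 0
    for x, targets in successors.items():
        for y in targets:
            if y == x:
                continue
            for z in successors.get(y, ()):
                if z != x and z != y and z in targets:
                    count += 1
    return count

def in_out_degrees(edges):
    """Return (out-degree counter, in-degree counter) for an edge list."""
    return Counter(u for u, _ in edges), Counter(v for _, v in edges)

def directed_swap_randomize(edges, rng, attempts_per_edge=10):
    """Swap the targets of random edge pairs, preserving all in/out degrees."""
    edges = list(edges)
    present = set(edges)
    for _ in range(attempts_per_edge * len(edges)):
        a, b = rng.sample(range(len(edges)), 2)
        (u1, v1), (u2, v2) = edges[a], edges[b]
        new1, new2 = (u1, v2), (u2, v1)
        if u1 == v2 or u2 == v1 or new1 in present or new2 in present:
            continue          # would create a self-loop or a repeated edge
        present -= {edges[a], edges[b]}
        present |= {new1, new2}
        edges[a], edges[b] = new1, new2
    return edges

def motif_z_score(edges, rng, ensemble_size=100):
    """Compare the real motif count with counts in a randomized ensemble."""
    real = count_feedforward_loops(edges)
    counts = [count_feedforward_loops(directed_swap_randomize(edges, rng))
              for _ in range(ensemble_size)]
    spread = pstdev(counts)
    z = (real - mean(counts)) / spread if spread > 0 else float("nan")
    return real, mean(counts), z

# Toy directed network (hypothetical data, not the E. coli network).
toy = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 0), (1, 4), (5, 1), (5, 2)]
print(motif_z_score(toy, random.Random(0)))
```

A motif is flagged when its count in the real network lies many standard deviations away from the ensemble mean, in either direction.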

Network motifs
[Figure: first page of the letter by Shai S. Shen-Orr, Ron Milo, Shmoolik Mangan & Uri Alon [2] (published online: 22 April 2002, DOI: 10.1038/ng881; © 2002 Nature Publishing Group, http://genetics.nature.com). The panels show the feedforward loop (X, Y, Z; example: crp, araC, araBAD) and the single input module (SIM; X with targets Z1, Z2, ..., Zn; example: argR with argCBH, argD, argE, argF, argI).]

Excerpts from the letter [2], as shown on the slide:
"Little is known about the design principles of transcriptional regulation networks that control gene expression in cells. Recent advances in data collection and analysis, however, are generating unprecedented amounts of information about gene regulation networks. To understand these complex wiring diagrams, we sought to break down such networks into basic building blocks. We generalize the notion of motifs, widely used for sequence analysis, to the level of networks. We define 'network motifs' as patterns of interconnections that recur in many different parts of a network at frequencies much higher than those found in randomized networks. We applied new algorithms for systematically detecting network motifs to one of the best-characterized regulation networks, that of direct transcriptional interactions in Escherichia coli. We find that much of the network is composed of repeated appearances of three highly significant motifs. Each network motif has a specific function in determining gene expression, such as generating temporal expression programs and governing the responses to fluctuating external signals. The motif structure also allows an easily interpretable view of the entire known transcriptional network of the organism. This approach may help define the basic computational elements of other biological networks."
"We compiled a data set of direct transcriptional interactions between transcription factors and the operons they regulate (an operon is a group of contiguous genes that are transcribed into a single mRNA molecule). This database contains 577 interactions and 424 operons (involving 116 transcription factors); it was formed on the basis of an existing database (RegulonDB). We enhanced RegulonDB by an extensive literature search, adding 35 new transcription factors, including alternative σ-factors (subunits of RNA polymerase that confer recogni[…]"
"[…] from X and a delayed one through Y. If the activation of X is transient, Y cannot reach the level needed to significantly activate Z, and the input signal is not transduced through the circuit. Only when X signals for a long enough time so that Y levels can build up will Z be activated (Fig. 2a). Once X is deactivated, Z shuts down rapidly. This kind of behavior can be useful for making decisions based on fluctuating external signals."

- Z only turns on in response to sustained activity in X.
- Turning off X rapidly turns off Z.
- Analogy to elevator doors.
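
The two behaviours listed above can be reproduced with a toy simulation of a coherent feedforward loop with AND logic. The model is an assumption made purely for illustration (square-pulse input X, unit production and decay rates, a hard activation threshold of 0.5, simple Euler integration); it is not the kinetic model used in [2], and the function name `simulate_ffl` is hypothetical.

```python
def simulate_ffl(pulse_length, t_end=6.0, dt=0.001, threshold=0.5):
    """Euler integration of a coherent feedforward loop with AND logic.

    X is a square pulse of the given length. Y is produced while X is on.
    Z is produced only while X is on AND Y exceeds the threshold. Y and Z
    both decay at unit rate. Returns lists of times and Z values.
    """
    y = z = 0.0
    times, z_values = [], []
    for step in range(int(round(t_end / dt))):
        t = step * dt
        x_on = t < pulse_length
        production_y = 1.0 if x_on else 0.0
        production_z = 1.0 if (x_on and y > threshold) else 0.0
        y += dt * (production_y - y)
        z += dt * (production_z - z)
        times.append(t)
        z_values.append(z)
    return times, z_values

# A brief pulse of X never lets Y reach the threshold, so Z stays off.
_, z_short = simulate_ffl(pulse_length=0.3)
# A sustained pulse lets Y build up, so Z switches on after a delay.
_, z_long = simulate_ffl(pulse_length=3.0)
print(max(z_short), max(z_long))
```

In this sketch the delay in switching Z on comes from the time Y needs to cross the threshold, while production of Z stops in the same time step in which X switches off, because the AND condition fails immediately.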
Shai S. Shen-Orr1, Ron Milo2, Shmoolik Mangan1 & Uri Alon1,2
Applications of
Random Networks
Published online: 22 April 2002, DOI: 10.1038/ng881
Analysis of real
networks
Network motifs
letter
a
feedforward loop
X
Y
Z
n
crp
araC
araBAD
c
single input module (SIM)
© 2002 Nature Publishing Group http://genetics.nature.com
b
X
Y
How to build revisited
Motifs
Little is known about the design principles1–10 of transcriptional regulation networks that control gene expression References
in
2,11,12
2 Dynamic
, features of the coherent
acells. Recent advances in data collection and analysisFig.
motifs. a, Consider a coherent feedforwar
however, are generating unprecedented amounts of informagate’–like control of the output operon Z
variations in the activity of the input X, an
tion about gene regulation networks. To understandactivation
theseprofiles. This is because Y nee
oversuch
time to pass the activation threshold
complex wiring diagrams1–10,13, we sought to break down
rejection of rapid fluctuations can be achi
networks into basic building blocks2. We generalize the however,
notionthe cascade has a slower shut-d
loop (thin red line in the Z dynamics pan
of motifs, widely used for sequence analysis, to the level
of
motif. This motif can show a temporal pro
ing to a hierarchy of activation threshol
networks. We define ‘network motifs’ as patterns of interconactivity of X, the master activator, rises an
nections that recur in many different parts of a networkwith
at the
fre-lowest threshold are activated e
est. Time is in units of protein lifetimes, o
quencies much higher than those found in randomized
long-lived proteins.
networks. We applied new algorithms for systematically
detecting network motifs to one of the best-characterized reg3, usually involving !
long cascades
ulation networks, that of direct transcriptional interactions
in
cades
of depth 5 in the flagella and
Escherichia coli3,6. We find that much of the network is
com70% of the operons are connected t
the operons are in small disjoint
bposed of repeated appearances of three highly significant
motifs. Each network motif has a specific function in determinsystems have only 1 to 3 operons.
systems have up to 25 operons and
ing gene expression, such as generating temporal expression
feedforward loops. A notable featu
programs and governing the responses to fluctuating external
zation is the large degree of overlap
signals. The motif structure also allows an easily interpretable
the short cascades that control mos
may therefore represent the
view of the entire known transcriptional network of the DORs
organtion carried out by the transcriptio
ism. This approach may help define the basic computational
Cycles such as feedback loops ar
elements of other biological networks.
of regulatory networks. Transcrip
I
Z only turns on in response to sustained activity in X .
I
X
Turning off X rapidly
turns off Z .
I
n doors.
Analogy
Z1 Z2 to...elevator
Zn
We compiled a data set of direct transcriptional interactions
occur in various organisms, such a
X
d
argI
argF
argE
argD
argCBH
argR
"-phage
between transcription factors and the operons they regulate
(an5. In the E. coli data set, th
operon is a group of contiguous genes that are transcribedfeedback
into a loops of direct transcr
except for auto-regulatory loops3.
single
databaseof Xcontains
interacfrom
X andmRNA
a delayedmolecule).
one through Y.This
If the activation
is tran- of577
feedback
loops is not statistically signi
tions
and 424
(involving
116 transcription
randomizeditnetworks also have no fee
sient,
Y cannot
reachoperons
the level needed
to significantly
activate Z, the factors);
and
the formed
input signalon
is not
theexisting
circuit. Only
The many
regulatory feedbacks loops in th
was
thetransduced
basis ofthrough
on an
database
(Reguwhen X signals
a long enough time so that Y levels can build out at the post-transcriptional level.
lonDB)3,14.forWe
enhanced RegulonDB by an extensive
literature
We considered only transcription in
up will Z be activated (Fig. 2a). Once X is deactivated, Z shuts
search,
adding
35 new
transcription
factors,
including
alterna13factors
of 17that bi
down
rapidly.
This kind
of behavior
can be useful
for making
manifested
by transcription
decisions
based on fluctuating
signals.
This transcriptional
tive !-factors
(subunitsexternal
of RNA
polymerase that confer
recogni- network can be thoug
Shai S. Shen-Orr1, Ron Milo2, Shmoolik Mangan1 & Uri Alon1,2
Applications of
Random Networks
Published online: 22 April 2002, DOI: 10.1038/ng881
Analysis of real
networks
Network motifs
letter
a
feedforward loop
X
Y
Z
n
crp
araC
araBAD
c
single input module (SIM)
© 2002 Nature Publishing Group http://genetics.nature.com
b
X
Y
How to build revisited
Motifs
Little is known about the design principles1–10 of transcriptional regulation networks that control gene expression in cells. Recent advances in data collection and analysis2,11,12, however, are generating unprecedented amounts of information about gene regulation networks. To understand these complex wiring diagrams1–10,13, we sought to break down such networks into basic building blocks2. We generalize the notion of motifs, widely used for sequence analysis, to the level of networks. We define ‘network motifs’ as patterns of interconnections that recur in many different parts of a network at frequencies much higher than those found in randomized networks. We applied new algorithms for systematically detecting network motifs to one of the best-characterized regulation networks, that of direct transcriptional interactions in Escherichia coli3,6. We find that much of the network is composed of repeated appearances of three highly significant motifs. Each network motif has a specific function in determining gene expression, such as generating temporal expression programs and governing the responses to fluctuating external signals. The motif structure also allows an easily interpretable view of the entire known transcriptional network of the organism. This approach may help define the basic computational elements of other biological networks.
Fig. 2 Dynamic features of the coherent ... motifs. a, Consider a coherent feedforwar... gate’–like control of the output operon Z ... variations in the activity of the input X, an... activation profiles. This is because Y nee... over time to pass the activation threshold ... rejection of rapid fluctuations can be achi... however, the cascade has a slower shut-d... loop (thin red line in the Z dynamics pan... motif. This motif can show a temporal pro... ing to a hierarchy of activation threshol... activity of X, the master activator, rises an... with the lowest threshold are activated e... est. Time is in units of protein lifetimes, o... long-lived proteins.
... long cascades3, usually involving ... cades of depth 5 in the flagella and ... 70% of the operons are connected t... the operons are in small disjoint ... systems have only 1 to 3 operons. ... systems have up to 25 operons and ... feedforward loops. A notable featu... zation is the large degree of overlap ... the short cascades that control mos... DORs may therefore represent the ... tion carried out by the transcriptio... Cycles such as feedback loops ar... of regulatory networks. Transcrip...
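The definition of a network motif quoted above (a pattern that recurs far more often than in randomized networks) can be sketched in a few lines. The code below is only an illustration under stated assumptions: it counts feedforward loops in a directed edge list and compares the count with an ensemble produced by degree-preserving edge swaps. The function names, the toy edge list, and the parameters `swaps_per_edge` and `n_random` are hypothetical, and the swap procedure is a generic one, not necessarily the algorithm used in the paper. The input is assumed to have no self-loops or duplicate edges.

```python
import random
from statistics import mean, pstdev


def count_ffl(edges):
    """Count feedforward loops: X->Y, Y->Z and X->Z with X, Y, Z distinct."""
    succ = {}
    for u, v in edges:
        if u != v:  # ignore auto-regulation
            succ.setdefault(u, set()).add(v)
    total = 0
    for x, ys in succ.items():
        for y in ys:
            for z in succ.get(y, ()):
                if z != x and z in ys:
                    total += 1
    return total


def randomize(edges, swaps_per_edge=10, rng=random):
    """Shuffle edges by swapping targets; every in- and out-degree is kept."""
    edges = list(edges)
    present = set(edges)
    for _ in range(swaps_per_edge * len(edges)):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        # Proposed swap: a->b, c->d becomes a->d, c->b.
        if i == j or a == d or c == b or (a, d) in present or (c, b) in present:
            continue  # would create a self-loop or a duplicate edge
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def ffl_zscore(edges, n_random=200, seed=0):
    """Return (observed count, mean count in the null ensemble, z-score)."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    null = [count_ffl(randomize(edges, rng=rng)) for _ in range(n_random)]
    mu, sigma = mean(null), pstdev(null)
    z = (observed - mu) / sigma if sigma > 0 else float("nan")
    return observed, mu, z


# Toy network (made up for illustration, not the E. coli data set).
TOY_EDGES = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W"),
             ("W", "V"), ("X", "V"), ("U", "Y"), ("U", "W")]

print(ffl_zscore(TOY_EDGES))
```

On a network this small the z-score means little; the point is the shape of the comparison: one observed count against a distribution of counts from networks with the same degree sequence.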
I
Z only turns on in response to sustained activity in X.
I
Turning off X rapidly turns off Z.
I
Analogy to elevator doors.
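The behavior described in these bullets can be reproduced with a very small simulation. The sketch below assumes a coherent feedforward loop with AND-gate logic, unit production and decay rates (so time is in units of protein lifetime), a binary input X, and a hypothetical activation threshold `k_y` for Y. These modelling choices and all names are assumptions made for illustration, not the model from the paper.

```python
def pulse(duration, total=10.0, dt=0.01):
    """Input X: 1.0 for `duration` time units, then 0.0 until `total`."""
    n, on = round(total / dt), round(duration / dt)
    return [1.0 if i < on else 0.0 for i in range(n)]


def simulate_ffl(x_signal, dt=0.01, k_y=0.5):
    """Euler integration of a coherent feedforward loop with an AND gate.

    dY/dt = X - Y
    dZ/dt = [X is on and Y > k_y] - Z
    """
    y = z = 0.0
    ys, zs = [], []
    for x in x_signal:
        # Z is produced only while X is on AND Y has passed its threshold.
        z_input = 1.0 if (x > 0.5 and y > k_y) else 0.0
        y += dt * (x - y)
        z += dt * (z_input - z)
        ys.append(y)
        zs.append(z)
    return ys, zs


_, z_short = simulate_ffl(pulse(0.3))  # brief input: Y never reaches k_y
_, z_long = simulate_ffl(pulse(3.0))   # sustained input: Z switches on late

print("max Z after short pulse:", max(z_short))
print("max Z after long pulse: ", round(max(z_long), 3))
```

With these assumptions a short pulse of X leaves Z at zero, while a long pulse turns Z on only after Y has accumulated. Production of Z stops in the same step in which X turns off, which is the elevator-door asymmetry: slow to respond to a new signal, quick to respond to its removal.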
Network motifs
[Figure: c, single input module (SIM): X regulates Z1, Z2, ..., Zn; d, example with argR, argI, argF, argE, argD, argCBH; e, dense overlapping regulons (DOR).]
I
Master switch.
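One way to make the "master switch" picture concrete is to search an edge list for single input modules: groups of operons whose only regulator is one shared transcription factor. The sketch below is a simplified reading of that definition. It ignores the sign of each regulation because the edges here are unsigned, and the function name, the `min_size` cutoff, and the wiring of the example edges are assumptions for illustration (only the gene labels come from the figure).

```python
from collections import defaultdict


def find_sims(edges, min_size=3):
    """Group operons that are regulated by exactly one transcription factor.

    edges: iterable of (transcription_factor, operon) pairs.
    Returns {transcription_factor: sorted list of its exclusive operons}.
    """
    regulators = defaultdict(set)
    for tf, operon in edges:
        if tf != operon:  # auto-regulation of X itself is allowed in a SIM
            regulators[operon].add(tf)
    sims = defaultdict(set)
    for operon, tfs in regulators.items():
        if len(tfs) == 1:
            (tf,) = tfs
            sims[tf].add(operon)
    return {tf: sorted(ops) for tf, ops in sims.items() if len(ops) >= min_size}


# Illustrative wiring only: gene labels taken from the figure, edges assumed.
EDGES = [
    ("argR", "argR"), ("argR", "argI"), ("argR", "argF"),
    ("argR", "argE"), ("argR", "argD"), ("argR", "argCBH"),
    ("crp", "araC"), ("crp", "araBAD"), ("araC", "araBAD"),
]

print(find_sims(EDGES))
```

In this toy input argR comes out as the single regulator of five operons, while araBAD is excluded because it has two regulators.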
Network motifs
[Figure: d, SIM example with argR, argI, argF, argE, argD, argCBH; e, dense overlapping regulons (DOR): X1, X2, X3, ..., Xn and Z1, Z2, Z3, Z4, ..., Zm; f, DOR example with fis, crp, proP, nhaA, nhaR, rcsA, hns, ftsQAZ, lrp, osmC, katG, ihf, oxyR, dps, ada, alkA, rpoS.]
The ... established interactions in which a transcrip... binds a regulatory site.
The transcriptional network can be repre... graph, in which each node represents an ope... sent direct transcriptional interactions. Ea...
Fig. 1 Network motifs found in the E. coli transcriptio... Symbols representing the motifs are also shown. a, Fe... scription factor X regulates a second transcription fa... regulate one or more operons Z1...Zn. b, Example of a ... binose utilization). c, SIM motif: a single transcription ... of operons Z1...Zn. X is usually autoregulatory. All reg... sign. No other transcription factor regulates the oper... system (arginine biosynthesis). e, DOR motif: a set of ... regulated by a combination of a set of input trans... DORs are defined by an algorithm that detects dense ... with a high ratio of connections to transcription facto... (stationary phase response).
1Department of Molecular Cell Biology, 2Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, I... should be addressed to U.A. (e-mail: [email protected]).
Network motifs
I
Note: selection of motifs to test is reasonable but
nevertheless ad-hoc.
I
For more, see work carried out by Wiggins et al. at
Columbia.
References
[1] R. Milo, N. Kashtan, S. Itzkovitz, M. E. J. Newman, and U. Alon.
On the uniform generation of random graphs with prescribed degree sequences, 2003.
[2] S. S. Shen-Orr, R. Milo, S. Mangan, and U. Alon.
Network motifs in the transcriptional regulation network of Escherichia coli.
Nature Genetics, pages 64–68, 2002.