Motif Mining from Gene Regulatory Networks
Based on the publications of Uri Alon’s group
…presented by Pavlos Pavlidis
Tartu University, December 2005
Gene Regulatory Networks
• From Wikipedia
A gene regulatory network is a collection of DNA
segments in a cell which interact with each other and
with other substances in the cell, thereby governing the
rates at which genes in the network are transcribed into
mRNA
• From DOE
Gene regulatory networks (GRNs) are the on-off
switches and rheostats…dynamically orchestrate the
level of expression for each gene….
Why can networks regulate gene
expression?
• U. Alon and his group stress the
importance of the building blocks of the
network.
• These building blocks are called motifs.
Motifs
• They are also called n-node subgraphs
in a directed graph
(The work has also been extended to
undirected graphs)
• They are characterized by the number n
of nodes and the relations between
them (the directed edges)
The 13 different 3-node subgraphs
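The figure of 13 can be checked by brute force. The sketch below assumes that "subgraph" here means a connected pattern without self-loops: it enumerates all 64 ways of placing directed edges between three nodes, keeps the connected ones, and merges patterns that differ only by relabelling the nodes. The function names are illustrative and not taken from the original work.

```python
from itertools import permutations, product

NODES = (0, 1, 2)
# The six possible directed edges between three nodes (no self-loops).
POSSIBLE_EDGES = [(a, b) for a in NODES for b in NODES if a != b]


def is_weakly_connected(edges):
    """True if all three nodes are joined when edge direction is ignored."""
    reached = {0}
    changed = True
    while changed:
        changed = False
        for a, b in edges:
            if (a in reached) != (b in reached):
                reached |= {a, b}
                changed = True
    return len(reached) == 3


def canonical_form(edges):
    """Smallest relabelled edge list over all 6 node permutations."""
    return min(
        tuple(sorted((perm[a], perm[b]) for a, b in edges))
        for perm in permutations(NODES)
    )


def three_node_subgraph_classes():
    """Set of distinct connected 3-node patterns, up to relabelling."""
    classes = set()
    for mask in product([0, 1], repeat=len(POSSIBLE_EDGES)):
        edges = [e for e, keep in zip(POSSIBLE_EDGES, mask) if keep]
        if is_weakly_connected(edges):
            classes.add(canonical_form(edges))
    return classes


print(len(three_node_subgraph_classes()))  # 13
```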
Feed Forward Loop
It rapidly regulates the
production of Z
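As a rough illustration, a feed-forward loop can be counted in an edge list as follows. The sketch assumes the usual labelling in which X regulates Y, and both X and Y regulate Z (only Z is named on the slide; X and Y are assumed labels). It counts every node triple that contains these three edges, whether or not extra edges are present, which may differ from how a dedicated motif tool counts.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count ordered triples (x, y, z) with edges x->y, y->z and x->z."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    count = 0
    for x, y, z in permutations(nodes, 3):
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set:
            count += 1
    return count


# A single feed-forward loop: X -> Y, Y -> Z, X -> Z.
ffl = [("X", "Y"), ("Y", "Z"), ("X", "Z")]
# A three-node cycle contains no feed-forward loop.
cycle = [("X", "Y"), ("Y", "Z"), ("Z", "X")]
print(count_feed_forward_loops(ffl), count_feed_forward_loops(cycle))
```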
Which motifs are they interested in?
• Not necessarily the biologically significant ones
– They don’t know a priori if a motif is
biologically significant
• The statistically significant ones, which they can calculate
– The probability P that a randomized
network contains the same number or more
instances of a particular motif must be smaller
than a cutoff. Here the cutoff is 0.01.
Randomized Network
• A randomized network is not completely
random.
It preserves some properties of the real network:
• The same number of nodes as in the real
network
• For each node, the numbers of
incoming and outgoing edges equal those in
the real network.
[Figure: matrix M with operons 1 to 18 as rows and operons 1, 2, 3, … as columns; entries are 0 or 1]
Mij = 1 if operon j produces a TF
which regulates operon i
Example: a 1 in row 11, column 2 means operon 2 regulates
operon 11
Representation of the network as a matrix M
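A minimal sketch of this representation, assuming operons are numbered from 1 and the regulatory interactions are given as (regulator, target) pairs. The function name and input format are hypothetical, chosen only for illustration.

```python
def build_regulation_matrix(n_operons, regulations):
    """Return M where M[i-1][j-1] == 1 if operon j regulates operon i."""
    m = [[0] * n_operons for _ in range(n_operons)]
    for regulator, target in regulations:
        # Row index = regulated operon i, column index = regulating operon j.
        m[target - 1][regulator - 1] = 1
    return m


# The slide's example: operon 2 regulates operon 11.
M = build_regulation_matrix(18, [(2, 11)])
print(M[10][1])  # 1 (row 11, column 2 in 1-based numbering)
```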
Randomization: Randomly select two cells whose value is 1, e.g. A(1,3) and B(2,1).
If A’(1,1) and B’(2,3) are both 0, then swap: set A and B to 0 and A’ and B’ to 1.
Goal: The randomized network must have the same column sums
and row sums as the real network
Columns: The number of outgoing edges
Rows: The number of incoming edges
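The swap rule above can be sketched as follows. This is an illustrative version under stated assumptions, not the authors' implementation: the number of swap attempts and the seed are free parameters chosen here, and no extra constraints (such as forbidding self-regulation or preserving smaller subgraph counts) are enforced.

```python
import random


def randomize_network(matrix, n_swaps, seed=None):
    """Shuffle edges by pairwise swaps, keeping every row and column sum."""
    rng = random.Random(seed)
    m = [row[:] for row in matrix]  # work on a copy
    ones = [(i, j) for i, row in enumerate(m) for j, v in enumerate(row) if v == 1]
    if len(ones) < 2:
        return m
    for _ in range(n_swaps):
        a, b = rng.sample(range(len(ones)), 2)
        (i1, j1), (i2, j2) = ones[a], ones[b]
        # Swap only if both target cells are empty; this also rejects
        # pairs that share a row or a column.
        if m[i1][j2] == 0 and m[i2][j1] == 0:
            m[i1][j1] = m[i2][j2] = 0
            m[i1][j2] = m[i2][j1] = 1
            ones[a], ones[b] = (i1, j2), (i2, j1)
    return m


real = [
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
rand = randomize_network(real, n_swaps=100, seed=1)
print([sum(r) for r in rand] == [sum(r) for r in real])          # row sums kept
print([sum(c) for c in zip(*rand)] == [sum(c) for c in zip(*real)])  # column sums kept
```

Because each accepted swap moves one 1 within its row and one within its column, the in-degree and out-degree of every node stay unchanged.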
One more requirement:
If we are looking for n-node subgraphs, then the number of n-1 node
subgraphs must be the same in real and randomized networks
This is done to avoid assigning high significance to a structure only
because it includes a highly significant substructure.
Significance of a motif
• Three requirements
– P < 0.01
P was estimated (or bounded) by using 1000
randomized networks.
– The number of times it appears in the real network
with distinct sets of nodes is at least U = 4.
– The number of appearances in the real network is
significantly larger than in the randomized networks:
Nreal – Nrand > 0.1Nrand (Why??).
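The three requirements can be combined into one check, sketched below. It assumes that P is estimated as the fraction of randomized networks with at least as many appearances as the real network, and that Nrand is the mean count over the randomized networks; both are assumptions of this sketch, as are the function and parameter names.

```python
def is_significant_motif(n_real, n_distinct, rand_counts,
                         p_max=0.01, min_distinct=4, min_excess=0.1):
    """Apply the three significance requirements to one candidate motif.

    n_real      -- appearances in the real network
    n_distinct  -- appearances with distinct sets of nodes in the real network
    rand_counts -- appearances in each randomized network (e.g. 1000 values)
    """
    n_networks = len(rand_counts)
    # Requirement 1: estimated probability of seeing >= n_real by chance.
    p_est = sum(1 for c in rand_counts if c >= n_real) / n_networks
    # Assumed here: Nrand is the mean over the randomized ensemble.
    n_rand = sum(rand_counts) / n_networks
    return (
        p_est < p_max
        and n_distinct >= min_distinct
        and (n_real - n_rand) > min_excess * n_rand
    )


# Hypothetical counts, for illustration only.
print(is_significant_motif(40, 10, [7] * 1000))   # True
print(is_significant_motif(40, 10, [40] * 1000))  # False: P is not below 0.01
```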
What did they find?
• In biological systems such as E. coli or
S. cerevisiae, only certain types of
motifs are statistically significant.
• When they studied other systems, such as:
Food webs: a database of seven ecosystem food webs
Neuronal networks: the neural system of C. elegans
WWW
other kinds of motifs were statistically significant.
FFL (feed-forward loop)
SIM (single-input module)
DOR (dense overlapping regulons)
FFL
• Biological Example
– the L-arabinose utilization system:
– Crp is the general transcription factor and
AraC the specific transcription factor.
The real model
FFL
• Coherent
• Incoherent
• Important for the speed of response
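One common way to tell the two kinds apart, assumed here since the slide does not spell it out, is to compare signs: each edge is an activation (+1) or a repression (-1), and the loop is coherent when the sign of the direct edge from X to Z matches the product of the signs along the indirect path through Y. The node labels X, Y, Z and the function name are illustrative.

```python
def classify_ffl(sign_xy, sign_yz, sign_xz):
    """Classify a feed-forward loop from its edge signs (+1 or -1)."""
    for sign in (sign_xy, sign_yz, sign_xz):
        if sign not in (1, -1):
            raise ValueError("edge signs must be +1 (activation) or -1 (repression)")
    indirect = sign_xy * sign_yz  # net sign of the path X -> Y -> Z
    return "coherent" if indirect == sign_xz else "incoherent"


print(classify_ffl(1, 1, 1))   # all activations: coherent
print(classify_ffl(1, -1, 1))  # Y represses Z while X activates it: incoherent
```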
Software
mDraw
Network visualization tool
(mfinder and network motifs visualization tool
embedded)