Complexity in life, multicellular organisms and microRNAs
Ohad Manor
Abstract
In this work I would like to discuss the question of defining complexity, and to focus
specifically on the question of defining complexity in life. Then I would like to use
the suggested definition to further investigate the evolutionary transition from
unicellular organisms to multicellular ones, and to suggest a hypothesis linking this
matter to a biological phenomenon known as microRNAs.
Complexity and life
The issue of defining complexity is a well-studied area, and I will mainly use ideas
from Bennett [1] and Perakh [2] for this matter.
The idea of logical depth as a measure of complexity is very appealing to me. The
beautiful example of the crystal, the stone and the living organism demonstrates
this notion. Consider the crystal: it is perfectly ordered. In fact, it is a single subunit
repeated many times over, and it has minimal entropy in that sense. A short
algorithm to build the crystal would consist of a plan to build the crystal's subunit and
an iterating step that puts this subunit into place. The stone, on the other hand, is as
far from ordered as can be. Although it consists of some chains of atoms, there is no real
order in it, and its surface has countless scratches, bumps etc. In fact, the
shortest algorithm that can tell you how to build this stone is the full description of the
stone itself. You can say, for that matter, that the stone is in a high-entropy state.
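The contrast in description length between the crystal and the stone can be illustrated with a rough sketch. It assumes that the size of a zlib-compressed byte string is an acceptable stand-in for "the shortest algorithm" (it is only a crude upper bound), and the byte strings `crystal` and `stone` are hypothetical toy objects, not physical models.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Size in bytes after zlib compression: a crude upper bound on description length."""
    return len(zlib.compress(data, 9))

n = 10_000
# Toy "crystal": one four-byte subunit repeated until the object has n bytes.
crystal = b"NaCl" * (n // 4)
# Toy "stone": n bytes with no exploitable order (fixed seed so the run is repeatable).
rng = random.Random(0)
stone = bytes(rng.randrange(256) for _ in range(n))

crystal_size = compressed_size(crystal)
stone_size = compressed_size(stone)
print("crystal:", crystal_size, "bytes; stone:", stone_size, "bytes")
```

Under these assumptions the repeated subunit compresses to a small fraction of its length, while the disordered string stays roughly as long as its full description.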
The living organism has medium entropy when compared to the crystal and the stone:
it is not one subunit recurring, but it is not completely unordered either. Yet the
living organism's DNA, or genetic code, is in some sense the shortest algorithm that
describes how to build this organism. True, one can claim that it is not enough to
know an organism's genetic code in order to build it, since we are lacking things such
as the maternal signals or the environmental signals. To this I answer that if we think
of a very simple organism, e.g. a bacterium, then we know that the fate of the cell is
completely determined by its genetic code and the signals coming from its
environment. So in a sense, the DNA and maybe a few other simple details are the
shortest algorithm to describe the living organism.
The key point is that on the entropy scale, the organism is somewhere between the
crystal and the rock. Logical depth can be thought of as the
computational effort needed by a computer to use the given algorithm in order to
produce the object. In the cases of the crystal and the stone, this effort is very
small. Yet in the case of the organism, even the strongest computers today cannot
predict exactly what the organism will look like just by looking at its genetic code.
In that sense, the living organism is complex and the other two are not. This is
illustrated in figure 1.
[Figure 1: plot of complexity / logical depth (vertical axis) against entropy (horizontal axis), with points for the crystal, the live organism and the rock.]
Fig 1. The crystal and the rock have very different entropy values but similar logical depth, as opposed to
the live organism, which has a medium entropy value but a high complexity value.
This is a very nice notion (in my humble opinion) of complexity, but I would like to
focus on the top point in this graph, i.e. the living organism. How can we further
extend the notion of complexity in order to separate living organisms from one
another?
I would like to offer the notion of regulation depth as a measure of complexity in
life. That is, to say that an organism is more complex than another if the regulation
network which stands at the core of its life is deeper.
In order to use this notion we have to define its aspects. First, what is this
"regulation network" we are speaking about? Second, how do we measure the
"depth" of this regulation network? Regarding the first question, I will choose to
define the regulation network of a living organism as the regulation network of gene
expression in its cells. In this network, every gene is represented by a single node, and
a directed edge is placed from node A to node B if gene A regulates gene B. This is
obviously not perfect, since we are missing many regulatory pathways in the cell
itself, e.g. protein-protein interaction, translational regulation, signaling pathways etc.,
not to mention the regulation in the organism as a whole, e.g. hormones and intercellular interactions. Yet I feel that even defining the regulation network of an
organism as the regulation network of its gene expression carries interesting
implications for the question of complexity in life.
The second question we need to address is how to define the depth or complexity of a
regulatory network. Here I suggest using an ordered pairwise distance measure to
measure and compare regulatory network complexity. The main idea is to go over
all ordered pairs of nodes in the graph (where each node represents a gene), and calculate the
distribution over the number of steps needed to pass from the first node to the second, where a
legal step is passing over a directed edge between two nodes. If there is no such path
between two nodes then the distance is defined to be infinite. This will result in a two-bin distribution, one bin of all the pairs with a finite number of steps, and another bin of
all the pairs with infinite distance, as shown in figure 2.
Figure 2. The blue network has one ordered pair of distance one and one of distance infinity, and the
red network has two ordered pairs of distance one. These values are recorded in the graph below using
the distribution colors.
In order to get a distribution, the counts should then be normalized by the
number of ordered pairs in the network. After we have two such distributions for two
organisms, we then have to decide how to compare them. We can consider the
statistics of the finite bin of the distribution, such as its mean and variance, but we
must also consider the distribution as a whole, i.e. the fraction of pairs that have
infinite distance compared to the fraction of pairs of finite distance. For example, we
can consider two cases. In the first, organism A has a mean and variance for the
finite bin which are smaller than the mean and variance of organism B in the finite bin,
and in addition, organism A has a bigger fraction of pairs in the finite bin than does
organism B. In this case, we would certainly want to say that organism A is more
complex. The networks in figure 2 illustrate the role of the finite fraction: the two networks have
the same values for the finite bin (i.e. a mean of one with variance zero), yet the
blue network has only half of its pairs in the finite bin whereas the red network has all
its pairs in the finite bin. Therefore, the regulation in the red network is more
complex.
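The ordered pairwise distance measure can be sketched in a few lines. This is a minimal illustration rather than a definitive implementation: it assumes that "the number of steps" means the shortest directed path (found here by breadth-first search), and the node names `A` and `B` and the function names are hypothetical. The two networks are the two-node networks described in the figure 2 caption.

```python
from collections import deque
from math import inf

def ordered_pair_distances(nodes, edges):
    """Map each ordered pair (a, b), a != b, to the shortest number of
    directed steps from a to b, or inf if b cannot be reached from a."""
    successors = {n: [] for n in nodes}
    for regulator, target in edges:
        successors[regulator].append(target)

    distances = {}
    for source in nodes:
        # Breadth-first search yields shortest step counts from `source`.
        steps = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in successors[current]:
                if nxt not in steps:
                    steps[nxt] = steps[current] + 1
                    queue.append(nxt)
        for target in nodes:
            if target != source:
                distances[(source, target)] = steps.get(target, inf)
    return distances

def finite_fraction(distances):
    """Fraction of ordered pairs that fall in the finite bin."""
    finite = [d for d in distances.values() if d != inf]
    return len(finite) / len(distances)

# Figure 2: blue has one regulatory edge, red has edges in both directions.
blue = ordered_pair_distances(["A", "B"], [("A", "B")])
red = ordered_pair_distances(["A", "B"], [("A", "B"), ("B", "A")])
print(finite_fraction(blue), finite_fraction(red))
```

Both networks have only distance-one pairs in the finite bin, so the comparison is decided by the finite fraction, which is one half for blue and one for red.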
The second case is when organism A has a "worse" distribution (i.e. bigger mean and
variance) than does B in the finite bin, but organism B has a bigger fraction of
pairs in the infinite bin than does A. In this case, it is not obvious which of the
organisms is more complex, and the choice can actually be made either way. Yet I feel that in
this case we should define the organism with the bigger finite fraction as the more
complex, even if it has a bigger finite mean value than the other organism. The
reason is that more of its components are connected by regulation,
perhaps with a longer route between them, but still more regulation occurs and the network
should be considered more complex. This case is illustrated by two simple networks in figure 3.
Figure 3. The blue network has two ordered pairs of distance one, one ordered pair of distance two and
three of infinite distance. The red network has two ordered pairs of distance one and four pairs of
infinite distance. These values are recorded in the graph below using the distribution colors. The red
network has a mean of one and variance zero in the finite bin, while the blue network has a mean of
4/3 and (population) variance 2/9, so the red network is superior in the finite bin. However, the red network has 4/6 of its
pairs in the infinite bin and the blue network has only 3/6 of its pairs in the infinite bin. Therefore, the
blue network has more complexity in terms of regulation.
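The figure 3 numbers can be checked with a short sketch. Several things here are assumptions: the three-node topologies are merely ones consistent with the caption (a chain A to B to C for blue, and A regulating both B and C for red), distances are shortest directed paths, the variance is the population variance, and ties in the finite fraction are left undecided because the text does not settle them. All function names are hypothetical.

```python
from collections import deque
from fractions import Fraction
from math import inf

def ordered_pair_distances(nodes, edges):
    """Shortest directed step count for every ordered pair (inf if unreachable)."""
    successors = {n: [] for n in nodes}
    for regulator, target in edges:
        successors[regulator].append(target)
    distances = {}
    for source in nodes:
        steps = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in successors[current]:
                if nxt not in steps:
                    steps[nxt] = steps[current] + 1
                    queue.append(nxt)
        for target in nodes:
            if target != source:
                distances[(source, target)] = steps.get(target, inf)
    return distances

def summarize(distances):
    """Mean and population variance of the finite bin, plus the finite fraction."""
    finite = [d for d in distances.values() if d != inf]
    mean = Fraction(sum(finite), len(finite))
    variance = sum((d - mean) ** 2 for d in finite) / len(finite)
    return {"mean": mean, "variance": variance,
            "finite_fraction": Fraction(len(finite), len(distances))}

def more_complex(name_a, stats_a, name_b, stats_b):
    """Rule suggested in the text: the bigger finite fraction wins (ties undecided)."""
    if stats_a["finite_fraction"] == stats_b["finite_fraction"]:
        return None
    if stats_a["finite_fraction"] > stats_b["finite_fraction"]:
        return name_a
    return name_b

genes = ["A", "B", "C"]
blue = summarize(ordered_pair_distances(genes, [("A", "B"), ("B", "C")]))
red = summarize(ordered_pair_distances(genes, [("A", "B"), ("A", "C")]))
print(blue, red, more_complex("blue", blue, "red", red))
```

With these assumptions the blue finite bin holds the distances 1, 1 and 2, giving a mean of 4/3 and a population variance of 2/9, and the rule picks blue because half of its ordered pairs are finite against one third for red.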
Other measures that were not considered here are the average of in and out
degrees for nodes in the graph, and the distribution of clique sizes in the network. I feel
that both of them can definitely add to our analysis of the regulation networks in
terms of complexity, but that they don't have the same power as the suggested
measure. In short, the first of the two mentioned measures (i.e. average in and out
degrees) lacks the ability to identify pathways, which are a vital element in any
regulatory network. To see this, look at the networks in figure 3 and notice that both of
them have exactly the same value for average in and out degrees, i.e. 2/3. The second
measure (i.e. distribution of clique sizes) would lack the important notion of
directionality if we removed the edge directions, and would need extensive feedback
loops to even generate a clique if we kept the edges directed. Therefore, for the
stated purpose, I feel that the chosen measure is a good representative of the
regulation complexity of a network.
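A small sketch makes the point about average degrees concrete. The two three-node topologies are assumptions chosen to be consistent with the figure 3 caption (a chain for blue, one regulator with two targets for red), and `average_degrees` is a hypothetical helper name.

```python
from fractions import Fraction

def average_degrees(nodes, edges):
    """Return (average in-degree, average out-degree) of a directed graph."""
    in_degree = {n: 0 for n in nodes}
    out_degree = {n: 0 for n in nodes}
    for regulator, target in edges:
        out_degree[regulator] += 1
        in_degree[target] += 1
    count = len(nodes)
    return (Fraction(sum(in_degree.values()), count),
            Fraction(sum(out_degree.values()), count))

genes = ["A", "B", "C"]
blue_degrees = average_degrees(genes, [("A", "B"), ("B", "C")])  # chain A -> B -> C
red_degrees = average_degrees(genes, [("A", "B"), ("A", "C")])   # A regulates B and C
print(blue_degrees, red_degrees)
```

Both networks have two edges over three nodes, so both averages are 2/3 in each case, even though only the chain contains a two-step pathway.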
Using this measure, we are able to say that the regulatory network of one
organism is more complex than that of another, and thus rank the organisms on a
complexity scale. True, we don't have the full regulatory network of any organism
yet, but the various analyses that try to determine the regulatory networks of organisms
make it reasonable to speculate that the suggested measure would yield
a sensible ranking of the organisms. For example, we would expect the human to be
more complex than the mouse or the worm, although humans have around 20,000-25,000 genes, as do mice and worms. So just by the number of genes we can't say
humans are more complex than mice or worms, but this is where the notion of
regulation depth steps in and should help us determine the complexity level of an
organism.
The origin of multicellular organisms
Having discussed the issue of complexity in life, I hope that I have made it plausible that
there is a connection between the depth of the regulatory network and the complexity
of the organism. Now, I would like to link this subject to a more specific biological
question, the origin of multicellular organisms.
The question of the origin of multicellular organisms has been studied by various
researchers [5] who focus on different aspects of the problem. I myself would like to
suggest here a hypothesis which is related to the subject we have discussed above, i.e.
the connection between regulation and complexity. First we have to agree that
multicellular organisms are more complex than unicellular organisms. This in itself
is not an obvious fact, yet I feel that for the matter addressed, it is
a reasonable assumption.
Before I suggest the hypothesis, I need to say a few words about the nature of
microRNAs, or miRNAs as I will refer to them. miRNAs are short molecules, usually of
length 22-25 nucleotides. They are transcribed as pri-miRNA from the genome, then
they are processed, or cut, into a pre-miRNA, and finally they undergo another
processing step to give the final mature miRNA of length 22-25. They then bind to specific
mRNAs (at their 3'UTR) and inhibit their translation, either by interfering with
ribosome binding or by promoting cleavage of the targeted mRNA. In short, one can
say that they are very simple "genes" (as they don't give rise to protein as do normal
genes), which have the sole purpose of regulating the abundance of other genes or
proteins.
I hereby make the following two observations. One is that microRNAs, the
short noncoding regulatory RNAs, do not appear in any unicellular organism, neither
prokaryote nor eukaryote (for example yeast). Rather, microRNAs appear only in
multicellular organisms.
The second observation is that one of the main differences between unicellular
organisms and multicellular ones is their cells' ability to differentiate. If we take a
single cell from the multicellular organism and compare it to the cell of the unicellular
organism, we will see that they both excel at what they do. Namely, the unicellular
organism has mutated and evolved, not into a perfect being, but rather
into a very efficient one in preserving life. Now, when the unicellular organisms joined
to create a multicellular organism, they had already nearly perfected the efficiency of
energy production and consumption, and generally the ability to produce and perpetuate
life, using their genes, or mRNAs.
What this work suggests is that at the transition stage from unicellularity to
multicellularity, it was highly problematic for the unicellular organisms to change
their genes or their mRNA signature altogether in order to differentiate and gain an
advantage from their agglomeration. So rather than doing that, they evolved miRNAs:
tiny regulators, relatively easy to evolve and mutate due to their small size
compared to real genes, which give each cell the ability to differentiate and to
change its mRNA expression, and thus its life form, by an easy combinatorial
activation of miRNAs. In other words, I suggest that miRNAs played a part when
evolution took a leap forward from unicellular to multicellular organisms.
The link to the notion of complexity discussed above is made when we think of the
fact that if we want to move from unicellular organisms to multicellular ones, we need
to increase complexity. Using the notion above, we could say that we want to add
depth to our regulatory network. Adding such depth could be done in several ways.
One way would be to link a regulator to a gene that wasn't regulated by it before. Other
ways would be to add a new regulator to the system and link genes to it, or to turn one of
the genes into a regulator in addition to its existing function. The last two options are not
simple at all, since they consist of adding a new function to a gene or adding a new
gene from scratch. Therefore it is plausible to assume that they would take a long
evolutionary time, time which is not available in this transition, since the
agglomeration of unicellular organisms needs to generate an advantage relatively fast.
The first option is reasonable: all we need to do is to add a control sequence to the
gene we wish to regulate, and a few simple mutations would do for that matter. However,
by doing this we will only add one regulated gene to our regulator, which will not
dramatically change the complexity of our regulation network.
A different approach can be taken, using miRNAs. For example, we can think of one
gene which is a regulator and has some targets. Now, in addition to translating its
coding region to get a protein, we also create a miRNA from its intron (simple
organisms usually lack introns, but the idea remains similar). This miRNA can
target many genes because of its generic structure, and thus, by a few mutations, we
can add many target genes to this regulator and increase our regulatory complexity.
Another way, which is similar to adding a new regulator, is creating a new miRNA
that regulates genes. This new miRNA can target many genes and increase the
regulation complexity, and will cost a lot less than creating a brand new regulatory
gene.
In that sense, miRNAs can be thought of as meta-regulators. Since they target the
3'UTR of a gene, which is usually not very conserved, and usually relatively long,
then any creation of a new miRNA, can immediately have an effect over many
mRNAs and can therefore contribute to the depth of the regulatory network and the
complexity of an organism in a relatively short time.
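The argument of the last few paragraphs can be sketched on a toy network. Everything in it is hypothetical: a regulator `R`, four candidate target genes `T1` to `T4`, each with one downstream gene `D1` to `D4`, and an intron-derived miRNA modeled simply as extra edges from `R` to every target. The sketch compares the fraction of ordered pairs in the finite bin for the original network, for one newly added control sequence, and for the miRNA.

```python
from collections import deque
from fractions import Fraction

def finite_fraction(nodes, edges):
    """Fraction of ordered pairs (a, b), a != b, with a directed path from a to b."""
    successors = {n: [] for n in nodes}
    for regulator, target in edges:
        successors[regulator].append(target)
    reachable_pairs = 0
    for source in nodes:
        seen = {source}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in successors[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        reachable_pairs += len(seen) - 1  # do not count the source itself
    return Fraction(reachable_pairs, len(nodes) * (len(nodes) - 1))

targets = ["T1", "T2", "T3", "T4"]
downstream = ["D1", "D2", "D3", "D4"]
nodes = ["R"] + targets + downstream
cascade = list(zip(targets, downstream))  # each target regulates one downstream gene

baseline = finite_fraction(nodes, cascade + [("R", "T1")])
one_new_site = finite_fraction(nodes, cascade + [("R", "T1"), ("R", "T2")])
with_mirna = finite_fraction(nodes, cascade + [("R", t) for t in targets])
print(baseline, one_new_site, with_mirna)  # prints: 1/12 1/9 1/6
```

In this toy setting a single new control sequence raises the finite fraction from 1/12 to 1/9, while a miRNA reaching all four targets raises it to 1/6, which is the sense in which one small regulator can add depth quickly.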
There are a few works that link miRNAs with differentiation [3, 4]. In one work [4], the
authors show that over-expressing even one miRNA can lead the cell to a gene
expression pattern which is similar to that of a cell from a different tissue. Similarly, if we
think of the very simple unicellular organisms which joined together, then expressing
even one new miRNA could perhaps dramatically change the cell's gene expression
and life cycle. This evidence suggests that even one or two miRNAs could have
a dramatic effect on the way of life of one cell, and thus could help an agglomeration of
unicellular organisms differentiate, such that some of the organisms specialize in some
aspects of life and the others in other aspects, giving the agglomeration an
advantage in the fight for survival.
Summary
In this work I have tried to connect the complexity of a living organism with the
complexity of its regulatory network using a suggested measure. I then tried to
explain why miRNAs are good candidates to generate such complexity in a
reasonably fast manner through random mutations. I thus suggest that a transition
as complex as the transition from unicellular organisms to multicellular ones, which
involves a large increase in the organism's complexity, could possibly have used miRNAs as
a method to generate such complexity.
[1] Charles H. Bennett. How to define complexity in physics, and why. IBM Research 137-148.
[2] Mark Perakh. Defining complexity. A commentary to a paper by Charles H. Bennett.
[3] Chen Y, Stallings RL. Differential patterns of microRNA expression in neuroblastoma are correlated
with prognosis, differentiation, and apoptosis. Cancer Res. 2007 Feb 1;67(3):976-83. PMID: 17283129.
[4] Lim LP et al. Microarray analysis shows that some microRNAs downregulate large numbers of target
mRNAs. Nature. 2005 Feb 17;433(7027):769-73. Epub 2005 Jan 30.
[5] Richard E. Michod. Life-history evolution and the origin of multicellularity. Journal of Theoretical
Biology 239 (2006) 257-272.