Complexity in life, multicellular organisms and microRNAs

Ohad Manor

Abstract

In this work I discuss the question of defining complexity, focusing specifically on complexity in life. I then use the suggested definition to investigate the evolutionary transition from unicellular to multicellular organisms, and suggest a hypothesis linking this transition to a biological phenomenon known as microRNAs.

Complexity and life

Defining complexity is a well-studied problem, and I will mainly use ideas from Bennet [1] and Perakh [2]. I find the idea of logical depth as a measure of complexity very appealing. The example of the crystal, the stone and the living organism demonstrates this notion well. Consider the crystal: it is perfectly ordered; in fact, it is a single subunit repeated many times, and in that sense it has minimal entropy. A short algorithm to build the crystal would consist of a plan for the crystal's subunit and an iterated step that puts this subunit into place. The stone, on the other hand, is as far from ordered as can be. Although it consists of some chain of atoms, there is no real order in it; its surface has countless scratches, bumps and so on. In fact, the shortest algorithm that tells you how to build this stone is the full description of the stone itself. In that sense the stone is in a high-entropy state.

The living organism has medium entropy compared to the crystal and the stone: it is not one recurring subunit, but it is not completely disordered either. Yet the organism's DNA, or genetic code, is in some sense the shortest algorithm that describes how to build it. True, one can claim that knowing an organism's genetic code is not enough to build it, since we are missing things such as the maternal signals or the environmental signals.
To this I answer that if we think of a very simple organism, e.g. a bacterium, then we know that the fate of the cell is completely determined by its genetic code and the signals coming from its environment. So in a sense, the DNA, and maybe a few other simple details, are the shortest algorithm that describes the living organism. The key point is that on the entropy scale, the organism is somewhere between the crystal and the rock. Logical depth can be thought of as the computational effort a computer needs in order to produce the object from the given algorithm. For the crystal and the stone this effort is small. Yet in the case of the organism, even the strongest computers today cannot predict exactly what the organism will look like just by looking at its genetic code. In that sense, the living organism is complex and the other two are not. This is illustrated in figure 1.

Fig 1. Complexity (logical depth) against entropy for the crystal, the rock and the live organism. The crystal and the rock have very different entropy values but similar logical depth, as opposed to the live organism, which has a medium entropy value but a high complexity value.

This is, in my humble opinion, a very nice notion of complexity, but I would like to focus on the top point in this graph, i.e. the living organism. How can we extend the notion of complexity in order to separate living organisms from one another? I would like to offer the notion of regulation depth as a measure of complexity in life. That is, an organism is more complex than another if the regulation network that stands at the core of its life is deeper. In order to use this notion we have to define its aspects: first, what is this "regulation network" we are speaking about, and second, how do we measure its "depth"?
Regarding the first question, I define the regulation network of a living organism as the regulation network of gene expression in its cells. In this network, every gene is represented by a single node, and a directed edge is placed from node A to node B if gene A regulates gene B. This is obviously not perfect, since it leaves out many regulatory pathways in the cell itself, e.g. protein-protein interactions, translational regulation and signaling pathways, not to mention regulation in the organism as a whole, e.g. hormones and intercellular interactions. Yet I feel that even this restricted definition carries interesting implications for the question of complexity in life.

The second question is how to define the depth, or complexity, of a regulatory network. Here I suggest using an ordered pairwise distance measure to measure and compare regulatory network complexity. The main idea is to go over all ordered pairs of nodes in the graph (where each node represents a gene) and calculate the distribution of the number of steps needed to pass from the first to the second, where a legal step is a move along a directed edge between two nodes. If there is no such path between two nodes, the distance is defined to be infinite. This results in a two-bin distribution: one bin of all the pairs with a finite number of steps, and another bin of all the pairs with infinite distance, as shown in figure 2.

Figure 2. The blue network has one ordered pair of distance one and one of distance infinity, and the red network has two ordered pairs of distance one. These values are recorded in the graph below using the distribution colors.

To obtain a distribution, the counts should be normalized by the number of ordered pairs in the network. Once we have two such distributions for two organisms, we have to decide how to compare them.
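The ordered pairwise distance distribution described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the node labels "A" and "B" are hypothetical, self-pairs are excluded (which matches the pair counts given in the Figure 2 caption), and the two small networks are reconstructed from that caption rather than taken from the figure itself.

```python
from collections import deque
from math import inf


def ordered_pair_distances(edges, nodes):
    """Return {(a, b): distance} for every ordered pair of distinct nodes."""
    adjacency = {n: [] for n in nodes}
    for regulator, target in edges:
        adjacency[regulator].append(target)

    distances = {}
    for source in nodes:
        # Breadth-first search finds the fewest directed steps from source.
        steps = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in adjacency[current]:
                if nxt not in steps:
                    steps[nxt] = steps[current] + 1
                    queue.append(nxt)
        for target in nodes:
            if target != source:
                # Unreachable targets get infinite distance.
                distances[(source, target)] = steps.get(target, inf)
    return distances


def distance_distribution(edges, nodes):
    """Fraction of ordered pairs at each distance (normalized counts)."""
    distances = ordered_pair_distances(edges, nodes)
    total = len(distances)
    counts = {}
    for d in distances.values():
        counts[d] = counts.get(d, 0) + 1
    return {d: c / total for d, c in counts.items()}


# Networks reconstructed from the Figure 2 caption (assumed topology).
blue = distance_distribution([("A", "B")], ["A", "B"])
red = distance_distribution([("A", "B"), ("B", "A")], ["A", "B"])
print(blue)  # {1: 0.5, inf: 0.5}
print(red)   # {1: 1.0}
```

For a real regulatory network with thousands of genes the same breadth-first search applies, since every edge has unit length.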
We can consider the statistics of the finite bin of the distribution, such as its mean and variance, but we must also consider the distribution as a whole, i.e. the fraction of pairs that have infinite distance compared to the fraction of pairs that have finite distance. We can consider two cases. In the first, organism A has a smaller mean and variance in the finite bin than organism B, and in addition organism A has a bigger fraction of its pairs in the finite bin than organism B does. In this case we would certainly want to say that organism A is more complex. The networks in figure 2 show a simple version of this: the two networks have the same values for the finite bin (a mean of one with variance zero), yet the blue network has only half of its pairs in the finite bin, whereas the red network has all of its pairs in the finite bin. Therefore, the regulation in the red network is more complex.

The second case is when organism A has a "worse" distribution than B in the finite bin (i.e. a bigger mean and variance), but organism B has a bigger fraction of pairs in the infinite bin than A does. In this case it is not obvious which of the organisms is more complex, and the choice could go either way. Yet I feel that in this case we should define the organism with the bigger finite fraction as the more complex one, even if it has a bigger finite mean than the other organism. The reason is that more of its components are connected by regulation, perhaps through longer routes, but more regulation still occurs and the network should therefore count as more complex. This case is illustrated by two simple networks in figure 3.

Figure 3. The blue network has two ordered pairs of distance one, one ordered pair of distance two and three of infinite distance. The red network has two ordered pairs of distance one and four pairs of infinite distance.
These values are recorded in the graph below using the distribution colors. The red network has a mean of one and variance zero in the finite bin, while the blue network has a mean of 4/3 and a variance of 2/9, so the red network is superior in the finite bin. However, the red network has 4/6 of its pairs in the infinite bin, while the blue network has only 3/6. Therefore, the blue network has more complexity in terms of regulation.

Other measures that were not considered here are the average in- and out-degree of the nodes in the graph, and the distribution of clique sizes in the network. I feel that both can add to our analysis of regulation networks in terms of complexity, but that they do not have the same power as the suggested measure. In short, the first of the two (average in- and out-degree) lacks the ability to identify pathways, which are a vital element of any regulatory network. To see this, look at the networks in figure 3 and notice that both have exactly the same average in- and out-degree, i.e. 2/3. The second measure (the distribution of clique sizes) would lose the important notion of directionality if we removed the edge directions, and would need extensive feedback loops to generate any clique at all if we kept the edges directed. Therefore, for the stated purpose, I feel that the chosen measure is a good representative of the regulation complexity of a network.

Using this measure, we can say that the regulatory network of one organism is more complex than that of another, and thus rank organisms on a complexity scale. True, we do not yet have the full regulatory network of any organism, but the various analyses that try to determine the regulatory networks of organisms give reason to expect that the suggested measure would yield a reasonable ranking of organisms.
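The Figure 3 comparison can be checked by hand with a short sketch. Everything here is illustrative: the function names are hypothetical, the three-node topologies are one reading of the Figure 3 caption (a chain A to B to C for blue, and A regulating both B and C for red), and the variance is computed as a population variance. Exact fractions are used so the values are easy to verify: the blue finite bin is [1, 2, 1], with mean 4/3 and population variance 2/9 (the sum of squared deviations is 2/3).

```python
from collections import deque
from fractions import Fraction


def finite_distances(edges, nodes):
    """Shortest directed distances for all reachable ordered pairs."""
    adjacency = {n: [] for n in nodes}
    for regulator, target in edges:
        adjacency[regulator].append(target)
    found = []
    for source in nodes:
        steps = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in adjacency[current]:
                if nxt not in steps:
                    steps[nxt] = steps[current] + 1
                    queue.append(nxt)
        found.extend(d for node, d in steps.items() if node != source)
    return found


def summarize(edges, nodes):
    """Finite-bin statistics; assumes the network has at least one edge."""
    finite = finite_distances(edges, nodes)
    pairs = len(nodes) * (len(nodes) - 1)  # ordered pairs, no self-pairs
    mean = Fraction(sum(finite), len(finite))
    variance = sum((d - mean) ** 2 for d in finite) / len(finite)
    return {
        "finite_fraction": Fraction(len(finite), pairs),
        "mean": mean,
        "variance": variance,
        "avg_out_degree": Fraction(len(edges), len(nodes)),
    }


def more_complex(first, second):
    """Rule proposed in the text: the larger finite fraction wins."""
    if first["finite_fraction"] == second["finite_fraction"]:
        return "undecided"  # tie-breaking is not specified in the text
    if first["finite_fraction"] > second["finite_fraction"]:
        return "first"
    return "second"


nodes = ["A", "B", "C"]
blue = summarize([("A", "B"), ("B", "C")], nodes)  # assumed chain
red = summarize([("A", "B"), ("A", "C")], nodes)   # assumed fan-out

print(blue["mean"], blue["variance"], blue["finite_fraction"])  # 4/3 2/9 1/2
print(red["mean"], red["variance"], red["finite_fraction"])     # 1 0 1/3
print(blue["avg_out_degree"], red["avg_out_degree"])            # 2/3 2/3
print(more_complex(blue, red))                                  # first
```

The identical average degree of 2/3 for both networks shows why a degree-based measure cannot separate them, while the finite fraction can.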
For example, we would expect the human to be more complex than the mouse or the worm, although humans have around 20,000 to 25,000 genes, as do mice and worms. So by the number of genes alone we cannot say that humans are more complex than mice or worms; this is where the notion of regulation depth steps in and should help us determine the complexity level of an organism.

The origin of multicellular organisms

Having discussed complexity in life, I hope I have made it plausible that there is a connection between the depth of the regulatory network and the complexity of the organism. Now I would like to link this subject to a more specific biological question: the origin of multicellular organisms. This question has been studied by various researchers [5], who focus on different aspects of the problem. I would like to suggest here a hypothesis related to the subject discussed above, i.e. the connection between regulation and complexity. First we have to agree that multicellular organisms are more complex than unicellular organisms. This in itself is not an obvious fact, yet I feel it is a reasonable assumption for the matter at hand.

Before I present the hypothesis, I need to say a few words about the nature of microRNAs, or miRNAs as I will refer to them. miRNAs are short molecules, usually 22-25 nucleotides long. They are transcribed from the genome as a pri-miRNA, which is processed (cut) into a pre-miRNA, which in turn undergoes another processing step to give the final mature miRNA of length 22-25. Mature miRNAs bind to specific mRNAs (at their 3'UTR) and inhibit their translation, either by interfering with ribosome binding or by promoting cleavage of the targeted mRNA. In short, one can say that they are very simple "genes" (they do not give rise to a protein as normal genes do) whose sole purpose is to regulate the abundance of other genes or proteins.
I hereby make the following two observations. The first is that microRNAs, the small noncoding regulatory RNAs, do not appear in any unicellular organism, neither prokaryote nor eukaryote (for example yeast). Rather, microRNAs appear only in multicellular organisms. The second observation is that one of the main differences between unicellular and multicellular organisms is their cells' ability to differentiate. If we take a single cell from a multicellular organism and compare it to the cell of a unicellular organism, we will see that both excel at what they do. The unicellular organism has mutated and evolved, not into a perfect being, but into one that is very efficient at preserving life. When unicellular organisms joined to create a multicellular organism, they had already nearly perfected the efficiency of energy creation and consumption, and more generally the ability to produce and perpetuate life, using their genes, or mRNAs.

What this work suggests is that at the transition from unicellularity to multicellularity, it was highly problematic for the unicellular organisms to change their genes or their mRNA signature altogether in order to differentiate and gain an advantage from their agglomeration. So rather than doing that, they evolved miRNAs: tiny regulators that are relatively easy to evolve and mutate because of their small size compared to real genes. These gave each cell the ability to differentiate, changing its mRNA expression and thus its life form through an easy combinatorial activation of miRNAs. In other words, I suggest that miRNAs took part in the leap that evolution made from unicellular to multicellular organisms.

The link to the notion of complexity discussed above becomes clear once we note that moving from unicellular organisms to multicellular ones requires an increase in complexity.
Using the notion above, we could say that we want to add depth to our regulatory network. Adding such depth could be done in several ways. One way is to link a regulator to a gene that it did not regulate before. Other ways are to add a new regulator to the system and link genes to it, or to turn one of the genes into a regulator in addition to its existing function. The last two options are not simple at all, since they involve adding a new function to a gene or adding a new gene from scratch. It is therefore plausible to assume that they would take a long evolutionary time, which is not available in this transition, since the agglomeration of unicellular organisms needs to generate an advantage relatively fast. The first option is reasonable: all we need to do is add a control sequence to the gene we wish to regulate, and a few simple mutations would do for that. However, this adds only one regulated gene to our regulator, which will not dramatically change the complexity of our regulation network.

A different approach can be taken using miRNAs. For example, think of one gene that is a regulator and has some targets. Now, in addition to translating its coding region into a protein, we also create a miRNA from its intron (simple organisms usually lack introns, but the idea remains similar). This miRNA can target many genes because of its generic structure, and thus, with a few mutations, we can add many target genes to this regulator and increase our regulatory complexity. Another way, similar to adding a new regulator, is to create a new miRNA that regulates genes. This new miRNA can target many genes and increase the regulation complexity, and will cost far less than creating a brand-new regulatory gene. In that sense, miRNAs can be thought of as meta-regulators.
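To make the contrast between the two routes concrete, here is a toy sketch that compares the fraction of ordered pairs with finite distance before and after each change. All specifics are assumptions made for illustration: the gene names, the choice of two initial targets, the number of genes the miRNA targets (four), and the decision to model the miRNA as an extra node in the gene network.

```python
from fractions import Fraction


def finite_fraction(edges, nodes):
    """Fraction of ordered pairs (a, b), a != b, with a directed path a -> b."""
    adjacency = {n: set() for n in nodes}
    for regulator, target in edges:
        adjacency[regulator].add(target)
    reachable_pairs = 0
    for source in nodes:
        # Depth-first traversal collects everything reachable from source.
        seen, stack = {source}, [source]
        while stack:
            for nxt in adjacency[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        reachable_pairs += len(seen) - 1
    return Fraction(reachable_pairs, len(nodes) * (len(nodes) - 1))


genes = ["R", "G1", "G2", "G3", "G4", "G5", "G6"]  # hypothetical genes
base_edges = [("R", "G1"), ("R", "G2")]            # R is the only regulator

# Route 1: a few mutations add one control sequence, so R gains one target.
one_link = finite_fraction(base_edges + [("R", "G3")], genes)

# Route 2: R also yields a miRNA "M" that targets several genes at once.
mirna_edges = base_edges + [("R", "M")] + [("M", g) for g in ["G3", "G4", "G5", "G6"]]
with_mirna = finite_fraction(mirna_edges, genes + ["M"])

base = finite_fraction(base_edges, genes)
print(base, one_link, with_mirna)  # 1/21 1/14 11/56
```

In this toy setting the single new link raises the finite fraction from 1/21 to 1/14, while the miRNA raises it to 11/56, even though the network gained a node. The size of the gap depends entirely on how many targets the miRNA is assumed to acquire.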
Since miRNAs target the 3'UTR of a gene, which is usually not very conserved and usually relatively long, any newly created miRNA can immediately have an effect on many mRNAs, and can therefore contribute to the depth of the regulatory network and the complexity of an organism in a relatively short time.

A few works link miRNAs with differentiation [3, 4]. In one of them [4], the authors show that over-expressing even one miRNA can lead the cell to a gene expression pattern similar to that of a cell from a different tissue. Similarly, if we think of the very simple unicellular organisms that joined together, expressing even one new miRNA could perhaps dramatically change the cell's gene expression and life cycle. This evidence suggests that even one or two miRNAs could have a dramatic effect on the way of life of a cell, and could thus help an agglomeration of unicellular organisms differentiate, so that some of the organisms specialize in one aspect of life and others in other aspects, giving the agglomeration an advantage in the fight for survival.

Summary

In this work I have tried to connect the complexity of a living organism with the complexity of its regulatory network, using a suggested measure. I then tried to explain why miRNAs are good candidates for generating such complexity reasonably fast through random mutations. This suggests that a transition as complex as the one from unicellular to multicellular organisms, which involves a large increase in the organism's complexity, could have used miRNAs as a means of generating that complexity.

[1] Charles H. Bennet. How to define complexity in Physics, and why. IBM Research 137-148.
[2] Mark Perakh. Defining complexity. A commentary to a paper by Charles H. Bennet.
[3] Chen Y, Stallings RL. Differential patterns of microRNA expression in neuroblastoma are correlated with prognosis, differentiation, and apoptosis. Cancer Res. 2007 Feb 1;67(3):976-83. PMID: 17283129.
[4] Lim LP et al. Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature. 2005 Feb 17;433(7027):769-73. Epub 2005 Jan 30.
[5] Richard E. Michod. Life-history evolution and the origin of multicellularity. Journal of Theoretical Biology 239 (2006) 257-272.