Computer Modelling of Evolution

Luigi Barone, Philip Hingston, Lyndon While
Department of Computer Science, The University of Western Australia, Nedlands, WA, 6009
email: fluigi, phi, [email protected]

Abstract

The development of life on Earth is governed by the processes of evolution and natural selection. We investigate the evolutionary process to determine the variant and invariant parts of the process with respect to different species. We introduce a C++ hierarchy class model as a general model for evolution and use this model to simulate peacock communities in an attempt to answer the question of why male peacocks have long, apparently counter-adaptive, tails. An analysis of the results from this model concludes the paper.

Keywords: evolution, evolutionary algorithm, object-oriented, peacock.
CR Classification: I.6.3 [Computing Methodologies]: Simulation and Modelling - Applications.

1 Introduction

All life present on Earth today evolved from elementary organisms. The process by which these simple organisms developed into the complex multi-celled fauna/flora present today is called evolution. Organisms contain entities called genes that are capable of self-replication. Genes are passed from parent(s) to their offspring during the process of reproduction. Any inexact replication of genes during reproduction that generates offspring which differ from their parent(s) in an unexpected manner is called mutation. The total genetic information contained in an organism is called its genotype. The organism formed by the interaction of the genotype with its environment is called the phenotype. The success of the phenotype in its natural environment determines whether the genes contained in its genotype go forward into the next generation. Natural selection is the process by which the more successful phenotypes of a generation pass on a greater proportion of genes to the next generation.
In other words, natural selection suggests that any adaptation (expressed through an organism's genes) that helps an organism to survive and produce more offspring eventually becomes dominant in the organism community, because more offspring with this adaptation are produced. For a complete discussion of evolution and the related terminology, the reader is referred to Dawkins' paper [1].

In this paper, we investigate the evolutionary process to determine the variant and invariant parts of the process with respect to different species. We introduce a C++ hierarchy class model as a general model for evolution and use this model to simulate peacock communities in an attempt to answer the question of why male peacocks have long, apparently counter-adaptive, tails. An analysis of the results from this model concludes the paper.

2 Modelling the Evolutionary Process

We would like to develop a general framework for modelling evolution such that a particular scenario can be developed by instantiating general functions with model-specific definitions. To achieve this, we need to identify the features of evolution that are invariant (i.e. common factors in all evolutionary models) and the features that are variant (i.e. model-specific).

Each organism has a genotype and a phenotype. The phenotype is the actual living organism (e.g. a human), while the genotype is the genetic information contained in the phenotype (e.g. a human genome in humans). The attributes of the phenotype and genotype differ from organism species to organism species. However, each phenotype has associated with it its sex (either male, female, or unisexed) and some measure of time since it came into existence (its age). These two attributes are common to all species.

All organisms eventually die. Organisms die for a variety of reasons, possibly because of environmental influences (e.g. lack of food), or possibly of old age.
The cause of death depends on the model being simulated.

Organisms must reproduce (either sexually or asexually) to perpetuate the species into further generations. Without reproduction, once all the organisms die, the species becomes extinct. The way in which genes are copied from parent to offspring may differ in different models. Random copying errors (mutation) are inevitable during the reproduction process. The probability and degree of mutation are, however, variable.

For sexual reproduction to occur, a male and a female must meet and be willing to reproduce. The manner in which organisms encounter each other differs from model to model. The willingness of two organisms to reproduce is also problem-specific.

3 A General Evolutionary Model

We can generalize the evolutionary process using a library of C++ classes. A set of base classes is defined that represents the invariant features of the evolutionary process. A base class is provided for each of a general organism phenotype and genotype. Phenotypes and genotypes of organisms specific to the model can then be provided by deriving new classes from these base classes. The variant parts of the evolutionary process are expressed as functions to be overridden in the base classes. That is, the new derived classes override functions that express the variant parts of the model.

3.1 The Base Phenotype and Genotype Classes

We use a collection class to hold lists of possibly different types of organism species. To achieve this, all organism species are derived from a simple base class called object. The collection class, called objectlist, is then a list of pointers to instances of the class object. Standard list operations are defined on the class objectlist.

A base genotype class, called genotype_class, is defined from which the model-specific genotype class is derived. The genotypes used in a particular simulation will depend on the situation being modelled and the aspects of that situation which are deemed interesting.
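The relationship between object, objectlist and genotype_class can be pictured with a minimal sketch. The paper does not list the members of these classes, so everything beyond the three class names and add_element is an assumption made for illustration: the use of std::vector for storage, the size and element accessors, and the example_organism and example_genotype types are all hypothetical, and the code is written in modern C++ rather than the dialect available to the original implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Root class: anything stored in an objectlist derives from it.
class object {
 public:
  virtual ~object() = default;
};

// Collection class: a list of pointers to object instances.
// In this sketch the list does not own the objects it points to.
class objectlist {
 public:
  void add_element(object* item) { items_.push_back(item); }

  // Hypothetical accessors, not named in the paper.
  std::size_t size() const { return items_.size(); }
  object* element(std::size_t i) const {
    assert(i < items_.size());
    return items_[i];
  }

 private:
  std::vector<object*> items_;  // storage choice is an assumption
};

// Base genotype class: model-specific genotypes derive from it.
class genotype_class {
 public:
  virtual ~genotype_class() = default;
};

// Hypothetical model-specific genotype carrying a single gene value.
struct example_genotype : genotype_class {
  double gene = 0.0;
};

// Hypothetical organism type, used only to show heterogeneous storage.
struct example_organism : object {
  explicit example_organism(int i) : id(i) {}
  int id;
};
```

Because the list holds base-class pointers, organisms of different derived types can share one list, and a caller recovers the concrete type with a cast when it needs model-specific attributes.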
All organism instances are derived from a base organism class, called individual. In order to use this class in the collection class, individual is also derived from the object class. The individual class represents the phenotype of a general organism: it contains attributes that are common to all organism species.

The individual class defines the following protected attributes that may be used in derived classes.

int age: the age of the phenotype.
sex_class sex: the sex of the phenotype.
genotype_class *genotype: a pointer to the organism's genotype.

The individual class also provides the following function.

int mutation_direction(float mutation_rate);
This function returns the direction of mutation when copying a parent gene to the offspring. The parameter mutation_rate is the probability (as a percentage) of mutation. The return value is either -1, 0, or 1, indicating the direction of mutation. A return value of zero indicates no copying error occurred. A non-zero return value indicates the gene was copied incorrectly. The sign of the return value indicates which way the error occurred, positive representing an increment in gene value, negative representing a decrement in gene value. This function may be overridden in the organism class if required.

The individual class requires the following functions to be overridden in derived classes.

bool die(int popsize);
This function determines if the organism dies in the current breeding cycle. It returns true if the organism dies and false if it survives. Phenotype attributes such as age are likely to affect the return value of this function. The parameter popsize is the number of organisms of the organism class at the beginning of the current breeding cycle. This function is called every breeding cycle for each organism.

void mature();
This function modifies the state of the phenotype each breeding cycle, thereby acting as an aging function.
Phenotype attributes are modified by this function to simulate the process of aging in nature. If the model does not have any aging effects on phenotype attributes, the function remains empty. This function is called every breeding cycle for each organism.

bool participate_in_reproduction(int popsize);
This function determines if the organism participates in reproduction in the current breeding cycle. It returns true if the organism reproduces in this breeding cycle and false otherwise. Phenotype attributes such as age, sex and the time since last reproduction are likely to affect the return value of this function. The parameter popsize is the number of organisms of the organism class at the beginning of the current breeding cycle. This function is called every breeding cycle for each organism.

individual *choose_mate(objectlist *selection_set);
This function selects the mate chosen by a female organism in the current breeding cycle. The selection_set parameter contains a list of males encountered by the female: the function returns the male from this list with whom the female reproduces. A NULL return value indicates that the female does not wish to reproduce with any of the males in the list. This function is called for each female organism which is participating in reproduction in the current breeding cycle.

void reproduce(genotype_class *m, genotype_class *f);
This function creates the genotype for a new organism. It creates the genotype of the new offspring from the genotypes of the parents, determining a value for each genotype attribute. Mutation can be modelled in this function using the mutation_direction function defined in the individual class. This function is called for each new organism at the time it is created.

void develop();
This function develops the organism's phenotype from its genotype. It models the development process in nature, where the genotype is transformed into the phenotype.
This function can incorporate random factors to simulate environmental effects on the development of the phenotype. This function is called for each new organism at the time it is created.

3.2 The Base Population Class

We describe populations of organisms using a population class. The sexual population class is used for sexual reproduction and the asexual population class is used for asexual reproduction. These classes represent the entire population of a species. For a sexual species, all organism instances are maintained as two lists.

objectlist *male_individual: a list of male organisms.
objectlist *female_individual: a list of female organisms.

The population class performs operations on the whole population of organisms, providing the following member functions.

void initialize_population(int popsize);
This function initializes the population.

void age();
This function increments the age of each organism and then calls the mature member function of each organism.

void kill(int gen, int *males_killed, int *females_killed);
This function kills organisms from the population. It calls the die member function for each organism and deletes the organism from the list if required.

void reproduce(int gen, int *males_reproduced, int *females_reproduced);
This function adds new organisms into the population. It calls the participate_in_reproduction member function for each female organism: if it returns true, reproduce calls the choose_selection_set member of the sexual population class to determine the list of male organisms encountered by the female in this breeding cycle. A mate is then chosen by calling the choose_mate member function. If a male is chosen, a new organism is created, and the genotype and phenotype of the offspring are created by calling its reproduce and develop member functions.

The population class also requires the following function to be overridden in derived classes.
objectlist *choose_selection_set();
This function selects a list of males from the organism population from which a female organism selects a mate. This models the process of organism interaction. The list returned represents the list of males that the female encountered in the current breeding cycle. The size of the return list can be varied to model different numbers of encounters for each female. Organisms can be added to the list by calling the add_element member function of the objectlist class. This function is called every time a female organism is willing to participate in reproduction.

4 The Peacock Model

Dawkins [2] poses the question of why male peacocks developed long tails. At first sight, these tails seem to contradict the principle of natural selection, because a long tail (i.e. a tail longer than the aerodynamic optimum) is a hindrance to a peacock compared to a short tail. Not only does the peacock pay the extra cost of growing a long tail, but also, for example, it may be harder for a peacock with a long tail to elude predators. In jargon, such a phenotypic feature is called a counter-adaptation.

One theory submits that long tails became dominant in male peacocks simply because female peacocks have a genetic preference for males with long tails compared to males with short tails [3]: in short, a long tail makes a male peacock more attractive to females. The theory submits that as soon as such an imbalance arises (for whatever reason), a feed-forward situation develops which causes male tails and female preference for long tails to increase in tandem at a geometrically increasing rate, such a process reaching equilibrium when males cannot grow longer tails and still survive in their environment. Note that the theory also admits the possibility of birds with excessively short tails: indeed, these are also found both in nature and in the results from our model.

We have developed a model for peacock communities within our general framework for evolution.
Peacock genotypes are represented by the following attributes:

1. constitution: a measure of the general well-being of the peacock,
2. tail length, and
3. preference for long tails: the probability that a female peacock chooses the male peacock with the longer tail from a pair of candidate mates.

Note that the tail-length gene expresses itself only in males and the preference gene expresses itself only in females.

Phenotypes are created from genotypes using a transformation mapping. Female peacock genotypes are mapped to a phenotype represented by two attributes:

1. constitution, and
2. preference for long tails.

Male peacock phenotypes are also represented by two attributes:

1. constitution, and
2. tail length.

The probability of death in each breeding cycle for a male peacock is a function of its age, constitution, and tail length. As tail length increases, so does the probability of dying, since a larger tail is a hindrance to a peacock (note that a tail shorter than the aerodynamic optimum is also a disadvantage) [4]. The probability of death for a female peacock is a function of its age and constitution. For both males and females, the probability of surviving is directly proportional to the peacock's constitution, and inversely proportional to its age.

Each breeding year, each female in the population either chooses a male with which to reproduce from a random selection of peacocks or abstains from reproduction. From a given set of males, the male chosen is determined by a competition process where the candidates are compared pairwise by the female until only one remains. In essence, the probability that two peacocks reproduce is a function of the tail length of the male, and the tail length preference of the female. Immature peacocks are excluded from the breeding process.

When two peacocks reproduce, the offspring's genotype is some combination of the parents' genotypes.
The parents' genotypes are copied with a pre-defined probability of error, i.e. the mutation rate.

5 Experimenting with the Peacock Model

The variables of the model are: the function determining the probability of death in each breeding cycle, the starting population, the reproduction function, the probability of mutation, and the number of male peacocks from which each female can choose (the encounter set).

All experiments were performed on a DEC ALPHA 3000 workstation running DEC OSF/1 V2.0. A non-linear additive feedback random number generator was used to return successive pseudo-random numbers in the range 0 to $2^{31} - 1$. The program was written using C++ and compiled using the GNU g++ version 2.6.0 compiler with no optimization options.

The program runs for a fixed number of breeding cycles (unless the population dies out). Genotype/phenotype statistics are generated and displayed in each breeding cycle.

Initializing the Parameters

The probability of survival for a female peacock is directly proportional to its constitution and inversely proportional to its age. The health of a female peacock is given by the function:

\[ \text{health} = \frac{\text{constitution}}{\text{age}} \]

The probability of survival for a male peacock is directly proportional to its constitution, inversely proportional to its age and a Gaussian function of its deviation from the optimal tail length. The health of a male peacock is given by the function:

\[ \text{health} = \frac{\text{constitution}}{\text{age}} \exp\left( \frac{-(\text{tail length} - \text{optimal tail length})^{2}}{c} \right) \]

where $c$ controls the effect of tail length on the health of male peacocks. $c = \infty$ indicates that tail length has no effect on peacock health: as $c$ decreases, tail length has an increasing effect on peacock health, with tails further from the optimal aerodynamic length lowering the value of health. The optimal tail length is set (somewhat arbitrarily) to 5.

Each year, every peacock's health is evaluated and the peacock dies if:

\[ \text{health} < \text{random}(\text{threshold}) \]

where random(x) returns a random number in the range 0 to x.
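The health functions and the death rule above can be sketched in a few lines of C++. This is an illustration under stated assumptions, not the original implementation: the function names female_health, male_health and dies are hypothetical, the random source is any standard-library generator rather than the additive feedback generator used in the experiments, and random(threshold) is interpreted as a uniform draw from 0 up to threshold.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Health of a female: constitution divided by age.
inline double female_health(double constitution, int age) {
  assert(age > 0);
  return constitution / age;
}

// Health of a male: the female formula scaled by a Gaussian penalty on the
// deviation of tail length from the optimum. A larger c weakens the penalty;
// as c grows without bound the factor tends to 1 (no tail-length effect).
inline double male_health(double constitution, int age, double tail_length,
                          double c, double optimal_tail_length = 5.0) {
  assert(age > 0 && c > 0.0);
  const double deviation = tail_length - optimal_tail_length;
  return (constitution / age) * std::exp(-(deviation * deviation) / c);
}

// Death rule: the peacock dies if its health is below a uniform random
// draw from [0, threshold). The generator type is an assumption.
template <class Rng>
bool dies(double health, double threshold, Rng& rng) {
  assert(threshold > 0.0);
  std::uniform_real_distribution<double> draw(0.0, threshold);
  return health < draw(rng);
}
```

Under this reading, a peacock whose health is at least the threshold always survives the cycle, so raising the threshold makes death more likely for every peacock with the same health.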
The starting population for these experiments was set to 1000 males and 1000 females. The starting population needs to be large enough so that the development of the population is not dominated by random drift caused by the pseudo-random number generator.

When the population reaches a critical size, the value of the threshold is increased. This simulates competition for limited resources in the environment.

The genotype of a new peacock is created from its parents' genotypes by simply selecting each gene from one of the parents at random. The number of male peacocks in the female selection set is arbitrarily set to 16. The probability that a female participates in reproduction is set to 25%. The probability of mutation is set to 15%. When mutation occurs, the change in the gene value being mutated is 20% of the initial value for the population.

5.1 Results

[Figure 5-1: Male peacock tails disappearing. The graphs plot the average value of each gene for each gender in each breeding cycle. Panels: Tail Length and Tail Preference for Female Peacocks and for Male Peacocks, over breeding cycles 0 to 1000.]

Figure 5-1 shows a typical result obtained when the health evaluator described above is used with c of the order of 10,000. The graphs indicate a downward trend in both tail length (in males) and tail length preference (in females), supporting the theory that genetically-based sexual preference in females can dominate simple survival considerations. Even though the optimal tail length is 5, when the average tail length preference in females drops below 0.5, the males respond with tails shorter than the optimum.
The effect is that males with short tails pass their genes on to the next generation and the short-tail gene becomes dominant in the peacock community. The fact that short-tailed males are more successful than long-tailed males means that it becomes vital for females to select mates with short tails in order for their male offspring to also be successful. The two genes therefore exert pressure on each other and their values decrease until male tails have disappeared.

The dormant female tail length gene follows the trend of the active male tail length gene, even though this gene has no effect on the female. Similarly, the dormant tail length preference gene in males follows the active tail length preference gene in females. The overriding consideration is that the success of a peacock depends not only on its own fitness, but also on the fitness of its offspring (and their offspring, etc.).

[Figure 5-2: Defying natural selection? Male peacocks evolve long tails. Panels: Tail Preference for Female Peacocks and Tail Length for Male Peacocks, over breeding cycles 0 to 1000.]

Figure 5-2 shows the alternative result, the development of male peacocks with extravagantly long tails. This outcome is also predicted by the theory, for the same reasons as the previous scenario. The genetic bases of tail length in male peacocks and tail-based sexual preference in female peacocks lead to a situation where the values of these genes exert pressure on each other and advance together at a geometrically-increasing rate, even at a substantial survival cost to the males.

However, this upward trend in tail length cannot continue unbounded. At some point, the consideration of survival overcomes the consideration of sexual attractiveness, to the extent that males with tails which are "too long" do not even survive to maturity and hence do not participate in reproduction at all.
When this occurs, males with shorter tails pass on a greater proportion of genes to the next generation, forcing tail length to stabilize.

6 Conclusions

We have defined a general framework for the process of evolution which abstracts the invariant parts of the evolutionary process and allows a programmer to model particular situations by instantiating the variant parts of the process. We use the inheritance provided by the C++ class structure to define particular species as derivations of general base organisms. This allows the programmer to focus easily on the aspects of the process which are deemed relevant in a particular experiment.

The peacock model demonstrates the use of the evolutionary framework. Fisher's theory submits that apparently counter-adaptive phenotypic traits (e.g. male peacocks' tails) can arise simply because both sexual attractiveness in males and sexual preference in females have a genetic basis. We can simulate the salient aspects of this scenario quickly and easily, by deriving minimal genotypes and phenotypes of male and female peacocks. The general framework manages the interactions between these entities and all other aspects of the simulation. Our experiments with this model confirm that Fisher's theory is adequate to explain the observed development of peacocks.

We have also used our framework to develop models of outlaw genes and genes which control the mutation process.

7 Future Directions for the Generalized Model

Future extensions to the model will include:

allowance for interaction between phenotypes and their environment, and
allowance for interaction between different species.

At this stage, phenotype interaction with the environment can be modelled only by incorporating probabilistic measures in the overridden functions.
For example, the function determining the probability of death of a phenotype may incorporate a probability of a food shortage occurring, thus making the probability of death for each phenotype slightly greater. This is sufficient for simple models, but more complex models require an explicit representation of the environment in which species evolve. We intend to define a new base class where the environment is modelled as a grid of cells, each with attributes describing the state of the environment in its locality. The user-defined functions will have the ability to examine and/or modify these attributes.

We also intend to extend the framework to support populations of several species, allowing us to study the interactions between different species, for example predator/prey scenarios and parasitic/symbiotic dependencies.

Bibliography

1. R. Dawkins. The Evolution of Evolvability, Artificial Life, SFI Studies in the Sciences of Complexity, Ed. C. Langton, Addison-Wesley, 1988.
2. R. Dawkins. The Selfish Gene, Oxford University Press, 1976.
3. R.A. Fisher. The Genetical Theory of Natural Selection, Oxford Clarendon Press, 1930.
4. S. Windybank. Wild Sex, Reed Books, 1991.