Computer Modelling of Evolution
Luigi Barone
Philip Hingston
Lyndon While
Department Of Computer Science,
The University of Western Australia,
Nedlands, WA, 6009
email: fluigi, phi, [email protected]
Abstract
The development of life on Earth is governed by the processes of evolution and natural selection. We
investigate the evolutionary process to determine the variant and invariant parts of the process with
respect to different species. We introduce a C++ hierarchy class model as a general model for evolution and use this model to simulate peacock communities in an attempt to answer the question of why
male peacocks have long, apparently counter-adaptive, tails. An analysis of the results from this model
concludes the paper.
Keywords: evolution, evolutionary algorithm, object-oriented, peacock.
CR Classification: I.6.3 [Computing Methodologies]: Simulation and Modelling - Applications.
1 Introduction
All life present on Earth today evolved from elementary organisms. The process by which these simple
organisms developed into the complex multi-celled fauna/flora present today is called evolution.
Organisms contain entities called genes that are capable of self-replication. Genes are passed from
parent(s) to their offspring during the process of reproduction. Any inexact replication of genes during
reproduction that generates offspring which differ from their parent(s) in an unexpected manner is called
mutation. The total genetic information contained in an organism is called its genotype. The organism
formed by the interaction of the genotype with its environment is called the phenotype. The success
of the phenotype in its natural environment determines whether the genes contained in its genotype
go forward into the next generation. Natural selection is the process by which the more successful
phenotypes of a generation pass on a greater proportion of genes to the next generation. In other
words, natural selection suggests that any adaptation (expressed through an organism's genes) that
helps an organism to survive and produce more offspring eventually becomes dominant in the organism
community, because more offspring with this adaptation are produced. For a complete discussion of
evolution and the related terminology, the reader is referred to Dawkins' paper [1].
In this paper, we investigate the evolutionary process to determine the variant and invariant parts of
the process with respect to different species. We introduce a C++ hierarchy class model as a general
model for evolution and use this model to simulate peacock communities in an attempt to answer the
question of why male peacocks have long, apparently counter-adaptive, tails. An analysis of the results
from this model concludes the paper.
2 Modelling the Evolutionary Process
We would like to develop a general framework for modelling evolution such that a particular scenario
can be developed by instantiating general functions with model-specific definitions. To achieve this, we
need to identify the features of evolution that are invariant (i.e. common factors in all evolutionary
models) and the features that are variant (i.e. model-specific).
Each organism has a genotype and a phenotype. The phenotype is the actual living organism (e.g. a
human), while the genotype is the genetic information contained in the phenotype (e.g. a human genome
in humans). The attributes of the phenotype and genotype differ from organism species to organism
species. However, each phenotype has associated with it its sex (either male, female, or unisexed) and
some measure of time since it came into existence (its age). These two attributes are common to all
species.
All organisms eventually die. Organisms die for a variety of reasons, possibly because of environmental
influences (e.g. lack of food), or possibly of old age. This is dependent on the model being simulated.
Organisms must reproduce (either sexually or asexually) to perpetuate the species into further generations. Without reproduction, once all the organisms die, the species becomes extinct. The way in
which genes are copied from parent to offspring may differ in different models. Random copying errors
(mutation) are inevitable during the reproduction process. The probability and degree of mutation are,
however, variable. For sexual reproduction to occur, a male and a female must meet and be willing
to reproduce. The manner in which organisms encounter each other differs from model to model. The
willingness of two organisms to reproduce is also problem-specific.
3 A General Evolutionary Model
We can generalize the evolutionary process using a library of C++ classes. A set of base classes is
defined that represents the invariant features of the evolutionary process. A base class is provided for
each of a general organism phenotype and genotype. Phenotypes and genotypes of organisms specific
to the model can then be provided by deriving new classes from these base classes. The variant parts
of the evolutionary process are expressed as functions to be overridden in the base classes. That is, the
new derived classes override functions that express the variant parts of the model.
3.1 The Base Phenotype and Genotype Classes
We use a collection class to hold lists of possibly different types of organism species. To achieve this, all
organism species are derived from a simple base class called object. The collection class, called objectlist,
is then a list of pointers to instances of the class object. Standard list operations are defined on the
class objectlist.
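The collection scheme described above could be sketched as follows. This is a minimal illustration, not the authors' code: only the class names object and objectlist and the add_element operation appear in the paper; the remaining member names (remove_element, element, size) and the use of std::vector for storage are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Common root class, so that a single list type can hold any species.
class object {
public:
    virtual ~object() = default;
};

// A list of non-owning pointers to object instances.
// Only add_element is named in the paper; the other members are hypothetical.
class objectlist {
public:
    void add_element(object *item) {
        assert(item != nullptr);
        items_.push_back(item);
    }
    void remove_element(std::size_t index) {
        assert(index < items_.size());
        items_.erase(items_.begin() + static_cast<std::ptrdiff_t>(index));
    }
    object *element(std::size_t index) const { return items_.at(index); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<object *> items_;  // the list does not own its elements
};
```

Because every species derives from object, one objectlist can mix organisms of different derived types; recovering the concrete type is then left to the caller.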
A base genotype class, called genotype_class, is defined from which the model-specific genotype class is
derived. The genotypes used in a particular simulation will depend on the situation being modelled and
the aspects of that situation which are deemed interesting.
All organism instances are derived from a base organism class, called individual. In order to use this class
in the collection class, individual is also derived from the object class. The individual class represents
the phenotype of a general organism: it contains attributes that are common to all organism species.
The individual class defines the following protected attributes that may be used in derived classes.
int age - the age of the phenotype.
sex_class sex - the sex of the phenotype.
genotype_class *genotype - a pointer to the organism's genotype.
The individual class also provides the following function.
int mutation_direction(float mutation_rate);
This function returns the direction of mutation when copying a parent gene to the offspring. The
parameter mutation_rate is the probability (as a percentage) of mutation. The return value is either -1,
0, or 1, indicating the direction of mutation. A return value of zero indicates no copying error occurred.
A non-zero return value indicates the gene was copied incorrectly. The sign of the return value indicates
which way the error occurred, positive representing an increment in gene value, negative representing a
decrement in gene value. This function may be overridden in the organism class if required.
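One way the behaviour described above might be implemented is sketched below. The paper does not say how the sign of the error is chosen, so the equal split between +1 and -1 is an assumption, as is the extra rng parameter, which is added here only to make the example reproducible.

```cpp
#include <cassert>
#include <random>

// Returns -1, 0 or +1 for one gene copy.
// mutation_rate is a percentage in [0, 100], as in the paper.
// Assumption: increments and decrements are equally likely.
int mutation_direction(float mutation_rate, std::mt19937 &rng) {
    assert(mutation_rate >= 0.0f && mutation_rate <= 100.0f);
    std::bernoulli_distribution mutate(mutation_rate / 100.0);
    if (!mutate(rng)) {
        return 0;  // gene copied correctly
    }
    std::bernoulli_distribution increment(0.5);
    return increment(rng) ? 1 : -1;
}
```

A derived organism class that needs a biased mutation (for example, one that favours decrements) would replace only the final coin toss.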
The individual class requires the following functions to be overridden in derived classes.
bool die(int popsize);
This function determines if the organism dies in the current breeding cycle. It returns true if
the organism dies and false if it survives. Phenotype attributes such as age are likely to affect the
return value of this function. The parameter popsize is the number of organisms of the organism
class at the beginning of the current breeding cycle. This function is called every breeding cycle
for each organism.
void mature();
This function modifies the state of the phenotype each breeding cycle, thereby acting as an aging
function. Phenotype attributes are modified by this function to simulate the process of aging in
nature. If the model does not have any aging effects on phenotype attributes, the function remains
empty. This function is called every breeding cycle for each organism.
bool participate_in_reproduction(int popsize);
This function determines if the organism participates in reproduction in the current breeding
cycle. It returns true if the organism reproduces in this breeding cycle and false otherwise. Phenotype attributes such as age, sex and the time since last reproduction are likely to affect the return
value of this function. The parameter popsize is the number of organisms of the organism class at
the beginning of the current breeding cycle. This function is called every breeding cycle for each
organism.
individual *choose_mate(objectlist *selection_set);
This function selects the mate chosen by a female organism in the current breeding cycle. The
selection_set parameter contains a list of males encountered by the female: the function returns
the male from this list with whom the female reproduces. A NULL return value indicates that
the female does not wish to reproduce with any of the males in the list. This function is called for
each female organism which is participating in reproduction in the current breeding cycle.
void reproduce(genotype_class *m, genotype_class *f);
This function creates the genotype for a new organism. It creates the genotype of the new offspring
from the genotypes of the parents, determining a value for each genotype attribute. Mutation can
be modelled in this function using the mutation_direction function defined in the individual class.
This function is called for each new organism at the time it is created.
void develop();
This function develops the organism's phenotype from its genotype. It models the development
process in nature, where the genotype is transformed into the phenotype. This function can incorporate random factors to simulate environmental effects on the development of the phenotype.
This function is called for each new organism at the time it is created.
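Taken together, the six functions listed above suggest an abstract base class along the following lines. This is a sketch under stated assumptions rather than the authors' declaration: the enum used for sex_class, the constructor, the accessors get_age and increment_age, and the mayfly species are all hypothetical and exist only to show the overriding pattern. The mutation_direction helper is omitted for brevity.

```cpp
#include <cassert>

class object { public: virtual ~object() = default; };
class objectlist;  // collection class, assumed to be defined elsewhere
class genotype_class { public: virtual ~genotype_class() = default; };
enum class sex_class { male, female, unisexed };  // assumed representation

// Invariant part: attributes shared by every species.
class individual : public object {
public:
    individual(sex_class s, genotype_class *g) : age(0), sex(s), genotype(g) {}

    // Variant part: each species must supply these.
    virtual bool die(int popsize) = 0;
    virtual void mature() = 0;
    virtual bool participate_in_reproduction(int popsize) = 0;
    virtual individual *choose_mate(objectlist *selection_set) = 0;
    virtual void reproduce(genotype_class *m, genotype_class *f) = 0;
    virtual void develop() = 0;

    // Hypothetical accessors, added for the example only.
    int get_age() const { return age; }
    void increment_age() { ++age; }

protected:
    int age;
    sex_class sex;
    genotype_class *genotype;
};

// Hypothetical species: breeds from age 1 and dies at age 2.
class mayfly : public individual {
public:
    explicit mayfly(sex_class s) : individual(s, nullptr) {}
    bool die(int) override { return age >= 2; }
    void mature() override {}  // no aging effects in this toy model
    bool participate_in_reproduction(int) override { return age >= 1; }
    individual *choose_mate(objectlist *) override { return nullptr; }
    void reproduce(genotype_class *, genotype_class *) override {}
    void develop() override {}
};
```

Declaring the variant functions as pure virtual makes the compiler reject any species that forgets to define one of them.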
3.2 The Base Population Class
We describe populations of organisms using a population class. The sexual population class is used for
sexual reproduction and the asexual population class is used for asexual reproduction. These classes
represent the entire population of a species. For a sexual species, all organism instances are maintained
as two lists.
objectlist *male_individual - a list of male organisms.
objectlist *female_individual - a list of female organisms.
The population class performs operations on the whole population of organisms, providing the following
member functions.
void initialize_population(int popsize);
This function initializes the population.
void age();
This function increments the age of each organism and then calls the mature member function of
each organism.
void kill(int gen, int *males_killed, int *females_killed);
This function kills organisms from the population. It calls the die member function for each
organism and deletes the organism from the list if required.
void reproduce(int gen, int *males_reproduced, int *females_reproduced);
This function adds new organisms into the population. It calls the participate_in_reproduction
member function for each female organism: if it returns true, reproduce calls the choose_selection_set
member of the sexual population class to determine the list of male organisms encountered by the
female in this breeding cycle. A mate is then chosen by calling the choose_mate member function.
If a male is chosen, a new organism is created, and the genotype and phenotype of the offspring
are created by calling its reproduce and develop member functions.
The population class also requires the following function to be overridden in derived classes.
objectlist *choose_selection_set();
This function selects a list of males from the organism population from which a female organism selects a mate. This models the process of organism interaction. The list returned represents
the list of males that the female encountered in the current breeding cycle. The size of the return
list can be varied to model different numbers of encounters for each female. Organisms can be
added to the list by calling the add_element member function of the objectlist class. This function
is called every time a female organism is willing to participate in reproduction.
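The reproduction step described in this section can be sketched as a loop over the females. The version below is deliberately reduced and differs from the paper's interface in several labelled ways: std::vector replaces objectlist, reproduce returns the newborn organisms instead of updating counters, and make_offspring, demo_bird and demo_population are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct genotype_class { virtual ~genotype_class() = default; };

// Reduced stand-in for the individual class: only what the loop needs.
class individual {
public:
    virtual ~individual() = default;
    virtual bool participate_in_reproduction(int popsize) = 0;
    virtual individual *choose_mate(const std::vector<individual *> &selection_set) = 0;
    virtual std::unique_ptr<individual> make_offspring() const = 0;  // hypothetical factory
    virtual void reproduce(genotype_class *m, genotype_class *f) = 0;
    virtual void develop() = 0;
    genotype_class *genotype = nullptr;
};

class sexual_population {
public:
    virtual ~sexual_population() = default;
    // Variant part: which males does a female meet in this cycle?
    virtual std::vector<individual *> choose_selection_set() = 0;

    // Invariant part: one round of mate choice and offspring creation.
    std::vector<std::unique_ptr<individual>> reproduce() {
        const int popsize = static_cast<int>(males.size() + females.size());
        std::vector<std::unique_ptr<individual>> newborn;
        for (individual *female : females) {
            if (!female->participate_in_reproduction(popsize)) continue;
            const std::vector<individual *> met = choose_selection_set();
            individual *male = female->choose_mate(met);
            if (male == nullptr) continue;  // she rejected every male she met
            std::unique_ptr<individual> child = female->make_offspring();
            child->reproduce(male->genotype, female->genotype);  // build genotype
            child->develop();                                    // build phenotype
            newborn.push_back(std::move(child));
        }
        return newborn;
    }

    std::vector<individual *> males;    // non-owning
    std::vector<individual *> females;  // non-owning
};

// Hypothetical species: a willing female accepts the first male she meets.
class demo_bird : public individual {
public:
    explicit demo_bird(bool willing) : willing_(willing) {}
    bool participate_in_reproduction(int) override { return willing_; }
    individual *choose_mate(const std::vector<individual *> &set) override {
        return set.empty() ? nullptr : set.front();
    }
    std::unique_ptr<individual> make_offspring() const override {
        return std::unique_ptr<individual>(new demo_bird(true));
    }
    void reproduce(genotype_class *, genotype_class *) override {}
    void develop() override { developed = true; }
    bool developed = false;

private:
    bool willing_;
};

// Hypothetical population: every female meets every male.
class demo_population : public sexual_population {
public:
    std::vector<individual *> choose_selection_set() override { return males; }
};
```

Only choose_selection_set and the species class change from model to model; the loop itself stays in the base class, which is the separation of invariant and variant parts that the framework aims for.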
4 The Peacock Model
Dawkins [2] poses the question of why male peacocks developed long tails. At first sight, these tails
seem to contradict the principle of natural selection, because a long tail (i.e. a tail longer than the
aerodynamic optimum) is a hindrance to a peacock compared to a short tail. Not only does the peacock
pay the extra cost of growing a long tail, but also, for example, it may be harder for a peacock with a
long tail to elude predators. In jargon, such a phenotypic feature is called a counter-adaptation.
One theory submits that long tails became dominant in male peacocks simply because female peacocks
have a genetic preference for males with long tails compared to males with short tails[3]: in short, a
long tail makes a male peacock more attractive to females. The theory submits that as soon as such an
imbalance arises (for whatever reason), a feed-forward situation develops which causes male tails and
female preference for long tails to increase in tandem at a geometrically increasing rate, such a process
reaching equilibrium when males cannot grow longer tails and still survive in their environment. Note
that the theory also admits the possibility of birds with excessively short tails: indeed, these are also
found both in nature and in the results from our model.
We have developed a model for peacock communities within our general framework for evolution. Peacock genotypes are represented by the following attributes:
1. constitution - a measure of the general well-being of the peacock,
2. tail length, and
3. preference for long tails - the probability that a female peacock chooses the male peacock with
the longer tail from a pair of candidate mates.
Note that the tail-length gene expresses itself only in males and the preference gene expresses itself only
in females. Phenotypes are created from genotypes using a transformation mapping. Female peacock
genotypes are mapped to a phenotype represented by two attributes:
1. constitution, and
2. preference for long tails.
Male peacock phenotypes are also represented by two attributes:
1. constitution, and
2. tail length.
The probability of death in each breeding cycle for a male peacock is a function of its age, constitution,
and tail length. As tail length increases, so does the probability of dying, since a larger tail is a hindrance
to a peacock (note that a tail shorter than the aerodynamic optimum is also a disadvantage) [4].
The probability of death for a female peacock is a function of its age and constitution. For both
males and females, the probability of surviving is directly proportional to its constitution, and inversely
proportional to its age.
Each breeding year, each female in the population either chooses a male with which to reproduce from
a random selection of peacocks or she abstains from reproduction. From a given set of males, the male
chosen is determined by a competition process where the candidates are compared pairwise by the female
until only one remains. In essence, the probability that two peacocks reproduce is a function of the tail
length of the male, and the tail length preference of the female. Immature peacocks are excluded from
the breeding process.
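The pairwise competition described above could be sketched as a sequential knockout, using the preference gene as the probability that the female picks the longer-tailed male of each pair. The paper does not specify the order of comparisons or how ties are handled, so both are assumptions here, as are the male_peacock struct, the index-based return value and the rng parameter. Abstention and the exclusion of immature birds are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

struct male_peacock {
    double tail_length;
};

// Returns the index of the chosen male, or -1 if the set is empty.
// preference is the probability of picking the longer tail in a pair.
int choose_mate(const std::vector<male_peacock> &selection_set,
                double preference, std::mt19937 &rng) {
    assert(preference >= 0.0 && preference <= 1.0);
    if (selection_set.empty()) return -1;
    std::bernoulli_distribution prefers_longer(preference);
    std::size_t champion = 0;
    for (std::size_t challenger = 1; challenger < selection_set.size(); ++challenger) {
        const bool challenger_longer =
            selection_set[challenger].tail_length > selection_set[champion].tail_length;
        // The challenger wins if he has the kind of tail she favours this time.
        if (prefers_longer(rng) == challenger_longer) champion = challenger;
    }
    return static_cast<int>(champion);
}
```

With preference 1.0 the longest tail in the set always wins, and with 0.0 the shortest always wins; values near 0.5 make the choice close to random.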
When two peacocks reproduce, the offspring's genotype is some combination of the parents' genotypes.
The parents' genotypes are copied with a pre-defined probability of error, i.e. the mutation rate.
5 Experimenting with the Peacock Model
The variables of the model are:
the function determining the probability of death in each breeding cycle,
the starting population,
the reproduction function,
the probability of mutation, and
the number of male peacocks from which each female can choose (the encounter set).
All experiments were performed on a DEC ALPHA 3000 workstation running DEC OSF/1 V2.0. A
non-linear additive feedback random number generator was used to return successive pseudo-random
numbers in the range 0 to $2^{31} - 1$. The program was written using C++ and compiled using the GNU
g++ version 2.6.0 compiler with no optimization options.
The program runs for a fixed number of breeding cycles (unless the population dies out). Genotype/phenotype statistics are generated and displayed in each breeding cycle.
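The top-level control flow just described might look like the sketch below. The reduced population interface, the order of the age, kill and reproduce steps within a cycle, and the shrinking_population stand-in are assumptions made for illustration; the paper's member functions take additional arguments, and statistics output is omitted.

```cpp
#include <cassert>

// Reduced, hypothetical view of a population: just the per-cycle steps.
class population {
public:
    virtual ~population() = default;
    virtual void age() = 0;
    virtual void kill() = 0;
    virtual void reproduce() = 0;
    virtual int size() const = 0;
};

// Runs up to max_cycles breeding cycles, stopping early on extinction.
// Returns the number of cycles actually run.
int run_simulation(population &pop, int max_cycles) {
    assert(max_cycles >= 0);
    int cycle = 0;
    while (cycle < max_cycles && pop.size() > 0) {
        pop.age();        // everyone gets older and matures
        pop.kill();       // some organisms die
        pop.reproduce();  // survivors may breed
        ++cycle;
    }
    return cycle;
}

// Hypothetical stand-in: loses one organism per cycle and never breeds.
class shrinking_population : public population {
public:
    explicit shrinking_population(int n) : n_(n) {}
    void age() override {}
    void kill() override { if (n_ > 0) --n_; }
    void reproduce() override {}
    int size() const override { return n_; }

private:
    int n_;
};
```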
Initializing the Parameters
The probability of survival for a female peacock is directly proportional to its constitution and inversely
proportional to its age. The health of a female peacock is given by the function:
$$\text{health} = \frac{\text{constitution}}{\text{age}}$$
The probability of survival for a male peacock is directly proportional to its constitution, inversely proportional to its age and a Gaussian function of its deviation from the optimal tail length. The health
of a male peacock is given by the function:
$$\text{health} = \frac{\text{constitution}}{\text{age}} \exp\left(\frac{-(\text{tail\_length} - \text{optimal\_tail\_length})^2}{c}\right)$$
where c controls the effect of tail length on the health of male peacocks. c = ∞ indicates that tail length
has no effect on peacock health: as c decreases, tail length has an increasing effect on peacock health,
with tails further from the optimal aerodynamic length lowering the value of health. optimal_tail_length
is set (somewhat arbitrarily) to 5.
Each year, every peacock's health is evaluated and the peacock dies if:
health < random(threshold)
where random(x) returns a random number in the range 0 to x.
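The health functions and the death rule above can be written out directly. The sketch below assumes ages start at 1 (the paper does not say how an age of zero is handled), reads the male formula as the female formula multiplied by a Gaussian factor in the tail-length deviation, and uses a uniform distribution for random(x); the function names and the rng parameter are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <random>

const double optimal_tail_length = 5.0;  // value given in the paper

double female_health(double constitution, int age) {
    assert(age > 0);  // assumption: ages start at 1
    return constitution / age;
}

// Larger c weakens the tail-length penalty; smaller c strengthens it.
double male_health(double constitution, int age, double tail_length, double c) {
    assert(age > 0 && c > 0.0);
    const double deviation = tail_length - optimal_tail_length;
    return constitution / age * std::exp(-(deviation * deviation) / c);
}

// Death rule: the organism dies if health < random(threshold).
bool dies(double health, double threshold, std::mt19937 &rng) {
    assert(threshold > 0.0);
    std::uniform_real_distribution<double> random(0.0, threshold);
    return health < random(rng);
}
```

As a worked figure, with c = 10,000 (the order of magnitude used in Section 5.1) a tail 3 units from the optimum scales health by exp(-9/10000), roughly 0.9991, so under this reading the direct survival penalty for a moderately long tail is very small.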
The starting population for these experiments was set to 1000 males and 1000 females. The starting
population needs to be large enough so that the development of the population is not dominated by
random drift caused by the pseudo-random number generator. When the population reaches a critical
size, the value of the threshold is increased. This simulates competition for limited resources in the
environment.
The genotype of a new peacock is created from its parents' genotypes by simply selecting each gene
from one of the parents at random.
The number of male peacocks in the female selection set is arbitrarily set to 16. The probability that
a female participates in reproduction is set to 25%. The probability of mutation is set to 15%. When
mutation occurs, the change in the gene value being mutated is 20% of the initial value for the
population.
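The inheritance and mutation settings above could be combined as in the following sketch. The peacock_genotype struct, the inherit_gene helper, the equal chance of an upward or downward mutation and the rng parameter are assumptions; the mutation step of 20% of the initial population value follows the text. Clamping the preference gene to the range 0 to 1 is left out because the paper does not describe it.

```cpp
#include <cassert>
#include <random>

struct peacock_genotype {
    double constitution;
    double tail_length;
    double preference;
};

// Copies one gene from a randomly chosen parent, then possibly mutates it.
// mutation_rate is a percentage; step is the size of one mutation.
double inherit_gene(double from_male, double from_female, double step,
                    double mutation_rate, std::mt19937 &rng) {
    assert(mutation_rate >= 0.0 && mutation_rate <= 100.0);
    std::bernoulli_distribution coin(0.5);
    double gene = coin(rng) ? from_male : from_female;
    std::bernoulli_distribution mutate(mutation_rate / 100.0);
    if (mutate(rng)) gene += coin(rng) ? step : -step;
    return gene;
}

// Builds a child genotype; `initial` holds the starting gene values
// of the population, which set the mutation step for each gene.
peacock_genotype reproduce(const peacock_genotype &m, const peacock_genotype &f,
                           const peacock_genotype &initial, double mutation_rate,
                           std::mt19937 &rng) {
    const double fraction = 0.2;  // mutation step: 20% of the initial value
    peacock_genotype child;
    child.constitution = inherit_gene(m.constitution, f.constitution,
                                      fraction * initial.constitution, mutation_rate, rng);
    child.tail_length = inherit_gene(m.tail_length, f.tail_length,
                                     fraction * initial.tail_length, mutation_rate, rng);
    child.preference = inherit_gene(m.preference, f.preference,
                                    fraction * initial.preference, mutation_rate, rng);
    return child;
}
```

Calling reproduce with a mutation_rate of 15.0 corresponds to the setting used in these experiments. Note that both parents carry all three genes, which is how the dormant genes (tail length in females, preference in males) are passed on.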
5.1 Results
Figure 5-1 shows a typical result obtained when the health evaluator described above is used with c of
the order of 10,000.

[Figure 5-1: Male peacock tails disappearing. The graphs plot the average value of each gene for each
gender in each breeding cycle. Panels: Female Peacocks and Male Peacocks, each plotting Tail Length and Tail Preference.]

The graphs indicate a downward trend in both tail length (in males) and tail length
preference (in females), supporting the theory that genetically-based sexual preference in females can
dominate simple survival considerations. Even though the optimal tail length is 5, when the average tail
length preference in females drops below 0.5, the males respond with tails shorter than the optimum.
The effect is that males with short tails pass their genes onto the next generation and the short-tail
gene becomes dominant in the peacock community. The fact that short-tailed males are more successful
than long-tailed males means that it becomes vital for females to select mates with short tails in order
for their male offspring to also be successful. The two genes therefore exert pressure on each other and
their values decrease until male tails have disappeared. The dormant female tail length gene follows the
trend of the active male tail length gene, even though this gene has no effect on the female. Similarly,
the dormant tail length preference gene in males follows the active tail length preference gene in females.
The overriding consideration is that the success of a peacock depends not only on its own fitness, but
also on the fitness of its offspring (and their offspring, etc.).
Figure 5-2 shows the alternative result, the development of male peacocks with extravagantly long
tails.

[Figure 5-2: Defying natural selection? Male peacocks evolve long tails. Panels: Female Peacocks (Tail Preference) and Male Peacocks (Tail Length).]

This outcome is also predicted by the theory, for the same reasons as the previous scenario. The
genetic bases of tail length in male peacocks and tail-based sexual preference in female peacocks lead
to a situation where the values of these genes exert pressure on each other and advance together at a
geometrically-increasing rate, even at a substantial survival cost to the males.
However, this upward trend in tail length cannot continue unbounded. At some point, the consideration
of survival overcomes the consideration of sexual attractiveness, to the extent that males with tails
which are "too long" do not even survive to maturity and hence do not participate in reproduction
at all. When this occurs, males with shorter tails pass on a greater proportion of genes to the next
generation, forcing tail length to stabilize.
6 Conclusions
We have defined a general framework for the process of evolution which abstracts the invariant parts
of the evolutionary process and allows a programmer to model particular situations by instantiating
the variant parts of the process. We use the inheritance provided by the C++ class structure to define
particular species as derivations of general base organisms. This allows the programmer to focus easily
on the aspects of the process which are deemed relevant in a particular experiment.
The peacock model demonstrates the use of the evolutionary framework. Fisher's theory submits that
apparently counter-adaptive phenotypic traits (i.e. male peacocks' tails) can arise simply due to the
fact that both sexual attractiveness in males and sexual preference in females have a genetic basis. We
can simulate the salient aspects of this scenario quickly and easily, by deriving minimal genotypes and
phenotypes of male and female peacocks. The general framework manages the interactions between
these entities and all other aspects of the simulation. Our experiments with this model confirm that
Fisher's theory is adequate to explain the observed development of peacocks.
We have also used our framework to develop models of outlaw genes and genes which control the
mutation process.
7 Future Directions for the Generalized Model
Future extensions to the model will include:
allowance for interaction between phenotypes and their environment, and
allowance for interaction between different species.
At this stage, phenotype interaction with the environment can be modelled only by incorporating probabilistic measures in the overridden functions. For example, the function determining the probability
of death of a phenotype may incorporate a probability of a food shortage occurring, thus making the
probability of death for each phenotype slightly greater. This is sufficient for simple models, but more
complex models require an explicit representation of the environment in which species evolve. We intend
to define a new base class where the environment is modelled as a grid of cells, each with attributes
describing the state of the environment in its locality. The user-defined functions will have the ability
to examine and/or modify these attributes.
We also intend to extend the framework to support populations of several species, allowing us to study
the interactions between different species, for example predator/prey scenarios and parasitic/symbiotic
dependencies.
Bibliography
1. R. Dawkins. The Evolution of Evolvability, Artificial Life, SFI Studies in the Sciences of Complexity,
Ed. C. Langton, Addison-Wesley, 1988.
2. R. Dawkins. The Selfish Gene, Oxford University Press, 1976.
3. R.A. Fisher. The Genetical Theory of Natural Selection, Oxford Clarendon Press, 1930.
4. S. Windybank. Wild Sex, Reed Books, 1991.