Lecture 16
Genetic Algorithms
The evolutionary approach to Artificial Intelligence is
one of the neatest ideas of all. We have tried to mimic
the functioning of the brain through neural networks,
because - even though we don't know exactly how it
works - we know that the brain does work. Similarly,
we know that mother nature, through the process of
evolution, has solved many problems, for instance the
problem of getting animals to walk around on two feet
(try getting a robot to do that - it's very difficult). So, it
seems like a good idea to mimic the processes of
reproduction and survival of the fittest to try to evolve
answers to problems, and maybe in the long run reach
the holy grail of computers which program themselves
by evolving programs. Evolutionary approaches are
simple in conception:

- generate a population of possible answers to the problem at hand
- choose the best individuals from the population (using methods inspired by survival of the fittest)
- produce a new generation by combining these best ones (using techniques inspired by reproduction)
- stop when the best individual of a generation is good enough (or you run out of time)
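These four steps can be arranged as a simple loop. As a rough sketch only (not code from the lecture), the Python outline below shows one way to do it; the functions generate_population, fitness, breed and good_enough are placeholders assumed purely for illustration.

```python
def evolve(generate_population, fitness, breed, good_enough, max_generations=100):
    """Generic outline of an evolutionary search. All four function arguments
    are placeholders assumed purely for illustration."""
    population = generate_population()               # 1. generate possible answers
    best = max(population, key=fitness)
    for _ in range(max_generations):
        if good_enough(best):                        # 4. stop when good enough
            break
        population = breed(population, fitness)      # 2+3. select the fittest and
                                                     #      combine them to reproduce
        best = max([best] + population, key=fitness) # remember the best seen so far
    return best
```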
Perhaps the first landmark in the history of the
evolutionary approach to computing was John
Holland's book "Adaptation in Natural and Artificial
Systems", where he developed the idea of the genetic
algorithm as searching via sampling hyperplane
partitions of the space. It's important to remember
that genetic algorithms (GAs), which we look at in this
lecture, and genetic programming (GP), which we look
at in the next lecture, are just fancy search
mechanisms which are inspired by evolution. In fact,
using Tom Mitchell's definition of a machine learning
system being one which improves its performance
through experience, we can see that evolutionary
approaches can be classed as machine learning efforts.
Historically, however, it has been more common to
categorise evolutionary approaches together because
of their inspiration rather than their applications (to
learning and discovery problems).
As we will see, evolutionary approaches boil down to
(i) specifying how to represent possible problem
solutions and (ii) determining how to choose which
partial solutions are doing the best with respect to
solving the problem. The main difference between
genetic algorithms and genetic programming is the
choice of representation for problem solutions. In
particular, with genetic algorithms, the format of the
solution is fixed, e.g., a fixed set of parameters to find,
and the evolution occurs in order to find good values
for those parameters. With genetic programming,
however, the individuals in the population of possible
solutions are actually individual programs which can
increase in complexity, so are not as constrained as in
the genetic algorithm approach.
16.1 The Canonical Genetic Algorithm
As with all search techniques, one of the first
questions to ask with GAs is how to define a search
space which potentially contains good solutions to the
problem at hand. This means answering the question
of how to represent possible solutions to the problem.
The classical approach to GAs is to represent the
solutions as strings of ones and zeros, i.e., bit strings .
This is not such a bad idea, given that computers store
everything as bit strings, so any solution would
eventually boil down to a string of ones and zeros.
However, there have been many modifications to the
original approach to genetic algorithms, and GA
approaches now come in many different shapes and
sizes, with higher level representations. Indeed, it's
possible to see genetic programming, where the
individuals in the population are programs, as just a
GA approach with a more complicated representation
scheme.
Returning to the classical approach, as an example, if
solving a particular problem involved finding a set of
five integers between 1 and 100, then the search
space for a GA would be bit strings of length 40, where
the first eight bits are decoded as the first integer, the
next eight bits become the second integer, and so on.
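A minimal sketch of such an encoding in Python, assuming the eight-bits-per-integer layout described above (the function names are inventions for this example, not part of the lecture):

```python
import random

BITS_PER_INT = 8    # assumed layout: each integer occupies eight bits
NUM_INTS = 5        # five integers between 1 and 100

def random_individual():
    """Generate a random bit string of length L = 5 * 8 = 40."""
    return [random.randint(0, 1) for _ in range(BITS_PER_INT * NUM_INTS)]

def decode(bits):
    """Decode a 40-bit individual into the five integers it represents."""
    values = []
    for i in range(NUM_INTS):
        chunk = bits[i * BITS_PER_INT:(i + 1) * BITS_PER_INT]
        values.append(int("".join(str(b) for b in chunk), 2))
    return values

ind = random_individual()
print(decode(ind))    # note: raw eight-bit values run from 0 to 255, so values
                      # outside 1-100 would need rejecting or rescaling in practice
```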
Representing the solutions is one of the tricky parts to
using genetic algorithms, a problem we come back to
later. However, suppose that the solutions are
represented as strings of length L. Then, in the
standard approach to GAs, known as the canonical
genetic algorithm, the first stage is to generate an
initial random population of bit strings of length L. By
random, we mean that the ones and zeros in the
strings are chosen at random. Sometimes, though rarely, the
initialisation procedure is done with a little more
intelligence, e.g., using some additional knowledge
about the domain to choose the initial population.
After the initialisation step, the canonical genetic
algorithm proceeds iteratively using selection, mating,
and recombination processes, then checking for
termination. This is portrayed in the following
diagram:
In the next section, we look in detail at how individuals
are selected, mated, recombined (and mutated for
good measure). Termination of the algorithm may
occur if one or more of the best individuals in the
current generation performs well enough with respect
to the problem, with this performance specified by the
user. Note that this termination check may be related
to, or the same as, the evaluation function - discussed
later - but it may be something entirely different.
There may not be a definitive answer to the problem
you're looking at, and it may only be possible to evolve
solutions which are as good as possible. In this case, it
may not be obvious when to stop, and moreover, it
may be a good idea to produce as many populations
as possible given the computing/time resources you
have available. In this case, the termination function
may be a specific time limit or a specific number of
generations. It is very important to note that the best
individual in your final population may not be as good
as the best individual in a previous generation (GAs do
not perform hill-climbing searches, so it is perfectly
possible for generations to degrade). Hence GAs
should record the best individuals from every
generation, and, as a final solution presented to the
user, they should output the best solution found over
all the generations.
16.2 Selection, Mating, Recombination and Mutation
So, the point of GAs is to generate population after
population of individuals which represent possible
solutions to the problem at hand in the hope that one
individual in one generation will be a good solution.
We look here at how to produce the next generation
from the current generation. Note that there are
various models for whether to kill off the previous
generation, or allow some of the fittest individuals to
stay alive for a while - we'll assume a culling of the old
generation once the new one has been generated.

Selection
The first step is to choose the individuals which will
have a shot at becoming the parents of the next
generation. This is called the selection procedure, and
its purpose is to choose those individuals from the
current population which will go into an intermediate
population (IP). Only individuals in this intermediate
population will be chosen to mate with each other
(and there's still no guarantee that they'll be chosen to
mate, or that if they do mate, they will be successful - see later).
To perform the selection, the GA agent will require a
fitness function. This will assign a real number to each
individual in the current generation. From this value,
the GA calculates the number of copies of the
individual which are guaranteed to go into the
intermediate population and a probability which will
be used to determine whether an additional copy goes
into the IP. To be more specific, the value calculated
by the fitness function is split into its integer part and
its fractional part: the integer part dictates the number
of copies of the individual which are guaranteed to go
into the IP, and the fractional part is used as the
probability with which one further copy of the
individual is added to the IP. For example, if the
fractional part was 1/6, then a random number
between 1 and 6 would be generated, and only if it
came up six would another copy be added.
The fitness function will use an evaluation function to
calculate a value of worth for the individual so that
they can be compared against each other. Often the
evaluation function is written g(c) for a particular
individual c. Correctly specifying such evaluation
functions is a tricky job, which we look at later. The
fitness of an individual is calculated by dividing the
value it gets for g by the average value for g over the
entire population:
fitness(c) = g(c)/(average of g over the entire
population)
We see that every individual has at least a chance of
going into the intermediate population unless they
score zero for the evaluation function.
As an example of a fitness function using an evaluation
function, suppose our GA agent has calculated the
evaluation function for every member of the
population, and the average is 17. Then, for a
particular individual c0, the value of the evaluation
function is 25. The fitness function for c0 would be
calculated as 25/17 = 1.47. This means that one copy
of c0 will definitely be added to the IP, and another
copy will be added with a probability of 0.47 (e.g., a
100-sided die is thrown, and only if it returns 47 or less
is another copy of c0 added to the IP).
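A hedged sketch of this selection scheme in Python appears below; the evaluation function g is assumed to be supplied by the user, and the function name is invented for this example.

```python
import random

def select_intermediate_population(population, g):
    """Build the intermediate population (IP) using fitness(c) = g(c) / average g.
    The integer part of the fitness gives guaranteed copies; the fractional
    part is the probability of one further copy."""
    average = sum(g(c) for c in population) / len(population)
    intermediate = []
    for c in population:
        fitness = g(c) / average                 # e.g. 25 / 17 = 1.47
        copies = int(fitness)                    # one guaranteed copy
        if random.random() < fitness - copies:   # a second copy with probability 0.47
            copies += 1
        intermediate.extend([c] * copies)
    return intermediate
```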

Mating
Once our GA agent has chosen the individuals lucky
enough (actually, fit enough) to produce offspring, we
next determine how they are going to mate with each
other. To do this, pairs are simply chosen randomly
from the set of potential parents. That is, one
individual is chosen randomly, then another - which
may be the same as the first - is chosen, and that pair
is lined up for the reproduction of one or more
offspring (dependent on the recombination
techniques used). Then whether or not they actually
reproduce is probabilistic, and occurs with a
probability pc. If they do reproduce, then their
offspring are generated using a recombination and
mutation procedure as described below, and these
offspring are added to the next generation. This
continues until the required number of offspring has
been produced. Often this required number is the
same as the current population size, to
keep the population size constant. Note that there are
repeated individuals in the IP, so some individuals may
become the proud parent of multiple children.
This mating process has some analogy with natural
evolution, because sometimes the fittest organisms
may not have the opportunity to find a mate, and
even if they do find a mate, it's not guaranteed that
they will be able to reproduce. However, the analogy
with natural evolution also breaks down here, because
individuals can mate with themselves and there is no
notion of sexes.
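Something like the following sketch captures the mating step; the value 0.7 for pc is purely illustrative, and recombine is assumed to be a function returning one or more offspring (recombination itself is covered in the next subsection).

```python
import random

def mate(intermediate_population, recombine, p_c=0.7, required_offspring=None):
    """Randomly pair up parents from the intermediate population; each pair
    reproduces with probability p_c (0.7 is an arbitrary illustrative value)."""
    if required_offspring is None:
        required_offspring = len(intermediate_population)  # keep the size constant
    offspring = []
    while len(offspring) < required_offspring:
        mum = random.choice(intermediate_population)
        dad = random.choice(intermediate_population)       # may be the same as mum
        if random.random() < p_c:                          # do they actually reproduce?
            offspring.extend(recombine(mum, dad))          # one or more children
    return offspring[:required_offspring]
```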

Recombination
During the selection and mating process, the GA
repeatedly lines up pairs of individuals for
reproduction. The next question is how to generate
offspring from these parent individuals. This is called
the recombination process and how this is done is
largely dependent on the representation scheme
being used. We will look at three possibilities for
recombination of individuals represented as bit
strings.
The population will only evolve to be better if the best
parts of the best individuals are combined, hence
recombination procedures usually take parts from
both parents and place them into the offspring. In the
One-Point Crossover recombination process, a point is
chosen at random on the first individual, and the same
point is chosen on the second individual. This splits
both individuals into a left hand and a right hand side.
Two offspring individuals are then produced by (i)
taking the LHS of the first and adding it to the RHS of
the second and (ii) by taking the LHS of the second and
adding it to the RHS of the first. In the following
example, the crossover point is after the fifth letter in
the bit string:
Note that all the a's, b's, X's and Y's are actually ones
or zeros. We see that the length of the two children is
the same as that of the parents because GAs use a
fixed representation (remember that the bit strings
only make sense as solutions if they are of a particular
length).
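A minimal sketch of one-point crossover, assuming individuals are represented as Python lists of 0s and 1s (an assumption made for all the recombination sketches here):

```python
import random

def crossover_one_point(parent1, parent2):
    """One-point crossover: split both parents at the same random point and
    swap the right-hand sides, giving two children of the same length."""
    point = random.randint(1, len(parent1) - 1)   # e.g. after the fifth bit
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

# With the point fixed after the fifth bit:
# parent1 = aaaaa XXXXXXX, parent2 = bbbbb YYYYYYY
# child1  = aaaaa YYYYYYY, child2  = bbbbb XXXXXXX
```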
In Two-point Crossover, as you would expect, two
points are chosen in exactly the same place in both
individuals. Then the bits falling in-between the two
points are swapped to give two new offspring. For
example, in the following diagram, the two points are
after the 5th and 11th letters:
Again, the a's, b's, X's and Y's are ones or zeros, and
we see that this recombination technique doesn't alter
the string length either. As a third recombination
operator, the inversion process simply takes a
segment of a single individual and produces a single
offspring by reversing the letters in-between two
chosen points. For example:
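Under the same list-of-bits assumption, the two-point crossover and inversion operators might be sketched as follows:

```python
import random

def crossover_two_point(parent1, parent2):
    """Two-point crossover: swap the bits lying between two shared points."""
    i, j = sorted(random.sample(range(1, len(parent1)), 2))
    child1 = parent1[:i] + parent2[i:j] + parent1[j:]
    child2 = parent2[:i] + parent1[i:j] + parent2[j:]
    return child1, child2

def inversion(parent):
    """Inversion: reverse the segment of a single parent lying between two
    chosen points, producing a single offspring."""
    i, j = sorted(random.sample(range(len(parent) + 1), 2))
    return parent[:i] + parent[i:j][::-1] + parent[j:]
```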

Mutation
It may appear that the above recombinations are a
little arbitrary, especially as points defining where
crossover and inversion occur are chosen randomly.
However, it is important to note that large parts of the
string are kept intact, which means that if the string
contained a region which scored very well with the
evaluation function, these operators have a good
chance of passing that region on to the offspring
(especially if the regions are fairly small, and, like in
most GA problems, the overall string length is quite
high).
The recombination process produces a large range of
possible solutions. However, it is still possible for it to
guide the search into a local rather than the global
maximum with respect to the evaluation function. For
this reason, GAs usually perform random mutations. In
this process, the offspring are taken and each bit in
their bit string is flipped from a one to a zero or vice
versa with a given probability. This probability is
usually taken to be very small, say 0.01 or less, so
that at most about one letter in a hundred is flipped
on average.
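A sketch of this bit-flip mutation, again with the per-bit probability (0.01 here) chosen purely for illustration:

```python
import random

def mutate(bits, p_m=0.01):
    """Flip each bit independently with small probability p_m, returning a new
    individual (roughly one bit in a hundred changes on average)."""
    return [1 - b if random.random() < p_m else b for b in bits]
```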
In natural evolution, random mutations are often
highly deleterious (harmful) to the organism, because
the change in the DNA leads to big changes to the way the
body works. It may seem sensible to protect the
children of the fittest individuals in the population
from the mutation process, using special alterations to
the flipping probability distribution. However, it may
be that it is actually the fittest individuals that are
causing the population to stay in a local maximum.
After all, they get to reproduce with higher frequency.
Hence, protecting their offspring is not a good idea,
especially as the GA will record the best from each
generation, so we won't lose their good abilities
totally. Random mutation has been shown to be
effective at getting GA searches out of local maxima,
which is why it is an important part of the process.
To summarise the production of one generation from
the previous: firstly, an intermediate population is
produced by selecting copies of the fittest individuals
using probability so that every individual has at least a
chance of going into the intermediate population.
Secondly, pairs from this intermediate population are
chosen at random for reproduction (a pair might
consist of the same individual twice), and the pair
reproduce with a given fixed probability. Thirdly,
offspring are generated through recombination
procedures such as 1-point crossover, 2-point
crossover and inversion. Finally, the offspring are
randomly mutated to produce the next generation of
individuals. Individuals from the old generation may
be entirely killed off, but some may be allowed into
the next generation (alternatively, the recombination
procedure might be tuned to leave some individuals
unchanged). The following schematic gives an
indication of how the new generation is produced:
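As a summary-level sketch (not the lecture's own code), one generation step could be composed from the functions sketched in the subsections above; this assumes select_intermediate_population, crossover_one_point and mutate are already defined, and uses one-point crossover as the only recombination operator.

```python
import random

def next_generation(population, g, p_c=0.7, p_m=0.01):
    """One generation step: selection, mating, recombination, then mutation.
    Assumes select_intermediate_population, crossover_one_point and mutate
    (sketched in the subsections above) are already defined."""
    intermediate = select_intermediate_population(population, g)  # selection
    offspring = []
    while len(offspring) < len(population):                       # keep the size fixed
        mum = random.choice(intermediate)
        dad = random.choice(intermediate)                         # pairs chosen at random
        if random.random() < p_c:                                 # mating probability
            offspring.extend(crossover_one_point(mum, dad))       # recombination
    offspring = offspring[:len(population)]
    return [mutate(child, p_m) for child in offspring]            # mutation
```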
16.3 Two Difficulties
The first big problem we face when designing an AI
agent to perform a GA-style search is how to
represent the solutions. If the solutions are textual by
nature, then ASCII strings require eight bits per letter,
so the size of individuals can get very large. This will
mean that evolution may take thousands of
generations to converge onto a solution. Also, there
will be much redundancy in the bit string
representations: in general many bit strings produced
by the recombination process will not represent
solutions at all, e.g., they may represent ASCII
characters which shouldn't appear in the solution. In
the case of individuals which don't represent
solutions, how do we measure these with the
evaluation function? It doesn't necessarily follow that
they are entirely unfit, because the tweaking of a
single zero to a one might make them good solutions.
The situation is better when the solution space is
continuous, or the solutions represent real valued
numbers or integers. The situation is worse when
there are only a finite number of solutions.
The second big problem we face is how to specify the
evaluation function. This is crucial to the success of
the GA experiment. The evaluation function should, if
possible:

- Return a real-valued number, scoring higher for individuals which perform better with respect to the problem
- Be quick to calculate, as this calculation will be done many thousands of times
- Distinguish well between different individuals, i.e., give a good range of values
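As a toy illustration of these criteria (not an example from the notes), a "count the ones" evaluation function for bit strings is real-valued, very cheap to compute, and spreads individuals over a range of scores:

```python
def g(bits):
    """Toy evaluation function: score an individual by the number of 1s it
    contains. Real-valued, cheap, and gives a good spread of values."""
    return float(sum(bits))
```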
Even with a well specified evaluation function, when
populations have evolved to a certain stage, it is
possible that the individuals will all score highly with
respect to the evaluation function, so all have equal
chances of reproducing. In this case, evolution will
effectively have stopped, and it may be necessary to
take some action to spread them out (make the
evaluation function more sophisticated dynamically,
possibly).
16.4 An Example Application
There are many fantastic applications of genetic
algorithms. Perhaps my favourite is their usage in
evaluating Jazz melodies done as part of a PhD project
in Edinburgh. The one we look at here is chosen
because it demonstrates how a fairly lightweight effort
using GAs can often be highly effective. In their paper
"The Application of Artificial Intelligence to
Transportation System Design", Ricardo Hoar and
Joanne Penner describe their undergraduate project,
which involved representing vehicles on a road system
as autonomous agents, and using a GA approach to
evolve solutions to the timing of traffic lights to
increase the traffic flow in the system. The optimum
settings for when lights come on and go off are known
only for very simple situations, so an AI-style search
can be used to try and find good solutions. Hoar and
Penner chose to do this in an evolutionary fashion.
They don't give details of the representation scheme
they used, but traffic light times are real-valued
numbers, so they could have used a bit-string
representation.
The evaluation function they used involved the total
waiting time and total driving time for each car in the
system as follows:
The results they produced were good (worthy of
writing a paper). The two graphs below describe the
decrease in overall waiting time for a simple road and
for a more complicated road (albeit not amazingly
complicated).
We see that in both cases, the waiting time has
roughly halved, which is a good result. In the first case,
for the simple road system, the GA evolved a solution
very similar to the ones worked out to be optimal by
humans. We see that GAs can be used to find good
near-optimal solutions to problems where a more
cognitive approach might have failed (i.e., humans still
can't work out how best to tune traffic light times, but
a computer can evolve a good solution).
16.5 Using Genetic Algorithms
If you go on to read more about evolutionary
approaches, then you may encounter terminology
which makes the analogy to natural evolutionary
processes at the species level and at the genetics level
(remember that it is the evolution of our genetic
material that has shaped our evolution as a species).
This analogy is as follows:
GA terminology     | Genetics Terminology | Species Terminology
-------------------|----------------------|------------------------
Population         | DNA???               | Population
Bit string         | Chromosome           | Organism
Bit (one or zero)  | Gene                 | Genome???
Selection          | ???                  | Survival of the fittest
Recombination      | Crossover            | Inheritance
Mutation           | Mutation???          | Mutation
It is worth remembering however, that, as with
artificial neural networks, this analogy is very loose,
because people use different analogies, the analogies
don't actually fit sometimes (hence the ???s above)
and real evolutionary processes are extremely
complicated affairs.
Russell and Norvig say that, if you can answer the
following four questions, then you should consider
using a GA approach to your problem solving task:

- What is the fitness function?
- How is an individual represented?
- How are the individuals selected?
- How do individuals reproduce?
GAs are similar to ANNs and CSPs, in as much as they
might not be exactly the best (most efficient, etc.) way
to proceed, but they do provide you with a quick and
easy way of tackling the problem. Russell and Norvig
put this explicitly: "Try a quick implementation before
investing more time thinking about another
approach". As we saw with the transport application
described above (carried out by undergraduates), such
initial efforts can often produce surprisingly good
results, and it's not fair to say that GAs should only be
used as a first try. Indeed, as we shall see in the next
lecture, evolutionary approaches such as genetic
programming can sometimes rival human
performance and be the most suitable AI technique to
use.
© Simon Colton 2004