Part 1: Motivation, Basic Concepts,
Algorithms
Review of Biological Evolution
• Evolution is a long-time-scale process that changes a population of
organisms by generating better offspring through reproduction.
– Chromosome: DNA-coded information characterizing an organism.
– Gene: Elementary DNA block of information (e.g., eye color).
– Allele: One of the possible values for a gene (e.g., brown, blue, . . . ).
– Trait: The physical characteristic encoded by a gene.
– Genotype: A particular set of genes.
– Phenotype: The physical realization of a genotype (e.g., a person).
– Fitness: A measure of success in life for an organism.
– Crossover (Recombination): Chromosomes from the parents exchange
genetic material to generate new offspring.
– Mutation: An error occurring during DNA replication from the parents.
• Genetic algorithms (GAs) mimic the processes of biological
evolution in order to solve problems and to model
evolutionary systems.
• Developed by John Holland, University of Michigan
(1970’s)
– To understand the adaptive processes of natural systems.
– To develop ways in which the mechanisms of natural
adaptation might be imported into computer systems for
optimization and machine learning applications.
• Why use the mechanisms of natural evolution for
solving computational problems?
– Evolution searches among an enormous number of possible
genetic sequences, to create highly fit organisms that survive
and reproduce in their environments.
– Species evolve by means of random variation (via mutation,
recombination, and other operators), followed by natural
selection, in which the fittest tend to survive and reproduce,
thus propagating their genetic material to future
generations.
– Likewise, computational problems involve searching through
a large solution space for optimal or sub-optimal solutions.
• Optimization (e.g., circuit layout, job shop scheduling, . . . )
• Prediction (e.g., weather forecasting, protein folding, . . . )
• Classification (e.g., fraud detection, quality assessment, . . . )
• Economy (e.g., bidding strategies, market evaluation, . . . )
• Ecology (e.g., biological arms races, host-parasite coevolution, . . . )
• Automatic programming.
• Best suited for:
– Large, non-convex search spaces.
– Finding a good sub-optimum in a reasonable time, rather than
spending years on finding, perhaps, the best solution.
• GAs have the following elements in common:
– Populations of chromosomes.
– Selection according to fitness.
– Crossover to produce new offspring.
– Random mutation of new offspring.
• A single solution is represented as a vector of components.
– The vector is called a chromosome.
– The components are called genes.
• There is a population of chromosomes (solutions) that cooperatively act
towards a common goal.
• The GA is an evolutionary or iterative algorithm that modifies the
population of solutions at each epoch (cycle, iteration) of the algorithm.
– The modification is done by crossover and mutation.
{
    initialize population;
    measure fitness of the population;
    do while Termination Criteria Not Satisfied
    {
        select parents for reproduction;
        perform recombination and mutation;
        measure fitness of the population;
    }
}
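The loop above can be sketched in Python. This is a minimal skeleton, not a definitive implementation: the fitness function, population initializer, and operators are placeholders supplied by the caller, and the fixed epoch budget stands in for the termination criterion.

```python
import random

def genetic_algorithm(fitness, init_population, select, crossover, mutate,
                      max_epochs=100):
    # initialize population; measure fitness of the population
    population = init_population()
    scores = [fitness(c) for c in population]
    for _ in range(max_epochs):   # termination criterion: fixed epoch budget
        next_population = []
        while len(next_population) < len(population):
            # select parents for reproduction
            mother = select(population, scores)
            father = select(population, scores)
            # perform recombination and mutation
            child1, child2 = crossover(mother, father)
            next_population += [mutate(child1), mutate(child2)]
        population = next_population[:len(population)]
        # measure fitness of the population
        scores = [fitness(c) for c in population]
    best = max(range(len(population)), key=lambda i: scores[i])
    return population[best]
```

With bit-string chromosomes and `sum` as the fitness function, this skeleton maximizes the number of 1 bits, a standard toy problem.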
• Chromosomes could be:
– Bit strings
(0101 ... 1100)
– Integers
(7 5 ... 1 99)
– Real numbers
(43.2 -33.1 ... 0.0 89.2)
– Permutations of elements
(E11 E3 E7 ... E1 E15)
– Lists of rules
(R1 R2 R3 ... R22 R23)
– Program elements
(genetic programming)
– ... any data structure ...
Parameter Estimation Example
• Find
such that
• Chromosome with genes (i.e., representation of a solution):
• Population of solutions:
Travelling Salesperson Problem
• Find a tour of a given set of cities so that each city is
visited only once and the total distance traveled is
minimized.
• Chromosome with genes (i.e., representation of a
solution):
• Population of solutions:
• Crossover or recombination is the GA's distinguishing feature.
• It involves mixing and matching parts of two parents to form children.
• Crossover was originally based on the premise that highly fit individuals
often share certain traits, called building blocks, in common.
• For fixed-length vector individuals, a building block was often defined as a
collection of genes set to certain values.
• For example, perhaps two parameters both need to be small values in
a parameter estimation problem (each should lie in some range).
• For example, in the Boolean individual 10110101, perhaps ***101*1 might
be a building block (where the * positions aren’t part of the building block).
• How you do that mixing and matching
depends on the representation of the
individuals and the objective of the
optimization problem.
• Some crossover techniques are just based on
the representation of the chromosomes, and
ignore the optimization objective, while
others include both.
• In representation-only techniques, there are three
classic ways of doing crossover on vectors:
• One-Point, Two-Point, and Uniform
Crossover.
• One-point crossover picks a random number c between 1 and l (the
vector length), inclusive, and swaps all the indexes from c to l.
1: v ← 1st vector to be crossed over
2: w ← 2nd vector to be crossed over
3: c ← random integer chosen uniformly from 1 to l inclusive
4: if c ≠ 1 then
5:     for i from c to l do
6:         Swap the values of v_i and w_i
7: return v and w
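A direct Python translation of the one-point crossover pseudocode (a sketch; the 1-based positions of the pseudocode map onto Python's 0-based slices):

```python
import random

def one_point_crossover(v, w, rng=random):
    """Pick a cut point c uniformly from 1 to l inclusive and swap
    the values at positions c..l of the two equal-length vectors."""
    l = len(v)
    c = rng.randint(1, l)           # random integer from 1 to l inclusive
    v, w = list(v), list(w)
    if c != 1:                      # c == 1 would merely exchange the parents
        # positions c..l (1-based) are indexes c-1..l-1 (0-based)
        v[c - 1:], w[c - 1:] = w[c - 1:], v[c - 1:]
    return v, w
```

At every position, each child carries the gene of exactly one parent, so crossing `[1,2,3,4,5]` with `[6,7,8,9,10]` always yields two vectors that jointly preserve both parents' genes position by position.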
• The problem with one-point crossover is that it may break important linkages between
components.
• Notice that the probability is high that v_1 and v_l will be broken up due to crossover,
as any choice of c will do it, except for c = 1.
• If the organization of your vector was such that elements v_1 and v_l had to work well in
tandem in order to get a high fitness, you'd be constantly breaking up good pairs that the
system discovered.
•
Two-point crossover is one way to alleviate the linkage problem.
•
Just pick two cut points c and d, and swap the indexes between them.
•
Think of the vectors as rings to understand how the endpoints don’t get broken.
•
However, the probability of swapping indexes is still not uniform, or perfectly fair, which
might be required in some applications.
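A sketch of two-point crossover in the same style (the convention here, an assumption, is that the cut points are ordered c ≤ d and the segment at positions c..d−1 is swapped, so the two ends of the vector stay linked as on a ring):

```python
import random

def two_point_crossover(v, w, rng=random):
    """Pick two cut points and swap the segment between them; genes
    before c and from d onward stay with their original parent, so
    the endpoints are never separated from each other."""
    l = len(v)
    c, d = sorted((rng.randint(1, l), rng.randint(1, l)))
    v, w = list(v), list(w)
    if c != d:
        # swap positions c..d-1 (1-based), i.e. indexes c-1..d-2 (0-based)
        v[c - 1:d - 1], w[c - 1:d - 1] = w[c - 1:d - 1], v[c - 1:d - 1]
    return v, w
```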
• We can treat all genes fairly with respect to linkage by crossing over each
point independently of one another, using Uniform Crossover.
• Here we simply march down the vectors, and swap individual indexes if a
coin toss comes up heads with probability p.
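Uniform crossover is the shortest of the three to sketch; each position is swapped independently with probability p:

```python
import random

def uniform_crossover(v, w, p=0.5, rng=random):
    """March down the vectors and swap each position independently
    with probability p, treating every gene the same with respect
    to linkage."""
    v, w = list(v), list(w)
    for i in range(len(v)):
        if rng.random() < p:        # the coin toss comes up heads
            v[i], w[i] = w[i], v[i]
    return v, w
```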
Crossover Incapable of Exploring
Entire Solution Space
•
If you cross over two vectors you can’t get every conceivable
vector out of it.
•
Imagine your vectors were points in space.
•
Now imagine the hypercube formed with those points at its
extreme corners.
•
Crossovers will result in new vectors which lie at some other
corner of the hypercube.
•
For example: the possible crossovers of the two vectors
(1,2,3) and (4,5,6) are (4,5,3), (4,2,6), (1,5,6), (4,2,3), (1,2,6),
and (1,5,3). As such, other vectors are not possible, such as
(1,1,1).
•
Thus, to make GAs have a chance to “explore” a wider search
space, another form of change is required (i.e., Mutation).
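The enumeration in the example can be checked mechanically. This sketch generates every child any crossover could produce by taking, at each position, the gene from one parent or the other; the results are exactly the corners of the hypercube spanned by the two parents:

```python
from itertools import product

def all_crossover_children(v, w):
    """Every vector reachable by swapping any subset of positions
    between parents v and w: the hypercube corners."""
    return {tuple(w[i] if pick else v[i] for i, pick in enumerate(mask))
            for mask in product([0, 1], repeat=len(v))}

corners = all_crossover_children((1, 2, 3), (4, 5, 6))
```

For the parents (1, 2, 3) and (4, 5, 6), this yields the six children listed above plus the two parents themselves, and nothing else; (1, 1, 1) is unreachable, which is why mutation is needed.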
• Mutation allows the algorithm to explore the solution space more than that
allowed by crossover.
– It provides genetic diversity from one generation of a population of genetic
algorithm chromosomes to the next.
• Mutation alters one or more gene values in a chromosome from its initial
state.
– The larger the number of gene values that are mutated, the larger the
region of the solution space that may be searched.
– Mutation occurs during evolution according to a user-definable mutation
probability. This probability should be set low. If it is set too high, the search
will turn into a primitive random search.
• Example: Single point mutation
– A random number is drawn to choose which gene in a chromosome should be
randomly altered.
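A sketch of single-point mutation for a discrete-valued chromosome; the set of allowed allele values is an assumption supplied by the caller:

```python
import random

def single_point_mutation(chromosome, alleles, rng=random):
    """Draw a random gene position and replace its value with a
    random choice from the allowed allele set."""
    mutant = list(chromosome)
    i = rng.randrange(len(mutant))
    mutant[i] = rng.choice(alleles)
    return mutant
```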
More Controlled form of Mutation:
Line Recombination
These linear combinations
result in points that lie on
the line joining the two
initial points.
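A minimal sketch of line recombination for real-valued vectors, assuming the simple variant in which one blend factor t in [0, 1] is drawn and applied to every gene, so both children lie on the segment joining the parent points:

```python
import random

def line_recombination(v, w, rng=random):
    """Children are the linear blends t*v + (1-t)*w and
    (1-t)*v + t*w of the two parent points."""
    t = rng.random()
    child1 = [t * a + (1 - t) * b for a, b in zip(v, w)]
    child2 = [(1 - t) * a + t * b for a, b in zip(v, w)]
    return child1, child2
```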
• How do we select parents for crossover?
• Selection of parents (chromosomes) for crossover should not
be done randomly, but should be done in a way that is
focused on achieving the common goal.
• For example, selection should be based on fitness.
– The probability of being selected as a parent is proportional to
fitness.
• Two possible selection methods (there are others):
– Tournament selection.
– Roulette-wheel selection.
• Divide wheel into subintervals,
one for each individual in the
current generation.
• Interval length is proportional
to individual’s fitness.
• Uniformly distributed random
number chooses the subinterval
(i.e., parent 1).
• Do again for parent 2.
Source: http://www.edc.ncl.ac.uk/assets/
hilite_graphics/rhjan07g02.png
Roulette-wheel Implementation
Population Member #    Fitness Value    Fitness CDF
        1                    5               5
        2                    2               7
        3                    4              11
        4                    2              13
        5                    8              21
        6                    2              23
        7                    1              24
        8                    4              28
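The CDF column drives the implementation: accumulate the fitness values, draw a uniform random number over the total, and select the member whose subinterval contains it. A sketch:

```python
import bisect
import itertools
import random

def roulette_select(population, fitnesses, rng=random):
    """Build the fitness CDF and return the individual whose
    subinterval of the wheel contains a uniform random draw."""
    cdf = list(itertools.accumulate(fitnesses))
    r = rng.uniform(0, cdf[-1])
    return population[bisect.bisect_left(cdf, r)]
```

With the fitness values 5, 2, 4, 2, 8, 2, 1, 4 from the table, `itertools.accumulate` reproduces the CDF column 5, 7, 11, 13, 21, 23, 24, 28, and the member with fitness 8 is selected most often.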
Algorithm 30:
Roulette-wheel Selection
Stochastic Universal Sampling
• A problem with Roulette-wheel selection is that, due to
the random nature of the selection, even the fittest
individual may never be chosen.
• In Stochastic Universal Sampling, the selection is
biased so that fit individuals always get picked at
least once.
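A sketch of stochastic universal sampling: instead of n independent spins, n evenly spaced pointers are placed on the fitness wheel with a single random offset, so any member whose fitness exceeds total/n is guaranteed at least one pick.

```python
import itertools
import random

def stochastic_universal_sampling(population, fitnesses, n, rng=random):
    """One random offset, then n pointers spaced total/n apart;
    walk the CDF once, collecting the member under each pointer."""
    cdf = list(itertools.accumulate(fitnesses))
    step = cdf[-1] / n
    offset = rng.uniform(0, step)
    picks, i = [], 0
    for k in range(n):
        pointer = offset + k * step
        while cdf[i] < pointer:
            i += 1
        picks.append(population[i])
    return picks
```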
Algorithm 31:
Stochastic Universal Sampling
• Tournament selection involves running “tournaments” among a number of
individuals chosen at random from the current generation (population).
• The winner of each tournament (the one with the best fitness) is selected.
• Choose a tournament size, t.
• Randomly select t chromosomes (solutions) from the current generation,
and choose the most fit of these to be the “mother design.”
• Randomly select t chromosomes (solutions) from the current generation,
and choose the most fit of these to be the “father design.”
Algorithm 32:
Tournament Selection
• Randomly select t chromosomes (designs) from the current
generation, and choose the most fit of these to be the chosen one.
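Tournament selection fits in a few lines. This sketch assumes the common with-replacement variant, in which the t entrants are chosen independently:

```python
import random

def tournament_select(population, fitness, t=2, rng=random):
    """Run a tournament among t randomly chosen individuals and
    return the fittest entrant; t controls the selection pressure."""
    entrants = [rng.choice(population) for _ in range(t)]
    return max(entrants, key=fitness)
```

Calling it once per parent gives the mother and father designs.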
Selection Bias (Fitness Pressure)
• Tournament size controls selection bias (fitness
pressure).
– The greater the tournament size, the higher the pressure
in the algorithm for selecting individuals of higher fitness.
– Extreme case #1:
• If the tournament size is equal to the generation size, the most fit
solution in the current generation would always be selected as both
the mother and father.
– Extreme case #2:
• If the tournament size is one, fitness is completely ignored and the
mother and father are selected randomly.
Tournament Selection Benefits
• Tournament Selection has become the primary
selection technique used for the Genetic
Algorithm.
• Efficient to code.
• Works on parallel architectures.
• Allows the selection pressure to be easily adjusted.
• The most popular setting is t = 2.
• The elitist Genetic Algorithm injects the fittest
individual(s) from the previous population into the
next population.
• These individuals are called the elites.
• By keeping the best individual(s) around in future
populations, this algorithm is exploitive.
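One way to implement the injection (a sketch; here the previous generation's n fittest individuals overwrite the next generation's n least-fit members):

```python
def inject_elites(prev_pop, prev_scores, next_pop, next_scores, n_elites=1):
    """Copy the n fittest individuals of the previous generation into
    the next one, replacing its n weakest members."""
    elites = sorted(range(len(prev_pop)),
                    key=lambda i: prev_scores[i], reverse=True)[:n_elites]
    weakest = sorted(range(len(next_pop)),
                     key=lambda i: next_scores[i])[:n_elites]
    next_pop = list(next_pop)
    for e, w in zip(elites, weakest):
        next_pop[w] = prev_pop[e]
    return next_pop
```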
Algorithm 33:
Genetic Algorithm with Elitism
Hybrid Optimization Algorithms
• There are many ways to create hybrids of various
metaheuristics algorithms, such as a hybrid of
evolutionary computation and hill-climbing.
• Example:
– Augment an Evolutionary Algorithm with some hill-climbing
during the fitness assessment phase to revise each individual
as it is being assessed.
– The revised individual replaces the original one in the
population.
Algorithm 36:
Steady-state Genetic Algorithm
• Designed for multidimensional real-valued spaces.
• Children must compete directly against their immediate
parents for inclusion in the population.
• The size of mutations is based on the current variance in the
population.
– If the population is spread out, mutation will make major changes.
– If the population is condensed in a certain region, mutations will be
small.
• Differential evolution is an adaptive mutation algorithm.
• The idea is to mutate away from one of three
chosen individuals by adding a vector to it.
• This vector is created from the difference
between the other two individuals.
• If the population is spread out, those two
individuals are likely to be far from one another
and this mutation vector is large; otherwise it is
small.
• If the child is better than the parent, it
replaces the parent in the original population,
else the child is thrown away.
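A sketch of the differential mutation step and the parent-vs-child replacement. The scaling factor F applied to the difference vector is a standard differential evolution parameter (commonly set between 0.5 and 1.0); the names below are illustrative:

```python
def de_mutant(a, b, c, f=0.9):
    """Mutate away from individual a by adding the scaled difference
    of b and c; the step size tracks the population's spread."""
    return [ai + f * (bi - ci) for ai, bi, ci in zip(a, b, c)]

def de_replace(parent, child, fitness):
    """The child competes directly against its parent: it takes the
    parent's slot only if it is at least as fit."""
    return child if fitness(child) >= fitness(parent) else parent
```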
[1] J. D. Hedengren, "Optimization Techniques in Engineering," 5 April 2015. [Online]. Available:
http://apmonitor.com/me575/index.php/Main/HomePage. [Accessed 27 April 2015].
[2] A. R. Parkinson, R. J. Balling and J. D. Heden, "Optimization Methods for Engineering Design
Applications and Theory," Brigham Young University, 2013.
[3] S. Luke, "Essentials of Metaheuristics," [Online]. Available:
http://cs.gmu.edu/~sean/book/metaheuristics/Essentials.pdf. [Accessed 11 May 2015].
[4] J. D. Hedengren, "Genetic Algorithms in Engineering Design," [Online]. Available:
http://apmonitor.com/me575/uploads/Main/chap5_genetic_algorithms.pdf. [Accessed 28
April 2015].