Introduction to Genetic Algorithms
Son Kuswadi
Robotics and Automation Based on Biologically Inspired Technology (RABBIT) Research Groups
Electronic Engineering Polytechnic Institute of Surabaya (EEPIS)
Institut Teknologi Sepuluh Nopember (ITS)
Road Map
• Basic Idea
• Quick Review
• Classes of Search Techniques
• Simple GA
• State of The Art
Basic Idea
• Suppose you have a problem
• You don’t know how to solve it
• What can you do?
• Can you use a computer to somehow
find a solution for you?
• This would be nice! Can it be done?
Basic Idea
• A dumb solution
A “blind generate and test” algorithm:
Repeat
Generate a random possible solution
Test the solution and see how good it is
Until solution is good enough
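A minimal Python sketch of this blind generate-and-test loop; the bit-string length, the score function and the "good enough" threshold are placeholders, not something fixed by the slides:

import random

def blind_search(score, good_enough, n_bits=10, max_tries=100_000):
    # Blind generate and test: keep sampling random bit strings until one
    # scores well enough, or we run out of tries.
    best = None
    for _ in range(max_tries):
        candidate = [random.randint(0, 1) for _ in range(n_bits)]
        s = score(candidate)
        if best is None or s > best[1]:
            best = (candidate, s)
        if s >= good_enough:
            break
    return best

# Toy usage: the "score" is simply the number of 1-bits (purely illustrative).
solution, fitness = blind_search(score=sum, good_enough=10)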
Basic Idea
Can we use this dumb idea?
• Sometimes - yes:
– if there are only a few possible solutions
– and you have enough time
– then such a method could be used
• For most problems - no:
– many possible solutions
– with no time to try them all
– so this method can not be used
Basic Idea
• Dumb Idea… but it works…
Basic Idea
A “less-dumb” idea (GA)
Generate a set of random solutions
Repeat
Test each solution in the set (rank them)
Remove some bad solutions from set
Duplicate some good solutions
make small changes to some of them
Until best solution is good enough
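A compact Python sketch of this "less-dumb" loop: keep the better half of the set, duplicate it, and mutate the copies. The score function is again a placeholder:

import random

def less_dumb_search(score, good_enough, n_bits=10, pop_size=20, generations=200):
    # Generate a set of random solutions
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Test each solution in the set (rank them)
        pop.sort(key=score, reverse=True)
        if score(pop[0]) >= good_enough:
            break
        # Remove some bad solutions, duplicate some good solutions...
        survivors = pop[:pop_size // 2]
        children = [s[:] for s in survivors]
        # ...and make small changes to some of them (mutation)
        for child in children:
            i = random.randrange(n_bits)
            child[i] = 1 - child[i]
        pop = survivors + children
    return max(pop, key=score)

best = less_dumb_search(score=sum, good_enough=10)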
Basic Idea
How do you encode a solution?
• Obviously this depends on the problem!
• GA’s often encode solutions as fixed length
“bitstrings” (e.g. 101110, 111111,
000101)
• Each bit represents some aspect of the
proposed solution to the problem
• For GA’s to work, we need to be able to
“test” any string and get a “score”
indicating how “good” that solution is
Basic Idea
Silly Example - Drilling for Oil
• Imagine you had to drill for oil somewhere
along a single 1km desert road
• Problem: choose the best place on the
road that produces the most oil per day
• We could represent each solution as a
position on the road
• Say, a whole number between [0..1000]
Basic Idea
Where to drill for oil?
Solution1 = 300
Solution2 = 900
(Figure: the road, with positions 0, 500, and 1000 marked.)
Basic Idea
Digging for Oil
• The set of all possible solutions
[0..1000] is called the search space or
state space
• In this case it’s just one number but it
could be many numbers or symbols
• Often GA’s code numbers in binary
producing a bitstring representing a
solution
• In our example we choose 10 bits which
is enough to represent 0..1000
Basic Idea
Convert to binary string
        512  256  128   64   32   16    8    4    2    1
 900      1    1    1    0    0    0    0    1    0    0
 300      0    1    0    0    1    0    1    1    0    0
1023      1    1    1    1    1    1    1    1    1    1
In GA’s these encoded strings are sometimes called
“genotypes” or “chromosomes” and the individual bits are
sometimes called “genes”
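The conversion in the table can be sketched in a few lines of Python, assuming the plain unsigned binary encoding used above (most significant bit first):

def encode(position, n_bits=10):
    # Integer in [0, 2**n_bits - 1] -> bit string, most significant bit first
    return format(position, f"0{n_bits}b")

def decode(bits):
    # Bit string -> integer
    return int(bits, 2)

assert encode(900) == "1110000100"
assert encode(300) == "0100101100"
assert decode("1111111111") == 1023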
Basic Idea
Drilling for Oil
Solution1 = 300 (0100101100)
Solution2 = 900 (1110000100)
(Figure: oil output plotted against position along the road, from 0 to 1000.)
Basic Idea
We have seen how to:
• represent possible solutions as a
number
• encode a number into a binary string
• generate a score for each number given
a function of “how good” each solution
is - this is often called a fitness function
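A sketch of what such a fitness function could look like for the oil example. The oil_output function below is entirely hypothetical (in practice the score would come from survey or drilling data); only the decode-then-score pattern is the point:

def oil_output(position):
    # Hypothetical stand-in for the real measurement of oil per day at a
    # given road position; not part of the original example.
    return max(0.0, 50.0 - abs(position - 640) / 10.0)

def fitness(bits):
    # Fitness of a 10-bit chromosome: decode it to a road position, score it
    position = int(bits, 2)
    return oil_output(position) if position <= 1000 else 0.0

print(fitness("1110000100"))  # chromosome for position 900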
Basic Idea
Back to the (GA) Algorithm
Generate a set of random solutions
Repeat
Test each solution in the set (rank them)
Remove some bad solutions from set
Duplicate some good solutions
make small changes to some of them
Until best solution is good enough
Basic Idea
Replication and Mutation
• Various methods inspired by Darwinian evolution
are used to update the set or population of
solutions (or chromosomes)
• Two high scoring “parent” bit strings or
chromosomes are selected and combined
• Producing two new offspring (bit strings)
• Each offspring may then be changed randomly
(mutation)
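A minimal sketch of these two steps on bit-string chromosomes, assuming one-point crossover and bit-flip mutation (other operators are possible, as discussed later):

import random

def crossover(parent1, parent2):
    # One-point crossover: cut both parents at the same random point and
    # swap the tails, producing two offspring
    point = random.randint(1, len(parent1) - 1)
    return (parent1[:point] + parent2[point:],
            parent2[:point] + parent1[point:])

def mutate(bits, rate=0.01):
    # Flip each bit independently with a small probability
    return "".join(b if random.random() > rate else "10"[int(b)] for b in bits)

child1, child2 = crossover("1110000100", "0100101100")
child1, child2 = mutate(child1), mutate(child2)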
Basic Idea
Selecting Parents
• Many schemes are possible, so long as better-scoring chromosomes are more likely to be selected
• Score is often termed the fitness
• “Roulette Wheel” selection can be used:
– Add up the fitnesses of all chromosomes
– Generate a random number R in that range
– Select the first chromosome in the population whose fitness, added to all previous fitnesses, gives a running total of at least R
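A sketch of roulette wheel selection exactly as described above (fitness values are assumed to be non-negative):

import random

def roulette_select(population, fitnesses):
    # Pick one chromosome with probability proportional to its fitness
    total = sum(fitnesses)
    r = random.uniform(0, total)
    running = 0.0
    for chromosome, fit in zip(population, fitnesses):
        running += fit
        if running >= r:
            return chromosome
    return population[-1]  # guard against floating-point round-off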
Road Map
• Basic Idea
• Quick Review
• Classes of Search Techniques
• Simple GA
• State of The Art
Quick Review
• A class of probabilistic optimization algorithms
• Inspired by the biological evolution process
• Uses concepts of “Natural Selection” and “Genetic
Inheritance” (Darwin 1859)
• Originally developed by John Holland (1975)
• Particularly well suited for hard problems where
little is known about the underlying search space
• Widely-used in business, science and engineering
Quick Review
• Typically applied to:
– discrete optimization
• Attributed features:
– not too fast
– good heuristic for combinatorial problems
• Special Features:
– Traditionally emphasizes combining
information from good parents (crossover)
– many variants, e.g., reproduction models,
operators
Quick Review
• Holland’s original GA is now known
as the simple genetic algorithm
(SGA)
• Other GAs use different:
– Representations
– Mutations
– Crossovers
– Selection mechanisms
Road Map
• Basic Idea
• Quick Review
• Classes of Search Techniques
• Simple GA
• State of The Art
Classes of Search
Techniques
Search Techniques
• Calculus-based techniques
– Fibonacci
– Sort
• Enumerative techniques
– DFS
– BFS
– Dynamic Programming
• Guided random search techniques
– Tabu Search
– Hill Climbing
– Simulated Annealing
– Evolutionary Algorithms: Genetic Programming, Genetic Algorithms
References
C. Darwin. On the Origin of Species by Means of Natural Selection; or, the Preservation of Favoured Races in the Struggle for Life. John Murray, London, 1859.
W. D. Hillis. Co-Evolving Parasites Improve Simulated Evolution as an Optimization Procedure. Artificial Life II, vol. 10, Addison-Wesley, 1991.
J. H. Holland. Adaptation in Natural and Artificial Systems. The University of Michigan Press, Ann Arbor, Michigan, 1975.
Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag, Berlin, third edition, 1996.
M. Sipper. Machine Nature: The Coming Age of Bio-Inspired Computing. McGraw-Hill, New York, first edition, 2002.
D. E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, 1989.
Road Map
• Basic Idea
• Quick Review
• Classes of Search Techniques
• Simple GA
• State of The Art
Simple GA
produce an initial population of individuals
evaluate the fitness of all individuals
while termination condition not met do
select fitter individuals for reproduction
recombine between individuals
mutate individuals
evaluate the fitness of the modified individuals
generate a new population
End while
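One possible Python sketch of this pseudocode, assuming bit-string chromosomes, roulette wheel selection, one-point crossover and bit-flip mutation; the fitness function and all parameter values are left to the caller:

import random

def simple_ga(fitness, n_bits, pop_size=20, pc=0.7, pm=0.01, generations=100):
    # Generational simple GA over fixed-length bit strings

    def select(pop, fits):
        # roulette wheel selection
        r = random.uniform(0, sum(fits))
        running = 0.0
        for ind, f in zip(pop, fits):
            running += f
            if running >= r:
                return ind
        return pop[-1]

    # produce an initial population of individuals
    pop = ["".join(random.choice("01") for _ in range(n_bits)) for _ in range(pop_size)]
    for _ in range(generations):
        # evaluate the fitness of all individuals
        fits = [fitness(ind) for ind in pop]
        new_pop = []
        while len(new_pop) < pop_size:
            # select fitter individuals for reproduction
            p1, p2 = select(pop, fits), select(pop, fits)
            # recombine between individuals
            if random.random() < pc:
                point = random.randint(1, n_bits - 1)
                p1, p2 = p1[:point] + p2[point:], p2[:point] + p1[point:]
            # mutate individuals
            for child in (p1, p2):
                child = "".join(b if random.random() > pm else "10"[int(b)]
                                for b in child)
                new_pop.append(child)
        # generate a new population
        pop = new_pop[:pop_size]
    return max(pop, key=fitness)

# e.g. maximise x^2 over {0, ..., 31} encoded in 5 bits:
best = simple_ga(lambda bits: int(bits, 2) ** 2, n_bits=5)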
The Evolutionary Cycle
(Figure: the evolutionary cycle. The population is initiated and evaluated; selection picks parents; modification produces modified offspring; evaluation scores the offspring, which re-enter the population while deleted members are discarded.)
Simple Example
(Goldberg, 1989)
• Simple problem: max x2 over {0,1,…,31}
• GA approach:
– Representation: binary code, e.g. 01101 → 13
– Population size: 4
– 1-point xover, bitwise mutation
– Roulette wheel selection
– Random initialization
• We show one generational cycle done by hand
Selection
Crossover
Mutation
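A rough Python trace of one generational cycle on the x² problem (population size 4, roulette wheel selection, 1-point crossover, bitwise mutation). The random seed and mutation rate are arbitrary choices made only to keep the trace reproducible; the printed lists play the role of the hand-worked Selection, Crossover and Mutation tables:

import random

random.seed(1)  # arbitrary fixed seed so the trace can be reproduced

def fitness(bits):
    return int(bits, 2) ** 2  # maximise x^2 over {0, ..., 31}

pop = ["".join(random.choice("01") for _ in range(5)) for _ in range(4)]
fits = [fitness(c) for c in pop]
print("initial population:", list(zip(pop, fits)))

def spin():
    # roulette wheel: expected copies proportional to fitness
    r = random.uniform(0, sum(fits))
    running = 0.0
    for c, f in zip(pop, fits):
        running += f
        if running >= r:
            return c
    return pop[-1]

parents = [spin() for _ in range(4)]          # Selection
print("mating pool:", parents)

offspring = []
for p1, p2 in (parents[0:2], parents[2:4]):   # Crossover (1-point, pairwise)
    point = random.randint(1, 4)
    offspring += [p1[:point] + p2[point:], p2[:point] + p1[point:]]
print("after crossover:", offspring)

offspring = ["".join(b if random.random() > 0.05 else "10"[int(b)] for b in c)
             for c in offspring]               # Mutation (bit-flip)
print("after mutation:", [(c, fitness(c)) for c in offspring])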
Simple GA
• Has been subject of many (early) studies
– still often used as benchmark for novel GAs
• Shows many shortcomings, e.g.
– Representation is too restrictive
– Mutation & crossover only applicable to bit-string & integer representations
– Selection mechanism sensitive to converging populations with close fitness values
Road Map
• Basic Idea
• Quick Review
• Classes of Search Techniques
• Simple GA
• State of The Art
State of The Art
• Crossover: Alternative Operators
• Crossover vs Mutation
• Other Representation: Gray coding, integer, floating point
• Binary Representation: setting the required precision
Crossover –
Alternative Operators
• Performance with 1 Point Crossover
depends on the order that variables occur
in the representation
– more likely to keep together genes that are
near each other
– Can never keep together genes from opposite
ends of string
– This is known as Positional Bias
– Can be exploited if we know about the
structure of our problem, but this is not
usually the case
Crossover –
Alternative Operators
N-point crossover
• Choose n random crossover points
• Split along those points
• Glue parts, alternating between parents
• Generalisation of 1 point (still some positional
bias)
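A sketch of n-point crossover on bit strings along these lines (n = 2 in the usage example; the helper name is ours, not a standard API):

import random

def n_point_crossover(parent1, parent2, n=2):
    # Choose n random crossover points, split both parents there, and glue
    # the pieces back together, alternating between parents
    length = len(parent1)
    points = sorted(random.sample(range(1, length), n))
    child1, child2, previous, swap = "", "", 0, False
    for point in points + [length]:
        seg1, seg2 = parent1[previous:point], parent2[previous:point]
        child1 += seg2 if swap else seg1
        child2 += seg1 if swap else seg2
        swap, previous = not swap, point
    return child1, child2

print(n_point_crossover("1111111111", "0000000000", n=2))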
Crossover –
Alternative Operators
Uniform crossover
• Assign 'heads' to one parent, 'tails' to the other
• Flip a coin for each gene of the first child
• Make an inverse copy of the gene for the second
child
• Inheritance is independent of position
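And a corresponding sketch of uniform crossover, with the inverse copy going to the second child:

import random

def uniform_crossover(parent1, parent2):
    child1, child2 = [], []
    for gene1, gene2 in zip(parent1, parent2):
        if random.random() < 0.5:   # 'heads': first child inherits from parent1
            child1.append(gene1)
            child2.append(gene2)
        else:                       # 'tails': first child inherits from parent2
            child1.append(gene2)
            child2.append(gene1)
    return "".join(child1), "".join(child2)

print(uniform_crossover("1111111111", "0000000000"))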
Crossover vs Mutation
• Decades-long debate: which one is better / necessary / main vs. background operator?
• Answer (at least, rather wide agreement):
– it depends on the problem, but
– in general, it is good to have both
– each has its own role
– a mutation-only EA is possible, a crossover-only EA would not work
Crossover vs Mutation
Exploration: Discovering promising areas in the search space,
i.e. gaining information on the problem
Exploitation: Optimising within a promising area, i.e. using
information
There is co-operation AND competition between them:
• Crossover is explorative: it makes a big jump to an area somewhere “in between” two (parent) areas
• Mutation is exploitative: it creates random small diversions, thereby staying near (in the area of) the parent
Crossover vs Mutation
• Only crossover can combine information from two
parents
• Only mutation can introduce new information
(alleles)
• To hit the optimum you often need a ‘lucky’
mutation
Other Representation
• Gray coding of integers (still binary
chromosomes)
– Gray coding is a mapping that means that
small changes in the genotype cause small
changes in the phenotype (unlike binary
coding). “Smoother” genotype-phenotype
mapping makes life easier for the GA
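A small sketch of the standard reflected Gray code on integers (one common choice of Gray coding; conversion back is an XOR cascade):

def binary_to_gray(n):
    # Standard reflected Gray code of an integer
    return n ^ (n >> 1)

def gray_to_binary(g):
    # Inverse mapping: each binary bit is the XOR of all higher Gray bits
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# Adjacent integers differ in exactly one Gray-coded bit:
for i in range(8):
    print(i, format(binary_to_gray(i), "03b"))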
Nowadays it is generally accepted that it is
better to encode numerical variables
directly as:
• Integers
• Floating point variables
Integer Representation
• Some problems naturally have integer variables,
e.g. image processing parameters
• Others take categorical values from a fixed set
e.g. {blue, green, yellow, pink}
• N-point / uniform crossover operators work
• Extend bit-flipping mutation to make
– “creep” i.e. more likely to move to similar
value
– Random choice (esp. categorical variables)
– For ordinal problems, it is hard to know correct
range for creep, so often use two mutation
operators in tandem
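A sketch of the two mutation flavours mentioned above for integer genes; the probabilities and step size are illustrative defaults, not prescribed values:

import random

def creep_mutation(value, low, high, step=1, rate=0.1):
    # Ordinal genes: with small probability, move the value a little up or down
    if random.random() < rate:
        value = min(high, max(low, value + random.choice([-step, step])))
    return value

def random_reset(value, choices, rate=0.1):
    # Categorical genes: with small probability, replace by any allowed value
    if random.random() < rate:
        value = random.choice(choices)
    return value

gene = creep_mutation(7, low=0, high=255)
colour = random_reset("green", choices=["blue", "green", "yellow", "pink"])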
Floating Point
Representation
• Many problems occur as real-valued problems, e.g. continuous parameter optimisation f : R^n → R
• Illustration: Ackley’s function (often used in EC)
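For reference, a Python sketch of Ackley's function in its commonly used form (global minimum 0 at the origin); the test points are arbitrary:

import math

def ackley(x):
    # Ackley's function f: R^n -> R, in its commonly used form
    n = len(x)
    sum_sq = sum(xi * xi for xi in x)
    sum_cos = sum(math.cos(2 * math.pi * xi) for xi in x)
    return (-20 * math.exp(-0.2 * math.sqrt(sum_sq / n))
            - math.exp(sum_cos / n) + 20 + math.e)

print(ackley([0.0, 0.0]))   # approximately 0
print(ackley([1.5, -2.3]))  # some positive value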
Binary Representation
• For example, search space x ∈ [-1..2], with six places after the decimal point
• Therefore 3 × 1,000,000 equal-size ranges
• This means that 22 bits are required, since:
2,097,152 = 2^21 < 3,000,000 < 2^22 = 4,194,304
Binary Representation
• Mapping from the binary string <b21 b20 … b0> into a real number x from the range [-1..2] takes two steps:
– convert the binary string <b21 b20 … b0> from base 2 to base 10:
x' = b21 · 2^21 + b20 · 2^20 + … + b0 · 2^0
– find the corresponding real number x:
x = -1.0 + x' · 3 / (2^22 - 1)