Evolutionary Computation (EC)
eie426-ec-200809.ppt
2017/5/8
EIE426-AICV
1
Contents
- Basic Concepts of EC
- Genetic Algorithms
- An Example
- Chromosome Representation
- Stopping Criteria
- Initial Population
- Selection Mechanisms
- Crossover and Mutation
- Fitness Functions
- Another Example
- Application: Routing Optimization
- Advantages and Disadvantages of EC
Evolution and Search
- Evolution: a search through the enormous genetic parameter space for the best genetic make-up.
- Borrow ideas from nature to help us solve problems that have equally large search spaces or similarly changing environments.
Natural Evolution and Evolutionary Computation

Natural Evolution | Evolutionary Computing
Individual        | Candidate Solution
Fitness           | Quality
Environment       | Problem
Different ECs
Several classes of EC algorithms have been developed:
- Genetic algorithms (GAs): model genetic evolution
- Genetic programming: based on GAs, but individuals are programs (represented as trees)
- Evolutionary programming: derived from the simulation of adaptive behavior in evolution (phenotypic evolution)
- Evolution strategies: model the strategic parameters that control variation in evolution, i.e., the evolution of evolution
- Cultural evolution: models the evolution of the culture of a population and how the culture influences the evolution of individuals
- Co-evolution: individuals evolve through cooperation, or in competition with one another
Basic Concepts
• Chromosome: an individual
• Population: many individuals
• Gene: one characteristic of a chromosome (one parameter)
• Allele: the value of a gene
• Crossover: generate offspring by combining parts of the parents
• Mutation: introduce new genetic material into an existing individual
• Fitness: the survival strength of an individual
• Culling (removing) and elitism (copying)
Evolutionary Computation
[Figure: the evolutionary cycle. Population → Selection → Parents → Recombination (Crossover) and Mutation → Offspring → Replacement → Population]
Genetic Algorithms
- The GA was the first EC paradigm developed and applied (Holland 1975).
- The features of the original GAs:
(1) A bit string representation
(2) Proportional selection
(3) Crossover as the primary method to produce new individuals
- Several changes have been made since:
(1) Different representation schemes
(2) Different selection methods
(3) Different GA operators (crossover, mutation and elitism)
Random Search
- The GA is a search procedure.
- Random search is possibly the simplest search procedure. It may take a very long time before an acceptable solution is obtained.
- Procedure:
(1) Start from an initial search point or a set of initial points.
(2) Apply random perturbations to the points.
(3) Repeat until an acceptable solution is reached or a maximum number of iterations is exceeded.
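The random search procedure can be sketched in a few lines of Python. This is only an illustrative sketch: the slide does not say how a perturbed point is accepted, so keeping a point only when it improves the objective, the Gaussian perturbation, the clamping to the interval, and all names (`random_search`, `step`, `target`) are assumptions.

```python
import random

def random_search(f, lo, hi, step=0.1, max_iters=10_000, target=None, rng=None):
    """Maximise f on [lo, hi] by keeping random perturbations that improve it."""
    rng = rng or random.Random()
    best_x = rng.uniform(lo, hi)              # (1) initial search point
    best_f = f(best_x)
    for _ in range(max_iters):                # (3) iteration budget
        if target is not None and best_f >= target:
            break                             # acceptable solution reached
        # (2) random perturbation, clamped to the search interval (assumption)
        x = min(hi, max(lo, best_x + rng.gauss(0.0, step)))
        fx = f(x)
        if fx > best_f:                       # keep improvements only (assumption)
            best_x, best_f = x, fx
    return best_x, best_f
```

On a multimodal objective this loop can stall on a local maximum, which is one motivation for the population-based search that follows.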
General Genetic Algorithm
(1) Let g = 0.
(2) Initialize the initial generation C_g.
(3) While no stopping criterion is satisfied:
    (a) Evaluate the fitness of each individual in C_g.
    (b) g ← g + 1.
    (c) Select parents from C_{g-1}.
    (d) Recombine selected parents through crossover to form offspring O_g (with a probability p_c).
    (e) Mutate offspring in O_g (with a probability p_m).
    (f) Select the new generation C_g from (the previous generation C_{g-1}, e.g., the best individuals are copied) and the offspring O_g.
g: generation
Note: The steps in parentheses might or might not be carried out.
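The loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the course's implementation: the name `genetic_algorithm`, the bit-list encoding, fitness-proportionate parent selection, one-point crossover, and copying the single best individual in step (f) are illustrative choices. Fitness values are assumed to be non-negative and not all zero.

```python
import random

def genetic_algorithm(fitness, n_bits, pop_size=50, pc=0.25, pm=0.01,
                      max_gen=150, rng=None):
    """Generational GA on bit lists; returns (best individual, best fitness)."""
    rng = rng or random.Random()
    # (1)-(2): g = 0 and a random initial generation C_0
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for g in range(1, max_gen + 1):              # (3) stop at the generation limit
        fits = [fitness(c) for c in pop]         # (a) evaluate C_{g-1}
        # (c) fitness-proportionate parent selection
        parents = rng.choices(pop, weights=fits, k=pop_size)
        offspring = []
        for p1, p2 in zip(parents[0::2], parents[1::2]):
            c1, c2 = p1[:], p2[:]
            if rng.random() < pc:                # (d) one-point crossover
                cut = rng.randint(1, n_bits - 1)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            offspring += [c1, c2]
        for c in offspring:                      # (e) bit-flip mutation
            for i in range(n_bits):
                if rng.random() < pm:
                    c[i] ^= 1
        # (f) new generation: offspring plus a copy of the best individual so far
        best = max(offspring + [best], key=fitness)
        pop = offspring[:pop_size - 1] + [best[:]]
    return best, fitness(best)
```

For example, `genetic_algorithm(sum, 20)` evolves 20-bit strings towards the all-ones string, because `sum` counts the ones in an individual.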
An Example
Find the maximum of the following function:
\[ f(x) = x \sin(10\pi x) + 2.0, \quad x \in [-1, 2] \]
We can differentiate the function to find the local maxima:
\[ f'(x) = \sin(10\pi x) + 10\pi x \cos(10\pi x) = 0 \]
\[ \tan(10\pi x) = -10\pi x \]
There are an infinite number of solutions to the above equation:
\[ x_i = \frac{2i - 1}{20} + \varepsilon_i, \quad i = 1, 2, \ldots \]
\[ x_0 = 0 \]
\[ x_i = \frac{2i + 1}{20} - \varepsilon_i, \quad i = -1, -2, \ldots \]
where $\varepsilon_i$ ($i = 1, 2, \ldots$ and $i = -1, -2, \ldots$) is a decreasing sequence of small numbers approaching 0.
$x_{19}$ is the maximum in $[-1, 2]$:
\[ x_{19} = 1.85 + \varepsilon_{19} \]
$f(x_{19})$ is slightly greater than $f(1.85) = 3.85$.
Use a genetic algorithm to solve the problem:
- Coding (chromosome representation of a solution)
- Generation of initial population (solutions)
- Fitness calculation
- Genetic operation

Coding (chromosome representation):
Use a binary string to represent x. If the solution is to be precise to $10^{-6}$, then the interval of length $2 - (-1) = 3$ should be divided into at least $3 \times 10^6$ equal parts. At least 22 bits should be used because
\[ 2\,097\,152 = 2^{21} < 3 \times 10^6 \le 2^{22} = 4\,194\,304 \]
$x \to \langle b_{21}, b_{20}, \ldots, b_0 \rangle$ is the coding process (phenotype → genotype mapping).

Decoding
$\langle b_{21}, b_{20}, \ldots, b_0 \rangle \to x$ is the decoding process (genotype → phenotype mapping):
\[ x' = (\langle b_{21}, b_{20}, \ldots, b_0 \rangle)_2 = \left( \sum_{i=0}^{21} b_i \cdot 2^i \right)_{10} \]
\[ x = -1.0 + x' \cdot \frac{2 - (-1)}{2^{22} - 1} \]
e.g., for $s_1 = \langle 1000101110110101000111 \rangle$:
\[ x' = (1000101110110101000111)_2 = 2\,288\,967 \]
\[ x = -1.0 + 2\,288\,967 \cdot \frac{3}{2^{22} - 1} = 0.637197 \]
The two end points: $(0000000000000000000000)_2 \to x = -1$ and $(1111111111111111111111)_2 \to x = 2$.
In general, if a parameter that takes values in $[x_{min}, x_{max}]$ is represented by an m-bit binary string, then the conversion is given below:
\[ x' = (\langle b_{m-1}, \ldots, b_1, b_0 \rangle)_2 = \left( \sum_{i=0}^{m-1} b_i \cdot 2^i \right)_{10} \]
\[ x = x_{min} + x' \cdot \frac{x_{max} - x_{min}}{2^m - 1} \]
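The decoding formula translates directly into code. The sketch below is illustrative only: the function names `decode` and `f` are assumptions, and `f` is the objective $f(x) = x \sin(10\pi x) + 2.0$ of the running example.

```python
import math

def decode(bits: str, x_min: float, x_max: float) -> float:
    """Map an m-bit binary string (genotype) to a real value in [x_min, x_max]."""
    m = len(bits)
    x_prime = int(bits, 2)                    # binary string -> integer x'
    return x_min + x_prime * (x_max - x_min) / (2 ** m - 1)

def f(x: float) -> float:
    """Objective of the running example: f(x) = x sin(10 pi x) + 2.0."""
    return x * math.sin(10 * math.pi * x) + 2.0

s1 = "1000101110110101000111"
x1 = decode(s1, -1.0, 2.0)                    # close to 0.637197, as in the example
print(x1, f(x1))
```

The all-zeros and all-ones strings decode to the end points -1 and 2 of the interval.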

Generation of initial population
A set of N 22-bit binary strings can be randomly
generated as the initial population.

Fitness calculation
Since f(x) > 0 in the interval, we can directly use
f(x) as a fitness function:
f(s) = f(x)
e.g.,
s1 = <1000101110110101000111>, f(s1)=2.586345
s2 = <0000001110000000010000>, f(s2)=1.078878
s3 = <1110000000111111000101>, f(s3)= 3.250650

Genetic operation
(1) Selection: based on the fitness of individuals
e.g., roulette wheel selection (fitness proportionate
selection)
(2) Crossover (with a probability pc)
e.g.,
s2 = <00000 | 01110000000010000>, f(s2)=1.078878
s3 = <11100 | 00000111111000101>, f(s3)= 3.250650
After the crossover operation:
s’2 = <00000 | 00000111111000101>, f(s’2)=1.940865
s’3 = <11100 | 01110000000010000 >, f(s’3)= 3.459245
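The crossover example can be reproduced with a short sketch. The function name `one_point_crossover` and the `cut` parameter are illustrative assumptions; the cut after the fifth bit is taken from the "|" marker in the example strings.

```python
def one_point_crossover(p1: str, p2: str, cut: int) -> tuple[str, str]:
    """Swap the tails of two equal-length bit strings after position `cut`."""
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

s2 = "0000001110000000010000"
s3 = "1110000000111111000101"
c2, c3 = one_point_crossover(s2, s3, cut=5)
print(c2)   # offspring s'2
print(c3)   # offspring s'3
```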
(3) Mutation (with a probability pm)
e.g.,
s3 = <1110000000111111000101>
f(s3)= 3.250650
After the mutation operation:
s’3 = <1110100000111111000101 >
f(s’3)= 0.917743
or
s3 = <1110000000111111000101>
s”3 = <1110000001111111000101 >
f(s”3)= 3.343555
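Both mutation examples flip a single bit of s3: the fifth bit from the left in the first case and the tenth in the second. A small sketch, with the helper names `flip_bit` and `mutate` chosen here for illustration:

```python
import random

def flip_bit(bits: str, position: int) -> str:
    """Flip the bit at a 1-based position counted from the left."""
    i = position - 1
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]

def mutate(bits: str, pm: float, rng: random.Random) -> str:
    """Flip each bit independently with probability pm."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < pm else b for b in bits
    )

s3 = "1110000000111111000101"
print(flip_bit(s3, 5))    # the first mutated string in the example
print(flip_bit(s3, 10))   # the second mutated string in the example
```

Flipping a high-order bit moves x a long way, which is why the first mutation changes the fitness so much more than the second.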
Simulation results:
N = 50, pc = 0.25, pm = 0.01, at 89 generations, the best
individual was obtained:
smax = <1101001111110011001111>
xmax = 1.850 549
f(xmax) = 3.850 274
The best individual at each iteration (up to 150 generations)

Generation | Chromosome of the best individual | x         | Fitness
1          | 1000111000010110001111            | 1.831 624 | 3.534 806
11         | 0110101011100111001111            | 1.854 860 | 3.833 286
17         | 1110101011111101001111            | 1.847 536 | 3.842 004
54         | 1000110110100011001111            | 1.848 699 | 3.847 155
71         | 0100110110001011001111            | 1.850 897 | 3.850 162
89         | 1101001111110011001111            | 1.850 549 | 3.850 274
150        | 1101001111110011001111            | 1.850 549 | 3.850 274
Summary on Basic Concepts
- Evolution is an optimization process, where the aim is to improve the ability of individuals to survive.
- An evolutionary algorithm (EA) is a stochastic search for an optimal solution to a given problem.
- Evolution: a search through the enormous genetic parameter space for the best genetic make-up.
- Borrow ideas from nature to help us solve problems that have equally large search spaces or similarly changing environments.
- Genotype: describes the genetic composition of an individual.
- Phenotype: the expressed behavioral traits of an individual in a specific environment.
- Selection: use the fitness evaluations to decide which are the best parents to reproduce.
- Crossover: generate offspring by combining parts of the parents.
- Mutation: introduce new genetic material into an existing individual.
- Coding: phenotype → genotype
- Decoding: genotype → phenotype
Simple Genetic Algorithm (SGA)

Representation            | Binary strings
Recombination (crossover) | N-point (commonly 1-point and 2-point) or uniform; pc typically in the range (0.6, 0.9)
Mutation                  | Bitwise bit-flipping with fixed probability pm (typically between 1/pop_size and 1/chromosome_length)
Parent selection          | Fitness-proportionate
Survivor selection        | All children replace parents
Speciality                | Emphasis on crossover
Chromosome Representation
[Figure: coding (encoding, or representation) maps the phenotype space to the genotype space {0,1}^I of I-bit binary strings (e.g., 10010001, 10010010, 010001001, 011101001); decoding (inverse representation) maps genotypes back to phenotypes]



- Binary-valued variables: no extra coding required
- Nominal-valued variables: D bits represent $2^D$ discrete nominal values, e.g., four colors: red (00), blue (01), green (10), yellow (11)
- Continuous-valued variables: a mapping from the real numbers to I-bit strings, $\mathbb{R} \to \{0,1\}^I$
Other Representations
- Gray coding of integers (still binary chromosomes)
Gray coding is a mapping in which consecutive integers differ in only one bit, so a small change in the phenotype needs only a small change in the genotype (unlike standard binary coding, e.g., 0111 (7) and 1000 (8), which differ in all four bits).
- It is generally accepted that it is better to encode numerical variables directly as
  - Integers
  - Floating point variables
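The slides do not say which Gray code is meant. The sketch below assumes the standard reflected binary Gray code, and the function names are illustrative.

```python
def binary_to_gray(n: int) -> int:
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_to_binary(g: int) -> int:
    """Inverse mapping: XOR together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# 7 and 8 differ in four bits in standard binary but in one bit in Gray code.
print(format(binary_to_gray(7), "04b"), format(binary_to_gray(8), "04b"))
```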
Stopping Criteria
- The maximum number of generations is exceeded.
- An acceptable best-fit individual has evolved.
- The average and/or maximum fitness value has not changed significantly over the past g generations.
Initial Population
- The standard way of generating the initial population is to choose gene values randomly from the allowed set of values.
- The goal of random selection is to ensure that the initial population is a uniform representation of the entire search space.
- A large population covers a larger area of the search space, and may require fewer generations to converge.
- In the case of a small population, the EA can be forced to explore a large search space by increasing the rate of mutation.
Selection Mechanisms
Selection operators
(1) Random selection
(2) Proportional selection: roulette wheel selection
(3) Tournament selection
(4) Rank-based selection
Random Selection

Individuals are selected randomly with no
reference to fitness at all. All the individuals, good
or bad, have an equal chance of being selected.
Proportional Selection: Roulette Wheel Selection

Individual | Chromosome | Fitness f_i | Selection probability P_i | Accumulated probability
1          | 0001100000 | 8           | 0.086 957                 | 0.086 957
2          | 0101111001 | 5           | 0.054 348                 | 0.141 304
3          | 0000000101 | 2           | 0.021 739                 | 0.163 043
4          | 1001110100 | 10          | 0.108 696                 | 0.271 739
5          | 1010101010 | 7           | 0.076 087                 | 0.347 826
6          | 1110010110 | 12          | 0.130 435                 | 0.478 261
7          | 1001011011 | 5           | 0.054 348                 | 0.532 609
8          | 1100000001 | 19          | 0.206 522                 | 0.739 130
9          | 1001110100 | 10          | 0.108 696                 | 0.847 826
10         | 0001010011 | 14          | 0.152 174                 | 1.000 000

\[ P_i = \frac{f_i}{\sum_{j=1}^{10} f_j}, \quad f_i \ge 0 \]
Roulette Wheel Selection
[Figure: roulette wheel divided into 10 slices, one per individual]
It can be visualized as spinning the wheel and checking which slice ends up at the top. Fitness values are usually normalized to [0,1].


Assume that the following random number sequence is generated:
0.070 221, 0.545 929, 0.784 567, 0.446 930, 0.507 893,
0.291 198, 0.176 340, 0.272 901, 0.371 435, 0.854 641
Comparing the random number sequence with the accumulated probabilities, we select the individuals 1, 8, 9, 6, 7, 5, 8, 4, 6, 10. Individuals 2 and 3 were removed and replaced with individuals 8 and 6. Individuals with high fitness tend to survive, but those with low fitness may be removed.

The pseudocode:
1. n ← 1, where n denotes the chromosome index
2. sum ← P_n
3. Obtain a uniform random number, ξ ~ U(0,1)
4. While sum < ξ:
   (a) n ← n + 1
   (b) sum ← sum + P_n
5. Return chromosome n as the selected individual.
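A minimal Python sketch of this roulette-wheel procedure follows. The function name `roulette_select` is an assumption, and the random number is passed in as the argument `xi` (in practice it would be drawn with `random.random()`). The guard on `n` only protects against floating-point round-off in the last accumulated probability.

```python
from itertools import accumulate

def roulette_select(fitness, xi):
    """Return the 1-based index n of the first individual whose accumulated
    selection probability reaches xi, where xi is drawn from U(0, 1)."""
    total = sum(fitness)
    cumulative = list(accumulate(f / total for f in fitness))
    n = 1
    while cumulative[n - 1] < xi and n < len(fitness):
        n += 1
    return n

fitness = [8, 5, 2, 10, 7, 12, 5, 19, 10, 14]   # fitness column of the table
print(roulette_select(fitness, 0.070221))        # first random number in the sequence
```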
Tournament Selection
- A group of k individuals is randomly selected.
- The individual with the best fitness is selected from the group.
- The advantage: the worst individuals of the population will not be selected, and the best individuals will not dominate the reproduction process.
- For crossover, two tournaments are held, one to select each of the two parents. It is possible that
(1) a parent is selected to reproduce more than once; and
(2) one individual combines with itself to produce offspring.
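Tournament selection is short enough to sketch directly. The name `tournament_select` is an assumption, and so is drawing the group without replacement, which the slide does not specify.

```python
import random

def tournament_select(population, fitness, k, rng):
    """Pick k distinct individuals at random and return the fittest of them."""
    group = rng.sample(population, k)   # sampling without replacement (assumption)
    return max(group, key=fitness)

rng = random.Random(0)
population = [1, 2, 3, 4, 5]            # toy individuals; fitness is the value itself
parent1 = tournament_select(population, lambda ind: ind, 2, rng)
parent2 = tournament_select(population, lambda ind: ind, 2, rng)
```

With k = 2 and distinct individuals, the single worst individual can never win a tournament, while the best one wins only the tournaments it happens to be drawn into.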
Rank-Based Selection
- The rank ordering of the fitness values, and not the fitness values themselves, is used to determine the probability of selection.
- Non-deterministic linear sampling
Individuals are sorted in decreasing order of fitness.
(1) Let n = random(random(N)), where N is the total number of individuals and random(N) returns a number between 1 and N.
(2) Return individual n as the selected individual.
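The nested call random(random(N)) biases the draw towards small n, i.e., towards the fitter individuals at the front of the sorted list. A sketch, assuming random(N) returns an integer between 1 and N inclusive (the name `rank_select` is illustrative):

```python
import random

def rank_select(sorted_population, rng):
    """Non-deterministic linear sampling: n = random(random(N)).

    sorted_population must be ordered by decreasing fitness."""
    N = len(sorted_population)
    n = rng.randint(1, rng.randint(1, N))   # both draws are inclusive of 1 and the bound
    return sorted_population[n - 1]

rng = random.Random(0)
ranked = ["best", "second", "third", "fourth", "worst"]
picks = [rank_select(ranked, rng) for _ in range(5000)]
print(picks.count("best"), picks.count("worst"))
```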
Elitism
- Elitism involves the selection of a set of individuals from the current generation to survive to the next generation.
- The number of individuals to survive to the next generation, without being mutated, is referred to as the generation gap.
- Generation gap = k:
  k best individuals, or
  k individuals selected using any selection operator
Crossover
[Figure: crossover]
Uniform Crossover
A mask (vector) of length I (an I-bit binary string) is created at random for each pair of individuals selected for reproduction. A bit with a value of 1 indicates that the corresponding allele (bit) has to be swapped.
1. m_i ← 0 for all i = 1, 2, ..., I.
2. For each i = 1, 2, ..., I:
   (a) Calculate a random value ξ ~ U(0,1).
   (b) If ξ ≤ p_x, then m_i ← 1.
3. Return the mask vector m.
Note: p_x is the crossover probability at each position in the chromosome. It is different from p_c.
[Figure: uniform crossover of Parent 1 and Parent 2 with the mask 1 0 1 0 0 1 0 0 0 1, producing Offspring 1 and Offspring 2]
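A sketch of uniform crossover in Python: one function builds the random mask and another swaps alleles wherever the mask bit is 1. The names `uniform_mask` and `apply_mask` are assumptions, and the all-zeros and all-ones parents are placeholders, because the parent strings of the figure are not recoverable from the transcript.

```python
import random

def uniform_mask(I, px, rng):
    """Mask of length I: each bit is 1 with probability px."""
    return [1 if rng.random() <= px else 0 for _ in range(I)]

def apply_mask(p1, p2, mask):
    """Swap alleles between the parents wherever the mask bit is 1."""
    o1 = [b if m else a for a, b, m in zip(p1, p2, mask)]
    o2 = [a if m else b for a, b, m in zip(p1, p2, mask)]
    return o1, o2

mask = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]           # the mask shown on the slide
o1, o2 = apply_mask([0] * 10, [1] * 10, mask)   # placeholder parents
```

`apply_mask` works with any mask, so the same function serves one-point and two-point crossover once their masks are built.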
One-point Crossover
1. Calculate a random integer value ξ ~ U(1, I-1).
2. m_i ← 0 for all i = 1, 2, ..., I.
3. For each i = ξ+1, ..., I, let m_i ← 1.
4. Return the mask vector m.
[Figure: one-point crossover of Parent 1 and Parent 2 with the mask 0 0 0 0 0 0 1 1 1 1, producing Offspring 1 and Offspring 2]
Two-point Crossover
1. Compute two random integer values ξ_1, ξ_2 ~ U(1, I) with ξ_1 ≤ ξ_2.
2. m_i ← 0 for all i = 1, 2, ..., I.
3. For each i = ξ_1, ..., ξ_2, let m_i ← 1.
4. Return the mask vector m.
[Figure: two-point crossover of Parent 1 and Parent 2 with the mask 0 0 1 1 1 1 0 0 0 0, producing Offspring 1 and Offspring 2]
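The one-point and two-point masks differ only in which positions are set to 1. The sketch below takes the crossover points as arguments so that the masks in the figures can be reproduced; the function names are illustrative, and sorting two random draws is one assumed way to obtain ξ_1 ≤ ξ_2.

```python
import random

def one_point_mask(I, xi):
    """m_i = 1 for i = xi+1, ..., I (positions are 1-based)."""
    return [1 if i > xi else 0 for i in range(1, I + 1)]

def two_point_mask(I, xi1, xi2):
    """m_i = 1 for i = xi1, ..., xi2 (positions are 1-based)."""
    return [1 if xi1 <= i <= xi2 else 0 for i in range(1, I + 1)]

def random_two_points(I, rng):
    """Draw two integers from U(1, I) and order them so that xi1 <= xi2."""
    a, b = rng.randint(1, I), rng.randint(1, I)
    return min(a, b), max(a, b)

print(one_point_mask(10, 6))      # one-point mask with the cut after position 6
print(two_point_mask(10, 3, 6))   # two-point mask covering positions 3 to 6
```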
Arithmetic Crossover
Arithmetic crossover can be used in the case of continuous-valued genes.
Consider two parents $C_{n1}$ and $C_{n2}$. The ith components of the two offspring, $O_{n1}$ and $O_{n2}$, are generated by
\[ O_{n1,i} = r_1 C_{n1,i} + (1.0 - r_1) C_{n2,i} \]
\[ O_{n2,i} = (1.0 - r_2) C_{n1,i} + r_2 C_{n2,i} \]
with $r_1, r_2 \sim U(0,1)$.
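A sketch of arithmetic crossover follows. Whether r_1 and r_2 are drawn once per pair of parents or once per gene is not stated on the slide; drawing them per gene is an assumption made here, as is the function name.

```python
import random

def arithmetic_crossover(c1, c2, rng):
    """Each offspring gene is a random convex combination of the parents' genes."""
    o1, o2 = [], []
    for a, b in zip(c1, c2):
        r1, r2 = rng.random(), rng.random()   # drawn per gene (assumption)
        o1.append(r1 * a + (1.0 - r1) * b)
        o2.append((1.0 - r2) * a + r2 * b)
    return o1, o2

p1, p2 = [0.0, 2.0, -1.0], [1.0, 4.0, 3.0]
o1, o2 = arithmetic_crossover(p1, p2, random.Random(7))
```

Because every offspring gene lies between the corresponding parent genes, this operator cannot move outside the range already spanned by the population; mutation has to supply that.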
Mutation
- Alter each gene independently with a probability p_m (the mutation rate).
- For real-valued representations, mutation occurs by adding a random value (usually sampled from a Gaussian distribution $N(0, \sigma^2)$) to the allele. The variance is usually a function of the fitness of the individual: individuals with a good fitness value will be mutated less, while a bad fitness value will lead to large mutations.
[Figure: mutation (fox). Example: 0 1 0 0 1 1 1 1 → 0 1 0 0 0 1 1 1]
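For real-valued genes, the Gaussian mutation described above can be sketched as follows. The name `gaussian_mutate` is an assumption, and the standard deviation `sigma` is passed in as a plain parameter; making it depend on the individual's fitness, as the slide suggests, is left to the caller.

```python
import random

def gaussian_mutate(genes, pm, sigma, rng):
    """Add N(0, sigma^2) noise to each gene independently with probability pm."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < pm else g for g in genes]

rng = random.Random(0)
child = gaussian_mutate([0.5, -1.2, 3.0], pm=0.3, sigma=0.1, rng=rng)
```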
The Evolution Mechanism
- Increasing diversity by using genetic operators:
  - mutation
  - crossover
- Decreasing diversity by selection of:
  - parents
  - individuals to kill
Fitness Functions
Common fitness functions:
- Use the objective function f(x) directly
(1) For a maximization problem, Fit(f(x)) = f(x)
e.g., find the maximum of the function $f(x) = x \sin(10\pi x)$
(2) For a minimization problem, Fit(f(x)) = -f(x)
e.g., solution for $x^2 + x = 2$, to minimize $f(x) = x^2 + x - 2$

Clipping
(1) For a minimization problem
\[ Fit(f(x)) = \begin{cases} c_{max} - f(x), & f(x) < c_{max} \\ 0, & \text{otherwise} \end{cases} \]
$c_{max}$ is an estimate of the maximum.
(2) For a maximization problem
\[ Fit(f(x)) = \begin{cases} f(x) - c_{min}, & f(x) > c_{min} \\ 0, & \text{otherwise} \end{cases} \]
$c_{min}$ is an estimate of the minimum.

Mapping
(1) For a maximization problem
\[ Fit(f(x)) = \frac{1}{1 + c - f(x)}, \quad c \ge 0, \; c - f(x) \ge 0 \]
(2) For a minimization problem
\[ Fit(f(x)) = \frac{1}{1 + c + f(x)}, \quad c \ge 0, \; c + f(x) \ge 0 \]
c is a conservative estimate of the value range of the objective function.
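The clipping and mapping fitness functions are one-liners in code. The sketch below uses illustrative function names; the constants `c_max`, `c_min` and `c` are supplied by the caller as estimates of the objective's range.

```python
def clip_for_minimization(fx, c_max):
    """Clipping: c_max - f(x) when f(x) < c_max, otherwise 0."""
    return c_max - fx if fx < c_max else 0.0

def clip_for_maximization(fx, c_min):
    """Clipping: f(x) - c_min when f(x) > c_min, otherwise 0."""
    return fx - c_min if fx > c_min else 0.0

def map_for_maximization(fx, c):
    """Mapping: 1 / (1 + c - f(x)), assuming c >= 0 and c - f(x) >= 0."""
    return 1.0 / (1.0 + c - fx)

def map_for_minimization(fx, c):
    """Mapping: 1 / (1 + c + f(x)), assuming c >= 0 and c + f(x) >= 0."""
    return 1.0 / (1.0 + c + fx)
```

All four return non-negative values that grow as the solution improves, which is what fitness-proportionate selection needs.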

Linear transformation of fitness functions
\[ f' = \alpha f + \beta \]
Two conditions: (1) $f'_{avg} = f_{avg}$ and (2) $f'_{max} = c_{mult} f_{avg}$, with $c_{mult} \in [1.0, 2.0]$. These give
\[ \alpha = \frac{(c_{mult} - 1) f_{avg}}{f_{max} - f_{avg}} \quad \text{and} \quad \beta = \frac{(f_{max} - c_{mult} f_{avg}) f_{avg}}{f_{max} - f_{avg}} \]
(see Fig. ft1)
To make sure that no fitness is negative after the transformation, we can use
\[ \alpha = \frac{f_{avg}}{f_{avg} - f_{min}} \quad \text{and} \quad \beta = \frac{-f_{min} f_{avg}}{f_{avg} - f_{min}} \]
(see Fig. ft2)
It is easy to verify that (1) $f'_{min} = 0$, (2) $f'_{avg} = f_{avg}$, and (3)
\[ f'_{max} = \frac{f_{max} - f_{min}}{f_{avg} - f_{min}} f_{avg} \le c_{mult} f_{avg} \]
[Fig. ft1: scaled fitness f' versus raw fitness f, with f'_avg = f_avg and f'_max = c_mult f_avg]
[Fig. ft2: scaled fitness f' versus raw fitness f, with f'_min = 0 and f'_avg = f_avg]
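The two sets of coefficients can be combined into one scaling routine. This is a sketch under stated assumptions: the name `linear_scaling`, the rule of switching to the second set only when the first would produce a negative value, and returning the input unchanged when all fitness values are equal are choices made here.

```python
def linear_scaling(fits, c_mult=2.0):
    """Scale raw fitness values with f' = alpha * f + beta."""
    f_min, f_max = min(fits), max(fits)
    f_avg = sum(fits) / len(fits)
    if f_max == f_avg:                       # all values equal: nothing to scale
        return list(fits)
    alpha = (c_mult - 1.0) * f_avg / (f_max - f_avg)
    beta = (f_max - c_mult * f_avg) * f_avg / (f_max - f_avg)
    if alpha * f_min + beta < 0:             # first set would go negative
        alpha = f_avg / (f_avg - f_min)
        beta = -f_min * f_avg / (f_avg - f_min)
    return [alpha * f + beta for f in fits]

scaled = linear_scaling([8, 5, 2, 10, 7, 12, 5, 19, 10, 14])   # roulette-wheel table
shifted = linear_scaling([1.0, 9.0, 10.0])                     # needs the second set
```

In the first call the average stays at 9.2 and the maximum becomes 2 × 9.2 = 18.4. In the second call the first set would make the smallest value negative, so the second set is used and the smallest value is mapped to 0.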
Genetic algorithms: case study

To find the maximum of the “peak” function of two
variables x and y:
-3 ≤ x, y ≤ 3
[Figure: chromosome representation]
[Figure: initial population]
[Figure: the first generation]
[Figure: local maximum]
[Figure: global maximum]
Evolutionary Computation: Applications
• Robotics
• Control
• Design
• Scheduling/Routing/Resource Allocation
• Machine Learning
• Pattern Recognition
• Market Forecasting
• Data Mining
• Game Playing
  - Robocode
  - Backgammon
  - Chess
Robocode
An open source educational game started by Mathew Nelson (originally provided by IBM). Currently contributions are being made by various people; officially Flemming N. Larsen is working on Robocode to keep it current and fix the bugs. The game is designed to help people learn to program in Java and enjoy the experience.
[Figure: Genetic Programming (GP) Robocode]
Application: Routing Optimization
- The problem: Given a network of M switches, an origin switch and a destination switch, the objective is to find the best route to connect a call between the origin and destination switches (Sevenster and Engelbrecht 1996).
[Figure: PSTN (Public Switched Telephone Network)]

Chromosome representation:
- variable length
- each gene represents one switch
- integer values for switch numbers
- the first gene and the last gene represent the origin and destination switches, respectively
Examples:
(1 3 6 10)
(1 5 2 5 10) = (1 5 2 10)
Duplicate switches are ignored.

Initialization of population:
- randomly generated, with the restriction that the first gene represents the origin switch and the last gene the destination switch
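The representation and initialization rules can be sketched as below. The names `normalise_route` and `random_route` and the `max_inner` bound on the number of intermediate switches are assumptions; the slides only fix the first and last genes and state that duplicate switches are ignored.

```python
import random

def normalise_route(route):
    """Drop repeated switches, keeping the first occurrence of each."""
    return list(dict.fromkeys(route))

def random_route(origin, destination, M, max_inner, rng):
    """Random chromosome: origin first, destination last, switches in 1..M."""
    # Intermediate switches exclude the end points so that de-duplication
    # can never remove the destination gene.
    inner_choices = [s for s in range(1, M + 1) if s not in (origin, destination)]
    n_inner = rng.randint(0, max_inner)
    inner = [rng.choice(inner_choices) for _ in range(n_inner)]
    return normalise_route([origin] + inner + [destination])

print(normalise_route([1, 5, 2, 5, 10]))   # the duplicate-switch example above
route = random_route(1, 10, M=10, max_inner=4, rng=random.Random(3))
```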

Fitness function: a multi-criteria objective function was applied.
\[ F_j = a F_j^{Switch} + b F_j^{Block} + c F_j^{Util} + d F_j^{Cost} \]
\[ F_j^{Switch} = \frac{r_j}{M} \]
$r_j$: the total number of switches in the route
$M$: the total number of switches
$F_j^{Switch}$: to minimize the route length
$F_j^{Block}$: to minimize the route congestion
$F_j^{Util}$: to maximize the utilization of the links
$F_j^{Cost}$: to minimize the route cost
The constants a, b, c, and d control the influence of each criterion.



- Selection: any selection operator
- Crossover: any crossover operator
- Mutation: replacing selected genes with a switch selected uniformly at random in the range [1, M]
Real World EC
Trends include:
- More complex representations and operators
- Use of problem-specific knowledge for seeding the initial population and creating heuristic operators
- Hybridisation with other methods
Advantages of EC
- Handles huge search spaces
- Balances exploration and exploitation
- Easy to try: not knowledge intensive
- Easy to combine with other methods
- Provides many alternative solutions
- Can continually evolve solutions to fit a continually changing problem
Disadvantages of EC
- No guarantee of an optimal solution within finite time
- Weak theoretical basis
- May need extensive parameter tuning
- Often computationally expensive, i.e., slow