Download and (2) - PolyU EIE
Evolutionary Computation (EC)
eie426-ec-200809.ppt 2017/5/8 EIE426-AICV

Contents
- Basic Concepts of EC
- Genetic Algorithms
- An Example
- Chromosome Representation
- Stopping Criteria
- Initial Population
- Selection Mechanisms
- Crossover and Mutation
- Fitness Functions
- Another Example
- Application: Routing Optimization
- Advantages and Disadvantages of EC

Evolution and Search
Evolution is a search through the enormous genetic parameter space for the best genetic make-up.
We borrow ideas from nature to help us solve problems that have equally large search spaces or similarly changing environments.

Natural Evolution and Evolutionary Computation
Natural evolution      Evolutionary computing
Individual             Candidate solution
Fitness                Quality
Environment            Problem

Different ECs
Several classes of EC algorithms have been developed:
- Genetic algorithms (GAs): model genetic evolution
- Genetic programming: based on GAs, but individuals are programs (represented as trees)
- Evolutionary programming: derived from the simulation of adaptive behavior in evolution (phenotype evolution)
- Evolution strategies: model the strategic parameters that control variation in evolution, i.e., the evolution of evolution
- Cultural evolution: models the evolution of the culture of a population and how the culture influences the evolution of individuals
- Co-evolution: individuals evolve through cooperation, or in competition with one another

Basic Concepts
- Chromosome: an individual
- Population: many individuals
- Gene: each characteristic of a chromosome (one parameter)
- Allele: the value of a gene
- Crossover: generate offspring by combining parts of the parents
- Mutation: introduce new genetic material into an existing individual
- Fitness: the survival strength of an individual
- Culling (removing) and elitism (copying)

Evolutionary Computation
[Figure: the evolutionary cycle. Selection picks Parents from the Population; Recombination (Crossover) and Mutation produce Offspring; Replacement returns Offspring to the Population.]

Genetic Algorithms
The GA was the first EC paradigm developed and applied (Holland 1975).
The features of the original GAs:
(1) A bit string representation
(2) Proportional selection
(3) Crossover as the primary method to produce new individuals
Several changes have been made since:
(1) Different representation schemes
(2) Different selection methods
(3) Different GA operators (crossover, mutation and elitism)

Random Search
The GA is a search procedure. Random search is possibly the simplest search procedure, but it may take a very long time before an acceptable solution is obtained.
Procedure:
(1) Start from an initial search point or a set of initial points.
(2) Apply random perturbations to the points.
(3) Repeat until an acceptable solution is reached or a maximum number of iterations is exceeded.

General Genetic Algorithm
(1) Let $g = 0$.
(2) Initialize the initial generation $C_g$.
(3) While no stopping criterion is satisfied:
    (a) Evaluate the fitness of each individual in $C_g$.
    (b) $g \leftarrow g + 1$.
    (c) Select parents from $C_{g-1}$.
    (d) Recombine the selected parents through crossover to form offspring $O_g$ (with a probability $p_c$).
    (e) Mutate the offspring in $O_g$ (with a probability $p_m$).
    (f) Select the new generation $C_g$ from (the previous generation $C_{g-1}$, e.g., the best individuals are copied) and the offspring $O_g$.
Here $g$ is the generation index.
Note: the steps in parentheses might or might not be carried out.

An Example
Find the maximum of the following function:
$$f(x) = x \sin(10\pi x) + 2.0, \quad x \in [-1, 2]$$
We can differentiate the function to find the local maxima:
$$f'(x) = \sin(10\pi x) + 10\pi x \cos(10\pi x) = 0 \;\Rightarrow\; \tan(10\pi x) = -10\pi x$$
There are an infinite number of solutions to the above equation:
$$x_i = \frac{2i-1}{20} + \epsilon_i, \quad i = 1, 2, \ldots$$
$$x_0 = 0$$
$$x_i = \frac{2i+1}{20} - \epsilon_i, \quad i = -1, -2, \ldots$$
where $\epsilon_i$ ($i = 1, 2, \ldots$ and $i = -1, -2, \ldots$) is a decreasing sequence of small numbers approaching 0.

$x_{19}$ is the maximum in $[-1, 2]$:
$$x_{19} = 1.85 + \epsilon_{19}$$
$f(x_{19})$ is slightly greater than $f(1.85) = 3.85$.

Use a genetic algorithm to solve the problem:
- Coding (chromosome representation of a solution)
- Generation of the initial population (solutions)
- Fitness calculation
- Genetic operation

Coding (chromosome representation)
Use a binary string to represent $x$. If the solution is to be precise to $10^{-6}$, then the interval of length $2 - (-1) = 3$ should be divided into $3 \times 10^6$ parts. At least 22 bits should be used because
$$2\,097\,152 = 2^{21} < 3 \times 10^6 \le 2^{22} = 4\,194\,304$$
$x \to \langle b_{21} b_{20} \ldots b_0 \rangle$ is the coding process (phenotype → genotype mapping).

Decoding
$\langle b_{21} b_{20} \ldots b_0 \rangle \to x$ is the decoding process (genotype → phenotype mapping):
$$(\langle b_{21} b_{20} \ldots b_0 \rangle)_2 = \left( \sum_{i=0}^{21} b_i 2^i \right)_{10} = x'$$
$$x = -1.0 + x' \cdot \frac{2 - (-1)}{2^{22} - 1}$$
e.g., $s_1 = \langle 1000101110110101000111 \rangle$:
$$x' = (1000101110110101000111)_2 = 2\,288\,967$$
$$x = -1.0 + 2\,288\,967 \cdot \frac{3}{2^{22} - 1} = 0.637197$$
The two extremes are $(0000000000000000000000)_2 \to -1$ and $(1111111111111111111111)_2 \to 2$.

In general, if a parameter that takes values in $[x_{min}, x_{max}]$ is represented by an $m$-bit binary string, then the conversion is:
$$(\langle b_{m-1} \ldots b_1 b_0 \rangle)_2 = \left( \sum_{i=0}^{m-1} b_i 2^i \right)_{10} = x'$$
$$x = x_{min} + x' \cdot \frac{x_{max} - x_{min}}{2^m - 1}$$

Generation of initial population
A set of N 22-bit binary strings can be randomly generated as the initial population.
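The coding, decoding and initialization steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's implementation: the function names `decode`, `fitness` and `initial_population` are hypothetical, and the string is read most significant bit first, matching the $s_1$ example.

```python
import math
import random

N_BITS = 22                  # chromosome length derived in the slides
X_MIN, X_MAX = -1.0, 2.0     # search interval for x


def decode(bits: str, x_min: float = X_MIN, x_max: float = X_MAX) -> float:
    """Genotype -> phenotype: map an m-bit string to a real value in [x_min, x_max]."""
    x_prime = int(bits, 2)   # x' = sum of b_i * 2^i
    return x_min + x_prime * (x_max - x_min) / (2 ** len(bits) - 1)


def fitness(bits: str) -> float:
    """Objective f(x) = x * sin(10*pi*x) + 2.0, used directly as the fitness."""
    x = decode(bits)
    return x * math.sin(10 * math.pi * x) + 2.0


def initial_population(n: int, rng: random.Random) -> list[str]:
    """Generate n random bit strings of length N_BITS (hypothetical helper)."""
    return ["".join(rng.choice("01") for _ in range(N_BITS)) for _ in range(n)]


s1 = "1000101110110101000111"
print(int(s1, 2), round(decode(s1), 6))   # 2288967 0.637197
```

Calling `initial_population(50, random.Random(0))` would give a reproducible population of the size used in the simulation reported later; the seed is an arbitrary choice.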

Fitness calculation
Since f(x) > 0 in the interval, we can directly use f(x) as the fitness function: f(s) = f(x)
e.g.,
s1 = <1000101110110101000111>, f(s1) = 2.586345
s2 = <0000001110000000010000>, f(s2) = 1.078878
s3 = <1110000000111111000101>, f(s3) = 3.250650

Genetic operation
(1) Selection: based on the fitness of individuals, e.g., roulette wheel selection (fitness-proportionate selection)
(2) Crossover (with a probability pc)
e.g.,
s2 = <00000 | 01110000000010000>, f(s2) = 1.078878
s3 = <11100 | 00000111111000101>, f(s3) = 3.250650
After the crossover operation:
s'2 = <00000 | 00000111111000101>, f(s'2) = 1.940865
s'3 = <11100 | 01110000000010000>, f(s'3) = 3.459245
(3) Mutation (with a probability pm)
e.g.,
s3 = <1110000000111111000101>, f(s3) = 3.250650
After the mutation operation:
s'3 = <1110100000111111000101>, f(s'3) = 0.917743
or
s3 = <1110000000111111000101>
s"3 = <1110000001111111000101>, f(s"3) = 3.343555

Simulation results
With N = 50, pc = 0.25, pm = 0.01, the best individual was obtained at generation 89:
smax = <1101001111110011001111>
xmax = 1.850549
f(xmax) = 3.850274

The best individual at each iteration (up to 150 generations)
Generation   Chromosome of the best individual   x          Fitness
1            1000111000010110001111              1.831624   3.534806
11           0110101011100111001111              1.854860   3.833286
17           1110101011111101001111              1.847536   3.842004
54           1000110110100011001111              1.848699   3.847155
71           0100110110001011001111              1.850897   3.850162
89           1101001111110011001111              1.850549   3.850274
150          1101001111110011001111              1.850549   3.850274

Summary on Basic Concepts
Evolution is an optimization process, where the aim is to improve the ability of individuals to survive.
An evolutionary algorithm (EA) is a stochastic search for an optimal solution to a given problem.
Evolution is a search through the enormous genetic parameter space for the best genetic make-up.
Borrow ideas from nature to help us solve problems that have equally large search spaces or similarly changing environments.

- Genotype: describes the genetic composition of an individual
- Phenotype: the expressed behavioral traits of an individual in a specific environment
- Selection: use the fitness evaluations to decide which are the best parents to reproduce
- Crossover: generate offspring by combining parts of the parents
- Mutation: introduce new genetic material into an existing individual
- Coding: phenotype → genotype
- Decoding: genotype → phenotype

Simple Genetic Algorithm (SGA)
Representation           Binary strings
Recombination            N-point crossover (commonly 1-point and 2-point) or uniform; pc typically in the range (0.6, 0.9)
Mutation                 Bitwise bit-flipping with fixed probability pm (typically between 1/pop_size and 1/chromosome_length)
Parent selection         Fitness-proportionate
Survivor selection       All children replace parents
Speciality               Emphasis on crossover

Chromosome Representation
Genotype space = $\{0,1\}^I$ (I-bit binary strings)
[Figure: the phenotype space is mapped to the genotype space by coding (encoding, or representation) and back by decoding (inverse representation). Example genotypes: 10010001, 10010010, 010001001, 011101001.]

- Binary-valued variables: no extra coding required
- Nominal-valued variables: D bits encode $2^D$ discrete nominal values, e.g., four colors: red (00), blue (01), green (10), yellow (11)
- Continuous-valued variables: a mapping $\phi: \mathbb{R} \to \{0,1\}^I$ is needed

Other Representations
Gray coding of integers (still binary chromosomes)
Gray coding is a mapping in which consecutive integers differ in exactly one bit, so a small change in the phenotype never requires a large change in the genotype (unlike standard binary coding, e.g., 0111 (7) and 1000 (8), which differ in all four bits).
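As a quick check of the Gray coding property, the sketch below converts integers to and from reflected binary Gray code and compares Hamming distances for 7 and 8. The function names are hypothetical; the XOR-shift conversion is the standard reflected Gray code, which the slides do not spell out.

```python
def binary_to_gray(n: int) -> int:
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Inverse mapping: XOR together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Standard binary: 0111 (7) and 1000 (8) differ in 4 bits.
print(hamming_distance(7, 8))
# Gray coded: 0100 and 1100 differ in 1 bit.
print(format(binary_to_gray(7), "04b"), format(binary_to_gray(8), "04b"))
```

A single-bit mutation can therefore move a Gray-coded integer to a neighboring value, whereas in standard binary the step from 7 to 8 needs four simultaneous flips.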
It is generally accepted that it is better to encode numerical variables directly as:
- Integers
- Floating point variables

Stopping Criteria
- The maximum number of generations is exceeded.
- An acceptable best-fit individual has evolved.
- The average and/or maximum fitness value has not changed significantly over the past g generations.

Initial Population
The standard way of generating the initial population is to choose gene values randomly from the allowed set of values.
The goal of random selection is to ensure that the initial population is a uniform representation of the entire search space.
A large population covers a larger area of the search space, and may require fewer generations to converge.
In the case of a small population, the EA can be forced to explore a large search space by increasing the rate of mutation.

Selection Mechanisms
Selection operators:
(1) Random selection
(2) Proportional selection: roulette wheel selection
(3) Tournament selection
(4) Rank-based selection

Random Selection
Individuals are selected randomly with no reference to fitness at all.
All the individuals, good or bad, have an equal chance of being selected.

Proportional Selection: Roulette Wheel Selection
Individual   Chromosome   Fitness f_i   Selection probability P_i   Accumulated probability
1            0001100000   8             0.086957                    0.086957
2            0101111001   5             0.054348                    0.141304
3            0000000101   2             0.021739                    0.163043
4            1001110100   10            0.108696                    0.271739
5            1010101010   7             0.076087                    0.347826
6            1110010110   12            0.130435                    0.478261
7            1001011011   5             0.054348                    0.532609
8            1100000001   19            0.206522                    0.739130
9            1001110100   10            0.108696                    0.847826
10           0001010011   14            0.152174                    1.000000

$$P_i = \frac{f_i}{\sum_{j=1}^{10} f_j}, \quad f_i \ge 0$$

Roulette Wheel Selection
[Figure: a roulette wheel divided into 10 slices, one per individual, with slice sizes proportional to the selection probabilities.]
It can be visualized as spinning the wheel and testing which slice ends up at the top.
Fitness values are usually normalized to [0,1].
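The table above can be reproduced, and a single spin of the wheel simulated, with the short Python sketch below. The helper names are hypothetical, and `bisect` is used only as a compact way to find the first accumulated probability that is not smaller than the random number.

```python
import bisect
import itertools

# Fitness values of the 10 individuals in the table above.
fitnesses = [8, 5, 2, 10, 7, 12, 5, 19, 10, 14]


def selection_probabilities(fits):
    """P_i = f_i / sum_j f_j."""
    total = sum(fits)
    return [f / total for f in fits]


def roulette_select(cumulative, xi):
    """Return the 1-based index of the first individual whose
    accumulated probability is >= xi, for xi in [0, 1)."""
    idx = bisect.bisect_left(cumulative, xi)
    # Guard against floating-point round-off in the last accumulated value.
    return min(idx, len(cumulative) - 1) + 1


probabilities = selection_probabilities(fitnesses)
cumulative = list(itertools.accumulate(probabilities))

print([round(p, 6) for p in probabilities])
print(roulette_select(cumulative, 0.070221))   # 1
print(roulette_select(cumulative, 0.545929))   # 8
```

In a full GA the value `xi` would come from a uniform random generator such as `random.random()`, and the function would be called once per parent slot.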

Assume that the following random number sequence is generated:
0.070221  0.545929  0.784567  0.446930  0.507893
0.291198  0.176340  0.272901  0.371435  0.854641
Comparing the random number sequence with the accumulated probabilities, we select the individuals:
1, 8, 9, 6, 7, 5, 4, 5, 6, 10.
Individuals 2 and 3 were not selected; their places were taken by second copies of individuals 5 and 6.
Individuals with high fitness tend to survive, while those with low fitness may be removed.

The pseudocode:
1. $n = 1$, where $n$ denotes the chromosome index
2. $sum = P_n$
3. Obtain a uniform random number $\xi \sim U(0,1)$
4. While $sum < \xi$:
   (a) $n = n + 1$
   (b) $sum = sum + P_n$
5. Return chromosome $n$ as the selected individual.

Tournament Selection
A group of k individuals is randomly selected. The individual with the best fitness is selected from the group.
The advantage: the worst individuals of the population will not be selected and the best individuals will not dominate the reproduction process.
For crossover, two tournaments are held, one to select each of the two parents.
It is possible that
(1) a parent is selected to reproduce more than once; and
(2) one individual combines with itself to reproduce offspring.

Rank-Based Selection
The rank ordering of the fitness values, and not the fitness values themselves, is used to determine the probability of selection.
Non-deterministic linear sampling
Individuals are sorted in decreasing order of fitness.
(1) Let n = random(random(N)), where N is the total number of individuals and random(N) returns a number between 1 and N.
(2) Return n as the selected individual.

Elitism
Elitism involves the selection of a set of individuals from the current generation to survive to the next generation.
The number of individuals to survive to the next generation, without being mutated, is referred to as the generation gap.
Generation gap = k: the k best individuals, or k individuals selected using any selection operator.

Crossover

Uniform Crossover
A mask (vector) of length I (an I-bit binary string) is created at random for each pair of individuals selected for reproduction. A bit with a value of 1 indicates that the corresponding allele (bit) has to be swapped.
1. $m_i = 0$ for all $i = 1, 2, \ldots, I$.
2. For each $i = 1, 2, \ldots, I$:
   (a) Calculate a random value $\xi \sim U(0,1)$.
   (b) If $\xi \le p_x$, then $m_i = 1$.
3. Return the mask vector $m$.
Note: $p_x$ is the crossover probability at each position in the chromosome. It is different from $p_c$.
[Figure: uniform crossover of Parent 1 and Parent 2 with mask 1 0 1 0 0 1 0 0 0 1, producing Offspring 1 and Offspring 2.]

One-point Crossover
1. Calculate a random integer value $\xi \sim U(1, I-1)$.
2. $m_i = 0$ for all $i = 1, 2, \ldots, I$.
3. For each $i = \xi + 1, \ldots, I$, let $m_i = 1$.
4. Return the mask vector $m$.
[Figure: one-point crossover of Parent 1 and Parent 2 with mask 0 0 0 0 0 0 1 1 1 1, producing Offspring 1 and Offspring 2.]

Two-point Crossover
1. Compute two random integer values $\xi_1, \xi_2 \sim U(1, I)$ with $\xi_1 < \xi_2$.
2. $m_i = 0$ for all $i = 1, 2, \ldots, I$.
3. For each $i = \xi_1 + 1, \ldots, \xi_2$, let $m_i = 1$.
4. Return the mask vector $m$.
[Figure: two-point crossover of Parent 1 and Parent 2 with mask 0 0 1 1 1 1 0 0 0 0, producing Offspring 1 and Offspring 2.]

Arithmetic Crossover
Arithmetic crossover can be used in the case of continuous-valued genes.
1. Consider two parents $C_{n_1}$ and $C_{n_2}$. The $i$th components of the two offspring, $O_{n_1}$ and $O_{n_2}$, are generated by
$$O_{n_1,i} = r_1 C_{n_1,i} + (1.0 - r_1) C_{n_2,i}$$
$$O_{n_2,i} = (1.0 - r_2) C_{n_1,i} + r_2 C_{n_2,i}$$
with $r_1, r_2 \sim U(0,1)$.

Mutation
Alter each gene independently with a probability pm (the mutation rate).
For real-valued representations, mutation occurs by adding a random value (usually sampled from a Gaussian distribution $N(0, \sigma^2)$) to the allele.
The variance is usually a function of the fitness of the individual.
Individuals with a good fitness value will be mutated less, while a bad fitness value will lead to large mutations.

Mutation
[Figure: bit-flip mutation. 0 1 0 0 1 1 1 1 becomes 0 1 0 0 0 1 1 1.]

The Evolution Mechanism
Increasing diversity by using genetic operators:
- mutation
- crossover
Decreasing diversity by selection:
- of parents
- of things to kill

Fitness Functions
Common fitness functions: use the objective function f(x) directly.
(1) For a maximization problem, Fit(f(x)) = f(x)
e.g., find the maximum of the function $f(x) = x \sin(10\pi x)$
(2) For a minimization problem, Fit(f(x)) = -f(x)
e.g., the solution for $x^2 + x = 2$, to minimize $f(x) = x^2 + x - 2$

Clipping
(1) For a minimization problem
$$Fit(f(x)) = \begin{cases} c_{max} - f(x), & f(x) < c_{max} \\ 0, & \text{otherwise} \end{cases}$$
$c_{max}$ is an estimate of the maximum.
(2) For a maximization problem
$$Fit(f(x)) = \begin{cases} f(x) - c_{min}, & f(x) > c_{min} \\ 0, & \text{otherwise} \end{cases}$$
$c_{min}$ is an estimate of the minimum.

Mapping
(1) For a maximization problem
$$Fit(f(x)) = \frac{1}{1 + c - f(x)}, \quad c \ge 0, \; c - f(x) \ge 0$$
(2) For a minimization problem
$$Fit(f(x)) = \frac{1}{1 + c + f(x)}, \quad c \ge 0, \; c + f(x) \ge 0$$
$c$ is a conservative estimate of the value range of the objective function.

Linear transformation of fitness functions
$$f' = \alpha f + \beta$$
Two conditions: (1) $f'_{avg} = f_{avg}$ and (2) $f'_{max} = c_{mult} f_{avg}$ with $c_{mult} \in [1.0, 2.0]$. These give
$$\alpha = \frac{(c_{mult} - 1) f_{avg}}{f_{max} - f_{avg}} \quad \text{and} \quad \beta = \frac{(f_{max} - c_{mult} f_{avg}) f_{avg}}{f_{max} - f_{avg}} \quad \text{(see Fig. ft1)}$$
To make sure that no fitness is negative after the transformation, we can use
$$\alpha = \frac{f_{avg}}{f_{avg} - f_{min}} \quad \text{and} \quad \beta = \frac{-f_{min} f_{avg}}{f_{avg} - f_{min}} \quad \text{(see Fig. ft2)}$$
It is easy to verify that
(1) $f'_{min} = 0$,
(2) $f'_{avg} = f_{avg}$, and
(3) $f'_{max} = \dfrac{f_{max} - f_{min}}{f_{avg} - f_{min}} f_{avg} \le c_{mult} f_{avg}$.
[Fig. ft1 and Fig. ft2: scaled fitness f' plotted against raw fitness f, with f_min, f_avg and f_max marked on the horizontal axis and f'_min, f'_avg and c_mult f_avg on the vertical axis.]

Genetic algorithms: case study
To find the maximum of the "peak" function of two variables x and y, where -3 ≤ x, y ≤ 3.

Chromosome representation

Initial population

The first generation

Local maximum

Global maximum

Evolutionary Computation: Applications
- Robotics
- Control
- Design
- Scheduling/Routing/Resource Allocation
- Machine Learning
- Pattern Recognition
- Market forecasting
- Data Mining
- Game Playing: Robocode, Backgammon, Chess

Robocode
An open source educational game started by Mathew Nelson (originally provided by IBM).
Contributions are currently being made by various people; officially, Flemming N. Larsen is working on Robocode to keep it current and fix the bugs.
The game is designed to help people learn to program in Java and enjoy the experience.
Genetic Programming (GP) Robocode

Application: Routing Optimization
The problem: given a network of M switches, an origin switch and a destination switch, the objective is to find the best route to connect a call between the origin and destination switches (Sevenster and Engelbrecht 1996).
PSTN: Public Switched Telephone Network

Chromosome representation:
- variable length
- each gene representing one switch
- integer values for switch numbers
- the first gene and last gene representing the origin and destination switches, respectively
Examples:
(1 3 6 10)
(1 5 2 5 10) = (1 5 2 10)
Duplicate switches are ignored.

Initialization of population:
- randomly generated, with the restriction that the first gene represents the origin switch and the last gene the destination switch

Fitness function: a multi-criteria objective function was applied.
$$F_j = a F_j^{Switch} + b F_j^{Block} + c F_j^{Util} + d F_j^{Cost}$$
$$F_j^{Switch} = \frac{r_j}{M}$$
where $r_j$ is the total number of switches in the route and $M$ is the total number of switches.
- $F_j^{Switch}$: to minimize the route length
- $F_j^{Block}$: to minimize the route congestion
- $F_j^{Util}$: to maximize the utilization of the links
- $F_j^{Cost}$: to minimize the route cost
The constants a, b, c, and d control the influence of each criterion.

Selection: any selection operator
Crossover: any crossover operator
Mutation: replacing selected genes with a switch selected uniformly at random in the range [1, M].

Real World EC
Trends include:
- More complex representations and operators
- Use of problem-specific knowledge for seeding the initial population and creating heuristic operators
- Hybridisation with other methods

Advantages of EC
- Handles huge search spaces
- Balances exploration and exploitation
- Easy to try, not knowledge intensive
- Easy to combine with other methods
- Provides many alternative solutions
- Can continually evolve solutions to fit a continually changing problem

Disadvantages of EC
- No guarantee of an optimal solution within finite time
- Weak theoretical basis
- May need extensive parameter tuning
- Often computationally expensive, i.e., slow
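
To tie the pieces of the lecture together, the general genetic algorithm from the earlier slides can be sketched end to end for the example function f(x) = x sin(10πx) + 2.0 on [-1, 2]. This is only an illustrative sketch under stated assumptions: the slides do not say which crossover or replacement scheme produced the reported simulation, so one-point crossover, roulette wheel selection and elitism with a single copied individual are assumed here, and all function names are hypothetical. The defaults N = 50, pc = 0.25 and pm = 0.01 are taken from the simulation slide; results will differ from run to run and from the slide's table.

```python
import math
import random

N_BITS, X_MIN, X_MAX = 22, -1.0, 2.0


def decode(bits):
    """Genotype (list of 0/1, most significant bit first) -> phenotype x."""
    x_prime = int("".join(map(str, bits)), 2)
    return X_MIN + x_prime * (X_MAX - X_MIN) / (2 ** N_BITS - 1)


def fitness(bits):
    """f(x) = x sin(10 pi x) + 2.0, positive on [-1, 2], so usable as a weight."""
    x = decode(bits)
    return x * math.sin(10 * math.pi * x) + 2.0


def roulette(population, fits, rng):
    """Fitness-proportionate selection of one parent."""
    return rng.choices(population, weights=fits, k=1)[0]


def one_point_crossover(p1, p2, rng):
    """Swap the tails of two parents after a random cut point in [1, I-1]."""
    cut = rng.randint(1, N_BITS - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def mutate(bits, pm, rng):
    """Flip each bit independently with probability pm; returns a new list."""
    return [b ^ 1 if rng.random() < pm else b for b in bits]


def run_ga(n=50, pc=0.25, pm=0.01, generations=150, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(N_BITS)] for _ in range(n)]
    history = []                                  # best fitness per generation
    for _ in range(generations):
        fits = [fitness(ind) for ind in population]
        best = population[fits.index(max(fits))]
        history.append(max(fits))
        offspring = [best[:]]                     # elitism: copy the best unchanged
        while len(offspring) < n:
            a = roulette(population, fits, rng)
            b = roulette(population, fits, rng)
            if rng.random() < pc:                 # crossover with probability pc
                a, b = one_point_crossover(a, b, rng)
            offspring.append(mutate(a, pm, rng))
            if len(offspring) < n:
                offspring.append(mutate(b, pm, rng))
        population = offspring
    best = max(population, key=fitness)
    return decode(best), fitness(best), history


if __name__ == "__main__":
    x_best, f_best, _ = run_ga()
    print(f"x = {x_best:.6f}, f(x) = {f_best:.6f}")
```

Because the best individual is copied unchanged into each new generation, the best fitness recorded in `history` never decreases; without that assumed elitism step the best solution could be lost to crossover or mutation.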