Part 1: Motivation, Basic Concepts, Algorithms

Review of Biological Evolution
• Evolution is a long-time-scale process that changes a population of organisms by generating better offspring through reproduction.
  – Chromosome: DNA-coded information characterizing an organism.
  – Gene: Elementary DNA block of information (e.g., eye color).
  – Allele: One of the possible values for a gene (e.g., brown, blue, ...).
  – Trait: The physical characteristic encoded by a gene.
  – Genotype: A particular set of genes.
  – Phenotype: The physical realization of a genotype (e.g., a person).
  – Fitness: A measure of success in life for an organism.
  – Crossover (Recombination): Chromosomes from the parents exchange genetic material to generate a new offspring.
  – Mutation: Error occurring during DNA replication from parents.

• Genetic algorithms (GAs) mimic the processes of biological evolution in order to solve problems and to model evolutionary systems.
• Developed by John Holland, University of Michigan (1970s)
  – To understand the adaptive processes of natural systems.
  – To develop ways in which the mechanisms of natural adaptation might be imported into computer systems for optimization and machine learning applications.

• Why use the mechanisms of natural evolution for solving computational problems?
  – Evolution searches among an enormous number of possible genetic sequences to create highly fit organisms that survive and reproduce in their environments.
  – Species evolve by means of random variation (via mutation, recombination, and other operators), followed by natural selection, in which the fittest tend to survive and reproduce, thus propagating their genetic material to future generations.
  – Likewise, computational problems involve searching through a large solution space for optimal or sub-optimal solutions.

• Optimization (e.g., circuit layout, job shop scheduling, ...)
• Prediction (e.g., weather forecast, protein folding, ...)
• Classification (e.g., fraud detection, quality assessment, ...)
• Economy (e.g., bidding strategies, market evaluation, ...)
• Ecology (e.g., biological arms races, host-parasite coevolution, ...)
• Automatic programming.
• Best suited for:
  – Large, non-convex search spaces.
  – Finding a good sub-optimum in a reasonable time, rather than spending years on finding, perhaps, the best solution.

• GAs have the following elements in common:
  – Populations of chromosomes
  – Selection according to fitness
  – Crossover to produce new offspring
  – Random mutation of new offspring.
• A single solution is represented as a vector of components.
  – The vector is called a chromosome.
  – The components are called genes.
• There is a population of chromosomes (solutions) that cooperatively act towards a common goal.
• The GA is an evolutionary, or iterative, algorithm that modifies the population of solutions at each epoch (cycle, iteration) of the algorithm.
  – The modification is done by crossover and mutation.

{
    initialize population;
    measure fitness of the population;
    do while Termination Criteria Not Satisfied {
        select parents for reproduction;
        perform recombination and mutation;
        measure fitness of the population;
    }
}

• Chromosomes could be:
  – Bit strings (0101 ... 1100)
  – Integers (7 5 ... 1 99)
  – Real numbers (43.2 -33.1 ... 0.0 89.2)
  – Permutations of elements (E11 E3 E7 ... E1 E15)
  – Lists of rules (R1 R2 R3 ... R22 R23)
  – Program elements (genetic programming)
  – ... any data structure ...
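
The generic loop and the bit-string encoding listed above can be sketched in Python. This is only a minimal illustration under stated assumptions, not the slides' implementation: the fitness function (count of 1 bits), the size-2 tournament, the one-point crossover, the per-bit mutation rate, the fixed generation count used as the termination criterion, and every function and parameter name are hypothetical choices made for the example.

```python
import random

def onemax(chromosome):
    """Hypothetical fitness: number of 1 bits (higher is better)."""
    return sum(chromosome)

def tournament(population, fitness, rng, size=2):
    """Pick `size` individuals at random and return the fittest of them."""
    contenders = [rng.choice(population) for _ in range(size)]
    return max(contenders, key=fitness)

def one_point_crossover(mother, father, rng):
    """Swap the tails of two equal-length parents at a random cut point."""
    cut = rng.randrange(1, len(mother))
    return mother[:cut] + father[cut:], father[:cut] + mother[cut:]

def mutate(chromosome, rng, rate):
    """Flip each bit independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

def run_ga(n_genes=20, pop_size=30, generations=50, rate=0.02, seed=0):
    rng = random.Random(seed)
    # initialize population
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    # measure fitness of the population (remember the best seen so far)
    best = max(population, key=onemax)
    # assumed termination criterion: a fixed number of generations
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            # select parents for reproduction
            mother = tournament(population, onemax, rng)
            father = tournament(population, onemax, rng)
            # perform recombination and mutation
            a, b = one_point_crossover(mother, father, rng)
            children += [mutate(a, rng, rate), mutate(b, rng, rate)]
        population = children[:pop_size]
        # measure fitness of the new population
        best = max(population + [best], key=onemax)
    return best

best = run_ga()
print(onemax(best), best)
```

Remembering `best` across generations is bookkeeping for reporting only; the remembered individual is not fed back into the population.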

Parameter Estimation Example
• Find the parameter values such that [condition not recoverable from the source].
• Chromosome with genes (i.e., representation of a solution): [figure not recoverable from the source]
• Population of solutions: [figure not recoverable from the source]

Travelling Salesperson Problem
• Find a tour of a given set of cities so that each city is visited only once and the total distance traveled is minimized.
• Chromosome with genes (i.e., representation of a solution): [figure not recoverable from the source]
• Population of solutions: [figure not recoverable from the source]

• Crossover, or recombination, is the GA's distinguishing feature.
• It involves mixing and matching parts of two parents to form children.
• Crossover was originally based on the premise that highly fit individuals often share certain traits, called building blocks, in common.
• For fixed-length vector individuals, a building block was often defined as a collection of genes set to certain values.
• For example, perhaps two particular parameters both need to be small values in a parameter estimation problem (they should be in some range).
• For example, in the Boolean individual 10110101, perhaps ***101*1 might be a building block (where the * positions aren’t part of the building block).

• How you do that mixing and matching depends on the representation of the individuals and the objective of the optimization problem.
• Some crossover techniques are based only on the representation of the chromosomes and ignore the optimization objective, while others take both into account.
• Among the representation-only techniques, there are three classic ways of doing crossover in vectors: One-Point, Two-Point, and Uniform Crossover.

• One-point crossover picks a number c between 1 and l, inclusive (where l is the vector length), and swaps all the indexes from c to l.

    v ← 1st vector to be crossed over
    w ← 2nd vector to be crossed over
    c ← random integer chosen uniformly from 1 to l inclusive
    if c ≠ 1 then
        for i from c to l do
            Swap the values of v_i and w_i
    return v and w

• The problem with one-point crossover is that it may break important linkages between components.
• Notice that the probability is high that v_1 and v_l will be broken up due to crossover, as any choice of c will do it, except for c = 1.
• If the organization of your vector was such that elements v_1 and v_l had to work well in tandem in order to get a high fitness, you’d be constantly breaking up good pairs that the system discovered.

• Two-point crossover is one way to alleviate the linkage problem.
• Just pick two numbers c and d, and swap the indexes between them.
• Think of the vectors as rings to understand how the endpoints don’t get broken.
• However, the probability of swapping indexes is still not uniform, or perfectly fair, which might be required in some applications.

• We can treat all genes fairly with respect to linkage by crossing over each point independently of the others, using Uniform Crossover.
• Here we simply march down the vectors and swap individual indexes if a coin toss comes up heads with probability p.

Crossover Incapable of Exploring Entire Solution Space
• If you cross over two vectors, you can’t get every conceivable vector out of it.
• Imagine your vectors were points in space.
• Now imagine the hypercube formed with those points at its extreme corners.
• Crossovers will result in new vectors which lie at some other corner of the hypercube.
• For example, the possible crossovers of the two vectors (1,2,3) and (4,5,6) are (4,5,3), (4,2,6), (1,5,6), (4,2,3), (1,2,6), and (1,5,3). Other vectors, such as (1,1,1), are not possible.
• Thus, to give GAs a chance to “explore” a wider search space, another form of change is required (i.e., mutation).

• Mutation allows the algorithm to explore more of the solution space than crossover alone allows.
  – It provides genetic diversity from one generation of a population of genetic algorithm chromosomes to the next.
• Mutation alters one or more gene values in a chromosome from its initial state.
  – The larger the number of gene values that are mutated, the larger the region of the solution space that may be searched.
  – Mutation occurs during evolution according to a user-definable mutation probability. This probability should be set low. If it is set too high, the search will turn into a primitive random search.
• Example: Single-point mutation
  – A random number is drawn to choose which gene in a chromosome should be randomly altered.

More Controlled Form of Mutation: Line Recombination
[equations not recoverable from the source]
These linear combinations result in points that lie on the line joining the two initial points.

• How do we select parents for crossover?
• Selection of parents (chromosomes) for crossover should not be done purely at random, but in a way that is focused on achieving the common goal.
• For example, selection should be based on fitness.
  – The probability of being selected as a parent is proportional to fitness.
• Two possible selection methods (there are others):
  – Tournament selection.
  – Roulette-wheel selection.

• Divide the wheel into subintervals, one for each individual in the current generation.
• The interval length is proportional to the individual’s fitness.
• A uniformly distributed random number chooses the subinterval (i.e., parent 1).
• Do the same again for parent 2.
Source: http://www.edc.ncl.ac.uk/assets/hilite_graphics/rhjan07g02.png

Roulette-wheel Implementation

    Population Member #   Initial Fitness Value   Fitness Index (CDF Value)
    1                     5                       5
    2                     2                       7
    3                     4                       11
    4                     2                       13
    5                     8                       21
    6                     2                       23
    7                     1                       24
    8                     4                       28

Algorithm 30: Roulette-wheel Selection
[algorithm listing not recoverable from the source]

Stochastic Universal Sampling
• A problem with roulette-wheel selection is that, because the selection is random, the fittest individual may never be chosen.
• In Stochastic Universal Sampling, the selection is biased so that fit individuals always get picked at least once.

Algorithm 31: Stochastic Universal Sampling
[algorithm listing and figure not recoverable from the source]

• Tournament selection involves running “tournaments” among a number of individuals chosen at random from the current generation (population).
• The winner of each tournament (the one with the best fitness) is selected.
• Choose a tournament size, t.
• Randomly select t chromosomes (solutions) from the current generation, and choose the fittest of these to be the “mother design.”
• Randomly select t chromosomes (solutions) from the current generation, and choose the fittest of these to be the “father design.”

Algorithm 32: Tournament Selection
• Randomly select t chromosomes (designs) from the current generation, and choose the fittest of these to be the chosen one.

Selection Bias (Fitness Pressure)
• Tournament size controls selection bias (fitness pressure).
  – The greater the tournament size, the higher the pressure in the algorithm for selecting individuals of higher fitness.
  – Extreme case #1: If the tournament size is equal to the generation size, the fittest solution in the current generation would always be selected as both the mother and the father.
  – Extreme case #2: If the tournament size is one, fitness is completely ignored and the mother and father are selected randomly.

Tournament Selection Benefits
• Tournament selection has become the primary selection technique used for the genetic algorithm.
• Efficient to code.
• Works on parallel architectures.
• Allows the selection pressure to be easily adjusted.
• The most popular setting is [value missing in source].

• The elitist genetic algorithm injects the fittest individual(s) from the previous population into the next population.
• These individuals are called the elites.
• By keeping the best individual(s) around in future populations, this algorithm is exploitative.

Algorithm 33: Genetic Algorithm with Elitism
[algorithm listing not recoverable from the source]

Hybrid Optimization Algorithms
• There are many ways to create hybrids of various metaheuristic algorithms, such as a hybrid of evolutionary computation and hill-climbing.
• Example:
  – Augment an evolutionary algorithm with some hill-climbing during the fitness assessment phase to revise each individual as it is being assessed.
  – The revised individual replaces the original one in the population.

Algorithm 36: Steady-state Genetic Algorithm
[algorithm listing not recoverable from the source]

• Differential evolution is an adaptive mutation algorithm.
• Designed for multidimensional real-valued spaces.
• Children must compete directly against their immediate parents for inclusion in the population.
• The size of mutations is based on the current variance in the population.
  – If the population is spread out, mutation will make major changes.
  – If the population is condensed in a certain region, mutations will be small.

• The idea is to mutate away from one of three chosen individuals by adding a vector to it.
• This vector is created from the difference between the other two individuals.
• If the population is spread out, those two individuals are likely to be far from one another and the mutation vector is large; otherwise it is small.
• If the child is better than the parent, it replaces the parent in the original population; otherwise the child is thrown away.

[1] J. D. Hedengren, "Optimization Techniques in Engineering," 5 April 2015. [Online]. Available: http://apmonitor.com/me575/index.php/Main/HomePage. [Accessed 27 April 2015].
[2] A. R. Parkinson, R. J. Balling and J. D. Hedengren, "Optimization Methods for Engineering Design: Applications and Theory," Brigham Young University, 2013.
[3] S. Luke, "Essentials of Metaheuristics," [Online]. Available: http://cs.gmu.edu/~sean/book/metaheuristics/Essentials.pdf. [Accessed 11 May 2015].
[4] J. D. Hedengren, "Genetic Algorithms in Engineering Design," [Online]. Available: http://apmonitor.com/me575/uploads/Main/chap5_genetic_algorithms.pdf. [Accessed 28 April 2015].