Genetic algorithm

Definition
• The genetic algorithm is a probabilistic search algorithm that iteratively transforms a set (called a population) of mathematical objects (typically fixed-length binary character strings), each with an associated fitness value, into a new population of offspring objects, using the Darwinian principle of natural selection and operations patterned after naturally occurring genetic operations, such as crossover (sexual recombination) and mutation.

Genetic Algorithms - History
• Pioneered by John Holland in the 1970s
• Became popular in the late 1980s
• Based on ideas from Darwinian evolution
• Can be used to solve a variety of problems that are not easy to solve using other techniques

Finding a solution to a problem is often thought of as follows:
• In computer science: a process of search through the space of possible solutions. Partial solutions are viewed as points in the search space.
• In engineering and mathematics: the problem is first formulated as a mathematical model, and then the parameters that give the best solution are set.

Why genetic algorithm
• Genetic algorithms can be used to solve problems that are not well suited to standard optimization algorithms, including problems in which the objective function is discontinuous, non-differentiable, stochastic, or highly nonlinear.

Classical derivative-based optimization:
• Generates a single point at each iteration. The sequence of points approaches an optimal solution.
• Selects the next point in the sequence by a deterministic computation.

Genetic algorithm:
• Generates a population of points at each iteration. The best point in the population approaches an optimal solution.
• Selects the next population by a computation that uses random number generators.

Optimization
• Optimization: the process of finding an optimal solution (maximum or minimum) that satisfies the constraints
• It focuses on three factors:
• 1) An objective function: the function to be maximized or minimized (example: maximize the profit and minimize the cost in manufacturing)
• 2) A set of unknowns or variables (the amount of resources used, the time spent, etc.)
• 3) A set of constraints (availability of space, money, etc.)

Our main concerns here are:
1) How to describe the process of search
2) How to implement and carry out the search
3) What elements are required to carry out the search

Genetic Algorithm

Basic genetics
• All living organisms consist of cells
• Each cell of a living thing contains chromosomes - strings of DNA
• Each chromosome contains a set of genes - blocks of DNA
• Each gene determines some aspect of the organism (like eye colour)
• A collection of genes is sometimes called a genotype
• A collection of aspects (like eye colour) is sometimes called a phenotype

[Figure-only slides: Basic genetics; General scheme of evolutionary process; Terminology; Working principle; Outline of genetic algorithm]

Silly Example - Drilling for Oil
• Imagine you had to drill for oil somewhere along a single 1 km desert road
• Problem: choose the place on the road that produces the most oil per day
• We could represent each solution as a position on the road, say a whole number in the range [0..1000]

Where to drill for oil?
[Figure: a road running from 0 to 1000 with a mark at 500; Solution1 = 300 and Solution2 = 900 are marked on it]

Digging for Oil
• The set of all possible solutions [0..1000] is called the search space or state space
• In this case it's just one number, but it could be many numbers or symbols
• GAs often encode numbers in binary, producing a bit string that represents a solution
• In our example we choose 10 bits, which is enough to represent 0..1000

Convert to binary string

       512  256  128   64   32   16    8    4    2    1
 900     1    1    1    0    0    0    0    1    0    0
 300     0    1    0    0    1    0    1    1    0    0
1023     1    1    1    1    1    1    1    1    1    1

In GAs these encoded strings are sometimes called "genotypes" or "chromosomes", and the individual bits are sometimes called "genes".

Drilling for Oil
[Figure: oil yield (OIL) plotted against location along the road from 0 to 1000, with Solution1 = 300 (0100101100) and Solution2 = 900 (1110000100) marked; the values 30 and 5 appear on the OIL axis]

Summary
We have seen how to:
• represent possible solutions as a number
• encode a number into a binary string
• generate a score for each number given a function of "how good" each solution is; this is often called a fitness function
• Our silly oil example is really optimisation over a function f(x), where we adapt the parameter x

Lecture 2
• Representation
• Selection (Reproduction)
• Crossover
• Mutation
• Problem solving using GA

Representation
• Before any algorithm is put to work on a problem, the partial solutions have to be encoded so that a computer can process them. Chromosomes could be:
– Bit strings (0101 ... 1100)
– Real numbers (43.2 -33.1 ... 0.0 89.2)
– Permutations of elements (E11 E3 E7 ... E1 E15)
– Lists of rules (R1 R2 R3 ... R22 R23)
– Program elements (genetic programming)
– ... any data structure ...

Binary encoding
• Binary representation: here encoding is done using a sequence of 1s and 0s.
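The encoding step from the oil example can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `encode`, `decode` and `fitness` are made up here, and the fitness function is a hypothetical single-peak oil yield chosen only so that there is something to score.

```python
N_BITS = 10  # 10 bits cover 0..1023, enough for road positions 0..1000


def encode(position: int, n_bits: int = N_BITS) -> str:
    """Return the fixed-width binary string (chromosome) for a position."""
    if not 0 <= position < 2 ** n_bits:
        raise ValueError("position does not fit in n_bits")
    return format(position, f"0{n_bits}b")


def decode(chromosome: str) -> int:
    """Return the integer position represented by a bit string."""
    return int(chromosome, 2)


def fitness(position: int) -> float:
    """Hypothetical oil yield with one peak at position 300 (illustrative only)."""
    return max(0.0, 30.0 - abs(position - 300) / 20.0)


if __name__ == "__main__":
    for pos in (300, 900, 1023):
        bits = encode(pos)
        print(pos, bits, decode(bits), fitness(pos))
```

Note that a 10-bit string can also represent 1001..1023, which lie outside the road; a real implementation would have to decide how to treat such values (for example by clipping or penalising them).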
[Figure-only slide: Example]

Decoding a value
• For a string of length $n_i$, the accuracy of the variable approximation is $(X_i^U - X_i^L) / 2^{n_i}$

[Figure-only slides: Permutation encoding; Tree encoding]

Genetic operators
• Selection (Reproduction)
• Crossover (Recombination)
• Mutation

Selection
• The fitness value $F_i$ of each chromosome is calculated
• The probability of selecting the $i$th chromosome is $P_i = F_i / \sum_{j=1}^{pop\_size} F_j$
• The cumulative frequency is $q_i = \sum_{j=1}^{i} P_j$
• Generate a random number $r$ from the range [0, 1]
• If $r < q_1$, select the first chromosome; otherwise select the $i$th chromosome ($2 \le i \le pop\_size$) such that $q_{i-1} < r \le q_i$

Different methods
• Roulette wheel selection
• Rank selection
• Boltzmann selection
• Tournament selection

Example: Roulette-wheel selection
In roulette wheel selection, individuals are given a probability of being selected that is directly proportional to their fitness.

Columns: population number (No.), population string, fitness, selection probability p_i, expected count n x p_i, cumulative frequency, random number between 0 and 1, string number selected by that random number, and count in the mating pool.

No.  String  Fitness  p_i     n x p_i  Cum. freq.  Random no.  String no.  Count
1    0000    1        0.0429  0.33     0.0429      0.259       3           1
2    0010    2.1      0.090   0.72     0.1326      0.038       1           1
3    0001    3.11     0.1336  1.064    0.266       0.486       5           1
4    0010    4.01     0.1723  1.368    0.438       0.428       4           2
5    0110    4.66     0.2     1.6      0.638       0.095       2           2
6    1110    1.91     0.082   0.656    0.720       0.3         4           0
7    1100    1.93     0.0829  0.664    0.809       0.616       5           0
8    0111    4.55     0.1955  1.56     1           0.897       8           1

Problem
• Find the expected number of copies of the best string for a maximization problem using 1) roulette wheel selection and 2) tournament selection

String  Fitness
01101   5
11000   2
10110   1
00111   10
10101   3
00010   100

[Figure-only slide: Boltzmann Selection]

Crossover
[Figure-only slide: One-point crossover]

Two-point crossover
Offspring 1: 11011 1100001 0110
Offspring 2: 11011 0010011 1110

[Figure-only slides: Uniform crossover; Arithmetic crossover]

Mutation
• Mutation is a genetic operator used to maintain genetic diversity from one generation of a population of chromosomes to the next.
• The various mutation operators are boundary, uniform, and non-uniform.

Uniform Mutation
A gene (a real number) is selected with the help of a random number drawn from a specific range.
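The three operators of this lecture (roulette-wheel selection, one-point crossover, and uniform mutation) can be sketched together as below. This is an illustrative sketch rather than the lecture's own code: the function names are invented, the random generator is passed in explicitly so runs can be repeated, and the per-gene bounds `lower` and `upper` are assumed to be supplied by the caller.

```python
import random


def roulette_select(fitnesses, rng):
    """Return the index of one individual chosen with probability F_i / sum(F)."""
    total = sum(fitnesses)
    r = rng.random()  # random number in [0, 1)
    cumulative = 0.0
    for i, f in enumerate(fitnesses):
        cumulative += f / total  # running cumulative frequency q_i
        if r <= cumulative:
            return i
    return len(fitnesses) - 1  # guard against floating-point round-off


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length strings at a random cut point."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]


def uniform_mutation(chromosome, lower, upper, rng):
    """Replace one randomly chosen real-valued gene by a uniform draw from its bounds."""
    k = rng.randrange(len(chromosome))
    mutated = list(chromosome)
    mutated[k] = rng.uniform(lower[k], upper[k])
    return mutated


if __name__ == "__main__":
    rng = random.Random(0)
    # Fitness values taken from the roulette-wheel table above.
    fitnesses = [1, 2.1, 3.11, 4.01, 4.66, 1.91, 1.93, 4.55]
    pool = [roulette_select(fitnesses, rng) for _ in range(len(fitnesses))]
    print("mating pool (0-based indices):", pool)
    print(one_point_crossover("11011000", "00100111", rng))
    print(uniform_mutation([0.5, 2.0, -1.0], [0, 0, -5], [1, 4, 5], rng))
```

Because selection is random, the mating pool produced by a given run will generally differ from the one in the table; only the selection probabilities are the same.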
For a chromosome $X^t = [X_1, X_2, \ldots, X_m]$, a random number $k$ is selected such that $k \in [1, m]$, producing an offspring $X^{t+1} = [X_1, \ldots, X'_k, \ldots, X_m]$, where $X'_k$ is a random value generated according to a uniform probability distribution over the range $[X_k^L, X_k^U]$. Here $X_k^L$ and $X_k^U$ are the lower and upper bounds on the variable $X_k$.

Boundary Mutation
The replacement of $X'_k$ by either $X_k^L$ or $X_k^U$, each with equal probability, is known as boundary mutation.

Non-uniform Mutation
Here $X'_k$ is selected as
\[
X'_k =
\begin{cases}
X_k + \Delta(t,\, X_k^U - X_k), & \text{if the random digit is 0} \\
X_k - \Delta(t,\, X_k - X_k^L), & \text{if the random digit is 1}
\end{cases}
\]
where $\Delta(t, y)$ returns a value in the range $[0, y]$ such that the probability of $\Delta(t, y)$ being close to 0 increases as $t$ increases.

Mutation can be implemented using 1) the one's complement operator, 2) logical bitwise operators, 3) the shift operator, and 4) the masking operator.

Problem

Support Vector Machine
• One of the most well-studied and widely used learning algorithms for binary classification
• Extensions of SVMs exist for a variety of other learning problems, including regression, multiclass classification, ordinal regression, ranking, structured prediction, and many others.
• Similar to perceptrons, SVMs aim to find a hyperplane that linearly separates data points belonging to different classes
• In addition, SVMs aim to find the hyperplane that is least likely to overfit the training data

Separating hyperplanes
• Which one is better: B1 or B2? Why?
• Many other separating hyperplanes are possible
• Each instance in X is an n-dimensional real vector, i.e. $X \subseteq \mathbb{R}^n$.
• Given a sample of m labeled examples $S = ((x_1, y_1), \ldots, (x_m, y_m))$ with $x_i \in X$ and $y_i \in \{-1, +1\}$
• Classification is done using the classifier $h(x) = \mathrm{sign}(w \cdot x + b)$ for some $w \in \mathbb{R}^n$, $b \in \mathbb{R}$
• Thus for $X \subseteq \mathbb{R}^n$, the basic SVM algorithm selects a classifier from the class of linear classifiers over X.
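A linear classifier over $\mathbb{R}^n$ can be sketched as follows, assuming the usual form $\mathrm{sign}(w \cdot x + b)$. The function names and the example parameters are hypothetical; in particular, `w` and `b` below are picked by hand and not learned by an SVM, and points lying exactly on the hyperplane are assigned to class -1 purely as a convention of this sketch.

```python
def decision_value(w, b, x):
    """Return w . x + b for weight vector w, bias b and instance x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 or -1 according to the side of the hyperplane x lies on."""
    return 1 if decision_value(w, b, x) > 0 else -1


if __name__ == "__main__":
    w, b = [1.0, -1.0], 0.5  # hypothetical parameters, not learned from data
    for x in ([2.0, 0.0], [0.0, 2.0]):
        print(x, decision_value(w, b, x), classify(w, b, x))
```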
Learning linear SVM
• It is convenient to represent the classes by +1 and -1 using
  $y = +1$ if $w \cdot x + b > 0$
  $y = -1$ if $w \cdot x + b < 0$
• $w$ can be rescaled such that for all points $x$ lying on the respective boundaries it holds that $w \cdot x + b = 1$ or $w \cdot x + b = -1$
• These points are called the support vectors
• The task of learning a linear SVM consists of estimating the parameters $w$ and $b$
• The first criterion is that all points in the training data must be classified correctly:
  $w \cdot x_i + b \ge 1$ if $y_i = 1$
  $w \cdot x_i + b \le -1$ if $y_i = -1$
• This can be rewritten as: $y_i (w \cdot x_i + b) \ge 1$ for $1 \le i \le N$

Linearly separable case: hard-margin SVM
• Although both classifiers separate the data, the distance or margin with which the separation is achieved is different.
• The SVM algorithm selects the classifier with the maximum margin
• The margin on $(x_i, y_i)$ is simply a signed version of the distance of $x_i$ from the hyperplane, $y_i (w \cdot x_i + b) / \lVert w \rVert$, with a positive sign if the example is classified correctly and a negative sign otherwise.
• The margin of the classifier given by $(w, b)$ on a sample $S$ is then defined as the minimal margin on $S$: $\min_{1 \le i \le N} y_i (w \cdot x_i + b) / \lVert w \rVert$
• With $w$ rescaled as above, the margin of such a classifier on $S$ then becomes simply $1 / \lVert w \rVert$
• Thus maximizing the margin becomes equivalent to minimizing the norm $\lVert w \rVert$ subject to the constraints $y_i (w \cdot x_i + b) \ge 1$, which can be written as the following optimization problem:
  minimize $\tfrac{1}{2} \lVert w \rVert^2$ over $w, b$, subject to $y_i (w \cdot x_i + b) \ge 1$ for $1 \le i \le N$
• i.e. maximize the margin subject to the constraint that all points in the training data must be classified correctly.
• This problem can be solved using Lagrange multipliers
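As a sketch of how the Lagrange multiplier approach proceeds, assume the objective is taken in the standard form $\tfrac{1}{2}\lVert w \rVert^2$ and introduce one multiplier $\alpha_i \ge 0$ per constraint (the symbols $L$ and $\alpha_i$ are notation introduced here, not taken from the slides). Setting the derivatives of the Lagrangian with respect to $w$ and $b$ to zero and substituting back gives the dual problem in the last line:

```latex
\begin{aligned}
L(w, b, \alpha) &= \tfrac{1}{2}\lVert w \rVert^2
  - \sum_{i=1}^{N} \alpha_i \bigl[ y_i (w \cdot x_i + b) - 1 \bigr],
  \qquad \alpha_i \ge 0 \\
\frac{\partial L}{\partial w} = 0 &\;\Rightarrow\; w = \sum_{i=1}^{N} \alpha_i y_i x_i \\
\frac{\partial L}{\partial b} = 0 &\;\Rightarrow\; \sum_{i=1}^{N} \alpha_i y_i = 0 \\
\max_{\alpha \ge 0} \;& \sum_{i=1}^{N} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \,(x_i \cdot x_j)
  \quad \text{subject to } \sum_{i=1}^{N} \alpha_i y_i = 0
\end{aligned}
```

At the optimum, $\alpha_i \bigl[ y_i (w \cdot x_i + b) - 1 \bigr] = 0$ for every $i$, so only the points with $y_i (w \cdot x_i + b) = 1$ can have $\alpha_i > 0$; these are the support vectors.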