Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Genetic Algorithms An Evolutionary Approach to Problem Solving Illusions of Design Living things in nature seem to be like they were designed by a skilled engineer/designer. Evolution Theory: Too exquisitely designed to be “random”. If not random, then what is the non-random process? Evolution A process of cumulative selection Individuals with new traits are developed through Mutations Sexual Reproduction Individuals with better traits are more likely to survive and are more likely to transfer those traits to their descendents. Key Principles of Evolution http://www.pbs.org/wgbh/evolution/library/11/2/e_s_4.html Variety: population of individuals with different set of traits Through reproduction: The importance of sexual reproductions versus asexual reproduction (http://www.pbs.org/wgbh/evolution/library/01/5/l_015_03.html) Random mutations Evaluation & Selection (through constraints in the environment) Reproduction: Transfer of traits to offsprings From Evolution of life in Nature to Evolution of Solutions Can evolutionary principles be used to develop solutions to problems? Can these principles be used to do Data Mining? Computation Analogy of Evolution What is the equivalent of an individual (chromosome)? What is the equivalent of an individual having “good” genetic material/good set of traits? What is the equivalent of a population of individuals? What is the equivalent of reproduction? What is the equivalent of survival of the fittest? Computation Analogy of Evolution What is an individual A collection of traits Example: An individual has the following traits Color: Black or White Speed: Fast, Medium, Slow or Very Slow Intelligence: Very Smart, Smart, Somewhat Smart, Medium Smart, Dumb or Very Dumb GA – Generic Strategy Start with a “population” of solutions (individuals) Repeat Choose 2 solutions from the population With a certain probability Apply a crossover to create child1, child2 With a certain probability, mutate child1 and child2 Update the population (discard of bad solutions). Until stopping criteria has been met Output the best solution(s) in the population What is an individual A collection of traits Example: An individual has the following traits Color: Black or White Speed: Fast, Medium, Slow, Very Slow Intelligence: Very Smart, Smart, Somewhat Smart, Medium Smart, Dumb or Very Dumb Typically a coding system is used to code each trait. E.g., binary coding. Representing Solutions The Fitness Function: Evaluation of Individual Solutions A “fitness function” . Process of Evaluating the Fitness How a Population of Solutions Looks Like Reproduction: Crossover Crossover between parents’ traits creating two children. There are many crossover operators 1. Randomly choose a position and cross over contents before that position. Crossover point White Medium Dumb Black Slow Smart Children: White Slow Smart Black Medium Dumb Reproduction Crossover 2. Randomly choose a set of traits (genes) and cross over those. Examples: Cross over color and intelligence White Medium Dumb Black Slow Smart Children: Black Medium Smart White Slow Dumb Reproduction Mutation The purpose of mutations is to take a single solution and introduce some random “shock” or changes to it to create a new solution. Implementation: Randomly chose a trait for each child and randomly change its value to another valid value Black Black Medium Fast Smart Smart White Slow Dumb White Slow Intelligent What is the equivalent of survival of the fittest? Simply give solutions with better fitness a higher probability of being chosen for reproduction. Sample with replacement: solutions with bigger fitness will be selected more times. Population What will be the composition of population in future generations? The Traveling Salesman Problem A travelling salesman who has to visit a set of n cities. Find the order in which the salesman visits cities so as to minimize total distance. Variants of this problem are found in several domains. As n gets very large, exhaustive search becomes impossible due to the combinatorial nature of the problem. Need heuristic methods to find good solutions, even if these are not guaranteed to be the “best”. The Traveling Salesman Problem Consider the TSP problem 5 cities to visit: London, Oxford, Cambridge, Brighton, and Bath. What is the best path? Cambridge Oxford London Bath Brighton The Traveling Salesman Problem Genetic Algorithms Solution Step-1: An individual is a “candidate solution”, a path. Examples of candidate solutions for the TSP: London, Oxford, Cambridge, Brighton, Bath London, Bath, Oxford, Brighton, Cambridge Brighton, London, Cambridge, Bath, Oxford. … The Traveling Salesman Problem Coding Scheme for Candidate Solutions Order in Order in Order in Order in Order in sequence sequence sequence sequence sequence of of of of of London Oxford Cambridge Brighton Bath Example: London Oxford Cambridge Brighton Bath 1 2 3 4 5 Oxford London Cambridge Bath Brighton 2 1 3 5 4 The Traveling Salesman Problem Step 2: Fitness Function How “good” are the following solutions? London, Oxford, Cambridge, Brighton, Bath London, Bath, Oxford, Brighton, Cambridge Brighton, London, Cambridge, Bath, Oxford. The Traveling Salesman Problem Fitness in TSP Problem: Distance Table Distance (in Miles) London London Oxford Cambridge Brighton Bath 0 Oxford Cambridge Brighton Bath 350 50 280 470 0 130 270 310 0 210 340 0 220 0 The Traveling Salesman Problem Creating New Solutions Crossover Operation : Randomly choose a position and cross over contents before that position. Crossover Operation for TSP: part of the first parent is copied and the rest is taken in the same order as in the second parent London 1 1 Oxford Cambridge Crossover point 2 5 Brighton Bath 3 4 5 3 2 4 Reproduction London The Traveling Salesman Problem Oxford Cambridge Crossover point 1 2 1 5 Brighton Bath 3 4 5 3 2 4 London Oxford Cambridge Brighton Bath London Brighton Cambridge Bath Oxford Child: London Oxford Brighton Cambridge Bath 1 2 4 3 5 Reproduction London The Traveling Salesman Problem Oxford Cambridge Crossover point 1 2 1 5 Brighton Bath 3 4 5 3 2 4 London Oxford Cambridge Brighton Bath London Brighton Cambridge Bath Oxford Child: London Cambridge Brighton Bath Oxford 1 5 2 3 4 Reproduction Mutation in the TSP Problem Randomly changing one gene won’t work. London Oxford Cambridge Brighton Bath 1 2 3 4 5 1 2 4 4 5 Design mutations around the “swap” concept: 1 2 5 4 3 GA for TSP For the TSP problem we have: Solution representation A fitness evaluation function Crossover operations on parents Mutation on a single solution Start with a population of solutions, and let them evolve Next Step: Initialize a population Decision parameter: population size Let us choose 5 solutions in our population. We will now randomly initialize the population. London, Oxford, Cambridge, Brighton, Bath Oxford, London, Cambridge, Bath, Brighton Cambridge, Oxford, Brighton, Bath, London Bath, London, Brighton Cambridge, Oxford Bath, Oxford, Cambridge, London, Brighton (-1320) (-1230) (-1140) (-1400) (-990) Evolving Solutions for TSP Repeat Choose 2 solutions from the population With a certain “crossover probability” (say, 0.8) apply a crossover operator to create child1, child2 With a certain “mutation” probability (say 0.1), mutate child1, child2 Place the resulting 2 children in the population Selection: which 5 solutions survive – the probability of each individual to survive is propositional to its fitness function After stopping, output the best chromosome in the population for the solution (Until stopping criteria has been met) Overview of the Selection Process Over time solutions mate and traits passed on to the offspring. Children with “better” traits have a ability to survive. The weak solutions gradually disappear from the population. Good solution predominate the population Building Solutions through Evolution http://www.pbs.org/saf/1103/video/watchon line.htm GA – Advantages: Not engineering : enables finding surprising solutions to problems Quickly and reliably solve problems that are hard to tackle by traditional means. Implicit parallelism makes GAs a very efficient optimization algorithm. Great property is the ability to find approximate solutions to combinatorially explosive problems. GA - Disadvantages A heuristic: GAs may find only nearoptimal solutions. Further restrictions are the difficulties of choosing a suitable representation technique, and making the right decision regarding the choice of the selection method and the genetic operator probabilities