* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Introduction to AI (part two)
Genome (book) wikipedia , lookup
Hardy–Weinberg principle wikipedia , lookup
Polymorphism (biology) wikipedia , lookup
Koinophilia wikipedia , lookup
Frameshift mutation wikipedia , lookup
Genetic drift wikipedia , lookup
Point mutation wikipedia , lookup
Gene expression programming wikipedia , lookup
Introduction to AI (part two) Tim Watson G6.71 [email protected] Overview • Three weeks on Genetic Algorithms (Gas) • Two weeks on Symbolic AI • One week of revision • Labs used to run experiments with a simple GA program • Coursework and slides will be available on L:\tw\IntroAI What is a GA? • Evolutionary Algorithm – Create random population of possible solutions in the form of bitstrings – See how good each one is (test fitness) – Produce next generation (fitter are more likely to get into next generation) – Crossover, mutate and make next generation the current one What can they do? • • • • • • • • • Schedule Barcelona Olympics Aircraft Design Dynamic Routing in Networks Robot Arm Trajectory Planning Lab Task Scheduling for US Navy Aircraft Missile Evasion Evolving aNN Architecture Parameter Tuning for Sonar Systems Conformational Analysis of DNA Example: Onemax 1. Initial population: 100 000 110 3. Reproduce: 110 100 110 2. Calculate fitness: 100 2 000 1 110 3 4. Crossover & Mutate: 110 101 010 Biological Terminology • Genes etc. – Gene – Locus (plural loci) – Allele (also called gene value) – (Pleiotropic gene) • Genotype (Chromosome) • Phenotype • Fitness Landscape GA Theory • Schema Theorem – GA searches schemata in parallel – 10 represents 10, 1#, #0 and ## – The theorem is rubbish! • Building Block Hypothesis – Good, small sequences are found and recombined to form good solutions • No Free Lunch Theorem GA Parameters • Population Size – Static or Dynamic? • Chromosome Size – Fixed or Variable? • Crossover Rate – One-point, two-point or uniform? • Mutation Rate – Fixed or Variable? • Fitness Function • Termination Criteria Prediction Test! • What happens to the population statistics in a standard GA with random fitness, no crossover, no mutation and chromsize equals 16? – Best, Worst, Mean , Std. Dev., column counts • Best=17, Worst=1, Mean=9ish, Std. Dev. constant-ish, pop converges randomly. Reproduction in GAs • Need selective pressure for reproduction to improve the population fitness – None leads to random walk (slow) – Some leads to geometric growth of best (fast) • Infinite populations select individuals on relative fitness: fit/mean(fit) • Finite populations also affected by how many copies are already present Types of Selection • Fitness-Proportionate – Fitness scaling based on raw fitness where if fita < fitb then scaled(fita) scaled(fitb) – Scaling can be altered dynamically • Rank-Order • Tournament • Elitism Initialising the Population • • • • • Uniformly at random Best Guesses Converged to Best Known From Real World Hybrid Mutation • Goal of selection: survival of the fittest • Goal of mutation: explore lost or never seen alleles • Random reset versus bit flip – Reset rate = ½ bit flip rate • Mutation as a spring • In infinite time every possible population visited an infinite number of times • Alternative to mutation: complement Crossover • Goal: to try out different combinations of good bits of individuals • Crossover point – One-point – Two-point – N-point – Uniform Crossover (2) • Closer genes are less likely to be split by crossover – AB###### probability of split with one-point crossover = 1/7 – A######B probability of split = 1 • Local maxima can occur (e.g. for onemax) 11011 fit=4 1011000101 00100 fit=1 0010100000 All children have lower fitness than parents Design Decisions • If genes are linked then the representaion of an individual ought to keep them close together • Get the balance right: – Popsize too small premature convergence – Popsize too large too slow to compute – Mutation rate too low not enough exploring – Mutation rate too high too much noise Messy GAs • Developed by David Goldberg et al. late 1980s early 1990s • Goal of messy GAs: improve function optimisation performance by explicitly building up increasingly longer, highly fit strings from well-tested shorter building blocks • Nature started with simple life-forms and built up complex ones Representation • Each bit is tagged with its ‘real’ locus but not all loci have to specified. For example, in a four-bit problem: {(1,0), (2,0), (4,1), (4,0)} {(3,1), (3,0), (3,1), (4,0), (4,1), (3,1)} • Overspecification: conflicting specification at a locus – left-to-right first come first served • Underspecification: locus not specified – can be thought of as a schema, e.g. 00#1 – replace with random value? – use hill-climbing to find local optimum and use value from that Outline of Algorithm • Two phases: primordial and juxtapositional • Primordial phase – Goal: enrich population with small, promising candidate schemata – guess order k of smallest relevant schemata – initial population contains all possible schemata of order k, e.g. for k=3 and solution bitlength=4 there are 8 different 3-bit patterns and 4 ways of choosing 3 bits for a popsize of 32 – use selection only to produce next generation – regularly halve popsize and stop after n generations Outline of Algorithm (cont.) • Juxtapositional phase – popsize remains fixed, selection continues – no mutation – two new operators: cut and splice • Cut takes a string and cuts it at a random point to produce two new strings: {(1,1), (2,0), (1,0), (4,1), (1,1)} {(1,1), (2,0), (1,0)} {(4,1), (1,1)} • Splice joins two strings together • Goldberg argued that the primordial phase should produce, in enough numbers, all the building blocks necessary to construct an optimal solution and that the juxtapositional phase should then construct it. Problems with Messy GAs • Based on Schema Theorem, which isn’t correct • Need to know (or guess) k, when to halve popsize and how many generations to run primordial phase • Initial size of population easily becomes too large (e.g. for k=8, bitlength=64, popsize approx. 1 trillion) • Goldberg et al. have announced that messy GAs are ready for real-world problems and call for their immediate application to difficult combinatorial problems of practical importance. No results have yet been forthcoming.