* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Part I-A
Public health genomics wikipedia , lookup
Human genetic variation wikipedia , lookup
Designer baby wikipedia , lookup
Genetic engineering wikipedia , lookup
Polymorphism (biology) wikipedia , lookup
Genome evolution wikipedia , lookup
Genetic drift wikipedia , lookup
Dual inheritance theory wikipedia , lookup
History of genetic engineering wikipedia , lookup
Group selection wikipedia , lookup
Genome (book) wikipedia , lookup
Population genetics wikipedia , lookup
Gene expression programming wikipedia , lookup
CS321 HS 2009 Autonomic Computer Systems Evolutionary Computation I November 17, 2009 Lidia Yamamoto and Manolis Sifalakis University of Basel http://cn.cs.unibas.ch S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 1 Overview Part I (today) • Genetic Algorithms (GA) • Genetic Programming (GP) Part II (Thursday) • Representations • Dynamic environments S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 2 Overview of Today’s Lecture Evolutionary Computation, Part I • • Introduction – (Self-)Optimization – Basic definitions from genetics Evolutionary Computation – Common definitions, basic algorithm – Genetic Algorithms (GA) – Genetic Programming (GP) • Example: Symbolic regression with TinyGP S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 3 Textbooks Melanie Mitchell, “An Introduction to Genetic Algorithms”, MIT Press, 1998. W. Banzhaf, P. Nordin, R. E. Keller, and F. D. Francone. "Genetic Programming, An Introduction". Morgan Kaufmann Publishers, Inc., 1998. R. Poli, W. B. Langdon, and N. F. McPhee. "A Field Guide to Genetic Programming". Published via http://lulu.com, 2008. http://www.gp-field-guide.org.uk/. S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 4 Optimization and Self-Optimization Optimization (Operations Research) • maximize/minimize objective function subject to constraints • search for a solution in a vast search space or solution space (space of all possible solutions) • linear/non-linear Self-Optimization (IBM Autonomic Computing) • ability of the system to optimize its operations based on a given target operation profile (objectives and constraints) • system continually seeks to improve its performance and efficiency, in order to meet end-users' needs with minimal human intervention • also able to track and respond to profile changes S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 5 Optimization (Operations Research) General formulation of an optimization problem: maximize: subject to: f(x) = objective function gi(x) = constraints f(x) global optimum local optima search space Simple example: 1 variable (x), no constraints x best solution S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 6 Heuristic Optimization Search space often discrete, and too large for exhaustive search Heuristic optimization seeks to explore promising regions of the search space through a beamed search: • refinement of promising solutions seen so far • do not guarantee that a global optimum will be found • might be trapped in local optima • goal is to provide a satisfactory solution in reasonable time Some heuristic optimization algorithms: • Evolutionary algorithms: Inspired by genetics and Darwinian evolution • Swarm algorithms: Inspired by the behavior of social insects (ants, bees...), flocks of birds, fish, etc. S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 7 Evolutionary Computation Heuristic optimization method • Beamed search in a vast space of possible solutions Inspired by Darwinian evolution: • Biological evolution by natural selection • Survival of the fittest Advantages: • Simplicity • Potential parallelism • Able to work with limited information: only fitness improvement Disadvantages: • Computation cost (e.g. large populations, complex fitness) • No guarantees (like all heuristics) S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 8 An Eukaryotic Cell (Prokaryotic = without nucleus, e.g. bacteria) Eukaryotic = with nucleus, e.g. plant and animal cells membrane cytoplasm organelles and other cell components (e.g. mitochondria, ribosomes...) S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 nucleus genome (genetic material) 9 The Genome Each cell nucleus contains one or more chromosomes A chromosome contains a number of genes A gene is a region of genomic (DNA) sequence defining a functional block Chromosomes may occur in one or more copies within a cell: • Diploid cell: has two copies of each chromosome (e.g. in humans and most animals) • Haploid cell: only one copy (e.g. gametes) • Polyploid cell: several copies (typically a power of 2, e.g. tomatoes, crops) S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 10 The Genome chromosome pair DNA (diploid organism) cell nucleus gene double helix chromosomes chromosomes A T G C genes S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 11 The Genome A gene is the basic unit of heredity in a living organism • region of genomic sequence • can be seen as a functional block • typical encodes a protein, which leads to a trait, e.g. eye color Locus is the position of a gene in the chromosome Allele: each possible alternative DNA sequence for a given gene locus (e.g. leading to different traits such as red or white flowers) S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 12 The Genome Genotype: the genetic material of an individual Phenotype: the ensemble of observable traits (e.g. flower color, leaf shape) resulting from the genotype of an individual allele for white flowers allele for red flowers homologous chromosomes S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 13 Recombination and Mutation The DNA is able to replicate with a high fidelity (with the help of enzymes) However, mutations (errors in the copy process) might still occur, e.g. due to chemical agents, radiations, etc. Recombination (crossover) during sexual reproduction: chromosomes swap genes S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 14 Darwinian Evolution Reproduction = replication + (unlimited) heritable variation • Replication of the DNA sequence • Cell replication • Organism reproduction • Variation: mutation, recombination Fitness = Reproduction rate • how fast an organism (or species) is able to reproduce Selection: survival of the fittest • exponential growth + finite resources = competition • outcome: competitive exclusion (survival of the fittest) S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 15 Overview Evolutionary Computation, Part I • • Introduction – (Self-)Optimization – Basic definitions from genetics Evolutionary Computation – Common definitions, basic algorithm – Genetic Algorithms (GA) – Genetic Programming (GP) • Example: Symbolic regression with TinyGP S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 16 Evolutionary Computation Genetic Algorithms (GA) • goal: find an optimum solution (e.g. combination of parameters) to an instance of a problem • candidate solutions are typically strings Genetic Programming (GP) • goal: find an optimum program able to solve any instance of the problem • candidate solutions are programs, e.g. linear (e.g. assembly language), tree (e.g. LISP), graph (dataflow, cartesian GP) Other variants not covered in this course: • Evolution Strategies (ES), Evolutionary Programming (EP) S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 17 Evolutionary Computation: Basic Concepts Individual: a candidate solution to the problem to be solved Population: set of candidate solutions Genotype: (compact) representation of individuals; can be broken down into chromosomes, genes, alleles,... • • GA: string of bits, integers, characters, etc., examples: bin: 01101011 hex: 39fe87ac3b46 dec: 2039384757 alpha: addbadbaaccd GP: program, e.g. linear, tree, graph,... linear: I1: R1 = R2 + R3; I2: R3 = R4 * R1; I3: Jump I2 LISP tree: (* (+ 3 a) (if (< a b) (- x y) (/ x 3))) S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 18 Evolutionary Computation: Basic Concepts Genetic operators: variation functions that transform a set of individuals (parents) into a new set (offspring) Common operators: • Mutation: random change in genotype, with low probability parent offspring mutation 01010010001 01010101001 • Crossover: recombine portions of two genotypes offspring 1 parent 1 1011 1100101 parent 2 crossover point 0010 011100010 crossover 1011 011100010 offspring 2 0010 1100101 S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 19 Evolutionary Computation: Basic Concepts Phenotype: (expanded) individual that can be evaluated for fitness (e.g. program) • can be the same as the genotype (direct encoding) • or different: indirect encoding: genotype-phenotype map phenotype genotype map 0 23 198 54 S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 if (a > 10) then x = 20 else print b 20 Evolutionary Computation: Basic Concepts Fitness: measure of how good a candidate solution is • tested on a number of test cases (training set) • expressed as a fitness function: e.g. error between ideal and obtained solution (on training case); absolute or relative performance measure Selection strategy: Algorithm that selects individuals in the population that will build the next generation • Principle: "survival of the fittest": best fit individuals have a higher chance of being selected • Selected individuals undergo variation through genetic operators to form the next generation S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 21 Evolutionary Computation: Basic Algorithm pop = initial population generation = 0 evaluate(pop) while bestfit(pop) not good enough, and not stop: • parents = select(pop) • children = recombine(parents) with probability pr • mutate(children) with probability pm • pop = children + select(pop) • evaluate(pop) • generation = generation + 1 S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 22 Fitness Landscape n-dimensional landscape: • fitness function is the objective function: • is the genotype to be optimized • peaks: local optima 3-D case: intuitive visualization: z = f (x, y) z y x example of an initial population (red dots) on a fitness landscape S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 23 Fitness Landscape Convergence example: z z y x y x S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 24 Fitness Evaluation Examples of fitness functions: Error fitness function: sum of distances (error) between expected (pi) and obtained (oi) solution on training case i out of n cases, for individual p: in this case, best fitness = 0 Relative performance measure, e.g. success rate (best fitness = 100%) Absolute performance measure, e.g. amount of credits won, amount of food found (best fitness = infinite) S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 25 Selection Strategies Selection is typically stochastic: "survival of the fittest": • best individuals have a higher chance of being selected • selected individuals become "parents" and produce "offspring” • offspring form the next generation of individuals to be evaluated Examples of selection strategies: • Fitness-proportional selection (Roulette wheel) • Ranking selection • Tournament selection S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 26 Selection Strategies Fitness-proportional or Roulette Wheel selection: the probability of selection (pi) of individual i (out of a population of m individuals) is proportional to its fitness: Ranking selection: probability of selection is a function of rank of individual from worst to best fitness. Tournament selection: a group of randomly selected individuals "compete" in a tournament; winners (best fitness) produce offspring which will replace losers. S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 27 Overview Evolutionary Computation, Part I • • Introduction – (Self-)Optimization – Basic definitions from genetics Evolutionary Computation – Common definitions, basic algorithm – Genetic Algorithms (GA) – Genetic Programming (GP) • Example: Symbolic regression with TinyGP S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 28 Genetic Algorithms Genetic Algorithms: goal: find an optimum solution (e.g. combination of parameters) to an instance of a problem Became popular via John Holland, 1970s Example: solving the Travelling Salesman Problem (NP-hard): given a set of cities and connections (roads) between them, find the shortest tour that visits all cities (and each city only once). individual = candidate tour (sequence of cities) fitness = length of tour (the shorter the better) initial population = set of randomly generated tours iterate by evaluating, selecting, mutating and recombining tours until stop criterion (e.g. max number of generations, S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 29 Sample Iteration Ci = individuals (chromosomes) with two genes: X and Y Fitness function: parents: C4-C3, C4-C1 recombine by swapping genes S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 30 Simple GA Example: Discover the hidden sentence Goal: Find out what's the hidden sentence: e.g. "this is a test" (without blanks); only the length of the target sentence is known to the GA • (generalization of the OneMax problem: maximize the number of ones in a bitstring) Fitness: Similarity measure between obtained and target string, using a string alignment (edit distance) algorithm : • 100% similarity = identical strings = optimum Fixed-length genotype (character string) Mutations: • point mutation (letter replacement, e.g. "ABCD" "ABXD"); • shift (rotate) left/right (e.g. "ABCD" "BCDA") Two-point crossover Tournament selection, size 4 (two best remain in population, and their offspring replace the two worst) S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 31 Simple GA Example: Typical run > ./strga "thisisatest" gen= 1, fit= -27, best=qthakeqsfzt gen= 2, fit= 9, best=ophrispstet gen= 3, fit= 9, best=ophrispstet gen=10, fit= 45, best=qthisattest gen=15, fit= 81, best=thisisatesc gen=18, fit= 100, best=thisisatest Size of search space=3.670344e+15 Population size=1000 tested=45070 Percent of search space explored: 5.176626e-10 % Size of search space = number of all possible strings of same length as target (11 characters in example) using alphabet a-z = 2611 possible strings Typically only a small fraction of (huge) search space is explored But this problem is quite (too) easy for a GA (smooth fitness landscape) S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 32 Search for solutions aggressively Hill climbing: always chooses best solution so far Always reaches a hilltop But maybe not the highest top: can be trapped in a local maximum S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 E.g. Steepest Ascend 33 Search for solutions with a GA Solution discovered probabilistically Problem: Can not guarantee discovery of hilltop May reach global maxima by wandering through search space GA S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 34 GA: Real-World Applications Numerical and Combinatorial Optimisation • Economic • Viability of gene propagation Social systems • Host-parasite co-evolution, resource flow, biological arm races Population Genetics • Biding and trading strategies, stock trends Ecology • Job-Shop Scheduling, Traveling salesman Evolution of social behavior in insect colonies Computer art • Automatic generation of graphics, music S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 35 GA for Computer Art Interactive evolution: user (computer artist) determines fitness Examples by Karl Sims, http://www.karlsims.com/ Competing virtual creatures 1994 Genetic Images 1993 S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 Galapagos (3D animated forms) 1997 36 Overview Evolutionary Computation, Part I • • Introduction – (Self-)Optimization – Basic definitions from genetics Evolutionary Computation – Common definitions, basic algorithm – Genetic Algorithms (GA) – Genetic Programming (GP) • Example: Symbolic regression with TinyGP S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009 37