Representation
Chapter 4, Essentials of Metaheuristics, 2013
Spring, 2014
Metaheuristics
Byung-Hyun Ha
R1

Outline
  Introduction
  Vectors
  Directed encoded graphs
  Trees and Genetic Programming
  Lists
  Rulesets
  Bloat
  Summary

Introduction

Representation of individual
  Approach to construct, tweak, and present an individual for fitness assessment
Metaheuristics as general framework
  Mostly, only the representation differs from problem to problem

Examples of representation
  TSP
    • Permutation-based (order-based)
      • e.g., 4-2-3-1 (the same tour as 2-3-1-4, 3-1-4-2, and 1-4-2-3)
    • Locus-based
      • The ith element represents the city following city i in the tour.
      • e.g., 4/3/1/2 (equivalent to 4-2-3-1, permutation-based)
    • Random-key
      • e.g., 0.78:0.56:0.69:0.11 (equivalent to 4-2-3-1, permutation-based)
  VRP (vehicle routing problems)
    • Using separator
      • e.g., 6-9-0-2-4-7-5-0-8-1-3
    • Mutation and crossover?
    source: http://neo.lcc.uma.es/cEA-web/VRP.htm
Encoding and decoding
  Phenotype → (encoding) → genotype → (decoding) → phenotype
  Tweak operates on the genotype

Tweak in representation
  Phenotype → (E) → genotype → Tweak → genotype → (D) → phenotype
Determining fitness landscape
    • Example: Hamming cliff and Gray coding
Remember small change!
    • It can help metaheuristics, usually.

Much of representation is an art, not a science!
  e.g., workflow (business process)
    • How to encode and tweak?
    source: http://www.tonymarston.net/php-mysql/workflow.html

Properties, required (Talbi, 2009)
  Completeness
    • All solutions should be representable
  Connexity
    • A search path must exist between any two solutions (i.e., to the global optimum)
  Efficiency
    • Easy to manipulate
Representation-solution mapping (Talbi, 2009)
  One-to-one
  Many-to-one
    • Redundancy will enlarge the size of the search space.
  One-to-many (indirect encoding)
    • A good solution should be constructed from an individual.

Vectors

Initialization and bias
  Not difficult to initialize
    • Some totally-random initialization method (covered already)
  Bias?
    • e.g., solution for robot walking using a heuristic (e.g., by motion capture)
    • But diversity is useful, particularly early on.
    • Some suggestions
      1) Biasing is dangerous.
      2) Start with values that aren't all or exactly based on heuristic bias
Mutation
  Examples
    • Gaussian convolution, bit-flip mutation, ...
    • Integer vector: Integer Randomization Mutation, Random Walk Mutation, ...
  c.f., point mutation
    • Useful when there is less chance to get improvement by changing several genes at a time
    • But it can be trapped in a local optimum
Recombination
  One- and Two-point Crossover, Uniform Crossover
  Line Recombination, Intermediate Recombination
  ...
Heterogeneous vectors?
  e.g., a function with real parameters and integer parameters
Phenotype-specific mutation or crossover
  e.g., Jung & Moon, The Natural Crossover for the 2D Euclidean TSP, 2002
  Consider the fitness landscape.

Directed Encoded Graphs

Graphs
  Examples
    • Neural networks, finite-state automata, Petri nets, electrical circuits, ...
  Types
    • Directed, undirected, with labels, with weights, cyclic, acyclic, recurrent, feedforward, sparse, dense, planar, ...
    • These are constraints that Tweak must respect.
Arbitrary-structured graph
  Our target of graph representation
Types of encoding
  Direct encoding
    • Exact node and edge description in the representation
  Indirect (developmental) encoding
    • Some (production) rules for constructing a graph as a solution (discussed later)

Full adjacency matrix
  e.g., a recurrent directed graph structure, with
    • no more than 5 nodes
    • no more than one edge between any two nodes
    • self-edges allowed
    • weights for edges
  Mutation examples
    • One vector approach
      • Algorithm 45. Gaussian Convolution Respecting Zeros
    • Using two vectors
      • One for on/off, the other for weights

Arbitrary graph structure
  Initialization of graph (N, E)
    • Determination of number of nodes and edges
      • e.g., using geometric distribution
    • Creation of a node and an edge, depending on type of target graph
  Further considerations in initialization
    • e.g., connected and directed acyclic graph
    • c.f., general algorithms textbook
  Mutation
    • e.g., do one of the following, a random number of times
      • delete a random edge
      • add a random edge
      • delete a node and all its edges
      • add a node
      • relabel a node
      • relabel an edge
  Recombination
    • c.f., the goal of crossover is to transfer essential and useful elements to another individual
    • Determining elements to transfer
      • Selecting a subset of nodes and edges, or selecting a subgraph
    • Coping with edges whose target is missing and with disjoint subgraphs

Vector vs. graph representation
  e.g., relocation of containers in a bay for efficient loading
    • Solution as a list of movements
      • e.g., (1-2), (3-2), (4-5), (6-5), (4-7), (4-6)
      • Weakness?
    • Solution as a graph (figure: graph over nodes a to f)

Trees and Genetic Programming

Genetic Programming
  How to use stochastic methods to search for and optimize small computer programs or other computational devices
  Concept of suboptimality, required
    • Not simply right or wrong
  Examples
    • Team soccer robot behavior, fitting a math equation to a data set, finding finite-state automata that match a given language
Representation
  Lists or trees, usually
    • e.g., an artificial ant, sin(cos(x - sin x) + x√x) for symbolic regression

Primitives in representation
  Basic functions (e.g., kick-toward-goal) or CPU operations (e.g., +)
  Constraints of context
    • e.g., 4 + kick-toward-goal() makes no sense
    • e.g., matrix-multiply, expecting exactly two children and ...
  Tweaks need to maintain closure (valid individuals)
Fitness assessment
  Convert data (genotype) to code (phenotype), and evaluate
  Examples
    • Symbolic regression: sum of squared errors
    • Artificial ant: amount of food eaten
Tree-Style Genetic Programming Pipeline (Sec. 3.3.3)
  One of the popular algorithms for Genetic Programming (but not limited to it)

Initialization
  New trees by repeatedly selecting from a function set
    • Considering arity (predefined number of children)
    • e.g., Grow, Full, Ramped Half-and-Half, PTC2 algorithms
  Ephemeral random constants
    • Handling constants for leaves (e.g., 0.2462, 0.9, -2.34, 3.14, “s%&e:m”)
    • Special leaf nodes to be transformed into randomly-generated constants

Recombination
  e.g., subtree crossover: swap two selected subtrees
    • Non-homologous (i.e., highly mutative)
    • c.f., homologous: an individual crossing over with itself makes copies of itself
Mutation
  Examples
    • Subtree mutation: replacing a random subtree with a randomly-generated one
    • Replacing a random non-leaf node with one of its subtrees
    • Picking a random non-leaf node and swapping its subtrees
    • Mutating ephemeral random constants by introducing some noise
    • Swapping two disjoint subtrees
  c.f., not popular because usually crossover is non-homologous

Forests
  e.g., forest of a soccer robot team with each member as a tree
Automatically defined functions (ADF)
  Not predefined functions but trees called by the primary tree
  c.f., Modularity
    • In case we believe a good solution has repetitive parts
Strongly-Typed Genetic Programming

Cellular encoding
  Indirect encoding (developmental encoding)

Lists

Grammatical Evolution: using a predefined grammar for the tree
  Trees generated by lists (indirect encoding)
    • c.f., http://en.wikipedia.org/wiki/Backus-Naur_form
  Pros and cons
    • Almost always a valid tree, reduced size of search space
    • Tiny changes early in the list result in gigantic changes (un-smoothness).

Rulesets

A policy as solution of a problem
  Consisting of a set of rules
  e.g., stock trading program, entities in simulations
State-action rules
  Typical form
    • a ∧ b ∧ ... ∧ y → z
    • e.g., (left sonar value > 3.2) ∧ (forward sonar value 5.0) → (turn left to 50)
  An interpretation
    • Mapping from state space into actions
  Under-specification and over-specification
    • Default rules, vote, ...
  Fitness assessment
    • On a ruleset, or on a series of rules

Production rules
  Typical form
    • a → b c ... z
  Modular indirect encoding
    • Describing a large complex solution with lots of repetitions by a small and compact rule (search) space
  e.g., 8-node directed unlabeled graph structure as solution
  e.g., Lindenmayer systems (L-systems)
    • e.g., Koch Curve
      • F → F+F-F-F+F
      • F: draw a line forward, +: turn left, -: turn right
    • Successive rewritings, starting from F:
      F
      F+F-F-F+F
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F+
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F-
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F-
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F+
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F

Bloat

Code bloat or code growth
  A problem with variable-sized representation
  Far from optimum usually, memory consumption, ... and ugly
Common ways of handling
  Limiting size when an individual is Tweaked
  Editing an individual, to remove introns and the like
  Punishing an individual for being very large
    • e.g., linear parsimony pressure (problem?)
    • revised fitness f = αr − (1 − α)s, where r: fitness, s: size of individual, α: weight of fitness relative to size
    • e.g., non-parametric parsimony pressure

Summary

Phenotype & genotype
  Encoding & decoding
Representations
  Vectors
  Graphs + Indirect-encoded graphs (edge encoding)
  Trees + Indirect-encoded trees (Grammatical Evolution)
  Lists
  Rulesets
Bloat
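
For the TSP examples in the Introduction, the two less familiar encodings can be decoded back to a permutation with a few lines of code. The following is a minimal sketch, assuming 1-based city labels; the function names `decode_random_key` and `decode_locus` are illustrative and do not come from the slides.

```python
def decode_random_key(keys):
    """Random-key decoding: visit cities in ascending order of their keys.

    keys[i-1] is the key of city i (cities are labelled 1..n).
    """
    ranked = sorted(enumerate(keys, start=1), key=lambda pair: pair[1])
    return [city for city, _ in ranked]


def decode_locus(successors, start):
    """Locus-based decoding: successors[i-1] is the city that follows city i."""
    tour = [start]
    current = successors[start - 1]
    while current != start:
        if len(tour) >= len(successors):
            raise ValueError("successor list does not form a single cycle")
        tour.append(current)
        current = successors[current - 1]
    if len(tour) != len(successors):
        raise ValueError("successor list contains sub-tours")
    return tour


if __name__ == "__main__":
    print(decode_random_key([0.78, 0.56, 0.69, 0.11]))  # [4, 2, 3, 1]
    print(decode_locus([4, 3, 1, 2], start=4))          # [4, 2, 3, 1]
```

Note that any real-valued vector decodes to a valid tour under the random-key scheme, whereas a locus-based vector can contain sub-tours, which is why the second decoder has to validate its input.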
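
The Hamming cliff mentioned in the Introduction can be made concrete with a small experiment. This is a sketch using the standard reflected binary Gray code; the helper names are illustrative. Moving from 7 to 8 flips four bits in plain binary but only one bit in Gray code, so a single bit-flip mutation can reach the neighbouring value.

```python
def to_gray(n):
    """Encode a non-negative integer as reflected binary Gray code."""
    return n ^ (n >> 1)


def from_gray(g):
    """Decode a reflected binary Gray code back to the integer."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    print(hamming(7, 8))                    # 4 (0111 vs 1000)
    print(hamming(to_gray(7), to_gray(8)))  # 1 (0100 vs 1100)
```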
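
Subtree crossover from the Trees and Genetic Programming slides can be sketched as follows. The representation is an assumption made for illustration: a tree is either a leaf (a string or number) or a list `[operator, child, ...]`, and all nodes share one type, so any swap keeps the individual valid (closure). The function names are hypothetical.

```python
import copy
import random


def paths(tree, prefix=()):
    """Yield the index path of every node in the tree, root first."""
    yield prefix
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def get(tree, path):
    """Return the subtree found by following the index path."""
    for i in path:
        tree = tree[i]
    return tree


def put(tree, path, subtree):
    """Replace the node at path with subtree and return the resulting tree."""
    if not path:
        return subtree
    parent = get(tree, path[:-1])
    parent[path[-1]] = subtree
    return tree


def size(tree):
    """Number of nodes in the tree."""
    return sum(1 for _ in paths(tree))


def subtree_crossover(a, b, rng=random):
    """Swap one uniformly chosen subtree of a with one of b; parents stay intact."""
    a, b = copy.deepcopy(a), copy.deepcopy(b)
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    sa, sb = get(a, pa), get(b, pb)
    return put(a, pa, sb), put(b, pb, sa)


if __name__ == "__main__":
    mum = ["+", ["sin", "x"], ["*", "x", "x"]]
    dad = ["-", "x", ["cos", ["+", "x", 1.0]]]
    print(subtree_crossover(mum, dad, random.Random(0)))
```

Because the two crossover points are chosen independently, an individual crossed with itself rarely reproduces itself, which is the non-homologous behaviour noted on the slide. Picking nodes uniformly is only one option; other selection schemes for the crossover point are possible.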
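
The list-to-tree decoding of Grammatical Evolution from the Lists slide can be illustrated with a toy grammar. Everything concrete here is an assumption for the sake of the example: the grammar, the modulo rule for picking a production, wrapping around the codon list, and the `max_steps` cut-off. It is a sketch of the idea, not a reference implementation.

```python
# Toy grammar in BNF style: each non-terminal maps to its alternative productions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["x"], ["1"]],
    "<op>": [["+"], ["*"]],
}


def decode(codons, grammar=GRAMMAR, start="<expr>", max_steps=100):
    """Leftmost derivation driven by an integer list (the genotype).

    Each codon selects a production by `codon % number_of_alternatives`.
    The codon list is reused from the beginning when exhausted.
    Returns None if the derivation does not finish within max_steps.
    """
    symbols = [start]
    output = []
    steps = 0
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in grammar:
            output.append(symbol)  # terminal symbol
            continue
        if steps >= max_steps:
            return None  # derivation did not terminate
        choices = grammar[symbol]
        choice = choices[codons[steps % len(codons)] % len(choices)]
        symbols = list(choice) + symbols
        steps += 1
    return " ".join(output)


if __name__ == "__main__":
    print(decode([0, 1, 1, 2]))  # x * 1
    print(decode([1, 1, 1, 2]))  # x
```

Changing only the first codon from 0 to 1 collapses the whole expression to `x`, which is the un-smoothness the slide warns about: codons early in the list decide how every later codon is interpreted.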
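
The Koch curve example from the Rulesets slides is a single production rule applied repeatedly to every symbol in parallel. A minimal sketch, with `expand` as an illustrative name:

```python
def expand(axiom, rules, generations):
    """Rewrite every symbol in parallel, `generations` times.

    Symbols without a rule (here + and -) are copied unchanged.
    """
    current = axiom
    for _ in range(generations):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


KOCH = {"F": "F+F-F-F+F"}

if __name__ == "__main__":
    for n in range(3):
        print(expand("F", KOCH, n))
    # The number of F segments grows by a factor of 5 per generation.
    print(expand("F", KOCH, 3).count("F"))  # 125
```

A rule of nine characters thus describes a 125-segment drawing after three generations, which is the sense in which production rules give a small and compact search space for large repetitive solutions.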
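
Linear parsimony pressure from the Bloat slide can be tried out numerically. The sketch below assumes the revised fitness has the form f = αr − (1 − α)s with a weight α between 0 and 1, that higher fitness is better, and that the sample values of r and s are invented for illustration. One possible reading of the "(problem?)" on the slide is that r and s live on different scales, so the outcome depends heavily on α.

```python
def linear_parsimony(r, s, alpha):
    """Revised fitness f = alpha * r - (1 - alpha) * s.

    r: raw fitness (higher is better), s: size of the individual,
    alpha: assumed weight in [0, 1] trading fitness against size.
    """
    return alpha * r - (1.0 - alpha) * s


if __name__ == "__main__":
    good_but_big = (0.9, 10)    # (raw fitness, size), hypothetical values
    poor_but_small = (0.1, 2)
    for alpha in (0.5, 0.99):
        f_big = linear_parsimony(*good_but_big, alpha)
        f_small = linear_parsimony(*poor_but_small, alpha)
        winner = "good_but_big" if f_big > f_small else "poor_but_small"
        print(alpha, round(f_big, 3), round(f_small, 3), winner)
```

With α = 0.5 the small, poor individual wins because size dominates the sum; with α = 0.99 the fitter individual wins. Rank-based (non-parametric) schemes avoid having to tune α against the scale of r.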