Representation
Chapter 4
Luke, Essentials of Metaheuristics, 2011
Byung-Hyun Ha
R1

Outline
  Introduction
  Vectors
  Directed encoded graphs
  Trees and Genetic Programming
  Grammatical Evolution
  Rulesets
  Bloat
  Summary

Introduction
  Representation of individual
    Approach to construct, tweak, and present individual for fitness assessment
  Metaheuristics as general framework
    Mostly, only representation differs with regard to different problems

Introduction
  Examples of representation
    TSP
      • Permutation-based (order-based)
        • 3-1-2-0 = 1-2-0-3 = 2-0-3-1 = 0-3-1-2, not distinct
      • Locus-based
        • 3/2/0/1 (= 3-1-2-0, permutation-based), distinct
      • Random-key
        • 0.78:0.56:0.69:0.11 (= 3-1-2-0, permutation-based)
    VRP (vehicle routing problems)
      • Using separator
        • 6-9-0-2-4-7-5-0-8-1-3
      • Mutation and crossover?
  Encoding and decoding
    Phenotype → (encoding) → genotype → (decoding) → phenotype; Tweak acts on the genotype
  source: http://neo.lcc.uma.es/cEA-web/VRP.htm

Introduction
  Tweak in representation
    Phenotype → (encoding) → genotype → Tweak → genotype → (decoding) → phenotype
  Determining fitness landscape
    • Example: Hamming cliff and Gray coding
  Remember small change!
    • It can help metaheuristics, usually.

Introduction
  Much of representation is an art, not a science!
    e.g., workflow (business process)
      • How to encode and tweak?
  source: http://www.tonymarston.net/php-mysql/workflow.html

Introduction
  Properties, required (Talbi, 2009)
    Completeness
      • All solutions should be represented
    Connexity
      • A search path must exist between any two solutions (i.e., to global optimum)
    Efficiency
      • Easy to manipulate
  Representation-solution mapping (Talbi, 2009)
    One-to-one
    Many-to-one
      • Redundancy will enlarge the size of search space.
    One-to-many
      • A good solution should be constructed from an individual.

Vectors
  Initialization and bias
    Not difficult to initialize
      • Some totally-random initialization method (covered already)
    Bias?
      • e.g., solution for robot walking using heuristic (e.g., by motion capture)
      • But diversity is useful, particularly early on.
      • Some suggestions
        1) Biasing is dangerous.
        2) Start with values that aren't all or exactly based on heuristic bias
  Mutation
    Examples
      • Gaussian convolution, bit-flip mutation, ...
      • Integer vector: Integer Randomization Mutation, Random Walk Mutation, ...
    c.f., point mutation
      • Useful when there is less chance to get improvement by changing several genes at a time
      • But it can be trapped in a local optimum

Vectors
  Recombination
    One- and Two-point Crossover, Uniform Crossover
    Line Recombination, Intermediate Recombination
    ...
  Phenotype-specific mutation or crossover
    e.g., Jung & Moon, The Natural Crossover for the 2D Euclidean TSP, 2002
  Consider fitness landscape.

Directed Encoded Graphs
  Graphs
    Examples
      • Neural networks, finite-state automata, Petri nets, electrical circuits, ...
    Types
      • Directed, undirected, with labels, with weights, cyclic, acyclic, recurrent, feedforward, sparse, dense, planar, ...
      • These are constraints that Tweak must respect.
  Arbitrary-structured graph
    Target of graph representation
  Types of encoding
    Direct encoding
      • Exact node and edge description in representation
    Indirect (developmental) encoding
      • Some (production) rules for constructing a graph as a solution (discussed later)

Directed Encoded Graphs
  Full adjacency matrix
    e.g., a recurrent directed graph structure, with
      • no more than 5 nodes
      • no more than one edge between any two nodes
      • self-edges allowed
      • weights for edges
    Mutation and crossover

Directed Encoded Graphs
  Arbitrary graph structure
    Initialization of graph (N, E)
      • Determination of number of nodes and edges
        • e.g., using geometric distribution
      • Creation of a node and an edge, depending on type of target graph

Directed Encoded Graphs
  Arbitrary graph structure (cont'd)
    Further considerations in initialization
      • e.g., connected and directed acyclic graph
      • c.f., general algorithms textbook
    Mutation
      • e.g., do one of the following, a random number of times
        • delete a random edge
        • add a random edge
        • delete a node and all its edges
        • add a node
        • relabel a node
        • relabel an edge
    Recombination
      • c.f., goal of crossover is to transfer essential and useful elements to another
      • Determining elements to transfer
        • Selecting subset of nodes and edges, or selecting subgraph
      • Coping with missing target of edge and with disjoint

Directed Encoded Graphs
  Arbitrary graph structure (cont'd)
    Recombination (cont'd)

Directed Encoded Graphs
  Example of container terminal operations
    Relocation of containers in a bay for efficient loading
      • Solution as a list of movements
        • e.g., (8-7) (4-1) (3-7) (2-4) (6-7) (2-4) (2-3) (2-7) (8-4) (0-3) (8-2) (9-6) (1-2) (5-1) (5-2) (9-2) (9-2) (7-3) (9-6) (5-0) (7-9) (8-9) (6-9) (7-5) (7-9) (6-1) (6-3) (6-8) (3-6) (1-6) (8-6)
        • Weakness?
      • Solution as a graph
        • Sufficient?
    [Figure: graph over nodes a, b, c, d, e, f]

Trees and Genetic Programming
  Genetic Programming
    How to use stochastic methods to search for and optimize small computer programs or other computational devices
    Concept of suboptimality, required
      • Not simply right or wrong
    Examples
      • Team soccer robot behavior, fitting a math equation to a data set, finding finite-state automata which match a given language
  Representation
    Lists or trees, usually
      • e.g., an artificial ant, sin(cos(x - sin x) + xx) for symbolic regression

Trees and Genetic Programming
  Primitives in representation
    Basic functions (e.g., kick-toward-goal) or CPU operations (e.g., +)
    Constraints of context
      • e.g., 4 + kick-toward-goal() makes no sense
      • e.g., matrix-multiply, expecting exactly two children and ...
    Tweaks need to maintain closure (valid individuals)
  Fitness assessment
    Convert data (genotype) to code (phenotype), and evaluate
    Examples
      • Symbolic regression: sum of squared errors
      • Artificial ant: amount of food eaten
  Tree-Style Genetic Programming Pipeline
    Sec. 3.3.3
    One popular algorithm for Genetic Programming (but not limited to it)

Trees and Genetic Programming
  Initialization
    New trees by repeatedly selecting from a function set
      • Considering arity (predefined number of children)
      • e.g., Grow, Full, Ramped Half-and-Half, PTC2 algorithms
    Ephemeral random constants
      • Handling constants for leaves (e.g., 0.2462, 0.9, -2.34, 3.14, "s%&e:m")
      • Special leaf nodes to be transformed into randomly-generated constants

Trees and Genetic Programming
  Recombination
    e.g., subtree crossover: swap two selected subtrees
      • Non-homologous (i.e., global)
  Mutation
    Examples
      • Replacing random subtree with randomly-generated one (subtree mutation)
      • Replacing random non-leaf node with one of its subtrees
      • Picking random non-leaf node and swapping its subtrees
      • Mutating ephemeral random constants by introducing some noise
      • Swapping two disjoint subtrees
    c.f., not popular because usually crossover is non-homologous

Trees and Genetic Programming
  Forests
    e.g., forest of soccer robot team with each member as tree
  Automatically defined functions (ADF)
    Not predefined functions but trees called by primary tree
    c.f., Modularity
      • In case we believe a good solution has repetitive parts

Trees and Genetic Programming
  Edge encoding
    e.g., an edge encoding for a finite-state automaton (it's a graph) that interprets regular language (1|0)*01
      • c.f., http://en.wikipedia.org/wiki/Lexical_analysis
    Indirect encoding (developmental encoding)

Trees and Genetic Programming
  Example of container terminal operations
    Container-grounding position determination by weighted sum of scores
      • Solution as a list of weights
        • Weakness?
      • Genetic programming?

Grammatical Evolution
  Using predefined grammar for tree
    Trees generated by lists (indirect encoding)
      • c.f., http://en.wikipedia.org/wiki/Backus-Naur_form
    Pros and cons
      • Almost always valid tree, reduced size of search space
      • Tiny changes early in list result in gigantic changes (un-smoothness).
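The list-to-tree decoding in Grammatical Evolution can be sketched roughly as follows. This is a minimal illustration, not the method from the book: the toy grammar, the names GRAMMAR and decode, the leftmost-expansion order, the wrap-around over the gene list, and the max_steps cutoff are all assumptions made for this example.

```python
# Hypothetical toy grammar (an assumption, not from the slides):
# each nonterminal maps to its list of alternative productions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["1"]],
}


def decode(genome, start="<expr>", max_steps=100):
    """Decode a list of integers (genotype) into an expression string (phenotype).

    The leftmost nonterminal is expanded at each step; the next gene,
    taken modulo the number of alternatives, picks the production.
    Returns None if the derivation has not finished after max_steps.
    """
    symbols = [start]
    gene_index = 0
    for _ in range(max_steps):
        # Locate the leftmost symbol that is still a nonterminal.
        pos = next((i for i, s in enumerate(symbols) if s in GRAMMAR), None)
        if pos is None:
            return " ".join(symbols)  # only terminals remain
        options = GRAMMAR[symbols[pos]]
        gene = genome[gene_index % len(genome)]  # reuse genes if the list runs out
        gene_index += 1
        symbols[pos:pos + 1] = options[gene % len(options)]
    return None  # derivation did not terminate within the step budget


print(decode([0, 1, 0, 1, 1, 1]))  # x * 1
print(decode([1, 1, 0, 1, 1, 1]))  # 1
```

The two genomes differ only in their first gene, yet they decode to very different expressions, which is the un-smoothness noted in the pros and cons. A genome such as [0] never finishes its derivation in this sketch, which is why validity holds "almost always" rather than always.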
Rulesets
  A policy as solution of problem
    Consisting of a set of rules
    e.g., stock trading program, entities in simulations
  State-action rules
    Typical form
      • a ∧ b ∧ ... ∧ y → z
      • e.g., (left sonar value > 3.2) ∧ (forward sonar value 5.0) → (turn left to 50)
    An interpretation
      • Mapping from state space into actions
    Under-specification and over-specification
      • Default rules, vote, ...
    Fitness assessment
      • On a ruleset, or on a series of rules

Rulesets
  Production rules
    Typical form
      • a → b c ... z
    Modular indirect encoding
      • Describing large complex solution with lots of repetitions by small and compact rule (search) space
    e.g., 8-node directed unlabeled graph structure as solution

Rulesets
  Production rules (cont'd)
    e.g., Lindenmayer systems (L-systems)
      • e.g., Koch Curve
        • F → F+F-F-F+F
        • F: draw a line forward, +: turn left, -: turn right
    Successive expansions, starting from F:
      F
      F+F-F-F+F
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F+
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F-
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F-
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F+
      F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F

Bloat
  Code bloat or code growth
    A problem with variable-sized representation
    Far from optimum usually, memory consumption, ... and ugly
  Common ways of handling
    Limiting size when individual is Tweaked
    Editing individual, to remove introns and the like
    Punishing individual for being very large
      • e.g., linear parsimony pressure (problem?)
        • revised fitness f = αr - (1 - α)s, where r: fitness, s: size of individual, α: weighting constant
      • e.g., non-parametric parsimony pressure

Summary
  Phenotype & genotype
  Encoding & decoding
  Representations
    Vectors
    Graphs + Indirect-encoded graphs (edge encoding)
    Trees + Indirect-encoded trees (Grammatical Evolution)
    Rulesets
  Bloat
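Returning to the Koch curve L-system from the Rulesets section, the repeated rewriting can be sketched as below. The function name expand and the dictionary KOCH_RULES are names chosen for this illustration, and the rule is written with plain ASCII "+" and "-"; symbols without a rule are copied unchanged.

```python
def expand(axiom, rules, iterations):
    """Apply the production rules to every symbol in parallel, `iterations` times."""
    current = axiom
    for _ in range(iterations):
        # Symbols that have no rule (here "+" and "-") are kept as they are.
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


# Koch curve rule from the slide: F -> F+F-F-F+F
KOCH_RULES = {"F": "F+F-F-F+F"}

for n in range(3):
    print(expand("F", KOCH_RULES, n))
```

The single rule is the whole genotype here, while the drawn curve grows fivefold in line segments with each iteration, which illustrates how a small, compact rule space can describe a large solution with many repetitions.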