Representation
Chapter 4, Essentials of Metaheuristics, 2013
Spring, 2014
Metaheuristics
Byung-Hyun Ha
R1
Outline
 Introduction
 Vectors
 Direct encoded graphs
 Trees and Genetic Programming
 Lists
 Rulesets
 Bloat
 Summary
Introduction
 Representation of an individual
 The approach used to construct, tweak, and present an individual for fitness assessment
 Metaheuristics as a general framework
 Mostly, only the representation differs across problems
Introduction
 Examples of representation
 TSP
• Permutation-based (order-based)
• e.g., 4-2-3-1 ≡ 2-3-1-4 ≡ 3-1-4-2 ≡ 1-4-2-3
• Locus-based
• The ith element represents the city following city i in the tour.
• e.g., 4/3/1/2 (≡ 4-2-3-1, permutation-based)
• Random-key
• 0.78:0.56:0.69:0.11 (≡ 4-2-3-1, permutation-based)
 VRP (vehicle routing problems)
• Using a separator (0) between routes
• 6-9-0-2-4-7-5-0-8-1-3
• Mutation and crossover? (see the decoding sketch at the end of this slide)
 Encoding and decoding
[figure: a phenotype is encoded into a genotype, the genotype is Tweaked, and the result is decoded back into a phenotype]
source: http://neo.lcc.uma.es/cEA-web/VRP.htm
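To make the encodings above concrete, here is a minimal decoding sketch (the Python rendering and function names are mine, not the slides'): a random-key or locus-based genotype is turned back into a tour, and a separator-based VRP chromosome is split into routes.

```python
# Decoding sketches for the TSP/VRP genotypes above (illustrative only).

def decode_random_key(keys):
    """Random-key decoding: visit cities in order of their key values.
    e.g., [0.78, 0.56, 0.69, 0.11] -> [4, 2, 3, 1]."""
    return [city for _, city in sorted(zip(keys, range(1, len(keys) + 1)))]

def decode_locus_based(successors):
    """Locus-based decoding: successors[i-1] is the city visited after city i.
    e.g., [4, 3, 1, 2] -> tour 1-4-2-3 (the same cycle as 4-2-3-1)."""
    tour, city = [1], successors[0]
    while city != 1:
        tour.append(city)
        city = successors[city - 1]
    return tour

def split_vrp_routes(chromosome, separator=0):
    """VRP decoding: split the flat list into one route per vehicle."""
    routes, current = [], []
    for gene in chromosome:
        if gene == separator:
            routes.append(current)
            current = []
        else:
            current.append(gene)
    routes.append(current)
    return routes

print(decode_random_key([0.78, 0.56, 0.69, 0.11]))          # [4, 2, 3, 1]
print(decode_locus_based([4, 3, 1, 2]))                     # [1, 4, 2, 3]
print(split_vrp_routes([6, 9, 0, 2, 4, 7, 5, 0, 8, 1, 3]))  # [[6, 9], [2, 4, 7, 5], [8, 1, 3]]
```

Mutation and crossover then operate on the genotype (keys, successor lists, or the flat route list), and decoding recovers the phenotype for fitness assessment.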
Introduction
 Tweak in representation
 Phenotype → (E) → genotype → Tweak → genotype → (D) → phenotype
 Determining the fitness landscape
• Example: Hamming cliffs and Gray coding (see the sketch below)
 Remember: make small changes!
• They usually help metaheuristics.
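Why Gray coding: binary-coded integers have Hamming cliffs (7 = 0111 and 8 = 1000 differ in every bit), so a single-bit mutation cannot make that small phenotypic step; in a Gray code, adjacent integers differ in exactly one bit. A minimal sketch (function names are mine):

```python
def binary_to_gray(n):
    """Standard reflected Gray code: adjacent integers differ in one bit."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Invert the Gray code by cumulatively XOR-ing shifted copies."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# 7 and 8 are 0111/1000 in plain binary (a 4-bit Hamming cliff),
# but their Gray codes 0100 and 1100 differ in a single bit.
print(format(binary_to_gray(7), '04b'), format(binary_to_gray(8), '04b'))  # 0100 1100
assert all(gray_to_binary(binary_to_gray(i)) == i for i in range(64))
```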
Introduction
 Much of representation is an art, not a science!
 e.g., workflow (business process)
• How to encode and tweak?
source: http://www.tonymarston.net/php-mysql/workflow.html
Introduction
 Required properties (Talbi, 2009)
 Completeness
• All solutions should be representable
 Connexity
• A search path must exist between any two solutions (in particular, to the global optimum)
 Efficiency
• Ease of manipulation by the search operators
 Representation-solution mapping (Talbi, 2009)
 One-to-one
 Many-to-one
• Redundancy enlarges the search space.
 One-to-many (indirect encoding)
• A good solution must be constructed (decoded) from an individual.
Vectors
 Initialization and bias
 Not difficult to initialize
• Some totally-random initialization method (covered already)
 Bias?
• e.g., solutions for robot walking → seed using a heuristic (e.g., by motion capture)
• But diversity is useful, particularly early on.
• Some suggestions
1) Biasing is dangerous.
2) Start with values that are not all, or not exactly, based on the heuristic bias
 Mutation
 Examples
• Gaussian convolution, bit-flip mutation, ...
• Integer vector: Integer Randomization Mutation, Random Walk Mutation, ...
 cf. point mutation
• Useful when changing several genes at once is unlikely to yield an improvement
• But it can be trapped in a local optimum that only simultaneous changes to several genes would escape (see the mutation sketch below)
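For reference, a minimal sketch of two of the vector mutations named above, bit-flip mutation and bounded Gaussian convolution (parameter defaults and bounds are my own choices, not the book's exact listings):

```python
import random

def bit_flip_mutation(bits, p=None):
    """Flip each bit independently with probability p (default 1/length)."""
    p = 1.0 / len(bits) if p is None else p
    return [1 - b if random.random() < p else b for b in bits]

def gaussian_convolution(vector, sigma=0.1, p=1.0, low=-1.0, high=1.0):
    """Add bounded Gaussian noise to each gene, each with probability p."""
    result = []
    for x in vector:
        if random.random() < p:
            noise = random.gauss(0.0, sigma)
            while not (low <= x + noise <= high):   # resample until within bounds
                noise = random.gauss(0.0, sigma)
            x += noise
        result.append(x)
    return result

print(bit_flip_mutation([1, 0, 1, 1, 0, 0]))
print(gaussian_convolution([0.2, -0.5, 0.9, 0.0]))
```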
Vectors
 Recombination
 One- and Two-point Crossover, Uniform Crossover (see the sketch after this list)
 Line Recombination, Intermediate Recombination
 ...
 Heterogeneous vectors?
 e.g., a function with real parameters and integer parameters
 Phenotype-specific mutation or crossover
 e.g., Jung & Moon, The Natural Crossover for the 2D Euclidean TSP, 2002
 Consider fitness landscape.
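A minimal sketch of one-point and uniform crossover on fixed-length vectors (illustrative code of mine, not the book's pseudocode):

```python
import random

def one_point_crossover(a, b):
    """Cut both parents at the same random point and swap the tails."""
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def uniform_crossover(a, b, p=0.5):
    """Swap each gene between the parents independently with probability p."""
    child1, child2 = list(a), list(b)
    for i in range(len(a)):
        if random.random() < p:
            child1[i], child2[i] = child2[i], child1[i]
    return child1, child2

print(one_point_crossover([1, 1, 1, 1, 1], [0, 0, 0, 0, 0]))
print(uniform_crossover([1, 1, 1, 1, 1], [0, 0, 0, 0, 0]))
```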
Direct Encoded Graphs
 Graphs
 Examples
• Neural networks, finite-state automata, Petri nets, electrical circuits, ...
 Types
• Directed, undirected, with labels, with weights, cyclic, acyclic, recurrent, feedforward, sparse, dense, planar, ...
• These act as constraints that the Tweak operations must respect.
 Arbitrary-structured graph
 Our target of graph representation
 Types of encoding
 Direct encoding
• Exact node and edge description in representation
 Indirect (developmental) encoding
• Some (production) rules for constructing the graph as a solution (discussed later)
Direct Encoded Graphs
 Full adjacency matrix
 e.g., a recurrent directed graph structure, with
• no more than 5 nodes
• no more than one edge between any two nodes
• self-edges allowed
• weights for edges
 Mutation examples
• One vector approach
• Algorithm 45. Gaussian Convolution Respecting Zeros
• Using two vectors
• One for on/off, the other for weights
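A sketch of the two-vector variant of the full adjacency matrix: one boolean matrix switches edges on or off, a parallel matrix holds the weights, and mutation adds Gaussian noise only to the weights of edges that are switched on (my own illustration in the spirit of the approach above, not the book's Algorithm 45):

```python
import random

N = 5  # at most 5 nodes; self-edges allowed

def random_graph_genotype():
    """Two parallel N x N matrices: edge on/off flags and edge weights."""
    on  = [[random.random() < 0.3 for _ in range(N)] for _ in range(N)]
    wts = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]
    return on, wts

def mutate(genotype, sigma=0.1, p_flip=0.05):
    """Occasionally toggle an edge; perturb weights only where the edge is on."""
    on, wts = genotype
    for i in range(N):
        for j in range(N):
            if random.random() < p_flip:
                on[i][j] = not on[i][j]
            if on[i][j]:
                wts[i][j] += random.gauss(0.0, sigma)
    return on, wts

on, wts = mutate(random_graph_genotype())
print(sum(sum(row) for row in on), "edges switched on")
```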
Direct Encoded Graphs
 Arbitrary graph structure
 Initialization of graph (N, E)
• Determine the number of nodes and edges
• e.g., using a geometric distribution
• Create the nodes and edges, depending on the type of the target graph
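A sketch of such an initializer for a weighted directed graph, drawing the node and edge counts from geometric distributions (the probability parameters and the dict-of-edges representation are my own choices):

```python
import random

def sample_geometric(p):
    """Return at least 1; keep incrementing with probability p."""
    n = 1
    while random.random() < p:
        n += 1
    return n

def random_directed_graph(p_node=0.8, p_edge=0.9):
    """Initialize an arbitrary directed graph with weighted edges."""
    nodes = list(range(sample_geometric(p_node)))
    edges = {}
    for _ in range(sample_geometric(p_edge)):
        i, j = random.choice(nodes), random.choice(nodes)   # self-edges allowed
        edges[(i, j)] = random.uniform(-1, 1)               # dict keeps one edge per pair
    return nodes, edges

nodes, edges = random_directed_graph()
print(len(nodes), "nodes,", len(edges), "edges")
```

Building a connected or acyclic graph would need an extra repair or rejection step, as the next slide suggests.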
Direct Encoded Graphs
 Arbitrary graph structure (cont’d)
 Further considerations in initialization
• e.g., connected and directed acyclic graph
• cf. a general algorithms textbook
 Mutation
• e.g., do one of the following a random number of times (see the sketch after this list)
• delete a random edge
• add a random edge
• delete a node and all its edges
• add a node
• relabel a node
• relabel an edge
 Recombination
• cf. the goal of crossover: transferring essential and useful elements to another individual
• Determining which elements to transfer
• Selecting a subset of nodes and edges, or selecting a subgraph
• Coping with edges whose targets are missing and with disjoint subgraphs
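A sketch of the mutation scheme above: pick one of the structural edits at random and apply it, repeating a geometric number of times (the set-and-dict representation and helper names are mine; edge relabeling is omitted because this sketch labels nodes only):

```python
import random

def mutate_graph(nodes, edges, labels, p_more=0.5):
    """Apply one random structural edit, a geometric number of times.
    nodes: list of ids, edges: set of (i, j) pairs, labels: dict id -> label."""
    while True:
        op = random.choice(["del_edge", "add_edge", "del_node",
                            "add_node", "relabel_node"])
        if op == "del_edge" and edges:
            edges.remove(random.choice(sorted(edges)))
        elif op == "add_edge" and nodes:
            edges.add((random.choice(nodes), random.choice(nodes)))
        elif op == "del_node" and len(nodes) > 1:
            v = random.choice(nodes)
            nodes.remove(v)
            edges -= {e for e in edges if v in e}   # drop all edges touching v
            labels.pop(v, None)
        elif op == "add_node":
            v = max(nodes, default=-1) + 1
            nodes.append(v)
            labels[v] = random.randint(0, 9)
        elif op == "relabel_node" and nodes:
            labels[random.choice(nodes)] = random.randint(0, 9)
        if random.random() >= p_more:               # stop with probability 1 - p_more
            return nodes, edges, labels

print(mutate_graph([0, 1, 2], {(0, 1), (1, 2)}, {0: 3, 1: 5, 2: 7}))
```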
Direct Encoded Graphs
 Arbitrary graph structure (cont’d)
 Recombination (cont’d)
Direct Encoded Graphs
 Vector vs. graph representation
 e.g., Relocation of containers in a bay for efficient loading
• Solution as a list of movements
• e.g., (1-2), (3-2), (4-5), (6-5), (4-7), (4-6)
• Weakness?
• Solution as a graph
[figure: the solution drawn as a graph with nodes labeled a–f]
Trees and Genetic Programming
 Genetic Programming
 How to use stochastic methods to search for and optimize small computer programs or other computational devices
 The concept of suboptimality is required
• Not simply right or wrong
 Examples
• Team soccer-robot behavior, fitting a mathematical equation to a data set, finding finite-state automata that match a given language
 Representation
 Lists or trees, usually
• e.g., an artificial ant, sin(cos(x – sin x) + xx) for symbolic regression
Trees and Genetic Programming
 Primitives in representation
 Basic functions (e.g., kick-toward-goal) or CPU operations (e.g., +)
 Constraints of context
• e.g., 4 + kick-toward-goal() makes no sense
• e.g., matrix-multiply expects exactly two children, and ...
 Tweaks need to maintain closure (valid individuals)
 Fitness assessment
 Convert data (genotype) to code (phenotype), then evaluate
 Examples
• Symbolic regression: sum of squared errors
• Artificial ant: amount of food eaten
 Tree-Style Genetic Programming Pipeline
 Sec. 3.3.3
 One of the popular algorithms for Genetic Programming (but not limited to it)
Trees and Genetic Programming
 Initialization
 New trees by repeatedly selecting from a function set
• Considering arity (predefined number of children)
• e.g., Grow, Full, Ramped Half-and-Half, PTC2 algorithms
 Ephemeral random constants
• Handling constants for leaves (e.g., 0.2462, 0.9, –2.34, 3.14, “s%&e:m”)
• Special leaf nodes to be transformed into randomly-generated constant
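A minimal sketch of Grow-style initialization over a small symbolic-regression primitive set, with an ephemeral-random-constant leaf that is replaced by a freshly generated constant when placed in the tree (the primitive set, depth bound, and leaf probability are my own choices):

```python
import random

FUNCTIONS = {"+": 2, "*": 2, "sin": 1}     # function name -> arity
TERMINALS = ["x", "ERC"]                   # ERC = ephemeral random constant

def grow(max_depth):
    """Grow a random expression tree (nested lists) down to max_depth."""
    if max_depth == 0 or random.random() < 0.3:            # become a leaf
        leaf = random.choice(TERMINALS)
        return round(random.uniform(-1, 1), 3) if leaf == "ERC" else leaf
    name = random.choice(list(FUNCTIONS))
    return [name] + [grow(max_depth - 1) for _ in range(FUNCTIONS[name])]

print(grow(4))   # e.g., ['+', ['sin', 'x'], 0.247]
```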
Trees and Genetic Programming
 Recombination
 e.g., subtree crossover: swap two randomly selected subtrees (see the sketch at the end of this slide)
• Non-homologous (i.e., highly mutative)
 cf. homologous: an individual crossing over with itself produces copies of itself
 Mutation
 Examples
• Subtree mutation: replacing random subtree with randomly-generated one
• Replacing random non-leaf node with one of its subtrees
• Picking random non-leaf node and swapping its subtrees
• Mutating ephemeral random constants by introducing some noise
• Swapping two disjoint subtrees
 cf. mutation is not popular here, because crossover is usually non-homologous anyway
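A sketch of subtree crossover on the nested-list trees from the initialization sketch earlier (helper names are mine): pick a random non-root node in each parent and swap the subtrees rooted there.

```python
import copy
import random

def node_paths(tree, path=()):
    """All paths to nodes; a path is a tuple of child indices."""
    paths = [path]
    if isinstance(tree, list):                      # internal node: [op, child, ...]
        for i, child in enumerate(tree[1:], start=1):
            paths += node_paths(child, path + (i,))
    return paths

def get_subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def set_subtree(tree, path, subtree):
    for i in path[:-1]:
        tree = tree[i]
    tree[path[-1]] = subtree

def subtree_crossover(parent_a, parent_b):
    """Swap one randomly chosen subtree of each parent (parents untouched).
    Assumes both parents are internal nodes, so a non-root path exists."""
    a, b = copy.deepcopy(parent_a), copy.deepcopy(parent_b)
    path_a = random.choice([p for p in node_paths(a) if p])   # skip the root
    path_b = random.choice([p for p in node_paths(b) if p])
    sub_a, sub_b = get_subtree(a, path_a), get_subtree(b, path_b)
    set_subtree(a, path_a, sub_b)
    set_subtree(b, path_b, sub_a)
    return a, b

a = ["+", ["sin", "x"], 0.5]
b = ["*", "x", ["+", "x", 1.0]]
print(subtree_crossover(a, b))
```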
Trees and Genetic Programming
 Forests
 e.g., a soccer-robot team as a forest, with each team member as a tree
 Automatically defined functions (ADFs)
 Not predefined functions, but trees called by the primary tree
 cf. modularity
• Useful when we believe a good solution has repetitive parts
 Strongly-Typed Genetic Programming
Trees and Genetic Programming
 Cellular encoding
 Indirect encoding (developmental encoding)
Lists
 Grammatical Evolution: using a predefined grammar to build trees
 Trees generated from lists (indirect encoding)
• cf. http://en.wikipedia.org/wiki/Backus-Naur_form
 Pros and cons
• Almost always a valid tree; reduced size of the search space
• Tiny changes early in the list result in gigantic changes (non-smoothness).
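A minimal sketch of the usual Grammatical-Evolution-style decoding: walk a BNF grammar and, at each choice point, consume the next integer of the list modulo the number of productions (the toy grammar and the wrapping limit are my own simplifications, not a specific GE implementation):

```python
# Toy BNF grammar: nonterminal -> list of alternative productions.
GRAMMAR = {
    "<expr>": [["<expr>", "+", "<expr>"], ["<expr>", "*", "<expr>"], ["<var>"]],
    "<var>":  [["x"], ["1.0"]],
}

def decode(genome, start="<expr>", max_wraps=2):
    """Expand the grammar, choosing productions with successive genome values."""
    out, stack, i = [], [start], 0
    while stack:
        sym = stack.pop(0)                        # process symbols left to right
        if sym in GRAMMAR:
            if i >= len(genome) * max_wraps:
                return None                       # ran out of codons: invalid
            rules = GRAMMAR[sym]
            choice = rules[genome[i % len(genome)] % len(rules)]
            i += 1
            stack = list(choice) + stack
        else:
            out.append(sym)
    return " ".join(out)

print(decode([0, 2, 1, 2, 0]))   # -> "1.0 + x"
```

Note how changing an early codon reshuffles every later choice, which is the non-smoothness mentioned above.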
Rulesets
 A policy as solution of problem
 Consisting of a set of rules
 e.g., stock trading program, entities in simulations
 State-action rules
 Typical form
• a ∧ b ∧ ... ∧ y → z
• e.g., (left sonar value > 3.2) ∧ (forward sonar value ≤ 5.0) → (turn left to 50)
 An interpretation
• A mapping from the state space to actions (see the sketch below)
 Under-specification and over-specification
• Handled with default rules, voting, ...
 Fitness assessment
• On a ruleset, or on a series of rules
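A minimal sketch of evaluating such a state-action ruleset: each rule is a list of condition predicates plus an action, all matching rules vote, and a default action covers under-specified states (the sonar-style conditions just echo the example above; the names are mine):

```python
from collections import Counter

# A rule: ([predicates over the state], action). All predicates must hold to fire.
RULES = [
    ([lambda s: s["left_sonar"] > 3.2, lambda s: s["forward_sonar"] <= 5.0], "turn-left-50"),
    ([lambda s: s["forward_sonar"] > 5.0], "go-forward"),
]
DEFAULT_ACTION = "stop"

def act(state, rules=RULES):
    """Fire every matching rule; resolve over-specification by majority vote."""
    votes = Counter(action for conds, action in rules
                    if all(cond(state) for cond in conds))
    return votes.most_common(1)[0][0] if votes else DEFAULT_ACTION

print(act({"left_sonar": 4.0, "forward_sonar": 2.5}))   # turn-left-50
print(act({"left_sonar": 1.0, "forward_sonar": 1.0}))   # stop (under-specified state)
```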
Rulesets
 Production rules
 Typical form
• a → b c ... z
 Modular indirect encoding
• Describing a large, complex solution with lots of repetition using a small, compact rule (search) space
 e.g., 8-node directed unlabeled graph structure as solution
Rulesets
 Production rules (cont’d)
 e.g., Lindenmayer systems (L-systems)
• e.g., Koch Curve
• F → F+F-F-F+F
• F: draw a line forward, +: turn left, –: turn right
F
F+F-F-F+F
F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F
...
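A minimal sketch of the rewriting step that produces the strings above (the rule table is the Koch-curve production from this slide; the code itself is mine):

```python
RULES = {"F": "F+F-F-F+F"}   # Koch-curve production; '+' and '-' map to themselves

def expand(axiom, iterations):
    """Apply the L-system productions to every symbol, in parallel, n times."""
    s = axiom
    for _ in range(iterations):
        s = "".join(RULES.get(ch, ch) for ch in s)
    return s

for n in range(3):
    print(expand("F", n))
# F
# F+F-F-F+F
# F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F
```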
Bloat
 Code bloat or code growth
 A problem with variable-sized representation
 Usually far from the optimum, wasteful of memory, ... and ugly
 Common ways of handling
 Limiting size when an individual is Tweaked
 Editing individuals to remove introns and the like
 Punishing individuals for being very large
• e.g., linear parsimony pressure (problem?)
• revised fitness f = αr – (1 – α)s, where r: fitness, s: size of the individual
• e.g., non-parametric parsimony pressure
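A sketch of linear parsimony pressure as a fitness wrapper, assuming the formula above is the convex combination f = αr – (1 – α)s (α is a tuning weight; the hinted problem is that raw fitness and size sit on incommensurable scales, which is what non-parametric variants avoid):

```python
def parsimony_fitness(raw_fitness, size, alpha=0.9):
    """Linear parsimony pressure: trade raw fitness against individual size."""
    return alpha * raw_fitness - (1.0 - alpha) * size

# Two individuals with equal raw fitness: the smaller one now scores higher.
print(parsimony_fitness(raw_fitness=10.0, size=20))    # 7.0
print(parsimony_fitness(raw_fitness=10.0, size=200))   # -11.0
```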
Summary
 Phenotype & genotype
 Encoding & decoding
 Representations
 Vectors
 Graphs
+ Indirect-encoded graphs (edge encoding)
 Trees
+ Indirect-encoded trees (Grammatical Evolution)
 Lists
 Rulesets
 Bloat