Slightly beyond Turing's computability for studying Genetic Programming
Olivier Teytaud, TAO, Inria, LRI, UMR CNRS 8623, Univ. Paris-Sud, Pascal, Digiteo
Outline
- What is genetic programming?
- Formal analysis of Genetic Programming
- Why is there nothing else than Genetic Programming?
  - Computability point of view
  - Complexity point of view

What is Genetic Programming (GP)?
- GP = mining Turing-equivalent spaces of functions.
- Typical example: symbolic regression.
  - Inputs:
    - x1, x2, x3, ..., xN in {0,1}*
    - y1, y2, y3, ..., yN in {0,1}
    - yi = f(xi)
    - the (xi, yi) are assumed independently and identically distributed (unknown probability distribution)
  - Goal: find g such that
      E|g(x) - y| + C · E[Time(g, x)]
    is as small as possible.
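The objective above can be estimated empirically. A minimal Python sketch, not from the talk: the name `score`, the `time_of` hook, and the wall-clock fallback (a crude stand-in for a step counter) are all illustrative choices.

```python
import time

def score(g, samples, C=1e-6, time_of=None):
    """Empirical estimate of E|g(x) - y| + C * E[Time(g, x)].

    `g` maps a bitstring to 0/1; `samples` is a list of (x, y) pairs.
    `time_of(g, x)` should return the cost of running g on x; here we
    fall back to wall-clock time (a rough proxy for a step count).
    """
    err, cost = 0.0, 0.0
    for x, y in samples:
        t0 = time.perf_counter()
        out = g(x)
        dt = time.perf_counter() - t0
        err += abs(out - y)
        cost += time_of(g, x) if time_of else dt
    n = len(samples)
    return err / n + C * cost / n

# Toy target: parity of the bitstring.
samples = [("101", 0), ("111", 1), ("001", 1), ("000", 0)]
parity = lambda x: x.count("1") % 2
print(score(parity, samples, C=0.0))  # perfect fit -> 0.0
```

With C > 0 the score trades accuracy against runtime, which is exactly the compromise the slide's objective expresses.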

How does GP work?
- GP = evolutionary algorithm.
- Evolutionary algorithm:
    P = initial population
    while (my favorite criterion):
      Selection = best functions in P according to some score
      Mutations = random perturbations of programs in the Selection
      Cross-over = merging of programs in the Selection
      P ≈ Selection + Mutations + Cross-over
Does it work? Definitely yes, for robust and multimodal optimization in complex domains (trees, bitstrings, ...).

Which score? A nice question for mathematicians.

Why study GP?
- GP is studied by many people:
  - 5440 articles in the GP bibliography [5]
  - more than 880 authors
- GP seemingly works:
  - human-competitive results: http://www.geneticprogramming.com/humancompetitive.html
  - nothing else exists for mining Turing-equivalent spaces of programs
  - probably better than random search
- Not so many mathematical foundations in GP.
- Not so many open problems in computability, in particular with applications.

Formalization of GP
What is typical of GP?
- No halting criterion: we stop when time is exhausted.
- No use of prior knowledge: f itself is not used, even when you know it.
People (often) do not like GP because:
- it is slow and has no halting criterion;
- it uses the yi = f(xi) and not f (different from automatic code generation).
Are these two elements necessary?
(Iterative algorithms; black-box access?)
Formalization of GP
Summary:
- GP uses only the f(xi) and the Time(f, xi).
- GP never halts: it outputs a sequence O1, O2, O3, ...
Can we do better?

Known results
Whenever f is available (and not only the f(xi)), computing O such that
- O ≡ f,
- O is optimal for size (or speed, or space, ...)
is not possible, i.e. there is no Turing machine performing that task for all f.
A first (easy) good reason for GP.
Whenever f is available (and not only the f(xi)), computing O1, O2, ..., such that
- Op ≡ f for p sufficiently large,
- lim size(Op) is optimal
is possible, with proved convergence rates, e.g. by bloat penalization:

    set n = 1
    while (true):
      select the best program P for a compromise between
        relevance on the n first examples and a penalization of size,
        e.g.  Sum_{i<n} |P(xi) - yi| + C(|P|, n)
      n = n + 1

(See details of the proof and of the algorithm in the paper.)
A first (easy) good reason for GP.
Whenever f is not available (but only the f(xi)), computing O1, O2, ..., such that
- Op ≡ f for p sufficiently large,
- lim size(Op) is optimal
is possible, with proved convergence rates, e.g. by bloat penalization:

    consider a population of programs; set n = 1
    while (true):
      select the best program P for a compromise between
        relevance on the n first examples and a penalization of size,
        e.g.  Sum_{i<n} |P(xi) - yi| + C(|P|, n)
      n = n + 1

(See details of the proof and of the algorithm in the paper.)
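A sketch of the bloat-penalized loop above. Everything here is illustrative: the penalty C(size, n) = size/n and the fixed candidate pool are hypothetical choices (a real GP searches an infinite program space, not a fixed list), but the selection rule is the one on the slide.

```python
def bloat_penalized(candidates, xs, ys, steps, C=lambda size, n: size / n):
    """Anytime selection with bloat penalization (sketch).

    `candidates` is a list of (program, size) pairs; at step n the best
    program for  sum_{i<n} |P(x_i) - y_i| + C(|P|, n)  is emitted.
    The penalty must vanish as n grows so that, in the limit, only
    correct programs survive and the smallest one wins.
    """
    outputs = []
    for n in range(1, steps + 1):
        def total(cand):
            prog, size = cand
            return sum(abs(prog(xs[i]) - ys[i]) for i in range(n)) + C(size, n)
        outputs.append(min(candidates, key=total))
    return outputs  # O_1, O_2, ... ; no halting criterion, as in the slide

# Toy target: parity of an integer. The smallest correct program should win.
xs = list(range(8)); ys = [x % 2 for x in xs]
small = (lambda x: x % 2, 3)    # correct, size 3
big   = (lambda x: x % 2, 10)   # correct, size 10
wrong = (lambda x: 0, 1)        # tiny but wrong
seq = bloat_penalized([wrong, small, big], xs, ys, steps=8)
print(seq[-1][1])  # -> 3: the smallest correct program
```

Early on the tiny-but-wrong program can win (few examples, heavy penalty), but as n grows its accumulated error dominates the shrinking penalty, illustrating why the guarantee is only asymptotic.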
A first (easy) good reason for GP.
- Asymptotically (only!), finding an optimal function O ≡ f is possible.
- No halting criterion is possible (this avoids the use of an oracle in 0').

Outline
- What is genetic programming?
- Formal analysis of Genetic Programming
- Why is there nothing else than Genetic Programming?
  - Computability point of view
  - Complexity point of view:
    - Kolmogorov complexity with bounded time
    - Application to genetic programming

Kolmogorov's complexity
- Kolmogorov's complexity of x: the minimum size of a program generating x.
- Kolmogorov's complexity of x with time at most T: the minimum size of a program generating x in time at most T.
- Kolmogorov's complexity in bounded time is computable.
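Why is the time-bounded version computable? Because the time budget forces every candidate run to halt, so exhaustive search over programs works. A toy Python illustration; the three-instruction language is invented for the example.

```python
from itertools import product

def run(prog, max_steps):
    """Toy language: '0'/'1' append a bit, 'd' doubles the output string.
    Returns the output, or None if the step budget is exceeded."""
    out, steps = "", 0
    for op in prog:
        steps += 1
        if steps > max_steps:
            return None       # the bound guarantees termination
        out = out + out if op == "d" else out + op
    return out

def K_T(x, T, max_len=10):
    """Time-bounded Kolmogorov complexity of x, by exhaustive search:
    the length of the shortest program (up to max_len) producing x
    within T steps. Computable precisely because every run halts."""
    for length in range(1, max_len + 1):
        for prog in product("01d", repeat=length):
            if run(prog, T) == x:
                return length
    return None

print(K_T("1" * 8, T=10))  # '1ddd' -> 4, vs 8 literal append instructions
```

Without the time bound the same search would diverge on non-halting programs, which is exactly where plain Kolmogorov complexity becomes uncomputable.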

Kolmogorov's complexity and genetic programming
- GP uses expensive simulations of programs.
- Can we get rid of the simulation time, e.g. by using f not only as a black box?
- Essentially, no:
  - Example of a GP problem: finding O as small as possible with
    - E[Time(O, x)] < Tn,
    - |O| < Sn,
    - O(x) = y.
  - If Tn = Ω(2^n) and some Sn = O(log(n)), this requires time at least Tn / polynomial(n).
  - Just simulating all programs shorter than Sn and "faster" than Tn is possible in time polynomial(n) · Tn.
Conclusion
- Summary:
  - GP is typically approximately solving problems in 0'.
  - There is a lot of work on approximating NP-complete problems, but not a lot on 0'.
  - We provide a theoretical analysis of GP.
- Conclusions:
  - GP uses expensive simulations, but the simulation cost cannot be removed anyway.
  - GP has no halting criterion, but no halting criterion can be found.
  - Also, "bloat" penalization ensures consistency → this point proposes a parametrization of the usual algorithms.