EC and Genetics - University of Houston

Part 1 - Natural Genetics
Ben Paechter, with thanks to the EvoNet Training Committee and its “Flying Circus”
Ch. Eick: What EC-Algorithm Designers can Learn from Genetics

Natural Genetics
– The information required to build a living organism is coded in the DNA and other genetic material found in the cells of that organism.
– Within a species, most of the genetic material is the same.
– Small changes in the genetic material give rise to small changes in the organism, e.g. height, hair colour.

DNA and Genes
– DNA is a large molecule made up of fragments. There are several fragment types, each one acting like a letter in a long coded message:
  -A-B-A-D-C-B-B-C-C-A-D-B-C-C-A-
– Certain groups of letters are meaningful together, a bit like words. These groups are called genes.
– The DNA is made up of genes and rubbish.

Example: Human Reproduction
– Human DNA is organised into chromosomes.
– Most human cells contain 23 pairs of chromosomes, which together define the physical attributes of the person.

Reproductive Cells
– Sperm and egg cells contain 23 individual chromosomes rather than 23 pairs.
– Reproductive cells are formed by one cell splitting into two.
– During this process the pairs of chromosomes undergo an operation called crossover.

Crossover
– During crossover the chromosome pairs link up and swap parts of themselves.
  [Figure: a chromosome pair before and after crossover]
– After crossover one of each pair goes into each cell.

Fertilisation
[Figure: a sperm cell from the father and an egg cell from the mother combine into the new person's cell]

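
The Reproductive Cells, Crossover and Fertilisation slides can be tied together in a few lines of code. The following is only a rough sketch under simplifying assumptions: chromosomes are plain strings, each pair crosses over at a single random point (real crossover may involve several points), and the function names `crossover`, `make_gamete` and `fertilise` are made up for illustration.

```python
import random

def crossover(chrom_a, chrom_b, point):
    """Swap the tails of a chromosome pair at a single crossover point."""
    return chrom_a[:point] + chrom_b[point:], chrom_b[:point] + chrom_a[point:]

def make_gamete(chromosome_pairs, rng):
    """Cross over every pair, then pass one chromosome of each pair on."""
    gamete = []
    for chrom_a, chrom_b in chromosome_pairs:
        point = rng.randint(1, len(chrom_a) - 1)  # cut strictly inside the string
        gamete.append(rng.choice(crossover(chrom_a, chrom_b, point)))
    return gamete

def fertilise(sperm, egg):
    """Pair up the single chromosomes of two gametes into a new diploid cell."""
    return list(zip(sperm, egg))

rng = random.Random(0)
# Two chromosome pairs per parent instead of 23, to keep the output readable.
father = [("AAAAAA", "aaaaaa"), ("BBBB", "bbbb")]
mother = [("CCCCCC", "cccccc"), ("DDDD", "dddd")]
child = fertilise(make_gamete(father, rng), make_gamete(mother, rng))
print(child)
```

Each chromosome pair of the child holds one (possibly recombined) chromosome from the father and one from the mother, which is the situation shown on the Fertilisation slide.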
Mutation
– Occasionally some of the genetic material changes very slightly during this process.
– This means that the child might have genetic material not inherited from either parent.
– This is most likely to be catastrophic.

Theory of Evolution
– From time to time, reproduction, crossover and mutation produce new genetic material or new combinations of genes.
– Usually this reduces the organism’s ability to survive and so reproduce.
– Occasionally the new genetic material increases the organism’s ability to survive and so reproduce.
– If it allows the organism to reproduce more, then more and more organisms have the “new improved” genetic make-up.
– “Good” sets of genes get reproduced more.
– “Bad” sets of genes get reproduced less.

Theory of Evolution (2)
– The organisms as a whole get better and better at surviving in their environment.
– Evolutionists claim that all the species of plants and animals have been produced by this slow changing of genetic material, with organisms becoming better and better at surviving in their niche, and new organisms evolving to fill any vacant niche.
– They agree that evolution requires reproduction, selection and mutation.
– Some say evolution also requires crossover.

Evolution as Search
– We can think of evolution as a search through the enormous genetic parameter space for the genetic make-up that best allows an organism to reproduce in its changing environment.
– Since it seems pretty good at doing this job, we can borrow ideas from nature to help us solve problems that have equally large search spaces or similarly changing environments.

Dr. Eick’s Transparencies: Genetics and What EC Algorithm Designers Can Learn from It

More Genetics: Diploidy and Dominance
– Diploidy: most chromosome sets in biological systems are double-stranded (diploid) and not single-stranded (haploid), i.e. they carry pairs of chromosomes, each containing information for the same functions.
– The primary mechanism to select which genotypic information is expressed in the phenotype is dominance (upper-case letters denote dominant alleles):
  AbCDe + aBCde → ABCDe
– Diploidy provides a mechanism for remembering alleles and allele combinations that were previously useful; dominance provides a mechanism to shield those remembered alleles from harmful selection in a currently hostile environment (implicitly increasing the richness of the genes expressed in the current population by providing a shield against overselection).
– Dominance relationships frequently adapt in biological systems when the need arises.
– Hollstien (1971) simulated dominance using a three-letter instead of a binary alphabet consisting of dominant 1, recessive (non-dominant) 1, and 0, with 1dom > 0 and 1rec < 0; that is, the dominant 1 is expressed over 0, and 0 is expressed over the recessive 1.

Dominance and Diploidy (Continued)
– Other research represents the dominance information separately from the gene and lets it undergo evolution, a kind of co-evolution approach.
– In the late 1980s, Smith and Goldberg explored the use of redundancy for the normal knapsack problem with dynamic weight changes:
  – Hollstien’s triadic scheme showed improvement over a static dominance scheme.
  – The diploid approach coped better with oscillations in the weight function.
  – Diploidy decreases the probability that desired schemas are lost “forever”.
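
Hollstien's three-letter alphabet mentioned above can be made concrete with a small expression function. This is a minimal sketch and not code from the cited work: the symbols `"D"` (dominant 1), `"r"` (recessive 1) and `"0"` are chosen here for illustration, and the cases the slides do not spell out (a 1 paired with another 1) are assumed to express 1 because both alleles carry the value 1.

```python
# Assumed encoding of the three-letter alphabet:
#   "D" = dominant 1, "r" = recessive 1, "0" = 0
def express(allele_a, allele_b):
    """Return the expressed bit (0 or 1) for a pair of homologous alleles."""
    pair = {allele_a, allele_b}
    if not pair <= {"D", "r", "0"}:
        raise ValueError("alleles must be 'D', 'r' or '0'")
    if "D" in pair:   # a dominant 1 is expressed over anything (1dom > 0)
        return 1
    if "0" in pair:   # 0 is expressed over a recessive 1 (1rec < 0)
        return 0
    return 1          # both alleles are recessive 1

def phenotype(chrom_a, chrom_b):
    """Express a diploid genotype locus by locus."""
    return [express(a, b) for a, b in zip(chrom_a, chrom_b)]

print(phenotype("D0rr0", "0r0rD"))  # [1, 0, 0, 1, 1]
```

A recessive 1 stays in the genotype without showing in the phenotype while it is paired with a 0, which is the "memory" effect the slides attribute to diploidy.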
In summary, there seems to be some evidence that exploiting diploidy can be beneficial for GAs in dynamically changing environments, especially if scenarios encountered in the past tend to recur; on the other hand, diploidy is quite expensive, and not much research in the last 15 years has explored its use for GAs.

What Can GA Designers Learn from Plant Genetics and Horticulture?
– polyploidy and dominance
– gametogenesis is used as the crossover operator
– use of selfing
– unusual ways to prevent self-fertilization
– use of intercrossing (create Cartesian products of good initial solutions)
– preference for heterozygous sources and rich gene pools
– plant breeders employ complex search strategies to breed the best possible plant (such as recurrent selection, which will be the topic of this talk)
– mutation is not very important, because it is hard to control; large population sizes are difficult to handle for pragmatic reasons

Polyploidy
– Polyploidy: using two or more complete sets of chromosomes; the phenotype of an organism is determined through dominance of alleles.
– Advantages: adaptation to changing environments, “memorizing” alleles that worked successfully in the past, richer gene pool.
– Previous research on polyploidy: two major approaches to simulate polyploidy in GAs:
  – using an extra chromosome to represent dominance information [Brindle, this talk]
  – extending the alphabet to distinguish between dominant and recessive elements [Hollstien, Smith & Goldberg, Ng & Wong]

Features of our Approach
– uses at least 2 sets of chromosomes
– uses a dominance vector as a tie breaker
– uses a crossover control vector to restrict possible crossover points
– dominance vectors and crossover control vectors take part in the evolution
– gametogenesis is used as the crossover operator

3. Experiments
Benchmarks:
– Knapsack problem with dynamically changing weight constraints
– Schwefel function
Evaluation is performed with respect to the following measure:
  $M_2 = \frac{1}{G} \sum_{i=1}^{G} (T_i - X_i)^2$
where $T_i$ is the true optimum for generation $i$, $X_i$ is the best solution found in generation $i$, and $G$ is the number of generations.

4. Summary
– proposed an approach to support polyploidy that uses dominance vectors
– demonstrated the benefits of the approach in oscillating environments which cycle among several different states
– crossover control vectors are employed to provide linkage between the dominance vector and the chromosomes themselves
– the approach facilitates maintaining diversity in relatively small populations
– our experiments at least partially explain why diploidy and polyploidy exist in biological systems

Literature
– Ben S. Hadad and Christoph F. Eick: Using Recurrent Selection to Improve GA-performance, ISMIS, Charlotte, October 1997.
– Ben S. Hadad and Christoph F. Eick: Supporting Polyploidy in Genetic Algorithms Using Dominance Vectors, EP’97, Indianapolis, April 1997.
– Ben S. Hadad: Extending Genetic Algorithms Using Ideas Borrowed from Plant Genetics and Horticulture, Master’s Thesis, University of Houston, December 1996.

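
For readers who want to reproduce the evaluation measure from the Experiments slide, the sketch below computes M2 from two per-generation lists. It assumes that the measure is the squared gap between true optimum and best solution, summed over all generations and divided by G (the summation sign appears to have been lost in the slide text). The function name `m2` and the numbers are invented for illustration and are not results from the talk.

```python
def m2(true_optima, best_found):
    """Mean squared gap between true optimum T_i and best solution X_i."""
    if len(true_optima) != len(best_found):
        raise ValueError("need one T_i and one X_i per generation")
    generations = len(true_optima)
    if generations == 0:
        raise ValueError("need at least one generation")
    total = sum((t - x) ** 2 for t, x in zip(true_optima, best_found))
    return total / generations

# Made-up values for an environment that oscillates between two optima.
T = [10.0, 20.0, 10.0, 20.0]
X = [9.0, 16.0, 10.0, 19.0]
print(m2(T, X))  # (1 + 16 + 0 + 1) / 4 = 4.5
```

A lower value means the GA tracked the moving optimum more closely; a GA that loses the allele needed after an environment switch shows up as a large squared gap in the generations right after the switch.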
Inversion and Other Reordering Operators
Reordering operators change the position of genes in a chromosome, but do not change the composition of the chromosome:
– consequently, reordering operators do not directly affect the fitness.
– however, crossover is affected: the defining length of a schema is changed by applying reordering operators, which increases or decreases the probability that instances of a particular schema recur in the future.
– reordering means that genes are no longer lined up correctly, which, in many applications, causes problems with the crossover operator:
  – necessary genes might be missing: incomplete gene combinations can occur.
  – duplicated genes can occur, which is usually not desirable.
The most popular reordering operators are inversion and swapping:
  original:  1 2 3 | 4 5 6 7 | 8
  inversion: 1 2 3 7 6 5 4 8
  swap:      1 2 3 7 5 6 4 8
Empirical evidence seems to indicate that, at least in some applications, reordering operators are useful “secondary” operators whose employment induces slight improvements in the overall performance.

Niche and Speciation
We can view a niche as an organism’s job or role in an environment, and we can think of a species as a class of organisms with common characteristics.
Niche methods in genetic search:
– crowding (DeJong (1975)) and sharing functions (Goldberg (1987)).
– external schemes (Perry (1984)), which are similarity templates that define species membership and have to be provided by the GA developer.
– mating restrictions in genetic search:
  – line breeding (breed the champion repeatedly with others)
  – Hollstien’s inbreeding with intermittent crossbreeding (close individuals still breed as long as their family average fitness continues to improve; otherwise, crossbreeding between different families is used).
  – Booker introduces mating templates, which are mate selection mechanisms that become part of the individual (and themselves undergo evolution), and proposes different mating rules:
    – bidirectional match
    – unidirectional match
    – best partial matches
  – disallow breeding of similar individuals (e.g. incest)

Example of a Booker Mating Template
Assume we have chromosomes over alphabet $A$ with chromosome length $n$, and let $A' = A \cup \{\#\}$. Extend chromosomes, tripling their length, to:
  $ind = a_1 \ldots a_n \, b_1 \ldots b_n \, c_1 \ldots c_n$ with $a_i \in A$ and $b_i, c_i \in A'$ ($i = 1, \ldots, n$)
with the meaning: $ind$ is allowed to mate with $ind'$ if $ind' \in Schema(b_1 \ldots b_n)$ or $ind' \in Schema(c_1 \ldots c_n)$.
Example: Let n=4 and A be the binary alphabet:
  ind1 = 0010 0000 1111
  ind2 = 0000 1### 0111
  ind3 = 0111 001# 1111
Bidirectional match requires that “a must want b” and “b must want a”, whereas in unidirectional match it is sufficient that one partner wants the other. Many other matching schemes are possible, e.g. more complicated ones that operate on scores and thresholds.

Artificial Mating Tags
The problem with Booker’s approach is that mating templates have the same length as the chromosomes themselves, producing a significant overhead. To reduce this overhead, Holland proposed using three-part strings consisting of:
– a short mating template (used to test the suitability of other mates)
– a short mating tag (used by others to match; characterizes the string)
– the functional substring
Example:
  #10#:1010:111111000011
  #0##:1100:011111110001
– mating tags affect the compatibility with other strings, but do not affect the fitness.
– usually, the three-part string is evolved.
– Holland’s scheme of using artificial mating tags can also be used to define mating niches abstractly, similar to Perry’s external schema approach, by freezing particular positions in templates and tags. For example, mating can easily be restricted to particular subsets of the population. Mating tags can also be used to simulate distributed GAs.
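
Both Booker's mating templates and Holland's mating tags rest on the same test: does a string match a schema in which `#` is a wildcard? The sketch below applies that test to the example strings from the two slides above. It is an illustration under assumptions rather than either author's implementation: Booker's templates are assumed to be matched against the functional part of the prospective mate, Holland's template is assumed to be matched against the mate's tag, and all function names are hypothetical.

```python
def matches(template, string):
    """True if `string` is an instance of the schema `template` ('#' = wildcard)."""
    return len(template) == len(string) and all(
        t == "#" or t == s for t, s in zip(template, string)
    )

# Booker-style individual: (functional part a, template b, template c).
def wants(ind, other):
    """ind accepts other if other's functional part fits one of ind's templates."""
    return matches(ind[1], other[0]) or matches(ind[2], other[0])

def bidirectional(x, y):
    return wants(x, y) and wants(y, x)

def unidirectional(x, y):
    return wants(x, y) or wants(y, x)

ind1 = ("0010", "0000", "1111")
ind2 = ("0000", "1###", "0111")
ind3 = ("0111", "001#", "1111")

print(wants(ind1, ind2), wants(ind2, ind1))  # True False
print(unidirectional(ind1, ind2))            # True
print(bidirectional(ind1, ind2))             # False

# Holland-style three-part string: "template:tag:functional substring".
def tag_wants(x, y):
    """x accepts y if y's short tag fits x's short template."""
    x_template = x.split(":")[0]
    y_tag = y.split(":")[1]
    return matches(x_template, y_tag)

s1 = "#10#:1010:111111000011"
s2 = "#0##:1100:011111110001"
print(tag_wants(s1, s2) and tag_wants(s2, s1))  # True
```

Under these assumptions, every pair among ind1, ind2 and ind3 passes the unidirectional rule but none passes the bidirectional one, while the two Holland strings accept each other. The Holland test inspects only 4 symbols per comparison regardless of the length of the functional substring, which is the overhead reduction the slide describes.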