EC and Genetics - University of Houston

Part 1 - Natural Genetics
Ben Paechter
with thanks to the EvoNet Training
Committee and its “Flying Circus”
Ch. Eick: What EC-Algorithm Designers can Learn from Genetics
Natural Genetics



The information required to build a living organism is coded in the DNA and other genetic material found in the cells of that organism.
Within a species, most of the genetic material is the same.
Small changes in the genetic material give rise to small changes in the organism:
– e.g. height, hair colour
DNA and Genes




DNA is a large molecule made up of fragments.
There are several fragment types, each one acting like a letter in a long coded message:
-A-B-A-D-C-B-B-C-C-A-D-B-C-C-A-
Certain groups of letters are meaningful together, a bit like words.
These groups are called genes.
The DNA is made up of genes and rubbish.
Example: Human Reproduction


Human DNA is organised into chromosomes.
Most human cells contain 23 pairs of chromosomes, which together define the physical attributes of the person.
Reproductive Cells



Sperm and egg cells contain 23 individual chromosomes rather than 23 pairs.
Reproductive cells are formed by one cell splitting into two.
During this process the pairs of chromosomes undergo an operation called crossover.
Crossover
During crossover the chromosome pairs link up and swap parts of themselves.
[Figure: a chromosome pair before and after crossover]
After crossover one of each pair goes into each cell.
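In an evolutionary algorithm, this operation is commonly modelled as one-point crossover on two equal-length strings. The following is a minimal sketch of that analogue, not something taken from the slides; the function name `one_point_crossover` and the string representation are illustrative assumptions.

```python
import random

def one_point_crossover(parent_a, parent_b, point=None, rng=random):
    """Swap the tails of two equal-length chromosomes at a single cut point.

    Hypothetical helper for illustration: chromosomes are plain strings
    (or lists), and `point` may be fixed to make the result reproducible.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("chromosomes must have the same length")
    if len(parent_a) < 2:
        raise ValueError("chromosomes need at least two genes to be cut")
    if point is None:
        # Choose a cut strictly inside the chromosome so both parts are non-empty.
        point = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

# The pair "links up" at position 3 and swaps everything after it.
print(one_point_crossover("AAAAAAAA", "BBBBBBBB", point=3))
# -> ('AAABBBBB', 'BBBAAAAA')
```

Each child then plays the role of the single chromosome that goes into one reproductive cell.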
Fertilisation
[Figure: sperm cell from father and egg cell from mother combine into a new person cell]
Mutation



Occasionally some of the genetic material changes very slightly during this process.
This means that the child might have genetic information not inherited from either parent.
This is most likely to be catastrophic.
Theory of Evolution






From time to time, reproduction, crossover and mutation produce new genetic material or new combinations of genes.
Usually this reduces the organism’s ability to survive and so reproduce.
Occasionally the new genetic material increases the organism’s ability to survive and so reproduce.
If it allows the organism to reproduce more, then more and more organisms come to have the “new improved” genetic make-up.
“Good” sets of genes get reproduced more.
“Bad” sets of genes get reproduced less.
Theory of Evolution (2)




The organisms as a whole get better and better at surviving in their environment.
Evolutionists claim that all the species of plants and animals have been produced by this slow changing of genetic material, with organisms becoming better and better at surviving in their niche, and new organisms evolving to fill any vacant niche.
They agree that evolution requires reproduction, selection and mutation.
Some say evolution also requires crossover.
Evolution as Search


We can think of evolution as a search through the enormous genetic parameter space for the genetic make-up that best allows an organism to reproduce in its changing environment.
Since it seems pretty good at doing this job, we can borrow ideas from nature to help us solve problems that have equally large search spaces or similarly changing environments.
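To make the "evolution as search" idea concrete, here is a minimal sketch of a genetic algorithm that combines reproduction, selection, crossover and mutation. It is an illustration only: the function name `evolve`, binary tournament selection, elitism, the parameter values, and the toy fitness (the number of 1-bits) are all assumptions, not part of the original slides.

```python
import random

def evolve(fitness, length=20, pop_size=30, generations=50,
           mutation_rate=0.02, seed=0):
    """Search bit strings of the given length for high fitness (toy GA sketch)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]

    def pick():
        # Selection: binary tournament, the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Elitism (an assumption): carry the best individual over unchanged.
        next_pop = [max(pop, key=fitness)]
        while len(next_pop) < pop_size:
            mother, father = pick(), pick()
            # Crossover: head of one parent, tail of the other.
            cut = rng.randint(1, length - 1)
            child = mother[:cut] + father[cut:]
            # Mutation: occasionally flip a bit.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

best = evolve(sum)  # toy fitness: count the 1-bits
print(best, sum(best))
```

Because the best individual is always kept, the best fitness in the population never decreases from one generation to the next in this sketch.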
Dr. Eick’s Transparencies:
Genetics and What EC Algorithm
Designers can learn from it
More Genetics: Diploidy and Dominance


Diploidy: most complex organisms are diploid rather than haploid; their cells carry pairs of chromosomes, with both members of a pair containing information for the same functions.
The primary mechanism to select which genotypical information will be expressed in the phenotype is dominance:
– AbCDe + aBCde → ABCDe (upper case denotes the dominant allele at each locus)
Diploidy provides a mechanism for remembering alleles and allele combinations that were previously useful; dominance provides a mechanism to shield those remembered alleles from harmful selection in a currently hostile environment (implicitly increasing the richness of the gene pool of the current population by providing a shield against overselection).
Dominance relationships frequently adapt in biological systems when the need arises.
Hollstien (1971) simulated dominance using a three-letter alphabet instead of a binary one, consisting of a dominant 1, a recessive (non-dominant) 1, and 0, with 1dom > 0 > 1rec.
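The two dominance rules on this slide can be sketched in a few lines. This is a hedged illustration, not Hollstien's implementation: the function names and the encoding of the recessive 1 as the character `r` are assumptions made for the example.

```python
def express_letters(chrom_a, chrom_b):
    """At each locus the upper-case (dominant) allele is expressed."""
    phenotype = []
    for x, y in zip(chrom_a, chrom_b):
        if x.lower() != y.lower():
            raise ValueError("homologous chromosomes must carry the same genes")
        phenotype.append(x if x.isupper() else y)
    return "".join(phenotype)

# Triallelic scheme: dominant 1 > 0 > recessive 1.
# The recessive 1 is written 'r' here (an encoding chosen for this sketch).
RANK = {"1": 2, "0": 1, "r": 0}
VALUE = {"1": 1, "0": 0, "r": 1}

def express_triallelic(chrom_a, chrom_b):
    """Express the higher-ranked allele of each pair as a plain bit."""
    return [VALUE[max(x, y, key=RANK.get)] for x, y in zip(chrom_a, chrom_b)]

print(express_letters("AbCDe", "aBCde"))   # -> ABCDe
print(express_triallelic("1r0r", "00rr"))  # -> [1, 0, 0, 1]
```

Note that a recessive 1 is only expressed when it is paired with another recessive 1, which is how a currently unfavourable allele can stay in the population without affecting the phenotype.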
Dominance and Diploidy (Continued)


Other research represents the dominance information separately from the gene and lets it undergo evolution, a kind of co-evolution approach.
In the late 80s, Smith and Goldberg explored the use of redundancy for the normal knapsack problem with dynamic weight changes:
– Hollstien’s triadic scheme showed improvement over a static dominance scheme.
– the diploid approach coped better with oscillations in the weight function.
– diploidy decreases the probability that desired schemas are lost “forever”.

In summary, there seems to be some evidence that exploiting diploidy can be beneficial for GAs in dynamically changing environments, especially if scenarios encountered in the past tend to recur in the future; on the other hand, diploidy is quite expensive, and not much research in the last 15 years has explored its use for GAs.
What can GA designers learn
from plant genetics and horticulture?

polyploidy and dominance
gametogenesis is used as the crossover operator
use of selfing
unusual ways to prevent self-fertilization
use of intercrossing (creating Cartesian products of good initial solutions)
preference for heterozygous sources and rich gene pools
plant breeders employ complex search strategies to breed the best possible plant (such as recurrent selection, which will be the topic of this talk)
mutation is not very important, because it is hard to control; large population sizes are difficult to handle for pragmatic reasons
Polyploidy
Polyploidy: using two or more complete sets of chromosomes; the phenotype of an organism is determined through dominance of alleles.
Advantages: adaptation to changing environments, the ability to “memorize” alleles that worked successfully in the past, a richer gene pool.
Previous research on polyploidy: two major approaches to simulating polyploidy in GAs:
– using an extra chromosome to represent dominance information [Brindle, this talk]
– extending the alphabet to distinguish between dominant and recessive elements [Hollstien, Smith & Goldberg, Ng & Wong]
Features of our Approach





uses at least 2 sets of chromosomes
uses a dominance vector as a tie breaker
uses a crossover control vector to restrict possible crossover points
dominance vectors and crossover control vectors take part in the evolution
gametogenesis is used as the crossover operator
3. Experiments

Benchmarks:
– Knapsack problem with dynamically changing weight constraints
– Schwefel function

Evaluation is performed with respect to the following measure:
$M_2 = \frac{1}{G} \sum_{i=1}^{G} (T_i - X_i)^2$
where $T_i$ is the true optimum for generation $i$, $X_i$ is the best solution found in generation $i$, and $G$ is the number of generations.
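As a worked example, the measure can be computed as below. This sketch assumes that M2 is the mean squared difference between the true optimum and the best solution found, summed over all G generations (the summation sign appears to have been lost when the slide was extracted); the function name `m2` and the sample values are hypothetical.

```python
def m2(true_optima, best_found):
    """Mean squared tracking error over G generations (assumed reading of M2)."""
    if len(true_optima) != len(best_found) or not true_optima:
        raise ValueError("need one true optimum and one best solution per generation")
    g = len(true_optima)
    return sum((t - x) ** 2 for t, x in zip(true_optima, best_found)) / g

# Made-up data: the optimum jumps from 10 to 20 after two generations.
true_optima = [10, 10, 20, 20]
best_found = [8, 10, 17, 20]
print(m2(true_optima, best_found))  # (4 + 0 + 9 + 0) / 4 = 3.25
```

A lower value means the algorithm tracked the moving optimum more closely; a value of 0 means the best individual matched the true optimum in every generation.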
4. Summary





proposed an approach to support polyploidy that uses dominance vectors
demonstrated the benefits of the approach in oscillating environments which cycle among several different states
crossover control vectors are employed to provide linkage between the dominance vector and the chromosomes themselves
approach facilitates maintaining diversity in relatively small populations
our experiments at least partially explain why diploidy and polyploidy exist in biological systems
Literature



Ben S. Hadad and Christoph F. Eick: Using Recurrent Selection to Improve GA-performance, ISMIS, Charlotte, October 1997.
Ben S. Hadad and Christoph F. Eick: Supporting Polyploidy in Genetic Algorithms Using Dominance Vectors, EP’97, Indianapolis, April 1997.
Ben S. Hadad: Extending Genetic Algorithms Using Ideas Borrowed from Plant Genetics and Horticulture, Master’s Thesis, University of Houston, December 1996.
Inversion and Other Reordering Operators

Reordering operators change the position/location of genes in a chromosome, but do not change the composition of the chromosome:
– consequently, reordering operators do not directly affect the fitness.
– however, crossover is affected: the defining length of a schema is changed by applying reordering operators, which increases or decreases the probability that instances of a particular schema reoccur in the future.
– after reordering, genes are no longer lined up correctly, which, in many applications, causes problems with the crossover operator:
    necessary genes might be missing: incomplete gene combinations can occur.
    duplicated genes can occur, which is usually not desirable.
The most popular reordering operators are inversion and swapping:
original:  1 2 3 | 4 5 6 7 | 8
inversion: 1 2 3 | 7 6 5 4 | 8
swap:      1 2 3 | 7 5 6 4 | 8
Empirical evidence seems to indicate that, at least in some applications, reordering operators are useful “secondary” operators whose employment induces slight improvements in the overall performance.
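The inversion and swap example on this slide can be reproduced with the short sketch below. The function names `invert` and `swap`, and the use of strings with 0-based indices, are assumptions made for illustration.

```python
def invert(chrom, start, end):
    """Reverse the order of the genes in chrom[start:end]."""
    return chrom[:start] + chrom[start:end][::-1] + chrom[end:]

def swap(chrom, i, j):
    """Exchange the genes at positions i and j."""
    genes = list(chrom)
    genes[i], genes[j] = genes[j], genes[i]
    return "".join(genes)

original = "12345678"
print(invert(original, 3, 7))  # -> 12376548
print(swap(original, 3, 6))    # -> 12375648

# Both operators only move genes around; the composition is unchanged.
assert sorted(invert(original, 3, 7)) == sorted(original)
assert sorted(swap(original, 3, 6)) == sorted(original)
```

Because the genes have moved, crossing a reordered chromosome with one that still uses the original order can drop or duplicate genes, which is the problem described in the bullets of this slide.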
Niche and Speciation


We can view a niche as an organism’s job or role in an environment, and we can think of a species as a class of organisms with common characteristics.
Niche Methods in Genetic Search:
– crowding (De Jong (1975)) and sharing functions (Goldberg (1987)).
– external schemata (Perry (1984)), which are similarity templates that define species membership and have to be provided by the GA developer.
– Mating restrictions in genetic search:
    line breeding (breed the champion repeatedly with others)
    Hollstien’s inbreeding with intermittent crossbreeding (close individuals are still bred with each other as long as their family average fitness continues to improve; otherwise, crossbreeding between different families is used).
    Booker introduces mating templates, which are mate selection mechanisms that become part of the individual (and themselves undergo evolution), and proposes different mating rules:
    – bidirectional match
    – unidirectional match
    – best partial matches
    disallow breeding of similar individuals (e.g. incest)
Example of a Booker Mating Template




Assume we have chromosomes over alphabet $A$ with chromosome length $n$, and let $A' = A \cup \{\#\}$.
Extend chromosomes, tripling their length, to:
$ind = a_1 \ldots a_n b_1 \ldots b_n c_1 \ldots c_n$ with $a_i \in A$ and $b_i, c_i \in A'$ ($i = 1, \ldots, n$), with the meaning:
$ind$ is allowed to mate with $ind'$ if $ind' \in Schema(b_1 \ldots b_n)$ or $ind' \in Schema(c_1 \ldots c_n)$.
Example: Let $n = 4$ and $A$ be the binary alphabet:
ind1=0010 0000 1111
ind2=0000 1### 0111
ind3=0111 001# 1111
Bidirectional match requires that “a must want b” and “b must want a”, whereas in unidirectional match it is sufficient that one partner wants the other.
Many other matching schemes are possible, e.g. more complicated ones that operate on scores and thresholds.
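The three example individuals can be checked with a small sketch. It assumes that a template is matched against the functional part (the first n symbols) of the other individual and that `#` matches any symbol; the helper names `matches`, `wants`, `bidirectional` and `unidirectional` are hypothetical.

```python
def matches(schema, string):
    """True if every schema position is '#' or equals the string's symbol."""
    return len(schema) == len(string) and all(
        s in ("#", c) for s, c in zip(schema, string)
    )

def split(ind, n):
    """Split an individual into functional part a and templates b and c."""
    return ind[:n], ind[n:2 * n], ind[2 * n:3 * n]

def wants(ind, other, n):
    """ind accepts other if either template matches other's functional part."""
    _, b, c = split(ind, n)
    a_other = split(other, n)[0]
    return matches(b, a_other) or matches(c, a_other)

def bidirectional(x, y, n):
    return wants(x, y, n) and wants(y, x, n)

def unidirectional(x, y, n):
    return wants(x, y, n) or wants(y, x, n)

ind1, ind2, ind3 = "001000001111", "00001###0111", "0111001#1111"
for x, y in [(ind1, ind2), (ind2, ind3), (ind1, ind3)]:
    print(bidirectional(x, y, 4), unidirectional(x, y, 4))
# Each pair prints "False True": ind1 wants ind2, ind2 wants ind3 and
# ind3 wants ind1, but none of these preferences is returned.
```

Under these assumptions, no pair in the example could mate under the bidirectional rule, while every pair could mate under the unidirectional rule.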
Artificial Mating Tags

The problem with Booker’s approach is that mating templates have the same length as the chromosomes themselves, producing a significant overhead. To reduce this overhead, Holland proposed using three-part strings consisting of:
– a short mating template (used to test the suitability of other mates)
– a short mating tag (used by others to match; characterizes the string)
– the functional substring
Example: #10#:1010:111111000011
         #0##:1100:011111110001
– mating tags affect the compatibility with other strings, but do not affect the fitness.
– usually, the whole three-part string is evolved.
– Holland’s scheme of using artificial mating tags can also be used to define mating niches abstractly, similar to Perry’s external schema approach, by freezing particular positions in templates and tags. For example, mating can easily be restricted to particular subsets of the population. Mating tags can also be used to simulate distributed GAs.
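The template and tag check for the two example strings can be sketched as follows. The colon-separated layout is taken from the example on this slide; the function names `parse`, `accepts` and `can_mate`, and the choice to require acceptance in both directions, are assumptions made for illustration.

```python
def matches(template, tag):
    """True if every template position is '#' or equals the tag's symbol."""
    return len(template) == len(tag) and all(
        t in ("#", c) for t, c in zip(template, tag)
    )

def parse(string):
    """Split 'template:tag:functional' into its three parts."""
    template, tag, functional = string.split(":")
    return template, tag, functional

def accepts(chooser, candidate):
    """The chooser's mating template is tested against the candidate's tag."""
    template, _, _ = parse(chooser)
    _, tag, _ = parse(candidate)
    return matches(template, tag)

def can_mate(x, y):
    # Bidirectional rule (an assumption): both strings must accept each other.
    return accepts(x, y) and accepts(y, x)

s1 = "#10#:1010:111111000011"
s2 = "#0##:1100:011111110001"
print(accepts(s1, s2), accepts(s2, s1), can_mate(s1, s2))  # -> True True True
```

Only the 4-symbol template and tag are compared, so the cost of the compatibility test is independent of the length of the functional substring.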