A Comparison of Parent Selection
Strategies for Evolutionary Algorithms
Modeled After Human Social Interaction
Michael Ames
CS448 FS2005 – Semester Project
November 15, 2005
Abstract
Preliminary findings of previous research suggest that certain strategies for
parental selection improve the efficiency of Evolutionary Algorithms (EAs). The
purpose of this study is to examine how parental selection strategies modeled
after human behavior affect the performance of an EA. By implementing four
basic strategies and testing them on two different EAs, this study shows that
EAs modeled after human social interaction can perform nearly as well as the
standard tournament selection available to EA designers.
Index
1. Introduction
2. Problem Statement
3. Previous Research
4. Approach
5. N Queens Results
6. Binary Knapsack Results
7. Conclusion
8. References
Appendix A Running the Program
Appendix B F and T Test Results
1. Introduction
Preliminary findings of previous research suggest that certain strategies for
parental selection improve the efficiency of Evolutionary Algorithms (EAs). The
majority of the research in this area has focused on disallowing individuals
who are related from being selected as parents. [Craighurst] [Eshelman] [Ting]
The current research will determine whether EAs with parental selection
strategies based on human social interaction are more efficient at finding
optimal solutions. A minor motivation behind this research is the apparently
small amount of work on deriving new parent selection strategies for EAs.
The main reason for this study is to increase the overall level of knowledge of
the effect of custom parent selection strategies on EA performance. In short,
will the added complexity increase the convergence rate of the EA to the global
optimum, and therefore find the solution in fewer generations?
A separate EA will be written for each of two known problems, N-Queens and
Binary Knapsack. Both EAs will be fairly standard with regard to individual
representation, population, recombination, mutation, and survivor selection.
Parent selection, however, will be handled separately. Both EAs will be able to
use any one of the defined parent selection strategies from a separate parent
selection object. The performance of each EA using a standard tournament
selection strategy to select the parents will be measured and averaged over
several runs. The results from the tournament selection strategy will serve as
the baseline against which an EA's performance with each of the defined parent
selection strategies is compared. This also allows the selection strategies to
be compared to each other, and allows different EAs using the same parent
selection strategy to be compared.
2. Problem Statement
The purpose of the study is to examine how parental selection strategies
modeled after human behavior affect the performance of an EA. Evolutionary
algorithms implementing these strategies will hopefully provide a significant
improvement over standard parental selection strategies, either for all
problems or at least for a specific class of problems. Currently, typical EAs
resort to some standard form of either Fitness Proportional Selection (FPS) or
Ranking Selection (RS) to select individuals for mating [Eiben]. The analysis
provided by this research will help determine whether higher-level, more
complex modeling of parent selection increases the efficiency of EAs.
3. Previous Research
Previous research has implemented custom parent selection strategies to
improve a subset of EAs called Genetic Algorithms (GAs). That research has
been limited to disallowing incest by implementing a Tabu Multi-Parent Genetic
Algorithm (TMPGA) [Ting]. The TMPGA is considered the closest model of a
parental selection strategy to human mating behavior. The TMPGA integrated a
tabu search into the parent selection of GAs to increase the genetic diversity
of the individuals in the population. All of the individuals are divided into
"clans", and as the population matures, an individual belonging to clan A may
not be allowed to mate with a member of the same clan, or with a member of
clan B if clan B is on clan A's tabu list. The tabu list simply contains the
clan designations of those clans whose individuals have become too genetically
similar through recombination.
One study compared the effect of different mutation rates on standard,
Assortative, and Disassortative Mating Genetic Algorithms (AMGA and DMGA).
[Ochoa] An AMGA or DMGA models mate selection after the mating habits of
certain animal species. Parent selection and eventual mating of individuals
are based on phenotypic similarities (AMGA) or dissimilarities (DMGA) between
parental candidates. In Ochoa these similarities and dissimilarities were
based on the Hamming Distance (HD) between individuals. Ochoa used the
expanded definition of the HD, the number of positions at which two integer
strings hold unlike values, in order for the algorithms to select a mate. The
smaller the HD, the greater the likelihood the candidates would mate in an
AMGA. The inverse is true for DMGAs. The DMGA converged to optimal solutions
best with low mutation rates, whereas a medium mutation rate was favorable for
the standard GA, and a high mutation rate was favorable for the AMGA.
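The distance measure described above can be illustrated with a minimal sketch.
The function names are hypothetical, and the mate choice here is deterministic
(closest or farthest candidate), which simplifies the likelihood-based choice
described for the AMGA and DMGA.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length gene strings differ."""
    if len(a) != len(b):
        raise ValueError("individuals must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def pick_mate(first, candidates, assortative=True):
    """Pick the most similar candidate (assortative) or the least similar
    candidate (disassortative), measured by Hamming distance."""
    def distance(candidate):
        return hamming_distance(first, candidate)
    return min(candidates, key=distance) if assortative else max(candidates, key=distance)
```

For example, hamming_distance([1, 2, 3, 4], [1, 3, 2, 4]) is 2, because the
strings differ at the second and third positions.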
Two other research papers detail further parental selection strategies, each
with positive results. Both papers address the topic of preventing incest in
parent selection. One strategy prevents premature convergence by preventing
incest. [Eshelman] The other enhances GA performance through crossover
prohibitions based on ancestry. [Craighurst]
4. Approach
The coding for this assignment was completed on an Intel Pentium III 600 MHz
desktop PC running Fedora Core 3 Linux. The PC has a 100 MHz front side bus
with 256 MB of SDRAM and an additional 512 MB of the hard drive partitioned as
swap, giving the computer 768 MB of virtual memory. However, due to the
significant number of runs required to form a solid basis for the conclusions
of this report, the PC was inadequate for actually running the code. In order
to complete the requirements of the project on time, the program was run on
UMR's Numerically Intensive Computing (NIC) Cluster, a batch computing system
consisting of 40 Dell PowerEdge 1850 computing nodes and 2 Dell PowerEdge
1850 service nodes. All nodes have dual 3.2 GHz Xeon EM64T processors and
2 GB of RAM. All of the nodes of the NIC are interconnected via switched
gigabit Ethernet.
The basic characteristics of each EA used to test the different parent
selection strategies are described in detail below. Table 1 lists the basic
characteristics of the N-Queens EA as it was implemented for the experiment.
Likewise, Table 2 details the Binary Knapsack characteristics. By default, each
EA will terminate after a maximum number of generations has been reached or
when the termination condition listed is true. Because these characteristics
differ, more than one EA is used to measure the performance of the different
parent selection strategies. Table 3 lists the parameters that will remain
constant throughout the experiment. Holding the parameters in Table 3 constant
provides a steady basis for the results and conclusions in Sections 5 and 6.
Table 1 N-Queens
Individual Representation    Fixed Length Permutation of Integers
Initialization               Uniform Random
Parent Selection             Custom see Section 4.4
Recombination                Simple One Point Crossover
Mutation                     Swap Gene Location
Competition                  Elitist
Termination                  Valid Solution Found

Table 2 Binary Knapsack
Individual Representation    Fixed Length Bit String
Initialization               Uniform Random
Parent Selection             Custom see Section 4.4
Recombination                Simple One Point Crossover
Mutation                     Bit Flipping
Competition                  Elitist
Termination                  No Improvement in 250 Generations
Table 3 Constant Parameters
Recombination Percentage          100%
Individual Mutation Percentage    100%
Gene Mutation                     1/n
N-Queens Board size (n)           30
Items For Knapsack (n)            100
Population Size                   2000
Parent Pool Size                  20
Total Children per Generation     15
Max Children per Parent Pair      3
Number of Iterations              30
4.1 Parameters
Sixteen parameters, read in from a formatted text file, are required for the
proper execution of the algorithm. The format of the text file and the valid
ranges of the parameters are outlined in Appendix A "Running the Program."
The first parameter is the Recombination Probability (PR), which determines
whether new individuals will be generated by crossover from the individuals
selected as parents. The higher the probability, the greater the likelihood
that the children will contain genetic material from both parents as a result
of the crossover operation explained in Section 4.5. The second parameter, the
Individual Mutation Probability (PM), determines the chance that a newly
generated individual is selected as a candidate for mutation. The Number of
Genes (n) is the number of genes an individual will have. For the N-Queens EA
this is where the size of the board is set. For the Knapsack EA the number of
genes is set by the data file read in when the EA is initialized. The third
parameter is the Gene Mutation Probability (PG), which defaults to 1/n. The
fourth parameter is the Population Size, which dictates the number of
individuals that exist initially and survive from one generation to the next.
The fifth parameter is the Parent Pool Size, the number of individuals chosen
from each generation with the possibility of becoming parents. In order to
model human social interaction as closely as possible, the sixth parameter is
a Cheating Probability (PC) for individuals who are married. The seventh
parameter, the Total Number of Children per Generation, sets the number of
children generated in each generation. Parameter eight is the Maximum Number
of Children per set of parents. A random number is taken modulo the Maximum
Number of Children, and one is added to the result, to determine how many
children will be generated from each set of parents. Parent Selection and
Recombination will continue until the Total Number of Children has been
reached. The ninth parameter is the default stopping case, the Maximum Number
of Generations. If the Maximum Number of Generations is reached the EA will
terminate automatically and output the best individual found at that time.
The tenth parameter is the Number of Runs/Iterations for which each parameter
set will be executed for that particular EA. If a specific Random Seed is
desired it may be entered as the eleventh parameter. If the Random Seed is
left at 0 then a new random seed will be generated from the CPU clock.
Parameter twelve is the name of the file that will mirror all of the
preformatted screen output. Only the filename is required; no filename
extension is necessary. The output will be a text file available for viewing
in the executable directory after the program terminates. Parameter thirteen
is the name of the output file that stores all of the statistical data in
.csv format. The .csv format eases importation into various spreadsheet
programs, and the file can also be viewed once the program terminates.
Parameter fourteen allows the user to select which EA to run. Parameter
fifteen allows the user to select the Parent Selection Strategy, described
further in Section 4.4 below. Finally, the last parameter allows the user to
select an input data file for the EA if it requires one.
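The rule for the number of children per set of parents can be sketched as
follows. This is only an illustration: the function names are hypothetical,
the raw random value stands in for whatever generator the program uses, and
capping the final batch so the total is hit exactly is an assumption.

```python
import random


def children_for_pair(max_children, rng):
    """Return 1 + (random value mod max_children), a count in [1, max_children]."""
    raw = rng.randrange(2 ** 31)  # stand-in for a C-style rand() value (assumption)
    return 1 + raw % max_children


def plan_generation(total_children, max_children, rng):
    """Split one generation's children into per-pair batches until the total is reached."""
    batches = []
    remaining = total_children
    while remaining > 0:
        # Cap the last batch so the total is reached exactly (assumption).
        count = min(children_for_pair(max_children, rng), remaining)
        batches.append(count)
        remaining -= count
    return batches
```

With the Table 3 values, plan_generation(15, 3, random.Random(1)) returns a
list of batch sizes between 1 and 3 that sums to 15.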
4.2 Individual Representation
Individual representation will be unique for each EA implemented in the
study. Each individual for the N-Queens EA represents the placement of
N number of chess queens on a chess board with N rows and N columns.
The N-Queen individuals will be an array of N integers arranged in a
permutation. Implementing the individuals as permutations will make the
fitness calculation easier, see Section 4.7. The individuals of the Binary
Knapsack will be represented as bit strings equal in length to the total
number of items available for placement in the knapsack. A one at any
location in the bit string will indicate that particular item is included in the
knapsack, and a zero will indicate the item is not included.
4.3 Population Initialization
Each population will be implemented as either a two dimensional array of
bit strings or a two dimensional vector of integers. The size of the initial
population is equal to the population size parameter read in from the
parameter file. Each array or vector in the population represents an
individual. All other vectors associated with the population, such as the
vector of floating point numbers that maintains the fitness score of each
individual, are cleared and reset prior to calculating the fitness scores.
Once all of the fitness scores have been calculated for the initial
population (Section 4.7), the statistics are calculated (Section 4.9).
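A minimal sketch of uniform random initialization for the two representations
in Section 4.2 is given below. The function names are hypothetical, and Python
lists stand in for the arrays and vectors of the actual program.

```python
import random


def init_nqueens_population(pop_size, n, rng):
    """Each individual is a uniformly random permutation of 0..n-1."""
    return [rng.sample(range(n), n) for _ in range(pop_size)]


def init_knapsack_population(pop_size, n_items, rng):
    """Each individual is a uniformly random bit string of length n_items."""
    return [[rng.randint(0, 1) for _ in range(n_items)] for _ in range(pop_size)]
```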
4.4 Parent Selection
This is the heart of the study. Both EAs will be run using a standard
tournament selection for parent selection, which serves as the baseline to be
measured against. Four custom parent selection strategies will be
implemented for the assignment. The first custom selection strategy is an
Assortative Mating EA (AMEA) with an added marriage component. As with the
AMGA found in the previous research, it is based on the HD between
individuals. The best fit individual of the parent pool is selected as the
first parent. The first parent will be mated with the individual in the
parent pool with the lowest HD, as long as that HD is not 0 and the
individual is not already married. Once two individuals have been selected
for mating they have in effect been married, and the location of each spouse
is recorded in a hash that keeps track of all the marriages of the
population. Once two individuals have been married, any time either one of
them is selected out of the parent pool for recombination it will only mate
with its designated spouse. If an individual is selected for mating, but all
of the other individuals in the parent pool are married, then the individual
will mate with the first unmarried individual in the population. This
strategy is modeled after the common human mate selection process of mating
with candidates with similar phenotypic traits. The second strategy is a
Disassortative Mating EA (DMEA) implemented the same way, except that the
best fit individual will mate with the unmarried individual with the greatest
HD. By doing so the interaction strategy closely models the human mate
selection strategy commonly known as "opposites attract."
The third and fourth strategies are the same as the first two with an
added "Cheating" component. There is a flat cheating probability for
the entire population, 0.25 for the third strategy and 0.75 for the fourth.
If a randomly generated value from [0, 1] is less than the cheating
probability then the individual will cheat on its spouse and mate with
another individual. Note that setting the cheating parameter to 0 results in
an EA that performs exactly as parent selection strategies one and two do.
Likewise, setting the cheating value to 1 results in an EA that acts as a
normal AMEA or DMEA, in effect turning the marriage property off.
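The selection rules above can be sketched roughly as follows. This is a
simplified illustration, not the program's code: the names are hypothetical,
a list with -1 for "unmarried" stands in for the marriage hash, and the
choice of partner for a cheating individual (the same HD rule applied to
unmarried candidates, without recording a new marriage) is an assumption,
since the text does not specify it.

```python
import random


def hamming_distance(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)


def select_parents(pool, population, fitness, spouse, cheat_prob, assortative, rng):
    """Return (first_parent, mate) as indices into population.

    pool   -- indices of the parent pool members
    spouse -- spouse[i] is the index of i's spouse, or -1 if i is unmarried
    """
    first = max(pool, key=lambda i: fitness[i])  # best fit member of the pool
    married = spouse[first] != -1
    if married and rng.random() >= cheat_prob:
        return first, spouse[first]  # faithful: mate with the designated spouse

    def distance(i):
        return hamming_distance(population[first], population[i])

    # Unmarried pool members that are not identical to the first parent.
    candidates = [i for i in pool if i != first and spouse[i] == -1 and distance(i) > 0]
    if not candidates:
        # Fall back to the first unmarried individual in the whole population.
        candidates = [i for i in range(len(population)) if i != first and spouse[i] == -1][:1]
    if not candidates:
        return first, None  # nobody is available
    mate = min(candidates, key=distance) if assortative else max(candidates, key=distance)
    if not married:
        spouse[first], spouse[mate] = mate, first  # record the new marriage
    return first, mate
```

Passing cheat_prob=0.0 reproduces the pure marriage strategies, because
rng.random() is never below zero and a married individual always returns to
its spouse.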
4.5 Recombination
The first step of Recombination is to generate a random value over the
range of [0, 1]. If the random value is less than or equal to the
recombination probability (PR) then recombination will occur. If there is no
recombination then the child created is an exact copy of one of the parents,
chosen at random. If recombination is to occur, a random location in the
range of [1, individual size-1] is chosen. This random location represents
the point of crossover for selecting genetic material to create the new
child. Each new child is created by copying values from the first parent
into the corresponding locations of the child up to the crossover point.
After the crossover point, values are copied from the corresponding locations
of the second parent. Because duplicate values would break the permutation
in the N-Queens EA, the remaining positions of its children after the
crossover point are instead filled with the values from the second parent
that are not already in the child. Looping through all of the locations of
the second parent, each value that is not already in the child is placed in
the child's next empty location.
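The crossover described above can be sketched as follows. The function names
and the permutation flag are hypothetical; crossover_at takes the cut point
explicitly so that the behavior is easy to check by hand.

```python
import random


def crossover_at(p1, p2, cut, permutation=False):
    """One point crossover at a given cut; repairs duplicates for permutations."""
    head = list(p1[:cut])
    if not permutation:
        return head + list(p2[cut:])
    used = set(head)
    # Fill the tail with the second parent's values, in order, that are not yet used.
    return head + [gene for gene in p2 if gene not in used]


def recombine(p1, p2, pr, rng, permutation=False):
    """Apply crossover with probability pr; otherwise copy a random parent."""
    if rng.random() <= pr:
        cut = rng.randint(1, len(p1) - 1)  # inclusive range [1, size-1]
        return crossover_at(p1, p2, cut, permutation)
    return list(rng.choice([p1, p2]))
```

For example, crossing [0, 1, 2, 3, 4] with [4, 3, 2, 1, 0] at cut point 2 in
permutation mode keeps [0, 1] and then appends 4, 3, 2 from the second
parent, giving [0, 1, 4, 3, 2].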
4.6 Mutation
Individual mutation will occur if a randomly generated value from [0, 1] is
less than the Individual Mutation Probability (PM) that was read in from the
parameter file mentioned earlier. If mutation is to occur in the N-Queens
EA, two locations are chosen at random, and the values at those locations
are swapped.
Mutation for individuals of the Binary Knapsack EA is equally simple. If an
individual is selected for mutation then, starting at the beginning of the
individual, each Boolean value is checked to see if the bit should be
"flipped." A new random value is generated for each bit, and the bit is
flipped to the opposite value if that random value is less than PG.
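Both mutation operators can be sketched in a few lines. The function names
are hypothetical, and drawing two distinct locations for the swap is an
assumption (the text only says the two locations are chosen at random).

```python
import random


def swap_mutation(individual, rng):
    """Swap the values at two distinct random locations (N-Queens)."""
    mutant = list(individual)
    i, j = rng.sample(range(len(mutant)), 2)
    mutant[i], mutant[j] = mutant[j], mutant[i]
    return mutant


def bit_flip_mutation(bits, pg, rng):
    """Flip each bit independently with probability pg (Binary Knapsack)."""
    return [1 - bit if rng.random() < pg else bit for bit in bits]
```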
4.7 Individual Fitness Calculation
Each EA has its own way of calculating fitness. Both EAs are designed so
that a higher fitness score is interpreted as a better fitness score. For
the N-Queens problem, each array location of an individual represents a
unique column on the chess board, and the value at that array location
represents a unique row of the chess board. Because the values in the array
are arranged as a permutation of integers, no two queens can share a row or
a column, so by design only diagonal attacks have to be accounted for.
Additionally, only diagonal attacks to the right need to be considered,
because considering diagonal attacks to the left would count the same pairs
of queens a second time.
For the Binary Knapsack EA, fitness depends on whether the knapsack is
overfilled. If the knapsack is overfilled the individual receives a
negative fitness score based on the percentage by which the knapsack is
overfull. A negative fitness lies in the range [-1, 0), where -1 means that
every item in the list has been placed in the knapsack. If the knapsack is
not overfull then the fitness is the sum of the profit scores of the items
included in the individual.
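A rough sketch of both fitness functions follows. Several details are
assumptions rather than the program's actual formulas: the N-Queens score is
taken to be the negated number of attacking pairs (so a valid solution scores
0), and the overfill penalty is the excess weight divided by the largest
possible excess, which gives -1 when every item is packed. All names are
hypothetical.

```python
def nqueens_fitness(individual):
    """Negated count of queen pairs that attack each other diagonally."""
    attacks = 0
    n = len(individual)
    for col in range(n):
        # Only look at columns to the right, so each pair is counted once.
        for other in range(col + 1, n):
            if abs(individual[col] - individual[other]) == other - col:
                attacks += 1
    return -attacks


def knapsack_fitness(bits, weights, profits, capacity):
    """Total profit if the load fits, otherwise a penalty in [-1, 0)."""
    load = sum(w for bit, w in zip(bits, weights) if bit)
    if load <= capacity:
        return sum(p for bit, p in zip(bits, profits) if bit)
    max_excess = sum(weights) - capacity  # excess when every item is packed
    return -(load - capacity) / max_excess
```

For example, the 4-queens placement [1, 3, 0, 2] scores 0, while [0, 1, 2, 3]
puts every queen on one diagonal and scores -6, one for each of the six pairs.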
4.8 Survivor Selection
Newly generated individuals of both EAs will be added to the population if
their fitness is greater than that of at least one individual currently in
the population. A starting point is selected at random and the population is
searched from there for an individual with a lower fitness score, which the
child then replaces. Once a new individual has been placed in the population
the next child is selected for placement. If no individual in the population
has a lower fitness score than the child, the child is discarded. If an
individual has been selected for replacement, its location and the location
of its mate are reset to -1 in the marriage hash, and the "spouse" of the
replaced individual is once again eligible to remarry.
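The replacement step can be sketched as below. The names are hypothetical, a
list with -1 for "unmarried" stands in for the marriage hash, and replacing
the first weaker individual found when scanning from the random start (with
wrap-around) is an assumption about the search order.

```python
import random


def try_insert(child, child_fitness, population, fitness, spouse, rng):
    """Replace the first weaker individual found; return its index or None."""
    size = len(population)
    start = rng.randrange(size)
    for offset in range(size):
        i = (start + offset) % size  # wrap around so everyone is checked
        if fitness[i] < child_fitness:
            mate = spouse[i]
            if mate != -1:
                spouse[mate] = -1  # the widowed spouse may remarry
            spouse[i] = -1  # the newcomer starts out unmarried
            population[i] = child
            fitness[i] = child_fitness
            return i
    return None  # no weaker individual: the child is discarded
```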
4.9 Statistics Calculations
The statistics of each EA population include the average fitness, the
variance, and the standard deviation. The fitness score and individual
representation of the best fit individual are constantly maintained for both
EAs. Additionally, the fitness score of the worst individual is maintained
to show an additional level of improvement.
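These statistics can be computed with the Python standard library, as in the
sketch below. The function name is hypothetical, and using the population
(rather than sample) variance is an assumption, since the report does not say
which form the program uses.

```python
import statistics


def population_statistics(fitness_scores):
    """Summarize one generation's fitness scores."""
    return {
        "best": max(fitness_scores),
        "worst": min(fitness_scores),
        "average": statistics.mean(fitness_scores),
        "variance": statistics.pvariance(fitness_scores),
        "std_dev": statistics.pstdev(fitness_scores),
    }
```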
4.10 Stopping Criteria
As noted earlier in Tables 1 and 2, each of the EAs has its own unique
stopping criterion. In addition to those criteria, the following default
criteria will be used. Each parameter set is kept until the EA has been run
the number of times declared in the parameter file. Every iteration of the
EA will terminate by default when it reaches the maximum number of
generations, also specified in the parameter file. Upon reaching the end of
an iteration the program will display on the screen the best fit individual
and the fitness score of that individual. In addition, the N-Queens EA will
display the total time it took the EA to run after initialization. The
Knapsack EA will also display the time, as well as the capacity of the
knapsack used for the solution.
5. N Queens Results
The results are presented in two sections: one for the N-Queens
implementation and another for the Binary Knapsack implementation. Starting
with the performance of the N-Queens EA, Chart 1 shows the average rate of
convergence for all seven parental selection strategies implemented for the
study. Chart 1 is not very descriptive and is extremely hard to read, as all
of the data is overlapping. Chart 2, however, is an exploded view of the
last 4000 generations of Chart 1. It is clear from Chart 2 that the standard
tournament selection outperformed the six custom strategies. Chart 2 also
shows that the standard tournament selection strategy not only maintained an
overall better average fitness, it also converged to a solution on average
1000 to 1500 generations sooner, at around 4000 generations. All six of the
custom strategies converged on a solution in the 5000 to 5500 generation
range. Table 4 shows a brief general statistical breakdown of the
performance of each strategy as compared to the others. One interesting
point is the Minimum convergence rate for the Assortative EA with a low
cheating probability and the Disassortative EA with the high cheating
probability. Further study will be required to determine whether a low
cheating probability was the cause of the extremely fast convergence of the
AMEA, and whether the high cheating probability caused the DMEA to converge
so rapidly, or whether it was mere coincidence.

Table 4
Strategy      Minimum   Maximum   Average   Solutions Found
Tournament    1450      8700      3904.86   30
AMEA M        1700      8200      4905.20   30
AMEA L C      800       8450      4822.98   30
AMEA H C      2400      8400      4646.59   30
DMEA M        2300      8900      4732.15   30
DMEA L C      2150      7700      4886.52   30
DMEA H C      500       9150      4542.84   30

[Chart 1: N-Queens Convergence. Fitness versus Generations X 1000 for
Tournament, AMEA Marriage, AMEA L C, AMEA H C, DMEA Marriage, DMEA L C, and
DMEA H C.]

[Chart 2: N Queen Convergence Exploded View. Fitness versus Generations
X 1000 for the same seven strategies.]
Higher level statistical analysis of the N-Queens problem was as follows. A
two sample F test was performed on each of the six custom strategies,
comparing it to the tournament selection. Out of the six comparisons it was
found that the variances of all of the strategies were not equal to the
variance of the tournament selection strategy, with the exception of the
AMEA with a high cheating probability. However, upon further analysis
utilizing the appropriate two tailed t tests, it was determined that the
means of all six custom strategies were in fact equal to that of the
tournament selection strategy. The results of the F and t tests can be seen
in Appendix B.
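The F statistic in a two sample F test is simply the ratio of the two sample
variances. As a small worked check, the sketch below recomputes it from the
Tournament and AMEA w/ marriage variances listed in Appendix B; the function
name is hypothetical.

```python
def f_statistic(variance_1, variance_2):
    """Ratio of two sample variances, as used in the two sample F test."""
    return variance_1 / variance_2


# Variances taken from Appendix B (Tournament vs. AMEA w/ marriage).
tournament_variance = 2796540.23
amea_marriage_variance = 2824888
f_value = f_statistic(tournament_variance, amea_marriage_variance)
print(round(f_value, 6))
```

The result agrees with the F value of 0.989965017 listed for that comparison,
up to the rounding of the second variance.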
6. Knapsack Results
The Binary Knapsack implementation also did not show any improvement over
the tournament strategy. Chart 3, like Chart 1, is hard to read because all
of the data is overlapping. However, Chart 4 is an exploded view of the last
4000 generations, which clearly shows the tournament selection strategy
outperforming the six custom strategies ever so slightly. The DMEA with a
high cheating probability performed the best out of the six, but not well
enough to beat the tournament selection. Table 5 shows some low level
statistical analysis of all seven strategies as compared to each other. The
Minimum and Maximum in Table 5 are the worst and best solutions found by
each EA. Marriage is implied for the strategies with the cheating component.
Not much can be discerned from Table 5, so further analysis is warranted.
Higher level statistical analysis of the Knapsack problem was as follows. A
two sample F test was performed on each of the six custom strategies,
comparing it to the tournament selection. Out of the six comparisons it was
found that the variances of all of the AMEA strategies were not equal to the
variance of the tournament selection strategy, and all of the DMEA variances
were equal to it. However, upon further analysis utilizing the appropriate
two tailed t tests, it was determined that the means of all six custom
strategies were in fact equal to that of the tournament selection strategy.
The results of the F and t tests can be seen in Appendix B.
Table 5
Strategy      Minimum   Maximum   Average
Tournament    26.7901   27.9844   27.42215
AMEA M        26.7743   27.9302   27.38704
AMEA L C      26.7323   27.8754   27.38704
AMEA H C      26.6994   27.9586   27.28795
DMEA M        26.7344   27.7919   27.3204
DMEA L C      26.6205   27.7609   27.28731
DMEA H C      26.6767   27.8572   27.3967
[Chart 3: Knapsack Convergence. Fitness versus Generations X 1000 for
Tournament, AMEA Marriage, AMEA L C, AMEA H C, DMEA Marriage, DMEA L C, and
DMEA H C.]

[Chart 4: Knapsack Convergence Exploded View. Fitness versus Generations
X 1000 for the same seven strategies.]
7. Conclusions
Though none of the six custom strategies improved the convergence rate of
the EA, the results indicate that an EA implemented with one of the above
selection strategies performs nearly as well as one using the standard
tournament selection. With some minor adjustments the strategies may perform
as well as the tournament selection. Given the results above, and the
relative ease of implementation, it is possible that an EA with a marriage
and cheating component could become an alternative to the tournament
selection strategy.
8. References
(1) Craighurst, Rob and Martin, Worthy N. Enhancing GA Performance
through Crossover Prohibitions Based on Ancestry. Proc. of the 6th
International Conference on Genetic Algorithms. San Francisco, 1995.
(2) Eiben, A. E. and Smith, J. E. Introduction to Evolutionary Computing.
Natural Computing Series. Springer, 2003.
(3) Eshelman, Larry J. and Schaffer, J. David. Preventing Premature
Convergence in Genetic Algorithms by Preventing Incest. Proc. of the 4th
International Conference on Genetic Algorithms. San Diego, 1991.
(4) Ochoa, Gabriel; Madler-Kron, Christian; Rodriguez, Ricardo; and Jaffe,
Klaus. Assortative Mating in Genetic Algorithms for Dynamic Problems.
EvoWorkshops 2005, LNCS 3449, pp. 617-622, 2005.
(5) Ting, Chuan-Kang and Buning, Hans Kleine. A Mating Strategy for
Multi-parent Genetic Algorithms by Integrating Tabu Search. IEEE, 2003.
Appendix A Running the Program
Part 1 The Parameter File
The parameter file is simply a tab delimited text file that contains sixteen
values with the following restrictions on individual parameters:
1. Recombination Probability (float) [0, 1]
2. Individual Mutation Probability (float) [0, 1]
3. Gene Mutation Probability (float) [0, 1]
4. Population Size (uint) [1, 10000]
5. Parent Pool Size (uint) [1, Population Size]
Note (a): Recombination will occur if Parent Pool Size > 1,
otherwise strictly Mutation will occur.
Note (b): This parameter will default to 1 if Population Size is 1.
6. Cheating Probability (float) [0, 1]
7. Total Children per Generation (uint) [1, + infinity)
8. Maximum Number of Children/Parent(s) (uint) [1, Total Children]
9. Maximum Number of Generations (uint) [10, + infinity)
10. Number of Runs (uint) [1, + infinity)
11. Random Seed (long uint) [0, + infinity)
Note: If 0 a new seed will be generated from the CPU clock.
12. This is the name of the file that the screen output will be directed to.
(string) [filename extensions are not required]
13. This is the name of the file that the statistical output will be directed to.
(string) [filename extensions are not required]
14. Selects the EA you wish to run. (int) [1=N-Queens, 2=Binary Knapsack]
15. Selects the Parent Selection Strategy you wish to use. (int) [0=Standard
Tournament, 1=AMEA w/marriage and cheating, 2=DMEA w/marriage
and cheating]
16. Selects the name of the data file if one is required by the EA (string)
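To make the format concrete, the sketch below reads such a file and checks
the ranges listed above. It is an illustration only and not the program's
own reader: the field names are hypothetical, and accepting fifteen values
when no data file is needed is an assumption.

```python
import re

# Hypothetical field names; order and types follow the list in Part 1.
FIELDS = [
    ("recombination_prob", float), ("individual_mutation_prob", float),
    ("gene_mutation_prob", float), ("population_size", int),
    ("parent_pool_size", int), ("cheating_prob", float),
    ("total_children", int), ("max_children_per_pair", int),
    ("max_generations", int), ("runs", int), ("random_seed", int),
    ("screen_output_file", str), ("stats_output_file", str),
    ("ea", int), ("strategy", int), ("data_file", str),
]


def parse_parameters(text):
    """Parse tab delimited parameter text and validate the documented ranges."""
    tokens = [t for t in re.split(r"[\t\r\n]+", text.strip()) if t]
    if len(tokens) not in (15, 16):  # the data file is treated as optional
        raise ValueError("expected 15 or 16 values, got %d" % len(tokens))
    params = {name: cast(tok) for (name, cast), tok in zip(FIELDS, tokens)}
    params.setdefault("data_file", None)
    for name in ("recombination_prob", "individual_mutation_prob",
                 "gene_mutation_prob", "cheating_prob"):
        if not 0.0 <= params[name] <= 1.0:
            raise ValueError(name + " must be in [0, 1]")
    if not 1 <= params["population_size"] <= 10000:
        raise ValueError("population_size must be in [1, 10000]")
    if not 1 <= params["parent_pool_size"] <= params["population_size"]:
        raise ValueError("parent_pool_size must be in [1, population_size]")
    if params["total_children"] < 1:
        raise ValueError("total_children must be at least 1")
    if not 1 <= params["max_children_per_pair"] <= params["total_children"]:
        raise ValueError("max_children_per_pair must be in [1, total_children]")
    if params["max_generations"] < 10 or params["runs"] < 1 or params["random_seed"] < 0:
        raise ValueError("generations, runs, or seed out of range")
    if params["ea"] not in (1, 2) or params["strategy"] not in (0, 1, 2):
        raise ValueError("unknown EA or parent selection strategy")
    return params
```

A file holding the Table 3 settings would contain values such as 1.0, 1.0,
0.0333, 2000, 20, 0.25, 15, 3 in its first eight positions; the remaining
values (generation limit, runs, seed, filenames, EA, and strategy) depend on
the experiment being run.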
Part 2 Executing the Program
1. Copy the entire directory to a directory of your choosing.
2. cd to the new directory and compile all the cpp files.
3. Create your parameter file in the new directory according to Part 1 above.
4. Once the make file has completed type "./" followed by the executable
name of the binary on the command line.
5. The program will then execute and prompt you to enter the name of the
parameter file you wish to use. Enter it at this time.
6. If there are any errors in your parameter file, values out of range for
example, the program will notify you of the error.
7. If there is an error you can edit your parameter file without terminating the
program then reload it when prompted again.
8. The program will display various information on the screen during its
execution. Upon completing each iteration the best fit individual and its
fitness score will be displayed on the screen as well as written to the
output file.
Part 3 .CSV Output File Details
At the start of each run the random seed that was generated by the program,
or the seed that was passed into it from the parameter file, is written to
the output file, preceded only by the character string "Random_seed,".
Immediately following the random seed is the rest of the parameter set used
for the current run, each value preceded by its parameter name and a comma.
After the parameters the generational statistics are included. For this
study the statistics for every 50th generation were written to the output
file. Each line of output includes the following comma delimited data:
1. The number of the iteration
2. The number of the generation
3. The fitness of the best and worst individuals
4. The average fitness of the population at that generation
5. The variance of the fitness scores at that generation
6. The standard deviation of the fitness scores at that generation
Following the last generation the string ",end_iteration," will denote the end of an
iteration and the start of another.
Part 4 .TXT Output File Details
This file will mirror everything that is output to the screen during execution. The
text file includes everything as it was seen on the screen to include the initial
input of the parameter file. Screen output is preformatted for ease of reading as
opposed to the format of the .csv file. The program will display the same data
detailed in 1 thru 6 in part 3 above, but without all of the commas. At the end of
each iteration the fitness score of the best individual and the best solution found
at the time of termination are written to the file as they were displayed to the
screen. If additional iterations are desired the statistics for the first generation of
the next iteration will immediately follow the output of the best individual of the
previous iteration.
Appendix B F and T test Results
N-Queens

Tournament vs. AMEA w/ marriage
reject H0, the variances are unequal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   3986.666667    5045
Variance               2796540.23     2824888
Observations           30             30
df                     29             29
F                      0.989965017
P(F<=f) one-tail       0.489260419
F Critical one-tail    0.537399965

Tournament vs. AMEA w/marriage & low cheating
reject H0, the variances are unequal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   3986.666667    4971.667
Variance               2796540.23     2855980
Observations           30             30
df                     29             29
F                      0.979187649
P(F<=f) one-tail       0.477613615
F Critical one-tail    0.537399965

Tournament vs. AMEA w/marriage & high cheating
accept H0, the variances are equal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   3986.666667    4888.333
Variance               2796540.23     2384428
Observations           30             30
df                     29             29
F                      1.17283476
P(F<=f) one-tail       0.335289143
F Critical one-tail    1.860811434

Tournament vs. DMEA w/ marriage
reject H0, the variances are unequal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   3986.666667    4591.667
Variance               2796540.23     3231566
Observations           30             30
df                     29             29
F                      0.865382341
P(F<=f) one-tail       0.349823678
F Critical one-tail    0.537399965

Tournament vs. DMEA w/marriage & high cheating
reject H0, the variances are unequal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   5146.666667    4698.333
Variance               2396540.23     3155256
Observations           30             30
df                     29             29
F                      0.759539138
P(F<=f) one-tail       0.231755018
F Critical one-tail    0.537399965

Tournament vs. DMEA w/marriage & low cheating
reject H0, the variances are unequal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   5146.666667    4591.667
Variance               2396540.23     3231566
Observations           30             30
df                     29             29
F                      0.741603347
P(F<=f) one-tail       0.212856375
F Critical one-tail    0.537399965

Knapsack

Tournament vs. AMEA w/ marriage
reject H0, the variances are unequal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   27.42215       27.38704
Variance               0.078228718    0.081376
Observations           30             30
df                     29             29
F                      0.96132138
P(F<=f) one-tail       0.458069651
F Critical one-tail    0.537399965

Tournament vs. AMEA w/ marriage & low cheating
reject H0, the variances are unequal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   27.42215       27.30497
Variance               0.078228718    0.095382
Observations           30             30
df                     29             29
F                      0.820162214
P(F<=f) one-tail       0.298471009
F Critical one-tail    0.537399965

Tournament vs. AMEA w/ marriage & high cheating
reject H0, the variances are unequal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   27.42215       27.28795
Variance               0.078228718    0.105776
Observations           30             30
df                     29             29
F                      0.739570232
P(F<=f) one-tail       0.210745476
F Critical one-tail    0.537399965

Tournament vs. DMEA w/ marriage
accept H0, the variances are equal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   27.42215       27.3204
Variance               0.078228718    0.056139
Observations           30             30
df                     29             29
F                      1.393482501
P(F<=f) one-tail       0.188416575
F Critical one-tail    1.860811434

Tournament vs. DMEA w/ marriage & high cheating
accept H0, the variances are equal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   27.42215       27.28731
Variance               0.078228718    0.072531
Observations           30             30
df                     29             29
F                      1.07855776
P(F<=f) one-tail       0.420015231
F Critical one-tail    1.860811434

Tournament vs. DMEA w/ marriage & low cheating
accept H0, the variances are equal
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   27.42215       27.28731
Variance               0.078228718    0.072531
Observations           30             30
df                     29             29
F                      1.07855776
P(F<=f) one-tail       0.420015231
F Critical one-tail    1.860811434
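Each F value reported above is the ratio of the Variable 1 variance to the Variable 2 variance, with 29 degrees of freedom per sample (30 observations minus one). The following Python sketch illustrates that calculation under those assumptions. The function names are hypothetical, and the sketch does not compute the one-tail probability or the critical value, which require the F distribution.

```python
from statistics import variance


def f_from_variances(var1, var2):
    """F statistic as the ratio of two sample variances (Variable 1 / Variable 2)."""
    return var1 / var2


def f_from_samples(sample1, sample2):
    """Compute F and both degrees of freedom directly from raw samples."""
    f = variance(sample1) / variance(sample2)  # unbiased (n - 1) variances
    return f, len(sample1) - 1, len(sample2) - 1


# Variances taken from the N-Queens "Tournament vs. AMEA w/ marriage" table.
f = f_from_variances(2796540.23, 2824888)
print(f"{f:.6f}")
```

With the tabulated variances this gives a value of about 0.98997, which agrees with the reported F up to the rounding of the printed variances.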
N-Queens

Tournament vs. AMEA w/ marriage
The means are equal
t-Test: Two-Sample Assuming Unequal Variances
                                Variable 1     Variable 2
Mean                            3986.6667      5045
Variance                        2796540.2      2824888
Observations                    30             30
Hypothesized Mean Difference    0
df                              58
t Stat                          2.4448925
P(T<=t) one-tail                0.0087741
t Critical one-tail             1.6715528
P(T<=t) two-tail                0.0175482
t Critical two-tail             2.0017175

Tournament vs. AMEA w/ marriage & high cheating
The means are equal
t-Test: Two-Sample Assuming Equal Variances
                                Variable 1     Variable 2
Mean                            3986.6667      4888.333
Variance                        2796540.2      2384428
Observations                    30             30
Pooled Variance                 2590484.2
Hypothesized Mean Difference    0
df                              58
t Stat                          2.1697074
P(T<=t) one-tail                0.0170695
t Critical one-tail             1.6715528
P(T<=t) two-tail                0.034139
t Critical two-tail             2.0017175

Tournament vs. AMEA w/ marriage & low cheating
The means are equal
t-Test: Two-Sample Assuming Unequal Variances
                                Variable 1     Variable 2
Mean                            3986.666667    4971.667
Variance                        2796540.23     2855980
Observations                    30             30
Hypothesized Mean Difference    0
df                              58
t Stat                          -2.2692158
P(T<=t) one-tail                0.013495384
t Critical one-tail             1.671552763
P(T<=t) two-tail                0.026990768
t Critical two-tail             2.001717468

Tournament vs. DMEA w/ marriage & high cheating
The means are equal
t-Test: Two-Sample Assuming Unequal Variances
                                Variable 1     Variable 2
Mean                            4698.333333    4971.667
Variance                        3155255.747    2855980
Observations                    30             30
Hypothesized Mean Difference    0
df                              58
t Stat                          -0.61062046
P(T<=t) one-tail                0.27191818
t Critical one-tail             1.671552763
P(T<=t) two-tail                0.54383636
t Critical two-tail             2.001717468

Tournament vs. DMEA w/ marriage
The means are equal
t-Test: Two-Sample Assuming Unequal Variances
                                Variable 1     Variable 2
Mean                            4591.6667      4971.667
Variance                        3231566.1      2855980
Observations                    30             30
Hypothesized Mean Difference    0
df                              58
t Stat                          0.8435738
P(T<=t) one-tail                0.2011867
t Critical one-tail             1.6715528
P(T<=t) two-tail                0.4023734
t Critical two-tail             2.0017175

Tournament vs. DMEA w/ marriage & low cheating
The means are equal
t-Test: Two-Sample Assuming Unequal Variances
                                Variable 1     Variable 2
Mean                            5146.666667    4971.667
Variance                        2396540.23     2855980
Observations                    30             30
Hypothesized Mean Difference    0
df                              58
t Stat                          0.418229646
P(T<=t) one-tail                0.338661867
t Critical one-tail             1.671552763
P(T<=t) two-tail                0.677323733
t Critical two-tail             2.001717468
Knapsack

Tournament vs. AMEA w/ marriage
The means are equal
t-Test: Two-Sample Assuming Unequal Variances
                                Variable 1     Variable 2
Mean                            27.42215       27.38704
Variance                        0.0782287      0.081376
Observations                    30             30
Hypothesized Mean Difference    0
df                              58
t Stat                          0.4813581
P(T<=t) one-tail                0.3160366
t Critical one-tail             1.6715528
P(T<=t) two-tail                0.6320732
t Critical two-tail             2.0017175

Tournament vs. AMEA w/ marriage & low cheating
The means are equal
t-Test: Two-Sample Assuming Unequal Variances
                                Variable 1     Variable 2
Mean                            27.42215       27.30497
Variance                        0.078228718    0.095382
Observations                    30             30
Hypothesized Mean Difference    0
df                              57
t Stat                          1.540328994
P(T<=t) one-tail                0.064506904
t Critical one-tail             1.672028889
P(T<=t) two-tail                0.129013808
t Critical two-tail             2.002465444

Tournament vs. AMEA w/ marriage & high cheating
The means are equal
t-Test: Two-Sample Assuming Unequal Variances
                                Variable 1     Variable 2
Mean                            27.42215       27.28795
Variance                        0.0782287      0.105776
Observations                    30             30
Hypothesized Mean Difference    0
df                              57
t Stat                          1.7135153
P(T<=t) one-tail                0.0460261
t Critical one-tail             1.6720289
P(T<=t) two-tail                0.0920521
t Critical two-tail             2.0024654

Tournament vs. DMEA w/ marriage
The means are equal
t-Test: Two-Sample Assuming Equal Variances
                                Variable 1     Variable 2
Mean                            27.42215       27.3204
Variance                        0.0782287      0.056139
Observations                    30             30
Pooled Variance                 0.0671839
Hypothesized Mean Difference    0
df                              58
t Stat                          1.5203142
P(T<=t) one-tail                0.0669326
t Critical one-tail             1.6715528
P(T<=t) two-tail                0.1338652
t Critical two-tail             2.0017175

Tournament vs. DMEA w/ marriage & high cheating
The means are equal
t-Test: Two-Sample Assuming Equal Variances
                                Variable 1     Variable 2
Mean                            27.39670333    27.3204
Variance                        0.084338039    0.056139
Observations                    30             30
Pooled Variance                 0.070238521
Hypothesized Mean Difference    0
df                              58
t Stat                          1.115019566
P(T<=t) one-tail                0.134720451
t Critical one-tail             1.671552763
P(T<=t) two-tail                0.269440902
t Critical two-tail             2.001717468

Tournament vs. DMEA w/ marriage & low cheating
The means are equal
t-Test: Two-Sample Assuming Equal Variances
                                Variable 1     Variable 2
Mean                            27.28731       27.3204
Variance                        0.072530857    0.056139
Observations                    30             30
Pooled Variance                 0.06433493
Hypothesized Mean Difference    0
df                              58
t Stat                          -0.50531562
P(T<=t) one-tail                0.307626041
t Critical one-tail             1.671552763
P(T<=t) two-tail                0.615252081
t Critical two-tail             2.001717468
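
The t statistics in the tables above can be recomputed from the reported means, variances, and observation counts. The Python sketch below is an illustration under standard textbook formulas, not the tool that produced the tables. It shows the pooled-variance form used when equal variances are assumed and the Welch form, with Welch-Satterthwaite degrees of freedom, used when unequal variances are assumed. The function names are hypothetical, and the probabilities and critical values are not computed.

```python
from math import sqrt


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """t statistic assuming equal variances; returns (t, pooled variance, df)."""
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t = (mean1 - mean2) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, pooled, df


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """t statistic assuming unequal variances; returns (t, approximate df)."""
    a, b = var1 / n1, var2 / n2
    t = (mean1 - mean2) / sqrt(a + b)
    # Welch-Satterthwaite approximation; generally not an integer.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Knapsack, DMEA w/ marriage & low cheating (equal variances assumed).
t_eq, pooled, df_eq = pooled_t(27.28731, 0.072530857, 30, 27.3204, 0.056139, 30)

# Knapsack, AMEA w/ marriage & low cheating (unequal variances assumed).
t_uneq, df_uneq = welch_t(27.42215, 0.078228718, 30, 27.30497, 0.095382, 30)

print(t_eq, pooled, df_eq)
print(t_uneq, round(df_uneq))
```

For these two cases the recomputed values agree with the tabulated pooled variance, t statistics, and degrees of freedom (58 and 57) to within the rounding of the printed means and variances.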