Evolving Neural NPCs with Layered Influence Map
in the Real-time Simulation Game ‘Conqueror’
Su-Hyung Jang, Student Member, IEEE and Sung-Bae Cho, Senior Member, IEEE
Abstract—AI in computer games has recently attracted much attention, and the importance of game platforms that support the investigation of various AI techniques is increasing. In this paper, we develop a real-time simulation game called 'Conqueror' to provide AI researchers with a better game platform, and we apply it to produce neural NPCs with layered influence maps. The layered influence map algorithm is used to analyze the situation in terms of the distribution of influence between friendly and enemy forces, while a neural network is adopted as the basic representation of the NPC. Experiments have been conducted with various factors and verify the usefulness of the proposed game platform.
I. INTRODUCTION

Inexpensive yet powerful computer hardware and advanced graphics techniques have made it possible to enlarge the range of computer game genres. One of them is the real-time strategy (RTS) simulation game. Good artificial intelligence (AI) in RTS games means a series of interesting decisions, but most research has focused on games that can be described in a compact form using symbolic representations, such as board and card games. Recently, many AI technologies have been applied to design the behaviors of RTS-game NPCs. One of them is neuroevolution (NE) [1]. This approach is particularly well-suited for video games: NE works well in high-dimensional spaces, a diverse population can be maintained as individual networks behave consistently, adaptation can take place in real time, and memory can be implemented through recurrency [2]. RTS games include Starcraft, Dawn of War, Supreme Ruler, Earth 2160, and Age of Empires. Players are given cities, armies, buildings, and abstract resources such as money, gold, and saltpeter. They play both by allocating these resources to produce more units and buildings and by assigning objectives and commands to their units. Units carry out player orders automatically, and the game is usually resolved with the destruction of the other players' assets [3].
In this paper, we propose a strategy generation method using an influence map. An influence map (IM) is a grid placed over the world, with a value assigned to each square by functions that represent a spatial feature or concept. Influence maps evolved out of work done on spatial reasoning within the game of Go and have been used sporadically since then in games such as Age of Empires. Influence maps are combined to form spatial decision-making strategies. The IM function could be a summation of the natural resources present in a square, the distance to the closest enemy, or the number of friendly units in the vicinity [3].
II. BACKGROUND

A. Related Work

Interpreting the current situation is very important for generating an optimal strategy in real-time strategy games, and there are various techniques for situation analysis. C. Ong [4] attempted to evaluate situations in Chinese Chess by expressing the locations of the pieces on the whole board with numbers. This method makes it possible to identify all the factors influencing the game by reading the whole situation. It is appropriate when each object has no autonomy and a small number of objects is adjusted under central control. However, when there are many objects in the game and the influence of any single object on the whole game is small, the strategy taken by an object often does not require information about the whole game; unnecessary information may instead consume excessive computational resources and interfere with adopting an optimal strategy.

S. Lucas [5] used the situation information of the map in the Pac-Man game after selectively collecting and processing it. An expert system should be utilized to use such data effectively. As more parts are tuned at the development stage, overhead grows, and trial and error becomes mandatory in order to find an effective encoding.

B. Influence Map
Interpretation of the current situation is very important in order to draw out optimal strategies for real-time strategy games. The IM shows the overall adversarial relationships of the game and is useful for strategic evaluation and decision-making based on the current game status. An influence map is composed of multiple layers superposed on the geographic representation of the game map, as shown in Figure 1, and each layer represents a different variable of the game. When an agent makes a decision, all or some of these layers are combined with appropriate weights into a single map. The values constituting the map indicate the approximate degree of suitability of each position of the game map for the decision at hand [3].
Su-Hyung Jang is with the Department of Computer Science, Yonsei University, 262 Seoungsanno, Sudaemoon-ku, Seoul 120-749, Korea (phone: +82-2-2123-3877; email: [email protected]).
Sung-Bae Cho is with the Department of Computer Science, Yonsei University, 262 Seoungsanno, Sudaemoon-ku, Seoul 120-749, Korea (phone: +82-2-2123-2720; email: [email protected]).
Fig. 1. Influence Map

III. THE GAME: 'CONQUEROR'
'Conqueror,' developed in this research, is a real-time strategic simulation game in which two nations expand their own territory and take over the other's. Each nation has soldiers who individually build towns and fight against the enemies, while a town continually produces soldiers at a given interval. The motivation of this game is to observe multi-agent behaviors and to demonstrate AIs in computer games, especially the usefulness of evolutionary approaches. Figure 2 shows a snapshot of 'Conqueror.' The following rules summarize the game.
1) The resources given to each country at the beginning of the game are one soldier and one village.
2) The energy of a village increases as time goes by; when the energy reaches 100, one soldier is produced and the energy is reset (see the sketch after this list).
3) Produced soldiers may create other villages, destroy villages of the opponent, or fight against the opponent's soldiers.
4) Energy is given to the soldiers; it decreases when a village is created or destroyed or when a combat is fought, and it increases automatically as time passes.
5) Combat can be fought in alliance with other soldiers, and the energy of soldiers in alliance equals the total sum of each soldier's energy.
6) Structures such as palisades can reduce the fighting strength of the opponent.
Fig. 2. A Snapshot of the Developed Game
The size of the map can be adjusted from 7 × 7 to 30 × 30 according to the user's request, and the elevation of each position can be set. Each coordinate is an exclusive space, where two units cannot enter simultaneously and two buildings cannot be constructed at the same time. The NPC supports two types of characters, and each character may generate high-level emergent behaviors such as attack, movement to either side, construction of a building, destruction of a building, merger, etc.

The Repository manages information by modularizing all the information on the game, agents, map, etc. necessary to progress the game. To increase the realism of the game, various interaction rules are designed. Those rules are applied every unit period, and each rule has an action-and-reaction effect so that no single super strategy arises. Scores are calculated by several measures, such as town-building scores, attacking scores, unit-production scores, and their summation. Since there are diverse measures by which to evaluate a strategy, various types of strategies might be explored by evolutionary approaches.
IV. NEURAL NPC WITH LIM

As shown in Figure 3, every agent within the game identifies the situation of the game and derives optimized strategies by utilizing the proposed layered influence map (LIM) technique. All the agents created within the game decide the optimal behavior among attack, construction, destruction, evasion, and merger through a neural network utilizing the LIM, and move accordingly. The layered information used for the LIM consists of buildings and troops. The weights of the neural network used here were directly encoded into genes. The genes thus created were repeatedly evaluated using a genetic algorithm that selects excellent groups and applies crossover and mutation, gradually exploring neural networks that command strategies with excellent evaluations.
Fig. 3. Behavior Decision Flowchart using Neural Network and Influence Map
Weighting and summing the influence layers to attain a final set of desirability values is advantageous because it is a relatively simple approach. Moreover, it is fully transparent, in that the developer knows exactly how and why it makes its decisions, as these parameters have been set by hand. However, this method of calculating desirability values has certain limitations.
First, the developer needs to choose which layers are
relevant to the decision that is being made. This might seem
like a simple task, but it is often difficult to know exactly
which factors an expert is considering when making a
decision, such as an expert game player choosing which
location to attack. Therefore, the process of choosing the
relevant variables that need to be considered for the decision
is a matter of trial and error. As a result, the process is time
consuming and might mean that useful and important
information is left out.
Second, when the chosen layers are summed together, it is
possible that important information might be lost. For
example, consider a situation in which the AI has units in a
certain cell that are adding a positive influence, while the
human player has units in the same cell that are adding a
negative influence. When these opposing influences are
added together they cancel each other out, and it seems as
though there are no units adding influence in that cell.
However, the information that both forces have units in this
cell could be quite important to a strategic decision, but by
simply adding them together it can go unnoticed.
Third, finding the correct weighting for each layer for each decision is also a matter of trial and error, and as such can require a great deal of tweaking to get right. The only way to find a suitable set of weights is to guess initial values and then hand-tune them until the AI seems to be behaving reasonably [6].
To overcome these limits, in this paper the situational variables are separated and behavior evaluation is conducted for each variable on its own; behavior strategies that satisfy all the variables as far as possible are then sought by applying per-variable weights to the evaluated values. This is because the situations do not all influence strategy selection with identical weight.
In this paper, we adopt an evolutionary method using the genetic algorithm to generate behavior systems adapted to several environments. The chromosome is encoded as a string that represents the weights of the neural network that determines the behavior of an agent, as shown in Figure 4. Crossover, mutation, and roulette wheel selection are used to reproduce a new population. The crossover and mutation of neural networks are shown in Figure 5. In crossover, we first choose two individuals and exchange parts of them. In mutation, we select a part of the network and replace it with new values. By repeating this process, we obtain improved behavior strategies.
Fig. 4. Chromosome of NN Encoding
Fig. 5. Genetic Operation
V. EXPERIMENT AND RESULT
A. Experiment
The genetic algorithm was configured with 50 individuals, a crossover rate of 0.7, and a mutation rate of 0.15, using roulette wheel selection.
The situational variables of the LIM were divided into the status of buildings and the status of troops; the weight given to troops was 0.8 and that given to buildings was 0.2. The value at the position of greatest friendly influence is 5, with 1 point deducted for each cell away from it, while the value at the position of greatest enemy influence is -5, with 1 point added for each cell away from it. Where the influences of friends and enemies overlap, the sum of the individual influences is used. Evaluation is based on how effectively the game progresses against a system with random strategy selection. Based on the status of buildings and the status of troops used for the LIM, a grade A is produced. This grade is not used as an absolute value; rather, a relative grade is computed with Equation (1) by comparison against the grade B of the random-strategy system. This value represents the dominance rate of the game.
The feasibility has been verified by comparing the performance of strategies generated utilizing the LIM in this way with the performance of strategies generated utilizing a general IM.
Fitness = A / (A + B)                                             (1)
B. Test Result

Fig. 6. Dominance Rate Changes against Random Strategy based on Evolutionary Computation and Win Rate Changes against Random Strategy based on Evolutionary Computation
When the input values of the evolutionary neural network were entered as themselves, without processing, the game dominance rate against an AI with a random strategy selection rule converged below approximately 60%. On the contrary, when the LIM was used for the input values of the evolutionary neural network, performance increased rapidly at the beginning and reached approximately 80% game dominance in about 30 generations, as can be seen in Figure 6. Also, in terms of the win rate, performance stayed below 80% when simple data entry was used, but the win rate through evolution was higher than 90% when the LIM was used.
The generation of diverse strategies according to the weights has also been observed in tests run by adjusting the weights of the situational variables included in the influence map, as shown in Figure 7. While the group strategy leaned toward the construction of multiple troop-production bases when the weight shares of buildings and troops were given at a ratio of 8:2, soldiers concentrated on combat rather than on building construction when the weight shares between buildings and troops were set to 2:8. Nevertheless, the final win rates turned out to be similar in both cases.
comparred with the case of having input values entered as
simple data in the games using the evolutionnary neural
networkk. Strategy seelection utiliziing neural neetworks by
makingg information into
i
influence by factors off the game
broughtt about generaation of far more
m
excellentt strategies
comparred with neuural networkss utilizing unnprocessed
informaation as itselff. For this stuudy tests tookk place by
applyinng the same reesolution to thhe map in the game and
Influencce Map, but we will studdy performancce changes
accordinng to adjustm
ment of the am
mount of inpuut data by
changinng resolution of
o Influence Maap in the futurre. Also we
plan too compare annd evaluate Layered
L
Influuence Map
perform
mances under coonditions of utiilizing techniquues such as
NEAT under
u
indirect encoding, not direct encodinng of neural
networkk only.
AC
CKNOWLEDGEM
MENTS
This research was supported
s
by MKE,
M
Korea under
u
ITRC
IITA-20008-(C1090-08801-0011).
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
Fig. 7. Win Rate Chhanges on Evolutio
F
onary Computationn of Various Situaation
R
Ratio
[14]
VI. CON
NCLUSIONS
[15]
Layered Inflluence Map haas been proposed as a way for
s
situation
perceeption input vaalues utilized for the real-tiime
s
strategy
and alsso as AI test simulator in thiis paper. Throuugh
thhis the feasibiility has been verified from tests to see how
h
m
much
of perform
mance enhanceement was achhieved when innput
v
values
were utiilized by proceessing them into Influence Map
M
388
[16]
[17]
R. Mikkulainen,
M
B. Bryant
B
and R. Coornelius and I. Kaarpov and K.
Stannley and C.Yong, "Computational Intelligence in Game,"
G
IEEE
Com
mputational Intellig
igence Society, pp.. 281-300, 2006.
K. Stanley,
S
B. Bryantt, “Real-time neurroevolution in the NERO video
gam
me,” Evolutionaryy Computation, IEEE
IE
Transactionns Vol.9, pp.
653-668 ,Dec.2005.
C. Miles
M
and S. Louiss, “Towards The Co-Evolution
C
of Innfluence Map
Treee Based Strateegy Game Playyer," IEEE Sym
mposium on
Com
mputational Intellig
igence and Games,, 2006
C. Ong,
O
H. Quek, K. Tan and A. Tayy, “Discovering Chinese
C
Chess
Straategies through Cooevolutionary Appproaches,” IEEE Symposium
S
on
Com
mputational Intellig
igence and Games ,2007.
C. Ong,
O
H. Quek, K.. Tan and A. Tayy, “Evolving a Neeural Network
Loccation Evaluator to Play Ms. Pacc-Man,” IEEE Syymposium on
Com
mputational Intellig
igence and Games ,2005.
S. Rabin,
R
AI GAME PROGRAMMING
G WIDSOM II, Charles
C
River
Meddia, 2005.
D. B. Fogel, "A platform
p
for evoolving intelligentlly interactive
adversaries," Biosysteem pp. 72-83,85, 2006.
P. Byl
B Programmingg Believable Chaaracters for Compputer Games,
Chaarles river media 2004.
K. Kanev,
K
"Design annd simulation of innteractive 3D compputer games,"
Com
mput. & Graphics,, Vol. 22, no. 2-3, pp.
p 281-300, 19988.
A. Tychsen,
T
M. Hitchhens and T.Broluund and M.Kavaklli "The Game
Masster," 2005 Interractive. Entertainnment Conferencce, Vol. 20,
pp.2215-222. 2005.
K. Mackin, "Evolvving Intelligent Multi-agent Syystems using
Unssupervised Agent Communication
C
annd Behavior Trainiing," Systems,
Mann, and Cyberneticss, 2000 IEEE Interrnational Conferennce, Vol. 4 pp.
2411-2414, 2000.
K
"Modelliing and simulationn of a group of mobile
m
robots"
G. Klancar,
Simulation Modellingg Practice and Theeory, Vol, 15, pp.6447-658, 2007.
N. Stahl, "Gamebotss: A 3D virtual world
w
test-bed for multi-agent
reseearch", INF389 - Artificial
A
Life, 20066
Z. Kobti, S. Sharrma, “A multi-aagent architecturee for game
playying”,Computationnal Intelligence and Games, 2007. IEEE
Sym
mposium on CIG 2007,
2
pp. 276 - 2811 April 2007.
K. Chellaphilla,
C
D. Fogel,
F
“Evolution, neural networkss, games, and
inteelligence,” Proceeddings of the IEEE, 87(9), pp 1471-14496, 1999
D. Fogel, K. Chellaapilla, “Verifying anaconda’s expeert rating by
com
mpeting against Chinook:
C
experim
ments in co-evolvving a neural
checckers player,” Neuurocomputing, 42(1-4), pp 69-86, 20002
J. Hong,
H
“Evolutionn of emergent behaviors
b
for shhooting game
charracters in Robocoode,” Congress onn Evolutionary Coomputation, 1,
pp. 634-638, 2004
2008 IEEE Symposium on Computational Intelligence and Games (CIG'08)