GameAI_NeuralNetworks
Transcript

Neural Networks for Games

1. Control
   - Controllers for robotic applications: the robot's sensory system provides the inputs, and the output sends the responses to the robot's motor control system.
   - How about in games?

2. Threat Assessment
   - Strategy/simulation-type games.
   - Use a NN to predict the type of threat presented by the player at any time during gameplay.

3. Attack or Flee
   - RPGs – to control how certain AI creatures behave.
   - Handles the AI creature's decision making.
Using a 3-layer feed-forward neural network as an example

Structure
- Input → Hidden → Output: the feed-forward process

Input
- What to choose as input is problem-specific.
- Keeping the inputs to a minimal set makes training easier.
- Forms: Boolean, enumerated, continuous.
- Because different scales are used, the input values must be normalized.
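The normalization step can be sketched as a simple min-max rescaling into the 0–1 range; the range bounds are assumptions supplied by the caller:

```c
/* Min-max normalization: map a raw input value from [min, max] into
   the [0, 1] range expected by the network. */
double normalize(double value, double min, double max)
{
    return (value - min) / (max - min);
}
```

For example, a hit-point value of 75 on a 0–100 scale becomes normalize(75, 0, 100) = 0.75.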
Weights
- The "synaptic connections" of a biological NN.
- Weights influence the strength of the inputs.
- Determining the weights involves "training" or "evolving" the NN.
- Every connection between neurons has an associated weight; the net input to a given neuron j is calculated from a set of input neurons i.
- The net input to a given neuron is a linear combination of the weighted inputs from the neurons of the previous layer.

Activation Functions
- Take the net input to a neuron and operate on it to produce the neuron's output.
- Should be nonlinear functions for the NN to work as expected.
- Common choice: the logistic (or sigmoid) function.
Activation Functions (cont'd)
- Other well-known activation functions: the step function and the hyperbolic tangent function.

Bias
- Each neuron (except those in the input layer) has a bias associated with it.
- The bias term shifts the net input along the horizontal axis of the activation function, changing the threshold at which it activates.
- Value: always 1 or -1.
- The bias weight is adjusted just like the other weights.

Output
- The choice of outputs is also problem-specific.
- Same rule of thumb: keep the number of outputs to a minimum.
- Using a logistic function as the output activation, an output around 0.9 is considered activated (true) and one around 0.1 not activated (false).
- In practice, we may not even get close to these values, so a threshold has to be set; using the midpoint of the function (0.5) is a simple choice.
- If more than one output neuron is used, more than one output could be activated; it is easier to select just one output with a "winner-take-all" approach.
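The midpoint threshold and the winner-take-all selection can be sketched as:

```c
/* Single output: activated if at or above the logistic midpoint (0.5). */
int is_activated(double output)
{
    return output >= 0.5;
}

/* Winner-take-all: return the index of the output neuron with the
   highest activation, so exactly one behavior is selected. */
int winner_take_all(const double outputs[], int n)
{
    int best = 0;
    for (int i = 1; i < n; i++)
        if (outputs[i] > outputs[best])
            best = i;
    return best;
}
```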
Hidden Layer
- Some NNs have no hidden layer, others a few hidden layers – a design decision.
- The more hidden layers, the more features the network can handle, and vice versa.
- Increasing the number of features (dimensionality) can enable a better fit to the expected function.
Back-propagation Training
- Aim of training: find values for the weights connecting the neurons such that the input data generates the desired output values.
- Needs a training set.
- Done iteratively.
- It is an optimization process, so it requires some measure of merit: an error measure that needs to be minimized.
- Typical error measure: the mean square error.
Finding optimum weights iteratively
1. Start with a training set consisting of input data and desired outputs.
2. Initialize the weights in the NN to small random values.
3. For each set of input data, feed the network and calculate the output.
4. Compare the calculated output with the desired output and compute the error.
5. Adjust the weights to reduce the error, and repeat the process.

Each iteration is known as an "epoch".
Computing error
- The most common error measure is the mean square error: the average of the squared differences between the desired and calculated outputs.
- Goal: get the error value as small as possible.
- Iteratively adjust the weight values by calculating the error associated with each neuron in the output and hidden layers.
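The mean square error can be sketched as:

```c
/* Mean square error: the average of the squared differences between
   the desired and the calculated outputs. */
double mean_square_error(const double desired[], const double actual[], int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        double diff = desired[i] - actual[i];
        sum += diff * diff;
    }
    return sum / n;
}
```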

 Computing
error (cont’d)

Output neuron error

Hidden-layer neuron error

No error is associated with input layer neurons
because those neuron values are given
Can you observe how back-propagation is at work?
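The slide's equations did not survive the transcript; for a logistic activation, the standard back-propagation error terms look like the sketch below (the derivative of the logistic function is output × (1 − output)):

```c
/* Output-neuron error: logistic derivative times the difference between
   the desired and the calculated output. */
double output_delta(double output, double desired)
{
    return output * (1.0 - output) * (desired - output);
}

/* Hidden-neuron error: logistic derivative times the weighted sum of
   the error terms from the layer above -- this is where the error is
   propagated backwards through the network. */
double hidden_delta(double hidden, const double deltas_above[],
                    const double weights_to_above[], int n_above)
{
    double sum = 0.0;
    for (int k = 0; k < n_above; k++)
        sum += deltas_above[k] * weights_to_above[k];
    return hidden * (1.0 - hidden) * sum;
}
```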

Adjusting weights
- Calculate a suitable adjustment for each weight in the network.
- Adjustment to each weight: New weight = Old weight + Δw
- Adjustments are made for each individual weight.
- The learning rate p is a multiplier that affects how much each weight is adjusted.
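The update rule can be sketched as follows; the form Δw = p × δ × input (δ being the receiving neuron's error term, input the value carried along the connection) is an assumption consistent with standard logistic back-propagation:

```c
/* Apply one weight adjustment: New weight = Old weight + Δw, where
   Δw = p * delta * input (p: learning rate, delta: the receiving
   neuron's error term, input: the value sent along the connection). */
void adjust_weight(double *weight, double p, double delta, double input)
{
    *weight += p * delta * input;
}
```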
Adjusting weights (cont'd)
- Setting p too high might overshoot the optimum weights.
- Setting p too low means training might take too long.
- Special techniques: adding "momentum" (see textbook), or regularization (another technique).
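The momentum technique adds a fraction of the previous adjustment to the current one, which helps the weights keep moving through flat regions of the error surface; a minimal sketch (the momentum factor is an assumption chosen by the designer):

```c
/* Weight update with momentum: a fraction of the previous adjustment
   (prev_dw) carries over into the current one. */
void update_with_momentum(double *weight, double *prev_dw,
                          double p, double delta, double input,
                          double momentum)
{
    double dw = p * delta * input + momentum * (*prev_dw);
    *weight += dw;
    *prev_dw = dw;    /* remembered for the next iteration */
}
```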
Earlier example
- Flocking and Chasing: a flock of units chasing the player.

Applying neural networks
- To decide whether to chase the player, evade him, or flock with the other AI units.
- Simplistic methods: the creature always attacks the player, OR a FSM "brain" (or other decision-making method) decides between those actions based on conditions.
Neural Networks
- Advantage: not only making decisions, but adapting behavior based on the units' experience with attacking the player.
- A "feedback" mechanism is useful to model "experience", so that subsequent decisions can be improved or made "smarter".

How it works (example)
- Assume we have 20 AI units moving on the screen.
- Behaviors: Chase, Evade, Flock with other units.
Combat mode
- When the player and AI units come within a specified radius of one another, they are assumed to be in combat.
- Combat is not simulated; instead, a simple system is used whereby AI units lose a number of hit points every turn through the game loop.
- The player also loses a number of hit points proportional to the number of AI units.
- A unit dies when its HP reaches 0, and is respawned.

"Brain"
- All AI units share the same "brain".
- The brain evolves as the units gain experience with the player.
- Back-propagation is implemented so that the NN's weights can be adjusted in real time.
- Assume all AI units evolve collectively.
Expectations
- The AI becomes more aggressive if the player is weak.
- The AI becomes more withdrawn if the player is strong.
- The AI learns to stay in the flock to have a better chance of defeating the player.

Initialize values for the neural network
- Number of neurons in each layer: 4 inputs, 3 hidden neurons, 3 output neurons.
Preparation for training
- Initialize the learning rate to 0.2 – tuned by trial-and-error with the aim of keeping the training time down while maintaining accuracy.
- Data is dumped into a text file so that it can be referred to during debugging.
- Training loop – cycle through until:
  - the calculated error is less than some specified value, OR
  - the number of iterations reaches a specified maximum.
Sample training data for the NN

double TrainingSet[14][7] = {
  // #Friends, Hit points, Enemy Engaged, Range, Chase, Flock, Evade
     0,        1,          0,             0.2,   0.9,   0.1,   0.1,
     0,        1,          1,             0.2,   0.9,   0.1,   0.1,
     0,        1,          0,             0.8,   0.1,   0.1,   0.1,
     ....

- 14 sets of input and output values.
- All data values are normalized to the range 0.0–1.0.
- Use 0.1 for an inactive (false) output and 0.9 for an active (true) output – it is impractical to achieve exactly 0 or 1 from a NN output, so use reasonable target values.
Training data was chosen empirically
- Assume a few arbitrary input conditions and then specify a reasonable response.
- In practice, you are likely to design more training sets than shown in this example.

Training loop
- The error is initialized to 0 and calculated for each "epoch" (one pass through all 14 sets of inputs/outputs).
- For each set of data:
  1. Feed-forward is performed.
  2. The error is calculated and accumulated.
  3. Back-propagation adjusts the connection weights.
- The average error is then calculated (divide by 14).
Updating AI units – cycle through all of them
- Calculate the distance from the current unit to the target.
- Check whether the target has been killed. If it has, check where the current unit is in relation to the target (whether it is within combat range). If it is, retrain the NN to reinforce the chase behavior (the unit is doing something right, so train it to be more aggressive). Otherwise, retraining the NN reinforces the other behaviors.
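The retraining decision can be sketched as below; the 0.9/0.1 target encoding follows the training data, while the choice of which "other" behavior to reinforce (flocking here) is an assumption for illustration:

```c
#define ACTIVE   0.9
#define INACTIVE 0.1

enum { CHASE, FLOCK, EVADE };

/* Build the desired-output vector used to retrain the shared "brain"
   once the target is killed: units that were within combat range get
   the chase behavior reinforced (they were doing something right);
   the rest get another behavior reinforced instead. */
void retrain_targets(int in_combat_range, double desired[3])
{
    desired[CHASE] = desired[FLOCK] = desired[EVADE] = INACTIVE;
    if (in_combat_range)
        desired[CHASE] = ACTIVE;   /* train it to be more aggressive */
    else
        desired[FLOCK] = ACTIVE;   /* reinforce a non-chase behavior */
}
```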
Use the trained NN for real-time decision making
- Under the current set of conditions in real time, the output shows which behavior the unit should take.
- REMEMBER: the input values have to be consistently normalized before being fed through the NN!
- Feed-forward is applied.
- The output values are then examined to derive the proper choice of behavior.
  - A simple way is to just select the output with the highest activation.
Some outcomes of this AI
- If the target is left to die without inflicting much damage on the units, the AI units adapt to attack more often (the target is perceived as weak).
- If the target inflicts massive damage, the AI units adapt to avoid the target more (the target is perceived as strong).
- The AI units also adapt to flock together when faced with a strong target.

Some outcomes of this AI (cont'd)
- Interesting emergent behavior: leaders emerge in the flocks, and the intermediate and trailing units follow the lead. Q: How is it possible to design such behaviors?