Neural Networks for Games
1. Control
Controllers for robotic applications: the robot’s sensory system provides the inputs, and the output sends responses to the robot’s motor control system.
How about in games?
2. Threat Assessment
In a strategy/simulation game, use a NN to predict the type of threat presented by the player at any time during gameplay.
3. Attack or Flee
In an RPG – to control how certain AI creatures behave; handle an AI creature’s decision making.
Using a 3-layer feed-forward neural network as an example
Structure
Input → Hidden → Output: the feed-forward process
Input
What to choose as input is problem-specific
Keeping inputs to a minimum set will make training
easier.
Forms: Boolean, enumerated, continuous
The different scales used require the input values
to be normalized
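Because Boolean, enumerated, and continuous inputs arrive on different scales, a simple min-max rescaling into [0, 1] is a common choice. A minimal sketch (the function name and the clamping behavior are illustrative assumptions, not from the original):

```cpp
#include <algorithm>

// Min-max normalization: map a raw value from [minVal, maxVal] into [0, 1].
// Out-of-range values are clamped so the NN never sees inputs outside [0, 1].
double Normalize(double value, double minVal, double maxVal) {
    double t = (value - minVal) / (maxVal - minVal);
    return std::max(0.0, std::min(1.0, t));
}
```

For example, a hit-point value in [0, 100] would be fed to the network as `Normalize(hp, 0.0, 100.0)`.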
Weights
“Synaptic connection” in a biological NN
Weights influence the strength of inputs
Determining the weights involves “training” or “evolving” the NN
Every connection between neurons has an
associated weight – net input to a given neuron j is
calculated from a set of input i neurons
Net input to a given neuron is a linear combination
of weighted inputs from other neurons from
previous layer
Activation Functions
Takes the net input to a neuron, operates on it to
produce an output for the neuron
Should be a nonlinear function, for the NN to work as expected
Common: Logistic (or sigmoid) function
Activation Functions (cont’d)
Other well-known activation functions: Step
function and hyperbolic tangent function
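The feed-forward step for one neuron combines the weighted-sum net input with the logistic activation. A small sketch (function names are illustrative):

```cpp
#include <cmath>
#include <vector>

// Net input to a neuron: linear combination of the previous layer's
// outputs, plus the bias weight (the bias input is fixed at 1).
double NetInput(const std::vector<double>& inputs,
                const std::vector<double>& weights,
                double biasWeight) {
    double sum = biasWeight * 1.0;
    for (std::size_t i = 0; i < inputs.size(); ++i)
        sum += inputs[i] * weights[i];
    return sum;
}

// Logistic (sigmoid) activation: squashes any net input into (0, 1).
double Sigmoid(double net) {
    return 1.0 / (1.0 + std::exp(-net));
}
```

A neuron's output is then `Sigmoid(NetInput(...))`; the step and hyperbolic-tangent activations mentioned above would simply replace `Sigmoid` here.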
Bias
Each neuron (except in the input layer) has a bias associated with it
The bias term shifts the net input along the horizontal axis of the activation function, changing the threshold at which the neuron activates
Value: always 1 or -1
Its weight is also adjusted, just like the other weights
Output
Choice is also problem-specific
Same rule of thumb: Keep number to minimum
Using a logistic function as output activation, an
output around 0.9 is considered activated or true,
0.1 considered not activated or false
In practice, we may not even get close to these values! So a threshold has to be set; using the midpoint of the function (0.5) is a simple choice
If more than one output neuron is used, more than
one outputs could be activated – easier to select
just one output by “winner-take-all” approach
Hidden Layer
Some NNs have no hidden layers, others have a few – a design decision
The more hidden layers, the more features the
network can handle, and vice versa
Increasing the number of features (dimensionality)
can enable better fit to the expected function
Back-propagation Training
Aim of training – to find values for the weights that
connect the neurons such that the input data can
generate the desired output values
Need a training set
Done iteratively
Optimization process – requires some measure of merit: an error measure that needs to be minimized
Error measures: Mean square error
Finding optimum weights iteratively
1. Start with a training set consisting of input data and desired outputs
2. Initialize the weights in the NN to some small random values
3. With each set of input data, feed the network and calculate the output
4. Compare the calculated output with the desired output, compute the error
5. Adjust the weights to reduce the error, and repeat the process
Each iteration is known as an “epoch”
Computing error
Most common error measure: mean square error, or the average of the squared differences between the desired and calculated outputs:
MSE = (1/n) Σᵢ (dᵢ − yᵢ)²
Goal: to get the error value as small as possible
Iteratively adjust the weight values, by calculating the error associated with each neuron in the output and hidden layers
Computing error (cont’d)
Output neuron error (for sigmoid activation): δⱼ = yⱼ(1 − yⱼ)(dⱼ − yⱼ)
Hidden-layer neuron error: δⱼ = yⱼ(1 − yⱼ) Σₖ δₖwⱼₖ, summed over the output neurons k that neuron j feeds
No error is associated with input-layer neurons, because those neuron values are given
Can you observe how back-propagation is at work?
Adjusting weights
Calculate a suitable adjustment for each weight in the network.
Adjustment to each weight:
New weight = Old weight + Δw, where Δw = ρ · δ · (input along the connection)
Adjustments are made for each individual weight
The learning rate ρ is a multiplier that affects how much each weight is adjusted.
Adjusting weights (cont’d)
Setting ρ too high might overshoot the optimum weights
Setting ρ too low, training might take too long
Special techniques: adding “momentum” (see textbook), or regularization (another technique)
Earlier example
Flocking and Chasing – A flock of units chasing a
player
Applying neural networks
To decide whether to chase the player, evade him, or flock with other AI units
Simplistic method: the creature always attacks the player, OR uses an FSM “brain” (or other decision-making method) to decide between those actions based on conditions
Neural Networks
Advantage: not only for making decisions, but also for adapting behavior given experience with attacking the player
A “feedback” mechanism is useful to model “experience”, so that subsequent decisions can be improved or made “smarter”.
How it works (example)
Assume we have 20 AI units moving on the screen
Behaviors: Chase, Evade, Flock with other units
Combat mode
When the player and AI units come within a specified radius of one another, they are assumed to be in combat
Combat will not be simulated – instead, a simple system is used whereby AI units lose a number of HP every turn through the game loop
The player also loses a number of HP proportional to the number of AI units
A unit dies when HP = 0, and is respawned
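The HP drain described above can be sketched as one function called per turn of the game loop; the struct layout and loss constants are assumptions, not from the original:

```cpp
#include <vector>

struct Combatant { double hitPoints; };

// One combat turn: every engaged AI unit loses a fixed number of HP, and
// the player loses HP proportional to the number of engaged AI units.
void ApplyCombatTurn(std::vector<Combatant>& unitsInCombat,
                     Combatant& player,
                     double unitLossPerTurn,
                     double playerLossPerUnit) {
    for (Combatant& u : unitsInCombat)
        u.hitPoints -= unitLossPerTurn;
    player.hitPoints -= playerLossPerUnit * unitsInCombat.size();
}
```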
“Brain”
All AI units share the same “brain”
The brain evolves as the unit gains experience
with the player
Implement back-propagation so that the NN’s
weights can be adjusted in real time
Assume all AI units evolve collectively
Expectations
AI becomes more aggressive if the player is weak
AI becomes more withdrawn if the player is strong
AI learns to stay in the flock to have a better chance of defeating the player
Initialize values for the neural network
Number of neurons in each layer: 4 inputs, 3 hidden neurons, 3 output neurons
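One way to store the weights for this 4-3-3 network, with an extra row per layer for the bias weights; the struct layout is an assumption for illustration, not the book's actual data structure:

```cpp
#include <vector>

// Weight storage for a 4-3-3 feed-forward network. The extra (+1) row in
// each matrix holds the bias weights, with the bias input fixed at 1.
struct NetworkWeights {
    std::vector<std::vector<double>> inputToHidden;   // (4 inputs + 1 bias) x 3
    std::vector<std::vector<double>> hiddenToOutput;  // (3 hidden + 1 bias) x 3
    NetworkWeights()
        : inputToHidden(5, std::vector<double>(3, 0.0)),
          hiddenToOutput(4, std::vector<double>(3, 0.0)) {}
};
```

Before training, each entry would be set to a small random value, as step 2 of the iterative procedure requires.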
Preparation for training
Initialize the learning rate to 0.2 – tuned by trial-and-error with the aim of keeping the training time down while maintaining accuracy
Data is dumped into a text file so that it can be referred to during debugging
Training loop – cycle through until…
The calculated error is less than some specified value, OR
The number of iterations reaches a specified maximum
Sample training data for NN
double TrainingSet[14][7] = {
    // #Friends, Hit points, Enemy Engaged, Range,   Chase, Flock, Evade
       0,        1,          0,             0.2,     0.9,   0.1,   0.1,
       0,        1,          1,             0.2,     0.9,   0.1,   0.1,
       0,        1,          0,             0.8,     0.1,   0.1,   0.1,
    ....
14 sets of input and output values
All data values are normalized to the range 0.0–1.0
Use 0.1 for an inactive (false) output and 0.9 for an active (true) output – it is impractical to achieve exactly 0 or 1 for a NN output, so use reasonable target values
Training
We assumed a few arbitrary input conditions and then specified a reasonable response.
In practice, you are likely to design more training sets than shown in this example
Training loop
Error is initialized to 0 and calculated for each ‘epoch’ (once through all 14 sets of inputs/outputs); the data was chosen empirically
For each set of data:
1. Feed-forward is performed
2. Error is calculated and accumulated
3. Back-propagation adjusts the connection weights
Average error is calculated (divide by 14)
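The loop structure (train until the error drops below a threshold OR a maximum epoch count is reached) can be illustrated on a single sigmoid neuron. This is a deliberately simplified sketch, not the full 4-3-3 implementation:

```cpp
#include <cmath>

// Trains one sigmoid neuron (single weight, single input) toward a target
// output, showing the two stopping criteria: error below a threshold, OR
// a maximum number of epochs reached. Returns the number of epochs run.
int TrainUntil(double& weight, double input, double target,
               double learningRate, double errorThreshold, int maxEpochs) {
    int epoch = 0;
    while (epoch < maxEpochs) {
        double out = 1.0 / (1.0 + std::exp(-(weight * input)));  // feed-forward
        double err = (target - out) * (target - out);            // squared error
        if (err < errorThreshold) break;                         // criterion 1
        double delta = out * (1.0 - out) * (target - out);       // back-propagation
        weight += learningRate * delta * input;                  // weight update
        ++epoch;
    }
    return epoch;  // equals maxEpochs if criterion 2 stopped the loop
}
```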
Updating AI units – cycle through all units
Calculate the distance from the current unit to the target
Check if the target is killed. If it is, check where the current unit is in relation to the target (whether it is within combat range). If it is, retrain the NN to reinforce the chase behavior (the unit is doing something right, so train it to be more aggressive). Otherwise, retraining the NN will reinforce the other behaviors.
Use the trained NN for real-time decision-making
Under the current set of conditions in real time, the output will show which behavior the unit should take
REMEMBER: input values have to be consistently normalized as well before feeding them through the NN!
Feed-forward is applied
Output values are then examined to derive the
proper choice of behavior
Simple way – just select output with highest activation
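The winner-take-all selection is just an arg-max over the output activations; a minimal sketch (mapping index 0/1/2 to chase/flock/evade follows the training-set column order above, an assumption):

```cpp
#include <vector>
#include <cstddef>

// Winner-take-all: return the index of the output neuron with the highest
// activation. With this example's outputs: 0 = chase, 1 = flock, 2 = evade.
std::size_t SelectBehavior(const std::vector<double>& outputs) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < outputs.size(); ++i)
        if (outputs[i] > outputs[best]) best = i;
    return best;
}
```

Note that this sidesteps the threshold question entirely: even if no output reaches 0.9, the most activated behavior still wins.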
Some outcomes of this AI:
If the target is left to die without inflicting much damage on the units, AI units will adapt to attack more often (target perceived as weak)
If the target inflicts massive damage, AI units will adapt to avoid the target more (target perceived as strong)
AI units also adapt to flock together if they are faced with a strong target
Some outcomes of this AI (cont’d):
Interesting emergent behavior: leaders emerge in flocks, and intermediate and trailing units will follow the lead. Q: How is it possible to design such behaviors?