GameAI_NeuralNetworks
Transcript

Neural Networks for Games

1. Control
   - Controllers for robotic applications: the robot's sensory system provides the inputs, and the output sends the responses to the robot's motor control system.
   - How about in games?

2. Threat Assessment
   - Strategy/simulation-type games.
   - Use a NN to predict the type of threat presented by the player at any time during gameplay.

3. Attack or Flee
   - RPGs – to control how certain AI creatures behave.
   - Handles the AI creature's decision making.
Using a 3-layer feed-forward neural network as an example

Structure
- Input → Hidden → Output: the feed-forward process

Input
- What to choose as input is problem-specific.
- Keeping the inputs to a minimal set makes training easier.
- Forms: Boolean, enumerated, continuous.
- Because different scales are used, the input values must be normalized.
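The normalization step can be sketched as a simple min-max rescaling into the 0–1 range; the range bounds are assumptions supplied by the caller:

```c
/* Min-max normalization: map a raw input value from [min, max] into
   the [0, 1] range expected by the network. */
double normalize(double value, double min, double max)
{
    return (value - min) / (max - min);
}
```

For example, a hit-point value of 75 on a 0–100 scale becomes normalize(75, 0, 100) = 0.75.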
Weights
- The "synaptic connections" of a biological NN.
- Weights influence the strength of the inputs.
- Determining the weights involves "training" or "evolving" the NN.
- Every connection between neurons has an associated weight; the net input to a given neuron j is calculated from a set of input neurons i.
- The net input to a given neuron is a linear combination of the weighted inputs from the neurons of the previous layer.

Activation Functions
- Take the net input to a neuron and operate on it to produce the neuron's output.
- Should be nonlinear functions for the NN to work as expected.
- Common choice: the logistic (or sigmoid) function.
Activation Functions (cont'd)
- Other well-known activation functions: the step function and the hyperbolic tangent function.

Bias
- Each neuron (except those in the input layer) has a bias associated with it.
- The bias term shifts the net input along the horizontal axis of the activation function, changing the threshold at which it activates.
- Value: always 1 or -1.
- The bias weight is adjusted just like the other weights.

Output
- The choice of outputs is also problem-specific.
- Same rule of thumb: keep the number of outputs to a minimum.
- Using a logistic function as the output activation, an output around 0.9 is considered activated (true) and one around 0.1 not activated (false).
- In practice, we may not even get close to these values, so a threshold has to be set; using the midpoint of the function (0.5) is a simple choice.
- If more than one output neuron is used, more than one output could be activated; it is easier to select just one output with a "winner-take-all" approach.
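The midpoint threshold and the winner-take-all selection can be sketched as:

```c
/* Single output: activated if at or above the logistic midpoint (0.5). */
int is_activated(double output)
{
    return output >= 0.5;
}

/* Winner-take-all: return the index of the output neuron with the
   highest activation, so exactly one behavior is selected. */
int winner_take_all(const double outputs[], int n)
{
    int best = 0;
    for (int i = 1; i < n; i++)
        if (outputs[i] > outputs[best])
            best = i;
    return best;
}
```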
Hidden Layer
- Some NNs have no hidden layer, others a few hidden layers – a design decision.
- The more hidden layers, the more features the network can handle, and vice versa.
- Increasing the number of features (dimensionality) can enable a better fit to the expected function.
Back-propagation Training
- Aim of training: find values for the weights connecting the neurons such that the input data generates the desired output values.
- Needs a training set.
- Done iteratively.
- It is an optimization process, so it requires some measure of merit: an error measure that needs to be minimized.
- Typical error measure: the mean square error.
Finding optimum weights iteratively
1. Start with a training set consisting of input data and desired outputs.
2. Initialize the weights in the NN to small random values.
3. For each set of input data, feed the network and calculate the output.
4. Compare the calculated output with the desired output and compute the error.
5. Adjust the weights to reduce the error, and repeat the process.

Each iteration is known as an "epoch".
Computing error
- The most common error measure is the mean square error: the average of the squared differences between the desired and calculated outputs.
- Goal: get the error value as small as possible.
- Iteratively adjust the weight values by calculating the error associated with each neuron in the output and hidden layers.
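The mean square error can be sketched as:

```c
/* Mean square error: the average of the squared differences between
   the desired and the calculated outputs. */
double mean_square_error(const double desired[], const double actual[], int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        double diff = desired[i] - actual[i];
        sum += diff * diff;
    }
    return sum / n;
}
```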

 Computing
error (cont’d)

Output neuron error

Hidden-layer neuron error

No error is associated with input layer neurons
because those neuron values are given
Can you observe how back-propagation is at work?
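The slide's equations did not survive the transcript; for a logistic activation, the standard back-propagation error terms look like the sketch below (the derivative of the logistic function is output × (1 − output)):

```c
/* Output-neuron error: logistic derivative times the difference between
   the desired and the calculated output. */
double output_delta(double output, double desired)
{
    return output * (1.0 - output) * (desired - output);
}

/* Hidden-neuron error: logistic derivative times the weighted sum of
   the error terms from the layer above -- this is where the error is
   propagated backwards through the network. */
double hidden_delta(double hidden, const double deltas_above[],
                    const double weights_to_above[], int n_above)
{
    double sum = 0.0;
    for (int k = 0; k < n_above; k++)
        sum += deltas_above[k] * weights_to_above[k];
    return hidden * (1.0 - hidden) * sum;
}
```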

Adjusting weights
- Calculate a suitable adjustment for each weight in the network.
- Adjustment to each weight: New weight = Old weight + Δw
- Adjustments are made for each individual weight.
- The learning rate p is a multiplier that affects how much each weight is adjusted.
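The update rule can be sketched as follows; the form Δw = p × δ × input (δ being the receiving neuron's error term, input the value carried along the connection) is an assumption consistent with standard logistic back-propagation:

```c
/* Apply one weight adjustment: New weight = Old weight + Δw, where
   Δw = p * delta * input (p: learning rate, delta: the receiving
   neuron's error term, input: the value sent along the connection). */
void adjust_weight(double *weight, double p, double delta, double input)
{
    *weight += p * delta * input;
}
```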
Adjusting weights (cont'd)
- Setting p too high might overshoot the optimum weights.
- Setting p too low means training might take too long.
- Special techniques: adding "momentum" (see textbook), or regularization (another technique).
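The momentum technique adds a fraction of the previous adjustment to the current one, which helps the weights keep moving through flat regions of the error surface; a minimal sketch (the momentum factor is an assumption chosen by the designer):

```c
/* Weight update with momentum: a fraction of the previous adjustment
   (prev_dw) carries over into the current one. */
void update_with_momentum(double *weight, double *prev_dw,
                          double p, double delta, double input,
                          double momentum)
{
    double dw = p * delta * input + momentum * (*prev_dw);
    *weight += dw;
    *prev_dw = dw;    /* remembered for the next iteration */
}
```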
Earlier example
- Flocking and Chasing: a flock of units chasing the player.

Applying neural networks
- To decide whether to chase the player, evade him, or flock with the other AI units.
- Simplistic methods: the creature always attacks the player, OR a FSM "brain" (or other decision-making method) decides between those actions based on conditions.
Neural Networks
- Advantage: not only making decisions, but adapting behavior based on the units' experience with attacking the player.
- A "feedback" mechanism is useful to model "experience", so that subsequent decisions can be improved or made "smarter".

How it works (example)
- Assume we have 20 AI units moving on the screen.
- Behaviors: Chase, Evade, Flock with other units.
Combat mode
- When the player and AI units come within a specified radius of one another, they are assumed to be in combat.
- Combat is not simulated; instead, a simple system is used whereby AI units lose a number of hit points every turn through the game loop.
- The player also loses a number of hit points proportional to the number of AI units.
- A unit dies when its HP reaches 0, and is respawned.

"Brain"
- All AI units share the same "brain".
- The brain evolves as the units gain experience with the player.
- Back-propagation is implemented so that the NN's weights can be adjusted in real time.
- Assume all AI units evolve collectively.
Expectations
- The AI becomes more aggressive if the player is weak.
- The AI becomes more withdrawn if the player is strong.
- The AI learns to stay in the flock to have a better chance of defeating the player.

Initialize values for the neural network
- Number of neurons in each layer: 4 inputs, 3 hidden neurons, 3 output neurons.
Preparation for training
- Initialize the learning rate to 0.2 – tuned by trial-and-error with the aim of keeping the training time down while maintaining accuracy.
- Data is dumped into a text file so that it can be referred to during debugging.
- Training loop – cycle through until:
  - the calculated error is less than some specified value, OR
  - the number of iterations reaches a specified maximum.
Sample training data for the NN

double TrainingSet[14][7] = {
  // #Friends, Hit points, Enemy Engaged, Range, Chase, Flock, Evade
     0,        1,          0,             0.2,   0.9,   0.1,   0.1,
     0,        1,          1,             0.2,   0.9,   0.1,   0.1,
     0,        1,          0,             0.8,   0.1,   0.1,   0.1,
     ....

- 14 sets of input and output values.
- All data values are normalized to the range 0.0–1.0.
- Use 0.1 for an inactive (false) output and 0.9 for an active (true) output – it is impractical to achieve exactly 0 or 1 from a NN output, so use reasonable target values.
Training data was chosen empirically
- Assume a few arbitrary input conditions and then specify a reasonable response.
- In practice, you are likely to design more training sets than shown in this example.

Training loop
- The error is initialized to 0 and calculated for each "epoch" (one pass through all 14 sets of inputs/outputs).
- For each set of data:
  1. Feed-forward is performed.
  2. The error is calculated and accumulated.
  3. Back-propagation adjusts the connection weights.
- The average error is then calculated (divide by 14).
Updating AI units – cycle through all of them
- Calculate the distance from the current unit to the target.
- Check whether the target has been killed. If it has, check where the current unit is in relation to the target (whether it is within combat range). If it is, retrain the NN to reinforce the chase behavior (the unit is doing something right, so train it to be more aggressive). Otherwise, retraining the NN reinforces the other behaviors.
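The retraining decision can be sketched as below; the 0.9/0.1 target encoding follows the training data, while the choice of which "other" behavior to reinforce (flocking here) is an assumption for illustration:

```c
#define ACTIVE   0.9
#define INACTIVE 0.1

enum { CHASE, FLOCK, EVADE };

/* Build the desired-output vector used to retrain the shared "brain"
   once the target is killed: units that were within combat range get
   the chase behavior reinforced (they were doing something right);
   the rest get another behavior reinforced instead. */
void retrain_targets(int in_combat_range, double desired[3])
{
    desired[CHASE] = desired[FLOCK] = desired[EVADE] = INACTIVE;
    if (in_combat_range)
        desired[CHASE] = ACTIVE;   /* train it to be more aggressive */
    else
        desired[FLOCK] = ACTIVE;   /* reinforce a non-chase behavior */
}
```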
Use the trained NN for real-time decision making
- Under the current set of conditions in real time, the output shows which behavior the unit should take.
- REMEMBER: the input values have to be consistently normalized before being fed through the NN!
- Feed-forward is applied.
- The output values are then examined to derive the proper choice of behavior.
  - A simple way is to just select the output with the highest activation.
Some outcomes of this AI
- If the target is left to die without inflicting much damage on the units, the AI units adapt to attack more often (the target is perceived as weak).
- If the target inflicts massive damage, the AI units adapt to avoid the target more (the target is perceived as strong).
- The AI units also adapt to flock together when faced with a strong target.

Some outcomes of this AI (cont'd)
- Interesting emergent behavior: leaders emerge in the flocks, and the intermediate and trailing units follow the lead. Q: How is it possible to design such behaviors?