Artificial Neural Networks

The Brain
* How do brains work?
* How do human brains differ from those of other animals?
* Can we base models of artificial intelligence on the structure and inner workings of the brain?

The Brain
* The human brain consists of approximately 10 billion neurons and 60 trillion connections (synapses)
* The brain is a highly complex, nonlinear, parallel information-processing system
* By firing many neurons simultaneously, the brain can perform some tasks faster than the fastest computers in existence today

[Figure: a biological neuron, showing the soma, dendrites, axon, and synapses]

* An individual neuron has a very simple structure:
  - the cell body is called a soma
  - the small connective fibers are called dendrites
  - the single long fiber is called an axon
* An army of such elements constitutes tremendous processing power

Artificial Neural Networks
* An artificial neural network consists of a number of very simple processors called neurons
* Neurons are connected by weighted links
* The links pass signals from one neuron to another based on predefined thresholds

Artificial Neural Networks
* An individual neuron (McCulloch & Pitts, 1943):
  - computes the weighted sum of the input signals:
    X = x1w1 + x2w2 + ... + xnwn
  - compares the result with a threshold value, θ
  - if the net input is less than the threshold, the neuron output is -1 (or 0); otherwise, the neuron becomes activated and its output is +1

[Figure: a single neuron with input signals x1 ... xn, weights w1 ... wn, threshold θ, and output signal Y]

Activation Functions
* Each neuron applies an activation function, which determines whether it propagates its signal (i.e. activates) or not
* Common activation functions include the step, sign, and sigmoid functions; the step and sign functions are often called hard limit functions
* We use such functions in decision-making neural networks that support classification and other pattern-recognition tasks
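To make the McCulloch & Pitts model concrete, here is a minimal Python sketch (not from the original lecture; the function names are ours) of a single neuron that computes the weighted sum of its inputs and applies one of the activation functions above:

```python
import math

def net_input(inputs, weights):
    """Weighted sum of the input signals: X = x1*w1 + ... + xn*wn."""
    return sum(x * w for x, w in zip(inputs, weights))

def step(x, theta):
    """Hard-limit step activation: 1 if X >= theta, else 0."""
    return 1 if x >= theta else 0

def sign(x, theta):
    """Hard-limit sign activation: +1 if X >= theta, else -1."""
    return 1 if x >= theta else -1

def sigmoid(x, theta):
    """Soft activation: squashes X - theta into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(x - theta)))

# A single neuron with two inputs and threshold 0.2
X = net_input([1, 0], [0.4, 0.3])   # X = 0.4
print(sign(X, theta=0.2))           # +1: net input exceeds the threshold
```

Swapping sign for sigmoid turns this hard-limiting neuron into the soft neuron used later for back-propagation.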
Perceptrons
* Can an individual neuron learn?
* In 1958, Frank Rosenblatt introduced a training algorithm that provided the first procedure for training a single-node neural network
* Rosenblatt's perceptron model consists of a single neuron with adjustable synaptic weights, followed by a hard limiter

[Figure: a two-input perceptron; inputs x1 and x2 feed a linear combiner (X = x1w1 + x2w2) followed by a hard limiter with threshold θ, producing output Y = Ystep(X)]

Perceptrons
* A perceptron:
  - classifies inputs x1, x2, ..., xn into one of two distinct classes, A1 and A2
  - forms a linearly separable decision boundary defined by:
    x1w1 + x2w2 - θ = 0
* A perceptron with three inputs x1, x2, and x3 classifies its inputs into two distinct sets A1 and A2, separated by the plane:
    x1w1 + x2w2 + x3w3 - θ = 0

[Figure: the decision boundary as a straight line in the (x1, x2) plane separating Class A1 from Class A2, and as a plane in three dimensions for three inputs]

Perceptrons
* How does a perceptron learn?
  - A perceptron starts with initial (often random) weights, typically in the range [-0.5, 0.5]
  - Apply an established training dataset
  - Calculate the error as expected output minus actual output: e = Yexpected - Yactual
  - Adjust the weights to reduce the error

Perceptrons
* How do we adjust a perceptron's weights to produce Yexpected?
* If e is positive, we need to increase Yactual (and vice versa)
* Use this formula:
    wi = wi + Δwi,  where Δwi = α · xi · e
  α is the learning rate (between 0 and 1) and e is the calculated error

Perceptron Example – AND
* Train a perceptron to recognize logical AND
* Use threshold θ = 0.2 and learning rate α = 0.1
* Repeat until convergence, i.e. until the final weights do not change and there is no error for a full epoch

Perceptron Example – AND
[Figure: two-dimensional plot of the logical AND operation; a straight line separates input (1, 1), for which the output is 1, from the other inputs]
* A single perceptron can be trained to recognize any linearly separable function
* Can we train a perceptron to recognize logical OR?
* How about logical exclusive-OR (i.e. XOR)?

Perceptron – OR and XOR
[Figure: two-dimensional plots of (b) OR (x1 ∨ x2), which is linearly separable, and (c) Exclusive-OR (x1 ⊕ x2), which is not]

Perceptron Coding Exercise
* Write code (one possible sketch follows this list) to:
  - calculate the error at each step
  - modify the weights if necessary, i.e. if the error is non-zero
  - loop until all error values are zero for a full epoch
* Modify your code to learn to recognize the logical OR operation
* Try to recognize the XOR operation...
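A possible solution sketch for the exercise above, in Python, assuming the step activation with threshold θ = 0.2 and the weight-update rule Δwi = α · xi · e from the slides (the function name and structure are ours, not from the lecture):

```python
import random

def train_perceptron(training_data, alpha=0.1, theta=0.2, max_epochs=100):
    """Train a two-input perceptron; returns the learned weights.

    training_data: list of ((x1, x2), expected_output) pairs.
    """
    weights = [random.uniform(-0.5, 0.5) for _ in range(2)]
    for epoch in range(max_epochs):
        any_error = False
        for inputs, expected in training_data:
            X = sum(x * w for x, w in zip(inputs, weights))
            actual = 1 if X >= theta else 0          # step activation
            error = expected - actual
            if error != 0:
                any_error = True
                # Perceptron learning rule: w_i <- w_i + alpha * x_i * e
                weights = [w + alpha * x * error
                           for w, x in zip(weights, inputs)]
        if not any_error:                            # full epoch with no error
            break
    return weights

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
OR  = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train_perceptron(AND))   # converges: AND is linearly separable
print(train_perceptron(OR))    # converges: OR is linearly separable
```

For AND and OR the loop exits early once a full epoch produces no errors; for XOR it exhausts max_epochs without converging, since XOR is not linearly separable. This motivates the multilayer networks that follow.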
Multilayer Neural Networks
* Multilayer neural networks consist of:
  - an input layer of source neurons
  - one or more hidden (middle) layers of computational neurons
  - an output layer of computational neurons
* Input signals are propagated in a layer-by-layer feedforward manner

[Figure: a multilayer feedforward network; input signals enter the input layer, pass through the middle layer, and leave the output layer as output signals]

Multilayer Neural Networks
[Figure: a deeper network with an input layer, a first hidden layer, a second hidden layer, and an output layer]
* The net input to the first hidden neuron is the weighted sum of the input signals:
    XH1 = x1w11 + x2w21 + ... + xiwi1 + ... + xnwn1
* Likewise, the net input to the first output neuron is the weighted sum of the hidden-layer outputs:
    XOUTPUT = yH1w11 + yH2w21 + ... + yHjwj1 + ... + yHmwm1

Multilayer Neural Networks
[Figure: a three-layer network; inputs x1 and x2 feed hidden neurons 3 and 4 through weights w13, w23, w14, w24 (thresholds θ3, θ4), and the hidden neurons feed output neuron 5 through weights w35, w45 (threshold θ5), producing output y5]

Multilayer Neural Networks
* Commercial-quality neural networks often incorporate 4 or more layers; each layer consists of about 10 to 1000 individual neurons
* Experimental and research-based neural networks often use 5 or 6 (or more) layers; overall, millions of individual neurons may be used

Back-Propagation NNs
* A back-propagation neural network is a multilayer neural network that propagates error backwards through the network as it learns
* Weights are modified based on the calculated error
* Training is complete when the error falls below a specified threshold, e.g. less than 0.001

[Figure: a back-propagation network; input signals x1 ... xn flow forward through hidden-layer weights wij and output-layer weights wjk to outputs y1 ... yl, while error signals flow backward]

* Use the sigmoid activation function, and apply each threshold θ by connecting a fixed input of -1 through a weight equal to θ

Step 1: Initialization
Set the weights and threshold levels to their initial values. In the worked example below:
    w13 = 0.5, w14 = 0.9, w23 = 0.4, w24 = 1.0, w35 = -1.2, w45 = 1.1,
    θ3 = 0.8, θ4 = -0.1, θ5 = 0.3

Step 2: Activation
Activate the back-propagation neural network by applying inputs x1(p), x2(p), ..., xn(p) and desired outputs yd,1(p), yd,2(p), ..., yd,n(p).

(a) Calculate the actual outputs of the neurons in the hidden layer:

    yj(p) = sigmoid( Σi=1..n xi(p) · wij(p) - θj )

where n is the number of inputs of neuron j in the hidden layer, and sigmoid is the sigmoid activation function.

(b) Calculate the actual outputs of the neurons in the output layer:

    yk(p) = sigmoid( Σj=1..m xjk(p) · wjk(p) - θk )

where m is the number of inputs of neuron k in the output layer, and xjk(p) = yj(p) is the signal from hidden neuron j.

Worked example: we consider a training set where inputs x1 and x2 are equal to 1 and the desired output yd,5 is 0. The actual outputs of neurons 3 and 4 in the hidden layer are calculated as:

    y3 = sigmoid(x1w13 + x2w23 - θ3) = 1 / (1 + e^-(1·0.5 + 1·0.4 - 1·0.8)) = 0.5250
    y4 = sigmoid(x1w14 + x2w24 - θ4) = 1 / (1 + e^-(1·0.9 + 1·1.0 + 1·0.1)) = 0.8808

Now the actual output of neuron 5 in the output layer is determined as:

    y5 = sigmoid(y3w35 + y4w45 - θ5) = 1 / (1 + e^-(-0.5250·1.2 + 0.8808·1.1 - 1·0.3)) = 0.5097

Thus, the following error is obtained:

    e = yd,5 - y5 = 0 - 0.5097 = -0.5097

Step 3: Weight training
Update the weights in the back-propagation network, propagating backward the errors associated with the output neurons.

(a) Calculate the error gradient for the neurons in the output layer:

    δk(p) = yk(p) · [1 - yk(p)] · ek(p),  where  ek(p) = yd,k(p) - yk(p)

Calculate the weight corrections:

    Δwjk(p) = α · yj(p) · δk(p)

Update the weights at the output neurons:

    wjk(p+1) = wjk(p) + Δwjk(p)

(b) Calculate the error gradient for the neurons in the hidden layer:

    δj(p) = yj(p) · [1 - yj(p)] · Σk=1..l δk(p) · wjk(p)

Calculate the weight corrections:

    Δwij(p) = α · xi(p) · δj(p)

Update the weights at the hidden neurons:

    wij(p+1) = wij(p) + Δwij(p)

Worked example (continued): to update the weights and threshold levels in our network, we propagate the error, e, from the output layer backward to the input layer.

First, we calculate the error gradient for neuron 5 in the output layer:

    δ5 = y5 · (1 - y5) · e = 0.5097 · (1 - 0.5097) · (-0.5097) = -0.1274

Then we determine the weight corrections, assuming that the learning rate parameter, α, is equal to 0.1:

    Δw35 = α · y3 · δ5 = 0.1 · 0.5250 · (-0.1274) = -0.0067
    Δw45 = α · y4 · δ5 = 0.1 · 0.8808 · (-0.1274) = -0.0112
    Δθ5  = α · (-1) · δ5 = 0.1 · (-1) · (-0.1274) = 0.0127

Next we calculate the error gradients for neurons 3 and 4 in the hidden layer:

    δ3 = y3 · (1 - y3) · δ5 · w35 = 0.5250 · (1 - 0.5250) · (-0.1274) · (-1.2) = 0.0381
    δ4 = y4 · (1 - y4) · δ5 · w45 = 0.8808 · (1 - 0.8808) · (-0.1274) · 1.1 = -0.0147

We then determine the weight corrections:

    Δw13 = α · x1 · δ3 = 0.1 · 1 · 0.0381 = 0.0038
    Δw23 = α · x2 · δ3 = 0.1 · 1 · 0.0381 = 0.0038
    Δθ3  = α · (-1) · δ3 = 0.1 · (-1) · 0.0381 = -0.0038
    Δw14 = α · x1 · δ4 = 0.1 · 1 · (-0.0147) = -0.0015
    Δw24 = α · x2 · δ4 = 0.1 · 1 · (-0.0147) = -0.0015
    Δθ4  = α · (-1) · δ4 = 0.1 · (-1) · (-0.0147) = 0.0015

At last, we update all weights and thresholds:

    w13 = w13 + Δw13 = 0.5 + 0.0038 = 0.5038
    w14 = w14 + Δw14 = 0.9 - 0.0015 = 0.8985
    w23 = w23 + Δw23 = 0.4 + 0.0038 = 0.4038
    w24 = w24 + Δw24 = 1.0 - 0.0015 = 0.9985
    w35 = w35 + Δw35 = -1.2 - 0.0067 = -1.2067
    w45 = w45 + Δw45 = 1.1 - 0.0112 = 1.0888
    θ3 = θ3 + Δθ3 = 0.8 - 0.0038 = 0.7962
    θ4 = θ4 + Δθ4 = -0.1 + 0.0015 = -0.0985
    θ5 = θ5 + Δθ5 = 0.3 + 0.0127 = 0.3127

The training process is repeated until the sum of squared errors is less than 0.001.
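The arithmetic above is easy to check in code. Here is a minimal Python sketch (not part of the original lecture; variable names such as t3 for θ3 are ours) that runs one forward and one backward pass of the 2-2-1 network with the initial weights from Step 1:

```python
import math

def sigmoid(x):
    """Sigmoid activation: squashes x into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Initial weights and thresholds (Step 1)
w13, w14, w23, w24 = 0.5, 0.9, 0.4, 1.0
w35, w45 = -1.2, 1.1
t3, t4, t5 = 0.8, -0.1, 0.3
alpha = 0.1

# Training pattern from the worked example
x1, x2, yd5 = 1, 1, 0

# Step 2: activation (forward pass)
y3 = sigmoid(x1 * w13 + x2 * w23 - t3)   # 0.5250
y4 = sigmoid(x1 * w14 + x2 * w24 - t4)   # 0.8808
y5 = sigmoid(y3 * w35 + y4 * w45 - t5)   # 0.5097
e = yd5 - y5                             # -0.5097

# Step 3: error gradients, computed before any weight is updated
d5 = y5 * (1 - y5) * e                   # -0.1274
d3 = y3 * (1 - y3) * d5 * w35            #  0.0381
d4 = y4 * (1 - y4) * d5 * w45            # -0.0147

# Weight and threshold corrections: delta_w = alpha * input * delta
# (each threshold is treated as a weight on a fixed input of -1)
w35 += alpha * y3 * d5                   # -1.2067
w45 += alpha * y4 * d5                   #  1.0888
t5 += alpha * -1 * d5                    #  0.3127
w13 += alpha * x1 * d3                   #  0.5038
w23 += alpha * x2 * d3                   #  0.4038
t3 += alpha * -1 * d3                    #  0.7962
w14 += alpha * x1 * d4                   #  0.8985
w24 += alpha * x2 * d4                   #  0.9985
t4 += alpha * -1 * d4                    # -0.0985

print(f"y5 = {y5:.4f}, error = {e:.4f}") # y5 = 0.5097, error = -0.5097
```

Each inline comment shows the value the preceding line produces, rounded to four decimals; they match the worked example throughout, including the sign of θ4 = -0.0985.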