Csci 2003 – Intro to Artificial Intelligence
Lecture 2 on Neural Networks – Perceptrons

Learning outcomes

After this lecture you will:
- be able to multiply matrices
- be able to model a perceptron and know the limitations of this model
- be able to describe the perceptron learning rule
- know the significant difference between single-layer and multi-layer perceptrons

The perceptron and its limitations

How do we create artificial neurons? We have inputs to a unit (with changeable weights) and the unit fires a value given by a function.

[Figure: an artificial neuron. Inputs x1, ..., xn arrive on connections with weights w1, ..., wn, the weighted inputs are summed to give S, and the unit outputs O = F(S).]

Here the x values are the inputs, the w values are the weights, and O is the output. A perceptron is a particularly simple neuron: its output is calculated by a variant of the hard-limit function. There is a threshold value T where

F(S) = 0 if S < T
F(S) = 1 if S >= T

We can define such an F via the hardlimit function

hardlimit(x) = 0 if x < 0
hardlimit(x) = 1 otherwise

Then F can be written as F(S) = hardlimit(S - T). In Matlab we usually call the hardlimit function hardlim. In general we can have more than one neural unit in a layer.

A calculation with a two-input perceptron with a single neuron

Inputs are vectors x with two values [2 by 1 vectors]. The weight matrix w is a [1 by 2] matrix. There is a [1 by 1] bias b (don't worry about this for now).

[Figure: a single neuron with two inputs. Inputs x1 and x2 arrive with weights w1 and w2, are summed to give S, and the unit outputs O = F(S).]

The input to the hardlim is wx + b, so the perceptron computes hardlim(wx + b). Set w = [0.3 -0.2] and b = [0.5], and let's do some computations. Suppose x = [1 2]', [-1 -2]', [-2 4]'.

[need plenty of space for the three calculations – a Matlab sketch appears at the end of this handout]

Training a perceptron

We use supervised learning and the adapt method with the perceptron learning rule. We have input patterns x with desired output vector tx. The actual output of the network when we put in x is net(x). Define the error vector e as

e = tx - net(x).

This is simply the difference between what we want and what we get when we push x into the net. If the net produces the correct output we don't need to make any changes, but if it is wrong we modify the weights and bias as follows:

new w = old w + e x'
new b = old b + e

(Here x' is the transpose of x.) We cycle through each of the input vectors in turn, modifying the weights where necessary, until the perceptron does what we want. We hope that the problem can be solved – if there is a solution, this procedure will find it.

Example

Let's train a perceptron to discriminate between two points:
[1; 0] – output 1
[0; 1] – output 0
Let's create a perceptron with random w and b and train it using adapt.

[plenty of space needed – a Matlab sketch appears at the end of this handout]

What can a single perceptron learn?

Only linearly separable sets of data.

An example which a single perceptron cannot learn is the exclusive-or function:

x   y   out
0   0   0
1   0   1
0   1   1
1   1   0

Table 2.1: The exclusive-or problem

[Figure: the four points plotted in the (x, y) plane. The two points with out = 1, at (1, 0) and (0, 1), cannot be separated from the two points with out = 0, at (0, 0) and (1, 1), by a single straight line.]

For a single neuron with a threshold activation function, two inputs, and one output, it can be shown that no choice of weights and bias produces a solution. Briefly: we would need b < 0 (so that (0, 0) gives output 0), w1 + b >= 0 and w2 + b >= 0 (for (1, 0) and (0, 1)), and w1 + w2 + b < 0 (for (1, 1)); but adding the middle two conditions gives w1 + w2 + b >= -b > 0, a contradiction. Most real-world problems are not linearly separable. If more layers are allowed we can solve problems which aren't linearly separable, e.g. see the separate handout (a small exclusive-or sketch also appears at the end of this one).

Why matrices?

Let's generalise to a three-neuron perceptron layer taking two-dimensional input. Now w is a [3 by 2] matrix and b is a [3 by 1] vector, and we still compute hardlim(wx + b). (The matrices should all match up when we do the addition. Note that we now have vector output.)
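
To make the vector output concrete, here is a minimal Matlab sketch of such a three-neuron layer. The weight and bias values are made up purely for illustration, and the toolbox is avoided by defining the hard-limit function as an anonymous function matching hardlim (1 for inputs >= 0, otherwise 0).

    % Hard-limit activation, applied elementwise: 1 if s >= 0, else 0.
    % (A stand-in for the toolbox function hardlim.)
    hardlim_fn = @(s) double(s >= 0);

    w = [ 0.3 -0.2 ;          % [3 by 2] weight matrix: one row per neuron
          1.0  0.5 ;
         -0.4  0.1 ];
    b = [ 0.5 ; -1.0 ; 0.2 ]; % [3 by 1] bias: one entry per neuron

    x = [1 ; 2];              % a single [2 by 1] input

    o = hardlim_fn(w*x + b)   % [3 by 1] output: one 0/1 value per neuron

With these values w*x + b = [0.4; 1.0; 0.0], so all three neurons output 1.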
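Returning to the single-neuron example with w = [0.3 -0.2] and b = 0.5, this sketch works through the three calculations left as space earlier (hardlim_fn is the same stand-in for hardlim):

    hardlim_fn = @(s) double(s >= 0);   % stand-in for hardlim

    w = [0.3 -0.2];          % [1 by 2] weight matrix
    b = 0.5;                 % bias

    X = [ 1 -1 -2 ;          % the three inputs from the text,
          2 -2  4 ];         % stored as columns

    for k = 1:3
        x = X(:,k);
        s = w*x + b;                    % net input to the neuron
        fprintf('x = [%2d %2d]''   wx+b = %5.2f   output = %d\n', ...
                x(1), x(2), s, hardlim_fn(s));
    end

The net inputs are 0.4, 0.6, and -0.9, so the outputs are 1, 1, and 0: only the third input falls on the 0 side of the decision boundary.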
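The lecture trains the net with the toolbox's adapt; as a sketch of what adapt does in this case, here is the cycle written out by hand with the perceptron learning rule for the two-point example, starting from random w and b:

    hardlim_fn = @(s) double(s >= 0);   % stand-in for hardlim

    X = [1 0 ;                % inputs as columns: [1; 0] and [0; 1]
         0 1];
    t = [1 0];                % desired outputs

    w = rand(1,2);            % random initial [1 by 2] weights
    b = rand;                 % random initial bias

    for pass = 1:20                           % cycle through the inputs
        wrong = 0;
        for k = 1:size(X,2)
            x = X(:,k);
            e = t(k) - hardlim_fn(w*x + b);   % error = want - got
            w = w + e*x';                     % new w = old w + e x'
            b = b + e;                        % new b = old b + e
            wrong = wrong + abs(e);
        end
        if wrong == 0, break, end             % every pattern correct: stop
    end
    w, b                      % trained weights and bias

Because the two points are linearly separable, the loop stops after a few passes; the cap of 20 passes is only a safeguard.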
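The multi-layer case is covered in the separate handout; purely as an illustration that two layers are enough for exclusive-or, here is one hand-picked set of weights (these particular values are my own choice, not taken from the handout). The first hidden neuron computes "x or y", the second computes "x and y", and the output neuron fires when the first is on but the second is off.

    hardlim_fn = @(s) double(s >= 0);   % stand-in for hardlim

    w1 = [1 1 ;               % hidden layer: row 1 fires for "x or y",
          1 1];               % row 2 fires for "x and y"
    b1 = [-0.5 ; -1.5];
    w2 = [1 -2];              % output layer: "or, but not and"
    b2 = -0.5;

    X = [0 1 0 1 ;            % the four exclusive-or inputs as columns
         0 0 1 1];

    for k = 1:4
        x = X(:,k);
        h = hardlim_fn(w1*x + b1);      % [2 by 1] hidden output
        o = hardlim_fn(w2*h + b2);      % final output
        fprintf('x = %d  y = %d  out = %d\n', x(1), x(2), o);
    end

This prints 0, 1, 1, 0 for the four rows of Table 2.1, which no single-layer perceptron can produce.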