Csci 2003 – Intro to Artificial Intelligence
Lecture 2 on Neural Networks – Perceptrons

Learning outcomes

After this lecture you will:
- be able to multiply matrices
- be able to model a perceptron and know the limitations of this model
- be able to describe the perceptron learning rule
- know the significant difference between single-layer and multi-layer perceptrons

The perceptron and its limitations

How do we create artificial neurons? We have inputs to a unit (with changeable weights) and the unit fires a value given by a function.

[Figure: an artificial neuron. Inputs x1, ..., xn arrive on connections with weights w1, ..., wn, the weighted inputs are summed to give S, and the unit outputs O = F(S).]

Here the x values are the inputs, the w values are the weights, and O is the output. A perceptron is a particularly simple neuron: its output is calculated by a variant of the hard-limit function. There is a threshold value T where

F(S) = 0 if S < T
F(S) = 1 if S >= T

We can define such an F via the hardlimit function

hardlimit(x) = 0 if x < 0
hardlimit(x) = 1 otherwise

Then F can be written as F(S) = hardlimit(S - T). In Matlab we usually call the hardlimit function hardlim. In general we can have more than one neural unit in a layer.

A calculation with a two-input perceptron with a single neuron

Inputs are vectors x with two values [2 by 1 vectors]. The weight matrix w is a [1 by 2] matrix. There is a [1 by 1] bias b (don't worry about this for now).

[Figure: a single neuron with two inputs. Inputs x1 and x2 arrive with weights w1 and w2, are summed to give S, and the unit outputs O = F(S).]

The input to the hardlim is wx + b, so the perceptron computes hardlim(wx + b). Set w = [0.3 -0.2] and b = [0.5], and let's do some computations. Suppose x = [1 2]', [-1 -2]', [-2 4]'.

[need plenty of space for the three calculations – a Matlab sketch appears at the end of this handout]

Training a perceptron

We use supervised learning and the adapt method with the perceptron learning rule. We have input patterns x with desired output vector tx. The actual output of the network when we put in x is net(x). Define the error vector e as

e = tx - net(x).

This is simply the difference between what we want and what we get when we push x into the net. If the net produces the correct output we don't need to make any changes, but if it is wrong we modify the weights and bias as follows:

new w = old w + e x'
new b = old b + e

(Here x' is the transpose of x.) We cycle through each of the input vectors in turn, modifying the weights where necessary, until the perceptron does what we want. We hope that the problem can be solved – if there is a solution, this procedure will find it.

Example

Let's train a perceptron to discriminate between two points:
[1; 0] – output 1
[0; 1] – output 0
Let's create a perceptron with random w and b and train it using adapt.

[plenty of space needed – a Matlab sketch appears at the end of this handout]

What can a single perceptron learn?

Only linearly separable sets of data.

An example which a single perceptron cannot learn is the exclusive-or function:

x   y   out
0   0   0
1   0   1
0   1   1
1   1   0

Table 2.1: The exclusive-or problem

[Figure: the four points plotted in the (x, y) plane. The two points with out = 1, at (1, 0) and (0, 1), cannot be separated from the two points with out = 0, at (0, 0) and (1, 1), by a single straight line.]

For a single neuron with a threshold activation function, two inputs, and one output, it can be shown that no choice of weights and bias produces a solution. Briefly: we would need b < 0 (so that (0, 0) gives output 0), w1 + b >= 0 and w2 + b >= 0 (for (1, 0) and (0, 1)), and w1 + w2 + b < 0 (for (1, 1)); but adding the middle two conditions gives w1 + w2 + b >= -b > 0, a contradiction. Most real-world problems are not linearly separable. If more layers are allowed we can solve problems which aren't linearly separable, e.g. see the separate handout (a small exclusive-or sketch also appears at the end of this one).

Why matrices?

Let's generalise to a three-neuron perceptron layer taking two-dimensional input. Now w is a [3 by 2] matrix and b is a [3 by 1] vector, and we still compute hardlim(wx + b). (The matrices should all match up when we do the addition. Note that we now have vector output.)
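
To make the vector output concrete, here is a minimal Matlab sketch of such a three-neuron layer. The weight and bias values are made up purely for illustration, and the toolbox is avoided by defining the hard-limit function as an anonymous function matching hardlim (1 for inputs >= 0, otherwise 0).

    % Hard-limit activation, applied elementwise: 1 if s >= 0, else 0.
    % (A stand-in for the toolbox function hardlim.)
    hardlim_fn = @(s) double(s >= 0);

    w = [ 0.3 -0.2 ;          % [3 by 2] weight matrix: one row per neuron
          1.0  0.5 ;
         -0.4  0.1 ];
    b = [ 0.5 ; -1.0 ; 0.2 ]; % [3 by 1] bias: one entry per neuron

    x = [1 ; 2];              % a single [2 by 1] input

    o = hardlim_fn(w*x + b)   % [3 by 1] output: one 0/1 value per neuron

With these values w*x + b = [0.4; 1.0; 0.0], so all three neurons output 1.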
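Returning to the single-neuron example with w = [0.3 -0.2] and b = 0.5, this sketch works through the three calculations left as space earlier (hardlim_fn is the same stand-in for hardlim):

    hardlim_fn = @(s) double(s >= 0);   % stand-in for hardlim

    w = [0.3 -0.2];          % [1 by 2] weight matrix
    b = 0.5;                 % bias

    X = [ 1 -1 -2 ;          % the three inputs from the text,
          2 -2  4 ];         % stored as columns

    for k = 1:3
        x = X(:,k);
        s = w*x + b;                    % net input to the neuron
        fprintf('x = [%2d %2d]''   wx+b = %5.2f   output = %d\n', ...
                x(1), x(2), s, hardlim_fn(s));
    end

The net inputs are 0.4, 0.6, and -0.9, so the outputs are 1, 1, and 0: only the third input falls on the 0 side of the decision boundary.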
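The lecture trains the net with the toolbox's adapt; as a sketch of what adapt does in this case, here is the cycle written out by hand with the perceptron learning rule for the two-point example, starting from random w and b:

    hardlim_fn = @(s) double(s >= 0);   % stand-in for hardlim

    X = [1 0 ;                % inputs as columns: [1; 0] and [0; 1]
         0 1];
    t = [1 0];                % desired outputs

    w = rand(1,2);            % random initial [1 by 2] weights
    b = rand;                 % random initial bias

    for pass = 1:20                           % cycle through the inputs
        wrong = 0;
        for k = 1:size(X,2)
            x = X(:,k);
            e = t(k) - hardlim_fn(w*x + b);   % error = want - got
            w = w + e*x';                     % new w = old w + e x'
            b = b + e;                        % new b = old b + e
            wrong = wrong + abs(e);
        end
        if wrong == 0, break, end             % every pattern correct: stop
    end
    w, b                      % trained weights and bias

Because the two points are linearly separable, the loop stops after a few passes; the cap of 20 passes is only a safeguard.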
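The multi-layer case is covered in the separate handout; purely as an illustration that two layers are enough for exclusive-or, here is one hand-picked set of weights (these particular values are my own choice, not taken from the handout). The first hidden neuron computes "x or y", the second computes "x and y", and the output neuron fires when the first is on but the second is off.

    hardlim_fn = @(s) double(s >= 0);   % stand-in for hardlim

    w1 = [1 1 ;               % hidden layer: row 1 fires for "x or y",
          1 1];               % row 2 fires for "x and y"
    b1 = [-0.5 ; -1.5];
    w2 = [1 -2];              % output layer: "or, but not and"
    b2 = -0.5;

    X = [0 1 0 1 ;            % the four exclusive-or inputs as columns
         0 0 1 1];

    for k = 1:4
        x = X(:,k);
        h = hardlim_fn(w1*x + b1);      % [2 by 1] hidden output
        o = hardlim_fn(w2*h + b2);      % final output
        fprintf('x = %d  y = %d  out = %d\n', x(1), x(2), o);
    end

This prints 0, 1, 1, 0 for the four rows of Table 2.1, which no single-layer perceptron can produce.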