Csci 2003 – Intro to Artificial Intelligence
Neural Networks
Lecture 2 on Neural Networks - Perceptrons
Learning outcomes
After this lecture you will:

be able to multiply matrices

be able to model a perceptron and know the limitations of this model

be able to describe the perceptron learning rule

know the significant difference between single-layer and multi-layer perceptrons.
The perceptron and its limitations
How do we create artificial neurons? We have inputs to a unit (with changeable weights),
and the unit fires a value given by a function.
[Figure: an artificial neuron. Inputs x1, ..., xn arrive on connections with weights w1, ..., wn; the unit forms the weighted sum S = Σ wi xi and outputs O = F(S).]
Here the x values are the inputs, the w values are the weights, and O is the output.
A perceptron is a particularly simple neuron. The output is calculated by some variant of
the hardlimit function: there is a threshold value T where
F(S) = 0 if S < T
F(S) = 1 if S >= T.
We can define such an F via the hardlimit function:
hardlimit(x) = 0 if x < 0
hardlimit(x) = 1 otherwise.
Then F can be written as hardlimit(S - T).
In MATLAB this function is called hardlim.
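A minimal sketch in plain MATLAB (no toolbox needed), with an arbitrary threshold T = 0.5 chosen purely for illustration:

    hardlim = @(s) double(s >= 0);   % 1 if s >= 0, else 0
    T = 0.5;                         % threshold
    F = @(S) hardlim(S - T);         % F(S) = hardlim(S - T)
    F(0.3)   % returns 0, since 0.3 < T
    F(0.9)   % returns 1, since 0.9 >= T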
In general we can have more than one neural unit in a layer.
A calculation with a two-input perceptron with a single neuron.
Inputs are vectors x with two values [2 by 1 vectors]. The weight matrix w is a
[1 by 2] matrix. There is a [1 by 1] bias b (don't worry about this for now).
[Figure: a two-input artificial neuron. Inputs x1 and x2, with weights w1 and w2, feed the sum S = Σ wi xi; the output is O = F(S).]
The input to hardlim is wx + b, so the perceptron computes hardlim(wx + b).
Set w = [0.3 -0.2] and b = 0.5, and let's do some computations. Suppose x = [1 2]',
[-1 -2]' or [-2 4]'. Working through the arithmetic:
For x = [1 2]':   wx + b = 0.3 - 0.4 + 0.5 = 0.4,  so the output is 1.
For x = [-1 -2]': wx + b = -0.3 + 0.4 + 0.5 = 0.6,  so the output is 1.
For x = [-2 4]':  wx + b = -0.6 - 0.8 + 0.5 = -0.9, so the output is 0.
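These can be checked in plain MATLAB; a minimal sketch, defining hardlim directly rather than relying on the toolbox:

    hardlim = @(s) double(s >= 0);   % 1 if s >= 0, else 0
    w = [0.3 -0.2];  b = 0.5;
    hardlim(w*[1; 2] + b)     %  0.3 - 0.4 + 0.5 =  0.4  ->  1
    hardlim(w*[-1; -2] + b)   % -0.3 + 0.4 + 0.5 =  0.6  ->  1
    hardlim(w*[-2; 4] + b)    % -0.6 - 0.8 + 0.5 = -0.9  ->  0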
Training a perceptron
We use supervised learning and the adapt method with the perceptron learning rule.
We have input patterns x with desired output vector tx. The actual output of the network
when we put in x is net(x). Define error vector e as
e = tx – net(x).
This is simply the difference between what we want and what we get when we push x
into the net. If the net produces the correct output we don't need to make any changes –
but if it is wrong we modify the weights and bias as follows.
new w = old w + e x'   (x' is the transpose of x, so e x' has the same shape as w).
new b = old b + e.
We cycle through each of the input vectors in turn, modifying the weights and bias where
necessary, until the perceptron does what we want. By the perceptron convergence theorem,
if a solution exists this procedure will find one.
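As a sketch, here is one pass of the rule in plain MATLAB (the function name perceptronEpoch is just an illustrative choice, not a toolbox function):

    function [w, b] = perceptronEpoch(w, b, X, T)
    % One pass of the perceptron learning rule over a training set.
    % X holds one input pattern per column; T holds the matching targets.
    hardlim = @(s) double(s >= 0);
    for k = 1:size(X, 2)
        x = X(:, k);                      % current input pattern
        e = T(:, k) - hardlim(w*x + b);   % error = desired - actual
        w = w + e * x';                   % new w = old w + e x'
        b = b + e;                        % new b = old b + e
    end
    end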
Example
Let's train a perceptron to discriminate between two points:
[1; 0] – output 1.
[0; 1] – output 0.
Let's create a perceptron with random w and b and train it using adapt.
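The notes use the toolbox's adapt; as a hand-rolled sketch of the same loop in plain MATLAB (so the updates are visible), starting from random values:

    hardlim = @(s) double(s >= 0);
    X = [1 0; 0 1];          % the two points as columns: [1;0] and [0;1]
    T = [1 0];               % desired outputs
    rng(0);                  % fix the seed so the run is repeatable
    w = rand(1, 2) - 0.5;    % random initial weights
    b = rand - 0.5;          % random initial bias
    while any(hardlim(w*X + b) ~= T)      % until every pattern is correct
        for k = 1:2
            e = T(k) - hardlim(w*X(:, k) + b);
            w = w + e * X(:, k)';         % perceptron learning rule
            b = b + e;
        end
    end
    hardlim(w*X + b)         % now matches T = [1 0]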
What can a single perceptron learn?
Only linearly separable sets of data.
An example which a single perceptron cannot learn is the exclusive-or (XOR) function.
x   y   out
0   0   0
1   0   1
0   1   1
1   1   0
Table 2.1: The Exclusive-or Problem
[Figure: the four XOR points plotted in the (x, y) plane. The points with out = 1, at (1, 0) and (0, 1), lie on one diagonal; the points with out = 0, at (0, 0) and (1, 1), lie on the other, so no single straight line separates the 1s from the 0s.]
For a single neuron with a threshold activation function, two inputs and one output, it
can be shown that no choice of the two weights (and bias) produces a solution.
Most real-world problems are not linearly separable.
If more layers are allowed, we can solve problems which aren't linearly separable
(e.g. see the separate handout).
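As a concrete illustration (hand-built weights, not taken from the handout): a two-layer network of hardlim units solves exclusive-or. The hidden neurons compute OR and NAND, and the output neuron ANDs them:

    hardlim = @(s) double(s >= 0);
    W1 = [ 1  1; -1 -1];   b1 = [-0.5; 1.5];   % hidden layer: OR, NAND
    W2 = [ 1  1];          b2 = -1.5;          % output layer: AND
    net = @(x) hardlim(W2 * hardlim(W1*x + b1) + b2);
    [net([0;0]) net([1;0]) net([0;1]) net([1;1])]   % -> 0 1 1 0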
Why matrices?
Let's generalise to a three-neuron perceptron layer taking two-dimensional input.
So w is [3 by 2] and b is [3 by 1], and we still compute hardlim(wx + b). (The matrix
shapes all match up when we do the addition. Note that we now have vector output.)
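A quick sketch with made-up numbers shows the shapes lining up and the vector output:

    hardlim = @(s) double(s >= 0);
    w = [0.3 -0.2; 0.1 0.4; -0.5 0.2];   % [3 by 2]: one row per neuron
    b = [0.5; -0.1; -0.2];               % [3 by 1]: one bias per neuron
    x = [1; 2];                          % [2 by 1] input
    hardlim(w*x + b)                     % [3 by 1] output vector: [1; 1; 0]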