Activity:
• See the figure of the biological neuron
• Draw a figure of a Perceptron
• Explain the structure of the Perceptron and the meaning of its components
Artificial Neural Networks (ANNs) have biological inspiration. The basic component of brain
circuitry is a specialised cell called the neuron, illustrated in Fig. 2-1-1.
Neurons are electrically excitable and the cell body can generate electrical signals, called
action potentials, which are propagated down the axon towards the nerve terminal. The
electrical signal propagates only in this direction and it is an all-or-none event. Information is
coded in the frequency of the signal.
Fig. 2-1-1. A biological neuron
The nerve terminal is close to the dendrites or cell bodies of other neurons, forming special
junctions called synapses. Circuits can therefore be formed by a number of neurons. Any
particular neuron has many inputs (some receive nerve terminals from hundreds or thousands
of other neurons), each one with a different strength. The neuron integrates the strengths and
fires action potentials accordingly.
The input strengths are not fixed, but vary with use. The mechanisms behind this modification
are now beginning to be understood. These changes in the input strengths are thought to be
particularly relevant to learning and memory.
Inspired by these biological fundamentals, Rosenblatt [19] introduced a neuron model called
the Perceptron, illustrated in Fig. 2-1-2.
Fig. 2-1-2. Perceptron
In the Perceptron there is a set of synapses or connecting links, each link connecting one
input to the summation block. Associated with each synapse there is a strength or weight,
which multiplies the associated input signal. The input signals are integrated in the neuron:
usually an adder is employed to compute a weighted summation of the input signals. Note
also the existence of a threshold or bias term associated with the neuron. This can be
associated with the adder, or considered as another input fixed at the value 1. The output of
this adder is denoted by s; a threshold function is applied to s to obtain the output y.
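To make this concrete, here is a minimal Python sketch of the forward computation just described; the function name and the example values are illustrative assumptions, not from the source.

    def perceptron_output(x, w, b):
        # The adder: a weighted summation of the inputs, with the bias b
        # treated as an extra input fixed at the value 1.
        s = sum(wi * xi for wi, xi in zip(w, x)) + b
        # The threshold function: output y is 1 if s is positive, else 0.
        return 1 if s > 0 else 0

    # Example: two inputs with weights 0.5 and -0.3 and bias -0.1.
    y = perceptron_output([1, 0], [0.5, -0.3], -0.1)   # s = 0.4, so y = 1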
Activity:
• Note the learning rule of the Perceptron
• Try to perform a weight modification by yourself in an example
The typical application of Rosenblatt's Perceptron was to activate an appropriate response
unit for a given input pattern or class of patterns. The inputs, outputs and training patterns
were binary valued (0 or 1). Although there are different versions of the Perceptron and
corresponding learning rules, the basic rule is to alter the values of the weights only on active
lines (the ones corresponding to a 1 input), and only when an error exists between the network
output and the desired output. Formally:
1. If the output is 1 and it should be 1, or if the output is 0 and it should be 0,
then do nothing
2. If the output is 0 and it should be 1, then increment the weights in all active
lines
3. If the output is 1 and it should be 0, then decrement the weights in all active
lines
The weight vector, w, is updated as:

w[k+1] = w[k] + ∆w,   (1)

where ∆w is the change made to the weight vector, with components:

∆w_i = α·(t[k] − y[k])·x_i[k],   (2)
where α is the learning rate, t[k] and y[k] are the desired and actual outputs, respectively, at
time k, and x_i[k] is the i-th element of the input vector at time k. Thus, the i-th component of
the weight vector at the (k+1)-th iteration is:

w_i[k+1] = w_i[k] + α·x_i[k], if the output is 0 and it should be 1
w_i[k+1] = w_i[k] − α·x_i[k], if the output is 1 and it should be 0
w_i[k+1] = w_i[k], if the output is correct   (3)
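To try the weight modification yourself, the following Python sketch applies rules (1)-(3) to the logical AND function as a toy problem; the training data, learning rate and number of epochs are illustrative assumptions, not from the source.

    def train_perceptron(patterns, alpha=0.1, epochs=20):
        # patterns: list of (input vector x, desired output t), binary valued.
        n = len(patterns[0][0])
        w = [0.0] * n   # weight vector, initially zero
        b = 0.0         # bias, treated as a weight on a constant 1 input
        for _ in range(epochs):
            for x, t in patterns:
                s = sum(wi * xi for wi, xi in zip(w, x)) + b
                y = 1 if s > 0 else 0
                # Rule (2): delta_w_i = alpha * (t - y) * x_i. Only active
                # lines (x_i = 1) change, and only when y differs from t.
                for i in range(n):
                    w[i] += alpha * (t - y) * x[i]
                b += alpha * (t - y)
        return w, b

    # Logical AND: the output should be 1 only when both inputs are 1.
    and_patterns = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
    w, b = train_perceptron(and_patterns)

With these settings the rule converges after a few epochs, since AND is linearly separable.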
Some variations have been made to this simple Perceptron model: some models do not
employ a bias; the inputs to the net may be real-valued or bipolar (+1, −1) as well as binary;
and the outputs may be bipolar.
Activity:
• See the many possibilities to organise the neurons in a network
Biological neural networks are densely interconnected, and the same holds for artificial
neural networks. According to the flow of the signals within an ANN, we can divide the
architectures into feedforward networks, if the signals flow only from input to output, and
recurrent networks, if loops are allowed. Another possible classification depends on the
existence of hidden neurons, i.e., neurons which are neither input nor output neurons. If there
are hidden neurons, we call the network a multilayer neural network; otherwise it is called a
single-layer neural network. Finally, if every neuron in one layer is connected to every neuron
in the layer immediately above, the network is called fully connected.
Activity:
• See the difference between the Perceptron and the modified Perceptron
Multilayer perceptrons (MLPs) are the most widely known type of ANN. Looking at the
Perceptron, we can see that it is not difficult to design a network structure of simple
Perceptrons, as long as each one is individually trained to implement a specific function.
However, designing the whole structure globally to implement a more complicated function
(or sometimes even simpler functions, like the XOR function) is impossible, because of the
hard nonlinearity of the Perceptron: the derivatives needed by the learning algorithm cannot
be calculated. This problem can be solved if the hard threshold (sign function) of the
Perceptron is replaced by a smoother, differentiable nonlinearity, such as a sigmoid or a
hyperbolic tangent function. This modified Perceptron is illustrated in Fig. 2-1-3 and is
applied as the processing unit in the Multilayer Perceptron (MLP), which is a feedforward
multilayer ANN presented in Fig. 2-1-4.
Fig. 2-1-3. Modified Perceptron
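To make the difference tangible, here is a small Python sketch comparing the hard threshold of the original Perceptron with a sigmoid on the same adder output s; the function names are illustrative assumptions, not from the source.

    import math

    def hard_threshold(s):
        # Original Perceptron: a step at zero; not differentiable there.
        return 1 if s > 0 else 0

    def sigmoid(s):
        # Modified Perceptron: smooth and differentiable everywhere.
        return 1.0 / (1.0 + math.exp(-s))

    # Near s = 0 the step jumps abruptly from 0 to 1, while the sigmoid
    # changes gradually, so a derivative exists to drive learning.
    print(hard_threshold(-0.01), hard_threshold(0.01))   # 0 1
    print(sigmoid(-0.01), sigmoid(0.01))                 # ~0.4975 ~0.5025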
Activity:
• Draw a figure of MLP
• Explain the components in MLP
• Calculate the output of the MLP for a given input
Fig. 2-1-4. Multilayer Perceptron
Viewed from a higher level, any given input vector propagated forward through the network
causes activations in the output layer. The overall network function, which maps the input
vector to the output vector, is determined by the network's weights. Every neuron in the
network is a simple processing unit, which computes its own output based on its inputs. In
the l-th layer the input of the i-th neuron is (Fig. 2-1-3):
s_i^(l) = Σ_{j ∈ pred(i)} y_j^(l−1) · w_ji^(l−1) + b_i^(l−1),   (4)
where pred(i) is the set of the preceding neurons of the i-th neuron, i.e. the set of those
neurons from which there is a connection to the investigated i-th neuron; w_ji is the weight
between the j-th and i-th neurons, and b_i is the bias of the neuron. Usually the bias is
considered as a weight coming from a "bias" unit having a constant 1 as its output value. In
Fig. 2-1-4 the bias units are denoted by the letter b and are connected to every processing
unit with a weight of their own.
The output of the i-th neuron can be calculated from the previously computed s_i^(l) by using
a non-linear function as mentioned above. Usually the sigmoid function is used, which
provides the output of the i-th neuron as follows:
y_i^(l) = 1 / (1 + e^(−s_i^(l))).   (5)
A nice property of this function is that its derivative can be calculated easily:
∂y_i/∂s_i = y_i (1 − y_i).   (6)
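As a sketch of how equations (4) and (5) combine into a full forward pass, the Python fragment below computes the output of a small MLP layer by layer; the 2-2-1 network and its weight values are made-up numbers for illustration only, not from the source.

    import math

    def sigmoid(s):
        # Equation (5): the sigmoid output function.
        return 1.0 / (1.0 + math.exp(-s))

    def mlp_forward(x, layers):
        # layers: a list of (W, b) pairs, one per layer, where W[j][i] is the
        # weight from neuron j of the previous layer to neuron i, and b[i] is
        # the bias of neuron i (equation (4), with pred(i) = all previous units).
        y = x
        for W, b in layers:
            s = [sum(y[j] * W[j][i] for j in range(len(y))) + b[i]
                 for i in range(len(b))]
            # Equation (5); note that the derivative needed later for learning
            # is simply y_i * (1 - y_i), as in equation (6).
            y = [sigmoid(si) for si in s]
        return y

    # A made-up 2-2-1 network: 2 inputs, one hidden layer of 2 neurons, 1 output.
    hidden = ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2])
    output = ([[1.0], [-1.0]], [0.0])
    print(mlp_forward([1.0, 0.0], [hidden, output]))   # approximately [0.572]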
The weights of the neural network can be determined by a learning algorithm based on
training patterns in order to realise a desired mapping using the network. The learning
algorithms will be introduced in a later lesson.