Technological Educational Institute Of Crete
Department Of Applied Informatics and Multimedia
Neural Networks Laboratory
Introduction To Neural Networks
Prof. George Papadourakis, Ph.D.
Historical Background
The development of neural networks dates back to the early 1940s. The
field experienced an upsurge in popularity in the late 1980s, as a result
of newly discovered techniques and general advances in computer hardware
technology.
Some NNs are models of biological neural networks and some are not, but
historically, much of the inspiration for the field of NNs came from the
desire to produce artificial systems capable of sophisticated, perhaps
intelligent, computations similar to those that the human brain routinely
performs, and thereby possibly to enhance our understanding of the human
brain.
Most NNs have some sort of training rule. In other words, NNs learn from
examples (as children learn to recognize dogs from examples of dogs) and
exhibit some capability for generalization beyond the training data.
Neural computing must not be considered as a competitor to conventional
computing. Rather, it should be seen as complementary as the most
successful neural solutions have been those which operate in conjunction
with existing, traditional techniques.
What are ANNs?
• Models of the brain and nervous system
• Highly parallel
  - Process information much more like the brain than a serial computer
• Learning
• Very simple principles
• Very complex behaviours
• Applications
  - As powerful problem solvers
  - As biological models
Neural Network Techniques
• Computers have to be explicitly programmed
  - Analyze the problem to be solved.
  - Write the code in a programming language.
• Neural networks learn from examples
  - No requirement of an explicit description of the problem.
  - No need for a programmer.
  - The neural computer adapts itself during a training period, based on
    examples of similar problems, even without a desired solution to each
    problem. After sufficient training the neural computer is able to
    relate the problem data to the solutions, inputs to outputs, and it is
    then able to offer a viable solution to a brand new problem.
  - Able to generalize or to handle incomplete data.
NNs vs Computers (1/2)
Digital Computers
• Deductive reasoning: we apply known rules to input data to produce output.
• Computation is centralized, synchronous, and serial.
• Memory is packeted, literally stored, and location addressable.
• Not fault tolerant: if one transistor fails, the system no longer works.
• Exact.
• Static connectivity.
Applicable if there are well-defined rules and precise input data.
NNs vs Computers (2/2)
Neural Networks
• Inductive reasoning: given input and output data (training examples), we construct the rules.
• Computation is collective, asynchronous, and parallel.
• Memory is distributed, internalized, short term and content addressable.
• Fault tolerant, through redundancy and sharing of responsibilities.
• Inexact.
• Dynamic connectivity.
Applicable if rules are unknown or complicated, or if data are noisy or partial.
Applications of NNs (1/2)
• Classification
  - Marketing: consumer spending pattern classification
  - Defence: radar and sonar image classification
  - Agriculture & fishing: fruit and catch grading
  - Medicine: ultrasound/electrocardiogram image classification, EEGs
• Recognition and identification
  - General computing and telecommunications: speech, vision and handwriting recognition
  - Finance: signature verification and bank note verification
Applications of NNs (2/2)
• Assessment
  - Engineering: product inspection, monitoring and control
  - Defence: target tracking
  - Security: motion detection, camera surveillance, fingerprint matching
• Forecasting and prediction
  - In finance: foreign exchange rate and stock market forecasting
  - In agriculture: crop yield forecasting
  - In marketing: sales forecasting
  - In meteorology: weather prediction
What can you do with an NN and what not?
• In principle, NNs can compute any computable function, i.e.,
  they can do everything a normal digital computer can do.
  Almost any mapping between vector spaces can be
  approximated to arbitrary precision by feedforward NNs.
• In practice, NNs are especially useful for classification and
  function approximation problems, usually when rules such as
  those that might be used in an expert system cannot easily
  be applied.
• NNs are, at least today, difficult to apply successfully to
  problems that concern manipulation of symbols and memory.
  And there are no methods for training NNs that can
  magically create information that is not contained in the
  training data.
Who is concerned with NNs? (1/2)
• Computer scientists want to find out about the properties of
  non-symbolic information processing with neural nets and
  about learning systems in general.
• Statisticians use neural nets as flexible, nonlinear regression
  and classification models.
• Engineers of many kinds exploit the capabilities of neural
  networks in many areas, such as signal processing and
  automatic control.
• Cognitive scientists view neural networks as a possible
  apparatus to describe models of thinking and consciousness
  (high-level brain function).
Who is concerned with NNs? (2/2)
• Neurophysiologists use neural networks to describe and
  explore medium-level brain function (e.g. memory, sensory
  systems, motor control).
• Physicists use neural networks to model phenomena in
  statistical mechanics and for many other tasks.
• Biologists use neural networks to interpret nucleotide
  sequences.
• Philosophers and some other people may also be interested in
  neural networks for various reasons.
Biological Inspiration
• Animals are able to react adaptively to changes in their
  external and internal environment, and they use their nervous
  system to perform these behaviours.
• An appropriate model/simulation of the nervous system should
  be able to produce similar responses and behaviours in
  artificial systems.
• The nervous system is built from relatively simple units, the
  neurons, so copying their behaviour and functionality should be
  the solution.
Biological NNs (1/3)

Pigeons as art experts (Watanabe et al. 1995)
• Experiment:
  - Pigeon in Skinner box
  - Present paintings of two different artists (e.g. Chagall / Van Gogh)
  - Reward for pecking when presented a particular artist (e.g. Van Gogh)
Biological NNs (2/3)
• Pigeons were able to discriminate between Van Gogh and
  Chagall with 95% accuracy (when presented with pictures they
  had been trained on)
• Discrimination still 85% successful for previously unseen
  paintings of the artists
• Pigeons do not simply memorise the pictures
• They can extract and recognise patterns (the ‘style’)
• They generalise from the already seen to make predictions
• This is what neural networks (biological and artificial) are
  good at (unlike conventional computers)
Biological NNs (3/3)
• Brain: a collection of tens of billions of interconnected neurons
  (commonly estimated at about 86 billion in humans). Each neuron cell
  uses biochemical reactions to receive, process and transmit information.
• Each terminal button (at the end of the axon) is connected to other
  neurons across a small gap called a synapse.
• A neuron's dendritic tree is connected to thousands of neighbouring
  neurons. When one of those neurons fires, a positive or negative charge
  is received by one of the dendrites. The strengths of all the received
  charges are added together through the processes of spatial and
  temporal summation.
[Figure: a biological neuron, with the axon, neurotransmitters and a dendrite labelled.]
Artificial NNs: The Basics
ANNs incorporate the two fundamental components of
biological neural nets:
• Neurons -> Nodes
• Synapses -> Weights
[Figure: a network of nodes joined by weights, with the inputs on one side and the outputs on the other.]

An artificial neural network is composed of many artificial
neurons that are linked together according to a specific
network architecture. The objective of the neural network is
to transform the inputs into meaningful outputs.
Key Elements of NNs (1/2)

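Neural computing requires a number of neurons to be connected
together into a neural network. Neurons are arranged in layers.
[Figure: a single neuron. The inputs p1, p2, p3 are multiplied by the weights w1, w2, w3, summed together with a bias (Σ), and passed through a function f to give the output.]
$O = \sum_i w_i p_i + \text{bias}$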

Each neuron within the network is usually a simple processing unit
which takes one or more inputs and produces an output. At each
neuron, every input has an associated weight which modifies the
strength of that input. The neuron simply adds together all the
weighted inputs and calculates an output to be passed on.
Key Elements of NNs (2/2)

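Feeding data through the net:
[Figure: a neuron with inputs 1.0, 0.5 and 0.7 and weights W1 = 0.32, W2 = 0.46 and W3 = 0.81.]
Output = (1.0 x 0.32) + (0.5 x 0.46) + (0.7 x 0.81) = 1.117

Squashing (limit the output to the range 0 to 1) with the logistic function:
$\frac{1}{1 + e^{-1.117}} \approx 0.753$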
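The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function names `weighted_sum` and `sigmoid` are illustrative choices, no bias is used (as in the slide's example), and the squashing function is assumed to be the standard logistic function 1/(1+e^-x), for which the squashed value comes out near 0.753.

```python
import math

def weighted_sum(inputs, weights, bias=0.0):
    """Net input of one neuron: each input times its weight, plus a bias."""
    return sum(p * w for p, w in zip(inputs, weights)) + bias

def sigmoid(x):
    """Logistic squashing function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

inputs = [1.0, 0.5, 0.7]       # values from the slide
weights = [0.32, 0.46, 0.81]   # W1, W2, W3 from the slide

net = weighted_sum(inputs, weights)
out = sigmoid(net)
print(round(net, 3), round(out, 3))  # prints: 1.117 0.753
```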
Feeding NNs


• Data is presented to the network in the form of activations in
  the input layer
• Examples
  - Pixel intensity (for pictures)
  - Molecule concentrations (for artificial nose)
  - Share prices (for stock market prediction)
• Data usually requires preprocessing
  - Analogous to senses in biology
• How to represent more abstract data, e.g. a name?
  - Choose a pattern, e.g.
    0-0-1 for “Chris”
    0-1-0 for “Becky”
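One way to produce such patterns is to give every name its own position in a fixed list and switch on only that position. The sketch below assumes this scheme; the helper `encode_name` and the third name "Alex" (standing in for the unused pattern 1-0-0) are hypothetical and not taken from the slides.

```python
def encode_name(name, known_names):
    """Return a 0/1 pattern with a single 1 at the position of `name`."""
    if name not in known_names:
        raise ValueError("unknown name: " + name)
    return [1 if name == other else 0 for other in known_names]

# "Alex" is a made-up third name so that the pattern has three positions.
known_names = ["Alex", "Becky", "Chris"]

print(encode_name("Chris", known_names))  # [0, 0, 1]
print(encode_name("Becky", known_names))  # [0, 1, 0]
```

Each position then becomes the activation of one input unit.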
Activation (Squashing) functions

The activation function describes the output behaviour of the
neurons. It is generally non-linear. Linear functions are limited
because the output is simply proportional to the input.
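A few commonly used activation functions are sketched below. The slide does not say which functions it plots, so the selection here (step, logistic sigmoid, hyperbolic tangent, linear) is an assumption. The last lines illustrate the limitation of linear units: chaining two of them is the same as using a single linear unit.

```python
import math

def step(x):
    """Hard threshold: the neuron is either off (0) or on (1)."""
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    """Logistic function: smooth squashing into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Hyperbolic tangent: smooth squashing into the range (-1, 1)."""
    return math.tanh(x)

def linear(x, slope=1.0):
    """Linear unit: the output is simply proportional to the input."""
    return slope * x

for x in (-2.0, 0.0, 2.0):
    print(x, step(x), round(sigmoid(x), 3), round(tanh(x), 3), linear(x))

# Two linear units in a row behave like one linear unit (slope 3.0 * 0.5),
# so stacking linear layers adds no expressive power.
stacked = linear(linear(2.0, slope=3.0), slope=0.5)
single = linear(2.0, slope=1.5)
```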
ANNs Architectures (1/3)


A hidden layer learns to recode (or to provide a representation
for) the inputs. More than one hidden layer can be used.
This architecture is more powerful than a single-layer network: it
can be shown that essentially any mapping can be approximated, given
two hidden layers with enough units.
[Figure: a single-layer ANN (inputs connected directly to outputs) next to an ANN with a hidden layer between the inputs and the outputs.]
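As a rough sketch of how signals pass through a network with one hidden layer, the code below feeds three inputs through two hidden units and one output unit. The layer sizes, all weight and bias values, and the use of the logistic function are arbitrary choices made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: each row of `weights` feeds one neuron."""
    return [sigmoid(sum(w * p for w, p in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Inputs -> hidden layer -> outputs; returns both activation lists."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return hidden, layer(hidden, output_w, output_b)

# Arbitrary example values: 3 inputs, 2 hidden units, 1 output unit.
hidden_w = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.6]]
hidden_b = [0.0, 0.1]
output_w = [[0.5, -0.8]]
output_b = [0.2]

hidden, outputs = forward([1.0, 0.5, 0.7], hidden_w, hidden_b, output_w, output_b)
print(hidden, outputs)
```

The hidden activations are the recoded representation of the inputs that the output layer works from.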
ANNs Architectures (2/3)

Feed-forward networks
Feed-forward ANNs allow signals to travel one way only: from input to
output. There is no feedback (loops), i.e. the output of any layer does not
affect that same layer. Feed-forward ANNs tend to be straightforward
networks that associate inputs with outputs. They are
extensively used in pattern recognition. This type of organisation is also
referred to as bottom-up or top-down.
[Figure: feed-forward network; information flows from the inputs to the outputs.]
ANNs Architectures (3/3)

Feedback networks
Feedback networks can have signals travelling in both directions by
introducing loops in the network. Feedback networks are very powerful
and can get extremely complicated. Feedback networks are dynamic;
their 'state' is changing continuously until they reach an equilibrium
point. They remain at the equilibrium point until the input changes and a
new equilibrium needs to be found.
[Figure: feedback network, with loops among the units between the inputs and the outputs.]
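The idea of a state that keeps changing until it reaches an equilibrium can be illustrated with a Hopfield-style network. The slides do not name a particular feedback architecture, so this choice, the Hebbian weight rule and the six-unit pattern below are all assumptions made for the sketch.

```python
def train_hopfield(patterns):
    """Hebbian weights for patterns of +1/-1 values (no self-connections)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def settle(w, state, max_sweeps=10):
    """Update units one at a time until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            net = sum(w[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if net >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:      # equilibrium point reached
            break
    return state

stored = [1, -1, 1, -1, 1, -1]
w = train_hopfield([stored])
noisy = [1, -1, 1, -1, -1, -1]   # the stored pattern with one unit flipped
recalled = settle(w, noisy)
print(recalled)                  # [1, -1, 1, -1, 1, -1]
```

Every unit's output feeds back into every other unit, and the network settles on the stored pattern closest to the noisy input.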
Training methods (1/2)

Supervised learning
In supervised training, both the inputs and the outputs are
provided. The network then processes the inputs and
compares its resulting outputs against the desired outputs.
Errors are then propagated back through the system, causing
the system to adjust the weights which control the network.
This process occurs over and over as the weights are
continually tweaked. The set of data which enables the
training is called the training set. During the training of a
network the same set of data is processed many times as the
connection weights are continually refined.
Example architectures: multilayer perceptrons
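The compare-and-adjust cycle can be shown on the smallest possible case: a single sigmoid neuron. The task (logical OR), the learning rate, the number of passes and the squared-error gradient used for the weight change are all choices made for this sketch rather than details from the slides.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, bias, inputs):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

# Training set: inputs together with the desired output (logical OR).
training_set = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias, rate = [0.0, 0.0], 0.0, 0.5

for epoch in range(2000):                 # the same data is seen many times
    for inputs, desired in training_set:
        actual = predict(weights, bias, inputs)
        # Error scaled by the sigmoid slope (gradient of the squared error).
        delta = (desired - actual) * actual * (1.0 - actual)
        weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
        bias += rate * delta

results = [round(predict(weights, bias, x)) for x, _ in training_set]
print(results)  # [0, 1, 1, 1]
```

A multilayer perceptron repeats the same idea, with the error passed back through the hidden layers.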
Training methods (2/2)

Unsupervised learning
In unsupervised training, the network is provided with inputs
but not with desired outputs. The system itself must then
decide what features it will use to group the input data. This
is often referred to as self-organization or adaptation.
Example architectures:
• Kohonen Self Organizing Maps
• Neural Gas
• ART Map
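A minimal sketch of self-organization is winner-take-all competitive learning, a much simplified relative of the Kohonen map (there is no neighbourhood function here). The data values, the two starting weights and the learning rate are made up for the example.

```python
def train_competitive(samples, units, rate=0.3, epochs=20):
    """Move the closest unit a little towards each sample; no targets used."""
    units = list(units)
    for _ in range(epochs):
        for x in samples:
            winner = min(range(len(units)), key=lambda i: abs(units[i] - x))
            units[winner] += rate * (x - units[winner])
    return units

# One-dimensional inputs forming two groups; no desired outputs are given.
samples = [0.10, 0.20, 0.15, 0.90, 0.85, 0.95]
units = train_competitive(samples, units=[0.4, 0.6])
print([round(u, 2) for u in units])
```

After training, one unit sits inside the low group and the other inside the high group, so the network has grouped the inputs on its own.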
The learning Process (1/7)



• Example: Optical Character Recognition
• Task: Learn to discriminate between two different typed
  characters
• Data
  - Sources: letter ‘A’, letter ‘b’
  - Format: image pixel values (binary form)
  - Analogy: Eye – Optic Nerve – Brain
The learning Process (2/7)

Network architecture
• Feed forward network
• 20 inputs (one for each pixel value)
• 6 hidden units
• 2 outputs (0-1 for ‘A’, 1-0 for ‘b’)
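To get a feel for the size of this network, the sketch below counts its adjustable weights. It assumes the layers are fully connected and, optionally, that every hidden and output unit has a bias; the slide states neither, and `count_parameters` is a hypothetical helper.

```python
def count_parameters(layer_sizes, use_bias=True):
    """Number of adjustable weights in a fully connected feed-forward net."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # one weight per connection
        if use_bias:
            total += n_out         # one bias per receiving unit
    return total

sizes = [20, 6, 2]                 # inputs, hidden units, outputs
print(count_parameters(sizes, use_bias=False))  # 132
print(count_parameters(sizes))                  # 140
```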
The learning Process (3/7)

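Presenting the data (untrained network)
[Figure: two example images, each supplying 20 inputs to the untrained network, with a legend for the values 1 and 0.]
First image (20 inputs): outputs 0.43 and 0.26
Second image (20 inputs): outputs 0.73 and 0.55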
The learning Process (4/7)

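Calculate Error (absolute difference between actual and expected values)
First image (20 inputs), expected outputs 0 and 1:
  |0.43 - 0| = 0.43
  |0.26 - 1| = 0.74
Second image (20 inputs), expected outputs 1 and 0:
  |0.73 - 1| = 0.27
  |0.55 - 0| = 0.55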
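The error values can be reproduced as follows. The sketch takes the output coding given earlier (0-1 for ‘A’, 1-0 for ‘b’) as the expected values and measures each error as an absolute difference; pairing the first set of outputs with ‘A’ is an inference from that coding and from the error values shown, not something the slide states.

```python
def absolute_errors(actual, expected):
    """Per-output error: distance between actual and expected value."""
    return [abs(a - e) for a, e in zip(actual, expected)]

# letter: (actual outputs of the untrained network, expected outputs)
cases = {
    "A": ([0.43, 0.26], [0, 1]),
    "b": ([0.73, 0.55], [1, 0]),
}

totals = {}
for letter, (actual, expected) in cases.items():
    errors = absolute_errors(actual, expected)
    totals[letter] = round(sum(errors), 2)
    print(letter, [round(e, 2) for e in errors], totals[letter])
# A [0.43, 0.74] 1.17
# b [0.27, 0.55] 0.82
```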
The learning Process (5/7)

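Use the errors to adjust weights through some learning function
First image (20 inputs), expected outputs 0 and 1:
  |0.43 - 0| = 0.43
  |0.26 - 1| = 0.74
  Total error: 1.17
Second image (20 inputs), expected outputs 1 and 0:
  |0.73 - 1| = 0.27
  |0.55 - 0| = 0.55
  Total error: 0.82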
The learning Process (6/7)

Repeat process (sweep) for all training pairs
• Present data
• Calculate error
• Back-propagate error
• Adjust weights
Repeat process multiple times
[Figure: plot of error against repetition (epoch).]
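The sweep described above can be sketched for a 20-6-2 network in plain Python. Everything specific here is an assumption made for illustration: the two 20-pixel patterns are made up, the weights start from seeded random values, each unit gets a bias, the learning rate is 0.5, and the weight changes follow the gradient of the squared error.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_in, n_out, rng):
    """Random weights for n_out units; the last weight in each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def run_layer(layer, inputs):
    extended = inputs + [1.0]          # constant 1.0 feeds the bias weight
    return [sigmoid(sum(w * x for w, x in zip(row, extended))) for row in layer]

def train_pair(hidden, output, inputs, target, rate):
    """Present, calculate error, back-propagate, adjust; returns squared error."""
    h = run_layer(hidden, inputs)
    o = run_layer(output, h)
    # Output deltas: error times the slope of the sigmoid.
    d_out = [(t - y) * y * (1 - y) for t, y in zip(target, o)]
    # Hidden deltas: output deltas sent back through the output weights.
    d_hid = [hj * (1 - hj) * sum(d_out[k] * output[k][j] for k in range(len(output)))
             for j, hj in enumerate(h)]
    for k, row in enumerate(output):
        for j, x in enumerate(h + [1.0]):
            row[j] += rate * d_out[k] * x
    for j, row in enumerate(hidden):
        for i, x in enumerate(inputs + [1.0]):
            row[i] += rate * d_hid[j] * x
    return sum((t - y) ** 2 for t, y in zip(target, o))

rng = random.Random(0)
hidden = make_layer(20, 6, rng)
output = make_layer(6, 2, rng)

# Made-up 20-pixel binary patterns standing in for the two letters.
pattern_a = [1, 0] * 10
pattern_b = [0, 0, 1, 1] * 5
training_pairs = [(pattern_a, [0, 1]), (pattern_b, [1, 0])]

errors = []                              # total error per epoch
for epoch in range(1000):
    errors.append(sum(train_pair(hidden, output, x, t, 0.5)
                      for x, t in training_pairs))

out_a = run_layer(output, run_layer(hidden, pattern_a))
out_b = run_layer(output, run_layer(hidden, pattern_b))
print(round(errors[0], 3), round(errors[-1], 5))
```

Plotting `errors` against the epoch number gives the kind of falling curve this slide shows.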
The learning Process (7/7)

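Presenting the data (trained network)
[Figure: the same two example images, each supplying 20 inputs to the trained network, with a legend for the values 1 and 0.]
First image (20 inputs): outputs 0.01 and 0.99
Second image (20 inputs): outputs 0.99 and 0.01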
Design Considerations
• What transfer function should be used?
• How many inputs does the network need?
• How many hidden layers does the network need?
• How many hidden neurons per hidden layer?
• How many outputs should the network have?
There is no standard methodology to determine these values.
Even though there are some heuristics, the final values are determined
by a trial and error procedure.
When to use ANNs
• Input is high-dimensional, discrete or real-valued (e.g. raw sensor
  input). Inputs can be highly correlated or independent.
• Output is discrete or real-valued
• Output is a vector of values
• Data are possibly noisy and may contain errors
• Form of target function is unknown
• Long training times are acceptable
• Fast evaluation of the target function is required
• Human readability of the learned target function is unimportant:
  an ANN is much like a black box
Conclusions
• Artificial neural networks are inspired by the learning processes
  that take place in biological systems.
• Artificial neurons and neural networks try to imitate the
  working mechanisms of their biological counterparts.
• Learning can be perceived as an optimisation process.
• Biological neural learning happens by the modification of the
  synaptic strength. Artificial neural networks learn in an analogous
  way, by modifying their connection weights.
“If the brain were so simple that we could understand it then
we’d be so simple that we couldn’t”
Lyall Watson