Neural Networks I
Karel Berkovec
karel.berkovec (at) seznam.cz
Artificial Intelligence
• Symbolic approach – expert systems, mathematical logic, production systems, Bayesian networks …
• Connectionist approach – neural networks
• Adaptive approach – stochastic methods
• Analytic approach – regression, interpolation, frequency analysis …
Is it really working?
• Is it a standard mechanism?
• What is it good for?
• Does anyone use it for real applications?
• Can I grasp how it works?
• Can I use it?
This presentation
• Basic introduction
• A brief look at the history
• Model of neuron and neural network
• Supervised learning (backpropagation)
Not covered: biology, mathematical foundations, unsupervised learning, stochastic models, neurocomputers, etc.
History I
• 40s – von Neumann computer model
• 1943 – Warren McCulloch and Walter Pitts – mathematical model of the neuron
• 1946 – ENIAC
• 1949 – Donald Hebb – The Organization of Behavior
• 1951 – 1st Czechoslovak computer SAPO
• 1951 – 1st neurocomputer Snark
• 1957 – Frank Rosenblatt – perceptron + learning algorithm
• 1958 – Rosenblatt and Charles Wightman – 1st neurocomputer used in practice, the Mark I Perceptron
History II
• 60s – ADALINE
• 1st company oriented on neurocomputing
• Exhaustion of the field's initial potential
• 1969 – Marvin Minsky & Seymour Papert – Perceptrons:
  the XOR problem cannot be solved by a single perceptron
History III
• 1983 – DARPA funding of NN research
• 1982, 1984 – John Hopfield – physical models & NNs
• 1986 – David Rumelhart, Geoffrey Hinton, Ronald
Williams – backpropagation
– 1969 Arthur Bryson & Yu-Chi Ho
– 1974 Paul Werbos
– 1985 David Parker
• 1987 – IEEE International Conference on Neural
Networks
• Since the 1990s – boom of NNs
– ART, BAM, RBF, spiking neurons
Present
• Many neuron models – perceptron, RBF, spiking neuron …
• Many learning approaches – backpropagation, Hopfield learning, correlations, competitive learning, stochastic learning, …
• Many libraries and modules – for Matlab, Statistica, Excel …
• Many applications – forecasting, smoothing, recognition, classification, data mining, compression …
Pros and cons
+ Simple to use
+ Very good results
+ Fast results
+ Robust against incomplete or corrupted inputs
+ Generalization
+/- Mathematical background
- Not transparent or easily traceable
- Parameters are hard to tune (sometimes hair-trigger sensitive)
- Learning can sometimes take a long time
- Some tasks are hard to formulate for NNs
Formal neuron - perceptron

A perceptron has inputs $x_1, \dots, x_n$ with weights $w_1, \dots, w_n$ and computes

$$y = \begin{cases} 1 & \text{if } \xi \ge 0 \\ 0 & \text{if } \xi < 0 \end{cases} \qquad \xi = \sum_i w_i x_i - \vartheta$$

where $\sum_i w_i x_i$ is the inner potential, $\vartheta$ the threshold, and $w_i$ the weights.
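As a concrete illustration, here is a minimal Python sketch of this formal neuron; the function name and the example weights/threshold are chosen for the illustration only, they do not come from the slides.

```python
import numpy as np

def perceptron(x, w, theta):
    """Formal neuron: fires (returns 1) if the potential sum(w_i * x_i) reaches the threshold."""
    xi = np.dot(w, x) - theta      # xi = inner potential minus threshold
    return 1 if xi >= 0 else 0

# Example: weights and threshold chosen so the neuron computes logical AND
w = np.array([1.0, 1.0])
theta = 1.5
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, perceptron(np.array(x, dtype=float), w, theta))
```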
AB problem
[Figure: two classes of points, A and B, separated by a single straight line – a task one perceptron can solve]
XOR problem
[Figure: the four XOR input points in the (x1, x2) plane – the two classes cannot be separated by one line]
XOR problem
[Figure: the separating line realized by neuron 1 over the inputs x1, x2]
XOR problem
[Figure: the separating line realized by neuron 2 over the inputs x1, x2]
XOR problem
[Figure: two-layer solution – neurons 1 and 2 over the inputs x1, x2, combined by an AND neuron to give XOR(x1, x2)]
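A minimal sketch of this classical two-layer construction in Python: two threshold units realize the two separating lines (implemented here as x1 OR x2 and NOT(x1 AND x2)), and an AND unit on top yields XOR. The particular weights and thresholds are one common choice for the illustration, not values taken from the slides.

```python
import numpy as np

def threshold_unit(x, w, theta):
    # Perceptron-style unit: 1 if the weighted sum reaches the threshold, else 0
    return 1 if np.dot(w, x) >= theta else 0

def xor_net(x1, x2):
    x = np.array([x1, x2], dtype=float)
    h1 = threshold_unit(x, np.array([1.0, 1.0]), 0.5)      # neuron 1: x1 OR x2
    h2 = threshold_unit(x, np.array([-1.0, -1.0]), -1.5)   # neuron 2: NOT (x1 AND x2)
    return threshold_unit(np.array([h1, h2], dtype=float),
                          np.array([1.0, 1.0]), 1.5)       # output: h1 AND h2

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_net(a, b))   # prints 0, 1, 1, 0
```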
Feed-forward layered network

A feed-forward NN realizes a mapping NN: X → Y.

[Figure: layered network – input layer (x1 … x5), 1st hidden layer, 2nd hidden layer, output layer (y1, y2, y3)]
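To make the layered structure concrete, here is a small sketch of a forward pass through fully connected layers (the sigmoid activation is taken from the following slide; the layer sizes and random weights are illustrative only):

```python
import numpy as np

def sigmoid(xi):
    return 1.0 / (1.0 + np.exp(-xi))

def forward(x, layers):
    """layers: list of (W, theta) pairs, one per layer above the input layer.
    W has shape (n_out, n_in); theta is the vector of thresholds."""
    y = x
    for W, theta in layers:
        y = sigmoid(W @ y - theta)   # each neuron's potential, passed through the activation
    return y

# Example: 5 inputs -> two hidden layers -> 3 outputs, random weights, zero thresholds
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(4, 5)), np.zeros(4)),
          (rng.normal(size=(4, 4)), np.zeros(4)),
          (rng.normal(size=(3, 4)), np.zeros(3))]
print(forward(np.ones(5), layers))
```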
Activation function

For a neuron with potential $\xi = \sum_i w_i x_i - \vartheta$, common choices of the activation function are:

• Heaviside (step) function: $y(\xi) = 1$ for $\xi \ge 0$, $y(\xi) = 0$ for $\xi < 0$
• Saturated linear function: $y(\xi) = 0$ for $\xi \le 0$, $y(\xi) = \xi$ for $0 < \xi < 1$, $y(\xi) = 1$ for $\xi \ge 1$
• Standard sigmoidal function: $y(\xi) = \dfrac{1}{1 + e^{-\xi}}$
• Hyperbolic tangent: $y(\xi) = \dfrac{1 - e^{-\xi}}{1 + e^{-\xi}}$
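A quick sketch of these four activation functions in Python (NumPy; the steepness/gain parameter is taken as 1 here):

```python
import numpy as np

def heaviside(xi):
    return np.where(xi >= 0, 1.0, 0.0)

def saturated_linear(xi):
    return np.clip(xi, 0.0, 1.0)

def sigmoid(xi):
    return 1.0 / (1.0 + np.exp(-xi))

def tanh_like(xi):
    # (1 - e^-xi) / (1 + e^-xi) is the same curve as tanh(xi / 2)
    return (1.0 - np.exp(-xi)) / (1.0 + np.exp(-xi))

xi = np.linspace(-5, 5, 11)
for f in (heaviside, saturated_linear, sigmoid, tanh_like):
    print(f.__name__, np.round(f(xi), 3))
```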
NN function
• A NN maps inputs onto outputs: NN: X → Y
• A feed-forward NN with one hidden layer and sigmoidal activation functions can approximate any continuous function arbitrarily closely.
• The question is how to set the parameters of the network.
NN learning
• Error function:

$$E = \frac{1}{2} \sum_{k=1}^{p} \sum_{o \in O} (y_{ko} - d_{ko})^2$$

• Perceptron adaptation rule:

$$w_{ji}^{(t+1)} = w_{ji}^{(t)} - x_{ki}\,\bigl(y_j(\mathbf{w}^{(t)}, \mathbf{x}_k) - d_{kj}\bigr)$$

i.e.

$$w_{ji}^{(t+1)} = w_{ji}^{(t)} + x_{ki} \quad (y = 0,\ d = 1) \qquad\qquad w_{ji}^{(t+1)} = w_{ji}^{(t)} - x_{ki} \quad (y = 1,\ d = 0)$$

• An algorithm with this learning rule converges in finite time (if the classes A and B are linearly separable).
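A minimal sketch of the perceptron adaptation rule on a small linearly separable two-class problem (the toy data and the bias-folding trick below are illustrative, not taken from the slides):

```python
import numpy as np

def train_perceptron(X, d, epochs=100):
    """Perceptron learning rule: w <- w - x * (y - d) after each misclassified pattern.
    The threshold is folded into the weights via a constant bias input."""
    Xb = np.hstack([X, np.ones((len(X), 1))])      # append bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        errors = 0
        for x, target in zip(Xb, d):
            y = 1 if np.dot(w, x) >= 0 else 0
            if y != target:
                w -= x * (y - target)              # +x if y=0,d=1; -x if y=1,d=0
                errors += 1
        if errors == 0:                            # converged: every pattern classified correctly
            break
    return w

# Toy linearly separable classes A (label 0) and B (label 1)
X = np.array([[0.0, 0.0], [0.1, 0.4], [1.0, 1.0], [0.9, 0.7]])
d = np.array([0, 0, 1, 1])
print(train_perceptron(X, d))
```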
AB problem
[Figure: the AB problem revisited – the learning rule has found a line separating classes A and B]
Backpropagation
• The most frequently used learning algorithm for NNs – roughly 80 % of applications
• Fast convergence
• Good results
• Many modifications
Energy (error) function
• How do we adapt the weights of neurons in the hidden layers?
• We would like to find a minimum of the error function – so why not use its derivative?

$$E = \sum_{k=1}^{p} E_k = \frac{1}{2} \sum_{k=1}^{p} \sum_{o \in O} (y_{ko} - d_{ko})^2$$
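For reference, a small sketch of this error function over a batch of patterns (the network outputs are assumed to be already computed; the array names are illustrative):

```python
import numpy as np

def total_error(Y, D):
    """E = 1/2 * sum over patterns k and output neurons o of (y_ko - d_ko)^2.
    Y, D: arrays of shape (p, |O|) holding network outputs and desired outputs."""
    return 0.5 * np.sum((Y - D) ** 2)

Y = np.array([[0.9, 0.1], [0.2, 0.8]])   # network outputs for two patterns
D = np.array([[1.0, 0.0], [0.0, 1.0]])   # desired outputs
print(total_error(Y, D))                 # 0.5 * (0.01 + 0.01 + 0.04 + 0.04) = 0.05
```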
Error gradient

Adaptation rule:

$$w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t)$$

$$\Delta w_{ij}(t) = -\varepsilon\,\frac{\partial E}{\partial w_{ij}}(t) = -\varepsilon \sum_{k} \frac{\partial E_k}{\partial w_{ij}}(t)$$
Output layer

$$E = \frac{1}{2} \sum_{j \in O} (y_j - d_j)^2 \qquad y_j(\xi_j) = \frac{1}{1 + e^{-\xi_j}} \qquad \xi_j = \sum_i w_{ij}\, y_i - \vartheta$$

By the chain rule,

$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial y_j}\,\frac{\partial y_j}{\partial \xi_j}\,\frac{\partial \xi_j}{\partial w_{ij}}$$

where

$$\frac{\partial E}{\partial y_j} = y_j - d_j \qquad \frac{\partial y_j}{\partial \xi_j} = \frac{e^{-\xi_j}}{(1 + e^{-\xi_j})^2} \qquad \frac{\partial \xi_j}{\partial w_{ij}} = y_i$$

so that

$$\frac{\partial E}{\partial w_{ij}} = (y_j - d_j)\,\frac{e^{-\xi_j}}{(1 + e^{-\xi_j})^2}\, y_i$$
Hidden layer

$$E = \frac{1}{2} \sum_{o \in O} (y_o - d_o)^2 \qquad y_j(\xi_j) = \frac{1}{1 + e^{-\xi_j}} \qquad \xi_j = \sum_i w_{ij}\, y_i - \vartheta$$

For a weight $w_{ij}$ leading into a hidden neuron $j$, the chain rule runs through all output neurons:

$$\frac{\partial E}{\partial w_{ij}} = \sum_{o \in O} \frac{\partial E}{\partial y_o}\,\frac{\partial y_o}{\partial \xi_o}\,\frac{\partial \xi_o}{\partial w_{ij}} \qquad \frac{\partial \xi_o}{\partial w_{ij}} = \frac{\partial \xi_o}{\partial y_j}\,\frac{\partial y_j}{\partial \xi_j}\,\frac{\partial \xi_j}{\partial w_{ij}}$$

Writing

$$\delta_o = \frac{\partial E}{\partial y_o}\,\frac{\partial y_o}{\partial \xi_o} \qquad \frac{\partial \xi_o}{\partial y_j} = w_{jo} \qquad e_{ij} = \frac{\partial y_j}{\partial \xi_j}\,\frac{\partial \xi_j}{\partial w_{ij}}$$

this gives

$$\frac{\partial E}{\partial w_{ij}} = \Bigl(\sum_{o \in O} \delta_o\, w_{jo}\Bigr)\, e_{ij}$$
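Putting the output-layer and hidden-layer formulas together, here is a minimal sketch that computes both gradients for a single pattern in a network with one hidden layer (sigmoid units; thresholds omitted for brevity; all names and shapes are illustrative). Note that for the sigmoid, e^{-ξ}/(1+e^{-ξ})² equals y(1−y), which is the form used in the code.

```python
import numpy as np

def sigmoid(xi):
    return 1.0 / (1.0 + np.exp(-xi))

def gradients(x, d, W_hid, W_out):
    """Backpropagation gradients dE/dW for one pattern.
    W_hid: (n_hidden, n_in), W_out: (n_out, n_hidden). Thresholds are omitted."""
    # Forward pass
    y_hid = sigmoid(W_hid @ x)
    y_out = sigmoid(W_out @ y_hid)

    # Output layer: dE/dw_jo = (y_o - d_o) * y_o * (1 - y_o) * y_j
    delta_out = (y_out - d) * y_out * (1.0 - y_out)
    grad_out = np.outer(delta_out, y_hid)

    # Hidden layer: dE/dw_ij = (sum_o delta_o * w_jo) * y_j * (1 - y_j) * x_i
    delta_hid = (W_out.T @ delta_out) * y_hid * (1.0 - y_hid)
    grad_hid = np.outer(delta_hid, x)
    return grad_hid, grad_out

# Quick check on a tiny random network
rng = np.random.default_rng(1)
gh, go = gradients(np.array([1.0, 0.0]), np.array([1.0]),
                   rng.normal(size=(2, 2)), rng.normal(size=(1, 2)))
print(gh)
print(go)
```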
Implementation of BP
• initialize the network
• nw_ij := 0
• repeat
  – update the weights: w_ij := w_ij + nw_ij, then reset nw_ij := 0
  – for all patterns:
    • compute the network output (forward pass)
    • compute the error E_k
    • compute $\Delta w_{ij}(t) = -\varepsilon\,\dfrac{\partial E_k}{\partial w_{ij}}(t)$
    • accumulate nw_ij := nw_ij + Δw_ij(t)
• until the error is small enough
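A runnable sketch of this batch loop on the XOR task (one hidden layer, sigmoid units, thresholds folded into bias weights; the learning rate, network size, epoch limit, and stopping threshold are illustrative choices, not values from the slides; depending on the random initialization a restart or more epochs may be needed):

```python
import numpy as np

def sigmoid(xi):
    return 1.0 / (1.0 + np.exp(-xi))

rng = np.random.default_rng(42)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
D = np.array([[0.], [1.], [1.], [0.]])
Xb = np.hstack([X, np.ones((4, 1))])            # bias input folded in

W_hid = rng.normal(scale=0.5, size=(3, 3))      # 2 inputs + bias -> 3 hidden neurons
W_out = rng.normal(scale=0.5, size=(1, 4))      # 3 hidden + bias -> 1 output neuron
eps = 0.5                                       # learning rate

for epoch in range(20000):
    nW_hid = np.zeros_like(W_hid)               # the nw_ij accumulators
    nW_out = np.zeros_like(W_out)
    E = 0.0
    for x, d in zip(Xb, D):                     # for all patterns
        y_hid = sigmoid(W_hid @ x)              # forward pass
        y_hidb = np.append(y_hid, 1.0)
        y_out = sigmoid(W_out @ y_hidb)
        E += 0.5 * np.sum((y_out - d) ** 2)     # error E_k
        delta_out = (y_out - d) * y_out * (1 - y_out)
        delta_hid = (W_out[:, :-1].T @ delta_out) * y_hid * (1 - y_hid)
        nW_out += -eps * np.outer(delta_out, y_hidb)   # accumulate -eps * dE_k/dw
        nW_hid += -eps * np.outer(delta_hid, x)
    W_out += nW_out                             # update the weights
    W_hid += nW_hid
    if E < 0.01:                                # until the error is small enough
        break

print(epoch, E)
```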
Improvements of BP
• Momentum:

$$w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t) + \alpha\,\Delta w_{ij}(t-1)$$

• Adaptive learning parameters:

$$w_{ij}(t+1) = w_{ij}(t) + \varepsilon(t)\,\Delta w_{ij}(t)$$

Other variants of BP: SuperSAB, QuickProp, Levenberg-Marquardt algorithm
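As a brief sketch, the momentum term only requires remembering the previous weight change; α below denotes the momentum coefficient (its value and the example numbers are assumptions for this illustration):

```python
import numpy as np

def momentum_step(W, grad, prev_delta, eps=0.5, alpha=0.9):
    """w(t+1) = w(t) + dw(t) + alpha * dw(t-1), with dw(t) = -eps * dE/dw."""
    delta = -eps * grad
    return W + delta + alpha * prev_delta, delta

W = np.zeros((2, 2))
prev = np.zeros_like(W)
grad = np.array([[0.1, -0.2], [0.0, 0.3]])   # example gradient dE/dW
for _ in range(3):
    W, prev = momentum_step(W, grad, prev)
print(W)
```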
Overfitting