Neural Networks I
Karel Berkovec
karel.berkovec (at) seznam.cz

Artificial Intelligence
• Symbolic approach – expert systems, mathematical logic, production systems, Bayesian networks …
• Connectionist approach – neural networks
• Adaptive approach – stochastic methods
• Analytic approach – regression, interpolation, frequency analysis …

Is it really working?
• Is it a standard mechanism?
• What is it good for?
• Does anyone use it for real applications?
• Can I grasp how it works?
• Can I use it?

This presentation
• Basic introduction
• Small history window
• Model of a neuron and of a neural network
• Supervised learning (backpropagation)
Not covered: biology, mathematical foundations, unsupervised learning, stochastic models, neurocomputers, etc.

History I
• 1940s – von Neumann computer model
• 1943 – Warren McCulloch and Walter Pitts – mathematical model of the neuron
• 1946 – ENIAC
• 1949 – Donald Hebb – The Organization of Behaviour
• 1951 – first Czechoslovak computer SAPO
• 1951 – first neurocomputer Snark
• 1957 – Frank Rosenblatt – perceptron + learning algorithm
• 1958 – Rosenblatt and Charles Wightman – Mark I Perceptron, the first neurocomputer used in practice

History II
• 1960s – ADALINE
• First company oriented on neurocomputing
• Exhaustion of the initial potential
• 1969 – Marvin Minsky & Seymour Papert – Perceptrons: the XOR problem cannot be solved by a single perceptron

History III
• 1983 – DARPA
• 1982, 1984 – John Hopfield – physical models & NNs
• 1986 – David Rumelhart, Geoffrey Hinton, Ronald Williams – backpropagation
  – 1969 – Arthur Bryson & Yu-Chi Ho
  – 1974 – Paul Werbos
  – 1985 – David Parker
• 1987 – IEEE International Conference on Neural Networks
• Since the 1990s – boom of NNs: ART, BAM, RBF, spiking neurons

Present
• Many models of the neuron – perceptron, RBF, spiking neuron …
• Many approaches – backpropagation, Hopfield learning, correlations, competitive learning, stochastic learning, …
• Many libraries and modules – for Matlab, Statistica, Excel …
• Many applications – forecasting, smoothing, recognition, classification, data mining, compression …

Pros and cons
+ Simple to use
+ Very good results
+ Fast results
+ Robust against incomplete or corrupted inputs
+ Generalization
+/- Mathematical background
- Not transparent or traceable
- Hard to tune parameters (sometimes hair-trigger sensitive)
- Sometimes a long learning time is needed
- Some tasks are hard to formulate for NNs

Formal neuron – perceptron
• Inputs $x_1, \dots, x_n$ with weights $w_1, \dots, w_n$
• Potential $\xi = \sum_i w_i x_i$, threshold $\theta$
• Output $y = 1$ if $\xi \ge \theta$, $y = 0$ otherwise

AB problem
[Figure: two point classes A and B in the plane and a separating line – a single perceptron realizes a linear decision boundary.]

XOR problem
[Figures: XOR is not linearly separable, so no single perceptron over $x_1, x_2$ can compute it; two perceptrons with potentials $\xi_1$ and $\xi_2$ each realize one separating line, and an AND neuron combining their outputs computes $\mathrm{XOR}(x_1, x_2)$.]

Feed-forward layered network
$NN: X \to Y$
[Figure: input layer $x_1, \dots, x_5$, 1st hidden layer, 2nd hidden layer, output layer $y_1, y_2, y_3$.]

Activation function
• Heaviside function: $y(\xi) = 1$ for $\xi \ge 0$, $0$ otherwise
• Saturated linear function: $y(\xi) = 0$ for $\xi \le 0$, $\xi$ for $0 < \xi < 1$, $1$ for $\xi \ge 1$
• Standard sigmoidal function: $y(\xi) = \dfrac{1}{1 + e^{-\xi}}$
• Hyperbolic tangent: $y(\xi) = \dfrac{1 - e^{-\xi}}{1 + e^{-\xi}}$
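The formal neuron and the activation functions above translate directly into code. The following is a minimal Python sketch (not part of the original slides); the function names, the gain parameter, and the small AND demo at the end are illustrative choices.

```python
# Minimal sketch of the formal neuron (perceptron) with selectable activation.
# Names, the gain parameter `lam`, and the AND demo are illustrative, not from the slides.
import math

def heaviside(xi):
    return 1.0 if xi >= 0 else 0.0

def sigmoid(xi, lam=1.0):          # standard sigmoidal function 1 / (1 + e^(-lam*xi))
    return 1.0 / (1.0 + math.exp(-lam * xi))

def tanh_act(xi, lam=1.0):         # hyperbolic tangent (1 - e^(-lam*xi)) / (1 + e^(-lam*xi))
    return (1.0 - math.exp(-lam * xi)) / (1.0 + math.exp(-lam * xi))

def neuron(x, w, theta, activation=heaviside):
    """Output y = activation(sum_i w_i * x_i - theta)."""
    potential = sum(wi * xi for wi, xi in zip(w, x)) - theta
    return activation(potential)

# Example: logical AND with a single perceptron (weights 1, 1 and threshold 1.5).
# No choice of weights and threshold makes a single perceptron compute XOR.
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, neuron(x, w=(1.0, 1.0), theta=1.5))
```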
NN function
• A NN maps an input onto an output: $NN: X \to Y$
• A feed-forward NN with one hidden layer and a sigmoidal activation function can approximate arbitrarily closely any continuous function.
The question is how to set up the parameters of the network.

NN learning
• Error function: $E = \frac{1}{2}\sum_{k=1}^{p}\sum_{o \in O}(y_{ko} - d_{ko})^2$
• Perceptron adaptation rule: $w_{ji}^{(t+1)} = w_{ji}^{(t)} - \varepsilon\, x_{ki}\,\bigl(y_j(w^{(t)}, x_k) - d_{kj}\bigr)$
  – $w_{ji}^{(t+1)} = w_{ji}^{(t)} + \varepsilon\, x_{ki}$ when $y = 0$, $d = 1$
  – $w_{ji}^{(t+1)} = w_{ji}^{(t)} - \varepsilon\, x_{ki}$ when $y = 1$, $d = 0$
• An algorithm with this learning rule converges in finite time (if A and B are separable) – a small code sketch of this rule is given at the end of this section.

AB problem
[Figure]

Backpropagation
• The most often used learning algorithm for NNs – roughly 80 % of applications
• Fast convergence
• Good results
• Many modifications

Energetic function
• How to adapt the weights of neurons in the hidden layers?
• We would like to find a minimum of the error function – so why not use the derivative?
$E = \sum_{k=1}^{p} E_k = \frac{1}{2}\sum_{k=1}^{p}\sum_{o \in O}(y_{ko} - d_{ko})^2$

Error gradient
Adaptation rule:
$w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t)$
$\Delta w_{ij}(t) = -\varepsilon\,\frac{\partial E}{\partial w_{ij}}(t) = -\varepsilon \sum_k \frac{\partial E_k}{\partial w_{ij}}(t)$

Output layer
$E = \frac{1}{2}\sum_{j \in O}(y_j - d_j)^2$, $\quad y_j(\xi_j) = \frac{1}{1 + e^{-\xi_j}}$, $\quad \xi_j = \sum_i w_{ij}\, y_i$
$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial y_j}\,\frac{\partial y_j}{\partial \xi_j}\,\frac{\partial \xi_j}{\partial w_{ij}}$
$\frac{\partial E}{\partial y_j} = y_j - d_j$, $\quad \frac{\partial y_j}{\partial \xi_j} = \frac{e^{-\xi_j}}{(1 + e^{-\xi_j})^2}$, $\quad \frac{\partial \xi_j}{\partial w_{ij}} = y_i$
$\frac{\partial E}{\partial w_{ij}} = (y_j - d_j)\,\frac{e^{-\xi_j}}{(1 + e^{-\xi_j})^2}\, y_i$

Hidden layer
$E = \frac{1}{2}\sum_{o \in O}(y_o - d_o)^2$, $\quad y_j(\xi_j) = \frac{1}{1 + e^{-\xi_j}}$, $\quad \xi_j = \sum_i w_{ij}\, y_i$
$\frac{\partial E}{\partial w_{ij}} = \sum_{o \in O}\frac{\partial E}{\partial y_o}\,\frac{\partial y_o}{\partial \xi_o}\,\frac{\partial \xi_o}{\partial y_j}\,\frac{\partial y_j}{\partial \xi_j}\,\frac{\partial \xi_j}{\partial w_{ij}}$
with $\frac{\partial \xi_o}{\partial y_j} = w_{jo}$, so
$\frac{\partial E}{\partial w_{ij}} = \Bigl(\sum_{o \in O}\frac{\partial E}{\partial y_o}\,\frac{\partial y_o}{\partial \xi_o}\, w_{jo}\Bigr)\,\frac{\partial y_j}{\partial \xi_j}\, y_i$

Implementation of BP
• initialize the network
• $nw_{ij} := 0$
• repeat
  – update the weights: $w_{ij} := w_{ij} + nw_{ij}$
  – for all patterns
    • compute the output
    • compute the error $E_k(t)$
    • compute $\Delta w_{ij}(t)$
    • $nw_{ij} := nw_{ij} + \Delta w_{ij}(t)$
• until the error is small enough
(a runnable sketch of this loop is given at the end of this section)

Improvements of BP
• Momentum: $w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t) + \alpha\,\Delta w_{ij}(t-1)$
• Adaptive learning parameters: $w_{ij}(t+1) = w_{ij}(t) + \varepsilon(t)\,\Delta w_{ij}(t)$
Other variants of BP: SuperSAB, QuickProp, Levenberg-Marquardt algorithm

Overfitting
[Figure]
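The perceptron adaptation rule from the "NN learning" slide can be sketched in a few lines of Python. The toy A/B data points, the learning rate, and the bias-as-extra-weight convention below are illustrative assumptions, not taken from the presentation.

```python
# Minimal sketch of the perceptron adaptation rule on a linearly separable A/B task.
# Sample points and learning rate are illustrative, not from the slides.
eps = 0.1
w = [0.0, 0.0, 0.0]                  # w[0] plays the role of the threshold (bias) weight

# class A -> desired output 1, class B -> desired output 0 (assumed toy data)
data = [((2.0, 2.0), 1), ((3.0, 1.5), 1), ((0.5, 0.5), 0), ((1.0, 0.2), 0)]

def output(w, x):
    xi = w[0] + w[1] * x[0] + w[2] * x[1]     # potential
    return 1 if xi >= 0 else 0

changed = True
while changed:                        # terminates in finite time if A and B are separable
    changed = False
    for x, d in data:
        y = output(w, x)
        if y != d:                    # w := w - eps * x * (y - d), as on the slide
            w[0] -= eps * 1.0 * (y - d)
            w[1] -= eps * x[0] * (y - d)
            w[2] -= eps * x[1] * (y - d)
            changed = True
print("learned weights:", w)
```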
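As a complement to the pseudocode on the "Implementation of BP" slide, the following is a minimal Python sketch of batch backpropagation with momentum on the XOR task. The 2-2-1 architecture, learning rate, momentum coefficient, stopping threshold, and random initialization are assumed values, not taken from the presentation.

```python
# Minimal sketch of batch backpropagation with momentum for a 2-2-1 sigmoidal network.
# Layer sizes, eps, alpha, and the XOR training set are illustrative assumptions.
import math, random

def sigmoid(xi):
    return 1.0 / (1.0 + math.exp(-xi))

patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # XOR

random.seed(0)
# index 0 of each weight vector is a bias weight (its input is fixed to 1.0)
w_hid = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_out = [random.uniform(-1, 1) for _ in range(3)]
eps, alpha = 0.5, 0.9                       # learning rate and momentum coefficient
dw_hid = [[0.0] * 3 for _ in range(2)]      # previous weight changes (for momentum)
dw_out = [0.0] * 3

for epoch in range(10000):
    g_hid = [[0.0] * 3 for _ in range(2)]   # accumulated gradients dE/dw over the batch
    g_out = [0.0] * 3
    error = 0.0
    for x, d in patterns:
        inp = (1.0,) + tuple(x)                                     # bias input + pattern
        h = [sigmoid(sum(w * v for w, v in zip(ws, inp))) for ws in w_hid]
        hin = (1.0,) + tuple(h)
        y = sigmoid(sum(w * v for w, v in zip(w_out, hin)))
        error += 0.5 * (y - d) ** 2
        # output layer: dE/dw = (y - d) * y * (1 - y) * input (sigmoid derivative = y(1-y))
        delta_o = (y - d) * y * (1 - y)
        for i in range(3):
            g_out[i] += delta_o * hin[i]
        # hidden layer: back-propagate delta_o through the outgoing weight w_out[j+1]
        for j in range(2):
            delta_h = delta_o * w_out[j + 1] * h[j] * (1 - h[j])
            for i in range(3):
                g_hid[j][i] += delta_h * inp[i]
    # momentum update in recursive form: dw(t) = -eps * dE/dw + alpha * dw(t-1)
    for i in range(3):
        dw_out[i] = -eps * g_out[i] + alpha * dw_out[i]
        w_out[i] += dw_out[i]
        for j in range(2):
            dw_hid[j][i] = -eps * g_hid[j][i] + alpha * dw_hid[j][i]
            w_hid[j][i] += dw_hid[j][i]
    if error < 0.01:
        break
print("epochs:", epoch + 1, "final error:", round(error, 5))
```

The gradients follow the output-layer and hidden-layer formulas above, with the sigmoid derivative $e^{-\xi}/(1+e^{-\xi})^2$ written in the equivalent form $y(1-y)$; the momentum step is the recursive form of the rule on the "Improvements of BP" slide.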