Artificial Neural Networks and Applications
Dr. L. Iliadis, Assist. Professor, Democritus University of Thrace, Greece
[email protected]

Overview
1. Definition of a Neural Network
2. The Human Brain
3. Neuron Models
4. Artificial Neural Networks (ANN)
5. Historical Notes
6. ANN Architecture
7. Learning Processes – Training and Testing ANN
   7.1. Backpropagation Learning
8. Well-Known Applications of ANN

What is a Neural Network?
A Neural Network is a collection of units connected in some pattern that allows communication between them; it acts as a massively distributed processor. These units are also referred to as neurons or nodes. A Neural Network has a natural propensity for storing experiential knowledge and making it available for use. It has two main characteristics:
• Knowledge is acquired by the network from its environment through a learning process.
• Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.
Animals are able to react adaptively to changes in their external and internal environment, and they use their nervous system to produce these behaviours. This ability is called Plasticity.

Zoom on the Human Brain
[Figure: a biological neuron, with its dendrites, soma (cell body) and axon labelled.]

The Role of the Synapses
The synapses are responsible for the transmission of information between two connected neurons.

Structural Organization of Levels in the Brain

Functional Areas of the Brain
• Primary motor: voluntary movement
• Primary somatosensory: tactile, pain, pressure, position, temperature, movement
• Motor association: coordination of complex movements
• Sensory association: processing of multisensory information
• Prefrontal: planning, emotion, judgement
• Speech centre (Broca's area): speech production and articulation
• Wernicke's area: comprehension of speech
• Auditory: hearing
• Auditory association: complex auditory processing
• Visual: low-level vision
• Visual association: higher-level vision

HUMAN BRAIN VERSUS SILICON
The human cortex has approximately 10 billion neurons and 60 trillion synapses. The net result is that the brain is an enormously efficient structure. The energetic efficiency of the brain is approximately 10^-16 joules per operation per second, whereas the corresponding value for the best computers in use today is about 10^-6 joules per operation per second. Human brain neurons (where events happen in the millisecond range, 10^-3 s) are 5 or 6 orders of magnitude slower than silicon logic gates (where events happen in the nanosecond range, 10^-9 s).

ARTIFICIAL NEURON MODEL
• A signal x_i at the input of synapse i, connected to neuron k, is multiplied by the synaptic weight w_i.
• An adder sums the input signals, weighted by the respective synapses:
      z = Σ_{i=1..n} w_i x_i
• An activation function limits the amplitude of the neuron's output, squashing the permissible amplitude range into a finite interval such as [0, 1] or alternatively [-1, 1]. With a bias b_k, the output is
      y = H(z + b_k)

Artificial Neural Networks
Artificial Neural Networks consist of interconnected elements inspired by studies of biological nervous systems. They are an attempt to create machines that work in a similar way to the human brain, using components that behave like biological neurons.
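The weighted-sum-plus-activation neuron described above can be sketched in Python. The function name is illustrative, and choosing the sigmoid as the squashing function H is an assumption (any function limiting the output to [0, 1] would fit the slides' description):

```python
import math

def neuron_output(x, w, b):
    """One artificial neuron: weighted sum of inputs plus bias,
    squashed by a sigmoid activation into the interval (0, 1).

    x: list of input signals, w: list of synaptic weights, b: bias b_k.
    """
    z = sum(wi * xi for wi, xi in zip(w, x))   # adder: z = sum_i w_i * x_i
    return 1.0 / (1.0 + math.exp(-(z + b)))   # activation: y = H(z + b_k)

# Example: two inputs, two weights, zero bias
y = neuron_output([1.0, 0.5], [0.4, -0.2], 0.0)
```

Whatever the weights, the sigmoid keeps the output strictly inside (0, 1), which is exactly the amplitude-limiting role the slides assign to the activation function.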
• Synaptic strengths are translated as synaptic weights.
• Excitation means a positive product between the incoming spike rate and the corresponding synaptic weight.
• Inhibition means a negative product between the incoming spike rate and the corresponding synaptic weight.

Neuron's Output
Nonlinear generalization of the neuron: sigmoidal or Gaussian functions may be used,
      y = H(x, w)
where y is the neuron's output, x is the vector of inputs, and w is the vector of synaptic weights. Two common choices are
      y = 1 / (1 + e^(-w^T x - a))          (sigmoidal neuron)
      y = e^(-||x - w||^2 / (2 a^2))        (Gaussian neuron)

Software or Hardware?
Although ANN can be implemented as fast hardware devices, much research has been performed on conventional computers running software simulations. Software simulations provide a comparatively cheap and flexible environment for researching ideas, with adequate performance for many real-world applications. For example, an ANN software package might be used to develop a credit-scoring system for an individual who applies for a bank loan.

Historical Notes
• Ramón y Cajal in 1911 introduced the idea of neurons as structural constituents of the brain.
• The origins of ANN go back to the 1940s, when McCulloch and Pitts published the first mathematical model of a biological neuron.
• Research on ANN then stalled for more than 20 years. In the mid-1980s a huge interest in ANN emerged with the publication of the book "Parallel Distributed Processing" by Rumelhart and McClelland.
• ANN made a great comeback in the 1990s, and they are now widely accepted as a tool in the development of intelligent systems.

Architecture of Artificial Neural Networks
[Figure: architectural graph of a network with an Input Layer, a Hidden Layer and an Output Layer.]
The way the artificial neurons are linked together to compose an ANN varies with its architecture. The architectural graph above illustrates the layout of a multilayer feedforward ANN (data flow in one direction only) in the case of a single hidden layer. The hidden layer is where the processing takes place.
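The two nonlinear neuron generalizations given earlier, the sigmoidal neuron y = 1/(1 + e^(-w^T x - a)) and the Gaussian neuron y = e^(-||x - w||^2 / (2 a^2)), can be written directly as small functions (the function names are illustrative):

```python
import math

def sigmoid_neuron(x, w, a):
    """Sigmoidal neuron: y = 1 / (1 + exp(-(w . x) - a))."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z - a))

def gaussian_neuron(x, w, a):
    """Gaussian neuron: y = exp(-||x - w||^2 / (2 a^2))."""
    d2 = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return math.exp(-d2 / (2.0 * a ** 2))
```

Note the different roles of the weight vector: in the sigmoidal neuron w scales the inputs, while in the Gaussian neuron w acts as a centre, and the output peaks at 1 when the input x coincides with it.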
Supervised Learning
Learning means learning by adaptation. For example, animals learn that green fruits are sour and the yellowish/reddish ones are sweet; the learning happens by adapting the fruit-picking behaviour. Learning can be perceived as an optimisation process. When an ANN is in its SUPERVISED training (learning) phase, three factors must be considered:
• The inputs applied are chosen from a training set, for which the desired response of the system is known.
• The actual output produced when an input pattern is applied is compared to the desired output, and an error is estimated.
• In ANN, learning occurs by changing the synaptic strengths (changing the weights), eliminating some synapses and building new ones.

PERCEPTRON
The Perceptron is one of the early ANN, built around a nonlinear neuron, namely the McCulloch-Pitts neuron model. It produces an output equal to +1 if the hard-limiter input is positive, and -1 if it is negative.

Learning with a Perceptron
The synapse-strength modification rules for artificial neural networks can be derived by applying mathematical optimisation methods.
      Perceptron output:  y_out = w^T x
      Input data:         (x^1, y_1), (x^2, y_2), ..., (x^N, y_N)
      Error:              E(t) = (y_out(t) - y_t)^2 = (w(t)^T x^t - y_t)^2
      Weight adjustment:  w_i(t+1) = w_i(t) - c * dE(t)/dw_i
                                   = w_i(t) - c * (w(t)^T x^t - y_t) * x_i^t
      where w(t)^T x^t = Σ_{j=1..m} w_j(t) x_j^t

Learning with MLP ANN
An MLP (Multi-Layer Perceptron) ANN with p layers computes, layer by layer:
      y_k^1 = 1 / (1 + e^(-w^{1k,T} x - a_k^1)),    y^1 = (y_1^1, ..., y_{M1}^1)^T,  k = 1, ..., M1
      y_k^2 = 1 / (1 + e^(-w^{2k,T} y^1 - a_k^2)),  y^2 = (y_1^2, ..., y_{M2}^2)^T,  k = 1, ..., M2
      ...
      y_out = F(x; W) = w^{p,T} y^{p-1}
Data:  (x^1, y_1), (x^2, y_2), ..., (x^N, y_N)
Error: E(t) = (y_out(t) - y_t)^2 = (F(x^t; W) - y_t)^2
Direct calculation of the weight changes is too complicated.

Backpropagation Learning
It was developed by Werbos, but Rumelhart et al. in 1986 gave a new lease of life to ANN.
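The gradient-based weight adjustment above, w_i(t+1) = w_i(t) - c (w(t)^T x^t - y_t) x_i^t, can be sketched as a small training loop. This is the delta rule on a linear output unit, not the hard-limiter perceptron; the function name, the sample data and the hyperparameter values are illustrative assumptions:

```python
def train_delta_rule(samples, n_inputs, c=0.1, epochs=200):
    """Repeatedly apply w_i <- w_i - c * (w^T x - y) * x_i over the data."""
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for x, y in samples:
            out = sum(wi * xi for wi, xi in zip(w, x))  # y_out = w^T x
            err = out - y                               # signed error
            w = [wi - c * err * xi for wi, xi in zip(w, x)]
    return w

# Learn the linear map y = 2*x1 - x2 from three consistent examples
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
w = train_delta_rule(data, n_inputs=2)
```

Because the examples are consistent with an exact linear solution, the repeated updates contract the weights towards w = (2, -1), illustrating why learning can be viewed as an optimisation process.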
The weight-adaptation rule is known as Backpropagation. It defines two sweeps of the ANN:
• First it performs a forward sweep from the input layer to the output layer; the changes for the synaptic weights of the output neurons are then calculated first.
• Then it performs a backward sweep from the output layer to the input layer. In this way it calculates the changes backwards, starting from layer p-1, propagating the local error terms back.
The backward sweep is similar to the forward one, except that error values are propagated back through the ANN to determine how the weights are to be changed during training.

EXTDBD Learning Rule
The Extended Delta-Bar-Delta (EXTDBD) rule is a heuristic technique that has been used successfully in a wide range of applications. Its main characteristic is that it uses a term called momentum: a term proportional to the previous weight change is added to the standard weight change. In this way good general trends are reinforced and oscillations are damped.

EVALUATION INSTRUMENTS
The RMS error adds up the squares of the errors for each PE (processing element) in the output layer, divides by the number of PEs in the output layer to obtain an average, and then takes the square root of that average, hence the name "root mean square". Another instrument is the Common Mean Correlation (CMC) coefficient of the desired output (d) and the actual (predicted) output (y) across the epoch. The CMC is calculated by
      CMC = Σ_i (d_i - d̄)(y_i - ȳ) / sqrt( Σ_i (d_i - d̄)^2 * Σ_i (y_i - ȳ)^2 )
      where d̄ = (1/E) Σ_i d_i  and  ȳ = (1/E) Σ_i y_i
It should be clarified that d stands for the desired values, y for the predicted values, i ranges from 1 to n (the number of cases in the training data set), and E is the epoch size: the number of sets of training data presented to the ANN between weight updates (learning cycles).

TESTING AND OVERTRAINING
Overtraining is a very serious problem! Testing is the process that actually determines the strength of the ANN and its ability to generalize.
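The two evaluation instruments described above can be sketched as follows; the function names are illustrative, and the correlation is computed in the usual Pearson form that the CMC formula takes:

```python
import math

def rms_error(desired, predicted):
    """Square root of the average squared error over the output values."""
    n = len(desired)
    return math.sqrt(sum((d - y) ** 2 for d, y in zip(desired, predicted)) / n)

def correlation(desired, predicted):
    """Correlation of desired (d) and predicted (y) outputs across an epoch."""
    n = len(desired)
    d_mean = sum(desired) / n
    y_mean = sum(predicted) / n
    num = sum((d - d_mean) * (y - y_mean)
              for d, y in zip(desired, predicted))
    den = math.sqrt(sum((d - d_mean) ** 2 for d in desired) *
                    sum((y - y_mean) ** 2 for y in predicted))
    return num / den
```

A correlation of 1 indicates that the predicted outputs track the desired ones perfectly (up to a linear rescaling), while the RMS error additionally penalizes any offset or scale mismatch, so the two instruments are complementary.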
The performance of an ANN is critically dependent on the training data, which must be representative of the task to learn (Callan, 1999). For this purpose, in the testing phase we randomly choose a large set of actual cases (records) that were not used in the training phase.

New Methods for Learning with Neural Networks
• Bayesian learning: the distribution of the neural network parameters is learnt.
• Support vector learning: the minimal representative subset of the available data is used to calculate the synaptic weights of the neurons.

Tasks Performed by Artificial Neural Networks
The following tasks are usually performed by ANN:
• Controlling the movements of a robot based on self-perception and other information (e.g., visual information)
• Decision making
• Pattern recognition (e.g., recognizing a visual object or a familiar face)
These ANN tasks (control, classification, prediction, approximation) can be reformulated in general as FUNCTION APPROXIMATION tasks. With the term approximation we mean: given a set of values of a function g(x), build a neural network that approximates the values of g(x) for any input x.

Learning to Approximate
Error measure:
      E = (1/N) Σ_{t=1..N} (F(x^t; W) - y_t)^2
Rule for changing the synaptic weights:
      Δw_i^j = -c * dE(W)/dw_i^j
      w_i^{j,new} = w_i^j + Δw_i^j
where c is the learning parameter (usually a constant).

Summary
• Artificial neural networks are inspired by the learning processes that take place in biological systems.
• Artificial neurons and neural networks try to imitate the working mechanisms of their biological counterparts.
• Learning can be perceived as an optimisation process.
• Biological neural learning happens by the modification of the synaptic strength. Artificial neural networks learn in the same way.
• The synapse-strength modification rules for artificial neural networks can be derived by applying mathematical optimisation methods.

Summary
• Learning tasks of artificial neural networks can be reformulated as function approximation tasks.
• Neural networks can be considered as nonlinear function-approximating tools (i.e., linear combinations of nonlinear basis functions), where the parameters of the networks should be found by applying optimisation methods.
• The optimisation is done with respect to the approximation error measure.
• In general it is enough to have a single-hidden-layer neural network (MLP, RBF or other) to learn the approximation of a nonlinear function. In such cases, general optimisation methods can be applied to find the change rules for the synaptic weights.

DEVELOPING ANN

DETERMINING ANN'S TOPOLOGY