Technological Educational Institute Of Crete
Department Of Applied Informatics and Multimedia
Neural Networks Laboratory

Introduction To Neural Networks
Prof. George Papadourakis, Ph.D.

Historical Background

• The development of neural networks dates back to the early 1940s. The field experienced an upsurge in popularity in the late 1980s, as a result of newly discovered techniques and general advances in computer hardware technology.
• Some NNs are models of biological neural networks and some are not. Historically, however, much of the inspiration for the field came from the desire to produce artificial systems capable of sophisticated, perhaps intelligent, computations similar to those that the human brain routinely performs, and thereby possibly to enhance our understanding of the human brain.
• Most NNs have some sort of training rule. In other words, NNs learn from examples (as children learn to recognize dogs from examples of dogs) and exhibit some capability for generalization beyond the training data.
• Neural computing must not be considered a competitor to conventional computing. Rather, it should be seen as complementary, as the most successful neural solutions have been those which operate in conjunction with existing, traditional techniques.

What are ANNs?
• Models of the brain and nervous system
• Highly parallel
  • Process information much more like the brain than a serial computer
• Learning
• Very simple principles
• Very complex behaviours
• Applications
  • As powerful problem solvers
  • As biological models

Neural Network Techniques

• Computers have to be explicitly programmed
  • Analyze the problem to be solved.
  • Write the code in a programming language.
• Neural networks learn from examples
  • No requirement of an explicit description of the problem.
  • No need for a programmer.
  • The neural computer adapts itself during a training period, based on examples of similar problems, even without a desired solution to each problem. After sufficient training the neural computer is able to relate the problem data to the solutions, inputs to outputs, and it is then able to offer a viable solution to a brand new problem.
  • Able to generalize or to handle incomplete data.

NNs vs Computers (1/2)

Digital Computers
• Deductive reasoning: we apply known rules to input data to produce output.
• Computation is centralized, synchronous, and serial.
• Memory is packetted, literally stored, and location addressable.
• Not fault tolerant: one transistor goes and it no longer works.
• Exact.
• Static connectivity.
• Applicable if there are well-defined rules with precise input data.

NNs vs Computers (2/2)

Neural Networks
• Inductive reasoning: given input and output data (training examples), we construct the rules.
• Computation is collective, asynchronous, and parallel.
• Memory is distributed, internalized, short term and content addressable.
• Fault tolerant, with redundancy and sharing of responsibilities.
• Inexact.
• Dynamic connectivity.
• Applicable if rules are unknown or complicated, or if data are noisy or partial.

Applications of NNs (1/2)

Classification
• Marketing: consumer spending pattern classification
• Defence: radar and sonar image classification
• Agriculture & fishing: fruit and catch grading
• Medicine: ultrasound/electrocardiogram image classification, EEGs

Recognition and identification
• General computing and telecommunications: speech, vision and handwriting recognition
• Finance: signature verification and bank note verification

Applications of NNs (2/2)

Assessment
• Engineering: product inspection monitoring and control
• Defence: target tracking
• Security: motion detection, camera surveillance, fingerprint matching

Forecasting and prediction
• Finance: foreign exchange rate and stock market forecasting
• Agriculture: crop yield forecasting
• Marketing: sales forecasting
• Meteorology: weather prediction

What can you do with an NN and what not?

• In principle, NNs can compute any computable function, i.e., they can do everything a normal digital computer can do.
• Almost any mapping between vector spaces can be approximated to arbitrary precision by feedforward NNs.
• In practice, NNs are especially useful for classification and function approximation problems, usually when rules such as those that might be used in an expert system cannot easily be applied.
• NNs are, at least today, difficult to apply successfully to problems that concern manipulation of symbols and memory.
• There are no methods for training NNs that can magically create information that is not contained in the training data.

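The contrast drawn in the NNs vs Computers slides, between applying known rules and constructing rules from examples, can be made concrete with a small sketch. The first function hard-codes the logical AND rule; the second part learns the same mapping from four input/output examples. The learning rule used here is the classic perceptron update, which is an assumption chosen for illustration and is not taken from these slides; all function names are hypothetical.

```python
# Deductive: the rule is written down explicitly by a programmer.
def and_rule(a, b):
    return 1 if (a == 1 and b == 1) else 0


# Inductive: the rule is constructed from training examples (inputs, target).
EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def step(net):
    """Threshold activation: fire only when the net input is positive."""
    return 1 if net > 0 else 0


def predict(weights, bias, inputs):
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(net)


def train_perceptron(examples, rate=1, max_epochs=100):
    """Perceptron update (illustrative assumption, not from the slides)."""
    weights, bias = [0, 0], 0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                # Nudge each weight in the direction that reduces the error.
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if mistakes == 0:  # every example is classified correctly
            break
    return weights, bias


weights, bias = train_perceptron(EXAMPLES)
for inputs, target in EXAMPLES:
    # Columns: inputs, hand-written rule, learned rule
    print(inputs, and_rule(*inputs), predict(weights, bias, inputs))
```

Both columns of the printout agree, but only the first function required someone to state the rule in advance; the second only needed the examples.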
Who is concerned with NNs? (1/2)

• Computer scientists want to find out about the properties of non-symbolic information processing with neural nets and about learning systems in general.
• Statisticians use neural nets as flexible, nonlinear regression and classification models.
• Engineers of many kinds exploit the capabilities of neural networks in many areas, such as signal processing and automatic control.
• Cognitive scientists view neural networks as a possible apparatus to describe models of thinking and consciousness (high-level brain function).

Who is concerned with NNs? (2/2)

• Neuro-physiologists use neural networks to describe and explore medium-level brain function (e.g. memory, sensory system, motor control).
• Physicists use neural networks to model phenomena in statistical mechanics and for a lot of other tasks.
• Biologists use neural networks to interpret nucleotide sequences.
• Philosophers and some other people may also be interested in neural networks for various reasons.

Biological Inspiration

• Animals are able to react adaptively to changes in their external and internal environment, and they use their nervous system to perform these behaviours.
• An appropriate model/simulation of the nervous system should be able to produce similar responses and behaviours in artificial systems.
• The nervous system is built from relatively simple units, the neurons, so copying their behaviour and functionality should be the solution.

Biological NNs (1/3)

Pigeons as art experts (Watanabe et al. 1995)
• Experiment:
  • Pigeon in Skinner box
  • Present paintings of two different artists (e.g. Chagall / Van Gogh)
  • Reward for pecking when presented with a particular artist (e.g. Van Gogh)

Biological NNs (2/3)

• Pigeons were able to discriminate between Van Gogh and Chagall with 95% accuracy (when presented with pictures they had been trained on)
• Discrimination still 85% successful for previously unseen paintings of the artists
• Pigeons do not simply memorise the pictures
• They can extract and recognise patterns (the ‘style’)
• They generalise from the already seen to make predictions
• This is what neural networks (biological and artificial) are good at (unlike conventional computers)

Biological NNs (3/3)

• Brain: a collection of about 10 billion interconnected neurons. Each neuron is a cell that uses biochemical reactions to receive, process and transmit information.
• Each terminal button is connected to other neurons across a small gap called a synapse.
• A neuron's dendritic tree is connected to a thousand neighbouring neurons. When one of those neurons fires, a positive or negative charge is received by one of the dendrites. The strengths of all the received charges are added together through the processes of spatial and temporal summation.
[Figure: neuron diagram with labels Axon, Neurotransmitters, Dendrite]

Artificial NNs: The Basics

• ANNs incorporate the two fundamental components of biological neural nets:
  • Neurons -> Nodes
  • Synapses -> Weights
[Figure: network diagram with labels Inputs, Nodes, Weights, Outputs]
• An artificial neural network is composed of many artificial neurons that are linked together according to a specific network architecture.
• The objective of the neural network is to transform the inputs into meaningful outputs.

Key Elements of NNs (1/2)

• Neural computing requires a number of neurons to be connected together into a neural network. Neurons are arranged in layers.
[Figure: a neuron with inputs p1, p2, p3, weights w1, w2, w3 and a bias, feeding a summation unit Σ followed by an activation function f; the summed value is $O = \sum_i w_i p_i + \text{bias}$]
• Each neuron within the network is usually a simple processing unit which takes one or more inputs and produces an output. At each neuron, every input has an associated weight which modifies the strength of each input. The neuron simply adds together all the weighted inputs and calculates an output to be passed on.

Key Elements of NNs (2/2)

Feeding data through the net:
[Figure: inputs 1.0, 0.5 and 0.7 with weights W1 = 0.32, W2 = 0.46 and W3 = 0.81]
• Output = (1.0 × 0.32) + (0.5 × 0.46) + (0.7 × 0.81) = 1.117
• Squashing (limit the output to the range 0 to 1): $\frac{1}{1 + e^{-1.117}} \approx 0.753$

Feeding NNs

• Data is presented to the network in the form of activations in the input layer
• Examples
  • Pixel intensity (for pictures)
  • Molecule concentrations (for artificial nose)
  • Share prices (for stock market prediction)
• Data usually requires preprocessing
  • Analogous to senses in biology
• How to represent more abstract data, e.g. a name?
  • Choose a pattern, e.g. 0-0-1 for “Chris”, 0-1-0 for “Becky”

Activation (Squashing) functions

• The activation function describes the output behaviour of the neurons.
• It is generally non-linear. Linear functions are limited because the output is simply proportional to the input.

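The weighted-sum and squashing steps from the Key Elements slides can be reproduced in a few lines. This is a minimal sketch that assumes the squashing function is the logistic sigmoid 1 / (1 + e^(-x)) and that the bias is zero, since the worked example lists no bias value; the function names are illustrative.

```python
import math


def weighted_sum(inputs, weights, bias=0.0):
    """Add up every input multiplied by its associated weight, plus the bias."""
    return sum(p * w for p, w in zip(inputs, weights)) + bias


def logistic(net):
    """Squash any real number into the range 0 to 1 (assumed logistic sigmoid)."""
    return 1.0 / (1.0 + math.exp(-net))


inputs = [1.0, 0.5, 0.7]       # input activations from the worked example
weights = [0.32, 0.46, 0.81]   # W1, W2, W3 from the worked example

net = weighted_sum(inputs, weights)
output = logistic(net)

print(f"net input       = {net:.3f}")     # prints 1.117
print(f"squashed output = {output:.3f}")  # prints 0.753
```

Passing a non-zero `bias` shifts the net input before squashing, which moves the point at which the neuron's output crosses 0.5.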
ANN Architectures (1/3)

• A hidden layer learns to recode (or to provide a representation for) the inputs. More than one hidden layer can be used.
• The architecture is more powerful than single-layer networks: it can be shown that any mapping can be learned, given two hidden layers (of units).
[Figures: a single-layer ANN (inputs, outputs) and an ANN with a hidden layer (inputs, hidden layer, outputs)]

ANN Architectures (2/3)

Feed-forward networks
• Feed-forward ANNs allow signals to travel one way only: from input to output. There is no feedback (loops), i.e. the output of any layer does not affect that same layer.
• Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition.
• This type of organisation is also referred to as bottom-up or top-down.
[Figure: feed-forward network; information flows from inputs to outputs]

ANN Architectures (3/3)

Feedback networks
• Feedback networks can have signals travelling in both directions by introducing loops in the network.
• Feedback networks are very powerful and can get extremely complicated.
• Feedback networks are dynamic; their 'state' changes continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found.
[Figure: feedback network with inputs and outputs]

Training methods (1/2)

Supervised learning
• In supervised training, both the inputs and the outputs are provided. The network then processes the inputs and compares its resulting outputs against the desired outputs.
• Errors are then propagated back through the system, causing the system to adjust the weights which control the network. This process occurs over and over as the weights are continually tweaked.
• The set of data which enables the training is called the training set. During the training of a network the same set of data is processed many times as the connection weights are continually refined.
• Example architectures: multilayer perceptrons

Training methods (2/2)

Unsupervised learning
• In unsupervised training, the network is provided with inputs but not with desired outputs. The system itself must then decide what features it will use to group the input data. This is often referred to as self-organization or adaptation.
• Example architectures: Kohonen Self Organizing Maps, Neural Gas, ART Map

The learning Process (1/7)

Example: Optical Character Recognition
• Task: learn to discriminate between two different typed characters
• Data sources
  • Letter ‘A’
  • Letter ‘b’
• Format
  • Image pixel values (binary form)
• Analogy: Eye -> Optical Nerve -> Brain

The learning Process (2/7)

Network architecture
• Feed-forward network
  • 20 inputs (one for each pixel value)
  • 6 hidden
  • 2 output (0-1 for ‘A’, 1-0 for ‘b’)

The learning Process (3/7)

Presenting the data (untrained network)
[Figure: each 20-pixel input pattern is fed to the untrained network; the two outputs are 0.43 and 0.26 for ‘A’, and 0.73 and 0.55 for ‘b’]

The learning Process (4/7)

Calculate error (absolute difference between actual and expected values)
• ‘A’: |0.43 - 0| = 0.43 and |0.26 - 1| = 0.74
• ‘b’: |0.73 - 1| = 0.27 and |0.55 - 0| = 0.55

The learning Process (5/7)

Use the errors to adjust weights through some learning function
• ‘A’: total error 0.43 + 0.74 = 1.17
• ‘b’: total error 0.27 + 0.55 = 0.82

The learning Process (6/7)

• Repeat process (sweep) for all training pairs
  • Present data
  • Calculate error
  • Back-propagate error
  • Adjust weights
• Repeat process multiple times
[Figure: error plotted against repetition (epoch)]

The learning Process (7/7)

Presenting the data (trained network)
[Figure: the trained network outputs 0.01 and 0.99 for ‘A’, and 0.99 and 0.01 for ‘b’]

Design Considerations

• What transfer function should be used?
• How many inputs does the network need?
• How many hidden layers does the network need?
• How many hidden neurons per hidden layer?
• How many outputs should the network have?
There is no standard methodology for determining these values. Although some heuristics exist, the final values are determined by trial and error.

When to use ANNs

• Input is high-dimensional discrete or real-valued (e.g. raw sensor input). Inputs can be highly correlated or independent.
• Output is discrete or real valued
• Output is a vector of values
• Possibly noisy data.
  Data may contain errors.
• Form of target function is unknown
• Long training times are acceptable
• Fast evaluation of target function is required
• Human readability of learned target function is unimportant: an ANN is much like a black box

Conclusions

• Artificial neural networks are inspired by the learning processes that take place in biological systems.
• Artificial neurons and neural networks try to imitate the working mechanisms of their biological counterparts.
• Learning can be perceived as an optimisation process.
• Biological neural learning happens by the modification of the synaptic strength. Artificial neural networks learn in the same way.

“If the brain were so simple that we could understand it then we’d be so simple that we couldn’t” (Lyall Watson)
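The learning process walked through in the character recognition example (present data, calculate error, back-propagate error, adjust weights, repeat over many epochs) can be sketched for the 20-6-2 feed-forward network. This is a sketch under stated assumptions, not the lecture's implementation: the two 4x5 bitmaps are made-up stand-ins for the letters, the units use the logistic function, the weight update is plain gradient descent on the squared error, and the learning rate, epoch count and random seed are arbitrary choices.

```python
import math
import random

# Hypothetical 4x5 binary bitmaps (20 pixels each); the slides give no pixel data.
LETTER_A = [0, 1, 1, 0,  1, 0, 0, 1,  1, 1, 1, 1,  1, 0, 0, 1,  1, 0, 0, 1]
LETTER_B = [1, 0, 0, 0,  1, 0, 0, 0,  1, 1, 1, 0,  1, 0, 0, 1,  1, 1, 1, 0]
# Target coding from the slides: 0-1 for 'A', 1-0 for 'b'.
TRAINING_SET = [(LETTER_A, [0.0, 1.0]), (LETTER_B, [1.0, 0.0])]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """One weight row per neuron; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights, inputs):
    return [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in weights]


def forward(hidden_w, output_w, pixels):
    hidden = layer_forward(hidden_w, pixels)
    return hidden, layer_forward(output_w, hidden)


def total_error(hidden_w, output_w):
    """Sum of absolute differences between actual and expected outputs."""
    return sum(abs(o - t)
               for pixels, target in TRAINING_SET
               for o, t in zip(forward(hidden_w, output_w, pixels)[1], target))


def train(epochs=2000, rate=0.5, seed=0):
    rng = random.Random(seed)
    hidden_w = init_layer(20, 6, rng)   # 20 inputs -> 6 hidden units
    output_w = init_layer(6, 2, rng)    # 6 hidden units -> 2 outputs
    history = [total_error(hidden_w, output_w)]
    for _ in range(epochs):
        for pixels, target in TRAINING_SET:
            # Present data
            hidden, out = forward(hidden_w, output_w, pixels)
            # Calculate error, scaled by the slope of the logistic function
            out_delta = [(t - o) * o * (1 - o) for o, t in zip(out, target)]
            # Back-propagate error to the hidden layer (uses pre-update weights)
            hid_delta = [h * (1 - h) * sum(d * row[j] for d, row in zip(out_delta, output_w))
                         for j, h in enumerate(hidden)]
            # Adjust weights
            for row, d in zip(output_w, out_delta):
                for j, h in enumerate(hidden):
                    row[j] += rate * d * h
                row[-1] += rate * d
            for row, d in zip(hidden_w, hid_delta):
                for i, x in enumerate(pixels):
                    row[i] += rate * d * x
                row[-1] += rate * d
        history.append(total_error(hidden_w, output_w))
    return hidden_w, output_w, history


hidden_w, output_w, history = train()
print("total error before training:", round(history[0], 2))
print("total error after training: ", round(history[-1], 2))
for name, pixels in (("A", LETTER_A), ("b", LETTER_B)):
    print(name, [round(o, 2) for o in forward(hidden_w, output_w, pixels)[1]])
```

The `history` list records the total error after each epoch, so plotting it gives the kind of error-versus-epoch curve described in the learning process slides.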