Introduction

Neural Networks and Learning Machines, Third Edition, Simon Haykin. Copyright ©2009 by Pearson Education, Inc., Upper Saddle River, New Jersey 07458. All rights reserved.

Course Info
• 3 hours of lectures
  – On Tuesday from 13.30 to 16.20
• Grading
  – 50% Final Exam
  – 35% Midterm Exam
  – 15% 2nd Midterm or Project (to be decided)
• Textbook
  – Neural Networks: A Comprehensive Foundation, Simon Haykin, Prentice Hall.

Course Outline
• Introduction
• Learning Processes
• Single-Layer Perceptrons
• Multilayer Perceptrons
• Deep Learning

What is a Neural Network?
• "Neural network" is a general name covering both
  – Biological neural networks (e.g., the human nervous system)
  – Artificial neural networks
• Our main topic is artificial neural networks (ANNs)
• We will sometimes say "neural network" to refer to an ANN

What is a Neural Network?
• Biological neural networks (such as the human brain) compute in a different way from today's computers
• The brain is a highly complex, nonlinear, and parallel computer
• It can organize its own structure (connected neurons) to perform certain computations much faster than current computers

What is a Neural Network?
• An (artificial) neural network is a machine designed to model the way in which the brain performs a particular task or function of interest; it is usually
  – implemented using electronic components
  – or simulated in software on a computer
• Our interest will mostly be in a group of ANNs that perform useful computations after a learning process
• As the name implies, it is a network of smaller computing units called neurons

What is a Neural Network?
• (Definition by Aleksander & Morton, 1990)
  – A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:
    • Knowledge is acquired by the network from its environment through a learning process
    • Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.

What is a Neural Network?
• The procedure used to perform the learning process is called a learning algorithm
  – The main idea is to modify the synaptic weights of the network in some way so as to achieve a desired objective

Benefits of ANNs
• Nonlinearity: Neurons can be linear or nonlinear. Nonlinearity also comes from the networking itself. This is an important property, particularly when we are working on nonlinear problems.
• Input-Output Mapping: An ANN learns how to map inputs to outputs from examples.
This is similar to nonparametric statistical inference (a branch of statistics).
• Adaptivity: An ANN trained for a specific setting can easily be retrained to deal with minor changes in operating conditions. In fact, it can be designed to adapt continuously in a changing environment. There is, however, often a critical trade-off between an adaptive system and a robust one.

Benefits of ANNs
• Evidential Response: An ANN can be designed not only to give a decision but also to report how confident it is in that decision.
• Contextual Information: Knowledge is represented by the very structure of the network. Every neuron is potentially affected by all others in the network; therefore, contextual information is dealt with naturally.
• Fault Tolerance: In hardware form, ANNs are fault tolerant in the sense that, if a neuron fails, the overall performance is only slightly degraded.

Benefits of ANNs
• VLSI Implementability: An ANN is well suited to implementation using very-large-scale-integrated (VLSI) technology.
• Uniformity of Analysis and Design: The same notation (neurons being the basic unit, etc.) is used in all domains involving the application of neural networks.
• Neurobiological Analogy: ANNs are motivated by analogy with the brain, which is living proof that fault-tolerant parallel processing is not only physically possible but also fast and powerful.
Human Brain
• May be viewed as a three-stage system:
  – Receptors convert stimuli into electrical impulses; the brain (neural net) processes them; effectors convert electrical impulses into responses (system outputs)
  – Left-to-right arrows: forward transmission; right-to-left arrows: feedback
• Neurons are five to six orders of magnitude slower than silicon logic gates
  – Neural events happen in the 10⁻³ s range, whereas silicon gate events happen in the 10⁻⁹ s range
• Yet the brain makes up for this by having an enormous number of neurons and complex interconnections between them
  – There are approximately 10 billion neurons in the human cortex and 60 trillion connections (synapses)
• The brain is also energy efficient (about 10⁻¹⁶ joules per operation per second)
  – Computers today use about 10⁻⁶ joules per operation per second

The Pyramidal Cell

• http://youtu.be/gcK_5x2KsLA

Human Brain
• There are both small-scale and large-scale anatomical organizations
  – Different functions take place at lower and higher levels

Figure 4 Cytoarchitectural map of the cerebral cortex. The different areas are identified by the thickness of their layers and types of cells within them. Some of the key sensory areas are as follows: Motor cortex: motor strip, area 4; premotor area, area 6; frontal eye fields, area 8. Somatosensory cortex: areas 3, 1, and 2.
Visual cortex: areas 17, 18, and 19. Auditory cortex: areas 41 and 42. (From A. Brodal, 1981; with permission of Oxford University Press.)

Artificial Neuron Models
• A neuron is the fundamental information-processing unit of a neural network
• The diagram on the right shows a neuron model including
  – A set of synapses (each with a weight w_ki)
  – An adder (a linear combiner)
  – An activation function
  – A bias value b_k that modifies the net input of the activation function

Artificial Neuron Models
• Mathematically, the following pair of equations describes neuron k:

  u_k = Σ_{i=1}^{m} w_ki x_i
  y_k = φ(u_k + b_k)

Artificial Neuron Model
• Use of the bias b_k applies an affine transformation to u_k:

  v_k = u_k + b_k

  where v_k is called the activation potential (or induced local field)
• Using the activation potential, instead of the previous equations we can write

  v_k = Σ_{i=0}^{m} w_ki x_i
  y_k = φ(v_k)

  where x_0 = +1 and w_k0 = b_k

Figure 7 Another nonlinear model of a neuron; w_k0 accounts for the bias b_k.
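The pair of equations describing neuron k can be sketched directly in Python. This is a minimal illustration; the choice of the sigmoid for φ is an assumption here (it is one of the common activation functions discussed shortly):

```python
import math

def neuron_output(x, w, b):
    """Compute y_k = phi(v_k) for a single neuron k, where
    v_k = sum_{i=1..m} w_ki * x_i + b_k (the activation potential).

    x: input signals x_1..x_m
    w: synaptic weights w_k1..w_km
    b: bias b_k
    """
    v = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-v))  # sigmoid activation phi(v)

# Example: a neuron with two inputs
y = neuron_output([1.0, 0.5], [0.4, -0.2], 0.1)  # v = 0.4, y ≈ 0.599
```

Folding the bias into the weights as w_k0 = b_k with a fixed input x_0 = +1 would give the same result, which is exactly the rewriting shown above.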
Types of Activation Function
• The activation function φ(v) defines the output of a neuron in terms of the activation potential v_k
• 3 basic types are
  – Threshold function (top right)
  – Piecewise-linear function
  – Sigmoid function (bottom right), the most common

Types of Activation Function
• Sigmoid function:

  φ(v) = 1 / (1 + exp(−a·v))

  where a is the slope parameter
• This function is differentiable (important, as we will describe in Chapter 4)

ANNs as Directed Graphs
• Simpler graphs can be drawn that are similar to signal-flow graphs
• A signal-flow graph is a network of directed links connected at nodes
  – Signals flow only in the direction of the arrows
• Synaptic links (a) and activation links (b)
  – A node's signal is the sum of all signals entering it
  – A node's signal is transmitted to all of its outgoing links

Figure 10 Signal-flow graph of a neuron.

ANNs as Directed Graphs
• An ANN is a directed graph consisting of nodes with interconnecting synaptic and activation links, and is characterized by four properties:
  – Each neuron is represented by a set of linear synaptic links, an externally applied bias, and a possibly nonlinear activation link. The bias is represented by a synaptic link connected to an input fixed at +1.
  – The synaptic links of a neuron weight their respective input signals.
  – The weighted sum of the input signals defines the activation potential of the neuron in question.
  – The activation link squashes the activation potential of the neuron to produce an output.

ANNs as Directed Graphs
• When we are interested in signal flow from neuron to neuron, and not in the details of individual neurons, we can use a partially complete graph as follows:
  – Source nodes supply input signals to the graph.
  – Each neuron is represented by a single computation node.
  – Links connecting source and computation nodes carry no weight; they only show flow direction.

Figure 11 Architectural graph of a neuron.

ANNs as Directed Graphs
• In summary, there are 3 graphical representations that we use:
  – Block diagram
  – Signal-flow graph
  – Architectural graph

Feedback
• Feedback exists in a system when the output of an element influences in part the input to that element (resulting in closed paths and what we call recurrent networks)
  – A and B can be replaced with valid operators such as a weight w or the unit-delay operator z⁻¹
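The three basic activation function types listed earlier can be sketched as follows. This is a minimal illustration; the piecewise-linear breakpoints at ±0.5 are one common parameterization, not the only one:

```python
import math

def threshold(v):
    """Threshold (Heaviside) function: 1 if v >= 0, else 0."""
    return 1.0 if v >= 0.0 else 0.0

def piecewise_linear(v):
    """Piecewise-linear function: linear on [-0.5, 0.5],
    saturating at 0 below and 1 above."""
    if v >= 0.5:
        return 1.0
    if v <= -0.5:
        return 0.0
    return v + 0.5

def sigmoid(v, a=1.0):
    """Sigmoid (logistic) function with slope parameter a:
    phi(v) = 1 / (1 + exp(-a*v))."""
    return 1.0 / (1.0 + math.exp(-a * v))
```

Increasing the slope parameter a makes the sigmoid steeper; in the limit a → ∞ it approaches the threshold function, which is why the sigmoid is often viewed as its smooth, differentiable counterpart.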
Network Architectures
• The structure of an ANN affects which learning algorithm can be used (learning algorithms may be structured accordingly)
• In general, there are 3 fundamentally different classes of network architecture:
  – Single-Layer Feedforward Networks
  – Multilayer Feedforward Networks
  – Recurrent Networks

Single-Layer Feedforward Networks
• An input layer of source nodes projects onto an output layer of neurons
• No feedback at any point (hence feedforward)

Multilayer Feedforward Networks
• One or more hidden layers
• One layer's output is the input to the next layer
• Outputs of the final layer are the outputs of the system
• The example on the right is a 10-4-2 network
• It is also fully connected (as opposed to partially connected)

Recurrent Networks
• A recurrent network has at least one feedback loop

Knowledge Representation
• Knowledge of the world consists of two kinds of information:
  – The known world state (prior information)
  – Observations (measurements) of the world; examples used to train an ANN are drawn from such observations
• Examples can be labeled (each input is paired with a desired response) or unlabeled (only an input signal).
• A set of labeled examples forms our training data (or training sample)
  – E.g., a set of handwritten digit images and their corresponding digit labels

Knowledge Representation
• Four common-sense rules for knowledge representation:
  – Similar inputs from similar classes should usually produce similar representations inside the network
  – Items from separate classes should be given different representations
  – If a feature is important, then a large number of neurons should be involved in its representation
    • In the radar example of target detection in clutter, detection performance is measured in 2 ways: detection success and false-alarm probability
  – Prior information and invariances should be built into the ANN design (a specialized structure). This is good because:
    • Biological networks are very specialized
    • A specialized structure means fewer free parameters (the network learns faster and better)
    • Network throughput is improved
    • Building cost is reduced

Knowledge Representation
• Building prior information into the ANN design
  – There are no well-defined rules for doing this, but 2 techniques generally work:
    • Restrict the network to use local connections (receptive fields)
    • Use weight sharing (again, free parameters are reduced as a side effect; see the ANN on the next page)
• Building invariances into the ANN design
  – E.g., rotated versions of the same handwritten letter, or the same word spoken softly/loudly/slowly/quickly
  – Invariance by structure
  – Invariance by training
  – Invariant feature space
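The two prior-information techniques just described, local receptive fields and weight sharing, can be sketched together as a one-dimensional convolution: every hidden neuron applies the same small set of weights to its own local window of inputs. This is a minimal sketch; the 9-input/6-weight sizes are chosen only so that four hidden neurons result:

```python
def shared_weight_layer(x, w):
    """Hidden-layer activation potentials where each neuron sees a
    local receptive field of len(w) consecutive inputs, and all
    neurons share the same weight vector w (a 1-D convolution
    without padding, stride 1)."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

# 9 source nodes, 6 shared weights -> 4 hidden neurons,
# each with its own receptive field of 6 consecutive inputs
x = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
w = [0.1, 0.2, 0.3, 0.3, 0.2, 0.1]
h = shared_weight_layer(x, w)
```

Note that the layer has only 6 free parameters regardless of how many hidden neurons it produces; without weight sharing, 4 fully connected hidden neurons over 9 inputs would need 36.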
Figure 20 Illustrating the combined use of a receptive field and weight sharing. All four hidden neurons share exactly the same set of weights for their six synaptic connections.
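As a final illustration, the fully connected 10-4-2 multilayer feedforward network described earlier can be sketched as a chain of layer computations. This is a minimal sketch with randomly initialized weights; in practice the weights would be set by a learning algorithm, and the sigmoid activation is an assumption:

```python
import math
import random

def layer_forward(x, weights, biases):
    """One fully connected layer: neuron k computes
    y_k = sigmoid(sum_i w_ki * x_i + b_k)."""
    return [1.0 / (1.0 + math.exp(-(sum(w_i * x_i
                                        for w_i, x_i in zip(w, x)) + b)))
            for w, b in zip(weights, biases)]

def make_layer(n_in, n_out, rng):
    """Random weights and biases for a fully connected layer
    (for illustration only; not a trained network)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

rng = random.Random(0)
hidden = make_layer(10, 4, rng)   # 10 source nodes -> 4 hidden neurons
output = make_layer(4, 2, rng)    # 4 hidden neurons -> 2 output neurons

x = [rng.uniform(0, 1) for _ in range(10)]  # input signals
h = layer_forward(x, *hidden)               # hidden-layer outputs
y = layer_forward(h, *output)               # network outputs (2 values)
```

Each layer's output feeds the next layer's input, exactly as in the architectural graph of a multilayer feedforward network: signals flow strictly forward, with no feedback loops.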