Neural Networks, 2nd Edition, Simon Haykin
Chap. 1 Introduction
柯博昌

What is a Neural Network?

A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use. Knowledge is acquired by the network from its environment through a learning process; the procedure that performs the learning process is called a learning algorithm. Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.

Benefits of Neural Networks

The computing power of neural networks derives from:
– its massively parallel distributed structure
– its ability to learn and therefore generalize

Using neural networks offers the following properties:
– Nonlinearity
– Input-Output Mapping
– Adaptivity
– Evidential Response
– Contextual Information
– Fault Tolerance
– VLSI Implementability
– Uniformity of Analysis and Design
– Neurobiological Analogy

Supervised Learning: modifying the synaptic weights by applying a set of training samples, which consist of input signals and corresponding desired responses.

Human Brain - Function Block

Block diagram representation of the human nervous system:

  Stimulus → Receptors → Neural Net. (Brain) → Effectors → Response
  (with feedback paths from each later stage back to the earlier ones)

Receptors: convert stimuli from the human body or the external environment into electrical impulses that convey information to the brain.
Effectors: convert electrical impulses generated by the brain into discernible responses as system outputs.

Comparisons: Neural Net. vs. Brain

Neurons are the structural constituents of the brain. Neurons are five to six orders of magnitude slower than silicon logic gates (silicon chips switch on the order of 10^-9 s, neural events occur on the order of 10^-3 s). About 10 billion neurons and 60 trillion synapses, or connections, are in the human cortex.

Energetic efficiency:
– Brain: about 10^-16 joules per operation per second.
– Best computer today: about 10^-6 joules per operation per second.

Synapses

Synapses are elementary structural and functional units that mediate the interactions between neurons. The most common kind of synapse is the chemical synapse. The operation of a synapse:
– A presynaptic process liberates a transmitter substance that diffuses across the synaptic junction between neurons.
– The transmitter acts on a postsynaptic process.
– The synapse thus converts a presynaptic electrical signal into a chemical signal and then back into a postsynaptic electrical signal (a nonreciprocal two-port device).

Pyramidal Cell
[Figure: a pyramidal cell.]

Cytoarchitectural map of the cerebral cortex
[Figure: cytoarchitectural map of the cerebral cortex.]

Nonlinear model of a neuron

[Figure: input signals x_1, …, x_m pass through synaptic weights w_{k1}, …, w_{km} into a summing junction Σ together with a bias b_k; the result v_k passes through an activation function \varphi(\cdot) to produce the output y_k.]

  u_k = \sum_{j=1}^{m} w_{kj} x_j,  \quad v_k = u_k + b_k,  \quad y_k = \varphi(v_k) = \varphi(u_k + b_k)

Letting b_k = w_{k0} and x_0 = +1, the bias is absorbed as an extra synaptic weight:

  v_k = \sum_{j=0}^{m} w_{kj} x_j  \quad\text{and}\quad  y_k = \varphi(v_k)
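As an illustration (not part of the original slides), a minimal NumPy sketch of this neuron model using the bias-as-weight trick above; the sigmoid used here as \varphi(\cdot) is defined formally on the activation-function slides that follow:

```python
import numpy as np

def sigmoid(v, a=1.0):
    """Sigmoid activation phi(v) = 1 / (1 + exp(-a*v)); a is the slope parameter."""
    return 1.0 / (1.0 + np.exp(-a * v))

def neuron_output(x, w, b):
    """Compute y_k = phi(v_k) with the bias absorbed as w_k0 = b_k and x_0 = +1."""
    x_aug = np.concatenate(([1.0], x))   # fixed input x_0 = +1
    w_aug = np.concatenate(([b], w))     # extra weight w_k0 = b_k
    v = np.dot(w_aug, x_aug)             # induced local field v_k
    return sigmoid(v)

# Example with m = 3 inputs (values are arbitrary)
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, -0.6])
print(neuron_output(x, w, b=0.1))
```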
Nonlinear model of a neuron (Cont.)

[Figure: the affine transformation v_k = u_k + b_k produced by the presence of a bias; the plot of the induced local field v_k versus the linear combiner's output u_k is shifted up for b_k > 0, passes through the origin for b_k = 0, and is shifted down for b_k < 0.]

Another nonlinear model of a neuron:
[Figure: the same neuron redrawn with a fixed input x_0 = +1 and synaptic weight w_{k0} = b_k (bias), so the summing junction takes all weights including the bias; its output v_k (the induced local field) feeds the activation function \varphi(\cdot) to produce the output y_k.]

Types of Activation Function

Threshold function:
  \varphi(v) = 1 \text{ if } v \ge 0; \quad \varphi(v) = 0 \text{ if } v < 0

Piecewise-linear function:
  \varphi(v) = 1 \text{ for } v \ge +1/2; \quad \varphi(v) = v \text{ for } +1/2 > v > -1/2; \quad \varphi(v) = 0 \text{ for } v \le -1/2

Sigmoid function:
  \varphi(v) = \frac{1}{1 + \exp(-av)}, \quad \text{where } a \text{ is the slope parameter}

[Figure: plots of the three activation functions, each ranging from 0 to 1; the sigmoid is shown for increasing values of a.]

Types of Activation Function (Cont.)

The activation functions defined above range from 0 to +1. Sometimes an activation function ranging from -1 to +1 is wanted instead. (How to do it?) Denote the function ranging from 0 to +1 by \varphi(\cdot) and the one ranging from -1 to +1 by \varphi'(\cdot); then

  \varphi'(v) = 2\varphi(v) - 1

Note that if \varphi(v) is the sigmoid function,

  \varphi'(v) = \frac{2}{1+\exp(-av)} - 1 = \frac{1-\exp(-av)}{1+\exp(-av)} = \tanh(av/2)

Stochastic Model of a Neuron

The model above is deterministic in that its input-output behavior is precisely defined. Some applications of neural networks base the analysis on a stochastic neuronal model. Let x denote the state of the neuron and P(v) the probability of firing, where v is the induced local field of the neuron:

  x = +1 \text{ with probability } P(v); \quad x = -1 \text{ with probability } 1 - P(v)

A standard choice for P(v) is the sigmoid-shaped function

  P(v) = \frac{1}{1 + \exp(-v/T)}

where T is a pseudo-temperature used to control the noise level, and therefore the uncertainty in firing.

Neural Network Directed Graph

– Synaptic links: y_k = w_{kj} x_j
– Activation links: y_k = \varphi(x_j)
– Synaptic convergence (fan-in): y_k = y_i + y_j
– Synaptic divergence (fan-out): the signal x_j is transmitted unchanged along each outgoing branch.

Signal-flow Graph of a Neuron
[Figure: source nodes x_0 = +1, x_1, …, x_m with branch weights w_{k0} = b_k, w_{k1}, …, w_{km} converging on the node v_k, followed by an activation link \varphi(\cdot) producing y_k.]

Feedback

Feedback plays a major role in recurrent networks.

[Figure: single-loop feedback system; the input x_j(n) is added to B[y_k(n)] to form x_j'(n), which passes through operator A to produce y_k(n).]

  y_k(n) = A[x_j'(n)], \quad x_j'(n) = x_j(n) + B[y_k(n)]

Eliminating x_j'(n) gives

  y_k(n) = \frac{A}{1 - AB}[x_j(n)]

where A and B act as operators. A/(1-AB) is referred to as the closed-loop operator and AB as the open-loop operator. In general, AB \ne BA.

Feedback (Cont.)

Let A be a fixed weight w, and let B be the unit-delay operator z^{-1}. Then

  \frac{A}{1-AB} = \frac{w}{1 - wz^{-1}} = w(1 - wz^{-1})^{-1} = w \sum_{l=0}^{\infty} w^l z^{-l}

(use Taylor's expansion or the binomial expansion to prove it). Since z^{-l}[x_j(n)] = x_j(n-l),

  y_k(n) = w \sum_{l=0}^{\infty} w^l z^{-l}[x_j(n)] = \sum_{l=0}^{\infty} w^{l+1} x_j(n-l)

Time Responses for different weight, w

[Figure: time responses y_k(n) starting from w\,x_j(0): growing for w > 1, constant for w = 1, and decaying for w < 1.]

Conclusions:
1. If |w| < 1, y_k(n) is exponentially convergent: the system is stable.
2. If |w| \ge 1, y_k(n) is divergent: the system is unstable.

Think about:
1. How does the time response change if -1 < w < 0?
2. How does the time response change if w \le -1?
(A short simulation of these time responses appears after the network-architecture slides below.)

Network Architectures

Single-layer feedforward networks; multilayer feedforward networks.
Fully connected: every node in each layer is connected to every node in the adjacent forward layer. Otherwise, the network is partially connected.

[Figure: a single-layer feedforward network with an input layer of source nodes and an output layer of neurons; a multilayer feedforward network with an input layer of source nodes, a layer of hidden neurons, and a layer of output neurons.]
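As an illustration (not from the original slides), a short NumPy sketch reproducing the time-response conclusions from the Feedback slides above, for an impulse input x_j(0) = 1, x_j(n) = 0 for n > 0, in which case y_k(n) reduces to w^{n+1}:

```python
import numpy as np

def feedback_response(w, n_steps=6):
    """y_k(n) = sum_{l=0}^{n} w^(l+1) * x_j(n-l) for an impulse input."""
    x = np.zeros(n_steps)
    x[0] = 1.0                                   # impulse: x_j(0) = 1
    y = np.zeros(n_steps)
    for n in range(n_steps):
        y[n] = sum(w ** (l + 1) * x[n - l] for l in range(n + 1))
    return y

for w in (0.5, 1.0, 2.0):                        # |w| < 1 stable, |w| >= 1 unstable
    print(f"w = {w}: {feedback_response(w)}")
```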
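And a minimal sketch of a fully connected multilayer feedforward pass, stacking the single-neuron computation layer by layer (the layer sizes here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_forward(x, W, b, a=1.0):
    """One fully connected layer: v = W x + b, then elementwise sigmoid."""
    return 1.0 / (1.0 + np.exp(-a * (W @ x + b)))

# Multilayer feedforward network: 4 source nodes -> 3 hidden -> 2 output neurons
x = rng.normal(size=4)                                   # input layer of source nodes
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=3)     # layer of hidden neurons
W2, b2 = rng.normal(size=(2, 3)), rng.normal(size=2)     # layer of output neurons
y = layer_forward(layer_forward(x, W1, b1), W2, b2)
print(y)
```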
Network Architectures (Cont.)

[Figure: a recurrent network with no self-feedback loops and no hidden neurons, in which each output is fed back through a unit-delay operator z^{-1} to the inputs of the other neurons; and a recurrent network with hidden neurons, in which the outputs are fed back through unit-delay operators to both the visible and the hidden neurons.]

Knowledge Representation

Primary characteristics of knowledge representation:
– What information is actually made explicit.
– How the information is physically encoded for subsequent use.

Knowledge is goal directed. A good solution depends on a good representation of knowledge. A set of input-output pairs, with each pair consisting of an input signal and the corresponding desired response, is referred to as a set of training data or a training sample.

Rules for Knowledge Representation

Rule 1: Similar inputs from similar classes should usually produce similar representations inside the network.

Measuring similarity between x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]^T and x_j = [x_{j1}, x_{j2}, \ldots, x_{jm}]^T:

(1) Euclidean distance:
  d(x_i, x_j) = \lVert x_i - x_j \rVert = \left[ \sum_{k=1}^{m} (x_{ik} - x_{jk})^2 \right]^{1/2}

(2) Inner product:
  (x_i, x_j) = x_i^T x_j = \sum_{k=1}^{m} x_{ik} x_{jk}

If \lVert x_i \rVert = 1 and \lVert x_j \rVert = 1, the two measures are related by

  d^2(x_i, x_j) = (x_i - x_j)^T (x_i - x_j) = 2 - 2\, x_i^T x_j

[Figure: two unit vectors x_i and x_j separated by angle θ; the inner product x_i^T x_j = cos(θ), so a small distance d(x_i, x_j) corresponds to a large inner product.]

Rules for Knowledge Representation (Cont.)

Rule 2: Items to be categorized as separate classes should be given widely different representations in the network. (This is the exact opposite of Rule 1.)

Rule 3: If a particular feature is important, then there should be a large number of neurons involved in the representation of that item in the network.

Rule 4: Prior information and invariances should be built into the design of a neural network, thereby simplifying the network design by not having to learn them.

How to Build Prior Information into Neural Network Design

Example: restricting the network architecture through the use of local connections known as receptive fields, and constraining the choice of synaptic weights through the use of weight-sharing:

  v_j = \sum_{i=1}^{6} w_i x_{i+j-1}, \quad j = 1, 2, 3, 4

This is a convolution sum, and a network built this way is a convolutional network. Inputs x_1, …, x_6 constitute the receptive field for hidden neuron 1, and so on for the other hidden neurons. (A sketch of both the similarity measures above and this convolution sum follows at the end of the section.)

Artificial Intelligence (AI)

Goal: developing paradigms or algorithms that require machines to perform cognitive tasks.

An AI system must be capable of doing three things:
– Store knowledge.
– Apply the knowledge stored to solve problems.
– Acquire new knowledge through experience.

Key components:
– Representation
– Reasoning
– Learning
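To close, an illustrative sketch (not from the slides) of the two similarity measures from Rule 1, including the unit-norm identity d^2 = 2 - 2\, x_i^T x_j, and of the weight-sharing convolution sum above:

```python
import numpy as np

def euclidean_distance(xi, xj):
    """d(xi, xj) = ||xi - xj||."""
    return np.linalg.norm(xi - xj)

def inner_product(xi, xj):
    """(xi, xj) = xi^T xj."""
    return float(np.dot(xi, xj))

# For unit vectors, d^2(xi, xj) = 2 - 2 * xi^T xj:
rng = np.random.default_rng(1)
xi = rng.normal(size=5); xi /= np.linalg.norm(xi)
xj = rng.normal(size=5); xj /= np.linalg.norm(xj)
assert np.isclose(euclidean_distance(xi, xj) ** 2,
                  2 - 2 * inner_product(xi, xj))

def receptive_field_outputs(x, w):
    """v_j = sum_{i=1}^{6} w_i * x_{i+j-1}: hidden neurons sharing one
    6-tap weight vector, each looking at its own receptive field of x."""
    m = len(w)                               # receptive-field width (6)
    return np.array([np.dot(w, x[j:j + m]) for j in range(len(x) - m + 1)])

x = rng.normal(size=9)                       # 9 inputs -> 4 hidden neurons
w = rng.normal(size=6)                       # shared synaptic weights
print(receptive_field_outputs(x, w))         # the convolution sum v_1..v_4
```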