Introduction to Artificial Neural Networks
(ANNs)
Keith L. Downing
The Norwegian University of Science and Technology (NTNU)
Trondheim, Norway
[email protected]
January 12, 2014
Keith L. Downing
Introduction to Artificial Neural Networks (ANNs)
NETtalk (Sejnowski + Rosenberg, 1986)
[Figure: NETtalk architecture. A sliding context window of letters (here the word "SCIENCE") feeds a layer of "Concepts" units, which maps to phoneme outputs, including a "Silent" output for unpronounced letters.]
DEC’s DECtalk: several man-years of work → reading machine.
NETtalk: 10 hours of backprop training on a 1000-word text, T1000.
95% accuracy on T1000; 78% accuracy on novel text.
Improvement during training sounds like a child learning to read.
Concept layer is key. 79 different (overlapping) clouds of neurons are
gradually formed, with each mapping to one of the 79 phonemes.
Sample ANN Applications: Forecasting
1. Train the ANN (typically using backprop) on historical data to learn
   [X(t−k), X(t−k+1), . . . , X(t0)] ↦ [X(t1), . . . , X(tm−1), X(tm)]
2. Use it to predict future value(s) based on the past k values.
Sample applications (Ungar, in Handbook of Brain Theory and NNs, 2003)
Car sales
Airline passengers
Currency exchange rates
Electrical loads on regional power systems.
Flour prices
Stock prices (Warning: often tried, but few good, documented results).
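The windowed mapping above can be sketched in code; the window sizes and toy series here are illustrative.

```python
import numpy as np

def make_windows(series, k, m):
    """Slice a 1-D series into (past k values -> next m values) training pairs."""
    X, Y = [], []
    for t in range(k, len(series) - m + 1):
        X.append(series[t - k:t])   # inputs:  X(t-k) .. X(t-1)
        Y.append(series[t:t + m])   # targets: X(t) .. X(t+m-1)
    return np.array(X), np.array(Y)

series = np.arange(10.0)            # toy "historical data"
X, Y = make_windows(series, k=3, m=2)
print(X.shape, Y.shape)             # (6, 3) (6, 2)
```

Each (X row, Y row) pair is one training case for backprop.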
Brain-Computer Interfaces (BCI)
[Figure: BCI pipeline from brain readings (scalp EEG or neural ensembles) to action.]
1. Ask the subject to think about an activity (e.g. moving a joystick left).
2. Register brain activity: EEG waves (non-invasive) or neural ensembles (invasive).
3. ANN training case = (brain readings, joystick motion).
Sample applications (Millan, in Handbook of Brain Theory and NNs, 2003)
Keyboards (3 keystrokes per minute)
Artificial (prosthetic) hands
Wheelchairs
Computer games
Brains as Bio-Inspiration
[Figure: distributed memory. Concepts such as "Texas", "Watermelon", a movie quote and a song lyric, and the proverbial "grandmother", each stored across overlapping populations of neurons.]
Distributed Memory - A key to the brain’s success, and a major
difference between it and computers.
Brain operations slower than computers, but massively parallel.
How can the brain inspire AI advances?
What is the proper level of abstraction?
Signal Transmission in the Brain
[Figure: a neuron with dendrites, nucleus and axons; action potentials (APs) travel along both dendrites and axons.]
Action Potential (AP)
A wave of voltage change along axons and dendrites.
Nucleus (soma) generates an AP if the sum of its incoming
APs is strong enough.
Ion Channels
[Figure: ion channels. Na+ and Ca++ influx produces depolarization; K+ efflux produces repolarization.]
Depolarization and Repolarization
[Figure: time course of an action potential. From the resting potential (−65 mV), Na+ gates open and Na+ influx depolarizes the membrane past 0 mV to an overshoot near +40 mV; Na+ gates then close, K+ gates open, and K+ efflux repolarizes the membrane, with a brief undershoot before the K+ gates close.]
Transferring APs across a Synapse
[Figure: a synapse. An action potential (AP) arriving at the pre-synaptic terminal causes vesicles to release neurotransmitter (NT) into the synapse, where it binds NT-gated ion channels on the post-synaptic terminal.]
Neurotransmitters
Excite - Glutamate, AMPA; bind Na+ and Ca++ channels.
Inhibit - GABA; binds K+ channels.
Location, Location, Location ... of Synapses
[Figure: synapses (P1, P2, I1, I2) at proximal and distal locations on a neuron's dendrites, soma and axons.]
Distal and Proximal Synapses
Synapses closer to the soma normally have a stronger effect.
Donald Hebb (1949)
Fire Together, Wire Together
When an axon of cell A is near enough to excite a cell B and
repeatedly or persistently takes part in firing it, some growth
process or metabolic change takes place in one or both cells,
such that A’s efficiency as one of the cells firing B, is increased.
Hebb Rule
Δwi,j = λ oi oj
Instrumental in Binding of..
pieces of an image
words of a song
multisensory input (e.g. words and images)
sensory inputs and proper motor outputs
simple movements of a complex action sequence
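As a minimal numeric sketch of the rule (the learning rate and activation values are illustrative):

```python
# Hebb rule: Δw_ij = λ * o_i * o_j  (illustrative values)
lam = 0.1            # learning rate λ
o_i, o_j = 0.8, 0.5  # activations of the two neurons
w = 0.2              # current weight
w += lam * o_i * o_j
print(round(w, 3))   # 0.24
```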
Coincidence Detection and Synaptic Change
2 Key Synaptic Changes
1. The propensity to release neurotransmitter (and the amount released) at the pre-synaptic terminal.
2. The ease with which the post-synaptic terminal depolarizes in the presence of neurotransmitters.
Coincidences
1. Pre-synaptic: Adenyl cyclase (AC) detects the simultaneous presence of Ca++ and serotonin.
2. Post-synaptic: NMDA receptors detect the co-occurrence of glutamate (a neurotransmitter) and depolarization.
Pre-synaptic Modification
[Figure: pre-synaptic modification. Depolarization admits Ca++ into the pre-synaptic terminal while a salient event delivers serotonin (5HT); adenyl cyclase (AC) detects the coincidence and converts ATP to cAMP, activating PKA. Glutamate crosses to the post-synaptic terminal, whose NMDA receptor is blocked by Mg++.]
Post-synaptic Modification
[Figure: post-synaptic modification. In the polarized (relaxed) post-synaptic state, a net negative charge holds Mg++ in place, blocking the NMDA receptor; in the depolarized (firing) state, a net positive charge expels the Mg++, so glutamate-bound NMDA receptors can admit Ca++.]
Neurochemical Basis of Hebbian Learning
Fire together: When the pre- and post-synaptic terminal
of a synapse depolarize at about the same time, the NMDA
channels on the post-synaptic side notice the coincidence
and open, thus allowing Ca++ to flow into the post-synaptic
terminal.
Wire together: Ca++ (via CaMKII and protein kinase C)
promotes post- and pre-synaptic changes that enhance the
efficiency of future AP transmission.
Hebbian Basis of Classical Conditioning
[Figure: two sensory pathways converge on a salivation neuron: hearing the bell (CS) via synapse S2, and seeing food (US) via synapse S1, both driving the response (R), salivation.]
Unconditioned Stimulus (US) - sensory input normally
associated with a response (R). E.g. the sight of food
stimulates salivation.
Conditioned Stimulus (CS) - sensory input having no
previous correlation with a response but which becomes
associated with it. E.g. Pavlov’s bell.
Long-Term Potentiation (LTP)
Early Phase
Chemical changes to pre- and post-synaptic terminals, due to
AC and NMDA activity, respectively, increase the probability
(and efficiency) of AP transmission for minutes to hours after
training.
Late Phase
Structural changes occur to the link between the upstream
and downstream neuron. This often involves increases in the
numbers of axons and dendrites linking the two, and seems to
be driven by chemical processes triggered by high
concentrations of Ca++ in the post-synaptic soma.
Abstraction
Human Brains
10^11 neurons
10^14 connections between them (a.k.a. synapses), many
modifiable
Complex physical and chemical activity to transmit ONE
action potential (AP) (a.k.a. signal) along ONE connection.
Artificial Neural Networks
N = 10^1 − 10^4 nodes
Max N^2 connections
All physics and chemistry represented by a few parameters
associated with nodes and arcs.
Structural Abstraction
[Figure: structural abstraction. A biological network of somas connected by axonal compartments, dendritic compartments and synapses, with APs travelling along axons and dendrites, is abstracted into ANN nodes connected by weighted (w) arcs.]
Diverse ANN Topologies
[Figure: six diverse ANN topologies, labeled A-F.]
Functional Abstraction
[Figure: functional abstraction. A three-node network (N1, N2 → N3 via weights w12, w13) cycles through Integrate, Activate, Reset and Learn; it abstracts the membrane circuit, in which the lipid bilayer acts as a capacitor (CM), the K+ and Na+ ion channels act as resistors (RK, RNa, with reversal potentials EK, ENa), VM is the membrane potential, and K+, Na+ and Ca++ flow through the channels.]
Main Functional Components
[Figure: the three-node network (N1, N2 → N3 via weights w12, w13) annotated with its four functional components.]
Integrate
neti = ∑j=1..n xj wi,j
Vi ← Vi + neti
Activate
xi = 1 / (1 + e^(−Vi))
Reset
Vi ← 0
Learn
Δwi,j = λ xi xj
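A minimal single-neuron sketch of these four components (the weights, inputs and learning rate are illustrative):

```python
import math

class Neuron:
    def __init__(self, n_inputs, lam=0.1):
        self.w = [0.5] * n_inputs   # incoming weights w_ij
        self.V = 0.0                # internal state V_i
        self.lam = lam              # learning rate λ

    def integrate(self, x):
        # V_i <- V_i + net_i, with net_i = sum_j x_j w_ij
        self.V += sum(xj * wj for xj, wj in zip(x, self.w))

    def activate(self):
        return 1.0 / (1.0 + math.exp(-self.V))   # logistic

    def reset(self):
        self.V = 0.0

    def learn(self, x, xi):
        # Hebbian: Δw_ij = λ x_i x_j
        self.w = [wj + self.lam * xi * xj for wj, xj in zip(self.w, x)]

n = Neuron(2)
n.integrate([1.0, 0.0])
xi = n.activate()
n.learn([1.0, 0.0], xi)
n.reset()
print(round(xi, 3))   # logistic(0.5) ≈ 0.622
```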
Functional Options
[Figure: the Integrate (Vi ← Vi + neti) and Activate (xi = f(Vi)) phases of neuron i.]
Reset
Never reset Vi.
Spiking neuron model: reset Vi only when above threshold.
Neurons without state: always reset Vi.
Activation Functions xi = f(Vi)
[Figure: five activation functions, each plotting xi against Vi: step (jumps from 0 to 1 at threshold T), identity, ramp (linear rise to 1 starting at threshold T), logistic (0 to 1), and hyperbolic tangent (tanh, −1 to 1).]
Diverse Model Semantics
What Does xi Represent?
1. The occurrence of a spike in the action potential,
2. The instantaneous membrane potential of a neuron,
3. The firing rate of a neuron (APs / sec),
4. The average firing rate of a neuron over a time window,
5. The difference between a neuron's current firing rate and its average firing rate.
Circuit Models of Neurons
[Figure: circuit model of a neuron. The Na+ and K+ ion channels act as resistors (RNa, RK, in series with batteries ENa, EK); the lipid bilayer acts as a capacitor (CM); VM is the membrane potential.]
Using Kirchhoff’s Current Law
The sum of all currents into the cell must be zero.
The currents
Capacitive: Icap = CM dVM/dt
Ionic (Potassium): IK = (VM − EK)/rK = gK (VM − EK)
Ionic (Sodium): INa = (VM − ENa)/rNa = gNa (VM − ENa)
Ionic (Leak): IL = (VM − EL)/rL = gL (VM − EL) = passive flow of ions through ungated channels
where I = current, r = resistance, g = conductance (1/r), and VM = membrane potential.
Icap + IK + INa + IL = 0
CM dVM/dt = −gK (VM − EK) − gNa (VM − ENa) − gL (VM − EL)
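A forward-Euler sketch of this membrane equation; the constants below are illustrative, not Hodgkin-Huxley fits.

```python
# dVM/dt = (-gK(VM-EK) - gNa(VM-ENa) - gL(VM-EL)) / CM
gK, gNa, gL = 0.36, 0.003, 0.003   # illustrative conductances
EK, ENa, EL = -70.0, 50.0, -60.0   # reversal potentials (mV)
CM = 1.0                            # membrane capacitance
VM = -65.0                          # initial potential (mV)
dt = 0.01                           # time step (ms)

for _ in range(100000):             # integrate long enough to settle
    dV = (-gK*(VM - EK) - gNa*(VM - ENa) - gL*(VM - EL)) / CM
    VM += dV * dt

# With fixed conductances, VM relaxes to the conductance-weighted average
rest = (gK*EK + gNa*ENa + gL*EL) / (gK + gNa + gL)
print(round(VM, 2), round(rest, 2))   # -68.93 -68.93
```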
Modeling Voltage-Gated Channels
gK and gNa are sensitive to the membrane potential, VM.
The gating probabilities
m, n and h = gating probabilities (between 0 and 1).
They are complex functions of VM, determined empirically by Hodgkin and Huxley's work on the giant squid axon.
Conductances are functions of the gating probabilities
gK = ḡK n^4 - since 4 identical and independent parts of a K gate need to be open. ḡK = maximum K conductance.
gNa = ḡNa m^3 h - since 3 identical and independent parts (along with a different, 4th part) of an Na gate need to be open. ḡNa = maximum Na conductance.
A Basic Version of the Hodgkin-Huxley Model
[Figure: Na+ and Ca++ influx drives depolarization; K+ efflux drives repolarization.]
τm dVM/dt = −gK (VM − EK) − gNa (VM − ENa) − gL (VM − EL)
ΔVM ∝ inflow(Na+) − outflow(K+) − leak current
EL ≈ −60 mV, EK ≈ −70 mV, and ENa ≈ 50 mV.
τm includes the capacitance, CM.
Leaky Integrate and Fire Neurons
[Figure: a leaky integrate-and-fire neuron i. Inputs xa, xb, xc arrive over weights wia, wib, wic and are integrated into Vi, which leaks toward EL = −65 mV; the neuron outputs xi.]
These models ignore ion channels and activity along axons and dendrites.
A Simple Leak-and-Integrate Model
τm dVi/dt = cL (EL − Vi) + cI ∑j=1..N xj wij    (1)

Vi = intracellular potential for neuron i.
xi = output (current) from neuron i.
wij = weight on the connection from j to i.
EL = extracellular potential.
τm = membrane time constant. Higher τm → slower change.
cL, cI = leak and integration constants.

A Common Abstraction
τm dVi/dt = −Vi + ∑j=1..N xj wij    (2)
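Equation (2) can be integrated with forward Euler; the weights and inputs below are illustrative.

```python
import numpy as np

def step(V, x, W, tau=10.0, dt=0.1):
    """One Euler step of tau * dV/dt = -V + W @ x."""
    return V + (dt / tau) * (-V + W @ x)

W = np.array([[0.0, 0.5],
              [0.5, 0.0]])        # symmetric 2-neuron coupling
x = np.array([1.0, 1.0])          # constant upstream firing rates
V = np.zeros(2)
for _ in range(5000):             # 500 ms of simulated time
    V = step(V, x, W)
print(np.round(V, 3))             # settles at W @ x = [0.5, 0.5]
```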
Firing Models
Continuous: Sigmoid Function
xi = 1 / (1 + e^(−cs Vi))    (3)
Often used for rate coding, where xi = the neuron's firing rate; cs is a scaling constant.

Discrete: Step Function with Reset
xi = 1 if Vi > Tf, else 0    (4)
Vi ← Vreset after exceeding the threshold, Tf.
Typical values: Vreset = −65 mV, Tf = −50 mV.
Often used in spiking neuron models, where xi is binary, denoting the presence or absence of an action potential.
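Combining the common abstraction (2) with the step-and-reset firing model (4) gives a minimal spiking neuron; the constant input drive and parameter values are illustrative.

```python
V, V_reset, T_f = -65.0, -65.0, -50.0    # mV: state, reset value, threshold
tau, dt = 10.0, 1.0                      # membrane time constant and step (ms)
I = 20.0                                 # constant input drive
spikes = []

for t in range(100):                     # simulate 100 ms
    V += (dt / tau) * (-(V - V_reset) + I)   # leak toward rest, plus input
    if V > T_f:                          # step function with reset (4)
        spikes.append(t)
        V = V_reset

print(len(spikes))                       # regularly spaced spikes
```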
Temporal Abstraction
[Figure: temporal abstraction. Top: a rate-coded network in which single activation values (0.8, 0.4, 0.5) summarize the activity of neurons A, B and C; bottom: the underlying spike trains of A, B and C over time, each spike a full action potential from −65 mV up past 0 mV to +40 mV.]
Spike Response Model (SRM) - Gerstner et al., 2002
Vi(t) = κ(Iext) + η(t − t̂i) + ∑j=1..N wij ∑h=1..H εij(t − t̂i, t − tj^h)

[Figure: the refractory kernel η and the synaptic kernels εij, plotted relative to neuron i's last firing time t̂i and the incoming spike times of neuron j.]

The timing of each spike is very important in determining its effects upon downstream neurons.
Spiking Neurons
Eugene Izhikevich, 2003
A Simple Model of Spiking Neurons. IEEE Transactions on
Neural Networks, 14(6).
τm dVi/dt = 0.04 Vi^2 + 5 Vi + 140 − Ui + cI ∑j=1..N xj wij    (5)
τm dUi/dt = a (b Vi − Ui)    (6)
Ui = recovery factor.
If Vi ≥ 30 mV, then Vi ← Vreset and Ui ← Ui + Ureset.
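A forward-Euler sketch of equations (5)-(6) for a single neuron. The regular-spiking parameter values a = 0.02, b = 0.2, Vreset = −65, Ureset = 8 follow Izhikevich (2003); the constant input drive and Euler step are illustrative.

```python
a, b = 0.02, 0.2
V_reset, U_reset = -65.0, 8.0
V, U = -65.0, b * -65.0           # initial state
dt, I = 0.5, 10.0                 # step (ms) and constant input
spikes = 0

for _ in range(2000):             # simulate 1000 ms
    V += dt * (0.04*V*V + 5*V + 140 - U + I)
    U += dt * a * (b*V - U)
    if V >= 30.0:                 # spike: reset V, bump recovery U
        V = V_reset
        U += U_reset
        spikes += 1

print(spikes > 0)                 # the neuron spikes tonically
```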
Parameterized Spiking Patterns
[Figure: four membrane-potential traces (Vi over time) produced by different parameter settings: regular spiking, chattering, intrinsic bursting, and thalamocortical.]
Key parameters a, b, Vreset, and Ureset → spike patterns.
Continuous Time Recurrent Neural Networks
[Figure: a CTRNN controller. A 5-node sensory input layer plus a bias node feeds a 2-node hidden layer, which drives a 2-node motor output layer.]
CTRNNs abstract away spikes but achieve complex dynamics with neuron-specific time constants, gains and biases.
All weights evolve, but none are modified by learning.
Invented by Randall Beer in the early 1990s and used in many evolved, minimally-cognitive agents.
The Simple CTRNN Model
si = ∑j=1..n xj wi,j + Ii

dVi/dt = (1/τi) [−Vi + si + θi]

xi = 1 / (1 + e^(−gi Vi))

θi = bias; gi = gain.
τi = time constant for neuron i.
Each neuron implicitly runs at a different temporal resolution.
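A forward-Euler sketch of this CTRNN update; the weights, biases, gains and time constants below are illustrative.

```python
import numpy as np

def ctrnn_step(V, x, W, I, theta, g, tau, dt=0.01):
    """One Euler step of dVi/dt = (1/tau_i)(-Vi + si + theta_i)."""
    s = W @ x + I                         # si = sum_j xj wij + Ii
    V = V + (dt / tau) * (-V + s + theta)
    x = 1.0 / (1.0 + np.exp(-g * V))      # per-neuron gain gi
    return V, x

n = 2
W = np.array([[0.0, 1.0], [-1.0, 0.0]])   # excite/inhibit pair
I = np.zeros(n)
theta = np.array([0.1, -0.1])             # biases
g = np.ones(n)                            # gains
tau = np.array([1.0, 2.0])                # neuron-specific time constants
V, x = np.zeros(n), np.full(n, 0.5)

for _ in range(1000):                     # 10 time units
    V, x = ctrnn_step(V, x, W, I, theta, g, tau)
print(np.round(x, 3))
```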
Essence of Learning in Neural Networks
[Figure: pre-synaptic neurons u1 . . . un connect through weights w1 . . . wn to a post-synaptic neuron v; the learning question is which Δw to apply.]
Most ANNs model neither spikes nor STDP. Learning is based on a comparison of recent firing rates of neuron pairs.
Spike-Timing Dependent Plasticity (STDP)
[Figure: the STDP curve. Δs rises to about +0.4 for small negative Δt and falls to about −0.4 for small positive Δt, decaying toward 0 beyond ±40 ms.]
Change in synaptic strength (Δs) as a function of Δt = tpre − tpost, the times of the most recent pre- and post-synaptic spikes. The maximum magnitude of change is roughly 0.4% of the maximum possible synaptic strength/conductance.
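The exponential window commonly used for this curve can be sketched as follows; the amplitudes match the ±0.4% scale above, the 20 ms time constant is illustrative, and the sign convention follows the caption (Δt = tpre − tpost).

```python
import math

def stdp(dt_ms, A_plus=0.004, A_minus=0.004, tau=20.0):
    """Fractional weight change for spike-time difference dt = t_pre - t_post.
    Pre before post (dt < 0): potentiation; post before pre (dt > 0): depression."""
    if dt_ms < 0:
        return A_plus * math.exp(dt_ms / tau)     # LTP branch
    return -A_minus * math.exp(-dt_ms / tau)      # LTD branch

print(stdp(-5.0) > 0)   # pre fired 5 ms before post -> strengthen
print(stdp(+5.0) < 0)   # post fired 5 ms before pre -> weaken
```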
3 Fundamental ANN Learning Paradigms
Supervised
Constant, detailed feedback that includes the correct response
to each input; Omnipresent teacher.
Reinforced
Simple feedback mainly at the end of a problem-solving
attempt, although possibly a few intermediate rewards or
penalties, but no direct response recommendations.
Unsupervised
No feedback whatsoever. ANN normally tries to intelligently
cluster the inputs and/or learn proper correlations between
components of input space.
Supervised Learning
[Figure: supervised learning. A teacher tells the network "You should have turned RIGHT at the last intersection": the correct action is compared with the motor output produced from the sensory input, and the resulting error drives ΔW.]
Reinforced Learning
[Figure: reinforced learning. A reinforcement signal ("You are at the goal!") arrives at the end of the attempt and drives Δw across the network's weights.]
Unsupervised Learning
[Figure: unsupervised learning. Only the input itself ("A long trip down a corridor is followed by a left turn.") drives Δw across the network's weights.]
Hebbian Learning Rules
Basic Heterosynaptic: Δwi = λ v (ui − θi)
Basic Homosynaptic: Δwi = λ (v − θv) ui
General Hebbian: Δwi = λ ui v
BCM: Δwi = λ ui v (v − θv)
Oja: Δwi = ui v − wi v^2

Homosynaptic
All active synapses are modified the same way, depending only on the strength of the postsynaptic activity.
Heterosynaptic
Active synapses can be modified differently, depending upon the strength of their presynaptic activity.
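As a sketch, here is the Oja rule applied to a single linear neuron on illustrative correlated data; repeated updates push the weight vector toward the input's first principal direction, with unit norm.

```python
import numpy as np

rng = np.random.default_rng(0)
# Inputs correlated along the (1, 1) direction
u = rng.normal(size=(5000, 2)) @ np.array([[1.0, 0.9], [0.9, 1.0]])

w = np.array([1.0, 0.0])
lam = 0.01
for ui in u:
    v = w @ ui                        # linear post-synaptic activity
    w += lam * (ui * v - w * v**2)    # Oja: Δw = u v − w v² (scaled by λ)

print(np.round(w, 2), round(float(np.linalg.norm(w)), 2))
```

Unlike the plain Hebb rule, the −wi v² term keeps the weights from growing without bound.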
Modelling Options to Consider
1. Single or multiple neurons?
2. Can neuron A send more than one axon to neuron B?
3. Are connections modeled as cables or just simple connector points (i.e. a single weight)?
4. Do neurons have state? I.e., does Vi(t + 1) depend on Vi(t)?
5. Do outputs (xi) represent individual spikes, spike rates, or something else?
6. Are neurons organized by layers?
7. Do layers follow a feed-forward topology or is there recurrence (i.e. looping)?
8. Are neurons connected within layers or only between layers?
9. Is learning supervised, unsupervised or reinforced?
10. Is spike-timing dependent plasticity (STDP) involved in the learning rule?