Introduction
• The goal of neuromorphic engineering is to design and implement microelectronic systems that emulate the structure and function of the brain.
• Address-event representation (AER) is a communication protocol originally proposed as a means to communicate sparse neural events between
neuromorphic chips.
• Previous work has shown that AER can also be used to construct large-scale networks with arbitrary, configurable synaptic connectivity.
• Here, we further extend the functionality of AER to implement arbitrary,
configurable synaptic plasticity in the address domain.
Address-Event Representation (AER)
[Figure: a point-to-point AER link. Spikes from a four-cell sender array are encoded as addresses (e.g., the sequence 3 0 2 1), time-multiplexed over a shared data bus with REQ/ACK handshaking, and decoded back into the receiver array. (Mahowald, 1994; Lazzaro et al., 1993)]
• The AER communication protocol emulates massive connectivity between cells by time-multiplexing many connections on the same data
bus.
• For a one-to-one connection topology, the required number of wires is
reduced from N to ∼ log2 N .
• Each spike is represented by:
◦ Its location: explicitly encoded as an address.
◦ The time at which it occurs: implicitly encoded.
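The following minimal Python sketch illustrates the idea (our own illustrative code, not part of the system: the function names are hypothetical, and the REQ/ACK handshake is abstracted into an ordered event list):

# Sketch of AER time-multiplexing: spikes from many cells share one bus,
# with the location sent explicitly as an address and the timing implicit
# in the order of arrival.

def aer_encode(spikes):
    # spikes: list of (time, address) pairs from the sender array.
    # Returns the address sequence placed on the shared data bus,
    # ordered by spike time (arbitration and handshaking abstracted away).
    return [address for time, address in sorted(spikes)]

def aer_decode(bus_events, n_cells):
    # Accumulate bus events into per-cell spike counts at the receiver.
    counts = [0] * n_cells
    for address in bus_events:
        counts[address] += 1
    return counts

spikes = [(0.1, 3), (0.2, 0), (0.3, 2), (0.4, 1)]
print(aer_encode(spikes))            # [3, 0, 2, 1]
print(aer_decode([3, 0, 2, 1], 4))   # [1, 1, 1, 1]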
Learning on Silicon
• Adaptive hardware systems commonly employ learning circuitry embedded into the individual cells.
• Executing learning rules locally requires inputs and outputs of the algorithm to be local in both space and time.
• Implementing learning circuits locally increases the size of repeating units.
• This approach can be effective for small systems, but it is not efficient
when the number of cells increases.
[Figure: a two-layer network with inputs x1 . . . x4, outputs y1, y2, and synaptic weights w11 . . . w42, illustrating learning circuitry embedded at each synapse.]
Address Domain Learning
• By performing learning in the address domain, we can:
◦ Move learning circuits to the periphery.
◦ Create scalable adaptive systems.
◦ Maintain the small size of our analog cells.
◦ Construct arbitrarily complex and reconfigurable learning rules.
• Because any measure of cellular activity can be made globally available using AER, many adaptive algorithms based on incremental outer-product computations can be implemented in the address domain (a sketch follows below).
• By implementing learning circuits on the periphery, we relax the locality restrictions on the constituents of the learning rule.
• Spike timing-based learning rules are particularly well-suited for implementation in the address domain.
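As a concrete illustration of the incremental outer-product computations mentioned above, the following Python sketch applies a Hebbian-style update at the periphery from globally available pre- and postsynaptic activity (our own illustrative code; the variable names are hypothetical):

# Incremental outer-product update, Delta W = eta * y x^T, computed away
# from the cells using globally available activity vectors.

def outer_product_update(W, x, y, eta=0.01):
    # W[i][j]: weight from input j to output i; updated in place.
    for i, y_i in enumerate(y):
        for j, x_j in enumerate(x):
            W[i][j] += eta * y_i * x_j
    return W

W = [[0.0] * 4 for _ in range(2)]                  # 2 outputs, 4 inputs
outer_product_update(W, x=[1, 0, 1, 0], y=[1, 1])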
Enhanced AER
• In its original formulation, AER implements a one-to-one connection
topology.
• To create more complex neural circuits, convergent and divergent connections are required.
• The connectivity of AER systems can be enhanced by routing address-events to multiple receiver locations via a look-up table (Andreou et al., 1997; Deiss et al., 1999; Boahen, 2000; Higgins & Koch, 1999).
• Continuous-valued synaptic weights can be obtained by manipulating
event transmission (Goldberg et al., 2001):
W = n × p × q

where the weight W is the product of n, the number of spikes sent; p, the probability of transmission; and q, the amplitude of the postsynaptic response (a sketch follows below).
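A minimal Python sketch of how the three factors compose (our own illustrative code; the stochastic transmission step is modeled with random.random()):

import random

def effective_response(n, p, q):
    # Send n address-events; each is transmitted with probability p and,
    # if transmitted, evokes a postsynaptic response of amplitude q.
    # The expected total response is W = n * p * q.
    total = 0.0
    for _ in range(n):
        if random.random() < p:
            total += q
    return total

print(effective_response(n=4, p=0.5, q=0.25))   # expectation: 0.5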
Enhanced AER: Example
[Figure: example mapping of a two-layer network onto the AER framework. A look-up table, indexed by sender address and synapse index, stores the receiver address, weight polarity, and weight magnitude of each connection; spikes from the "sender" integrate-and-fire array are encoded, expanded by the event generator (EG), and decoded into the "receiver" array.]
• A two-layer neural network is mapped to the AER framework by means
of a look-up table (LUT).
• The event generator (EG) sends as many events as are specified in the
weight magnitude field of the LUT.
• The integrate-and-fire array transceiver (IFAT) spatially and temporally
integrates events.
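The routing step can be sketched in Python as follows (our own illustrative code; the table entries are made up and do not reproduce the example in the figure):

# LUT-based fan-out: each (sender address, synapse index) entry stores a
# receiver address, weight polarity, and weight magnitude. The event
# generator replays the event 'magnitude' times to realize the weight.

LUT = {
    (0, 0): (1, +1, 2),   # sender 0, synapse 0 -> receiver 1, excitatory, n=2
    (0, 1): (2, -1, 1),   # sender 0, synapse 1 -> receiver 2, inhibitory, n=1
}

def route_event(sender):
    # Expand one sender spike into a list of (receiver, polarity) events.
    out = []
    for (s, synapse), (receiver, polarity, magnitude) in sorted(LUT.items()):
        if s == sender:
            out.extend([(receiver, polarity)] * magnitude)
    return out

print(route_event(0))   # [(1, 1), (1, 1), (2, -1)]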
Architecture
[Figure: system architecture. On a PC board, input-control and event-scanning logic, an MCU, and a RAM-based look-up table surround the IFAT chip. Incoming sender addresses and synapse indices index the RAM, which returns the receiver address, weight polarity, and weight magnitude; events are exchanged with the IFAT array over request/acknowledge (REQ/ACK) handshaking lines.]
Implementation
[Figure: implementation. Die photograph of the IFAT chip, showing the array of single IF cells with row and column decoding, scanning, and encoding circuitry; the RAM and MCU reside alongside the IFAT on the system board.]
Spike Timing-Dependent Plasticity
• In spike timing-dependent plasticity (STDP), changes in synaptic strength depend on the time between each pair of presynaptic and postsynaptic events.
• The most recent inputs to a postsynaptic cell make larger contributions to its membrane potential than past inputs due to passive leakage currents.
• Postsynaptic events immediately following incoming presynaptic spikes are considered to be causal and induce weight increments.
• Presynaptic inputs that arrive shortly after a postsynaptic spike are considered to be anti-causal and induce weight decrements.
[Figure: measured STDP synaptic modification curve, from (Bi & Poo, 1998).]
Address Domain STDP: Event Queues
• To implement our STDP synaptic modification rule in the address domain, we augmented our AER architecture with two event queues, one
for presynaptic events and one for postsynaptic events.
• When an event occurs, its address is entered into the appropriate queue
along with an associated value ϕ initialized to τ+ or τ−. This value is
decremented over time.
[Figure: example presynaptic and postsynaptic event queues. Each queue entry pairs a cell address (x1, x2 or y1, y2) with its decaying value ϕpre or ϕpost; spike rasters show the corresponding event times.]
ϕpre(t − tpre) =
  τ+ − (t − tpre)   if t − tpre < τ+
  0                 if t − tpre ≥ τ+

ϕpost(t − tpost) =
  τ− − (t − tpost)   if t − tpost < τ−
  0                  if t − tpost ≥ τ−
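A minimal Python sketch of this queue mechanism, assuming the linear decay defined above and recomputing ϕ on demand from the stored event time rather than decrementing it explicitly (our own illustrative code):

# A queue entry is (address, event_time); phi is derived from elapsed time.

def phi(t, t_event, tau):
    # Linear decay from tau to 0, as in the definitions above.
    dt = t - t_event
    return tau - dt if dt < tau else 0.0

tau_plus = 3.0
pre_queue = [(2, 0.0), (1, 0.7), (1, 1.0)]          # (address, event time)
print([phi(2.0, t_ev, tau_plus) for _, t_ev in pre_queue])   # [1.0, 1.7, 2.0]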
Address Domain STDP: Weight Updates
• Weight update procedure:
[Figure: weight update procedure, illustrated with example presynaptic and postsynaptic spike trains, the two event queues, and the corresponding ∆w windows of widths τ+ and τ−.]
◦ For each postsynaptic event, we iterate backwards through the presynaptic queue to find the causal spikes and increment the appropriate weights in the LUT.
◦ For each presynaptic event, we iterate backwards through the postsynaptic queue to find the anti-causal spikes and decrement the appropriate weights in the LUT.
• The magnitudes of the weight updates are specified by the values stored in the queues:

∆w =
  −η · ϕpost(tpre − tpost)   if 0 ≤ tpre − tpost ≤ τ−
  +η · ϕpre(tpost − tpre)    if −τ+ ≤ tpre − tpost ≤ 0
  0                          otherwise
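The two update passes can be sketched in Python as follows (our own illustrative code; the queues hold (address, event time) pairs as above, and the dictionary W stands in for the weight-magnitude field of the LUT):

def on_postsynaptic_event(post, t, pre_queue, W, eta, tau_plus):
    # Walk the presynaptic queue backwards (most recent first) and
    # potentiate causal pairs: Delta w = +eta * phi_pre(t - t_pre).
    for pre, t_pre in reversed(pre_queue):
        if t - t_pre >= tau_plus:
            break                     # older entries have phi = 0
        W[(pre, post)] = W.get((pre, post), 0.0) + eta * (tau_plus - (t - t_pre))

def on_presynaptic_event(pre, t, post_queue, W, eta, tau_minus):
    # Walk the postsynaptic queue backwards and depress anti-causal
    # pairs: Delta w = -eta * phi_post(t - t_post).
    for post, t_post in reversed(post_queue):
        if t - t_post >= tau_minus:
            break
        W[(pre, post)] = W.get((pre, post), 0.0) - eta * (tau_minus - (t - t_post))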
Address Domain STDP: Details
[Figure: the composite STDP modification curve ∆w(tpre − tpost), with a causal (potentiation) window of width τ+ and an anti-causal (depression) window of width τ−.]
• For stable learning, the area under the synaptic modification curve in the
anti-causal regime must be greater than that in the causal regime. This
ensures convergence of the synaptic strengths (Song et al., 2000).
• In our implementation of STDP, this constraint is met by setting τ− > τ+.
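• With the linear ϕ windows defined above, these areas are easy to check: the causal branch integrates to η · τ+^2 / 2 and the anti-causal branch to η · τ−^2 / 2, so setting τ− > τ+ is sufficient.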
Experiment: Grouping Correlated Inputs
[Figure: experimental setup. Twenty input-layer neurons x1 . . . x20 converge on a single output neuron y; an uncorrelated group drives x1 . . . x17 and a correlated group drives x18 . . . x20.]
• Each of the 20 neurons in the input layer is driven by an externally
supplied, randomly generated list of events.
• Our randomly generated list of events simulates two groups of neurons,
one correlated and one uncorrelated. The uncorrelated group drives input layer cells x1 . . . x17 , and the correlated group drives input layer cells
x18 . . . x20 .
• Although each neuron in the input layer has the same average firing rate,
neurons x18 . . . x20 fire synchronous spikes more often than any other
combination of neurons.
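A toy Python construction of such a stimulus (our own illustrative code, not the actual event lists used in the experiment): every cell receives independent Poisson events, and cells x18 . . . x20 trade part of their independent rate for a shared synchronous component, keeping all mean rates equal:

import random

def make_events(T=100.0, r=10.0, r_shared=3.0):
    # Cells 1-17: independent rate r. Cells 18-20: independent rate
    # r - r_shared plus a shared synchronous train of rate r_shared,
    # so every cell has the same mean firing rate r.
    events = []
    for cell in range(1, 21):
        rate = r - r_shared if cell >= 18 else r
        t = 0.0
        while True:
            t += random.expovariate(rate)
            if t > T:
                break
            events.append((t, cell))
    t = 0.0
    while True:
        t += random.expovariate(r_shared)
        if t > T:
            break
        events.extend((t, c) for c in (18, 19, 20))   # synchronous spike
    return sorted(events)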
Experimental Results
[Figure: final synaptic weight distributions after learning, for a single trial and averaged over 20 trials.]
• STDP has been shown to be effective at detecting correlations between
groups of inputs (Song et al., 2000). We demonstrate that this can be
accomplished in hardware in the address domain.
• Given a random starting distribution of synaptic weights for a set of
presynaptic inputs, a neuron using STDP should maximize the weights
of correlated inputs and minimize the weights of uncorrelated inputs.
• Our results illustrate this principle when all synaptic weights are initialized
to a uniform value and the network is allowed to process 200,000 input
events.
Conclusion
• The address domain provides an efficient representation to implement
synaptic plasticity based on the relative timing of events.
◦ Learning circuitry can be moved to the periphery.
◦ The constituents of learning rules need not be constrained in space
or time.
• We have implemented an address domain learning system using a hybrid
analog/digital architecture.
• Our experimental results illustrate an application of this approach using
a temporally asymmetric Hebbian learning rule.
Extensions
• The mixed-signal approach provides the best of both worlds:
◦ Analog cells are capable of efficiently modelling sophisticated neural dynamics in continuous time.
◦ Nearest-neighbor connectivity can be incorporated into an address-event framework to exploit the parallel processing capabilities of analog circuits.
◦ Storing the connections in a digital LUT affords the opportunity to
implement learning rules that reconfigure the network topology on
the fly.
• In the future, we will combine all of the system elements on a single chip.
The local embedding of memory will enable high bandwidth distribution
of events.