Distributed, Event Driven Simulation of Spiking Neural Networks
Dipl.-Ing. Cyprian Graßmann
Prof. Dr. Joachim K. Anlauf
Phone: +49-228-73-4279
Fax: +49-228-73-4212
[email protected]
[email protected]
University of Bonn
Department of Computer Science II
Römerstraße 164
D-53117 Bonn, Germany
Abstract
We present the architecture of a simulator that is able to simulate large networks of spiking neurons using a distributed event driven simulation. Contrary to a time driven simulation, which is usually used to simulate spiking neural networks, our simulation needs fewer computational resources because of the low average activity of typical networks. The simulator is divided into a set of communicating sub-simulators running concurrently on several workstations, where each sub-simulator handles a part of the network. The paper addresses the problems of synchronisation between the sub-simulators and shows how information contained in the network topology and in the properties of the neuron models is used to solve them. Preliminary results are presented for two simple model networks, illustrating the speed up gained by distributing the simulation.
1. Introduction
Simulation is the connecting link between neurophysiological measurements and theoretical studies. It helps us to understand and reproduce the behaviour of biological systems of neurons and to verify the functional behaviour of model networks. Recent theoretical studies on the computational complexity of spiking neural networks by Wolfgang Maass [Maass95] encourage the development of a dedicated simulation method for this kind of network.
The properties of artificial spiking neural networks, like the low average activity [Rolls89], are inspired by biological neural networks. These networks consist of many different neuron types connected by a dense but inhomogeneous and not fully connected network. Each connection within this network has a specific delay. Despite the irregular structure of the network and the large number of different neuron types, information between neurons is exchanged using spikes. All spikes arriving at a given synapse look the same; hence the information carried by a spike is found only in its time of occurrence. Substantial evidence also indicates that the time structure of neural spike trains is relevant in neural processing [Aertsen93].
These networks are usually simulated using a time driven method. A well-known simulator of this type is GENESIS [BowBee94].
Simulators using a time driven simulation achieve their results by computing the whole network from one time step to the next. An advantage of this method is a very detailed analysis of neuron behaviour. A drawback is the tremendous computational effort and therefore a limitation to the simulation of single neurons or small networks. Large networks can only be simulated by using a coarser time scale and reducing the complexity of the neuron model. The computational effort for a time driven simulation of spiking neural networks is thus proportional to the number of neurons in the network and independent of its activity.
Taking these network properties into account, spikes are sufficiently described by a time stamp and the address of the sender or the receiver. Together with the sparse coding, this makes an event driven simulation an efficient framework for the simulation of spiking neural networks. The spikes are directly mapped to events, which are controlled and scheduled by a simulation engine. In an event driven simulation the evaluation of events is done by logical processes, which can implement any spike processing model of a neuron.
Delays are usually modelled by the logical process itself. Because the delays can be used to enhance the simulation performance, we present a solution in which the delays are handled explicitly by the simulation engine. Moreover, this solution relieves the logical process from the explicit handling of the delays.
Contrary to the time driven simulation, the event driven simulation profits from the low activity of spiking neural networks. The logical processes are only called if there is an incoming event at one of the inputs of the process. Hence computation only takes place if one of the processes is active. This results in a much lower computational effort, so larger networks can be simulated with fewer resources.
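To make the mechanism concrete, the following is a minimal sketch of such an event driven core, assuming a heap-based event list and a hypothetical handle() interface for the logical processes; it is an illustration, not the simulator's actual code.

```python
# Minimal sketch of an event driven simulation core (our illustration;
# the Event fields and the neurons[].handle() interface are assumptions).
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Event:
    time: float                          # time stamp of the spike
    target: int = field(compare=False)   # address of the receiving neuron

def run(neurons, initial_events, t_end):
    """Process events in temporal order; inactive neurons cost nothing."""
    queue = list(initial_events)
    heapq.heapify(queue)
    while queue and queue[0].time <= t_end:
        ev = heapq.heappop(queue)
        # Only the addressed logical process is evaluated, so the effort
        # scales with the network's activity rather than with its size.
        for out_time, out_target in neurons[ev.target].handle(ev.time):
            heapq.heappush(queue, Event(out_time, out_target))
```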
Distributing an event driven simulation onto a cluster of
workstations reduces the time needed for the simulation
of very large networks of spiking neurons.
The functionality of a spiking neural system can be simulated at different levels of abstraction and hence at different levels of detail. For the simulation of artificial and biological networks with a tremendous number of neurons and connections, a simulation method should not only support the detailed examination of single neurons but also the efficient simulation of large neural systems at a functional level. Even if the simulation takes place at a functional level, the behaviour of the single neuron should be as realistic as possible. Therefore the simulation engine must provide a set of logical processes implementing known neuron models as well as a well-defined interface for user-defined processes. Reusability and extensibility of the code written for the simulator are provided by an object oriented design and implementation method.
2. Design considerations
In a distributed event driven simulation, logical processes communicate with each other via messages. Hence they all need their own communication interface.
With an object oriented design in mind, a decomposition at the neuron level would be a generic way of distribution: each neuron is implemented as a separate logical process which communicates with other logical processes. This results in a tremendous overhead for each neuron and therefore in a very poor performance of the simulator.
Using a centrally controlled simulation on a conventional computer instead, the overhead for each logical process can be minimised, but the simulation is limited by the resources of a single processor. Moreover, the parallelism of the network cannot be used to enhance simulation speed.
To avoid the overhead of a distribution at the neuron level and the limitations of a centrally controlled simulation, we combine both methods by dividing the network into sub-networks. Each sub-network consists of a limited number of logical processes and is computed on a sub-simulator. The sub-simulators communicate via messages. This yields a hierarchical structure in which microscopic logical processes (neurons) are grouped into macroscopic logical processes (sub-simulators). Hence we avoid the overhead at the microscopic level by using a centrally controlled simulation on each sub-simulator, and we overcome the limitations of a purely centrally controlled simulation at the macroscopic level through the parallel execution of multiple sub-simulators.
The potential advantages of this hierarchical approach with communicating sub-simulators are discussed in [Brissinck97]. In terms of a distributed simulation with communicating logical processes, the sub-simulators are the logical processes to be synchronised, because they hide the centrally controlled processing of the underlying logical processes, which implement the behaviour of the neurons.
There are two basic concepts for a distributed, event driven simulation of interacting logical processes: the optimistic and the conservative synchronisation method. Refer to [Ferscha95] for a more detailed overview of parallel discrete event simulation methods.
An implementation of the optimistic method simulates events even if their temporal order is not guaranteed, e.g. when it is violated by undefined inputs for which no sufficient “look ahead” is provided. (A look ahead is provided by a preceding process and guarantees that this process will be inactive up to a certain time, e.g. because of a refractory period.) Depending on the machine used for the calculation and on the complexity of each process, the temporal order of the events can be corrupted. Although the optimistic method avoids a deadlock of the simulation, because it runs as long as events are available, it is sometimes necessary to roll back the simulation to restore the consistency violated by an “out of order” event.
An implementation of the conservative method ensures the temporal order of the events in the system and provides methods to prevent deadlocks. If a faster process depends on a slower one, it has to wait until the slower process completes execution or provides a look ahead to the faster process. With a look ahead the preceding process guarantees that it will be inactive up to a certain time. To avoid inactivity of processes in the conservative simulation, every possible look ahead of the system should be exploited.
With the simulation of large networks in mind, we have chosen the conservative method, because the optimistic method needs more memory and controlling effort than the conservative one.
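As an illustration, a minimal sketch of the conservative rule, assuming each input channel reports its current look ahead as a lower bound on the time stamp of its next event, and events are (time stamp, target) pairs in a heap:

```python
# Sketch of conservative event processing (illustrative; assumes a
# non-empty list of per-input look ahead bounds).
import heapq

def process_safe_window(event_queue, lookaheads, process):
    """Process only events whose time stamps lie in the safe window."""
    bound = min(lookaheads)  # end of the safe simulation window
    while event_queue and event_queue[0][0] < bound:
        process(heapq.heappop(event_queue))
    # Events at or beyond the bound must wait: some input could still
    # deliver an earlier event, which an optimistic scheme would have
    # to repair with a rollback.
```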
The way in which the network is divided into sub-networks (partitions) obviously has a tremendous impact on the whole simulation run. Hence we provide mechanisms to support an efficient partitioning of the network. These take into account the speed of the contributing workstations, the basic bandwidth of the network connecting the workstations, the computational effort of the different event processing kernels and the topology of the spiking neural network; a sketch of such a cost estimate is given below. Nevertheless, there is not always an optimal solution to the balancing of partitions, and therefore stalling of sub-simulators can still occur. Special care has to be taken to handle these situations.
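The paper does not specify the balancing criterion in detail; the following is only a hypothetical cost estimate built from the quantities listed above, with all names assumed:

```python
# Hypothetical partition-cost estimate (not the paper's algorithm): the
# slowest sub-simulator bounds progress, and cut connections add traffic.
def partition_cost(partitions, kernel_cost, machine_speed,
                   cut_spike_rate, bandwidth):
    compute = max(sum(kernel_cost[n] for n in neurons) / machine_speed[m]
                  for m, neurons in partitions.items())
    comm = cut_spike_rate / bandwidth    # expected inter-machine traffic
    return compute + comm                # smaller is better
```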
A sub-simulator stalls under the following condition: it has processed all events up to the safe simulation window (Figure 1), which is defined by the minimal look ahead provided by the preceding sub-simulators (predecessors), and the predecessor which has provided the smallest look ahead is either busy but inactive at its output to the succeeding sub-simulator (successor), or it stalls as well.
[Figure 1: Safe simulation window]
In case the predecessor is busy it advances its own local
virtual simulation time, but the successor cannot advance its local time if the output of the predecessor stays
inactive.
A common solution to this problem is to send null messages from the predecessor to the successor each time the predecessor advances its local virtual time. This results in a tremendous number of messages passed between the two sub-simulators. Alternatively, the successor sends inquiry messages to the predecessor when needed. In most cases, sending inquiry messages on demand requires fewer messages than continuously sending null messages; in case the predecessor also stalls, there is no difference. Therefore we use the inquiry method and add a special handling for the stalling situation.
If the sub-simulator stalls, we first send the current local virtual time as a first look ahead to all succeeding sub-simulators, to avoid a deadlock situation that can be caused by recursive connections. Next we use a modification of an algorithm presented in [Brissinck97] to calculate the maximal look ahead the stalling sub-simulator can provide, i.e. the earliest time at which an event may occur at the different outputs of the sub-simulator. For this purpose it uses the internal state of the sub-simulator, the current look ahead provided at the inputs of the sub-simulator and the delays between the logical processes (axonal delays). Finally this look ahead is sent out to the succeeding sub-simulators. An additional message is sent by the predecessor when it is able to continue its simulation.
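One plausible reading of this computation, sketched below (the actual algorithm, a modification of [Brissinck97], is not reproduced here): the earliest possible event at an output is bounded both by the input look aheads shifted by the minimal axonal delay towards that output and by events already scheduled internally.

```python
# Hedged sketch of the maximal look ahead a stalling sub-simulator can
# grant for one of its outputs (names and structure are assumptions).
def output_lookahead(input_lookaheads, min_delay_to_output,
                     pending_event_times):
    # No future input event can reach this output before that input's
    # look ahead plus the smallest delay on any path to the output.
    via_inputs = min((input_lookaheads[i] + min_delay_to_output[i]
                      for i in input_lookaheads), default=float("inf"))
    # Internally scheduled events may appear earlier than anything
    # triggered by future inputs.
    via_state = min(pending_event_times, default=float("inf"))
    return min(via_inputs, via_state)
```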
In the time span starting with the first look ahead sent by the predecessor, which indicates the beginning of the stalling condition, up to the third message, which indicates the ongoing simulation of the predecessor, none of the succeeding sub-simulators sends any inquiry messages to the stalling logical process.
In summary, three different modes have to be implemented for each sub-simulator:
sequential mode – sequential processing of events
look ahead mode – calculation of the maximal look ahead
inquiry mode – asking for look ahead messages
3. Architecture of the simulator
The distributed simulator consists of a set of communicating sub-simulators. Each sub-simulator executes a part of the simulated network (Figure 2), depending only upon the look ahead and the messages provided by the sub-simulators calculating neurons in its neighbourhood.
[Figure 2: Decomposition into sub-networks – a user interface connected to sub-simulators 1 to 4]
A network is entered and decomposed into sub-networks with the help of the user interface. The user interface can be executed on one of the machines running a sub-simulator or on a separate machine. It provides mechanisms to distribute the sub-networks to the sub-simulators and to start the simulation. After all sub-simulators have notified the completion of their simulation, any information logged during the simulation run can be gathered through the user interface. Online interaction with the sub-simulators is also possible by sending messages to them.
Using PVM (Parallel Virtual Machine, a message-passing system for heterogeneous networks of workstations; see [Geist94]) as the messaging system, the simulator can be built up with nearly any kind of computing machine. Currently, the simulation engine is implemented for PCs running Windows NT and workstations running Solaris 2.5. A graphical user interface will be provided on Windows NT machines.
[Figure 3: Structure of a sub-simulator – a control with event list, a transmitter and a receiver connected to the network, and logical processes each consisting of a logical process control, a logical process kernel (the model of the neuron) and a fan out connection]
Figure 3 shows the structure of the sub-simulator. A central control handles the simulation process and ensures, together with the event list, the temporal order of the events in the system. It also handles the communication with the other sub-simulators through a transmitter object and a receiver object. The grey shaded area is the logical process which handles the processing of events. It contains the logical process kernel, which implements the model of the spiking neuron. The logical process control handles administrative tasks of the simulation process, and the fan out connection models the distribution of spikes by the axon.
Figure 4 shows the conceptual decomposition of a neuron. The dendritic trees together with the synapses and the soma form the logical process kernel. This kernel is an implementation of the behaviour of the model neuron. For each incoming spike its task is to calculate the effect on its internal states and to indicate at the output when the next spikes will occur. The outgoing spikes are distributed by the fan out connection to all succeeding neurons. The delays between the soma output and the synapses are explicitly handled by this object. This method is called sender oriented [Hartmann93], because the sender is the initiator of the transmission.
[Figure 4: Components of the neuron model – dendritic trees with synapses and a soma F(x) forming the logical process kernel, and an axon forming the fan out connection]
In fact the events are not directly sent to the succeeding logical processes, but put into the event list to be handled by the control, which ensures the temporal order of the events. Therefore the fan out connection inserts as many events into the event list as the current process has successors. The time stamps of these events are set to the time of the spike plus the delay to the corresponding successor. Alternatively one could use the receiver oriented method, where just one event carrying the sender address has to be created. Unfortunately it is then necessary to iterate over all processes to check whether a process is sensitive to a certain sender address or not, and each process has to handle its own event list to ensure temporal order. This would introduce additional overhead for each process and result in expensive controlling. Therefore the sender oriented method is much more efficient in the case of low average activity.
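A minimal sketch of such a sender oriented fan out connection, assuming events are (time stamp, target address) pairs in a central heap-based event list:

```python
# Sketch of the sender oriented fan out: the axonal delay per successor
# is handled here, outside the logical process kernel (illustrative only).
import heapq

class FanOutConnection:
    def __init__(self, successors):
        self.successors = successors     # list of (target, delay) pairs

    def emit(self, spike_time, event_list):
        # One event per successor, stamped with spike time plus delay and
        # inserted into the central event list to preserve temporal order.
        for target, delay in self.successors:
            heapq.heappush(event_list, (spike_time + delay, target))
```

With low average activity, the total number of insertions stays small even though each spike fans out to many successors.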
Temporal order of events is necessary to guarantee correct simulation results. If a process has already calculated the changes of its state upon an event, any earlier incoming event may lead to different results. It is also necessary to ensure that an event is guaranteed to happen at the given time before it is inserted into the event list. Therefore the logical process control implements a mechanism to hold back events up to the time where it is safe to send them. Given a model with a postsynaptic potential like the one shown in Figure 5, the first incoming event (solid) is succeeded by another event (dashed). The temporal order is preserved in this case, but the second event changes the firing time of the neuron, which was calculated on the basis of the first event.
[Figure 5: Temporal order of events – incoming spikes passing through the logical process kernel (a threshold applied to the postsynaptic potential) and the fan out connection to produce outgoing spikes]
Hence in this situation the logical process control holds back the outgoing event up to the time at which it is scheduled. If the second event arrives with a time stamp earlier than the scheduled time of the outgoing event, the outgoing event is cancelled by the logical process control and the new results from the kernel are stored or sent for further processing. Only if each input of the process has a sufficient look ahead, exceeding the time stamp of the outgoing event, does the logical process control directly pass the event to the fan out connection. This mechanism ensures that the events entered into the event list are guaranteed to happen at their given time. This is the situation every implementation of a logical process kernel faces. Obeying these rules, it may implement any behaviour which guarantees the temporal order in the system. Hence the logical process kernel must produce outgoing events with a time stamp equal to or greater than the time stamp of the incoming event which causes the calculation of the outgoing event. There is no limit to the complexity of a logical process kernel.
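The following hedged sketch illustrates one way to realise this hold-back rule; the kernel interface (handle() returning a provisional firing time or None) is an assumption, not the simulator's API.

```python
# Sketch of the hold-back mechanism in the logical process control:
# keep at most one provisional output time, cancel it when an earlier
# input arrives, release it once every input's look ahead has passed it.
class LogicalProcessControl:
    def __init__(self, kernel, fan_out):
        self.kernel = kernel
        self.fan_out = fan_out
        self.pending = None              # provisional outgoing spike time

    def on_event(self, t_in, input_lookaheads, event_list):
        if self.pending is not None and t_in < self.pending:
            self.pending = None          # cancel: the earlier event may
                                         # change the computed firing time
        self.pending = self.kernel.handle(t_in)
        self.release_if_safe(input_lookaheads, event_list)

    def release_if_safe(self, input_lookaheads, event_list):
        # Release only when no input can still deliver an earlier event.
        if self.pending is not None and min(input_lookaheads) > self.pending:
            self.fan_out.emit(self.pending, event_list)
            self.pending = None
```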
Currently two different models are implemented for use with the simulator:
1. A very simple integrate and fire model, which is also used by Lloyd Watts’ event driven simulator “Spike” [Watts94]
2. A more complicated model, similar to the spike response model introduced by Wulfram Gerstner [Gerstner90], [GerstHemm92]
The first model uses a linear integration over input current pulses of fixed duration which are triggered by incoming spikes.
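A minimal sketch in the spirit of this first model, assuming a single non-overlapping pulse per spike and ignoring leakage; parameters and interface are ours, not those of “Spike”:

```python
# Illustrative event driven integrate and fire kernel: each spike
# triggers a current pulse of fixed duration, integrated linearly.
class IntegrateAndFireKernel:
    def __init__(self, threshold=1.0, current=0.5, duration=1.0):
        self.threshold = threshold
        self.current = current           # amplitude of one input pulse
        self.duration = duration         # fixed pulse duration
        self.v = 0.0                     # membrane potential

    def handle(self, t):
        """Return the predicted firing time for the spike at t, or None."""
        rise = self.current * self.duration   # total charge of one pulse
        if self.v + rise >= self.threshold:
            # Linear rise: solve v + current * dt = threshold for dt.
            t_fire = t + (self.threshold - self.v) / self.current
            self.v = 0.0                      # reset after the spike
            return t_fire                     # stamp >= incoming stamp
        self.v += rise
        return None
```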
The second model is implemented using look-up tables to provide nearly arbitrary kernel functions for the shape of the postsynaptic potential, like the one in Figure 5. The computational effort for the simple integrate and fire model is negligible. Because look-up tables are used for the kernel functions of the spike response model, the computational effort for this model is still considerably low.
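As an illustration, the table-driven evaluation can be sketched as follows; the sampling step and the alpha-function example are our assumptions:

```python
# Illustrative look-up table for a postsynaptic potential kernel: the
# shape is sampled once, so evaluating it per spike is an array access.
import math

class PSPTable:
    def __init__(self, shape, dt=0.1, horizon=50.0):
        n = int(horizon / dt)
        self.dt = dt
        self.table = [shape(i * dt) for i in range(n)]  # precomputed

    def __call__(self, t_since_spike):
        i = int(t_since_spike / self.dt)
        return self.table[i] if 0 <= i < len(self.table) else 0.0

# Example: an alpha-function shaped postsynaptic potential with tau = 5.
psp = PSPTable(lambda t: (t / 5.0) * math.exp(1.0 - t / 5.0))
```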
4. Preliminary Results
The objective of our implementation is to keep all sub-simulators in the sequential mode, where the number of spikes lying in a sub-simulator's safe simulation window is big enough to keep it running. If this condition is fulfilled, the handling of messages reduces the performance of the simulation only by a constant value, because the sequential simulation engine and the message handling run in separate threads. The performance in this situation is also independent of the bandwidth of the underlying network. In most real simulation runs with irregular structures, however, a certain amount of stalling time of the sub-simulators will occur.
Because the implementation of the simulator is in an early alpha version, we present preliminary results for two very simple network topologies which illustrate the expected speed up, depending on the degree to which the capacity of each contributing workstation is exploited. The first network consists of six layers, fully connected from layer to layer, plus a feeding layer. Each layer has the same number of neurons, and the connections are equally weighted and delayed. The second network consists of one layer in which each neuron is fully connected to all other neurons, as in a Hopfield network. These connections are also equally weighted and delayed. The networks were simulated with one, two and three equally equipped workstations, varying the number of neurons per layer between five and 50 for the first net. The size of the second net was varied between 30 and 120 neurons. For the sub-simulators we used SUN Ultra Sparc workstations (model 140 with 128 MB RAM) and for the user interface of the distributed simulation a SUN Sparc 20 with 32 MB. In the case of a distributed simulation, the layers of network one were equally distributed over the contributing workstations, with one of the workstations additionally handling the feeding layer. The second net was divided into equally sized sets of neurons mapped onto the contributing workstations.
In Figure 6 the measured speed up for the layered networks reaches its maximum at 1.95 for two machines and at 2.7 for three machines, which is close to the theoretical maximum. The speed up of the Hopfield network is lower, because it is a kind of worst case scenario for our simulator: the fully recursively connected network causes periodic stalling of the sub-simulators. Nevertheless, a speed up of 1.7 using two workstations and 2.3 using three workstations is achieved in this situation (Figure 7).
[Figure 6: Speed up for the layered topology – speed up factor relative to a single-processor machine versus the number of neurons per layer, for two and three machines]
[Figure 7: Speed up for the Hopfield topology – speed up factor relative to a single-processor machine versus the number of neurons in the Hopfield layer, for two and three machines]
5. Conclusion
Throughout the paper we have presented a concept for a distributed event driven simulator which is well suited for, but not limited to, the simulation of spiking neural networks. Contrary to time driven simulators for spiking neural networks, this simulator exploits the low average activity of these networks. It also provides a dedicated mechanism to handle the delays between logical processes and therefore between neurons. Moreover, these delays are exploited to further enhance the performance of the simulator. Because of the sub-simulator concept, good scalability is provided using already available computational resources. A simple interface between the simulation engine and the logical process kernels eases the exchange and addition of new neuron models. The measured performance for two simple networks confirms the theoretically estimated speed up one can achieve by distributing the simulation.
Research is ongoing to further improve the overall simulation performance. Several algorithms for the automatic partitioning of the networks are currently under evaluation.
References
[Aertsen93] Aertsen A. (ed.), “Brain Theory: Spatio-Temporal Aspects of Brain Function”, Elsevier, 1993.
[Brissinck97] Brissinck W., Clarysse S., Dirkx E., “A Hierarchical Approach to Distributed Discrete Event Simulation”, Proc. of IASTED International Conference on Parallel and Distributed Computing and Networks, 1997.
[BowBee94] Bower J.M., Beeman D., “The Book of GENESIS”, Springer Verlag, 1994.
[Ferscha95] Ferscha A., “Parallel and Distributed Simulation of Discrete Event Systems”, in Parallel and Distributed Computing Handbook, McGraw-Hill, 1995.
[Geist94] Geist A., Beguelin A., Jiang W., Manchek R., Sunderam V., “PVM: Parallel Virtual Machine”, MIT Press, 1994.
[Gerstner90] Gerstner W., “Associative memory in a network of ‘biological’ neurons”, Advances in Neural Information Processing Systems 3: 84-90, 1990.
[GerstHemm92] Gerstner W., van Hemmen J.L., “Associative memory in a network of ‘spiking’ neurons”, Network 3: 139-164, 1992.
[Hartmann93] Hartmann G., 1. Workshop zum Förderungsschwerpunkt “Elektronisches Auge”, Summary: 10-19, 1993.
[Maass95] Maass W., “On the computational complexity of networks of spiking neurons”, Advances in Neural Information Processing Systems, vol. 7, MIT Press: 183-190, 1995.
[Rolls89] Rolls E.T., “The representation and storage of information in neuronal networks in the primate cerebral cortex and hippocampus”, in The Computing Neuron, Addison-Wesley: 125-159, 1989.
[Watts94] Watts L., “Event-Driven Simulation of Networks of Spiking Neurons”, Advances in Neural Information Processing Systems, Volume 6: 927-934, 1994.