Emergent Properties
of an
Artificial Neural Network
Nicholas J. Schmansky
MSc in Artificial Intelligence
Division of Informatics
University of Edinburgh
1999
Abstract
An artificial neural network was constructed with the objective of modelling
a system with emergent properties. The network was built with biologically
derived features. Specifically, the neurons were based on the Spike Response
Model. Synapses were subject to adaptation by Hebbian learning. The
neurons were densely connected locally and organised into columns as found
in cortex, and these columns were sparsely connected in the same manner as
found in cortex. A simple model of the retina acted as a mechanism to input
patterns to the network.
A graphical interface to the network was constructed to allow experimentation on the properties of synchrony, assembly formation and hierarchy.
Synchrony was found to occur among groups of columns, where columns
were treated as units in the same way as neurons are normally treated in
standard neural networks. The development of hierarchical assemblies was
also observed, although in a manner differing from prediction. Observations
on assembly behaviour led to the formation of two hypotheses. The first is
that the columnar organisation of neurons may promote synchrony across
long distances of cortex. The second is that the hypothesis of formation of
columns into hexagonal-shaped assemblies may not be valid. Instead, an
arbitrary 'chain' shape is more likely.
Acknowledgements
I would like to thank my supervisors, Dr. Peter Ross and Dr. Bruce Graham, for the direction given to me throughout the project. Both possessed
an uncanny ability to recognise my underlying motivations and interests in
the subject matter I chose to study in this project.
I would also like to thank my co-workers at my former employers from
whom I learned the skills of software development, debugging and project
management.
Lastly, I would like to acknowledge the innumerable column
assemblies in the minds of friends and family that in any way contributed
to the success of this project. May these assemblies fire in perfect synchrony
with my own assemblies representing thankfulness.
Contents

1 Introduction                                            1
  1.1 Project objectives                                  2
  1.2 Assemblies                                          3
  1.3 A guide to this dissertation                        4

2 Background                                              7
  2.1 Emergence                                           7
    2.1.1 Tell me about it                                7
    2.1.2 Model building                                  8
  2.2 Looking at triangles                                10
    2.2.1 A simple model of the CNS                       10
    2.2.2 The model used in this project: a network
          with biologically derived features              16
  2.3 Assemblies                                          18
    2.3.1 Cell assemblies                                 18
    2.3.2 Column assemblies                               20
    2.3.3 Hexagonal mosaics                               21
  2.4 Preview                                             21

3 Architecture                                            23
  3.1 Modelling a neuron                                  24
    3.1.1 Spike response model                            26
    3.1.2 Pyramidal cell                                  29
    3.1.3 Inhibitory neuron                               30
    3.1.4 Hebbian learning                                31
    3.1.5 Initial synaptic weights                        32
    3.1.6 Spike delay                                     34
  3.2 Modelling a column                                  35
  3.3 Modelling cortex                                    36
  3.4 Modelling the retina                                37
  3.5 Modelling the shift-to-contrast reflex              38

4 Software                                                41
  4.1 Overview                                            41
  4.2 Configurability                                     42
  4.3 Controls                                            43
  4.4 Displays                                            44

5 Experiments                                             47
  5.1 Column dominance                                    47
  5.2 Synchrony                                           52
  5.3 Pattern storage and recall                          52
  5.4 Anticipation                                        57
  5.5 Hierarchy                                           57
  5.6 Hexagons                                            61

6 Discussion                                              65
  6.1 At the edge of order and chaos                      65
  6.2 What's so special about columns?                    67
  6.3 Synchrony                                           68
  6.4 Hierarchy                                           69
  6.5 Closing the loop                                    69
  6.6 Breaking the rules                                  71

7 Conclusion                                              73
  7.1 Achievements                                        74
    7.1.1 Software                                        74
    7.1.2 Observations of emergent properties             74
    7.1.3 Observations of unusual behaviours              75
    7.1.4 Predictions made based on observations          76
  7.2 Shortcomings                                        76
  7.3 Enhancements to the software                        77
  7.4 Direction of future research                        78
    7.4.1 Theory building                                 78
    7.4.2 Practical applications                          79

Bibliography                                              80

Appendices                                                86

A Software testing                                        87
  A.1 Neuron model                                        87
  A.2 Column model                                        90
  A.3 Cortex model                                        90
  A.4 Learning algorithm                                  92

B Hebbian learning algorithm                              93

C Connectivity algorithms                                 97
List of Figures

2.1 A simple model of the CNS                                 10
2.2 Input controlled reverberation                            13
2.3 Hierarchical response to cell assemblies                  16
3.1 EPSP and IPSP functions                                   29
3.2 Pyramidal cell refractory and gain functions              30
3.3 Inhibitory neuron refractory and gain functions           31
3.4 Internal connectivity of a column                         35
3.5 Beginnings of the formation of a hexagon                  37
3.6 Column connectivity showing a hexagon                     39
3.7 Retina showing a line-segment detector                    40
5.1 Periodicity of column dominance exchange                  49
5.2 Effect of input current on a two-column competition       50
5.3 Dominance exchange between two columns                    51
5.4 Demonstration of synchrony of column activity             53
5.5 Effect of inhibitory connections during pattern recall    55
5.6 Window containing the triangle pattern and retina         58
5.7 Triangle corner assemblies before and after learning      60
5.8 Synchronous columns in triangular formations              63
5.9 Possible shape of triangular synchronised columns         64
6.1 Two forms of hierarchy formation                          70
A.1 Pyramidal cell spike train plots                          90
A.2 Inhibitory neuron spike train plots                       91
B.1 Window function of Hebbian learning rule                  95
List of Tables
3.1 Summary of biological derivation of network parameters . . . 24
3.2 Relative strengths of initial synaptic weights . . . . . . . . . . 33
C.1 Connectivity algorithms . . . . . . . . . . . . . . . . . . . . . 97
"No image could ever be adequate to the concept of a triangle
in general. It would never attain that universality of the concept
which renders it valid of all triangles, whether right-angled, obtuse-angled,
or acute-angled; it would always be limited to a part only
of this sphere. The schema of the triangle can exist nowhere but
in thought."
Immanuel Kant, 1781
Chapter 1
Introduction
As yet, no theory accounts for intelligent behaviour or thought as it is commonly understood. The reductionist approach has yielded an enormous
amount of information at all levels of abstraction. But taken as a whole,
it does not explain some of the most puzzling marvels in the world, such
as human consciousness and creativity, or simply a dog playing in the park.
What guides a dog's behaviour from moment to moment? To say it is instinct,
or associative learned action, is not enough, and tends to be very unsatisfying.
Perhaps there exists an explanation that spans many levels of science, but is
not yet apparent.
A system is said to have emergent properties if the behaviour of the system
is not readily explained in a reductionist manner. That is, the whole seems
to be greater than the sum of the parts. Examples of systems exhibiting
'emergent phenomena' are varied: from ant colonies to economies, from fluid
flow to the immune system. In part, science is about developing theories
that predict the way a system will behave under known conditions. Many
complex systems have yet to yield the rules and laws that govern their full
set of behaviours. At this time, a theory is that these complex systems may
share a common set of laws responsible for the emergent properties, which
makes the scientific study of emergent phenomena worthwhile.
The Santa Fe Institute is a research institute devoted to pursuing 'emerging science'. John H. Holland, steering the direction of this institute, is
working to create an emergence theory. In [Holland 98], a recurrent neural
network is put forth as an example of a system with sufficient complexity
to exhibit specific emergent properties. Holland describes a simplified model
of the central nervous system (CNS), incorporating neurons with cyclic connectivity, variable threshold and synapses subject to the Hebbian learning
rule. From these three properties, Holland claims the following properties
emerge:
- Synchrony - groups of neurons entrain themselves into synchronous
firing.
- Anticipation - groups of neurons "prepare" to respond to an expected
future stimulus.
- Hierarchy - new groups of neurons form to respond to already-formed
groups.
To say that a system exhibits 'emergent phenomena' does not help the scientist very much. The ideal is to eliminate these words from the description,
and replace them with rules and laws that explain the regularities, free of
the incidental and irrelevant details. This requires building models and observing selected aspects. A well-conceived model makes possible prediction
and planning, and reveals new possibilities [Holland 98].
1.1 Project objectives
The aim of this project was to build a software model of the CNS described
by Holland with these intentions:
- To confirm the appearance of the emergent properties that Holland
claims should arise from a network of neurons.
- To make observations on the properties in the hope of discovering the
underlying principles of emergence.
- To make note of any unusual behaviours of assemblies, which tend to
form in a recurrent neural network, with the intention of benefiting the
artificial intelligence and engineering communities.
The software model constructed for this project was a recurrent artificial
neural network (ANN). The decision was made to build the ANN with biologically derived features, based on the assumption that this network type
would be more likely to exhibit the emergent phenomena of Holland's CNS
model. Additionally, working with a biologically derived neural network
allowed comparisons to experimental data taken from psychology and neuroscience, which were beneficial in determining network functionality. The
subgoals of the project relating to the biologically derived features were:
- To model a patch of cortex as found in mammals.
- To model the pyramidal cells and inhibitory neurons found in cortex.
The Spike Response Model [Gerstner & vanHemmen 92] was the basis
for the simulated neurons.
- To model the columnar structures found in cortex. The work of Fransen
and Lansner was the basis for the simulation details.
- To model columnar connectivity as found in cortex. Both the work
of [Fransen & Lansner 98] and [Calvin 96] were the basis for the simulation details.
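For orientation, the Spike Response Model describes a neuron's membrane potential directly as a sum of response kernels rather than by a differential equation. The following is the standard general form (notation follows the common textbook presentation; the particular kernels and parameter values used in chapter 3 may differ):

```latex
% Membrane potential of neuron i in the Spike Response Model:
%   \eta        = refractory (after-spike) kernel
%   \hat{t}_i   = last firing time of neuron i
%   \varepsilon = postsynaptic potential kernel
%   t_j^{(f)}   = firing times of presynaptic neuron j
u_i(t) \;=\; \eta\bigl(t - \hat{t}_i\bigr)
  \;+\; \sum_{j} w_{ij} \sum_{f} \varepsilon\bigl(t - t_j^{(f)}\bigr),
\qquad u_i(t) \ge \vartheta \;\Rightarrow\; \text{neuron } i \text{ fires.}
```

Here the refractory kernel models the after-effect of the neuron's own last spike, the postsynaptic kernel models the potential evoked by each incoming spike, the w terms are the synaptic weights, and the threshold may itself vary over time.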
1.2 Assemblies
Closely bound to the emergent properties described thus far is the formation
of assemblies of groups of neurons, as first hypothesised by Donald O. Hebb
in his book The Organization of Behavior [Hebb 49]. Fransen and Lansner
furthered this idea by showing that columns of neurons can act as functional
units, forming column assemblies [Fransen & Lansner 98]. This idea was
implemented in the project software: the activity of individual neurons was
not observed. Instead, the activity of a column was treated as a single quantity
and observed.
William Calvin proposes that columns may form another level of hierarchy, taking the form of 'hexagonal mosaics', which themselves may act
as building blocks upon which may arise the emergent properties of consciousness and creativity [Calvin 96]. This hypothesis was scrutinised in this
project.
Assemblies are believed to account for a variety of cognitive effects, from
feature binding to visual scene segmentation. A working knowledge
of the underlying principles of assemblies would benefit those engineering
solutions to practical problems, and those seeking to understand biological
and artificial intelligence.
1.3 A guide to this dissertation
The background chapter, 2, contains foundation material which delves a bit
more into the concept of emergent phenomena. Following this are details
of Holland's model of the CNS, which is the conceptual basis of the project. Then, an overview of the modifications made to this model in order to
construct a software model for this project is described. Lastly, the background chapter covers the concept of assembly formation, in cells, columns
and among 'hexagonal mosaics'.
Chapter 3 describes the architecture of the software model used in this
project. Included are details of how the neuron, column, cortex and retina
were modelled.
Chapter 4 describes the software developed for the project, in the context
of its functionality for experimentation.
Following this is a chapter describing the experiments undertaken in the
project. For each experiment, the objective, setup, outcome and a short discussion are included. A more analytical treatment of the observations made
in the experiments is found in chapter 6, followed by a concluding chapter.
The interested reader might then want to browse the appendices and gift
shop.
Chapter 2
Background
2.1 Emergence
2.1.1 Tell me about it
Electromagnetism and gravity are considered to be phenomena. Each produces effects that are observable and repeatable, but each is not directly
'observable' in itself. Some would say emergent properties of complex systems are not phenomena, but rather properties that are built into the system; that emergence is in the 'eye-of-the-beholder' and goes away once it
is understood. Contrary to this thinking, John H. Holland in [Holland 96,
Holland 98] presents evidence that scientific investigation can greatly increase
the understanding of emergence (as a true phenomenon). Holland states that
investigation must be restricted to systems for which there are useful descriptions in terms of rules and laws. A phenomenon is not called emergent unless
it is recognisable and recurring.
Emergent properties are often exhibited by systems with many simple
components (having simple rules), and the connection between the system's
behaviour and these rules is not obvious. Emergence usually involves patterns of interaction that persist despite a continual turnover in the constituent
parts [Holland 98]. Fluid flow is a good example of this. The standing wave in
front of a rock in a rushing river is persistent, yet the water molecules constituting the wave are continuously changing (and not in a recycling manner).
This complexity (that emerges) is not just the complexity of random patterns: the systems are animated, dynamic, changing over time, even though
the laws governing the system do not change. The laws generate the complexity, and the changing flux of patterns that follows leads to perpetual
novelty and emergence [Holland 98].
Continuing with Holland's laws of emergence: the component mechanisms
interact without central control, and the possibilities for emergence increase
rapidly as the flexibility of the interactions increases. To Holland, an organising principle of the world seems to be that building blocks at one level
combine into new building blocks at a higher level, and that a hierarchical,
building-block structure transforms a system's ability to learn, evolve and adapt. The possibilities for emergence are compounded when the elements of
the system include some capacity for adaptation and learning [Holland 98].
2.1.2 Model building
John Holland claims that model building is critical in the construction of
scientific theory. A model concentrates on describing a selected aspect of
the world, setting aside other aspects as incidental. If the model is well
conceived, it makes possible prediction and planning, and reveals new possibilities [Holland 98].
An alternate view of model building is that a model can serve as an artificial substrate out of which emergent phenomena could arise. For example,
a model constructed from a neural network could serve as a substrate for the
emergence of intelligence. The mechanisms of biological systems, currently
the only real examples of emergent systems that have produced intelligence,
would not need to be mimicked in detail. Rather, the model would need only
to exhibit those emergent properties necessary to support the operations of
thought. [Hillis 98]
Daniel Hillis points out three ways of discovering these essential properties (for the emergence of intelligence). The first is to study the properties of
specific emergent systems, and build a theory of their capabilities and limitations. This is the same as Holland's direction. Some examples of systems
currently under this type of experimental study include neural networks,
spin glasses, cellular automata, evolutionary systems, and adaptive automata. The second way is the study of biological systems. Neurophysiology,
cognitive psychology and evolutionary biology have provided the most useful
information about these natural intelligent systems. Lastly, a theoretical understanding of the requirements of intelligence or of the phenomena of emergence in general would be beneficial. Theories of logic and computability,
linguistics, and dynamical systems theory are relevant examples. [Hillis 98]
It is possible to produce a phenomenon without fully understanding it.
A computer simulation of fluid flow produces laminar flow, vortex streams,
and turbulence that is indistinguishable from the behaviour of real fluids.
Although the detailed rules of interaction are very different from the interactions of real molecules, the emergent phenomena are the same. The
emergent phenomena can be created without understanding the details of
the forces between the molecules or the equations that describe the flow of
the fluid. [Hillis 98]
In the next section, Holland's model of the central nervous system (CNS)
is described. This model is a network of neurons, whereby specic neuron
properties are believed to be responsible for certain emergent properties of
the CNS. In building a model of this type, it is hoped that the phenomena
can be examined in the same sort of way as observing an ant colony (another
example of a system with emergent properties). The simulated network can
be perturbed in different ways, along the way sorting out the properties that
have a key role in the emergence of organisation [Holland 98].
In this project, Holland's model was modified for both practical reasons
and to build in properties that others have suggested might produce emergent
phenomena in a neural network (specifically, William Calvin's hexagonal
mosaics), and to extend the building-block concept by working with column
assemblies.
2.2 Looking at triangles
2.2.1 A simple model of the CNS
The model (before modification) that is the basis of this project is shown
in figure 2.1. It was proposed by John Holland in [Holland 98], pgs. 101-113. The reader should keep in mind that the remainder of this section is
paraphrasing Holland, and discussion of the assumptions made by his model
is scattered throughout the remaining sections of the dissertation.
Figure 2.1: A simple model of the CNS. Drawing from pg. 101 of [Holland 98]
The model is composed of three parts, based on basic mammalian physiology:
- Input - An 'eye' with a 'retina' consists of a large number of input
neurons, configured such that the central area is of high resolution,
surrounded by an area of low resolution.
- Processor - The 'cerebrum' consists of a large number of randomly
interconnected neurons, forming cycles of varying length.
- Output - A 'shift-to-contrast reflex' controls the movement of the eye,
which causes the eye to move to a new point of contrast. Referring to
figure 2.1, the points of highest contrast are the three vertices of the
triangle. The reflex is suppressed when neurons in the 'cerebrum' are
highly active, and released when this high firing rate drops off.
The important properties this network must possess in order for the emergent properties to appear are:
- Cyclic connectivity - Also known as recurrent connectivity. In contrast to a feed-forward neural network, a cyclic network allows a circulation of pulses, or reverberations, to take place. This kind of activity
makes possible indefinite memory, and allows neurons to form cooperative assemblies that can act as building blocks for sequential behaviour.
- Variable threshold - As the time since a neuron last fired increases,
the neuron's threshold decreases. Conversely, as the firing rate increases, the threshold increases. The decreasing threshold causes an
increased sensitivity to incoming pulses if the neuron is quiet (relatively inactive) for an extended period. By firing at a rate proportional
to the average synapse-weighted strength of the pulses received, the
variable threshold allows the neuron to act as a frequency modulator.
- Fatigue - The fatigue effect operates on a time scale greater than (by
at least an order of magnitude) the variable-threshold time scale. A
neuron becomes more fatigued if it continues to fire at a high rate for
this extended period. The effect is that the threshold steadily increases.
Conversely, the threshold is decreased if a neuron pulses at a low rate
for a long period of time.
- Hebb's rule - Hebbian learning is a general principle that states that
the synaptic efficacy between two neurons should increase if the two
neurons are 'simultaneously' active [Hebb 49]. This rule was later extended to say that the synaptic efficacy should decrease if two neurons
are not simultaneously active [Rochester et al. 56]. In terms of credit
assignment, the weight adjustment made between two neurons is based
on local information only. Learning is unsupervised.
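The interplay of these mechanisms can be made concrete in a few lines. The sketch below is illustrative only: the class name, update rules and every numeric constant are invented for exposition, and are not the rules or values used in this project. It pairs a fast variable threshold with a fatigue term that adapts an order of magnitude more slowly, plus Hebb's rule with the Rochester et al. extension.

```python
class AdaptiveNeuron:
    """Illustrative neuron with a fast variable threshold and slow fatigue.
    All constants are invented for exposition, not taken from the project."""

    def __init__(self):
        self.threshold = 1.0   # fast-adapting component
        self.fatigue = 0.0     # slow-adapting component (order of magnitude slower)

    def step(self, drive):
        fired = drive >= self.threshold + self.fatigue
        if fired:
            self.threshold += 0.20   # firing quickly raises the threshold
            self.fatigue += 0.02     # sustained firing slowly accumulates fatigue
        else:
            self.threshold = max(0.1, self.threshold - 0.05)  # quiet -> more sensitive
            self.fatigue = max(0.0, self.fatigue - 0.005)     # fatigue decays slowly
        return fired


def hebbian_update(w, pre_active, post_active, gain=0.05, loss=0.01):
    """Hebb's rule with the Rochester et al. extension: strengthen on
    coincident activity, weaken otherwise; local information only."""
    if pre_active and post_active:
        return min(w + gain, 1.0)
    return max(w - loss, 0.0)
```

Driving such a neuron with a constant input produces alternating bursts and silences: the falling threshold makes a quiet neuron more sensitive, while the rising threshold and fatigue cap sustained high rates, which is the frequency-modulating behaviour described above.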
With this foundation, the sequence of activity that occurs in the network,
from which three forms of emergent properties appear, can now be described.
Refer to figure 2.2. The next three subsections describe each property in
detail, as activity unfolds.
Synchrony
Activity begins when the triangle is presented to the eye. The 'shift-to-contrast' mechanism causes movement to one of the vertices. Referring to
figure 2.2, let's say it settles on corner R. The image of this corner impinges
on the retina, and the retinal neurons receiving the image light rays begin
to fire at a high rate. These neurons are connected to some random subset
of neurons within the 'cerebrum': subset r. Because of this, these neurons
(in subset r) also begin to fire at a high rate. The pulsing r neurons cause
other r neurons to pulse, due to local cyclic connectivity. Observation shows
that a further subset of these r neurons begin to reverberate, that is, to fire
in synchronised lockstep (entrainment), due to the combination of variable
threshold and high firing rate. Hebbian learning causes this synchronised
subset to become stronger, that is, the synaptic weights that promote the
firing of this subset tend to increase. The effect of this is that subsequent
Figure 2.2: Input controlled reverberation. Drawing from pg. 104 of [Holland 98]
presentations of the R corner in the same orientation and retinal position
will 'ignite' this subset, or 'assembly', of neurons much more quickly. This
idea of neurons forming 'cell assemblies' was postulated by Donald O. Hebb
in 1949 [Hebb 49]. The assembly concept is one of the major focuses of this
project, and will be discussed in section 2.3.
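The entrainment step can be illustrated with the classic pulse-coupled oscillator model of Mirollo and Strogatz: leaky integrate-and-fire units that excite one another on firing lock into synchronised firing. This is a toy stand-in for the reverberating r subset, not the network built for this project; the dynamics and all constants below are illustrative.

```python
def simulate(v_init, current=1.5, eps=0.2, dt=0.001, t_max=20.0):
    """Pulse-coupled leaky integrate-and-fire units. Each obeys
    dv/dt = current - v and fires on reaching v = 1; every firing unit
    kicks each non-firing unit's potential up by eps. Returns each
    unit's last firing time."""
    v = list(v_init)
    last_fire = [None] * len(v)
    for step in range(int(t_max / dt)):
        t = step * dt
        for i in range(len(v)):
            v[i] += dt * (current - v[i])            # leaky integration
        firers = {i for i, vi in enumerate(v) if vi >= 1.0}
        while True:                                  # absorption: a kick may push
            absorbed = {i for i, vi in enumerate(v)  # others over threshold too
                        if i not in firers and vi + eps * len(firers) >= 1.0}
            if not absorbed:
                break
            firers |= absorbed
        for i in range(len(v)):
            if i in firers:
                v[i] = 0.0                           # reset; ignore same-step kicks
                last_fire[i] = t
            else:
                v[i] += eps * len(firers)            # excitatory pulse(s)
    return last_fire
```

Started at different phases, the two units pull each other into lockstep within a few cycles and then fire in the same step indefinitely, a simple analogue of the entrainment attributed above to variable threshold plus high firing rate.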
The fatigue effect will eventually cause a reduction in the assembly firing
rate. At some critical point, when the network activity is at some 'minimum',
the 'shift-to-contrast reflex' is no longer inhibited, and the reflex causes the
eye to move to a new vertex (again, referring to figure 2.2, let's say it moves
to point S). The previous sequence of events is repeated, causing a new subset
of neurons, s, to become the dominant neurons. However, a new effect takes
place: the pulses from s to subset r do not cause those neurons in r to fire,
because the r neurons are still fatigued. The negative effect of Hebb's rule
causes these connections to reduce in strength. This means that when subset
s fires in the future, it will tend to inhibit subset r.
The process repeats for vertex T, eventually causing the development of
three groups of assemblies in r, s and t. Nothing precludes the network from
shifting in a random order, say R, T, S, T, S, T, R, S, R..., thus learning
proceeds in an interleaved manner.
Anticipation
The property of anticipation allows the network to have 'expectations about
future actions'. To see how this is possible, assume that subset r neurons
have been reverberating for a while. Recall that this inhibits activity in s
and t, thus the s and t thresholds will decrease, due to the fatigue effect.
But s and t will not increase activity yet, assuming the inhibitory effect
from r is powerful. In fact, the s and t thresholds will drop below the
average threshold level of the other neurons in the 'cerebrum'. What this
means is that once the r neurons become fatigued themselves, only a
weak stimulus from S or T image input is needed to start the corresponding s or
t subset escalating activity, in preference to the other 'cerebrum'
neurons. The result is a quick switch in dominance to one of the learned
assemblies. It is likely that dominance will switch from vertex to vertex,
randomly, thus embodying the 'three-ness' of the triangle. The effect is that
when one vertex is 'detected', or within the proper region of the retina,
the network anticipates the other vertices. Another positive effect is that a
noisy or incomplete input from one of the corners can still 'ignite' the subset
neurons for that corner, 'filling-in' the image. This effect is commonly seen
in both sight and sound psychology experiments. At a higher level, it might
also account for 'priming' effects.
Hierarchy
Refer to figure 2.3. Assume network activity has proceeded for some time,
and that the Hebbian learning process has 'stored' the three corners into three
distinct assemblies. Now assume another subset of neurons exists, called v,
that receives connections from r, s and t, but connectivity from v back to r,
s and t is sparse. The v neurons will pulse when any of r, s or t are pulsing,
but v cannot inhibit those subsets because of the lack of returning connectivity.
Assume also that v neurons do not fire at such a high rate as r, s or t, so
fatigue does not build up quickly. The net effect is that the synaptic strength of
connections to the v neurons is increased. Now the network has formed
a hierarchy, generated in response to regularities in the external stimuli: the
v neurons are an abstraction, representing the 'three-ness' of the triangle.
"This process is a precursor of that everyday, but astonishing,
human ability...to effortlessly parse unfamiliar scenes into familiar objects, an accomplishment that so far eludes even the most
sophisticated computer programs."
John Holland, [Holland 98], pg. 111
Figure 2.3: Hierarchical response to cell assemblies. Drawing from pg. 108 of [Holland 98]
2.2.2 The model used in this project: a network with
biologically derived features
Holland's model is descriptive. A working model was required for this project,
upon which experiments could be conducted. In [Schmansky 99] (a past
effort by the author to create a recurrent neural network), it was
demonstrated that a recurrent neural network could all too easily diverge into
a seizure state, whereby vast numbers of neurons would fire in synchrony. It
was found to be rather difficult to find the right balance between excitatory
and inhibitory neurons, such that 'sustainable' activity at relatively low firing
rates was attained. In a recurrent network, the parameters to balance include:
- The ratio of the number of excitatory neurons to inhibitory neurons, and
the total number of each.
- Properties of these neuron types, such as threshold and recovery time.
- Connectivity between neurons of the same and different types, at both the
local and long-range level.
- Initial weights of synaptic connections between neurons.
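These four kinds of parameters can be gathered into a single configuration object, roughly what any such simulation has to expose for tuning. The sketch below is hypothetical: every field name and default value is invented for illustration (the roughly 4:1 excitatory-to-inhibitory ratio loosely reflects mammalian cortex), and none are the settings actually used in this project.

```python
from dataclasses import dataclass


@dataclass
class RecurrentNetConfig:
    """Hypothetical parameter set for a recurrent spiking network.
    Defaults are illustrative, not the dissertation's values."""
    # population sizes (ratio ~4:1, loosely matching cortex)
    n_excitatory: int = 80
    n_inhibitory: int = 20
    # per-type neuron properties
    exc_threshold: float = 1.0
    inh_threshold: float = 0.8
    exc_recovery_ms: float = 4.0
    inh_recovery_ms: float = 2.0
    # connection probabilities, local vs long-range
    p_local: float = 0.30
    p_long_range: float = 0.01
    # initial synaptic weights (inhibitory weights negative)
    w_init_exc: float = 0.15
    w_init_inh: float = -0.40
```

Collecting the knobs this way makes the balancing problem explicit: a parameter search, whether by hand or by evolution, is a walk through this one structure.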
Because Holland's model is based on a biological system (a basic CNS),
the decision was made to base the network parameters on biologically derived information. The field of computational neuroscience has produced
a wealth of information on biologically motivated neural networks. An assumption made for this project is that, quite possibly, evolution has tuned the
biological neural network (the CNS) such that emergent properties emerge.
This assumption diverges from Holland's, which assumes random selection of many of the network parameters. Holland assumes that the
Hebbian learning principle will 'sculpt' the network into one that exhibits
emergent properties. It was decided that for this project, such an assumption is bound to lead to frustration, in that the network would likely be of the
overactive, seizing variety.
By creating a network with biologically derived features, hopefully the
default network parameters will require only minimal 'sculpting' of synaptic
weights. The assumption is that evolution has searched the parameter space
already and considerably narrowed it. The overall objective of the project
was not to prove that emergent properties do indeed emerge
from a neural network; rather, the objective was to make observations on
these emergent properties. Thus, it is OK to 'cheat' a little.
Working with a biologically derived neural network had another benefit.
The wealth of experimental data from psychology and neuroscience served as
signposts during development. The functionality of the components of the
model was verified by comparing to experimental data from the biological
equivalent. For example, the pyramidal cell is the primary excitatory neuron
in cortex, and the simulated excitatory neuron created for this project was
compared against the characteristics of the pyramidal cell.
Before describing the implementation details of the network model constructed for this project, the concept of assembly formation, rst introduced
in section 2.2.1, is expanded.
2.3 Assemblies
2.3.1 Cell assemblies
The cell assembly was introduced in section 2.2.1. In 1949, Donald O. Hebb
coined the phrase 'cell assembly' to describe a functional unit composed of
a group of cells connected through excitatory synapses. Hebb proposed that
the repeated coincident firing of these cells, in conjunction with synapses
strengthened by co-activity of the pre- and postsynaptic neurons, could cause
the emergence of an assembly [Fransen 96]. Hebb believed that reverberation
may be useful in maintaining a temporary store for a memory until
physiological effects can make the memory permanent. Hebb also thought
that perceptions are integrations of learned networks built by cell assemblies [Hebb 49].
The concept was the dawn of a new way of thinking about distributed
representation of information². Since that time, the concept has been expanded
under various headings: neuronal ensembles, population coding, synchronised
cell activity, temporal synchrony, temporal encoding, neural population
functions, activity patterns, vector coding, synfire chains, and correlated
spiking. [Fransen 96] and [Kreiter & Singer 97] both briefly summarise this
history. In [Fransen 96], the properties of an assembly are summarised:
- Assemblies are densely connected groups of neurons; therefore they
  tend to activate simultaneously and thus show a degree of correlated
  firing.
- Assemblies can display reverberatory after-activity, that is, activity
  persisting beyond the originating stimulus.
- Assemblies have pattern completion capabilities that may be described
  in Gestalt psychological terms. When a partial pattern is presented, the
  full pattern is retrieved. If two patterns are presented simultaneously,
  a rivalry process leads to a competition between the patterns in which
  one pattern can defeat the other (the Necker cube is an example). In
  associative memory terms, assemblies are tolerant of noisy or incomplete
  input patterns.
- Assemblies are overlapping, that is, a neuron can participate in the
  representation of different features by joining different assemblies at
  different times. The theory of 'temporal encoding' offers a way of dealing
  with the problem of multiple object recognition given sharing of
  neurons between assemblies representing the objects. Basically, spike
  timing is important [Sterratt 98].
² One of the first uses of the word 'connectionism' in reference to a complex brain model
was made by Hebb in his paper.
- Assemblies are small compared to the whole network: the coding is
  sparse.
- Assemblies may be distributed over large parts of the network.
- Associations between assemblies are represented as assemblies, and not
  merely by connections between the constituent assemblies. Said another
  way, assemblies can act as building blocks for other assemblies
  (another spin on the hierarchy idea).
It is commonly held that coding information in a population of neurons is
an indispensable principle for cortical representation of sensory information.
'Population codes' (assemblies) represent information about a certain stimulus
or content by the pattern of graded activity in many different units. The
representational capacity of this form of coding is much higher than that of
so-called single-neuron representations. [Kreiter & Singer 97]
Holland offers a rather interesting comparison of cell assemblies to classifier
systems (which are described in [Booker et al. 89]). A rule is the equivalent
of a cell assembly: if event X occurs, then the assembly will reverberate.
Activating this assembly 'posts' a 'message' on an 'internal bulletin board',
where it can be seen by most of the other assemblies in the brain. Eyes and
muscles are the analog of detectors and effectors. [Holland 98]
2.3.2 Column assemblies
In the cortex, neurons are found to group together around a single vertically
oriented dendritic bundle, forming a cylindrical column. Typically about
100 neurons make up a column (also called a 'mini-column'). This is
a common organisational feature of neurons within the cortex, not only
in primary sensory areas (orientation columns, ocular dominance columns,
colour blobs, vibrissa barrels) but also in association areas (entorhinal
cortex) [Calvin 95, Fransen 96]. Relatively speaking, connectivity is dense within
a column and sparse between columns. The question that came to mind to
some researchers is whether a column is a form of building block. Could a
column act as a functional unit, and form assemblies of columns? Indeed,
Fransen and Lansner demonstrated that columns do act as functional units,
and exhibit all the same properties of an associative memory (which cell
assemblies were shown to exhibit) [Fransen 96].
The notion of a column acting as a functional unit was carried forth in
this project. A column was simulated as a collection of neurons, but the
column was treated as a unit when making observations in the experiments.
2.3.3 Hexagonal mosaics
William Calvin proposes that the connectivity of columns in cortex facilitates
the formation of assemblies that are hexagonal in shape. These hexagons act
as units, each representing a feature, and these units compete with each other
for dominance in a Darwinian evolutionary process. Calvin theorises that
this mechanism is capable of encompassing the higher intellectual aspects of
consciousness as well as attentional aspects. [Calvin 98]
In this project, the columns are connected in the manner described by
Calvin, such that hexagonal assemblies could form. This connectivity type is
in contrast to Fransen and Lansner, who chose full connectivity of columns
(all-to-all) for their experiments. The Calvin form of connectivity is biologically
derived. Details of this form of connectivity are described in section 3.3.
2.4 Preview
Before entering the next chapter, which describes implementation details of
the model, a short summary of the experiments conducted in this project,
which are described in chapter 5, might help build context for the reader.
- Observations were made of the competition for dominance between
  columns. A column is composed of densely connected excitatory and
  inhibitory neurons, and connectivity is relatively sparse between the
  two columns. A high level of activity in one column forces low activity
  in the other, until the dominant column 'tires', at which point the other
  column becomes dominant. An experiment demonstrated this effect and
  illuminated some interesting behaviours.
- An experiment into synchrony of column activity was undertaken. It
  was found that synchrony indeed emerges in a robust manner, but
  Hebbian learning seems to be necessary to cause it.
- The basic neural network operation of pattern storage and recall was
  tested in an experiment. It was found that under certain conditions,
  the recall process occurred in the most ideal way. That is, the activity
  of all columns except those associated with the stored pattern was
  completely inhibited.
- An experiment into the development of hierarchical assemblies was
  conducted. The sequence described in section 2.2.1 was not observed.
  Instead, an assembly formed from a subset of the r, s and t assemblies
  (referring to figure 2.3) seemed to represent a hierarchy.
- Lastly, an experiment with the aim of confirming whether hexagonal
  formations could form was conducted. The findings seemed to indicate
  that although they could form, the formation is probably not at all
  likely. Instead, arbitrary chains of connectivity are most likely to form.
Chapter 3
Architecture
As discussed in section 2.2.2, it was decided that for this project the network
model should be biologically derived. The assumption was that doing
so would improve the likelihood of finding emergent properties. Also in
section 2.2.2, the critical network parameters were listed. In this chapter, the
manner in which these parameters were derived from biological nervous
systems and implemented in the project is discussed at the computational and
algorithmic levels. Table 3.1 summarises them.
Of the assorted network topologies found in the CNS, it was decided
that for this project the network model would be based on cortex topology.
A motivation in choosing to study emergent phenomena in a neural network
was an interest in how high-level thought might occur. Because it is accepted
that this level of thought occurs in cortex, it seemed an obvious choice to
model cortex over other neural formations.
To pinpoint this further, the project models only layers II and III of
cortex. It is believed that for any column of cortex, the bottom layers (V/VI)
act as an out-box, the middle layer (IV) like an in-box, and the superficial
layers (II/III) somewhat like an inter-office box connecting the columns
and different cortical areas [Calvin 95]. Fransen and Lansner also chose to
model strictly layer II/III pyramidal cells and inhibitory neurons in their
column assembly work [Fransen & Lansner 98].

Parameter                   Biological Derivation
--------------------------  ---------------------------------------------
Neuron characteristics      Pyramidal cell and general-class inhibitory
                            neuron, as found in cortex, implemented
                            using the Spike Response Model.
Neuron ratios and count     Arranged as found in columns, derived from
                            Fransen and Lansner.
Neuron connectivity         Local connectivity is columnar; network-level
                            connectivity derived from experimental data
                            of cortex.
Initial synaptic weights    Strength is determined by neuron type and
                            connectivity type, derived from experimental
                            data and adjusted for the model.

Table 3.1: Summary of biological derivation of network parameters
3.1 Modelling a neuron
The basic biophysical properties of pyramidal cells (the most common
excitatory neuron in layers II/III), the properties of the excitatory synapses, and
the connectivity of inhibitory neurons to the pyramidal cells appear to be
ideally suited to favour propagation of synchronous activity and to attenuate
responses that are not synchronised [Kreiter & Singer 97]. Thus, for this
project, only two neuron types were modelled: the pyramidal cell, and a
general-purpose inhibitory neuron.
The next step was to choose a specific neuron model. Neurons can be
modelled at several levels of abstraction [Gerstner 99]:

- Compartmental - Also known as conductance-based. Simulation is
  at a microscopic level, where ion channels and such are modelled. The
  Hodgkin-Huxley equations are typically the basis of such models.
- Spike Response - An instance of a 'threshold-fire' model, devised by
  Wulfram Gerstner and Leo van Hemmen. A neuron fires, or spikes,
  when a 'membrane potential' crosses a threshold. The Spike Response
  Model is suitable for biologically derived modelling, because it accounts
  for refractory time, adaptation effects, and PSP characteristics.
- Integrate and Fire - This model is analogous to a circuit with a
  capacitor in parallel with a resistor, driven by a current. The current
  charges the capacitor, which drains through the resistor when it
  fires. It is basically equivalent to the Spike Response Model, but instead
  based on differential equations. It is less biologically accurate.
  It does not account for PSP characteristics or adaptation effects, only
  refractoriness.
- Rate Coding - The mean firing rate of the neuron is the only
  information maintained (spikes per second is the typical unit). The pulse
  structure of the neuronal output is neglected, along with any exact
  spike time information.
- Module - Activity in and out of specific brain areas can be modelled.
For this project, the spike response model was chosen. It is the most
computationally efficient model incorporating the effects of variable threshold
(refractoriness) and fatigue (adaptation), and allowing for cyclic connectivity,
all of which are required in Holland's model of emergent phenomena in a CNS
(recall section 2.2.1). [Fransen et al. 92, Fransen 96, Fransen & Lansner 98]
worked at the compartmental level in demonstrating that cells and columns
form assemblies, but the motivation in that work was in part to prove
assemblies could form in real cortex. For this project, recall that assemblies are
assumed to exist, and the objective was to search for the emergent properties
underlying the formation of assemblies; hence the spike response model
is believed to be an appropriate level of abstraction.
Independent of meeting the requirements for Holland's model is the
motivation for choosing a model operating at the millisecond time scale. Recall
from section 2.3.1 that a unit (whether it be a neuron or a
column) may belong to several assemblies, each representing a feature.
However, it may be the case that a set of features are all present in the current
stimulus, so the question arises as to how a particular feature is distinguished
if a unit belongs to multiple features. Temporal encoding (a correlation
theory) might provide an answer. Basically, a unit signals its participation
in a feature representation at an instant of activation. Sometime later,
the unit may fire again, this time in synchrony with another assembly
representing some other feature. This is more easily facilitated if synchronous
firing is defined in the millisecond range. The main effect is that synchronous
EPSPs elicited by neurons of the same assembly tend to summate more
effectively in pyramidal cells. Pyramidal cells tend to require a large number
of EPSPs in order to reach threshold. [Sterratt 98, Kreiter & Singer 97]
3.1.1 Spike response model
The spike response model [Gerstner & vanHemmen 92, Gerstner & vanHemmen 94]
describes the response of both the sending and receiving neuron to a spike.
The model captures the essential behaviours of a neuron:

- Absolute and relative refractory periods
- Response at the soma resulting from synaptic input
- Pulse travel time (delay)
- Noise

The modelling of noise was excluded in this project under the assumption
that noise is not an underlying principle of emergent phenomena in a neural
network. Noise can be included by passing the 'membrane potential' value
through a sigmoidal function to give a probability of spiking.
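A minimal sketch of how such a noisy threshold could be added: the membrane potential is squashed through a sigmoid to yield a spiking probability. The function name and the noise parameter beta are illustrative, not part of the project's model.

```python
import math

def spike_probability(u, theta=0.0, beta=5.0):
    """Map membrane potential u to a spiking probability via a sigmoid.

    theta is the firing threshold; beta sets the noise level, and as
    beta grows the deterministic threshold rule is recovered.
    Both parameter names are illustrative, not from the thesis.
    """
    return 1.0 / (1.0 + math.exp(-beta * (u - theta)))

# Well below threshold the probability is near 0; well above, near 1.
print(spike_probability(-1.0), spike_probability(1.0))
```

At `u = theta` the probability is exactly 0.5, so the deterministic model is the limiting case of this noisy one.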
The mathematical description of the model (from [Jahnke et al. 99]) is
summarised next:

u_i(t) = \sum_{t_i^{(f)} \in F_i} \eta_i\big(t - t_i^{(f)}\big)
       + \sum_{j \in \Gamma_i} \sum_{t_j^{(f)} \in F_j} w_{ij}\,\varepsilon_{ij}\big(t - t_j^{(f)}\big)
       + I^{ext}                                                    (3.1)

F_i = \big\{ t_i^{(f)};\ 1 \le f \le n \big\} = \big\{ t \mid u_i(t) = \vartheta \big\}   (3.2)

\Gamma_i = \big\{ j \mid j \text{ presynaptic to } i \big\}          (3.3)

where
  u_i(t):                state variable (membrane potential) of neuron i.
  t_i^{(f)}, t_j^{(f)}:  firing times of neuron i and neuron j, respectively.
  F_i:                   set of all firing times of neuron i.
  \vartheta:             if threshold \vartheta is reached by u_i(t), neuron i emits a spike.
  \Gamma_i:              set of neurons j presynaptic to i.
  \eta_i(\cdot):         function to model the refractory period.
  w_{ij}:                synaptic strength.
  \varepsilon_{ij}(\cdot): function to model the normalised postsynaptic potential
                         of neuron i induced by a spike of neuron j.
  I^{ext}:               value representing an 'external analog input current'.
The state of the neuron is described by the 'membrane potential' variable
u at a time t. For this project, a time step \Delta t equivalent to one millisecond
of real time was chosen, both because this is typical for this model, and for
the reasons discussed in section 3.1. The neuron fires if the value of the
'membrane potential' u crosses a threshold. For this project, the threshold
was fixed at zero. The concept of a variable threshold is captured in
the refractory function, hence the threshold is not explicitly represented as a
function in this model¹. There are three additive contributions to the variable
u, essentially calculating a 'voltage':
- Refractory function \eta - Captures the absolute
  and relative refractory periods that occur after a neuron fires. The
  function is a summation of the refractory contributions from the set of
  all firing times F.
- Postsynaptic potential function \varepsilon - Approximates the broadened
  shape of the spike when it arrives at the synapse. This shape is due to
  the chemical and electrical transmission processes at the synapse and
  dendritic tree, and is independent of spike travel time. The function is
  a summation of the postsynaptic potential contributions from the set
  of all firing times F, multiplied by the synaptic weight of the connection
  w, and sum-iterated through all connections. The function is positive-valued
  for excitatory connections and negative-valued for inhibitory
  connections. Refer to figure 3.1.
- External input function - Accounts for non-spiking
  inputs, such as a current injection. In this project, it is used to model
  the input from the retina, and for testing. It is useful for converting
  an analog value into a spike train (high current = high spike rate, low
  current = low spike rate).
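The three contributions above can be sketched as a single update step. This is a simplified stand-in: the refractory and PSP kernels here are plain exponentials rather than the project's tabulated functions, and all names are illustrative.

```python
import math

def srm_update(t, firing_times, presyn, weights, i_ext, theta=0.0,
               tau_refr=10.0, tau_psp=3.0, eta0=-2.0, abs_refr=4):
    """One time step (1ms resolution) of a simplified Spike Response neuron.

    firing_times: recent spike times of this neuron (ms).
    presyn: dict of presynaptic neuron id -> list of its spike times.
    weights: dict of presynaptic neuron id -> synaptic weight
             (negative for inhibitory synapses).
    Kernels are illustrative exponentials, not the thesis's tables.
    """
    # Refractory contribution: sum over this neuron's own spike history.
    u = sum(eta0 * math.exp(-(t - tf) / tau_refr)
            for tf in firing_times if t - tf >= 0)
    # Absolute refractory period: no firing shortly after a spike.
    if any(0 <= t - tf < abs_refr for tf in firing_times):
        return u, False
    # Postsynaptic contributions, weighted by synaptic strength,
    # counting only spikes at least 1ms old (spike travel time).
    for j, spikes in presyn.items():
        u += sum(weights[j] * math.exp(-(t - tf) / tau_psp)
                 for tf in spikes if t - tf >= 1)
    u += i_ext  # external analog input current
    return u, u >= theta  # spike if the (zero) threshold is reached

u, fired = srm_update(t=10, firing_times=[], presyn={0: [5, 7]},
                      weights={0: 0.5}, i_ext=0.2)
```

Note how the fixed zero threshold works here: refractoriness drags u negative after each spike, which is what makes the threshold effectively variable.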
The characteristics of a particular neuron type are manifested in the refractory
function and the size of the set of all firing times F for that neuron.
In the next section, these parameters are detailed for the two neuron types
chosen for this project. The postsynaptic potential function is independent
of neuron type, other than the fact that a synapse from a pyramidal cell will
be excitatory and a synapse from an inhibitory neuron will be inhibitory.

¹ Varying the 'external' input is equivalent to varying the threshold.
[Plot: psp versus t (ms), excitatory and inhibitory curves]
Figure 3.1: Postsynaptic potential functions for an excitatory synapse (upper) and an
inhibitory synapse (lower). The broadened shape as it appears at the synapse is due to
chemical and electrical forces. In the model, it accounts for 'leaky-integrator' type
dynamics.
3.1.2 Pyramidal cell
The pyramidal cell is the most common excitatory cell found in layers II/III
of cortex. Its refractory function is shown in figure 3.2. The depth of the
spike history is determined by dividing the total number of milliseconds of
refractory time being modelled by the number of milliseconds of absolute
refractory time. This formula derives from the fact that the absolute
refractory time roughly determines the maximum spike rate, and there is no
need to store more spikes than there is refractory information available for
use in the refractory function calculation. Thus for this project a
total of nine spikes² were stored for each pyramidal cell. The fatigue effect
(also known as adaptation) is modelled via the firing history. That is, the
'tiredness' of a neuron is the summation of the refractory effects of spikes
occurring in the recent past. Refer to appendix A for more details on the
pyramidal cell model used in this project.

² 48ms of data are stored in the software's refractory data table. The maximum spiking
rate is 200Hz, or one spike every 5ms. So at worst case, nine spikes occur in 45ms, and
the refractory effect of each is simulated.
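The spike-history depth just described reduces to one line; the numbers are those quoted in the text (a 48ms refractory table, and the 5ms minimum inter-spike interval implied by the 200Hz peak rate).

```python
def spike_history_depth(refractory_ms, min_interspike_ms):
    """Number of past spikes worth storing: total modelled refractory
    time divided by the minimum inter-spike interval (the interval
    that fixes the maximum firing rate)."""
    return refractory_ms // min_interspike_ms

# 48ms of tabulated refractory data, one spike per 5ms at the 200Hz
# peak rate -> nine spikes stored per pyramidal cell.
print(spike_history_depth(48, 5))  # → 9
```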
[Plots: pyramidal cell refractory table (refr vs t (ms)) and gain function
(firing frequency (Hz) vs input current)]
Figure 3.2: The left figure depicts the pyramidal cell refractory function, featuring a 4ms
absolute refractory period followed by an exponentially decaying relative refractory period
lasting 44ms. The right figure is the pyramidal cell gain function. It features a peak spiking
rate of 200Hz, and a nearly linear spiking range between 20 and 80Hz.
The gain function for the pyramidal cell is also shown in figure 3.2. It
is a plot of the spiking rate of the neuron given an input current, where
the input current is equivalent to the membrane potential. A pyramidal cell
has a maximum spiking rate of 200 spikes per second, given a reasonable
input current (although pyramidal cells can fire at nearly 300Hz when driven
by current injections [Fransen 96]). The rates seen in awake and behaving
animals are between 20 and 60Hz. The network developed for this project was
adjusted to produce spiking rates in this range for typical inputs.
3.1.3 Inhibitory neuron
Compared to the pyramidal cell, much less experimental data is available
for the inhibitory interneuron [Fransen & Lansner 98]. It is fast-spiking,
due to the slight depolarising effect seen in the refractory function in
figure 3.3, which helps to 'speed' recovery. This yields a periodic bursting
of spikes [Gerstner & vanHemmen 94]. The inhibitory neuron is essentially
non-fatiguing, as seen in the gain function in figure 3.3. Refer to appendix A
for more details on the inhibitory neuron model used in this project. The
refractory function plots and gain plots for both neuron types were produced
from data generated by the software model used in the project. These closely
match the experimental data. The difference is in the smoothness: the
jaggedness seen in the plots is an artifact of the 1ms time-step simulation interval.
[Plots: inhibitory neuron refractory table (refr vs t (ms)) and gain function
(firing frequency (Hz) vs input current)]
Figure 3.3: The left figure is the inhibitory neuron refractory function implemented in
this project. The right figure is the gain function.
3.1.4 Hebbian learning
Recall from sections 2.2.1 and 2.3.1 that Hebbian learning is a general
principle stating that the synaptic efficacy between two neurons should increase
if the two neurons are simultaneously active, and decrease if not.
[Gerstner et al. 99] defines a biologically motivated learning
rule appropriate for the spike response model. This rule was adapted slightly
(mainly simplified) for the project. The algorithm implemented for this
project is described in appendix B.
The algorithm includes a multiplier that acts as a learning rate parameter.
This rate was made run-time adjustable in the project software. The reason
for having an adjustable learning rate is that according
to [Choe & Miikkulainen 98], whose work centred around a model of the
visual cortex, slow learning at the beginning of network cycling is believed to
capture the long-term correlations within the inputs. Fast learning near the
end of the network cycling period allows quick modulation (change causing
change) of the activity necessary for image segmentation. [Choe & Miikkulainen 98]
states that several studies have shown that rapid changing of synaptic efficacy
(that is, fast learning) is necessary for feature binding through temporal
coding. For this project, these assertions were taken into consideration during
the experimentation process.
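The spike-timing rule actually used is in appendix B; as a rough sketch of the idea, a rate-based Hebbian update with an adjustable learning rate and the fixed weight bounds discussed later might look like this. The co-activity test and all names are illustrative.

```python
def hebbian_update(w, pre_active, post_active, rate, w_min, w_max):
    """Increase the weight when pre- and postsynaptic neurons are
    co-active, decrease it when only one is active, and clip the
    result to fixed bounds (the second bounding method described
    in the text). 'rate' is the run-time adjustable learning rate."""
    if pre_active and post_active:
        w += rate
    elif pre_active or post_active:
        w -= rate
    return max(w_min, min(w_max, w))

w = 0.020  # baseline local pyramidal-to-pyramidal weight (table 3.2)
w = hebbian_update(w, True, True, rate=0.001, w_min=0.00525, w_max=2.1)
```

Starting with a small rate and raising it later mirrors the slow-then-fast schedule suggested by [Choe & Miikkulainen 98].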
3.1.5 Initial synaptic weights
The network parameter having the greatest effect on network performance is
probably the initial values of the synaptic weights. One could imagine that
if the weight of a connection from an inhibitory neuron to a pyramidal cell is
very small, then even with Hebbian learning in effect, the inhibitory neuron
may take a very long time to learn to 'shut down' the pyramidal cell, which
is crucial for proper assembly functionality (synchrony and rivalry effects).
Therefore, the initial weights of the connections between neurons were drawn
from biologically derived experimental data. The assumption is that doing
so will initialise the network to a state more prone to the emergence of the
properties under consideration in this project (because biological networks
are already 'tuned' to this state by evolution).
A value appropriate for a particular connection type was drawn from a
normal Gaussian distribution (20% deviation [Fransen & Lansner 98]) around
a value relative to the baseline value of 0.010. Connection type is a
function of the types of the pre- and postsynaptic neurons, and the range of the
connection (either local or long-range). [Fransen & Lansner 98] was the
source for the data, which include adjustments made by the authors to account
for the fact that there are fewer long-range connections in a model than in real
cortex. Table 3.2 shows the relative strengths of each synapse type used in
this project. The connection from an inhibitory neuron to a pyramidal cell
is about 10 times stronger than that between local pyramidal cells, to account
for the small number of inhibitory neurons in the column model. The default
values were adjusted from the Fransen and Lansner values by observing the
peak firing frequency before learning effects, given a typical input (one corner
of the triangle). Working from the fact that pyramidal cells normally fire
between 20-60Hz, the weights were adjusted such that the peak firing rate
was about 45Hz, and about 30Hz on average in the absence of input.
Connectivity classification                        Relative weight
-------------------------------------------------  ---------------
Between two locally connected pyramidal cells
(intra-column)                                     0.020
Between two pyramidal cells connected at
long-range (inter-column)                          0.030
A pyramidal cell connected to an inhibitory
neuron at long-range (inter-column)                0.140
An inhibitory neuron connected to a pyramidal
cell locally (intra-column)                        0.230

Table 3.2: Relative strengths of initial synaptic weights
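Sampling an initial weight for a given connection type can be sketched as follows. The 20% standard deviation and the mean values are from the text and table 3.2; the dictionary keys and function name are illustrative.

```python
import random

# Mean initial weights by (pre, post, range) connection type, taken
# from table 3.2 (relative to the 0.010 baseline).
MEAN_WEIGHT = {
    ('pyr',   'pyr',   'local'): 0.020,
    ('pyr',   'pyr',   'long'):  0.030,
    ('pyr',   'inhib', 'long'):  0.140,
    ('inhib', 'pyr',   'local'): 0.230,
}

def initial_weight(pre, post, conn_range, rng=random):
    """Draw an initial synaptic weight from a Gaussian around the
    connection type's mean with a 20% standard deviation
    [Fransen & Lansner 98]; negative draws are floored at zero."""
    mean = MEAN_WEIGHT[(pre, post, conn_range)]
    return max(0.0, rng.gauss(mean, 0.20 * mean))

w = initial_weight('inhib', 'pyr', 'local')
```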
A problem with the Hebbian learning rule is that nothing prevents a
weight from growing or shrinking without bound. There are at least two
ways to deal with this problem. One is to normalise the weights such that
all weights sum to a constant value. Another method is merely to fix upper
and lower bounds and not let a weight exceed those bounds. This second
method is specified in Gerstner's learning algorithm [Gerstner et al. 99], and
is what was implemented in this project. These bounds are implementation-dependent,
so it was necessary to find suitable values. The minimum and
maximum bounds were chosen to be 20 times smaller and 20 times larger,
respectively, than the mean of the four mean weights shown in table 3.2. The
factor of 20 was chosen based on the observation that the peak firing frequency
when the mean weights were all 20 times those in table 3.2 was about 125Hz,
which is well over the typical rate and well under the maximum rate of
200Hz³.
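The bounds described above follow directly from the table 3.2 values:

```python
# Mean weights of the four connection types (table 3.2).
means = [0.020, 0.030, 0.140, 0.230]
grand_mean = sum(means) / len(means)   # 0.105

# Bounds: 20 times smaller and 20 times larger than the grand mean
# (the factor of 20 was found by trial and error, per the text).
FACTOR = 20
w_min = grand_mean / FACTOR            # 0.00525
w_max = grand_mean * FACTOR            # 2.1
print(w_min, w_max)
```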
3.1.6 Spike delay
The effect of spike travel time on the emergent properties under study in
this project was not known a priori. Therefore, the decision was made to
incorporate spike travel time into the spike response model algorithm⁴. The
delay values were also biologically derived, somewhat. In [Fransen 96], it
is stated that the delay between intra-columnar connections is 1.2ms; thus
for this project, connections within a column consumed 1ms of spike travel
time (recalling that the resolution of the model is 1ms). The presynaptic
release, diffusion and postsynaptic receptor activation process accounts for
about 1ms of this delay. Fransen and Lansner found that assembly operations
were robust to average delays of up to 10ms. This finding is important
because it means units constituting an assembly can extend over several
cortical areas [Fransen 96]. This backs up the assertion made in section 2.3.1
that assemblies can extend over large parts of a network. In fact, Eichenbaum
suggests a scheme of both local subassemblies close to the sensory/motor
areas and disperse assemblies at 'higher' areas [Eichenbaum 93]. For this
project, the decision was made to assume 1ms of delay between columns,
lacking any experimental data from which to draw a value. Thus two columns
spaced 20 columns' distance from each other were modelled with 20ms spike
travel times.

³ This factor was found through trial and error.
⁴ Spike delay is not explicitly included in the formal spike response model, but adding
it was a trivial matter.
3.2 Modelling a column
In [Fransen & Lansner 98] it was shown that columns behave as functional
units in the same manner as single neurons when looking for cell
assemblies. Because one of the objectives of this project was to look for the
hexagonal mosaics theorised by Calvin, which are composed of columns as
units, the decision was made to model a column using neurons, as was done
in [Fransen & Lansner 98], and to treat a column as a unit. A column's firing
activity was calculated as an average of the firing rates of the pyramidal cells
within the column. The activity of the individual neurons was not observed in
this project.
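Treating a column as a unit then amounts to averaging over its pyramidal cells; a minimal sketch (the variable names are illustrative):

```python
def column_activity(pyramidal_rates):
    """Firing activity of a column: the mean firing rate (Hz) of its
    pyramidal cells. Inhibitory neurons do not contribute."""
    return sum(pyramidal_rates) / len(pyramidal_rates)

# Twelve pyramidal cells per column in this model; rates in Hz.
rates = [30, 45, 20, 0, 55, 40, 25, 35, 50, 10, 60, 30]
print(column_activity(rates))  # mean column rate in Hz
```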
[Diagram: a column of pyramidal cells and inhibitory neurons ('InhibiNeuron')]
Figure 3.4: Internal connectivity of a column, showing pyramidal cells and inhibitory
neurons (this is a representation, not an exact depiction). Connectivity is dense within a
column compared to connectivity of cells between columns.
The model of a column was based on the work of [Fransen & Lansner 98],
which was biologically derived. The column model was not specific to any
particular area of cortex (sensory, associative, motor). A column, depicted
in figure 3.4, is composed of 12 pyramidal cells and three inhibitory neurons.
Inhibitory neurons only output locally (within a column), but receive input
from pyramidal cells in other columns (and never from other inhibitory
neurons). Pyramidal cells do not connect to inhibitory neurons within
a column, only to inhibitory neurons in other columns. This is the basis of
the ability of a column to 'shut down' another column, necessary for
synchrony and dominance in rivalry to occur. Pyramidal cells may connect to
pyramidal cells within a column and between columns. Within a column,
pyramidal cells connect to each other densely. Thus local connectivity is dense
and both excitatory and inhibitory, whereas the long-range (inter-columnar)
connectivity is sparse and exclusively excitatory. This means that network-wide,
the cell-to-cell connectivity is strongly asymmetric and sparse. Refer
to appendix C for details on the connection density of each connection type.
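The wiring rules just listed can be captured in a single predicate; a sketch with illustrative type names ('pyr', 'inhib'):

```python
def connection_allowed(pre_type, post_type, same_column):
    """Encode the column wiring rules described in the text:
    - inhibitory neurons output only locally, and only onto pyramidal cells;
    - inhibitory neurons never connect to other inhibitory neurons;
    - pyramidal cells drive inhibitory neurons only in OTHER columns;
    - pyramidal cells may connect to pyramidal cells locally or long-range."""
    if pre_type == 'inhib':
        return post_type == 'pyr' and same_column
    # pre_type == 'pyr'
    if post_type == 'inhib':
        return not same_column
    return True  # pyr -> pyr, either local or long-range

# Long-range connections are therefore exclusively excitatory.
assert not connection_allowed('inhib', 'pyr', same_column=False)
```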
3.3 Modelling cortex
William Calvin's theory of the emergence of hexagonal mosaics is based on
the pattern of connectivity between columns found to occur in cortex. According
to Calvin, axons from a column tend to travel a certain distance
(0.5mm is the rough measurement) before connecting to another column,
sometimes travelling a multiple of that distance. Axons radiate from a
column, so connectivity appears to form annular rings around a particular
column. Imagine three columns, each separated a ring distance from each
other, in an equilateral triangle (refer to figure 3.5). These three columns will
tend to synchronise with each other. Calvin proposes that these triangular
arrays will tend to form hexagons. These hexagons act as units, composed
of an assembly of columns. Hexagons then form assemblies amongst
themselves, representing features or high-level concepts. The connectivity of seven
columns is shown in figure 3.6, such that it is possible to see a hexagonal
formation.
In this project, the default cortex model consists of 100 columns, arranged
in the annular ring fashion just described. This means the columns are laid
out on a flat plane, just like a sheet of cortex. A torus connectivity scheme
[Diagram: four columns (1-4) at equal ring spacings]
Figure 3.5: The foundation of Calvin's hexagonal column assemblies concept is the idea
that two columns already synchronous with each other (1, 2) would enlist a third column
equidistant away (3). Hebbian learning would cause this third column to become
synchronous with the other two. A fourth column (4) could in the future become synchronised
with this group as well, and the process continues.
accounts for the edges. Although the cortex model is biologically derived, it
is not based on any particular area of cortex, but rather is expected to do
the job of sensory, association and motor cortex. [Calvin 96, Calvin 98]
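Under stated assumptions (a 10x10 grid of columns with wrapped edges, one 'ring distance' equal to one grid unit, and Chebyshev distance as an illustrative stand-in for the annular metric), the columns at a given ring distance can be enumerated as follows:

```python
def ring_neighbours(col, r, width=10, height=10):
    """Columns at exactly ring distance r from col on a torus.

    Columns are indexed 0..width*height-1 on a flat sheet whose edges
    wrap (the torus scheme described in the text). The choice of
    Chebyshev distance for the 'annular ring' is illustrative.
    """
    cx, cy = col % width, col // width
    out = []
    for idx in range(width * height):
        x, y = idx % width, idx // width
        dx = min(abs(x - cx), width - abs(x - cx))    # wrap in x
        dy = min(abs(y - cy), height - abs(y - cy))   # wrap in y
        if max(dx, dy) == r:
            out.append(idx)
    return out

# Because the torus has no edges, every column sees the same ring size.
print(len(ring_neighbours(0, 1)), len(ring_neighbours(55, 1)))  # → 8 8
```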
3.4 Modelling the retina
In creating a model of the retina, the decision was made to map line-segment
orientation detectors across the retina. Although this idea was derived from
the orientation columns found in visual cortex, the retina model was not intended to be biologically derived, because it is not relevant to the objectives
of the project. The option of mapping a screen pixel to a column or neuron
was considered, as this seemed the most obvious interpretation of the retina
in Holland's model. However, it seemed as if the network would need an
enormous number of columns in order to cover a screen area containing an
image (triangle) of even a reasonably small size. Therefore, it seemed reasonable instead to create a retina which detected line segments. This of course
removes one level of hierarchy in the representation of the triangle image
used as input to the retina, but this does not change the overall experiment.
The model of the retina as used in the project is shown in figure 3.7. The
retina was broken up into non-overlapping regions within which the pixels
were mapped to a line-segment orientation detector. This detector sampled
eight different orientations of a line. Each pixel within the line-segment
detector area was excitatory or inhibitory, depending on the particular orientation of the line. When the detector was overlaid onto a screen image,
the number of excitatory and inhibitory pixels was counted, producing a
'score' indicating how well the image within that subarea corresponded to a
line segment. Thus eight possible scores were calculated, one for each line
segment orientation. Each line-segment orientation was 'mapped' to a unique
pyramidal cell within a column: the cell was driven with a current (recall from section 3.1.1) whose value was a scaled version of the 'score' for that
line segment. The net effect was that a column became active if any line
segment lay within the detector's pixel region.
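The scoring scheme above can be sketched roughly as follows. This is a minimal reconstruction, assuming binary images and +1/-1/0 pixel masks; the names `segment_score` and `detector_currents`, and the 0.2 scale factor (taken from the default retina input current mentioned in chapter 5), are illustrative.

```python
def segment_score(subimage, mask):
    """Count how well a binary sub-image matches one orientation mask.
    Mask entries are +1 (excitatory), -1 (inhibitory) or 0; lit image
    pixels under excitatory mask pixels raise the score, lit pixels
    under inhibitory ones lower it."""
    return sum(m * p for mrow, prow in zip(mask, subimage)
                     for m, p in zip(mrow, prow))

def detector_currents(subimage, masks, scale=0.2):
    """One drive current per orientation (eight in the project), each a
    scaled version of that orientation's score; negative scores are
    clipped so a poor match simply drives no current."""
    return [scale * max(0, segment_score(subimage, m)) for m in masks]
```

Each resulting current drives the pyramidal cell mapped to that orientation, as described above.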
3.5 Modelling the shift-to-contrast reflex
In short, the shift-to-contrast reflex was not included in the model used in
this project. This feature was too ill-defined in Holland's model to allow an
easy software implementation. For instance, the reflex was said to remain
inhibited until network activity dropped to a certain level, then it shifted to
the point of highest contrast. It was not clear how to go about determining
the 'threshold of inhibition', or how to detect the point of 'highest contrast'
in the image, short of hard-coding the points.
Instead, the positioning of the retina was under user control. The retina,
implemented in a box shape (see figure 5.6 on page 58), could be placed
anywhere in the window containing the image of the triangle by moving the
mouse and left-button clicking. The retina capabilities are discussed further
in the next chapter.
Figure 3.6: The connectivity of seven columns, displayed as filled circles, is shown
such that a hexagonal formation appears around the center column. The open circles are
columns for which connectivity is not shown (excepting connections to one of the seven example columns). Each of the seven columns shown has three rings of connectivity: the first
ring consists of 12 columns at a radius of 5 columns, the second consists of 5 columns
at a 9 column radius, and the third consists of 3 columns at a 15 column radius. This
configuration is a balance between maintaining a biological derivation and the practical
problem of keeping the column count low to reduce simulation time. Torus connectivity is
not shown here for clarity.
[Figure 3.7 graphic: the retina, one line-segment orientation detector ('+' excitatory pixel, '-' inhibitory pixel), and the eight orientations detected.]
Figure 3.7: The retina was implemented as an array of line-segment orientation detectors. One such detector is shown, and the pixel detection mapping for one of the line
segments (the vertical line) is shown. The excitatory pixels representing the vertical line
cover a region three times wider than the expected vertical line, to account for misalignment
that may occur when the retina is overlaid onto an image of a triangle.
Chapter 4
Software
4.1 Overview
Software libraries of the necessary model components (neuron, column, cortex) were not readily available. Thus it was necessary to develop from scratch
all the software used in this project. Although the task was time consuming,
it was beneficial for understanding the workings of the model down to the
smallest detail. Possibly this is essential for truly understanding the underlying principles of the emergent properties of an artificial neural network.
Basically, the software is composed of two parts: a cortex model and
the user interface. The cortex model was designed using object-oriented
methodology: the PyramidalCell and InhibiNeuron classes are derived from
the Neuron class. A Column object is composed of PyramidalCell and InhibiNeuron objects, and the Cortex object is composed of Column objects.
The user interface was written for X Windows [1]. The software executes on a
Sun Sparc workstation running Solaris 2.X [2]. The executable is named Spike.
The remainder of this chapter is an overview of Spike's features. It
is important to understand these features because the terminology is used
throughout chapter 5: the experiments.

[1] X Windows development was greatly simplified by using the Simple X library (libsx)
from Dominic Giampaolo.
[2] A Sun Ultra 5 with 128MB RAM was used throughout development and testing.
4.2 Configurability
Upon startup of Spike, by default the cortex object is not instantiated [3]. The
default properties of the cortex are as described in chapter 3 [4], but the user
can change the following properties before creating a cortex:
The number of columns in cortex.
The number of pyramidal cells and inhibitory neurons in a column.
The mean synaptic weight of each connection type.
The standard deviation from this mean.
The parameters for each ring of column connectivity.
Retina properties.
An option of full column connectivity (all-to-all) in cortex.
The user can create and destroy a cortex object at will, changing static
properties between experiments. Only one cortex can exist at any time. All
experimental data is lost if the cortex is destroyed. The Gaussian distribution of the initial synaptic weights, and the probabilities of connectivity
between neurons within and between columns, are rooted in a random number generator. The generator is seeded by a conjunction of the current time
and the process ID to ensure a different cortex configuration each time one
is created.
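A minimal sketch of this initialisation, assuming the 'conjunction' of time and process ID is an XOR-style combination (the exact operator used in Spike is not stated), with illustrative function names:

```python
import os
import random
import time

def make_weight_rng(seed=None):
    """Seed a private generator from the current time combined with the
    process ID, so each newly created cortex gets a different random
    configuration of weights and connection probabilities."""
    if seed is None:
        seed = int(time.time()) ^ os.getpid()
    return random.Random(seed)

def initial_weight(rng, mean, sd):
    """Draw one synaptic weight from the Gaussian configured for its
    connection type (a mean and standard deviation per type)."""
    return rng.gauss(mean, sd)
```

Passing an explicit seed reproduces a cortex configuration, which is useful when repeating an experiment.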
[3] The reason for not instantiating the object is that it consumes a huge amount of
memory, and the possibility of a memory allocation failure exists.
[4] 100 columns, 12 pyramidal cells and 3 inhibitory neurons per column, 3 rings of annular
connectivity per column, 7x7 array of subareas (segment detectors) in the retina.
4.3 Controls
Once the user creates a cortex, the following controls are available in Spike
for runtime operation:
Starting and stopping network activity.
Adjustment of neuron learning rate. Separate controls for pyramidal
cells and inhibitory neurons. The rate varies between -0.1 and 0.1, thus
in addition to Hebbian learning, anti-Hebbian learning is possible. By
default, the learning rates are zeroed, which disables learning.
The interval in milliseconds at which the learning algorithm is executed.
By default, synaptic weight adjustments (learning) occur every 100ms.
This interval is somewhat biologically derived. In cortex or hippocampus, the learning window probably has a width of 50-200ms [Gerstner et al. 99].
The ability to 'inject current' (manually set the 'membrane potential')
into a particular neuron. Useful for testing and some experiments.
An option to clear all 'injected currents' is also available to undo any
'probes' the user may have setup.
The position of the retina on the image window (which contains the
bitmap of two triangles) is set by left-button clicking the mouse over
the center of the image area of interest.
A recallable stack of retina positions. A position is stored by middle-button clicking the mouse. Multiple positions can be stored. Then,
when the right-button is clicked, the retina begins cycling through the
stored positions at 600ms intervals. This interval is adjustable. Right-clicking again disables the auto-cycling feature. This feature is the
substitute for the 'shift-to-contrast' reex. The user merely sets the
positions of the retina over the three corners of the triangle (or whatever
features to learn), and then enables auto-cycling. Cheap but effective.
The peak firing rate of all columns in the cortex is tracked, displayed
and user resettable. Useful for determining if the network is exceeding
the 80Hz spiking rate that real neurons typically never exceed.
The average firing rate across all columns is also displayed.
The scale of the colour gradient representing average spiking rate within
a column can be adjusted. Specifically, the upper limit could be adjusted to, say, 50Hz, from the default of 80Hz, thus spreading the activity
scale across the 256 colours, making it much easier to differentiate differences in activity.
Six choices of colormaps are available for use as the colour gradient.
Sometimes one set of colours is better than another in differentiating
column activity differences.
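The activity-to-colour mapping can be sketched as below. This is a hypothetical helper, not Spike's code, assuming a simple linear map with clipping at the adjustable upper limit:

```python
def colour_index(rate_hz, upper_hz=80.0):
    """Map an average column firing rate onto the 0-255 colour gradient.
    Lowering `upper_hz` (say, to 50Hz) spreads the scale across the full
    256 colours, making small activity differences easier to see."""
    clipped = max(0.0, min(rate_hz, upper_hz))
    return int(round(255.0 * clipped / upper_hz))
```

With the lower upper limit, the same firing rate lands further up the gradient, which is exactly the spreading effect described above.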
4.4 Displays
The primary means of making experimental observations is by visual inspection of the display windows. Unfortunately, due to time limitations, it was
not possible to develop code implementing any of the analytical techniques
available to study population codes. A few of the techniques are briefly
summarised in [Sejnowski 99]. The display windows in Spike are the following:
A central control window displays messages and has menu items for
the selection of other displays and functions.
The main display of cortex activity is a column activity raster plot.
The unit activity of a column is calculated by averaging the spiking
rate of all the pyramidal cells within the column [5]. The activity of each
column is represented as a colour whose value lies between 0 and 255.
By taking over the system colormap and replacing with carefully graded
colours, it is possible to observe the difference in column activity by
observing colour differences. The raster plot is updated every simulated
millisecond (the model's time step size), and the display stretches wide
enough to show more than 1000 milliseconds, which is sufficiently wide
for most experiments. The drawback to this display is that the column
connectivity geometry is not obvious. However...
A cortex activity display shows the most current column activity
level. That is, the display is wiped clean and updated every millisecond.
The display is a box, and is intended to visualise the connectivity of the
columns across the square 'patch' of simulated cortex. This is useful in
searching for 'hexagonal mosaics' [6].
The object viewing display contains the bitmap image of two triangles. One triangle is fully formed, the other is a partially occluded
copy, for use in testing partial pattern recall. The object display is also
where the user may click the mouse in order to set the retina position.
By default on startup, the retina position is not set, thus the network
is not 'looking' at anything, and no activity will occur if the network is
started. Once the user sets the retina position anywhere in the object
display window, the retina line-segment orientation detection function
is activated, and the columns that are mapped to the retina subareas
will show activity (assuming a line-segment of some sort is within that
[5] The firing rate of the inhibitory neurons is not considered. Their activity is indirectly
observed in the inactivity of a column. That is, an inactive column is assumed to be under
the inhibitory control of another column. [Fransen & Lansner 98] also exclude inhibitory
neuron activity from direct observation.
[6] ...and snausages
subarea).
A reverse retina activity display provides some indication of what
the cortex is 'thinking'. Recall that each subarea of the retina (49 in a
7x7 array by default) is mapped to a column. Recall that within that
column, eight of the twelve pyramidal cells are mapped to a unique
line segment orientation. The reverse retina activity function finds the
most active of these eight pyramidal cells, and then the line segment
orientation mapped to that pyramidal cell is displayed, in a colour
matching its firing rate. Ideally, the segment displayed in this window
should match the line segment in the subarea of the retina in the object
window matching that position. In reality, because the pyramidal cells
have connectivity to other neurons, the pyramidal cell expected to be
the most active is not always so.
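The reverse-retina lookup reduces, in essence, to an argmax over the eight orientation-mapped cells; a sketch, with an illustrative function name:

```python
def reverse_retina_segment(rates):
    """Given the firing rates of the eight orientation-mapped pyramidal
    cells in one column, return the index of the most active cell (the
    orientation the cortex is 'thinking') and its rate, which sets the
    display colour for that subarea."""
    best = max(range(len(rates)), key=lambda i: rates[i])
    return best, rates[best]
```

As noted above, lateral connectivity means the winning cell is not guaranteed to match the segment actually under the retina.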
In the central control window is a menu option to print out (to the
message box and to standard output) the 15 most active columns. This
is another method of tracking assemblies of columns. It is a precursor
to analytical processing of cell activity, assuming standard output is
redirected to a file.
Chapter 5
Experiments
The objective of the experiments in the project was straightforward: to observe the emergent properties under consideration, or the lower-level properties responsible for them. The section headings in this chapter name the property under consideration in each experiment. In each experiment, except for
the first, observations were made of the column activity displays. Data was
not gathered per se. The reason for this is that although it was possible to
create code to dump data to a file, it was decided that writing the software to
perform the statistical analysis of this data would consume too much project
time. The pattern recognition capabilities of the human visual cortex and
association cortex provided a powerful enough data analysis engine for this
project.
5.1 Column dominance
The most critical property for assembly formation is the ability of a unit
to inhibit another. In the simple case of an excitatory neuron paired with
an inhibitory neuron in competition with a like pair, assuming the pairs are
equipped with equal synaptic weights, it is easy to conceptualise a competition between the two pairs. One excitatory cell will begin spiking, triggering
inhibition of the opposite excitatory cell, and the first excitatory cell becomes
dominant. This will continue until fatigue sets in within the dominant excitatory
cell, at which point the other excitatory cell takes over (spikes at the higher
rate). The exchange of dominance is smooth and predictable. But the case
of competition between columns is not so easy to conceptualise because a
column is composed of many excitatory neurons, each having its own refractory history. Also, the connectivity between neurons, both between columns
and within a column, is based on probabilities, so it is not true that all
columns are equivalent. It was not known exactly what to expect from a
column competition.
Two columns were connected together in the biologically derived manner
described in section 3.2 on page 35. Equal amounts of current were injected
into equal numbers of pyramidal cells in each column. The activity of each
column [1] was sampled at intervals and written to a file for later plotting.
Learning (that is, synaptic weight adjustment) was not allowed to occur.
The left plot in figure 5.1 plots the difference between the two activity
levels over time [2]. The positive numbers represent dominance by one column
and negative numbers represent dominance of the other column. The right
plot is a 'filtered' version intended to make it obvious which column was the
dominant one: the values '1' and '-1' represent dominance of one or the other
column, and '0' represents equal column activity.
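The 'filtering' just described is essentially the sign of the activity difference; a sketch, with an optional tie band added as an assumption (the thesis does not say how narrowly a 'tie' was defined):

```python
def dominance_trace(act_a, act_b, tie_band=0.0):
    """Reduce two column-activity traces to +1 (column A dominant),
    -1 (column B dominant) or 0 (tie), as in the right plot of
    figure 5.1."""
    trace = []
    for a, b in zip(act_a, act_b):
        diff = a - b
        trace.append(1 if diff > tie_band else -1 if diff < -tie_band else 0)
    return trace
```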
It is clear to see a periodicity of dominance exchange between columns in
the plots in figure 5.1. Each column is dominant for about 40ms. An additional periodicity was also present. Over the course of 400ms, four 'hiccups'
occur where it appears the dominated column tried to battle for control, only
briefly 'tying' before the other column regained dominance. This pattern repeats every 400ms.

[1] Column activity was measured by averaging the spiking rates of the pyramidal cells within the column.
[2] That is to say, the activity levels were subtracted from each other, yielding a relative value.

[Figure 5.1 graphic: two plots titled 'Column Competition' -- Activity Difference vs t (ms), and Dominant Column vs t (ms).]
Figure 5.1: The left figure plots a difference in column activity level, yielding relative strengths. The right figure is a plot of this same data 'filtered' such that only three values are possible: dominance by one or the other column, or a tie. The plots depict the best-case example of periodicity in column dominance exchange. A column dominates for 40ms at a time. Another level of periodicity occurs in 400ms intervals, and this 'cycle' is depicted in the right figure.
The behaviour displayed in figure 5.1 was not typical by any means. It
is the best example of column dominance exchange found in many runs of
the experiment. From the many runs undertaken in this experiment, the
domination time was found to range in value from about 35ms to 90ms, and
the patterns of periodicity ranged from three to six cycle intervals [3]. The most
common occurrence found during testing was that one column of the pair was
prone to dominate most or all of the time, and the other column could only
reduce the firing activity of the dominant column on a periodic basis. The
randomness of initial synaptic weights and connectivity is believed to be the
reason for this occurrence. One column is 'born' to be a dominant column.
The amount of 'current' input to the columns was also found to have a major
effect on behaviour. Figure 5.2 depicts a competition during which the input
current was increased every 1000ms. The obvious feature visible is that the
competition becomes extremely erratic beyond a certain input level. Current
levels of 0.05 and 0.10 are shown to exhibit 'clean' competitions. Referring
to figure 3.2 on page 30, the firing frequency of a pyramidal cell at these
current levels is between 30 and 50Hz. Recall that the firing rate found in
behaving animals was found to range between 20 and 60Hz.

[3] Figure 5.1 is an example of a 'five cycle interval'.
[Figure 5.2 graphic: plot titled 'External Input 0.05 to 0.4, 0.05 Increments' -- Activity Difference vs t (ms).]
Figure 5.2: As input current is increased by 0.05 every 1000ms (from a starting current
level of 0.05), the effect on a two-column competition is depicted. The competition becomes
less 'clean' and more erratic. Periodicity almost disappears when the current level reaches
0.4 (on the right-most part of the plot).
The experiment was repeated using the column activity raster plot as
the means of observation. The input current level was set to 0.1. Figure
5.3 depicts the results of this experiment. In this experiment, it was necessary to enable learning for about 20 simulated seconds before the dominance
exchange was clearly visible. The learning rate was 0.05 and the interval
100ms, although these particular settings were not significant. Any level of
learning improved the activity range of dominance. This raster plot is an
alternate way of viewing the findings of figure 5.1. The exchange of column
dominance at 80ms intervals is clearly visible. Also visible is a 400ms periodicity. The 'naturally dominant' column is also visible. The peak activity
of the upper column is always greater (blacker) than the peak activity of the
lower column. Repeating the experiment at an input level of 0.4 produced a
display of 'ugly' activity, as expected [4]. Activity was not periodic.

[4] Refer to the far right of figure 5.2.
Figure 5.3: The activity of two columns is depicted as a colour gradient. Peak activity is
black. The exchange of dominance is clear to see, occurring at 80ms intervals. A 400ms
periodicity is also visible.
The finding that certain columns tend to 'naturally' dominate over others
should not affect assembly formation in any significant way. The reason
is that, because assemblies consist of multiple columns, the effect tends to
balance out. The finding that column competition becomes erratic as the
input current is increased beyond the linear range of the pyramidal cell gain
plot is significant. It implies that the initial synaptic weights of an ideal new
network should cause the network to 'idle' at a low average firing rate of
about 20Hz in the absence of input and not much higher than 60Hz given a
good stimulus. Recalling section 3.1.4 on page 31, the default weights were
in fact adjusted to this level following this experiment. The network should
be more prone to 'healthy' assembly formation under these conditions.
5.2 Synchrony
The aim of this experiment was simply to observe synchrony of column activity in the network. The setup involved creating a cortex of 20 fully connected
columns. Current was injected into one pyramidal cell in each of three arbitrarily chosen columns (#'s 5, 10 and 15). The current level was set at 0.15.
The time at which current was injected into each column was spaced apart
by 25ms and 20ms, to eliminate intrinsic synchrony. The delay time between
current inputs was based on the finding from the previous experiment that
columns typically dominate in roughly 60ms intervals. Once current was injected, the current remained at this level throughout the experiment. The
learning rate was set at 0.001, which is quite small. The learning interval
was set at 100ms, and the network was allowed to run for only a short period
(2s simulated time), so overall, learning was minimal. This was done to see
if synchrony requires much Hebbian weight adjustment.
Figure 5.4 shows the results of one such experiment. The left figure is
network activity before any learning has taken place, but after the input
currents have been active for some time (6s simulated time). Synchrony is
not visible. However, in the right figure, after just 2s of learning, synchrony
is clearly visible, when observing the vertical bands across many columns.
It appears that Hebbian learning is vital for synchrony to occur. But
just a touch of learning is required for a network to show quite impressive
synchrony across many columns, most of which are not directly stimulated
with input current.
5.3 Pattern storage and recall
Pattern storage and recall is a basic function of any neural network, but the
manner in which it is performed diers greatly. Within a recurrent network,
storage and recall behaviour can vary dramatically depending on initial parameter settings and the type and degree of learning undergone during storage.

Figure 5.4: Shown are raster plots of 500ms of network activity of a 20 column fully
connected cortex. The left figure is before learning is activated, and the right figure is after
2s of simulated time with learning enabled. Input is to three of the columns. Synchrony is
clearly not present in the left figure, as none of the bands seem aligned vertically. However,
just a small amount of learning causes synchrony across many columns to emerge.
The aim of this experiment was simply to observe the network behaviour.
The default 100 column cortex was created. Only a single pattern was
stored, and that pattern was one corner of the triangle. The corner was
chosen arbitrarily, but for this experiment, it happened to be the right corner
(see figure 5.6 on page 58 for a picture). The learning rate was again configured for slow learning: 0.001 rate, 100ms intervals. Learning was disabled
after the network reached a peak spiking rate of about 45Hz, which took
about 10 simulated seconds. Once the pattern was stored, the stimulus was
removed and the network was allowed to stabilise. The pattern was then
recalled under three conditions: perfect-pattern input, partial-pattern input
and weak-pattern input. The partial triangle image found in Spike's object
viewing window was used for the partial-pattern input (again, refer to figure 5.6). The retina line-segment input current was reduced from the default
0.2 to 0.1 to simulate a weak input pattern.
Figure 5.5 is the raster plot of network activity given a perfect-pattern
input stimulus. The effect is quite striking: the stimulus immediately causes
the entire network to shut down (at time 29509ms). Only a single column was
active by time 29600ms. By time 29650ms, a small number of columns begin
to increase activity. These columns are mapped to the line-segment detectors
of retina, and specically map to the input pattern. Gradually every column
recovers and enters into the usual oscillatory elevated activity state.
Figure 5.5: The raster plot of a 100 column network during a pattern recall process. At time 29500ms, a perfect copy of the stored input
pattern stimulates the network. Immediately, inhibitory effects shut down nearly all activity in the network. Following this, the columns
making up the assembly for this pattern begin to emerge, and activity stabilises on the stimulus. The shutdown effect only occurred if the
network was at an absolute minimum activity level.
The outcomes given the partial-pattern and weak-pattern inputs are similar in appearance and so are not shown in a figure. The partial-pattern input
caused a complete shutdown of about half the columns. The other columns
entered a low-activity firing state. A 'striping' effect of low column activity in
the 'blackout zone' following time 29509ms was present. The weak-pattern
input caused a 'brownout zone', whereby not a single column completely
shut down, but rather entered a near-shutdown firing state of around 5-10Hz.
It is important to state that the shutdown effect just described only occurred if the network was given sufficient time to enter the lowest idle
state possible, which in this experiment was observed to be about 38Hz peak
firing rate. This usually required at least 10-15s of simulated time. If network activity did not 'cool down' to this level, and oftentimes it settled at a
peak rate of about 41Hz, then the input patterns did not cause the shutdown
effect. Instead, a very mild inhibitory effect across some columns was observed for a few tens of milliseconds, then the network settled into the usual
elevated activity state for that pattern, which is visible in figure 5.5 as the
activity beyond time 29900ms.
This unusual shutdown effect is probably an example of the ideal case
of pattern recall. By entering a condition of near-zero activity, the network
makes it plainly obvious which assembly emerges from this shutdown
state. The concept of an assembly as graded activity across all units is
also plain to see from this experiment. An assembly does not necessarily
have to be rigidly defined as the subset of units firing at some instant in
time, but rather as a vector code across all units, possibly averaged over
a window of time. This window might be from time 29600ms to 29900ms,
referring to figure 5.5. This shutdown effect is obviously rooted in the special
pattern of excitatory and inhibitory connectivity in the network, combined
with synaptic weight strengthening through Hebbian learning.
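Reading an assembly as a graded vector code rather than a hard subset can be sketched like this (an illustrative helper, assuming per-millisecond activity traces for each column):

```python
def assembly_vector(column_traces, t0, t1):
    """An assembly as a vector code: the activity of every column
    averaged over the window [t0, t1) -- e.g. 29600ms to 29900ms in
    figure 5.5 -- rather than the subset of columns firing at one
    instant."""
    width = t1 - t0
    return [sum(trace[t0:t1]) / width for trace in column_traces]
```

Comparing such vectors across recall trials would be one simple way to quantify assembly similarity without relying on visual inspection.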
5.4 Anticipation
The property of anticipation was not considered in any experiment in this
project. It was deemed too difficult to attempt to visually identify the dynamics of what is supposed to take place (refer back to section 2.2.1 on page
14).
5.5 Hierarchy
The aim of this experiment was to identify the formation of an assembly
that emerged as the result of activity of other assemblies. This new assembly
represents whatever it is that the other assemblies represent as a whole. This
is the concept of hierarchy. In this project, the three corners of a triangle
should each form an assembly of their own, and a fourth assembly should
form as a result of Hebbian learning strengthening connections to columns
synchronous to any column active in any of the 'corner' assemblies. Once
formed, then later in time when the three corner assemblies activate, even
weakly, then the 'triangle' assembly should also activate, thus identifying the
presence of a triangle.
To teach the network the three corners of the triangle, the 'retina auto-cycling' feature of Spike was used. This function allows the storage of multiple retina positions, which are then recalled automatically at a preset interval. The interval for this experiment was set to 600ms, thus each corner was
visited every 600ms. Refer to figure 5.6.
The 600ms interval was chosen under the assumption that it allowed
enough time for the stimulus to 'settle down' and for learning to take place on
this settled pattern. In pre-experiment trials, it was found that the changeover time from one pattern to the next required about 200ms for the old
assembly to die out and the new one to form [5]. The learning interval was set
to 100ms (thus the Hebbian learning algorithm was executed every 100ms).
Recall from section 4.3 on page 43 that 100ms is a somewhat biologically
derived number. The learning rate was set at 0.001, which is very small.
This meant that learning proceeded slowly. The assumption was that the
slow learning rate reduced the probability that the network was learning the
wrong assemblies, such as those appearing during stimulus transition periods
and settling time [6]. It makes sense that learning should proceed slowly, from
the fact that for just about any neural network, small weight adjustments are
best for maximising the proper storage of data. The stopping criterion was any
column reaching a peak spiking rate beyond 70Hz. This was chosen somewhat
arbitrarily, but is based on the fact that the upper bound on spiking rate of
behaving animals is about 60Hz. The peak spiking rate given input stimulus
before learning was found to be about 41Hz in pre-trial experiments, thus
the assumption was that if learning produced a network spiking at greater than
70Hz, then it probably was not going to learn much more.

[5] Curiously enough, 200ms is about the time required for a real-world stimulus to register
in the human consciousness.
[6] These are not true assemblies, but rather just whatever the network produces at that
instant in time.

Figure 5.6: Each figure is a snapshot of Spike's 'object' window, depicting the input
patterns that the 'retina' may 'look' at. The retina is the square box. The line segments
detected by the retina matching the lines of the triangle are shown in thick line. The
'dingle-berries' that hang from some lines are an artifact of imperfect alignment of the
retina over the image.

Figure 5.7 shows 'before-and-after' snapshots of the raster plots of column
activity resulting from the stimulus of each corner. The left figure is near the
beginning (at time 2801ms) and the right figure is at the end (time 83393ms)
of network activity. Each figure is actually a conglomeration of snapshots,
where from left to right are the top, left and right corners of the triangle (refer
back to figure 5.6). It was hoped that if a 'hierarchical' assembly formed,
then it should be visible as a set of active columns not belonging to any
of the three corner column assemblies, and not present in the original set
of assemblies. Determination of this instance was subjective, but it was
the only method deemed practical for this project (lacking high-powered
mathematical analysis).
Figure 5.7: Snapshots of column activity resulting from stimulus from each corner of the triangle. The images were contrast enhanced
and colour inverted to bring out clarity of the active columns. Black is the colour of highest activity. The left figure is at time 2801ms,
before learning has taken place, and the right figure is at time 83393ms, after learning. Within a figure, the activity depicts the top, left
and right corners respectively. The only 'hierarchical' assembly visible is marked, although none of these columns are excluded from the
corner assemblies, thus it is not a true hierarchical assembly.
Recall from section 2.2.1 on page 15 Holland's claim that a truly hierarchical assembly is one consisting of units not belonging to any assembly
from which the hierarchical assembly developed. Following this definition, it does not appear as if any hierarchical assembly has formed (referring
to the right figure of figure 5.7). There does not appear to be any column
that is active simultaneous to all three assemblies representing each corner,
and not also a member of any of the corner assemblies. Rather, the columns
that are active across all corner assemblies are shared in some way with one
or more of the other corner assemblies.
It makes perfect sense that a set such as this should form. Recalling that
columns show after-activity7, it seems logical that any column in the new
stimulus (the next corner) that is at all synchronous with any columns from
the after-activity will be 'rewarded' by a strengthened connection.
Another significant feature seen in figure 5.7 is that it appears as if competition between columns within an assembly has caused certain columns to
dominate. The result is that a true assembly has formed for each corner, not
just an assembly consisting of columns that are mapped to the retina, which
receive stimulus all the time. It appears from comparing assemblies in the left
and right figures of figure 5.7 that some columns have been trimmed away,
presumably by the development of inhibitory connections from the dominant
columns8.
5.6 Hexagons
Recall from section 3.3 on page 36 Calvin's hypothesis that the connectivity of columns as found in cortex promotes synchrony among columns in a
triangular geometry such that hexagonal shaped column assemblies emerge
7 After-activity is the persistence of assembly activity beyond the removal of the stimulus.
8 The effect is clear to see on a full colour display.
as representations of features. Upon completion of the project software and
initial testing of activity, it became clear that it would not be possible to test
this hypothesis because of the computational resources required. Basically,
simulating a large enough number of columns from which hexagons might
be expected to emerge and 'compete' would require far too much system
memory and CPU power. Instead, for this project, the most basic assumption of Calvin's hypothesis was tested: whether columns tend to synchronise
in a triangular manner (refer to figure 3.5 on page 37).
The experiment setup involved creating a 900 column cortex, and all
other parameters remained at the defaults. The exceptionally large cortex
size was necessary so that a sufficient number of columnar rings (surrounding
a column) existed thus increasing the likelihood that triangular formations
could form. A current was injected into one pyramidal cell in the column
situated in the center of the cortex. The current level was such that the cell
spiked at about 100Hz, and the level remained constant throughout the experiment. The learning rate for both pyramidal cells and inhibitory neurons
was set to 0.001: the usual slow learning rate. The network was allowed to
run until it became clear that two sets of stable columns emerged: highly
active, and highly inactive.
Figure 5.8 depicts the network activity at the beginning and end of the
experiment. It shows the spreading activation away from the
center column. The ring connectivity is visible at time 108ms and the second
ring is visible at time 110ms. At time 112ms, the columns making up the
rings begin to form rings themselves. It is clear that the network will very
soon reach a point where the rings mix together with each other. The torus
connectivity of the network9 allows the rings to continue spreading across
the network. At time 6643ms, learning has strengthened connections between
9 Columns near the top, bottom and sides have connectivity with columns on the opposite side of the network.
synchronously active columns. Referring to figure 5.8, a checkerboard pattern
of highly active columns has formed.
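The torus wrap-around described in footnote 9 amounts to modular index arithmetic. A minimal sketch, assuming the 900 columns are laid out on a 30x30 grid (the simulator's actual ring-connectivity offsets are not reproduced here):

```python
SIDE = 30  # 30 x 30 = 900 columns, an assumed layout for this sketch

def torus_neighbour(row, col, drow, dcol):
    """Return the column reached by stepping (drow, dcol), wrapping at the edges."""
    return ((row + drow) % SIDE, (col + dcol) % SIDE)

# A column on the top edge connects to the bottom edge, and so on.
print(torus_neighbour(0, 0, -1, 0))   # (29, 0)
print(torus_neighbour(29, 15, 4, 2))  # an 'up 4, over 2' step from the bottom row -> (3, 17)
```

Because every index is taken modulo the grid side, the spreading rings never hit a boundary, which is what allows the triangular synchronicity to criss-cross the whole network.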
Figure 5.8: Shown is the beginning and end of activity in a network having only the center
column receiving stimulus. The figures depict activity at times 108ms, 110ms, 112ms and
6643ms (the network begins at time 100ms). The first ring belonging to this center column
is visible at time 108ms. At time 110ms, the second ring is visible. By time 112ms, the
network has begun 'mixing' rings together. Much later, at time 6643ms, after learning has
'burned-in' the network, a patchwork of highly active columns is visible. The patchwork
is a result of the torus connectivity of the network, allowing triangular synchronicity to
criss-cross the network. The columns forming triangles are spaced in an 'up 4, over 2'
manner.
It seems from this experiment that columns synchronising in a triangular
manner is possible. But this is not a big surprise given that the network
connectivity almost forces this to happen. The important question is whether
hexagonal shaped column assemblies could form from this type of activity.
It does not appear from this experiment that they could. It is not clear what
'force' would encourage hexagons to form over other shapes. It seems as if
arbitrary 'wiggly-chain' shapes would form, as suggested in figure 5.9.
Figure 5.9: Columns which synchronise with other columns in a triangular manner are
not believed to be likely to form a hexagonal shape assembly, but rather an arbitrary 'wiggly-chain' shaped assembly. An example of a possible shape out of the millions that could form
is shown.
An assembly of this type could loop around and form a ring, but an almost
infinite number of assembly shapes can be imagined. The determining factor
in shape formation would be the stimulus from the external environment
coupled with competition with already formed assemblies. The shapes are
reminiscent of protein chains created under the rules of protein-folding. At
a slightly smaller level of abstraction are the organic molecules formed from
carbon and hydrogen.
Arbitrarily shaped chains of columns forming an assembly does not eliminate the possibility that chains could 'compete' with other chains in the
Darwinian process that William Calvin postulates10.
10 I hereby suggest the name 'chain-gangs' to describe these battling assemblies of columns.
Chapter 6
Discussion
An attempt is made in this chapter to put into context the variety of observations made during the experiments. Recall that an objective of this
project was to get a handle on the workings of some of the properties of a
neural network in the hope that the knowledge would be useful to engineer
intelligent systems of some sort.
6.1 At the edge of order and chaos
The network used in this project is a complex dynamic system. This type of
system is defined as having many individual components, which taken as a
whole have a state. An update function is applied iteratively which updates
this state. The system may enter one of the following phases as iterations
proceed [Kearney 93]:
1. The system tends toward a state of equilibrium.
2. The system oscillates or moves in a closed loop, a periodic behaviour.
3. The system enters into quasi-periodic, chaotic or non-repeating behaviour1.
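These three phases can be illustrated with a toy iterated map (the logistic map, not the network itself), where the parameter r plays the role of a control such as the input current. This is a sketch for intuition only:

```python
def iterate(r, x=0.2, n=1000):
    """Iterate the logistic map x -> r*x*(1-x); return the last few states."""
    for _ in range(n):
        x = r * x * (1.0 - x)
    tail = []
    for _ in range(8):
        x = r * x * (1.0 - x)
        tail.append(round(x, 4))
    return tail

print(iterate(2.5))   # phase 1: settles to the fixed point 1 - 1/r = 0.6
print(iterate(3.2))   # phase 2: oscillates between two values (periodic)
print(iterate(3.9))   # phase 3: chaotic, non-repeating behaviour
```

Sweeping r through these regimes is loosely analogous to sweeping the input current in figure 5.2: the same update rule produces equilibrium, oscillation or chaos depending on a single parameter.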
1 This state was noticed in the very first neural network simulation by Rochester and Holland et al. in 1956. Midway through a simulation, activity was stopped and four copies of the network state were made. Then the thresholds of a few neurons in each copy were changed in small random ways. The simulation was then restarted on each of the networks. All of them quickly diverged into very different states. It is typical of chaotic systems to be greatly influenced by initial conditions.

In recent years, from the study of complex dynamic systems has emerged
the idea of a fourth condition: a phase transition zone between the ordered
state of periodic behaviour (2) and the chaotic state (3). 'Complexity' is
said to emerge from systems in this state, an example being life. It has also
been suggested that evolution tends to push these systems to this state of
complexity [Waldrop 92] pg 303.

From looking at figure 5.2 on page 50, which depicts the competition
between two columns as the stimulus is increased, it might appear as if this
system is passing from phase 2 (periodic behaviour) to phase 3 (chaotic). Of
course this suggests that the system might also have crossed the so-called
'edge of order and chaos', but for a system this small, it is not clear how to
even identify this region. An interesting question is whether the input current
level at this transition region is in any way a mathematically determinable
number. It is also important to note that the input level is only one way
of controlling the system behaviour. Varying any of the network parameters
listed in section 2.2.2 on page 16 would also change the state of the system.
Obviously, the task of finding the set of parameters of a large neural network
such that it operates in the 'complexity zone' is rather daunting. Intuitively
it seems as if sensing and acting within an environment is the only path to
this zone.

The input current level directly translates into how a neural network
senses the environment. This is really where emergent behaviour of an artificial neural network begins to have meaning. The manner in which a system
interacts with its environment is what makes the system 'interesting' from
the human perspective. The evolutionary force operates within an environment of 'agents', and of course an agent must sense this environment if it is
to survive. It also helps to act within the environment. This requires some
kind of motor control, discussed later, in section 6.5.
6.2 What's so special about columns?
The evolutionary force seems to have driven neuron organisation in the cortex
into column formations. The obvious question is to what benefit there is
in doing so. A possible explanation is offered in [Fransen 96]: functionally
similar cells in cortical columns may exist because the duplication of cells is
necessary in order to support a large enough number of connections to other
parts of the cortex.
Connections to neurons in an ANN are not space limited. A connection
is limited only by memory requirements2. The observations made in the
experiments in this project suggest another benefit of column connectivity.
The dense connectivity within a column promotes reverberation within the
column, thus extending the length of time a feature (which is represented by
an assembly of columns) exists. The refractory and fatigue effects of a single
neuron, which might tend to make a cell assembly short lasting and more
unstable, are averaged out across the many neurons making up a column.
It seems as if this extended activity facilitates temporal binding because the
window of peak activity is larger than that of single neurons. This hypothesis
might explain the finding in [Fransen 96] that column assembly operations
were robust to delays up to 10ms on the average, thus are able to extend
over several cortical areas.
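The averaging argument can be caricatured with a toy calculation: if each neuron's window of peak activity is short and jittered, the column's window (the union of its members' windows) is wider than any single neuron's. The onset jitter and durations below are invented for illustration only:

```python
import random

random.seed(0)

def neuron_window():
    """(start, end) of one neuron's peak-activity window, in ms (invented numbers)."""
    start = random.uniform(0.0, 3.0)      # onset jitter across the column
    duration = random.uniform(2.0, 5.0)   # activity before refractory/fatigue effects
    return start, start + duration

windows = [neuron_window() for _ in range(100)]  # 100 neurons in one column
column_window = max(e for _, e in windows) - min(s for s, _ in windows)
longest_single = max(e - s for s, e in windows)
print(round(column_window, 1), round(longest_single, 1))  # column window is the wider one
```

The column's binding window is necessarily at least as wide as its widest member's, which is the intuition behind the robustness to multi-millisecond delays.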
2 Actually, this is only true for software simulations. A hardware implementation of a neural network suffers from space limitations. [Maass & Bishop 99] describes various hardware implementations of pulsed neural networks.

Some interesting questions emerge from this experiment. Do cell assemblies and column assemblies exist side by side? If so, at what scale?
Does each possess some unique characteristic? Perhaps the synchronous
binding of single neurons, which occurs on a time-scale of two to five milliseconds [Sterratt 98], is necessary for quick motor activity. The slower
binding that occurs between columns might be advantageous for 'high-level'
thought and planning. If both cell and column assemblies exist side by side,
how much do the building blocks that each may represent blur together across
multiple levels of hierarchy?
6.3 Synchrony
In the experiment described in section 5.2 on page 52, synchronous activity
was observed to emerge from the network. This was a fascinating thing to
see, especially because it seemed as if a bit of Hebbian learning was necessary to
push the network into this state, although quite a bit more experimentation
is necessary to validate this claim. An interesting question concerning the
synchronous state that the network settles into is to what extent the final
equilibrium point depends on stimulus, versus the intrinsic connectivity
before learning. For a given equilibrium state S (of synchronous activity), it
seems possible to arrive at this state S from two or more starting points: stimulus X operating on connectivity Y, or stimulus Z operating on connectivity
Q3. Conversely, given identical stimulus, and dissimilar connectivity Y and
Q, the final state will be different. It seems as if this idea is the foundation
of long-term associative learning, where current stimulus blends with past
experience (network connectivity) in a unique way for every neural network;
in turn the synchrony is unique, followed by hierarchy building through Hebbian
learning.

3 Where X, Y, Z, and Q represent distinct and dissimilar states.
6.4 Hierarchy
Recall that in the experiment described in section 5.5 on page 57, the formation of hierarchical assemblies of columns did not appear in the manner that
Holland's theory predicts. Holland's theory predicts that the hierarchical
assembly is a distinct set of units separate from the units constituting the
low-level assemblies. Instead, the experiment seemed to indicate that the
units belonging to the union set of units shared between all the low-level
constituent assemblies were the basis of a possible hierarchical assembly. This
idea is shown in figure 6.1. In further analysing Holland's theory, it seems
implausible to expect hierarchical assemblies to form from units that are
commonly connected to low-level units while the connectivity from the
hierarchical units to the low-level units is sparse. The claim is that this sparse
connectivity prevents the hierarchical assembly from inhibiting the low-level
units. This implies that hierarchical formation does not occur if this type
of connectivity does not happen to exist between low-level feature units and
'hierarchical' units, which in turn implies that it is impossible to learn certain things,
because the requisite connectivity is just not there. It seems far more plausible that hierarchical assemblies should form from the union set of units, as
was found in the experiment. However, nothing precludes both mechanisms
from acting simultaneously and producing two sets of hierarchical assemblies
representing the same high-level feature.
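The distinction between the two definitions can be made concrete with set operations. A sketch using hypothetical column index sets for the three corner assemblies (the real assemblies were judged by eye from raster plots, not stored as sets):

```python
# Hypothetical corner assemblies, given as sets of column indices.
top, left, right = {1, 2, 3, 9}, {4, 5, 6, 9}, {7, 8, 3, 6}
all_low_level = top | left | right

# Holland's prediction: hierarchical units active with all three corners
# yet belonging to none of them. 'active_with_all' stands in for the
# columns observed firing alongside every corner assembly.
active_with_all = {3, 6, 9, 42}
holland_hierarchy = active_with_all - all_low_level          # {42}

# What the experiment suggested instead: columns shared between corners.
shared = (top & left) | (top & right) | (left & right)       # {3, 6, 9}
print(holland_hierarchy, shared)
```

Under Holland's definition only column 42 qualifies; under the union-of-shared-units reading, the candidate hierarchy is {3, 6, 9}, every member of which also belongs to a corner assembly.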
6.5 Closing the loop
Systems constructed from a neural network of the type used in this project
become interesting if the 'loop' is 'closed'. That is, missing from the software
in this project was any kind of motor control over the environment. Recall
that the 'shift-to-contrast reex' of Holland's model served this function,
but was not implemented in this project due to time considerations and
[Figure 6.1 diagram: the top drawing, labelled 'theorised hierarchy not found', shows a hierarchical unit v ('triangle') above units r, s and t; the bottom drawing, labelled 'is this hierarchy?', shows 'triangle?' over units r, s and t.]
Figure 6.1: Shown are two ways that hierarchy might develop in a network of units.
The top drawing depicts the way Holland theorises hierarchy develops. The hierarchical
units are not a subset of the underlying units. These units could not be identified in
the experiments conducted for this project. Instead, units belonging to the union set of
underlying assemblies seemed to be the only units forming hierarchical assemblies (bottom
drawing). Also refer to figure 2.3 on page 16.
implementation difficulties.
As mentioned earlier, a system should be able to respond in some way
to the environment that it is sensing in order to increase the likelihood of
survival. Mammalian cortex is roughly divided into three functions: sensory,
association and motor. The cortex simulated in this project was general in
its connectivity: it did not mimic any particular function, but performed the
general functionality of all three. This suggests that the simulated cortex
in this project might serve as the control unit of a system (robot?). The
question becomes how a network of this type is trained to respond with the
desired output. Evolution is an obvious force, but a reinforcement learning
algorithm might also control the learning rate4.
4 Recall that anti-Hebbian learning is possible in the network constructed for this project, so it is possible to 'punish' the system for bad behaviour.

6.6 Breaking the rules
Recall that one motivation for choosing a biologically motivated network is
the assumption that evolution has searched the parameter space for 'ideal'
parameters. Some questions naturally arise from this. What hasn't evolution
discovered yet, or cannot ever discover, in terms of parameter state space?
Ignoring for the moment the idea that evolutionary forces are environment
dependent, is it possible that there is an idealisation of the biologically derived parameters? Which biologically derived factors should be eliminated
or altered? Biological neural networks are limited by chemical and electrical
properties, but software simulated neural networks are not.

Recall that the properties that Holland claims are necessary for emergence
to occur from a neural network are cyclic connectivity, variable threshold, fatigue and Hebb's rule (refer back to section 2.2 on page 10). Connectivity
issues are explored in [Choe & Miikkulainen 98], among others. The mechanism behind variable threshold is the refractory function. Perhaps this could
be altered. Recall that the refractory function was implemented as an array of data in the software used in this project. This means the refractory
function is unlimited. Perhaps a GA could be used to search the parameter
space to create a pyramidal cell with a more useful gain function, one that
is perfectly linear throughout a wider input range. Possibly the refractory
function could be dynamic, changing in response to some other function.

The fatigue effect was implemented in this project software via the refractory function and spike history. The threshold was not explicitly manipulated. An alternative to this is to implement a function executed at
intervals to adjust the threshold based on activity, directly analogous to the
Hebbian learning function. Additionally, regulating the intrinsic excitability of the neuron in an activity-dependent manner solves the problem of
how to maintain neuron responsiveness to input at times of intense synaptic
change [Desai et al. 99].
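The threshold-adjustment alternative could be sketched as a periodic homeostatic update: raise the threshold when a neuron's recent firing rate is above some target, lower it when below. The target rate, gain and floor below are placeholders, not values from the simulator:

```python
def adjust_threshold(threshold, recent_rate_hz, target_hz=10.0, gain=0.01,
                     floor=0.1):
    """Nudge the firing threshold toward an activity target (homeostasis sketch)."""
    threshold += gain * (recent_rate_hz - target_hz)
    return max(threshold, floor)  # never let the threshold collapse to zero

t = 1.0
for rate in [40.0, 30.0, 20.0]:   # an over-active neuron is progressively damped
    t = adjust_threshold(t, rate)
print(round(t, 2))  # 1.6
```

Run at intervals, alongside (and in the same spirit as) the Hebbian update, this keeps a neuron responsive even while its synaptic weights are changing rapidly.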
The data constituting the learning window (figure B.1, pg 95), which was
also biologically derived, might also benefit from optimisation. Primarily, it
could be adjusted on the fly depending on current learning conditions (as
determined by some externally executed heuristic). Perhaps the heuristic
could account for the 'emotion' associated with a given stimulus from the
environment. Similarly, it could slow or disable learning during periods of
exposure to unimportant stimulus.
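Adjusting the window on the fly could be as simple as scaling the stored window data by a modulation factor before each learning interval, with a factor of zero disabling learning outright. The window values here are invented for illustration, not the data from figure B.1:

```python
# A toy learning window: weight change as a function of the pre/post
# spike-time difference (values invented for illustration).
base_window = [-0.2, 0.1, 0.8, 1.0, 0.4, -0.1]

def modulated_window(window, salience):
    """Scale the learning window by a salience/'emotion' factor in [0, 1]."""
    return [salience * w for w in window]

print(modulated_window(base_window, 1.0))  # full-strength learning
print(modulated_window(base_window, 0.0))  # learning disabled
```

Intermediate factors would slow learning during exposure to unimportant stimulus without changing the shape of the window.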
Chapter 7
Conclusion
The objectives of this project were:
To construct a neural network capable of exhibiting emergent properties
normally manifested by a network of real neurons.
To confirm the appearance of those emergent properties, and to make
observations on the conditions that affect the properties.
To make observations of any useful or interesting behaviours exhibited
by the network.
The achievements of this project included:
Constructing a software simulator of a patch of cortex incorporating
many biologically plausible features.
Observing two kinds of emergent properties.
Making observations of network behaviour that seemed to occur only
within narrow boundaries.
Making predictions based on observations made during the experiments.
The achievements are described in the next section, followed by sections
describing the shortcomings of the project, suggested enhancements to the
software, and possible future research directions.
7.1 Achievements
7.1.1 Software
The well known spike response model was the basis of the implementation of
an accurate simulation of a pyramidal cell and an inhibitory neuron. These
neurons were organised into columnar structures as found in cortex, and
these columns were connected in a sparse manner, also as found in cortex.
The construction of a simulator incorporating these features is believed to
be original work. The software included numerous features useful for pulsed
neural network exploration, including extensive configurability, and real-time
controls and colour-motion displays for X-Windows. The software was designed to be extensible, and proved to be very stable. It may be of use to
the computational neuroscience community.
7.1.2 Observations of emergent properties
Synchrony
Synchrony was found to occur among groups of columns, where columns were
treated as units in the same way as neurons are normally treated in standard
neural networks. The rather striking observation was made that just a small
amount of Hebbian learning seemed to be necessary to push the columns
into synchrony. It was also observed that a large number of columns would
synchronise, far more than were under stimulus.
Hierarchy
The emergent property of hierarchy seemed to appear, although under disputable conditions. That is to say, an assembly of columns representing a
high-level feature (a triangle) seemed to form, but not in the manner predicted by emergence researcher John Holland.
7.1.3 Observations of unusual behaviours
Instability of column competitions
Observations were made of the competition between two units (columns),
where the most active unit would inhibit activity in the other unit for a short
time, before roles were exchanged. At least two 'frequencies' of periodicity
were found to occur (approximately 40ms and 400ms in one experiment).
Column competitions were found to be highly influenced by the level of the
input stimulus.
The edge of network shutdown
An unusual behaviour was observed during a pattern recall experiment.
After a single pattern was stored in the network via the Hebbian learning
process, the re-presentation of the pattern unexpectedly triggered the near
complete shutdown of activity in the entire network. This shutdown occurred
only when the network 'cooled' to its minimum 'idling' level. If the network
idled only a couple of Hz above this level, the pattern recall process occurred
as expected from a recurrent network.
7.1.4 Predictions made based on observations
Possible reason for the existence of columns
The formation of columnar structures in cortex, which exhibit dense connectivity within the column and sparse connectivity between columns, begs the question of why columns have come to exist. Observations made during experimentation suggest that such a configuration promotes binding across a wider
number of units (columns) because column activity persists longer than single
neurons.
Revision of the hexagonal mosaic theory
[Calvin 96] theorises the existence of hexagonal formations of columns acting
as assemblies. Observations made during experiments in this project suggest
that hexagons would not necessarily emerge as the solitary shape of a column
assembly. Instead, arbitrary chain-like patterns, similar to those of protein
chains, are suggested as being the most common shape.
7.2 Shortcomings
The primary shortcoming of the project was probably the subjective nature of
the observations. Observations were made of raster plots and simple graphs,
where formal analytical processes might have made the assertions in
the previous section more convincing. Another shortcoming was the lack of
deep inquiry into any particular observation. The project aim of 'discovering
the underlying principles of emergent phenomena' had no chance of success
whatsoever given the broad manner in which observations were made.
However, supposing that a broad coverage of topics is nonetheless fruitful,
the areas not covered due to project time limitations included:
Observing the emergent property of anticipation.
Exploring the effects of spike delay.
Conducting tests of scaling column size (in terms of the number of pyramidal cells and inhibitory neurons constituting a column) and varying
ring connectivity parameters.
Exploring the learning rate and learning interval parameters of the
pyramidal cell and inhibitory neuron.
Searching more thoroughly for the existence of hexagonal mosaic or
chain formations, or lack thereof.
7.3 Enhancements to the software
Described next are a number of enhancements that could be made to the
simulator:
The ability to save and load the state of the cortex, so that experiments could be saved for later study. Something similar to the object
serialisation process as found in the Java programming language is the
goal. Unfortunately, doing this in C++ is not so easy.
A function that analyses the synaptic weights could be added as a menu
item. The function should nd the average weight, and the number of
synapses at the maximum and minimum weights. It would also be
useful to organise this by neuron and connectivity type.
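The weight-analysis function above could be sketched as follows, assuming the synapses are available as a flat list of weight values together with their configured bounds (the per-neuron and per-connectivity-type breakdown is omitted):

```python
def weight_stats(weights, w_min=0.0, w_max=1.0):
    """Average weight plus counts of synapses pinned at the bounds."""
    return {
        "average": sum(weights) / len(weights),
        "at_min": sum(1 for w in weights if w <= w_min),
        "at_max": sum(1 for w in weights if w >= w_max),
    }

stats = weight_stats([0.0, 0.25, 0.5, 1.0, 1.0])
print(stats)  # {'average': 0.55, 'at_min': 1, 'at_max': 2}
```

The counts at the bounds are the interesting part: a large fraction of saturated weights is one symptom of an 'over-cooked' network.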
A function that sets a trigger to disable learning could be added. If
the network reaches a certain peak firing rate, say 70Hz, then learning
could be disabled (by setting the learning rates to zero). This would
allow running overnight without 'over-cooking' the network.
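The trigger itself is a one-line check run each interval; a hypothetical sketch (the 70Hz cut-off and the dictionary of learning rates are assumptions about the interface):

```python
def maybe_disable_learning(peak_rate_hz, learning_rates, cutoff_hz=70.0):
    """Return zeroed learning rates once the peak firing rate exceeds the cut-off."""
    if peak_rate_hz > cutoff_hz:
        return {name: 0.0 for name in learning_rates}
    return dict(learning_rates)

rates = {"pyramidal": 0.001, "inhibitory": 0.001}
print(maybe_disable_learning(65.0, rates))  # unchanged
print(maybe_disable_learning(75.0, rates))  # all zeroed: stop 'cooking' the network
```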
A function to remember a specified assembly and the ability to track
this assembly (by monitoring for column activity levels falling within a
percentage of deviation?) could be added.
The ability to determine the time delta between two points on the
column raster plot. For instance, a left-button mouse click within the
raster plot display box could set the start time, and a right-button
mouse click could cause the number of milliseconds between the two
points to be displayed, along with the frequency.
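The delta computation behind the proposed mouse interaction is straightforward; a sketch assuming the two clicks yield times in milliseconds:

```python
def raster_delta(start_ms, end_ms):
    """Milliseconds between two clicked points, and the implied frequency in Hz."""
    delta = end_ms - start_ms
    freq_hz = 1000.0 / delta if delta else float("inf")
    return delta, freq_hz

print(raster_delta(2801, 2826))  # (25, 40.0) -> a 40Hz oscillation
```

This would make observations such as the approximately 40ms and 400ms competition periods measurable directly from the raster plot display.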
A port to the Microsoft Windows operating system. Mainly this would
involve re-implementation of the user interface code, which was written
for X-Windows. The cortex object and its components are all easily
portable.
7.4 Direction of future research
Suggestions for the direction that research could take fall in two categories:
theory building, and practical application building.
7.4.1 Theory building
This project produced a hypothesis for the existence of the columnar structure in cortex. This could be developed further by demonstrating that due
to the long spike travel time between distant areas of cortex (say, between
hemispheres), synchrony between neurons spanning these long distances is, if
not impossible, then at least limited. Columns, on the other hand, hold their
peak activity level longer than single neurons, and thus have a wider window
in which to bind with other columns.
The elements of William Calvin's theory of Darwinian processes occurring
among assemblies in cortex could also be explored. The hexagonal mosaic
formations are just one aspect of it. The important aspect is the competing
assemblies. Due to the vast amount of computational resources required to
simulate a large number of columns, the development of an efficient column
simulation would be beneficial. Perhaps the characteristics of a column can
be duplicated using an algorithm that does not rely on simulation of individual neurons.
Lastly, the search for the underlying principles of emergent phenomena
could continue.
7.4.2 Practical applications
It is an understatement to say that biological neural networks perform tasks
not yet achievable by man-made machines. But a relatively new area of
research known as neuromorphic systems [Smith & Hamilton 98] concerns
implementing in silicon sensory and neural systems whose architecture and
design are based on neurobiology. The hope is to build real-time sensory systems that can compete with human sensing and pattern recognition capabilities. This research area is at the intersection of neurophysiology, computer
science and electrical engineering. The software developed in this project
could serve as a proof of concept tool to demonstrate the utility of columns
acting as units. The biologically derived parameters could be optimised for
specific engineering applications, before committing to a hardware implementation of a design.
Bibliography
[Booker et al. 89]
L. Booker, D. Goldberg, and J. Holland. Classier systems and genetic algorithms. Articial
Intelligence, 40(1-3):235{282, 1989.
[Calvin 95]
W.H. Calvin. Cortical columns, modules, and
hebbian cell assemblies. In The Handbook
of Brain Theory and Neural Networks, Michael A. Arbib ed., pages 269{272. Bradford
Books/MIT Press, 1995.
[Calvin 96]
W.H. Calvin. The Cerebral Code: Thinking
a Thought in the Mosaics of the Mind. MIT
Press, 1996.
[Calvin 98]
W.H. Calvin. Competing for consciousness:
A darwinian mechanism at an appropriate
level of explanation. Journal of Consciousness
Studies, 5(4):389{404, 1998.
[Carter & Frith 99]
R. Carter and C. Frith. Mapping the Mind.
Univ California Press, 1999.
[Choe & Miikkulainen 98]
Y. Choe and R. Miikkulainen.
Selforganization and segmentation with laterally
81
connected spiking neurons. Neural Networks,
pages 1120{1125, 1998.
[Desai et al. 99]
N.S. Desai, X.J. Wang, and G.G. Turrigiano.
Plasticity in the intrinsic excitability of cortical pyramidal neurons. Nature Neuroscience,
2(6):515{520, June 1999.
[Eichenbaum 93]
H. Eichenbaum. Thinking about brain cell assemblies. Science, 261:993{994, 1993.
[Fransen & Lansner 98]
E. Fransen and A. Lansner. A model of cortical
associative memory based on a horizontal network of connected columns. Network: Computational Neural Systems, 9:235{264, 1998.
[Fransen 96]
E. Fransen. Biophysical Simulation of Cortical
Associative Memory. Unpublished PhD thesis,
Stockholm University, 1996.
[Fransen et al. 92]
E. Fransen, A. Lansner, and H. Liljenstrom. A
model of cortical associative memory based on
hebbian cell assemblies. In Connectionism in
a Broad Perspective, pages 165{172. Swedish
Conference on Connectionism, 1992.
[Gerstner & vanHemmen 92] W. Gerstner and J.L. van Hemmen. Associative memory in a network of 'spiking' neurons.
Network, 3:139{164, 1992.
[Gerstner & vanHemmen 94] W. Gerstner and J.L. van Hemmen. Coding
and information processing in neural networks.
In Models of Neural Networks II: Temporal
82
Aspects of Coding and Information Processing
in Biological Systems, E. Domany, J.L. van
Hemmen and K. Schulten, eds., pages 177{
223. Springer-Verlag, New York, 1994.
[Gerstner 99]
W. Gerstner. Spiking neurons. In Pulsed
Neural Networks, Wolfgang Maass and Christopher M. Bishop, eds., chapter 1, pages 3{48.
Bradford Books, MIT Press, 1999.
[Gerstner et al. 99]
W. Gerstner, R. Kempter, J.L. van Hemmen,
and H. Wagner. Hebbian learning of pulse timing in the barn owl auditory system. In Pulsed
Neural Networks, Wolfgang Maass and Christopher M. Bishop, eds., chapter 14, pages 353{
373. Bradford Books, MIT Press, 1999.
[Hebb 49]
D.O. Hebb. The organization of behavior.
In Neurocomputing, J.A. Anderson and E.
Rosenfeld, eds., chapter 4, pages 43{56. MIT
Press, Cambridge, Mass., 1949.
[Hillis 98]
W.D. Hillis.
Intelligence as an emergent behaviour, or "the songs of eden".
http://www.brunel.ac.uk:8080/depts/
AI/alife/al-hilli.htm, 1998.
[Holland 96]
J. Holland. Hidden Order: How Adaptation
Builds Complexity. Perseus Press, 1996.
[Holland 98]
J. Holland. Emergence: From Chaos to Order.
Oxford University Press, 1998.
83
[Jahnke et al. 99]
A. Jahnke, U. Roth, and T. Schonauer. Digital simulation of spiking neural networks. In
Pulsed Neural Networks, Wolfgang Maass and
Christopher M. Bishop, eds., chapter 9, pages
237{256. Bradford Books, MIT Press, 1999.
[Kandel et al. 92]
E.R. Kandel, J.H. Schwartz, and T.M. Jessell. Principles of Neural Science. Appleton
& Lange, 3rd edition, December 1992.
[Kearney 93]
P. Kearney. The dynamics of the world economy game. Technical report, Sharp Laboratories of Europe Ltd., Oxford, UK, 1993.
[Kreiter & Singer 97]
A.K. Kreiter and W. Singer. On the role of
neural synchrony in the primate visual cortex.
In Brain Theory - Biological Basis and Computational Principles, V. Braitenberg and A.
Aertsen, eds., chapter 9. Elsevier, Amsterdam,
1997.
[Levy 93]
S. Levy. Articial Life: A Report from the
Frontier Where Computers Meet Biology. Vintage Books, 1993.
[Maass & Bishop 99]
W. Maass and C.M. Bishop, editors. Pulsed
Neural Networks. Bradford Books, 1999.
[Maass 99]
W. Maass. Computing with spiking neurons.
In Pulsed Neural Networks, Wolfgang Maass
and Christopher M. Bishop eds., volume 4,
chapter 2, pages 55-81. Springer, Germany,
1999.
[Ramachandran 98]
V.S. Ramachandran. Phantoms in the Brain:
Probing the Mysteries of the Human Mind.
William Morrow and Company, 1998.
[Rochester et al. 56]
N. Rochester, J. Holland, et al. Tests on a
cell assembly theory of the action of the brain,
using a large digital computer. In Neurocomputing, J.A. Anderson and E. Rosenfeld,
eds., chapter 6, pages 65-80. MIT Press, Cambridge, Mass., 1956.
[Schmansky 99]
N.J. Schmansky. Chaotic activity in a
neural network, and the relation to epilepsy.
http://www.dai.ed.ac.uk/daidb/people/homes/
nichs/dcc/nnchaos/nnchaos.html, 1999.
[Sejnowski & Churchland 94] T.J. Sejnowski and P.S. Churchland. The
Computational Brain. Bradford Books, 1994.
[Sejnowski 99]
T.J. Sejnowski. Neural pulse coding. In Pulsed
Neural Networks, Wolfgang Maass and Christopher M. Bishop eds. Bradford Books, MIT
Press, 1999.
[Smith & Hamilton 98]
L.S. Smith and A. Hamilton, editors. Neuromorphic Systems: Engineering Silicon from
Neurobiology. Number 10 in Progress in Neural
Processing. World Scientific, May 1998.
[Sterratt 98]
D. Sterratt.
Introduction to temporal
encoding.
http://www.cogsci.ed.ac.uk/
tew/introduction.html, 1998.
[Waldrop 92]
M.M. Waldrop. Complexity: The Emerging
Science at the Edge of Order and Chaos. Penguin Books, 1992.
Appendix A
Software testing
The software used in this project was written from the ground up. The
decision to create a biologically derived network was made in large part
because such a network was assumed to be far more likely to demonstrate
emergent properties than a network with randomly chosen properties. Of
course for this assumption to be of any use the biologically derived network
must truly operate as it should. This appendix describes the testing process
undergone during development of Spike.
A.1 Neuron model
The development process began with the neuron model, which was based on
Gerstner and van Hemmen's Spike Response Model. This model is biologically motivated (it is an approximation of the Hodgkin-Huxley equations)
but does not specify the exact characteristics of any particular neuron type.
These characteristics are specified by the refractory function and the postsynaptic potential functions. The pyramidal cell and inhibitory neuron each have a specific refractory function; the pyramidal cell causes an EPSP and the inhibitory neuron causes an IPSP.
[Gerstner & vanHemmen 94] was used as the source of data for these
functions. Unfortunately, the data did not take the form of explicit mathematical formulas which easily lent themselves to duplication. Instead, the data was merely 'eye-balled' from graphs included in [Gerstner & vanHemmen 94]. In designing the software for this project, it was decided that tables of data should be stored in arrays (which makes for very speedy runtime 'calculation' of the functions). To create these arrays, an iterative process was followed: estimating the values from the Gerstner graphs, assigning values to elements in the arrays, plotting graphs of this data, and comparing against the [Gerstner & vanHemmen 94] data. Small programs were written in C to read the arrays and dump them in a Matlab-compatible format for plotting in Matlab. Plots of the actual data constituting the PSP and refractory functions are in figures 3.1 (page 29), 3.2 (page 30) and 3.3 (page 31). These plots were deemed a satisfactory match of the data found in [Gerstner & vanHemmen 94].
The next step was to verify proper functionality of each neuron. To do this, gain functions were created and compared against gain functions found in [Gerstner & vanHemmen 94]. A gain function is a plot of spiking rate versus input current. The Gerstner data was gathered from experimental data on real pyramidal cells and inhibitory neurons of various types. Another iterative process was conducted: generating the gain functions, plotting them, comparing against the Gerstner data, and adjusting the table data (while still maintaining the validity of the table data as determined from the previous process). Again, utilities were written in C to execute the software neuron models and generate Matlab-compatible data. The gain functions are shown in figures 3.2 and 3.3. The main feature of the pyramidal cell is the exponential growth and the asymptote at about 200Hz. The cell reaches the 200Hz point a little more slowly than it should, and the curve is not exactly smooth, but it is good enough. The main feature of the inhibitory neuron is a near-linear increase, no asymptote, and crossing 300Hz at an input current of 3.5. The modelled neuron exhibits these features adequately. The jaggedness of the plots is due to the 1ms resolution, both in the table data and in the simulation execution.
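The gain-function generation step can be sketched like this. The neuron below is a toy leaky integrator standing in for the project's Spike Response Model (the threshold, leak and refractory values are arbitrary), but it shows the measurement procedure: step the model at 1ms resolution under a constant input current and count spikes over one simulated second:

```c
/* Estimate one point of a gain function (firing rate vs. constant
   input current) by stepping a toy leaky integrate-and-fire neuron
   at 1ms resolution and counting spikes over 1000ms. This is a
   stand-in for the Spike Response Model used in the project. */
double firing_rate_hz(double input_current)
{
    double v = 0.0;               /* membrane potential */
    int refractory = 0;           /* ms of refractoriness remaining */
    int spikes = 0;

    for (int t = 0; t < 1000; ++t) {   /* 1000 steps of 1ms */
        if (refractory > 0) { --refractory; continue; }
        v = 0.9 * v + input_current;   /* leaky integration */
        if (v >= 1.0) {                /* threshold crossing: spike */
            ++spikes;
            v = 0.0;
            refractory = 3;            /* absolute refractory period */
        }
    }
    return (double)spikes;             /* spikes per 1000ms == Hz */
}
```

Sweeping `input_current` over a range of values and plotting the returned rates yields the gain curve; the refractory period is what produces the asymptote at high currents.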
Next, a comparison of spike train data was made. Again, Gerstner was the source of this data, which was derived from experiments on real neurons. A spike train is a plot of the instances in time when a neuron pulses, given a constant input current. Of course, overall, the neuron should spike at the rate found in the gain function, but the exact temporal characteristic is not a constant periodicity because of the refractory effect.
Figure A.1 depicts the spike train of a pyramidal cell given input current levels of 0.2 and 0.5. The main feature expected given a level of 0.2 is eight spikes in 100ms, spaced apart in a manner approximating that found in [Gerstner & vanHemmen 94]. The crucial feature is that inter-spike intervals *should* get longer as time proceeds, due to 'fatigue' (accumulating refractory data). This is evident in the spacing of the first four spikes, which cover a 40ms time course; this is the size of the refractory table, so this makes sense. The only way for fatigue to make itself evident over a full 100ms period, as in the Gerstner paper, would be a 100ms refractory table, which was judged too big, as it would slow down simulation time. The main feature of the 0.5 input level is 13 spikes in 100ms. The refractory effect is barely evident.

Figure A.2 depicts the spike train of the simulated inhibitory neuron given input current levels of 0.2, 0.5 and 2.0. Considering the 0.2 level, the main feature was three spikes per group, and ten total groups in 500ms, evenly spaced. Considering the 0.5 level, the expected feature was four to five spikes per group, and ten total groups in 500ms, evenly spaced. In actuality, five-spike groups do not appear, and there were 11 groups total. Considering the 2.0 level, the main feature expected was continuous spiking for 500ms, with only a brief non-spiking period between about 30-50ms. In actuality, the spiking is continuous, but there is only the slightest hint of a slow-up
[Figure A.1 here: two panels, 'PyramidalCell Spike Train, Input Current = 0.2' and 'PyramidalCell Spike Train, Input Current = 0.5', each plotting response against t (ms) over 0-100ms.]
Figure A.1: A plot of the spike train produced by the pyramidal cell simulated in this project, given input current levels of 0.2 and 0.5.
in spiking between the 30-50ms period. Overall, the simulation seems to
capture the main features of a 'fast-spiking' inhibitory neuron.
A.2 Column model
The test of the model of a cortical column was limited to verifying that columns could compete with each other for dominance. Programs written in C were created for this testing. No biologically derived data was available to test against, so verification consisted of checking whether column dominance was periodic or not, which it was. This testing produced such interesting effects that it was decided that the results should be included in the experiments section of this paper. Refer back to section 5.1 on page 47.
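The periodicity check can be sketched as follows. The spike-count arrays would come from the simulator, and the 10ms window size and function name are assumptions for this illustration, not the project's actual test code:

```c
/* Given per-timestep spike counts for two competing columns, record
   which column dominates each fixed-size window and count how many
   times dominance changes hands. Periodic competition shows up as a
   regular, non-zero number of switches. */
int count_dominance_switches(const int *col_a, const int *col_b,
                             int n_steps, int window_ms)
{
    int switches = 0;
    int prev = -1;                 /* -1: no dominant column seen yet */
    for (int start = 0; start + window_ms <= n_steps; start += window_ms) {
        int sum_a = 0, sum_b = 0;
        for (int t = start; t < start + window_ms; ++t) {
            sum_a += col_a[t];
            sum_b += col_b[t];
        }
        /* a tie keeps the previous dominant column */
        int dom = (sum_a > sum_b) ? 0 : (sum_b > sum_a) ? 1 : prev;
        if (prev != -1 && dom != prev)
            ++switches;            /* dominance changed hands */
        if (dom != -1)
            prev = dom;
    }
    return switches;
}
```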
A.3 Cortex model
The cortex model was tested for proper column behaviour and connectivity. Column behaviour was tested by re-running the competition test, whereby a cortex consisted of two columns, initialised with full connectivity. Obviously, the same results as found in section 5.1 were expected. The difference in
[Figure A.2 here: three panels, 'InhibiNeuron Spike Train, Input Current = 0.2', 'InhibiNeuron Spike Train, Input Current = 0.5' and 'InhibiNeuron Spike Train, Input Current = 2', each plotting response against t (ms) over 0-500ms.]
Figure A.2: A plot of the spike train produced by the inhibitory neuron simulated in this project, given input current levels of 0.2, 0.5 and 2.0.
testing methodology was that, for this test, the column activity raster plot display was observed instead of generating plots. The same behaviour as observed in section 5.1 was found. The annular ring connectivity was tested by creating a program that writes data interpretable by the 'xfig' drawing utility for Unix. The data was the x and y coordinates of each column, and a line indicating the connectivity between a small set of these columns (enough to verify the ring connectivity but not so many as to overwhelm the drawing). This drawing is found in figure 3.6 on page 39. The ring connectivity is evident in this figure.
A.4 Learning algorithm
The learning algorithm, which is described in appendix B, was tested by 'teaching' a four-column network a pattern and then 'recalling' it. The pattern was created by injecting one pyramidal cell within one column with
a constant current. The learning rate was made non-zero and learning allowed
to proceed until a distinctive pattern emerged on the raster display. Then
learning was disabled (rate set to zero). The pattern was observed to remain
constant over some long period of time (one second or so of simulated time).
Next the current injection was discontinued. The pattern was observed to
'die' after 600-700ms or so, and remain dead. Then the same pyramidal cell
was injected with the same amount of current as before (learning remaining
disabled), and the stored pattern was observed to emerge again, indicating
that the pattern was correctly recalled.
Appendix B
Hebbian learning algorithm
Hebbian learning is a general principle stating that the synaptic efficacy between two neurons should increase if the two neurons are 'simultaneously' active, and decrease if not. [Gerstner et al. 99] defines a biologically motivated learning rule appropriate for the Spike Response Model. For this project, the learning rule was simplified somewhat, but it retains the following qualities:
- Two neurons are 'positively correlated' if the presynaptic neuron spikes *before* the postsynaptic neuron. The weight increase is greatest when this spike time difference is zero or very small, and gradually decreases as the spike time difference gets larger (while the neurons remain 'positively correlated').

- Two neurons are 'negatively correlated' if the presynaptic neuron spikes *after* the postsynaptic neuron, except for a few brief milliseconds around the 'equal correlation point' (when both neurons spike at exactly the same instant). The weight is *decreased* for negatively correlated spike times.

- The weights are not allowed to grow or shrink without bound. An upper bound and lower bound are determined heuristically.

- A 'window' of time around the 'equal correlation point' is analysed. In the cortex or the hippocampus, the learning window probably has a width of 50-200ms [Gerstner et al. 99]. A 100ms window was implemented in this project (50ms on either side of the 'equal correlation point' is accounted for). This window corresponds to the time period over which chemical activity in real neurons takes place to change synaptic efficacy.

- The learning rule is executed at intervals of time greater than or equal to the 'window' size. A 100ms interval was chosen as the default for this project, which is the smallest usable value (accounting for all spike activity).

- The learning rule is the same for both the pyramidal cell and the inhibitory neuron, although this may not be true for real neurons.
Equation B.1 describes the learning rule algorithm implemented in the project (adapted from eq. 14.8 of [Gerstner et al. 99]).
    Δw_ij = η · Σ_{t ∈ F_i} Σ_{t' ∈ F_j} W(t' - t)                (B.1)

where:
    η    : learning rate
    t    : spike time of postsynaptic neuron
    t'   : spike time of presynaptic neuron
    F_i  : set of all firing times of postsynaptic neuron i
           occurring after T - I
    F_j  : set of all firing times of presynaptic neuron j,
           adjusted for spike travel time (delay factor)
    T    : the current time-step
    I    : learning interval (default is 100ms)
    W(s) : learning window (see figure B.1)
    s    : t' - t
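A minimal sketch of this update rule in C, assuming an illustrative piecewise-linear window with a 50ms half-width (the actual W(s) shape in the project was taken from the Gerstner data); spike times are in milliseconds:

```c
/* Learning window W(s), with s = t' - t (presynaptic arrival minus
   postsynaptic firing) in ms. Negative s (pre before post) gives
   potentiation, positive s gives depression; the triangular shape
   and amplitudes here are illustrative only. */
double learning_window(int s_ms)
{
    if (s_ms < -50 || s_ms > 50)
        return 0.0;                     /* outside the 100ms window */
    if (s_ms <= 0)
        return 1.0 + s_ms / 50.0;       /* rises toward s = 0 */
    return -(1.0 - s_ms / 50.0);        /* depression for s > 0 */
}

/* Equation B.1: sum W(t' - t) over every pair of a postsynaptic
   spike time t in F_i and a presynaptic spike time t' in F_j,
   scaled by the learning rate eta. */
double weight_change(double eta,
                     const int *post_spikes, int n_post,
                     const int *pre_spikes, int n_pre)
{
    double dw = 0.0;
    for (int i = 0; i < n_post; ++i)
        for (int j = 0; j < n_pre; ++j)
            dw += learning_window(pre_spikes[j] - post_spikes[i]);
    return eta * dw;
}
```

A presynaptic spike a few milliseconds before a postsynaptic one therefore strengthens the synapse, while the reverse ordering weakens it, exactly the 'positive'/'negative' correlation described above.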
[Figure B.1 here: plot of W(s) against s (milliseconds) from -50 to 50, with synapses A and B marked on either side of s = 0.]
Figure B.1: Learning window W(s) accounts for the notions of 'simultaneous' firing, and 'positive' and 'negative' correlation. W(s) is a function of the delay s between postsynaptic firing and presynaptic spike arrival. s is negative for positively correlated spikes. If W(s) is positive for some s, the synaptic efficacy is increased. If W(s) is negative, it is decreased. The postsynaptic firing occurs at s = 0 (vertical dashed line). Learning is most efficient if presynaptic spikes arrive shortly before the postsynaptic neuron starts firing, as in synapse A. Another synapse B, which fires well *after* the postsynaptic spike, is decreased. Taken from pg 357 of [Gerstner et al. 99].
Appendix C
Connectivity algorithms
The default connectivity between neurons modelled in this project was not
all-to-all. Rather, it was based on biologically derived data taken from
Fransen and Lansner. Table C.1 lists the algorithms used when constructing
the network model in software.
Connectivity classification   Algorithm
Within a column               For each pyramidal cell, select another
                              pyramidal cell (excluding self) with 53%
                              probability and connect.
Within a column               For each inhibitory neuron, select a pyramidal
                              cell with 67% probability and connect.
Between columns               For each pyramidal cell, select with 16.7%
                              probability, then for each pyramidal cell in
                              the other column, connect with 41.7%
                              probability.
Between columns               For each pyramidal cell, select with 50%
                              probability, then connect to every inhibitory
                              neuron in the other column.

Table C.1: Connectivity algorithms
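The first row of the table might be implemented along these lines. A small linear congruential generator replaces `rand()` so the sketch is reproducible across platforms; the function names and the RNG are assumptions for this illustration, and the project's actual code may have differed:

```c
/* Probabilistic wiring as in Table C.1, sketched for the
   pyramidal-to-pyramidal case within a column: each ordered pair of
   distinct cells is connected with probability p_connect (0.53). */
static unsigned long rng_state = 12345;

double rng_uniform(void)                 /* uniform value in [0,1) */
{
    rng_state = rng_state * 1103515245UL + 12345UL;
    return (double)((rng_state >> 16) & 0x7fffUL) / 32768.0;
}

/* Fill conn[i][j] = 1 if pyramidal cell i projects to cell j.
   Returns the total number of connections made. */
int wire_column(int n_cells, int conn[][64], double p_connect)
{
    int count = 0;
    for (int i = 0; i < n_cells; ++i)
        for (int j = 0; j < n_cells; ++j) {
            if (i == j) { conn[i][j] = 0; continue; } /* no self-connection */
            conn[i][j] = (rng_uniform() < p_connect) ? 1 : 0;
            count += conn[i][j];
        }
    return count;
}
```

The other three rows follow the same pattern, with the between-column cases applying the 'select' probability to the source cell first and the 'connect' probability (or all-to-all fan-out) to the targets.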