Download A Counter Based Connectionist Model of Animal Timing - APT

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Neuromarketing wikipedia , lookup

Convolutional neural network wikipedia , lookup

Functional magnetic resonance imaging wikipedia , lookup

State-dependent memory wikipedia , lookup

Neuroesthetics wikipedia , lookup

Eyeblink conditioning wikipedia , lookup

Development of the nervous system wikipedia , lookup

Artificial neural network wikipedia , lookup

Cognitive neuroscience of music wikipedia , lookup

Psychophysics wikipedia , lookup

Catastrophic interference wikipedia , lookup

Neural engineering wikipedia , lookup

Nervous system network models wikipedia , lookup

Neuroethology wikipedia , lookup

Lateralized readiness potential wikipedia , lookup

Stimulus (physiology) wikipedia , lookup

Response priming wikipedia , lookup

Metastability in the brain wikipedia , lookup

Types of artificial neural networks wikipedia , lookup

Neuroeconomics wikipedia , lookup

Mental chronometry wikipedia , lookup

Recurrent neural network wikipedia , lookup

Neural coding wikipedia , lookup

Time perception wikipedia , lookup

Transcript
A Counter Based Connectionist Model of Animal Timing
A thesis submitted to the University of Manchester for the degree of
Master of Science in the Faculty of Science and Engineering
October 1999
By
Rossano Barone
Joint Departments of Psychology and Computer Science
1
CONTENTS
Abstract………………………………………………………………...5
Declaration………………………………………………………….….6
Copyright………………………………………………………………7
Acknowledgements…………………………………………………….8
Abbreviations…………………………………………………………..9
1 INTRODUCTION………………………………………………….10
1.1 About timing………………………………………………10
1.2 Research goals……………………………………………..11
2 EXPERIMENTAL RESEARCH ON TIMING…………………….14
2.1 Reproduced interval generalisation………………………..14
2.2 The temporal discrimination bias………………………….15
2.3 The linear flow of subjective time…………………………16
2.4 Pavlovian response timing…………………………………20
3 SCALAR TIMING THEORY………………………………………21
4 MODDELING COGNITION AND BRAIN PROCCESSES………25
5.1 Neurons and their behaviour……………………………….25
5.2 Neural networks and computational modelling……………26
5 NEUROBIOLOGICAL DATA ON TIMING………………………29
4.1 Cerebellar timing hypotheses………………………………29
4.2 The basal ganglia and frontal cortex……………………….31
2
6 NEURAL NETWORK MODELS OF TIMING……………………34
6.1Tapped delay lines.…………………………………………34
6.2Time dependent spatial activity…………………………….35
6.3Spectral timing……………………………………………...38
6.4Oscillator models…………………………………………...39
6.5Accumulator timing………………………………………...42
7 THE ACCUMMULATOR …………………………………………45
7.1 Design considerations and implementation………………...45
7.2 Results……………………………………………………...48
7.3 Discussion………………………………………………….49
8 PEAK PROCEDURE SIMULATIONS…………………………….53
8.1 Reward expectation………………………………………...53
8.2Temporal representation…………………………………….54
8.3 Response representation……………………………………55
8.4 Difference network…………………………………………57
8.5 Summary of simulation…………………………………….58
8.6 Results……………………………………………………...60
8.7 Discussion………………………………………………….62
9 TEMPORAL BISCECTION………………………………………..68
9.1 Reward expectation………………………………………...69
9.2 Temporal memory………………………………………….70
9.3 Difference network…………………………………………71
9.4Summary of simulation……………………………………..71
9.5 Results……………………………………………………...73
9.3Discussion……………………………………………...…...75
3
10 GENERAL DISCCUSSION………………………………………80
10.1Comparisons with other models…………………………...80
10.2 Time dependent spatial patterns…………………………..82
10.3 The relativity of differences………………………………83
10.4 Summary………………………………………………….86
Appendix A: The dependence of ratio differences on distribution
overlap………………………………………………………………...89
Appendix B: Rearrangement of the peak interval decision rule………90
References…………………………………………………………….91
4
Abstract
Scalar Expectancy Theory (SET) is a mathematically rigorous
explanation of timing in animals and humans.
Central to SET is the
proposal that timing is performed by an internal clock manifest in a
pacemaker that generates pulses at a mean rate, and an accumulator that
sums these pulses.
In the ‘accumulator neural network’ elapsed time
is represented as an activation quantity. This neural network tends to
preserve all external activation it receives over a sequence of time steps.
If the activation quantity from an external input is constant over time,
then the activation increases in a linear fashion. Activation transfers
probabilistically over short temporal units. The standard deviation of a
sample of activation values for each time step increases proportionally
with elapsed time allowing scalar variance. A neural network for peak
interval performance incorporated a rearrangement of the decision rules
proposed by SET. Simulations generated approximations of animal
data including the scalar property. In addition, the acquisition of close
to geometrical mean bisection was performed by a neural network
which learnt the most optimal discriminant between the small and large
intervals,
rather
than
decision
representations proposed by SET.
processes
based
on
explicit
The thesis explores problems with
these simple models and suggests ideas relevant to further research
developments.
5
Declaration
No portion of the work referred to in the thesis has been submitted in
support of an application for another degree or qualification of this or
any other university or institute of learning.
6
Copyright and Ownership of Intellectual Property Rights
(1)
Copyright in text of this thesis rests with the Author. Copies (by any
process) either full, or of extracts, may be made only in accordance
with instructions given by the Author and logged in the John Rylands
University Library of Manchester. Details may be obtained from the
Librarian.
This page must form part of any such copies made.
Further copies (by any process) of copies made in accordance with
such instructions may not be made without the permission (in writing)
of the Author.
(2)
The ownership of any intellectual property rights which may be
described in this thesis is vested in the University of Manchester,
subject to any prior agreement to the contrary, and may not be made
available for use by third parties without the written permission of the
University, which will prescribe the terms and conditions of any such
agreement.
Further information on the conditions under which disclosures and
exploitation may take place is available from the Head of the department of
Psychology.
7
Acknowledgements
The research was conducted with the supervision of Jon Shapiro and John
Wearden of the University of Manchester. I would like to thank them for
their patience, encouragement and advice.
The Author
The Author graduated from Middlesex University in 1997 with a BSc
(Hons) in psychology. The Author has previously conducted experimental
research on the relationship between attention and creative thinking as
partial fulfilment of his first degree.
8
Abbreviations


*
a
ANN
Av
b
CV
CR
CS
D1
D2
Da
Dr
FI
L
Lr
Nc
Ni
G
PI
q
r
Sa
Sm
Pacemaker rate In SET
Learning rate
Indicates a sampled value from a Gaussian distribution of such values
Accumulator value in SET
Artificial neural network
Single activation unit
Response threshold in SET
Coefficient of variation
Conditioned response
Conditioned stimulus
Computes differences and inhibits Rm
Computes differences and inhibits Rm
Part of the difference network that receives the value of Sa
Part of the difference network that receives the value of Sm
Fixed interval schedule
Long reinforcement time
Long response node
Fixed number of units each accumulator node can excite
Number of units excited on each time step by an external source
Fixed generalisation parameter (equivalent to b in SET)
Peak interval schedule
Threshold of the response memory
Reference memory value in SET
Sum of Av units in the accumulator
Value of weights trained to output value of accumulator at
reinforcement.
S
Short reinforcement time
SD Standard deviation
SET Scalar Expectancy Theory
Sr
Short response node
t
Arbitrary time
UCS Unconditioned response
US Unconditioned stimulus
9
SECTION 1
Introduction
1.1 About timing
Timing is crucial for the control of cognition and behaviour.
Many
cognitive operations may be timed implicitly by the way neural events
unfold in real time. There are, however, conditions where animals display
timing of behaviour for intervals in the range of seconds to minutes (Gibbon
and Church 1990). It seems that evolution has endowed the nervous system
of animals to anticipate rather than merely react to events in the environment
(Gibbon, Malapani, Dale and Gallistel, 1997).
One important use of timing
in animals is to provide an index of behaviour spent on goal directed
behaviour. This is clearly shown in foraging where birds need to change
food searching plans as a consequence of unexpected changes to feeding
areas. Under such circumstances the optimal time that would be spent on a
new search depends partly on the time spent flying to the present location. If
this time is very long it may be more worthwhile for the bird to persist
foraging in the same location. This optimisation reflects the goal of the bird
to collect the best recourses per unit of time (Gibbon and Church 1990).
Needless to say, optimisation of various costs also characterise much of
timing in humans.
Research has shown that interval timing can be learnt and executed with an
impressive degree of command and flexibility. For example, timing can be
arbitrary reset and even stopped for some interval before being continued
(Roberts 1981).
These together with other empirical observations have
10
made plausible the view that such behaviours rely on specific internal timing
mechanisms.
As the subsequent review shall show, a number of
researchers have developed a variety of research methodologies and
mathematically rigorous explanations of mechanisms involved in temporal
perception. The question of how a neural system might perform temporal
perception is considerably less well understood.
Despite a number of
proposals there is a general air of uncertainty concerning this problem. This
problem is owed, at least in part, to the absence of firm neurobiological data.
There is a lack strong evidence implicating with some useful degree of
precision where timing is performed in the brain (See Gibbon et al, 1997;
O’Boyle, 1997) despite wide continued speculation (e.g. Ivry, 1996; Meck,
1996; Hinton and Meck, 1997; Marcar and Casini, 1998; Grossberg and
Merril, 1998).
The presence of such data would allow one to make
inferences about what kind of computational architectures and dynamical
interactions between brain structures perform timing.
Compared to the
maturity of other perceptual sciences such as vision and audition; the
neurobiology of time perception is still in it early infancy. As a result, it is
of particular importance to note that constructing a neural network model of
timing must be considered a very open-ended research problem.
1.2 Research goals
The goal of this research was considerably explorative. There is a wealth of
experimental data on animal timing which suggests amongst other things the
following properties:
1)
Scalar property: The precision and variability of timed responses
increases proportionally with the size of the interval being timed thus
11
yielding a constant fraction between the standard deviation and the
mean of time estimates. This regularity is termed the scalar property
and is an example of Weber’s law.
2)
Linearity: The subjective representation of elapsed time increases
approximately linearly with real objective time. Although there may
be error in the representation of some elapsed interval, the mean of a
sample of representations for that interval closely approximates
objective time.
3)
Quantifiable: The nervous system appears to be able to calculate
differences between experienced intervals suggesting the subjective
representation of elapsed time is quantifiable to the cognitive system.
The primary concern of this project was to construct a neural network model
that would reproduce the properties inferred from experimental data as
outlined above.
There are other connectionist models of timing in the
literature but only one proposed by Church and Broadbent (1990) provides
comprehensive simulations of these properties for psychophysical tasks
involving timing in the range of seconds to minutes. The neural network is
based on an information processing theory of timing called Scalar
Expectancy Theory (SET).
Timing in SET relies on an internal clock
manifest in a pacemaker and accumulator. The pacemaker generates pulses
at a mean rate and the accumulator sums the number of pulse generated. In
Church and Broadbent's model timing is performed by resetting a set of
oscillators each differing in frequency.
A previously learnt interval is
reproduced by comparing the current phases of the oscillator set with the
12
memory of the phases when the learnt interval elapsed.
As such timing
oscillators have never been located in the brain the goal was to explore other
ways of modelling an internal clock mechanism. The most obvious approach
was to develop a neural network model of a pacemaker/accumulator that
would produce the properties outlined above.
This is precisely what has
been done together with simple neural networks that simulate performance
on animal timing tasks.
13
SECTION 2
Experimental research on timing
There are a number of psychophysical tasks that have been developed over
the years for examining temporal perception in both animals and humans.
Animal tasks include temporal generalisation (Church and Gibbon, 1982;),
peak procedure (Roberts, 1981,), temporal bisection (Church and Deluty,
1977) and time-left (Gibbon and Church, 1981). Most of these tasks shall
be reviewed, each of them suggesting important constraints on models of
timing.
2.1 Temporal reproduction
Recording the time spent on goal directed behaviour is paramount for
applying ones energy optimally. One of the most widely reported temporal
reproduction tasks is a modification of the ‘fixed interval schedule’ (FI)
called the ‘peak interval’ (PI) (Roberts 1981). In this task a rat is presented
with two types of trials: a) food (FI or reinforcement trials) and b) non-food
(no reinforcement trials). On the food trial a signal initiates the beginning
of a fixed time period (e.g. 20s). The first lever press made by the rat after
the fixed time period is rewarded with food and the signal is terminated. On
non-food trials the signal is presented for some period much greater than the
food trial (e.g. 120 seconds with a 20 seconds food trial period) and the
response time since signal onset for each lever press is recorded.
14
A first relevant observation concerns the ‘break-run-break’ pattern observed
on single trials. A pattern of abrupt response states from low to high and
ending with low is typically observed. This suggests that the animal is
generalising the leant reinforced time so that when its subjective estimation
of time is close to the remembered interval it responds at a high rate and
continues to do this until the estimation seems significantly different.
Of
particular relevance is the observation that the median interval of the high
response states, computed from single trials, tend to vary around the
objective reinforcement time.
Such an observation is predicted if the
subjective estimation of elapsed time, remembered time or both have error
that differs over trials.
Another important observation is acquired when the mean probability of a
response over a number of trials for every fixed successive interval (e.g.. 1
seconds) is plotted as a function of time since the onset of the peak trial
stimulus. The typical pattern observed is a Gaussian shaped distribution
characterised by a smooth and gradual increase in responses to the period
that approximates the time of reward followed by a slightly asymmetrical
but smooth decrease.
This suggests that either the error in memory or
elapsed time estimation causes the median interval of the high response
states on single trials, to vary around the objective reinforcement time with a
probability that is Gaussian shaped.
A third and more important observation concerns the relationship between
the standard deviation of the response times and the mean or peak response
time. A discussed in section 1.1, temporal reproduction of increasingly
larger intervals are performed by animals and humans but at a cost in
15
precision. If response distributions are taken for subject’s performance on
the peak procedure, for each of a number of different reinforcement times, a
constant fraction called the coefficient of variation (CV), is observed when
the standard deviation of response times is divided by the mean response
time. This pattern accords with Weber’s psychophysical law and is
commonly referred to as the scalar property. This can also be expressed
geometrically when the probability of a response is plotted against elapsed
time since signal onset relative to the peak time. Under such manipulations
the distribution superimpose.
This scaling pattern is widely referred to as
superposition (by American authors) or superimposition. In simple terms,
the scalar property means that an animal’s temporal sensitivity or response
precision increases proportionally with the length of the interval being
estimated.
Figure 2.1-2.3: Data from the peak interval procedure
2.1 Single trial performance
Pe = response probability
1.00
Pe 0.75
0.50
0.25
0
20
40
60
80
100
Elapsed time
16
120
2.2 Mean response distribution
1.00
Pe = response probability
Pe 0.75
0.50
0.25
0.00
0
20
40
60
80
100
120
140
Elapsed time
2.3 Superimposition of response distributions
1.00
Pr= relative response probability
Pr 0.75
0.50
0.25
0.00
0
.5
.75
1.0
1.25 1.5
2.0
Elapsed time/peak time
2.2 The temporal discrimination bias
A task called temporal bisection reveals an interesting discrimination bias
when performed by animals (Church and Deluty, 1977). These authors were
concerned with how the representation of time increases in the nervous
system.
In this procedure an animal has to discriminate between two
intervals of different lengths called short (S) and long (L). Each of these
intervals is signalled by the same stimulus. Typically, S might be 2 seconds
17
and L might be 8 seconds.
Each of the intervals S and L is learnt to be
associated with a corresponding lever (S lever and L lever). Only when the
correct associated lever is pressed will the animal be rewarded. When the
animal has learnt to discriminate between the two intervals to a high degree
of accuracy, a range of intermediate intervals including S and L are
presented.
In this stage of the task no rewards are given for lever presses to
any of the intervals.
The probability of a response to the L lever is computed for each of the
intervals presented and plotted as a function of the interval length. The
pattern of L responses is typically a half Gaussian that increases to the
highest response probability at the interval L. Of theoretical concern is the
location of the bisection point. This is the intermediate interval at which
discrimination is indifferent and results in 50% of the response presses for L.
The researchers predicted that if the bisection point was found at the
arithmetic mean, this would indicate linear time, at the geometrical mean,
logarithmic time, and at the harmonic mean, reciprocal time. In line with a
logarithmic view the bisection point was found at the geometric mean (the
square root of the product of S and L).
The geometrical mean has been a commonly expressed prediction of
bisection performance with animal subjects (e.g. Wearden, 1994; RodrigouzGirones and Kacelnik, 1998).
However, data has suggested that near
geometrical mean bisection is merely a co-incidental finding that frequently
occurs with 1:4 ratios but not always (See Siegal and Church 1984; Siegal
1986). Siegal and Church (1986) for example have found that when S was
1 second and L was 16 seconds the bisection point was significantly less
18
than the geometrical mean and less than the bisection point for performance
when S was 2 seconds and L was 8 seconds.
Both 1:16 and 2:8 stimulus
intervals predict the same geometrical mean even though the ratios are
different. These studies suggest that the ratios of the two intervals may play
an important role in determining the animal bisection point. This is also true
in human temporal bisection (Wearden and Ferrara, 1996) Note that human
temporal bisection performance shows many differences than when
performed by animals. This literature shall not be reviewed. The problem
of conceptualising the discrimination bias in animal temporal bisection
tasks, will be addresses in section 9, in which a novel method is reported
different to that expressed by proponents of SET.
2.3 The linear flow of subjective time
One important experimental task suggesting that subjective time increases
linearly rather than logarithmically is Time-Left (Gibbon and Church 1981).
In this task an animal is trained on two intervals each of which is associated
with a corresponding signal and lever. These are just FIs in which the
animal is rewarded for the correct lever press after some fixed interval. The
researchers cited here used 30 and 60 second FIs.
The 30 second interval is
called the standard FI
(S-FI) and the 60 second interval is called the
comparison FI (C-FI).
Following learning, combined trials are given and
the animal gets to make a choice of which lever to press for a reward. The
combined trials always begin with C-FI and then S-FI is presented at 15, 30
or 45 seconds in to C-FI.
If S-FI is presented 15 seconds in to the C-FI, the
rat will be rewarded earlier if it presses the S-FI lever.
If the S-FI is
presented at 45 seconds in to the C-FI the rat will be rewarded earlier if it
presses the C-FI lever. At 30 seconds in to C-FI, rewards will be given for
19
either lever press at the same time. As one would predict of linear subjective
time, rats favoured the S-FI response at 15 seconds in to C-FI, were
indifferent to either response at 30 seconds in to C-FI, and favoured the C-FI
response at 45 seconds in to C-FI.
Relative to the indifference at 30
seconds, the preference increase for S-FI at 15 seconds and C-FI at 45
seconds in to C-FI, were equivalent and therefore symmetrical. These results
strongly suggest that mean of subjective time estimates increases in a linear
fashion with objective time.
This data also suggests another important feature concerning the subjective
representation of time. That is that, such representations, both remembered
and currently elapsed, are quantifiable to the cognitive system. In the timeleft task it would seem reasonable to infer that an animal’s nervous system
computes the difference between its current representation of elapsed time
and memories for two other times. It would be hard to imagine how such a
task could be performed without quantifiable representations.
2.4 Pavlovian response timing
Another kind of experimental task worthy of mention involves Pavlovian or
classical conditioning tasks. An example of this is the eyeblink procedure in
which a subject is presented with a conditioned stimulus (CS) such as a tone
followed by an unconditioned stimulus (US) such as a puff of air aimed at
the subjects eye.
The puff of air will evoke an unconditioned response
(UCR) of eyelid closure. Through learning the US becomes associated with
the CS permitting the subject to produce a timed conditioned response (CR)
of eyelid closure when only the CS is present. The timing of the CR is
accurate occurring just before the US is normally emitted. Experiments of
20
this kind have reported accurate response timing in the milliseconds range.
This is an example of short interval timing and has been thought to involve
timing mechanism different to those used for longer intervals as employed in
experiments discussed above (Ivry, 1996).
21
22