* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download A Counter Based Connectionist Model of Animal Timing - APT
Neuromarketing wikipedia , lookup
Convolutional neural network wikipedia , lookup
Functional magnetic resonance imaging wikipedia , lookup
State-dependent memory wikipedia , lookup
Neuroesthetics wikipedia , lookup
Eyeblink conditioning wikipedia , lookup
Development of the nervous system wikipedia , lookup
Artificial neural network wikipedia , lookup
Cognitive neuroscience of music wikipedia , lookup
Psychophysics wikipedia , lookup
Catastrophic interference wikipedia , lookup
Neural engineering wikipedia , lookup
Nervous system network models wikipedia , lookup
Neuroethology wikipedia , lookup
Lateralized readiness potential wikipedia , lookup
Stimulus (physiology) wikipedia , lookup
Response priming wikipedia , lookup
Metastability in the brain wikipedia , lookup
Types of artificial neural networks wikipedia , lookup
Neuroeconomics wikipedia , lookup
Mental chronometry wikipedia , lookup
Recurrent neural network wikipedia , lookup
A Counter Based Connectionist Model of Animal Timing A thesis submitted to the University of Manchester for the degree of Master of Science in the Faculty of Science and Engineering October 1999 By Rossano Barone Joint Departments of Psychology and Computer Science 1 CONTENTS Abstract………………………………………………………………...5 Declaration………………………………………………………….….6 Copyright………………………………………………………………7 Acknowledgements…………………………………………………….8 Abbreviations…………………………………………………………..9 1 INTRODUCTION………………………………………………….10 1.1 About timing………………………………………………10 1.2 Research goals……………………………………………..11 2 EXPERIMENTAL RESEARCH ON TIMING…………………….14 2.1 Reproduced interval generalisation………………………..14 2.2 The temporal discrimination bias………………………….15 2.3 The linear flow of subjective time…………………………16 2.4 Pavlovian response timing…………………………………20 3 SCALAR TIMING THEORY………………………………………21 4 MODDELING COGNITION AND BRAIN PROCCESSES………25 5.1 Neurons and their behaviour……………………………….25 5.2 Neural networks and computational modelling……………26 5 NEUROBIOLOGICAL DATA ON TIMING………………………29 4.1 Cerebellar timing hypotheses………………………………29 4.2 The basal ganglia and frontal cortex……………………….31 2 6 NEURAL NETWORK MODELS OF TIMING……………………34 6.1Tapped delay lines.…………………………………………34 6.2Time dependent spatial activity…………………………….35 6.3Spectral timing……………………………………………...38 6.4Oscillator models…………………………………………...39 6.5Accumulator timing………………………………………...42 7 THE ACCUMMULATOR …………………………………………45 7.1 Design considerations and implementation………………...45 7.2 Results……………………………………………………...48 7.3 Discussion………………………………………………….49 8 PEAK PROCEDURE SIMULATIONS…………………………….53 8.1 Reward expectation………………………………………...53 8.2Temporal representation…………………………………….54 8.3 Response representation……………………………………55 8.4 Difference network…………………………………………57 8.5 Summary of simulation…………………………………….58 8.6 Results……………………………………………………...60 8.7 Discussion………………………………………………….62 9 TEMPORAL BISCECTION………………………………………..68 9.1 Reward expectation………………………………………...69 9.2 Temporal memory………………………………………….70 9.3 Difference network…………………………………………71 9.4Summary of simulation……………………………………..71 9.5 Results……………………………………………………...73 9.3Discussion……………………………………………...…...75 3 10 GENERAL DISCCUSSION………………………………………80 10.1Comparisons with other models…………………………...80 10.2 Time dependent spatial patterns…………………………..82 10.3 The relativity of differences………………………………83 10.4 Summary………………………………………………….86 Appendix A: The dependence of ratio differences on distribution overlap………………………………………………………………...89 Appendix B: Rearrangement of the peak interval decision rule………90 References…………………………………………………………….91 4 Abstract Scalar Expectancy Theory (SET) is a mathematically rigorous explanation of timing in animals and humans. Central to SET is the proposal that timing is performed by an internal clock manifest in a pacemaker that generates pulses at a mean rate, and an accumulator that sums these pulses. In the ‘accumulator neural network’ elapsed time is represented as an activation quantity. This neural network tends to preserve all external activation it receives over a sequence of time steps. If the activation quantity from an external input is constant over time, then the activation increases in a linear fashion. Activation transfers probabilistically over short temporal units. The standard deviation of a sample of activation values for each time step increases proportionally with elapsed time allowing scalar variance. A neural network for peak interval performance incorporated a rearrangement of the decision rules proposed by SET. Simulations generated approximations of animal data including the scalar property. In addition, the acquisition of close to geometrical mean bisection was performed by a neural network which learnt the most optimal discriminant between the small and large intervals, rather than decision representations proposed by SET. processes based on explicit The thesis explores problems with these simple models and suggests ideas relevant to further research developments. 5 Declaration No portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification of this or any other university or institute of learning. 6 Copyright and Ownership of Intellectual Property Rights (1) Copyright in text of this thesis rests with the Author. Copies (by any process) either full, or of extracts, may be made only in accordance with instructions given by the Author and logged in the John Rylands University Library of Manchester. Details may be obtained from the Librarian. This page must form part of any such copies made. Further copies (by any process) of copies made in accordance with such instructions may not be made without the permission (in writing) of the Author. (2) The ownership of any intellectual property rights which may be described in this thesis is vested in the University of Manchester, subject to any prior agreement to the contrary, and may not be made available for use by third parties without the written permission of the University, which will prescribe the terms and conditions of any such agreement. Further information on the conditions under which disclosures and exploitation may take place is available from the Head of the department of Psychology. 7 Acknowledgements The research was conducted with the supervision of Jon Shapiro and John Wearden of the University of Manchester. I would like to thank them for their patience, encouragement and advice. The Author The Author graduated from Middlesex University in 1997 with a BSc (Hons) in psychology. The Author has previously conducted experimental research on the relationship between attention and creative thinking as partial fulfilment of his first degree. 8 Abbreviations * a ANN Av b CV CR CS D1 D2 Da Dr FI L Lr Nc Ni G PI q r Sa Sm Pacemaker rate In SET Learning rate Indicates a sampled value from a Gaussian distribution of such values Accumulator value in SET Artificial neural network Single activation unit Response threshold in SET Coefficient of variation Conditioned response Conditioned stimulus Computes differences and inhibits Rm Computes differences and inhibits Rm Part of the difference network that receives the value of Sa Part of the difference network that receives the value of Sm Fixed interval schedule Long reinforcement time Long response node Fixed number of units each accumulator node can excite Number of units excited on each time step by an external source Fixed generalisation parameter (equivalent to b in SET) Peak interval schedule Threshold of the response memory Reference memory value in SET Sum of Av units in the accumulator Value of weights trained to output value of accumulator at reinforcement. S Short reinforcement time SD Standard deviation SET Scalar Expectancy Theory Sr Short response node t Arbitrary time UCS Unconditioned response US Unconditioned stimulus 9 SECTION 1 Introduction 1.1 About timing Timing is crucial for the control of cognition and behaviour. Many cognitive operations may be timed implicitly by the way neural events unfold in real time. There are, however, conditions where animals display timing of behaviour for intervals in the range of seconds to minutes (Gibbon and Church 1990). It seems that evolution has endowed the nervous system of animals to anticipate rather than merely react to events in the environment (Gibbon, Malapani, Dale and Gallistel, 1997). One important use of timing in animals is to provide an index of behaviour spent on goal directed behaviour. This is clearly shown in foraging where birds need to change food searching plans as a consequence of unexpected changes to feeding areas. Under such circumstances the optimal time that would be spent on a new search depends partly on the time spent flying to the present location. If this time is very long it may be more worthwhile for the bird to persist foraging in the same location. This optimisation reflects the goal of the bird to collect the best recourses per unit of time (Gibbon and Church 1990). Needless to say, optimisation of various costs also characterise much of timing in humans. Research has shown that interval timing can be learnt and executed with an impressive degree of command and flexibility. For example, timing can be arbitrary reset and even stopped for some interval before being continued (Roberts 1981). These together with other empirical observations have 10 made plausible the view that such behaviours rely on specific internal timing mechanisms. As the subsequent review shall show, a number of researchers have developed a variety of research methodologies and mathematically rigorous explanations of mechanisms involved in temporal perception. The question of how a neural system might perform temporal perception is considerably less well understood. Despite a number of proposals there is a general air of uncertainty concerning this problem. This problem is owed, at least in part, to the absence of firm neurobiological data. There is a lack strong evidence implicating with some useful degree of precision where timing is performed in the brain (See Gibbon et al, 1997; O’Boyle, 1997) despite wide continued speculation (e.g. Ivry, 1996; Meck, 1996; Hinton and Meck, 1997; Marcar and Casini, 1998; Grossberg and Merril, 1998). The presence of such data would allow one to make inferences about what kind of computational architectures and dynamical interactions between brain structures perform timing. Compared to the maturity of other perceptual sciences such as vision and audition; the neurobiology of time perception is still in it early infancy. As a result, it is of particular importance to note that constructing a neural network model of timing must be considered a very open-ended research problem. 1.2 Research goals The goal of this research was considerably explorative. There is a wealth of experimental data on animal timing which suggests amongst other things the following properties: 1) Scalar property: The precision and variability of timed responses increases proportionally with the size of the interval being timed thus 11 yielding a constant fraction between the standard deviation and the mean of time estimates. This regularity is termed the scalar property and is an example of Weber’s law. 2) Linearity: The subjective representation of elapsed time increases approximately linearly with real objective time. Although there may be error in the representation of some elapsed interval, the mean of a sample of representations for that interval closely approximates objective time. 3) Quantifiable: The nervous system appears to be able to calculate differences between experienced intervals suggesting the subjective representation of elapsed time is quantifiable to the cognitive system. The primary concern of this project was to construct a neural network model that would reproduce the properties inferred from experimental data as outlined above. There are other connectionist models of timing in the literature but only one proposed by Church and Broadbent (1990) provides comprehensive simulations of these properties for psychophysical tasks involving timing in the range of seconds to minutes. The neural network is based on an information processing theory of timing called Scalar Expectancy Theory (SET). Timing in SET relies on an internal clock manifest in a pacemaker and accumulator. The pacemaker generates pulses at a mean rate and the accumulator sums the number of pulse generated. In Church and Broadbent's model timing is performed by resetting a set of oscillators each differing in frequency. A previously learnt interval is reproduced by comparing the current phases of the oscillator set with the 12 memory of the phases when the learnt interval elapsed. As such timing oscillators have never been located in the brain the goal was to explore other ways of modelling an internal clock mechanism. The most obvious approach was to develop a neural network model of a pacemaker/accumulator that would produce the properties outlined above. This is precisely what has been done together with simple neural networks that simulate performance on animal timing tasks. 13 SECTION 2 Experimental research on timing There are a number of psychophysical tasks that have been developed over the years for examining temporal perception in both animals and humans. Animal tasks include temporal generalisation (Church and Gibbon, 1982;), peak procedure (Roberts, 1981,), temporal bisection (Church and Deluty, 1977) and time-left (Gibbon and Church, 1981). Most of these tasks shall be reviewed, each of them suggesting important constraints on models of timing. 2.1 Temporal reproduction Recording the time spent on goal directed behaviour is paramount for applying ones energy optimally. One of the most widely reported temporal reproduction tasks is a modification of the ‘fixed interval schedule’ (FI) called the ‘peak interval’ (PI) (Roberts 1981). In this task a rat is presented with two types of trials: a) food (FI or reinforcement trials) and b) non-food (no reinforcement trials). On the food trial a signal initiates the beginning of a fixed time period (e.g. 20s). The first lever press made by the rat after the fixed time period is rewarded with food and the signal is terminated. On non-food trials the signal is presented for some period much greater than the food trial (e.g. 120 seconds with a 20 seconds food trial period) and the response time since signal onset for each lever press is recorded. 14 A first relevant observation concerns the ‘break-run-break’ pattern observed on single trials. A pattern of abrupt response states from low to high and ending with low is typically observed. This suggests that the animal is generalising the leant reinforced time so that when its subjective estimation of time is close to the remembered interval it responds at a high rate and continues to do this until the estimation seems significantly different. Of particular relevance is the observation that the median interval of the high response states, computed from single trials, tend to vary around the objective reinforcement time. Such an observation is predicted if the subjective estimation of elapsed time, remembered time or both have error that differs over trials. Another important observation is acquired when the mean probability of a response over a number of trials for every fixed successive interval (e.g.. 1 seconds) is plotted as a function of time since the onset of the peak trial stimulus. The typical pattern observed is a Gaussian shaped distribution characterised by a smooth and gradual increase in responses to the period that approximates the time of reward followed by a slightly asymmetrical but smooth decrease. This suggests that either the error in memory or elapsed time estimation causes the median interval of the high response states on single trials, to vary around the objective reinforcement time with a probability that is Gaussian shaped. A third and more important observation concerns the relationship between the standard deviation of the response times and the mean or peak response time. A discussed in section 1.1, temporal reproduction of increasingly larger intervals are performed by animals and humans but at a cost in 15 precision. If response distributions are taken for subject’s performance on the peak procedure, for each of a number of different reinforcement times, a constant fraction called the coefficient of variation (CV), is observed when the standard deviation of response times is divided by the mean response time. This pattern accords with Weber’s psychophysical law and is commonly referred to as the scalar property. This can also be expressed geometrically when the probability of a response is plotted against elapsed time since signal onset relative to the peak time. Under such manipulations the distribution superimpose. This scaling pattern is widely referred to as superposition (by American authors) or superimposition. In simple terms, the scalar property means that an animal’s temporal sensitivity or response precision increases proportionally with the length of the interval being estimated. Figure 2.1-2.3: Data from the peak interval procedure 2.1 Single trial performance Pe = response probability 1.00 Pe 0.75 0.50 0.25 0 20 40 60 80 100 Elapsed time 16 120 2.2 Mean response distribution 1.00 Pe = response probability Pe 0.75 0.50 0.25 0.00 0 20 40 60 80 100 120 140 Elapsed time 2.3 Superimposition of response distributions 1.00 Pr= relative response probability Pr 0.75 0.50 0.25 0.00 0 .5 .75 1.0 1.25 1.5 2.0 Elapsed time/peak time 2.2 The temporal discrimination bias A task called temporal bisection reveals an interesting discrimination bias when performed by animals (Church and Deluty, 1977). These authors were concerned with how the representation of time increases in the nervous system. In this procedure an animal has to discriminate between two intervals of different lengths called short (S) and long (L). Each of these intervals is signalled by the same stimulus. Typically, S might be 2 seconds 17 and L might be 8 seconds. Each of the intervals S and L is learnt to be associated with a corresponding lever (S lever and L lever). Only when the correct associated lever is pressed will the animal be rewarded. When the animal has learnt to discriminate between the two intervals to a high degree of accuracy, a range of intermediate intervals including S and L are presented. In this stage of the task no rewards are given for lever presses to any of the intervals. The probability of a response to the L lever is computed for each of the intervals presented and plotted as a function of the interval length. The pattern of L responses is typically a half Gaussian that increases to the highest response probability at the interval L. Of theoretical concern is the location of the bisection point. This is the intermediate interval at which discrimination is indifferent and results in 50% of the response presses for L. The researchers predicted that if the bisection point was found at the arithmetic mean, this would indicate linear time, at the geometrical mean, logarithmic time, and at the harmonic mean, reciprocal time. In line with a logarithmic view the bisection point was found at the geometric mean (the square root of the product of S and L). The geometrical mean has been a commonly expressed prediction of bisection performance with animal subjects (e.g. Wearden, 1994; RodrigouzGirones and Kacelnik, 1998). However, data has suggested that near geometrical mean bisection is merely a co-incidental finding that frequently occurs with 1:4 ratios but not always (See Siegal and Church 1984; Siegal 1986). Siegal and Church (1986) for example have found that when S was 1 second and L was 16 seconds the bisection point was significantly less 18 than the geometrical mean and less than the bisection point for performance when S was 2 seconds and L was 8 seconds. Both 1:16 and 2:8 stimulus intervals predict the same geometrical mean even though the ratios are different. These studies suggest that the ratios of the two intervals may play an important role in determining the animal bisection point. This is also true in human temporal bisection (Wearden and Ferrara, 1996) Note that human temporal bisection performance shows many differences than when performed by animals. This literature shall not be reviewed. The problem of conceptualising the discrimination bias in animal temporal bisection tasks, will be addresses in section 9, in which a novel method is reported different to that expressed by proponents of SET. 2.3 The linear flow of subjective time One important experimental task suggesting that subjective time increases linearly rather than logarithmically is Time-Left (Gibbon and Church 1981). In this task an animal is trained on two intervals each of which is associated with a corresponding signal and lever. These are just FIs in which the animal is rewarded for the correct lever press after some fixed interval. The researchers cited here used 30 and 60 second FIs. The 30 second interval is called the standard FI (S-FI) and the 60 second interval is called the comparison FI (C-FI). Following learning, combined trials are given and the animal gets to make a choice of which lever to press for a reward. The combined trials always begin with C-FI and then S-FI is presented at 15, 30 or 45 seconds in to C-FI. If S-FI is presented 15 seconds in to the C-FI, the rat will be rewarded earlier if it presses the S-FI lever. If the S-FI is presented at 45 seconds in to the C-FI the rat will be rewarded earlier if it presses the C-FI lever. At 30 seconds in to C-FI, rewards will be given for 19 either lever press at the same time. As one would predict of linear subjective time, rats favoured the S-FI response at 15 seconds in to C-FI, were indifferent to either response at 30 seconds in to C-FI, and favoured the C-FI response at 45 seconds in to C-FI. Relative to the indifference at 30 seconds, the preference increase for S-FI at 15 seconds and C-FI at 45 seconds in to C-FI, were equivalent and therefore symmetrical. These results strongly suggest that mean of subjective time estimates increases in a linear fashion with objective time. This data also suggests another important feature concerning the subjective representation of time. That is that, such representations, both remembered and currently elapsed, are quantifiable to the cognitive system. In the timeleft task it would seem reasonable to infer that an animal’s nervous system computes the difference between its current representation of elapsed time and memories for two other times. It would be hard to imagine how such a task could be performed without quantifiable representations. 2.4 Pavlovian response timing Another kind of experimental task worthy of mention involves Pavlovian or classical conditioning tasks. An example of this is the eyeblink procedure in which a subject is presented with a conditioned stimulus (CS) such as a tone followed by an unconditioned stimulus (US) such as a puff of air aimed at the subjects eye. The puff of air will evoke an unconditioned response (UCR) of eyelid closure. Through learning the US becomes associated with the CS permitting the subject to produce a timed conditioned response (CR) of eyelid closure when only the CS is present. The timing of the CR is accurate occurring just before the US is normally emitted. Experiments of 20 this kind have reported accurate response timing in the milliseconds range. This is an example of short interval timing and has been thought to involve timing mechanism different to those used for longer intervals as employed in experiments discussed above (Ivry, 1996). 21 22