A computational model of working memory in a
cognitive task utilizing a thalamo-cortical circuit
R. Schuler
Abstract
A computational model using a neural network architecture is presented to explain
schizophrenia patients' performance on the well-known Wisconsin Card Sorting Test
(WCST), a frontal lobe task. Models used to simulate patient performance on
the Tower of London (TOL) and Tower of Hanoi (TOH) tasks were also considered in the
design of the present model. The specific focus of the present model is to explain
empirical results for patient groups by simulating working memory deficits. An earlier
model by Amos (2000) was used as the basis for the design of the present model.
However, in the present model the thalamo-cortical loop is completed, whereas in the
Amos model the thalamus was used only as a target of the basal ganglia and was not
reconnected to provide feedback to the frontal cortex. Completing the loop allows the
present model to better simulate the effects of sub-cortical dysfunction.
Introduction
Three models were reviewed in detail leading up to the design of the present model. Each
model contributes to the understanding of the role of the prefrontal cortex, of the effects
of frontal lobe dysfunction on patient groups tested with frontal lobe tasks, and of the
localization of working memory in the prefrontal cortex.
In Newman et al. (2003), a computational model is proposed which synthesizes earlier work on
executive function (Shallice, 1982) and the Soar model (Newell, 1990). The model seeks
to explain empirical results of test subjects on the TOL task and focuses particular
attention on the prefrontal cortex, the parietal lobe, and the lateralization of functions therein.
Using the model, Newman et al. seek an explanation for the types of functionality
embodied in the various cortical areas. They hypothesize the localization of functionality
for a visuo-spatial workspace, for planning, and for plan execution.
The Newman et al. model is a symbolic computational model based on the 4CAPS
cognitive architecture. Their model fits their empirical results qualitatively and highlights
the distributed nature of neural activation for solving complex problems. In particular,
they find a possible correlation for working memory activation in the prefrontal cortex
and possible lateralization in the right hemisphere.
Goel et al. present a symbolic computer model to explain the effects of frontal lobe
dysfunction on the TOH task using working memory performance. They designed their
model based on earlier work by two of the authors in which they tested a control group of
20 subjects and a patient group of 20 subjects using the TOH task (Goel & Grafman,
1995). The model was based on the 3CAPS cognitive architecture (Just & Carpenter,
1992), with two sets of production rules, a coordinator (executive function), and a
knowledge base -- the model did not utilize a neural network architecture. Goel et al.
Model Report
Page 1 of 10
R. Schuler
fitted the model to the empirical results of the control group and then introduced working
memory deficits by increasing the decay rate of the knowledge base's memory pools.
Amos (2000) simulates empirical results from subjects performing the WCST with a
neural network model. In this model, the regions of the prefrontal cortex, basal ganglia
(striatum and SN/GP), and thalamus are examined. Amos uses the model to explain
empirical results for patients with schizophrenia, Parkinson's disease (PD), or
Huntington's disease (HD), whose dysfunction manifests in perseverative errors related to
frontal cortex dysfunction and random errors related to striatal dysfunction. The
evaluation used in the study quantified the number of
WCST categories achieved by control versus various patient groups and qualitatively
recorded the type of error, whether perseverative or random. Though the brain regions are
hypothesized to form a neuroanatomic loop from frontal cortex through thalamus, Amos
notes that the model did not attempt to simulate the feedback loop from thalamus to
prefrontal cortex.
Summary Data
The computational model presented in this report is based on the model by Amos (2000).
It depends upon a neuroanatomic loop from the frontal cortex through the basal ganglia
structures of the striatum and SNr/GPi to the thalamus and looping back to the cortex.
The following summary data supports the design:
• Roberts et al. (1994) show that overloading working memory causes subjects to
perform a cognitive task with ability similar to that of frontal lobe patients, thereby
suggesting that working memory may be localized in the prefrontal cortex.
• Braver and Bongiolatti (2002) report fMRI studies showing right-lateralized
activation in the prefrontal cortex for tasks requiring greater sub-goal
management, suggesting that working memory function may reside in the right PFC.
• Baker et al. (1996) report PET results from a study involving the Tower of London task
that indicate that visuospatial working memory may be localized to the prefrontal
area.
• Petrides (1995) shows that monkeys with lesions to the prefrontal cortex exhibit
working memory deficits.
• Gerfen (1992) and Alexander et al. (1992) present neuroanatomical research
on the organization of the basal ganglia showing that the striatum is the
primary recipient of afferent connections to the basal ganglia and that the striatum
projects to the SN/GP. The research shows that the striatum is highly organized
and may perform some type of information integration (or dimensionality
reduction), as the ratio of incoming to outgoing projections is very high.
• Gerfen (1992) and Alexander et al. (1992) present neuroanatomical research
showing that the SN/GP produce the primary outgoing projections from the basal
ganglia and that they project to the thalamus, thus forming a loop back to the
cortex. Amos (2000) cites additional research indicating that the
thalamo-cortical loop may be used to perform active gating.
Based on the given summary data, functionality in the frontal cortex is associated with
working memory, executive function and attention. The striatum is hypothesized to
provide a type of information integration and then inhibit the SNr/GPi. The SNr/GPi
complex is tonically active and inhibitory to the thalamus, forming an activity gating
mechanism that in turn projects to the cortex to maintain and initiate activity. In addition
to the summary data describing this loop, Prescott et al. (2003) describe the role of basal
ganglia circuits and the function they provide for action gating and action selection. A
notable example is the Dominey and Arbib model of saccadic eye movements (1995), in
which this cortical-thalamic loop is used to produce delayed saccades.
The Model
The present model differs from Amos (2000) in the following ways. Working memory is
modeled with multiple layers of neurons that are activated or inhibited by a gating signal.
Rule generation is performed in a non-biologically realistic manner without regard to
insensitivity to punishment. Critically important to the presented model, the
thalamo-cortical loop is completed such that the activation of the thalamus is fed back to the
cortex and used as the gating signal to working memory in the cortex. Thus, in the
present model it is possible to explore additional effects of basal ganglia dysfunction.
[Figure 1. The Model. Diagram labels: Reward, S, T1, T2, T3, T4, Ctx, MD, Str, SNr/GPi.]
The model works as follows.
Input (Reward, S, T1,…,T4)
The input includes the Reward, Stimulus, and Target signals provided by the Test
Manager module. The manager generates a representation of four target cards with at
least 3 features (e.g., form, color, number) and 4 possible values for each feature. The
Target cards (T1,…,T4) remain unchanged throughout the test, while the manager
generates a new representation of the Stimulus card for each trial, which the model must
attempt to match to one of the 4 target cards. The subject is expected to match the
stimulus card to a target card
based on a matching rule. The matching rule determines whether the subject should
match the stimulus to one of the targets by color, number, or form. The
manager produces a punishment signal if the subject makes an incorrect match. The rule
is not communicated to the subject, and after several successful matches the rule changes
without warning. Thus, the subject must discover the rule through trial and
error and then remember the rule throughout the trials in the current category.
New. In my original model the targets changed with each trial, which is inconsistent with
the real WCST task. In the revised model, the targets remain fixed.
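The Test Manager's role described above can be sketched as follows. This is only an illustration under stated assumptions, not the report's implementation: the function names, the dictionary card encoding, the +1/-1 reward coding, and the choice that target i carries value i on every feature (so exactly one target matches the stimulus on any feature) are all hypothetical.

```python
import random

FEATURES = ("form", "color", "number")  # the three card features named in the text
N_VALUES = 4                            # four possible values per feature

def make_targets():
    """Four fixed target cards; assumed layout: target i has value i on every feature."""
    return [{f: i for f in FEATURES} for i in range(N_VALUES)]

def make_stimulus(rng):
    """A stimulus card with an independently drawn value for each feature."""
    return {f: rng.randrange(N_VALUES) for f in FEATURES}

def correct_target(stimulus, targets, rule):
    """Index of the target that matches the stimulus on the rule's feature."""
    for i, card in enumerate(targets):
        if card[rule] == stimulus[rule]:
            return i
    raise ValueError("no target matches the stimulus on this feature")

def feedback(choice, stimulus, targets, rule):
    """Assumed coding: +1 (reward) for a correct match, -1 (punishment) otherwise."""
    return 1 if choice == correct_target(stimulus, targets, rule) else -1
```

The targets are built once and reused for every trial, while make_stimulus is called per trial, which mirrors the fixed-target revision noted above.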
Cortex (Ctx)
The prefrontal cortex performs a few key cognitive tasks including rule generation,
attention, and, most importantly for this simulation, working memory. The Working
Memory sub-module is composed of three neural layers: an input potential layer, a stored
memory layer, and an output potential layer. The input potential layer (first layer)
receives a projection from the Rule Generator for the currently proposed rule. The stored
memory layer (second layer) receives a recurrent connection from the output layer (third
layer). The output layer receives excitatory projections from both the input and stored
layers. The gating signal excites the input layer and neutralizes the stored layer; therefore,
when the gating signal is active the Working Memory module stores a new pattern, and
when the gating signal is not active the module maintains stored information.
O’Reilly and Frank (2006) conceptually describe a dynamic gating memory model
but (as far as I recall) do not provide a hypothesis for a biologically plausible
dynamically gated memory structure. In my model, I show that a recurrent neural
network that treats the gating signal essentially as a bitmask can function as
dynamically gated memory. The memory is thus stored temporarily in the neural
activation rather than through changes of weights (as in long-term memory).
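A minimal sketch of the gated three-layer memory described above, assuming binary units, a single scalar gate of 0 or 1, and an output threshold of 1. None of these details are given in the report, and the function name is hypothetical.

```python
def working_memory_step(rule_input, prev_output, gate):
    """One update of the three-layer working memory; gate is assumed to be 0 or 1."""
    # Input layer: passes the proposed rule only while the gate is active.
    input_layer = [gate * r for r in rule_input]
    # Stored layer: recurrent copy of the last output, neutralized by the gate.
    stored_layer = [(1 - gate) * o for o in prev_output]
    # Output layer: excited by both layers; fires if the summed input reaches 1.
    return [1 if i + s >= 1 else 0 for i, s in zip(input_layer, stored_layer)]

memory = [0, 0, 0]
memory = working_memory_step([0, 1, 0], memory, gate=1)  # gate open: store the new pattern
memory = working_memory_step([1, 0, 0], memory, gate=0)  # gate closed: keep the old pattern
```

The gate acts as a mask that selects which of the two projections reaches the output, so the pattern persists in unit activity with no weight change.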
The Rule Generator sub-module is a non-biologically realistic module that randomly
produces a new rule whenever it receives a punishment signal and projects the
proposed rule to the working memory.
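The Rule Generator's behavior can be sketched in a few lines. The rule names, the function name, and the convention that a negative reward value means punishment are assumptions made for illustration; the only constraint taken from the report is that the newly drawn rule differs from the currently selected one.

```python
import random

RULES = ("color", "form", "number")  # assumed rule labels

def next_rule(current, reward, rng):
    """Keep the rule unless punished; on punishment draw a different rule at random."""
    if reward >= 0:
        return current
    return rng.choice([r for r in RULES if r != current])
```

Because only the current rule is excluded, a rule that failed two trials earlier can be drawn again.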
Removed. The Attention sub-module completes the prefrontal cortex module by
combining the currently active rule from working memory with the stimulus card
representation into an attention signal. The attention signal is a matrix with one unit
active which represents the feature of the stimulus card for which the simulated subject
seeks a match.
New. In the revised model, the Cortex will present the rule to the BG as an available
action that the BG may use in selecting the appropriate match. It will be up to the BG to
use the rule to select an action rather than to use the previous “attention” signal to find a
match.
Basal Ganglia (Str, SNr/GPi)
The basal ganglia are composed of the Striatal module (Str) and the SNr/GPi module
(SNr/GPi). The striatum in turn consists of striatal columns that accept projections
for each of the four target card representations along with the attention signal from the
cortex. The columns integrate the information into a single inhibitory signal that projects
to the tonic neural units of the SNr/GPi. The SNr/GPi neurons are tonically active and
inhibited by the striatal projections. The SNr/GPi neurons, when inhibited, disinhibit the
respective thalamic units.
New. The BG will accept inputs which represent the stimulus (S) and target (T1,…,T4)
cards. The combined inputs form a 3 x 20 input matrix. The role of the Str will be to
reduce this raw representation to a simplified representation of just the matching features.
For instance, if the stimulus card has a red figure and target card 1 has a red figure, then
the 8 neurons that represent the characteristics of the stimulus and target card may be
reduced to a single neuron indicating a matching feature (match = on) in that
stimulus-target pair.
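One way to picture this reduction is sketched below. The layout is an assumption: each row is a feature and the 20 columns hold five one-hot blocks of four values (S first, then T1 to T4). The report gives only the 3 x 20 shape, and the function names are hypothetical.

```python
N_VALUES = 4  # values per feature
N_CARDS = 5   # stimulus S followed by targets T1..T4

def encode(cards):
    """One-hot encode five cards (each a 3-tuple of value indices) as a 3 x 20 matrix."""
    matrix = [[0] * (N_CARDS * N_VALUES) for _ in range(3)]
    for c, card in enumerate(cards):
        for f, value in enumerate(card):
            matrix[f][c * N_VALUES + value] = 1
    return matrix

def match_features(matrix):
    """Reduce the 3 x 20 input to a 3 x 4 matrix of stimulus-target feature matches."""
    matches = []
    for row in matrix:
        stim = row[:N_VALUES]  # the stimulus block for this feature
        row_matches = []
        for k in range(1, N_CARDS):
            target = row[k * N_VALUES:(k + 1) * N_VALUES]
            # The 8 units (4 stimulus + 4 target) collapse to one match unit.
            row_matches.append(int(any(s and t for s, t in zip(stim, target))))
        matches.append(row_matches)
    return matches
```

Each output unit stands for one stimulus-target pair on one feature, which is the "match = on" neuron in the example above.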
New. The BG will accept a signal from the cortex representing the current rule rather than
the “attention” signal used in the previous model. The rule essentially states a possible
action, such as “match cards by color.” The BG will then perform an action selection
function by using the current rule to determine which target card to match with the
stimulus card. Open issue: I am not certain how to model this action selection function or
even whether to model it in the Str or the SNr/GPi.
I believe these changes capture your input from our discussion during our previous
weekly meeting with the BG group. Also, it is my hope that this change enhances the
biological plausibility of the model, and furthermore enables collaboration options with
other BG team members (i.e., reuse of my model or reuse of their modules in my model).
The output of the BG remains unchanged. The SNr/GPi is composed of tonically active
inhibitory neurons that project to (inhibit) the corresponding Thalamic units. When the Str
determines a match or an action, its firing inhibits the SNr/GPi, thereby disinhibiting the
MD. The disinhibition allows the action to proceed.
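The double inhibition described above (Str inhibits SNr/GPi, SNr/GPi inhibits MD) can be sketched with rectified linear units. The tonic rate, the constant drive to MD, and the unit type are illustrative assumptions; the report does not specify them, and the sketch leaves the open action-selection question aside.

```python
TONIC = 1.0  # assumed baseline firing rate of every SNr/GPi unit

def relu(x):
    """Rectification: firing rates cannot go below zero."""
    return max(0.0, x)

def bg_gate(striatal_activity, md_drive=1.0):
    """Map striatal activity to MD output through the tonically active SNr/GPi."""
    snr_gpi = [relu(TONIC - s) for s in striatal_activity]  # Str inhibits SNr/GPi
    md = [relu(md_drive - g) for g in snr_gpi]              # SNr/GPi inhibits MD
    return snr_gpi, md

# Only the channel whose striatal unit fires is released in the thalamus.
snr_gpi, md = bg_gate([0.0, 0.0, 1.0, 0.0])
```

With no striatal activity every MD unit stays suppressed, so the thalamus emits no gating signal until the Str selects a channel.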
Thalamus (MD)
The Thalamus module is comprised of a layer of neurons corresponding to the selected
action as determined by the basal ganglia. The output of the thalamus is a gating signal
that indicates which action is to be performed (e.g., "match stimulus card to target 3").
The thalamo-cortical loop is completed by feeding the thalamic output to the frontal
cortex. As described above, the signal is used as a gating signal that projects to the
Working Memory module.
Simulation Results
The model is used to simulate empirical results of normal and patient groups on the
Wisconsin Card Sorting Test (WCST). In the WCST, a subject must sort a stack of cards
according to a sorting rule. A card may have multiple features such as color, shape, and
number; for instance, a card may have two blue circles printed on it or one green square
printed on it. The subject matches each stimulus card to one of
four target cards according to the sorting rule; for instance, if the rule is to sort by
shape, the subject may ignore color and number. After the subject correctly matches 10 cards in a
row, the rule is changed without the subject's knowledge and the subject must learn and
apply a new rule in order to sort the next category of cards. The test completes when the
subject has correctly matched 6 categories.
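The scoring rule just described (10 consecutive correct matches complete a category, and the test ends after 6 categories) can be written as a short counter. The function name and the boolean outcome encoding are assumptions for illustration.

```python
def count_categories(outcomes, run_length=10, max_categories=6):
    """Count categories completed in a sequence of trial outcomes (True = correct)."""
    categories = 0
    streak = 0
    for correct in outcomes:
        streak = streak + 1 if correct else 0
        if streak == run_length:
            categories += 1
            streak = 0  # the rule changes and a new category begins
            if categories == max_categories:
                break   # the test is complete
    return categories
```

A single error resets the streak, so nine correct matches followed by one error earn no category.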
Amos (2000) presents experimental results for normal subjects and multiple patient groups
including those with schizophrenia, schizophrenia with tardive dyskinesia (TD),
Parkinson's disease (PD), PD with dementia, Huntington's disease (HD), and HD with
dementia. The present model was used primarily to evaluate normal, schizophrenia,
and schizophrenia with TD performance. Once the model was fitted to the normal
results, the connection weights from the gating signal to the working memory layers were
reduced and the noise-to-signal ratio was increased in order to produce failures and
perseverative errors in line with the schizophrenia patients. Then, gain was reduced and
the bias against firing was increased in the striatal columns to reproduce results similar
to those of schizophrenia with TD patients.
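To make the roles of gain, bias, and noise concrete, here is a sketch of a single unit. The logistic form is an assumption, since the report does not state the activation function used, and the function and parameter names are hypothetical.

```python
import math
import random

def unit_activation(net_input, gain=1.0, bias=0.0, noise_sd=0.0, rng=random):
    """Assumed logistic unit: gain scales sensitivity, bias raises the firing threshold."""
    noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
    return 1.0 / (1.0 + math.exp(-gain * (net_input - bias + noise)))

healthy = unit_activation(1.0, gain=4.0, bias=0.0)   # strong response to the input
impaired = unit_activation(1.0, gain=1.0, bias=2.0)  # lower gain, bias against firing
```

Under this assumption, lowering the gain flattens the response and raising the bias pushes the same input below the firing threshold, which is the direction of the striatal manipulation described above.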
The present model demonstrates that working memory deficits localized in the frontal
cortex can account for frontal lobe patient performance on cognitive tasks and further that
sub-cortical dysfunction can degrade performance on (set-shifting) cognitive tasks.
In the following benchmark, the presented model is compared to the experimental results
and simulation results as reported by Amos (2000).
Experiment / Simulation         Notes                     Normal              Schizophrenic       Schizophrenic TD
------------------------------  ------------------------  ------------------  ------------------  ----------------
Experimental results reported                             6.0 categories      1.5                 --
by Amos (2000) for normal vs.                             9% perseveration    27%
Schizophrenic patients

Experimental results reported   --                        --                  3.9                 2.5
by Amos (2000) for normal vs.                                                 27%                 41%
Schizophrenic vs.
Schizophrenic with TD patients

Simulation results using Amos   Calibrated to the first   6.0                 1.5                 --
(2000) computational model      empirical study           9%                  27%
                                                          (Frontal Model)     (Frontal Model)

Simulation results using Amos   Calibrated to the second                      3.8                 2.5
(2000) computational model      empirical study                               27%                 41%
                                                                              (Striatal Model)    (Striatal Model)
                                                                              3.8                 2.5
                                                                              27%                 29%

Simulation using the presented                            6.0                 1.8                 1.9
computational model                                       14.6%               27.8%               25.7%*
* Though the overall rate of perseveration per attempt was lower than for non-TD patients,
the results show that when a simulated subject fails, the TD subject is more likely to fail
due to a perseverative error than the non-TD subject.
Discussion and Future Work
I would characterize my results (from last semester) as mixed. I simulated dysfunction by
altering the gain and bias in neural layers in the frontal and striatal modules; however, I
also found that altering the weights used in the frontal module (working memory)
corresponding to the input patterns, and particularly the gating signal, was most effective
for simulating the experimental results. I fitted the simulation to the normal subjects from
the experimental results. My simulation’s control group had a slightly higher rate of
perseverative errors. I believe this is due to my implementation of the Rule Generator
module, which randomly selects the next rule to attempt when the PFC receives a
punishment signal. The rule generator only ensures that the random rule is not the same
as the currently selected rule, and therefore there is a chance of repeating a failed rule
while making attempts to find the correct rule.
Once the model was fitted to the control group, I
manipulated the gain, bias, noise, and weights within the frontal module (working
memory) and found a combination of settings for the neural network that closely
reproduced the experimental results from the first study cited by Amos (2000). Then I
reduced the gain and increased the bias against firing in the striatal module to attempt to
simulate the experimental results for schizophrenic patients with TD. In this case, I had
less success, but there were signs that the model may be on the right track. On the surface,
the downside is that the simulation did not reproduce results matching the experimental
results. In fact, the simulated subjects with TD performed slightly better than the non-TD
subjects. However, a closer evaluation of the simulation results shows that the ratio of
perseverative errors to overall errors was higher for the subjects with TD than for the non-TD
subjects. Nearly 60% of the errors of the TD group were perseverative, whereas around
51% of the errors of the non-TD group were perseverative. This is close to a 20% relative
increase in the share of perseverative errors for the simulation when introducing striatal
dysfunction to match schizophrenic patients with TD.
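The "close to a 20%" figure is a relative change between the two error shares. Using the rounded values quoted above (so the result is itself approximate), the arithmetic is:

```latex
\[
\frac{0.60 - 0.51}{0.51} = \frac{0.09}{0.51} \approx 0.176
\]
```

That is roughly an 18% relative increase, or 9 percentage points in absolute terms.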
Future work on the model falls into three main groups plus miscellany:
1) Update the model’s structure according to the comments under the section “The
Model.” This primarily entails removing the Attention sub-module from the
cortex, projecting the stimulus card directly to the BG, using the BG to reduce
the dimensionality of the input signals first, and then using the reduced
representation of the card inputs along with the rule input to select an action.
2) Experiment with the neural network parameters for normal, Schizophrenic, and
Schizophrenic + TD scenarios to better match the experimental results.
3) I would also like to introduce selective destruction of neurons in the Striatum
to simulate the effects of Huntington’s disease (Mendez, 1994).
4) In addition to the above, there are some “loose ends” such as an incomplete
implementation of “winner lose all” in the SN/GP.
References
[Alexander et al., 1992] Alexander, G.E., DeLong, M.R., Crutcher, M.D., 1992, Do
cortical and basal ganglionic motor areas use motor programs to control movement?,
Behavioral and Brain Sciences, 15:656-665.
[Amos, 2000] Amos, A., 2000, A Computational Model of Information
Processing in the Frontal Cortex and Basal Ganglia, Journal of
Cognitive Neuroscience, 12:505-519.
[Arbib, 2003] Arbib, M. A., 2003, Backpropagation: General Principles, in The
Handbook of Brain Theory and Neural Networks (M. A. Arbib,
Ed.), Cambridge, MA: MIT Press, pp. 147-151.
[Baker et al., 1996] Baker, S.C., Rogers, R.D., Owen, A.M., Frith, C.D., Dolan, R.J.,
Frackowiak, R.S.J., and Robbins, T.W., 1996, Neural systems
engaged by planning: a PET study of the Tower of London task,
Neuropsychologia, 34:515-526.
[Braver and Bongiolatti, 2002] Braver, T.S., Bongiolatti, S.R., 2002, The Role of
Frontopolar Cortex in Subgoal Processing, NeuroImage, 15:523-536.
[Dominey et al., 1995] Dominey, P., Arbib, M., and Joseph, J.-P., 1995, A model of
corticostriatal plasticity for learning oculomotor associations and
sequences, J. Cognit. Neurosci., 7(3):311-336.
[Fellous and Suri, 2003] Fellous, J.-M., and Suri, R. E., 2003, Dopamine, Roles of, in The
Handbook of Brain Theory and Neural Networks (M. A. Arbib,
Ed.), Cambridge, MA: MIT Press, pp. 147-151.
[Goel et al., 2001] Goel, V., Pullara, S.D., Grafman, J., 2001, A computational
model of frontal lobe dysfunction: working memory and the
Tower of Hanoi task, Cognitive Science: A Multidisciplinary
Journal, 25:287-313.
[Gerfen, 1992] Gerfen, C.R., 1992, The neostriatal mosaic: multiple levels of
compartmental organization, Trends in Neurosciences, 15:133-9.
[Mendez, 1994] Mendez, M. F., 1994, Huntington’s disease: Update and review of
neuropsychiatric aspects, International Journal of Psychiatry in
Medicine, 24:189-208.
[Newell, 1990] Newell, A., 1990, Unified Theories of Cognition, Harvard
University Press.
[Newman et al., 2003] Newman, S.D., Carpenter, P.A., Varma, S., and Just, M.A., 2003,
Frontal and parietal participation in problem solving in the Tower
of London: fMRI and computational modeling of planning and
high-level perception, Neuropsychologia, 41:1668-1682.
[O’Reilly and Frank, 2006] O’Reilly, R.C., and Frank, M.J., 2006, Making Working Memory
Work: A Computational Model of Learning in the Prefrontal
Cortex and Basal Ganglia, Neural Computation, 18:283-328.
[Petrides, 1995] Petrides, M., 1995, Impairments on nonspatial self-ordered and
externally ordered working memory tasks after lesions of the mid-dorsal part of the
lateral frontal cortex in the monkey, Journal of Neuroscience, 15:359-375.
[Roberts et al., 1994] Roberts, R. J., Hager, L. D., Heron, C., 1994, Prefrontal cognitive
processes: Working memory and inhibition in the antisaccade
task, Journal of Experimental Psychology: General, 123:374-393.