A computational model of working memory in a cognitive task utilizing a thalamo-cortical circuit

R. Schuler

Abstract

A computational model using a neural network architecture is presented to explain the performance of schizophrenia patients on the well-known Wisconsin Card Sorting Test (WCST), a frontal lobe task. In addition, models used to simulate patient performance on the Tower of London (TOL) and Tower of Hanoi (TOH) tasks were considered in the design of the present model. The specific focus of the present model is to explain empirical results for patient groups by simulating working memory deficits. An earlier model by Amos (2000) was used as the basis for the design of the present model. However, in the present model the thalamo-cortical loop is completed, whereas in the Amos model the thalamus was used only as a target of the basal ganglia and was not reconnected to provide feedback to the frontal cortex. Completing the loop allows the present model to better simulate the effects of sub-cortical dysfunction.

Introduction

Three models were reviewed in detail leading up to the design of the present model. Each model contributes to the understanding of the role of the prefrontal cortex, of the effects of frontal lobe dysfunction on patient groups tested with frontal lobe tasks, and of the localization of working memory in the prefrontal cortex.

In Newman et al. (2003), a computational model is proposed which synthesizes earlier work on executive function (Shallice, 1982) and the Soar model (Newell, 1990). The model seeks to explain empirical results of test subjects on the TOL task and focuses particular attention on the prefrontal cortex, the parietal lobe, and the lateralization of functions therein. Using the model, Newman et al. seek an explanation for the types of functionality embodied in the various cortical areas. They hypothesize the localization of functionality for a visuo-spatial workspace, for planning, and for plan execution. The Newman et al.
model is a symbolic computational model based on the 4CAPS cognitive architecture. Their model fits their empirical results qualitatively and highlights the distributed nature of neural activation for solving complex problems. In particular, they find a possible correlation for working memory activation in the prefrontal cortex and possible lateralization in the right hemisphere.

Goel et al. (2001) present a symbolic computer model to explain the effects of frontal lobe dysfunction on the TOH task using working memory performance. They designed their model based on earlier work by two of the authors in which they tested a control group of 20 subjects and a patient group of 20 subjects using the TOH task (Goel & Grafman, 1995). The model was based on the 3CAPS cognitive architecture (Just & Carpenter, 1992), with two sets of production rules, a coordinator (executive function), and a knowledge base; the model did not utilize a neural network architecture. Goel et al. fitted the model to the empirical results of the control group and then introduced working memory deficits by increasing the decay rate of the knowledge base's memory pools.

Model Report Page 1 of 10 R. Schuler

Amos (2000) simulates empirical results from subjects performing the WCST with a neural network model. In this model, the regions of the prefrontal cortex, basal ganglia (striatum and SN/GP), and thalamus are examined. Amos uses the model to explain empirical results for subjects suffering frontal lobe dysfunction, which manifests in perseverative errors related to frontal cortex dysfunction and random errors related to striatal dysfunction; the patients suffered from schizophrenia, Parkinson's disease (PD), or Huntington's disease (HD). The evaluation used in the study quantified the number of WCST categories achieved by control versus various patient groups and qualitatively recorded the type of error, whether perseverative or random.
Though the brain regions are hypothesized to form a neuroanatomic loop from frontal cortex through thalamus, Amos notes that the model did not attempt to simulate the feedback loop from thalamus to prefrontal cortex.

Summary Data

The computational model presented in this report is based on the model by Amos (2000). It depends upon a neuroanatomic loop from the frontal cortex through the basal ganglia structures of the striatum and SNr/GPi to the thalamus and back to the cortex. The following summary data support the design:

Roberts et al. (1994) show that overloading working memory causes subjects to perform a cognitive task with ability similar to that of frontal lobe patients, thereby suggesting that working memory may be localized in the prefrontal cortex.

Braver and Bongiolatti (2002) use fMRI results in studies that show right-lateralized activation in the prefrontal cortex for tasks requiring greater sub-goal management, suggesting that working memory function may reside in the right PFC.

Baker et al. (1996) use PET results from a study involving the Tower of London task which indicate that visuospatial working memory may be localized to the prefrontal area.

Petrides (1995) shows that monkeys with lesions to the prefrontal cortex exhibit working memory deficits.

Gerfen (1992) and Alexander et al. (1992) present neuroanatomical research regarding the organization of the basal ganglia, showing that the striatum is the primary recipient of afferent connections to the basal ganglia and that the striatum projects to the SN/GP. The research shows that the striatum is highly organized and may perform some type of information integration (or dimensionality reduction), as the ratio of incoming to outgoing projections is very high.

Gerfen (1992) and Alexander et al. (1992) present neuroanatomical research showing that the SN/GP produce the primary outgoing projections from the basal ganglia and that they project to the thalamus, thus forming a loop back to the cortex.
Amos (2000) cites additional research indicating that the thalamo-cortical loop may be used to perform active gating.

Based on the given summary data, functionality in the frontal cortex is associated with working memory, executive function, and attention. The striatum is hypothesized to provide a type of information integration and then inhibit the SNr/GPi. The SNr/GPi complex is tonically active and inhibitory to the thalamus, forming an activity gating mechanism that in turn projects to the cortex to maintain and initiate activity. In addition to the summary data describing this loop, Prescott et al. (2003) describe the role of basal ganglia circuits and the function they provide for action gating and action selection. A notable example is the Dominey and Arbib model of saccadic eye movements (1995), in which this cortical-thalamic loop is used to produce delayed saccades.

The Model

The present model differs from Amos (2000) in the following ways. Working memory is modeled with multiple layers of neurons that are activated or inhibited by a gating signal. Rule generation is performed in a non-biologically realistic manner without regard to insensitivity to punishment. Critically important to the presented model, the thalamocortical loop is completed such that the activation of the thalamus is fed back to the cortex and used as the gating signal to working memory in the cortex. Thus, in the present model it is possible to explore additional effects of basal ganglia dysfunction.

[Figure 1. The Model. Inputs: Reward, S, T1, T2, T3, T4. Modules: Ctx, MD, Str, SNr/GPi.]

The model works as follows.

Input (Reward, S, T1,…,T4)

The input includes the Reward, Stimulus and Target signals provided by the Test Manager module. The manager generates a representation of four target cards with at least 3 features (e.g., form, color, number) with 4 possible values for each feature.
The Target cards (T1,…,T4) remain unchanged throughout the test, while the manager generates a representation of the stimulus card, which the model must attempt to match to one of the 4 target cards. The Stimulus card changes for each trial throughout the test. The subject is expected to match the stimulus card to a target card based on a matching rule. The matching rule determines whether the subject should match the stimulus to one of the targets based on color, number, or form. The manager produces a punishment signal if the subject makes an incorrect match. The rule is not communicated to the subject, and after several successful matches the rule changes without warning. Thus, the subject must discover the rule through trial and error and then remember the rule throughout the trials in the current category.

New. In my original model the targets changed with each trial, which is inconsistent with the real WCST. In the revised model, the targets remain fixed.

Cortex (Ctx)

The prefrontal cortex performs a few key cognitive tasks including rule generation, attention, and, most importantly to this simulation, working memory. The Working Memory sub-module comprises three neural layers: an input potential layer, a stored memory layer, and an output potential layer. The input potential layer (first layer) receives a projection from the Rule Generator for the currently proposed rule. The stored memory layer (second layer) receives a recurrent connection from the output layer (third layer). The output layer receives excitatory projections from both the input and stored layers. The gating signal excites the input layer and neutralizes the stored layer; therefore, when the gating signal is active the Working Memory module stores a new pattern, and when the gating signal is not active the module maintains stored information.
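The store-versus-maintain behavior of the three layers can be illustrated with a minimal sketch. It assumes binary units, a hard threshold on the output layer, and one gate value per unit; these are simplifications chosen for illustration and are not taken from the model's implementation. The function name `update_working_memory` is hypothetical.

```python
def update_working_memory(proposed, stored, gate):
    """Return the next output pattern of a three-layer gated memory.

    proposed: pattern projected from the Rule Generator (list of 0/1)
    stored:   previous output pattern, fed back recurrently (list of 0/1)
    gate:     gating signal, one 0/1 value per unit (assumed binary)
    """
    # Input potential layer: excited by the gate, so the proposed pattern
    # passes only where the gate is on.
    input_layer = [p & g for p, g in zip(proposed, gate)]
    # Stored memory layer: neutralized by the gate, so the recurrent copy
    # survives only where the gate is off.
    stored_layer = [s & (1 - g) for s, g in zip(stored, gate)]
    # Output layer: excitatory input from both layers, hard threshold.
    return [1 if i + s > 0 else 0 for i, s in zip(input_layer, stored_layer)]


memory = [0, 0, 0]  # nothing stored yet
memory = update_working_memory([1, 0, 0], memory, gate=[1, 1, 1])  # gate on: store
held = update_working_memory([0, 1, 0], memory, gate=[0, 0, 0])    # gate off: maintain
replaced = update_working_memory([0, 1, 0], held, gate=[1, 1, 1])  # gate on: overwrite
```

With the gate off, the newly proposed pattern is ignored and the stored pattern is carried forward through the recurrent connection; with the gate on, the stored pattern is cleared and the proposal takes its place.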
O’Reilly & Frank (2006) conceptually describe a dynamic gating memory model but (as far as I recall) do not provide a hypothesis for a biologically plausible dynamically gated memory structure. In my model, I show that a recurrent neural network acting on the gating signal essentially as if it were a bitmask can function as dynamically gated memory. The memory is thus stored temporarily in the neural activation rather than through changes to weights (as in long-term memory).

The Rule Generator sub-module is a non-biologically realistic module that randomly produces a new rule whenever it receives a punishment signal and projects the proposed rule to the working memory.

Removed. The Attention sub-module completes the prefrontal cortex module by combining the currently active rule from working memory with the stimulus card representation into an attention signal. The attention signal is a matrix with one active unit, which represents the feature of the stimulus card for which the simulated subject seeks a match.

New. In the revised model, the Cortex will present the rule to the BG as an available action that the BG may use in selecting the appropriate match. It will be up to the BG to use the rule to select an action rather than to use the previous “attention” signal to find a match.

Basal Ganglia (Str, SNr/GPi)

The basal ganglia comprise the Striatal module (Str) and the SNr/GPi module (SNr/GPi). The striatum in turn comprises striatal columns that accept projections for each of the four target card representations along with the attention signal from the cortex. The columns integrate the information into a single inhibitory signal that projects to the tonic neural units of the SN/GP. The SN/GP neurons are tonically active and inhibited by the striatal projections. The SN/GP neurons, when inhibited, disinhibit the respective thalamic units.

New.
The BG will accept inputs which represent the stimulus (S) and target (T1,…,T4) cards. The combined inputs form a 3 x 20 input matrix. The role of the Str will be to reduce this raw representation to a simplified representation of just the matching features. For instance, if the stimulus card has a red figure and target card 1 has a red figure, then the 8 neurons that represented the characteristics of the stimulus and target card may be reduced to a single neuron indicating a matching feature (match = on) in that stimulus-target pair.

New. The BG will accept a signal from the cortex representing the current rule rather than the “attention” signal used in the previous model. The rule essentially states a possible action, such as “match cards by color.” The BG will then perform an action selection function by using the current rule to determine which target card to match with the stimulus card. Open issue: I am not certain how to model this action selection function or even whether to model it in the Str or the SNr/GPi.

I believe these changes capture your input from our discussion during our previous weekly meeting with the BG group. Also, it is my hope that this change enhances the biological plausibility of the model and enables collaboration with other BG team members (i.e., reuse of my model or reuse of their modules in my model).

The output of the BG remains unchanged. The SNr/GPi comprises tonically active inhibitory neurons that project to (inhibit) the corresponding Thalamic units. When the Str determines a match or an action, the firing inhibits the SNr/GPi, thereby disinhibiting the MD. The disinhibition allows the action to proceed.

Thalamus (MD)

The Thalamus module comprises a layer of neurons corresponding to the selected action as determined by the basal ganglia. The output of the thalamus is a gating signal that indicates which action is to be performed (e.g., "match stimulus card to target 3").
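The path from the card inputs to the thalamic gating signal can be sketched as follows. This is an illustration under stated assumptions, not the model's implementation: cards are assumed to be one-hot encoded (3 features, 5 cards, 4 values each, giving the 3 x 20 matrix), units are binary, and the rule is applied by simply picking one feature row of the match matrix. That last step is only one possible answer to the open issue on action selection noted above. All function names are hypothetical.

```python
RULES = ("color", "form", "number")  # the 3 card features
N_VALUES = 4                         # 4 possible values per feature
N_TARGETS = 4


def encode_cards(stimulus, targets):
    """Build the 3 x 20 binary input: one row per feature, and per row a
    4-unit one-hot code for the stimulus followed by each target card."""
    cards = [stimulus] + list(targets)
    matrix = []
    for f in range(len(RULES)):
        row = []
        for card in cards:
            one_hot = [0] * N_VALUES
            one_hot[card[f]] = 1
            row.extend(one_hot)
        matrix.append(row)
    return matrix


def striatal_matches(matrix):
    """Reduce the raw input to 3 x 4 match units: unit [f][t] is on when
    the stimulus and target t share the same value of feature f."""
    matches = []
    for row in matrix:
        stim = row[:N_VALUES]
        matches.append([
            1 if stim == row[N_VALUES * (t + 1):N_VALUES * (t + 2)] else 0
            for t in range(N_TARGETS)
        ])
    return matches


def thalamic_gate(matches, rule):
    """Assumed action selection: the rule picks one feature row."""
    striatum = matches[rule]
    snr_gpi = [1 - s for s in striatum]  # tonically active unless inhibited
    return [1 - g for g in snr_gpi]      # thalamus on where SNr/GPi is silent


# Hypothetical cards as (color, form, number) value indices.
targets = [(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3)]
stimulus = (1, 3, 0)
matrix = encode_cards(stimulus, targets)
gate = thalamic_gate(striatal_matches(matrix), rule=RULES.index("color"))
```

Here `gate` has a single active unit at the position of the target that shares the stimulus color, which corresponds to an action such as "match stimulus card to target 2".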
The thalamo-cortical loop is completed by feeding the thalamic output to the frontal cortex. As described above, the signal is used as a gating signal that projects to the Working Memory module.

Simulation Results

The model is used to simulate empirical results of normal and patient groups on the Wisconsin Card Sorting Test (WCST). In the WCST, a subject must sort a stack of cards according to a sorting rule. A card may have multiple features such as color, shape, and number; for instance, a card may have two blue circles printed on it or one green square printed on it. The sorting rule requires that the subject match a stimulus card to one of four target cards; for instance, the sorting rule may be to sort by shape, in which case the subject may ignore color and number. After the subject correctly matches 10 cards in a row, the rule is changed without the subject's knowledge and the subject must learn and apply a new rule in order to sort the next category of cards. The test completes when the subject has completed 6 categories.

Amos (2000) presents experimental results for control subjects and multiple patient groups including those with Schizophrenia, Schizophrenia with Tardive dyskinesia (TD), Parkinson's Disease (PD), PD with dementia, Huntington's Disease (HD), and HD with dementia. The present model was used primarily to evaluate normal, Schizophrenia, and Schizophrenia with TD performance. Once the model was fitted to the normal results, the connection weights from the gating signal to the working memory layers were reduced and the noise-to-signal ratio was increased in order to produce failures and perseverative errors in line with the Schizophrenia patients. Then, gain was reduced and bias against firing increased in the striatal columns to reproduce results similar to Schizophrenia with TD patients.
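The WCST procedure described in this section (10 correct matches in a row per category, 6 categories to finish) can be sketched as a small test loop. This is not the Test Manager or the neural model: the subject is a placeholder that keeps its rule until punished and then picks a different rule at random. The trial limit of 128, the target card layout, and the name `run_wcst` are assumptions made for the example.

```python
import random

RULES = (0, 1, 2)  # indices for color, form, number


def run_wcst(max_trials=128, run_length=10, n_categories=6, seed=0):
    """Return (categories achieved, trials used) for a placeholder subject.

    Target card k is assumed to carry value k on every feature, so the
    target that matches a stimulus on feature f is target stimulus[f].
    """
    rng = random.Random(seed)
    true_rule = rng.choice(RULES)     # hidden from the subject
    subject_rule = rng.choice(RULES)  # subject's current guess
    streak = 0
    categories = 0
    trials = 0
    while trials < max_trials and categories < n_categories:
        trials += 1
        stimulus = tuple(rng.randrange(4) for _ in RULES)
        chosen_target = stimulus[subject_rule]
        if chosen_target == stimulus[true_rule]:
            streak += 1
            if streak == run_length:
                # Category achieved: the rule changes without warning.
                categories += 1
                streak = 0
                true_rule = rng.choice([r for r in RULES if r != true_rule])
        else:
            # Punishment signal: reset the run and try a different rule.
            streak = 0
            subject_rule = rng.choice([r for r in RULES if r != subject_rule])
    return categories, trials


categories, trials = run_wcst()
```

Because a stimulus can match the same target on more than one feature, a subject holding the wrong rule is sometimes rewarded by chance. This kind of ambiguity also occurs in the real task.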
The present model demonstrates that working memory deficits localized in the frontal cortex can account for frontal lobe patient performance on cognitive tasks and, further, that sub-cortical dysfunction can degrade performance on (set-shifting) cognitive tasks. In the following benchmark, the presented model is compared to the experimental results and simulation results as reported by Amos (2000). Each cell gives the number of categories achieved and the perseveration rate.

Experiment / Simulation                                                             Normal        Schizophrenic  Schizophrenic TD
Experimental results reported by Amos (2000), normal vs. Schizophrenic patients     6.0 / 9%      1.5 / 27%      --
Experimental results reported by Amos (2000), normal vs. Schizophrenic vs.          --            3.9 / 27%      2.5 / 41%
  Schizophrenic with TD patients
Simulation results using the Amos (2000) computational model, calibrated to         6.0 / 9%      1.5 / 27%      --
  the first empirical study (Frontal Model)
Simulation results using the Amos (2000) computational model, calibrated to                       3.8 / 27%      2.5 / 41%
  the second empirical study (Striatal Model)
                                                                                                  3.8 / 27%      2.5 / 29%
Simulation using the presented computational model                                  6.0 / 14.6%   1.8 / 27.8%    1.9 / 25.7%*

* Though the overall ratio of perseverations to attempts was lower than for non-TD patients, the results show that when the simulated subject fails, the TD subject was more likely than the non-TD subject to fail due to a perseverative error.

Discussion and Future Work

I would characterize my results (from last semester) as mixed. I simulated dysfunction by altering the gain and bias in neural layers in the frontal and striatal modules; however, I also found that altering the weights used in the frontal module (working memory) corresponding to the input patterns, and particularly the gating signal, was most effective for simulating the experimental results. I fitted the simulation to the normal subjects from the experimental results. My simulation’s control group had a slightly higher rate of perseverative errors.
I believe this is due to my implementation of the Rule Generator module, which randomly selects the next rule to attempt when the PFC receives a punishment signal. The rule generator only ensures that the random rule is not the same as the currently selected rule, and therefore there is a chance of repeating a failed rule while making attempts to find the correct rule.

Once fitted to the control group, I manipulated the gain, bias, noise and weights within the frontal module (working memory) and found a combination of settings for the neural network that closely reproduced the experimental results from the first study cited by Amos (2000). Then I reduced the gain and increased the bias against firing in the striatal module to attempt to simulate the experimental results for schizophrenic patients with TD. In this case, I had less success, but there were signs that the model may be on the right track. On the surface, the downside is that the simulation did not reproduce the experimental results. In fact, the simulated subjects with TD performed slightly better than the non-TD subjects. However, a closer evaluation of the simulation results shows that the ratio of perseverative errors to overall errors was higher for the subjects with TD than for the non-TD subjects. Nearly 60% of the errors of the TD group were perseverative, whereas around 51% of the errors of the non-TD group were perseverative. This is close to a 20% relative increase (roughly 9 percentage points) in perseverative errors for the simulation when introducing striatal dysfunction to match schizophrenic patients with TD.

Future work on the model falls into three main groups plus miscellany:

1) Update the model’s structure according to the comments under the section “The Model.” This primarily entails removing the Attention sub-module from the cortex, projecting the stimulus card directly to the BG, and using the BG first to reduce the dimensionality of the input signals and then to select an action from the reduced representation of the card inputs along with the rule input.

2) Experiment with the neural network parameters for normal, Schizophrenic, and Schizophrenic + TD scenarios to better match the experimental results.

3) I would also like to introduce selective destruction of neurons in the Striatum to simulate the effects of Huntington’s disease (Mendez, 1994).

4) In addition to the above, there are some “loose ends” such as an incomplete implementation of “winner lose all” in the SN/GP.

References

[Alexander et al., 1992] Alexander, G.E., DeLong, M.R., Crutcher, M.D., 1992, Do cortical and basal ganglionic motor areas use motor programs to control movement?, Behavioral and Brain Sciences, 15:656-665.

[Amos, 2000] Amos, A., 2000, A Computational Model of Information Processing in the Frontal Cortex and Basal Ganglia, Journal of Cognitive Neuroscience, 12:505-519.

[Arbib, 2003] Arbib, M. A., 2003, Backpropagation: General Principles, in The Handbook of Brain Theory and Neural Networks (M. A. Arbib, Ed.) Cambridge, MA: MIT Press, pp. 147-151.

[Baker et al., 1996] Baker, S.C., Rogers, R.D., Owen, A.M., Frith, C.D., Dolan, R.J., Frackowiak, R.S.J., and Robbins, T.W., 1996, Neural systems engaged by planning: a PET study of the Tower of London task, Neuropsychologia, 34:515-526.

[Braver and Bongiolatti, 2002] Braver, T.S., Bongiolatti, S.R., 2002, The Role of Frontopolar Cortex in Subgoal Processing, NeuroImage, 15:523-536.

[Dominey et al., 1995] Dominey, P., Arbib, M., and Joseph, J.-P., 1995, A model of corticostriatal plasticity for learning oculomotor associations and sequences, J. Cognit. Neurosci., 7(3):311-336.

[Fellous and Suri, 2003] Fellous, J.-M., and Suri, R. E., 2003, Dopamine, Roles of, in The Handbook of Brain Theory and Neural Networks (M. A. Arbib, Ed.) Cambridge, MA: MIT Press, pp. 147-151.
[Goel et al., 2001] Goel, V., Pullara, S.D., Grafman, J., 2001, A computational model of frontal lobe dysfunction: working memory and the Tower of Hanoi task, Cognitive Science: A Multidisciplinary Journal, 25:287-313.

[Gerfen, 1992] Gerfen, C.R., 1992, The neostriatal mosaic: multiple levels of compartmental organization, Trends in Neurosciences, 15:133-9.

[Mendez, 1994] Mendez, M. F., 1994, Huntington’s disease: Update and review of neuropsychiatric aspects, International Journal of Psychiatry in Medicine, 24:189-208.

[Newell, 1990] Newell, A., 1990, Unified Theories of Cognition, Harvard University Press.

[Newman et al., 2003] Newman, S.D., Carpenter, P.A., Varma, S., and Just, M.A., 2003, Frontal and parietal participation in problem solving in the Tower of London: fMRI and computational modeling of planning and high-level perception, Neuropsychologia, 41:1668-1682.

[O’Reilly and Frank, 2006] O’Reilly, R.C., and Frank, M.J., 2006, Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia, Neural Computation, 18:283-328.

[Petrides, 1995] Petrides, M., 1995, Impairments on nonspatial self-ordered and externally ordered working memory tasks after lesions of the mid-dorsal part of the lateral frontal cortex in the monkey, Journal of Neuroscience, 15:359-375.

[Roberts et al., 1994] Roberts, R. J., Hager, L. D., Heron, C., 1994, Prefrontal cognitive processes: Working memory and inhibition in the antisaccade task, Journal of Experimental Psychology: General, 123:374-393.