Computational Models of Neuromodulation

Angela J. Yu
Department of Cognitive Science, University of California, San Diego
9500 Gilman Dr. MC 0515, La Jolla, CA 92103

Definition

Neuromodulatory systems serve a special meta-processing role in the brain. Due to their anatomical privilege of massively extensive arborization patterns throughout the central and peripheral nervous systems, and their physiological capacity to finely control how other neurons communicate and undergo plasticity, they are ideally positioned to regulate the way information is acquired, processed, utilized, and stored in the brain. As such, neuromodulation has been fertile ground for computational models of neural information processing, which strive to explain not only how the major neuromodulators coordinate to enable normal sensory, motor, and cognitive functions, but also how dysfunctions arise in psychiatric and neurological conditions when these neuromodulatory systems are impaired.

Detailed Description

Although much remains unknown or opaque about neuromodulatory functions, a computationally sophisticated and coherent picture is beginning to emerge on the back of recent modeling work (Dayan, 2012). The four major neuromodulators that have received the most theoretical treatment are dopamine, serotonin, acetylcholine, and norepinephrine. All four play important roles in regulating cortical functions such as inference, learning, and decision-making.
Broadly speaking, dopamine and serotonin appear to be critical for the processing of stimuli and events with salient affective value, such as rewards and punishments, and for the learning and execution of actions that lead to greater subjective utility (more rewards or fewer punishments); in contrast, acetylcholine and norepinephrine appear to be most critical for signaling the various forms of uncertainty that plague cortical computations, arising from intrinsic neuronal processing noise, external environmental changeability, and imperfect sensory observations.

Reward and Punishment

One of the best understood computational functions of neuromodulation is the role of dopamine (DA) in signaling reward prediction errors (Montague, Dayan, & Sejnowski, 1996; Schultz, Dayan, & Montague, 1997), associated with a reinforcement learning algorithm known as temporal difference learning (Sutton, 1988), which can learn not only which stimuli predict future rewards, but also the temporal delay until such a reward appears. Computational models have suggested that the reward prediction error serves at least three different functions. The first, and most obvious, is the learning of environmental cues that predict future rewards, as well as of actions that lead to greater or lesser amounts of reward. There is evidence that D1 dopamine receptors in the direct or "go" pathway of the striatum mediate the facilitation of actions with surprisingly good outcomes (positive prediction errors), whereas D2 dopamine receptors in the indirect or "no-go" pathway of the striatum underlie the inhibition of actions that lead to negative prediction errors (Gerfen et al., 1990; Smith, Bevan, Shink, & Bolam, 1998; Frank, Seeberger, & O'Reilly, 2004; Frank, 2005; J. D. Cohen & Frank, 2009; Kravitz, Tye, & Kreitzer, 2012).
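The temporal difference mechanism described above can be illustrated with a minimal tabular sketch (the three-state chain, parameter values, and function name below are illustrative choices, not taken from any specific model in the literature). The prediction error, reward plus discounted next-state value minus current value, drives learning of the state values:

```python
import numpy as np

def run_td(n_episodes=200, alpha=0.1, gamma=0.9):
    # Illustrative three-state chain: cue (0) -> delay (1) -> terminal (2),
    # with a reward of 1 delivered on entering the terminal state.
    V = np.zeros(3)                       # value estimates, one per state
    reward_step_deltas = []
    for _ in range(n_episodes):
        states, rewards = [0, 1, 2], [0.0, 1.0]
        for t in range(2):
            s, s_next = states[t], states[t + 1]
            # TD(0) prediction error: reward plus discounted next-state
            # value, minus the current estimate.
            delta = rewards[t] + gamma * V[s_next] - V[s]
            V[s] += alpha * delta         # error-driven learning
        reward_step_deltas.append(delta)  # delta at reward delivery
    return V, reward_step_deltas
```

With training, the cue and delay states acquire positive value (approaching 0.9 and 1, given the discount factor) and the prediction error at reward delivery shrinks toward zero, mirroring the observed transfer of phasic dopaminergic responses from the reward itself to the predictive cue (Schultz et al., 1997).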
A second suggested role for dopamine, related to the theory of incentive salience (Berridge & Robinson, 1998; McClure, Daw, & Montague, 2003) and potentially mediated via D1 receptors in the nucleus accumbens (Murschall & Hauber, 2006; Frank, 2005; Surmeier, Ding, Day, Wang, & Shen, 2007; Surmeier et al., 2010), is to control the expression of Pavlovian approach responses associated with the prospect of reward (Ikemoto & Panksepp, 1999), in turn facilitating the initiation and learning of instrumental responses that boost reward (Satoh, Nakai, Sato, & Kimura, 2003; Nakamura & Hikosaka, 2006; Talmi, Seymour, Dayan, & Dolan, 2008). A third suggested role for the dopamine-mediated prediction error is that, over the long term, the average prediction error reflects the average reward rate, which represents an opportunity cost that can drive the level of response vigor (Niv, Daw, Joel, & Dayan, 2007) or the trade-off between speed and accuracy (Dayanik & Yu, 2012).

While animals need to seek out rewarding circumstances and actions, they also need to avoid punishing or threatening ones. The learning of avoidance behavior, or behavioral inhibition, appears to depend on both dopamine and serotonin (5-HT). Unexpected punishments (Ungless, Magill, & Bolam, 2004; J. Y. Cohen, Haesler, Vong, Lowell, & Uchida, 2012), as well as the non-delivery of expected rewards (Schultz et al., 1997), have both been shown to activate the dopaminergic system, which enables the learning of avoidance behavior via D2 receptors in the indirect pathway of the striatum (Frank et al., 2004; Kravitz et al., 2012). Complementing the dopaminergic system, serotonin has been suggested to play an opponent role, signaling prediction errors associated with future "aversion" rather than reward (Deakin, 1983; Daw, Kakade, & Dayan, 2002; Cools, 2011; Boureau & Dayan, 2011).
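A deliberately cartoonish rendering of this opponency proposal (the function name and the simple rectified split are illustrative assumptions, not the actual model of Daw, Kakade, & Dayan, 2002) is to divide a single net prediction error into an appetitive, dopamine-like channel and an aversive, serotonin-like channel:

```python
def opponent_signals(expected_value, outcome):
    """Toy opponent coding: one net prediction error split into an
    appetitive (dopamine-like) and an aversive (serotonin-like) channel."""
    delta = outcome - expected_value   # net prediction error
    da = max(delta, 0.0)               # fires when things go better than expected
    serotonin = max(-delta, 0.0)       # fires when things go worse than expected
    return da, serotonin
```

For example, an outcome better than expected activates only the appetitive channel, while an outcome worse than expected activates only the aversive channel; each signal can then reinforce approach or avoidance respectively.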
The suggestion is that an initial hint of future aversion would activate a serotonergic prediction error (Schweimer & Ungless, 2010), but that once the threat of the aversive outcome has been avoided, whether through actions or circumstances, the dopaminergic neurons would be activated by the associated reward prediction error, signaling safety and thus reinforcing the avoidance action (Johnson, Li, Li, & Klopf, 2001; Moutoussis, Bentall, Williams, & Dayan, 2008; Maia, 2010). Aside from its role in learning avoidance actions, serotonin has also been thought to be important for mitigating impulsivity and enabling general behavioral inhibition, such as delaying actions in order to receive a larger reward (Miyazaki, Miyazaki, & Doya, 2011; Doya, 2002).

Uncertainty

Uncertainty of various forms plagues the brain's interactions with the environment. In a Bayesian statistical framework, optimal inference and prediction, based on unreliable observations in changing contexts, require the representation and utilization of different forms of uncertainty. For example, in the context of multi-modal cue integration, uncertainty about the state of the world induced by the unreliability of a cue should lead to the relative downgrading of that source of information relative to other, more reliable cues (Ernst & Banks, 2002; Battaglia, Jacobs, & Aslin, 2003). Analogously, in the context of integrating top-down and bottom-up information, uncertainty about the validity of an internal model of the environment should lead to the relative suppression of top-down expectations in relation to bottom-up, novel sensory inputs (Yu & Dayan, 2003; Körding & Wolpert, 2004; Yu & Dayan, 2005b). On the other hand, uncertainty about the predictive or other statistical properties of a stimulus or event should up-regulate the learning accorded to that stimulus or event, relative to other, better-understood aspects of the environment (Yu & Dayan, 2005b; Behrens, Woolrich, Walton, & Rushworth, 2007; Nassar et al., 2012).
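The cue-integration logic above has a standard closed-form instance for independent Gaussian cues: each cue is weighted by its precision (inverse variance), so unreliable cues are automatically downgraded. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def combine_cues(means, variances):
    """Bayesian fusion of independent Gaussian cues: the posterior mean is
    the precision-weighted average, and the posterior precision is the sum
    of the individual precisions."""
    precisions = 1.0 / np.asarray(variances, dtype=float)
    combined_var = 1.0 / precisions.sum()
    combined_mean = combined_var * (precisions * np.asarray(means, dtype=float)).sum()
    return combined_mean, combined_var
```

For instance, fusing a reliable visual cue (mean 0, variance 1) with an unreliable haptic cue (mean 1, variance 4) yields an estimate of 0.2, pulled only weakly toward the noisier cue, with a combined variance of 0.8, smaller than either cue alone. This is the statistically optimal behavior that human observers approximate in the visual-haptic experiments of Ernst and Banks (2002).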
Strictly speaking, exact Bayesian statistical computations need not distinguish among the various forms of uncertainty, as they are all reflected in the probability distributions, or properties of the distributions, of the relevant random variables under consideration. However, the brain can hardly be expected to implement arbitrarily precise and complex Bayesian computations, given its limited processing hardware, and thus needs to represent the most common and critical aspects of uncertainty so as to achieve near-optimal but efficient computations (Yu & Dayan, 2003, 2005b, 2005a).

Based on a wealth of physiological, pharmacological, and behavioral data implicating acetylcholine (ACh) and norepinephrine (NE) in specific aspects of a range of cognitive processes, it has been proposed that they respectively signal expected and unexpected uncertainty, with the former associated with known stochasticity or ignorance in the current environment, and the latter arising from unexpected, gross changes in the environment that require drastic revision of the internal representation of the behavioral context (Yu & Dayan, 2005b). The two forms of uncertainty are expected to interact in a partly synergistic, partly antagonistic manner: a rise in unexpected uncertainty should lead to an increase in expected uncertainty (due to known ignorance), while expected uncertainty in turn has a suppressive influence on unexpected uncertainty, since the evidence for a fundamental change induced by a prediction error of a given magnitude should be normalized by the degree of expected variability (Yu & Dayan, 2005b). Consistent with this, Sara and colleagues have found similarly antagonistic interactions between ACh and NE in a series of learning and memory studies (Sara, 1989; Ammassari-Teule, Maho, & Sara, 1991; Sara, Dyon-Laurent, Guibert, & Leviel, 1992; Dyon-Laurent, Romand, Biegon, & Sara, 1993; Dyon-Laurent, Hervé, & Sara, 1994).
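The normalization idea in the expected/unexpected uncertainty proposal can be sketched in toy form (the function name and the bare z-score formulation are illustrative simplifications, not the full model of Yu & Dayan, 2005b): the evidence for a contextual change is the prediction error scaled by the expected variability, so the same error is less alarming when expected uncertainty is high.

```python
import numpy as np

def change_evidence(observations, prediction, expected_sd):
    """Toy sketch: evidence for an environmental change (unexpected
    uncertainty, NE-like) is the absolute prediction error normalized by
    the expected variability (expected uncertainty, ACh-like).  Higher
    expected uncertainty suppresses the change signal."""
    errors = np.asarray(observations, dtype=float) - prediction
    return np.abs(errors) / expected_sd
```

For example, a prediction error of 3 yields a strong change signal (z = 3) when the expected standard deviation is 1, but a weak one (z = 1) when the expected standard deviation is 3, capturing the suppressive interaction described above.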
These studies demonstrated that learning and memory deficits caused by cholinergic lesions can be alleviated by the administration of clonidine (Sara, 1989; Ammassari-Teule et al., 1991; Sara et al., 1992; Dyon-Laurent et al., 1993, 1994), a noradrenergic α2 agonist that decreases the level of NE (Coull, Frith, Dolan, Frackowiak, & Grasby, 1997).

In recent years, a number of experimental studies have confirmed major aspects of the uncertainty-based account of acetylcholine and norepinephrine function. Pupil diameter, known to correlate with noradrenergic neuronal activation (Aston-Jones & Cohen, 2005), has been shown to be, on the one hand, correlated with the discarding of previous beliefs and strategies and the exploration of novel ones (Gilzenrat, Nieuwenhuis, Jepma, & Cohen, 2010; Jepma & Nieuwenhuis, 2011), and, on the other hand, reflective of subjects' unexpected uncertainty beyond predicted variability (Preuschoff et al., 2011). Moreover, there is evidence that pupil diameter both reflects unexpected uncertainty (operationalized as change-point probability) and accompanies the discarding of previous beliefs and the incorporation of new data into a novel belief state (Nassar et al., 2012). Interestingly, pupil size is controlled by two muscles, the sphincter for constriction and the dilator for dilation, which are respectively innervated by acetylcholine and norepinephrine (Hutchins & Corbett, 1997). The administration of the cholinergic antagonists scopolamine (muscarinic) and mecamylamine (nicotinic) increases pupil size in normal elderly volunteers (Little, Johnson, Minichiello, Weingartner, & Sunderland, 1998).
Moreover, Alzheimer's disease (AD) patients, who have a severe and specific cholinergic deficit, have larger pupil sizes than controls, both tonically and in response to light, while AD patients medicated with an acetylcholinesterase inhibitor, which increases the endogenous level of acetylcholine, show a pupil response much closer to that of controls (Fotiou et al., 2009). This raises the intriguing possibility that the pupil dilation results reflect the joint influence of both norepinephrine and acetylcholine. Separately, pharmacological and lesion studies directly manipulating acetylcholine levels have also produced behavioral results consistent with the cholinergic expected-uncertainty hypothesis (Thiel & Fink, 2008; Córdova, Yu, & Chiba, Soc. Neurosci. Abstracts, 2004).

Further Reading

Doya, K. (2002). Metalearning and neuromodulation. Neural Netw., 15(4-6), 495-506.
Dayan, P. (2012). Twenty-five lessons from computational neuromodulation. Neuron, 76, 240-256.

References

Ammassari-Teule, M., Maho, C., & Sara, S. J. (1991). Clonidine reverses spatial learning deficits and reinstates θ frequencies in rats with partial fornix section. Beh. Brain Res., 45, 1-8.
Aston-Jones, G., & Cohen, J. D. (2005). An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance. Annu. Rev. Neurosci., 28, 403-50.
Battaglia, P. W., Jacobs, R. A., & Aslin, R. N. (2003). Bayesian integration of visual and auditory signals for spatial localization. J. Opt. Soc. Am. A Opt. Image Sci. Vis., 20(7), 1391-7.
Behrens, T. E. J., Woolrich, M. W., Walton, M. E., & Rushworth, M. F. S. (2007). Learning the value of information in an uncertain world. Nature Neurosci., 10(9), 1214-21.
Berridge, K. C., & Robinson, T. E. (1998). What is the role of dopamine in reward: Hedonic impact, reward learning, or incentive salience? Brain Research Reviews, 28, 309-369.
Boureau, Y.-L., & Dayan, P. (2011). Opponency revisited: Competition and cooperation between dopamine and serotonin. Neuropsychopharmacology, 36, 74-97.
Cohen, J. D., & Frank, M. J. (2009). Neurocomputational models of basal ganglia function in learning, memory and choice. Behavioral Brain Research, 199, 141-156.
Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B., & Uchida, N. (2012). Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature, 482, 85-88.
Cools, R. (2011). Dopaminergic control of the striatum for high-level cognition. Current Opinion in Neurobiology, 21, 402-407.
Coull, J. T., Frith, C. D., Dolan, R. J., Frackowiak, R. S., & Grasby, P. M. (1997). The neural correlates of the noradrenergic modulation of human attention, arousal and learning. Eur. J. Neurosci., 9(3), 589-98.
Daw, N. D., Kakade, S., & Dayan, P. (2002). Opponent interactions between serotonin and dopamine. Neural Netw., 15, 603-16.
Dayan, P. (2012). Twenty-five lessons from computational neuromodulation. Neuron, 76, 240-256.
Dayanik, S., & Yu, A. J. (2012). Reward-rate maximization in sequential identification under a stochastic deadline. SIAM Journal on Control and Optimization, 51(4), 2922-2948.
Deakin, J. F. W. (1983). Roles of brain serotonergic neurons in escape, avoidance and other behaviors. Journal of Psychopharmacology, 43, 563-577.
Doya, K. (2002). Metalearning and neuromodulation. Neural Netw., 15(4-6), 495-506.
Dyon-Laurent, C., Hervé, A., & Sara, S. J. (1994). Noradrenergic hyperactivity in hippocampus after partial denervation: pharmacological, behavioral, and electrophysiological studies. Exp. Brain Res., 99, 259-66.
Dyon-Laurent, C., Romand, S., Biegon, A., & Sara, S. J. (1993). Functional reorganization of the noradrenergic system after partial fornix section: a behavioral and autoradiographic study. Exp. Brain Res., 96, 203-11.
Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415(6870), 429-33.
Fotiou, D. F., Stergiou, V., Tsiptsios, D., Lithari, C., Nakou, M., & Karlovasitou, A. (2009). Cholinergic deficiency in Alzheimer's and Parkinson's disease: Evaluation with pupillometry. International Journal of Psychophysiology, 73(2), 143-9.
Frank, M. J. (2005). Dynamic dopamine modulation in the basal ganglia: A neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism. Journal of Cognitive Neuroscience, 17, 51-72.
Frank, M. J., Seeberger, L. C., & O'Reilly, R. C. (2004). By carrot or by stick: Cognitive reinforcement learning in Parkinsonism. Science, 306, 1940-1943.
Gerfen, C. R., Engber, T. M., Mahan, L. C., Susel, Z., Chase, T. N., Monsma, F. J. J., & Sibley, D. R. (1990). D1 and D2 dopamine receptor-regulated gene expression of striatonigral and striatopallidal neurons. Science, 250, 1429-1432.
Gilzenrat, M. S., Nieuwenhuis, S., Jepma, M., & Cohen, J. D. (2010). Pupil diameter tracks changes in control state predicted by the adaptive gain theory of locus coeruleus function. Cogn. Affect. Behav. Neurosci., 10, 252-69.
Hutchins, J. B., & Corbett, J. J. (1997). The visual system. In D. E. Haines (Ed.), Fundamental neuroscience (p. 265-84). New York: Churchill Livingstone.
Ikemoto, S., & Panksepp, J. (1999). The role of nucleus accumbens dopamine in motivated behavior: A unifying interpretation with special reference to reward-seeking. Brain Research Reviews, 31, 6-41.
Jepma, M., & Nieuwenhuis, S. (2011). Pupil diameter predicts changes in the exploration-exploitation trade-off: evidence for the adaptive gain theory. J. Cogn. Neurosci., 23, 1587-96.
Johnson, J., Li, W., Li, J., & Klopf, A. (2001). A computational model of learned avoidance behavior in a one-way avoidance experiment. Adaptive Behavior, 9, 91.
Körding, K. P., & Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427, 244-7.
Kravitz, A. V., Tye, L. D., & Kreitzer, A. C. (2012). Distinct roles for direct and indirect pathway striatal neurons in reinforcement. Nature Neuroscience, 15, 816-818.
Little, J. T., Johnson, D. N., Minichiello, M., Weingartner, H., & Sunderland, T. (1998). Combined nicotinic and muscarinic blockade in elderly normal volunteers: Cognitive, behavioral, and physiological responses. Neuropsychopharmacology, 19(1), 60-9.
Maia, T. V. (2010). Two-factor theory, the actor-critic model, and conditioned avoidance. Learning and Behavior, 38, 50-67.
McClure, S. M., Daw, N. D., & Montague, P. R. (2003). A computational substrate for incentive salience. Trends in Neuroscience, 26, 423-428.
Miyazaki, K., Miyazaki, K. W., & Doya, K. (2011). Activation of dorsal raphe serotonin neurons underlies waiting for delayed rewards. Journal of Neuroscience, 31, 469-479.
Montague, P. R., Dayan, P., & Sejnowski, T. J. (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. Journal of Neuroscience, 16, 1936-1947.
Moutoussis, M., Bentall, R. P., Williams, J., & Dayan, P. (2008). A temporal difference account of avoidance learning. Network, 19, 137-160.
Murschall, A., & Hauber, W. (2006). Inactivation of the ventral tegmental area abolished the general excitatory influence of Pavlovian cues on instrumental performance. Learning and Memory, 13, 123-126.
Nakamura, K., & Hikosaka, O. (2006). Role of dopamine in the primate caudate nucleus in reward modulation of saccades. Journal of Neuroscience, 26, 5360-5369.
Nassar, M. R., Rumsey, K. M., Wilson, R. C., Parikh, K., Heasly, B., & Gold, J. I. (2012). Rational regulation of learning dynamics by pupil-linked arousal systems. Nature Neuroscience.
Niv, Y., Daw, N. D., Joel, D., & Dayan, P. (2007). Tonic dopamine: opportunity cost and the control of response vigor. Psychopharmacology, 191, 507-520.
Sara, S. J. (1989). Noradrenergic-cholinergic interaction: its possible role in memory dysfunction associated with senile dementia. Arch. Gerontol. Geriatr., Suppl. 1, 99-108.
Sara, S. J., Dyon-Laurent, C., Guibert, B., & Leviel, V. (1992). Noradrenergic hyperactivity after fornix section: role in cholinergic dependent memory performance. Exp. Brain Res., 89, 125-32.
Satoh, T., Nakai, S., Sato, T., & Kimura, M. (2003). Correlated coding of motivation and outcome of decision by dopamine neurons. Journal of Neuroscience, 23, 9913-9923.
Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and reward. Science, 275, 1593-9.
Schweimer, J. V., & Ungless, M. A. (2010). Phasic responses in dorsal raphe serotonin neurons to noxious stimuli. Neuroscience, 171, 1209-1215.
Smith, Y., Bevan, M. D., Shink, E., & Bolam, J. P. (1998). Microcircuitry of the direct and indirect pathways of the basal ganglia. Neuroscience, 86, 353-387.
Surmeier, D. J., Ding, J., Day, M., Wang, Z., & Shen, W. (2007). D1 and D2 dopamine-receptor modulation of striatal glutamatergic signaling in striatal medium spiny neurons. Trends in Neuroscience, 30, 228-235.
Surmeier, D. J., Shen, W., Day, M., Gertler, T., Chan, S., Tian, X., & Plotkin, J. L. (2010). The role of dopamine in modulating the structure and function of striatal circuits. Progress in Brain Research, 183, 149-167.
Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Mach. Learning, 3(1), 9-44.
Talmi, D., Seymour, B., Dayan, P., & Dolan, R. J. (2008). Human Pavlovian-instrumental transfer. Journal of Neuroscience, 28, 360-368.
Thiel, C. M., & Fink, G. R. (2008). Effects of the cholinergic agonist nicotine on reorienting of visual spatial attention and top-down attentional control. Neuroscience, 152(2), 381-90.
Ungless, M. A., Magill, P. J., & Bolam, J. P. (2004). Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli. Science, 303, 2040-2042.
Yu, A. J., & Dayan, P. (2003). Expected and unexpected uncertainty: ACh and NE in the neocortex. In S. Becker, S. Thrun, & K. Obermayer (Eds.), Advances in Neural Information Processing Systems 15 (p. 157-164). Cambridge, MA: MIT Press.
Yu, A. J., & Dayan, P. (2005a). Inference, attention, and decision in a Bayesian neural architecture. In L. K. Saul, Y. Weiss, & L. Bottou (Eds.), Advances in Neural Information Processing Systems 17. Cambridge, MA: MIT Press.
Yu, A. J., & Dayan, P. (2005b). Uncertainty, neuromodulation, and attention. Neuron, 46, 681-92.