Computational Models of Neuromodulation
Angela J. Yu
Department of Cognitive Science
University of California, San Diego
9500 Gilman Dr. MC 0515
La Jolla, CA 92093
Definition
Neuromodulatory systems serve a special meta-processing role in the brain. Owing to their
anatomically privileged position, with massively extensive arborization throughout the
central and peripheral nervous systems, and to their physiological capacity to finely control
how other neurons communicate and undergo plasticity, they are ideally
positioned to regulate the way information is acquired, processed, utilized, and stored in the
brain. As such, neuromodulation has been fertile ground for computational models of neural
information processing, which have striven to explain not only how the major neuromodulators coordinate to enable normal sensory, motor, and cognitive functions, but also how
dysfunctions arise in psychiatric and neurological conditions when these neuromodulatory
systems are impaired.
Detailed Description
Although much remains unknown about neuromodulatory functions, a computationally sophisticated and coherent picture is beginning to emerge on the back of recent
modeling work (Dayan, 2012). The four major neuromodulators that have received the most
theoretical treatment are dopamine, serotonin, acetylcholine, and norepinephrine. All four
play important roles in regulating cortical functions such as inference, learning, and decision-making. Broadly speaking, dopamine and serotonin appear to be critical for the processing
of stimuli and events with salient affective value, such as rewards and punishments, and the
learning and execution of actions that lead to greater subjective utility (more rewards or
fewer punishments); in contrast, acetylcholine and norepinephrine appear to be most critical
for signaling the various forms of uncertainty that plague cortical computations, arising from intrinsic neuronal processing noise, external environmental changeability, and imperfect
sensory observations.
Reward and Punishment
One of the best understood computational functions of neuromodulation is the role of
dopamine (DA) in signaling reward prediction errors (Montague, Dayan, & Sejnowski, 1996;
Schultz, Dayan, & Montague, 1997), associated with a reinforcement learning algorithm
known as temporal difference learning (Sutton, 1988), which can learn not only which stimuli
predict future rewards, but also the temporal delay until such a reward appears. Computational models have suggested reward prediction error to be important for at least three
different functions. The first, and most obvious, is the learning of environmental cues that predict future rewards, as well as of actions that lead to greater or lesser amounts of reward. There
is evidence that D1 dopamine receptors in the direct or “go” pathway of the striatum mediate the facilitation of actions with surprisingly good outcomes (positive prediction errors),
whereas D2 dopamine receptors in the indirect or “no-go” pathway underlie
the inhibition of actions that lead to negative prediction errors (Gerfen et al., 1990; Smith,
Bevan, Shink, & Bolam, 1998; Frank, Seeberger, & O’Reilly, 2004; Frank, 2005; J. D. Cohen
& Frank, 2009; Kravitz, Tye, & Kreitzer, 2012). A second role suggested for dopamine,
related to the theory of incentive salience (Berridge & Robinson, 1998; McClure, Daw, &
Montague, 2003) and potentially via D1 receptors in the nucleus accumbens (Murschall &
Hauber, 2006; Frank, 2005; Surmeier, Ding, Day, Wang, & Shen, 2007; Surmeier et al.,
2010), is that it controls the expression of Pavlovian approach responses associated with
the prospect of reward (Ikemoto & Panksepp, 1999), in turn facilitating the initiation and
learning of instrumental responses that boost reward (Satoh, Nakai, Sato, & Kimura, 2003;
Nakamura & Hikosaka, 2006; Talmi, Seymour, Dayan, & Dolan, 2008). A third role suggested for the dopamine-mediated prediction error is that over the long term, the average
prediction error reflects the average reward rate, which represents an opportunity cost that
can drive the level of response vigor (Niv, Daw, Joel, & Dayan, 2007) or the trade-off between
speed and accuracy (Dayanik & Yu, 2012).
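The temporal difference learning algorithm referenced above can be illustrated with a minimal sketch. Here the prediction error delta plays the role ascribed to phasic dopamine; the chain of states, reward placement, and learning rate are illustrative assumptions, not a model of any specific experiment.

```python
import numpy as np

# Minimal sketch of temporal-difference learning (TD(0); Sutton, 1988).
# The prediction error `delta` plays the role ascribed to phasic dopamine.

n_states = 5                 # a chain of states traversed left to right
alpha = 0.1                  # learning rate (illustrative)
gamma = 1.0                  # no temporal discounting in this toy example
V = np.zeros(n_states + 1)   # value estimates; V[n_states] is terminal

for episode in range(500):
    for s in range(n_states):
        r = 1.0 if s == n_states - 1 else 0.0  # reward only at the end
        # TD prediction error: obtained reward plus predicted future value,
        # minus the current prediction
        delta = r + gamma * V[s + 1] - V[s]
        V[s] += alpha * delta                  # dopamine-like teaching signal

# With training, each state comes to predict the delayed reward, so the
# prediction error migrates backward from the reward to the earliest cue.
```

This captures the two properties emphasized above: the algorithm learns which states predict future reward, and, because the error propagates backward one step per episode, it implicitly learns the temporal delay to that reward.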
While animals need to seek out rewarding circumstances and actions, they also need to
avoid punishing or threatening ones. The learning of avoidance behavior, or behavioral inhibition, appears to depend on both dopamine and serotonin (5-HT). Unexpected punishments
(Ungless, Magill, & Bolam, 2004; J. Y. Cohen, Haesler, Vong, Lowell, & Uchida, 2012), as
well as the non-delivery of expected reward (Schultz et al., 1997), have both been shown to
activate the dopaminergic system, which enables the learning of avoidance behavior via D2
receptors in the indirect pathway of the striatum (Frank et al., 2004; Kravitz et al., 2012).
Complementing the dopaminergic system, serotonin has been suggested to play an opponent
role, by signaling prediction errors associated with future “aversion” rather than reward
(Deakin, 1983; Daw, Kakade, & Dayan, 2002; Cools, 2011; Boureau & Dayan, 2011). The
suggestion is that an initial hint of future aversion would activate the serotonergic prediction
error (Schweimer & Ungless, 2010), but that once the threat of that aversive outcome is
avoided due to actions or circumstances, then the dopaminergic neurons would be activated
by the associated reward prediction error, signaling safety, and thus reinforce the avoidance
action (Johnson, Li, Li, & Klopf, 2001; Moutoussis, Bentall, Williams, & Dayan, 2008; Maia,
2010). Aside from learning avoidance actions, serotonin has also been thought to be important for mitigating impulsivity and enabling general behavioral inhibition, such as delaying
actions in order to receive a larger reward (Miyazaki, Miyazaki, & Doya, 2011; Doya, 2002).
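The opponent account of avoidance learning sketched above can be made concrete in a toy simulation: a serotonin-like channel learns the aversive prediction attached to a warning cue, and once the predicted shock is avoided, the omitted punishment yields a positive, dopamine-like "safety" prediction error that reinforces the avoidance action. The policy form and all parameter values here are illustrative assumptions, not any published model.

```python
import numpy as np

# Toy sketch of opponent avoidance learning (cf. Daw et al., 2002;
# Moutoussis et al., 2008; Maia, 2010). Parameters are illustrative.

rng = np.random.default_rng(0)
alpha = 0.2            # learning rate for both channels
V_aversion = 0.0       # serotonin-like aversive prediction for the cue
Q_avoid = 0.0          # learned value of the avoidance action
p_avoid_history = []

for trial in range(200):
    p_avoid = 1.0 / (1.0 + np.exp(-5.0 * Q_avoid))  # sigmoidal policy
    p_avoid_history.append(p_avoid)
    avoided = rng.random() < p_avoid
    shock = 0.0 if avoided else -1.0
    if avoided:
        # Omission of the predicted punishment acts as a positive,
        # dopamine-like prediction error (a safety signal).
        safety_signal = 0.0 - V_aversion
        Q_avoid += alpha * safety_signal
    # Serotonin-like channel: track the cue's aversive prediction.
    V_aversion += alpha * (shock - V_aversion)

# Avoidance becomes progressively more likely than at the start.
```

Note the two-factor structure: the aversive prediction must first be acquired (via shocks) before avoidance can generate the relief signal that reinforces it.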
Uncertainty
Uncertainty of various forms plagues the brain’s interactions with the environment. In a
Bayesian statistical framework, optimal inference and prediction, based on unreliable observations in changing contexts, require the representation and utilization of different forms of
uncertainty. For example, in the context of multi-modal cue integration, uncertainty about
the state of the world induced by the unreliability of a cue should lead to the down-weighting of that source of information relative to other, more reliable cues (Ernst & Banks, 2002;
Battaglia, Jacobs, & Aslin, 2003). Analogously, in the context of integrating top-down and
bottom-up information, uncertainty about the validity of an internal model of the environment should lead to relative suppression of top-down expectations in relation to bottom-up,
novel sensory inputs (Yu & Dayan, 2003; Körding & Wolpert, 2004; Yu & Dayan, 2005b). On
the other hand, uncertainty about the predictive or other statistical properties of a stimulus
or event should up-regulate the learning accorded to that stimulus/event, relative to other
better-understood aspects of the environment (Yu & Dayan, 2005b; Behrens, Woolrich, Walton, & Rushworth, 2007; Nassar et al., 2012). Strictly speaking, exact Bayesian statistical
computations need not distinguish among the various forms of uncertainty, as they are all
reflected as probability distributions, or properties of distributions, of the relevant random
variables under consideration. However, the brain can hardly be expected to implement arbitrarily precise and complex Bayesian computations, given its limited processing hardware,
and thus needs to represent the most common and critical aspects of uncertainty so as to
achieve near-optimal but efficient computations (Yu & Dayan, 2003, 2005b, 2005a).
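The cue-integration case above has a simple closed form: under Gaussian assumptions, the Bayes-optimal estimate weights each cue by its inverse variance (precision), so an unreliable cue is automatically down-weighted. The cue values and noise variances below are illustrative.

```python
# Sketch of reliability-weighted cue combination (cf. Ernst & Banks, 2002;
# Battaglia et al., 2003). Observations and variances are illustrative.

visual_obs, visual_var = 10.0, 1.0   # reliable visual cue
haptic_obs, haptic_var = 14.0, 4.0   # noisier haptic cue

# Precision (inverse-variance) weighting:
w_visual = (1.0 / visual_var) / (1.0 / visual_var + 1.0 / haptic_var)
w_haptic = 1.0 - w_visual

combined_mean = w_visual * visual_obs + w_haptic * haptic_obs
combined_var = 1.0 / (1.0 / visual_var + 1.0 / haptic_var)

# The combined estimate (10.8) sits closer to the reliable cue, and its
# variance (0.8) is lower than that of either cue alone.
```

The same precision-weighting logic extends to top-down versus bottom-up integration: the less reliable source, whichever it is, receives the smaller weight.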
Based on a wealth of physiological, pharmacological, and behavioral data implicating acetylcholine (ACh) and norepinephrine (NE) in specific aspects of a range of cognitive processes,
it has been proposed that they respectively signal expected and unexpected uncertainty, with
the former associated with known stochasticity or ignorance in the current environment, and
the latter arising from unexpected, gross changes in the environment that require drastic
revision of the internal representation of the behavioral context (Yu & Dayan, 2005b). The
two forms of uncertainty are expected to be partly synergistic and partly antagonistic: a rise in
unexpected uncertainty should lead to an increase in expected uncertainty (due to known
ignorance), while expected uncertainty in turn has a suppressive influence on unexpected
uncertainty, since the evidence for a fundamental change induced by a prediction error of a given magnitude should be normalized by the expected degree of variability (Yu & Dayan, 2005b). Consistent with this, Sara and colleagues have found similarly
antagonistic interactions between ACh and NE in a series of learning and memory studies
(Sara, 1989; Ammassari-Teule, Maho, & Sara, 1991; Sara, Dyon-Laurent, Guibert, & Leviel,
1992; Dyon-Laurent, Romand, Biegon, & Sara, 1993; Dyon-Laurent, Hervé, & Sara, 1994).
They demonstrated that learning and memory deficits caused by cholinergic lesions can be
alleviated by the administration of clonidine (Sara, 1989; Ammassari-Teule et al., 1991; Sara
et al., 1992; Dyon-Laurent et al., 1993, 1994), a noradrenergic α-2 agonist that decreases the
level of NE (Coull, Frith, Dolan, Frackowiak, & Grasby, 1997).
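The normalization described above can be written down directly: evidence for a fundamental change (unexpected uncertainty, the NE-like quantity) is a prediction error divided by the known variability of the environment (expected uncertainty, the ACh-like quantity). The function and the numbers below are illustrative, not a fit to data.

```python
# Sketch of the expected/unexpected-uncertainty interaction (Yu & Dayan,
# 2005b): raw prediction error is normalized by expected variability
# before it counts as evidence for a fundamental change.

def change_evidence(prediction_error, expected_std):
    """Surprise in units of expected variability (a z-score-like quantity)."""
    return abs(prediction_error) / expected_std

# The same raw error is alarming in a quiet context but unremarkable in a
# noisy one:
quiet = change_evidence(3.0, expected_std=0.5)   # 6.0: likely a real change
noisy = change_evidence(3.0, expected_std=3.0)   # 1.0: within normal noise
```

This is the antagonistic half of the interaction: the higher the expected uncertainty, the larger a prediction error must be before it signals a change.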
In recent years, a number of experimental studies have confirmed major aspects of the
uncertainty-based account of acetylcholine and norepinephrine functions. Pupil diameter,
known to be correlated with noradrenergic neuronal activation level (Aston-Jones & Cohen,
2005), has been shown to be, on the one hand, correlated with the discarding of previous
beliefs and strategies and the exploration of novel ones (Gilzenrat, Nieuwenhuis, Jepma,
& Cohen, 2010; Jepma & Nieuwenhuis, 2011), and, on the other hand, reflective of subjects’ unexpected uncertainty beyond predicted variability (Preuschoff, ’t Hart, & Einhäuser, 2011). Moreover, there is
evidence that pupil diameter is both reflective of unexpected uncertainty (operationalized
here as change point probability), and accompanies the discarding of previous beliefs and
the incorporation of new data into a novel belief state (Nassar et al., 2012). Interestingly,
pupil dilation is controlled by two muscles, the sphincter for contraction and the dilator for
dilation, which are respectively innervated by acetylcholine and norepinephrine (Hutchins &
Corbett, 1997). The administration of cholinergic antagonists, scopolamine (muscarinic) and
mecamylamine (nicotinic), increases pupil size in normal elderly volunteers (Little, Johnson,
Minichiello, Weingartner, & Sunderland, 1998). Moreover, Alzheimer’s disease (AD) patients,
who have a severe and specific cholinergic deficit, have larger pupil size than controls both
tonically and in response to light, while AD patients medicated with an acetylcholinesterase
inhibitor, which increases the endogenous level of acetylcholine, show a pupil response much
closer to controls (Fotiou et al., 2009). This raises the intriguing possibility that the pupil
dilation results might reflect the joint influence of both norepinephrine and acetylcholine.
Separately, pharmacological and lesion studies directly manipulating acetylcholine levels have
also produced behavioral results consistent with the cholinergic expected-uncertainty hypothesis (Thiel & Fink, 2008; Córdova, Yu, & Chiba, Soc. Neurosci. Abstracts, 2004).
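The "change point probability" operationalization mentioned above can be sketched as a posterior probability that the latest observation came from a new generative state, combining a prior hazard rate with the observation's likelihood under the old prediction. The Gaussian predictive distribution, the uniform change distribution, and all parameter values here are illustrative assumptions in the spirit of such models (cf. Nassar et al., 2012), not the published model itself.

```python
import math

# Sketch of change-point probability: P(change | observation) under a
# hazard-rate prior. Distributions and parameters are illustrative.

def change_point_probability(obs, pred_mean, pred_std, hazard, spread=10.0):
    # Likelihood of the observation if nothing changed (Gaussian):
    z = (obs - pred_mean) / pred_std
    lik_stay = math.exp(-0.5 * z * z) / (pred_std * math.sqrt(2 * math.pi))
    # Likelihood if a change occurred: the new state is drawn broadly,
    # approximated here as uniform over a range of width `spread`.
    lik_change = 1.0 / spread
    num = hazard * lik_change
    return num / (num + (1.0 - hazard) * lik_stay)

# A surprising observation yields near-certain change; an expected one
# yields a probability near zero.
cpp_surprise = change_point_probability(8.0, pred_mean=0.0, pred_std=1.0, hazard=0.1)
cpp_expected = change_point_probability(0.5, pred_mean=0.0, pred_std=1.0, hazard=0.1)
```

Note that `pred_std` plays the role of expected uncertainty here: the same observation produces a smaller change-point probability when the predicted variability is larger.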
Further Reading
Doya, K. (2002). Metalearning and neuromodulation. Neural Netw., 15(4-6), 495-506.
Dayan, P. (2012). Twenty-five lessons from computational neuromodulation. Neuron, 76,
240-256.
References
Ammassari-Teule, M., Maho, C., & Sara, S. J. (1991). Clonidine reverses spatial learning
deficits and reinstates θ frequencies in rats with partial fornix section. Beh. Brain Res.,
45 , 1-8.
Aston-Jones, G., & Cohen, J. D. (2005). An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance. Annu. Rev. Neurosci., 28, 403-50.
Battaglia, P. W., Jacobs, R. A., & Aslin, R. N. (2003). Bayesian integration of visual and
auditory signals for spatial localization. J. Opt. Soc. Am. A. Opt. Image. Sci. Vis.,
20 (7), 1391-7.
Behrens, T. E. J., Woolrich, M. W., Walton, M. E., & Rushworth, M. F. S. (2007). Learning
the value of information in an uncertain world. Nature Neurosci, 10 (9), 1214–21.
Berridge, K. C., & Robinson, T. E. (1998). What is the role of dopamine in reward: Hedonic
impact, reward learning, or incentive salience? Brain Research Reviews, 28 , 309-369.
Boureau, Y.-L., & Dayan, P. (2011). Opponency revisited: Competition and cooperation
between dopamine and serotonin. Neuropsychopharmacology, 36 , 74-97.
Cohen, J. D., & Frank, M. J. (2009). Neurocomputational models of basal ganglia function
in learning, memory and choice. Behavioral Brain Research, 199 , 141-156.
Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B., & Uchida, N. (2012). Neuron-type-specific
signals for reward and punishment in the ventral tegmental area. Nature, 482 , 85-88.
Cools, R. (2011). Dopaminergic control of the striatum for high-level cognition. Current
Opinion in Neurobiology, 21 , 402-407.
Coull, J. T., Frith, C. D., Dolan, R. J., Frackowiak, R. S., & Grasby, P. M. (1997). The
neural correlates of the noradrenergic modulation of human attention, arousal and
learning. Eur. J. Neurosci., 9 (3), 589-98.
Daw, N. D., Kakade, S., & Dayan, P. (2002). Opponent interactions between serotonin and
dopamine. Neural Netw., 15 , 603-16.
Dayan, P. (2012). Twenty-five lessons from computational neuromodulation. Neuron, 76 ,
240-256.
Dayanik, S., & Yu, A. J. (2012). Reward-rate maximization in sequential identification under
a stochastic deadline. SIAM Journal on Control and Optimization, 51 (4), 2922-2948.
Deakin, J. F. W. (1983). Roles of brain serotonergic neurons in escape, avoidance and other
behaviors. Journal of Psychopharmacology, 43 , 563-577.
Doya, K. (2002). Metalearning and neuromodulation. Neural Netw., 15 (4-6), 495-506.
Dyon-Laurent, C., Hervé, A., & Sara, S. J. (1994). Noradrenergic hyperactivity in hippocampus after partial denervation: pharmacological, behavioral, and electrophysiological
studies. Exp. Brain Res., 99 , 259-66.
Dyon-Laurent, C., Romand, S., Biegon, A., & Sara, S. J. (1993). Functional reorganization
of the noradrenergic system after partial fornix section: a behavioral and autoradiographic study. Exp. Brain Res., 96 , 203-11.
Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a
statistically optimal fashion. Nature, 415 (6870), 429-33.
Fotiou, D. F., Stergiou, V., Tsiptsios, D., Lithari, C., Nakou, M., & Karlovasitou, A. (2009).
Cholinergic deficiency in Alzheimer’s and Parkinson’s disease: Evaluation with pupillometry. International Journal of Psychophysiology, 73 (2), 143-9.
Frank, M. J. (2005). Dynamic dopamine modulation in the basal ganglia: A neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism.
Journal of Cognitive Neuroscience, 17 , 51-72.
Frank, M. J., Seeberger, L. C., & O’Reilly, R. C. (2004). By carrot or by stick: Cognitive
reinforcement learning in Parkinsonism. Science, 306 , 1940-1943.
Gerfen, C. R., Engber, T. M., Mahan, L. C., Susel, Z., Chase, T. N., Monsma, F. J. J.,
& Sibley, D. R. (1990). D1 and D2 dopamine receptor-regulated gene expression of
striatonigral and striatopallidal neurons. Science, 250 , 1429-1432.
Gilzenrat, M. S., Nieuwenhuis, S., Jepma, M., & Cohen, J. D. (2010). Pupil diameter
tracks changes in control state predicted by the adaptive gain theory of locus coeruleus
function. Cogn Affect Behav Neurosci , 10 , 252-69.
Hutchins, J. B., & Corbett, J. J. (1997). The visual system. In D. E. Haines (Ed.),
Fundamental neuroscience (p. 265-84). New York: Churchill Livingstone.
Ikemoto, S., & Panksepp, J. (1999). The role of nucleus accumbens dopamine in motivated
behavior: A unifying interpretation with special reference to reward-seeking. Brain
Research Reviews, 31 , 6-41.
Jepma, M., & Nieuwenhuis, S. (2011). Pupil diameter predicts changes in the exploration-exploitation trade-off: evidence for the adaptive gain theory. J Cogn Neurosci, 23,
1587-96.
Johnson, J., Li, W., Li, J., & Klopf, A. (2001). A computational model of learned avoidance
behavior in a one-way avoidance experiment. Adaptive Behavior , 9 , 91.
Körding, K. P., & Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning.
Nature, 427 , 244-7.
Kravitz, A. V., Tye, L. D., & Kreitzer, A. C. (2012). Distinct roles for direct and indirect
pathway striatal neurons in reinforcement. Nature Neuroscience, 15 , 816-818.
Little, J. T., Johnson, D. N., Minichiello, M., Weingartner, H., & Sunderland, T. (1998).
Combined nicotinic and muscarinic blockade in elderly normal volunteers: Cognitive,
behavioral, and physiological responses. Neuropsychopharmacology, 19 (1), 60-9.
Maia, T. V. (2010). Two-factor theory, the actor-critic model, and conditioned avoidance.
Learning and Behavior , 38 , 50-67.
McClure, S. M., Daw, N. D., & Montague, P. R. (2003). A computational substrate for
incentive salience. Trends in Neuroscience, 26 , 423-428.
Miyazaki, K., Miyazaki, K. W., & Doya, K. (2011). Activation of dorsal Raphe serotonin
neurons underlies waiting for delayed rewards. Journal of Neuroscience, 31 , 469-479.
Montague, P. R., Dayan, P., & Sejnowski, T. J. (1996). A framework for mesencephalic
dopamine systems based on predictive hebbian learning. Journal of Neuroscience, 16 ,
1936-1947.
Moutoussis, M., Bentall, R. P., Williams, J., & Dayan, P. (2008). A temporal difference
account of avoidance learning. Network , 19 , 137-160.
Murschall, A., & Hauber, W. (2006). Inactivation of the ventral tegmental area abolished the
general excitatory influence of pavlovian cues on instrumental performance. Learning
and Memory, 13 , 123-126.
Nakamura, K., & Hikosaka, O. (2006). Role of dopamine in the primate caudate nucleus in
reward modulation of saccades. Journal of Neuroscience, 26 , 5360-5369.
Nassar, M. R., Rumsey, K. M., Wilson, R. C., Parikh, K., Heasly, B., & Gold, J. I. (2012).
Rational regulation of learning dynamics by pupil-linked arousal systems. Nature Neuroscience.
Niv, Y., Daw, N. D., Joel, D., & Dayan, P. (2007). Tonic dopamine: opportunity cost and
the control of response vigor. Psychopharmacology, 191 , 507-520.
Preuschoff, K., ’t Hart, B. M., & Einhäuser, W. (2011). Pupil dilation signals surprise:
Evidence for noradrenaline’s role in decision making. Frontiers in Neuroscience, 5, 115.
Sara, S. J. (1989). Noradrenergic-cholinergic interaction: its possible role in memory dysfunction associated with senile dementia. Arch. Gerontol. Geriatr., Suppl. 1 , 99-108.
Sara, S. J., Dyon-Laurent, C., Guibert, B., & Leviel, V. (1992). Noradrenergic hyperactivity
after fornix section: role in cholinergic dependent memory performance. Exp. Brain
Res., 89 , 125-32.
Satoh, T., Nakai, S., Sato, T., & Kimura, M. (2003). Correlated coding of motivation and
outcome of decision by dopamine neurons. Journal of Neuroscience, 23 , 9913-9923.
Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and
reward. Science, 275 , 1593-9.
Schweimer, J. V., & Ungless, M. A. (2010). Phasic responses in dorsal raphe serotonin
neurons to noxious stimuli. Neuroscience, 171 , 1209-1215.
Smith, Y., Bevan, M. D., Shink, E., & Bolam, J. P. (1998). Microcircuitry of the direct and
indirect pathways of the basal ganglia. Neuroscience, 86 , 353-387.
Surmeier, D. J., Ding, J., Day, M., Wang, Z., & Shen, W. (2007). D1 and D2 dopamine-receptor modulation of striatal glutamatergic signaling in striatal medium spiny neurons. Trends in Neuroscience, 30, 228-235.
Surmeier, D. J., Shen, W., Day, M., Gertler, T., Chan, S., Tian, X., & Plotkin, J. L. (2010).
The role of dopamine in modulating the structure and function of striatal circuits.
Progress in Brain Research, 183 , 149-167.
Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Mach.
Learning, 3 (1), 9-44.
Talmi, D., Seymour, B., Dayan, P., & Dolan, R. J. (2008). Human Pavlovian-instrumental
transfer. Journal of Neuroscience, 28 , 360-368.
Thiel, C. M., & Fink, G. R. (2008). Effects of the cholinergic agonist nicotine on reorienting
of visual spatial attention and top-down attentional control. Neuroscience, 152 (2),
381-90.
Ungless, M. A., Magill, P. J., & Bolam, J. P. (2004). Uniform inhibition of dopamine neurons
in the ventral tegmental area by aversive stimuli. Science, 303 , 2040-2042.
Yu, A. J., & Dayan, P. (2003). Expected and unexpected uncertainty: ACh and NE in the
neocortex. In S. T. S. Becker & K. Obermayer (Eds.), Advances in neural information
processing systems 15 (p. 157-164). Cambridge, MA: MIT Press.
Yu, A. J., & Dayan, P. (2005a). Inference, attention, and decision in a Bayesian neural architecture. In L. K. Saul, Y. Weiss, & L. Bottou (Eds.), Advances in Neural Information
Processing Systems 17. Cambridge, MA: MIT Press.
Yu, A. J., & Dayan, P. (2005b). Uncertainty, neuromodulation, and attention. Neuron, 46 ,
681-92.