Delay Activity of Orbital and Lateral
Prefrontal Neurons of the Monkey Varying
with Different Rewards
Kazuo Hikosaka and Masataka Watanabe
We examined neuronal activity in the orbitofrontal cortex (OFC) in
relation to reward expectancy and compared findings with those of
the lateral prefrontal cortex (LPFC) in the monkey. Activity of OFC
neurons was examined in a delayed reaction time task where every
four trials constituted one block within which three kinds of rewards
and no reward were delivered in a fixed order. More than half of OFC
delay neurons were related to the expectancy of delivery or no-delivery of a reward as the response outcome, while some neurons
showed nature-of-reward-specific anticipatory activity changes.
These delay-related activities reflected the preference of the animal
for each kind of reward and were modulated by the motivational
state of the animal. LPFC neurons are reported to show nature-of-reward-specific anticipatory activity changes in a delayed response
task when several different kinds of rewards are used. Such reward-dependent activity is observed in LPFC delay neurons both with
and without spatially differential delay (working memory-related)
activity. Although reward expectancy-related activity is commonly
observed in both OFC and LPFC, it is suggested that the OFC is more
concerned with motivational aspects, while the LPFC is related to
both the cognitive and motivational aspects of the expectancy of
response outcome.
The orbitofrontal cortex (OFC) plays important roles in
motivation and emotion (Stuss and Benson, 1986; Damasio,
1994; Fuster, 1997; Rolls, 1999). Patients with OFC lesions
show changes in personality and emotional reaction (Stuss
and Benson, 1986; Rolls, 1999), as well as problems in social
behavior (Stuss and Benson, 1986). OFC-ablated monkeys show
impairments in reversal learning and extinction of learned
responses (Butter, 1969; McEnaney and Butter, 1969). They also
show problems in social and motivational behaviors (Fuster,
1997), such as changed food preferences (Baylis and Gaffan,
1991). Neurophysiological studies have indicated that primate
OFC neurons respond to reward and reward-associated stimuli
(Thorpe et al., 1983; Tremblay and Schultz, 1999), as well as
showing activity changes in relation to several task events such
as cue, delay, reinforcement and error (Rosenkilde et al., 1981;
Tremblay and Schultz, 1999). Rodent OFC neurons have also
been shown to be involved in the expectancy of the appetitive
and aversive outcome (Schoenbaum et al., 1998).
The lateral prefrontal cortex (LPFC) has been shown to play
important roles in higher cognitive operations such as retaining
working memory in both human and non-human primates
(Petrides, 1994; Goldman-Rakic, 1996; Fuster, 1997; Courtney
et al., 1998; D’Esposito et al., 1998). Neuronal activity in the
primate LPFC has been extensively studied in working memory
task situations, and working memory-related activity changes
have commonly been observed (Niki, 1974; Funahashi et al.,
1989; Miller et al., 1996; Rao et al., 1997). Recently, we have
shown that delay neurons of the LPFC also participate in the
expectancy of response outcome in relation to the reward that is
expected to be delivered in given trials (Watanabe, 1996). LPFC
neurons have also been shown to respond to reward, reinforcement and error (Niki and Watanabe, 1979; Watanabe, 1989).
Because the OFC is more related to motivational operations than
the LPFC (Fuster, 1997), and delay-related activity changes of
primate OFC neurons have not been examined sufficiently
since the pioneering study by Rosenkilde et al. (Rosenkilde et
al., 1981), we investigated whether delay neurons of the OFC
are also involved in reward expectancy and whether there are
differences in the characteristics of reward expectancy-related
activity between the OFC and LPFC.
Behavioral experiments on rodents indicate that when
different magnitudes of rewards and no reward are delivered in
response to the animal’s action in a fixed order, for example in
the order of 4, 2, 1 and 0 pellets, the animal comes to expect the
delivery of a specific magnitude of reward or no reward as the
response outcome in each trial (Hulse and Dorsky, 1977). We
examined OFC neuronal activity in relation to the expectancy,
not of different magnitudes of but of different ‘kinds’ of reward,
by training monkeys in a delayed reaction time task where every
four trials constituted one block and three different kinds of
rewards and no reward were delivered in a fixed order.
In this paper we first report the results of the experiment
where we examined the delay-related activities of OFC neurons.
Second, we briefly describe reward expectancy-related LPFC
neuronal activities that have already been reported (Watanabe,
1996). Then we compare the characteristics of delay activity
of OFC and LPFC neurons in relation to the expectancy of a
response outcome to investigate the possible roles of the OFC
and LPFC in goal-directed behavior.
© Oxford University Press 2000
Department of Psychology, Tokyo Metropolitan Institute for
Neuroscience, Musashidai 2-6, Fuchu, Tokyo 183-0042, Japan
Delay Activity of Orbitofrontal Neurons in Relation to Reward
Expectancy
Materials and Methods
Experimental Design and Recording
We trained two monkeys (Macaca fuscata) on three kinds of delayed
reaction time tasks (Fig. 1). Each monkey was seated on a primate chair
facing a panel that contained a rectangular window, a circular key and a
hold lever below them. The window contained two screens, one opaque
and one transparent with thin vertical lines. Food reward was given in
the window while liquid reward was given through one of three tubes
attached to the animal’s mouth.
In the ‘Cued liquid reward’ task (Fig. 1a), the monkey first depressed
the hold lever and a color cue of red or green light was presented for 1 s
on the circular key. There was then a delay period of 5 s, after which a
white light was presented on the key as a go signal and the animal was
required to press the key within 1 s. After a correct response (key
pressing within 1 s after the go signal presentation), a drop (0.3 ml)
of liquid reward was given or not given depending on the color cue
previously presented. Red cues indicated reward while green cues
indicated no reward delivery. In this experiment, each set of four
consecutive trials was considered a block, and the color cue was
presented in the order of red–red–red–green. Several different kinds of
liquid rewards were used, and the outcome of the animal’s correct key
pressing responses was the delivery of different kinds of rewards in a
fixed order of (1) orange juice, (2) water, (3) grape juice and (4) no
reward. The animal had to press the key even on no-reward trials to
advance to the next trial.
Cerebral Cortex Mar 2000;10:263–271; 1047–3211/00/$4.00
Figure 1. Sequences of events in three different kinds of delayed reaction time tasks
and the order of the delivery of different kinds of liquid or food rewards in each task. The
bold letter ‘C’ in the figure indicates the color (red or green) cue.
Figure 2. Areas where neuronal activity was recorded. The vertical line in (a) indicates
the level of the coronal section in (b). The hatched area in (b) indicates the orbitofrontal
area where the recording was obtained. The recorded area extended widely along the
ventral surface of the frontal cortex. The shaded area in (a) indicates the recorded area
in the lateral prefrontal cortex in the previous study (Watanabe, 1996). Abbreviations:
AS, arcuate sulcus; PS, principal sulcus; MOS, medial orbital sulcus; LOS, lateral orbital
sulcus; CS, central sulcus; LF, lateral fissure; STS, superior temporal sulcus.
In the ‘Cued food reward’ task (Fig. 1b), food rewards instead of liquid
rewards were used. Within a block of four trials, the color cue was
also presented in the order of red–red–red–green. After a correct
response, the two screens of the window were raised and the animal could
obtain a piece (∼0.3 g) of food reward on red cue trials while an empty
tray was presented to the animal on green cue trials. The outcome of the
animal’s correct responses within each block was the delivery of different
kinds of rewards in the fixed order of (1) sweet potato, (2) raisin, (3)
cabbage and (4) no reward.
In the ‘Visible food reward’ task (Fig. 1c), instead of the color light as
a cue, the presence or absence of a particular food indicated the outcome
of the animal’s response. During the cue period, the opaque screen was
raised and the animal could see a food reward or empty tray behind the
transparent screen. The order of presenting three different kinds of food
rewards and empty tray as a cue, and thus the order of the animal’s
response outcomes, was the same as that in the ‘Cued food reward’ task.
In these tasks, the animal was only required to press the key within 1
s after the go signal presentation to obtain the reward. Although the
animal was not explicitly required to memorize the serial position of each
kind of reward in a block, nor required to expect the specific reward
in each trial, behavioral experiments (Hulse and Dorsky, 1977) have
suggested that the animal would do so.
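The block structure shared by the three tasks can be sketched in a few lines of Python. This is only an illustration of the schedule described above: the names `REWARD_ORDER` and `trial_schedule` are hypothetical, and the way the cue is represented (a color string, or the visible item itself) is an assumption made for the example.

```python
from itertools import cycle, islice

# Outcome order within one block of four trials, as given in the task
# descriptions. None stands for the no-reward trial.
REWARD_ORDER = {
    "Cued liquid reward": ["orange juice", "water", "grape juice", None],
    "Cued food reward": ["sweet potato", "raisin", "cabbage", None],
    "Visible food reward": ["sweet potato", "raisin", "cabbage", None],
}


def trial_schedule(task, n_trials):
    """Return (cue, outcome) pairs for the first n_trials of a task.

    Hypothetical helper: blocks simply repeat in the same fixed order.
    """
    schedule = []
    for outcome in islice(cycle(REWARD_ORDER[task]), n_trials):
        if task == "Visible food reward":
            # The food itself (or an empty tray) serves as the cue.
            cue = outcome if outcome is not None else "empty tray"
        else:
            # Red cue on reward trials, green cue on no-reward trials.
            cue = "red" if outcome is not None else "green"
        schedule.append((cue, outcome))
    return schedule


if __name__ == "__main__":
    for trial in trial_schedule("Cued liquid reward", 8):
        print(trial)
```

Note that in the two cued tasks the red cue is identical on all three reward trials, so only the position in the sequence distinguishes the rewards.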
Preferences for different kinds of foods by individual animals were
examined separately from the experiment, by free choice tests among
potato, raisin and cabbage rewards, and also by choice tests between
each pair. Preferences for different kinds of liquid rewards were
examined by testing the animal’s willingness to perform the task with
one kind of reward after refusing to perform the task with another kind
of reward.
Details of the surgery and recording methods have been described
previously (Watanabe, 1990; Hikosaka, 1999). Extracellular recordings
264 Reward-related Activity of Prefrontal Delay Neurons • Hikosawa and Watanabe
were made using an Elgiloy electrode (Suzuki and Azuma, 1976), and impulses recorded from isolated single neurons were fed through a window
discriminator to a computer. During the recording, the activity of each
neuron was first examined in a certain task for ∼40 trials (10 blocks),
which constituted one recording epoch. Then the activity was reexamined in this task after one or two recording epochs in another
task(s). The data obtained from different epochs in the same task were
compiled for analysis. To record neuronal activity, electrodes were inserted vertically in the frontal plane. For precise placement of electrodes,
a guide tube was used. The guide tube was placed on the dura of
the cortex and electrodes were advanced through the guide tube. The
recording sites extended from 29–38 mm anterior to the interaural plane
and from medial to lateral portions of the ventral surface of the prefrontal
cortex (Fig. 2).
During the recording, probe tests were sometimes conducted where
the three different kinds of rewards were delivered in a different order
from the original fixed order within a block.
Data Analysis
The data were analyzed off-line. Raster displays and frequency histograms
were used for graphic representation. Non-parametric tests (U and H
tests) were used for statistical analysis. In this paper, we focus on neuronal activity in the OFC during the delay period. Changes in neuronal
activity during the delay period were first compared with those during
the pre-cue control period, and were then compared with each other
among the four reward conditions (three different kinds of rewards and
no-reward).
Figure 3. Examples of reward expectancy-related OFC neurons. (a) An example of an OFC delay neuron that discriminated between reward and no-reward but not among different
kinds of rewards in ‘Cued food reward’. (b) Another example of an OFC delay neuron that showed selectivity to the nature-of-reward, showing activations only in water reward but
not any other reward or no-reward trials in ‘Cued liquid reward’. (c) Another example of an OFC delay neuron that showed activations on both water and no-reward trials in ‘Cued liquid
reward’. Neuronal activities are shown in raster and histogram displays. In each display, the second and third vertical lines indicate cue onset and offset, and the fourth vertical line
indicates the end of the delay period. Each row indicates one trial, and small upward triangles in the raster indicate the time of the key-pressing response. Only data from correct trials
are shown. The leftmost scales indicate impulses/s and the time scale at the bottom indicates 1 s. The order of delivery of different kinds of rewards within a block is shown in the
upper left of each display. Abbreviations: C, cue period; D, delay period; R, response.
To evaluate characteristics of activity changes in OFC neurons during
the delay period in relation to the difference in response outcome (three
kinds of rewards and no-reward), two kinds of indices were employed:
reward–no reward discrimination ratio (RNRDR) and reward preference
ratio (RPR). The RNRDR was calculated by the following formula:
RNRDR = (sum of impulses for no-reward)/(sum of impulses for the best reward)
and is considered to reflect the discriminability by individual neurons
between the two outcome situations of the best reward and no-reward.
The RPR, which was analyzed only on neurons that showed delay-related
activity changes in reward trials, was calculated by the following formula:
RPR = (sum of impulses for the worst reward)/(sum of impulses for the best reward)
and is considered to reflect the discriminability by individual neurons
between the best and worst rewards among three kinds of rewards
within each task. Here, the ‘best reward’ indicates the one inducing the
greatest activation and the ‘worst reward’ indicates the one inducing the
least activation in each neuron during the delay period. The Kolmogorov–
Smirnov test was used to analyze differences in the distribution of ratios
among different kinds of tasks.
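As a worked illustration of the two indices, the following Python sketch computes RNRDR and RPR for one neuron from summed delay-period impulse counts. The function name `delay_indices`, the dictionary layout and the example counts are hypothetical, and the sketch assumes that the same number of trials contributed to each outcome so that the sums are directly comparable.

```python
def delay_indices(impulses, no_reward_key="no reward"):
    """Compute (RNRDR, RPR) for one neuron.

    impulses maps each outcome to the summed impulse count during the
    delay period. Assumption: equal trial counts per outcome.
    """
    rewards = {k: v for k, v in impulses.items() if k != no_reward_key}
    best = max(rewards.values())   # reward inducing the greatest activation
    worst = min(rewards.values())  # reward inducing the least activation
    if best == 0:
        raise ValueError("no delay activity on any reward trial")
    rnrdr = impulses[no_reward_key] / best
    rpr = worst / best
    return rnrdr, rpr


if __name__ == "__main__":
    # Made-up counts for illustration only, not data from the study.
    example = {"sweet potato": 120, "raisin": 90, "cabbage": 150, "no reward": 15}
    rnrdr, rpr = delay_indices(example)
    print(f"RNRDR = {rnrdr:.2f}, RPR = {rpr:.2f}")
```

Because the worst reward can never exceed the best, RPR lies between 0 and 1, whereas RNRDR exceeds 1 whenever a neuron is more active on no-reward trials than on its best reward trials.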
Histology
After the final recording session, each monkey was deeply anesthetized
with pentobarbital sodium (45 mg/kg) and perfused transcardially with
warm saline followed by 10% formal saline. The brain was removed and
blocks of the brain were placed in fixative containing 10% formalin and
30% sucrose until they sank. The brains were frozen and sectioned at
a thickness of 50 µm along the coronal plane. Every fifth section was
stained for cell bodies with cresyl violet. The electrode tracks were
reconstructed from both traces of electrode penetration and electrolytic
lesions that were made at selected penetration sites.
Results
Preference tests for different kinds of food rewards revealed that
the animal consistently preferred cabbage to potato to raisin.
Preference tests for different kinds of liquids revealed that the
animal preferred orange juice and grape juice far more than
water in that the animal was willing to perform the task for an
orange juice or grape juice reward after refusing to perform it
for water reward, while the reverse never occurred. There was
no consistent difference in the animal’s preference between
orange and grape juice rewards.
There were 207 (130 and 77) penetrations in two monkeys. Of
501 OFC neurons isolated in two monkeys, 235 (47%) were
task-related. Of these 235 neurons, 88 (18% of the 501 isolated neurons) could be examined
in at least two different kinds of tasks. We focus here on 50 neurons that showed delay-related differential activity depending on
differences in the response outcome (three kinds of rewards and
no-reward) on at least two of three different kinds of tasks. Half
(n = 25) of the neurons showed activations during the delay
period for all kinds of reward trials without nature-of-reward
specificity, but not in no-reward trials. Six neurons showed
activations during the delay period only in no-reward trials
without showing activation in any reward trials. Five neurons
showed nature-of-reward specificity, with anticipatory activity changes during the delay period only before obtaining a
specific reward. Two neurons showed activity changes during
the delay period in the trials of both no-reward and least preferred reward (water). In the remaining neurons (n = 12), the
characteristics of delay-related activity changes in relation to the
reward were not consistent among different tasks, for example
some showed delay-related activation in certain reward trials in
one task while discriminating reward and no-reward trials in
another task.
Some examples of OFC delay neurons showing reward-related
activity changes are presented in Figure 3. The example in Figure
3a discriminated between reward and no-reward but not among
different kinds of rewards. This neuron, which was examined in
the ‘Cued food reward’ task, showed delay-related activations for
all reward trials but not in no-reward trials. Since the same
characteristic activity changes were observed in the ‘Visible
food reward’ task (not shown), where actual food or an empty
tray was presented as the cue, the differential activity observed
in this neuron is not considered to be related to the difference in
the color (red versus green) of the cue indicating the presence or
absence of reward.
An example of a neuron that demonstrated selectivity to the
nature-of-reward is shown in Figure 3b. This neuron, which was
examined in the ‘Cued liquid reward’ task, showed significant
delay-related activity changes only in water reward trials but not
in any other reward and no-reward trials.
Another OFC neuron is shown in Figure 3c. This neuron
showed activations in both water and no-reward trials during the
delay period. Similar activations in no-reward trials but no activation in any reward trials were observed in the ‘Visible and cued
food reward’ tasks (not shown). It appears that this neuron
responded similarly to no-reward and to the least preferred
reward. In this neuron, no-reward and least preferred reward-related activation started before the cue presentation.
In food reward tasks, the animal sometimes refused to ingest
the least preferred food reward (raisin), although the
animal continued to perform the task as before. An example of
neuronal activity examined during such periods is shown in
Figure 4. This neuron, which is the same as the one shown in
Figure 3a, was examined in the ‘Visible food reward’ task before,
during and after the animal refused to ingest the raisin. After
performing several hundred trials of ‘Visible and cued food
reward’ tasks, and thus after ingesting a substantial amount of
food, the animal became reluctant to ingest raisin and the magnitude of activation during the delay period in raisin reward trials
decreased (Fig. 4a, after eight trials). When the animal finally
refused to ingest raisin, this neuron did not show any activation
during the delay period in raisin reward trials (Fig. 4b), although
the animal continued to perform the task in order to advance
to the next trial where a more preferred reward (cabbage) could
be obtained. However, this neuron continued to show (slightly
reduced) activations during the delay period in potato and
cabbage reward trials (Fig. 4d), while the animal refused to
ingest raisin. After another hundred trials in the ‘Cued liquid
reward’ task, the animal again began to ingest raisin during the
‘Visible food reward’ task. This neuron now showed activations
again during the delay period in raisin reward trials (Fig. 4c). The
mean firing rate of this neuron during the delay period in raisin
reward trials was 6.4, 0.2 and 5.8 spikes/s, (1) before the animal
refused to ingest raisin, (2) while the animal was refusing it and
(3) when the animal began to ingest it again, respectively. There
was a significant difference between the refusal period and each
ingestion period (P < 0.01, Student’s t-test). The reaction time
(RT) of the animal for each period (before, during and after
raisin refusal) in raisin reward trials was 524, 672 and 530 ms,
respectively (Fig. 4e), with significant differences in RT between
the refusal period and each ingestion period (P < 0.01, Student’s
t-test).
To quantitatively examine how and to what extent the difference in response outcomes is reflected in delay activity of
OFC neurons, we analyzed two kinds of indices that measured
the discriminability of individual neurons between the best
reward and no reward (RNRDR) and between the best and worst
rewards (RPR) (see Materials and Methods).
Since there was no significant difference in the distribution of
RNRDR values observed in OFC neurons among the three different kinds of tasks, we present the mean RNRDR value obtained
Figure 4. Activity of an OFC neuron and reaction time of the animal before, during and
after the refusal of a specific reward (raisin) in ‘Visible food reward’. Delay-related
activations in raisin reward trials observed before the refusal of raisin (a) disappeared
while the animal was refusing to ingest the raisin (b). However, when the animal again
began to ingest the raisin, delay-related activation reappeared (c). (d) Mean firing
rate during the delay period in potato, raisin and cabbage reward trials respectively
while the animal was refusing to ingest raisin. There were activations on potato and
cabbage, but not on raisin reward trials, and the magnitude of activity changes was
significantly different between raisin and other reward (potato or cabbage) trials. (e)
Reaction time of the animal before, during and after refusal of the raisin reward. There
were significant differences in reaction time between the refusal period and ingestion
period. The error bar indicates standard error (SE). Other conventions are the same as in
Figure 3.
from all three tasks in Figure 5. There were two clusters in the
distribution: one cluster of neurons (shaded part in the figure)
showed more delay-related activations in reward than in no-reward trials and demonstrated smaller values, ranging from
0.01 to 0.69, while the other cluster of neurons showed more
delay-related activations in no-reward than in reward trials and
showed larger values, ranging from 1.02 to 4.58.
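Since RNRDR is a ratio of no-reward activity to best-reward activity, the two clusters fall on either side of 1. A minimal sketch of that split follows; the function name and the labels are hypothetical, and the threshold of 1 follows from the definition of the ratio rather than from a stated analysis step.

```python
def rnrdr_cluster(rnrdr):
    """Label a neuron by which side of 1 its RNRDR value falls on."""
    if rnrdr < 1:
        return "more active on reward trials"
    if rnrdr > 1:
        return "more active on no-reward trials"
    return "no difference"


if __name__ == "__main__":
    # End points of the two reported ranges (0.01-0.69 and 1.02-4.58).
    for value in (0.01, 0.69, 1.02, 4.58):
        print(value, "->", rnrdr_cluster(value))
```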
RPR values were calculated only on those neurons that
showed delay-related activity changes in reward trials. Distributions of RPR values of OFC neurons for three different kinds of
tasks are shown in Figure 6. The range and mean of values were
0.12–0.96 (mean = 0.72) for ‘Cued liquid reward’ (Fig. 6a),
0.01–0.96 (mean = 0.70) for ‘Cued food reward’ (Fig. 6b) and
0.07–0.92 (mean = 0.70) for ‘Visible food reward’ (Fig. 6c) tasks,
respectively. There was no significant difference in the
distribution among the three different kinds of tasks. As shown
in this figure, RPR values of OFC neurons with nature-of-reward
specificity (shaded ones) were smaller than those of neurons that
had no such specificity. There was no significant correlation
between the RNRDR and RPR values in OFC neurons, indicating
that there was no tendency for neurons with lower RNRDR
values (neurons showing stronger activation on reward than on
no-reward trials) to show lower RPR values by better discriminating between the best and worst rewards.
Figure 5. Distribution of RNRDR values of OFC neurons. The shaded area represents
neurons that showed larger activations on reward than on no-reward trials.
When the serial position of each of three kinds of rewards
within a block was changed in probe tests, neurons with nature-of-reward sensitivity showed activity changes according to the
relational order among sequential components, that is, according to
which reward had been delivered in the previous trial. For
example, neurons such as the one in Figure 3b always showed
delay-related activation in trials after the orange juice reward trial
as it did in the original fixed order situation, even when the order
of reward delivery was changed from the original one to the
order of, for example, (1) orange juice, (2) grape juice, (3) water
and (4) no reward.
The RT of the animal was significantly longer on no-reward
than on any reward trial. Although there was almost no difference in RT among different kinds of reward trials when the
animal was well motivated at the beginning of the daily recording session, there were sometimes differences during the middle
of the recording. Although there was no significant correlation
between RT and the magnitude of delay-related activity changes
in most OFC neurons within reward trials or within no-reward
trials, significant correlations were sometimes observed in a few
OFC neurons on specific occasions, as shown in Figure 4.
Histological examination revealed that reward-related delay
neurons were found in all areas explored in the OFC, with a
slight but not significant tendency for more such neurons to be
located in the lateral portions (medial area 12) (Fig. 2).
Discussion
In the delayed reaction time task, we found OFC neurons that
showed differential activity during the delay period depending
on the presence or absence of reward, or depending on the
nature of reward that would be delivered in given trials. Similar
discriminative responses between reward and no-reward trials
were observed in most OFC delay neurons for both ‘Visible’
and ‘Cued’ food reward tasks. Thus, the differential delay activity
observed is considered not to be associated with the difference
in the color of the cue which indicated the presence or absence
of future reward, nor with the difference in the appearance of
reward.
There was clear clustering in the RNRDR values, indicating
that there were two types of OFC neurons (Fig. 5): one type of
neuron, with a value of >1, was more activated in the absence of
reward, while the other type, which constituted the majority and had a
value of <0.7, was activated in the presence of reward. The existence of two clusters indicates that most OFC neurons showed
clear activity changes either on reward or on no-reward trials,
but not on both, suggesting that OFC delay neurons are more
concerned with the presence or absence of reward.
Figure 6. Distribution of RPR values of OFC neurons in three different kinds of tasks.
Shaded areas represent neurons which showed nature-of-reward specificity. The
discriminability of OFC neurons between the best and worst rewards did not differ
among three different kinds of tasks. Values with arrow heads indicate the mean.
Considering that the RPR value can range from 0 to 1.0, with
a lower value reflecting better discrimination between the best
and worst rewards, the mean values obtained (0.70–0.72) may
indicate that discriminations among different kinds of rewards
are not very sharp in OFC neurons (Fig. 6). These results may
also indicate that OFC neurons are more sensitive to the difference between reward and no-reward than to the nature-of-reward, at least in the task situation where there are trials in
which no-reward can be expected. Indeed, the majority of
OFC neurons (n = 31, 62%) discriminated between reward and
no-reward but not among different kinds of rewards. These
neurons are considered to be involved in the expectancy of
delivery or no-delivery of reward as a response outcome,
information that may matter more to the animal than
information concerning the nature-of-reward.
Reward expectancy-related activity was found to be modified
by the motivational state of the animal. When the animal refused
to ingest a specific food (raisin), which was the least preferred,
there were no delay-related activations that had previously been
observed in association with the animal’s ingesting the reward
(Fig. 4). Neurons of the OFC and lateral hypothalamus have been
reported to stop responding to the sight and taste of the food or
liquid after an animal is fed until satiety (Burton et al., 1976;
Rolls et al., 1989; Critchley and Rolls, 1996). This process is
nature-of-reward specific and these neurons continue to respond
to other kinds of rewards with which the animal is not satiated
(Burton et al., 1976; Rolls et al., 1989; Critchley and Rolls, 1996).
The present results indicate that the delay activity of reward expectancy-related OFC neurons is also nature-of-reward specific,
because motivation-dependent modification of neuronal activity
was observed only in a certain reward (raisin) but not in other
reward trials (Fig. 4). This also indicates that delay activity of
OFC neurons is related not to the appearance of reward such
as the shape or color, which does not vary, but to the degree
of preference of the animal for each reward, which does vary
during the task performance. It seems that the raisin reward,
which had previously been estimated to be, to some extent,
preferred by the animal, became non-preferred after having been
ingested in a sufficient amount, and this process was reflected
in the activity of OFC neurons. In other words, activity of OFC
neurons seems to reflect the degree of the animal’s preference
for a certain reward determined by the animal’s motivational
state. Similarly, after obtaining a good amount of liquid, water
may become as non-preferable as no-reward. Thus, an OFC neuron such as that shown in Figure 3c, which showed activations on
both water and no-reward trials, may be related to discriminating
between two kinds (preferred and non-preferred) of outcomes.
In the present experiment, the method of reward delivery
(fixed order of delivery of three different kinds of rewards and
no reward within a block of four trials) allowed us to examine
the order-related expectancy process in the OFC. Although the
presence or absence of reward was indicated by the color cue,
the red cue itself did not explicitly inform the animal of what
specific reward would be delivered in a given trial in ‘Cued food
and liquid reward’ tasks. The animal could only deduce what
reward would be delivered in a given trial from the fixed order of
delivery of different kinds of rewards within a block. Thus, delay
neurons showing nature-of-reward-specific activity changes
(e.g. Fig. 3b) are considered to be involved in this deduction
process as well as being involved in the expectancy of the
specific reward. The animal could deduce the kind of reward in
each trial either (1) from the relational order among sequence
components (raisin always comes after potato and cabbage
always comes after raisin) or (2) from the numerical order within
a block in relation to whether the current trial was the first,
second or third. By changing the serial position of each of three
kinds of rewards within a block, it was possible to examine
which strategy the animal was employing. It was found that the
animal was using the former strategy, since nature-of-reward-specific anticipatory activity was found to be determined by the
reward that had been delivered in the previous trial, but not by
the ordinal position of a certain reward within a block.
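The logic of this test can be sketched in a few lines of Python. This is a toy illustration rather than the analysis used in the experiment: the block orders, the probe block and the function names are all hypothetical, chosen only to show that the two strategies make the same predictions under the original order and different predictions once the serial positions are changed.

```python
# Toy illustration only: the reward orders and function names below are
# hypothetical and are not taken from the experiment.
ORIGINAL_BLOCK = ["potato", "raisin", "cabbage"]


def learn_relational(block):
    """Strategy 1: map each reward to the reward that followed it."""
    return {prev: nxt for prev, nxt in zip(block, block[1:])}


def learn_ordinal(block):
    """Strategy 2: map each serial position (0, 1, 2) to its reward."""
    return dict(enumerate(block))


def predictions(block, relational, ordinal):
    """For every trial after the first, list what each strategy expects.

    Each row is (position, relational_expectation, ordinal_expectation).
    """
    rows = []
    for pos in range(1, len(block)):
        prev = block[pos - 1]
        rows.append((pos, relational.get(prev), ordinal.get(pos)))
    return rows


relational = learn_relational(ORIGINAL_BLOCK)
ordinal = learn_ordinal(ORIGINAL_BLOCK)

# Under the trained order the two strategies cannot be told apart.
same = predictions(ORIGINAL_BLOCK, relational, ordinal)

# Hypothetical probe block in which raisin is moved to the first position.
probe = predictions(["raisin", "cabbage", "potato"], relational, ordinal)
```

In this sketch the strategies diverge on the second trial of the probe block: the relational rule expects cabbage (because raisin was just delivered), whereas the ordinal rule still expects raisin (because it is the second trial).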
Reward Expectancy-related Neuronal Activity in the LPFC
To compare characteristics of delay activity of OFC and LPFC
neurons in relation to the response outcome, we introduce here
reward expectancy-related neuronal activities of the LPFC which
were reported previously (Watanabe, 1996). In the experiment,
we trained two monkeys in modified delayed response tasks
using several kinds of food and liquid rewards. The animal faced
a panel that contained two (right and left) rectangular windows,
two circular keys and a hold lever below them. The animal was
trained in three different kinds of tasks: ‘Cued liquid reward’,
‘Visible food reward’ and ‘Cued food reward’ tasks. Figure 7a
shows the sequence of events in the ‘Cued food reward’ task.
The correct side was indicated by the red cue and the outcome
of the correct response after the delay period was the delivery
of the food which had been prepared (unseen by the animal)
behind the window. Several different kinds of food and liquid
rewards were used. For technical reasons, the same reward was
used continuously in a block of ∼50 trials on these three kinds
of tasks. During the delay period, the animal had to retain the
spatial information about which side the cue was presented on
as a working memory. Furthermore, it was thought that the
animal was, although not required to do so, expecting the delivery of a specific reward during the delay period, as had been
indicated by previous behavioral studies (Tinklepaugh, 1928).
Neuronal activity was recorded from the LPFC (mainly areas 46,
8 and lateral 12) (Fig. 2a).
Figure 7. Sequence of events in the delayed response task of ‘Cued food reward’ (a)
and examples of reward expectancy-related LPFC neurons without (b) and with (c)
spatial specificity between right and left trials examined in the ‘Cued food reward’ task.
The reward used is indicated. Other conventions are the same as in Figure 3. In the
‘Cued liquid reward’ task, which is not illustrated here, the animal was given the liquid
reward after the correct response. In the ‘Visible food reward’ task, a food reward was
presented either in the left or right window during the cue period. The animal could
obtain the food reward by responding to the key on the side where the food had been
presented (Watanabe, 1996).
Of 124 LPFC neurons that showed delay-related activity
changes, 42 were intensively examined using several kinds of
rewards. Half of these neurons (n = 21) showed different activity
changes with different rewards. Figure 7b illustrates an example
of a reward-dependent LPFC neuron that showed gradual changes
in activity during the delay period toward the time of the
response. The magnitude of these gradual delay-related changes
was largest in cabbage reward trials, whereas there was almost
no activity change in raisin reward trials.
There were many LPFC delay neurons that showed spatial
specificity. It is of interest whether those spatially differential
delay neurons also showed reward dependency. The neuron in
Figure 7c showed a higher rate of firing on left trials than on right
trials during the delay period irrespective of the nature of the
reward. Besides that, this neuron showed reward-dependent
delay activity, showing the largest activity changes in cabbage
reward trials, intermediate activity changes in potato reward
trials and the least activity changes in raisin reward trials. The
proportion of neurons showing reward-dependent activity was
about the same in delay neurons with spatial specificity and
those without spatial specificity.
RPR was also calculated for LPFC neurons. There was no
significant difference in RPR values among three different
kinds of delayed response tasks. The range was 0.22–0.96 and
the mean was 0.63 when the data for all tasks were combined.
Furthermore, there was no significant difference in the values or
distributions of RPR between OFC and LPFC neurons.
The majority of neurons (17/21) examined in all three
different kinds of tasks showed different patterns or magnitudes
of delay-related activity changes between food- and liquid-reward tasks and/or between visible and cued task situations.
The majority of reward-dependent delay neurons showed more
activity changes in response to the preferred than to the non-preferred reward, indicating that their activities also reflect the
animal’s preference for each kind of reward.
Although the foods or liquids themselves were not presented
during the cue period in ‘Cued food and liquid reward’ tasks, the
animal learned what reward was being used by experiencing
a newly given reward for two or three trials, since the same
reward was used in a block of ∼50 trials. It is thought that the
animal deduced information about the currently used reward
from its experience in previous trials and expected that specific
reward. Reward-dependent delay activity of LPFC neurons is thus
considered to reflect the expectancy of visual, gustatory and/or
olfactory images of the specific food as well as its motivational
value. The mean RPR value of 0.63 indicates that the discriminability of LPFC neurons among different kinds of rewards was
not very sharp, either. Many LPFC neurons showed differences
in the characteristics of activity changes among different kinds
of tasks, indicating that expectancy of a specific reward may
be attained by the ensemble of activities of reward-specific and
task-dependent differential delay neurons.
In the LPFC, there were many delay neurons that showed both
reward dependency and spatial specificity. These neurons are
considered to be involved in two different kinds of information processing — one retaining spatial information in working
memory and the other related to retrieving and expecting the
specific reward.
Comparison of the Characteristics of Delay-related Activity of
Orbital and Lateral Prefrontal Neurons
OFC and LPFC neurons were found to show differential delay
activity depending on whether there would be reward or no-reward and/or on what kind of reward would be delivered.
These neurons are considered to be involved in the expectancy
of response outcome. Since both OFC and LPFC, especially OFC,
play important roles in motivation and emotion (Stuss and
Benson, 1986), it may be argued that reward-dependent differential delay activity reflects the emotional reaction associated
with the expectancy of the specific outcome rather than the
expectancy itself. If this were the case, the emotional reaction to
the specific outcome (such as the delivery of a specific reward or
no-reward delivery) should be stronger than that to the expectancy of the
outcome, and there should be a positive correlation between the
two kinds of emotional reactions. However, except for the very
few LPFC neurons like the one shown in Figure 7b, there was
no significant relationship in activities of the OFC and LPFC
neurons between the delay period and post-response period.
Thus, differential delay activity observed in the OFC and LPFC is
considered to be more concerned with the expectancy of the
response outcome.
The animal was not explicitly required to retain in memory
the reward information during the delay period in either the
delayed reaction time or delayed response tasks. Thus, the results
of our experiments indicate that the OFC and LPFC are involved
in representing incidental reward information, which is not
indispensable for the correct task performance. What, then, is
the functional significance of such reward-related anticipatory
neuronal activity observed in the OFC and LPFC when such
activity is not indispensable for the correct task performance?
It is considered that the animal behaves to attain goals such as
obtaining food and mating partners or escaping from danger.
Expectancy-related neuronal activity in the OFC and LPFC may
be useful for guiding the animal to pay attention to the most
relevant dimensions in the (task) situation so that the goal would
be attained more effectively. Indeed, focal attention induces
selective representation of only relevant information in LPFC
neurons during a working memory task (Rainer et al., 1998). The
fact that the majority of OFC and LPFC neurons showed greater
activation when expecting the preferred rather than the non-preferred reward may indicate that the animal is guided to pay
more attention to task situations involving the more preferred
reward.
Such neuronal activity may also be useful for processing
the response outcome more efficiently. As long as there is no
discrepancy between the animal’s expectancy and the response
outcome, even if the outcome is the absence of reward, the
outcome is not surprising to the animal and thus would be
processed automatically without receiving much attention. However, when there is a discrepancy between the two, the outcome
is surprising and should receive more attention for further
processing. Indeed, ‘surprise’ is considered to be important
for learning to occur (Rescorla and Wagner, 1972). Without
expectancy-related neurons, the animal may become relatively
indifferent to the outcome, and thus may not efficiently process
the outcome even when required to do so. This may in turn
induce disturbance in the learning of new behavior according to
the change in reinforcement contingency. The deficit in reversal
learning and extinction of learned operant responses which is
observed in OFC-ablated monkeys (Butter, 1969) may be caused
by such disturbance. By facilitating the current goal-directed
behavior and by constituting the basis of learning, expectancy-related neuronal activity, even if it is not directly associated with
correctly performing a task in each trial, is considered to have
survival value to the animal.
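The role of ‘surprise’ in learning can be made concrete with the Rescorla-Wagner rule, in which the change in expectancy on each trial is proportional to the difference between the obtained and the expected outcome. The Python sketch below is illustrative only: the learning rate, the trial counts and the function name are assumptions made here, and it is not a model of the neuronal data reported in this paper.

```python
def rescorla_wagner(outcomes, learning_rate=0.3, v0=0.0):
    """Return (expectancies, prediction_errors) across trials.

    outcomes: sequence of obtained outcomes (1.0 = reward, 0.0 = no reward).
    learning_rate: combined salience and learning-rate term; the value
        0.3 is an arbitrary illustrative choice.
    v0: expectancy before the first trial.
    """
    v = v0
    expectancies, errors = [], []
    for outcome in outcomes:
        error = outcome - v          # 'surprise': obtained minus expected
        expectancies.append(v)
        errors.append(error)
        v += learning_rate * error   # expectancy moves toward the outcome
    return expectancies, errors


# Ten rewarded trials followed by ten unrewarded (extinction) trials.
outcomes = [1.0] * 10 + [0.0] * 10
expectancies, errors = rescorla_wagner(outcomes)
```

In this sketch the prediction error is largest on the first rewarded trial and again, with opposite sign, on the first unrewarded trial, and it shrinks as the outcome becomes expected. An agent with no expectancy signal would have no such error term with which to detect a change in reinforcement contingency.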
Although OFC neurons were more sensitive to the presence
or absence of reward than to the nature of the reward, it
was also shown that OFC neurons were not simply involved in
discriminating the presence or absence of reward. Some OFC
neurons showed clear differential delay activity in relation to
the expectancy of different kinds of rewards. Even those OFC
delay neurons that apparently represented only the presence or
absence of reward were found to be nature-of-reward sensitive as
well because their activities in response to a particular reward
were modified by changes in the animal’s motivational state
(e.g. Fig. 4). It seems that the activity of expectancy-related OFC
neurons is dependent on the attraction of the reward in relation
to how preferable it is to the animal. It appears that the majority
of OFC neurons discriminate simply between two (preferred
and non-preferred) situations and only some neurons discriminate among individual rewards with different degrees of
preference.
A recent study on the monkey indicates that OFC neurons
are not related to coding and retaining spatial and object
information (Tremblay and Schultz, 1999). Even though the
human OFC has been implicated in cognitive operations
such as decision making (Bechara et al., 1998), such cognitive
operations cannot be achieved without support of motivational
operations within the OFC, as the somatic marker hypothesis
indicates (Damasio, 1994). Considering that the activity of OFC
neurons was found to be dependent on the motivational state of
the animal, OFC neurons may be more related to the expectancy
of hedonic aspects of reward such as the degree of pleasure
or aversion associated with delivery or no-delivery of a certain
reward. Interestingly, a recent study by Tremblay and Schultz
indicated that OFC neurons reflect the relative, but not the
absolute, preference of each reward to the animal (Tremblay and
Schultz, 1999).
Although LPFC neurons are well documented to be involved
in cognitive operations such as retaining working memory
(Goldman-Rakic, 1996; Miller et al., 1996), we found LPFC delay
neurons that appear to be involved in both cognitive and motivational operations since activity changes occurred in relation
to both working memory and reward expectancy (Watanabe,
1996). Concerning the characteristics of motivational operations
in the LPFC, delay neurons of this area were reported to be
sensitive to the presence or absence of expected reward in
previous studies on delayed reaction time tasks (Watanabe, 1990,
1992). Activities of LPFC delay neurons were also found
to reflect the animal’s preference for each kind of expected
reward (Watanabe, 1996). The value and distribution of the RPR
were not different between OFC and LPFC neurons. Thus, there
appears to be no significant difference in the characteristics of
reward expectancy-related neuronal activity between the OFC
and LPFC.
LPFC neurons code the correctness of the response
independent of the presence or absence of reward (Watanabe,
1989), or code the discrepancy between the expectancy of a
specific reward and the response outcome (Watanabe, 1996).
The LPFC is proposed to be involved in ‘corollary discharge’ or
‘efference copy’ (sending neural impulses to sensory structures
that somehow prepare those structures for anticipated changes
in sensory input as the result of an impending movement)
(Teuber, 1964). Thus, the anticipatory neuronal activity observed in the LPFC, besides reflecting the preference of the
expected outcome, may be concerned with the cognitive aspects
of reward expectancy such as anticipating visual, tactile or
olfactory images of the response outcome, preparing for the
reception of a certain reward and not any other reward, or
preparing for the no-reward outcome.
LPFC neurons have been proposed to have domain-specific
properties, with ‘what’ and ‘where’ aspects of the stimulus
being retained in working memory in different LPFC areas
(Wilson et al., 1993). However, it has recently been shown that
delay neurons related to ‘what’ and those related to ‘where’ are
not differently distributed within the LPFC and there are many
LPFC neurons which are involved in the integration of ‘what’
and ‘where’ (Rao et al., 1997). We have observed that LPFC
neurons are involved in two different kinds of information
processing: retaining spatial (where) information in working
memory and the expectancy of a specific reward. If we consider
that reward-dependent anticipatory activity is concerned with
the ‘what’ aspects of the stimulus, our results also indicate that
neurons related to ‘what’ and those related to ‘where’ are
intermingled in the LPFC, and some LPFC delay neurons are
involved in the integration of ‘what’ and ‘where’ aspects of the
stimulus.
The OFC plays important roles in motivational operations
through its intimate connections with the amygdala, which plays
important roles in motivation and emotion (Rolls, 1999). However, connections between the LPFC and amygdala are sparse
(Barbas, 1995). Thus, the motivational operation concerning
reward expectancy may be conducted first in the OFC and then
the information may be transmitted to the LPFC, where integration of motivational and cognitive operations would be achieved.
In conclusion, although further studies are needed to examine
what aspects of reward (visual appearance, taste, smell or
degree of preference) are represented in prefrontal neurons, it is
suggested that the OFC is more concerned with the motivational
aspects, while the LPFC is related to both the cognitive and
motivational aspects, of the expectancy of the response outcome
(Watanabe, 1998).
Notes
We express our thanks to the anonymous referees for providing
critical comments that guided improvement of the manuscript, and to
M. Sakagami, S. Shirakawa, M. Odagiri, T. Kojima, K. Tsutui and
H. Takenaka for their assistance during the experiment. This study was
supported by Grant-in-Aid for Scientific Research on Priority Areas from
the Ministry of Education, Science, Sports and Culture of Japan (nos
08279248, 09268242, 10164250, 11145244).
Address correspondence to Masataka Watanabe Ph.D., Department of
Psychology, Tokyo Metropolitan Institute for Neuroscience, Musashidai
2-6, Fuchu, Tokyo 183-0042, Japan. Email: [email protected].
References
Barbas H (1995) Anatomic basis of cognitive-emotional interactions in the
primate prefrontal cortex. Neurosci Biobehav Rev 19:499–510.
Baylis LL, Gaffan D (1991) Amygdalectomy and ventromedial prefrontal
ablation produce similar deficits in food choice and in simple
object discrimination learning for an unseen reward. Exp Brain Res
86:617–622.
Bechara A, Damasio H, Tranel D, Anderson SW (1998) Dissociation of
working memory from decision making within the human prefrontal
cortex. J Neurosci 18:428–437.
Burton MJ, Rolls ET, Mora F (1976) Effects of hunger on the responses of
neurons in the lateral hypothalamus to the sight and taste of food. Exp
Neurol 51:668–677.
Butter CM (1969) Perseveration in extinction and in discrimination
reversal tasks following selective frontal ablations in Macaca mulatta.
Physiol Behav 4:163–171.
Courtney SM, Petit L, Haxby JV, Ungerleider LG (1998) The role of
prefrontal cortex in working memory: examining the contents of
consciousness. Phil Trans R Soc Lond B 353:1819–1828.
Critchley HD, Rolls ET (1996) Hunger and satiety modify the responses
of olfactory and visual neurons in the primate orbitofrontal cortex.
J Neurophysiol 75:1673–1686.
Damasio AR (1994) Descartes’ error. New York: Grosset/Putnam.
D’Esposito M, Aguirre GK, Zarahn E, Ballard D, Shin RK, Lease J (1998)
Functional MRI studies of spatial and nonspatial working memory.
Cogn Brain Res 7:1–13.
Funahashi S, Bruce CJ, Goldman-Rakic PS (1989) Mnemonic coding of
visual space in the monkey’s dorsolateral prefrontal cortex. J Neurophysiol 61:331–349.
Fuster JM (1997) The prefrontal cortex. Anatomy, physiology and
neuropsychology of the frontal lobe, 3rd edn. New York: Lippincott-Raven.
Goldman-Rakic PS (1996) The prefrontal landscape: implications of
functional architecture for understanding human mentation and the
central executive. Phil Trans R Soc Lond B 351:1445–1453.
Hikosaka K (1999) Tolerances of responses to visual patterns in neurons
of the posterior inferotemporal cortex in the macaque against
changing stimulus size and orientation, and deleting patterns. Behav
Brain Res 100:67–76.
Hulse SH, Dorsky NP (1977) Structural complexity as a determinant of
serial pattern learning. Learn Motiv 8:488–506.
McEnaney KW, Butter CM (1969) Perseveration of responding and
nonresponding in monkeys with orbital frontal ablations. J Comp
Physiol Psychol 68:558–561.
Miller EK, Erickson CA, Desimone R (1996) Neural mechanisms of visual
working memory in prefrontal cortex of the macaque. J Neurosci
16:5154–5167.
Niki H (1974) Differential activity of prefrontal units during right and left
delayed response trials. Brain Res 70:346–349.
Niki H, Watanabe M (1979) Prefrontal and cingulate unit activity during
timing behavior in the monkey. Brain Res 171:213–224.
Petrides M (1994) Frontal lobes and working memory: evidence from
investigations of the effects of cortical excisions in nonhuman
primates. In: Handbook of neuropsychology, Vol 9: The frontal lobes
(Boller F, Grafman J, eds), pp 59–82. Amsterdam: Elsevier.
Rainer G, Asaad WF, Miller EK (1998) Selective representation of relevant
information by neurons in the primate prefrontal cortex. Nature
393:577–579.
Rao SC, Rainer G, Miller EK (1997) Integration of what and where in the
primate prefrontal cortex. Science 276:821–824.
Rescorla RA, Wagner AR (1972) A theory of Pavlovian conditioning:
variations in the effectiveness of reinforcement and nonreinforcement. In: Classical conditioning II: Current research and theory
(Black AH, Prokasy WF, eds), pp. 64–99. New York: Appleton-Century-Crofts.
Rolls ET (1999) The brain and emotion. Oxford: Oxford University Press.
Rolls ET, Sienkiewicz ZJ, Yaxley S (1989) Hunger modulates the responses
to gustatory stimuli of single neurons in the caudolateral orbitofrontal
cortex of the macaque monkey. Eur J Neurosci 1:53–60.
Rosenkilde CE, Bauer RH, Fuster JM (1981) Single cell activity in ventral
prefrontal cortex of behaving monkeys. Brain Res 209:375–394.
Schoenbaum G, Chiba AA, Gallagher M (1998) Orbitofrontal cortex and
basolateral amygdala encode expected outcomes during learning.
Nature Neurosci 1:155–159.
Stuss DT, Benson DF (1986) The frontal lobes. New York: Raven Press.
Suzuki H, Azuma M (1976) A glass-insulated ‘Elgiloy’ microelectrode
for recording unit activity in chronic monkey experiments. Electroencephalogr Clin Neurophysiol 41:93–95.
Teuber H-L (1964) The riddle of frontal lobe function in man. In: The
frontal granular cortex and behavior (Warren JM, Akert K, eds), pp.
410–477. New York: McGraw-Hill.
Thorpe SJ, Rolls ET, Maddison S (1983) The orbitofrontal cortex: neuronal
activity in the behaving monkey. Exp Brain Res 49:93–115.
Tinklepaugh OL (1928) An experimental study of representation factors
in monkeys. J Comp Psychol 8:197–236.
Tremblay L, Schultz W (1999) Relative reward preference coded in
primate orbitofrontal cortex. Nature 398:704–708.
Watanabe M (1989) The appropriateness of behavioral responses coded
in post-trial activity of primate prefrontal units. Neurosci Lett 101:
113–117.
Watanabe M (1990) Prefrontal unit activity during associative learning in
the monkey. Exp Brain Res 80:296–309.
Watanabe M (1992) Frontal units of the monkey coding the associative
significance of visual and auditory stimuli. Exp Brain Res 89:233–247.
Watanabe M (1996) Reward expectancy in primate prefrontal neurons.
Nature 382:629–632.
Watanabe M (1998) Cognitive and motivational operations in primate
prefrontal neurons. Rev Neurosci 9:225–241.
Wilson FAW, Ó Scalaidhe SP, Goldman-Rakic PS (1993) Dissociation of
object and spatial processing domains in primate prefrontal cortex.
Science 260:1955–1958.
Cerebral Cortex Mar 2000, V 10 N 3 271