Delay Activity of Orbital and Lateral Prefrontal Neurons of the Monkey Varying with Different Rewards

Kazuo Hikosaka and Masataka Watanabe

Department of Psychology, Tokyo Metropolitan Institute for Neuroscience, Musashidai 2-6, Fuchu, Tokyo 183-0042, Japan

© Oxford University Press 2000

We examined neuronal activity in the orbitofrontal cortex (OFC) in relation to reward expectancy and compared findings with those of the lateral prefrontal cortex (LPFC) in the monkey. Activity of OFC neurons was examined in a delayed reaction time task where every four trials constituted one block within which three kinds of rewards and no reward were delivered in a fixed order. More than half of OFC delay neurons were related to the expectancy of delivery or no-delivery of a reward as the response outcome, while some neurons showed nature-of-reward-specific anticipatory activity changes. These delay-related activities reflected the preference of the animal for each kind of reward and were modulated by the motivational state of the animal. LPFC neurons are reported to show nature-of-reward-specific anticipatory activity changes in a delayed response task when several different kinds of rewards are used. Such reward-dependent activity is observed in LPFC delay neurons both with and without spatially differential delay (working memory-related) activity. Although reward expectancy-related activity is commonly observed in both OFC and LPFC, it is suggested that the OFC is more concerned with motivational aspects, while the LPFC is related to both the cognitive and motivational aspects of the expectancy of response outcome.

The orbitofrontal cortex (OFC) plays important roles in motivation and emotion (Stuss and Benson, 1986; Damasio, 1994; Fuster, 1997; Rolls, 1999). Patients with OFC lesions show changes in personality and emotional reaction (Stuss and Benson, 1986; Rolls, 1999), as well as problems in social behavior (Stuss and Benson, 1986). OFC-ablated monkeys show impairments in reversal learning and extinction of learned responses (Butter, 1969; McEnaney and Butter, 1969). They also show problems in social and motivational behaviors (Fuster, 1997), such as changed food preferences (Baylis and Gaffan, 1991).

Neurophysiological studies have indicated that primate OFC neurons respond to reward and reward-associated stimuli (Thorpe et al., 1983; Tremblay and Schultz, 1999), as well as showing activity changes in relation to several task events such as cue, delay, reinforcement and error (Rosenkilde et al., 1981; Tremblay and Schultz, 1999). Rodent OFC neurons have also been shown to be involved in the expectancy of the appetitive and aversive outcome (Schoenbaum et al., 1998).

The lateral prefrontal cortex (LPFC) has been shown to play important roles in higher cognitive operations such as retaining working memory in both human and non-human primates (Petrides, 1994; Goldman-Rakic, 1996; Fuster, 1997; Courtney et al., 1998; D’Esposito et al., 1998). Neuronal activity in the primate LPFC has been extensively studied in working memory task situations, and working memory-related activity changes have commonly been observed (Niki, 1974; Funahashi et al., 1989; Miller et al., 1996; Rao et al., 1997). Recently, we have shown that delay neurons of the LPFC also participate in the expectancy of response outcome in relation to the reward that is expected to be delivered in given trials (Watanabe, 1996). LPFC neurons have also been shown to respond to reward, reinforcement and error (Niki and Watanabe, 1979; Watanabe, 1989). Because the OFC is more related to motivational operations than the LPFC (Fuster, 1997), and delay-related activity changes of primate OFC neurons have not been examined sufficiently since the pioneering study by Rosenkilde et al. (Rosenkilde et al., 1981), we investigated whether delay neurons of the OFC are also involved in reward expectancy and whether there are differences in the characteristics of reward expectancy-related activity between the OFC and LPFC.

Behavioral experiments on rodents indicate that when different magnitudes of rewards and no reward are delivered in response to the animal’s action in a fixed order, for example in the order of 4, 2, 1 and 0 pellets, the animal comes to expect the delivery of a specific magnitude of reward or no reward as the response outcome in each trial (Hulse and Dorsky, 1977). We examined OFC neuronal activity in relation to the expectancy, not of different magnitudes of but of different ‘kinds’ of reward, by training monkeys in a delayed reaction time task where every four trials constituted one block and three different kinds of rewards and no reward were delivered in a fixed order.

In this paper we first report the results of the experiment where we examined the delay-related activities of OFC neurons. Second, we briefly describe reward expectancy-related LPFC neuronal activities that have already been reported (Watanabe, 1996). Then we compare the characteristics of delay activity of OFC and LPFC neurons in relation to the expectancy of a response outcome to investigate the possible roles of the OFC and LPFC in goal-directed behavior.

Delay Activity of Orbitofrontal Neurons in Relation to Reward Expectancy

Materials and Methods

Experimental Design and Recording

We trained two monkeys (Macaca fuscata) on three kinds of delayed reaction time tasks (Fig. 1). Each monkey was seated on a primate chair facing a panel that contained a rectangular window, a circular key and a hold lever below them. The window contained two screens, one opaque and one transparent with thin vertical lines. Food reward was given in the window while liquid reward was given through one of three tubes attached to the animal’s mouth. In the ‘Cued liquid reward’ task (Fig.
1a), the monkey first depressed the hold lever and a color cue of red or green light was presented for 1 s on the circular key. There was then a delay period of 5 s, after which a white light was presented on the key as a go signal and the animal was required to press the key within 1 s. To the animal’s correct response (key pressing within 1 s after the go signal presentation), a drop (0.3 ml) of liquid reward was given or not given depending on the color cue previously presented. Red cues indicated reward while green cues indicated no reward delivery. In this experiment, each set of four consecutive trials was considered a block, and the color cue was presented in the order of red–red–red–green. Several different kinds of liquid rewards were used, and the outcome of the animal’s correct key pressing responses was the delivery of different kinds of rewards in a fixed order of (1) orange juice, (2) water, (3) grape juice and (4) no reward. The animal had to press the key even on no-reward trials to advance to the next trial.

Cerebral Cortex Mar 2000;10:263–271; 1047–3211/00/$4.00

Figure 1. Sequences of events in three different kinds of delayed reaction time tasks and the order of the delivery of different kinds of liquid or food rewards in each task. The bold letter ‘C’ in the figure indicates the color (red or green) cue.

Figure 2. Areas where neuronal activity was recorded. The vertical line in (a) indicates the level of the coronal section in (b). The hatched area in (b) indicates the orbitofrontal area where the recording was obtained. The recorded area extended widely along the ventral surface of the frontal cortex. The shaded area in (a) indicates the recorded area in the lateral prefrontal cortex in the previous study (Watanabe, 1996). Abbreviations: AS, arcuate sulcus; PS, principal sulcus; MOS, medial orbital sulcus; LOS, lateral orbital sulcus; CS, central sulcus; LF, lateral fissure; STS, superior temporal sulcus.

In the ‘Cued food reward’ task (Fig.
1b), food rewards instead of liquid rewards were used. Within a block of four trials, the color cue was presented also in the order of red–red–red–green. To the animal’s correct response, two screens of the window were raised and the animal could obtain a piece (∼0.3 g) of food reward on red cue trials while an empty tray was presented to the animal on green cue trials. The outcome of the animal’s correct responses within each block was the delivery of different kinds of rewards in the fixed order of (1) sweet potato, (2) raisin, (3) cabbage and (4) no reward. In the ‘Visible food reward’ task (Fig. 1c), instead of the color light as a cue, the presence or absence of a particular food indicated the outcome of the animal’s response. During the cue period, the opaque screen was raised and the animal could see a food reward or empty tray behind the transparent screen. The order of presenting three different kinds of food rewards and empty tray as a cue, and thus the order of the animal’s response outcomes, was the same as that in the ‘Cued food reward’ task. In these tasks, the animal was only required to press the key within 1 s after the go signal presentation to obtain the reward. Although the animal was not explicitly required to memorize the serial position of each kind of reward in a block, nor required to expect the specific reward in each trial, behavioral experiments (Hulse and Dorsky, 1977) have suggested that the animal would do so. Preferences for different kinds of foods by individual animals were examined separately from the experiment, by free choice tests among potato, raisin and cabbage rewards, and also by choice tests between each pair. Preferences for different kinds of liquid rewards were examined by testing the animal’s willingness to perform the task with one kind of reward after refusing to perform the task with another kind of reward. Details of the surgery and recording methods have been described previously (Watanabe, 1990; Hikosaka, 1999). 
Extracellular recordings were made using an Elgiloy electrode (Suzuki and Azuma, 1976), and impulses recorded from isolated single neurons were fed through a window discriminator to a computer. During the recording, the activity of each neuron was first examined in a certain task for ∼40 trials (10 blocks), which constituted one recording epoch. Then the activity was reexamined in this task after one or two recording epochs in another task(s). The data obtained from different epochs in the same task were compiled for analysis. To record neuronal activity, electrodes were inserted vertically in the frontal plane. For precise placement of electrodes, a guide tube was used. The guide tube was placed on the dura of the cortex and electrodes were advanced through the guide tube. The recording sites extended from 29–38 mm anterior to the interaural plane and from medial to lateral portions of the ventral surface of the prefrontal cortex (Fig. 2). During the recording, probe tests were sometimes conducted where the three different kinds of rewards were delivered in a different order from the original fixed order within a block.

Data Analysis

The data were analyzed off-line. Raster displays and frequency histograms were used for graphic representation. Non-parametric tests (U and H tests) were used for statistical analysis. In this paper, we focus on neuronal activity in the OFC during the delay period. Changes in neuronal activity during the delay period were first compared with those during the pre-cue control period, and were then compared with each other among the four reward conditions (three different kinds of rewards and no-reward).
To evaluate characteristics of activity changes in OFC neurons during the delay period in relation to the difference in response outcome (three kinds of rewards and no-reward), two kinds of indices were employed: reward–no reward discrimination ratio (RNRDR) and reward preference ratio (RPR). The RNRDR was calculated by the following formula:

RNRDR = (sum of impulses for no-reward)/(sum of impulses for the best reward)

and is considered to reflect the discriminability by individual neurons between the two outcome situations of the best reward and no-reward. The RPR, which was analyzed only on neurons that showed delay-related activity changes in reward trials, was calculated by the following formula:

RPR = (sum of impulses for the worst reward)/(sum of impulses for the best reward)

and is considered to reflect the discriminability by individual neurons between the best and worst rewards among three kinds of rewards within each task. Here, the ‘best reward’ indicates the one inducing the greatest activation and the ‘worst reward’ indicates the one inducing the least activation in each neuron during the delay period. The Kolmogorov–Smirnov test was used to analyze differences in the distribution of ratios among different kinds of tasks.

Figure 3. Examples of reward expectancy-related OFC neurons. (a) An example of an OFC delay neuron that discriminated between reward and no-reward but not among different kinds of rewards in ‘Cued food reward’. (b) Another example of an OFC delay neuron that showed selectivity to the nature-of-reward, showing activations only in water reward but not any other reward or no-reward trials in ‘Cued liquid reward’. (c) Another example of an OFC delay neuron that showed activations on both water and no-reward trials in ‘Cued liquid reward’. Neuronal activities are shown in raster and histogram displays. In each display, the second and third vertical lines indicate cue onset and offset, and the fourth vertical line indicates the end of the delay period. Each row indicates one trial, and small upward triangles in the raster indicate the time of the key-pressing response. Only data from correct trials are shown. The leftmost scales indicate impulses/s and the time scale at the bottom indicates 1 s. The order of delivery of different kinds of rewards within a block is shown in the upper left of each display. Abbreviations: C, cue period; D, delay period; R, response.

Histology

After the final recording session, each monkey was deeply anesthetized with pentobarbital sodium (45 mg/kg) and perfused transcardially with warm saline followed by 10% formal saline. The brain was removed and blocks of the brain were placed in fixative containing 10% formalin and 30% sucrose until they sank. The brains were frozen and sectioned at a thickness of 50 µm along the coronal plane. Every fifth section was stained for cell bodies with cresyl violet. The electrode tracks were reconstructed from both traces of electrode penetration and electrolytic lesions that were made at selected penetration sites.

Results

Preference tests for different kinds of food rewards revealed that the animal consistently preferred cabbage to potato to raisin. Preference tests for different kinds of liquids revealed that the animal preferred orange juice and grape juice far more than water in that the animal was willing to perform the task for an orange juice or grape juice reward after refusing to perform it for water reward, while the reverse never occurred. There was no consistent difference in the animal’s preference between orange and grape juice rewards.

There were 207 (130 and 77) penetrations in two monkeys. Of 501 OFC neurons isolated in two monkeys, 235 (47%) were task-related. Of these 235 neurons, 88 (18%) could be examined in at least two different kinds of tasks.
We focus here on 50 neurons that showed delay-related differential activity depending on differences in the response outcome (three kinds of rewards and no-reward) on at least two of three different kinds of tasks. Half (n = 25) of the neurons showed activations during the delay period for all kinds of reward trials without nature-of-reward specificity, but not in no-reward trials. Six neurons showed activations during the delay period only in no-reward trials without showing activation in any reward trials. Five neurons showed nature-of-reward specificity showing anticipatory activity changes during the delay period only before obtaining a specific reward. Two neurons showed activity changes during the delay period in the trials of both no-reward and least preferred reward (water). In the remaining neurons (n = 12), the characteristics of delay-related activity changes in relation to the reward were not consistent among different tasks, for example some showed delay-related activation in certain reward trials in one task while discriminating reward and no-reward trials in another task.

Some examples of OFC delay neurons showing reward-related activity changes are presented in Figure 3. The example in Figure 3a discriminated between reward and no-reward but not among different kinds of rewards. This neuron, which was examined in the ‘Cued food reward’ task, showed delay-related activations for all reward trials but not in no-reward trials. Since the same characteristic activity changes were observed in the ‘Visible food reward’ task (not shown), where actual food or an empty tray was presented as the cue, the differential activity observed in this neuron is not considered to be related to the difference in the color (red versus green) of the cue indicating the presence or absence of reward. An example of a neuron that demonstrated selectivity to the nature-of-reward is shown in Figure 3b.
This neuron, which was examined in the ‘Cued liquid reward’ task, showed significant delay-related activity changes only in water reward trials but not in any other reward and no-reward trials. Another OFC neuron is shown in Figure 3c. This neuron showed activations in both water and no-reward trials during the delay period. Similar activations in no-reward trials but no activation in any reward trials were observed in the ‘Visible and cued food reward’ tasks (not shown). It appears that this neuron responded similarly to no-reward and to the least preferred reward. In this neuron, no-reward and least preferred reward-related activation started before the cue presentation.

In food reward tasks, the animal sometimes refused to ingest the least preferred food reward (raisin) despite the fact the animal continued to perform the task as before. An example of neuronal activity examined during such periods is shown in Figure 4. This neuron, which is the same as the one shown in Figure 3a, was examined in the ‘Visible food reward’ task before, during and after the animal refused to ingest the raisin. After performing several hundred trials of ‘Visible and cued food reward’ tasks, and thus after ingesting a substantial amount of food, the animal became reluctant to ingest raisin and the magnitude of activation during the delay period in raisin reward trials decreased (Fig. 4a, after eight trials). When the animal finally refused to ingest raisin, this neuron did not show any activation during the delay period in raisin reward trials (Fig. 4b), although the animal continued to perform the task in order to advance to the next trial where a more preferred reward (cabbage) could be obtained. However, this neuron continued to show (slightly reduced) activations during the delay period in potato and cabbage reward trials (Fig. 4d), while the animal refused to ingest raisin.
After another hundred trials in the ‘Cued liquid reward’ task, the animal again began to ingest raisin during the ‘Visible food reward’ task. This neuron now showed activations again during the delay period in raisin reward trials (Fig. 4c). The mean firing rate of this neuron during the delay period in raisin reward trials was 6.4, 0.2 and 5.8 spikes/s, (1) before the animal refused to ingest raisin, (2) while the animal was refusing it and (3) when the animal began to ingest it again, respectively. There was a significant difference between the refusal period and each ingestion period (P < 0.01, Student’s t-test). The reaction time (RT) of the animal for each period (before, during and after raisin refusal) in raisin reward trials was 524, 672 and 530 ms, respectively (Fig. 4e), with significant differences in RT between the refusal period and each ingestion period (P < 0.01, Student’s t-test).

Figure 4. Activity of an OFC neuron and reaction time of the animal before, during and after the refusal of a specific reward (raisin) in ‘Visible food reward’. Delay-related activations in raisin reward trials observed before the refusal of raisin (a) disappeared while the animal was refusing to ingest the raisin (b). However, when the animal again began to ingest the raisin, delay-related activation reappeared (c). (d) Mean firing rate during the delay period in potato, raisin and cabbage reward trials respectively while the animal was refusing to ingest raisin. There were activations on potato and cabbage, but not on raisin reward trials, and the magnitude of activity changes was significantly different between raisin and other reward (potato or cabbage) trials. (e) Reaction time of the animal before, during and after refusal of the raisin reward. There were significant differences in reaction time between the refusal period and ingestion period. The error bar indicates standard error (SE). Other conventions are the same as in Figure 3.

To qualitatively examine how and to what extent the difference in response outcomes is reflected in delay activity of OFC neurons, we analyzed two kinds of indices that measured the discriminability of individual neurons between the best reward and no reward (RNRDR) and between the best and worst rewards (RPR) (see Materials and Methods). Since there was no significant difference in the distribution of RNRDR values observed in OFC neurons among the three different kinds of tasks, we present the mean RNRDR value obtained from all three tasks in Figure 5. There were two clusters in the distribution: one cluster of neurons (shaded part in the figure) showed more delay-related activations in reward than in no-reward trials and demonstrated smaller values, ranging from 0.01 to 0.69, while the other cluster of neurons showed more delay-related activations in no-reward than in reward trials and showed larger values, ranging from 1.02 to 4.58. RPR values were calculated only on those neurons that showed delay-related activity changes in reward trials. Distributions of RPR values of OFC neurons for three different kinds of tasks are shown in Figure 6. The range and mean of values were 0.12–0.96 (mean = 0.72) for ‘Cued liquid reward’ (Fig. 6a), 0.01–0.96 (mean = 0.70) for ‘Cued food reward’ (Fig. 6b) and 0.07–0.92 (mean = 0.70) for ‘Visible food reward’ (Fig. 6c) tasks, respectively. There was no significant difference in the distribution among the three different kinds of tasks. As shown in this figure, RPR values of OFC neurons with nature-of-reward specificity (shaded ones) were smaller than those of neurons that had no such specificity. There was no significant correlation between the RNRDR and RPR values in OFC neurons, indicating that there was no tendency for neurons with lower RNRDR values (neurons showing stronger activation on reward than on no-reward trials) to show lower RPR values by better discriminating between the best and worst rewards.

Figure 5. Distribution of RNRDR values of OFC neurons. The shaded area represents neurons that showed larger activations on reward than on no-reward trials.

When the serial position of each of three kinds of rewards within a block was changed in probe tests, neurons with nature-of-reward sensitivity showed activity changes according to the relational order among sequential components, according to which reward had been delivered in the previous trial. For example, neurons such as the one in Figure 3b always showed delay-related activation in trials after the orange juice reward trial as it did in the original fixed order situation, even when the order of reward delivery was changed from the original one to the order of, for example, (1) orange juice, (2) grape juice, (3) water and (4) no reward.

The RT of the animal was significantly longer on no-reward than on any reward trial. Although there was almost no difference in RT among different kinds of reward trials when the animal was well motivated at the beginning of the daily recording session, there were sometimes differences during the middle of the recording. Although there was no significant correlation between RT and the magnitude of delay-related activity changes in most OFC neurons within reward trials or within no-reward trials, significant correlations were sometimes observed in a few OFC neurons on specific occasions, as shown in Figure 4.

Histological examination revealed that reward-related delay neurons were found in all areas explored in the OFC, with a slight but not significant tendency for more such neurons to be located in the lateral portions (medial area 12) (Fig. 2).
Discussion

In the delayed reaction time task, we found OFC neurons that showed differential activity during the delay period depending on the presence or absence of reward, or depending on the nature of reward that would be delivered in given trials. Similar discriminative responses between reward and no-reward trials were observed in most OFC delay neurons for both ‘Visible’ and ‘Cued’ food reward tasks. Thus, the differential delay activity observed is considered not to be associated with the difference in the color of the cue which indicated the presence or absence of future reward, nor with the difference in the appearance of reward.

Figure 6. Distribution of RPR values of OFC neurons in three different kinds of tasks. Shaded areas represent neurons which showed nature-of-reward specificity. The discriminability of OFC neurons between the best and worst rewards did not differ among three different kinds of tasks. Values with arrow heads indicate the mean.

There was clear clustering in the RNRDR values, indicating that there were two types of OFC neurons (Fig. 5): one type of neuron, with a value of >1, was more activated in the absence of reward, while the other type, which constituted the majority and had a value of <0.7, was activated in the presence of reward. The existence of two clusters indicates that most OFC neurons showed clear activity changes either on reward or on no-reward trials, but not on both, suggesting that OFC delay neurons are more concerned with the presence or absence of reward. Considering that the RPR value can range from 0 to 1.0, with a lower value reflecting better discrimination between the best and worst rewards, the mean values obtained (0.70–0.72) may indicate that discriminations among different kinds of rewards are not very sharp in OFC neurons (Fig. 6).
These results may also indicate that OFC neurons are more sensitive to the difference between reward and no-reward than to the nature-of-reward, at least in the task situation where there are trials in which no-reward can be expected. Indeed, the majority of OFC neurons (n = 31, 62%) discriminated between reward and no-reward but not among different kinds of rewards. These neurons are considered to be involved in the expectancy of delivery or no-delivery of reward as a response outcome, which information may be more interesting to the animal than the information concerning the nature-of-reward.

Reward expectancy-related activity was found to be modified by the motivational state of the animal. When the animal refused to ingest a specific food (raisin), which was the least preferred, there were no delay-related activations that had previously been observed in association with the animal’s ingesting the reward (Fig. 4). Neurons of the OFC and lateral hypothalamus have been reported to stop responding to the sight and taste of the food or liquid after an animal is fed until satiety (Burton et al., 1976; Rolls et al., 1989; Critchley and Rolls, 1996). This process is nature-of-reward specific and these neurons continue to respond to other kinds of rewards with which the animal is not satiated (Burton et al., 1976; Rolls et al., 1989; Critchley and Rolls, 1996). The present results indicate that the delay activity of reward expectancy-related OFC neurons is also nature-of-reward specific, because motivation-dependent modification of neuronal activity was observed only in a certain reward (raisin) but not in other reward trials (Fig. 4). This fact also indicates that delay activity of OFC neurons is related not to the appearance of reward such as the shape or color, which does not vary, but to the degree of preference of the animal for each reward, which does vary during the task performance.
It seems that the raisin reward, which had previously been estimated to be, to some extent, preferred by the animal, became non-preferred after having been ingested in a sufficient amount, and this process was reflected in the activity of OFC neurons. In other words, activity of OFC neurons seems to reflect the degree of the animal’s preference for a certain reward determined by the animal’s motivational state. Similarly, after obtaining a good amount of liquid, water may become as non-preferable as no-reward. Thus, an OFC neuron such as that shown in Figure 3c, which showed activations on both water and no-reward trials, may be related to discriminating between two kinds (preferred and non-preferred) of outcomes.

In the present experiment, the method of reward delivery (fixed order of delivery of three different kinds of rewards and no reward within a block of four trials) allowed us to examine the order-related expectancy process in the OFC. Although the presence or absence of reward was indicated by the color cue, the red cue itself did not explicitly inform the animal of what specific reward would be delivered in a given trial in ‘Cued food and liquid reward’ tasks. The animal could only deduce what reward would be delivered in a given trial from the fixed order of delivery of different kinds of rewards within a block. Thus, delay neurons showing nature-of-reward-specific activity changes (e.g. Fig. 3b) are considered to be involved in this deduction process as well as being involved in the expectancy of the specific reward. The animal could deduce the kind of reward in each trial either (1) from the relational order among sequence components (raisin always comes after potato and cabbage always comes after raisin) or (2) from the numerical order within a block in relation to whether the current trial was the first, second or third.
By changing the serial position of each of three kinds of rewards within a block, it was possible to examine which strategy the animal was employing. It was found that the animal was using the former strategy, since nature-of-reward-specific anticipatory activity was found to be determined by the reward that had been delivered in the previous trial, but not by the ordinal position of a certain reward within a block.

Reward Expectancy-related Neuronal Activity in the LPFC

To compare characteristics of delay activity of OFC and LPFC neurons in relation to the response outcome, we introduce here reward expectancy-related neuronal activities of the LPFC which were reported previously (Watanabe, 1996). In the experiment, we trained two monkeys in modified delayed response tasks using several kinds of food and liquid rewards. The animal faced a panel that contained two (right and left) rectangular windows, two circular keys and a hold lever below them. The animal was trained in three different kinds of tasks: ‘Cued liquid reward’, ‘Visible food reward’ and ‘Cued food reward’ tasks. Figure 7a shows the sequence of events in the ‘Cued food reward’ task. The correct side was indicated by the red cue and the outcome of the correct response after the delay period was the delivery of the food which had been prepared (unseen by the animal) behind the window. Several different kinds of food and liquid rewards were used. For technical reasons, the same reward was used continuously in a block of ∼50 trials on these three kinds of tasks. During the delay period, the animal had to retain the spatial information about which side the cue was presented on as a working memory. Furthermore, it was thought that the animal was, although not required to do so, expecting the delivery of a specific reward during the delay period, as had been indicated by previous behavioral studies (Tinklepaugh, 1928).

Figure 7. Sequence of events in the delayed response task of ‘Cued food reward’ (a) and examples of reward expectancy-related LPFC neurons without (b) and with (c) spatial specificity between right and left trials examined in the ‘Cued food reward’ task. The reward used is indicated. Other conventions are the same as in Figure 3. In the ‘Cued liquid reward’ task, which is not illustrated here, the animal was given the liquid reward after the correct response. In the ‘Visible food reward’ task, a food reward was presented either in the left or right window during the cue period. The animal could obtain the food reward by responding to the key on the side where the food had been presented (Watanabe, 1996).

Neuronal activity was recorded from the LPFC (mainly areas 46, 8 and lateral 12) (Fig. 2a). Of 124 LPFC neurons that showed delay-related activity changes, 42 were intensively examined using several kinds of rewards. Half of these neurons (n = 21) showed different activity changes with different rewards. Figure 7b illustrates an example of a reward-dependent LPFC neuron that showed gradual changes in activity during the delay period toward the time of the response. The magnitude of these gradual delay-related changes was largest in cabbage reward trials, whereas there was almost no activity change in raisin reward trials.

There were many LPFC delay neurons that showed spatial specificity. It is of interest whether those spatially differential delay neurons also showed reward dependency. The neuron in Figure 7c showed a higher rate of firing on left trials than on right trials during the delay period irrespective of the nature of the reward. Besides that, this neuron showed reward-dependent delay activity, showing the largest activity changes in cabbage reward trials, intermediate activity changes in potato reward trials and the least activity changes in raisin reward trials.
The proportion of neurons showing reward-dependent activity was about the same in delay neurons with spatial specificity and those without spatial specificity.

RPR was also calculated for LPFC neurons. There was no significant difference in RPR values among the three different kinds of delayed response tasks. The range was 0.22–0.96 and the mean was 0.63 when the data for all tasks were combined. Furthermore, there was no significant difference in the values or distributions of RPR between OFC and LPFC neurons. The majority of neurons (17/21) examined in all three different kinds of tasks showed different patterns or magnitudes of delay-related activity changes between food- and liquid-reward tasks and/or between visible and cued task situations. The majority of reward-dependent delay neurons showed more activity changes in response to the preferred than to the non-preferred reward, indicating that their activities also reflect the animal’s preference for each kind of reward.

Although the foods or liquids themselves were not presented during the cue period in ‘Cued food and liquid reward’ tasks, the animal learned what reward was being used by experiencing a newly given reward for two or three trials, since the same reward was used in a block of ∼50 trials. It is thought that the animal deduced information about the currently used reward from its experience in previous trials and expected that specific reward. Reward-dependent delay activity of LPFC neurons is thus considered to reflect the expectancy of visual, gustatory and/or olfactory images of the specific food as well as its motivational value. The mean RPR value of 0.63 indicates that the discriminability of LPFC neurons among different kinds of rewards was not very sharp, either.
Many LPFC neurons showed differences in the characteristics of activity changes among different kinds of tasks, indicating that expectancy of a specific reward may be attained by the ensemble of activities of reward-specific and task-dependent differential delay neurons. In the LPFC, there were many delay neurons that showed both reward dependency and spatial specificity. These neurons are considered to be involved in two different kinds of information processing: one retaining spatial information in working memory and the other related to retrieving and expecting the specific reward.

Comparison of the Characteristics of Delay-related Activity of Orbital and Lateral Prefrontal Neurons

OFC and LPFC neurons were found to show differential delay activity depending on whether there would be reward or no-reward and/or on what kind of reward would be delivered. These neurons are considered to be involved in the expectancy of response outcome. Since both OFC and LPFC, especially OFC, play important roles in motivation and emotion (Stuss and Benson, 1986), it may be argued that reward-dependent differential delay activity reflects the emotional reaction associated with the expectancy of the specific outcome rather than the expectancy itself. It is considered that the emotional reaction to the specific outcome (such as the delivery of a specific reward or no-reward delivery) is stronger than that to the expectancy of the outcome, and there must be a positive correlation between the two kinds of emotional reactions. However, except for the very few LPFC neurons like the one shown in Figure 7b, there was no significant relationship in activities of the OFC and LPFC neurons between the delay period and post-response period. Thus, differential delay activity observed in the OFC and LPFC is considered to be more concerned with the expectancy of the response outcome.
The animal was not explicitly required to retain in memory the reward information during the delay period in either the delayed reaction time or delayed response tasks. Thus, the results of our experiments indicate that the OFC and LPFC are involved in representing incidental reward information, which is not indispensable for the correct task performance.

What, then, is the functional significance of such reward-related anticipatory neuronal activity observed in the OFC and LPFC when such activity is not indispensable for the correct task performance? It is considered that the animal behaves to attain goals such as obtaining food and mating partners or escaping from danger. Expectancy-related neuronal activity in the OFC and LPFC may be useful for guiding the animal to pay attention to the most relevant dimensions in the (task) situation so that the goal would be attained more effectively. Indeed, focal attention induces selective representation of only relevant information in LPFC neurons during a working memory task (Rainer et al., 1998). The fact that the majority of OFC and LPFC neurons showed greater activation when expecting the preferred rather than the non-preferred reward may indicate that the animal is guided to pay more attention to task situations involving the more preferred reward.

Such neuronal activity may also be useful for processing the response outcome more efficiently. As long as there is no discrepancy between the animal’s expectancy and the response outcome, even if the outcome is the absence of reward, the outcome is not surprising to the animal and thus would be processed automatically without receiving much attention. However, when there is a discrepancy between the two, the outcome is surprising and should receive more attention for further processing. Indeed, ‘surprise’ is considered to be important for learning to occur (Rescorla and Wagner, 1972).
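The role of ‘surprise’ in learning can be made concrete with a minimal Python sketch of a single-cue, Rescorla–Wagner-style update, in which the change in expectancy is proportional to the discrepancy between outcome and expectancy. This is an illustration only, not an analysis from the present study: the function name, the single combined learning-rate parameter and its value are assumptions chosen for the example.

```python
def rescorla_wagner_update(expectancy, outcome, learning_rate=0.1):
    """Return (updated expectancy, surprise) for one trial.

    surprise is the discrepancy between the outcome and the current
    expectancy; the expectancy moves toward the outcome by a fraction
    (learning_rate, an assumed value) of that discrepancy.
    """
    surprise = outcome - expectancy
    return expectancy + learning_rate * surprise, surprise


def simulate(outcomes, expectancy=0.0, learning_rate=0.1):
    """Apply the update across a sequence of outcomes and return the surprises."""
    surprises = []
    for outcome in outcomes:
        expectancy, surprise = rescorla_wagner_update(
            expectancy, outcome, learning_rate
        )
        surprises.append(surprise)
    return expectancy, surprises


if __name__ == "__main__":
    # Ten rewarded trials followed by one unexpected omission (hypothetical data).
    final_expectancy, surprises = simulate([1.0] * 10 + [0.0])
    print(final_expectancy, surprises)
```

In this sketch an outcome that matches the expectancy produces zero surprise and leaves the expectancy unchanged, even when that outcome is the absence of reward, whereas a mismatch produces a non-zero surprise that drives the update.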
Without expectancy-related neurons, the animal may become relatively indifferent to the outcome, and thus may not efficiently process the outcome even when required to do so. This may in turn induce disturbance in the learning of new behavior according to the change in reinforcement contingency. The deficit in reversal learning and extinction of learned operant responses which is observed in OFC-ablated monkeys (Butter, 1969) may be caused by such disturbance. By facilitating the current goal-directed behavior and by constituting the basis of learning, expectancy-related neuronal activity, even if it is not directly associated with correctly performing a task in each trial, is considered to have survival value to the animal.

Although OFC neurons were more sensitive to the presence or absence of reward than to the nature of the reward, it was also shown that OFC neurons were not simply involved in discriminating the presence or absence of reward. Some OFC neurons showed clear differential delay activity in relation to the expectancy of different kinds of rewards. Even those OFC delay neurons that apparently represented only the presence or absence of reward were found to be nature-of-reward sensitive as well because their activities in response to a particular reward were modified by changes in the animal’s motivational state (e.g. Fig. 4). It seems that the activity of expectancy-related OFC neurons is dependent on the attraction of the reward in relation to how preferable it is to the animal. It appears that the majority of OFC neurons discriminate simply between two (preferred and non-preferred) situations and only some neurons discriminate among individual rewards with different degrees of preference.

A recent study on the monkey indicates that OFC neurons are not related to coding and retaining spatial and object information (Tremblay and Schultz, 1999).
Even though the human OFC is indicated to be involved in cognitive operations such as decision making (Bechara et al., 1998), such cognitive operations cannot be achieved without support of motivational operations within the OFC, as the somatic marker hypothesis indicates (Damasio, 1994). Considering that the activity of OFC neurons was found to be dependent on the motivational state of the animal, OFC neurons may be more related to the expectancy of hedonic aspects of reward such as the degree of pleasure or aversion associated with delivery or no-delivery of a certain reward. Interestingly, a recent study by Tremblay and Schultz indicated that OFC neurons reflect the relative, but not the absolute, preference of each reward to the animal (Tremblay and Schultz, 1999).

Although LPFC neurons are well documented to be involved in cognitive operations such as retaining working memory (Goldman-Rakic, 1996; Miller et al., 1996), we found LPFC delay neurons that appear to be involved in both cognitive and motivational operations since activity changes occurred in relation to both working memory and reward expectancy (Watanabe, 1996). Concerning the characteristics of motivational operations in the LPFC, delay neurons of this area were reported to be sensitive to the presence or absence of expected reward in previous studies on delayed reaction time tasks (Watanabe, 1990, 1992). Activities of LPFC delay neurons were also found to reflect the animal’s preference for each kind of expected reward (Watanabe, 1996). The value and distribution of the RPR were not different between OFC and LPFC neurons. Thus, there appears to be no significant difference in the characteristics of reward expectancy-related neuronal activity between the OFC and LPFC.

LPFC neurons code the correctness of the response independent of the presence or absence of reward (Watanabe, 1989), or code the discrepancy between the expectancy of a specific reward and the response outcome (Watanabe, 1996).
The LPFC is proposed to be involved in ‘corollary discharge’ or ‘efference copy’ (sending neural impulses to sensory structures that somehow prepare those structures for anticipated changes in sensory input as the result of an impending movement) (Teuber, 1964). Thus, the anticipatory neuronal activity observed in the LPFC, besides reflecting the preference of the expected outcome, may be concerned with the cognitive aspects of reward expectancy such as anticipating visual, tactile or olfactory images of the response outcome, preparing for the reception of a certain reward and not any other reward, or preparing for the no-reward outcome.

LPFC neurons have been proposed to have domain-specific properties, with ‘what’ and ‘where’ aspects of the stimulus being retained in working memory in different LPFC areas (Wilson et al., 1993). However, it has recently been shown that delay neurons related to ‘what’ and those related to ‘where’ are not differently distributed within the LPFC and there are many LPFC neurons which are involved in the integration of ‘what’ and ‘where’ (Rao et al., 1997). We have observed that LPFC neurons are involved in two different kinds of information processing: retaining spatial (where) information in working memory and the expectancy of a specific reward. If we consider that reward-dependent anticipatory activity is concerned with the ‘what’ aspects of the stimulus, our results also indicate that neurons related to ‘what’ and those related to ‘where’ are intermingled in the LPFC, and some LPFC delay neurons are involved in the integration of ‘what’ and ‘where’ aspects of the stimulus.

The OFC plays important roles in motivational operations through its intimate connections with the amygdala, which plays important roles in motivation and emotion (Rolls, 1999). However, connections between the LPFC and amygdala are sparse (Barbas, 1995).
Thus, the motivational operation concerning reward expectancy may be conducted first in the OFC and then the information may be transmitted to the LPFC, where integration of motivational and cognitive operations would be achieved.

In conclusion, although further studies are needed to examine what aspects of reward (visual appearance, taste, smell or degree of preference) are represented in prefrontal neurons, it is suggested that the OFC is more concerned with the motivational aspects, while the LPFC is related to both the cognitive and motivational aspects, of the expectancy of the response outcome (Watanabe, 1998).

Notes

We express our thanks to the anonymous referees for providing critical comments that guided improvement of the manuscript, and to M. Sakagami, S. Shirakawa, M. Odagiri, T. Kojima, K. Tsutui and H. Takenaka for their assistance during the experiment. This study was supported by Grant-in-Aid for Scientific Research on Priority Areas from the Ministry of Education, Science, Sports and Culture of Japan (nos 08279248, 09268242, 10164250, 11145244).

Address correspondence to Masataka Watanabe Ph.D., Department of Psychology, Tokyo Metropolitan Institute for Neuroscience, Musashidai 2-6, Fuchu, Tokyo 183-0042, Japan. Email: [email protected].

References

Barbas H (1995) Anatomic basis of cognitive-emotional interactions in the primate prefrontal cortex. Neurosci Biobehav Rev 19:499–510.
Baylis LL, Gaffan D (1991) Amygdalectomy and ventromedial prefrontal ablation produce similar deficits in food choice and in simple object discrimination learning for an unseen reward. Exp Brain Res 86:617–622.
Bechara A, Damasio H, Tranel D, Anderson SW (1998) Dissociation of working memory from decision making within the human prefrontal cortex. J Neurosci 18:428–437.
Burton MJ, Rolls ET, Mora F (1976) Effects of hunger on the responses of neurons in the lateral hypothalamus to the sight and taste of food. Exp Neurol 51:668–677.
Butter CM (1969) Perseveration in extinction and in discrimination reversal tasks following selective frontal ablations in Macaca mulatta. Physiol Behav 4:163–171.
Courtney SM, Petit L, Haxby JV, Ungerleider LG (1998) The role of prefrontal cortex in working memory: examining the contents of consciousness. Phil Trans R Soc Lond B 353:1819–1828.
Critchley HD, Rolls ET (1996) Hunger and satiety modify the responses of olfactory and visual neurons in the primate orbitofrontal cortex. J Neurophysiol 75:1673–1686.
Damasio AR (1994) Descartes’ error. New York: Grosset/Putnam.
D’Esposito M, Aguirre GK, Zarahn E, Ballard D, Shin RK, Lease J (1998) Functional MRI studies of spatial and nonspatial working memory. Cogn Brain Res 7:1–13.
Funahashi S, Bruce CJ, Goldman-Rakic PS (1989) Mnemonic coding of visual space in the monkey’s dorsolateral prefrontal cortex. J Neurophysiol 61:331–349.
Fuster JM (1997) The prefrontal cortex. Anatomy, physiology and neuropsychology of the frontal lobe, 3rd edn. New York: Lippincott-Raven.
Goldman-Rakic PS (1996) The prefrontal landscape: implications of functional architecture for understanding human mentation and the central executive. Phil Trans R Soc Lond B 351:1445–1453.
Hikosaka K (1999) Tolerances of responses to visual patterns in neurons of the posterior inferotemporal cortex in the macaque against changing stimulus size and orientation, and deleting patterns. Behav Brain Res 100:67–76.
Hulse SH, Dorsky NP (1977) Structural complexity as a determinant of serial pattern learning. Learn Motiv 8:488–506.
McEnaney KW, Butter CM (1969) Perseveration of responding and nonresponding in monkeys with orbital frontal ablations. J Comp Physiol Psychol 68:558–561.
Miller EK, Erickson CA, Desimone R (1996) Neural mechanisms of visual working memory in prefrontal cortex of the macaque. J Neurosci 16:5154–5167.
Niki H (1974) Differential activity of prefrontal units during right and left delayed response trials. Brain Res 70:346–349.
Niki H, Watanabe M (1979) Prefrontal and cingulate unit activity during timing behavior in the monkey. Brain Res 171:213–224.
Petrides M (1994) Frontal lobes and working memory: evidence from investigations of the effects of cortical excisions in nonhuman primates. In: Handbook of neuropsychology, Vol 9: The frontal lobes (Boller F, Grafman J, eds), pp. 59–82. Amsterdam: Elsevier.
Rainer G, Asaad WF, Miller EK (1998) Selective representation of relevant information by neurons in the primate prefrontal cortex. Nature 393:577–579.
Rao SC, Rainer G, Miller EK (1997) Integration of what and where in the primate prefrontal cortex. Science 276:821–824.
Rescorla RA, Wagner AR (1972) A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Classical conditioning II: Current research and theory (Black AH, Prokasy WF, eds), pp. 64–99. New York: Appleton-Century-Crofts.
Rolls ET (1999) The brain and emotion. Oxford: Oxford University Press.
Rolls ET, Sienkiewicz ZJ, Yaxley S (1989) Hunger modulates the responses to gustatory stimuli of single neurons in the caudolateral orbitofrontal cortex of the macaque monkey. Eur J Neurosci 1:53–60.
Rosenkilde CE, Bauer RH, Fuster JM (1981) Single cell activity in ventral prefrontal cortex of behaving monkeys. Brain Res 209:375–394.
Schoenbaum G, Chiba AA, Gallagher M (1998) Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nature Neurosci 1:155–159.
Stuss DT, Benson DF (1986) The frontal lobes. New York: Raven Press.
Suzuki H, Azuma M (1976) A glass-insulated ‘Elgiloy’ microelectrode for recording unit activity in chronic monkey experiments. Electroencephalogr Clin Neurophysiol 41:93–95.
Teuber H-L (1964) The riddle of frontal lobe function in man. In: The frontal granular cortex and behavior (Warren JM, Akert K, eds), pp. 410–477. New York: McGraw-Hill.
Thorpe SJ, Rolls ET, Maddison S (1983) The orbitofrontal cortex: neuronal activity in the behaving monkey. Exp Brain Res 49:93–115.
Tinklepaugh OL (1928) An experimental study of representation factors in monkeys. J Comp Psychol 8:197–236.
Tremblay L, Schultz W (1999) Relative reward preference coded in primate orbitofrontal cortex. Nature 398:704–708.
Watanabe M (1989) The appropriateness of behavioral responses coded in post-trial activity of primate prefrontal units. Neurosci Lett 101:113–117.
Watanabe M (1990) Prefrontal unit activity during associative learning in the monkey. Exp Brain Res 80:296–309.
Watanabe M (1992) Frontal units of the monkey coding the associative significance of visual and auditory stimuli. Exp Brain Res 89:233–247.
Watanabe M (1996) Reward expectancy in primate prefrontal neurons. Nature 382:629–632.
Watanabe M (1998) Cognitive and motivational operations in primate prefrontal neurons. Rev Neurosci 9:225–241.
Wilson FAW, Ó Scalaidhe SP, Goldman-Rakic PS (1993) Dissociation of object and spatial processing domains in primate prefrontal cortex. Science 260:1955–1958.

Cerebral Cortex Mar 2000, V 10 N 3 271
