CHAPTER 5: ASSOCIATIVE PROCESSES

Introduction ◊ Classical Conditioning ◊ Operant Conditioning ◊ Experimental Paradigms ◊ Evolved Dispositions ◊ Mechanisms ◊ Extinction ◊ Theories ◊ Neuroscience ◊ Chapter Summary

1. INTRODUCTION

Imagine what would happen in each of the following scenarios:

1) One hot summer afternoon, you eat mounds of juicy watermelon that tastes delicious. Soon afterwards, you enjoy a fine dinner of scallops and wine, but either the scallops, the wine or both make you ill, so you spend most of the night vomiting. While you are recovering the next day in the hot sun, a friend brings you a big bowl of watermelon pieces to quench your thirst. How would you feel?

2) One evening as you are driving home from work, a small animal darts in front of your car just as you are rounding a corner and approaching a small bridge. To avoid the animal, you swerve to one side, knocking a guard rail from the bridge and nearly descending into the riverbed. You pull over to calm down and ensure that nothing is damaged, then continue on your way. The next evening you are making the same trip and suddenly realize that you are at the same location. How would you feel?

3) Every Friday afternoon, you and your friends go for a long, hard bike ride. After the ride, you always meet at a local restaurant for a bar-b-que meal that is well-deserved after so much exercise. One evening, the restaurant is particularly busy so your meal is delayed. While sitting in the restaurant, you can smell the bar-b-que and even see people at the next table enjoying their meal! How would you feel?

Most people can relate to these examples because they have some experience with each scenario, even if the specific details differ. In general, people report feeling sick, anxious and exceptionally hungry in these examples.
From what we can surmise, other animals experience the same sensations under similar circumstances; like humans, they have formed associations between particular stimuli and particular events in their environment.

1.1 Background

The idea that associative processes have a role in learning dates back at least to Aristotle (384-322 BCE), who formulated a set of principles describing how ideas or thoughts are connected. These principles were developed further between the 17th and 19th centuries by the British Associationists, a group of philosophers expounding the view that all knowledge is acquired through the senses and that these experiences are held together by 'associations'. Uncovering the properties of these associations would explain how knowledge is acquired. The German psychologist Hermann Ebbinghaus (1850-1909) formally tested the laws put forth by the British Associationists, using himself as a subject. He generated long lists of nonsense syllables and examined the factors that improved his ability to remember these lists. To do so, Ebbinghaus manipulated a number of variables including the length of the lists, the number of times he rehearsed each list, the time between list exposure and the memory test, as well as the proximity of a particular item to other items in the list. Ebbinghaus' conclusions, published in Memory (1885), presented a series of formal laws to explain how memories are formed through associative connections. Combined with the influence of Pavlov's work and the rise of behaviorism, the study of associative mechanisms flourished in North America during the first half of the 20th century. Much of this work was conducted by experimental psychologists examining classical or operant conditioning in the lab. These tightly-controlled experiments allowed researchers to identify causal factors of associative learning, many of which apply to both vertebrates and invertebrates.
During the same period, behavioral ecologists, based largely in Europe, were studying animals in their natural environment as a means to understand the evolutionary function of associative mechanisms. By the 1960s, there was a growing realization that these two approaches are complementary (Hollis, 1997). Most contemporary researchers, therefore, consider both proximate and ultimate explanations of associative learning, even if they continue to work in only one tradition.

1.2 Chapter Plan

This chapter focuses on the two most commonly studied forms of associative learning: classical and operant conditioning. A brief overview of the historical precedents for studying these two processes, both in the lab and in the natural environment, is presented. Historically, research into these two processes was conducted primarily in the lab, although there are compelling and important field studies of both classical and operant conditioning. In order to understand how the majority of this work is conducted, the primary experimental paradigms in associative learning are described. This is followed by a discussion of how an animal's evolutionary history impacts its ability to form associations. The mechanisms of classical and operant conditioning, as well as the phenomenon of extinction, are outlined as a means to understand how associative learning operates. This leads to an overview of the theories of associative learning. Finally, as with all cognitive processes, associative learning is represented by changes in the central nervous system; there is extensive evidence pointing to specific brain systems that control both classical and operant conditioning. Although the evidence is not complete, these findings provide working models for future investigations of associative mechanisms.

2. CLASSICAL CONDITIONING

Every organism lives in an environment surrounded by hundreds, or even thousands, of stimuli.
Those with motivational significance to an animal will elicit a behavioral response or, more accurately, a combination of responses. Motivationally-significant stimuli may be classified as positive (sometimes called appetitive) or aversive: appetitive stimuli are those that an animal will work to obtain, such as food, water, or access to a sexual partner; aversive stimuli are those that an animal avoids, such as predators, nausea, or painful stimuli. Classical conditioning is the process whereby stimuli that do not elicit a response initially acquire this ability through association with a motivationally-significant stimulus. The systematic study of this phenomenon began with Pavlov's work on the digestive physiology of dogs. Indeed, the two are so closely connected that the terms classical and Pavlovian conditioning are often used interchangeably. Pavlov identified four components in a classical conditioning paradigm: the unconditioned stimulus (US), the unconditioned response (UR), the conditioned stimulus (CS) and the conditioned response (CR). Prior to conditioning, the US elicits the UR; this is sometimes called the unconditioned reflex. Following conditioning trials in which the CS is paired with the US, the CR is elicited by the CS. This CS-CR connection is sometimes called the conditioned reflex. In the case of Pavlov's experiments, a dog salivating to the sound of a bell would be a conditioned reflex. It should not be surprising that classical conditioning has been documented in almost every species studied. If one stimulus reliably precedes the presentation of another, organisms can use this information to predict changes in their environment. The ability to learn these associations would significantly increase the survival advantage of any animal. The obvious example is being able to predict where and when food will be available, or where and when a predator will arrive.
But even more complex behaviors can be influenced by classical conditioning. For example, Hollis and colleagues demonstrated that classical conditioning improves territorial aggression in male gourami fish (Hollis et al., 1984). In this experiment, one group of fish learned that a red light predicted the entry of a rival male to the area. In contrast to fish that did not learn this association, the conditioned animals exhibited more bites and tail beating, almost always winning the fight and retaining their territory. At least in this instance, the opportunity to predict the impending presence of a rival gave these males a significant survival advantage.

Figure 5.1 Classical conditioning increases territorial aggression in male fish. One group of fish (PAV) experienced conditioning trials in which a light predicted the entry of a rival male into the territory. Another group experienced the same number of light presentations and rival entries, but these were not explicitly paired (UNP). On a subsequent test, the PAV group showed more aggressive displays to the light. Reprinted from Hollis (1984).

BOX 5.1 Psychoneuroimmunology

As with many scientific 'breakthroughs', the demonstration that immune responses can be classically conditioned was a chance discovery. In the early 1970s, Ader and Cohen (1975) were working with the immunosuppressant, cyclophosphamide. This drug has a wide range of clinical applications (organ transplants, autoimmune disorders), but it produces a list of terrible side effects including nausea. Ader wanted to find a solution to this problem, so he allowed mice to drink a flavored saccharin solution prior to an injection of cyclophosphamide. As expected, the mice developed a conditioned taste aversion to the solution. In order to test the effects of various anti-nausea agents, Ader force-fed the saccharin solution to these mice. The problem? The animals kept dying.
At the time, the explanation was not obvious, but it is now clear that the saccharin solution was suppressing immune responses in animals that had experienced the saccharin-cyclophosphamide association. Ader demonstrated this experimentally by showing that mice exposed to the flavored saccharin water (i.e., the CS) prior to an injection of foreign cells developed fewer antibodies, or a weaker immune response, than did mice exposed to water (Ader et al., 1990). Subsequent work showed that classical conditioning can also increase immune system activity. In these experiments, an odor is paired with an injection of the drug interferon. Interferon increases the activity of natural killer cells in the bloodstream, which help the body to fight off infections, viruses and foreign cells. In both mice (Alvarez-Borda et al., 1995) and humans (Buske-Kirschbaum et al., 1994), an odor CS paired with an interferon injection increased natural killer cell activity, even in the absence of the drug. From a rather serendipitous finding that classical conditioning affects immune responses, psychoneuroimmunology developed and is now a thriving field.

3. OPERANT CONDITIONING

In classical conditioning, the association between the CS and the US is independent of the animal's behavior. That is, a US is presented following the CS, regardless of how the animal responds to the CS; Pavlov's dogs did not need to do anything in order to receive the food after a bell was sounded. In contrast, the US is presented in operant conditioning only if the animal performs a response. The prototypical example of operant conditioning is a rat pressing a lever to receive a food reward. But any behavior that produces a consequence, either in the lab or the natural environment, can be described as an operant.
This includes a cat pulling a rope to escape from a puzzle box, a rodent swimming to a hidden platform in a water maze, a bird pecking at a distinctly colored tree bark that provided a rich food source in the past, or an animal avoiding a location in which a predator was encountered. Thorndike, who pioneered the study of operant conditioning in the lab, noted that animals tend to repeat behaviors that produce satisfying outcomes and refrain from repeating those that lead to unsatisfying outcomes. This simple idea, called the Law of Effect, asserts that behavior is controlled by its consequences. This causal relationship between a response and its consequence can be expressed in one of four ways:

1. Positive reinforcement describes a positive contingency between a response and a positive outcome. Giving your dog a treat when he comes to your call, paying workers for overtime hours, praising children for doing their chores or presenting a sugar pellet to a rat when it presses a lever are all examples of positive reinforcement.

2. Punishment is a positive contingency between a response and an aversive event. It includes shocking a rat for stepping from one compartment to another, slapping your dog for chewing your slippers, scolding a child for making a mess, or fining a driver for parking in a reserved spot.

3. A negative contingency between a response and a negative outcome is called negative reinforcement. If a response is emitted, an aversive event does not occur. Thus, behaviors governed by negative reinforcement include not touching a hot element that burned you in the past or avoiding an area where you previously encountered a predator.

4. Omission training, sometimes called negative punishment, involves a negative contingency between a response and a positive outcome. Grounding a child for staying out too late or presenting a food pellet only if a rat does not press a lever are both examples of omission training.
Table 5.1: The relationship between a response and its outcome in operant conditioning. An operant response may produce or prevent an outcome that is either aversive or appetitive, leading to four different reinforcement relationships.

                            Positive/Appetitive Outcome    Negative/Aversive Outcome
Response Produces Outcome   Positive Reinforcement         Punishment
Response Prevents Outcome   Omission                       Negative Reinforcement

Operant conditioning is an easy concept to grasp because most people believe that their actions are governed by consequences. For example, you may study every evening because you believe that there is an association between working hard and obtaining good grades. Behaviors that are repeated frequently, however, often become automatic or habitual. These so-called 'habits' are responses that are elicited by environmental stimuli and are relatively insensitive to feedback. Perhaps if you study diligently each evening, this behavior will become automatic, regardless of how well you perform on tests and assignments.

4. EXPERIMENTAL PARADIGMS

In order to understand how scientists study associative processes, it is important to be familiar with the paradigms that are used to measure classical and operant conditioning. Historically, most laboratory studies used rodents or birds as subjects, but this has changed in the last two decades with the proliferation of molecular biology tools. These advances have allowed researchers to uncover the genetics of associative learning in invertebrates, a topic that will be discussed later in this chapter (see Section 9). In contrast, behavioral ecologists throughout the century actively pursued studies of associative processes in a variety of species. The majority of these focused on classical conditioning, although operant conditioning clearly occurs at high rates in the natural environment.
Regardless of whether scientists are conducting laboratory or field-based research, almost all studies of associative processes use some variation of the paradigms described below.

4.1 Classical Conditioning

Conditioned Approach

Because Pavlov was interested in digestive physiology, his dogs had cannulas attached to their salivary ducts that conducted drops of saliva to a data-recording device. This allowed Pavlov to quantify the CR (i.e., drops of saliva), but it is an impractical and unnecessary procedure for most classical conditioning experiments. A much simpler way to measure appetitive conditioning is to examine an animal's tendency to approach and contact a stimulus associated with reward (e.g., food). Brown and Jenkins (1968) documented one of the first examples of this effect in pigeons that pecked a key light predicting food presentation. Animals received the food regardless of their behavior, so the pecking was a classically conditioned response. A strikingly similar phenomenon was observed only two years later by the ethologist Harvey Croze (1970). He demonstrated that carrion crows approached and explored empty mussel shells on the beach (previously signaling no food) when a small piece of beef was placed under the shell. Even when Croze made the reward more difficult to obtain by burying it in the sand, the birds rapidly adjusted their foraging behavior to uncover the beef. This ability to associate previously-neutral stimuli with food has obvious evolutionary significance and probably explains the ease with which a conditioned approach response is acquired in both lab pigeons and foraging crows.

Figure 5.2 Search patterns of wild carrion crows. During training, birds learned to associate a food reward with mussel shells in the sand. During testing, an equal number of mussel and cockle shells were placed along the beach but there were no food rewards under either.
Mussel and cockle shells differ in color, size and shape, so birds can easily distinguish between them. Top: Crows quickly located mussel shells and, when they failed to uncover a food reward, spent a considerable amount of time digging in the sand around the shell. Bottom: Crows approached a small number of cockle shells and turned over a few of these, but did not dig in the sand around the cockle shell. Reprinted from Croze (1970).

Experimental psychologists capitalize on this facility by measuring approach responses to stimuli predicting natural rewards such as food, water, or sexual partners as well as non-natural rewards such as drugs or artificial sweeteners. The latency to approach the CS, the number of CS approaches within a given time period, and the time spent in close proximity to the CS are all indices of classical conditioning. Conditioned place and conditioned taste preference paradigms are common modifications of this methodology. These tests measure consumption of a flavored food or time spent in a distinctive environment previously associated with reward. Preference conditioning has been documented in fish and in invertebrate species such as marine snails (Aplysia), worms (C. elegans) and fruit flies (Drosophila), making it a powerful tool to examine the role of genetic factors in classical conditioning.

Conditioned Fear

Unlike conditioned approach paradigms, tests of conditioned fear measure learned responses to aversive stimuli. The particular response that is produced (i.e., the CR) will depend on the species being tested, and on the US that is used in conditioning. For example, some animals emit warning signals, such as alarm calls or plumage displays, which act to deter an attack or warn other animals of danger. If these responses are later elicited by stimuli that were paired with the danger, we can infer that animals have formed an association between the CS and US. Other animals will escape or run for cover when they encounter a predator.
Stimuli that signal the sight, sound or smell of these predators will, over time, elicit conditioned escape responses that can be quantified by an experimenter. Rodents and other small animals often bury stimuli that were associated with pain or nausea. So, if they mistakenly touch a sharp object or eat something that makes them ill, they will respond to these stimuli at a later time by vigorously digging the dirt or bedding around them to cover the object. Importantly, burying occurs at a later time when the US is no longer present (i.e., they do not touch the object or taste the food again). This confirms that the burying is a response to the CS, not an automatic response to the US itself. Laboratory tests of conditioned fear often capitalize on the fact that frightened animals tend to freeze. If a neutral stimulus is paired with a fearful stimulus, such as a loud noise or a mild shock that does not cause any injury, animals will freeze in response to the neutral stimulus (now a CS). The strength of the CR is quantified by measuring the period of immobility following the CS presentation. The conditioned suppression paradigm is a variation on this measure in which rodents are initially trained to press a lever for food. Once lever pressing is stable, classical conditioning trials are instituted in which a CS (usually a tone or light) is paired with a shock. When animals freeze, they stop lever pressing, so the number of lever presses that occur during the CS, versus the number that occur during non-CS periods, is a measure of conditioned fear. This suppression ratio is typically calculated as (lever presses during the CS) / (lever presses during the CS + lever presses during an equal period of time preceding the CS). Thus, a suppression ratio of 0.5 would indicate no reduction in responding and no conditioning, whereas a suppression ratio of 0 would reflect complete freezing during the CS and maximal conditioning.
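As a quick check of the arithmetic, the suppression ratio is straightforward to compute. The short Python sketch below uses hypothetical lever-press counts (pre-CS responding held steady at 20 presses per period, CS-period presses declining across trials); the function and variable names are illustrative, not taken from any published analysis code:

```python
def suppression_ratio(lp_cs, lp_pre_cs):
    """Suppression ratio = LP during CS / (LP during CS + LP during an
    equal pre-CS period). 0.5 = no suppression; 0 = complete freezing."""
    return lp_cs / (lp_cs + lp_pre_cs)

# Hypothetical trials: CS-period lever presses decline as fear
# conditioning proceeds, so the ratio falls from 0.5 toward 0.
for trial, lp_cs in enumerate([20, 19, 17, 11, 6, 2], start=1):
    print(f"Trial {trial}: ratio = {suppression_ratio(lp_cs, 20):.2f}")
```

Note that the ratio is bounded between 0 and 1 by construction, which makes it easy to compare degrees of suppression across animals with different baseline response rates.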
Figure 5.3 Acquisition of a conditioned suppression response. The conditioned suppression ratio is calculated as LP during CS / (LP during CS + LP during pre-CS). During the first two trials, the CS does not suppress lever pressing; as trials progress, lever presses during the CS decline and the suppression ratio is reduced. Assuming a steady state of responding during non-CS periods, the data that would generate this curve could be as follows:

Trial 1: LP CS = 20; LP pre-CS = 20; Ratio = 20/(20+20) = .5
Trial 2: LP CS = 19; LP pre-CS = 20; Ratio = 19/(19+20) = .48
Trial 3: LP CS = 17; LP pre-CS = 20; Ratio = 17/(17+20) = .45
Trial 4: LP CS = 11; LP pre-CS = 20; Ratio = 11/(11+20) = .35
Trial 5: LP CS = 6; LP pre-CS = 20; Ratio = 6/(6+20) = .23
Trial 6: LP CS = 2; LP pre-CS = 20; Ratio = 2/(2+20) = .09

LP = lever presses
CS = conditioned stimulus

Conditioned Taste Aversion

Many of you will be able to name a particular food or drink that makes you nauseous. This conditioned taste aversion (CTA) likely developed because you consumed the food prior to experiencing some gastric illness, usually vomiting. It does not matter whether the food itself actually made you sick; the important thing is forming an association between the stimulus properties of the food (smell and taste) and your nausea. CTAs are easily observed in many animals, both in the lab and in the natural environment. After consuming a flavored substance, animals are made sick by injecting them with a nausea-producing agent, such as lithium chloride, or exposing them to low-level gamma radiation. After they recover from the illness, animals are presented with the flavored food. If animals eat less of this food than another food that was not associated with illness, we conclude that they have developed a CTA. In some cases, the control comparison is the amount of flavored food consumed by animals that were not made ill.
CTAs can be very powerful; they often develop with a single CS-US pairing and are sustained for long periods of time. This should not be surprising, as poison-avoidance learning is critical for survival. Animals can become very sick or die if they consume inedible items, so they must learn to avoid poisons and other toxins by associating the smell and taste of the food with illness. This is one reason that it is so difficult to develop effective poisons for rodents. After sampling a very small amount of the novel food, rats and mice feel ill and avoid this food in the future.

Note that the measure of classical conditioning (the CR) varies across these three types of paradigms. Given that there are dozens, or more accurately hundreds, of other classical conditioning tests, the number of ways in which scientists can measure this process is almost endless. In addition, for each CR, a researcher may choose to measure how quickly it is acquired, how large the response is once conditioning has occurred, and how long it lasts when the US is no longer present. This sometimes makes it tricky to compare the magnitude of conditioning across studies. Researchers (and students reading these studies) must pay close attention to the dependent measure in each study because discrepancies in research findings can sometimes be explained by differences in how the CR is assessed.

4.2 Operant Conditioning

Discrete Trials

The cats in Thorndike's experiments were required to pull a rope, move a stick or push a board to escape from a Puzzle Box. Operant conditioning (or trial and error learning, as Thorndike called it) was evidenced by a decrease in the escape time over trials. This discrete trials setup, in which subjects have the opportunity to make one correct response each time they are placed in a testing apparatus, is still common in lab-based research.
Maze experiments, such as the water maze, T-maze, or straight alleyway, are all operant conditioning paradigms, although most of these tests are used to assess other cognitive processes such as spatial learning or decision making. One of the simplest discrete trials measures of operant conditioning is the conditioned avoidance paradigm. A rat or other small animal is shocked for stepping off a platform onto a grid floor; longer latencies to step from the platform on subsequent trials indicate better conditioning. Many people will note that this test sounds remarkably similar to the conditioned escape or conditioned freezing paradigms described above. Certainly both require animals to form associations between aversive events and the stimuli that predict them. The main difference is that the presentation of the shock in the conditioned avoidance paradigm depends on the animal's behavior: if they do not step off the platform, they will not receive a shock. In classical conditioning, the US follows the CS regardless of what the animal does. Obviously we cannot control how animals behave in the natural environment, which is one reason that there are so few ecological studies of operant conditioning.

Free Operant

When most students and researchers think about operant conditioning, they imagine a rat pressing a lever for food in a small chamber. This setup was designed by B.F. Skinner in the 1940s as a means to evaluate the ongoing responses of his subjects (usually pigeons or rats). Animals are placed in a chamber containing some manipulandum, usually a lever, which can deliver a reward (also called a US). In contrast to the discrete trials paradigms, free operant methods allow animals to respond repeatedly (freely) once they are placed in the experimental chamber. One advantage of this method is that the experimenter can observe variations in responding across time. This information is represented on a cumulative response record.
The vertical distance (y-axis) on the graph represents the total number of responses in the session and the distance along the x-axis indicates time. Thus, the cumulative record provides a visual representation of when and how frequently the animal responds during a session.

Figure 5.4 Cumulative response records for rats responding for methamphetamine (c and d) or food (e and f). Food reinforcement produced a steady rate of responding over time, whereas methamphetamine produced a step-like pattern in which a drug infusion (i.e., presentation of the reinforcer) was followed by a pause in responding (c). When animals are responding for drug, they load up at the beginning of the session, presumably to attain an optimal level of drug in their system (c). This effect is not observed when animals are responding for food (e). Administration of the glutamate antagonist MTEP (d and f) led to a near-complete cessation of methamphetamine-reinforced responding (d) but had no effect on responding for food (f). The rapid lever presses that occur at the beginning of the methamphetamine session (d) indicate that animals are 'expecting' a drug infusion and may be frustrated by the lack of drug effect. Animals stopped responding for over an hour, tried the lever a few times, and stopped responding for another 40 minutes. This pattern is consistent with the idea that MTEP is blocking the reinforcing effect of methamphetamine. Reprinted from Gass et al. (2009).

In free operant paradigms, the relationship between responding and reinforcement is described by the reinforcement schedule. This is a rule (set by the experimenter) that determines how and when a response will be followed by a reinforcer. When every response produces a reinforcer, the schedule is called continuous reinforcement or CRF. More commonly, responding is reinforced on a partial or intermittent schedule.
One of the simplest ways to produce partial reinforcement is to require animals to make a certain number of responses for each reinforcer. If the number is fixed, the schedule is called fixed ratio (FR); if the number of required responses varies around a mean value, the schedule is called variable ratio (VR). Thus, FR5, FR10 and FR50 schedules require animals to make exactly 5, 10 and 50 responses for the reinforcer, whereas VR5, VR10 and VR50 schedules require animals to make an average of 5, 10 and 50 responses. A progressive ratio (PR) schedule is a variation of an FR schedule in which animals must make an increasing number of responses for successive presentations of the reinforcer (Hodos, 1961). Typically, the schedule is set up so that animals make one, then two, then four, then eight responses for the first four reinforcers. The PR values continue to increase until animals stop responding altogether. This break point is a measure of motivation, or how hard animals will work for a single presentation of the reinforcer. In contrast to ratio schedules, interval schedules provide reinforcement if a response occurs after a certain period of time. Under fixed interval (FI) schedules, the time from the presentation of one reinforcer to the possibility of receiving the next is constant. Under variable interval (VI) schedules, responding is reinforced after an average time interval has passed. For example, animals responding under an FI-15s schedule would be reinforced for the first response they make 15 seconds after the last reinforcer. Under a VI-15s schedule, reinforcement would be available, on average, 15 seconds after the delivery of the last reinforcer. Note that animals must still respond under interval schedules, but they are only reinforced for responses that occur after the interval has elapsed. In Chapter 9, you will read more about how these techniques have been used in very clever examinations of the timing and counting abilities of animals.
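The contingencies that distinguish ratio from interval schedules can be stated precisely in a few lines of code. This Python sketch (the class and method names are my own, not from any operant-conditioning software) decides whether a given response earns a reinforcer under FR and FI rules; VR and VI versions would simply redraw the requirement around a mean after each reinforcer:

```python
class FixedRatio:
    """FR-n schedule: every n-th response is reinforced."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self):
        self.count += 1
        if self.count >= self.n:
            self.count = 0   # requirement resets after each reinforcer
            return True      # reinforcer delivered
        return False

class FixedInterval:
    """FI-t schedule: the first response made t seconds after the last
    reinforcer is reinforced; earlier responses earn nothing."""
    def __init__(self, t):
        self.t = t
        self.last_reinforcer = 0.0

    def respond(self, now):
        if now - self.last_reinforcer >= self.t:
            self.last_reinforcer = now
            return True
        return False

# FR5: exactly every fifth lever press pays off.
fr5 = FixedRatio(5)
print([fr5.respond() for _ in range(10)])     # reinforcers on presses 5 and 10

# FI-15s: a press 5 s after a reinforcer earns nothing; one at 20 s does.
fi15 = FixedInterval(15)
print(fi15.respond(5.0), fi15.respond(20.0))  # False True
```

Note the asymmetry the sketch makes explicit: under the ratio rule, reinforcement depends only on the count of responses, so faster responding pays off sooner; under the interval rule, responding faster than the clock earns nothing, which is one reason response rates differ so markedly between the two schedule types.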
Situations in which a reinforcer is presented at specified intervals, regardless of the animal's behavior, are called time schedules. Schedules of reinforcement induce different patterns of responding, suggesting that animals have some knowledge of the payoffs provided by each schedule. For example, acquisition is usually more rapid under CRF than under partial reinforcement schedules, and responding declines more quickly when the reinforcer is removed. This should not be surprising. If an organism is always reinforced for a particular behavior, it will be easier to associate the behavior with the outcome, and the animal will stop responding when the reinforcer is removed. If the behavior is reinforced only part of the time, it may take longer to learn the association, and the animal could develop a strategy to 'keep trying' to receive the reinforcer. Even under partial reinforcement schedules, different patterns of responding emerge. FR schedules are characterized by high and steady rates of responding up to the presentation of the reinforcer. Responding ceases after reinforcement is delivered, and this post-reinforcement pause increases with the size of the ratio. With very high FR ratios, animals may stop responding altogether. FI schedules also induce post-reinforcement pauses that vary directly with the length of the interval, presumably because animals learn that the reinforcer will not be available for a certain period of time. Responding recovers close to the completion of the interval, with the rate increasing exponentially until the next reinforcer is obtained. This produces a 'scallop' in responding, as opposed to the relatively steady rate of responding that occurs under FR schedules. Not surprisingly, variable schedules of reinforcement produce less predictable patterns of responding than do fixed schedules. In general, response rates are steady under variable schedules, reflecting the fact that animals do not know when the next reinforcer will be delivered.
Rapid bursts of responding often occur because animals have experienced occasions in which a few fast responses produce several reinforcers in a row. When the reinforcement payoff is low (i.e., high VR or VI schedules), response rates are reduced, responding becomes more sporadic and pauses occur at irregular intervals. This makes sense in terms of what the animal has learned: reinforcement is infrequent and unpredictable. Figure 5.5 Cumulative response records under different schedules of reinforcement (VR=variable ratio, VI=variable interval, FR=fixed ratio, FI=fixed interval). The delivery of a reinforcer is marked by a vertical slash. Different schedules of reinforcement elicit different patterns of responding and, not surprisingly, FI responding in humans is very accurate if they can use clocks to time the intervals (bottom line to right of figure). Redrawn from Baldwin and Baldwin (2001). Schedules of reinforcement are a convenient tool to elicit different patterns of responding in the lab, but they also have an important application to the real world. That is, ratio and interval schedules are analogies for how different commodities are depleted and replenished in the natural environment. Some food sources, such as prey for predator animals, will disappear on an item-by-item basis as they are consumed, whereas organic food sources replenish with time. These represent ratio and interval schedules, respectively. Animals that understand the payoffs provided by these sources will be better equipped to survive in their environment because they are less likely to deplete resources beyond the point at which they can be replenished. Ironically, this appears to be exactly what humans have not learned about the environment. BOX 5.2 Superstitious Behavior The defining feature of operant conditioning is that the presentation of the reinforcer depends on the animal's response. But what if an animal merely thinks that its behavior is causing an outcome?
Will responding change accordingly? Skinner (1948) tested this idea by presenting noncontingent food to pigeons at regular intervals. Over trials, the birds developed responses that preceded the food delivery, but these differed across birds: one bird turned in counterclockwise circles, one made pecking motions at the floor, two swung their heads back and forth, one raised its head to one corner of the cage and another bobbed its head up and down. Skinner surmised that each of these animals had been engaged in a particular behavior (e.g., head bobbing) when the food became available. This signaled an operant contingency to the animal that was reinforced on subsequent trials when the bird performed the response and the food was then delivered. Skinner described these responses as superstitious behavior and noted that they are maintained because animals form a response-outcome association, even though there is no causal relationship between the two. Similar effects were observed in other animals (including humans) that were tested in the lab but, of course, the story is not so simple in the real world. Not all coincidental occurrences of a response and a reinforcer lead to superstitious behavior, so researchers began to investigate the factors that produce this false belief (Gmelch & Felson, 1980; Vyse, 1997). These studies concluded that superstitious behaviors are maximized under the following conditions: 1) an unusual and novel behavior precedes the presentation of a highly-valued reward; 2) the cost of engaging in the behavior is very low but the payoff is potentially very high; and 3) the attainment of the reward is not completely under the individual's control. In humans, superstitious behaviors may be maintained because they create an illusion of control, particularly in situations that induce high levels of stress and anxiety.
This describes high-level sports or artistic endeavors, and both athletes and musicians often engage in specific rituals prior to a competition or performance. These could include putting on a particular piece of clothing, eating a particular food, or entering the theatre or arena in a particular way. Even if individuals acknowledge that these rituals may not cause their good performance, they are reluctant to abandon them. After all, there is no downside to wearing a particular piece of clothing. And what if it works? Regardless of what the performers or athletes may think, these are superstitious behaviors that are maintained for exactly the same reason that Skinner's pigeons continued to bob their heads or scratch the floor before the food arrived. 5. EVOLVED DISPOSITIONS Many psychologists in the first half of the 20th century adhered to the principle of equipotentiality. This position assumed that associations between different stimuli, responses and reinforcers could be formed with equal ease. For example, Pavlov noted that any stimulus could be used as a CS in his experiments, and Skinner claimed that animals could learn any operant response (as long as it was physically possible) for any reinforcer. According to these scientists, the results of their experiments could be generalized to all instances of associative learning (and, many believed, to all cases of learning). We now recognize that this is not true: some associations are easy to learn and others are not. For example, a CTA often develops following a single CS-US pairing and, as many people are aware, may last for months or even years. In contrast, some stimuli never elicit a response, even if they are paired with a US on hundreds or thousands of occasions.
The relative ease with which animals acquire certain associations is referred to as evolved dispositions (Shettleworth, 1998), reflecting the idea that learning an association between two stimuli has conferred some evolutionary advantage on the species. One of the most elegant examples of evolved dispositions comes from an experiment by Garcia and Koelling (1966). In the first part of the experiment, rats were presented with a drinking tube containing flavored water. Every time the rats licked the tube, a brief audiovisual stimulus was presented that consisted of a clicking sound and a flash of light. Animals were then either shocked or made ill by x-ray treatment. Thus, all rats experienced the same CS (flavor plus light-sound combination), with half receiving a shock US and half receiving the x-ray treatment that made them ill. In the subsequent test, animals were presented with two licking tubes, one that contained flavored water and one that contained plain water linked to the light-sound cue. Figure 5.6 Experimental design for Garcia and Koelling's experiment showing evolved dispositions. The CS was a compound stimulus consisting of a flavor and an audiovisual stimulus. The US was either a shock or illness. During testing, responses to the two CSs (flavor or audiovisual stimulus) were tested separately. Reprinted from Domjan (2003). The group that was made ill avoided the flavored water and drank from the plain water tube that activated the audiovisual cue. In contrast, the shocked group avoided the tube linked to the audiovisual cue and drank the flavored water. The interpretation of these results is that rats have a tendency to associate a flavor with illness and a light-sound stimulus with shock. The same effect is observed in 1-day-old rat pups (Gemberling & Domjan, 1982), suggesting that the disposition is innate, rather than acquired through experience.
Figure 5.7 Results of Garcia and Koelling's experiment demonstrating evolved dispositions. The bars indicate how much rats licked a drinking tube that contained a flavored solution (taste) or produced an audiovisual stimulus. Rats that experienced the illness US licked the tube containing the flavored solution less frequently, whereas those that experienced the shock US showed fewer licks of the tube that produced the audiovisual stimulus. Reprinted from Domjan (2003). In the natural environment, gustatory cues are a good predictor of dangerous food, whereas audiovisual cues are a good predictor of environmental danger such as predation. Animals that learned the predictive significance of these stimuli would be more likely to survive and reproduce, explaining why contemporary animals acquire these associations so easily. Evolved dispositions also account for cross-species differences in CTA learning. Rodents easily acquire flavor-illness, but not color-illness, associations, whereas quail develop both. Unlike rats and mice, quail rely heavily on vision when searching for food. This also explains the relative ease with which birds associate visual, but not auditory, cues with a food reward. Thorndike was ahead of his time in recognizing the importance of evolved dispositions in operant conditioning. He tested whether cats could be trained to yawn or scratch themselves in order to escape from a Puzzle Box, concluding that these associations did not 'belong' together in the animal's evolutionary history. This foreshadowed many unsuccessful attempts to train animals in operant paradigms. Such examples include Hershberger's attempt to train chicks to run away from a food bowl to receive a reward, Bolles' attempts to train rats to stand on their hind legs to avoid a shock, and the Brelands' difficulty in training animals to perform circus tricks when the trained response was incompatible with the animal's natural behavior.
All of these researchers discussed their negative findings in the context of evolution: operant responses that are contrary to adaptive behaviors will be difficult to acquire. Given that approaching food-related stimuli and escaping from shock would confer some evolutionary advantage, trying to condition animals against these dispositions is incredibly difficult. The corollary is that response-reinforcer associations which enhance a species' survival are easily acquired. Sevenster (1973) demonstrated this principle in male stickleback fish that were trained to bite a rod or swim through a ring to gain access to another fish. When males could gain access to another male, biting increased but swimming through a ring did not. The opposite occurred with access to a female: swimming through a ring increased, but biting the rod did not. The finding that access to another male fish is an effective reinforcer for biting, and that access to a female is an effective reinforcer for ring swimming, fits with the animal's evolutionary history. Biting is a component of the aggressive behavior that occurs when a resident male encounters an intruder, whereas swimming through a ring is more characteristic of the swimming patterns seen during courtship. Evolved dispositions may also explain some irrational fears in humans. People readily acquire exaggerated fear responses to stimuli associated with threats in the natural environment, and many phobias likely stem from an evolved disposition to fear objects or situations that were dangerous to our ancestors. This explains why humans and monkeys associate pictures of snakes and spiders with shock more readily than pictures of flowers and houses (Ohman, Dimberg & Ost, 1985). Conversely, although cars pose a very real and immediate danger, children have great difficulty learning not to run in front of them.
The same children will often refuse to enter a dark room on their own, even if the room is familiar and they receive repeated assurances that no monsters are hiding in the dark. By the end of the 20th century, most scientists accepted that evolved dispositions affect associative learning. Many researchers still focus exclusively on proximate causes of classical or operant conditioning, but their research questions are often framed within an evolutionary perspective. For example, one may investigate which brain regions mediate associative learning, and then ask why these neural systems are preserved across evolution. If nothing else, researchers can capitalize on the fact that it is much easier to train animals in classical or operant tasks when one works with, rather than against, evolved dispositions. 6. MECHANISMS Evolved dispositions provide a functional explanation of associative learning in that being able to predict when one stimulus follows another or which outcome will follow a response should help animals to survive and reproduce. In contrast, proximate explanations describe associative learning in terms of the causal factors that produce optimal conditioning. These include physiological mechanisms underlying associative learning, a topic that will be discussed in Section 9 of this chapter. Many other proximate factors of associative learning have been identified. Of these, predictiveness, temporal contiguity, and stimulus salience are the most important. 6.1 Predictiveness It seems intuitive that the more frequently two stimuli are paired, the stronger will be the association between them. Contiguity alone, however, does not produce associative learning. The CS must reliably predict the US in classical conditioning, and the response must reliably produce the reinforcer in operant conditioning, or conditioning will not occur.
The realization that predictiveness is an important determinant of classical conditioning came about following a classic experiment by Leon Kamin (1969). In this study, one group of rats underwent a classical conditioning protocol in which a CS (tone) preceded a shock (US). As expected, a freezing response (CR) developed to the tone. Following these trials, a light was presented at the same time as the tone and both were followed by the US. Stimuli presented simultaneously are called compound stimuli and are labeled individually as CS1 (tone), CS2 (light), etc. A separate group of control rats experienced CS-US pairings with the compound stimulus (tone and light combined) but had no prior exposure to either stimulus alone.

Phase                  Blocking Group   Control Group
simple conditioning    CS1 – US         no treatment
compound conditioning  CS2+CS1 – US     CS2+CS1 – US
test                   CS2?             CS2?

Figure 5.8 Experimental design for Kamin's blocking experiment. The blocking group experienced simple classical conditioning trials followed by compound conditioning trials. The control group experienced the compound conditioning trials only. CS=conditioned stimulus, US=unconditioned stimulus. In the test session, Kamin compared CRs to the light in the experimental and control groups. Note that all animals experienced exactly the same number of trials in which the light preceded the shock. Thus, if the frequency of CS-US pairings alone determines classical conditioning, the CR should be the same in the control and experimental groups. Of course, this is not what happened. Whereas the control group exhibited robust freezing to the light, the experimental group did not. In Kamin's terms, the initial tone-shock pairings blocked subsequent conditioning to the light. He discussed his findings in terms of informativeness, noting that the animals' previous experience with the tone made the light irrelevant as a predictor of the US.
Thus, in a blocking experiment, CS2 conveys no new information about the occurrence of the US, so conditioning does not occur. Figure 5.9 Results of Kamin's (1969) blocking experiment. Conditioning to CS2 was measured in a conditioned suppression experiment, with the blocking group showing almost no suppression of responding (i.e., no freezing to the light). In contrast, the control group showed marked conditioned suppression indicating that, unlike the blocking group, they had formed a CS2-US association. CER = conditioned emotional response. Kamin's experiment has been replicated hundreds (perhaps thousands) of times with different stimuli and different organisms, confirming that the frequency of CS-US pairings cannot, in itself, explain classical conditioning. Interestingly, the phenomenon may be limited to vertebrates, as insects (at least bees) do not exhibit blocking (Bitterman, 1996) despite the fact that classical conditioning is acquired easily by these animals. Another way to reduce the predictive value of a CS is to present it alone, prior to any CS-US pairings. This describes the phenomenon of latent inhibition, in which previous exposure to the CS, in the absence of the US, retards subsequent conditioning to the CS. One can think of latent inhibition as habituation to a novel stimulus (see Section X, Chapter 2). Organisms first learn that the CS has no motivational significance; they must then inhibit or overcome this information when the CS is presented with the US at a later time. Both blocking and latent inhibition fulfill an important biological function in that they limit cognitive processing of stimuli that are meaningless to the organism. 6.2 Temporal Contiguity It seems intuitive that it would be easier to form an association between two stimuli if they occur close together in time.
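The logic of blocking can be captured in a trial-by-trial simulation in which a CS gains associative strength only to the extent that the US is surprising. The sketch below uses the well-known Rescorla-Wagner updating rule purely as an illustration; the model is not presented in this chapter, and the function name and parameter values are arbitrary choices for the example.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Return associative strengths after a list of trials.
    Each trial is (present_CSs, us_occurred); every CS present on a trial
    is updated by alpha * (outcome - total prediction from all present CSs)."""
    V = {}
    for present, us in trials:
        predicted = sum(V.get(cs, 0.0) for cs in present)
        error = (lam if us else 0.0) - predicted   # surprise on this trial
        for cs in present:
            V[cs] = V.get(cs, 0.0) + alpha * error
    return V

# Blocking group: tone-shock training first, then tone+light compound trials
blocking = rescorla_wagner([(("tone",), True)] * 20 +
                           [(("tone", "light"), True)] * 20)
# Control group: compound trials only
control = rescorla_wagner([(("tone", "light"), True)] * 20)

# In the blocking group the tone already predicts the shock, so almost no
# prediction error remains and the light gains almost no strength.
```

Running this sketch reproduces Kamin's pattern: the light ends with substantial strength in the control condition but essentially none in the blocking condition, mirroring the idea that CS2 conveys no new information about the US.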
Many laboratory experiments confirmed this assumption: a US that immediately follows a CS, and a response that immediately produces a reinforcer, induce robust conditioning. Researchers went on to demonstrate that classical conditioning is reduced when the US is delayed because stimuli present during the intervening interval become better predictors of the US. The same appears to be true in operant conditioning. Animals can learn an operant response for a delayed reinforcer but, the longer the interval to the reinforcer, the more likely it is that animals will form associations between other stimuli and the reinforcer (Dickinson, 1980). This serves to weaken the response-reinforcer association by making the response less predictive of the outcome. The problem with this general principle of temporal contiguity is that it does not apply to all cases of associative learning. CTA learning is the notable exception. CTAs develop with very long CS-US (taste-nausea) intervals, up to a few hours in rodents and even longer in humans. Any ill effects related to eating would occur after the food is digested, absorbed into the bloodstream and distributed to bodily tissues. Organisms must be able to retain information about the stimulus properties of food over a long interval if they are to later avoid food that made them sick. This is a prime example of how ultimate explanations (evolved dispositions) influence proximate explanations (temporal contiguity) of associative learning. Even within the same paradigm, changing the CS-US interval alters how animals respond. Timberlake (1984) demonstrated this effect in a classical conditioning experiment with rats. When a light CS predicted a food reward at very short intervals (less than 2 seconds), rats developed a CR of handling and gnawing the CS. When the US occurred more than 5 seconds after the CS, rats developed a conditioned foraging response, as if they were searching for the food.
The fact that a different CR developed under these two conditions is evidence that rats were learning the predictive temporal relationship between the CS and the US. The relationship between temporal contiguity and the development of conditioned responding provides further evidence for the biological relevance of associative learning. If animals use associations to make predictive inferences about their world, then the temporal intervals that conform to CS-US or response-reinforcer relationships in the natural environment should produce the best conditioning. 6.3 Stimulus Salience Even if one controls for temporal contiguity and predictiveness, the rate and magnitude of conditioning to different stimuli may vary. This is true in both the lab and the natural environment, but is illustrated most effectively by thinking about a typical classical conditioning experiment. The arrival of the experimenter (often at the same time each day), the placement of the animal in a testing apparatus, the sound of automated equipment starting an experiment, as well as other extraneous cues may all become effective CSs because they ultimately predict the presentation of the US. Researchers attempt to control for these confounds but, even in tightly-controlled lab studies, it is difficult to eliminate cues that are not explicitly part of the experimental design. The issue is even more complicated in the natural environment, where almost every US is preceded by a cluster of CSs. The technical term for this phenomenon is overshadowing: one stimulus acquires better conditioning than other stimuli in the environment, even if they are equal predictors of the US. In blocking, a stronger association with one CS develops because that CS was presented first, whereas in overshadowing a stronger association develops to a CS because it is more salient. The most obvious explanation for overshadowing is that animals notice or pay attention to one stimulus at the expense of the others.
This is often described as stimulus salience, which is the likelihood that a stimulus will be attended to. Salience is often equated with the conspicuousness of a stimulus, or how well it stands out from the other background stimuli. In general, salience increases with the intensity of the stimulus: brighter lights, louder noises or stronger smells attract more attention. It is important to remember, however, that salience is not a fixed property of the stimulus. As we learned in Chapter 3, the ability of an organism to attend to a particular stimulus depends on the organism's sensory system and on perceptual processing. The salience of a stimulus can also be increased by altering the motivational state of the animal. Food-related cues, such as cooking aromas, are far more salient when you are hungry than when you are sated! Not surprisingly, one of the best ways to increase the salience of a stimulus is to make it similar to cues that animals encounter in their natural environment. Male quail will develop a conditioned sexual response to a CS, such as a light or terrycloth object, that predicts access to a female quail (US). If this arbitrary cue is made more realistic by having it resemble a female quail, more vigorous responding develops to the CS (Cusato & Domjan, 1998). Figure 5.10 Naturalistic and artificial stimuli in sexual conditioning experiments. The stimulus on the left is made of terrycloth and only resembles the general shape of a female quail. The stimulus on the right was prepared with head and neck feathers from a taxidermically prepared bird. Reprinted from Cusato & Domjan (1998). Stimulus salience impacts operant conditioning in the same way that it affects classical conditioning. Animals are much quicker to acquire responses to stimuli that are salient, and responding is increased when the stimuli have biological significance to the animal. 7.
EXTINCTION In both classical and operant conditioning, responses will decline if the US is no longer presented. For example, birds will eventually stop approaching shells that covered a food reward if the shells are now empty, and rats will stop lever pressing for food if the food is withheld. This gradual reduction in responding, following removal of the US, is called extinction. It may seem intuitive that animals forget or erase the association that was acquired during conditioning, but we know this is not how extinction occurs. For one thing, a CR will reappear if a delay period follows extinction, even when the organism has no further experience with the US. This spontaneous recovery indicates that the original association is still available to the animal. Second, if a novel stimulus is presented during extinction, the response rapidly recovers. For example, a dog will develop a conditioned salivation response to a bell that predicts food; when the bell is presented without the food, salivation declines. If a new stimulus, such as a light, is then presented with the bell, the dog will salivate again. The same phenomenon occurs in operant conditioning when rats re-initiate lever pressing in the presence of a loud noise. In some cases, the renewed response may be as robust as the pre-extinction response. This phenomenon is called disinhibition, reflecting the idea that the novel stimulus disrupts a process that actively inhibits the original association (CS-US in classical conditioning and response-US in operant conditioning). Third, extinction is context specific in that responding is inhibited only in the environment in which extinction trials occurred. In the laboratory, the environmental context can be altered by changing the flooring, the wall patterns, adding odors, etc. If extinction trials are conducted in a new context, the response declines but then re-emerges when the CS is presented in the original context.
This response renewal (sometimes called reacquisition) is another piece of evidence that CS-US associations are inhibited, not eliminated, during extinction. In other words, extinction is not simply forgetting. Figure 5.11 Hypothetical data showing changes in response strength across trials during acquisition, extinction, spontaneous recovery and reacquisition. Note that these changes could equally apply to operant or classical conditioning. 8. THEORIES Few people would disagree that classical and operant conditioning involve associative learning. Nonetheless, there was considerable debate in the first half of the 20th century over what associations underlie the changes in behavior. Some researchers argued that animals form stimulus-stimulus (S-S) associations during conditioning, others that they learn stimulus-response (S-R) associations, and still others that response-outcome (R-O) associations control responding. A number of very clever experiments were designed to tease apart these hypotheses but, as frequently occurs with such issues, the answer lies somewhere in between. 8.1 What Associations are Formed in Classical Conditioning? Pavlov was the first to argue that animals learn stimulus-stimulus associations, an effect he called stimulus substitution. According to this position, a connection is formed between the CS and the US such that the CS becomes a substitute for the US. Conditioned responding develops because the CS elicits a representation of the US, to which the animal responds. One of the best pieces of evidence that animals can form stimulus-stimulus associations comes from sensory preconditioning experiments. In this experimental set-up, two neutral stimuli (CS1 and CS2) are presented together with no US. At this point, both stimuli are motivationally neutral, so no observable response (CR or UR) is elicited.
Then, CS2 is followed by a US in standard classical conditioning trials. After a CR is established, CS1 is presented alone. If CS1 elicits a CR, animals must have formed a stimulus-stimulus association (CS1-CS2) during the pre-conditioning trials.

Phase             Protocol    Stimuli        Response
pre-conditioning  CS1 – CS2   light – tone   nothing
conditioning      CS2 – US    tone – food    salivation
test              CS1?        light?         reduced salivation

Figure 5.12 Experimental design for a sensory preconditioning study. If the light elicits a salivation response during the test, it indicates that an association was formed between the light and the tone during pre-conditioning trials. Devaluation experiments also support S-S accounts of classical conditioning. The logic behind a devaluation experiment is as follows: if the CS is associated with the US, changing the value of the US should change responding to the CS. Devaluation is examined empirically using a three-stage experiment. First, the animal experiences CS-US pairings (e.g., tone followed by food) until a CR develops (e.g., salivation); second, the US is devalued (e.g., food followed by sickness); and third, the CS is presented in the absence of the US (e.g., tone with no food). Note that the animals have never experienced the tone and sickness together. Thus, if the CR (e.g., salivation) is reduced, compared to a group that did not experience the devaluation, one can conclude that the animals learned an S-S association during conditioning trials. Sensory preconditioning and devaluation effects have been demonstrated in hundreds of experiments using dozens of different species. Despite this evidence, S-S theories of classical conditioning were not accepted by all researchers. Many noted that these theories make no prediction about what CR will develop across conditioning. Each US may elicit a variety of hormonal, emotional and behavioral responses, so a theory that does not anticipate which of these will become conditioned responses has limited utility.
S-R theories deal with this uncertainty by proposing that classical conditioning reflects the development of CS-UR associations. This implies that the CS should elicit a response that mimics the UR. Compelling evidence for this position was provided by Jenkins and Moore (1973), who trained pigeons to associate a light with the presentation of either food or water. Pigeons developed a classically conditioned response of pecking the light, but the characteristics of the peck varied with the US. A water US produced a CR of slow pecking, with the beak closed and often accompanied by swallowing. In contrast, a food US produced a CR that was a sharp, vigorous peck with the beak open at the moment of contact, as if the pigeons were pecking at grains of food. It seemed that the animals were attempting to 'drink' or 'eat' the CS, seemingly confirming the idea that the CR is a reduced version of the UR. Figure 5.13 Pigeons pecking a key that predicted a reward. The pigeon on the left was trained with a water US and the pigeon on the right with a food US. The water-trained pigeon pecks at a slow rate with its beak closed and swallows frequently, responses that mimic drinking. The food-trained pigeon pecks with an open beak at a more rapid rate, as if it is attempting to eat the light. Reprinted from Jenkins and Moore (1973). Although other data seemingly confirmed that S-R associations are important in classical conditioning, once again, contradictions arose. The most problematic was evidence that the CR does not always mimic the UR. Indeed, in some cases, the CR is opposite to the UR. For example, a mild shock increases heart rate, whereas a CS that predicts the shock decreases heart rate (Hilgard, 1936). These data also cause problems for S-S theories of classical conditioning because they are not consistent with the idea that the CS becomes a substitute for the US.
Because of these problems, a third group of theories was developed that explained classical conditioning as adaptive responding to the upcoming US. These preparatory theories argued that the CR mimics the UR when this is the best preparation for the US (e.g., an eyeblink to a puff of air or salivation to food). When the more adaptive response is to counter the US, the CR and UR will be in opposite directions (as in the heart rate example above). Preparatory theories provide a functional explanation of associative learning by suggesting that classical conditioning evolved to help organisms prepare for the appearance of motivationally-significant events. Preparatory theories provide an adequate explanation for many classically conditioned responses, suggesting the existence of CR-US associations. But this does not mean that animals do not learn CS-US or CS-UR associations during conditioning. In all likelihood, they learn all three. Each of these associations provides different information to the animal about the relationships in its environment, and animals that can use this information adaptively are more likely to survive and reproduce. One way to conceptualize the relationship between these three associations is to suggest that cognitive representations of CS-US associations develop during conditioning, and that this knowledge is translated into behavior through CS-UR and CR-US associations. BOX 5.3 Preparatory Responses and Drug Tolerance Drug tolerance occurs when higher and higher doses of a drug are required to produce the same effect. Tolerance develops rapidly to many opioid effects, so some drug addicts regularly inject a dose of heroin that is 30 or 40 times higher than a dose that would kill most people. Drug tolerance cannot be explained entirely by the pharmacological properties of the drug because animals and humans with the same level of drug exposure exhibit very different levels of tolerance.
One suggestion is that cues associated with the drug act as a CS that elicits preparatory responses to the drug (Siegel, 1983). Tolerance develops because the association between these stimuli and the injection becomes stronger and the compensatory mechanisms become better at countering the effects of the drug. One prediction of this theory is that tolerance should be stronger when the drug injection always occurs in the same environment (i.e., the same cues reliably predict the injection). Siegel and colleagues tested this hypothesis in rats that were repeatedly injected with heroin in one of two distinct environments. A control group of rats received sucrose/water injections. The following day, the dose of heroin was doubled for all animals. One group was injected in the same environment in which they received the original injections and one group in a different environment. The dependent measure was the number of overdoses in each group. Fig. 5.14 Results of Siegel's experiment on heroin tolerance. Only 32% of the animals died when they were injected with the higher dose in the same room as they received the original injections; twice as many animals died when they were injected in a different room. Almost all of the control animals were killed by this larger dose (Siegel et al., 1982). The explanation for this phenomenon is that the CSs in the 'same room' environment induced compensatory responses that opposed the drug effects. These findings help to explain why drug tolerance is context dependent and why drug addicts often overdose in new environments, even when they administer a dose that is similar to the amount that they take regularly. If the cues that signal the injection are not present, preparatory responses will not be set in motion to reduce the lethal effects of the drug. 8.2 What Associations are Formed in Operant Conditioning?
There are three fundamental elements in operant conditioning: the response (R), the reinforcer or outcome (O) and the stimuli (S) which signal when the outcome will be available. (Reinforcer and outcome may be used interchangeably; to avoid confusion between R for response and R for reinforcer, we use O for outcome.) As with classical conditioning, researchers in the 20th century argued about how these elements may be associated and, more importantly, which are responsible for the development of operant conditioning. Thorndike was the first S-R theorist, arguing that cats in his puzzle boxes formed associations between stimuli in the box and the operant response that led to the escape. The outcome (in this case escape) increased subsequent responding because it strengthened the S-R association. Thorndike formulated this principle into a Law of Effect, which stated that "if a response in the presence of a stimulus is followed by a satisfying event, the association between the stimulus and the response is strengthened" (Thorndike, 1911). Later researchers, most notably Hull (1930), used the term habit learning to describe the tendency to perform a particular response in the presence of a particular stimulus. According to Hull, the strength of the habit was a function of the number of times that the S-R sequence was followed by a reinforcer. As habit strength increased, the probability that a subject would perform the given response in the presence of the appropriate stimulus also increased. Tolman was one of the strongest critics of Hull's work, arguing that S-R theories turned animals (including humans) into automata, with no understanding of how their behavior changed the environment. His view of operant conditioning was that animals form associations between their response and the outcome that it produces. This is nothing more complicated than saying you understand what will happen when you do something.
The problem for scientists during Tolman's time was that R-O theories like his require animals to have mental representations, both of their response and of the goal that they wish to achieve, a position that many were reluctant to adopt. A resolution to the S-R versus R-O debate in operant conditioning is provided by devaluation experiments. Recall the devaluation protocol in classical conditioning (see Section 8.1, above). The same procedure is used in operant conditioning, with the outcome being devalued independently of the operant contingency. In the example below, animals learn to lever press for food, the food is associated with illness, and lever pressing responses are tested at a later time.

Phase         Devaluation Group    Control Group
Training      response-outcome     response-outcome
Devaluation   outcome-aversion     nothing
Test          response?            response?

Figure 5.15 Experimental design for a devaluation study. Animals in both the devaluation and control groups are trained to make an operant response for a reinforcer (typically food). After the response is acquired, the outcome is associated with an aversive event (i.e., illness) only for the devaluation group. In a subsequent test, if the responses of the devaluation group are reduced compared to the control group, we infer that the animals had formed an association between the response and the outcome during the initial training sessions. Adams and Dickinson (1981) were the first to show that animals undergoing this devaluation procedure, but not their paired controls, show lower rates of lever pressing during the test. This reduction in operant responding is an indication that animals formed R-O associations during training. In other words, they had formed an association between what they did and what happened. An interesting twist to the finding is that devaluation is ineffective if animals are very well trained.
This suggests that extended operant training produces habit learning that is insensitive to changes in the outcome. This fits with our current ideas of habits: these are automatic responses to environmental stimuli, a phenomenon that many people experience when they perform the same action again and again. Think about how you get to class every day. If you walk or ride your bike, you probably take the same route and could arrive at your destination before you realize it. The expression 'on automatic pilot' fits this behavior. If one day you plan to stop by the bank machine on your way to class, you may be so accustomed to your usual routine that you forget to detour to the bank machine and arrive at class… without money. The transition from R-O to S-R systems in operant conditioning is an example of how cognitive processing helps animals adjust to changing environments. When animals (including humans) initially learn a task, they must attend to the consequences of their action so that they can modify their responses accordingly. If the environment remains relatively stable and responses consistently produce the same outcome, habitual responding takes over. In this way, animals can cope with both predicted and unpredicted events in their environment. 8.3 Rescorla-Wagner Model Rather than focus on defining which associations are acquired during classical or operant conditioning, a number of researchers began to ask how animals code the logical relationship between events in their environment. One of the most influential of these theories is the Rescorla-Wagner model (Rescorla & Wagner, 1972), formulated to explain classical conditioning and the phenomenon of blocking. Recall that, in a blocking experiment, the stimulus added during compound conditioning trials (CS2) does not elicit a CR because it does not provide any new information about the arrival of the US (see Section 6.1 above). CS1 already predicts the US, so adding CS2 is redundant.
As Kamin noted in his original experiment, if the US presentation is not surprising, then no new learning will take place. Rescorla and Wagner formalized this principle in a mathematical equation that presents classical conditioning as an adjustment between expectations and occurrences of the US: learning ceases when there is no discrepancy between the two. In order to understand the details of the Rescorla-Wagner formula, it is important to be familiar with some basic concepts of the model. First, this is an acquisition-based model in that it describes changes in conditioning on a trial-by-trial basis. If the strength of the US is larger than expected on any given trial, all CSs on that trial will be excitatory, meaning that they will increase the CR. The larger the difference between the expected and actual strength of the US, the larger the increase in conditioning will be. On the other hand, if the US strength is less than expected (think about extinction), all CSs associated with the US will be inhibitory. This means that the CS will inhibit or reduce the CR. Second, increases in the salience of the CS will increase conditioning. This principle explains overshadowing. When stimuli are presented in a compound, conditioning to the weakest or least noticeable stimulus is minimized, even if that stimulus can elicit a CR when it is paired with the US on its own. Third, the salience of the US defines the maximum level of conditioning that may occur; increases in US salience increase conditioning up to this asymptotic level. Responding at this maximum level is called a ceiling effect in that further conditioning trials do not produce any change in behavior. Finally, the strength of the US expectancy in compound conditioning will be equal to the combined strength of all CSs in the compound. If this combined value is at the maximum US strength, no new conditioning will occur.
This is what happens in blocking: one CS has already acquired an associative strength that is equal to the US strength, so adding a new CS in conditioning trials does not produce any conditioning. In other words, the conditioning strength has been used up by CS1 so there is none left for CS2. The formal rule specifying the change in associative strength of a CS on a single conditioning episode is written as follows:

ΔV = αβ(λ − ΣV)

V: the associative strength of a CS on a given trial.
ΔV: the change in associative strength (V) on that trial.
α: a learning rate parameter determined by the salience of the CS. One can think of this as setting the maximum possible associative strength of a CS: bright lights and loud tones will have a higher value than will dim lights and soft tones.
β: a learning rate parameter determined by the salience of the US. Strong shocks and large amounts of food will have a higher value than will weak shocks and small amounts of food. Both α and β are constants with a range between 0 and 1. Sometimes the two are combined into a single term denoting the combined salience of an experimental trial.
λ: the maximum amount of conditioning or associative strength that a US can support. It has a positive value when the US is presented and is 0 when no US is presented.
ΣV: the sum of the associative strengths of all CSs present on that trial.
(λ − ΣV): an error term that indicates the discrepancy between what is expected (ΣV) and what is experienced (λ). When (λ − ΣV) is zero, the outcome is fully predicted and there are no changes in associative strength (ΔV).

According to the Rescorla-Wagner equation, the CS acquires associative strength on each trial that is equal to the maximum associative strength of the US minus the strength already acquired by that CS. The increase in associative strength declines with each trial because the amount of associative strength that remains decreases as λ is used up.
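The Rescorla-Wagner rule, ΔV = αβ(λ − ΣV), is simple enough to simulate directly. The following Python sketch runs a basic acquisition series for a single CS repeatedly paired with the US; the parameter values are arbitrary choices for illustration, not values from the text:

```python
# Sketch of acquisition under the Rescorla-Wagner rule.
# alpha, beta and lam are illustrative values, not from the chapter.

def rw_delta(v_sum, alpha, beta, lam):
    """Change in associative strength on one trial: dV = alpha*beta*(lam - sum of V)."""
    return alpha * beta * (lam - v_sum)

V = 0.0                      # associative strength of the single CS
curve = []
for trial in range(10):      # ten CS-US pairings (lam = 1.0: US present)
    V += rw_delta(V, alpha=0.5, beta=0.5, lam=1.0)
    curve.append(round(V, 3))

print(curve)
# Each increment is smaller than the last, tracing the negatively
# accelerating learning curve described in the text.
```

Because the error term shrinks as V approaches λ, the increments form a geometric series: V never overshoots the asymptote, and additional trials produce progressively smaller gains.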
This describes a typical learning curve in which each trial produces a smaller and smaller change in behaviour. The model also explains extinction: λ is 0 when the US is withheld, so the error term is negative. Thus, the associative strength of the CS is reduced on that trial and responding decreases. As noted above, the Rescorla-Wagner model accurately predicts blocking and overshadowing experiments as well as other classical conditioning phenomena such as extinction and conditioned inhibition (a stimulus presented during extinction trials reduces the strength of the CR on subsequent trials). Figure 5.16 Negatively accelerating function of associative learning predicted by the Rescorla-Wagner model. The change in associative strength on each trial (ΔV) is dependent on the salience of the CS (α), the salience of the US (β) and the discrepancy between what is expected and what occurs (λ − ΣV). Reprinted from Shettleworth (1998). Despite its general utility and appeal, the Rescorla-Wagner model cannot explain all aspects of classical conditioning. The most obvious are latent inhibition and sensory preconditioning. According to the model, a stimulus should not acquire (or lose) any associative strength when the US is not present. Thus, there is no way to account for a CR developing to a stimulus that was never paired with a US (sensory preconditioning) or the reduction in conditioning that follows CS pre-exposure (latent inhibition). Subsequent modifications to the model were able to deal with these problems, although inconsistencies between what the model predicted and what happened in the lab were still evident. Because of this, a number of other theories developed that were purported to be better explanations of classical conditioning. Some focused on the attention that organisms direct towards the CS (Mackintosh, 1975), whereas others focused on comparing the likelihood that the US will occur in the presence and the absence of the CS (Gibbon & Balsam, 1981).
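Blocking falls directly out of the same update rule because the error term is computed from the summed strength of all CSs present on a trial. A short sketch (again with arbitrary illustrative parameters) shows that pretraining CS1 to asymptote leaves essentially no error for CS2 to absorb during compound trials:

```python
# Blocking under the Rescorla-Wagner rule (illustrative parameters).
alpha_beta = 0.3   # combined CS/US salience term, for simplicity
lam = 1.0          # maximum conditioning the US supports

V1, V2 = 0.0, 0.0

# Phase 1: CS1 alone is paired with the US until V1 approaches lam.
for _ in range(20):
    V1 += alpha_beta * (lam - V1)

# Phase 2: the CS1+CS2 compound is paired with the same US.
# The error term uses the SUM of all CSs present on the trial.
for _ in range(20):
    error = lam - (V1 + V2)
    V1 += alpha_beta * error
    V2 += alpha_beta * error

print(round(V1, 3), round(V2, 3))
# CS2 ends with near-zero strength: CS1 already predicts the US,
# so there is no surprise left to drive learning -- blocking.
```

Setting lam = 0.0 in a further phase would likewise drive the strengths back down, which is how the model captures extinction.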
Each theory has its strengths, but no single model was able to account for all aspects of classical conditioning. Nonetheless, if a theory is to be judged on the research and discussion that it generates, the Rescorla-Wagner model is one of the most successful in associative learning. BOX 5.4 Neuroscience of the Rescorla-Wagner Model An exciting development in classical conditioning is the notion that biological mechanisms can be mapped onto the parameters of formal learning theories, including the Rescorla-Wagner model. In one of the most prominent lines of research, Wolfram Schultz and his colleagues have examined how dopamine neurons, which project from the midbrain to the striatum and frontal cortex, code prediction errors in classical conditioning (Hollerman & Schultz, 1998). Figure 5.17 Dopamine neuron firing patterns during task learning. In these experiments, monkeys were implanted with electrodes in the midbrain and the activity of single dopamine neurons was recorded. In the panels to the left, each horizontal row represents one trial, with the chronological sequence in each panel running from top to bottom. Dots indicate cell firing and the vertical line indicates the reward presentation (a squirt of apple juice into the monkey's mouth). No task: When a reward is presented with no preceding cue, dopamine neurons fire rapidly. Learning: Monkeys were presented with two visual stimuli and had to touch the correct stimulus to receive a reward. As performance improved (top to bottom), cell firing declined. Familiar: When monkeys were familiar with the task, dopamine cells did not fire following the reward presentation. Error during learning: When new pictures were presented, monkeys made errors and no reward was presented. This expectation of reward when none was delivered inhibited dopamine cell firing. Reprinted from Hollerman and Schultz (1998).
Schultz interprets his findings in the context of the Rescorla-Wagner model: dopamine neurons fire when a reward is unpredicted, but not when it is predicted. This "prediction error" message may constitute a powerful teaching signal for behavior and learning. Thus, dopamine neurons that project from the midbrain to the forebrain code the information that is critical for learning to anticipate significant outcomes. Using the same basic techniques, Schultz and his colleagues have demonstrated that neurons in other brain structures, such as the striatum, orbitofrontal cortex, and amygdala, code the quality, quantity, and preference for rewards. To link these events biologically, Schultz proposes that the dopamine error signal communicates with reward perception signals to influence learning about motivationally significant stimuli. 8.4 Associative Cybernetic Model There are far fewer formal theories of operant conditioning than of classical conditioning for three primary reasons. First, classical conditioning is often easier to study because animals do not need to be trained in an operant task. Second, the experimenter controls US (reinforcer) presentations in classical, but not operant, conditioning, making the connection between learned associations and behavioral changes more straightforward. Third, and perhaps most important, many scientists in the past assumed that classical and operant conditioning were mediated by the same processes, so a theory of one should explain the other. In this respect, the Rescorla-Wagner model can be applied to operant conditioning if we consider that responses with surprising outcomes should produce the greatest increments in learning. One exception is the Associative Cybernetic model developed by Dickinson and Balleine (1993) that applies specifically to operant conditioning (see Figure 5.18).
Cybernetic refers to the fact that an internal representation of the value assigned to an outcome feeds back to modulate performance. The model consists of four principal components: a habit memory system that represents S-R learning; an associative memory system that represents R-O associations; an incentive system that connects the representation of an outcome with a value (e.g., rewarding or punishing); and a motor system that controls responding. The motor system can be activated directly by the habit memory or via associative memory and its connections through the incentive system. Thus, the Associative Cybernetic model explains the interaction between S-R and R-O learning systems, and describes how rewards and punishments modify responding. Figure 5.18 The Associative Cybernetic model of operant conditioning. The model consists of four principal components that interactively produce changes in behavioral responding. See text for details. Redrawn from Dickinson (1984). Like the Rescorla-Wagner model, some components of the Associative Cybernetic model have been mapped onto specific neural substrates (Balleine & Ostlund, 2006). Although the details are not complete, the evidence that neural mechanisms map onto parameters of both models adds credence to these theoretical accounts of learning. 9. NEUROSCIENCE The fact that associative learning is observed in all vertebrate, and many invertebrate, species suggests that this fundamental process is conserved across evolution. Still, it is not clear whether these similarities reflect common evolutionary descent or whether different species have converged on the same cognitive solution (i.e., forming associations) to learn about causal and predictive relationships in their environment. If the same biological processes mediate classical or operant conditioning in different species, this would provide stronger support for the conservation hypothesis.
The following sections review the biological underpinnings of associative learning, beginning with molecular mechanisms, extending to cellular systems and ending with examples of neural circuits that mediate classical and operant conditioning. 9.1 Molecular Mechanisms In the early 1970s, Seymour Benzer (1973) developed a technique that has been used successfully to study the molecular mechanisms of associative learning. Benzer produced genetic mutations in the fruit fly, Drosophila, by exposing them to radiation or chemicals. Different behavioral changes were observed with different mutations, and some of these related to associative learning. For example, Drosophila will avoid an odor that was previously paired with a shock (classical conditioning); mutant dunce flies fail to learn this association even though they show no deficit in responding to these stimuli on their own (Dudai & Quinn, 1980). Dunce flies have a mutation in the gene that codes for the enzyme cyclic AMP phosphodiesterase, which breaks down the intracellular messenger cyclic AMP (cAMP). Defects in the cAMP signaling pathway were later identified in two other mutants, named rutabaga and amnesiac, both of which show deficits in classical conditioning. Research with sea slugs, Aplysia, supports the idea that the cAMP pathway is critical for associative learning. In this model, a CS sets off action potentials in sensory neurons that trigger the opening of calcium channels at nerve terminals. The US produces action potentials in a different set of neurons that synapse on terminals of the CS neurons. When neurotransmitter is released from the US neurons, it binds to the CS neuron and activates adenylate cyclase within the cell. Adenylate cyclase generates cAMP and, in the presence of elevated calcium, adenylate cyclase churns out more cAMP. Thus, if the US occurs shortly after the CS, intracellular second messenger systems are amplified within CS neurons.
This causes conformational changes in proteins and enzymes within the cell that lead to enhanced neurotransmitter release from the CS neuron. The consequence? A bigger behavioral response. The details of this process are presented in Figure 5.19. Figure 5.19 Molecular mechanisms of classical conditioning. (a) When the US is presented by itself, it activates the motor neuron and sensitizes the sensory neuron. This process involves 5-HT release from the pre-synaptic neuron, activation of the post-synaptic neuron and an increase in adenylate cyclase levels. (b) When the CS is presented before the US, it causes an opening of calcium channels on the post-synaptic cell. This leads to even higher levels of adenylate cyclase following the US presentation because calcium increases the intra-cellular production of adenylate cyclase. Reprinted from Bear, Connors and Paradiso (2001). You should recognize the similarities between this diagram and Figure 4.X describing the molecular basis of long-term potentiation (LTP). Like LTP, long-term changes in classical conditioning also involve CREB-dependent transcription. It should not be surprising that the same intracellular mechanisms are identified in LTP and classical conditioning, as LTP is a cellular model of memory and classical conditioning is one type of memory. A recurrent theme in associative learning, and one that we touched on throughout this chapter, is the question of whether classical and operant conditioning are mediated by the same process. The issue was never resolved completely at a behavioral level, but the two appear to be dissociable at a molecular level. For example, mutations in Drosophila that disrupt adenylate cyclase produce deficits in classical but not operant conditioning, whereas mutations that disrupt protein kinase C (PKC) have the opposite effect (Brembs & Plendl, 2008).
PKC is an enzyme that is activated by a different set of second messengers (calcium and diacylglycerol) than the cAMP cascade, consistent with the idea that operant conditioning engages a distinct molecular pathway. PKC also appears to be important in Aplysia operant conditioning (Lorenzetti et al., 2008), suggesting a conservation of this process across species (at least in invertebrates). You may wonder how Drosophila and Aplysia perform an operant conditioning task. The details of these paradigms are shown in Figure 5.20. Figure 5.20 Invertebrate operant conditioning paradigms. Top: Drosophila learn to fly towards a particular visual stimulus to receive a heat reward. The fly is tethered and suspended inside a drum that acts as a flight simulator. Four pairs of vertical bars on the outside of the drum change color when the animal flies towards them. During training, flying towards one color (e.g., blue) turned on a heat source. Over training, flies approach the blue color more frequently, regardless of where the vertical lines were located. Thus, even when they had to redirect their flight path or change directions, the flies approached the heat-associated color more frequently than the other colors. Reprinted from Brembs and Plendl (2008). Bottom: Aplysia learn to perform a biting response to receive brief stimulation of the esophageal nerve. One day prior to training, animals are implanted with a stimulating electrode on the anterior branch of the left esophageal nerve. During training, the animal moves freely in a small aquarium and spontaneous behaviour is monitored so that the initiation of a biting response can be noted. Stimulation is applied immediately following a bite (contingent reinforcement) or at random intervals unrelated to the biting response (not shown). Over trials, the rate of spontaneous biting increases, but only in animals that experienced the stimulation following the bite. Reprinted from Baxter and Byrne (2006).
Identifying the molecular mechanisms of associative learning becomes more difficult as nervous systems become more complex. Nonetheless, evidence to date supports a critical role for cAMP and CREB pathways in rodent associative learning (Silva et al., 2005). Moreover, as noted in Chapter 4, human disorders involving alterations in cAMP or CREB function are characterized by severe learning deficits. The combination of these data argues strongly for conservation across species in the molecular mechanisms that mediate associative learning. 9.2 Cellular Mechanisms Now that you understand what happens within a neuron, we can examine how communication between cells changes during associative learning. This process has been worked out in great detail using classical conditioning of the gill withdrawal reflex in Aplysia. A mild touch to an outer part of the animal's body, the mantle shelf, does not initially elicit a response; when this stimulus precedes a tail shock (US), however, a CR develops to the mantle touch (CS). A control condition is included in these studies in which a light touch is applied to the siphon (another part of the Aplysia's body) that is not explicitly paired with the US. The body touches that are and are not associated with the US are referred to as CS+ and CS- respectively. As you may recall from Chapter 3, sensitization of the gill withdrawal reflex occurs when cell firing increases in the facilitating interneurons that synapse on presynaptic terminals of sensory neurons. A similar process occurs in classical conditioning. Sensory neurons conveying CS+ information fire when the mantle shelf is touched, and the US causes facilitatory interneurons to release serotonin (5-HT) on the presynaptic terminals of these neurons. If the two occur in close temporal proximity, intra-cellular second messenger systems are amplified, as described above.
This increases the strength of the CS+ sensory-motor neuron connection, making it easier to elicit a CR in the future. On subsequent trials, the incoming CS+ signal produces a greater post-synaptic potential in the motor neuron, causing a gill withdrawal response in the absence of the US (Hawkins et al., 1983). Because the two neurons have not been active at the same time, synaptic strength is not altered in the CS- sensory-motor neuron circuit. The specifics of these activity-dependent changes are presented in Figure 5.21. Figure 5.21 Cellular pathways of classical conditioning in Aplysia. A shock (US) applied to the animal's tail excites facilitating interneurons that synapse on presynaptic terminals of sensory neurons. Sensory neurons connect to motor neurons of the mantle shelf that control the withdrawal response. When a light touch (CS) is applied to the mantle shelf immediately prior to the US, it primes the sensory neuron by making it more excitable. Thus, when the facilitating interneurons fire, they produce a stronger response in the motor neurons. This increased firing is restricted to the circuits in which the CS was paired with the US (CS+ but not CS-). With training, the CS connection is strengthened so that it is capable of eliciting a response even when the US is not presented. Reprinted from Kandel (1995). It is unlikely that these cellular changes explain all instances of associative learning, particularly those that occur over long intervals such as CTA learning. And like molecular mechanisms, there may be differences in how neurons code classical and operant conditioning at the cellular level (Baxter & Byrne, 2009). On the other hand, this model provides a partial explanation for evolved dispositions: these may reflect changes in existing neural connections such as the presynaptic modulation of sensory neurons described above.
In contrast, associations that are difficult to acquire probably involve a more complicated remapping of neural circuitry, including the formation of new synaptic connections. In sum, the circuits and mechanisms that underlie classical conditioning of the gill withdrawal reflex provide a compelling story and a concrete point of departure for examining the neuroscience of associative learning. 9.3 Neural Circuits In the gill withdrawal model described above, information about the CS and the US comes together at the synapse linking facilitatory interneurons with sensory neurons. The same process mediates classical conditioning in mammals, although the brain region where the CS and US signals converge will vary depending on the sensory properties of the stimuli, as well as the response that is used to measure conditioning. For example, an auditory CS will be transmitted through a different circuit than a visual or tactile CS, and a leg flexion response will be mediated through a different output pathway than a salivary response. Of all the neural circuits underlying classical conditioning, the best characterized is the one that mediates the conditioned eyeblink response. All animals, including humans, will blink when an air puff hits the eye. This reflexive response is commonly studied in rabbits because the rate of spontaneous eyeblinking is low in these animals. Thus, if animals blink in response to a CS that was previously paired with an air puff, it is likely due to the conditioning process. Moreover, the stimulus parameters that lead to effective conditioning, and the motor output pathway for the eyeblink response, are clearly described in these animals. The majority of this work was conducted by Richard Thompson and his colleagues, who determined that conditioned eyeblink responses in rabbits are mediated within the cerebellum (Thompson & Krupa, 1994).
To begin, an air puff to the eye produces a rapid blinking response through a reflex circuit that includes the trigeminal nucleus, reticular formation and cranial motor nuclei. Like all reflex circuits, parallel signals conveying sensory information are sent to other brain regions. In the case of the eyeblink response, US information is sent to the inferior olive, which relays it to both the interpositus nucleus and the cerebellar cortex. The cerebellar cortex also projects to the interpositus nucleus, which signals the red nucleus; the red nucleus in turn synapses onto the cranial motor nuclei that produce an eyeblink response. The critical question is how the CS accesses this US-UR circuit in order to produce a CR. When the CS is a tone, auditory information enters the CNS via the ventral cochlear nucleus, which transmits this information to the pontine nuclei and then to the interpositus nucleus where it meets the flow of information from the US. Like the US signal, a parallel CS signal is sent from the pontine nuclei to the cerebellar cortex. From the interpositus nucleus, a response is generated through the red nucleus and cranial motor nuclei. A schematic of these details is presented in Figure 5.22. Figure 5.22 Neural circuitry of the conditioned eyeblink response in the rabbit. Information about an auditory CS enters via the ventral cochlear nucleus and converges with signals transmitted from the air-puff US at the level of the interpositus nucleus. Eyeblink responses, both conditioned and unconditioned, are controlled by the accessory abducens nucleus (motor nuclei). Similar diagrams have been drawn for the neural circuitry of other classically-conditioned responses including CTA, conditioned fear, and conditioned approach responses. Details of these circuits may be found in many neuroscience textbooks. The common feature in all of these models is that the emergence of a CR coincides with a CNS change.
The most likely change, and the one most commonly identified in these systems, is altered synaptic plasticity in a circuit that connects CS and US signals. The same principle holds true in operant conditioning: behavioral changes are reflected as alterations in neural connections. Interestingly, the R-O and S-R associations of operant conditioning appear to be mediated by distinct brain regions. S-R responding, or habit learning, is often equated with the procedural learning discussed in Chapter 4, a process that depends on the dorsal striatum. In contrast, R-O contingencies in operant responding are mediated through a network of brain regions that begins with signals generated in the medial prefrontal cortex (Tanaka et al., 2008). This region computes response contingencies (i.e., R-O associations) and then sends this information to the orbitofrontal cortex. The orbitofrontal cortex codes the motivational significance of reinforcers (Rolls, 2004), so it is likely that associations between the outcome and its value are formed in this brain region. Signals from the orbitofrontal cortex are transmitted to the dorsal striatum, which controls behavioral responses. As in classical conditioning, the details of the circuit vary depending on the response to be emitted and the stimuli that precede it. This hypothesized neural circuit may not mediate all instances of operant conditioning, but it provides another direction for studying the biological basis of associative learning.

10. CHAPTER SUMMARY

Organisms ranging from insects to humans are capable of forming associations about predictive relationships in their environment. The existence of this common trait across a variety of species suggests that animals are designed to detect and store information about causal relationships that affect their survival. The two most-studied forms of associative learning are classical and operant conditioning.
Classical conditioning represents predictive associations between two stimuli, whereas operant conditioning represents relationships between a response and its consequences. These two associative structures are mediated through dissociable brain structures (Ostlund & Balleine, 2007), providing concrete evidence for a distinction between the two.

DEFINITIONS

appetitive stimuli: stimuli that an organism will work to obtain, such as food, water, sex, drugs, etc.

aversive stimuli: stimuli that an organism will work to avoid, such as those that produce nausea, fear, pain, etc.

unconditioned stimulus (US): in classical conditioning, a stimulus that has motivational significance prior to conditioning.

unconditioned response (UR): in classical conditioning, a response that is elicited prior to conditioning.

conditioned stimulus (CS): in classical conditioning, a stimulus that acquires motivational significance through pairing with the US.

conditioned response (CR): in classical conditioning, a response that is elicited by the CS following conditioning.

operant conditioning: a form of learning in which the presentation of an outcome (positive or negative) depends on an organism's response.

positive reinforcement: a positive relationship between a response and an appetitive stimulus; presentation of the reinforcer increases responding.

punishment: a positive relationship between a response and an aversive stimulus; presentation of the aversive stimulus decreases responding.

negative reinforcement: a negative relationship between a response and an aversive stimulus; removal of the aversive stimulus increases responding.

omission: a negative relationship between a response and an appetitive stimulus; removal of the appetitive stimulus decreases responding.

suppression ratio: the dependent measure in a conditioned suppression test of lever pressing; calculated as (lever presses during the CS) / (lever presses during the CS plus lever presses during an equal period of time preceding the CS).
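Because the suppression ratio is a simple calculation, it can be illustrated concretely. The short Python sketch below (the function name and the example press counts are illustrative assumptions, not taken from the chapter) computes the ratio; by convention, a value near 0.5 indicates no suppression (the CS did not change responding) and a value near 0.0 indicates complete suppression.

```python
def suppression_ratio(presses_during_cs: int, presses_pre_cs: int) -> float:
    """Conditioned-suppression measure: CS / (CS + pre-CS).

    0.0 = complete suppression of responding during the CS;
    0.5 = no suppression (responding during the CS equals baseline).
    """
    total = presses_during_cs + presses_pre_cs
    if total == 0:
        raise ValueError("no responses recorded in either period")
    return presses_during_cs / total

# Hypothetical rat: 5 presses during the CS, 15 in the equal
# baseline period before it -> substantial suppression.
print(suppression_ratio(5, 15))   # 0.25
print(suppression_ratio(10, 10))  # 0.5 (no suppression)
```

Note that the ratio is bounded between 0 and 0.5 only when conditioning suppresses responding; values above 0.5 would indicate that responding increased during the CS.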
conditioned taste aversion (CTA): avoidance of a flavor that was previously associated with nausea.

conditioned avoidance: an operant paradigm in which animals learn to avoid a stimulus associated with an aversive event.

reinforcement schedule: in operant conditioning, the relationship between responding and the rate of reinforcement delivery.

fixed ratio (FR): a reinforcement schedule in which a set number of responses produces the reinforcer.

variable ratio (VR): a reinforcement schedule in which an average number of responses produces the reinforcer.

fixed interval (FI): a reinforcement schedule in which reinforcement is delivered following the first response that occurs after a set period of time has elapsed.

variable interval (VI): a reinforcement schedule in which reinforcement is delivered following the first response that occurs after an average time interval has elapsed.

equipotentiality: the idea that associations between different stimuli, responses and reinforcers can be formed with equal ease.

evolved dispositions: the relative ease with which animals acquire certain associations, based on their evolutionary history.

extinction: removal of the US (or reinforcer), which leads to a reduction in responding.

blocking: a phenomenon in which an association between one stimulus and a US disrupts subsequent conditioning to a second stimulus when the two are presented in compound conditioning trials.

latent inhibition: a phenomenon in which prior exposure to a stimulus blocks or retards subsequent conditioning to that stimulus.

overshadowing: a phenomenon in which one stimulus acquires stronger conditioning than a second stimulus when the two are presented in compound conditioning trials.

spontaneous recovery: the reappearance of a CR following extinction.

disinhibition: the recovery of a CR following extinction when a novel stimulus is presented.

response renewal: recovery of a CR following extinction when extinction was conducted in a context different from the one in which conditioning took place.
sensory preconditioning: a three-stage experiment: (1) two neutral stimuli (A and B) are presented together; (2) one stimulus (A) is paired with a US; (3) the other stimulus (B) is tested for conditioned responses.

devaluation: a three-stage experiment: (1) CS-US pairings; (2) devaluation of the US through association with an aversive stimulus; (3) testing the conditioned properties of the CS.

habit learning (S-R learning): responses that are elicited automatically by environmental stimuli and are relatively insensitive to changes in the value of the reinforcer.

learning curve: changes in behaviour that occur across conditioning trials (either classical or operant), characterized by smaller and smaller increments as the trials progress.

FURTHER READING

Cuny, H. (1962). Ivan Pavlov: The man and his theories. New York: Fawcett.

Dickinson, A. (1980). Contemporary animal learning theory. Cambridge: Cambridge University Press.

Hollis, K.L. (1997). Contemporary research on Pavlovian conditioning: A 'new' functional analysis. American Psychologist, 52, 956-965.

MacKintosh, N.J. (Ed.). (1994). Animal learning and cognition. San Diego, CA: Academic Press.

MacPhail, E.M. (1996). Cognitive function in mammals: the evolutionary perspective. Cognitive Brain Research, 3, 279-290.

RESEARCHER PROFILE: Dr. Karen Hollis

Karen Hollis is a professor in the Interdisciplinary Program in Neuroscience & Behavior at Mount Holyoke College in South Hadley. She was trained as a psychologist but, very early in her career, incorporated an evolutionary perspective into her research. Her work has focused on examining the ways in which animals learn to predict biologically relevant events such as food, aggressors, potential mates or predators. She examines natural behaviors but incorporates experimental manipulations in order to understand how animals optimize their interactions with biologically relevant events.
Hollis' work has helped to identify how classical conditioning may increase the survival and reproduction of fish. "The point of my research is to see how what psychologists say about learning can be brought to bear on what zoologists study in the field," says Hollis.